making schema validation happy
forget the previous commit message: the vocabulary has been updated with a new protocol for wellcome trust
Enabling incremental collection for entityregistry::projects
introduced HDFS Action related job profiles
added anchorStats job
fixing #1951
added index field for counter_doi
rule script for narcis
rule scripts for repec, rioxx, pmc-nlm
added a new NSF contract type
#1592: New entity registry sub-classes
a couple of synonyms for re3data
mapping contributors as attributes of the result entity
added hadoop job profile for the openaire identifiers export workflow
removed ftp2 and added sftp protocol
fixed a problem in a concat()
datasource type vocabulary to match the portal needs. Related to #1811
using 'value' for date fields
fixed upper case country names
using xpath in multivalue fields (#1792)
document similarity threshold set to 0.7 instead of 0.8.
#1772: changed default trust thresholds
added index field: reldatasourcecompatibilityid
expanding new relationships, added openairecompatibility in organization expansion
#1583: introducing openaire2.0_data compliance
introduced mapping for dateOfTransformation
using new metadata cache location
introduced new hadoop job profiles (dedup)
native english name renamed to proprietary
#1265 introduced mapping for ORCID
mappings moved from classpath as transformation profiles
added new relations: supplement, part, contributor
mapping name
fixed mapping for data provision workflow: oaf2hbase. added mapping that writes the publication in the person row, allowing its coauthors to be collected with a m/r job
namespace declaration
configuration for enriched sets
mapping includes DOIs for datasets and preserves multiple original IDs
use of external properties
use of Text instead of ImmutableBytesWritable
reimplemented calculatePersonDistribution M/R job to consider only the results from pubsrepositories (not journals)
added default threshold parameters. #1209
using about instead of dataInfo
fixed dateOfCollection; support of H2020 grantAgreement
Removed Greece duplicates
added one more dedup configuration for organizations
informationSpaceImportJob
updated compression parameters
compressing output
add a new term
new vocabulary for NSF Contract Types
nsf classification
new countries and synonyms from NSF projects
update of transformation rule script wrt. identifiers, fp7, h2020
added information space export job
added new protocol for re3data
#1453 Publication Catalogue new vocabulary term
updated rule script of the claiming datasource with the http://dx.doi.org prefix
added date of creation for FET context
Updated mappings for funders and funding
deleted oldest ec:h2020toas vocabulary
new vocabulary for external references types.
xslt mapping for person objects
added hadoop jobs (dedup person)
updated person dedup configuration
added new indexed fields: projectoamandatepublications, projectecarticle29_3, projectsubject
corda h2020 from ftp
MapDocument implements a more general view of the pace model
added trust level threshold for document similarity and document classes
new parameter for pdb inference module
configurable entity unpack xsl: the person id depends on the datasource typology (see 'mergeIdForHomonymsMap' param)
write the publication in the person row, allowing its coauthors to be collected with a m/r job
update of "Horizon 2020 - Types of Action" vocabulary
Added communityname and communityid index fields: we need to be able to exclude funders from the context browse
added field relfundinglevel0_name
initial nlm2oaf transformation rule script
added coauthor workflow and hadoop job
each person row contains the list of publications, each publication embeds its authors
profiles to run calculate Person Distribution
updated job props
added workflow to export the representative publications as json on hdfs
updated primary iis job profile and workflow to the latest specs
fixed index field name for relfunderjurisdiction
index fields for funders on the relationships to projects
search publications by author
added attribute enabled to dedup configuration orchestrations
added some regexes to avoid deduplicating big groups of publications
added mapping profile for datasets
Index fields for project funders. #1241
making the schema happy
added dedup configuration and orchestration for person entities
added oaf2hbase mapping profiles
fixed rootbuilder entries
Added ftp2 protocol
added mandatory description
removing id, not permitted by the schema
merged branch dedupConf
#953 blacklisting da458477233b5561ae47042aa2a73086 content
#953 adding bea4728578070c3d66774bf9454d41fe checksum to the blacklist