fixed dateOfCollection; support for H2020 grantAgreement
Removed Greece duplicates
added one more dedup configuration for organizations
informationSpaceImportJob
updated compression parameters
compressing output
add a new term
new vocabulary for NSF Contract Types
nsf classification
new countries and synonyms from NSF projects
update of transformation rule script wrt. identifiers, fp7, h2020
added information space export job
added new protocol for re3data
#1453 Publication Catalogue new vocabulary term
updated rule script of the claiming datasource with the http://dx.doi.org prefix
added date of creation for FET context
Updated mappings for funders and funding
deleted oldest ec:h2020toas vocabulary
new vocabulary for external references types.
xslt mapping for person objects
added hadoop jobs (dedup person)
updated person dedup configuration
added new indexed fields: projectoamandatepublications, projectecarticle29_3, projectsubject
corda h2020 from ftp
MapDocument implements a more general view of the pace model
added trust level threshold for document similarity and document classes
new parameter for pdb inference module
configurable entity unpack xsl: the person id depends on the datasource typology (see 'mergeIdForHomonymsMap' param)
write the publication in the person row, allowing its coauthors to be collected with an m/r job
update of "Horizon 2020 - Types of Action" vocabulary
Added communityname and communityid index fields: we need to be able to exclude funders from the context browse
added field relfundinglevel0_name
initial nlm2oaf transformation rule script
added coauthor workflow and hadoop job
each person row contains the list of publications, each publication embeds its authors
profiles to run calculate Person Distribution
updated job props
added workflow to export the representative publications as json on hdfs
updated primary iis job profile and workflow to the latest specs
fixed index field name for relfunderjurisdiction
index fields for funders on the relationships to projects
search publications by author
added attribute enabled to dedup configuration orchestrations
added some regexes to avoid deduplicating big groups of publications
added mapping profile for datasets
Index fields for project funders. #1241
making the schema happy
added dedup configuration and orchestration for person entities
added oaf2hbase mapping profiles
fixed rootbuilder entries
Added ftp2 protocol
added mandatory description
removing id, not permitted by the schema
merged branch dedupConf
#953 blacklisting da458477233b5561ae47042aa2a73086 content
#953 adding bea4728578070c3d66774bf9454d41fe checksum to blacklisted
Fixed duplicate info:eu-repo/semantics/ prefix for dc:type
resourcetype is a dataset-specific field and should not be considered when transforming publications from oaf to oai_dc
doaj needs cleaning rule for languages.
corda h2020 projects
#1041
some more tricks to improve our OpenAIRE compliance
attempt to define custom user names #1153
Using xslt 2.0 and transforming instancetype/@classname values into camel case to try to comply with the guidelines.
Updated datacite mdformat with the info provided by Datacite web site. Also updated names.
solving ticket #1158 Generate the provenance block at collection time
update profile with datacite format
towards openaire3 compliance for OAI-PMH exports of publications and datasets
transformation script for dlib magazine
adapting profile to schema
added httpList protocol
updated dedup configuration profile
including merge relationship in duplicate scan phase
extended organization join configuration
wf and hadoop job updates to support the exclusion of persons and duplicate records during the OAI feed.
Updated identifiers for categories and concepts to fet-fp7::* to fix #1069.
Claim EGI projects disabled. FET concept labels changed so that the category name is not repeated.
indentation
added fct:funding_relations profile
added scheduler pool name
custom rule script for HAL due to changes in HAL records since 10/2014
added two protocols to the vocabulary
FET context
updated profiles
defaulting RESOURCE_URI to localhost
added basic action set profiles
stats conf moved in the resp. HadoopJobConfiguration profile.
Updated pid vocabulary for orcid and FCT.
updating metadataextraction_excluded_checksums to 1e5b574109da731f4918c7f91fc24864 value
added the rule that imports relateddatacite
OAI config profile in sync with the one on services.openaire.
updated job profile
rule for embargoenddate adapted
setting metadataextraction_excluded_checksums to $UNDEFINED$ which means no documents should be excluded
added httpCSV protocol to vocabulary
updated copytable job definition
added transformation profile for pangaea datasets
new synonyms for countries
add new terms and vocabularies for fct entityregistry