new vocabulary for external reference types.
xslt mapping for person objects
added hadoop jobs (dedup person)
updated person dedup configuration
[maven-release-plugin] prepare for next development iteration
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.8
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.8
added new indexed fields: projectoamandatepublications, projectecarticle29_3, projectsubject
corda h2020 from ftp
MapDocument implements a more general view of the pace model
added trust level threshold for document similarity and document classes
new parameter for pdb inference module
configurable entity unpack xsl: the person id depends on the datasource typology (see 'mergeIdForHomonymsMap' param)
write the publication in the person row, allowing to collect its coauthors with an m/r job
update of "Horizon 2020 - Types of Action" vocabulary
Added communityname and communityid index fields: we need to be able to exclude funders from the context browse
added field relfundinglevel0_name
initial nlm2oaf transformation rule script
added coauthor workflow and hadoop job
each person row contains the list of publications, each publication embeds its authors
profiles to run the Person Distribution calculation
updated job props
added workflow to export the representative publications as json on hdfs
updated primary iis job profile and workflow to the latest specs
fixed index field name for relfunderjurisdiction
index fields for funders on the relationships to projects
search publications by author
added attribute enabled to dedup configuration orchestrations
added some regexes to avoid deduplicating big groups of publications
added mapping profile for datasets
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.7
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.7
Index fields for project funders. #1241
making the schema happy
added dedup configuration and orchestration for person entities
added oaf2hbase mapping profiles
fixed rootbuilder entries
Added ftp2 protocol
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.6
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.6
added mandatory description
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.5
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.5
removing id, not permitted by the schema
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.4
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.4
reverted pom
reverting tag
merged branch dedupConf
adding profiles used by the new incremental deduplication workflow
added flag include.children
#953 blacklisting da458477233b5561ae47042aa2a73086 content
#953 adding bea4728578070c3d66774bf9454d41fe checksum to blacklisted
Fixed duplicate info:eu-repo/semantics/ prefix for dc:type
resourcetype is a dataset-specific field and should not be considered when transforming publications from oaf to oai_dc
doaj needs cleaning rule for languages.
corda h2020 projects
#1041
some more tricks to improve our OpenAIRE compliance
bumped version
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.3
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.3
added split dedup configuration profiles, new mapreduce job definitions
split dedup conf
attempt to define custom user names #1153
Using xslt 2.0 and transforming instancetype/@classname values into camel case to try to comply with the guidelines.
Updated datacite mdformat with the info provided by Datacite web site. Also updated names.
solving ticket #1158 Generate the provenance block at collection time
update profile with datacite format
towards openaire3 compliance for OAI-PMH exports of publications and datasets
transformation script for dlib magazine
adapting profile to schema
added httpList protocol
updated dedup configuration profile
including merge relationship in duplicate scan phase
extended organization join configuration
wf and hadoop job updates to support the exclusion of persons and duplicate records during the OAI feed.
Updated identifiers for categories and concepts to fet-fp7::* to fix #1069.
Claim EGI projects disabled. FET concept labels changed so that the category name is not repeated.
indentation
added fct:funding_relations profile
added scheduler pool name
custom rule script for HAL due to changes in HAL records since 10/2014