added coauthor workflow and hadoop job
each person row contains the list of publications, each publication embeds its authors
profiles to run calculate Person Distribution
updated job props
added workflow to export the representative publications as json on hdfs
updated primary iis job profile and workflow to the latest specs
fixed index field name for relfunderjurisdiction
index fields for funders on the relationships to projects
search publications by author
added attribute enabled to dedup configuration orchestrations
added some regexes to avoid deduplicating big groups of publications
added mapping profile for datasets
[maven-release-plugin] prepare for next development iteration
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.7
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.7
Index fields for project funders. #1241
making the schema happy
added dedup configuration and orchestration for person entities
added oaf2hbase mapping profiles
fixed rootbuilder entries
Added ftp2 protocol
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.6
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.6
added mandatory description
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.5
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.5
removing id, not permitted by the schema
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.4
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.4
reverted pom
reverting tag
merged branch dedupConf
adding profiles used by the new incremental deduplication workflow
added flag include.children
#953 blacklisting da458477233b5561ae47042aa2a73086 content
#953 adding bea4728578070c3d66774bf9454d41fe checksum to blacklisted
Fixed duplicate info:eu-repo/semantics/ prefix for dc:type
resourcetype is a dataset-specific field and should not be considered when transforming publications from oaf to oai_dc
doaj needs cleaning rule for languages.
corda h2020 projects
#1041
some more tricks to better our OpenAIRE compliance
bumped version
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.3
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.3
added split dedup configuration profiles, new mapreduce job definitions
split dedup conf
attempt to define custom user names #1153
Using xslt 2.0 and transforming instancetype/@classname values into camel case to try to comply with the guidelines.
Updated datacite mdformat with the info provided by Datacite web site. Also updated names.
solving ticket #1158 Generate the provenance block at collection time
update profile with datacite format
towards openaire3 compliance for OAI-PMH exports of publications and datasets
transformation script for dlib magazine
adapting profile to schema
added httpList protocol
updated dedup configuration profile
including merge relationship in duplicate scan phase
extended organization join configuration
wf and hadoop job updates to support the exclusion of persons and duplicate records during the OAI feed.
Updated identifiers for categories and concepts to fet-fp7::* to fix #1069.
Claim EGI projects disabled. FET concept labels changed so that the category name is not repeated.
indentation
added fct:funding_relations profile
added scheduler pool name
custom rule script for HAL due to changes in HAL records since 10/2014
[maven-release-plugin] copy for tag dnet-openaireplus-profiles-1.0.2
[maven-release-plugin] prepare release dnet-openaireplus-profiles-1.0.2
added two protocols to the vocabulary
FET context
updated profiles
defaulting RESOURCE_URI to localhost
added basic action set profiles
stats conf moved in the resp. HadoopJobConfiguration profile.
Updated pid vocabulary for orcid and FCT.
updating metadataextraction_excluded_checksums to 1e5b574109da731f4918c7f91fc24864 value
added the rule that imports relateddatacite
ignored iml file
OAI config profile in sync with the one on services.openaire.
updated job profile
rule for embargoenddate adapted
setting metadataextraction_excluded_checksums to $UNDEFINED$ which means no documents should be excluded
added httpCSV protocol to vocabulary
updated copytable job definition
added transformation profile for pangaea datasets