MapDocument implements a more general view of the pace model
added tests for author ids generation based on the datasource type
xsltRowTransformerFactory receives param map
more logging
added entity filter on map side
The relationship result --> project contains a summary of the project funding path: as requested by Katerina, the names (acronyms) of the funding streams are now available in the new attribute "name".
Updating tests: funding path ids include funder shortnames (#1379)
added coauthor job
using dedup configuration id as inferenceprovenance
The OAI feed generates "enriched sets" for each content provider by applying a set of xpaths to the records to determine whether they have been enriched. The xpaths are defined in the OAI configuration profile.
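
A minimal sketch of the xpath-based enrichment check described above, using only the JDK XPath API. The class name, the record layout, and the xpath in the example are hypothetical, and the rule "a record is enriched if any xpath matches" is an assumption; the actual xpaths live in the OAI configuration profile.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class EnrichmentCheck {

    // Returns true as soon as one of the configured xpaths matches the record.
    // Assumption: "any match" marks the record as enriched.
    public static boolean isEnriched(String recordXml, List<String> xpaths) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (String expr : xpaths) {
            try {
                // A fresh InputSource per expression: the reader is consumed by each evaluation.
                InputSource source = new InputSource(new StringReader(recordXml));
                Boolean hit = (Boolean) xpath.evaluate(expr, source, XPathConstants.BOOLEAN);
                if (hit) {
                    return true;
                }
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("invalid xpath or record: " + expr, e);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical record and xpath, for illustration only.
        String record = "<record><rel inferred=\"true\"/></record>";
        List<String> xpaths = Arrays.asList("//rel[@inferred='true']");
        System.out.println(isEnriched(record, xpaths));
    }
}
```

Records for which the check returns true would go into the enriched set of their content provider.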
using MongoClient instead of deprecated Mongo
added conditions
renamed classes
M/R Job to collect info for NotificationBroker implementation
added workflow to export the representative publications as json on hdfs
StringTokenizer splits on each character included in the String passed in the constructor, which is not what we want. Using Splitter instead.
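
A small illustration of the difference, assuming a hypothetical identifier that uses "::" as separator. The commit switched to Splitter; to keep the sketch free of external dependencies, the literal split below uses String.split with Pattern.quote as a stand-in that treats the separator as one string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

public class DelimiterDemo {

    // StringTokenizer treats every character of 'delims' as a separate delimiter.
    public static List<String> withTokenizer(String input, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits on the separator taken as one literal string (stand-in for Splitter).
    public static List<String> withLiteralSeparator(String input, String separator) {
        return Arrays.asList(input.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        String id = "od_1::a:b::c"; // hypothetical identifier
        System.out.println(withTokenizer(id, "::"));        // [od_1, a, b, c]
        System.out.println(withLiteralSeparator(id, "::")); // [od_1, a:b, c]
    }
}
```

With the tokenizer, the single ":" inside "a:b" is also treated as a delimiter, which breaks the token apart.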
added missing dependency (test)
added expansion of BoolField type
updated pom
testing datasource mapping
manage the missing boolean fields
testing with yarn
wrong name
playing with elasticsearch
added constant value for max retries
expanding rel funder and its attributes
adapted to latest version
fetch only instancetype and hostedby from the instance attributes, adding url to external references
catch and log memory errors
fixed infinite loop (merged from branch 0.6.x)
fixed xml record escape
cleanup
trying to catch memory errors
derp fix
added configurable max number of rel/children to be expanded in each entity
integrated fix from beta_context branch
log the xslt
aligned with version in pom.xml
bumped version due to update in dnet-pace-core
using action set id in the index record building process
depending on fixed version
Testing with "max" attribute
using inverse rel to refer to the correct link descriptor, avoid npe
defined maximum number of relationships, configurable per relationship type [attribute 'max' in the EntityGrouperConfigurationDSResourceType]
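
A rough sketch of how a per-type cap on relationships could work. The class and method names and the relationship type used in the example are hypothetical and are not taken from the code base; the assumption is that the 'max' attribute yields one limit per relationship type and that types without a limit are unbounded.

```java
import java.util.HashMap;
import java.util.Map;

public class RelLimiter {

    // Limit per relationship type, as it might be read from the 'max' attribute.
    private final Map<String, Integer> maxPerType;
    // Number of relationships seen so far for the current entity, per type.
    private final Map<String, Integer> seen = new HashMap<>();

    public RelLimiter(Map<String, Integer> maxPerType) {
        this.maxPerType = maxPerType;
    }

    // Returns true while the relationship type is still under its limit.
    // Assumption: a type with no configured limit is never capped.
    public boolean accept(String relType) {
        int count = seen.merge(relType, 1, Integer::sum);
        Integer max = maxPerType.get(relType);
        return max == null || count <= max;
    }

    public static void main(String[] args) {
        Map<String, Integer> limits = new HashMap<>();
        limits.put("hasAuthor", 2); // hypothetical type and limit
        RelLimiter limiter = new RelLimiter(limits);
        for (int i = 0; i < 3; i++) {
            System.out.println(limiter.accept("hasAuthor"));
        }
    }
}
```

One limiter instance per entity keeps the counters independent across entities.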
FCT has level_0 only in funding tree
trying to debug
Updated version and scm. Changed dependency to 3.0.3 of mapping-utils
branching for beta and new fundingpaths and context
Added dc:creator
csv export of the duplicates original ids
updated to the new pace specs, cleanup
WT ids are uniform now
Using MongoClient instead of deprecated Mongo. Removed record status management, since records are always new anyway.
trying to perform explicit escape
delegating to StringEscapeUtils
escaping also quotes and apostrophes
aggressive escaping
avoid infinite loop :)
updated data used for testing
fixing WT funding tree translation as contexts
adding empty solr docs to the rotten record set
integrated changes of r36247 from trunk
write skipped records into the rotten folder
using different counter names
Better to depend on the branch of mapping utils in this branch of mapreduce-jobs because of the last changes implemented by Claudio.
reverted to r35900
merging from trunk
added dedup roots to csv export job, dedup index feed job, tests
using proper logger
added dedup configuration to the entities merging process
We can use the most up-to-date version of mapping-utils here
Fixed scm and deploy.info
Distinguish publications from datasets when counting
added more detailed counter about entity sub-type
several improvements
Increment counter in case of no rows to keep track of records without body.
updated version to 0.0.6.3.1
including changes to catch and fail for any exception of r35769 of trunk
branch for code before the re-implementation of context and fundingpaths
raised version
commented out test with big DOAJ dataset
different escaping
trying to catch any kind of exception
Testing DOAJ for #1222#note-4
added DedupSimilarityToActionsMapper and related dependency
increased version in scripts
updated the version of a dependency
fundingtree is now escaped XML, no longer JSON.
increased the minor version
sample records
reimplemented the fundingpath and context generation
updated packages
updated packages, codestyle
codestyle
OafMerger moved to mapping utils