testing author dedup
branch offline dedup
cleanup
depending on SNAPSHOT parent
playing with index feeding
removed
removing old branch
Generate compressed records in OAI store.
updated pom
testing datasource mapping
manage the missing boolean fields
added expansion of BoolField type
testing with yarn
wrong name
depending on fixed version
Testing with "max" attribute
using inverse rel to refer to the correct link descriptor, avoid npe
defined maximum number of relationships, configurable per relationship type [attribute 'max' in the EntityGrouperConfigurationDSResourceType]
FCT has level_0 only in funding tree
trying to debug
Updated version and scm. Changed dependency to 3.0.3 of mapping-utils
branching for beta and new fundingpaths and context
Using MongoClient instead of deprecated Mongo. Removed record status management, since records are always new anyway.
trying to perform explicit escape
delegating to StringEscapeUtils
escaping also quotes and apostrophes
aggressive escaping
avoid infinite loop :)
updated data used for testing
fixing WT funding tree translation as contexts
integrated changes of r36247 from trunk
write skipped records into the rotten folder
Better to depend on the branch of mapping utils in this branch of mapreduce-jobs because of the last changes implemented by Claudio.
reverted to r35900
merging from trunk
We can use the most up-to-date version of mapping-utils here
Fixed scm and deploy.info
Distinguish publications from datasets when counting
Increment counter in case of no rows to keep track of records without body.
updated version to 0.0.6.3.1
including changes to catch and fail for any exception of r35769 of trunk
branch for code before the re-implementation of context and fundingpaths
temporary commit
offline dedup
removed CDATA from extraInfo payloads
using CloudSolrServer for parallel index feeding
added branch name
updated branch version and build scripts
update branch with contributions from trunk and other branches
changed properties passed to index feed m/r job
fixed pom and scripts
updated index feed job to make use of the new shared solr lib
branch to test the new index feeding libs
proto to pace mapping parses the whole entity
branch to adapt the proto to pace mapping
tests
inferred stuff will be expanded right after the main entity element
updated test configuration
helper method to discover the type of the entity target of a relationship, used during the xml expansion
avoid emitting the relationships stored in rows containing a deleted metadata body
fixed relationship distribution
almost working workflows on hbase
dedup working
fixed version number
branch for 4.0.0