added dedup roots to csv export job, dedup index feed job, tests
added dedup configuration to the entities merging process
commented out test with big DOAJ dataset
Testing DOAJ for #1222#note-4
fundingtree is now escaped XML, not JSON anymore.
sample records
reimplemented the fundingpath and context generation
updated packages
renamed test
added json size test
Updated configuration for testing
extended entities join configuration, added more tests
test record taken from HDFS
added FCT fundings as contexts
merged branch ProtoMapping
Added oaf:identifiers to record sample.
updated tests
cleanup & tests
added more fields in test record
revised tests
added serialization, tests
Refactored class that extracts fields from records. When an expected index cannot be found in the configuration to check its repeatability, the field is indexed as repeatable and a counter is updated.
idScheme and idNamespace defined as part of the OAI configuration profile
Removed dependency on dnet-oai-utils to avoid inheriting unwanted jars such as cnr-rmi-api, cnr-service-common, spring, etc., which should not appear when running a job on the cluster. Needed classes have been copied and adapted so they do not use spring anymore.
oaf schema location passed as parameter by the workflow
Testing without depending on a running mdstore
small refactor
OAI feed map-only job
fixed oaf to xml serialization
merged from branch 0.0.4
fixed IIS output escaping