merging trunk changes with IIS-CDH-5.3.0 branch
#1197 introducing job.properties changes aligning paths to rumcajs cluster HDFS structure
creating IIS-CDH-5.3.0 branch
updating job.properties
#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.
#1017 introducing new PMC metadata ingestion, currently extracting references, journal and pages fields. Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer to use ExtractedDocumentMetadata instead of parsing XML file. Enabling PMC metadata ingestion in common/import.
#840 moving IdentifierMapping from importer to common package
#840 renaming DeduplicationMapping to more generic IdentifierMapping
#757 adding reduce phase for filtering out pmids by article type: map phase groups PmidMapping objects by pmid, reduce phase filters out duplicates
#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation
updating default job properties
renaming workflow to ingest_pmc_plaintext
dir names in parameters should not contain nameNode