replacing non-standard dash character with '-'
fixing test run on Jenkins: setting encoding explicitly to utf8
#1017 fixing expected citations
#1017 fixing PMC and DOI identifiers retrieval from avro map: addressing by Utf8 objects, not by String
#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.
#1017 introducing new PMC metadata ingestion, currently extracting references, journal and pages fields. Replacing DOM/XPath based citations ingestion with a much faster SAX version. Changing pmidtooaid transformer to utilize ExtractedDocumentMetadata instead of parsing the XML file. Enabling PMC metadata ingestion in common/import.
#955 fixing reference raw text generation for pretty printed NLM documents
introducing embedded integration test entry
#840 renaming DeduplicationMapping to more generic IdentifierMapping
#840 moving IdentifierMapping from importer to common package
#757 adding reduce phase for filtering out pmids by article type: the mapping phase groups PmidMapping objects by pmid and duplicates are filtered out in the reducer phase
#757 introducing article type extraction along with unit test. Article type is required for filtering out pmc duplicates and keeping only the proper types
introducing cloudera repository in parent container, removing repository definitions from individual IIS modules
fixing sourceDocumentId which is now extracted from input DocumentText record conveying NLM
#757 fixing pmc citation matching test by providing proper input
#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation
Commented out the test in the stub of a solution to the task #576: Ingestion of metadata from EuropePMC.
Stub of a solution to the task #576: Ingestion of metadata from EuropePMC.
Refactored code to use the XPathEvaluator.fromString method.
created tag folder for release
updating default job properties
renaming workflow to ingest_pmc_plaintext
Excluding conflicting dependency
replacing "result" string with Type.result.name()
updating job.properties
dir names in parameters should not contain nameNode
renaming a field
introducing deploy.info file for module icm-iis-ingest-pmc