Name Size Revision Age Author Comment
  pmc 38172 almost 9 years Marek Horst #1431 fixing PMC XML records parser disallowing...

Latest revisions

# Date Author Comment
38172 13/07/2015 11:30 AM Marek Horst

#1431 fixing PMC XML records parser disallowing null reference type, reference value will be simply omitted

37875 19/06/2015 02:10 PM Marek Horst

#1381 porting pmc citations ingestion from cascading framework to pig. Moving code from icm-iis-ingest-pmc to icm-iis-transformers including integration tests, removing obsolete scala code along with unneeded dependencies. Switching subworkflow in primary workflow.

37356 21/05/2015 12:35 PM Marek Horst

#1329 setting affiliation string as raw text if parser produced empty Element object

37344 20/05/2015 06:49 PM Marek Horst

#1329 adding affiliations field in ExtractedDocumentMetadata PMC schema. Metadata extraction code refactoring by extracting code responsible for building Affiliation avro records to AffiliationBuilder class and sharing it with pmc ingestion. Implementing affiliations ingestion functionality in PmcXmlHandler covered with unit tests. Adding affiliations field support in ingest pmc metadata transformer.

33123 01/12/2014 07:40 PM Marek Horst

#1017 fixing PMC and DOI identifiers retrieval from avro map: addressing by Utf8 objects not by String
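The lookup problem named in this fix can be illustrated with a minimal sketch. Avro's generic Java deserialization hands back map keys as `Utf8` objects by default, and a `Utf8` is never equal to a `java.lang.String`, so a `get("...")` with a String key misses. To stay self-contained, the sketch below uses a hypothetical stand-in class `Utf8Like` instead of the real `org.apache.avro.util.Utf8`; the key name `pmc` and the identifier value are made up for illustration and are not taken from the project.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class Utf8KeySketch {

    /**
     * Hypothetical stand-in for Avro's Utf8 (assumption, not the real class):
     * it is equal only to another Utf8Like holding the same bytes, never to a String.
     */
    static final class Utf8Like implements CharSequence {
        private final byte[] bytes;

        Utf8Like(String s) {
            this.bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        @Override public boolean equals(Object o) {
            return o instanceof Utf8Like && Arrays.equals(bytes, ((Utf8Like) o).bytes);
        }

        @Override public int hashCode() {
            return Arrays.hashCode(bytes);
        }

        @Override public String toString() {
            return new String(bytes, StandardCharsets.UTF_8);
        }

        @Override public int length() { return toString().length(); }
        @Override public char charAt(int i) { return toString().charAt(i); }
        @Override public CharSequence subSequence(int a, int b) { return toString().subSequence(a, b); }
    }

    /** Builds a map shaped like a deserialized identifier map (made-up content). */
    static Map<CharSequence, CharSequence> sampleIds() {
        Map<CharSequence, CharSequence> ids = new HashMap<>();
        ids.put(new Utf8Like("pmc"), new Utf8Like("PMC0000000"));
        return ids;
    }

    /** Plain map lookup; the result depends entirely on the runtime type of the key. */
    static CharSequence lookup(Map<CharSequence, CharSequence> ids, CharSequence key) {
        return ids.get(key);
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> ids = sampleIds();
        // A String key has a different equals/hashCode contract, so this lookup misses.
        System.out.println(lookup(ids, "pmc"));
        // A key of the same type as the stored keys finds the entry.
        System.out.println(lookup(ids, new Utf8Like("pmc")));
    }
}
```

With the real Avro classes the equivalent lookup would be `map.get(new Utf8("pmc"))`; whether the project addressed the map exactly this way is not stated in the commit message.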

33104 28/11/2014 06:13 PM Marek Horst

#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.

32942 21/11/2014 05:50 PM Marek Horst

#1017 introducing new PMC metadata ingestion currently extracting references, journal and pages fields.
Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer to utilize ExtractedDocumentMetadata instead of parsing XML file. Enabling PMC metadata ingestion in common/import.
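As a rough illustration of the SAX approach mentioned in this revision, the sketch below streams through an NLM-style record and collects the raw text of each reference without building a DOM tree. It is not the project's PmcXmlHandler: the class and method names are hypothetical, and the assumption that reference text lives in `mixed-citation` elements is made only for the example. Whitespace is collapsed so that pretty-printed input yields the same raw text as compact input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class RefSaxSketch {

    /** Collects the text content of every <mixed-citation> element in document order. */
    static class CitationHandler extends DefaultHandler {
        final List<String> citations = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a citation

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("mixed-citation".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text of nested elements (e.g. <source>) is appended as well.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("mixed-citation".equals(qName) && current != null) {
                // Collapse line breaks and indentation left by pretty printing.
                citations.add(current.toString().trim().replaceAll("\\s+", " "));
                current = null;
            }
        }
    }

    /** Parses the given XML string in a single streaming pass. */
    static List<String> extractCitations(String xml) throws Exception {
        CitationHandler handler = new CitationHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.citations;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ref-list><ref><mixed-citation>Doe J.\n    <source>Some Journal</source>\n"
                + "    2010.</mixed-citation></ref></ref-list>";
        System.out.println(extractCitations(xml));
    }
}
```

Because the handler keeps only the text of the citation currently being read, memory use stays flat regardless of document size, which is the usual reason a SAX pass outperforms DOM/XPath on large inputs.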

32324 07/11/2014 02:57 PM Marek Horst

#955 fixing reference raw text generation for pretty printed NLM documents

31234 08/10/2014 07:45 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping
