Name | Size | Revision | Age | Author | Comment
eu | | 33123 | almost 10 years | Marek Horst | #1017 fixing PMC and DOI identifiers retrieval ...

Latest revisions

# Date Author Comment
33123 01/12/2014 07:40 PM Marek Horst

#1017 fixing PMC and DOI identifiers retrieval from avro map: addressing by Utf8 objects not by String
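
The lookup problem behind this fix can be reproduced in a few lines. With Avro's default Java string handling, map keys are deserialized as `org.apache.avro.util.Utf8`, which never compares equal to a `java.lang.String`, so `map.get("key")` silently returns null. The sketch below is a minimal illustration, not the project's code: `Utf8Like` is a hypothetical stand-in for Avro's `Utf8` so the example runs without the Avro dependency, and the key names and identifier values are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class Utf8KeyLookup {

    /** Hypothetical stand-in for org.apache.avro.util.Utf8: equality is defined on UTF-8 bytes. */
    public static final class Utf8Like implements CharSequence {
        private final byte[] bytes;

        public Utf8Like(String s) { this.bytes = s.getBytes(StandardCharsets.UTF_8); }

        @Override public int length() { return toString().length(); }
        @Override public char charAt(int index) { return toString().charAt(index); }
        @Override public CharSequence subSequence(int start, int end) { return toString().subSequence(start, end); }
        @Override public String toString() { return new String(bytes, StandardCharsets.UTF_8); }

        // Equal only to another Utf8Like with the same bytes, never to a String.
        @Override public boolean equals(Object o) {
            return o instanceof Utf8Like && Arrays.equals(bytes, ((Utf8Like) o).bytes);
        }
        @Override public int hashCode() { return Arrays.hashCode(bytes); }
    }

    /** Builds a map shaped like a deserialized Avro map; keys and values are made-up samples. */
    public static Map<CharSequence, CharSequence> sampleExternalIds() {
        Map<CharSequence, CharSequence> ids = new HashMap<>();
        ids.put(new Utf8Like("pmc"), new Utf8Like("PMC0000000"));
        ids.put(new Utf8Like("doi"), new Utf8Like("10.0000/example"));
        return ids;
    }

    /** Lookup with a String key: misses, because String.equals(Utf8Like) is false. */
    public static CharSequence lookupByString(Map<CharSequence, CharSequence> ids, String key) {
        return ids.get(key);
    }

    /** Lookup with a key of the same type as the stored keys: hits. */
    public static CharSequence lookupByUtf8(Map<CharSequence, CharSequence> ids, String key) {
        return ids.get(new Utf8Like(key));
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> ids = sampleExternalIds();
        System.out.println("by String: " + lookupByString(ids, "pmc"));
        System.out.println("by Utf8:   " + lookupByUtf8(ids, "pmc"));
    }
}
```

With real Avro data the same lookup would be written as `map.get(new Utf8("pmc"))`, where the key name is again only an example.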

33104 28/11/2014 06:13 PM Marek Horst

#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.

32942 21/11/2014 05:50 PM Marek Horst

#1017 introducing new PMC metadata ingestion, currently extracting references, journal and pages fields.
Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer to use ExtractedDocumentMetadata instead of parsing the XML file. Enabling PMC metadata ingestion in common/import.
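
For context on the DOM/XPath to SAX switch: a SAX handler streams through the document and keeps only the text it needs, instead of building the whole tree in memory first. The following is a rough sketch using the JDK's built-in SAX parser, not the handler from this revision. The `ref` element name is assumed from the NLM/JATS tag set, the class and method names are hypothetical, and collapsing whitespace in the collected text is an illustrative choice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxReferenceSketch {

    /** Collects the text content of every <ref> element (element name is an assumption). */
    public static class RefHandler extends DefaultHandler {
        public final List<String> references = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a <ref>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("ref".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per text node, so append rather than assign.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("ref".equals(qName) && current != null) {
                // Collapse runs of whitespace so pretty-printed input yields one clean line.
                references.add(current.toString().trim().replaceAll("\\s+", " "));
                current = null;
            }
        }
    }

    /** Parses the XML string in a single streaming pass and returns the reference texts. */
    public static List<String> extractReferences(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            RefHandler handler = new RefHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.references;
        } catch (Exception e) {
            throw new RuntimeException("Failed to parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input, not taken from PMC.
        String xml = "<article><back><ref-list>"
                + "<ref><mixed-citation>Doe J.\n    Sample title. 2010.</mixed-citation></ref>"
                + "</ref-list></back></article>";
        System.out.println(extractReferences(xml));
    }
}
```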

32324 07/11/2014 02:57 PM Marek Horst

#955 fixing reference raw text generation for pretty printed NLM documents

31234 08/10/2014 07:45 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

31117 06/10/2014 01:20 PM Marek Horst

#757 adding reduce phase for filtering out pmids by article type: the mapping phase groups PmidMapping objects by pmid and the reducer phase filters out duplicates
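
The group-then-filter idea in this commit can be sketched in plain Java, without Hadoop. Everything here is an assumption made for illustration: the `Mapping` class and its fields are hypothetical stand-ins for PmidMapping, the set of accepted article types is a parameter because the log does not say which types are kept, and "keep the first accepted mapping per pmid" is just one possible reduce rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class PmidDedupSketch {

    /** Hypothetical stand-in for PmidMapping; field names are assumptions. */
    public static final class Mapping {
        public final String pmid;
        public final String oaid;
        public final String articleType;

        public Mapping(String pmid, String oaid, String articleType) {
            this.pmid = pmid;
            this.oaid = oaid;
            this.articleType = articleType;
        }
    }

    /** "Map" phase: group mappings by pmid, preserving first-seen order of pmids. */
    public static Map<String, List<Mapping>> groupByPmid(List<Mapping> input) {
        return input.stream()
                .collect(Collectors.groupingBy(m -> m.pmid, LinkedHashMap::new, Collectors.toList()));
    }

    /** "Reduce" phase: per pmid, keep the first mapping with an accepted article type, drop the rest. */
    public static List<Mapping> reduce(Map<String, List<Mapping>> grouped, Set<String> acceptedTypes) {
        List<Mapping> result = new ArrayList<>();
        for (List<Mapping> group : grouped.values()) {
            for (Mapping m : group) {
                if (acceptedTypes.contains(m.articleType)) {
                    result.add(m);
                    break; // at most one mapping survives per pmid
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample records.
        List<Mapping> input = Arrays.asList(
                new Mapping("1", "oa-a", "research-article"),
                new Mapping("1", "oa-b", "correction"),
                new Mapping("2", "oa-c", "correction"));
        List<Mapping> kept = reduce(groupByPmid(input), Collections.singleton("research-article"));
        for (Mapping m : kept) {
            System.out.println(m.pmid + " -> " + m.oaid);
        }
    }
}
```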

31116 06/10/2014 01:18 PM Marek Horst

#757 introducing article type extraction along with unit test. Article type will be required for filtering out pmc duplicates and leaving only proper types

30986 01/10/2014 06:37 PM Marek Horst

#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation

30802 20/09/2014 02:19 PM Michal Oniszczuk

Stub of a solution to the task #576: Ingestion of metadata from EuropePMC.
