| Name | Size | Revision | Age | Author | Comment |
|---|---|---|---|---|---|
| core | | | over 11 years | marek.horst | |
| src | | 33249 | almost 10 years | Marek Horst | #919 renaming DocumentToResearchInitiative to D... |
| README.markdown | 708 Bytes | | almost 12 years | mateusz.kobos | |
| deploy.info | 2 KB | 31682 | about 10 years | Marek Horst | adding integration-test job name suffix |
| pom.xml | 10.7 KB | 31041 | about 10 years | Marek Horst | introducing cloudera repository in parent conta... |

  • svn:ignore: .* bin target build

Latest revisions

# Date Author Comment
33249 09/12/2014 06:41 PM Marek Horst

#919 renaming DocumentToResearchInitiative to DocumentToConceptId and DocumentToResearchInitiatives to DocumentToConceptIds

33228 09/12/2014 11:02 AM Marek Horst

#1022 introducing PMC extracted document metadata collapser removing duplicates before sending output to PMC citation ingestion module

33218 05/12/2014 04:26 PM Marek Horst

#919 adding missing i/o ports related to FET projects reference extraction

33184 04/12/2014 04:09 PM Marek Horst

#919 enabling concepts matching for FET projects in mainworkflows: import, export, primary and preprocessing

33105 28/11/2014 06:13 PM Marek Horst

#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.

33098 28/11/2014 04:27 PM Marek Horst

#1022 introducing extracted document metadata collapser at importing phase.
Propagating extracted document metadata (including PMC ingested metadata) to processing part of workflow, which can be exploited by citation matching module.
Introducing citations collapser in last stage of processing phase collapsing ingested citations with matched citations.

32943 21/11/2014 05:50 PM Marek Horst

#1017 introducing new PMC metadata ingestion currently extracting references, journal and pages fields.
Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer utilizing ExtractedDocumentMetadata instead of parsing XML file. Enabling PMC metadata ingestion in common/import.

32829 17/11/2014 03:45 PM Marek Horst

#963 propagating dataset -> mdstore from import to exporting phase: importer produces DocumentToMDStore datastore utilized by exporter module. Updating transformer definition to handle DocumentToMDStore instead of Identifier schema

32825 17/11/2014 03:43 PM Marek Horst

introducing separate citations json containing expected results, not enabled in workflow yet

32824 17/11/2014 03:42 PM Marek Horst

updating job.properties
