Name       Size  Revision  Age             Author         Comment
metadata         39086     almost 9 years  Marek Horst    renaming test resources to be compliant with wi...
plaintext        35701     over 9 years    Mateusz Kobos  Removing usage of working_dir from Java workflo...

Latest revisions

# Date Author Comment
39086 08/09/2015 01:43 PM Marek Horst

renaming test resources to be compliant with Windows file system naming requirements

39045 04/09/2015 11:26 PM Marek Horst

#1498 introducing major citations-related refactoring, including new generic direct citation matching moved to processing phase; introduced position field in all citations schemas and updated collapser to take position into account when merging citation details coming from 3 different sources: fuzzy citationmatching, direct citationmatching, references metadata

37875 19/06/2015 02:10 PM Marek Horst

#1381 porting pmc citations ingestion from cascading framework to pig. Moving code from icm-iis-ingest-pmc to icm-iis-transformers including integration tests, removing obsolete scala code along with unneeded dependencies. Switching subworkflow in primary workflow.

37344 20/05/2015 06:49 PM Marek Horst

#1329 adding affiliations field in ExtractedDocumentMetadata PMC schema. Metadata extraction code refactoring by extracting code responsible for building Affiliation avro records to AffiliationBuilder class and sharing it with pmc ingestion. Implementing affiliations ingestion functionality in PmcXmlHandler covered with unit tests. Adding affiliations field support in ingest pmc metadata transformer.

35701 27/03/2015 06:18 AM Mateusz Kobos

Removing usage of working_dir from Java workflow node.

34693 20/02/2015 07:16 PM Marek Horst

#1133 dropping useless working_dir creation for java nodes

33131 02/12/2014 12:48 PM Marek Horst

replacing non-standard dash character with '-'

33125 01/12/2014 09:06 PM Marek Horst

#1017 fixing expected citations

33104 28/11/2014 06:13 PM Marek Horst

#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.

32942 21/11/2014 05:50 PM Marek Horst

#1017 introducing new PMC metadata ingestion currently extracting references, journal and pages fields.
Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer utilizing ExtractedDocumentMetadata instead of parsing XML file. Enabling PMC metadata ingestion in common/import.
