
# Date Author Comment
39163 10/09/2015 06:13 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

37881 19/06/2015 04:10 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

37117 11/05/2015 02:58 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

35402 17/03/2015 03:01 PM Marek Horst

#1197 introducing job.properties changes aligning paths to rumcajs cluster HDFS structure

35259 11/03/2015 04:53 PM Marek Horst

creating IIS-CDH-5.3.0 branch

34945 02/03/2015 01:18 PM Marek Horst

updating job.properties

33104 28/11/2014 06:13 PM Marek Horst

#1017 accepting ExtractedDocumentMetadata instead of DocumentText at PMC citation ingestion input. Aligning integration test and importer workflow.

32942 21/11/2014 05:50 PM Marek Horst

#1017 introducing new PMC metadata ingestion, currently extracting references, journal and pages fields.
Replacing DOM/XPath based citations ingestion with much faster SAX version. Changing pmidtooaid transformer to utilize ExtractedDocumentMetadata instead of parsing XML file. Enabling PMC metadata ingestion in common/import.

31225 08/10/2014 06:19 PM Marek Horst

#840 moving IdentifierMapping from importer to common package

31218 08/10/2014 06:12 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

31117 06/10/2014 01:20 PM Marek Horst

#757 adding reduce phase for filtering out pmids by article type: the map phase groups PmidMapping objects by pmid, and duplicates are filtered out in the reduce phase

30987 01/10/2014 06:38 PM Marek Horst

#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation

30986 01/10/2014 06:37 PM Marek Horst

#757 fixing pmid and doi matching, fixing sourceDocumentId and destinationDocumentId generation

30145 12/09/2014 03:16 PM Marek Horst

updating default job properties

29631 28/07/2014 09:45 PM Marek Horst

renaming workflow to ingest_pmc_plaintext

28990 10/07/2014 04:15 PM Marek Horst

updating job.properties

28973 09/07/2014 05:55 PM mateusz.fedoryszak

dir names in parameters should not contain nameNode