Name            Size     Revision  Age           Author       Comment
oozie_app                34914     over 9 years  Marek Horst  #1147 introducing HTML import and HTML plaintex...
job.properties  2.88 KB  31759     over 9 years  Marek Horst  renaming metadataextraction_excluded_ids to mor...

Latest revisions

# Date Author Comment
34914 27/02/2015 07:34 PM Marek Horst

#1147 introducing HTML import and HTML plaintext ingestion in main workflows: primary and preprocessing

33184 04/12/2014 04:09 PM Marek Horst

#919 enabling concepts matching for FET projects in main workflows: import, export, primary and preprocessing

33098 28/11/2014 04:27 PM Marek Horst

#1022 introducing extracted document metadata collapser at importing phase.
Propagating extracted document metadata (including PMC ingested metadata) to the processing part of the workflow, which can be exploited by the citation matching module.
Introducing citations collapser in last stage of processing phase collapsing ingested citations with matched citations.

32829 17/11/2014 03:45 PM Marek Horst

#963 propagating dataset -> mdstore from import to exporting phase: importer produces DocumentToMDStore datastore utilized by exporter module. Updating transformer definition to handle DocumentToMDStore instead of Identifier schema

31759 27/10/2014 06:20 PM Marek Horst

renaming metadataextraction_excluded_ids to more appropriate metadataextraction_excluded_checksums

31758 27/10/2014 06:11 PM Marek Horst

#913 introducing support for max file size parameter, currently checked against Content-Length header

31267 10/10/2014 03:37 PM Marek Horst

introducing merge_body_with_updates flag support in common/import, setting to true in statistics workflow

31250 09/10/2014 03:33 PM Marek Horst

introducing regex support in result approver to support iis::* kind of provenance, updating workflow definitions with proper regex values

29835 22/08/2014 05:38 PM Marek Horst

removing common import input parameters which are not required in this context

29827 22/08/2014 02:34 PM Marek Horst

introducing trust_level_threshold support in statistics workflow
