Name Size Revision Age Author Comment
import.txt 320 Bytes 29731 almost 10 years Marek Horst #9059 reverting #717 change: shortening app_pat...
workflow.xml 8.44 KB 38007 almost 9 years Marek Horst #1397 removing obsolete parameters in subworkfl...

Latest revisions

# Date Author Comment
38007 29/06/2015 03:14 PM Marek Horst

#1397 removing obsolete parameters in subworkflow actions definitions

35946 02/04/2015 05:52 PM Marek Horst

#1248 introducing fault subdirectory support in all workflows wrapping the metadataextraction subworkflow, up to the processing and primary root workflows. This should prevent the fault directory from being removed when the ${remove_sideproducts} flag is enabled; it will be propagated along with metadata and plaintext.

35057 04/03/2015 05:30 PM Marek Horst

#1176 defining remove_sideproducts property in workflow headers

34914 27/02/2015 07:34 PM Marek Horst

#1147 introducing HTML import and HTML plaintext ingestion in main workflows: primary and preprocessing

34212 02/02/2015 06:21 PM Marek Horst

#1070 introducing support for multiple context identifiers, replacing import_project_concepts_context_id IIS input parameter with import_project_concepts_context_ids_csv

33184 04/12/2014 04:09 PM Marek Horst

#919 enabling concepts matching for FET projects in main workflows: import, export, primary and preprocessing

33098 28/11/2014 04:27 PM Marek Horst

#1022 introducing extracted document metadata collapser at importing phase.
Propagating extracted document metadata (including PMC ingested metadata) to the processing part of the workflow, which can be exploited by the citation matching module.
Introducing citations collapser in the last stage of the processing phase, collapsing ingested citations with matched citations.

32829 17/11/2014 03:45 PM Marek Horst

#963 propagating dataset -> mdstore from import to exporting phase: importer produces DocumentToMDStore datastore utilized by exporter module. Updating transformer definition to handle DocumentToMDStore instead of Identifier schema

31759 27/10/2014 06:20 PM Marek Horst

renaming metadataextraction_excluded_ids to more appropriate metadataextraction_excluded_checksums

31758 27/10/2014 06:11 PM Marek Horst

#913 introducing support for max file size parameter, currently checked against Content-Length header
