Name          Size     Revision  Age             Author       Comment
import.txt    1.69 KB  37873     almost 9 years  Marek Horst  #1381 porting pmc citations ingestion from casc...
workflow.xml  41 KB    38007     almost 9 years  Marek Horst  #1397 removing obsolete parameters in subworkfl...

Latest revisions

# Date Author Comment
38007 29/06/2015 03:14 PM Marek Horst

#1397 removing obsolete parameters in subworkflow actions definitions

37873 19/06/2015 02:06 PM Marek Horst

#1381 porting pmc citations ingestion from cascading framework to pig. Moving code from icm-iis-ingest-pmc to icm-iis-transformers including integration tests, removing obsolete scala code along with unneeded dependencies. Switching subworkflow in primary workflow.

37585 29/05/2015 04:17 PM Marek Horst

#1339 fixing input_dedup_map in pmc citation ingestion when match_content_with_metadata=false. It should be set statically, not dynamically; it will be enabled only when metadata_import is enabled

37561 29/05/2015 02:27 PM Marek Horst

#1339 replacing active_existence_filter flag with match_content_with_metadata and changing identifiers matching logic: when the flag is disabled, content identifiers will be neither filtered nor deduplicated against metadata identifiers. Up until now, when the active_existence_filter flag was disabled, contents were deduplicated, which is not desired when running IIS in standalone mode on contents having their representatives in HBase

37533 28/05/2015 04:16 PM Marek Horst

#1329 enabling pmc ingestion when active_metadataextraction_export flag is enabled

37464 25/05/2015 10:13 PM Marek Horst

#1308 reverting uri:oozie:distcp-action:0.2 change: version is not properly recognized by oozie 3.3.2-cdh4.3.1

37194 13/05/2015 12:49 PM Marek Horst

#1308 switching distcp namespace to uri:oozie:distcp-action:0.2

35946 02/04/2015 05:52 PM Marek Horst

#1248 introducing fault subdirectory support in all workflows wrapping metadataextraction subworkflow up to the processing and primary root workflows. This should prevent the fault directory from being removed when the ${remove_sideproducts} flag is enabled; it will be propagated along with metadata and plaintext.

35701 27/03/2015 06:18 AM Mateusz Kobos

Removing usage of working_dir from Java workflow node.

35229 11/03/2015 01:14 PM Marek Horst

#1195 removing obsolete ports docreation and datasetid from hbase mapred import, removing references to those ports in workflow.xml files, updating transformer by removing filtering by datasetid due to decisions made in #1072
