Name  Size  Revision  Age            Author       Comment
main        37980     over 9 years   Marek Horst  #1395 WorkflowRuntimeParameters static fields c...
test        39055     about 9 years  Marek Horst  #1498 adding missing extracted metadata fields

Latest revisions

# Date Author Comment
39055 05/09/2015 09:14 PM Marek Horst

#1498 adding missing extracted metadata fields

37980 26/06/2015 07:46 PM Marek Horst

#1395 WorkflowRuntimeParameters static fields cleanup, moving parameters to dedicated modules to prevent excessive icm-iis-common module modifications

37924 22/06/2015 06:45 PM Marek Horst

fixing test after cermine upgrade

37714 10/06/2015 09:02 PM Marek Horst

making integration tests run on dedicated test cluster instead of embedded mini-oozie container

37713 10/06/2015 08:05 PM Marek Horst

removing obsolete avro-based workflow

37665 08/06/2015 04:42 PM Marek Horst

reverting example-1.pdf file removal which seems to be required by CermineMetadataExtractionTest

37660 08/06/2015 03:41 PM Marek Horst

fixing cermine integration test, changing PDF contents

37659 08/06/2015 03:40 PM Marek Horst

fixing cermine integration test, changing PDF contents

37343 20/05/2015 06:49 PM Marek Horst

#1329 adding affiliations field in ExtractedDocumentMetadata PMC schema. Metadata extraction code refactoring by extracting code responsible for building Affiliation avro records to AffiliationBuilder class and sharing it with pmc ingestion. Implementing affiliations ingestion functionality in PmcXmlHandler covered with unit tests. Adding affiliations field support in ingest pmc metadata transformer.

36394 15/04/2015 05:17 PM Marek Horst

#1240 raising mapred.task.timeout to 3600000 (1h) just in case any extremely complex PDF documents appear. All time-consuming documents will be registered in the failure sink.
