Name      Size  Revision  Age            Author       Comment
branches        37882     about 9 years  Marek Horst  merging trunk changes with IIS-CDH-5.3.0 branch
tags            33543     over 9 years   Marek Horst  [maven-release-plugin] copy for tag icm-iis-tr...
trunk           37874     about 9 years  Marek Horst  #1381 porting pmc citations ingestion from casc...

Latest revisions

# Date Author Comment
37882 19/06/2015 04:22 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

37874 19/06/2015 02:07 PM Marek Horst

#1381 porting pmc citations ingestion from cascading framework to pig. Moving code from icm-iis-ingest-pmc to icm-iis-transformers including integration tests, removing obsolete scala code along with unneeded dependencies. Switching subworkflow in primary workflow.

37652 08/06/2015 01:37 PM Marek Horst

expecting null affiliations instead of empty array

37651 08/06/2015 01:29 PM Marek Horst

adding missing affiliations field in input data, removing duplicates from output

37594 29/05/2015 05:16 PM Marek Horst

adding missing affiliations field in integration test expected output

37368 21/05/2015 06:26 PM Marek Horst

#1315 propagating confidenceLevel to DocumentToConceptIds. Updating PIG transformer script by introducing concept identifiers deduplication UDF function picking record with the highest confidence level, introducing unit and integration tests. Propagating changes in document to concepts exporter module.

37360 21/05/2015 02:46 PM Marek Horst

removing obsolete test resources

37347 20/05/2015 06:49 PM Marek Horst

#1329 adding affiliations field in ExtractedDocumentMetadata PMC schema. Metadata extraction code refactoring by extracting code responsible for building Affiliation avro records to AffiliationBuilder class and sharing it with pmc ingestion. Implementing affiliations ingestion functionality in PmcXmlHandler covered with unit tests. Adding affiliations field support in ingest pmc metadata transformer.

37263 15/05/2015 12:57 PM Marek Horst

#1306 introducing dummy field in DocumentId schema required to overcome https://issues.apache.org/jira/browse/PIG-3358 issue. Handling dummy field in transformer pig scripts when it is required. Should be reverted as soon as PIG-3358 issue is fixed.

37258 14/05/2015 11:16 PM Marek Horst

#1312 wrapping tuple schema returned by outputSchema() method as described in PIG-3082
