  • svn:ignore: *.iml .* bin build target

# Date Author Comment
37882 19/06/2015 04:22 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

37263 15/05/2015 12:57 PM Marek Horst

#1306 introducing dummy field in DocumentId schema required to overcome https://issues.apache.org/jira/browse/PIG-3358 issue. Handling dummy field in transformer pig scripts when it is required. Should be reverted as soon as PIG-3358 issue is fixed.

37258 14/05/2015 11:16 PM Marek Horst

#1312 wrapping tuple schema returned by outputSchema() method as described in PIG-3082

37181 12/05/2015 07:01 PM Marek Horst

removing oozie-sharelib-distcp dependency from pom.xml file and relying on oozie.use.system.libpath=true set among job.properties

37147 11/05/2015 07:31 PM Marek Horst

replacing icm-iis-3rdparty-pig-avrostorage dependency with original piggybank

37109 11/05/2015 02:07 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

35411 17/03/2015 03:04 PM Marek Horst

#1198 aligning IIS dependencies and java code to CDH5.3.0 cluster

35397 17/03/2015 03:01 PM Marek Horst

#1197 introducing job.properties changes aligning paths to rumcajs cluster HDFS structure

35252 11/03/2015 04:49 PM Marek Horst

creating IIS-CDH-5.3.0 branch

35228 11/03/2015 01:14 PM Marek Horst

#1195 removing obsolete ports docreation and datasetid from hbase mapred import, removing references to those ports in workflow.xml files, updating transformer by removing filtering by datasetid due to decisions made in #1072

35151 06/03/2015 05:34 PM Marek Horst

introducing repeatable ordering of citations by ordering them by citation rawText

34993 03/03/2015 02:36 PM Marek Horst

#1169 fixing duplicate context issue, introducing integration test proving implemented solution works properly

34910 27/02/2015 06:50 PM Marek Horst

simplifying schema related PIG parameters

34909 27/02/2015 06:49 PM Marek Horst

simplifying schema related PIG parameters

34908 27/02/2015 06:48 PM Marek Horst

#1147 introducing union4 pig script

34695 20/02/2015 07:17 PM Marek Horst

#1133 dropping useless working_dir creation for java nodes

34687 20/02/2015 06:04 PM Marek Horst

#1133 dropping useless working_dir creation for pig nodes

34617 19/02/2015 06:12 PM Marek Horst

#1038 introducing ranges in dependencies definition for all IIS modules

34506 13/02/2015 02:12 PM Marek Horst

#118 introducing website usage community filter filtering out publication identifiers based on ids set retrieved from InformationSpace. This is required to exclude removed publications which were still present in logs.

34504 13/02/2015 01:07 PM Marek Horst

#118 removing obsolete and duplicate transformer

33665 18/12/2014 10:19 AM Marek Horst

updating job.properties

33544 16/12/2014 12:20 PM Marek Horst

[maven-release-plugin] prepare for next development iteration

33542 16/12/2014 12:20 PM Marek Horst

[maven-release-plugin] prepare release icm-iis-transformers-1.0.0

33541 16/12/2014 11:49 AM Marek Horst

#1044 pre-release switching to released version of parent pom and released dependencies

33422 15/12/2014 12:51 PM Marek Horst

introducing scm definition

33245 09/12/2014 06:41 PM Marek Horst

#919 renaming DocumentToResearchInitiative to DocumentToConceptId and DocumentToResearchInitiatives to DocumentToConceptIds

33237 09/12/2014 02:13 PM Marek Horst

#1019 introducing integration test

33179 04/12/2014 01:29 PM Marek Horst

#919 introducing integration test input and output

33177 04/12/2014 12:08 PM Marek Horst

#919 introducing integration test containing empty input and output

33119 01/12/2014 01:33 PM Marek Horst

#919 introducing project to concept transformer module

32993 26/11/2014 03:57 PM Marek Horst

#1019 introducing PIG module transforming pmc ingested metadata into common extracted document metadata

32827 17/11/2014 03:45 PM Marek Horst

#963 propagating dataset -> mdstore from import to exporting phase: importer produces DocumentToMDStore datastore utilized by exporter module. Updating transformer definition to handle DocumentToMDStore instead of Identifier schema

32244 05/11/2014 05:34 PM Marek Horst

introducing embedded integration test entry

31843 28/10/2014 03:31 PM Marek Horst

#913 renaming DocumentContentUrl#contentSize to DocumentContentUrl#contentSizeKB, changing field type from int to long, importing content size from ObjectStoreFile#fileSizeKB, updating dnet-objectstore-rmi dependency from 1.0.0 to 2.0.1-SNAPSHOT

31783 28/10/2014 11:50 AM Marek Horst

#913 supplementing json files with newly introduced DocumentContentUrl#contentSize field value set to null

31779 28/10/2014 11:29 AM Marek Horst

#913 introducing DocumentContentUrl#contentSize field, handling it properly in all PIG transformers

31226 08/10/2014 06:19 PM Marek Horst

#840 moving IdentifierMapping from importer to common package

31220 08/10/2014 06:12 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

31037 02/10/2014 02:29 PM Marek Horst

introducing cloudera repository in parent container, removing repository definitions from individual IIS modules

30897 26/09/2014 02:49 PM Marek Horst

adding missing affiliation fields: countryCode, address, renaming country to countryName

30896 26/09/2014 02:47 PM Marek Horst

adding missing affiliation fields: countryCode, address, renaming country to countryName

30188 16/09/2014 10:22 AM Marek Horst

#757 introducing doitooaid transformer processing DocumentMetadata datastore holding metadata imported from InformationSpace and creating datastore holding <doi,oaid> pairs which will be used by pmc ingestor for matching references identified by doi

30181 15/09/2014 05:31 PM Dominika Tkaczyk

null reference ids removed

30121 11/09/2014 12:44 PM Marek Horst

updating default job.properties

29936 02/09/2014 02:49 PM Marek Horst

removing memory related properties; fixing #757 should solve all memory related problems

29914 29/08/2014 06:29 PM Marek Horst

#568 introducing citations grouping by sourceDocumentId, still to be adjusted for ingested pmc citations outcome which currently seems to hang up

29906 29/08/2014 11:53 AM Marek Horst

#577 introducing UDF producing empty map, two transformers building common Citation datastore from citationmatching and pmc ingestion outcome. Both are required by collapser.

29482 23/07/2014 05:36 PM Marek Horst

introducing importer/plaintext/skip_extracted transformer required for plaintext import caching

29087 14/07/2014 02:08 PM Marek Horst

#354 removing obsolete transformers/export/person transformer along with tests

29084 14/07/2014 01:49 PM Marek Horst

#354 removing obsolete transformers/export/inferenced_document_without_imported_data transformer along with tests

29083 14/07/2014 01:21 PM Marek Horst

#354 removing obsolete transformers/export/identifier/referenceddatasets transformer along with tests

29080 14/07/2014 12:47 PM Marek Horst

#354 removing obsolete transformers/export/identifier/documents transformer along with tests

29079 14/07/2014 12:43 PM Marek Horst

#354 removing obsolete transformers/export/document transformer along with tests

28991 10/07/2014 04:23 PM Marek Horst

replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor

28967 09/07/2014 01:12 PM Marek Horst

replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor

28966 09/07/2014 01:02 PM Marek Horst

replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor

28954 08/07/2014 05:14 PM Marek Horst

updating default job.properties

28953 08/07/2014 05:14 PM Marek Horst

updating default job.properties

28850 02/07/2014 07:08 PM Marek Horst

updating default job.properties

28800 02/07/2014 11:43 AM Marek Horst

adding missing "confidenceLevel" field

28799 02/07/2014 11:43 AM Marek Horst

adding missing "confidenceLevel" field

28798 02/07/2014 11:42 AM Marek Horst

adding missing "confidenceLevel" field

28796 02/07/2014 11:40 AM Marek Horst

adding missing "confidenceLevel" field

28795 02/07/2014 11:40 AM Marek Horst

adding missing "confidenceLevel" field

28777 01/07/2014 05:07 PM Marek Horst

introducing deploy.info file for module icm-iis-transformers