/modules/icm-iis-transformers/trunk/src - Changes - D-Net - D-Net project tracking tool

dnet40/modules/icm-iis-transformers/trunk/src @ 36333

#	Date	Author	Comment
36306	10/04/2015 01:03 PM	Marek Horst	#1257 dropping schema generation related hacks in all PIG modules, switching to literal schema parameters
35701	27/03/2015 06:18 AM	Mateusz Kobos	Removing usage of working_dir from Java workflow node.
35517	19/03/2015 05:59 PM	Marek Horst	#1210 introducing generic PIG module filtering inferred data by confidence level
35228	11/03/2015 01:14 PM	Marek Horst	#1195 removing obsolete ports docreation and datasetid from hbase mapred import, removing references to those ports in workflow.xml files, updating transformer by removing filtering by datasetid due to decisions made in #1072
35151	06/03/2015 05:34 PM	Marek Horst	introducing repeatable ordering of citations by sorting them by citation rawText
34993	03/03/2015 02:36 PM	Marek Horst	#1169 fixing duplicate context issue, introducing integration test proving implemented solution works properly
34910	27/02/2015 06:50 PM	Marek Horst	simplifying schema related PIG parameters
34909	27/02/2015 06:49 PM	Marek Horst	simplifying schema related PIG parameters
34908	27/02/2015 06:48 PM	Marek Horst	#1147 introducing union4 pig script
34695	20/02/2015 07:17 PM	Marek Horst	#1133 dropping useless working_dir creation for java nodes
34687	20/02/2015 06:04 PM	Marek Horst	#1133 dropping useless working_dir creation for pig nodes
34506	13/02/2015 02:12 PM	Marek Horst	#118 introducing website usage community filter filtering out publication identifiers based on ids set retrieved from InformationSpace. This is required to exclude removed publications which were still present in logs.
34504	13/02/2015 01:07 PM	Marek Horst	#118 removing obsolete and duplicate transformer
33665	18/12/2014 10:19 AM	Marek Horst	updating job.properties
33245	09/12/2014 06:41 PM	Marek Horst	#919 renaming DocumentToResearchInitiative to DocumentToConceptId and DocumentToResearchInitiatives to DocumentToConceptIds
33237	09/12/2014 02:13 PM	Marek Horst	#1019 introducing integration test
33179	04/12/2014 01:29 PM	Marek Horst	#919 introducing integration test input and output
33177	04/12/2014 12:08 PM	Marek Horst	#919 introducing integration test containing empty input and output
33119	01/12/2014 01:33 PM	Marek Horst	#919 introducing project to concept transformer module
32993	26/11/2014 03:57 PM	Marek Horst	#1019 introducing PIG module transforming pmc ingested metadata into common extracted document metadata
32827	17/11/2014 03:45 PM	Marek Horst	#963 propagating dataset -> mdstore from import to exporting phase: importer produces DocumentToMDStore datastore utilized by exporter module. Updating transformer definition to handle DocumentToMDStore instead of Identifier schema
31843	28/10/2014 03:31 PM	Marek Horst	#913 renaming DocumentContentUrl#contentSize to DocumentContentUrl#contentSizeKB, changing field type from int to long, importing content size from ObjectStoreFile#fileSizeKB, updating dnet-objectstore-rmi dependency from 1.0.0 to 2.0.1-SNAPSHOT
31783	28/10/2014 11:50 AM	Marek Horst	#913 supplementing json files with newly introduced DocumentContentUrl#contentSize field value set to null
31779	28/10/2014 11:29 AM	Marek Horst	#913 introducing DocumentContentUrl#contentSize field, handling it properly in all PIG transformers
31226	08/10/2014 06:19 PM	Marek Horst	#840 moving IdentifierMapping from importer to common package
31220	08/10/2014 06:12 PM	Marek Horst	#840 renaming DeduplicationMapping to more generic IdentifierMapping
30897	26/09/2014 02:49 PM	Marek Horst	adding missing affiliation fields: countryCode, address, renaming country to countryName
30896	26/09/2014 02:47 PM	Marek Horst	adding missing affiliation fields: countryCode, address, renaming country to countryName
30188	16/09/2014 10:22 AM	Marek Horst	#757 introducing doitooaid transformer processing DocumentMetadata datastore holding metadata imported from InformationSpace and creating datastore holding <doi,oaid> pairs which will be used by pmc ingestor for matching references identified by doi
30181	15/09/2014 05:31 PM	Dominika Tkaczyk	null reference ids removed
30121	11/09/2014 12:44 PM	Marek Horst	updating default job.properties
29936	02/09/2014 02:49 PM	Marek Horst	removing memory related properties, fixing #757 should solve all memory related problems
29914	29/08/2014 06:29 PM	Marek Horst	#568 introducing citations grouping by sourceDocumentId, still to be adjusted for ingested pmc citations outcome which currently seems to hang up
29906	29/08/2014 11:53 AM	Marek Horst	#577 introducing UDF producing empty map, two transformers building common Citation datastore from citationmatching and pmc ingestion outcome. Both are required by collapser.
29482	23/07/2014 05:36 PM	Marek Horst	introducing importer/plaintext/skip_extracted transformer required for plaintext import caching
29087	14/07/2014 02:08 PM	Marek Horst	#354 removing obsolete transformers/export/person transformer along with tests
29084	14/07/2014 01:49 PM	Marek Horst	#354 removing obsolete transformers/export/inferenced_document_without_imported_data transformer along with tests
29083	14/07/2014 01:21 PM	Marek Horst	#354 removing obsolete transformers/export/identifier/referenceddatasets transformer along with tests
29080	14/07/2014 12:47 PM	Marek Horst	#354 removing obsolete transformers/export/identifier/documents transformer along with tests
29079	14/07/2014 12:43 PM	Marek Horst	#354 removing obsolete transformers/export/document transformer along with tests
28991	10/07/2014 04:23 PM	Marek Horst	replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor
28967	09/07/2014 01:12 PM	Marek Horst	replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor
28966	09/07/2014 01:02 PM	Marek Horst	replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor
28954	08/07/2014 05:14 PM	Marek Horst	updating default job.properties
28953	08/07/2014 05:14 PM	Marek Horst	updating default job.properties
28850	02/07/2014 07:08 PM	Marek Horst	updating default job.properties
28800	02/07/2014 11:43 AM	Marek Horst	adding missing "confidenceLevel" field
28799	02/07/2014 11:43 AM	Marek Horst	adding missing "confidenceLevel" field
28798	02/07/2014 11:42 AM	Marek Horst	adding missing "confidenceLevel" field
28796	02/07/2014 11:40 AM	Marek Horst	adding missing "confidenceLevel" field
28795	02/07/2014 11:40 AM	Marek Horst	adding missing "confidenceLevel" field
