
# Date Author Comment
32993 26/11/2014 03:57 PM Marek Horst

#1019 introducing PIG module transforming pmc ingested metadata into common extracted document metadata

32827 17/11/2014 03:45 PM Marek Horst

#963 propagating dataset -> mdstore from import to export phase: importer produces DocumentToMDStore datastore utilized by exporter module. Updating transformer definition to handle DocumentToMDStore instead of Identifier schema

31843 28/10/2014 03:31 PM Marek Horst

#913 renaming DocumentContentUrl#contentSize to DocumentContentUrl#contentSizeKB, changing field type from int to long, importing content size from ObjectStoreFile#fileSizeKB, updating dnet-objectstore-rmi dependency from 1.0.0 to 2.0.1-SNAPSHOT

31783 28/10/2014 11:50 AM Marek Horst

#913 supplementing json files with newly introduced DocumentContentUrl#contentSize field value set to null

31779 28/10/2014 11:29 AM Marek Horst

#913 introducing DocumentContentUrl#contentSize field, handling it properly in all PIG transformers

31226 08/10/2014 06:19 PM Marek Horst

#840 moving IdentifierMapping from importer to common package

31220 08/10/2014 06:12 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

30897 26/09/2014 02:49 PM Marek Horst

adding missing affiliation fields: countryCode, address, renaming country to countryName

30896 26/09/2014 02:47 PM Marek Horst

adding missing affiliation fields: countryCode, address, renaming country to countryName

30188 16/09/2014 10:22 AM Marek Horst

#757 introducing doitooaid transformer processing DocumentMetadata datastore holding metadata imported from InformationSpace and creating datastore holding <doi,oaid> pairs which will be used by pmc ingestor for matching references identified by doi

30181 15/09/2014 05:31 PM Dominika Tkaczyk

null reference ids removed

30121 11/09/2014 12:44 PM Marek Horst

updating default job.properties

29936 02/09/2014 02:49 PM Marek Horst

removing memory-related properties; fixing #757 should solve all memory-related problems

29914 29/08/2014 06:29 PM Marek Horst

#568 introducing citations grouping by sourceDocumentId, still to be adjusted for ingested pmc citations outcome, which currently seems to hang

29906 29/08/2014 11:53 AM Marek Horst

#577 introducing UDF producing empty map, two transformers building common Citation datastore from citationmatching and pmc ingestion outcome. Both are required by collapser.

29482 23/07/2014 05:36 PM Marek Horst

introducing importer/plaintext/skip_extracted transformer required for plaintext import caching

29087 14/07/2014 02:08 PM Marek Horst

#354 removing obsolete transformers/export/person transformer along with tests

29084 14/07/2014 01:49 PM Marek Horst

#354 removing obsolete transformers/export/inferenced_document_without_imported_data transformer along with tests

29083 14/07/2014 01:21 PM Marek Horst

#354 removing obsolete transformers/export/identifier/referenceddatasets transformer along with tests

29080 14/07/2014 12:47 PM Marek Horst

#354 removing obsolete transformers/export/identifier/documents transformer along with tests

29079 14/07/2014 12:43 PM Marek Horst

#354 removing obsolete transformers/export/document transformer along with tests

28991 10/07/2014 04:23 PM Marek Horst

replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor

28967 09/07/2014 01:12 PM Marek Horst

replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor

28966 09/07/2014 01:02 PM Marek Horst

replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor

28954 08/07/2014 05:14 PM Marek Horst

updating default job.properties

28953 08/07/2014 05:14 PM Marek Horst

updating default job.properties

28850 02/07/2014 07:08 PM Marek Horst

updating default job.properties

28800 02/07/2014 11:43 AM Marek Horst

adding missing "confidenceLevel" field

28799 02/07/2014 11:43 AM Marek Horst

adding missing "confidenceLevel" field

28798 02/07/2014 11:42 AM Marek Horst

adding missing "confidenceLevel" field

28796 02/07/2014 11:40 AM Marek Horst

adding missing "confidenceLevel" field

28795 02/07/2014 11:40 AM Marek Horst

adding missing "confidenceLevel" field