# Date Author Comment
37367 21/05/2015 06:26 PM Marek Horst

#1315 propagating confidenceLevel to DocumentToConceptIds. Updating PIG transformer script by introducing concept identifiers deduplication UDF function picking record with the highest confidence level, introducing unit and integration tests. Propagating changes in document to concepts exporter module.
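
The deduplication step described in this revision (keeping, for each concept identifier, the record with the highest confidence level) can be illustrated with a minimal Python sketch. This is not the project's Pig UDF; the field names `documentId`, `conceptId` and `confidenceLevel` and the function name `deduplicate_concepts` are assumptions made for illustration only.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def deduplicate_concepts(records: Iterable[Record]) -> List[Record]:
    """Keep one record per (documentId, conceptId) pair.

    When several records share the same pair, the one with the highest
    confidenceLevel wins. Field names are hypothetical, not taken from
    the actual avro schema.
    """
    best: Dict[Tuple[str, str], Record] = {}
    for rec in records:
        key = (rec["documentId"], rec["conceptId"])
        current = best.get(key)
        # Replace the stored record only if the new one is strictly more confident.
        if current is None or rec["confidenceLevel"] > current["confidenceLevel"]:
            best[key] = rec
    return list(best.values())


sample = [
    {"documentId": "doc1", "conceptId": "c1", "confidenceLevel": 0.4},
    {"documentId": "doc1", "conceptId": "c1", "confidenceLevel": 0.9},
    {"documentId": "doc1", "conceptId": "c2", "confidenceLevel": 0.5},
]
deduplicated = deduplicate_concepts(sample)
```

Ties are resolved in favour of the record seen first, which is a design choice of this sketch rather than documented behaviour of the original UDF.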

37345 20/05/2015 06:49 PM Marek Horst

#1329 adding affiliations field in ExtractedDocumentMetadata PMC schema. Metadata extraction code refactoring by extracting code responsible for building Affiliation avro records to AffiliationBuilder class and sharing it with pmc ingestion. Implementing affiliations ingestion functionality in PmcXmlHandler covered with unit tests. Adding affiliations field support in ingest pmc metadata transformer.

37262 15/05/2015 12:57 PM Marek Horst

#1306 introducing dummy field in DocumentId schema required to overcome https://issues.apache.org/jira/browse/PIG-3358 issue. Handling dummy field in transformer pig scripts when it is required. Should be reverted as soon as PIG-3358 issue is fixed.

37150 11/05/2015 07:42 PM Marek Horst

removing avro-maven-plugin versioning conflicting with ${iis.avro.version}

37149 11/05/2015 07:39 PM Marek Horst

fixing parent version to 1.0.1-CDH-5.3.0-SNAPSHOT

37148 11/05/2015 07:35 PM Marek Horst

fixing version to 1.0.1-CDH-5.3.0-SNAPSHOT

36941 05/05/2015 06:33 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

35918 02/04/2015 12:26 PM Marek Horst

#1247 renaming inputEntityId to inputObjectId because not all objects are entities (e.g. metadataextraction input)

35915 02/04/2015 11:57 AM Marek Horst

#1247 renaming id field to more descriptive inputEntityId

35895 01/04/2015 04:42 PM Marek Horst

#1247 introducing third draft of Fault avro schema: adding missing stack trace

35894 01/04/2015 04:25 PM Marek Horst

#1247 introducing second draft of Fault avro schema: refactoring recursive causes to array of causes
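
The change from recursive causes to an array of causes can be sketched as follows. This is a hedged Python illustration of the idea, not the Fault avro schema itself: the record layout (`code`, `message`, `causes`) and the function name `build_fault` are assumptions, and the exception chain stands in for whatever nested structure the first draft used.

```python
from typing import Any, Dict, List


def build_fault(exc: BaseException) -> Dict[str, Any]:
    """Flatten a chain of exception causes into a flat list.

    Instead of nesting each cause inside its parent (a recursive record),
    the causes are collected in order, outermost cause first. All field
    names here are hypothetical.
    """
    causes: List[Dict[str, str]] = []
    seen = {id(exc)}
    cause = exc.__cause__
    # Walk the chain iteratively; the 'seen' set guards against cycles.
    while cause is not None and id(cause) not in seen:
        seen.add(id(cause))
        causes.append({"code": type(cause).__name__, "message": str(cause)})
        cause = cause.__cause__
    return {
        "code": type(exc).__name__,
        "message": str(exc),
        "causes": causes,
    }


try:
    try:
        raise ValueError("bad input")
    except ValueError as inner:
        raise RuntimeError("processing failed") from inner
except RuntimeError as outer:
    fault = build_fault(outer)
```

A flat array avoids a self-referencing record type, which tends to be easier to handle in tools that consume avro data.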

35885 01/04/2015 03:06 PM Marek Horst

#1247 introducing first draft of Fault avro schema

35375 16/03/2015 07:51 PM Marek Horst

dependencies cleanup: removing protocol buffer dependency from schemas, only avro should be supported

35374 16/03/2015 07:49 PM Marek Horst

dependencies cleanup: removing protocol buffer dependency from schemas, only avro should be supported

35261 11/03/2015 04:59 PM Marek Horst

creating IIS-CDH-5.3.0 branch

35260 11/03/2015 04:58 PM Marek Horst

introducing branches folder

34495 12/02/2015 09:10 PM Marek Horst

#118 introducing madis based communities generation for website usage analysis

33536 15/12/2014 08:51 PM Marek Horst

[maven-release-plugin] prepare for next development iteration

33535 15/12/2014 08:51 PM Marek Horst

[maven-release-plugin] copy for tag icm-iis-schemas-1.0.0

33534 15/12/2014 08:51 PM Marek Horst

[maven-release-plugin] prepare release icm-iis-schemas-1.0.0

33533 15/12/2014 08:50 PM Marek Horst

#1044 pre-release switching to released version of parent pom and released dependencies

33420 15/12/2014 12:49 PM Marek Horst

introducing scm definition

33247 09/12/2014 06:41 PM Marek Horst

#919 renaming DocumentToResearchInitiative to DocumentToConceptId and DocumentToResearchInitiatives to DocumentToConceptIds

33223 08/12/2014 12:04 PM Marek Horst

#1022 removing ExtractedDocumentMetadata envelope: origin info is not required

33222 08/12/2014 11:47 AM Marek Horst

#1022 introducing ExtractedDocumentMetadata envelope required for collapsing PMC metadata records

33100 28/11/2014 04:48 PM Marek Horst

removing unused import statement

33099 28/11/2014 04:47 PM Marek Horst

fixing indent

33072 28/11/2014 11:14 AM Marek Horst

#919 introducing Concept schema and importer module producing avro datastore based on XML profile

32996 26/11/2014 04:49 PM Marek Horst

#686 introducing ExtractedDocumentMetadataEnvelope schema definition

32930 21/11/2014 01:09 PM Marek Horst

#577 introducing citation envelope

32904 20/11/2014 01:51 PM Marek Horst

#1017 introducing PMC extracted metadata schema

32562 12/11/2014 04:17 PM Marek Horst

introducing detailed confidenceLevel field description placed in external eu/dnetlib/iis/README.markdown file

32383 10/11/2014 01:11 PM Marek Horst

#963 introducing DocumentToMDStore datastore definition holding mappings between dataset identifier and mdstore identifier holding given dataset

32221 05/11/2014 02:26 PM Marek Horst

#720 confidence level description

32220 05/11/2014 01:51 PM Marek Horst

#118 introducing LogEntry related comment in avdl file

32078 03/11/2014 03:05 PM Marek Horst

#118 introducing log entry schema

31844 28/10/2014 03:31 PM Marek Horst

#913 renaming DocumentContentUrl#contentSize to DocumentContentUrl#contentSizeKB and changing field type from int to long, importing content size from ObjectStoreFile#fileSizeKB, updating dnet-objectstore-rmi dependency from 1.0.0 to 2.0.1-SNAPSHOT

31781 28/10/2014 11:31 AM Marek Horst

#913 changing DocumentContentUrl#contentSize field type from string to int

31780 28/10/2014 11:29 AM Marek Horst

#913 introducing DocumentContentUrl#contentSize field, handling it properly in all PIG transformers

31227 08/10/2014 06:19 PM Marek Horst

#840 moving IdentifierMapping from importer to common package

31221 08/10/2014 06:12 PM Marek Horst

#840 renaming DeduplicationMapping to more generic IdentifierMapping

31062 02/10/2014 04:21 PM Marek Horst

introducing PmidMapping schema

30900 26/09/2014 03:04 PM Marek Horst

updating countryCode comment to: country ISO 3166-1 alpha-2 uppercased code

30891 26/09/2014 12:04 PM Mateusz Kobos

Adding info on where to find types of citations produced by PMC citation ingest module

30886 25/09/2014 07:10 PM Marek Horst

fixing comment

30884 25/09/2014 06:39 PM Marek Horst

introducing address field

30875 25/09/2014 04:52 PM Marek Horst

adding new countryCode field to affiliation

30429 17/09/2014 11:06 AM Sandro La Bruzzo

created tag folder for release

29900 28/08/2014 08:23 PM Marek Horst

#577 updating Citation namespace

29899 28/08/2014 07:24 PM Marek Horst

#577 introducing common.citations.Citation schema

29880 27/08/2014 11:43 AM Marek Horst

removing redundant ReferenceBasicMetadata and ReferenceMetadata definitions which are also available in standalone avdl definitions, replacing definitions with import statements.

29088 14/07/2014 02:08 PM Marek Horst

removing deprecated PersonWithInferencedData avro schema

29085 14/07/2014 01:56 PM Marek Horst

removing deprecated DocumentWithInferencedData and DataSetReferenceWithInferencedData avro schemas

28939 08/07/2014 01:04 PM Marek Horst

#568 renaming Citations#sourceDocumentId field to Citations#documentId

28938 08/07/2014 12:55 PM Marek Horst

#568 introducing CitationEntry and Citations schemas

28931 07/07/2014 05:52 PM mateusz.fedoryszak

rename a field