# Date Author Comment
37894 19/06/2015 06:25 PM Alessia Bardi

The relationship result --> project contains a summary of the project funding path. As requested by Katerina, the names (acronyms) of the funding streams are now available in the new attribute "name".

37851 18/06/2015 05:12 PM Claudio Atzori

added coauthor job

37824 16/06/2015 04:24 PM Claudio Atzori

using dedup configuration id as inferenceprovenance

37821 16/06/2015 03:37 PM Alessia Bardi

The OAI feed generates "enriched sets" for each content provider by applying a set of xpaths to records to determine whether they have been enriched. The xpaths are defined in the OAI configuration profile.
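
A minimal sketch of how such a check might look, using only the JDK's `javax.xml.xpath` API. The class name `EnrichedCheck`, the method `isEnriched`, and the sample xpath are hypothetical; the actual expressions are defined in the OAI configuration profile and are not shown in this log.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: decides whether a record belongs to an "enriched set".
final class EnrichedCheck {

    // Returns true if at least one of the configured xpaths matches the record.
    static boolean isEnriched(String recordXml, List<String> xpaths) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(recordXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (String expr : xpaths) {
                // With XPathConstants.BOOLEAN, a node-set result is true when non-empty.
                Boolean matched = (Boolean) xpath.evaluate(expr, doc, XPathConstants.BOOLEAN);
                if (matched) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("cannot evaluate xpaths on record", e);
        }
    }

    public static void main(String[] args) {
        // Assumed expression, for illustration only.
        List<String> xpaths = Arrays.asList("//rel[@inferred='true']");
        System.out.println(isEnriched("<record><rel inferred='true'/></record>", xpaths));
        System.out.println(isEnriched("<record><title>t</title></record>", xpaths));
    }
}
```

Keeping the expressions in configuration, as the comment describes, means the notion of "enriched" can change without touching the feed code.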

37794 15/06/2015 03:30 PM Alessia Bardi

using MongoClient instead of deprecated Mongo

37781 15/06/2015 01:12 PM Michele Artini

added conditions

37778 15/06/2015 11:30 AM Michele Artini

renamed classes

37775 15/06/2015 09:01 AM Michele Artini

M/R Job to collect info for NotificationBroker implementation

37751 12/06/2015 04:05 PM Claudio Atzori

added workflow to export the representative publications as json on hdfs

37717 11/06/2015 11:53 AM Claudio Atzori

StringTokenizer splits on each character included in the String passed in the constructor, which is not what we want. Using Splitter instead.
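
The difference can be illustrated with a small sketch. The "Splitter" in the comment is presumably Guava's `com.google.common.base.Splitter` (an assumption; the log does not say); to stay dependency-free, this example uses `String.split` with a quoted literal as a stand-in for splitting on the whole separator. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

final class SplitDemo {

    // StringTokenizer treats every character of `delims` as a separate delimiter.
    static List<String> tokenize(String s, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits on the separator taken as one literal string.
    static List<String> splitOn(String s, String separator) {
        return Arrays.asList(s.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a::b:c", "::")); // [a, b, c]
        System.out.println(splitOn("a::b:c", "::"));  // [a, b:c]
    }
}
```

With `StringTokenizer`, the single `:` inside `b:c` is also treated as a delimiter, which is the behaviour the commit moves away from.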

37616 03/06/2015 02:26 PM Claudio Atzori

added expansion of BoolField type

37563 29/05/2015 02:29 PM Claudio Atzori

playing with elasticsearch

37562 29/05/2015 02:28 PM Claudio Atzori

added constant value for max retries

37531 28/05/2015 02:43 PM Claudio Atzori

expanding rel funder and its attributes

37517 27/05/2015 04:19 PM Claudio Atzori

adapted to latest version

37364 21/05/2015 05:46 PM Claudio Atzori

fetch only instancetype and hostedby from the instance attributes, adding url to external references

37353 21/05/2015 10:18 AM Claudio Atzori

catch and log memory errors

37352 21/05/2015 10:17 AM Claudio Atzori

fixed infinite loop (merged from branch 0.6.x)

37351 21/05/2015 10:16 AM Claudio Atzori

fixed xml record escape

37350 21/05/2015 10:15 AM Claudio Atzori

cleanup

37336 20/05/2015 02:51 PM Claudio Atzori

trying to catch memory errors

37335 20/05/2015 02:51 PM Claudio Atzori

derp fix

37334 20/05/2015 02:50 PM Claudio Atzori

added configurable max number of rel/children to be expanded in each entity

37273 16/05/2015 11:34 PM Claudio Atzori

integrated fix from beta_context branch

37240 14/05/2015 02:20 PM Claudio Atzori

log the xslt

37135 11/05/2015 05:34 PM Claudio Atzori

using action set id in the index record building process

36796 28/04/2015 05:13 PM Claudio Atzori

csv export of the duplicates original ids

36735 27/04/2015 11:22 AM Claudio Atzori

cleanup

36670 23/04/2015 04:57 PM Claudio Atzori

updated to the new pace specs, cleanup

36271 09/04/2015 04:30 PM Claudio Atzori

adding empty solr docs to the rotten record set

36247 09/04/2015 02:53 PM Claudio Atzori

using different counter names

36164 08/04/2015 10:48 AM Claudio Atzori

added dedup roots to csv export job, dedup index feed job, tests

36158 08/04/2015 09:51 AM Claudio Atzori

using proper logger

36157 08/04/2015 09:50 AM Claudio Atzori

added dedup configuration to the entities merging process

35981 03/04/2015 11:32 AM Claudio Atzori

added more detailed counter about entity sub-type

35975 03/04/2015 10:31 AM Claudio Atzori

several improvements

35771 30/03/2015 11:57 AM Claudio Atzori

different escaping

35769 30/03/2015 11:46 AM Claudio Atzori

trying to catch any kind of exception

35476 18/03/2015 06:47 PM Claudio Atzori

added DedupSimilarityToActionsMapper and related dependency

35179 09/03/2015 02:39 PM Michele Artini

reimplemented the fundingpath and context generation

35133 05/03/2015 07:44 PM Claudio Atzori

updated packages, codestyle

35129 05/03/2015 07:38 PM Claudio Atzori

codestyle

35128 05/03/2015 07:34 PM Claudio Atzori

updated packages

35127 05/03/2015 07:31 PM Claudio Atzori

OafMerger moved to mapping utils

34536 16/02/2015 07:56 PM Claudio Atzori

saving disk space, less logging

34374 09/02/2015 06:44 PM Alessia Bardi
34358 09/02/2015 12:05 PM Alessia Bardi

discard persons in OAI feeding (#1107)

34225 03/02/2015 11:50 AM Claudio Atzori

do not alter inferenceprovenance; codestyle

33382 12/12/2014 06:07 PM Claudio Atzori

added FCT fundings as contexts

33137 02/12/2014 04:33 PM Claudio Atzori

merged branch ProtoMapping

32832 17/11/2014 04:41 PM Claudio Atzori

implemented retries

31997 31/10/2014 10:19 AM Claudio Atzori

cleanup & tests

31409 16/10/2014 05:42 PM Claudio Atzori

added default bestlicense value, used when the record doesn't provide any

31186 07/10/2014 02:54 PM Alessia Bardi

Moved counters from entity body to header.

30969 01/10/2014 02:22 PM Claudio Atzori

- provenance information parsed from element "about"
- namespace aware datacite mapping for oaf:language and oaf:dateaccepted
- dedupBuildRoot doesn't write to WAL
- removed unused claim_2_hbase.xsl
- overall cleanup

30968 01/10/2014 02:15 PM Claudio Atzori

added relationship/children counters

30882 25/09/2014 05:42 PM Alessia Bardi

Avoid generating the set '___' when we have "strange" set names, such as those in Cyrillic (e.g. Ukrainian). In those cases records are assigned to a default set, currently named "OTHER".
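
A sketch of the kind of fallback described here. The sanitisation rule (replace every non-ASCII-alphanumeric character with `_`) is an assumption made for illustration; the log does not state the actual rule. The class `SetNames` and method `normalize` are hypothetical; only the default name "OTHER" comes from the comment.

```java
final class SetNames {

    // Default set name taken from the commit comment.
    static final String DEFAULT_SET = "OTHER";

    // Assumed rule: characters outside [A-Za-z0-9] become '_'.
    // A name that ends up with no alphanumeric character at all
    // (for example a fully Cyrillic one) maps to the default set.
    static String normalize(String setName) {
        String cleaned = setName.replaceAll("[^A-Za-z0-9]", "_");
        return cleaned.matches("_*") ? DEFAULT_SET : cleaned;
    }

    public static void main(String[] args) {
        System.out.println(normalize("open access")); // open_access
        System.out.println(normalize("Наука"));       // OTHER
    }
}
```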

30863 25/09/2014 12:40 PM Claudio Atzori

expanding provenanceaction classid

30834 23/09/2014 06:14 PM Claudio Atzori

merge from branch newIndexFeed

30833 23/09/2014 06:14 PM Claudio Atzori

fixing #783 (note-18)

30827 23/09/2014 03:46 PM Claudio Atzori

extraInfo removed from CDATA block, expanding provenance action in inferred elements

29734 31/07/2014 02:28 PM Claudio Atzori

fixed blacklist type

29732 31/07/2014 12:48 PM Claudio Atzori

more logging. fixed entity type check

29707 30/07/2014 04:15 PM Claudio Atzori

more logging

29702 30/07/2014 03:38 PM Claudio Atzori

defined limit to the maximum number of counters

29657 29/07/2014 03:24 PM Claudio Atzori

defined limit to the maximum number of counters

29043 11/07/2014 06:54 PM Claudio Atzori

added serialization, tests

29042 11/07/2014 06:53 PM Claudio Atzori

instantiate one SAXReader for each call

29041 11/07/2014 06:52 PM Claudio Atzori

fixed format-layout-interpretation concatenation; doesn't fail when the fieldExtractor returns a null result

29040 11/07/2014 06:51 PM Claudio Atzori

added json serialization, builds the matching key one time only

29036 11/07/2014 04:30 PM Alessia Bardi

do not upsert sets here in the mapper: we shall delegate to a separate workflow to be run after the OAI feeding is completed.

29032 11/07/2014 03:49 PM Alessia Bardi

Always new records, to test how much faster we go

29030 11/07/2014 03:36 PM Alessia Bardi

Always new records, to test how much faster we go

29014 10/07/2014 07:41 PM Alessia Bardi

Refactored class that extracts fields from records. When we can't find an expected index from the configuration to check its repeatability, the field is indexed as repeatable and a counter is updated.
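
The fallback can be sketched as follows. All names (`RepeatabilityLookup`, `isRepeatable`, `missingCount`) are hypothetical, and an `AtomicLong` stands in for whatever counter the job actually updates (in a M/R job this would likely be a job counter, which is an assumption).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

final class RepeatabilityLookup {

    // index name -> repeatable flag, as read from the configuration
    private final Map<String, Boolean> repeatableByIndex;
    // counts lookups for indices missing from the configuration
    private final AtomicLong missingIndexCounter = new AtomicLong();

    RepeatabilityLookup(Map<String, Boolean> repeatableByIndex) {
        this.repeatableByIndex = new HashMap<>(repeatableByIndex);
    }

    // Unknown indices are treated as repeatable, and the miss is counted.
    boolean isRepeatable(String indexName) {
        Boolean configured = repeatableByIndex.get(indexName);
        if (configured == null) {
            missingIndexCounter.incrementAndGet();
            return true;
        }
        return configured;
    }

    long missingCount() {
        return missingIndexCounter.get();
    }

    public static void main(String[] args) {
        Map<String, Boolean> conf = new HashMap<>();
        conf.put("title", false);
        RepeatabilityLookup lookup = new RepeatabilityLookup(conf);
        System.out.println(lookup.isRepeatable("title"));   // false
        System.out.println(lookup.isRepeatable("unknown")); // true
        System.out.println(lookup.missingCount());          // 1
    }
}
```

Defaulting to repeatable is the lenient choice: a value is never dropped because of a configuration gap, and the counter makes the gap visible afterwards.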

29009 10/07/2014 06:33 PM Claudio Atzori

idScheme and idNamespace defined as part of the OAI configuration profile

29000 10/07/2014 05:54 PM Alessia Bardi

Removed the dependency on dnet-oai-utils to avoid inheriting unwanted jars such as cnr-rmi-api, cnr-service-common, spring, etc., which should not appear when running a job on the cluster. The needed classes have been copied and adapted so that they no longer use spring.

28981 10/07/2014 09:49 AM Claudio Atzori

extended dedup configuration, including now blacklists and algorithm parameters

28904 04/07/2014 03:33 PM Alessia Bardi

Format, layout and interpretation are obtained from the collection name rather than being fixed.

28467 26/06/2014 04:38 PM Claudio Atzori

namespace cleanup

28457 25/06/2014 07:20 PM Claudio Atzori

removed unused field <dri:repositoryId/>

28411 24/06/2014 09:59 AM Claudio Atzori

removed protocolbuffers dependency from dnet-pace-core; Builders and Proto-specific tests moved to dnet-openaireplus-mapping-utils; adapted dnet-mapreduce-jobs

28311 19/06/2014 03:15 PM Claudio Atzori

oaf schema location passed as parameter by the workflow

28309 19/06/2014 02:24 PM Alessia Bardi

Testing without depending on a running mdstore

28308 19/06/2014 01:49 PM Claudio Atzori

small refactor

28303 19/06/2014 12:54 PM Alessia Bardi

OAI feed map-only job

28226 16/06/2014 09:30 AM Claudio Atzori

fixed oaf to xml serialization

28094 09/06/2014 04:19 PM Claudio Atzori

merged from branch 0.0.4

27199 06/05/2014 05:56 PM Claudio Atzori

added early implementation of OAI feeding job (M/R)

27148 05/05/2014 09:40 AM Claudio Atzori

fixed IIS output escaping

27147 05/05/2014 09:39 AM Claudio Atzori

added support for one way relationships