# Date Author Comment
37851 18/06/2015 05:12 PM Claudio Atzori

added coauthor job

37824 16/06/2015 04:24 PM Claudio Atzori

using dedup configuration id as inferenceprovenance

37821 16/06/2015 03:37 PM Alessia Bardi

The OAI feed generates "enriched sets" for each content provider by applying a set of XPaths to the records to determine whether they have been enriched. The XPaths are defined in the OAI configuration profile.
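
As an illustration of the check described in r37821, the sketch below treats a record as enriched when at least one XPath from a configured list matches it. It is only a minimal, dependency-free approximation: the class name, the method name, the sample record and the `@inferred` expression are hypothetical, and the real expressions live in the OAI configuration profile, which is not shown in this log.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not the project's actual class.
public class EnrichedRecordCheck {

    // Returns true as soon as one of the configured XPaths matches the record.
    public static boolean isEnriched(String recordXml, List<String> xpaths) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (String expr : xpaths) {
            try {
                // A non-empty node-set is converted to true by XPathConstants.BOOLEAN.
                Boolean matched = (Boolean) xpath.evaluate(
                        expr, new InputSource(new StringReader(recordXml)), XPathConstants.BOOLEAN);
                if (matched) {
                    return true;
                }
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Invalid XPath or record: " + expr, e);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed sample data: the attribute name is made up for the example.
        String record = "<record><subject inferred=\"true\">x</subject></record>";
        System.out.println(isEnriched(record, List.of("//*[@inferred='true']")));
    }
}
```

Re-parsing the record for every expression keeps the sketch short; a real job would more likely parse each record once and compile the expressions up front.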

37794 15/06/2015 03:30 PM Alessia Bardi

using MongoClient instead of deprecated Mongo

37781 15/06/2015 01:12 PM Michele Artini

added conditions

37778 15/06/2015 11:30 AM Michele Artini

renamed classes

37775 15/06/2015 09:01 AM Michele Artini

M/R Job to collect info for NotificationBroker implementation

37751 12/06/2015 04:05 PM Claudio Atzori

added workflow to export the representative publications as json on hdfs

37717 11/06/2015 11:53 AM Claudio Atzori

StringTokenizer splits on each character included in the String passed in the constructor, which is not what we want. Using Splitter instead.
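
The behaviour behind r37717 can be reproduced with the JDK alone. `java.util.StringTokenizer` treats every character of its delimiter argument as a separate delimiter, so a multi-character separator is never matched as a whole. The commit's "Splitter" is presumably Guava's `com.google.common.base.Splitter`; to stay dependency-free, the sketch below uses a hand-written `splitOnWholeSeparator` instead. That method, the class name and the sample string are illustrative assumptions, not code from the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the project.
public class DelimiterSplitDemo {

    // StringTokenizer: each character in 'delims' is a delimiter on its own.
    public static List<String> tokenizePerCharacter(String input, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits only where the whole separator string occurs.
    public static List<String> splitOnWholeSeparator(String input, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> out = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = input.indexOf(separator, start)) >= 0) {
            out.add(input.substring(start, idx));
            start = idx + separator.length();
        }
        out.add(input.substring(start));
        return out;
    }

    public static void main(String[] args) {
        String id = "od_1::a:b::c"; // made-up identifier with "::" as separator
        System.out.println(tokenizePerCharacter(id, "::"));  // [od_1, a, b, c]
        System.out.println(splitOnWholeSeparator(id, "::")); // [od_1, a:b, c]
    }
}
```

With the tokenizer, the single colon inside `a:b` is also treated as a boundary, which is the unwanted behaviour the commit message describes.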

37617 03/06/2015 02:27 PM Claudio Atzori

added missing dependency (test)

37616 03/06/2015 02:26 PM Claudio Atzori

added expansion of BoolField type

37615 03/06/2015 02:23 PM Claudio Atzori

updated pom

37614 03/06/2015 02:23 PM Claudio Atzori

testing datasource mapping

37613 03/06/2015 02:22 PM Claudio Atzori

manage the missing boolean fields

37604 03/06/2015 12:45 PM Claudio Atzori

added expansion of BoolField type

37566 29/05/2015 02:32 PM Claudio Atzori

testing with yarn

37565 29/05/2015 02:31 PM Claudio Atzori

wrong name

37564 29/05/2015 02:31 PM Claudio Atzori

testing with yarn

37563 29/05/2015 02:29 PM Claudio Atzori

playing with elasticsearch

37562 29/05/2015 02:28 PM Claudio Atzori

added constant value for max retries

37531 28/05/2015 02:43 PM Claudio Atzori

expanding rel funder and its attributes

37517 27/05/2015 04:19 PM Claudio Atzori

adapted to latest version

37364 21/05/2015 05:46 PM Claudio Atzori

fetch only instancetype and hostedby from the instance attributes, adding url to external references

37353 21/05/2015 10:18 AM Claudio Atzori

catch and log memory errors

37352 21/05/2015 10:17 AM Claudio Atzori

fixed infinite loop (merged from branch 0.6.x)

37351 21/05/2015 10:16 AM Claudio Atzori

fixed xml record escape

37350 21/05/2015 10:15 AM Claudio Atzori

cleanup

37336 20/05/2015 02:51 PM Claudio Atzori

trying to catch memory errors

37335 20/05/2015 02:51 PM Claudio Atzori

derp fix

37334 20/05/2015 02:50 PM Claudio Atzori

added configurable max number of rel/children to be expanded in each entity

37273 16/05/2015 11:34 PM Claudio Atzori

integrated fix from beta_context branch

37240 14/05/2015 02:20 PM Claudio Atzori

log the xslt

37213 13/05/2015 03:09 PM Claudio Atzori

aligned with version in pom.xml

37212 13/05/2015 03:06 PM Claudio Atzori

bumped version due to update in dnet-pace-core

37136 11/05/2015 05:38 PM Claudio Atzori

cleanup

37135 11/05/2015 05:34 PM Claudio Atzori

using action set id in the index record building process

36979 06/05/2015 03:52 PM Claudio Atzori

depending on fixed version

36972 06/05/2015 03:44 PM Claudio Atzori

Testing with "max" attribute

36932 05/05/2015 10:45 AM Claudio Atzori

using inverse rel to refer to the correct link descriptor, avoid npe

36916 04/05/2015 06:48 PM Claudio Atzori

defined maximum number of relationships, configurable per relationship type [attribute 'max' in the EntityGrouperConfigurationDSResourceType]

36859 03/05/2015 11:34 PM Alessia Bardi

FCT has level_0 only in funding tree

36855 01/05/2015 09:45 AM Claudio Atzori

trying to debug

36833 30/04/2015 12:34 PM Alessia Bardi

Updated version and scm. Changed dependency to 3.0.3 of mapping-utils

36832 30/04/2015 12:30 PM Alessia Bardi

branching for beta and new fundingpaths and context

36804 28/04/2015 06:21 PM Alessia Bardi

Added dc:creator

36796 28/04/2015 05:13 PM Claudio Atzori

csv export of the duplicates original ids

36735 27/04/2015 11:22 AM Claudio Atzori

cleanup

36670 23/04/2015 04:57 PM Claudio Atzori

updated to the new pace specs, cleanup

36650 23/04/2015 02:36 PM Alessia Bardi

WT ids are uniform now

36641 23/04/2015 11:33 AM Alessia Bardi

Using MongoClient instead of deprecated Mongo. Removed record status management, since records are always new anyway.

36583 22/04/2015 01:44 PM Claudio Atzori

trying to perform explicit escape

36570 22/04/2015 10:23 AM Claudio Atzori

delegating to StringEscapeUtils

36569 22/04/2015 10:14 AM Claudio Atzori

escaping also quotes and apostrophes

36568 21/04/2015 05:24 PM Claudio Atzori

aggressive escaping

36566 21/04/2015 05:07 PM Claudio Atzori

avoid infinite loop :)

36562 21/04/2015 02:11 PM Claudio Atzori

updated data used for testing

36561 21/04/2015 02:10 PM Claudio Atzori

fixing WT funding tree translation as contexts

36271 09/04/2015 04:30 PM Claudio Atzori

adding empty solr docs to the rotten record set

36270 09/04/2015 04:29 PM Alessia Bardi

integrated changes of r36247 from trunk

36269 09/04/2015 04:22 PM Alessia Bardi

write skipped records into the rotten folder

36247 09/04/2015 02:53 PM Claudio Atzori

using different counter names

36177 08/04/2015 11:36 AM Alessia Bardi

Better to depend on the branch of mapping utils in this branch of mapreduce-jobs because of the last changes implemented by Claudio.

36169 08/04/2015 11:22 AM Claudio Atzori

reverted to r35900

36168 08/04/2015 11:20 AM Claudio Atzori

merging from trunk

36164 08/04/2015 10:48 AM Claudio Atzori

added dedup roots to csv export job, dedup index feed job, tests

36158 08/04/2015 09:51 AM Claudio Atzori

using proper logger

36157 08/04/2015 09:50 AM Claudio Atzori

added dedup configuration to the entities merging process

36043 03/04/2015 06:41 PM Alessia Bardi

We can use the most up-to-date version of mapping-utils here

36042 03/04/2015 06:39 PM Alessia Bardi

Fixed scm and deploy.info

36041 03/04/2015 06:38 PM Alessia Bardi

Distinguish publications from datasets when counting

35981 03/04/2015 11:32 AM Claudio Atzori

added more detailed counter about entity sub-type

35975 03/04/2015 10:31 AM Claudio Atzori

several improvements

35917 02/04/2015 12:19 PM Alessia Bardi

Increment counter in case of no rows to keep track of records without body.

35900 01/04/2015 05:20 PM Alessia Bardi

updated version to 0.0.6.3.1

35899 01/04/2015 05:17 PM Alessia Bardi

including changes to catch and fail for any exception of r35769 of trunk

35898 01/04/2015 05:14 PM Alessia Bardi

branch for code before the re-implementation of context and fundingpaths

35897 01/04/2015 05:00 PM Alessia Bardi

raised version

35896 01/04/2015 04:55 PM Alessia Bardi

commenting test with big doaj dataset

35771 30/03/2015 11:57 AM Claudio Atzori

different escaping

35769 30/03/2015 11:46 AM Claudio Atzori

trying to catch any kind of exception

35746 27/03/2015 05:23 PM Alessia Bardi

Testing DOAJ for #1222#note-4

35476 18/03/2015 06:47 PM Claudio Atzori

added DedupSimilarityToActionsMapper and relative dependency

35452 18/03/2015 01:55 PM Michele Artini

increased version in scripts

35451 18/03/2015 01:07 PM Michele Artini

updated the version of a dependency

35442 18/03/2015 12:15 PM Alessia Bardi

fundingtree is an escaped xml, not a json anymore.

35439 18/03/2015 12:01 PM Michele Artini

increased the minor version

35196 09/03/2015 05:09 PM Michele Artini

sample records

35179 09/03/2015 02:39 PM Michele Artini

reimplemented the fundingpath and context generation

35135 05/03/2015 07:46 PM Claudio Atzori

updated packages

35133 05/03/2015 07:44 PM Claudio Atzori

updated packages, codestyle

35129 05/03/2015 07:38 PM Claudio Atzori

codestyle

35128 05/03/2015 07:34 PM Claudio Atzori

updated packages

35127 05/03/2015 07:31 PM Claudio Atzori

OafMerger moved to mapping utils

34901 27/02/2015 05:39 PM Claudio Atzori

temporary commit

34898 27/02/2015 05:37 PM Claudio Atzori

offline dedup

34602 19/02/2015 04:17 PM Claudio Atzori

added protobuf-java-format dependency

34600 19/02/2015 04:14 PM Claudio Atzori

renamed test

34599 19/02/2015 04:07 PM Claudio Atzori

added json size test

34536 16/02/2015 07:56 PM Claudio Atzori

saving disk space, less logging

34454 11/02/2015 07:31 PM Alessia Bardi

Updated configuration for testing