/modules/dnet-mapreduce-jobs/branches/master/src/main - Changes - D-Net - D-Net project tracking tool

dnet45/modules/dnet-mapreduce-jobs/branches/master/src/main @ 53409

#	Date	Author	Comment
53409	05/10/2018 04:17 PM	Miriam Baglioni	refactoring and change of counters
53407	05/10/2018 04:06 PM	Claudio Atzori	rollback wrong commit
53386	04/10/2018 03:35 PM	Claudio Atzori	fixing and testing propagation implementation
53383	04/10/2018 02:46 PM	Miriam Baglioni	reducer for country propagation that writes on hdfs
53371	03/10/2018 10:22 AM	Claudio Atzori	cleanup pid types in order to make them valid attributes
53369	02/10/2018 02:08 PM	Miriam Baglioni
53362	02/10/2018 10:18 AM	Miriam Baglioni	added code for propagation of countries from institutional organization
53340	01/10/2018 10:04 AM	Claudio Atzori	master branch for deployments @ICM
53336	01/10/2018 09:26 AM	Claudio Atzori	why parse strings as Floats?
53288	27/09/2018 01:48 PM	Claudio Atzori	reverted to r52985. Test runs show we need to rely on the edgeIds produced by the connected components identification phase instead of the vertexIds
53279	26/09/2018 03:59 PM	Miriam Baglioni	alignment to trunk version
53262	25/09/2018 05:41 PM	Claudio Atzori	avoid producing duplicate events by eliminating the roots from the comparison process
53260	25/09/2018 03:26 PM	Claudio Atzori	introduced mapping resulttype -> portal url
53213	21/09/2018 12:36 PM	Alessia Bardi	Fixed log class name
53191	19/09/2018 05:46 PM	Claudio Atzori	avoid collisions when hashing pids by value
53190	19/09/2018 05:33 PM	Claudio Atzori	cleaned up unused method, using setDurability in put operation
53103	12/09/2018 03:03 PM	Claudio Atzori	added mapper and hadoop job configuration file for importing Grid.AC organization data
53079	11/09/2018 06:31 PM	Claudio Atzori	integrating bulktag from trunk to beta branch
53068	11/09/2018 03:27 PM	Claudio Atzori	rule out invalid dates also on CrossRefToActions
53067	11/09/2018 03:22 PM	Claudio Atzori	rule out invalid dates on ScholixToActions
53036	10/09/2018 10:17 AM	Claudio Atzori	cleanup
53035	10/09/2018 10:16 AM	Claudio Atzori	produce 'supplement' subrel type in case of supplement relationships
53025	05/09/2018 02:33 PM	Claudio Atzori	simplified connected component application on the graph
52993	28/08/2018 05:06 PM	Sandro La Bruzzo	adding check to understand the bug of wrong relation generated
52985	27/08/2018 10:07 AM	Claudio Atzori	do not skip processing datasets in DedupBuildRootsMapper, improved error reporting in DedupBuildRootsReducer
52984	27/08/2018 10:00 AM	Claudio Atzori	do not push vertex ids in memory, process them on the fly
52960	08/08/2018 12:36 PM	Claudio Atzori	added jobs for predatory journal analysis
52958	07/08/2018 06:15 PM	Sandro La Bruzzo	added invisible setup
52957	07/08/2018 06:12 PM	Sandro La Bruzzo	refactored Action
52956	07/08/2018 06:07 PM	Sandro La Bruzzo	fixed null element
52955	07/08/2018 05:51 PM	Sandro La Bruzzo	Created CrossrefImportMapper
52951	07/08/2018 05:30 PM	Sandro La Bruzzo	add CrossRefToAction
52935	07/08/2018 11:29 AM	Claudio Atzori	fixed mapping from scholix to openaire model
52931	07/08/2018 09:39 AM	Claudio Atzori	small fixes
52930	06/08/2018 06:35 PM	Sandro La Bruzzo	changed key type
52929	06/08/2018 06:29 PM	Sandro La Bruzzo	changed key type
52916	06/08/2018 05:32 PM	Sandro La Bruzzo	implemented mapper writing
52915	06/08/2018 04:52 PM	Sandro La Bruzzo	added configuration
52912	06/08/2018 04:09 PM	Sandro La Bruzzo	added Mapper to transform scholexplorer links into actionsets
52883	02/08/2018 04:25 PM	Claudio Atzori	deprecation: use setDurability instead of setWriteToWAL
52878	02/08/2018 02:19 PM	Claudio Atzori	introduced subType in pace wf configuration
52823	25/07/2018 04:10 PM	Claudio Atzori	adjusted ids export procedure
52805	24/07/2018 05:22 PM	Claudio Atzori	avoid emitting enrichment events when the similarity score is below the threshold
52804	24/07/2018 02:56 PM	Claudio Atzori	avoid emitting enrichment events when the similarity score is below the threshold
52803	24/07/2018 02:53 PM	Claudio Atzori	avoid emitting enrichment events when the similarity score is below the threshold
52802	24/07/2018 02:04 PM	Claudio Atzori	javadoc and test
52801	24/07/2018 12:14 PM	Claudio Atzori	indentation
52797	23/07/2018 04:10 PM	Claudio Atzori	pick the 1st instance to avoid collisions
52777	20/07/2018 04:04 PM	Claudio Atzori	improved behaviour EventWrapperTest
52775	20/07/2018 03:07 PM	Michele Artini	Partial implementation of a unit test
52765	18/07/2018 11:45 AM	Michele Artini	Fixed the generation of eventIds
52751	13/07/2018 05:38 PM	Alessia Bardi	Workaround for CLARIN mining issue: #3670#note-29
52524	18/06/2018 03:07 PM	Claudio Atzori	expand author identifiers
52490	15/06/2018 11:25 AM	Claudio Atzori	generate ENRICH/MISSING/PID only when the publication didn
52488	15/06/2018 11:20 AM	Claudio Atzori	discover the invalid character from the exception details
52469	13/06/2018 03:05 PM	Claudio Atzori	mapper class that parses xml records
52462	13/06/2018 11:43 AM	Claudio Atzori	mapper class that parses xml records
52461	13/06/2018 11:42 AM	Claudio Atzori	mapper class that parses xml records
52421	08/06/2018 05:57 PM	Claudio Atzori	expand field distributionlocation in result's instances
52212	24/05/2018 06:14 PM	Alessia Bardi	Including Open Source among the licenses
52114	21/05/2018 12:24 PM	Alessia Bardi	Added counters for missing date of collection and transformation
52112	21/05/2018 12:19 PM	Alessia Bardi	Do not add to the BasicDBObject properties that are not listed as field to index
52111	21/05/2018 11:46 AM	Alessia Bardi	splitAsList cannot be found when running on the cluster (dependency issues with guava?). Let's try to work around it.
52109	21/05/2018 11:21 AM	Alessia Bardi	OAI M/R jobs expect a new parameter that lists the date patterns to try 'services.publisher.oai.datepatterns'
52078	17/05/2018 07:03 PM	Alessia Bardi	We also have some date as ISO DateTime with Zone...
52076	17/05/2018 05:15 PM	Alessia Bardi	All date fields are actually added as Date field on mongo, hopefully
52075	17/05/2018 05:14 PM	Alessia Bardi	Fixed bug when retrieving info about store indices for a given metadata format
51451	23/03/2018 05:07 PM	Claudio Atzori	added preliminary support for events regarding software
51221	14/03/2018 12:26 PM	Claudio Atzori	don't fail in case of missing context ids
51006	02/03/2018 10:50 AM	Claudio Atzori	force gson to serialise dates in a format that can be understood by ElasticSearch, updated elasticsearch-hadoop-mr lib to version 5.2.0
50270	10/01/2018 05:49 PM	Claudio Atzori	getting rid of ugly hacks
50256	09/01/2018 10:02 AM	Claudio Atzori	use getInvisible instead of hasInvisible
50236	03/01/2018 09:22 AM	Claudio Atzori	beta
50157	18/12/2017 04:34 PM	Alessia Bardi	Fixed date parsing in OAI
50153	18/12/2017 04:11 PM	Claudio Atzori	cleanup
49891	14/11/2017 05:23 PM	Alessia Bardi	#3110 Support incremental harvesting: setting dateOfTransformation as datestamp whenever available
49832	07/11/2017 09:58 AM	Claudio Atzori	refactored broker events generation
49831	07/11/2017 09:58 AM	Claudio Atzori	cleanup
49815	06/11/2017 10:09 AM	Claudio Atzori	integrated exportSummaryRecordsJob mapper from dnet40
49565	19/10/2017 05:15 PM	Claudio Atzori	using SolrServer (4.X)
49517	17/10/2017 03:08 PM	Claudio Atzori	exclude from the deduplication process results that aren't publications
49516	17/10/2017 03:07 PM	Claudio Atzori	do not index invisible records
49515	17/10/2017 03:07 PM	Claudio Atzori	added support for invisible records
49218	03/10/2017 06:14 PM	Claudio Atzori	skip weird cases in CC algo
49096	25/09/2017 05:32 PM	Claudio Atzori	fixing mapping for license vs accessright #3128, cleanup
49029	20/09/2017 06:43 PM	Claudio Atzori	getting rid of person entities
48892	08/09/2017 02:07 PM	Claudio Atzori	upgraded solr version to 6.6.0
48697	25/07/2017 10:13 AM	Claudio Atzori	some java8 refactorings, added more tests for the software entities mapping
48145	29/06/2017 10:15 AM	Claudio Atzori	integrated latest changes from dnet40
47483	13/06/2017 02:51 PM	Claudio Atzori	instead of excluding datasets from the deduplication process, we include only publications
46587	03/04/2017 01:04 PM	Alessia Bardi	implemented use of opt in/out rules for entity fields (#2557). depending on specific solrj version (thus excluding cdh6.X versions)
45318	11/01/2017 03:59 PM	Claudio Atzori	codebase used to migrate to java8 the production system
