| Name | Size | Revision | Age | Author | Comment |
|---|---|---|---|---|---|
| cc | | 53288 | over 5 years | Claudio Atzori | reverted to r52985. Test runs show we need to ... |
| experiment | | 49029 | over 6 years | Claudio Atzori | getting rid of person entities |
| fixrelation | | 54182 | over 5 years | Claudio Atzori | fixRelations must work on main entities in a si... |
| gt | | 49029 | over 6 years | Claudio Atzori | getting rid of person entities |
| DedupBuildRootsMapper.java | 5.07 KB | 52985 | almost 6 years | Claudio Atzori | do not skip processing datasets in DedupBuildRo... |
| DedupBuildRootsReducer.java | 8.02 KB | 53036 | over 5 years | Claudio Atzori | cleanup |
| DedupDeleteRelMapper.java | 2.15 KB | 54164 | over 5 years | Claudio Atzori | introduced jobs to fix the relationships among ... |
| DedupDeleteSimRelMapper.java | 1.92 KB | | almost 9 years | claudio.atzori | |
| DedupFindRootsMapper.java | 4.78 KB | 54164 | over 5 years | Claudio Atzori | introduced jobs to fix the relationships among ... |
| DedupGrouperMapper.java | 1.99 KB | 54164 | over 5 years | Claudio Atzori | introduced jobs to fix the relationships among ... |
| DedupMapper.java | 3.81 KB | 53726 | over 5 years | Claudio Atzori | less verbose logging |
| DedupMarkDeletedEntityMapper.java | 3.62 KB | 54164 | over 5 years | Claudio Atzori | introduced jobs to fix the relationships among ... |
| DedupPersonBean.java | 1.06 KB | | almost 11 years | michele.artini | |
| DedupReducer.java | 3.1 KB | 53518 | over 5 years | Claudio Atzori | introduced use of BlockProcessor |
| DedupRootsToCsvMapper.java | 2.61 KB | | about 9 years | claudio.atzori | |
| DedupRootsToCsvReducer.java | 2.56 KB | | about 9 years | claudio.atzori | |
| DedupSimilarityToHdfsActionsMapper.java | 3.07 KB | | over 7 years | claudio.atzori | |
| FindDedupCandidatePersonsReducer.java | 4 KB | 54164 | over 5 years | Claudio Atzori | introduced jobs to fix the relationships among ... |
| RootEntity.java | 799 Bytes | | about 9 years | claudio.atzori | |
| SimpleDedupPersonReducer.java | 5.8 KB | 54164 | over 5 years | Claudio Atzori | introduced jobs to fix the relationships among ... |

Latest revisions

# Date Author Comment
54182 06/12/2018 11:01 AM Claudio Atzori

fixRelations must work on main entities in a single scan pass

54175 05/12/2018 05:04 PM Claudio Atzori

added support for simulation mode: leaves the data unchanged and keeps track of summary counters

54164 05/12/2018 03:56 PM Claudio Atzori

introduced jobs to fix the relationships among deduped records. Got rid of deprecations on the HBase Put method usage

53726 13/11/2018 09:08 AM Claudio Atzori

less verbose logging

53518 18/10/2018 02:48 PM Claudio Atzori

introduced use of BlockProcessor

53407 05/10/2018 04:06 PM Claudio Atzori

rollback wrong commit

53340 01/10/2018 10:04 AM Claudio Atzori

master branch for deployments @ICM

53288 27/09/2018 01:48 PM Claudio Atzori

reverted to r52985. Test runs show we need to rely on the edgeIds produced by the connected components identification phase instead of the vertexIds

53036 10/09/2018 10:17 AM Claudio Atzori

cleanup

53025 05/09/2018 02:33 PM Claudio Atzori

simplified connected component application on the graph
