updated XML record file used for opentrials testing.
reverted to use json-java-format 1.2
reverted to previous revision
changed WRITE_TO_WAL = true for all jobs writing to HBase tables
added counters to keep track of the relationships provenance
[maven-release-plugin] prepare for next development iteration
[maven-release-plugin] copy for tag dnet-mapreduce-jobs-0.0.8.8
[maven-release-plugin] prepare release dnet-mapreduce-jobs-0.0.8.8
new test for claim updates
updated opentrial sample record
excluding dateoftransformation from metadata fields, it should be serialised only in the record header
Added dr:dateOfTransformation to some test XML files. For publications dr:dateOfCollection must be set. For datasets dri:dateOfCollections must be set.
Testing OpenTrials dataset record mapping. Depending on snapshot parent.
finally I made those scripts decent
[maven-release-plugin] copy for tag dnet-mapreduce-jobs-0.0.8.7
[maven-release-plugin] prepare release dnet-mapreduce-jobs-0.0.8.7
fixed dependencies, depending on released parent
import cleanup
reverting, we need fewer getters
ignores
experiments for scoreResult
score result
upload
tests for dedup experiments
added more getters
dedup experiments
added mapper class for hdfs actions
cleanup
depending on released mongo-logging
added Mapper class PromoteActionSetFromHDFS
Updated pom version to 0.0.8.7
added anchorStats map-only job
added counter for DOIs
removing useless counters
using most recent dnet-pace-core features
fixed DedupDeleteRelMapper
do not export deleted entities
adapted to the removal of contributors as relationships
updated scripts
bumped version
added utility methods to deal with strings rather than byte[]
sort merged ids
log the documents being compared before failing
test for ARC
introducing support for projects that don't provide a link to a specific fundingpath.
implemented job and workflow to export the openaire identifiers
log the number of items clustered on each key
do not consider deleted entities
New test for openaire2.0_data compliance for datasets
updating to dnet-openaire-data-protos:3.5.0
updated to dnet-openaire-data-protos:3.5.0-SNAPSHOT
cleanup, extended tests to include new relationships and mapping profiles
counters
counter test
depending on version range
testing author dedup
branch offline dedup
Tests load the XSLT from the TDSRule profiles in dnet-openaireplus-profiles
Back to revision r39888 and updated pom and sh files
depending on SNAPSHOT parent
playing with index feeding
removed
removing old branch
bumped minor version
make SNAPSHOTs visible to this module
added possibility to post-process the result stored in the index documents
ticket #1588 Rename "native" compatibility to "proprietary"
use of external properties
added min distance algorithm, used to identify the connected components (dedup)
limit the job to institutional pubsrepository
counter labels
use of Text instead of ImmutableBytesWritable
reimplemented calculatePersonDistribution M/R job to consider only the results from pubsrepositories (not journals)
reuse the same outkey and outvalue objects
added more mapping tests, using xslt picked from services.openaire
spring makes me lazy
added infospace dump mapper
added information space export job
testing umlauts
updated to the new mongodb driver specs
Null values for FP7 and H2020 specific fields about OA mandate and Data Pilot.
Generate compressed record in OAI store.
Do not check the status of a record: we assume we have to insert it because the OAI store is built in refresh mode.
OAIStore with compressed bodies. Currently for beta only.
fixed tests, added new dedup specific jobs
added implementors for offline dedup person workflow