m
added serialization, tests
instantiate one SAXReader for each call
fixed format-layout-interpretation concatenation; no longer fails when the fieldExtractor returns a null result
added json serialization, builds the matching key one time only
updated profile to new OAI schema
using format-layout-interpretation to define the OAI store collection name
The expected name of the collection is format-layout-interpretation
do not upsert sets here in the mapper: we shall delegate to a separate workflow to be run after the OAI feeding is completed.
early implementation of jar upload script
Always new records, to test how much faster we go
fixed a problem with lower case
renaming root element from list to citations
retry on exceptions
javadoc annotations
Refactoring
#486 fixing integration test: introducing missing document_text_wos input port for primary/processing
#486 introducing last missing piece: text collapser in front of referenceextraction_researchinitiatives, joining text contents coming from the already existing document_text input port and the newly introduced document_text_wos input port providing WoS contents
refactor of xsl template and split function instead of recursive calls
Refactored class that extracts fields from records. When we can't find an expected index from the configuration to check its repeatability, the field is indexed as repeatable and a counter is updated.
pass the configuration string indented.
redefining the behaviour of the "skipRecord" rule by adding attribute "syntaxcheck" in the record header, #323
fixed application context
Updated SolrIndexDocument: it now returns an instance of SolrInputDocument
idScheme and idNamespace defined as part of the OAI configuration profile
changed signature of the method in indexCollection: now the method lookup throws an IndexServiceException
Implemented SolrIndexCollection
added IDSCHEME and IDNAMESPACE elements to the OAI configuration profile
#486 bugfix: reordering existence filter with id replacer: we need to update identifiers first, then update existence filter
No hadoop-parent: classes needed in mapreduce-jobs have been copied so that the jar does not inherit "heavy dependencies" on spring and dnet IS.
proto to pace mapping parses the whole entity
Need to pass idscheme and namespace parameter to the job.
Removed dependency on dnet-oai-utils to avoid inheriting unwanted jars such as cnr-rmi-api, cnr-service-common, spring, etc., which should not appear when running a job on the cluster. Needed classes have been copied and adapted so they do not use spring anymore.
branch to adapt the proto to pace mapping
need dnet-hadoop-parent because it is imported by dnet-mapreduce-jobs. Added <relativePath/> to avoid build warning.
Added bean
removed Browse and weights from IndexServerDao Interface
submittable M/R OAI feeding job
simplified index switch wf
manual start
replacing redundant transformers/ingest/pmc/citations with already existing transformers/importer/documentmetadata/idextractor
updating job.properties
Parameter check added and better string handling
integrating pmc citations ingestion with primary workflow, adjusting port names, deduplicating dependencies
derp changes
Box to select Valid/pending ds
Changed validate/invalidate labels
DS profiles updated
index format correction
extended dedup configuration, including now blacklists and algorithm parameters
added wf to perform only index switch (BB msg to the search service)
renamed meta wf definition file
small changes
wf fails when indexId is not found in the env
duplicates handling
roll-back
META-INF dir will now be omitted when priming
dir names in parameters should not contain nameNode
added carousel browsing boxes and modified initial form
added two browsing fields
added accordion-toggle class which draws bootstrap chevrons on the right edge of an accordion according to its status
added typologyclass to each api
removed disturbing dependencies from pom; implemented actual lookups on the IS for lightUI profiles
changed target namespace
early implementation
added UnescapeHtml util
oai harvester using Jochen's HttpConnector and XmlCleaner
added record indentation
updating default job.properties
updating programmatic execution of chmod on meta.json file, still not working due to "Permission denied" warnings
renaming input ports from input_citation to input_citations to be aligned with exporter subworkflow
skipping exporting citation matching outcome
#691 introducing citations exporter module, updating exporter workflow.xml definition
added a schema validating the lightui profiles
selection on System parameters
visualization of long params
added bean wfNodeFindSearchService
introducing confidenceToTrustLevelNormalizationFactor getter method
fixing import of abstract after introducing fieldApprover for all Result fields
#568 renaming Citations#sourceDocumentId field to Citations#documentId
#568 introducing CitationEntry and Citations schemas
introducing fieldApprover for all Result fields
fixed rel direction
rename a field
testing direct sqoop import in oozie
use of TryIndentXmlString
added TryIndentXmlString. Doesn't break on malformed XML.