# Date Author Comment
55262 10/04/2019 02:25 PM Alessia Bardi

Datacite plugin now gets the baseUrl from the interfacedescriptor

55216 08/04/2019 12:11 PM Sandro La Bruzzo

implemented Datacite collector plugin from Elasticsearch dump

53854 17/11/2018 06:20 PM Alessia Bardi

code formatting

53853 17/11/2018 06:18 PM Alessia Bardi

fixed class name for logs

53781 15/11/2018 12:56 PM Andreas Czerniak

Issue Enhancement #3858 and code cleanup in RestIterator

53691 09/11/2018 03:50 PM Claudio Atzori

Kaggle/Reactome: added configurable params

53688 09/11/2018 02:20 PM Giorgos Papanikos

added configurable producer timeout

53685 09/11/2018 01:50 PM Giorgos Papanikos

Added safeguard parsing of produced xml to catch badly escaped illegal characters (e.g. etf: \001 escaped as ). Changed harvesting thread execution from ThreadExecutor to Thread.Start

53683 09/11/2018 11:45 AM Claudio Atzori

Kaggle/Reactome: factored out file write procedure

53677 08/11/2018 05:29 PM Giorgos Papanikos

added more logging and removed fair option from blocking queue

53664 08/11/2018 11:29 AM Giorgos Papanikos

Updated string identifier type reference to use enum

53663 08/11/2018 11:21 AM Claudio Atzori

added main classes to verify the content collected from Kaggle and Reactome

53660 08/11/2018 10:41 AM Giorgos Papanikos
53654 07/11/2018 06:29 PM Miriam Baglioni

fixed issue

53653 07/11/2018 06:25 PM Miriam Baglioni
53652 07/11/2018 06:15 PM Miriam Baglioni

fixed issue for stopping iteration execution

53641 06/11/2018 06:49 PM Giorgos Papanikos

Added default empty dataset document serialization for endpoints where no dataset can be retrieved

53640 06/11/2018 04:18 PM Claudio Atzori

optional parameters for the schema.org plugin

53630 05/11/2018 06:29 PM Claudio Atzori

added all the possible parameters to the Schema.org ProtocolDescriptor

53616 04/11/2018 03:23 PM Giorgos Papanikos

Corrected Creation Date format

53615 04/11/2018 02:33 PM Giorgos Papanikos

deleted dead code. shouldn't have been committed in the first place

53614 04/11/2018 02:23 PM Giorgos Papanikos

Added schema.org harvesting plugin. Supports sitemapindex files and api listing calls to retrieve endpoints list

53183 19/09/2018 09:15 AM Andreas Czerniak

enhancement of new resumptionType, Issue Enhancement #3858

53163 18/09/2018 09:07 AM Andreas Czerniak

fix JSON replacement with cleanUnwantedJsonCharsInXmlTagnames

53123 14/09/2018 02:12 PM Andreas Czerniak

use XmlCleaner for cleaning up XML results and
prepare for next revision.

53116 13/09/2018 03:55 PM Andreas Czerniak

org.json.XML - update maven package version to 20180813
better unicode support

53071 11/09/2018 05:12 PM Miriam Baglioni

add fix related to #3849 (control characters in xml files break the transformation)

52997 03/09/2018 09:26 AM Andreas Czerniak

org.json.XML - Workaround for JSON element names -> XML tagnames.
remove resumptionParam&-Type from first 'query' URL.

52983 24/08/2018 10:10 AM Andreas Czerniak

Additional to discover option in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52982 23/08/2018 04:08 PM Andreas Czerniak

Additional comments, debugging output in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52979 23/08/2018 09:23 AM Andreas Czerniak

Additional comments, debugging output and small changes in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52971 14/08/2018 11:43 AM Andreas Czerniak

Small changes in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52970 10/08/2018 11:27 AM Andreas Czerniak

Changes in the Rest_Json CollectorPlugin with enhancements for the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52783 23/07/2018 11:52 AM Miriam Baglioni

use HttpConnector to download XML instead of VTDGen parse URL method

52644 02/07/2018 03:15 PM Miriam Baglioni

remove "\n" from all the cell contents

52643 02/07/2018 02:42 PM Miriam Baglioni

fix issue in input data

52618 29/06/2018 02:07 PM Miriam Baglioni

minor

52614 29/06/2018 01:57 PM Miriam Baglioni

changes in the implementation of the iterator

52611 29/06/2018 12:42 PM Miriam Baglioni

little adjustment to fix data format in input data

52520 18/06/2018 12:21 PM Claudio Atzori

cleanup

52519 18/06/2018 12:07 PM Miriam Baglioni

minor

52518 18/06/2018 11:23 AM Miriam Baglioni

fix for package name (HTTPWithFileName -> httpfilename) and fixed issue on iterator for HTTPWithFileNameCollectorIterable

52514 15/06/2018 06:39 PM Claudio Atzori

small refactor

52510 15/06/2018 05:09 PM Claudio Atzori

small refactor

52496 15/06/2018 02:14 PM Claudio Atzori

used blocking methods in HTTPWithFileNameCollectorIterable

52240 25/05/2018 05:28 PM Miriam Baglioni

added information to the associated URL for junk metadata

52238 25/05/2018 05:08 PM Miriam Baglioni

fixed issue when .jos metadata extension contains xml content

52237 25/05/2018 03:21 PM Miriam Baglioni

code cleaning

52235 25/05/2018 02:51 PM Miriam Baglioni

changed implementation of data gathering

52230 25/05/2018 12:54 PM Miriam Baglioni

some logs for debugging reasons added

52102 18/05/2018 04:43 PM Miriam Baglioni

minor

52100 18/05/2018 04:10 PM Miriam Baglioni

stupid mistake

52099 18/05/2018 04:08 PM Miriam Baglioni

minor

52093 18/05/2018 11:58 AM Miriam Baglioni

check for malformed json

52059 16/05/2018 04:51 PM Miriam Baglioni

remove DOCTYPE from metadata xml

52056 16/05/2018 03:34 PM Miriam Baglioni

minor

52054 16/05/2018 03:01 PM Miriam Baglioni

filtering metadata and added param in template to specify what to filter out

52036 15/05/2018 03:31 PM Miriam Baglioni

removed DOCTYPE from xml metadata document

52035 15/05/2018 03:28 PM Miriam Baglioni

change in the parameters of httpWithFilename plugin

52034 15/05/2018 02:52 PM Miriam Baglioni

minor

52031 15/05/2018 02:10 PM Miriam Baglioni

modified update of information in xml metadata

52026 14/05/2018 05:57 PM Miriam Baglioni

considered the case where metadata are given in xml format instead of json

51970 08/05/2018 12:43 PM Miriam Baglioni

commit after refactoring

51966 08/05/2018 12:34 PM Miriam Baglioni

commit after refactoring

51965 08/05/2018 12:31 PM Miriam Baglioni

added bean

51959 08/05/2018 11:20 AM Miriam Baglioni

fix bug

51956 08/05/2018 10:44 AM Miriam Baglioni

plugin for collecting metadata from files mapped to urls (related to #3236)

50662 09/02/2018 05:31 PM Alessia Bardi

using commons-lang3

50584 05/02/2018 11:49 AM Claudio Atzori

small adjustments in rest json plugin

50582 02/02/2018 06:29 PM Jochen Schirrwagen

fixed 'next' method of the iterator class and added new field entityPath

50116 13/12/2017 12:07 PM Jochen Schirrwagen

implemented missing method collect in RestCollectorPlugin; added test class; removed RestIteratorFactory class

50070 05/12/2017 02:13 PM Jochen Schirrwagen

declaring bean for RestCollectorPlugin

50066 04/12/2017 07:02 PM Jochen Schirrwagen

collector plugin for rest apis

49643 23/10/2017 06:24 PM Alessia Bardi

Moved HttpConnector in common package

49638 23/10/2017 05:20 PM Alessia Bardi

Using HttpConnector in re3data plugin

48028 28/06/2017 02:30 PM Claudio Atzori

integrated latest changes from dnet40

45294 11/01/2017 11:11 AM Claudio Atzori

codebase used to migrate to java8 the production system