
# Date Author Comment
53854 17/11/2018 18:20 Alessia Bardi

code formatting

53853 17/11/2018 18:18 Alessia Bardi

fixed class name for logs

53781 15/11/2018 12:56 Andreas Czerniak

Issue Enhancement #3858 and code cleanup in RestIterator

53691 09/11/2018 15:50 Claudio Atzori

Kaggle/Reactome: added configurable params

53688 09/11/2018 14:20 Giorgos Papanikos

added configurable producer timeout

53685 09/11/2018 13:50 Giorgos Papanikos

Added safeguard parsing of produced xml to catch badly escaped illegal characters (eg etf: \001 escaped as ). Changed harvesting thread execution from ThreadExecutor to Thread.Start

53683 09/11/2018 11:45 Claudio Atzori

Kaggle/Reactome: factored out file write procedure

53677 08/11/2018 17:29 Giorgos Papanikos

added more logging and removed fair option from blocking queue

53664 08/11/2018 11:29 Giorgos Papanikos

Updated string identifier type reference to use enum

53663 08/11/2018 11:21 Claudio Atzori

added main classes to verify the content collected from Kaggle and Reactome

53660 08/11/2018 10:41 Giorgos Papanikos

53654 07/11/2018 18:29 Miriam Baglioni

fixed issue

53653 07/11/2018 18:25 Miriam Baglioni

53652 07/11/2018 18:15 Miriam Baglioni

fixed issue for stopping iteration execution

53641 06/11/2018 18:49 Giorgos Papanikos

Added default empty dataset document serialization for endpoints where no dataset can be retrieved

53616 04/11/2018 15:23 Giorgos Papanikos

Corrected Creation Date format

53615 04/11/2018 14:33 Giorgos Papanikos

deleted dead code. shouldn't have been committed in the first place

53614 04/11/2018 14:23 Giorgos Papanikos

Added schema.org harvesting plugin. Supports sitemapindex files and api listing calls to retrieve endpoints list

53183 19/09/2018 09:15 Andreas Czerniak

enhancement of new resumptionType, Issue Enhancement #3858

53163 18/09/2018 09:07 Andreas Czerniak

fix JSON replacement with cleanUnwantedJsonCharsInXmlTagnames

53123 14/09/2018 14:12 Andreas Czerniak

use XmlCleaner for cleaning up XML results and
prepare for next revision.

53116 13/09/2018 15:55 Andreas Czerniak

org.json.XML - update maven package version to 20180813
better unicode support

53071 11/09/2018 17:12 Miriam Baglioni

add fix related to #3849 (control characters in xml files break the transformation)

52997 03/09/2018 09:26 Andreas Czerniak

org.json.XML - Workaround for JSON element names -> XML tagnames.
remove resumptionParam&-Type from first 'query' URL.

52983 24/08/2018 10:10 Andreas Czerniak

Addition to the discover option in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52982 23/08/2018 16:08 Andreas Czerniak

Additional comments, debugging output in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52979 23/08/2018 09:23 Andreas Czerniak

Additional comments, debugging output and small changes in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52971 14/08/2018 11:43 Andreas Czerniak

Small changes in the Rest_Json CollectorPlugin for the enhancements of the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52970 10/08/2018 11:27 Andreas Czerniak

Changes in the Rest_Json CollectorPlugin with enhancements for the new OpenDOAR API at JISC under https://v2.sherpa.ac.uk/opendoar/

52783 23/07/2018 11:52 Miriam Baglioni

use HttpConnector to download XML instead of VTDGen parse URL method

52644 02/07/2018 15:15 Miriam Baglioni

remove "\n" from all the cell contents

52643 02/07/2018 14:42 Miriam Baglioni

fix issue in input data

52618 29/06/2018 14:07 Miriam Baglioni

minor

52614 29/06/2018 13:57 Miriam Baglioni

changes in the implementation of the iterator

52611 29/06/2018 12:42 Miriam Baglioni

little adjustment to fix data format in input data

52520 18/06/2018 12:21 Claudio Atzori

cleanup

52519 18/06/2018 12:07 Miriam Baglioni

minor

52518 18/06/2018 11:23 Miriam Baglioni

fix for package name (HTTPWithFileName -> httpfilename) and fixed issue on iterator for HTTPWithFileNameCollectorIterable

52514 15/06/2018 18:39 Claudio Atzori

small refactor

52510 15/06/2018 17:09 Claudio Atzori

small refactor

52496 15/06/2018 14:14 Claudio Atzori

used blocking methods in HTTPWithFileNameCollectorIterable

52240 25/05/2018 17:28 Miriam Baglioni

added information to the associated URL for junk metadata

52238 25/05/2018 17:08 Miriam Baglioni

fixed issue when .jos metadata extension contains xml content

52237 25/05/2018 15:21 Miriam Baglioni

code cleaning

52235 25/05/2018 14:51 Miriam Baglioni

changed implementation of data gathering

52230 25/05/2018 12:54 Miriam Baglioni

some logs for debugging reasons added

52102 18/05/2018 16:43 Miriam Baglioni

minor

52100 18/05/2018 16:10 Miriam Baglioni

stupid mistake

52099 18/05/2018 16:08 Miriam Baglioni

minor

52093 18/05/2018 11:58 Miriam Baglioni

check for malformed json

52059 16/05/2018 16:51 Miriam Baglioni

remove DOCTYPE from metadata xml

52056 16/05/2018 15:34 Miriam Baglioni

minor

52054 16/05/2018 15:01 Miriam Baglioni

filtering metadata and added param in template to specify what to filter out

52036 15/05/2018 15:31 Miriam Baglioni

removed DOCTYPE from xml metadata document

52034 15/05/2018 14:52 Miriam Baglioni

minor

52031 15/05/2018 14:10 Miriam Baglioni

modified update of information in xml metadata

52026 14/05/2018 17:57 Miriam Baglioni

considered the case where metadata are given in xml format instead of json

51970 08/05/2018 12:43 Miriam Baglioni

commit after refactoring

51959 08/05/2018 11:20 Miriam Baglioni

fix bug

51956 08/05/2018 10:44 Miriam Baglioni

plugin for collecting metadata from files mapped to urls (related to #3236)

50662 09/02/2018 17:31 Alessia Bardi

using commons-lang3

50584 05/02/2018 11:49 Claudio Atzori

small adjustments in rest json plugin

50582 02/02/2018 18:29 Jochen Schirrwagen

fixed 'next' method of the iterator class and added new field entityPath

50116 13/12/2017 12:07 Jochen Schirrwagen

implemented missing method collect in RestCollectorPlugin; added test class; removed RestIteratorFactory class

50070 05/12/2017 14:13 Jochen Schirrwagen

declaring bean for RestCollectorPlugin

50066 04/12/2017 19:02 Jochen Schirrwagen

collector plugin for rest apis

49643 23/10/2017 18:24 Alessia Bardi

Moved HttpConnector in common package

49638 23/10/2017 17:20 Alessia Bardi

Using HttpConnector in re3data plugin

48028 28/06/2017 14:30 Claudio Atzori

integrated latest changes from dnet40

45294 11/01/2017 11:11 Claudio Atzori

codebase used to migrate the production system to java8