Name | Size | Revision | Age | Author | Comment
KaggleRepositoryIterable.java | 5.59 KB | 53685 | over 5 years | Giorgos Papanikos | Added safeguard parsing of produced xml to catc...

Latest revisions

# Date Author Comment
53685 09/11/2018 01:50 PM Giorgos Papanikos

Added safeguard parsing of the produced XML to catch badly escaped illegal characters (e.g. etf: \001 escaped as ). Changed harvesting thread execution from ThreadExecutor to Thread.start().
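
The kind of safeguard this revision describes can be sketched as below. This is a minimal illustration, not the code from KaggleRepositoryIterable.java: the class name XmlSafeguard and the method isWellFormed are hypothetical, and it assumes the check amounts to re-parsing the produced XML with the JDK's built-in parser and rejecting records that fail to parse. Under XML 1.0 a control character such as \001 is illegal both as a raw character and as a numeric character reference, so the parser reports a fatal error for it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: the name and shape are assumptions for illustration only.
final class XmlSafeguard {

    private XmlSafeguard() {
    }

    // Returns true if the string parses as well-formed XML 1.0, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Harvested content is untrusted, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser from
            // printing its own diagnostics to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Illegal characters, bad escapes, and unbalanced tags all end up here.
            return false;
        }
    }
}
```

A harvesting loop could call XmlSafeguard.isWellFormed(record) on each produced record and log or skip the ones that return false instead of passing them downstream.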

53677 08/11/2018 05:29 PM Giorgos Papanikos

Added more logging and removed the fair option from the blocking queue.

53614 04/11/2018 02:23 PM Giorgos Papanikos

Added schema.org harvesting plugin. Supports sitemapindex files and API listing calls to retrieve the list of endpoints.
