# Date Author Comment
38142 09/07/2015 01:09 PM Marek Horst

#1147 preserving newlines when ingesting plaintext from HTML files. This will eliminate some of the false positives in reference extraction algorithms

37142 11/05/2015 06:50 PM Marek Horst

merging trunk changes with IIS-CDH-5.3.0 branch

36288 09/04/2015 07:10 PM Marek Horst

#1257 dropping schema generation related hacks in all map-reduce modules, switching to literal schema parameters

35707 27/03/2015 09:41 AM Marek Horst

#1135 switching icm-iis-parent-container version to 1.0.1-SNAPSHOT in order to include workingDir related changes made in icm-iis-core

35406 17/03/2015 03:04 PM Marek Horst

#1198 aligning IIS dependencies and java code to CDH5.3.0 cluster

35392 17/03/2015 03:00 PM Marek Horst

#1197 introducing job.properties changes aligning paths to rumcajs cluster HDFS structure

35257 11/03/2015 04:52 PM Marek Horst

creating IIS-CDH-5.3.0 branch

35256 11/03/2015 04:52 PM Marek Horst

introducing branches folder

34912 27/02/2015 06:57 PM Marek Horst

#1147 renaming toplaintext wf name to plaintext to be more appropriate

34911 27/02/2015 06:56 PM Marek Horst

#1147 renaming toplaintext dir name to plaintext to be more appropriate

34906 27/02/2015 06:18 PM Marek Horst

#1147 introducing first version of html->plaintext ingester utilizing jsoup library

34902 27/02/2015 05:39 PM Marek Horst

#1047 renaming icm-iis-ingest-webcrawl SVN location to icm-iis-ingest

34897 27/02/2015 05:37 PM Marek Horst

#1047 renaming icm-iis-ingest-webcrawl SVN location to icm-iis-ingest

34895 27/02/2015 05:36 PM Marek Horst

#1147 renaming icm-iis-ingest-webcrawl module to icm-iis-ingest to make it more generic, so that it can contain not only webcrawl-related ingesters but HTML ingesters as well

34621 19/02/2015 06:12 PM Marek Horst

#1038 introducing ranges in dependencies definition for all IIS modules

34432 11/02/2015 02:24 PM Marek Horst

setting svn:ignore

34431 11/02/2015 02:22 PM Marek Horst

#1083 introducing webcrawl ingester module extracting FX field from plaintext before executing project reference extraction

34430 11/02/2015 02:18 PM Marek Horst

Share project "icm-iis-ingest-webcrawl" into "https://svn.driver.research-infrastructures.eu/driver"