This project contains workflow nodes that **extract references to projects and data sets** from document plaintext.

The workflow nodes are based on the Python MadIS library (see http://code.google.com/p/madis/). For them to work, the non-standard `apsw` Python library (see http://code.google.com/p/apsw/) must be installed on all datanodes (slaves).

Python 2.7 or later is recommended.
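
The requirements above can be checked on a datanode before the workflow is deployed. The following is a minimal sketch, not part of this project: `check_environment` is a hypothetical helper name, and the only things it assumes are the `apsw` module name and the 2.7 version threshold stated above.

```python
import importlib
import sys


def check_environment(min_version=(2, 7), module_name="apsw"):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: it only checks the interpreter version and
    whether the named module can be imported on this machine.
    """
    problems = []

    # Compare (major, minor) of the running interpreter with the minimum.
    if sys.version_info[:2] < tuple(min_version):
        problems.append(
            "Python %d.%d found, %d.%d or later is recommended"
            % (tuple(sys.version_info[:2]) + tuple(min_version))
        )

    # A failed import means the library is missing on this node.
    try:
        importlib.import_module(module_name)
    except ImportError:
        problems.append("module %r is not importable" % module_name)

    return problems
```

Calling `check_environment()` on each datanode and printing the returned list shows which nodes still lack `apsw` or run an older interpreter. An empty list means both requirements are met.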