#1240 raising mapred.task.timeout to 3600000 ms (1h) in case any extremely complex PDF documents appear. All time-consuming documents will be registered in the failures sink.
#1257 dropping schema-generation-related hacks in all map-reduce modules, switching to literal schema parameters
#1248 introducing failures sink datastore support in metadata extraction module
#1240 extending mapred.task.timeout for metadata extraction to 30 minutes
#953 raising maximum heap size from 2048 to 4096 after Dominika upgraded the iText dependency to 5.5.3 in cermine. This combination should minimize the possibility of failures caused by a fatal OutOfMemoryError.
#913 introducing support for max file size parameter, currently checked against Content-Length header
setting excluded_ids to undefined value
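
Note on #913: a minimal sketch of how a max file size check against the Content-Length header could look. The class and method names are hypothetical and not taken from the module. Treating a missing or malformed header as "size unknown" (so the check passes) is an assumption, not confirmed behaviour.

```java
import java.util.OptionalLong;

// Hypothetical helper, for illustration only.
final class ContentLengthCheck {

    private ContentLengthCheck() {
    }

    // Parses a Content-Length header value.
    // Returns empty when the header is missing, malformed or negative.
    public static OptionalLong parseContentLength(String headerValue) {
        if (headerValue == null) {
            return OptionalLong.empty();
        }
        try {
            long declared = Long.parseLong(headerValue.trim());
            return declared >= 0 ? OptionalLong.of(declared) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // True only when the declared size is known and larger than maxBytes.
    // Assumption: an unknown size is not rejected at this stage.
    public static boolean exceedsLimit(String headerValue, long maxBytes) {
        OptionalLong declared = parseContentLength(headerValue);
        return declared.isPresent() && declared.getAsLong() > maxBytes;
    }
}
```

Because the header is only a declaration by the server, a check like this can skip obviously oversized files early but cannot guarantee the actual body size.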