Set information about the current provider
$params.("dataprovider:id")$
$params.("dataprovider:name")$
$params.("dataprovider:interface")$
Obtain data source params
$params.("dataprovider:id")$
Set in the environment all the variables needed by the collection Oozie job
$params.("harv_id")$
XML
5
0
60
30
60
Start the Hadoop Job
executeOozieJob
IIS
{
"apiDescription":"apiDescription",
"dataSourceInfo":"dataSourceInfo",
"identifierPath":"identifierPath",
"metadataEncoding":"metadataEncoding",
"timestamp":"timestamp",
"workflowId":"workflowId",
"mdStoreID":"mdId",
"collectionMode":"collectionMode",
"maxNumberOfRetry":"maxNumberOfRetry",
"requestDelay":"requestDelay",
"retryDelay":"retryDelay",
"connectTimeOut":"connectTimeOut",
"readTimeOut":"readTimeOut",
"dnetMessageManagerURL":"dnetMessageManagerURL",
"oozie.wf.application.path":"oozieWfPath"
}
{
"collection_java_xmx" : "-Xmx300m"
}
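The two parameter maps above can be read as a recipe for building the properties passed to the Oozie job. The sketch below is only an illustration under stated assumptions: it assumes that each key of the first map is a job property name and each value is the name of an environment attribute to read, and that the second map holds fixed values passed through unchanged. The helper build_job_properties and the placeholder values are hypothetical and are not part of the workflow engine.

```python
import json

# Mapping as listed above. Assumption: each key is the property name handed to
# the Oozie job, each value is the workflow-environment attribute to read.
ENV_PARAMS = json.loads("""
{
  "apiDescription": "apiDescription",
  "dataSourceInfo": "dataSourceInfo",
  "identifierPath": "identifierPath",
  "metadataEncoding": "metadataEncoding",
  "timestamp": "timestamp",
  "workflowId": "workflowId",
  "mdStoreID": "mdId",
  "collectionMode": "collectionMode",
  "maxNumberOfRetry": "maxNumberOfRetry",
  "requestDelay": "requestDelay",
  "retryDelay": "retryDelay",
  "connectTimeOut": "connectTimeOut",
  "readTimeOut": "readTimeOut",
  "dnetMessageManagerURL": "dnetMessageManagerURL",
  "oozie.wf.application.path": "oozieWfPath"
}
""")

# Fixed values, assumed to be passed to the job as they are.
STATIC_PARAMS = {"collection_java_xmx": "-Xmx300m"}


def build_job_properties(env, env_params=ENV_PARAMS, static_params=STATIC_PARAMS):
    """Hypothetical helper: merge static parameters with values read from env."""
    # Fail early, listing every attribute an earlier step did not set.
    missing = sorted(name for name in env_params.values() if name not in env)
    if missing:
        raise KeyError("missing environment attributes: " + ", ".join(missing))
    props = dict(static_params)
    for prop_name, env_name in env_params.items():
        props[prop_name] = str(env[env_name])
    return props


if __name__ == "__main__":
    # Placeholder values only; real values come from the earlier workflow steps.
    env = {name: "<" + name + ">" for name in ENV_PARAMS.values()}
    env["mdId"] = "md-0001"
    props = build_job_properties(env)
    print(props["mdStoreID"])
```

Under this reading, only two entries rename anything: the environment attribute mdId is exposed to the job as mdStoreID, and oozieWfPath as oozie.wf.application.path. All other attributes keep their names.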
BeginRead,StartTransaction,CollectionWorker
Update datasource API extra fields
$params.("harv_id")$
$params.("dataprovider:id")$
$params.("dataprovider:interface")$
last_collection_total
last_collection_date
last_collection_mdId