HBase to HDFS
Data Provision
30
set mdformat, layout, interpretation
TMF
index
openaire
dnet.openaire.model.relclasses.xquery
relClasses
Prepare indexing
hdfsRecordsPath
rottenRecordsPath
/eu/dnetlib/msro/openaireplus/workflows/index/openaireLayoutToRecordStylesheet.xsl
oaf.schema.location
hdfs cleanup (xml)
DM
{
'path' : 'hdfsRecordsPath'
}
hdfs cleanup (rotten)
DM
{
'path' : 'rottenRecordsPath'
}
M/R group entities
DM
prepareIndexDataJob
{
'hbase.mapred.inputtable' : 'hbase.mapred.datatable',
'hbase.mapreduce.inputtable' : 'hbase.mapred.datatable'
}
{
'mapred.output.dir' : 'hdfsRecordsPath',
'index.entity.links' : 'index.entity.links',
'oaf.schema.location' : 'oaf.schema.location',
'contextmap' : 'contextmap',
'relClasses' : 'relClasses'
}
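The parameter maps above appear to pair a Hadoop job property (the key) with the name of a workflow attribute (the value) that supplies the actual setting at run time. A minimal sketch of that lookup follows, assuming this reading is correct. The function `resolve_params` and every value in `example_env` are hypothetical and only illustrate the idea; they are not taken from the workflow engine.

```python
import ast


def resolve_params(param_map_text, env):
    """Parse a single-quoted parameter map and look up each value in env.

    Assumption: each map value names an attribute of the workflow
    environment, and the job receives that attribute's content.
    """
    # The maps are valid Python dict literals, so literal_eval can read them.
    param_map = ast.literal_eval(param_map_text.strip())
    resolved = {}
    missing = []
    for job_property, env_key in param_map.items():
        if env_key in env:
            resolved[job_property] = env[env_key]
        else:
            missing.append(env_key)
    if missing:
        # Fail early rather than submit a job with unset properties.
        raise KeyError("unresolved attributes: " + ", ".join(sorted(missing)))
    return resolved


# Map copied from the 'M/R group entities' step.
job_params_text = """
{
'mapred.output.dir' : 'hdfsRecordsPath',
'index.entity.links' : 'index.entity.links',
'oaf.schema.location' : 'oaf.schema.location',
'contextmap' : 'contextmap',
'relClasses' : 'relClasses'
}
"""

# Hypothetical environment; real values come from the earlier workflow steps.
example_env = {
    "hdfsRecordsPath": "/tmp/example/records",
    "index.entity.links": "<example links>",
    "oaf.schema.location": "http://example.org/oaf.xsd",
    "contextmap": "<example contextmap>",
    "relClasses": "<example relClasses>",
}

job_conf = resolve_params(job_params_text, example_env)
```

Raising on a missing attribute is a design choice of this sketch: a job started with an unset output directory or schema location would likely fail later and less clearly.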