Revision 1 (Paolo Manghi, 27/02/2015 09:44 AM) → Revision 2/13 (Paolo Manghi, 27/02/2015 06:29 PM)
h1. Task 8.2 D82 LOD services

*Leader*: UBONN. *Participants*: ARC, CNR

h2. Subtasks

* map all metadata objects in the OpenAIRE Information Space onto suitable standard vocabularies (e.g. Dublin Core, SIOC, EDM, CERIF LD)
* make these metadata objects available as Linked Open Data, as data dumps published at regular intervals, with more frequently published incremental updates
* link OpenAIRE LOD objects with other Linked Data resources such as DBLP, ACM, Citeseer, DBpedia (UBONN and ARC)
* liaise with all relevant communities (PSI, DBpedia, LOD, W3C SWEO etc.) to leverage and reach out to additional stakeholders and multipliers (UBONN and ARC)
* precondition: CNR will provide technical support for synchronizing the content of the OpenAIRE Information Space with the LOD services and vice versa, in case content can be moved from the enriched LOD representation back to the OpenAIRE Information Space.
* expected outcome: OpenAIRE will increase its technical interoperability, engage with additional user communities, and explore synergies with, and added value to, related open content initiatives (e.g. in the area of Open Educational Resources).

h2. Task Timeline (Including Deliverables & Milestones)

* M6: D8.2 LOD Services. The deliverable will describe the technical deployment of LOD services, together with their integration with the OpenAIRE Information Space, in terms of data (mappings from the OpenAIRE data model to LOD structure and standard vocabularies) and workflows. [UBONN, R]
* M12, M24, M36: M8.1 LOD services. The service software will be released in three stages, in order to match the three main releases of the OpenAIRE data model.

h2. Areas of priority (where to concentrate first)

# Specify the mapping from the OpenAIRE data model to LOD vocabularies
# Explore technical ways of producing LOD:
## From the CSV files produced so far as intermediate files for generating statistics (potential issue: might not contain sufficiently complete information)
## From CSV data generated by a modified implementation, which keeps all required information (potential issue: not efficient to process with off-the-shelf CSV→RDF tools because of redundancies in the data)
## Directly from HBase, using a Map/Reduce job similar to the one generating the above CSV (potential issue: harder to adapt to data model changes or new vocabularies; can’t reuse off-the-shelf tools)

h2. Discussion (https://issue.openaire.research-infrastructures.eu/issues/1089)

h2. Foreseen Integration with other Work Packages and Tasks

* T4.4 (Guidelines for Data providers and OpenAIRE service APIs): Discuss the possibility of importing data into the OpenAIRE Information Space that exists natively in the form of LOD.
* T9.2 (Statistics, reporting and visualization services): Some extensions to these services could potentially be implemented in a straightforward way as SPARQL queries over the LOD, if that’s sufficiently scalable.
* T10.4 (Scholarly communication network analysis): By the same argument, this could also be done on top of the LOD graph.

h2. Communication Strategy: when and how to raise awareness among consortium of updates in task

* Around M3/M4, when the mapping to LOD vocabularies has been specified: “These are the vocabularies we consider reasonable; any comments?”
* First before M6, and once more in the second half of Year 1: ask partners to check the LOD that we can produce so far from the metadata in the OpenAIRE Information Space. “Is it correct/complete?”
* Some time in Year 2 (once all of our metadata have been mapped to LOD): discuss candidate LOD datasets to which we would like to identify links (starting from the list given in the description of this task).
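
To make the CSV-based options under "Areas of priority" more concrete, the following is a minimal sketch of turning CSV rows into N-Triples using Dublin Core Elements properties. It is not the task's agreed mapping or tooling: the base URI, the column names (@id@, @title@, @creator@, @date@) and the column-to-property table are all assumptions chosen for illustration.

```python
import csv
import io
from urllib.parse import quote

# Assumption: placeholder base URI, not an official OpenAIRE namespace.
BASE_URI = "http://example.org/openaire/result/"

# Assumption: hypothetical CSV column names mapped to Dublin Core Elements 1.1.
COLUMN_TO_PROPERTY = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "creator": "http://purl.org/dc/elements/1.1/creator",
    "date": "http://purl.org/dc/elements/1.1/date",
}


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Yield one N-Triples statement per non-empty mapped cell."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Percent-encode the identifier so it is safe inside an IRI.
        subject = "<" + BASE_URI + quote(row["id"], safe="") + ">"
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = (row.get(column) or "").strip()
            if value:
                yield '%s <%s> "%s" .' % (subject, prop, escape_literal(value))
```

Calling @"\n".join(csv_to_ntriples(text))@ on the contents of a CSV file would give a dump that any RDF store can load. Keeping the mapping in a plain table like @COLUMN_TO_PROPERTY@ is one way to keep vocabulary changes cheap, which is the adaptability concern raised for the HBase route.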