OpenAIRE Research Graph » History » Version 2

Alessia Bardi, 05/11/2021 12:05 PM
Links to dump and schema

h1. The OpenAIRE Research Graph

The OpenAIRE Research Graph is one of the largest open scholarly record collections worldwide, and is key to fostering Open Science and establishing its practices in daily research activities.
Conceived as a public and transparent good and populated from data sources trusted by scientists, the Graph aims to bring the discovery, monitoring, and assessment of science back into the hands of the scientific community.

Imagine a vast collection of research products, all linked together, contextualised, and openly available. For the past ten years OpenAIRE has been working to gather this valuable record. It is a massive collection of metadata and links between scientific products (such as articles, datasets, software, and other research products) and entities such as organisations, funders, funding streams, projects, communities, and data sources.

As of today, the OpenAIRE Research Graph aggregates around 450 million metadata records, with links, collected from 10,000 data sources trusted by scientists, including:
* Repositories registered in OpenDOAR or re3data.org
* Open Access journals registered in DOAJ
* Crossref
* Unpaywall
* ORCID
* Microsoft Academic Graph
* DataCite

After cleaning, deduplication, enrichment, and full-text mining, the Graph is analysed to produce statistics for OpenAIRE MONITOR (https://monitor.openaire.eu) and the Open Science Observatory (https://osobservatory.openaire.eu). It is also made discoverable via OpenAIRE EXPLORE (https://explore.openaire.eu) and programmatically accessible as described at https://develop.openaire.eu.
JSON dumps are also published on Zenodo.

h2. Graph Data Dumps

To serve different user needs, several dumps are available, all published in the "OpenAIRE Research Graph community on Zenodo":https://zenodo.org/communities/openaire-research-graph.
Here we provide detailed documentation about the full dump:

* JSON dump: https://doi.org/10.5281/zenodo.3516917
* JSON schema: https://doi.org/10.5281/zenodo.4238938

[[Json schema]]
[[FAQ]]
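
As a starting point for working with the full dump, the following is a minimal sketch in Python. It assumes that the files extracted from the dump archives are gzip-compressed and contain one JSON record per line, and that each record carries a @type@ field; both are assumptions that should be checked against the JSON schema linked above. The helper names (@iter_records@, @count_by_field@) and the file name @part-00000.json.gz@ are hypothetical and used for illustration only.

```python
import gzip
import json
import tempfile
from collections import Counter
from pathlib import Path


def iter_records(path):
    """Yield one dict per non-empty line of a gzip-compressed JSON Lines file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_field(records, field="type"):
    """Count records by the value of `field` (None when the field is absent)."""
    return Counter(record.get(field) for record in records)


if __name__ == "__main__":
    # Tiny stand-in for one extracted part file; real records are far richer.
    # The "type" field and its values are illustrative assumptions.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "part-00000.json.gz"  # hypothetical file name
        with gzip.open(sample, mode="wt", encoding="utf-8") as out:
            out.write('{"id": "r1", "type": "publication"}\n')
            out.write('{"id": "r2", "type": "dataset"}\n')
        print(count_by_field(iter_records(sample)))
```

To try this on real data, point @iter_records@ at a file extracted from an archive downloaded from Zenodo. Records are read one line at a time, so the whole file never has to be loaded into memory.
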

h2. Graph provision processes

* OpenAIRE entity identifier & PID mapping policy
* Aggregation business logic by major sources:
** Unpaywall integration
** Crossref integration
** ORCID integration
** Cross cleaning actions: hostedBy patch
** Scholexplorer business logic (relationship resolution)
** DataCite
** EuropePMC
** more…
* Deduplication business logic
** For research outputs
** For research organizations
* Enrichment
** Mining business logic
** Deduction-based inference
** Propagation business logic
* Post-cleaning business logic
* FAQ