# D-Net Software Toolkit

This is a minimal instance of the D-Net software toolkit, a software framework for the realization of aggregative data infrastructures.

Official Web Site: http://www.d-net.research-infrastructures.eu/

Need support? Contact us via email at: dnet-team@isti.cnr.it

This webapp contains the minimal set of services needed to provide the following features:

- Collection of metadata records in oai_dc format via OAI-PMH, FTP, local file system, or HTTP.

- Transformation of the collected metadata records into an internal format named DMF (Driver Metadata Format).

- Indexing of DMF records in a Solr full-text index.

- OAI-PMH export of aggregated metadata records in DMF and oai_dc formats. More formats can be added at runtime by providing a dedicated XSLT from DMF to the desired target format.

# Installation requirements
This minimal instance can run on a single machine as a web application deployed in a Tomcat container.
## Hardware requirements

Suggested minimal hardware requirements:

- Operating system: almost any operating system except Windows
- Hard disk space: depends mostly on the number and size of the records you are going to collect. A couple of GB should be fine for a small repository (<10K metadata records). See the note on MongoDB disk usage under Software requirements below.

## Software requirements
Software required:

* Apache Tomcat 7: the webapp container
* MongoDB >= 2.4: used to store the collected and transformed metadata records. Each collected record is stored in three separate versions (original, transformed, pmh-ready), so make sure enough disk space is available for MongoDB.
* Solr 4.9.x or 4.10.x: used to make the documents searchable. The Solr server should be started with the option <code>-DzkRun</code>, which instructs Solr to start its embedded ZooKeeper server.

Note that Tomcat, Solr and MongoDB can be installed on the same machine or on dedicated nodes, although the latter requires changing some default system properties (see the D-Net configuration section).

# Running the D-Net web app with Maven
## Maven settings

Whether you want to run the D-Net web app with the Tomcat7 plugin for Maven or build the .war file to deploy on a running Tomcat,
you need Maven 3 and you must add the following repository to a profile in your <code>settings.xml</code>:

```
<repository>
  <id>dnet-bootstrap-releases</id>
  <name>D-Net Bootstrap Releases</name>
  <url>http://maven.research-infrastructures.eu/nexus/content/repositories/dnet4-bootstrap-release/</url>
  <releases>
    <enabled>true</enabled>
  </releases>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
  <layout>default</layout>
</repository>
```

We also suggest adding the Tomcat plugin group to the <code>pluginGroups</code> section at the bottom of the same file:

```
<pluginGroups>
  <pluginGroup>org.apache.tomcat.maven</pluginGroup>
</pluginGroups>
```

## Testing on a local machine
The D-Net software is developed in Java using Maven. You can try out the D-Net web app on your local machine with the tomcat7 plugin, provided that a MongoDB server and a Solr server are also running on localhost and listening on their standard ports.
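Before launching the web app, it can help to confirm that both servers are reachable. The following is a minimal sketch and not part of the D-Net distribution: the helper name <code>port_open</code> is hypothetical, it relies on bash's <code>/dev/tcp</code> redirection, and it assumes the usual default ports of 27017 for MongoDB and 8983 for Solr (9983 is the ZooKeeper address that appears in the default Solr configuration in the D-Net configuration section).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host:port can be opened.
# Uses bash's built-in /dev/tcp pseudo-device, so it needs bash, not plain sh.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Assumed default ports: MongoDB 27017, Solr 8983, embedded ZooKeeper 9983.
for port in 27017 8983 9983; do
  if port_open localhost "$port"; then
    echo "localhost:$port is listening"
  else
    echo "localhost:$port is NOT reachable"
  fi
done
```

If the Solr ports are reported as not reachable, check that Solr was started with the <code>-DzkRun</code> option mentioned in the software requirements.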
To override properties, you can modify <code>dnet-minimal-container/src/main/resources/eu/dnetlib/cnr-site.properties</code>. See the D-Net configuration section and the PROPERTIES.md file for more information about D-Net properties.

```
> cd dnet-minimal-container

> mvn tomcat7:run
```

When you see a log line like:
```
52665 [Thread-7] INFO eu.dnetlib.enabling.is.store.TestContentInitializerJob - INITIALIZED
```

the webapp should be ready and running at http://localhost:8280/app, where 'app' is the value of the property <code>container.context</code> ('app' is the default).


# Deployment on a Tomcat instance

This distribution includes a ready-to-deploy war package.

Copy the war file into the Tomcat 7 <code>webapps</code> directory, make sure you have overridden the properties as explained in the D-Net configuration section, and restart Tomcat.
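The copy step can be sketched as follows. This is only an illustration under stated assumptions: the function name <code>deploy_war</code>, the war path and the Tomcat location are hypothetical, and naming the deployed file after <code>container.context</code> follows the description of that property in the D-Net configuration section (the context is the name of the war file).

```shell
# Hypothetical helper: copy a war into Tomcat's webapps directory,
# naming it after the desired context (container.context).
# Usage: deploy_war <war-file> <tomcat-webapps-dir> <context>
deploy_war() {
  war="$1"; webapps="$2"; context="$3"
  [ -f "$war" ] || { echo "war file not found: $war" >&2; return 1; }
  [ -d "$webapps" ] || { echo "webapps directory not found: $webapps" >&2; return 1; }
  cp "$war" "$webapps/$context.war"
}

# Example call (paths are assumptions, adjust to your installation):
# deploy_war ./dnet-minimal.war /var/lib/tomcat7/webapps is
# Then restart Tomcat with whatever mechanism your system provides.
```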

When you see a log line like:
```
52665 [Thread-7] INFO eu.dnetlib.enabling.is.store.TestContentInitializerJob - INITIALIZED
```

the webapp should be ready and running at http://${container.hostname}:${container.port}/${container.context}
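To script the wait for that log line, one option is to poll the Tomcat log. This is a minimal sketch, assuming the webapp logs to a file such as <code>logs/catalina.out</code> (the actual log location depends on your Tomcat setup); the function name <code>wait_for_dnet</code> is hypothetical.

```shell
# Hypothetical helper: wait until the D-Net initialization message shows up
# in a log file, checking once per second up to a timeout (default 300 s).
# Usage: wait_for_dnet <log-file> [timeout-seconds]
wait_for_dnet() {
  log="$1"; timeout="${2:-300}"; waited=0
  until grep -q 'TestContentInitializerJob - INITIALIZED' "$log" 2>/dev/null; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example call (the log path is an assumption):
# wait_for_dnet /var/lib/tomcat7/logs/catalina.out && echo "D-Net is up"
```

Note that a log file that still contains the message from a previous start would match immediately, so this check is only meaningful on a fresh or rotated log.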

If you want to build the web app yourself, keep reading.


## Building the D-Net web app
The D-Net software is developed in Java with Maven.

To build the war for a Tomcat 7 web app container:

```
> cd dnet-minimal-container

> mvn package
```

The .war file is created in the <code>target</code> directory.

# D-Net configuration
Before you start the web application, you need to configure at least the following properties.
For the full list of available properties and their values, check PROPERTIES.md.

Create a file named <code>cnr.override.properties</code> in <code>$yourTomcatHomeDirectory$/common/classes</code> (<code>$yourTomcatHomeDirectory$</code> will likely be something similar to <code>/var/lib/tomcat7</code>) and set the following properties in it:

- <code>container.hostname</code>: the host name where the web app will be running. Default value is <code>localhost</code>. The default value should *only* be used in local development scenarios.
  <br/>Example: <code>container.hostname = dnet-host.dnet.eu</code>
- <code>container.port</code>: the port where the web app will be running. Default is 8280.
  <br/>Example: <code>container.port = 8080</code>
- <code>container.context</code>: the name of the web app (i.e. the name of the war file). Default is "app". The default value should *only* be used in local development scenarios.
  <br/>Example: <code>container.context = is</code>
- <code>dnet.data.path</code>: path to the directory where all D-Net related resources will be saved. An embedded existDB will be automatically installed in this directory during the first start-up. The directory must be writable by the user running Tomcat. Default value is <code>/tmp/dnet</code>. The default value should *only* be used in local development scenarios.
  <br/>Example: <code>dnet.data.path = /var/lib/dnet</code>
- <code>services.aggregator.country</code>: your country code. Default is <code>EU</code> (Europe).
  <br/>Example: <code>services.aggregator.country = IT</code>
- <code>services.aggregator.name</code>: the name of your aggregator. Default is "D-NET".
  <br/>Example: <code>services.aggregator.name = TEST_Aggregator</code>
- <code>services.mdstore.mongodb.host</code>: the machine hosting MongoDB for the storage of metadata records (M[eta]D[ata]Store). Default is localhost.
  <br/>Example: <code>services.mdstore.mongodb.host = mongo.dnet.eu</code>
- <code>services.mdstore.mongodb.db</code>: name of the MongoDB database to be used for the storage of metadata records. Default is "mdstore_minimal".
  <br/>Example: <code>services.mdstore.mongodb.db = mdstore_1</code>
- <code>dnet.logger.mongo.host</code>: the machine hosting MongoDB for the storage of workflow logs. Default is localhost.
  <br/>Example: <code>dnet.logger.mongo.host = mongo.dnet.eu</code>
- <code>dnet.logger.mongo.db</code>: name of the MongoDB database to be used for the storage of workflow logs. Default is "dnet_logs_minimal".
  <br/>Example: <code>dnet.logger.mongo.db = dnet_logs_1</code>
- <code>services.oai.publisher.repo.name</code>: name of the OAI-PMH Publisher, as it will appear in the OAI Identify response. Default is "D-Net OAI-PMH Publisher".
  <br/>Example: <code>services.oai.publisher.repo.name = TEST_Aggregator OAI-PMH Publisher</code>
- <code>services.oai.publisher.repo.email</code>: email of the OAI-PMH Publisher administrator, as it will appear in the OAI Identify response. Default is "dnet-admin@mock.it". The default *must not* be used in beta or production systems, since it is a mock address.
  <br/>Example: <code>services.oai.publisher.repo.email = name.surname@valid.mail.com</code>
- <code>dnet.admin.password</code>: MD5 digest of the password that allows the user "admin" to log in to the D-Net Admin UI. To generate the digest for a new password: <code>echo thePassword -n | md5</code>. Default is "dnet-minimal" (without double quotes). The default value *should always be overridden*.
  <br/>Example: <code>dnet.admin.password = 5d1ed3888708c0f4cd46b29306a6b449</code>, where 5d1ed3888708c0f4cd46b29306a6b449 is the md5 for the string "pwd" obtained via the command <code>echo pwd -n | md5</code>.
- <code>service.solr.index.jsonConfiguration</code>: information about the Solr instance to be used to create full-text indices on the aggregated metadata records. The default value assumes a local Solr instance. Specifically:
<code>
{"id":"solr",\
"address":"localhost:9983",\
"port":"8983",\
"webContext":"solr",\
"numShards":"1",\
"replicationFactor":"1",\
"host":"localhost",\
"feedingShutdownTolerance":"30000",\
"feedingBufferFlushThreshold":"1000",\
"feedingSimulationMode":"false",\
"luceneMatchVersion":"4.9",\
"serverLibPath":"../../../../contrib/extraction/lib",\
"filterCacheSize":"512","filterCacheInitialSize":"512",\
"queryCacheSize":"512","queryCacheInitialSize":"512",\
"documentCacheSize":"512","documentCacheInitialSize":"512",\
"ramBufferSizeMB":"960","mergeFactor":"40",\
"autosoftcommit":"-1","autocommit":"15000",\
"termIndexInterval":"1024","maxIndexingThreads":"8",\
"queryResultWindowSize":"20","queryResultMaxDocCached":"200"}
</code>

If you are not running the Solr service on the same machine as Tomcat, you need to override the above configuration according to your Solr server installation.
Typically, changing <code>address</code> and <code>host</code> is enough if your Solr server is not configured for sharding and replication.
For more details, refer to the Solr documentation.
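Putting the properties of this section together, an override file could be generated along these lines. This is a sketch only: the function names <code>hash_password</code> and <code>write_overrides</code> are hypothetical, the values are the examples used in this section, and it assumes that <code>dnet.admin.password</code> expects the hex MD5 digest of the bare password with no trailing newline (which is why <code>printf</code> is used instead of <code>echo</code>).

```shell
# Hypothetical helper: print the hex MD5 digest of the exact bytes of $1
# (printf '%s' adds no trailing newline).
hash_password() {
  if command -v md5sum >/dev/null 2>&1; then
    printf '%s' "$1" | md5sum | cut -d' ' -f1   # GNU coreutils
  else
    printf '%s' "$1" | md5 -q                   # BSD / macOS
  fi
}

# Hypothetical helper: write cnr.override.properties into directory $1,
# using password $2 for the admin user. Values are the examples from this README.
write_overrides() {
  dir="$1"; password="$2"
  cat > "$dir/cnr.override.properties" <<EOF
container.hostname = dnet-host.dnet.eu
container.port = 8080
container.context = is
dnet.data.path = /var/lib/dnet
services.aggregator.country = IT
services.aggregator.name = TEST_Aggregator
services.mdstore.mongodb.host = mongo.dnet.eu
services.mdstore.mongodb.db = mdstore_1
dnet.logger.mongo.host = mongo.dnet.eu
dnet.logger.mongo.db = dnet_logs_1
services.oai.publisher.repo.name = TEST_Aggregator OAI-PMH Publisher
services.oai.publisher.repo.email = name.surname@valid.mail.com
dnet.admin.password = $(hash_password "$password")
EOF
}

# Example call (the directory is an assumption, see the start of this section):
# write_overrides /var/lib/tomcat7/common/classes 'thePassword'
```

The Solr property is left out on purpose: it only needs to be overridden when Solr does not run on the same machine as Tomcat.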

# Using D-Net

Under the root folder of the project you can find the folder <code>mock-repository-content</code>.
It contains 150 oai_dc metadata records that you can use to test the functionality of the D-Net software with a mock datasource.

* Place the folder in a location that is readable by Tomcat
* Start the container
* Access the Admin UI (http://${container.hostname}:${container.port}/${container.context}/mvc/ui/index.do)
  * If you are running via the Maven Tomcat plugin with the default properties, the URL is: http://localhost:8280/app/mvc/ui/index.do
* Go to Datasource Management --> Overview and search for "mock"
* Click on "Add metaworkflow" and select the "Collection and Transformation" meta-workflow. This action associates a meta-workflow (i.e., a workflow of workflows) with the datasource and creates all the needed metadata stores.
* Click on the "access params" button on the top right and change the base URL to the location where you saved the sample folder (e.g. file:///dnet/test/mock-repository-content)
* Click on the meta-workflow "Collection and Transformation" and configure its workflows with the missing parameter for the transformation rule
  * Click on the yellow "parameters" button of the transformation workflow and select the rule <code>dc2dmf_DRIVER</code>
* Ensure the launch mode is set to "Auto" for each workflow
* Click on the Launch button of the first workflow ("collect")
* Wait for all the workflows to complete: collect, transform, index, oai, and oaiPostFeed
* Verify that the records have been transformed and indexed: click on MD Inspectors --> D-Net content checker and perform some queries
* Verify that the aggregated records are correctly exposed via the built-in OAI-PMH publisher at:
  * http://${container.hostname}:${container.port}/${container.context}/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=dmf for the DMF metadata format
  * http://${container.hostname}:${container.port}/${container.context}/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=oai_dc for the oai_dc metadata format
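The OAI-PMH check in the last step can also be scripted. This is a minimal sketch: <code>oai_url</code> is a hypothetical helper that assembles the URL from the three <code>container.*</code> properties, and the commented <code>curl</code> call assumes that curl is installed and that the web app is running with the default local settings.

```shell
# Hypothetical helper: build the ListRecords URL of the built-in OAI-PMH publisher.
# Usage: oai_url <hostname> <port> <context> <metadataPrefix>
oai_url() {
  printf 'http://%s:%s/%s/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Print the URLs for both exported formats, using the default local settings.
for prefix in dmf oai_dc; do
  oai_url localhost 8280 app "$prefix"
done

# To fetch the records (requires a running instance):
# curl -s "$(oai_url localhost 8280 app oai_dc)"
```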

# Need support?
Do not hesitate to contact dnet-team@isti.cnr.it