# D-Net Software Toolkit

This is a minimal instance of the D-Net software toolkit, a software framework for the realization of aggregative data infrastructures.

Official Web Site: http://www.d-net.research-infrastructures.eu/

Need support? Contact us via email at: dnet-team@isti.cnr.it

This webapp contains the minimal set of services needed to provide:

- Collection of metadata records in oai_dc format via OAI-PMH, FTP, local file system, or HTTP.

- Transformation of the collected metadata records into an internal format named DMF (Driver Metadata Format).

- Indexing of DMF records in a Solr full-text index.

- OAI-PMH export of aggregated metadata records in DMF and oai_dc formats. More formats can be added at runtime by providing a dedicated XSLT from DMF to the desired target format.

# Installation requirements
This minimal instance can run on a single machine as a web application deployed in a Tomcat container.
## Hardware requirements

Suggested minimal hardware requirements:

- Operating system: almost anything but Windows
- Hard disk space: depends mostly on the number and size of the records you are going to collect. A couple of GBs should be fine for a small repository (<10K metadata records). See the suggestions on installing MongoDB below.

## Software requirements
Software required:

* Apache Tomcat 7: the webapp container
* MongoDB >= 2.4: used to store the collected and transformed metadata records. Each collected record is stored in three separate "versions" (original, transformed, and pmh-ready), so enough disk space should be available for MongoDB.
* Solr 4.9.x or 4.10.x: used to make the documents searchable. The Solr server should be started with the option '-DzkRun', which instructs Solr to start the ZooKeeper server.
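
Because each record is kept in three versions, a rough estimate of the MongoDB storage can help with sizing. The sketch below is only illustrative: the average record size and the `overhead` factor are assumptions, not figures from the D-Net documentation, and the function name is hypothetical.

```python
# Rough MongoDB disk estimate for the metadata stores.
# Assumption (not from the D-Net docs): avg_record_bytes and overhead are
# placeholder values; measure your own records before relying on the result.

VERSIONS_PER_RECORD = 3  # original, transformed, pmh-ready


def estimate_mdstore_bytes(num_records, avg_record_bytes, overhead=1.0):
    """Return the estimated bytes needed to store all record versions."""
    if num_records < 0 or avg_record_bytes < 0 or overhead < 0:
        raise ValueError("inputs must be non-negative")
    return int(num_records * avg_record_bytes * VERSIONS_PER_RECORD * overhead)


if __name__ == "__main__":
    # Hypothetical example: 10,000 records of about 4 KiB each.
    needed = estimate_mdstore_bytes(10_000, 4 * 1024)
    print(f"{needed / 1024 ** 2:.1f} MiB")
```

Index structures and database preallocation add to this figure, so treat the result as a lower bound.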

Note that Tomcat, Solr and MongoDB can be installed on the same machine or on dedicated nodes, although the latter requires changing some default system properties.
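
Before starting the web app, it can be useful to confirm that the backing services are reachable. A minimal sketch follows. The port numbers are assumptions: 27017 is MongoDB's standard port, while 8983 and 9983 are taken from the default Solr configuration shown in the D-Net configuration section. The helper names are hypothetical.

```python
import socket

# Assumed defaults: 27017 is MongoDB's standard port; 8983 and 9983 appear in
# the default Solr configuration of this README ("port" and "address").
DEFAULT_SERVICES = {
    "mongodb": ("localhost", 27017),
    "solr": ("localhost", 8983),
    "zookeeper": ("localhost", 9983),
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=None):
    """Map each service name to True/False depending on reachability."""
    services = DEFAULT_SERVICES if services is None else services
    return {name: port_open(host, port) for name, (host, port) in services.items()}
```

Calling `check_services()` returns a dictionary such as `{"mongodb": True, ...}`. If the services run on dedicated nodes, pass your own mapping of names to `(host, port)` pairs.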

# Running the D-Net web app with Maven
## Maven settings

Whether you want to run the D-Net web app with the Tomcat7 plugin for Maven or build the .war file to deploy on a running Tomcat,
you need Maven 3 and must add the following repository to your <code>settings.xml</code> (repositories are declared inside a profile):

```
<repository>
    <id>dnet-bootstrap-releases</id>
    <name>D-Net Bootstrap Releases</name>
    <url>http://maven.research-infrastructures.eu/nexus/content/repositories/dnet4-bootstrap-release/</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
    <layout>default</layout>
</repository>
```

We also suggest adding the Tomcat plugin group to the <code>pluginGroups</code> element at the bottom of the same file:

```
<pluginGroups>
    <pluginGroup>org.apache.tomcat.maven</pluginGroup>
</pluginGroups>
```
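
To double-check that both snippets ended up in your settings file, a small script can look for them. This is a sketch under assumptions: the function names are hypothetical, and it only checks that the repository id and the plugin group are present anywhere in the file, not that the enclosing profile is active.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace, e.g. '{ns}id' -> 'id'."""
    return tag.rsplit("}", 1)[-1]


def check_settings(xml_text):
    """Report whether the D-Net repository and the Tomcat plugin group are declared."""
    root = ET.fromstring(xml_text)
    repo_ids = set()
    plugin_groups = set()
    for elem in root.iter():
        name = _local(elem.tag)
        if name == "repository":
            # Collect the <id> of every declared repository.
            for child in elem:
                if _local(child.tag) == "id" and child.text:
                    repo_ids.add(child.text.strip())
        elif name == "pluginGroup" and elem.text:
            plugin_groups.add(elem.text.strip())
    return {
        "repository": "dnet-bootstrap-releases" in repo_ids,
        "pluginGroup": "org.apache.tomcat.maven" in plugin_groups,
    }
```

A typical call reads the user settings file, which Maven looks for at `~/.m2/settings.xml`, and passes its content to `check_settings`.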

## Testing on local machine:
The D-Net Software is developed in Java using Maven. You can try out the D-Net web app on your local machine with the tomcat7 plugin, provided you are also running a MongoDB and a Solr server on localhost, listening on their respective standard ports.
To override properties, you can modify <code>dnet-minimal-container/src/main/resources/eu/dnetlib/cnr-site.properties</code>. Please check the D-Net configuration section and the PROPERTIES.md file for more information about D-Net properties.

```
> cd dnet-minimal-container

> mvn tomcat7:run
```

When you see a log line like:
```
52665 [Thread-7] INFO  eu.dnetlib.enabling.is.store.TestContentInitializerJob  - INITIALIZED
```

the webapp should be ready and running at http://localhost:8280/app, where 'app' is the value of the property <code>container.context</code> ('app' is the default).
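
If you want to script the wait for start-up, one option is to scan the log output for the marker quoted above. The following is a minimal sketch; the pattern and function name are assumptions based only on the single log line shown in this README.

```python
import re

# Matches the start-up marker quoted in this README; the numeric prefix and
# the thread name at the beginning of the line are expected to vary.
READY_PATTERN = re.compile(r"TestContentInitializerJob\s+-\s+INITIALIZED\s*$")


def is_ready(log_lines):
    """Return True as soon as one line carries the INITIALIZED marker."""
    return any(READY_PATTERN.search(line) for line in log_lines)
```

`is_ready` accepts any iterable of lines, for example an open log file or the captured output of the Maven process.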

# Deployment on a Tomcat instance

In this distribution you will find a ready-to-deploy war package.

Copy the war file into the Tomcat 7 <code>webapps</code> directory, make sure you have overridden the properties as explained in the D-Net configuration section, and restart Tomcat.

When you see a log line like:
```
52665 [Thread-7] INFO  eu.dnetlib.enabling.is.store.TestContentInitializerJob  - INITIALIZED
```

the webapp should be ready and running at http://${container.hostname}:${container.port}/${container.context}
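
To see which URL the placeholders resolve to for a given set of overrides, the substitution can be sketched as follows. The defaults are the ones listed in the D-Net configuration section; the parser is a deliberate simplification and the function names are hypothetical.

```python
# Defaults as documented in the D-Net configuration section of this README.
DEFAULTS = {
    "container.hostname": "localhost",
    "container.port": "8280",
    "container.context": "app",
}


def parse_properties(text):
    """Parse simple 'key = value' lines; comments and blank lines are skipped.

    Simplification: multi-line values (trailing backslash) are not handled.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def base_url(overrides):
    """Build http://hostname:port/context from overrides plus defaults."""
    cfg = {**DEFAULTS, **overrides}
    return "http://{}:{}/{}".format(
        cfg["container.hostname"], cfg["container.port"], cfg["container.context"]
    )
```

With no overrides, `base_url({})` yields `http://localhost:8280/app`, which matches the URL used when running with the Maven Tomcat plugin.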

If you want to build the web app yourself, keep reading.

## Building the D-Net web app
The D-Net Software is developed in Java with Maven.

To build the war file for a Tomcat 7 web app container:

```
> cd dnet-minimal-container

> mvn package
```

The .war file is then created in the <code>target</code> directory.

# D-Net configuration
Before you start the web application, you need to configure at least the following properties.
For the full list of available properties and their values, check PROPERTIES.md.

Create a file named <code>cnr.override.properties</code> in <code>$yourTomcatHomeDirectory$/common/classes</code> (<code>$yourTomcatHomeDirectory$</code> will likely be something similar to <code>/var/lib/tomcat7</code>), then set these properties in it:

- <code>container.hostname</code>: the host name where the web app will be running. Default value is <code>localhost</code>. The default value should *only* be used in local development scenarios.
<br/>Example: <code>container.hostname = dnet-host.dnet.eu</code>
- <code>container.port</code>: the port where the web app will be running. Default is 8280.
<br/>Example: <code>container.port = 8080</code>
- <code>container.context</code>: the name of the web app (i.e. the name of the war file). Default is "app". The default value should *only* be used in local development scenarios.
<br/>Example: <code>container.context = is</code>
- <code>dnet.data.path</code>: path to the directory where all D-Net related resources will be saved. An embedded existDB will be automatically installed in this directory during the first start-up. The directory must be writable by the user running Tomcat. Default value is <code>/tmp/dnet</code>. The default value should *only* be used in local development scenarios.
<br/>Example: <code>dnet.data.path = /var/lib/dnet</code>
- <code>services.aggregator.country</code>: your country code. Default is <code>EU</code> (Europe).
<br/>Example: <code>services.aggregator.country = IT</code>
- <code>services.aggregator.name</code>: the name of your aggregator. Default is "D-NET".
<br/>Example: <code>services.aggregator.name = TEST_Aggregator</code>
- <code>services.mdstore.mongodb.host</code>: the machine hosting MongoDB for the storage of metadata records (M[eta]D[ata]Store). Default is localhost.
<br/>Example: <code>services.mdstore.mongodb.host = mongo.dnet.eu</code>
- <code>services.mdstore.mongodb.db</code>: name of the MongoDB database to be used for the storage of metadata records. Default is "mdstore_minimal".
<br/>Example: <code>services.mdstore.mongodb.db = mdstore_1</code>
- <code>dnet.logger.mongo.host</code>: the machine hosting MongoDB for the storage of workflow logs. Default is localhost.
<br/>Example: <code>dnet.logger.mongo.host = mongo.dnet.eu</code>
- <code>dnet.logger.mongo.db</code>: name of the MongoDB database to be used for the storage of workflow logs. Default is "dnet_logs_minimal".
<br/>Example: <code>dnet.logger.mongo.db = dnet_logs_1</code>
- <code>services.oai.publisher.repo.name</code>: name of the OAI-PMH Publisher, as it will appear in the OAI Identify response. Default is "D-Net OAI-PMH Publisher".
<br/>Example: <code>services.oai.publisher.repo.name = TEST_Aggregator OAI-PMH Publisher</code>
- <code>services.oai.publisher.repo.email</code>: email of the OAI-PMH Publisher administrator, as it will appear in the OAI Identify response. Default is "dnet-admin@mock.it". The default *must not* be used in beta or production systems because it is a mock address.
<br/>Example: <code>services.oai.publisher.repo.email = name.surname@valid.mail.com</code>
- <code>dnet.admin.password</code>: md5sum of the password that will allow the user "admin" to log in to the D-Net Admin UI. To generate the new password: <code>echo thePassword -n | md5</code>. Default is "dnet-minimal" (without double quotes). The default value *should always be overridden*.
<br/>Example: <code>dnet.admin.password = 5d1ed3888708c0f4cd46b29306a6b449</code>, where 5d1ed3888708c0f4cd46b29306a6b449 is the md5 for the string "pwd" obtained via the command <code>echo pwd -n | md5</code>.
- <code>service.solr.index.jsonConfiguration</code>: information about the Solr instance to be used to create full-text indices on the aggregated metadata records. The default value assumes a local Solr instance. Specifically:
<code>
{"id":"solr",\
	"address":"localhost:9983",\
	"port":"8983",\
	"webContext":"solr",\
	"numShards":"1",\
	"replicationFactor":"1",\
	"host":"localhost",\
	"feedingShutdownTolerance":"30000",\
	"feedingBufferFlushThreshold":"1000",\
	"feedingSimulationMode":"false",\
	"luceneMatchVersion":"4.9",\
	"serverLibPath":"../../../../contrib/extraction/lib",\
	"filterCacheSize":"512","filterCacheInitialSize":"512",\
	"queryCacheSize":"512","queryCacheInitialSize":"512",\
	"documentCacheSize":"512","documentCacheInitialSize":"512",\
	"ramBufferSizeMB":"960","mergeFactor":"40",\
	"autosoftcommit":"-1","autocommit":"15000",\
	"termIndexInterval":"1024","maxIndexingThreads":"8",\
	"queryResultWindowSize":"20","queryResultMaxDocCached":"200"}
</code>

If you are not running the Solr service on the same machine as Tomcat, you need to override the above configuration to match your Solr server installation.
Typically, changing <code>address</code> and <code>host</code> is enough if your Solr server is not configured for sharding and replication.
For more details, refer to the Solr documentation.
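
The override file can also be generated by a script. The sketch below is an illustration under assumptions, not part of D-Net: all function names are hypothetical, and the host names in the usage example are made up. It computes the admin password hash from the password bytes alone, with no trailing newline. Note that in bash the built-in `echo` treats `-n` as an option only when it precedes the text, so it is worth checking exactly which bytes your shell command hashes.

```python
import hashlib
import json


def admin_password_hash(password):
    """MD5 hex digest of the password bytes, with no trailing newline."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def solr_override(default_config, host, address):
    """Copy the default Solr JSON config, changing only host and address."""
    config = dict(default_config)
    config["host"] = host
    config["address"] = address
    return config


def render_properties(props):
    """Render 'key = value' lines; dict values are written as one-line JSON."""
    lines = []
    for key, value in props.items():
        if isinstance(value, dict):
            value = json.dumps(value, separators=(",", ":"))
        lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Abbreviated for brevity: a real override should carry every key of the
    # default configuration listed in this README.
    default_solr = {"id": "solr", "address": "localhost:9983",
                    "port": "8983", "host": "localhost"}
    props = {
        "container.hostname": "dnet-host.dnet.eu",
        "dnet.admin.password": admin_password_hash("choose-a-real-password"),
        # Hypothetical Solr host, used only for illustration.
        "service.solr.index.jsonConfiguration": solr_override(
            default_solr, "solr.dnet.eu", "solr.dnet.eu:9983"),
    }
    print(render_properties(props), end="")
```

Writing the JSON on a single line avoids the backslash continuations used in the default value; both forms describe the same property value.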

# Using D-Net

Under the root folder of the project you can find the folder <code>mock-repository-content</code>.
It contains 150 oai_dc metadata records you can use to test the functionality of the D-Net software with a Mock Datasource.

* Place the folder in a location that is readable by Tomcat
* Start the container
* Access the Admin UI (http://${container.hostname}:${container.port}/${container.context}/mvc/ui/index.do)
	* If you are running via the Maven Tomcat plugin with the default properties, the URL is: http://localhost:8280/app/mvc/ui/index.do
* Go to Datasource Management --> Overview and search for "mock"
* Click on "Add metaworkflow" and select the "Collection and Transformation" meta-workflow. This action associates a meta-workflow (i.e., a workflow of workflows) with the datasource and creates all the needed metadata stores.
* Click on the "access params" button on the top right and change the base url to the location where you saved the sample folder (e.g. file:///dnet/test/mock-repository-content)
* Click on the meta-workflow "Collection and Transformation" and configure its workflows with the missing parameter for the transformation rule
	* Click on the yellow "parameters" button of the transformation workflow and select the rule <code>dc2dmf_DRIVER</code>
* Ensure the launch mode is set to "Auto" for each workflow
* Click on the Launch button of the first workflow ("collect")
* Wait for all the workflows to complete: collect, transform, index, oai, and oaiPostFeed
* Verify that the records have been transformed and indexed: click on MD Inspectors --> D-Net content checker and perform some queries
* Verify that the aggregated records are correctly exposed via the built-in OAI-PMH publisher at:
	* http://${container.hostname}:${container.port}/${container.context}/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=dmf for the DMF metadata format
	* http://${container.hostname}:${container.port}/${container.context}/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=oai_dc for the OAI_DC metadata format
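
The last verification step can also be scripted. The following is a minimal sketch, assuming the publisher returns standard OAI-PMH 2.0 responses; the function names are hypothetical and only the first response page is inspected.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(hostname, port, context, metadata_prefix):
    """Build the ListRecords URL of the built-in OAI-PMH publisher."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"http://{hostname}:{port}/{context}/mvc/oai/oai.do?{query}"


def summarize(response_xml):
    """Return (number of records, resumption token or None) for one page."""
    root = ET.fromstring(response_xml)
    records = root.findall(f"{OAI_NS}ListRecords/{OAI_NS}record")
    token = root.findtext(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    if token is not None:
        # An empty resumptionToken marks the last page of a list.
        token = token.strip() or None
    return len(records), token
```

To use it against a running instance, fetch the URL (for example with `urllib.request.urlopen(url).read()`) and pass the response body to `summarize`. A non-empty token means more pages are available.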

# Need support?
Do not hesitate to contact dnet-team@isti.cnr.it