# D-Net Software Toolkit

    
This is a minimal instance of the D-Net software toolkit, a software framework for building aggregative data infrastructures.

Official web site: http://www.d-net.research-infrastructures.eu/

Need support? Contact us via email at: dnet-team@isti.cnr.it

This webapp contains the minimal set of services needed to provide:

- Collection of metadata records in oai_dc format via OAI-PMH, FTP, the local file system, or HTTP.

- Transformation of the collected metadata records into an internal format named DMF (Driver Metadata Format).

- Indexing of DMF records in a Solr full-text index.

- OAI-PMH export of aggregated metadata records in DMF and oai_dc formats. More formats can be added at runtime by providing a dedicated XSLT from DMF to the desired target format.

    
# Installation requirements
This minimal instance can run on a single machine as a web application deployed in a Tomcat container.
## Hardware requirements

Suggested minimal hardware requirements:

- Operating system: almost anything but Windows
- Hard disk space: depends mostly on the number and size of the records you are going to collect. A couple of GB should be fine for a small repository (<10K metadata records). See the note on MongoDB disk usage under Software requirements.

## Software requirements
Software required:

* Apache Tomcat 7: the webapp container
* MongoDB >= 2.4: used to store the collected and transformed metadata records. Each collected record is stored in three separate "versions" (original, transformed, and pmh-ready), so enough disk space must be available for MongoDB.
* Solr 4.9.x or 4.10.x: used to make the documents searchable. The Solr server should be started with the option '-DzkRun', which instructs Solr to start its embedded ZooKeeper server.

Note that Tomcat, Solr and MongoDB can be installed on the same machine or on dedicated nodes, although the latter requires changing some default system properties.
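As a rough illustration of the disk space note, the sketch below multiplies the number of records by an average record size and by the three stored versions. It assumes that all three versions have a similar size and ignores MongoDB's own storage overhead and indexes, so treat the result as a lower bound; the function name and the 4 KB figure in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# estimate_mdstore_bytes RECORDS AVG_BYTES
# Prints a rough lower bound (in bytes) for the space MongoDB needs:
# records x average record size x 3 versions (original, transformed, pmh-ready).
# Assumption: the three versions are about the same size.
estimate_mdstore_bytes() {
  local records="$1" avg_bytes="$2"
  echo $(( records * avg_bytes * 3 ))
}
```

For example, `estimate_mdstore_bytes 10000 4096` (10K records of about 4 KB each) prints `122880000`, that is roughly 123 MB before any database overhead.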

    
# Running the D-Net web app with Maven
## Maven settings

Whether you want to run the D-Net web app with the Tomcat7 plugin for Maven or build the .war file to deploy on a running Tomcat,
you need Maven 3 and you must add the following repository to your <code>settings.xml</code> (inside the <code>&lt;repositories&gt;</code> element of an active profile):

```
<repository>
    <id>dnet-bootstrap-releases</id>
    <name>D-Net Bootstrap Releases</name>
    <url>http://maven.research-infrastructures.eu/nexus/content/repositories/dnet4-bootstrap-release/</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
    <layout>default</layout>
</repository>
```

We also suggest adding the Tomcat plugin to the plugin groups at the bottom of the same file:

```
<pluginGroups>
    <pluginGroup>org.apache.tomcat.maven</pluginGroup>
</pluginGroups>
```

    
## Testing on local machine:
The D-Net software is developed in Java using Maven. You can try out the D-Net web app on your local machine with the tomcat7 plugin, provided that MongoDB and Solr servers are also running on localhost and listening on their respective standard ports.

Please note that the Solr client used in D-Net needs to interact with the ZooKeeper server. For simplicity we suggest using the embedded ZooKeeper instance provided with the Solr distribution. By default Solr listens on port 8983 and its embedded ZooKeeper server on port 9983.
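Before starting the web app it can help to confirm that the three services are reachable. The following is a minimal sketch, not part of the D-Net distribution: the function names are hypothetical, MongoDB is assumed to listen on its default port 27017, and the check relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command (if either is missing, every port is reported as not reachable).

```shell
#!/usr/bin/env bash
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 2 seconds.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# check_dnet_services
# Prints one status line per service expected on localhost.
check_dnet_services() {
  local entry name port
  for entry in mongodb:27017 solr:8983 zookeeper:9983; do
    name="${entry%%:*}"   # text before the colon
    port="${entry##*:}"   # text after the colon
    if port_open localhost "$port"; then
      echo "$name: listening on port $port"
    else
      echo "$name: NOT reachable on port $port"
    fi
  done
}
```

Run `check_dnet_services` after starting MongoDB and Solr; all three lines should report "listening" before you launch the web app.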

    
To override properties, you can modify <code>dnet-basic-aggregator/src/main/resources/eu/dnetlib/cnr-site.properties</code>. Please check the section "D-Net configuration" and the PROPERTIES.md file for more information about D-Net properties.

```
> cd dnet-basic-aggregator
> mvn tomcat7:run
```

    
When you see a log like:
```
52665 [Thread-7] INFO  eu.dnetlib.enabling.is.store.TestContentInitializerJob  - INITIALIZED
```

the webapp should be ready and running at http://localhost:8280/app, where 'app' is the value of the property <code>container.context</code> ('app' is the default).
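If you want a script to wait for that log line instead of watching the console, a sketch along these lines may help. It assumes the output has been captured to a file (for example with `mvn tomcat7:run | tee dnet.log`, or by pointing it at your Tomcat log); the function name, the file name, and the default timeout are hypothetical.

```shell
#!/usr/bin/env bash
# wait_for_dnet LOGFILE [TIMEOUT_SECONDS]
# Polls LOGFILE once per second until the initialization message appears.
# Returns 0 when the message is found, 1 when the timeout (default 300) expires.
wait_for_dnet() {
  local log="$1" timeout="${2:-300}" waited=0
  until grep -q 'TestContentInitializerJob .*INITIALIZED' "$log" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A typical call would be `wait_for_dnet dnet.log 600 && echo "D-Net is up"`.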

    
# Deployment on a Tomcat instance

In this distribution you will find a ready-to-deploy war package.

Copy the war file into the Tomcat 7 <code>webapps</code> directory, make sure you have overridden the properties as explained in the D-Net configuration section, and restart Tomcat.

When you see a log like:
```
52665 [Thread-7] INFO  eu.dnetlib.enabling.is.store.TestContentInitializerJob  - INITIALIZED
```

the webapp should be ready and running at

```
http://${container.hostname}:${container.port}/${container.context}
```
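To see how the three properties combine into that URL, the sketch below reads them from a properties file and falls back to the defaults documented in the D-Net configuration section (`localhost`, `8280`, `app`). The helper names are hypothetical, and the parsing only handles simple `key = value` lines, not the full Java properties syntax.

```shell
#!/usr/bin/env bash
# get_prop FILE KEY DEFAULT
# Prints the value of KEY from a .properties file, or DEFAULT when the file
# or the key is missing. Only plain "key = value" lines are understood.
get_prop() {
  local value="" key
  key=$(printf '%s' "$2" | sed 's/\./\\./g')   # match dots in the key literally
  if [ -f "$1" ]; then
    value=$(sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$1" \
      | sed 's/[[:space:]]*$//' | tail -n 1)
  fi
  echo "${value:-$3}"
}

# dnet_base_url FILE
# Prints http://<container.hostname>:<container.port>/<container.context>.
dnet_base_url() {
  local file="$1"
  echo "http://$(get_prop "$file" container.hostname localhost):$(get_prop "$file" container.port 8280)/$(get_prop "$file" container.context app)"
}
```

For instance, `dnet_base_url /var/lib/tomcat7/common/classes/cnr.override.properties` prints the address to open in the browser, where the path shown is only the likely location mentioned in the configuration section.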

    
If you want to build the web app yourself, keep reading.

## Building the D-Net web app
The D-Net software is developed in Java with Maven.

To build the war to deploy in a Tomcat 7 web app container:

```
 > cd dnet-basic-aggregator
 > mvn package
```

The <code>.war</code> file is then created in the <code>target</code> directory.

    
# D-Net configuration
Before you start the web application, you need to configure at least the following properties.
For the full list of available properties and their values, check PROPERTIES.md.

Create a file named <code>cnr.override.properties</code> in <code>$yourTomcatHomeDirectory$/common/classes</code> (<code>$yourTomcatHomeDirectory$</code> will likely be something similar to <code>/var/lib/tomcat7</code>) and set the properties there:

    
- <code>container.hostname</code>: the host name where the web app will be running. Default value is <code>localhost</code>. The default value should *only* be used in local development scenarios.
<br/>Example: <code>container.hostname = dnet-host.dnet.eu</code>
- <code>container.port</code>: the port where the web app will be running. Default is 8280.
<br/>Example: <code>container.port = 8080</code>
- <code>container.context</code>: the name of the web app (i.e. the name of the war file). Default is "app". The default value should *only* be used in local development scenarios.
<br/>Example: <code>container.context = is</code>
- <code>dnet.data.path</code>: path to the directory where all D-Net related resources will be saved. An embedded existDB will be automatically installed in this directory during the first start-up. The directory must be writable by the user running Tomcat. Default value is <code>/tmp/dnet</code>. The default value should *only* be used in local development scenarios.
<br/>Example: <code>dnet.data.path = /var/lib/dnet</code>
- <code>services.aggregator.country</code>: your country code. Default is <code>EU</code> (Europe).
<br/>Example: <code>services.aggregator.country = IT</code>
- <code>services.aggregator.name</code>: the name of your aggregator. Default is "D-NET".
<br/>Example: <code>services.aggregator.name = TEST_Aggregator</code>
- <code>services.mdstore.mongodb.host</code>: the machine hosting MongoDB for the storage of metadata records (M[eta]D[ata]Store). Default is <code>localhost</code>.
<br/>Example: <code>services.mdstore.mongodb.host = mongodb.dnet.eu</code>
- <code>services.mdstore.mongodb.db</code>: name of the MongoDB database to be used for the storage of metadata records. Default is <code>mdstore_minimal</code>.
<br/>Example: <code>services.mdstore.mongodb.db = mdstore_1</code>
- <code>dnet.logger.mongo.host</code>: the machine hosting MongoDB for the storage of workflow logs. Default is <code>localhost</code>.
<br/>Example: <code>dnet.logger.mongo.host = mongo.dnet.eu</code>
- <code>dnet.logger.mongo.db</code>: name of the MongoDB database to be used for the storage of workflow logs. Default is "dnet_logs_minimal".
<br/>Example: <code>dnet.logger.mongo.db = dnet_logs_1</code>
- <code>services.oai.publisher.repo.name</code>: name of the OAI-PMH Publisher, as it will appear in the OAI Identify response. Default is "D-Net OAI-PMH Publisher".
<br/>Example: <code>services.oai.publisher.repo.name = TEST_Aggregator OAI-PMH Publisher</code>
- <code>services.oai.publisher.repo.email</code>: email of the OAI-PMH Publisher administrator, as it will appear in the OAI Identify response. Default is "dnet-admin@mock.it". The default *must not* be used in beta or production systems, because it is a mock address.
<br/>Example: <code>services.oai.publisher.repo.email = name.surname@valid.mail.com</code>
- <code>dnet.admin.password</code>: MD5 hash of the password that allows the user "admin" to log in to the D-Net Admin UI. To generate the hash of a new password: <code>echo -n "thePassword" | md5sum</code>. Default is "dnet-minimal" (without double quotes). The default value *should always be overridden*.
<br/>Example: <code>dnet.admin.password = 9003d1df22eb4d3820015070385194c8</code>, where 9003d1df22eb4d3820015070385194c8 is the MD5 hash of the string "pwd" obtained via the command <code>echo -n "pwd" | md5sum</code>.
- <code>service.solr.index.jsonConfiguration</code>: information about the Solr instance to be used to create full-text indices on the aggregated metadata records. The default value assumes a local Solr instance. Specifically:
<code>
{"id":"solr", "address":"localhost:9983", "port":"8983", "webContext":"solr", "numShards":"1", "replicationFactor":"1", "host":"localhost", "feedingShutdownTolerance":"30000", "feedingBufferFlushThreshold":"1000", "feedingSimulationMode":"false", "luceneMatchVersion":"4.9", "serverLibPath":"../../../../contrib/extraction/lib", "filterCacheSize":"512", "filterCacheInitialSize":"512", "queryCacheSize":"512", "queryCacheInitialSize":"512", "documentCacheSize":"512", "documentCacheInitialSize":"512", "ramBufferSizeMB":"960", "mergeFactor":"40", "autosoftcommit":"-1", "autocommit":"15000", "termIndexInterval":"1024", "maxIndexingThreads":"8", "queryResultWindowSize":"20", "queryResultMaxDocCached":"200"}
</code>

    
If you are not running the Solr service on the same machine where Tomcat runs, you need to override the above configuration according to your Solr server installation.
Typically, changing <code>address</code> and <code>host</code> is enough if your Solr server is not configured for sharding and replication.
For more details refer to the Solr documentation.
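The following sketch shows one way to generate a small <code>cnr.override.properties</code> with a hashed admin password. It is only an illustration: the function names are hypothetical, the property values are the example values used in this section, and `md5sum` is assumed to be available (GNU coreutils; other systems may ship a differently named tool). `printf '%s'` is used instead of `echo -n` so that no trailing newline ends up in the hash regardless of the shell.

```shell
#!/usr/bin/env bash
# md5_hex TEXT
# Prints the MD5 hex digest of TEXT, without hashing a trailing newline.
md5_hex() {
  printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

# write_dnet_overrides DIR HOSTNAME ADMIN_PASSWORD
# Writes DIR/cnr.override.properties with a few overridden properties.
# The port, context, and data path below are the example values from this
# section; adjust them to your installation.
write_dnet_overrides() {
  local dir="$1" host="$2" password="$3"
  mkdir -p "$dir"
  cat > "$dir/cnr.override.properties" <<EOF
container.hostname = $host
container.port = 8080
container.context = is
dnet.data.path = /var/lib/dnet
dnet.admin.password = $(md5_hex "$password")
EOF
}
```

A call such as `write_dnet_overrides /var/lib/tomcat7/common/classes dnet-host.dnet.eu 'thePassword'` would create the file in the directory suggested above; add the remaining properties (MongoDB hosts, OAI publisher details, Solr configuration) by hand as needed.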

    
# Using D-Net

Under the root folder of the project you can find the folder `mock-repository-content`.
It contains 150 `oai_dc` metadata records you can use to test the functionality of the D-Net software with a mock datasource.

    
* Place the folder in a location that is readable by Tomcat
* Start the container
* Access the Admin UI (`http://${container.hostname}:${container.port}/${container.context}/mvc/ui/index.do`)
  * If you are running via the Maven Tomcat plugin with the default properties, the URL is: `http://localhost:8280/app/mvc/ui/index.do`
* Go to Datasource Management --> Overview and search for "mock"
* Click on "Add metaworkflow" and select the "Collection and Transformation" meta-workflow. This action will associate a meta-workflow (i.e., a workflow of workflows) with the datasource and will create all the needed metadata stores.
* Click on the "access params" button at the top right and change the base URL to the location where you saved the sample folder (e.g. `file:///dnet/test/mock-repository-content`)
* Click on the meta-workflow "Collection and Transformation" and configure its workflows with the missing parameter for the transformation rule
  * Click on the yellow "parameters" button of the transformation workflow and select the rule `dc2dmf_DRIVER`
* Ensure the launch mode is set to "Auto" for each workflow
* Click on the Launch button of the first workflow ("collect")
* Wait for all the workflows to complete: collect, transform, index, oai, and oaiPostFeed
* Verify that the records have been transformed and indexed: click on MD Inspectors --> D-Net content checker and perform some queries
* Verify that the aggregated records are correctly exposed via the built-in OAI-PMH publisher at:
  * `http://${container.hostname}:${container.port}/${container.context}/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=dmf` for the DMF metadata format
  * `http://${container.hostname}:${container.port}/${container.context}/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=oai_dc` for the OAI_DC metadata format
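The last verification step can also be done from the command line. The sketch below builds the two `ListRecords` URLs listed above and fetches them with `curl`; a response is treated as successful when the request succeeds and the body contains no OAI-PMH `<error>` element. The function names are hypothetical and `curl` is assumed to be installed.

```shell
#!/usr/bin/env bash
# oai_list_records_url BASE_URL PREFIX
# Prints the ListRecords URL for the given metadata prefix.
oai_list_records_url() {
  echo "$1/mvc/oai/oai.do?verb=ListRecords&metadataPrefix=$2"
}

# oai_response_ok
# Reads an OAI-PMH response on stdin; fails if it contains an <error> element
# (with or without a namespace prefix).
oai_response_ok() {
  ! grep -qE '<([A-Za-z0-9]+:)?error[ >]'
}

# check_oai BASE_URL
# Fetches ListRecords for both formats and prints one status line per format.
check_oai() {
  local base="$1" prefix url body
  for prefix in dmf oai_dc; do
    url=$(oai_list_records_url "$base" "$prefix")
    if body=$(curl -fsS "$url") && printf '%s' "$body" | oai_response_ok; then
      echo "$prefix: OK"
    else
      echo "$prefix: FAILED ($url)"
    fi
  done
}
```

With the default properties, `check_oai http://localhost:8280/app` should print `OK` for both formats once the workflows have completed.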
	
# Need support?
Do not hesitate to contact dnet-team@isti.cnr.it