------------------------------------------------------------------------
r35409 | marek.horst | 2015-03-17 15:04:06 +0100 (Tue, 17 Mar 2015) | 1 line

#1198 aligning IIS dependencies and java code to CDH5.3.0 cluster
------------------------------------------------------------------------
r35395 | marek.horst | 2015-03-17 15:01:04 +0100 (Tue, 17 Mar 2015) | 1 line

#1197 introducing job.properties changes aligning paths to rumcajs cluster HDFS structure
------------------------------------------------------------------------
r35250 | marek.horst | 2015-03-11 16:48:11 +0100 (Wed, 11 Mar 2015) | 1 line

creating IIS-CDH-5.3.0 branch
------------------------------------------------------------------------
r34616 | marek.horst | 2015-02-19 18:12:12 +0100 (Thu, 19 Feb 2015) | 1 line

#1038 introducing ranges in dependencies definition for all IIS modules
------------------------------------------------------------------------
r33612 | marek.horst | 2014-12-16 20:53:30 +0100 (Tue, 16 Dec 2014) | 1 line

[maven-release-plugin] prepare for next development iteration
------------------------------------------------------------------------
r33610 | marek.horst | 2014-12-16 20:53:26 +0100 (Tue, 16 Dec 2014) | 1 line

[maven-release-plugin] prepare release icm-iis-documentssimilarity-1.0.0
------------------------------------------------------------------------
r33605 | marek.horst | 2014-12-16 20:15:05 +0100 (Tue, 16 Dec 2014) | 1 line

#1044 pre-release switching to released version of parent pom and released dependencies
------------------------------------------------------------------------
r33498 | marek.horst | 2014-12-15 19:01:20 +0100 (Mon, 15 Dec 2014) | 1 line

#1044 moving coansys placeholder definition to documentssimilarity and citationmatching modules to eliminate necessity of releasing parentcontainer module every time coansys version changes.
------------------------------------------------------------------------
r33411 | marek.horst | 2014-12-15 12:42:38 +0100 (Mon, 15 Dec 2014) | 1 line

introducing scm definition
------------------------------------------------------------------------
r33225 | marek.horst | 2014-12-08 14:48:27 +0100 (Mon, 08 Dec 2014) | 1 line

#1026 setting threshold_num_of_vector_elems_length to 2 which proves to be solution for mentioned problem
------------------------------------------------------------------------
r33183 | marek.horst | 2014-12-04 16:06:29 +0100 (Thu, 04 Dec 2014) | 1 line

#1026 introducing threshold_num_of_vector_elems_length parameter support which eliminates all documents with terms vector shorter than specified threshold
------------------------------------------------------------------------
r32253 | marek.horst | 2014-11-05 18:42:11 +0100 (Wed, 05 Nov 2014) | 1 line

introducing ${iis.coansys.version} placeholder for coansys version, upgrading value to 1.7-SNAPSHOT after today's coansys release
------------------------------------------------------------------------
r32239 | marek.horst | 2014-11-05 17:27:42 +0100 (Wed, 05 Nov 2014) | 1 line

introducing embedded integration test entry
------------------------------------------------------------------------
r31036 | marek.horst | 2014-10-02 14:29:51 +0200 (Thu, 02 Oct 2014) | 1 line

introducing cloudera repository in parent container, removing repository definitions from individual IIS modules
------------------------------------------------------------------------
r30100 | marek.horst | 2014-09-10 16:34:16 +0200 (Wed, 10 Sep 2014) | 1 line

#768 fix: introducing missing mainDirectory parameter set to ${wf:appPath()}/coansys
------------------------------------------------------------------------
r30049 | marek.horst | 2014-09-08 11:39:14 +0200 (Mon, 08 Sep 2014) | 1 line

updating job.properties
------------------------------------------------------------------------
r28765 | marek.horst | 2014-07-01 17:02:31 +0200 (Tue, 01 Jul 2014) | 1 line

introducing deploy.info file for module icm-iis-documentssimilarity
------------------------------------------------------------------------
r28742 | marek.horst | 2014-07-01 14:35:30 +0200 (Tue, 01 Jul 2014) | 1 line

moving icm-iis-* modules from dnet11 to dnet40
------------------------------------------------------------------------
r27993 | marek.horst | 2014-06-05 14:01:47 +0200 (Thu, 05 Jun 2014) | 8 lines

updating default similarity properties to:
sample=1
tfidfTopnTermPerDocument=20
removal_least_used=20
removal_rate=0.99
similarityTopnDocumentPerDocument=20
mapredChildJavaOpts=-Xmx20g
parallel=20
------------------------------------------------------------------------
r27911 | marek.horst | 2014-06-03 10:33:21 +0200 (Tue, 03 Jun 2014) | 1 line

updating default job.properties
------------------------------------------------------------------------
r27910 | marek.horst | 2014-06-03 10:31:57 +0200 (Tue, 03 Jun 2014) | 1 line

setting remove_sideproducts=true by default
------------------------------------------------------------------------
r27908 | marek.horst | 2014-06-03 10:04:34 +0200 (Tue, 03 Jun 2014) | 1 line

setting serialize_to_proto default value
------------------------------------------------------------------------
r27906 | marek.horst | 2014-06-03 09:54:20 +0200 (Tue, 03 Jun 2014) | 1 line

updating default workflow.xml properties
------------------------------------------------------------------------
r27550 | marek.horst | 2014-05-16 14:32:19 +0200 (Fri, 16 May 2014) | 1 line

introducing most recent version of document similarity workflow with updated set of parameters
------------------------------------------------------------------------
r27412 | marek.horst | 2014-05-14 10:09:29 +0200 (Wed, 14 May 2014) | 1 line

updating default job.properties
------------------------------------------------------------------------
r27258 | marek.horst | 2014-05-09 11:28:44 +0200 (Fri, 09 May 2014) | 1 line

updating converter input path after upgrading doc-sim version
------------------------------------------------------------------------
r27256 | marek.horst | 2014-05-08 22:55:53 +0200 (Thu, 08 May 2014) | 1 line

switching to the latest version of coansys document similarity module
------------------------------------------------------------------------
r26568 | marek.horst | 2014-04-11 19:25:32 +0200 (Fri, 11 Apr 2014) | 1 line

#332 workflow definitions cleanup. 2.4) prefixing documentssimilarity input/output port names
------------------------------------------------------------------------
r26518 | marek.horst | 2014-04-11 01:13:07 +0200 (Fri, 11 Apr 2014) | 1 line

#352 replacing fixed version value 1.7.4 with iis.avro.version placeholder defined in parent pom
------------------------------------------------------------------------
r26489 | marek.horst | 2014-04-10 19:20:42 +0200 (Thu, 10 Apr 2014) | 1 line

#349 make all IIS modules dnet-spring4 compliant: updating all pom.xml definitions with proper parent and updated dnet-spring4 SNAPSHOT dependencies. Updating java code by replacing IMDStoreService API with newly introduced MDStoreService API
------------------------------------------------------------------------
r26475 | marek.horst | 2014-04-10 18:51:41 +0200 (Thu, 10 Apr 2014) | 1 line

updating job properties
------------------------------------------------------------------------
r26415 | marek.horst | 2014-04-08 11:28:38 +0200 (Tue, 08 Apr 2014) | 1 line

updating ds_parallel to 30 to match openaire cluster configuration
------------------------------------------------------------------------
r26160 | marek.horst | 2014-03-27 18:09:46 +0100 (Thu, 27 Mar 2014) | 1 line

updating default document similarity parameters
------------------------------------------------------------------------
r25986 | marek.horst | 2014-03-18 11:54:44 +0100 (Tue, 18 Mar 2014) | 1 line

parameterizing ds_mapredChildJavaOpts and ds_sample
------------------------------------------------------------------------
r24606 | marek.horst | 2014-02-03 17:41:38 +0100 (Mon, 03 Feb 2014) | 1 line

renaming pig_parallel parameter to ds_parallel
------------------------------------------------------------------------
r24603 | marek.horst | 2014-02-03 17:34:14 +0100 (Mon, 03 Feb 2014) | 1 line

updating default similarity values
------------------------------------------------------------------------
r24599 | marek.horst | 2014-02-03 17:22:22 +0100 (Mon, 03 Feb 2014) | 1 line

setting pig_parallel=40
------------------------------------------------------------------------
r24578 | marek.horst | 2014-02-03 13:52:25 +0100 (Mon, 03 Feb 2014) | 1 line

upgrading coansys similarity module from document-similarity-workflow to document-similarity-ranked-workflow
------------------------------------------------------------------------
r23998 | marek.horst | 2014-01-10 15:55:18 +0100 (Fri, 10 Jan 2014) | 1 line

changing default ds_tfidfMinValue from 0.4 to 0.6 to limit results
------------------------------------------------------------------------
r23997 | marek.horst | 2014-01-10 15:54:47 +0100 (Fri, 10 Jan 2014) | 1 line

updating default job properties
------------------------------------------------------------------------
r23961 | marek.horst | 2014-01-08 17:25:22 +0100 (Wed, 08 Jan 2014) | 2 lines

handling similarityTopnDocumentPerDocument and tfidfTopnTermPerDocument doc-sim parameters provided at runtime

------------------------------------------------------------------------
r23898 | marek.horst | 2014-01-02 14:35:10 +0100 (Thu, 02 Jan 2014) | 1 line

parameterizing ds_tfidfMinValue
------------------------------------------------------------------------
r23447 | marek.horst | 2013-12-16 19:22:41 +0100 (Mon, 16 Dec 2013) | 1 line

updating default datastores in job properties
------------------------------------------------------------------------
r22901 | mateusz.fedoryszak | 2013-12-09 16:30:19 +0100 (Mon, 09 Dec 2013) | 1 line

properties
------------------------------------------------------------------------
r22899 | mateusz.fedoryszak | 2013-12-09 16:28:39 +0100 (Mon, 09 Dec 2013) | 1 line

new CoAnSys version
------------------------------------------------------------------------
r22794 | mateusz.fedoryszak | 2013-12-06 12:21:29 +0100 (Fri, 06 Dec 2013) | 1 line

renaming io parameters
------------------------------------------------------------------------
r22632 | mateusz.fedoryszak | 2013-11-29 18:24:59 +0100 (Fri, 29 Nov 2013) | 1 line

new format of input data
------------------------------------------------------------------------
r22570 | mateusz.fedoryszak | 2013-11-29 14:41:45 +0100 (Fri, 29 Nov 2013) | 1 line

MiniOozie support
------------------------------------------------------------------------
r22568 | mateusz.fedoryszak | 2013-11-29 14:39:55 +0100 (Fri, 29 Nov 2013) | 1 line

Pig parallel param
------------------------------------------------------------------------
r22556 | mateusz.fedoryszak | 2013-11-29 11:51:26 +0100 (Fri, 29 Nov 2013) | 1 line

fixes
------------------------------------------------------------------------
r22555 | mateusz.fedoryszak | 2013-11-29 11:50:59 +0100 (Fri, 29 Nov 2013) | 1 line

removing unnecessary lines
------------------------------------------------------------------------
r22445 | mateusz.fedoryszak | 2013-11-26 14:57:12 +0100 (Tue, 26 Nov 2013) | 1 line

Moving generic converter to common
------------------------------------------------------------------------
r22420 | mateusz.fedoryszak | 2013-11-25 13:08:50 +0100 (Mon, 25 Nov 2013) | 1 line

fixing property misuse
------------------------------------------------------------------------
r22410 | mateusz.fedoryszak | 2013-11-25 11:37:52 +0100 (Mon, 25 Nov 2013) | 1 line

missing brackets
------------------------------------------------------------------------
r22409 | mateusz.fedoryszak | 2013-11-25 11:37:10 +0100 (Mon, 25 Nov 2013) | 1 line

Generic converting mapper
------------------------------------------------------------------------
r22229 | mateusz.fedoryszak | 2013-11-18 11:07:35 +0100 (Mon, 18 Nov 2013) | 1 line

somewhat works (no errors nor output)
------------------------------------------------------------------------
r21965 | mateusz.fedoryszak | 2013-11-13 10:04:18 +0100 (Wed, 13 Nov 2013) | 1 line

basic converters
------------------------------------------------------------------------
r21733 | marek.horst | 2013-11-04 10:49:31 +0100 (Mon, 04 Nov 2013) | 1 line

introducing "icm-iis-documentssimilarity"
------------------------------------------------------------------------
r21730 | marek.horst | 2013-11-04 10:47:29 +0100 (Mon, 04 Nov 2013) | 1 line

Share project "icm-iis-documentssimilarity" into "https://svn.driver.research-infrastructures.eu/driver"
------------------------------------------------------------------------