  | Elasticsearch | Plugin | Release date |
  | -------------- | -------------- | ------------ |
+ | 2.4.4 | 2.4.4.1 | Jan 25, 2017 |
  | 2.3.3 | 2.3.3.0 | Jun 11, 2016 |
  | 2.3.2 | 2.3.2.0 | Jun 11, 2016 |
  | 2.3.1 | 2.3.1.0 | Apr 11, 2016 |
@@ -104,7 +105,7 @@ zh-tw

  ## Installation Elasticsearch 2.x

-     ./bin/plugin install https://github.com/jprante/elasticsearch-langdetect/releases/download/2.3.3.0/elasticsearch-langdetect-2.3.3.0-plugin.zip
+     ./bin/plugin install https://github.com/jprante/elasticsearch-langdetect/releases/download/2.4.4.1/elasticsearch-langdetect-2.4.4.1-plugin.zip

  ## Installation Elasticsearch 1.x

@@ -134,7 +135,10 @@ In this example, we create a simple detector field, and write text to it for det
      {
        "article" : {
          "properties" : {
-           "content" : { "type" : "langdetect" }
+           "langcode" : {
+             "type" : "langdetect",
+             "languages" : [ "de", "en", "fr", "nl", "it" ]
+           }
          }
        }
      }
@@ -143,21 +147,21 @@ In this example, we create a simple detector field, and write text to it for det
      curl -XPUT 'localhost:9200/test/article/1' -d '
      {
        "title" : "Some title",
-       "content" : "Oh, say can you see by the dawn`s early light, What so proudly we hailed at the twilight`s last gleaming?"
+       "langcode" : "Oh, say can you see by the dawn`s early light, What so proudly we hailed at the twilight`s last gleaming?"
      }
      '

      curl -XPUT 'localhost:9200/test/article/2' -d '
      {
        "title" : "Ein Titel",
-       "content" : "Einigkeit und Recht und Freiheit für das deutsche Vaterland!"
+       "langcode" : "Einigkeit und Recht und Freiheit für das deutsche Vaterland!"
      }
      '

      curl -XPUT 'localhost:9200/test/article/3' -d '
      {
        "title" : "Un titre",
-       "content" : "Allons enfants de la Patrie, Le jour de gloire est arrivé!"
+       "langcode" : "Allons enfants de la Patrie, Le jour de gloire est arrivé!"
      }
      '

@@ -169,7 +173,7 @@ A search for the detected language codes is a simple term query, like this:
      {
        "query" : {
          "term" : {
-           "content" : "en"
+           "langcode" : "en"
          }
        }
      }
@@ -178,7 +182,7 @@ A search for the detected language codes is a simple term query, like this:
      {
        "query" : {
          "term" : {
-           "content" : "de"
+           "langcode" : "de"
          }
        }
      }
@@ -188,13 +192,65 @@ A search for the detected language codes is a simple term query, like this:
      {
        "query" : {
          "term" : {
-           "content" : "fr"
+           "langcode" : "fr"
          }
        }
      }
      '
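
The three term queries differ only in the language code, so they can be generated from one template. A minimal sketch, assuming the `test` index from the examples above and a node at `localhost:9200`; the `ES` variable and the `term_query` and `search_all_languages` helpers are illustrative names, not part of the plugin or the README:

```shell
# Assumed base URL of a local Elasticsearch 2.x node; override via the ES variable.
ES="${ES:-localhost:9200}"

# Build the term query body for one language code (hypothetical helper).
term_query() {
    printf '{ "query" : { "term" : { "langcode" : "%s" } } }' "$1"
}

# Send the same term query once per language code used in the examples.
search_all_languages() {
    for lang in en de fr; do
        curl -XPOST "$ES/test/_search?pretty" -d "$(term_query "$lang")"
    done
}
```

Calling `search_all_languages` against a running node sends one search per language code; `term_query de` alone prints the request body without contacting the server.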

- ## Show stored language codes
+ ## Indexing language-detected text alongside the language code
+
+ Indexing only the language code is not enough in most cases. The language-detected text
+ should also be passed to a language-specific analyzer. This plugin supports that with
+ the `language_to` parameter, which maps each detected language code to a target field.
+
+     curl -XDELETE 'localhost:9200/test'
+
+     curl -XPUT 'localhost:9200/test'
+
+     curl -XPOST 'localhost:9200/test/article/_mapping' -d '
+     {
+       "article" : {
+         "properties" : {
+           "langcode":{
+             "type" : "langdetect",
+             "languages" : [ "de", "en", "fr", "nl", "it" ],
+             "language_to" : {
+               "de": "german_field",
+               "en": "english_field"
+             }
+           },
+           "german_field" : {
+             "analyzer" : "german",
+             "type": "string"
+           },
+           "english_field" : {
+             "analyzer" : "english",
+             "type" : "string"
+           }
+         }
+       }
+     }
+     '
+
+     curl -XPUT 'localhost:9200/test/article/1' -d '
+     {
+       "langcode" : "This is a small example for english text"
+     }
+     '
+
+     curl -XPOST 'localhost:9200/test/_search?pretty' -d '
+     {
+       "query" : {
+         "match" : {
+           "english_field" : "This is a small example for english text"
+         }
+       }
+     }
+     '
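
The `de` entry of `language_to` can be exercised the same way as the `en` entry. A minimal sketch, assuming the `test` index and the mapping above exist and a node is listening on `localhost:9200`; the document ID `2`, the German sample sentence, the `ES` variable, and the `index_and_search_german` helper are illustrative and not taken from the README:

```shell
# Assumed base URL of a local Elasticsearch 2.x node; override via the ES variable.
ES="${ES:-localhost:9200}"

# Illustrative German sample document (hypothetical text).
GERMAN_DOC='{ "langcode" : "Das ist ein kleines Beispiel für deutschen Text" }'

# Index the document, refresh the index so the document is searchable,
# then match on german_field, the target of the "de" entry in language_to.
index_and_search_german() {
    curl -XPUT "$ES/test/article/2" -d "$GERMAN_DOC"
    curl -XPOST "$ES/test/_refresh"
    curl -XPOST "$ES/test/_search?pretty" -d '
    {
      "query" : {
        "match" : {
          "german_field" : "Beispiel"
        }
      }
    }
    '
}
```

Run `index_and_search_german` against a live node. If the plugin detects the text as German, the document should be found through `german_field`; whether it is depends on the detection result for the sample text.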
+
+
+ ## Language code and `multi_field`

  Using multifields, it is possible to store the text alongside the detected language(s).
  Here, we use another (short nonsense) example text for demonstration,