This repository was archived by the owner on Dec 18, 2021. It is now read-only.

Commit 5e2f9ea

update README
1 parent a9536d5 commit 5e2f9ea

1 file changed: 65 additions, 9 deletions

README.md

Lines changed: 65 additions & 9 deletions
@@ -81,6 +81,7 @@ zh-tw
 
 | Elasticsearch | Plugin | Release date |
 | -------------- | -------------- | ------------ |
+| 2.4.4 | 2.4.4.1 | Jan 25, 2017 |
 | 2.3.3 | 2.3.3.0 | Jun 11, 2016 |
 | 2.3.2 | 2.3.2.0 | Jun 11, 2016 |
 | 2.3.1 | 2.3.1.0 | Apr 11, 2016 |
@@ -104,7 +105,7 @@ zh-tw
 
 ## Installation Elasticsearch 2.x
 
-./bin/plugin install https://github.com/jprante/elasticsearch-langdetect/releases/download/2.3.3.0/elasticsearch-langdetect-2.3.3.0-plugin.zip
+./bin/plugin install https://github.com/jprante/elasticsearch-langdetect/releases/download/2.4.4.1/elasticsearch-langdetect-2.4.4.1-plugin.zip
 
 ## Installation Elasticsearch 1.x
 
@@ -134,7 +135,10 @@ In this example, we create a simple detector field, and write text to it for det
 {
   "article" : {
     "properties" : {
-      "content" : { "type" : "langdetect" }
+      "langcode" : {
+        "type" : "langdetect",
+        "languages" : [ "de", "en", "fr", "nl", "it" ]
+      }
     }
   }
 }
@@ -143,21 +147,21 @@ In this example, we create a simple detector field, and write text to it for det
 curl -XPUT 'localhost:9200/test/article/1' -d '
 {
   "title" : "Some title",
-  "content" : "Oh, say can you see by the dawn`s early light, What so proudly we hailed at the twilight`s last gleaming?"
+  "langcode" : "Oh, say can you see by the dawn`s early light, What so proudly we hailed at the twilight`s last gleaming?"
 }
 '
 
 curl -XPUT 'localhost:9200/test/article/2' -d '
 {
   "title" : "Ein Titel",
-  "content" : "Einigkeit und Recht und Freiheit für das deutsche Vaterland!"
+  "langcode" : "Einigkeit und Recht und Freiheit für das deutsche Vaterland!"
 }
 '
 
 curl -XPUT 'localhost:9200/test/article/3' -d '
 {
   "title" : "Un titre",
-  "content" : "Allons enfants de la Patrie, Le jour de gloire est arrivé!"
+  "langcode" : "Allons enfants de la Patrie, Le jour de gloire est arrivé!"
 }
 '
 
@@ -169,7 +173,7 @@ A search for the detected language codes is a simple term query, like this:
 {
   "query" : {
     "term" : {
-      "content" : "en"
+      "langcode" : "en"
     }
   }
 }
@@ -178,7 +182,7 @@ A search for the detected language codes is a simple term query, like this:
 {
   "query" : {
     "term" : {
-      "content" : "de"
+      "langcode" : "de"
     }
   }
 }
@@ -188,13 +192,65 @@ A search for the detected language codes is a simple term query, like this:
 {
   "query" : {
     "term" : {
-      "content" : "fr"
+      "langcode" : "fr"
     }
   }
 }
 '
 
-## Show stored language codes
+## Indexing language-detected text alongside the language code
+
+Just indexing the language code is not enough in most cases. The language-detected text
+should be passed to a specific analyzer to apply language-specific analysis. This plugin
+allows that through the `language_to` parameter.
+
+curl -XDELETE 'localhost:9200/test'
+
+curl -XPUT 'localhost:9200/test'
+
+curl -XPOST 'localhost:9200/test/article/_mapping' -d '
+{
+  "article" : {
+    "properties" : {
+      "langcode" : {
+        "type" : "langdetect",
+        "languages" : [ "de", "en", "fr", "nl", "it" ],
+        "language_to" : {
+          "de" : "german_field",
+          "en" : "english_field"
+        }
+      },
+      "german_field" : {
+        "analyzer" : "german",
+        "type" : "string"
+      },
+      "english_field" : {
+        "analyzer" : "english",
+        "type" : "string"
+      }
+    }
+  }
+}
+'
+
+curl -XPUT 'localhost:9200/test/article/1' -d '
+{
+  "langcode" : "This is a small example for english text"
+}
+'
+
+curl -XPOST 'localhost:9200/test/_search?pretty' -d '
+{
+  "query" : {
+    "match" : {
+      "english_field" : "This is a small example for english text"
+    }
+  }
+}
+'
+
+
+## Language code and `multi_field`
 
 Using multifields, it is possible to store the text alongside the detected language(s).
 Here, we use another (short nonsense) example text for demonstration,
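
The `language_to` mapping added in this commit sends text detected as `de` to `german_field` and text detected as `en` to `english_field`. A client that wants to search the analyzed text has to pick the matching target field. The following is a minimal sketch of that lookup, assuming the two-entry table from the README example; `field_for_lang` and `match_query_for_lang` are hypothetical helper names and are not part of the plugin.

```shell
#!/bin/sh
# Hypothetical helpers: they mirror the language_to table from the README
# example mapping ("de" -> german_field, "en" -> english_field).

# Print the target field for a language code, or fail if the example
# mapping has no language_to entry for it.
field_for_lang() {
  case "$1" in
    de) echo "german_field" ;;
    en) echo "english_field" ;;
    *)  return 1 ;;
  esac
}

# Print a match query body against the field that receives text of the
# given language. The text must not contain double quotes or backslashes,
# since no JSON escaping is done here.
match_query_for_lang() {
  field=$(field_for_lang "$1") || return 1
  printf '{ "query" : { "match" : { "%s" : "%s" } } }\n' "$field" "$2"
}

# Dry run: only prints the query body, no Elasticsearch node is contacted.
match_query_for_lang de "Einigkeit und Recht und Freiheit"
```

To send the generated body to a running node, the output can be piped into the same search call the README uses, for example `match_query_for_lang en "small example" | curl -XPOST 'localhost:9200/test/_search?pretty' -d @-`.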
