docs: add README to each example
KEINOS committed Apr 13, 2024
1 parent 9b43806 commit 7f0db96
Showing 4 changed files with 60 additions and 9 deletions.
2 changes: 1 addition & 1 deletion _examples/db_search/README.md
@@ -55,7 +55,7 @@ It demonstrates how to tokenize Japanese text using Kagome, which is a common re
By using SQLite with FTS4, it efficiently manages and searches through a large amount of text data, making it suitable for applications like:

1. **Search Engines:** You can use this code as a basis for building a search engine that indexes and searches Japanese text content.
2. **Document Management Systems:** This code can be integrated into a document management system to enable full-text search capabilities for Japanese documents.
2. **Document Management Systems:** This code can be integrated into a document management system to enable full-text search capabilities for Japanese documents.
3. **Content Recommendation Systems:** When you have a large collection of Japanese content, you can use this code to implement content recommendation systems based on user queries.
4. **Chatbots and NLP:** If you're building chatbots or natural language processing (NLP) systems for the Japanese language, this code can assist in text analysis and search within the chatbot's knowledge base.

26 changes: 26 additions & 0 deletions _examples/tokenize/README.md
@@ -0,0 +1,26 @@
# Example of tokenizing/analyzing Japanese text

This example demonstrates how to analyze (tokenize) a sentence and get the part of speech (POS) of each word using Kagome.

- Target text data is as follows:

```text
すもももももももものうち
```

- Example output:

```shellsession
$ cd /path/to/kagome/_examples/tokenize
$ go run .
---tokenize---
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
```

> __Note__ that tokenization varies depending on the dictionary used. In this example we use the IPA dictionary.
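
The output lines above pair a surface form with a comma-separated feature list, so they are straightforward to post-process. The following is a minimal sketch, assuming the line layout shown in the example output. The `Token` type and the `parseLine` and `POS` helpers are hypothetical names introduced for illustration and are not part of Kagome; the sketch also assumes the first feature is the top-level part of speech, as it is in the IPA-dictionary output shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// Token is a hypothetical helper type for this sketch (not a Kagome type).
type Token struct {
	Surface  string   // the word as it appears in the text
	Features []string // comma-separated features, split into fields
}

// parseLine splits one output line of the form "<surface> <features>"
// (separated by a tab or spaces) into a Token.
func parseLine(line string) (Token, error) {
	fields := strings.Fields(line)
	if len(fields) != 2 {
		return Token{}, fmt.Errorf("unexpected line format: %q", line)
	}
	return Token{Surface: fields[0], Features: strings.Split(fields[1], ",")}, nil
}

// POS returns the first feature, assumed here to be the top-level part of speech.
func (t Token) POS() string {
	if len(t.Features) == 0 {
		return ""
	}
	return t.Features[0]
}

func main() {
	// Two lines copied from the example output above.
	lines := []string{
		"すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ",
		"も\t助詞,係助詞,*,*,*,*,も,モ,モ",
	}
	for _, l := range lines {
		tok, err := parseLine(l)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Printf("%s => %s\n", tok.Surface, tok.POS())
	}
}
```

Because the feature layout depends on the dictionary, the index used in `POS` may need adjusting for dictionaries other than IPA.
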
16 changes: 8 additions & 8 deletions _examples/tokenize/main.go
@@ -22,12 +22,12 @@ func main() {
}

// Output:
//---tokenize---
//すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
//も 助詞,係助詞,*,*,*,*,も,モ,モ
//もも 名詞,一般,*,*,*,*,もも,モモ,モモ
//も 助詞,係助詞,*,*,*,*,も,モ,モ
//もも 名詞,一般,*,*,*,*,もも,モモ,モモ
//の 助詞,連体化,*,*,*,*,の,ノ,ノ
//うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
// ---tokenize---
// すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
// も 助詞,係助詞,*,*,*,*,も,モ,モ
// もも 名詞,一般,*,*,*,*,もも,モモ,モモ
// も 助詞,係助詞,*,*,*,*,も,モ,モ
// もも 名詞,一般,*,*,*,*,もも,モモ,モモ
// の 助詞,連体化,*,*,*,*,の,ノ,ノ
// うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
}
25 changes: 25 additions & 0 deletions _examples/wakati/README.md
@@ -0,0 +1,25 @@
# Wakati Example with Kagome

## Segmenting Japanese text into words with Kagome

In this example, we demonstrate how to segment Japanese text into words using Kagome.

- Target text data is as follows:

```text
すもももももももものうち
```

- Example output:

```shellsession
$ cd /path/to/kagome/_examples/wakati
$ go run .
----wakati---
すもも/も/もも/も/もも/の/うち
```

> __Note__ that segmentation varies depending on the dictionary used.
> This example uses the IPA dictionary; for search purposes, however, the Uni dictionary is recommended.
>
> - [What is a Kagome dictionary?](https://github.com/ikawaha/kagome/wiki/About-the-dictionary#what-is-a-kagome-dictionary) | Wiki | kagome @ GitHub
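
The slash-delimited line in the example output can be produced by joining the segments with a separator. The following is a minimal sketch of that formatting step only: the segment slice is hard-coded as a stand-in for the tokenizer's result, and `formatSegments` is a hypothetical helper name, not a Kagome function.

```go
package main

import (
	"fmt"
	"strings"
)

// formatSegments joins word segments with the given separator.
// It is a hypothetical helper for this sketch, not part of Kagome.
func formatSegments(segments []string, sep string) string {
	return strings.Join(segments, sep)
}

func main() {
	// Hard-coded stand-in for the segments shown in the example output above.
	segments := []string{"すもも", "も", "もも", "も", "もも", "の", "うち"}

	// Slash-delimited, as in the example output.
	fmt.Println(formatSegments(segments, "/"))

	// Space-delimited, a common input form for whitespace-based indexers.
	fmt.Println(formatSegments(segments, " "))
}
```
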
