Rasa is an open source machine learning framework to automate text-and voice-based conversations.Rasa's primary purpose is to help you build contextual, layered conversations with lots of back-and-forth. To have a real conversation, you need to have some memory and build on things that were said earlier. Rasa lets you do that in a scalable way.
- Firstly install rasa in your own environment as following:
git clone https://github.com/RasaHQ/rasa.git
cd rasa
- As we will add a custom component to tokenize japanese text , we will use mecab library for tokinzing the text.
add mecab-python3 at the end of requirements.txt file. Or install mecab using pip.
pip install mecab-python3
- Now install all the requirements using :
pip install -r requirements.txt
pip install -e .
-
First run the following command:
rasa init --no-prompt
if you find any error like " rasa.core.trackers - Tried to set non existent slot 'name'. Make sure you added all your slots to your domain file.", then please remove all the data inside rasa/data/* directory.
-
Then add the Japanese language tokenizer in the path "rasa/rasa/nlu/japanese_tokenizer.py". I have added the file in "rasa/rasa/nlu/japanese_tokenizer.py" path.
-
Add JapaneseTokenizer component class in /rasa/rasa/nlu/registry.py. E.g:
from rasa.nlu.tokenizers.japanese_tokenizer import JapaneseTokenizer
Add class name inside component_classes = [] dictionary. E.g
component_classes = [ # tokenizers JapaneseTokenizer, MitieTokenizer, SpacyTokenizer, WhitespaceTokenizer, JiebaTokenizer, ]
-
Now add JapaneseTokenizer as pipeline in config.yml as follows:
# Configuration for Rasa NLU. # https://rasa.com/docs/rasa/nlu/components/ language: jp pipeline: - name: "JapaneseTokenizer" - name: "RegexFeaturizer" - name: "CRFEntityExtractor" - name: "EntitySynonymMapper" - name: "CountVectorsFeaturizer" - name: "EmbeddingIntentClassifier" # Configuration for Rasa Core. # https://rasa.com/docs/rasa/core/policies/ policies: - name: MemoizationPolicy - name: KerasPolicy - name: MappingPolicy
-
Now Add data as data/nlu.md, stories.md and domail.yml. E.g:
nle.md:
## intent:greet - ハロー - もしもし - 初めまして - こんにちは - はじめまして ## intent:icebreak12 - よろしくお願いします - こちらこそ - 宜しくお願いします - 問題ないです
stories.md:
## happy path * greet - utter_greet * icebreak12 - utter_icebreak12
domain.yml:
intents: - utter_icebreak12 - greet templates: utter_icebreak12: - text: 今どこで、何をしていますか? utter_greet: - text: ご協力いただきありがとうございます。本日は宜しくお願いします actions: - utter_greet - utter_icebreak12
-
Now train your model using:
rasa train
-
To chat run:
rasa shell