taishi-i edited this page Sep 14, 2020 · 1 revision

Tokenizers

  • mecab-python3
  • janome
  • fugashi-ipadic
  • fugashi-unidic
  • sudachipy
  • ginza
  • spacy
  • nagisa
  • kytea
  • sentencepiece
  • jumanpp
  • tinysegmenter

Datasets

  • livedoor_news_corpus
  • yahoo_movie_reviews
  • amazon_reviews