
Sample code for natural language processing in Japanese


upura/nlp-recipes-ja


NLP Recipes for Japanese

This repository contains sample code for natural language processing in Japanese. It is heavily inspired by microsoft/nlp-recipes.

Content

The following is a summary of the common NLP scenarios covered in the repository. Each scenario is demonstrated in one or more scripts or Jupyter notebooks that use the repository's core models and utilities.

| Category | Methods |
| --- | --- |
| Basic | Cleaning, Normalization, Stopwords, Sentence Segmentation, Ruby |
| Embeddings | Word2Vec, fastText, Universal Sentence Encoder |
| Feature Engineering | Bag-of-Words, TF-IDF, BM25, SWEM, SCDV |
| Morphological Analysis | Konoha, nagisa |
| Sentence Similarity | Cosine Similarity |
| Sentiment Analysis | oseti |
| Text Classification | TF-IDF & Logistic Regression, TF-IDF & LightGBM, BERT, T5 |
| Visualization | Visualization with Japanese texts |
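
As a rough illustration of how three of the listed methods (Normalization, Bag-of-Words, Cosine Similarity) fit together, here is a minimal sketch that uses only the Python standard library. It is not taken from the repository's scripts. The function names are hypothetical, and character bigrams are swapped in for a real morphological analyzer such as Konoha or nagisa purely to keep the example free of dependencies.

```python
import math
import unicodedata
from collections import Counter


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization, then lowercase."""
    # NFKC folds full-width alphanumerics to ASCII and
    # half-width katakana to full-width katakana.
    return unicodedata.normalize("NFKC", text).lower()


def char_bigrams(text: str) -> Counter:
    """Count overlapping character bigrams (a tokenizer-free bag of features).

    Assumption: this stands in for word tokens that a morphological
    analyzer would normally produce.
    """
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty or single-character text has no bigrams.
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(s1: str, s2: str) -> float:
    """Normalize both texts, build bigram counts, and compare them."""
    return cosine_similarity(char_bigrams(normalize(s1)),
                             char_bigrams(normalize(s2)))


if __name__ == "__main__":
    print(normalize("ＮＬＰ"))
    print(similarity("猫が好き", "犬が好き"))
```

Normalizing first means that width variants of the same text (for example full-width and ASCII letters) produce identical feature counts. For real workloads, word-level tokens from a morphological analyzer and TF-IDF weighting would likely give more meaningful similarities than raw bigram counts.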

Environment

docker-compose up -d --build
docker exec -it nlp-recipes-ja bash
