NLP resources for the Georgian language

(Stuff I've encountered so far)

Models

Tools

Some projects working on Georgian NLP tools:

Datasets

  • WikiANN - a multilingual named entity recognition (NER) dataset built from Wikipedia.
  • mC4 - a cleaned, multilingual version of Common Crawl that contains 15+ GB of Georgian text. I've found it fairly useful in my experiments.

Linguistic resources

Research notes and questions

Contributions

  • I'm seeking input from other researchers and practitioners on best practices and useful resources for doing NLP in Georgian. Please contribute what you can, especially general wisdom.