About Traditional Chinese model #6025
Does spaCy support traditional Chinese, i.e. ZH-TW or ZH-HK?
Replies: 1 comment
In the basic `Chinese` language support, there's not much specific to simplified Chinese, except maybe the stop words. If you have a `jieba` dictionary or `pkuseg` model for traditional Chinese characters, it should work fine.

The provided Chinese models like `zh_core_web_sm` are trained on OntoNotes 5, which only contains simplified Chinese (see p. 27 in the docs: https://catalog.ldc.upenn.edu/docs/LDC2013T19/OntoNotes-Release-5.0.pdf). If you know of data with a permissive license that can be used to train models for traditional Chinese (typically it's hardest to find NER data), we'd be happy to look into whether we could provide additional models.
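
For anyone who wants to try the basic `Chinese` support on traditional text, here is a minimal sketch. It assumes spaCy v3, where the tokenizer's segmenter is chosen through the config; `make_zh_pipeline` is a hypothetical helper name, and the sample sentence is only an illustration. It uses character segmentation because that needs no extra dictionary or model.

```python
from spacy.lang.zh import Chinese


def make_zh_pipeline(segmenter: str = "char"):
    """Hypothetical helper: build a blank Chinese pipeline.

    Assumes spaCy v3, where the segmenter ("char", "jieba" or "pkuseg")
    is set under nlp.tokenizer in the config.
    """
    cfg = {"nlp": {"tokenizer": {"segmenter": segmenter}}}
    return Chinese.from_config(cfg)


# Character segmentation does not depend on any simplified-only resources.
nlp = make_zh_pipeline("char")

# Illustrative traditional Chinese sentence.
doc = nlp("我喜歡自然語言處理")
tokens = [token.text for token in doc]
print(tokens)
```

This should give one token per character. Passing `"jieba"` or `"pkuseg"` instead selects those segmenters, provided the packages are installed; how well they handle traditional characters depends on the dictionary or model you supply. This blank pipeline only tokenizes, so it has no tagger, parser or NER.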