Switch from pyvi to underthesea for Vietnamese word tokenization #11586
Closed
BLKSerene started this conversation in New Features & Project Ideas
Replies: 2 comments 1 reply
-
Thanks for the suggestion! I can see that Underthesea looks like an active and well-maintained project, but I don't understand Vietnamese, and it's not clear to me what's actually different about the tokenization. Are the differences in tokenization approach or quality documented somewhere, or could you explain them?
0 replies
-
I do not speak Vietnamese either. The only difference should be the tokenization accuracy, but the author does not provide any information about it.
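Since neither project documents the difference, one rough way to see it would be to run both tokenizers on the same sentences and measure how often their word boundaries agree. Below is a minimal sketch using only the standard library. It assumes each tokenizer's output has already been converted to a list of word strings (as far as I know, underthesea's `word_tokenize` returns such a list, while pyvi's `ViTokenizer.tokenize` returns a single string with underscores joining the syllables of a word, so that would need splitting first). The helper names `token_spans` and `boundary_agreement` are hypothetical, and the sample token lists are hand-written for illustration, not real output from either library.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]


def token_spans(tokens: List[str]) -> Set[Span]:
    """Map each token to (start, end) offsets over the text with whitespace removed.

    Whitespace inside a token is ignored, so a multi-syllable word such as
    "sinh viên" covers the same characters as the two tokens "sinh" and "viên".
    """
    spans: Set[Span] = set()
    pos = 0
    for tok in tokens:
        length = len("".join(tok.split()))
        spans.add((pos, pos + length))
        pos += length
    return spans


def boundary_agreement(tokens_a: List[str], tokens_b: List[str]) -> float:
    """Jaccard overlap of the token spans produced by two tokenizations."""
    a, b = token_spans(tokens_a), token_spans(tokens_b)
    if not a and not b:
        return 1.0  # two empty tokenizations agree trivially
    return len(a & b) / len(a | b)


# Hand-written example tokenizations of the same sentence (NOT real tool output).
# One treats "sinh viên" (student) as a single word, the other splits it.
example_a = ["Tôi", "là", "sinh viên"]
example_b = ["Tôi", "là", "sinh", "viên"]

print(boundary_agreement(example_a, example_b))
```

This only shows where the two tokenizers disagree, not which one is right. Judging accuracy would still need a gold-standard segmented corpus or a Vietnamese speaker to look at the disagreements.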
1 reply
-
Hi,
spaCy currently uses pyvi for Vietnamese word tokenization. Have you ever considered Underthesea as a better alternative?