Could you briefly describe the modeling approach behind detection and correction?
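The idea is fairly simple, roughly as follows:

Detection: randomly sample words from sentences in the People's Daily (人民日报) corpus and replace them with incorrect words. There are currently two replacement methods: 1) homophone replacement and 2) visually similar character replacement. The positions of the wrong words are recorded, and a model is then trained to predict those positions.

Correction: each position identified as an error is replaced with [MASK], the model predicts likely words for the [MASK] position, and the candidates are finally ranked by pinyin.

I'm working on a new version at the moment. It will add new word-replacement methods and merge detection and correction into one step. Stay tuned :)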
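For readers who want something concrete, here is a minimal sketch of the two steps described in the reply above: building detection training data by random replacement, and re-ranking correction candidates by pinyin. It is not the project's actual code. The confusion tables, the pinyin table, the function names `corrupt` and `rerank_by_pinyin`, the replacement rate, and the exact ranking rule (same-pinyin candidates first, then by model score) are all assumptions made for illustration. The MLM scores are passed in as plain numbers rather than computed by a real model.

```python
import random

# Hypothetical toy confusion tables; a real system would load much larger ones.
HOMOPHONES = {"在": ["再"], "做": ["作"]}
SIMILAR_SHAPE = {"人": ["入"], "己": ["已"]}

# Hypothetical toy pinyin lookup (tones ignored).
TOY_PINYIN = {"在": "zai", "再": "zai", "才": "cai", "是": "shi"}


def corrupt(tokens, rate=0.15, rng=None):
    """Randomly replace tokens with confusable ones.

    Returns (corrupted_tokens, labels), where labels[i] is 1 if position i
    was replaced and 0 otherwise. The labels are the detection targets.
    """
    if rng is None:
        rng = random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        candidates = HOMOPHONES.get(tok, []) + SIMILAR_SHAPE.get(tok, [])
        if candidates and rng.random() < rate:
            corrupted.append(rng.choice(candidates))
            labels.append(1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels


def rerank_by_pinyin(wrong_token, candidates):
    """Re-rank (token, score) candidates predicted for a [MASK] position.

    Assumed rule: candidates sharing the wrong token's pinyin come first,
    and ties are broken by the model score, highest first.
    """
    target = TOY_PINYIN.get(wrong_token)

    def sort_key(item):
        token, score = item
        same_pinyin = target is not None and TOY_PINYIN.get(token) == target
        return (same_pinyin, score)

    return sorted(candidates, key=sort_key, reverse=True)


if __name__ == "__main__":
    noisy, labels = corrupt(list("我在做人"), rate=1.0, rng=random.Random(0))
    print("".join(noisy), labels)
    print(rerank_by_pinyin("再", [("才", 0.6), ("在", 0.3), ("是", 0.1)]))
```

In a real pipeline the `labels` would train a sequence-labeling detector, and the candidate list would come from a masked language model's predictions at the flagged position.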
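Thank you very much for the quick reply. Two more things:

1. I followed your WeChat official account but cannot get into the NLP discussion group. The official account seems to have a problem: the menu entry for the group does not show up.
2. My own correction work follows the same pipeline: label the errors, convert them to [MASK], and predict with an MLM model. The hard part seems to be the data. I would appreciate a way to reach you on WeChat, or an invitation to the WeChat group.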