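Subtitles generated from Chinese text break at odd positions: a word that should appear in a single subtitle is split across two subtitles. How does --write-subtitles segment the text? Replacing Chinese punctuation with English punctuation gives the same result #234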
Comments
Hello, can you try this branch for me? https://github.com/rany2/edge-tts/tree/wip-subtitles
Do I need to reinstall this branch of edge-tts? I tried the command in the cmd window, but the result was unsatisfactory:

edge-tts --voice zh-CN-YunxiNeural --text "萧漪气得粉拳挥舞,二师兄,你的嘴巴也太讨厌了.谁痴呆,谁脑残了.真是的,二师兄,你这样子,没有女孩子会喜欢你.吕少卿满脸不屑,爱情什么的,有灵石重要吗?谁要女孩子喜欢了?麻烦!" --write-media D:\hello.mp3 --write-subtitles D:\hello.vtt

00:00:00.100 --> 00:00:03.000
萧漪 气 得 粉 拳 挥舞 二师兄 你 的 嘴巴

00:00:03.025 --> 00:00:06.638
也 太 讨厌 了 谁 痴呆 谁 脑残 了 真是的

00:00:06.912 --> 00:00:11.062
二师兄 你 这 样子 没有 女孩子 会 喜欢 你 吕少卿

00:00:11.062 --> 00:00:14.325
满 脸 不屑 爱情 什么的 有 灵石 重要 吗 谁

00:00:14.325 --> 00:00:15.900
要 女孩子 喜欢 了 麻烦
You need to reinstall from that branch. I believe you didn't do so, which is why the behaviour stayed the same.
After testing, I found that the subtitle format did change, and it even retained punctuation marks. However, the subtitles still don't start a new cue at punctuation marks, and I'm not sure where the problem lies. Generally, it's more reasonable for subtitles to break at pauses in the audio, so that a single subtitle doesn't become too long and cluttered. Was it supposed to be like this originally? The subtitles were generated with the following command:

edge-tts --voice zh-CN-YunxiNeural --text "萧漪气得粉拳挥舞,二师兄,你的嘴巴也太讨厌了。谁痴呆,谁脑残了。真是的,二师兄,你这样子,没有女孩子会喜欢你。吕少卿满脸不屑,爱情什么的,有灵石重要吗?谁要女孩子喜欢了?麻烦!" --write-media D:\hello.mp3 --write-subtitles D:\hello.vtt
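To make the request above concrete, here is a minimal sketch of starting a new cue after each punctuation mark. It is not how edge-tts works internally. It assumes the word timings are already available as `(start_seconds, end_seconds, text)` tuples with punctuation attached to the end of the preceding word; the name `group_at_punctuation`, the `BREAK_CHARS` set, and that tuple layout are all hypothetical.

```python
# Hypothetical helper, not part of edge-tts.
# Full-width and ASCII punctuation that should close a cue.
BREAK_CHARS = set(",。!?;、,.!?;")


def group_at_punctuation(words):
    """Group (start, end, text) word tuples into cues ending at punctuation.

    Returns a list of (cue_start, cue_end, cue_text) tuples.
    """
    cues = []
    current = []
    for start, end, text in words:
        current.append((start, end, text))
        # Close the cue when the word ends with a break character.
        if text and text[-1] in BREAK_CHARS:
            cues.append(current)
            current = []
    if current:
        # Trailing words without closing punctuation form the last cue.
        cues.append(current)
    return [
        (cue[0][0], cue[-1][1], "".join(t for _, _, t in cue))
        for cue in cues
    ]
```

Each cue takes its start time from its first word and its end time from its last word, so cue boundaries line up with the pauses that punctuation usually marks.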
Makes sense, thanks for the feedback.
Hi rany2, IIUC, you are generating the subtitles from results obtained with wordBoundaryEnabled. Have you tried generating them with sentenceBoundaryEnabled instead? That looks like a promising way to solve the issue described above.
Unfortunately, sentence boundary has many bugs when proper punctuation isn't provided, and it is deprecated by Microsoft themselves.
The subtitles generated from Chinese text always break at strange positions. A single word should appear within one subtitle, but the generated subtitles split it across two. I'm not sure how the --write-subtitles command segments the text. Even after replacing the Chinese punctuation marks with English ones, the result is the same.
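As an illustration of the behaviour being asked for, the sketch below fills each subtitle word by word and only ever breaks between words, so no word is split across two subtitles. It assumes the text has already been segmented into word tokens (for example, the space-separated words in the subtitle output quoted in this thread); `wrap_tokens` and `max_chars` are hypothetical names, and this is not a description of what --write-subtitles actually does.

```python
# Hypothetical helper, not part of edge-tts.
def wrap_tokens(tokens, max_chars=10):
    """Pack whole word tokens into cues of at most max_chars characters.

    A token longer than max_chars still gets a cue of its own rather
    than being split.
    """
    cues = []
    current = []
    length = 0
    for token in tokens:
        # Start a new cue if adding this token would exceed the limit.
        if current and length + len(token) > max_chars:
            cues.append(current)
            current = []
            length = 0
        current.append(token)
        length += len(token)
    if current:
        cues.append(current)
    return cues


tokens = ["萧漪", "气", "得", "粉", "拳", "挥舞", "二师兄", "你", "的", "嘴巴"]
for cue in wrap_tokens(tokens, max_chars=6):
    print("".join(cue))
# 萧漪气得粉拳
# 挥舞二师兄你
# 的嘴巴
```

Breaking on a character count alone would cut through words such as 二师兄; counting whole tokens keeps each word intact at the cost of slightly uneven cue lengths.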