polyphonetic characters pronounce wrong in Chinese #551

Closed
4 tasks done
teoking opened this issue Nov 29, 2024 · 3 comments
Labels
question Further information is requested

Comments

teoking commented Nov 29, 2024

Checks

  • This template is only for questions, not feature requests or bug reports.
  • I have thoroughly reviewed the project documentation and read the related paper(s).
  • I have searched for existing issues, including closed ones, no similar questions.
  • I confirm that I am using English to submit this report in order to facilitate communication.

Question details

In the sentence "村民们各个神情专注地倾听", '地' is pronounced 'di4', while 'de' (neutral tone) is correct.
In the sentence "一个衣着简朴的讲述者正在中央", '朴' is pronounced 'piao2', while 'pu3' is correct.

So is this a Chinese-to-pinyin conversion problem? How can I fix it, e.g. by switching to another Chinese-to-pinyin library, or by correcting the pinyin at inference time?

I looked through the paper but found no clues.

teoking added the "question" (Further information is requested) label Nov 29, 2024

teoking commented Nov 29, 2024

This looks like a pinyin conversion issue in python-pinyin (pypinyin):

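from pypinyin import pinyin, Style

# Phrases from the two sentences above that contain the mispronounced characters.
strs = ['专注地倾听', '衣着简朴的讲述者']

for s in strs:
    # heteronym=True lists every candidate reading for each character;
    # the first entry is the one pypinyin picks by default.
    result = pinyin(s, style=Style.TONE3, heteronym=True)
    print(s, result)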
Output:
专注地倾听 [['zhuan1'], ['zhu4', 'zhou4'], ['di4', 'de'], ['qing1'], ['ting1', 'yin3', 'yi2']]
衣着简朴的讲述者 [['yi1'], ['zhuo2'], ['jian3'], ['piao2'], ['de', 'di1', 'di2', 'di4'], ['jiang3'], ['shu4'], ['zhe3']]
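
As a rough illustration of the "correct the pinyin at inference" idea from the question, the sketch below patches a per-character reading list with a small phrase-level override table before the readings are passed on. It is only a sketch under stated assumptions: `OVERRIDES` and `apply_overrides` are hypothetical names that do not belong to pypinyin or to this project, and the input readings are simply the first candidates from the output above, written out by hand.

```python
from typing import Dict, List

# Hypothetical override table: phrase -> one reading per character (TONE3 style).
# These entries are illustrative; a real table would need to be curated.
OVERRIDES: Dict[str, List[str]] = {
    "简朴": ["jian3", "pu3"],
    "专注地": ["zhuan1", "zhu4", "de"],
}


def apply_overrides(
    text: str, readings: List[str], overrides: Dict[str, List[str]]
) -> List[str]:
    """Return a copy of `readings` with override phrases found in `text` patched in.

    Assumes exactly one reading per character of `text`.
    """
    if len(text) != len(readings):
        raise ValueError("expected exactly one reading per character")
    fixed = list(readings)
    max_len = max((len(p) for p in overrides), default=0)
    i = 0
    while i < len(text):
        # Try the longest candidate phrase first so specific entries win.
        for n in range(min(max_len, len(text) - i), 0, -1):
            phrase = text[i:i + n]
            if phrase in overrides:
                if len(overrides[phrase]) != n:
                    raise ValueError(f"override for {phrase!r} has the wrong length")
                fixed[i:i + n] = overrides[phrase]
                i += n
                break
        else:
            # No override starts at this position; keep the original reading.
            i += 1
    return fixed


# Readings copied by hand from the first candidates in the output above.
raw = ["yi1", "zhuo2", "jian3", "piao2", "de", "jiang3", "shu4", "zhe3"]
print(apply_overrides("衣着简朴的讲述者", raw, OVERRIDES))
```

A table like this only covers phrases someone has listed explicitly, so it is a stopgap for known trouble spots rather than a general disambiguation method.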

@ZhikangNiu
Collaborator

@teoking please check this issue: mozillazg/python-pinyin#249

@ZhikangNiu
Collaborator

BTW, you can also try https://github.com/GitYCC/g2pW
