Similar-meaning homophone (異字同訓) distinctions #107

Open
stephenmk opened this issue Nov 1, 2023 · 4 comments

Comments

@stephenmk

I was recently told that a user isn't satisfied with JMdict's handling of similar homophone / 異字同訓 terms. I pointed out that JMdict does have many sense notes for indicating that particular senses are especially used with particular kanji forms. The original port of JMdict data to the Yomichan web extension did not include any of these sense notes, which may have caused the perception that this information is not available. But it is true that there are some entries missing these sorts of notes.

At any rate, this got me thinking about how JMdict can be improved with regard to these words. Bret Mayer's website contains about 150 such groups of terms with many example sentences. He says it was translated from a list compiled by the Japanese government Council for Cultural Affairs. Also, Kanjipedia has published a large list of these words with definitions and examples in Japanese.

In the new edition of Sanseido's 国語辞典 ("sankoku"), entries for these types of words contain markers which point to adjacent entries of the same reading. The appendix in sankoku describes these markers as "書き分け注意." There are about 1,500 "groups" of these entries. These groups are not collections of all words using the same reading, but rather homophones that are similar in meaning. For example, 膿む/熟む and 生む/産む are four separate entries in sankoku with the same reading (うむ) comprising two separate "groups."

I compiled a list of these groups from sankoku and attempted to correlate them to JMdict sequence numbers. The list is in a Google spreadsheet and is editable by everyone in the edict-jmdict Google group.

I attempted to identify which JMdict entries have these forms merged. Presumably words that are split into separate JMdict entries (such as 膿む and 熟む) don't require much attention. For forms that are merged (such as ほか for 他 and 外), we may want to go down the list and double check that the JMdict entries adequately draw a distinction between the different forms.
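The merged/split check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `GROUPS` list, the `SEQ_BY_FORM` lookup, and the sequence numbers in it are hypothetical placeholders, not the contents of the spreadsheet or real JMdict IDs.

```python
from collections import defaultdict

# Hypothetical groups: a shared reading plus the kanji forms in the group.
GROUPS = [
    ("うむ", ["膿む", "熟む"]),
    ("ほか", ["他", "外"]),
]

# Hypothetical lookup from (kanji form, reading) to a JMdict sequence number.
# The numbers are placeholders chosen for illustration only.
SEQ_BY_FORM = {
    ("膿む", "うむ"): 1001,
    ("熟む", "うむ"): 1002,
    ("他", "ほか"): 2001,
    ("外", "ほか"): 2001,
}


def classify_group(reading, forms, seq_by_form):
    """Return (status, {sequence number: [forms]}) for one group."""
    by_seq = defaultdict(list)
    missing = []
    for form in forms:
        seq = seq_by_form.get((form, reading))
        if seq is None:
            missing.append(form)
        else:
            by_seq[seq].append(form)

    if missing:
        status = "unmatched"          # at least one form has no entry
    elif len(by_seq) == 1 and len(forms) > 1:
        status = "merged"             # all forms share one entry
    elif len(by_seq) == len(forms):
        status = "split"              # every form has its own entry
    else:
        status = "partially merged"   # some forms share, others do not
    return status, dict(by_seq)


def merged_groups(groups, seq_by_form):
    """List the groups whose forms all sit in a single entry."""
    return [
        (reading, forms)
        for reading, forms in groups
        if classify_group(reading, forms, seq_by_form)[0] == "merged"
    ]


if __name__ == "__main__":
    for reading, forms in merged_groups(GROUPS, SEQ_BY_FORM):
        print(reading, "/".join(forms))
```

Only the groups reported as merged would then need a manual look at how the entry distinguishes its forms.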

@stephenmk stephenmk changed the title Similar homophone (異字同訓) distinctions Similar-meaning homophone (異字同訓) distinctions Nov 1, 2023
@JMdictProject
Owner

A very interesting table, and a good starting point if someone wants to check/verify/amend the handling.

I looked through a few of the ones which are merged into single JMdict entries, e.g. 香り and 薫り, but I didn't see any that I felt needed flagging. In the 香り/薫り case, all the references I checked had them as alternatives without any suggestion the meanings differed according to the kanji form.

@FragozoLeonardo

FragozoLeonardo commented Nov 4, 2023

I'm the person who opened that issue on Stephen's project. How can I help? I'm not yet proficient in Japanese, but I can understand Japanese-to-Japanese definitions in monolingual dictionaries.

@stephenmk
Author

The entries identified in column D might be in need of attention. I have highlighted these cells in yellow.


For example, we currently have 良い, 好い, and 善い merged into the same entry. Bret's ijidoukun page describes 善い as meaning "virtuous." Various Japanese dictionaries also make a note of this meaning. The current JMdict entry has all of these forms merged together, but it doesn't have a sense for "righteous" or "virtuous." I wrote "yes" in the "Needs attention?" column in the spreadsheet.
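As a rough first pass before manual review, one could flag kanji forms in a merged entry that no sense singles out. The sketch below assumes JMdict-style XML, where `<stagk>` restricts a sense to a kanji form and `<s_inf>` holds a free-text sense note. The `SAMPLE` entry is a made-up illustration, not the actual JMdict entry, and the real file uses DTD entities that this minimal parser setup does not handle.

```python
import xml.etree.ElementTree as ET

# Made-up entry for illustration only; the sequence number, senses, and
# glosses here are NOT copied from JMdict.
SAMPLE = """
<entry>
  <ent_seq>0000000</ent_seq>
  <k_ele><keb>良い</keb></k_ele>
  <k_ele><keb>善い</keb></k_ele>
  <k_ele><keb>好い</keb></k_ele>
  <r_ele><reb>よい</reb></r_ele>
  <sense><gloss>good</gloss></sense>
  <sense><stagk>好い</stagk><gloss>pleasant</gloss></sense>
</entry>
"""


def forms_without_distinction(entry):
    """Return kanji forms that no <stagk> or <s_inf> in the entry mentions."""
    forms = [keb.text for keb in entry.iter("keb")]
    mentioned = set()
    for sense in entry.iter("sense"):
        # Forms named in a sense restriction count as distinguished.
        for stagk in sense.iter("stagk"):
            mentioned.add(stagk.text)
        # So do forms that appear anywhere in a free-text sense note.
        for note in sense.iter("s_inf"):
            text = note.text or ""
            mentioned.update(f for f in forms if f in text)
    return [f for f in forms if f not in mentioned]


if __name__ == "__main__":
    entry = ET.fromstring(SAMPLE.strip())
    print(forms_without_distinction(entry))
```

A form flagged this way is not necessarily a problem, since many forms really are interchangeable, so the output is only a list of candidates for the "Needs attention?" column.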

If you'd like to help, you can go through the spreadsheet and check whether the identified entries adequately explain the differences between the corresponding kanji forms. I granted edit permissions to the email account listed on your GitHub profile, so you should be able to edit the spreadsheet now.

@FragozoLeonardo

Okay! I will check at least two or three a day. I have full-time work plus study, but I can definitely help. Thank you for granting the permissions.
