Handling readings established by common usage --> automatic typo correction, like the iOS typo correction suggestion bubble. #2

Open
mjhsieh opened this issue Sep 9, 2011 · 7 comments

Contributor

mjhsieh commented Sep 9, 2011

For example, many people who mean to type 「這麼」 end up typing 「這模」, and type 「應該」 as 「因該」.

Contributor

lukhnos commented Oct 22, 2012

This seems to be a feature worth pursuing (and a good differentiation factor "product"-wise too).

Contributor

xatier commented Feb 22, 2023

One workaround is to add the following terms. @lukhnos, what do you think?

應該 ㄧㄣ-ㄍㄞ
這麼 ㄓㄜˋ-ㄇㄛˊ
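To make the workaround concrete, here is a minimal sketch of how such extra entries would behave. The in-memory `phrases` table and the `lookup` helper are hypothetical stand-ins for an input method's phrase table; this is not McBopomofo's actual code, and the two built-in entries are sample data.

```python
# Hypothetical in-memory phrase table: reading -> phrases.
# The "built-in" entries here are sample data using the standard readings.
phrases: dict[str, list[str]] = {
    "ㄧㄥ-ㄍㄞ": ["應該"],
    "ㄓㄜˋ-ㄇㄜ˙": ["這麼"],
}

# User-added entries from the comment above: commonly mistyped readings
# that should still produce the intended phrase.
user_entries = [
    ("應該", "ㄧㄣ-ㄍㄞ"),
    ("這麼", "ㄓㄜˋ-ㄇㄛˊ"),
]

for phrase, reading in user_entries:
    phrases.setdefault(reading, []).append(phrase)


def lookup(reading: str) -> list[str]:
    """Return the phrases registered for a reading (empty list if none)."""
    return phrases.get(reading, [])


print(lookup("ㄧㄣ-ㄍㄞ"))
```

With the extra entries in place, the mistyped reading resolves to the intended phrase exactly as if it were a regular dictionary entry, which is why no suggestion UI is involved.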

Contributor

lukhnos commented Mar 19, 2023

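This is not the same as "showing suggested phrases when the user types a wrong character or a wrong reading", but it can indeed be solved with custom phrases. :)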

Contributor

xatier commented Mar 19, 2023

Perhaps we can introduce some secondary suggestion table for these common typos?
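One way to picture a secondary suggestion table is sketched below. Everything here (the `Candidate` type, the `PRIMARY` and `TYPO_SUGGESTIONS` tables, and the `candidates` function) is hypothetical and only illustrates the idea; it does not reflect how McBopomofo stores or ranks candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    phrase: str
    is_suggestion: bool  # True when the candidate comes from the typo table


# Hypothetical primary table: reading -> phrases with that exact reading.
PRIMARY: dict[str, list[str]] = {
    "ㄧㄥ-ㄍㄞ": ["應該"],
}

# Hypothetical secondary table: commonly mistyped reading -> intended phrases.
TYPO_SUGGESTIONS: dict[str, list[str]] = {
    "ㄧㄣ-ㄍㄞ": ["應該"],
}


def candidates(reading: str) -> list[Candidate]:
    """Return exact matches first, then typo suggestions flagged as such."""
    exact = [Candidate(p, False) for p in PRIMARY.get(reading, [])]
    seen = {c.phrase for c in exact}
    suggested = [
        Candidate(p, True)
        for p in TYPO_SUGGESTIONS.get(reading, [])
        if p not in seen
    ]
    return exact + suggested


print(candidates("ㄧㄣ-ㄍㄞ"))
```

Keeping the flag on each candidate would let the UI present suggestions differently from exact matches instead of silently treating the typo as a valid reading.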

Contributor

xatier commented Jun 9, 2023

@justfont recently shared the repo https://github.com/justfont/The-Write-Right-Font, which includes typo correction suggestions in this spreadsheet:
https://docs.google.com/spreadsheets/d/1ihqTzoNSjh8rqhYGh-8K43cPNFJGxxh7SPEzUpZ3VT4/edit#gid=2031303555

I wrote the following script for my personal user dict. It yields OK-ish results, IMO.

import csv

SRC = "/usr/share/fcitx5/data/mcbopomofo-data.txt"
REPLACE = "20230401.csv"


def read_mcbopomofo_dict() -> dict[str, list[str]]:
    """Read the mcbopomofo dict and return a dict of phrase -> readings"""
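    # the data file contains Bopomofo, so read it explicitly as UTF-8
    with open(SRC, encoding="utf-8") as f:
        lines: list[str] = f.readlines()

    # skip the header; entries start with `ㄅ ㄅ -5.79764489`
    lines = lines[510:]

    d: dict[str, list[str]] = {}
    for line in lines:
        # each entry is `reading phrase score`
        reading, phrase, _ = line.split(" ")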
        if phrase not in d:
            d[phrase] = [reading]
        else:
            d[phrase].append(reading)

    return d


def read_justfont_csv() -> list[list[str]]:
    """Read the justfont csv and return a list of phrases to replace"""
    with open(REPLACE, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter=",", quotechar="|")

        phrases: list[list[str]] = []
        for row in reader:
            if row[0] == "華語" and row[1] == "替換單一字":
                # normalize typo marker
                row[3] = row[3].replace("‘", "'")
                phrases.append(row[2:4])

            if row[0] == "華語" and row[1] != "替換單一字":
                print(f"[-] only single replacement is supported: {row}")

        return phrases


if __name__ == "__main__":
    mcbopomofo_dict: dict[str, list[str]] = read_mcbopomofo_dict()
    typo_phrases: list[list[str]] = read_justfont_csv()
    output: list[str] = []

    for phrase in typo_phrases:
        # correct phrase, misspelled phrase
        correct, wrong = phrase[0], phrase[1]
        if correct not in mcbopomofo_dict:
            print(f"[-] phrase {phrase} not found in mcbopomofo_dict")
            continue

        # otherwise, correct in mcbopomofo_dict
        # ensure marker is present
        if "'" not in wrong:
            # no typo marker, skip
            print(f"[-] typo marker ' not found in {phrase}")
            continue

        marker: int = wrong.find("'")
        if marker <= 0:
            print(f"[-] incorrect typo marker ' position in {phrase}")
            continue

        # the typo character is right before the marker
        idx: int = marker - 1
        typo: str = wrong[idx]

        if typo in mcbopomofo_dict:
            # for all combinations found in the dict
            for correct_reading in mcbopomofo_dict[correct]:
                for typo_reading in mcbopomofo_dict[typo]:
                    # reconstruct the replacement reading
                    replacement: list[str] = []
                    for i, s in enumerate(correct_reading.split("-")):
                        if i == idx:
                            replacement.append(typo_reading)
                        else:
                            replacement.append(s)
                    replacement_phrase: str = "-".join(replacement)

                    if correct_reading != replacement_phrase:
                        print(
                            f"[+] {correct} {correct_reading} -> {correct} {replacement_phrase} ({typo} {typo_reading} {idx})"
                        )
                        output.append(f"{correct} {replacement_phrase}")
        else:
            print(f"[-] typo {typo} not found in mcbopomofo_dict")
            continue

    # finally, dump the output
    for o in output:
        print(o)

output:

[+] 一樣 ㄧˊ-ㄧㄤˋ -> 一樣 ㄧ-ㄧㄤˋ (依 ㄧ 0)
[+] 自己人 ㄗˋ-ㄐㄧˇ-ㄖㄣˊ -> 自己人 ㄗˋ-ㄧˇ-ㄖㄣˊ (已 ㄧˇ 1)
[+] 不能自已 ㄅㄨˋ-ㄋㄥˊ-ㄗˋ-ㄧˇ -> 不能自已 ㄅㄨˋ-ㄋㄥˊ-ㄗˋ-ㄐㄧˇ (己 ㄐㄧˇ 3)
[+] 什麼 ㄕㄜˊ-ㄇㄛ˙ -> 什麼 ㄕㄜˇ-ㄇㄛ˙ (捨 ㄕㄜˇ 0)
[+] 什麼 ㄕㄜˊ-ㄇㄜ˙ -> 什麼 ㄕㄜˇ-ㄇㄜ˙ (捨 ㄕㄜˇ 0)
[+] 什麼 ㄕㄣˊ-ㄇㄛ˙ -> 什麼 ㄕㄜˇ-ㄇㄛ˙ (捨 ㄕㄜˇ 0)
[+] 什麼 ㄕㄣˊ-ㄇㄜ˙ -> 什麼 ㄕㄜˇ-ㄇㄜ˙ (捨 ㄕㄜˇ 0)
[+] 什麼 ㄕㄜˊ-ㄇㄛ˙ -> 什麼 ㄕㄣˊ-ㄇㄛ˙ (神 ㄕㄣˊ 0)
[+] 什麼 ㄕㄜˊ-ㄇㄜ˙ -> 什麼 ㄕㄣˊ-ㄇㄜ˙ (神 ㄕㄣˊ 0)
[+] 生硬 ㄕㄥ-ㄧㄥˋ -> 生硬 ㄕㄣ-ㄧㄥˋ (深 ㄕㄣ 0)
[-] phrase ['不要再打錯字', "不要在'打錯字"] not found in mcbopomofo_dict
[-] phrase ['不要再錯字', "不要在'錯字"] not found in mcbopomofo_dict
[-] phrase ['不會再打錯字', "不會在'打錯字"] not found in mcbopomofo_dict
[-] phrase ['不會再錯字', "不會在'錯字"] not found in mcbopomofo_dict
[-] phrase ['再犯', "在'犯"] not found in mcbopomofo_dict
[-] phrase ['再約', "在'約"] not found in mcbopomofo_dict
[-] phrase ['在日', "再'日"] not found in mcbopomofo_dict
[-] phrase ['在任', "再'任"] not found in mcbopomofo_dict
[-] phrase ['在那', "再'那"] not found in mcbopomofo_dict
[-] phrase ['在官', "再'官"] not found in mcbopomofo_dict
[-] phrase ['在昔', "再'昔"] not found in mcbopomofo_dict
[-] phrase ['在疚', "再'疚"] not found in mcbopomofo_dict
[-] phrase ['在城', "再'城"] not found in mcbopomofo_dict
[-] phrase ['在室', "再'室"] not found in mcbopomofo_dict
[-] phrase ['在苫', "再'苫"] not found in mcbopomofo_dict
[-] phrase ['在哪', "再'哪"] not found in mcbopomofo_dict
[-] phrase ['在席', "再'席"] not found in mcbopomofo_dict
[-] phrase ['在莒', "再'莒"] not found in mcbopomofo_dict
[-] phrase ['在陳', "再'陳"] not found in mcbopomofo_dict
[-] phrase ['在嗎', "再'嗎"] not found in mcbopomofo_dict
[-] phrase ['在幹嘛', "再'幹嘛"] not found in mcbopomofo_dict
[-] phrase ['在嘛', "再'嘛"] not found in mcbopomofo_dict
[-] phrase ['如芒在背', "如芒再'背"] not found in mcbopomofo_dict
[-] phrase ['如魚在水', "如魚再'水"] not found in mcbopomofo_dict
[-] phrase ['如魚在釜', "如魚再'釜"] not found in mcbopomofo_dict
[-] phrase ['如雷在耳', "如雷再'耳"] not found in mcbopomofo_dict
[-] phrase ['安在', "安再'"] not found in mcbopomofo_dict
[-] phrase ['有在', "有再'"] not found in mcbopomofo_dict
[-] phrase ['志在', "志再'"] not found in mcbopomofo_dict
[-] phrase ['芒刺在躬', "芒刺再'躬"] not found in mcbopomofo_dict
[-] phrase ['見在', "見再'"] not found in mcbopomofo_dict
[-] phrase ['病在膏肓', "病再'膏肓"] not found in mcbopomofo_dict
[-] phrase ['病在膏髓', "病再'膏髓"] not found in mcbopomofo_dict
[-] phrase ['晦在', "晦再'"] not found in mcbopomofo_dict
[-] phrase ['猶在', "猶再'"] not found in mcbopomofo_dict
[-] phrase ['黃雀在後', "黃雀再'後"] not found in mcbopomofo_dict
[-] phrase ['意在', "意再'"] not found in mcbopomofo_dict
[-] phrase ['遊魚在鼎', "遊魚再'鼎"] not found in mcbopomofo_dict
[-] phrase ['醉翁之意不在酒', "醉翁之意不再'酒"] not found in mcbopomofo_dict
[+] 名字 ㄇㄧㄥˊ-ㄗˋ -> 名字 ㄇㄧㄥˊ-ㄗˇ (子 ㄗˇ 1)
[+] 名字 ㄇㄧㄥˊ-ㄗˋ -> 名字 ㄇㄧㄥˊ-ㄗ˙ (子 ㄗ˙ 1)
[+] 名字 ㄇㄧㄥˊ-ㄗ˙ -> 名字 ㄇㄧㄥˊ-ㄗˇ (子 ㄗˇ 1)
[+] 類似於 ㄌㄟˋ-ㄙˋ-ㄩˊ -> 類似於 ㄌㄟˋ-ㄕˋ-ㄩˊ (是 ㄕˋ 1)
[+] 莫名其妙 ㄇㄛˋ-ㄇㄧㄥˊ-ㄑㄧˊ-ㄇㄧㄠˋ -> 莫名其妙 ㄇㄛˋ-ㄇㄧㄥˊ-ㄐㄧ-ㄇㄧㄠˋ (奇 ㄐㄧ 2)
[+] 折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄑㄩㄢˋ -> 折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄐㄩㄢˇ (卷 ㄐㄩㄢˇ 2)
[+] 折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄑㄩㄢˋ -> 折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄐㄩㄢˋ (卷 ㄐㄩㄢˋ 2)
[+] 折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄑㄩㄢˋ -> 折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄑㄩㄢˊ (卷 ㄑㄩㄢˊ 2)
[+] 彩券 ㄘㄞˇ-ㄑㄩㄢˋ -> 彩券 ㄘㄞˇ-ㄐㄩㄢˇ (卷 ㄐㄩㄢˇ 1)
[+] 彩券 ㄘㄞˇ-ㄑㄩㄢˋ -> 彩券 ㄘㄞˇ-ㄐㄩㄢˋ (卷 ㄐㄩㄢˋ 1)
[+] 彩券 ㄘㄞˇ-ㄑㄩㄢˋ -> 彩券 ㄘㄞˇ-ㄑㄩㄢˊ (卷 ㄑㄩㄢˊ 1)
[-] phrase ['優惠券', "優惠卷'"] not found in mcbopomofo_dict
[+] 問卷 ㄨㄣˋ-ㄐㄩㄢˋ -> 問卷 ㄨㄣˋ-ㄑㄩㄢˋ (券 ㄑㄩㄢˋ 1)
[+] 上弦月 ㄕㄤˋ-ㄒㄧㄢˊ-ㄩㄝˋ -> 上弦月 ㄕㄤˋ-ㄒㄩㄢˊ-ㄩㄝˋ (玄 ㄒㄩㄢˊ 1)
[+] 下弦月 ㄒㄧㄚˋ-ㄒㄧㄢˊ-ㄩㄝˋ -> 下弦月 ㄒㄧㄚˋ-ㄒㄩㄢˊ-ㄩㄝˋ (玄 ㄒㄩㄢˊ 1)
[+] 弦樂團 ㄒㄧㄢˊ-ㄩㄝˋ-ㄊㄨㄢˊ -> 弦樂團 ㄒㄩㄢˊ-ㄩㄝˋ-ㄊㄨㄢˊ (玄 ㄒㄩㄢˊ 0)
[+] 弦樂器 ㄒㄧㄢˊ-ㄩㄝˋ-ㄑㄧˋ -> 弦樂器 ㄒㄩㄢˊ-ㄩㄝˋ-ㄑㄧˋ (玄 ㄒㄩㄢˊ 0)
[+] 管弦樂 ㄍㄨㄢˇ-ㄒㄧㄢˊ-ㄩㄝˋ -> 管弦樂 ㄍㄨㄢˇ-ㄒㄩㄢˊ-ㄩㄝˋ (玄 ㄒㄩㄢˊ 1)
[+] 目的 ㄇㄨˋ-ㄉㄧˋ -> 目的 ㄇㄨˋ-ㄉㄜ˙ (地 ㄉㄜ˙ 1)
[+] 目的 ㄇㄨˋ-ㄉㄧˋ -> 目的 ㄇㄨˋ-ㄉㄧ˙ (地 ㄉㄧ˙ 1)
[+] 真的 ㄓㄣ-ㄉㄜ˙ -> 真的 ㄓㄣ-ㄉㄜˊ (得 ㄉㄜˊ 1)
[+] 真的 ㄓㄣ-ㄉㄜ˙ -> 真的 ㄓㄣ-ㄉㄟˇ (得 ㄉㄟˇ 1)
[+] 有關係 ㄧㄡˇ-ㄍㄨㄢ-ㄒㄧˋ -> 有關係 ㄧㄡˇ-ㄍㄨㄢ-ㄒㄧ (西 ㄒㄧ 2)
[+] 有關係 ㄧㄡˇ-ㄍㄨㄢ-ㄒㄧˋ -> 有關係 ㄧㄡˇ-ㄍㄨㄢ-ㄒㄧ˙ (西 ㄒㄧ˙ 2)
[+] 沒關係 ㄇㄟˊ-ㄍㄨㄢ-ㄒㄧˋ -> 沒關係 ㄇㄟˊ-ㄍㄨㄢ-ㄒㄧ (西 ㄒㄧ 2)
[+] 沒關係 ㄇㄟˊ-ㄍㄨㄢ-ㄒㄧˋ -> 沒關係 ㄇㄟˊ-ㄍㄨㄢ-ㄒㄧ˙ (西 ㄒㄧ˙ 2)
[+] 重新 ㄔㄨㄥˊ-ㄒㄧㄣ -> 重新 ㄗㄨㄥˋ-ㄒㄧㄣ (從 ㄗㄨㄥˋ 0)
[+] 重新 ㄔㄨㄥˊ-ㄒㄧㄣ -> 重新 ㄘㄨㄥ-ㄒㄧㄣ (從 ㄘㄨㄥ 0)
[+] 重新 ㄔㄨㄥˊ-ㄒㄧㄣ -> 重新 ㄘㄨㄥˊ-ㄒㄧㄣ (從 ㄘㄨㄥˊ 0)
[+] 重複 ㄔㄨㄥˊ-ㄈㄨˋ -> 重複 ㄗㄨㄥˋ-ㄈㄨˋ (從 ㄗㄨㄥˋ 0)
[+] 重複 ㄔㄨㄥˊ-ㄈㄨˋ -> 重複 ㄘㄨㄥ-ㄈㄨˋ (從 ㄘㄨㄥ 0)
[+] 重複 ㄔㄨㄥˊ-ㄈㄨˋ -> 重複 ㄘㄨㄥˊ-ㄈㄨˋ (從 ㄘㄨㄥˊ 0)
[+] 倒是 ㄉㄠˇ-ㄕˋ -> 倒是 ㄉㄠˋ-ㄕˋ (到 ㄉㄠˋ 0)
[-] phrase ['剝蝦', "撥'蝦"] not found in mcbopomofo_dict
[-] phrase ['好衰', "好雖'"] not found in mcbopomofo_dict
[+] 經常 ㄐㄧㄥ-ㄔㄤˊ -> 經常 ㄐㄧㄥ-ㄓㄤˇ (長 ㄓㄤˇ 1)
[+] 經常 ㄐㄧㄥ-ㄔㄤˊ -> 經常 ㄐㄧㄥ-ㄓㄤˋ (長 ㄓㄤˋ 1)
[-] phrase ['跑得快', "跑的'快"] not found in mcbopomofo_dict
[-] phrase ['跑得慢', "跑的'慢"] not found in mcbopomofo_dict
[-] phrase ['說得好', "說的'好"] not found in mcbopomofo_dict
[+] 覺得 ㄐㄩㄝˊ-ㄉㄜˊ -> 覺得 ㄐㄩㄝˊ-ㄉㄜ˙ (的 ㄉㄜ˙ 1)
[+] 覺得 ㄐㄩㄝˊ-ㄉㄜˊ -> 覺得 ㄐㄩㄝˊ-ㄉㄧˊ (的 ㄉㄧˊ 1)
[+] 覺得 ㄐㄩㄝˊ-ㄉㄜˊ -> 覺得 ㄐㄩㄝˊ-ㄉㄧˋ (的 ㄉㄧˋ 1)
[+] 覺得 ㄐㄩㄝˊ-ㄉㄜ˙ -> 覺得 ㄐㄩㄝˊ-ㄉㄧˊ (的 ㄉㄧˊ 1)
[+] 覺得 ㄐㄩㄝˊ-ㄉㄜ˙ -> 覺得 ㄐㄩㄝˊ-ㄉㄧˋ (的 ㄉㄧˋ 1)
[+] 從今以後 ㄘㄨㄥˊ-ㄐㄧㄣ-ㄧˇ-ㄏㄡˋ -> 從今以後 ㄓㄨㄥˋ-ㄐㄧㄣ-ㄧˇ-ㄏㄡˋ (重 ㄓㄨㄥˋ 0)
[+] 從今以後 ㄘㄨㄥˊ-ㄐㄧㄣ-ㄧˇ-ㄏㄡˋ -> 從今以後 ㄔㄨㄥˊ-ㄐㄧㄣ-ㄧˇ-ㄏㄡˋ (重 ㄔㄨㄥˊ 0)
[+] 從頭 ㄘㄨㄥˊ-ㄊㄡˊ -> 從頭 ㄓㄨㄥˋ-ㄊㄡˊ (重 ㄓㄨㄥˋ 0)
[+] 從頭 ㄘㄨㄥˊ-ㄊㄡˊ -> 從頭 ㄔㄨㄥˊ-ㄊㄡˊ (重 ㄔㄨㄥˊ 0)
[+] 轉捩點 ㄓㄨㄢˇ-ㄌㄧㄝˋ-ㄉㄧㄢˇ -> 轉捩點 ㄓㄨㄢˇ-ㄌㄟˋ-ㄉㄧㄢˇ (淚 ㄌㄟˋ 1)
[+] 竟然 ㄐㄧㄥˋ-ㄖㄢˊ -> 竟然 ㄐㄧㄣˋ-ㄖㄢˊ (盡 ㄐㄧㄣˋ 0)
[+] 嵌入 ㄑㄧㄢ-ㄖㄨˋ -> 嵌入 ㄎㄢˇ-ㄖㄨˋ (坎 ㄎㄢˇ 0)
[+] 嵌入 ㄑㄧㄢ-ㄖㄨˋ -> 嵌入 ㄎㄢˇ-ㄖㄨˋ (崁 ㄎㄢˇ 0)
[+] 搜尋 ㄙㄡ-ㄒㄩㄣˊ -> 搜尋 ㄕㄡ-ㄒㄩㄣˊ (收 ㄕㄡ 0)
[+] 不禁 ㄅㄨˋ-ㄐㄧㄣ -> 不禁 ㄅㄨˋ-ㄐㄧㄥ (經 ㄐㄧㄥ 1)
[+] 詢問 ㄒㄩㄣˊ-ㄨㄣˋ -> 詢問 ㄒㄧㄣˊ-ㄨㄣˋ (尋 ㄒㄧㄣˊ 0)
[-] phrase ['中文造詣', "中文造旨'"] not found in mcbopomofo_dict
[-] phrase ['中文造詣', "中文造紙'"] not found in mcbopomofo_dict
[-] phrase ['國文造詣', "國文造紙'"] not found in mcbopomofo_dict
[-] phrase ['語文造詣', "語文造旨'"] not found in mcbopomofo_dict
[-] phrase ['語文造詣', "語文造紙'"] not found in mcbopomofo_dict
[+] 盡量 ㄐㄧㄣˋ-ㄌㄧㄤˋ -> 盡量 ㄐㄧㄥˋ-ㄌㄧㄤˋ (竟 ㄐㄧㄥˋ 0)
[+] 影響 ㄧㄥˇ-ㄒㄧㄤˇ -> 影響 ㄧㄣˇ-ㄒㄧㄤˇ (引 ㄧㄣˇ 0)
[-] phrase ['撨看看', "喬'看看"] not found in mcbopomofo_dict
[+] 播放 ㄅㄛˋ-ㄈㄤˋ -> 播放 ㄅㄛ-ㄈㄤˋ (撥 ㄅㄛ 0)
[-] phrase ['去潛水', "去淺'水"] not found in mcbopomofo_dict
[+] 潛力 ㄑㄧㄢˊ-ㄌㄧˋ -> 潛力 ㄑㄧㄢˇ-ㄌㄧˋ (淺 ㄑㄧㄢˇ 0)
[+] 潛水夫 ㄑㄧㄢˊ-ㄕㄨㄟˇ-ㄈㄨ -> 潛水夫 ㄑㄧㄢˇ-ㄕㄨㄟˇ-ㄈㄨ (淺 ㄑㄧㄢˇ 0)
[-] phrase ['學潛水', "學淺'水"] not found in mcbopomofo_dict
[+] 儘管 ㄐㄧㄣˇ-ㄍㄨㄢˇ -> 儘管 ㄐㄧㄣˋ-ㄍㄨㄢˇ (僅 ㄐㄧㄣˋ 0)
[+] 錄取 ㄌㄨˋ-ㄑㄩˇ -> 錄取 ㄖㄨˋ-ㄑㄩˇ (入 ㄖㄨˋ 0)
[+] 應該 ㄧㄥ-ㄍㄞ -> 應該 ㄧㄣ-ㄍㄞ (因 ㄧㄣ 0)
[-] phrase ['戳到笑點', "搓'到笑點"] not found in mcbopomofo_dict
[-] phrase ['戳破謊言', "搓'破謊言"] not found in mcbopomofo_dict
[+] 荒謬 ㄏㄨㄤ-ㄇㄧㄡˋ -> 荒謬 ㄏㄨㄤ-ㄇㄡˊ (繆 ㄇㄡˊ 1)
[+] 荒謬 ㄏㄨㄤ-ㄇㄧㄡˋ -> 荒謬 ㄏㄨㄤ-ㄇㄧㄠˋ (繆 ㄇㄧㄠˋ 1)
一樣 ㄧ-ㄧㄤˋ
自己人 ㄗˋ-ㄧˇ-ㄖㄣˊ
不能自已 ㄅㄨˋ-ㄋㄥˊ-ㄗˋ-ㄐㄧˇ
什麼 ㄕㄜˇ-ㄇㄛ˙
什麼 ㄕㄜˇ-ㄇㄜ˙
什麼 ㄕㄜˇ-ㄇㄛ˙
什麼 ㄕㄜˇ-ㄇㄜ˙
什麼 ㄕㄣˊ-ㄇㄛ˙
什麼 ㄕㄣˊ-ㄇㄜ˙
生硬 ㄕㄣ-ㄧㄥˋ
名字 ㄇㄧㄥˊ-ㄗˇ
名字 ㄇㄧㄥˊ-ㄗ˙
名字 ㄇㄧㄥˊ-ㄗˇ
類似於 ㄌㄟˋ-ㄕˋ-ㄩˊ
莫名其妙 ㄇㄛˋ-ㄇㄧㄥˊ-ㄐㄧ-ㄇㄧㄠˋ
折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄐㄩㄢˇ
折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄐㄩㄢˋ
折價券 ㄓㄜˊ-ㄐㄧㄚˋ-ㄑㄩㄢˊ
彩券 ㄘㄞˇ-ㄐㄩㄢˇ
彩券 ㄘㄞˇ-ㄐㄩㄢˋ
彩券 ㄘㄞˇ-ㄑㄩㄢˊ
問卷 ㄨㄣˋ-ㄑㄩㄢˋ
上弦月 ㄕㄤˋ-ㄒㄩㄢˊ-ㄩㄝˋ
下弦月 ㄒㄧㄚˋ-ㄒㄩㄢˊ-ㄩㄝˋ
弦樂團 ㄒㄩㄢˊ-ㄩㄝˋ-ㄊㄨㄢˊ
弦樂器 ㄒㄩㄢˊ-ㄩㄝˋ-ㄑㄧˋ
管弦樂 ㄍㄨㄢˇ-ㄒㄩㄢˊ-ㄩㄝˋ
目的 ㄇㄨˋ-ㄉㄜ˙
目的 ㄇㄨˋ-ㄉㄧ˙
真的 ㄓㄣ-ㄉㄜˊ
真的 ㄓㄣ-ㄉㄟˇ
有關係 ㄧㄡˇ-ㄍㄨㄢ-ㄒㄧ
有關係 ㄧㄡˇ-ㄍㄨㄢ-ㄒㄧ˙
沒關係 ㄇㄟˊ-ㄍㄨㄢ-ㄒㄧ
沒關係 ㄇㄟˊ-ㄍㄨㄢ-ㄒㄧ˙
重新 ㄗㄨㄥˋ-ㄒㄧㄣ
重新 ㄘㄨㄥ-ㄒㄧㄣ
重新 ㄘㄨㄥˊ-ㄒㄧㄣ
重複 ㄗㄨㄥˋ-ㄈㄨˋ
重複 ㄘㄨㄥ-ㄈㄨˋ
重複 ㄘㄨㄥˊ-ㄈㄨˋ
倒是 ㄉㄠˋ-ㄕˋ
經常 ㄐㄧㄥ-ㄓㄤˇ
經常 ㄐㄧㄥ-ㄓㄤˋ
覺得 ㄐㄩㄝˊ-ㄉㄜ˙
覺得 ㄐㄩㄝˊ-ㄉㄧˊ
覺得 ㄐㄩㄝˊ-ㄉㄧˋ
覺得 ㄐㄩㄝˊ-ㄉㄧˊ
覺得 ㄐㄩㄝˊ-ㄉㄧˋ
從今以後 ㄓㄨㄥˋ-ㄐㄧㄣ-ㄧˇ-ㄏㄡˋ
從今以後 ㄔㄨㄥˊ-ㄐㄧㄣ-ㄧˇ-ㄏㄡˋ
從頭 ㄓㄨㄥˋ-ㄊㄡˊ
從頭 ㄔㄨㄥˊ-ㄊㄡˊ
轉捩點 ㄓㄨㄢˇ-ㄌㄟˋ-ㄉㄧㄢˇ
竟然 ㄐㄧㄣˋ-ㄖㄢˊ
嵌入 ㄎㄢˇ-ㄖㄨˋ
嵌入 ㄎㄢˇ-ㄖㄨˋ
搜尋 ㄕㄡ-ㄒㄩㄣˊ
不禁 ㄅㄨˋ-ㄐㄧㄥ
詢問 ㄒㄧㄣˊ-ㄨㄣˋ
盡量 ㄐㄧㄥˋ-ㄌㄧㄤˋ
影響 ㄧㄣˇ-ㄒㄧㄤˇ
播放 ㄅㄛ-ㄈㄤˋ
潛力 ㄑㄧㄢˇ-ㄌㄧˋ
潛水夫 ㄑㄧㄢˇ-ㄕㄨㄟˇ-ㄈㄨ
儘管 ㄐㄧㄣˋ-ㄍㄨㄢˇ
錄取 ㄖㄨˋ-ㄑㄩˇ
應該 ㄧㄣ-ㄍㄞ
荒謬 ㄏㄨㄤ-ㄇㄡˊ
荒謬 ㄏㄨㄤ-ㄇㄧㄠˋ
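
The dumped list contains repeated entries (for example, 什麼 ㄕㄜˇ-ㄇㄛ˙ appears twice) because different correct readings can collapse to the same replacement reading. A minimal sketch of an order-preserving de-duplication step that could be applied to the script's `output` list before printing; the `dedupe` helper is hypothetical and not part of the script above:

```python
def dedupe(entries: list[str]) -> list[str]:
    """Drop repeated entries while keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(entries))


# Sample taken from the dumped output above.
sample = [
    "什麼 ㄕㄜˇ-ㄇㄛ˙",
    "什麼 ㄕㄜˇ-ㄇㄜ˙",
    "什麼 ㄕㄜˇ-ㄇㄛ˙",
    "什麼 ㄕㄜˇ-ㄇㄜ˙",
    "什麼 ㄕㄣˊ-ㄇㄛ˙",
]

for entry in dedupe(sample):
    print(entry)
```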

There are quite a few limitations with this alternative-user-dict approach, though, such as 破音字 (characters with multiple readings) and (long) phrases that are not present in McBopomofo's dictionary.

I believe the more proper way is to bring up the typo suggestions in the selection prompt.

Contributor

xatier commented Jun 9, 2023

For phrases available in moedict (but not in McBopomofo's dict), here are the entries with their readings:

Ref: https://www.moedict.tw/%E7%97%85%E5%9C%A8%E8%86%8F%E8%82%93 etc.

病在膏肓 ㄅㄧㄥˋ-ㄗㄞˋ-ㄍㄠ-ㄏㄨㄤ
晦在 ㄏㄨㄟˋ-ㄗㄞˋ
見在 ㄒㄧㄢˋ-ㄗㄞˋ
如芒在背 ㄖㄨˊ-ㄇㄤˊ-ㄗㄞˋ-ㄅㄟˋ
再犯 ㄗㄞˋ-ㄈㄢˋ
在官 ㄗㄞˋ-ㄍㄨㄢ
在疚 ㄗㄞˋ-ㄐㄧㄡˋ
在莒 ㄗㄞˋ-ㄐㄩˇ
在昔 ㄗㄞˋ-ㄒㄧˊ
在席 ㄗㄞˋ-ㄒㄧˊ
在陳 ㄗㄞˋ-ㄔㄣˊ
在城 ㄗㄞˋ-ㄔㄥˊ
在室 ㄗㄞˋ-ㄕˋ
在苫 ㄗㄞˋ-ㄕㄢ
在日 ㄗㄞˋ-ㄖˋ
在任 ㄗㄞˋ-ㄖㄣˋ
醉翁之意不在酒 ㄗㄨㄟˋ-ㄨㄥ-ㄓ-ㄧˋ-ㄅㄨˋ-ㄗㄞˋ-ㄐㄧㄡˇ
安在 ㄢ-ㄗㄞˋ
猶在 ㄧㄡˊ-ㄗㄞˋ
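
If one wanted the script from the earlier comment to also process these phrases, the list could be merged into the phrase -> readings dict it builds. The sketch below assumes the same `dict[str, list[str]]` shape as `read_mcbopomofo_dict()` returns; the `merge_supplement` helper is hypothetical.

```python
def merge_supplement(
    d: dict[str, list[str]], supplement: str
) -> dict[str, list[str]]:
    """Add `phrase reading` lines to a phrase -> readings dict."""
    for line in supplement.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        phrase, reading = parts
        readings = d.setdefault(phrase, [])
        if reading not in readings:
            readings.append(reading)
    return d


# Two entries copied from the list above.
supplement = """\
病在膏肓 ㄅㄧㄥˋ-ㄗㄞˋ-ㄍㄠ-ㄏㄨㄤ
再犯 ㄗㄞˋ-ㄈㄢˋ
"""

merged = merge_supplement({"在": ["ㄗㄞˋ"]}, supplement)
print(merged["再犯"])
```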

Member

tianjianjiang commented Jun 13, 2023

Since we are talking about bopomofo here, this ticket is likely a variant of spell check.

As long as we have a good UI/UX for the system (e.g., asking "do you mean this [foo]?" instead of correcting "fu" to "foo" directly), the algorithm behind it could be something like edit distance or Metaphone (https://en.wikipedia.org/wiki/Metaphone).

If this is something we want to promote, I can find some time to do it.
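
As a rough illustration of the edit-distance idea, the sketch below computes a plain Levenshtein distance over Bopomofo symbols and uses it to propose phrases whose reading is close to what was typed. The `suggest` helper, the `known` table, and the threshold of 1 are assumptions made for this example only; a real implementation would likely need phonetically weighted costs.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution
                )
            )
        prev = curr
    return prev[-1]


def suggest(typed: str, known: dict[str, str], max_distance: int = 1) -> list[str]:
    """Return phrases whose reading is near, but not equal to, the typed one."""
    scored = [
        (edit_distance(typed, reading), phrase)
        for reading, phrase in known.items()
    ]
    return [phrase for dist, phrase in sorted(scored) if 0 < dist <= max_distance]


# Hypothetical reading -> phrase table with sample entries.
known = {
    "ㄧㄥ-ㄍㄞ": "應該",
    "ㄧㄥˇ-ㄒㄧㄤˇ": "影響",
}

print(suggest("ㄧㄣ-ㄍㄞ", known))
```

Exact matches are excluded on purpose, since they are not typos; the result would feed a "do you mean ...?" prompt rather than replace the input directly.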
