How do I get rid of extra accents from enclitics: e.g., Μαῖράν #8
Comments
In all the work I do, I calculate normalised forms and work with those (see https://jktauber.com/2018/07/23/normalisation-column-morphgnt/ for details in the context of the GNT). I have code for handling much of that normalisation. It probably makes sense for me to include at least some of that in greek-accentuation. In the short term, I've put said code in a gist: https://gist.github.com/jtauber/ed07e0fd15ecdc5394755d3e0c9304f8
It's not a big deal, but you have so many usefully packaged routines that
I don't want to miss something you have already done.
My previous approach was exhaustive and thus emphasized recall, but
normalized Greek goes a long way with the zillions of words we have to
index.
(Sent by email in reply to James Tauber's comment above, 8/19/18 2:33 PM.)
I've finally created https://github.com/jtauber/greek-normalisation to package up all the various normalisation stuff I've used over the years.
I don't see a strip/simplify accent routine. Am I missing this?
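For anyone landing here with the same question, here is a minimal sketch of the two operations using only Python's standard `unicodedata` module. It is not the API of greek-accentuation or greek-normalisation. The function names `strip_accents` and `strip_extra_accent` are hypothetical, and the second one assumes that when a word carries two accents, the last one is the acute induced by a following enclitic.

```python
import unicodedata

# Combining marks that count as accents once a word is decomposed (NFD).
ACUTE = "\u0301"
GRAVE = "\u0300"
CIRCUMFLEX = "\u0342"  # Greek perispomeni
ACCENTS = {ACUTE, GRAVE, CIRCUMFLEX}


def strip_accents(word):
    """Remove every accent; breathings, diaeresis and iota subscript are kept."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


def strip_extra_accent(word):
    """Drop the last accent of a word that carries two accents.

    Assumption: the second accent is the extra acute caused by a
    following enclitic. Words with zero or one accent are returned
    unchanged.
    """
    decomposed = unicodedata.normalize("NFD", word)
    positions = [i for i, ch in enumerate(decomposed) if ch in ACCENTS]
    if len(positions) < 2:
        return word
    last = positions[-1]
    trimmed = decomposed[:last] + decomposed[last + 1:]
    return unicodedata.normalize("NFC", trimmed)
```

With this sketch, `strip_extra_accent("Μαῖράν")` gives `Μαῖραν` and `strip_accents("Μαῖράν")` gives `Μαιραν`. Turning a final grave back into an acute, or handling elision and capitalisation, is a separate normalisation step that this sketch does not attempt.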