Polish word forms #44

Open
kurmasz opened this issue Apr 1, 2017 · 1 comment

Comments

kurmasz commented Apr 1, 2017

When analyzing the English Wiktionary, getWordForm() always returns null for IWiktionaryEntry objects with a language of "Polish", even if the corresponding entry has a declension/conjugation table. Is this the expected behavior?

(When trying to analyze the Polish wiktionary, I got "Exception in thread "main" de.tudarmstadt.ukp.jwktl.api.WiktionaryException: Language Polish is not supported", so I assume that behavior is expected.)

Tbsc commented Aug 24, 2017

(Note: I'm not related in any way to this project, and this is just what I understood from looking at the code. And yes, I know April was more than 4 months ago.)

Regarding your first question: word forms in the English Wiktionary are handled by ENWordFormHandler, which can only handle English entries. (There is ENNonEngWordFormHandler for non-English entries, but it only handles noun genders, not word forms.) In theory the library could support other languages, but as a user I'm pretty sure you can't add that support yourself, because the library doesn't let you register external handlers and only gives you the parsed values, with no way of getting the wikitext. (I ran into the same problem, and it's frustrating.)

And yes, only the English, German and Russian Wiktionaries are supported (see onSiteInfoComplete() in WiktionaryArticleParser, which only handles those Wiktionary languages).

Edit: I just realized why external handlers aren't really possible: the parser saves only the parsed values to the database, so it's not that the wikitext is hidden, but rather that it isn't stored at all. Getting at it would require modifying the library to write those values to the database and then rebuilding the database.
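
To make the limitation concrete, here is a toy model of what I think is going on. It is only a sketch based on my reading described above, not JWKTL code: WordFormDispatchSketch, StoredEntry, WordFormHandler and the pipe-separated "wikitext" are all made-up names and formats. The point is that handlers are picked from a fixed table by entry language, and only the parsed result is kept.

```java
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are JWKTL classes or methods.
public class WordFormDispatchSketch {

    /** What ends up in the database: parsed values only, never the raw wikitext. */
    static final class StoredEntry {
        final String word;
        final String language;
        final List<String> wordForms; // null when no handler ran for this language

        StoredEntry(String word, String language, List<String> wordForms) {
            this.word = word;
            this.language = language;
            this.wordForms = wordForms;
        }
    }

    /** A handler turns the (hypothetical) wikitext of one entry into word forms. */
    interface WordFormHandler {
        List<String> extract(String wikitext);
    }

    // The handler table is fixed inside the class; callers cannot register more.
    private static final Map<String, WordFormHandler> HANDLERS = Map.of(
        "English", wikitext -> List.of(wikitext.split("\\|"))
    );

    /** Parses one entry; the wikitext argument is discarded after parsing. */
    static StoredEntry parse(String word, String language, String wikitext) {
        WordFormHandler handler = HANDLERS.get(language);
        List<String> forms = (handler == null) ? null : handler.extract(wikitext);
        return new StoredEntry(word, language, forms);
    }

    public static void main(String[] args) {
        StoredEntry en = parse("go", "English", "go|goes|went|gone");
        StoredEntry pl = parse("dom", "Polish", "dom|domu|domowi");
        System.out.println(en.wordForms); // prints [go, goes, went, gone]
        System.out.println(pl.wordForms); // prints null
    }
}
```

In this model the Polish entry has a perfectly good table in its source text, but since no handler is registered for "Polish" and StoredEntry keeps no wikitext, there is nothing left to parse after the fact. That is why I think the fix has to happen inside the library's parsing step.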
