Using gradle: compile 'com.github.dkpro:dkpro-jwktl:56499bdaab' to obtain latest snapshot.
I'm analyzing text from various sources, and some Russian (I presume) text is in my test data. The call wkt.getEntriesForWord("Статтю", true); hangs as if it were in an infinite loop.
I was expecting an empty list of entries, not an app hang.
Example term: Статтю
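Until the hang itself is resolved, one way for a caller to get the expected "empty list instead of a hang" behavior is to run the lookup under a timeout. The following is only a sketch using the standard library; the class name GuardedLookup and the method withTimeout are hypothetical and are not part of JWKTL.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of JWKTL): bounds how long a lookup may run. */
final class GuardedLookup {
    private GuardedLookup() {}

    /** Runs the lookup on a worker thread; returns an empty list if it does not finish in time. */
    static <T> List<T> withTimeout(Callable<List<T>> lookup, long timeoutMillis) {
        // Daemon thread, so a lookup that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-lookup");
            t.setDaemon(true);
            return t;
        });
        Future<List<T>> future = executor.submit(lookup);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt is best effort: a busy loop that ignores interrupts keeps running.
            future.cancel(true);
            return Collections.emptyList();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return Collections.emptyList();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Lookup failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With a Wiktionary instance wkt, the call would look like GuardedLookup.withTimeout(() -> wkt.getEntriesForWord(term, true), 2000). The timeout only unblocks the caller; it does not stop a worker that never checks for interruption, so this hides the symptom rather than fixing the cause.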
Not really an infinite loop, but definitely unexpected behavior. As a quick fix, you can remove the boolean param (i.e., use wkt.getEntriesForWord("Статтю"); instead). Normalization of titles is not supported for non-Latin alphabets, and this causes the same issue for other entries as well, e.g., Russian ones. I'll see if I can solve the actual issue in one of the later versions. Please report back if removing the normalization param helps for you.
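Since the problem is tied to non-Latin alphabets, a caller who wants normalization for Latin-script terms could request it only when the term is entirely Latin. This is a minimal sketch under that assumption; NormalizationGuard and isLatinOnly are hypothetical names, not JWKTL API.

```java
import java.lang.Character.UnicodeScript;

/** Hypothetical helper (not part of JWKTL): decides whether to request title normalization. */
final class NormalizationGuard {
    private NormalizationGuard() {}

    /**
     * Returns true if every letter in the word belongs to the Latin script.
     * Digits, punctuation and combining marks are ignored; an empty string yields true.
     */
    static boolean isLatinOnly(String word) {
        return word.codePoints()
                .filter(Character::isLetter)
                .allMatch(cp -> UnicodeScript.of(cp) == UnicodeScript.LATIN);
    }
}
```

The lookup would then become wkt.getEntriesForWord(term, NormalizationGuard.isLatinOnly(term)), which passes false for a Cyrillic term such as "Статтю" and true for a term such as "café".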
Since we really do want normalization, we have instead added a step
that runs the terms through IBM's icu4j to generate "ascii/ansi"
transliterations for any term whose character set falls outside the
normal English/Western European range and that was not already pruned
by our earlier language check. So far so good.
We've also worked out a process (via Gradle) that automatically builds
the DB whenever an updated Wiktionary dump is available and packs it
into a jar, together with a utility routine that extracts the DB to a
temp folder when it is needed to create a Wiktionary instance.
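For illustration, an extraction routine of that kind might look like the sketch below. It assumes the parsed DB is stored in the jar as a single zip resource; that packaging, the resource name used in the usage note, and the class name DbExtractor are assumptions for this example, not a description of our actual build.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: unpacks a zipped DB directory into a fresh temp folder. */
final class DbExtractor {
    private DbExtractor() {}

    /** Extracts every entry of the zip stream into a new temp directory and returns that directory. */
    static Path extractToTempDir(InputStream zippedDb) throws IOException {
        Path target = Files.createTempDirectory("jwktl-db-").toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(zippedDb)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = target.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would land outside the temp directory.
                if (!out.startsWith(target)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zip, out); // reads up to the end of the current entry only
                }
            }
        }
        return target;
    }
}
```

A caller would pass in something like SomeClass.class.getResourceAsStream("/wiktionary-db.zip") (a made-up resource name) and hand the returned directory, converted with toFile(), to JWKTL when opening the edition. Cleaning up the temp directory afterwards is left out of the sketch.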
On 8/19/19 11:30 AM, Christian M. Meyer wrote:
Not really an infinite loop, but definitely unexpected behavior. As a
quick-fix, you can remove the boolean param (i.e., use
wkt.getEntriesForWord("Статтю"); instead). Normalization of titles is
not supported for non-Latin alphabets and causes this issue also for
other, e.g., Russian entries. I'll see if I can solve the actual issue
in one of the later versions. Please report back if removing the
normalization param helps for you.
Using English version.
Latest wiktionary downloaded and parsed.