Improving ranking of Cree entries for English sense (WordNet) based search #1206

aarppe · 2024-12-19T06:28:23Z

Tweak the ranking of the Cree entries within the senses. Currently we use corpus-based lemma frequencies when they exist, but we might also want to factor in the glossary counts and the dictionary-morpheme-based entry frequencies. [This requires an update of the source files with the frequencies, by @aarppe]

Originally posted by @aarppe in #1138 (comment)

The corpus-based lemma frequencies cover only part of the Cree entries in CW and the other dictionary resources, and they are skewed by the corpora that we have. The following options are worth considering:

1. Include the glossary-based rankings. This will ensure that core vocabulary is ranked up (some 3 thousand entries).
2. Include dictionary-based morpheme aggregate rankings. This will ensure that all entries in CW (over 30k) will receive a ranking (which will cover most of the other sources as well).
3. Include the extent of matches of English search terms (the lexical parts remaining after English phrase analysis) with the English definitions of the Cree entries under each sense.
4. Include an improved form of vector similarity between the English search terms and the English definitions of the Cree entries.
5. A ranked combination of 1-4 above.
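
As a rough illustration of the combined option, here is a minimal sketch of a weighted-sum ranking over the signals listed above. Everything in it is an assumption made for illustration: the `Entry` fields, the weights, the log scaling of corpus frequencies, and the idea that the morpheme aggregate and vector similarity arrive pre-normalised to [0, 1]. It is not the project's existing ranking code, and the lemmas are placeholders rather than real Cree entries.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical container for the ranking signals of one Cree entry."""
    lemma: str                          # placeholder identifier
    definition: str                     # English definition text
    corpus_freq: Optional[int] = None   # None when the corpora lack the lemma
    in_glossary: bool = False           # assumed flag for glossary coverage
    morpheme_score: float = 0.0         # assumed pre-normalised to [0, 1]
    vector_sim: float = 0.0             # assumed pre-computed, in [0, 1]


def corpus_score(freq, max_freq):
    """Log-scaled corpus frequency in [0, 1]; 0.0 when no frequency exists."""
    if not freq or max_freq <= 0:
        return 0.0
    return math.log1p(freq) / math.log1p(max_freq)


def definition_overlap(search_terms, definition):
    """Fraction of the English search terms that occur in the definition."""
    terms = {t.lower() for t in search_terms}
    if not terms:
        return 0.0
    words = set(definition.lower().replace(",", " ").replace(";", " ").split())
    return len(terms & words) / len(terms)


# Illustrative weights only; real values would need tuning against judgements.
WEIGHTS = {"corpus": 0.3, "glossary": 0.2, "morpheme": 0.2,
           "overlap": 0.2, "vector": 0.1}


def rank_entries(entries, search_terms, weights=WEIGHTS):
    """Sort the entries under one sense by a weighted sum of the signals."""
    max_freq = max((e.corpus_freq or 0 for e in entries), default=0)

    def score(e):
        return (weights["corpus"] * corpus_score(e.corpus_freq, max_freq)
                + weights["glossary"] * (1.0 if e.in_glossary else 0.0)
                + weights["morpheme"] * e.morpheme_score
                + weights["overlap"] * definition_overlap(search_terms, e.definition)
                + weights["vector"] * e.vector_sim)

    return sorted(entries, key=score, reverse=True)
```

A weighted sum is only one way to combine the signals. A tiered sort (for example glossary membership first, then corpus frequency, then morpheme aggregate) would be an alternative if some signals should strictly dominate others.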

aarppe mentioned this issue Dec 19, 2024

Implement English-to-Cree search, based on WordNet senses #1138

Open

