How to correctly add lemmas to spaCy's default lookup table? #9331
Hello! I want to add some lemmas to spaCy's default lemma lookup table, but it doesn't work as expected. Here is what I did:

import spacy

nlp = spacy.load('fr_core_news_sm')
nlp.select_pipes(enable=["lemmatizer", "morphologizer"])
lemmatizer = nlp.get_pipe("lemmatizer")
lemma_lookup = lemmatizer.lookups.get_table("lemma_lookup")
lemma_lookup.set("rétro-péritonéale", "rétro-péritonéal")
lemma_lookup.set("rétro-péritonéales", "rétro-péritonéal")
doc = nlp("rétro-péritonéale rétro-péritonéales")
tokens = [(t.text, t.lemma_) for t in doc]
# Output
# [('rétro-péritonéale', 'rétro-péritonéale'), ('rétro-péritonéales', 'rétro-péritonéales')]

Did I do something wrong? Thanks in advance for your response.

spaCy version: 3.0.6
Replies: 1 comment 3 replies
The lookups are only used in the French lemmatizer as a backoff for cases not covered by the rules. There's no detailed documentation for the French lemmatizer, but it should be pretty easy to follow the code to figure out which tables are relevant for the cases you want to modify:
spaCy/spacy/lang/fr/lemmatizer.py
Lines 25 to 80 in 78a88f7
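
To make the "backoff" idea concrete, here is a minimal, self-contained sketch of a rules-first lemmatizer that falls back to a lookup table only when nothing else matched. It is a toy model, not spaCy's actual implementation: the function `toy_lemmatize`, the table names `exceptions`, `rules` and `lookup`, and their contents are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# All tables below are hypothetical toy data, not spaCy's real tables.
exceptions: Dict[str, Dict[str, List[str]]] = {"adj": {}}
rules: Dict[str, List[Tuple[str, str]]] = {"adj": [("es", ""), ("e", "")]}
lookup: Dict[str, str] = {}


def toy_lemmatize(text: str, pos: str) -> str:
    """Simplified rules-first lemmatizer with a lookup backoff."""
    string = text.lower()
    # 1. Per-POS exceptions take priority.
    exc_forms = exceptions.get(pos, {}).get(string, [])
    if exc_forms:
        return exc_forms[0]
    # 2. Per-POS suffix rules: the first matching rule wins.
    for old, new in rules.get(pos, []):
        if string.endswith(old):
            return string[: len(string) - len(old)] + new
    # 3. Backoff: the lookup table is consulted only if nothing above matched.
    return lookup.get(string, string)


# A lookup entry is ignored when a rule already covers the token.
lookup["grandes"] = "grande"
print(toy_lemmatize("grandes", "adj"))  # the "es" rule applies: grand

# The same entry is used when no rule exists for the POS.
print(toy_lemmatize("grandes", "x"))  # grande

# Editing a table that is consulted earlier does change the result.
exceptions["adj"]["grandes"] = ["grande"]
print(toy_lemmatize("grandes", "adj"))  # grande
```

In spaCy itself, the table names available on a loaded lemmatizer can be listed with `lemmatizer.lookups.tables`. Which table to edit for a given token, and in what order the tables are consulted, should be checked against the source lines referenced above rather than inferred from this sketch.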