Are spaCy's English models specific to the USA only? No support for Indian cities or currency units #5443
-
Replies: 4 comments
-
The English spaCy models were trained on OntoNotes. You can have a look at the catalog entry (https://catalog.ldc.upenn.edu/LDC2013T19) to see what sources that includes.
-
Thanks for the swift response. The model is able to identify some Indian cities while failing on others. Doesn't that sound like a bug, if I'm not wrong?
On 16-May-2020, at 1:32 PM, Bram Vanroy wrote:
> The English spaCy models were trained on OntoNotes. You can have a look here <https://catalog.ldc.upenn.edu/LDC2013T19> to see what sources that includes.
-
Don't think so. It can simply mean that those cities were not part of the training data, so the model never learned them. If you have a lot of data you can train your own spaCy model.
-
You should probably try updating the model to recognise the cities you're missing. You have two options for this: updating the statistical model with more training examples, or adding rule-based matching.
Updating the model is always a bit fiddly, because it's hard to reason about the learning process. But the library does support it. Rules written for the Matcher often work really well, and are an option you should consider.
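To make the rule-based option more concrete, here is a minimal sketch using spaCy's `Matcher` on a blank English pipeline. It assumes the spaCy v3 form of `Matcher.add` (a key followed by a list of patterns). The city list and the `find_cities` helper are hypothetical, chosen only for illustration, and are not part of spaCy.

```python
import spacy
from spacy.matcher import Matcher

# A blank English pipeline: tokenizer only, no trained model download needed.
nlp = spacy.blank("en")

# Each pattern is a list of token descriptions; LOWER makes it case-insensitive.
# The cities here are illustrative examples, not an exhaustive gazetteer.
matcher = Matcher(nlp.vocab)
matcher.add(
    "INDIAN_CITY",
    [
        [{"LOWER": "pune"}],
        [{"LOWER": "coimbatore"}],
        [{"LOWER": "navi"}, {"LOWER": "mumbai"}],
    ],
)


def find_cities(text):
    """Return the matched city strings in order of appearance (hypothetical helper)."""
    doc = nlp(text)
    # Each match is (match_id, start_token, end_token); sort by start token.
    matches = sorted(matcher(doc), key=lambda m: m[1])
    return [doc[start:end].text for _match_id, start, end in matches]


if __name__ == "__main__":
    print(find_cities("She moved from Pune to Navi Mumbai last year."))
```

Rules like these only find the names you list, so they complement the statistical model rather than replace it. They are, however, much easier to reason about than a training update.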