Allow intersecting items in NER. #5645
Replies: 5 comments
-
Since the base |
-
No, it's OK if it's not elegant for now. I'd like to maximize the speed of the current solution, because it's already too slow for some use cases. Do you have plans for v3.0 to support intersecting NER entities?
-
Yes! From v.3 onwards, we want to store the entities on the |
-
Similar issue: I wanted to subclass an entity type, e.g., GPE-Nation, GPE-City, etc. Is that possible now? Can you tell me how? If not, is it on the roadmap for v.3? If not, can you add it please? Thanks.
-
Built-in support for subclassing of entity types is not currently on the roadmap. Right now, what I would advise is to perform one round of classification using the supertypes, then filter the data for each type and run a second round of classification (one per supertype) to get the more fine-grained types. In 2.x, you'll have to reset the entity annotations because each Token can only be part of one entity, as described above. From 3.0 onwards, you'd be able to store both, but the principle would remain the same. As it looks like the original question was answered sufficiently, I'll go ahead and close this issue.
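The two-round approach described above could be sketched roughly as follows. This is only an illustration in plain Python, not spaCy code: `coarse_label`, `gpe_subtype`, and `FINE_CLASSIFIERS` are hypothetical stand-ins for trained models, and the lookup rules inside them are made up for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple


def coarse_label(mention: str) -> str:
    """Hypothetical first-round classifier: assigns a supertype."""
    return "GPE" if mention in {"France", "Paris"} else "PER"


def gpe_subtype(mention: str) -> str:
    """Hypothetical second-round classifier, used only for GPE mentions."""
    return "GPE-Nation" if mention == "France" else "GPE-City"


# One second-round classifier per supertype (assumption: only GPE has subtypes).
FINE_CLASSIFIERS: Dict[str, Callable[[str], str]] = {"GPE": gpe_subtype}


def two_round_classify(
    mentions: List[str],
) -> List[Tuple[str, str, Optional[str]]]:
    """Label each mention with a supertype, then refine it if a
    second-round classifier exists for that supertype."""
    results = []
    for mention in mentions:
        coarse = coarse_label(mention)
        refine = FINE_CLASSIFIERS.get(coarse)
        fine = refine(mention) if refine is not None else None
        results.append((mention, coarse, fine))
    return results


labels = two_round_classify(["France", "Paris", "Frunze"])
```

Keeping the coarse and fine labels side by side in the result means neither round overwrites the other, which is the part that needs a workaround in 2.x.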
-
How do I allow entities of different types to intersect in NER?
For example, if I have "Address": "Moscow, Frunze str., 5", then I also have a "GEO/LOC" tag ("Moscow") and a "PER" tag ("Frunze" is a person's last name as well as a street name).
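To make the example concrete, here is a small sketch of what intersecting annotations look like when stored as character offsets. It is plain Python rather than spaCy code; the `Span` tuple and the `overlaps` helper are names made up for this illustration.

```python
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str


def overlaps(a: Span, b: Span) -> bool:
    """True if the two half-open ranges share at least one character."""
    return a.start < b.end and b.start < a.end


def overlapping_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return every pair of spans that intersect."""
    return [
        (a, b)
        for i, a in enumerate(spans)
        for b in spans[i + 1:]
        if overlaps(a, b)
    ]


text = "Moscow, Frunze str., 5"
spans = [
    Span(0, len(text), "Address"),  # the whole string
    Span(0, 6, "GEO/LOC"),          # "Moscow"
    Span(8, 14, "PER"),             # "Frunze"
]

pairs = overlapping_pairs(spans)
```

Here the "Address" span contains both of the other spans, so a format that allows only one entity per token cannot hold all three at once.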
Currently, I have two options:
(Here's a speed comparison between NLP frameworks on Russian datasets: https://github.com/natasha/naeval#ner. There are also POS and DEP comparisons there, and Spacy-RU-NER uses an additional corpus to beat most other models.)