Add support for NER formats that have XML tags using ENAMEX #5407
Replies: 4 comments
-
This could be implemented as an additional converter. As an alternative, there may be existing converters that convert this format to one of the NER formats already supported by spaCy. There are examples of the supported formats here: https://github.com/explosion/spaCy/tree/master/examples/training/ner_example_data
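To make the conversion idea concrete, here is a minimal sketch of turning ENAMEX-tagged text into plain text plus character-offset entity spans, in the `(text, {"entities": [(start, end, label)]})` style used in spaCy v2 training examples. It is not part of spaCy: the function name `enamex_to_offsets` is hypothetical, and it assumes tags of the form `<ENAMEX TYPE="...">...</ENAMEX>` with no nesting and no XML entity escapes.

```python
import re

# Assumed tag shape: <ENAMEX TYPE="PERSON">John Smith</ENAMEX> (no nesting).
ENAMEX_RE = re.compile(r'<ENAMEX\s+TYPE="([^"]+)"\s*>(.*?)</ENAMEX>', re.DOTALL)


def enamex_to_offsets(marked_up):
    """Strip ENAMEX tags and return (plain_text, {"entities": [...]}).

    Hypothetical helper for illustration; each entity is a
    (start_char, end_char, label) tuple relative to plain_text.
    """
    parts = []        # pieces of the tag-free text
    entities = []     # collected (start, end, label) spans
    plain_len = 0     # length of the tag-free text built so far
    last = 0          # position in marked_up just after the previous tag
    for match in ENAMEX_RE.finditer(marked_up):
        before = marked_up[last:match.start()]
        parts.append(before)
        plain_len += len(before)
        label, span = match.group(1), match.group(2)
        entities.append((plain_len, plain_len + len(span), label))
        parts.append(span)
        plain_len += len(span)
        last = match.end()
    parts.append(marked_up[last:])
    return "".join(parts), {"entities": entities}


example = (
    '<ENAMEX TYPE="PERSON">Ada Lovelace</ENAMEX> worked in '
    '<ENAMEX TYPE="LOCATION">London</ENAMEX>.'
)
text, annotations = enamex_to_offsets(example)
print(text)
print(annotations)
```

Real ENAMEX corpora may also carry other attributes or sibling tags such as TIMEX and NUMEX, so the regular expression would need to be adapted to the data at hand.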
-
I have written a piece of code that works for me (I have not tested it on other datasets) in this repo: https://github.com/afshinrahimi/markup2bio4ner
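For readers who want the general shape of a markup-to-BIO conversion, the following is a rough sketch only. It is not the code from the linked repository; the function name `enamex_to_bio` is hypothetical, tokens are split on whitespace, and tags are assumed to look like `<ENAMEX TYPE="...">...</ENAMEX>` without nesting.

```python
import re

# Matches, in order of preference: an opening ENAMEX tag (label captured),
# a closing ENAMEX tag, or a run of characters with no whitespace and no "<".
TOKEN_RE = re.compile(r'<ENAMEX\s+TYPE="([^"]+)"\s*>|</ENAMEX>|[^\s<]+')


def enamex_to_bio(marked_up):
    """Return a list of (token, tag) pairs using the BIO scheme.

    Hypothetical helper for illustration; a stray "<" that is not part
    of an ENAMEX tag is silently skipped.
    """
    rows = []
    label = None    # label of the currently open ENAMEX element, if any
    first = False   # True until the first token of the element is emitted
    for match in TOKEN_RE.finditer(marked_up):
        piece = match.group(0)
        if match.group(1) is not None:       # opening tag
            label, first = match.group(1), True
        elif piece == "</ENAMEX>":           # closing tag
            label = None
        elif label is None:                  # token outside any entity
            rows.append((piece, "O"))
        else:                                # token inside an entity
            rows.append((piece, ("B-" if first else "I-") + label))
            first = False
    return rows


sample = (
    '<ENAMEX TYPE="PERSON">Ada Lovelace</ENAMEX> worked in '
    '<ENAMEX TYPE="LOCATION">London</ENAMEX>.'
)
for token, tag in enamex_to_bio(sample):
    print(token, tag)
```

Whitespace splitting keeps punctuation attached to neighbouring words in untagged text, so a proper tokenizer would be needed before using the output for training.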
-
I made something similar to @afshinrahimi's converter at https://github.com/kevinlu1248/researchy-api, but it is mainly for parsing HTML first and then annotating it with XML tags.
-
In general, we try to focus the main library on the most common use cases and data formats, and we don't want to clutter it. FYI: from v3 onwards, the data format will work a bit differently, with a different standard format.
-
Feature description
Add support for converting labelled NER files to spaCy format.
One example of such files is:
Could the feature be a custom component or spaCy plugin?
If so, we will tag it as "project idea" so other users can take it on.