Why would a spaCy blank pipeline tokenize input? #13650
A pipeline is defined as a tokenizer and then zero or more processes that modify the ...
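To illustrate the point that the tokenizer always runs, here is a minimal sketch, assuming spaCy v3 is installed. The sample sentence and the `GPE` label are made up for illustration and are not from the original question. It shows that a blank pipeline has no components but still returns a tokenized `Doc`, and that `Doc.char_span` takes character offsets into the text while the resulting `Span` reports token indices.

```python
import spacy

# A blank pipeline has no components, but it still owns a tokenizer.
nlp = spacy.blank("en")
assert nlp.pipe_names == []

# Illustrative sentence (not from the original post).
text = "Apple is opening an office in Paris."
doc = nlp(text)

# nlp() always returns a Doc (a sequence of tokens), never a plain string.
# The original string is still available as doc.text.
assert doc.text == text
tokens = [token.text for token in doc]

# char_span takes CHARACTER offsets into doc.text ...
start = text.index("Paris")
end = start + len("Paris")
span = doc.char_span(start, end, label="GPE")

# ... but the Span it returns is expressed in TOKEN indices.
print(span.text, span.start, span.end, span.start_char, span.end_char)

# Offsets that do not line up with token boundaries give None by default.
misaligned = doc.char_span(start, end - 1, label="GPE")

# alignment_mode="expand" snaps the offsets outward to whole tokens instead.
snapped = doc.char_span(start, end - 1, label="GPE", alignment_mode="expand")
```

So `span.start` and `span.end` being token positions is expected. The character positions are still there as `span.start_char` and `span.end_char`.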
I'm sorry... this is a counting error. I've either made a mistake in the annotations or the annotation is reporting incorrect numbers. The boundary I've set exceeds the length of the string.
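For anyone hitting the same thing, one way to catch this kind of counting error early is to sanity-check the annotation offsets against the raw string before handing them to `char_span`. This is only a sketch in plain Python: `check_offsets` is a hypothetical helper, not part of spaCy, and the sample text and annotations are invented.

```python
def check_offsets(text, annotations):
    """Split (start, end, label) annotations into in-range and out-of-range.

    Hypothetical helper, not a spaCy function. In-range entries also carry
    the substring they cover, so a miscount is easy to spot by eye.
    """
    in_range, out_of_range = [], []
    for start, end, label in annotations:
        if 0 <= start < end <= len(text):
            in_range.append((start, end, label, text[start:end]))
        else:
            out_of_range.append((start, end, label))
    return in_range, out_of_range


# Illustrative data: the last annotation runs past the end of the string.
text = "Apple is opening an office in Paris."
annotations = [(0, 5, "ORG"), (30, 35, "GPE"), (30, 40, "GPE")]

in_range, out_of_range = check_offsets(text, annotations)
for start, end, label, covered in in_range:
    print(f"{label}: [{start}:{end}] -> {covered!r}")
for start, end, label in out_of_range:
    print(f"{label}: [{start}:{end}] is outside the text (length {len(text)})")
```

Printing the covered substring next to each label makes an off-by-one or an end offset past `len(text)` obvious before any `Span` is created.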
Hello,
I'm initializing a blank pipeline via `nlp = spacy.blank("en")` and feeding it text in order to label some spans. The `Doc` returned by the call to `nlp` contains the tokenized input sentence rather than a plain string. What would a string need to contain to make a blank pipeline do this? This is a problem because when I then call `char_span` on the doc, the indexes into the text seem to refer to tokens instead of the individual characters of the string.
Braden.