Description:
I'm encountering a problem with sentence segmentation when integrating spacy_llm components into a spaCy pipeline that is based on en_core_web_trf.
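Roughly, the pipeline is assembled like this (a simplified sketch; the task registry names, labels, and model name below are placeholders based on the spacy-llm docs, not my exact config):

```python
import spacy

# Load the transformer-based pipeline; recent en_core_web_trf releases use
# spacy-curated-transformers, which matches the CuratedTransformer component
# shown in the pipeline listing below.
nlp = spacy.load("en_core_web_trf")

# Add the spacy-llm components on top of the existing components.
# NOTE: tasks, labels, and the model are placeholders; an OPENAI_API_KEY
# environment variable is assumed for the GPT-backed model.
nlp.add_pipe(
    "llm",
    name="llm",
    config={
        "task": {"@llm_tasks": "spacy.NER.v3", "labels": ["PERSON", "ORG"]},
        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
    },
)
nlp.add_pipe(
    "llm",
    name="llm_rel",
    config={
        "task": {"@llm_tasks": "spacy.REL.v1", "labels": ["LivesIn", "Visits"]},
        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
    },
)

print("PIPELINE:", nlp.pipeline)
```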
Observed Behavior:
Environment:
Using the latest versions of spaCy and en_core_web_trf.
Steps to Reproduce:
Troubleshooting:
If I run this code:
PIPELINE: [('transformer', <spacy_curated_transformers.pipeline.transformer.CuratedTransformer object at 0x7f2b36231960>), ('tagger', <spacy.pipeline.tagger.Tagger object at 0x7f2b64d6e1a0>), ('parser', <spacy.pipeline.dep_parser.DependencyParser object at 0x7f2af922fed0>), ('lemmatizer', <spacy.lang.en.lemmatizer.EnglishLemmatizer object at 0x7f2b34e629c0>), ('llm', <spacy_llm.pipeline.llm.LLMWrapper object at 0x7f2b2ae4e2c0>), ('llm_rel', <spacy_llm.pipeline.llm.LLMWrapper object at 0x7f2b52016740>)]
But while processing text, it gives me the following error:
ValueError: [E030] Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: nlp.add_pipe('sentencizer'). Alternatively, add the dependency parser or sentence recognizer, or set sentence boundaries by setting doc[i].is_sent_start.
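The error text itself suggests adding a sentencizer. A minimal sketch of that workaround (placing it before the llm components is my assumption, since E030 is raised when a component reads doc.sents before any boundaries are set):

```python
# Workaround suggested by the E030 message: add an explicit sentencizer so
# sentence boundaries exist before the llm components run.
if "sentencizer" not in nlp.pipe_names:
    nlp.add_pipe("sentencizer", before="llm")

doc = nlp("Alice lives in Berlin. She visits Paris every year.")
print([sent.text for sent in doc.sents])
```

I'm unsure whether this is the intended fix, though, since the dependency parser from en_core_web_trf is already in the pipeline and should be setting sentence boundaries.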
I would appreciate any guidance or assistance in resolving this issue. Thank you!