spacy-huggingface-pipelines: Use pretrained Transformer models for text and token classification
#12591
adrianeboyd announced in News & Announcements
The new spacy-huggingface-pipelines package provides wrappers for Hugging Face Transformers pipelines for text and token classification, for inference only. As of Transformers v4.28, pipelines provide all the functionality needed for simple spaCy wrappers.

Installation
Usage
Text classification with hf_text_pipe:
Token classification with hf_token_pipe:
See more config settings and examples in the package README.
Search for text classification and token classification models on the Hugging Face Hub.
Notes
hf_text_pipe and hf_token_pipe only support inference, not training or fine-tuning.

For texts longer than the model max length, see the package README for details on how long texts are handled.
The transformer models are always loaded from the Transformers cache directory or downloaded from the Hugging Face Hub, not from the directory or package saved with nlp.to_disk or spacy package. The model data is not included in the spaCy model directory.

This means that you need to set up your Transformers cache for offline use if you have limited internet access, but it has the advantage that you can use the same models in different pipelines and different Python environments without having to duplicate the data on disk.
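For the offline case, one option is the standard offline switches that huggingface_hub and Transformers honor: once a model has been downloaded into the cache, these environment variables tell the libraries to use only the local cache and never hit the network:

```shell
# Force huggingface_hub and transformers to load models from the
# local cache only, without attempting any network access:
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```

Models have to be downloaded (or copied into the cache directory) at least once while online for this to work.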