# HuggingBento: A Bento-flavoured distro running Hugging Face Transformers #108
## Conversation
```yaml
model_repository: "KnightsAnalytics/distilbert-base-uncased-finetuned-sst-2-english"
# In: "This meal tastes like old boots."
```
I think it would be better if you could provide the Hugging Face processors with a Bloblang mapping for the input. You could keep it this way, but I'd assume the incoming data is JSON, and the user would have to know to apply a mapping to it / use a branch processor.
This is how the http processor works: it effectively requires you to use it with a branch processor, but I think that is harder for a new user to understand than a Bloblang mapping field.
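To illustrate the difference, here is a rough sketch of the two approaches. This is not code from the PR: `input_mapping` is a hypothetical field name and `review_text` an invented input field; only the processor name and `model_repository` come from the diff above.

```yaml
# Approach 1 (current, assumed): wrap the processor in a branch, like the
# http processor pattern - the user must map the incoming JSON themselves.
pipeline:
  processors:
    - branch:
        request_map: 'root = this.review_text'
        processors:
          - nlp_classify_text:
              model_repository: "KnightsAnalytics/distilbert-base-uncased-finetuned-sst-2-english"
        result_map: 'root.sentiment = this'
---
# Approach 2 (suggested): a Bloblang mapping field on the processor itself.
# NOTE: input_mapping is hypothetical and not defined anywhere in this PR.
pipeline:
  processors:
    - nlp_classify_text:
        model_repository: "KnightsAnalytics/distilbert-base-uncased-finetuned-sst-2-english"
        input_mapping: 'root = this.review_text'
```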
```bash
#!/bin/bash

ONNXRUNTIME_VERSION=${ONNXRUNTIME_VERSION:-"1.18.0"}
DEPENDENCY_DEST=${DEPENDENCY_DEST:-"/usr/lib"}
```
Set this to `/usr/lib/local` on macOS, like the README.md?
The script assumes Linux, where the above would work. Not sure if changing this to the macOS default will confuse people more. Perhaps I'll add a comment mentioning this.
```mdx
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::caution BETA
```
I think I would mark it as experimental, because having it at BETA limits what can be changed outside of a major release - what do you think?
Yeah that makes sense. Will change.
I think there needs to be a way of including these processors somewhere else so that they don't appear on the website, i.e. moved to somewhere like serverless.
What about adding an admonition at the top of the docs saying this is only available in the huggingbento distro? I think trying to generate the docs into a new location is doable, but it could end up being more trouble than it's worth if a text block could suffice. Thoughts?
Please see knights-analytics/hugot#59 for how to remove the ORT dependency for Hugot - would love to see this happen!
## What is this?

Creates a distribution of Bento for use with NLP pipelines. It uses the knights-analytics/hugot library to allow running Hugging Face pipelines with ONNX models in Go.
It introduces three new components:

- `nlp_classify_text` for text classification pipelines (`processor_text_classifier.go`)
- `nlp_classify_tokens` for token and NER classification pipelines (`processor_token_classifier.go`)
- `nlp_extract_features` for feature extraction pipelines (`processor_feature_extractor.go`)

Since there is a lot of config overlap between all of these processors, a single `processor.go` file defines config that is shared amongst all processor types.

All of these use a shared ONNX Runtime session that is atomically initialised upon creation of one or more HuggingBento processors. The session is required to interact with the underlying ONNX Runtime library, and only a single session can be created at a time (which required some work during integration testing to ensure runs were not flaky).
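For a feel of the shared config, a minimal pipeline fragment might look like the sketch below. Only the processor names and the `model_repository` field appear in this PR; the surrounding layout is standard Bento config, and the model path is left as a placeholder.

```yaml
# Hypothetical fragment: each of the three processors accepts the shared
# config defined in processor.go, so switching pipeline types is largely a
# one-line change of processor name (plus a compatible ONNX model).
pipeline:
  processors:
    - nlp_extract_features:
        model_repository: "..." # placeholder for an ONNX feature-extraction model
```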
## Building HuggingBento
Note: the Go build tag `huggingbento` is used to ensure all files in this distro are only compiled when explicitly specified.

### Docker
Run the command below to build a new image locally (without using any cached layers):

```bash
docker build --platform=linux/amd64 -f resources/huggingbento/Dockerfile -t warpstreamlabs/huggingbento:latest --no-cache .
```
### Binary

```bash
make huggingbento
```
## Testing

### Integration Tests
### Steps to manually test

1. Create a `config.yaml` (a hedged sketch follows these steps).
2. Run it with the `KnightsAnalytics/distilbert-base-uncased-finetuned-sst-2-english` model. This will also download the model and relevant files from the huggingface repository.
3. Check that the output is `[{"Label":"NEGATIVE","Score":0.00014481653},{"Label":"POSITIVE","Score":0.99985516}]`.
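The contents of the `config.yaml` aren't shown here, so below is a hedged sketch of what it might contain, following standard Bento config layout; everything except the processor name and `model_repository` is an assumption.

```yaml
# Hypothetical config.yaml for the manual test: read lines from stdin,
# run text classification with the SST-2 model, print the JSON scores.
input:
  stdin: {}

pipeline:
  processors:
    - nlp_classify_text:
        model_repository: "KnightsAnalytics/distilbert-base-uncased-finetuned-sst-2-english"

output:
  stdout: {}
```

With this shape, step 3's expected scores would be the classifier's JSON response to whatever test line is piped in.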
## TODO

- `serverless`
- `docker-compose` to allow for local testing.
- `generate/` directory to allow for generating ONNX runtimes and huggingface bindings for any OS/ARCH combo, like with ollama which has multiple generate scripts.