[FLINK-38857][Model] Introduce a Triton inference module under flink-models #27385
Conversation
> # Triton
>
> The Triton Model Function allows Flink SQL to call [NVIDIA Triton Inference Server](https://github.com/triton-inference-server/server) for real-time model inference tasks.
I have added some comments. Can you ask on the dev list whether this requires a FLIP, please? To me it seems big enough to warrant one.
```sql
CREATE TEMPORARY VIEW movie_reviews(id, movie_name, user_review, actual_sentiment)
AS VALUES
  (1, 'Great Movie', 'This movie was absolutely fantastic! Great acting and storyline.', 'positive'),
  -- … (remaining rows elided in this excerpt)
```
nit: I wonder whether -1, 0 and +1 would be more intuitive values.
> Here's an example `config.pbtxt` for a text classification model:
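A minimal sketch of such a `config.pbtxt`, assuming a Python-backend model with one string input and one string output; the tensor names, data types, and batch size below are illustrative assumptions, not taken from the PR:

```protobuf
name: "text-classification"
backend: "python"
max_batch_size: 16

input [
  {
    name: "input"        # assumed tensor name; must match what model.py reads
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]

output [
  {
    name: "output"       # assumed tensor name; must match what model.py writes
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
```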
I suggest we explicitly say that this should be in the text-classification/ folder.
```
├── text-classification/
│   ├── config.pbtxt
│   └── 1/
│       └── model.py   # or model.onnx, model.plan, etc.
```
In the following example, what file do we use for model.py?
Good question: this refers to the Triton Python backend model file. In this example, model.py is the Python backend implementation located in the Triton model repository, specifically under:

```
text-classification/
├── config.pbtxt
└── 1/
    └── model.py
```

The exact contents of model.py are not relevant to Flink itself. Flink interacts with the model only via the Triton HTTP/gRPC inference API and does not load or execute the model code directly. To avoid ambiguity, I will update the documentation to explicitly state that this file resides in the text-classification/ model directory.
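For context, a minimal sketch of what such a Python-backend model.py can look like. The TritonPythonModel class, the execute(requests) hook, and the triton_python_backend_utils helpers are the Python backend's standard interface; the "input"/"output" tensor names and the keyword heuristic are illustrative assumptions:

```python
import numpy as np
import triton_python_backend_utils as pb_utils  # provided by the Triton runtime


class TritonPythonModel:
    """Toy sentiment classifier; Triton loads this from text-classification/1/model.py."""

    def execute(self, requests):
        responses = []
        for request in requests:
            # Tensor names must match config.pbtxt; "input"/"output" are assumptions here.
            texts = pb_utils.get_input_tensor_by_name(request, "input").as_numpy()
            # TYPE_STRING tensors arrive as bytes; classify each element with a toy rule.
            labels = np.array(
                [b"positive" if b"fantastic" in text.lower() else b"negative"
                 for text in texts.flatten()],
                dtype=object,
            ).reshape(texts.shape)
            output = pb_utils.Tensor("output", labels)
            responses.append(pb_utils.InferenceResponse(output_tensors=[output]))
        return responses
```

Note that this file runs entirely inside Triton; as stated above, Flink treats it as opaque and only talks to the server's inference API.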
What is the purpose of the change

This PR introduces a new optional Triton inference module under flink-models, enabling Flink to invoke an external NVIDIA Triton Inference Server for batch-oriented model inference. The module implements a reusable runtime-level integration based on the existing model provider SPI, allowing users to define Triton-backed models via CREATE MODEL and execute inference through ML_PREDICT without modifying the Flink planner or SQL execution semantics (see the sketch below).
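For illustration, a hedged sketch of how such a model might be declared and invoked from Flink SQL. CREATE MODEL and ML_PREDICT are the existing Flink surface mentioned above; the option keys in the WITH clause ('provider', 'endpoint', 'model-name') are assumptions about this module's configuration, not confirmed by the PR:

```sql
-- Declare a Triton-backed model (option keys are assumed, not from the PR).
CREATE MODEL sentiment_model
INPUT (user_review STRING)
OUTPUT (predicted_sentiment STRING)
WITH (
  'provider' = 'triton',                 -- hypothetical provider identifier
  'endpoint' = 'http://localhost:8000',  -- hypothetical Triton server address
  'model-name' = 'text-classification'   -- hypothetical model-repository name
);

-- Run inference over the movie_reviews view from the docs example.
SELECT id, movie_name, predicted_sentiment
FROM ML_PREDICT(
  TABLE movie_reviews,
  MODEL sentiment_model,
  DESCRIPTOR(user_review)
);
```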
Brief change log

- Adds a flink-model-triton module under flink-models

Verifying this change

Does this pull request potentially affect one of the following parts?

Documentation

- docs/connectors/models/triton.md

Related issues