Commit
Merge pull request #209 from panchul/kfserving_onnx
Adding a tutorial of using KFServing with ONNX models
Showing 11 changed files with 2,306 additions and 0 deletions.
1,581 changes: 1,581 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/assets/onnx_ml_pb2.py
Large diffs are not rendered by default.
215 changes: 215 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/assets/predict_pb2.py
Some generated files are not rendered by default.
1 change: 1 addition & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx-mnist-input.json
@@ -0,0 +1 @@
{"inputs": {"Input3": {"dims": ["1", "1", "28", "28"], "dataType": 1, "rawData": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAPwAAQEAAAAAAAAAAAAAAgEAAAABAAAAAAAAAMEEAAAAAAAAAAAAAYEEAAIA/AAAAAAAAmEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQEEAAAAAAAAAAAAA4EAAAAAAAACAPwAAIEEAAAAAAAAAQAAAAEAAAIBBAAAAAAAAQEAAAEBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA4EAAAABBAAAAAAAAAEEAAAAAAAAAAAAAAEEAAAAAAAAAAAAAmEEAAAAAAAAAAAAAgD8AAKhBAAAAAAAAgEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgD8AAAAAAAAAAAAAgD8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAMEEAAAAAAAAAAAAAIEEAAEBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABQQQAAAAAAAHBBAAAgQQAA0EEAAAhCAACIQQAAmkIAADVDAAAyQwAADEIAAIBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWQwAAfkMAAHpDAAB7QwAAc0MAAHxDAAB8QwAAf0MAADRCAADAQAAAAAAAAKBAAAAAAAAAEEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOBAAACQQgAATUMAAH9DAABuQwAAc0MAAH9DAAB+QwAAe0MAAHhDAABJQwAARkMAAGRCAAAAAAAAmEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWkMAAH9DAABxQwAAf0MAAHlDAAB6QwAAe0MAAHpDAAB/QwAAf0MAAHJDAABgQwAAREIAAAAAAABAQQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgD8AAABAAABAQAAAAEAAAABAAACAPwAAAAAAAIJCAABkQwAAf0MAAH5DAAB0QwAA7kIAAAhCAAAkQgAA3EIAAHpDAAB/QwAAeEMAAPhCAACgQQAAAAAAAAAAAAAAAAAAAAAAAAAAAACAPwAAgD8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEBBAAAAAAAAeEIAAM5CAADiQgAA6kIAAAhCAAAAAAAAAAAAAAAAAABIQwAAdEMAAH9DAAB/QwAAAAAAAEBBAAAAAAAAAAAAAAAAAAAAAAAAAEAAAIA/AAAAAAAAAAAAAAAAAAAAAAAAgD8AAABAAAAAAAAAAAAAAABAAACAQAAAAAAAADBBAAAAAAAA4EAAAMBAAAAAAAAAlkIAAHRDAAB/QwAAf0MAAIBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA/AAAAQAAAQEAAAIBAAACAQAAAAAAAAGBBAAAAAAAAAAAAAAAAAAAQQQAAAAAAAABAAAAAAAAAAAAAAAhCAAB/QwAAf0MAAH1DAAAgQQAAIEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA/AAAAQAAAQEAAAABAAAAAAAAAAAAAAEBAAAAAQAAAAAAAAFBBAAAwQQAAAAAAAAAAAAAAAAAAwEAAAEBBAADGQgAAf0MAAH5DAAB4QwAAcEEAAEBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA/AACAPwAAgD8AAAAAAAAAAAAAAAAAAAAAAACAPwAAgD8AAAAAAAAAAAAAoEAAAMBAAAAwQQAAAAAAAAAAAACIQQAAOEMAAHdDAAB/QwAAc0MAAFBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEBAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAABAAACAQAAAgEAAAAAAAAAwQQAAAAAAAExCAAC8QgAAqkIAAKBAAACgQAAAyEEAAHZDAAB2QwAAf0MAAFBDAAAAAAAAEEEAAAAAAAAAAAAAAAAAAAAAAACAQAAAgD8AAAAAAAAAAAAAgD8AAOBAAABwQQAAmEEAAMZCAADOQgAANkMAAD1DAABtQwAAfUMAAHxDAAA/QwAAPkMAAGNDAABzQwAAfEMAAFJDAACQQQAA4EAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIBAAAAAAAAAAAAAAABCAADaQgAAOUMAAHdDAAB/QwAAckMAAH9DAAB0QwAAf0MAAH9DAAByQwAAe0MAAH9DAABwQwAAf0MAAH9DAABaQwAA+EIAABBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAD+QgAAf0MAAGtDAAB/QwAAf0MAAHdDAABlQwAAVEMAAHJDAAB6QwAAf0MAAH9DAAB4QwAAf0MAAH1DAAB5QwAAf0MAAHNDAAAqQwAAQEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMEEAAAAAAAAQQQAAfUMAAH9DAAB/QwAAaUMAAEpDAACqQgAAAAAAAFRCAABEQwAAbkMAAH9DAABjQwAAbkMAAA5DAADaQgAAQUMAAH9DAABwQwAAf0MAADRDAAAAAAAAAAAAAAAAAAAAAAAAwEAAAAAAAACwQQAAgD8AAHVDAABzQwAAfkMAAH9DAABZQwAAa0MAAGJDAABVQwAAdEMAAHtDAAB/QwAAb0MAAJpCAAAAAAAAAAAAAKBBAAA2QwAAd0MAAG9DAABzQwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIBAAAAlQwAAe0MAAH9DAAB1QwAAf0MAAHJDAAB9QwAAekMAAH9DAABFQwAA1kIAAGxCAAAAAAAAkEEAAABAAADAQAAAAAAAAFhCAAB/QwAAHkMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwEEAAAAAAAAAAAAAwEAAAAhCAAAnQwAAQkMAADBDAAA3QwAAJEMAADBCAAAAQAAAIEEAAMBAAADAQAAAAAAAAAAAAACgQAAAAAAAAIA/AAAAAAAAYEEAAABAAAAAAAAAAAAAAAAAAAAAAAAAIEEAAAAAAABgQQAAAAAAAEBBAAAAAAAAoEAAAAAAAACAPwAAAAAAAMBAAAAAAAAA4EAAAAAAAAAAAAAAAAAAAABBAAAAAAAAIEEAAAAAAACgQAAAAAAAAAAAAAAgQQAAAAAAAAAAAAAAAAAAAAAAAAAAAABgQQAAAAAAAIBAAAAAAAAAAAAAAMhBA
AAAAAAAAAAAABBBAAAAAAAAAAAAABBBAAAAAAAAMEEAAAAAAACAPwAAAAAAAAAAAAAAQAAAAAAAAAAAAADgQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=="}}, "outputFilter": ["Plus214_Output_0"]} |
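For reference, the `rawData` field above is the base64 encoding of the tensor's little-endian float32 bytes (`dataType` 1 is the ONNX TensorProto FLOAT type). A minimal sketch to decode and inspect the file, assuming `numpy` is installed:

    import base64
    import json

    import numpy as np

    # Load the input file shown above.
    with open("onnx-mnist-input.json") as f:
        payload = json.load(f)

    tensor = payload["inputs"]["Input3"]
    dims = [int(d) for d in tensor["dims"]]    # [1, 1, 28, 28]
    raw = base64.b64decode(tensor["rawData"])  # dataType 1 = float32
    image = np.frombuffer(raw, dtype=np.float32).reshape(dims)

    print(image.shape)  # (1, 1, 28, 28): one 28x28 grayscale MNIST digit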
Binary file added: +25.8 KB
Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx-mnist-model.onnx
Binary file not shown.
247 changes: 247 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx-mosaic.ipynb
Large diffs are not rendered by default.
@@ -0,0 +1,54 @@
# KFServing of ONNX models

## Deploying model

The deployed model is an `inferenceservice` CRD. You can create it in the same namespace we created earlier in this lab like so:

    $ kubectl create -f onnx.yaml -n kfserving-test
    inferenceservice.serving.kubeflow.org/style-sample created

In a few minutes you should see the pods running:

    $ kubectl get pods -n kfserving-test
    NAME                                                              READY   STATUS    RESTARTS   AGE
    style-sample-predictor-default-5jk48-deployment-b7c89954c-6s6wn   3/3     Running   0          36s

And, more importantly, the `inferenceservice` in the `READY` state:

    $ kubectl get inferenceservice -n kfserving-test
    NAME           URL                                                                      READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    style-sample   http://style-sample.kfserving-test.example.com/v1/models/style-sample   True

You can now [determine your ingress IP and port](https://github.com/kubeflow/kfserving/blob/master/README.md#determine-the-ingress-ip-and-ports):

For a KFServing deployment within Kubeflow:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

For other stand-alone KFServing deployments:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Before you run inference on your model, it is useful to define environment variables:

    $ export MODEL_NAME=style-sample
    $ export SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)

## Inference with the deployed model

From the Kubeflow dashboard of the Azure Stack environment, create a Jupyter Server and open the notebook [onnx-mosaic.ipynb](onnx-mosaic.ipynb).
Provide the INGRESS_PORT, INGRESS_HOST, and SERVICE_HOSTNAME to do the inferencing; a sketch of the underlying request follows.
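If you prefer to issue the request outside the notebook, the call is plain HTTP against the KFServing v1 predict endpoint. A minimal sketch, assuming the `requests` package is installed and a hypothetical `input.json` payload in the same format the notebook builds:

    import json
    import os

    import requests

    # Values exported in the previous section.
    ingress_host = os.environ["INGRESS_HOST"]
    ingress_port = os.environ["INGRESS_PORT"]
    service_hostname = os.environ["SERVICE_HOSTNAME"]
    model_name = "style-sample"

    url = f"http://{ingress_host}:{ingress_port}/v1/models/{model_name}:predict"

    # Hypothetical payload file in the format the notebook builds.
    with open("input.json") as f:
        payload = json.load(f)

    # KFServing routes on the Host header, so set it explicitly.
    response = requests.post(url, json=payload, headers={"Host": service_hostname})
    response.raise_for_status()
    print(response.json())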
For troubleshooting, see the latest versions at [Kubeflow KFServing ONNX](https://github.com/kubeflow/kfserving/tree/master/docs/samples/onnx).

## Links

- https://www.tensorflow.org/guide/saved_model
- https://github.com/kubeflow/kfserving/tree/master/docs/samples/onnx

---

[Back](Readme.md)
@@ -0,0 +1,9 @@
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "style-sample"
spec:
  default:
    predictor:
      onnx:
        storageUri: "gs://kfserving-examples/onnx/style"
62 changes: 62 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx_custom.md
@@ -0,0 +1,62 @@
# KFServing of custom ONNX models

For example, see how you could [Register and Deploy an ONNX Model](https://github.com/Azure/MachineLearningNotebooks/blob/2aa7c53b0ce84e67565d77e484987714fdaed36e/how-to-use-azureml/deployment/onnx/onnx-model-register-and-deploy.ipynb).

We will be using `onnx-mnist-model.onnx` from that example. You will need to move it to the `pvc` in your cluster as we did in other labs, or upload it to your `gs://` or `s3://` storage.

## Deploying model

The deployed model is an `inferenceservice` CRD. You can create it in the same namespace we created earlier in this lab like so:

    $ kubectl create -f onnx_custom.yaml -n kfserving-test
    inferenceservice.serving.kubeflow.org/mnist-onnx created

In a few minutes you should see the pods running:

    $ kubectl get pods -n kfserving-test
    NAME                                                            READY   STATUS    RESTARTS   AGE
    mnist-onnx-predictor-default-5jk48-deployment-b7c89954c-6s6wn   3/3     Running   0          36s

And, more importantly, the `inferenceservice` in the `READY` state:

    $ kubectl get inferenceservice -n kfserving-test
    NAME         URL                                                                  READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    mnist-onnx   http://mnist-onnx.kfserving-test.example.com/v1/models/mnist-onnx   True

You can now [determine your ingress IP and port](https://github.com/kubeflow/kfserving/blob/master/README.md#determine-the-ingress-ip-and-ports):

For a KFServing deployment within Kubeflow:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

For other stand-alone KFServing deployments:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Before you run inference on your model, it is useful to define environment variables:

    $ export MODEL_NAME=mnist-onnx
    $ export SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)

## Inference with the deployed model

You need to convert your input into JSON format; as an example, we provide `onnx-mnist-input.json` to show the expected fields. A sketch for building such a payload from an image follows.
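If you want to generate such a payload from your own image rather than reuse the provided file, here is a minimal sketch. It assumes `numpy` and `Pillow` are installed and uses a hypothetical `digit.png`; the field names mirror `onnx-mnist-input.json`, where `dataType` 1 is the ONNX float32 type and the sample keeps raw 0-255 pixel intensities:

    import base64
    import json

    import numpy as np
    from PIL import Image

    # Hypothetical input image; any grayscale digit image works.
    img = Image.open("digit.png").convert("L").resize((28, 28))
    pixels = np.asarray(img, dtype=np.float32)  # raw 0-255 intensities

    payload = {
        "inputs": {
            "Input3": {
                "dims": ["1", "1", "28", "28"],
                "dataType": 1,  # ONNX TensorProto.FLOAT
                "rawData": base64.b64encode(pixels.tobytes()).decode("ascii"),
            }
        },
        "outputFilter": ["Plus214_Output_0"],
    }

    with open("onnx-mnist-input.json", "w") as f:
        json.dump(payload, f)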
With the input file in place, send it to the predictor (note the `@`, which tells curl to read the request body from the file):

    $ export INPUT_PATH=onnx-mnist-input.json
    $ curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d @${INPUT_PATH}
In some cases, depending on the model, you may need to post-process the output; a decoding sketch for this model follows.
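A minimal post-processing sketch for this MNIST model, assuming the response was saved (for example by adding `-o response.json` to the curl call) and that the server returns the output tensor in the same base64 `rawData` form as the input:

    import base64
    import json

    import numpy as np

    # Hypothetical file holding the JSON response from the curl call above.
    with open("response.json") as f:
        result = json.load(f)

    # Assumption: outputs mirror the input schema (little-endian float32 rawData).
    tensor = result["outputs"]["Plus214_Output_0"]
    logits = np.frombuffer(base64.b64decode(tensor["rawData"]), dtype=np.float32)

    # The ten logits map to digits 0..9.
    print("predicted digit:", int(np.argmax(logits)))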
## Links

- https://www.tensorflow.org/guide/saved_model
- https://github.com/kubeflow/kfserving/tree/master/docs/samples/onnx

---

[Back](Readme.md)
9 changes: 9 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx_custom.yaml
@@ -0,0 +1,9 @@
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "mnist-onnx"
spec:
  default:
    predictor:
      onnx:
        storageUri: "pvc://samba-share-claim/mymodels/build_models/mnist-onnx"
Binary file added: +122 KB
Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx_input_image.jpg
Binary file not shown.