This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Commit

Merge pull request #209 from panchul/kfserving_onnx
Adding a tutorial of using KFServing with ONNX models
initmahesh authored Oct 22, 2020
2 parents 16ae8dc + 0567cf7 commit 81bc5cf
Showing 11 changed files with 2,306 additions and 0 deletions.
1,581 changes: 1,581 additions & 0 deletions Research/kubeflow-on-azure-stack-lab/04-KFServing/assets/onnx_ml_pb2.py

Large diffs are not rendered by default.

Some generated files are not rendered by default.

@@ -0,0 +1 @@
{"inputs": {"Input3": {"dims": ["1", "1", "28", "28"], "dataType": 1, "rawData": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAPwAAQEAAAAAAAAAAAAAAgEAAAABAAAAAAAAAMEEAAAAAAAAAAAAAYEEAAIA/AAAAAAAAmEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQEEAAAAAAAAAAAAA4EAAAAAAAACAPwAAIEEAAAAAAAAAQAAAAEAAAIBBAAAAAAAAQEAAAEBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA4EAAAABBAAAAAAAAAEEAAAAAAAAAAAAAAEEAAAAAAAAAAAAAmEEAAAAAAAAAAAAAgD8AAKhBAAAAAAAAgEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgD8AAAAAAAAAAAAAgD8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAMEEAAAAAAAAAAAAAIEEAAEBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABQQQAAAAAAAHBBAAAgQQAA0EEAAAhCAACIQQAAmkIAADVDAAAyQwAADEIAAIBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWQwAAfkMAAHpDAAB7QwAAc0MAAHxDAAB8QwAAf0MAADRCAADAQAAAAAAAAKBAAAAAAAAAEEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOBAAACQQgAATUMAAH9DAABuQwAAc0MAAH9DAAB+QwAAe0MAAHhDAABJQwAARkMAAGRCAAAAAAAAmEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWkMAAH9DAABxQwAAf0MAAHlDAAB6QwAAe0MAAHpDAAB/QwAAf0MAAHJDAABgQwAAREIAAAAAAABAQQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgD8AAABAAABAQAAAAEAAAABAAACAPwAAAAAAAIJCAABkQwAAf0MAAH5DAAB0QwAA7kIAAAhCAAAkQgAA3EIAAHpDAAB/QwAAeEMAAPhCAACgQQAAAAAAAAAAAAAAAAAAAAAAAAAAAACAPwAAgD8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEBBAAAAAAAAeEIAAM5CAADiQgAA6kIAAAhCAAAAAAAAAAAAAAAAAABIQwAAdEMAAH9DAAB/QwAAAAAAAEBBAAAAAAAAAAAAAAAAAAAAAAAAAEAAAIA/AAAAAAAAAAAAAAAAAAAAAAAAgD8AAABAAAAAAAAAAAAAAABAAACAQAAAAAAAADBBAAAAAAAA4EAAAMBAAAAAAAAAlkIAAHRDAAB/QwAAf0MAAIBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA/AAAAQAAAQEAAAIBAAACAQAAAAAAAAGBBAAAAAAAAAAAAAAAAAAAQQQAAAAAAAABAAAAAAAAAAAAAAAhCAAB/QwAAf0MAAH1DAAAgQQAAIEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA/AAAAQAAAQEAAAABAAAAAAAAAAAAAAEBAAAAAQAAAAAAAAFBBAAAwQQAAAAAAAAAAAAAAAAAAwEAAAEBBAADGQgAAf0MAAH5DAAB4QwAAcEEAAEBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA/AACAPwAAgD8AAAAAAAAAAAAAAAAAAAAAAACAPwAAgD8AAAAAAAAAAAAAoEAAAMBAAAAwQQAAAAAAAAAAAACIQQAAOEMAAHdDAAB/QwAAc0MAAFBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEBAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAABAAACAQAAAgEAAAAAAAAAwQQAAAAAAAExCAAC8QgAAqkIAAKBAAACgQAAAyEEAAHZDAAB2QwAAf0MAAFBDAAAAAAAAEEEAAAAAAAAAAAAAAAAAAAAAAACAQAAAgD8AAAAAAAAAAAAAgD8AAOBAAABwQQAAmEEAAMZCAADOQgAANkMAAD1DAABtQwAAfUMAAHxDAAA/QwAAPkMAAGNDAABzQwAAfEMAAFJDAACQQQAA4EAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIBAAAAAAAAAAAAAAABCAADaQgAAOUMAAHdDAAB/QwAAckMAAH9DAAB0QwAAf0MAAH9DAAByQwAAe0MAAH9DAABwQwAAf0MAAH9DAABaQwAA+EIAABBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAD+QgAAf0MAAGtDAAB/QwAAf0MAAHdDAABlQwAAVEMAAHJDAAB6QwAAf0MAAH9DAAB4QwAAf0MAAH1DAAB5QwAAf0MAAHNDAAAqQwAAQEEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMEEAAAAAAAAQQQAAfUMAAH9DAAB/QwAAaUMAAEpDAACqQgAAAAAAAFRCAABEQwAAbkMAAH9DAABjQwAAbkMAAA5DAADaQgAAQUMAAH9DAABwQwAAf0MAADRDAAAAAAAAAAAAAAAAAAAAAAAAwEAAAAAAAACwQQAAgD8AAHVDAABzQwAAfkMAAH9DAABZQwAAa0MAAGJDAABVQwAAdEMAAHtDAAB/QwAAb0MAAJpCAAAAAAAAAAAAAKBBAAA2QwAAd0MAAG9DAABzQwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIBAAAAlQwAAe0MAAH9DAAB1QwAAf0MAAHJDAAB9QwAAekMAAH9DAABFQwAA1kIAAGxCAAAAAAAAkEEAAABAAADAQAAAAAAAAFhCAAB/QwAAHkMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwEEAAAAAAAAAAAAAwEAAAAhCAAAnQwAAQkMAADBDAAA3QwAAJEMAADBCAAAAQAAAIEEAAMBAAADAQAAAAAAAAAAAAACgQAAAAAAAAIA/AAAAAAAAYEEAAABAAAAAAAAAAAAAAAAAAAAAAAAAIEEAAAAAAABgQQAAAAAAAEBBAAAAAAAAoEAAAAAAAACAPwAAAAAAAMBAAAAAAAAA4EAAAAAAAAAAAAAAAAAAAABBAAAAAAAAIEEAAAAAAACgQAAAAAAAAAAAAAAgQQAAAAAAAAAAAAAAAAAAAAAAAAAAAABgQQAAAAAAAIBAAAAAAAAAAAAAAMhBA
AAAAAAAAAAAABBBAAAAAAAAAAAAABBBAAAAAAAAMEEAAAAAAACAPwAAAAAAAAAAAAAAQAAAAAAAAAAAAADgQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=="}}, "outputFilter": ["Plus214_Output_0"]}
Binary file not shown.
247 changes: 247 additions & 0 deletions Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx-mosaic.ipynb

Large diffs are not rendered by default.

54 changes: 54 additions & 0 deletions Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx.md
@@ -0,0 +1,54 @@
# KFServing of ONNX models

## Deploying model

The deployed model is an `inferenceservice` CRD. You can create it in the same namespace we created earlier in this lab like so:

$ kubectl create -f onnx.yaml -n kfserving-test
inferenceservice.serving.kubeflow.org/style-sample created

In a few minutes you should see the pods running:

$ kubectl get pods -n kfserving-test
NAME READY STATUS RESTARTS AGE
style-sample-predictor-default-5jk48-deployment-b7c89954c-6s6wn 3/3 Running 0 36s

And, more importantly, the `inferenceservice` in the `READY` state:

$ kubectl get inferenceservice -n kfserving-test
NAME URL READY DEFAULT TRAFFIC CANARY TRAFFIC AGE
style-sample http://style-sample.kfserving-test.example.com/v1/models/style-sample True


You can now [determine your ingress IP and port](https://github.com/kubeflow/kfserving/blob/master/README.md#determine-the-ingress-ip-and-ports):

For KFServing deployment within Kubeflow:

$ export INGRESS_HOST=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ export INGRESS_PORT=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

For other stand-alone KFServing deployments:

$ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Before you run inference on your model, it is useful to define environment variables:

$ export MODEL_NAME=style-sample
$ export SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)

## Inference with the deployed model

From the Kubeflow dashboard of your Azure Stack environment, create a Jupyter server and open the notebook [onnx-mosaic.ipynb](onnx-mosaic.ipynb).
Provide the INGRESS_PORT, INGRESS_HOST, and SERVICE_HOSTNAME values to run the inference.
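
For reference, here is a minimal sketch of the kind of request the notebook issues, assuming the request body has already been prepared; the file name `style-input-payload.bin` is a hypothetical stand-in for whatever payload the notebook builds from your input image:

```python
# Sketch only: the notebook constructs the actual request body; here we assume
# it has been saved to the hypothetical file "style-input-payload.bin".
import os
import requests

ingress_host = os.environ["INGRESS_HOST"]
ingress_port = os.environ["INGRESS_PORT"]
service_hostname = os.environ["SERVICE_HOSTNAME"]
model_name = "style-sample"

url = f"http://{ingress_host}:{ingress_port}/v1/models/{model_name}:predict"

with open("style-input-payload.bin", "rb") as f:  # hypothetical pre-built payload
    payload = f.read()

# The Host header routes the request to the correct inferenceservice.
response = requests.post(url, data=payload, headers={"Host": service_hostname})
print(response.status_code)
```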

For troubleshooting, see the latest version of this sample at [Kubeflow KFServing ONNX](https://github.com/kubeflow/kfserving/tree/master/docs/samples/onnx).

## Links

- https://www.tensorflow.org/guide/saved_model
- https://github.com/kubeflow/kfserving/tree/master/docs/samples/onnx

---

[Back](Readme.md)
9 changes: 9 additions & 0 deletions Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx.yaml
@@ -0,0 +1,9 @@
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "style-sample"
spec:
  default:
    predictor:
      onnx:
        storageUri: "gs://kfserving-examples/onnx/style"
62 changes: 62 additions & 0 deletions Research/kubeflow-on-azure-stack-lab/04-KFServing/onnx_custom.md
@@ -0,0 +1,62 @@
# KFServing of custom ONNX models

For an example, see how you can [Register and Deploy ONNX Model](https://github.com/Azure/MachineLearningNotebooks/blob/2aa7c53b0ce84e67565d77e484987714fdaed36e/how-to-use-azureml/deployment/onnx/onnx-model-register-and-deploy.ipynb).

We will be using `onnx-mnist-model.onnx` from that example. You need to either move it to the PVC in your cluster, as we did in other labs, or
upload it to your `gs://` or `s3://` storage.
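
If you go the `s3://` route, one possible way to upload the model is with `boto3`; the bucket name below is a placeholder, and your AWS credentials are assumed to come from the usual environment variables or config files:

```python
# Sketch only: upload the ONNX model to an S3 bucket so KFServing can pull it.
# "my-models-bucket" is a placeholder; replace it with your own bucket.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    "onnx-mnist-model.onnx",             # local model file from the example above
    "my-models-bucket",                  # placeholder bucket name
    "mnist-onnx/onnx-mnist-model.onnx",  # object key
)
# The storageUri in the InferenceService would then point at
# "s3://my-models-bucket/mnist-onnx".
```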

## Deploying model

The deployed model is an `inferenceservice` CRD. You can create it in the same namespace we created earlier in this lab like so:

$ kubectl create -f onnx_custom.yaml -n kfserving-test
inferenceservice.serving.kubeflow.org/mnist-onnx created

In a few minutes you should see the pods running:

$ kubectl get pods -n kfserving-test
NAME READY STATUS RESTARTS AGE
mnist-onnx-predictor-default-5jk48-deployment-b7c89954c-6s6wn 3/3 Running 0 36s

And, more importantly, the `inferenceservice` in the `READY` state:

$ kubectl get inferenceservice -n kfserving-test
NAME URL READY DEFAULT TRAFFIC CANARY TRAFFIC AGE
mnist-onnx http://mnist-onnx.kfserving-test.example.com/v1/models/mnist-onnx True


You can now [determine your ingress IP and port](https://github.com/kubeflow/kfserving/blob/master/README.md#determine-the-ingress-ip-and-ports):

For KFServing deployment within Kubeflow:

$ export INGRESS_HOST=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ export INGRESS_PORT=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

For other stand-alone KFServing deployments:

$ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Before you run inference on your model, it is useful to define environment variables:

$ export MODEL_NAME=mnist-onnx
$ export SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)

## Inference with the deployed model

You need to convert your input into JSON format; as an example, we provide `onnx-mnist-input.json` to show the expected fields (see the sketch below).
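
Below is a sketch of how a payload in this format could be generated; the random 28x28 array is a stand-in for a real MNIST digit, and the field names match those in `onnx-mnist-input.json`:

```python
# Sketch: build an input JSON in the same shape as onnx-mnist-input.json
# (dims as strings, dataType 1 = float32, rawData = base64 of the raw tensor bytes).
import base64
import json

import numpy as np

image = np.random.rand(1, 1, 28, 28).astype(np.float32)  # stand-in for a real digit

payload = {
    "inputs": {
        "Input3": {
            "dims": [str(d) for d in image.shape],
            "dataType": 1,  # ONNX TensorProto FLOAT
            "rawData": base64.b64encode(image.tobytes()).decode("utf-8"),
        }
    },
    "outputFilter": ["Plus214_Output_0"],
}

with open("onnx-mnist-input.json", "w") as f:
    json.dump(payload, f)
```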

$ export INPUT_PATH=onnx-mnist-input.json
    $ curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d @$INPUT_PATH

In some cases, depending on the model, you may need to post-process the output, as sketched below.
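
For this MNIST model, such post-processing could look like the following, assuming the response mirrors the request format (an `outputs` map with base64-encoded `rawData`) and has been saved to a hypothetical `response.json`:

```python
# Sketch: decode the MNIST prediction from a saved response, assuming the
# output tensor "Plus214_Output_0" is returned with base64-encoded rawData.
import base64
import json

import numpy as np

with open("response.json") as f:  # hypothetical file holding the curl output
    response = json.load(f)

output = response["outputs"]["Plus214_Output_0"]
scores = np.frombuffer(base64.b64decode(output["rawData"]), dtype=np.float32)
print("predicted digit:", int(np.argmax(scores)))
```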

## Links

- https://www.tensorflow.org/guide/saved_model
- https://github.com/kubeflow/kfserving/tree/master/docs/samples/onnx

---

[Back](Readme.md)
@@ -0,0 +1,9 @@
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "mnist-onnx"
spec:
  default:
    predictor:
      onnx:
        storageUri: "pvc://samba-share-claim/mymodels/build_models/mnist-onnx"
