This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Commit ea85c54

Merge pull request #201 from panchul/kfserving_tf
Adding KFServing Tensorflow demo to kubeflow-lab section
2 parents 5423474 + 276d079

File tree: 10 files changed, +488 -0 lines changed
Lines changed: 11 additions & 0 deletions

# models, data, and intermediate files for tensorflow demo
build_models
# we create these files with our script
mybowtie.npy
mybowtie2.npy
resized_image.jpg

# models, data, and intermediate files for the pytorch demo
data
model.pt

(binary file added, 59.9 KB; preview not shown)
Lines changed: 230 additions & 0 deletions

# KFServing TensorFlow models

## Building a model and running inference on it

Before we plug our models into KFServing, we can create our own model, which we then need to serialize. As an example, we build a model based on Keras's `MobileNet`, serialize it, and show how to load it and create our own `.npy` input file for inferencing with the CLI.

See the [TensorFlow documentation about saving and loading models](https://www.tensorflow.org/guide/saved_model) for more details. This is how it works (we skipped a few lines of output for clarity):

    $ python3 tensorflow_custom_model.py
    2020-10-07 20:32:09.913482: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
    TensorFlow version: 2.3.1
    pciBusID: 0001:00:00.0 name: Tesla K80 computeCapability: 3.7
    Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg
    65536/61306 [================================] - 0s 0us/step
    Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt
    16384/10484 [==============================================] - 0s 0us/step
    Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_1_0_224_tf.h5
    17227776/17225924 [==============================] - 0s 0us/step
    2020-10-07 20:32:20.189468: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
    2020-10-07 20:32:27.640408: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
    Result for test image: [458 835 907 452 544]
    ['bow tie' 'suit' 'Windsor tie' 'bolo tie' 'dumbbell']
    Result before saving: [653 458 835 440 716]
    ['military uniform' 'bow tie' 'suit' 'bearskin' 'pickelhaube']
    mobilenet_save_path is build_models/mobilenet/1/
    infer. structured_outputs: {'predictions': TensorSpec(shape=(None, 1000), dtype=tf.float32, name='predictions')}
    Result after saving and loading: ['military uniform' 'bow tie' 'suit' 'bearskin' 'pickelhaube']

You can now see the metadata of the saved model:

    $ saved_model_cli show --dir ./build_models/mobilenet/1/ --tag_set serve
    2020-10-07 20:41:58.110146: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
    The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
    SignatureDef key: "__saved_model_init_op"
    SignatureDef key: "serving_default"

And you can see the details of the inputs and outputs:

    $ saved_model_cli show --dir ./build_models/mobilenet/1/ --tag_set 'serve' --signature_def serving_default
    2020-10-07 21:00:22.577704: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
    The given SavedModel SignatureDef contains the following input(s):
      inputs['input_1'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1, 224, 224, 3)
          name: serving_default_input_1:0
    The given SavedModel SignatureDef contains the following output(s):
      outputs['predictions'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1, 1000)
          name: StatefulPartitionedCall:0
    Method name is: tensorflow/serving/predict

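The signature above expects `input_1` with shape `(-1, 224, 224, 3)` and dtype `DT_FLOAT`. If you want to sanity-check the `.npy` file produced by `tensorflow_custom_model.py` before feeding it to the CLI, a minimal check (nothing here is KFServing-specific):

```python
import numpy as np

# mybowtie.npy is written by tensorflow_custom_model.py (included later in this commit)
arr = np.load("mybowtie.npy")
print(arr.shape, arr.dtype)   # expected: (1, 224, 224, 3) float32
```
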
You can run this model using the CLI like so:

    $ saved_model_cli run --dir ./build_models/mobilenet/1/ --tag_set 'serve' --signature_def serving_default --inputs "input_1=mybowtie.npy"
    2020-10-07 22:09:05.114650: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
    2020-10-07 22:09:06.590719: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
    ...
    INFO:tensorflow:Restoring parameters from ./build_models/mobilenet/1/variables/variables
    ...
    [[8.14447232e-10 1.08833642e-09 3.05714520e-09 3.76422588e-10
    3.02461900e-09 1.86771834e-10 9.58564408e-11 2.23442289e-11
    ...
    1.03344687e-10 1.75437484e-10 5.70104797e-10 2.57304542e-08
    9.80437953e-10 6.50071597e-09 7.63548336e-10 2.22535121e-07
    2.11364273e-10 1.93390726e-09 1.75153725e-09 3.15297433e-09
    3.13854276e-10 1.25729163e-10 1.90465019e-10 2.17428101e-06
    3.23613469e-09 6.73297507e-09 1.32053316e-07 7.10744175e-08
    1.44242229e-09 9.99776065e-01 1.08120508e-08 2.66501246e-07 <------------- here is the bow tie, 0.999776, index 458
    5.10951594e-11 2.09783249e-08 2.71486139e-10 4.61643097e-08
    4.05468148e-09 1.06352536e-06 1.00858000e-09 6.74229839e-11
    2.58849914e-10 2.56112132e-09 3.45258333e-09 2.42699444e-10
    6.64567623e-10 9.48480761e-09 8.73305410e-08 1.71701653e-10
    4.04795251e-12 2.47852516e-09 5.37987823e-08 1.00287258e-10
    ...
    1.32482428e-11 6.76930595e-11 7.33395428e-11 1.21903876e-10
    8.87640048e-12 1.07872808e-10 5.34377209e-10 1.29179213e-07]]
    ...

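Instead of reading the raw vector, you can decode the prediction in Python using the same labels file the script uses; a minimal sketch that loads the SavedModel produced above (the same calls appear in `tensorflow_custom_model.py`):

```python
import numpy as np
import tensorflow as tf

# Load the exported SavedModel and its "serving_default" signature.
loaded = tf.saved_model.load("./build_models/mobilenet/1/")
infer = loaded.signatures["serving_default"]

# Run the pre-processed input we saved earlier and pick the best class.
x = np.load("mybowtie.npy")
scores = infer(tf.constant(x))["predictions"].numpy()

labels_path = tf.keras.utils.get_file(
    "ImageNetLabels.txt",
    "https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt")
imagenet_labels = np.array(open(labels_path).read().splitlines())

# +1 because the labels file has a leading 'background' entry.
print(imagenet_labels[np.argmax(scores) + 1])   # -> 'bow tie'
```
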
## Deploying the model

To deploy a model, you need to create an `InferenceService`:

    $ kubectl create -f tensorflow_flowers.yaml -n kfserving-test
    inferenceservice.serving.kubeflow.org/flowers-sample configured

Give it some time to create the pods. You should eventually see it in the `READY` state, with a URL:

    $ kubectl get inferenceservices -n kfserving-test
    NAME             READY   URL                                          DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    flowers-sample   True    http://flowers-sample.default.example.com   90                10               48s

Now you can identify the host and port to make requests to; it [depends on your environment](https://github.com/kubeflow/kfserving).

For stand-alone KFServing using minikube:

    $ export INGRESS_HOST=$(minikube ip)
    $ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

For a KFServing deployment within Kubeflow:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

For other stand-alone KFServing deployments:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

We also need to define the model we want to interact with (for the `curl` request we compose later):

    $ export MODEL_NAME=flowers-sample
    $ export INPUT_PATH=@./tensorflow_input.json
    $ export SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)

Now do the inferencing itself:

    $ curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
    * Trying 12.34.56.78...
    * Connected to 12.34.56.78 (12.34.56.78) port 80 (#0)
    > POST /v1/models/flowers-sample:predict HTTP/1.1
    > Host: flowers-sample.kfserving-test.example.com
    > User-Agent: curl/7.47.0
    > Accept: */*
    > Content-Length: 16201
    > Content-Type: application/x-www-form-urlencoded
    > Expect: 100-continue
    >
    < HTTP/1.1 100 Continue
    * We are completely uploaded and fine
    < HTTP/1.1 200 OK
    < content-length: 221
    < content-type: application/json
    < date: Tue, 06 Oct 2020 17:24:59 GMT
    < x-envoy-upstream-service-time: 331
    < server: istio-envoy
    <
    {
        "predictions": [
            {
                "scores": [0.999114931, 9.20987877e-05, 0.000136786475, 0.00033725836, 0.000300533167, 1.84813962e-05],
                "prediction": 0,
                "key": " 1"
            }
        ]
    * Connection #0 to host 12.34.56.78 left intact
    }

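The same request can be sent from Python instead of `curl`; a minimal sketch using `requests`, assuming the `INGRESS_HOST`, `INGRESS_PORT`, and `SERVICE_HOSTNAME` variables are exported as above and `tensorflow_input.json` is in the current directory:

```python
import json
import os

import requests

# Same call as the curl command above: POST the JSON payload to the
# KFServing predict endpoint, routing by the Host header.
host = os.environ["INGRESS_HOST"]
port = os.environ["INGRESS_PORT"]
service_hostname = os.environ["SERVICE_HOSTNAME"]
model_name = "flowers-sample"

with open("tensorflow_input.json") as f:
    payload = json.load(f)

resp = requests.post(
    f"http://{host}:{port}/v1/models/{model_name}:predict",
    headers={"Host": service_hostname},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["predictions"])
```
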
## Deploying a custom model

The prepared sample model is stored at `gs://kfserving-samples/models/tensorflow/flowers`. A custom model you build yourself also needs to be put into a location the InferenceService CRD understands, which is, at the time of writing, one of: `gs://`, `s3://`, or `pvc://`.

For a detached cluster, you can create local storage using the `persistence.yaml` we provide in the `sbin` folder, and deploy it in the `kfserving-test` namespace like so:

    $ kubectl create -f persistence.yaml -n kfserving-test
    storageclass.storage.k8s.io/local-storage created
    persistentvolume/samba-share-volume created
    persistentvolumeclaim/samba-share-claim created

You should see the volume claim:

    $ kubectl get pvc -n kfserving-test
    NAME                STATUS   VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS    AGE
    samba-share-claim   Bound    samba-share-volume   2Gi        RWX            local-storage   16h

And the volume itself:

    $ kubectl get pv -n kfserving-test
    NAME                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                              STORAGECLASS    REASON   AGE
    ...
    samba-share-volume   2Gi        RWX            Retain           Bound    kfserving-test/samba-share-claim   local-storage            16h

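The `persistence.yaml` itself lives in the `sbin` folder and is not reproduced in this commit; as a rough sketch, it creates resources along these lines (the names, size, and access mode match the output above, while the `hostPath` and other details are assumptions you would adapt to your cluster):

```yaml
# Sketch only -- the actual sbin/persistence.yaml may differ.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: samba-share-volume
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  hostPath:
    path: /mnt/samba-share   # assumption: a node-local path backing the share
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: samba-share-claim
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
```
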
Then you can copy your model from `build_models` to wherever your PVC points (one way to do the copy is sketched after the manifest below), and reference that location in your deployment .yaml, like so:

    $ cat tensorflow_custom_model.yaml
    apiVersion: "serving.kubeflow.org/v1alpha2"
    kind: "InferenceService"
    metadata:
      name: "custom-model"
    spec:
      default:
        predictor:
          tensorflow:
            #storageUri: "gs://rollingstone/mobilenet"
            storageUri: "pvc://samba-share-claim/mymodels/build_models/mobilenet"

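For the copy step itself, one approach is a short-lived pod that mounts the claim, then `kubectl cp` into it. A sketch (the pod name, image, and mount path are assumptions; the claim name and target path match the `storageUri` above):

```yaml
# pvc-access.yaml -- throwaway pod used only to copy files onto the claim (sketch)
apiVersion: v1
kind: Pod
metadata:
  name: pvc-access
spec:
  containers:
    - name: shell
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: models
          mountPath: /mnt/models
  volumes:
    - name: models
      persistentVolumeClaim:
        claimName: samba-share-claim
```

Then, for example:

    $ kubectl apply -f pvc-access.yaml -n kfserving-test
    $ kubectl exec pvc-access -n kfserving-test -- mkdir -p /mnt/models/mymodels/build_models
    $ kubectl cp build_models/mobilenet kfserving-test/pvc-access:/mnt/models/mymodels/build_models/mobilenet
    $ kubectl delete pod pvc-access -n kfserving-test
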
## Inferencing using the custom model

See the [TensorFlow REST API documentation](https://www.tensorflow.org/tfx/serving/api_rest) on constructing and interpreting the JSON input/output.

For example, for the custom model we created earlier, we would need to define the instances with `input_1`.
We can feed the 3-dimensional array of pixel values like so (see the script `tensorflow_web_infer.py` for implementation suggestions):

    {
      "instances": [
        {"input_1": [[
          [25, 28, 82], [29, 31, 91], [27, 28, 95], [28, 27, 96],
          ...
          [13, 12, 18]
        ]]
        }
      ]
    }

And we should get back the predictions:

    {
      "predictions": [[7.41982103e-06, 0.00287958328, 0.000219230162, 4.96962894e-05,
      ...
      ]]
    }

It is up to the user of the API to pre-process the input and to post-process the results according to the application's needs.
See `tensorflow_web_infer.py` for an example of how to pick the right index and get the label for your model; a sketch of such a client follows.

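`tensorflow_web_infer.py` is not part of this commit; a minimal sketch of what such a client could look like, reusing the MobileNet preprocessing from `tensorflow_custom_model.py` (the image file name is an assumption, and `INGRESS_HOST`/`INGRESS_PORT`/`SERVICE_HOSTNAME` are assumed to be exported for the `custom-model` InferenceService):

```python
import json
import os

import numpy as np
import requests
import tensorflow as tf

# Pre-process the image the same way tensorflow_custom_model.py does:
# resize to 224x224 and apply the MobileNet input scaling.
img = tf.keras.preprocessing.image.load_img("resized_image.jpg", target_size=[224, 224])
x = tf.keras.applications.mobilenet.preprocess_input(
    tf.keras.preprocessing.image.img_to_array(img))

# One instance, keyed by the model's input name from the SavedModel signature.
payload = {"instances": [{"input_1": x.tolist()}]}

url = "http://{}:{}/v1/models/custom-model:predict".format(
    os.environ["INGRESS_HOST"], os.environ["INGRESS_PORT"])
resp = requests.post(url, headers={"Host": os.environ["SERVICE_HOSTNAME"]},
                     data=json.dumps(payload))
resp.raise_for_status()

scores = np.array(resp.json()["predictions"])
# +1 to account for the leading 'background' entry in ImageNetLabels.txt.
print("top class index:", np.argmax(scores) + 1)   # 458 would be 'bow tie'
```
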
## Links

- https://www.tensorflow.org/guide/saved_model
- https://www.tensorflow.org/tfx/serving/api_rest
- https://www.tensorflow.org/tfx/tutorials/serving/rest_simple

[Back](Readme.md)

Lines changed: 76 additions & 0 deletions

#
# tensorflow_custom_model.py
#
# Based on demos at TensorFlow guide:
# https://www.tensorflow.org/guide/saved_model
#
import os
import tempfile

from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf

print('TensorFlow version: {}'.format(tf.__version__))
tmpdir = "build_models"

physical_devices = tf.config.experimental.list_physical_devices('GPU')
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

file = tf.keras.utils.get_file(
    "grace_hopper.jpg",
    "https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg")
img = tf.keras.preprocessing.image.load_img(file, target_size=[224, 224])
plt.imshow(img)
plt.axis('off')
x = tf.keras.preprocessing.image.img_to_array(img)
x = tf.keras.applications.mobilenet.preprocess_input(
    x[tf.newaxis,...])

labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
imagenet_labels = np.array(open(labels_path).read().splitlines())

pretrained_model = tf.keras.applications.MobileNet()
result_before_save = pretrained_model(x)

### our own image, a template for how we should be pre-processing input for the inference to work
file2 = tf.keras.utils.get_file(
    "bowtie.jpg",
    "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/Bill_Nye_with_trademark_blue_lab_coat_and_bowtie.jpg/319px-Bill_Nye_with_trademark_blue_lab_coat_and_bowtie.jpg"
)
img2 = tf.keras.preprocessing.image.load_img(file2, target_size=[224, 224])
x2 = tf.keras.preprocessing.image.img_to_array(img2)
x2 = tf.keras.applications.mobilenet.preprocess_input(
    x2[tf.newaxis,...])
np.save("mybowtie.npy", x2)

result_test = pretrained_model(x2)
decoded_test = np.argsort(result_test)[0,::-1][:5]+1
decoded_test_labeled = imagenet_labels[np.argsort(result_test)[0,::-1][:5]+1]
print("Result for test image: ", decoded_test)
print(" ", decoded_test_labeled)
###

decoded = np.argsort(result_before_save)[0,::-1][:5]+1
decoded_labeled = imagenet_labels[np.argsort(result_before_save)[0,::-1][:5]+1]
print("Result before saving: ", decoded)
print(" ", decoded_labeled)


mobilenet_save_path = os.path.join(tmpdir, "mobilenet/1/")
tf.saved_model.save(pretrained_model, mobilenet_save_path)


loaded = tf.saved_model.load(mobilenet_save_path)
print("list(loaded.signatures.keys()): ", list(loaded.signatures.keys()))  # ["serving_default"]
print("mobilenet_save_path is ", mobilenet_save_path)

infer = loaded.signatures["serving_default"]
print("infer. structured_outputs: ", infer.structured_outputs)

labeling = infer(tf.constant(x))[pretrained_model.output_names[0]]

decoded = imagenet_labels[np.argsort(labeling)[0,::-1][:5]+1]

print("Result after saving and loading: ", decoded)

Lines changed: 22 additions & 0 deletions

#
# This is how it works:
#
# $ kubectl apply -f tensorflow_custom_model.yaml -n kfserving-test
# inferenceservice.serving.kubeflow.org/custom-model configured
#
# $ kubectl get inferenceservice -n kfserving-test
# NAME           URL                                                                      READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
# custom-model   http://custom-model.kfserving-test.example.com/v1/models/custom-model   True    100                                2m23s
#

apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "custom-model"
spec:
  default:
    predictor:
      tensorflow:
        # for example,
        # storageUri: "gs://rollingstone/models/1/custom-model"
        storageUri: "gs://<your bucket>/models/1/custom-model"

Lines changed: 25 additions & 0 deletions

#
# originally from https://github.com/kubeflow/kfserving/tree/master/docs/samples
# see the repository for model changes.
#

#
# This is how it works:
#
# $ kubectl apply -f tensorflow_flowers.yaml -n kfserving-test
# inferenceservice.serving.kubeflow.org/flowers-sample configured
#
# $ kubectl get inferenceservice -n kfserving-test
# NAME             URL                                                                          READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
# flowers-sample   http://flowers-sample.kfserving-test.example.com/v1/models/flowers-sample   True    100                                2m23s
#

apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "flowers-sample"
spec:
  default:
    predictor:
      tensorflow:
        storageUri: "gs://kfserving-samples/models/tensorflow/flowers"
