This repository has been archived by the owner on Nov 16, 2023. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #202 from panchul/kfserving_pytorch
Adding PyTorch samples for KFServing on Azure Stack
Showing 7 changed files with 3,623 additions and 0 deletions.
159 changes: 159 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/pytorch.md
@@ -0,0 +1,159 @@
# KFServing PyTorch models

## Building a model and running inference on it

You can run inference using `pytorchserver`, which is part of the Kubeflow KFServing GitHub repository.
See [KFServing PyTorch demo](https://github.com/kubeflow/kfserving/tree/master/docs/samples/pytorch) for more information if needed.

You need to have `pytorchserver` installed. You may need to install the prerequisites manually, specifying
versions and hardware details (CUDA version, etc.).

In the simplest case:

    $ pip install torch torchvision

Clone the KFServing repository and install the prerequisites. See KFServing's
[python/pytorchserver](https://github.com/kubeflow/kfserving/tree/master/python/pytorchserver)
if you have any issues.

    $ git clone https://github.com/kubeflow/kfserving.git
    $ cd kfserving/python/pytorchserver
    $ pip install -e .

Verify that it works:

    /kfserving/python/pytorchserver$ python3 -m pytorchserver -h
    usage: __main__.py [-h] [--http_port HTTP_PORT] [--grpc_port GRPC_PORT]
                       [--max_buffer_size MAX_BUFFER_SIZE] [--workers WORKERS]
                       --model_dir MODEL_DIR [--model_name MODEL_NAME]
                       [--model_class_name MODEL_CLASS_NAME]

    optional arguments:
      -h, --help            show this help message and exit
      --http_port HTTP_PORT
                            The HTTP Port listened to by the model server.
      --grpc_port GRPC_PORT
                            The GRPC Port listened to by the model server.
      --max_buffer_size MAX_BUFFER_SIZE
                            The max buffer size for tornado.
      --workers WORKERS     The number of works to fork
      --model_dir MODEL_DIR
                            A URI pointer to the model directory
      --model_name MODEL_NAME
                            The name that the model is served under.
      --model_class_name MODEL_CLASS_NAME
                            The class name for the model.

Train a model. The script saves the trained weights to `model.pt`:

    $ python3 pytorch_cifar10.py
    Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
    100.0%Extracting ./data/cifar-10-python.tar.gz to ./data
    Files already downloaded and verified
    [1,  2000] loss: 2.170
    [1,  4000] loss: 1.893
    [1,  6000] loss: 1.695
    [1,  8000] loss: 1.594
    [1, 10000] loss: 1.532
    [1, 12000] loss: 1.456
    [2,  2000] loss: 1.397
    [2,  4000] loss: 1.393
    [2,  6000] loss: 1.367
    [2,  8000] loss: 1.342
    [2, 10000] loss: 1.320
    [2, 12000] loss: 1.322
    Finished Training

Then run `pytorchserver`, pointing it at the directory that contains `model.pt` and the model class definition:

    $ python3 -m pytorchserver --model_dir `pwd` --model_name pytorchmodel --model_class_name Net
    [I 201008 17:15:32 storage:35] Copying contents of /home/azureuser/kfserving/docs/samples/pytorch to local
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/model.pt to pytorchmodel/model.pt
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/pytorch.yaml to pytorchmodel/pytorch.yaml
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/pytorchmodel to pytorchmodel/pytorchmodel
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/README.md to pytorchmodel/README.md
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/data to pytorchmodel/data
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/input.json to pytorchmodel/input.json
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/pytorch_gpu.yaml to pytorchmodel/pytorch_gpu.yaml
    [I 201008 17:15:32 storage:205] Linking: /home/azureuser/kfserving/docs/samples/pytorch/cifar10.py to pytorchmodel/cifar10.py
    [I 201008 17:15:34 kfserver:88] Registering model: pytorchmodel
    [I 201008 17:15:34 kfserver:77] Listening on port 8080
    [I 201008 17:15:34 kfserver:79] Will fork 0 workers
    [I 201008 17:15:34 process:126] Starting 6 processes
    [E 201008 17:18:28 web:2250] 200 POST /v1/models/pytorchmodel:predict (127.0.0.1) 21.34ms

In a separate terminal, run the client script, which sends a prediction request to the server:

    $ python3 pytorch_pytorchserver_client.py
    Files already downloaded and verified
    <Response [200]>
    ...
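The client script itself is not shown on this page. As a rough sketch of what such a request looks like, the snippet below builds a V1 `predict` call against the local server, using the port (8080) and path (`/v1/models/pytorchmodel:predict`) visible in the server log above. The helper names `build_predict_request`, `dummy_cifar_image`, and `send` are hypothetical, and the all-zero image is only a placeholder for a real normalized CIFAR-10 test image.

```python
import json
import urllib.request


def build_predict_request(instances, model_name="pytorchmodel",
                          host="localhost", port=8080):
    """Build (but do not send) a V1 predict request for a local pytorchserver."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # The V1 protocol wraps the input batch in an "instances" list.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dummy_cifar_image():
    """Placeholder 3x32x32 input (channels, height, width) filled with zeros."""
    return [[[0.0] * 32 for _ in range(32)] for _ in range(3)]


def send(request, timeout=10):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


request = build_predict_request([dummy_cifar_image()])
print(request.get_method(), request.full_url)
```

With the server from the previous step running, `send(request)` should return a dictionary with a `predictions` key holding one list of 10 scores per input image.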

## Deploying the model

The `InferenceService` is defined in `pytorch_cifar10.yaml`:

    $ kubectl create -f pytorch_cifar10.yaml -n kfserving-test
    inferenceservice.serving.kubeflow.org/pytorch-cifar10 created

Wait until the pods are running and the service is ready and has a URL:

    $ kubectl get po -n kfserving-test
    NAME                                                              READY   STATUS    RESTARTS   AGE
    pytorch-cifar10-predictor-default-x4597-deployment-6dd9d4bfnmqs   2/2     Running   0          119s

    $ kubectl get inferenceservices -n kfserving-test
    NAME              URL                                                                           READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    pytorch-cifar10   http://pytorch-cifar10.kfserving-test.example.com/v1/models/pytorch-cifar10   True    100                                3m16s

Define the parameters you will be using in your requests:

    $ export MODEL_NAME=pytorch-cifar10
    $ export INPUT_PATH=@./pytorch_input.json
    $ export SERVICE_HOSTNAME=$(kubectl get inferenceservice pytorch-cifar10 -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
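The `cut -d "/" -f 3` step keeps only the host part of the service URL, which is later sent as the `Host` header. A small Python sketch of the same extraction, using the URL from the `kubectl get inferenceservices` output above (the function name `service_hostname` is made up for illustration):

```python
from urllib.parse import urlsplit


def service_hostname(status_url: str) -> str:
    """Return the third "/"-separated field, as `cut -d "/" -f 3` does."""
    # "http://host/path".split("/") -> ["http:", "", "host", "path", ...]
    return status_url.split("/")[2]


url = "http://pytorch-cifar10.kfserving-test.example.com/v1/models/pytorch-cifar10"

# For a plain http://host/path URL this matches the parsed network location.
assert service_hostname(url) == urlsplit(url).netloc
print(service_hostname(url))
```

The split-based version mirrors the shell pipeline exactly; `urlsplit(url).netloc` is the more robust choice if the URL might carry credentials or unusual formatting.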

Next, set the ingress host and port, which depend on your environment. If you run KFServing as part of a Kubeflow installation (this is what we do throughout this lab):

    $ export INGRESS_HOST=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service kfserving-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Or, for the more generic case:

    $ export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    $ export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')

Send the request with `curl`, passing the service hostname in the `Host` header so that the ingress gateway can route it:

    $ curl -v -H "Host: ${SERVICE_HOSTNAME}" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
    * Trying 12.34.56.78...
    * Connected to 12.34.56.78 (12.34.56.78) port 80 (#0)
    > POST /v1/models/pytorch-cifar10:predict HTTP/1.1
    > Host: pytorch-cifar10.kfserving-test.example.com
    > User-Agent: curl/7.47.0
    > Accept: */*
    > Content-Length: 110681
    > Content-Type: application/x-www-form-urlencoded
    > Expect: 100-continue
    >
    < HTTP/1.1 100 Continue
    * We are completely uploaded and fine
    < HTTP/1.1 200 OK
    < content-length: 225
    < content-type: application/json; charset=UTF-8
    < date: Tue, 06 Oct 2020 21:43:45 GMT
    < server: istio-envoy
    < x-envoy-upstream-service-time: 14
    <
    * Connection #0 to host 12.34.56.78 left intact
    {"predictions": [[-1.6099601984024048, -2.6461071968078613, 0.3284444212913513, 2.4825074672698975, 0.4352457523345947, 2.3108041286468506, 1.0005676746368408, -0.42327627539634705, -0.5100944638252258, -1.7978390455245972]]}
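The response holds one list of 10 raw scores per input image. The sketch below shows one way to turn those scores into a class ranking. It assumes the deployed model emits unnormalized logits in the same class order as the `classes` tuple in `pytorch_cifar10.py`; that holds for the `Net` defined in this commit, but it is an assumption for the model stored at the `storageUri`. The helper names `softmax` and `rank_classes` are illustrative.

```python
import math

# Class order copied from pytorch_cifar10.py (assumed to match the deployed model).
CLASSES = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_classes(logits, classes=CLASSES):
    """Return (class name, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    order = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    return [(classes[i], probs[i]) for i in order]


# Response body copied from the curl output above.
response = {"predictions": [[-1.6099601984024048, -2.6461071968078613,
                             0.3284444212913513, 2.4825074672698975,
                             0.4352457523345947, 2.3108041286468506,
                             1.0005676746368408, -0.42327627539634705,
                             -0.5100944638252258, -1.7978390455245972]]}

ranking = rank_classes(response["predictions"][0])
for name, prob in ranking[:3]:
    print(f"{name}: {prob:.3f}")
```

For this response the highest score is at index 3, so the top-ranked class is `cat`, with `dog` close behind.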

For troubleshooting, see the [KFServing PyTorch sample](https://github.com/kubeflow/kfserving/tree/master/docs/samples/pytorch).

## Links

- https://github.com/kubeflow/kfserving/tree/master/docs/samples/pytorch

[Back](Readme.md)
84 changes: 84 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/pytorch_cifar10.py
@@ -0,0 +1,84 @@
#
# Originally from
# https://github.com/kubeflow/kfserving/tree/master/docs/samples
#
# See https://github.com/kubeflow/kfserving
#

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

if __name__ == "__main__":

    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                              shuffle=True, num_workers=2)

    testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                           download=True, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                             shuffle=False, num_workers=2)

    classes = ('plane', 'car', 'bird', 'cat',
               'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

    net = Net()

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

    for epoch in range(2):  # loop over the dataset multiple times

        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            # get the inputs; data is a list of [inputs, labels]
            inputs, labels = data

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()
            if i % 2000 == 1999:  # print every 2000 mini-batches
                print('[%d, %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0

    print('Finished Training')

    # Save model
    torch.save(net.state_dict(), "model.pt")
28 changes: 28 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/pytorch_cifar10.yaml
@@ -0,0 +1,28 @@
#
# originally from https://github.com/kubeflow/kfserving/tree/master/docs/samples
# see the repository for model changes.
#

#
# This is how it works:
#
# $ kubectl apply -f pytorch_cifar10.yaml -n kfserving-test
# inferenceservice.serving.kubeflow.org/pytorch-cifar10 configured
#
# $ kubectl get inferenceservice -n kfserving-test
# NAME URL READY DEFAULT TRAFFIC CANARY TRAFFIC AGE
# pytorch-cifar10 http://pytorch-cifar10.kfserving-test.example.com/v1/models/pytorch-cifar10 True 100 2m23s
#

apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "pytorch-cifar10"
spec:
  default:
    parallelism: 1
    predictor:
      pytorch:
        storageUri: "gs://kfserving-samples/models/pytorch/cifar10/"
        modelClassName: "Net"
36 changes: 36 additions & 0 deletions
Research/kubeflow-on-azure-stack-lab/04-KFServing/pytorch_cifar10_gpu.yaml
@@ -0,0 +1,36 @@
#
# originally from https://github.com/kubeflow/kfserving/tree/master/docs/samples
# see the repository for model changes.
#

#
# This is how it works:
#
# $ kubectl apply -f pytorch_cifar10_gpu.yaml -n kfserving-test
# inferenceservice.serving.kubeflow.org/pytorch-cifar10-gpu configured
#
# $ kubectl get inferenceservice -n kfserving-test
# NAME URL READY DEFAULT TRAFFIC CANARY TRAFFIC AGE
# pytorch-cifar10-gpu http://pytorch-cifar10-gpu.kfserving-test.example.com/v1/models/pytorch-cifar10-gpu True 100 2m23s
#

apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
  name: "pytorch-cifar10-gpu"
spec:
  default:
    parallelism: 1
    predictor:
      pytorch:
        storageUri: "gs://kfserving-samples/models/pytorch/cifar10/"
        modelClassName: "Net"
        resources:
          limits:
            cpu: 100m
            memory: 1Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: 100m
            memory: 1Gi
            nvidia.com/gpu: "1"