InstaSlice

InstaSlice uses stable APIs and works with the GPU Operator to create MIG slices on demand.

Why InstaSlice

Partitionable accelerators provided by vendors need their partitions to be created at node boot time. To change the partitions, one has to evict all workloads at the node level and then create a new set of partitions.

InstaSlice will help if:

the user does not know a priori all the accelerator partitions needed on every node in the cluster
the user's partition requirements change at the workload level rather than the node level
the user does not want to learn or use a new API to request accelerator slices
the user prefers to use stable device plugin APIs for creating partitions

Features overview

Integration with Kubernetes quota management
Integration with project Kueue
Emulator mode to test the InstaSlice first-fit placement strategy
Integration with vLLM, KServe, Deployments, Jobs, and StatefulSets

Demo

InstaSlice demo

Getting Started

Prerequisites

Go v1.22.0+
Docker v17.03+
KinD v0.23.0+
Helm v3.0.0+
Docker buildx plugin for building cross-platform images
kubectl v1.11.3+
Ginkgo v2+ for e2e testing

Install and configure required NVIDIA software on the host

Install the NVIDIA GPU drivers and CUDA toolkit on the host.
Install the NVIDIA Container Toolkit (CTK).
Configure the NVIDIA Container Runtime as the default Docker runtime:

sudo nvidia-ctk runtime configure --runtime=docker --set-as-default

Restart Docker to apply the changes:

sudo systemctl restart docker

Configure the NVIDIA Container Runtime to use volume mounts to select devices to inject into a container:

sudo nvidia-ctk config --set accept-nvidia-visible-devices-as-volume-mounts=true --in-place

This sets accept-nvidia-visible-devices-as-volume-mounts=true in the /etc/nvidia-container-runtime/config.toml file.
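
To confirm that the setting took effect, the config file can be checked directly. The following is a minimal sketch, not part of InstaSlice: the function name volume_mounts_enabled is hypothetical, and it assumes the key is written as an uncommented `accept-nvidia-visible-devices-as-volume-mounts = true` line in the TOML file.

```shell
# Hypothetical helper: succeeds if the given config file enables
# accept-nvidia-visible-devices-as-volume-mounts.
volume_mounts_enabled() {
  config_file="${1:-/etc/nvidia-container-runtime/config.toml}"
  # Match an uncommented "key = true" line, tolerating optional whitespace.
  grep -Eq '^[[:space:]]*accept-nvidia-visible-devices-as-volume-mounts[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$config_file"
}
```

Usage: `volume_mounts_enabled && echo "volume mounts enabled"`. A commented-out line or a value of false makes the function return non-zero.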

Enable MIG on the GPU

Check if MIG is enabled on the host GPU - look for Enabled in the third row of the table:

nvidia-smi

Sun Aug 18 09:41:46 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03              Driver Version: 560.28.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-PCIE-40GB          Off |   00000000:07:00.0 Off |                   On |
| N/A   27C    P0             31W /  250W |       1MiB /  40960MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|        Shared         |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC  DEC  OFA  JPG |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  No MIG devices found                                                                   |
+-----------------------------------------------------------------------------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

If MIG is disabled, enable it by running:

nvidia-smi -i <gpu-id> -mig 1

Example:

nvidia-smi -i 0 -mig 1

Note: You may need to reboot the node for the changes to take effect. An asterisk beside MIG status (e.g. Enabled*) means the changes are pending and will be applied after a reboot.
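
For scripted checks, the current and pending MIG modes can also be read with nvidia-smi's query interface instead of the table. The sketch below assumes GPU index 0; the helper name mig_state and its three output labels are hypothetical. Any value other than Enabled (including the value reported by GPUs that do not support MIG) is treated as disabled.

```shell
# Hypothetical helper: classify one "current, pending" CSV line.
mig_state() {
  # Drop whitespace, then split "current,pending" on the comma.
  line=$(printf '%s' "$1" | tr -d '[:space:]')
  current=${line%%,*}
  pending=${line#*,}
  if [ "$current" = "Enabled" ]; then
    echo "enabled"
  elif [ "$pending" = "Enabled" ]; then
    echo "pending"
  else
    echo "disabled"
  fi
}

# Query GPU 0 only when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
  mig_state "$(nvidia-smi -i 0 --query-gpu=mig.mode.current,mig.mode.pending --format=csv,noheader)"
fi
```

A result of pending corresponds to the Enabled* status described in the note above.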

Install KinD cluster with GPU operator

Create a Kind cluster and install the NVIDIA GPU Operator:

bash ./deploy/setup.sh

Note: The validator pods nvidia-cuda-validator-* and nvidia-operator-validator-* of the GPU Operator are expected to fail to initialize. With MIG enabled but no MIG partition created yet, they effectively have no GPU to run on.

kubectl get pod -n gpu-operator

NAME                                                          READY   STATUS                  RESTARTS       AGE
gpu-feature-discovery-lzcpv                                   2/2     Running                 0              5m48s
gpu-operator-7b5587d878-vq2gw                                 1/1     Running                 0              6m59s
gpu-operator-node-feature-discovery-gc-8478d46f4c-wvx29       1/1     Running                 0              6m59s
gpu-operator-node-feature-discovery-master-688bb86496-cn97b   1/1     Running                 0              6m59s
gpu-operator-node-feature-discovery-worker-7twxt              1/1     Running                 0              6m52s
nvidia-container-toolkit-daemonset-gpn22                      1/1     Running                 0              6m13s
nvidia-cuda-validator-sjqgk                                   0/1     Init:CrashLoopBackOff   5 (111s ago)   4m54s
nvidia-dcgm-exporter-tlcpv                                    1/1     Running                 0              6m7s
nvidia-device-plugin-daemonset-wbbhx                          2/2     Running                 0              5m53s
nvidia-operator-validator-h7ngh                               0/1     Init:2/4                0              6m10s
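
Because the two validator pods are expected to fail, a quick health check has to ignore them. The following is a minimal sketch under that assumption; the function name unexpected_unhealthy_pods is hypothetical, and it reads the default `kubectl get pod` table (NAME, READY, STATUS, ...) from standard input.

```shell
# Hypothetical helper: print the names of pods that are neither Running nor
# Completed, skipping the validator pods that are expected to fail here.
unexpected_unhealthy_pods() {
  awk 'NR > 1 &&
       $3 != "Running" && $3 != "Completed" &&
       $1 !~ /^nvidia-(cuda|operator)-validator-/ { print $1 }'
}
```

Usage: `kubectl get pod -n gpu-operator | unexpected_unhealthy_pods`. Empty output means that only the expected validator failures are present.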

Deploy InstaSlice

Optionally, build and push custom, up-to-date controller and daemonset images from source:

IMG=<registry>/<controller-image>:<tag> IMG_DMST=<registry>/<daemonset-image>:<tag> make docker-build docker-push

Example:

IMG=quay.io/example/instaslice2-controller:1.0 IMG_DMST=quay.io/example/instaslice2-daemonset:1.0 make docker-build docker-push

Note: You can use Podman instead of Docker to build images. To do so, set CONTAINER_TOOL=podman before the image-related make targets.

Cross-platform or multi-arch images can be built and pushed using make docker-buildx. When using Docker as your container tool, make sure to create a builder instance first. Refer to Multi-platform images for documentation on building multi-platform images with Docker. You can change the destination platform(s) by setting PLATFORMS, e.g.:

PLATFORMS=linux/arm64,linux/amd64 make docker-buildx

Deploy the controller and daemonset with the default images. All required CRDs will be installed by this command:

make deploy

Or with custom-built images:

IMG=<registry>/<controller-image>:<tag> IMG_DMST=<registry>/<daemonset-image>:<tag> make deploy

Example:

IMG=quay.io/example/instaslice2-controller:1.0 IMG_DMST=quay.io/example/instaslice2-daemonset:1.0 make deploy

The all-in-one command for building and deploying InstaSlice:

make docker-build docker-push deploy

Or with custom images:

IMG=<registry>/<controller-image>:<tag> IMG_DMST=<registry>/<daemonset-image>:<tag> make docker-build docker-push deploy

Example:

IMG=quay.io/example/instaslice2-controller:1.0 IMG_DMST=quay.io/example/instaslice2-daemonset:1.0 make docker-build docker-push deploy
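
When scripting the custom-image workflow, it can help to fail early if only one of the two variables is set. This is a sketch only: the helper name require_image_vars is hypothetical, and it assumes that a missing variable would otherwise fall back to the Makefile's default image, which may not be what you intended.

```shell
# Hypothetical helper: fail early unless both image variables are set.
require_image_vars() {
  for name in IMG IMG_DMST; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      return 1
    fi
  done
}
```

Usage: export IMG and IMG_DMST in the shell, then run `require_image_vars && make docker-build docker-push deploy`.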

Verify that the InstaSlice pods are successfully running:

kubectl get pod -n instaslice-system

NAME                                                      READY   STATUS    RESTARTS   AGE
instaslice-operator-controller-daemonset-5lbqg            1/1     Running   0          101s
instaslice-operator-controller-manager-57b549784c-wkqq2   2/2     Running   0          101s

Note: If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.

Run a sample workload

Submit a sample workload:

kubectl apply -f ./samples/test-pod.yaml
pod/cuda-vectoradd-1 created

Check the status of the workload using the following commands:

kubectl get pods

NAME               READY   STATUS    RESTARTS   AGE
cuda-vectoradd-1   1/1     Running   0          15s

and

kubectl logs cuda-vectoradd-1

GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-1785aa6b-6edf-f58e-2e29-f6ccd30f306f)
  MIG 1g.5gb      Device  0: (UUID: MIG-2cc7f78c-04eb-5a3c-92c7-f423e3572bb8)
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
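
The second line of the log shows which MIG profile the pod received. To extract it in a script, a small filter is enough. This is a minimal sketch: the function name mig_profiles is hypothetical, and it assumes the log contains device lines in the format shown above.

```shell
# Hypothetical helper: print the MIG profile names found in device-listing
# lines (for example "  MIG 1g.5gb  Device  0: ...") read from stdin.
mig_profiles() {
  sed -n 's/^[[:space:]]*MIG[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*/\1/p'
}
```

Usage: `kubectl logs cuda-vectoradd-1 | mig_profiles`.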

While the pod is running, you can observe the MIG slice created for it automatically:

nvidia-smi

Sun Aug 18 11:48:20 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03              Driver Version: 560.28.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-PCIE-40GB          Off |   00000000:07:00.0 Off |                   On |
| N/A   32C    P0             63W /  250W |      13MiB /  40960MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|        Shared         |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC  DEC  OFA  JPG |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  0   11   0   0  |              13MiB /  4864MiB    | 14      0 |  1   0    0    0    0 |
|                  |                 0MiB /  8191MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
...

Delete the sample pod and confirm that its MIG slice is deleted automatically:

kubectl delete -f ./samples/test-pod.yaml

nvidia-smi

Sun Aug 18 13:34:55 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.28.03              Driver Version: 560.28.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-PCIE-40GB          Off |   00000000:07:00.0 Off |                   On |
| N/A   32C    P0             61W /  250W |       1MiB /  40960MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|        Shared         |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC  DEC  OFA  JPG |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  No MIG devices found                                                                   |
+-----------------------------------------------------------------------------------------+
...
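
To check the slice lifecycle without reading the full table, the MIG devices can be counted from the output of nvidia-smi -L, which lists each MIG device on an indented line starting with MIG. The sketch below uses a hypothetical helper name, count_mig_devices, and reads that listing from standard input.

```shell
# Hypothetical helper: count MIG device lines in `nvidia-smi -L` output
# read from stdin. Prints 0 when no MIG device is listed.
count_mig_devices() {
  # grep -c exits non-zero on zero matches, so keep the status at 0.
  grep -c '^[[:space:]]*MIG ' || true
}
```

Usage: `nvidia-smi -L | count_mig_devices`. With the single sample pod from this walkthrough, the count should be 1 while the pod is running and 0 after it has been deleted.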

Create instances of your solution

You can apply the samples (examples) from the samples directory:

kubectl apply -k samples/

Note: Make sure the samples use their default values when you test them.

Uninstall

Delete all running samples from the cluster:

kubectl delete -k samples/

Delete the CRDs:

make uninstall

Undeploy InstaSlice:

make undeploy

To delete the Kind cluster, just run:

kind delete cluster

Run InstaSlice in emulator mode

Users (mainly developers) can run the InstaSlice operator in emulator mode, as described here. So far, this has been tested only on a single-node cluster.

Running e2e tests using emulated mode and kind cluster

To run the e2e tests locally, run the following command:

make test-e2e-kind-emulated ; make cleanup-test-e2e-kind-emulated
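
With `;` between the two targets, the shell reports the exit status of the cleanup target, so a failed test run can look successful to a calling script or CI job. The following sketch, with a hypothetical helper named run_then_cleanup, always runs the cleanup but returns the status of the first command.

```shell
# Hypothetical helper: run a command, always run the cleanup afterwards,
# and return the exit status of the first command.
run_then_cleanup() {
  # $1: command to run, $2: cleanup command; both are passed to sh -c.
  status=0
  sh -c "$1" || status=$?
  sh -c "$2" || echo "warning: cleanup failed" >&2
  return "$status"
}
```

Usage: `run_then_cleanup "make test-e2e-kind-emulated" "make cleanup-test-e2e-kind-emulated"`.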

These e2e tests run against a kind cluster that is created locally.

InstaSlice and OperatorHub

InstaSlice has been published on OperatorHub.

Roadmap

High-level overview of the main priorities for 2024/2025:

Allocate MIG slices on NVIDIA GPUs on demand
Configure allocated slices on GPUs and bind containers to slices
Release and unconfigure slices when pods are completed or deleted
Gracefully terminate workloads on slice deletion
Account for classical node resources when selecting a node
Schedule pods in an average of 10 seconds when resources are available
Kubernetes quota system integration
Konflux onboarding
Operator SDK integration

Future tasks:

Stable integration with project Kueue
Stable integration with provisioning request CRD to support autoscaling
Handle pods requesting multiple slices
Manage slices on heterogeneous GPU types in the cluster
Improved fault tolerance
Leverage DRA implementation

Note: The KubeCon EU 2024 code (DRA code) is now available in the legacy branch.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
