1+ <!-- If you are updating this getting-started-latest.md guide, please make sure to update the index.md as well -->
2+
13# Getting started with an Inference Gateway
24
35!!! warning "Unreleased/main branch"
4143kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
4244```
4345
46+ ### Install the Gateway
47+
48+ Choose one of the following options to install Gateway.
49+
50+ === "GKE"
51+
52+ Nothing to install here, you can move to the next [section](#deploy-the-inferencepool-and-endpoint-picker-extension)
53+
54+ === "Istio"
55+
56+ 1. Requirements
57+ - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
58+
59+ 2. Install Istio
60+
61+ ```
62+ TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
63+ # on Linux
64+ wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
65+ tar -xvf istioctl-$TAG-linux-amd64.tar.gz
66+ # on macOS
67+ wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
68+ tar -xvf istioctl-$TAG-osx.tar.gz
69+ # on Windows
70+ wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
71+ unzip istioctl-$TAG-win.zip
72+
73+ ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
74+ ```
75+
76+ === "Kgateway"
77+
78+ 1. Requirements
79+
80+ - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
81+ - [Helm](https://helm.sh/docs/intro/install/) installed.
82+
83+ 2. Set the Kgateway version and install the Kgateway CRDs.
84+
85+ ```bash
86+ KGTW_VERSION=v2.1.0
87+ helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
88+ ```
89+
90+ 3. Install Kgateway
91+
92+ ```bash
93+ helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
94+ ```
95+
4496### Deploy the InferencePool and Endpoint Picker Extension
4597
4698 Install an InferencePool named ` vllm-llama3-8b-instruct ` that selects from endpoints with label ` app: vllm-llama3-8b-instruct ` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
@@ -63,7 +115,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
63115 See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
64116 for detailed instructions.
65117
66- 2. Deploy Inference Gateway:
118+ 2. Deploy the Inference Gateway:
67119
68120 ```bash
69121 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
@@ -93,34 +145,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
93145 Please note that this feature is currently in an experimental phase and is not intended for production use.
94146 The implementation and user experience are subject to changes as we continue to iterate on this project.
95147
96- 1. Requirements
97-
98- - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
99-
100- 2. Install Istio
101-
102- ```
103- TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
104- # on Linux
105- wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
106- tar -xvf istioctl-$TAG-linux-amd64.tar.gz
107- # on macOS
108- wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
109- tar -xvf istioctl-$TAG-osx.tar.gz
110- # on Windows
111- wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
112- unzip istioctl-$TAG-win.zip
113-
114- ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
115- ```
116-
117- 3. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
148+ 1. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
118149
119150 ```bash
120151 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
121152 ```
122153
123- 4 . Deploy Gateway
154+ 2 . Deploy the Inference Gateway
124155
125156 ```bash
126157 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
@@ -133,13 +164,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
133164 inference-gateway inference-gateway <MY_ADDRESS> True 22s
134165 ```
135166
136- 5 . Deploy the HTTPRoute
167+ 3 . Deploy the HTTPRoute
137168
138169 ```bash
139170 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
140171 ```
141172
142- 6 . Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
173+ 4 . Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
143174
144175 ```bash
145176 kubectl get httproute llm-route -o yaml
@@ -152,25 +183,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
152183 implementation. Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway.dev/) data plane. Follow these steps
153184 to run Kgateway as an Inference Gateway:
154185
155- 1. Requirements
156-
157- - [Helm](https://helm.sh/docs/intro/install/) installed.
158- - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
159-
160- 2. Set the Kgateway version and install the Kgateway CRDs.
161-
162- ```bash
163- KGTW_VERSION=v2.2.0-main
164- helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
165- ```
166-
167- 3. Install Kgateway
168-
169- ```bash
170- helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
171- ```
172-
173- 4. Deploy the Gateway
186+ 1. Deploy the Inference Gateway
174187
175188 ```bash
176189 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
@@ -181,13 +194,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
181194 kubectl get gateway inference-gateway
182195 ```
183196
184- 5 . Deploy the HTTPRoute
197+ 2 . Deploy the HTTPRoute
185198
186199 ```bash
187200 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
188201 ```
189202
190- 6 . Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
203+ 3 . Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
191204
192205 ```bash
193206 kubectl get httproute llm-route -o yaml
0 commit comments