Commit a1a8cc8

generalize latest release quickstart (#1966)
Signed-off-by: Nir Rozenbaum <[email protected]>
1 parent f729285 commit a1a8cc8

2 files changed: 41 additions and 31 deletions.

site-src/_includes/prereqs.md

Lines changed: 3 additions & 2 deletions
@@ -6,6 +6,7 @@ A cluster with:
 - Support for [sidecar containers](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/) (enabled by default since Kubernetes v1.29)
 to run the model server deployment.

-Tooling:
+Tools:

-- [Helm](https://helm.sh/docs/intro/install/) installed.
+- [Helm](https://helm.sh/docs/intro/install/).
+- [jq](https://jqlang.org/download/).
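
The tool list above now names jq alongside Helm, and the quickstart commands also call `curl` and `kubectl`. As a rough sketch (not part of this commit), a preflight check along these lines could confirm the tools are on `PATH` before starting; the function name `missing_tools` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the commit): print each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v exits non-zero when the name cannot be resolved.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report anything missing from the tools the quickstart relies on.
missing="$(missing_tools helm jq kubectl curl)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```

The check only verifies that each command resolves; it does not validate versions.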

site-src/guides/index.md

Lines changed: 38 additions & 29 deletions
@@ -10,6 +10,15 @@

 ## **Steps**

+### Set Latest Release Variable
+
+```bash
+IGW_LATEST_RELEASE=$(curl -s https://api.github.com/repos/kubernetes-sigs/gateway-api-inference-extension/releases \
+| jq -r '.[] | select(.prerelease == false) | .tag_name' \
+| sort -V \
+| tail -n1)
+```
+
 ### Deploy Sample Model Server

 --8<-- "site-src/_includes/model-server-intro.md"
@@ -18,25 +27,25 @@

 ```bash
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to the set of Llama models
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/vllm/gpu-deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/vllm/gpu-deployment.yaml
 ```

 --8<-- "site-src/_includes/model-server-cpu.md"

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/vllm/cpu-deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/vllm/cpu-deployment.yaml
 ```

 --8<-- "site-src/_includes/model-server-sim.md"

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/vllm/sim-deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/vllm/sim-deployment.yaml
 ```

 ### Install the Inference Extension CRDs

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.2.1/manifests.yaml
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/${IGW_LATEST_RELEASE}/manifests.yaml
 ```

 ### Install the Gateway
@@ -115,7 +124,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 Set the chart version and then select a tab to follow the provider-specific instructions.

 ```bash
-export IGW_CHART_VERSION=v1.2.1
+export IGW_CHART_VERSION=${IGW_LATEST_RELEASE}
 ```

 --8<-- "site-src/_includes/epp.md"
@@ -133,7 +142,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the Inference Gateway:

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/gke/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/gke/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -146,7 +155,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the HTTPRoute:

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/gke/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/gke/httproute.yaml
 ```

 1. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -163,7 +172,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the Inference Gateway:

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/istio/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -176,7 +185,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the HTTPRoute:

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/istio/httproute.yaml
 ```

 1. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -195,7 +204,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the Inference Gateway:

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/agentgateway/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/agentgateway/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -206,7 +215,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the HTTPRoute:

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/agentgateway/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/agentgateway/httproute.yaml
 ```

 1. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -222,7 +231,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 1. Deploy the Gateway

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
 ```

 2. Verify the Gateway status
@@ -240,7 +249,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 Create the HTTPRoute resource to route traffic to your InferencePool:

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
 ```

 4. Verify the route status
@@ -271,7 +280,7 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
 Deploy the sample InferenceObjective which allows you to specify priority of requests.

 ```bash
-kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/inferenceobjective.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/inferenceobjective.yaml
 ```

 --8<-- "site-src/_includes/test.md"
@@ -293,35 +302,35 @@ You have now deployed a basic Inference Gateway with a simple routing strategy.

 ```bash
 helm uninstall vllm-llama3-8b-instruct
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/inferenceobjective.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/inferenceobjective.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
 kubectl delete secret hf-token --ignore-not-found
 ```

 1. Uninstall the Gateway API Inference Extension CRDs:

 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.2.1/manifests.yaml --ignore-not-found
+kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/${IGW_LATEST_RELEASE}/manifests.yaml --ignore-not-found
 ```

 1. Choose one of the following options to cleanup the Inference Gateway.

 === "GKE"

 ```bash
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
 ```

 === "Istio"

 ```bash
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to clean up ALL Istio resources that were created in this quickstart guide.
@@ -341,8 +350,8 @@ You have now deployed a basic Inference Gateway with a simple routing strategy.
 === "Kgateway"

 ```bash
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.2.1/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
@@ -373,8 +382,8 @@ You have now deployed a basic Inference Gateway with a simple routing strategy.
 1. Remove Inference Gateway and HTTPRoute:

 ```bash
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
-kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_LATEST_RELEASE}/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
 ```

 2. Uninstall NGINX Gateway Fabric:
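
After this change, every manifest URL in the guide depends on `IGW_LATEST_RELEASE`. If the GitHub API call fails or is rate-limited, the variable may well end up empty and the URLs would then contain `refs/tags//`. The sketch below is not part of the commit. It illustrates two ideas under stated assumptions: why the pipeline uses `sort -V` rather than a plain sort, and how a guard could stop early on an empty or malformed value. The function names `pick_latest_tag` and `require_release_tag` are hypothetical, and the `vMAJOR.MINOR.PATCH` pattern is an assumption based on the `v1.2.1` and `v1.0.2` tags seen in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick the highest version from tag names on stdin.
# sort -V (GNU coreutils) compares numeric components, so v1.10.0
# ranks above v1.2.1, which a plain lexical sort would get wrong.
pick_latest_tag() {
  sort -V | tail -n1
}

# Hypothetical guard: succeed only if the argument looks like
# vMAJOR.MINOR.PATCH (an assumed tag format, see the note above).
require_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Sample tags stand in for the jq output of the releases API call.
sample_latest="$(printf '%s\n' v1.2.1 v1.10.0 v1.0.2 | pick_latest_tag)"

if require_release_tag "$sample_latest"; then
  echo "Using release ${sample_latest}"
else
  echo "Release tag is empty or malformed; not running kubectl" >&2
fi
```

In the quickstart itself, the same guard could be called with `"$IGW_LATEST_RELEASE"` right after the variable is set, before any `kubectl apply`.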
