Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion config/charts/inferencepool/templates/gke.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ spec:
logging:
enabled: true # log all requests by default
---
{{- if or .Values.inferenceExtension.monitoring.gke.enabled (and .Values.inferenceExtension.monitoring.prometheus.enabled .Values.inferenceExtension.monitoring.prometheus.auth.enabled) }}
{{- if and .Values.inferenceExtension.monitoring.prometheus.enabled .Values.inferenceExtension.monitoring.prometheus.auth.enabled }}
Copy link
Contributor

@JeffLuoo JeffLuoo Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the prometheus enabled is by default set to false in the values.yaml: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/v1.2.0-rc.1/config/charts/inferencepool/values.yaml#L55-L56

This implies that the monitoring stack on GKE won't have all required objects like roles binding by default if inferenceExtension.monitoring.prometheus.enabled is set to false with this change.

Shall we use inferenceExtension.monitoring.prometheus.enabled to control the enablement of monitoring stacks regardless of the provider? If so, we need to update the mentioning of inferenceExtension.monitoring.gke.enable=true, e.g. https://github.com/search?q=repo%3Allm-d%2Fllm-d%20inferenceExtension.monitoring.gke.enable%3Dtrue&type=code

wdyt? @ahg-g

Copy link
Contributor

@ahg-g ahg-g Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the prometheus enabled is by default set to false in the values.yaml: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/v1.2.0-rc.1/config/charts/inferencepool/values.yaml#L55-L56

This implies that the monitoring stack on GKE won't have all required objects like roles binding by default if inferenceExtension.monitoring.prometheus.enabled is set to false with this change.

Can you elaborate here? the existing gke.enable flag that we are removing in this PR is not controlling that either, right?

Shall we use inferenceExtension.monitoring.prometheus.enabled to control the enablement of monitoring stacks regardless of the provider? If so, we need to update the mentioning of inferenceExtension.monitoring.gke.enable=true, e.g. https://github.com/search?q=repo%3Allm-d%2Fllm-d%20inferenceExtension.monitoring.gke.enable%3Dtrue&type=code

wdyt? @ahg-g

I agree as long as all what this flag enables is indeed prometheus-related.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. EPP uses Prometheus as the metric source behind the metrics endpoint.

The PR LGTM then, we just need to make sure we update all references later to include inferenceExtension.monitoring.prometheus.enabled in addition to setting the provider to GKE.

Copy link
Contributor

@JeffLuoo JeffLuoo Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more place to update -

--set inferenceExtension.monitoring.gke.enabled=true \

With this change, we switch to use inferenceExtension.monitoring.prometheus.enabled so mentioning of inferenceExtension.monitoring.gke.enabled needs to be updated as well. This is the only mentioning I found in EPP repo: https://github.com/search?q=repo%3Akubernetes-sigs%2Fgateway-api-inference-extension+monitoring.gke&type=code

{{- $metricsReadSA := printf "%s-metrics-reader-sa" .Release.Name -}}
{{- $metricsReadSecretName := printf "%s-metrics-reader-secret" .Release.Name -}}
{{- $metricsReadRoleName := printf "%s-%s-metrics-reader" .Release.Namespace .Release.Name -}}
Expand Down
4 changes: 0 additions & 4 deletions config/charts/inferencepool/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -60,10 +60,6 @@ inferenceExtension:
secretName: inference-gateway-sa-metrics-reader-secret
# additional labels for the ServiceMonitor
extraLabels: {}

# DEPRECATED: The 'gke' configuration will be removed in the next release.
gke:
enabled: false
tracing:
enabled: false
otelExporterEndpoint: "http://localhost:4317"
Expand Down