[aks-agent] Add Microsoft Entra ID authentication support for Azure OpenAI#9719
[aks-agent] Add Microsoft Entra ID authentication support for Azure OpenAI#9719mainred wants to merge 3 commits intoAzure:mainfrom
Conversation
…penAI This commit adds support for keyless authentication using Microsoft Entra ID (formerly Azure AD) for Azure OpenAI. Users can now skip providing an API key during initialization to enable workload identity-based authentication. Changes: - Bump version to 1.0.0b22 and aks-agent to v0.7.0 - Allow empty API key during Azure OpenAI configuration - Configure aks-agent pod to use workload identity with the same service account as aks-mcp - Add helm values: workloadIdentity.enabled=true, serviceAccount.create=false - Skip creating secrets and environment variables when API key is empty - Enable azureADTokenAuth flag in helm when API key is not provided
|
| rule | cmd_name | rule_message | suggest_message |
|---|---|---|---|
| aks agent-cleanup | cmd aks agent-cleanup added parameter yes |
|
Thank you for your contribution! We will review the pull request and get back to you soon. |
|
The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR. Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions). pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>
|
|
There was a problem hiding this comment.
Pull request overview
Adds keyless (Microsoft Entra ID / workload identity) authentication support for Azure OpenAI in the aks-agent extension, allowing initialization without providing an API key and propagating the necessary Helm values for agent deployment.
Changes:
- Bump extension version to
1.0.0b22and aks-agent chart/image version to0.7.0. - Allow empty
api_keyfor Azure OpenAI and omit secret/env wiring when the key is blank. - Set Helm values to enable workload identity/service account usage and toggle
azureADTokenAuthwhen keyless auth is detected.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/aks-agent/setup.py | Version bump to 1.0.0b22. |
| src/aks-agent/azext_aks_agent/agent/llm_providers/base.py | Skip secret/env substitution when api_key is empty. |
| src/aks-agent/azext_aks_agent/agent/llm_providers/azure_provider.py | Allow empty API key and short-circuit validation for keyless path. |
| src/aks-agent/azext_aks_agent/agent/llm_config_manager.py | Avoid generating env var refs when api_key is blank. |
| src/aks-agent/azext_aks_agent/agent/k8s/aks_agent_manager.py | Add Helm values for workload identity + azureADTokenAuth toggle. |
| src/aks-agent/azext_aks_agent/_consts.py | Bump AKS_AGENT_VERSION to 0.7.0. |
| src/aks-agent/HISTORY.rst | Changelog entry for 1.0.0b22. |
src/aks-agent/azext_aks_agent/agent/llm_providers/azure_provider.py
Outdated
Show resolved
Hide resolved
- Split Azure OpenAI provider into two separate providers: - Azure OpenAI (API Key): Requires API key authentication - Azure OpenAI (Microsoft Entra ID): Supports keyless authentication - Add role assignment reminders in helm deployment for keyless auth - Add --yes/-y flag to agent-cleanup command to skip confirmation - Change secret empty data message from warning to debug level - Bump to v0.7.1 (suppress litellm debug logs)
36dfbee to
643eb7d
Compare
|
cc @yanzhudd |
gossion
left a comment
There was a problem hiding this comment.
Overall this PR adds a useful feature for keyless Azure OpenAI authentication. I left a few comments on specific areas worth discussing.
| @@ -19,11 +20,11 @@ | |||
|
|
|||
| _PROVIDER_CLASSES: List[LLMProvider] = [ | |||
There was a problem hiding this comment.
Potential bug: PROVIDER_REGISTRY key collision
Both AzureProvider and AzureEntraIDProvider have model_route = "azure", which means their name property (derived from model_route) is both "azure". If PROVIDER_REGISTRY is keyed by name, the second registration (AzureEntraIDProvider) will silently overwrite the first (AzureProvider).
This could cause issues when looking up a provider by name. Consider giving AzureEntraIDProvider a distinct name (e.g., by overriding the name property to return "azure_entraid").
There was a problem hiding this comment.
It's by design. We want two kinds of azure openai.
| "enabled": True, | ||
| } | ||
| helm_values["serviceAccount"] = { | ||
| "create": False, |
There was a problem hiding this comment.
workload identity is enabled unconditionally for all deployments
These Helm values (workloadIdentity.enabled=true, serviceAccount.create=false) are set for every deployment, even when the user is purely using API key authentication and does not have workload identity configured on their cluster.
This could cause deployment failures or unexpected behavior on clusters without workload identity. Consider gating this behind the has_empty_api_key check below:
has_empty_api_key = any(
not (model_config.get("api_key") or "").strip()
for model_config in self.llm_config_manager.model_list.values()
)
if has_empty_api_key:
helm_values["workloadIdentity"] = {"enabled": True}
helm_values["serviceAccount"] = {
"create": False,
"name": self.aks_mcp_service_account_name,
}
helm_values["azureADTokenAuth"] = TrueThere was a problem hiding this comment.
It's also by design. It's ok to not have workload identity.
| } | ||
|
|
||
| def validate_connection(self, params: dict) -> Tuple[str, str]: | ||
| api_base = params.get("api_base") |
There was a problem hiding this comment.
Unlike AzureProvider.validate_connection, this method only checks parameter presence and does not actually validate the connection to the Azure OpenAI endpoint. This is understandable since the Entra ID token is not available at CLI configuration time (it will be obtained by the pod via workload identity).
However, it would be helpful to add a comment here explaining why connection validation is skipped, so future maintainers don't mistake this for an incomplete implementation.
def validate_connection(self, params: dict) -> Tuple[str, str]:
# Connection validation is skipped for Entra ID auth because the token
# is obtained at runtime via workload identity, not available at CLI config time.
...There was a problem hiding this comment.
We cannot check validation for keyless auzre openai because it relies on azure role of the current user.
|
|
||
| has_empty_api_key = any( | ||
| not model_config.get("api_key") or not model_config.get("api_key").strip() | ||
| for model_config in self.llm_config_manager.model_list.values() |
There was a problem hiding this comment.
Nit: the empty api_key condition can be simplified. The or short-circuit already handles None/empty string, so the second .strip() check only adds value for whitespace-only strings. A simpler equivalent:
has_empty_api_key = any(
not (model_config.get("api_key") or "").strip()
for model_config in self.llm_config_manager.model_list.values()
)There was a problem hiding this comment.
each llm provider with check the emptiness of the apikey
| console.print( | ||
| f"\n⚠️ IMPORTANT: If using keyless authentication with Azure OpenAI, ensure the 'Cognitive Services OpenAI User' or 'Azure AI Developer' role " | ||
| f"is assigned to the workload identity (service account: {service_account_name}).", | ||
| style=f"bold {INFO_COLOR}" |
There was a problem hiding this comment.
This Entra ID detection + warning block is duplicated almost identically in the helm_status == "not_found" branch below. Consider extracting it into a helper function like _warn_entraid_role_assignment(console, model_list, service_account_name) to reduce duplication.
There was a problem hiding this comment.
This minor change can be included in a following PR.
| Pending | ||
| +++++++ | ||
|
|
||
| 1.0.0b22 |
There was a problem hiding this comment.
Minor: HISTORY.rst says "Bump aks-agent to v0.7.1" but the PR description says "v0.7.0". The code in _consts.py matches the changelog (v0.7.1). Please update the PR description to be consistent.
Summary
Changes
workloadIdentity.enabled=true,serviceAccount.create=false, and use the same service account name as aks-mcpazureADTokenAuthflag in helm when API key is not providedTest plan