Honor AI_AGENT and pass raw values through #815
Open
renaudhartert-db wants to merge 1 commit into
Detect the AI_AGENT environment variable (Vercel @vercel/detect-agent convention) as a secondary fallback for the AI agent reported in the user agent header. It is consulted only when the agents.md AGENT variable is unset or empty; AGENT takes precedence when both are non-empty. Unrecognized AGENT or AI_AGENT values are now passed through as-is rather than coerced to the literal "unknown". The passed-through value is sanitized to the user agent allowlist [0-9A-Za-z_.+-] (disallowed characters become hyphens) and capped at 64 characters. Explicit product matchers (CLAUDECODE, CURSOR_AGENT, etc.) still take precedence over both AGENT and AI_AGENT. Mirrors databricks/databricks-sdk-go#1683.
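The sanitization step described above can be sketched as follows. This is a minimal illustration, not the SDK's actual code: the class name `AgentNameSanitizer`, the constant names, and the choice to sanitize before truncating are assumptions. Only the allowlist `[0-9A-Za-z_.+-]`, the hyphen replacement, and the 64-character cap come from the description.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the real logic lives in UserAgent.java.
final class AgentNameSanitizer {
    // Matches any character outside the User-Agent allowlist [0-9A-Za-z_.+-].
    private static final Pattern DISALLOWED = Pattern.compile("[^0-9A-Za-z_.+-]");

    // Cap that keeps the header bounded (64 per the PR description).
    private static final int MAX_LENGTH = 64;

    static String sanitize(String raw) {
        // Replace each disallowed character with a hyphen.
        String cleaned = DISALLOWED.matcher(raw).replaceAll("-");
        // Assumption: truncation happens after sanitization.
        return cleaned.length() > MAX_LENGTH ? cleaned.substring(0, MAX_LENGTH) : cleaned;
    }

    public static void main(String[] args) {
        System.out.println(sanitize("cursor/1.2.3"));              // cursor-1.2.3
        System.out.println(sanitize("claude-code_2-1-141_agent")); // unchanged
    }
}
```

Values that already satisfy the allowlist, such as `cursor` or `claude-code`, come out unchanged, which is why known products need no special lookup.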
Why
The Java SDK detects AI coding agents and surfaces them as agent/<name> in the User-Agent. Today the generic fallback (when no proprietary env var fires) only honors the agents.md AGENT=<name> standard. Vercel's @vercel/detect-agent library uses a parallel AI_AGENT=<name> convention that tools in the Vercel ecosystem set instead; we currently miss those.

Separately, the existing fallback coerces any unrecognized value to the literal string "unknown". That buries useful signal: a tool setting AI_AGENT=claude-code_2-1-141_agent ends up as agent/unknown, discarding the very signal (tool name plus version variant) we want to see. Bucketing arbitrary names is an ETL concern, not the SDK's.

This mirrors the Go SDK change in databricks/databricks-sdk-go#1683.

Changes
Two behavior changes in src/main/java/com/databricks/sdk/core/UserAgent.java:

- AI_AGENT fallback. Add AI_AGENT=<name> as a secondary fallback after AGENT=<name>. AGENT wins when both are set to non-empty values; empty is treated as unset for both. Explicit product matchers (e.g. CLAUDECODE) still always win over both.
- Raw passthrough instead of "unknown". Drop the known-product lookup in the fallback. The value is piped through the existing sanitize() helper (disallowed chars become -, satisfying the User-Agent allowlist [0-9A-Za-z_.+-]) and capped at 64 chars to keep the header bounded. Known products like cursor or claude-code pass through unchanged because they already satisfy the allowlist. Note that the Java allowlist does not include /, so a value like cursor/1.2.3 sanitizes to cursor-1.2.3.

Same change is landing in databricks-sdk-py as a sibling PR.

Test plan
- mvn -pl databricks-sdk-java test -Dtest=UserAgentTest passes (48 tests)
- mvn spotless:apply clean
- AI_AGENT=<known product> returns the product name
- AI_AGENT=<unrecognized> returns the raw sanitized value (no longer "unknown")
- AGENT wins over AI_AGENT when both are non-empty
- Empty AGENT falls through to AI_AGENT
- Disallowed characters in AGENT/AI_AGENT are sanitized to -
- Explicit matcher (e.g. CLAUDECODE) still wins over both fallbacks
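
The precedence rules exercised by the test plan can be sketched roughly as below. This is an illustration under stated assumptions, not the code in UserAgent.java: the class `AgentFallbackSketch`, the `resolve` method, the two-entry matcher table, the mapping of each matcher to a product name, and the rule that a matcher fires when its variable is non-empty are all hypothetical. The environment is passed in as a map so the logic can be exercised without touching real process state.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the lookup order; names are not taken from the SDK.
final class AgentFallbackSketch {
    // Assumed subset of explicit product matchers: env var -> product name.
    private static final Map<String, String> EXPLICIT_MATCHERS = new LinkedHashMap<>();

    static {
        EXPLICIT_MATCHERS.put("CLAUDECODE", "claude-code");
        EXPLICIT_MATCHERS.put("CURSOR_AGENT", "cursor");
    }

    private static boolean nonEmpty(String value) {
        return value != null && !value.isEmpty();
    }

    // Returns the raw agent name, before sanitization and length capping.
    static Optional<String> resolve(Map<String, String> env) {
        // 1. Explicit product matchers win over both generic fallbacks.
        for (Map.Entry<String, String> matcher : EXPLICIT_MATCHERS.entrySet()) {
            if (nonEmpty(env.get(matcher.getKey()))) {
                return Optional.of(matcher.getValue());
            }
        }
        // 2. agents.md AGENT is the primary fallback; empty counts as unset.
        String agent = env.get("AGENT");
        if (nonEmpty(agent)) {
            return Optional.of(agent);
        }
        // 3. AI_AGENT is consulted only when AGENT is unset or empty.
        String aiAgent = env.get("AI_AGENT");
        if (nonEmpty(aiAgent)) {
            return Optional.of(aiAgent);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("AGENT", "");
        env.put("AI_AGENT", "claude-code_2-1-141_agent");
        // AGENT is empty, so the AI_AGENT value is passed through as-is.
        System.out.println(resolve(env).orElse("(none)"));
    }
}
```

In the change described here, the value chosen in steps 2 and 3 would then go through sanitize() and the 64-character cap before being reported as agent/<name>.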