
Honor AI_AGENT and pass raw values through #815

Open
renaudhartert-db wants to merge 1 commit into main from ai-agent-env-var


@renaudhartert-db
Contributor

@renaudhartert-db renaudhartert-db commented May 30, 2026

Why

The Java SDK detects AI coding agents and surfaces them as agent/<name> in the User-Agent. Today the generic fallback (when no proprietary env var fires) only honors the agents.md AGENT=<name> standard. Vercel's @vercel/detect-agent library uses a parallel AI_AGENT=<name> convention that tools in the Vercel ecosystem set instead; we currently miss those.

Separately, the existing fallback coerces any unrecognized value to the literal string "unknown". That discards useful information: a tool setting AI_AGENT=claude-code_2-1-141_agent is reported as agent/unknown, losing exactly the detail (tool name plus version variant) we want to see. Bucketing arbitrary names is an ETL concern, not the SDK's.

This mirrors the Go SDK change in databricks/databricks-sdk-go#1683.

Changes

Two behavior changes in src/main/java/com/databricks/sdk/core/UserAgent.java:

  1. AI_AGENT fallback. Add AI_AGENT=<name> as a secondary fallback after AGENT=<name>. AGENT wins when both are set to non-empty values; empty is treated as unset for both. Explicit product matchers (e.g. CLAUDECODE) still always win over both.

  2. Raw passthrough instead of "unknown". Drop the known-product lookup in the fallback. The value is piped through the existing sanitize() helper (disallowed chars become -, satisfying the User-Agent allowlist [0-9A-Za-z_.+-]) and capped at 64 chars to keep the header bounded. Known products like cursor or claude-code pass through unchanged because they already satisfy the allowlist. Note that the Java allowlist does not include /, so a value like cursor/1.2.3 sanitizes to cursor-1.2.3.
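The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in UserAgent.java: the class name AgentFallbackSketch, the resolve method, and the Map-based environment are hypothetical, and the single CLAUDECODE matcher (including its mapping to claude-code and the assumption that any non-empty value triggers it) stands in for the SDK's full list of explicit matchers.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the fallback described above; names are illustrative only.
final class AgentFallbackSketch {
  // Cap from the PR description, to keep the header bounded.
  private static final int MAX_LENGTH = 64;

  // Replace every character outside the allowlist [0-9A-Za-z_.+-] with '-'.
  static String sanitize(String value) {
    return value.replaceAll("[^0-9A-Za-z_.+-]", "-");
  }

  // Returns the agent name to report, or empty if nothing is detected.
  static Optional<String> resolve(Map<String, String> env) {
    // Explicit product matchers win over both fallbacks.
    // Assumption: a non-empty CLAUDECODE maps to "claude-code".
    String claude = env.get("CLAUDECODE");
    if (claude != null && !claude.isEmpty()) {
      return Optional.of("claude-code");
    }
    // AGENT is consulted first, then AI_AGENT; an empty value counts as unset.
    for (String name : new String[] {"AGENT", "AI_AGENT"}) {
      String raw = env.get(name);
      if (raw != null && !raw.isEmpty()) {
        String clean = sanitize(raw);
        // Truncate after sanitizing, so the cut always falls on an ASCII character.
        return Optional.of(
            clean.length() > MAX_LENGTH ? clean.substring(0, MAX_LENGTH) : clean);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Resolve against the real process environment and print the result.
    System.out.println(resolve(System.getenv()).orElse("(no agent detected)"));
  }
}
```

Under these assumptions, AGENT=cursor/1.2.3 resolves to cursor-1.2.3, and AI_AGENT=claude-code_2-1-141_agent passes through unchanged because every character is already in the allowlist.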

The same change is landing in databricks-sdk-py as a sibling PR.

Test plan

  • mvn -pl databricks-sdk-java test -Dtest=UserAgentTest passes (48 tests)
  • mvn spotless:apply is clean
  • AI_AGENT=<known product> returns the product name
  • AI_AGENT=<unrecognized> returns the raw sanitized value (no longer "unknown")
  • AGENT wins over AI_AGENT when both are non-empty
  • Empty AGENT falls through to AI_AGENT
  • Disallowed chars in AGENT / AI_AGENT are sanitized to -
  • Values longer than 64 chars are truncated
  • Explicit matcher (e.g. CLAUDECODE) still wins over both fallbacks

@renaudhartert-db changed the title from "Detect AI_AGENT env var and pass through unrecognized agent values" to "Honor AI_AGENT and pass raw values through" on May 30, 2026

Detect the AI_AGENT environment variable (Vercel @vercel/detect-agent
convention) as a secondary fallback for the AI agent reported in the user
agent header. It is consulted only when the agents.md AGENT variable is unset
or empty; AGENT takes precedence when both are non-empty.

Unrecognized AGENT or AI_AGENT values are now passed through as-is rather than
coerced to the literal "unknown". The passed-through value is sanitized to the
user agent allowlist [0-9A-Za-z_.+-] (disallowed characters become hyphens)
and capped at 64 characters. Explicit product matchers (CLAUDECODE,
CURSOR_AGENT, etc.) still take precedence over both AGENT and AI_AGENT.

Mirrors databricks/databricks-sdk-go#1683.
@github-actions
Contributor

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-java

Inputs:

  • PR number: 815
  • Commit SHA: 3a0957778ff4304c800a54f9ac9c9a4435858bb1

Checks will be approved automatically on success.
