@JonHolman

Rate Limit Information Enhancement

Summary

This PR adds detailed rate limit information to OpenCode's APIError schema, exposing provider-specific rate limiting details that were previously captured by the server but not accessible to SDK clients.

Problem

When OpenCode encounters rate limits from AI providers (Anthropic, Google, etc.), it captures detailed information in debug logs including:

  • Retry-after delays
  • Reset timestamps
  • Quota status and utilization
  • Provider-specific error details

However, this information was not exposed through the API, forcing applications to:

  • Use generic timeout handling
  • Implement blind retry strategies
  • Miss opportunities for intelligent rate limit management

Solution

Enhanced the MessageV2.APIError schema with a new rateLimitInfo field containing:

rateLimitInfo?: {
  retryAfter?: number          // milliseconds until retry allowed
  resetTime?: number           // Unix timestamp when limit resets  
  quotaStatus?: string         // e.g., "rejected", "allowed"
  quotaUtilization?: number    // fraction of quota used, 0-1 (not 0-100)
  quotaDetails?: Record<string, any>  // provider-specific details
}

Implementation Details

1. Schema Enhancement (lines 34-41)

Added a rateLimitInfo field to the APIError Zod schema: an optional object with five optional sub-fields.

2. Helper Function (lines 611-805)

extractRateLimitInfo(error: APICallError) extracts rate limit details from:

Standard Headers:

  • retry-after: Supports both seconds (number) and HTTP-date formats
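
The two retry-after formats can be handled along these lines. This is a minimal sketch, not the PR's actual code: parseRetryAfterMs is a hypothetical name, and returning milliseconds is an assumption chosen to match the retryAfter field described above.

```typescript
// Hypothetical helper (not taken from the PR): converts a Retry-After
// header value into milliseconds from "now".
function parseRetryAfterMs(
  value: string,
  nowMs: number = Date.now(),
): number | undefined {
  const trimmed = value.trim();

  // delta-seconds form: a non-negative integer such as "120"
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) {
    return undefined; // unrecognized format: leave the field unset
  }

  // A date in the past means the caller may retry immediately.
  return Math.max(0, dateMs - nowMs);
}
```

With the Anthropic sample shown under Testing, parseRetryAfterMs("32692") returns 32692000, since the header carries seconds and the field carries milliseconds.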

Anthropic-specific Headers:

  • anthropic-ratelimit-unified-reset: Reset timestamp
  • anthropic-ratelimit-unified-status: "rejected" or "allowed"
  • anthropic-ratelimit-unified-utilization: Quota utilization as a fraction (nominally 0-1)
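
Reading the three Anthropic headers listed above might look roughly like this. The function and type names are hypothetical, header keys are assumed to be lowercased already, and the reset value is passed through as-is without guessing its unit.

```typescript
// Hypothetical shape for the Anthropic-derived fields (illustration only).
interface AnthropicRateLimitFields {
  resetTime?: number;
  quotaStatus?: string;
  quotaUtilization?: number;
}

// Parses a header value as a finite number, or returns undefined.
function toFiniteNumber(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isFinite(n) ? n : undefined;
}

// Assumption: header names have already been normalized to lowercase.
function extractAnthropicInfo(
  headers: Record<string, string | undefined>,
): AnthropicRateLimitFields {
  const info: AnthropicRateLimitFields = {};

  const reset = toFiniteNumber(headers["anthropic-ratelimit-unified-reset"]);
  if (reset !== undefined) info.resetTime = reset;

  const status = headers["anthropic-ratelimit-unified-status"];
  if (status !== undefined && status !== "") info.quotaStatus = status;

  const utilization = toFiniteNumber(
    headers["anthropic-ratelimit-unified-utilization"],
  );
  if (utilization !== undefined) info.quotaUtilization = utilization;

  return info;
}
```

Only headers that are present and parseable set a field, which keeps every sub-field optional, as in the schema.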

Google Response Body:

  • QuotaFailure: Violations with metrics, limits, and dimensions
  • RetryInfo: Retry delays in "Xs" format
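
For the Google response body, the extraction could be sketched as below. All names here are hypothetical, and storing the violations under quotaDetails.violations is an assumption about layout, not something the PR description specifies.

```typescript
// Hypothetical result shape (illustration only).
interface GoogleRateLimitFields {
  retryAfter?: number; // milliseconds
  quotaDetails?: { violations: unknown[] };
}

// Converts a duration string such as "22.362s" to milliseconds.
function parseDurationMs(value: string): number | undefined {
  const match = /^(\d+(?:\.\d+)?)s$/.exec(value.trim());
  if (!match) return undefined;
  return Math.round(Number(match[1]) * 1000);
}

function extractGoogleInfo(body: unknown): GoogleRateLimitFields {
  const info: GoogleRateLimitFields = {};
  const details = (body as { error?: { details?: unknown } } | null)?.error
    ?.details;
  if (!Array.isArray(details)) return info;

  for (const detail of details) {
    if (typeof detail !== "object" || detail === null) continue;
    const entry = detail as Record<string, unknown>;
    const type = entry["@type"];
    if (typeof type !== "string") continue;

    if (type.endsWith("google.rpc.RetryInfo")) {
      const delay = entry["retryDelay"];
      const ms = typeof delay === "string" ? parseDurationMs(delay) : undefined;
      if (ms !== undefined) info.retryAfter = ms;
    } else if (type.endsWith("google.rpc.QuotaFailure")) {
      const violations = entry["violations"];
      if (Array.isArray(violations)) info.quotaDetails = { violations };
    }
  }
  return info;
}
```

Applied to the RESOURCE_EXHAUSTED sample under Testing, this sketch yields a retryAfter of 22362 and one violation entry.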

3. Integration (line 876)

Updated fromError() to call extractRateLimitInfo(e) and include result in APIError construction.

Testing

Validated with actual rate limit responses:

Anthropic 429:

retry-after: 32692
anthropic-ratelimit-unified-reset: 1767884400
anthropic-ratelimit-unified-status: rejected  
anthropic-ratelimit-unified-utilization: 1.00003

Google RESOURCE_EXHAUSTED:

{
  "error": {
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.QuotaFailure",
        "violations": [{
          "quotaMetric": "character_count",
          "quotaLimit": "CharacterCountPerDay"
        }]
      },
      {
        "@type": "type.googleapis.com/google.rpc.RetryInfo",
        "retryDelay": "22.362s"
      }
    ]
  }
}

Benefits

  1. Intelligent Retry Logic: Applications can wait exactly the required time instead of guessing
  2. Quota Awareness: See when quotas reset and current utilization levels
  3. Better Error Messages: Surface provider-specific details to users
  4. Resource Efficiency: Avoid hammering rate-limited endpoints
  5. Multi-Provider Support: Unified interface for different provider formats

Example Usage

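const result = await opencode.sessions.execute(/* ... */);

if (result.error?.type === 'APIError' && result.error.rateLimitInfo) {
  // Every sub-field is optional, so check each one before using it.
  const { retryAfter, resetTime, quotaStatus, quotaUtilization } = result.error.rateLimitInfo;

  if (quotaStatus !== undefined) console.log(`Status: ${quotaStatus}`);
  if (quotaUtilization !== undefined) console.log(`Utilization: ${(quotaUtilization * 100).toFixed(1)}%`);
  // new Date() expects milliseconds; multiply by 1000 first if resetTime is in Unix seconds.
  if (resetTime !== undefined) console.log(`Quota resets at ${new Date(resetTime)}`);

  if (retryAfter !== undefined) {
    console.log(`Rate limited. Retry in ${retryAfter}ms`);
    // Wait and retry
    await new Promise(resolve => setTimeout(resolve, retryAfter));
    // ... retry logic
  }
}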
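
Because every sub-field is optional, a client will likely want a fallback order when choosing how long to wait. The sketch below is one possible policy, not part of this PR: pickRetryDelayMs is a hypothetical name, resetTime is assumed to be in Unix seconds (as in the Anthropic header sample), and the backoff constants are arbitrary.

```typescript
// Mirrors the fields this sketch needs from rateLimitInfo.
interface RateLimitInfo {
  retryAfter?: number; // milliseconds until retry is allowed
  resetTime?: number;  // assumed here to be Unix seconds
}

// Hypothetical policy: prefer the provider's explicit delay, then the
// reset timestamp, then fall back to capped exponential backoff.
function pickRetryDelayMs(
  info: RateLimitInfo | undefined,
  attempt: number,
  nowMs: number = Date.now(),
): number {
  if (info?.retryAfter !== undefined) {
    return Math.max(0, info.retryAfter);
  }
  if (info?.resetTime !== undefined) {
    // Assumption: convert seconds to milliseconds before subtracting.
    return Math.max(0, info.resetTime * 1000 - nowMs);
  }
  // No provider hint: 1s, 2s, 4s, ... capped at 30s (arbitrary values).
  return Math.min(30000, 1000 * Math.pow(2, attempt));
}
```

The function only computes a delay. Whether to actually wait is left to the caller, since a provider-reported delay can run to hours (the Anthropic sample above reports 32692 seconds), in which case surfacing the error may be more useful than sleeping.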

Backwards Compatibility

  • Non-breaking: New field is optional
  • Existing code: Continues to work unchanged
  • Opt-in: Applications can choose to use the new field
  • Type-safe: Full TypeScript support

Files Changed

  • packages/opencode/src/session/message-v2.ts

Add detailed rate limit information to the APIError schema to surface
provider-specific rate limiting details that were previously captured
but not exposed to SDK clients.

Changes:
- Add rateLimitInfo field to APIError schema with:
  * retryAfter: milliseconds until retry is allowed
  * resetTime: Unix timestamp when rate limit resets
  * quotaStatus: status like 'rejected' or 'allowed'
  * quotaUtilization: quota usage as a fraction (0-1)
  * quotaDetails: provider-specific additional details

- Add extractRateLimitInfo() helper function that extracts:
  * Standard retry-after headers (seconds or HTTP-date format)
  * Anthropic-specific headers (anthropic-ratelimit-unified-*)
  * Google quota details from response body (QuotaFailure, RetryInfo)

- Update fromError() to populate rateLimitInfo when constructing APIError

This enables applications to implement intelligent rate limit handling by accessing detailed information about when to retry, quota status, and provider-specific rate limiting details.
@github-actions

github-actions bot commented Jan 8, 2026

The following comment was made by an LLM, it may be inaccurate:

No duplicate PRs found
