35 changes: 35 additions & 0 deletions .env.example
@@ -292,6 +292,41 @@
# CUSTOM_CAP_ANTIGRAVITY_T2_3_G25_FLASH=80%
# CUSTOM_CAP_COOLDOWN_ANTIGRAVITY_T2_3_G25_FLASH=offset:1800

# ------------------------------------------------------------------------------
# | [ADVANCED] Cross-Provider Model Fallback Groups |
# ------------------------------------------------------------------------------
#
# Pool credentials from multiple providers for equivalent models. When one
# provider's credentials are exhausted, automatically fall back to the next.
#
# Key features:
# - Sequential provider rotation: Each provider is tried completely (with its
#   internal tier rotation) before moving to the next provider
# - Target promotion: The requested provider is always moved to the front
# - Different model names: Each provider can use a different model name
#
# Format: JSON array of arrays, each inner array is a fallback group
#
# MODEL_FALLBACK_GROUPS='[
#   ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
#   ["antigravity/claude-sonnet-4.5", "openrouter/anthropic/claude-3.5-sonnet"]
# ]'
#
# Behavior example:
#   Request: "gemini_cli/gemini-2.5-pro"
#   1. Find group containing "gemini_cli/gemini-2.5-pro"
#   2. Reorder with target first: ["gemini_cli/gemini-2.5-pro", "gemini/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"]
#   3. Try gemini_cli completely (tier-2 → tier-1 credentials)
#   4. If exhausted, try gemini completely (tier-2 → tier-1 credentials)
#   5. If exhausted, try openrouter completely (tier-2 → tier-1 credentials)
#
# Notes:
# - Same provider can appear multiple times with different models
# - Request for a model NOT in any group uses single-provider behavior
# - Group name is not needed; entries are matched by exact "provider/model"
#
# MODEL_FALLBACK_GROUPS='[["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "antigravity/gemini-2.5-pro"]]'

# ------------------------------------------------------------------------------
# | [ADVANCED] Proxy Configuration |
# ------------------------------------------------------------------------------
78 changes: 78 additions & 0 deletions DOCUMENTATION.md
@@ -60,6 +60,7 @@ client = RotatingClient(
- `enable_request_logging` (`bool`, default: `False`): If `True`, enables detailed per-request file logging.
- `max_concurrent_requests_per_key` (`Optional[Dict[str, int]]`, default: `None`): Max concurrent requests allowed for a single API key per provider.
- `rotation_tolerance` (`float`, default: `3.0`): Controls the credential rotation strategy. See Section 2.2 for details.
- `model_fallback_groups` (`Optional[List[List[str]]]`, default: `None`): Cross-provider fallback groups. See Section 2.22 for details.

#### Core Responsibilities

@@ -919,6 +920,83 @@ The proxy accepts both Anthropic and OpenAI authentication styles:
- `x-api-key` header (Anthropic style)
- `Authorization: Bearer` header (OpenAI style)

### 2.22. Cross-Provider Fallback Groups (`fallback_groups.py`)

The `FallbackGroupManager` enables pooling credentials from multiple providers for equivalent models, with automatic fallback when one provider's credentials are exhausted.

#### Configuration

**Environment Variable:**
```bash
MODEL_FALLBACK_GROUPS='[
["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
["antigravity/claude-sonnet-4.5", "openrouter/anthropic/claude-3.5-sonnet"]
]'
```
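
As a rough illustration of how such a value could be read, the sketch below parses the JSON array from the environment and checks its shape. The helper name `load_fallback_groups` and its validation rules are assumptions made for this example; the actual parsing in `fallback_groups.py` may differ.

```python
import json
import os
from typing import List, Optional


def load_fallback_groups(
    env_var: str = "MODEL_FALLBACK_GROUPS",
) -> Optional[List[List[str]]]:
    """Hypothetical helper: parse fallback groups from an environment variable."""
    raw = os.environ.get(env_var)
    if not raw:
        # Unset or empty: no fallback groups, i.e. single-provider behavior.
        return None

    groups = json.loads(raw)
    if not isinstance(groups, list):
        raise ValueError(f"{env_var} must be a JSON array of arrays")

    for group in groups:
        if not isinstance(group, list):
            raise ValueError(f"{env_var}: each group must be a JSON array")
        for entry in group:
            # Each entry is expected to look like "provider/model".
            if not isinstance(entry, str) or "/" not in entry:
                raise ValueError(f"{env_var}: invalid entry {entry!r}")
    return groups
```

The result has the same `List[List[str]]` shape as the `model_fallback_groups` constructor parameter.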

**Constructor Parameter:**
```python
client = RotatingClient(
    model_fallback_groups=[
        ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
    ]
)
```

#### Key Concepts

- **Fallback Group**: A list of `provider/model` combinations that are considered equivalent
- **Target Promotion**: When a request matches an entry in a group, that entry is moved to the front
- **Sequential Provider Rotation**: Each provider is tried completely (with its internal tier rotation) before moving to the next
- **Provider Priority**: After the promoted target, the remaining entries are tried in the order specified in the configuration

#### Algorithm

Given a request for `gemini_cli/gemini-2.5-pro` with fallback group:
`["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"]`

1. **Find matching group**: Scan all groups for exact match `gemini_cli/gemini-2.5-pro`
2. **Reorder with target first**: `["gemini_cli/gemini-2.5-pro", "gemini/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"]`
3. **Try each provider sequentially**:
   - Try `gemini_cli/gemini-2.5-pro` with all its credentials (tier-2 first, then tier-1, using its internal rotation)
   - If exhausted, try `gemini/gemini-2.5-pro` with all its credentials
   - If exhausted, try `openrouter/google/gemini-2.5-pro` with all its credentials
4. **Success stops iteration**: First successful response is returned immediately
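
Steps 1 and 2 can be sketched in a few lines. The function name `resolve_fallback_chain` is hypothetical and is not taken from `fallback_groups.py`; the sketch only assumes exact string matching on `provider/model`, as described above.

```python
from typing import List


def resolve_fallback_chain(requested: str, groups: List[List[str]]) -> List[str]:
    """Hypothetical helper: return the ordered entries to try for a request."""
    for group in groups:
        if requested in group:  # exact "provider/model" match
            # Target promotion: requested entry first, the rest in configured order.
            return [requested] + [entry for entry in group if entry != requested]
    # Not in any group: single-provider behavior.
    return [requested]


groups = [
    ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
]
chain = resolve_fallback_chain("gemini_cli/gemini-2.5-pro", groups)
print(chain)
# ['gemini_cli/gemini-2.5-pro', 'gemini/gemini-2.5-pro', 'openrouter/google/gemini-2.5-pro']

# Splitting on the first "/" only keeps nested model names intact.
provider, model = chain[2].split("/", 1)
print(provider, model)  # openrouter google/gemini-2.5-pro
```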

#### Implementation Details

The fallback system reuses existing retry logic completely:

```
For each entry in fallback group (in order):
    Call existing _execute_with_retry() with entry's provider/model
    If success: return response
    If exhausted: continue to next entry
```

**Key Benefits:**
- Simple, predictable provider ordering
- Each provider uses its own internal tier rotation
- Existing cooldown, fair cycle, and rotation logic remains intact
- Provider-specific settings (like concurrency limits) are respected
- No changes needed to `UsageManager` - orchestration is at client level
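
A minimal asynchronous sketch of this client-level orchestration is shown below. It assumes that per-entry execution is exposed as an awaitable callable (here `execute_entry`, standing in for `_execute_with_retry()`) and that exhaustion is signalled by an exception (here `CredentialsExhausted`). Both names are hypothetical. The sketch raises when every entry is exhausted, whereas the actual client is described as returning an error response.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


class CredentialsExhausted(Exception):
    """Hypothetical signal that all credentials for one entry are used up."""


async def execute_with_fallback(
    chain: List[str],
    execute_entry: Callable[[str], Awaitable[Any]],
) -> Any:
    """Try each provider/model entry in order; return the first success."""
    errors: Dict[str, str] = {}
    for entry in chain:
        try:
            # The per-entry call keeps its own retry, cooldown, and tier logic.
            return await execute_entry(entry)
        except CredentialsExhausted as exc:
            errors[entry] = str(exc)  # remember why this entry failed
    raise CredentialsExhausted(f"all fallback entries exhausted: {errors}")


async def fake_execute(entry: str) -> str:
    # Stand-in executor: pretend gemini_cli has no quota left.
    if entry.startswith("gemini_cli/"):
        raise CredentialsExhausted("no quota")
    return f"response from {entry}"


result = asyncio.run(
    execute_with_fallback(
        ["gemini_cli/gemini-2.5-pro", "gemini/gemini-2.5-pro"], fake_execute
    )
)
print(result)  # response from gemini/gemini-2.5-pro
```

Because the loop only wraps the per-entry call, nothing inside that call needs to know that a fallback group exists.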

#### Edge Cases

| Scenario | Behavior |
|----------|----------|
| Request not in any group | Single-provider mode (existing behavior) |
| Same provider, different models | Both entries tried in order |
| Provider has no credentials | Entry skipped silently |
| All entries exhausted | Returns error response with details |
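
The "no credentials" and "same provider, different models" rows can be illustrated with a small filter. This is only a sketch under the assumption that credentials are available as a mapping from provider name to a list of keys; `usable_entries` and that mapping are hypothetical and not part of the library's documented API.

```python
from typing import Dict, List


def usable_entries(chain: List[str], credentials: Dict[str, List[str]]) -> List[str]:
    """Hypothetical helper: drop entries whose provider has no credentials."""
    usable = []
    for entry in chain:
        provider, _, _model = entry.partition("/")
        if credentials.get(provider):  # a missing or empty list means skip
            usable.append(entry)
    return usable


chain = [
    "gemini_cli/gemini-2.5-pro",
    "gemini/gemini-2.5-pro",
    "gemini/gemini-2.5-flash",  # same provider, different model
]
credentials = {"gemini": ["key-1", "key-2"], "gemini_cli": []}
print(usable_entries(chain, credentials))
# ['gemini/gemini-2.5-pro', 'gemini/gemini-2.5-flash']
```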

#### Logging

- `DEBUG`: Fallback is activated; records which entry is being tried
- `INFO`: An entry is exhausted and the client moves to the next one
- `INFO`: Fallback succeeds with a specific entry
- `WARNING`: All entries are exhausted

### 3.5. Antigravity (`antigravity_provider.py`)

The most sophisticated provider implementation, supporting Google's internal Antigravity API for Gemini 3 and Claude models (including **Claude Opus 4.5**, Anthropic's most powerful model).
1 change: 1 addition & 0 deletions README.md
@@ -294,6 +294,7 @@ The proxy is powered by a standalone Python library that you can use directly in
- **Intelligent key selection** with tiered, model-aware locking
- **Deadline-driven requests** with configurable global timeout
- **Automatic failover** between keys on errors
- **Cross-provider fallback** — pool credentials from multiple providers for the same model
- **OAuth support** for Gemini CLI, Antigravity, Qwen, iFlow
- **Stateless deployment ready** — load credentials from environment variables

6 changes: 5 additions & 1 deletion src/rotator_library/README.md
@@ -31,6 +31,7 @@ A robust, asynchronous, and thread-safe Python library for managing a pool of AP
- **Shared OAuth Base**: Refactored OAuth implementation with reusable [`GoogleOAuthBase`](providers/google_oauth_base.py) for multiple providers.
- **Fair Cycle Rotation**: Ensures each credential exhausts at least once before any can be reused within a tier. Prevents a single credential from being repeatedly used while others sit idle. Configurable per provider with tracking modes and cross-tier support.
- **Custom Usage Caps**: Set custom limits per tier, per model/group that are more restrictive than actual API limits. Supports percentages (e.g., "80%") and multiple cooldown modes (`quota_reset`, `offset`, `fixed`). Credentials go on cooldown before hitting actual API limits.
- **Cross-Provider Fallback Groups**: Pool credentials from multiple providers for equivalent models. When one provider's credentials are exhausted, automatically fall back to the next provider in the configured order. Each provider uses its own internal tier rotation.
- **Centralized Defaults**: All tunable defaults are defined in [`config/defaults.py`](config/defaults.py) for easy customization and visibility.

## Installation
@@ -82,7 +83,10 @@ client = RotatingClient(
     whitelist_models={},
     enable_request_logging=False,
     max_concurrent_requests_per_key={},
-    rotation_tolerance=2.0  # 0.0=deterministic, 2.0=recommended random
+    rotation_tolerance=2.0,  # 0.0=deterministic, 2.0=recommended random
+    model_fallback_groups=[  # Cross-provider fallback groups
+        ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
+    ],
 )
```
