Mirrowel · Jan 20, 2026
@@ -292,6 +292,41 @@
 # CUSTOM_CAP_ANTIGRAVITY_T2_3_G25_FLASH=80%
 # CUSTOM_CAP_COOLDOWN_ANTIGRAVITY_T2_3_G25_FLASH=offset:1800
 
+# ------------------------------------------------------------------------------
+# | [ADVANCED] Cross-Provider Model Fallback Groups                             |
+# ------------------------------------------------------------------------------
+#
+# Pool credentials from multiple providers for equivalent models. When one
+# provider's credentials are exhausted, automatically fall back to the next.
+#
+# Key features:
+#   - Sequential provider rotation: Each provider is tried completely (with its
+#     internal tier rotation) before moving to the next provider
+#   - Target promotion: The requested "provider/model" entry is always moved to the front
+#   - Different model names: Each provider can use a different model name
+#
+# Format: JSON array of arrays; each inner array is one fallback group
+#
+# MODEL_FALLBACK_GROUPS='[
+#   ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
+#   ["antigravity/claude-sonnet-4.5", "openrouter/anthropic/claude-3.5-sonnet"]
+# ]'
+#
+# Behavior example:
+#   Request: "gemini_cli/gemini-2.5-pro"
+#   1. Find group containing "gemini_cli/gemini-2.5-pro"
+#   2. Reorder with target first: ["gemini_cli/gemini-2.5-pro", "gemini/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"]
+#   3. Try gemini_cli completely (tier-2 → tier-1 credentials)
+#   4. If exhausted, try gemini completely (tier-2 → tier-1 credentials)
+#   5. If exhausted, try openrouter completely (tier-2 → tier-1 credentials)
+#
+# Notes:
+#   - The same provider can appear multiple times with different models
+#   - A request for a model NOT in any group uses single-provider behavior
+#   - Groups have no names; entries are matched by exact "provider/model"
+#
+# MODEL_FALLBACK_GROUPS='[["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "antigravity/gemini-2.5-pro"]]'
+
 # ------------------------------------------------------------------------------
 # | [ADVANCED] Proxy Configuration                                             |
 # ------------------------------------------------------------------------------
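
The `MODEL_FALLBACK_GROUPS` value documented in the hunk above is a JSON array of arrays of `provider/model` strings. As a rough illustration of how such a value could be parsed and validated, here is a minimal sketch. The helper name `parse_fallback_groups` and its validation rules are assumptions made for this example, not code from the PR.

```python
import json
from typing import List, Optional


def parse_fallback_groups(raw: Optional[str]) -> List[List[str]]:
    """Parse a MODEL_FALLBACK_GROUPS-style JSON string (hypothetical helper)."""
    if raw is None or not raw.strip():
        return []  # unset or empty variable: no fallback groups configured

    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of arrays")

    groups: List[List[str]] = []
    for group in data:
        if not isinstance(group, list):
            raise ValueError(f"fallback group must be an array, got {group!r}")
        for entry in group:
            if not isinstance(entry, str):
                raise ValueError(f"entry must be a string, got {entry!r}")
            # Split on the first slash only: the model part may itself contain
            # slashes, as in "openrouter/google/gemini-2.5-pro".
            provider, _, model = entry.partition("/")
            if not provider or not model:
                raise ValueError(f"entry must look like 'provider/model': {entry!r}")
        groups.append(list(group))
    return groups
```

Under these assumptions, the result of `parse_fallback_groups(os.environ.get("MODEL_FALLBACK_GROUPS"))` (with `import os`) could be passed as the `model_fallback_groups` constructor argument.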

@@ -60,6 +60,7 @@ client = RotatingClient(
 -   `enable_request_logging` (`bool`, default: `False`): If `True`, enables detailed per-request file logging.
 -   `max_concurrent_requests_per_key` (`Optional[Dict[str, int]]`, default: `None`): Max concurrent requests allowed for a single API key per provider.
 -   `rotation_tolerance` (`float`, default: `3.0`): Controls the credential rotation strategy. See Section 2.2 for details.
+-   `model_fallback_groups` (`Optional[List[List[str]]]`, default: `None`): Cross-provider fallback groups. See Section 2.22 for details.
 
 #### Core Responsibilities
 
@@ -919,6 +920,83 @@ The proxy accepts both Anthropic and OpenAI authentication styles:
 - `x-api-key` header (Anthropic style)
 - `Authorization: Bearer` header (OpenAI style)
 
+### 2.22. Cross-Provider Fallback Groups (`fallback_groups.py`)
+
+The `FallbackGroupManager` enables pooling credentials from multiple providers for equivalent models, with automatic fallback when one provider's credentials are exhausted.
+
+#### Configuration
+
+**Environment Variable:**
+```bash
+MODEL_FALLBACK_GROUPS='[
+  ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
+  ["antigravity/claude-sonnet-4.5", "openrouter/anthropic/claude-3.5-sonnet"]
+]'
+```
+
+**Constructor Parameter:**
+```python
+client = RotatingClient(
+    model_fallback_groups=[
+        ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
+    ]
+)
+```
+
+#### Key Concepts
+
+- **Fallback Group**: A list of `provider/model` combinations that are considered equivalent
+- **Target Promotion**: When a request matches an entry in a group, that entry is moved to the front
+- **Sequential Provider Rotation**: Each provider is tried completely (with its internal tier rotation) before moving to the next
+- **Provider Priority**: After the promoted target, the remaining entries are tried in the order specified in the configuration
+
+#### Algorithm
+
+Given a request for `gemini_cli/gemini-2.5-pro` with fallback group:
+`["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"]`
+
+1. **Find matching group**: Scan all groups for exact match `gemini_cli/gemini-2.5-pro`
+2. **Reorder with target first**: `["gemini_cli/gemini-2.5-pro", "gemini/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"]`
+3. **Try each provider sequentially**:
+   - Try `gemini_cli/gemini-2.5-pro` with all its credentials, using its internal tier rotation (tier-2 first, then tier-1)
+   - If exhausted, try `gemini/gemini-2.5-pro` with all its credentials
+   - If exhausted, try `openrouter/google/gemini-2.5-pro` with all its credentials
+4. **Success stops iteration**: First successful response is returned immediately
+
+#### Implementation Details
+
+The fallback system reuses the existing retry logic unchanged:
+
+```
+For each entry in fallback group (in order):
+    Call existing _execute_with_retry() with entry's provider/model
+    If success: return response
+    If exhausted: continue to next entry
+```
+
+**Key Benefits:**
+- Simple, predictable provider ordering
+- Each provider uses its own internal tier rotation
+- Existing cooldown, fair cycle, and rotation logic remains intact
+- Provider-specific settings (like concurrency limits) are respected
+- No changes are needed to `UsageManager`; orchestration happens at the client level
+
+#### Edge Cases
+
+| Scenario | Behavior |
+|----------|----------|
+| Request not in any group | Single-provider mode (existing behavior) |
+| Same provider, different models | Both entries tried in order |
+| Provider has no credentials | Entry skipped silently |
+| All entries exhausted | Returns error response with details |
+
+#### Logging
+
+- `DEBUG`: When fallback is activated and which entry is being tried
+- `INFO`: When an entry is exhausted and the client moves to the next one
+- `INFO`: When fallback succeeds with a specific entry
+- `WARNING`: When all entries are exhausted
+
 ### 3.5. Antigravity (`antigravity_provider.py`)
 
 The most sophisticated provider implementation, supporting Google's internal Antigravity API for Gemini 3 and Claude models (including **Claude Opus 4.5**, Anthropic's most powerful model).
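
The target promotion and sequential fallback described in Section 2.22 of this diff can be sketched in a few lines. This is a simplified, synchronous illustration, not the code in `fallback_groups.py`. The function names `ordered_entries` and `run_with_fallback` are hypothetical, and the `attempt` callback is only a stand-in for the existing per-provider retry call, assumed here to return `None` when a provider's credentials are exhausted.

```python
from typing import Callable, List, Optional, Tuple


def ordered_entries(groups: List[List[str]], requested: str) -> List[str]:
    """Return the entries to try, with the requested entry promoted to the front."""
    for group in groups:
        if requested in group:  # exact "provider/model" match
            return [requested] + [e for e in group if e != requested]
    return [requested]  # not in any group: single-provider behavior


def run_with_fallback(
    groups: List[List[str]],
    requested: str,
    attempt: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Try each entry in order; return (entry, response) for the first success."""
    tried: List[str] = []
    for entry in ordered_entries(groups, requested):
        tried.append(entry)
        response = attempt(entry)  # stand-in for the existing retry logic
        if response is not None:
            return entry, response  # success stops iteration
    raise RuntimeError(f"all fallback entries exhausted: {tried}")
```

With the three-entry Gemini group from the examples, a request for `gemini_cli/gemini-2.5-pro` would try `gemini_cli`, then `gemini`, then `openrouter`, stopping at the first entry for which `attempt` returns a response.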

@@ -294,6 +294,7 @@ The proxy is powered by a standalone Python library that you can use directly in
 - **Intelligent key selection** with tiered, model-aware locking
 - **Deadline-driven requests** with configurable global timeout
 - **Automatic failover** between keys on errors
+- **Cross-provider fallback** — pool credentials from multiple providers for the same model
 - **OAuth support** for Gemini CLI, Antigravity, Qwen, iFlow
 - **Stateless deployment ready** — load credentials from environment variables
 

@@ -31,6 +31,7 @@ A robust, asynchronous, and thread-safe Python library for managing a pool of AP
 -   **Shared OAuth Base**: Refactored OAuth implementation with reusable [`GoogleOAuthBase`](providers/google_oauth_base.py) for multiple providers.
 -   **Fair Cycle Rotation**: Ensures each credential exhausts at least once before any can be reused within a tier. Prevents a single credential from being repeatedly used while others sit idle. Configurable per provider with tracking modes and cross-tier support.
 -   **Custom Usage Caps**: Set custom limits per tier, per model/group that are more restrictive than actual API limits. Supports percentages (e.g., "80%") and multiple cooldown modes (`quota_reset`, `offset`, `fixed`). Credentials go on cooldown before hitting actual API limits.
+-   **Cross-Provider Fallback Groups**: Pool credentials from multiple providers for equivalent models. When one provider's credentials are exhausted, automatically fall back to the next provider in the configured order. Each provider uses its own internal tier rotation.
 -   **Centralized Defaults**: All tunable defaults are defined in [`config/defaults.py`](config/defaults.py) for easy customization and visibility.
 
 ## Installation
@@ -82,7 +83,10 @@ client = RotatingClient(
     whitelist_models={},
     enable_request_logging=False,
     max_concurrent_requests_per_key={},
-    rotation_tolerance=2.0  # 0.0=deterministic, 2.0=recommended random
+    rotation_tolerance=2.0,  # 0.0=deterministic, 2.0=recommended random
+    model_fallback_groups=[   # Cross-provider fallback groups
+        ["gemini/gemini-2.5-pro", "gemini_cli/gemini-2.5-pro", "openrouter/google/gemini-2.5-pro"],
+    ],
 )
 ```