Skip to content

Conversation

@Mirrowel
Copy link
Owner

@Mirrowel Mirrowel commented Jan 20, 2026

This introduces a robust mechanism to define and utilize groups of equivalent models across different providers. If a request is made to a model that belongs to a fallback group, the client can now seamlessly rotate through other providers/models in that group upon exhaustion of credentials or errors.

  • Add FallbackGroupManager to parse and index fallback configurations from init args or the MODEL_FALLBACK_GROUPS environment variable.
  • Implement _execute_with_fallback in RotatingClient for standard completions, handling credential swapping and tier prioritization across providers.
  • Implement _streaming_with_fallback to support transparent failover during streaming responses.
  • Integrate fallback logic into the main acompletion entry point.

Related #85


Important

Introduces a cross-provider model fallback system in RotatingClient, enabling seamless credential rotation across providers for equivalent models.

  • Behavior:
    • Introduces cross-provider model fallback system in RotatingClient.
    • Adds FallbackGroupManager to manage fallback configurations from init args or MODEL_FALLBACK_GROUPS env variable.
    • Implements _execute_with_fallback and _streaming_with_fallback for handling fallback logic in client.py.
  • Configuration:
    • Supports fallback configuration via model_fallback_groups parameter or MODEL_FALLBACK_GROUPS env variable.
    • Updates .env.example with fallback group configuration examples.
  • Documentation:
    • Updates DOCUMENTATION.md and README.md to include details on cross-provider fallback groups.
    • Adds section on fallback groups to src/rotator_library/README.md.
  • Misc:
    • Minor logging enhancements in client.py and fallback_groups.py.

This description was created by Ellipsis for 96c4bdb. You can customize this summary. It will automatically update as commits are pushed.

This introduces a robust mechanism to define and utilize groups of equivalent models across different providers. If a request is made to a model that belongs to a fallback group, the client can now seamlessly rotate through other providers/models in that group upon exhaustion of credentials or errors.

- Add `FallbackGroupManager` to parse and index fallback configurations from init args or the `MODEL_FALLBACK_GROUPS` environment variable.
- Implement `_execute_with_fallback` in `RotatingClient` for standard completions, handling credential swapping and tier prioritization across providers.
- Implement `_streaming_with_fallback` to support transparent failover during streaming responses.
- Integrate fallback logic into the main `acompletion` entry point.
@mirrobot-agent
Copy link
Contributor

I'm diving into the new cross-provider model fallback system. Looking forward to reviewing the and how it integrates into the . Back soon with the full report!

Copy link
Contributor

@mirrobot-agent mirrobot-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing review submission with only critical findings.


# Temporarily swap credentials for this provider
original_creds = self.all_credentials.get(provider, [])
self.all_credentials[provider] = tier_creds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: This is a major concurrency race condition. self.all_credentials is shared across all requests. Swapping it globally will affect every other concurrent request to this provider, causing them to use only the credentials from this fallback tier.

To fix this properly, _execute_with_retry should be updated to accept an optional credentials_override list, avoiding global state modification during request execution.


# Temporarily swap credentials for this provider
original_creds = self.all_credentials.get(provider, [])
self.all_credentials[provider] = tier_creds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: Same race condition as in _execute_with_fallback. Global modification of self.all_credentials is not safe for concurrent streaming requests.

Copy link
Contributor

@mirrobot-agent mirrobot-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing review submission for fallback_groups.py.

group = self._groups[group_idx]

# Move target to front, preserve relative order of others
reordered = [provider_model]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider deduplicating the group members here. If an entry appears multiple times in the group config, it will currently appear multiple times in the returned list (except for the promoted target).

Suggested change
reordered = [provider_model]
# Move target to front, preserve relative order of others, and deduplicate
reordered = [provider_model]
seen = {provider_model}
for entry in group:
if entry not in seen:
reordered.append(entry)
seen.add(entry)

Copy link
Contributor

@mirrobot-agent mirrobot-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall Assessment

This PR introduces a powerful cross-provider model fallback system, which is a great addition for increasing the reliability of the client. The separation of the fallback logic into FallbackGroupManager is clean and well-structured.

However, I've identified a critical concurrency issue in how credentials are swapped during the fallback process (see my previous inline comments on lines 2062 and 2221 in client.py). Since self.all_credentials is a shared instance variable, modifying it during a request will interfere with any other concurrent requests being handled by the same RotatingClient instance. This needs to be addressed before the PR can be merged.

Architectural Feedback

  • The tier-aware fallback logic correctly respects provider priorities while allowing cross-provider rotation.
  • Critical Concern: The reliance on globally swapping self.all_credentials must be refactored to use local state/arguments for the retry logic to ensure thread/async safety.

Key Suggestions

  • Refactor _execute_with_retry and _streaming_acompletion_with_retry to accept an explicit credentials list instead of reading from self.all_credentials.
  • Improve the robustness of FallbackGroupManager by deduplicating and normalizing config entries (see suggestion on line 137).
  • Consider if aembedding should also benefit from these fallback groups for consistency.
  • In _streaming_with_fallback, consider if we should try other fallback entries if the first one returns only a [DONE] signal without any actual data.

Questions for the Author

  • Was the exclusion of aembedding from the fallback system intentional?
  • For streaming, do you think we should try other fallback entries if the first one returns only a [DONE] signal without any actual data?

This review was generated by an AI assistant.

- Update `.env.example` with `MODEL_FALLBACK_GROUPS` configuration template and detailed usage comments.
- Add comprehensive section to `DOCUMENTATION.md` explaining the fallback algorithm, tier-first ordering, and target promotion logic.
- Update project and library `README.md` files to highlight the new feature and provide updated `RotatingClient` instantiation examples.
Updates retry mechanisms to accept an optional `credentials_for_provider` argument. This eliminates the need to temporarily mutate `self.all_credentials` during fallback attempts, improving thread safety and code clarity.

Also in this commit:
- fix(core): deduplicate entries when generating fallback groups
Refactor the cross-provider fallback system to use a sequential provider approach instead of global tier-based ordering.

- Iterate through fallback entries sequentially, allowing each provider to complete its internal rotation (Tier 2 then Tier 1) before moving to the next.
- Remove `credentials_for_provider` injection logic, simplifying retry methods and ensuring provider-specific settings are respected.
- Update documentation and examples to reflect the new deterministic behavior.
- Adjust logging levels for better clarity during fallback operations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants