@ylgibby (Contributor) commented Oct 26, 2025

Description

Fixes #15949

In v1.79.0-stable, all Bedrock health checks are failing when using inference profile IDs. This PR fixes the issue while maintaining the original fix for #15807.

Root Cause

PR #15808 tried to fix issue #15807 (regional routing in health checks) by calling get_base_model() to strip region prefixes. That was too aggressive: it also stripped the Cross-Region Inference Profile (CRIS) prefixes that AWS requires, such as us., eu., and apac.
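
To make the failure mode concrete, here is a minimal sketch of an over-aggressive normalizer. It is a hypothetical stand-in, not litellm's get_base_model(); the function name `strip_too_much`, the `REGIONS` set, and the `CRIS_PREFIXES` tuple are all assumptions made for illustration.

```python
# Hypothetical illustration only; this is NOT litellm's get_base_model().
REGIONS = {"us-west-2", "eu-central-1"}   # assumed subset of AWS regions
CRIS_PREFIXES = ("us.", "eu.", "apac.")   # subset of the CRIS prefixes


def strip_too_much(model: str) -> str:
    """Drop region segments AND any CRIS prefix (the problematic behavior)."""
    if model.startswith("bedrock/"):
        model = model[len("bedrock/"):]
    # Removing region segments is the intended part of the normalization.
    parts = [p for p in model.split("/") if p not in REGIONS]
    # Removing the CRIS prefix from the model ID is the unintended part.
    for prefix in CRIS_PREFIXES:
        if parts and parts[-1].startswith(prefix):
            parts[-1] = parts[-1][len(prefix):]
            break
    return "/".join(parts)


# Regional routing is handled as intended ...
print(strip_too_much("bedrock/us-west-2/anthropic.claude-3-5-sonnet-20240620-v1:0"))
# ... but the inference profile ID loses its required "us." prefix.
print(strip_too_much("bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0"))
```

Both calls return the same bare model ID. For the second input that is the wrong result, because the health check then sends an ID without the inference profile prefix.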

Solution

This PR implements a more targeted approach that:

  1. Strips ONLY AWS region identifiers (e.g., us-west-2, eu-central-1)
  2. Preserves CRIS prefixes that AWS requires (e.g., us., eu., apac.)
  3. Preserves route specifications (e.g., converse/, invoke/)
  4. Preserves handler prefixes (e.g., llama/, deepseek_r1/)
  5. Preserves ARN formats for provisioned models

Implementation

Modified _update_litellm_params_for_health_check() in litellm/proxy/health_check.py to:

  • Parse Bedrock model paths segment by segment
  • Drop segments that match an entry in BedrockModelInfo.all_global_regions
  • Keep everything else (CRIS prefixes, routes, handlers, ARNs)
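
The segment-by-segment filter described above can be sketched roughly as follows. This is not the code merged in litellm/proxy/health_check.py: `strip_region_segments` is a hypothetical name, and `ALL_GLOBAL_REGIONS` is a small assumed stand-in for BedrockModelInfo.all_global_regions.

```python
# Assumed stand-in for BedrockModelInfo.all_global_regions (subset only).
ALL_GLOBAL_REGIONS = {"us-east-1", "us-west-2", "eu-central-1", "us-gov-west-1"}

BEDROCK_PREFIX = "bedrock/"


def strip_region_segments(model: str) -> str:
    """Remove AWS region segments from a Bedrock model path.

    Hypothetical helper: only whole path segments that exactly match a
    known region are dropped, so CRIS prefixes (us., eu., ...), routes,
    handlers, and ARNs pass through unchanged.
    """
    if not model.startswith(BEDROCK_PREFIX):
        return model  # non-Bedrock models are left untouched
    segments = model[len(BEDROCK_PREFIX):].split("/")
    kept = [s for s in segments if s not in ALL_GLOBAL_REGIONS]
    return "/".join(kept)


print(strip_region_segments(
    "bedrock/converse/us-west-2/anthropic.claude-3-5-sonnet-20240620-v1:0"
))
```

Matching whole segments rather than substrings is what keeps a CRIS prefix such as us. intact: "us.anthropic.claude-3-5-sonnet-20240620-v1:0" is a single segment and is not itself a region name. The same holds for an ARN, whose region appears inside a colon-delimited segment rather than as a segment of its own.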

Examples

Regional routing (issue #15807) - strips region ✅

  • Input: bedrock/us-west-2/anthropic.claude-3-5-sonnet-20240620-v1:0
  • Output: anthropic.claude-3-5-sonnet-20240620-v1:0

Inference profiles - preserves CRIS prefix ✅

  • Input: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
  • Output: us.anthropic.claude-3-5-sonnet-20240620-v1:0

Complex routing - preserves route + strips region ✅

  • Input: bedrock/converse/us-west-2/anthropic.claude-3-5-sonnet-20240620-v1:0
  • Output: converse/anthropic.claude-3-5-sonnet-20240620-v1:0

Test Coverage

Added comprehensive tests covering:

  • All 7 CRIS prefixes: us., eu., apac., jp., au., us-gov., global.
  • Regional routing + CRIS combinations
  • GovCloud regions
  • Handler prefixes (llama/, deepseek_r1/)
  • Route specifications (converse/, invoke/)
  • ARN formats (provisioned, application inference profiles, imported models)
  • Edge cases (route + region + CRIS)
  • Non-Bedrock models (no side effects)
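
A table-driven check along these lines could cover the categories listed above. It is a sketch, not the test file added by this PR: `normalize` is a hypothetical helper that mirrors the described behavior, `REGIONS` is an assumed subset of the real region list, and the ARN, account number, and GovCloud model ID are placeholders.

```python
# Assumed subset of AWS regions; the real list lives in litellm.
REGIONS = {"us-east-1", "us-west-2", "eu-central-1", "us-gov-west-1"}


def normalize(model: str) -> str:
    """Hypothetical helper: drop region segments from Bedrock paths only."""
    if not model.startswith("bedrock/"):
        return model
    segments = model[len("bedrock/"):].split("/")
    return "/".join(s for s in segments if s not in REGIONS)


MODEL = "anthropic.claude-3-5-sonnet-20240620-v1:0"
ARN = "arn:aws:bedrock:us-east-1:123456789012:imported-model/abc123"  # placeholder

# (input, expected output) pairs, one per category.
CASES = [
    (f"bedrock/us-west-2/{MODEL}", MODEL),                              # regional routing
    (f"bedrock/us.{MODEL}", f"us.{MODEL}"),                             # CRIS prefix kept
    (f"bedrock/converse/us-west-2/{MODEL}", f"converse/{MODEL}"),       # route + region
    (f"bedrock/converse/us-west-2/us.{MODEL}", f"converse/us.{MODEL}"),  # route + region + CRIS
    (f"bedrock/us-gov-west-1/us-gov.{MODEL}", f"us-gov.{MODEL}"),       # GovCloud
    (f"bedrock/llama/{ARN}", f"llama/{ARN}"),                           # handler + ARN
    ("openai/gpt-4o", "openai/gpt-4o"),                                 # non-Bedrock
]


def failing_cases():
    """Return every (input, expected, actual) triple that does not match."""
    return [(i, e, normalize(i)) for i, e in CASES if normalize(i) != e]


print(failing_cases())
```

Keeping the cases in one table makes it cheap to add a new prefix or route later without writing another test function.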

Related Issues

- Fixes issue where health checks were stripping inference profile IDs
- Preserves cross-region inference profile prefixes (us., eu., apac., jp., au., us-gov., global.)
- Strips only AWS region routing while preserving routes and handlers
- Resolves both issue BerriAI#15807 and inference profile requirement errors
- Adds comprehensive tests for all Bedrock model format combinations

PR BerriAI#15808 attempted to fix regional Bedrock model health checks (issue
BerriAI#15807) but was too aggressive, stripping the cross-region inference
profile prefixes that AWS requires. This caused errors such as: "Invocation of
model ID X with on-demand throughput isn't supported. Retry your request with
the ID or ARN of an inference profile."

The fix now correctly:
- Strips AWS regions (us-west-2, eu-central-1, etc.) from routing
- Preserves CRIS prefixes (us., eu., etc.) required by AWS
- Preserves routes (converse/, invoke/)
- Preserves handlers (llama/, deepseek_r1/)
- Only affects Bedrock models (checked via startswith)

Test coverage includes 20+ scenarios for all Bedrock model format combinations.

@ylgibby force-pushed the fix/bedrock-health-check-inference-profiles branch from ea30f23 to 1f88d71 on October 26, 2025 22:13
@krrishdholakia merged commit 2bef7c3 into BerriAI:main on Oct 28, 2025
3 of 6 checks passed

Development

Successfully merging this pull request may close these issues.

Bedrock health checks fail with inference profile IDs in v1.79.0-stable
