fix: Preserve Bedrock inference profile IDs in health checks (#15947)
* fix: Preserve Bedrock inference profile IDs in health checks
- Fixes issue where health checks were stripping inference profile IDs
- Preserves cross-region inference profile prefixes (us., eu., apac., jp., au., us-gov., global.)
- Strips only AWS region routing while preserving routes and handlers
- Resolves both issue #15807 and inference profile requirement errors
- Adds comprehensive tests for all Bedrock model format combinations
The earlier fix for issue #15807 targeted regional Bedrock model health checks but
was too aggressive: it also stripped the cross-region inference profile prefixes
that AWS requires. This caused errors such as: "Invocation of model ID X with
on-demand throughput isn't supported. Retry your request with the ID or ARN of an
inference profile."
The fix now correctly:
- Strips AWS regions (us-west-2, eu-central-1, etc.) from routing
- Preserves CRIS prefixes (us., eu., etc.) required by AWS
- Preserves routes (converse/, invoke/)
- Preserves handlers (llama/, deepseek_r1/)
- Only affects Bedrock models (checked via startswith)
Test coverage includes 20+ scenarios for all Bedrock model format combinations.
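The rules listed above can be illustrated with a minimal sketch. This is not the code from this commit: `strip_bedrock_region` and `_REGION_RE` are hypothetical names, the region pattern is an assumption about what AWS region names look like, and keeping the leading `bedrock/` provider prefix in the result is an assumption too.

```python
import re

# Assumption: AWS region names look like "us-west-2" or "us-gov-west-1".
# CRIS prefixes such as "us." or "eu." are part of the model ID segment
# (which contains dots), so they never match this pattern.
_REGION_RE = re.compile(r"[a-z]{2}(?:-[a-z]+)+-\d+")


def strip_bedrock_region(model: str) -> str:
    """Hypothetical helper: drop region routing segments from a Bedrock model string."""
    # Only Bedrock models are touched (checked via startswith).
    if not model.startswith("bedrock/"):
        return model

    parts = model.split("/")
    # The last segment is the model ID and is always kept as-is.
    # Routes (converse, invoke) and handlers (llama, deepseek_r1) do not
    # match the region pattern, so they are kept as well.
    middle = [p for p in parts[1:-1] if not _REGION_RE.fullmatch(p)]
    return "/".join([parts[0], *middle, parts[-1]])
```

For example, under these assumptions `bedrock/us-west-2/us.anthropic.claude-3-5-sonnet-20240620-v1:0` loses only the `us-west-2` segment, while the `us.` inference profile prefix on the model ID is left intact.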
* Remove unused traceback import
  - updates the `model` param with the `health_check_model` if it exists Doc: https://docs.litellm.ai/docs/proxy/health#wildcard-routes
  - updates the `voice` param with the `health_check_voice` for `audio_speech` mode if it exists Doc: https://docs.litellm.ai/docs/proxy/health#text-to-speech-models
- - updates the `model` param with the Bedrock base model name if it is a Bedrock model
+ - for Bedrock models with region routing (bedrock/region/model), strips the litellm routing prefix but preserves the model ID