Skip to content

fix: new URL for AWS Bedrock and model list support#4946

Open
are-ces wants to merge 1 commit intollamastack:mainfrom
are-ces:aws-fix
Open

fix: new URL for AWS Bedrock and model list support#4946
are-ces wants to merge 1 commit intollamastack:mainfrom
are-ces:aws-fix

Conversation

@are-ces
Copy link
Contributor

@are-ces are-ces commented Feb 18, 2026

What does this PR do?

The official endpoint for AWS Bedrock has changed according to AWS Documentation

This new endpoint supports the /v1/models endpoint, thus I added the method for listing models too.

Test Plan

Call Responses API with AWS Bedrock; list models

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Feb 18, 2026
@mattf
Copy link
Collaborator

mattf commented Feb 18, 2026

image

Copy link
Collaborator

@mattf mattf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix the tests

@are-ces are-ces changed the title fix: wrong URL for AWS Bedrock fix: new URL for AWS Bedrock and model list support Feb 18, 2026
@are-ces
Copy link
Contributor Author

are-ces commented Feb 18, 2026

image

I am using Responses API in llama-stack, does this mean we need to use both endpoints?

PS: tool calling does not work on the current endpoint

@are-ces
Copy link
Contributor Author

are-ces commented Feb 18, 2026

Perhaps @skamenan7 can join the conversation

@mattf
Copy link
Collaborator

mattf commented Feb 18, 2026

I am using Responses API in llama-stack, does this mean we need to use both endpoints?

stack implements its /v1/responses using a backend provider's /v1/chat//completions. you should not have to change anything.

PS: tool calling does not work on the current endpoint

this is a good issue to file. does switching the endpoint get tool calling to work?

@are-ces
Copy link
Contributor Author

are-ces commented Feb 19, 2026

Tool working does not work, you can check this notebook.
This article also mentions tool calling not supported.

With the mantle endpoint it works.

@mattf
Copy link
Collaborator

mattf commented Feb 19, 2026

@skamenan7 what are the errors when using tools w/ the existing bedrock provider?

@skamenan7
Copy link
Contributor

@skamenan7 what are the errors when using tools w/ the existing bedrock provider?

Hi @mattf , with the current bedrock-runtime endpoint, sending tools in the request causes the server to return 200 OK but then hang indefinitely — no response body, no error, just an open connection until the client times out. There's no actionable error message, it silently accepts the request and never responds. I hit this consistently in local testing (both direct curl and through the stack).

ps: I away few days, hence late reply.

Copy link
Contributor

@skamenan7 skamenan7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @are-ces , looks great except for few comments.

return await super().list_provider_model_ids()
except Exception as e:
logger.debug(f"Failed to list Bedrock models dynamically: {e}")
return []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be wrong on this, but I think returning [] on failure could cause model registry churn? ModelsRoutingTable.refresh() treats the result as authoritative, so on a transient 401/5xx it'd call update_registered_models(..., models=[]) and unregister all bedrock models.
Users'd see intermittent "model not found" until the next successful refresh.

If that's right, letting the exception propagate might be the simplest fix as refresh() already catches exceptions from list_models(). Or
narrowing the catch to only NotFoundError / APIConnectionError would also work, right?

return []
try:
return await super().list_provider_model_ids()
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I noticed that except Exception catches AuthenticationError too, so if someone's credentials are wrong they'd get an empty model list with a debug-level log instead of finding out right away. They wouldn't see the actual auth error until they try a chat completion.

Also the parent list_models() already has a try/except that logs at error level and re-raises, but this catch fires first and prevents that
from kicking in.

Narrowing to APIConnectionError / NotFoundError might be cleaner? That way auth failures still surface. Or bumping to warning at minimum so it's not invisible.

logger.debug(f"Failed to list Bedrock models dynamically: {e}")
return []

async def check_model_availability(self, model: str) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the parent check_model_availability already handle the empty-cache case. The only
extra thing this override seems to do is force cache population when model_store already has the model, isn't it? is there a downstream operation that depends on that?

If not, removing the override might be simpler.

@skamenan7
Copy link
Contributor

nit: the new list_provider_model_ids fallback and check_model_availability override don't have unit test coverage. A couple of quick tests would help here.

@mattf
Copy link
Collaborator

mattf commented Feb 25, 2026

@skamenan7 what are the errors when using tools w/ the existing bedrock provider?

Hi @mattf , with the current bedrock-runtime endpoint, sending tools in the request causes the server to return 200 OK but then hang indefinitely — no response body, no error, just an open connection until the client times out. There's no actionable error message, it silently accepts the request and never responds. I hit this consistently in local testing (both direct curl and through the stack).

ps: I away few days, hence late reply.

that's terrible.

i hope you have a route to file a bug against bedrock.

does moving to the new endpoint resolve this?

@skamenan7
Copy link
Contributor

Hi @mattf, I just tested both endpoints directly with SigV4 auth and tools param, and tool calling works on both now — bedrock-runtime returned finish_reason: "tool_calls" with the correct function call, no hang. Looks like AWS fixed it since my Oct testing. I did not get a chance to test it again since then.

But @are-ces change also has /v1/models support (old endpoint returns empty, mantle lists 30+ models) and a much bigger model catalog (DeepSeek, Mistral, Qwen, Gemma alongside GPT-OSS). The list_provider_model_ids() change makes sense with mantle since /v1/models actually returns data there.

@mattf
Copy link
Collaborator

mattf commented Feb 27, 2026

Hi @mattf, I just tested both endpoints directly with SigV4 auth and tools param, and tool calling works on both now — bedrock-runtime returned finish_reason: "tool_calls" with the correct function call, no hang. Looks like AWS fixed it since my Oct testing. I did not get a chance to test it again since then.

that's good news.

But @are-ces change also has /v1/models support (old endpoint returns empty, mantle lists 30+ models) and a much bigger model catalog (DeepSeek, Mistral, Qwen, Gemma alongside GPT-OSS). The list_provider_model_ids() change makes sense with mantle since /v1/models actually returns data there.

i don't have an environment to test this. will you do the review, verify it works, doesn't introduce any regressions, has a good user experience, and i'll mash merge?

@skamenan7
Copy link
Contributor

Sure, @mattf, I will run through by these criteria and let you know.

@skamenan7
Copy link
Contributor

Ran through this carefully @mattf. The core change is right — mantle endpoint confirmed working for tool calling,
model listing, and streaming.

Two things worth addressing before merge:

  1. list_provider_model_ids catches AuthenticationError in the broad except Exception block and returns
    [] with a debug log. Would be better to re-raise it so users with bad credentials get a clear error
    at model listing time, not a silent empty list.

  2. The check_model_availability override looks redundant — the parent already handles the empty-cache
    case. If there's a reason it needs to be there, a comment explaining the "why" would help.

If @are-ces addresses these I will check again and provide feedback

@are-ces
Copy link
Contributor Author

are-ces commented Mar 2, 2026

Hi! I will address all issues asap 👍

@are-ces
Copy link
Contributor Author

are-ces commented Mar 3, 2026

@skamenan7 I have addressed your point removing the overrides, they were indeed not needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants