feature: AWS Bedrock Initial Commit #99
Conversation
Hello @sng-asyncfunc, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces initial support for AWS Bedrock as a new LLM provider within the DeepWiki application. This includes adding a new `BedrockClient` class to handle interactions with the AWS Bedrock API, updating configuration files to include Bedrock as a supported provider, and modifying the chat completion logic to utilize Bedrock models. The changes also incorporate the necessary environment variables for AWS credentials and update dependencies to include the `boto3` library.
Highlights
- AWS Bedrock Integration: Adds support for AWS Bedrock models, allowing DeepWiki to leverage foundation models available through the Bedrock service.
- New `BedrockClient` Class: Introduces a `BedrockClient` class in `api/bedrock_client.py` to handle authentication, API calls, and response formatting for AWS Bedrock.
- Configuration Updates: Modifies `api/config.py` and `api/config/generator.json` to include AWS Bedrock as a supported provider, along with its associated models and client configurations.
- Chat Completion Logic: Updates `api/simple_chat.py` to incorporate Bedrock into the chat completion workflow, including handling credentials and making API calls.
- Dependency Update: Adds `boto3` as a required dependency in `api/requirements.txt` and `pyproject.toml` to enable interaction with AWS services.
Changelog
- api/README.md
  - Added AWS Bedrock configuration details, including required environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_ROLE_ARN).
  - Updated instructions to include obtaining AWS credentials from the AWS IAM Console.
  - Added AWS Bedrock to the list of supported LLM providers and their required API keys.
- api/bedrock_client.py
  - Created a new `BedrockClient` class to handle interactions with the AWS Bedrock API.
  - Implemented methods for initializing synchronous and asynchronous clients, formatting prompts for different providers, and extracting response text.
  - Added error handling and retry logic using the `backoff` library.
  - Implemented `call` and `acall` methods for making synchronous and asynchronous API calls, respectively.
  - Implemented `convert_inputs_to_api_kwargs` to convert inputs to API kwargs for AWS Bedrock.
- api/config.py
  - Imported the `BedrockClient` class.
  - Added environment variables for AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_ROLE_ARN).
  - Added `BedrockClient` to the `CLIENT_CLASSES` dictionary.
  - Updated `load_generator_config` to include Bedrock as a supported provider and map it to the `BedrockClient` class.
- api/config/generator.json
  - Added a new `bedrock` provider configuration, including its client class, default model, and supported models.
- api/requirements.txt
  - Added `boto3>=1.34.0` as a required dependency.
- api/simple_chat.py
  - Imported the `BedrockClient` class.
  - Added `bedrock` to the list of supported providers in the `ChatCompletionRequest` model.
  - Updated the `chat_completions_stream` function to handle Bedrock requests, including checking for AWS credentials and initializing the `BedrockClient`.
  - Added error handling for AWS Bedrock API calls.
  - Added fallback mechanism for AWS Bedrock API calls.
- pyproject.toml
  - Added `boto3>=1.34.0` as a required dependency.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
Feature | Command | Description
---|---|---
Code Review | `/gemini review` | Performs a code review for the current pull request in its current state.
Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state.
Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments.
Help | `/gemini help` | Displays a list of available commands.
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
In clouds of AWS,
Bedrock models now arise,
DeepWiki expands.
Code Review
This pull request introduces initial support for AWS Bedrock as a new LLM provider. The changes include adding a `BedrockClient`, updating configuration files, and integrating Bedrock into the chat completion flow. The implementation covers basic functionality, including authentication (keys and IAM roles) and making calls to various Bedrock models.
Overall, this is a good first step for Bedrock integration. The code is generally well-structured. The main areas for improvement revolve around asynchronous operations, streaming capabilities (which are currently not implemented for Bedrock), and some hardcoded values that could be made more configurable.
Summary of Findings
- Lack of Asynchronous and Streaming Support for Bedrock: The `BedrockClient`'s `acall` method currently calls the synchronous `call` method, and it uses `invoke_model`, which doesn't support streaming. This means Bedrock interactions are blocking and don't stream responses, unlike potentially other providers. This is a key feature parity gap.
- Potentially Insufficient Backoff `max_time`: The backoff strategy for Bedrock API calls uses `max_time=5` seconds. This might be too short for retrying calls to a cloud service, potentially leading to premature failure if transient network issues last longer than 5 seconds.
- Hardcoded Configuration Values: Several values like default model IDs and `max_tokens` are hardcoded within `api/bedrock_client.py`. These should ideally be configurable, perhaps sourced from `generator.json` or passed dynamically.
- Broad Exception Handling: Some methods use broad `except Exception:` clauses, which can obscure the specific type of error occurring. Catching more specific exceptions (e.g., from `botocore.exceptions`) would improve error diagnostics.
Merge Readiness
This pull request lays a solid foundation for AWS Bedrock integration. However, due to the high-severity issues identified, particularly the lack of true asynchronous/streaming support for Bedrock and the potentially insufficient backoff `max_time`, I recommend that these changes not be merged until these points are addressed. Addressing these will significantly improve the robustness and usability of the Bedrock client. As a reviewer, I am not authorized to approve pull requests, so please ensure further review and approval from authorized team members after addressing the feedback.
```python
@backoff.on_exception(
    backoff.expo,
    (botocore.exceptions.ClientError, botocore.exceptions.BotoCoreError),
    max_time=5,
)
```
The `max_time=5` (seconds) for the backoff strategy might be too short for API calls to a cloud service like AWS Bedrock. Transient network issues or brief service slowdowns could exceed this 5-second window, leading to premature failure of the request.
Would it be more robust to increase `max_time` (e.g., to 30 or 60 seconds) or use `max_tries` to ensure a certain number of attempts?
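For illustration, a more forgiving retry policy might combine a longer window with a bounded attempt count (the exact values here are suggestions, not part of the PR):

```python
import backoff
import botocore.exceptions

# Retry with exponential backoff for up to 30 seconds or 5 attempts,
# whichever limit is hit first; both bounds are illustrative defaults.
@backoff.on_exception(
    backoff.expo,
    (botocore.exceptions.ClientError, botocore.exceptions.BotoCoreError),
    max_time=30,
    max_tries=5,
)
def invoke_with_retries(client, model_id: str, body: str):
    return client.invoke_model(modelId=model_id, body=body)
```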
```python
async def acall(self, api_kwargs: Dict = None, model_type: ModelType = None) -> Any:
    """Make an asynchronous call to the AWS Bedrock API."""
    # For now, just call the sync method
    # In a real implementation, you would use an async library or run the sync method in a thread pool
    return self.call(api_kwargs, model_type)
```
The acall
method currently delegates to the synchronous self.call(api_kwargs, model_type)
. This means that calls intended to be asynchronous will block. Additionally, the Bedrock API call itself (self.sync_client.invoke_model
) is synchronous and does not support streaming. For streaming, invoke_model_with_response_stream
would be required.
This is a significant limitation if streaming and non-blocking I/O are expected for Bedrock, especially for parity with other providers. What are the plans for implementing true asynchronous operations and streaming for Bedrock? This should be a high-priority follow-up.
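As an interim step toward non-blocking behavior, the synchronous call could at least be moved off the event loop; a minimal sketch of a replacement method body, assuming Python 3.9+ for `asyncio.to_thread`:

```python
import asyncio
from typing import Any, Dict

async def acall(self, api_kwargs: Dict = None, model_type: Any = None) -> Any:
    """Run the blocking Bedrock call in a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(self.call, api_kwargs, model_type)
```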
```python
def init_async_client(self):
    """Initialize the asynchronous AWS Bedrock client.

    Note: boto3 doesn't have native async support, so we'll use the sync client
    in async methods and handle async behavior at a higher level.
    """
    # For now, just return the sync client
    return self.sync_client
```
The `init_async_client` method currently returns the synchronous client, and the `acall` method (line 286) calls the synchronous `call` method. This means operations with Bedrock are not truly asynchronous.
Is there a plan to implement true asynchronous support, perhaps using `aiobotocore` or running synchronous calls in a thread pool executor? If not, the current behavior should be clearly documented as a known limitation, especially if other clients offer true async capabilities.
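If `aiobotocore` were adopted, the client setup might look roughly like this (an untested sketch; the region default and credential wiring are placeholders):

```python
from aiobotocore.session import get_session

async def ainvoke_model(model_id: str, body: str, region: str = "us-east-1") -> bytes:
    """Invoke a Bedrock model without blocking the event loop."""
    session = get_session()
    async with session.create_client("bedrock-runtime", region_name=region) as client:
        response = await client.invoke_model(modelId=model_id, body=body)
        # The response body is a stream and must also be read asynchronously
        return await response["body"].read()
```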
role = "user" if msg.get("role") == "user" else "assistant" | ||
formatted_messages.append({ | ||
"role": role, | ||
"content": [{"type": "text", "text": msg.get("content", "")}] | ||
}) |
This logic, `role = "user" if msg.get("role") == "user" else "assistant"`, assumes that if a role is present and not 'user', it should be 'assistant'. What happens if `msg` doesn't have a 'role' key, or if the role is something unexpected (e.g., 'system' for some models, though Bedrock Claude uses 'user' and 'assistant')?
Consider adding more robust handling or validation for message roles to prevent unexpected behavior, or clarify the expected message structure in the docstring.
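One possible shape for that validation, sketched with illustrative choices (the logger and the decision to default unknown roles to 'user' are not the PR's current behavior):

```python
import logging

log = logging.getLogger(__name__)

VALID_ROLES = {"user", "assistant"}

def normalize_role(msg: dict) -> str:
    """Map a message's role onto the roles Bedrock Claude accepts, warning on surprises."""
    role = msg.get("role")
    if role not in VALID_ROLES:
        # Missing roles and roles like 'system' are surfaced rather than silently remapped
        log.warning(f"Unexpected message role {role!r}; defaulting to 'user'")
        return "user"
    return role
```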
```python
return {
    "anthropic_version": "bedrock-2023-05-31",
    "messages": formatted_messages,
    "max_tokens": 4096
}
```
The `max_tokens` parameter is hardcoded here (and in similar blocks for other providers, e.g., lines 156, 163, 173, 181). Different models have different maximum token limits, and this value might also be something users want to configure.
Could this be made more dynamic, perhaps by fetching it from model-specific configurations (e.g., `generator.json`) or allowing it to be passed via `model_kwargs` and then into `api_kwargs`?
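A minimal sketch of the `model_kwargs` route; `DEFAULT_MAX_TOKENS` and `build_anthropic_body` are illustrative names, not part of the PR:

```python
DEFAULT_MAX_TOKENS = 4096  # fallback only; real limits vary per model

def build_anthropic_body(formatted_messages: list, model_kwargs: dict = None) -> dict:
    """Build the Anthropic-on-Bedrock request body, honoring a caller-supplied max_tokens."""
    model_kwargs = model_kwargs or {}
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "messages": formatted_messages,
        "max_tokens": model_kwargs.get("max_tokens", DEFAULT_MAX_TOKENS),
    }
```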
```python
else:
    # Default format
    return {"prompt": prompt}
```
The default fallback format `{"prompt": prompt}` might not be suitable for all Bedrock models if a provider isn't explicitly handled. Could this lead to errors if a new, unhandled provider type is encountered?
Perhaps raising a `NotImplementedError` or `ValueError` for unknown providers, or logging a more prominent warning, would be safer than attempting a generic format.
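A fail-fast variant of that fallback might read as follows (a sketch; the `format_prompt` wrapper is hypothetical):

```python
def format_prompt(provider: str, prompt: str) -> dict:
    """Format a prompt for a known Bedrock provider, failing loudly otherwise."""
    if provider == "anthropic":
        return {
            "anthropic_version": "bedrock-2023-05-31",
            "messages": [{"role": "user", "content": [{"type": "text", "text": prompt}]}],
            "max_tokens": 4096,
        }
    # Refuse to guess at an unknown provider's request schema
    raise ValueError(f"Unsupported Bedrock provider: {provider!r}")
```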
```python
else:
    # Try to extract text from the response
    if isinstance(response, dict):
        for key in ["text", "content", "output", "completion"]:
            if key in response:
                return response[key]
    return str(response)
```
The fallback `else: return str(response)` for extracting text could be problematic if the response is a complex object and not easily convertible to a meaningful string.
If the response structure for an unhandled provider or an error state is unknown, this might return unhelpful output. Consider logging the full response object here for debugging, and perhaps returning a more specific error message or `None` if text extraction fails definitively.
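Sketched out, that safer fallback might look like this (assumes a module-level logger; returning `None` is one possible contract, not the PR's current one):

```python
import logging
from typing import Optional

log = logging.getLogger(__name__)

def extract_response_text(response: dict) -> Optional[str]:
    """Pull generated text from a Bedrock response, or None if nothing recognizable is found."""
    if isinstance(response, dict):
        for key in ("text", "content", "output", "completion"):
            if key in response:
                return response[key]
    # Log the raw payload so unhandled response shapes can be diagnosed later
    log.error("Could not extract text from Bedrock response: %r", response)
    return None
```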
```python
    return error_msg

if model_type == ModelType.LLM:
    model_id = api_kwargs.get("model", "anthropic.claude-3-sonnet-20240229-v1:0")
```
The default `model_id` is hardcoded here: `"anthropic.claude-3-sonnet-20240229-v1:0"`. This is also repeated in `convert_inputs_to_api_kwargs` (line 300) and matches the default in `generator.json`.
While consistent, hardcoding it in the client makes it less flexible. Could this default be sourced from a class attribute, a configuration constant, or be more tightly coupled with the `generator.json` default to avoid potential drift if the config changes?
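The class-attribute option, sketched (`DEFAULT_MODEL_ID` and `resolve_model_id` are illustrative names; loading the value from `generator.json` at startup would give the tighter coupling mentioned above):

```python
class BedrockClient:
    # Single source of truth for the fallback model; ideally loaded from
    # generator.json at startup rather than duplicated as a string literal.
    DEFAULT_MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

    def resolve_model_id(self, api_kwargs: dict) -> str:
        """Return the requested model, falling back to the single shared default."""
        return (api_kwargs or {}).get("model", self.DEFAULT_MODEL_ID)
```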
```python
try:
    # Make the API call
    response = self.sync_client.invoke_model(
        modelId=model_id,
        body=body
    )

    # Parse the response
    response_body = json.loads(response["body"].read())

    # Extract the generated text
    generated_text = self._extract_response_text(provider, response_body)

    return generated_text

except Exception as e:
    log.error(f"Error calling AWS Bedrock API: {str(e)}")
    return f"Error: {str(e)}"
```
The `try...except Exception as e:` block is quite broad. While it ensures an error is caught, it might hide the specific nature of issues (e.g., `botocore.exceptions.ClientError` for AWS-specific issues vs. a `json.JSONDecodeError`).
Consider catching more specific exceptions first (like `botocore.exceptions.BotoCoreError` and `botocore.exceptions.ClientError`) for more targeted error logging or handling, and then a general `Exception` as a final fallback. This would improve debuggability.
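A sketch of that layered handling (the `invoke` wrapper and logger are illustrative, not the PR's code):

```python
import json
import logging

import botocore.exceptions

log = logging.getLogger(__name__)

def invoke(client, model_id: str, body: str) -> dict:
    """Call invoke_model with targeted error handling ahead of any generic fallback."""
    try:
        response = client.invoke_model(modelId=model_id, body=body)
        return json.loads(response["body"].read())
    except botocore.exceptions.ClientError as e:
        # AWS-side failures carry a structured error code worth logging separately
        code = e.response.get("Error", {}).get("Code", "Unknown")
        log.error(f"Bedrock ClientError ({code}): {e}")
        raise
    except botocore.exceptions.BotoCoreError as e:
        log.error(f"botocore error calling Bedrock: {e}")
        raise
    except json.JSONDecodeError as e:
        log.error(f"Could not decode Bedrock response body: {e}")
        raise
```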
```python
response = await model.acall(api_kwargs=api_kwargs, model_type=ModelType.LLM)
# Handle response from Bedrock (not streaming yet)
if isinstance(response, str):
    yield response
else:
    # Try to extract text from the response
    yield str(response)
```
The comment on line 558, `Handle response from Bedrock (not streaming yet)`, is accurate. Given that Bedrock supports streaming responses via `invoke_model_with_response_stream`, is there a plan to implement this?
Currently, the entire Bedrock response will be received before any part of it is yielded, which could lead to higher perceived latency for the user compared to other streaming providers.
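For reference, a minimal streaming sketch using boto3's `invoke_model_with_response_stream`; the chunk parsing assumes Anthropic's messages-API event format and would differ for other model families:

```python
import json

def stream_bedrock(client, model_id: str, body: str):
    """Yield text deltas from a Bedrock model as they arrive."""
    response = client.invoke_model_with_response_stream(modelId=model_id, body=body)
    for event in response["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        # Anthropic streams text inside content_block_delta events
        if chunk.get("type") == "content_block_delta":
            yield chunk["delta"].get("text", "")
```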
@sng-asyncfunc
Looking for Bedrock users to test it out first; it hasn't been tested, and I have not received any feedback yet.
Okay, I am a user. I have just tried it; so far so good.