
added support for MAX_INPUT_TOKENS #61


Closed
wants to merge 1 commit

Conversation

VishalYadavCF

Issue #59


@gemini-code-assist (bot) left a comment


Hello @VishalYadavCF, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses issue #59 by adding support for MAX_INPUT_TOKENS. The primary change involves adjusting the MAX_INPUT_TOKENS variable in api/rag.py to a lower value (3400) and passing this value to the embedder's model_kwargs during initialization. This ensures that the configured token limit is respected when using the embedding model.

Highlights

  • Token Limit Adjustment: The MAX_INPUT_TOKENS variable in api/rag.py has been adjusted from 7500 to 3400 to align with the embedding model's 4096 token limit.
  • Embedder Configuration: The MAX_INPUT_TOKENS value is now passed to the embedder's model_kwargs during initialization, ensuring the model respects the configured token limit.

Changelog

  • api/rag.py
    • Adjusted MAX_INPUT_TOKENS from 7500 to 3400.
    • Passed MAX_INPUT_TOKENS to the embedder's model_kwargs (see the sketch below).
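
For illustration, the two changes together might look roughly like the following in api/rag.py. This is a minimal sketch, not the exact diff; the Embedder name, the model_client key, and the embedder_config structure are assumptions standing in for the project's real objects.

# api/rag.py (sketch)

# Maximum token limit for embedding models; kept well below the
# embedding model's 4096-token context limit.
MAX_INPUT_TOKENS = 3400

# ... later, when the embedder is built (Embedder and embedder_config are
# assumed names, not copied verbatim from the repository):
embedder = Embedder(
    model_client=embedder_config["model_client"](),
    model_kwargs={
        **embedder_config["model_kwargs"],
        "max_input_tokens": MAX_INPUT_TOKENS,  # respect the configured limit
    },
)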
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature               Command               Description
Code Review           /gemini review        Performs a code review for the current pull request in its current state.
Pull Request Summary  /gemini summary       Provides a summary of the current pull request in its current state.
Comment               @gemini-code-assist   Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                  /gemini help          Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A token's count,
Must stay within the bound,
Or errors arise.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist (bot) left a comment


Code Review

The pull request introduces support for MAX_INPUT_TOKENS to limit the number of tokens used by the embedding model. This is a good change as it allows for better control over resource usage and can prevent errors due to exceeding token limits. The changes are straightforward and seem well-implemented.

Merge Readiness

The code changes are well-implemented and address the issue of token limits for embedding models. I don't see any critical or high severity issues. I am unable to approve the pull request, and recommend that others review and approve this code before merging.

@@ -45,7 +45,7 @@ def append_dialog_turn(self, dialog_turn):
 logger = logging.getLogger(__name__)

 # Maximum token limit for embedding models
-MAX_INPUT_TOKENS = 7500  # Safe threshold below 8192 token limit
+MAX_INPUT_TOKENS = 3400  # Adjusted for embedding model with 4096 token limit


Severity: medium

Consider adding a comment explaining why the token limit was adjusted to 3400. This will help future developers understand the reasoning behind this specific value.


Suggested change
-MAX_INPUT_TOKENS = 3400  # Adjusted for embedding model with 4096 token limit
+MAX_INPUT_TOKENS = 3400  # Adjusted for embedding model with 4096 token limit. Testing indicates best performance at this level.

Comment on lines +235 to +238

model_kwargs={
    **embedder_config["model_kwargs"],
    "max_input_tokens": MAX_INPUT_TOKENS,  # Use the configured token limit
},


Severity: medium

It's good that you're passing MAX_INPUT_TOKENS to the embedder. Consider adding a brief comment explaining why this is important, such as preventing the input from exceeding the model's token limit.


Suggested change
-model_kwargs={
-    **embedder_config["model_kwargs"],
-    "max_input_tokens": MAX_INPUT_TOKENS,  # Use the configured token limit
-},
+model_kwargs={
+    **embedder_config["model_kwargs"],
+    "max_input_tokens": MAX_INPUT_TOKENS,  # Prevents exceeding model's token limit
+},
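
For context on why the cap matters: if a chunk of text exceeds the embedding model's context window, the embedding call fails outright. The following standalone sketch shows one way to pre-truncate text to the limit; it uses tiktoken and a hypothetical helper name, neither of which is taken from this repository.

import tiktoken

MAX_INPUT_TOKENS = 3400  # value chosen in this PR

def truncate_to_token_limit(text: str, max_tokens: int = MAX_INPUT_TOKENS) -> str:
    """Return text cut down to at most max_tokens tokens (cl100k_base encoding assumed)."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return enc.decode(tokens[:max_tokens])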

@sng-asyncfunc (Contributor)

Should we make this dynamically defined from .env? I removed it initially because I was seeing weird results, especially when switching between different models.
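
If the limit were read from .env instead of being hard-coded, a minimal sketch could look like the following; the variable name MAX_EMBEDDER_INPUT_TOKENS and the 3400 fallback are hypothetical, not existing project settings.

import os

DEFAULT_MAX_INPUT_TOKENS = 3400  # hypothetical fallback when the env var is unset or invalid

def get_max_input_tokens() -> int:
    raw = os.environ.get("MAX_EMBEDDER_INPUT_TOKENS", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_INPUT_TOKENS
    return value if value > 0 else DEFAULT_MAX_INPUT_TOKENS

MAX_INPUT_TOKENS = get_max_input_tokens()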

@lujmo11 commented May 21, 2025

Shouldn't the max tokens be specified per provider/model, since they can differ per model? So maybe they should be set in api/config/embedder.json as model-specific model_kwargs, and then rag.py would read them in?
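
A sketch of that per-model approach, assuming api/config/embedder.json carried a max_input_tokens entry inside model_kwargs (the key names and file layout here are guesses, not the repository's actual schema):

import json

# Expected shape (hypothetical):
# { "embedder": { "model_kwargs": { "model": "...", "max_input_tokens": 3400 } } }
with open("api/config/embedder.json") as f:
    embedder_config = json.load(f)

model_kwargs = embedder_config["embedder"]["model_kwargs"]
MAX_INPUT_TOKENS = model_kwargs.get("max_input_tokens", 3400)  # per-model value, with a fallback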
