
Fix prompt truncation logic #838

Merged
DonggeLiu merged 6 commits into main from fix-prompt-truncation on Mar 6, 2025

Conversation

DonggeLiu
Collaborator

Given an overlong prompt, we want to truncate it to:

<prompt short initial text>
...(truncated due to exceeding input token limit)...
<prompt long ending text>

Here, ...(truncated due to exceeding input token limit)... replaces enough of the prompt text that the final prompt fits within the token limit.
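
For reference, a minimal sketch of this middle-truncation idea (hypothetical helper names and character counts, not the exact code in this PR):

TRUNCATION_MARKER = '\n...(truncated due to exceeding input token limit)...\n'

def truncate_middle(text, estimate_token_num, allowed_tokens, head_chars=500):
  """Keeps a short head and a long tail of the prompt, dropping text in
  between until the estimated token count fits within allowed_tokens."""
  if estimate_token_num(text) <= allowed_tokens:
    return text
  head, tail = text[:head_chars], text[head_chars:]
  step = max(len(tail) // 10, 1)
  # Repeatedly drop the oldest part of the tail (the prompt's middle) and re-check.
  while tail and estimate_token_num(head + TRUNCATION_MARKER + tail) > allowed_tokens:
    tail = tail[step:]
  return head + TRUNCATION_MARKER + tail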

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

@DonggeLiu
Collaborator Author

DonggeLiu commented Mar 5, 2025

Truncation looking good now:
https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2025-03-05-838-dg-comparison/sample/output-xs-fxloadmodulesrejected/04.html

I reckon we can be even more aggressive about the amount of text we truncate.
Most of the text is not very useful when it is that long, and minimizing it can probably improve performance.

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg1 -ag

@DonggeLiu
Collaborator Author

/gcbrun skip

total_tokens = self.estimate_token_num(raw_prompt_text)

# Allow buffer space for potential prompts that will be appended later.
allowed_tokens = self.MAX_INPUT_TOKEN // 10 - extra_tokens

Collaborator

Why // 10? Can you add a comment to explain?

DonggeLiu
Collaborator Author

Done.
A bit more context:
This is mainly used when sending the stdout/stderr of the agent's bash commands or compilation requests to the LLM.
Empirically, each LLM response contains up to 10 such commands/requests. We allocate at most 1/10 of MAX_INPUT_TOKEN to each item to ensure a balanced token distribution.
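
To illustrate the split described above, a hedged sketch (the MAX_INPUT_TOKEN value here is a placeholder; the real limit is model-specific):

MAX_INPUT_TOKEN = 1_000_000       # placeholder; the actual limit depends on the model
MAX_ITEMS_PER_RESPONSE = 10       # empirical upper bound on commands/requests per LLM response

def allowed_tokens_for_item(extra_tokens):
  """Token budget for one stdout/stderr item, leaving room for prompts appended later."""
  return MAX_INPUT_TOKEN // MAX_ITEMS_PER_RESPONSE - extra_tokens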

@DonggeLiu
Collaborator Author

/gcbrun skip

@DonggeLiu DonggeLiu merged commit b53d7c7 into main Mar 6, 2025
6 checks passed
@DonggeLiu DonggeLiu deleted the fix-prompt-truncation branch March 6, 2025 00:41