Add get_max_output_tokens for class SpyrePlatform #179

Merged
merged 7 commits into vllm-project:main on Jul 9, 2025

Conversation

gmarinho2
Collaborator

FIX: #148

Currently, the SpyrePlatform class falls back to the default max_tokens set in the OpenAI frontend code. The new method selects the warmup shape that fits the prompt and has the largest shape['new_tokens'], so SpyrePlatform uses the maximum possible number of generated tokens when max_tokens is not set in the request body.
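
For illustration, here is a minimal sketch of the selection logic described above; the warmup_shapes layout, field names, fallback value, and method signature are assumptions for readability, not the actual vllm-spyre code:

# Sketch only: `warmup_shapes`, its fields, and the fallback are assumptions.
class SpyrePlatform:
    # Each warmup shape pairs a maximum prompt length with the number of
    # new tokens the engine was warmed up to generate for that length.
    warmup_shapes = (
        {"prompt_length": 64, "new_tokens": 20, "batch_size": 4},
        {"prompt_length": 128, "new_tokens": 64, "batch_size": 2},
    )

    @classmethod
    def get_max_output_tokens(cls, prompt_len: int) -> int:
        # Keep only the shapes whose prompt slot can hold this prompt,
        # then return the largest generation budget among them.
        fitting = [
            shape["new_tokens"]
            for shape in cls.warmup_shapes
            if prompt_len <= shape["prompt_length"]
        ]
        # Assumed fallback when no warmup shape fits the prompt.
        return max(fitting, default=1)

With the shapes above, a 100-token prompt fits only the 128-token shape, so the default generation budget would be 64 tokens.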


👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure your code passes all the linting checks, otherwise your PR can't be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@wallashss
Collaborator

I think you should set this PR as a draft and wait for feedback from the vLLM community on vllm-project/vllm#18557. This change depends on that one.

@gmarinho2 gmarinho2 marked this pull request as draft May 22, 2025 19:15
Collaborator

@maxdebayser maxdebayser left a comment

LGTM, once the upstream PR gets merged.

@gmarinho2 gmarinho2 changed the title Add maybe_update_max_tokens for class SpyrePlatform Add get_max_output_tokens for class SpyrePlatform Jul 9, 2025
@maxdebayser maxdebayser marked this pull request as ready for review July 9, 2025 16:28
Signed-off-by: Gabriel Marinho <[email protected]>
@maxdebayser maxdebayser merged commit 13492ed into vllm-project:main Jul 9, 2025
18 checks passed
Development

Successfully merging this pull request may close these issues.

Incorrect default max_completion_tokens being set
3 participants