
set support static cache for qwen2 #1822

Open
wants to merge 1 commit into base: transformers_4_49

Conversation

skaulintel (Collaborator) commented Mar 5, 2025

In upstream transformers, static cache support for this model is temporarily set to False (https://github.com/huggingface/transformers/blame/752ef3fd4e70869626ec70657a770a85c0ad9219/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L941). We set it to True here so that the following test can run:

python3 /root/optimum-habana/examples/image-to-text/run_pipeline.py --model_name_or_path Qwen/Qwen2-VL-2B-Instruct --batch_size 1 --max_new_tokens 20 --ignore_eos --use_hpu_graphs --bf16 --sdp_on_bf16 --output_dir /tmp/tmp_26vvp09

If not, we get the following ValueError:

File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/generation/utils.py", line 1033, in _prepare_cache_for_generation
    raise ValueError(
ValueError: This model does not support `cache_implementation='static'`. Please check the following issue: https://github.com/huggingface/transformers/issues/28981
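For context, a minimal sketch of the kind of override that avoids this error; the import path and the `_supports_static_cache` attribute name come from upstream transformers, but the exact file and class patched in this PR may differ:

from transformers.models.qwen2_vl.modeling_qwen2_vl import Qwen2VLPreTrainedModel

# Upstream temporarily sets this flag to False; re-enable it so that
# generate(..., cache_implementation="static") is accepted for Qwen2-VL.
Qwen2VLPreTrainedModel._supports_static_cache = True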

skaulintel requested a review from regisss as a code owner on March 5, 2025 at 19:13
malkomes (Contributor) commented Mar 5, 2025

I think you mean to say that you temporarily set it to True ;-)

12010486 (Contributor) commented Mar 7, 2025

Good for me, but could you also add a comment in the file explaining why we are overwriting the variable? A link to https://github.com/huggingface/transformers/blame/66f29aaaf55c8fe0c3dbcd24beede2ca4effac56/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L390C5-L390C27 would also do the job.

Why: I'm afraid we need to keep an eye on it for when we shift the default to eager/torch.compile.

@regisss, besides the comment, LGTM
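A sketch of how the requested in-file comment could look; the placement and wording below are only a suggestion, not the PR author's actual change:

# Upstream transformers temporarily disables static cache support for this model, see:
# https://github.com/huggingface/transformers/blame/66f29aaaf55c8fe0c3dbcd24beede2ca4effac56/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L390C5-L390C27
# We override it to True so cache_implementation="static" works on HPU.
# Revisit this once the default shifts to eager/torch.compile.
Qwen2VLPreTrainedModel._supports_static_cache = True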
