
VLM generate: tests can't generate image/video tokens #33623

Merged

@gante merged 2 commits into huggingface:main from flaky_vlm_tests on Sep 20, 2024

Conversation

@gante (Member) commented on Sep 20, 2024

What does this PR do?

Our VLM generate mixin tests, introduced in #33533, are flaky. Because they use randomly initialized models, nothing prevents the models from generating image/video tokens, which a) shouldn't happen and b) crashes the forward pass.

This PR ensures that our generation tests don't generate those tokens.
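For illustration, a minimal sketch of the approach (an assumption about the shape of the fix, not the exact diff): pass the special image/video placeholder ids to `generate()` as bad words so they can never be sampled. The attribute names `image_token_index`/`video_token_index` vary across VLM configs.

```python
# Sketch only, assuming the config exposes `image_token_index` / `video_token_index`
# (attribute names differ between VLMs) and that `model` / `inputs` come from the test.
bad_words_ids = []
text_vocab_size = config.get_text_config().vocab_size
for token_id in (getattr(config, "image_token_index", None),
                 getattr(config, "video_token_index", None)):
    # Only suppress ids that actually exist in the LM head's vocabulary -- see
    # the review discussion below for why this guard matters.
    if token_id is not None and token_id < text_vocab_size:
        bad_words_ids.append([token_id])

output = model.generate(**inputs, do_sample=True,
                        bad_words_ids=bad_words_ids or None)
```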

Commands run to confirm the issue is fixed:
✅ (the test is no longer flaky) `py.test tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationModelTest::test_sample_generate_dict_output --flake-finder --flake-runs=1000`
✅ (we can run generation tests on all models after these changes) `py.test tests/models/ -k test_sample_generate_dict_output`

cc @zucchini-nlp -- when you're back from holidays, please review even though it is already merged, in case you want to change anything 🤗

@gante requested a review from @ydshieh on September 20, 2024 at 14:00
@gante changed the title from "VLM generate: curb flaky tests" to "VLM generate: tests can't generate image/video tokens" on Sep 20, 2024
@ydshieh (Collaborator) left a comment

Works for me, although I think we probably don't need this condition:

`image_token_index < config.get_text_config().vocab_size`

i.e. we could just always add those two tokens to the bad words list.

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante (Member, Author) commented on Sep 20, 2024

@ydshieh we do need it, llava_onevision fails without it :o (the token in question lies outside the vocab size in the test settings)
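For context, my reading of why the guard is needed (an illustration, not code from the PR): the per-step scores only cover the text vocabulary, so biasing an out-of-range token id would index past the end of the logits tensor.

```python
import torch

# Hypothetical numbers mirroring the test settings described above:
text_vocab_size = 99         # tiny vocab used by randomly initialized test models
image_token_index = 32000    # placeholder id that falls outside that vocab

scores = torch.randn(1, text_vocab_size)  # logits produced at one decoding step
# Suppressing the token unconditionally would touch an index the tensor lacks:
#   scores[0, image_token_index] = float("-inf")   # IndexError
# hence the `image_token_index < config.get_text_config().vocab_size` guard.
```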

@gante merged commit 2fdb5e7 into huggingface:main on Sep 20, 2024 — 19 checks passed
@gante deleted the flaky_vlm_tests branch on September 20, 2024 at 14:43
itazap pushed a commit to NielsRogge/transformers that referenced this pull request Sep 20, 2024