
VLM generate: tests can't generate image/video tokens #33623

Merged

@gante merged 2 commits into huggingface:main from flaky_vlm_tests on Sep 20, 2024

Conversation

@gante (Member) commented on Sep 20, 2024

What does this PR do?

Our VLM generate mixin tests, introduced in #33533, are flaky. Because they use randomly initialized models, nothing prevents the models from generating image/video tokens, which a) shouldn't happen and b) crashes the forward pass.

This PR ensures that our generation tests don't generate those tokens.
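For illustration, a minimal sketch of the approach (an assumption about the shape of the fix, not the exact diff): pass the special image/video placeholder ids to `generate()` as bad words so they can never be sampled. The attribute names `image_token_index`/`video_token_index` vary across VLM configs.

```python
# Sketch only, assuming the config exposes `image_token_index` / `video_token_index`
# (attribute names differ between VLMs) and that `model` / `inputs` come from the test.
bad_words_ids = []
text_vocab_size = config.get_text_config().vocab_size
for token_id in (getattr(config, "image_token_index", None),
                 getattr(config, "video_token_index", None)):
    # Only suppress ids that actually exist in the LM head's vocabulary -- see
    # the review discussion below for why this guard matters.
    if token_id is not None and token_id < text_vocab_size:
        bad_words_ids.append([token_id])

output = model.generate(**inputs, do_sample=True,
                        bad_words_ids=bad_words_ids or None)
```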

Commands run to confirm the issue is fixed:
✅ (the test is no longer flaky) `py.test tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationModelTest::test_sample_generate_dict_output --flake-finder --flake-runs=1000`
✅ (we can run generation tests on all models after these changes) `py.test tests/models/ -k test_sample_generate_dict_output`

cc @zucchini-nlp -- when you're back from holidays, please review even though it is already merged, in case you want to change anything 🤗

@gante requested a review from @ydshieh on September 20, 2024 at 14:00
@gante changed the title from "VLM generate: curb flaky tests" to "VLM generate: tests can't generate image/video tokens" on Sep 20, 2024
@ydshieh (Collaborator) left a comment

Works for me, although I think we probably don't need this condition:

`image_token_index < config.get_text_config().vocab_size`

i.e. we could just always add those two tokens to the bad words list.

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante (Member, Author) commented on Sep 20, 2024

@ydshieh we do need it, llava_onevision fails without it :o (the token in question lies outside the vocab size in the test settings)
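For context, my reading of why the guard is needed (an illustration, not code from the PR): the per-step scores only cover the text vocabulary, so biasing an out-of-range token id would index past the end of the logits tensor.

```python
import torch

# Hypothetical numbers mirroring the test settings described above:
text_vocab_size = 99         # tiny vocab used by randomly initialized test models
image_token_index = 32000    # placeholder id that falls outside that vocab

scores = torch.randn(1, text_vocab_size)  # logits produced at one decoding step
# Suppressing the token unconditionally would touch an index the tensor lacks:
#   scores[0, image_token_index] = float("-inf")   # IndexError
# hence the `image_token_index < config.get_text_config().vocab_size` guard.
```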

@gante merged commit 2fdb5e7 into huggingface:main on Sep 20, 2024 — 19 checks passed
@gante deleted the flaky_vlm_tests branch on September 20, 2024 at 14:43
itazap pushed a commit to NielsRogge/transformers that referenced this pull request Sep 20, 2024