
[feat] LlavaNext add feature size check to avoid CUDA Runtime Error #33608

Open
laurentd-lunit wants to merge 1 commit into main
Conversation

laurentd-lunit (Contributor) commented Sep 20, 2024

What does this PR do?

In the forward pass of LlavaNextForConditionalGeneration, the following is applied:

    special_image_mask = (
        (input_ids == self.config.image_token_index).unsqueeze(-1).expand_as(inputs_embeds)
    )
    image_features = image_features.to(inputs_embeds.device, inputs_embeds.dtype)
    inputs_embeds = inputs_embeds.masked_scatter(special_image_mask, image_features)

In some edge cases I hit, the shapes of special_image_mask and image_features are incompatible, which leads to a RuntimeError: CUDA error: device-side assert triggered inside the masked_scatter operation.
The problem occurs only very rarely and I haven't yet been able to pinpoint what causes the mismatch. A small CPU-only repro of the same failure mode is sketched below.
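For illustration, here is a minimal sketch of the failure mode, run on CPU where the same mismatch surfaces as a regular RuntimeError rather than a device-side assert (shapes are made up and much smaller than the real model's):

    import torch

    # inputs_embeds: (batch=1, seq_len=4, embed_dim=8); two positions are image tokens
    inputs_embeds = torch.zeros(1, 4, 8)
    special_image_mask = (
        torch.tensor([[0, 1, 1, 0]]).bool().unsqueeze(-1).expand_as(inputs_embeds)
    )
    # Only one feature row (8 elements) for two masked positions (16 elements):
    image_features = torch.ones(1, 8)
    # On CPU this raises a RuntimeError about the source having too few elements;
    # on CUDA the equivalent mismatch triggers the device-side assert.
    inputs_embeds.masked_scatter(special_image_mask, image_features)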

Rather than fixing each edge case, this PR suggests first checking that the number of image tokens and the number of image features match, and raising a ValueError before the masked_scatter operation when they don't. This is useful because a ValueError can be handled as the caller sees fit while the GPU remains usable, whereas the CUDA runtime error, even if caught as an exception, resurfaces on any subsequent CUDA operation; in other words, it can't really be handled and necessarily breaks the running script. A sketch of the kind of check is shown below.
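For reference, a minimal sketch of the kind of guard this PR proposes, assuming image_features has already been flattened to (num_feature_tokens, embed_dim) at this point in the forward pass (the exact shape bookkeeping is illustrative, not the committed diff):

    # Inside LlavaNextForConditionalGeneration.forward, before masked_scatter:
    n_image_tokens = (input_ids == self.config.image_token_index).sum().item()
    n_image_features = image_features.shape[0]  # assumes (num_feature_tokens, embed_dim)
    if n_image_tokens != n_image_features:
        raise ValueError(
            f"Image features and image tokens do not match: "
            f"tokens: {n_image_tokens}, features: {n_image_features}"
        )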

In short, this PR enables nicer error handling for some edge cases in LlavaNext when a GPU is used.
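On the caller side, the benefit is that the error becomes recoverable. A hypothetical handling loop (model and dataloader are placeholders, not part of this PR):

    import logging

    logger = logging.getLogger(__name__)

    for batch in dataloader:
        try:
            outputs = model(**batch)
        except ValueError as err:
            # Recoverable: log, skip the offending batch, keep using the GPU.
            logger.warning("Skipping batch with mismatched image features: %s", err)
            continue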

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@zucchini-nlp
