adding positional encoder changes and tests #32600

Open
wants to merge 45 commits into base: main

Conversation

manuelsh

@amyeroberts as there were some conflicts when merging main into #31900 (possibly due to the make scripts), I have reimplemented all the changes of #30783 in a new branch, which is rebased onto main.

@manuelsh manuelsh marked this pull request as ready for review August 11, 2024 23:53
@manuelsh
Author

manuelsh commented Aug 12, 2024

@amyeroberts I have included the interpolation of positional embeddings in all of the following models, along with their respective tests (a short usage sketch follows the list):

  1. altclip
  2. bridgetower
  3. chineseclip
  4. clip
  5. clipseg
  6. kosmos_2
  7. x_clip
  8. git
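
For context, a minimal usage sketch of the flag these changes add, using the CLIP vision tower (the checkpoint name and the 480x480 resolution are only illustrative):

import torch
from transformers import CLIPVisionModel

model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")

# An input larger than the 224x224 resolution the checkpoint was trained with
pixel_values = torch.randn(1, 3, 480, 480)

# Without interpolate_pos_encoding=True this would fail on the position embeddings;
# with it, the embeddings are resized to the new 15x15 patch grid.
with torch.no_grad():
    outputs = model(pixel_values=pixel_values, interpolate_pos_encoding=True)

print(outputs.last_hidden_state.shape)  # torch.Size([1, 226, 768]): 15*15 patches + class token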

Thanks!

This was referenced Aug 13, 2024
Collaborator

@amyeroberts amyeroberts left a comment

Thanks for adding - looks great!

Just a handful of small nits. Before merge, we'll need to run the slow tests for the models affected. Could you trigger this by running git commit --allow-empty -m "[run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip"?

model(**inputs, interpolate_pos_encoding=False)
# forward pass
with torch.no_grad():
    outputs = model(**inputs, interpolate_pos_encoding=True)
Collaborator

Nice :)

Comment on lines +474 to +477
@unittest.skip(reason="GitForCausalLM does not support inputs_embeds in generate method")
def test_inputs_embeds_matches_input_ids_with_generate(self):
    pass

Collaborator

Let's remove this as this logic is independent of this PR

Suggested change
@unittest.skip(reason="GitForCausalLM does not support inputs_embeds in generate method")
def test_inputs_embeds_matches_input_ids_with_generate(self):
    pass

Author

If I remove this, I will get the following error from the CI pipeline:

FAILED tests/models/git/test_modeling_git.py::GitModelTest::test_inputs_embeds_matches_input_ids_with_generate - ValueError: You passed `inputs_embeds` to `.generate()`, but the model class GitForCausalLM doesn't have its forwarding implemented. See the GPT2 implementation for an example (https://github.com/huggingface/transformers/pull/21405), and feel free to open a PR with it!

as shown here

Collaborator

Could you rebase on main? I believe this has been resolved upstream.

Collaborator

This should still be removed, as this test is unrelated to this PR.

return_dict (`bool`, *optional*):
Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple.

Collaborator

Suggested change

@@ -512,6 +526,8 @@ def _init_weights(self, module):
output_hidden_states (`bool`, *optional*):
Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for
more detail.
interpolate_pos_encoding (`bool`, *optional*):
Collaborator

Suggested change
interpolate_pos_encoding (`bool`, *optional*):
interpolate_pos_encoding (`bool`, *optional*, defaults to `False`):

@@ -549,6 +565,8 @@ def _init_weights(self, module):
output_hidden_states (`bool`, *optional*):
Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for
more detail.
interpolate_pos_encoding (`bool`, *optional*):
Collaborator

Suggested change
interpolate_pos_encoding (`bool`, *optional*):
interpolate_pos_encoding (`bool`, *optional*, defaults to `False`):

@manuelsh
Author

manuelsh commented Aug 15, 2024

OK, I've:

  • added your three suggestions (thanks!)
  • not removed the @unittest.skip(reason="GitForCausalLM does not support inputs_embeds in generate method")... lines, as per my comment above; please let me know
  • run the slow tests with the git command you sent. However, I don't think it is running the right slow tests, as I've just spotted some errors in them which I am fixing now (for example in the "bridgetower" one)

Please don't merge yet, I just need some time to check and potentially fix the tests.

@manuelsh
Author

The GIT model test still needs to be fixed. I'm getting this error:

tests.models.git.test_modeling_git.GitModelIntegrationTest.test_inference_interpolate_pos_encoding failed with error: <class 'IndexError'> index out of range in self
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/local/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/usr/local/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/usr/src/app/transformers/tests/models/git/test_modeling_git.py", line 588, in test_inference_interpolate_pos_encoding
    outputs = model(**inputs, interpolate_pos_encoding=True)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/src/app/transformers/src/transformers/models/git/modeling_git.py", line 1302, in forward
    embedding_output = self.embeddings(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/src/app/transformers/src/transformers/models/git/modeling_git.py", line 115, in forward
    embeddings = self.word_embeddings(input_ids)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 164, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/functional.py", line 2267, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

due to input_ids having values out of range (tensor([[49406, 768, 568, 530, 518, 2867, 49407]], dtype=torch.int32)). Specifically, 49406 and 49407 are not accepted. I'm not sure why the processor is adding them.

Still on it.

@amyeroberts
Collaborator

@manuelsh Have you included the most recent updates from main?

@manuelsh
Author

manuelsh commented Aug 20, 2024

Did it and I'm still getting the same error. These two tokens (49406 and 49407) are special tokens added by the processor: <|startoftext|> and <|endoftext|>. I can also see that the word_embeddings module is Embedding(30522, 768, padding_idx=0), i.e. a vocabulary of 30522.
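
For illustration, a minimal reproduction of the mismatch outside of the GIT code (the 30522-entry table mirrors that word_embeddings module):

import torch
import torch.nn as nn

# Valid token ids for this table are 0..30521
word_embeddings = nn.Embedding(30522, 768, padding_idx=0)

# The CLIP-style special tokens 49406/49407 fall outside that range
input_ids = torch.tensor([[49406, 768, 568, 530, 518, 2867, 49407]])
word_embeddings(input_ids)  # raises IndexError: index out of range in self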

I will do further debugging once I find time. In the meantime, any suggestions are appreciated.

@manuelsh
Author

@amyeroberts I found the source of the issue: the pre-trained model used for GIT needed to be updated to the correct one. I think this should do it!

However, now I am getting the following integration error coming from feature_extraction_audio_spectrogram_transformer, even though I've synced the branch with the latest changes. I don't see anything related to this when I run make fixup or make repo-consistency.

Traceback (most recent call last):
  File "/root/transformers/utils/check_repo.py", line 1198, in <module>
    check_repo_quality()
  File "/root/transformers/utils/check_repo.py", line 1186, in check_repo_quality
    check_all_auto_object_names_being_defined()
  File "/root/transformers/utils/check_repo.py", line 742, in check_all_auto_object_names_being_defined
    if not hasattr(transformers, class_name):
  File "/root/transformers/src/transformers/utils/import_utils.py", line 1631, in __getattr__
    value = getattr(module, name)
  File "/root/transformers/src/transformers/utils/import_utils.py", line 1630, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/root/transformers/src/transformers/utils/import_utils.py", line 1642, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.audio_spectrogram_transformer.feature_extraction_audio_spectrogram_transformer because of the following error (look up to see its traceback):
libtorch_cuda.so: cannot open shared object file: No such file or directory

Exited with code exit status 1

@manuelsh
Author

Hi @amyeroberts, I've updated from main again with all the changes and now it seems that all tests pass, so it's ready to merge!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts
Collaborator

@manuelsh Great! As there have been a few changes since the last slow run, we'll need to do another git commit --allow-empty -m "[run_slow] altclip, bridgetower, chinese_clip, clip, clipseg, git, kosmos2, x_clip". Once those are all passing we're good to merge!

@manuelsh
Author

manuelsh commented Sep 5, 2024

@amyeroberts I made a couple of fixes (one in an unrelated test, test_inference_image_segmentation in clipseg, and another in GIT) and now all tests pass.

@manuelsh
Author

Hi @amyeroberts, I wonder if something is still missing or if we can merge it.

@amyeroberts
Collaborator

@manuelsh Thanks for all the work so far on this! Yes, there's a final iteration we'll need to do -- otherwise the code all looks good.

Last week we merged in #33226. This fixed an issue in a lot of our vision models, which were using scale_factor in the nn.functional.interpolate call instead of size. This is needed in part to enable exporting to ONNX and hence making the models compatible with transformers.js.

Could you update the interpolate functions to use this updated logic flow? The tests shouldn't be affected.
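
Roughly, the updated pattern looks like the sketch below, written here as a standalone function; the argument names and the bicubic mode follow the ViT-style implementation and are only indicative of what the CLIP-family versions should end up looking like:

import torch
import torch.nn as nn

def interpolate_pos_encoding(position_embeddings: torch.Tensor, height: int, width: int, patch_size: int) -> torch.Tensor:
    # position_embeddings: (1, 1 + num_positions, dim), with the class token first
    num_positions = position_embeddings.shape[1] - 1
    class_pos_embed = position_embeddings[:, :1]
    patch_pos_embed = position_embeddings[:, 1:]
    dim = position_embeddings.shape[-1]

    new_height = height // patch_size
    new_width = width // patch_size
    grid_size = int(num_positions**0.5)

    patch_pos_embed = patch_pos_embed.reshape(1, grid_size, grid_size, dim).permute(0, 3, 1, 2)
    # The key change from #33226: pass an explicit size instead of scale_factor,
    # which keeps the op exportable to ONNX.
    patch_pos_embed = nn.functional.interpolate(
        patch_pos_embed,
        size=(new_height, new_width),
        mode="bicubic",
        align_corners=False,
    )
    patch_pos_embed = patch_pos_embed.permute(0, 2, 3, 1).view(1, -1, dim)
    return torch.cat((class_pos_embed, patch_pos_embed), dim=1)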

@manuelsh
Author

@amyeroberts done, all interpolate_pos_encoding functions updated.

Collaborator

@amyeroberts amyeroberts left a comment

Beautiful - thanks for adding this capability to our models and for iterating on a solution!

@amyeroberts
Collaborator

@manuelsh Just the failing slow tests to address!

@manuelsh
Author

manuelsh commented Sep 16, 2024

@amyeroberts, I think it's not just a matter of substituting the interpolate_pos_encoding function; it also needs adapting, as the position_embeddings tensor from #33226 is different from the position_embedding object in this PR's code (note the s).

I believe I can make them work with

self.position_embeddings = self.position_embedding.weight.unsqueeze(0)

but now all my tests are crashing for different reasons (different tensor outputs, for example) and this will take longer.
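
For clarity, a minimal sketch of the shape adaptation I mean (the sizes are only illustrative of a CLIP-style vision tower):

import torch.nn as nn

# CLIP-family models store positions as an nn.Embedding module...
position_embedding = nn.Embedding(50, 768)                     # weight: (num_positions, dim)

# ...whereas the #33226-style code indexes a plain tensor with a leading batch dimension
position_embeddings = position_embedding.weight.unsqueeze(0)   # shape: (1, num_positions, dim)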

Why not go back to the previous working commit (d44e070), merge it, and then open another PR like #33226 but for the CLIP-family models?

I would be happy to contribute to it.

@manuelsh
Author

@amyeroberts I was able to fix all tests with the new function, so there's no need for an additional PR. Please review.
