Replace VideoMAE sinusoid encoding helper with PyTorch implementation by praful-srinivasan-027 · Pull Request #46924 · huggingface/transformers

praful-srinivasan-027 · 2026-06-26T19:25:34Z

What does this PR do?

This PR replaces the NumPy implementation of get_sinusoid_encoding_table in modeling_videomae.py with an equivalent vectorized PyTorch implementation.

As part of this refactor, fixed sinusoidal positional embeddings are registered as non-persistent buffers and reinitialized in _init_weights, making the model compatible with meta device initialization while preserving the existing sinusoidal encoding behavior.

This also resolves the existing TODO to implement the helper using PyTorch.

Current status: This PR is opened as a draft while investigating a model-parallel regression introduced by the buffer registration changes. The meta initialization tests now pass, but test_model_parallelism still requires investigation.

github-actions · 2026-06-26T19:27:01Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: videomae

praful-srinivasan-027 · 2026-06-26T19:27:42Z

Hi @yonigozlan! I opened this as a draft because I'd appreciate some guidance.

While replacing the NumPy implementation of get_sinusoid_encoding_table with a PyTorch version, I updated the fixed positional embeddings to be registered as non-persistent buffers and reinitialized them in _init_weights so that the model passes the meta initialization tests.

However, this introduces a failure in test_model_parallelism due to a device mismatch between sequence_output and the positional embeddings after model sharding.

Before I continue, I wanted to check whether this is the expected approach for making these fixed positional embeddings compatible with meta initialization. If so, is there a recommended pattern for handling the model-parallel case as well? If not, I'd appreciate any guidance on the preferred approach.

Thanks!

github-actions · 2026-06-26T19:34:08Z

CI Dashboard: View test results in Grafana

Refactor VideoMAE sinusoidal position encoding to use PyTorch

bf21427

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Replace VideoMAE sinusoid encoding helper with PyTorch implementation#46924

Replace VideoMAE sinusoid encoding helper with PyTorch implementation#46924
praful-srinivasan-027 wants to merge 1 commit into
huggingface:mainfrom
praful-srinivasan-027:refactor/videomae-sinusoid-torch

praful-srinivasan-027 commented Jun 26, 2026

Uh oh!

github-actions Bot commented Jun 26, 2026

Uh oh!

praful-srinivasan-027 commented Jun 26, 2026

Uh oh!

github-actions Bot commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

praful-srinivasan-027 commented Jun 26, 2026

What does this PR do?

Uh oh!

github-actions Bot commented Jun 26, 2026

Uh oh!

praful-srinivasan-027 commented Jun 26, 2026

Uh oh!

github-actions Bot commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant