Replace VideoMAE sinusoid encoding helper with PyTorch implementation #46924
praful-srinivasan-027 wants to merge 1 commit into
[For maintainers] Suggested jobs to run (before merge)

run-slow: videomae
Hi @yonigozlan! I opened this as a draft because I'd appreciate some guidance.

While replacing the NumPy implementation of However, this introduces a failure in Before I continue, I wanted to check whether this is the expected approach for making these fixed positional embeddings compatible with meta initialization. If so, is there a recommended pattern for handling the model-parallel case as well? If not, I'd appreciate any guidance on the preferred approach. Thanks!
What does this PR do?
This PR replaces the NumPy implementation of `get_sinusoid_encoding_table` in `modeling_videomae.py` with an equivalent vectorized PyTorch implementation.

As part of this refactor, fixed sinusoidal positional embeddings are registered as non-persistent buffers and reinitialized in `_init_weights`, making the model compatible with meta device initialization while preserving the existing sinusoidal encoding behavior.

This also resolves the existing TODO to implement the helper using PyTorch.
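For readers who want to see the general pattern, here is a minimal sketch, not the code from this PR. It assumes the standard sinusoidal formulation (sine on even channels, cosine on odd channels, base 10000) and uses hypothetical names (`sinusoid_table`, `ToyEmbeddings`, `reset_position_embeddings`) that do not come from `modeling_videomae.py`. The explicit reset method stands in for the reinitialization that the PR places in `_init_weights`.

```python
import torch
from torch import nn


def sinusoid_table(n_position: int, d_hid: int) -> torch.Tensor:
    """Return a (1, n_position, d_hid) table: sin on even channels, cos on odd ones."""
    position = torch.arange(n_position, dtype=torch.float64).unsqueeze(1)  # (n_position, 1)
    channel = torch.arange(d_hid, dtype=torch.float64)  # (d_hid,)
    # Channels 2i and 2i+1 share the same frequency, 10000^(2i / d_hid).
    exponent = 2 * torch.div(channel, 2, rounding_mode="floor") / d_hid
    angles = position / torch.pow(10000.0, exponent)  # broadcasts to (n_position, d_hid)
    table = torch.empty_like(angles)
    table[:, 0::2] = torch.sin(angles[:, 0::2])
    table[:, 1::2] = torch.cos(angles[:, 1::2])
    return table.to(torch.float32).unsqueeze(0)


class ToyEmbeddings(nn.Module):
    """Hypothetical stand-in for a module with fixed positional embeddings."""

    def __init__(self, n_position: int, d_hid: int):
        super().__init__()
        self.n_position, self.d_hid = n_position, d_hid
        # Non-persistent: the buffer follows .to(...) but is left out of state_dict().
        self.register_buffer(
            "position_embeddings", torch.empty(1, n_position, d_hid), persistent=False
        )
        self.reset_position_embeddings()

    def reset_position_embeddings(self) -> None:
        # On the meta device there is no storage to fill, so skip until materialized.
        if self.position_embeddings.is_meta:
            return
        with torch.no_grad():
            self.position_embeddings.copy_(sinusoid_table(self.n_position, self.d_hid))


# Build on the meta device, materialize empty storage, then refill the buffer.
with torch.device("meta"):
    toy = ToyEmbeddings(n_position=8, d_hid=6)
toy = toy.to_empty(device="cpu")
toy.reset_position_embeddings()
```

Because the table is a pure function of `n_position` and `d_hid`, it can be recomputed after materialization instead of being loaded from a checkpoint, which is why a non-persistent buffer is sufficient in this sketch. The `torch.device("meta")` context manager assumes PyTorch 2.0 or later.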
Current status: This PR is opened as a draft while investigating a model-parallel regression introduced by the buffer registration changes. The meta initialization tests now pass, but `test_model_parallelism` still requires investigation.