
Why does the offset_normalizer use switched WH order in deformable attention #3200

Beanocean opened this issue Nov 5, 2024 · 0 comments

In the deformable attention, the spatial shapes are represented in height (H), width (W) order. Why is it necessary to swap the offset_normalizer to width (W), height (H) order here?

offset_normalizer = torch.stack(
    [spatial_shapes[..., 1], spatial_shapes[..., 0]], -1)

Originally posted by @Beanocean in #3197
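For context, a minimal sketch of how this normalizer is typically applied, assuming the (x, y) sampling-offset convention used in the reference Deformable-DETR implementation; the tensor values below are made up for illustration:

import torch

# Hypothetical multi-scale feature shapes, stored as (H, W) per level.
spatial_shapes = torch.tensor([[32, 64], [16, 32]])

# Sampling offsets are predicted in (x, y) order: x runs along the width
# axis, y along the height axis. Normalizing an (x, y) offset into the
# [0, 1] coordinate range therefore divides by (W, H), hence the swap.
offset_normalizer = torch.stack(
    [spatial_shapes[..., 1], spatial_shapes[..., 0]], -1)  # (..., 2) as (W, H)

# Example: a pixel offset of (64, 32) on level 0 (H=32, W=64) maps to (1.0, 1.0),
# i.e. one full feature map along each axis.
offset_xy = torch.tensor([64.0, 32.0])
print(offset_xy / offset_normalizer[0])  # tensor([1., 1.])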
