Why modulating attention by w&h works? #49

SupetZYK · 2022-08-30T16:19:52Z

I have a question about this line: https://github.com/IDEA-opensource/DAB-DETR/blob/main/models/DAB_DETR/transformer.py#L242 .

refHW_cond = self.ref_anchor_head(output).sigmoid() # nq, bs, 2

This line asks the model to learn the absolute values of w and h from 'output', but no supervision is applied to them. Moreover, the same 'output' tensor is also used to learn the offsets of the bbox (x, y, w, h).

So, I am wondering whether the model can learn width and height as expected?

SlongLiu · 2022-09-02T04:19:57Z

Our results show that the models gain performance when the modulation operation is enabled.
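
For readers following the thread, here is a minimal sketch of what the modulation does. It assumes the formulation described for DAB-DETR, where the x and y parts of the positional query are rescaled by w_ref/w and h_ref/h before the dot product with the key's positional embedding. The helper names (`sine_embed`, `positional_logit`), the embedding size, and the frequency layout are illustrative assumptions, not code from the repository. As far as the linked line shows, the predicted reference size enters only through this scale, so it would be trained indirectly by the gradient that the detection loss sends back through the attention logits, not by a direct w/h target.

```python
import math


def sine_embed(pos, dim=8, temperature=10000.0):
    """Sinusoidal embedding of a scalar position in [0, 1].

    Illustrative layout only; the repository's embedding may order
    and scale its channels differently.
    """
    emb = []
    for i in range(dim // 2):
        freq = temperature ** (2 * i / dim)
        angle = 2 * math.pi * pos / freq
        emb.append(math.sin(angle))
        emb.append(math.cos(angle))
    return emb


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def positional_logit(anchor, key_xy, ref_wh=None, dim=8):
    """Positional part of the cross-attention logit for one query/key pair.

    anchor : (x, y, w, h) of the query's anchor box, normalized to [0, 1]
    key_xy : (x, y) of the key location
    ref_wh : hypothetical output of the reference-size head (after sigmoid);
             None disables the modulation
    """
    x, y, w, h = anchor
    scale_x = scale_y = 1.0
    if ref_wh is not None:
        scale_x = ref_wh[0] / w  # stretches or shrinks the x similarity
        scale_y = ref_wh[1] / h  # stretches or shrinks the y similarity
    query_x = [scale_x * v for v in sine_embed(x, dim)]
    query_y = [scale_y * v for v in sine_embed(y, dim)]
    key_x = sine_embed(key_xy[0], dim)
    key_y = sine_embed(key_xy[1], dim)
    return (dot(query_x, key_x) + dot(query_y, key_y)) / math.sqrt(2 * dim)


if __name__ == "__main__":
    anchor = (0.5, 0.5, 0.2, 0.4)
    key = (0.55, 0.45)
    print("no modulation :", positional_logit(anchor, key))
    print("wider x prior :", positional_logit(anchor, key, ref_wh=(0.4, 0.4)))
```

In this sketch the head's output is never compared with a ground-truth width or height. It only reweights how strongly the x and y positional similarities contribute to the logit, so any value that lowers the final loss is acceptable to the optimizer.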
