Questions about table 5 #33

Open
kaikai23 opened this issue Oct 23, 2022 · 0 comments

@kaikai23

Hi,

In Table 5 of your paper, the (G,G,G,G) variant uses the result (79.8%) from the PVT paper, which uses absolute positional encoding. However, I suppose the other model variants listed in this table use CPE, so they are not directly comparable. Should the accuracy of (G,G,G,G) with CPE be 81.2%, as shown in Table 1?

In general, I am interested in knowing whether there is a benefit to using global attention in the early layers.

Thanks.
