Hi, I'm quite new to this area and still learning.
I have a few questions about your paper and code.
In the code below from PAM (da_att.py), I believe these three convolutions correspond to B, C, and D in the paper's figure:
self.query_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1) # B
self.key_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1) # C
self.value_conv = Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1) # D
I want to know why the channels are reduced by a factor of 8 for B and C but not for D. And why 8 specifically?
Also, why does PAM use a kernel size of 1 for these convolutions? A rough shape trace of my understanding is below.
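For context, here is how I currently read the shapes in the PAM forward pass (a simplified sketch written from my understanding of da_att.py, not the exact repository code; the class and variable names are mine):

```python
import torch
from torch.nn import Conv2d, Module, Parameter, Softmax


class PAMSketch(Module):
    """A simplified sketch of the position attention module as I understand it."""

    def __init__(self, in_dim):
        super().__init__()
        self.query_conv = Conv2d(in_dim, in_dim // 8, kernel_size=1)  # B
        self.key_conv = Conv2d(in_dim, in_dim // 8, kernel_size=1)    # C
        self.value_conv = Conv2d(in_dim, in_dim, kernel_size=1)       # D
        self.gamma = Parameter(torch.zeros(1))
        self.softmax = Softmax(dim=-1)

    def forward(self, x):
        n, c, h, w = x.size()
        # B and C only build the (h*w) x (h*w) spatial attention map,
        # so their channel dimension cancels out in the matrix product.
        q = self.query_conv(x).view(n, -1, h * w).permute(0, 2, 1)  # (n, h*w, c//8)
        k = self.key_conv(x).view(n, -1, h * w)                     # (n, c//8, h*w)
        attention = self.softmax(torch.bmm(q, k))                   # (n, h*w, h*w)
        # D keeps the full c channels because its output is what gets
        # re-weighted and added back onto x.
        v = self.value_conv(x).view(n, -1, h * w)                   # (n, c, h*w)
        out = torch.bmm(v, attention.permute(0, 2, 1)).view(n, c, h, w)
        return self.gamma * out + x
```

If that reading is correct, the //8 reduction only touches the intermediate similarity computation, which is why I'm curious how the factor of 8 was chosen.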
In "danet.py", after self.sa(), the output of PAM, sa_feat pass two convolutions, conv51 and conv6. (same in CAM)
feat1 = self.conv5a(x)
sa_feat = self.sa(feat1)
sa_conv = self.conv51(sa_feat)
sa_output = self.conv6(sa_conv)
Why does the output feature of PAM pass through two convolutions? From the paper (quoted below), I thought there was only one convolution before the element-wise summation:
"we transform the outputs of two attention modules by a convolution layer and perform an element-wise sum to accomplish feature fusion."
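For reference, this is roughly how I read conv51 and conv6 in the head (a sketch from my understanding, not the repository's exact code; the channel numbers and BatchNorm2d are placeholders for whatever the repo actually uses):

```python
import torch.nn as nn

inter_channels = 512  # placeholder: channels coming out of the PAM branch
out_channels = 19     # placeholder: number of segmentation classes

# My reading: conv51 is a 3x3 conv + norm + ReLU that refines the attention output,
# and conv6 is dropout + a 1x1 conv that maps to per-class scores.
conv51 = nn.Sequential(
    nn.Conv2d(inter_channels, inter_channels, 3, padding=1, bias=False),
    nn.BatchNorm2d(inter_channels),
    nn.ReLU(),
)
conv6 = nn.Sequential(
    nn.Dropout2d(0.1, False),
    nn.Conv2d(inter_channels, out_channels, 1),
)
```

So my confusion is whether the "convolution layer" in the quoted sentence refers to conv51, with conv6 being a separate prediction layer, or to both.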
Thank you.