
Multiscale Vision Transformers #40

Closed
wants to merge 69 commits into from

Conversation

Amapocho (Contributor)

#14

codecov-commenter commented Nov 28, 2021

Codecov Report

Merging #40 (c2bffb8) into main (a8e4215) will decrease coverage by 11.15%.
The diff coverage is 0.00%.

@@             Coverage Diff              @@
##              main      #40       +/-   ##
============================================
- Coverage   100.00%   88.84%   -11.16%     
============================================
  Files           49       52        +3     
  Lines         1091     1228      +137     
============================================
  Hits          1091     1091               
- Misses           0      137      +137     
Impacted Files                                | Coverage Δ
vformer/attention/multiscale.py               | 0.00% <0.00%> (ø)
vformer/encoder/embedding/patch_multiscale.py | 0.00% <0.00%> (ø)
vformer/encoder/multiscale.py                 | 0.00% <0.00%> (ø)

abhi-glitchhg (Member) left a comment:

Could you please use the Feedforward class from encoder/nn as the MLP, instead of 'common/mlp'?

Also, you can import DropPath from the timm library; use this import statement: from timm.models.layers import DropPath.

Also, if the pull request is still in progress, you can change it to a draft.
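For context on the DropPath suggestion above: timm's DropPath layer implements stochastic depth, which randomly drops entire residual branches per sample during training. A minimal NumPy sketch of the idea (illustrative only; the real timm layer is a torch nn.Module operating on tensors, and the function name and rng argument here are assumptions):

```python
import numpy as np

def drop_path(x, drop_prob=0.0, training=False, rng=None):
    """Stochastic depth sketch: zero out whole residual branches per sample.

    Illustrative stand-in for timm.models.layers.DropPath; not the real
    implementation, which works on torch tensors inside an nn.Module.
    """
    if drop_prob == 0.0 or not training:
        return x
    rng = rng or np.random.default_rng()
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample; broadcast over the remaining dims.
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = rng.random(shape) < keep_prob
    # Scale kept paths so the expected value matches the identity.
    return x * mask / keep_prob
```

At evaluation time (training=False) the function is the identity, which is why DropPath is safe to leave in the forward pass unconditionally.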

@Amapocho Amapocho changed the title Multiscale Vision Transformers Multiscale Vision Transformers [Draft] Nov 28, 2021
Amapocho (Contributor, Author) replied:

> Could you please use the Feedforward class from encoder/nn as the MLP, instead of 'common/mlp'?
> Also, you can import DropPath from the timm library; use this import statement: from timm.models.layers import DropPath.
> Also, if the pull request is still in progress, you can change it to a draft.

Made the necessary changes

@Amapocho Amapocho changed the title Multiscale Vision Transformers [Draft] Multiscale Vision Transformers Nov 28, 2021
@Amapocho Amapocho marked this pull request as draft November 28, 2021 08:50
vformer/encoder/embedding/patch_multiscale.py (resolved review thread)

def forward(self, x, thw_shape):
    x_block, thw_shape_new = self.attn(self.norm1(x), thw_shape)
    x_res, _ = attention_pool(
abhi-glitchhg (Member) commented on Jan 14, 2022:

Here you need to import attention_pool from the attention module.
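For readers unfamiliar with the call above: in MViT-style models, attention_pool reduces the token sequence by pooling over its (T, H, W) grid and returns the pooled tokens plus the new grid shape. A rough NumPy sketch of the concept (the signature, stride default, and strided subsampling are assumptions; the real vformer/MViT version pools torch tensors, typically with a learned conv, and also handles a class token):

```python
import numpy as np

def attention_pool(x, thw_shape, stride=(1, 2, 2)):
    """Sketch of MViT-style attention pooling.

    x: (batch, T*H*W, channels) token sequence.
    Returns the pooled sequence and the new [T, H, W] shape.
    """
    B, N, C = x.shape
    T, H, W = thw_shape
    assert N == T * H * W, "sequence length must match the T*H*W grid"
    grid = x.reshape(B, T, H, W, C)
    st, sh, sw = stride
    # Plain strided subsampling as a stand-in for max/conv pooling.
    pooled = grid[:, ::st, ::sh, ::sw, :]
    new_shape = list(pooled.shape[1:4])
    return pooled.reshape(B, -1, C), new_shape
```

With the default stride, each call halves the spatial resolution while leaving the temporal axis alone, which is how MViT builds its multiscale hierarchy.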

vformer/models/classification/multiscale.py (four outdated review threads, resolved)
Comment on lines 145 to 146
POOL_KV_STRIDE[i][0]
] = KVQ_KERNEL

The variable KVQ_KERNEL is not defined.
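One plausible fix, sketched below with hypothetical names and values: derive the missing kernel from the configured pooling stride, following the pattern in the MViT reference code where the kernel is stride + 1 along strided dimensions and 1 otherwise. This is an illustration of the pattern, not the actual fix applied in the PR:

```python
# Hypothetical config: each entry is [layer_index, stride_t, stride_h, stride_w].
POOL_KV_STRIDE = [[0, 1, 2, 2], [1, 1, 2, 2]]

def kernel_from_stride(stride):
    # Kernel slightly larger than the stride so pooled windows overlap.
    return [s + 1 if s > 1 else s for s in stride]

pool_kv = {}
for i in range(len(POOL_KV_STRIDE)):
    KVQ_KERNEL = kernel_from_stride(POOL_KV_STRIDE[i][1:])
    pool_kv[POOL_KV_STRIDE[i][0]] = KVQ_KERNEL
```

Defining KVQ_KERNEL before the assignment on lines 145-146 (or inlining the derivation) would resolve the reviewer's comment.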

vformer/models/classification/multiscale.py (two more outdated review threads, resolved)
@NeelayS NeelayS closed this Jul 3, 2022