
Why number of parameters is different from the paper #13

Open
bmnptnt opened this issue Jun 12, 2024 · 2 comments

Comments


bmnptnt commented Jun 12, 2024

I checked the number of parameters of DRCT reported in the paper: almost 10M.
But when I run the DRCT model with embed_dim=180, the number of parameters is almost 13M.
What is the difference?
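(For reference, a figure like the ~13M above is usually obtained with a plain PyTorch count over the instantiated model. A minimal sketch follows; count_parameters is an illustrative helper name, not something from the DRCT repo, and the model is assumed to be built from the repo's 4x config with embed_dim=180.)

import torch.nn as nn

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for a PyTorch module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# Usage (model built from the repo's DRCT config with embed_dim=180):
# total, trainable = count_parameters(model)
# print(f"{total / 1e6:.2f}M total, {trainable / 1e6:.2f}M trainable")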

ming053l (Owner) commented Jun 12, 2024

Hi, this may be related to the package we used to measure the model/parameter size.

We used the same package to measure the sizes of all three models (SwinIR/HAT/DRCT) in our paper.

I need some time to check which package was used for the measurement; for some reason I can't immediately locate the file we used.

Nonetheless, their relative relationship should be the same, since all three models were measured with the same package.

Thanks!

ming053l (Owner) commented Jun 13, 2024

@bmnptnt

Hi,

We recalculated the parameters for HAT / HAT-L / DRCT / DRCT-L, and after careful comparison, we found that, as you mentioned, there was an error in the parameter count for HAT/DRCT, while the numbers for the Large versions are correct. We apologize for this mistake and will be updating the correct numbers on arXiv. Thank you once again!

Here are the corrected parameter counts:

HAT : 20,772,507
HAT-L : 40,846,575

DRCT : 14,139,579
DRCT-L: 27,580,719

We used fvcore to measure the parameter counts. The code is as follows:

import torch
from torch import nn
import drct.archs.DRCT_arch as DRCT
import drct.archs.hat_arch as HAT  # HAT is measured the same way

# Build DRCT with the same configuration as in the paper (embed_dim=180, 4x upscaling).
model = DRCT.DRCT(img_size=64,
                 patch_size=1,
                 in_chans=3,
                 embed_dim=180,
                 depths=(6, 6, 6, 6, 6, 6),
                 num_heads=(6, 6, 6, 6, 6, 6),
                 window_size=16,
                 compress_ratio=3,
                 squeeze_factor=30,
                 conv_scale=0.01,
                 overlap_ratio=0.5,
                 mlp_ratio=2.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop_rate=0.,
                 attn_drop_rate=0.,
                 drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm,
                 ape=False,
                 patch_norm=True,
                 use_checkpoint=False,
                 upscale=4,
                 img_range=1.,
                 upsampler='pixelshuffle',
                 resi_connection='1conv',
                 gc=32)

from fvcore.nn import FlopCountAnalysis, parameter_count

input_tensor = torch.randn(1, 3, 64, 64)

# FLOPs for a single 64x64 input.
flop_count = FlopCountAnalysis(model, input_tensor)
flops = flop_count.total()

# Total parameter count (the "" key holds the model-wide total).
params = parameter_count(model)[""]

print(f"FLOPs: {flops}")
print(f"Params: {params}")
