
Why number of parameters is different from the paper #13

Open
bmnptnt opened this issue Jun 12, 2024 · 2 comments

Comments


bmnptnt commented Jun 12, 2024

I checked the number of parameters of DRCT reported in the paper: almost 10M.
But when I run the DRCT model with embed_dim=180, the number of parameters is almost 13M.
What is the difference?
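(For reference, a figure like the ~13M above is usually obtained with a plain PyTorch count over the instantiated model. A minimal sketch follows; count_parameters is an illustrative helper name, not something from the DRCT repo, and the model is assumed to be built from the repo's 4x config with embed_dim=180.)

import torch.nn as nn

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for a PyTorch module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# Usage (model built from the repo's DRCT config with embed_dim=180):
# total, trainable = count_parameters(model)
# print(f"{total / 1e6:.2f}M total, {trainable / 1e6:.2f}M trainable")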

ming053l (Owner) commented Jun 12, 2024

Hi, this may be related to the package we used to measure the model/parameter size.

We used the same package to measure the sizes of all three models (SwinIR/HAT/DRCT) in our paper.

I need some time to check which package was used for the measurement; for some reason I can't immediately locate the file we used.

Nonetheless, their relative relationship should be the same, since all three models were measured with the same package.

Thanks!

ming053l (Owner) commented Jun 13, 2024

@bmnptnt

Hi,

We recalculated the parameters for HAT / HAT-L / DRCT / DRCT-L, and after careful comparison, we found that, as you mentioned, there was an error in the parameter count for HAT/DRCT, while the numbers for the Large versions are correct. We apologize for this mistake and will be updating the correct numbers on arXiv. Thank you once again!

Here are the corrected parameter counts:

HAT : 20,772,507
HAT-L : 40,846,575

DRCT : 14,139,579
DRCT-L: 27,580,719

We used fvcore to measure the parameter counts. The code is as follows:

import torch
from torch import nn
import drct.archs.DRCT_arch as DRCT
import drct.archs.hat_arch as HAT  # HAT is measured the same way

# Build DRCT with the same configuration as in the paper (embed_dim=180, 4x upscaling).
model = DRCT.DRCT(img_size=64,
                 patch_size=1,
                 in_chans=3,
                 embed_dim=180,
                 depths=(6, 6, 6, 6, 6, 6),
                 num_heads=(6, 6, 6, 6, 6, 6),
                 window_size=16,
                 compress_ratio=3,
                 squeeze_factor=30,
                 conv_scale=0.01,
                 overlap_ratio=0.5,
                 mlp_ratio=2.,
                 qkv_bias=True,
                 qk_scale=None,
                 drop_rate=0.,
                 attn_drop_rate=0.,
                 drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm,
                 ape=False,
                 patch_norm=True,
                 use_checkpoint=False,
                 upscale=4,
                 img_range=1.,
                 upsampler='pixelshuffle',
                 resi_connection='1conv',
                 gc=32)

from fvcore.nn import FlopCountAnalysis, parameter_count

input_tensor = torch.randn(1, 3, 64, 64)

# FLOPs for a single 64x64 input.
flop_count = FlopCountAnalysis(model, input_tensor)
flops = flop_count.total()

# Total parameter count (the "" key holds the model-wide total).
params = parameter_count(model)[""]

print(f"FLOPs: {flops}")
print(f"Params: {params}")
