optimizer selection #115

xuanxu92 · 2024-05-07T19:50:13Z

Hi, thanks for this excellent work. I noticed that in train-stage1.py, line 106, the optimizer is AdamW:
opt = torch.optim.AdamW(
    swinir.parameters(), lr=cfg.train.learning_rate,
    weight_decay=0
)
But in Section 4.1 (implementations) of the paper, you mention that Adam is used. Could you please clarify whether it is Adam or AdamW?

Thanks!

0x3f3f3f3fun · 2024-05-13T08:08:18Z

Sorry, AdamW is right. We will update our paper.
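
For readers wondering how much the Adam/AdamW distinction matters here: the snippet quoted above sets weight_decay=0, and with zero weight decay the two update rules coincide. The sketch below is a plain-Python, single-parameter illustration of the standard Adam and AdamW update rules, not code from the DiffBIR repository; the names adam_step and run and the toy loss p**2 are assumptions made for the example.

```python
import math

def adam_step(p, g, m, v, t, lr=1e-3, wd=0.0, decoupled=False,
              b1=0.9, b2=0.999, eps=1e-8):
    """One scalar update step (hypothetical helper for illustration).

    decoupled=True  -> AdamW: shrink the parameter directly by lr * wd.
    decoupled=False -> Adam with L2: add wd * p to the gradient.
    """
    if decoupled:
        p = p * (1.0 - lr * wd)
    else:
        g = g + wd * p
    m = b1 * m + (1.0 - b1) * g          # first-moment estimate
    v = b2 * v + (1.0 - b2) * g * g      # second-moment estimate
    m_hat = m / (1.0 - b1 ** t)          # bias correction
    v_hat = v / (1.0 - b2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v

def run(wd, decoupled, steps=100):
    """Minimise the toy loss p**2 from p = 1.0 and return the final p."""
    p, m, v = 1.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = 2.0 * p  # gradient of p**2
        p, m, v = adam_step(p, g, m, v, t, wd=wd, decoupled=decoupled)
    return p

same_without_decay = run(0.0, True) == run(0.0, False)
differ_with_decay = run(0.1, True) != run(0.1, False)
print(same_without_decay, differ_with_decay)
```

The two rules only diverge once the weight decay is non-zero, which is why the wording in the paper would not change the training behaviour of the quoted configuration.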

xuanxu92 · 2024-05-15T16:02:22Z

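Hi, a follow-up question: why does this work for SR at an arbitrary upscale factor? I read the paper but did not fully follow this step. As I understand it, the LQ image goes through the preprocess model to become the condition, which the VAE then encodes into a latent, and the model generates the high-resolution SR result from that latent and x_t (random noise). What I don't understand is why the sampler works for different image sizes and different upscale factors. Could you explain the reason? Many thanks!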

0x3f3f3f3fun · 2024-05-15T17:34:18Z

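The underlying reason is that SD's UNet can process a latent z of arbitrary size, or more precisely, any latent z whose height and width are multiples of 8. In DiffBIR the condition latent is concatenated with z, so the size of the condition latent determines the size of z. The UNet therefore runs normally whenever the height and width of the condition latent are multiples of 8. Because the VAE downsamples by a factor of 8, this is equivalent to requiring that the height and width of the condition be multiples of 64. The code also has a step that pads the condition to a multiple of 64. If this wasn't clear, feel free to ask further.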
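
The size constraint described above can be sketched in a few lines. This is a minimal illustration of the arithmetic only, not the padding code from the DiffBIR repository; the names VAE_FACTOR, UNET_MULTIPLE, pad_to_multiple and latent_shape are hypothetical.

```python
VAE_FACTOR = 8      # the VAE downsamples each side by 8 (per the reply above)
UNET_MULTIPLE = 8   # the UNet needs latent sides that are multiples of 8

def pad_to_multiple(size, multiple=VAE_FACTOR * UNET_MULTIPLE):
    """Round a side length up to the next multiple (64 by default)."""
    return -(-size // multiple) * multiple  # integer ceiling division

def latent_shape(height, width):
    """Latent (height, width) after padding the condition and encoding it."""
    padded_h = pad_to_multiple(height)
    padded_w = pad_to_multiple(width)
    return padded_h // VAE_FACTOR, padded_w // VAE_FACTOR

# A condition whose sides are not multiples of 64 is padded up first.
h, w = latent_shape(400, 720)
print(pad_to_multiple(400), pad_to_multiple(720), (h, w))
```

Because the padded sides are multiples of 64, dividing by the VAE factor always yields latent sides that are multiples of 8, whatever the input size or upscale factor was.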

xuanxu92 · 2024-05-15T18:03:44Z

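Great, thank you for the explanation. Suppose that at inference my input image is 128x128 with upscale 4. The condition image is then 128x4=512 on each side, and after the VAE encoder the condition latent is 512//8=64 on each side. That latent size of 64 has to be a multiple of 8 to satisfy the SD UNet, which can only process a latent z whose height and width are multiples of 8. Is my understanding correct? That is, the VAE first downsamples by a factor of 8, and the resulting latent z must itself have sides that are multiples of 8 for the SD UNet.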

0x3f3f3f3fun · 2024-05-15T18:10:17Z

Yes, that's right.

xuanxu92 · 2024-05-16T14:07:38Z

Great, thanks!
