🔥🔥🔥 A curated list of papers on recent diffusion-based high-resolution image and video synthesis works.

GuoLanqing/Awesome-High-Resolution-Diffusion

Awesome Diffusion Models in High-Resolution Synthesis

Collection of recent diffusion-based high-resolution (e.g., $>1024^2$) image and video synthesis works. Questions and discussions are most welcome! Upcoming works will be added on a regular basis; feel free to contact me to add... 👍

Papers and Codes

🔅 Tuning-Free Algorithms

  • HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts (9 Sep 2024)

    Xinyu Liu, Yingqing He, Lanqing Guo, Xiang Li, Bu Jin, Peng Li, Yan Li, Chi-Min Chan, Qifeng Chen, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo
    arXiv Project Page Code

  • MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning (20 Aug 2024)

    Haoning Wu, Shaocheng Shen, Qiang Hu, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang
    arXiv Project Page Code

  • ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance (24 June 2024)

    Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng
    arXiv Project Page Code

  • DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance (26 June 2024)

    Younghyun Kim, Geunmin Hwang, Eunbyung Park
    arXiv Project Page Code

  • ECCV'24 FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis (19 Mar 2024)

    Linjiang Huang, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu, Hongsheng Li
    arXiv Code

  • ECCV'24 ZIGMA: A DiT-style Zigzag Mamba Diffusion Model (30 July 2024)

    Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, Björn Ommer
    arXiv Project Page Code

  • ECCV'24 AccDiffusion: An Accurate Method for Higher-Resolution Image Generation (18 July 2024)

    Shen Zhang, Zhaowei Chen, Zhenyu Zhao, Yuhao Chen, Yao Tang, Jiajun Liang
    arXiv Project Page

  • ECCV'24 HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models (29 Nov 2023)

    Zhihang Lin, Mingbao Lin, Meng Zhao, Rongrong Ji
    arXiv Project Page Demo Code

  • ECCV'24 BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion (6 Apr 2024)

    Gwanghyun Kim, Hayeon Kim, Hoigi Seo, Dong Un Kang, Se Young Chun
    arXiv Project Page Code

  • ECCV'24 Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation (16 Feb 2024)

    Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen
    arXiv Project Page Code

  • ICML'24 Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (22 Jan 2024)

    Ling Yang, Zhaochen Yu, Chenlin Meng, Minkai Xu, Stefano Ermon, Bin Cui
    arXiv Code

  • CVPR'24 DemoFusion: Democratising High-Resolution Image Generation With No $$$ (24 Nov 2023)

    Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma
    arXiv Project Page Demo Code

  • CVPR'24 Generative Powers of Ten (4 Dec 2023)

    Xiaojuan Wang, Janne Kontkanen, Brian Curless, Steve Seitz, Ira Kemelmacher, Ben Mildenhall, Pratul Srinivasan, Dor Verbin, Aleksander Holynski
    arXiv Project Page

  • CVPR'24 FreeU: Free Lunch in Diffusion U-Net (20 Sep 2023)

    Chenyang Si, Ziqi Huang, Yuming Jiang, Ziwei Liu
    arXiv Project Page Demo Code

  • ICLR'24 ScaleCrafter: Tuning-Free Higher-Resolution Visual Generation with Diffusion Models (11 Oct 2023)

    Yingqing He, Shaoshu Yang, Haoxin Chen, Xiaodong Cun, Menghan Xia, Yong Zhang, Xintao Wang, Ran He, Qifeng Chen, Ying Shan
    arXiv Project Page Demo Code

  • NeurIPS'23 Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis (26 Oct 2023)

    Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue
    arXiv

  • NeurIPS'23 SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions (8 Jun 2023)

    Yuseung Lee, Kunho Kim, Hyunjin Kim, Minhyuk Sung
    arXiv Project Page Demo Code

  • ICML'23 MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation (16 Feb 2023)

    Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel
    arXiv Project Page Demo Code

🔅 Fine-Tuning Algorithms

  • LinFusion: 1 GPU, 1 Minute, 16K Image (3 Sep 2024)

    Songhua Liu, Weihao Yu, Zhenxiong Tan, Xinchao Wang
    arXiv Project Page Code

  • UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks (2 July 2024)

    Jingjing Ren, Wenbo Li, Haoyu Chen, Renjing Pei, Bin Shao, Yong Guo, Long Peng, Fenglong Song, Lei Zhu
    arXiv Project Page

  • ECCV'24 Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation (16 Feb 2024)

    Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen
    arXiv Project Page Code

  • CVPR'24 Image Neural Field Diffusion Models (11 Jun 2024)

    Yinbo Chen, Oliver Wang, Richard Zhang, Eli Shechtman, Xiaolong Wang, Michael Gharbi
    arXiv Project Page

  • ICLR'24 PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis (30 Sep 2023)

    Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li
    arXiv Project Page Demo Code

  • ICCV'23 DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning (13 Apr 2023)

    Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li
    arXiv Code

  • AAAI'24 Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images (31 Aug 2023)

    Qingping Zheng, Yuanfan Guo, Jiankang Deng, Jianhua Han, Ying Li, Songcen Xu, Hang Xu
    arXiv Code

🔅 Training from Scratch Algorithms

Cascaded Model

  • ICLR'24 Relay Diffusion: Unifying diffusion process across resolutions for image synthesis (4 Sep 2023)

    Jiayan Teng, Wendi Zheng, Ming Ding, Wenyi Hong, Jianqiao Wangni, Zhuoyi Yang, Jie Tang
    arXiv Project Page Demo Code

  • Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation (27 Sep 2023)

    David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou
    arXiv Project Page Demo Code

  • LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models (26 Sep 2023)

    Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu
    arXiv Project Page Demo Code

  • JMLR'22 [CDM] Cascaded Diffusion Models for High Fidelity Image Generation (30 May 2021)

    Jonathan Ho, Chitwan Saharia, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans
    arXiv Project Page

End-to-End Model

  • ICML'24 Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers (21 Jan 2024)

    Katherine Crowson, Stefan Andreas Baumann, Alex Birch, Tanishq Mathew Abraham, Daniel Z. Kaplan, Enrico Shippole
    arXiv Project Page Code

  • ICML'24 FiT: Flexible Vision Transformer for Diffusion Model (19 Feb 2024)

    Zeyu Lu, Zidong Wang, Di Huang, Chengyue Wu, Xihui Liu, Wanli Ouyang, Lei Bai
    arXiv Code

  • ICLR'24 SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis (4 Jul 2023)

    Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, Robin Rombach
    arXiv Code

  • ICLR'24 [Patch-DM] Patched Denoising Diffusion Models For High-Resolution Image Synthesis (2 Aug 2023)

    Zheng Ding, Mengqi Zhang, Jiajun Wu, Zhuowen Tu
    arXiv Code

  • ICLR'24 Matryoshka Diffusion Models (23 Oct 2023)

    Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, Navdeep Jaitly
    arXiv

  • ICLR'24 ∞-Diff: Infinite Resolution Diffusion with Subsampled Mollified States (31 Mar 2023)

    Sam Bond-Taylor, Chris G. Willcocks
    arXiv Code

  • On the Importance of Noise Scheduling for Diffusion Models (26 Jan 2023)

    Ting Chen
    arXiv

  • ICML'23 Simple diffusion: End-to-end diffusion for high resolution images (26 Jan 2023)

    Emiel Hoogeboom, Jonathan Heek, Tim Salimans
    arXiv Code

🔅 Super Resolution Algorithms

  • [PromptSR] Image Super-Resolution with Text Prompt Diffusion (24 Nov 2023)

    Zheng Chen, Yulun Zhang, Jinjin Gu, Xin Yuan, Linghe Kong, Guihai Chen, Xiaokang Yang
    arXiv Code

  • TIP: Text-Driven Image Processing with Semantic and Restoration Instructions (18 Dec 2023)

    Chenyang Qi, Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Qifeng Chen, Hossein Talebi
    arXiv Project Page

  • CVPR'24 [SUPIR] Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild (24 Jan 2024)

    Fanghua Yu, Jinjin Gu, Zheyuan Li, Jinfan Hu, Xiangtao Kong, Xintao Wang, Jingwen He, Yu Qiao, Chao Dong
    arXiv Project Page Code

  • CVPR'24 SinSR: Diffusion-Based Image Super-Resolution in a Single Step (23 Nov 2023)

    Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, Bihan Wen
    arXiv Code

  • IJCV'24 [StableSR] Exploiting Diffusion Prior for Real-World Image Super-Resolution (11 May 2023)

    Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin C.K. Chan, Chen Change Loy
    arXiv Project Page Demo Code

Metrics

  • FID (Fréchet Inception Distance) / KID (Kernel Inception Distance) [Ref]
  • IS (Inception Score) [Ref]
  • patch-FID (pFID) / patch-KID (pKID): use cropped local patches [Ref]
  • sFID / sKID: use the features before the global average pooling [Ref]
