
A Unified, Flexible and Training-free Cache Acceleration Framework for 🤗Diffusers
🎉Now, cache-dit covers almost All Diffusers' DiT Pipelines🎉
🔥Qwen-Image | FLUX.1 | Qwen-Image-Lightning | Wan 2.1 | Wan 2.2 🔥
🔥HunyuanImage-2.1 | HunyuanVideo | HunyuanDiT | HiDream | AuraFlow🔥
🔥CogView3Plus | CogView4 | LTXVideo | CogVideoX | CogVideoX 1.5 | ConsisID🔥
🔥Cosmos | SkyReelsV2 | VisualCloze | OmniGen 1/2 | Lumina 1/2 | PixArt🔥
🔥Chroma | Sana | Allegro | Mochi | SD 3/3.5 | Amused | ... | DiT-XL🔥
🔥Click here to show Important News: First API-stable (v1.0.0) Release🔥
2025.09.25: 🎉The first API-stable version (v1.0.0) of cache-dit has finally been released!
2025.09.25: 🔥cache-dit has joined the Diffusers community ecosystem.
2025.09.10: 🎉Day 1 support for HunyuanImage-2.1 with a 1.7x↑🎉 speedup! Check this example.
2025.09.08: 🔥Qwen-Image-Lightning 7.1/3.5 steps🎉 inference with DBCache: F16B16.
2025.09.08: 🎉First caching mechanism in Wan2.2 with cache-dit, check this PR for more details.
2025.09.08: 🎉First caching mechanism in Qwen-Image-Lightning with cache-dit, check this PR.
2025.09.03: 🎉Wan2.2-MoE 2.4x↑🎉 speedup! Please refer to run_wan_2.2.py as an example.
2025.09.01: 📚Hybrid Forward Pattern is supported! Please check FLUX.1-dev as an example.
2025.08.19: 🔥Qwen-Image-Edit 2x↑🎉 speedup! Check the example: run_qwen_image_edit.py.
2025.08.12: 🎉First caching mechanism in QwenLM/Qwen-Image with cache-dit, check this PR.
2025.08.11: 🔥Qwen-Image 1.8x↑🎉 speedup! Please refer to run_qwen_image.py as an example.
2025.08.10: 🔥FLUX.1-Kontext-dev is supported! Please refer to run_flux_kontext.py as an example.
2025.07.18: 🎉First caching mechanism in 🤗huggingface/flux-fast with cache-dit, check the PR.
2025.07.13: 🎉FLUX.1-dev 3.3x↑🎉 speedup on NVIDIA L20 with cache-dit + compile + FP8 DQ.
We are excited to announce that the first API-stable version (v1.0.0) of cache-dit has finally been released! cache-dit is a Unified, Flexible, and Training-free cache acceleration framework for 🤗 Diffusers, enabling cache acceleration with just one line of code. Key features include Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, DBCache, TaylorSeer Calibrator, and Cache CFG.
You can install the stable release of cache-dit from PyPI, or the latest development version from GitHub:
pip3 install -U cache-dit  # or: pip3 install git+https://github.com/vipshop/cache-dit.git
Then try it out with just a few lines of code:
>>> import cache_dit
>>> from diffusers import DiffusionPipeline
>>> pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image") # Can be any diffusion pipeline
>>> cache_dit.enable_cache(pipe) # One-line code with default cache options.
>>> output = pipe(...) # Just call the pipe as normal.
>>> stats = cache_dit.summary(pipe) # Then, get the summary of cache acceleration stats.
>>> cache_dit.disable_cache(pipe) # Disable cache and run original pipe.
- 🎉Full 🤗Diffusers Support: Notably, cache-dit now supports nearly all of Diffusers' DiT-based pipelines, such as Qwen-Image, FLUX.1, Qwen-Image-Lightning, HunyuanImage-2.1, HunyuanVideo, HunyuanDiT, Wan 2.1/2.2, HiDream, AuraFlow, CogView3Plus, CogView4, LTXVideo, CogVideoX 1.5, ConsisID, SkyReelsV2, VisualCloze, OmniGen, Lumina, PixArt, Chroma, Sana, Allegro, Mochi, SD 3.5, Amused, and DiT-XL.
- 🎉Extremely Easy to Use: In most cases, you only need one line of code: cache_dit.enable_cache(...). After calling this API, just use the pipeline as normal.
- 🎉Easy New Model Integration: Features like Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, and Patch Functor make it highly functional and flexible. For example, we achieved 🎉Day 1 support for HunyuanImage-2.1 with a 1.7x speedup and no precision loss, even before it was available in the Diffusers library.
- 🎉State-of-the-Art Performance: Compared with algorithms including Δ-DiT, Chipmunk, FORA, DuCa, TaylorSeer and FoCa, cache-dit achieves the best accuracy when the speedup ratio is below 4x.
- 🎉Support for 4/8-Steps Distilled Models: Surprisingly, cache-dit's DBCache works for extremely few-step distilled models—something many other methods fail to do.
- 🎉Compatibility with Other Optimizations: Designed to work seamlessly with torch.compile, model CPU offload, sequential CPU offload, group offloading, etc.; see the sketch after this list.
- 🎉Hybrid Cache Acceleration: Now supports hybrid DBCache + Calibrator schemes (e.g., DBCache + TaylorSeerCalibrator). DBCache acts as the Indicator to decide when to cache, while the Calibrator decides how to cache. More mainstream cache acceleration algorithms (e.g., FoCa) will be supported in the future, along with additional benchmarks—stay tuned for updates!
- 🤗Diffusers Ecosystem Integration: 🔥cache-dit has joined the Diffusers community ecosystem as the first DiT-specific cache acceleration framework! Check out the Diffusers documentation for details.
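As a quick illustration of the compatibility point above, the sketch below stacks cache acceleration, Diffusers' model CPU offload, and torch.compile on a single pipeline. It is a minimal sketch that reuses the Qwen/Qwen-Image checkpoint from the quickstart; the dtype and prompt are placeholders, and any other supported DiT pipeline follows the same shape.

import torch
import cache_dit
from diffusers import DiffusionPipeline

# Any supported DiT pipeline works; Qwen/Qwen-Image is just a placeholder here.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
)

# 1) Enable cache acceleration with the default options (as in the quickstart).
cache_dit.enable_cache(pipe)

# 2) Optional: standard Diffusers model CPU offload, orthogonal to cache-dit.
pipe.enable_model_cpu_offload()

# 3) Optional: compile the transformer for extra speedup on top of caching.
pipe.transformer = torch.compile(pipe.transformer)

output = pipe("a cup of coffee on a wooden table")  # Call the pipe as normal.
print(cache_dit.summary(pipe))  # Cache acceleration stats.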
🔥Click here to show many Image/Video cases🔥
🔥Wan2.2 MoE | +cache-dit:2.0x↑🎉 | HunyuanVideo | +cache-dit:2.1x↑🎉
🔥Qwen-Image | +cache-dit:1.8x↑🎉 | FLUX.1-dev | +cache-dit:2.1x↑🎉
🔥Qwen...Lightning | +cache-dit:1.14x↑🎉 | HunyuanImage | +cache-dit:1.7x↑🎉
🔥Qwen-Image-Edit | Input w/o Edit | Baseline | +cache-dit:1.6x↑🎉 | 1.9x↑🎉
🔥FLUX-Kontext-dev | Baseline | +cache-dit:1.3x↑🎉 | 1.7x↑🎉 | 2.0x↑🎉
🔥HiDream-I1 | +cache-dit:1.9x↑🎉 | CogView4 | +cache-dit:1.4x↑🎉 | 1.7x↑🎉
🔥CogView3 | +cache-dit:1.5x↑🎉 | 2.0x↑🎉 | Chroma1-HD | +cache-dit:1.9x↑🎉
🔥Mochi-1-preview | +cache-dit:1.8x↑🎉 | SkyReelsV2 | +cache-dit:1.6x↑🎉
🔥VisualCloze-512 | Model | Cloth | Baseline | +cache-dit:1.4x↑🎉 | 1.7x↑🎉
🔥LTX-Video-0.9.7 | +cache-dit:1.7x↑🎉 | CogVideoX1.5 | +cache-dit:2.0x↑🎉
🔥OmniGen-v1 | +cache-dit:1.5x↑🎉 | 3.3x↑🎉 | Lumina2 | +cache-dit:1.9x↑🎉
🔥Allegro | +cache-dit:1.36x↑🎉 | AuraFlow-v0.3 | +cache-dit:2.27x↑🎉
🔥Sana | +cache-dit:1.3x↑🎉 | 1.6x↑🎉 | PixArt-Sigma | +cache-dit:2.3x↑🎉
🔥PixArt-Alpha | +cache-dit:1.6x↑🎉 | 1.8x↑🎉 | SD 3.5 | +cache-dit:2.5x↑🎉
🔥Amused | +cache-dit:1.1x↑🎉 | 1.2x↑🎉 | DiT-XL-256 | +cache-dit:1.8x↑🎉
For more advanced features such as Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, Patch Functor, DBCache, TaylorSeer Calibrator, and Hybrid Cache CFG, please refer to the 🎉User_Guide.md for details.
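To make the hybrid DBCache + TaylorSeer scheme more concrete, here is a minimal sketch of what such a configuration might look like. The configuration class names (BasicCacheConfig, TaylorSeerCalibratorConfig) and their parameters are assumptions inferred from the F16B16 notation used in this README and may not match the released API exactly; the User_Guide.md documents the real enable_cache options.

import cache_dit
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image")

# Hybrid scheme: DBCache is the indicator that decides *when* to reuse cached
# blocks (F16B16 = always compute the first 16 and the last 16 blocks), while
# the TaylorSeer calibrator decides *how* to approximate the skipped blocks.
# NOTE: the config class and parameter names below are illustrative assumptions;
# see User_Guide.md for the exact signatures.
cache_dit.enable_cache(
    pipe,
    cache_config=cache_dit.BasicCacheConfig(               # assumed name
        Fn_compute_blocks=16,   # "F16": the first 16 blocks are always computed
        Bn_compute_blocks=16,   # "B16": the last 16 blocks are always computed
    ),
    calibrator_config=cache_dit.TaylorSeerCalibratorConfig(  # assumed name
        taylorseer_order=1,     # first-order Taylor extrapolation of residuals
    ),
)

output = pipe("a scenic mountain landscape at sunset")
print(cache_dit.summary(pipe))

The split mirrors the description earlier in this README: the cache config governs when cached residuals are reused, while the calibrator governs how the skipped computation is approximated.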
- ⚙️Installation
- 🔥Benchmarks
- 🔥Supported Pipelines
- 🎉Unified Cache APIs
- ⚡️Dual Block Cache
- 🔥TaylorSeer Calibrator
- ⚡️Hybrid Cache CFG
- 🛠Metrics CLI
- ⚙️Torch Compile
- 📚API Documents
How to contribute? Star ⭐️ this repo to support us or check CONTRIBUTE.md.
The cache-dit codebase was originally adapted from FBCache. It has since diverged significantly, and the cache-dit API is no longer compatible with FBCache.
Special thanks to vipshop's Computer Vision AI Team for supporting the documentation, testing, and production-level deployment of this project.
@misc{cache-dit@2025,
title={cache-dit: A Unified, Flexible and Training-free Cache Acceleration Framework for Diffusers.},
url={https://github.com/vipshop/cache-dit.git},
note={Open-source software available at https://github.com/vipshop/cache-dit.git},
author={vipshop.com},
year={2025}
}