WanImageToVideoPipeline - swap out a limited number of blocks #10999
Check the `apply_group_offloading`-related docs. It will be going into the current release soon, and we can work on improving its discoverability/docs. It has minimal VRAM requirements without much overhead to generation time if you're on a modern CUDA GPU. Currently the RAM requirements are high, but reducing them is a WIP (happy to accept any improvement PRs 🤗). Some numbers: #10847 (comment). If you combine it with #10623, precompute text embeddings, and do tiled VAE decode, you can run in under ~7-10 GB.
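The mechanism behind group offloading can be illustrated without the library: park groups of transformer blocks on the CPU and move each group to the compute device only while its forward runs. This is a hypothetical pure-PyTorch sketch of that idea, not the actual `apply_group_offloading` implementation (which additionally uses pinned memory and CUDA streams to overlap transfers with compute); `attach_group_offload` and the tiny model are illustrative names only.

```python
import torch
import torch.nn as nn

def attach_group_offload(blocks, onload_device, offload_device, group_size=2):
    """Park each group of blocks on `offload_device`; move a group to
    `onload_device` just before its first block runs, and back afterwards.
    Inputs are assumed to already live on `onload_device`."""
    groups = [list(blocks[i:i + group_size]) for i in range(0, len(blocks), group_size)]
    for group in groups:
        for block in group:
            block.to(offload_device)

        def onload(module, args, _group=group):
            # runs before the group's first block; returns None so the
            # forward inputs are left unchanged
            for b in _group:
                b.to(onload_device)

        def offload(module, args, output, _group=group):
            # runs after the group's last block; evict weights again
            for b in _group:
                b.to(offload_device)

        group[0].register_forward_pre_hook(onload)
        group[-1].register_forward_hook(offload)

# usage: 4 blocks in groups of 2; "cpu" stands in for both devices here
blocks = nn.ModuleList(nn.Linear(8, 8) for _ in range(4))
attach_group_offload(blocks, torch.device("cpu"), torch.device("cpu"))
x = torch.randn(1, 8)
for blk in blocks:
    x = blk(x)
print(tuple(x.shape))
```

At any moment only one group's weights need to be resident on the GPU, which is why peak VRAM drops roughly in proportion to `group_size` over the total block count.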
Superb, I'll check it out.
@a-r-r-o-w: Thanks for the hints! I did some tests on Google Colab, and the following code saturates the system RAM (83.5 GB) and then hangs.
Group offloading with streams currently has a significant limitation: it pins weight tensors in CPU memory, which makes it require a lot more RAM than other methods. Could you try it without streams? Also, tiling support hasn't been added to Wan yet, I believe (cc @yiyixuxu).
@a-r-r-o-w: Thanks for the info! The following code currently throws; see below.
If I comment out the two instances of …, I can fit WanImageToVideoPipeline on a 24 GB card, but it scrapes the ceiling and sits a bit too close for comfort to OOMing at some random system event.
kijai/ComfyUI-WanVideoWrapper has a nice option to swap a limited, user-defined number of blocks out of VRAM. Can a similar thing be done right now with `sequential_cpu_offload`? If not, I would like to request something along these lines to shave off 2-4 GB.
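The block-swap idea being requested can be sketched in plain PyTorch: offload only the first N blocks and stream each one to the compute device for its own forward pass, keeping the rest resident in VRAM. This is a hypothetical illustration of the technique under discussion, not diffusers' or ComfyUI-WanVideoWrapper's actual code; `apply_block_swap` is an invented name.

```python
import torch
import torch.nn as nn

def apply_block_swap(blocks, blocks_to_swap, device):
    """Keep the first `blocks_to_swap` blocks offloaded to the CPU, moving
    each to `device` just before its forward and back right after; the
    remaining blocks stay resident on `device`."""
    for i, block in enumerate(blocks):
        if i < blocks_to_swap:
            block.to("cpu")

            def onload(module, args):
                # move this block's weights in just before it runs;
                # implicit None return leaves the inputs untouched
                module.to(device)

            def offload(module, args, output):
                # evict the weights again right after the block runs
                module.to("cpu")

            block.register_forward_pre_hook(onload)
            block.register_forward_hook(offload)
        else:
            block.to(device)

# usage: swap 2 of 4 blocks; "cpu" stands in for a CUDA device here
blocks = nn.ModuleList(nn.Linear(8, 8) for _ in range(4))
apply_block_swap(blocks, blocks_to_swap=2, device=torch.device("cpu"))
x = torch.randn(1, 8)
for blk in blocks:
    x = blk(x)
print(tuple(x.shape))
```

Tuning `blocks_to_swap` trades generation speed for VRAM headroom, which is exactly the 2-4 GB shaving asked for above: each swapped block frees its parameter memory at the cost of one host-to-device transfer per step.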
I'm open to other ideas for a small VRAM reduction.