I created all of the videos here with Stability AI's Stable Video Diffusion XT 1.1 image-to-video latent diffusion model (SVD XT 1.1), available on Hugging Face. The videos were generated from single images on Amazon Web Services (AWS), using the SVD XT 1.1 model and ComfyUI on a G4dn instance powered by NVIDIA T4 GPUs. They were rendered as MP4 files with the recommended 25 frames, at an average of 9 fps and a resolution of 1,024 x 576 pixels. The videos were also rendered as WebP files (or in some cases, the MP4 files were converted to WebP) for display on GitHub, as shown below.
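For readers who prefer a programmatic route over ComfyUI, here is a minimal sketch of the same image-to-video generation using Hugging Face's diffusers library. It assumes you have accepted the gated model's license on Hugging Face and authenticated locally (e.g., with `huggingface-cli login`); the file name `source.png` and the seed are placeholders.

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image

# Load the gated SVD XT 1.1 checkpoint (requires accepting the license
# on Hugging Face and authenticating, e.g., via `huggingface-cli login`).
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt-1-1",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # helps fit the model on a 16 GB T4

# Source image at the recommended 1,024 x 576 resolution
image = load_image("source.png").resize((1024, 576))

generator = torch.manual_seed(42)
# SVD XT produces 25 frames; decode_chunk_size trades VRAM for speed
frames = pipe(image, decode_chunk_size=4, generator=generator).frames[0]

export_to_video(frames, "output.mp4", fps=9)  # matches the ~9 fps used here
```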
One of the best instructional videos I've seen on what is possible with SVD is ComfyUI: Stable Video Diffusion (Workflow Tutorial) by ControlAltAI on YouTube.
Note that SVD XT 1.1 is a gated model. According to the Stable Video Diffusion 1.1 License Agreement, "Derivative Work(s)" means (a) any derivative work of the Software Products as recognized by U.S. copyright laws and (b) any modifications to a Model, and any other model created which is based on or derived from the Model or the Model's output. For clarity, Derivative Works do not include the output of any Model.
The ComfyUI workflow is included in this repository; it creates both an MP4 and a WebP video file. It is based on the workflow referenced in the comments of Sebastian Kamph's YouTube video, Image2Video. Stable Video Diffusion Tutorial.
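As an aside, a workflow like this can also be queued headlessly against a running ComfyUI server through its HTTP API, which is handy on a remote AWS instance. The sketch below assumes the workflow has been re-exported in ComfyUI's API format ("Save (API Format)", available when dev mode is enabled in the settings) and that the server is listening on ComfyUI's default port 8188; the file name `svd_workflow_api.json` is a placeholder.

```python
import json
import urllib.request

COMFYUI_URL = "http://127.0.0.1:8188"  # ComfyUI's default listen address

# The workflow must be in API format, not the regular UI-save format.
with open("svd_workflow_api.json") as f:  # placeholder filename
    workflow = json.load(f)

# POST the workflow to the /prompt endpoint to queue it for execution.
payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    f"{COMFYUI_URL}/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # response includes a prompt_id for the queued job
```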
[Gallery: source images paired with the videos generated from them]
The contents of this repository represent my viewpoints and not those of my past or current employers, including Amazon Web Services (AWS). All third-party libraries, modules, plugins, and SDKs are the property of their respective owners.