gio961gio/Music-to-Image-Interpolation

Music to Image Interpolation

Open In Colab: link to the project notebook in Google Colab


A generative AI pipeline that produces image interpolations from an audio track, built on Stable Diffusion.


Examples


Steve Reich - Music for Pieces of Wood (30-second extract) (fps=7, num_inference_steps=20)



Karlheinz Stockhausen - Helicopter String Quartet (25 seconds) (fps=5, num_inference_steps=30)



Jean-Claude Risset - SUD (30-second extract) (fps=7, num_inference_steps=20)



Antonio Vivaldi - Winter (15-second extract) (fps=7, num_inference_steps=20)


Pipeline

(pipeline diagram)


Information

The core of the system is Hugging Face's Stable Diffusion 'img2img' pipeline. Image embeddings are created with Meta's ImageBind model, a multimodal model that maps audio data into the same embedding space as images.

The interpolation part is adapted from the publicly available code by nateraw (https://github.com/nateraw/stable-diffusion-videos.git), and the detextifier is adapted from the publicly available code by iuliaturc (https://github.com/iuliaturc/detextify.git). The Stable Diffusion and ImageBind models are combined following the public code provided by Zeqiang-Lai (https://github.com/Zeqiang-Lai/Anything2Image.git).
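The interpolation code referenced above (nateraw's stable-diffusion-videos) blends latents and embeddings between keyframes using spherical linear interpolation (slerp) rather than plain linear mixing, which keeps the interpolated vectors at a sensible norm for the diffusion model. A minimal NumPy sketch of that operation (the function shape follows common slerp implementations; the exact signature in the repository may differ):

```python
import numpy as np

def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical linear interpolation between two vectors.

    t is the interpolation fraction in [0, 1]. Falls back to
    linear interpolation when the vectors are nearly parallel,
    where the spherical formula becomes numerically unstable.
    """
    v0_unit = v0 / np.linalg.norm(v0)
    v1_unit = v1 / np.linalg.norm(v1)
    dot = np.sum(v0_unit * v1_unit)
    if np.abs(dot) > dot_threshold:
        # Nearly parallel: plain lerp is accurate enough
        return (1 - t) * v0 + t * v1
    theta = np.arccos(dot)        # angle between the vectors
    sin_theta = np.sin(theta)
    s0 = np.sin((1 - t) * theta) / sin_theta
    s1 = np.sin(t * theta) / sin_theta
    return s0 * v0 + s1 * v1

# Interpolate between two embeddings over several in-between frames
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
frames = [slerp(t, a, b) for t in np.linspace(0.0, 1.0, 7)]
```

At fps=7, one second of video between two audio-derived embeddings corresponds to seven such interpolation steps, each decoded into an image by the img2img pipeline.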
