
Understanding the Code

soten355 edited this page Dec 19, 2023 · 1 revision

MetalDiffusion is written in Python and depends heavily on pre-existing libraries such as Hugging Face's Diffusers, PyTorch, TensorFlow, and NumPy. Its primary goal is to give the end user an easy, straightforward, and thoughtful way to run those libraries. This means MetalDiffusion can switch between render frameworks (such as Diffusers/PyTorch and TensorFlow) while also adjusting the GUI to reflect what each framework can and can't do.

The main GUI is currently browser-based, built with Gradio. However, MetalDiffusion is designed to be GUI-agnostic and can even be run from the command line in a terminal. So, how does it work?

MetalDiffusion follows a hierarchy, a fixed workflow, at startup:

  1. Load all of the global variables (model location, user preferences, etc.) into memory
  2. Create the dreamWorld class. This main class will be used by the GUI. This class also receives all of the global variables, allowing that data to be changed on the fly without changing the original variables
  3. Create the GUI that will then send commands to the dreamWorld class.
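
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not MetalDiffusion's actual code: the `Settings` dataclass and its fields, the `DreamWorld` constructor signature, and the `launch_gui` helper are all hypothetical names chosen for the example. The point it shows is that the main class works on its own copy of the global settings, so changes made on the fly never touch the original variables.

```python
import copy
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical global settings; the real project's names may differ.
    model_location: str = "models/"
    render_framework: str = "TensorFlow"
    steps: int = 25


class DreamWorld:
    """Hypothetical stand-in for the dreamWorld class."""

    def __init__(self, settings: Settings):
        # Work on a private copy so the global defaults stay untouched.
        self.settings = copy.deepcopy(settings)

    def create(self, **user_input):
        # Overwrite only the private copy with whatever the front end sends.
        for name, value in user_input.items():
            setattr(self.settings, name, value)
        return (
            f"rendering with {self.settings.render_framework}, "
            f"{self.settings.steps} steps"
        )


def launch_gui(world: DreamWorld) -> str:
    # Placeholder for any front end (browser GUI, command line, ...).
    return world.create(steps=50)


# 1. Load globals, 2. create the main class, 3. hand it to a front end.
GLOBAL_SETTINGS = Settings()
world = DreamWorld(GLOBAL_SETTINGS)
result = launch_gui(world)
```

Because the front end only talks to the class through `create`, swapping the browser GUI for a command-line script would not require changes to the class itself.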

Once the user selects all of their variables, the dreamWorld class will be called via the create method/function:

  • Update dreamWorld variables to match new user input variables
  • Create the model if it doesn't already exist, based on the variables.
    • Some variables trigger a re-compile of the diffusion model. This step checks those specific variables.
  • Once all variables are updated and the models are compiled, MetalDiffusion begins generating an image or animation with either the generateArt or generateCinema function/method
  • generateArt and generateCinema are essentially the same, with a few extra functions performed by generateCinema:
    • Loop onto itself until it reaches the total frame count
    • Re-use input image, ignore input image, or use previous frame as input image
    • Apply visual changes to the input image (regardless of where it came from). These changes can be zooming in, panning right, etc
  • Once the image(s) are generated, MetalDiffusion saves them to the creations folder with metadata embedded in each image file, along with a .txt file containing the same information
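
The flow of the create call described above can be sketched like this. It is a simplified illustration rather than the project's real implementation: the class name `DreamWorldSketch`, the contents of `RECOMPILE_KEYS`, the `mode` argument, and the string "images" are all assumptions made so the example runs without any rendering library. Saving to the creations folder is left out.

```python
# Hypothetical: which settings force a model rebuild is an assumption.
RECOMPILE_KEYS = {"render_framework", "width", "height"}


class DreamWorldSketch:
    def __init__(self, **settings):
        self.settings = dict(settings)
        self.model = None
        self.compile_count = 0

    def compile_model(self):
        # Stand-in for building the diffusion model.
        self.compile_count += 1
        self.model = f"model[{self.settings.get('render_framework')}]"

    def generate_art(self, input_image=None):
        # Stand-in for a single render; returns a label instead of pixels.
        return f"image(from={input_image})"

    def generate_cinema(self, total_frames, input_image=None, mode="previous"):
        frames = []
        source = None if mode == "ignore" else input_image
        for _ in range(total_frames):  # loop until the frame count is reached
            if source is not None:
                source = f"zoom({source})"  # e.g. zoom in or pan right
            frame = self.generate_art(source)
            frames.append(frame)
            if mode == "previous":
                source = frame  # feed the last frame back in
            elif mode == "reuse":
                source = input_image  # always start from the original
        return frames

    def create(self, total_frames=1, **user_input):
        # Update stored variables and note which ones actually changed.
        changed = {k for k, v in user_input.items() if self.settings.get(k) != v}
        self.settings.update(user_input)
        # Build the model if missing or if a rebuild-triggering variable changed.
        if self.model is None or changed & RECOMPILE_KEYS:
            self.compile_model()
        if total_frames > 1:
            return self.generate_cinema(total_frames)
        return [self.generate_art()]


demo = DreamWorldSketch(render_framework="TensorFlow", steps=25)
demo.create(steps=30)                     # first call compiles the model
demo.create(steps=40)                     # no rebuild: steps is not a trigger
demo.create(render_framework="PyTorch")   # rebuild: framework changed
```

In this sketch the animation path simply calls the single-image path once per frame, which mirrors the description of generateCinema as generateArt plus looping, input-image selection, and per-frame visual changes.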