IMAGE_GEN.md

Table of Contents

good reads

SD vs DallE vs MJ

July 2023: compare models: https://zoo.replicate.dev/

June 2023: https://news.ycombinator.com/item?id=36407272

DallE banned so SD https://twitter.com/almost_digital/status/1556216820788609025?s=20&t=GCU5prherJvKebRrv9urdw

https://i.redd.it/fqgv82ihav9a1.png (but keep in mind that DallE 2 doesn't respond well to "photorealistic")

another comparison https://www.reddit.com/r/StableDiffusion/comments/zevuw2/a_simple_comparison_between_sd_15_20_21_and/

comparisons with other models https://www.reddit.com/r/StableDiffusion/comments/zlvrl6/i_tried_various_models_with_the_same_settings/

Lexica Aperture - finetuned version of SD https://lexica.art/aperture

  • fast
  • focused on photorealistic portraits and landscapes
  • negative prompting
  • dimensions

midjourney

Midjourney v5

nice trick to mix images https://twitter.com/javilopen/status/1613107083959738369

"midjourney style" - just feed "prompt" to it https://twitter.com/rainisto/status/1606221760189317122

or emojis: https://twitter.com/LinusEkenstam/status/1616841985599365120

DallE

DallE vs Imagen vs Parti architecture

Runway Gen-1/2

usage example https://twitter.com/nickfloats/status/1639709828603084801?s=20

Gen1 explainer https://twitter.com/c_valenzuelab/status/1652282840971722754?s=20

other text to image models

Tooling

Misc

Products

product placement

Stable Diffusion prompts

The basic intuition for prompting Stable Diffusion is that you have to add descriptors to get what you want.

From here:

"George Washington riding a Unicorn in Times Square"

image

George Washington riding a unicorn in Times Square, cinematic composition, concept art, digital illustration, detailed

image

Prompts might go in the form of

[Prefix] [Subject], [Enhancers]

Adding the right enhancers can change the outcome significantly:

image
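The `[Prefix] [Subject], [Enhancers]` pattern above can be sketched as a small helper. This is only an illustration of the string layout: `build_prompt` and its parameter names are hypothetical and not part of any Stable Diffusion tool.

```python
def build_prompt(subject, prefix="", enhancers=()):
    """Assemble a prompt as "[Prefix] [Subject], [Enhancers]".

    Hypothetical helper for illustration; it only formats a string.
    """
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    # Prefix and subject are joined with a space; a missing prefix is skipped.
    head = " ".join(part for part in (prefix.strip(), subject) if part)
    # Enhancers are appended as a comma-separated list of descriptors.
    tail = [e.strip() for e in enhancers if e.strip()]
    return ", ".join([head] + tail)


prompt = build_prompt(
    "George Washington riding a unicorn in Times Square",
    enhancers=["cinematic composition", "concept art",
               "digital illustration", "detailed"],
)
print(prompt)
```

Keeping the enhancers in a list makes it easy to swap descriptors in and out while holding the subject fixed, which is how the comparisons above are usually produced.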

SD v2 prompts

SD2 Prompt Book from Stability: https://stability.ai/sdv2-prompt-book

SD 1.4 vs 1.5 comparisons

Distilled Stable Diffusion

SD2 vs SD1 user notes

Hardware requirements

Stable Diffusion

stable diffusion specific notes

Required reading:

SD Distros

SD Major forks and UIs

Main Stable Diffusion repo: https://github.com/CompVis/stable-diffusion

OpenJourney: https://happyaccidents.ai/, https://www.bluewillow.ai/

| Name/Link | Stars | Description |
| --- | --- | --- |
| AUTOMATIC1111 | 86000 | The most well-known web UI. Features: https://github.com/AUTOMATIC1111/stable-diffusion-webui#features. Launch announcement: https://www.reddit.com/r/StableDiffusion/comments/x28a76/stable_diffusion_web_ui/. M1 Mac instructions: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Installation-on-Apple-Silicon |
| easydiffusion | 6900 | "Easy Diffusion is easily my favorite UI". While it has a fraction of the features found in stable-diffusion-webui, it has the best out of the box UI I've tried so far. The way it enqueues tasks and renders the generated images beats anything I've seen in the various UIs I've played with. I also like that you can easily write plugins in JavaScript, both for the UI and for server-side tweaks. |
| Disco Diffusion | 6400 | A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations. |
| sd-webui (formerly hlky fork) | 6000 | A fully-integrated and easy way to work with Stable Diffusion right from a browser window. Long list of UI and SD features (incl. textual inversion, alternative samplers, prompt matrix): https://github.com/sd-webui/stable-diffusion-webui#project-features |
| InvokeAI (formerly lstein fork) | 8800 | This version of Stable Diffusion features a slick WebGUI, an interactive command-line script that combines text2img and img2img functionality in a "dream bot" style interface, and multiple features and other enhancements. It runs on Windows, Mac and Linux machines, with GPU cards with as little as 4 GB of RAM. Universal Canvas (see youtube) |
| XavierXiao/Dreambooth-Stable-Diffusion | 4900 | Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion. Dockerized: https://github.com/smy20011/dreambooth-docker |
| Basujindal: Optimized Stable Diffusion | 2600 | This repo is a modified version of the Stable Diffusion repo, optimized to use less VRAM than the original by sacrificing inference speed. img2img, txt2img and inpainting under 2.4 GB VRAM. |
| stablediffusion-infinity | 2800 | Outpainting with Stable Diffusion on an infinite canvas. This project mainly works as a proof of concept. |
| Waifu Diffusion (huggingface, replicate) | 1600 | Stable Diffusion finetuned on weeb stuff. "A model trained on danbooru (anime/manga drawing site with also lewds and nsfw on it) over 56k images. Produces FAR BETTER results if you're interested in getting manga and anime stuff out of stable diffusion." |
| AbdBarho/stable-diffusion-webui-docker | 1600 | Easy Docker setup for Stable Diffusion with both Automatic1111 and hlky UI included. However, no Mac support yet: AbdBarho/stable-diffusion-webui-docker#35 |
| fast-stable-diffusion | 3200 | +25-50% speed increase + memory efficient + DreamBooth |
| nolibox/carefree-creator | 1800 | An infinite draw board for you to save, review and edit all your creations. Almost EVERY feature about Stable Diffusion (txt2img, img2img, sketch2img, variations, outpainting, circular/tiling textures, sharing, ...). Many useful image editing methods (super resolution, inpainting, ...). Integrations of different Stable Diffusion versions (waifu diffusion, ...). GPU RAM optimizations, which makes it possible to enjoy these features with an NVIDIA GeForce GTX 1080 Ti! It might be fair to consider this as: An AI-powered, open source Figma. A more 'interactable' Hugging Face Space. A place where you can try all the exciting and cutting-edge models, together. |
| imaginAIry 🤖🧠 | 1600 | Pythonic generation of stable diffusion images with just `pip install imaginairy`. "just works" on Linux and macOS (M1) (and maybe Windows). Memory efficiency improvements, prompt-based editing, face enhancement, upscaling, tiled images, img2img, prompt matrices, prompt variables, BLIP image captions, comes with dockerfile/colab. Has unit tests. |
| neonsecret/stable-diffusion | 582 | This repo is a modified version of the Stable Diffusion repo, optimized to use less VRAM than the original by sacrificing inference speed. Also I invented the sliced attention technique, which allows to push the model's abilities even further. It works by automatically determining the slice size from your VRAM and image size and then allocating it one by one accordingly. You can practically generate any image size, it just depends on the generation speed you are willing to sacrifice. |
| Deforum Stable Diffusion | 591 | Animating prompts with stable diffusion. Weighted Prompts, Perspective 2D Flipping, Dynamic Video Masking, Custom MATH expressions, Waifu and Robo Diffusion Models. twitter, changelog. Replicate demo: https://replicate.com/deforum/deforum_stable_diffusion |
| Maple Diffusion | 550 | Maple Diffusion runs Stable Diffusion models locally on macOS / iOS devices, in Swift, using the MPSGraph framework (not Python). Matt Waller working on CoreML impl |
| Doggettx/stable-diffusion | 158 | Allows the use of resolutions that require up to 64x more VRAM than possible on the default CompVis build. |
| Doohickey Diffusion | 29 | CLIP guidance, perceptual guidance, Perlin initial noise, and other features. |
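Several forks in the table above trade inference speed for lower VRAM use, and the neonsecret row describes slicing attention with a slice size derived from available memory. Here is a minimal pure-Python sketch of that general idea, assuming plain scaled dot-product attention. It is not the code from any listed repo, and `pick_slice_size` is a hypothetical heuristic.

```python
import math


def _softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def _attend(q_slice, keys, values):
    """Scaled dot-product attention for one slice of query rows."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    # Only this slice's score matrix exists at once: len(q_slice) x len(keys).
    scores = [[scale * sum(a * b for a, b in zip(q, k)) for k in keys]
              for q in q_slice]
    weights = [_softmax(row) for row in scores]
    width = len(values[0])
    return [[sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
            for row in weights]


def pick_slice_size(n_queries, n_keys, budget_bytes, bytes_per_score=4):
    """Hypothetical heuristic: most query rows whose scores fit the budget."""
    rows = budget_bytes // (n_keys * bytes_per_score)
    return max(1, min(n_queries, rows))


def sliced_attention(queries, keys, values, slice_size):
    """Process queries slice by slice; the result matches unsliced attention."""
    if slice_size < 1:
        raise ValueError("slice_size must be at least 1")
    out = []
    for start in range(0, len(queries), slice_size):
        out.extend(_attend(queries[start:start + slice_size], keys, values))
    return out


q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
full = sliced_attention(q, k, v, slice_size=len(q))
sliced = sliced_attention(q, k, v, slice_size=1)
```

Each query row's softmax runs over all keys regardless of slicing, so smaller slices lower peak memory without changing the output; the cost is more sequential passes.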

https://github.com/Filarius/stable-diffusion-webui/blob/master/scripts/vid2vid.py with Vid2Vid

Future Diffusion https://huggingface.co/nitrosocke/Future-Diffusion https://twitter.com/Nitrosocke/status/1599789199766716418

SD in Other languages

Other Lists of Forks

SD Model search and ratings: https://civitai.com/

Dormant projects, for historical/research interest:

Misc SD UIs

UIs that don't come with their own SD distro and just shell out to one

| UI Name/Link | Stars | Self-Description |
| --- | --- | --- |
| ahrm/UnstableFusion | 815 | UnstableFusion is a desktop frontend for Stable Diffusion which combines image generation, inpainting, img2img and other image editing operations into a seamless workflow. https://www.youtube.com/watch?v=XLOhizAnSfQ&t=1s |
| stable-diffusion-2-gui | 262 | Lightweight Stable Diffusion v 2.1 web UI: txt2img, img2img, depth2img, inpaint and upscale4x. |
| breadthe/sd-buddy | 165 | Companion desktop app for the self-hosted M1 Mac version of Stable Diffusion, with Svelte and Tauri |
| leszekhanusz/diffusion-ui | 65 | This is a web interface frontend for the generation of images using diffusion models. The goal is to provide an interface to online and offline backends doing image generation and inpainting like Stable Diffusion. |
| GenerationQ | 21 | GenerationQ (for "image generation queue") is a cross-platform desktop application (screens below) designed to provide a general purpose GUI for generating images via text2img and img2img models. Its primary target is Stable Diffusion but since there is such a variety of forked programs with their own particularities, the UI for configuring image generation tasks is designed to be generic enough to accommodate just about any script (even non-SD models). |

SD Prompt galleries and search engines

SD Visual search

SD Prompt generators

Img2prompt - Reverse Prompt Engineering

Explore Artists, styles, and modifiers

See https://github.com/sw-yx/prompt-eng/blob/main/PROMPTS.md for more details and notes

SD Prompt Tools directories and guides

Finetuning/Dreambooth

How to finetune

Now LoRA: https://github.com/cloneofsimo/lora

Stable Diffusion + Midjourney

Embeddings/Textual Inversion

Dreambooth

Trained examples

ControlNet

SD Tooling

How SD Works - Internals and Studies

SD Results

Img2Img

InstructPix2Pix

  • https://www.timothybrooks.com/instruct-pix2pix
  • Pix2Pixzero - https://pix2pixzero.github.io/
    • We propose pix2pix-zero, a diffusion-based image-to-image approach that allows users to specify the edit direction on-the-fly (e.g., cat to dog). Our method can directly use pre-trained text-to-image diffusion models, such as Stable Diffusion, for editing real and synthetic images while preserving the input image's structure. Our method is training-free and prompt-free, as it requires neither manual text prompting for each input image nor costly fine-tuning for each task.

Extremely detailed prompt examples

Solving Hands

  • Negative prompts: ugly, disfigured, too many fingers, too many arms, too many legs, too many hands
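The negative terms above are usually supplied separately from the main prompt. A minimal sketch of bundling them, assuming a front end or library that accepts a `negative_prompt` argument alongside `prompt` (the Hugging Face diffusers Stable Diffusion pipeline does). The helper `generation_kwargs` and the example prompt are hypothetical.

```python
# Negative terms taken from the list above.
HAND_NEGATIVES = [
    "ugly", "disfigured", "too many fingers",
    "too many arms", "too many legs", "too many hands",
]


def generation_kwargs(prompt, negatives=HAND_NEGATIVES):
    """Hypothetical helper: pair a prompt with a comma-separated negative prompt."""
    return {"prompt": prompt, "negative_prompt": ", ".join(negatives)}


kwargs = generation_kwargs("portrait of a pianist, hands on the keys")
print(kwargs["negative_prompt"])

# Assumption: `pipe` is an already-loaded diffusers StableDiffusionPipeline.
# image = pipe(**kwargs).images[0]
```

Negative prompts steer generation away from the listed terms; they reduce malformed hands in many cases but do not guarantee correct anatomy.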

Midjourney prompts

Misc