This is a repo providing some Stable Diffusion experiments on the textual inversion task and the captioning task.
Clone the repo, then create a conda environment from environment.yml and install the dependencies.
conda env create --file=environment.yml
conda activate sd
pip install -r requirements.txt
The textual inversion experiment creates a 20-frame video from the generation of two images that start from the different concepts provided by the user.
It is possible to load concepts by providing a valid Hugging Face 🤗 concept repo: https://huggingface.co/spaces/sd-concepts-library/stable-diffusion-conceptualizer
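As a minimal sketch of what loading such a concept involves, here is an illustrative example (not the repo's actual code), assuming the sd-concepts-library file layout (a learned_embeds.bin file mapping a placeholder token to its embedding) and the diffusers StableDiffusionPipeline; the model id and concept repo are just examples:

```python
import torch
from diffusers import StableDiffusionPipeline
from huggingface_hub import hf_hub_download

# Illustrative checkpoint; any Stable Diffusion model id should work the same way.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

# Concept repos in sd-concepts-library ship a learned_embeds.bin file that maps
# a placeholder token (e.g. "<gta5-artwork>") to its learned embedding.
embeds_path = hf_hub_download("sd-concepts-library/gta5-artwork", "learned_embeds.bin")
token, embedding = next(iter(torch.load(embeds_path, map_location="cpu").items()))

# Register the placeholder token and copy its embedding into the text encoder.
pipe.tokenizer.add_tokens(token)
pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))
token_id = pipe.tokenizer.convert_tokens_to_ids(token)
pipe.text_encoder.get_input_embeddings().weight.data[token_id] = embedding.to(pipe.text_encoder.dtype)

image = pipe(f"A man planting a seed in the {token} style").images[0]
image.save("concept_sample.png")
```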
The textual_inversion.py script accepts the following arguments:

  --model_id MODEL_ID   The Stable Diffusion model checkpoint you want to use
  --from_file, --no-from_file
                        Load arguments from a file
  -p PROMPT_FILE_PATH, --prompt_file_path PROMPT_FILE_PATH
                        Path of the file to read the prompt from
  -s SEED, --seed SEED  Set the random seed
  --from_concept_repo FROM_CONCEPT_REPO
                        The start concept you want to use (provide a Hugging Face concept repo)
  --to_concept_repo TO_CONCEPT_REPO
                        The end concept you want to use (provide a Hugging Face concept repo)
  --from_prompt FROM_PROMPT
                        Start prompt you want to use
  --to_prompt TO_PROMPT
                        End prompt you want to use
  --num_inference_steps NUM_INFERENCE_STEPS
                        Number of inference steps
  --guidance_scale GUIDANCE_SCALE
                        The guidance scale value to set
  --width WIDTH         Canvas width of the generated image
  --height HEIGHT       Canvas height of the generated image
  --use_negative_prompt, --no-use_negative_prompt
                        Flag to use the negative prompt stored in negative_prompt.txt
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Batch size to use
  --mps, --no-mps       Set the device to 'mps' (Apple M1)
Example usage:
python textual_inversion.py --from_file -p "prompt_close_up.txt" --mps --num_inference_steps 50
python textual_inversion.py --from_concept_repo "sd-concepts-library/gta5-artwork" --to_concept_repo "sd-concepts-library/low-poly-hd-logos-icons" --from_prompt "A man planting a seed in the <concept> style" --to_prompt "A <concept> of a beautiful tree" --mps --num_inference_steps 60 -s 0
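Conceptually, the video is built by blending the start and end prompts frame by frame. The following is a minimal sketch of that idea (not the repo's actual implementation), assuming a diffusers version whose StableDiffusionPipeline accepts prompt_embeds and latents; the model id, prompts, and output file are illustrative:

```python
import torch
from diffusers import StableDiffusionPipeline

device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4").to(device)

def encode(prompt):
    # Encode a prompt into CLIP text embeddings.
    tokens = pipe.tokenizer(prompt, padding="max_length",
                            max_length=pipe.tokenizer.model_max_length,
                            truncation=True, return_tensors="pt")
    return pipe.text_encoder(tokens.input_ids.to(device))[0]

from_embeds = encode("A man planting a seed")  # start prompt
to_embeds = encode("A beautiful tree")         # end prompt

# Use the same starting latents for every frame so that only the prompt changes.
generator = torch.Generator().manual_seed(0)
latents = torch.randn((1, pipe.unet.config.in_channels, 64, 64),
                      generator=generator).to(device)

frames = []
for t in torch.linspace(0.0, 1.0, 20):         # 20 frames, as in the experiment
    embeds = torch.lerp(from_embeds, to_embeds, t.item())
    frame = pipe(prompt_embeds=embeds, latents=latents,
                 num_inference_steps=50, guidance_scale=7.5).images[0]
    frames.append(frame)

frames[0].save("morph.gif", save_all=True, append_images=frames[1:], duration=120, loop=0)
```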
The captioning task is more of an evaluation across different image-to-text models, producing captions to use as Stable Diffusion prompts to recreate the original image.
It has been designed as an investigation task, so I used the notebook captioning_task.ipynb to conduct the experiments.
Three different image2caption models have been evaluated (a minimal captioning example follows the list):
mscoco_finetuned_CoCa-ViT-L-14-laion2B-s13B-b90k
vit-gpt2-image-captioning
blip-image-captioning-base
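As a sketch of how such a caption is produced, here is a hedged example with the BLIP base model through the transformers API; the image file name is hypothetical, and the notebook may load the models differently:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("generated_image.png").convert("RGB")  # hypothetical file name
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)  # the caption is then used as a Stable Diffusion prompt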
Then there is a comparison with an image2prompt model, the CLIP-Interrogator:
pharma/CLIP-Interrogator
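A hedged sketch of that comparison step, assuming the clip-interrogator pip package (the pharma/CLIP-Interrogator Space wraps the same model); the clip model name and file name are illustrative:

```python
from PIL import Image
from clip_interrogator import Config, Interrogator

ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
image = Image.open("generated_image.png").convert("RGB")  # hypothetical file name
prompt = ci.interrogate(image)  # a full prompt intended to recreate the image
print(prompt)
```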