SemanticDraw
(Previously StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control)
*Draw multiple prompt-masks in a large canvas · Real-time creation*
Jaerin Lee · Daniel Sungho Jung · Kanggeon Lee · Kyoung Mu Lee
SemanticDraw is a real-time interactive text-to-image generation framework that allows you to draw with meanings using semantic brushes.
```bash
# Install
conda create -n semdraw python=3.12 && conda activate semdraw
git clone https://github.com/ironjr/semantic-draw
cd semantic-draw
pip install -r requirements.txt

# Run streaming demo
cd demo/stream
python app.py --model "runwayml/stable-diffusion-v1-5" --port 8000

# Open http://localhost:8000 in your browser
```
For SD3 support, additionally run:

```bash
pip install git+https://github.com/initml/diffusers.git@clement/feature/flash_sd3
```

Note: this is already included by default in requirements.txt.
| Interactive Drawing | Prompt Separation | Real-time Editing |
|---|---|---|
| Paint with semantic brushes | No unwanted content mixing | Edit photos in real-time |
```bash
conda create -n smd python=3.12 && conda activate smd
git clone https://github.com/ironjr/StreamMultiDiffusion
cd StreamMultiDiffusion
pip install -r requirements.txt
pip install git+https://github.com/initml/diffusers.git@clement/feature/flash_sd3
```
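After installation, you can quickly check that PyTorch detects your GPU before launching a demo (a generic sanity check, not part of the repository):

```python
import torch

# The demos assume a CUDA-capable GPU is available.
print(torch.__version__)                  # installed PyTorch version
print(torch.cuda.is_available())          # should print True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce GTX 1080 Ti"
```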
We provide several demo applications with different features and model support:
Real-time streaming interface with semantic drawing capabilities.
```bash
cd demo/stream
python app.py --model "your-model" --height 512 --width 512 --port 8000
```
Options
| Option | Description | Default |
|---|---|---|
| `--model` | Path to SD1.5 checkpoint (HF or local .safetensors) | None |
| `--height` | Canvas height | 768 |
| `--width` | Canvas width | 1920 |
| `--bootstrap_steps` | Semantic region separation (1-3 recommended) | 1 |
| `--seed` | Random seed | 2024 |
| `--device` | GPU device number | 0 |
| `--port` | Web server port | 8000 |
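For example, a wide canvas with stronger region separation could be launched as follows (the model and values are illustrative, combined from the options above):

```bash
cd demo/stream
python app.py --model "runwayml/stable-diffusion-v1-5" \
    --height 768 --width 1920 \
    --bootstrap_steps 2 --seed 2024 --device 0 --port 8000
```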
Simplified interface for different SD versions:
```bash
# SD 1.5
cd demo/semantic_palette
python app.py --model "runwayml/stable-diffusion-v1-5" --port 8000

# SDXL
cd demo/semantic_palette_sdxl
python app.py --model "your-sdxl-model" --port 8000

# SD3
cd demo/semantic_palette_sd3
python app.py --port 8000
```
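For instance, assuming the stock SDXL base checkpoint from Hugging Face works with your setup, the SDXL palette demo could be started like this (the model choice is an example, not a requirement):

```bash
cd demo/semantic_palette_sdxl
python app.py --model "stabilityai/stable-diffusion-xl-base-1.0" --port 8000
```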
Using Custom Models (.safetensors)
- Place your `.safetensors` file in the demo's `checkpoints` folder
- Run with:

```bash
python app.py --model "your-model.safetensors"
```
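For example, with a hypothetical checkpoint file named `my_model.safetensors`:

```bash
# Run from the folder of the demo you want to use, e.g. demo/stream
mkdir -p checkpoints
cp /path/to/my_model.safetensors checkpoints/
python app.py --model "my_model.safetensors" --port 8000
```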
Basic Generation
```python
import torch
from model import StableMultiDiffusionPipeline

# Initialize
device = torch.device('cuda:0')
smd = StableMultiDiffusionPipeline(device, hf_key='runwayml/stable-diffusion-v1-5')

# Generate
image = smd.sample('A photo of the dolomites')
image.save('output.png')
```
Region-Based Generation
```python
import torch
from model import StableMultiDiffusionPipeline
from util import seed_everything

# Setup
seed_everything(2024)
device = torch.device('cuda:0')
smd = StableMultiDiffusionPipeline(device)

# Define prompts and masks
prompts = ['background: city', 'foreground: a cat', 'foreground: a dog']
masks = load_masks()  # Your mask loading logic

# Generate
image = smd(prompts, masks=masks, height=768, width=768)
image.save('output.png')
```
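The `load_masks()` call above is a placeholder for your own mask-loading logic. A minimal sketch, assuming each region is stored as a grayscale image on disk (white marks the active region) and that the pipeline accepts a stacked tensor of binary masks, might look like this (file names and the exact output shape are assumptions):

```python
import torch
from PIL import Image
import torchvision.transforms.functional as TF

def load_masks(paths=('mask_city.png', 'mask_cat.png', 'mask_dog.png'),
               size=(768, 768)):
    """Hypothetical helper: load grayscale mask images and binarize them."""
    masks = []
    for path in paths:
        mask = Image.open(path).convert('L')       # grayscale
        mask = mask.resize((size[1], size[0]))     # PIL expects (W, H)
        mask = (TF.to_tensor(mask) > 0.5).float()  # (1, H, W), values in {0, 1}
        masks.append(mask)
    return torch.stack(masks)                      # (N, 1, H, W)
```

Adapt the return shape to whatever your version of the pipeline expects.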
Streaming Generation
```python
import torch
from model import StreamMultiDiffusion

# Initialize streaming pipeline
device = torch.device('cuda:0')
smd = StreamMultiDiffusion(device, height=512, width=512)

# Register layers (bg_mask and obj_mask are your own binary mask tensors)
smd.update_single_layer(idx=0, prompt='background', mask=bg_mask)
smd.update_single_layer(idx=1, prompt='object', mask=obj_mask)

# Stream generation
while True:
    image = smd()
    display(image)  # replace with your own display or save logic
```
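In an interactive application, layers are typically re-registered inside the loop as the user draws. The sketch below only illustrates that idea; `user_changed_mask`, `new_obj_mask`, and `show_frame` are placeholders for your own UI logic, not part of the API shown above:

```python
import time

while True:
    # Re-register a layer whenever its mask was edited by the user.
    if user_changed_mask:
        smd.update_single_layer(idx=1, prompt='object', mask=new_obj_mask)

    frame = smd()          # pull one freshly generated frame from the stream
    show_frame(frame)      # display or save the frame

    time.sleep(1 / 30)     # optionally cap the refresh rate
```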
Explore our notebooks directory for interactive examples:
- Basic usage tutorial
- Advanced region control
- SD3 examples
- Custom model integration
For technical details, see our paper and project page.
What is Semantic Palette?
Semantic Palette lets you paint with text prompts instead of colors. Each brush carries a meaning (prompt) that generates appropriate content in real-time.
Which models are supported?
- Stable Diffusion 1.5 and variants
- SDXL and variants (with Lightning LoRA)
- Stable Diffusion 3
- Custom .safetensors checkpoints
Hardware requirements?
- Minimum: GPU with 8 GB VRAM (for 512x512)
- Recommended: GPU with 11 GB VRAM for larger resolutions (tested on a GTX 1080 Ti)
- June 2025: Presented at CVPR 2025
- June 2024: SD3 support with Flash Diffusion
- April 2024: StreamMultiDiffusion v2 with responsive UI
- March 2024: SDXL support with Lightning LoRA
- March 2024: First version released
See README_old.md for full history.
```bibtex
@inproceedings{lee2025semanticdraw,
    title={{SemanticDraw:} Towards Real-Time Interactive Content Creation from Image Diffusion Models},
    author={Lee, Jaerin and Jung, Daniel Sungho and Lee, Kanggeon and Lee, Kyoung Mu},
    booktitle={CVPR},
    year={2025}
}
```
Built upon StreamDiffusion, MultiDiffusion, and LCM. Special thanks to the Hugging Face team and the model contributors.
Please email [email protected] or open an issue.