Investigate sdxl example #182
An improved version of the sdxl example is at https://github.com/ROCm/AMDMIGraphX/commits/sdxl_perf_torch_buffers/. The main idea was to move the buffers to GPU memory, which requires corresponding support in MIGraphX (a sketch of the idea follows the logs below). The original and rewritten perf logs:
Original:
Elapsed time for decode: 440.0491 ms
Elapsed time clip: 37.4158 ms
Elapsed time unet: 8252.2065 ms
Elapsed time vae: 440.0772 ms
Elapsed time for run: 8752.8331 ms
Rewritten:
Elapsed time for decode: 434.3943 ms
Elapsed time clip: 24.1256 ms
Elapsed time unet: 7470.2498 ms
Elapsed time vae: 434.4229 ms
Elapsed time for run: 7951.7439 ms

There are differences in the output images, probably due to precision.
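A minimal sketch of the buffer idea, assuming the MIGraphX Python bindings expose `argument_from_pointer()` and a compile with `offload_copy=False`; the model path, dtype, and parameter handling below are placeholders, not the code from the sdxl_perf_torch_buffers branch:

```python
import migraphx
import torch

# Compile without offload_copy so the program expects device buffers
# instead of copying host memory in and out on every run.
prog = migraphx.parse_onnx("unet.onnx")  # placeholder model path
prog.compile(migraphx.get_target("gpu"), offload_copy=False)

# Allocate the parameter buffers once as torch CUDA tensors; they stay in
# GPU memory across denoising iterations. The dtype must match the
# parameter type reported by MIGraphX (fp16 assumed here).
param_shapes = prog.get_parameter_shapes()
buffers = {name: torch.zeros(*shape.lens(), dtype=torch.float16, device="cuda")
           for name, shape in param_shapes.items()}

def run_step():
    # Wrap the existing device pointers as MIGraphX arguments -- no host
    # round-trip per step (argument_from_pointer is the assumed binding).
    params = {name: migraphx.argument_from_pointer(param_shapes[name],
                                                   buf.data_ptr())
              for name, buf in buffers.items()}
    return prog.run(params)
```

The torch tensors also act as the interface to the rest of the torch-based pipeline (scheduler, latents update), which is what removes the per-step host copies visible in the original log.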
The packages used in the TRT demo:
To get the clip.opt and clip2.opt models working, we need to use graph surgeon. Update: actually, that tensor is already in the model; the problem is that it is not "exposed" as an output. We need to re-export the model and make sure it is an output.
The commit that enabled it: ROCm@0d9e4b9. The "hidden_states" tensor was only renamed; it was not added to the ONNX outputs. With clip_modifier.py, we create a "mod" (modified) version.
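As an illustration of that re-export step (a hedged sketch, not the actual clip_modifier.py; the tensor and file names are assumptions), exposing an existing tensor such as "hidden_states" as a graph output can be done with ONNX GraphSurgeon:

```python
import onnx
import onnx_graphsurgeon as gs

def expose_hidden_states(in_path="clip.opt.onnx", out_path="clip.opt.mod.onnx",
                         tensor_name="hidden_states"):
    """Re-export the model with `tensor_name` added to the graph outputs."""
    graph = gs.import_onnx(onnx.load(in_path))
    tensors = graph.tensors()
    if tensor_name not in tensors:
        raise KeyError(f"{tensor_name} not found in the graph")
    # The tensor already exists in the model; it just is not "exposed".
    if all(out.name != tensor_name for out in graph.outputs):
        graph.outputs.append(tensors[tensor_name])
    graph.cleanup().toposort()
    onnx.save(gs.export_onnx(graph), out_path)
```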
There is a change in the outputs as well. Also, the "third" arm of the np version is now fixed.
Both SD21 and SDXL were updated to use torch. Still debugging why the refiner gives strange results for certain models.
Extended it with streams and events: ROCm#3051
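For reference, the stream/event pattern looks roughly like the torch sketch below; the actual changes in ROCm#3051 are on the MIGraphX side, so this is only an illustration of the idea:

```python
import torch

stream = torch.cuda.Stream()
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

with torch.cuda.stream(stream):
    start.record(stream)
    # ... enqueue the model's GPU work on this stream ...
    end.record(stream)

# Block only on this stream's events, then read the device-side timing.
end.synchronize()
print(f"Elapsed: {start.elapsed_time(end):.4f} ms")
```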