Investigate sdxl example #182

Open
attila-dusnoki-htec opened this issue Mar 25, 2024 · 8 comments

No description provided.

attila-dusnoki-htec commented Mar 25, 2024

An improved version of the SDXL example is at https://github.com/ROCm/AMDMIGraphX/commits/sdxl_perf_torch_buffers/

The main idea was to move the buffers to GPU memory. This requires rocm/pytorch so that device="cuda" works.

MIGraphX supports argument_from_pointer, which can wrap tensor.data_ptr() together with the proper shape.
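
A minimal sketch of that idea, assuming the `migraphx` Python module and a ROCm build of PyTorch are installed. The helper names and the dtype table are illustrative and not taken from the branch; only `argument_from_pointer` and `data_ptr()` come from the comment above.

```python
# Illustrative (assumed, partial) mapping from torch dtype names to MIGraphX type names.
TORCH_TO_MGX_TYPE = {
    "torch.float32": "float_type",
    "torch.float16": "half_type",
    "torch.int32": "int32_type",
    "torch.int64": "int64_type",
    "torch.bool": "bool_type",
}


def shape_kwargs(tensor):
    """Describe a tensor's layout as keyword arguments for migraphx.shape."""
    return {
        "type": TORCH_TO_MGX_TYPE[str(tensor.dtype)],
        "lens": list(tensor.size()),
        "strides": list(tensor.stride()),
    }


def tensor_to_arg(tensor):
    """Wrap an existing device buffer as a MIGraphX argument without copying."""
    import migraphx as mgx  # imported lazily so shape_kwargs works without it

    shape = mgx.shape(**shape_kwargs(tensor))
    return mgx.argument_from_pointer(shape, tensor.data_ptr())
```

The returned argument refers to the same memory as the tensor, so the tensor has to stay alive for as long as MIGraphX uses the argument.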

Note: for unetxl, the unetxl.opt version was used, which is created by the TensorRT demo script.

Perf logs for the original version (first) and the rewritten version (second):

Elapsed time for decode: 440.0491 ms
Elapsed time clip: 37.4158 ms
Elapsed time unet: 8252.2065 ms
Elapsed time vae: 440.0772 ms
Elapsed time for run: 8752.8331 ms
[Output image]

Elapsed time for decode: 434.3943 ms
Elapsed time clip: 24.1256 ms
Elapsed time unet: 7470.2498 ms
Elapsed time vae: 434.4229 ms
Elapsed time for run: 7951.7439 ms
[Output image]

There are differences in the output images, probably due to precision.
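
To make the comparison concrete, the per-stage savings can be computed directly from the two logs above. This is only arithmetic on the reported numbers; the `compare` helper is a name introduced here for illustration.

```python
# Timings in milliseconds, copied from the two logs above.
original = {"clip": 37.4158, "unet": 8252.2065, "vae": 440.0772, "run": 8752.8331}
rewritten = {"clip": 24.1256, "unet": 7470.2498, "vae": 434.4229, "run": 7951.7439}


def compare(before, after):
    """Return {stage: (saved_ms, percent_reduction)} for each stage in `before`."""
    result = {}
    for stage, old in before.items():
        saved = old - after[stage]
        result[stage] = (saved, 100.0 * saved / old)
    return result


for stage, (saved, pct) in compare(original, rewritten).items():
    print(f"{stage:>4}: {saved:9.4f} ms saved ({pct:5.2f}% less time)")
```

The UNet dominates the run, so the end-to-end reduction (about 9%) follows the UNet reduction closely, while clip drops by roughly 35%.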

attila-dusnoki-htec self-assigned this Mar 25, 2024

attila-dusnoki-htec commented

The packages used in the TRT demo and their possible replacements:
cuda -> hip
cudart -> cudart (with hip)
polygraphy -> Can be extended with MGX backend
tensorrt -> migraphx

As the list shows, hip-python-as-cuda could cover the cuda part.
TensorRT itself has to be replaced or wrapped.
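
The mapping can also be written down as data, together with a probe for what is installed in the current environment. This is a sketch only: the importable module names (`cuda`, `polygraphy`, `tensorrt`, `migraphx`) are assumptions, and `is_importable` and `report` are hypothetical helpers.

```python
import importlib.util

# TRT demo package -> candidate replacement, as listed above.
PACKAGE_MAP = {
    "cuda": "hip",
    "cudart": "cudart (with hip)",
    "polygraphy": "polygraphy (extended with an MGX backend)",
    "tensorrt": "migraphx",
}


def is_importable(module_name):
    """True if a top-level module with this name can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(modules=("cuda", "polygraphy", "tensorrt", "migraphx")):
    """Map each (assumed) module name to whether it is available here."""
    return {name: is_importable(name) for name in modules}


print(report())
```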

attila-dusnoki-htec commented Mar 28, 2024

To get the clip.opt and clip2.opt models working, we need to use graph surgeon.
The hidden states are not exposed by default. The corresponding code is here.

Update: Actually, that is already in the model. The problem is that it is not "exposed" as an output. We need to re-export it and make sure it is an output.
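
A sketch of what "making it an output" can look like with the plain `onnx` Python package (graph surgeon offers an equivalent route). The function names, file paths, and the FLOAT16 element type in the usage comments are placeholders; this is not the project's actual script.

```python
def is_produced(graph, tensor_name):
    """True if some node in the graph writes a tensor with this name."""
    return any(tensor_name in node.output for node in graph.node)


def expose_as_output(model, tensor_name, elem_type):
    """Register an existing intermediate tensor as a graph output.

    `model` is an onnx.ModelProto and `elem_type` an onnx.TensorProto data type.
    The output shape is left unspecified.
    """
    from onnx import helper  # lazy import: only needed when this function runs

    graph = model.graph
    if any(out.name == tensor_name for out in graph.output):
        return model  # already exposed, nothing to do
    if not is_produced(graph, tensor_name):
        raise ValueError(f"no node produces a tensor named {tensor_name!r}")

    graph.output.append(helper.make_tensor_value_info(tensor_name, elem_type, None))
    return model


# Hypothetical usage (paths and dtype are placeholders):
#   import onnx
#   model = onnx.load("clip.onnx")
#   expose_as_output(model, "hidden_states", onnx.TensorProto.FLOAT16)
#   onnx.save(model, "clip.mod.onnx")
```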

attila-dusnoki-htec commented

The commit that enabled it: ROCm@0d9e4b9

The "hidden_states" tensor was only renamed; it was not added to the ONNX outputs. With clip_modifier.py, we create a "mod" (modified) version.
After fixing the dtypes, the new clip runtimes are:

| | before | after |
|---|---|---|
| numpy | 37.4158 ms | 16.2879 ms |
| torch | 23.5778 ms | 14.2189 ms |

The outputs changed as well. The "third" arm in the NumPy version is now fixed.

[NP version output image]

[PT version output image]

attila-dusnoki-htec commented

Both SD21 and SDXL were updated to use torch, and Turbo was enabled as well.

Still debugging why the refiner gives strange results for certain models.

attila-dusnoki-htec commented

Related PRs: ROCm#2951 ROCm#2954 ROCm#2959

attila-dusnoki-htec commented Apr 15, 2024

Prompting SDXL

The following are some experiments with SDXL

Setup

The SDXL example code

The command to start the server:

python gradio_app.py -p "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" --pipeline-type sdxl-opt --use-refiner --fp16 clip clip2 unetxl refiner_clip2 refiner_unetxl

It uses the sdxl-opt pipeline with fp16 quantization for every model except the VAE.

Random examples

| variable | value |
|---|---|
| Prompt | Duck smoking cigarette, sepia colors, noir style, detailed, 8k |
| Negative prompt | |
| Number of steps | 30 |
| Random seed | 42 |
| Guidance scale | 5 |
| Number of refiner steps | 0 |
| Aesthetic score | 6 |
| Negative Aesthetic score | 2.5 |

[image]

| variable | value |
|---|---|
| Prompt | portrait of a pretty blonde woman, a flower crown, earthy makeup, flowing maxi dress with colorful patterns and fringe, a sunset or nature scene, green and gold color scheme |
| Negative prompt | |
| Number of steps | 50 |
| Random seed | 42 |
| Guidance scale | 5 |
| Number of refiner steps | 0 |
| Aesthetic score | 6 |
| Negative Aesthetic score | 2.5 |

[image]

| variable | value |
|---|---|
| Prompt | Black and white street photography of a rainy night in New York, reflections on wet pavement. |
| Negative prompt | |
| Number of steps | 100 |
| Random seed | 42 |
| Guidance scale | 5 |
| Number of refiner steps | 0 |
| Aesthetic score | 6 |
| Negative Aesthetic score | 2.5 |

[image]

Duck with fedora

The following examples all have the same values:

| variable | value |
|---|---|
| Prompt | |
| Negative prompt | |
| Number of steps | 100 |
| Random seed | 42 |
| Guidance scale | 5 |
| Number of refiner steps | 0 |
| Aesthetic score | 6 |
| Negative Aesthetic score | 2.5 |

| Prompt | Result |
|---|---|
| Duck with fedora | [image] |
| Duck with fedora, sepia color | [image] |
| Duck with fedora, sepia color, noir style | [image] |
| Duck with fedora, sepia color, noir style, detailed, 8k | [image] |
| Detailed portrait of a duck with fedora wearing an elegant suit, sepia colors, noir art style, 50s background | [image] |
| Detailed portrait of a duck with fedora wearing an elegant suit, bright colors, noir art style, 50s background | [image] |
| Detailed portrait of a detective duck with fedora wearing an elegant suit, bright colors, noir art style, 50s background | [image] |
| Detailed portrait of a detective duck with fedora wearing an elegant suit, black and white colors, noir art style, 50s background | [image] |
| Detailed portrait of a detective duck with fedora wearing an elegant 50s style suit, dark colors, noir art style, rainy street at night with lamp lights background | [image] |

The following 3 were generated with 50 steps instead of 100:

| Prompt | Result |
|---|---|
| Detailed portrait of a detective duck with fedora wearing an elegant 50s style suit, vibrant colors, noir art style, rainy street at night with lamp lights background | [image] |
| Detailed portrait of a detective duck with fedora wearing an elegant 50s style suit, monochrome, noir art style, rainy street at night with lamp lights background | [image] |
| Detailed portrait of a detective duck with fedora wearing an elegant 50s style suit, monochrome, comic book art style, rainy street at night with lamp lights background | [image] |
| Detailed portrait of a detective duck with fedora wearing an elegant 50s style suit, vibrant colors, comic book art style, rainy street at night with lamp lights background | [image] |
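
The sweeps in this section keep every setting fixed and vary only the prompt. A sketch of that bookkeeping, with hypothetical names throughout (`Settings`, `build_jobs`, and the field names are illustrative and not taken from gradio_app.py):

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Settings:
    """Fixed generation settings; defaults mirror the table used for the duck sweep."""
    negative_prompt: str = ""
    steps: int = 100
    seed: int = 42
    guidance_scale: float = 5.0
    refiner_steps: int = 0
    aesthetic_score: float = 6.0
    negative_aesthetic_score: float = 2.5


def build_jobs(prompts, settings=Settings()):
    """Pair every prompt with the same settings so only the prompt varies."""
    return [{"prompt": prompt, **asdict(settings)} for prompt in prompts]


jobs = build_jobs(["Duck with fedora", "Duck with fedora, sepia color"])

# The shorter runs differ only in the step count.
short_jobs = build_jobs(["Duck with fedora"], replace(Settings(), steps=50))
```

Keeping the seed fixed across the sweep is what makes the images comparable from one prompt to the next.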

Timesteps montage

The following images use the same prompt at different timesteps.

Prompt: Detailed portrait of a detective duck with fedora wearing an elegant 50s style suit, dark colors, noir art style, rainy street at night with lamp lights background

| | | |
|---|---|---|
| [image] 5 | [image] 10 | [image] 15 |
| [image] 20 | [image] 25 | [image] 30 |
| [image] 35 | [image] 40 | [image] 45 |
| [image] 50 | [image] 75 | [image] 100 |

attila-dusnoki-htec commented

Extended it with stream and events: ROCm#3051
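
The contents of that PR are not shown here. As a general illustration of how a stream and events can be used on the PyTorch side, for example to time GPU work, here is a hedged sketch. `timed_ms` is a hypothetical helper; it falls back to host wall-clock timing when PyTorch or a GPU is unavailable, and it does not cover how a stream would be handed to MIGraphX.

```python
import time


def timed_ms(fn):
    """Run fn() and return (result, elapsed milliseconds)."""
    try:
        import torch
        use_gpu = torch.cuda.is_available()
    except ImportError:
        use_gpu = False

    if not use_gpu:
        # Fallback: wall-clock timing on the host.
        t0 = time.perf_counter()
        result = fn()
        return result, (time.perf_counter() - t0) * 1000.0

    stream = torch.cuda.Stream()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    with torch.cuda.stream(stream):
        start.record()  # recorded on the stream made current by the context
        result = fn()   # PyTorch work issued here is queued on that stream
        end.record()
    end.synchronize()   # wait until the work before `end` has finished
    return result, start.elapsed_time(end)


result, ms = timed_ms(lambda: sum(range(1000)))
print(f"Elapsed time: {ms:.4f} ms")
```

Event-based timing measures the work queued on the stream rather than the time the host spends launching it, which matters once buffers live in GPU memory.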
