
When height and width change, inference speed slows down significantly. #423

Open
serend1p1ty opened this issue Jan 6, 2025 · 9 comments


@serend1p1ty

Whenever the height or width changes, pipe.prepare_run needs to be re-executed, which is very time-consuming. Is there a better approach?

@feifeibear
Collaborator

Which script are you using?

@serend1p1ty
Author

serend1p1ty commented Jan 6, 2025

It seems that xDiT requires the resolution used in prepare_run to match the resolution of the actual call. If prepare_run uses 1152x1152 but the actual call uses 1056x1056, inference is very slow.

@serend1p1ty
Author

serend1p1ty commented Jan 6, 2025

@feifeibear
example/run.sh

...
TASK_ARGS="--height 1152 --width 1152 --no_use_resolution_binning"

N_GPUS=6
PARALLEL_ARGS="--ulysses_degree 6"

COMPILE_FLAG="--use_torch_compile"
...

flux_example.py

  output = pipe(
      height=1152,
      width=1152,
      prompt=input_config.prompt,
      num_inference_steps=input_config.num_inference_steps,
      output_type=input_config.output_type,
      max_sequence_length=256,
      guidance_scale=0.0,
      generator=torch.Generator(device="cuda").manual_seed(input_config.seed),
  )

It takes 5s. If we modify flux_example.py as follows:

  output = pipe(
      height=1056,
      width=1056,
      prompt=input_config.prompt,
      num_inference_steps=input_config.num_inference_steps,
      output_type=input_config.output_type,
      max_sequence_length=256,
      guidance_scale=0.0,
      generator=torch.Generator(device="cuda").manual_seed(input_config.seed),
  )

It takes 20s.

@feifeibear
Collaborator

I see, you did not run prepare_run with the resolution that is actually used for inference. You can run inference multiple times and see whether it is still slow after the first run.
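
A minimal sketch of that warm-up pattern, assuming the prepare_run signature and the input_config field names used in the example scripts (they may differ across xDiT versions):

import torch

# Warm up at the resolution that will actually be served, so the slow
# first run is paid once up front instead of during the timed call.
# NOTE: the input_config field names and the prepare_run signature are
# assumed from the example scripts; check them against your xDiT version.
target_h, target_w = 1056, 1056

input_config.height = target_h      # assumed field name
input_config.width = target_w       # assumed field name
pipe.prepare_run(input_config)      # re-prepare at the serving resolution

output = pipe(
    height=target_h,
    width=target_w,
    prompt=input_config.prompt,
    num_inference_steps=input_config.num_inference_steps,
    output_type=input_config.output_type,
    max_sequence_length=256,
    guidance_scale=0.0,
    generator=torch.Generator(device="cuda").manual_seed(input_config.seed),
)
# After this first (slow) pass, repeated 1056x1056 calls should be fast.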

@serend1p1ty
Author

@feifeibear Thanks for your response.
Running inference multiple times does solve this problem, but it raises a new one: if users submit tasks at a different resolution every time, how should we handle that?
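
For example, one workaround I am considering is to pre-warm a small fixed set of resolutions at startup and snap each incoming request to the nearest warmed bucket. A rough sketch; the warm_up helper and the assumption that warmed resolutions stay cached are hypothetical:

# Hypothetical sketch: pre-warm a fixed set of resolution buckets at startup,
# then route each request to the closest warmed resolution so user-facing
# calls never pay the prepare_run / recompilation cost.
SUPPORTED_RESOLUTIONS = [(1152, 1152), (1056, 1056), (768, 1344)]

def warm_up(pipe, input_config, height, width):
    # Assumed pattern from the example scripts: prepare_run at the target
    # resolution (field names and signature may differ by xDiT version).
    input_config.height, input_config.width = height, width
    pipe.prepare_run(input_config)

def nearest_bucket(height, width):
    # Snap an arbitrary requested size to the closest warmed resolution.
    return min(
        SUPPORTED_RESOLUTIONS,
        key=lambda hw: abs(hw[0] - height) + abs(hw[1] - width),
    )

# At startup:
#   for h, w in SUPPORTED_RESOLUTIONS:
#       warm_up(pipe, input_config, h, w)
# Per request:
#   h, w = nearest_bucket(req_h, req_w)
#   output = pipe(height=h, width=w, ...)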

@feifeibear
Collaborator

Have you used the torch compile option?

@serend1p1ty
Author

> Have you used the torch compile option?

Yes.

@serend1p1ty
Author

@feifeibear How should we deal with frequent resolution changes? Should we just not use the torch compile option?

@feifeibear
Collaborator

Dynamic shapes with torch.compile are a well-known challenge. We will investigate the problem and see whether we can find a good solution. Let us know if you come across any good ideas.
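
For reference, torch.compile does expose controls for dynamic shapes; here is a minimal standalone sketch (not xDiT-specific), with a toy module standing in for the transformer, of the two standard knobs:

import torch

class TinyBlock(torch.nn.Module):
    """Toy stand-in for a transformer block whose token count varies with resolution."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return self.proj(x)

model = TinyBlock().eval()

# dynamic=True asks the compiler for shape-generic kernels instead of
# specializing on the exact sequence length, trading some per-step speed
# for no recompilation when the resolution (and thus token count) changes.
compiled = torch.compile(model, dynamic=True)

with torch.no_grad():
    out_a = compiled(torch.randn(1, 4096, 64))   # one resolution's token count
    out_b = compiled(torch.randn(1, 4356, 64))   # a different count: no full recompile

# Alternatively, torch._dynamo.mark_dynamic(tensor, dim) marks only the
# sequence dimension as dynamic while keeping other dimensions specialized.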
