
Conversation

@andompesta
Contributor

This is the main PR used to refactor the TRT inference code to:

  • remove the ONNX tracing dependencies
  • support multiple trt-configurations for the same model
  • run the trt-build through a polygraphy subprocess for easier understanding
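For illustration, a minimal sketch of what a subprocess-based polygraphy build can look like; the model path, input name, shapes and precision flag below are assumptions for the example, not the actual values used in this repo:

```python
import subprocess

# Illustrative engine build via the polygraphy CLI instead of its Python API.
# All paths, the input name and the shapes are placeholder assumptions.
cmd = [
    "polygraphy", "convert", "model.onnx",
    "--convert-to", "trt",
    "--fp16",                                  # example precision flag
    "--trt-min-shapes", "latent:[1,4,64,64]",  # min/opt/max profile for a dynamic input
    "--trt-opt-shapes", "latent:[1,4,64,64]",
    "--trt-max-shapes", "latent:[1,4,64,64]",
    "-o", "model.plan",
]
subprocess.run(cmd, check=True)
```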

andompesta and others added 19 commits February 21, 2025 17:45
@andompesta
Contributor Author

andompesta commented Feb 21, 2025

merge 8 description

This PR is in charge of removing the onnx-exporter class, as it is not needed.
The ONNX model comes from the HF repo.
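As an illustration only (the repo id and file name below are placeholders), pulling a pre-exported ONNX file from the HF hub can look like this:

```python
from huggingface_hub import hf_hub_download

# Placeholder repo id / file name: download a pre-exported ONNX model instead of
# tracing and exporting it locally.
onnx_path = hf_hub_download(
    repo_id="some-org/some-model-onnx",
    filename="model.onnx",
)
print(onnx_path)  # local cache path of the downloaded file
```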

@andompesta
Contributor Author

andompesta commented Feb 21, 2025

merge 9 description

This PR is in charge of adding a trt-config class.
This class is responsible for building the trt engine given an ONNX path and various trt-flags.

As the same model might need different trt-configurations depending on which precision is used, a registry is used to collect all the model configurations.
Based on the provided key, the get_config method returns the appropriate model configuration to use.

@timudk here are the main changes.
Each configuration is a dataclass containing (see the sketch below):

  • the needed trt flags
  • a from_model factory method to feed all needed parameters to the config class
  • a get_input_profile method that returns the min and max inputs supported by the built engine
  • the build process is still based on polygraphy, but instead of using the Python API, a subprocess CLI approach is used for clarity

Engine classes are changed to:

  • use the trt-config class instead of mixin and exporter classes
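A minimal sketch of the registry/dataclass pattern described above; all class, key and field names here are illustrative and not the ones used in the PR:

```python
from dataclasses import dataclass

_CONFIG_REGISTRY: dict[str, type] = {}

def register_config(key: str):
    # Collect config classes under a precision/model key.
    def wrapper(cls):
        _CONFIG_REGISTRY[key] = cls
        return cls
    return wrapper

def get_config(key: str) -> type:
    # Return the config class registered for the given key.
    return _CONFIG_REGISTRY[key]

@dataclass
class TRTConfig:
    onnx_path: str
    max_batch_size: int = 1
    fp16: bool = False  # example trt flag

    @classmethod
    def from_model(cls, model, onnx_path: str) -> "TRTConfig":
        # Feed whatever parameters the build needs from the model object.
        return cls(onnx_path=onnx_path)

    def get_input_profile(self) -> dict[str, tuple]:
        # (min, opt, max) shapes per input, consumed by the engine build.
        return {"latent": ((1, 4, 64, 64), (1, 4, 64, 64), (1, 4, 64, 64))}

@register_config("bf16")
@dataclass
class BF16Config(TRTConfig):
    bf16: bool = True

# Example lookup by key, then construction from a (here absent) model object.
config = get_config("bf16").from_model(model=None, onnx_path="model.onnx")
```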

@andompesta
Contributor Author

andompesta commented Feb 21, 2025

merge 10 description

This PR implements the changes on the trt-manager to use trt-config classes instead of exporters and mixins.

  1. for each provided model, a trt-config is returned by the _get_trt_configs method
  2. for each trt-config, a trt-engine is built; the trt-build is fully based on the trt-config classes, and there is no more exporter/model_config split
  3. after the trt-engines are built, a trt-runtime is initialized
  4. finally, the engine classes are instantiated with a valid cuda-stream

@timudk this PR greatly simplifies this class, cutting the number of code lines by half.
I'm keeping the trt-config and engine classes separate for clarity. They could be merged into a single, really big class, but I believe that would not provide much benefit.
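Roughly, the flow above could be sketched like this; it reuses the hypothetical get_config/TRTConfig names from the previous comment, and build_engine and Engine are stand-ins for the real build step and engine wrapper:

```python
# Hypothetical stand-ins for the build step and the engine wrapper.
def build_engine(config, engine_path: str) -> None:
    ...  # e.g. the polygraphy subprocess call shown earlier

class Engine:
    def __init__(self, engine_path: str, stream=None):
        self.engine_path = engine_path
        self.stream = stream

def load_engines(models: dict, onnx_dir: str, engine_dir: str, precision: str, stream=None) -> dict:
    engines = {}
    for name, model in models.items():
        # 1. one trt-config per provided model, looked up by precision key
        config = get_config(precision).from_model(model, onnx_path=f"{onnx_dir}/{name}.onnx")
        # 2. build the trt-engine purely from the trt-config
        engine_path = f"{engine_dir}/{name}.plan"
        build_engine(config, engine_path)
        # 3./4. initialize the runtime and instantiate the engine class with a cuda stream
        engines[name] = Engine(engine_path, stream=stream)
    return engines
```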

@andompesta
Contributor Author

andompesta commented Feb 21, 2025

merge 11 description

This PR changes the CLI script to support the new TRT interface.

  1. A try ... except approach is used to guard the tensorrt import: people who do not install the trt dependencies would otherwise hit an import error.
  2. Instead of using
trt: bool = False,
trt_transformer_precision: str = "bf16",

a different set of inputs is used:

  • trt_onnx_dir is used to specify the folder containing the onnx models
  • trt_engine_dir specifies the location where trt engines will be loaded from or built
  • trt_precision specifies the precision of the pipeline. For now the supported precisions are: bf16, fp8 and fp4
    All 3 of these args are needed to run trt engines. I found this approach easier than using a mix of input args and env-variables. @timudk please advise on which approach would be preferred on your side.

The following additional input args can be provided:

  • trt_batch_size is the batch size used to optimize the engine, by default 1
  • trt_static_batch controls whether the engine is built with a static batch size, by default yes
  • trt_static_shape controls whether the engine is built to support a static image shape, by default yes

Note that:

  • Based on the provided trt_precision, a set of input values is generated for the trt-manager class.
  • if TRT is not available but trt input arguments are provided, an error is raised
  • all trt-related input arguments start with a trt prefix
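A rough sketch of the CLI surface described above; the import guard and the argument names match the description, while the error type and the function body are assumptions:

```python
try:
    import tensorrt  # optional dependency; non-TRT installs should still work
    TRT_AVAILABLE = True
except ImportError:
    TRT_AVAILABLE = False

def main(
    # ... existing pipeline arguments ...
    trt_onnx_dir: str | None = None,    # folder containing the onnx models
    trt_engine_dir: str | None = None,  # where engines are loaded from or built to
    trt_precision: str | None = None,   # one of: bf16, fp8, fp4
    trt_batch_size: int = 1,            # batch size the engine is optimized for
    trt_static_batch: bool = True,      # build with a static batch size
    trt_static_shape: bool = True,      # build for a static image shape
):
    trt_requested = any([trt_onnx_dir, trt_engine_dir, trt_precision])
    if trt_requested and not TRT_AVAILABLE:
        raise RuntimeError("trt arguments were provided but tensorrt is not installed")
    # based on trt_precision, derive the input values for the trt-manager here ...
```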

andompesta marked this pull request as ready for review February 21, 2025 18:50