
🐛 [Bug] Encountered bug when using Torch-TensorRT with CUDA-Graph #4126

@wenbingl

Description

Bug Description

Error: `RuntimeError: Failed to extract symbolic shape expressions from source FX graph partition`

```
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Failed to extract symbolic shape expressions from source FX graph partition

  File "/home/wenbingl/scratch/g/t-trt/py/torch_tensorrt/dynamo/conversion/_conversion.py", line 215, in interpret_module_to_result
    "Failed to extract symbolic shape expressions from source FX graph partition"
  )

  File "/home/wenbingl/scratch/g/t-trt/py/torch_tensorrt/dynamo/conversion/_conversion.py", line 343, in convert_module
    module, inputs, settings, engine_cache=engine_cache
  )

  File "/home/wenbingl/scratch/g/t-trt/py/torch_tensorrt/dynamo/_compiler.py", line 1108, in compile_module
    submodule,
    submodule_inputs,
    settings=settings,
    name=name,
    engine_cache=engine_cache,
  )

  File "/home/wenbingl/scratch/g/t-trt/py/torch_tensorrt/dynamo/_compiler.py", line 798, in compile
    gm, trt_arg_inputs, trt_kwarg_inputs, settings, engine_cache
  )
  return trt_gm
```

Preliminary root-cause analysis (may be wrong):

Root cause: `aten.sym_size.int` is registered as a TRT-supported op (it has a converter in `aten_ops_converters.py`). However, its output `meta["val"]` is a `SymInt`, not a `torch.Tensor`. When such a node ends up in a TRT partition whose outputs include that `SymInt` (because a later partition needs it), `_symbolic_shape_capture.extract_symbolic_shape_expressions` fails at:

```python
if not isinstance(out_val, torch.Tensor):  # SymInt is not a Tensor
    return None  # caller then raises the RuntimeError above
```
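A minimal stand-in (not the real Torch-TensorRT code; `FakeTensor`, `FakeSymInt`, and the function below are hypothetical illustrations) shows why a `SymInt` output trips this guard:

```python
# Hypothetical stand-ins: in the real graph, out_val would be the
# torch.Tensor or torch.SymInt stored in node.meta["val"].
class FakeTensor:
    """Stands in for torch.Tensor in this illustration."""

class FakeSymInt:
    """Stands in for torch.SymInt (a symbolic integer, not a tensor)."""

def extract_shape_expressions(partition_outputs):
    """Mimics the guard in extract_symbolic_shape_expressions:
    bail out (return None) as soon as any output is not a tensor."""
    exprs = []
    for out_val in partition_outputs:
        if not isinstance(out_val, FakeTensor):  # a SymInt fails this check
            return None
        exprs.append(f"shape_of({out_val!r})")
    return exprs

# A partition whose outputs are all tensors succeeds...
assert extract_shape_expressions([FakeTensor(), FakeTensor()]) is not None
# ...but one that also returns a SymInt (like sym_size_int_87) yields None,
# which the caller turns into the RuntimeError shown in the trace.
assert extract_shape_expressions([FakeTensor(), FakeSymInt()]) is None
```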

What happens structurally in the decoder:

1. `aten._scaled_dot_product_efficient_attention.default` has no dynamic-shape TRT support, so it stays in non-TRT (`run_on_gpu*`) partitions. This splits the decoder into ~15 alternating TRT/non-TRT partitions.
2. The first TRT partition (`_run_on_acc_0`, ~324 nodes) contains `sym_size_int_87 = aten.sym_size.int(targets, 0)` (batch dim, symbolic `s24`).
3. Later TRT partitions (`_run_on_acc_2`, `_run_on_acc_4`, …) also need `sym_size_int_87` for their own view/reshape operations.
4. So `sym_size_int_87` must be an output of `_run_on_acc_0`, and its `meta["val"]` is a `SymInt`, not a `torch.Tensor`.
5. `extract_symbolic_shape_expressions` sees a non-tensor partition output → returns `None` → `RuntimeError`.
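The structure above can be sketched in plain Python (partition names follow the issue; the graph contents are a hypothetical simplification):

```python
# Hypothetical simplification of the partitioned decoder described above.
# Each partition lists its outputs; "SymInt" marks a symbolic-integer value.
partitions = {
    "_run_on_acc_0": {"outputs": {"hidden_0": "Tensor",
                                  "sym_size_int_87": "SymInt"}},
    "_run_on_gpu_1": {"outputs": {"attn_out_1": "Tensor"}},  # SDPA stays off TRT
    "_run_on_acc_2": {"outputs": {"hidden_2": "Tensor"}},    # consumes sym_size_int_87
}

# TRT partitions whose outputs include a non-tensor value are exactly the
# ones on which extract_symbolic_shape_expressions would fail.
bad = [
    name
    for name, part in partitions.items()
    if name.startswith("_run_on_acc")
    and any(kind != "Tensor" for kind in part["outputs"].values())
]
print(bad)  # ['_run_on_acc_0']
```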

Fix direction: in `_symbolic_shape_capture.py`, handle the case where a partition output is a `SymInt` (skip it, or represent it as a scalar int shape expression instead of failing outright). Alternatively, prevent `aten.sym_size.int` from being placed inside TRT partitions: keep it in the parent graph as a shared input to all submodules that need it.
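The first direction could look roughly like this (a sketch only, not the real `_symbolic_shape_capture.py`; the class names and tuple encoding are hypothetical):

```python
# Hypothetical sketch of the first fix direction: tolerate SymInt partition
# outputs instead of aborting the whole extraction.
class FakeTensor:
    """Stands in for torch.Tensor."""

class FakeSymInt:
    """Stands in for torch.SymInt; carries its symbolic expression (e.g. 's24')."""
    def __init__(self, expr):
        self.expr = expr

def extract_shape_expressions_tolerant(partition_outputs):
    exprs = []
    for out_val in partition_outputs:
        if isinstance(out_val, FakeSymInt):
            # Record the SymInt as a scalar shape expression rather than
            # returning None for the whole partition.
            exprs.append(("scalar", out_val.expr))
        elif isinstance(out_val, FakeTensor):
            exprs.append(("tensor_shape", out_val))
        else:
            return None  # genuinely unsupported output type
    return exprs

# A partition returning a tensor plus sym_size_int_87 now succeeds.
result = extract_shape_expressions_tolerant([FakeTensor(), FakeSymInt("s24")])
assert result is not None
assert ("scalar", "s24") in result
```

The second direction (hoisting `aten.sym_size.int` into the parent graph) avoids the question entirely, at the cost of touching the partitioner rather than the shape-capture pass.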

To Reproduce

Steps to reproduce the behavior:

Run the internal model with `--cuda-graph`; contact the bug owner for the model information.

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

Metadata

Labels

bug (Something isn't working)