Inputs are reordered by TensorRT provider #22729
Comments
Typically people feed inputs by name rather than relying on position.
That's strange. Do the Python bindings use a C function that returns the complete input array instead of building it with GetInputNameAllocated, which is what we use? Are you sure you use the same TensorRT version? As the error only appears with the TensorRT provider, I presume onnxruntime reads back the input list after letting TensorRT do the parsing. I think there are also two different ONNX parsers you can use for TensorRT; I never understood the difference or which one to prefer, and we only build onnxruntime without trying to modify the parsing. Previously we had a build based on TensorRT 8.x and ORT 1.17 which did not show this error.
@chilo-ms can you help take a quick look?
By the way, can you elaborate on "there are two different onnx parsers you can use for TRT"?
It's still better to rely on the input name rather than the position. When you call InferenceSession.Run() you are feeding the values by input name anyway.
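For illustration, a minimal sketch of feeding by name with the C++ API; it assumes a `session` and two input tensors already exist, and the output names are placeholders, not taken from the model:

```cpp
// Minimal sketch, assuming `session` (Ort::Session) and two Ort::Value input
// tensors already exist; the output names below are placeholders.
const char* input_names[]  = {"input1", "input2"};
const char* output_names[] = {"output1", "output2", "output3"};
Ort::Value inputs[] = {std::move(input_tensor1), std::move(input_tensor2)};

std::vector<Ort::Value> outputs = session.Run(
    Ort::RunOptions{nullptr},
    input_names, inputs, 2,   // values are matched to names, not to index order
    output_names, 3);
```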
That's strange and I can't repro from my side. On Linux, I tested the following code:
and the output is: On Windows, I had an issue linking my test app against onnxruntime.dll, so I turned to onnxruntime_perf_test and added session.GetInputNameAllocated(i, allocator).get() to it.
Also, I don't think it has anything to do with the TensorRT parser. The model inputs kept inside the ORT session won't be affected by EPs. Re: "I presume onnxruntime reads back the input list after letting TrT do the parsing": that needs more investigation. Could you share the whole code to help us repro?
I did some research into this. I think it is a bug in the "fake" onnx file that points to the engine file after optimizing with trt_dump_ep_context_model = true (a sketch of enabling this option follows below). When loading this file back on the next run, the inputs are reordered; this is why your little test program, which does not do caching, works. Note that we have another test network with two inputs but only one output, which does not show this erroneous behaviour.
This is the resulting file that points to the engine file, if I understand your system correctly.
This was confirmed by listing its contents using our Julia integration of ONNX: julia> using ONNXLowLevel
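For context, a minimal sketch of how the engine/EP-context caching in question can be enabled through the V2 TensorRT provider options; the option values and cache path here are assumptions and may differ from the exact setup used:

```cpp
#include <onnxruntime_cxx_api.h>

// Minimal sketch: enable dumping of the EP-context ("fake") onnx file that
// wraps the serialized TensorRT engine. Values and paths are assumptions.
void AppendTensorRTWithEngineCache(Ort::SessionOptions& options) {
  const OrtApi& api = Ort::GetApi();
  OrtTensorRTProviderOptionsV2* trt_options = nullptr;
  Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&trt_options));

  const char* keys[]   = {"trt_dump_ep_context_model", "trt_ep_context_file_path"};
  const char* values[] = {"1", "trt_cache"};
  Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(trt_options, keys, values, 2));

  options.AppendExecutionProvider_TensorRT_V2(*trt_options);
  api.ReleaseTensorRTProviderOptions(trt_options);
}
```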
I can repro now and thanks for pointing out the issue. |
Describe the issue
When loading the onnx file inside in2out3.zip there should be two inputs: the one at index 0 should be named input1 and the one at index 1 input2.
With the CPU or DML provider enabled this works, but with the TensorRT provider they are in the opposite order.
We have other onnx files with more than one input and/or output and there the reordering does not happen.
A guess is that the ONNX parser used by TensorRT uses a hash map or something else that does not preserve the index-to-input mapping.
I'm not sure what the actual guarantees of your API are, but I'm sure more users than us rely on the index-based ports being in the ONNX data order.
To reproduce
Load the onnx file into an Ort::Session set up for TensorRT processing. Get the input names using code like the following:
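A minimal sketch of such a loop, assuming the C++ API and the model file extracted from in2out3.zip (the file path is a placeholder):

```cpp
#include <onnxruntime_cxx_api.h>
#include <iostream>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "repro");

  Ort::SessionOptions options;
  OrtTensorRTProviderOptions trt_options{};  // default TensorRT options, device 0
  options.AppendExecutionProvider_TensorRT(trt_options);

  // Path is a placeholder; use the model from in2out3.zip.
  Ort::Session session(env, ORT_TSTR("in2out3.onnx"), options);

  Ort::AllocatorWithDefaultOptions allocator;
  for (size_t i = 0; i < session.GetInputCount(); ++i) {
    // Print the input names in index order.
    std::cout << session.GetInputNameAllocated(i, allocator).get() << std::endl;
  }
  return 0;
}
```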
It will output:
Input2
Input1
for the TensorRT provider and the reverse (correctly) for other providers.
Urgency
No response
Platform
Windows
OS Version
Windows 11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.19.2
ONNX Runtime API
C++
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
TensorRT 10.4.0.26