Describe the issue
Situation:
I am loading an ONNX model (YOLOv5) with the TensorRT execution provider, which takes 4 minutes on a Jetson Orin.
I successfully sped this up by caching the TensorRT engine:
Ort::SessionOptions session_options;

OrtTensorRTProviderOptions trt_options{};
trt_options.device_id = 0;
trt_options.trt_max_workspace_size = 2147483648;
//trt_options.trt_max_partition_iterations = 10;
trt_options.trt_min_subgraph_size = 1;
trt_options.trt_fp16_enable = 0;
trt_options.trt_int8_enable = 0;
//trt_options.trt_int8_use_native_calibration_table = 1;
trt_options.trt_engine_cache_enable = 1;
//trt_options.trt_dump_ep_context_model = 1; // desired, but not available in the legacy struct, only in the V2 options below
trt_options.trt_engine_cache_path = "./cache";
//trt_options.trt_dump_subgraphs = 1;
session_options.AppendExecutionProvider_TensorRT(trt_options); // add the TRT options to the session options
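For completeness, the session itself is then created from an in-memory copy of the ONNX model rather than from a path on disk; a minimal sketch of that step (the Env name and the getModelBytes() helper are only placeholders for however the model bytes end up in RAM):

Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "yolov5");
// getModelBytes() is a hypothetical helper that returns the ONNX model
// already held in memory (e.g. decrypted or embedded in the binary).
std::vector<uint8_t> model_bytes = getModelBytes();
// Create the session directly from the in-memory buffer instead of a file path.
Ort::Session session(env, model_bytes.data(), model_bytes.size(), session_options);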
But I am not certain about model security when saving the engine (we currently load the model from RAM, so no files are exposed to a user who has access to the system). Is the TensorRT engine secure, or could anyone run inference from the engine file alone? In particular: are the model weights contained inside the engine, or is the engine only some kind of "metadata" that works only in combination with the model file itself (whether for ONNX, native TensorRT, or some hypothetical custom inference engine)?
That is why I would like to embed the engine into an ONNX file (EP context model) and load that model from RAM as before.
If I understand correctly, that should be possible?
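What I would like to end up with is reading the dumped context model into memory and creating the session from that buffer, exactly like with the plain ONNX model today; a rough sketch of the intended usage (the getContextModelBytes() helper and the "_ctx.onnx" naming are only my assumptions):

// getContextModelBytes() is a hypothetical helper returning the generated
// EP-context model (e.g. "model_ctx.onnx") already loaded into RAM.
std::vector<uint8_t> ctx_bytes = getContextModelBytes();
// Same in-memory session constructor as before, just fed with the context model.
Ort::Session ctx_session(env, ctx_bytes.data(), ctx_bytes.size(), session_options);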
For that, I add another provider in addition to the OrtTensorRTProviderOptions trt_options:
const OrtApi& api = Ort::GetApi();
OrtTensorRTProviderOptionsV2* tensorrt2_options = nullptr;

std::vector<const char*> option_keys2 = {
    "trt_engine_cache_enable"
    ,"trt_dump_ep_context_model"
    ,"trt_ep_context_file_path"
    ,"ep_context_enable"
    ,"ep_context_file_path"
    ,"trt_ep_context_embed_mode"
    ,"trt_engine_cache_path"
    //,"trt_timing_cache_enable"
    //,"trt_timing_cache_path"
};
std::vector<const char*> option_values2 = {
    "1"
    ,"1"
    ,"/path1" // sub-path, according to https://app.semanticdiff.com/gh/microsoft/onnxruntime/pull/19154/overview
    ,"1"
    ,"/path2" // base path, according to https://app.semanticdiff.com/gh/microsoft/onnxruntime/pull/19154/overview
    ,"1"
    ,"/path3"
    //,"1"
    //,"/path4"
};
Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&tensorrt2_options));
Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(tensorrt2_options, option_keys2.data(), option_values2.data(), option_keys2.size()));
session_options.AppendExecutionProvider_TensorRT_V2(*tensorrt2_options); // add the V2 TRT options to the session options
However, with this I am getting an error.
How do I do this correctly?
To reproduce
Urgency
No response
Platform
Other / Unknown
OS Version
Jetson Orin Linux
ONNX Runtime Installation
Other / Unknown
ONNX Runtime Version or Commit ID
11.4
ONNX Runtime API
C++
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
No response