I successfully converted the meta-llama/Llama-3.2-11B-Vision model from PyTorch to ONNX, but when I try to run it with ONNX Runtime using the CPU EP, I get the error below. My package versions are:
onnx 1.16.1
onnxruntime 1.20.0
onnxscript 0.1.0.dev20241104
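For reference, here is a minimal sketch of the relevant part of infer_llama_vqa.py. The InferenceSession call matches the traceback below; the model path and the onnx.load step are placeholders for how my script produces the model bytes:

```python
import onnx
import onnxruntime as ort

# Placeholder path; the real model comes from my torch-to-ONNX export.
mllama_onnx_model = onnx.load("llama-3.2-11b-vision.onnx")

sess_options = ort.SessionOptions()
providers = ["CPUExecutionProvider"]

# This call fails with RuntimeError: narrowing_error (full log below).
mllama_session = ort.InferenceSession(
    mllama_onnx_model.SerializeToString(),
    sess_options=sess_options,
    providers=providers,
)
```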
2024-11-05 11:38:58.259321744 [I:onnxruntime:, inference_session.cc:583 TraceSessionOptions] Session Options { execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath:"" enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:0 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: { } }
2024-11-05 11:38:58.259406334 [I:onnxruntime:, inference_session.cc:483 operator()] Flush-to-zero and denormal-as-zero are off
2024-11-05 11:38:58.259423922 [I:onnxruntime:, inference_session.cc:491 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2024-11-05 11:38:58.259436213 [I:onnxruntime:, inference_session.cc:509 ConstructorCommon] Dynamic block base set to 0
2024-11-05 11:39:19.692184563 [I:onnxruntime:, inference_session.cc:583 TraceSessionOptions] Session Options { execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath:"" enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:0 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: { } }
2024-11-05 11:39:19.692245285 [I:onnxruntime:, inference_session.cc:491 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2024-11-05 11:39:19.692262042 [I:onnxruntime:, inference_session.cc:509 ConstructorCommon] Dynamic block base set to 0
*************** EP Error ***************
EP Error narrowing_error when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Traceback (most recent call last):
  File "/home/duyan/anaconda3/envs/llama/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 465, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/duyan/anaconda3/envs/llama/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 528, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
RuntimeError: narrowing_error

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data4/duyan/xeams/code/pipeline/infer_llama_vqa.py", line 26, in <module>
    mllama_session = ort.InferenceSession(mllama_onnx_model.SerializeToString(), sess_options=sess_options, providers=providers)
  File "/home/duyan/anaconda3/envs/llama/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 478, in __init__
    raise fallback_error from e
  File "/home/duyan/anaconda3/envs/llama/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 473, in __init__
    self._create_inference_session(self._fallback_providers, None)
  File "/home/duyan/anaconda3/envs/llama/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 528, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
RuntimeError: narrowing_error