Issue description:
When a UNet segmentation model that has been converted to ONNX with FP16 precision is run with the CUDAExecutionProvider, the output is blank. At inference time the model should produce segmentation masks for the input images, but it returns empty results instead.
Expected Behavior:
The output with CUDAExecutionProvider should match the PyTorch output and the output of the other EPs (CPUExecutionProvider and TensorrtExecutionProvider).
Results:
To reproduce
Comparison of the outputs of PyTorch and of ONNX Runtime with each Execution Provider (EP): CPUExecutionProvider, CUDAExecutionProvider, and TensorrtExecutionProvider
from segmentation_models_pytorch import Unet
import torch
torch.manual_seed(0)
import torchvision.transforms as transforms
import cv2
import numpy as np
import onnxruntime as ort
import matplotlib.pyplot as plt
print(f"Torch version: {torch.__version__}")
print(f"Onnxruntime version: {ort.__version__}")
image_path = "n01644373_tree_frog.JPEG" # Replace with your image path
input_image = cv2.imread(image_path)
input_image = cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB)
input_image_resized = cv2.resize(input_image, (512, 512))
plt.imshow(input_image)
plt.axis("off")
plt.show()
input_data = preprocess(input_image_resized)
input_data = input_data.unsqueeze(0)
input_data = input_data.numpy()
# Check if a GPU is available and use it if possible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Define the Unet model
def load_pytorch_model():
    model = Unet(
        encoder_name="timm-efficientnet-b2",
        in_channels=3,
        classes=5,
        encoder_weights="imagenet",
    )
    model.eval()  # Set the model to evaluation mode
    model.to(device)
    return model
The model is converted to FP16, with its inputs and outputs kept as float32
def onnx_conversion(pytorch_model, output_path, enable_fp16=False):
    import onnx
    from onnxconverter_common import float16

    # FP32CastedModel class:
    class FP32CastedModel(torch.nn.Module):
        def __init__(self, model):
            super().__init__()
            self.model = model

        def __call__(self, input):
            with torch.no_grad():
                output = self.model(input)
            return output.to(torch.float32)

    random_tensor = torch.randn(
        size=(1, 3, 512, 512), requires_grad=True, dtype=torch.float32
    ).cuda()
    with torch.no_grad():
        torch.onnx.export(
            model=FP32CastedModel(pytorch_model),
            f=output_path,
            args=random_tensor,
            export_params=True,
            input_names=["input"],
            output_names=["output"],
            do_constant_folding=True,
            opset_version=17,
            dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
        )
    model = onnx.load(output_path)
    if enable_fp16:
        model = float16.convert_float_to_float16(model, keep_io_types=True)
        onnx.save(model, output_path)
    return model
Using pytorch model to generate onnx model
This step emits warnings such as:
UserWarning: the float32 number -3.94251777890986e-08 will be truncated to -1e-07
warnings.warn("the float32 number {} will be truncated to {}".format(neg_max, -min_positive_val))
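The warning comes from the FP16 conversion step: float32 constants whose magnitude is below the converter's smallest allowed positive value are clamped to that value instead of being flushed to zero. The sketch below mimics this clamping in plain Python so the numbers in the warning can be reproduced. `clamp_for_fp16` is a hypothetical helper written for illustration, not part of onnxconverter_common, and the thresholds `1e-7` and `1e4` are assumed to match the defaults of the `min_positive_val` and `max_finite_val` parameters of `float16.convert_float_to_float16`; check them against the installed version.

```python
def clamp_for_fp16(value, min_positive_val=1e-7, max_finite_val=1e4):
    """Clamp one float32 constant roughly the way an FP16 converter would.

    Illustrative only: the thresholds are assumed defaults, and special
    values such as inf/NaN are not handled here.
    """
    if 0 < value < min_positive_val:
        return min_positive_val       # tiny positive -> smallest allowed positive
    if -min_positive_val < value < 0:
        return -min_positive_val      # tiny negative -> smallest allowed negative
    if value > max_finite_val:
        return max_finite_val         # too large -> largest allowed finite value
    if value < -max_finite_val:
        return -max_finite_val
    return value                      # already inside the allowed range


# The constant from the warning above is clamped to -1e-07.
print(clamp_for_fp16(-3.94251777890986e-08))
# A large constant is clamped to the assumed upper bound.
print(clamp_for_fp16(123456.0))
```

The warning only concerns constants stored in the model (weights and initializers). It says nothing about intermediate activations, which are computed at run time by the execution provider.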
Preprocessing Transformations for Input Images
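The script calls `preprocess(...)` on the resized image, but the definition of `preprocess` is not included in the text. A minimal sketch of what such a transform could look like is given below. It is an assumption, not the reporter's code: the helper name `build_preprocess` is hypothetical, and the ImageNet mean and standard deviation are a guess based on `encoder_weights="imagenet"`.

```python
# Assumed ImageNet channel statistics (RGB order); the values used in the
# original notebook are not shown, so these are a guess.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def build_preprocess():
    """Return a transform: HxWx3 uint8 RGB image -> normalized 3xHxW float tensor."""
    # Imported inside the function so the constants above can be used
    # without torchvision being installed.
    import torchvision.transforms as transforms

    return transforms.Compose([
        transforms.ToTensor(),  # uint8 HWC in [0, 255] -> float32 CHW in [0, 1]
        transforms.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
    ])
```

With this sketch, `preprocess = build_preprocess()` would have to run before the line `input_data = preprocess(input_image_resized)`.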
Load and preprocess the input image using OpenCV
Image Reference
Define the Unet model
Pytorch
Define Pytorch inference and post process
Final output - resize to original image size
Pytorch Output
Onnxruntime EPs
Define model conversion to onnx
Model is converted to fp16 with input and output of model kept as float32
Using pytorch model to generate onnx model
Define loading of the ONNX model separately for each EP
For CUDAExecutionProvider, the flag cudnn_conv_use_max_workspace is set to 1.
For TensorrtExecutionProvider, the flag trt_fp16_enable is set to True for the FP16 model.
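The session-loading code is not included in the text, so the following is only a sketch of how the two flags mentioned above could be passed to ONNX Runtime through the `providers` argument of `ort.InferenceSession`. The helper names `providers_for` and `load_session` are hypothetical; only the provider names and the two option keys come from the issue.

```python
def providers_for(ep_name, fp16=True):
    """Build the `providers` list for one execution provider (EP)."""
    if ep_name == "CUDAExecutionProvider":
        # Flag from the issue: cudnn_conv_use_max_workspace set to 1.
        return [("CUDAExecutionProvider", {"cudnn_conv_use_max_workspace": "1"})]
    if ep_name == "TensorrtExecutionProvider":
        # Flag from the issue: trt_fp16_enable set to True for the FP16 model.
        return [("TensorrtExecutionProvider", {"trt_fp16_enable": fp16})]
    if ep_name == "CPUExecutionProvider":
        return ["CPUExecutionProvider"]
    raise ValueError(f"Unsupported execution provider: {ep_name}")


def load_session(onnx_path, ep_name, fp16=True):
    """Create an InferenceSession that uses exactly one EP."""
    import onnxruntime as ort  # imported here so providers_for works without it

    return ort.InferenceSession(onnx_path, providers=providers_for(ep_name, fp16))
```

Creating one session per EP, rather than passing a fallback list, keeps the comparison clean: each output is then known to come from a single provider.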
Loading model with all EPs separately
Inference with all Execution Providers(EPs) separately
Final Results - input image and predicted outputs
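A numeric summary can complement the visual comparison of the predicted masks. The sketch below checks a candidate output (for example, the raw output from CUDAExecutionProvider) against a reference (for example, the PyTorch or CPUExecutionProvider output) for NaN or infinite values, for an all-zero result, and for the largest absolute difference. The helper `summarize_output` is hypothetical and not part of the original notebook.

```python
import numpy as np


def summarize_output(reference, candidate):
    """Compare a candidate model output against a reference output."""
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    finite = np.isfinite(candidate)  # True where candidate is neither NaN nor inf
    diff = np.abs(reference - candidate)
    return {
        "has_nan": bool(np.isnan(candidate).any()),
        "has_inf": bool(np.isinf(candidate).any()),
        "all_zero": bool(not candidate.any()),
        # Largest difference over the finite entries only; NaN if none are finite.
        "max_abs_diff": float(diff[finite].max()) if finite.any() else float("nan"),
    }


# Toy example: one entry differs by 0.5 and one entry is NaN.
print(summarize_output([0.0, 1.0, 2.0], [0.0, 1.5, float("nan")]))
```

Running this on the raw outputs, before any argmax or resizing, would show whether a blank mask corresponds to NaN values, zeros, or merely different logits.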
Urgency
High, using TensorrtExecutionProvider for now
Platform
Windows
OS Version
11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.20.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
12.6
cuDNN Library Version
9.5.1