
YOLO V8 to ONNX Quantized model conversion compatibility #80

Open · dadanugm opened this issue May 4, 2024 · 9 comments

dadanugm commented May 4, 2024

Hi everyone.
I'd like to discuss quantized models for the AMD Ryzen AI IPU for object detection. I need input from AMD on how to prepare a suitable trained model for the IPU, so I can use the IPU for inference.

I have a pre-trained YOLOv8 model, which I converted to ONNX using:

```python
model.export(format="onnx")  # export the model to ONNX format
```

Then I added pre-processing to the model:

```python
add_ppp.yolo_detection(input_model_file, output_model_file, "jpg", onnx_opset=18)
```

and ran ONNX inference:

```python
session = ort.InferenceSession(str(onnx_model_file), providers=providers, sess_options=session_options)
```
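One common pitfall worth double-checking here (an assumption on my part, not something the thread confirms): a mismatch between the letterbox resize Ultralytics applies at inference time and the pre-processing baked into the ONNX model can by itself explain a large accuracy drop. A minimal, self-contained sketch of the letterbox math, with all names hypothetical:

```python
# Hypothetical sketch of YOLO-style "letterbox" scaling: resize into a square
# canvas while preserving aspect ratio, padding the remainder. If the ONNX
# pre-processing stretches instead of letterboxing, boxes will be misplaced.

def letterbox_params(src_w, src_h, dst=640):
    """Return (scale, pad_x, pad_y) for fitting src into a dst x dst canvas."""
    scale = min(dst / src_w, dst / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst - new_w) / 2
    pad_y = (dst - new_h) / 2
    return scale, pad_x, pad_y

def unletterbox(x, y, scale, pad_x, pad_y):
    """Map a coordinate from the letterboxed frame back to the source image."""
    return (x - pad_x) / scale, (y - pad_y) / scale

scale, px, py = letterbox_params(1280, 720)
print(scale, px, py)  # 0.5 0.0 140.0
```

If the two pipelines disagree on `pad_y` here, every predicted box is shifted vertically, which looks exactly like "really poor" detections.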

But the results of ONNX inference with the converted model are much poorer than the YOLOv8 inference results. After ONNX quantization, detection got even worse.
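To put a number on the gap between the two pipelines rather than eyeballing screenshots, one option is to match the YOLOv8 and ONNX detections box-by-box with IoU. A small sketch (hypothetical helper, boxes given as `(x1, y1, x2, y2)`):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

# A converted model whose boxes drift will show IoU well below 1.0 against
# the float model's boxes for the same objects.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.1428...
```

Comparing mean IoU before and after quantization separates "boxes are shifted" (a pre/post-processing bug) from "boxes vanish" (a genuine quantization accuracy loss).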

onnx 1.16.1
onnxruntime 1.17.3
Ultralytics 8.2.1

I'd like to know the recommended way to train the model, or to convert it to ONNX, so that it is highly compatible with the IPU.

Thanks.

result yolov8 (screenshot: result_yoloV8)

result ONNX conversion (screenshot: result_onnx)

uday610 (Collaborator) commented May 4, 2024

Instead of ONNX PTQ, you can try QAT (quantization-aware training) using PyTorch flow. There is a tutorial at https://github.com/amd/RyzenAI-SW/tree/main/tutorial/yolov8_e2e that has a QAT flow.

dadanugm (Author) commented May 5, 2024

From the code snippet, it looks like I can train a custom model on my own dataset to produce my object-detection model, using the command below?

```shell
yolo detect train data="datasets/coco.yaml" model=${WEIGHTS} pretrained=True sync_bn=True \
  epochs=${EPOCH} batch=${BATCH} optimizer="AdamW" device=${GPU_ID} lr0=0.0001 nndct_quant=True \
  --nndct_convert_sigmoid_to_hsigmoid --nndct_convert_silu_to_hswish
```

It seems this trains the model on the local machine. Since I don't have a GPU, could it also run on Colab?

Thanks

dadanugm (Author) commented May 6, 2024

Any idea which package I need to install to resolve SyntaxError: 'nndct_convert_sigmoid_to_hsigmoid' is not a valid YOLO argument?
I ran that yolo detect command and it throws this error. As the screen capture shows, I ran this with the AMD AI software installed in a Conda venv. I already updated my venv using env_setup.sh but still hit this error.

(screenshot)

Thanks

uday610 (Collaborator) commented May 7, 2024

Hi @dadanugm, the instructions in the tutorial ask you to use Docker. Are you running inside Docker?

dadanugm (Author) commented May 8, 2024

Hi @uday610. Thanks for the pointer! I was able to resolve the syntax error by running it in Docker.

But I got another error after resolving that one :)). It seems the code requires CUDA to run.
(screenshot)
I searched for solutions; some suggest reinstalling torch as a CPU-only build (torch+cpu) and adding the parameter device=cpu, but the assertion error keeps coming back.

Also, when I run run_qat.sh, another error pops up.
(screenshot)
It looks like the container is missing a file?
Did you see a similar error on your end? Do you have any input on how to solve these issues?

Thanks

fanz-xlnx (Collaborator) commented
Please run PTQ first; then you should be able to get the JSON file. You can use the quantized model to run QAT afterwards.

dadanugm (Author) commented May 9, 2024

Hi @fanz-xlnx. Thanks for the input.

Any idea how to get past raise AssertionError("Torch not compiled with CUDA enabled")? This blocks me from running run_ptq; in this case, I don't have a CUDA/NVIDIA GPU.

My machine is an AMD Ryzen 9 7940HS w/ Radeon 780M Graphics. I would be happy if I could use it in place of CUDA for training.

Thanks
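The device=cpu fallback mentioned above can be sketched as below; this is deliberately torch-free so it stays illustrative, and in real code the availability flag would come from `torch.cuda.is_available()` (all names here are hypothetical):

```python
# Hypothetical sketch: choose a training device with a CPU fallback instead of
# asserting that CUDA exists. Training scripts that hard-code "cuda" raise
# "Torch not compiled with CUDA enabled" on CPU-only machines.

def pick_device(cuda_available, requested="cuda"):
    """Return the device string to use, falling back to "cpu" when a CUDA
    device is requested but not actually present."""
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested

print(pick_device(False))           # cpu  (the no-GPU case in this thread)
print(pick_device(True, "cuda:0"))  # cuda:0
```

Whether the tutorial's PTQ/QAT flow actually supports CPU-only training is a separate question the maintainers would need to confirm.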

dadanugm (Author) commented May 22, 2024

Hi @uday610 @fanz-xlnx

I got a GPU (NVIDIA GTX 1070 with CUDA 12.0) to test the Docker image (https://hub.docker.com/r/amdih/ryzen-ai-pytorch).
(screenshot)

But the AssertionError: Torch not compiled with CUDA enabled still pops up.

I thought installing a CUDA-compatible PyTorch (pytorch 1.12.1+cu113) would resolve the issue, but it hits other errors.
(screenshot)

I'd like to know what the real issue is here: even though I run the Docker container on the GPU, I still can't run the Ryzen AI environment. Do I need a specific GPU to run it?

Thanks.
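One quick sanity check here (an assumption on my part, not something the thread confirms): a `+cuXXX` local tag in a torch wheel's version string marks a CUDA build, while `+cpu` or no tag usually means a CPU-only wheel, which raises exactly this assertion even when a GPU is present. A tiny string-level sketch; in a live session, `torch.version.cuda` and `torch.cuda.is_available()` are the authoritative checks:

```python
# Hypothetical helper: infer from a wheel's version string whether it is a
# CUDA build. "1.12.1+cu113" -> CUDA build; "2.1.0+cpu" or "1.12.1" -> not.

def is_cuda_build(version):
    """Heuristic: treat a "+cuXXX" local version tag as a CUDA build."""
    local = version.split("+", 1)[1] if "+" in version else ""
    return local.startswith("cu")

print(is_cuda_build("1.12.1+cu113"))  # True
print(is_cuda_build("2.1.0+cpu"))     # False
```

If the torch inside the container reports a `+cpu`-style build, the error comes from the image's preinstalled wheel, not from the host GPU setup.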

fanz-xlnx (Collaborator) commented

Thanks for the updated info.
You downloaded the Dockerfile and built the GPU Docker image yourself, right? Did you run into any issues during the build process?

savitha-srinivasan pushed a commit to savitha-srinivasan/RyzenAI-SW that referenced this issue Jul 29, 2024