Question
Hello, I'm trying to run DeepSparse inference on a Raspberry Pi Zero 2 W (quad-core 1 GHz CPU, ARMv8) using the SparseZoo YOLO models (e.g. YOLOv5 pruned-quantized), with either Pipeline or Engine, but prediction is very slow (well over 1 second per predict, roughly 2000 ms to 13000 ms), if it works at all. A rough repro is sketched below.
I was wondering whether ARM is fully supported yet, and how I could increase inference speed on a Raspberry Pi in general (and on the Zero 2 W in particular).
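For reference, here is roughly how I'm invoking it. This is a minimal sketch: the SparseZoo stub and the image path are placeholders rather than my exact setup.

```python
# Minimal sketch of my Pipeline usage (stub and image path are placeholders).
from deepsparse import Pipeline

# One of the SparseZoo YOLOv5 pruned-quantized stubs; the exact stub may differ.
stub = "zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned_quant-aggressive_94"

# Pipeline wraps the engine with image pre- and post-processing.
yolo_pipeline = Pipeline.create(task="yolo", model_path=stub)

# This single call is what takes ~2-13 s on the Zero 2 W.
outputs = yolo_pipeline(images=["sample.jpg"])
print(outputs.boxes, outputs.scores, outputs.labels)
```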
Alternative
I also tried ONNX Runtime and actually got better results (under 1 second per prediction; comparison sketch below), which suggests I should theoretically be able to do much better with the DeepSparse CPU engine, since I've read that DeepSparse can be up to 10x faster than ONNX Runtime with quantization, which I'm using.
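Roughly how I ran the ONNX Runtime comparison; the model path, input shape, and dtype are assumptions for a standard 640x640 YOLOv5 export, so adjust to your own export.

```python
# Sketch of the ONNX Runtime comparison (path/shape/dtype are placeholders).
import time

import numpy as np
import onnxruntime as ort

# Same YOLOv5 model exported to ONNX, run on the plain CPU provider.
sess = ort.InferenceSession("yolov5.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]

# A standard YOLOv5 export expects a (1, 3, 640, 640) float32 input.
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)

sess.run(None, {inp.name: dummy})  # warmup
start = time.perf_counter()
sess.run(None, {inp.name: dummy})
print(f"ONNX Runtime latency: {time.perf_counter() - start:.3f} s")
```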
Additional context
DeepSparse is using the NEON backend (system=neon, binary=neon) on my Pi; is that expected? The snippet below shows how I'm inspecting it.
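For what it's worth, this is how I looked at what DeepSparse detects on the Pi; I'm assuming deepsparse.cpu.cpu_architecture() is the intended introspection helper here.

```python
# Print what DeepSparse detects about the CPU; on my Zero 2 W the reported
# ISA lines up with the "system=neon, binary=neon" message at engine startup.
from deepsparse.cpu import cpu_architecture

arch = cpu_architecture()
print(arch)
```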
Solutions
Would building DeepSparse from source help? Or am I just doing something wrong? To rule out pre/post-processing overhead, I've also been timing the bare engine as sketched below.
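Here is the timing sketch, assuming the Engine constructor and call semantics from the docs; the model path, input shape, and num_cores=4 (the Zero 2 W's core count) are my assumptions.

```python
# Time the bare engine forward pass, without Pipeline pre/post-processing.
import time

import numpy as np
from deepsparse import Engine

# Local ONNX export of the YOLOv5 model; the path is a placeholder.
engine = Engine(model="yolov5.onnx", batch_size=1, num_cores=4)

# Engine takes a list of numpy arrays as input.
dummy = [np.random.rand(1, 3, 640, 640).astype(np.float32)]

engine(dummy)  # warmup
start = time.perf_counter()
engine(dummy)
print(f"Engine-only latency: {time.perf_counter() - start:.3f} s")
```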
Any help would be greatly appreciated.
PS
I know some people have already asked similar questions, but those issues are old and the links no longer work, which is why I thought it was worth opening a new one.