Follow-up on an earlier thread.
I’m observing a consistent mismatch between outputs from onnxruntime and the Hailo runtime (SDK_NATIVE/HAR) for the RESA lane-detection model ("RESA: Recurrent Feature-Shift Aggregator for Lane Detection", AAAI 2021; https://github.com/ZJULearning/resa).
For ONNX conversion, I used your Model Zoo Docker environment and referenced the YOLOv8 README (https://github.com/hailo-ai/hailo_model_zoo/blob/master/training/yolov8/README.rst) and YOLOX README (https://github.com/hailo-ai/hailo_model_zoo/blob/master/training/yolox/README.rst).
Environment
- Platform: Hailo Model Zoo official Docker container (from your repo)
- Model: RESA
- ONNX opsets tried: 11, 13
- Conversion stacks tried (export/simplify sketch after this list):
  - onnx==1.12.0, onnxsim==0.4.13
  - onnx==1.8.1, onnx-simplifier==0.3.5, onnxoptimizer==0.3.7, onnxruntime==1.12.0
- Hailo DFC versions tried: 3.27, 3.28, 3.29, 3.30, 3.31, 3.32
- Input normalization: mean (0, 0, 0), std (255, 255, 255)
- Additional attempt: inserted BatchNorm on all available layers
- Confirmed net_input_format parity; also tested a constant all-ones input
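For completeness, the export path looks roughly like this (a minimal sketch of the onnx==1.12.0/onnxsim==0.4.13 stack; `build_resa_model`, the 288x800 input resolution, and the output names are placeholders for my actual RESA setup, not code from the repo):

```python
import torch
import onnx
from onnxsim import simplify

# Placeholder: wraps the model factory from the ZJULearning/resa repo
model = build_resa_model()
model.eval()

# Placeholder resolution; all-ones tensors were also used for the parity tests
dummy = torch.ones(1, 3, 288, 800)

torch.onnx.export(
    model, dummy, "resa.onnx",
    opset_version=11,               # also tried 13
    input_names=["input"],
    output_names=["seg", "exist"],  # placeholder output names
)

# Simplify the exported graph before handing it to the Hailo parser
simplified, ok = simplify(onnx.load("resa.onnx"))
assert ok, "onnxsim could not validate the simplified model"
onnx.save(simplified, "resa_sim.onnx")
```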
Symptoms
- Intermediate tensors and final outputs diverge between onnxruntime and the Hailo runtime, even with a constant input (a minimal comparison sketch follows this list).
- Discrepancies persist across every opset/DFC combination tried and after BatchNorm insertion.
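To make the reproduction concrete, this is approximately the harness I use (a sketch following the DFC tutorial pattern; exact API details may vary across DFC 3.27–3.32, and `resa.har` plus the input resolution are placeholders):

```python
import numpy as np
import onnxruntime as ort
from hailo_sdk_client import ClientRunner, InferenceContext

# Constant all-ones input: NHWC for Hailo, NCHW for onnxruntime
x_nhwc = np.ones((1, 288, 800, 3), dtype=np.float32)  # placeholder resolution
x_nchw = x_nhwc.transpose(0, 3, 1, 2)

# The HAR applies normalization([0, 0, 0], [255, 255, 255]) via the model
# script, so Hailo gets the raw range and onnxruntime gets the pre-divided tensor.
sess = ort.InferenceSession("resa_sim.onnx", providers=["CPUExecutionProvider"])
ort_out = sess.run(None, {sess.get_inputs()[0].name: x_nchw / 255.0})[0]

# Float emulation of the parsed model (SDK_NATIVE), as in the DFC tutorials
runner = ClientRunner(har="resa.har")
with runner.infer_context(InferenceContext.SDK_NATIVE) as ctx:
    hailo_res = runner.infer(ctx, x_nhwc)
hailo_out = hailo_res[0] if isinstance(hailo_res, list) else hailo_res

# Hailo outputs are NHWC; bring the ONNX NCHW output to the same layout
print("max abs diff:", np.abs(ort_out.transpose(0, 2, 3, 1) - hailo_out).max())
```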
What I need from Hailo
- Parser-related guidance: If parser behavior can cause output differences on RESA-like graphs, is there an official checklist or recommended steps to isolate such issues?
- Per-layer profiling: Is there an official tool/workflow to dump intermediate activations on the Hailo runtime and compare them layer-by-layer against onnxruntime? If yes, please share how to enable it and the expected tensor formats/scales (the onnxruntime side of such a comparison is sketched below).
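On the onnxruntime side I can already dump every intermediate tensor with the standard trick of promoting all value_info entries to graph outputs (sketch below, with a placeholder input shape); what I am missing is the equivalent on the Hailo side, plus the layout/scale conventions needed to compare the dumps.

```python
import numpy as np
import onnx
import onnxruntime as ort

model = onnx.load("resa_sim.onnx")
# Run shape inference so intermediate tensors appear in value_info
model = onnx.shape_inference.infer_shapes(model)

# Promote every intermediate tensor to a graph output
existing = {o.name for o in model.graph.output}
for vi in model.graph.value_info:
    if vi.name not in existing:
        model.graph.output.append(vi)

sess = ort.InferenceSession(model.SerializeToString(),
                            providers=["CPUExecutionProvider"])
x = np.ones((1, 3, 288, 800), dtype=np.float32) / 255.0  # placeholder shape
names = [o.name for o in sess.get_outputs()]
outs = sess.run(names, {sess.get_inputs()[0].name: x})
dumps = dict(zip(names, outs))  # tensor name -> activation, for layer-by-layer diffing
```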
Thank you in advance for your guidance. I’m ready to share all artifacts and run any additional checks you recommend.