Hello,
I am working on deploying a custom-trained YOLO object detection model on a Raspberry Pi 5 with Hailo AI HAT (Hailo-8).
The model performs correctly on my development machine in both PyTorch (.pt) and ONNX formats, but after converting it to HEF, the detections become incorrect.
System Information
Target device:
- Raspberry Pi 5
- Hailo AI HAT+ (Hailo-8, 26 TOPS)
Software environment:
- Hailo AI Software Suite Docker: 2025-10
- DFC version: 3.33.0
- HailoRT: 4.23.0
- OS: Raspberry Pi OS / Linux
- Inference via:
  - Custom Python scripts
  - hailo-apps GStreamerDetectionApp pipeline
Model Details
- Model type: custom YOLO-style object detection
- Classes: 1 class (product)
- Training framework: Ultralytics YOLO
- Export pipeline: model.pt → ONNX → HEF

ONNX export command:

```
yolo export model=model.pt format=onnx imgsz=640 simplify=True nms=False
```
Observed Behavior
1) PyTorch (.pt) inference
- Accuracy: >90%
- Correct bounding boxes
- Correct object locations

2) ONNX inference
- Also works correctly
- Same bounding boxes as .pt
- Using direct resize preprocessing

3) HEF inference on Hailo

Problem:
- Incorrect detections
- Fragmented, small grid-like boxes
- Multiple false boxes where only one object exists
- Misaligned bounding boxes

Example:
- Expected: 1 object detected
- HEF output: 3–10 small incorrect boxes
Preprocessing Comparison

ONNX inference preprocessing:
- Direct resize
- Input: 128×128
- RGB
- float32, normalized to 0–1

HEF inference preprocessing (initially):
- Letterbox to 640×640
- UINT8 input

This mismatch was identified and corrected.

Current HEF preprocessing:
- Direct resize to model input size (same as ONNX)
- RGB
- Tested with both UINT8 and FLOAT32 input

The problem persists.
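For reference, the current HEF-side preprocessing is equivalent to the following minimal sketch (pure-NumPy nearest-neighbor resize used here only so the snippet is self-contained; the real script uses an OpenCV resize, and `INPUT_SIZE` is a placeholder for the model's actual input resolution):

```python
import numpy as np

INPUT_SIZE = 640  # placeholder: must match the HEF input resolution

def preprocess(frame_rgb: np.ndarray, as_float: bool = False) -> np.ndarray:
    """Direct resize (no letterbox), matching the ONNX preprocessing.

    frame_rgb: HxWx3 uint8 RGB image.
    Returns an NHWC tensor: uint8 for the quantized input path,
    or float32 normalized to [0, 1].
    """
    h, w = frame_rgb.shape[:2]
    # Nearest-neighbor resize in pure NumPy (stand-in for cv2.resize).
    ys = (np.arange(INPUT_SIZE) * h / INPUT_SIZE).astype(int)
    xs = (np.arange(INPUT_SIZE) * w / INPUT_SIZE).astype(int)
    resized = frame_rgb[ys][:, xs]
    if as_float:
        return (resized.astype(np.float32) / 255.0)[None]  # [1, H, W, 3]
    return resized[None]  # [1, H, W, 3] uint8

# Example: a dummy 480x640 frame
dummy = np.zeros((480, 640, 3), dtype=np.uint8)
print(preprocess(dummy).shape)  # (1, 640, 640, 3)
```

Both the uint8 and float32 branches of this function were fed to the HEF, with the same broken result.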
Output Format Observations

HEF outputs appear in one of two formats:

Case 1: Hailo NMS format
- Shape: [1, num_classes, num_proposals, 5]
- Per proposal: [ymin, xmin, ymax, xmax, score]

Case 2: Raw detection head
- Possible shape: [1, 5, N]
- Per column: [cx, cy, w, h, confidence]

Both parsing approaches were tested.
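To make the two parsing approaches concrete, this is roughly what I am doing for each case (a sketch; the confidence threshold and the assumption that the NMS-format coordinates are normalized to [0, 1] come from the hailo-apps examples, not from anything I have verified):

```python
import numpy as np

def parse_hailo_nms(out: np.ndarray, conf_thres: float = 0.25):
    """Case 1: post-NMS output, shape [1, num_classes, num_proposals, 5].

    Each proposal is [ymin, xmin, ymax, xmax, score]; coordinates
    assumed normalized. Returns a list of (class_id, box, score).
    """
    dets = []
    for cls_id, proposals in enumerate(out[0]):
        for ymin, xmin, ymax, xmax, score in proposals:
            if score >= conf_thres:
                dets.append((cls_id, (xmin, ymin, xmax, ymax), float(score)))
    return dets

def parse_raw_head(out: np.ndarray, conf_thres: float = 0.25):
    """Case 2: raw head, shape [1, 5, N], rows [cx, cy, w, h, conf].

    Converts center format to corner format; no NMS applied here.
    """
    cx, cy, w, h, conf = out[0]
    keep = conf >= conf_thres
    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    return boxes[keep], conf[keep]

# Synthetic sanity check: one strong detection in each format
nms_out = np.zeros((1, 1, 4, 5), dtype=np.float32)
nms_out[0, 0, 0] = [0.1, 0.2, 0.5, 0.6, 0.9]
print(len(parse_hailo_nms(nms_out)))  # 1

raw_out = np.zeros((1, 5, 3), dtype=np.float32)
raw_out[0, :, 0] = [320, 320, 100, 80, 0.8]
boxes, scores = parse_raw_head(raw_out)
print(boxes.shape)  # (1, 4)
```

Both parsers behave sensibly on synthetic data like the above, which is why I suspect the problem is upstream of parsing.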
What I Have Already Tried

Conversion:
- Different input sizes (320, 640, 1024)
- Re-exporting ONNX with both nms=True and nms=False
- Different .alls scripts
- Different calibration datasets

Inference:
- Matching the ONNX preprocessing exactly
- Testing both UINT8 and FLOAT32 input
- Testing both a custom inference script and the hailo-apps GStreamerDetectionApp

Result:
- ONNX works correctly
- HEF consistently produces incorrect detections
Suspected Root Causes

Based on my experiments, possible issues include:
- Quantization/calibration mismatch
- Incorrect .alls model script
- Anchor/stride/post-processing mismatch
- Incorrect NMS or decode configuration
- Model head not correctly interpreted by the DFC
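For context, the style of .alls script I have been experimenting with follows the model-zoo YOLOv8 examples; the normalization values and the NMS config path below are assumptions on my part (placeholders), not something I have confirmed is correct for a custom single-class model:

```
# Normalize 0-255 input to 0-1 inside the network (so the host sends raw uint8)
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
# Attach Hailo's built-in YOLO decode + NMS; the referenced JSON would
# presumably need classes set to 1 for my model (path is a placeholder)
nms_postprocess("yolov8_nms_config.json", meta_arch=yolov8, engine=cpu)
```

If this pattern is wrong for a custom head, that alone could explain the fragmented grid-like boxes.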
Minimal Reproduction Steps
1. Train a custom YOLO model
2. Export to ONNX
3. Convert to HEF using the DFC
4. Run inference on a Raspberry Pi 5 with Hailo-8
5. Observe fragmented/misaligned boxes
Questions
1. What is the correct conversion flow for custom YOLO models?
2. How can I verify that anchors/strides are correctly interpreted?
3. How can I confirm whether the HEF output is the raw head output or post-NMS output?
4. Is there an official recommended .alls template for custom YOLO models?
5. Could this be caused by calibration dataset issues?
Additional Info Available

I can provide:
- ONNX model
- HEF file
- .alls script
- Calibration dataset samples
- Raw output tensor dumps
- Screenshots of incorrect detections
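The raw output tensor dumps are produced with a trivial helper along these lines (function and directory names are mine), so they can be diffed offline against the ONNX outputs:

```python
import numpy as np
from pathlib import Path

def dump_outputs(outputs: dict, out_dir: str = "dumps") -> list:
    """Save each named output tensor as an .npy file for offline
    comparison against the ONNX outputs. Returns the written paths."""
    d = Path(out_dir)
    d.mkdir(exist_ok=True)
    paths = []
    for name, tensor in outputs.items():
        # Layer names such as "model/conv41" contain '/'; make them file-safe.
        p = d / (name.replace("/", "_") + ".npy")
        np.save(p, np.asarray(tensor))
        paths.append(str(p))
    return paths

print(dump_outputs({"head/out0": np.zeros((1, 5, 10))}))
```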
Expected Behavior

HEF inference should produce:
- The same bounding boxes as ONNX
- The same detection count
- The same object positions