Custom YOLO model works correctly in PyTorch/ONNX but produces wrong detections after conversion to HEF (fragmented boxes, misaligned detections)

Hello,

I am working on deploying a custom-trained YOLO object detection model on a Raspberry Pi 5 with the Hailo AI HAT+ (Hailo-8).

The model performs correctly on my development machine in both PyTorch (.pt) and ONNX formats, but after converting it to HEF, the detections become incorrect.


System Information

Target device:

  • Raspberry Pi 5

  • Hailo AI HAT+ (Hailo-8, 26 TOPS)

Software environment:

  • Hailo AI Software Suite Docker: 2025-10

  • DFC version: 3.33.0

  • HailoRT: 4.23.0

  • OS: Raspberry Pi OS / Linux

  • Inference via:

    • Custom Python scripts

    • hailo-apps GStreamerDetectionApp pipeline


Model Details

  • Model type: Custom YOLO-style object detection

  • Classes: 1 class (product)

  • Training framework: Ultralytics YOLO

  • Export pipeline:

    model.pt → ONNX → HEF
    
    

ONNX export

yolo export model=model.pt format=onnx imgsz=640 simplify=True nms=False


Observed Behavior

1) PyTorch (.pt) inference

  • Accuracy: >90%

  • Correct bounding boxes

  • Correct object locations

2) ONNX inference

  • Also works correctly

  • Same bounding boxes as .pt

  • Using direct resize preprocessing

3) HEF inference on Hailo

Problem:

  • Incorrect detections

  • Fragmented small grid-like boxes

  • Multiple false boxes where only one object exists

  • Misaligned bounding boxes

Example:

  • Expected: 1 object detected

  • HEF output: 3–10 small incorrect boxes


Preprocessing Comparison

ONNX inference preprocessing

  • Direct resize

  • Input: 128×128

  • RGB

  • float32

  • normalized (0–1)
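For concreteness, the direct-resize preprocessing used on the ONNX side can be sketched as follows. This is a minimal, dependency-free sketch (function name is mine; nearest-neighbor indexing stands in for the bilinear `cv2.resize` a real pipeline would normally use):

```python
import numpy as np

def preprocess_direct_resize(frame_rgb, size=640):
    """Direct (non-letterbox) resize + 0-1 float32 normalization.

    Nearest-neighbor indexing keeps the sketch dependency-free; a real
    pipeline would typically use cv2.resize with bilinear interpolation.
    """
    h, w, _ = frame_rgb.shape
    ys = np.arange(size) * h // size      # source row for each output row
    xs = np.arange(size) * w // size      # source column for each output column
    resized = frame_rgb[ys][:, xs]        # (size, size, 3), still uint8
    x = resized.astype(np.float32) / 255.0
    return x[np.newaxis]                  # NHWC batch of 1
```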

HEF inference preprocessing (initially)

  • Letterbox to 640×640

  • UINT8 input

This mismatch was identified and corrected.

Current HEF preprocessing

  • Direct resize to model input size

  • Same as ONNX

  • RGB

  • Tested both:

    • UINT8 input

    • FLOAT32 input

Problem persists.
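One failure mode consistent with grid-like fragment boxes is a quantization mismatch on the tensors themselves: reading a quantized output without applying its scale/zero-point, or normalizing an input that the device also scales. As a hedged sketch, the affine (de)quantization involved looks like this; the actual scale and zero-point values come from the HEF's vstream info, not from these placeholders:

```python
import numpy as np

def dequantize(q, scale, zero_point):
    """Affine dequantization: real = (q - zero_point) * scale."""
    return (np.asarray(q, dtype=np.float32) - zero_point) * scale

def quantize(x, scale, zero_point):
    """Inverse mapping, as applied to UINT8 inputs on-device."""
    q = np.round(np.asarray(x, dtype=np.float32) / scale + zero_point)
    return np.clip(q, 0, 255).astype(np.uint8)
```

A quick sanity check is to dequantize a raw output dump by hand and see whether the resulting coordinates land in a plausible range.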


Output Format Observations

Depending on how the model was converted, the HEF output appears in one of two formats:

Case 1: Hailo NMS format

[1, num_classes, num_proposals, 5]
[ymin, xmin, ymax, xmax, score]

Case 2: Raw detection head

Possible format:

[1, 5, N]
[cx, cy, w, h, confidence]

Both parsing approaches were tested.
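The two parsing approaches can be sketched as follows. These are pure-Python sketches with names of my own; the exact container types returned by HailoRT depend on the API in use, so treat the assumed layouts ([ymin, xmin, ymax, xmax, score] per class for case 1, [cx, cy, w, h, confidence] rows for case 2) as the thing to verify, not as the official decoding:

```python
def parse_nms_by_class(output, score_thresh=0.25, img_w=640, img_h=640):
    """Case 1: one list of proposals per class, each row
    [ymin, xmin, ymax, xmax, score], coordinates assumed normalized 0-1."""
    dets = []
    for class_id, proposals in enumerate(output):
        for ymin, xmin, ymax, xmax, score in proposals:
            if score >= score_thresh:
                dets.append((class_id, score,
                             xmin * img_w, ymin * img_h,
                             xmax * img_w, ymax * img_h))
    return dets

def decode_cxcywh(rows, conf_thresh=0.25):
    """Case 2: raw head rows [cx, cy, w, h, confidence] -> xyxy boxes.
    `rows` is the (N, 5) view, i.e. the [1, 5, N] tensor transposed."""
    dets = []
    for cx, cy, w, h, conf in rows:
        if conf >= conf_thresh:
            dets.append((conf, cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return dets
```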


What I Have Already Tried

Conversion

  • Different input sizes (320, 640, 1024)

  • Re-exporting ONNX with:

    • nms=True

    • nms=False

  • Different .alls scripts

  • Different calibration datasets
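For reference, model scripts for YOLO-family models in the Hailo Model Zoo typically pair an on-chip normalization with an `nms_postprocess` command. A sketch of what such a script might look like for this model (the JSON filename and normalization values are placeholders, not a verified configuration):

```
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
nms_postprocess("yolov8_nms_config.json", meta_arch=yolov8, engine=cpu)
```

Note the interaction with preprocessing: with a `normalization` command in the script, the HEF expects raw 0-255 UINT8 input, so also dividing by 255 on the host would scale the input twice. That is one concrete way the ONNX and HEF pipelines can silently diverge even when they "match".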

Inference

  • Matching ONNX preprocessing exactly

  • Testing both:

    • UINT8 input

    • FLOAT32 input

  • Testing:

    • Custom inference script

    • hailo-apps GStreamerDetectionApp

Result

  • ONNX works correctly

  • HEF consistently produces incorrect detections
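To quantify how far the HEF detections drift from the ONNX ones, a simple per-box IoU check against the ONNX output on the same frame is often enough (sketch; boxes assumed to be (xmin, ymin, xmax, ymax) in pixels):

```python
def iou_xyxy(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def best_matches(onnx_boxes, hef_boxes):
    """For each ONNX box, the highest-IoU HEF box (0.0 if none overlaps)."""
    return [max((iou_xyxy(o, h) for h in hef_boxes), default=0.0)
            for o in onnx_boxes]
```

Uniformly near-zero IoUs point at a decode/coordinate problem; moderate IoUs with many extra boxes point more toward NMS/threshold or quantization issues.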


Suspected Root Causes

Based on experiments:

Possible issues:

  1. Quantization/calibration mismatch

  2. Incorrect .alls model script

  3. Anchor/stride/post-processing mismatch

  4. Incorrect NMS or decode configuration

  5. Model head not correctly interpreted by DFC
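On suspected cause 1: the calibration images fed to the DFC must go through the same preprocessing the network sees at inference (same resize, same channel order, same value range). A sketch of stacking frames into the (N, H, W, 3) array the optimizer consumes (names mine; the 0-1 range shown assumes normalization happens on the host, whereas if the .alls contains a `normalization` command the calibration data should instead stay raw 0-255):

```python
import numpy as np

def build_calib_set(frames_rgb, size=640):
    """Stack preprocessed RGB frames into an (N, size, size, 3) float32
    array for calibration. Nearest-neighbor resize keeps the sketch
    dependency-free; match whatever interpolation inference uses."""
    batch = []
    for frame in frames_rgb:
        h, w, _ = frame.shape
        ys = np.arange(size) * h // size
        xs = np.arange(size) * w // size
        batch.append(frame[ys][:, xs].astype(np.float32) / 255.0)
    return np.stack(batch)
```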


Minimal Reproduction Steps

  1. Train custom YOLO model

  2. Export to ONNX

  3. Convert to HEF using DFC

  4. Run inference on Raspberry Pi 5 with Hailo-8

  5. Observe fragmented/misaligned boxes


Questions

  1. What is the correct conversion flow for custom YOLO models?

  2. How can I verify that anchors/strides are correctly interpreted?

  3. How can I confirm whether the HEF output is:

    • Raw head output

    • Post-NMS output

  4. Is there an official recommended .alls template for custom YOLO models?

  5. Could this be caused by calibration dataset issues?


Additional Info Available

I can provide:

  • ONNX model

  • HEF file

  • .alls script

  • Calibration dataset samples

  • Raw output tensor dumps

  • Screenshots of incorrect detections


Expected Behavior

HEF inference should produce:

  • Same bounding boxes as ONNX

  • Same detection count

  • Same object positions
