HEF inference deviates significantly from ONNX (Cosine ≈ 0.4–0.5)

Environment

Conversion Host

  • Ubuntu 22.04.3 (x86_64), Hailo DFC 3.33.0 / HailoRT 4.23.0

  • ONNXRuntime 1.18.0 (CPU only, no GPU)

  • Python 3.10.12 (venv)

Target Device

  • Raspberry Pi 5 + Hailo-8

  • Debian 12 (aarch64), HailoRT 4.20.0

  • Python 3.11.2, ONNXRuntime 1.23.2


Pipeline

  1. Generated calibration .npy dataset (RGB 288×288, 220 images)

  2. Executed hailo parser → optimize → compiler

  3. Conversion completed successfully (no critical errors)

  4. Compared ONNX vs HEF feature maps (L2: 512×36×36, L3: 1024×18×18) → shapes match perfectly, but numeric similarity is very poor (cosine < 0.6).

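For reference, the per-layer comparison in step 4 was done roughly like this (a minimal sketch; in the real pipeline the two feature maps are dumped from the ONNXRuntime and HailoRT runs, here synthetic arrays stand in so the snippet is self-contained):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Flatten both feature maps and compute their cosine similarity."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-ins for the dumped L2 feature maps (512x36x36);
# in practice these would be np.load()-ed from the ONNX and HEF runs.
rng = np.random.default_rng(0)
onnx_l2 = rng.standard_normal((512, 36, 36)).astype(np.float32)
hef_l2 = onnx_l2 + 0.1 * rng.standard_normal((512, 36, 36)).astype(np.float32)

assert onnx_l2.shape == hef_l2.shape
print(f"L2 cosine: {cosine_similarity(onnx_l2, hef_l2):.3f}")
```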

Question

Has anyone experienced cases where the model’s behavior (feature maps, accuracy, or numeric similarity) becomes severely distorted after the parser or optimize stage, even though the conversion completes successfully?

If so:

  • What were the main causes (e.g., calibration set issues, normalization mismatch, per-channel quantization, etc.)?

  • And how did you resolve them? (e.g., increasing calibration set size, enabling GPU calibration, adjusting preprocessing, or fine-tuning parser options)

In my case, I suspect weak calibration (only 64 samples actually used) and a possible normalization mismatch, but I’d like to know whether others have seen similar degradation and how you addressed it.
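As a first sanity check on both suspicions, I summarize the calibration .npy before handing it to the optimizer: the sample count catches the 64-vs-220 discrepancy, and the value range reveals whether the images are raw 0–255 or already normalized. A sketch (the helper name is mine; a synthetic array with the intended layout stands in for the real `np.load` of the dataset):

```python
import numpy as np

def check_calibration_set(calib: np.ndarray,
                          expected_samples: int = 220) -> dict:
    """Summarize a calibration array: sample count, per-image shape,
    dtype, and value range (raw 0-255 vs. already normalized)."""
    report = {
        "samples": calib.shape[0],
        "image_shape": calib.shape[1:],
        "dtype": str(calib.dtype),
        "min": float(calib.min()),
        "max": float(calib.max()),
    }
    report["sample_count_ok"] = calib.shape[0] == expected_samples
    # If max <= 1.0 the data was likely normalized already; feeding it to
    # a pipeline that normalizes again would explain a large mismatch.
    report["looks_normalized"] = report["max"] <= 1.0 + 1e-6
    return report

# In the real pipeline: calib = np.load("calib_set.npy")
# Synthetic stand-in with the intended layout (220 RGB 288x288 images):
calib = np.random.randint(0, 256, size=(220, 288, 288, 3)).astype(np.uint8)
print(check_calibration_set(calib))
```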

Every model behaves differently, and some layers are more susceptible to quantization issues than others. I recommend you work through the tutorials inside the AI Software Suite. Run the following command to start a Jupyter Notebook server with notebooks for each step of the conversion:

hailo tutorial

You will need a GPU for the more advanced optimization levels. I also recommend reading the Hailo Dataflow Compiler User Guide for further information.
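For example, both the optimization level and on-chip input normalization can be set in the model script passed to the optimize step. A sketch only; verify the exact command names and your model's actual mean/std values against the User Guide for your DFC version (the values below are the common ImageNet statistics, which may not match your model):

```
# model_script.alls (hypothetical file name)
model_optimization_flavor(optimization_level=2)
normalization1 = normalization([123.675, 116.28, 103.53], [58.395, 57.12, 57.375])
```

With normalization done on-chip like this, the calibration set (and inference inputs) should be raw 0–255 images, not pre-normalized ones.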