.hef model not performing after conversion

Here is my setup:
Raspberry Pi 5
Hailo-8 AI HAT on the Pi 5

I trained a YOLOv8n model on a dataset for detecting all kinds of flying objects, such as birds, aeroplanes, etc.
After training, I converted the .pt YOLO model to .onnx using Save As => Export.

Later, I converted the .onnx to .hef using the DFC.
After the .hef conversion, the model's performance degraded a lot, and it is not able to detect any of the objects it was trained on.
I guess this may have happened due to quantization.
Is there any way to make sure the conversion from .onnx to .hef with the DFC happens as-is, with no quantization applied?
Is there any other way to run the .pt model directly on the Hailo-8 AI HAT?

Hey @Avichal_Baweja, welcome!

Just to clarify: Hailo-8 only runs quantized (integer) models, so the DFC can’t compile your YOLOv8 ONNX in float — it must go through quantization. There’s also no support for running a .pt model directly; you need to convert via:
ONNX/TFLite → HAR → quantization → HEF.
More info: [Custom ONNX]
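
For reference, the end-to-end flow with the DFC's Python API looks roughly like this (a minimal sketch assuming the hailo_sdk_client ClientRunner interface from the DFC tutorials; the file names, model name, and calibration set are placeholders):

import numpy as np
from hailo_sdk_client import ClientRunner

# Parse the ONNX into a HAR (Hailo archive representation).
runner = ClientRunner(hw_arch="hailo8")
runner.translate_onnx_model("yolov8n_custom.onnx", "yolov8n_custom")

# Quantize; this step is mandatory, since Hailo-8 executes integer models only.
with open("yolov8n_custom.alls") as f:
    runner.load_model_script(f.read())
calib_set = np.load("calib_set.npy")  # e.g. (N, 640, 640, 3) float32 in [0, 255]
runner.optimize(calib_set)

# Compile the quantized model to a HEF.
with open("yolov8n_custom.hef", "wb") as f:
    f.write(runner.compile())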

Likely issues:

  • The DFC always applies quantization — even on QAT or pre-quantized models.

  • Accuracy drop or “no detections” is often due to:

    • Preprocessing or calibration mismatches (e.g., RGB vs BGR, normalization differences)
    • Custom models with few classes causing output collapse

Try adding this to the final conv layers in your ALLS if detections are missing:

quantization_param([conv42, conv53, conv63], force_range_out=[0.0, 1.0])

What to check:

  • Does your ONNX model give correct detections on CPU? (See the sketch after this list.)
  • Is preprocessing consistent across training, calibration, and inference?
  • Are you starting from the official yolov8n.alls?
  • Is NMS added to the ALLS?
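
For the first two items, a quick CPU sanity check might look like this (a hedged sketch; the image path, model path, and 640×640 input size are assumptions based on this thread):

import cv2
import numpy as np
import onnxruntime as ort

# Preprocess exactly as in training: RGB, CHW, float32 in [0, 1].
img = cv2.cvtColor(cv2.imread("test.jpg"), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640)).astype(np.float32) / 255.0
blob = np.transpose(img, (2, 0, 1))[np.newaxis]  # (1, 3, 640, 640)

sess = ort.InferenceSession("yolov8n_custom.onnx", providers=["CPUExecutionProvider"])
outputs = sess.run(None, {sess.get_inputs()[0].name: blob})
print([o.shape for o in outputs])  # raw head outputs; decoding + NMS still needed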

For better support, please share:

  • Your .alls file
  • Preprocessing details (training vs calibration vs inference)
  • Output of:
hailortcli parse-hef ./your_model.hef
hailortcli run ./your_model.hef
  • And your hailort.log if there’s an issue during inference.

Let us know how it goes — happy to help further!

1. :white_check_mark: ONNX Model - Now Works on CPU

After re-exporting with the correct settings:


PyTorch: Class 0 (drone), conf=0.766, box=[1397,834,1983,1098]
ONNX:    Class 0 (drone), conf=0.771, box=[1398,837,1982,1095]

ONNX gives correct detections on CPU.


2. Preprocessing Details

Stage                Normalization            Color  Format  Dtype
Training             ÷255 (0-1)               RGB    CHW     float32
Calibration          0-255                    RGB    HWC     float32
Hailo Inference      0-255                    RGB    HWC     UINT8
ALLS normalization   [0,0,0]/[255,255,255]    -      -       internal

Calibration data:

  • Shape: (17, 640, 640, 3)

  • Dtype: float32

  • Range: [0.0, 255.0]
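
For reference, building a calibration set with exactly that layout might look like this (a sketch; the image directory is a placeholder):

import glob
import cv2
import numpy as np

# (N, 640, 640, 3), float32, RGB, HWC, values in [0, 255].
# Note: no /255 here, since the ALLS normalization layer scales on-chip.
imgs = []
for path in sorted(glob.glob("calib_images/*.jpg"))[:17]:
    img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    imgs.append(cv2.resize(img, (640, 640)).astype(np.float32))

calib_set = np.stack(imgs)
print(calib_set.shape, calib_set.dtype, calib_set.min(), calib_set.max())
np.save("calib_set.npy", calib_set)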


3. :cross_mark: NOT Using Official yolov8n.alls

I used yolov8m.alls, which has the wrong layer names for my custom model:

yolov8m.alls references:


conv58, conv71, conv83

My model’s actual output conv layers:


/model.22/cv2.0/cv2.0.2/Conv  (bbox scale 1)
/model.22/cv2.1/cv2.1.2/Conv  (bbox scale 2)
/model.22/cv2.2/cv2.2.2/Conv  (bbox scale 3)
/model.22/cv3.0/cv3.0.2/Conv  (class scale 1) ← need sigmoid
/model.22/cv3.1/cv3.1.2/Conv  (class scale 2) ← need sigmoid
/model.22/cv3.2/cv3.2.2/Conv  (class scale 3) ← need sigmoid

This mismatch is likely causing the issue!
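
For reference, you can confirm these ONNX node names programmatically rather than eyeballing Netron; a minimal sketch, assuming the exported model file name:

import onnx

# Print the Conv nodes inside the YOLOv8 detection head (model.22).
model = onnx.load("my_model.onnx")  # placeholder path
for node in model.graph.node:
    if node.op_type == "Conv" and "/model.22/" in node.name:
        print(node.name)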


4. :white_check_mark: NMS is in ALLS


nms_postprocess("yolov8m_nms_config.json", meta_arch=yolov8, engine=cpu)

5. Current ALLS File


normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
model_optimization_config(calibration, batch_size=2)
change_output_activation(conv58, sigmoid)
change_output_activation(conv71, sigmoid)
change_output_activation(conv83, sigmoid)
post_quantization_optimization(finetune, policy=enabled, learning_rate=0.000025)
nms_postprocess("../../postprocess_config/yolov8m_nms_config.json", meta_arch=yolov8, engine=cpu)

6. Proposed Fix - Custom ALLS

Should I create a new ALLS like this?


normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
model_optimization_config(calibration, batch_size=2)

# Correct layer names for custom YOLOv8m
change_output_activation(model_22_cv3_0_cv3_0_2_Conv, sigmoid)
change_output_activation(model_22_cv3_1_cv3_1_2_Conv, sigmoid)
change_output_activation(model_22_cv3_2_cv3_2_2_Conv, sigmoid)

# Force range for class outputs (4 classes)
quantization_param([model_22_cv3_0_cv3_0_2_Conv, model_22_cv3_1_cv3_1_2_Conv, model_22_cv3_2_cv3_2_2_Conv], force_range_out=[0.0, 1.0])

post_quantization_optimization(finetune, policy=enabled, learning_rate=0.000025)
nms_postprocess("custom_nms_config.json", meta_arch=yolov8, engine=cpu)

@Avichal_Baweja, you may refer to this post (though it was for YOLOv11, it works for YOLOv8 too): Guide to using the DFC to convert a modified YoloV11 on Google Colab.

@omria, the DFC conversion process is kind of complicated. It would be nice if it could be simplified in a future release, e.g., the user provides only the ONNX (or even .pt) YOLO model (plus, perhaps, the YOLO model config file) and a sample dataset, and the DFC does the conversion automatically; at most, the user specifies the desired percentage of 8-bit vs. 16-bit layers. Thanks for your consideration!

Hey @Avichal_Baweja,

You’re spot on—using yolov8m.alls directly won’t work for your custom model because those conv58/71/83 names are specific to the Model Zoo YOLOv8m graph, not your exported model.

Here’s the thing: the node names you provided like /model.22/cv2.0/cv2.0.2/Conv, etc., are ONNX names. But in your .alls file, you need to use the Hailo node names that get created when your model is parsed into a .har file.

So here’s what you need to do:

Open your .har file in the visualizer or Netron:

hailo visualizer your_model.har

Look for the Hailo names that correspond to your three class heads:

  • /model.22/cv3.0/cv3.0.2/Conv
  • /model.22/cv3.1/cv3.1.2/Conv
  • /model.22/cv3.2/cv3.2.2/Conv

These will probably be named something like conv42, conv53, conv63 (or similar), but the exact names depend on how your model was parsed.

Once you have those Hailo names, create a custom .alls using them instead of conv58/71/83. Here’s a template (just replace cls_head_s8/s16/s32 with your actual Hailo node names):

normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
model_optimization_config(calibration, batch_size=2)

# Use your actual Hailo names for the 3 class heads
change_output_activation(cls_head_s8, sigmoid)
change_output_activation(cls_head_s16, sigmoid)
change_output_activation(cls_head_s32, sigmoid)

quantization_param(
    [cls_head_s8, cls_head_s16, cls_head_s32],
    force_range_out=[0.0, 1.0]
)

post_quantization_optimization(finetune, policy=enabled, learning_rate=0.000025)
nms_postprocess("custom_nms_config.json", meta_arch=yolov8, engine=cpu)

Also update your NMS config (custom_nms_config.json) to use these same Hailo node names for reg_layer and cls_layer.
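
If it helps, you can patch the Model Zoo config programmatically; here's a sketch (the bbox_decoders/reg_layer/cls_layer keys follow the Model Zoo's yolov8 NMS config format as I understand it, and the conv names are placeholders to replace with your visualizer names):

import json

# Start from the Model Zoo config and swap in your model's Hailo layer names.
with open("yolov8m_nms_config.json") as f:
    cfg = json.load(f)

cfg["classes"] = 4  # your custom class count
new_layers = [("conv41", "conv42"), ("conv52", "conv53"), ("conv62", "conv63")]
for dec, (reg_name, cls_name) in zip(cfg["bbox_decoders"], new_layers):
    dec["reg_layer"] = reg_name  # placeholder: your bbox head name per scale
    dec["cls_layer"] = cls_name  # placeholder: your class head name per scale

with open("custom_nms_config.json", "w") as f:
    json.dump(cfg, f, indent=4)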

The force_range_out=[0.0, 1.0] on the class heads is really important—it’s a documented workaround for YOLOv8 models with few classes or GPU training. It prevents quantization from collapsing those outputs.

Once you’ve updated everything with the correct node names, recompile to HEF and test on your Pi. Your detections should now line up with your ONNX CPU baseline.

If things are still off after that, double-check your preprocessing on the Pi—make sure you’re not accidentally doing an extra /255 somewhere.
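
A minimal check for that on the Pi side (a sketch; the frame source is a placeholder):

import cv2
import numpy as np

# Matching your preprocessing table: UINT8, RGB, HWC, values in [0, 255].
# The ALLS normalization layer does the /255 on-chip, so don't divide here.
frame = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
frame = cv2.resize(frame, (640, 640)).astype(np.uint8)
assert frame.dtype == np.uint8 and frame.max() > 1  # catches an accidental /255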

Let me know how it goes!

Hey @TT2024,

Thanks for the feedback!

I’ve forwarded your request to the R&D team.
I completely agree—improving the DFC compilation process and tutorial is a priority. We’ll also be adding more detailed guides to the community soon.

@omria , that will be great! Thanks!