@omria
- Here is the .alls file I used for yolov10m (the other YOLO models use the default files provided):
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
quantization_param([conv64, conv77, conv89], force_range_out=[0.0, 1.0])
model_optimization_flavor(optimization_level=2)
nms_postprocess("/home/lauretta/quang/rain_detector/model_script/yolov10m_nms_config.json", meta_arch=yolov8, engine=cpu)
- The NMS config I am using for the model:
{
    "nms_scores_th": 0.001,
    "nms_iou_th": 0.45,
    "image_dims": [640, 640],
    "max_proposals_per_class": 100,
    "classes": 2,
    "regression_length": 16,
    "background_removal": false,
    "background_removal_index": 0,
    "bbox_decoders": [
        {
            "name": "bbox_decoder45",
            "stride": 8,
            "reg_layer": "conv61",
            "cls_layer": "conv64"
        },
        {
            "name": "bbox_decoder56",
            "stride": 16,
            "reg_layer": "conv74",
            "cls_layer": "conv77"
        },
        {
            "name": "bbox_decoder66",
            "stride": 32,
            "reg_layer": "conv86",
            "cls_layer": "conv89"
        }
    ]
}
- I provided a calibration path, which is a directory containing .jpg images. The inference pipeline resizes the input images to the model input size (640x640) and uses the np.uint8 type.
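For reference, the calibration preprocessing can be sketched roughly like this (a minimal NumPy-only sketch; `preprocess_calibration_image` is a hypothetical name, and nearest-neighbor indexing stands in for cv2.resize):

```python
import numpy as np

def preprocess_calibration_image(img, size=(640, 640)):
    # Resize an HxWx3 image to the model input size using nearest-neighbor
    # sampling (a stand-in for cv2.resize) and cast to np.uint8, matching
    # the 640x640 uint8 input the calibration pipeline produces.
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols[None, :]].astype(np.uint8)
```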
- The inference pipeline is similar to this script:
import cv2
from time import time

# HailoDetector and COCO_CLASSES come from the project's own modules.

if __name__ == "__main__":
    cap1 = cv2.VideoCapture("sprinkler.mp4")
    detector = HailoDetector("/home/lauretta/quang/yolov10m.hef")
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    used_size = (int(cap1.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap1.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter("yolov10_sprinkler.mp4", fourcc, 20, used_size)
    caps = [cap1]
    start_time = time()
    current_time = start_time
    frame_proc = 0
    n = len(caps)
    while True:
        ret1, frame1 = cap1.read()
        if not ret1:
            break
        frame_proc += n
        # print(f"Current FPS: {n / (time() - current_time)}")
        current_time = time()
        # print(f"Average FPS: {frame_proc / (current_time - start_time)}")
        frames = [frame1]
        detections = detector(frames, 0.0)
        for i, cam_dets in enumerate(detections):
            vis_frame = frames[i]
            frame_shape = vis_frame.shape
            for det in cam_dets:
                y1, x1, y2, x2, conf, cls = det
                if conf > 0.01:
                    print("something")
                    # Detections are normalized; scale back to pixel coordinates.
                    x1 = x1 * frame_shape[1]
                    y1 = y1 * frame_shape[0]
                    x2 = x2 * frame_shape[1]
                    y2 = y2 * frame_shape[0]
                    label = f"{COCO_CLASSES[int(cls)]}: {conf:.2f}"
                    cv2.rectangle(vis_frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
                    cv2.putText(vis_frame, label, (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
            # cv2.imshow(f"Camera {i}", vis_frame)
            writer.write(vis_frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap1.release()
    writer.release()
    cv2.destroyAllWindows()
The HailoDetector handles resizing the images to the expected input size as well as batching them.
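What the detector is described to do internally can be sketched as a small helper (a hypothetical `batch_frames`, using NumPy nearest-neighbor indexing as a stand-in for cv2.resize):

```python
import numpy as np

def batch_frames(frames, input_size=(640, 640)):
    # Sketch of the described HailoDetector behavior: resize every frame to
    # the model input size and stack them into a single uint8 batch with
    # shape (N, 640, 640, 3), ready to feed to the HEF.
    out = np.empty((len(frames), *input_size, 3), dtype=np.uint8)
    for i, f in enumerate(frames):
        h, w = f.shape[:2]
        rows = np.arange(input_size[0]) * h // input_size[0]
        cols = np.arange(input_size[1]) * w // input_size[1]
        out[i] = f[rows[:, None], cols[None, :]]
    return out
```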
A snippet of the optimization log (first and last steps) is as follows:
[info] No GPU chosen, Selected GPU 0
<Hailo Model Zoo INFO> Start run for network yolov10m ...
<Hailo Model Zoo INFO> Initializing the hailo8 runner...
[info] Translation started on ONNX model yolov10m
[info] Restored ONNX model yolov10m (completion time: 00:00:00.15)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.94)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/one2one_cv3.0/one2one_cv3.0.2/Conv /model.23/one2one_cv2.0/one2one_cv2.0.2/Conv /model.23/one2one_cv2.1/one2one_cv2.1.2/Conv /model.23/one2one_cv3.1/one2one_cv3.1.2/Conv /model.23/one2one_cv2.2/one2one_cv2.2.2/Conv /model.23/one2one_cv3.2/one2one_cv3.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'yolov10m/input_layer1'.
[info] End nodes mapped from original model: '/model.23/one2one_cv2.0/one2one_cv2.0.2/Conv', '/model.23/one2one_cv3.0/one2one_cv3.0.2/Conv', '/model.23/one2one_cv2.1/one2one_cv2.1.2/Conv', '/model.23/one2one_cv3.1/one2one_cv3.1.2/Conv', '/model.23/one2one_cv2.2/one2one_cv2.2.2/Conv', '/model.23/one2one_cv3.2/one2one_cv3.2.2/Conv'.
[info] Translation completed on ONNX model yolov10m (completion time: 00:00:02.31)
[info] Appending model script commands to yolov10m from string
[info] Added nms postprocess command to model script.
[info] Saved HAR to: /home/lauretta/quang/rain_detector/yolov10m.har
<Hailo Model Zoo INFO> Preparing calibration data...
[info] Loading model script commands to yolov10m from /home/lauretta/quang/rain_detector/model_script/yolov10m.alls
[info] Loading model script commands to yolov10m from string
[info] The activation function of layer yolov10m/conv64 was replaced by a Sigmoid
[info] The activation function of layer yolov10m/conv77 was replaced by a Sigmoid
[info] The activation function of layer yolov10m/conv89 was replaced by a Sigmoid
[info] Found model with 3 input channels, using real RGB images for calibration instead of sampling random data.
[info] Starting Model Optimization
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.42)
[info] LayerNorm Decomposition skipped
[info] Starting Statistics Collector
[info] Using dataset with 64 entries for calibration
[info] Model Optimization Algorithm Statistics Collector is done (completion time is 00:00:22.07)
[info] Starting Fix zp_comp Encoding
[info] Model Optimization Algorithm Fix zp_comp Encoding is done (completion time is 00:00:00.00)
[info] Starting Matmul Equalization
[info] Model Optimization Algorithm Matmul Equalization is done (completion time is 00:00:00.02)
[info] Starting MatmulDecomposeFix
[info] Model Optimization Algorithm MatmulDecomposeFix is done (completion time is 00:00:00.00)
[info] activation fitting started for yolov10m/reduce_sum_softmax1/act_op
[info] No shifts available for layer yolov10m/conv44/conv_op, using max shift instead. delta=2.1318
[info] No shifts available for layer yolov10m/conv44/conv_op, using max shift instead. delta=1.0659
[info] No shifts available for layer yolov10m/conv84/conv_op, using max shift instead. delta=0.6651
[info] No shifts available for layer yolov10m/conv84/conv_op, using max shift instead. delta=0.3325
[info] No shifts available for layer yolov10m/conv89/conv_op, using max shift instead. delta=0.1925
[info] No shifts available for layer yolov10m/conv89/conv_op, using max shift instead. delta=0.0963
[info] Finetune encoding skipped
[info] Bias Correction skipped
[info] Adaround skipped
[info] Starting Quantization-Aware Fine-Tuning
[warning] Dataset is larger than expected size. Increasing the algorithm dataset size might improve the results
[info] Using dataset with 1024 entries for finetune
Epoch 1/4
  1/128 ━━━━━━━━━━━━━━━━━━━━ 3:26:01 97s/step - _distill_loss_yolov10m/conv57: 0.3000 - _distill_loss_yolov10m/conv61: 0.1377 - _distill_loss_yolov10m/conv64: 0.3819 - _distill_loss_yolov10m/conv70: 0.2914 - _distill_loss_yolov10m/conv74: 0.2233 - _distill_loss_yolov10m/conv77: 0.0323 - _distill_loss_yolov10m/conv83: 0.3711 - _distill_loss_yolov10m/conv86: 0.2080 - _distill_loss_yolov10m/conv89: 1.0000 - total_distill_loss: 2.9458
128/128 ━━━━━━━━━━━━━━━━━━━━ 59s 459ms/step - _distill_loss_yolov10m/conv57: 0.2392 - _distill_loss_yolov10m/conv61: 0.1138 - _distill_loss_yolov10m/conv64: 0.2686 - _distill_loss_yolov10m/conv70: 0.2289 - _distill_loss_yolov10m/conv74: 0.1382 - _distill_loss_yolov10m/conv77: 0.4470 - _distill_loss_yolov10m/conv83: 0.2096 - _distill_loss_yolov10m/conv86: 0.1196 - _distill_loss_yolov10m/conv89: 1.0000 - total_distill_loss: 2.7651
[info] Model Optimization Algorithm Quantization-Aware Fine-Tuning is done (completion time is 00:05:36.00)
[info] Starting Layer Noise Analysis
[info] Model Optimization Algorithm Layer Noise Analysis is done (completion time is 00:01:14.04)
[info] Model Optimization is done
[info] Saved HAR to: /home/lauretta/quang/rain_detector/yolov10m.har