Model seems to work but eval metrics are all zero

Hi,

I’m running a finetuned YOLOv8n for detection/localization and tried to evaluate it. The model appears to work: when I run the evaluation with --visualize, the generated bounding boxes look as expected.
However, the eval metrics are all 0:

hailomz eval --har finetuned_yolov8n.har --target full_precision --hw-arch hailo8l --yaml ./cfg/finetuned_yolov8n.yaml --model-script cfg/finetuned_yolov8n.alls --data-path asdf.tfrecord --data-count 128 --classes 1
<Hailo Model Zoo INFO> Start run for network finetuned_yolov8n ...
<Hailo Model Zoo INFO> Initializing the runner...
<Hailo Model Zoo INFO> Chosen target is full_precision
<Hailo Model Zoo INFO> Preparing calibration data...
checking in /mnt
[info] Loading model script commands to finetuned_yolov8n from cfg/finetuned_yolov8n.alls
[info] Loading model script commands to finetuned_yolov8n from string
<Hailo Model Zoo INFO> Initializing the dataset ...
checking in /mnt
<Hailo Model Zoo INFO> Running inference...
[info] Setting NMS score threshold to 0.001
Processed: 128images [00:53,  2.41images/s]
creating index...
index created!
Loading and preparing results...
Converting ndarray to lists...
(12800, 7)
0/12800
DONE (t=0.03s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.30s).
Accumulating evaluation results...
DONE (t=0.04s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
<Hailo Model Zoo INFO> Done 128 images AP=0.000 AP50=0.000

I checked my .tfrecord file with tfrecord-viewer, and the labels in it look sensible.
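
For reference, the raw record contents can also be dumped with a few lines of TensorFlow (a minimal sketch; it assumes TensorFlow is installed and makes no assumptions about feature key names, it just prints whatever is stored):

    import tensorflow as tf

    # Print the feature keys and a short value preview for the first two records.
    for i, raw in enumerate(tf.data.TFRecordDataset("asdf.tfrecord").take(2)):
        example = tf.train.Example()
        example.ParseFromString(raw.numpy())
        print(f"--- record {i} ---")
        for key, feature in example.features.feature.items():
            kind = feature.WhichOneof("kind")  # bytes_list / float_list / int64_list
            values = getattr(feature, kind).value
            # Avoid dumping raw image bytes; show a length instead.
            preview = f"<{len(values[0])} bytes>" if kind == "bytes_list" and values else list(values[:4])
            print(f"{key:40s} {kind:12s} {preview}")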

Here are my customized configs:

finetuned_yolov8n.yaml
base:
- networks/yolov8n.yaml
network:
  network_name: finetuned_yolov8n
paths:
  alls_script: finetuned_yolov8n.alls
evaluation:
  classes: 1
  dataset: asdf.tfrecord
quantization:
  calib_set:
  - asdf.tfrecord

finetuned_yolov8n.alls
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv42, sigmoid)
change_output_activation(conv53, sigmoid)
change_output_activation(conv63, sigmoid)
nms_postprocess("finetuned_yolov8n_nms_config.json", meta_arch=yolov8, engine=cpu)

allocator_param(width_splitter_defuse=disabled)

finetuned_yolov8n_nms_config.json
{
	"nms_scores_th": 0.2,
	"nms_iou_th": 0.7,
	"image_dims": [
		1920,
		1920
	],
	"max_proposals_per_class": 100,
	"classes": 1,
	"regression_length": 16,
	"background_removal": false,
	"bbox_decoders": [
		{
			"name": "finetuned_yolov8n/bbox_decoder41",
			"stride": 8,
			"reg_layer": "finetuned_yolov8n/conv41",
			"cls_layer": "finetuned_yolov8n/conv42"
		},
		{
			"name": "finetuned_yolov8n/bbox_decoder52",
			"stride": 16,
			"reg_layer": "finetuned_yolov8n/conv52",
			"cls_layer": "finetuned_yolov8n/conv53"
		},
		{
			"name": "finetuned_yolov8n/bbox_decoder62",
			"stride": 32,
			"reg_layer": "finetuned_yolov8n/conv62",
			"cls_layer": "finetuned_yolov8n/conv63"
		}
	]
}

The finetuned_yolov8n.har is currently just a parsed import of the ONNX (created with hailomz parse --ckpt=v8n_v3_1920_agdswi2m_best.onnx --hw-arch hailo8l --yaml cfg/finetuned_yolov8n.yaml).

Any thoughts as to what might be causing this issue?

Thanks!

Someone reported a problem with similar symptoms here:

But I’m not using the same model, and mine should just have a standard YOLO post-processing step, so I don’t think that resolution applies in my case.

Hey @user108,

Welcome to the Hailo Community!

If you have the same output structure as regular YOLO, then yes, it’s better to run post-processing on Hailo.

Potential Causes and Fixes

1. Dataset and Annotation Issues

  • Incorrect dataset format:
    • The Hailo Model Zoo expects the dataset to be in the exact format that the YAML configuration declares.
    • Check: Convert the TFRecord dataset to .npy for inspection:
      python hailo_model_zoo/tools/conversion_tool.py /path/to/tfrecord_file --npy
      
      Then, verify that bounding boxes are correctly formatted.
  • Incorrect class mapping:
    • Ensure that the classes: 1 setting in your YAML matches the number of classes in the TFRecord dataset.
    • If your dataset annotations use different class indexes than the model expects, nothing will match the ground truth (see the sketch below).
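
A quick, hedged way to check the class ids directly (heuristic sketch: it scans int64 features whose key name suggests a label, since exact key names vary between TFRecord writers):

    import tensorflow as tf

    # Report the min/max of any int64 feature that looks like a class label.
    # With classes: 1 you would expect a single id across the whole dataset.
    ranges = {}
    for raw in tf.data.TFRecordDataset("asdf.tfrecord"):
        example = tf.train.Example()
        example.ParseFromString(raw.numpy())
        for key, feature in example.features.feature.items():
            if feature.WhichOneof("kind") != "int64_list":
                continue
            if "label" not in key and "class" not in key:
                continue
            vals = feature.int64_list.value
            if vals:
                lo, hi = min(vals), max(vals)
                prev = ranges.get(key, (lo, hi))
                ranges[key] = (min(prev[0], lo), max(prev[1], hi))

    for key, (lo, hi) in ranges.items():
        print(f"{key}: min={lo} max={hi}")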

2. Bounding Boxes Being Filtered

  • Your log shows a (12800, 7) detections array (128 images × 100 proposals each) being processed, yet 0/12800 matched the ground truth.
  • Try running evaluation with a relaxed IoU threshold:
    hailomz eval --har finetuned_yolov8n.har --iou-threshold 0.3
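
To see why relaxing the IoU threshold alone may not be enough: a coordinate-scale mismatch between predictions and ground truth (pixels vs. normalized coordinates) is a classic cause of exactly-zero metrics even when visualizations look perfect, because the IoU of even a perfect box becomes ~0. A minimal illustration:

    # IoU of two axis-aligned boxes in [x1, y1, x2, y2] form.
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter)

    gt = [480, 480, 960, 960]        # ground truth in pixels on a 1920x1920 image
    det = [0.25, 0.25, 0.50, 0.50]   # the same box, normalized to [0, 1]
    print(iou(gt, det))              # ~0.0 -> every COCO metric collapses to zero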
    

3. Additional diagnostic runs

  • Run evaluation with debug mode to see detailed output:
hailomz eval --har finetuned_yolov8n.har --debug
  • Run without NMS to verify raw detections:
hailomz eval --har finetuned_yolov8n.har --no-nms
  • Ensure input image preprocessing is correctly applied:
hailomz eval --har finetuned_yolov8n.har --input-conversion nv12_to_rgb --resize 1920 1920

These should help identify why your evaluation results are zero. Let me know what you find so I can help further!
