Running hailomz optimize reports layer yolov8n/conv41 doesn't have one output layer

@gbelair, the issue is with the file yolov8n_nms_config.json, which is referenced by yolov8n.alls. If you open this file, you’ll see references to yolov8n/conv41 and other layers, but they probably no longer correspond to your custom model’s output layer names in the HAR.
The fix is to remove the reference to that JSON file altogether: edit your /home/mytest/Hailo8l/hailo_model_zoo/hailo_model_zoo/cfg/alls/generic/yolov8n.alls and replace the line:

nms_postprocess("…/…/postprocess_config/yolov8n_nms_config.json", meta_arch=yolov8, engine=cpu)

with:

nms_postprocess(meta_arch=yolov8, engine=cpu)

The tool will automatically find the correct values to apply if no json file is supplied.
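If you would rather fix the JSON than drop it, a quick way to spot a mismatch is to scan the file for the conv-layer names it references and compare them against your HAR's output layers. This is a minimal sketch of my own; the sample snippet and its keys are hypothetical, not the exact schema Hailo uses:

```python
import json
import re

def referenced_layers(config_text):
    """Return the sorted set of convN layer names mentioned anywhere in the JSON text."""
    json.loads(config_text)  # fail early if the file is not valid JSON
    return sorted(set(re.findall(r"conv\d+", config_text)))

# Hypothetical snippet -- the real file's keys may differ:
sample = '{"bbox_decoders": [{"reg_layer": "yolov8n/conv41", "cls_layer": "yolov8n/conv42"}]}'
print(referenced_layers(sample))  # ['conv41', 'conv42']
```

In practice you would read your own yolov8n_nms_config.json into `config_text` and diff the result against the output layer names reported by the parser.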

In addition, please use the option --classes when you call 'hailomz optimize' if your number of classes is different from 80, which is the default for the COCO dataset.


Just a note: this solved it for me too.

Thank you, this works for me!!
I investigated the error further and found a layer mismatch: in my case, the JSON file listed conv57 instead of conv58.

Thanks @victorc, this got me further. After making the recommended changes I then re-ran:
hailomz parse --hw-arch hailo8l --ckpt best.onnx yolov8n --end-node-names /model.22/cv3.0/cv3.0.2/Conv /model.22/cv2.0/cv2.0.2/Conv /model.22/cv2.1/cv2.1.2/Conv /model.22/cv3.1/cv3.1.2/Conv /model.22/cv3.2/cv3.2.2/Conv /model.22/cv2.2/cv2.2.2/Conv
That created a new yolov8n.har file
Now when I run:
hailomz optimize --hw-arch hailo8l --har ./yolov8n.har --calib-path /home/mytest/.hailomz/data/models_files/coco/2023-08-03/coco_calib2017.tfrecord --model-script /home/mytest/Hailo8l/hailo_model_zoo/hailo_model_zoo/cfg/alls/generic/yolov8n.alls yolov8n
I got the following error:
hailo_model_optimization.acceleras.utils.acceleras_exceptions.NegativeSlopeExponentNonFixable: Quantization failed in layer yolov8n/conv42 due to unsupported required slope. Desired shift is 9.0, but op has only 8 data bits. This error raises when the data or weight range are not balanced. Mostly happens when using random calibration-set/weights, the calibration-set is not normalized properly or batch-normalization was not used during training.
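For intuition about the error message, here is a toy model of the constraint (my own illustration, not Hailo's actual fixed-point implementation): rescaling between quantized layers uses a power-of-two shift, and the required shift grows with the ratio between the ranges being matched. An imbalance of roughly 512x demands a shift of 9, which an 8-bit datapath cannot absorb:

```python
import math

# Hypothetical, deliberately unbalanced ranges -- illustrative only.
act_range = 512.0   # very wide activation range
out_range = 1.0     # narrow target output range
data_bits = 8       # bits available in the op's datapath

# Required power-of-two shift to map one range onto the other.
required_shift = math.ceil(math.log2(act_range / out_range))

print(f"required shift: {required_shift}")        # required shift: 9
print(f"fixable: {required_shift <= data_bits}")  # fixable: False
```

Balancing the ranges (better calibration data, normalization, or the force_range_out workaround below) reduces the required shift back into the representable band.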
To recap, I’m just following the original tutorial, thus not adding any of my own changes, just those suggested in this thread.
However, per this thread: Problem With Model Optimization - #24 by emasty, I modified the yolov8n.alls file further as:
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv42, sigmoid)
change_output_activation(conv53, sigmoid)
change_output_activation(conv63, sigmoid)
nms_postprocess(meta_arch=yolov8, engine=cpu)
allocator_param(width_splitter_defuse=disabled)
quantization_param([conv42, conv53, conv63], force_range_out=[0.0, 1.0])
I can now optimize and compile, though I have not tested the compiled model yet. Of course this may simply be papering over the root cause; my guess is that it relates to how the tutorial presents its initial training. I’m still a newbie, so there’s lots more for me to learn.

@gbelair The NegativeSlopeExponentNonFixable error is likely due to not enough training data being provided for the weights of this branch to converge. YOLOv8n has 3 heads, each corresponding to a particular scale of the image. The yolov8n/conv42 layer probably corresponds to a resolution for which you have too little labeled data. You can modify the model architecture by removing one head, collect more data for the scale that is lacking, or keep the workaround you have in place.
Note that you only need to add the line

quantization_param([conv42, conv53, conv63], force_range_out=[0.0, 1.0])

to the existing yolov8n.alls file:

normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
nms_postprocess(meta_arch=yolov8, engine=cpu)
allocator_param(width_splitter_defuse=disabled)
quantization_param([conv42, conv53, conv63], force_range_out=[0.0, 1.0])
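For intuition on what force_range_out=[0.0, 1.0] does: it pins the quantizer's output range to the sigmoid's codomain instead of whatever range the calibration set happened to produce. A rough numpy sketch of 8-bit affine quantization with a fixed range (illustrative values, not Hailo's exact arithmetic):

```python
import numpy as np

def quantize(x, lo, hi, bits=8):
    """Quantize x to a fixed [lo, hi] range, then dequantize back to float."""
    levels = 2**bits - 1                 # 255 codes for 8 bits
    scale = (hi - lo) / levels           # step size of the quantization grid
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo

# Sigmoid outputs always live in (0, 1), so forcing the range to [0, 1]
# gives a stable scale of 1/255 regardless of the calibration data.
x = np.array([0.0, 0.25, 0.5, 0.999])
print(quantize(x, 0.0, 1.0))
```

With the range forced, the unbalanced-range condition behind NegativeSlopeExponentNonFixable cannot arise for these layers.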

The other lines related to changing the output activation are no longer necessary for the latest compiler.

Thanks @victorc, that indeed was my problem: not enough training data. After a new (corrected) training run (with epochs=100) I removed the quantization_param([conv42, conv53, conv63], force_range_out=[0.0, 1.0]) line, and there were no further errors. The final optimized and compiled .hef file transferred to my RPi 5 works; detection runs, albeit poorly, but I feel I can work on that now that I have a confirmed, tested end-to-end environment.

@gbelair That’s good, you got something functional. To troubleshoot the accuracy, I advise you to decrease the compression level to 0 by modifying the alls file. This ensures all layers are quantized to 8 bits rather than compressed to 4 bits. Your fps might drop, but you’ll see the accuracy with all layers quantized to 8 bits. If accuracy is still not good at that point, I’d advise you to apply the quantization optimization methods mentioned here: Accuracy degradation after quantization for Hailo HW
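As a hedged illustration of the 4-bit versus 8-bit trade-off (toy numbers on random weights, not a Hailo measurement): a grid with 16x fewer levels yields a correspondingly larger quantization error, which is why dropping the compression level to 0 can recover accuracy at some cost in fps.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 1, 10000)  # stand-in for a layer's weight distribution

def quant_error(x, bits):
    """Mean absolute error of uniform quantization over x's full range."""
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (2**bits - 1)
    xq = np.round((x - lo) / scale) * scale + lo
    return np.abs(x - xq).mean()

print(f"8-bit mean abs error: {quant_error(w, 8):.5f}")
print(f"4-bit mean abs error: {quant_error(w, 4):.5f}")
```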