Impact of Parsing YOLO Models: Does Omitting Final Layers Affect mAP?

Hi everyone,

I have a question regarding the parsing step for YOLO models, specifically YOLOv5m and YOLOv9m. I read the guide *Parsing YOLO models with the Hailo Dataflow Compiler tool* on converting an ONNX YOLO model to a HEF model with Hailo's Dataflow Compiler.

My main concern is whether omitting the final layers of the architecture (such as Gather, Shape, Unsqueeze, Concat, etc.)—and stopping at the convolutional blocks—affects the mAP performance. For example, with YOLOv5, the parsed model only processes up to the convolutional blocks at scales of 80×80, 40×40, and 20×20 (see attached image).

In evaluation, the ONNX model achieves 0.648 mAP, but after parsing (and before quantization) the mAP drops to 0.611, a reduction of about 3.7 percentage points.

I have ensured that:

  • The same validation dataset and preprocessing (normalization and resizing to 640×640) are used.
  • The same NMS filter values are applied (conf_thres=0.25, iou_thres=0.45, scores_thres=0.2).
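For reference, the preprocessing in the first bullet can be sketched as follows. This is a minimal letterbox-style resize plus [0, 1] normalization; the exact resize interpolation and padding details of your pipeline are assumptions here:

```python
import numpy as np

def preprocess(img, size=640):
    """Letterbox-resize an HxWx3 image to size x size and normalize to [0, 1].

    A minimal sketch of the preprocessing described above; your pipeline's
    interpolation and padding color may differ."""
    h, w = img.shape[:2]
    scale = min(size / h, size / w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # nearest-neighbor resize via index sampling (avoids a cv2 dependency)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.full((size, size, 3), 114, dtype=img.dtype)  # gray padding
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas.astype(np.float32) / 255.0
```

The key point is that the parsed model and the ONNX model must see byte-identical inputs, otherwise the comparison measures preprocessing drift rather than parsing loss.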

Thus, the only difference is that the parsed model uses Hailo's NMS filter instead of the final architecture layers. (Correct me if I have forgotten or misunderstood something.)

My questions are:

  • Is the removal of the final architectural layers the likely cause of the mAP degradation?
  • What other factors might contribute to this drop in performance?
  • Are there any strategies or solutions to minimize this degradation?

Thanks in advance for your insights and suggestions!

Hey @daffer.queque ,

Yes, a drop of this size (0.648 → 0.611, about 3.7 mAP points) after parsing is expected when the final YOLO layers are omitted. This occurs because:

  1. Removal of the final layers (Gather, Shape, Unsqueeze, Concat, NMS) changes how boxes are decoded, which affects bounding-box accuracy
  2. Post-processing differences - Hailo moves NMS to the host CPU, and its implementation may not match the original exactly
  3. BatchNorm folding causes slight shifts in activation distributions
  4. Potential quantization effects in later stages (these apply once you quantize, not to the pre-quantization drop you measured)
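To make point 1 concrete: the omitted ONNX layers implement the detection-head decode (sigmoid, grid offsets, anchor scaling), which now has to be reproduced in post-processing. A rough numpy sketch of a typical YOLOv5-style decode for one output scale; the function name and the exact 2·sigmoid−0.5 formulation are assumptions about your head:

```python
import numpy as np

def decode_yolov5_scale(raw, anchors, stride):
    """Decode one raw YOLOv5 output scale, i.e. the work done by the
    omitted final ONNX layers. `raw` is (na, gy, gx, 5 + nc) before
    sigmoid; `anchors` is the per-scale (na, 2) array in pixels.
    Illustrative only -- the decode must match your model's head."""
    na, gy, gx, ch = raw.shape
    p = 1.0 / (1.0 + np.exp(-raw))                       # sigmoid everything
    grid_y, grid_x = np.meshgrid(np.arange(gy), np.arange(gx), indexing="ij")
    # box centers: offset into the grid cell, scaled back to input pixels
    xy = (p[..., 0:2] * 2.0 - 0.5 + np.stack([grid_x, grid_y], -1)) * stride
    # box sizes: anchor-relative
    wh = (p[..., 2:4] * 2.0) ** 2 * anchors[:, None, None, :]
    conf = p[..., 4:5] * p[..., 5:]                      # objectness * class
    return np.concatenate([xy, wh, conf], axis=-1).reshape(-1, ch - 1)
```

Any mismatch between this decode (or Hailo's equivalent) and the original ONNX graph shows up directly as localization error, and hence as an mAP gap even before quantization.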

Mitigation Strategies

  1. Preserve End-Nodes:

    hailomz compile --ckpt yolov5m.onnx --start-node-names Conv_307 Conv_286 Conv_265 --end-node-names Yolo_Output
    
  2. Fine-Tune After Parsing:

    post_quantization_optimization(finetune, policy=enabled, learning_rate=0.0001, epochs=8, dataset_size=4000)
    
  3. Use Higher Precision:

    quantization_param(conv1, precision_mode=a16_w16)
    
  4. Adjust NMS Parameters:

    nms_postprocess(iou_threshold=0.45, score_threshold=0.25, engine=cpu)
    

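When NMS runs on the host, it is worth sanity-checking it against a reference implementation with the same thresholds you used for the ONNX evaluation. A minimal greedy NMS sketch (illustrative only, not Hailo's implementation):

```python
import numpy as np

def nms(boxes, scores, iou_thres=0.45, score_thres=0.25):
    """Greedy NMS on (x1, y1, x2, y2) boxes. A reference for verifying
    that host-side filtering matches the thresholds used when the
    original ONNX model was evaluated."""
    mask = scores >= score_thres
    boxes, scores = boxes[mask], scores[mask]
    order = np.argsort(-scores)                 # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU of the top box against the remaining candidates
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou <= iou_thres]
    return boxes[keep], scores[keep]
```

Running this with the exact conf/IoU thresholds from your ONNX evaluation makes it easy to isolate how much of the gap comes from threshold or implementation differences rather than from the parsing itself.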
With proper configuration, you can usually reduce the mAP drop to under 1 point.
For more info, please check out our DFC documentation (pages 67-77), where we cover this extensively.