YOLO decoding and NMS

Hi! The best way to run a YOLO (v5/v8) model is to convert the ONNX to HEF with the special end nodes. To postprocess the outputs, we need to add the command nms_postprocess("…/yolov8n_nms_config.json", meta_arch=yolov8, engine=cpu) to the alls file. Where can I find the source code for decoding the YOLO outputs and for the NMS?
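For readability, the model-script line in question (only this command comes from the thread; the JSON path is left as the placeholder from the original post, and any other commands in the alls file would depend on your model):

```
nms_postprocess("…/yolov8n_nms_config.json", meta_arch=yolov8, engine=cpu)
```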

The source code is part of HailoRT. Here is a link to our GitHub repo.

GitHub - HailoRT - Postprocessing source code
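While the decoding step is architecture-specific, the NMS that the postprocess performs is standard greedy IoU suppression. Here is a minimal NumPy sketch of that algorithm — not the HailoRT implementation, just an illustration; the 0.45 IoU threshold is an assumed default:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy NMS: keep the highest-scoring box, drop overlapping ones, repeat."""
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # keep only candidates that do not overlap box i too much
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thresh]
    return keep
```

For example, two heavily overlapping boxes collapse to the higher-scoring one, while a distant box survives.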

@klausk, I trained yolov8n on a custom dataset and exported the ONNX to HEF in two ways: with decoding and NMS, and without.
Inference time for the model with decoding and NMS:
hailortcli run yolov8n_4classes_hailo.hef
Running streaming inference (yolov8n_4classes_hailo.hef):
Transform data: true
Type: auto
Quantized: true
Network yolov8n/yolov8n: 100% | 1087 | FPS: 217.20 | ETA: 00:00:00

Inference result:
Network group: yolov8n
Frames count: 1087
FPS: 217.21
Send Rate: 2135.26 Mbit/s
Recv Rate: 1002.02 Mbit/s

Inference time for model without decoding and NMS:
hailortcli run yolov8n_no_nms.hef
Running streaming inference (yolov8n_no_nms.hef):
Transform data: true
Type: auto
Quantized: true
Network yolov8n_no_nms/yolov8n_no_nms: 100% | 1032 | FPS: 206.29 | ETA: 00:00:00

Inference result:
Network group: yolov8n_no_nms
Frames count: 1032
FPS: 206.30
Send Rate: 2028.03 Mbit/s
Recv Rate: 945.36 Mbit/s

As these tests show, YOLO with decoding and NMS is actually faster than without them. Can you explain this?