Hi! The best way to run inference with a YOLO (v5, v8) model is to convert it from ONNX to HEF with special end nodes. To postprocess the outputs, we need to add the command nms_postprocess("…/yolov8n_nms_config.json", meta_arch=yolov8, engine=cpu) to the alls file. Where can I find the source code for decoding the YOLO outputs and for NMS?
The source code is part of HailoRT. Here is a link to our GitHub repo.
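For orientation while reading that source, here is a minimal sketch of what greedy, IoU-based NMS does in general. It is not HailoRT's implementation: the function names, the (x1, y1, x2, y2) box format, and the threshold defaults are all assumptions chosen for illustration, and the YOLO output decoding step is left out entirely.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.25):
    """Return indices of the boxes kept by greedy, class-agnostic NMS.

    The thresholds are illustrative defaults, not values taken from any
    Hailo configuration file.
    """
    # Drop low-confidence boxes, then visit the rest from highest score down.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In this toy input the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed and only the first and third boxes remain. A detector with several classes would normally run this per class.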
@KlausK, I trained yolov8n on a custom dataset and converted the ONNX model to HEF in two ways: once with decoding and NMS, and once without.
Inference time for model with decoding and NMS:
hailortcli run yolov8n_4classes_hailo.hef
Running streaming inference (yolov8n_4classes_hailo.hef):
Transform data: true
Type:      auto
Quantized: true
Network yolov8n/yolov8n: 100% | 1087 | FPS: 217.20 | ETA: 00:00:00
Inference result:
Network group: yolov8n
Frames count: 1087
FPS: 217.21
Send Rate: 2135.26 Mbit/s
Recv Rate: 1002.02 Mbit/s
Inference time for model without decoding and NMS:
hailortcli run yolov8n_no_nms.hef
Running streaming inference (yolov8n_no_nms.hef):
Transform data: true
Type:      auto
Quantized: true
Network yolov8n_no_nms/yolov8n_no_nms: 100% | 1032 | FPS: 206.29 | ETA: 00:00:00
Inference result:
Network group: yolov8n_no_nms
Frames count: 1032
FPS: 206.30
Send Rate: 2028.03 Mbit/s
Recv Rate: 945.36 Mbit/s
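To put the gap between the two runs in numbers, here is a small sketch that uses only the FPS values reported in the logs above. The variable names are illustrative, and a single run of each model says nothing about run-to-run variation.

```python
# FPS values copied from the two hailortcli runs above.
fps_with_nms = 217.21     # yolov8n_4classes_hailo.hef
fps_without_nms = 206.30  # yolov8n_no_nms.hef

# Average time per frame in milliseconds for each model.
ms_with_nms = 1000.0 / fps_with_nms
ms_without_nms = 1000.0 / fps_without_nms

# Relative throughput advantage of the model with decoding and NMS.
speedup_pct = (fps_with_nms / fps_without_nms - 1.0) * 100.0

print(f"with NMS:    {ms_with_nms:.2f} ms/frame")
print(f"without NMS: {ms_without_nms:.2f} ms/frame")
print(f"difference:  {speedup_pct:.1f}% higher FPS with NMS")
```

The difference works out to roughly 5% in throughput, or about a quarter of a millisecond per frame.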
These tests show that YOLO with decoding and NMS runs faster than without them. Can you explain this?