.hef generation from an existing ONNX model without retraining

Hi. I have a PyTorch/ONNX model that I have already trained extensively. The model works very well, and we now wanted to try it out on a Raspberry Pi. Inference works, but one frame takes 130 ms, which is why we bought the Raspberry Pi AI Kit. The demo runs really fast.

Now that the DFC is out, I took a look at it and managed to create a .har file. But now I apparently need to "train" again with images? Why is that necessary? My model is already trained; I just want it converted. Can I somehow quantize (which the compiler requires) without needing images again?

Thanks in advance

Hi @kamper,
Thanks for using the AI Kit on the Raspberry Pi 5.
If you think about it, compressing numbers fourfold (32-bit float → 8-bit int) with almost no degradation is remarkable. Part of how this is achieved is by using a small portion of the data (typically up to 1000 images, no ground truth needed) as a calibration set. This is a standard technique called PTQ (Post-Training Quantization) and is very common. It really helps 'tune' the quantized values correctly. With that, we can typically get less than 2% degradation.
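To make the PTQ idea above concrete, here is a minimal numpy sketch (not Hailo's actual implementation) of symmetric int8 quantization: a scale is chosen from calibration samples, values are rounded to 8-bit integers, and the reconstruction error stays small. The random data stands in for real activations gathered from calibration images.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a layer's float32 activations; a real calibration run would
# collect these by pushing the calibration images through the float model.
calib = rng.normal(0.0, 1.0, size=(1000, 256)).astype(np.float32)

# Symmetric post-training quantization: derive one scale from the calibration data
scale = np.abs(calib).max() / 127.0
q = np.clip(np.round(calib / scale), -128, 127).astype(np.int8)  # 4x smaller storage
deq = q.astype(np.float32) * scale                               # dequantized view

rel_err = np.abs(deq - calib).mean() / np.abs(calib).mean()
print(f"mean relative error: {rel_err:.4f}")
```

The mean relative error lands around one percent here, which is in the same ballpark as the "less than 2% degradation" figure above; real networks need the per-layer machinery the DFC provides, but the principle is the same.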

Welcome to the Hailo Community!

The Hailo Dataflow Compiler aims to approximate a large equation involving floating-point numbers using smaller integers of 4, 8 (the default), or 16 bits. With just the model, the DFC only knows half the numbers in the equation; the other half comes from the images. Providing the Hailo Dataflow Compiler with a representative set of images that reflects the data the model will encounter during inference enables it to optimize the model more effectively by selecting the appropriate quantization for each layer.
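The per-layer selection described above can be sketched in a few lines of numpy (a simplification, not the DFC's algorithm): each layer's activation statistics, gathered from representative images, yield a different quantization scale, so a wide-range layer gets a coarser step than a narrow-range one. The layer statistics here are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical activation samples from two layers, as would be collected by
# running a representative image set through the float model.
layer_a = rng.normal(0.0, 0.5, size=10_000).astype(np.float32)  # narrow range
layer_b = rng.normal(0.0, 4.0, size=10_000).astype(np.float32)  # wide range

def int8_scale(samples: np.ndarray) -> float:
    """Pick a per-layer symmetric int8 scale from calibration samples."""
    return float(np.abs(samples).max()) / 127.0

scale_a, scale_b = int8_scale(layer_a), int8_scale(layer_b)
print(f"layer_a scale: {scale_a:.4f}, layer_b scale: {scale_b:.4f}")
```

Without images there would be no way to know these ranges, which is why the calibration set cannot be skipped even for a fully trained model.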

Thank you for your answers. I tried optimizing, and after some debugging it seemed to work; I now have my .hef file.

Then I tried to run an example with my own hef file:

python basic_pipelines/detection.py --input resources/detection0.mp4 --hef test/fish.hef

which then promptly fails with these error messages:

NMS score threshold is set, but there is no NMS output in this model.
CHECK_SUCCESS failed with status=6

The full output in the console was:

(venv_hailo_rpi5_examples) pi@raspberrypi1:~/Ai_Kit/hailo-rpi5-examples $ DISPLAY=:0 python basic_pipelines/detection.py --input resources/detection0.mp4 --hef test/fish.hef
hailomuxer name=hmux filesrc location=resources/detection0.mp4 name=src_0 ! queue name=queue_dec264 max-size-buffers=3 max-size-bytes=0 max-size-time=0 !  qtdemux ! h264parse ! avdec_h264 max-threads=2 !  video/x-raw, format=I420 ! queue name=queue_scale max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoscale n-threads=2 ! queue name=queue_src_convert max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoconvert n-threads=3 name=src_convert qos=false ! video/x-raw, format=RGB, width=640, height=640, pixel-aspect-ratio=1/1 ! tee name=t ! queue name=bypass_queue max-size-buffers=20 max-size-bytes=0 max-size-time=0 ! hmux.sink_0 t. ! queue name=queue_hailonet max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoconvert n-threads=3 ! hailonet hef-path=test/fish.hef batch-size=2 nms-score-threshold=0.3 nms-iou-threshold=0.45 output-format-type=HAILO_FORMAT_TYPE_FLOAT32 force-writable=true ! queue name=queue_hailofilter max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hailofilter so-path=/home/pi/Ai_Kit/hailo-rpi5-examples/basic_pipelines/../resources/libyolo_hailortpp_post.so  qos=false ! queue name=queue_hmuc max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hmux.sink_1 hmux. ! queue name=queue_hailo_python max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! queue name=queue_user_callback max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! identity name=identity_callback ! queue name=queue_hailooverlay max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hailooverlay ! queue name=queue_videoconvert max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoconvert n-threads=3 qos=false ! queue name=queue_hailo_display max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! fpsdisplaysink video-sink=xvimagesink name=hailo_display sync=true text-overlay=False signal-fps-measurements=true
NMS score threshold is set, but there is no NMS output in this model.
CHECK_SUCCESS failed with status=6
(the two lines above repeat for every frame)

Did I miss something in the optimization step?

When I just run the model directly, it seems to work:

(venv_hailo_rpi5_examples) pi@raspberrypi1:~/Ai_Kit/hailo-rpi5-examples $ hailortcli run test/fish.hef
Running streaming inference (test/fish.hef):
  Transform data: true
    Type:      auto
    Quantized: true
Network yolov8/yolov8: 100% | 633 | FPS: 126.45 | ETA: 00:00:00
> Inference result:
 Network group: yolov8
    Frames count: 633
    FPS: 126.45
    Send Rate: 1243.08 Mbit/s
    Recv Rate: 420.83 Mbit/s