Hi,
Well, yes, I tried something, but the results weren’t good. Let me explain a little further.
The live video source is being streamed out of a commercial product. I won’t mention the brand or product name, but it’s essentially a Raspberry Pi 4 running some version of Linux (it’s a closed, proprietary system). One can configure this Pi to stream the video, which is H.264, in a number of ways - a raw UDP stream, an RTP stream, an RTSP stream, an RTMP stream, etc. - you get the picture.
Rather than using Python code to construct the GStreamer invocation, I just took the invocation from one of your examples (I think it was from the hailo-apps-infra git repo) and modified the start of it. So it’s:
sudo gst-launch-1.0 \
  udpsrc port=3000 buffer-size=13000000 name=source ! \
  queue name=source_queue_decode leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  decodebin name=source_decodebin ! \
  queue name=source_scale_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  videoscale name=source_videoscale n-threads=2 ! \
  queue name=source_convert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  videoconvert n-threads=3 name=source_convert qos=false ! \
  video/x-raw, pixel-aspect-ratio=1/1, format=RGB, width=1280, height=720 ! \
  queue name=inference_wrapper_input_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  hailocropper name=inference_wrapper_crop so-path=/usr/lib/aarch64-linux-gnu/hailo/tappas/post_processes/cropping_algorithms/libwhole_buffer.so function-name=create_crops use-letterbox=true resize-method=inter-area internal-offset=true \
  hailoaggregator name=inference_wrapper_agg \
  inference_wrapper_crop. ! \
  queue name=inference_wrapper_bypass_q leaky=no max-size-buffers=20 max-size-bytes=0 max-size-time=0 ! \
  inference_wrapper_agg.sink_0 \
  inference_wrapper_crop. ! \
  queue name=inference_scale_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  videoscale name=inference_videoscale n-threads=2 qos=false ! \
  queue name=inference_convert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  video/x-raw, pixel-aspect-ratio=1/1 ! \
  videoconvert name=inference_videoconvert n-threads=2 ! \
  queue name=inference_hailonet_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  hailonet name=inference_hailonet hef-path=/home/monster/git/hailo-apps-infra/hailo_apps_infra/../resources/yolov8m.hef batch-size=2 vdevice-group-id=1 nms-score-threshold=0.3 nms-iou-threshold=0.45 output-format-type=HAILO_FORMAT_TYPE_FLOAT32 force-writable=true ! \
  queue name=inference_hailofilter_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  hailofilter name=inference_hailofilter so-path=/home/monster/git/hailo-apps-infra/hailo_apps_infra/../resources/libyolo_hailortpp_postprocess.so function-name=filter_letterbox qos=false ! \
  queue name=inference_output_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  inference_wrapper_agg.sink_1 \
  inference_wrapper_agg. ! \
  queue name=inference_wrapper_output_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  hailotracker name=hailo_tracker class-id=1 kalman-dist-thr=0.8 iou-thr=0.9 init-iou-thr=0.7 keep-new-frames=2 keep-tracked-frames=15 keep-lost-frames=2 keep-past-metadata=False qos=False ! \
  queue name=hailo_tracker_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  queue name=identity_callback_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  identity name=identity_callback ! \
  queue name=hailo_display_overlay_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  hailooverlay name=hailo_display_overlay ! \
  queue name=hailo_display_videoconvert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  videoconvert name=hailo_display_videoconvert n-threads=2 qos=false ! \
  queue name=hailo_display_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! \
  fpsdisplaysink name=hailo_display video-sink=autovideosink sync=true text-overlay=False signal-fps-measurements=true
(The sudo is necessary to permit the large buffer.)
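In case it helps narrow things down, this is the stripped-down receive-and-decode test I was planning to run next, to check whether the network receive plus software H.264 decode keeps up on its own, before any Hailo elements are involved. It’s only a sketch (I haven’t verified it against my stream yet); if I’ve read the docs right, running with -v should make fpsdisplaysink print its last-message property with the measured frame rate:

sudo gst-launch-1.0 -v \
  udpsrc port=3000 buffer-size=13000000 ! \
  decodebin ! \
  fpsdisplaysink video-sink=fakesink text-overlay=false sync=false

My thinking is that if even this can’t hold the stream’s frame rate, then no amount of tuning downstream will help.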
The problem is that, done this way, the frame rate is appalling (at least on the Pi 5 that I’m using for this). All the other things I’ve tried with object detection and classification, with live cameras and prerecorded video, were great - low latency and butter smooth.
Currently, I don’t have the knowledge to analyze where the problem is, although a clue might be that if I use an RTSP stream instead, I get butter-smooth results but with huge latency (perhaps as much as 2 seconds). If you could point me in the right direction, I’d very much appreciate it. How can I see where the bottleneck(s) is/are? Is the fact that the Pi 5 doesn’t have H.264 hardware decoding a factor? Would I be better off using a Pi 4, for example? So I have many questions and I’m not sure what to do next.
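For what it’s worth, from reading the GStreamer docs I was going to try the built-in latency tracer to hunt for the bottleneck - something along these lines (again just a sketch, and I’m not confident I’d interpret the output correctly):

sudo GST_DEBUG="GST_TRACER:7" GST_TRACERS="latency" gst-launch-1.0 \
  <same pipeline as above> 2> latency_trace.log

Is that the right approach for seeing which element is holding frames up, or is there a better tool for this on the Pi?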