Setup
Device: Hailo-8 (M.2)
Hosts tested:
• CM4 carrier (PCIe Gen2 x1)
• PC x86_64 (PCIe Gen4 x1 slot; link runs at Gen3 x1)
• Raspberry Pi 5 (default Gen2 x1 → forced Gen3 x1)
HailoRT versions tested: 4.19 / 4.20 / 4.22 (same results)
Model: ssd_mobilenet_v2.hef
CPU load is low
PCIe info (lspci)
• CM4: LnkSta: Speed 5GT/s, Width x1 (Gen2 x1)
• PC: LnkSta: Speed 8GT/s, Width x1 (Gen3 x1)
• RPi5: Gen2 x1 by default; after dtparam=pciex1_gen=3 in config.txt → LnkSta: Speed 8GT/s, Width x1 (Gen3 x1)
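For anyone who wants to compare their own link state, here is a minimal sketch that turns an lspci LnkSta line into a generation and lane width. The function name parse_lnksta and the rate-to-generation table are my own for illustration; the sample lines are the ones quoted above.

```python
import re

# PCIe per-lane signalling rate (GT/s) -> generation.
GEN_BY_RATE = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4}


def parse_lnksta(line):
    """Return (generation, lane_width) parsed from an lspci LnkSta line."""
    # Tolerates annotations such as "(ok)" or "(downgraded)" after the speed.
    m = re.search(r"Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)", line)
    if m is None:
        raise ValueError(f"no speed/width found in: {line!r}")
    return GEN_BY_RATE[float(m.group(1))], int(m.group(2))


if __name__ == "__main__":
    samples = {
        "CM4": "LnkSta: Speed 5GT/s, Width x1",
        "PC": "LnkSta: Speed 8GT/s, Width x1",
    }
    for host, line in samples.items():
        gen, width = parse_lnksta(line)
        print(f"{host}: Gen{gen} x{width}")
```

The input lines would come from `sudo lspci -vv` for the Hailo device; LnkCap in the same output shows what the device and slot are capable of, as opposed to what was negotiated.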
Throughput / FPS (hailortcli)
• CM4 (Gen2 x1): ~75 FPS, Send ~162 Mbit/s (~20 MB/s)
• PC (Gen3 x1): ~126 FPS, Send ~271 Mbit/s (~34 MB/s)
• RPi5: Gen2 x1 ~81 FPS → Gen3 x1 ~126 FPS
So bandwidth isn’t saturated (the ~20 MB/s sent on CM4 is only about 4% of the ~500 MB/s theoretical Gen2 x1 rate), but latency/transaction overhead seems to limit feeding the device on CM4.
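The arithmetic behind that estimate, as a rough sketch. It assumes the input is 300×300 RGB at one byte per channel, counts only host-to-device traffic, and compares against the theoretical line rate after encoding (8b/10b for Gen2, 128b/130b for Gen3), ignoring TLP/DMA overhead. All names here are mine.

```python
# Assumption: 300x300 RGB input, uint8 (1 byte per channel).
FRAME_BYTES = 300 * 300 * 3

# Theoretical x1 data rate after line encoding, in MB/s.
# Gen2: 5 GT/s with 8b/10b; Gen3: 8 GT/s with 128b/130b.
LINK_MB_S = {
    2: 5e9 * 8 / 10 / 8 / 1e6,
    3: 8e9 * 128 / 130 / 8 / 1e6,
}


def input_stream(fps):
    """Return (MB/s, Mbit/s) of input data sent to the device at this FPS."""
    mb_s = fps * FRAME_BYTES / 1e6
    return mb_s, mb_s * 8


def link_utilisation(fps, gen):
    """Fraction of the theoretical x1 link rate used by the input stream."""
    return input_stream(fps)[0] / LINK_MB_S[gen]


if __name__ == "__main__":
    for host, fps, gen in [("CM4", 75, 2), ("PC", 126, 3)]:
        mb_s, mbit_s = input_stream(fps)
        print(
            f"{host}: {mb_s:.1f} MB/s ({mbit_s:.0f} Mbit/s), "
            f"{link_utilisation(fps, gen):.1%} of Gen{gen} x1, "
            f"{1000 / fps:.1f} ms per frame"
        )
```

Under these assumptions 75 FPS works out to 162 Mbit/s, which matches the hailortcli Send figure, so the reported number looks like pure input-frame traffic and sits at a few percent of the link rate on both hosts.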
Questions:
• On CM4, is ~70–80 FPS a realistic ceiling for ssd_mobilenet_v2 at 300×300, or should I expect more?
• Any recommended HailoRT knobs to better “fill” the device on CM4?
• Is there any way to adjust PCIe parameters for the CM4, either through the Linux kernel or config.txt, to improve link efficiency?
• Are there any known kernel or firmware settings specific to CM4 that could help in this context?
• Are there any known CM4-specific limitations/patches for Hailo-8 that I should be aware of?
Happy to provide more logs if useful.
Thanks for any guidance or examples of configs that improved FPS on CM4!