Improve FPS using Python API

Hello! I am using a Hailo-8 for real-time inference. I run custom YOLOv8 through YOLOv11 object detection models with the Python API and use opencv-python to grab images from the camera. I looked through the benchmarks and saw that YOLOv8n from the Model Zoo can reach 600+ FPS in streaming mode using CLI commands. However, when I run inference from a Python script, a single image takes about 15-20 ms, which is only about 60 FPS.
Then I tried feeding a batch of 8 images, and it took 50-60 ms (~16-20 batches per second) for YOLOv11 Nano to process these images and return the result on the Hailo-8.
What could be the problem? I looked through the docs and the User Guide, but I can’t find anything that helps me reach the impressive results demonstrated on GitHub. Moreover, I found that the power consumption in the benchmark differs between Model Zoo models (about 3.5 W) and my custom-trained model (1.7 W). What could cause this difference?
I compiled my model with the latest Dataflow Compiler using CLI commands, following the guides on GitHub and in the Developer Zone. My board is an Orange Pi 5 Max with a PCIe 3.0 x4 interface.

Welcome to the Hailo Community!

Please have a look at the following post:

Hailo Community - My model runs slower than expected

Power consumption is, to a large extent, linear in the work done, plus a small static component. So when you run a model at half the FPS, you will measure roughly half the power.
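To make the relationship concrete, here is a minimal sketch of that linear power model. The function name `estimate_power_w` and the numbers (0.5 W static, 5 mJ per frame) are made-up illustrative values, not measurements from a Hailo-8 or from any datasheet.

```python
def estimate_power_w(fps, static_w, energy_per_frame_j):
    """Linear model: static power plus (energy per frame) x (frames per second)."""
    return static_w + energy_per_frame_j * fps


# Hypothetical values chosen only to illustrate the shape of the model.
STATIC_W = 0.5             # assumed idle/static power in watts
ENERGY_PER_FRAME_J = 0.005  # assumed energy per inference in joules

for fps in (150, 300, 600):
    power = estimate_power_w(fps, STATIC_W, ENERGY_PER_FRAME_J)
    dynamic = power - STATIC_W
    print(f"{fps:4d} FPS: total {power:.2f} W, dynamic part {dynamic:.2f} W")
```

In this model only the dynamic part scales with FPS, so a model that runs at a lower frame rate (or does less work per frame) will show a lower reading in the benchmark.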

When you use non-blocking code (not waiting for each result before sending the next frame), FPS is not the inverse of the latency. The Hailo-8 processes images in a true pipeline: as soon as the first layer finishes one image, it can start on the next while the later layers continue processing the previous frame.
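The effect can be illustrated with a small timing model. This is a sketch of an idealized pipeline, not HailoRT code: the function `pipeline_timing` and the four 4 ms stages are hypothetical, and real stage times depend on the compiled model.

```python
def pipeline_timing(num_frames, stage_ms):
    """Compare blocking and pipelined throughput for an idealized pipeline.

    stage_ms is a list of per-stage processing times in milliseconds.
    """
    latency_ms = sum(stage_ms)      # time for one frame to pass all stages
    bottleneck_ms = max(stage_ms)   # slowest stage limits the steady-state rate

    # Blocking: wait for each result before submitting the next frame.
    blocking_total_ms = num_frames * latency_ms
    # Pipelined: after the first frame, one result per bottleneck interval.
    pipelined_total_ms = latency_ms + (num_frames - 1) * bottleneck_ms

    return {
        "latency_ms": latency_ms,
        "blocking_fps": 1000.0 * num_frames / blocking_total_ms,
        "pipelined_fps": 1000.0 * num_frames / pipelined_total_ms,
    }


# Hypothetical example: four stages of 4 ms each, 1000 frames.
result = pipeline_timing(num_frames=1000, stage_ms=[4.0, 4.0, 4.0, 4.0])
print(f"latency:       {result['latency_ms']:.1f} ms")
print(f"blocking FPS:  {result['blocking_fps']:.1f}")
print(f"pipelined FPS: {result['pipelined_fps']:.1f}")
```

With the same per-frame latency, the pipelined case approaches one frame per bottleneck interval, while the blocking case stays at one frame per full latency. A script that sends one image and waits for its result measures the blocking figure.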