Hi! How can I set the batch size to 2? For example, I would like to just add two identical images. I would be very grateful for your answer. Here is my inference code:

inference_result = model(frame)
results = inference_result.results
box1 = np.array([det['bbox'] for det in results], dtype=np.float32)
score1 = np.array([det['score'] for det in results], dtype=np.float32)

and my model-loading code:

model = dg.load_model(
    model_name='yolov11n_5',
    inference_host_address='@local',
    zoo_url='/home/zoo_url/yolov11n_5'
)
Hi @An_ti11,
You can use model.predict_batch() instead of model.predict() to efficiently pipeline a sequence of frames; see the detailed description here: Running AI Model Inference | DeGirum Docs
In short, you pass a frame iterator as the method parameter; the method in turn returns an iterator over results, which you can consume in a for loop: for result in model.predict_batch(["image1.jpg", "image2.jpg"]):
Your input iterator may yield various frame types, as shown in the sketch after this list:
- strings containing image filenames
- numpy arrays with image bitmaps
- PIL image objects
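For the original question (a batch of two identical images), a minimal sketch could look like this, reusing the load_model() call and the result-parsing code from the first post; the image filename is a placeholder:

```python
import numpy as np
import degirum as dg

# Model-loading call from the original post
model = dg.load_model(
    model_name='yolov11n_5',
    inference_host_address='@local',
    zoo_url='/home/zoo_url/yolov11n_5'
)

# Two identical frames; filenames, numpy arrays, or PIL images all work
frames = ['image1.jpg', 'image1.jpg']

for result in model.predict_batch(frames):
    boxes = np.array([det['bbox'] for det in result.results], dtype=np.float32)
    scores = np.array([det['score'] for det in result.results], dtype=np.float32)
    print(boxes, scores)
```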
If you want to process a camera stream, the degirum_tools package provides convenient wrappers like degirum_tools.predict_stream(model, video_source); see the example here: hailo_examples/examples/004_rtsp.ipynb at main · DeGirum/hailo_examples
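For instance, a minimal sketch, assuming the same model as above and a webcam at index 0 as the video source (an RTSP URL or a video file path should work too):

```python
import degirum as dg
import degirum_tools

model = dg.load_model(
    model_name='yolov11n_5',
    inference_host_address='@local',
    zoo_url='/home/zoo_url/yolov11n_5'
)

# predict_stream yields one result per captured frame
for result in degirum_tools.predict_stream(model, 0):
    print(result.results)
```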
Thank you for your response, but I have a problem: I’m using YOLOv11n with the following code:
for result in model.predict_batch(frames):
    boxes = [det['bbox'] for det in result.results]
    scores = [det['score'] for det in result.results]
    all_boxes_list.append(boxes)
    all_scores_list.append(scores)
I pass five images, and when checking FPS using:
hailortcli run yolov11n_5.hef --batch-size 5
I get:
Running streaming inference (yolov11n_5.hef):
Transform data: true
Type: auto
Quantized: true
Network best/best: 100% | 1235 | FPS: 246.65 | ETA: 00:00:00
> Inference result:
Network group: best
Frames count: 1235
FPS: 246.66
Send Rate: 2424.74 Mbit/s
Recv Rate: 1087.66 Mbit/s
That's about 246 FPS, as I understand it. However, in my code, when I process five images in a batch, I only get around 11–12 FPS overall, and I have no idea why this is happening.
@An_ti11,
To measure FPS accurately, please try a longer batch, say 500–1000 frames.
Also, keep in mind that on the first frame the model is loaded into the accelerator, causing extra delay.
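For example, a rough timing loop (a sketch, assuming `model` is already loaded and `frame` is a numpy image from your pipeline) amortizes that one-time load over many frames:

```python
import time

# Repeat the same frame 500 times to measure steady-state throughput
frames = [frame] * 500

start = time.time()
for result in model.predict_batch(frames):
    pass  # consume results; the first iteration includes the model-load delay
elapsed = time.time() - start

print(f"overall FPS = {len(frames) / elapsed:.2f}")
```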
Also, degirum_tools provides a function that measures a model's inference performance. You can try it to see accurate results:
import degirum_tools
# assuming your model object is `model`
profile = degirum_tools.model_time_profile(model, iterations=500)
print(f"observed FPS = {profile.observed_fps:.2f}, "
      f"single frame inference = {profile.time_stats['CoreInferenceDuration_ms'].avg:.2f} ms")
@An_ti11,
Please note that even after you try @Vlad_Klimov's suggestions, you will not be able to match the 246 FPS number you get from profiling, because internally PySDK still runs with a batch size of 1.
So the problem is that I can’t use batching in PySDK, only sequential inference calls?
Hi @An_ti11,
There is pipelining inside PySDK, but no native batching at the inference-call level.
Thank you for your response. Do you happen to know of any libraries that implement full batching?
@An_ti11, the next release of PySDK (ETA: next week) will support Hailo batching.