batch more than 1

Hi, how can I set the batch size to 2? For example, I would like to just add two identical images. I would be very grateful for your answer.

inference_result = model(frame)
results = inference_result.results
box1 = np.array([det['bbox'] for det in results], dtype=np.float32)
score1 = np.array([det['score'] for det in results], dtype=np.float32)

model = dg.load_model(
    model_name='yolov11n_5',
    inference_host_address='@local',
    zoo_url='/home/zoo_url/yolov11n_5'
)

Hi @An_ti11,

You can use model.predict_batch() instead of model.predict() to effectively pipeline a sequence of frames; see the detailed description here: Running AI Model Inference | DeGirum Docs

In a few words, you pass a frame iterator as the method parameter; the method in turn returns an iterator over the results, which you can use in a for loop: for result in model.predict_batch(["image1.jpg", "image2.jpg"]):

Your input iterator may yield various frame types (a minimal sketch follows the list below):

  • strings containing image filenames
  • numpy arrays with image bitmaps
  • PIL image objects
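
For illustration, here is a minimal sketch of feeding predict_batch() with a generator. This is not a definitive recipe: the model name, zoo path, and image filenames are placeholders taken from this thread.

import degirum as dg

model = dg.load_model(
    model_name='yolov11n_5',            # placeholder model from this thread
    inference_host_address='@local',
    zoo_url='/home/zoo_url/yolov11n_5'  # placeholder zoo path
)

def frame_source():
    # any of the frame types listed above may be yielded here
    for path in ["image1.jpg", "image2.jpg"]:
        yield path

# predict_batch() pipelines the frames and yields one result per frame
for result in model.predict_batch(frame_source()):
    print(result.results)  # list of detections with 'bbox', 'score', etc.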

If you want to process a camera stream, the degirum_tools package provides convenient wrappers like degirum_tools.predict_stream(model, video_source); see the example here: hailo_examples/examples/004_rtsp.ipynb at main · DeGirum/hailo_examples
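
As a rough sketch, assuming the model object from above and that predict_stream yields inference results the same way predict_batch does (the camera index 0 is a placeholder):

import degirum_tools

# 0 = default local webcam; a video file path or RTSP URL is also a
# plausible video_source (assumption based on the linked example)
for result in degirum_tools.predict_stream(model, 0):
    print(result.results)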

Thank you for your response, but I have a problem: I’m using YOLOv11n with the following code:

for result in model.predict_batch(frames):
    boxes = [det['bbox'] for det in result.results]
    scores = [det['score'] for det in result.results]
    all_boxes_list.append(boxes)
    all_scores_list.append(scores)

I pass five images, and when checking FPS using:

hailortcli run yolov11n_5.hef --batch-size 5

I get:

Running streaming inference (yolov11n_5.hef):
  Transform data: true
    Type:      auto
    Quantized: true
Network best/best: 100% | 1235 | FPS: 246.65 | ETA: 00:00:00
> Inference result:
 Network group: best
    Frames count: 1235
    FPS: 246.66
    Send Rate: 2424.74 Mbit/s
    Recv Rate: 1087.66 Mbit/s

That's about 246 FPS, as I understand it. However, in my code, when I process five images in a batch, I only get around 11–12 FPS overall, and I have no idea why this is happening.

@An_ti11,

To measure FPS accurately, please try a longer batch, say, 500–1000 frames.
Also, keep in mind that the model is loaded into the accelerator on the first frame, causing an extra delay.
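
For example, a rough manual measurement might look like this (a sketch; repeating a single placeholder image 500 times is just an arbitrary way to get a long batch):

import time

frames = ["image1.jpg"] * 500  # placeholder image, repeated for a long run
count = 0
start = time.time()
for _ in model.predict_batch(frames):
    count += 1
elapsed = time.time() - start
print(f"overall FPS: {count / elapsed:.2f}")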

Also, degirum_tools has a function that measures the inference performance of a model. You can try it to see accurate results:

import degirum_tools

# assuming your model object is `model`
profile = degirum_tools.model_time_profile(model, iterations=500)
print(f"observed FPS = {profile.observed_fps:.2f}, single frame inference = {profile.time_stats["CoreInferenceDuration_ms"].avg:.2f} ms")

@An_ti11
Please note that even after you try @Vlad_Klimov's suggestions, you will not be able to match the 246 FPS number you get from profiling, because internally PySDK still uses a batch size of 1.

So the problem is that I can’t use batching in PySDK, only sequential inference calls?

Hi @An_ti11
There is pipelining inside PySDK, but no native batching at the inference call level.

Thank you for your response. Perhaps you know of any libraries that implement full batching?

@An_ti11, the next release of PySDK (ETA: next week) will support Hailo batching.