Hi, how can I set the batch size to 2? For example, I would like to just pass two identical images. I would be very grateful for your answer.

inference_result = model(frame)
results = inference_result.results
box1 = np.array([det['bbox'] for det in results], dtype=np.float32)
score1 = np.array([det['score'] for det in results], dtype=np.float32)

model = dg.load_model(
    model_name='yolov11n_5',
    inference_host_address='@local',
    zoo_url='/home/zoo_url/yolov11n_5'
)
Hi, @An_ti11,
You can use model.predict_batch() instead of model.predict() to effectively pipeline a sequence of frames; see the detailed description here: Running AI Model Inference | DeGirum Docs
In a few words, you provide a frame iterator as a method parameter; the method, in turn, returns an iterator over results, which you can use in a for loop:
for result in model.predict_batch(["image1.jpg", "image2.jpg"]):
Your input iterator may yield various frame types:
- strings containing image filenames
- numpy arrays with image bitmaps
- PIL image objects
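For instance, a minimal sketch that mixes a filename with a numpy frame (the file name is a placeholder, cv2 is used only to produce the array, and `model` is assumed to be already loaded as shown above):

import cv2  # used here only to produce a numpy-array frame for this example

frames = [
    "image1.jpg",               # filename string
    cv2.imread("image1.jpg"),   # numpy array with an image bitmap
]

for result in model.predict_batch(frames):
    print(result.results)  # detections for the corresponding frame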
If you want to process a camera stream, the degirum_tools package provides convenient wrappers like degirum_tools.predict_stream(model, video_source); see the example here: hailo_examples/examples/004_rtsp.ipynb at main · DeGirum/hailo_examples
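For example, a minimal sketch (the video source 0 is a placeholder for your camera index; an RTSP URL string works as well):

import degirum_tools

# 0 selects the default local camera; an RTSP URL also works
for result in degirum_tools.predict_stream(model, 0):
    print(result.results)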
Thank you for your response, but I have a problem: I’m using YOLOv11n with the following code:
for result in model.predict_batch(frames):
    boxes = [det['bbox'] for det in result.results]
    scores = [det['score'] for det in result.results]
    all_boxes_list.append(boxes)
    all_scores_list.append(scores)
I pass five images, and when checking FPS using:
hailortcli run yolov11n_5.hef --batch-size 5
I get:
Running streaming inference (yolov11n_5.hef):
Transform data: true
Type: auto
Quantized: true
Network best/best: 100% | 1235 | FPS: 246.65 | ETA: 00:00:00
> Inference result:
Network group: best
Frames count: 1235
FPS: 246.66
Send Rate: 2424.74 Mbit/s
Recv Rate: 1087.66 Mbit/s
That's about 246 FPS, as I understand it. However, in my code, when I process five images in a batch, I only get around 11–12 FPS overall, and I have no idea why this is happening.
@An_ti11 ,
To accurately measure FPS, please try a longer batch, say 500–1000 frames.
Also, keep in mind that on the first frame the model is loaded into the accelerator, which causes extra delay.
Also, degirum_tools has a function that measures the inference performance of a model. You can try it to see accurate results:
import degirum_tools

# assuming your model object is `model`
profile = degirum_tools.model_time_profile(model, iterations=500)
print(
    f"observed FPS = {profile.observed_fps:.2f}, "
    f"single frame inference = {profile.time_stats['CoreInferenceDuration_ms'].avg:.2f} ms"
)
@An_ti11
Please note that even after you try @Vlad_Klimov 's suggestions, you will not be able to match the 246 FPS number you get from profiling, because internally PySDK still uses a batch of 1.
So the problem is that I can’t use batching in PySDK, only sequential inference calls?
Hi @An_ti11
There is pipelining inside PySDK, but no native batching at the inference call level.
Thank you for your response. Perhaps you know of any libraries that implement full batching?
@An_ti11 , next release of PySDK (ETA - next week) will support Hailo batching.
Hi @An_ti11, PySDK 0.17.0 has been released and supports Hailo batching. Please let us know if you have any questions!
Hi, I’m now using 0.17.0.
The input tensor is (1, 640, 640, 3) and I want to set batch size = 2.
But when I use model.predict_batch(input_data), where the shape of the data is (2, 640, 640, 3), it says the input tensor size does not match.
Only when I convert it to (2, 1, 640, 640, 3) does it work, but it's slow.
Am I doing anything wrong?
Thank you so much!
Hi @Zoey_Goh ,
Batching in PySDK works a little differently: predict_batch always accepts single frames, but internally it accumulates them into a batch when batching is enabled.
By default, batching is enabled with a batch size of 8. To change the batch size, you assign the model.eager_batch_size property of your model object.
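For instance, a minimal sketch assuming your (2, 640, 640, 3) array is fed frame by frame (the zero array is just a stand-in for your real frames):

import numpy as np

model.eager_batch_size = 2  # PySDK groups incoming frames into batches of 2 internally

# shape (2, 640, 640, 3): iterate over it to yield single (640, 640, 3) frames
input_data = np.zeros((2, 640, 640, 3), dtype=np.uint8)

for result in model.predict_batch(frame for frame in input_data):
    print(result.results)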
Please be advised that for single-context models (smaller models that fit completely into the Hailo accelerator's internal memory) batching does not improve performance, so we internally force the batch size to Auto for such models. Batching only makes sense for multi-context models. FPS improvements due to batching depend on the model. You may try measuring FPS vs. batch size to figure out the optimal setting.
Tell me if you need help with benchmarking: degirum_tools has a nice function, model_time_profile, to do it.
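For example, a rough benchmarking sketch along those lines (the batch size values are just examples):

import degirum_tools

# measure throughput for several batch sizes
for batch_size in (1, 2, 4, 8):
    model.eager_batch_size = batch_size
    profile = degirum_tools.model_time_profile(model, iterations=500)
    print(f"batch size {batch_size}: {profile.observed_fps:.2f} FPS")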
Our model’s batch size is uncertain and depends on the number of connected cameras, which may vary. Does this mean we can only process in a sequential manner using a for-loop?
Hi @Zoey_Goh ,
You may implement it in many ways, depending on your needs.
Approach A.
Multiplex frames from multiple cameras in a frame source function and feed this multiplexed stream into a single model object. Then demultiplex the results in the for-loop body. To simplify demultiplexing, your source function (which you pass as the predict_batch() argument) may return a tuple of a frame and arbitrary frame info. That frame info is then accessible via the result.info property. You may put the camera index into the frame info.
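A minimal sketch of this approach (the camera sources and the OpenCV captures are placeholders):

import cv2

def multiplexed_frames(cameras):
    # round-robin frames from several captures; the second tuple element
    # (the camera index) becomes result.info
    while True:
        for idx, cap in enumerate(cameras):
            ok, frame = cap.read()
            if not ok:
                return
            yield frame, idx

cameras = [cv2.VideoCapture(src) for src in (0, 1)]

for result in model.predict_batch(multiplexed_frames(cameras)):
    cam_idx = result.info  # demultiplex: which camera produced this result
    print(cam_idx, result.results)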
Approach B.
Create as many model objects as you have cameras. Run each model's prediction loop in a separate thread, in parallel with the others. Since batching is done internally, this works nicely as long as you use the same model name for each model object.
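A minimal sketch of this approach (model name, zoo path, and video sources are placeholders, reusing predict_stream from above):

import threading
import degirum as dg
import degirum_tools

def camera_worker(cam_idx, video_source):
    # each thread owns its model object; using the same model name everywhere
    # lets PySDK batch requests from all threads internally
    model = dg.load_model(
        model_name='yolov11n_5',
        inference_host_address='@local',
        zoo_url='/home/zoo_url/yolov11n_5',
    )
    for result in degirum_tools.predict_stream(model, video_source):
        print(cam_idx, result.results)

threads = [
    threading.Thread(target=camera_worker, args=(i, src), daemon=True)
    for i, src in enumerate((0, 1))
]
for t in threads:
    t.start()
for t in threads:
    t.join()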
Approach C.
If you have a great many cameras, you may even use the synchronous predict method, model.predict. Yes, it is less efficient than predict_batch, but when many model objects are used in parallel, a bunch of such objects can still provide enough load to keep the accelerator from starving. This may also simplify your code, since you can avoid the for-loops over predict_batch().
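A rough sketch of the per-camera loop for this approach (each camera thread would run it with its own model object and OpenCV capture, created as in the previous sketch):

def camera_loop(model, cap):
    # synchronous path: one frame in, one result out
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = model.predict(frame)
        print(result.results)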
Tell me if you need any help implementing a particular solution.
Also, when I try to call the model_time_profile method, it throws:
bb = degirum_tools.model_time_profile(model, 500)
  File "/home/lib/python3.10/site-packages/degirum_tools/inference_support.py", line 475, in model_time_profile
    raise NotImplementedError
NotImplementedError
Yeah… for non-image input types it is not implemented…
What is the model input type? Can you share the model JSON?
Thank you so much! I think Approach B would work well for me.
Could you please help me with an example implementation of Approach B? I really appreciate it.
Also, for advanced use cases we have the dgstreams framework, implemented as part of the degirum_tools package. You may take a look at the docs:
Streams | DeGirum Docs
And examples:
PySDKExamples/examples/dgstreams at main · DeGirum/PySDKExamples
Basically, it is like GStreamer, but in Python, and much simpler.
You connect multiple gizmos in an execution graph, where each gizmo runs in a separate thread.
Please take a look at this example:
PySDKExamples/examples/dgstreams/multi_camera_multi_model_detection.ipynb at main · DeGirum/PySDKExamples
Most likely it is exactly what you need.