Relevance of batch size in the C++ API for HailoRT

Hi guys,

I am trying to understand the way batch size is handled in the HailoRT C++ API.

I am starting from the classifier example in the Hailo-Application-Code-Examples repository (links below).

I learned that I can modify the batch size by setting it on the relevant entry of the configure params. In the configure_network_group function (around line 86 of Hailo-Application-Code-Examples/runtime/cpp/classifier/classifier.cpp at dd6ada9d0d10e8b75660b74ab56ba018165204c0 · hailo-ai/Hailo-Application-Code-Examples · GitHub) I call

configure_params->begin()->second.batch_size = 4;
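
To make sure I am describing it correctly, this is roughly what my setup looks like (a sketch based on the example; configure_with_batch and hef_path are my own illustrative names, and error handling is trimmed):

#include "hailo/hailort.hpp"
#include <memory>
#include <string>
#include <utility>

using namespace hailort;

// Configure the network group from a HEF with a given batch size.
Expected<std::shared_ptr<ConfiguredNetworkGroup>> configure_with_batch(
    VDevice &vdevice, const std::string &hef_path, uint16_t batch_size)
{
    auto hef = Hef::create(hef_path);
    if (!hef) return make_unexpected(hef.status());

    auto configure_params = vdevice.create_configure_params(hef.value());
    if (!configure_params) return make_unexpected(configure_params.status());

    // Set the batch size before configuring, as in the example's configure_network_group
    configure_params->begin()->second.batch_size = batch_size;

    auto network_groups = vdevice.configure(hef.value(), configure_params.value());
    if (!network_groups) return make_unexpected(network_groups.status());

    return std::move(network_groups->at(0));
}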

After doing this, I assumed that I would have to call

input[0].write(MemoryView(batch.data(), batch.size()))

in line 134 (Hailo-Application-Code-Examples/runtime/cpp/classifier/classifier.cpp at dd6ada9d0d10e8b75660b74ab56ba018165204c0 · hailo-ai/Hailo-Application-Code-Examples · GitHub) with a MemoryView four times bigger, i.e. matching the size of the data for a full batch.

However, if I do this, I get:
[HailoRT] [error] CHECK failed - write size 602112 must be 150528

Similarly, output.read fails with:

[HailoRT] [error] CHECK failed - Buffer size is not the same as expected for pool! (16000 != 4000)

Do I also have to configure the InputVStream and OutputVStream somehow in order to pass a full batch?
Or is the batch size handled completely inside the Hailo device, so that regardless of the configured batch size, the data has to be passed image by image?

I am looking forward to hearing from you!

Best regards,
Wolf

Hi @Wolfram_Strothmann
The batch size is handled inside HailoRT. You should keep pushing frames one at a time; we manage the batching for you. Likewise on the output side, you will get your inference results one frame at a time. This way you can use the same code without worrying about the batch size. In addition, HailoRT can automatically fall back to a lower batch size if not enough frames are pushed within a (configurable) timeout.
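
To illustrate, here is a rough sketch of the write and read loops in the style of the classifier example (write_all/read_all and the buffer types are illustrative names, not a drop-in replacement; as in the example, the two loops run concurrently, e.g. via std::async). Note that the loops look exactly the same whether batch_size is 1 or 4:

#include "hailo/hailort.hpp"
#include <cstdint>
#include <vector>

using namespace hailort;

// Producer: push frames one at a time; HailoRT groups them into batches internally.
hailo_status write_all(InputVStream &input, std::vector<std::vector<uint8_t>> &frames)
{
    for (auto &frame : frames) {
        // Each write carries exactly one frame, so frame.size() must equal
        // input.get_frame_size() (150528 in your case), not batch_size times that.
        auto status = input.write(MemoryView(frame.data(), frame.size()));
        if (HAILO_SUCCESS != status) {
            return status;
        }
    }
    return HAILO_SUCCESS;
}

// Consumer: read one result per input frame, again independent of batch_size.
hailo_status read_all(OutputVStream &output, size_t frame_count)
{
    std::vector<uint8_t> result(output.get_frame_size());
    for (size_t i = 0; i < frame_count; i++) {
        auto status = output.read(MemoryView(result.data(), result.size()));
        if (HAILO_SUCCESS != status) {
            return status;
        }
        // ... post-process `result` here ...
    }
    return HAILO_SUCCESS;
}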

How does batch size affect inference latency? Generally, should I try to keep it larger or smaller?

Intuitively, I thought the smaller the better, but in practice batch_size=10 gives much lower latency than batch_size=1.

Also, is the batch_size parameter at inference somehow related to the batch_size parameter used when converting a .pt model to a .hef file?