How to Run Batched Inference (Batch Size 2) with YOLOX on Hailo-8?

I’m currently working on running YOLOX-tiny on the Hailo-8 accelerator using C++, referencing the hailo-ai/Hailo-Application-Code-Examples GitHub repository.
My setup uses an i.MX 8 CPU, and the code is based on the following example: Hailo-Application-Code-Examples/runtime/cpp/object_detection/object_detection.cpp

I’m working on a task that requires running inference on two images simultaneously.
Currently, I’m running inference twice in a loop, one image at a time, roughly as in the sketch below. Each inference takes around 18 milliseconds, so processing two images takes about 36 milliseconds, which exceeds my target of 33 milliseconds.
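
For concreteness, here is a minimal sketch of that sequential approach (`infer_one_image` and the 416×416 input size are illustrative placeholders, not code from the example):

```cpp
#include <cstdint>
#include <vector>

// Placeholder for the single-image pipeline in object_detection.cpp;
// in my setup one call takes ~18 ms end to end.
struct Detections { /* boxes, scores, class IDs */ };
Detections infer_one_image(const std::vector<uint8_t> &frame)
{
    (void)frame;
    return {};
}

int main()
{
    // Two preprocessed frames (416x416 RGB, YOLOX-tiny's default input).
    std::vector<uint8_t> frame_a(416 * 416 * 3), frame_b(416 * 416 * 3);
    for (const auto *frame : {&frame_a, &frame_b}) {
        Detections dets = infer_one_image(*frame);  // sequential: 2 x ~18 ms
        (void)dets;  // per-image post-processing goes here
    }
    // Total latency for the pair: ~36 ms, above the 33 ms target.
    return 0;
}
```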

To improve this, I’m considering running inference with a batch size of 2, hoping that processing both images together will reduce the overall latency.

However, the object_detection.cpp example only shows how to process a single image.
Could anyone guide me on how to modify the code to process two images at once, i.e., how to perform batched inference with a batch size of 2?

Hey @pal-uchi,

We’re planning to add this feature to the example you’re following. For now, here are the steps to implement batching with two images (a C++ sketch follows the list):

  1. Update your input tensor from [1, C, H, W] to [2, C, H, W] to include the batch dimension.

  2. Load and preprocess two images: resize, normalize, and convert both to the format your model expects (typically CHW).

  3. Create a single batch buffer by concatenating the two processed images, then copy this data into your reshaped input tensor.

  4. Submit for async inference and make sure your post-processing can handle the outputs for each image in the batch.
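
To make these steps concrete, here is a minimal sketch using the HailoRT C++ async API (the 4.x `InferModel` interface). Treat it as an outline under stated assumptions rather than drop-in code: the HEF name `yolox_tiny.hef`, the commented-out `preprocess_into()` calls, and the uint8 buffers are placeholders, and error handling is shortened to `expect()`. One note on step 3: with this API the batch is typically transparent, so rather than one concatenated buffer, the sketch keeps one buffer per frame and queues the two frames back-to-back; `set_batch_size(2)` then lets the runtime group them into a single hardware batch.

```cpp
// Sketch: batch-2 inference with the HailoRT 4.x InferModel (async) API.
#include "hailo/hailort.hpp"

#include <array>
#include <chrono>
#include <cstdint>
#include <vector>

using namespace hailort;

int main()
{
    auto vdevice = VDevice::create().expect("failed to create vdevice");
    auto model = vdevice->create_infer_model("yolox_tiny.hef")
                     .expect("failed to load HEF");

    // Step 1: ask the runtime to batch frames in pairs on the device.
    model->set_batch_size(2);
    auto configured = model->configure().expect("failed to configure");

    const size_t in_size = model->input()->get_frame_size();

    // Steps 2-3: two preprocessed frames, one buffer each, kept alive
    // until the jobs finish. preprocess_into() is a placeholder for
    // your own resize/format-conversion code.
    std::array<std::vector<uint8_t>, 2> inputs{
        std::vector<uint8_t>(in_size), std::vector<uint8_t>(in_size)};
    // preprocess_into(image_a, inputs[0].data());
    // preprocess_into(image_b, inputs[1].data());

    std::array<std::vector<std::vector<uint8_t>>, 2> outputs;
    std::vector<ConfiguredInferModel::Bindings> bindings_list;
    std::vector<AsyncInferJob> jobs;

    for (size_t i = 0; i < 2; i++) {
        auto bindings = configured.create_bindings().expect("no bindings");
        bindings.input()->set_buffer(MemoryView(inputs[i].data(), in_size));

        // One output buffer per output tensor, per frame.
        for (const auto &out : model->outputs()) {
            outputs[i].emplace_back(out.get_frame_size());
            bindings.output(out.name())->set_buffer(
                MemoryView(outputs[i].back().data(), outputs[i].back().size()));
        }
        bindings_list.push_back(bindings);

        // Step 4: queue both frames back-to-back; with batch size 2 the
        // runtime sends them to the device together.
        configured.wait_for_async_ready(std::chrono::milliseconds(1000));
        jobs.emplace_back(
            configured.run_async(bindings).expect("run_async failed"));
    }

    for (auto &job : jobs) {
        job.wait(std::chrono::milliseconds(1000));  // both frames done
    }
    // Post-process outputs[0] and outputs[1] separately, one per image.
    return 0;
}
```

If your HailoRT version accepts a full batch per bindings (a single contiguous [2, C, H, W] buffer as in step 3), the same structure applies with one bindings object whose input buffer is `2 * get_frame_size()` bytes.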

Let us know if you need any clarification on implementing these steps!

Thanks for the response, @omria san.

I have one question regarding Step 1.

  1. Update your input tensor from [1, C, H, W] to [2, C, H, W] to include the batch dimension.

What should I change to update the tensor size: the C++ program at runtime, or the DFC settings used for HEF generation?