Combining Hailo and ONNX Runtime

Nils-Oliver · June 24, 2025, 8:46am

Hi everyone!

I was looking through the Application Code Examples and stumbled upon the Hailo ONNX Runtime example. My idea is to use this approach to port models that are not in the Hailo Model Zoo to the chip: run the first part of the model, containing all compatible steps, on the Hailo-8, then feed the Hailo-8 output directly into the rest of the ONNX model and run that part locally. In my mind, this would make it fairly easy to port models to the Hailo-8 while still offering a (hopefully) significant performance boost. Does this approach make sense, or are there any downsides I might be missing?
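To make the idea of "first part on the chip, rest locally" concrete, here is a minimal sketch of how a split point could be chosen. It assumes the model has already been flattened into a list of operator types in execution order, and the set `ASSUMED_SUPPORTED` is purely hypothetical: which operators actually compile for the Hailo-8 is decided by the Hailo toolchain, not by a list like this. The function names are illustrative as well.

```python
# Hypothetical set of operators assumed to compile for the accelerator.
# This is an assumption for illustration, not Hailo's real support list.
ASSUMED_SUPPORTED = {"Conv", "Relu", "MaxPool", "Add", "Concat"}


def split_point(op_types, supported=ASSUMED_SUPPORTED):
    """Return the index of the first operator that is not in `supported`."""
    for index, op in enumerate(op_types):
        if op not in supported:
            return index
    return len(op_types)


def partition(op_types, supported=ASSUMED_SUPPORTED):
    """Split the operator list into an accelerator prefix and a host suffix."""
    cut = split_point(op_types, supported)
    return op_types[:cut], op_types[cut:]


# Example operator sequence (made up for this sketch).
ops = ["Conv", "Relu", "Conv", "Relu", "NonMaxSuppression", "Gather"]
on_device, on_host = partition(ops)
print("accelerator part:", on_device)
print("host part:", on_host)
```

A real ONNX graph is a DAG rather than a flat list, so an actual cut has to be made at tensors that separate the graph cleanly; the sketch only shows the "longest compatible prefix" reasoning.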

KlausK · June 24, 2025, 9:10am

While ONNX Runtime can simplify running a model’s pre- and post-processing steps, it has the drawback of using a blocking API. On the Hailo device, layers execute concurrently: once the first layer has finished processing all rows of an image, it can immediately begin processing the next image. With a blocking API, however, you must wait for the current image’s result before sending the next one, so the device cannot be kept busy with several images at once, which can reduce performance. Depending on your requirements, this may still be sufficient.
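The difference between waiting for each result and keeping both halves busy can be sketched with plain Python threads. Everything here is a stand-in: `accelerator_stage` and `cpu_stage` are hypothetical placeholders that just sleep for an assumed 10 ms each, and neither the HailoRT nor the ONNX Runtime API is used.

```python
import queue
import threading
import time


def accelerator_stage(frame):
    """Stand-in for the model part on the accelerator (assumed 10 ms)."""
    time.sleep(0.01)
    return frame * 2


def cpu_stage(features):
    """Stand-in for the remaining model part on the host (assumed 10 ms)."""
    time.sleep(0.01)
    return features + 1


def run_blocking(frames):
    """Each frame passes through both stages before the next one is sent."""
    return [cpu_stage(accelerator_stage(f)) for f in frames]


def run_pipelined(frames):
    """A worker thread runs the first stage while the caller runs the second."""
    handoff = queue.Queue(maxsize=4)  # bounded, so the producer cannot run far ahead
    done = object()  # sentinel marking the end of the stream

    def producer():
        for f in frames:
            handoff.put(accelerator_stage(f))
        handoff.put(done)

    worker = threading.Thread(target=producer)
    worker.start()
    results = []
    while True:
        item = handoff.get()
        if item is done:
            break
        results.append(cpu_stage(item))
    worker.join()
    return results


if __name__ == "__main__":
    frames = list(range(20))
    for run in (run_blocking, run_pipelined):
        start = time.perf_counter()
        out = run(frames)
        print(run.__name__, f"{time.perf_counter() - start:.2f}s", out[:3])
```

Both functions return the same results in the same order; the pipelined version should simply finish sooner, because the first stage of the next frame overlaps with the second stage of the current one. Whether that gain matters depends on the throughput you need.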
