Frustrating experience of converting a model from Torch to Hailo

The whole process of using the Hailo-8 with a Raspberry Pi 5 has been pretty frustrating so far. I hope somebody can help me figure out ways to work around these problems.

  1. The documentation is cryptic and hard to find.

  2. Parts of the system work only on the RPi, other parts work only on the x86 host, and they need to be downloaded from different places.

  3. I had to spend a couple of hours finding the best parameters for the model to fix “shift delta” errors.

  4. I had to convert the model from Torch to ONNX, and then from ONNX to HEF.

  5. The final compilation step takes multiple hours, or is just plain stuck. I have a beefy 16-core CPU with 128 GB of RAM, and it’s using just two cores.

  6. All the command-line scripts show cryptic error messages, and it’s pretty hard to figure out why, e.g., the quantization step isn’t using my GPU.

Overall, quantizing a model for ONNX was easy and fun; the process for Hailo is just frustrating.

Hi @heratsi

If you are looking to compile a YOLO checkpoint, you can check out our cloud compiler: Early Access to DeGirum Cloud Compiler

I need to run my custom segmentation model, based on deeplabv3+resnet50 or lraspp+mobilenet.

I really don’t understand how I’m supposed to interpret messages like:

Iteration failed on: Automri finished with too many resources on context_0
Single context flow failed: Recoverable single context error

Please advise!

And my personal favorite:

Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered

Sounds like the compiler is forced to a single context for compilation.

Add this line to your alls file:

context_switch_param(mode=enabled)

I did; it still uses just two cores.

(hailo) $ cat model-script.alls
context_switch_param(mode=enabled)
(hailo) $ hailo compiler --hw-arch hailo8 --model-script model-script.alls deeplabv3_resnet50_10_256_0_005_0_0001_export_optimized.har

Now it ran for about 15 minutes and failed with the following assert:

compiler: ../src/allocator/network_graph_appender.cpp:402: Status<network_graph::NetworkNode*> allocator::NetworkGraphAppender::AddShortcut(std::string, network_graph::NetworkNode&, std::vector<network_graph::NetworkNode*>): Assertion `src_node.output_format() == (*first_succ)->input_format()' failed.

The alls line has nothing to do with compilation speed. It simply allows one to split the model into multiple contexts so that the Hailo chip can execute each context sequentially in chunks. The error message that you posted earlier suggested that the compiler was trying to fit everything into one context.

If there is no finetuning/AdaRound (those two optimizations should use a GPU; nothing else needs one), the compilation time is dominated by LP/MIP problems. The Hailo compiler looks like it is using COIN-OR, which needs to be specifically built to support multithreading. Even then, multithreading is not guaranteed to accelerate MIP solving; it depends on the problem itself.
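As a hedged illustration of that point (this is PuLP driving a CBC binary, not Hailo's internals): you can request threads from a COIN-OR solver, but whether it honors them depends on how CBC was built, and even then the speedup is problem-dependent.

```python
import pulp

# Toy MIP: maximize x + 2y subject to x + y <= 12, x and y integer in [0, 10].
prob = pulp.LpProblem("toy", pulp.LpMaximize)
x = pulp.LpVariable("x", 0, 10, cat="Integer")
y = pulp.LpVariable("y", 0, 10, cat="Integer")
prob += x + 2 * y          # objective
prob += x + y <= 12        # constraint

# threads=4 is only a request; a single-threaded CBC build ignores it.
prob.solve(pulp.PULP_CBC_CMD(threads=4, msg=False))
```

So even if the Hailo compiler exposed a thread count, it would not necessarily cut the multi-hour compile down proportionally.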

Anyway, you need to post more info on what you’re doing, or the ONNX itself; there’s not enough information to reproduce the issue or help.

See Network Graph Compilation Failure - #3 by omria for the same exception you just posted.