The whole process of using the Hailo-8 with a Raspberry Pi 5 has been pretty frustrating so far. I hope somebody can help me figure out how to work around these problems.
The documentation is cryptic and hard to find.
Parts of the system work only on the RPi, other parts work only on the x86 host, and they need to be downloaded from different places.
I had to spend a couple of hours finding the best params for the model to fix “shift delta” errors.
I had to convert the model from Torch to ONNX, and then from ONNX to HEF.
The final compilation step either takes multiple hours or just gets stuck. I have a beefy 16-core CPU with 128GB of RAM and it’s using just two cores.
All the command-line scripts show cryptic error messages, and it’s pretty hard to figure out why, for example, it’s not using my GPU for the quantization step.
Overall, quantizing a model for ONNX was easy and fun; the process for Hailo is just frustrating.
The alls line has nothing to do with compilation speed. It simply allows one to split the model into multiple contexts so that the Hailo chip can execute each context sequentially in chunks. The error message that you posted earlier suggested that the compiler was trying to fit everything into one context.
If there is no finetuning/adaround (these two optimizations should use a GPU; for everything else it isn’t necessary), the compilation time is dominated by solving LP/MIP problems. The Hailo compiler looks like it is using COIN-OR, which needs to be specifically built to support multithreading. Even then, multithreading is not guaranteed to accelerate MIP solving, depending on the problem itself.
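To illustrate why extra threads don't automatically help with MIP, here is a toy branch-and-bound solver for a 0/1 knapsack problem. It is only a sketch of the general technique, not anything taken from the Hailo compiler or COIN-OR; `lp_bound` and `solve_knapsack` are names I made up for the example. The point is that a subtree can only be pruned once a good incumbent has been found, so how much work there is depends on the order in which nodes get explored.

```python
def lp_bound(items, capacity, start, value):
    """Upper bound from the LP relaxation: fill greedily, allow one fractional item."""
    for v, w in items[start:]:
        if w <= capacity:
            capacity -= w
            value += v
        else:
            return value + v * capacity / w
    return value


def solve_knapsack(values, weights, capacity):
    """Depth-first branch and bound. Returns (best value, nodes explored)."""
    # Sort by value density so the greedy fractional fill is the LP optimum.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0   # incumbent: best feasible value found so far
    nodes = 0
    stack = [(0, 0, capacity)]  # (next item index, value so far, remaining capacity)
    while stack:
        idx, value, cap = stack.pop()
        nodes += 1
        best = max(best, value)  # every node is a feasible partial selection
        if idx == len(items):
            continue
        # Prune: the relaxation says this subtree cannot beat the incumbent.
        # This check depends on `best`, which was found by earlier nodes.
        if lp_bound(items, cap, idx, value) <= best:
            continue
        v, w = items[idx]
        stack.append((idx + 1, value, cap))              # branch: skip the item
        if w <= cap:
            stack.append((idx + 1, value + v, cap - w))  # branch: take it (popped first)
    return best, nodes


if __name__ == "__main__":
    best, nodes = solve_knapsack([60, 100, 120], [10, 20, 30], 50)
    print("best value:", best, "nodes explored:", nodes)
```

A parallel solver hands different subtrees to different threads, but a thread that has not yet seen a good incumbent cannot prune and may explore nodes a sequential search would have skipped. That is why the speedup from multithreading varies so much from one problem to the next.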
Anyway, you need to post more info on what you’re doing, or the ONNX file itself. There’s not enough information here to replicate the problem or help.