YoloV9-tiny compilation fails

Hi @dennis.huegle,
Please run “hailortcli parse-hef ./yolov9-t.hef” and check whether the model is multi-context or single-context.
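For reference, the parse output reports a contexts count for multi-context HEFs. A rough sketch (the exact field names and formatting may differ between HailoRT versions):

```shell
# Inspect the compiled HEF; for a multi-context model the output
# includes a contexts count greater than 1 (illustrative).
hailortcli parse-hef ./yolov9-t.hef
```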

If it’s single context, there are two things you can do:

  1. In the optimization step, increase the compression level (this might hurt accuracy a bit, since more weights will be quantized to 4 bits).
  2. Run the compilation with the alls command “performance_param(compiler_optimization_level=max)” (compilation will take much longer, but it will give the best performance results possible).
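For example, both suggestions can go into the model script (.alls) used during optimization/compilation. The performance_param line is quoted from above; the compression-level command name here is an assumption based on model-script conventions, so please verify it against your Dataflow Compiler version:

```
# my_model.alls -- sketch, verify command names against your DFC documentation

# Raise the compression level (more 4-bit weights, may cost some accuracy)
model_optimization_flavor(compression_level=2)

# Let the compiler search longer for the best resource allocation
performance_param(compiler_optimization_level=max)
```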

If it’s multi-context, it makes sense that the FPS is lower compared to the single-context compiled yolov7-tiny model.
A compiled model in Hailo is loaded onto the chip, which has limited resources. If a model is too big to fit into the chip’s resources, we use “multiple contexts”: the model is broken into two or more parts, each of which fits the chip’s resources, and only one part runs on the chip at a time while the rest are stored in the host machine’s memory.
This makes it possible to compile bigger models, but the overhead of context switching reduces performance (FPS, latency).
There are two optional ways to increase performance when you have a multi-context model:

  1. Increase the batch size when running inference (for example, hailortcli run ./yolov9-t.hef --batch-size 8)
  2. Run the compilation with the alls command “performance_param(compiler_optimization_level=max)”
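As a sketch, applying the alls command from a Python flow would look roughly like this. It assumes the Dataflow Compiler’s ClientRunner API and a HAR file name that are illustrative, so check the exact calls against your DFC version:

```
# Sketch only -- requires the Hailo Dataflow Compiler SDK, not runnable as-is.
from hailo_sdk_client import ClientRunner

# "yolov9-t.har" is a placeholder for your quantized HAR
runner = ClientRunner(har="yolov9-t.har")

# Apply the performance_param alls command before compiling
runner.load_model_script("performance_param(compiler_optimization_level=max)\n")

# Compile to a HEF; expect a much longer compilation time
hef = runner.compile()
with open("yolov9-t.hef", "wb") as f:
    f.write(hef)
```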

If it’s single-context and the suggestions above don’t help, there’s not much more we can do to increase performance.

Regards,