If you’re frequently going through the conversion flow—whether due to continuous model re-training or re-quantization—you might find the final compilation step particularly time-consuming, especially when using the maximum performance mode. Fortunately, there’s a way to significantly reduce re-compilation time.
Here’s how:
- Save the compiled HAR file.
- Extract the contents of the HAR file. You can either extract it like any other tar file or use the `hailo har extract` command. The key file you need is `MODEL_NAME.auto.alls`.
- Use the extracted auto model script during re-compilation. If you are using the `hailo` CLI tool, you can pass the model script with the `--model-script` flag. With the Python API, re-load the model script before compiling.
- Be aware that if the model has been re-trained, slight changes in allocation could occur. This may cause errors when using the entire `auto.alls` file.
  - To resolve this, start by removing all `buffers(` commands from the file.
  - If issues persist, also remove the `context_N.place(` commands.
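
The extraction and clean-up steps above can be scripted. The following is a minimal sketch using only the Python standard library, not an official Hailo utility. It assumes the HAR opens as a regular tar archive (as noted above) and that each `buffers(` or `context_N.place(` command sits on its own line in the auto model script. The function names `extract_auto_alls` and `strip_allocation_commands` are hypothetical helpers introduced for illustration.

```python
import re
import tarfile
from pathlib import Path

# Assumption: each command occupies a single line and starts the line.
BUFFERS_RE = re.compile(r"^\s*buffers\(")
CONTEXT_PLACE_RE = re.compile(r"^\s*context_\d+\.place\(")


def extract_auto_alls(har_path, out_dir):
    """Copy the first *.auto.alls member of a HAR (tar archive) into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(har_path) as har:
        for member in har.getmembers():
            if member.isfile() and member.name.endswith(".auto.alls"):
                target = out_dir / Path(member.name).name
                target.write_bytes(har.extractfile(member).read())
                return target
    raise FileNotFoundError(f"no .auto.alls file found in {har_path}")


def strip_allocation_commands(script_text, drop_context_place=False):
    """Remove buffers( lines, and optionally context_N.place( lines."""
    kept = []
    for line in script_text.splitlines():
        if BUFFERS_RE.match(line):
            continue  # first fallback: always drop buffers( commands
        if drop_context_place and CONTEXT_PLACE_RE.match(line):
            continue  # second fallback: also drop context placement
        kept.append(line)
    return "\n".join(kept) + "\n"
```

A typical flow would be to call `extract_auto_alls` on the saved HAR, try the script unchanged first, and only if re-compilation fails write out the result of `strip_allocation_commands`, enabling `drop_context_place=True` as the last resort. The resulting file can then be passed via `--model-script` or re-loaded through the Python API as described above.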