Hi everyone,
I’m attempting to run DinoV2/V3 models on Hailo hardware, but I’ve run into some challenges that I’d like to clarify with the community.
Context:
From what I understand, the Hailo Dataflow Compiler (DFC) doesn’t fully support certain internal layers used in Dino models. As a workaround, I found the ONNX Runtime Hailo Execution Provider and hoped to run the exported Dino ONNX model directly, leveraging at least partial acceleration from the Hailo device.
Current Status:
- I successfully compiled the model and can run inference through ONNX Runtime.
- However, when monitoring with `hailortcli monitor`, the Hailo device does not appear to be utilized.
- Inference seems to fall back to CPU execution, even though the ONNX Runtime Hailo examples use the accelerator correctly.
Questions:
- What is the correct workflow for converting an ONNX model with the DFC when some layers are unsupported? Is there documentation or a working example for generating a Hailo-compatible ONNX file?
- Are there plans to further develop or expand the ONNX Runtime Hailo integration, especially for models with unsupported layers?
- Is it feasible to run Dino models on Hailo in any form (e.g., partial offloading), or would you recommend an alternative approach?
- The ONNX Runtime README mentions potential “extraction capabilities” in the DFC. Is this feature planned or available to help support models like Dino?
Any guidance or pointers would be greatly appreciated.
Thank you in advance for your help!
Best,
Francesco