Hi, I’m working on converting a custom YOLO-based object detection model (YOLO26m) to a Hailo .hef and ran into some confusion regarding end nodes and postprocessing.
Environment
- Hailo Dataflow Compiler: 3.33.0
- CUDA: 12.5
- cuDNN: 9.20 (for CUDA 12.x)
- NVIDIA Driver: 595
- Python: 3.10
- Ubuntu 22.04
Installation was done according to the documentation. GPU is available and being used.
I was advised to set the end nodes at Conv layers (detection heads).
This works, but results in multiple output tensors (bbox + class heads across scales).
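For context, this is roughly what I do on the host with those raw Conv-head tensors. It’s only my sketch of what the stripped ONNX postprocessing computes for a YOLO-style model; the shapes (640×640 input, stride-8 scale, 80 classes, reg_max=16 for the DFL box branch) are assumptions about my model, not something from Hailo’s docs:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decode_scale(box_raw, cls_raw, stride, reg_max=16):
    """Decode one scale: box_raw (H, W, 4*reg_max), cls_raw (H, W, num_classes)."""
    h, w, _ = box_raw.shape
    # DFL: softmax over reg_max bins, then expectation -> l, t, r, b distances
    dist = softmax(box_raw.reshape(h, w, 4, reg_max), axis=-1)
    dist = (dist * np.arange(reg_max)).sum(-1)               # (H, W, 4)
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cx, cy = xs + 0.5, ys + 0.5                              # anchor centers
    x1 = (cx - dist[..., 0]) * stride
    y1 = (cy - dist[..., 1]) * stride
    x2 = (cx + dist[..., 2]) * stride
    y2 = (cy + dist[..., 3]) * stride
    boxes = np.stack([x1, y1, x2, y2], axis=-1).reshape(-1, 4)
    # Sigmoid on class logits (one of the ops removed by cutting at the Convs)
    scores = (1.0 / (1.0 + np.exp(-cls_raw))).reshape(-1, cls_raw.shape[-1])
    return boxes, scores

# Dummy data standing in for the stride-8 head of a 640x640 input:
box_raw = np.random.randn(80, 80, 64).astype(np.float32)
cls_raw = np.random.randn(80, 80, 80).astype(np.float32)
boxes, scores = decode_scale(box_raw, cls_raw, stride=8)
print(boxes.shape, scores.shape)  # (6400, 4) (6400, 80)
```

The same decode would be repeated for the stride-16 and stride-32 heads and the results concatenated, which is why cutting at the Convs leaves so many output tensors.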
However, I know that some models in the Hailo Model Zoo / Model Explorer return a single output tensor and already have postprocessed / structured outputs.
By analyzing the ONNX graph, I noticed that:
- Cutting at the Conv layers removes the entire postprocessing pipeline
- The model actually includes operations like Sigmoid, Concat, TopK, Gather, ReduceMax, etc.
So instead, I tried setting the end node at the very end of the graph (right before the final output), at /model.23/Concat_6. With this, DFC parsing fails with errors like “GatherElements/TopK/Mod/ReduceMax operation is unsupported”.
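In case it helps frame the question, here are plain numpy equivalents of the tail ops the parser rejected (ReduceMax → TopK → Gather), i.e. what I believe that final subgraph computes and what I could run on the host instead. This is my own sketch, not anything from the Model Zoo:

```python
import numpy as np

def select_topk(boxes, scores, k=300):
    """boxes: (N, 4), scores: (N, C) per-class scores after sigmoid."""
    best = scores.max(axis=1)                  # ReduceMax over classes
    cls = scores.argmax(axis=1)                # winning class per candidate
    k = min(k, best.shape[0])
    idx = np.argpartition(-best, k - 1)[:k]    # TopK (unsorted)
    idx = idx[np.argsort(-best[idx])]          # sort the k winners descending
    return boxes[idx], best[idx], cls[idx]     # Gather by index

# Dummy decoded candidates:
boxes = np.random.rand(6400, 4).astype(np.float32)
scores = np.random.rand(6400, 80).astype(np.float32)
b, s, c = select_topk(boxes, scores, k=300)
print(b.shape, s.shape, c.shape)  # (300, 4) (300,) (300,)
```

If these ops genuinely can’t run on-device, a pointer to where this selection/NMS stage is supposed to live (host code vs. some built-in postprocess option) would answer most of my question.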
- Are these postprocessing operations (TopK, Gather, ReduceMax, etc.) generally unsupported by DFC?
- If so, how are the models in the Hailo Model Zoo / Model Explorer able to produce a single, structured output tensor?
- How do you implement the postprocessing? Would you be able to describe what steps are applied or share any example pipelines / reference implementations?
Any clarification or best practices would be greatly appreciated.
Thanks!