YOLO ONNX to Hailo HEF: End nodes, postprocessing and unsupported ops

Hi, I’m working on converting a custom YOLO-based object detection model (YOLO26m) to a Hailo .hef file and ran into some confusion regarding end nodes and postprocessing.

Environment

  • Hailo Dataflow Compiler: 3.33.0

  • CUDA: 12.5

  • cuDNN: 9.20 (for CUDA 12.x)

  • NVIDIA Driver: 595

  • Python: 3.10

  • Ubuntu 22.04

Installation was done according to the documentation. GPU is available and being used.

I was advised to set the end nodes at Conv layers (detection heads).
This works, but results in multiple output tensors (bbox + class heads across scales).

However, I know that some models in the Hailo Model Zoo / Model Explorer return a single output tensor whose contents are already postprocessed and structured.

By analyzing the ONNX graph, I noticed that:

  • Cutting at Conv layers removes the entire postprocessing pipeline

  • The model actually includes operations like Sigmoid, Concat, TopK, Gather, ReduceMax, etc.

So instead, I tried setting the end node at the very end of the graph, at /model.23/Concat_6 (right before the final output). With that end node, DFC parsing fails with errors like “GatherElements/TopK/Mod/ReduceMax operation is unsupported”.

  1. Are these postprocessing operations (TopK, Gather, ReduceMax, etc.) generally unsupported by DFC?
  2. If so, how are the models in Hailo Model Zoo / Model Explorer able to produce a single, structured output tensor?
  3. How do you implement the postprocessing? Would you be able to describe what steps are applied or share any example pipelines / reference implementations?
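
To make question 3 concrete, here is a minimal NumPy sketch of what I assume the removed tail of the graph does, based only on the op names I see (Sigmoid, ReduceMax, TopK, Gather, Concat). It is a simplification and not the actual exported graph: the function name postprocess, the parameters k and score_thr, the tensor shapes, and the assumption that the boxes are already decoded to xyxy are all mine.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function (the Sigmoid op)."""
    return 1.0 / (1.0 + np.exp(-x))


def postprocess(boxes, cls_logits, k=300, score_thr=0.25):
    """Hypothetical host-side postprocessing for Conv-level outputs.

    boxes:      (N, 4) boxes, ASSUMED already decoded to xyxy and
                concatenated across all scales.
    cls_logits: (N, C) raw class logits from the class heads.
    Returns an (M, 6) array of [x1, y1, x2, y2, score, class_id].
    """
    scores = sigmoid(cls_logits)            # Sigmoid
    best_score = scores.max(axis=1)         # ReduceMax over the class axis
    best_class = scores.argmax(axis=1)      # class that produced the max

    # TopK over anchors: indices of the k highest-scoring candidates.
    k = min(k, best_score.shape[0])
    top_idx = np.argsort(-best_score, kind="stable")[:k]

    # Gather the selected rows and Concat them into one structured tensor.
    dets = np.concatenate(
        [
            boxes[top_idx],
            best_score[top_idx, None],
            best_class[top_idx, None].astype(boxes.dtype),
        ],
        axis=1,
    )

    # Drop low-confidence candidates (threshold value is an assumption).
    return dets[dets[:, 4] >= score_thr]


if __name__ == "__main__":
    # Toy data standing in for the dequantized HEF outputs.
    rng = np.random.default_rng(0)
    toy_boxes = rng.uniform(0, 640, size=(8400, 4)).astype(np.float32)
    toy_logits = rng.normal(size=(8400, 80)).astype(np.float32)
    print(postprocess(toy_boxes, toy_logits).shape)
```

If this is roughly the right idea, then my question is whether it is expected to run on the host against the multi-tensor Conv outputs, or whether DFC can attach an equivalent step to the HEF.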

Any clarification or best practices would be greatly appreciated.

Thanks!