Issue with yoloe-11s-seg model structure changing when I optimize with hailomz

I’m converting a yoloe-11s-seg model to HEF.

Converting the ONNX file to HAR with hailomz parse succeeds.

However, after I quantize the HAR file with hailomz optimize, the matmul1 layer ends up with two outputs, and hailomz compile then fails with the following error.

[error] Mapping Failed (allocation time: 8m 53s)
Failed to reach required FPS on the following layers:
Compilation failed with exception: More than one output is not supported for layer matmul1

[error] Failed to produce compiled graph
[error] BackendAllocatorException: Compilation failed: Failed to reach required FPS on the following layers:
Compilation failed with exception: More than one output is not supported for layer matmul1

My analysis is that optimization changes the model’s structure so that matmul1’s output is split into two. If that step could be avoided, the conversion should succeed. Is there a way to do this?

  • Before optimization har structure

  • After optimization har structure

Hey @HWANG_JUNYOUNG,

Welcome to the Hailo Community!

Your analysis is spot on. The issue occurs because hailomz optimize transforms your graph so that Matmul1 gets multiple outputs, but our compiler doesn’t support multi-output MatMul operations.

What’s happening:
The optimization process applies graph rewrites that sometimes split outputs to serve multiple consumers. In your case, this creates the unsupported multi-output MatMul scenario.

Solutions:

  1. Limit optimization scope - Use --only-quantization to skip structural changes:

    hailomz optimize model.har --only-quantization
    
  2. Disable problematic passes - Run with --disable-pass split_outputs or similar (check --print-passes for exact names)

  3. Preprocess your ONNX - Ensure Matmul1 has a single consumer before parsing by adding Identity nodes or using tools like onnx-simplifier

  4. Try QAT - Quantization Aware Training can avoid post-training structural changes
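
For option 3, it helps to first see which MatMul outputs in the ONNX graph are read by more than one consumer. The following is a minimal sketch, not an official Hailo tool. The function name `find_multi_consumer_matmuls` is made up for illustration, and the file name in the usage comment is a placeholder. It also assumes the nodes in your ONNX file carry names. Note that the Hailo layer name matmul1 is assigned by the parser and will not necessarily match an ONNX node name.

```python
from collections import defaultdict


def find_multi_consumer_matmuls(graph):
    """Return {matmul_node_name: [consumer names]} for MatMul outputs read more than once.

    `graph` is expected to look like an onnx GraphProto: `graph.node` is a
    sequence of nodes with `name`, `op_type`, `input` and `output`, and
    `graph.output` is a sequence of values with `name`.
    """
    # Map every tensor name to the names of the nodes that read it.
    consumers = defaultdict(list)
    for node in graph.node:
        for tensor in node.input:
            consumers[tensor].append(node.name)

    # A tensor exposed as a graph output counts as one more consumer.
    for out in graph.output:
        consumers[out.name].append("<graph output>")

    report = {}
    for node in graph.node:
        if node.op_type != "MatMul":
            continue
        for tensor in node.output:
            if len(consumers[tensor]) > 1:
                report[node.name] = consumers[tensor]
    return report


# Usage with the onnx package (file name is a placeholder):
#   import onnx
#   model = onnx.load("yoloe-11s-seg.onnx")
#   print(find_multi_consumer_matmuls(model.graph))
```

Any MatMul that shows up in the report is a candidate for the restructuring described in option 3. Whether that restructuring survives parsing and optimization has to be checked by re-running the flow.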

Hope this helps!

Thank you for your answer.
But my hailomz has no --only-quantization argument and no --disable-pass split_outputs argument.

I think this is because my hailomz version is 2.16.0.

So how can I solve this problem other than by preprocessing my ONNX model or using QAT?