ClientRunner.translate_onnx_model() fails on an Expand-Concat layer

SDK version: Hailo AI Software Suite 2024-10

I encountered an error while running ClientRunner.translate_onnx_model() on a network that includes an Expand-Concat layer. To investigate the issue, I created a simple code snippet that reproduces the problem, which I’ve included below.

import torch
import torch.nn as nn
import onnx
from hailo_sdk_client import ClientRunner
from onnxsim import simplify

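class SimpleONNXStyleNetwork(nn.Module):
    """Minimal reproduction of the Expand followed by Concat pattern."""

    def __init__(self):
        super().__init__()

    def forward(self, x):
        # x: [1, 577, 768]
        gathered = x[:, 0, :]                       # [1, 768]
        unsqueezed = gathered.unsqueeze(1)          # [1, 1, 768]

        # Broadcast along dim 1; this is the op that shows up as /Expand
        expanded = unsqueezed.expand(-1, 576, -1)   # [1, 576, 768]

        sliced = expanded[:, 0:576, :]              # [1, 576, 768]

        concatenated = torch.cat((expanded, sliced), dim=2)  # [1, 576, 1536]

        return concatenated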


def main():

    dummy_input = torch.randn(1, 577, 768)
    model = SimpleONNXStyleNetwork()
    model.eval()

    # Sanity check: run the model once in PyTorch before exporting
    output = model(dummy_input)
    print("Output shape:", output.shape)

    onnx_path = 'model.onnx'

    torch.onnx.export(model, dummy_input, onnx_path,
                      export_params=True,
                      training=torch.onnx.TrainingMode.EVAL,
                      input_names=["input.1"], # default: input.1
                      do_constant_folding=False,
                      opset_version=13)

    onnx_model = onnx.load(onnx_path)
    #onnx_model = onnx.shape_inference.infer_shapes(onnx_model)
    onnx_model, check = simplify(onnx_model)
    onnx.save(onnx_model, onnx_path)

    print("output ONNX file")

    chosen_hw_arch = 'hailo8'
    onnx_model_name = 'model'
    runner = ClientRunner(hw_arch=chosen_hw_arch)

    end_node_names = ['26']
    hn, npz = runner.translate_onnx_model(onnx_path, onnx_model_name,
                                          start_node_names=['input.1'],
                                          end_node_names=end_node_names,
                                          net_input_shapes={'input.1': [1, 577, 768]})


if __name__ == '__main__':
    main()

Error Message:

Traceback (most recent call last):
  File "/home/hailo/work/hailo-problem/my_network.py", line 62, in <module>
    main()
  File "/home/hailo/work/hailo-problem/my_network.py", line 55, in main
    hn, npz = runner.translate_onnx_model(onnx_path, onnx_model_name,
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1158, in translate_onnx_model
    parser.translate_onnx_model(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 276, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 316, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 367, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/translator.py", line 83, in convert_model
    self._create_layers()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 40, in _create_layers
    self._add_direct_layers()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 163, in _add_direct_layers
    raise ParsingWithRecommendationException(
hailo_sdk_client.model_translator.exceptions.ParsingWithRecommendationException: Parsing failed. The errors found in the graph are:
 UnexpectedNodeError in op /Expand: Unexpected node /Expand (Expand)
 UnsupportedModelError in op /Concat: In vertex /Expand the constant value shape (3,) must be broadcastable to the output shape [1, 768, 576]
Please try to parse the model again, using these end node names: /Unsqueeze

Note that the shape [1, 768, 576] reported in the error does not match the shape [1, 576, 768] that the Expand output has in my model: the last two dimensions are swapped.
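
To make the broadcast condition named in the second error concrete, here is a minimal NumPy sketch. It assumes the parser applies NumPy-style broadcasting rules, which is a guess about the SDK's internals and not something the error message confirms. `is_broadcastable` is a hypothetical helper written only for this illustration.

```python
import numpy as np


def is_broadcastable(src_shape, dst_shape):
    """Return True if src_shape can be broadcast to exactly dst_shape (NumPy rules)."""
    try:
        return np.broadcast_shapes(src_shape, dst_shape) == tuple(dst_shape)
    except ValueError:
        # NumPy raises ValueError when the shapes are incompatible
        return False


# The data tensor fed to Expand in the model above: [1, 1, 768] -> [1, 576, 768]
print(is_broadcastable((1, 1, 768), (1, 576, 768)))  # True

# The pair of shapes quoted in the error message
print(is_broadcastable((3,), (1, 768, 576)))         # False
```

The second input of an ONNX Expand node is a 1-D tensor holding the target shape, so for a three-dimensional target it has shape (3,). One possible reading of the error, not confirmed by Hailo, is that the parser evaluated that shape tensor as if it were the data being broadcast.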

I appreciate any guidance you can provide.
Best regards.

Hey @masaru.ueki ,

Welcome to the Hailo Community!

It seems the error is caused by a mismatch between the broadcast shape of the Expand layer and the expected output shape, as well as compatibility issues with how the Hailo SDK handles Concat operations. To fix this, you can try the following:

  1. Understand the Error
  • The Expand layer requires the constant value shape to be broadcastable to the target output shape.
  • The reported shape [1, 768, 576] doesn’t match your model’s expected shape [1, 576, 768].
  • The Hailo SDK has specific requirements for Expand and Concat operations that might differ from the standard ONNX graph representation.
  2. Modify the Code
  • Replace the expand() call with repeat(), which replicates the data explicitly instead of broadcasting it:
expanded = unsqueezed.repeat(1, 576, 1)
  • When exporting the model to ONNX, set parameters for better Hailo compatibility:
torch.onnx.export(
    model,
    dummy_input,
    onnx_path,
    export_params=True,
    opset_version=13,
    do_constant_folding=True,
    input_names=["input.1"],
    dynamic_axes={"input.1": {0: "batch_size"}}
)
  • Update end_node_names to ['/Unsqueeze'], the end node that the parser recommends in its error message, when parsing the model:
end_node_names = ['/Unsqueeze']
hn, npz = runner.translate_onnx_model(
    onnx_path,
    onnx_model_name,
    start_node_names=['input.1'],
    end_node_names=end_node_names,
    net_input_shapes={'input.1': [1, 577, 768]}
)
  3. Test with a Simplified Model
  • Create a simplified version of your model to test the changes before applying them to the full model.
  4. Enable Verbose Logging
  • Set the environment variable HAILO_SDK_LOG_LEVEL=DEBUG to get more detailed logs for debugging.
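
Before re-exporting, it can help to confirm that swapping expand() for repeat() leaves the model's output unchanged. The following is a minimal sketch on the same tensor shapes as the reproduction script; it only checks PyTorch behavior and makes no claim about what the Hailo parser accepts.

```python
import torch

x = torch.randn(1, 577, 768)
unsqueezed = x[:, 0, :].unsqueeze(1)        # [1, 1, 768]

# expand() returns a broadcast view without copying data
expanded = unsqueezed.expand(-1, 576, -1)   # [1, 576, 768]

# repeat() materializes the copies explicitly
repeated = unsqueezed.repeat(1, 576, 1)     # [1, 576, 768]

same_shape = expanded.shape == repeated.shape
same_values = torch.equal(expanded, repeated)
print(same_shape, same_values)
```

Whether the exported graph ends up free of Expand nodes depends on the exporter and simplifier versions, so it is worth inspecting the resulting ONNX file (for example by listing `node.op_type` for each node in `onnx_model.graph.node`) before parsing it again.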

Hi @omria,

Thank you for your helpful response.
Based on your advice to use repeat(), I converted the Expand node to a Tile node in the ONNX file, and it worked well. I really appreciate your guidance.

By the way, I have a quick question:
Are the limitations regarding Expand and Concat documented somewhere in the reference materials?