UnsupportedShuffleLayerError while parsing ONNX model

Hello,
I’m trying to convert a model with a ConvNeXt backbone to HEF.
I’ve already exported my model to ONNX using:

torch.onnx.export(my_pt_model,
                  sample_input,
                  onnx_model_path,
                  export_params=True,
                  opset_version=15, 
                  training=torch.onnx.TrainingMode.PRESERVE,
                  do_constant_folding=False,
                  input_names=['input'],
                  output_names=['output'])

When I try parsing the ONNX model using:

hailo parser onnx /my/model.onnx --net-name model --har-path /my/model.har --hw-arch hailo8

I get this error message:

Traceback (most recent call last):
  File "/venv/bin/hailo", line 8, in <module>
    sys.exit(main())
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/main.py", line 111, in main
    ret_val = client_command_runner.run()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 68, in run
    return self._run(argv)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 89, in _run
    return args.func(args)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 216, in run
    raise ParsingWithRecommendationCLIException(
hailo_sdk_client.tools.parser_cli.ParsingWithRecommendationCLIException: Parsing failed. The errors found in the graph are:
 UnsupportedShuffleLayerError in op /backbone/features.1/features.1.0/block/block.6/Transpose: Failed to determine type of layer to create in node /backbone/features.1/features.1.0/block/block.6/Transpose
[...]
Please try to parse the model again, using:
hailo parser onnx /my/model.onnx --net-name model --har-path /my/model.har --hw-arch hailo8 --end-node-names "/backbone/features.5/features.5.0/stochastic_depth/Div" [...]

If I try to parse my model with the recommended command I get this message:

Traceback (most recent call last):
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 220, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 300, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 351, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/translator.py", line 79, in convert_model
    self._create_layers()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 34, in _create_layers
    self._add_direct_layers()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 187, in _add_direct_layers
    raise ParsingWithRecommendationException(
hailo_sdk_client.model_translator.exceptions.ParsingWithRecommendationException: Parsing failed. The errors found in the graph are:
 UnsupportedShuffleLayerError in op /backbone/features.1/features.1.0/block/block.6/Transpose: Failed to determine type of layer to create in node /backbone/features.1/features.1.0/block/block.6/Transpose
[...]
Please try to parse the model again, using these end node names: /backbone/features.1/features.1.0/block/block.5/Add, [...]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 188, in run
    runner = self._parse(net_name, args, tensor_shapes)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 251, in _parse
    runner.translate_onnx_model(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1158, in translate_onnx_model
    parser.translate_onnx_model(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 260, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 300, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 351, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/translator.py", line 79, in convert_model
    self._create_layers()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 34, in _create_layers
    self._add_direct_layers()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 187, in _add_direct_layers
    raise ParsingWithRecommendationException(
hailo_sdk_client.model_translator.exceptions.ParsingWithRecommendationException: Parsing failed. The errors found in the graph are:
 UnsupportedShuffleLayerError in op /backbone/features.1/features.1.0/block/block.6/Transpose: Failed to determine type of layer to create in node /backbone/features.1/features.1.0/block/block.6/Transpose
[...]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 220, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 300, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 352, in parse_model_to_hn
    hailo_nn = fuser.convert_model()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/fuser/fuser.py", line 106, in convert_model
    self._finalize_fused_model()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/fuser/fuser.py", line 456, in _finalize_fused_model
    self._output_graph.update_output_layers_order(self._end_node_names)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_common/hailo_nn/hailo_nn.py", line 904, in update_output_layers_order
    raise InvalidHNError(
hailo_sdk_common.hailo_nn.exceptions.InvalidHNError: The original node name /backbone/features.3/features.3.0/stochastic_depth/Div in end_node_names is missing in the HN.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/bin/hailo", line 8, in <module>
    sys.exit(main())
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/main.py", line 111, in main
    ret_val = client_command_runner.run()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 68, in run
    return self._run(argv)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 89, in _run
    return args.func(args)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 206, in run
    runner = self._parse(net_name, args, tensor_shapes)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 251, in _parse
    runner.translate_onnx_model(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1158, in translate_onnx_model
    parser.translate_onnx_model(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 260, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 300, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 352, in parse_model_to_hn
    hailo_nn = fuser.convert_model()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/fuser/fuser.py", line 106, in convert_model
    self._finalize_fused_model()
  File "/venv/lib/python3.10/site-packages/hailo_sdk_client/model_translator/fuser/fuser.py", line 456, in _finalize_fused_model
    self._output_graph.update_output_layers_order(self._end_node_names)
  File "/venv/lib/python3.10/site-packages/hailo_sdk_common/hailo_nn/hailo_nn.py", line 904, in update_output_layers_order
    raise InvalidHNError(
hailo_sdk_common.hailo_nn.exceptions.InvalidHNError: The original node name /backbone/features.3/features.3.0/stochastic_depth/Div in end_node_names is missing in the HN.

The documentation mentions a limitation on some transpose operations for tensors larger than 1.5 MB, so I tried using a (1, 3, 128, 128) input instead of (1, 3, 1920, 1080) for my ONNX export.
However, I still get the same error message.
Using different opset versions (8, 15, 17) didn’t change anything either.

Is there a way to deal with these transpose operations?

Transpose is a tricky operation, and whether it is supported depends on which dimensions are being transposed. Generally speaking, if you can rotate the kernel instead of the data tensor, that would be much better.

The model alternates between (0, 2, 3, 1) and (0, 3, 1, 2) transposes, so the tensor should alternate between (N, C, H, W) and (N, H, W, C).
The first few alternations don’t seem to cause any errors. Does this mean I ran into the 1.5 MB limit?
There are no convolutions happening in the ‘rotated state’.
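
As a quick sanity check that the two permutations really undo each other, here is a minimal sketch in plain Python. The helper names `permute_shape` and `compose` and the example shape are made up for illustration; they are not taken from the model.

```python
def permute_shape(shape, perm):
    """Shape that results from permuting a tensor of `shape` with `perm`."""
    return tuple(shape[axis] for axis in perm)


def compose(first, second):
    """Single permutation equivalent to applying `first`, then `second`."""
    return tuple(first[axis] for axis in second)


to_nhwc = (0, 2, 3, 1)  # (N, C, H, W) -> (N, H, W, C)
to_nchw = (0, 3, 1, 2)  # (N, H, W, C) -> (N, C, H, W)

nchw = (1, 96, 32, 24)  # hypothetical activation shape
nhwc = permute_shape(nchw, to_nhwc)

print(nhwc)                          # shape in the 'rotated state'
print(permute_shape(nhwc, to_nchw))  # back to the original layout
print(compose(to_nhwc, to_nchw))     # identity permutation
```

Since the composition is the identity, each pair of transposes only changes the layout in between, which is why moving the intermediate operations to the channel axis of the (N, C, H, W) tensor is an option at all.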

Hi @txm,

Could you explain what you mean by the 1.5 MB limit?

About your error, there seems to be a problem with the recommended end-node name offered by the parser.

You can try opening the ONNX file with https://netron.app/ and checking the correct name of the operation just before the /backbone/features.1/features.1.0/block/block.6/Transpose node.

If you have trouble there, please share the exported png from Netron.
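
If the graph is too large to navigate comfortably in Netron, the same lookup can be sketched with the `onnx` Python package. This assumes `onnx` is installed; `find_producers` and `producers_in_onnx_file` are hypothetical helper names for illustration and are not part of the Hailo tools.

```python
def find_producers(nodes, target_name):
    """Return the nodes whose outputs feed the node named `target_name`."""
    nodes = list(nodes)
    target = next(n for n in nodes if n.name == target_name)
    wanted = set(target.input)
    return [n for n in nodes if wanted.intersection(n.output)]


def producers_in_onnx_file(path, target_name):
    """Load an ONNX file and list the nodes directly before `target_name`."""
    import onnx  # imported lazily so find_producers works without onnx

    model = onnx.load(path)
    return find_producers(model.graph.node, target_name)
```

Usage would look something like this, printing the name and operator type of each node that feeds the failing Transpose:

```python
for node in producers_in_onnx_file(
    "/my/model.onnx",
    "/backbone/features.1/features.1.0/block/block.6/Transpose",
):
    print(node.name, node.op_type)
```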

Hey @nina-vilela,

thank you for the quick reply.

The DFC docs mention:
“Hailo supports the following Transpose operations:
• Transpose of Width ↔ Column dimensions.
• Transpose of Height ↔ Width dimensions, only in tensors where their complete quantized size is smaller than 1.5MB. This type of transpose is not optimal for performance, since it requires the buffering of the whole tensor, creating a “pipeline stop” that raises the latency of the model.
• Transpose of Height ↔ Feature, with the same disclaimer as above.”
So I thought this might be causing the problem, since my model repeats the same structure of operations and the first few repetitions seem to cause no problems.
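
To put rough numbers on that limit, here is a back-of-the-envelope sketch. It rests on several assumptions that are not confirmed anywhere in this thread: 8 bits per quantized value, "1.5MB" read as 1.5 * 1024 * 1024 bytes, and a first stage with a stride-4 stem and 96 channels. Substitute the values of your own model.

```python
# Assumption: "1.5MB" is read as 1.5 * 1024 * 1024 bytes.
LIMIT_BYTES = int(1.5 * 1024 * 1024)


def quantized_size_bytes(height, width, channels, bits_per_value=8):
    """Size of one H x W x C activation tensor at the given bit width."""
    return height * width * channels * bits_per_value // 8


def first_stage_size(input_h, input_w, stride=4, channels=96):
    """Hypothetical first stage: stride-4 stem, 96 output channels."""
    return quantized_size_bytes(input_h // stride, input_w // stride, channels)


for h, w in [(128, 128), (1920, 1080)]:
    size = first_stage_size(h, w)
    print(f"{h}x{w} input: {size} bytes, under limit: {size < LIMIT_BYTES}")
```

Under these assumptions the 128x128 export stays far below the limit while the 1920x1080 export does not. If the error appears in both cases, that would suggest the size limit is not the only factor.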

According to the Netron graph of my ONNX model, the output name of the previous node matches the input name of the current node.

Could you please share the exported PNG from Netron for this model?

Partly due to the size restrictions, I was only able to upload a snippet of the model, but it should show the structure.

@txm could you please zoom in on the problematic part of the model?

@nina-vilela Sorry, of course. The parts causing problems look like this:

The error is thrown for the 0, 3, 1, 2 transpose at the end.

I want to try removing the transpose operations: change the axis for the LayerNormalization, use broadcasting for the Add operation, and use einsum instead of matmul so that I can specify the axes accordingly.
I have no experience doing this. What are your thoughts?
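
One way this could look in PyTorch is sketched below. It assumes each block follows the pattern permute, LayerNorm, Linear, GELU, Linear, permute, which the node names in the error suggest but which is not confirmed here. Instead of einsum it swaps in 1x1 convolutions for the Linear layers, since a Linear applied to the last axis of an (N, H, W, C) tensor is mathematically a 1x1 convolution on (N, C, H, W). `LayerNorm2d` and `linear_to_conv1x1` are hypothetical names, and the sketch does not verify whether the Hailo parser accepts the resulting graph.

```python
import torch
import torch.nn as nn


class LayerNorm2d(nn.Module):
    """LayerNorm over the channel axis of an (N, C, H, W) tensor."""

    def __init__(self, num_channels, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(num_channels))
        self.bias = nn.Parameter(torch.zeros(num_channels))
        self.eps = eps

    def forward(self, x):
        # Normalize each pixel across channels (biased variance, like nn.LayerNorm).
        mean = x.mean(dim=1, keepdim=True)
        var = (x - mean).pow(2).mean(dim=1, keepdim=True)
        x = (x - mean) / torch.sqrt(var + self.eps)
        # Broadcast the per-channel affine parameters over H and W.
        return x * self.weight[:, None, None] + self.bias[:, None, None]


def linear_to_conv1x1(linear):
    """Build a 1x1 Conv2d that reuses the weights of an nn.Linear."""
    conv = nn.Conv2d(
        linear.in_features,
        linear.out_features,
        kernel_size=1,
        bias=linear.bias is not None,
    )
    with torch.no_grad():
        # (out, in) -> (out, in, 1, 1)
        conv.weight.copy_(linear.weight[:, :, None, None])
        if linear.bias is not None:
            conv.bias.copy_(linear.bias)
    return conv
```

With the trained weights copied across (`LayerNorm2d.weight` and `.bias` from the original `nn.LayerNorm`), the block can stay in (N, C, H, W) throughout, so neither permute should be needed. It is worth comparing the outputs of the original and rewritten blocks on a random input before re-exporting to ONNX.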