PyTorch to ONNX to HAR

Hi!

I am currently trying to convert a SuperPoint model to ONNX so that I can eventually convert it to HEF format.

I have converted the SuperPoint model to ONNX, and the export seems correct: I compared the outputs of the ONNX and PyTorch models and they appear to match. However, when I run the Hailo parser (hailo parser onnx superpoint.onnx --hw-arch hailo8 --tensor-shapes [1,1,720,1280]), I get the following error:

2025-02-25 17:09:05.101374238 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running TopK node. Name:'/TopK' Status Message: k argument [500] should not be greater than specified axis dim value [0]
[warning] ONNX shape inference failed: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running TopK node. Name:'/TopK' Status Message: k argument [500] should not be greater than specified axis dim value [0]
[info] Simplified ONNX model for a parsing retry attempt (completion time: 00:00:01.27)
2025-02-25 17:09:06.104015686 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running TopK node. Name:'/TopK' Status Message: k argument [500] should not be greater than specified axis dim value [0]
[warning] ONNX shape inference failed: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running TopK node. Name:'/TopK' Status Message: k argument [500] should not be greater than specified axis dim value [0]
Traceback (most recent call last):
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 239, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 320, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 371, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/translator.py", line 83, in convert_model
    self._create_layers()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 38, in _create_layers
    self._update_vertices_info()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_translator.py", line 316, in _update_vertices_info
    node.update_output_format()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 507, in update_output_format
    or self.is_null_transpose()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 6143, in is_null_transpose
    if self.is_torch_tile_resize_nearest() or self.is_null_transpose_near_torch_tile():
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 4718, in is_null_transpose_near_torch_tile
    input_shape = self.get_input_shapes(convert_to_nhwc=False)[0]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dylan/hailodfc/bin/hailo", line 8, in <module>
    sys.exit(main())
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/main.py", line 111, in main
    ret_val = client_command_runner.run()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 68, in run
    return self._run(argv)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 89, in _run
    return args.func(args)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 201, in run
    runner = self._parse(net_name, args, tensor_shapes)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 287, in _parse
    runner.translate_onnx_model(
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1177, in translate_onnx_model
    parser.translate_onnx_model(
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 280, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 320, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 371, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/translator.py", line 83, in convert_model
    self._create_layers()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 38, in _create_layers
    self._update_vertices_info()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_translator.py", line 316, in _update_vertices_info
    node.update_output_format()
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 500, in update_output_format
    self.update_reshape_output_format(input_format)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 332, in update_reshape_output_format
    is_features_to_heads, f_to_g_format = self.is_features_to_groups_reshape(input_format)
  File "/home/dylan/hailodfc/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 3491, in is_features_to_groups_reshape
    default_out_format.insert(stack_idx, Dims.STACK)
TypeError: 'NoneType' object cannot be interpreted as an integer

This is my code for extracting the top-k keypoints:

import torch

def top_k_keypoints(keypoints, scores, k: int):
    # Create a mask where scores are in top k
    # This avoids Python if statements entirely
    values, indices = torch.topk(scores, k=min(k, len(scores)), dim=0)

    # Create a boolean mask for top k indices
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[indices] = True

    # Apply mask to both keypoints and scores
    return keypoints[mask], scores[mask]

I have tried many different ways to make this method ONNX-compatible, but I keep getting this error when running hailo parser.

Does anyone know what might be going wrong?

Hey @Dylan_Durand,

Welcome to the Hailo Community!

The issue appears to be in the TopK operation. Before we start changing things and trying different options, could you try exporting your ONNX model in NHWC format? The Dataflow Compiler (DFC) and Hailo8 hardware expect tensors in NHWC (batch, height, width, channels) format rather than NCHW.

This format mismatch is often the root cause of these types of errors. Let us know if reformatting helps resolve the issue!

Hi omria,

Thank you for your answer! I have now converted the ONNX file to NHWC. However, I now get this error:

   3489             stack_idx = perm[stack_idx]
   3490         default_out_format = default_out_format if shapes_diff == 2 else input_format.copy()
-> 3491         default_out_format.insert(stack_idx, Dims.STACK)
   3492         return True, default_out_format
   3494 return False, None

TypeError: 'NoneType' object cannot be interpreted as an integer

I think this is related to my output shapes not being fixed. When I inspect the outputs of my ONNX model and check their dimensions, I get something like:

Model Outputs:
keypoints: [1, 0, 2]
scores: [1, 0]
descriptors: [1, 256, 0]

where 0 is supposed to be the number of keypoints. However, when I actually run the ONNX model (with random inputs), the number of keypoints is 500, as it should be. Could the issue be related to this?

Upon further inspection, I noticed that one of the Reshape operations in the ONNX model outputs a tensor of shape unk__2, meaning that the shape is currently unknown and will only be resolved at runtime. Could this be causing a problem for the DFC parser?
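
One way to check for this before running the parser is to list each output's declared dimensions and flag anything that is not a positive integer. The sketch below is only an illustration and not part of the Hailo tooling: find_dynamic_dims and load_output_shapes are hypothetical helper names, and the loader assumes the onnx Python package is installed. The example input reuses the shapes reported above.

```python
from typing import Dict, List, Tuple, Union

Dim = Union[int, str, None]


def find_dynamic_dims(shapes: Dict[str, List[Dim]]) -> List[Tuple[str, int, Dim]]:
    """Return (tensor name, axis, dim) for every dim that is not a positive int."""
    flagged = []
    for name, dims in shapes.items():
        for axis, dim in enumerate(dims):
            is_static = isinstance(dim, int) and not isinstance(dim, bool) and dim > 0
            if not is_static:
                flagged.append((name, axis, dim))
    return flagged


def load_output_shapes(path: str) -> Dict[str, List[Dim]]:
    """Read the declared output shapes from an ONNX file (assumes `onnx` is installed)."""
    import onnx  # imported lazily so the checker above has no dependencies

    model = onnx.load(path)
    shapes = {}
    for output in model.graph.output:
        dims = []
        for d in output.type.tensor_type.shape.dim:
            if d.HasField("dim_value"):
                dims.append(d.dim_value)
            elif d.HasField("dim_param"):
                dims.append(d.dim_param)  # symbolic name such as "unk__2"
            else:
                dims.append(None)  # dimension left completely unspecified
        shapes[output.name] = dims
    return shapes


# Shapes as reported for the exported model in this thread
reported = {
    "keypoints": [1, 0, 2],
    "scores": [1, 0],
    "descriptors": [1, 256, 0],
}
for name, axis, dim in find_dynamic_dims(reported):
    print(f"{name}: axis {axis} is not static ({dim!r})")
```

For a file on disk, find_dynamic_dims(load_output_shapes("superpoint.onnx")) would run the same check on the declared graph outputs. Intermediate tensors such as the Reshape output are not covered by this sketch.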

After some further work, I found that I had to use a fixed shape throughout the model. After doing this, I got a ParsingWithRecommendationException. When I use the recommended end node names, I get "list index out of range" errors, which never occurred with the original end node names. Is there any reason why this could be happening?
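
For reference, a fixed-shape version of the top-k selection might look like the sketch below. It is an assumption about what "fixed shape throughout" could mean for this step, not the code that was actually used: it treats k as a constant, requires the score tensor to have a static length of at least k, and gathers by index instead of using a boolean mask, since mask indexing produces an output whose length depends on the data. The name top_k_keypoints_fixed is hypothetical.

```python
import torch


def top_k_keypoints_fixed(keypoints: torch.Tensor, scores: torch.Tensor, k: int):
    """Select the k highest-scoring keypoints with data-independent output shapes.

    keypoints: (N, 2) and scores: (N,), with N static and N >= k.
    Returns tensors of shape (k, 2) and (k,), ordered by descending score.
    """
    top_scores, indices = torch.topk(scores, k=k, dim=0)
    top_keypoints = keypoints[indices]  # integer gather: output is always (k, 2)
    return top_keypoints, top_scores


# Usage with random data: 2000 candidates, keep the best 500
keypoints = torch.rand(2000, 2)
scores = torch.rand(2000)
kp, sc = top_k_keypoints_fixed(keypoints, scores, k=500)
print(kp.shape, sc.shape)
```

Unlike the masked version, this returns the keypoints sorted by score rather than in their original order. Whether the parser accepts the resulting TopK and gather nodes is a separate question that this sketch does not answer.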