Exporting locotrack (PyTorch -> ONNX -> HAR -> HEF)

Hello Hailo Community,

I am trying to run locotrack (GitHub - cvlab-kaist/locotrack: Official implementation of "Local All-Pair Correspondence for Point Tracking" (ECCV 2024)) on a Hailo-8L. To do that, I exported (*) its PyTorch implementation to ONNX format (opset=16) and then tried to parse it to HAR format using the DataFlow Compiler (both versions 3.28 and 3.29).

The PyTorch → ONNX export worked well, and I even simplified the model. I then ran the command

hailo parser onnx shared_with_docker/locotrack_sim.onnx

but got the error (**).

Here (https://hailo.ai/developer-zone/documentation/v3-29-0/?sp_referrer=sdk/translating_tf_models.html#id3) I learned that the GridSample operation is not supported by the DFC, and neither are many other standard operations (like torch.Tensor.permute). My questions are:

  1. Do you plan to extend the set of supported operations in the near future so that the majority of custom models can be exported to HAR format?
  2. What path would you recommend to take in order to export this model to HEF somehow? Are there any tricks to do so?

Additional information:

  • (*) I actually tweaked the model so that it does not use the volumetric (5D) grid sample (see torch.nn.functional.grid_sample in the PyTorch 2.5 documentation), since 5D support was only introduced in ONNX opset 20. I did that to check whether there are additional points of failure.
  • (**) Full error stack trace (probably not related to GridSample)
docker-user@quczer-seagle:/mnt/ml-infra$ hailo parser onnx models/locotrack_sim.onnx            
[info] Current Time: 11:53:24, 11/14/24
[info] CPU: Architecture: x86_64, Model: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz, Number Of Cores: 16, Utilization: 2.9%
[info] Memory: Total: 30GB, Available: 15GB
[info] System info: OS: Linux, Kernel: 6.8.0-48-generic
[info] Hailo DFC Version: 3.28.0
[info] HailoRT Version: Not Installed
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo parser onnx models/locotrack_sim.onnx`
[info] Translation started on ONNX model locotrack_sim
[warning] Large model detected. The graph may contain either a large number of operators, or weight variables with a very large capacity.
[warning] Translation time may be a bit long, and some features may be disabled (e.g. model augmentation, retry simplified model, onnx runtime hailo model extraction, etc.).
[info] Restored ONNX model locotrack_sim (completion time: 00:00:00.17)
Traceback (most recent call last):
  File "/home/docker-user/.local/bin/hailo", line 8, in <module>
    sys.exit(main())
    │        └ <function main at 0x760cc3fc9630>
    └ <module 'sys' (built-in)>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/main.py", line 111, in main
    ret_val = client_command_runner.run()
              └ <hailo_sdk_client.tools.cmd_utils.main.ClientCommands object at 0x760cc40ea740>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 68, in run
    return self._run(argv)
           │         └ ['parser', 'onnx', 'models/locotrack_sim.onnx']
           └ <hailo_sdk_client.tools.cmd_utils.main.ClientCommands object at 0x760cc40ea740>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/base_utils.py", line 89, in _run
    return args.func(args)
           │         └ Namespace(input_format='onnx', func=<bound method NetParser.run of <hailo_sdk_client.tools.parser_cli.NetParser object at 0x760c...
           └ Namespace(input_format='onnx', func=<bound method NetParser.run of <hailo_sdk_client.tools.parser_cli.NetParser object at 0x760c...
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 188, in run
    runner = self._parse(net_name, args, tensor_shapes)
             │           │         │     └ None
             │           │         └ Namespace(input_format='onnx', func=<bound method NetParser.run of <hailo_sdk_client.tools.parser_cli.NetParser object at 0x760c...
             │           └ 'locotrack_sim'
             └ <hailo_sdk_client.tools.parser_cli.NetParser object at 0x760cc3526e30>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/parser_cli.py", line 251, in _parse
    runner.translate_onnx_model(
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
           │    │      │       └ {'start_node_names': None, 'end_node_names': None, 'net_input_shapes': None, 'augmented_path': None, 'disable_rt_metadata_extrac...
           │    │      └ ('models/locotrack_sim.onnx', 'locotrack_sim')
           │    └ <hailo_sdk_client.runner.client_runner.ClientRunner object at 0x760cc3513670>
           └ <function ClientRunner.translate_onnx_model at 0x760cc40db760>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1158, in translate_onnx_model
    parser.translate_onnx_model(
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 232, in translate_onnx_model
    raise e from None
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 220, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 300, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/parser/parser.py", line 351, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
            │            │                          │         └ <hailo_sdk_client.model_translator.onnx_translator.onnx_translator.ONNXConverter object at 0x760cc35ac040>
            │            │                          └ 'locotrack_sim'
            │            └ <hailo_sdk_client.model_translator.onnx_translator.onnx_translator.ONNXConverter object at 0x760cc35ac040>
            └ <class 'hailo_sdk_client.model_translator.fuser.fuser.HailoNNFuser'>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/translator.py", line 79, in convert_model
    self._create_layers()
    └ <hailo_sdk_client.model_translator.onnx_translator.onnx_translator.ONNXConverter object at 0x760cc35ac040>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 34, in _create_layers
    self._add_direct_layers()
    └ <hailo_sdk_client.model_translator.onnx_translator.onnx_translator.ONNXConverter object at 0x760cc35ac040>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/edge_nn_translator.py", line 111, in _add_direct_layers
    self._layer_callback_from_vertex(vertex)
    │                                └ <hailo_sdk_client.model_translator.onnx_translator.onnx_graph.ONNXGraphNode object at 0x760cc35722c0>
    └ <hailo_sdk_client.model_translator.onnx_translator.onnx_translator.ONNXConverter object at 0x760cc35ac040>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_translator.py", line 279, in _layer_callback_from_vertex
    if vertex.op in OPTIONAL_NULL_OPS and vertex.is_null_operation() and not is_flattened_global_maxpool:
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 4640, in is_null_operation
    if self.op in PAD_OPS and self.is_null_padding():
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 1030, in is_null_padding
    _, pads, _, _ = self.get_vertex_padding()
                    └ <hailo_sdk_client.model_translator.onnx_translator.onnx_graph.ONNXGraphNode object at 0x760cc35722c0>
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/model_translator/onnx_translator/onnx_graph.py", line 993, in get_vertex_padding
    pads = [int(x) for x in self._graph.output_shapes[self._info.input[1] + "_value"]]
    │                       │                         └ <hailo_sdk_client.model_translator.onnx_translator.onnx_graph.ONNXGraphNode object at 0x760cc35722c0>
    │                       └ <hailo_sdk_client.model_translator.onnx_translator.onnx_graph.ONNXGraphNode object at 0x760cc35722c0>
    └ [0, 0, 0, 0, 0, 0]
KeyError: '/cmdtop.1/conv.0/conv.0.0_5/Reshape_1_output_0_value'
  • ONNX model summary
(hailo_virtualenv) hailo@quczer-seagle:/local$ onnxsim shared_with_docker/locotrack.onnx shared_with_docker/locotrack_sim.onnx 
Your model contains "Tile" ops or/and "ConstantOfShape" ops. Folding these ops can make the simplified model much larger. If it is not expected, please specify "--no-large-tensor" (which will lose some optimization chances)
Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                       ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add                   │ 302            │ 302              │
│ ArgMax                │ 1              │ 1                │
│ Cast                  │ 360            │ 29               │
│ Concat                │ 365            │ 197              │
│ Constant              │ 3130           │ 212              │
│ ConstantOfShape       │ 89             │ 24               │
│ Conv                  │ 70             │ 70               │
│ Div                   │ 189            │ 157              │
│ Einsum                │ 17             │ 17               │
│ Equal                 │ 16             │ 1                │
│ Erf                   │ 12             │ 12               │
│ Expand                │ 20             │ 2                │
│ Gather                │ 582            │ 409              │
│ GridSample            │ 12             │ 12               │
│ Identity              │ 144            │ 0                │
│ InstanceNormalization │ 64             │ 64               │
│ Less                  │ 1              │ 1                │
│ MatMul                │ 105            │ 105              │
│ Max                   │ 4              │ 4                │
│ Mul                   │ 368            │ 293              │
│ Not                   │ 1              │ 1                │
│ Pad                   │ 50             │ 33               │
│ Pow                   │ 28             │ 28               │
│ Range                 │ 15             │ 14               │
│ ReduceMax             │ 1              │ 1                │
│ ReduceMean            │ 84             │ 84               │
│ ReduceMin             │ 1              │ 1                │
│ ReduceSum             │ 12             │ 12               │
│ Relu                  │ 64             │ 64               │
│ Reshape               │ 372            │ 320              │
│ Resize                │ 2              │ 2                │
│ Round                 │ 1              │ 1                │
│ ScatterND             │ 3              │ 3                │
│ Shape                 │ 646            │ 208              │
│ Sin                   │ 4              │ 4                │
│ Slice                 │ 106            │ 77               │
│ Softmax               │ 13             │ 13               │
│ Sqrt                  │ 31             │ 31               │
│ Squeeze               │ 6              │ 3                │
│ Sub                   │ 281            │ 176              │
│ Transpose             │ 162            │ 137              │
│ Trilu                 │ 48             │ 48               │
│ Unsqueeze             │ 974            │ 518              │
│ Where                 │ 15             │ 0                │
│ Model Size            │ 32.6MiB        │ 32.2MiB          │
└───────────────────────┴────────────────┴──────────────────┘

Hey @quczer,

Thanks for raising these points. Let me address each one:

  1. Support for Additional Operations
    I’ll need to check with our R&D team regarding the roadmap for supporting more operations. Currently, some operations remain unsupported due to hardware mapping complexities.

  2. HEF Model Export Options
    Here are some practical solutions for handling unsupported operations:

    a) Replace Unsupported Operations:

    • Modify your model to use supported operations
    • Example: Replace GridSample with basic ONNX operations

    b) Model Splitting Strategy:

    • Separate your model into supported and unsupported components
    • Run unsupported parts on CPU/GPU
    • Use Hailo-8L for supported sections
    • Leverage Hailo Model Scheduler for pipeline management

    c) Host-Side Preprocessing:

    • Move operations like GridSample and reshaping to host system
    • Feed preprocessed data to Hailo-8L

  3. Parser Troubleshooting
    For the parsing errors you’re seeing, try these steps:

    • Use onnxsim to simplify your ONNX model
    • Check your model structure in Netron, particularly around reshaping and padding nodes
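
To illustrate option 2a, here is a minimal scalar sketch of what a 2D bilinear GridSample computes, written with nothing but floor, multiply, add, and indexed reads. It is intended to follow the torch.nn.functional.grid_sample defaults (bilinear mode, zero padding, align_corners=False). The function name bilinear_grid_sample is hypothetical, and the sketch only shows the arithmetic: in a real model each step would be a tensor operation, and whether the DFC accepts the resulting graph is not confirmed here.

```python
import math


def bilinear_grid_sample(image, grid):
    """Sample a single-channel H x W image (nested lists) at normalized points.

    grid is a list of (x, y) pairs in [-1, 1], where x runs along the width.
    Hypothetical reference helper, not part of any Hailo or PyTorch API.
    """
    height, width = len(image), len(image[0])

    def pixel(row, col):
        # Zero padding: reads outside the image contribute nothing.
        if 0 <= row < height and 0 <= col < width:
            return image[row][col]
        return 0.0

    samples = []
    for x, y in grid:
        # Map [-1, 1] to pixel coordinates (align_corners=False convention).
        ix = ((x + 1.0) * width - 1.0) / 2.0
        iy = ((y + 1.0) * height - 1.0) / 2.0
        x0, y0 = math.floor(ix), math.floor(iy)
        wx, wy = ix - x0, iy - y0  # fractional offsets act as blend weights
        top = (1 - wx) * pixel(y0, x0) + wx * pixel(y0, x0 + 1)
        bottom = (1 - wx) * pixel(y0 + 1, x0) + wx * pixel(y0 + 1, x0 + 1)
        samples.append((1 - wy) * top + wy * bottom)
    return samples


# The image centre blends all four pixels equally.
print(bilinear_grid_sample([[1.0, 2.0], [3.0, 4.0]], [(0.0, 0.0)]))  # [2.5]
```

In tensor form, the four pixel reads typically become gather-style indexing on a flattened feature map, so this rewrite trades one unsupported operator for several simpler ones that still need to be checked against the supported-operations list.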

Let me know if you need help implementing any of these solutions.

Best,
Omri

Hey @omria,

Thank you for your response.

AD 1. Please let me know roughly what the roadmap for extending the supported operations looks like. This information would help me plan the future of the project I am working on.

AD 2. All of these ideas seem reasonable but unfortunately require some work I wanted to avoid.

AD 3. I already simplified the model. Netron helped with identifying the problems before, but this time I don’t quite understand the error. What would be really helpful is improved error handling. There were some cases where the DFC raised an error because Python crashed. I wish there was some indication that an operation (or a sequence of them) is not supported.
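
One possible reading of the stack trace in the first post: the KeyError is raised in get_vertex_padding while the parser looks up the value of a Pad node's second input (the pads tensor), and the missing key names the output of a Reshape node. That suggests, without confirming it, that the parser expects pads to be a known constant. The sketch below lists Pad nodes whose pads input is computed by another node. The function name find_dynamic_pads and the demo node names are hypothetical; the function only relies on the op_type, name, input, and output fields that ONNX nodes expose.

```python
from types import SimpleNamespace


def find_dynamic_pads(nodes, initializer_names):
    """Return (pad node, pads input, producer op) for Pad nodes with non-constant pads."""
    # Map every tensor name to the node that produces it.
    producers = {out: node for node in nodes for out in node.output}
    suspects = []
    for node in nodes:
        # Since opset 11, Pad takes its pads as the second input.
        if node.op_type != "Pad" or len(node.input) < 2 or not node.input[1]:
            continue
        pads_name = node.input[1]
        if pads_name in initializer_names:
            continue  # pads stored as a constant weight
        producer = producers.get(pads_name)
        if producer is not None and producer.op_type == "Constant":
            continue  # pads produced by a Constant node
        origin = producer.op_type if producer is not None else "graph input"
        suspects.append((node.name, pads_name, origin))
    return suspects


# Stand-in nodes for demonstration; with the onnx package installed one could
# instead pass model.graph.node and {i.name for i in model.graph.initializer}
# from model = onnx.load("locotrack_sim.onnx").
demo_nodes = [
    SimpleNamespace(name="reshape_1", op_type="Reshape", input=["x", "shape"], output=["pads_dyn"]),
    SimpleNamespace(name="pad_1", op_type="Pad", input=["feat", "pads_dyn"], output=["y"]),
    SimpleNamespace(name="pad_2", op_type="Pad", input=["y", "pads_const"], output=["z"]),
]
print(find_dynamic_pads(demo_nodes, {"pads_const", "shape"}))
# [('pad_1', 'pads_dyn', 'Reshape')]
```

If such nodes show up, hard-coding the padding in the PyTorch model before export (so that pads becomes a constant) may be worth trying; this is a guess based on the trace, not documented DFC behaviour.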

Nevertheless, thank you for your response and help.

Best,
Michał