Cannot parse my ONNX model that was trained with Faster-RCNN

Hi,

I trained a Faster-RCNN model in PyTorch.
Conversion to ONNX works fine and the exported model runs, but it has dynamic_axes, and hailomz parse fails on this model.

I created a YAML file for fasterrcnn_resnet50_fpn, but I am not sure whether the configuration is correct.

base:
- base/coco.yaml
network:
  network_name: fasterrcnn_resnet50_fpn                                                                     paths:
  alls_script: fasterrcnn_resnet50_fpn.alls                                                                   network_path:                                                                                               - models_files/ObjectDetection/Detection-COCO/fasterrcnn_resnet50_fpn/2024-03-05/fasterrcnn_resnet50_fpn.onnx
  url: https://hailo-model-zoo.s3.eu-west-2.amazonaws.com/ObjectDetection/Detection-COCO/fasterrcnn_resnet50_fpn/2024-03-05/fasterrcnn_resnet50_fpn.zip
postprocessing:
  device_pre_post_layers:
    nms: true
  hpp: true
parser:
  nodes:
  - input
  - - boxes
    - scores
    - labels
info:
  task: object detection
  input_shape: 800x800x3
  output_shape: 1x100x4, 1x100, 1x100
  operations: 60G
  parameters: 41.1M
  framework: pytorch
  training_data: coco train2017
  validation_data: coco val2017
  eval_metric: mAP
  full_precision_result: 37.0
  source: https://github.com/pytorch/vision
  license_url: https://github.com/pytorch/vision/blob/main/LICENSE
  license_name: BSD-3-Clause

The output of hailomz is:

hailomz parse --yaml yaml_files/fasterrcnn_resnet50_fpn.yaml  --ckpt fasterrcnn_static.onnx --hw-arch hailo8l --start-node-names input --end-node-names boxes labels scores
<Hailo Model Zoo INFO> Start run for network fasterrcnn_resnet50_fpn ...
<Hailo Model Zoo INFO> Initializing the runner...
[info] Translation started on ONNX model fasterrcnn_resnet50_fpn
[warning] Large model detected. The graph may contain either a large number of operators, or weight variables with a very large capacity.
[warning] Translation time may be a bit long, and some features may be disabled (e.g. model augmentation, retry simplified model, onnx runtime hailo model extraction, etc.).
[info] Restored ONNX model fasterrcnn_resnet50_fpn (completion time: 00:00:00.57)
[warning] ONNX shape inference failed: Unsupported dynamic shape([0, 3, 0, 0]) found on input node input. Please use net_input_shapes, see documentation for additional info.
Traceback (most recent call last):
  File "/local/workspace/hailo_virtualenv/bin/hailomz", line 33, in <module>
    sys.exit(load_entry_point('hailo-model-zoo', 'console_scripts', 'hailomz')())
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 122, in main
    run(args)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 111, in run
    return handlers[args.command](args)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 203, in parse
    parse_model(runner, network_info, ckpt_path=args.ckpt_path, results_dir=args.results_dir, logger=logger)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/core/main_utils.py", line 124, in parse_model
    raise Exception(f"Encountered error during parsing: {err}") from None
Exception: Encountered error during parsing: Could not parse the model due to dynamic shapes. Please try to parse the model again, using: --tensor-shapes,
 e.g. hailomz parse --yaml yaml_files/fasterrcnn_resnet50_fpn.yaml --ckpt fasterrcnn_static.onnx --hw-arch hailo8l --start-node-names input --end-node-names boxes labels scores --tensor-shapes [0,3,224,224]

Hey @Bogdan_Boryslavskyi ,

Welcome to the Hailo Community!

I see your issue is caused by dynamic axes in your ONNX model, which Hailo tools don’t support without explicit tensor shapes.

The error Unsupported dynamic shape([0, 3, 0, 0]) found on input node input indicates dynamic dimensions in your model. While you specified input_shape: 800x800x3 in your YAML, this isn’t used by hailomz during ONNX parsing unless explicitly provided.
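
To see which input dimensions are dynamic before parsing, the declared input shapes of the ONNX file can be inspected. This is a minimal sketch, assuming the `onnx` Python package is installed. The helper names `dynamic_axes_of` and `onnx_input_shapes` are illustrative and not part of any Hailo tool.

```python
def dynamic_axes_of(shape):
    """Return the indices of dimensions that are not fixed positive integers."""
    return [i for i, d in enumerate(shape) if not (isinstance(d, int) and d > 0)]


def onnx_input_shapes(path):
    """Map each real graph input name to its declared shape.

    Symbolic dimensions are returned as strings, unset ones as 0.
    """
    import onnx  # imported lazily so dynamic_axes_of works without onnx

    model = onnx.load(path)
    # Some exporters also list weights as graph inputs; skip those.
    initializers = {init.name for init in model.graph.initializer}
    shapes = {}
    for inp in model.graph.input:
        if inp.name in initializers:
            continue
        dims = inp.type.tensor_type.shape.dim
        shapes[inp.name] = [d.dim_param if d.dim_param else d.dim_value for d in dims]
    return shapes
```

Passing the shape from the error message, `dynamic_axes_of([0, 3, 0, 0])`, flags axes 0, 2 and 3 (batch, height, width), while a fully static shape such as `[1, 3, 800, 800]` yields an empty list.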

Try this solution:

hailomz parse \
  --yaml yaml_files/fasterrcnn_resnet50_fpn.yaml \
  --ckpt fasterrcnn_static.onnx \
  --hw-arch hailo8l \
  --start-node-names input \
  --end-node-names boxes labels scores \
  --tensor-shapes [1,3,800,800]

Also, there’s a minor formatting issue in your YAML where the network_name and paths lines need proper indentation.

The dynamic shape [0,3,0,0] is common with torchvision FasterRCNN models. You can either use the --tensor-shapes parameter as shown above or freeze your model’s input size during ONNX export.

Hi @omria,

I already tried to use --tensor-shapes in hailomz, but this parameter does not exist.

hailomz parse --yaml yaml_files/fasterrcnn_resnet50_fpn.yaml  --ckpt fasterrcnn_static.onnx --hw-arch hailo8l --start-node-names input --end-node-names boxes labels scores --tensor-shapes [1,3,800,800]
usage: hailomz parse [-h] [--yaml YAML_PATH] [--ckpt CKPT_PATH] [--hw-arch] [--start-node-names START_NODE_NAMES [START_NODE_NAMES ...]]
                     [--end-node-names END_NODE_NAMES [END_NODE_NAMES ...]]
                     [model_name]
hailomz parse: error: argument model_name: invalid choice: '[1,3,800,800]' (choose from 'arcface_mobilefacenet', 'arcface_mobilefacenet_nv12', 'arcface_mobilefacenet_rgbx', 'arcface_r50', 'cas_vit_m', 'cas_vit_s', 'cas_vit_t', 'centernet_resnet_v1_18_postprocess', 'centernet_resnet_v1_50_postprocess', 'centerpose_regnetx_1.6gf_fpn', 'centerpose_regnetx_800mf', 'centerpose_repvgg_a0', 'clip_resnet_50', 'clip_resnet_50x4', 'clip_text_encoder_resnet50x4', 'clip_text_encoder_vit_l_14_laion2B', 'clip_text_encoder_vit_large', 'clip_vit_l_14_laion2B_16b', 'damoyolo_tinynasL20_T', 'damoyolo_tinynasL25_S', 'damoyolo_tinynasL35_M', 'davit_tiny', 'deeplab_v3_mobilenet_v2', 'deeplab_v3_mobilenet_v2_wo_dilation', 'deit_base', 'deit_small', 'deit_tiny', 'detr_resnet_v1_18_bn', 'detr_resnet_v1_50', 'dncnn3', 'dncnn_color_blind', 'efficientdet_lite0', 'efficientdet_lite1', 'efficientdet_lite2', 'efficientformer_l1', 'efficientnet_l', 'efficientnet_lite0', 'efficientnet_lite1', 'efficientnet_lite2', 'efficientnet_lite3', 'efficientnet_lite4', 'efficientnet_m', 'efficientnet_s', 'espcn_x2', 'espcn_x3', 'espcn_x4', 'face_attr_resnet_v1_18', 'face_attr_resnet_v1_18_nv12', 'face_attr_resnet_v1_18_rgbx', 'fast_depth', 'fast_depth_nv12_fhd', 'fast_sam_s', 'fastvit_sa12', 'fcn8_resnet_v1_18', 'hand_landmark_lite', 'hardnet39ds', 'hardnet68', 'inception_v1', 'levit128', 'levit192', 'levit256', 'levit384', 'lightface_slim', 'lightface_slim_nv12', 'lightface_slim_nv12_fhd', 'lprnet', 'lprnet_yuy2', 'mobilenet_v1', 'mobilenet_v2_1.0', 'mobilenet_v2_1.4', 'mobilenet_v3', 'mobilenet_v3_large_minimalistic', 'mspn_regnetx_800mf', 'mspn_regnetx_800mf_nv12', 'nanodet_repvgg', 'nanodet_repvgg_a12', 'nanodet_repvgg_a1_640', 'osnet_x1_0', 'person_attr_resnet_v1_18', 'person_attr_resnet_v1_18_nv12', 'person_attr_resnet_v1_18_rgbx', 'petrv2_repvggB0_backbone_pp_800x320', 'petrv2_repvggB0_transformer_pp_800x320', 'r3d_18', 'real_esrgan_x2', 'regnetx_1.6gf', 'regnetx_800mf', 'repghost_1_0x', 
'repghost_2_0x', 'repvgg_a0_person_reid_512', 'repvgg_a1', 'repvgg_a2', 'resmlp12_relu', 'resnet_v1_18', 'resnet_v1_34', 'resnet_v1_50', 'resnext26_32x4d', 'resnext50_32x4d', 'retinaface_mobilenet_v1', 'retinaface_mobilenet_v1_rgbx', 'scdepthv3', 'scrfd_10g', 'scrfd_10g_nv12_fhd', 'scrfd_2.5g', 'scrfd_500m', 'segformer_b0_bn', 'squeezenet_v1.1', 'ssd_mobilenet_v1', 'ssd_mobilenet_v2', 'stdc1', 'stereonet', 'swin_small', 'swin_tiny', 'tddfa_mobilenet_v1', 'tddfa_mobilenet_v1_nv12', 'tiny_yolov3', 'tiny_yolov4', 'tiny_yolov4_license_plates', 'tiny_yolov4_license_plates_yuy2', 'unet_mobilenet_v2', 'vit_base', 'vit_base_bn', 'vit_pose_small', 'vit_pose_small_bn', 'vit_small', 'vit_small_bn', 'vit_tiny', 'vit_tiny_bn', 'yolact_regnetx_1.6gf', 'yolact_regnetx_800mf', 'yolov10b', 'yolov10n', 'yolov10s', 'yolov10x', 'yolov11l', 'yolov11m', 'yolov11n', 'yolov11s', 'yolov11x', 'yolov3', 'yolov3_416', 'yolov3_gluon', 'yolov3_gluon_416', 'yolov4_leaky', 'yolov5l_seg', 'yolov5m', 'yolov5m6_6.1', 'yolov5m_6.1', 'yolov5m_seg', 'yolov5m_vehicles', 'yolov5m_vehicles_nv12', 'yolov5m_vehicles_yuy2', 'yolov5m_wo_spp', 'yolov5m_wo_spp_nv12_fhd', 'yolov5m_wo_spp_yuy2', 'yolov5n_seg', 'yolov5n_seg_nv12_fhd', 'yolov5s', 'yolov5s_bbox_decoding_only', 'yolov5s_c3tr', 'yolov5s_personface', 'yolov5s_personface_nv12', 'yolov5s_personface_nv12_fhd', 'yolov5s_personface_rgbx', 'yolov5s_seg', 'yolov5s_wo_spp', 'yolov5xs_wo_spp', 'yolov5xs_wo_spp_nms_core', 'yolov6n', 'yolov6n_0.2.1', 'yolov6n_0.2.1_nms_core', 'yolov7', 'yolov7_tiny', 'yolov7e6', 'yolov8l', 'yolov8m', 'yolov8m_pose', 'yolov8m_seg', 'yolov8n', 'yolov8n_seg', 'yolov8s', 'yolov8s_bbox_decoding_only', 'yolov8s_pose', 'yolov8s_seg', 'yolov8x', 'yolov9c', 'yolox_l_leaky', 'yolox_s_leaky', 'yolox_s_wide_leaky', 'yolox_tiny', 'zero_dce', 'zero_dce_pp')

Hey @Bogdan_Boryslavskyi ,

You are correct. I just checked: you have to export the ONNX model with static input shapes and no dynamic axes. That’s the only way Hailo’s ONNX parser will work without throwing the [0,3,0,0] shape error.
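
A static export could look roughly like the sketch below. It assumes the PyTorch model (or a thin wrapper around it) accepts a single batched 1x3x800x800 tensor, reuses the node names from this thread (input, boxes, labels, scores), and picks opset 11 as an assumption. The helper names `static_export_kwargs` and `export_static` are hypothetical. The key point is that `dynamic_axes` is simply not passed to `torch.onnx.export`.

```python
def static_export_kwargs(input_name="input",
                         output_names=("boxes", "labels", "scores")):
    """Keyword arguments for torch.onnx.export with every axis fixed.

    Leaving out `dynamic_axes` is what keeps the exported shapes static.
    """
    return {
        "input_names": [input_name],
        "output_names": list(output_names),
        "opset_version": 11,  # assumption; use an opset your toolchain supports
    }


def export_static(model, onnx_path, height=800, width=800):
    """Export `model` with a fixed 1x3xHxW input (assumed input layout)."""
    import torch  # imported lazily; requires PyTorch to be installed

    model.eval()
    dummy = torch.randn(1, 3, height, width)  # fixed-size example input
    torch.onnx.export(model, dummy, onnx_path, **static_export_kwargs())
```

This only removes the dynamic input axes. Whether the rest of a Faster-RCNN graph then parses is not confirmed in this thread.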

Yes, thanks. I retrained the model with YOLO, and it is working now.