Average Pool error cannot quantize

Hi,

I’m currently working on converting an OpenMMLab model to run on the Hailo-8, specifically the tiny version of the RTMDet model. I’ve tried other versions of the model (s, m, x, etc.), but I keep encountering the same issue.

Here’s the process I’m following:

  1. First, I convert the .onnx model with the following command:
$ hailo parser onnx rtmdet_tiny_syncbn_fast_8xb32-300e_coco_20230102_140117-dbb1dc83.onnx 
  2. Then, I try to optimize it using:
$ hailo optimize --hw-arch hailo8 --use-random-calib-set --output-har-path ./MODELS/rtmdet_tiny_syncbn_fast_8xb32-300e_coco_20230102_140117-dbb1dc83_optimized.har ./MODELS/rtmdet_tiny_syncbn_fast_8xb32-300e_coco_20230102_140117-dbb1dc83.har

However, I encounter the following error during optimization:

[info] Current Time: 13:07:14, 08/21/24
[info] CPU: Architecture: x86_64, Model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, Number Of Cores: 12, Utilization: 7.8%
[info] Memory: Total: 15GB, Available: 9GB
[info] System info: OS: Linux, Kernel: 6.5.0-18-generic
[info] Hailo DFC Version: 3.28.0
[info] HailoRT Version: 4.18.0
[info] PCIe: 0000:07:00.0: Number Of Lanes: 2, Speed: 8.0 GT/s PCIe
[info] Running `hailo optimize --hw-arch hailo8 --use-random-calib-set --output-har-path ./MODELS/rtmdet_tiny_syncbn_fast_8xb32-300e_coco_20230102_140117-dbb1dc83_optimized.har ./MODELS/rtmdet_tiny_syncbn_fast_8xb32-300e_coco_20230102_140117-dbb1dc83.har`
[info] Found model with 3 input channels, using real RGB images for calibration instead of sampling random data.
[info] Starting Model Optimization
[warning] Reducing optimization level to 0 (the accuracy won't be optimized and compression won't be used) because there's no available GPU
[warning] Running model optimization with zero level of optimization is not recommended for production use and might lead to suboptimal accuracy results
[info] Model received quantization params from the hn
[info] Starting Mixed Precision
[info] Mixed Precision is done (completion time is 00:00:00.92)
[info] Layer Norm Decomposition skipped
[info] Starting Stats Collector
[info] Using dataset with 64 entries for calibration
Calibration: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:38<00:00,  1.65entries/s]
[info] Stats Collector is done (completion time is 00:00:41.40)
[info] Starting Fix zp_comp Encoding
[info] Fix zp_comp Encoding is done (completion time is 00:00:00.00)
[info] matmul_equalization skipped
Traceback (most recent call last):
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/bin/hailo", line 8, in <module>
    sys.exit(main())
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/main.py", line 111, in main
    ret_val = client_command_runner.run()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_platform/tools/hailocli/main.py", line 64, in run
    return self._run(argv)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_platform/tools/hailocli/main.py", line 104, in _run
    return args.func(args)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_client/tools/optimize_cli.py", line 120, in run
    self._runner.optimize(dataset, work_dir=args.work_dir)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 2093, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1935, in _optimize
    self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1045, in full_quantization
    self._full_acceleras_run(self.calibration_data, data_type)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1229, in _full_acceleras_run
    optimization_flow.run()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 306, in wrapper
    return func(self, *args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 316, in run
    step_func()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 122, in parent_wrapper
    func(self, *args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 335, in step1
    self.core_quantization()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 389, in core_quantization
    self._create_hw_params()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 437, in _create_hw_params
    create_hw_params.run()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/optimization_algorithm.py", line 50, in run
    return super().run()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 151, in run
    self._run_int()
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/create_hw_params/create_hw_params.py", line 337, in _run_int
    comp_to_retry = self._create_hw_params_component(matching_component_group)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/create_hw_params/create_hw_params.py", line 203, in _create_hw_params_component
    layer.create_hw_params(layer_clip_cfg, hw_shifts=hw_shifts)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/hailo_layers/hailo_avgpool_v2.py", line 301, in create_hw_params
    self.avgpool_op.create_hw_params(max_final_accumulator_by_channel, hw_shifts=hw_shifts)
  File "/home/omendez/Downloads/hailo_ai_sw_suite/hailo_venv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/avgpool_op.py", line 104, in create_hw_params
    raise AccelerasNumerizationError(
hailo_model_optimization.acceleras.utils.acceleras_exceptions.AccelerasNumerizationError: Shift delta in rtmdet_tiny_syncbn_fast_8xb32-300e_coco_20230102_140117-dbb1dc83/avgpool1/avgpool_op is larger than 2 (4.06), cannot quantize. A possible solution is to use a pre-quantization model script command to reduce global average-pool spatial dimensions, please refer to the user guide for more info.

To troubleshoot, I tried preprocessing the model using the command below:

python -m onnxruntime.quantization.preprocess --input <model>.onnx --output <model_output>.onnx

Unfortunately, this didn’t solve the issue.

Has anyone faced a similar problem, or does anyone have suggestions on how to address this error? I would greatly appreciate any help or advice you can offer.

Thank you in advance for your support!

Hey @oscar.mendez


The error occurs during quantization of the average pooling layer (avgpool1): the shift required by the fixed-point average pool is larger than the hardware supports. Here are a few potential solutions to address this:

  1. Use the global_avgpool_reduction Command:
    The error message suggests applying a pre-quantization optimization to reduce the spatial dimensions of the global average pooling layer. You can include the following command in your model script:

    pre_quantization_optimization(global_avgpool_reduction, layers=avgpool1, division_factors=[4, 4])
    

    This command reduces the height and width of the global average-pool window by a factor of 4 each. You can adjust the division_factors as needed.
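    For intuition on why this helps, here is a rough back-of-the-envelope sketch (my own simplified model of fixed-point averaging, not Hailo's documented internals): dividing each spatial dimension by 4 shrinks the pool-window area by 16, which removes about log2(16) = 4 shift bits — roughly enough to bring the reported 4.06 delta under the limit of 2.

```python
import math

def shift_bits(pool_area):
    # Right-shift bits needed to divide by pool_area in fixed point
    # (a simplified illustration, not Hailo's exact hardware behavior).
    return math.log2(pool_area)

area = 20 * 20  # hypothetical global-pool window size
saved = shift_bits(area) - shift_bits(area / (4 * 4))
print(saved)  # dividing H and W by 4 each saves log2(16) = 4 shift bits
```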

  2. Adjust Optimization Level:
    Since you’re running the process without a GPU, the optimization level defaults to 0. If possible, try using a machine with a GPU to enable higher optimization levels. Alternatively, you can manually set a higher optimization level by using:

    model_optimization_flavor(optimization_level=2)
    
  3. Use 16-bit Precision:
    If the problem persists, consider setting the problematic layer to use 16-bit precision with the following command:

    quantization_param(avgpool1, precision_mode=a16_w16)
    
  4. Modify the Model Architecture:
    If the above solutions don’t resolve the issue, you might need to adjust the model architecture. One option is to replace the global average pooling layer with a standard average pooling layer that uses a large kernel size, followed by a flatten operation.
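    That replacement is numerically a no-op; a quick numpy check of the equivalence (the shapes here are hypothetical, not RTMDet's actual feature-map size):

```python
import numpy as np

def avg_pool(x, kh, kw):
    # Non-overlapping average pooling over an NHWC tensor.
    n, h, w, c = x.shape
    out = np.zeros((n, h // kh, w // kw, c), dtype=x.dtype)
    for i in range(h // kh):
        for j in range(w // kw):
            window = x[:, i * kh:(i + 1) * kh, j * kw:(j + 1) * kw, :]
            out[:, i, j, :] = window.mean(axis=(1, 2))
    return out

x = np.random.rand(1, 7, 7, 256)  # hypothetical NHWC feature map

# Global average pooling: one mean over the full spatial extent.
global_pool = x.mean(axis=(1, 2))

# Standard average pool with kernel equal to the full spatial
# extent, followed by flatten: numerically the same result.
pooled_then_flat = avg_pool(x, 7, 7).reshape(1, -1)

assert np.allclose(global_pool, pooled_then_flat)
```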

  5. Expand the Calibration Dataset:
    Consider increasing the size of your calibration dataset. Instead of random data, use a representative set of real images that reflect your expected input distribution.
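    If you build a calibration file yourself instead of passing a directory of images, a hypothetical helper like this can stack preprocessed frames into one array (the function name and layout are illustrative — check the DFC user guide for the calibration formats the tool actually accepts):

```python
import numpy as np

def build_calib_set(images, out_path=None):
    """Stack preprocessed frames into one (N, H, W, 3) calibration array.

    `images` should already be resized to the model's input resolution;
    check the parsed HAR for the expected shape.
    """
    calib = np.stack(list(images)).astype(np.float32)
    if out_path is not None:
        np.save(out_path, calib)  # file to feed to the optimization step
    return calib

# Dummy stand-ins for real frames from the target domain
# (small size here only to keep the example light).
frames = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
          for _ in range(8)]
calib = build_calib_set(frames)
print(calib.shape)  # (8, 64, 64, 3)
```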

To apply these solutions, create a model script file (e.g., rtmdet_script.alls) with your chosen commands, and include it in your optimization command:

hailo optimize --hw-arch hailo8 --calib-path /path/to/calibration/images --alls rtmdet_script.alls --output-har-path ./MODELS/rtmdet_tiny_optimized.har ./MODELS/rtmdet_tiny.har
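For reference, a rtmdet_script.alls combining options 1–3 might look like the following (apply only the lines you need; the layer name avgpool1 is taken from the error message, so verify it matches your parsed model):

```
pre_quantization_optimization(global_avgpool_reduction, layers=avgpool1, division_factors=[4, 4])
model_optimization_flavor(optimization_level=2)
quantization_param(avgpool1, precision_mode=a16_w16)
```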

If the issue persists, you might want to analyze the model structure to identify any layers or operations that are particularly challenging for quantization. Using the Hailo Model Profiler can help provide more insights into the quantization process and highlight problematic layers.


Regards
