Hi
I’m using the Python API to quantize a ViT network.
Here is a portion of the code:
model_script_lines = [
    f"normalization_rule1 = normalization({mean}, {sdev})\n",
    "model_optimization_config(calibration, batch_size=1, calibset_size=128)\n",
    "quantization_param({*conv*}, precision_mode=a16_w16)\n",
    "quantization_param({*dw*}, precision_mode=a16_w16)\n",
    "quantization_param({*fc*}, precision_mode=a16_w16)\n",
]
runner = ClientRunner(har=model_har_name)
runner.load_model_script(''.join(model_script_lines))
runner.optimize(calib_dataset)
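For context, `calib_dataset` is a plain NumPy array of preprocessed images in NHWC layout, matching the 224x224 ViT input. A minimal sketch of how it is built (the random data here is just a stand-in for the real preprocessed images):

```python
import numpy as np

CALIBSET_SIZE = 128  # matches calibset_size=128 in the model script above

def make_calib_dataset(size=CALIBSET_SIZE, height=224, width=224, channels=3):
    """Stack preprocessed images into one float32 NHWC array for runner.optimize()."""
    # Real code loads and preprocesses actual images; random data stands in
    # for them in this sketch.
    return np.random.rand(size, height, width, channels).astype(np.float32)

calib_dataset = make_calib_dataset()
print(calib_dataset.shape)  # (128, 224, 224, 3)
```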
I get this error message:
hailo_model_optimization.acceleras.utils.acceleras_exceptions.AccelerasUnsupportedError: Unsupported layers for the provided optimization target. Review the log to see exact layers and configurations
and the following messages in the log:
[info] Loading model script commands to tiny_vit_5m_224_from_microsoft from string
[info] Starting Model Optimization
[info] Using default optimization level of 2
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:02.07)
[info] Starting LayerNorm Decomposition
[info] Using dataset with 128 entries for calibration
[info] Model Optimization Algorithm LayerNorm Decomposition is done (completion time is 00:01:31.70)
[error] Unsupported layers for the target sage: tiny_vit_5m_224_from_microsoft/precision_change14:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change21:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change28:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change35:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change42:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change49:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change56:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change63:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change7:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
tiny_vit_5m_224_from_microsoft/precision_change70:
bias_mode: single_scale_decomposition
precision_mode: a16_w16_a8
quantization_groups: 1
signed_output: true
The precision mode a16_w16_a8 is unexpected, since I only requested a16_w16. What should I do to quantize the network correctly?