Non-Uniform Quantization

Hi,
Does Hailo support, or plan to support, non-uniform quantization with more than one scale parameter?

I also see Group quantization and Sub-group quantization in the documentation. In general, how can I access the scale and zero point values of intermediate layers? I obtained the scale and zero point of the stream's input and output, but I am not sure how to access the values between the model layers.

Hey @Rogelio_Hernandez ,

Welcome to the Hailo Community!

1. Non-Uniform Quantization with Multiple Scale Parameters

Hailo doesn’t currently support non-uniform quantization in the traditional sense (like log-based or custom LUT quantization). The quantization is based on linear affine transformations using scale and zero_point.
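To make "linear affine transformation" concrete, here is a minimal sketch in plain Python of quantizing and dequantizing with a single scale and zero_point. The function names, the unsigned 8-bit range, and the rounding mode are assumptions for illustration, not Hailo's actual implementation.

```python
def quantize(x, scale, zero_point, bits=8):
    """Map a real value to an unsigned integer code using one scale and one zero point."""
    qmin, qmax = 0, (1 << bits) - 1
    q = round(x / scale) + zero_point
    # Clamp to the representable range (assumed unsigned here).
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover an approximate real value from an integer code."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    scale, zero_point = 0.01, 128
    code = quantize(0.5, scale, zero_point)
    print(code, dequantize(code, scale, zero_point))
```

Every value in the tensor shares the same step size (the scale), which is what makes the scheme uniform. A non-uniform scheme such as log or LUT quantization would use a step size that varies with the value.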

However, Hailo does support multiple quantization parameters per layer through group quantization. This lets you divide weights in a layer into multiple groups, where each group gets its own quantization parameters.

You can enable this in your model script:

quantization_param(conv1, quantization_groups=4)

This splits the conv1 layer weights into 4 groups, each with its own scale and zero_point. It’s the closest thing to having “multiple scale parameters” that’s currently available.
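As a rough illustration of why per-group parameters help, the sketch below splits a flat list of weights into equal groups and gives each group its own scale. It is a simplification under stated assumptions: it uses symmetric quantization (zero point fixed at 0), contiguous equal-size groups, and hypothetical helper names. How the Hailo compiler actually partitions and quantizes the groups may differ.

```python
def group_scales(weights, num_groups, bits=8):
    """Compute one symmetric scale per contiguous, equal-size group of weights."""
    if len(weights) % num_groups != 0:
        raise ValueError("weights must divide evenly into groups")
    size = len(weights) // num_groups
    qmax = (1 << (bits - 1)) - 1  # 127 for signed 8-bit
    scales = []
    for g in range(num_groups):
        group = weights[g * size:(g + 1) * size]
        max_abs = max(abs(w) for w in group)
        # Avoid a zero scale for an all-zero group.
        scales.append(max_abs / qmax if max_abs > 0 else 1.0)
    return scales


def quantize_grouped(weights, num_groups, bits=8):
    """Quantize each weight with the scale of the group it belongs to."""
    scales = group_scales(weights, num_groups, bits)
    size = len(weights) // num_groups
    codes = [round(w / scales[i // size]) for i, w in enumerate(weights)]
    return codes, scales


if __name__ == "__main__":
    w = [0.1, -0.2, 10.0, -20.0]
    print(quantize_grouped(w, 1))  # one shared scale: small weights lose resolution
    print(quantize_grouped(w, 2))  # two scales: each group uses its full range
```

With a single group, the large weights dictate the scale and the small ones collapse to codes near zero. With two groups, the small weights get their own finer scale.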

2. Accessing Scale and Zero Point of Intermediate Layers

The quantization parameters for input/output layers are accessible through the hailo_quant_info_t structure in HailoRT. But for intermediate layers, the compiler doesn’t expose their quantization parameters directly through HailoRT or the HEF interface.

For debugging purposes, you can use the Hailo Profiler (during model optimization) to analyze per-layer quantization stats, including histograms and SNR for quantized activations. It’s not API-level access, but it’s helpful for understanding what’s happening with quantization throughout your model.
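For intuition about the SNR figure, here is a small sketch of one common definition of signal-to-noise ratio between float activations and their quantized-then-dequantized counterparts. The function name and the exact formula are assumptions for illustration; the profiler may compute its metric differently.

```python
import math


def snr_db(reference, quantized):
    """SNR in dB: signal energy over the energy of the quantization error."""
    signal = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise == 0:
        return math.inf   # no quantization error at all
    if signal == 0:
        return -math.inf  # all-zero reference with nonzero error
    return 10 * math.log10(signal / noise)


if __name__ == "__main__":
    float_acts = [0.50, -1.25, 2.00]
    dequant_acts = [0.49, -1.26, 2.01]
    print(round(snr_db(float_acts, dequant_acts), 2))
```

A higher value means the quantized layer tracks the float layer more closely, so layers with unusually low SNR are good candidates for extra attention, such as more quantization groups.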


Thank you for your answer @omria !