Hi, as far as I know, model conversion for the Hailo-8L requires quantizing the model from floating point to integer. Is there any current or planned option to convert a model without quantizing it to integer, i.e., to run the network in floating point?
Short answer: no.
Floating-point hardware is larger and consumes more energy per computation than integer hardware. Floating-point weights also require more memory to store.
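To illustrate what "quantization from float to int" means in practice, here is a minimal sketch of affine int8 quantization of a weight tensor. This is the generic textbook scheme, not Hailo's actual algorithm, and the tensor is random placeholder data:

```python
# Minimal sketch (not Hailo's actual algorithm): affine int8 quantization
# of an FP32 weight tensor, as used conceptually by most post-training
# quantization schemes.
import numpy as np

weights = np.random.randn(64, 64).astype(np.float32)  # placeholder FP32 weights

# Map the observed float range onto the int8 range [-128, 127].
w_min, w_max = weights.min(), weights.max()
scale = (w_max - w_min) / 255.0
zero_point = int(np.round(-128 - w_min / scale))

q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)

# Dequantize to check the reconstruction error introduced by quantization.
deq = (q.astype(np.float32) - zero_point) * scale
print("max abs error:", np.abs(weights - deq).max())
```

Each weight now occupies 1 byte instead of 4, and the arithmetic on-device runs on small, power-efficient integer units.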
With our architecture we can run inference more efficiently and at a much lower cost than on a GPU or CPU, at the price of converting the model into a HEF (Hailo Executable Format) file.
The future is not floating-point support; quite the opposite. Neural networks are getting larger, and quantization algorithms are getting more advanced, allowing more and more layers to be quantized to 4-bit, further improving efficiency and performance.
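To make the storage argument concrete, here is a back-of-the-envelope calculation of weight memory at different precisions. The 1-billion-parameter count is an illustrative assumption, not a Hailo figure:

```python
# Weight memory for a hypothetical 1-billion-parameter model at
# different precisions (parameter count chosen purely for illustration).
params = 1_000_000_000
for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    gib = params * bits / 8 / 2**30
    print(f"{name}: {gib:.2f} GiB")
```

Going from FP32 to INT4 cuts weight storage by 8x, which also reduces the memory bandwidth needed to feed the compute units.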
Most of the hard work is done for you by the Hailo Dataflow Compiler, and we provide tools such as the profiler report and layer noise analysis that help you optimize the quantization step further when necessary.
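For reference, the typical conversion flow looks roughly like the sketch below, based on the pattern in the Dataflow Compiler's Python tutorials. Exact method signatures can differ between DFC versions, and the model name, file paths, and calibration data here are placeholders:

```python
# Rough sketch of the Dataflow Compiler flow (signatures may differ
# between DFC versions; "model.onnx" and the calibration data are
# placeholders).
import numpy as np
from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8l")

# 1. Parse: translate the ONNX model into Hailo's internal representation.
runner.translate_onnx_model("model.onnx", "my_model")

# 2. Optimize: quantize the model using a calibration dataset (random
#    data here as a stand-in; use real, representative inputs in practice).
calib_data = np.random.rand(64, 224, 224, 3).astype(np.float32)
runner.optimize(calib_data)

# 3. Compile: produce the HEF binary that runs on the device.
hef = runner.compile()
with open("my_model.hef", "wb") as f:
    f.write(hef)
```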
If I am not mistaken, I came across a page on this website saying that there will be a future update adding support for floating-point models.
I do not know what they are referring to.
You can run a model in our emulator in floating point. This is used to validate a model after parsing.
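As a hedged sketch of that validation step, following the pattern in the Dataflow Compiler tutorials (enum and method names may vary between DFC versions; `runner` is the `ClientRunner` from the sketch above, and the input is placeholder data):

```python
# Floating-point emulation after parsing: SDK_NATIVE runs the parsed
# model in full floating point, so its outputs can be compared against
# the original framework's outputs before any quantization happens.
import numpy as np
from hailo_sdk_client import InferenceContext

input_data = np.random.rand(1, 224, 224, 3).astype(np.float32)  # placeholder

with runner.infer_context(InferenceContext.SDK_NATIVE) as ctx:
    native_out = runner.infer(ctx, input_data)
```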
As I wrote above, the future is even lower-bit quantization. You can google “quantizing LLMs” and find many pages from different sources.
You may have read that AI consumes significant energy globally. Reducing this cost is a priority for all AI companies. Quantizing models, while preserving accuracy, lowers power consumption and increases performance, even on large servers equipped with GPUs. This is particularly crucial at the edge.