Model quantization issue from TensorFlow to TensorFlow Lite

Hello,
I am currently working on deploying a TensorFlow model onto the Hailo platform and have encountered several challenges during the quantization and optimization processes. Despite aligning my TensorFlow and Python versions with Hailo’s requirements, I am experiencing issues that I am unable to resolve independently.

Challenges Faced:

  1. Quantization Precision: After converting my model to TensorFlow Lite with INT8 quantization, the model’s performance deteriorated, yielding incorrect results during inference.
  2. Calibration Dataset: I am uncertain about the requirements and preparation of a calibration dataset for the quantization process. Is, for example, 120 images enough for calibration?

Steps Taken:

  1. Aligned TensorFlow and Python versions with Hailo’s specifications.
  2. Attempted to save the model in H5 format and perform optimization within the Hailo environment, but encountered quantization failures.

Request for Assistance:

I would greatly appreciate your guidance on the following:

- Best practices for preparing and utilizing a calibration dataset during quantization.
- Recommended workflow for model quantization and deployment on the Hailo platform.
- Whether it is better to perform the quantization through the Hailo Software Suite or to first do it on my own device and save the quantized model before compiling it through the Hailo compiler.

Thank you in advance :smiley:

Hey @psimon

Welcome to the Hailo Community!

For your INT8 quantization concerns, I recommend starting with Post-Training Quantization (PTQ) first. While your current 120-image calibration dataset is a start, increasing it to 500-1000 images would give you much better results. The key is making sure these images truly represent your real-world use cases.
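
To make that concrete, here is a minimal sketch of how you might assemble such a calibration set as a NumPy array. The directory name, image count, input size (224x224 RGB), and preprocessing are all assumptions; match them to whatever your model actually expects at inference time:

```python
# Minimal sketch: build a representative calibration set from a folder of images.
# Assumptions: images live in "calib_images/", the model expects 224x224 RGB input,
# and inference-time preprocessing is a plain resize (adapt to your own pipeline).
import glob

import numpy as np
from PIL import Image

image_paths = sorted(glob.glob("calib_images/*.jpg"))[:1000]  # aim for ~500-1000 images

calib_set = np.stack([
    np.array(Image.open(p).convert("RGB").resize((224, 224)), dtype=np.uint8)
    for p in image_paths
])

print(calib_set.shape)  # (N, 224, 224, 3), one entry per calibration image
np.save("calib_set.npy", calib_set)
```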

If PTQ doesn’t give you the accuracy you need, we can explore Quantization-Aware Training (QAT). While it requires retraining, it often provides better results for challenging cases.

Here’s the simplest workflow I’d suggest:

  1. Export your TensorFlow model to .tflite (floating-point; see the sketch after this list)
  2. Use our Hailo Software Suite for quantization
  3. Run the Hailo compiler for optimization
  4. Validate your results
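
For step 1, a minimal sketch of the floating-point export is shown below. The model path and output file name are placeholders for your own; note that no quantization options are set, so the .tflite stays FP32 and the Hailo tools handle quantization later:

```python
# Minimal sketch: export a trained Keras model to a floating-point .tflite file.
# "my_model.h5" and the output name are placeholder paths.
import tensorflow as tf

model = tf.keras.models.load_model("my_model.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # no optimizations requested, so weights stay FP32

with open("my_model_fp32.tflite", "wb") as f:
    f.write(tflite_model)
```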

Let me know if you run into any issues or need more specific guidance on any of these steps. We’re here to help!


Thank you so much for your reply.
When I tried to export my TensorFlow Lite model (floating point) to a HAR file with the software suite for quantization, it said the format was FLOAT32 and that this wasn't right. This is why I decided to quantize the TensorFlow Lite model to INT8 outside of Hailo and was planning to do the optimization to HEF format there now.

Do you have any idea why the parsing didn't work? Thank you in advance!

Hi @psimon,

The quantization must be performed with Hailo’s tools. If parsing a full-precision model was unsuccessful, could you please share the error?

The recommended number of images for statistics collection (calibration) is the default of 64. For other pre/post-quantization algorithms, like the QAT mentioned by @omria, you might need more instances. A tip for the calibration set: it should represent the inference dataset well.
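
For reference, a rough sketch of the quantization flow with the DFC Python API is shown below. It follows the publicly documented ClientRunner workflow (parse, optimize, compile), but exact function names and arguments can differ between DFC versions, and the paths, model name, and hailo8 target are assumptions, so treat it as an outline rather than a recipe:

```python
# Rough sketch of the Hailo Dataflow Compiler flow: parse -> optimize/quantize -> compile.
# Paths, model name, and hw_arch are assumptions; check your DFC version's docs for exact APIs.
import numpy as np
from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8")

# Parse the floating-point TFLite model into a Hailo archive (HAR)
runner.translate_tf_model("my_model_fp32.tflite", "my_model")
runner.save_har("my_model.har")

# Quantize using the calibration set (e.g. the array saved earlier)
calib_set = np.load("calib_set.npy")
runner.optimize(calib_set)

# Compile to a HEF for the device
hef = runner.compile()
with open("my_model.hef", "wb") as f:
    f.write(hef)
```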


Hello and thank you for your help! The quantization and optimization were now done through the Hailo compiler and it worked! I first exported to HAR format and then to HEF format, following the Hailo guidelines. However, when I now try to run the HEF file with the HailoRT CLI on the device where the Hailo chip is located, I get this result:
Transform data: true
Type: auto
Quantized: true
[HailoRT] [error] CHECK failed - HEF file length does not match
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)
[HailoRT] [error] Failed parsing HEF file
[HailoRT] [error] Failed creating HEF
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26) - Failed reading hef file /home/basicbox/Desktop/HAILO/my_model.hef
[HailoRT] [error] CHECK failed - HEF file length does not match
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)
[HailoRT] [error] Failed parsing HEF file
[HailoRT] [error] Failed creating HEF
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)

I am not sure how to proceed. I would appreciate your help, thank you.

This might be due to an incompatibility between the DFC version that you compiled the model with and the HailoRT version.


Hello, thank you very much, that was indeed the issue.

I do have some other questions regarding inference with pyhailort.

Currently I am testing a simple classification model that I built from scratch with TensorFlow, then exported to TensorFlow Lite, and finally compiled to a HEF file.
The hailortcli run command is working and showing metrics, but is there a way to calculate accuracy results with the hailortcli run command as well? Or is it only for seeing how well the model is performing (speed etc.)?

Thank you very much!