My TensorFlow model was trained with quantization-aware training (QAT) and doesn’t seem to go through the parser successfully. The parser complains that it cannot find the specified end nodes.
The DFC doesn’t support models that have been trained with TensorFlow’s QAT: such models contain layers, like act_quant and FakeQuantWithMinMaxVars, that cannot be parsed.
The solution is to remove QAT using one of the following methods:
- If you are using Keras to create your model from scratch, as suggested in the Quantization aware training comprehensive guide | TensorFlow Model Optimization, and there is a call to tfmot.quantization.keras.quantize_model() when the model is created, TensorFlow automatically adds the act_quant and FakeQuantWithMinMaxVars layers even if you don’t quantize at the end. The model only gets quantized when the TFLite converter is called.
  So make sure to remove calls to tfmot.quantization.keras.quantize_model() in your Keras code.
- If you are using the TensorFlow Object Detection API, then according to Quantization of TensorFlow Object Detection API Models | Galliot, the way to disable QAT is to remove these lines from ssd_mobilenet_v2_quantized_300x300_coco.config:
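A minimal sketch of what this looks like in practice (the architecture below is illustrative, not from the original post): build the plain Keras model and leave the QAT wrapper out, so the DFC parser never sees the fake-quant layers.

```python
import tensorflow as tf

# Plain Keras model -- this version goes through the DFC parser.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# REMOVE this call -- it wraps the layers with act_quant /
# FakeQuantWithMinMaxVars nodes that the DFC cannot parse:
# import tensorflow_model_optimization as tfmot
# model = tfmot.quantization.keras.quantize_model(model)

# With the wrapper removed, no layer name contains "quant".
print(all("quant" not in layer.name for layer in model.layers))
```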
graph_rewriter {
  quantization {
    delay: 48000
    weight_bits: 8
    activation_bits: 8
  }
}
I’m facing a similar issue, and the above solution (do not call tfmot.quantization.keras.quantize_model()) actually disables QAT.
I was wondering if there’s a way to train a model with QAT (Quantization Aware Training) and then convert it to Hailo format? If not, are there any plans to support this feature in the future?
You can use Keras to enable QAT with the Hailo toolchain. Please refer to https://hailo.ai/developer-zone/documentation/v3-29-0/?sp_referrer=tutorials_notebooks/notebooks/DFC_6_QAT_Tutorial.html#Quantization-Aware-Training-Tutorial for more details.
Note that even if you didn’t use Keras to build and train the model, you can call runner.get_keras_model() to get the Keras version of the model from the HAR file.
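A hedged sketch of that flow, assuming the Hailo DFC client package (hailo_sdk_client) is installed; the HAR filename is a placeholder.

```python
# Hedged sketch: assumes hailo_sdk_client is installed and "my_model.har"
# is a HAR file produced by an earlier parse/optimize step.
from hailo_sdk_client import ClientRunner

runner = ClientRunner(har="my_model.har")

# Keras view of the model from the HAR, as mentioned above; this model can
# then be fine-tuned with QAT as shown in the DFC QAT tutorial.
keras_model = runner.get_keras_model()
```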