Groupnorm/Layernorm support in QAT process

I’ve been trying to quantize a custom model that has a group normalization layer, and since GroupNorm is not supported by Hailo, I thought of trying LayerNorm instead, as the documentation lists it as one of the supported layers.

However, I am also currently attempting QAT. Previously, I proceeded without QAT by converting a PyTorch model to ONNX (I have since retrained the model in TensorFlow), and the PyTorch-to-ONNX conversion broke the GroupNorm down into a subgraph of operations that Hailo recognized, so I was able to complete the full quantization process, though accuracy after quantization suffered greatly. Because of this accuracy loss I decided to perform QAT, but it now seems likely that the loss was caused precisely by translating the unsupported GroupNorm layer.

The TF/Keras-to-TFLite conversion and QAT workflow seems to have the same issue: TFLite does not support GroupNorm or LayerNorm as single ops and instead breaks them down into a subgraph of smaller operations. I have already tried the TFLite conversion with a GroupNorm model, and it produced multiple unsupported-operation errors for reshape, reduce_mean, broadcast_to, etc. (which I believe are the decomposed operations replacing GroupNorm).
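For context, a hand-rolled GroupNorm written in plain TF ops (my own approximation, not necessarily the exact graph the converter emits) produces roughly the same reshape / reduce_mean pattern the parser complains about:

```python
import tensorflow as tf

def group_norm(x, groups=8, eps=1e-5):
    """Rough GroupNorm decomposition in plain TF ops (NHWC input with static shape)."""
    _, h, w, c = x.shape
    g = tf.reshape(x, [-1, h, w, groups, c // groups])       # split channels into groups
    mean = tf.reduce_mean(g, axis=[1, 2, 4], keepdims=True)  # per-group mean
    var = tf.reduce_mean(tf.square(g - mean), axis=[1, 2, 4], keepdims=True)
    g = (g - mean) * tf.math.rsqrt(var + eps)                # normalize per group
    return tf.reshape(g, [-1, h, w, c])                      # restore NHWC layout
```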

What I was wondering is whether LayerNorm support is exclusive to the ONNX parser workflow (from the GroupNorm example, the ONNX conversion seems more robust at decomposing layers) or whether it also applies to QAT workflows via TFLite.

I can’t comment on the tflite route, but the onnx route is dependent on the onnx opset version. Any opset before 17 will result in a decomposed LayerNorm and any opset before 18 will result in a decomposed GroupNorm.
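For example, exporting from PyTorch with an explicit opset (hypothetical model and paths, just to illustrate the flag) keeps LayerNorm as a single op from opset 17 onwards:

```python
import torch

# "MyModel" is a placeholder for a network containing nn.LayerNorm / nn.GroupNorm
model = MyModel().eval()
dummy = torch.randn(1, 3, 224, 224)

# opset_version >= 17 exports LayerNorm as a single LayerNormalization op;
# opset_version >= 18 would be needed for a single GroupNormalization op.
torch.onnx.export(model, dummy, "model.onnx", opset_version=17)
```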

For LayerNorm, both the decomposed form and the ONNX op version are supported. The compiler looks for either the plain LayerNorm op or a specific chain of decomposed ops. Because GroupNorm is not supported, there is no corresponding chain of ops that the compiler searches for.

Would it be possible to confirm that LayerNorm is also supported on the TFLite route, so that QAT would be possible?

Going through the TFLite translation route shouldn’t be necessary for QAT, if you’re referring to the QAT guide in the DFC manual (Section 4.7).

The HAR format is based on TFLite/Keras: when an ONNX is translated, the layers are automatically converted into HN layers, which as far as I remember are wrapped TFLite/Keras layers.

You should be able to export the Keras model via runner.get_keras_model() in order to do QAT, even with the ONNX route.
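Roughly, the flow I have in mind looks like this (a sketch based on my own use of the DFC; exact arguments, e.g. whether get_keras_model needs an inference context, may differ between versions):

```python
from hailo_sdk_client import ClientRunner

# Parse the ONNX into the HN/HAR representation
runner = ClientRunner(hw_arch="hailo8")                 # target arch is an assumption
runner.translate_onnx_model("model.onnx", "my_model")

# Pull the translated graph back out as a Keras model for QAT;
# depending on the DFC version this call may need an inference context argument.
keras_model = runner.get_keras_model()
```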

Disclaimer: I am not part of the Hailo team, this is information I’ve been able to pick up through extensive use of DFC.

Hey @Byung_Hoo_Park ,

Welcome to the Hailo Community!

You’re not stuck with just the ONNX route for LayerNorm! The Dataflow Compiler actually recognizes and quantizes LayerNorm properly when you use the TensorFlow/TFLite path too, including with QAT workflows.

LayerNorm works great in TFLite - The TensorFlow/TFLite parser treats LayerNorm as a proper “block” even when it gets broken down into smaller operations. If you check the “Supported Tensorflow Layers” table, you’ll see Layer Normalization listed as “Translated as a block of several Hailo layers.”

Better QAT support - We have specifically improved how LayerNorm gets quantized, so your QAT-trained models with LayerNorm should have much better accuracy after quantization.

I’ve also inspected hailo_sdk_client/model_translator, and while I could find a definition for a LayerNorm layer in Hailo, I could not find a ‘lookup’ of the graph specifying the exact chain of ops it searches for to identify a LayerNorm layer, though this may be a mistake on my part.

I’ve created a TFLite file with LayerNorm, and the converter decomposed it, as can be seen in the image.
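For reproducibility, this is roughly how the TFLite file was produced (a minimal sketch with a placeholder architecture, not my exact model):

```python
import tensorflow as tf

# Minimal placeholder model containing a LayerNormalization layer
inputs = tf.keras.Input(shape=(32, 32, 16))
x = tf.keras.layers.Conv2D(16, 3, padding="same")(inputs)
x = tf.keras.layers.LayerNormalization(axis=-1)(x)   # normalize over the channel axis
model = tf.keras.Model(inputs, x)

# Convert to TFLite; the LayerNorm ends up decomposed into mean / squared-difference /
# rsqrt / mul / add ops in the resulting flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("layernorm_test.tflite", "wb") as f:
    f.write(converter.convert())
```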

When I try to parse this, multiple ‘unsupported layer’ errors pop up, such as:
Mean operations that attempt to reduce the tensor across the batch, height, and width axes simultaneously (axis=[0, 1, 2])
An unsupported SquaredDifference operation
The parser failing on the specific Reshape and Transpose operations used within the LayerNormalization block, flagging them as an UnsupportedShuffleLayerError

I believe this indicates that the parser failed to identify this set of operations as a ‘LayerNorm block’.
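For reference, the parse call itself is nothing special (the method name follows the DFC tutorials; my exact invocation may differ slightly):

```python
from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8")   # same arch as in my earlier non-QAT flow
# Parsing the TFLite produced above; this is where the unsupported-layer errors appear
runner.translate_tf_model("layernorm_test.tflite", "layernorm_test")
```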

Hailo Dataflow Compiler: 3.31.0

TensorFlow: 2.12.0

Keras: 2.12.0

Python: 3.10

NumPy: 1.23.3

NVIDIA Driver: 528

CUDA Toolkit: 11.8

cuDNN: 8.9

Operating System: Ubuntu 22.04 (on WSL2)

Is it that the decomposed pattern is somehow slightly different, or is it down to the precise method of layer normalization (I found in layer_normalization.py in the hailo_sdk_client that ‘axes’ defaults to [3], i.e. the feature axis)?