Groupnorm/Layernorm support in QAT process

I’ve been trying to quantize a custom model that has a group normalization layer, and since GroupNorm is not supported by Hailo, I thought of trying LayerNorm instead, as the documentation lists it as one of the supported layers.

However, I am also currently attempting QAT. Previously, I proceeded without QAT by converting a PyTorch model to ONNX (I have since retrained the model in TensorFlow), and the PyTorch-to-ONNX conversion broke the GroupNorm down into a subgraph of operations that Hailo recognized, so I was able to complete the full quantization process, though accuracy after quantization suffered greatly. Because of this accuracy loss I decided to perform QAT, but it now seems likely that the loss was caused precisely by translating the unsupported GroupNorm layer.

The TF/Keras-to-TFLite conversion and QAT workflow seems to have the same issue: TFLite does not support GroupNorm or LayerNorm as single ops and instead breaks them down into a subgraph of smaller operations. I have already tried the TFLite conversion with a GroupNorm model, and it produced multiple unsupported-operation errors for reshape, reduce_mean, broadcast_to, etc. (which I believe are the decomposed operations replacing GroupNorm).
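For context, a hand-rolled GroupNorm written in plain TF ops (my own approximation, not necessarily the exact graph the converter emits) produces roughly the same reshape / reduce_mean pattern the parser complains about:

```python
import tensorflow as tf

def group_norm(x, groups=8, eps=1e-5):
    """Rough GroupNorm decomposition in plain TF ops (NHWC input with static shape)."""
    _, h, w, c = x.shape
    g = tf.reshape(x, [-1, h, w, groups, c // groups])       # split channels into groups
    mean = tf.reduce_mean(g, axis=[1, 2, 4], keepdims=True)  # per-group mean
    var = tf.reduce_mean(tf.square(g - mean), axis=[1, 2, 4], keepdims=True)
    g = (g - mean) * tf.math.rsqrt(var + eps)                # normalize per group
    return tf.reshape(g, [-1, h, w, c])                      # restore NHWC layout
```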

What I was wondering is whether LayerNorm support is exclusive to the ONNX parser workflow (from the GroupNorm example, the ONNX conversion seems more robust at decomposing layers) or whether it also applies to QAT workflows via TFLite.

I can’t comment on the tflite route, but the onnx route is dependent on the onnx opset version. Any opset before 17 will result in a decomposed LayerNorm and any opset before 18 will result in a decomposed GroupNorm.
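For example, exporting from PyTorch with an explicit opset (hypothetical model and paths, just to illustrate the flag) keeps LayerNorm as a single op from opset 17 onwards:

```python
import torch

# "MyModel" is a placeholder for a network containing nn.LayerNorm / nn.GroupNorm
model = MyModel().eval()
dummy = torch.randn(1, 3, 224, 224)

# opset_version >= 17 exports LayerNorm as a single LayerNormalization op;
# opset_version >= 18 would be needed for a single GroupNormalization op.
torch.onnx.export(model, dummy, "model.onnx", opset_version=17)
```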

For LayerNorm, both the decomposed form and the ONNX op version are supported. The compiler looks for either the plain LayerNorm op or a specific chain of decomposed ops. Because GroupNorm is not supported, there is no corresponding chain of ops that the compiler searches for.

Would it be possible to confirm that LayerNorm is also supported on the TFLite route, so that QAT would be possible?

Going through the TFLite translation route shouldn’t be necessary for QAT, if you’re referring to the QAT guide in the DFC manual (Section 4.7).

The HAR format is based on TFLite/Keras: when an ONNX is translated, the layers are automatically converted into HN layers, which as far as I remember are wrapped TFLite/Keras layers.

You should be able to export the Keras model via runner.get_keras_model() in order to do QAT, even with the ONNX route.
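Roughly, the flow I have in mind looks like this (a sketch based on my own use of the DFC; exact arguments, e.g. whether get_keras_model needs an inference context, may differ between versions):

```python
from hailo_sdk_client import ClientRunner

# Parse the ONNX into the HN/HAR representation
runner = ClientRunner(hw_arch="hailo8")                 # target arch is an assumption
runner.translate_onnx_model("model.onnx", "my_model")

# Pull the translated graph back out as a Keras model for QAT;
# depending on the DFC version this call may need an inference context argument.
keras_model = runner.get_keras_model()
```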

Disclaimer: I am not part of the Hailo team, this is information I’ve been able to pick up through extensive use of DFC.

Hey @Byung_Hoo_Park ,

Welcome to the Hailo Community!

You’re not stuck with just the ONNX route for LayerNorm! The Dataflow Compiler actually recognizes and quantizes LayerNorm properly when you use the TensorFlow/TFLite path too, including with QAT workflows.

LayerNorm works great in TFLite - The TensorFlow/TFLite parser treats LayerNorm as a proper “block” even when it gets broken down into smaller operations. If you check the “Supported Tensorflow Layers” table, you’ll see Layer Normalization listed as “Translated as a block of several Hailo layers.”

Better QAT support - We have specifically improved how LayerNorm gets quantized, so your QAT-trained models with LayerNorm should have much better accuracy after quantization.

I’ve also inspected hailo_sdk_client/model_translator, and while I could find a definition for a LayerNorm layer in Hailo, I could not find a ‘lookup’ of the graph specifying the exact chain of ops it searches for to identify a LayerNorm layer, though this may be a mistake on my part.

I’ve created a TFLite file with LayerNorm, and the converter decomposed it, as can be seen in the image.
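For reproducibility, this is roughly how the TFLite file was produced (a minimal sketch with a placeholder architecture, not my exact model):

```python
import tensorflow as tf

# Minimal placeholder model containing a LayerNormalization layer
inputs = tf.keras.Input(shape=(32, 32, 16))
x = tf.keras.layers.Conv2D(16, 3, padding="same")(inputs)
x = tf.keras.layers.LayerNormalization(axis=-1)(x)   # normalize over the channel axis
model = tf.keras.Model(inputs, x)

# Convert to TFLite; the LayerNorm ends up decomposed into mean / squared-difference /
# rsqrt / mul / add ops in the resulting flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("layernorm_test.tflite", "wb") as f:
    f.write(converter.convert())
```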

When I try to parse this, multiple ‘unsupported layer’ errors pop up, such as:
Mean operations that attempt to reduce the tensor across the batch, height, and width axes simultaneously (axis=[0, 1, 2])
An unsupported SquaredDifference operation
The parser failing on the specific Reshape and Transpose operations used within the LayerNormalization block, flagging them as an UnsupportedShuffleLayerError

I believe this indicates that the parser failed to identify this set of operations as a ‘LayerNorm block’.
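For reference, the parse call itself is nothing special (the method name follows the DFC tutorials; my exact invocation may differ slightly):

```python
from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8")   # same arch as in my earlier non-QAT flow
# Parsing the TFLite produced above; this is where the unsupported-layer errors appear
runner.translate_tf_model("layernorm_test.tflite", "layernorm_test")
```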

Hailo Dataflow Compiler: 3.31.0

TensorFlow: 2.12.0

Keras: 2.12.0

Python: 3.10

NumPy: 1.23.3

NVIDIA Driver: 528

CUDA Toolkit: 11.8

cuDNN: 8.9

Operating System: Ubuntu 22.04 (on WSL2)

Is it that the decomposed pattern is somehow slightly different, or is it down to the precise method of layer normalization (I found in layer_normalization.py in the hailo_sdk_client that ‘axes’ defaults to [3], i.e. the feature axis)?