I need to quantize a custom model with RepVGG as the backbone.
I noticed that the Hailo Model Zoo has several models that use a reparameterization architecture such as RepVGG as their backbone. How are these models quantized?
Is the Conv quantized after reparameterization? Or is the network quantized without reparameterization (i.e., with the multiple Conv + BatchNorm branches)?
When you compile a RepVGG-based network through the Hailo Model Zoo (like using hailomz compile … --yaml …), the backend automatically handles the reparameterization before quantization. Here’s what happens under the hood:
The compiler first takes your multi-branch Conv + BatchNorm structure and folds it all down into a single “deploy” convolution. Only after this folding step does it apply quantization. So you’re not actually quantizing each individual training-time branch or BatchNorm layer separately.
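To make that folding step concrete, the math involved is the standard RepVGG reparameterization (this is a generic illustration, not the Hailo compiler's actual code): fold each branch's BatchNorm into its conv, pad the 1x1 and identity branches up to 3x3 kernels, and sum the three branches into a single kernel and bias. A minimal PyTorch sketch, assuming a 3x3 + 1x1 + identity block with groups=1:

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv_weight, bn):
    # Fold BatchNorm into the preceding conv: w' = w * gamma/std, b' = beta - mean*gamma/std
    std = torch.sqrt(bn.running_var + bn.eps)
    scale = bn.weight / std                           # per-output-channel scale (gamma/std)
    fused_w = conv_weight * scale.reshape(-1, 1, 1, 1)
    fused_b = bn.bias - bn.running_mean * scale
    return fused_w, fused_b

def reparameterize_repvgg_block(conv3x3, bn3x3, conv1x1, bn1x1, bn_id=None):
    """Collapse the 3x3, 1x1 and identity branches into one 'deploy' 3x3 conv + bias."""
    w3, b3 = fuse_conv_bn(conv3x3.weight, bn3x3)
    w1, b1 = fuse_conv_bn(conv1x1.weight, bn1x1)
    # Pad the 1x1 kernel to 3x3 so the branches can simply be summed
    w1 = nn.functional.pad(w1, [1, 1, 1, 1])
    w, b = w3 + w1, b3 + b1
    if bn_id is not None:  # identity branch exists only when in_ch == out_ch and stride == 1
        out_ch, in_ch = w3.shape[0], w3.shape[1]
        id_kernel = torch.zeros(out_ch, in_ch, 3, 3)
        for c in range(out_ch):
            id_kernel[c, c, 1, 1] = 1.0               # identity expressed as a 3x3 conv
        wi, bi = fuse_conv_bn(id_kernel, bn_id)
        w, b = w + wi, b + bi
    return w, b
```

The resulting single (w, b) pair is what the quantizer ends up seeing.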
The process works like this:
Folding/Re-parameterization: During the full-precision optimization step (optimize_full_precision), all the training-time branches and batch norms get combined into one Conv + Bias layer (the “deploy” kernel). This handles conv + BN folding, conv + add operations, etc.
Quantization: Then in the next optimization pass (optimize), the compiler quantizes this single resulting convolution’s weights and activations, applying whatever weight/activation clipping you’ve configured before converting everything to int8.
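As a rough picture of that second step, here is a generic per-output-channel symmetric int8 weight quantization of the fused deploy kernel. The optional percentile clipping stands in for the weight-clipping knobs mentioned above; the actual thresholds and rounding policy used by the compiler are not shown here:

```python
import torch

def quantize_weights_int8(fused_w, clip_percentile=None):
    """Illustrative symmetric per-channel int8 quantization of the fused deploy kernel."""
    w = fused_w.reshape(fused_w.shape[0], -1)              # (out_ch, in_ch * k * k)
    if clip_percentile is not None:
        # Optional clipping of weight outliers before picking the scale
        thresh = torch.quantile(w.abs(), clip_percentile, dim=1, keepdim=True)
        w = torch.clamp(w, -thresh, thresh)
    scale = w.abs().amax(dim=1, keepdim=True) / 127.0      # one scale per output channel
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q.reshape(fused_w.shape), scale.squeeze(1)
```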
So yes, you're right: it's the post-reparameterization Conv that actually gets quantized, not the original multi-Conv-plus-BatchNorm graph structure.
Yes, even when you're doing QAT you still train the reparameterized Conv; you're simply fine-tuning it (and any subsequent layers) in the quantized domain so that the final INT8 model recovers as much accuracy as possible.
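To illustrate what "training in the quantized domain" means, here is a minimal fake-quantization wrapper with a straight-through estimator applied to the reparameterized kernel. It's the generic QAT pattern, not Hailo's QAT implementation, and the per-tensor scale is a simplification:

```python
import torch
import torch.nn as nn

class FakeQuantConv2d(nn.Module):
    """Deploy-style conv whose weights are fake-quantized to int8 during QAT."""
    def __init__(self, fused_w, fused_b, stride=1, padding=1):
        super().__init__()
        self.weight = nn.Parameter(fused_w.clone())  # start from the reparameterized kernel
        self.bias = nn.Parameter(fused_b.clone())
        self.stride, self.padding = stride, padding

    def forward(self, x):
        w = self.weight
        scale = w.detach().abs().amax() / 127.0
        w_q = torch.clamp(torch.round(w / scale), -128, 127) * scale
        # Straight-through estimator: quantized values forward, full-precision gradients backward
        w_ste = w + (w_q - w).detach()
        return nn.functional.conv2d(x, w_ste, self.bias, self.stride, self.padding)
```

The optimizer only ever sees this single fused kernel, which is exactly the point above: QAT fine-tunes the deploy Conv, not the training-time branches.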