Dynamic Weights Quantization Problem

Good day,

I am trying to port a customized model. I am currently at the quantization step (i.e. runner.optimize(calib_dataset)); the model has already been translated from ONNX to HAR without problems. During quantization, I encounter this error:

hailo_model_optimization.acceleras.utils.acceleras_exceptions.AccelerasImplementationError: layer model/dw2/crosscorrelation_dw_op do not support zero point != 0

I think this problem comes from torch.nn.functional.conv2d, where I pass dynamic/runtime weights to the operation. I had to do this because I am performing matrix multiplication on 4D tensors.
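To make the pattern concrete, here is a minimal sketch of what I mean (variable names are illustrative, not from my actual model): a channel-wise matrix multiplication on 4D tensors expressed as a 1x1 conv2d whose weights are computed at runtime rather than being constants.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 8, 4, 4)   # activations (N, C_in, H, W)
w = torch.randn(16, 8)        # runtime-computed "weights" (C_out, C_in)

# Dynamic-weight conv: the kernel is an activation, not a fixed parameter.
y_conv = F.conv2d(x, w.view(16, 8, 1, 1))

# Equivalent channel-wise matmul, for reference.
y_mm = torch.einsum('oc,nchw->nohw', w, x)

assert torch.allclose(y_conv, y_mm, atol=1e-5)
print(y_conv.shape)  # torch.Size([1, 16, 4, 4])
```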

  • Is there any way to work around this problem via the quantization options?
  • I can simplify this operation, but it requires torch.sum, which lowers the rank of the tensor. I still need rank 4 for the succeeding operations/layers, and the DFC doesn’t like the rank change. Is there a way to trick the DFC into doing an unsqueeze?
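For the second point, a sketch of what I am trying to achieve (assuming the reduction is over the channel dimension; whether the DFC lowers this any differently is exactly my question): torch.sum with keepdim=True preserves the rank-4 shape, so no separate unsqueeze would be needed.

```python
import torch

x = torch.randn(1, 8, 4, 4)

# torch.sum over a dim drops that dim, lowering the rank to 3...
s3 = torch.sum(x, dim=1)                 # shape (1, 4, 4)

# ...but keepdim=True retains the summed dim as size 1, staying rank 4.
s4 = torch.sum(x, dim=1, keepdim=True)   # shape (1, 1, 4, 4)

assert s4.shape == (1, 1, 4, 4)
assert torch.equal(s4, s3.unsqueeze(1))
```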

For context, I am trying to port my Mamba-based speech processing model, and I have already rewritten its blocks/layers to use 4D shapes.