Wrong dimensions in har file from onnx containing torch.nn.ConvTranspose2d layers

Hello, I have a problem when converting an onnx file that contains ConvTranspose2d layers to har: the output dimensions of the two files do not match.

For example, given an input of dimension (3,24,14) that goes through an nn.ConvTranspose2d(in_channels=3, out_channels=9, kernel_size=3), the output (in PyTorch and ONNX) has dimension (9,26,16), while the har model output has dimension (9,24,14).
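For reference, here is a minimal sketch (random weights, shapes only) that reproduces the PyTorch/ONNX output dimension quoted above:

import torch
import torch.nn as nn

# (N, C, H, W) = (1, 3, 24, 14) input through a 3x3 transposed convolution
x = torch.randn(1, 3, 24, 14)
deconv = nn.ConvTranspose2d(in_channels=3, out_channels=9, kernel_size=3)
print(deconv(x).shape)  # torch.Size([1, 9, 26, 16])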

The har model is generated by

hailo parser onnx --hw-arch hailo8l model.onnx

Hey @sorbafabrizio,

The output dimension mismatch between your ONNX and HAR models for the ConvTranspose2d layer likely stems from differences in how padding and stride parameters are interpreted during conversion. Here are some steps to troubleshoot:

  1. Verify padding, stride, and dilation parameters are correctly translated from PyTorch to ONNX to HAR.

  2. Use Netron to check whether the ONNX intermediate layer dimensions match the PyTorch model before the HAR conversion (or check the shapes programmatically, as in the sketch after this list).

  3. Review Hailo parser documentation for any specific handling of ConvTranspose2d layers or hardware compatibility requirements.

  4. Consider adjusting the model architecture (e.g., modifying padding or kernel sizes) to ensure consistent output dimensions post-conversion.

  5. If the issue persists, consult Hailo support or documentation for known issues with ConvTranspose2d layer conversion.
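For step 2, the intermediate shapes can also be checked without Netron; here is a minimal sketch using the onnx package's shape inference (assuming the exported file is named model.onnx):

import onnx
from onnx import shape_inference

# run ONNX shape inference and print the inferred shape of every intermediate tensor
inferred = shape_inference.infer_shapes(onnx.load("model.onnx"))
for vi in inferred.graph.value_info:
    dims = [d.dim_value for d in vi.type.tensor_type.shape.dim]
    print(vi.name, dims)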

Let me know if you need any clarification or further assistance!

Best Regards,
Omri

Thanks @omria,

yes, from the documentation it appears that only SAME_TENSORFLOW padding is supported.
After adjusting the padding of the ConvTranspose2d layer accordingly, the output dimensions of the har and onnx files match.
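To illustrate for anyone hitting the same problem, a rough sketch: with kernel_size=3 and stride=1, setting padding=1 gives SAME-style behaviour (the output spatial size equals the input size); other kernel sizes and strides need different padding/output_padding values.

import torch
import torch.nn as nn

# padding=1 with a 3x3 kernel and stride 1 keeps the spatial dimensions unchanged
x = torch.randn(1, 3, 24, 14)
deconv = nn.ConvTranspose2d(in_channels=3, out_channels=9, kernel_size=3, padding=1)
print(deconv(x).shape)  # torch.Size([1, 9, 24, 14])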

@omria,

I still have trouble getting the hef output right.

I am testing this simple network (trained to return the input):

import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # single 3x3 convolution; padding=1 keeps the spatial dimensions unchanged
        self.conv = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        x = self.conv(x)
        return x

After conversion, the dimensions and weights in the har file match those in the onnx file. However, when I run the hef model, the output does not match the one from the onnx model.
Here are the outputs of the onnx and hef for a test input image:

Onnx output: [image]

Hef output: [image]

Do you have any idea what the problem could be?
I tried to rearrange the tensor dimensions in different ways, but I always get the same kind of output.

Further info:

The onnx file is generated by

model = Autoencoder()
model.load_state_dict(torch.load("model.tar"))
dummy_in = torch.randn(1,3,384,216)
torch.onnx.export(model, dummy_in, 'model.onnx', export_params=True, opset_version=15)
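To rule out the export itself, here is a quick sanity check (a sketch, continuing from the snippet above and assuming onnxruntime is installed) that compares the ONNX output against the PyTorch output for the same dummy input:

import numpy as np
import onnxruntime as ort

# run the exported model and compare it against the PyTorch output
sess = ort.InferenceSession("model.onnx")
onnx_out = sess.run(None, {sess.get_inputs()[0].name: dummy_in.numpy()})[0]
torch_out = model(dummy_in).detach().numpy()
print(np.abs(onnx_out - torch_out).max())  # should be close to 0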

and the hef file by

hailo parser onnx --hw-arch hailo8l model.onnx
hailo optimize --hw-arch hailo8l --use-random-calib-set model.har 
hailo compiler --hw-arch hailo8l model_optimized.har

The hef is run on a Raspberry Pi by

import cv2
from picamera2.devices import Hailo

hailo = Hailo("model.hef")
img_test=cv2.imread("test.png")
out = hailo.run(img_test)

@omria,

nevermind, I solved the issue by changing the input shape with the --tensor-shapes parameter in the hailo parser:

hailo parser onnx --hw-arch hailo8l model_fp32.onnx --tensor-shapes [1,3,216,384]

Hey @sorbafabrizio @omria,

Thanks for sharing your experience with the conversion of ConvTranspose2d from onnx to HAR! We are facing a similar issue and would like to ask for your support.

Problem description

We have an onnx model which runs inference successfully with onnxruntime. The library versions we use to export/run the model are:

onnx==1.16.0
onnxruntime-gpu==1.18.0
onnxscript==0.1.0.dev20241030
torch==2.3.1+cu121

The model contains a ConvTranspose layer with a 1x112x23x40 input and a 1x40x45x80 output in the onnx file. This ConvTranspose layer has the following attributes:

dilations = (1, 1)
group = 1
kernel_shape = (3, 3)
output_padding = (0, 1)
pads = (1, 1, 1, 1)
strides = (2, 2)

According to the formulas in the PyTorch documentation, the output dimensions stated in the onnx file should be correct.
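For reference, plugging the attributes listed above into the output-size formula from the PyTorch ConvTranspose2d documentation gives exactly the ONNX shapes (a small sketch; symmetric pads assumed):

# out = (in - 1)*stride - 2*pad + dilation*(kernel - 1) + output_padding + 1
def deconv_out(size, stride, pad, kernel, out_pad, dilation=1):
    return (size - 1) * stride - 2 * pad + dilation * (kernel - 1) + out_pad + 1

print(deconv_out(23, stride=2, pad=1, kernel=3, out_pad=0))  # 45 (height)
print(deconv_out(40, stride=2, pad=1, kernel=3, out_pad=1))  # 80 (width)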

When we convert the onnx model to the HAR format, the call to ClientRunner.translate_onnx_model() throws a hailo_sdk_common.hailo_nn.exceptions.UnsupportedModelError. The problem is that the output of the ConvTranspose has dimensions [-1, 46, 80, 40] in the HAR model and therefore does not match the dimensions of this layer in the onnx file ([-1, 45, 80, 40]). As a result, a subsequent concat layer, which takes the ConvTranspose output as one of its inputs, fails.

For the transform we are using the Hailo AI SW Suite version 2024-10_docker.

Debugging steps

To get a better understanding of what happens during the onnx-to-HAR conversion, we tried to convert only the first part of the network, from the input node up to the ConvTranspose layer. This conversion runs through successfully, and we can visualize the resulting HAR model:

From the screenshot one can see that the ConvTranspose layer was turned into a deconv layer with the output size mentioned above.
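For reference, the partial conversion is invoked roughly like this (a sketch based on Hailo's parsing tutorial; the end node name is a placeholder and the exact ClientRunner parameters may differ between SDK versions):

from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8l")  # hw_arch is an assumption here; set it to your target
hn, npz = runner.translate_onnx_model(
    "model.onnx",
    "model",
    end_node_names=["<output_node_of_the_ConvTranspose>"],  # placeholder node name
)
runner.save_har("model_partial.har")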

The attributes of the deconv layer in the HAR model are the following:

activation = linear
batch_norm = false
dilations = (1, 1, 1, 1)
elementwise_add = false
groups = 1
input_disparity = 1
kernel_shape = (3, 3, 112, 40)
layer_disparity = 1
padding = DECONV
strides = (1, 2, 2, 1)

Currently, we suspect that the asymmetric output_padding we are using, (0, 1), might be causing this issue. Could you therefore please let us know your ideas on how to convert the model to HAR with the correct shapes?