Issues with custom model

Hello!

I’m currently working on a project to automatically solve a marble labyrinth on a Raspberry Pi 5. Since I needed more FPS, I bought a Hailo HAT to speed things up.
I’m struggling to run a custom model:
import torch
import torch.nn as nn
from torchvision import models

class MobileNetV2Custom(nn.Module):
    def __init__(self):
        super(MobileNetV2Custom, self).__init__()
        base_model = models.mobilenet_v2(weights=None)
        base_model.features[0][0] = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False)
        self.features = base_model.features
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1280, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
        )
        self.out_coord = nn.Linear(128, 2)
        self.out_presence = nn.Linear(128, 1)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = self.fc(x)
        coord = self.out_coord(x)
        presence = torch.sigmoid(self.out_presence(x))
        # Fuse the outputs into one tensor: (x, y, presence)
        return torch.cat((coord, presence), dim=1)

I’m following the notebook tutorials to convert from ONNX to HEF. I can see that runner.infer gives very good results (matching the ONNX), but after compiling, the coordinates and the presence change. For instance, presence is no longer a sigmoid between 0 and 1: when the ball is there, it goes up to 1.5. I’m assuming it’s a normalization issue in the output. I tried to get the scale and zero point, but the command I saw on the forum didn’t work.

My last attempt was to separate the three outputs, but I had no more luck:
class MobileNetV2Custom(nn.Module):
    def __init__(self):
        super(MobileNetV2Custom, self).__init__()
        base_model = models.mobilenet_v2(weights=None)
        base_model.features[0][0] = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False)
        self.features = base_model.features
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))

        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1280, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
        )

        self.out_x = nn.Linear(128, 1)
        self.out_y = nn.Linear(128, 1)
        self.out_presence = nn.Linear(128, 1)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = self.fc(x)
        x_out = self.out_x(x)
        y_out = self.out_y(x)
        presence_out = torch.sigmoid(self.out_presence(x))
        return x_out, y_out, presence_out
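
For reference, the ONNX export that produces those node names would look roughly like this (a sketch, assuming the class above; the file name and opset are placeholders):

import torch

model = MobileNetV2Custom()
model.eval()
dummy = torch.randn(1, 3, 158, 200)  # matches net_input_shapes below
torch.onnx.export(
    model,
    dummy,
    "mobilenetv2_bille.onnx",
    input_names=["input"],
    output_names=["x_out", "y_out", "presence_out"],
    opset_version=13,
)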

In this case, with:

hn, npz = runner.translate_onnx_model(
    onnx_path,
    onnx_model_name,
    start_node_names=["input"],
    end_node_names=["x_out", "y_out", "presence_out"],
    net_input_shapes={"input": [1, 3, 158, 200]},
)

I get the error:

3786 pred.replace_output_index(conv.index, dense.index)
→ 3787 pred.output_shapes[pred.outputs.index(dense.name)] = dense.input_shape
3788 self._output_graph.add_edge(pred, dense)
3789 self._output_graph.remove_edge(pred, conv)

IndexError: list assignment index out of range

Can you please help me?
Alexandre.

Hi @Alexandre_Monnin
If you can share the HEF file (the one with the single output), we can help you figure out the scaling issue.

Hello,
Here is the hef : https://drive.google.com/file/d/1eOAq70RNw32ZyKR_veBI66XXuJ6dUpZd/view?usp=drive_link

and the onnx if needed : https://drive.google.com/file/d/1m5Wi-cRLjp39XSdX2fwapO1Vu8pmpcVx/view?usp=drive_link

I tried using more than 64 images for the calibration, and it seems to get better.
I also “renormalized” the coordinate outputs by hand.

In fact, the model gives one single output composed of localization and presence (x, y, presence). The last one should be a sigmoid (it works fine with ONNX), but with the HEF model there are only two possible outcomes: 0 or 1.49 (something like that), which is odd for a sigmoid.

Another thing is that when there’s no ball, the coordinates give -1, -1.

I just installed Docker on WSL in order to use the GPU. Do you have a tutorial for it somewhere? I saw the notebooks but can’t find anything for the API.

Last thing (x2): is it possible to get separate outputs, in order to avoid the concatenation at the end?

Thanks a lot.

I managed to get this information:
outputs
VStreamInfo("best_model_v20/concat1")
scale : 1.52171790599823
zero : 2.0

with this script, which I think you provided to someone else:
from hailo_platform import HEF

hef = HEF("/home/hailo/best_model_v20.hef")
output_vstream_info = hef.get_output_vstream_infos()

print("outputs")
for output_info in output_vstream_info:
    print(output_info)
    print("scale : {}".format(output_info.quant_info.qp_scale))
    print("zero : {}\n".format(output_info.quant_info.qp_zp))

How can I preserve my presence probability, since the output is composed of position and presence? Is it possible to have more than one output, without the cat layer?

Thanks

OK! I got it!
Using Docker was simpler.

It’s getting late and I am exhausted, but I will answer my own topic in order to help others!
Teaser: I managed to train a new model (PyTorch) with three separate outputs and convert it to ONNX and HEF. The tricky part was getting the names of the outputs in the HAR/HEF model.

So, my project was to implement a custom MobileNetV2 model. Its purpose is to detect a marble in a labyrinth. The desired output is the x, y coordinates and the presence probability.
For that, I took several snapshots, manually annotated coordinates and presence, augmented the images with different transformations, and finally trained the model with PyTorch. It took time…

My friend ChatGPT was NOT a great help, since it gave me inaccurate / out-of-date information. I think the Hailo commands have changed a lot. It advised me to use a single output in order to work with Hailo, so I did that with the concatenation of outputs, inspired by the tutorial example, after converting the model to .onnx format. That model worked fine, so I struggled to find my mistake.

It appears that the same dequantization was used for all the outputs: with a shared scale of about 1.52, the quantization step is larger than the sigmoid’s whole 0-1 range, so the sigmoid wouldn’t work anymore, snapping to 0 or 1.49.

So I tried other ways, and took more time to understand the pipeline. I installed WSL/Linux on my Windows machine in order to use the Docker image you provide. Very cool thing, by the way, and much simpler than installing everything by hand!

I chose to drop the concatenation and tried three separate outputs. And it worked, after some research and mistakes.

The tricky part was getting the output names used by the .hef model (is there a way to modify them?).
I used a small script to get all the info:

from hailo_platform import HEF

hef = HEF("/home/hailo/mobilenetv2_bille.hef")
output_vstream_info = hef.get_output_vstream_infos()

print("outputs")
for output_info in output_vstream_info:
        print(output_info)
        print(output_info.name)
        print("scale : {}".format(output_info.quant_info.qp_scale))
        print("zero : {}\n".format(output_info.quant_info.qp_zp))

The result was:

(hailo_virtualenv) hailo@docker-desktop:~$ python3 verif.py
outputs
VStreamInfo("mobilenetv2_bille/fc4")
mobilenetv2_bille/fc4
scale : 1.5009634494781494
zero : 1.0

VStreamInfo("mobilenetv2_bille/fc5")
mobilenetv2_bille/fc5
scale : 1.176542043685913
zero : 1.0

VStreamInfo("mobilenetv2_bille/fc3")
mobilenetv2_bille/fc3
scale : 0.003921568859368563
zero : 0.0

You can see here names of outputs :
mobilenetv2_bille/fc3, mobilenetv2_bille/fc4, mobilenetv2_bille/fc5

Another method is to use the Hailo visualizer, but be careful not to pick the wrong final layers.

So, to get the x position, you have to use something like:

out_fc4 = results["mobilenetv2_bille/fc4"].flatten()[0]
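
Extending that to all three heads (a sketch; I’m assuming fc5 is the y head and fc3 the presence head, which its 1/255 scale suggests):

x_out = results["mobilenetv2_bille/fc4"].flatten()[0]
y_out = results["mobilenetv2_bille/fc5"].flatten()[0]     # assumed: y head
presence = results["mobilenetv2_bille/fc3"].flatten()[0]  # assumed: presence head (scale ~ 1/255)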

And, at last, a presence probability between 0 and 1!

In the end, I still have two questions:

  • Is it possible to rename the outputs to sexier names?
  • Is it possible to automate the scale and zero point so that they better match the results of the .onnx model?

For this last one, the problem is that my coordinates are relative to one edge of my labyrinth, and the ball can’t reach all coordinates (there are walls at the edges). So with the automatic quantization, I have to manually tweak the settings to match the expected localization, all of it depending on the calibration.

One last question, if you managed to read through all this:
Is it necessary to calibrate with data? Can’t we set the parameters manually, like the range of the data and the range of the expected outputs?

Thanks again!
Alexandre

Hey @Alexandre_Monnin ,

Nice work getting through all those challenges! Your process notes are going to be super helpful for anyone else dealing with similar issues.

For your questions:

1. Can we rename outputs to something more readable?

Sort of. You can set custom names when you’re exporting from PyTorch.
The catch is that our DFC might change these names during optimization and layer fusion. So you’ll still need to use hef.get_output_vstream_infos(), like you did, to see what the final names actually are in the HEF file. We don’t currently have a way to force specific names in the compiled model.

2. Any way to automate scale/zero-point for better ONNX matching?

Partially, yeah. Since we use post-training quantization based on your calibration data, the scale/zero-point get determined from actual data ranges.

Your best bet is:

  • Use quantization-aware training if possible; it makes the model much more robust to quantization differences
  • Apply the dequantization manually: real_value = (quantized_value - zero_point) × scale

We don’t have manual scale/zero-point control yet, but it’s definitely on people’s wishlist.
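
To make the manual dequantization concrete, here is a minimal sketch using the qp_scale/qp_zp values read with your script (the example numbers come from this thread; the raw value is made up):

import numpy as np

def dequantize(quantized, qp_scale, qp_zp):
    # real = (quantized - zero_point) * scale
    return (np.asarray(quantized, dtype=np.float32) - qp_zp) * qp_scale

# Illustrative: the presence head (fc3) values from above, raw value 42
presence = dequantize(42, qp_scale=0.003921568859368563, qp_zp=0.0)  # ~0.165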

3. Can we skip images and just define calibration ranges manually?

Nope, calibration data is still required as of our latest release. The compiler really needs that data to figure out proper quantization parameters.

What you can do though:

  • Create very controlled synthetic calibration data with specific ranges (coords from -1 to 1, presence around 0 and 1); see the sketch after this list
  • Handle any scaling adjustments in post-processing on the host side using the extracted scale/zero values
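
As a rough example, a controlled synthetic calibration set matching your input shape could be built like this (a sketch, assuming an NHWC layout and a 0-255 input range; real labyrinth frames covering all ball positions will calibrate better):

import numpy as np

# 64 synthetic frames shaped like the model input (N, H, W, C)
calib_data = np.random.uniform(0, 255, size=(64, 158, 200, 3)).astype(np.float32)

# Assumed DFC flow: pass the set to the runner's optimize step, e.g.
# runner.optimize(calib_data)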

Hope this helps! Keep up the great work and feel free to reach out if you need anything else.