Noisy pose for yolo model running on rpi5

Hello,

When running the pose_estimation.py example on my Raspberry Pi 5 for YOLO pose estimation, the resulting pose output is very shaky compared to running YOLO pose on my PC. This in turn seems to negatively affect the accuracy of an ST-GCN classification model I am running at the same time (on the CPU). Is this shakiness normal?

Also, if I compile YOLO pose using the compiler, will the shakiness improve at all?

Videos demonstrating my problem:
https://drive.google.com/drive/folders/15d2P6t58gRhUI3NEnbMt7PHezDHL81bh?usp=drive_link

Thank you

This is likely due to quantization. The model outputs are quantized to 8-bit. You can see fluctuations in the floating-point model on the CPU as well; however, because more bits are available there, the changes are visually less noticeable.
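
The difference in resolution can be seen with a quick back-of-the-envelope calculation (a sketch only; the actual step size depends on the calibrated range of each output layer, and the [0, 1] range here is a hypothetical example):

```python
# Quantization step size = calibrated value range / number of representable levels.
# Assume a layer whose calibrated output range is [0.0, 1.0] (hypothetical;
# real ranges come from the calibration data).
value_range = 1.0

step_8bit = value_range / (2**8 - 1)    # ~0.0039 per level
step_16bit = value_range / (2**16 - 1)  # ~0.000015 per level

print(f"8-bit step:  {step_8bit:.6f}")
print(f"16-bit step: {step_16bit:.8f}")
print(f"16-bit is {step_8bit / step_16bit:.0f}x finer")
```

Every small fluctuation in the float output has to snap to one of these levels, so a coarser grid makes the keypoints visibly jump between frames.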

You can try the following two options.

  • Filter the outputs before you send them to your second model. This smooths the transitions and can be done entirely in the application.
  • Quantize the last layers of the model to 16-bit. This gives the model a finer output resolution, but requires converting the model with the Hailo Dataflow Compiler.
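
For the first option, a minimal exponential-moving-average filter over the keypoints is often enough. This is a sketch, not part of the example code; the `alpha` value and the 17x2 COCO keypoint layout are assumptions you would adapt to your pipeline:

```python
import numpy as np

class KeypointSmoother:
    """Exponential moving average over pose keypoints.

    alpha close to 1.0 -> little smoothing; alpha close to 0.0 -> heavy
    smoothing but more lag. Tune for your frame rate and motion speed.
    """

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.state = None  # last smoothed keypoints, set on first frame

    def update(self, keypoints):
        kps = np.asarray(keypoints, dtype=np.float32)
        if self.state is None:
            self.state = kps
        else:
            self.state = self.alpha * kps + (1.0 - self.alpha) * self.state
        return self.state

# Example: 17 COCO keypoints with (x, y) each, fed per frame
smoother = KeypointSmoother(alpha=0.4)
smoothed = smoother.update(np.zeros((17, 2)))  # first frame initializes state
smoothed = smoother.update(np.ones((17, 2)))   # subsequent frames are blended
```

Feeding `smoothed` instead of the raw detections into the ST-GCN model should reduce the frame-to-frame jitter it sees.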

Can you give me some steps/links on how I can quantize the model?

Should I start from an existing onnx file?

Thanks

I recommend working through the tutorials in the Hailo AI Software Suite. After starting the Suite Docker, run the following command:

hailo tutorial

This will start a Jupyter Notebook server with individual notebooks for each step of converting a model.


Looking at the tutorials, there are options for choosing the quantization level for some percentage of all weights. However, I couldn't find a way to disable quantization for specific layers. Could you help with that?

You can have a look at the model script files in the Model Zoo. I just found that the yolov8m_pose and yolov8s_pose models in the Model Zoo already use the quantization_param command.
It seems that the model in the example was built specifically for the Raspberry Pi without this command. Using it will affect performance, especially over the single PCIe lane.

quantization_param(output_layer3, precision_mode=a16_w16)
quantization_param(output_layer6, precision_mode=a16_w16)
quantization_param(output_layer9, precision_mode=a16_w16)

GitHub - Hailo Model Zoo - yolov8m_pose.alls

Note that this does not disable quantization. The layer is still quantized, just to 16-bit instead of 8-bit.


Great, that helps a lot! So is there any difference between building the model with the Model Zoo CLI and using the Jupyter notebooks? Which approach should I choose? Finally, is the data calibration step necessary? If so, should I use the COCO dataset for calibration or my own data?

That is up to you. I think:

  • The Jupyter notebooks are the way to start. They are interactive and make it easy to experiment and understand the whole workflow.
  • The Model Zoo flow is very easy, especially when you use our retraining Docker and popular models like YOLO.
  • Using the CLI allows you to do some quick tests.
  • Using a Python script provides the best flexibility and full automation of the conversion process. You can add your own validation and integrate everything as you see fit.

Yes, calibration is necessary. You can do a quick calibration if you are less interested in accuracy and want to evaluate the FPS of a model.

To get the best accuracy, you want to use calibration data that is representative of the data the model will see when running inference.
We use public datasets like COCO because they are available to everyone and are a good starting point for evaluation.
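
As a sketch, a calibration set is just an array of frames preprocessed exactly the way your inference pipeline preprocesses them. The function name is illustrative, the frames below are random placeholders, and the 640x640x3 input shape is assumed from the YOLOv8 pose input resolution; in practice you would load real frames from your deployment environment:

```python
import numpy as np

def build_calib_set(frames, size=(640, 640)):
    """Stack preprocessed frames into an (N, H, W, 3) uint8 calibration array.

    Each frame is expected to already be resized/letterboxed to `size` the
    same way the inference pipeline does it.
    """
    calib = np.stack([np.asarray(f, dtype=np.uint8) for f in frames])
    assert calib.shape[1:] == (*size, 3), "frames must be HxWx3 at model size"
    return calib

# Placeholder frames; replace with frames representative of your deployment.
frames = [np.random.randint(0, 256, (640, 640, 3), dtype=np.uint8)
          for _ in range(8)]
calib_set = build_calib_set(frames)
print(calib_set.shape)  # (8, 640, 640, 3)
```

A few dozen to a few hundred representative frames is a common starting point; random data like this is only good enough for quick FPS-oriented builds, not for accuracy.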


Hello,

I successfully changed the quantization on these layers, however now I am facing another issue:

Architecture HEF was compiled for: HAILO8L
Network group name: yolov8s_pose, Multi Context - Number of contexts: 4
Network name: yolov8s_pose/yolov8s_pose
VStream infos:
Input yolov8s_pose/input_layer1 UINT8, NHWC(640x640x3)
Output yolov8s_pose/conv70 UINT8, FCR(20x20x64)
Output yolov8s_pose/conv71 UINT8, NHWC(20x20x1)
Output yolov8s_pose/conv72 UINT16, FCR(20x20x51)
Output yolov8s_pose/conv57 UINT8, FCR(40x40x64)
Output yolov8s_pose/conv58 UINT8, NHWC(40x40x1)
Output yolov8s_pose/conv59 UINT16, FCR(40x40x51)
Output yolov8s_pose/conv43 UINT8, FCR(80x80x64)
Output yolov8s_pose/conv44 UINT8, FCR(80x80x1)
Output yolov8s_pose/conv45 UINT16, FCR(80x80x51)
(venv_hailo_rpi5_examples) (.openmmlab) obh@raspberrypi:~/hailo-rpi5-examples $ hailortcli parse-hef resources/yolov8s_pose_h8l_pi.hef
Architecture HEF was compiled for: HAILO8L
Network group name: yolov8s_pose, Multi Context - Number of contexts: 4
Network name: yolov8s_pose/yolov8s_pose
VStream infos:
Input yolov8s_pose/input_layer1 UINT8, NHWC(640x640x3)
Output yolov8s_pose/conv70 UINT8, FCR(20x20x64)
Output yolov8s_pose/conv71 UINT8, NHWC(20x20x1)
Output yolov8s_pose/conv72 UINT8, FCR(20x20x51)
Output yolov8s_pose/conv57 UINT8, FCR(40x40x64)
Output yolov8s_pose/conv58 UINT8, NHWC(40x40x1)
Output yolov8s_pose/conv59 UINT8, FCR(40x40x51)
Output yolov8s_pose/conv43 UINT8, FCR(80x80x64)
Output yolov8s_pose/conv44 UINT8, FCR(80x80x1)
Output yolov8s_pose/conv45 UINT8, FCR(80x80x51)

Due to the different output format (16-bit vs. 8-bit), the post-processing in the GStreamer pipeline doesn't work. As a result I can't get the keypoints, just the bounding boxes. Can you help with that?

Thanks