What do the outputs of the Hailo-8 yolov8s_seg model mean, and how can I use them?

Hi, I would like to run the yolov8s_seg model on my Hailo-8.
I'm new to compiling and using segmentation models, and I would like to know how to get segmentation results and use them properly in my C++ project on a Raspberry Pi 5.
First, I think I need to understand the output below, which shows the input/output data structures of the yolov8s_seg.hef file.

Could you please explain the data structures of the yolov8s_seg outputs and how to get and use the segmentation results in my C++ project? (E.g. I would like to extract the pixels of each segmented instance and store each instance as a separate image.)

# hailortcli parse-hef yolov8s_seg.hef
Architecture HEF was compiled for: HAILO8
Network group name: yolov8s_seg, Multi Context - Number of contexts: 2
Network name: yolov8s_seg/yolov8s_seg
VStream infos:
Input yolov8s_seg/input_layer1 UINT8, NHWC(640x640x3)
Output yolov8s_seg/conv73 UINT8, NHWC(20x20x64)
Output yolov8s_seg/conv74 UINT8, NHWC(20x20x80)
Output yolov8s_seg/conv75 UINT8, NHWC(20x20x32)
Output yolov8s_seg/conv60 UINT8, FCR(40x40x64)
Output yolov8s_seg/conv61 UINT8, FCR(40x40x80)
Output yolov8s_seg/conv62 UINT8, NHWC(40x40x32)
Output yolov8s_seg/conv44 UINT8, FCR(80x80x64)
Output yolov8s_seg/conv45 UINT8, FCR(80x80x80)
Output yolov8s_seg/conv46 UINT8, FCR(80x80x32)
Output yolov8s_seg/conv48 UINT8, FCR(160x160x32)

Thank you in advance for reading my question and for any advice.

Best regards,
SJ

Tensors that share the same first two dimensions all come from the same detection head. For example, every 20x20 tensor comes from the P5 head of YOLO, every 40x40 tensor from the P4 head, and every 80x80 tensor from the P3 head.
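To make the head-to-grid relationship concrete, here is a minimal sketch of how each grid maps back to input pixels. It assumes the 640x640 input from the parse-hef listing, so the strides come out as 640/20 = 32, 640/40 = 16 and 640/80 = 8. The names `Anchor` and `make_anchors` are made up for this illustration and are not part of HailoRT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One grid cell of a detection head, expressed in input-image pixels.
struct Anchor {
    float cx;    // x coordinate of the cell centre
    float cy;    // y coordinate of the cell centre
    int stride;  // input pixels covered by one cell along each axis
};

// Build the cell centres for one grid x grid head (row-major order).
std::vector<Anchor> make_anchors(int input_size, int grid) {
    assert(grid > 0 && input_size % grid == 0);
    const int stride = input_size / grid;
    std::vector<Anchor> anchors;
    anchors.reserve(static_cast<std::size_t>(grid) * grid);
    for (int y = 0; y < grid; ++y) {
        for (int x = 0; x < grid; ++x) {
            anchors.push_back({(x + 0.5f) * stride, (y + 0.5f) * stride, stride});
        }
    }
    return anchors;
}
```

Calling `make_anchors(640, 20)`, `make_anchors(640, 40)` and `make_anchors(640, 80)` gives 400 + 1600 + 6400 = 8400 candidate locations in total, one per cell across the three heads.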

Every tensor whose last dimension is 80 is used for classification; the 80 channels correspond to the 80 classes in COCO. Every tensor whose last dimension is 64 is used for bounding box regression (4 box sides x 16 distribution bins in the standard YOLOv8 head). Every tensor whose last dimension is 32, except the 160x160x32 tensor, holds the mask coefficients.

The last tensor, which has the largest spatial dimensions (160x160x32), is the proto tensor: the 32 prototype masks that the mask coefficients are combined with.

For bounding boxes and classification, you can see how the outputs are decoded by following the code from this line: ultralytics/ultralytics/nn/modules/head.py (ultralytics/ultralytics on GitHub, main branch)
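As a rough illustration of that decoding, the sketch below turns the 64 regression values of one grid cell into a box. It follows the standard YOLOv8 scheme: each of the 4 sides (left, top, right, bottom) has 16 bins, a softmax over the bins gives an expected distance in grid units, and that distance is scaled by the head's stride. It assumes the values have already been dequantised from UINT8 to float with the scale and zero point HailoRT reports for that output, and that the 64 channels of a cell are contiguous and ordered side by side. Both assumptions should be checked against your HEF. `Box`, `dfl_expectation` and `decode_box` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Box { float x1, y1, x2, y2; };  // corners in input-image pixels

constexpr int kRegMax = 16;  // bins per box side (64 channels = 4 * 16)

// Softmax over kRegMax logits, then the expected bin index.
float dfl_expectation(const float* logits) {
    const float max_logit = *std::max_element(logits, logits + kRegMax);
    float sum = 0.0f;
    float weighted = 0.0f;
    for (int i = 0; i < kRegMax; ++i) {
        const float e = std::exp(logits[i] - max_logit);  // shifted for stability
        sum += e;
        weighted += e * static_cast<float>(i);
    }
    return weighted / sum;
}

// Decode one cell. `reg` points at that cell's 64 dequantised values,
// assumed to be ordered left, top, right, bottom (16 bins each).
Box decode_box(const float* reg, int cell_x, int cell_y, int stride) {
    assert(reg != nullptr && stride > 0);
    const float cx = cell_x + 0.5f;  // cell centre in grid units
    const float cy = cell_y + 0.5f;
    const float left   = dfl_expectation(reg + 0 * kRegMax);
    const float top    = dfl_expectation(reg + 1 * kRegMax);
    const float right  = dfl_expectation(reg + 2 * kRegMax);
    const float bottom = dfl_expectation(reg + 3 * kRegMax);
    const float s = static_cast<float>(stride);
    return {(cx - left) * s, (cy - top) * s, (cx + right) * s, (cy + bottom) * s};
}
```

In a full postprocess you would run this only for cells whose best class score passes a confidence threshold, then apply non-maximum suppression across all three heads.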

Here is how masks are decoded: ultralytics/ultralytics/utils/ops.py (ultralytics/ultralytics on GitHub, main branch)
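The core of that mask decoding can be sketched as follows: for one detection that survived NMS, take its 32 mask coefficients, compute a weighted sum of the 32 prototype channels at every proto pixel, keep only pixels inside the detection's box (scaled from 640 to 160), and mark a pixel as part of the instance when the sum is positive (equivalent to sigmoid > 0.5). This is a sketch under assumptions: the proto buffer is taken to be dequantised floats laid out as height x width x channel, and `instance_mask` and `Box` are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kProtoSize = 160;     // proto tensor is 160x160
constexpr int kProtoChannels = 32;  // 32 prototype masks

struct Box { float x1, y1, x2, y2; };  // corners in input-image pixels

// Returns a 160x160 binary mask: 255 inside the instance, 0 elsewhere.
// `proto` is assumed to be dequantised and laid out as [y][x][channel].
std::vector<std::uint8_t> instance_mask(const std::vector<float>& proto,
                                        const std::vector<float>& coeffs,
                                        const Box& box,
                                        int input_size = 640) {
    assert(proto.size() ==
           static_cast<std::size_t>(kProtoSize) * kProtoSize * kProtoChannels);
    assert(coeffs.size() == static_cast<std::size_t>(kProtoChannels));

    // Scale the box from input pixels to proto pixels (160 / 640 = 0.25).
    const float scale = static_cast<float>(kProtoSize) / static_cast<float>(input_size);
    const int x0 = std::max(0, std::min(kProtoSize, static_cast<int>(std::ceil(box.x1 * scale))));
    const int y0 = std::max(0, std::min(kProtoSize, static_cast<int>(std::ceil(box.y1 * scale))));
    const int x1 = std::max(0, std::min(kProtoSize, static_cast<int>(std::ceil(box.x2 * scale))));
    const int y1 = std::max(0, std::min(kProtoSize, static_cast<int>(std::ceil(box.y2 * scale))));

    std::vector<std::uint8_t> mask(static_cast<std::size_t>(kProtoSize) * kProtoSize, 0);
    for (int y = y0; y < y1; ++y) {
        for (int x = x0; x < x1; ++x) {
            const std::size_t pixel = static_cast<std::size_t>(y) * kProtoSize + x;
            const float* p = &proto[pixel * kProtoChannels];
            float logit = 0.0f;
            for (int k = 0; k < kProtoChannels; ++k) {
                logit += coeffs[k] * p[k];
            }
            if (logit > 0.0f) {  // same decision as sigmoid(logit) > 0.5
                mask[pixel] = 255;
            }
        }
    }
    return mask;
}
```

To store each instance as an image, one option is to upsample this 160x160 mask to the original frame size and use it to copy the matching pixels out of the frame, for example with OpenCV's `cv::resize` and `cv::imwrite`.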

Alternatively, DeGirum PySDK includes a C++ postprocessor that already handles segmentation results, so you get the speed of C++ postprocessing with the ease of a Python interface. For details on compiling a segmentation model, see: Early Access to DeGirum Cloud Compiler

Hey @user319,

Welcome to the Hailo Community!

If you want to use hailo-rpi5-examples or the GStreamer elements, look at the following postprocess: hailo-apps-infra/hailo_apps/hailo_app_python/core/cpp_postprocess/cpp/yolov5seg.cpp (hailo-ai/hailo-apps-infra on GitHub, main branch)

If you want to use the native C++ API, check out the following postprocess example: Hailo-Application-Code-Examples/runtime/hailo-8/cpp/instance_segmentation/yolov8seg/yolov8seg_postprocess.cpp (hailo-ai/Hailo-Application-Code-Examples on GitHub, main branch)

Oh, thank you for your reply!
Thanks to your explanation, I now understand the output structure of the YOLO segmentation model! I will work through it based on that understanding.

Thank you and have a nice day!
