Hello! I trained a custom YOLOv8-seg model, but when I try to use the instance_segmentation.py file I can see that the config file for yolov8_seg is missing, as is the post-process. How do I generate the config file and the post-process?
Thanks to all!
Hey @Andrew92,
Great job creating a YOLOv8 seg model! Here’s how to integrate it:
For the config file: you'll need a JSON file that matches your model's outputs.
For the post-processing: update self.default_post_process_so in instance_segmentation_pipeline.py to point to your compiled .so file.
If you need more specific guidance on configuring the JSON file or creating a custom post-process, please share more details about your model architecture and outputs.
You can run the following to get the info:
hailo profiler your_model.har
hailortcli run your_model.hef --frames-count 1 --measure-latency
hailortcli parse-hef your_model.hef
If you need any further assistance, we’re here to help!
Hello @omria and thanks for the answer. Since some of us have problems with custom segmentation models, I'll go deeper into what I did, so that anyone with the same problem can understand what to do.
I took hailo-apps-infra/resources/yolov5n_seg.json as a reference; this is its content:
{
"iou_threshold": 0.6,
"score_threshold": 0.25,
"outputs_size": [
20,
40,
80
],
"outputs_name": [
"yolov5n_seg/conv63",
"yolov5n_seg/conv48",
"yolov5n_seg/conv55",
"yolov5n_seg/conv61"
],
"anchors": [
[
116,
90,
156,
198,
373,
326
],
[
30,
61,
62,
45,
59,
119
],
[
10,
13,
16,
30,
33,
23
]
],
"input_shape": [
640,
640
],
"strides": [
32,
16,
8
]
}
Is this the correct file?
I would like to understand how to modify this file correctly based on my model's requirements, so that the approach generalizes to other models.
As you can see in another thread of mine, segmentation error yolov5_seg, I tried to train a yolov5_seg model specifically to avoid modifying the JSON, generating a postprocess.so, and modifying instance_segmentation_pipeline.py, but I still have issues.
I searched in hailo-apps-infra/cpp/yolov5seg.cpp and yolov5seg.hpp, but I cannot tell which part I should modify to make it work. It would be great if you could explain how to make it work for other models (for example yolov5-seg, which seems to work only with the Hailo yolov5_seg example model but not with custom ones).
I ran hailo profiler yolov8s_seg.hef, but it turns out the command needs the .har file, so I ran hailo profiler yolov8s_seg.har and got:
[info] Current Time: 15:36:02, 03/18/25
[info] CPU: Architecture: x86_64, Model: AMD Ryzen 9 7845HX with Radeon Graphics, Number Of Cores: 24, Utilization: 0.6%
[info] Memory: Total: 15GB, Available: 12GB
[info] System info: OS: Linux, Kernel: 5.15.167.4-microsoft-standard-WSL2
[info] Hailo DFC Version: 3.30.0
[info] HailoRT Version: 4.20.0
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo profiler yolov8s_seg.har`
[info] Running profile for yolov8s_seg in state compiled_model
[info]
Model Details
-------------------------------- -----------
Input Tensors Shapes 640x640x3
Operations per Input Tensor 39.92 GOPs
Operations per Input Tensor 20.00 GMACs
Pure Operations per Input Tensor 42.44 GOPs
Pure Operations per Input Tensor 21.25 GMACs
Model Parameters 13.24 M
-------------------------------- -----------
Profiler Input Settings
----------------- -----------------
Optimization Goal Reach Highest FPS
Profiler Mode Compiled
----------------- -----------------
Performance Summary
---------------------- ---
Number of Devices 1
Number of Contexts 2
Throughput N/A
Latency N/A
Operations per Second N/A
MACs per Second N/A
Total Input Bandwidth N/A
Total Output Bandwidth N/A
Context Switch Configs N/A
---------------------- ---
[info] Saved Profiler HTML Report to: /local/yolov8s_seg_compiled_model.html
Then I ran hailortcli run yolov8s_seg.hef --frames-count 1 --measure-latency:
Running streaming inference (yolov8s_seg.hef):
Transform data: true
Type: auto
Quantized: true
Network yolov8s_seg/yolov8s_seg: 100% | 272 | FPS: 54.32 | ETA: 00:00:00
> Inference result:
Network group: yolov8s_seg
Frames count: 272
FPS: 54.33
Send Rate: 534.05 Mbit/s
Recv Rate: 713.80 Mbit/s
Finally, I ran hailortcli parse-hef yolov8s_seg.hef:
Architecture HEF was compiled for: HAILO8
Network group name: yolov8s_seg, Multi Context - Number of contexts: 2
Network name: yolov8s_seg/yolov8s_seg
VStream infos:
Input yolov8s_seg/input_layer1 UINT8, NHWC(640x640x3)
Output yolov8s_seg/conv73 UINT8, FCR(20x20x64)
Output yolov8s_seg/conv74 UINT8, NHWC(20x20x2)
Output yolov8s_seg/conv75 UINT8, NHWC(20x20x32)
Output yolov8s_seg/conv60 UINT8, FCR(40x40x64)
Output yolov8s_seg/conv61 UINT8, NHWC(40x40x2)
Output yolov8s_seg/conv62 UINT8, FCR(40x40x32)
Output yolov8s_seg/conv44 UINT8, FCR(80x80x64)
Output yolov8s_seg/conv45 UINT8, NHWC(80x80x2)
Output yolov8s_seg/conv46 UINT8, FCR(80x80x32)
Output yolov8s_seg/conv48 UINT8, FCR(160x160x32)
Is it enough to change the post-process file, or should I compile a new one?
That's all. Could you also help me understand how to modify everything to use it with my custom models (including models other than YOLO that I may be interested in)?
Thanks A LOT!
Hey @Andrew92,
Ok, let’s do this step by step:
- outputs_name: replace the layer names with the actual output names from your YOLOv8 model. Based on your hailortcli parse-hef yolov8s_seg.hef output, the correct layer names are:
"outputs_name": [
  "yolov8s_seg/conv73",
  "yolov8s_seg/conv60",
  "yolov8s_seg/conv44"
]
- outputs_size: match it to your model's output feature-map sizes, e.g. [20, 40, 80] (if your model outputs at those scales).
- strides: confirm that the strides [32, 16, 8] are correct for your YOLOv8 model.
- anchors: YOLOv8 segmentation does not use explicit anchors.
You can create a new file or update the current one. If you create a new one, make sure to add the changes to instance_segmentation_pipeline.py so it uses the new .so file.
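Putting those points together, here is a sketch of what the JSON could look like for your model (untested; the layer names are taken from your parse-hef output, and the thresholds are kept at the defaults from yolov5n_seg.json):

```json
{
  "iou_threshold": 0.6,
  "score_threshold": 0.25,
  "outputs_size": [20, 40, 80],
  "outputs_name": [
    "yolov8s_seg/conv73",
    "yolov8s_seg/conv60",
    "yolov8s_seg/conv44"
  ],
  "input_shape": [640, 640],
  "strides": [32, 16, 8]
}
```

Note that yolov5n_seg.json lists four output names, with the mask prototype tensor included; depending on how the post-process reads the config, you may also need to add your proto layer (conv48, 160x160x32 in your parse-hef output) to outputs_name.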
For the C++ post-process, update yolov5seg.hpp to match your model's layer names:
- bounding-box outputs: conv73, conv60, conv44
- class outputs: conv74, conv61, conv45
- mask outputs: conv75, conv62, conv46
Then modify parse_detections() in yolov5seg.cpp to properly extract bounding boxes and mask data from the correct indices, and adjust the resize_masks() function to handle your model's different mask resolutions (20x20, 40x40, 80x80). Use cv::INTER_CUBIC for sharper masks if needed.
I'll now provide some examples for the above steps (these haven't been tested but are based on the inputs and outputs you provided):
// yolov8seg.cpp
#include "yolov8seg.hpp"
#include "xtensor/xpad.hpp"
#include "xtensor/xsort.hpp"
#include "hailo_common.hpp"
#include "common/tensors.hpp"
#include "common/nms.hpp"
#include "common/labels/coco_eighty.hpp"
#include "mask_decoding.hpp"
#include <cmath>
#include <cstring> // for memcpy
#include <thread>
#include <future>
#define MASK_CO 32
#define BOX_CO 4
std::vector<HailoDetection> decode_yolov8_output(
xt::xarray<uint16_t> &output,
float qp_zp,
float qp_scale,
float score_threshold,
int input_width,
int input_height)
{
int num_classes = (output.shape()[2] - BOX_CO - 1 - MASK_CO);
auto reshaped = xt::reshape_view(output, {output.shape()[0] * output.shape()[1], BOX_CO + 1 + num_classes + MASK_CO});
auto objectness = xt::view(reshaped, xt::all(), xt::range(4, 5));
auto scores = xt::view(reshaped, xt::all(), xt::range(5, 5 + num_classes));
// Filter detections
std::vector<uint> indices;
std::vector<float> confidences;
std::vector<uint> classes;
for (size_t i = 0; i < objectness.shape()[0]; ++i) {
auto class_scores = xt::view(scores, i, xt::all());
auto max_index = xt::argmax(class_scores)(0);
float class_score = sigmoid(dequant(class_scores(max_index), qp_zp, qp_scale));
float obj_score = sigmoid(dequant(objectness(i, 0), qp_zp, qp_scale));
float confidence = class_score * obj_score;
if (confidence > score_threshold) {
indices.push_back(i);
confidences.push_back(confidence);
classes.push_back(max_index);
}
}
std::vector<HailoDetection> detections;
for (size_t i = 0; i < indices.size(); ++i) {
auto row = indices[i];
float x = sigmoid(dequant(reshaped(row, 0), qp_zp, qp_scale)) * input_width;
float y = sigmoid(dequant(reshaped(row, 1), qp_zp, qp_scale)) * input_height;
float w = std::exp(dequant(reshaped(row, 2), qp_zp, qp_scale)) * input_width;
float h = std::exp(dequant(reshaped(row, 3), qp_zp, qp_scale)) * input_height;
HailoBBox bbox(x - w / 2, y - h / 2, w, h);
std::string label = common::coco_eighty[classes[i]];
float confidence = confidences[i];
xt::xarray<float> mask_coeffs = xt::view(reshaped, row, xt::range(5 + num_classes, 5 + num_classes + MASK_CO));
std::vector<float> data(mask_coeffs.size());
memcpy(data.data(), mask_coeffs.data(), sizeof(float) * data.size());
HailoDetection det(bbox, classes[i], label, confidence);
det.add_object(std::make_shared<HailoMatrix>(data, data.size(), 1));
detections.push_back(det);
}
return detections;
}
std::vector<HailoDetection> yolov8seg_post(std::map<std::string, HailoTensorPtr> &tensors,
const std::vector<std::string> &outputs_name,
float score_threshold,
float iou_threshold,
int input_width,
int input_height)
{
auto proto_tensor = common::dequantize(common::get_xtensor(tensors[outputs_name[0]]),
tensors[outputs_name[0]]->vstream_info().quant_info.qp_scale,
tensors[outputs_name[0]]->vstream_info().quant_info.qp_zp);
std::vector<HailoDetection> all_detections;
for (size_t i = 1; i < outputs_name.size(); ++i) {
auto tensor = tensors[outputs_name[i]];
auto output = common::get_xtensor_uint16(tensor);
float qp_zp = tensor->vstream_info().quant_info.qp_zp;
float qp_scale = tensor->vstream_info().quant_info.qp_scale;
auto dets = decode_yolov8_output(output, qp_zp, qp_scale, score_threshold, input_width, input_height);
all_detections.insert(all_detections.end(), dets.begin(), dets.end());
}
common::nms(all_detections, iou_threshold);
decode_masks(all_detections, proto_tensor);
return all_detections;
}
void yolov8seg(HailoROIPtr roi, void *params_void_ptr)
{
std::map<std::string, HailoTensorPtr> tensors = roi->get_tensors_by_name();
const float iou_threshold = 0.6f;
const float score_threshold = 0.25f;
const int input_width = 640;
const int input_height = 640;
std::vector<std::string> outputs_name = {
"yolov8s_seg/conv75",
"yolov8s_seg/conv73",
"yolov8s_seg/conv60",
"yolov8s_seg/conv44"
};
auto detections = yolov8seg_post(tensors, outputs_name, score_threshold, iou_threshold, input_width, input_height);
hailo_common::add_detections(roi, detections);
}
void filter(HailoROIPtr roi, void *params_void_ptr)
{
yolov8seg(roi, params_void_ptr);
}
void filter_letterbox(HailoROIPtr roi, void *params_void_ptr)
{
filter(roi, params_void_ptr);
HailoBBox roi_bbox = hailo_common::create_flattened_bbox(roi->get_bbox(), roi->get_scaling_bbox());
auto detections = hailo_common::get_hailo_detections(roi);
for (auto &detection : detections)
{
auto bbox = detection->get_bbox();
float xmin = bbox.xmin() * roi_bbox.width() + roi_bbox.xmin();
float ymin = bbox.ymin() * roi_bbox.height() + roi_bbox.ymin();
float xmax = bbox.xmax() * roi_bbox.width() + roi_bbox.xmin();
float ymax = bbox.ymax() * roi_bbox.height() + roi_bbox.ymin();
detection->set_bbox(HailoBBox(xmin, ymin, xmax - xmin, ymax - ymin));
}
roi->clear_scaling_bbox();
}
Let us know if you need any clarification or additional help with implementation!
I'm still trying to set everything up, but there's no parse_detections() or resize_masks() in yolov5seg.cpp, so I'm stuck here.
Hey @Andrew92 ,
I've updated the code above as a YOLOv8 seg post-process. Please try it out and adapt it to your model's needs.
Hello @omria, I tried your update but it still doesn't work; I get "Segmentation fault":
(venv_hailo_rpi5_examples) onesight@onesight:~/hailo-rpi5-examples $ python basic_pipelines/instance_segmentation.py --input resources/example.mp4 --arch hailo8 --hef-path /home/onesight/Desktop/New/yolov8s_seg.hef
filesrc location="resources/example.mp4" name=source ! queue name=source_queue_decode leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! decodebin name=source_decodebin ! queue name=source_scale_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoscale name=source_videoscale n-threads=2 ! queue name=source_convert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoconvert n-threads=3 name=source_convert qos=false ! video/x-raw, pixel-aspect-ratio=1/1, format=RGB, width=640, height=640 ! queue name=inference_wrapper_input_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hailocropper name=inference_wrapper_crop so-path=/usr/lib/aarch64-linux-gnu/hailo/tappas/post_processes/cropping_algorithms/libwhole_buffer.so function-name=create_crops use-letterbox=true resize-method=inter-area internal-offset=true hailoaggregator name=inference_wrapper_agg inference_wrapper_crop. ! queue name=inference_wrapper_bypass_q leaky=no max-size-buffers=20 max-size-bytes=0 max-size-time=0 ! inference_wrapper_agg.sink_0 inference_wrapper_crop. ! queue name=inference_scale_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoscale name=inference_videoscale n-threads=2 qos=false ! queue name=inference_convert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! video/x-raw, pixel-aspect-ratio=1/1 ! videoconvert name=inference_videoconvert n-threads=2 ! queue name=inference_hailonet_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hailonet name=inference_hailonet hef-path=/home/onesight/Desktop/New/yolov8s_seg.hef batch-size=2 vdevice-group-id=1 force-writable=true ! queue name=inference_hailofilter_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! 
hailofilter name=inference_hailofilter so-path=/home/onesight/hailo-rpi5-examples/venv_hailo_rpi5_examples/lib/python3.11/site-packages/hailo-apps-infra/resources/libyolov8seg.so config-path=/home/onesight/Desktop/New/yolov8s_seg.json function-name=filter_letterbox qos=false ! queue name=inference_output_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! inference_wrapper_agg.sink_1 inference_wrapper_agg. ! queue name=inference_wrapper_output_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hailotracker name=hailo_tracker class-id=1 kalman-dist-thr=0.8 iou-thr=0.9 init-iou-thr=0.7 keep-new-frames=2 keep-tracked-frames=15 keep-lost-frames=2 keep-past-metadata=False qos=False ! queue name=hailo_tracker_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! queue name=identity_callback_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! identity name=identity_callback ! queue name=hailo_display_overlay_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! hailooverlay name=hailo_display_overlay ! queue name=hailo_display_videoconvert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! videoconvert name=hailo_display_videoconvert n-threads=2 qos=false ! queue name=hailo_display_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0 ! fpsdisplaysink name=hailo_display video-sink=autovideosink sync=true text-overlay=False signal-fps-measurements=true
Segmentation fault
I also had to make some updates to your code, because build_postprocess.sh reported some errors:
inline float dequant(uint16_t num, float qp_zp, float qp_scale) { return (float(num) - qp_zp) * qp_scale;}
#include <array>
// ...
auto reshaped = xt::reshape_view(output, std::array<size_t, 2>{
static_cast<size_t>(output.shape()[0] * output.shape()[1]),
static_cast<size_t>(BOX_CO + 1 + num_classes + MASK_CO)
});
That said, I also created a yolov8seg.hpp file as follows:
#pragma once
#include "hailo_objects.hpp"
#include "xtensor/xarray.hpp"
#include "xtensor/xio.hpp"
#define YOLOV8_SEG_OUTPUT_BBOX "yolov8s_seg/conv73"
#define YOLOV8_SEG_OUTPUT_CLS "yolov8s_seg/conv74"
#define YOLOV8_SEG_OUTPUT_MASK "yolov8s_seg/conv75"
#define YOLOV8_SEG_OUTPUT_BBOX_MID "yolov8s_seg/conv60"
#define YOLOV8_SEG_OUTPUT_CLS_MID "yolov8s_seg/conv61"
#define YOLOV8_SEG_OUTPUT_MASK_MID "yolov8s_seg/conv62"
#define YOLOV8_SEG_OUTPUT_BBOX_SMALL "yolov8s_seg/conv44"
#define YOLOV8_SEG_OUTPUT_CLS_SMALL "yolov8s_seg/conv45"
#define YOLOV8_SEG_OUTPUT_MASK_SMALL "yolov8s_seg/conv46"
__BEGIN_DECLS
class Yolov8segParams
{
public:
float iou_threshold;
float score_threshold;
int num_anchors;
std::vector<int> outputs_size;
std::vector<std::string> outputs_name;
std::vector<xt::xarray<float>> anchors;
std::vector<int> input_shape;
std::vector<int> strides;
std::vector<xt::xarray<float>> grids;
std::vector<xt::xarray<float>> anchor_grids;
Yolov8segParams() {
iou_threshold = 0.6;
score_threshold = 0.25;
outputs_size = {20, 40, 80};
outputs_name = {"YOLOV8_SEG_OUTPUT_BBOX", "YOLOV8_SEG_OUTPUT_CLS", "YOLOV8_SEG_OUTPUT_MASK", "YOLOV8_SEG_OUTPUT_BBOX_MID", "YOLOV8_SEG_OUTPUT_CLS_MID", "YOLOV8_SEG_OUTPUT_MASK_MID","YOLOV8_SEG_OUTPUT_BBOX_SMALL","YOLOV8_SEG_OUTPUT_CLS_SMALL","YOLOV8_SEG_OUTPUT_MASK_SMALL"};
anchors = {{116, 90, 156, 198, 373, 326},
{30, 61, 62, 45, 59, 119},
{10, 13, 16, 30, 33, 23} };
input_shape = {640,640};
strides = {32, 16, 8};
}
};
Yolov8segParams *init(const std::string config_path, const std::string function_name);
void yolov8seg(HailoROIPtr roi, void *params_void_ptr);
void free_resources(void *params_void_ptr);
void filter(HailoROIPtr roi, void *params_void_ptr);
void filter_letterbox(HailoROIPtr roi, void *params_void_ptr);
__END_DECLS
Is that correct?
Hi @Andrew92
At DeGirum, we have integrated a postprocessor for YOLOv8 segmentation models. See User Guide 4: Simplifying Instance Segmentation on a Hailo Device Using DeGirum PySDK for details. Your current approach is unlikely to work for YOLOv8 segmentation, as YOLOv8 does not use anchors.
Hi @shashi, thanks for your reply! I tried DeGirum but I did not understand how to use it.
Maybe I can give it a second try, but the most important thing to know is whether DeGirum supports custom models. Does it?
Thanks in advance; it is very important to me to get this problem resolved!
Hi @Andrew92
Yes, our PySDK supports custom models. PySDK is a wrapper over HailoRT that makes application development simpler, so any model that can run on a Hailo device can run through PySDK. Please let me know what you tried and what did not work so that we can help.
Wonderful! I don't remember why it didn't work; I was probably frustrated and did something stupid.
In particular, I tried to install DeGirum on a Raspberry Pi 5. Is the Raspberry Pi supported?
Yes, the Raspberry Pi is supported. In fact, that is the most common use case for PySDK + Hailo. We understand that it can sometimes be frustrating to get started with a new package, so we tried to provide detailed instructions: DeGirum/hailo_examples: DeGirum PySDK with Hailo AI Accelerators
Please reach out if you encounter any troubles.
OK, now I'm pretty sure that the only cause of the Segmentation Fault is the config.json file.
I have doubts about this part:
"outputs_size": [20, 40, 80],
"outputs_name": [
"yolov8s_seg/conv73",
"yolov8s_seg/conv60",
"yolov8s_seg/conv44"
],
which seems to be the cause of the Segmentation Fault.
@omria, any help writing a correct one?