Yolov8 seg custom train: config.json and postprocess file missing

Hello! I trained a custom YOLOv8-seg model, but when I try to use the instance_segmentation.py file I can see that the config file for yolov8_seg is missing, as is the postprocess. How do I generate the config file and postprocess?

Thanks to all!

Hey @Andrew92,

Great job creating a YOLOv8 seg model! Here’s how to integrate it:

For the config file:

  1. Use resources/yolov5n_seg.json as your starting template
  2. Update the parameters based on your model requirements
  3. Modify instance_segmentation_pipeline.py to point to your new config file

For the post-processing:

  1. Check the existing postprocess code in the cpp directory (in your virtual env under hailo-apps-infra)
  2. Either use the existing code (maybe just rename it for YOLOv8) or write a new one (a skeleton follows this list)
  3. If you create a new one, use ./compile_postprocess to build it as .so and move it to resources
  4. Update self.default_post_process_so in instance_segmentation_pipeline.py to point to your compiled .so file
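
If you write a new one, the entry point that the hailofilter element loads from the .so is just a C-linkage function that receives the ROI. Here is a bare-bones, untested skeleton of that structure (the decoding itself is up to you):

// Bare-bones skeleton of a hailofilter postprocess entry point (untested sketch).
#include <map>
#include <string>
#include <vector>

#include "hailo_objects.hpp"
#include "hailo_common.hpp"

// hailofilter looks this symbol up by name (the "function-name" property),
// so it needs C linkage.
extern "C" void filter(HailoROIPtr roi, void *params_void_ptr)
{
    // raw output tensors attached to the ROI by hailonet, keyed by layer name
    std::map<std::string, HailoTensorPtr> tensors = roi->get_tensors_by_name();

    std::vector<HailoDetection> detections;
    // ... decode boxes / class scores / mask coefficients from `tensors` here ...

    // attach results so downstream elements (hailooverlay, hailotracker) can use them
    hailo_common::add_detections(roi, detections);
}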

If you need more specific guidance on configuring the JSON file or creating a custom post-process, please share more details about your model architecture and outputs.

You can run the following commands to get the info:

hailo profiler your_model.har
hailortcli run your_model.hef --frames-count 1 --measure-latency
hailortcli parse-hef your_model.hef

If you need any further assistance, we’re here to help!

Hello @omria, and thanks for the answer. Since some of us have problems with custom segmentation models, I'll go into more detail about what I need, so anyone with the same problem can understand what to do.

config file

I took hailo-apps-infra/resources/yolov5n_seg.json and this is its content:

{
    "iou_threshold": 0.6,
    "score_threshold": 0.25,
    "outputs_size": [
        20,
        40,
        80
    ],
    "outputs_name": [
        "yolov5n_seg/conv63",
        "yolov5n_seg/conv48",
        "yolov5n_seg/conv55",
        "yolov5n_seg/conv61"
    ],
    "anchors": [
        [
            116,
            90,
            156,
            198,
            373,
            326
        ],
        [
            30,
            61,
            62,
            45,
            59,
            119
        ],
        [
            10,
            13,
            16,
            30,
            33,
            23
        ]
    ],
    "input_shape": [
        640,
        640
    ],
    "strides": [
        32,
        16,
        8
    ]
}

Is this the correct file?
I would like to understand how to modify this file correctly based on my model requirements, so I can generalize the procedure to other models.

As you can see in another thread of mine, segmentation error yolov5_seg, I also tried training a yolov5_seg model to avoid having to modify the JSON, generate a postprocess .so, and modify instance_segmentation_pipeline.py, but I still have issues.

post-processing

I looked at hailo-apps-infra/cpp/yolov5seg.cpp and yolov5seg.hpp, but I cannot tell which part I should modify to make it work. It would be great if you could explain how to adapt it to other models (for example yolov5-seg, which seems to work only with Hailo's yolov5_seg example model, not with custom ones).

model info

I ran hailo profiler yolov8s_seg.hef, but it turns out the command needs the HAR file, so I ran hailo profiler yolov8s_seg.har and got:

[info] Current Time: 15:36:02, 03/18/25
[info] CPU: Architecture: x86_64, Model: AMD Ryzen 9 7845HX with Radeon Graphics, Number Of Cores: 24, Utilization: 0.6%
[info] Memory: Total: 15GB, Available: 12GB
[info] System info: OS: Linux, Kernel: 5.15.167.4-microsoft-standard-WSL2
[info] Hailo DFC Version: 3.30.0
[info] HailoRT Version: 4.20.0
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo profiler yolov8s_seg.har`
[info] Running profile for yolov8s_seg in state compiled_model
[info]
Model Details
--------------------------------  -----------
Input Tensors Shapes              640x640x3
Operations per Input Tensor       39.92 GOPs
Operations per Input Tensor       20.00 GMACs
Pure Operations per Input Tensor  42.44 GOPs
Pure Operations per Input Tensor  21.25 GMACs
Model Parameters                  13.24 M
--------------------------------  -----------

Profiler Input Settings
-----------------  -----------------
Optimization Goal  Reach Highest FPS
Profiler Mode      Compiled
-----------------  -----------------

Performance Summary
----------------------  ---
Number of Devices       1
Number of Contexts      2
Throughput              N/A
Latency                 N/A
Operations per Second   N/A
MACs per Second         N/A
Total Input Bandwidth   N/A
Total Output Bandwidth  N/A
Context Switch Configs  N/A
----------------------  ---
[info] Saved Profiler HTML Report to: /local/yolov8s_seg_compiled_model.html

Then I ran hailortcli run yolov8s_seg.hef --frames-count 1 --measure-latency:

Running streaming inference (yolov8s_seg.hef):
  Transform data: true
    Type:      auto
    Quantized: true
Network yolov8s_seg/yolov8s_seg: 100% | 272 | FPS: 54.32 | ETA: 00:00:00
> Inference result:
 Network group: yolov8s_seg
    Frames count: 272
    FPS: 54.33
    Send Rate: 534.05 Mbit/s
    Recv Rate: 713.80 Mbit/s

And finally hailortcli parse-hef yolov8s_seg.hef:

Architecture HEF was compiled for: HAILO8
Network group name: yolov8s_seg, Multi Context - Number of contexts: 2
    Network name: yolov8s_seg/yolov8s_seg
        VStream infos:
            Input  yolov8s_seg/input_layer1 UINT8, NHWC(640x640x3)
            Output yolov8s_seg/conv73 UINT8, FCR(20x20x64)
            Output yolov8s_seg/conv74 UINT8, NHWC(20x20x2)
            Output yolov8s_seg/conv75 UINT8, NHWC(20x20x32)
            Output yolov8s_seg/conv60 UINT8, FCR(40x40x64)
            Output yolov8s_seg/conv61 UINT8, NHWC(40x40x2)
            Output yolov8s_seg/conv62 UINT8, FCR(40x40x32)
            Output yolov8s_seg/conv44 UINT8, FCR(80x80x64)
            Output yolov8s_seg/conv45 UINT8, NHWC(80x80x2)
            Output yolov8s_seg/conv46 UINT8, FCR(80x80x32)
            Output yolov8s_seg/conv48 UINT8, FCR(160x160x32)

Postprocess

Is it enough to modify the existing postprocess file, or should I compile a new one?

That's all. Could you also help me understand how to modify everything to use it with my custom models (including models other than YOLO that I might be interested in)?
Thanks A LOT!

Hey @Andrew92,

Ok, let’s do this step by step:

1. Configuration File (JSON)

  • Update outputs_name: Replace the layer names with the actual output names from your YOLOv8 model.
  • From your hailortcli parse-hef yolov8s_seg.hef output, the correct layer names are:
"outputs_name": [
    "yolov8s_seg/conv73",
    "yolov8s_seg/conv60",
    "yolov8s_seg/conv44"
]
  • Update outputs_size: Match it with your model’s output feature map sizes.
    Based on your model, keep [20, 40, 80] (if your model outputs at those scales).
  • Update strides: Confirm that the strides [32, 16, 8] are correct for your YOLOv8 model.
  • Remove anchors: YOLOv8 segmentation does not use explicit anchors.
    It should look something like this:
    {
        "iou_threshold": 0.6,
        "score_threshold": 0.25,
        "outputs_size": [20, 40, 80],
        "outputs_name": [
            "yolov8s_seg/conv73",
            "yolov8s_seg/conv60",
            "yolov8s_seg/conv44"
        ],
        "input_shape": [640, 640],
        "strides": [32, 16, 8]
    }
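
For reference, here is a minimal, untested sketch of how such a config could be loaded inside the postprocess init(); Yolov8SegConfig and load_config() are hypothetical names, and nlohmann/json is used purely for illustration (the actual code in the repo may rely on a different JSON parser):

// Untested sketch: read the JSON config above into a plain struct.
#include <fstream>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

struct Yolov8SegConfig {
    float iou_threshold;
    float score_threshold;
    std::vector<int> outputs_size;
    std::vector<std::string> outputs_name;
    std::vector<int> input_shape;
    std::vector<int> strides;
};

inline Yolov8SegConfig load_config(const std::string &config_path)
{
    std::ifstream file(config_path);
    nlohmann::json j = nlohmann::json::parse(file);

    Yolov8SegConfig cfg;
    cfg.iou_threshold   = j.at("iou_threshold").get<float>();
    cfg.score_threshold = j.at("score_threshold").get<float>();
    cfg.outputs_size    = j.at("outputs_size").get<std::vector<int>>();
    cfg.outputs_name    = j.at("outputs_name").get<std::vector<std::string>>();
    cfg.input_shape     = j.at("input_shape").get<std::vector<int>>();
    cfg.strides         = j.at("strides").get<std::vector<int>>();
    // no "anchors" field here -- YOLOv8 segmentation is anchor-free
    return cfg;
}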

2. Post-processing

You can create a new file or update the current one. If you create a new one, make sure to update instance_segmentation_pipeline.py so it uses the new .so file.

  1. Match Output Names: Update the output tensor names in yolov5seg.hpp to match your model’s layer names:
    • Bounding boxes: conv73, conv60, conv44
    • Class scores: conv74, conv61, conv45
    • Segmentation masks: conv75, conv62, conv46
  2. Adjust Tensor Order: Modify parse_detections() in yolov5seg.cpp to properly extract bounding boxes and mask data from the correct indices.
  3. Update Mask Processing: Revise the resize_masks() function to handle your model’s different mask resolutions (20x20, 40x40, 80x80). Use cv::INTER_CUBIC for sharper masks if needed (a rough sketch follows this list).
  4. After you finish, run compile_postprocess.sh, which will create the .so file (add the name of the new file if you created one).
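
As a rough, untested sketch of what the mask-upscaling part of step 3 could look like (upscale_mask() is just a hypothetical helper, not something taken from the repo):

// Hypothetical helper for upscaling a single decoded mask (untested sketch).
#include <opencv2/imgproc.hpp>

inline cv::Mat upscale_mask(const cv::Mat &mask, int target_width, int target_height)
{
    cv::Mat resized;
    // cv::INTER_CUBIC gives sharper mask edges than cv::INTER_LINEAR,
    // at the cost of a bit more compute
    cv::resize(mask, resized, cv::Size(target_width, target_height), 0, 0, cv::INTER_CUBIC);
    return resized;
}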

I’ll now provide some examples for the above steps (these haven’t been tested but are based on the inputs and outputs you provided):

// yolov8seg.cpp
#include "yolov8seg.hpp"
#include "xtensor/xpad.hpp"
#include "xtensor/xsort.hpp"
#include "hailo_common.hpp"
#include "common/tensors.hpp"
#include "common/nms.hpp"
#include "common/labels/coco_eighty.hpp"
#include "mask_decoding.hpp"

#include <cmath>
#include <thread>
#include <future>

#define MASK_CO 32
#define BOX_CO 4

std::vector<HailoDetection> decode_yolov8_output(
    xt::xarray<uint16_t> &output,
    float qp_zp,
    float qp_scale,
    float score_threshold,
    int input_width,
    int input_height)
{
    int num_classes = (output.shape()[2] - BOX_CO - 1 - MASK_CO);
    auto reshaped = xt::reshape_view(output, {output.shape()[0] * output.shape()[1], BOX_CO + 1 + num_classes + MASK_CO});
    auto objectness = xt::view(reshaped, xt::all(), xt::range(4, 5));
    auto scores = xt::view(reshaped, xt::all(), xt::range(5, 5 + num_classes));

    // Filter detections
    std::vector<uint> indices;
    std::vector<float> confidences;
    std::vector<uint> classes;

    for (size_t i = 0; i < objectness.shape()[0]; ++i) {
        auto class_scores = xt::view(scores, i, xt::all());
        auto max_index = xt::argmax(class_scores)(0);
        float class_score = sigmoid(dequant(class_scores(max_index), qp_zp, qp_scale));
        float obj_score = sigmoid(dequant(objectness(i, 0), qp_zp, qp_scale));
        float confidence = class_score * obj_score;
        if (confidence > score_threshold) {
            indices.push_back(i);
            confidences.push_back(confidence);
            classes.push_back(max_index);
        }
    }

    std::vector<HailoDetection> detections;
    for (size_t i = 0; i < indices.size(); ++i) {
        auto row = indices[i];
        float x = sigmoid(dequant(reshaped(row, 0), qp_zp, qp_scale)) * input_width;
        float y = sigmoid(dequant(reshaped(row, 1), qp_zp, qp_scale)) * input_height;
        float w = std::exp(dequant(reshaped(row, 2), qp_zp, qp_scale)) * input_width;
        float h = std::exp(dequant(reshaped(row, 3), qp_zp, qp_scale)) * input_height;

        HailoBBox bbox(x - w / 2, y - h / 2, w, h);
        std::string label = common::coco_eighty[classes[i]];
        float confidence = confidences[i];

        xt::xarray<float> mask_coeffs = xt::view(reshaped, row, xt::range(5 + num_classes, _));
        std::vector<float> data(mask_coeffs.size());
        memcpy(data.data(), mask_coeffs.data(), sizeof(float) * data.size());

        HailoDetection det(bbox, classes[i], label, confidence);
        det.add_object(std::make_shared<HailoMatrix>(data, data.size(), 1));
        detections.push_back(det);
    }

    return detections;
}

std::vector<HailoDetection> yolov8seg_post(std::map<std::string, HailoTensorPtr> &tensors,
                                           const std::vector<std::string> &outputs_name,
                                           float score_threshold,
                                           float iou_threshold,
                                           int input_width,
                                           int input_height)
{
    auto proto_tensor = common::dequantize(common::get_xtensor(tensors[outputs_name[0]]),
                                           tensors[outputs_name[0]]->vstream_info().quant_info.qp_scale,
                                           tensors[outputs_name[0]]->vstream_info().quant_info.qp_zp);

    std::vector<HailoDetection> all_detections;

    for (size_t i = 1; i < outputs_name.size(); ++i) {
        auto tensor = tensors[outputs_name[i]];
        auto output = common::get_xtensor_uint16(tensor);
        float qp_zp = tensor->vstream_info().quant_info.qp_zp;
        float qp_scale = tensor->vstream_info().quant_info.qp_scale;

        auto dets = decode_yolov8_output(output, qp_zp, qp_scale, score_threshold, input_width, input_height);
        all_detections.insert(all_detections.end(), dets.begin(), dets.end());
    }

    common::nms(all_detections, iou_threshold);
    decode_masks(all_detections, proto_tensor);
    return all_detections;
}

void yolov8seg(HailoROIPtr roi, void *params_void_ptr)
{
    std::map<std::string, HailoTensorPtr> tensors = roi->get_tensors_by_name();

    const float iou_threshold = 0.6f;
    const float score_threshold = 0.25f;
    const int input_width = 640;
    const int input_height = 640;

    std::vector<std::string> outputs_name = {
        "yolov8s_seg/conv75",
        "yolov8s_seg/conv73",
        "yolov8s_seg/conv60",
        "yolov8s_seg/conv44"
    };

    auto detections = yolov8seg_post(tensors, outputs_name, score_threshold, iou_threshold, input_width, input_height);
    hailo_common::add_detections(roi, detections);
}

void filter(HailoROIPtr roi, void *params_void_ptr)
{
    yolov8seg(roi, params_void_ptr);
}

void filter_letterbox(HailoROIPtr roi, void *params_void_ptr)
{
    filter(roi, params_void_ptr);
    HailoBBox roi_bbox = hailo_common::create_flattened_bbox(roi->get_bbox(), roi->get_scaling_bbox());
    auto detections = hailo_common::get_hailo_detections(roi);
    for (auto &detection : detections)
    {
        auto bbox = detection->get_bbox();
        float xmin = bbox.xmin() * roi_bbox.width() + roi_bbox.xmin();
        float ymin = bbox.ymin() * roi_bbox.height() + roi_bbox.ymin();
        float xmax = bbox.xmax() * roi_bbox.width() + roi_bbox.xmin();
        float ymax = bbox.ymax() * roi_bbox.height() + roi_bbox.ymin();
        detection->set_bbox(HailoBBox(xmin, ymin, xmax - xmin, ymax - ymin));
    }
    roi->clear_scaling_bbox();
}

Let us know if you need any clarification or additional help with implementation!

Still trying to set everything up, but there is no parse_detections() or resize_masks() in yolov5seg.cpp, so I'm stuck here.

Hey @Andrew92,

I've updated the code above for YOLOv8 seg; please try it out and adjust it to your model's needs.

Hello @omria, I tried your update but it still doesn't work; I get a “Segmentation fault”:

(venv_hailo_rpi5_examples) onesight@onesight:~/hailo-rpi5-examples $ python basic_pipelines/instance_segmentation.py --input resources/example.mp4 --arch hailo8 --hef-path /home/onesight/Desktop/New/yolov8s_seg.hef 
filesrc location="resources/example.mp4" name=source ! queue name=source_queue_decode leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! decodebin name=source_decodebin !  queue name=source_scale_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! videoscale name=source_videoscale n-threads=2 ! queue name=source_convert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! videoconvert n-threads=3 name=source_convert qos=false ! video/x-raw, pixel-aspect-ratio=1/1, format=RGB, width=640, height=640  ! queue name=inference_wrapper_input_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! hailocropper name=inference_wrapper_crop so-path=/usr/lib/aarch64-linux-gnu/hailo/tappas/post_processes/cropping_algorithms/libwhole_buffer.so function-name=create_crops use-letterbox=true resize-method=inter-area internal-offset=true hailoaggregator name=inference_wrapper_agg inference_wrapper_crop. ! queue name=inference_wrapper_bypass_q leaky=no max-size-buffers=20 max-size-bytes=0 max-size-time=0  ! inference_wrapper_agg.sink_0 inference_wrapper_crop. ! queue name=inference_scale_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! videoscale name=inference_videoscale n-threads=2 qos=false ! queue name=inference_convert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! video/x-raw, pixel-aspect-ratio=1/1 ! videoconvert name=inference_videoconvert n-threads=2 ! queue name=inference_hailonet_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! hailonet name=inference_hailonet hef-path=/home/onesight/Desktop/New/yolov8s_seg.hef batch-size=2  vdevice-group-id=1  force-writable=true  ! queue name=inference_hailofilter_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! hailofilter name=inference_hailofilter so-path=/home/onesight/hailo-rpi5-examples/venv_hailo_rpi5_examples/lib/python3.11/site-packages/hailo-apps-infra/resources/libyolov8seg.so  config-path=/home/onesight/Desktop/New/yolov8s_seg.json   function-name=filter_letterbox  qos=false ! queue name=inference_output_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0   ! inference_wrapper_agg.sink_1 inference_wrapper_agg. ! queue name=inference_wrapper_output_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0   ! hailotracker name=hailo_tracker class-id=1 kalman-dist-thr=0.8 iou-thr=0.9 init-iou-thr=0.7 keep-new-frames=2 keep-tracked-frames=15 keep-lost-frames=2 keep-past-metadata=False qos=False ! queue name=hailo_tracker_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0   ! queue name=identity_callback_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! identity name=identity_callback  ! queue name=hailo_display_overlay_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! hailooverlay name=hailo_display_overlay  ! queue name=hailo_display_videoconvert_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! videoconvert name=hailo_display_videoconvert n-threads=2 qos=false ! queue name=hailo_display_q leaky=no max-size-buffers=3 max-size-bytes=0 max-size-time=0  ! fpsdisplaysink name=hailo_display video-sink=autovideosink sync=true text-overlay=False signal-fps-measurements=true 
Segmentation fault

I also had to make some updates to your code because build_postprocess.sh reported some errors:

  • Missing dequant function; I added this:
inline float dequant(uint16_t num, float qp_zp, float qp_scale) { return (float(num) - qp_zp) * qp_scale;}
  • xt::reshape_view cannot deduce the type of the second argument because the specified dimensions have different types, so I changed the call to:
#include <array>

// ...

auto reshaped = xt::reshape_view(output, std::array<size_t, 2>{
    static_cast<size_t>(output.shape()[0] * output.shape()[1]),
    static_cast<size_t>(BOX_CO + 1 + num_classes + MASK_CO)
});

That said, I also created a yolov8seg.hpp file as follows:

#pragma once
#include "hailo_objects.hpp"
#include "xtensor/xarray.hpp"
#include "xtensor/xio.hpp"

#define YOLOV8_SEG_OUTPUT_BBOX "yolov8s_seg/conv73"
#define YOLOV8_SEG_OUTPUT_CLS "yolov8s_seg/conv74"
#define YOLOV8_SEG_OUTPUT_MASK "yolov8s_seg/conv75"
#define YOLOV8_SEG_OUTPUT_BBOX_MID "yolov8s_seg/conv60"
#define YOLOV8_SEG_OUTPUT_CLS_MID "yolov8s_seg/conv61"
#define YOLOV8_SEG_OUTPUT_MASK_MID "yolov8s_seg/conv62"
#define YOLOV8_SEG_OUTPUT_BBOX_SMALL "yolov8s_seg/conv44"
#define YOLOV8_SEG_OUTPUT_CLS_SMALL "yolov8s_seg/conv45"
#define YOLOV8_SEG_OUTPUT_MASK_SMALL "yolov8s_seg/conv46"

__BEGIN_DECLS
class Yolov8segParams
{
public:
    float iou_threshold;
    float score_threshold;
    int num_anchors;
    std::vector<int> outputs_size;
    std::vector<std::string> outputs_name;
    std::vector<xt::xarray<float>> anchors;
    std::vector<int> input_shape;
    std::vector<int> strides;
    std::vector<xt::xarray<float>> grids;
    std::vector<xt::xarray<float>> anchor_grids;

    Yolov8segParams() {
        iou_threshold = 0.6;
        score_threshold = 0.25;
        outputs_size = {20, 40, 80};
        outputs_name = {"YOLOV8_SEG_OUTPUT_BBOX", "YOLOV8_SEG_OUTPUT_CLS", "YOLOV8_SEG_OUTPUT_MASK",
                        "YOLOV8_SEG_OUTPUT_BBOX_MID", "YOLOV8_SEG_OUTPUT_CLS_MID", "YOLOV8_SEG_OUTPUT_MASK_MID",
                        "YOLOV8_SEG_OUTPUT_BBOX_SMALL", "YOLOV8_SEG_OUTPUT_CLS_SMALL", "YOLOV8_SEG_OUTPUT_MASK_SMALL"};
        anchors = {{116, 90, 156, 198, 373, 326},
                   {30, 61, 62, 45, 59, 119},
                   {10, 13, 16, 30, 33, 23}};
        input_shape = {640,640};
        strides = {32, 16, 8};
    }
};

Yolov8segParams *init(const std::string config_path, const std::string function_name);
void yolov8seg(HailoROIPtr roi, void *params_void_ptr);
void free_resources(void *params_void_ptr);
void filter(HailoROIPtr roi, void *params_void_ptr);
void filter_letterbox(HailoROIPtr roi, void *params_void_ptr);
__END_DECLS

Is that correct?

Hi @Andrew92
At DeGirum, we integrated a postprocessor for YOLOv8 segmentation models. See User Guide 4: Simplifying Instance Segmentation on a Hailo Device Using DeGirum PySDK for details. Your current approach is unlikely to work for YOLOv8 segmentation, as YOLOv8 does not use anchors.

Hi @shashi, thanks for your reply! I tried DeGirum but I did not understand how to use it.
Maybe I can give it a second try, but the most important thing I need to know is whether DeGirum supports custom models. Does it?

Thanks in advance; it is very important to me to get a resolution to this problem!

Hi @Andrew92
Yes, our PySDK supports custom models. PySDK is a wrapper over HailoRT that makes application development simpler, so whatever model can run on a Hailo device can also run through PySDK. Please let me know what you tried and what did not work so that we can help.

Wonderful! I don't remember why it didn't work; probably I was frustrated and did something stupid :rofl:
In particular, I tried to install DeGirum on a Raspberry Pi 5. Is the Raspberry Pi supported?

Yes, the Raspberry Pi is supported; in fact, that is the most common use case for PySDK + Hailo. We understand that it can sometimes be frustrating to get started with a new package, so we tried to provide detailed instructions: DeGirum/hailo_examples: DeGirum PySDK with Hailo AI Accelerators
Please reach out if you encounter any trouble.

OK, now I'm pretty sure that the only cause of the Segmentation fault is the config.json file.

I have doubts about this part:

"outputs_size": [20, 40, 80],
"outputs_name": [
    "yolov8s_seg/conv73",
    "yolov8s_seg/conv60",
    "yolov8s_seg/conv44"
],

which seems to be the cause of the Segmentation fault.

@omria, any help writing a correct one?