Custom Postprocess (.so) File: Keypoints Not Visible Inside Bounding Boxes

user113 · August 6, 2025, 1:51pm

Environment

Hardware: Raspberry Pi 5 + Hailo-8L
Software: HailoRT 4.19, Tappas 3.30.0

Details

I have 2 classes with different keypoints:

deo: Should show 2 keypoints (deo_bottom, deo_top)
lighter: Should show 3 keypoints (lighter_bottom, lighter_mid, lighter_top)

The .so file I am having trouble with: libyolov8pose_post.so

Problem: The custom .so file I built does not draw the keypoints correctly

Bounding boxes work perfectly - I can see both “deo” and “lighter” objects detected with correct labels
Keypoints are missing - I should see 2 keypoints inside the “deo” bbox and 3 keypoints inside the “lighter” bbox, but they don’t appear at all
Only 2 dots visible - These appear outside any bounding box. One dot is larger than the other, which makes me suspect the keypoints might be overlapping at the same location. They move in tandem with the bounding boxes

My debug output (terminal)
DEBUG: Detection bbox - xmin: 0.297505, ymin: 0.462294, width: 0.240393, height: 0.0950694

DEBUG: Landmark 0 normalized coords: (0.194664, 0.267652) confidence: 0.0161987

DEBUG: Landmark 1 normalized coords: (0.192605, 0.269344) confidence: 0.0155519

DEBUG: Adding 2 keypoints for class 0 (deo)

DEBUG: Detection bbox - xmin: 0.160458, ymin: 0.571637, width: 0.355206, height: 0.201214

DEBUG: Landmark 0 normalized coords: (0, 0.30443) confidence: 0.049372

DEBUG: Landmark 1 normalized coords: (0, 0.304152) confidence: 0.0497917

DEBUG: Landmark 2 normalized coords: (0, 0.30456) confidence: 0.0493894

DEBUG: Adding 3 keypoints for class 1 (lighter)

Frame count: 130
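A quick way to confirm what the log above suggests is to test each landmark against its detection box. The sketch below assumes both the box and the landmarks are normalized to the full frame; `NormBox`, `NormPoint` and `point_in_box` are hypothetical names for illustration and are not part of TAPPAS.

```cpp
// Hypothetical helper types for illustration; not part of the TAPPAS API.
struct NormBox
{
    float xmin, ymin, width, height; // frame-normalized, 0..1
};

struct NormPoint
{
    float x, y; // frame-normalized, 0..1
};

// Returns true when a frame-normalized point lies inside a frame-normalized box.
inline bool point_in_box(const NormPoint &p, const NormBox &b)
{
    return p.x >= b.xmin && p.x <= b.xmin + b.width &&
           p.y >= b.ymin && p.y <= b.ymin + b.height;
}
```

Fed with the first detection from the log (box at xmin 0.297505, ymin 0.462294 and landmark 0 at (0.194664, 0.267652)), the check fails, which matches the dots being drawn outside the box.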

Key Code Changes Made

1. yolov8pose_postprocess.cpp

// Changed from 1 to 2 classes

#define NUM_CLASSES 2

// Lowered thresholds for 16-bit quantized model

#define SCORE_THRESHOLD 0.4f // was 0.6

float joint_threshold = 0.005f; // was 0.5

// Dynamic keypoint handling based on class

int max_keypoints = custom_labels::get_max_keypoints(class_id);

std::vector<std::pair<int, int>> class_joint_pairs = custom_labels::get_joint_pairs(class_id);

// Added coordinate clipping to stay within bounds

xt::view(kpts_corrdinates, xt::all(), 0) = xt::clip(xt::view(kpts_corrdinates, xt::all(), 0), 0.0f, (float)network_dims[0]);

xt::view(kpts_corrdinates, xt::all(), 1) = xt::clip(xt::view(kpts_corrdinates, xt::all(), 1), 0.0f, (float)network_dims[1]);

// Normalize keypoints to 0-1 range to match bbox coordinate system

landmarks(i, 0) = scaled_keypoints[i].xs / network_dims[0]; // x normalized

landmarks(i, 1) = scaled_keypoints[i].ys / network_dims[1]; // y normalized

// Pass to Hailo function

hailo_common::add_landmarks_to_detection(detection, "centerpose", landmarks, 0.001f, class_joint_pairs);

2. Created custom_labels.hpp

static std::map<uint8_t, std::string> custom_classes = {

{0, "deo"}, {1, "lighter"}

};

static std::map<int, int> max_keypoints_per_class = {

{0, 2}, {1, 3}  // deo: 2 keypoints, lighter: 3 keypoints

};

static std::map<int, std::vector<std::pair<int, int>>> joint_pairs_per_class = {

{0, {{0, 1}}},                    // deo: connect bottom to top

{1, {{0, 1}, {1, 2}}}            // lighter: bottom→mid→top chain

};

3. Modified Functions

decode_boxes_and_keypoints(): Added class-specific logic using argmax for multi-class detection
process_single_decoding()
yolov8(): Passes class ID through the pipeline
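The post calls `custom_labels::get_max_keypoints` and `custom_labels::get_joint_pairs` but does not show them. A minimal sketch of how such accessors might look is below; the bodies are an assumption, including the choice to return 0 keypoints and no joint pairs for an unknown class id. The two tables are repeated only so the snippet compiles on its own.

```cpp
#include <map>
#include <utility>
#include <vector>

namespace custom_labels
{
    // Same tables as in custom_labels.hpp above, repeated for self-containment.
    static const std::map<int, int> max_keypoints_per_class = {
        {0, 2}, {1, 3}};
    static const std::map<int, std::vector<std::pair<int, int>>> joint_pairs_per_class = {
        {0, {{0, 1}}},
        {1, {{0, 1}, {1, 2}}}};

    // Number of keypoints to display for a class; 0 for an unknown class id (assumed policy).
    inline int get_max_keypoints(int class_id)
    {
        auto it = max_keypoints_per_class.find(class_id);
        if (it == max_keypoints_per_class.end())
            return 0;
        return it->second;
    }

    // Joint pairs to connect for a class; empty for an unknown class id (assumed policy).
    inline std::vector<std::pair<int, int>> get_joint_pairs(int class_id)
    {
        auto it = joint_pairs_per_class.find(class_id);
        if (it == joint_pairs_per_class.end())
            return {};
        return it->second;
    }
}
```

Using `find` instead of `operator[]` avoids silently inserting an empty entry when the argmax yields a class id that is not in the table.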

Model Details

My model outputs keypoints as UINT16 (not UINT8 like standard models):
Output yolov8s_pose/conv72 UINT16, FCR(20x20x9)
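Because the keypoint tensors are UINT16, the buffer has to be read through a 16-bit pointer before any decoding; reading it byte by byte as UINT8 yields unrelated values. The sketch below shows plain affine dequantization, assuming a per-tensor scale and zero point (`qp_scale` and `qp_zp` here are placeholder parameters that would come from the output's quantization info). `dequantize_u16` is a hypothetical helper, not a TAPPAS function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Affine dequantization: real = qp_scale * (quantized - qp_zp).
// qp_scale and qp_zp are assumed to be the tensor's quantization parameters.
inline std::vector<float> dequantize_u16(const uint16_t *data, std::size_t count,
                                         float qp_scale, float qp_zp)
{
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        out[i] = qp_scale * (static_cast<float>(data[i]) - qp_zp);
    }
    return out;
}
```

For one 20x20x9 output, `count` would be 20 * 20 * 9 = 3600 values.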

Custom HEF Details:

HEF Properties:
admin@C2:~ $ hailortcli parse-hef yolov8s_pose_04_08_2025.hef

Architecture HEF was compiled for: HAILO8L

Network group name: yolov8s_pose, Multi Context - Number of contexts: 4

Network name: yolov8s_pose/yolov8s_pose

    VStream infos:

        Input  yolov8s_pose/input_layer1 UINT8, NHWC(640x640x3)

        Output yolov8s_pose/conv70 UINT8, FCR(20x20x64)

        Output yolov8s_pose/conv71 UINT8, NHWC(20x20x2)

        Output yolov8s_pose/conv72 UINT16, FCR(20x20x9)

        Output yolov8s_pose/conv57 UINT8, FCR(40x40x64)

        Output yolov8s_pose/conv58 UINT8, NHWC(40x40x2)

        Output yolov8s_pose/conv59 UINT16, FCR(40x40x9)

        Output yolov8s_pose/conv43 UINT8, FCR(80x80x64)

        Output yolov8s_pose/conv44 UINT8, NHWC(80x80x2)

        Output yolov8s_pose/conv45 UINT16, FCR(80x80x9)
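The listing above implies one stride per output scale: with a 640x640 input, each grid cell covers 640 / grid_size pixels. A small sketch of that arithmetic (`stride_for_grid` is a hypothetical helper name):

```cpp
// Pixels covered by one grid cell along one axis.
// Assumes a square input whose size is an exact multiple of the grid size.
constexpr int stride_for_grid(int input_size, int grid_size)
{
    return input_size / grid_size;
}

// The three scales in this HEF, for a 640x640 input:
static_assert(stride_for_grid(640, 20) == 32, "conv70/conv71/conv72");
static_assert(stride_for_grid(640, 40) == 16, "conv57/conv58/conv59");
static_assert(stride_for_grid(640, 80) == 8, "conv43/conv44/conv45");
```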


Could someone help me implement custom multi-class pose detection with a different number of keypoints per class, and create a compatible .so file? Thanks.

omria · August 13, 2025, 9:22am

Hey @user113!

Your keypoints are all clustered at (0, 0.304xx) because you’re either reading the wrong tensor indices or missing the anchor/stride calculations for your conv45/conv59/conv72 layers. Plus that “two dots moving together” thing? Classic sign that you’re truncating your keypoint array too early.

The main issues:

1. You’re decoding in the wrong order
Don’t resize to max_keypoints first! Decode ALL of your model’s keypoint channels (9 per grid cell in your FCR output, which for a YOLOv8-pose head would be 3 keypoints × (x, y, confidence)) from UINT16 to float with proper stride scaling, THEN pick which ones you want to show.

2. Missing multi-scale handling
Your model has 3 keypoint outputs at different scales (20x20, 40x40, 80x80) that need different strides (32, 16, 8 respectively for a 640x640 input). If you’re only decoding one scale, most keypoints will be garbage - just like YOLO bbox decoding.

3. Hailo expects the full array
hailo_common::add_landmarks_to_detection wants the complete fixed-size keypoint array, not a resized vector. Filter by confidence, don’t resize.
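As a rough illustration of the decode step, the sketch below decodes every keypoint of one grid cell before any per-class selection. It assumes the standard Ultralytics YOLOv8-pose formulation (x = (raw_x * 2 + grid_x) * stride, confidence = sigmoid(raw_score)) and a per-cell channel layout of [x0, y0, s0, x1, y1, s1, ...]; both are assumptions about this particular model, and `Keypoint` and `decode_cell_keypoints` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical keypoint type: pixel coordinates in network input space plus a confidence.
struct Keypoint
{
    float x, y, confidence;
};

inline float sigmoid(float v)
{
    return 1.0f / (1.0f + std::exp(-v));
}

// Decodes all keypoints of one grid cell from dequantized values.
// raw is assumed to be laid out as [x0, y0, s0, x1, y1, s1, ...].
inline std::vector<Keypoint> decode_cell_keypoints(const std::vector<float> &raw,
                                                   int grid_x, int grid_y, int stride)
{
    std::vector<Keypoint> out;
    out.reserve(raw.size() / 3);
    for (std::size_t i = 0; i + 2 < raw.size(); i += 3)
    {
        Keypoint k;
        k.x = (raw[i] * 2.0f + static_cast<float>(grid_x)) * static_cast<float>(stride);
        k.y = (raw[i + 1] * 2.0f + static_cast<float>(grid_y)) * static_cast<float>(stride);
        k.confidence = sigmoid(raw[i + 2]);
        out.push_back(k);
    }
    return out;
}
```

The same function would be called for each of the three scales with that scale's stride, using the grid cell that produced the detection.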

Quick fixes for your .so:

Decode full tensor first, select class-specific subset after
Handle all 3 output scales with correct strides
Keep joint_threshold around 0.2 for production (0.005 is fine for debugging though)
Make sure you’re building with -fPIC and proper linking
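One way to "select the class-specific subset" without resizing is to keep the full keypoint array and zero the confidence of the entries a class does not use, so they fall below the joint threshold. This is only a sketch of that idea: it assumes points below the threshold are not drawn, and `Keypoint` and `mask_unused_keypoints` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical keypoint type, as produced by the decode step.
struct Keypoint
{
    float x, y, confidence;
};

// Keeps the array at its full size and suppresses keypoints beyond used_count
// by setting their confidence to zero.
inline std::vector<Keypoint> mask_unused_keypoints(std::vector<Keypoint> all,
                                                   std::size_t used_count)
{
    for (std::size_t i = used_count; i < all.size(); ++i)
    {
        all[i].confidence = 0.0f;
    }
    return all;
}
```

For the "deo" class, `used_count` would be 2, leaving the third decoded keypoint in place but suppressed.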

Your per-class logic and joint pairs look good actually! It’s really just the decoding order that’s messing things up.

user113 · August 22, 2025, 9:13am

Hi,

It worked, thanks a lot for your help! This was a small experiment before labeling/training data for my custom model with more keypoints. I will let you know if I need your help in the future.

Best.