Hey @MaHG,
Welcome to the Hailo Community!
Let’s address the issues you mentioned by tweaking the post-processing steps in the script. Here are some recommendations to improve detection and keypoint estimation for multiple people:
- Adjust NMS Parameters: Instead of limiting `nms_max_output_per_class` to 1, use a higher value to allow multiple detections. Balance this with appropriate NMS settings to avoid duplicate detections.
- Modify the NMS Function: The current NMS implementation might be overly aggressive. It can be fine-tuned to better accommodate multiple people in the frame.
- Adjust Confidence Thresholds: Fine-tuning the confidence thresholds for both detection and keypoints helps strike a balance between detecting all individuals and avoiding false positives.
Here’s a modified version of the `non_max_suppression` function in `yolov8_pose_utils.py`:
```python
def non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45,
                        max_det=300, n_kpts=17):
    """Non-Maximum Suppression (NMS) on inference results to reject overlapping detections"""
    output = []
    # ... (existing code until the NMS part)
    for xi, x in enumerate(prediction):  # image index, image inference
        # ... (existing code until the NMS part; assumed to build `preds` with
        # one row per candidate: [x1, y1, x2, y2, score, class, kpts...])
        # Apply NMS and cap the number of kept detections at max_det
        keep = nms(preds, iou_thres)
        if keep.shape[0] > max_det:
            keep = keep[:max_det]
        out = x[keep]
        scores = out[:, 4]
        boxes = out[:, :4]
        kpts = out[:, 6:]
        kpts = np.reshape(kpts, (-1, n_kpts, 3))  # (num_people, n_kpts, [x, y, conf])
        out = {'bboxes': boxes,
               'keypoints': kpts,
               'scores': scores,
               'num_detections': int(scores.shape[0])}
        output.append(out)
    return output
```
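The snippet above calls an `nms` helper that returns the indices of the boxes to keep. If your copy of `yolov8_pose_utils.py` doesn’t already define one, here’s a minimal greedy IoU-based sketch, assuming `preds` rows start with `[x1, y1, x2, y2, score]`:

```python
import numpy as np

def nms(preds, iou_thres):
    """Greedy IoU-based NMS; returns kept indices sorted by descending score."""
    x1, y1, x2, y2 = preds[:, 0], preds[:, 1], preds[:, 2], preds[:, 3]
    scores = preds[:, 4]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top-scoring box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes whose overlap with the kept box exceeds the threshold
        order = order[1:][iou < iou_thres]
    return np.array(keep, dtype=int)
```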
Next, let’s make some adjustments to the `yolov8_pose_inference.py` script:

- Update the `kwargs` dictionary:
```python
kwargs = {
    'classes': 1,
    'nms_max_output_per_class': 100,  # Increased from 1 to allow multiple detections
    'anchors': {'regression_length': 15, 'strides': [8, 16, 32]},
    'score_threshold': 0.25,  # Raised from 0.001 to filter out low-confidence detections
    'nms_iou_thresh': 0.45,   # Lowered from 0.7; a lower IoU threshold suppresses duplicate boxes more aggressively
    'meta_arch': 'nanodet_v8',
    'device_pre_post_layers': None
}
```
- Modify the `visualize_pose_estimation_result` function call in the main loop:

```python
image = Image.fromarray(
    cv2.cvtColor(
        visualize_pose_estimation_result(
            results, processed_image,
            detection_threshold=0.25, joint_threshold=0.2,
            **kwargs),
        cv2.COLOR_BGR2RGB))
```
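For context on what those two thresholds control, here’s a rough, hypothetical sketch of the gating a pose visualization function typically applies, using the result dict produced by the `non_max_suppression` function above (the `draw_detections` name and drawing details are illustrative, not Hailo’s actual implementation):

```python
import cv2

def draw_detections(frame, results, detection_threshold=0.25, joint_threshold=0.2):
    # Hypothetical illustration of how the two thresholds gate drawing;
    # `results` follows the dict layout returned by non_max_suppression above.
    for box, score, kpts in zip(results['bboxes'], results['scores'],
                                results['keypoints']):
        if score < detection_threshold:
            continue  # skip low-confidence person detections
        x1, y1, x2, y2 = map(int, box)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        for x, y, kpt_conf in kpts:  # each keypoint is (x, y, confidence)
            if kpt_conf < joint_threshold:
                continue  # skip low-confidence joints
            cv2.circle(frame, (int(x), int(y)), 3, (0, 0, 255), -1)
    return frame
```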
These adjustments should enhance the detection and visualization of multiple individuals in the image. The key changes include:

- Increasing `nms_max_output_per_class` to permit more detections.
- Raising the `score_threshold` to filter out low-confidence detections.
- Fine-tuning the `nms_iou_thresh` to balance detecting closely positioned people and avoiding duplicates.
- Adding `detection_threshold` and `joint_threshold` parameters to the visualization call for better control over which detections and keypoints are displayed.
Note: I haven’t tested these changes myself, but this approach should help. Give these modifications a try and see if they improve the results for both single-person and multi-person images. You may need to further tweak the thresholds based on your specific use case and the characteristics of your images.
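If you want a quick way to find good values, you can sweep `score_threshold` on a representative multi-person image and compare detection counts. The `run_inference` call below is a hypothetical wrapper standing in for your existing preprocessing, HEF inference, and post-processing pipeline:

```python
# Illustrative threshold sweep; `run_inference` is a hypothetical wrapper
# around your existing inference + post-processing steps.
for thr in (0.1, 0.25, 0.4, 0.55):
    kwargs['score_threshold'] = thr
    results = run_inference(processed_image, **kwargs)
    print(f"score_threshold={thr}: {results['num_detections']} detections")
```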
Best Regards