Hi @Theo_Vioux
It is hard to say whether people can be recognized at certain angles, especially in a pure side view where only one eye is visible. Face embedding models rely on 5 keypoints (two eyes, nose tip, two lip corners), and embedding quality can degrade when only half of the face is visible. That said, the final accuracy depends on the total number of people you want to recognize. For a small set, many such optimizations work, because the probability of an embedding crossing the threshold and causing a false identification is low.
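To illustrate the thresholding idea: with a small enrolled set, a nearest-neighbor match over L2-normalized embeddings with a strict cosine-similarity threshold is usually enough. A minimal sketch (the threshold value and names are illustrative, not tuned):

```python
import numpy as np

def identify(embedding, gallery, threshold=0.6):
    """Match an L2-normalized face embedding against a small gallery.

    gallery: dict mapping person name -> normalized reference embedding.
    Returns the best-matching name, or None if no similarity clears
    the threshold (rejecting unknowns avoids false identifications).
    """
    best_name, best_sim = None, threshold
    for name, ref in gallery.items():
        sim = float(np.dot(embedding, ref))  # cosine similarity of unit vectors
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name
```

With only 3-4 enrolled people, even a fairly strict threshold leaves a comfortable margin between identities.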
Hi,
There should be 3 or 4 people, but ideally they should be recognizable from the back as well. If that's not possible, I'd make sure that at least the front of the face is recognizable by a camera. With the face detection + face recognition + pose estimation pipeline I get about 5 FPS, which is not enough. Do you see a way to optimize this?
Hi @shashi
For person detection, I use the
yolo11n_pose--640x640_quant_hailort_multidevice_1
model, but since it is the light nano version, it is not accurate enough to detect body joints (for example, a hand is only detected when the head is also detected, and it struggles with occlusion). Do you have a more accurate version, even if it costs some FPS? I was thinking of an m version, or one compiled directly for Hailo8 rather than multidevice. If there is no compatible model, would it be better to change the output tensors, currently set to:
{
  "ConfigVersion": 6,
  "Checksum": "01082212afab31375c5ba2f66641681d2719e5ea18053d336c4bc2ab37a81362",
  "DEVICE": [
    {
      "DeviceType": "HAILO8L",
      "RuntimeAgent": "HAILORT",
      "ThreadPackSize": 6,
      "SupportedDeviceTypes": "HAILORT/HAILO8L, HAILORT/HAILO8"
    }
  ],
  "PRE_PROCESS": [
    {
      "InputN": 1,
      "InputH": 640,
      "InputW": 640,
      "InputC": 3,
      "InputQuantEn": true
    }
  ],
  "MODEL_PARAMETERS": [
    {
      "ModelPath": "yolo11n_pose--640x640_quant_hailort_multidevice_1.hef"
    }
  ],
  "POST_PROCESS": [
    {
      "OutputPostprocessType": "PoseDetectionYoloV8",
      "OutputNumClasses": 1,
      "LabelsPath": "labels_yolo11n_pose.json",
      "OutputNumLandmarks": 17
    }
  ]
}
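In the meantime, I work around the occlusion noise at postprocessing time by gating on per-keypoint confidence, so occluded joints are dropped rather than reported at a guessed position. A minimal sketch, assuming each of the 17 COCO landmarks arrives as an (x, y, score) triple (the threshold is just what I picked):

```python
def filter_keypoints(landmarks, min_score=0.5):
    """Keep only keypoints whose confidence clears min_score.

    landmarks: list of (x, y, score) tuples, one per COCO keypoint.
    Returns a list of (index, x, y) for reliable joints; low-score
    joints (e.g. a hand hidden behind the body) are dropped instead
    of being trusted at a spurious position.
    """
    return [(i, x, y)
            for i, (x, y, s) in enumerate(landmarks)
            if s >= min_score]
```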
Hi @Theo_Vioux
We added yolo11s_pose model compiled for Hailo8 to our hailo model zoo. Please see if that helps.
Hi @shashi
Do you have a more accurate pose detection model? Beyond the occlusions, sometimes a joint is detected slightly off center in the RGB image, placing the dot on the wall behind the person in the 2D image; my infrared depth sensor then misinterprets the depth of that dot and takes the z of the wall. The data is quite noisy overall, and I want the most accurate model possible, so that the noise is as low as possible and can be post-processed correctly.
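As a partial workaround on my side, I now sample a small patch of the depth map around each keypoint instead of reading a single pixel, and discard pixels that jump far behind the nearest surface in the patch, so a dot that lands on the wall does not inherit the wall's z. A rough sketch (patch size and jump threshold are arbitrary; depth is assumed in millimeters with 0 meaning invalid):

```python
import numpy as np

def keypoint_depth(depth_map, x, y, patch=5, max_jump_mm=300):
    """Robustly estimate the depth of a keypoint at pixel (x, y).

    Takes the median of the foreground pixels in a small patch
    instead of the single (possibly off-center) pixel under the
    keypoint, rejecting background pixels more than max_jump_mm
    behind the nearest surface in the patch.
    """
    h, w = depth_map.shape
    r = patch // 2
    x0, x1 = max(0, x - r), min(w, x + r + 1)
    y0, y1 = max(0, y - r), min(h, y + r + 1)
    window = depth_map[y0:y1, x0:x1].astype(float).ravel()
    window = window[window > 0]                 # drop invalid (0) readings
    if window.size == 0:
        return None
    near = window.min()                         # nearest surface in patch
    fg = window[window <= near + max_jump_mm]   # keep foreground only
    return float(np.median(fg))
```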
Hi @Theo_Vioux
We compiled the yolo11m pose detection model with keypoint outputs in 16-bit precision, as the keypoints are more sensitive to quantization. The model can be found here: DeGirum AI Hub. Please see if this is better for your use case.
Hello @shashi,
Thanks for your feedback.
So far, I don't see much of a noticeable difference. The problem is that the detected joint points fluctuate too much in space, even when I stay still. Is there any way to stabilize this?
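For now I damp the jitter with a simple per-joint exponential moving average, which helps a little but adds lag. A minimal sketch in case it is useful (the alpha value is just what I picked by eye; a One Euro filter, which adapts alpha to speed, would be a common refinement):

```python
class KeypointSmoother:
    """Exponential moving average over per-joint (x, y) positions.

    alpha close to 1 tracks fast motion; closer to 0 damps jitter
    when the subject is still, at the cost of lag.
    """
    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.state = None  # last smoothed positions, one (x, y) per joint

    def update(self, keypoints):
        """keypoints: list of (x, y) per joint for one frame."""
        if self.state is None:
            self.state = [tuple(p) for p in keypoints]
        else:
            a = self.alpha
            self.state = [
                (a * nx + (1 - a) * sx, a * ny + (1 - a) * sy)
                for (nx, ny), (sx, sy) in zip(keypoints, self.state)
            ]
        return self.state
```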
Hello @shashi
The COCO pose model for multi-human pose estimation can be used to detect wrists, but with mediocre accuracy: because wrists are thin, it generates many false detections (often locking onto an object behind the wrist while it is moving). Is there a Hailo-compatible model that does hand estimation instead, like MediaPipe offers?
Hi @Theo_Vioux
Do you need just hand detection, or keypoints on the hands as well?
Just accurate hand detection (no keypoints on it). I saw that the
hand_landmark_lite--224x224_quant_hailort_hailo8_1
model is available, but is it accurate enough? I haven't managed to get it to detect a hand from code yet. If not, do you have any others?
Hi @Theo_Vioux
For just hand detection, you can use this model: https://hub.degirum.com/degirum/hailo/yolov8n_relu6_hand--640x640_quant_hailort_hailo8_1
The landmarks model takes a detected hand as input. Hence, if you provide a larger image with multiple hands in it, it will not give impressive results. You can use the two models in series: use the hand model to detect hands, then crop each hand out and run the landmarks model on the crop.
Do you have a code example? Because with:
model_names = [
"yolo11s_pose--640x640_quant_hailort_hailo8_1",
"yolov8n_relu6_hand--640x640_quant_hailort_hailo8_1"
]
I get :
Version is not supported
The version 11 of 'yolov8n_relu6_hand--640x640_quant_hailort_hailo8_1' model parameter collection is NEWER than the current version 10 of DeGirum framework software.
Hi @Theo_Vioux
Please upgrade PySDK to the latest version (pip install --upgrade degirum); the error means this model zoo entry was published with a newer PySDK than the one you have installed.