Hey @nicholas.young,
Let me help explain how to read those YOLOv7 outputs you’re getting from Hailo. The data might look a bit confusing at first, but it’s actually organized in a pretty specific way.
When your model detects objects, it outputs them in what we call an NMS (Non-Maximum Suppression) format. Here’s what each piece means:
For each object detected, you’ll see a group of numbers that follows this pattern:
[class_label, confidence_score, x_min, y_min, x_max, y_max]
So if you see something like {2e0, ...}, the leading 2e0 is just 2.0 written in scientific notation, so that detection belongs to class #2. The numbers that follow tell you:
- How confident the model is about this detection
- Where exactly the object is in the frame (those x_min, y_min, x_max, y_max coordinates)
After the six values of each detection, you’ll see either:
- More detections following the same pattern
- A bunch of zeros (this is just padding to keep the output size consistent)
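If it helps, here is a minimal Python sketch of walking such a buffer. It assumes the output has already been copied into a flat list of floats and that it uses exactly the six-value layout described above; `parse_detections` and `Detection` are hypothetical names for illustration, not part of the Hailo API, so check the layout against what your HailoRT version actually returns before relying on it.

```python
from typing import List, NamedTuple

# Assumed layout per detection (taken from the description above):
# class_label, confidence_score, x_min, y_min, x_max, y_max
FIELDS_PER_DETECTION = 6


class Detection(NamedTuple):
    class_label: int
    confidence: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_detections(flat: List[float]) -> List[Detection]:
    """Split a flat buffer into 6-value records and drop all-zero padding."""
    detections = []
    # Any incomplete record at the very end of the buffer is ignored.
    usable = len(flat) - len(flat) % FIELDS_PER_DETECTION
    for start in range(0, usable, FIELDS_PER_DETECTION):
        record = flat[start:start + FIELDS_PER_DETECTION]
        if all(value == 0 for value in record):
            continue  # padding, not a real detection
        label, confidence, x_min, y_min, x_max, y_max = record
        detections.append(
            Detection(int(label), confidence, x_min, y_min, x_max, y_max)
        )
    return detections


# Made-up sample: two detections followed by one record of zero padding.
raw = [
    2e0, 0.91, 0.10, 0.20, 0.50, 0.60,
    0.0, 0.87, 0.30, 0.30, 0.70, 0.90,
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
]
for det in parse_detections(raw):
    print(det)
```

A record only counts as padding when all six values are zero, so a real detection from class 0 is still kept because its confidence and coordinates are non-zero.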
The coordinates are normalized to your frame size, so they’ll be between 0 and 1. To get pixel locations, multiply x_min and x_max by your frame width, and y_min and y_max by your frame height.
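As a rough sketch of that conversion in Python (the function name `to_pixels` and the 640x480 frame size are just assumptions for the example, so swap in your real frame dimensions):

```python
from typing import Tuple


def to_pixels(
    x_min: float,
    y_min: float,
    x_max: float,
    y_max: float,
    frame_width: int,
    frame_height: int,
) -> Tuple[int, int, int, int]:
    """Scale a normalized (0..1) box to integer pixel coordinates."""
    return (
        round(x_min * frame_width),   # x values scale with the width
        round(y_min * frame_height),  # y values scale with the height
        round(x_max * frame_width),
        round(y_max * frame_height),
    )


# Example with an assumed 640x480 frame.
box = to_pixels(0.10, 0.20, 0.50, 0.60, frame_width=640, frame_height=480)
print(box)  # (64, 96, 320, 288)
```

Rounding to integers is handy for drawing boxes; keep the float values instead if you need sub-pixel precision.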