How to interpret raw output

Hey @nicholas.young,

Let me explain how to read the YOLOv7 outputs you’re getting from Hailo. The data might look confusing at first, but it follows a fixed layout.

When your model detects objects, it outputs them in what we call an NMS (Non-Maximum Suppression) format. Here’s what each piece means:

For each object detected, you’ll see a group of numbers that follows this pattern:

[class_label, confidence_score, x_min, y_min, x_max, y_max]

So if you see something like {2e0, ...}, the 2e0 at the start is just 2.0 written in scientific notation, which means it’s detected an object from class #2. The numbers that follow tell you:

  • How confident the model is about this detection
  • Where exactly the object is in the frame (those x_min, y_min, x_max, y_max coordinates)

After each detection’s six values, you’ll see either:

  • More detections following the same pattern
  • A bunch of zeros (this is just padding to keep the output size consistent)
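
To make that concrete, here’s a rough sketch of how you could split the raw values into detections and skip the padding. It assumes the six-value layout described above and that you already have the output as a flat list of floats; parse_detections and Detection are just names I made up for illustration, not part of any Hailo API.

```python
from typing import List, NamedTuple, Sequence

# Assumed layout per detection:
# [class_label, confidence_score, x_min, y_min, x_max, y_max]
FIELDS_PER_DETECTION = 6


class Detection(NamedTuple):
    class_id: int
    score: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_detections(raw: Sequence[float]) -> List[Detection]:
    """Split a flat output buffer into detections, skipping zero padding."""
    detections = []
    last_start = len(raw) - FIELDS_PER_DETECTION
    for i in range(0, last_start + 1, FIELDS_PER_DETECTION):
        group = list(raw[i:i + FIELDS_PER_DETECTION])
        if all(value == 0 for value in group):
            continue  # an all-zero group is treated as padding
        label, score, x_min, y_min, x_max, y_max = group
        detections.append(
            Detection(int(label), float(score),
                      float(x_min), float(y_min), float(x_max), float(y_max))
        )
    return detections


if __name__ == "__main__":
    # Made-up example values: one detection of class 2, then padding.
    raw_output = [2e0, 0.87, 0.10, 0.20, 0.50, 0.60] + [0.0] * 6
    for det in parse_detections(raw_output):
        print(det)
```

If your buffer turns out to be laid out differently (for example, grouped per class), the grouping logic would need to change, so check it against a frame where you know what should be detected.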

The coordinates are normalized to your frame size, so they’ll be between 0 and 1. To get pixel locations, multiply the x values by your frame width and the y values by your frame height.
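
As a small sketch of that scaling step (to_pixels is a hypothetical helper, and the 640x480 frame size is only an example), the conversion could look like this:

```python
from typing import Tuple


def to_pixels(
    box: Tuple[float, float, float, float],
    frame_width: int,
    frame_height: int,
) -> Tuple[int, int, int, int]:
    """Convert a normalized (x_min, y_min, x_max, y_max) box to pixels."""
    # Clamp to [0, 1] first in case a value drifts slightly out of range.
    x_min, y_min, x_max, y_max = (min(max(v, 0.0), 1.0) for v in box)
    return (
        round(x_min * frame_width),   # x values scale with the width
        round(y_min * frame_height),  # y values scale with the height
        round(x_max * frame_width),
        round(y_max * frame_height),
    )


if __name__ == "__main__":
    # Example: a box covering the middle-right area of a 640x480 frame.
    print(to_pixels((0.25, 0.5, 0.75, 1.0), 640, 480))
```

The clamping is my own addition for safety; drop it if you’d rather see out-of-range values as they are.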