How to interpret raw output

nicholas.young · November 9, 2024, 1:46am

I’m using the C library to run inference on the Hailo-8L on a Raspberry Pi 5. I can split an image into frames and pixels and feed them into the configured .hef file, and I get output, but I don’t know how to interpret it.

I’m using the model zoo yolov7, which parse-hef gives as
Output yolov7/yolov5_nms_postprocess FLOAT32, HAILO NMS(number of classes: 80, maximum bounding boxes per class: 80, maximum frame size: 128320)
But I don’t know what bits are what. When I run the model, I get an output like the snippet below. The first 5 make sense, classification, confidence, and 4 coords, but after all of the zeros, there is a classification, followed by 7 floating point numbers, followed by a possible classification, followed by 5 floating point numbers, and it doesn’t make any sense to me.

Example output:

{ 2e0, 2.9149818e-1, 1.3322696e-1, 1.0043838e0, 8.3363557e-1, 8.901919e-1, 8.9297575e-1, 7.028085e-2, 9.282986e-1, 8.92289e-2, 2.3529352e-1, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 1e0, 7.443513e-1, 1.754624e-1, 8.730015e-1, 2.6385126e-1, 3.3716178e-1, 1e0, 9.2661124e-1, 1.9907206e-3, 9.98977e-1, 1.4516605e-1, 6.3420063e-1, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, 0e0, -3.0316488e-13,

omria · November 10, 2024, 11:42am

Hey @nicholas.young ,

Let me help explain how to read those YOLOv7 outputs you’re getting from Hailo. The data might look a bit confusing at first, but it’s actually organized in a pretty specific way.

When your model detects objects, it outputs them in what we call an NMS (Non-Maximum Suppression) format. Here’s what each piece means:

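The buffer is grouped by class, not by detection. For each of the 80 classes, in order, you’ll see:

A count of how many boxes were found for that class (stored as a float)
That many boxes, each made of 5 floats: [y_min, x_min, y_max, x_max, score]

So if you see something like {2e0, ...}, that 2 at the start is not a class label. It means class #0 has 2 detections, and the next 10 numbers are those two boxes. The class of a box comes from its position in the per-class sequence, so it never appears as a value in the buffer.

A class with no detections contributes a single 0 (its count) and nothing else, which is where the long runs of zeros come from. The later 1e0 values are counts for classes that have one box each, followed by that box’s 5 floats.

This also explains the maximum frame size that parse-hef reports: 80 classes × (1 count + 80 boxes × 5 floats) × 4 bytes = 128320. Anything in the buffer after the last class’s data is unused space.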
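
A minimal sketch in C of walking such a buffer, assuming the by-class layout that HailoRT documents for HAILO NMS outputs: for each class in order, one float holding a box count, then that many boxes of five floats (y_min, x_min, y_max, x_max, score). The names detection_t and parse_nms_by_class are made up for illustration and are not part of HailoRT.

```c
#include <stddef.h>

/* One decoded box. class_id comes from the position in the buffer,
 * not from a stored value. Hypothetical type, not a HailoRT struct. */
typedef struct {
    int   class_id;
    float y_min, x_min, y_max, x_max, score;
} detection_t;

enum { FLOATS_PER_BOX = 5 };

/* Walks buf (buf_len floats) class by class and copies up to out_cap
 * boxes into out. Returns the number of boxes written. */
size_t parse_nms_by_class(const float *buf, size_t buf_len,
                          size_t num_classes, size_t max_per_class,
                          detection_t *out, size_t out_cap)
{
    size_t pos = 0;
    size_t n_out = 0;

    for (size_t cls = 0; cls < num_classes && pos < buf_len; ++cls) {
        float raw_count = buf[pos++];

        /* Reject NaN, negative, or oversized counts instead of trusting them. */
        if (!(raw_count >= 0.0f && raw_count <= (float)max_per_class)) {
            break;
        }
        size_t count = (size_t)raw_count;

        /* Stop if the claimed boxes would run past the end of the buffer. */
        if (count * FLOATS_PER_BOX > buf_len - pos) {
            break;
        }

        for (size_t i = 0; i < count; ++i) {
            const float *b = buf + pos;
            pos += FLOATS_PER_BOX;
            if (n_out < out_cap) {
                out[n_out++] = (detection_t){ (int)cls, b[0], b[1], b[2], b[3], b[4] };
            }
        }
    }
    return n_out;
}
```

For the model in this thread the call would be along the lines of parse_nms_by_class(output, 128320 / sizeof(float), 80, 80, dets, dets_cap), where output is the float buffer read from the output stream and dets is a caller-provided array.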

The coordinates are normalized to your frame size, so they’ll be roughly between 0 and 1 (a slight overshoot such as 1.004 can occur, so clamp before use). Multiply the x values by your frame width and the y values by your frame height to get pixel locations.
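
As a sketch of that conversion in C: pixel_box_t and to_pixels are hypothetical names, and the clamping is an addition prompted by the sample output in the question, which contains values slightly above 1 (for example 1.0043838).

```c
/* Pixel-space box edges. Hypothetical type for illustration. */
typedef struct {
    int left, top, right, bottom;
} pixel_box_t;

/* Clamp a normalized coordinate into [0, 1]. */
static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Scale normalized box edges to pixel edges, rounding to nearest.
 * x values scale by frame width, y values by frame height. */
pixel_box_t to_pixels(float x_min, float y_min, float x_max, float y_max,
                      int frame_w, int frame_h)
{
    pixel_box_t p;
    p.left   = (int)(clamp01(x_min) * (float)frame_w + 0.5f);
    p.top    = (int)(clamp01(y_min) * (float)frame_h + 0.5f);
    p.right  = (int)(clamp01(x_max) * (float)frame_w + 0.5f);
    p.bottom = (int)(clamp01(y_max) * (float)frame_h + 0.5f);
    return p;
}
```

The results are edge positions, so right and bottom can equal the frame width and height. Subtract 1 if you need the index of the last pixel inside the box.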

nicholas.young · November 13, 2024, 3:10am

Why are they x_min and x_max instead of “x_left” or “x_right”? Is this because of the NMS step?

nicholas.young · November 13, 2024, 3:12am

I’m also seeing that the y_max is sometimes smaller than the y_min. What could cause this?
Person found! Confidence: 0.47459447, 0.15992427:1.0001111, 0.8834089, 0.92156434
Person found! Confidence: 0.46784216, 0.16221726:1.0045106, 0.8850374, 0.9176428
Person found! Confidence: 0.78671217, 0.63345283:0.92201316, 0.73821384, 0.3840052
