Inference using Arducam frames

import argparse
import cv2
from picamera2 import MappedArray, Picamera2, Preview
from picamera2.devices import Hailo
import numpy as np
from control_settings_in_yaml import generate_controls_from_yaml


def extract_detections(hailo_output, w, h, class_names, threshold=0.5):
    """Extract detections from the HailoRT-postprocess output."""
    results = []
    for class_id, detections in enumerate(hailo_output):
        for detection in detections:
            detection_array = np.array(detection)
            score = detection_array[4]
            if score >= threshold:
                y0, x0, y1, x1 = detection_array[:4]
                bbox = (int(x0 * w), int(y0 * h), int(x1 * w), int(y1 * h))
                results.append([class_names[class_id], bbox, score])
                print(
                    f"Detection(s) found for class '{class_names[class_id]}', Score: {score:.2f}"
                )
    return results


def draw_objects(request):
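    """Draw the most recent detections onto the main stream; runs as the Picamera2 pre_callback."""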
    current_detections = detections
    if current_detections:
        with MappedArray(request, "main") as m:
            for class_name, bbox, score in current_detections:
                x0, y0, x1, y1 = bbox
                label = f"{class_name} %{int(score * 100)}"
                cv2.rectangle(m.array, (x0, y0), (x1, y1), (0, 255, 0), 2)
                cv2.putText(
                    m.array,
                    label,
                    (x0 + 5, y0 + 15),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.5,
                    (0, 255, 0),
                    1,
                    cv2.LINE_AA,
                )


if __name__ == "__main__":

    parser = argparse.ArgumentParser(
        description="Record a video with Picamera2 and perform object detection."
    )
    parser.add_argument("--width", type=int, default=1080,
                        help="Width of the video")
    parser.add_argument("--height", type=int, default=720,
                        help="Height of the video")
    parser.add_argument(
        "--config_file_path",
        type=str,
        default="config.yaml",
        help="Configuration file path",
    )
    parser.add_argument(
        "-m", "--model", help="Path for the HEF model.", default="yolov8n.hef"
    )
    parser.add_argument(
        "-l",
        "--labels",
        default="coco_1.txt",
        help="Path to a text file containing labels.",
    )
    parser.add_argument(
        "-s",
        "--score_thresh",
        type=float,
        default=0.5,
        help="Score threshold, must be a float between 0 and 1.",
    )

    args = parser.parse_args()
    video_w = args.width
    video_h = args.height
    score_thresh = args.score_thresh
    labels = args.labels
    model = args.model

    # Get the Hailo model and the input size it expects.
    with Hailo(model) as hailo:
        model_h, model_w, _ = hailo.get_input_shape()

        # Load class names from the labels file
        with open(labels, "r", encoding="utf-8") as f:
            class_names = f.read().splitlines()

        # The list of detected objects to draw.
        detections = None

        with Picamera2() as picam2:

            # Configure and start Picamera2.
            main = {'size': (video_w, video_h), 'format': 'XBGR8888'}
            lores = {'size': (model_w, model_h), 'format': 'YUV420'}
            config = picam2.create_preview_configuration(main, lores=lores)
            picam2.configure(config)

            # Generate control dictionary from yaml file
            camera_control_dict = generate_controls_from_yaml(
                args.config_file_path)
            picam2.set_controls(camera_control_dict)

            picam2.start_preview(Preview.QTGL, x=0, y=0,
                                 width=video_w, height=video_h)

            picam2.start()

            # Draw the latest detections onto each main frame before display.
            picam2.pre_callback = draw_objects

            while True:
                # Grab the low-res frame, convert planar YUV420 to RGB for the
                # Hailo device, then update the detections for the callback.
                frame = picam2.capture_array('lores')
                rgb = cv2.cvtColor(frame, cv2.COLOR_YUV420p2RGB)
                results = hailo.run(rgb)
                detections = extract_detections(
                    results, video_w, video_h, class_names, score_thresh
                )

I am running inference with a YOLOv8n model on a Raspberry Pi CM4 with Hailo. I am using the script above, with the 'lores' camera stream feeding the inference.

I have tested passing the frame to hailo.run directly in every image format Picamera2 offers (XBGR8888, XRGB8888, RGB888, BGR888, and YUV420), but they all throw errors. I have to convert the frame first with OpenCV, and I am concerned that this conversion is adding latency.
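
For reference, the conversion itself can be timed in isolation with a quick sketch like this (synthetic frame, no camera needed; the 640x640 size is an assumption matching my model input):

import time
import cv2
import numpy as np

# Synthetic planar YUV420 buffer for a 640x640 frame; capture_array('lores')
# returns the same (h * 3 // 2, w) uint8 layout.
yuv = np.zeros((640 * 3 // 2, 640), dtype=np.uint8)

t0 = time.perf_counter()
for _ in range(100):
    rgb = cv2.cvtColor(yuv, cv2.COLOR_YUV420p2RGB)
per_frame_ms = (time.perf_counter() - t0) * 1000 / 100
print(f"cvtColor YUV420 -> RGB: {per_frame_ms:.3f} ms per frame")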

In the visualization, the bounding boxes flicker and the model does not detect reliably. The model was compiled with the Hailo Dataflow Compiler at optimization level 2, and it performed very well on my test dataset, so the issue is probably in the inference script.

Could you suggest what might be wrong in this script?

Thanks!


This is the error raised when passing a YUV420 frame directly to hailo.run.

I had to comment out your 'control-file' bits, since I don't have whatever you keep in there. But otherwise, things work for me on a Pi 5, Hailo 4.20.1 & an Arducam 'PiCam v3-w-AF' :wink:

ubuntu@ubuntu-2404-pi5b:/tmp$ python3 foo.py 
[12:48:10.661803559] [7763]  INFO Camera camera_manager.cpp:327 libcamera v0.4.0+53-29156679
[12:48:10.684896543] [7778]  INFO RPI pisp.cpp:720 libpisp version v1.0.7 28196ed6edcf 14-03-2025 (00:01:06)
[12:48:10.779826451] [7778]  INFO RPI pisp.cpp:1179 Registered camera /base/axi/pcie@120000/rp1/i2c@88000/imx708@1a to CFE device /dev/media0 and ISP device /dev/media1 using PiSP variant BCM2712_C0
[12:48:10.782920553] [7763]  WARN V4L2 v4l2_pixelformat.cpp:346 Unsupported V4L2 pixel format RPBP
[12:48:10.783121145] [7763]  WARN V4L2 v4l2_pixelformat.cpp:346 Unsupported V4L2 pixel format RPBP
[12:48:10.783802179] [7763]  INFO Camera camera.cpp:1202 configuring streams: (0) 1080x720-XBGR8888 (1) 640x640-YUV420 (2) 1536x864-BGGR_PISP_COMP1
[12:48:10.783947716] [7778]  INFO RPI pisp.cpp:1484 Sensor: /base/axi/pcie@120000/rp1/i2c@88000/imx708@1a - Selected sensor format: 1536x864-SBGGR10_1X10 - Selected CFE format: 1536x864-PC1B
Detection(s) found for class 'person', Score: 0.88
Detection(s) found for class 'person', Score: 0.87
Detection(s) found for class 'person', Score: 0.87
Detection(s) found for class 'person', Score: 0.88
Detection(s) found for class 'person', Score: 0.86
Detection(s) found for class 'person', Score: 0.84
Detection(s) found for class 'person', Score: 0.85

Thank you, Marco, for the feedback. It's working for me too, but I'm seeing latency. Do you see latency in your detections as well? What detection threshold did you set?
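
To pin down where the time goes on my side, I am instrumenting the loop roughly like this (a sketch only, reusing the picam2, hailo, and helper objects from the script above):

import time

while True:
    t0 = time.perf_counter()
    frame = picam2.capture_array('lores')
    t1 = time.perf_counter()
    rgb = cv2.cvtColor(frame, cv2.COLOR_YUV420p2RGB)
    t2 = time.perf_counter()
    results = hailo.run(rgb)
    t3 = time.perf_counter()
    detections = extract_detections(
        results, video_w, video_h, class_names, score_thresh
    )
    print(f"capture {t1 - t0:.3f}s  convert {t2 - t1:.3f}s  infer {t3 - t2:.3f}s")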

@omria Sorry for tagging you. Could you please take a look at this code? The hailo.run function does not accept the frame directly from the camera (I tested all formats), and I have to convert it first using OpenCV. Do you know why? In this example, picamera2/examples/hailo/detect.py at main · raspberrypi/picamera2, they pass the frame directly without conversion, but when I tested that, it raised the error in the screenshot.
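
For what it's worth, a quick way to see what hailo.run is unhappy about is to compare the model's expected input with what capture_array actually returns (a sketch using the same objects as the script above):

frame = picam2.capture_array('lores')
print("model input shape:", hailo.get_input_shape())   # e.g. (640, 640, 3)
print("lores frame:", frame.shape, frame.dtype)
# A planar YUV420 frame comes back as a 2-D (h * 3 // 2, w) uint8 array, which
# does not match the model's 3-channel input until it is converted.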

Actually, with

            # lores = {'size': (model_w, model_h), 'format': 'YUV420'}
            lores = {'size': (model_w, model_h), 'format': 'RGB888'}

it works for me;-)

                frame = picam2.capture_array('lores')
                # rgb = cv2.cvtColor(frame, cv2.COLOR_YUV420p2RGB)
                # results = hailo.run(rgb)
                results = hailo.run(frame)

But even with the conversion in the mix, I did not see noticeable lag?! I did not really measure it, though; it was just me as the 'person' and a few other things as a test ;-)
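
Putting the two changes together, the relevant parts of the script become (a sketch; everything else as in your script above):

# Request the lores stream as packed RGB so it already matches the model input.
lores = {'size': (model_w, model_h), 'format': 'RGB888'}
config = picam2.create_preview_configuration(main, lores=lores)
picam2.configure(config)

while True:
    frame = picam2.capture_array('lores')   # already 3-channel, no cvtColor
    results = hailo.run(frame)
    detections = extract_detections(
        results, video_w, video_h, class_names, score_thresh
    )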


I’m using an RPi CM4, and it only works when converting first. Thank you!

What version of Hailo* are you running?

I’m using Hailo 4.20.1

At least we agree there;-)
