Using Picamera2 YUV420 lores stream directly for Hailo inference

Hi,
I’m using a Raspberry Pi CM4 with Picamera2 and Hailo for object detection. My pipeline uses the Picamera2 lores stream for inference and the main stream for streaming/recording.

The issue is that the Picamera2 lores stream currently outputs frames as YUV420, which is not directly compatible with my RGB-trained Hailo model. Right now I need to perform a CPU-side color conversion with OpenCV (cv2.cvtColor(..., cv2.COLOR_YUV2RGB)) before inference. On the CM4, this extra conversion increases CPU and memory bandwidth usage, which I would like to avoid for performance reasons.

I know Hailo supports some input conversions such as nv12_to_rgb during model compilation, so I’m wondering whether there is a supported way to “bake” the YUV420-to-RGB conversion into the HEF/model itself.

Hi @dgarrido,

Yes, the Hailo Dataflow Compiler (DFC) supports baking the YUV-to-RGB conversion directly into the HEF, so inference accepts raw YUV frames without a CPU-side conversion. Since Picamera2’s YUV420 lores output is typically I420 (planar), you might try recompiling your model with the hybrid input conversion. If you’re using the Hailo Model Zoo, it could be as simple as:

hailomz compile <model_name> --hw-arch hailo8 --input-conversion i420_to_rgb

This chains the format rearrangement and the color conversion on-chip in a single step. If your lores stream happens to be NV12 (semi-planar) rather than I420, you’d use nv12_to_rgb instead.

According to the DFC docs, an RGB calibration dataset should still work with these hybrid conversions, so you likely won’t need to regenerate your calibration set in YUV format. Note that this requires access to the Dataflow Compiler to recompile the model; you can’t patch an existing pre-compiled HEF after the fact.

Please also see here:

Thank you very much for your quick response.

I normally use the Dataflow Compiler. In that case, this is my current .alls file:

normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])

change_output_activation(conv42, sigmoid)
change_output_activation(conv53, sigmoid)
change_output_activation(conv63, sigmoid)

model_optimization_flavor(optimization_level=3)

nms_postprocess("yolov8s_nms_config.json", meta_arch=yolov8, engine=cpu)

What do I need to add to this file?

Thanks in advance.

Hi @dgarrido,

You might try adding the hybrid i420_to_rgb input conversion line after your normalization command. The DFC applies model script commands in reverse order (each new command is inserted closest to the input), so placing the conversion after normalization gives this final execution order: input → i420_to_yuv → yuv_to_rgb → normalization → network. So your .alls would look like:

normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
i420_to_yuv_layer, yuv_to_rgb_layer = input_conversion(i420_to_rgb)

change_output_activation(conv42, sigmoid)
change_output_activation(conv53, sigmoid)
change_output_activation(conv63, sigmoid)

model_optimization_flavor(optimization_level=3)

nms_postprocess("yolov8s_nms_config.json", meta_arch=yolov8, engine=cpu)
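
The ordering rule described above can be illustrated with a toy model in Python. This is not DFC code: the list-based pipeline and the apply_commands helper are hypothetical, and the sketch only assumes what was stated above, namely that each new command is inserted directly after the input.

```python
def apply_commands(commands):
    """Toy model of the ordering rule: every command's layers are
    inserted right after 'input', ahead of anything added earlier."""
    pipeline = ["input", "network"]
    for layers in commands:
        # Slice assignment keeps the layers of one command in their own order.
        pipeline[1:1] = layers
    return pipeline


# Commands in the order they appear in the .alls file (names are illustrative).
script_order = [
    ["normalization"],
    ["i420_to_yuv", "yuv_to_rgb"],
]

pipeline = apply_commands(script_order)
print(" -> ".join(pipeline))
```

Under this model, listing the conversion after normalization places it in front of normalization at run time, so the network still sees normalized RGB.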

If it turns out your Picamera2 lores stream is NV12 (semi-planar) rather than I420 (planar), swap i420_to_rgb for nv12_to_rgb. You can keep your existing RGB calibration images since the format conversion part is handled separately from the optimization process. After recompiling you should be able to feed the raw YUV420 buffer from picam2.capture_array("lores") directly without cv2.cvtColor.
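
If you are unsure which layout you have, the difference between I420 and NV12 can be sketched with NumPy. This is only an illustration on a synthetic frame: split_i420 and split_nv12 are hypothetical helpers, and both assume a tightly packed buffer with no row padding (stride equal to width).

```python
import numpy as np


def split_i420(buf, width, height):
    """I420 (planar): full Y plane, then the whole U plane, then the whole V plane."""
    flat = np.asarray(buf).reshape(-1)
    y_size = width * height
    c_size = y_size // 4  # each chroma plane is subsampled 2x2
    y = flat[:y_size].reshape(height, width)
    u = flat[y_size:y_size + c_size].reshape(height // 2, width // 2)
    v = flat[y_size + c_size:y_size + 2 * c_size].reshape(height // 2, width // 2)
    return y, u, v


def split_nv12(buf, width, height):
    """NV12 (semi-planar): full Y plane, then one plane of interleaved U/V pairs."""
    flat = np.asarray(buf).reshape(-1)
    y_size = width * height
    y = flat[:y_size].reshape(height, width)
    uv = flat[y_size:y_size + y_size // 2].reshape(height // 2, width // 2, 2)
    return y, uv[..., 0], uv[..., 1]


# Synthetic 8x4 I420 frame with constant planes: Y=50, U=100, V=200.
width, height = 8, 4
i420 = np.concatenate([
    np.full(width * height, 50, dtype=np.uint8),
    np.full(width * height // 4, 100, dtype=np.uint8),
    np.full(width * height // 4, 200, dtype=np.uint8),
]).reshape(height * 3 // 2, width)

y, u, v = split_i420(i420, width, height)
_, u_wrong, v_wrong = split_nv12(i420, width, height)

print("I420 read as I420: U is uniform ->", bool((u == 100).all()))
print("I420 read as NV12: U is uniform ->", bool((u_wrong == 100).all()))
```

Reading the buffer with the wrong layout mixes U and V samples, which on a real frame typically shows up as wrong colors while the luma looks fine.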

Hi @Michael
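Thanks for your recommendations, it worked. I just had to reshape the buffer, because Picamera2 outputs lores YUV420 as a planar 2D array of shape (960, 640) for a 640x640 frame, while Hailo expects the I420 input as a 3D tensor of shape (320, 640, 3). Both shapes hold the same number of bytes (960 × 640 = 320 × 640 × 3 = 614,400).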
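
For anyone following along, a minimal sketch of that shape conversion is below. It assumes a plain reshape of the same bytes is all that is needed (the element counts match) and that the array has no row padding. The yuv420_to_hailo_input helper is a hypothetical name, and the synthetic array stands in for the result of picam2.capture_array("lores").

```python
import numpy as np


def yuv420_to_hailo_input(frame):
    """Reinterpret a planar YUV420 array of shape (H*3/2, W) as (H/2, W, 3).

    Only the shape changes; the byte order of the buffer is left untouched.
    """
    rows, width = frame.shape
    if rows % 3 != 0:
        raise ValueError("expected a YUV420 array with H*3/2 rows")
    height = rows * 2 // 3
    # ascontiguousarray guards against strided views before reshaping.
    return np.ascontiguousarray(frame).reshape(height // 2, width, 3)


# Stand-in for picam2.capture_array("lores") on a 640x640 lores stream.
frame = (np.arange(960 * 640) % 256).astype(np.uint8).reshape(960, 640)

hailo_input = yuv420_to_hailo_input(frame)
print(frame.shape, "->", hailo_input.shape)
```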
