DMABUF on Raspberry Pi 5 + Hailo-8

How can I get “zero copy” using DMABUF end to end on the Raspberry Pi 5?

What I have found so far:

  1. The camera path is libcamera, not plain V4L2 exposing memory:DMABuf to applications
    • libcamerasrc generally doesn’t advertise video/x-raw(memory:DMABuf) downstream.

    • Attempts with caps such as '... (memory:DMABuf) ...' failed to negotiate; GStreamer stayed in system memory.

    • v4l2src with /dev/video0 can’t deliver NV12 DMABUF either (that node isn’t a conventional YUV capture node).

  2. Hailo GStreamer elements expect CPU-accessible buffers
    • hailonet doesn’t import the camera’s DMABUF FDs as input tensors.

    • Even if the camera delivered DMABUF caps, hailonet still wants a CPU buffer it can DMA from.

  3. HailoRT inherently does one DMA into the device
    • There is no public API to feed the camera’s DMABUF FD straight into Hailo SRAM/DDR.

    • One host→device DMA is unavoidable (that’s normal). The goal is to avoid extra CPU copies before that.
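
For point 1, one way to confirm whether a pipeline really negotiated DMABuf memory is to look at the caps printed by `gst-launch-1.0 -v` and check the feature list in parentheses after the media type. The helper below is a minimal sketch that only parses such a caps string; the function names are hypothetical and the example caps strings are illustrative, not captured from a real run.

```python
import re


def caps_features(caps: str) -> set:
    """Return the caps features of the first structure in a GStreamer caps string.

    A structure looks like: media/type(feature1, feature2), field=value, ...
    """
    match = re.match(r"\s*[\w/.+-]+\s*\(([^)]*)\)", caps)
    if not match:
        # No parenthesised feature list right after the media type:
        # the buffers are in plain system memory.
        return set()
    return {f.strip() for f in match.group(1).split(",") if f.strip()}


def is_dmabuf(caps: str) -> bool:
    """True if the caps string carries the memory:DMABuf feature."""
    return "memory:DMABuf" in caps_features(caps)


if __name__ == "__main__":
    # Illustrative strings only (assumed, not real pipeline output).
    print(is_dmabuf("video/x-raw(memory:DMABuf), format=(string)NV12"))
    print(is_dmabuf("video/x-raw, format=(string)NV12, width=(int)640"))
```

If the check returns False for the link feeding the inference element, the buffers on that link are ordinary system memory, which matches what is described above.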

Is there any way to feed the camera’s DMABUF straight into the Hailo-8 without passing through the CPU?

Have you found the solution? I’m looking for it too.

Hey @Frenki, @Thierry_Chantry

Unfortunately not. HailoRT needs at least one DMA copy from the host (CPU memory) to the device. That’s by design, and there’s no way around it. It can’t import camera DMABUFs directly as input tensors.

Still working on it. I got stuck on these errors:

[HailoRT] [error] CHECK failed - Manually activate a core-op is not allowed when the core-op scheduler is active!
[HailoRT] [error] Failed activate HAILO_INVALID_OPERATION(6)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[fatal] could not init scheduler runner

Will let you know in a few days.

It would be nice if Hailo provided HAR files in the model zoo so we could recompile the models for NV12-to-RGB input.

Hey @Frenki,

You can download the ONNX from the model zoo, and the YAML and .alls files are all there, ready for you to recompile it for NV12 or RGB.

Hi all, as I understand it, the Hailo-8 does not support zero-copy by design?
Do you have any recommendations on how to design the system for low CPU usage? In my use case I have 4 streams, which currently leads to quite high CPU load.

Has this design changed on other Hailo chips?
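
To put a rough number on the multi-stream case, here is a back-of-the-envelope sketch of how much data each extra CPU-side copy moves per second. The 640x640 frame size and 30 fps are assumptions for illustration (the thread does not state them), the function names are hypothetical, and the estimate covers only raw copy traffic, not format conversion or scaling.

```python
def frame_bytes(width: int, height: int, pixel_format: str) -> int:
    """Size of one uncompressed frame in bytes."""
    if pixel_format == "NV12":
        # Full-resolution Y plane plus a half-size interleaved UV plane.
        return width * height * 3 // 2
    if pixel_format == "RGB":
        # Three bytes per pixel.
        return width * height * 3
    raise ValueError(f"unsupported pixel format: {pixel_format}")


def copy_traffic_mb_s(width: int, height: int, pixel_format: str,
                      fps: int, streams: int, cpu_copies: int) -> float:
    """Megabytes per second moved by CPU copies across all streams."""
    per_frame = frame_bytes(width, height, pixel_format)
    return per_frame * fps * streams * cpu_copies / 1e6


if __name__ == "__main__":
    # Assumed workload: 4 streams, 640x640, 30 fps, one CPU copy per frame.
    for fmt in ("NV12", "RGB"):
        mb_s = copy_traffic_mb_s(640, 640, fmt, fps=30, streams=4, cpu_copies=1)
        print(f"{fmt}: {mb_s:.1f} MB/s per CPU copy")
```

Under these assumptions, each CPU copy costs about 73.7 MB/s for NV12 and 147.5 MB/s for RGB across the 4 streams, so keeping frames in NV12 up to the host→device DMA halves the copy traffic compared with converting to RGB first.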
