Running inference on 2 RPi 5 camera feeds

What are the best ways to run inference on the Hailo-8L with 2 camera feeds on a Raspberry Pi 5?

I tried a multiprocessing Queue to communicate with the HailoAsyncInference class implemented in the Hailo Application Examples repo.

I am currently running with one camera.
I replaced the Queue with a Pipe to improve speed, and the timing improved somewhat.

Post-processing also takes a lot of time, roughly 20-25 ms.
Can you recommend anything to improve performance? Is there a way to handle the post-processing step more intelligently to reduce its time as well?

I am using YOLOv8 pose to detect an object and its 2 keypoints.

Even small improvements can be helpful.

Hey @saurabh

Thank you for sharing your setup details. To improve inference performance for your dual camera feed setup on the Raspberry Pi 5 with Hailo-8L, consider the following optimizations:

  1. Parallel Processing:

    • Continue using multiprocessing.Pipe for efficient inter-process communication.
    • Implement asynchronous inference (HailoAsyncInference) to process both camera feeds concurrently; see the skeleton after this list.
  2. Streamline Post-Processing:

    • Leverage Hailo’s built-in post-processing (HailoRT-pp) to offload tasks like NMS to the Hailo-8L chip.
    • Optimize NMS by adjusting IoU and confidence score thresholds.
    • If applicable, implement batch processing for frames from both cameras.
    • Tailor your keypoint detection to process only the required 2 keypoints instead of the default 17 (see the slicing example after this list).
  3. Model Optimization:

    • Consider using a smaller model variant (e.g., YOLOv8n instead of YOLOv8m) if high precision isn’t crucial.
    • Experiment with reduced input image resolution to decrease processing time.
  4. Efficient Pre-Processing:

    • Use OpenCV with NumPy for faster image resizing and normalization.
    • If possible, resize frames once and cache them for subsequent use.
  5. Utilize Profiling Tools:

    • Use the Hailo Profiler from the TAPPAS suite to identify bottlenecks in your pipeline.
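
As a rough illustration of points 1 and 4, here is a minimal two-camera skeleton. It assumes camera indices 0 and 1 and a 640x640 model input, and `run_inference` is a placeholder for handing frames to HailoAsyncInference, not the actual API:

    import multiprocessing as mp

    import cv2

    MODEL_SIZE = (640, 640)  # assumed model input resolution

    def capture_proc(cam_id, conn):
        # One process per camera; pre-processing (resize, and normalization
        # if your HEF expects it) happens here, so the inference process
        # only receives ready-to-run frames.
        cap = cv2.VideoCapture(cam_id)
        try:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                conn.send((cam_id, cv2.resize(frame, MODEL_SIZE)))
        finally:
            cap.release()
            conn.close()

    def run_inference(cam_id, frame):
        # Placeholder: push the frame into your HailoAsyncInference input queue.
        pass

    def inference_proc(conns):
        # Single consumer that services both camera pipes; runs until interrupted.
        while True:
            for conn in conns:
                if conn.poll():  # non-blocking, so neither feed starves the other
                    cam_id, frame = conn.recv()
                    run_inference(cam_id, frame)

    if __name__ == "__main__":
        recv_ends, procs = [], []
        for cam_id in (0, 1):  # assumed camera indices
            recv_end, send_end = mp.Pipe(duplex=False)
            procs.append(mp.Process(target=capture_proc, args=(cam_id, send_end)))
            recv_ends.append(recv_end)
        procs.append(mp.Process(target=inference_proc, args=(recv_ends,)))
        for p in procs:
            p.start()
        for p in procs:
            p.join()

The design point is simply that resize/normalize work stays in the capture processes, and only one process ever talks to the Hailo device.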
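
And for the 2-keypoint bullet in point 2, a hedged NumPy sketch: it assumes your decoded YOLOv8-pose detections arrive as an (N, 56) array laid out as 4 box values + 1 confidence + 17 x (x, y, score), and the keypoint indices here are hypothetical, so adjust them to your model:

    import numpy as np

    KPT_IDX = np.array([5, 6])  # hypothetical indices of the 2 keypoints you need

    def slice_keypoints(dets: np.ndarray) -> np.ndarray:
        # dets: (N, 56) -> keep only the keypoint pair you actually use
        kpts = dets[:, 5:].reshape(-1, 17, 3)  # (N, 17, 3)
        return kpts[:, KPT_IDX, :]             # (N, 2, 3)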

By implementing these optimizations, you should see improvements in both inference and post-processing performance. The key is to balance parallel processing, efficient pre/post-processing, and appropriate model selection for your specific use case.

If you need more details on any of these suggestions or have questions about implementation, please don’t hesitate to ask.

@omria

I am interested in learning more about HailoRT-pp. I am not sure how to do that, or where I need to configure things to run NMS on the chip.
Can you either share a resource or explain it here?

@omria
I have another issue: my application crashes after some time.
Its memory consumption keeps increasing until the application crashes.
I commented out the Hailo inference lines, and memory is stable now. So the issue does not seem to be in my code, but in how frames are handled on the Hailo side. I am just using the basic example (with 2 cams) with HailoAsyncInference.

Is there a minimum memory requirement for the device? I am using an RPi 5 with 4 GB.

What is the issue, or do I need to do some cleanup?

To use Hailo’s built-in post-processing (HailoRT-pp) for NMS (Non-Maximum Suppression) on the Hailo-8L accelerator:

  1. HailoRT-pp Integration:

    • Use Hailo’s post-processing APIs to configure NMS on the device.
    • This offloads CPU work and can reduce bounding box filtering time.
  2. HEF Configuration:

    • Configure your HEF (Hailo Executable Format) file to include post-processing operations like NMS; a model-script sketch follows this list.
    • This enables HailoRT to handle post-processing on the hardware.
  3. Resource:

    • Refer to the Hailo TAPPAS suite for example implementations and detailed guidance on using HailoRT-pp.
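
For concreteness: if you compile your own HEF with the Dataflow Compiler, the model script (.alls) is where NMS post-processing gets declared. The line below is a hedged sketch modeled on the model-zoo scripts — the JSON config path is yours to supply, and you should verify the exact arguments against your DFC version:

    nms_postprocess("path/to/yolov8_nms_config.json", meta_arch=yolov8, engine=cpu)

With engine=cpu, HailoRT applies the NMS as part of HailoRT-pp rather than in your Python code; check the Dataflow Compiler documentation for which engines and meta-architectures (pose models in particular) are supported.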

Let me know if you need more specific information on implementation!

Best regards,
Omri

I understand you’re experiencing memory consumption issues with your application during inference. This could be due to suboptimal resource management. Here are some suggestions to address the problem:

  1. Memory Cleanup:
    Ensure you’re freeing up resources after each inference cycle. Clear unnecessary frames or results when they’re no longer needed. Consider using Python’s garbage collector:

    import gc
    gc.collect()  # force a collection pass after each inference cycle
    
  2. HailoAsyncInference Management:
    If you’re using HailoAsyncInference, verify that you’re handling inference objects efficiently within your loop. Avoid continuously creating new ones, as this could lead to memory leaks.

  3. Memory Profiling:
    Use a tool like memory_profiler or tracemalloc to pinpoint where memory usage is increasing (a minimal tracemalloc sketch follows this list). This will help you determine whether the issue lies in inference handling, post-processing, or elsewhere.

  4. Hardware Considerations:
    While a Raspberry Pi 5 with 4GB RAM should suffice for basic inference tasks, large models or high input resolutions might overwhelm it. Consider reducing input resolution or using a lighter model (e.g., YOLOv8n) to decrease memory usage.

  5. Batch Processing:
    If you’re running continuous inference, try adding delays between cycles or processing in batches to allow memory to stabilize.
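
On the profiling point, tracemalloc from the standard library is often enough to see which call sites grow between two moments in the run; a minimal, self-contained sketch (put your real inference loop where indicated):

    import tracemalloc

    tracemalloc.start()
    baseline = tracemalloc.take_snapshot()

    # ... run a few hundred inference cycles here ...

    current = tracemalloc.take_snapshot()
    # Print the ten call sites whose allocations grew the most.
    for stat in current.compare_to(baseline, "lineno")[:10]:
        print(stat)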

Implementing these strategies should help you manage memory more effectively and prevent application crashes. Please let me know if you need any clarification or additional assistance.

Best regards,
Omri

@omria
I think I figured out the issue. I am doing batch processing, and when I increase the batch size to more than 1, memory starts increasing.
I don't have the Raspberry Pi 5 8GB variant yet, but I will confirm soon by trying it on the 8GB.

@omria
Is the FPS shown in the GStreamer examples at GitHub - hailo-ai/hailo-rpi5-examples calculated including the post-processing step, or just the Hailo inference timing?

The FPS shown in the GStreamer examples at GitHub - hailo-ai/hailo-rpi5-examples is typically calculated from the inference timing on the Hailo device itself, not including the post-processing steps. If you want to measure end-to-end performance, including post-processing, you need to factor in the additional time those operations take.
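
If it helps, measuring it yourself is straightforward: time the whole per-frame path on the host. A generic sketch, where `process_frame` stands in for your capture + inference + post-processing pipeline:

    import time

    def measure_fps(process_frame, frames):
        # Times the complete per-frame path, post-processing included.
        start = time.perf_counter()
        for frame in frames:
            process_frame(frame)
        elapsed = time.perf_counter() - start
        return len(frames) / elapsed if elapsed > 0 else 0.0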

Let me know if you need help with anything else!

Thank you. Can you also help me resolve this issue: I want to build my own custom post-processing .so.

In case you need more info, I can share it.