I am currently running with one camera.
I replaced the Queue with a Pipe to improve speed, and the timing improved somewhat.
Post-processing also takes a long time, about 20-25 ms.
Can you recommend something to improve performance? Is there a smart way to handle the post-processing step to reduce its time as well?
I am using YOLOv8 pose to detect an object and its 2 keypoints.
Thank you for sharing your setup details. To improve inference performance for your dual camera feed setup on the Raspberry Pi 5 with Hailo-8L, consider the following optimizations:
Parallel Processing:
Continue using multiprocessing.Pipe for efficient inter-process communication.
Implement asynchronous inference (HailoAsyncInference) to process both camera feeds concurrently.
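As a rough illustration of the Pipe suggestion: when the consumer is slower than the camera, one common pattern is to drain the pipe and keep only the newest frame. The sketch below uses only the standard library. The helper name recv_latest and the dictionary payload are hypothetical, and whether dropping stale frames is acceptable depends on your application.

```python
import multiprocessing as mp


def recv_latest(conn, timeout=0.0):
    """Return the newest message waiting on conn, discarding older ones.

    Returns None if nothing arrived within `timeout` seconds.
    """
    latest = None
    wait = timeout
    # poll() waits up to `wait` seconds for the first message only;
    # every later poll is non-blocking so the loop just drains the backlog.
    while conn.poll(wait):
        latest = conn.recv()
        wait = 0.0
    return latest


# In-process demonstration with small stand-in payloads (hypothetical data).
recv_end, send_end = mp.Pipe(duplex=False)  # one-way pipe: (receiver, sender)
for frame_id in range(3):
    send_end.send({"frame_id": frame_id})

newest = recv_latest(recv_end)
print(newest)
```

In a real setup the sending end would live in the capture process and the receiving end in the inference process. Real frames are much larger than these payloads, so the sender can block once the pipe buffer fills if the receiver is not reading.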
Streamline Post-Processing:
Leverage Hailo’s built-in post-processing (HailoRT-pp) to offload tasks like NMS to the Hailo-8L chip.
Optimize NMS by adjusting IoU and confidence score thresholds.
If applicable, implement batch processing for frames from both cameras.
Tailor your keypoint detection to process only the required 2 keypoints instead of the default 17.
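If NMS stays on the CPU, most of the saving usually comes from discarding low-confidence candidates before doing anything else, so that NMS and keypoint decoding only touch a handful of boxes. Here is a minimal NumPy sketch of that idea. The function name, the (x1, y1, x2, y2) box layout, and the threshold values are assumptions for illustration, not the format the Hailo example actually produces.

```python
import numpy as np


def filter_then_nms(boxes, scores, conf_thres=0.5, iou_thres=0.45):
    """Greedy NMS on boxes that pass the confidence threshold.

    boxes:  (N, 4) array of x1, y1, x2, y2 (assumed layout)
    scores: (N,) array of confidences
    Returns the indices of the kept boxes, best score first.
    """
    # Drop low-confidence candidates first so NMS runs on few boxes.
    idx = np.flatnonzero(scores >= conf_thres)
    # Order the survivors by descending score.
    idx = idx[np.argsort(-scores[idx])]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])

    keep = []
    while idx.size:
        best, rest = idx[0], idx[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter + 1e-9)
        # Keep only boxes that do not overlap the best box too much.
        idx = rest[iou <= iou_thres]
    return keep


# Made-up candidates: two overlapping boxes, one separate, one low-confidence.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11],
                  [50, 50, 60, 60], [0, 0, 10, 10]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7, 0.1], dtype=np.float32)
keep = filter_then_nms(boxes, scores)
print(keep)
```

The 2 keypoints then only need to be decoded for the indices in keep rather than for every raw candidate.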
Model Optimization:
Consider using a smaller model variant (e.g., YOLOv8n instead of YOLOv8m) if high precision isn’t crucial.
Experiment with reduced input image resolution to decrease processing time.
Efficient Pre-Processing:
Use OpenCV with NumPy for faster image resizing and normalization.
If possible, resize frames once and cache them for subsequent use.
Utilize Profiling Tools:
Use the Hailo Profiler, which ships with the Hailo Dataflow Compiler, to identify bottlenecks in your pipeline.
By implementing these optimizations, you should see improvements in both inference and post-processing performance. The key is to balance parallel processing, efficient pre/post-processing, and appropriate model selection for your specific use case.
If you need more details on any of these suggestions or have questions about implementation, please don’t hesitate to ask.
I would like to know more about HailoRT-pp. I am not sure how to set it up, or where I need to configure NMS to run on the chip.
Can you either share a resource or explain it here?
@omria
I have another issue: my application crashes after some time.
Its memory consumption keeps increasing, and that eventually causes the crash.
When I comment out the Hailo inference code, memory stays stable, so the problem does not seem to be in my own code. It looks like an issue with how frames are handled on the Hailo side. I am just using the basic example (with 2 cameras) with HailoAsyncInference.
Is there a minimum memory requirement on the device? I am using an RPi 5 4GB.
I understand you’re experiencing memory consumption issues with your application during inference. This could be due to suboptimal resource management. Here are some suggestions to address the problem:
Memory Cleanup:
Ensure you’re freeing up resources after each inference cycle. Clear unnecessary frames or results when they’re no longer needed. Consider using Python’s garbage collector:
import gc
gc.collect()
HailoAsyncInference Management:
If you’re using HailoAsyncInference, verify that you’re handling inference objects efficiently within your loop. Avoid continuously creating new ones, as this could lead to memory leaks.
Memory Profiling:
Use a tool like memory_profiler or tracemalloc to pinpoint where memory usage is increasing. This will help you determine if the issue lies in inference handling, post-processing, or elsewhere.
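A minimal tracemalloc sketch of this, comparing two snapshots to see which source lines grew the most. The leaked_frames list is a hypothetical stand-in for results that are never released. It is not taken from the Hailo example.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size changed the most."""
    return after.compare_to(before, "lineno")[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

leaked_frames = []  # hypothetical: buffers that are kept and never freed
for _ in range(100):
    leaked_frames.append(bytearray(100_000))  # simulate one retained frame

after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
for stat in growth:
    print(stat)  # file:line, size difference, and block count difference
tracemalloc.stop()
```

Note that tracemalloc only sees memory allocated through Python. If the growth happens inside a native library, it will not show up here, and watching the process's resident memory over time is the more telling signal.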
Hardware Considerations:
While a Raspberry Pi 5 with 4GB RAM should suffice for basic inference tasks, large models or high input resolutions might overwhelm it. Consider reducing input resolution or using a lighter model (e.g., YOLOv8n) to decrease memory usage.
Batch Processing:
If you’re running continuous inference, try adding delays between cycles or processing in batches to allow memory to stabilize.
Implementing these strategies should help you manage memory more effectively and prevent application crashes. Please let me know if you need any clarification or additional assistance.
@omria
I think I figured out the issue. I am doing batch processing, and when I increase the batch size to more than 1, memory starts increasing.
I do not have the Raspberry Pi 5 8GB variant yet, but I will try it on 8GB soon and confirm.
@omria
Is the FPS shown in the GStreamer examples (GitHub - hailo-ai/hailo-rpi5-examples) calculated including the post-processing step, or only from the Hailo inference time?
The FPS shown in the GStreamer examples in hailo-ai/hailo-rpi5-examples on GitHub is typically calculated from the inference timing of the Hailo device itself and does not include the post-processing steps. To measure end-to-end performance, including post-processing, you would need to add the time taken by those operations.
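To see the difference in your own application, you can time the two stages separately. The sketch below is one way to do that with time.perf_counter. The infer and postprocess callables are placeholders (here simulated with sleeps), and the loop is strictly sequential, so it does not model a pipelined or asynchronous setup where the stages overlap.

```python
import time


def measure_stage_times(frames, infer, postprocess):
    """Time inference and post-processing separately over `frames`."""
    infer_s = 0.0
    post_s = 0.0
    for frame in frames:
        t0 = time.perf_counter()
        raw = infer(frame)
        t1 = time.perf_counter()
        postprocess(raw)
        t2 = time.perf_counter()
        infer_s += t1 - t0
        post_s += t2 - t1
    n = len(frames)
    return {
        "infer_ms": 1000.0 * infer_s / n,
        "post_ms": 1000.0 * post_s / n,
        "fps_infer_only": n / infer_s,
        "fps_end_to_end": n / (infer_s + post_s),
    }


# Placeholder stages: the sleeps stand in for real work (made-up durations).
def fake_infer(frame):
    time.sleep(0.002)
    return frame


def fake_postprocess(raw):
    time.sleep(0.005)


stats = measure_stage_times(list(range(5)), fake_infer, fake_postprocess)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

With a post-processing step in the 20-25 ms range, as reported earlier in this thread, the end-to-end figure will be well below an inference-only FPS number.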