We wanted to share a tool that simplifies edge AI development specifically for Hailo hardware—DeGirum PySDK. It’s a Python-based SDK designed to streamline the process of loading models, running inference, and visualizing results, making it easier to build and deploy AI applications on Hailo devices.
Example Repository:
We’ve created a repository that demonstrates how to use DeGirum PySDK with Hailo hardware. The examples include:
Loading pre-trained models like YOLO for classification, object detection, keypoint detection and segmentation.
Running inference on video streams, including webcam, RTSP, or video files.
Interactive widgets for dynamically selecting models and video sources.
The PySDK integrates with a Model Zoo, which provides access to a wide range of pre-trained AI models optimized for edge devices. This includes object detection models like YOLO, segmentation models like DeepLab, and more. You can quickly load models from the Model Zoo and deploy them on Hailo hardware without manual configuration, enabling faster experimentation and prototyping.
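As a concrete sketch, loading a zoo model and running it on an image looks roughly like this. The model name, zoo URL, and token below are illustrative placeholders, not guaranteed zoo entries; check the PySDK documentation for the exact `load_model` parameters in your version:

```python
import degirum as dg  # DeGirum PySDK: pip install degirum

# All names below are illustrative placeholders.
model = dg.load_model(
    model_name="yolov8n_coco--640x640_quant_hailort_hailo8_1",  # example zoo model name
    inference_host_address="@local",  # run on the locally attached Hailo device
    zoo_url="degirum/hailo",          # model zoo to pull from
    token="<your AI Hub token>",
)

result = model("cat.jpg")  # run inference on a single image
print(result)              # detection results as text
result.image_overlay       # annotated image with bounding boxes drawn
```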
Key Features:
Unified API: The PySDK offers a consistent interface for interacting with AI models, allowing you to focus on development without worrying about low-level details.
Streamlined Inference: Easily load models onto Hailo devices and run inference on images, video files, or live streams with minimal setup.
Dynamic Model Support: The SDK supports various AI tasks, including detection, segmentation, classification, and keypoints. Models can be loaded dynamically from the cloud or local resources.
Built-In Visualization: Visualize inference results directly, including bounding boxes for detections, segmentation overlays, and keypoints. This helps with debugging and rapid iteration during development.
Video Stream Processing: Use the predict_stream function to process live streams or video files in real time, with a simple and intuitive interface.
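For video sources, a minimal sketch of `predict_stream` usage might look like the following, assuming `model` was loaded with `dg.load_model(...)` as above; the companion `degirum_tools` package provides the helper:

```python
import degirum_tools  # companion package: pip install degirum_tools

# `model` is a model object loaded via dg.load_model(...).
# 0 selects the default webcam; a file path or an RTSP URL
# ("rtsp://user:pass@camera/stream") also works as the source.
for result in degirum_tools.predict_stream(model, 0):
    print(result)  # per-frame results; result.image_overlay is the annotated frame
```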
Why Use DeGirum PySDK with Hailo:
It significantly reduces the effort needed to set up and run inference workflows on Hailo devices.
The SDK provides tools to visualize and debug results, making it easier to fine-tune models and applications for edge AI.
It integrates seamlessly into Python environments like Jupyter Notebooks or standalone scripts, enabling rapid prototyping and testing.
Hello Shashi, I just found out about DeGirum and I honestly find the functionality it offers very interesting. I would like to know if it is possible to do inference with my own .hef models. Thanks!
@claudio.veas
Yes, it is possible to do inference with your own .hef models. We are currently working on a user guide to explain the procedure. Basically, it involves creating a JSON configuration file, a labels file, and an optional postprocessing file. If your model is one of the popular architectures (say YOLO, ResNet, or MobileNet), you can just copy our existing JSONs and modify them. While we work on the user guide with detailed instructions, please let us know which models you are interested in and we can help you.
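For reference, a minimal detection-model JSON might look roughly like this. Every field below reflects the general shape of the example JSONs rather than an authoritative schema: the ConfigVersion number and Checksum are placeholders, the section and field names should be diffed against a real zoo model JSON, and the file names are hypothetical:

```json
{
    "ConfigVersion": 10,
    "Checksum": "<copied from the template model>",
    "DEVICE": [
        {
            "DeviceType": "HAILO8",
            "RuntimeAgent": "HAILORT"
        }
    ],
    "PRE_PROCESS": [
        {
            "InputType": "Image",
            "InputN": 1,
            "InputH": 640,
            "InputW": 640,
            "InputC": 3
        }
    ],
    "MODEL_PARAMETERS": [
        {
            "ModelPath": "my_yolov8n.hef"
        }
    ],
    "POST_PROCESS": [
        {
            "OutputPostprocessType": "DetectionYoloV8",
            "LabelsPath": "my_yolov8n_labels.json"
        }
    ]
}
```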
In order to set it up and perform inference on the Hailo-8 locally, I also had to enable the HailoRT Multi-Process service, as per HailoRT documentation:
sudo systemctl enable --now hailort.service # for Ubuntu
This was not mentioned in your GitHub README, so please double-check whether it is necessary (if DeGirum uses the multi-process service, the user should verify that the service is enabled).
I have a question about the implementation (feel free not to answer if you are not allowed to). Thanks to its dataflow architecture, the Hailo-8 can run inference in synchronous (blocking) or asynchronous (non-blocking) mode. Especially for single-context models, asynchronous or multi-threaded inference makes the best use of the Hailo-8 NN core, achieving higher throughput.
Does DeGirum make use of this functionality?
Again, congratulations for the great tool you developed!
Hi @pierrem
Glad to hear you tried the PySDK. Thanks for pointing out the need to enable the Multi-Process Service. The multi-process service is a very well-designed feature that made our integration extremely easy. We will add it to the instructions.
Regarding the implementation, we submit multiple inference requests in parallel (to the multi-process service) to ensure the highest performance (this number can be controlled by a parameter called ThreadPackSize, and we will write a user guide soon on its usage). So, while each inference within our agent looks synchronous, from the outside it behaves like a multi-threaded asynchronous call, achieving maximum performance. One main criterion for the PySDK design was not to trade off performance for simplicity. We have extensively benchmarked the FPS achieved by PySDK against Hailo's benchmarking tool and confirmed that the numbers match.
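The pattern described above, synchronous calls inside each worker but asynchronous in aggregate, can be sketched in plain Python. This is only an illustration of the idea, not DeGirum's internal code; `thread_pack_size` here merely mirrors the role of the ThreadPackSize parameter, and `infer` stands in for a blocking call to the multi-process service:

```python
from concurrent.futures import ThreadPoolExecutor

def infer(frame):
    # Stand-in for a blocking (synchronous) inference call to the
    # HailoRT multi-process service; here it just echoes the frame id.
    return f"result-{frame}"

def run_parallel(frames, thread_pack_size=4):
    # Keep `thread_pack_size` requests in flight at once: each worker
    # blocks on its own call, so the device pipeline stays full, while
    # pool.map still returns results in input order to the caller.
    with ThreadPoolExecutor(max_workers=thread_pack_size) as pool:
        return list(pool.map(infer, frames))

print(run_parallel(range(3)))  # ['result-0', 'result-1', 'result-2']
```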
I have been able to run most of the examples (DeGirum/hailo_examples on GitHub). However, I have encountered issues when running the multithreading example. I am currently using a Hailo-8L and would like to know if you have conducted tests with this chip. I see that in this particular example, models for the Hailo-8 are used.
I have also experienced considerable decreases in inference speed when using RTSP cameras. I believe it is due to the size and quality of the video. Is it possible to modify the input size to the model to improve performance?
Hi @claudio.veas
Glad to hear you were able to run most of the examples. Please let us know what kind of issues you are encountering when using multithreading and we will investigate further.
Regarding RTSP cameras: from the image you shared, it appears that the input resolution to the model is full HD (1920x1080). If the model's input resolution is 640x640, each frame is resized before going to the model. This operation is compute intensive and, when executed on the CPU, can lead to lower FPS. One way to deal with this is to configure the RTSP stream to a lower resolution. Typically, RTSP cameras offer multiple streams: a main stream, usually used for recording, and a sub-stream that can be configured to a lower resolution (say 720p, 1280x720) and used for AI inference. Hope this helps.
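A quick back-of-the-envelope calculation shows why the sub-stream helps: the CPU-side decode and resize work scales with the number of source pixels, so dropping from 1080p to 720p cuts the per-frame pixel count by more than half:

```python
# Pixel counts per frame for the two stream resolutions.
full_hd = 1920 * 1080  # main stream: 2,073,600 pixels
sub_720 = 1280 * 720   # sub-stream:    921,600 pixels

# The resize to the model's 640x640 input reads every source pixel,
# so the sub-stream gives ~2.25x less data to decode and resize.
print(full_hd / sub_720)  # 2.25
```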
Hi @shashi
Just found your amazing PySDK. I made it through the whole process and got my .hef file from a custom-trained yolov8m_seg model. I also set up my environment so the Hailo-8L is recognized by degirum_cli. My only problem now is how to run inference with my own model to annotate images. Is the user guide coming soon, or is there anything I can try now? Thanks a lot!
Hi @calvinlo1219
Glad to hear you were able to set up the environment. If you have your own segmentation model, you can download one of our segmentation models and use its model and label files as templates to run your own model. We are still working on the user guide, but we can help you get started. Can you confirm that the output of your model is 10 tensors?
Hi @shashi
Yes, the output is 10 tensors. I just found two YOLO object detection folders in the hailo_examples repository, one for Hailo-8L and one for Hailo-8. It looks like I need two JSON files to set this up, but I can't find the segmentation one. Did I miss it somewhere?
Thanks!
Hi @calvinlo1219
We have more models in our AI Hub (https://hub.degirum.com), and you can download them when you sign up for the AI Hub. Please let me know if you encounter any issues.
Hi @shashi
Thank you for the instructions. I found the seg models on your website.
Should I change "ConfigVersion" and "Checksum"?
And should I change zoo_url to the local path of my .hef file in the yolov8.ipynb from GitHub?
Thank you so much!
Hi @calvinlo1219
You can leave ConfigVersion and Checksum unchanged. zoo_url can be your local path. Just make sure that you have a folder containing the labels file, the model JSON file, and the model .hef file with consistent naming.
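A small sanity check for that folder layout might look like the sketch below. The file names and the `_labels` suffix are hypothetical conventions chosen for illustration; only the "one folder, consistent base name" requirement comes from the reply above, so match whatever LabelsPath in your model JSON actually points to:

```python
from pathlib import Path

def check_local_zoo(folder):
    # Expect <name>.hef, <name>.json, and <name>_labels.json sharing
    # one base name. The "_labels" suffix is an assumption here.
    folder = Path(folder)
    hefs = list(folder.glob("*.hef"))
    if len(hefs) != 1:
        return False  # exactly one .hef expected per model folder
    base = hefs[0].stem
    return (folder / f"{base}.json").exists() and \
           (folder / f"{base}_labels.json").exists()
```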