I am trying to understand how to use HailoRT to run inference on pretrained models with the Hailo8L on my Raspberry Pi 5.
While I did see the Raspberry Pi 5 examples, they are quite complicated, with streaming etc., and I want to start with a simple example where I take a model and an image, pass the image through the model, and look at the output.
I did see in a few places (including via the hailo tutorial command) something like working example code, but I couldn't get a real prediction out of it.
# Running on HailoRT v4.19.0, Raspberry Pi 5 AI HAT (Hailo8, Python 3.10)
import numpy as np
import hailo_platform as hpf

hef_path = './retinaface_mobilenet_v1.hef'
hef = hpf.HEF(hef_path)

with hpf.VDevice() as target:
    # Configure the device for the PCIe interface and grab the network group
    configure_params = hpf.ConfigureParams.create_from_hef(hef, interface=hpf.HailoStreamInterface.PCIe)
    network_group = target.configure(hef, configure_params)[0]
    network_group_params = network_group.create_params()

    # Static metadata (name, shape, format) of the first input/output vstreams
    input_vstream_info = hef.get_input_vstream_infos()[0]
    output_vstream_info = hef.get_output_vstream_infos()[0]

    # Let HailoRT handle (de)quantization and expose FLOAT32 tensors
    input_vstreams_params = hpf.InputVStreamParams.make_from_network_group(
        network_group, quantized=False, format_type=hpf.FormatType.FLOAT32)
    output_vstreams_params = hpf.OutputVStreamParams.make_from_network_group(
        network_group, quantized=False, format_type=hpf.FormatType.FLOAT32)

    input_shape = input_vstream_info.shape
    output_shape = output_vstream_info.shape
    print(f"Input shape: {input_shape}, Output shape: {output_shape}")

    with network_group.activate(network_group_params):
        with hpf.InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
            for _ in range(10):
                # Dummy input: vstream shape is (H, W, C), so add a batch dimension
                random_input = np.random.rand(*input_shape).astype(np.float32)
                input_data = {input_vstream_info.name: np.expand_dims(random_input, axis=0)}
                results = infer_pipeline.infer(input_data)
                output_data = results[output_vstream_info.name]
                print(f"Inference output: {output_data}")
The issues I'm having are:
I'm not even sure which output I should take. In the example(s) they do output_vstream_info = hef.get_output_vstream_infos()[0], but I don't think that is the last layer I should actually read from. I also saw output_vstream_info = hef.get_output_vstream_infos()[-1], which 'feels' more like a last layer, but I really can't tell.
How do I post-process the raw output to get the results I want? In the Pi 5 Hailo examples they use a compiled post-process in the inference pipeline, but I want more control (and also to learn how things work), so I'd rather do something like the example code above, where I take the inference result and process it myself.
If anyone has a working 'real' example (one where, if I pass a training/real image, the results make sense) for any model, that would be amazing.
Hi Dvir,
The vstream_info objects just hold the static info of the vstreams, such as shapes etc. The actual output tensors (the logits) are received from the infer command.
I think your current issue is that the simple pipelines you've seen make use of compiled networks that have the post-processing integrated in the HEF. In that case, they don't need to apply any bbox decoding or NMS post-processing.
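For example (a minimal sketch reusing the HEF path from your snippet; the names and number of outputs depend on the specific HEF), you can compare the static vstream info with what infer() actually returns:

import hailo_platform as hpf

hef = hpf.HEF('./retinaface_mobilenet_v1.hef')

# Static metadata only: one entry per compiled output vstream
for info in hef.get_output_vstream_infos():
    print(info.name, info.shape)

# The actual tensors come back from infer(), keyed by those same names:
#   results = infer_pipeline.infer(input_data)
#   for name, tensor in results.items():
#       print(name, tensor.shape)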
While PySDK looks really cool and simple, the list of supported models is limited and it doesn't have the models I'm looking for (face classification mainly, but I'm also not sure I want YOLOv8 for face detection).
Thanks though, I'll keep it for future reference; maybe with more models supported I would move to it :)
I’m not sure what you mean.
I think I understand better now: all of the outputs are supposed to be used, and there isn't a single first/last layer I should pick.
My problem is that the HEF model's outputs are different from the original model's (in the original PyTorch repo). In the PyTorch repo there are three outputs for bbox, classification, and keypoints, but here (in the line results = infer_pipeline.infer(input_data)) I'm getting technically nine outputs, so I'm not sure how to deal with that.
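For reference, this is roughly how I'm inspecting those nine outputs at the moment. I'm assuming (not sure) that they are the bbox/class/landmark heads emitted separately at three feature-map scales, so I just group the tensors returned by infer() by their channel count:

from collections import defaultdict

def group_outputs_by_channels(results):
    # results is the dict returned by infer_pipeline.infer(input_data)
    # in my snippet above: {vstream_name: numpy array}
    groups = defaultdict(list)
    for name, tensor in results.items():
        groups[tensor.shape[-1]].append((name, tensor.shape))
    return dict(groups)

# If my assumption is right, this should show three groups of three tensors,
# one group per head type across the three scales:
# for channels, members in group_outputs_by_channels(results).items():
#     print(channels, members)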
Maybe I missed something, but looking at the PySDK docs (hailo_model_zoo.md) and at the DeGirum AI Hub models, I can't find my models.
Currently I'm using retinaface_mobilenet_v1/scrfd_10g for face detection, and arcface_mobilefacenet for classification.
Thanks
@dvir.itzko
Thanks for letting us know the models you need. We will let you know as soon as these are integrated into PySDK. Also, just curious: why do you not want YOLOv8 for face detection?
TBH it is mainly that I already checked, and at least theoretically those models suit me in terms of performance and runtime. But since you don't have a face classification model, I'll have to figure out how to work with the Hailo models myself.
@dvir.itzko
The retinaface_mobilenet_v1/scrfd_10g/scrfd_2.5g/scrfd_500m models for face detection and the arcface_mobilefacenet model for face embedding are now added to the zoo. Please try them and let us know if you need any help integrating them into your applications.
@shashi
Thanks for the reply!
So first of all, the retinaface model does work for me! That's great news. Now I'm just wondering what magic you did to post-process it. If you used the same HEF file I'm using, what was I missing?
I'm also wondering where I can get the files (like in the tutorial) so I can run it offline?
Hi @dvir.itzko
AI accelerators like the Hailo8/Hailo8L run the compute-heavy portions of ML models (like the conv layers), but the final tensor outputs need to go through a post-processor to convert them into human-readable outputs like bounding boxes, scores, and landmark coordinates. The postprocessing logic varies from model to model. For popular models like YOLOv5 and YOLOv8, the Hailo team has already integrated these postprocessors so that the final results are in the desired format. For other models, someone needs to write and integrate these postprocessors.
At DeGirum, we have developed a framework where the preprocessor, ML inference, and postprocessor are effectively pipelined to provide a seamless inference API that can take a user image and provide the final detection results. This pipeline also includes resizing the image to the size expected by the model and rescaling the outputs to the original image size. This enables end users to develop AI applications easily without writing this type of boilerplate code over and over.
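To give you an idea, loading and running one of these models with PySDK looks something like the snippet below (the model name is just a placeholder; the exact name and your access token come from the AI Hub):

import degirum as dg

model = dg.load_model(
    model_name="retinaface_mobilenet_v1",    # placeholder: copy the exact name from the AI Hub
    inference_host_address="@local",         # run on the local Hailo device
    zoo_url="degirum/hailo",                 # cloud model zoo with Hailo-compiled models
    token="<your AI Hub token>",
)

result = model("face_image.jpg")             # preprocessing, inference, and postprocessing are pipelined
print(result.results)                        # detections: bounding boxes, scores, landmarks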
I am not sure I understand what you mean by running offline. If you use our cloud model zoo and run locally, the model assets are downloaded automatically (the postprocessor Python file, the model HEF file, the labels file, and the model JSON). Alternatively, you can register for our AI Hub and manually download these files. Please let me know if you encounter any problems.
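As a rough sketch (the path and model name below are placeholders), pointing zoo_url at a local folder that contains those downloaded files would look like:

import degirum as dg

model = dg.load_model(
    model_name="retinaface_mobilenet_v1",    # placeholder: must match the downloaded model's name
    inference_host_address="@local",
    zoo_url="/home/pi/degirum_models",       # hypothetical local folder with the JSON, HEF, labels, and postprocessor
)
result = model("face_image.jpg")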
We are preparing a user guide to explain all these steps and I will keep you posted once the guide is available.
@shashi
Thanks again!
You provided exactly what I was looking for—I found the model in your AI Hub. Your infrastructure is truly impressive, but I believe the documentation and guides could use some enhancement. The information you shared here is crucial, yet I couldn’t find it anywhere else. Thanks once more for your help!
@dvir.itzko
Glad you found it useful. We are working on our documentation now and hope to release comprehensive user guides for all PySDK features by mid-February. I will keep you posted. In the meantime, please feel free to reach out if we can help in any way.