Working python hailo inference

I am trying to understand how to use HailoRT to run inference on pretrained models with a Hailo-8L on my Raspberry Pi 5.
While I did see the Raspberry Pi 5 examples, they are quite complicated (streaming, etc.), and I want to start with a simple example where I take a model and an image, pass the image through the model, and see the output.

I did find something like working example code in a few places (including via the hailo tutorial command), but I couldn’t get a real prediction out of it.

# running on HailoRT v4.19.0, Raspberry Pi 5 AI HAT (Hailo8, python 3.10)
import numpy as np
import hailo_platform as hpf

hef_path = './retinaface_mobilenet_v1.hef'

hef = hpf.HEF(hef_path)

with hpf.VDevice() as target:
    configure_params = hpf.ConfigureParams.create_from_hef(hef, interface=hpf.HailoStreamInterface.PCIe)
    network_group = target.configure(hef, configure_params)[0]
    network_group_params = network_group.create_params()

    input_vstream_info = hef.get_input_vstream_infos()[0]
    output_vstream_info = hef.get_output_vstream_infos()[0]

    input_vstreams_params = hpf.InputVStreamParams.make_from_network_group(network_group, quantized=False, format_type=hpf.FormatType.FLOAT32)
    output_vstreams_params = hpf.OutputVStreamParams.make_from_network_group(network_group, quantized=False, format_type=hpf.FormatType.FLOAT32)

    input_shape = input_vstream_info.shape
    output_shape = output_vstream_info.shape

    print(f"Input shape: {input_shape}, Output shape: {output_shape}")

    with network_group.activate(network_group_params):
        with hpf.InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
            for _ in range(10):
                random_input = np.random.rand(*input_shape).astype(np.float32)
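                # infer() expects a dict keyed by input vstream name, with a leading batch dimension
                input_data = {input_vstream_info.name: np.expand_dims(random_input, axis=0)}
                results = infer_pipeline.infer(input_data)
                # results is a dict keyed by output vstream name; this reads only the first output
                output_data = results[output_vstream_info.name]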
                print(f"Inference output: {output_data}")

The issues I’m having are:

  1. I’m not even sure which output I should take. The example(s) use output_vstream_info = hef.get_output_vstream_infos()[0], but I don’t think that is the final layer I should actually read from. I also saw output_vstream_info = hef.get_output_vstream_infos()[-1], which made more sense since it 'feels' more like a last layer, but I really can’t tell.
  2. How do I post-process to get the wanted results? The Pi 5 Hailo examples use a compiled post-process inside the inference pipeline, but I want more control (and also to learn how things work), so I’d rather do something like the example code, where I take the inference result and process it myself.

If anyone has a working ‘real’ example (one where passing a training/real image gives results that make sense) for any model, that would be amazing.

We made a python SDK that makes working with Hailo devices simple: Simplifying Edge AI Development with DeGirum PySDK and Hailo - General - Hailo Community
Please take a look and see if you find it useful.

Hi Dvir,
The vstream_info just holds the static info of the vstreams, such as shapes. The actual output tensors (the logits) are received from the infer call.
I think your current issue lies in the fact that the simple pipelines you’ve seen use compiled networks that have the post-processing integrated in the HEF. In that case, they didn’t need to apply any bbox or NMS post-processing.
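To make the NMS step mentioned above concrete, here is a minimal NumPy sketch of greedy non-maximum suppression. It is not taken from any Hailo example: the function name, the [x1, y1, x2, y2] box layout and the 0.4 threshold are assumptions for illustration, and it expects boxes that have already been decoded from the raw network outputs.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS sketch.

    boxes: (N, 4) array of [x1, y1, x2, y2] (assumed layout).
    scores: (N,) array of confidences.
    Returns the indices of the boxes that are kept, best score first.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Intersection of the best box with every remaining box
        iw = np.maximum(0.0, np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]))
        ih = np.maximum(0.0, np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]))
        inter = iw * ih
        iou = inter / (areas[best] + areas[rest] - inter + 1e-9)
        # Drop everything that overlaps the best box too much
        order = rest[iou <= iou_threshold]
    return keep

# Two heavily overlapping boxes and one separate box
demo_boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
demo_scores = [0.9, 0.8, 0.7]
print(nms(demo_boxes, demo_scores))
```

In the demo, the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes survive.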

Try starting with one of the yolos.

While the Python SDK looks really cool and simple, the list of supported models is limited and it doesn’t have the models I’m looking for (face classification mainly, and I’m also not sure I want yolo8 for face detection).
Thanks though, I’ll keep it for future reference, maybe with more models supported I would move to it : )

I’m not sure what you mean.
I think I understand better now that all of the outputs are supposed to be used for inference, and there isn’t a single first/last layer I should use.
My problem is that the HEF model’s outputs differ from those of the original model (in the original PyTorch repo): in the PyTorch repo there are 3 outputs (bbox, classification, keypoints), but in the inference here (in the line results = infer_pipeline.infer(input_data)) I’m technically getting 9 outputs, so I’m not sure how to deal with that.
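One plausible reading of the 9 outputs is 3 heads (bbox, classification, keypoints) at each of 3 feature-map scales, which the PyTorch code would concatenate into 3 tensors. The sketch below shows how such outputs could be regrouped with NumPy. Everything specific in it is an assumption to check against the printed output shapes: a (batch, H, W, C) layout, 2 anchors per cell (so 8, 4 and 20 channels for the three heads), and the fake tensor names, which are purely hypothetical.

```python
from collections import defaultdict
import numpy as np

# Assumption: 2 anchors per cell, so 2*4 bbox, 2*2 class and 2*10 landmark channels.
HEAD_BY_CHANNELS = {8: "bbox", 4: "cls", 20: "landmarks"}

def group_outputs_by_scale(results):
    """results: dict of output name -> array shaped (batch, H, W, C).

    Returns {(H, W): {head_name: array}}, largest feature map first.
    """
    scales = defaultdict(dict)
    for name, tensor in results.items():
        _, h, w, c = tensor.shape
        head = HEAD_BY_CHANNELS.get(c, f"unknown_{name}")
        scales[(h, w)][head] = tensor
    return dict(sorted(scales.items(), key=lambda kv: -kv[0][0] * kv[0][1]))

def flatten_head(scales, head, values_per_anchor):
    """Concatenate one head across all scales into (batch, num_anchors, values_per_anchor)."""
    parts = [
        per_scale[head].reshape(per_scale[head].shape[0], -1, values_per_anchor)
        for per_scale in scales.values()
    ]
    return np.concatenate(parts, axis=1)

# Stand-in for the dict returned by infer(): 3 small scales x 3 heads = 9 tensors.
fake_results = {}
for h, w in [(8, 8), (4, 4), (2, 2)]:
    for c in (8, 4, 20):
        fake_results[f"hypothetical_out_{h}x{w}_{c}"] = np.zeros((1, h, w, c), dtype=np.float32)

scales = group_outputs_by_scale(fake_results)
bbox = flatten_head(scales, "bbox", 4)            # box regressions per anchor
cls = flatten_head(scales, "cls", 2)              # background/face scores per anchor
landmarks = flatten_head(scales, "landmarks", 10) # 5 (x, y) keypoints per anchor
print(bbox.shape, cls.shape, landmarks.shape)
```

After this regrouping, the three arrays would have the same layout as the three PyTorch outputs, so the original repo's anchor decoding and NMS could in principle be applied to them. Whether the HEF outputs are raw logits or already passed through softmax is something to verify separately.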

Hi @dvir.itzko
What models do you want the support for? Our PySDK can run with any model supported by Hailo8 devices.

Hi @shashi

Maybe I missed something, but looking at the PySDK docs and the DeGirum AI Hub models, I can’t find my models.
Currently I’m using retinaface_mobilenet_v1/scrfd_10g for face detection, and arcface_mobilefacenet for classification.

Thanks for letting us know the models you need. We will let you know as soon as these are integrated to PySDK. Also, just curious why you do not want yolov8 for face detection.

TBH it is mainly the fact that I already checked and, at least theoretically, those models suit me in terms of performance and runtime. But since you don’t have a face classification model, I’ll have to figure out how to work with Hailo models myself.

The retinaface_mobilenet_v1, scrfd_10g, scrfd_2.5g and scrfd_500m models for face detection and the arcface_mobilefacenet model for face embedding are now added to the zoo. Please try them and let us know if you need any help integrating them into your applications.