Help on postprocessing of inference on board

Hi
I’m using the Python API for running inference on a Hailo8 board.
I have quantized my network and created the hef file. Now I’m trying to run the inference.

I’m using the Python inference example from the tutorial. When I run it on an input dataset of 25000 images, I get as a result an array of 25000 arrays of 1000 integers (classification network trained on ImageNet-1k).

What is the meaning of this array of 1000 numbers? Is it just the classes ordered by decreasing inference probability?

When I count the correctly classified images, I only get around 30 out of 25000, which is very low considering that the float32 model reaches around 70%. So there must be something wrong with my script.

Maybe the dataset is not the right one? Should I use the original dataset or the normalized dataset, or should I divide the images by 255?

Thanks for your help.

Answering my own question: I found the information in the HailoRT user guide.
I needed to use the class hailo_platform.pyhailort.pyhailort.InferVStreams, described on page 281. Its infer method returns:

Output tensors of all output layers. The keys are outputs names and the values are output data tensors as numpy.ndarray (or list of numpy.ndarray in case of nms output and tf_nms_format=False).
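In other words, the 1000 values are per-class scores, not a sorted list of classes: the index of the largest value is the predicted class. Here is a minimal sketch of the returned structure (no board needed; the layer name and the fake score array are stand-ins for the real infer() result):

```python
import numpy as np

# Fake infer() result: a dict keyed by output vstream name, whose value
# is an array shaped (num_images, num_classes). Name is hypothetical.
infer_results = {"model/softmax1": np.random.rand(4, 1000)}  # 4 images for brevity

for name, tensor in infer_results.items():
    print(name, tensor.shape)      # model/softmax1 (4, 1000)
    # argmax over the class axis gives one predicted class per image
    preds = tensor.argmax(axis=1)
    print(preds.shape)             # (4,)
```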

The example Python Inference Tutorial - Single Model on page 64 provides code to obtain the inference results:

# Infer
with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
    input_data = {input_vstream_info.name: dataset}
    with network_group.activate(network_group_params):
        infer_results = infer_pipeline.infer(input_data)

From that, I could obtain the Top1 and Top5 classification results, together with an FPS estimate, using:

# The output tensor for this layer is infer_results[output_vstream_info.name]
nbok1 = 0
nbok5 = 0
for i, output in enumerate(infer_results[output_vstream_info.name]):
    gt_class = labels[GT_folders[i]]        # ground-truth class index
    pred_top1 = np.argmax(output)           # class with the highest score
    pred_top5 = np.argsort(-output)[:5]     # 5 classes with the highest scores
    if gt_class == pred_top1:
        nbok1 += 1
    if gt_class in pred_top5:
        nbok5 += 1
top1 = 100 * nbok1 / len(dataset)
top5 = 100 * nbok5 / len(dataset)
FPS = len(dataset) / infer_pipeline.get_hw_time()
print(f'{hef_path[:-4]}\tTop1 {nbok1}/{len(dataset)} = {top1}%\t'
      f'Top5 {nbok5}/{len(dataset)} = {top5}%\tFPS = {FPS}\tDuration: {duration:.2f} seconds')
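As a cross-check, the same Top-1/Top-5 bookkeeping can be done vectorized with NumPy instead of a Python loop; `scores` and `gt` below are random stand-ins for the real output tensor and the ground-truth label array:

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((100, 1000))            # stand-in for infer_results[...]
gt = rng.integers(0, 1000, size=100)        # stand-in ground-truth classes

top1_pred = scores.argmax(axis=1)           # (N,) best class per image
top5_pred = np.argsort(-scores, axis=1)[:, :5]  # (N, 5) best 5 classes per image

top1_acc = 100 * np.mean(top1_pred == gt)
top5_acc = 100 * np.mean((top5_pred == gt[:, None]).any(axis=1))
print(f"Top1 {top1_acc:.1f}%  Top5 {top5_acc:.1f}%")
```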