Hailo8L Async Inference issue - invalid operation

Inference worked fine when the model was still a .har file. I compiled it to a .hef following the DFC tutorials, and now I get this error about the output buffer when following the HRT_0_Async_Inference_Tutorial:

Why is the output buffer 4 times smaller than expected?

/tmp/ipykernel_5482/2010725074.py:23: RuntimeWarning: invalid value encountered in cast
  buffer = np.empty(infer_model.output().shape).astype(np.uint8)
[HailoRT] [error] CHECK failed - Output buffer size 1002 is different than expected 4008 for output 'yolo11n_visdrone/yolov8_nms_postprocess'
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
---------------------------------------------------------------------------
HailoRTStatusException                    Traceback (most recent call last)
File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3286, in ConfiguredInferModel.run_async(self, bindings, callback)
   3285 with ExceptionWrapper():
-> 3286     cpp_job = self._configured_infer_model.run_async(
   3287         [b.get() for b in bindings], callback_wrapper
   3288     )
   3290 job = AsyncInferJob(cpp_job)

HailoRTStatusException: 6

The above exception was the direct cause of the following exception:

HailoRTInvalidOperationException          Traceback (most recent call last)
Cell In[2], line 27
     24 bindings.output().set_buffer(buffer)
     26 # Run synchronous inference and access the output buffers
---> 27 configured_infer_model.run([bindings], timeout_ms)
     28 buffer = bindings.output().get_buffer()
     30 # Run asynchronous inference

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3236, in ConfiguredInferModel.run(self, bindings, timeout)
   3223 def run(self, bindings, timeout):
   3224     """
   3225     Launches a synchronous inference operation with the provided bindings.
   3226 
   (...)   3234         :class:`HailoRTTimeout` in case the job did not finish in the given timeout.
   3235     """
-> 3236     with ExceptionWrapper():
   3237         job = self.run_async(bindings)
   3238         job.wait(timeout)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3237, in ConfiguredInferModel.run(self, bindings, timeout)
   3224 """
   3225 Launches a synchronous inference operation with the provided bindings.
   3226 
   (...)   3234     :class:`HailoRTTimeout` in case the job did not finish in the given timeout.
   3235 """
   3236 with ExceptionWrapper():
-> 3237     job = self.run_async(bindings)
   3238     job.wait(timeout)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3285, in ConfiguredInferModel.run_async(self, bindings, callback)
   3282     # remove the buffers - they are no longer needed
   3283     self._buffer_guards.popleft()
-> 3285 with ExceptionWrapper():
   3286     cpp_job = self._configured_infer_model.run_async(
   3287         [b.get() for b in bindings], callback_wrapper
   3288     )
   3290 job = AsyncInferJob(cpp_job)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:118, in ExceptionWrapper.__exit__(self, exception_type, value, traceback)
    116 if value is not None:
    117     if exception_type is _pyhailort.HailoRTStatusException:
--> 118         self._raise_indicative_status_exception(value)
    119     else:
    120         raise

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:166, in ExceptionWrapper._raise_indicative_status_exception(self, libhailort_exception)
    164 def _raise_indicative_status_exception(self, libhailort_exception):
    165     error_code = int(libhailort_exception.args[0])
--> 166     raise self.create_exception_from_status(error_code) from libhailort_exception

HailoRTInvalidOperationException: Invalid operation. See hailort.log for more information

I checked hailort.log, and there wasn't much in there beyond what was already in the error message, but here is a snippet:

[2025-03-20 09:56:39.446] [5482] [HailoRT] [info] [vdevice.cpp:523] [create] Creating vdevice with params: device_count: 1, scheduling_algorithm: ROUND_ROBIN, multi_process_service: false
[2025-03-20 09:56:39.457] [5482] [HailoRT] [info] [device.cpp:49] [Device] OS Version: Linux 6.6.74+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.74-1+rpt1 (2025-01-27) aarch64
[2025-03-20 09:56:39.458] [5482] [HailoRT] [info] [control.cpp:108] [control__parse_identify_results] firmware_version is: 4.20.0
[2025-03-20 09:56:39.458] [5482] [HailoRT] [info] [vdevice.cpp:651] [create] VDevice Infos: 0000:01:00.0
[2025-03-20 09:56:39.501] [5482] [HailoRT] [info] [hef.cpp:1929] [get_network_group_and_network_name] No name was given. Addressing all networks of default network_group: yolo11n_visdrone
[2025-03-20 09:56:39.501] [5482] [HailoRT] [info] [hef.cpp:1929] [get_network_group_and_network_name] No name was given. Addressing all networks of default network_group: yolo11n_visdrone
[2025-03-20 09:56:39.507] [5482] [HailoRT] [info] [internal_buffer_manager.cpp:204] [print_execution_results] Planned internal buffer memory: CMA memory 0, user memory 1539584. memory to edge layer usage factor is 0.9087338
[2025-03-20 09:56:39.507] [5482] [HailoRT] [info] [internal_buffer_manager.cpp:212] [print_execution_results] Default Internal buffer planner executed successfully
[2025-03-20 09:56:39.520] [5482] [HailoRT] [info] [device_internal.cpp:57] [configure] Configuring HEF took 18.913141 milliseconds
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [vdevice.cpp:749] [configure] Configuring HEF on VDevice took 19.227162 milliseconds
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [infer_model.cpp:436] [configure] Configuring network group 'yolo11n_visdrone' with params: batch size: 0, power mode: PERFORMANCE, latency: NONE
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [multi_io_elements.cpp:756] [create] Created (AsyncHwEl)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [queue_elements.cpp:450] [create] Created (EntryPushQEl0yolo11n_visdrone/input_layer1 | timeout: 10s)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [filter_elements.cpp:101] [create] Created (PreInferEl1yolo11n_visdrone/input_layer1 | Reorder - src_order: NHWC, src_shape: (640, 640, 3), dst_order: NHCW, dst_shape: (640, 640, 3))
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [queue_elements.cpp:450] [create] Created (PushQEl1yolo11n_visdrone/input_layer1 | timeout: 10s)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [multi_io_elements.cpp:135] [create] Created (NmsPPMuxEl0YOLOV8-Post-Process | Op YOLOV8, Name: YOLOV8-Post-Process, Score threshold: 0.200, IoU threshold: 0.70, Classes: 2, Cross classes: false, NMS results order: BY_CLASS, Max bboxes per class: 100, Image height: 640, Image width: 640)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [queue_elements.cpp:942] [create] Created (MultiPushQEl0YOLOV8-Post-Process | timeout: 10s)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [edge_elements.cpp:187] [create] Created (LastAsyncEl0NmsPPMuxEl0YOLOV8-Post-Process)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] EntryPushQEl0yolo11n_visdrone/input_layer1 | inputs: user | outputs: PreInferEl1yolo11n_visdrone/input_layer1(running in thread_id: 5684)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] PreInferEl1yolo11n_visdrone/input_layer1 | inputs: EntryPushQEl0yolo11n_visdrone/input_layer1[0] | outputs: PushQEl1yolo11n_visdrone/input_layer1
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] PushQEl1yolo11n_visdrone/input_layer1 | inputs: PreInferEl1yolo11n_visdrone/input_layer1[0] | outputs: AsyncHwEl(running in thread_id: 5685)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] AsyncHwEl | inputs: PushQEl1yolo11n_visdrone/input_layer1[0] | outputs: MultiPushQEl0YOLOV8-Post-Process MultiPushQEl0YOLOV8-Post-Process MultiPushQEl0YOLOV8-Post-Process MultiPushQEl0YOLOV8-Post-Process MultiPushQEl0YOLOV8-Post-Process MultiPushQEl0YOLOV8-Post-Process
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] MultiPushQEl0YOLOV8-Post-Process | inputs: AsyncHwEl[0] AsyncHwEl[1] AsyncHwEl[2] AsyncHwEl[3] AsyncHwEl[4] AsyncHwEl[5] | outputs: NmsPPMuxEl0YOLOV8-Post-Process(running in thread_id: 5686)
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] NmsPPMuxEl0YOLOV8-Post-Process | inputs: MultiPushQEl0YOLOV8-Post-Process[0] | outputs: LastAsyncEl0NmsPPMuxEl0YOLOV8-Post-Process
[2025-03-20 09:56:39.521] [5482] [HailoRT] [info] [pipeline.cpp:891] [print_deep_description] LastAsyncEl0NmsPPMuxEl0YOLOV8-Post-Process | inputs: NmsPPMuxEl0YOLOV8-Post-Process[0] | outputs: user
[2025-03-20 09:56:39.566] [5482] [HailoRT] [info] [hef.cpp:1929] [get_network_group_and_network_name] No name was given. Addressing all networks of default network_group: yolo11n_visdrone
[2025-03-20 09:56:39.568] [5482] [HailoRT] [error] [infer_model.cpp:898] [validate_bindings] CHECK failed - Output buffer size 1002 is different than expected 4008 for output 'yolo11n_visdrone/yolov8_nms_postprocess'

Any ideas what this is down to? The code from the tutorial is here:

import numpy as np
from hailo_platform import VDevice, HailoSchedulingAlgorithm
print(np.__version__)
timeout_ms = 1000
model_hef = "yolo11n_visdrone.hef"  # path to the compiled HEF (filename assumed from the log)

params = VDevice.create_params()
params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN

# The vdevice is used as a context manager ("with" statement) to ensure it's released on time.
with VDevice(params) as vdevice:

    # Create an infer model from an HEF:
    infer_model = vdevice.create_infer_model(model_hef)

    # Configure the infer model and create bindings for it
    with infer_model.configure() as configured_infer_model:
        bindings = configured_infer_model.create_bindings()

        # Set input and output buffers
        buffer = np.empty(infer_model.input().shape).astype(np.uint8)
        bindings.input().set_buffer(buffer)

        buffer = np.empty(infer_model.output().shape).astype(np.uint8)
        bindings.output().set_buffer(buffer)

        # Run synchronous inference and access the output buffers
        configured_infer_model.run([bindings], timeout_ms)
        buffer = bindings.output().get_buffer()

        # Run asynchronous inference
        job = configured_infer_model.run_async([bindings])
        job.wait(timeout_ms)

Hey @natsayin_nahin ,

The error you’re encountering:

Output buffer size 1002 is different than expected 4008 for output 'yolo11n_visdrone/yolov8_nms_postprocess'

indicates a mismatch between the output buffer size HailoRT expects and the size of the buffer your Python code actually allocates.

You’re using this line to allocate the output buffer:

buffer = np.empty(infer_model.output().shape).astype(np.uint8)

But infer_model.output().shape returns the shape in elements, not in bytes. The buffer ends up the wrong size because the output's byte size depends on its data type (e.g., float32 vs. uint8), not just on the number of elements. Here the ratio gives it away: 4008 / 1002 = 4 bytes per element, i.e., float32.

This is especially critical when using post-processed models (like NMS) that return float32 or other non-uint8 types.
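
You can also reconstruct the expected size from the pipeline description in your log (NMS results order: BY_CLASS, Classes: 2, Max bboxes per class: 100). Here is a quick sanity check, assuming the usual by-class NMS layout of one count field per class followed by 5 floats per box:

num_classes = 2              # "Classes: 2" in the log
max_bboxes_per_class = 100   # "Max bboxes per class: 100" in the log
floats_per_bbox = 5          # y_min, x_min, y_max, x_max, score
bytes_per_float32 = 4

# One count field per class, then room for up to 100 boxes per class
elements = num_classes * (1 + max_bboxes_per_class * floats_per_bbox)
print(elements)                      # 1002 -> your uint8 buffer's size in bytes
print(elements * bytes_per_float32)  # 4008 -> the byte size HailoRT expects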

Use .shape and .dtype together to allocate a buffer of the correct size:

output_info = infer_model.output()
output_shape = output_info.shape
output_dtype = output_info.dtype

# Allocate the buffer with the matching element type, so its byte size is number of elements x element size
buffer = np.empty(output_shape, dtype=output_dtype)
bindings.output().set_buffer(buffer)

This ensures the buffer has the correct size in bytes, avoiding the error. Passing the dtype to np.empty directly also gets rid of the RuntimeWarning you saw: np.empty without a dtype returns uninitialized float64 values, some of which are invalid when cast to uint8.
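
Putting it together, here is a minimal sketch of the tutorial flow with the corrected allocation (the HEF filename is an assumption based on your log; np.float32 matches the 4-byte element size worked out above, and infer_model.output().dtype should give the same result):

import numpy as np
from hailo_platform import VDevice, HailoSchedulingAlgorithm

model_hef = "yolo11n_visdrone.hef"  # assumed path to your compiled HEF
timeout_ms = 1000

params = VDevice.create_params()
params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN

with VDevice(params) as vdevice:
    infer_model = vdevice.create_infer_model(model_hef)

    with infer_model.configure() as configured_infer_model:
        bindings = configured_infer_model.create_bindings()

        # Input: uint8 NHWC frame, as in the tutorial
        input_buffer = np.empty(infer_model.input().shape, dtype=np.uint8)
        bindings.input().set_buffer(input_buffer)

        # Output: float32 (1002 elements x 4 bytes = 4008 bytes), not uint8
        output_buffer = np.empty(infer_model.output().shape, dtype=np.float32)
        bindings.output().set_buffer(output_buffer)

        # Synchronous inference
        configured_infer_model.run([bindings], timeout_ms)
        result = bindings.output().get_buffer()

        # Asynchronous inference
        job = configured_infer_model.run_async([bindings])
        job.wait(timeout_ms)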
