How do I increase the max queue size for HailoRT?

When I queue up too many images, I hit an error about the queue being full:

---------------------------------------------------------------------------
HailoRTStatusException                    Traceback (most recent call last)
File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3282, in ConfiguredInferModel.run_async(self, bindings, callback)
   3281 with ExceptionWrapper():
-> 3282     cpp_job = self._configured_infer_model.run_async(
   3283         [b.get() for b in bindings], callback_wrapper
   3284     )
   3286 job = AsyncInferJob(cpp_job)

HailoRTStatusException: 82

The above exception was the direct cause of the following exception:

HailoRTException                          Traceback (most recent call last)
Cell In[25], line 110
    107 clip_enqueue_thread.start()
    108 clip_process_thread.start()
--> 110 hailo_clip_inference.run(input_queue=clip_input_queue, output_queue=clip_output_queue)
    112 clip_enqueue_thread.join()
    113 clip_output_queue.put(None)  # Signal process thread to exit

File ~/Desktop/hailo-CLIP/hailo_utils.py:144, in HailoAsyncInference.run(self, input_queue, output_queue)
    141         bindings_list.append(bindings)
    143     configured_infer_model.wait_for_async_ready(timeout_ms=10000)
--> 144     job = configured_infer_model.run_async(
    145         bindings_list, partial(
    146             self.callback,
    147             output_queue=output_queue,
    148             batch_data=batch_data,
    149             original_images=original_images,
    150             metadata=metadata,
    151             bindings_list=bindings_list
    152         )
    153     )
    154 job.wait(10000)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3281, in ConfiguredInferModel.run_async(self, bindings, callback)
   3278     # remove the buffers - they are no longer needed
   3279     self._buffer_guards.popleft()
-> 3281 with ExceptionWrapper():
   3282     cpp_job = self._configured_infer_model.run_async(
   3283         [b.get() for b in bindings], callback_wrapper
   3284     )
   3286 job = AsyncInferJob(cpp_job)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:110, in ExceptionWrapper.__exit__(self, exception_type, value, traceback)
    108 if value is not None:
    109     if exception_type is _pyhailort.HailoRTStatusException:
--> 110         self._raise_indicative_status_exception(value)
    111     else:
    112         raise

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:155, in ExceptionWrapper._raise_indicative_status_exception(self, libhailort_exception)
    153 def _raise_indicative_status_exception(self, libhailort_exception):
    154     error_code = int(libhailort_exception.args[0])
--> 155     raise self.create_exception_from_status(error_code) from libhailort_exception

HailoRTException: libhailort failed with error: 82 (HAILO_QUEUE_IS_FULL)

Is there a way to set the queue size to be larger?

Hey @tarmily.wen

The HAILO_QUEUE_IS_FULL error (status 82) is raised when the input queue for asynchronous inference reaches its capacity, i.e. more run_async jobs are outstanding than the queue can hold. The fix is to increase the queue size when configuring the model:

Implementation:

from hailo_platform import HEF, Device, VDevice, ConfigureParams, HailoStreamInterface

# Set up device
devices = Device.scan()
with VDevice(device_ids=[devices[0]]) as target:
    hef = HEF("your_model.hef")
    
    # Configure with larger queue sizes
    configure_params = ConfigureParams.create_from_hef(
        hef,
        interface=HailoStreamInterface.PCIe,
        input_queue_size=10,  # Adjustable
        output_queue_size=10  # Adjustable
    )
    
    network_group = target.configure(hef, configure_params)[0]
    with network_group.activate():
        pass  # Run inference here

Key Recommendations:

  • Start with queue sizes of 10-20 and increase only if you still hit HAILO_QUEUE_IS_FULL
  • Monitor host memory usage when increasing queue sizes, since every queued entry keeps its buffers alive
  • Keep queue sizes in line with how many frames your pipeline can realistically have in flight on your hardware

This should resolve the queue full error while maintaining stable inference performance.
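
Independently of the queue size, HAILO_QUEUE_IS_FULL can also be avoided by throttling how fast jobs are submitted. Below is a minimal sketch of that pattern using only the calls already visible in your traceback (wait_for_async_ready, run_async, job.wait); the batches/build_bindings helpers and the callback signature are placeholders, not HailoRT APIs:

def on_done(completion_info=None, **kwargs):
    # Placeholder completion callback; inspect completion_info for per-job
    # errors in real code (exact signature depends on your HailoRT version).
    pass

for batch_data in batches:                      # however you iterate your input
    bindings_list = build_bindings(batch_data)  # your existing bindings setup
    # Block until the device can accept another async job instead of letting
    # submissions pile up past the internal queue (this is what raises error 82).
    configured_infer_model.wait_for_async_ready(timeout_ms=10000)
    job = configured_infer_model.run_async(bindings_list, on_done)

# Wait for the last submitted job so all buffers are flushed before teardown.
job.wait(10000)

If the error still appears with this pattern in place, it usually means frames are produced faster than the device can consume them, and a larger queue only delays the problem.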

I'm sorry, but can you provide a little more example code showing how to run inference with the “with network_group.activate()” context manager? Currently I run inference with a ConfiguredInferModel from hailo_platform.pyhailort.pyhailort.InferModel.configure().

Hey @tarmily.wen,

Here’s a fuller example of running inference inside the network_group.activate() context manager:

import numpy as np
from hailo_platform import (
    HEF, VDevice, HailoStreamInterface, InferVStreams,
    ConfigureParams, InputVStreamParams, OutputVStreamParams, FormatType
)

def run_inference(hef_path, num_images=10):
    # Load model
    hef = HEF(hef_path)
    
    # Setup device and configure model
    with VDevice() as device:
        # Configure network
        config_params = ConfigureParams.create_from_hef(
            hef=hef, 
            interface=HailoStreamInterface.PCIe
        )
        network_group = device.configure(hef, config_params)[0]
        
        # Setup streams
        input_params = InputVStreamParams.make(
            network_group, 
            format_type=FormatType.FLOAT32
        )
        output_params = OutputVStreamParams.make(
            network_group, 
            format_type=FormatType.UINT8
        )
        
        # Prepare input data
        input_info = hef.get_input_vstream_infos()[0]
        h, w, c = input_info.shape
        dataset = np.random.rand(num_images, h, w, c).astype(np.float32)
        
        # Run inference
        with network_group.activate():
            with InferVStreams(network_group, input_params, output_params) as pipeline:
                results = pipeline.infer({input_info.name: dataset})
                return results

# Example usage
results = run_inference('path/to/your_model.hef')
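
The results object returned by InferVStreams.infer() is a dict keyed by output vstream name, with one numpy array per output stacked over the batch, so a quick way to inspect what came back is:

# Each entry maps an output vstream name to a numpy array of results.
for name, outputs in results.items():
    print(name, outputs.shape)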

Key components:

  1. Load HEF model
  2. Configure device and network
  3. Set up input/output streams
  4. Prepare data
  5. Run inference using context managers
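
Since you mentioned you currently use a ConfiguredInferModel from InferModel.configure(), here is a rough sketch of how that async flow usually fits together, as an alternative to the activate()-based pipeline above. Treat it as a sketch under assumptions: it assumes your HailoRT version exposes VDevice.create_infer_model(), bindings via create_bindings()/set_buffer()/get_buffer(), a single input and a single output stream, and a float32 output buffer; the run_async/wait_for_async_ready/job.wait calls are the ones already visible in your traceback, and the callback signature is a placeholder:

import numpy as np
from hailo_platform import VDevice

def run_async_inference(hef_path, num_images=10):
    with VDevice() as vdevice:
        # Build the InferModel straight from the HEF (no activate() needed here).
        infer_model = vdevice.create_infer_model(hef_path)

        with infer_model.configure() as configured:
            def _done(completion_info=None, **kwargs):
                # Placeholder callback; check completion_info for errors in real code.
                pass

            jobs, bindings_list = [], []
            h, w, c = infer_model.input().shape
            for _ in range(num_images):
                frame = np.random.rand(h, w, c).astype(np.float32)

                # One bindings object per frame: attach input and output buffers.
                bindings = configured.create_bindings()
                bindings.input().set_buffer(frame)
                bindings.output().set_buffer(
                    np.empty(infer_model.output().shape, dtype=np.float32)
                )
                bindings_list.append(bindings)

                # Throttle submissions so the internal queue cannot overflow.
                configured.wait_for_async_ready(timeout_ms=10000)
                jobs.append(configured.run_async([bindings], _done))

            for job in jobs:
                job.wait(10000)

            # Each output buffer now holds one result.
            return [b.output().get_buffer() for b in bindings_list]

If your model has multiple inputs or outputs, the per-stream bindings are typically addressed by name (check the pyhailort docs for your exact version).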

For more examples, see runtime/python/utils.py in the hailo-ai/Hailo-Application-Code-Examples repository on GitHub.