How do I increase the max queue size for HailoRT?

When I queue up too many images, I hit an error about the queue being full:

---------------------------------------------------------------------------
HailoRTStatusException                    Traceback (most recent call last)
File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3282, in ConfiguredInferModel.run_async(self, bindings, callback)
   3281 with ExceptionWrapper():
-> 3282     cpp_job = self._configured_infer_model.run_async(
   3283         [b.get() for b in bindings], callback_wrapper
   3284     )
   3286 job = AsyncInferJob(cpp_job)

HailoRTStatusException: 82

The above exception was the direct cause of the following exception:

HailoRTException                          Traceback (most recent call last)
Cell In[25], line 110
    107 clip_enqueue_thread.start()
    108 clip_process_thread.start()
--> 110 hailo_clip_inference.run(input_queue=clip_input_queue, output_queue=clip_output_queue)
    112 clip_enqueue_thread.join()
    113 clip_output_queue.put(None)  # Signal process thread to exit

File ~/Desktop/hailo-CLIP/hailo_utils.py:144, in HailoAsyncInference.run(self, input_queue, output_queue)
    141         bindings_list.append(bindings)
    143     configured_infer_model.wait_for_async_ready(timeout_ms=10000)
--> 144     job = configured_infer_model.run_async(
    145         bindings_list, partial(
    146             self.callback,
    147             output_queue=output_queue,
    148             batch_data=batch_data,
    149             original_images=original_images,
    150             metadata=metadata,
    151             bindings_list=bindings_list
    152         )
    153     )
    154 job.wait(10000)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:3281, in ConfiguredInferModel.run_async(self, bindings, callback)
   3278     # remove the buffers - they are no longer needed
   3279     self._buffer_guards.popleft()
-> 3281 with ExceptionWrapper():
   3282     cpp_job = self._configured_infer_model.run_async(
   3283         [b.get() for b in bindings], callback_wrapper
   3284     )
   3286 job = AsyncInferJob(cpp_job)

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:110, in ExceptionWrapper.__exit__(self, exception_type, value, traceback)
    108 if value is not None:
    109     if exception_type is _pyhailort.HailoRTStatusException:
--> 110         self._raise_indicative_status_exception(value)
    111     else:
    112         raise

File /usr/lib/python3/dist-packages/hailo_platform/pyhailort/pyhailort.py:155, in ExceptionWrapper._raise_indicative_status_exception(self, libhailort_exception)
    153 def _raise_indicative_status_exception(self, libhailort_exception):
    154     error_code = int(libhailort_exception.args[0])
--> 155     raise self.create_exception_from_status(error_code) from libhailort_exception

HailoRTException: libhailort failed with error: 82 (HAILO_QUEUE_IS_FULL)

Is there a way to set the queue size to be larger?

Hey @tarmily.wen

The HAILO_QUEUE_IS_FULL error (status 82) is raised when the input queue for asynchronous inference reaches its capacity, i.e. more run_async jobs are outstanding than the queue can hold. The fix is to increase the queue size when configuring the model:

Implementation:

from hailo_platform import HEF, Device, VDevice, ConfigureParams, HailoStreamInterface

# Set up device
devices = Device.scan()
with VDevice(device_ids=[devices[0]]) as target:
    hef = HEF("your_model.hef")
    
    # Configure with larger queue sizes
    configure_params = ConfigureParams.create_from_hef(
        hef,
        interface=HailoStreamInterface.PCIe,
        input_queue_size=10,  # Adjustable
        output_queue_size=10  # Adjustable
    )
    
    network_group = target.configure(hef, configure_params)[0]
    with network_group.activate():
        pass  # Run inference here

Key Recommendations:

  • Start with queue sizes of 10-20 and increase only if you still hit HAILO_QUEUE_IS_FULL
  • Monitor host memory usage when increasing queue sizes, since every queued entry keeps its buffers alive
  • Keep queue sizes in line with how many frames your pipeline can realistically have in flight on your hardware

This should resolve the queue full error while maintaining stable inference performance.
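
Independently of the queue size, HAILO_QUEUE_IS_FULL can also be avoided by throttling how fast jobs are submitted. Below is a minimal sketch of that pattern using only the calls already visible in your traceback (wait_for_async_ready, run_async, job.wait); the batches/build_bindings helpers and the callback signature are placeholders, not HailoRT APIs:

def on_done(completion_info=None, **kwargs):
    # Placeholder completion callback; inspect completion_info for per-job
    # errors in real code (exact signature depends on your HailoRT version).
    pass

for batch_data in batches:                      # however you iterate your input
    bindings_list = build_bindings(batch_data)  # your existing bindings setup
    # Block until the device can accept another async job instead of letting
    # submissions pile up past the internal queue (this is what raises error 82).
    configured_infer_model.wait_for_async_ready(timeout_ms=10000)
    job = configured_infer_model.run_async(bindings_list, on_done)

# Wait for the last submitted job so all buffers are flushed before teardown.
job.wait(10000)

If the error still appears with this pattern in place, it usually means frames are produced faster than the device can consume them, and a larger queue only delays the problem.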

I'm sorry, but can you provide a little more example code showing how to run inference with the “with network_group.activate()” context manager? Currently I run inference with a ConfiguredInferModel from hailo_platform.pyhailort.pyhailort.InferModel.configure().

Hey @tarmily.wen,

Here’s a fuller example of running inference inside the network_group.activate() context manager:

import numpy as np
from hailo_platform import (
    HEF, VDevice, HailoStreamInterface, InferVStreams,
    ConfigureParams, InputVStreamParams, OutputVStreamParams, FormatType
)

def run_inference(hef_path, num_images=10):
    # Load model
    hef = HEF(hef_path)
    
    # Setup device and configure model
    with VDevice() as device:
        # Configure network
        config_params = ConfigureParams.create_from_hef(
            hef=hef, 
            interface=HailoStreamInterface.PCIe
        )
        network_group = device.configure(hef, config_params)[0]
        
        # Setup streams
        input_params = InputVStreamParams.make(
            network_group, 
            format_type=FormatType.FLOAT32
        )
        output_params = OutputVStreamParams.make(
            network_group, 
            format_type=FormatType.UINT8
        )
        
        # Prepare input data
        input_info = hef.get_input_vstream_infos()[0]
        h, w, c = input_info.shape
        dataset = np.random.rand(num_images, h, w, c).astype(np.float32)
        
        # Run inference
        with network_group.activate():
            with InferVStreams(network_group, input_params, output_params) as pipeline:
                results = pipeline.infer({input_info.name: dataset})
                return results

# Example usage
results = run_inference('path/to/your_model.hef')
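
The results object returned by InferVStreams.infer() is a dict keyed by output vstream name, with one numpy array per output stacked over the batch, so a quick way to inspect what came back is:

# Each entry maps an output vstream name to a numpy array of results.
for name, outputs in results.items():
    print(name, outputs.shape)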

Key components:

  1. Load HEF model
  2. Configure device and network
  3. Set up input/output streams
  4. Prepare data
  5. Run inference using context managers
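
Since you mentioned you currently use a ConfiguredInferModel from InferModel.configure(), here is a rough sketch of how that async flow usually fits together, as an alternative to the activate()-based pipeline above. Treat it as a sketch under assumptions: it assumes your HailoRT version exposes VDevice.create_infer_model(), bindings via create_bindings()/set_buffer()/get_buffer(), a single input and a single output stream, and a float32 output buffer; the run_async/wait_for_async_ready/job.wait calls are the ones already visible in your traceback, and the callback signature is a placeholder:

import numpy as np
from hailo_platform import VDevice

def run_async_inference(hef_path, num_images=10):
    with VDevice() as vdevice:
        # Build the InferModel straight from the HEF (no activate() needed here).
        infer_model = vdevice.create_infer_model(hef_path)

        with infer_model.configure() as configured:
            def _done(completion_info=None, **kwargs):
                # Placeholder callback; check completion_info for errors in real code.
                pass

            jobs, bindings_list = [], []
            h, w, c = infer_model.input().shape
            for _ in range(num_images):
                frame = np.random.rand(h, w, c).astype(np.float32)

                # One bindings object per frame: attach input and output buffers.
                bindings = configured.create_bindings()
                bindings.input().set_buffer(frame)
                bindings.output().set_buffer(
                    np.empty(infer_model.output().shape, dtype=np.float32)
                )
                bindings_list.append(bindings)

                # Throttle submissions so the internal queue cannot overflow.
                configured.wait_for_async_ready(timeout_ms=10000)
                jobs.append(configured.run_async([bindings], _done))

            for job in jobs:
                job.wait(10000)

            # Each output buffer now holds one result.
            return [b.output().get_buffer() for b in bindings_list]

If your model has multiple inputs or outputs, the per-stream bindings are typically addressed by name (check the pyhailort docs for your exact version).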

For more examples, see runtime/python/utils.py in the hailo-ai/Hailo-Application-Code-Examples repository on GitHub.