How does each of these runner.infer_context(param) options work?

InferenceContext.SDK_HAILO_HW
InferenceContext.SDK_NATIVE
InferenceContext.SDK_BIT_EXACT
InferenceContext.SDK_FP_OPTIMIZED

I’d also like to know the difference between the two inference methods, runner.infer_context and InferVStreams (used in a with block).

Hey @clear.han

Let me explain the key differences between runner.infer_context and InferVStreams:

  1. runner.infer_context(param):
    This comes from the Hailo SDK’s ClientRunner. It gives you an inference context that selects where, and at which stage of the conversion pipeline, the model is executed; you then run a batch of data through it with runner.infer (see the sketch after this list). You can choose from:

    • SDK_HAILO_HW: runs inference on an attached Hailo device (requires a compiled model)
    • SDK_NATIVE: emulates the original full-precision model on the host CPU (useful for testing without Hailo hardware)
    • SDK_BIT_EXACT: emulates the quantized model on the host with bit-exact hardware numerics, so results match the device exactly
    • SDK_FP_OPTIMIZED: emulates the model after the optimization modifications, still in floating point
  2. InferVStreams:
    This is the HailoRT runtime API, used as a context manager (with InferVStreams(...) as pipeline:) on a network group configured from a compiled .hef. It manages continuous input/output streams with low latency, making it ideal for applications like video or sensor data processing.
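For illustration, here is a minimal sketch of the infer_context flow. It assumes the Dataflow Compiler’s ClientRunner API and a model already saved as a .har; the file path and dataset shape are placeholders:

import numpy as np
from hailo_sdk_client import ClientRunner, InferenceContext

# Load a model that was translated/optimized by the Dataflow Compiler (placeholder path)
runner = ClientRunner(har='model.har')

# A small batch of preprocessed inputs, shape (N, H, W, C) - placeholder data
dataset = np.zeros((8, 224, 224, 3), dtype=np.float32)

# SDK_NATIVE / SDK_FP_OPTIMIZED / SDK_BIT_EXACT emulate on the host;
# SDK_HAILO_HW runs on an attached Hailo device
with runner.infer_context(InferenceContext.SDK_NATIVE) as ctx:
    results = runner.infer(ctx, dataset)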

The main difference lies in their use cases:

  • runner.infer_context is best for batch inference during model conversion and evaluation, where you process a dataset all at once (for example, to compare accuracy between the native, optimized, and quantized model).
  • InferVStreams is suited for deployment scenarios requiring continuous, real-time data processing with a compiled .hef.

Choose the method that aligns best with your specific application needs. If you’re evaluating a model on batch data during conversion, go with runner.infer_context. For real-time streaming applications with a compiled model, InferVStreams would be more appropriate (see the sketch below).
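For comparison, here is a minimal, self-contained sketch of the InferVStreams flow; the .hef path, the PCIe interface, and the frame loop are assumptions for the example:

import numpy as np
from hailo_platform import (HEF, VDevice, ConfigureParams, HailoStreamInterface,
                            InferVStreams, InputVStreamParams, OutputVStreamParams, FormatType)

hef = HEF('model.hef')  # placeholder path
with VDevice() as vdevice:
    params = ConfigureParams.create_from_hef(hef, interface=HailoStreamInterface.PCIe)
    network_group = vdevice.configure(hef, params)[0]
    in_params = InputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)
    out_params = OutputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)
    in_info = hef.get_input_vstream_infos()[0]

    with network_group.activate(network_group.create_params()):
        with InferVStreams(network_group, in_params, out_params) as pipeline:
            for _ in range(100):  # e.g. frames arriving from a camera
                frame = {in_info.name: np.zeros((1, *in_info.shape), dtype=np.float32)}
                results = pipeline.infer(frame)  # dict: output vstream name -> numpy array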

Regards

In the sample code, InferVStreams uses a .hef file,
while runner.infer_context uses .har files.
Can I use the .hef file with runner.infer_context?
If so, how do I load the .hef on the runner?

You can run an .hef file on Hailo hardware from Python, but as far as I know it goes through the HailoRT API (HEF + VDevice + InferVStreams) rather than through runner.infer_context: the ClientRunner’s infer_context works from the .har produced during model conversion (use InferenceContext.SDK_HAILO_HW there if you want it to run on the device). Here’s how to load and run the .hef with HailoRT:

  1. Load the .hef file and configure the device:
import numpy as np
from hailo_platform import (HEF, VDevice, ConfigureParams, HailoStreamInterface,
                            InferVStreams, InputVStreamParams, OutputVStreamParams, FormatType)

hef = HEF('path_to_your_model.hef')
vdevice = VDevice()
# create_from_hef needs the stream interface; configure() returns a list of network groups
configure_params = ConfigureParams.create_from_hef(hef, interface=HailoStreamInterface.PCIe)
network_group = vdevice.configure(hef, configure_params)[0]
network_group_params = network_group.create_params()
  2. Run inference through InferVStreams:
input_vstreams_params = InputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)
output_vstreams_params = OutputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)
input_info = hef.get_input_vstream_infos()[0]
# Fill the input buffer with your data (zeros here are just a placeholder)
input_data = {input_info.name: np.zeros((1, *input_info.shape), dtype=np.float32)}
with network_group.activate(network_group_params):
    with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
        results = infer_pipeline.infer(input_data)
# results maps each output vstream name to a numpy array

This is the standard way to run a compiled .hef on Hailo hardware from the Python API. If you specifically need runner.infer_context (for example, to compare the SDK emulation contexts against hardware), keep working from the .har in the ClientRunner rather than trying to load the .hef into it.