Multi-context flow

Could someone explain to me what this means? It appeared during the compilation step, with this command.
Is it bad? Is it good?

hailomz compile yolov8n --ckpt=simple.onnx --hw-arch hailo8 --calib-path train/images --classes 4 --performance

with this as the .alls file

normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv42, sigmoid)
change_output_activation(conv53, sigmoid)
change_output_activation(conv63, sigmoid)

allocator_param(width_splitter_defuse=disabled)
model_optimization_flavor(optimization_level=3,compression_level=1,batch_size=32)
model_optimization_config(calibration, batch_size=32, calibset_size=2048)
quantization_param({conv*}, bias_mode=double_scale_initialization)
model_optimization_config(compression_params, auto_16bit_weights_ratio=1)
pre_quantization_optimization(zero_static_channels, policy=enabled)

model_optimization_config(checker_cfg, policy=enabled,dataset_size=2048)

post_quantization_optimization(bias_correction, policy=enabled)

post_quantization_optimization(finetune,policy=enabled, dataset_size=2048,batch_size=32,epochs=15,learning_rate=0.0001,val_images=256,val_batch_size=32,force_pruning=True)
pre_quantization_optimization(equalization, layers={conv*}, policy=enabled)
post_quantization_optimization(adaround, policy=enabled,batch_size=8,cache_compression=enabled,dataset_size=256,train_bias=False,epochs=100)
change_output_activation(gelu)
allocator_param(automatic_ddr=True, timeout=20m)
performance_param(compiler_optimization_level=max)

The Hailo devices have three types of resources: memory, control, and compute. If a network requires more resources than are available, the Hailo Dataflow Compiler splits it into multiple contexts. At runtime, each context is loaded automatically by the HailoRT runtime and the network is executed context by context.

This allows larger networks to be executed on the Hailo devices.
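If you want to see how the compiled network was split, HailoRT's CLI can print the structure of the resulting HEF. A minimal sketch, assuming HailoRT is installed and `yolov8n.hef` stands in for your compiled output (the exact output format varies by HailoRT version):

```shell
# Print information about a compiled HEF, including how many
# contexts the network was split into
hailortcli parse-hef yolov8n.hef
```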

When using a Hailo-8L, you will see networks compiled to multiple contexts more often, because the chip has fewer resources.

If a network fits into a single context it will typically run at higher FPS.

For networks with multiple contexts, the FPS can be increased by using a larger batch size.
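As a sketch of trying this out, the batch size can be set when running the HEF with the HailoRT CLI (assumes a connected device; `yolov8n.hef` and the batch value of 8 are illustrative):

```shell
# Run inference with a larger batch size to amortize the cost of
# switching between contexts; compare the reported FPS across values
hailortcli run yolov8n.hef --batch-size 8
```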

In some cases you may be able to compile a network into a single context by quantizing some of the larger layers to 4-bit weights.
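A hedged sketch of what that could look like in the model script, by analogy with the `auto_16bit_weights_ratio` line already in your .alls — the 4-bit parameter name and the 0.2 ratio are assumptions here, not verified against your Dataflow Compiler version:

```
# Let the optimizer move a fraction (~20%) of the weights to 4-bit
# to reduce the network's memory footprint
model_optimization_config(compression_params, auto_4bit_weights_ratio=0.2)
```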

The compiler has a performance mode (the --performance flag in hailomz, or performance_param(compiler_optimization_level=max) in the model script, both of which your setup already uses) that makes it try harder to allocate the network. Compilation will, however, take much longer.

So, overall nothing to worry about.