DFC optimizes using only 64 of ~300 pictures for calibration

Hello there,

I’m facing the issue that only a small part of my calibration dataset is actually used, despite loading hundreds of pictures.

Here is the code that loads my dataset:

from PIL import Image
import numpy as np

target_size = (640, 640)  # matches the model input resolution

images = []
for img_file in sorted(image_files):  # image_files: list of image paths, defined earlier
    # Load each image, force RGB, and resize to the model input size
    img = Image.open(img_file).convert("RGB")
    img = img.resize(target_size, Image.BILINEAR)
    img_np = np.array(img).astype(np.float32)
    images.append(img_np)

# Stack into a single NHWC array for calibration
calib_dataset = np.stack(images, axis=0)
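For what it’s worth, the stacked array can be sanity-checked before handing it to the optimizer. A minimal sketch, assuming only NumPy (the small batch here is a zero-filled stand-in for the real 1024 images):

```python
import numpy as np

# Hypothetical stand-in for the stacked dataset from the loop above
# (small batch here; the real array had 1024 entries).
calib_dataset = np.zeros((8, 640, 640, 3), dtype=np.float32)

# Sanity checks: NHWC layout, 640x640 RGB frames, float32 values.
assert calib_dataset.ndim == 4
assert calib_dataset.shape[1:] == (640, 640, 3)
assert calib_dataset.dtype == np.float32
print("Calibration dataset shape:", calib_dataset.shape)
```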

“Calibration dataset shape: (1024, 640, 640, 3)”

This is my process output:

Simplified ONNX model saved to: /compilation/datasets/waste_detection/models/run1/logs20250718-123851/model_simplified.onnx
[info] Translation started on ONNX model custom
[info] Restored ONNX model custom (completion time: 00:00:00.03)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.05)
[info] Start nodes mapped from original model: 'model_1/model/conv1_conv/Conv2D__6': 'custom/input_layer1'.
[info] End nodes mapped from original model: 'model_1/conv2d_1/Sigmoid'.
[info] Translation completed on ONNX model custom (completion time: 00:00:00.38)
Model translation to Hailo format completed.
Calibration dataset created.
Calibration dataset shape: (1024, 640, 640, 3)
[info] Found model with 3 input channels, using real RGB images for calibration instead of sampling random data.
[info] Starting Model Optimization
[warning] Reducing optimization level to 0 (the accuracy won’t be optimized and compression won’t be used) because there’s no available GPU
[warning] Running model optimization with zero level of optimization is not recommended for production use and might lead to suboptimal accuracy results
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.06)
[info] LayerNorm Decomposition skipped
[info] Starting Statistics Collector
[info] Using dataset with 64 entries for calibration

[…]

[info] Iterations: 4
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 0
Reverts on split failed: 0
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Cluster   | Control Utilization | Compute Utilization | Memory Utilization |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | cluster_0 | 6.3%                | 12.5%               | 9.4%               |
[info] | cluster_1 | 6.3%                | 1.6%                | 2.3%               |
[info] | cluster_2 | 25%                 | 17.2%               | 25%                |
[info] | cluster_3 | 6.3%                | 1.6%                | 18%                |
[info] | cluster_6 | 87.5%               | 64.1%               | 82%                |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Total     | 16.4%               | 12.1%               | 17.1%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] Successful Mapping (allocation time: 7s)
[info] Compiling kernels of custom_context_0…
[info] Bandwidth of model inputs: 9.375 Mbps, outputs: 3.125 Mbps (for a single frame)
[info] Bandwidth of DDR buffers: 0.0 Mbps (for a single frame)
[info] Bandwidth of inter context tensors: 0.0 Mbps (for a single frame)
[info] Building HEF…
[info] Successful Compilation (compilation time: 1s)

I thought I was having issues with the dataset format, but that doesn’t seem to be the case.

Am I using the runner function incorrectly?

You can change the calibration dataset size by adding this alls line:


model_optimization_config(calibration, batch_size=8, calibset_size=64)
 

Having a large calibration dataset is not recommended, as it may include unnecessary outlier values in the quantization. If you do wish to increase the calibration dataset size, you should also perform activation clipping to remove those outliers by adding this alls line:

pre_quantization_optimization(activation_clipping, layers={*}, mode=percentile, clipping_values=[0.01, 99.99])
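For reference, both commands can live in the same model script. A sketch with illustrative values only (the calibset_size of 256 below is an example of an increased size, not a recommendation):

```
# model script (.alls) -- example values only
model_optimization_config(calibration, batch_size=8, calibset_size=256)
pre_quantization_optimization(activation_clipping, layers={*}, mode=percentile, clipping_values=[0.01, 99.99])
```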