Error when compiling a pretrained YOLOv11n model in ONNX format using the Hailo-8 software suite

Qaiser_anwar · July 1, 2025, 1:27pm

The following is the error:

 python3 onnx2hef.py 
Initializing Hailo ClientRunner for hailo8...
ClientRunner initialized successfully.
Loading ONNX model from last.onnx with specified end nodes...
[info] Translation started on ONNX model model
[info] Restored ONNX model model (completion time: 00:00:00.19)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.78)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv2.2/cv2.2.2/Conv /model.23/cv3.2/cv3.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'model/input_layer1'.
[info] End nodes mapped from original model: '/model.23/cv2.0/cv2.0.2/Conv', '/model.23/cv3.0/cv3.0.2/Conv', '/model.23/cv3.1/cv3.1.2/Conv', '/model.23/cv2.1/cv2.1.2/Conv', '/model.23/cv2.2/cv2.2.2/Conv', '/model.23/cv3.2/cv3.2.2/Conv'.
[info] Translation completed on ONNX model model (completion time: 00:00:02.12)
ONNX model translated successfully.
Starting model optimization (quantization) using calibration data from config.yaml...
[info] Found model with 3 input channels, using real RGB images for calibration instead of sampling random data.
[info] Starting Model Optimization
An error occurred during compilation: Cannot load file containing pickled data when allow_pickle=False
Please ensure your 'config.yaml' has a valid 'quantization' section
with 'quantization_mode: full' and a correctly specified 'calibration_set' path.
ClientRunner closed.

The following is the code:

# onnx2hef.py
import hailo_sdk_client
import os
# from ultralytics import YOLO # Only uncomment if you need to export ONNX

# --- Configuration Section ---
onnx_model_path = 'last.onnx'
yaml_config_path = 'config.yaml' # This YAML is CRUCIAL for quantization and NMS
hef_output_path = 'last.hef'
target_hw_arch = "hailo8"

# --- Crucial: Use the end nodes recommended by the compiler for HailoRT post-processing ---
# These are the raw outputs from the detection heads (e.g., bbox_reg_head_output, cls_head_output for each stride)
# The compiler expects these if it's to perform on-chip NMS.
recommended_end_node_names = [
    "/model.23/cv2.0/cv2.0.2/Conv",  # Stride 8, bbox (Note: Order in your last output was /model.23/cv3.0/cv3.0.2/Conv first,
    "/model.23/cv3.0/cv3.0.2/Conv",  # so I'm reordering them here to match the info log precisely if it matters.
    "/model.23/cv3.1/cv3.1.2/Conv",  # It shouldn't strictly matter for the list, but for mapping in YAML, it does.)
    "/model.23/cv2.1/cv2.1.2/Conv",
    "/model.23/cv2.2/cv2.2.2/Conv",
    "/model.23/cv3.2/cv3.2.2/Conv"
]
# Double check the order against your latest output:
# /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv2.2/cv2.2.2/Conv /model.23/cv3.2/cv3.2.2/Conv.
# My previous list might have had them in a different order. It's best to match the compiler's output exactly.

# --- Verification of Paths ---
if not os.path.exists(onnx_model_path):
    print(f"Error: ONNX model not found at {onnx_model_path}")
    exit(1)
if not os.path.exists(yaml_config_path):
    print(f"Error: YAML configuration not found at {yaml_config_path}")
    exit(1)

# --- Main Compilation Logic ---
runner = None
try:
    print(f"Initializing Hailo ClientRunner for {target_hw_arch}...")
    runner = hailo_sdk_client.ClientRunner(hw_arch=target_hw_arch)
    print("ClientRunner initialized successfully.")

    print(f"Loading ONNX model from {onnx_model_path} with specified end nodes...")
    
    # --- Step 1: Load/Parse the ONNX model with the recommended end_node_names ---
    # The YAML configuration is crucial for this step if it contains input/output definitions
    # and also for the subsequent optimization step.
    hn, _ = runner.translate_onnx_model(
        onnx_model_path,
        end_node_names=recommended_end_node_names
        # No yaml_path here, as we found it's not a direct argument for translate_onnx_model
    )

    print("ONNX model translated successfully.")

    # --- Step 2: Load the external YAML configuration ---
    # It appears that your SDK version *does not* have `load_external_model_config`.
    # This means the YAML might be passed directly to `optimize()`.
    # Let's try passing the YAML path to optimize().
    # If the YAML contains the model input/output definitions, those might be picked up by
    # translate_onnx_model without explicit `yaml_path` argument if the model_name in YAML
    # matches the parsed ONNX model.
    
    # --- Step 3: Optimize (Quantize) the model using the YAML config for calibration ---
    # This is the step that applies quantization. The YAML is expected to be read here.
    print(f"Starting model optimization (quantization) using calibration data from {yaml_config_path}...")
    # The `optimize` method usually takes the YAML path or a loaded config object.
    runner.optimize(yaml_config_path) # Pass the YAML path directly to optimize

    print("Model optimized (quantized) successfully.")

    # --- Step 4: Compile the quantized model to HEF ---
    print("Compiling the translated and quantized model to HEF...")
    compiled_hef_data = runner.compile()

    # --- Step 5: Save the HEF to a file ---
    print(f"Saving compiled HEF to: {hef_output_path}")
    runner.save_hef(hef_output_path)

    print(f"Successfully compiled model to HEF: {hef_output_path}")
    print(f"Your HEF file is located at: {os.path.abspath(hef_output_path)}")

except Exception as e:
    print(f"An error occurred during compilation: {e}")
    print("Please ensure your 'config.yaml' has a valid 'quantization' section")
    print("with 'quantization_mode: full' and a correctly specified 'calibration_set' path.")
finally:

    print("ClientRunner closed.")

The following is the config.yaml file:

network:
  name: yolov11n_custom
  # Input shape should precisely match what your ONNX model expects
  input_shape: [1, 3, 640, 640] # Ensure this matches your model's input size
  
  # IMPORTANT: Preprocessing for typical YOLO models
  normalization:
    # Scale pixels from [0, 255] to [0, 1]
    scale: 0.00392156862745098 # Equivalent to 1.0 / 255.0
    mean: [0.0, 0.0, 0.0]     # No mean subtraction, as trained on 0-1
    std: [1.0, 1.0, 1.0]      # No std scaling, as trained on 0-1
    bias: 0.0                 # No bias
  
  inputs:
    # The name here must match the input node name in your ONNX model (e.g., 'images')
    - name: images
      shape: [1, 3, 640, 640]
      # data_type: uint8  # Uncomment if you plan to feed uint8 images directly to the HEF
                          # and let Hailo handle the float conversion internally after pre-processing.
                          # Otherwise, it often defaults to float32.

  # For on-chip NMS, you typically do NOT explicitly list the raw detection head outputs here.
  # The `post_processing` section implicitly defines the final output (NMS results).
  # outputs:
  #   - name: output0 # Remove this if using on-chip NMS

# --- Quantization Configuration ---
# This is absolutely necessary for the 'Model requires quantized weights' error.
quantization:
  quantization_mode: "full" # Full INT8 quantization
  calibration_set:
    # IMPORTANT: This path MUST point to a directory containing representative images
    # for calibration. These should be a diverse subset of your training data.
    # For example: '/home/linux/Desktop/Drone_Navigation/data/calibration_images/'
    data_path: "/home/linux/Desktop/hailo/calibration_npy_batches" # <<< CHANGE THIS TO YOUR ACTUAL CALIBRATION DATASET PATH
    data_loader: "numpy_loader" # Uses Hailo's built-in image loader for common image formats
    #batch_size: 100 # Number of images to process per batch for calibration. Adjust if needed.
    #num_images: 25 # You can specify a total number of images to use from the data_path

# --- Post-Processing Configuration (for on-chip NMS) ---
# This block tells the Hailo compiler how to perform Non-Maximum Suppression on the device.
post_processing:
  nms_scores_th: 0.25 # Confidence threshold for filtering detections (adjust as per your model)
  nms_iou_th: 0.45    # IoU threshold for NMS suppression (adjust as per your model)
  image_dims: [640, 640] # Must match your model's input image dimensions (height, width)
  max_proposals_per_class: 100 # Maximum number of detections per class after NMS
  classes: 32 # Number of classes your YOLOv11n model was trained on (e.g., 80 for COCO)
  
  # Define how the raw outputs from your model map to bounding box regression and classification.
  # These `reg_layer` and `cls_layer` names MUST match the exact ONNX node names
  # that the Hailo compiler recommended in your previous log:
  # /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv2.2/cv2.2.2/Conv /model.23/cv3.2/cv3.2.2/Conv.
  # You'll need to know which of these are regression outputs and which are classification outputs
  # for each stride (8, 16, 32). Use Netron to map them visually.
  bbox_decoders:
    # Head 1 (e.g., P3, smallest stride, usually 8 for largest objects)
    - name: "bbox_decoder_s8"
      stride: 8
      reg_layer: "/model.23/cv3.0/cv3.0.2/Conv" # <<<< VERIFY WITH NETRON! This is a guess.
      cls_layer: "/model.23/cv2.0/cv2.0.2/Conv" # <<<< VERIFY WITH NETRON! This is a guess.
    
    # Head 2 (e.g., P4, middle stride, usually 16)
    - name: "bbox_decoder_s16"
      stride: 16
      reg_layer: "/model.23/cv3.1/cv3.1.2/Conv" # <<<< VERIFY WITH NETRON!
      cls_layer: "/model.23/cv2.1/cv2.1.2/Conv" # <<<< VERIFY WITH NETRON!

    # Head 3 (e.g., P5, largest stride, usually 32 for smallest objects)
    - name: "bbox_decoder_s32"
      stride: 32
      reg_layer: "/model.23/cv2.2/cv2.2.2/Conv" # <<<< VERIFY WITH NETRON! The compiler's order was different.
      cls_layer: "/model.23/cv3.2/cv3.2.2/Conv" # <<<< VERIFY WITH NETRON!

# If your YOLOv11 model has a different number of heads or different strides, adjust accordingly.
# Also, the precise mapping of `cvX.Y/cvX.Y.Z/Conv` to reg/cls for each stride is critical.
# YOU MUST USE NETRON TO VISUALIZE YOUR `last.onnx` and find these exact mappings.

I do not understand where the error comes from. Is it an internal error in the Hailo software suite, or is it something else? It seems to originate in the quantization section of the config.yaml file, where I specify the calibration set.

omria · July 6, 2025, 8:04am

Hey @Qaiser_anwar,

Welcome to the Hailo Community!

That error you’re seeing:

Cannot load file containing pickled data when allow_pickle=False

This isn’t actually coming from the Hailo compiler - it’s coming from NumPy when trying to load your calibration files.

When you set up your calibration like this:

quantization:
  calibration_set:
    data_path: "/home/linux/Desktop/hailo/calibration_npy_batches"
    data_loader: "numpy_loader"

The Hailo SDK goes through your directory and basically does:

for fn in sorted(os.listdir(data_path)):
    arr = np.load(os.path.join(data_path, fn), allow_pickle=False)

np.load defaults to allow_pickle=False, and in that mode it refuses to read files that contain pickled Python objects (for example, arrays saved with dtype=object). That refusal is the error you are seeing.

This isn’t a Hailo bug - it’s actually NumPy’s security feature. The Hailo SDK just uses NumPy’s default behavior for safety.
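To see the failure outside the Hailo toolchain, here is a minimal NumPy-only sketch. It is an illustration under assumptions, not the SDK's code: the helper `try_load` and the file names are hypothetical, and the exact wording of the exception differs between NumPy versions. It shows two ways np.load with allow_pickle=False can end in a ValueError: an array saved with dtype=object, and a file that is not in .npy/.npz format at all (here a plain text file).

```python
import os
import tempfile

import numpy as np


def try_load(path):
    """Return (True, array) if np.load succeeds without pickle, else (False, message)."""
    try:
        return True, np.load(path, allow_pickle=False)
    except ValueError as exc:
        return False, str(exc)


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: a purely numeric array loads without pickle support.
    clean_path = os.path.join(tmp, "clean.npy")
    np.save(clean_path, np.zeros((2, 4, 4, 3), dtype=np.float32))

    # Case 2: an object-dtype array is stored via pickle and is refused on load.
    object_path = os.path.join(tmp, "object.npy")
    np.save(object_path, np.array([{"image": 1}, None], dtype=object), allow_pickle=True)

    # Case 3: a file that is not .npy/.npz at all (here: plain text).
    text_path = os.path.join(tmp, "not_an_array.txt")
    with open(text_path, "w") as fh:
        fh.write("quantization:\n  quantization_mode: full\n")

    clean_ok, _ = try_load(clean_path)
    object_ok, object_msg = try_load(object_path)
    text_ok, text_msg = try_load(text_path)

print("clean:", clean_ok)
print("object:", object_ok, "|", object_msg)
print("text:", text_ok, "|", text_msg)
```

Comparing the two printed messages with the one in your log can hint at which case applies. On the NumPy 1.x releases I have looked at, the "Cannot load file containing pickled data" wording appears to come from the third case, so it is worth checking that every path handed to the loader really points at .npy data and that no stray non-.npy files sit in the calibration directory.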

Here’s how to fix it:

Option 1: Regenerate your calibration files as pure arrays
If your calibration data is just numeric arrays (like image batches), make sure you save them properly:

import numpy as np
arr = np.random.rand(640,640,3).astype(np.float32)
np.save("batch000.npy", arr)  # This creates a "clean" .npy file
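If the existing files were saved from a Python list or an object-dtype array, one way to re-save them is sketched below. The helper name `resave_as_numeric`, the file names, and the float32 (N, H, W, 3) layout are assumptions for illustration; check the layout and value range your Hailo SDK version expects for calibration data.

```python
import os
import tempfile

import numpy as np


def resave_as_numeric(src_path, dst_path, dtype=np.float32):
    """Load a possibly pickled .npy file and re-save it as a plain numeric array.

    Only use allow_pickle=True on files you created yourself: unpickling
    untrusted data can execute arbitrary code.
    """
    loaded = np.load(src_path, allow_pickle=True)
    # np.stack raises ValueError if the samples do not all share one shape.
    stacked = np.stack([np.asarray(sample, dtype=dtype) for sample in loaded])
    np.save(dst_path, stacked, allow_pickle=False)
    return stacked.shape


with tempfile.TemporaryDirectory() as tmp:
    # Simulate a "dirty" file: an object array holding four HxWx3 images.
    images = np.empty(4, dtype=object)
    for i in range(4):
        images[i] = np.full((8, 8, 3), i, dtype=np.uint8)
    src = os.path.join(tmp, "batch_pickled.npy")
    np.save(src, images, allow_pickle=True)

    dst = os.path.join(tmp, "batch_clean.npy")
    shape = resave_as_numeric(src, dst)

    # The cleaned file now loads with pickling disabled.
    reloaded = np.load(dst, allow_pickle=False)

print(shape, reloaded.dtype)
```

Passing allow_pickle=False to np.save makes the save itself fail if the array still has an object dtype, which catches the problem at write time rather than during quantization.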

Option 2: Use image files instead
If your calibration data are actually images, don’t use numpy_loader at all:

quantization:
  quantization_mode: "full"
  calibration_set:
    data_path: "/home/linux/Desktop/hailo/calibration_images"
    data_loader: "opencv_loader"  # or "pil_loader"

Quick test:
Try loading one of your files manually:

import numpy as np
x = np.load("/home/linux/Desktop/hailo/calibration_npy_batches/batch000.npy")
print(type(x), x.dtype)

If that gives you the pickle error, then your files need to be re-saved as pure arrays.
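To run the same check over the whole directory rather than a single file, a small helper like the following can list every entry that fails to load. `find_unloadable` is a hypothetical name, and the loop only approximates what a loader might do; it is not taken from the Hailo SDK.

```python
import os
import tempfile

import numpy as np


def find_unloadable(data_path):
    """Return {filename: error} for entries np.load cannot read with pickling disabled."""
    problems = {}
    for fn in sorted(os.listdir(data_path)):
        full = os.path.join(data_path, fn)
        try:
            np.load(full, allow_pickle=False)
        except (ValueError, EOFError, OSError) as exc:
            # ValueError: pickled or unrecognised content; EOFError: empty file;
            # OSError: unreadable entry such as a subdirectory.
            problems[fn] = f"{type(exc).__name__}: {exc}"
    return problems


with tempfile.TemporaryDirectory() as tmp:
    # One valid numeric batch and one stray text file that is not an array.
    np.save(os.path.join(tmp, "batch000.npy"), np.zeros((1, 8, 8, 3), dtype=np.float32))
    with open(os.path.join(tmp, "notes.txt"), "w") as fh:
        fh.write("calibration notes\n")

    problems = find_unloadable(tmp)

for name, error in problems.items():
    print(name, "->", error)
```

In practice you would call find_unloadable with your own data_path; an empty result means every file in the directory loads cleanly with pickling disabled.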

Once your calibration data loads cleanly, the Hailo quantizer should work fine!

Let me know if you need more help!
