The inference accuracy of the HEF model on Hailo-8L has significantly dropped

The quantized HAR model has an accuracy of 87% on the dataset, while its compiled HEF model only has an accuracy of 42%.

The following is the program for converting an ONNX model to a HEF model:

from hailo_sdk_client import ClientRunner
import os
import cv2
import numpy as np



input_size = 512  # model input size
chosen_hw_arch = "hailo8l"  # target Hailo hardware architecture, here Hailo-8L
onnx_model_name = "ms_unet_512_09"  # model name
file_path = "/home/zengzixuan/hailo-convert/checkpoint/ms_unet_512"
onnx_path = os.path.join(file_path, "ms_unet_512_09.onnx")  # path to the ONNX model
hailo_model_har_path = os.path.join(file_path, f"{onnx_model_name}.har")  # path for the translated model
hailo_quantized_har_path = os.path.join(file_path, f"{onnx_model_name}_quantized.har")  # path for the quantized model
hailo_model_hef_path = os.path.join(file_path, f"{onnx_model_name}.hef")  # path for the compiled model
images_path = "/home/zengzixuan/hailo-convert/Test/images"  # calibration image directory
CALIB_SAMPLE_NUM = 396

# Translate the ONNX model to a HAR
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_onnx_model(
    model=onnx_path,
    net_name=onnx_model_name,
    # end_node_names recommended in the translation log
    # end_node_names=[
    #     '/encoder1/transformer/Squeeze_1',
    #     '/encoder1/shortcut/Conv',
    #     '/encoder1/local_feat/local_feat.5/Mul'
    # ]
)
runner.save_har(hailo_model_har_path)

# Prepare the calibration dataset (preprocessing identical to training)
images_list = [img_name for img_name in os.listdir(images_path) if os.path.splitext(img_name)[1] in [".jpg", ".jpeg", ".png", ".bmp"]][:CALIB_SAMPLE_NUM]  # list of image file names
calib_dataset = np.zeros((len(images_list), input_size, input_size, 3), dtype=np.float32)  # preallocate the numpy array

# ImageNet normalization parameters (same as training)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

for idx, img_name in enumerate(sorted(images_list)):
    # 1. Read the image (BGR)
    img = cv2.imread(os.path.join(images_path, img_name))

    # 2. Convert BGR -> RGB (important: training used RGB)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # 3. Resize
    resized = cv2.resize(img, (input_size, input_size))

    # 4. Normalize to [0, 1]
    img_normalized = resized.astype(np.float32) / 255.0

    # 5. ImageNet standardization (exactly as in training)
    img_normalized = (img_normalized - mean) / std

    calib_dataset[idx, :, :, :] = img_normalized



# Make sure the calibration set is not empty
if len(images_list) == 0:
    raise ValueError(f"Calibration set is empty! Check:\n1. Image path: {images_path}\n2. Whether the directory contains .jpeg/.jpg/.png/.bmp images")

# Quantize the model (using correctly preprocessed calibration data)
runner = ClientRunner(har=hailo_model_har_path)
alls_lines = [
    'model_optimization_flavor(optimization_level=0, compression_level=0)',  # raise the optimization level for better accuracy
    'resources_param(max_control_utilization=0.6, max_compute_utilization=0.6, max_memory_utilization=0.6)',
    'performance_param(fps=1)',
]
runner.load_model_script('\n'.join(alls_lines))
runner.optimize(calib_dataset)
runner.save_har(hailo_quantized_har_path)

# Compile to HEF
runner = ClientRunner(har=hailo_quantized_har_path)
compiled_hef = runner.compile()
with open(hailo_model_hef_path, "wb") as f:
    f.write(compiled_hef)

The following is the program for HEF model inference on Hailo-8L:

"""
HEF model validation script - rust detection
Validates the HEF model on the Test set using a Hailo-8 device and computes the mIoU metric.

Key point: preprocessing must match calibration exactly!
Calibration used: cv2 read (BGR) -> BGR to RGB -> resize -> ImageNet normalization
"""

import os
import cv2
import torch
import numpy as np
from PIL import Image
from hailo_platform import (VDevice, HEF, HailoStreamInterface, InferVStreams, ConfigureParams,
                             InputVStreamParams, OutputVStreamParams, FormatType)
from tqdm import tqdm
import yaml

# ===================== Configuration =====================
# The HEF output already includes sigmoid (confirmed from debug info: output range [0, 1])
HEF_OUTPUT_WITH_SIGMOID = True
# IoU binarization threshold
IOU_THRESHOLD = 0.5
# =========================================================


def calculate_iou_numpy(pred, target, threshold=0.5):
    """
    Compute IoU (pure numpy, no sigmoid applied here)

    Args:
        pred: prediction, numpy array of probabilities in [0, 1]
        target: ground-truth mask, numpy array binarized to {0, 1}
        threshold: binarization threshold

    Returns:
        iou: IoU value
    """
    # Binarize prediction and target
    pred_bin = (pred > threshold).astype(np.float32)
    target_bin = (target > threshold).astype(np.float32)

    # Intersection and union
    intersection = (pred_bin * target_bin).sum()
    union = pred_bin.sum() + target_bin.sum() - intersection

    # Avoid division by zero
    iou = (intersection + 1e-6) / (union + 1e-6)
    return iou


def validate_hef_model(hef_model_path, test_image_dir, test_mask_dir, img_size=512):
    """
    Validate HEF model performance and compute the mean IoU.

    Preprocessing identical to calibration:
    - read image with cv2 (BGR)
    - convert BGR to RGB
    - resize to the target size
    - ImageNet normalization
    """
    print("=" * 60)
    print("HEF model validation - Test set mIoU")
    print("=" * 60)
    print("\nPreprocessing: same as calibration (BGR->RGB, ImageNet normalization)")

    # ImageNet normalization parameters (same as calibration)
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

    # Collect the test images
    image_files = [f for f in os.listdir(test_image_dir)
                   if f.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp'))]

    print(f"\nTest set size: {len(image_files)}")
    print(f"Image size: {img_size}x{img_size}")

    # Initialize the Hailo device
    print("\n[1/3] Initializing Hailo device...")
    params = VDevice.create_params()
    params.device_count = 1

    total_iou = 0.0
    valid_count = 0

    with VDevice(params) as target:
        print("✓ Hailo device connected")

        # Load the HEF model
        print(f"\n[2/3] Loading HEF model: {hef_model_path}")
        hef = HEF(hef_model_path)

        # Configure the network group
        configure_params = ConfigureParams.create_from_hef(hef, interface=HailoStreamInterface.PCIe)
        network_groups = target.configure(hef, configure_params)
        network_group = network_groups[0]
        network_group_params = network_group.create_params()

        # Input/output stream parameters - FLOAT32 input to accept normalized data
        input_vstreams_params = InputVStreamParams.make(network_group, quantized=False, format_type=FormatType.FLOAT32)
        output_vstreams_params = OutputVStreamParams.make(network_group, quantized=False, format_type=FormatType.FLOAT32)

        print("✓ Model loaded")

        # Input/output stream info
        input_vstream_info = hef.get_input_vstream_infos()[0]
        output_vstream_info = hef.get_output_vstream_infos()[0]

        print(f"  Input shape: {input_vstream_info.shape}")
        print(f"  Output shape: {output_vstream_info.shape}")

        # Start validation
        print("\n[3/3] Validating...")
        print(f"  HEF output format: {'sigmoid applied' if HEF_OUTPUT_WITH_SIGMOID else 'logits'}")

        first_image = True
        with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:

            with tqdm(image_files, desc='Validating') as pbar:
                for filename in pbar:
                    try:
                        # ========== Image preprocessing (identical to calibration) ==========
                        img_path = os.path.join(test_image_dir, filename)

                        # Read with cv2 (BGR)
                        img = cv2.imread(img_path)
                        if img is None:
                            print(f"\nWarning: failed to read image {img_path}")
                            continue

                        # BGR -> RGB (same as calibration)
                        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

                        # Resize to the model input size
                        img_resized = cv2.resize(img, (img_size, img_size))

                        # ImageNet normalization (exactly as in calibration)
                        img_normalized = img_resized.astype(np.float32) / 255.0
                        img_normalized = (img_normalized - mean) / std

                        # Add batch dimension: [H, W, C] -> [1, H, W, C]
                        input_data = np.expand_dims(img_normalized, axis=0).astype(np.float32)

                        # ========== Load the matching mask ==========
                        base_name = os.path.splitext(filename)[0]
                        mask_path = os.path.join(test_mask_dir, base_name + '.png')

                        if not os.path.exists(mask_path):
                            print(f"\nWarning: mask file not found {mask_path}")
                            continue

                        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
                        mask = cv2.resize(mask, (img_size, img_size))
                        mask = (mask > 128).astype(np.float32)  # binarize

                        # ========== Run inference ==========
                        input_dict = {input_vstream_info.name: input_data}

                        with network_group.activate(network_group_params):
                            output = infer_pipeline.infer(input_dict)

                        # Fetch the output
                        output_data = output[output_vstream_info.name]  # [1, H, W, 1]

                        # Debug info for the first image
                        if first_image:
                            print("\n  Debug info (first image):")
                            print(f"    Input shape: {input_data.shape}, range: [{input_data.min()}, {input_data.max()}]")
                            print(f"    Output shape: {output_data.shape}")
                            print(f"    Output range: [{output_data.min():.4f}, {output_data.max():.4f}]")
                            print(f"    Output mean: {output_data.mean():.4f}")
                            print()
                            first_image = False

                        # ========== Compute IoU ==========
                        # Drop the batch and channel dimensions
                        pred = output_data.squeeze()  # [H, W]

                        # The HEF output is already a sigmoid probability, so use it directly
                        if HEF_OUTPUT_WITH_SIGMOID:
                            iou = calculate_iou_numpy(pred, mask, IOU_THRESHOLD)
                        else:
                            # For logits, apply sigmoid first
                            pred_prob = 1.0 / (1.0 + np.exp(-pred))
                            iou = calculate_iou_numpy(pred_prob, mask, IOU_THRESHOLD)

                        total_iou += iou
                        valid_count += 1

                        # Update the progress bar
                        pbar.set_postfix({
                            'IoU': f'{iou:.4f}',
                            'Avg_IoU': f'{total_iou/valid_count:.4f}'
                        })

                    except Exception as e:
                        print(f"\n❌ Failed to process {filename}: {str(e)}")
                        import traceback
                        traceback.print_exc()
                        continue

    # Compute the average
    if valid_count > 0:
        avg_iou = total_iou / valid_count

        print(f"\n{'=' * 60}")
        print("Validation results:")
        print(f"{'=' * 60}")
        print(f"Mean IoU: {avg_iou:.4f} ({avg_iou * 100:.2f}%)")
        print(f"Images processed: {valid_count}/{len(image_files)}")
        print(f"{'=' * 60}")

        return avg_iou
    else:
        print("\n❌ No images were processed successfully")
        return None


def main():
    """
    Main validation entry point (HEF version)
    """
    # Paths
    HEF_MODEL_PATH = '/home/zengzixuan/rust_detection/checkpoint/ms_unet_512/ms_unet_512_09.hef'
    TEST_IMAGE_DIR = '/home/zengzixuan/rust_detection/Test/images'
    TEST_MASK_DIR = '/home/zengzixuan/rust_detection/Test/labels'
    IMG_SIZE = 512

    print("\nStarting HEF model validation")
    print(f"Model path: {HEF_MODEL_PATH}")
    print(f"Test images: {TEST_IMAGE_DIR}")
    print(f"Test labels: {TEST_MASK_DIR}\n")

    # Validate on the Test set
    avg_iou = validate_hef_model(
        HEF_MODEL_PATH,
        TEST_IMAGE_DIR,
        TEST_MASK_DIR,
        IMG_SIZE
    )

    if avg_iou is not None:
        # Save the results
        results = {
            'test_miou': float(avg_iou),
            'test_miou_percent': f"{avg_iou * 100:.2f}%",
            'hef_model_path': HEF_MODEL_PATH,
            'preprocessing': 'BGR->RGB, ImageNet normalization (same as calibration)',
            'timestamp': '2025-12-08'
        }

        results_path = "/home/zengzixuan/rust_detection/Test/result_hef.yaml"
        with open(results_path, 'w', encoding='utf-8') as f:
            yaml.dump(results, f, default_flow_style=False, allow_unicode=True)

        print(f"\n✓ Results saved to: {results_path}")
        print("HEF model validation finished!")
    else:
        print("\n❌ Validation failed, no results produced")


if __name__ == "__main__":
    main()

The input and output processing of the HEF model is the same as during training, so it probably isn’t an input/output issue. Could it be that the custom HEF model itself is not compatible with the Hailo8L?

Hey @user366,

If your model compiles successfully for hailo8l, it’s compatible. The big accuracy drop you’re seeing (87% → 42%) is almost always caused by a mismatch in how the model is evaluated at different stages, or by something going wrong during compilation.

About those optimization settings

I noticed you’re loading the model with optimization_level=0 and compression_level=0, but you mentioned wanting better accuracy by increasing the optimization level. Just a heads-up: according to the DFC docs, higher optimization levels generally improve quantized accuracy. Level 0 is the lowest setting and can definitely hurt your results. I’d try bumping both of those up.
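As a starting point (not a tuned recommendation), the model script could look like this — same commands you're already using, with only the optimization level raised:

```
# Raise optimization, keep compression off while debugging accuracy
model_optimization_flavor(optimization_level=2, compression_level=0)
resources_param(max_control_utilization=0.6, max_compute_utilization=0.6, max_memory_utilization=0.6)
performance_param(fps=1)
```

Higher levels take longer to optimize and need more calibration data, so expect the optimize step to slow down noticeably.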

Your I/O setup looks fine

You’ve configured the model to handle FP32 input/output, which matches the SDK emulation path, so that part shouldn’t be causing issues.

Here’s how I’d debug this:

1. Verify SDK emulation accuracy

Run the quantized HAR through SDK emulation using the exact same preprocessing and IoU calculation you use on the hardware. Use InferenceContext.SDK_QUANTIZED.
You should still be seeing ~87% mIoU here.
If you’re already seeing ~42% at this stage, the issue is in quantization or calibration, not the HEF.
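A minimal sketch of that check, based on the ClientRunner emulation pattern from the DFC tutorials (adjust names to your SDK version; `test_dataset` is assumed to be a float32 [N, H, W, 3] array preprocessed exactly like your calibration data):

```python
from hailo_sdk_client import ClientRunner, InferenceContext

# Load the quantized HAR produced by your conversion script
runner = ClientRunner(har="ms_unet_512_09_quantized.har")

# Emulate the quantized model on the same preprocessed test images
with runner.infer_context(InferenceContext.SDK_QUANTIZED) as ctx:
    quantized_preds = runner.infer(ctx, test_dataset)

# Feed quantized_preds through the same calculate_iou_numpy() loop
# you use for the HEF, so the only variable is the execution backend.
```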

2. Test the FP-optimized version

Generate an FP-optimized HAR (graph optimizations only — no quantization) and test that too.

  • If FP-optimized matches your original ONNX but quantized drops → quantization/calibration problem.
  • If FP-optimized is already bad → something’s off in your model script, preprocessing, or graph structure.
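Sketch of that step, again following the DFC tutorial flow (`optimize_full_precision` and `SDK_FP_OPTIMIZED` are the names I've seen there — verify against your SDK version):

```python
from hailo_sdk_client import ClientRunner, InferenceContext

# Start from the translated (pre-quantization) HAR
runner = ClientRunner(har="ms_unet_512_09.har")

# Apply graph-level optimizations only - no quantization yet
runner.optimize_full_precision(calib_data=calib_dataset)
runner.save_har("ms_unet_512_09_fp_optimized.har")

# Evaluate in full-precision emulation with the same preprocessing
with runner.infer_context(InferenceContext.SDK_FP_OPTIMIZED) as ctx:
    fp_preds = runner.infer(ctx, test_dataset)
```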

3. Run a layer noise analysis

This helps pinpoint which layers are causing trouble:

hailo analyze-noise ms_unet_512_09_quantized.har \
  --data-path /path/to/your/images \
  --batch-size 2 \
  --data-count 64

It’ll show you which layers have high noise. Once you find them, you can try things like raising the optimization level, switching those layers to 16-bit, or tweaking activation clipping.
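For instance, once the analysis names the problematic layers, you can pin them to 16-bit in the model script (the layer names below are placeholders — substitute the ones from your own noise report):

```
# Hypothetical layer names - replace with the noisy layers reported by analyze-noise
quantization_param(ms_unet_512_09/conv10, precision_mode=a16_w16)
quantization_param(ms_unet_512_09/conv11, precision_mode=a16_w16)
```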

Hope this helps narrow things down! Let me know what you find and we can debug further.