Large difference between output using SDK_QUANTIZED and SDK_HAILO_HW

Hello,

I’m working on an image denoising task and I’m at a point where I’m a little stuck. I optimized, quantized, and compiled the model successfully (although accuracy is not where I want it right now).

Following the guides, I test the accuracy after each step. Using the exact same input images with the same preprocessing, I get the outputs below. For context, clipping in this scenario just means "outputs * 255" clipped to the range 0-255.
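For clarity, this is roughly the clipping step I apply before computing metrics (a minimal sketch; the function name is my own, not from the SDK):

```python
import numpy as np

def to_uint8(outputs: np.ndarray) -> np.ndarray:
    """Scale float model outputs (roughly in [0, 1]) to 8-bit:
    multiply by 255, clip to [0, 255], cast to uint8."""
    return np.clip(outputs * 255.0, 0, 255).astype(np.uint8)
```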

On GPU (quantized) (SDK_QUANTIZED)

Output range before clipping: min=-0.06805047392845154, max=1.0162203311920166

Mean and std of output: mean=0.54256272315979, std=0.1305958777666092

Min and max of target: min=0, max=255

2026-03-11 15:38:12.101 | INFO | __main__::111 - Average PSNR: 36.706 dB (32 samples processed)

2026-03-11 15:38:12.101 | INFO | __main__::112 - Average SSIM: 0.903 (32 samples processed)

On Hailo8 (SDK_HAILO_HW)

Output range before clipping: min=-0.06805047392845154, max=1.0888075828552246

Mean and std of output: mean=0.5452663898468018, std=0.3206518590450287

Min and max of target: min=0, max=255

2026-03-11 15:41:30.090 | INFO | __main__::78 - Average PSNR: 27.950 dB (32 samples processed)

2026-03-11 15:41:30.090 | INFO | __main__::79 - Average SSIM: 0.020 (32 samples processed)

I expected both inference runs to yield approximately the same accuracy, but there is a significant drop, together with a noticeable change in the output statistics (especially the std).
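For anyone reproducing this, a minimal sketch of how I summarize and compare the two runs (the helper names are my own, not from the SDK; both arrays are assumed to be float outputs of identical shape):

```python
import numpy as np

def output_stats(out: np.ndarray) -> dict:
    """Summary statistics compared between SDK_QUANTIZED and SDK_HAILO_HW runs."""
    return {
        "min": float(out.min()),
        "max": float(out.max()),
        "mean": float(out.mean()),
        "std": float(out.std()),
    }

def max_abs_diff(a: np.ndarray, b: np.ndarray) -> float:
    """Largest element-wise deviation between two runs on the same input."""
    return float(np.abs(a - b).max())
```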

So I’m wondering what the reason could be.


I did some further investigation: I checked HailoRT versions and compared different compiler optimization levels, but the result is still the same (both settings lead to exactly the same results):

performance_param(compiler_optimization_level=max, optimize_for_batch=1)

performance_param(compiler_optimization_level=1, optimize_for_batch=1)