Poor performance

Hello,
I have a YOLO11 medium model that I fine-tuned.

Like the poster here, I am seeing serious performance degradation when I use the .hef model produced by the optimization and compilation process. Like that poster, the compiler was unable to recognize my GPU – I was on a workstation with a T550 laptop GPU. I had CUDA 12 on the system and also tried 11.8 in the conda environment – no luck with either.

I got this message during the optimization step:

[info] Starting Quantization-Aware Fine-Tuning
[warning] Dataset is larger than expected size. Increasing the algorithm dataset size might improve the results
[info] Using dataset with 1024 entries for finetune

which I wasn’t sure how to interpret.

I saw in the other post that having the optimization level set to 0 (as I ended up with) will affect FPS, but will it affect accuracy as well?

For what it’s worth, I was running the model through the script in Hailo-Application-Code-Examples/runtime/hailo-8/python/object_detection.

Should the accuracy of the HEF usually be comparable to what you get from the .pt model on an NVIDIA card?

Thanks in advance for any insights!

Hey @Justin_Brody,

Let me break down what’s happening with your model compilation:

That dataset warning you’re seeing just means you provided more calibration data than the optimization step was set up to use. By default it only takes 1024 entries for the fine-tune pass, but you can bump this up with --calibration-size to use more of your data. More calibration data usually means better quantization results, especially for YOLO models.
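If you end up driving the optimization from Python instead of the Model Zoo CLI, a minimal sketch of raising that limit might look like this. The dataset_size command and all file names here are assumptions based on the DFC model-script reference, so double-check them against your installed DFC version:

```python
import numpy as np
from hailo_sdk_client import ClientRunner

# Load the parsed model; "yolo11m.har" is a placeholder for your HAR file
runner = ClientRunner(har="yolo11m.har")

# Ask the fine-tune pass to use more calibration images than the
# 1024-entry default. dataset_size is taken from the DFC model-script
# docs -- verify it exists in your DFC version.
runner.load_model_script(
    "post_quantization_optimization(finetune, policy=enabled, dataset_size=4096)\n"
)

# calib_set.npy: an (N, H, W, C) array of representative, preprocessed images
calib_data = np.load("calib_set.npy")
runner.optimize(calib_data)
runner.save_har("yolo11m_optimized.har")
```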

About optimization level 0 - yeah, it definitely affects accuracy, not just speed. When it’s set to 0, the compiler skips a bunch of the post-quantization algorithms that help recover precision. The most likely reason it defaulted to this conservative setting is your GPU problem: the heavier algorithms need GPU acceleration, so when the compiler can’t see a usable GPU it drops down to level 0.
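Once the GPU is visible, you can try forcing a higher level in your model script. Hedged again - the command name below is from the DFC docs and may differ between versions:

```
model_optimization_flavor(optimization_level=2)
```

Levels above 0 are where the fine-tune and other precision-recovery passes actually run, which is where most of the lost accuracy comes back.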

The GPU detection issue - can you share the specific error messages you’re getting? That way I can help you troubleshoot what’s actually going wrong with your T550. It might be a CUDA version mismatch, driver issue, or memory requirements that we can work around. The exact error output would tell us whether it’s a compatibility problem or something we can fix.
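In the meantime, here’s a quick, generic sanity check you can run inside the same conda environment you launch the compiler from. As far as I know the DFC does its GPU work through TensorFlow, so if TF can’t see the T550, the optimizer won’t either (this is plain TensorFlow, nothing Hailo-specific):

```python
# Run inside the conda env you use for the Hailo DFC.
import tensorflow as tf

# If this prints False or an empty list, the problem is the CUDA/driver
# setup in the environment, not anything Hailo-specific.
print("TF built with CUDA:", tf.test.is_built_with_cuda())
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
```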

As for accuracy differences - don’t expect your HEF to match your PyTorch model exactly. You’re going from full precision (FP32) on NVIDIA to quantized INT8 on specialized hardware, so some drop is inevitable. Hailo aims for less than 2% mAP loss with proper fine-tuning, but without it (or with optimization level 0), you could see much bigger drops.
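If you want to put a number on the gap for your own data rather than eyeball it, you could dump detections from both pipelines and compare them directly. A rough, self-contained sketch - the JSON dump format and file names here are hypothetical, so adapt them to however your scripts emit boxes:

```python
import json

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) pixel coordinates."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def recall_vs_reference(ref_dets, test_dets, iou_thr=0.5):
    """Fraction of reference (PyTorch) boxes recovered by the HEF model."""
    matched = total = 0
    for img_id, ref_boxes in ref_dets.items():
        test_boxes = test_dets.get(img_id, [])
        for rb in ref_boxes:
            total += 1
            matched += any(iou(rb, tb) >= iou_thr for tb in test_boxes)
    return matched / max(total, 1)

# Hypothetical dumps: {image_id: [[x1, y1, x2, y2], ...]} per model
with open("pt_dets.json") as f:
    pt_dets = json.load(f)
with open("hef_dets.json") as f:
    hef_dets = json.load(f)

print(f"HEF recovers {recall_vs_reference(pt_dets, hef_dets):.1%} "
      f"of the PyTorch boxes at IoU 0.5")
```

If that number comes back dramatically low, I’d check for a preprocessing mismatch between the two pipelines before blaming quantization.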

Could you share your config and alls files, plus those GPU error messages? That would help me give you more specific advice on how to improve things.