Problem with GPU on Hailo ([warning] Reducing optimization level to 0 (the accuracy won't be optimized and compression won't be used) because there's no available GPU)

Hello.
I have the Hailo AI Software Suite, version 2025-10, and I use the virtual environment from hailo_ai_sw_suite. I need to optimize my model as well as possible, and for that I need a GPU. nvidia-smi works and sees my RTX 5060, but TensorFlow doesn't see it. How can I use the GPU for optimization?

(hailo_venv) mihail@mihail-MS-7D98:~/hailo_light/light_onnx$ nvidia-smi
Fri Nov 21 13:33:13 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5060        Off |   00000000:01:00.0  On |                  N/A |
|  0%   42C    P8             12W / 145W  |     734MiB /  8151MiB  |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                             GPU Memory  |
|        ID   ID                                                              Usage       |
|=========================================================================================|
|    0   N/A  N/A      2519      G   /usr/lib/xorg/Xorg                            292MiB |
|    0   N/A  N/A      2731      G   /usr/bin/gnome-shell                           53MiB |
|    0   N/A  N/A      3332      G   ...exec/xdg-desktop-portal-gnome                3MiB |
|    0   N/A  N/A      3781      G   ...rack-uuid=3190708988185955192              141MiB |
|    0   N/A  N/A      4803      G   /usr/bin/hiddify                               33MiB |
|    0   N/A  N/A    106728      G   /proc/self/exe                                 88MiB |
|    0   N/A  N/A    717910      G   /usr/bin/nautilus                              15MiB |
+-----------------------------------------------------------------------------------------+

(hailo_venv) mihail@mihail-MS-7D98:~/hailo_light/light_onnx$ python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2025-11-21 13:33:18.521258: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-11-21 13:33:18.530488: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1763721198.540700 748114 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1763721198.543995 748114 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-11-21 13:33:18.555496: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
W0000 00:00:1763721200.170204 748114 gpu_device.cc:2433] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

(hailo_venv) mihail@mihail-MS-7D98:~/hailo_light/light_onnx$ hailo --version
[info] No GPU chosen and no suitable GPU found, falling back to CPU.
[info] Current Time: 13:33:28, 11/21/25
[info] CPU: Architecture: x86_64, Model: , Number Of Cores: 12, Utilization: 3.9%
[info] Memory: Total: 31GB, Available: 14GB
[info] System info: OS: Linux, Kernel: 6.14.0-35-generic
[info] Hailo DFC Version: 3.33.0
[info] HailoRT Version: 4.23.0
[info] PCIe: 0000:04:00.0: Number Of Lanes: 1, Speed: 8.0 GT/s PCIe
[info] Running hailo --version
HailoRT v4.23.0
Hailo Dataflow Compiler v3.33.0

(hailo_venv) mihail@mihail-MS-7D98:~/hailo_light/light_onnx$ hailo optimize lg10_kpts.har --calib-set-path ../onnx_models/calib_inputs_correct.npz --hw-arch hailo8
[info] No GPU chosen and no suitable GPU found, falling back to CPU.
[info] Current Time: 13:39:47, 11/21/25
[info] CPU: Architecture: x86_64, Model: , Number Of Cores: 12, Utilization: 8.5%
[info] Memory: Total: 31GB, Available: 14GB
[info] System info: OS: Linux, Kernel: 6.14.0-35-generic
[info] Hailo DFC Version: 3.33.0
[info] HailoRT Version: 4.23.0
[info] PCIe: 0000:04:00.0: Number Of Lanes: 1, Speed: 8.0 GT/s PCIe
[info] Running hailo optimize lg10_kpts.har --calib-set-path ../onnx_models/calib_inputs_correct.npz --hw-arch hailo8
[info] Starting Model Optimization
[warning] Reducing optimization level to 0 (the accuracy won’t be optimized and compression won’t be used) because there’s no available GPU
[warning] Running model optimization with zero level of optimization is not recommended for production use and might lead to suboptimal accuracy results
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.54)
[info] Starting LayerNorm Decomposition

I’m experiencing exactly the same issue — the Hailo Model Zoo (hailomz compile) always reports:

[warning] Reducing optimization level to 0 ... because there's no available GPU

even though the system clearly has a working NVIDIA GPU with CUDA and cuDNN correctly installed.

Below is my environment for reference:

  • OS: Ubuntu 24.04

  • GPU: NVIDIA RTX 5060 Laptop GPU

  • Driver: 580.95.05

  • CUDA Toolkit: 13.0.2 (installed under /usr/local/cuda-13.0)

  • cuDNN: 9.16.0 (installed from NVIDIA official apt repository)

  • Hailo AI Software Suite: 2025-10

  • Python environment: the auto-created hailo_venv inside the SDK

The Hailo system requirements script (sw_suite_check_system_requirements.sh) reports everything as OK, including:

V | CUDA version is 13.0.
V | CUDNN version is 9.16.


What I found

I tracked the problem down to the internal TensorFlow GPU detection called by:

hailo_model_optimization/acceleras/utils/tf_utils.py

TensorFlow performs a GPU usability test:

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
tf.config.experimental.set_memory_growth(gpus[0], True)
with tf.device("/GPU:0"):
    tf.constant(1.0)

In my case, the TensorFlow bundled inside hailo_venv fails to load CUDA/cuDNN and prints:

Could not find cuda drivers on your machine, GPU will not be used.
Cannot dlopen some GPU libraries...
Skipping registering GPU devices...
Physical GPUs: []

So even though the OS sees CUDA correctly, TensorFlow inside hailo_venv does NOT.
This results in:

has_gpu = False
→ optimization_level forced to 0

and therefore the Hailo Model Optimization always runs in CPU-only mode.
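To see why TensorFlow's library loading fails without digging into the SDK, here is a minimal stdlib-only probe (my own sketch, not part of the Hailo code) that tries to dlopen the CUDA runtime and cuDNN shared libraries directly with ctypes. If a library is missing from the loader path here, TensorFlow's "Cannot dlopen some GPU libraries" failure follows from the same cause:

```python
import ctypes

def can_dlopen(libname: str) -> bool:
    """Try to dlopen a shared library; return True on success."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# Sonames TensorFlow GPU builds typically try to load; the exact names
# vary with the TF build, so these are illustrative, not authoritative.
for lib in ("libcudart.so.12", "libcublas.so.12", "libcudnn.so.9"):
    print(f"{lib}: {'found' if can_dlopen(lib) else 'MISSING'}")
```

Running this inside hailo_venv should show which of the libraries the loader actually resolves; a MISSING entry usually points at an LD_LIBRARY_PATH or package-version mismatch rather than a driver problem.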


Conclusion / Question

It looks like the TensorFlow build packaged inside the Hailo SDK is not compatible with CUDA 13.0 + cuDNN 9.16 on Ubuntu 24.04.
Even manually linking headers, adjusting PATH, or installing proper CUDA libraries does not resolve the issue.

Could the Hailo team confirm:

  1. Which CUDA / cuDNN versions are officially supported for TensorFlow GPU mode inside the SDK?

  2. Is there a recommended way to rebuild or override the bundled TensorFlow with a CUDA 13-compatible version?

Any guidance would be greatly appreciated.

Thanks!

Hey @Mihail_Matveenko, @han_star00138,

Your GPU isn’t being used because Hailo’s Dataflow Compiler only works with specific CUDA and cuDNN versions. If your system has a different version installed—even if TensorFlow and nvidia-smi detect the GPU fine—the Hailo compiler will fall back to CPU optimization. This is a version compatibility issue, not a hardware problem.

Hailo DFC supports GPU acceleration only with these exact configurations:

For DFC v5.0.0 (Hailo-10/Hailo-15):

  • GPU Driver: 525
  • CUDA: 11.8
  • cuDNN: 8.9

For DFC v3.32.0 / v3.33.0 (Hailo-8):

  • GPU Driver: 555+
  • CUDA: 12.5.1
  • cuDNN: 9.10

Install the matching stack for your DFC version, then verify with:

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Important Limitations

  • CUDA 13.x is not supported. There is no method to rebuild Hailo’s bundled TensorFlow for CUDA 13.
  • WSL2 is not supported for GPU optimization—use native Linux instead.
  • On Docker, add the --gpus all flag and ensure your driver version is compatible.
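As a quick sanity check before reinstalling anything, a small script (a sketch of mine, not a Hailo tool) can compare the versions on your system against the DFC v3.32.0/v3.33.0 requirements above, matching on major.minor:

```python
# Expected GPU stack for Hailo DFC v3.32.0 / v3.33.0 (from the list above).
REQUIRED = {"cuda": "12.5", "cudnn": "9.10"}

def matches(installed: str, required: str) -> bool:
    """True if the installed version has the required major.minor."""
    return installed.split(".")[:2] == required.split(".")[:2]

# Example: the versions from the original post, which do NOT match.
installed = {"cuda": "13.0.2", "cudnn": "9.16.0"}
for component, required in REQUIRED.items():
    ok = matches(installed[component], required)
    print(f"{component}: installed {installed[component]}, "
          f"required {required} -> {'OK' if ok else 'mismatch'}")
```

With the CUDA 13.0.2 / cuDNN 9.16 stack from the posts above, both components report a mismatch, which is exactly why the DFC falls back to CPU.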

Hope this helps!