Support for CUDA 12 in AI Software Suite

We have new workstations with RTX 5090 GPUs. Unfortunately, we can no longer use the AI Software Suite, as these cards do not support CUDA 11. I tried updating to a newer NVIDIA Docker image with CUDA 12.8, but that leads to other issues down the road with TensorFlow and related dependencies.

Is the upgrade on the roadmap in the near future?

Hey @Steffen_Urban

Welcome to the Hailo Community!

We’re actively working on official support for the RTX 5090 / Blackwell architecture, but there’s no confirmed timeline just yet. I’ll update this thread as soon as we have more concrete info.

In the meantime, here’s what I’d try right now:

System Requirements (Hailo‑8 / Hailo‑10)

Whether you’re using:

  • Hailo‑8 / Hailo‑8L → DFC 3.33.0
  • Hailo‑10 / Hailo‑15 → DFC 5.1.0

…the GPU requirements are the same:

  • NVIDIA Driver: 555 or newer
  • CUDA: 12.5.1
  • cuDNN: 9.10

This setup also aligns with TensorFlow 2.18.0 and is consistent across the 3.32+/3.33 and 5.x lines.
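
If it helps, here is a minimal sketch for checking a workstation against the driver requirement listed above. It assumes `nvidia-smi` is on the PATH and that your driver supports the `compute_cap` query field. The helper names (`parse_gpu_query`, `check_gpu`, `query_gpus`) are made up for this example and are not part of any Hailo tool.

```python
import subprocess

MIN_DRIVER_MAJOR = 555  # minimum driver from the requirements listed above


def parse_gpu_query(csv_text):
    """Parse 'name, driver_version, compute_cap' CSV lines from nvidia-smi."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so an unexpected comma in the name is harmless.
        name, driver, cap = [field.strip() for field in line.rsplit(",", 2)]
        gpus.append({
            "name": name,
            "driver_major": int(driver.split(".")[0]),
            "compute_cap": cap,
        })
    return gpus


def check_gpu(gpu):
    """Return a list of human-readable warnings for one parsed GPU entry."""
    warnings = []
    if gpu["driver_major"] < MIN_DRIVER_MAJOR:
        warnings.append(
            f"driver {gpu['driver_major']} is older than {MIN_DRIVER_MAJOR}"
        )
    if gpu["compute_cap"] == "12.0":
        warnings.append("compute capability 12.0 (SM_120) is not officially supported yet")
    return warnings


def query_gpus():
    """Ask nvidia-smi for GPU info; return an empty list if it is unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=name,driver_version,compute_cap",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_query(out)
```

Calling `check_gpu(gpu)` for each entry returned by `query_gpus()` gives a quick list of anything that falls outside the stated requirements. It only checks the driver and the GPU architecture, not the CUDA or cuDNN versions.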


:warning: About RTX 5090 / Blackwell

Blackwell (SM_120) isn’t officially supported yet — but with the right versions, it might work. Still, stability isn’t guaranteed.

What I’d recommend:

  • Make sure your driver is 555+
  • Use CUDA 12.5.1 and cuDNN 9.10
  • Let DFC manage its own TensorFlow environment to avoid conflicts

Even with that setup, errors like CUDA_ERROR_INVALID_HANDLE may still appear — we’re tracking this internally, but support isn’t finalized yet.


:white_check_mark: Workaround for Now

Your current fallback using:

CUDA_VISIBLE_DEVICES=-1

…is the safest and most stable approach for now. It works fine — just without GPU acceleration.
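
As a rough sketch, this is one way to apply that variable to a single command from a Python wrapper instead of exporting it for the whole shell. The helper names `cpu_only_env` and `run_cpu_only` are hypothetical, and the command you pass in is whatever you normally run; nothing here is Hailo-specific.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = "-1"  # CUDA-based libraries will see no GPU
    return env


def run_cpu_only(cmd):
    """Run a command (list of arguments) with GPU access disabled."""
    return subprocess.run(cmd, env=cpu_only_env(), capture_output=True, text=True)


# Demonstration: the child process sees the variable set by the wrapper.
result = run_cpu_only(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
)
print(result.stdout.strip())
```

Because the variable is set only in the child's environment, other processes on the same machine can still use the GPU. If you set it inside your own Python script instead, it needs to happen before TensorFlow is imported.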


Hope this clarifies things!

I am also trying to compile a custom YOLOv11n model using the 10-25 suite container on a machine with an RTX 5090. It runs fine if I comment out the gpus=all in the setup.sh file. If I read the output correctly, I am getting no optimization and my calibration files are being ignored. The resulting model, hosted on a Trixie RPi-5 with a Hailo-8, seems to be much slower than a similar model (different single object class) that I built last spring and ran on the RPi-5 with the rpi5-examples. The compiler output seems to be asking for a GPU and 1024 calibration images in order to optimize.

Do I need both?