GPU not detected in Hailo Suite

Hi guys, I can’t get my GPU to be detected within the Hailo virtualenv.

I’m doing a model retraining, and it works alright while using the model zoo yolov8 docker, so I can do the training and export, as below:

But when I switch to the Hailo docker for the compilation, the GPU is not detected.

I can still complete the compilation, but just wanted to get this sorted to hopefully make full use of my hardware.

Thank you team!

First make sure you install the GPU driver and nvidia-docker2 support. See Hailo AI SW Suite User Guide for detailed instructions.

Second you can check the hailo_ai_sw_suite_docker_run.sh file. There is a function that checks for GPU support. You can run the commands used in there manually in a terminal and see what output you get.

nvidia-smi

This should not fail.

nvidia-smi -q | grep "Driver Version" | awk '{print $4}'

The version returned here should be greater than or equal to the value of the req_driver variable defined near the beginning of the script.

Additionally check the two variables/macros NVIDIA_GPU_EXIST and NVIDIA_DOCKER_EXIST and run the commands as well.
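
The version comparison described above can be sketched in shell. This is only an illustration, not the code from hailo_ai_sw_suite_docker_run.sh: the helper name version_ge and the fallback value 525 are assumptions, so take the real minimum from req_driver in your copy of the script.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum for illustration only; use the req_driver value from the script.
req_driver="${REQ_DRIVER:-525}"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Same pipeline as above: extract the installed driver version.
    driver="$(nvidia-smi -q | grep "Driver Version" | awk '{print $4}')"
    if version_ge "$driver" "$req_driver"; then
        echo "driver $driver is new enough (minimum $req_driver)"
    else
        echo "driver '$driver' is missing or older than $req_driver"
    fi
else
    echo "nvidia-smi not found on this system"
fi
```

To try a different minimum without editing the snippet, set REQ_DRIVER in the environment before running it.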

Running nvidia-smi inside the Hailo docker says “command not found”. Running it outside the container works and detects the GPU:

The second command returns: 565.90

I’m not sure exactly how to check the variables. I used:

printenv NVIDIA_GPU_EXIST
printenv NVIDIA_DOCKER_EXIST

and got nothing back.

Thanks @klausk , forgot to tag you before.

You need to run the command. See the script file:

NVIDIA_GPU_EXIST

lspci | grep "VGA compatible controller: NVIDIA"

NVIDIA_DOCKER_EXIST

dpkg -l | grep 'nvidia-docker\|nvidia-container-toolkit'
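
As a rough sketch, both checks can be wrapped so that an empty result is reported explicitly instead of silently printing nothing. The helper names report, gpu_check and docker_check are made up for this example; only the two pipelines come from the script.

```shell
# Run a command and say whether it produced any output.
# $1 is a label; the remaining arguments are the command to run.
report() {
    name="$1"
    shift
    if [ -n "$("$@" 2>/dev/null)" ]; then
        echo "$name: found"
    else
        echo "$name: empty"
    fi
}

# The two pipelines behind NVIDIA_GPU_EXIST and NVIDIA_DOCKER_EXIST.
gpu_check() { lspci | grep "VGA compatible controller: NVIDIA"; }
docker_check() { dpkg -l | grep 'nvidia-docker\|nvidia-container-toolkit'; }

report NVIDIA_GPU_EXIST gpu_check
report NVIDIA_DOCKER_EXIST docker_check
```

An "empty" line for either label means the corresponding check in the run script would not pass.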

@klausk I ran those and got nothing back.

The first one is weird because nvidia-smi shows your GPU. Run the lspci command without the grep command and see what string is shown for the GPU.

The second indicates you did not install NVIDIA docker support. Make sure you install nvidia-docker2 support. See Hailo AI SW Suite User Guide for detailed instructions.

@klausk
I remember installing docker2 before, but just tried it again as per Hailo AI Suite instructions below:

The first bit, distribution, gives an error message.
Typing the other commands seems to work.

After that, running “lspci” returns:
10
and "dpkg -l | grep 'nvidia-docker\|nvidia-container-toolkit'" returns:
12

To clarify, I’m running everything from WSL.

Docker Desktop is installed on Windows.

Nvidia GPU driver was installed on Windows.

We do support the Hailo Dataflow Compiler under WSL2. However this was mainly developed to support some partner frameworks to use the Hailo Dataflow Compiler under the hood on Windows systems.

We do not support/validate the Hailo AI Software Suite Docker under Windows.

If you must use the DFC under Windows install it directly using the instructions in the Hailo Dataflow Compiler User Guide.

I recommend using the Hailo AI Software Suite Docker directly on Ubuntu. This provides the easiest installation and upgrade path.
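
Since the Suite Docker is not validated under Windows, it can help to confirm whether a shell is actually running under WSL before debugging GPU detection further. A minimal sketch, assuming the usual WSL behaviour of putting "microsoft" in the kernel version string; the function name is_wsl is invented for this example.

```shell
# Succeeds when the kernel version string mentions Microsoft (WSL1 and WSL2).
# An alternative file can be passed as $1; the default is /proc/version.
is_wsl() {
    grep -qi microsoft "${1:-/proc/version}" 2>/dev/null
}

if is_wsl; then
    echo "Running under WSL: the Suite Docker is not supported here"
else
    echo "Not running under WSL"
fi
```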

You can enable GPU on WSL by following these instructions:

For the Dataflow compiler, install the wheel in WSL by following these instructions:

I haven’t had success with these instructions, unfortunately. I am also trying to use my NVIDIA GPU on WSL. I have tried both the Docker image and the extractable SW suite, which I installed in a virtualenv. In the virtualenv, I run nvidia-smi and can see that the GPU is present and recognized, but when I run hailo/hailomz it gives me a message about no GPU being available. Is there any further guidance?

Installing the Hailo-sw suite in WSL is not supported. You can only install DFC by following: How to install the Hailo Dataflow compiler (DFC) on WSL2

Thanks for the reply. I was able to successfully install DFC in my Ubuntu WSL2 instance, but DFC refuses to recognize the existence of my GPU. I can run nvidia-smi in the WSL2 instance and confirm the Ubuntu sees the GPU, but I just can’t get DFC to do the same. I would love some more pointers, thank you!

I would like to wake this thread up again. I have to say that I am very disappointed with the state of the Hailo DFC and Software Toolkit. I have found it impossible to get the non-Docker installations running, either under WSL or directly on Ubuntu (generally, the issues seem to be due to pip incompatibilities). But that is not the focus of this post, as I have now moved to Docker under Ubuntu 24.04.2 and I am seeing the same GPU issues that were reported earlier in this thread and still remain unresolved after 3 months.

Installing nvidia-docker2 does not work with the current instructions. I guess that Nvidia have changed something in their GitHub repo. The command:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L \
https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list \
| sudo tee /etc/apt/sources.list.d/nvidia-docker.list

gives the following error message:

Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
OK
# Unsupported distribution!
# Check https://nvidia.github.io/nvidia-docker

So even though my GPU is detected with nvidia-smi at version 550.120 it is not appearing within the container.

I have been trying to do something that should be fundamental - convert an ONNX file to HEF - for over two weeks now with no success and am getting frustrated. The folks at Hailo should really try to look at the state of the DFC documentation and bring it up to date ASAP. Without the ability to train custom models, the Hailo board for the RPi is little more than a toy.

I’m really sorry you’re running into this. I completely understand how frustrating it can be when things don’t work as expected.

The Hailo AI Software Suite is officially supported on Ubuntu 20.04 and 22.04, as outlined in the User Guide. If you’re using a different OS version, you may encounter compatibility issues.

We plan to support Ubuntu 24 and deprecate support for Ubuntu 20 in one of the next versions.
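
A quick way to check a host against the supported versions named above is to read /etc/os-release, the same file the nvidia-docker install command sources. This is a sketch only; the function name is_supported_release is an assumption, and the version list simply mirrors the two releases mentioned in this reply.

```shell
# Succeeds for the releases listed as supported in this thread.
# $1 is the distribution ID, $2 the VERSION_ID from /etc/os-release.
is_supported_release() {
    [ "$1" = "ubuntu" ] || return 1
    case "$2" in
        20.04|22.04) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /etc/os-release ]; then
    # Source the file in subshells so its variables do not leak into this shell.
    os_id="$(. /etc/os-release; echo "$ID")"
    os_ver="$(. /etc/os-release; echo "$VERSION_ID")"
    if is_supported_release "$os_id" "$os_ver"; then
        echo "$os_id $os_ver is on the supported list"
    else
        echo "$os_id $os_ver is not on the supported list"
    fi
else
    echo "/etc/os-release not readable"
fi
```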

Ok, thank you for the quick reply. I will install Ubuntu 22.04 to try again and report back.

Okay, I can confirm that with Ubuntu 22.04 I can now use nvidia-docker2 and I can see the GPU inside the Docker container. For others arriving here it is worth noting that:

  • The sudo apt-key command still gives a deprecation warning but the rest of the commands go through successfully.
  • The documentation says to use Docker 20.10.6 but this is no longer available in the archive. I used 20.10.13.
  • The docs say to use GPU driver 525 but the one I found was 535 and this seems to work okay.
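
For anyone wanting to confirm the same result, one common smoke test is to run nvidia-smi in a throwaway container with GPUs enabled. This is a hedged sketch, not a step from the Hailo documentation: the function name gpu_in_container, the DOCKER override and the ubuntu:22.04 image are choices made for this example.

```shell
# Run nvidia-smi inside a temporary container with all GPUs exposed.
# $1 optionally names the image (assumed default: ubuntu:22.04).
# Set DOCKER=echo to print the arguments instead of running docker.
gpu_in_container() {
    "${DOCKER:-docker}" run --rm --gpus all "${1:-ubuntu:22.04}" nvidia-smi
}

# Usage:
#   gpu_in_container                # test with the default image
#   DOCKER=echo gpu_in_container    # dry run, prints the arguments only
```

If the NVIDIA container support is installed correctly, the output should match what nvidia-smi prints on the host; a "command not found" error inside the container points back to the missing Docker GPU support.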

Thanks for your help getting this sorted.