GPU not detected in Hailo Suite

Hi guys, I can’t get my GPU detected inside the Hailo virtualenv.

I’m retraining a model, and it works fine with the Model Zoo YOLOv8 Docker, so I can do the training and export, as below:

But when I switch to the Hailo Docker for the compilation, the GPU is not detected.

I can still complete the compilation, but I wanted to get this sorted so I can make full use of my hardware.

Thank you team!

First, make sure you have installed the GPU driver and nvidia-docker2 support. See the Hailo AI SW Suite User Guide for detailed instructions.

Second, you can check the hailo_ai_sw_suite_docker_run.sh file. It contains a function that checks for GPU support. You can run the commands it uses manually in a terminal and see what output you get.

nvidia-smi

This should not fail.

nvidia-smi -q | grep "Driver Version" | awk '{print $4}'

The version returned here should be greater than or equal to the value of the req_driver variable defined near the beginning of the script.
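To make that comparison less error-prone than eyeballing two numbers, here is a minimal sketch. The REQ_DRIVER value and the version_ge and check_driver helpers are hypothetical names for illustration, not taken from the Hailo script; copy the real minimum from req_driver in hailo_ai_sw_suite_docker_run.sh. It assumes a sort that supports -V (GNU coreutils).

```shell
#!/usr/bin/env bash
# Sketch only: compare the installed NVIDIA driver version with a minimum.
# REQ_DRIVER is a placeholder, NOT the real Hailo requirement. Replace it
# with the req_driver value found in hailo_ai_sw_suite_docker_run.sh.
REQ_DRIVER="${REQ_DRIVER:-525}"

# Succeeds when version $1 is greater than or equal to version $2.
# sort -V orders version strings; if $2 sorts first (or ties), $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reads the driver version with the same pipeline quoted in this thread.
check_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found on this system"
        return 1
    fi
    installed="$(nvidia-smi -q | grep "Driver Version" | awk '{print $4}')"
    if version_ge "$installed" "$REQ_DRIVER"; then
        echo "Driver $installed is new enough (required: $REQ_DRIVER)"
    else
        echo "Driver $installed is too old (required: $REQ_DRIVER)"
        return 1
    fi
}

# Do not abort the shell if the check fails; just report.
check_driver || true
```

Run it on the host, not inside the container, since that is where the run script performs its checks.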

Additionally, check the two variables/macros NVIDIA_GPU_EXIST and NVIDIA_DOCKER_EXIST in the script and run their commands as well.

Running nvidia-smi inside the Hailo Docker says “command not found”. Running it outside the container works and detects the GPU:

The second command returns: 565.90

I’m not sure exactly how to check the variables. I used:

printenv NVIDIA_GPU_EXIST
printenv NVIDIA_DOCKER_EXIST

and got nothing back.

Thanks @klausk, I forgot to tag you before.

These are not environment variables; you need to run the commands assigned to them. See the script file:

NVIDIA_GPU_EXIST

lspci | grep "VGA compatible controller: NVIDIA"

NVIDIA_DOCKER_EXIST

dpkg -l | grep 'nvidia-docker\|nvidia-container-toolkit'
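If it helps, both checks can be wrapped so they print an explicit result instead of silently returning nothing. This is only a sketch: the gpu_line_matches and docker_pkg_matches helper names are made up for illustration, and the grep patterns are the ones quoted above, not a copy of the script's full logic.

```shell
#!/usr/bin/env bash
# Sketch only: run the two host checks and print a clear yes/no for each.
# Helper names are hypothetical; the patterns are the ones quoted above.

# Reads lspci-style output on stdin; succeeds if an NVIDIA VGA device is listed.
gpu_line_matches() {
    grep -q "VGA compatible controller: NVIDIA"
}

# Reads dpkg -l style output on stdin; succeeds if NVIDIA Docker support is listed.
docker_pkg_matches() {
    grep -q 'nvidia-docker\|nvidia-container-toolkit'
}

if command -v lspci >/dev/null 2>&1; then
    if lspci | gpu_line_matches; then
        echo "NVIDIA_GPU_EXIST check: match found"
    else
        echo "NVIDIA_GPU_EXIST check: no match (inspect plain lspci output)"
    fi
else
    echo "lspci is not installed"
fi

if command -v dpkg >/dev/null 2>&1; then
    if dpkg -l | docker_pkg_matches; then
        echo "NVIDIA_DOCKER_EXIST check: match found"
    else
        echo "NVIDIA_DOCKER_EXIST check: no match (package not installed?)"
    fi
else
    echo "dpkg is not available on this system"
fi
```

An empty result from the original commands corresponds to the "no match" branches here.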

@klausk I ran those and got nothing back.

The first one is weird because nvidia-smi shows your GPU. Run the lspci command without the grep command and see what string is shown for the GPU.

The second indicates you did not install NVIDIA docker support. Make sure you install nvidia-docker2 support. See Hailo AI SW Suite User Guide for detailed instructions.

@klausk
I remember installing nvidia-docker2 before, but I just tried it again as per the Hailo AI Suite instructions below:

The first bit, distribution, gives an error message.
Typing the other commands seems to work.

After that, running “lspci” returns:
10
and "dpkg -l | grep 'nvidia-docker\|nvidia-container-toolkit'" returns:
12

To clarify, I’m running everything from WSL.

Docker Desktop is installed on Windows.

The Nvidia GPU driver is installed on Windows.
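For anyone else landing here, a quick way to confirm whether a shell is running under WSL is to look for "microsoft" in the kernel version string. This is a minimal sketch that assumes the WSL kernel reports that string in /proc/version, which is the usual case; the is_wsl helper name is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: detect WSL by looking for "microsoft" in the kernel version.
# Assumption: WSL kernels include that string in /proc/version.
# An optional first argument overrides the file to inspect.
is_wsl() {
    grep -qi "microsoft" "${1:-/proc/version}" 2>/dev/null
}

if is_wsl; then
    echo "This shell is running under WSL"
else
    echo "This shell does not appear to be running under WSL"
fi
```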

We do support the Hailo Dataflow Compiler under WSL2. However, this was mainly developed so that some partner frameworks can use the Hailo Dataflow Compiler under the hood on Windows systems.

We do not support/validate the Hailo AI Software Suite Docker under Windows.

If you must use the DFC under Windows, install it directly using the instructions in the Hailo Dataflow Compiler User Guide.

I recommend using the Hailo AI Software Suite Docker on Ubuntu directly. This provides the easiest installation and upgrade path.