Make sure you are using an Ubuntu 20/22 host machine when running the docker.
The Docker image already contains the right CUDA and cuDNN versions for the specific version of the suite you are using. However, you still have to take care of two things:
Install Nvidia driver 525 outside the docker, to have better compatibility with CUDA 11.8 (used inside the docker).
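To confirm which driver the host is actually running, you can query it with nvidia-smi. The snippet below is only a minimal sketch: driver_major is a hypothetical helper name (not part of the Hailo scripts), and the commented usage assumes nvidia-smi is available on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Hailo suite): print the major part
# of a driver version string such as "535.104.05".
driver_major() {
    # Strip everything from the first dot onward.
    printf '%s\n' "${1%%.*}"
}

# Example use on the host (assumes the NVIDIA driver and nvidia-smi are installed):
#   ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   echo "Host driver major version: $(driver_major "$ver")"
```

Run this on the host, outside the container, since the driver is the part that the Docker image does not provide.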
My host is Ubuntu 20.04, but the driver is 535. It seems I have no control over that, as this is an Azure ML instance. I’ve tried uninstalling it and reinstalling the correct driver, but it just gives me problems.
I have installed nvidia-docker2 but that doesn’t seem to change anything.
Look at the commands in the hailo_ai_sw_suite_docker_run.sh script which check for GPU availability, as explained here: GPU not detected in Hailo Suite - #5 by klausk
The script uses two internal variables, NVIDIA_GPU_EXIST and NVIDIA_DOCKER_EXIST, to decide whether to enable GPU support the first time the docker container is launched.
From within the new docker, you should be able to get an output when running the following commands.
NVIDIA_GPU_EXIST corresponds to the output of:
nvidia-docker2 must be installed before creating the container
the GPU device must appear as a “VGA compatible controller”. Your device - listed as a “3D controller” - was filtered out by the grep "VGA compatible controller: NVIDIA" command.
This is under investigation and will be fixed in the next releases.
If you want to use the GPU in the docker, you can modify the grep command in the docker script to look for both “VGA compatible controller: NVIDIA” and “3D controller: NVIDIA”. Then you can create a new container.
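As a rough sketch of what such a modified check could look like, the function below matches either controller type with a single extended regular expression. The name has_nvidia_gpu is hypothetical and the surrounding logic of hailo_ai_sw_suite_docker_run.sh is not reproduced here, so treat this as an illustration of the grep pattern rather than a drop-in patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the Hailo script): succeeds if the
# lspci-style text on stdin lists an NVIDIA device as either a
# "VGA compatible controller" or a "3D controller".
has_nvidia_gpu() {
    grep -Eq '(VGA compatible controller|3D controller): NVIDIA'
}

# Example use on the host (assumes lspci from pciutils is installed):
#   if lspci | has_nvidia_gpu; then
#       echo "NVIDIA GPU detected"
#   fi
```

Testing the pattern on its own like this, before editing the script, makes it easier to tell whether a later failure comes from the grep change or from something else in the container setup.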
Thanks, I’ll try that next time. Although, I’m quite sure I tried modifying the grep to pick it up and it caused it to break in other ways. I don’t have the results available to me right now though, sorry.