Can I decouple Nvidia driver and CUDA installation?

Question

Like others, I've been very confused by the instructions for installing a specific CUDA version for deep learning.

Today's main deep learning libraries (TensorFlow and PyTorch) don't support the latest CUDA version, which is 11.2. But when I install the recommended NVIDIA driver after a fresh Ubuntu installation, I end up with CUDA 11.2 by default, as can be seen when I run nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:09:00.0  On |                  N/A |
| 30%   37C    P8    40W / 350W |    443MiB / 24259MiB |     12%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       954      G   /usr/lib/xorg/Xorg                 59MiB |
|    0   N/A  N/A      1470      G   /usr/lib/xorg/Xorg                161MiB |
|    0   N/A  N/A      1603      G   /usr/bin/gnome-shell              125MiB |
|    0   N/A  N/A      2031      G   ...AAAAAAAAA= --shared-files       59MiB |
+-----------------------------------------------------------------------------+
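A note on reading this output: the "CUDA Version" field in the nvidia-smi header reports the highest CUDA version the installed driver supports, not a CUDA toolkit that is installed on the machine. A minimal sketch, assuming the header layout shown above, of pulling the two version fields out of that line (the function name parse_smi_header is hypothetical, chosen only for illustration):

```python
import re

# Header line copied from the nvidia-smi output above.
HEADER = "| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |"


def parse_smi_header(line):
    """Return (driver_version, cuda_version) parsed from an nvidia-smi header line."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if match is None:
        raise ValueError("not an nvidia-smi header line")
    return match.group(1), match.group(2)


driver, cuda = parse_smi_header(HEADER)
# The second field is a capability of the driver, not an installed toolkit.
print(f"driver {driver} supports CUDA runtimes up to {cuda}")
```

Under this reading, a driver reporting 11.2 can sit alongside an older toolkit such as 11.0.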

I tried following this Medium guide to downgrade CUDA to 11.0, using these instructions from NVIDIA for installing the version I want. But a few things are confusing me:

Why is the Medium guide telling me to delete nvidia? I want to leave the drivers the way they are and just change CUDA.
The rm -rf also looks scary. Feels wrong. Is it?
I end up getting an error when trying to install CUDA 11.0, something about lots of missing dependencies. I don't have the message anymore because I've jumped ship and wiped everything for a fresh start.

You can check the version of your CUDA installation with nvcc -V. As for installing CUDA, if you download the .run file version, you can deselect the NVIDIA driver that it would otherwise install; that way the two are "decoupled". — Terrance, Apr 10 '21 at 14:29
Thanks for jumping in @Terrance. A couple of things here: the nvcc command is not found, and I've looked here but I don't actually have a /usr/local/cuda. Regarding the .run file, I gave that a go and I'm being told: "Existing package manager installation of the driver found. It is strongly recommended that you remove this before continuing." — Alexander Soare, Apr 10 '21 at 14:44
You can have more than one CUDA version installed at a time, so it is OK to go ahead and install the version of CUDA that you want. I have written up some answers at https://askubuntu.com/questions/1077061/how-do-i-install-nvidia-and-cuda-drivers-into-ubuntu that show how to do the .run file installations and then how to add the CUDA library paths, etc. You can find the one that suits you. It is also quite possible to just change the version in my answer to match the one you want. — Terrance, Apr 10 '21 at 21:09
@Terrance this is like the book The Alchemist, where I had everything I was looking for at the start of my journey... There was a warning when installing with the .run file saying that the driver wasn't installed properly. The warning felt scary, but it was merely acknowledging the fact that I had unticked the driver installation option. This sent me on a wild goose chase for hours... I've sorted it out now. Many thanks for your help — Alexander Soare, Apr 10 '21 at 22:19
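The comments mention having several CUDA versions side by side and adding the CUDA paths afterwards. A rough sketch of that idea, assuming the runfile's usual install location of /usr/local/cuda-X.Y with bin and lib64 subdirectories (the helper names installed_toolkits and toolkit_env are hypothetical, and the layout may differ on your system):

```python
import os
from pathlib import Path


def installed_toolkits(root="/usr/local"):
    """Return the names of cuda-* directories under root, sorted by name."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.glob("cuda-*") if p.is_dir())


def toolkit_env(toolkit_dir, env=None):
    """Return a copy of env with the toolkit's bin and lib64 directories prepended."""
    env = dict(os.environ if env is None else env)
    toolkit = Path(toolkit_dir)
    # Prepending lets this toolkit's nvcc and libraries win over any others.
    env["PATH"] = os.pathsep.join(
        filter(None, [str(toolkit / "bin"), env.get("PATH", "")])
    )
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        filter(None, [str(toolkit / "lib64"), env.get("LD_LIBRARY_PATH", "")])
    )
    return env


print(installed_toolkits())
print(toolkit_env("/usr/local/cuda-11.0", env={"PATH": "/usr/bin"})["PATH"])
```

An empty list from installed_toolkits() would be consistent with nvcc not being found: the driver is present but no toolkit has been installed yet.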

score 0 · Answer 1 · answered May 04 '22 at 00:20

Adding to Alexander's answer about unticking the driver option: here's a screenshot showing where you can do this (CUDA 11.6 runfile installation on Ubuntu 22.04).
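For a non-interactive equivalent of unticking the driver, the runfile installer also accepts command-line options. A hedged sketch, assuming the --silent and --toolkit options described in NVIDIA's Linux installation guide; the runfile name below is a placeholder, and how the installer reacts to an existing package-manager driver in silent mode is not covered here:

```python
import shlex


def toolkit_only_command(runfile):
    """Build a runfile command that installs only the CUDA toolkit."""
    # --silent skips the interactive menu; --toolkit selects the toolkit alone.
    # The bundled driver is left out because --driver is not passed.
    return ["sudo", "sh", runfile, "--silent", "--toolkit"]


# Hypothetical file name; use the runfile actually downloaded from NVIDIA.
cmd = toolkit_only_command("cuda_linux.run")
print(shlex.join(cmd))
```

The sketch only prints the command so it can be reviewed before being run by hand.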

ATutorMe
