3

I installed nvidia-cuda-toolkit on Ubuntu 22.04, and the installation removed libnvidia-compute-515 and nvidia-utils-515, which also removed nvidia-smi. If I try to update my drivers with sudo ubuntu-drivers autoinstall, it says:

The following packages were automatically installed and are no longer required:
  libaccinj64-11.5 libcub-dev libcublas11 libcublaslt11 libcudart11.0 libcufft10 libcufftw10 libcurand10 libcusolver11 libcusolvermg11 libcusparse11 libnppc11 libnppial11 libnppicc11 libnppidei11
  libnppif11 libnppig11 libnppim11 libnppist11 libnppisu11 libnppitc11 libnpps11 libnvblas11 libnvjpeg11 libnvrtc-builtins11.5 libnvrtc11.2 libnvtoolsext1 libnvvm4 libtbb-dev libtbb12 libtbbmalloc2
  libthrust-dev libvdpau-dev nvidia-cuda-gdb nvidia-cuda-toolkit-doc nvidia-opencl-dev ocl-icd-opencl-dev opencl-c-headers opencl-clhpp-headers
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
  libgles2:i386 libnvidia-cfg1-515 libnvidia-common-515 libnvidia-compute-515 libnvidia-compute-515:i386 libnvidia-decode-515 libnvidia-decode-515:i386 libnvidia-encode-515 libnvidia-encode-515:i386
  libnvidia-extra-515 libnvidia-fbc1-515 libnvidia-fbc1-515:i386 libnvidia-gl-515 libnvidia-gl-515:i386 libopengl0:i386 libxnvctrl0 nvidia-compute-utils-515 nvidia-dkms-515 nvidia-prime
  nvidia-settings nvidia-utils-515 screen-resolution-extra xserver-xorg-video-nvidia-515
The following packages will be REMOVED:
  libcuinj64-11.5 libnvidia-compute-495 libnvidia-ml-dev nvidia-cuda-dev nvidia-cuda-toolkit nvidia-profiler nvidia-visual-profiler
The following NEW packages will be installed:
  libgles2:i386 libnvidia-cfg1-515 libnvidia-common-515 libnvidia-compute-515 libnvidia-compute-515:i386 libnvidia-decode-515 libnvidia-decode-515:i386 libnvidia-encode-515 libnvidia-encode-515:i386
  libnvidia-extra-515 libnvidia-fbc1-515 libnvidia-fbc1-515:i386 libnvidia-gl-515 libnvidia-gl-515:i386 libopengl0:i386 libxnvctrl0 nvidia-compute-utils-515 nvidia-dkms-515 nvidia-driver-515
  nvidia-prime nvidia-settings nvidia-utils-515 screen-resolution-extra xserver-xorg-video-nvidia-515
0 upgraded, 24 newly installed, 7 to remove and 0 not upgraded.

It wants to remove nvidia-cuda-toolkit and other packages. How can I get GPU usage statistics while nvidia-cuda-toolkit is installed, and install nvidia-smi without removing the packages that the driver update lists for removal?

I want to set up the GPU for TensorFlow, but the installation always complains about missing packages.

JAMSHAID
  • 281
  • Avoid the dependency mess: get your Nvidia drivers working, then install CUDA with the .run script, declining the driver it offers and setting the bin/lib locations to /usr/local/cudaxx after temporarily taking write permission on /usr/local. No sudo needed. See https://askubuntu.com/questions/1077061/how-do-i-install-nvidia-and-cuda-drivers-into-ubuntu/1077063#1077063 and https://askubuntu.com/questions/1219761/cuda-10-2-different-installation-paths/1244010#1244010 – ubfan1 Sep 10 '22 at 15:54
  • @ubfan1 That causes issues when working with TensorFlow: it says some packages are missing when I use the CUDA toolkit Debian file. – JAMSHAID Sep 10 '22 at 16:29
  • Your hardware capabilities may limit what version of CUDA will run, and software like Tensorflow may have its own version requirements. Select a CUDA compatible with both requirements (if possible). It may be possible to get things running with enough tweaking, but it's much easier to stay on documented paths. – ubfan1 Sep 10 '22 at 17:10

2 Answers

4

The problem is that the latest version of nvidia-cuda-toolkit does not match the latest driver version. In the example you gave, nvidia-cuda-toolkit wants libnvidia-compute-495, but the latest driver (515) depends on libnvidia-compute-515. The two cannot coexist, so apt's only option is to remove the package that depends on the other version.
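To see the mismatch directly in the apt output from the question, here is a minimal Python sketch. The helper name compute_branches and the two sample strings are illustrative only (the strings are shortened excerpts of the "REMOVED" and "NEW packages" lists above); it assumes the driver branch is the number at the end of each libnvidia-compute-N package name.

```python
import re

# Matches Ubuntu's per-branch compute library packages such as
# "libnvidia-compute-515"; the capture group is the driver branch number.
_COMPUTE_PKG = re.compile(r"libnvidia-compute-(\d+)")


def compute_branches(package_list: str) -> set:
    """Return the driver branch numbers of all libnvidia-compute-N packages
    named in a whitespace-separated package list (hypothetical helper)."""
    return {int(m.group(1)) for m in _COMPUTE_PKG.finditer(package_list)}


# Shortened excerpts from the apt output quoted in the question.
removed = "libcuinj64-11.5 libnvidia-compute-495 libnvidia-ml-dev nvidia-cuda-dev nvidia-cuda-toolkit"
installed = "libnvidia-compute-515 libnvidia-compute-515:i386 nvidia-driver-515 nvidia-utils-515"

print(compute_branches(removed))    # {495}
print(compute_branches(installed))  # {515}
```

The two sets differ (495 versus 515), which is the conflict described above: the packages on the removal list are tied to one branch of the compute library and the new driver to another.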

You can solve this by installing the driver version that the latest available nvidia-cuda-toolkit depends on; in your case, apt install nvidia-driver-495.

At the time of writing, the latest driver release that has a matching CUDA version in Ubuntu 22.04 is 510.

rem
  • 141
3

I ran into exactly the same problem, and if I interpret your comments correctly, you also need it for TensorFlow. To get TensorFlow running on the GPU, you only need the driver, CUDA, and cuDNN.

So in my case, this worked:

  1. Install the GPU driver
  2. Install CUDA following https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
  3. Install cuDNN https://developer.nvidia.com/cudnn

I know this does not directly answer your question, but I cannot write comments, and if it's only for TensorFlow, this should work. At least on my computer, TensorFlow is now able to use the GPU.
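One way to check the result after these steps is a short Python script. This is only a sketch, not part of the original answer: gpu_report is a hypothetical helper name, and it assumes TensorFlow 2.x, where tf.config.list_physical_devices is available. It also reports whether nvidia-smi is on the PATH, since that was the tool that went missing in the question.

```python
import shutil


def gpu_report() -> dict:
    """Collect basic facts about the GPU setup (hypothetical helper).

    nvidia_smi_path: full path of nvidia-smi, or None if it is not on PATH.
    tensorflow_gpus: names of the GPUs TensorFlow can see, or None if
                     TensorFlow is not installed in this environment.
    """
    report = {
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "tensorflow_gpus": None,
    }
    try:
        import tensorflow as tf  # assumes TensorFlow 2.x
    except ImportError:
        return report
    report["tensorflow_gpus"] = [
        device.name for device in tf.config.list_physical_devices("GPU")
    ]
    return report


if __name__ == "__main__":
    print(gpu_report())
```

An empty tensorflow_gpus list means TensorFlow imported but found no usable GPU, which usually points at the driver, CUDA, or cuDNN step rather than at TensorFlow itself.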

möö
  • 31