
I was trying to install the CUDA toolkit, so I went to the recommended developer.nvidia.com and installed CUDA Toolkit 10.2 using the following deb (network) instructions:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda

Along the way I must have slipped up by not disabling Secure Boot: the installer prompted me to set a password that would be needed later, but after rebooting I was never asked to enter it.
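For context, that password prompt is usually the Machine Owner Key (MOK) enrollment step that Ubuntu shows when a third-party kernel module is installed while Secure Boot is on. If the enrollment is never completed at the next boot, the NVIDIA module may not be allowed to load. The sketch below only reports the Secure Boot state. The helper name `secure_boot_state` is made up for illustration, and it assumes `mokutil --sb-state` prints "SecureBoot enabled" or "SecureBoot disabled".

```shell
# Hypothetical helper: print "enabled", "disabled", or "unknown".
secure_boot_state() {
    # Without mokutil there is nothing to query, so report "unknown".
    if ! command -v mokutil >/dev/null 2>&1; then
        echo "unknown"
        return 0
    fi
    # Assumption: mokutil prints "SecureBoot enabled" / "SecureBoot disabled".
    case "$(mokutil --sb-state 2>/dev/null)" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}

echo "Secure Boot: $(secure_boot_state)"
```

If this reports "enabled" and the driver does not load, an unfinished MOK enrollment is one plausible cause worth ruling out.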

Since nvidia-smi was returning "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver", I tried a suggestion: I blacklisted nvidiafb in /etc/modprobe.d/blacklist-framebuffer.conf and ran sudo update-initramfs -u, which left my boot screen frozen after the reboot. I managed to get it working again, without any blacklisting, and I am now using the open-source 440 driver.

I'm on Ubuntu 18.04 with a GeForce GTX 950M.

Quite stupidly, I then ran sudo apt install nvidia-cuda-toolkit, which installed CUDA 9.1.85.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

Next I tried to uninstall CUDA 10.2 but could not, because of some shared dependencies (I can't remember the details). I tried several suggestions:

sudo apt-get remove cuda-10.2
sudo apt --fix-broken install
sudo apt-get --purge remove cuda-10.2
sudo apt-get remove --dry-run cuda-10.2
sudo apt-get autoclean
sudo apt-get autoremove
sudo apt --fix-broken install
sudo apt-get -o Dpkg::Options::="--force-overwrite" install --fix-broken

And now I can't remove any CUDA package at all because of a missing package. I couldn't find any information about this, hence asking here.

dan@dann:~$ sudo apt-get --purge autoremove cuda*
[sudo] password for dan: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package cuda-workspace
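One likely explanation for this error, assuming the command was run from a directory that contains an entry whose name starts with "cuda": the shell expands an unquoted cuda* into the matching file or folder names before apt-get runs, so apt-get is asked for a package named after that entry. The sketch below reproduces the expansion in a throwaway directory; `demo_dir` and the folder created inside it are illustrative only.

```shell
# Build a scratch directory containing one entry that matches cuda*.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/cuda-workspace"

# Unquoted: the shell replaces cuda* with the matching directory name
# before the command (here echo, in the question apt-get) ever sees it.
unquoted="$(cd "$demo_dir" && echo cuda*)"

# Quoted: the pattern is passed through literally, so the command
# receives the text cuda* and can match it against package names itself.
quoted="$(cd "$demo_dir" && echo 'cuda*')"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

On the real system the quoted form would look like sudo apt-get --purge autoremove 'cuda*'. Adding --dry-run first shows what would be removed without changing anything.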

Edit: I should add that I have a folder "cuda-workspace" with a ".metadata" folder inside it. I believe I created it while installing CUDA 10.2.

dan@dann:~$ whereis cuda-workspace
cuda-workspace:
dan@dann:~$ dpkg -S /home/dan/cuda-workspace
dpkg-query: no path found matching pattern /home/dan/cuda-workspace
dan@dann:~$ dpkg -l | grep -i cuda
rc  cuda-cudart-10-2                           10.2.89-1                                        amd64        CUDA Runtime native Libraries
rc  cuda-cudart-dev-10-2                       10.2.89-1                                        amd64        CUDA Runtime native dev links, headers
rc  cuda-cufft-10-2                            10.2.89-1                                        amd64        CUFFT native runtime libraries
rc  cuda-cupti-10-2                            10.2.89-1                                        amd64        CUDA profiling tools runtime libs.
rc  cuda-curand-10-2                           10.2.89-1                                        amd64        CURAND native runtime libraries
rc  cuda-cusolver-10-2                         10.2.89-1                                        amd64        CUDA solver native runtime libraries
rc  cuda-cusparse-10-2                         10.2.89-1                                        amd64        CUSPARSE native runtime libraries
rc  cuda-npp-10-2                              10.2.89-1                                        amd64        NPP native runtime libraries
rc  cuda-nvcc-10-2                             10.2.89-1                                        amd64        CUDA nvcc
rc  cuda-nvgraph-10-2                          10.2.89-1                                        amd64        NVGRAPH native runtime libraries
rc  cuda-nvjpeg-10-2                           10.2.89-1                                        amd64        NVJPEG native runtime libraries
rc  cuda-nvprof-10-2                           10.2.89-1                                        amd64        CUDA Profiler tools
rc  cuda-nvrtc-10-2                            10.2.89-1                                        amd64        NVRTC native runtime libraries
rc  cuda-nvtx-10-2                             10.2.89-1                                        amd64        NVIDIA Tools Extension
rc  cuda-sanitizer-api-10-2                    10.2.89-1                                        amd64        CUDA Sanitizer API
rc  cuda-toolkit-10-2                          10.2.89-1                                        amd64        CUDA Toolkit 10.2 meta-package
rc  cuda-visual-tools-10-2                     10.2.89-1                                        amd64        CUDA visual tools
ii  libcudart9.1:amd64                         9.1.85-3ubuntu1                                  amd64        NVIDIA CUDA Runtime Library
ii  libnvrtc9.1:amd64                          9.1.85-3ubuntu1                                  amd64        CUDA Runtime Compilation (NVIDIA NVRTC Library)
ii  nvidia-cuda-dev                            9.1.85-3ubuntu1                                  amd64        NVIDIA CUDA development files
ii  nvidia-cuda-doc                            9.1.85-3ubuntu1                                  all          NVIDIA CUDA and OpenCL documentation
ii  nvidia-cuda-gdb                            9.1.85-3ubuntu1                                  amd64        NVIDIA CUDA Debugger (GDB)
ii  nvidia-cuda-toolkit                        9.1.85-3ubuntu1                                  amd64        NVIDIA CUDA development toolkit
ii  nvidia-profiler                            9.1.85-3ubuntu1                                  amd64        NVIDIA Profiler for CUDA and OpenCL
ii  nvidia-visual-profiler                     9.1.85-3ubuntu1                                  amd64        NVIDIA Visual Profiler for CUDA and OpenCL
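In this listing, the "rc" status means the package has been removed but its configuration files are still present, while "ii" means it is fully installed. So the 10.2 packages are already gone apart from leftover config, and only the Ubuntu 9.1.85 packages remain installed. A minimal sketch for picking out the "rc" entries follows. The helper name `list_rc_packages` is made up for illustration, and the sample input is shortened from the listing above so the snippet runs without dpkg.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names
# of packages whose status column is exactly "rc".
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Two sample lines in the same layout as the listing above.
sample='rc  cuda-nvcc-10-2       10.2.89-1        amd64  CUDA nvcc
ii  nvidia-cuda-toolkit  9.1.85-3ubuntu1  amd64  NVIDIA CUDA development toolkit'

printf '%s\n' "$sample" | list_rc_packages

# On the real system one would review the list first and then purge it, e.g.:
#   dpkg -l | list_rc_packages | grep '^cuda-' | xargs -r sudo dpkg --purge
```

The purge line is left as a comment on purpose, so the list can be checked before anything is deleted.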

Ultimately I want to keep the driver and have a correct CUDA install, confirm that everything is OK with nvcc --version and nvidia-smi, and use PyTorch with CUDA (torch.cuda.is_available() still returns False).
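The checks mentioned above can be collected into one small script. This is a sketch only: the helper name `report_tool` is made up for illustration, and the PyTorch line is attempted only when python3 can import torch.

```shell
# Hypothetical helper: report where a tool resolves on PATH, or that it is missing.
report_tool() {
    if tool_path="$(command -v "$1" 2>/dev/null)"; then
        echo "$1: $tool_path"
    else
        echo "$1: not found"
    fi
}

# Which nvcc and nvidia-smi (if any) would actually be run.
report_tool nvcc
report_tool nvidia-smi

# PyTorch check: print the CUDA version torch was built with and whether
# it can reach a GPU. Skipped when torch is not importable.
if command -v python3 >/dev/null 2>&1 && python3 -c 'import torch' >/dev/null 2>&1; then
    python3 -c 'import torch; print("torch CUDA build:", torch.version.cuda, "available:", torch.cuda.is_available())'
else
    echo "torch: not importable from python3"
fi
```

The path printed for nvcc shows which of the two toolkits is first on PATH. As far as I know, the Ubuntu package puts nvcc in /usr/bin, while the NVIDIA repository installs under /usr/local/cuda-10.2. Note also that the prebuilt PyTorch packages generally ship their own CUDA runtime, so torch.cuda.is_available() tends to depend on a working NVIDIA driver more than on which toolkit nvcc belongs to.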

I'm a very obvious (and dumb) beginner... Any help would be tremendous! ;)

  • It's not you, it's a problem getting CUDA installed without zorching your video setup. See https://askubuntu.com/questions/1219761/cuda-10-2-different-installation-paths/1244010#1244010 for a suggested way to avoid the package manager and just install the CUDA files. – ubfan1 May 27 '20 at 18:54
  • Ok cool. Can I run your suggestion even with versions 10.2 and 9.1.85 installed? – dan-moth May 27 '20 at 19:40
  • Yes, since the suggested setup is in your directory, and depends upon nothing else in the system, other than maybe the compiler. I just noticed I too have the partially removed 10.1 cuda packages, leftover from when I reinstalled my current Nvidia drivers, and my 10.1 setup works fine. – ubfan1 May 27 '20 at 20:07
