Goal

I am trying to use CUDA on my NVIDIA card for research. I don't need the card to drive my display, since I plan to use the computer only from a bash shell once it is set up.

Problem

Ubuntu leaves my video card unclaimed, and I get bounced back to the login screen after signing in.

Background

I'm a Linux-savvy power user and computer science PhD student, but I'm stumped trying to get my NVIDIA GTX 1070 Ti graphics card to work. I've been at this every Sunday for over two months now.

I've followed these tutorials:

https://help.ubuntu.com/community/BinaryDriverHowto/Nvidia
https://help.ubuntu.com/community/BinaryDriverHowto
https://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
https://askubuntu.com/a/760935/13693
https://askubuntu.com/a/937204/13693
http://docs.nvidia.com/cuda/cuda-installation-guide-linux

Installing nvidia-current, nvidia-387 (the default chosen when Ubuntu was installed), or the latest nvidia-390 results in a login loop: I'm bounced back to the login screen after logging in.

So I used prime-select intel and removed the modeset=0 blacklist to get to a working desktop. Below is a review of my current status:

Nvidia card is seen by lspci

$ uname -a
Linux datalake2 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ lspci | grep VGA
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b82 (rev a1)
08:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. G200eR2 (rev 01)
$ sudo lshw -C video
  *-display UNCLAIMED
       description: VGA compatible controller
       product: NVIDIA Corporation
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:03:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller cap_list
       configuration: latency=0
       resources: iomemory:33f0-33ef iomemory:33f0-33ef memory:91000000-91ffffff memory:33fe0000000-33fefffffff memory:33ff0000000-33ff1ffffff ioport:2000(size=128) memory:92080000-920fffff

$ apt list --installed | grep "nvidia"

nvidia-387/unknown,now 387.26-0ubuntu1 amd64 [installed]
nvidia-387-dev/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-cuda-dev/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-cuda-doc/xenial,xenial,now 7.5.18-0ubuntu1 all [installed,automatic]
nvidia-cuda-gdb/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-cuda-toolkit/xenial,now 7.5.18-0ubuntu1 amd64 [installed]
nvidia-modprobe/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-opencl-dev/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-opencl-icd-387/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-prime/xenial,now 0.8.2 amd64 [installed]
nvidia-profiler/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]
nvidia-settings/unknown,now 387.26-0ubuntu1 amd64 [installed,automatic]
nvidia-visual-profiler/xenial,now 7.5.18-0ubuntu1 amd64 [installed,automatic]

$ cat /proc/driver/nvidia/version
cat: /proc/driver/nvidia/version: No such file or directory
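
The two symptoms above (an UNCLAIMED display device and a missing /proc/driver/nvidia/version) can be checked with a pair of small helpers. This is a minimal sketch, not something from the original post: the function names unclaimed_displays and nvidia_module_status are made up for illustration, and the first one assumes lshw prints its "bus info:" lines in the layout shown above.

```shell
# Hypothetical helper: read `lshw -C video` output on stdin and print the
# bus info of each display device that lshw marks as UNCLAIMED.
unclaimed_displays() {
    awk '
        /\*-display/ { unclaimed = ($0 ~ /UNCLAIMED/) }
        unclaimed && /bus info:/ { print $NF }
    '
}

# Hypothetical helper: report whether the NVIDIA kernel module has
# registered itself under /proc. The optional path argument exists only
# so the check can be pointed at a different file.
nvidia_module_status() {
    version_file="${1:-/proc/driver/nvidia/version}"
    if [ -r "$version_file" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}
```

Typical use would be `sudo lshw -C video | unclaimed_displays` followed by `nvidia_module_status`. An unclaimed device together with "not loaded" suggests that no kernel driver is bound to the card, which matches the output above.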

Weirdness

My second problem seems to be that Ubuntu is unable to recognize that my card needs drivers, even though I have enabled restricted proprietary drivers.

sudo software-properties-gtk gives me nothing as well.

[Screenshots not preserved: no drivers listed with restricted drivers enabled; results of nvidia-settings; my gcc version.]

2 Answers

Here is the workaround:

1. Edit /etc/default/grub

Modify GRUB_CMDLINE_LINUX_DEFAULT to

GRUB_CMDLINE_LINUX_DEFAULT='pcie_port_pm=off acpi_backlight=none acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009"'

This step prevents a blank screen after logging in. Run sudo update-grub afterwards so the new kernel parameters take effect on the next boot.
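
After regenerating the GRUB configuration and rebooting, it can be worth confirming that the kernel actually received the new parameters. The helper below is a rough sketch under my own assumptions: the name has_kernel_param is hypothetical, and it splits the command line on single spaces, so it will not match a quoted value that contains a space, such as acpi_osi="Windows 2009".

```shell
# Hypothetical helper: succeed if the exact parameter appears on the kernel
# command line. The second argument (default /proc/cmdline) is only there
# so the check can be run against a saved copy.
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$cmdline_file" | grep -Fxq -- "$param"
}
```

For example, `has_kernel_param pcie_port_pm=off && echo "parameter active"` prints the message only when the running kernel was booted with that parameter.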

2. Move the NVIDIA library directories into /etc/ld.so.conf.d/nvidia.conf

The content of nvidia.conf is

/usr/lib/nvidia-390
/usr/lib32/nvidia-390

These directories depend on the driver version installed on your computer.
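
Because the paths change with the driver version, the file can be generated instead of typed by hand. This is only a sketch: write_nvidia_ld_conf is a hypothetical name, the version number is passed in by you, and the directory layout (/usr/lib/nvidia-N and /usr/lib32/nvidia-N) is taken from the example above and not verified for other driver packages.

```shell
# Hypothetical helper: write the two library directories for a given driver
# version to a linker config file. The destination defaults to the path
# used in this answer and can be overridden for a dry run.
write_nvidia_ld_conf() {
    version="$1"
    dest="${2:-/etc/ld.so.conf.d/nvidia.conf}"
    printf '/usr/lib/nvidia-%s\n/usr/lib32/nvidia-%s\n' "$version" "$version" > "$dest"
}
```

Run as root, `write_nvidia_ld_conf 390 && ldconfig` would reproduce the file shown above and refresh the linker cache.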

3. Create /etc/init.d/nvidia

This script disables and enables the NVIDIA runtime libraries.

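#!/bin/sh
### BEGIN INIT INFO
# Provides:          nvidia
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     5
# Default-Stop:      0 6
# Short-Description: load/unload nvidia library
# Description:       load/unload nvidia library
### END INIT INFO

# Do nothing when the NVIDIA GPU is the selected PRIME profile.
PRIME=$(prime-select query)
if [ "$PRIME" = "nvidia" ]; then
    exit 0
fi

case "$1" in
  start)
    # Wait 10 seconds before re-enabling the libraries.
    sleep 10
    cd /etc/ld.so.conf.d || exit 1
    # Restore the library paths only if they were disabled earlier.
    [ -f nvidia.conf.bak ] && mv nvidia.conf.bak nvidia.conf
    ldconfig
    nvidia-smi
    ;;
  stop)
    cd /etc/ld.so.conf.d || exit 1
    # Hide the library paths from the dynamic linker.
    [ -f nvidia.conf ] && mv nvidia.conf nvidia.conf.bak
    ldconfig
    ;;
esac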

4. Run update-rc.d nvidia defaults

You should find SXXnvidia in /etc/rc5.d/ and KXXnvidia in /etc/rc6.d/ and /etc/rc0.d/.
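
A quick way to confirm that the links exist is to list them. The helper below is a sketch with a made-up name (list_rc_links); it assumes the usual SNNname / KNNname naming with a two-digit sequence number.

```shell
# Hypothetical helper: list the start (S) and kill (K) symlinks that
# update-rc.d created for a service. The second argument is the directory
# that holds the rcN.d directories (default /etc).
list_rc_links() {
    service="$1"
    root="${2:-/etc}"
    for link in "$root"/rc?.d/[SK][0-9][0-9]"$service"; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        echo "$link"
    done
}
```

`list_rc_links nvidia` should then print at least one S link under /etc/rc5.d/ and K links under /etc/rc0.d/ and /etc/rc6.d/.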

Run /etc/init.d/nvidia stop and then nvidia-smi; you should see errors about libraries not being found.

Run /etc/init.d/nvidia start, and nvidia-smi should work again.
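
The same toggle can be observed without running nvidia-smi by looking at the linker cache. This is a sketch, not part of the original answer: nvidia_libs_visible is a hypothetical name, and it looks only for libnvidia-ml, the management library that nvidia-smi loads.

```shell
# Hypothetical helper: read `ldconfig -p` output on stdin and succeed if
# the NVIDIA management library is present in the linker cache.
nvidia_libs_visible() {
    grep -q 'libnvidia-ml\.so'
}
```

After `/etc/init.d/nvidia stop`, `ldconfig -p | nvidia_libs_visible` should fail; after `/etc/init.d/nvidia start`, it should succeed.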

If everything is OK, you can reboot now. You should be able to log in to the desktop.

5. If anything goes wrong

The most likely problem is that the nvidia script was not executed. If that happens, press Ctrl+Alt+F1 to switch to a tty and run /etc/init.d/nvidia stop; reboot. You can then go back to the Unity desktop to debug.

6. Known side effect

When Intel is used as the PRIME GPU, unity-control-center (System Settings) fails to start:

GLib-CRITICAL **: g_strsplit: assertion `string != NULL' failed.

Note: my system spec

# uname -r
4.13.0-32-generic
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:    16.04
Codename:   xenial
# dpkg -l | grep cuda
ii  cuda-9-0                                    9.0.176-1                                    amd64        CUDA 9.0 meta-package
ii  cuda-command-line-tools-9-0                 9.0.176-1                                    amd64        CUDA command-line tools
ii  cuda-core-9-0                               9.0.176-1                                    amd64        CUDA core tools
ii  cuda-cublas-9-0                             9.0.176.1-1                                  amd64        CUBLAS native runtime libraries
ii  cuda-cublas-dev-9-0                         9.0.176.1-1                                  amd64        CUBLAS native dev links, headers
ii  cuda-cudart-9-0                             9.0.176-1                                    amd64        CUDA Runtime native Libraries
ii  cuda-cudart-dev-9-0                         9.0.176-1                                    amd64        CUDA Runtime native dev links, headers
ii  cuda-cufft-9-0                              9.0.176-1                                    amd64        CUFFT native runtime libraries
ii  cuda-cufft-dev-9-0                          9.0.176-1                                    amd64        CUFFT native dev links, headers
ii  cuda-curand-9-0                             9.0.176-1                                    amd64        CURAND native runtime libraries
ii  cuda-curand-dev-9-0                         9.0.176-1                                    amd64        CURAND native dev links, headers
ii  cuda-cusolver-9-0                           9.0.176-1                                    amd64        CUDA solver native runtime libraries
ii  cuda-cusolver-dev-9-0                       9.0.176-1                                    amd64        CUDA solver native dev links, headers
ii  cuda-cusparse-9-0                           9.0.176-1                                    amd64        CUSPARSE native runtime libraries
ii  cuda-cusparse-dev-9-0                       9.0.176-1                                    amd64        CUSPARSE native dev links, headers
ii  cuda-demo-suite-9-0                         9.0.176-1                                    amd64        Demo suite for CUDA
ii  cuda-documentation-9-0                      9.0.176-1                                    amd64        CUDA documentation
ii  cuda-driver-dev-9-0                         9.0.176-1                                    amd64        CUDA Driver native dev stub library
ii  cuda-drivers                                390.12-1                                     amd64        CUDA Driver meta-package
ii  cuda-libraries-9-0                          9.0.176-1                                    amd64        CUDA Libraries 9.0 meta-package
ii  cuda-libraries-dev-9-0                      9.0.176-1                                    amd64        CUDA Libraries 9.0 development meta-package
ii  cuda-license-9-0                            9.0.176-1                                    amd64        CUDA licenses
ii  cuda-misc-headers-9-0                       9.0.176-1                                    amd64        CUDA miscellaneous headers
ii  cuda-npp-9-0                                9.0.176-1                                    amd64        NPP native runtime libraries
ii  cuda-npp-dev-9-0                            9.0.176-1                                    amd64        NPP native dev links, headers
ii  cuda-nvgraph-9-0                            9.0.176-1                                    amd64        NVGRAPH native runtime libraries
ii  cuda-nvgraph-dev-9-0                        9.0.176-1                                    amd64        NVGRAPH native dev links, headers
ii  cuda-nvml-dev-9-0                           9.0.176-1                                    amd64        NVML native dev links, headers
ii  cuda-nvrtc-9-0                              9.0.176-1                                    amd64        NVRTC native runtime libraries
ii  cuda-nvrtc-dev-9-0                          9.0.176-1                                    amd64        NVRTC native dev links, headers
ii  cuda-repo-ubuntu1604                        9.1.85-1                                     amd64        cuda repository configuration files
ii  cuda-runtime-9-0                            9.0.176-1                                    amd64        CUDA Runtime 9.0 meta-package
ii  cuda-samples-9-0                            9.0.176-1                                    amd64        CUDA example applications
ii  cuda-toolkit-9-0                            9.0.176-1                                    amd64        CUDA Toolkit 9.0 meta-package
ii  cuda-visual-tools-9-0                       9.0.176-1                                    amd64        CUDA visual tools
ii  libcuda1-390                                390.12-0ubuntu1                              amd64        NVIDIA CUDA runtime library
ii  libcudnn7                                   7.0.5.15-1+cuda9.0                           amd64        cuDNN runtime libraries
ii  libcudnn7-dev                               7.0.5.15-1+cuda9.0                           amd64        cuDNN development libraries and headers
# dpkg -l | grep nvidia
ii  nvidia-390                                  390.12-0ubuntu1                              amd64        NVIDIA binary driver - version 390.12
ii  nvidia-390-dev                              390.12-0ubuntu1                              amd64        NVIDIA binary Xorg driver development files
ii  nvidia-modprobe                             390.12-0ubuntu1                              amd64        Load the NVIDIA kernel driver and create device files
ii  nvidia-opencl-icd-390                       390.12-0ubuntu1                              amd64        NVIDIA OpenCL ICD
ii  nvidia-prime                                0.8.2                                        amd64        Tools to enable NVIDIA's Prime
ii  nvidia-settings                             390.12-0ubuntu1                              amd64        Tool for configuring the NVIDIA graphics driver

You should be able to get CUDA working with this answer by Ping Chu Hung. If you still have issues with the login loop after that, there are some highly rated answers here that should resolve it for you.

Note: Like most things in life, NVIDIA drivers can leave a bunch of garbage lying around if you've tried to install several versions or had failed installations. It may be necessary to purge them all and then reinstall the one you've had working in the past to get the desired results.
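
Before purging anything, it helps to see exactly which driver packages are installed. The helper below is a minimal sketch with a made-up name (installed_nvidia_packages); it only filters dpkg -l output and removes nothing itself.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names
# of installed ("ii") packages whose name starts with "nvidia-".
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia-/ { print $2 }'
}
```

Run `dpkg -l | installed_nvidia_packages` and review the list first. The names can then be passed to `sudo apt-get purge` once you are sure nothing else you need is on it.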

Elder Geek