UPDATE: See this first. Is it relevant? "Nvidia driver 460 does not work with 5.4.0-64 kernel in Ubuntu 18.04"

nvidia-settings 460 is installed regardless of which driver version you grab. Is it possible that everything is broken with this combination of versions, and that the package repo makes it unfixable?

The following does not work: https://linuxconfig.org/how-to-install-the-nvidia-drivers-on-ubuntu-20-04-focal-fossa-linux

It also makes no mention of testing that the result works.

This is my test:

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
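As far as I understand it, that message usually means the nvidia kernel module is not loaded, so a quick follow-up check is to look for it in the lsmod output. This is only a rough sketch: nvidia_module_loaded is a made-up helper name, and it assumes the module is listed under the name "nvidia".

```shell
# Hypothetical helper: reads `lsmod` output on stdin and succeeds only if
# a module named exactly "nvidia" appears in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Usage on the affected machine:
# lsmod | nvidia_module_loaded && echo "nvidia loaded" || echo "nvidia NOT loaded"
```

If it reports the module as not loaded, the problem is on the kernel-module side rather than with nvidia-smi itself.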

I have done this many times in the past and am referring to my old notes, but nothing is working.

I am purging nvidia and cuda between attempts.

I am running purely on command line so I only care about the minimal command line instructions.

I am wondering whether something has changed with the paths in 20.04. There is this weird file:

cat /usr/lib/nvidia/alternate-install-available
The NVIDIA driver provided by Ubuntu can be installed by launching the "Software & Updates" application, and by selecting the NVIDIA driver from the "Additional Drivers" tab.

I should also mention that I am running headless (no monitor attached), so if anything is triggered only when a monitor is in use, that would be useful to know. This did not use to be a problem, as I have installed on servers in the past.

And I have a GeForce RTX 2070 ... maybe that is the issue. Not sure yet.

UPDATE: after plugging in a monitor and restarting, ABSOLUTELY NOTHING APPEARS. DARK SCREEN. I can still ssh into the box.

$ inxi
CPU: 6-Core Intel Core i7-8700 (-MT MCP-) speed/min/max: 800/800/4600 MHz Kernel: 5.4.0-65-generic x86_64 Up: 5m
Mem: 1631.7/32075.2 MiB (5.1%) Storage: 465.76 GiB (54.9% used) Procs: 406 Shell: bash 5.0.17 inxi: 3.0.38
$ inxi -F
System:    Host: xxx Kernel: 5.4.0-65-generic x86_64 bits: 64 Console: tty 1 Distro: Ubuntu 20.04.2 LTS (Focal Fossa)
Machine:   Type: Desktop Mobo: Micro-Star model: Z370-A PRO (MS-7B48) v: 1.0 serial: <superuser/root required>
           UEFI: American Megatrends v: 2.40 date: 03/08/2018
CPU:       Topology: 6-Core model: Intel Core i7-8700 bits: 64 type: MT MCP L2 cache: 12.0 MiB
           Speed: 800 MHz min/max: 800/4600 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 800 7: 800 8: 800
           9: 800 10: 800 11: 800 12: 800
Graphics:  Device-1: NVIDIA TU106 [GeForce RTX 2070] driver: N/A
           Display: server: X.org 1.20.9 driver: fbdev,nouveau unloaded: modesetting,vesa tty: 181x44
           Message: Advanced graphics data unavailable in console. Try -G --display
Audio:     Device-1: Intel 200 Series PCH HD Audio driver: snd_hda_intel
           Device-2: NVIDIA TU106 High Definition Audio driver: snd_hda_intel
           Sound Server: ALSA v: k5.4.0-65-generic
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169
           IF: enp3s0 state: down mac: 30:9c:23:d0:bb:05
           Device-2: TP-Link TL-WN722N v2 type: USB driver: rtl8812au
           IF: enx503eaa4de20b state: up speed: N/A duplex: N/A mac: 50:3e:aa:4d:e2:0b
           IF-ID-1: docker0 state: down mac: 02:42:f8:2b:a8:df
Drives:    Local Storage: total: 465.76 GiB used: 255.67 GiB (54.9%)
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO 500GB size: 465.76 GiB
Partition: ID-1: / size: 456.96 GiB used: 255.66 GiB (55.9%) fs: ext4 dev: /dev/nvme0n1p2
Sensors:   System Temperatures: cpu: 35.0 C mobo: N/A
           Fan Speeds (RPM): N/A
Info:      Processes: 416 Uptime: 5m Memory: 31.32 GiB used: 1.60 GiB (5.1%) Init: systemd runlevel: 5 Shell: bash
           inxi: 3.0.38

UPDATE: I managed to install an nvidia driver and the monitor now works, but the resolution is too low.

Something is very, very broken with some new combination of things. How can I revert to whatever was there previously? Too many moving parts.

UPDATE:

sudo gpu-manager 
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /run/u-d-c-nvidia-was-loaded file
can't access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for nvidia modules in /lib/modules/5.4.0-65-generic/updates/dkms
Looking for amdgpu modules in /lib/modules/5.4.0-65-generic/updates/dkms
Is nvidia loaded? no
Was nvidia unloaded? no
Is nvidia blacklisted? no
Is intel loaded? no
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is nvidia kernel module available? no
Is amdgpu kernel module available? no
Vendor/Device Id: 10de:1f02
BusID "PCI:1@0:0:0"
Is boot vga? yes
Error: can't access /sys/bus/pci/devices/0000:01:00.0/driver
The device is not bound to any driver.
Error : Failed to open /dev/dri
Error : Failed to open /dev/dri
Error : Failed to open /dev/dri
Error : Failed to open /dev/dri
Does it require offloading? no
last cards number = 1
Has amd? no
Has intel? no
Has nvidia? yes
How many cards? 1
Has the system changed? No
Single card detected
Nothing to do
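The log above says gpu-manager looked for nvidia modules under /lib/modules/5.4.0-65-generic/updates/dkms and found none. The same check can be done by hand. A rough sketch, where has_nvidia_dkms_module is a made-up helper and the nvidia*.ko* file pattern is my assumption about how the built module files are named:

```shell
# Hypothetical helper: succeeds if the given directory (default: the DKMS
# directory of the running kernel) contains a file matching nvidia*.ko*.
has_nvidia_dkms_module() {
    dir=${1:-/lib/modules/$(uname -r)/updates/dkms}
    for f in "$dir"/nvidia*.ko*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Usage on the affected machine:
# has_nvidia_dkms_module && echo "module built" || echo "no nvidia module for this kernel"
```

If nothing is found for the running kernel, the module was never built for it, which would match the "Is nvidia kernel module available? no" line.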

I wonder whether this is relevant: "Nvidia-173 driver package comes with a wrong and useless nvidia-settings app"

TOPKAT
  • 103
mathtick
  • 589

1 Answer

The following seemed to work. Perhaps this is related: "Nvidia driver 460 does not work with 5.4.0-64 kernel in Ubuntu 18.04".

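# Remove every installed nvidia package and any leftover dependencies
sudo apt-get purge '*nvidia*' -y
sudo apt-get autoremove -y
# Should print no nvidia packages if the purge succeeded
sudo apt list --installed | grep nvidia
# Drop the graphics-drivers PPA (needs the ppa-purge package installed)
sudo ppa-purge ppa:graphics-drivers/ppa
sudo apt auto-clean
# Make gcc-8 the default compiler
sudo apt install gcc-8
sudo update-alternatives --remove-all gcc
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 10
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc-8 10
# Reinstall headers for the running kernel so the driver module can be built
sudo apt-get install --reinstall linux-headers-$(uname -r)
sudo apt-get install nvidia-driver-460
# reboot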

And make sure to look at the output of

dpkg --list | grep nvidia

for any weird version disagreements. nvidia-settings is somehow pinned at 460 in all cases.
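To make the version check less of an eyeball job, something like the following could summarise it. This is just a sketch: nvidia_major_versions is a made-up helper, it assumes the usual dpkg --list column layout (status, package, version), and helper packages that follow their own version scheme will show up as an extra number.

```shell
# Hypothetical helper: reads `dpkg --list` output on stdin and prints each
# distinct leading version number among installed ("ii") nvidia packages.
nvidia_major_versions() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { split($3, v, "."); print v[1] }' | sort -u
}

# Usage on the affected machine:
# dpkg --list | nvidia_major_versions
```

More than one number in the output (say both 450 and 460) is a hint that packages from different driver series are mixed.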

mathtick
  • 589
  • Man, I feel like kissing you for posting it, many thanks! – Maciek Sep 01 '21 at 14:31
  • Was literally stuck for hours with an issue with RTX 3050 where it would be installed but not loaded. This solution worked right off the bat on Ubuntu 20.04! – Sasanka Panguluri Feb 11 '22 at 23:23
  • FWIW I'm on 21.10 with same hardware set up and no problems so far. I keep this reset script around for next time some update clobbers me. – mathtick Feb 13 '22 at 09:30