I know similar questions have been asked countless times, but none of the solutions have worked for me so far. Let's start from the beginning.
I have a workstation with an NVIDIA RTX A5000 running Ubuntu 20.04. Before the events of today, I was able to connect to my workstation using SSH and render OpenGL windows using XQuartz on a MacBook Pro.
Today I was at my workstation trying to run a windowed program, and got a "Failed to establish dbus connection" error. After Googling, it seemed that this was because of a bug in the driver I had (495.44, mentioned here). I decided to update the driver. All the options in the "Additional drivers" tab were greyed out, and at the bottom it said I was using a manually installed driver. I came across this question and ran:
sudo ubuntu-drivers autoinstall
It ran without hiccups. I then rebooted the machine, and instead of getting the login screen I was greeted with a black screen and a blinking white cursor at the top. Fortunately, I was still able to SSH into it. After a while I figured out that ubuntu-drivers had installed nvidia-driver-470. I followed the instructions in this answer to remove it:
dpkg -P $(dpkg -l | grep nvidia-driver | awk '{print $2}')
apt autoremove
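In hindsight, it would have been safer to look at what that pipeline selects before purging anything. A minimal sketch of such a dry run, assuming the same grep/awk filter as above; the function name list_nvidia_driver_pkgs is my own and purely illustrative:

```shell
# Hypothetical helper (not part of dpkg): given `dpkg -l` output on stdin,
# print the package names the purge pipeline above would select.
list_nvidia_driver_pkgs() {
    # Same filter as the purge command: keep lines mentioning
    # "nvidia-driver" and print the second column (the package name).
    grep nvidia-driver | awk '{print $2}'
}

# Dry run on a real system (nothing is removed):
#   dpkg -l | list_nvidia_driver_pkgs
```

Only once the printed list looks right would I pass it to dpkg -P.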
I did not install nouveau (the Nvidia driver README said it was a bad idea). I rebooted the machine and got the login screen this time. Opening the "Additional drivers" tab showed that I was back to the manually installed (and buggy) driver (I also confirmed this by running nvidia-smi).
This time around I decided to install the driver manually, and ran
sudo apt install nvidia-driver-510
I rebooted the computer and everything worked this time. I got to the login screen, the bug was gone, and I managed to run the application I wanted (I also discovered I was calling it with incorrect arguments, which is now making me wonder whether all of this was in vain...). Everything was fine when I used the display attached to the workstation.
The problem I have now is with rendering over SSH. For reference, xclock works, so I did not lose remote X rendering completely. For other applications (I think the common thread is that they use GL, but I'm not sure) I get the following errors:
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
I saw some answers saying to run apt-get install -y mesa-utils libgl1-mesa-glx. I was missing libgl1-mesa-glx, but installing it did not fix the problem. Running a GL application such as glxgears or glxinfo with debug information, I get the following:
$ LIBGL_DEBUG=verbose glxinfo
name of display: localhost:10.0
X Error of failed request: BadValue (integer parameter out of range for operation)
Major opcode of failed request: 149 (GLX)
Minor opcode of failed request: 24 (X_GLXCreateNewContext)
Value in failed request: 0x0
Serial number of failed request: 16
Current serial number in output stream: 17
$ LIBGL_DEBUG=verbose glxgears
libGL: MESA-LOADER: dlopen(/usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
libGL: Can't open configuration file /etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/[my username]/.drirc: No such file or directory.
libGL: Can't open configuration file /etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/[my username]/.drirc: No such file or directory.
libGL: Disabling server's aux buffer support
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
Error: glXCreateContext failed
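Since the errors only show up over SSH and not on the attached display, it can help to make explicit which kind of session a command is running in. Below is a small sketch, assuming that SSH X11 forwarding sets DISPLAY to something of the form localhost:N.M (as in the "name of display: localhost:10.0" line above) while a local session uses something like :0. The function name is_forwarded_display is hypothetical.

```shell
# Hypothetical helper: succeed if the given display string (or $DISPLAY
# when no argument is passed) looks like an SSH-forwarded X11 display.
# Assumption: forwarded displays have the form "localhost:N.M".
is_forwarded_display() {
    case "${1:-${DISPLAY:-}}" in
        localhost:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if is_forwarded_display; then echo "running over SSH X forwarding"; fi
```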
I tried several things, among them selecting the nouveau driver from the "Additional drivers" tab (which stopped being greyed out after installing the Nvidia driver). Nothing worked. I read somewhere that this could be due to the Nvidia drivers, so I decided to uninstall nvidia-driver-510 and go back to the manually installed driver. However, after rebooting, my screen resolution was down to 640x480, everything was out of scale, and the nouveau driver was selected. I uninstalled xserver-xorg-video-nouveau and rebooted, hoping to get the manually installed driver back, but once I logged in I had nouveau again (despite xserver-xorg-video-nouveau not being installed; I checked). I installed nvidia-driver-510 again to get my screen back to normal.
I saw that other users with the same problem were able to solve it by installing distribution-specific packages (mesa-libGLw-devel.x86_64 on CentOS 7, mesa-dri-drivers on Red Hat), but I don't know how to find the equivalent package for my distribution. Other answers recommend uninstalling the Nvidia drivers, but that alone is not enough to revert to the (buggy but working) manually installed driver. Is there any package that provides the nouveau driver in addition to xserver-xorg-video-nouveau that I might have missed? If so, how can I find what package that is?
In case it might be helpful, here is additional information I see people asking for in other questions
$ sudo ldconfig -p | grep -i gl.so
libwayland-egl.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libwayland-egl.so.1
libcogl.so.20 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcogl.so.20
libOpenGL.so.0 (libc6,x86-64) => /lib/x86_64-linux-gnu/libOpenGL.so.0
libOpenGL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libOpenGL.so
libGL.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libGL.so.1
libGL.so.1 (libc6) => /lib/i386-linux-gnu/libGL.so.1
libGL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libGL.so
libEGL.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libEGL.so.1
libEGL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libEGL.so
where libGL.so and libGL.so.1 both point to /lib/x86_64-linux-gnu/libGL.so.1.7.0.
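For anyone who wants to repeat that check, this is roughly how the symlinks can be resolved. It is only a sketch: resolve_libs is a name I made up, and it relies on readlink -f following the whole symlink chain.

```shell
# Hypothetical helper: for each path given, print "path -> final target".
resolve_libs() {
    local path
    for path in "$@"; do
        # readlink -f follows every symlink in the chain to the real file.
        printf '%s -> %s\n' "$path" "$(readlink -f "$path")"
    done
}

# On my machine the call would be:
#   resolve_libs /lib/x86_64-linux-gnu/libGL.so /lib/x86_64-linux-gnu/libGL.so.1
```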
$ lspci -k | grep -EA3 'VGA|3D|Display'
08:00.0 VGA compatible controller: NVIDIA Corporation Device 2231 (rev a1)
Subsystem: NVIDIA Corporation Device 147e
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
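Because I switched drivers several times, I found it useful to pull out just the "Kernel driver in use" line after each reboot. A minimal sketch, assuming the lspci -k output format shown above; driver_in_use is a hypothetical name of my own:

```shell
# Hypothetical helper: given `lspci -k` output on stdin, print the value
# of every "Kernel driver in use:" line (for example "nvidia" or "nouveau").
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Usage on a real system:
#   lspci -k | grep -EA3 'VGA|3D|Display' | driver_in_use
```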
This was a long question... Thank you for reading all the way down here.