CUDA installation error when running sudo apt-get install linux-headers-$(uname -r)

Question

I'm running Ubuntu 16.04.6 from a live USB with persistent data. My machine has an Nvidia Quadro M1200 and a built-in Intel HD 630. In Additional Drivers I chose "Using NVIDIA binary driver - version 384.130 from nvidia-384 (proprietary, tested)", clicked Apply Changes, rebooted, and disabled Secure Boot in the BIOS.

But when I boot into Ubuntu and run nvidia-smi, it says nvidia-smi: command not found.

Running nvidia-settings gives an error:

** Message: PRIME: No offloading required. Abort
** Message: PRIME: is it supported? no
ERROR: nvidia-settings could not find the registry key file. This file should
       have been installed along with this driver at
       /usr/share/nvidia/nvidia-application-profiles-key-documentation. The
       application profiles will continue to work, but values cannot be
       prepopulated or validated, and will not be listed in the help text.
       Please see the README for possible values and descriptions.

I then followed the CUDA Installation Guide for Linux and ran into a problem at the step that checks whether the kernel headers match the running kernel version.

$ uname -r
4.15.0-45-generic
$ sudo apt-get install linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree       
Reading state information... Done
linux-headers-4.15.0-45-generic is already the newest version (4.15.0-45.48~16.04.1).
0 upgraded, 0 newly installed, 0 to remove and 418 not upgraded.
4 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up nvidia-384 (384.130-0ubuntu0.16.04.2) ...
/usr/sbin/update-initramfs: 6: /usr/sbin/update-initramfs: cannot create /cdrom/casper/vmlinuz: Read-only file system
dpkg: error processing package nvidia-384 (--configure):
 subprocess installed post-installation script returned error exit status 2
dpkg: dependency problems prevent configuration of libcuda1-384:
 libcuda1-384 depends on nvidia-384 (>= 384.130); however:
  Package nvidia-384 is not configured yet.
dpkg: error processing package libcuda1-384 (--configure):
 dependency problems - leaving unconfigured
Setting up linux-image-4.4.0-187-generic (4.4.0-187.217) ...
No apport report written because the error message indicates its a followup error from a previous failure.
dpkg: dependency problems prevent configuration of nvidia-opencl-icd-384:
 nvidia-opencl-icd-384 depends on nvidia-384 (>= 384.130); however:
  Package nvidia-384 is not configured yet.
dpkg: error processing package nvidia-opencl-icd-384 (--configure):
 dependency problems - leaving unconfigured
No apport report written because the error message indicates its a followup error from a previous failure.
Processing triggers for libc-bin (2.23-0ubuntu11) ...
Processing triggers for linux-image-4.4.0-187-generic (4.4.0-187.217) ...
/etc/kernel/postinst.d/dkms:

dkms: running auto installation service for kernel 4.4.0-187-generic
...done.

/etc/kernel/postinst.d/initramfs-tools:
/usr/sbin/update-initramfs: 6: /usr/sbin/update-initramfs: cannot create /cdrom/casper/vmlinuz: Read-only file system
run-parts: /etc/kernel/postinst.d/initramfs-tools exited with return code 2
dpkg: error processing package linux-image-4.4.0-187-generic (--configure):
 subprocess installed post-installation script returned error exit status 1
No apport report written because MaxReports is reached already
Errors were encountered while processing:
 nvidia-384
 libcuda1-384
 nvidia-opencl-icd-384
 linux-image-4.4.0-187-generic
E: Sub-process /usr/bin/dpkg returned an error code (1)

Listing the installed kernel headers gives:

$ dpkg -l | grep linux-headers-
ii  linux-headers-4.15.0-45                    4.15.0-45.48~16.04.1                         all          Header files related to Linux kernel version 4.15.0
ii  linux-headers-4.15.0-45-generic            4.15.0-45.48~16.04.1                         amd64        Linux kernel headers for version 4.15.0 on 64 bit x86 SMP
ii  linux-headers-4.4.0-187                    4.4.0-187.217                                all          Header files related to Linux kernel version 4.4.0
ii  linux-headers-4.4.0-187-generic            4.4.0-187.217                                amd64        Linux kernel headers for version 4.4.0 on 64 bit x86 SMP
ii  linux-headers-generic                      4.4.0.187.193                                amd64        Generic Linux kernel headers
ii  linux-headers-generic-hwe-16.04            4.15.0.45.66                                 amd64        Generic Linux kernel headers

I have no idea what the error message means. Should I continue the installation, or fix this first? And if so, how?
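
The line that fails in the log above is update-initramfs trying to write /cdrom/casper/vmlinuz and getting "Read-only file system". A minimal sketch for confirming that, assuming the live medium is mounted at /cdrom as the log suggests. mount_mode is a hypothetical helper name, not part of any package; it only reads a mount table in the /proc/mounts format and changes nothing.

```shell
#!/bin/sh
# mount_mode MOUNTPOINT [MOUNT_TABLE]
# Prints "ro" or "rw" for MOUNTPOINT, or "not mounted" if it has no entry.
# MOUNT_TABLE defaults to /proc/mounts (fields: device mountpoint fstype options ...).
# Hypothetical helper for illustration only.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            # Split the comma-separated options and look for a literal "ro".
            n = split($4, opts, ",")
            mode = "rw"
            for (i = 1; i <= n; i++) {
                if (opts[i] == "ro") mode = "ro"
            }
        }
        END {
            if (mode != "") print mode
            else print "not mounted"
        }
    ' "${2:-/proc/mounts}"
}

# Example call on the live session (assumes the medium is at /cdrom):
#   mount_mode /cdrom
```

If this reports ro for /cdrom, the post-installation scripts of nvidia-384 and linux-image-4.4.0-187-generic cannot finish on that medium, which would explain why the packages stay unconfigured. Running dpkg --audit lists packages left in that state.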

Try using a full Ubuntu install instead of a live usb. The live installs have certain limitations which you are apparently running into. Not to say that the CUDA install is not unnecessarily tangled up with the Nvidia drivers in any case but there are answers here to help with that. — ubfan1, Aug 24 '20 at 15:36
It all boils down to "Can you successfully update the kernel and boot from it?". A live media's running kernel may be sitting on a FAT/iso9660 filesystem, not in the /boot. Then you want to update a video module... Last time I looked (awhile ago), even /etc/fstab updates on live media wouldn't be picked up. — ubfan1, Aug 24 '20 at 16:24
Same with live usb with persistent data too? I'm using live usb with persistent data. — JGrey, Aug 24 '20 at 16:30
Yes, the live media boots off a compressed filesystem, the overlay filesystem with your persistent data is then overlaid, but by then, things are running. Maybe the live patch system might be used to switch kernels, but haven't heard of that, not that I'm a good source of knowledge on that topic. — ubfan1, Aug 24 '20 at 16:40
The full install does look different. In Additional Drivers, it's selecting an open source Nvidia binary driver (410.78). Should I go ahead and choose the proprietary driver (384.130) instead? Also, the Nvidia CUDA Linux guide says the kernel version for CUDA 10 on Ubuntu 16.04.6 should be 4.5.0, but in reality it's 4.15.0. I think that's how I got two kernel header versions on the live USB. — JGrey, Aug 27 '20 at 05:29
That 410 is a proprietary Nvidia driver -- The labeling in Additional Drivers can get mixed up, probably from a PPA. The 4.15 kernel in 16.04.6 is a later kernel that the point releases after the .1 bring in. Use 16.04.1 if you want the original (but I thought that was 4.4). I install the latest Nvidia driver then work backwards from the stable tensorflow you select -- it will specify which CUDA version it needs. The trick is to avoid letting the CUDA install replace the Nvidia driver (Answers here just unpack the Intel deb locally (no samples), dumping the Nvidia debs found there). — ubfan1, Aug 27 '20 at 15:55
Hmmm, so I let CUDA replace the Nvidia driver with 450.21.06 and installed CUDA. But when I run nvcc --V, it says The program 'nvcc' is currently not installed. You can install it by typing: sudo apt install nvidia-cuda-toolkit .... And nvidia-persistenced failed to initialize... — JGrey, Aug 27 '20 at 18:38
Take a look at https://askubuntu.com/questions/1256378/question-about-installing-cuda-10-1-on-ubuntu-20-04-the-root-folder-is-empty/1256583#1256583. The problems with using the Intel CUDA supplied Nvidia drivers are that they don't come with the scripts to update them on every kernel update, and they may not be appropriate for your hardware. When you just unpack the Intel deb, you still need to set PATH and LD_LIBRARY_PATH (and maybe get a non-default gcc/g++ and add their links in ...cuda/bin). — ubfan1, Aug 27 '20 at 21:13
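
The last comment mentions setting PATH and LD_LIBRARY_PATH after unpacking the toolkit by hand. A minimal sketch of what that could look like, assuming the toolkit ended up under $HOME/cuda with the usual bin and lib64 subdirectories. That location is an assumption for illustration, not something stated in the thread; set CUDA_HOME to wherever the files were actually unpacked.

```shell
#!/bin/sh
# Assumption: the toolkit was unpacked under $HOME/cuda. Override CUDA_HOME
# in the environment if it lives somewhere else.
CUDA_HOME="${CUDA_HOME:-$HOME/cuda}"
export CUDA_HOME

# Prepend the toolkit's bin and lib64 directories, keeping existing entries
# and avoiding a stray ":" when the variable was previously empty.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Check whether the shell can now find nvcc.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found; check that $CUDA_HOME/bin exists"
fi
```

These exports only last for the current shell session; putting them in ~/.profile or ~/.bashrc would make them persistent. Note that nvcc accepts --version or -V, so the "not installed" message quoted above comes from the shell failing to find nvcc at all, not from the flag.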
