CUDA 10.2 on Ubuntu 20.04, Kernel 5.4.0-40
This procedure avoids any package manager involvement, and allows you to keep
the tested version 440 Nvidia drivers. The way Nvidia packages the deb files changes all the time, so this may not work for CUDA 11, but should give you an idea of what needs to be done.
Update your Ubuntu 20.04 Nvidia drivers to the latest tested version (440
as of Jul. 2, 2020). Run Software and Updates, select the Additional
Drivers tab, and make your Nvidia selection. Supporting software such as compilers
and kernel headers should already be present for the Nvidia module to be
built. Once it is built, reboot and run nvidia-smi to ensure you are running your
selected Nvidia drivers.
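A quick way to confirm which driver is actually loaded, as a sketch only. The helper name driver_major is made up for illustration; the query uses nvidia-smi's --query-gpu and --format options.

```shell
# Print the major version from a driver string such as "440.33.01".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Query the loaded driver, if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    echo "Running driver $ver (major $(driver_major "$ver"))"
else
    echo "nvidia-smi not found: the driver is not installed or not on PATH"
fi
```

If the major version printed is not the one you selected, the module build probably failed and the system fell back to another driver.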
In your browser, go to the Nvidia site, and select your CUDA version to download.
https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
A window will open with a Base Installer script. Copy just the wget line and fetch
the offered 1.8GB .deb file into the directory of your choice.
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
Hashcheck the downloaded deb file with md5sum.
md5sum cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
4dfcc4d2bcca28e2f4b40f54171374ec cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
Check the result against the supplied checksums at the "Installer Checksums" link under the script.
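The comparison can also be scripted so a typo does not slip through. This is a minimal sketch; verify_md5 is a hypothetical helper name, and the expected value is whatever the "Installer Checksums" page lists for your file.

```shell
# Compare a file's MD5 sum with an expected value; returns 0 on match.
verify_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Usage, with the checksum shown above:
# verify_md5 cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb \
#     4dfcc4d2bcca28e2f4b40f54171374ec
```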
Unpack the .deb file (the contents are just other deb files).
Avoid unpacking the Nvidia deb files, but unpack all the others.
All the nvidia and libxnv... files but one should already be installed by a standard Nvidia driver install from "Software and Updates". The one to manually install is libxnvctrl-dev, which will likely have an earlier version than the system one.
cd to your cuda location and run:
dpkg-deb --extract cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb cuda102
Selecting an "install" directory, like cuda102, allows other CUDA versions to be installed in parallel if necessary. Depending upon the CUDA release, the deb files may be in further subdirectories. I found it useful to create a second directory to hold the final setup of files copied from the "install" directory. For example, unpack the deb files into /usr/local/data/cuda/cuda102, then use /usr/local/data/cuda/cuda-10.2 as the final setup location. The final setup will not have the deep directories of the "install" directory.
Add a link from /usr/local named cuda-10.2 to wherever you unpacked the debs. e.g.:
sudo ln -s /usr/local/data/cuda/cuda-10.2 /usr/local/cuda-10.2
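The layout described above can be tried out in a scratch directory first. This is a sketch: the ROOT, STAGING, FINAL and LINK variable names are mine, and ROOT defaults to a temporary directory so nothing under the real /usr/local is touched.

```shell
# Hypothetical scratch root so the sketch can be tried without touching /usr/local.
ROOT=${ROOT:-$(mktemp -d)}

STAGING="$ROOT/usr/local/data/cuda/cuda102"    # where the debs get unpacked
FINAL="$ROOT/usr/local/data/cuda/cuda-10.2"    # consolidated final setup
LINK="$ROOT/usr/local/cuda-10.2"               # the path the CUDA tools expect

mkdir -p "$STAGING" "$FINAL"
ln -sfn "$FINAL" "$LINK"    # -n replaces an existing link instead of descending into it
ls -ld "$LINK"
```

On the real system the same commands apply without the ROOT prefix and with sudo, as in the ln -s line above.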
sudo apt-get install libxnvctrl-dev nvidia-headless-440 nvidia-headless-no-dkms-440 nvidia-modprobe
sudo apt-get install libglu1-mesa-dev freeglut3-dev
You will have etc, usr, and var directories created, with further subdirectories containing more .deb files in the var and usr directories.
cd var/cuda-repo-10-2-local-10.2.89-440.33.01
You will see about 70 deb files, with about 20 nvidia ones to delete (or ignore).
The cuda-compat-10-2_440.33.01-1_amd64.deb has no equivalent in the system driver install, and contains some programs with hard-wired 440.33.... version names, so leave it in and hope it works. The cuda-drivers_440.33.01-1_amd64.deb contains nothing but a changelog.
for f in *.deb; do
    echo "Unpacking deb $f"
    dpkg-deb --extract "$f" .
done
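If you would rather skip the driver debs than delete them by hand, the loop can filter on the file name. This is a sketch: the name patterns in should_unpack are my guesses at what the driver packages are called, so check them against your own listing. Both function names are hypothetical.

```shell
# Return 0 if a deb should be unpacked, 1 if it looks like a driver package.
# The patterns are guesses at the driver package names; adjust to your listing.
should_unpack() {
    case "$1" in
        nvidia-*|libnvidia-*|libxnvctrl*|xserver-xorg-video-nvidia*) return 1 ;;
        *) return 0 ;;
    esac
}

# Unpack every non-driver deb in the given directory, in place.
# The body runs in a subshell, so the caller's working directory is unchanged.
unpack_cuda_debs() (
    cd "$1" || exit 1
    for f in *.deb; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if should_unpack "$f"; then
            echo "Unpacking deb $f"
            dpkg-deb --extract "$f" .
        else
            echo "Skipping driver deb $f"
        fi
    done
)

# Usage:
# unpack_cuda_debs var/cuda-repo-10-2-local-10.2.89-440.33.01
```

Note that cuda-compat and cuda-drivers do not match the patterns, so they are still unpacked, as described above.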
See the "Installation Guide" link just below the script. It will have the system requirements which you should install yourself, since you will not be running the supplied installers.
Move the contents of the deeply embedded ...cuda-10.2 up to the install directory (cuda102). There are no conflicts. (Just move the cuda-10.2 directory up to cuda, for now.)
...cuda102/var/cuda-repo-10-2-local-10.2.89-440.33.01/usr/local/cuda-10.2
Now collect the stray files left outside cuda-10.2, or just leave them until needed, if ever:
var/c*/src: Fortran files already present in the cuda-10.2 src. Delete them.
var/c*/include: cublas .h files. Copy them over to cuda-10.2/include, then delete them.
var/c*/share/*: move everything to cuda-10.2/share (2 dirs, no overlap), then delete the share dir.
lib/pkgconfig: leave it until needed.
opt: leave it; it only has some Nvidia Nsight stuff of unknown utility.
etc: leave it; it holds a config file containing /usr/local/cuda-10.2/targets/x86_64-linux/lib,
which should be unneeded with a proper LD_LIBRARY_PATH, I would think.
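The src, include and share steps can be sketched as one function. Everything here is an assumption about your layout: merge_extras is a hypothetical name, its first argument is whatever directory holds the stray src, include and share trees after unpacking, and its second is the consolidated cuda-10.2 directory. It deletes files, so try it on a copy first.

```shell
# Fold the stray src/, include/ and share/ trees into the cuda-10.2 directory.
merge_extras() {
    extras=$1    # directory holding the stray trees (assumed layout)
    cuda=$2      # the consolidated cuda-10.2 directory

    # src: duplicates of files already in cuda-10.2, so just remove it.
    rm -rf "$extras/src"

    # include: copy the cublas headers across, then delete the originals.
    mkdir -p "$cuda/include"
    for h in "$extras"/include/cublas*.h; do
        [ -e "$h" ] || continue
        cp -p "$h" "$cuda/include/" && rm -f "$h"
    done

    # share: move each subdirectory across, then remove the emptied share dir.
    if [ -d "$extras/share" ]; then
        mkdir -p "$cuda/share"
        for d in "$extras"/share/*; do
            [ -e "$d" ] || continue
            mv "$d" "$cuda/share/"
        done
        rmdir "$extras/share"
    fi
}
```

The lib/pkgconfig, opt and etc directories are deliberately not touched, matching the notes above.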
You should have a location with all the CUDA libraries and binaries. Add the
bin and lib64 to your PATH and LD_LIBRARY_PATH at their beginnings. Note the
required gcc version for your CUDA selection -- the default 9.x version should
work, but the samples' makefiles will be set up to limit to a specific,
earlier, version. For current CUDA releases, the earlier compiler versions
should be available in the standard repositories, and are already installed,
except for g++-8.
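For the PATH and LD_LIBRARY_PATH step, something like the following in your shell startup file should do. The CUDA_HOME variable name is my own choice for the sketch; the path is the /usr/local/cuda-10.2 link created earlier.

```shell
# Put the CUDA bin and lib64 directories at the front of the search paths.
CUDA_HOME=/usr/local/cuda-10.2
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The ${VAR:+:$VAR} form only appends a colon when the variable already has a value, which avoids a trailing empty entry (an empty entry means the current directory).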
sudo apt-get install g++-8
Ubuntu 20.04 supplies gcc-8, but gcc-9 is the
default when "gcc" is invoked. Older CUDA releases may require a compiler
older than those supplied in the standard repositories, so use an older Ubuntu
release archive or get the source.
/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2:
error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
The gcc-8 version supplied by default is 8.4, which does not trigger the error message.
Forgetting to install g++-8 will cause misleading errors about gcc versions.
Manually changing the HOST_COMPILER may work, but may lead to undefined symbols.
HOST_COMPILER=/usr/bin/gcc-8 make
undefined reference to symbol '_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE7compareEPKc@@GLIBCXX_3.4.21'
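Since a missing g++-8 shows up as a confusing gcc version error, a pre-flight check on both compilers can save some head scratching. This is a sketch: gcc_major_ok and check_compiler are hypothetical helpers, and the limit of 8 is taken from the host_config.h error shown above.

```shell
# CUDA 10.2's host_config.h rejects gcc major versions above 8.
gcc_major_ok() {
    [ "$1" -le 8 ]
}

# Report whether a given compiler is installed and acceptable.
check_compiler() {
    cc=$1
    if ! command -v "$cc" >/dev/null 2>&1; then
        echo "$cc: not installed"
        return 1
    fi
    major=$("$cc" -dumpversion | cut -d . -f 1)
    if gcc_major_ok "$major"; then
        echo "$cc: major version $major, accepted"
    else
        echo "$cc: major version $major, rejected by CUDA 10.2"
        return 1
    fi
}

# Both the C and the C++ compiler need to be present.
for c in gcc-8 g++-8; do
    check_compiler "$c" || true
done
```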
Add soft links to these earlier version tools (gcc, g++, nm, ar, ranlib) in
your cuda/bin directory. Since cuda/bin is first in your PATH, they should
override the system defaults. Avoid using the update-alternatives mechanism to
change the system's default compiler. Every kernel update needs to recompile
parts of the Nvidia video driver, and an old compiler version is untested for
this and may not work.
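The links can be made in one go. This is a sketch under assumptions: link_old_tools is a hypothetical helper, and the link names gcc-ar, gcc-nm and gcc-ranlib are my reading of "nm, ar, ranlib" based on the versioned wrappers (gcc-ar-8 and friends) that Ubuntu's gcc-8 package ships.

```shell
# Link versioned tools (e.g. gcc-8) into a bin directory under unversioned names.
# Arguments: directory holding the tools, destination bin directory, version suffix.
link_old_tools() {
    tooldir=$1
    bindir=$2
    ver=$3
    for tool in gcc g++ gcc-ar gcc-nm gcc-ranlib; do
        target="$tooldir/$tool-$ver"
        if [ -x "$target" ]; then
            ln -sf "$target" "$bindir/$tool"
            echo "linked $bindir/$tool -> $target"
        else
            echo "skipped $tool: $target not found" >&2
        fi
    done
}

# Usage (add sudo if the bin directory is not yours):
# link_old_tools /usr/bin /usr/local/cuda-10.2/bin 8
```

Afterwards, "gcc --version" in a shell with the CUDA bin directory first in PATH should report version 8, while the system default stays untouched for kernel module builds.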
If the samples are not included in your initial deb, get their deb, install
them, and copy the samples directory to a writeable location, taking
ownership. Try to make a sample, like 5_Simulations/nbody. You should just have
to type "make", and the compile and link steps should work, producing an
executable, nbody. Run it with ./nbody.
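Copying the samples and building one might look like this. It is a sketch: copy_samples is a hypothetical helper, and the source and destination paths in the usage lines are assumptions about where your samples ended up and where you want them.

```shell
# Copy the samples tree to a writable location. Running this as your own
# user (no sudo, no -p) leaves the copies owned by you.
copy_samples() {
    src=$1
    dest=$2
    mkdir -p "$dest" && cp -R "$src/." "$dest/"
}

# Usage on a real install (paths assumed):
# copy_samples /usr/local/cuda-10.2/samples "$HOME/cuda-samples"
# cd "$HOME/cuda-samples/5_Simulations/nbody" && make && ./nbody
```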
You can also run make from the top level of the samples, and note any missing libraries
for some samples. At least one sample, simpleDevice..., seems to need a lot of
memory, maybe more than is available.
apt-get may work, but all the files that should be under the root folder are now separated into different categories. Thank you for the clarification. However, when I tried to use the .run file downloaded from the Nvidia official website, there were always some errors, such as failed pre-scripts. Then we need to disable the Nouveau kernel driver, which is quite annoying. Do you have any advice on it? Thank you a lot! – Nick Nick Nick Jul 05 '20 at 02:08