I have upgraded to the latest (beta) Nvidia driver, nvidia-381, due to an issue with the previous driver. I had a problem with window edges after waking from suspend - see here. For this reason I upgraded to the newer driver, from 375.39 to 381.09.
Since upgrading, I have had to reinstall Nvidia's Cuda Toolkit 8.0 (and CUDNN v5.1). However, a driver file, libcuda.so.1, seems to be missing. This prevents me from installing both Tensorflow and the gputools package in R: both build upon the Cuda Toolkit, which in turn needs that file.
Neither Tensorflow nor gputools is able to locate the file libcuda.so.1. With the previous driver I was able to install the Cuda Toolkit without issues.
Here is a similar issue, but with older drivers involved: https://github.com/tensorflow/tensorflow/issues/4078
I have read that I could possibly create this file myself, as it is a symlink. However, I would prefer not to, as I do not know what other dependencies exist. Example of a possible workaround: https://stackoverflow.com/questions/41890549/tensorflow-cannot-open-libcuda-so-1
I am running Ubuntu 16.04. I have also posted this question on the Ubuntu Launchpad.
Questions:
- Can somebody see why this file is missing or propose a stable solution?
- If I am going to have to change my driver, which is the best way to downgrade, and how do I decide which version to downgrade to?
Extra info:
If I search for the missing file on my system, I find the following similar files, but not the one I need:
user@user $ tree / -fiC | grep libcuda.so
/usr/local/cuda-8.0/doc/man/man7/libcuda.so.7
/usr/local/cuda-8.0/lib64/stubs/libcuda.so
/usr/share/man/man7/libcuda.so.7
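A narrower search than walking the whole filesystem could look roughly like the sketch below. It is only an illustration: the function name `find_libcuda` is made up here, and it assumes GNU `find` and `readlink -f` as shipped with Ubuntu. For each match it also prints where a symlink resolves to, which makes it easier to tell a real library from a link to something else.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every libcuda.so* entry under the given
# directories and show what each symlink resolves to.
find_libcuda() {
    local dir path
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        [ -d "$dir" ] || continue
        find "$dir" -name 'libcuda.so*' 2>/dev/null | sort |
        while IFS= read -r path; do
            if [ -L "$path" ]; then
                # Symlink: print the link and its fully resolved target.
                printf '%s -> %s\n' "$path" "$(readlink -f "$path")"
            else
                printf '%s\n' "$path"
            fi
        done
    done
}

# Toolkit library directory taken from the output above.
find_libcuda /usr/local/cuda-8.0/lib64
```

Further directories can be passed as extra arguments; which ones a driver package actually installs into is something I have not verified for nvidia-381.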
If I check the uninstall manifest to see which files the given uninstallation script would remove, libcuda.so.1 is not listed, so the installation was never aware of that file:
user@user $ /usr/local/cuda-8.0/bin$ cat .uninstall_manifest_do_not_delete.txt | grep libcuda.so
file:/usr/share/man/man7/libcuda.so.7:5708adf9bb3c591eb4f1d0d50e78f3df
file:/usr/local/cuda-8.0/lib64/stubs/libcuda.so:8347cb2f5500934b1942ba42f3979fac
file:/usr/local/cuda-8.0/doc/man/man7/libcuda.so.7:5708adf9bb3c591eb4f1d0d50e78f3df
As there is a stub libcuda.so file (seen in the output above), I created the missing libcuda.so.1 as a symlink to that stub:
user@user $ sudo ln -s /usr/local/cuda-8.0/lib64/stubs/libcuda.so /usr/local/cuda-8.0/lib64/libcuda.so.1
This actually allowed the gputools package in R to be successfully installed, however the functions that call upon the GPU failed:
R> gpuMatMult(matA, matB)
Error in gpuMatMult(matA, matB) : device memory allocation failed
Calls: gpuMatMult -> .Call
Using the deviceQuery utility that is bundled into the samples of the Cuda Toolkit (you have to first sudo make it), I see there is definitely something wrong, which Cuda notices itself:
user@user $ /usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
Current status:
I have downgraded to the previous driver I knew to work with the Cuda Toolkit 8.0 and CUDNN 5.1: nvidia-378.13. The tools that use Cuda and CUDNN are also now working fine as before, e.g. tensorflow, gputools (in R), etc.
Everything is working just as expected, including the bug showing pixelated window edges after waking from suspend.
The nvidia-381 driver doesn't include the libcuda.so.1 file that is needed by both tensorflow and theano. I spent hours trying to figure it out, and sudo apt install nvidia-378 along with a reboot got libcuda.so.1 installed and still worked with my 1080 ti. – Chris Apr 24 '17 at 04:06