
I have a Dell XPS 15 9570 laptop with an Intel and an NVIDIA GPU, running Ubuntu Linux 18.04, and I would like to use the NVIDIA card exclusively for training deep neural networks. I managed to have the X server running on the Intel GPU following

How to configure iGPU for xserver and nvidia GPU for CUDA work?

It works perfectly when, in GDM3, I select to log in using the GNOME shell (Ubuntu on Wayland): running nvidia-smi then shows that no process is running on the GPU. However, I now wanted to try KDE with Plasma, and there the X server ends up on the NVIDIA GPU.

(base) ooo: (~) 505> nvidia-smi 
Sat Jul 13 14:30:18 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P5    N/A /  N/A |     66MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2073      G   /usr/lib/xorg/Xorg                            66MiB |
+-----------------------------------------------------------------------------+

I tried to force the X server to use the Intel GPU by adding the two config files

/etc/X11/xorg.conf.d/01-noautogpu.conf
/etc/X11/xorg.conf.d/20-intel.conf

into /etc/X11/xorg.conf.d as explained here

https://gist.github.com/s41m0n/323513c95290c85f7054384ac34c41c5
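
If I recall the approach correctly, the two files boil down to something like the following sketch (not verbatim from the gist; the BusID assumes the Intel GPU at PCI 00:02.0, as the lspci output further down confirms for my machine):

# /etc/X11/xorg.conf.d/01-noautogpu.conf
# keep Xorg from automatically adding the NVIDIA GPU as a secondary device
Section "ServerFlags"
    Option "AutoAddGPU" "off"
EndSection

# /etc/X11/xorg.conf.d/20-intel.conf
# declare the Intel iGPU explicitly so Xorg uses it
Section "Device"
    Identifier "intel"
    Driver "intel"
    BusID "PCI:0:2:0"
EndSection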

Unfortunately, the result is that the screen remains black after login. It seems that plasmashell itself finds and uses the NVIDIA GPU.

Any idea how to force Plasma to use the Intel GPU would be very much appreciated.

A Roebel
  • With laptops with hybrid graphics you can't do what you want. That's possible with certain desktops, though. – GabrielaGarcia Jul 15 '19 at 12:55
  • @GabrielaGarcia, I don't quite agree with what you are saying, nor with the fact that you apparently downvoted my question and the answer. First, it seems to me that asking a question whose answer turns out to be "this is not possible" does not make it a bad question. Second, I would like you to look at the ps/nvidia-smi results I added to the answer, which prove that my answer indeed has the desired effect. – A Roebel Jul 15 '19 at 22:52
  • @GabrielaGarcia, I would be curious to understand why you believe that this would be possible on a desktop and not on a laptop. The software running on laptops and desktops is the same, isn't it? What is the fundamental difference? – A Roebel Jul 15 '19 at 23:04
  • running nvidia-smi shows that no process is running on the GPU... Yes, because of Wayland, which doesn't work with Nvidia graphics (but works with nouveau). This seems to be the root of this confusion. Hybrid graphics are switchable; they do not work simultaneously. Whatever profile is selected, one OR the other will work, both for rendering the desktop and for everything else. Different tasks are possible only on some desktops with multiple independent cards and proper BIOS/UEFI settings. – GabrielaGarcia Jul 16 '19 at 14:25
  • @GabrielaGarcia - Please read this thread, where nvidia devs clearly stated (in 2013) that CUDA can run on an NVIDIA GPU while the Xorg server runs on the Intel GPU at the same time. With the setup given in my answer, the GPU memory used is 0 when I don't run CUDA; I don't think whoever manages the display can do that without using memory on the card. So at least with CUDA (my question) it is not one card or the other, you can use both. – A Roebel Jul 17 '19 at 19:19
  • Your link is about desktops ("Intel + nVidia simultaneously. Is it possible for Linux desktop?"), which was my point from the very beginning: certain desktops with the proper configuration can use the discrete graphics headless. And the GPU memory used is 0 when you don't run CUDA because you're running on the Intel iGPU, which uses the system's shared memory (the dedicated VRAM is exclusively for Nvidia). Once you run CUDA, it's Nvidia doing everything. – GabrielaGarcia Jul 17 '19 at 19:58
  • Further measurements: I re-established my original setup with nvidia as primary GPU. Memory use for my desktop (with Xorg, not Xwayland) and no CUDA loaded is 123MB on the nvidia GPU; loading TensorFlow adds the 109MB from the answer. Then I switched back to the Intel GPU as PrimaryGPU and started the session again with Xorg (not Xwayland). I get 0 memory used on the Nvidia GPU, and after running TensorFlow I again get the 109MB. So where are the 123MB that my desktop is using? – A Roebel Jul 17 '19 at 21:55
  • Here is a link from nvidia specifically for laptops: https://devtalk.nvidia.com/default/topic/991849/-solved-run-cuda-on-dedicated-nvidia-gpu-while-connecting-monitors-to-intel-hd-graphics-is-this-possible-/ So according to nvidia it is basically unproblematic for desktops and depends on the design for Optimus laptops, which means it certainly can be possible with laptops. They give as an indicator the fact that you can see both GPUs in the lspci output (positive in my case), and with the measurements above I am rather confident that it works on my laptop. Thanks for your patience. – A Roebel Jul 17 '19 at 22:27

1 Answer


After discovering the question and answers in How to configure iGPU for xserver and nvidia GPU for CUDA work, notably the answer of user890178, and after studying the syslog, I finally found that it is not Plasma that does anything specific: the problem is the same for the GNOME and Plasma shells when using Xorg. With Xorg, the gpu-manager service

/lib/systemd/system/gpu-manager.service

is triggered by the display manager via

/etc/systemd/system/display-manager.service.wants/gpu-manager.service

and gpu-manager detects the NVIDIA card and writes the file

/usr/share/X11/xorg.conf.d/11-nvidia-prime.conf

which contains

# DO NOT EDIT. AUTOMATICALLY GENERATED BY gpu-manager

Section "OutputClass"
    Identifier "Nvidia Prime"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
    Option "IgnoreDisplayDevices" "CRT"
    Option "PrimaryGPU" "Yes"
    ModulePath "/x86_64-linux-gnu/nvidia/xorg"
EndSection

This file is not used by Wayland, which is why the NVIDIA card stays idle there, but it is used for GNOME Shell on Ubuntu and for Plasma when they run on Xorg. So in fact both desktops will use the NVIDIA card under Xorg.
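
To verify this chain on your own system, something like the following can be used (a sketch; the log path is the Ubuntu default and may differ):

systemctl cat gpu-manager.service                      # the unit and the command it runs
cat /var/log/gpu-manager.log                           # what gpu-manager detected and decided
cat /usr/share/X11/xorg.conf.d/11-nvidia-prime.conf    # the file it generated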

The solution is then a variation of the answer of Maksym Ganenko in the same question above, which means replacing /usr/share/X11/xorg.conf.d/11-nvidia-prime.conf by

# DO NOT EDIT. AUTOMATICALLY GENERATED BY gpu-manager

Section "OutputClass"
    Identifier "Nvidia Prime"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
    Option "IgnoreDisplayDevices" "CRT"
    # Option "PrimaryGPU" "Yes"   <<< commented out
    ModulePath "/x86_64-linux-gnu/nvidia/xorg"
EndSection

with the following section added

Section "OutputClass"
    Identifier "intel"
    MatchDriver "i915"
    Driver "modesetting"
    Option "PrimaryGPU" "yes"
EndSection

Additionally, to keep gpu-manager from overwriting these changes when the next session starts, follow the advice of Oren in the question gpu-manager overwrites xorg.conf and protect the file against changes by running

chattr +i /usr/share/X11/xorg.conf.d/11-nvidia-prime.conf
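
To check that the immutable flag is set, and to remove it again whenever the file needs to be edited:

lsattr /usr/share/X11/xorg.conf.d/11-nvidia-prime.conf           # an 'i' in the flags confirms it
sudo chattr -i /usr/share/X11/xorg.conf.d/11-nvidia-prime.conf   # make the file writable again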

It seems that the screen remained black after I added the two files mentioned in the question to /etc/X11/xorg.conf.d because, together with the files in /usr/share/X11/xorg.conf.d, the configuration contained contradictory information.

Given the comment of GabrielaGarcia, which astonishingly claims that what I ask cannot work on a laptop, I feel the need to provide proof that what I asked can work, and that the answer I provided is indeed a means to make it work.

Here is the output of lspci, proving the existence of two graphics cards:

(base) m3088: (~) 505> lspci | egrep "VGA|NVIDIA"
00:02.0 VGA compatible controller: Intel Corporation Device 3e9b
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)

Here is the output of ps aux, filtered for Xorg, plasmashell, and the Anaconda Python process running a TensorFlow session. It shows that all of these run happily together, while Plasma and Xorg do not use the NVIDIA card, as desired (see the nvidia-smi output below):

(base) m3088: (~) 511> ps aux  | egrep "Xorg|plasmashell|anaconda"
roebel   13139  0.9  5.1 17315584 819236 pts/1 Sl+  00:23   0:10 /data/anasynth/anaconda3/bin/python /data/anasynth/anaconda3/bin/ipython
roebel   16198  0.0  0.0  21540  1068 pts/5    S+   00:42   0:00 grep -E Xorg|plasmashell|anaconda
roebel   18886  1.5  1.3 628292 210572 tty2    Sl+  juil.14  24:22 /usr/lib/xorg/Xorg vt2 -displayfd 3 -auth /run/user/1000/gdm/Xauthority -background none -noreset -keeptty -verbose 3
roebel   19171  2.0  3.4 6576588 561212 ?      Sl   juil.14  33:16 /usr/bin/plasmashell

Here is the output of nvidia-smi, proving that Xorg is not using the NVIDIA card, but the TensorFlow session in Anaconda Python is using it:

(base) m3088: (~) 506> nvidia-smi
Tue Jul 16 00:34:51 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8    N/A /  N/A |    123MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     13139      C   /data/anasynth/anaconda3/bin/python          109MiB |
+-----------------------------------------------------------------------------+
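
A more compact way to obtain the same evidence is nvidia-smi's query mode, which lists only compute (CUDA) processes; with Xorg on the Intel GPU, the Python process should be the only entry:

nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv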

I am ready to provide screenshots to show that all this happens on a laptop.

EDIT: Update for Ubuntu 22.04

I finally started to use Wayland, with the unfortunate side effect that the previous solution no longer worked: gnome-shell was running on the NVIDIA GPU, which subsequently led to some problems with the interface. Following the discussion here, I tried uninstalling the NVIDIA Wayland support package

sudo apt remove libnvidia-egl-wayland1

and gnome-shell now no longer runs on the NVIDIA GPU, keeping it free for DNN training.
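
After logging back in, the result can be checked with something like this (a sketch):

echo $XDG_SESSION_TYPE    # should print "wayland"
nvidia-smi                # gnome-shell should no longer appear in the process list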

A Roebel