I reinstalled a fresh Ubuntu 16.04.02 last week on an i7 Sandy Bridge laptop with Nvidia + Intel (Optimus) graphics, which previously ran the same setup without any problem.
Since then, I have been experiencing random system crashes while writing emails, editing photos, etc., with the Nvidia GPU enabled or disabled (no pattern here). The system just stops working: no error message, no input, no console available, the display is frozen, and the CPU keeps heating up (guessing from the fan RPM) until I manually shut down the computer.
Removing all the Nvidia packages seems to resolve the issue, so I suspect the Nvidia drivers are responsible. In /var/log/syslog, this line appears many times:
nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000857d:0:0:0x00000033
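To put a number on "many times", something along these lines could count the matching entries per day. This is only a sketch: the function name count_modeset_errors is made up for illustration, and it assumes the traditional syslog timestamp format ("Mon DD HH:MM:SS") seen in the excerpts in this post.

```python
import re
from collections import Counter

# Matches the nvidia-modeset error quoted above. The trailing status codes are
# left out of the pattern so that variants of the same message are counted too.
PATTERN = re.compile(
    r"nvidia-modeset: ERROR: GPU:\d+: Failed to query display engine channel state"
)

def count_modeset_errors(lines):
    """Return a Counter mapping 'Mon DD' to the number of matching log lines.

    `lines` is any iterable of strings, e.g. an open file object.
    """
    per_day = Counter()
    for line in lines:
        if PATTERN.search(line):
            # Assumed format: lines start with 'Mon DD HH:MM:SS'.
            month_and_day = " ".join(line.split()[:2])
            per_day[month_and_day] += 1
    return per_day
```

It could be called as count_modeset_errors(open("/var/log/syslog", errors="replace")); errors="replace" keeps the read from failing on invalid bytes in the file.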
I run the nvidia-367.57 driver from the Ubuntu repos, the xserver-xorg-hwe-16.04 stack and the linux-generic-hwe-16.04 kernel (linux-4.8.0.39.10). It's the same with the nvidia-375 driver and even worse with nvidia-378. But again, as the crashes are not really reproducible, that could just be bad luck.
Here are the last few lines of the syslog before a crash:
Feb 23 10:51:02 ouranos anacron[1277]: Job `cron.weekly' started
Feb 23 10:51:02 ouranos anacron[3472]: Updated timestamp for job `cron.weekly' to 2017-02-23
Feb 23 10:56:02 ouranos systemd[1]: Starting Cleanup of Temporary Directories...
Feb 23 10:56:02 ouranos systemd-tmpfiles[3506]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
Feb 23 10:56:04 ouranos systemd[1]: Started Cleanup of Temporary Directories.
Feb 23 10:56:22 ouranos com.canonical.Unity.Scope.Applications[2356]: Error loading package indexes: Couldn't stat '/var/cache/software-center/xapian'
Feb 23 10:56:22 ouranos com.canonical.Unity.Scope.Applications[2356]: (unity-scope-loader:3525): unity-applications-daemon-CRITICAL **: daemon.vala:144: Failed to load Software Center index. 'Apps Available for Download' will not be listed
Feb 23 10:56:25 ouranos gnome-session[2531]: Gtk-Message: GtkDialog mapped without a transient parent. This is discouraged.
Feb 23 11:02:29 ouranos anacron[1277]: Job `cron.weekly' terminated
Feb 23 11:02:29 ouranos anacron[1277]: Normal exit (1 job run)
Feb 23 11:06:25 ouranos thermald[1355]: sysfs write failed trip_point_0_temp
Feb 23 11:06:29 ouranos thermald[1355]: sysfs write failed trip_point_0_temp
Feb 23 11:06:36 ouranos systemd[1]: Started CUPS Scheduler.
Feb 23 11:06:37 ouranos thermald[1355]: sysfs write failed trip_point_0_temp
And another one:
Feb 23 14:05:00 ouranos gnome-session[7432]: Done!
Feb 23 14:05:13 ouranos thermald[1350]: sysfs write failed trip_point_0_temp
Feb 23 14:05:16 ouranos bluetoothd[1317]: Endpoint unregistered: sender=:1.254 path=/MediaEndpoint/A2DPSource
Feb 23 14:05:16 ouranos bluetoothd[1317]: Endpoint unregistered: sender=:1.254 path=/MediaEndpoint/A2DPSink
Feb 23 14:05:19 ouranos org.gnome.zeitgeist.Engine[7259]: ** (zeitgeist-datahub:8084): WARNING **: zeitgeist-datahub.vala:229: Unable to get name "org.gnome.zeitgeist.datahub" on the bus!
Feb 23 14:05:21 ouranos thermald[1350]: sysfs write failed trip_point_0_temp
Feb 23 14:05:29 ouranos gnome-session[7432]: ** (zeitgeist-datahub:8064): WARNING **: zeitgeist-datahub.vala:212: Error during inserting events: GDBus.Error:org.gnome.zeitgeist.EngineError.InvalidArgument: Incomplete event: interpretation, manifestation and actor are required
Feb 23 14:05:29 ouranos gnome-session[7432]: [2017-02-23T19:05:29] [ERR] hddtemp : échec de l'ouverture de la connexion.
Feb 23 14:05:29 ouranos gnome-session[7432]: [2017-02-23T19:05:29] [ERR] atasmart : échec de sk_disk_open() : /dev/sda.
Feb 23 14:05:29 ouranos gnome-session[7432]: [2017-02-23T19:05:29] [ERR] atasmart : échec de sk_disk_open() : /dev/sdb.
\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00
(Note: /dev/sda is the local HDD and /dev/sdb is an external USB HDD.)
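The run of \00 at the end of the second excerpt looks to me like NUL padding left in the file by the hard power-off, but that is an assumption on my part. If it holds, a sketch like the following could pull the last entries written before each such run, which is how I would locate the other crashes in the file. The function name lines_before_nul_runs and the threshold of 8 NUL bytes are arbitrary choices for illustration.

```python
import re

# A "run" here means 8 or more consecutive NUL bytes; the threshold is arbitrary.
NUL_RUN = re.compile(rb"\x00{8,}")

def lines_before_nul_runs(data, n=5):
    """For each run of NUL bytes in `data` (bytes), return the last `n`
    non-empty lines that precede it, as a list of lists of strings."""
    results = []
    start = 0
    for match in NUL_RUN.finditer(data):
        # Text between the previous NUL run (or file start) and this one.
        chunk = data[start:match.start()]
        text = chunk.decode("utf-8", errors="replace")
        lines = [ln for ln in text.splitlines() if ln.strip()]
        results.append(lines[-n:])
        start = match.end()
    return results
```

The file has to be read in binary mode for this, e.g. lines_before_nul_runs(open("/var/log/syslog", "rb").read()).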
How can I find a trace of what caused the crash? Is the nvidia-modeset error something I should worry about?
Since my CPU is from the Sandy Bridge generation, the Baytrail bug affecting Pstate is most likely not the cause of the problem.