My 16.04.1 desktop is going to sleep (hibernating?) at seemingly random times. When I check on it, the case fan is still running and the network adapter is still blinking away, so it doesn't appear to be fully powered off. There is no signal to the display, any SSH connections I have are dropped, and typing on my USB keyboard doesn't wake it. I've seen this answer and this answer and tried them both. CPU temperatures appear fine (this snapshot seems representative of the system):

$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +33.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:       +33.0°C  (high = +105.0°C, crit = +105.0°C)
Core 2:       +33.0°C  (high = +105.0°C, crit = +105.0°C)
Core 3:       +33.0°C  (high = +105.0°C, crit = +105.0°C)

I ran memtest for 8 hours with no memory errors (note that it didn't turn itself off during the memtest). I've cruised through /var/log/syslog but haven't found anything particularly notable in the time leading up to the system shutting off. Here's an example of the output leading up to losing the machine:

Dec  3 23:29:45 machine anacron[698]: Job `cron.daily' terminated
Dec  3 23:29:45 machine anacron[698]: Normal exit (1 job run)
Dec  3 23:39:39 machine systemd[1]: Starting Cleanup of Temporary Directories...
Dec  3 23:39:39 machine systemd-tmpfiles[3273]: [/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
Dec  3 23:39:39 machine systemd[1]: Started Cleanup of Temporary Directories.
Dec  4 00:17:01 machine CRON[3400]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Dec  4 00:24:47 machine org.gnome.evolution.dataserver.Sources5[1118]: ** (evolution-source-registry:1903): WARNING **: secret_service_search_sync: must specify at least one attribute to match
Dec  4 00:32:23 machine kernel: [ 4072.573263] perf interrupt took too long (6332 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
Dec  4 00:35:19 machine kernel: [ 4248.653576] perf interrupt took too long (6327 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
Dec  4 01:17:01 machine CRON[3607]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Dec  4 02:04:01 machine CRON[3748]: (root) CMD (   test -x /etc/cron.daily/popularity-contest && /etc/cron.daily/popularity-contest --crond)
Dec  4 02:17:01 machine CRON[3792]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Dec  4 03:17:01 machine CRON[3985]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Dec  4 03:27:14 machine systemd[1]: snapd.refresh.timer: Adding 3min 56.294250s random time.
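
One way to pin down when the machine actually went quiet would be to scan the log for long silences and look at the last line before each one. Below is a minimal sketch in Python, assuming the classic `Mon DD HH:MM:SS` syslog stamp format shown in the excerpt above. `parse_stamp` and `find_gaps` are hypothetical helper names, and the year has to be passed in because the stamps omit it (so a log that spans New Year would need extra handling).

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Return the datetime at the start of a classic syslog line, or None."""
    parts = line.split(None, 3)
    if len(parts) < 3:
        return None
    try:
        # Classic syslog stamps omit the year, so the caller supplies one.
        return datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def find_gaps(lines, threshold, year):
    """Yield (last_line_before_gap, gap) for each silence longer than threshold."""
    prev_stamp, prev_line = None, None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue  # skip lines without a parseable timestamp
        if prev_stamp is not None and stamp - prev_stamp > threshold:
            yield prev_line.rstrip("\n"), stamp - prev_stamp
        prev_stamp, prev_line = stamp, line
```

Something like `for line, gap in find_gaps(open("/var/log/syslog"), timedelta(minutes=90), 2016): print(gap, line)` would then list each long silence. The 90-minute threshold is an assumption based on the hourly cron entries above, which should otherwise keep the log from going quiet for more than an hour.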

Physically, the machine is hardwired to the network and has a Bluetooth USB receiver for a Bluetooth keyboard. I previously had Ubuntu 14.04.5 running on the machine with the same hardware without issues, and then did a clean install of Ubuntu 16.04.

This is almost a clean install of Ubuntu - I installed OpenSSH server, Docker and TightVNC, but everything else I've installed has been an attempt to diagnose why this is happening.

Any ideas as to what would cause this? Or where I should look to try and find why this is happening? If I can't get this sorted out, then I guess it's back to Ubuntu 14 for me...

Reaching the end of my rope with Ubuntu knowledge, I looked at the actual hardware and found the motherboard:

$ sudo lshw
... <snip> ...
*-core
       description: Motherboard
       product: Q1900M
       vendor: ASRock
       physical id: 0
       serial: M80-4A010400202
     *-firmware
          description: BIOS
          vendor: American Megatrends Inc.
          physical id: 0
          version: P1.40
          date: 09/01/2014
          size: 64KiB
          capacity: 8128KiB
          capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification

It was running an older BIOS version (P1.40), while 1.70 is available. I've just upgraded the BIOS to see if that has any impact.

  • I just updated the BIOS, so I'll wait until it next hangs (or whatever it's doing) before trying a kernel update just to be sure the BIOS change isn't part of the equation.

    Is a kernel update "sketchy" at all? Presently the machine is using 4.4.0-51-generic. Are there any considerations before upgrading the kernel like that? And why is it still on 4.4 when 4.5 has been released?

    – user2152081 Dec 04 '16 at 19:45
  • @sbarb - Could be completely cognitive bias, but it seems more stable now with the BIOS update (stays up for many hours at a time instead of just a few), but the computer was unresponsive when I got home today. I'll give the kernel update a try now. – user2152081 Dec 06 '16 at 01:14
  • I upgraded to 4.7.0-040700.201608021801 and the computer flashed a bunch of black and white squiggles and then sat at a black screen. Back to 4.4 right now, I'll try 4.6 and if that doesn't work I'll try 4.5. – user2152081 Dec 06 '16 at 02:29
  • Computer hung during kernel update. Now the computer is having all sorts of trouble. I'm thinking at this point I'll do a clean install of 16.10 to get the updated kernel and see how that goes. – user2152081 Dec 07 '16 at 04:08
  • I've installed 16.10, now I'll wait and see if it's any better - will report back. – user2152081 Dec 10 '16 at 21:54
  • With the clean install of Ubuntu 16.10, all I've done is upgrade packages and install OpenSSH server. So far it has been up stably for just under a day. I'll keep it as is for a couple days to see if it was something I installed that caused it to hang, then I'll reintroduce the programs I had previously installed. – user2152081 Dec 11 '16 at 17:32
  • The machine has been completely stable running 16.10 for more than 2 days, so I'm going to try reintroducing software, see how it goes. – user2152081 Dec 13 '16 at 04:51
  • In the end, I suspect that one of the sticks of RAM was not completely seated or that there was some kind of debris in the slot. I've since installed Ubuntu 19.04 on the machine and it has been stable for a few weeks. Whether it was some obscure bug or a physical issue, I'll never know. – user2152081 May 16 '19 at 02:25
