For the past couple of days I have been experiencing a lot of freezes, and I do not know how to investigate them.

It does not matter what I do - browsing, playing music, typing on LaTeX - at some point (after 2 minutes, 10 minutes or even after 1 hour) the computer freezes.
It becomes unresponsive to anything, the LED on the Caps Lock starts blinking and it stays like that forever.
Even the "magic" REISUB does not work (I enabled it to try to avoid corrupting my HDD with these frequent hard shutdowns).

The only thing that works is to long-press the power button and force shutdown.

I had a look at the log files in /var/log, but they did not help (nothing gets registered).

This is my hardware.

I am on 15.04 with kernel 3.19.0-29-generic.
I tried reverting to older kernels, without success. In particular, 3.19.0-28-generic has the same problem.

Any hint on how to investigate this further?

PS: With Windows 8 there is no problem, even with intensive gameplay, so I would tend to rule out hardware problems.
PPS: Temperatures are not a problem either; I was able to monitor them with sensors in a terminal.

Zanna
dadexix86
  • No comments? It is really an annoying behaviour :( – dadexix86 Sep 15 '15 at 06:11
  • Try to rule out a hardware problem with a memory test. Memtest86+ is available from the GRUB menu. – Dee Sep 15 '15 at 20:36
  • I didn't think about it, thanks for the hint. It's running right now, I'll keep you posted. – dadexix86 Sep 15 '15 at 20:42
  • What's your machine's manufacturer? A blinking Caps Lock probably means something ("ask" the manufacturer, see for instance http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c01732674 ). – nutty about natty Sep 15 '15 at 20:49
  • This is also a good hint. The manufacturer is Dell, the laptop is the 15z. I'll Google for info (but I doubt I'll find any) when memtest finishes. – dadexix86 Sep 15 '15 at 20:51
  • noleti implicitly pointed out that this is not related to the manufacturer, but is rather a feature implemented in the kernel: the keyword now is kernel panic. PS: memtest won't "finish" (...) – nutty about natty Sep 15 '15 at 21:09
  • I know that memtest doesn't "finish", strictly speaking... But above 50% one is almost sure already that the problem is not the RAM, since all the blocks have been used already once ;) I'll go with the kernel panic trail now :) – dadexix86 Sep 15 '15 at 21:12
  • Are you running RAID? 82801 Mobile SATA Controller [RAID mode]. Or is this an Ultrabook where you turned off Intel SRT and used a small SSD for Ubuntu? I might change to AHCI mode. – oldfred Sep 15 '15 at 21:40
  • Sounds either like the power supply is failing or like the CPU vcore is dropping too much when idling. Very unlikely, but does the BIOS allow you to increase the vcore / drop the base frequency of the CPU to check if the vcore dropping is actually the problem? – kos Sep 15 '15 at 22:12
  • @oldfred No, I do not have RAID and never had. I have an SSD + HDD and don't know what SRT is. Ubuntu is installed on the SSD and the data are on the HDD, and it has been that way for more than a couple of years now :) – dadexix86 Sep 16 '15 at 06:25
  • @kos I have never heard of such symptoms for those problems. The supply is not failing for the whole system (the screen is up, as well as the keyboard backlight, and the battery is not removable, so in case of line drops it switches to it). My BIOS indeed does not allow manual modification of the vcore or base frequencies; I can, though, modify the governor (now it's in powersave mode). – dadexix86 Sep 16 '15 at 06:30
  • The fact that the freeze happens only when idling, that nothing goes into syslog and that everything just freezes without visibly affecting the running system makes faulty hardware very likely (although indeed not certain). The fact that everything else is still up and running (monitor etc.) is irrelevant, since a failing power supply usually starts by not being able to sustain a constant voltage, which may result in the problem you're describing; a small (but relatively significant) drop in the vcore (we're talking about 0.01V-0.02V) might end up freezing the system. – kos Sep 16 '15 at 08:28
  • You may try setting the frequency using cpufrequtils each time you boot to a frequency low enough to make sure that the CPU won't halt on a sudden vcore drop for a while; if nothing happens when the frequency is manually set like that then probably the power supply failing is not the problem – kos Sep 16 '15 at 08:34
  • @kos I never said that it happens only when idle. Indeed, I said "browsing, playing music, typing on LaTeX"... – dadexix86 Sep 16 '15 at 12:06
  • I have a similar problem, I think in my case it's a bug in Intel driver. Try https://wiki.archlinux.org/index.php/Intel_graphics#SNA_issues --- I am now trying to run with UXA acceleration, let's see. (To me it happens once a day or less, so it's quite difficult to check and test things.) – Rmano Sep 16 '15 at 13:13
  • I didn't notice that "intensive gameplay" referred to Windows and not to Ubuntu, my bad; however, a better way to test a possible vcore drop, if you want to test this, would be keeping the CPU under load using cpustress to see if the problem goes away. – kos Sep 16 '15 at 16:27
  • @Rmano I don't have any of the issues described in that Arch wiki page. I am updating to the latest Intel graphic stack now, we'll see what happens. – dadexix86 Sep 16 '15 at 17:36
  • I had worse behavior with the latest stack. In my case, the display simply flashes for a couple of seconds and then the machine hangs. Running since this morning with UXA acceleration, but it's too early to say anything. But yes, it could be different --- sounds like a panic. Try to find someone who can help with another PC and set up netconsole... – Rmano Sep 16 '15 at 18:23
  • @Rmano It looks like the new stack is working better than before for me. The last freeze was before installing it, so I would not rule out that that was the problem. I'll keep testing over the next few days and see if the problem comes back. – dadexix86 Sep 17 '15 at 06:27
  • @dadexix86 - is this wildly necessary to own a dual-boot machine with something in about near Redmond/Richmond ??? When you are fond of gaming then you have to accept ransom-ware-like malware which try to build in booting malware, which again should prevent linux (rescue-mode of Linux or protected mode of Linux) ... – dschinn1001 Sep 22 '15 at 17:23
  • @dschinn1001 I'm sorry but I really do not understand your comment... Maybe it is just my bad English! Can you please elaborate a bit? ;) – dadexix86 Sep 22 '15 at 17:27
  • @dadexix86 - i mean ... with less dual-boot then you have less trouble ?! in case of dual-boot only use linux-distros and less proprietary operating systems ?! – dschinn1001 Sep 22 '15 at 18:16
  • @dschinn1001 I still don't understand what's your point, sorry :) – dadexix86 Sep 22 '15 at 18:20
  • use a dictionary ... – dschinn1001 Sep 22 '15 at 18:23
  • I did, so here we are ;) "with less dual-boot then you have less trouble ?!" no, definitely not. Never had problems with dual boot and surely this is not related to it (still dual boot, but now it's working ;) ) "in case of dual-boot only use linux-distros and less proprietary operating systems ?!" Assuming a comma before "only", I really don't see your point here. This problem is definitely unrelated to booting process (which is the only thing that could be affected by dual booting). Indeed, the problem is now solved and I assure you that the dual boot with a proprietary OS still works fine. – dadexix86 Sep 22 '15 at 18:28

3 Answers

The blinking caps lock is probably caused by a kernel panic, more info here. Look for log files as instructed here to debug this. It seems like there should be something related in /var/log/syslog.
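
As a rough illustration of what searching the logs could look like, here is a minimal Python sketch that scans a log file for strings that usually accompany a kernel crash. The marker list and the function names `find_panic_lines` and `scan_log_file` are assumptions made for this example, not part of any Ubuntu tool. Keep in mind that a hard panic may freeze the machine before anything reaches the disk, in which case the scan simply comes back empty.

```python
import re
from pathlib import Path

# Strings that commonly show up in kernel log lines around a crash.
# This list is an assumption for illustration and is not exhaustive.
# The word boundaries keep "BUG:" from matching inside "DEBUG:".
PANIC_MARKERS = re.compile(r"Kernel panic|\bOops\b|\bBUG:|general protection fault")


def find_panic_lines(log_text):
    """Return (line_number, line) pairs for lines that match a crash marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if PANIC_MARKERS.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_log_file(path="/var/log/syslog"):
    """Scan one log file; return an empty list if it cannot be read."""
    try:
        text = Path(path).read_text(errors="replace")
    except OSError:
        return []
    return find_panic_lines(text)


if __name__ == "__main__":
    for number, line in scan_log_file():
        print(f"{number}: {line}")
```

Depending on file permissions, reading /var/log/syslog may require sudo. The same function can be pointed at rotated logs such as /var/log/syslog.1 to look at the session before the last reboot.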

noleti
  • If it is a kernel panic, then reinstalling the kernel would probably be a good solution. Remove all possible custom/proprietary hardware drivers before you do so. – Dee Sep 16 '15 at 07:31
  • If it's a hard kernel panic, probably the only way to debug it is with a serial console https://help.ubuntu.com/community/SerialConsoleHowto (if you have a serial port) or, hopefully, with a network console https://wiki.archlinux.org/index.php/Netconsole – Rmano Sep 16 '15 at 13:20
  • @Rmano Both options would be unavailable to me, since I only have one laptop and no other machine. – dadexix86 Sep 16 '15 at 13:40
  • @noleti Yesterday I modified /etc/rsyslog.d/50-default.conf following the instructions there (removed the dash). It happened again just now, but nothing got registered in syslog. Something new showed up after reboot though, i.e. that Ubuntu has encountered an internal error and such, two times. These are screenshots (no copy-paste from that window, too bad). – dadexix86 Sep 16 '15 at 16:55
  • @dadexix86 Thanks for the screenshots. It definitely looks like a kernel issue ("oops"). Did this start recently? Maybe you can go back to an older version of the kernel. Not sure why no logging shows up. – noleti Sep 17 '15 at 09:58
  • It started recently, yes, and, looking at the apt and dpkg logs, probably after the update to 3.19.0-28-generic. I would love to go back to an older kernel, maybe 3.19.0-27-generic, but I can't because they no longer seem to be available to install (the linux-image packages have vanished from the repos). – dadexix86 Sep 17 '15 at 13:24
  • I have some news. In Xorg.0.log.old, after the last freeze today, I found this. So I looked at the current dmesg and I got this, so might there be some (big) problem with udev? – dadexix86 Sep 18 '15 at 16:53
  • Other news! :D Look at this. – dadexix86 Sep 18 '15 at 16:57
  • Ok, after a few tries, this "http://paste.ubuntu.com/12450887/" is a Bluetooth audio output adapter, dunno why it shows up in input though... – dadexix86 Sep 18 '15 at 16:58
  • Other good (better) news: via last I could find traces of the crashes. As you can see, they are really frequent. – dadexix86 Sep 19 '15 at 07:49
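
For readers who want to repeat the check with last, here is a minimal Python sketch. It assumes the usual `last` output format, in which a session that was cut short by an unclean shutdown shows `crash` in place of the logout time. The function names `crashed_sessions` and `read_last` are made up for this example.

```python
import re
import subprocess

# In typical `last` output, a session that ended without a clean logout
# is shown as "... <login time> - crash (duration)". This pattern is an
# assumption about that format, not something taken from the thread.
CRASH_ENTRY = re.compile(r"\s-\s+crash\b")


def crashed_sessions(last_output):
    """Return the lines of `last` output that are marked as crashed."""
    return [line for line in last_output.splitlines() if CRASH_ENTRY.search(line)]


def read_last():
    """Run `last` and return its output, or '' if the command is unavailable."""
    try:
        result = subprocess.run(["last"], capture_output=True, text=True, check=False)
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in crashed_sessions(read_last()):
        print(line)
```

Counting these lines per day gives a rough idea of how often the machine went down without a clean shutdown, which includes forced power-offs as well as real crashes.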

Since the system became impossible to use, I reinstalled it.

After testing for a couple of days, it seems that the problem is solved.

Maybe I changed some configuration files in the previous setup that caused the error after some update.

In case the problem comes back I'll update this question.

dadexix86
  • 250 rep lost forever... ;-) – Fabby Sep 20 '15 at 21:35
  • I did not have any other real option ;) – dadexix86 Sep 20 '15 at 22:15
  • Sorry, but this is not how answers should be awarded. It is not about finding the easiest way out, but about identifying the root cause, and solving it. How is this answer going to benefit future users with the same problem? – noleti Sep 21 '15 at 22:01
  • I'm sorry too, but this was the only solution. It will help future users avoid spending more than a week looking for something that does not exist, risking breaking hardware with the frequent forced shutdowns. – dadexix86 Sep 22 '15 at 05:48

I had the same problem and it seems to be related to kernel 3.19.0-29. I am pretty sure about it because I began to experience it on Sept. 11 (the same date you reported), and that's the date the new kernel was installed on my PC. If I use kernel 3.19.0-28, it does not freeze anymore.
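
To check which kernel is running and which older ones are still available to boot, something like the following Python sketch could help. It assumes kernel images live in /boot and are named `vmlinuz-<version>`; the helper names `kernel_versions` and `installed_kernels` are invented for this example, and the sort is plain lexicographic, so it is not a true version comparison.

```python
import os
import platform

# Assumed naming scheme for kernel images in /boot.
PREFIX = "vmlinuz-"


def kernel_versions(filenames):
    """Extract version strings from names like 'vmlinuz-3.19.0-28-generic'."""
    return sorted(name[len(PREFIX):] for name in filenames if name.startswith(PREFIX))


def installed_kernels(boot_dir="/boot"):
    """List kernel versions found in boot_dir; empty list if unreadable."""
    try:
        return kernel_versions(os.listdir(boot_dir))
    except OSError:
        return []


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    for version in installed_kernels():
        marker = "  <- running" if version == running else ""
        print(version + marker)
```

If an older version shows up in the list, it can normally be selected from the GRUB menu at boot to compare behaviour between the two kernels.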

Angus73