Recently, almost every time I run Ubuntu, the operating system reports an internal error. I believe my current version of xorg is partially responsible, but I've also received many kerneloops errors, none of which I experienced while 4.4.0-31 was the kernel in use. Thus, I wish to downgrade my kernel from 4.4.0-83 to 4.4.0-31.
I've changed my grub file according to the instructions in Set "older" kernel as default grub entry, but upon booting, 4.4.0-83 is still the kernel in use. The instructions in Grub does not autoboot the default option after upgrade to 12.10 did not fix the issue (though I'm using 14.04). Now, when choosing "advanced options" in grub, the 4.4.0-31 kernel is the default selection. But if I boot using the advanced options, I am taken to a tty1
screen, which I can't exit. I tried the commands in
but received either no response or an error message. Below is my grub file (minus commented out lines):
GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 4.4.0-31-generic"
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX=""
GRUB_RECORDFAIL_TIMEOUT=0
Let me know if there are any commands I should run that might help identify the problem.
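As far as I understand, the GRUB_DEFAULT value has to match the submenu and entry titles in the generated config exactly, and changes to /etc/default/grub only take effect after running sudo update-grub. A minimal sketch for listing the titles so they can be compared against the line above, assuming the generated config is at /boot/grub/grub.cfg (the function name list_grub_entries is just something I made up):

```shell
#!/bin/sh
# Print the title of every menuentry/submenu line in a GRUB config file.
# Titles are the first single-quoted string on each such line.
list_grub_entries() {
    grep -E "^[[:space:]]*(menuentry|submenu) " "$1" | cut -d "'" -f2
}

# Example usage (path is an assumption; reading it may need root on some setups):
#   list_grub_entries /boot/grub/grub.cfg
# After editing /etc/default/grub, regenerate the config before rebooting:
#   sudo update-grub
```

The value of GRUB_DEFAULT should then be the submenu title and the entry title joined with a > character, copied exactly as printed.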
EDIT 1
Here is the output from entering ls -alt /var/crash
total 71060
-rw-r----- 1 root whoopsie 1512336 Jul 24 19:47 _usr_bin_Xorg.0.crash
drwxrwsrwt 2 root whoopsie 4096 Jul 24 19:47 .
-rw------- 1 whoopsie whoopsie 0 Jul 24 16:36 _usr_bin_Xorg.0.uploaded
-rw-r--r-- 1 root whoopsie 0 Jul 24 16:36 _usr_bin_Xorg.0.upload
-rw-rw---- 1 root whoopsie 0 Jul 24 01:55 .lock
-rw-r----- 1 kernoops whoopsie 8445 Jul 24 00:55 linux-image-4.4.0-83-generic.233306.crash
-rw------- 1 whoopsie whoopsie 0 Jul 23 23:37 _opt_google_chrome_chrome.1000.uploaded
-rw-rw-r-- 1 zachary whoopsie 0 Jul 23 23:37 _opt_google_chrome_chrome.1000.upload
-rw-r----- 1 zachary whoopsie 58735028 Jul 23 23:37 _opt_google_chrome_chrome.1000.crash
-rw------- 1 whoopsie whoopsie 0 Jul 23 21:59 linux-image-4.4.0-83-generic.285645.uploaded
-rw-r--r-- 1 root whoopsie 0 Jul 23 21:59 linux-image-4.4.0-83-generic.285645.upload
-rw-r----- 1 kernoops whoopsie 8789 Jul 23 21:55 linux-image-4.4.0-83-generic.285645.crash
-rw-r----- 1 kernoops whoopsie 7976 Jul 23 15:07 linux-image-4.4.0-83-generic.220593.crash
-rw-r----- 1 kernoops whoopsie 8746 Jul 23 15:06 linux-image-4.4.0-83-generic.255332.crash
-rw------- 1 whoopsie whoopsie 0 Jul 23 15:06 ttf-mscorefonts-installer.0.uploaded
-rw-r--r-- 1 root whoopsie 0 Jul 23 15:06 ttf-mscorefonts-installer.0.upload
-rw-r----- 1 root whoopsie 153662 Jul 23 15:06 ttf-mscorefonts-installer.0.crash
-rw-r--r-- 1 kernoops whoopsie 3484 Jul 23 03:10 linux-image-4.4.0-83-generic.245092.crash
-rw-r----- 1 zachary whoopsie 12051671 Jul 19 01:52 _usr_bin_compiz.1000.crash
-rw-r----- 1 zachary whoopsie 238085 Jul 18 10:44 _usr_lib_dconf_dconf-service.1000.crash
-rw-r--r-- 1 kernoops whoopsie 2823 Jul 16 14:03 linux-image-4.4.0-83-generic.215830.crash
drwxr-xr-x 14 root root 4096 May 21 23:22 ..
Here is the output of free -h
             total       used       free     shared    buffers     cached
Mem:           62G       1.8G        61G        16M        40M       626M
-/+ buffers/cache:       1.1G        61G
Swap:          29G         0B        29G
and of swapon -s
Filename        Type        Size        Used    Priority
/dev/sda6       partition   31250428    0       -1
Also, setting GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset" completely broke my installation, but I hadn't rebooted at the time of writing my original post. I fixed it by changing it back to GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" in recovery mode. I had made this change after reading a post that I can no longer find.
EDIT 2
An image of the MemTest run
EDIT 3
In response to:
(heynnema) Looks like you've got a hardware problem, as I suspected. It's picking up a high bit in the data bus. First thing to do is reseat your memory sticks in their current slots. Power off the computer, unplug it from the AC, hold down the power button for 5 seconds, release and reinsert each memory stick, then rerun memtest. What is your current RAM config? How many sticks of what sizes? Report back. ps: do you have intel-microcode installed?
I was only able to reseat two of my memory sticks because the CPU and water cooler cords completely covered the other two, and I wasn't comfortable removing those components. I reran MemTest, trying both individual cores and all cores in parallel, and it froze on test 2 like before.
My desktop memory is DDR4 Corsair Vengeance: four sticks of 16GB each, for a total of 64GB.
Here is the output of entering dmesg | grep microcode
[ 8.808196] microcode: CPU0 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808205] microcode: CPU1 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808217] microcode: CPU2 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808252] microcode: CPU3 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808289] microcode: CPU4 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808326] microcode: CPU5 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808338] microcode: CPU6 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808350] microcode: CPU7 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808363] microcode: CPU8 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808375] microcode: CPU9 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808388] microcode: CPU10 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808399] microcode: CPU11 sig=0x406f1, pf=0x4, revision=0xb00001c
[ 8.808445] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
I believe that means intel microcode is installed, according to Step F on Easy Linux Tips Project (I can't yet include more than two links).
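I'm not certain the dmesg lines alone prove the package is installed, since they only show the microcode revision the kernel sees on each CPU. A minimal sketch for condensing that output to the distinct revisions (the function name microcode_revisions is my own invention), with a package check noted in the comments:

```shell
#!/bin/sh
# Read dmesg-style lines on stdin and print each distinct microcode revision once.
microcode_revisions() {
    grep -o 'revision=0x[0-9a-fA-F]*' | sort -u
}

# Example usage:
#   dmesg | grep microcode | microcode_revisions
# To check whether the package itself is installed (Debian/Ubuntu):
#   dpkg -s intel-microcode
```

If every CPU reports the same revision, this prints a single line.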
EDIT 4
In response to heynnema:
ok, some progress. no way to reach the other two simms, eh? so try this next. remove the two simms that you can reach, and see if you can still boot, and/or run memtest. if it runs, it'll tell us that one of the two pulled simms may be defective
ps: another test that we can do is to run different single CPUs during memtest. So... if it fails with CPU #0, but runs with CPUs 1-11, we may have a defective CPU.
I first ran MemTest on each distinct, individual CPU. All resulted in a freeze on the second test. I then removed the two memory sticks that are easily accessible, booted up, and was able to run MemTest. I did not try to boot into any installation.
However, after putting the two memory sticks back in, I am unable to boot Windows or Ubuntu. Windows shows my desktop background but with a blue filter, and Ubuntu shows only the default Unity background. In Ubuntu, though, the computer was not completely frozen, as I could enter tty1 through keyboard commands.
I ran MemTest, hoping it would give some indication as to what went wrong, and it now fails on the first test. It says [CPU Error] Could not start CPU 0. I tried reseating the memory sticks again and it's still completely broken.
The Could not start CPU 0 error now occurs if I run MemTest with the two accessible memory sticks removed.
EDIT 5
I reseated the memory sticks again, and I can (sometimes) boot my Ubuntu installation, but Windows is even more broken. It simply leads to the blue screen with options to repair the computer. When I do successfully boot Ubuntu, the system will usually freeze upon any attempt to open an application.
EDIT 6
In response to heynnema:
You may have actually found the problem, but missed the clue. With the 2 accessible SIMMS removed, memtest ran, but right there you should have tried to boot Ubuntu and Windows to see how they ran. But instead, you put both SIMMS back in, memtest failed, and both OS's had trouble. Remove those same two SIMMS again, retest with memtest to confirm that it still works, then boot the OS's and see how they run! More steps coming after that test. Good luck! ps: with 2 SIMMS removed, confirm that the OS's think you have 32G RAM.
I removed the accessible SIMMS and booted the PC. At the login screen I switched to a terminal and used the free -m command to check available RAM. It was 32GB. The first attempt at logging in succeeded, but upon opening Google Chrome it froze. The second attempt led to a black screen that said the graphics card could not be found. The third attempt led to a freeze after selecting Ubuntu in grub and just prior to the login screen appearing.
I found that tty1 at the login screen was rather stable: I could run many basic commands without freezing, unlike when I actually log in. I'm not sure whether that's of any relevance.
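For anyone repeating the RAM check, here is a minimal sketch that reads the total straight from /proc/meminfo and prints it in GiB. The function name mem_total_gib is hypothetical, and the reported figure is normally a little below the installed amount because the kernel reserves some memory:

```shell
#!/bin/sh
# Print total RAM in GiB, read from a meminfo-format file.
# Defaults to /proc/meminfo; MemTotal is given there in kB (KiB).
mem_total_gib() {
    awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

# Example usage:
#   mem_total_gib
```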
EDIT 7
In response to heynnema:
You may very well have more than one problem. Power off the computer and reseat the video card. You may have to loosen a screw that holds its bracket down, and you may have to release a catch at the lower/front of the card, or order to be able to remove/reseat it. As far as the memory is concerned, what would it take for you to get to the other two? Do you need a technician to help you? Can you see the color of the four memory slots? Sometimes they're white, or black. And beside each socket, etched on the motherboard, is a designation like J0/J1/J2/J4... can use see those?
ps2: show me sudo dmidecode -t memory. ps3: have you overclocked the CPU or memory?
I will be having someone take a look at the PC tomorrow. Still, I checked the colors of the memory slots, and all four are grey. The four other possible memory slots are all black. For lack of time at the moment, I couldn't open up my PC to look at the socket designations.
I ran sudo dmidecode -t memory and it displayed information on all my memory devices. I couldn't copy the text, and it took several screens, so I didn't take a photo, but of note was that only two devices had identified sizes or manufacturers. Both were SIMMS, since they were Corsair brand and 16GB, but I had all four SIMMS in memory slots at the time. Otherwise, unknown and NA were all the details given for the other devices.
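Since the full output ran to several screens, a condensed view would have been easier to share. A minimal sketch, assuming the usual dmidecode layout of "Memory Device" blocks with Size, Locator, and Manufacturer fields (the function name summarize_dimms is made up for illustration):

```shell
#!/bin/sh
# Condense `dmidecode -t memory` output (read from stdin) to one line per
# "Memory Device" block, in the form: Locator | Size | Manufacturer
summarize_dimms() {
    awk -F': ' '
        /^Memory Device/              { in_dev = 1; size = loc = mfr = "?"; next }
        in_dev && /^[ \t]*Size:/         { size = $2 }
        in_dev && /^[ \t]*Locator:/      { loc = $2 }
        in_dev && /^[ \t]*Manufacturer:/ { mfr = $2 }
        in_dev && /^[ \t]*$/             { print loc " | " size " | " mfr; in_dev = 0 }
        END { if (in_dev) print loc " | " size " | " mfr }
    '
}

# Example usage:
#   sudo dmidecode -t memory | summarize_dimms
# Or save the full output to a file so it can be copied later:
#   sudo dmidecode -t memory > memory.txt
```

Fields that a block does not report are shown as a question mark.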
I have not overclocked my CPU or memory.
EDIT 8
I had someone take a look at my computer. Two issues were found with the hardware:
1) Only two memory slots worked. The memory sticks themselves all worked but the motherboard was faulty. Strangely, MemTest initially picked up on 64GB of RAM, but that's no longer the case regardless of how the SIMMS are configured on the motherboard.
2) My GPUs were slightly too long for the motherboard and couldn't lock into their slots completely. There's a "sweet spot" where they work, but at some point while reseating my memory sticks, I must have jostled them.
While putting the GPUs back in better alignment and only using the two working memory slots has stopped the error messages (so far), it's not a permanent solution. I still have no answer for why the issues started when I upgraded to 4.4.0-83.