I have been trying to solve this issue on my own, but no luck so far.
What I have tried in order to clear the fault is:
- blanked all HDDs back to factory state
- ran a full RAM error check (passed everything, zero errors)
- re-created all RAID & LVM partitions (two drives are RAID 1, the other four are RAID 5)
- then reinstalled Ubuntu (13.04 Server, i386)
... but the problem still occurs, even though the installation seemed to go just fine.
What I get on screen after a reboot is this:
- bottom half of screen is vertical stripes (alternating white & light blue)
- top part of the screen is black with text, as expected ...
- but the only text I get is lines like this:
[ 58.865374] nouveau E[ PFIFO][0000:05:02.0] DMA_PUSHER - Ch 0 Get 0x20000000 Put 0x00002eb8 State 0xc0020000 (err: MEM_FAULT) Push 0x00000000
- the HDD light stays on, and every now & then a new line like the one above appears, with different numbers at the start (before 'nouveau') and again after 'Put'; everything else stays the same in each successive line on screen
Can anyone explain to me what I am seeing, and the process of finding out what is causing it & how to fix it?
I haven't been able to find any discussions that go into this in depth, only ones that solve other people's specific situations. Their issues seemed to relate to NVIDIA graphics cards, but nothing explained how I can check what is causing this for me, or how to solve it.
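To make it easier to see which parts of these lines actually change, here is a minimal sketch that splits one of them into named fields and compares two lines. The field names (`timestamp`, `unit`, `pci`, and so on) are labels picked for this sketch, not official driver terminology, and the second line is a made-up example that only illustrates the "different numbers at the start and after Put" pattern described above.

```python
import re

# Pattern for nouveau DMA_PUSHER lines like the one quoted above.
# Group names are labels chosen for this sketch, not driver terms.
LINE_RE = re.compile(
    r"\[\s*(?P<timestamp>\d+\.\d+)\]\s+nouveau\s+E\[\s*(?P<unit>\w+)\]"
    r"\[(?P<pci>[0-9a-fA-F:.]+)\]\s+DMA_PUSHER\s+-\s+Ch\s+(?P<channel>\d+)\s+"
    r"Get\s+(?P<get>0x[0-9a-fA-F]+)\s+Put\s+(?P<put>0x[0-9a-fA-F]+)\s+"
    r"State\s+(?P<state>0x[0-9a-fA-F]+)\s+\(err:\s*(?P<err>\w+)\)\s+"
    r"Push\s+(?P<push>0x[0-9a-fA-F]+)"
)


def parse_line(line):
    """Return the fields of a DMA_PUSHER line as a dict, or None if no match."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def changed_fields(first, second):
    """Return the sorted names of fields whose values differ between two parsed lines."""
    return sorted(key for key in first if first[key] != second[key])


# The line copied from the screen.
SAMPLE = ("[ 58.865374] nouveau E[ PFIFO][0000:05:02.0] DMA_PUSHER - Ch 0 "
          "Get 0x20000000 Put 0x00002eb8 State 0xc0020000 (err: MEM_FAULT) "
          "Push 0x00000000")

# HYPOTHETICAL second line: timestamp and Put values are invented for illustration.
HYPOTHETICAL = ("[ 61.000000] nouveau E[ PFIFO][0000:05:02.0] DMA_PUSHER - Ch 0 "
                "Get 0x20000000 Put 0x00002f00 State 0xc0020000 (err: MEM_FAULT) "
                "Push 0x00000000")

first = parse_line(SAMPLE)
second = parse_line(HYPOTHETICAL)
print(first)
print("fields that differ:", changed_fields(first, second))
```

The `pci` field (`0000:05:02.0` here) is the bus address of the device that nouveau is reporting on, which can be matched against the output of `lspci` to confirm which card is involved.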
TEST 1:
- removed 3 of the 4 RAM chips
- gets to login screen ... but
- error message at login says:
computer_name login: mountall: Plymouth command failed
mountall: Disconnected from Plymouth
- will try again with each of the other 3 RAM chips on its own to see if the result differs
update on test 1
- I hit CTRL+ALT+F1 (as per suggestions in some similar forum questions)
- Login prompt resolved after hitting ENTER
- Successfully logged in (have not yet tried rebooting or adding the other RAM chips)
- the df command shows root & the other folders all where I would expect them to be
- so will now run sudo apt-get update and sudo apt-get upgrade, followed by a reboot, to see if that changes things
*further update*
- following apt-get update & upgrade, I have to press every key on the keyboard twice to get it to show at the command line prompt. I also ran:
install --help
(to get some info about a command option I saw used in another forum suggestion)
- the output was staggered across the screen from one line to the next, instead of appearing in a neatly formatted column
so clearly something (?) is up with the display driver (?)
will try reboot now
SEMI SOLUTION BELOW ... BUT PART OF PROBLEM STILL PERSISTS
As per the answer below, I did manage to get the install working with 1 RAM chip installed, and it also worked with any combination of 2 RAM chips. Unfortunately, it went back to the errors & blue/white stripes on screen when I tried all 4 chips. Is there some way to narrow down this next part of the problem via the command line? Should I be checking log files for something? If so, which files, and what should I look for?
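One possible starting point, as a rough sketch rather than a diagnosis: count how often each nouveau error name shows up in the kernel log for each RAM configuration, so the runs can be compared. The function name `count_nouveau_errors` is made up for this sketch. It assumes the kernel messages are saved in /var/log/kern.log (the usual place on Ubuntu) or can be captured with `dmesg`.

```python
import re
from collections import Counter

# Matches the "(err: NAME)" part of nouveau lines like the one quoted earlier.
ERR_RE = re.compile(r"nouveau.*\(err:\s*(\w+)\)")


def count_nouveau_errors(lines):
    """Count nouveau error names (e.g. MEM_FAULT) across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = ERR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Demo on the line copied from the screen plus one unrelated, made-up line.
SAMPLE_LINE = ("[ 58.865374] nouveau E[ PFIFO][0000:05:02.0] DMA_PUSHER - Ch 0 "
               "Get 0x20000000 Put 0x00002eb8 State 0xc0020000 (err: MEM_FAULT) "
               "Push 0x00000000")
demo_lines = [SAMPLE_LINE, "an unrelated log line, invented for illustration"]

for name, count in count_nouveau_errors(demo_lines).most_common():
    print(f"{name}: {count}")
```

To run it against a real log, pass an open file instead of `demo_lines`, for example `count_nouveau_errors(open("/var/log/kern.log", errors="replace"))`; reading that file may need sudo. Comparing the counts after booting with 2 chips and then with all 4 would at least show whether the errors appear only in the 4-chip case.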
any help greatly appreciated ... I am stumped on this one
YAY FINALLY SOLVED 100% - see update in answer below
- finally figured out the rest of it, and will add that to my answer below ... hope this helps someone else :-)