10

RAID array doesn't assemble after reboot.

I have one SSD from which the system is booted, and three HDD that are part of the array. The system is Ubuntu 16.04.

The steps that I've followed are based mostly on this guide:

https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-16-04#creating-a-raid-5-array

  1. Verifying that I'm good to go.

    lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT
    

The output shows the sda, sdb, and sdc devices in addition to the SSD partitions. I've verified that these are in fact the HDDs by looking at the output of:

hwinfo --disk

Everything matches.

  2. Creating the array.

    sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
    

I verify that it looks OK by entering: cat /proc/mdstat

The output looks something like this:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdc[3] sdb[1] sda[0]
      7813774336 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [=======>.............]  recovery = 37.1% (1449842680/3906887168) finish=273.8min speed=149549K/sec
      bitmap: 0/30 pages [0KB], 65536KB chunk

unused devices: <none>

I wait until the recovery finishes (see the note after the output below).

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid5 sdc[3] sdb[1] sda[0]
      209584128 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>
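
Instead of repeatedly checking /proc/mdstat, I believe the waiting can also be done with mdadm itself (I simply watched the file, so this is untested on my side):

    sudo mdadm --wait /dev/md0
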
  3. Creating and mounting the filesystem.

    sudo mkfs.ext4 -F /dev/md0
    
    sudo mkdir -p /mnt/md0
    
    sudo mount /dev/md0 /mnt/md0
    
    df -h -x devtmpfs -x tmpfs
    

I put some data in, and the output looks like this:

Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p2  406G  191G  196G  50% /
/dev/nvme0n1p1  511M  3.6M  508M   1% /boot/efi
/dev/md0        7.3T  904G  6.0T  13% /mnt/md0
  4. Saving the array layout.

    sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
    
    sudo update-initramfs -u
    
    echo '/dev/md0 /mnt/md0 ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
    
  5. Rebooting and verifying that everything works correctly.

After the reboot I run cat /proc/mdstat, and it doesn't show any active RAID devices.

ls /mnt/md0 

is empty.

The following command doesn't print anything and doesn't help either:

mdadm --assemble --scan -v
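
If it helps, I can also post the output of the superblock checks; these are the commands I would run:

    sudo mdadm --examine /dev/sda /dev/sdb /dev/sdc
    sudo mdadm --examine --scan
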

Only the following restores the array with data on it:

sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc

What should be done differently?

Additional, probably useful info:

sudo dpkg-reconfigure mdadm

The output shows:

update-initramfs: deferring update (trigger activated)
Generating grub configuration file ...
Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
Found linux image: /boot/vmlinuz-4.4.0-51-generic
Found initrd image: /boot/initrd.img-4.4.0-51-generic
Found linux image: /boot/vmlinuz-4.4.0-31-generic
Found initrd image: /boot/initrd.img-4.4.0-31-generic
Adding boot menu entry for EFI firmware configuration
done
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
Processing triggers for initramfs-tools (0.122ubuntu8.5) ...
update-initramfs: Generating /boot/initrd.img-4.4.0-51-generic

The intriguing part for me is "start and stop actions are no longer supported; falling back to defaults".

Also, the output of /usr/share/mdadm/mkconf doesn't list any arrays at the end:

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR craftinity@craftinity.com

# definitions of existing MD arrays

whereas the output of cat /etc/mdadm/mdadm.conf does:

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# DEVICE /dev/sda /dev/sdb /dev/sdc

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR craftinity@craftinity.com

# definitions of existing MD arrays

# This file was auto-generated on Sun, 04 Dec 2016 18:56:42 +0100
# by mkconf $Id$

ARRAY /dev/md0 metadata=1.2 spares=1 name=hinton:0 UUID=616991f1:dc03795b:8d09b1d4:8393060a

What's the solution? I've browsed through half the internet and no one seems to have the same problem.

I also posted the exact same question on Server Fault a couple of days ago (no answer). I apologize if I violated Stack Exchange's community rules by doing that.

5 Answers

9

I had the same problem. I am not sure why it happens, but the workaround I found was to create new partitions of type Linux RAID on the RAID members and then, when creating the array, to use the partitions rather than the whole devices.
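
A minimal sketch of that workaround, assuming the three data disks are /dev/sda, /dev/sdb and /dev/sdc as in the question (this wipes the disks, so double-check the device names first):

    # create a GPT label and one full-size partition of type Linux RAID on each disk
    for disk in /dev/sda /dev/sdb /dev/sdc; do
        sudo parted --script "$disk" mklabel gpt mkpart primary 0% 100% set 1 raid on
    done

    # build the array from the partitions instead of the whole devices
    sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1

    # save the layout and rebuild the initramfs as in the question
    sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
    sudo update-initramfs -u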

Thank you! I was facing the same problem, and this was the solution for me: creating the RAID on the partitions, not the whole disks: mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1 – ojovirtual Jun 15 '17 at 10:53
3

It seems this is quite a common problem when using whole disks in an array.

This post gives the latest summary of the problem and how I solved it: mdadm RAID underlaying an LVM gone after reboot

This post helped me to understand and solve my problem: What's the difference between creating mdadm array using partitions or the whole disks directly
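
As a rough illustration of what those posts describe, you can list the signatures that mdadm and other tools see on a member disk (read-only, nothing is changed) with, for example:

    sudo wipefs /dev/sda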

2

I was unable to reproduce your exact problem, but I think I found a possible reason for the behavior of your system:

When you create the 3 disk RAID5 array with the command:

sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc

While the RAID device is in recovery, the mdadm scan command shows:

sudo mdadm --detail --scan
ARRAY /dev/md0 metadata=1.2 spares=1 name=desktop:0 UUID=da60c69a:1cbe4f2e:e83d0971:0520ac89

After the recovery process is completed, the spares=1 parameter is gone:

sudo mdadm --detail --scan
ARRAY /dev/md0 metadata=1.2 name=desktop:0 UUID=da60c69a:1cbe4f2e:e83d0971:0520ac89

I assume that re-assembling the 3-disk RAID with the spares=1 parameter will fail on a fully recovered software RAID 5, since you no longer have any spare disks. If you try to create a RAID with the following command, it will fail:

sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 --spare-devices=1 /dev/sda /dev/sdb /dev/sdc

and the next command will create a different RAID layout:

sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=2 --spare-devices=1 /dev/sda /dev/sdb /dev/sdc

On a different note: If you don’t need to boot from the RAID5, there is no need to add the configuration to the /etc/mdadm/mdadm.conf file. Ubuntu will automatically start the RAID, since the configuration is available in the RAID superblock.
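
If the ARRAY line was captured while the array was still recovering, a possible cleanup (just a sketch, using the paths from the question) is to wait for the rebuild to finish, regenerate the line and refresh the initramfs:

    sudo mdadm --wait /dev/md0                        # block until the recovery is done
    sudo sed -i '/^ARRAY /d' /etc/mdadm/mdadm.conf    # drop the stale line containing spares=1
    sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
    sudo update-initramfs -u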

0

I had the exact same problem. Solved by:

$ sudo update-initramfs -u

followed by a reboot
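
Presumably this works because the initramfs carries its own copy of /etc/mdadm/mdadm.conf and has to be regenerated after the file changes. A minimal sequence (assuming the array is already assembled) would be:

    sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
    sudo update-initramfs -u
    sudo reboot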

0

The solution is the following: scan all drives that do not carry your OS, then choose which of those drives you want to use. Before creating any RAID you need to format the drives with a compatible filesystem such as ext4. See my example bash code below; it selects all drives of a rented server except the OS drive and builds a RAID 0 for maximum speed, to later form a fast database node.

{
  # configure the database node on the freshly built RAID
  cd "/home/$1/UltiHash/"
  cwd=$(pwd)
  sudo systemctl start postgresql.service
  sudo service postgresql restart
  # only run the setup if the data directory has not already been moved
  # (note: this check hardcodes the PostgreSQL 13 path)
  if ! [[ $(sudo -u postgres psql --command='SHOW data_directory') == *"/mnt/md0/postgresql/13/main"* ]]; then
    sudo chown -R ${mainuser} /home/${mainuser}/
    printf '%s\n%s\n' "${passwd}" "${passwd}" | sudo passwd postgres
    echo "postgres password set!"
    sudo usermod -aG sudo postgres
    echo "Configured postgres as a root user"
    sudo postgres -D /usr/local/pgsql/data > logfile 2>&1 &
    # extract the major version from "psql (PostgreSQL) 13.x ..."
    version=$(psql --version)
    cutter="$(cut -d ' ' -f 3- <<< "$version")" && tmpcut="${cutter%% *}"
    folver="${tmpcut%%.*}"
    cd /etc/postgresql/${folver}/main
    sudo -u postgres psql --command="ALTER USER postgres WITH PASSWORD '${passwd}';"
    sudo -u postgres psql --command="CREATE USER admin WITH PASSWORD '${passwd}';"
    sudo -u postgres psql --command="ALTER USER admin WITH SUPERUSER;"
    sudo -u postgres psql --command="ALTER USER admin CREATEDB;"
    sudo service postgresql restart
    sudo systemctl stop postgresql
    cd ~/

    # unmask cinnamon
    if [[ $(file /lib/systemd/system/x11-common.service) == "/lib/systemd/system/x11-common.service: symbolic link to /dev/null" ]]; then
      sudo rm /lib/systemd/system/x11-common.service
      sudo apt-get install --reinstall x11-common
      sudo systemctl daemon-reload
      #systemctl status x11-common
    fi

    # scan all drives except the OS drive
    drivelist=""
    count=0
    os_drive_path="$(df -hT | grep /$ | awk -F " " '{print $1}')"
    os_drive=${os_drive_path%p*}     # strip an nvme partition suffix, e.g. /dev/nvme0n1p2 -> /dev/nvme0n1
    no_dev=${os_drive#*/dev/}
    NL=$'\n'
    while read -r line; do
      if [[ ($line == *nvme* || $line == sd*) && ! ( $line == *"${no_dev}"* ) ]]; then
        drivelist="${drivelist}/dev/${line}${NL}"
        count=$((count+1))
      fi
    done <<< "$(lsblk -f)"

    # format the free drives directly with ext4
    drivelist_overrideable=""
    while read -r line; do
      drivename=$(echo "$line" | awk -F " " '{print $1}')
      drivelist_overrideable="${drivelist_overrideable}${drivename}${NL}"
      if [[ $drivename == *nvme* ]]; then
        sudo mkfs.ext4 -b 1024 -m 0 -F "$drivename"
      else
        sudo mkfs.ext4 -m 0 -F "$drivename"
      fi
    done <<< "${drivelist}"

    # create the RAID0 from the collected drives, then format and mount it
    printf "Y\n" | sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices="${count}" ${drivelist_overrideable}
    cat /proc/mdstat
    sudo mkfs.ext4 -b 1024 -m 0 -F /dev/md0
    sudo mkdir -p /mnt/md0
    sudo chown -R benjamin-elias /mnt/md0
    sudo chmod -R 777 /mnt/md0
    sudo mount /dev/md0 /mnt/md0
    df -h -x devtmpfs -x tmpfs
    # persist the array and the mount across reboots
    hardware_uid_string=$(sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf)
    sudo update-initramfs -u
    echo "$(echo "${hardware_uid_string}" | awk -F " " '{print $5}') /mnt/md0 ext4 defaults,nofail,discard 0 0" | sudo tee -a /etc/fstab
    sudo update-initramfs -u

    # safely move the database onto the array
    sudo rsync -av /var/lib/postgresql /mnt/md0
    sudo mv /var/lib/postgresql/${folver}/main /var/lib/postgresql/${folver}/main.bak
    sudo chown -R postgres /mnt/md0/

    # rewrite postgresql.conf with the new data_directory
    postgresconf="/etc/postgresql/${folver}/main/postgresql.conf"
    sudo cp -f ${postgresconf} ${postgresconf}.bak

    while read -r line; do
      if [[ $line == data_directory* ]]; then
        echo "data_directory = '/mnt/md0/postgresql/${folver}/main'" | sudo tee -a "${postgresconf}"
      else
        echo "${line}" | sudo tee -a "${postgresconf}"
      fi
    done <<< "$(cat ${postgresconf}.bak)"

    # listen on all addresses
    sudo sed -i "s/#listen_addresses = 'localhost'/listen_addresses = '*'/" ${postgresconf}

    # append the trusted hosts to pg_hba.conf and open the firewall for them
    postgresconf="/etc/postgresql/${folver}/main/pg_hba.conf"

    while read -r line; do
      echo "${line}" | sudo tee -a "${postgresconf}"
      ipadress=$(echo "${line}" | awk -F " " '{print $4}')
      echo $ipadress
      ipfilter=${ipadress%%/*}
      echo $ipfilter
      sudo iptables -A INPUT -p tcp -s 0/0 --sport 1024:65535 -d ${ipfilter} --dport 5432 -m state --state NEW,ESTABLISHED -j ACCEPT
      sudo iptables -A OUTPUT -p tcp -s ${ipfilter} --sport 5432 -d 0/0 --dport 1024:65535 -m state --state ESTABLISHED -j ACCEPT
    done <<< "$(cat /home/${mainuser}/UltiHash/trust_host.txt)"

    # restart to load the changes and verify the new data directory
    sudo systemctl start postgresql
    sudo systemctl status postgresql
    echo "$(sudo -u postgres psql --command="SHOW data_directory;")"
    sudo rm -Rf /var/lib/postgresql/${folver}/main.bak
    sudo ufw allow 5432/tcp
    sudo systemctl restart postgresql
    sudo systemctl status postgresql
  fi
} || {
  echo "Could not configure database!"
}
