Mirror your System Drive using Software RAID

Posted by Gregory Bartholomew on May 3, 2019

Nothing lasts forever. When it comes to the hardware in your PC, most of it can easily be replaced. There is, however, one special-case hardware component in your PC that is not as easy to replace as the rest — your hard disk drive.

Drive Mirroring

Your hard drive stores your personal data. Some of your data can be backed up automatically by scheduled backup jobs. But those jobs scan the files to be backed up for changes, and scanning an entire drive that way would be very resource intensive. Also, anything that you’ve changed since your last backup will be lost if your drive fails. Drive mirroring is a better way to maintain a secondary copy of your entire hard drive. With drive mirroring, a secondary copy of all the data on your hard drive is maintained in real time.

An added benefit of live mirroring your hard drive to a secondary hard drive is that it can increase your computer’s read performance, because reads can be served from either drive. Because disk I/O is one of your computer’s main performance bottlenecks, the improvement can be quite significant.

Note that a mirror is not a backup. It only protects your data from being lost if one of your physical drives fails. Types of failures that drive mirroring, by itself, does not protect against include:

File System Corruption
Bit Rot
Accidental File Deletion
Simultaneous Failure of all Mirrored Drives (highly unlikely)

Some of the above can be addressed by other file system features that can be used in conjunction with drive mirroring. File system features that address the above types of failures include:

Using a Journaling or Log-Structured file system
Using Checksums (ZFS, for example, does this automatically and transparently)
Using Snapshots
Using BCVs

This guide will demonstrate one method of mirroring your system drive using the Multiple Disk and Device Administration (mdadm) toolset. Just for fun, this guide will show how to do the conversion without using any extra boot media (CDs, USB drives, etc). For more about the concepts and terminology related to the multiple device driver, you can skim the md man page:

$ man md

The Procedure

Use sgdisk to (re)partition the extra drive that you have added to your computer:
```
$ sudo -i
# MY_DISK_1=/dev/sdb
# sgdisk --zap-all $MY_DISK_1
# test -d /sys/firmware/efi/efivars || sgdisk -n 0:0:+1MiB -t 0:ef02 -c 0:grub_1 $MY_DISK_1
# sgdisk -n 0:0:+1GiB -t 0:ea00 -c 0:boot_1 $MY_DISK_1
# sgdisk -n 0:0:+4GiB -t 0:fd00 -c 0:swap_1 $MY_DISK_1
# sgdisk -n 0:0:0 -t 0:fd00 -c 0:root_1 $MY_DISK_1
```
– If the drive that you will be using for the second half of the mirror in step 12 is smaller than this drive, then you will need to adjust down the size of the last partition so that the total size of all the partitions is not greater than the size of your second drive.
– A few of the commands in this guide are prefixed with a test for the existence of an efivars directory. This is necessary because those commands are slightly different depending on whether your computer is BIOS-based or UEFI-based.
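
If you want to confirm which of the two cases applies to your machine before you start, the same efivars test can be wrapped in a small helper. This is only a sketch: the function name firmware_type and its optional path argument are hypothetical and are not part of the procedure itself.

```shell
#!/bin/sh
# Report whether the running system was booted in UEFI or legacy BIOS mode.
# firmware_type is a hypothetical helper; the optional argument exists only
# so the check can be pointed at a different directory for testing.
firmware_type() {
    efivars_dir="${1:-/sys/firmware/efi/efivars}"
    if test -d "$efivars_dir"; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

firmware_type
```

On a BIOS-based computer the grub_1 partition is created by the sgdisk command above; on a UEFI-based computer that command is skipped.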

Use mdadm to create RAID devices that use the new partitions to store their data:

# mdadm --create /dev/md/boot --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/disk/by-partlabel/boot_1 missing
# mdadm --create /dev/md/swap --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/disk/by-partlabel/swap_1 missing
# mdadm --create /dev/md/root --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/disk/by-partlabel/root_1 missing

# cat << END > /etc/mdadm.conf
MAILADDR root
AUTO +all
DEVICE partitions
END

# mdadm --detail --scan >> /etc/mdadm.conf

– The missing parameter tells mdadm to create an array with a missing member. You will add the other half of the mirror in step 14.
– You should configure sendmail so you will be notified if a drive fails.
– You can configure Evolution to monitor a local mail spool.
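
Because each array was created with a missing member, it should be reported as degraded until the second drive is added. The following is a minimal sketch for listing degraded arrays from /proc/mdstat. The helper name degraded_arrays is hypothetical, and the parsing assumes the usual status field format such as [U_].

```shell
#!/bin/sh
# List md arrays that are running with a missing member.
# degraded_arrays is a hypothetical helper. It assumes the usual
# /proc/mdstat layout, where an array line starts with "mdN" and the
# following line ends with a status field such as [UU] or [U_].
degraded_arrays() {
    awk '
        /^md/             { name = $1 }   # remember the current array name
        /\[[U_]*_[U_]*\]/ { print name }  # an underscore marks a missing member
    ' "${1:-/proc/mdstat}"
}

if [ -r /proc/mdstat ]; then
    degraded_arrays
fi
```

At this point in the procedure all three new arrays should be listed. Once both halves of each mirror are in place, the function should print nothing.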

Use dracut to update the initramfs:
```
# dracut -f --add mdraid --add-drivers xfs
```
– Dracut will include the /etc/mdadm.conf file you created in the previous section in your initramfs unless you build your initramfs with the hostonly option set to no. If you build your initramfs with the hostonly option set to no, then you should either manually include the /etc/mdadm.conf file, manually specify the UUIDs of the RAID arrays to assemble at boot time with the rd.md.uuid kernel parameter, or specify the rd.auto kernel parameter to have all RAID arrays automatically assembled and started at boot time. This guide will demonstrate the rd.auto option since it is the most generic.
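
If you prefer the rd.md.uuid approach over rd.auto, the parameters can be derived from the output of mdadm --detail --scan. The following is a sketch only: md_uuid_params is a hypothetical helper name, the UUID in the example is made up, and the code assumes each ARRAY line carries a single UUID= field.

```shell
#!/bin/sh
# Turn "ARRAY ... UUID=<uuid>" lines (as printed by mdadm --detail --scan)
# into a space-separated list of rd.md.uuid=<uuid> kernel parameters.
# md_uuid_params is a hypothetical helper name.
md_uuid_params() {
    sed -n 's/.*UUID=\([0-9a-fA-F:]*\).*/rd.md.uuid=\1/p' | paste -s -d ' ' -
}

# Example with a made-up UUID; on a real system you would pipe in
# the output of: mdadm --detail --scan
echo 'ARRAY /dev/md/root metadata=1.0 name=any:root UUID=01234567:89abcdef:01234567:89abcdef' | md_uuid_params
```
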
Format the RAID devices:
```
# mkfs -t vfat /dev/md/boot
# mkswap /dev/md/swap
# mkfs -t xfs /dev/md/root
```
– The new Boot Loader Specification states “if the OS is installed on a disk with GPT disk label, and no ESP partition exists yet, a new suitably sized (let’s say 500MB) ESP should be created and should be used as $BOOT” and “$BOOT must be a VFAT (16 or 32) file system”.
Reboot and set the rd.auto, rd.break and single kernel parameters:
```
# reboot
```
– You may need to set your root password before rebooting so that you can get into single-user mode in step 7.
– See “Making Temporary Changes to a GRUB 2 Menu” for directions on how to set kernel parameters on computers that use the GRUB 2 boot loader.
Use the dracut shell to copy the root file system:
```
# mkdir /newroot
# mount /dev/md/root /newroot
# shopt -s dotglob
# cp -ax /sysroot/* /newroot
# rm -rf /newroot/boot/*
# umount /newroot
# exit
```
– The dotglob flag is set for this bash session so that the wildcard character will match hidden files.
– Files are removed from the boot directory because they will be copied to a separate partition in the next step.
– This copy operation is being done from the dracut shell to ensure that no processes are accessing the files while they are being copied.
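
One way to sanity-check the copy is to compare the number of entries in the source and destination trees right after the cp command and before the rm command. This is a rough sketch that assumes find and wc are available, which may not be the case in a minimal dracut shell. The helper name count_entries is hypothetical.

```shell
#!/bin/sh
# Count every file and directory below a path, hidden ones included,
# without crossing into other mounted file systems (like cp -x).
# count_entries is a hypothetical helper name.
count_entries() {
    find "$1" -xdev | wc -l | tr -d ' '
}

# Compare the source and destination trees if both exist.
if [ -d /sysroot ] && [ -d /newroot ]; then
    echo "source:      $(count_entries /sysroot)"
    echo "destination: $(count_entries /newroot)"
fi
```

Matching counts do not prove that the contents are identical, but a large difference suggests that hidden files or whole directories were missed.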
Use single-user mode to copy the non-root file systems:
```
# mkdir /newroot
# mount /dev/md/root /newroot
# mount /dev/md/boot /newroot/boot
# shopt -s dotglob
# cp -Lr /boot/* /newroot/boot
# test -d /newroot/boot/efi/EFI && mv /newroot/boot/efi/EFI/* /newroot/boot/efi && rmdir /newroot/boot/efi/EFI
# test -d /sys/firmware/efi/efivars && ln -sfr /newroot/boot/efi/fedora/grub.cfg /newroot/etc/grub2-efi.cfg
# cp -ax /home/* /newroot/home
# exit
```
– It is OK to run these commands in the dracut shell shown in the previous section instead of doing it from single-user mode. I’ve demonstrated using single-user mode to avoid having to explain how to mount the non-root partitions from the dracut shell.
– The parameters being passed to the cp command for the boot directory are a little different because the VFAT file system doesn’t support symbolic links or Unix-style file permissions.
– In rare cases, the rd.auto parameter is known to cause LVM to fail to assemble due to a race condition. If you see errors about your swap or home partition failing to mount when entering single-user mode, simply try again by repeating step 5 but omitting the rd.break parameter so that you will go directly to single-user mode.

Update fstab on the new drive:

# cat << END > /newroot/etc/fstab
/dev/md/root / xfs defaults 0 0
/dev/md/boot /boot vfat defaults 0 0
/dev/md/swap swap swap defaults 0 0
END

Configure the boot loader on the new drive:

# NEW_GRUB_CMDLINE_LINUX=$(cat /etc/default/grub | sed -n 's/^GRUB_CMDLINE_LINUX="\(.*\)"/\1/ p')
# NEW_GRUB_CMDLINE_LINUX=${NEW_GRUB_CMDLINE_LINUX//rd.lvm.*([^ ])}
# NEW_GRUB_CMDLINE_LINUX=${NEW_GRUB_CMDLINE_LINUX//resume=*([^ ])}
# NEW_GRUB_CMDLINE_LINUX+=" selinux=0 rd.auto"
# sed -i "/^GRUB_CMDLINE_LINUX=/s/=.*/=\"$NEW_GRUB_CMDLINE_LINUX\"/" /newroot/etc/default/grub

– You can re-enable selinux after this procedure is complete. But you will have to relabel your file system first.
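
The two parameter expansions above use bash’s extended pattern syntax, *([^ ]), which depends on the extglob shell option being enabled. If that is not the case in your shell, the same filtering can be sketched with standard tools. The helper name strip_lvm_params is hypothetical, and the sample command line is made up for illustration.

```shell
#!/bin/sh
# Remove rd.lvm.* and resume=* parameters from a kernel command line.
# strip_lvm_params is a hypothetical helper name.
strip_lvm_params() {
    printf '%s\n' "$1" |
        tr ' ' '\n' |
        grep -v -e '^rd\.lvm\.' -e '^resume=' -e '^$' |
        paste -s -d ' ' -
}

# Made-up sample command line for illustration.
strip_lvm_params 'resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet'
```

The pipeline splits the command line into one parameter per line, drops the unwanted ones, and joins the rest back together with single spaces.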

Install the boot loader on the new drive:

# sed -i '/^GRUB_DISABLE_OS_PROBER=.*/d' /newroot/etc/default/grub
# echo "GRUB_DISABLE_OS_PROBER=true" >> /newroot/etc/default/grub
# MY_DISK_1=$(mdadm --detail /dev/md/boot | grep active | grep -m 1 -o "/dev/sd.")
# for i in dev dev/pts proc sys run; do mount -o bind /$i /newroot/$i; done
# chroot /newroot env MY_DISK_1=$MY_DISK_1 bash --login
# test -d /sys/firmware/efi/efivars || MY_GRUB_DIR=/boot/grub2
# test -d /sys/firmware/efi/efivars && MY_GRUB_DIR=$(find /boot/efi -type d -name 'fedora' -print -quit)
# test -e /usr/sbin/grub2-switch-to-blscfg && grub2-switch-to-blscfg --grub-directory=$MY_GRUB_DIR
# grub2-mkconfig -o $MY_GRUB_DIR/grub.cfg
# test -d /sys/firmware/efi/efivars && test /boot/grub2/grubenv -nt $MY_GRUB_DIR/grubenv && cp /boot/grub2/grubenv $MY_GRUB_DIR/grubenv
# test -d /sys/firmware/efi/efivars || grub2-install "$MY_DISK_1"
# logout
# for i in run sys proc dev/pts dev; do umount /newroot/$i; done
# test -d /sys/firmware/efi/efivars && efibootmgr -c -d "$MY_DISK_1" -p 1 -l "$(find /newroot/boot -name shimx64.efi -printf '/%P\n' -quit | sed 's!/!\\!g')" -L "Fedora RAID Disk 1"

– The grub2-switch-to-blscfg command is optional. It is only supported on Fedora 29+.
– The cp command above should not be necessary, but there appears to be a bug in the current version of grub which causes it to write to $BOOT/grub2/grubenv instead of $BOOT/efi/fedora/grubenv on UEFI systems.
– You can use the following command to verify the contents of the grub.cfg file right after running the grub2-mkconfig command above:

# sed -n '/BEGIN .*10_linux/,/END .*10_linux/ p' $MY_GRUB_DIR/grub.cfg

– You should see references to mdraid and mduuid in the output from the above command if the RAID array was detected properly.
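
The check for mdraid and mduuid references can also be scripted. The following is a minimal sketch; the helper name grub_cfg_mentions_raid is hypothetical, and the code only looks for the two strings anywhere in the file rather than inside the 10_linux section specifically.

```shell
#!/bin/sh
# Succeed if the given grub.cfg refers to mdraid or mduuid.
# grub_cfg_mentions_raid is a hypothetical helper name.
grub_cfg_mentions_raid() {
    grep -q -e 'mdraid' -e 'mduuid' "$1"
}

# MY_GRUB_DIR is the variable set earlier in this step.
if [ -n "${MY_GRUB_DIR:-}" ] && [ -r "$MY_GRUB_DIR/grub.cfg" ]; then
    if grub_cfg_mentions_raid "$MY_GRUB_DIR/grub.cfg"; then
        echo "RAID references found"
    else
        echo "no RAID references found"
    fi
fi
```
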

Boot off of the new drive:
```
# reboot
```
– How to select the new drive is system-dependent. It usually requires pressing one of the F12, F10, Esc or Del keys when you hear the System OK BIOS beep code.
– On UEFI systems the boot loader on the new drive should be labeled “Fedora RAID Disk 1”.
Remove all the volume groups and partitions from your old drive:
```
# MY_DISK_2=/dev/sda
# MY_VOLUMES=$(pvs | grep $MY_DISK_2 | awk '{print $2}' | tr "\n" " ")
# test -n "$MY_VOLUMES" && vgremove $MY_VOLUMES
# sgdisk --zap-all $MY_DISK_2
```
– WARNING: You want to make certain that everything is working properly on your new drive before you do this. A good way to verify that your old drive is no longer being used is to try booting your computer once without the old drive connected.
– You can add another new drive to your computer instead of erasing your old one if you prefer.
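
Before wiping the old drive, it may also help to confirm that the running system really has its root file system on a RAID device. The following is a sketch under the assumption that findmnt is installed; the helper name is_md_device is hypothetical, and the check only looks at the device path prefix.

```shell
#!/bin/sh
# Succeed if a device path looks like an md RAID device.
# is_md_device is a hypothetical helper name.
is_md_device() {
    case "$1" in
        /dev/md*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report where the root file system is currently mounted from.
if command -v findmnt > /dev/null 2>&1; then
    root_source=$(findmnt -n -o SOURCE / 2> /dev/null) || root_source="unknown"
    if is_md_device "$root_source"; then
        echo "root is on $root_source (RAID)"
    else
        echo "root is on $root_source (not RAID, do not wipe the old drive yet)"
    fi
fi
```

This does not replace the test boot without the old drive connected; it is only a quick additional check.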

Create new partitions on your old drive to match the ones on your new drive:

# test -d /sys/firmware/efi/efivars || sgdisk -n 0:0:+1MiB -t 0:ef02 -c 0:grub_2 $MY_DISK_2
# sgdisk -n 0:0:+1GiB -t 0:ea00 -c 0:boot_2 $MY_DISK_2
# sgdisk -n 0:0:+4GiB -t 0:fd00 -c 0:swap_2 $MY_DISK_2
# sgdisk -n 0:0:0 -t 0:fd00 -c 0:root_2 $MY_DISK_2

– It is important that the partitions match in size and type. I prefer to use the parted command to display the partition table because it supports setting the display unit:

# parted /dev/sda unit MiB print
# parted /dev/sdb unit MiB print

Use mdadm to add the new partitions to the RAID devices:

# mdadm --manage /dev/md/boot --add /dev/disk/by-partlabel/boot_2
# mdadm --manage /dev/md/swap --add /dev/disk/by-partlabel/swap_2
# mdadm --manage /dev/md/root --add /dev/disk/by-partlabel/root_2
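
After the partitions are added, the mirrors start synchronizing in the background. The following is a minimal sketch for checking whether that is still in progress. The helper name resync_in_progress is hypothetical, and the parsing assumes the usual /proc/mdstat progress lines.

```shell
#!/bin/sh
# Succeed while any md array is still resyncing or recovering.
# resync_in_progress is a hypothetical helper; it assumes the usual
# /proc/mdstat progress lines such as "recovery = 12.3% ...".
resync_in_progress() {
    grep -E -q '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

if [ -r /proc/mdstat ]; then
    if resync_in_progress; then
        echo "mirrors are still synchronizing"
    else
        echo "no synchronization in progress"
    fi
fi
```

mdadm also has a --wait option that blocks until resync or recovery has finished on the given array, for example mdadm --wait /dev/md/root.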

Install the boot loader on your old drive:

# test -d /sys/firmware/efi/efivars || grub2-install "$MY_DISK_2"
# test -d /sys/firmware/efi/efivars && efibootmgr -c -d "$MY_DISK_2" -p 1 -l "$(find /boot -name shimx64.efi -printf "/%P\n" -quit | sed 's!/!\\!g')" -L "Fedora RAID Disk 2"

Use mdadm to test that email notifications are working:
```
# mdadm --monitor --scan --oneshot --test
```

As soon as your drives have finished synchronizing, you should be able to select either drive when restarting your computer and you will receive the same live-mirrored operating system. If either drive fails, mdmonitor will send an email notification. Recovering from a drive failure is now simply a matter of swapping out the bad drive with a new one and running a few sgdisk and mdadm commands to re-create the mirrors (steps 13 through 15). You will no longer have to worry about losing any data if a drive fails!

Video Demonstrations

Converting a UEFI PC to RAID1

Converting a BIOS PC to RAID1

TIP: Set the quality to 720p on the above videos for best viewing.


Gregory Bartholomew

Systems Administrator for the department of Computer Science at Southern Illinois University Edwardsville

13 Comments

Cody

‘Simultaneous Failure of all Mirrored Drives (highly unlikely)’

Nevertheless it has happened to a good friend of mine as well as me. Highly unlikely or not it’s still a risk. And unfortunately people see words like ‘highly unlikely’ and they don’t consider it at all. That’s unwise but that’s how most people think and act. This goes especially for those who are not aware. And if someone is looking at the list without knowing about the things in the list many will indeed not be aware. Depends on their experience. This cannot be ignored.

Critically (and I am not saying you didn’t cover this – I am utterly knackered and only looked at a line here and there) RAIDs are indeed very valuable but only in so far as allowing to not have to restore from backup as in the system can stay online for a time. In other words redundancy is not a form of backup. And don’t forget to TEST YOUR BACKUPS – and have a disaster recovery plan.

May 3, 2019
- Gregory Bartholomew
  
  Sure, it is definitely important to have backups in addition to the mirroring.
  
  That said, I have never, in my experience (over twenty years as a sysadmin at a major university), seen 2 drives in a computer fail at exactly the same time. What I have seen is one drive fail, the notices either not being sent or being ignored if they are sent, and then the other drive fail sometime later.
  
  Anyone using RAID1 on their computer should check on it from time to time to be sure that everything is OK. Also, make sure the notifications are working properly (scheduling a recurring test with cron is probably advisable here).
  
  Just for fun, here is another way to get your computer to make a little extra noise in the event that an error is detected in the RAID array:
  
  # echo 'PROGRAM /bin/espeak' >> /etc/mdadm.conf
  # systemctl restart mdmonitor.service
  # mdadm --monitor --scan --oneshot --test
  
  If you have your speakers turned up when you run that last command, you should hear something akin to “WARNING, WARNING sysadmin, a drive has failed!”
  
  May 3, 2019
  - Joao Rodrigues
    
    I have not seen two drives failing exactly at the same time, but I have seen two drives failing in the interval of a week.
    They were both a part of the same RAID and were the same brand/model and were probably made at the same time.
    
    May 3, 2019
  - Chris
    
    How would you monitor your raid setup after you created it to check that everything is okay?
    
    May 10, 2019
    - Gregory Bartholomew
      
      Personally, I have sendmail configured to forward the notices from the mdmonitor service to my main email account. I check that account every day. If a drive fails, I will know within 24 hours. I will likely get the notice within a few minutes since I usually have my smart phone on me and it chimes when a message hits my main inbox.
      
      May 13, 2019
Gregory Bartholomew

So, Red Hat has thrown me a bit of a curve ball in how they handle kernel options with the just-released Fedora 30. If you do a fresh install of Fedora, then the options are stored in /boot/grub2/grubenv instead of /boot/efi/fedora/grub.cfg. This guide will convert your system to RAID1 just fine, but if you attempt to change your kernel options in /etc/default/grub, the changes will not get written to the right place. Below is a quick hack that should work around this problem. Only use this hack if you have converted your computer to RAID1 using this guide (and you are running a UEFI system; BIOS systems are unaffected by this bug):

Create a /etc/grub.d/99_hack file and put the following in it:

#!/usr/bin/sh

grub2-editenv /boot/efi/fedora/grubenv set kernelopts="root=/dev/md/root ro ${GRUB_CMDLINE_LINUX}"

Then make it executable:

# chmod +x /etc/grub.d/99_hack

Now, when you run grub2-mkconfig -o /boot/efi/fedora/grub.cfg, your kernel options should get written to the correct place.

You may need to remove this hack if the bug ever gets patched. Here is a link to the bug report:

https://bugzilla.redhat.com/show_bug.cgi?id=1706117

May 3, 2019
Damien Dye

Personally, I wouldn’t trust RAID with my mission-critical data.
It would be ZFS all the way to protect that.
Also, software RAID is no good if a failed disk takes out the controller; it is only good if the mirror disk is on a completely separate controller.

May 5, 2019
- Gregory Bartholomew
  
  ZFS is cool!
  
  I had one problem with it on rotational drives though. It turns out that ZFS gets progressively slower over time on rotational drives due to file fragmentation. The only way to defragment the files is to temporarily break the mirror and re-copy the file system to the disabled mirror drive, reboot your computer and select the drive with the freshly-copied file system, then re-enable mirroring to overwrite the FS on the original drive.
  
  It isn’t a problem on SSDs because file fragmentation doesn’t degrade their performance.
  
  Also, with UEFI, you will still have to have your boot partition formatted with VFAT and you will still have to mirror it with the MD driver as shown in this article. In fact, the procedure is pretty much the same as what is shown in this article if you want to convert to ZFS. Just substitute XFS with ZFS and use ZFS’s internal mirroring driver instead of the MD driver for the root partition.
  
  May 5, 2019
Stuart D Gathman

Instead of making a mirrored partition for each filesystem, it is much easier to make one big mirrored partition, and add it as a PV to LVM. This also allows easily using space when adding mismatched drives. Filesystems are allocated from LVM.

I’m pretty sure only the EFI boot partition needs to be VFAT. Once grub2 is loaded, the /boot filesystem can even be on LVM.

May 7, 2019
- Gregory Bartholomew
  
  Using LVM is cool. Just beware that using LVM in combination with MDRAID and the rd.auto flag is buggy. Sometimes the LVM array will fail to assemble. I ran into that bug while making the UEFI video demo for this article. I left the failure in the video so that people could see what it looks like and how to deal with it.
  
  May 8, 2019
Göran Uddeborg

Wonderful article! Thanks a lot!

What is the point of RAID:ing a swap partition? When I’ve done some similar setups, I’ve used RAID for file system partitions, but just added a standalone swap partition from each disk. It’s not exactly information I would worry about losing on those disks, and with the same priority, I believe the kernel will distribute the load equally over both. What is it I’m missing?

May 15, 2019
- Gregory Bartholomew
  
  Hi Göran:
  
  The reason for RAID:ing the swap partition in a RAID1 setup is just to ensure that the PC continues running in the event of a disk failure. Without doing so, a large block of memory addresses may suddenly disappear from the system when a disk fails and there is no guarantee that the PC will continue running if that happens.
  
  If you are running RAID0, however, then yes, you should put the swap file system on the partitions directly and let the kernel’s internal algorithms handle distributing the reads and writes among all the swap devices.
  
  May 16, 2019
  - Göran Uddeborg
    
    Ah, I see the point now. Thanks!
    
    May 16, 2019
