Mdadm takes its name from the Linux md (multiple device) driver whose arrays it administers. It is a command-line tool for managing software RAID arrays on your Linux PC. This article outlines the basics you need to get started with it.
The following five commands allow you to make use of mdadm’s most basic features:
- Create a RAID array:
# mdadm --create /dev/md/test --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
- Assemble (and start) a RAID array:
# mdadm --assemble /dev/md/test /dev/sda1 /dev/sdb1
- Stop a RAID array:
# mdadm --stop /dev/md/test
- Delete a RAID array:
# mdadm --zero-superblock /dev/sda1 /dev/sdb1
- Check the status of all assembled RAID arrays:
# cat /proc/mdstat
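The mdstat output is also easy to script against. Here is a minimal sketch that pulls out each array's name and its per-member health flags; the sample text below is hand-written for illustration, not captured from a real system:

```shell
# Parse mdstat-style text (sample is illustrative) and print each array's
# name with its health flags ([UU] = both members up, [U_] = one missing).
mdstat_sample='md127 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.0 [2/2] [UU]'

# Lines starting with "md" name an array; the following "blocks" line
# ends with the health flags.
status=$(printf '%s\n' "$mdstat_sample" |
    awk '/^md/ {name=$1} /blocks/ {print name, $NF}')
echo "$status"   # → md127 [UU]
```

On a live system you would pipe `cat /proc/mdstat` into the same awk expression instead of the sample variable.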
Notes on features
The create command shown above includes the following four parameters in addition to the create parameter itself and the device names:
- homehost: By default, mdadm stores your computer’s name as an attribute of the RAID array. If your computer’s name does not match the stored name, the array will not assemble automatically. This feature is useful in server clusters that share hard drives, because file system corruption usually occurs if multiple servers attempt to access the same drive at the same time. The name any is reserved and disables the homehost restriction.
- metadata: mdadm reserves a small portion of each RAID device to store information about the RAID array itself. The metadata parameter specifies the format and location of that information. The value 1.0 selects version-1 formatting with the metadata stored at the end of the device.
- level: The level parameter specifies how the data should be distributed among the underlying devices. Level 1 means each device contains a complete copy of all the data. This level is also known as disk mirroring.
- raid-devices: The raid-devices parameter specifies the number of devices that will be used to create the RAID array.
By using level=1 (mirroring) in combination with metadata=1.0 (store the metadata at the end of the device), you create a RAID1 array whose underlying devices appear normal if accessed without the aid of the mdadm driver. This is useful in the case of disaster recovery, because you can access the device even if the new system doesn’t support mdadm arrays. It’s also useful in case a program needs read-only access to the underlying device before mdadm is available. For example, the UEFI firmware in a computer may need to read the bootloader from the ESP before mdadm is started.
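That property can be shown as a dry run (the commands below are printed rather than executed; the device name and mount point are placeholders): because the metadata sits at the end, the member device begins with the file system itself and can be read without mdadm.

```shell
# Dry-run sketch: with metadata=1.0 a RAID1 member starts with the file
# system, so rescue media can mount it directly. Mount read-only to avoid
# the out-of-sync hazard described under "Other important notes".
run() { echo "+ $*"; }   # print each step instead of executing it
run mount -o ro /dev/sda1 /mnt/rescue
# → + mount -o ro /dev/sda1 /mnt/rescue
```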
The assemble command above fails if a member device is missing or corrupt. To force the RAID array to assemble and start when one of its members is missing, use the following command:
# mdadm --assemble --run /dev/md/test /dev/sda1
Other important notes
Avoid writing directly to any devices that underlie an mdadm RAID1 array. Doing so puts the devices out of sync, and mdadm has no way of knowing it. Accessing a RAID1 array through a device that’s been modified out-of-band can cause file system corruption. If you modify a RAID1 member out-of-band and need to force the array to re-synchronize, delete the mdadm metadata from the device to be overwritten and then re-add it to the array as demonstrated below:
# mdadm --zero-superblock /dev/sdb1
# mdadm --assemble --run /dev/md/test /dev/sda1
# mdadm /dev/md/test --add /dev/sdb1
These commands completely overwrite the contents of sdb1 with the contents of sda1.
To have RAID arrays activated automatically when your computer starts, list them in an /etc/mdadm.conf configuration file.
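A sketch of what such a file might contain is shown below; the UUID is a placeholder, and on a running system you would generate the real ARRAY line with mdadm itself:

```
# /etc/mdadm.conf -- illustrative sketch. Generate the real entry with:
#   mdadm --detail --scan >> /etc/mdadm.conf
ARRAY /dev/md/test metadata=1.0 UUID=00000000:00000000:00000000:00000000
```

Note that some distributions look for this file at /etc/mdadm/mdadm.conf instead.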
For the most up-to-date and detailed information, check the man pages:
$ man mdadm
$ man mdadm.conf
The next article in this series will be a step-by-step guide on how to convert an existing single-disk Linux installation to a mirrored-disk installation that will continue running even if one of its hard drives suddenly stops working!
Great idea to shed some light on this quite complex matter, thanks!
I have been using MD RAIDs on 2 desktop computers for some years and always wondered what “homehost” is good for. It also took me some time to figure out that a kernel parameter “rd.md.uuid=…” is needed if the root fs resides on a MD RAID and a generic (not host-only) initramfs is used; maybe you want to mention this in a later article.
Interesting. I didn’t know that an initramfs compiled with “hostonly=no” will exclude the /etc/mdadm.conf file. I just checked with the current version of dracut (049-26.git20181204), and indeed /etc/mdadm.conf does get excluded. So, your options with a “hostonly=no” initramfs are to specify the UUID on the kernel command line, manually include the /etc/mdadm.conf file when you build the initramfs, or specify the rd.auto kernel command line option to have all RAID arrays automatically assembled. The latter probably makes the most sense if you are really going for a “generic” system.
I’m adding a note about this problem to the upcoming article right now. Thanks!
You hint this will become a series. It would be very much appreciated if you could include one post on debugging. I have tried to create a RAID-1 setup quite similar to what you describe above. It seems to work functionally, but a lot of the time it feels like it goes VERY slowly. When running on only one of the disks, I don’t see the issue. I would expect to see a slight speedup for reads and a slight slowdown for writes, but not as much as this.
Note: I’m not asking you to help debugging my system! I’m only suggesting debugging as a topic for one of the parts in the series, and just gave the description above to explain what I mean.
Sure. I’ll see what I can put together. Thanks for the idea!
Thanks for this. The “Other important notes” section is what I will find most useful, as at some point a disk will need to be replaced; I’ve jotted it down as a starting point for the day I need it.
I found the article at https://www.tecmint.com/create-raid1-in-linux/ an easy how-to in setting up software raid1 on two additional disks and using mdadm to create the contents needed for mdadm.conf.
Looking forward to the upcoming article on how to convert a single-disk system to mirrored, if it can be done without losing data.
Stuart D Gathman
I like to use raid10 (which is different from raid1+0) because you can mirror and stripe with any number of devices, including 3. The stripes are arranged in a pattern to get performance like striping (raid0), but with mirroring. Raid 1+0 also does that, but only with more drives (starting with 4).
I use raid1 when I need to mirror boot partitions.
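The three-device raid10 setup described above can be sketched as a dry run (the command is printed rather than executed; device names are placeholders):

```shell
# Dry-run sketch: mdadm's raid10 level accepts an odd device count,
# unlike a nested RAID1+0, which needs at least four drives.
run() { echo "+ $*"; }   # print each step instead of executing it
run mdadm --create /dev/md/fast --level=10 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
```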
Sure. I’ve heard that the answer to “I need more disk speed” is to “throw more spindles” (i.e. disk drives in RAID arrays) at the problem, even for big companies and datacenters.
One thing to beware of, though, is how hard it would be to recover if your computer itself were to fail. Which is yet another advantage of software RAID: it is a bit more “portable” than a proprietary hardware RAID controller (and I have seen hardware RAID controllers fail). If you have a more complicated software RAID configuration, it may be difficult to reconstruct in the event that your computer dies. So there are trade-offs.
Stuart D Gathman
One huge advantage of software RAID is that you don’t need to match drive sizes or waste space. E.g., suppose you have two 1 TB drives with raid1 (for simplicity). Now you add a third, 2 TB drive. How do you use the space while maintaining mirroring? Simple:
- allocate two 1 TB partitions on the 2 TB drive
- migrate one of the 1 TB mirror legs to the first partition
- create a new RAID array from the second partition and the vacated 1 TB drive
- add the new RAID array to your volume group
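The steps above can be sketched as a dry run (the commands are printed, not executed; the device names sdb1/sdc, the array names md0/md1, and the volume group vg0 are all placeholders to adapt before running anything for real):

```shell
# Dry-run sketch of expanding a RAID1 mirror onto a larger third drive,
# then handing the new array to LVM.
run() { echo "+ $*"; }   # print each step instead of executing it

# 1. Carve two equal partitions out of the new 2 TB drive (sdc).
run parted /dev/sdc mkpart primary 0% 50%
run parted /dev/sdc mkpart primary 50% 100%

# 2. Move one leg of the existing mirror onto the first new partition.
run mdadm /dev/md0 --add /dev/sdc1
run mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1

# 3. Build a second mirror from the vacated drive and the second partition.
run mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc2

# 4. Hand the new array to LVM.
run pvcreate /dev/md1
run vgextend vg0 /dev/md1
```

Wait for each resync to finish (watch /proc/mdstat) before moving on to the next step.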
Indeed. Another thing that I like about it is that it is easy to configure email notifications to be sent in case a drive fails. You can even configure an arbitrary program to be run if, for example, you wanted a text message instead of an email. The possibilities are endless.
Stuart D Gathman
I’ve also seen some benchmarks indicating that software RAID is faster than hardware for raid0, raid1 (and presumably raid10 if that were an option on hardware) with comparable controllers. The hardware is mainly for raid5 and 6, where handling the checksums and read, modify, write cycles is expensive.
Makes sense. The OS knows a lot more about the processes that are running and their access patterns than the hard drive or RAID controller ever could. It has a lot more memory with which to hold and reorder read and write operations too.
Can a CPU extension such as AES-NI be used for the checksum calculations as well? I would imagine that would speed up raid5/6 considerably, if possible.
Also see https://lkml.org/lkml/2013/5/1/449
Oops, my reply below was meant to be in reply to this comment.
I think RAID5/6 normally uses XOR for the parity calculations that allow it to reconstruct data when blocks are missing or corrupt. AES (Advanced Encryption Standard) is something a little different. It is used to encrypt data. But, yes, some common algorithms for encryption and checksums (like AES and CRC32, respectively) are hardware accelerated.