Post-Install RAID Configuration
The installation process has a very useful method for creating Redundant Array of Independent Disks (RAID) devices during the install. However, some implementations add RAID after the installation or need to modify their storage later. This guide covers the management of RAID devices after the initial installation.
If you are already familiar with RAID, skip this section.
RAID is a method for combining multiple disks into one volume in order to get storage behaviors that a single disk cannot provide. The features being traded against each other include:
- Redundancy
- Performance
- Inexpensive per MB
Typically, you will have one or several of the above to the detriment of the others. For example, performance and redundancy pull in opposite directions, and the only way to get both is at great cost. As a general hypothetical, imagine that you have 200 percentage points to spend between these three categories. You could have a system that is very redundant and high performance, but it would not be inexpensive at ALL! Or perhaps you want high performance at a low cost? No problem, but it won't be very redundant (in fact it may have a statistically WORSE chance of survival than a single drive).
Here are some typical RAID choices:
- RAID 0
- RAID 1
- RAID 5
- RAID 6
There are other combinations as well but they are beyond the scope of this document.
Of all the RAID combinations, RAID 0 is the fastest and is very cheap. With RAID 0 we create a single volume from all the members and the data is striped across all of the disks. It can be done with as few as 2 disks, it gets 100% of the capacity of those disks, and its performance is nearly double what either disk would deliver alone. It can be both bigger and faster than any other RAID; however, it has 0 redundancy and is statistically more likely to fail than a single disk on its OWN!
Let's imagine that we have 10 1TB hard drives and we make one big RAID 0. We would find that we now have a single 10 TB volume that is able to write to all 10 disks at the same time, giving us about 10 times the performance. Wow! That is fast. Surely there must be a downside to this performance…and there is. Imagine that each of these drives has an expected life of 5 years. That means, statistically, these drives will fail after 5 years on average. With RAID 0, however, we will lose the entire volume if any one of them fails. With 10 drives, the first failure is expected 10 times sooner, which makes the life expectancy of our volume 6 months! Additionally, if we replace the drive and start over again, we'll lose the entire volume again, statistically, in another 6 months.
The more drives we add, the bigger and faster it becomes…and the more frequently it fails!
OK, it seems sort of useless to use RAID 0 then, right? Yes and no. There isn't one proper RAID combination for every problem. Rather, there are options, and reasons for choosing each option. So where does RAID 0 make sense? Anywhere you have a need for speed and the ability to shrug off a failure. RAID 0 is sometimes used for fast caches, or for storing volatile data that is only needed for a short period of time. It is probably the worst RAID to choose if you actually care about what is being written to the disk for any period of time.
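The arithmetic behind the 6 month figure can be sketched as follows. This is only a rough model: it assumes the drives fail independently and at a constant rate, which the text above does not state, and the function name is made up for illustration.

```python
def raid0_expected_life_years(drive_life_years: float, drive_count: int) -> float:
    """Expected time until the first of `drive_count` drives fails.

    Assumption (not from the guide): drives fail independently at a
    constant rate, so the failure rates of the members simply add up.
    """
    if drive_count < 1:
        raise ValueError("an array needs at least one drive")
    return drive_life_years / drive_count


if __name__ == "__main__":
    # Each drive is expected to last 5 years, as in the example above.
    for count in (1, 2, 10):
        months = raid0_expected_life_years(5.0, count) * 12
        print(f"{count:2d} drives: about {months:.0f} months")
```

Under this model every extra member shortens the expected life of the volume, which is why RAID 0 only suits data you can afford to lose.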
RAID 1 is quite the opposite of RAID 0 in many regards. With RAID 1 we take two disks and mirror them, getting the capacity and speed of just one disk (you can use more than two disks but you will still only get the performance and capacity of one). Any time there is a write operation, we write to both. When we read, we read from one. With RAID 1, we can lose either drive and be fine. If a drive is lost, we simply replace it and rebuild the array. Super simple. It is an instant copy of the data as it exists at the present moment. So we get lots of redundancy, but at a cost. RAID 1 is twice as expensive as RAID 0 per megabyte of data. Also, it is only as fast as a single drive. But it is simple to implement and easy to fix. After all, you only need one disk or the other for the whole thing to work.
RAID 1 is ideal for anything that you really, really care about. Because of its simplicity, it is often used for boot partitions. It can also be mirrored across more than two disks to give the ultimate protection against data loss, decreasing your statistical chance of complete failure geometrically!
RAID 2, 3, and 4 do exist. But they are generally less useful, so they are typically left out of the discussion, and they won't be included here except to say that some of them can be used as transition points from RAID 1 to RAID 5 in a migration.
The goal of RAID 5 is to seek a balance between capacity and redundancy. With RAID 5 we take the sum total of the equally sized disks and reduce that number by the size of just 1 of the disks to determine the capacity.
For example, if I have 3 1TB drives, I take the total capacity (3TB) and reduce it by the amount of one of them (1TB) for a total of 2 TB. If I have 7 1TB disks, I will have 6TB usable.
So where does that 1 TB go? It is taxed proportionally from each disk using a striping algorithm, and it is populated with parity information for the data it is protecting. This means that if a single disk fails (and it doesn't matter which one that is), the system can read the parity (the parity bit) together with the surviving data and determine what used to be written on that disk before it failed. Genius, eh? Well, there is a cost for such a convention, especially if there is a disk offline and the math has to be performed. There is a large cost for creating the parity, which slows write performance, and there is a cost for deriving the missing data on reads when one of the members has failed (when all the members are present, the parity is not used on reads).
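The parity idea can be illustrated with a small sketch. RAID 5 parity is conventionally a byte-by-byte XOR across the data blocks of a stripe. The block contents, the helper name and the three-disk layout below are made up for illustration, and a real array also rotates which disk holds the parity for each stripe, which this sketch ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe on a 3-disk RAID 5: two data blocks and one parity block.
data_a = b"CLEAR"
data_b = b"OS-62"
parity = xor_blocks([data_a, data_b])

# Simulate losing the disk that held data_b: rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)
```

The same XOR that produces the parity on every write is what recovers the missing block on a degraded read, which is where the extra cost described above comes from.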
OK, so where does RAID 5 make sense? Because of the complexity it is NOT desirable for boot devices (in fact, software RAID 5 is unusable for boot in Linux although hardware RAID 5 can be used because it is hidden from the OS). Because of the write penalty it is not good for caches or any subsystem which requires decent write performance (anything that gets written to a lot, such as databases). Because it allows us to use a lot of disks and has a low cost for redundancy, it is quite useful for file shares which are read more than they are written to. So it is very useful for archiving lots of data.
RAID 6 is like RAID 5 in many, many ways, except that with RAID 6 you give up the capacity of 2 drives to parity instead of 1. With the large capacity of modern drives, RAID 6 is desirable because it can take a long time to rebuild a drive in an array even if you have a replacement on hand. While the rebuild of one drive is happening, your array is exposed to a second failure, which on RAID 5 would wipe you out. Performance-wise, RAID 6 is no better than RAID 5; that is the sacrifice for the additional redundancy.
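The capacity rules for the four levels discussed above can be summarized in a short sketch. The function name is hypothetical and the minimum disk counts are the conventional ones, stated here as an assumption; the calculation assumes equally sized disks.

```python
def usable_capacity_tb(level: int, disk_count: int, disk_size_tb: float) -> float:
    """Usable capacity for equally sized disks at RAID levels 0, 1, 5 and 6."""
    # Conventional minimum member counts (an assumption, not from the guide).
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if disk_count < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level == 0:
        data_disks = disk_count        # every disk holds data
    elif level == 1:
        data_disks = 1                 # all members mirror one disk's worth
    elif level == 5:
        data_disks = disk_count - 1    # one disk's worth goes to parity
    else:
        data_disks = disk_count - 2    # RAID 6: two disks' worth of parity
    return data_disks * disk_size_tb


if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        print(f"RAID {level}, 7 x 1TB: {usable_capacity_tb(level, 7, 1.0)} TB usable")
```

With 3 and 7 disks of 1TB each at level 5 this reproduces the 2 TB and 6TB figures from the RAID 5 example.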
Hardware vs Software
There is a big debate that will continue to rage concerning hardware vs. software RAID. RAID can be performed by RAID cards or it can be performed by the operating system itself. My job isn't to convince you of one way over the other. I'll just present some high level facts and let you decide which one is best for you.
Hardware RAID allows you to connect your drives to a hardware controller which will manage your RAID devices at a BIOS level. This allows you to present to the operating system any RAID configuration that you like. The OS doesn't care what RAID you are using because it doesn't see it. The RAID card is capable of creating, managing, and presenting the RAID to the OS as a single volume for you to carve up.
Typically, a BIOS extension is available on your RAID card which will allow you to access the configuration. Some high end RAID cards even have their own web interfaces that allow you to configure them and set alerts.
Hardware RAID cards can easily outperform software RAID if they are designed with caches and other types of goodies that allow them to 'pretend' to write to disk. These RAID cards will typically have battery packs on them so that even if the system faults, they can complete their write operations.
Additionally, hardware RAID is usually simple to configure and the menu driven interfaces are generally intuitive.
There are two downsides to hardware RAID. The first is cost. The fancier the card, the more it will cost you. The second downside is replacement risk. Because the hardware RAID card manages the RAID set, any failure of the card itself will typically require that you have the same or a compatible replacement controller to manage the RAID. Additionally, you need to be aware of firmware levels, as mismatched firmware levels can lead to loss of data on the whole array even if the disks were NOT damaged in the RAID failure!
Software RAID allows you to simply connect drives to your system and then have the operating system, which can see and address all of these drives, create and maintain the RAID. Software RAID is cheaper and does not require that you have specific hardware controllers in order to successfully build the array. In fact, software RAID has been known to rebuild even when the computer it was rebuilt on is running a different version of Linux or BSD altogether! This flexibility and inexpensive nature comes at the cost of ease of use and performance. Because software RAID is fully available on ClearOS, it is the purpose of the rest of this document to compensate for the 'ease of use' issue with examples of how you can use RAID under ClearOS.
To LVM or not to LVM, that is the question
I want to mention LVM here for just a moment so that you understand what it is and how it relates to software RAID under ClearOS. LVM is a volume manager. You can use it or NOT use it on ClearOS. Under ClearOS 5 it was NOT used by default. Under ClearOS 6 it is used by the default install.
It is meant to give you control and flexibility that do not exist otherwise. Using it, you can:
- Manage large hard disk farms by adding disks, replacing disks, and copying and sharing contents from one disk to another without disrupting service (hot swapping).
- On small systems (like a desktop at home), resize your disk partitions easily as needed instead of having to estimate at installation time how big a partition might need to be in the future.
- Make backups by taking snapshots.
- Create single logical volumes out of multiple physical volumes or entire hard disks (somewhat similar to RAID 0, but more similar to JBOD), allowing for dynamic volume resizing. (Source: Wikipedia, http://en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux))
In both versions of ClearOS, there are ways to implement or remove this layer. In either case, you can use RAID. In fact, you can implement RAID beneath LVM at one level and then implement yet another RAID on top of LVM at another. This is just one way a user can combine RAID styles to come up with a best of breed solution.
For example, you might have 10 disks that you partition as RAID members and use to create two separate RAID 0 volumes with 5 disks each. Then you put LVM on those volumes and create a RAID 1 mirror on top of the LVM managed volumes. This would give you a nested-RAID solution called RAID 01 (also called RAID 0+1).
The important thing to remember is that RAID is not LVM and LVM is not RAID. They can work together but they are not required to work together. The tool for managing RAID will treat physical disk partitions (e.g. /dev/sdb1) and LVM partitioned volumes (e.g. /dev/mapper/volume-name) the same. The difference only lies in how you can manage the underlying volume under LVM.
Under ClearOS, RAID volumes managed at the kernel level are handled by the multi-disk (md) manager. The bread and butter tool for manipulating the ClearOS multi-disks is 'mdadm'. An additional important command line asset is the status file located at /proc/mdstat.
Watching your RAID
To monitor your multi-disks from the command line, query the status that the kernel exposes in /proc/mdstat. You can do this easily with the command:
cat /proc/mdstat
If you don't have anything configured, you will get information like this:
Personalities :
unused devices: <none>
On a configured system you may see some information that looks like this:
Personalities : [raid1]
md1 : active raid1 sdb1 sda1
      120384 blocks [2/2] [UU]
md3 : active raid1 sdb3 sda3
      951039872 blocks [2/2] [UU]
md2 : active raid1 sdb2 sda2
      25599488 blocks [2/2] [UU]
unused devices: <none>
There is lots of good information here so I will break it down:
- Personalities: This will tell you the RAID modules that your kernel currently supports. Many are cross-over and so if you see personalities that are not represented by your configuration, don't worry. This just tells you what the kernel supports currently. Also, adding a RAID type can automatically inject the support dynamically in the kernel. All in all, this is not really useful. Supported modules are: [raid0] [raid1] [raid4] [raid5] [raid6] [linear] [multipath], and [faulty]
- md1: This is the block device that is created from its constituent parts. If you see this, the kernel and tools will also be able to address it as /dev/md1. The device, md1, behaves like a partition. Once it is created it can be formatted, or it can be set up and configured as a physical volume under LVM.
- active: This is the state of the array. You won't see an array here at all if it is 'stopped', and any 'inactive' arrays are almost always faulty.
- raid1: Here is where you will see the RAID level for the particular multi-disk array.
- sdb1 sda1: Here are the underlying partitions. In this instance, they are partitions on the two hard disks that use the 'fd' partition type. In the kernel's output each one is also followed by a unique number in brackets indicating its position in the RAID. In this case sda1 is 0 and sdb1 is 1, the two halves of the mirror; every write goes to both of them.
/dev/sda1 * 1 15 120456 fd Linux raid autodetect
/dev/sdb1 * 1 15 120456 fd Linux raid autodetect
- 120384 blocks: This is the number of blocks that are now usable. It will be smaller than the actual partition by around 100 blocks (72 on this partition, 120456 minus 120384). The difference is used to keep track of the RAID information, so that if you bring the disks together on a different system it is possible to reassemble and heal the array.
- [2/2]: This is how many drives the array should have over how many are currently active. If it reads '[2/2]', both halves of the mirror are present. If, however, it says '[2/1]' then you are missing a drive from the array. Don't panic! Take a deep breath. Find out what is going on and whether this is a failure or by design. For example, some ClearBOX units that ship with 1 drive are configured with RAID 1 (degraded). Why? Because there is no performance penalty for RAID 1 (degraded), and this allows a user who spots a predictive failure to migrate their system to a new drive using RAID as the replication technology! Genius!
- [UU]: This tells you some of the same information as above. A 'U' indicates that a drive is up and a '_' indicates that a drive is down. Ideally, you will see all 'U's. A degraded array will have one or more '_' (e.g. [UUU__] would be a bad thing).
- bitmap: You will have this line if the array keeps a write-intent bitmap, which records the regions that have writes in flight so that a resync after a crash only has to cover those regions.
- recovery: If your RAID is in the process of recovering, you will get an additional line with a progress bar.
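The fields described above can also be picked out by a script. The following is a minimal sketch, assuming the simple layout shown in the sample output (an array line followed by a status line); the function name and the sample text are illustrative, and arrays in other states or with extra annotations may need a more tolerant parser.

```python
import re

# "md1 : active raid1 sdb1 sda1" -> name, state, level, member list
ARRAY_RE = re.compile(r"^(md\d+) : (\S+) (\S+) (.*)$")
# "120384 blocks [2/2] [UU]" -> block count and the U/_ flags
STATUS_RE = re.compile(r"(\d+) blocks.*\[([U_]+)\]")

SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sdb1 sda1
      120384 blocks [2/2] [UU]
md2 : active raid1 sdb2 sda2
      25599488 blocks [2/1] [U_]
unused devices: <none>
"""


def parse_mdstat(text):
    """Return one dict per array found in /proc/mdstat-style text."""
    arrays = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = ARRAY_RE.match(line)
        if header:
            name, state, level, members = header.groups()
            arrays.append({
                "name": name,
                "state": state,
                "level": level,
                # Drop the "[n]" position suffix if the kernel printed one.
                "members": [m.split("[")[0] for m in members.split()],
            })
            continue
        status = STATUS_RE.search(line)
        if status and arrays:
            blocks, flags = status.groups()
            arrays[-1].update(blocks=int(blocks), flags=flags,
                              degraded="_" in flags)
    return arrays


if __name__ == "__main__":
    for array in parse_mdstat(SAMPLE):
        health = "DEGRADED" if array["degraded"] else "ok"
        print(array["name"], array["level"], array["flags"], health)
```

On a live system the text would come from reading /proc/mdstat instead of the hypothetical SAMPLE string, in which md2 has been made degraded on purpose to show the '_' flag.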
A neat trick to implement while you are working on a system with mdadm is to constantly watch the status in another window. Open an additional SSH session to your ClearOS server and run the following:
watch cat /proc/mdstat
Using this will give you an update every 2 seconds, which is really cool because it keeps you informed about the disposition of your RAID. Additionally, it will simulate a 'moving progress bar' while you are building arrays.