RAID

What is RAID? Redundant Array of Independent/Inexpensive Disks (RAID)

However, that's a fairly old definition. Expensive disks are faster :) as we all know.

RAID is a method of using multiple physical disks to create a logical disk or disks which may be faster and may be more impervious to failure (redundancy) than the physical disks individually. Typically speed increases are achieved through striping, while redundancy may be achieved in a number of ways, most commonly parity and mirroring.

Common RAID configurations include:

RAID 0 - Striping
RAID 1 - Mirroring
RAID 5 - Striping with distributed parity
RAID 1+0 - Striped mirror set or mirrored stripe set

Striping

Striping is a method of reading and writing to/from RAID arrays which effectively chunks the data into 'stripes'. If an array has multiple disks the stripes can be written to or read from these disks simultaneously, offering a throughput increase. The amount of data that a stripe contains is generally referred to as a 'stripe size'. The stripe size is determined by the RAID controller (be it hardware or software) and is often user configurable. Configuration of the stripe size can lead to throughput performance increases or penalties depending upon the size of the data which will reside upon the array.
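
As a rough illustration of the round-robin placement described above, the following Python sketch cuts a buffer into pieces of one stripe size and deals them out across the disks in turn. The function names and the simple modulo mapping are assumptions made for illustration only; a real controller's on-disk layout may differ.

```python
def locate_piece(piece_number, num_disks):
    """Map a logical piece number to (disk index, row on that disk)."""
    disk = piece_number % num_disks    # pieces rotate round-robin across disks
    row = piece_number // num_disks    # how far down that disk the piece sits
    return disk, row


def split_into_pieces(data, stripe_size):
    """Cut data into stripe_size pieces; the last piece may be shorter."""
    return [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]


def stripe(data, stripe_size, num_disks):
    """Return one list per disk holding the pieces that disk would receive."""
    disks = [[] for _ in range(num_disks)]
    for number, piece in enumerate(split_into_pieces(data, stripe_size)):
        disk, _row = locate_piece(number, num_disks)
        disks[disk].append(piece)
    return disks


# Eight bytes, a stripe size of 2 bytes, two disks:
layout = stripe(b"ABCDEFGH", 2, 2)
# layout == [[b"AB", b"EF"], [b"CD", b"GH"]]
```

Because consecutive pieces land on different disks, a large sequential read or write can keep every disk busy at once, which is where the throughput increase comes from.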

While throughput may be increased, latency will never decrease. Latency cannot be combatted by adding more disks.

Parity

Array types that use parity offer very simple error detection and correction capabilities. This is done using the XOR logical operation. The data to be written undergoes an XOR operation which results in parity. Regardless of how many stripes of data undergo an XOR operation, the result is always the length of an individual stripe. The parity is then stored on the array, and can be used to recalculate missing data should there be a hardware failure.

A very simple way to XOR multiple pieces of data together is to sum each column of corresponding bits. If the column's sum is even, the result is a 0. If the sum is odd, the result is a 1. If the sum equals 0, treat it as even. While this works, it is not "what XOR is". Please read elsewhere for an article on XOR itself.
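
The column-sum shortcut can be checked against a real XOR with a small Python sketch. The function names and the 4-bit sample values are made up for illustration.

```python
def parity_by_column_sum(rows):
    """rows: equal-length strings of '0'/'1'. Even column sum -> 0, odd -> 1."""
    width = len(rows[0])
    result = []
    for col in range(width):
        total = sum(int(row[col]) for row in rows)
        result.append("1" if total % 2 else "0")
    return "".join(result)


def parity_by_xor(rows):
    """The same parity, computed with the XOR operator on whole numbers."""
    value = 0
    for row in rows:
        value ^= int(row, 2)
    # Pad with leading zeros back to the original width.
    return format(value, "0{}b".format(len(rows[0])))


sample = ["1010", "0110", "1111"]   # hypothetical 4-bit values
assert parity_by_column_sum(sample) == parity_by_xor(sample) == "0011"
```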

Here is a simplified example of three bytes undergoing an XOR operation to produce a parity byte:

Byte 1  11101011
Byte 2  11001101
Byte 3  01001111
----------------   
Parity  01101001

If a disaster occurs and Byte 2 vanishes in a puff of smoke, we can XOR the remaining two bytes and parity byte together to arrive back at Byte 2's contents, shown as "Missing" below:

Byte 1  11101011
Byte 3  01001111   
Parity  01101001
----------------
Missing 11001101

If another byte vanishes, the single parity byte is useless and the data is gone.
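
The worked example above can be reproduced in a few lines of Python. This is only a sketch of the XOR arithmetic, not of how any particular controller implements it; the helper name xor_blocks is an assumption.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


byte1 = bytes([0b11101011])
byte2 = bytes([0b11001101])
byte3 = bytes([0b01001111])

# Parity is the XOR of all the data.
parity = xor_blocks([byte1, byte2, byte3])
assert parity == bytes([0b01101001])

# Byte 2 is lost: XOR the survivors with the parity to get it back.
recovered = xor_blocks([byte1, byte3, parity])
assert recovered == byte2
```

The same function works on longer blocks, since each byte position is handled independently of the others.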

RAID 0

RAID 0 requires at least one disk. In this array configuration stripes are written and read across however many disks are in the array. Assuming there are no subsystem bottlenecks, this typically leads to a throughput increase of almost 100% of a single disk's throughput for each additional disk in the array.

Unfortunately this array type is more susceptible to failure, as it provides no redundancy and a single disk failure will take down the array.

Single disk RAID 0 arrays are sometimes used by server administrators to quickly get a box up and running before they add more disks later, and migrate the array to another striped array type, such as RAID 5.

There is no capacity loss on this array type to parity, mirroring, or anything else. The storage capacity of RAID 0 can be expressed as size*n where size is the size of the smallest disk and n is the number of disks.

RAID 1

RAID 1 requires two disks. Effectively, this array configuration writes the same data to both disks concurrently, creating a mirror. This array configuration can sustain a single disk loss without data loss. Some controllers will also offer a read throughput increase on this array type.

Some RAID 1 controllers actually implement striped mirror sets or mirrored stripe sets (RAID 01 or 10) in place of RAID 1. This allows the array to be expanded later, or even migrated to another array type. Usually this is transparent to the user, and the controller still refers to it as RAID 1.

Obviously an array of this type has only 50% of the storage space of the combined physical disks. The storage capacity of RAID 1 can be expressed as size*(n/2) where size is the size of the smallest disk and n is the number of disks.

RAID 5

RAID 5 arrays generally require a three-disk minimum. RAID 5 uses striping with distributed parity to achieve redundancy and a throughput increase. On a three-disk RAID 5 array, two disks will receive a stripe of data, and the other will receive a stripe of parity. Which disks receive what is rotated, so that the parity is distributed amongst the disks. RAID 5 can sustain the loss of a single disk without losing data. Unlike RAID 1, though, RAID 5 gives up only one disk's worth of space to redundancy, so it has a greater storage capacity for any given number of disks (three or more).
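
To picture the rotation, here is a small Python sketch that prints which disk holds parity in each row of an array. The rotation order used here (parity starting on the last disk and moving one disk to the left per row) is an assumption for illustration; actual controllers and software RAID implementations use various rotation schemes.

```python
def raid5_layout(num_disks, num_rows):
    """Return one list per row; 'P' marks parity, 'D<n>' marks data piece n."""
    rows = []
    data_index = 0
    for row in range(num_rows):
        # Assumed rotation: parity walks from the last disk towards the first.
        parity_disk = (num_disks - 1 - row) % num_disks
        cells = []
        for disk in range(num_disks):
            if disk == parity_disk:
                cells.append("P")
            else:
                cells.append("D{}".format(data_index))
                data_index += 1
        rows.append(cells)
    return rows


for cells in raid5_layout(3, 3):
    print("  ".join(cells))
# D0  D1  P
# D2  P  D3
# P  D4  D5
```

Every row holds two data pieces and one parity piece, and over three rows each disk has taken one turn at holding the parity.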

Sometimes RAID 5 arrays are created with two disks, and are effectively the same as a three-disk RAID 5 array which has lost a single disk. In this state the array has no redundancy, and its performance is degraded because any data that would have lived on the missing disk must be recalculated from parity when it is read. A RAID 5 array in this state will go down if a disk is lost. A third disk needs to be added, and the array rebuilt, before performance and redundancy return to normal. Two-disk RAID 5 arrays are very rarely created.

The storage capacity of RAID 5 can be expressed as size*(n-1) where size is the size of the smallest disk and n is number of disks.

RAID 6

RAID 6 is similar to RAID 5 with double the parity. That is, it uses striping with double distributed parity. Instead of a single chunk of parity being calculated for the data chunks in a stripe, two chunks of parity are created and then distributed as with RAID 5. This effectively doubles the redundancy of the array: a RAID 6 array, which requires a minimum of four disks, can sustain the loss of two disks and will fail only once a third has been lost. Using a four-disk RAID 6 array as an example, the equivalent of two disks is used for striping the actual data, and the equivalent of the remaining two is used for storing the parity information. Using this configuration, any fears of a second disk failing or of a write error occurring whilst replacing a failed disk are mitigated.

The storage capacity of a RAID 6 array can be expressed as size*(n-2) where size is the size of the smallest disk and n is the number of disks.

Nested RAID Levels

RAID 5 and RAID 6 arrays can give particularly good value and performance with regard to the effective use of disks. However, as the number of disks in the array increases, the reliability of a RAID 5 or RAID 6 array can fall to, and below, that of a single disk. That is, regardless of how many disks are in the array, a RAID 5 array will go down once a second disk has failed; likewise, a RAID 6 array will go down once a third disk has failed.
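
A back-of-the-envelope Python sketch shows why more disks can mean less safety. It assumes every disk fails independently with the same probability p over some period and that nothing is replaced or rebuilt in that time. Both the model and the value p = 0.05 are hypothetical simplifications, not measured figures.

```python
from math import comb


def prob_array_loss(num_disks, tolerated_failures, p):
    """Probability that more than tolerated_failures of num_disks disks fail."""
    prob_survive = sum(
        comb(num_disks, k) * p**k * (1 - p)**(num_disks - k)
        for k in range(tolerated_failures + 1)
    )
    return 1 - prob_survive


p = 0.05  # hypothetical chance of any one disk failing in the period

single_disk = prob_array_loss(1, 0, p)   # a lone disk tolerates no failures
small_raid5 = prob_array_loss(3, 1, p)   # RAID 5 tolerates one failure
large_raid5 = prob_array_loss(10, 1, p)
large_raid6 = prob_array_loss(10, 2, p)  # RAID 6 tolerates two failures

# Under these assumptions the three-disk RAID 5 is safer than a single disk,
# but the ten-disk RAID 5 is not.
assert small_raid5 < single_disk < large_raid5
assert large_raid6 < large_raid5
```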

To overcome this as the size of the array increases, RAID 1 is an obvious choice, as its redundancy scales with the size of the array. However, it doesn't come with the performance benefits you might expect from striped arrays such as RAID 5 and RAID 6. This is where we introduce a group of RAID levels known as "Nested". These RAID levels can have better scalability with regard to redundancy and occasionally performance. They work by combining two or more RAID levels and applying the combination to the entire array.

RAID 10

RAID 10, also known as RAID 1+0, works by splitting the entire set of disks into two or more groups, striping over all the groups, and then creating a local mirror in each group. This is where the name becomes evident, as it combines RAID 1 with RAID 0. For example, in a four-disk RAID 10 array, you can imagine a single piece of data being split into two stripes, one stripe going to Group A, the other going to Group B. Group A then mirrors its stripe across its two disks, and Group B mirrors its stripe across its two disks in a similar fashion. So we are effectively creating two RAID 1 arrays within a larger RAID 0 array.

A four-disk RAID 10 array will sustain a maximum of two disk failures before the array goes down, but whether it survives two failures depends on which two disks fail. Imagine, for example, that both disks in the Group A RAID 1 of the example above failed: we have then lost an entire group out of the higher RAID 0 array, and the entire array will fail. However, if one disk from Group A failed, and one disk from Group B failed, the array will continue to function, as both of the RAID 1 arrays are still active. Note that if a third disk were to fail, the array would go down regardless of which RAID 1 group it was in, as with a four-disk array it would mean an entire RAID 1 array had been lost.
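
The "it depends which disks fail" rule can be checked by brute force. The Python sketch below numbers the four disks 0 to 3, puts disks 0 and 1 in Group A and disks 2 and 3 in Group B, and tries every combination of failures. The numbering and function names are assumptions for illustration.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]  # Group A and Group B, two mirrored disks each


def array_survives(failed, groups=GROUPS):
    """RAID 10 stays up while every mirror group still has a working disk."""
    return all(any(disk not in failed for disk in group) for group in groups)


def surviving_combinations(num_failures, groups=GROUPS):
    """List every set of num_failures failed disks that the array survives."""
    disks = [disk for group in groups for disk in group]
    return [
        set(combo)
        for combo in combinations(disks, num_failures)
        if array_survives(set(combo), groups)
    ]


assert len(surviving_combinations(1)) == 4   # any single failure is fine
assert len(surviving_combinations(2)) == 4   # 4 of the 6 possible pairs survive
assert surviving_combinations(3) == []       # no triple failure is survivable
assert not array_survives({0, 1})            # losing all of Group A is fatal
```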

Depending on the RAID controller, RAID 10 arrays may be limited to four disks.

The storage capacity of a RAID 10 array is expressed in the same fashion as a RAID 1 array: size*(n/2) where size is the size of the smallest disk and n is the number of disks.
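
The capacity formulas given for each level can be gathered into one small Python helper. This is just the arithmetic from the text; the function name and the way levels are spelled as strings are assumptions, and it does not check each level's minimum number of disks.

```python
def usable_capacity(level, disk_sizes):
    """Usable capacity for a RAID level, in the same units as disk_sizes."""
    size = min(disk_sizes)   # every disk is treated as the smallest one
    n = len(disk_sizes)
    if level == "0":
        return size * n
    if level in ("1", "10"):
        return size * n // 2
    if level == "5":
        return size * (n - 1)
    if level == "6":
        return size * (n - 2)
    raise ValueError("unknown RAID level: {}".format(level))


four_disks = [500, 500, 500, 500]   # e.g. four 500 GB disks
assert usable_capacity("0", four_disks) == 2000
assert usable_capacity("5", four_disks) == 1500
assert usable_capacity("6", four_disks) == 1000
assert usable_capacity("10", four_disks) == 1000
assert usable_capacity("1", [500, 500]) == 500
```

Note that on four disks RAID 6 and RAID 10 give the same usable space; they differ in which combinations of disk failures they survive.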

How do I set up RAID on Windows XP?

Why use RAID on XP?
1) Performance - you can use RAID 0 to write the data much faster than a single drive.
2) SATA controllers make it really easy.

A 2nd PC running with a floppy drive is best when doing this. I'm going to assume you're using onboard RAID ('cause that's what most of us OCAU'ers would use). btw: do you have 2 identical HDDs?

1st, set up your BIOS to enable onboard RAID. Set the boot device to the RAID controller (sometimes this means setting it to SCSI - check your manual for that one). Your manual will most probably list two RAID chipsets. One is the onboard default SATA controller (e.g. nVidia, VIA, etc). The second will be the add-on chipset (usually Promise RAID) <-- this is the one you want (if your mobo has one), as the processing will be done in this chip with no impact on other resources.

Get your mobo CDROM and find the RAID directory. There's usually a DIR in there called F6 or BOOTFLOPPY. Basically you're looking for the file/s that can be copied to a floppy disk that will enable Windows to find your RAID HDDs. Copy these files to a floppy (there's NO other way to do this unfortunately, floppy it is).

Boot XP from the CDROM and, when asked, press F6 to install an additional RAID or SCSI controller. This will be about 30secs into the blue DOS screen. Put the floppy in and Windows will find the correct driver. You then select the correct driver from the small list available.

Format the drive and continue installing Windows the usual way. Only at the end of the entire install process will you know if the drivers you put on the floppy are the correct ones! <-- and that sucks. If they aren't, try again with a different F6 driver.

Once Windows boots for the first time, install the RAID driver (the 32bit Windows driver) BEFORE any other driver. This is NOT the driver on the F6 floppy.