Saturday 15 December 2012

How does RAID 5 work? The Shortest Explanation

Few of us have time to study long and complicated RAID theory, but you may still be interested in how RAID 5 works. We have made it simple for you by providing the shortest and easiest explanation we could.

RAID 5: how does it work?

First, a reminder of the XOR definition:

The XOR function returns 1 if its two arguments are different.
XOR (0, 1) = 1
XOR (1, 0) = 1


The XOR function returns 0 if its two arguments are the same.
XOR (0, 0) = 0
XOR (1, 1) = 0
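
As a quick check of the definition above, here is a minimal Python sketch that prints the same truth table. The helper name xor is only illustrative; it wraps Python's built-in ^ operator.

```python
def xor(a: int, b: int) -> int:
    """Return 1 if the two bits differ, 0 if they are the same."""
    return a ^ b

# Walk through all four combinations of two bits.
for a in (0, 1):
    for b in (0, 1):
        print(f"XOR ({a}, {b}) = {xor(a, b)}")
```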


Now let us assume we have 3 drives with the following bits:
| 101 | 010 | 011 |

Now we calculate the XOR of that data and place the result on a 4th drive:

XOR (101, 010, 011) = 100     (XOR (101, 010) = 111 and then XOR (111, 011) = 100)

So the data on the four drives looks like this:

| 101 | 010 | 011 | 100 |
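
The parity calculation above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic with the three example values; the variable names drives and parity are made up for illustration.

```python
from functools import reduce
from operator import xor

# The three data drives from the example, written as binary literals.
drives = [0b101, 0b010, 0b011]

# XOR all values together, one pair at a time: (101 ^ 010) ^ 011.
parity = reduce(xor, drives)

print(format(parity, "03b"))  # prints 100
```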

Now let’s see how the XOR MAGIC works. Let’s assume the second drive has failed. When we calculate the XOR of all the remaining data, the result is exactly the data from the missing drive.

| 101 | ??? | 011 | 100 |

XOR (101, 011, 100) = 010

You can check this for any other missing drive: the XOR of the remaining data will always give you exactly the data of the missing drive. For example, if the third drive fails:

| 101 | 010 | ??? | 100 |

XOR (101, 010, 100) = 011
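
To check every case at once, here is a small Python sketch that fails each of the four drives in turn and rebuilds it from the other three. The function name rebuild and the list stripe are hypothetical names chosen for this example.

```python
from functools import reduce
from operator import xor

# Three data drives plus the parity drive from the example.
stripe = [0b101, 0b010, 0b011, 0b100]

def rebuild(stripe, failed):
    """XOR the surviving drives to recover the value of the failed one."""
    survivors = [value for i, value in enumerate(stripe) if i != failed]
    return reduce(xor, survivors)

for failed in range(len(stripe)):
    recovered = rebuild(stripe, failed)
    print(f"drive {failed + 1} lost, rebuilt as {recovered:03b}")
```

Note that the parity drive itself can be rebuilt the same way, since it is simply the XOR of the other three.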

What works for 3 bits and 4 drives works for any number of bits and any number of drives. In a real RAID 5 array the most common stripe size is 64k (65536 bytes * 8 = 524288 bits).

So a real XOR engine has to deal with 524288 bits at a time, not 3 bits as in our exercise. This is why RAID 5 needs a very efficient XOR engine in order to calculate parity fast.
So by adding one drive for parity you will be able to rebuild the missing data if any single drive fails.
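
The same idea scales to 64k blocks. The sketch below is an illustration only, not how a real controller is implemented: it fills three blocks with random bytes, computes the parity block, then recovers one block from the others. The names CHUNK and xor_blocks are assumptions made for this example.

```python
import os

CHUNK = 64 * 1024  # 65536 bytes = 524288 bits

def xor_blocks(blocks):
    """Return the bitwise XOR of several equally sized byte blocks."""
    acc = 0
    for block in blocks:
        # Treat each block as one very large integer and XOR it in.
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(len(blocks[0]), "big")

# Three data blocks with random content stand in for three drives.
data = [os.urandom(CHUNK) for _ in range(3)]
parity = xor_blocks(data)

# Pretend the second drive is lost and rebuild it from the rest.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])  # prints True
```

Python handles the 524288-bit integers without trouble, but a hardware XOR engine does this work for every stripe written, which is why its speed matters.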

Strictly speaking, our example describes RAID 4, where parity sits on a dedicated drive. RAID 5 distributes the parity evenly across all drives. Distributed parity provides a slight increase in performance, but the XOR magic is the same.
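
To picture distributed parity, here is a minimal Python sketch that prints which drive holds the parity (P) and which hold data (D) for each stripe. The rotation rule in parity_drive is an assumption chosen for illustration; real controllers use various layouts, and this is not meant to match any particular one.

```python
def parity_drive(stripe_index: int, n_drives: int) -> int:
    """Hypothetical rotation: parity moves one drive to the left per stripe."""
    return (n_drives - 1 - stripe_index) % n_drives

def layout(n_stripes: int, n_drives: int):
    """Build one row per stripe, marking the parity drive with 'P'."""
    rows = []
    for s in range(n_stripes):
        p = parity_drive(s, n_drives)
        rows.append(["P" if d == p else "D" for d in range(n_drives)])
    return rows

for row in layout(4, 4):
    print("| " + " | ".join(row) + " |")
```

With 4 drives and 4 stripes, every drive ends up holding parity exactly once, so no single drive has to absorb all the parity writes.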

RAID 5 is a solid solution when we talk about data safety, as long as only one drive fails at a time. In the case of a failure the system will automatically rebuild the lost data so that it can still be read (however, the performance of the array will be reduced in the meantime).

