
Introduction

For any organization, whether it is a small business or a data center, lost data means lost business. There are two
common practices for protecting that data: backups (protecting your data against total system failure, viruses,
corruption, etc.), and RAID (protecting your data against drive failure). Both are necessary to ensure your data is
secure.

This white paper discusses the various types of RAID configurations available, their uses, and how they should be
implemented into data servers.

NOTE: RAID is not a substitute for regularly-scheduled backups. All organizations and users should always have a solid backup
strategy in place.
What is RAID?
RAID (Redundant Array of Inexpensive Disks) is a data storage structure that allows a system
administrator/designer/builder/user to combine two or more physical storage devices (HDDs, SSDs, or both) into a
logical unit (an array) that is seen by the attached system as a single drive.

There are three basic RAID elements:

1. Striping (RAID 0) writes some data to one drive and some data to another, minimizing read and write access
times and improving I/O performance.
2. Mirroring (RAID 1) replicates data on two drives, preventing loss of data in the event of a drive failure.
3. Parity (RAID 5 & 6) provides fault tolerance by examining the data on two drives and storing the results on a
third. When a failed drive is replaced, the lost data is rebuilt from the remaining drives.

It is possible to configure these RAID levels into combination levels — called RAID 10, 50 and 60.

The RAID controller handles the combining of drives into these different configurations to maximize performance,
capacity, redundancy (safety) and cost to suit the user needs.
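To make the three elements concrete, the short Python sketch below maps logical blocks to drives under striping, mirroring, and distributed parity. It is an illustration only (hypothetical drive names, simplified round-robin placement), not how any particular controller lays out data.

```python
# Minimal sketch (not a real controller): how logical blocks map to drives
# under striping (RAID 0), mirroring (RAID 1), and distributed parity (RAID 5).

def raid0_layout(num_blocks, drives):
    """Round-robin striping: block i lands on drive i % len(drives)."""
    return {i: drives[i % len(drives)] for i in range(num_blocks)}

def raid1_layout(num_blocks, drives):
    """Mirroring: every block is written to both drives."""
    assert len(drives) == 2
    return {i: list(drives) for i in range(num_blocks)}

def raid5_layout(num_stripes, drives):
    """Distributed parity: one parity block per stripe, rotating across drives."""
    n = len(drives)
    layout = []
    for s in range(num_stripes):
        parity_drive = drives[(n - 1 - s) % n]              # rotate the parity position
        data_drives = [d for d in drives if d != parity_drive]
        layout.append({"stripe": s, "data": data_drives, "parity": parity_drive})
    return layout

if __name__ == "__main__":
    drives = ["drive0", "drive1", "drive2", "drive3"]
    print(raid0_layout(8, drives))
    print(raid1_layout(4, drives[:2]))
    for stripe in raid5_layout(4, drives):
        print(stripe)
```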

Hardware RAID vs. software RAID


RAID can be hardware-based or software-based. Hardware RAID resides on a PCIe controller card, or on a
motherboard-integrated RAID-on-Chip (ROC). The controller handles all RAID functions in its own hardware
processor and memory. The server CPU is not loaded with storage workload so it can concentrate on handling the
software requirements of the server operating system and applications.

Pros: » Better performance than software RAID.


» Controller cards can be easily swapped out for replacement and upgrades.
Cons: » More expensive than software RAID.

Software RAID runs entirely on the CPU of the host computer system.

Pros: » Lower cost due to lack of RAID-dedicated hardware.


Cons: » Lower RAID performance as CPU also powers the OS and applications.
How does RAID work?
In software RAID, the RAID implementation is an application running on the host. This type of RAID uses drives
attached to the computer system via a built-in I/O interface or a processorless host bus adapter (HBA). The RAID
becomes active as soon as the OS has loaded the RAID driver software.

In hardware RAID, a RAID controller has a processor, memory and multiple drive connectors that allow drives to be
attached either directly to the controller, or placed in hot-swap backplanes.

In both cases, the RAID system combines the individual drives into one logical disk. The OS treats the drive like any
other drive in the computer — it does not know the difference between a single drive connected to a motherboard or
a RAID array being presented by the RAID controller.

Given its performance benefits and flexibility, hardware RAID is better suited for the typical modern server system.

RAID-compatible HDDs and SSDs


Storage manufacturers offer many models of drives. Some are designated as “desktop” or “consumer” drives, and
others as “RAID” or “enterprise” drives. There is a big difference: a consumer drive is not designed for the demands
of being connected into a group of drives and is not suitable for RAID. RAID or enterprise drives, on the other hand,
are designed to communicate with the RAID controller and act in unison with other drives to form a stable RAID array
to run your server.

From a RAID perspective, HDDs and SSDs only differ in their performance and capacity capabilities. To the RAID
controller they are all drives, but it is important to take note of the performance characteristics of the RAID controller
to ensure it is capable of fully accommodating the performance capabilities of the SSD. Most modern RAID
controllers are fast enough to allow SSDs to run at their full potential, but a slow RAID controller could bottleneck data
and negatively impact system performance.

Hybrid RAID
Hybrid RAID is a redundant storage solution that combines high capacity, low-cost SATA or higher-performance SAS
HDDs with low latency, high IOPs SSDs and an SSD-aware RAID adapter card (Figure 1).
In Hybrid RAID, read operations are done from the faster SSD and write operations happen on both SSD and HDD
for redundancy purposes.
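As an illustration of that read/write policy, the sketch below models a hypothetical two-device hybrid mirror in Python: every write goes to both members, and reads are served from the SSD when possible. The Device and HybridMirror classes are invented for this example and are not a real driver or controller API.

```python
# Minimal sketch of the Hybrid RAID policy described above: a two-device mirror
# where reads are served from the SSD and every write goes to both devices.

class Device:
    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def read(self, lba):
        return self.blocks.get(lba)

    def write(self, lba, data):
        self.blocks[lba] = data

class HybridMirror:
    def __init__(self, ssd, hdd):
        self.ssd, self.hdd = ssd, hdd

    def write(self, lba, data):
        # Writes are mirrored to both members for redundancy.
        self.ssd.write(lba, data)
        self.hdd.write(lba, data)

    def read(self, lba):
        # Reads prefer the low-latency SSD; fall back to the HDD if needed.
        data = self.ssd.read(lba)
        return data if data is not None else self.hdd.read(lba)

array = HybridMirror(Device("ssd0"), Device("hdd0"))
array.write(0, b"hello")
print(array.read(0))
```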

Hybrid RAID arrays offer tremendous performance gains over standard HDD arrays at a much lower cost than SSD-
only RAID arrays. Compared to HDD-only RAID arrays, hybrid arrays accelerate IOPs and reduce latency, allowing
any server system to host more users and perform more transactions per second on each server, which reduces the
number of servers required to support any given workload.

Hybrid RAID's common use cases are not obvious at first glance; they range from simple mirrors in workstations to
high-performance, read-intensive applications in the small to medium business arena. Hybrid RAID is also used
extensively in the data center to provide greater capacity in storage servers while providing fast boot for those servers.
Who should use RAID?
Any server or high-end workstation, and any computer system where constant uptime is required, is a suitable
candidate for RAID.

At some point in the life of a server, at least one drive will fail. Without some form of RAID protection, a failed drive’s
data would have to be restored from backups, likely at the loss of some data and a considerable amount of time. With
a RAID controller in the system, a failed drive can simply be replaced and the RAID controller will automatically
rebuild the missing data from the rest of the drives onto the newly inserted drive. This means that your system can
survive a drive failure without the complex and long-winded task of restoring data from backups.

Choosing the right RAID level


There are several different RAID configurations, called “levels,” such as RAID 0, RAID 1, RAID 10, and RAID 5. While
there is little difference in their names, there are big differences in their characteristics and where/when they should
be used.

The factors to consider when choosing the right RAID level include:

 Capacity
 Performance
 Redundancy (reliability/safety)
 Price

There is no one-size-fits all approach to RAID because focus on one factor typically comes at the expense of another.
Some RAID levels designate drives to be used for redundancy, which means they can’t be used for capacity. Other
RAID levels focus on performance but not on redundancy. A large, fast, highly redundant array will be expensive.
Conversely, a small, average-speed redundant array won't cost much, but will not be anywhere near as fast as the
previous expensive array.

With that in mind, here is a look at the different RAID levels and how they may meet your requirements.

RAID 0 (Striping)
In RAID 0, all drives are combined into one logical disk (Figure 2). This configuration offers low cost and maximum
performance, but no data protection — a single drive failure results in total data loss.

As such, RAID 0 is not recommended. As SSDs become more affordable and grow in capacity, RAID 0 has declined
in popularity. The benefits of fast read/write access are far outweighed by the threat of losing all data in the event of a
drive failure.

Usage: Suited only for situations where data isn’t mission critical, such as video/audio post-production, multimedia
imaging, CAD, data logging, etc. where it’s OK to lose a complete drive because the data can be quickly re-copied
from the source. Generally speaking, RAID 0 is not recommended.
Pros: » Fast and inexpensive.
» All drive capacity is usable.
» Quick to set up. Multiple HDDs sharing the data load make it the fastest of all arrays.
Cons: » RAID 0 provides no data protection at all.
» If one drive fails, all data will be lost with no chance of recovery.
RAID 1 (Mirroring)
RAID 1 maintains duplicate sets of all data on two separate drives while showing just one set of data as a logical disk
(Figure 3). RAID 1 is about protection, not performance or capacity.

Since each drive holds copies of the same data, the usable capacity is 50% of the available drives in the RAID set.

Usage: Generally only used in cases where there is not a large capacity requirement, but the user wants to make
sure the data is 100% recoverable in the case of a drive failure, such as accounting systems, video editing, gaming
etc.

Pros: » Highly redundant — each drive is a copy of the other.


» If one drive fails, the system continues as normal with no data loss.
Cons: » Capacity is limited to 50% of the available drives, and performance is not much better than a single drive.
NOTE: With the advent of large-capacity SATA HDDs, it is possible to achieve an approximately 8TB RAID 1 array
using two 8TB HDDs. While this may give sufficient capacity for many small business servers, performance will still
be limited by the fact that it only has two spindles operating within the array. Therefore it is recommended to move to
RAID arrays that utilize more spinning media when such capacities are required.

RAID 1E (Striped Mirroring)


RAID 1E combines data striping from RAID 0 with data mirroring from RAID 1 while offering more performance than
RAID 1 (Figure 4). Data written in a stripe on one drive is mirrored to a stripe on the next drive in the array.

As in RAID 1, usable drive capacity in RAID 1E is 50% of the total available capacity of all drives in the RAID set.

Usage: Small servers, high-end workstations, and other environments with no large capacity requirements, but where
the user wants to make sure the data is 100% recoverable in the case of a drive failure.
Pros: » Redundant with better performance and capacity than RAID 1. In effect, RAID 1E is a mirror of an odd number of drives.
Cons: » Cost is high because only half the capacity of the physical drives is available.
NOTE: RAID 1E is best suited for systems with three drives. For scenarios with four or more drives, RAID 10 is
recommended.

RAID 5 (Striping with Parity)


As the most common and best “all-round” RAID level, RAID 5 stripes data blocks across all drives in an array (at least
3 to a maximum of 32), and also distributes parity data across all drives (Figure 5). In the event of a single drive
failure, the system reads the parity data from the working drives to rebuild the data blocks that were lost.

RAID 5 read performance is comparable to that of RAID 0, but there is a penalty for writes since the system must
write both the data block and the parity data before the operation is complete.

The RAID parity requires one drive capacity per RAID set, so usable capacity will always be one drive less than the
total number of drives in the configuration.

Usage: Often used in fileservers, general storage servers, backup servers, streaming data, and other environments
that call for good performance but best value for the money. Not suited to database applications due to poor random
write performance.
Pros: » Good value and good all-around performance.
Cons: » One drive capacity is lost to parity.
» Can only survive a single drive failure at any one time.
» If two drives fail at once, all data is lost.
NOTE: It is strongly recommended to have a hot spare set up with the RAID 5 to reduce exposure to multiple drive
failures.

NOTE: While SSDs are becoming cheaper, and their improved performance over HDDs makes it seem possible to use
them in RAID 5 arrays for database applications, the general nature of small random writes in RAID 5 still means that
this RAID level should not be used in a system with a large number of small, random writes. A non-parity array such
as RAID 10 should be used instead.
RAID 6 (Striping with Dual Parity)
In RAID 6, data is striped across several drives and dual parity is used to store and recover data (Figure 6). It is
similar to RAID 5 in performance and capacity capabilities, but the second parity scheme is distributed across
different drives and therefore offers extremely high fault tolerance and the ability to withstand the simultaneous failure
of two drives in an array.

RAID 6 requires a minimum of 4 drives and a maximum of 32 drives to be implemented. Usable capacity is always
two less than the number of available drives in the RAID set.

Usage: Similar to RAID 5, including fileservers, general storage servers, backup servers, etc. Poor random write
performance makes RAID 6 unsuitable for database applications.

Pros: » Reasonable value for money with good all-round performance.


» Can survive two drives failing at the same time, or one drive failing and then a second drive failing during the data
rebuild.
Cons: » More expensive than RAID 5 due to the loss of two drive capacity to parity.
» Slightly slower than RAID 5 in most applications.
RAID 10 (Striping and Mirroring)
RAID 10 (sometimes referred to as RAID 1+0) combines RAID 1 and RAID 0 to offer multiple sets of mirrors striped
together (Figures 7 and 8). RAID 10 offers very good performance with good data protection and no parity
calculations.

RAID 10 requires a minimum of four drives, and usable capacity is 50% of available drives. It should be noted,
however, that RAID 10 can use more than four drives in multiples of two. Each mirror in RAID 10 is called a “leg” of
the array. A RAID 10 array using, say, eight drives (four “legs,” with four drives as capacity) will offer extreme
performance in both spinning media and SSD environments as there are many more drives splitting the reads and
writes into smaller chunks across each drive.

Usage: Ideal for database servers and any environment with many small random data writes.
Pros: » Fast and redundant.
Cons: » Expensive because it requires four drives to get the capacity of two.
» Not suited to large capacities due to cost restrictions.
» Not as fast as RAID 5 in most streaming environments.
RAID 50 (Striping with Parity)
RAID 50 (sometimes referred to as RAID 5+0) combines multiple RAID 5 sets (striping with parity) with RAID 0
(striping) (Figures 9 and 10). The benefits of RAID 5 are gained while the spanned RAID 0 allows the incorporation of
many more drives into a single logical disk. Up to one drive in each sub-array may fail without loss of data. Also,
rebuild times are substantially less than a single large RAID 5 array.
A RAID 50 configuration can accommodate 6 or more drives, but should only be used with configurations of more
than 16 drives. The usable capacity of RAID 50 is 67%-94%, depending on the number of data drives in the RAID
set.

It should be noted that you can have more than two legs in a RAID 50. For example, with 24 drives you could have a
RAID 50 of two legs of 12 drives each, or a RAID 50 of three legs of eight drives each. The first of these two arrays
would offer greater capacity as only two drives are lost to parity, but the second array would have greater
performance and much quicker rebuild times as only the drives in the leg with the failed drive are involved in the
rebuild function of the entire array.
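The arithmetic behind that trade-off can be written out directly. The sketch below (Python, assuming equal-sized 4TB drives purely as an example) compares the two 24-drive layouts by usable capacity and by the number of drives involved in a rebuild.

```python
# Worked example for the 24-drive RAID 50 choices described above,
# assuming equal-sized drives (4 TB each is an illustrative assumption).

def raid50_usable(total_drives, legs, drive_tb):
    drives_per_leg = total_drives // legs
    usable = (drives_per_leg - 1) * legs * drive_tb   # one drive's capacity lost to parity per leg
    return usable, drives_per_leg

for legs in (2, 3):
    usable, per_leg = raid50_usable(24, legs, drive_tb=4)
    print(f"{legs} legs of {per_leg} drives: {usable} TB usable, "
          f"{per_leg} drives involved in a rebuild")
```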

Usage: Good configuration for cases where many drives need to be in a single array but capacity is too large for
RAID 10, such as in very large capacity servers.

Pros: » Reasonable value for the expense.


» Very good all-round performance, especially for streaming data, and very high capacity capabilities.
Cons: » Requires a lot of drives.
» Capacity of one drive in each RAID 5 set is lost to parity.
» Slightly more expensive than RAID 5 due to this lost capacity.
RAID 60 (Striping with Dual Parity)
RAID 60 (sometimes referred to as RAID 6+0) combines multiple RAID 6 sets (striping with dual parity) with RAID 0
(striping) (Figures 11 and 12). Dual parity allows the failure of two drives in each RAID 6 array while striping
increases capacity and performance without adding drives to each RAID 6 array.

Like RAID 50, a RAID 60 configuration can accommodate 8 or more drives, but should only be used with
configurations of more than 16 drives. The usable capacity of RAID 60 is between 50%-88%, depending on the
number of data drives in the RAID set.

Note that all of the above multiple-leg configurations that are possible with RAID 10 and RAID 50 are also possible
with RAID 60. With 36 drives, for example, you can have a RAID 60 comprising two legs of 18 drives each, or a RAID
60 of three legs with 12 drives in each.

Usage: RAID 60 is similar to RAID 50 but offers more redundancy, making it good for very large capacity servers,
especially those that will not be backed up (i.e. video surveillance servers handling large numbers of cameras).
Pros: » Can sustain two drive failures per RAID 6 array within the set, so it is very safe.
» Very large capacity, and reasonable value for money considering this RAID level won’t be used unless there are a large
number of drives.
Cons: » Requires a lot of drives.
» Slightly more expensive than RAID 50 due to losing more drives to parity calculations.
When to use which RAID level
We can classify data into two basic types: random and streaming. As indicated previously, there are two general
types of RAID arrays: non-parity (RAID 1, 10) and parity (RAID 5, 6, 50, 60).

Random data is generally small in nature (i.e., small blocks), with a large number of small reads and writes making up
the data pattern. This is typified by database-type data.

Streaming data is large in nature, and is characterized by such data types as video, images, general large files.

While it is not possible to accurately determine all of a server’s data usage, and servers often change their usage
patterns over time, the general rule of thumb is that random data is best suited to non-parity RAID, while streaming
data works best and is most cost-effective on parity RAID.

Note that it is possible to set up both RAID types on the same controller, and even possible to set up the same RAID
types on the same set of drives. So if, for example, you have eight 2TB drives, you can make a RAID 10 of 1TB for
your database-type data, and a RAID 5 of the capacity that is left on the drives for your general and/or streaming type
data (approximately 12TB). Having these two different arrays spanning the same drives will not impact performance,
but your data will benefit in performance from being situated on the right RAID level.
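A quick worked example of the carve-up just described, assuming eight 2TB drives; the figures reproduce the roughly 12TB quoted above.

```python
# Worked example: eight 2 TB drives hosting a small RAID 10 for database data
# plus a RAID 5 on the remaining space.

drives = 8
drive_tb = 2.0
raid10_usable_tb = 1.0                           # desired usable size of the RAID 10

# RAID 10 keeps 50% of raw space, so it consumes twice its usable size.
raid10_raw_tb = raid10_usable_tb * 2
raid10_per_drive_tb = raid10_raw_tb / drives     # raw space taken from each drive

# RAID 5 across the leftover space on all eight drives loses one drive's worth to parity.
leftover_per_drive_tb = drive_tb - raid10_per_drive_tb
raid5_usable_tb = (drives - 1) * leftover_per_drive_tb

print(f"RAID 10 uses {raid10_per_drive_tb:.2f} TB per drive")
print(f"RAID 5 usable capacity: {raid5_usable_tb:.2f} TB")   # ~12.25 TB
```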

Drive size performance


While HDDs are becoming larger, they are not getting any faster — a 1TB HDD and a 6TB HDD from the same
product family will have the same performance characteristics. This has an impact when building/rebuilding arrays as
it can take a long time to write all of the missing data to the new replacement drive.

Conversely, SSDs are often faster in larger capacities, so an 80GB SSD and an 800GB SSD from the same product
family will have quite different performance characteristics. This should be checked carefully with the product
specifications from the drive vendor to make sure you are getting the performance you think you are getting from your
drives.

With HDDs it is generally better to create an array with more, rather than fewer, drives. A RAID 5 of three 6TB HDDs
(12TB capacity) will not have the same performance as a RAID 5 array made from five 3TB HDDs (12TB capacity).

With SSDs, however, it is advisable to achieve the capacity required from as few drives as possible by using larger
capacity SSDs. These will have higher throughput than their smaller counterparts and will yield better system
performance.

Size of array vs size of drives


It is a little-known fact that you do not need to use all of your drive capacity when creating a RAID array. When, for
example, creating the RAID array in the controller BIOS, the controller will show you the maximum possible size the
array can be based on the drives chosen to make up the array.

During the creation process, you can change the size of the array to a lesser size. The unused space on the drives
will be available for creating additional RAID arrays.

A good example of this would be when creating a large server and keeping the operating system and data on
separate RAID arrays. Typically you would make a RAID 10 of, say, 200GB for your OS installation spread across all
drives in the server. This would use a minimal amount of capacity from each drive. You can then create a RAID 5 for
your general data across the unused space on the drives.

This has an added benefit of getting around drive size limitations for boot arrays on non-UEFI servers as the OS will
believe it is only dealing with a 200GB drive when installing the operating system.

Rebuild times and large RAID arrays


The more drives in the array, and the larger the HDDs in the array, the longer the rebuild time when a drive fails and
is replaced or a hot-spare kicks in. While it is possible to have 32 drives in a RAID 5 array, it becomes somewhat
impractical to do this with large spinning media.

For example, a RAID 5 made of 32 6TB drives (186TB) will have very poor build and rebuild times due to the size,
speed and number of drives. In this scenario, it would be advisable to build a RAID 50 with two legs from those drives
(180TB capacity). When a drive fails and is replaced, only 16 of the drives (15 existing plus the new drive) will be
involved in the rebuild. This will improve rebuild performance and reduce system performance impact during the
rebuild process.

Note, however, that no matter what you do, when it comes to rebuilding arrays with 6TB+ SATA drives, rebuild times
will increase beyond 24 hours in an absolutely perfect environment (no load on server). In a real-world environment
with a heavily loaded system, the rebuild times will be even longer.
Of course, rebuild times on SSD arrays are dramatically quicker because the drives are smaller and the
write speed of SSDs is much faster than that of their spinning-media counterparts.
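A rough way to reason about these rebuild times is to treat the replacement drive's capacity divided by its sustained write rate as a best-case lower bound; real rebuilds take considerably longer because parity must be read and recomputed from the other drives and the array keeps serving host I/O. The rates below are illustrative assumptions, not vendor figures.

```python
# Rough lower bound on rebuild time: the replacement drive must be rewritten in
# full, so time >= capacity / sustained write rate. Rates here are assumed,
# illustrative values only.

def rebuild_hours(capacity_tb, write_mb_per_s):
    capacity_mb = capacity_tb * 1_000_000
    return capacity_mb / write_mb_per_s / 3600

print(f"6 TB HDD at 150 MB/s: {rebuild_hours(6, 150):.1f} h (idle array, best case)")
print(f"960 GB SSD at 400 MB/s: {rebuild_hours(0.96, 400):.1f} h")
```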

Default RAID settings


When creating a RAID array in the BIOS or management software, you will be presented with defaults that the
controller proposes for the RAID settings. The most important of these is the “stripe size.” While there is much
science, math and general knowledge involved in working out what is the best stripe size for your array, in the vast
majority of cases the defaults work best, so use the 256KB stripe size as suggested by the controller.

SSDs and read/write cache


In an SSD-only RAID array, disabling the read and write cache will improve performance in a vast majority of cases.
However, you may need to test whether enabling read and write cache will improve performance even further. Note that
it is possible to disable and enable read and write cache on the fly without affecting or reconfiguring the array,
or restarting the server, so testing both configurations is recommended.

RAID Level Comparison


RAID 0: minimum 2 drives; no data protection; read performance High (degraded: Low); write performance High
(degraded: N/A); capacity utilization 100%.

RAID 1: minimum 2 drives; protects against a single-drive failure; read performance Medium (degraded: High); write
performance Medium (degraded: Medium); capacity utilization 50%.

RAID 1E: minimum 3 drives; protects against a single-drive failure; read performance Medium (degraded: Medium);
write performance Medium (degraded: Medium); capacity utilization 50%.

RAID 5: minimum 3 drives; protects against a single-drive failure; read performance High (degraded: Medium); write
performance Medium, depending on data type (degraded: Low); capacity utilization 67% - 94%.

RAID 6: minimum 4 drives; protects against a two-drive failure; read performance High (degraded: Low); write
performance Low (degraded: Low); capacity utilization 50% - 88%.

RAID 10: minimum 4 drives; protects against up to one drive failure in each sub-array; read performance High
(degraded: High); write performance High (degraded: High); capacity utilization 50%.

RAID 50: minimum 6 drives; protects against up to one drive failure in each sub-array; read performance High
(degraded: Medium); write performance Medium (degraded: Medium); capacity utilization 67% - 94%.

RAID 60: minimum 8 drives; protects against up to two drive failures in each sub-array; read performance High
(degraded: Low); write performance Medium (degraded: Low); capacity utilization 50% - 88%.

Typical usage, pros, and cons for each level are as described in the individual RAID level sections above.

Types of RAID: Software-Based, Motherboard-Based, and Adapter-Based

Software-based RAID
Description: Included in the OS, such as Windows® and Linux. All RAID functions are handled by the host CPU, which
can severely tax its ability to perform other computations.
Typical usage: Best used for large-block applications such as data warehousing or video streaming, and where servers
have the available CPU cycles to manage the I/O-intensive operations certain RAID levels require.
Pros: Lower cost due to lack of RAID-dedicated hardware.
Cons: Lower RAID performance as the CPU also powers the OS and applications.

Motherboard-based RAID
Description: Processor-intensive RAID operations are off-loaded from the host CPU to a RAID processor integrated
into the motherboard.
Typical usage: Inexpensive.
Pros: Lower cost than adapter-based RAID.
Cons: No ability to upgrade or replace the RAID processor in the event of hardware failure. May only support a few
RAID levels.

Adapter-based RAID
Description: Processor-intensive RAID operations are off-loaded from the host CPU to an external PCIe adapter.
Battery-backed write-back cache can dramatically increase performance without adding risk of data loss.
Typical usage: Best used for small-block applications such as transaction-oriented databases and web servers.
Pros: Offloads RAID tasks from the host system, yielding better performance than software RAID. Controller cards
can be easily swapped out for replacement and upgrades. Data can be backed up to prevent loss in a power failure.
Cons: More expensive than software and integrated RAID.

Types of RAID
Nonredundant Arrays (RAID 0)
An array with RAID 0 includes two or more disk drives and provides data striping,
where data is distributed evenly across the disk drives in equal-sized sections.
However, RAID 0 arrays do not maintain redundant data, so they offer no data
protection.

Compared to an equal-sized group of independent disks, a RAID 0 array provides
improved I/O performance.

Drive segment size is limited to the size of the smallest disk drive in the array. For
instance, an array with two 250 GB disk drives and two 400 GB disk drives can create
a RAID 0 drive segment of 250 GB, for a total of 1000 GB for the volume, as shown
in this figure.

FIGURE F-1 Nonredundant Arrays (RAID 0)

RAID 1 Arrays
A RAID 1 array is built from two disk drives, where one disk drive is a mirror of the
other (the same data is stored on each disk drive). Compared to independent disk
drives, RAID 1 arrays provide improved performance, with twice the read rate and an
equal write rate of single disks. However, capacity is only 50 percent of independent
disk drives.

If the RAID 1 array is built from different-sized disk drives, drive segment size is the
size of the smaller disk drive, as shown in this figure.

FIGURE F-2 RAID 1 Arrays

RAID 1 Enhanced Arrays


A RAID 1 Enhanced (RAID 1E) array--also referred to as a striped mirror--is similar
to a RAID 1 array except that data is both mirrored and striped, and more disk drives
can be included. A RAID 1E array can be built from three or more disk drives.

In this figure, the large bold numbers represent the striped data, and the smaller, non-
bold numbers represent the mirrored data stripes.

FIGURE F-3 RAID 1 Enhanced Arrays

RAID 10 Arrays
A RAID 10 array is built from two or more equal-sized RAID 1 arrays. Data in a
RAID 10 array is both striped and mirrored. Mirroring provides data protection, and
striping improves performance.

Drive segment size is limited to the size of the smallest disk drive in the array. For
instance, an array with two 250 GB disk drives and two 400 GB disk drives can create
two mirrored drive segments of 250 GB, for a total of 500 GB for the array, as shown
in this figure.

FIGURE F-4 RAID 10 Arrays

RAID 5 Arrays
A RAID 5 array is built from a minimum of three disk drives, and uses data striping
and parity data to provide redundancy. Parity data provides data protection, and
striping improves performance.

Parity data is an error-correcting redundancy that’s used to re-create data if a disk
drive fails. In RAID 5 arrays, parity data (represented by Ps in the next figure) is
striped evenly across the disk drives with the stored data.
Drive segment size is limited to the size of the smallest disk drive in the array. For
instance, an array with two 250 GB disk drives and two 400 GB disk drives can
contain 750 GB of stored data and 250 GB of parity data, as shown in this figure.

FIGURE F-5 RAID 5 Arrays
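The drive-segment rule used in these examples (each member contributes a segment no larger than the smallest drive) can be expressed as a small calculation. The sketch below reproduces the RAID 0, RAID 10, and RAID 5 figures quoted above for a mix of 250 GB and 400 GB drives; the function covers only the levels discussed here.

```python
# Sketch of the drive-segment rule used in these examples: each member
# contributes a segment no larger than the smallest drive in the array.

def usable_gb(drive_sizes_gb, raid_level):
    segment = min(drive_sizes_gb)       # segment size is capped by the smallest drive
    n = len(drive_sizes_gb)
    if raid_level == 0:
        return segment * n              # all segments hold data
    if raid_level in (1, 10):
        return segment * n // 2         # half the segments are mirrors
    if raid_level == 5:
        return segment * (n - 1)        # one segment's worth of parity
    raise ValueError("level not covered in this sketch")

print(usable_gb([250, 250, 400, 400], 0))    # 1000 GB, as in the RAID 0 example
print(usable_gb([250, 250, 400, 400], 10))   # 500 GB, as in the RAID 10 example
print(usable_gb([250, 250, 400, 400], 5))    # 750 GB, as in the RAID 5 example
```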

RAID 5EE Arrays


A RAID 5EE array--also referred to as hot space--is similar to a RAID 5 array
except that it includes a distributed spare drive and must be built from a minimum of
four disk drives.

Unlike a hot-spare, a distributed spare is striped evenly across the disk drives with the
stored data and parity data, and can’t be shared with other logical disk drives. A
distributed spare improves the speed at which the array is rebuilt following a disk
drive failure.

A RAID 5EE array protects your data and increases read and write speeds. However,
capacity is reduced by two disk drives’ worth of space, which is for parity data and
spare data.

In this example, S represents the distributed spare and P represents the distributed parity
data.

FIGURE F-6 RAID 5EE Arrays

RAID 50 Arrays
A RAID 50 array is built from at least six disk drives configured as two or more
RAID 5 arrays, and stripes stored data and parity data across all disk drives in both
RAID 5 arrays. (For more information, see RAID 5 Arrays.)

The parity data provides data protection, and striping improves performance. RAID
50 arrays also provide high data transfer speeds.

Drive segment size is limited to the size of the smallest disk drive in the array. For
example, three 250 GB disk drives and three 400 GB disk drives comprise two equal-
sized RAID 5 arrays with 500 GB of stored data and 250 GB of parity data. The
RAID 50 array can therefore contain 1000 GB (2 x 500 GB) of stored data and 500
GB of parity data.

FIGURE F-7 RAID 50 Arrays


In this example, P represents the distributed parity data.


RAID 6 Arrays
A RAID 6 array--also referred to as dual drive failure protection--is similar to a
RAID 5 array because it uses data striping and parity data to provide redundancy.
However, RAID 6 arrays include two independent sets of parity data instead of one.
Both sets of parity data are striped separately across all disk drives in the array.

RAID 6 arrays provide extra protection for your data because they can recover from
two simultaneous disk drive failures. However, the extra parity calculation slows
performance (compared to RAID 5 arrays).

RAID 6 arrays must be built from at least four disk drives. Maximum stripe size
depends on the number of disk drives in the array.

FIGURE F-8 RAID 6 Arrays


RAID 60 Arrays
Similar to a RAID 50 array (see RAID 50 Arrays), a RAID 60 array--also referred to
as dual drive failure protection-- is built from at least eight disk drives configured as
two or more RAID 6 arrays, and stripes stored data and two sets of parity data across
all disk drives in both RAID 6 arrays.

Two sets of parity data provide enhanced data protection, and striping improves
performance. RAID 60 arrays also provide high data transfer speeds.

Selecting the Best RAID Level


Use this table to select the RAID levels that are most appropriate for the arrays on
your storage space, based on the number of available disk drives and your
requirements for performance and reliability.

TABLE F-1 Selecting the Best RAID Level

RAID Level   Redundancy   Disk Drive Usage   Read Performance   Write Performance   Built-in Hot-Spare   Minimum Disk Drives
RAID 0       No           100%               ***                ***                 No                   2
RAID 1       Yes          50%                **                 **                  No                   2
RAID 1E      Yes          50%                **                 **                  No                   3
RAID 10      Yes          50%                **                 **                  No                   4
RAID 5       Yes          67 - 94%           ***                *                   No                   3
RAID 5EE     Yes          50 - 88%           ***                *                   Yes                  4
RAID 50      Yes          67 - 94%           ***                *                   No                   6
RAID 6       Yes          50 - 88%           **                 *                   No                   4
RAID 60      Yes          50 - 88%           **                 *                   No                   8

(Read and write performance are relative ratings: *** = highest, * = lowest.)

Disk drive usage, read performance, and write performance depend on the number of
drives in the array. In general, the more drives, the better the performance.
Migrating RAID Levels
As your storage space changes, you can migrate existing RAID levels to new RAID
levels that better meet your storage needs. You can perform these migrations through
the Sun StorageTek RAID Manager software. For more information, see the Sun
StorageTek RAID Manager Software User’s Guide. TABLE F-2 lists the supported
RAID level migrations.

TABLE F-2 Supported RAID Level Migrations

Existing RAID Level    Supported Migration RAID Levels
Simple volume          RAID 1
RAID 0                 RAID 5, RAID 10
RAID 1                 Simple volume, RAID 0, RAID 5, RAID 10
RAID 5                 RAID 0, RAID 5EE, RAID 6, RAID 10
RAID 6                 RAID 5
RAID 10                RAID 0, RAID 5

RAID 2

 This uses bit level striping, i.e., instead of striping blocks across the disks, it
stripes the bits across the disks.
 In the above diagram b1, b2, b3 are bits. E1, E2, E3 are error correction codes.
 You need two groups of disks: one group of disks is used to write the data, and another
group is used to write the error correction codes.
 This uses Hamming error correction code (ECC), and stores this information on the
redundancy disks.
 When data is written to the disks, it calculates the ECC code for the data on the fly,
stripes the data bits to the data disks, and writes the ECC code to the
redundancy disks.
 When data is read from the disks, it also reads the corresponding ECC code from the
redundancy disks and checks whether the data is consistent. If required, it makes
appropriate corrections on the fly.
 This uses a lot of disks and can be configured in different disk configurations. Some
valid configurations are 1) 10 disks for data and 4 disks for ECC, or 2) 4 disks for data
and 3 disks for ECC.
 This is not used anymore. It is expensive, implementing it in a RAID
controller is complex, and ECC is redundant nowadays, as hard disks
themselves can do this.
RAID 3

 This uses byte level striping, i.e., instead of striping blocks across the disks, it
stripes the bytes across the disks.
 In the above diagram B1, B2, B3 are bytes. p1, p2, p3 are parities.
 Uses multiple data disks, and a dedicated disk to store parity.
 The disks have to spin in sync to get to the data.
 Sequential reads and writes have good performance.
 Random reads and writes have the worst performance.
 This is not commonly used.

RAID 4
 This uses block level striping.
 In the above diagram B1, B2, B3 are blocks. p1, p2, p3 are parities.
 Uses multiple data disks, and a dedicated disk to store parity.
 Minimum of 3 disks (2 disks for data and 1 for parity)
 Good random reads, as the data blocks are striped.
 Bad random writes, as for every write, it has to write to the single parity disk.
 It is somewhat similar to RAID 3 and RAID 5, but a little different.
 This is just like RAID 3 in having the dedicated parity disk, but this stripes blocks.
 This is just like RAID 5 in striping the blocks across the data disks, but this has only
one parity disk.
 This is not commonly used.

RAID 6
 Just like RAID 5, this does block level striping. However, it uses dual parity.
 In the above diagram A, B, C are blocks; p1, p2, p3 are parities.
 This creates two parity blocks for each data block.
 Can handle two disk failures.
 This RAID configuration is complex to implement in a RAID controller, as it has to
calculate two parity values for each data block.

RAID 5 Q&A

Q: What is the definition of a "RAID 5" volume?
A: "RAID 5" refers to a "Redundant Array of Inexpensive (or Independent) Disks" that have been
established in a Level 5, or striped with parity, volume set. A RAID 5 volume is a combination of hard drives
that are configured for data to be written across three (3) or more drives.

Q: What is "parity" or "parity data"?


A: In a RAID 5 configuration, additional data is written to the disk that should allow the volume to be rebuilt
in the event that a single drive fails. In the event that a single drive does fail, the volume continues to
operate in a "degraded" state (no fault tolerance). Once the failed drive is replaced with a new hard drive (of
the same or higher capacity), the "parity data" is used to rebuild the contents of the failed drive on the new
one.

Q: What are the minimum drive requirements to create a RAID 5 volume?


A: RAID 5 volume sets require a minimum of three (3) hard drives (preferably of the same capacity)
to create and maintain a RAID 5 volume. If one drive is of a lower capacity than the others, the RAID
controller (whether hardware or software) will treat every hard drive in the array as though it were of the
same lower capacity and will establish the volume accordingly.

Q: What are the differences between "hardware" and "software" RAID 5 configurations?
A: With a software-based RAID 5 volume, the hard disk drives use a standard drive controller and a software
utility provides the management of the drives in the volume. A RAID 5 volume that relies on hardware for
management will have a physical controller (commonly built into the motherboard, but it can also be a
stand-alone expansion card) that provides for the reading and writing of data across the hard drives in the
volume.

Q: What are the advantages of RAID 5 volumes?


A: A RAID 5 volume provides faster data access and fault tolerance, or protection against one of the drives
failing during use. With a RAID 5 disk volume, information is striped (or written) across all of the drives in
the array along with parity data. If one of the hard drives in the array becomes corrupted, drops out of a
ready state or otherwise fails, the remaining hard drives will continue to operate as a striped volume with no
parity and with no loss of data. The failed drive can be replaced in the array with one of equal or larger
capacity, and the data it contained will be automatically rebuilt using the parity data contained on the other
drives. Establishing a RAID 5 volume requires 3 disk drives as a minimum requirement.
Q: What are the disadvantages of RAID 5 configurations?
A: There are several disadvantages. RAID 5 results in the loss of storage capacity equivalent to the capacity
of one hard drive from the volume. For example, three 500GB hard drives added together comprise 1500GB
(or roughly about 1.5 terabytes) of storage. If the three (3) 500GB drives were established as a RAID 0
(striped) configuration, total data storage would equal 1500GB of capacity. If these same three (3) drives are
configured as a RAID 5 volume (striped with parity), the usable data storage capacity would be 1000GB and
not 1500GB, since 500GB (the equivalent of one drive's capacity) would be utilized for parity. In addition, if
two (2) or more drives fail or become corrupted at the same time, all data on the volume would be
inaccessible to the user.

Q: Can data be recovered from a re-formatted RAID 5 volume?


A: Many times information is still recoverable, depending on how the drives were re-formatted. Re-
formatting a volume using Windows, for example, will create what will appear to be a new "clean" volume -
but the original data will still be on the disk in the "free and available" space. However, a low-level format
(usually performed through an on-board RAID controller utility) will "wipe" or overwrite every single block
on a drive. Unlike an O/S (or "high-level") format, a low-level format normally is slower, takes a
considerable amount of time and destroys the original data.

Q: Can I run recovery software utilities to recover my RAID volume data?


A: The safest approach to data recovery with a RAID volume (or with any media) is to capture every storage
block on each device individually. The resulting drive "images" are then used to help rebuild the original
array structure and recover the necessary files and folders. This approach limits continued interaction with
the media and helps to preserve the integrity of the original device. One of the dangers in using data
recovery software is that it forces the read / write heads to travel repeatedly over areas of the original
media which, if physically damaged, could become further damaged and possibly unrecoverable.

Q: If a RAID 5 volume will not mount, should I allow a "rebuild" to run?


A: If one drive fails in a RAID 5 configuration, the volume still operates - but in a degraded state (it no
longer writes parity information). The important data should be backed up immediately and verified to be
usable before any rebuild operation is started. When it comes to critical data, anything that is used to read
or write to the original volume represents a risk. Is the hardware operating properly? Are all other drives in
the volume functioning correctly? If you are the least bit unsure, a rebuild should not be performed.

Q: If multiple drives fail in a RAID volume all at once, is the data still recoverable?
A: In many cases, the answer is yes. It usually requires that data be recovered from each failed hard drive
individually before attempting to address the rest of the volume. The quality and integrity of the data
recovered will depend on the extent of the damage incurred to each failed storage device.

Non-Redundant (RAID Level 0)

A non-redundant disk array, or RAID level 0, has the lowest cost of any RAID
organization because it does not employ redundancy at all. This scheme offers the best
write performance, since it never needs to update redundant information. Surprisingly, it
does not have the best read performance. Redundancy schemes that duplicate data, such as
mirroring, can perform better on reads by selectively scheduling requests on the disk
with the shortest expected seek and rotational delays. Without redundancy, any single
disk failure will result in data loss. Non-redundant disk arrays are widely used in
supercomputing environments where performance and capacity, rather than
reliability, are the primary concerns.

Sequential blocks of data are written across multiple disks in stripes (figure source: Reference 2).

The size of a data block, which is known as the "stripe width", varies with the
implementation, but is always at least as large as a disk's sector size. When it comes
time to read back this sequential data, all disks can be read in parallel. In a multi-
tasking operating system, there is a high probability that even non-sequential disk
accesses will keep all of the disks working in parallel.

Mirrored (RAID Level 1)

The traditional solution, called mirroring or shadowing, uses twice as many disks as a
non-redundant disk array. Whenever data is written to a disk, the same data is also
written to a redundant disk, so that there are always two copies of the information.
When data is read, it can be retrieved from the disk with the shorter queuing, seek and
rotational delays. If a disk fails, the other copy is used to service requests. Mirroring is
frequently used in database applications where availability and transaction time are
more important than storage efficiency.

source: Reference 2

Memory-Style (RAID Level 2)

Memory systems have provided recovery from failed components with much less cost
than mirroring by using Hamming codes. Hamming codes contain parity for distinct
overlapping subsets of components. In one version of this scheme, four disks require
three redundant disks, one less than mirroring. Since the number of redundant disks is
proportional to the log of the total number of the disks on the system, storage
efficiency increases as the number of data disks increases.

If a single component fails, several of the parity components will have inconsistent
values, and the failed component is the one held in common by each incorrect subset.
The lost information is recovered by reading the other components in a subset,
including the parity component, and setting the missing bit to 0 or 1 to create proper
parity value for that subset. Thus, multiple redundant disks are needed to identify the
failed disk, but only one is needed to recover the lost information.

If you are unaware of parity, you can think of the redundant disk as having the sum of
all data on the other disks. When a disk fails, you can subtract all the data on the good
disks from the parity disk; the remaining information must be the missing
information. Parity is simply this sum modulo 2.

A RAID 2 system would normally have as many data disks as the word size of the
computer, typically 32. In addition, RAID 2 requires the use of extra disks to store an
error-correcting code for redundancy. With 32 data disks, a RAID 2 system would
require 7 additional disks for a Hamming-code ECC. Such an array of 39 disks was
the subject of a U.S. patent granted to Unisys Corporation in 1988, but no commercial
product was ever released.

For a number of reasons, including the fact that modern disk drives contain their own
internal ECC, RAID 2 is not a practical disk array scheme.

source: Reference 2

Bit-Interleaved Parity (RAID Level 3)

One can improve upon memory-style ECC disk arrays by noting that, unlike memory
component failures, disk controllers can easily identify which disk has failed. Thus,
one can use a single parity rather than a set of parity disks to recover lost information.
In a bit-interleaved, parity disk array, data is conceptually interleaved bit-wise over
the data disks, and a single parity disk is added to tolerate any single disk failure. Each
read request accesses all data disks and each write request accesses all data disks and
the parity disk. Thus, only one request can be serviced at a time. Because the parity
disk contains only parity and no data, the parity disk cannot participate in reads,
resulting in slightly lower read performance than for redundancy schemes that
distribute the parity and data over all disks. Bit-interleaved, parity disk arrays are
frequently used in applications that require high bandwidth but not high I/O rates.
They are also simpler to implement than RAID levels 4, 5, and 6.

Here, the parity disk is written in the same way as the parity bit in normal Random
Access Memory (RAM), where it is the Exclusive Or of the 8, 16 or 32 data bits. In
RAM, parity is used to detect single-bit data errors, but it cannot correct them because
there is no information available to determine which bit is incorrect. With disk drives,
however, we rely on the disk controller to report a data read error. Knowing which
disk's data is missing, we can reconstruct it as the Exclusive Or (XOR) of all
remaining data disks plus the parity disk.

source: Reference 2

As a simple example, suppose we have 4 data disks and one parity disk. The sample
bits are:

Disk 0   Disk 1   Disk 2   Disk 3   Parity
0        1        1        1        1

The parity bit is the XOR of these four data bits, which can be calculated by adding
them up and writing a 0 if the sum is even and a 1 if it is odd. Here the sum of Disk 0
through Disk 3 is "3", so the parity is 1. Now if we attempt to read back this data, and
find that Disk 2 gives a read error, we can reconstruct Disk 2 as the XOR of all the
other disks, including the parity. In the example, the sum of Disk 0, 1, 3 and Parity is
"3", so the data on Disk 2 must be 1.
Block-Interleaved Parity (RAID Level 4)

The block-interleaved, parity disk array is similar to the bit-interleaved, parity disk
array except that data is interleaved across disks of arbitrary size rather than in bits.
The size of these blocks is called the striping unit. Read requests smaller than the
striping unit access only a single data disk. Write requests must update the requested
data blocks and must also compute and update the parity block. For large writes that
touch blocks on all disks, parity is easily computed by exclusive-or'ing the new data
for each disk. For small write requests that update only one data disk, parity is
computed by noting how the new data differs from the old data and applying those
differences to the parity block. Small write requests thus require four disk I/Os: one to
write the new data, two to read the old data and old parity for computing the new
parity, and one to write the new parity. This is referred to as a read-modify-write
procedure. Because a block-interleaved, parity disk array has only one parity disk,
which must be updated on all write operations, the parity disk can easily become a
bottleneck. Because of this limitation, the block-interleaved distributed parity disk
array is universally preferred over the block-interleaved, parity disk array.

source: Reference 2
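The read-modify-write parity update described above reduces to new parity = old parity XOR old data XOR new data, applied byte-wise to the block being updated. A minimal sketch, using made-up byte values for a single block:

```python
# Sketch of the small-write read-modify-write step described above:
# new_parity = old_parity XOR old_data XOR new_data, applied byte-wise.

def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    return bytes(op ^ od ^ nd for op, od, nd in zip(old_parity, old_data, new_data))

old_data   = bytes([0b1010] * 4)
new_data   = bytes([0b0110] * 4)
old_parity = bytes([0b0011] * 4)     # parity of the whole stripe before the write

new_parity = updated_parity(old_data, new_data, old_parity)
print(new_parity.hex())
# Four I/Os in total: read old data, read old parity, write new data, write new parity.
```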

Block-Interleaved Distributed-Parity (RAID Level 5)

The block-interleaved distributed-parity disk array eliminates the parity disk
bottleneck present in the block-interleaved parity disk array by distributing the parity
uniformly over all of the disks. An additional, frequently overlooked advantage to
distributing the parity is that it also distributes data over all of the disks rather than
over all but one. This allows all disks to participate in servicing read operations in
contrast to redundancy schemes with dedicated parity disks in which the parity disk
cannot participate in servicing read requests. Block-interleaved distributed-parity disk
arrays have the best small read, large read, and large write performance of any redundant disk array.
Small write requests are somewhat inefficient compared with redundancy schemes
such as mirroring however, due to the need to perform read-modify-write operations
to update parity. This is the major performance weakness of RAID level 5 disk arrays.

The exact method used to distribute parity in block-interleaved distributed-parity disk
arrays can affect performance. The following figure illustrates left-symmetric parity
distribution.

Each square corresponds to a stripe unit. Each column of squares corresponds to a disk.
P0 computes the parity over stripe units 0, 1, 2 and 3; P1 computes parity over stripe
units 4, 5, 6, and 7, etc. (source: Reference 1)

A useful property of the left-symmetric parity distribution is that whenever you
traverse the striping units sequentially, you will access each disk once before
accessing any disk twice. This property reduces disk conflicts when servicing large
requests.

source: Reference 2
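For illustration, the sketch below generates a left-symmetric layout for five disks; the first row matches the P0/P1 description above, and walking the stripe units in order visits each disk once before revisiting any. The layout function is a simplified model, not a controller implementation.

```python
# Minimal sketch of the left-symmetric layout discussed above: parity rotates
# one disk per stripe, and data units start on the disk just after parity.

def left_symmetric(num_stripes, num_disks):
    grid = []
    unit = 0
    for s in range(num_stripes):
        parity_disk = (num_disks - 1 - s) % num_disks
        row = [None] * num_disks
        row[parity_disk] = f"P{s}"
        for i in range(num_disks - 1):
            disk = (parity_disk + 1 + i) % num_disks   # data follows the parity position
            row[disk] = str(unit)
            unit += 1
        grid.append(row)
    return grid

for row in left_symmetric(5, 5):
    print("  ".join(f"{cell:>3}" for cell in row))
```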

P+Q redundancy (RAID Level 6)


Parity is a redundancy code capable of correcting any single, self-identifying failure.
As large disk arrays are considered, multiple failures are possible and stronger codes
are needed. Moreover, when a disk fails in parity-protected disk array, recovering the
contents of the failed disk requires successfully reading the contents of all non-failed
disks. The probability of encountering an uncorrectable read error during recovery
can be significant. Thus, applications with more stringent reliability requirements
require stronger error correcting codes.

One such scheme, called P+Q redundancy, uses Reed-Solomon codes to protect
against up to two disk failures using the bare minimum of two redundant disks.
The P+Q redundant disk arrays are structurally very similar to the block-interleaved
distributed-parity disk arrays and operate in much the same manner. In particular,
P+Q redundant disk arrays also perform small write operations using a read-modify-
write procedure, except that instead of four disk accesses per write request, P+Q
redundant disk arrays require six disk accesses due to the need to update both the `P'
and `Q' information.

Striped Mirrors (RAID Level 10)


RAID 10 was not mentioned in the original 1988 article that defined RAID 1 through
RAID 5. The term is now used to mean the combination of RAID 0 (striping) and
RAID 1 (mirroring). Disks are mirrored in pairs for redundancy and improved
performance, then data is striped across multiple disks for maximum performance. In
the diagram below, Disks 0 & 2 and Disks 1 & 3 are mirrored pairs.

Obviously, RAID 10 uses more disk space to provide redundant data than RAID 5.
However, it also provides a performance advantage by reading from all disks in
parallel while eliminating the write penalty of RAID 5. In addition, RAID 10 gives
better performance than RAID 5 while a failed drive remains unreplaced. Under
RAID 5, each attempted read of the failed drive can be performed only by reading all
of the other disks. On RAID 10, a failed disk can be recovered by a single read of its
mirrored pair.

source: Reference 2
A tool to calculate storage efficiency given the number of disks and the RAID level is available in Reference 3.
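In the same spirit as that tool, here is a small storage-efficiency sketch for the common levels discussed in this document; the formulas assume equal-sized drives and, for RAID 50/60, equal-sized legs.

```python
# Storage efficiency (usable fraction of raw capacity) for common RAID levels,
# assuming equal-sized drives and equal-sized legs.

def efficiency(level, n, legs=1):
    """n = total drives; legs only matters for RAID 50/60."""
    if level == 0:
        return 1.0
    if level in (1, 10):
        return 0.5
    if level == 5:
        return (n - 1) / n
    if level == 6:
        return (n - 2) / n
    if level == 50:
        return (n - legs) / n        # one parity drive per RAID 5 leg
    if level == 60:
        return (n - 2 * legs) / n    # two parity drives per RAID 6 leg
    raise ValueError("unsupported level in this sketch")

print(f"RAID 5,  8 drives:          {efficiency(5, 8):.0%}")        # 88%
print(f"RAID 6,  8 drives:          {efficiency(6, 8):.0%}")        # 75%
print(f"RAID 50, 12 drives, 2 legs: {efficiency(50, 12, 2):.0%}")   # 83%
print(f"RAID 60, 16 drives, 2 legs: {efficiency(60, 16, 2):.0%}")   # 75%
```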

RAID Systems Need Tape Backups

It is worth remembering an important point about RAID systems. Even when you use
a redundancy scheme like mirroring or RAID 5 or RAID 10, you must still do regular
tape backups of your system. There are several reasons for insisting on this, among
them:

 RAID does not protect you from multiple disk failures. While one disk is off
line for any reason, your disk array is not fully redundant.
 Regular tape backups allow you to recover from data loss that is not related to a
disk failure. This includes human errors, hardware errors, and software errors.

RAID Level Selection Considerations

There are three important considerations when making a selection as to which RAID level is to be used for a
system: cost, performance, and reliability.

There are many different ways to measure these parameters; for example, performance could
be measured as I/Os per second per dollar, bytes per second, or response time. We
could also compare systems at the same cost, the same total user capacity, the same
performance or the same reliability. The method used largely depends on the
application and the reason to compare. For example, in transaction processing
applications the primary base for comparison would be I/Os per second per dollar
while in scientific applications we would be more interested in bytes per second per
dollar. In some heterogeneous systems like file servers both I/O per second and bytes
per second may be important. Sometimes it is important to consider reliability as the
base for comparison.

Taking a closer look at the RAID levels, we observe that most of the levels are similar
to each other. RAID level 1 and RAID level 3 disk arrays can be viewed as subclasses
of RAID level 5 disk arrays, while RAID level 2 and RAID level 4 disk arrays are
generally found to be inferior to RAID level 5 disk arrays. Hence the problem of
selecting among RAID levels 1 through 5 is a subset of the more general problem of
choosing an appropriate parity group size and striping unit for RAID level 5 disk
arrays.

Some Comparisons
Given below is a table that compares the throughput of various redundancy schemes
for four types of I/O requests: small reads, small writes, large reads, and large writes.
Remembering that the data has been spread over multiple disks (data striping), a small
request refers to an I/O request of one striping unit, while a large request refers to one
full stripe (one striping unit from each disk in an error-correction group).

RAID Type       Small Read   Small Write      Large Read   Large Write   Storage Efficiency
RAID Level 0    1            1                1            1             1
RAID Level 1    1            1/2              1            1/2           1/2
RAID Level 3    1/G          1/G              (G-1)/G      (G-1)/G       (G-1)/G
RAID Level 5    1            max(1/G, 1/4)    1            (G-1)/G       (G-1)/G
RAID Level 6    1            max(1/G, 1/6)    1            (G-2)/G       (G-2)/G

G : The number of disks in an error correction group.
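As a rough illustration of the storage-efficiency column, a minimal sketch of such a calculation follows; it is an assumption built directly from the fractions in the table above, not the tool from Reference 3.

# Minimal sketch of a storage-efficiency calculator based on the table above;
# G is the number of disks in an error-correction group. The returned fraction
# is the usable share of raw capacity for each RAID level.

def storage_efficiency(raid_level: int, g: int) -> float:
    if raid_level == 0:
        return 1.0
    if raid_level == 1:
        return 0.5
    if raid_level in (3, 5):                 # one parity disk per group
        return (g - 1) / g
    if raid_level == 6:                      # two redundant disks per group
        return (g - 2) / g
    raise ValueError(f"unsupported RAID level: {raid_level}")

# Example: a RAID 5 group of 6 disks stores user data on 5/6 of its raw capacity.
print(storage_efficiency(5, 6))   # 0.833...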

The table above tabulates the maximum throughput per dollar relative to RAID level 0
for RAID levels 0, 1, 3, 5 and 6. For practical purposes we consider RAID levels 2 and
4 inferior to RAID level 5 disk arrays, so their comparisons are not shown. The cost of
a system is taken to be directly proportional to the number of disks in its disk array.
The table thus shows that, given RAID level 0 and RAID level 1 systems of equivalent
cost, the RAID level 1 system can sustain half the number of small writes per second
that the RAID level 0 system can. Equivalently, small writes are twice as expensive in
a RAID level 1 system as in a RAID level 0 system.
The table also shows the storage efficiency of each RAID level. The storage efficiency
is approximately the inverse of the cost of each unit of user capacity relative to a
RAID level 0 system, and it is equal to the performance/cost metric for large writes.

source: Reference 1

The figures above graph the performance/cost metrics from the table above for RAID
levels 1, 3, 5 and 6 over a range of parity group sizes. The performance/cost of RAID
level 1 systems is equivalent to the performance/cost of RAID level 5 systems when
the parity group size is equal to 2. The performance/cost of RAID level 3 systems is
always less than or equal to the performance/cost of RAID level 5 systems. This is
expected given that a RAID level 3 system is a subclass of RAID level 5 systems
derived by restricting the striping unit size such that all requests access exactly a
parity stripe of data. Since the configuration of RAID level 5 systems is not subject to
such a restriction, the performance/cost of RAID level 5 systems can never be less
than that of an equivalent RAID level 3 system. Of course such generalizations are
specific to the models of disk arrays used in the above experiments. In reality, a
specific implementation of a RAID level 3 system can have better performance/cost
than a specific implementation of a RAID level 5 system.

The question of which RAID level to use is better expressed as the more general
configuration question of choosing an appropriate parity group size and striping unit.
For a parity group size of 2, mirroring is desirable, while for a very small striping unit
RAID level 3 is better suited.

The figure below plots the performance/cost metrics from the table above for RAID
levels 3, 5 and 6.

Reliability of any I/O system has become as important as its performance and cost.
This part of the tutorial:

• Reviews the basic reliability provided by a block-interleaved parity disk array
• Lists and discusses three factors that can determine the potential reliability of
disk arrays.

Redundancy in disk arrays is motivated by the need to protect against disk failures.
Two key factors, MTTF (Mean-Time-to-Failure) and MTTR (Mean-Time-to-Repair),
are of primary concern in estimating the reliability of any disk array. Following are
some formulae for the mean time between failures of the array:

RAID level 5:

                 MTTF(disk)^2
    --------------------------------
        N * (G-1) * MTTR(disk)

Disk array with two redundant disks per parity group (e.g., P+Q redundancy):

                 MTTF(disk)^3
    ----------------------------------------
      N * (G-1) * (G-2) * MTTR(disk)^2

where:
    N - total number of disks in the system
    G - number of disks in the parity group
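A minimal sketch of these two formulae in Python; the function names and the example figures (disk MTTF, repair time, array size) are illustrative assumptions, not values from the reference.

# Minimal sketch of the two mean-time-to-data-loss formulas above.
# mttf and mttr are per-disk values in hours; n is the total number of disks,
# g is the number of disks in a parity group.

def mttdl_raid5(mttf: float, mttr: float, n: int, g: int) -> float:
    """Mean time until a double disk failure loses data in a RAID 5 array."""
    return mttf ** 2 / (n * (g - 1) * mttr)

def mttdl_pq(mttf: float, mttr: float, n: int, g: int) -> float:
    """Mean time until a triple disk failure loses data in a P+Q (RAID 6) array."""
    return mttf ** 3 / (n * (g - 1) * (g - 2) * mttr ** 2)

# Example: 100 disks in groups of 10, 500,000-hour disk MTTF, 24-hour repair time.
hours_per_year = 24 * 365
print(mttdl_raid5(500_000, 24, 100, 10) / hours_per_year)   # roughly 1,300 years
print(mttdl_pq(500_000, 24, 100, 10) / hours_per_year)      # roughly 3.4e7 years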

Factors affecting Reliability


Three factors that can dramatically affect the reliability of disk arrays are:

• System crashes
• Uncorrectable bit-errors
• Correlated disk failures

System Crashes
System crash refers to any event such as a power failure, operator error, hardware
breakdown, or software crash that can interrupt an I/O operation to a disk array.

Such crashes can interrupt write operations, resulting in states where the data is
updated and the parity is not updated or vice versa. In either case, parity is
inconsistent and cannot be used in the event of a disk failure. Techniques such
as redundant hardware and power supplies can be applied to make such crashes less
frequent.

System crashes can cause parity inconsistencies in both bit-interleaved and block-
interleaved disk arrays, but the problem is of practical concern only in block-
interleaved disk arrays.

For reliability purposes, system crashes in block-interleaved disk arrays are similar
to disk failures in that they may result in the loss of the correct parity for stripes
that were modified during the crash.

Uncorrectable bit-errors
Most uncorrectable bit-errors are generated because data is incorrectly written or
gradually damaged as the magnetic media ages. These errors are detected only
when we attempt to read the data.

Our interpretation of uncorrectable bit error rates is that they represent the rate at
which errors are detected during reads from the disk during the normal
operation of the disk drive.

One approach that can be used with or without redundancy is to try to protect
against bit errors by predicting when a disk is about to fail. VAXsimPLUS, a
product from DEC, monitors the warnings issued by disks and notifies an operator
when it feels the disk is about to fail.

Correlated disk failures


Causes: Common environmental and manufacturing factors.

For example, an accident might sharply increase the failure rate for all disks in a disk
array for a short period of time. In general, power surges, power failures, and simply
switching the disks on and off can place stress on the electrical components of all
affected disks. Disks also share common support hardware; when this hardware fails,
it can lead to multiple, simultaneous disk failures.

Disks are generally more likely to fail either very early or very late in their lifetimes.

Early failures are frequently caused by transient defects which may not have been
detected during the manufacturer's burn-in process.
Late failures occur when a disk wears out.

Correlated disk failures greatly reduce the reliability of disk arrays by making it much
more likely that an initial disk failure will be closely followed by additional disk
failures before the failed disk can be reconstructed.

Mean-Time-To-Data-Loss (MTTDL)
Following are some formulae to calculate the mean-time-to-data-loss (MTTDL). In a
block-interleaved parity-protected disk array, data loss is possible in the following
three common ways:

• double disk failure
• system crash followed by a disk failure
• disk failure followed by an uncorrectable bit error during reconstruction

These three failure modes are the hardest failure combinations, in that we currently
do not have any techniques to protect against them without sacrificing performance.

RAID Level 5

    Double Disk Failure:
                     MTTF(disk)^2
        -----------------------------
          N * (G-1) * MTTR(disk)

    System Crash + Disk Failure:
          MTTF(system) * MTTF(disk)
        -----------------------------
              N * MTTR(system)

    Disk Failure + Bit Error:
                   MTTF(disk)
        -----------------------------
          N * (1 - p(disk)^(G-1))

    Software RAID: harmonic sum of the above
    Hardware RAID: harmonic sum of the above, excluding system crash + disk failure

Failure Characteristics for RAID Level 5 Disk Arrays (source: Reference 1)

P+Q disk array

    Triple Disk Failure:
                      MTTF(disk)^3
        ----------------------------------------------
          N * (G-1) * (G-2) * MTTR(disk)^2

    System Crash + Disk Failure:
          MTTF(system) * MTTF(disk)
        ----------------------------------------------
               N * MTTR(system)

    Double Disk Failure + Bit Error:
                      MTTF(disk)^2
        ----------------------------------------------
          N * (G-1) * (1 - p(disk)^(G-2)) * MTTR(disk)

    Software RAID: harmonic sum of the above
    Hardware RAID: harmonic sum of the above, excluding system crash + disk failure

Failure characteristics for a P+Q disk array (source: Reference 1)

p(disk) = the probability of reading all sectors on a disk (derived from disk size,
sector size, and BER)
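A minimal sketch of the RAID level 5 failure characteristics above, combining the three failure modes with a harmonic sum; the function and parameter names, and the example values, are illustrative assumptions.

# Minimal sketch of the RAID level 5 failure characteristics above. All times
# are in hours; p_disk is the probability of reading every sector on a disk
# without an uncorrectable error. Names and example values are illustrative.

def harmonic_sum(times: list[float]) -> float:
    """Combine independent failure modes: 1 / (1/t1 + 1/t2 + ...)."""
    return 1.0 / sum(1.0 / t for t in times)

def raid5_mttdl(mttf_disk, mttr_disk, mttf_sys, mttr_sys, p_disk, n, g,
                hardware_raid=False):
    double_disk = mttf_disk ** 2 / (n * (g - 1) * mttr_disk)
    crash_plus_disk = mttf_sys * mttf_disk / (n * mttr_sys)
    disk_plus_bit_error = mttf_disk / (n * (1 - p_disk ** (g - 1)))
    modes = [double_disk, disk_plus_bit_error]
    if not hardware_raid:          # per the table above, hardware RAID excludes this mode
        modes.append(crash_plus_disk)
    return harmonic_sum(modes)

# Example: 100 disks, groups of 10, 500,000 h disk MTTF, 24 h disk repair,
# one system crash every 720 h repaired in 1 h, p(disk) = 0.99.
print(raid5_mttdl(500_000, 24, 720, 1, 0.99, 100, 10))
print(raid5_mttdl(500_000, 24, 720, 1, 0.99, 100, 10, hardware_raid=True))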

Tool for reliability calculations using the above equations (source: Reference 3)

Redundant array of independent disks (RAID) is a storage technology used to
improve the processing capability of storage systems. This technology is designed to
provide reliability in disk array systems and to take advantage of the performance
gains offered by an array of multiple disks over single-disk storage.

RAID’s two primary underlying concepts are:

• distributing data over multiple hard drives improves performance
• using multiple drives properly allows for any one drive to fail without loss of
data and without system downtime

In the event of a disk failure, disk access continues normally and the failure is
transparent to the host system.

Logical Drive
A logical drive is an array of independent physical drives. Increased availability,
capacity, and performance are achieved by creating logical drives. The logical drive
appears to the host the same as a local hard disk drive does.
FIGURE A-1 Logical Drive Including Multiple Physical Drives

Logical Volume
A logical volume is composed of two or more logical drives. The logical volume can
be divided into a maximum of 32 partitions for Fibre Channel. During operation, the
host sees a nonpartitioned logical volume or a partition of a logical volume as one
single physical drive.

Local Spare Drive


A local spare drive is a standby drive assigned to serve one specified logical drive.
When a member drive of this specified logical drive fails, the local spare drive
becomes a member drive and automatically starts to rebuild.

Global Spare Drive


A global spare drive is not dedicated to a single logical drive. When a member
drive from any of the logical drives fails, the global spare drive joins that logical drive
and automatically starts to rebuild.

Channels
You can connect up to 15 devices (excluding the controller itself) to a SCSI channel
when the Wide function is enabled (16-bit SCSI). You can connect up to 125 devices
to an FC channel in loop mode. Each device has a unique ID that identifies the device
on the SCSI bus or FC loop.

A logical drive consists of a group of SCSI drives, Fibre Channel drives, or SATA
drives. Physical drives in one logical drive do not have to come from the same SCSI
channel. Also, each logical drive can be configured for a different RAID level.

A drive can be assigned as the local spare drive to one specified logical drive, or as a
global spare drive. A spare is not available for logical drives that have no data
redundancy (RAID 0).

FIGURE A-2 Allocation of Drives in Logical Drive Configurations

You can divide a logical drive or logical volume into several partitions or
use the entire logical drive as a single partition.
FIGURE A-3 Partitions in Logical Drive Configurations

Each partition is mapped to LUNs under host SCSI IDs or IDs on host channels. Each
SCSI ID/LUN acts as one individual hard drive to the host computer.
FIGURE A-4 Mapping Partitions to Host ID/LUNs
FIGURE A-5 Mapping Partitions to LUNs Under an ID

RAID Levels
There are several ways to implement a RAID array, using a combination of mirroring,
striping, duplexing, and parity technologies. These various techniques are referred to
as RAID levels. Each level offers a mix of performance, reliability, and cost. Each
level uses a distinct algorithm to implement fault tolerance.

There are several RAID level choices: RAID 0, 1, 3, 5, 1+0, 3+0 (30), and 5+0 (50).
RAID levels 1, 3, and 5 are the most commonly used.

The following table provides a brief overview of the RAID levels.

TABLE A-1 RAID Level Overview

RAID Level   Description                         Number of Drives Supported   Capacity                       Redundancy
0            Striping                            2-36                         N                              No
1            Mirroring                           2                            N/2                            Yes
1+0          Mirroring and striping              4-36 (even number only)      N/2                            Yes
3            Striping with dedicated parity      3-31                         N-1                            Yes
5            Striping with distributed parity    3-31                         N-1                            Yes
3+0 (30)     Striping of RAID 3 logical drives   2-8 logical drives           N - number of logical drives   Yes
5+0 (50)     Striping of RAID 5 logical drives   2-8 logical drives           N - number of logical drives   Yes
Capacity refers to the total number (N) of physical drives available for data storage.
For example, if the capacity is N-1 and the total number of disk drives in the logical
drive is six 36-Gbyte drives, the disk space available for storage is equal to five disk
drives (5 x 36 Gbyte, or 180 Gbyte). The -1 refers to the parity data striped across the
six drives, which provides redundancy and is equal to the size of one of the disk
drives.

For RAID 3+0 (30) and 5+0 (50), capacity refers to the total number of physical
drives (N) minus one physical drive (#) for each logical drive in the volume. For
example, if the total number of disk drives in the logical drive is twenty 36-Gbyte
drives and the total number of logical drives is 2, the disk space available for storage
is equal to 18 disk drives: 18 x 36 Gbyte (648 Gbyte).
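A minimal sketch of the capacity rules from TABLE A-1 and the two worked examples above; the function and parameter names are illustrative assumptions.

# Minimal sketch of the capacity rules in TABLE A-1. drive_size is the size of
# one physical drive (all drives assumed equal); n is the number of physical
# drives; logical_drives applies only to RAID 3+0 and 5+0.

def usable_capacity(raid_level: str, n: int, drive_size: float,
                    logical_drives: int = 1) -> float:
    if raid_level == "0":
        return n * drive_size                       # N
    if raid_level in ("1", "1+0"):
        return n // 2 * drive_size                  # N/2
    if raid_level in ("3", "5"):
        return (n - 1) * drive_size                 # N-1
    if raid_level in ("3+0", "5+0"):
        return (n - logical_drives) * drive_size    # N minus one drive per logical drive
    raise ValueError(f"unsupported RAID level: {raid_level}")

# The examples from the text: six 36-Gbyte drives in RAID 5, and twenty
# 36-Gbyte drives split into two logical drives striped as RAID 5+0.
print(usable_capacity("5", 6, 36))           # 180
print(usable_capacity("5+0", 20, 36, 2))     # 648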

RAID 0
RAID 0 implements block striping, where data is broken into logical blocks and is
striped across several drives. Unlike other RAID levels, there is no facility for
redundancy. In the event of a disk failure, data is lost.

In block striping, the total disk capacity is equivalent to the sum of the capacities of
all drives in the array. This combination of drives appears to the system as a single
logical drive.

RAID 0 provides the highest performance. It is fast because data can be
simultaneously transferred to or from every disk in the array. Furthermore, reads and
writes to separate drives can be processed concurrently.
FIGURE A-6 RAID 0 Configuration
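A minimal sketch of how block striping maps a logical block to a physical drive; the round-robin layout shown is an illustrative assumption, not a specific controller's on-disk format.

# Minimal sketch of block striping: a logical block number is mapped to a
# (disk, offset) position by round-robin distribution across the drives.

def raid0_location(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk_index, block_offset_on_that_disk) for a logical block."""
    return logical_block % num_disks, logical_block // num_disks

# With four disks, logical blocks 0..7 land on disks 0, 1, 2, 3, 0, 1, 2, 3.
for block in range(8):
    print(block, raid0_location(block, 4))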

RAID 1
RAID 1 implements disk mirroring, where a copy of the same data is recorded onto
two drives. By keeping two copies of data on separate disks, data is protected against
a disk failure. If, at any time, a disk in the RAID 1 array fails, the remaining good disk
(copy) can provide all of the data needed, thus preventing downtime.

In disk mirroring, the total usable capacity is equivalent to the capacity of one drive in
the RAID 1 array. Thus, combining two 1-Gbyte drives, for example, creates a single
logical drive with a total usable capacity of 1 Gbyte. This combination of drives
appears to the system as a single logical drive.

Note - RAID 1 does not allow expansion. RAID levels 3 and 5 permit expansion by
adding drives to an existing array.
FIGURE A-7 RAID 1 Configuration

In addition to the data protection that RAID 1 provides, this RAID level also improves
performance. In cases where multiple concurrent I/O is occurring, that I/O can be
distributed between disk copies, thus reducing total effective data access time.

RAID 1+0

RAID 1+0 combines RAID 0 and RAID 1 to offer mirroring and disk striping. RAID
1+0 is a time-saving configuration that enables you to set up a large number of disks
for mirroring in one step. It is not a standard RAID level option that you can
select; it does not appear in the list of RAID level options supported by the controller.
If four or more disk drives are chosen for a RAID 1 logical drive, RAID 1+0 is
performed automatically.
FIGURE A-8 RAID 1+0 Configuration

RAID 3
RAID 3 implements block striping with dedicated parity. This RAID level breaks
data into logical blocks, the size of a disk block, and then stripes these blocks across
several drives. One drive is dedicated to parity. In the event that a disk fails, the
original data can be reconstructed using the parity information and the information on
the remaining disks.

In RAID 3, the total disk capacity is equivalent to the sum of the capacities of all
drives in the combination, excluding the parity drive. Thus, combining four 1-Gbyte
drives, for example, creates a single logical drive with a total usable capacity of 3
Gbyte. This combination appears to the system as a single logical drive.

RAID 3 provides increased data transfer rates when data is being read in small chunks
or sequentially. However, in write operations that do not span every drive,
performance is reduced because the information stored in the parity drive needs to be
recalculated and rewritten every time new data is written, limiting simultaneous I/O.
FIGURE A-9 RAID 3 Configuration
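The write penalty described above comes from recomputing parity on every write that does not span all drives. A minimal sketch of that parity update follows (new parity = old parity XOR old data XOR new data); the names are illustrative assumptions.

# Minimal sketch of the parity recalculation behind the write penalty: for a
# small write, the controller reads the old data and old parity, then stores
# new_parity = old_parity XOR old_data XOR new_data on the parity drive.

def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    return bytes(p ^ od ^ nd for p, od, nd in zip(old_parity, old_data, new_data))

# Toy example with 4-byte blocks.
old_parity = bytes([0x0F, 0x0F, 0x0F, 0x0F])
old_data   = bytes([0x01, 0x02, 0x03, 0x04])
new_data   = bytes([0xFF, 0x00, 0xFF, 0x00])
print(update_parity(old_parity, old_data, new_data).hex())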

RAID 5
RAID 5 implements multiple-block striping with distributed parity. This RAID level
offers redundancy with the parity information distributed across all disks in the array.
Data and its parity are never stored on the same disk. In the event that a disk fails,
original data can be reconstructed using the parity information and the information on
the remaining disks.
FIGURE A-10 RAID 5 Configuration

RAID 5 offers increased data transfer rates when data is accessed in large chunks or
randomly, and reduced data access time during many simultaneous I/O cycles.
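A minimal sketch of how distributed parity can rotate across the disks from stripe to stripe; the rotation scheme shown is an illustrative assumption, and real controllers may use other layouts.

# Minimal sketch of distributed parity placement: in each stripe the parity
# block sits on a different disk, so no single drive becomes a parity bottleneck.

def raid5_layout(stripe: int, num_disks: int) -> dict:
    parity_disk = (num_disks - 1 - stripe) % num_disks
    data_disks = [d for d in range(num_disks) if d != parity_disk]
    return {"stripe": stripe, "parity_disk": parity_disk, "data_disks": data_disks}

# With four disks, parity rotates 3, 2, 1, 0, 3, ... across successive stripes.
for s in range(5):
    print(raid5_layout(s, 4))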

Advanced RAID Levels


Advanced RAID levels require the use of the array’s built-in volume manager. These
combination RAID levels provide the protection benefits of RAID 1, 3, or 5 with the
performance of RAID 0 striping. To use advanced RAID, first create two or more
RAID 1, 3, or 5 arrays, and then join them. The following table provides a description
of the advanced RAID levels.

TABLE A-2 Advanced RAID Levels

RAID Level      Description
RAID 3+0 (30)   RAID 3 logical drives that have been joined together using the array’s built-in volume manager.
RAID 5+0 (50)   RAID 5 logical drives that have been joined together using the array’s volume manager.

Local and Global Spare Drives


The external RAID controllers provide both local spare drive and global spare drive
functions. The local spare drive is used only for one specified drive; the global spare
drive can be used for any logical drive on the array.

The local spare drive always has higher priority than the global spare drive. Therefore,
if a drive fails and both types of spare are available and large enough to replace the
failed drive, the local spare is used.

If there is a failed drive in the RAID 5 logical drive, replace the failed drive with a
new drive to keep the logical drive working. To identify a failed drive, refer to the Sun
StorEdge 3000 Family RAID Firmware User’s Guide for your array.

Caution - If, when trying to remove a failed drive, you mistakenly remove the
wrong drive, you can no longer access the logical drive because you have
incorrectly failed another drive.
A local spare drive is a standby drive assigned to serve one specified logical drive.
When a member drive of this specified logical drive fails, the local spare drive
becomes a member drive and automatically starts to rebuild.

A local spare drive always has higher priority than a global spare drive; that is, if a
drive fails and there is a local spare and a global spare drive available, the local spare
drive is used.

FIGURE A-11 Local (Dedicated) Spare

A global spare drive is available for all logical drives rather than serving only one
logical drive (see FIGURE A-12). When a member drive from any of the logical
drives fails, the global spare drive joins that logical drive and automatically starts to
rebuild.

FIGURE A-12 Global Spare

Having Both Local and Global Spares


In FIGURE A-13, the member drives in logical drive 0 are 9-Gbyte drives, and the
members in logical drives 1 and 2 are all 4-Gbyte drives.
FIGURE A-13 Mixing Local and Global Spares


In FIGURE A-13, it is not possible for the 4-Gbyte global spare drive to join logical
drive 0 because of its insufficient capacity. The 9-Gbyte local spare drive aids logical
drive 0 once a drive in this logical drive fails. If the failed drive is in logical drive 1 or
2, the 4-Gbyte global spare drive immediately aids the failed drive.
source: Reference 1
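A minimal sketch of the spare-selection rule described above: a local spare dedicated to the failed logical drive is preferred over a global spare, and a spare must be at least as large as the failed drive. The class and function names, and the example values, are illustrative assumptions.

# Minimal sketch of spare selection: prefer a local (dedicated) spare over a
# global spare, and require the spare to be at least as large as the failed drive.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Spare:
    name: str
    size_gb: int
    dedicated_to: Optional[int] = None   # logical drive index, or None for a global spare

def pick_spare(failed_logical_drive: int, failed_size_gb: int,
               spares: list) -> Optional[Spare]:
    eligible = [s for s in spares if s.size_gb >= failed_size_gb
                and s.dedicated_to in (None, failed_logical_drive)]
    # Local spares (dedicated to this logical drive) take priority over global ones.
    eligible.sort(key=lambda s: s.dedicated_to is None)
    return eligible[0] if eligible else None

# FIGURE A-13 scenario: a 9-Gbyte local spare for logical drive 0 and a 4-Gbyte global spare.
spares = [Spare("local-0", 9, dedicated_to=0), Spare("global", 4)]
print(pick_spare(0, 9, spares))   # the 9-Gbyte local spare
print(pick_spare(1, 4, spares))   # the 4-Gbyte global spare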
