
Redundant Arrays of Inexpensive Disks (RAID): disk organization techniques that take
advantage of large numbers of inexpensive, mass-market disks.

Originally a cost-effective alternative to large, expensive disks.

Today RAIDs are used for their higher reliability and bandwidth, rather than for
economic reasons. Hence the "I" is interpreted as independent, instead of inexpensive.

Improvement in Performance via Parallelism

Two main goals of parallelism in a disk system:

1. Load balance multiple small accesses to increase throughput

2. Parallelize large accesses to reduce response time

Improve transfer rate by striping data across multiple disks.

Bit-level striping – split the bits of each byte across multiple disks

– In an array of eight disks, write bit i of each byte to disk i.

– Each access can read data at eight times the rate of a single disk.

– But seek/access time is worse than for a single disk.
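
A minimal Python sketch of the eight-disk example above. The function names and the choice of bit 0 as the least significant bit are assumptions made for illustration; the notes do not fix a bit order.

```python
N_DISKS = 8  # one disk per bit of a byte, as in the eight-disk example


def stripe_bits(data: bytes) -> list[list[int]]:
    """Give disk i bit i of every byte (bit 0 = least significant, an assumption)."""
    disks = [[] for _ in range(N_DISKS)]
    for byte in data:
        for i in range(N_DISKS):
            disks[i].append((byte >> i) & 1)
    return disks


def gather_bits(disks: list[list[int]]) -> bytes:
    """Rebuild the bytes; every disk contributes one bit to every byte."""
    n_bytes = len(disks[0])
    return bytes(
        sum(disks[i][k] << i for i in range(N_DISKS)) for k in range(n_bytes)
    )


disks = stripe_bits(b"RAID")
restored = gather_bits(disks)
```

Reading even a single byte touches all eight disks in this sketch, which matches the point that transfer rate improves while each access still has to wait for every disk.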

Block-level striping – with n disks, block i of a file goes to disk ( i mod n) + 1.
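
The block-placement rule above can be sketched in a few lines of Python. The helper names are hypothetical; blocks are numbered from 0 and disks from 1, as the formula implies.

```python
def disk_for_block(i: int, n: int) -> int:
    """Disk number (1-based) that holds logical block i, using (i mod n) + 1."""
    return (i % n) + 1


def layout(num_blocks: int, n: int) -> dict[int, list[int]]:
    """Map each disk number to the list of logical blocks stored on it."""
    disks = {d: [] for d in range(1, n + 1)}
    for i in range(num_blocks):
        disks[disk_for_block(i, n)].append(i)
    return disks


four_disk_layout = layout(8, 4)
```

With four disks, consecutive blocks land on consecutive disks, so a large sequential read can keep all four disks busy at once.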

RAID Levels

Schemes to provide redundancy at lower cost by using disk striping combined with parity
bits

Different RAID organizations, or RAID levels, have differing cost, performance and
reliability characteristics.

RAID 0

In a RAID 0 system, data are split into blocks that get written across all the drives in the
array. By using multiple disks (at least 2) at the same time, RAID 0 offers superior I/O
performance. This performance can be enhanced further by using multiple controllers,
ideally one controller per disk.

Advantages

RAID 0 offers great performance, both in read and write operations. There is no
overhead caused by parity controls.

All storage capacity can be used; there is no disk overhead.

The technology is easy to implement.

Disadvantages

RAID 0 is not fault-tolerant. If one disk fails, all data in the RAID 0 array are lost. It
should not be used on mission-critical systems.

Ideal use

RAID 0 is ideal for non-critical storage of data that have to be read/written at a high
speed, e.g. on a Photoshop image retouching station.

RAID 1: mirroring

Data are stored twice by writing them to both the data disk (or set of data disks) and a
mirror disk (or set of disks). If a disk fails, the controller uses either the data drive or the
mirror drive for data recovery and continues operation. You need at least 2 disks for a
RAID 1 array.

RAID 1 systems are often combined with RAID 0 to improve performance. Such a system
is sometimes referred to by the combined number: a RAID 10 system.

Advantages

RAID 1 offers excellent read speed and a write speed that is comparable to that of a
single disk.

If a disk fails, data do not have to be rebuilt; they just have to be copied to the
replacement disk.

RAID 1 is a very simple technology.

Disadvantages

The main disadvantage is that the effective storage capacity is only half of the total disk
capacity because all data get written twice.

Software RAID 1 solutions do not always allow a hot swap of a failed disk (that is, replacing
it while the server keeps running). Ideally a hardware controller is
used.

Ideal use

RAID 1 is ideal for mission-critical storage, for instance for accounting systems. It is
also suitable for small servers in which only two disks will be used.

RAID 2

Description: Level 2 is the "black sheep" of the RAID family, because it is the only
RAID level that does not use one or more of the "standard" techniques of mirroring,
striping and/or parity. RAID 2 uses something similar to striping with parity, but not the
same as what is used by RAID levels 3 to 7. It is implemented by splitting data at the bit
level and spreading it over a number of data disks and a number of redundancy disks. The
redundant bits are calculated using Hamming codes, a form of error correcting code
(ECC). Each time something is to be written to the array these codes are calculated and
written alongside the data to dedicated ECC disks; when the data is read back these ECC
codes are read as well to confirm that no errors have occurred since the data was written.
If a single-bit error occurs, it can be corrected "on the fly". If this sounds similar to the
way that ECC is used within hard disks today, that's for a good reason: it's pretty much
exactly the same. It's also the same concept used for ECC protection of system memory.
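
The ECC idea described above can be sketched with the smallest practical Hamming code, Hamming(7,4): four data bits protected by three check bits. Treating each of the seven bit positions as a separate disk (four data disks and three ECC disks) is an assumption chosen for illustration; the notes do not say which code size a RAID 2 array would use, and the function names are hypothetical.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword laid out at positions 1..7."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> tuple[list[int], int]:
    """Return (corrected codeword, syndrome); syndrome 0 means no error was seen."""
    syndrome = 0
    for position, bit in enumerate(codeword, start=1):
        if bit:
            syndrome ^= position  # XOR of the positions of all 1 bits
    fixed = list(codeword)
    if syndrome:
        fixed[syndrome - 1] ^= 1  # the syndrome is the position of the flipped bit
    return fixed, syndrome


def hamming74_data(codeword: list[int]) -> list[int]:
    """Extract the 4 data bits (positions 3, 5, 6, 7) from a codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


stored = hamming74_encode([1, 0, 1, 1])
damaged = list(stored)
damaged[4] ^= 1  # simulate a single-bit error at position 5
repaired, error_position = hamming74_correct(damaged)
```

The syndrome identifies which position was wrong, which is what allows a single-bit error to be corrected on the fly rather than merely detected.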

Level 2 is the only RAID level of the ones defined by the original Berkeley document
that is not used today, for a variety of reasons. It is expensive and often requires many
drives. The controller required was complex, specialized and expensive. The performance
of RAID 2 is also rather substandard in transactional environments due to the bit-level
striping. But most of all, level 2 was obviated by the use of ECC within a hard disk;
essentially, much of what RAID 2 provides you now get for "free" within each hard disk,
with other RAID levels providing protection above and beyond ECC.

Due to its cost and complexity, level 2 never really "caught on". Therefore, much of what
is said about it is based upon theoretical analysis, not empirical evidence.

RAID 3

Bit-Interleaved Parity; a single parity bit can be used for error correction, not just
detection, because the controller knows which disk has failed.

– When writing data, the parity bit must also be computed and written

– Faster data transfer than with a single disk, but fewer I/Os per second since every disk
has to participate in every I/O.

– Subsumes Level 2 (provides all its benefits, at lower cost).

On RAID 3 systems, data blocks are subdivided (striped) and written in parallel on two or
more drives. An additional drive stores parity information. You need at least 3 disks for a
RAID 3 array.

Since parity is used, a RAID 3 stripe set can withstand a single disk failure without losing
data or access to data.
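
A short Python sketch of why one parity drive is enough to survive a single disk failure: with XOR parity, the missing chunk is the XOR of everything that survived. The chunk contents and function name are made up for illustration.

```python
from functools import reduce


def xor_parity(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length chunks, one chunk per disk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Three data disks and one parity disk (hypothetical contents).
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_parity(data)

# Disk 1 fails: XOR the surviving data chunks with the parity chunk.
rebuilt = xor_parity([data[0], data[2], parity])
```

The same function computes the parity and performs the rebuild, because XOR-ing a value in twice cancels it out.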

Advantages

RAID 3 provides high throughput (both read and write) for large data transfers.

Disk failures do not significantly slow down throughput.

Disadvantages

This technology is fairly complex and too resource intensive to be done in software.

Performance is slower for random, small I/O operations.

Ideal use

RAID 3 is not that common in prepress.

RAID 4

Block-Interleaved Parity; uses block-level striping, and keeps a parity block on a
separate disk for corresponding blocks from N other disks.

– Provides higher I/O rates for independent block reads than Level 3 (block read goes to a
single disk, so blocks stored on different disks can be read in parallel)

– Provides high transfer rates for reads of multiple blocks

– However, parity block becomes a bottleneck for independent block writes since every
block write also writes to parity disk
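
The bottleneck in the last point can be made concrete with the usual read-modify-write update for a single-block write: new parity = old parity XOR old data XOR new data. This formula is standard for XOR parity but is not stated in the notes, and the function names below are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks: list[bytes]) -> bytes:
    """Parity computed from scratch over every data block in the stripe."""
    parity = bytes(len(blocks[0]))  # all-zero block
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity


def small_write(blocks: list[bytes], parity: bytes, index: int, new_block: bytes) -> bytes:
    """Overwrite one data block in place and return the updated parity block."""
    old_block = blocks[index]                       # read old data (data disk)
    delta = xor_blocks(old_block, new_block)        # bits that change
    new_parity = xor_blocks(parity, delta)          # read old parity (parity disk)
    blocks[index] = new_block                       # write new data (data disk)
    return new_parity                               # caller writes this to the parity disk


blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = full_parity(blocks)
new_parity = small_write(blocks, parity, 1, b"\xaa\xbb")
```

Only one data disk is touched per write, but the parity disk is read and written every time, so independent writes to different data disks still queue up behind it.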

Description: RAID 4 improves performance by striping data across many disks in
blocks, and provides fault tolerance through a dedicated parity disk. This makes it in
some ways the "middle sibling" in a family of close relatives, RAID levels 3, 4 and 5. It
is like RAID 3 except that it uses blocks instead of bytes for striping, and like RAID 5
except that it uses dedicated parity instead of distributed parity. Going from byte to block
striping improves random access performance compared to RAID 3, but the dedicated
parity disk remains a bottleneck, especially for random write performance. Fault
tolerance, format efficiency and many other attributes are the same as for RAID 3 and
RAID 5.

RAID 5

RAID 5 is the most common secure RAID level. It is similar to RAID 3
except that data are transferred to disks by independent read and
write operations (not in parallel). The data chunks that are written are
also larger. Instead of a dedicated parity disk, parity information is
spread across all the drives. You need at least 3 disks for a RAID 5
array.

A RAID 5 array can withstand a single disk failure without losing data
or access to data. Although RAID 5 can be achieved in software, a
hardware controller is recommended. Often extra cache memory is
used on these controllers to improve the write performance.
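
A sketch of how parity can be spread across all the drives. The rotation used here (parity starts on the last disk and moves one disk to the left each stripe) is only one possible arrangement and is an assumption; real controllers use different layouts, and the function names are hypothetical.

```python
def parity_disk(stripe: int, n: int) -> int:
    """0-based index of the disk that holds parity for the given stripe."""
    return (n - 1 - stripe) % n


def stripe_layout(stripe: int, n: int) -> list[str]:
    """Label each disk's block in one stripe: 'P' for parity, 'D<k>' for data block k."""
    p = parity_disk(stripe, n)
    block = stripe * (n - 1)  # each stripe carries n - 1 data blocks
    labels = []
    for disk in range(n):
        if disk == p:
            labels.append("P")
        else:
            labels.append(f"D{block}")
            block += 1
    return labels


rows = [stripe_layout(s, 4) for s in range(4)]
```

Over any four consecutive stripes in this four-disk layout, each disk holds parity exactly once, so parity writes are shared instead of landing on one dedicated disk.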

Advantages

Read data transactions are very fast while write data transactions are
somewhat slower (due to the parity that has to be calculated).

Disadvantages

Disk failures have an effect on throughput, although this is still
acceptable.

Like RAID 3, this is complex technology.

Ideal use

RAID 5 is a good all-round system that combines efficient storage with
excellent security and decent performance. It is ideal for file and
application servers.

RAID 6

P+Q Redundancy scheme; similar to Level 5, but stores extra redundant information to
guard against multiple disk failures. Better reliability than Level 5 at a higher cost; not
used as widely.
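
The cost side of these trade-offs can be compared with a rough capacity calculation. This sketch assumes equal-size disks, two-way mirroring for RAID 1, and the usual P+Q arrangement for RAID 6 in which two disks' worth of space goes to redundancy; the last point is an assumption, since the notes only say that extra redundant information is stored. The function name is hypothetical.

```python
def usable_disks(level: int, n: int) -> int:
    """Disks' worth of user data in an n-disk array (equal-size disks assumed)."""
    if level == 0:
        return n        # striping only, no redundancy
    if level == 1:
        return n // 2   # every block is stored twice
    if level in (3, 4, 5):
        return n - 1    # one disk's worth of parity
    if level == 6:
        return n - 2    # two disks' worth of redundancy (assumed P+Q)
    raise ValueError(f"level {level} is not covered by this sketch")


capacity_with_four_disks = {level: usable_disks(level, 4) for level in (0, 1, 5, 6)}
```

With four disks, RAID 5 keeps three disks' worth of data while RAID 6 keeps two, which is the higher cost paid for surviving more than one failure.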

Fig : RAID Levels