
Fusion-io's Solid-State Storage:

The Standard for Enterprise-Class Reliability


Executive Summary
Fusion-io offers leading-edge, solid-state solutions based on NAND flash technology. These solutions provide a level of data
integrity and availability for mission-critical data that exceeds other solid-state solutions and significantly surpasses that of
enterprise-class rotating magnetic storage devices.
The enterprise has been hesitant to adopt NAND flash due to its per-capacity cost compared to disk and a reputation for
unreliability. Widespread adoption of flash in consumer devices has driven consumer-grade (MLC) flash prices down, but MLC
flash has much higher failure rates than enterprise (SLC) flash. For this reason, MLC flash by itself is not suitable for mission-critical enterprise applications, where a bit error could crash an operating system or corrupt sensitive data.
Fusion-io has developed patent-pending techniques that allow its ioMemory products to become a NAND flash solution
with reliability that exceeds that of enterprise disk-based storage. This solution protects data at every step, ensuring that
nothing is lost or corrupted in transit or on the media. Fusion-io's Virtual Storage Layer (VSL) is an OS subsystem that integrates
the I/O and virtual memory subsystems. It allows NAND flash to be used as an extension of the server memory hierarchy,
greatly improving server throughput and memory capacity. The results are products that combine memory-like performance
with the persistence and capacity of traditional storage. Without VSL, NAND flash is destined to remain an expensive niche
in a world built around slow disk infrastructure. With VSL and the ioMemory architecture, innovators can now unlock the
true potential of enterprise flash to achieve the highest, non-volatile levels of performance, efficiency, and savings that
would otherwise be impossible. This advances the performance of storage to the point where simpler, more cost-effective,
and more robust IT solutions may be deployed.

WWW.FUSIONIO.COM

2010 Fusion-io, Inc. All rights reserved. ioDrive is a registered trademark of Fusion-io in the United States and/or other countries.
All other product and company names and marks mentioned in this document are property of their respective owners.

Enterprise-Class Reliability

[Figure 1 diagram. Labels, top to bottom: Applications / Databases; File System; 3rd Party VSL Module; VSL Module; VSL Enhanced File System; VSL Module; Future VSL Abstraction Layers; Kernel; VSL; ioMemory; Other Hardware.]

Figure 1. VSL lays the foundation for an emerging ioMemory-optimized software ecosystem that is not possible with legacy I/O subsystems
due to obstructions created by the RAID controller and embedded processors.

This white paper describes the characteristics of a dependable system and provides background on aspects of solid-state storage
that present dependability challenges. It then describes several inventions and advancements Fusion-io has introduced that
ensure data integrity and availability. Finally, it discusses the probability of catastrophic failure in a device and how Fusion-io's
architecture ensures controlled management of component failure and predictable long-term device performance and wear.

The Dimensions of Dependability


Reliability Engineering is a subset of a larger engineering discipline sometimes referred to as Dependability Engineering.
A dependable product has the following attributes:
Reliability: A product or service should fail infrequently, and when it does, it should strive to maintain continuity of
service. The raw failure rate of a product is often referred to as the inherent failure rate, while failures that impact
continuity of service are often associated with the operational failure rate. Fault-tolerance techniques often add
components and functions at the expense of the inherent failure rate, but to the benefit of the operational reliability.


Availability: Often measured as uptime percentage, availability for enterprise-class sub-systems ranges from four-9s
to six-9s, depending on the level of redundancy deployed at the system or solution-level, and the business criticality
of client applications. For reference, five-9s (99.999%) translates to roughly five minutes of downtime per year.
Serviceability: When systems fail, it is imperative to restore service quickly. Serviceable products not only enable rapid
replacement of failed units, but also provide accurate diagnostics to assure first-fault isolation.
Manageability / Usability: Complex systems are difficult to install, administer, and troubleshoot, even with the best
human-computer interfaces. Poor interfaces lead to compound failures that can lead to loss of assets and protracted
downtime.
Predictable performance: Enterprise applications demand deterministic performance. This requirement extends to
sub-system failure scenarios, where solution performance needs must be met in the face of point failures by using
well-understood redundancy techniques.
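The relationship between a count of nines and annual downtime is simple arithmetic. The following is a minimal sketch, not part of any Fusion-io tool; the function name and the 365.25-day year are assumptions made for illustration. It shows that five-9s works out to a little over five minutes per year.

```python
# Average length of a year in minutes (365.25 days is an assumption
# that smooths over leap years).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Return the annual downtime implied by an availability of `nines` nines."""
    unavailability = 10.0 ** (-nines)  # e.g. five nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


# Print the range quoted for enterprise-class sub-systems.
for nines in (4, 5, 6):
    minutes = downtime_minutes_per_year(nines)
    print(f"{nines} nines: about {minutes:.2f} minutes of downtime per year")
```

Each additional nine divides the allowed downtime by ten, which is why the step from four-9s to six-9s usually requires redundancy at the system or solution level rather than in a single component.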

FUSION-IO PRODUCTS ARE DESIGNED WITH THESE PRINCIPLES IN MIND:


Parts count is minimized to maximize inherent reliability.
Internal fault-tolerance techniques are used to assure continuity of service in the face of inevitable component failures.
Native PCIe implementation eliminates unnecessary protocol stacks and cabling, enabling common system-level
redundancy methods.
Tools such as ioManager and ioAdministrator use scalable and uncluttered GUIs to monitor and manage Fusion-io devices.
Standard APIs (including SNMP, SMI-S, and WMI) allow Fusion-io devices to integrate into existing storage and system
management infrastructure.
Comprehensive logs and utilities assist with fault isolation and troubleshooting.
Installation and upgrade processes are intuitive and straightforward.
ioDrives and ioXtremes are single Customer Replaceable Units (or CRUs).
At the solution level, the ease of integration of Fusion-io products coupled with the outstanding performance of solid-state storage
enables reductions in solution complexity, points of failure, and total cost of ownership (TCO). The Fusion-io solution is truly
setting a new standard for enterprise dependability.


Background
Flash memory chips are a non-volatile storage medium (they retain their information even in the absence of power). The
most common types of flash chips are silicon-based NOR and NAND, named after the organization of memory cells used in
their design. NAND flash technology, introduced in 1989, has become the most commonly used type of flash chip, due to
its faster write speed. Solid-state storage continues to grow in popularity as its price steadily declines, its storage capacity increases,
and its physical size decreases.
Flash memory offers a number of benefits in comparison to rotating magnetic storage devices (hard disk drives or HDDs). It
has no moving parts and is therefore significantly less prone to shock or vibration disturbance. It is a high-speed solution in
both latency and throughput. Temperature and humidity resilience means that it can operate in a number of challenging
environments. Finally, it consumes significantly less power than rotating magnetic storage devices, particularly when power
requirements for device cooling are considered.
However, solid-state-based subsystems, like all other computer subsystems, can and eventually do fail. Failures occur in
ways that are often familiar to system administrators, and sometimes in ways that are particular to NAND flash technology.
Potential failure points include:
Media: Media failures can occur on the NAND flash chips themselves.
Transport: Transport errors can occur anywhere along the path carrying data between the CPU and the storage media.
Management: Management problems can occur within the NAND flash controller. The code that controls the operation
can contain defects, resulting in data failures.
Device Failure: Catastrophic hardware failure can also occur. This includes the possibility of internal short circuits
and open circuits within the memory array, the control logic, and other peripheral circuits.
Wear Out: Devices have a finite life; lifetime is a function of product application, configuration options, the type of
NAND flash technology used, and the sophistication of controller logic.
External: External problems can affect any part of the process, such as those related to site disasters.
Human Error: Information Technology systems may be quite complex to configure, administer, and troubleshoot.
The correct architecture and design addresses these issues and more through comprehensive fault detection, fault handling,
and failure recovery. The correct product also addresses technology integration with key business process and quality
management systems.


Data Integrity
Data integrity means having confidence that the data you store is exactly the data you get out when you retrieve it. Data
integrity is the most important attribute of a storage system. As data moves from a computer's RAM or CPU to the Fusion-io
device, data integrity is ensured by using several proven industry-standard approaches:
The CPU, chipset, and RAM use SECDED (Single Error Correct Double Error Detect) or chip-kill (method for on-the-fly
replacement of a failed chip) to ensure accuracy.
The PCIe bus uses 8b/10b encoding, with end-to-end checksums (Cyclic Redundancy Check, or CRC) and poison-bit checking.
The ioDrive pipelines are fully parity-protected.
Every data packet is labeled and logged. The label is double-checked in hardware for correctness as data is committed.
Writes are acknowledged when the data has been committed to the Write pipeline. Instant Committed Writes technology
assures persistence in the event of unexpected power outages.
Advanced Error Correction technology covers data at rest on the solid-state non-volatile storage medium.
Partially written data is not returned.
When data is read from the non-volatile medium, ECC and related techniques are again employed to ensure that
the data being retrieved is correct.
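To make the SECDED idea in the list above concrete, here is a toy sketch using an extended Hamming (8,4) code: four data bits, three Hamming parity bits, and one overall parity bit. This is an illustration only. Real memory subsystems typically use much wider codes, and nothing here describes the ECC actually used in Fusion-io products; the function names and bit layout are assumptions.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Layout (hypothetical, chosen for clarity): index 0 is the overall parity
    bit, indices 1, 2, 4 are Hamming parity bits, indices 3, 5, 6, 7 hold data.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions with bit 0 set
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions with bit 1 set
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions with bit 2 set
    code[0] = sum(code[1:]) % 2            # makes total parity even
    return code


def decode(received: List[int]) -> Tuple[Optional[int], str]:
    """Return (data, status): status is 'ok', 'corrected', or 'double-error'."""
    code = list(received)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position           # XOR of the positions of set bits
    overall_parity = sum(code) % 2
    if syndrome == 0 and overall_parity == 0:
        status = "ok"
    elif overall_parity == 1:
        code[syndrome] ^= 1                # single error; syndrome 0 means bit 0
        status = "corrected"
    else:
        return None, "double-error"        # detected but not correctable
    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return data, status


word = encode(0b1011)
word[6] ^= 1                               # flip one bit "in transit"
print(decode(word))
```

A single flipped bit yields an odd overall parity and a syndrome that points at the bad position, so it can be repaired. Two flipped bits leave the overall parity even but the syndrome non-zero, which is how the decoder knows to report an error rather than return wrong data.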

Availability
Fusion-io's products have no moving parts. They employ sophisticated internal fault management and fault-tolerance
mechanisms that realize an operational Mean-Time-Between-Failures (MTBF) in excess of two million hours for the base product.
This low operational failure rate contributes directly to high availability.
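An MTBF figure can be translated into an approximate annualized failure rate (AFR). The sketch below assumes a constant failure rate (an exponential lifetime model) and continuous 24x7 operation; both are simplifying assumptions that the white paper does not state, and the function name is illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days of continuous operation (assumption)


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability of at least one failure in a year, assuming a constant
    failure rate of 1 / MTBF (exponential lifetime model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


afr = annualized_failure_rate(2_000_000)
print(f"MTBF of 2,000,000 hours -> AFR of about {afr:.3%}")
```

Under these assumptions, an MTBF of two million hours corresponds to an AFR a little below half a percent.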
The ioDrive corrects media or die failures without any manual intervention. NAND flash's reputation for unreliability is based
on studies that show potential data loss when ECC is absent or weaker than the correction capability employed in Fusion-io
products. NAND flash media is essentially lossy, as are all other forms of storage and communication media. A robust design
does not presume the absence of errors; rather, it addresses the challenge of on-the-fly error detection and correction.


Fusion-io devices achieve reliability results that exceed target error probability by orders of magnitude over other solid-state
offerings. What this means is that the events that might trigger product failure in inferior designs are fully handled in Fusion-io's
architecture and design. Note that Fusion-io's target minimum Uncorrectable Bit Error Rate (UBER) is 10^-20, which is better than
UBER targets for legacy storage media, such as HDDs, which typically have UBERs in the range of 10^-14 to 10^-15. Note also
that the UBER for Fusion-io products is orders of magnitude better during the majority of the expected product life: excessive bit error rate is strictly an end-of-life phenomenon in NAND flash, unlike most other media, which are lossy throughout their life.
In other words, with a superior ECC design, not only is UBER a non-issue during the expected product life, but it is significantly
better than that found in legacy storage systems. Put differently, Fusion-io products are substantially more reliable than
rotating magnetic media.
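One way to read these UBER figures is as the average amount of data that can be read before one uncorrectable bit error is expected. The sketch below does that conversion using decimal terabytes; the function name is illustrative, and the calculation treats UBER as a simple per-bit error probability.

```python
def terabytes_read_per_uncorrectable_error(uber: float) -> float:
    """Mean data read, in decimal terabytes, per uncorrectable bit error,
    treating UBER as a per-bit error probability."""
    bits_per_error = 1.0 / uber
    return bits_per_error / 8 / 1e12  # bits -> bytes -> terabytes


for label, uber in (("HDD, 10^-14", 1e-14),
                    ("HDD, 10^-15", 1e-15),
                    ("Fusion-io target, 10^-20", 1e-20)):
    tb = terabytes_read_per_uncorrectable_error(uber)
    print(f"{label}: about {tb:,.1f} TB read per uncorrectable error")
```

At 10^-15 the expected interval is on the order of a hundred terabytes read, while at 10^-20 it is on the order of ten million terabytes, a difference of five orders of magnitude.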
Fusion-io uses multiple ECC techniques to identify and correct faulty data on the device. One of the greater benefits of ECC
occurs when it is coupled with health monitoring utilities. Based on error history, the controller can predict the likelihood
of failure of individual chips. When a particular area of a chip trips any predetermined unreliability threshold, its data can be
moved and the failed region taken out of service. The controller continues to identify and remove bad blocks, regions of chips,
or even entire chips, so that ordinary wear-out does not cause catastrophic failure, but a very predictable endurance trajectory.
There is also significant reserve capacity (over-provisioning) in the device that backfills any retired blocks. This enables fail-in-place high availability at the micro-level.
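The retire-and-backfill behavior described above can be sketched as a small bookkeeping model. This is a hypothetical illustration, not Fusion-io's controller logic: the class name, the per-block error counter, and the fixed threshold are all assumptions chosen to show the idea of retiring a suspect block and drawing a replacement from reserve capacity.

```python
class BlockPool:
    """Toy model of threshold-based block retirement with a reserve pool."""

    def __init__(self, active_blocks: int, reserve_blocks: int, error_threshold: int):
        self.active = set(range(active_blocks))
        self.reserve = list(range(active_blocks, active_blocks + reserve_blocks))
        self.initial_reserve = reserve_blocks
        self.retired = set()
        self.error_counts = {block: 0 for block in self.active}
        self.error_threshold = error_threshold  # hypothetical unreliability threshold

    def record_corrected_error(self, block: int) -> bool:
        """Log one ECC-corrected error; return True if the block was retired."""
        if block not in self.active:
            raise ValueError(f"block {block} is not in service")
        self.error_counts[block] += 1
        if self.error_counts[block] < self.error_threshold:
            return False
        self._retire(block)
        return True

    def _retire(self, block: int) -> None:
        """Take a block out of service and backfill from reserve if possible."""
        self.active.discard(block)
        self.retired.add(block)
        del self.error_counts[block]
        if self.reserve:
            replacement = self.reserve.pop(0)
            self.active.add(replacement)
            self.error_counts[replacement] = 0

    def reserve_consumed_fraction(self) -> float:
        """Fraction of the original reserve that has been used for backfill."""
        if self.initial_reserve == 0:
            return 1.0
        return 1.0 - len(self.reserve) / self.initial_reserve


pool = BlockPool(active_blocks=4, reserve_blocks=2, error_threshold=3)
for _ in range(3):
    pool.record_corrected_error(0)
print(sorted(pool.active), sorted(pool.retired), pool.reserve_consumed_fraction())
```

In a real device the data would be moved before the block is retired, and the decision would draw on richer error history than a simple counter. The sketch only shows why usable capacity stays constant until the reserve is exhausted.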
Data is regularly monitored for accuracy and refreshed when necessary to ensure it does not deteriorate while it is stored.
This process consolidates valid data and creates contiguous free space on the ioDrive to maintain high levels of I/O
performance. This system also assures that blocks are uniformly utilized (i.e., that there are no hot-spots on the ioDrive,
and that blocks are evenly worn). Note that no system-level utilities such as disk defragmentation are required to maintain
optimal ioDrive performance.

Flashback Protection Technology


Enterprises have long sought to take advantage of the speed, size, low power requirements, and high performance of
NAND flash technology, because of its potential to change the way they manage large amounts of active data. The primary
objection to NAND flash has been the perceived reliability of the medium. Fusion-io has eliminated this barrier by inventing
an additional revolutionary, self-healing technology known as Flashback. This technology instantly restores, corrects, and
retrieves lost data in the flash-based storage subsystem. Flashback Protection consists of:
Advanced bit-error correction
Proactive data integrity monitoring of stored data
The addition of dedicated Parity chips to repair failed chips
Fusion-io's patented Flashback Protection employs 24+1, parity-based, internal redundancy. The redundant Parity column is the
bit-wise XOR of the 24 data columns in the NAND flash array. It may be brought to bear through substitution, whenever
hard or soft failures exceed ECC correction capability. Flashback Protection eliminates Single-Points-of-Failure (SPoF) in the
flash array internal to Fusion-io products.
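The 24+1 parity scheme relies on a basic property of XOR: if the parity column is the XOR of all data columns, then any single missing column equals the XOR of the parity column and the surviving data columns. The sketch below demonstrates that property on random bytes. It is a generic illustration of XOR parity, not Fusion-io's implementation; the function names and the 16-byte column width are assumptions.

```python
import os
from functools import reduce
from typing import List

DATA_COLUMNS = 24  # number of data columns, as stated in the text


def parity_of(columns: List[bytes]) -> bytes:
    """Bit-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, group) for group in zip(*columns))


def reconstruct(columns: List[bytes], parity: bytes, failed_index: int) -> bytes:
    """Rebuild one failed column from the survivors plus the parity column."""
    survivors = [col for i, col in enumerate(columns) if i != failed_index]
    return parity_of(survivors + [parity])


# Hypothetical stripe: 24 columns of 16 random bytes each.
stripe = [os.urandom(16) for _ in range(DATA_COLUMNS)]
parity_column = parity_of(stripe)

failed = 7  # pretend this column's die has failed
rebuilt = reconstruct(stripe, parity_column, failed)
print("column recovered:", rebuilt == stripe[failed])
```

A single parity column tolerates the loss of one column per stripe. Two simultaneous column failures in the same stripe could not be resolved by XOR parity alone.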


[Figure 2 diagram: a bank of NAND flash columns numbered 0 through 23, plus one Parity column.]
Figure 2. Die-level Fault Tolerance via Parity Substitution

Individual die failures are transparent to the user or hosted application. Reconstruction of failed cells in the storage array
occurs at wire-speed, assuring continuous availability and deterministic performance.
With Flashback Protection, Fusion-io is the first and only company to bring RAID-class redundancy and reliability down
to the device level.

Device Longevity
The majority of this white paper has concentrated on the inherent reliability issues with the use of NAND flash technology
in an enterprise-class subsystem, and how to leverage this medium's strengths while overcoming potential weaknesses. In
addition to inherent failures, NAND flash technology also has a practical life. Due to a variety of processes related to nanoscale semiconductor physics, NAND flash technology wears out over time and with use. The endurance of the subsystem
is determined by these factors:
The type of NAND flash technology used (Single-Level Cell (SLC), Multi-Level Cell (MLC), etc.)
The sophistication of system management techniques used to minimize and mitigate wear
Device access patterns, notably the Write workload characteristics
Customer-specific requirements and configuration options (for example, related to over-provisioning)
Wear-out is generally a function of having lost enough cells to cause both capacity and reliability to drop below
acceptable thresholds.


Fortunately, NAND flash technology wear-out is a reasonably well-understood phenomenon and is therefore predictable.
Fusion-io employs specific test and verification processes in product design and manufacturing to qualify NAND flash technology in its products. Endurance is sufficiently understood that through logging of error events and other attributes of
the device, Fusion-io is able to accurately forecast and warn when device end-of-life is near.
For the vast majority of applications, selecting the right NAND flash technology means wear-out will not be an issue,
regardless of expected workload or access patterns.
Fusion-io's wear-leveling and other system management algorithms and strategies significantly improve the life
expectancy of its drives. The same ECC strength that contributes to extremely low UBER, as described earlier, serves to
extend the working life of Fusion-io's products. The same Flashback technology that protects against inherent die failures
also provides protection against weak die that may wear prematurely and otherwise cause end-of-life.
Fusion-io specifies product endurance in petabytes written (PBW). This metric is simply the number of bytes that
may be written to the non-volatile media. The following values are representative of Fusion-io's first-generation product
endurance ratings:

Product Type               Average Estimated PBW
160 GB SLC ioDrive         75 PBW
320 GB SLC ioDrive Duo     150 PBW
320 GB MLC ioDrive         4 PBW
640 GB MLC ioDrive Duo     8 PBW

To put these values into perspective, a 160 GB SLC ioDrive exercised with a 70/30 read/write saturated workload will endure
more than 10 years, even with the most challenging I/O access patterns. In most applications, endurance will be significantly
greater than this example suggests. Similarly, a 320 GB MLC ioDrive receiving 2 TB of physical media writes per day will
endure more than five years.
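The endurance examples above follow directly from the PBW ratings. The sketch below reproduces the arithmetic using decimal units (1 PB = 1,000 TB) and counting physical media writes only, as the text does. The function names are illustrative, and the sustained-rate calculation is a derived figure rather than a published specification.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def lifetime_years(pbw: float, tb_written_per_day: float) -> float:
    """Years until the PBW rating is reached at a steady daily write volume."""
    total_tb = pbw * 1000  # decimal units: 1 PB = 1,000 TB
    return total_tb / tb_written_per_day / 365


def sustained_write_mb_per_s(pbw: float, years: float) -> float:
    """Average physical write rate (MB/s) that would consume `pbw` in `years`."""
    total_bytes = pbw * 1e15
    return total_bytes / (years * SECONDS_PER_YEAR) / 1e6


# 320 GB MLC ioDrive rated at 4 PBW, receiving 2 TB of media writes per day.
print(f"MLC example: {lifetime_years(4, 2):.2f} years")

# 160 GB SLC ioDrive rated at 75 PBW: write rate needed to wear it out in 10 years.
print(f"SLC example: {sustained_write_mb_per_s(75, 10):.0f} MB/s sustained for 10 years")
```

The MLC case comes to 2,000 days, which is consistent with "more than five years". For the SLC case, exhausting 75 PBW in ten years would require a physical write rate of well over 200 MB/s sustained around the clock.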
The most challenging workloads for solid-state storage devices such as Fusion-io's are those characterized by pure random,
small block-size writes, spread over 100% of the available user space. Most workloads are significantly relaxed from this
worst case, because they:
Are sequential in nature
Have a high locality of reference (the active extents are restricted to a small number of files or a small region
of the formatted space)
Have a low write bandwidth requirement. The vast majority of applications today generate less than a terabyte
per day of write workload
Have a high Read/Write ratio, e.g., WORM (Write Once, Read Many)


The relationship between workload at the logical (or application) level and workload at the physical level is complex. With
challenging workloads, it is prudent to add a degree of safety margin, or de-rating, to endurance specifications to assure the
correct fit between application and solid-state device. Note also that a portion of the total physical capacity of ioMemory
devices is pre-configured and used for working space. This reserve capacity maximizes system performance as well as
endurance. Users may add to this reserve capacity by over-provisioning (or under-formatting) an ioDrive. Put simply,
the greater the reserve or over-provisioning, the lesser the endurance de-rating, and the greater the expected life.
Due to Fusion-io's log-structured block management, endurance de-rating needs to be considered only when workloads have a
high degree of write randomization and a low locality of reference. This is sometimes referred to as the Pathological Write
condition. Under these workloads, the level of over-provisioning correlates with the expected endurance. However, for the
majority of applications, these considerations are moot.
For SLC NAND flash technology, workload and under-formatting considerations are largely irrelevant. While the extremely high
bandwidth of ioMemory products makes theoretical workloads possible that might wear out an ioDrive in less than five
years, such a workload would be essentially impossible to achieve in practice. SLC-based products will fundamentally not
wear out over a typical five-year product lifetime.

[Figure 3 chart: y-axis "Days" (ticks from 1,095 to 4,015); x-axis "% Active" (5%, 25%, 50%, 75%, 100%); one series per write share, from 10% Write to 100% Write in steps of 10%.]
Figure 3. Expected life, in days, of an ioMemory device, depending on different read/write access patterns under a continuously saturated workload.


MLC and SLC NAND flash technologies both take advantage of the same fault-tolerance and endurance-enhancing techniques.
The methods used to qualify and rate endurance apply equally well to MLC and SLC NAND flash technologies. However, SLC and
MLC offer capabilities that serve two very different types of applications: MLC is designed for high performance
and moderate endurance at an attractive cost per bit, while SLC is designed for even higher performance and maximum
endurance over time, in less cost-sensitive situations.

Controlled Predictable Usage vs. Catastrophic Failure


One of the greatest reliability benefits in Fusion-io's products is the ability to:
Restore and protect data
Monitor and predict media wear-out
Correct bad data as necessary
Take blocks out of service when their failure rate becomes unacceptable
Replace bad chips on-the-fly
Relocate data to a known, good location (and update corresponding mapping information)

Data on ioMemory products is protected in a number of ways. These ways include proactive retirement of any suspect
components in the system, including blocks of NV media. As blocks are taken out of service, reserve capacity is brought
to bear through re-mapping, or block virtualization. This process is transparent to the host applications. The net effect
is that wear-out of ioMemory, instead of being potentially catastrophic, is incremental, predictable, and manageable.
Fusion-io products provide advance warning prior to wear-out:
When approximately 50 percent of the system-reserved over-capacity has been consumed, alerts are generated to
warn the system administrator of impending performance throttling. This warning helps prevent wear-out impact
on performance-critical applications.
Closer to end-of-life, when approximately 60 percent of the system-reserved over-capacity has been consumed,
ioDrives enter write-reduced mode. This mode throttles write performance artificially and sets critical alarms. These
alarms provide unambiguous warning that wear-out is approaching and maintenance actions need to be taken.
At end-of-life, defined as the point when approximately 85 percent of the system-reserved over-capacity has been
consumed, ioDrives enter read-only mode. This mode eliminates further wear of the ioDrives while
stored data is relocated to a replacement drive, before data becomes unavailable.
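The three warning stages above amount to a simple mapping from consumed reserve capacity to an operating mode. The sketch below encodes that mapping. The thresholds come from the text, where they are described as approximate; the enum, the function name, and the use of exact cut-offs are illustrative assumptions rather than actual ioDrive firmware behavior.

```python
from enum import Enum


class WearState(Enum):
    NORMAL = "normal"
    WARNING = "warning"              # alerts about impending throttling
    WRITE_REDUCED = "write-reduced"  # write performance throttled, critical alarms
    READ_ONLY = "read-only"          # end-of-life: no further writes


def wear_state(reserve_consumed: float) -> WearState:
    """Map the consumed fraction of system-reserved over-capacity (0.0 to 1.0)
    to an operating mode, using the approximate thresholds from the text."""
    if not 0.0 <= reserve_consumed <= 1.0:
        raise ValueError("reserve_consumed must be between 0.0 and 1.0")
    if reserve_consumed >= 0.85:
        return WearState.READ_ONLY
    if reserve_consumed >= 0.60:
        return WearState.WRITE_REDUCED
    if reserve_consumed >= 0.50:
        return WearState.WARNING
    return WearState.NORMAL


for fraction in (0.10, 0.50, 0.60, 0.85):
    print(f"{fraction:.0%} of reserve consumed -> {wear_state(fraction).value}")
```

Checking the highest threshold first keeps the stages mutually exclusive, so a device is reported in exactly one mode at any point in its life.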


Conclusion
Fusion-io protects your data at every stage of its path. From your applications to persistent storage, Fusion-io ensures that
nothing is lost or corrupted along the way or while the data is being stored. Data is checked multiple times, using several
error detection methods. Once it reaches the non-volatile medium, it is stored with robust error correction encoding that
lets the device not only identify but also correct bit errors. Fusion-io's data integrity design target is a 10^-30 probability of undetected bad data and a 10^-20 probability of uncorrectable bit error. Integrated fault-tolerance mechanisms maximize device
availability and protect against data loss.
Now, with Fusion-io's comprehensive approach to reliability, it is safe to exploit the significant performance gains and many
other benefits offered by NAND flash technology. The architecture pioneered by Fusion-io ensures predictable, controlled
mitigation of component failure and wear-out, issues that have up to now limited the adoption of NAND flash technology in enterprise solutions. Fusion-io's ioMemory products exceed the reliability of rotating magnetic media storage, while
providing quantum-leap performance improvement.
