
Offloaded Data Transfer (ODX) with

Intelligent Storage Arrays



Windows Offloaded Data Transfers Overview
http://technet.microsoft.com/en-us/library/hh831628.aspx

February 28, 2012
Abstract
This paper provides the technical details about the Offloaded Data Transfer (ODX)
operation and design requirements for intelligent storage devices in
Windows 8 Consumer Preview. It also provides conceptual guidelines for developers
of intelligent storage devices to understand the operation of ODX in Windows 8
Consumer Preview.
This information applies to the following operating systems:
Windows 8 Consumer Preview
Windows Server 8 Beta

References and resources discussed here are listed at the end of this paper.
The current version of this paper is maintained on the Web at:
Offloaded Data Transfer (ODX) with Intelligent Storage Arrays













Disclaimer: This document is provided as-is. Information and views expressed in this document, including
URL and other Internet website references, may change without notice. Some information relates to pre-release product that may be substantially modified before it is commercially released. Microsoft makes no
warranties, express or implied, with respect to the information provided here. You bear the risk of using it.
Some examples depicted herein are provided for illustration only and are fictitious. No real association or
connection is intended or should be inferred.
© 2012 Microsoft. All rights reserved.
This document does not provide you with any legal rights to any intellectual property in any Microsoft
product. You may copy and use this document for your internal, reference purposes.



Document History
Date Change
February 28, 2012 First publication

Contents

Introduction
Problem Statement
Overview of Offloaded Data Transfer (ODX)
Identify ODX-Capable Source and Destination
ODX Read/Write Operations
    Synchronous Command Adoption and APIs
    Offload Read Operations
        ROD Token Policy and Management
    Offload Write Operations
        Result from Receive Offload Write Result
        Offload Write with Well-Known ROD Token
Performance Tuning Parameters of ODX Implementation
    Minimum Copy Offload File Size
    Maximum Token Transfer Size and Optimal Transfer Count
    Optimal and Maximum Transfer Lengths
ODX Error Handling and High Availability Support
    ODX Error Handling
    ODX Failover in MPIO and Cluster Server Configurations
ODX Usage Models
    ODX across Physical Disk, Virtual Hard Disk and SMB Shared Disk
    ODX Operation with One Server
    ODX Operation with Two Servers
    Massive Data Migration
    Host-Controlled Data Transfer within a Tiered Storage Device
Conclusion
Resources


Introduction
The increasing demand for high-speed data transfer in system virtualization and cloud storage data migration is pushing system and storage platforms toward more efficient data transfer mechanisms. Today, IT pros and end users perform data transfer through client-server networks or storage networks using extended copy commands. To advance storage data movement, Microsoft developed a new data transfer technology: Offloaded Data Transfer (ODX). Instead of using buffered read
and write operations, ODX starts the copy operation with an offload read, retrieves a
token representing the data from the storage device, and then uses an offload write
command and the token to request data transfer from the source disk to the
destination disk. The copy manager of the storage devices then moves the data
according to the token. ODX operates in the backend storage array, which eliminates buffered data movement on the client-server network. CPU usage and client-server network bandwidth consumption drop to near-zero levels.
In Windows 8, IT pros and storage administrators can use ODX to interact with the
storage device to move large files or data through high-speed storage networks. ODX
will significantly reduce client-server network traffic and CPU time usage during large-
size data transfers, because all of the data movement occurs in the backend storage
network. ODX can be used in virtual machine deployment, massive data migration,
and tiered storage device support. Also, the cost of physical hardware deployment
can be reduced through ODX and thin provisioning storage features.

Problem Statement
A traditional copy operation reads data from a source file into a reserved buffer and then writes the buffered data to a destination file. Many copy offload solutions on the current market try to speed up
the copy operation through transfer link enhancement, quality of service (QoS) traffic
control, or enhanced data buffer coordination to achieve high-performance data
transfer rates on the client-server network. However, moving data through the client-
server network can consume a large amount of network bandwidth and CPU time of
the host systems.
ODX allows the host system to interact with the storage array to move data through the high-speed storage area network (SAN). Because ODX uses the backend storage network, traffic on the front-end client-server network and host CPU usage are nearly zero.
In Windows 8, ODX can help users copy or move data across virtual hard disks (VHDs),
Server Message Block (SMB) shares, and physical disks. ODX is an end-to-end design
and support feature. It works on the storage devices that comply with the T10 XCOPY
Lite specification. This white paper covers the technical overview, design guide for
storage arrays, and usage models of ODX.

Overview of Offloaded Data Transfer (ODX)
Offloaded Data Transfer (ODX) introduces a tokenized operation to move data on the
storage device. The source file and destination file can be on the same volume, two
different volumes hosted by the same machine, a local volume and a remote volume
through SMB2, or two volumes on two different machines through SMB2.
The copy offload operation using ODX follows this sequence:
1. The copy offload application sends an offload read request to the copy manager of the source storage device.
2. The application sends a receive offload read result request to the copy manager and receives a token. The token is a representation of the data to be copied.
3. The application sends an offload write request with the token to the copy manager of the destination storage device.
4. The application sends a receive offload write result request to the copy manager.
5. The copy manager moves the data from the source to the destination and returns the offload write result to the application.
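The sequence above can be sketched as a minimal in-memory simulation. This is an illustrative model only; the class and method names are assumptions, not the Windows or T10 APIs, and a real copy manager lives inside the storage array.

```python
# Hypothetical in-memory simulation of the tokenized ODX copy flow.
import secrets

class CopyManager:
    """Simulates the storage array's copy manager (illustrative only)."""
    def __init__(self):
        self.luns = {}      # lun_id -> bytearray of LUN contents
        self.tokens = {}    # token -> (lun_id, offset, length)

    def offload_read(self, lun_id, offset, length):
        # Steps 1-2: grant an opaque token representing the data range.
        token = secrets.token_bytes(16)
        self.tokens[token] = (lun_id, offset, length)
        return token

    def offload_write(self, token, dst_lun, dst_offset):
        # Steps 3-5: move the data described by the token inside the
        # array; no data crosses the client-server network.
        src_lun, src_off, length = self.tokens[token]
        data = self.luns[src_lun][src_off:src_off + length]
        self.luns[dst_lun][dst_offset:dst_offset + length] = data
        return length  # transfer count

cm = CopyManager()
cm.luns["src"] = bytearray(b"hello, odx!")
cm.luns["dst"] = bytearray(11)
tok = cm.offload_read("src", 0, 11)
moved = cm.offload_write(tok, "dst", 0)
assert cm.luns["dst"] == b"hello, odx!"
```

The host only ever handles the token, never the payload, which is why the front-end traffic stays near zero.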

The following figure shows a diagram of the copy offload operation using ODX.
Figure 1. Diagram of ODX Operations

Identify ODX-Capable Source and Destination
To support ODX, storage arrays must implement the T10 standard specifications. The
following sections describe how to identify ODX-capable storage arrays, offloaded
read and write operations, and Offload Write with Well-Known Token.
During the LUN device enumeration (a system boot or a plug-and-play event), Windows gathers or updates the ODX capability information of the storage target device through the following steps:
- Query the copy offload capability.
- Gather the required parameters and limitations for copy offload operations.
By default, Windows 8 tries the ODX path if both the source and destination LUNs are ODX-capable. If the storage device fails the initial ODX request, Windows marks the combination of the source and destination LUNs as a path that is not ODX-capable.
ODX Read/Write Operations
ODX consists of four major steps:
1. Offload read operations
2. Receive offload read result
3. Offload write with the token
4. Receive offload write result

To avoid SCSI command time-out and ensure robust Multipath I/O (MPIO) and
failover clustering support, Windows 8 adopted the synchronous offload SCSI
commands.
Synchronous Command Adoption and APIs
Windows 8 adopts the synchronous offload read/write operation and splits a large offload write request using the following algorithm to ensure a robust synchronous offload write:
- If the target storage device does not provide an optimal transfer size, set the optimal transfer size at 64 MB.
- If the optimal transfer size specified by the storage target device is greater than zero and less than 256 MB, use the device-specified value.
- If the optimal transfer size set by the target device is greater than 256 MB, set the optimal transfer size at 256 MB.
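The clamping rules above amount to a small function. This Python sketch uses illustrative names, not Windows constants:

```python
MB = 1024 * 1024

def effective_optimal_transfer_size(reported):
    """Apply the clamp rule described above: default 64 MB when the
    device reports nothing usable, cap at 256 MB otherwise.
    (Sketch only; names are illustrative, not Windows internals.)"""
    if not reported or reported <= 0:
        return 64 * MB          # device reported no usable value
    if reported > 256 * MB:
        return 256 * MB         # cap at 256 MB
    return reported             # in (0, 256 MB]: use the device's value

assert effective_optimal_transfer_size(None) == 64 * MB
assert effective_optimal_transfer_size(512 * MB) == 256 * MB
assert effective_optimal_transfer_size(16 * MB) == 16 * MB
```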

Synchronous offload read and offload write SCSI commands reduce the complexity of MPIO and cluster failover scenarios. Windows expects the copy manager to complete the synchronous offload read/write SCSI commands within 4 seconds.
In Windows 8, applications can use FSCTL, DSM IOCTL, or SCSI_PASS_THROUGH APIs
to interact with storage arrays and execute copy offload operations. To avoid data
corruption or system instability, Windows 8 restricts applications from writing
directly to a volume that is mounted by a file system without first obtaining exclusive
access to the volume. This is because the write to the volume may collide with the file
system writes. When such collisions occur, the contents of the volume may be left in
an inconsistent state.
Offload Read Operations
The offload read request of the application can specify the token lifetime (inactivity
time-out). If the application sets the token lifetime to zero, the default inactivity
timer will be used as the token lifetime. The copy manager of the storage array
maintains and validates the token according to its inactivity time-out value and
credentials. The Windows host also limits the number of file fragments to 64. If the offload read request consists of more than 64 fragments, Windows fails the copy offload request and falls back to the traditional copy operation.
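A hypothetical request builder can illustrate both rules: a zero lifetime selects the default inactivity timeout, and too many fragments forces a fallback. The default lifetime value below is an assumption for illustration, not a documented Windows constant.

```python
DEFAULT_TOKEN_LIFETIME_S = 180   # assumed default inactivity timeout
MAX_FRAGMENTS = 64               # fragment limit stated above

def build_offload_read(extents, requested_lifetime_s=0):
    """Validate an offload read request per the rules above.
    Returns None when the caller must fall back to a traditional copy.
    Sketch only; not the Windows API."""
    if len(extents) > MAX_FRAGMENTS:
        return None  # more than 64 fragments: legacy copy path
    lifetime = requested_lifetime_s or DEFAULT_TOKEN_LIFETIME_S
    return {"extents": extents, "inactivity_timeout_s": lifetime}

req = build_offload_read([(0, 4096)], 0)
assert req["inactivity_timeout_s"] == DEFAULT_TOKEN_LIFETIME_S
```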
ROD Token Policy and Management
After completing the offload read request, the copy manager prepares a
representation of data (ROD) token for the receive offload read result command. The
ROD token field specifies the point-in-time representation of user data and
protection information. The ROD can represent user data in open-exclusively or open-with-share format. The copy manager can invalidate the token according to its ROD policy settings. If the ROD is open exclusively for a copy offload operation, the ROD token can be invalidated when the ROD is modified or moved. If the ROD is in open-with-share format, the ROD token remains valid when the ROD is modified.
The ROD token is in a 512-byte format, and the first 4 bytes are used to describe the
ROD token type.

Token Format

    Size in Bytes    Contents in the Token
    4 bytes          ROD token type
    508 bytes        ROD token ID

Figure 2. Token Format
Because the ROD token is granted and consumed only by the storage array, its format is opaque, not guessable, and highly secure. If the token is modified, invalid, or expired, the copy manager can invalidate the token during the offload write operation. The ROD token returned from the offload read operation carries an inactivity time-out value that indicates the number of seconds the copy manager must keep the token valid for the next Write Using Token usage.
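The 512-byte layout above can be illustrated with a small parser. This is a sketch; the byte order and the example type value are assumptions for illustration, not values taken from the T10 specification.

```python
import struct

TOKEN_SIZE = 512  # total ROD token size described above

def parse_rod_token(token):
    """Split a 512-byte ROD token into its 4-byte type field and the
    remaining 508 opaque bytes, per the layout above. Illustrative."""
    if len(token) != TOKEN_SIZE:
        raise ValueError("ROD token must be exactly 512 bytes")
    (rod_type,) = struct.unpack_from(">I", token, 0)  # 4-byte type
    opaque = token[4:]                                # 508 opaque bytes
    return rod_type, opaque

# Build a dummy token with an assumed type value of 1.
tok = struct.pack(">I", 1) + bytes(508)
rod_type, body = parse_rod_token(tok)
assert rod_type == 1 and len(body) == 508
```

The host treats the 508-byte remainder as opaque, matching the paper's point that only the storage array interprets the token.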
Offload Write Operations
After receiving the ROD token from the copy manager, the application sends the
offload write request with the ROD token to the copy manager of the storage array.
When a synchronous offload write command is sent to the target device, Windows
expects the copy manager to complete the command within 4 seconds. If the
command is terminated because of command time-out or other error conditions,
Windows fails the command. The application falls back to the legacy copy operation
according to the returned status code.
Result from Receive Offload Write Result
The offload write request can be completed with one or multiple Receive Offload
Write Result commands. If the offload write is partially completed, the copy manager
returns with the estimated delay and the number of transfer counts to indicate the
copy progress. The number of transfer counts specifies the number of contiguous
logical blocks that were written without error from the source to the destination
media. The copy manager can perform offload writes in a sequential or
scatter/gather pattern.
When a write failure occurs, the copy progress counts contiguous logical blocks from
the first logical block to the failure block. The client application or copy engine
resumes the offload write from the write failure block. When the offload write is
completed, the copy manager completes the Receive ROD Token Information
command with the estimated status update delay set to zero and the progress of the
data transfer count at 100 percent. If the receive offload write result returns the
same progress of the data transfer count, Windows fails the copy operation back to
the application after four retries.
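The progress polling and stall detection described above can be sketched as a loop. The helper name and callback shape are hypothetical; get_result stands in for a Receive Offload Write Result command.

```python
def poll_offload_write(get_result, max_stalled_polls=4):
    """Poll the offload write result until the copy completes; fail
    after four consecutive polls that report the same transfer count,
    as described above. get_result() -> (done, transfer_count).
    Sketch only, not the Windows API."""
    last_count, stalled = None, 0
    while True:
        done, count = get_result()
        if done:
            return count  # copy reached 100 percent
        if count == last_count:
            stalled += 1  # no forward progress since last poll
            if stalled >= max_stalled_polls:
                raise RuntimeError("no progress after four retries; "
                                   "fall back to legacy copy")
        else:
            last_count, stalled = count, 0

# Partial completions followed by a final result:
results = iter([(False, 10), (False, 20), (True, 30)])
assert poll_offload_write(lambda: next(results)) == 30
```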
Offload Write with Well-Known ROD Token
A client application can also perform the offload write operation with a well-known
ROD token. This is a predefined ROD token with a known data pattern and token
format. One common implementation is called a zero token. A client application can
use a zero token to fill one or more ranges of logical blocks with zeros. If the well-
known token is not supported or recognizable, the copy manager fails the offload
write request with Invalid Token.

Token Format

    Size in Bytes    Contents in the Token
    4 bytes          ROD token type
    2 bytes          Well-known pattern
    506 bytes        Reserved

Figure 3. Well-Known Token Format
A client application cannot obtain a well-known ROD token through an offload read request; well-known tokens are predefined. The copy manager verifies and maintains the well-known ROD tokens according to its policy.
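The well-known token layout can be illustrated the same way. The type and pattern values below are placeholders for illustration, not the values defined by the T10 specification.

```python
import struct

# Placeholder constants; real values come from the T10 specification.
ROD_TYPE_WELL_KNOWN = 0xFFFF0001
ZERO_PATTERN = 0x0001

def make_zero_token():
    """Build a 512-byte well-known 'zero token' per the layout above:
    4-byte type, 2-byte well-known pattern, 506 reserved bytes.
    Illustrative sketch only."""
    return struct.pack(">IH", ROD_TYPE_WELL_KNOWN, ZERO_PATTERN) + bytes(506)

tok = make_zero_token()
assert len(tok) == 512
```

An application would pass such a token in an offload write to ask the array to fill the target blocks with zeros; an array that does not recognize the pattern fails the request with Invalid Token.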
Performance Tuning Parameters of ODX Implementation
Performance of ODX does not depend on the transport link speeds of the client-
server network or SAN between the server and storage array. The data is moved by
the copy manager and the device servers of the storage array.
Not every copy offload benefits equally from ODX technology. For example, the copy manager of a 1 Gbit iSCSI storage array could complete a 3 GB file copy within 10 seconds; the resulting data transfer rate, greater than 300 MB per second, already outperforms the maximum theoretical transfer speed of the 1 Gbit Ethernet interface.
Copy offload for files of certain sizes may not benefit from ODX technology. To optimize performance, ODX use can be restricted to a minimum file size and maximum copy lengths. To tune the performance of ODX, adjust the following parameters.
Minimum Copy Offload File Size
Windows sets a minimum file size requirement for copy offload operations. Currently,
the minimum copy offload file size is set at 256 KB in the copy engine. If a file is less
than 256 KB, the copy engine falls back to the legacy copy process.
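As a sketch, the size gate is a one-line check (illustrative names only, not the copy engine's internals):

```python
MIN_OFFLOAD_FILE_SIZE = 256 * 1024  # 256 KB, the copy engine default above

def should_try_odx(file_size):
    """Files under 256 KB take the legacy copy path. Sketch only."""
    return file_size >= MIN_OFFLOAD_FILE_SIZE

assert not should_try_odx(100 * 1024)   # small file: legacy copy
assert should_try_odx(1024 * 1024)      # 1 MB file: try offload
```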
Maximum Token Transfer Size and Optimal Transfer Count
The Windows host uses a maximum token transfer size and optimal transfer count to
prepare the optimal transfer size of an offload read or write SCSI command. The total
transfer size in number of blocks must not exceed the maximum token transfer size. If
the storage array does not report an optimal transfer count, Windows uses 64 MB as
the default count.
Optimal and Maximum Transfer Lengths
The optimal and maximum transfer length parameters specify the optimal and
maximum number of blocks in one range descriptor. Copy offload applications can
comply with these parameters to achieve the optimal file transfer performance.

ODX Error Handling and High Availability Support
When an ODX operation fails during a file copy request, the copy engine and the Windows file system (NTFS) fall back to the legacy copy operation. If the copy offload fails in the middle of the offload write operation, the copy engine and NTFS resume the legacy copy operation from the first failure point in the offload write.
ODX Error Handling
ODX uses a robust error handling algorithm in accordance with the storage array's features. If the copy offload fails in an ODX-capable path, the Windows host expects the application to fall back to the legacy copy operation. The Windows copy engine already implements this fallback to the traditional copy mechanism.
After the copy offload failure, NTFS marks the source and destination LUN as not
ODX-capable for three minutes. After this period of time passes, the Windows copy
engine retries the ODX operation. A storage array could use this feature to
temporarily disable ODX support in some paths during highly stressful situations.
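The three-minute back-off above can be sketched as a small cache keyed by the source/destination LUN pair. This is a hypothetical design for illustration, not the NTFS implementation.

```python
import time

NOT_CAPABLE_TTL_S = 3 * 60  # three minutes, as described above

class OdxPathCache:
    """Track LUN pairs that failed ODX so the copy engine skips ODX
    on that path for three minutes, then retries. Sketch only."""
    def __init__(self, clock=time.monotonic):
        self._failed = {}   # (src_lun, dst_lun) -> expiry timestamp
        self._clock = clock

    def mark_not_capable(self, src, dst):
        self._failed[(src, dst)] = self._clock() + NOT_CAPABLE_TTL_S

    def should_try_odx(self, src, dst):
        expiry = self._failed.get((src, dst))
        if expiry is None or self._clock() >= expiry:
            self._failed.pop((src, dst), None)  # TTL elapsed: retry ODX
            return True
        return False
```

Injecting the clock makes the TTL behavior easy to test, and an array under stress that fails ODX requests naturally lands its paths in this cache.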
ODX Failover in MPIO and Cluster Server Configurations
Offload read and write operations must be completed or canceled from the same
storage link (I_T nexus).
When an MPIO or a cluster server failover occurs during a synchronous offload read or write operation, Windows handles the failover using the following algorithm:
- With an MPIO configuration: Windows retries the failed command after the MPIO path failover. If the command fails again, Windows does the following:
  - Without the cluster server failover option: Windows issues a LUN reset to the storage device and returns an I/O failure status to the application.
  - With the cluster server failover option: Windows starts the cluster server node failover.
- With a cluster server configuration: The cluster storage service fails over to the next preferred cluster node and then resumes the cluster storage service. The offload application must be cluster aware to retry the offload read/write command after the cluster storage service failover.

If the offload read or write command fails even after the MPIO path and cluster node failover, Windows issues a LUN reset to the storage device. The storage device then terminates all outstanding commands and pending operations under the LUN.
Currently, Windows does not issue asynchronous offload read or write SCSI
commands from the storage stack.

ODX Usage Models
This section discusses the usage models of ODX technology:
- High-performance data transfer across physical disks, virtual hard disks, and SMB shared disks
- Massive data migration
- Host-controlled data transfer within a tiered storage device

ODX across Physical Disk, Virtual Hard Disk and SMB Shared Disk
To perform ODX operations, the application server must have access to both the source LUN and the destination LUN with read/write privileges. The copy offload application issues an offload read request to the source LUN and receives a token from the copy manager of the source LUN. The application then uses the token to issue an offload write request to the destination LUN. The copy manager moves the data from the source LUN to the destination LUN through the storage network.
ODX Operation with One Server
In a single-server configuration, the copy offload application issues the offload read
and write requests from the same server system.

Figure 4. ODX Operation with One Server
In the previous illustration, Server1 or Virtual Machine1 has access to both source
LUN (VHD1 or Physical Disk1) and destination LUN (VHD2 or Physical Disk2). The copy
offload application issues an offload read request to the source LUN and receives the
token from the source LUN, and then the copy offload application uses the token to
issue an offload write request to the destination LUN. The copy manager moves the
data from the source LUN to the destination LUN within the same storage array.
ODX Operation with Two Servers
In the two-server configuration, there are two servers and multiple storage arrays
managed by the same copy manager.

Figure 5. ODX Operation with Two Servers
In the previous illustration:
Server1 or Virtual Machine1 is the host of the source LUN, and Server2 or Virtual Machine2 is the host of the destination LUN. Server1 shares the source LUN with the application client through the SMB protocol, and Server2 shares the destination LUN with the application client in the same way. The application client has access to both the source LUN and the destination LUN.
The source and destination storage arrays are managed by the same copy
manager in a SAN configuration.
From the application client system, the copy offload application issues an offload
read request to the source LUN and receives the token from the source LUN, and
then issues an offload write request with the token to the destination LUN. The
copy manager moves the data from the source LUN to the destination LUN across
two different storage arrays in two different locations.

Massive Data Migration
Massive data migration is the process of importing a large amount of data such as
database records, spreadsheets, text files, scanned documents, and images to a new
system. Data migration could be caused by a storage system upgrade, a new
database engine, or changes in application or business process. ODX can be used to
migrate data from a legacy storage system to a new storage system, if the legacy
storage system can be managed by the copy manager of the new storage system.

Figure 6. Data Migration from Legacy to New Storage System
In the previous illustration:
Server1 is the host of the legacy storage system, and Server2 is the host of the new storage system. Server1 shares the source LUN with the data migration application client through the SMB protocol, and Server2 shares the destination LUN with the data migration application client in the same way. The application client has access to both the source and destination LUN.
The legacy storage system and new storage system are managed by the same
copy manager in a SAN configuration.
From the data migration application client system, the copy offload application
issues an offload read request to the source LUN and receives the token from the
source LUN, and then issues an offload write request with the token to the
destination LUN. The copy manager moves the data from the source LUN to the
destination LUN across two different storage systems at two different locations.
Massive data migration can also be performed with a single server at one location.
Host-Controlled Data Transfer within a Tiered Storage Device
A tiered storage device categorizes data into different types of storage media to
reduce costs, increase performance, and address capacity issues. Categories can be
based on levels of protection needed, performance requirements, frequency of
usage, and other considerations.
Data migration strategy plays an important role in the end result of a tiered storage strategy. ODX enables host-controlled data migration within the tiered storage device. The following diagram is an example of ODX in a tiered storage device.
Figure 7. Data Transfer within Tiered Storage Device
In the previous illustration:
The server is the host of the tiered storage system. The source LUN is the Tier1
storage device, and the destination LUN is the Tier2 storage device.
All tiered storage devices are managed by the same copy manager.
From the server system, the data migration application issues an offload read request to the source LUN and receives the token from the source LUN, and then issues an offload write request with the token to the destination LUN. The copy manager moves the data from the source LUN to the destination LUN across the two storage tiers.
When the data migration task is completed, the application deletes the data from
the Tier1 storage device and reclaims the storage space.

Conclusion
ODX introduces the tokenized read and write operations for the offloaded data
movement. It also allows the host server to interact with the copy manager during
the copy offload operation and falls back to the legacy copy process when a failure
occurs during the copy offload operation.
The key benefits of ODX are:
Copy offload across physical servers and virtual machines.
High-performance data transfer rate through the storage network.
Low server CPU usage and network bandwidth consumption during the offload
read/write operations.
Intelligent data movement options allow applications to optimize the offload
read/write solutions.

ODX can be operated from a physical server system or a virtual machine, and the source and destination disks can be physical disks, VHDs, or SMB shared disks. Many technologies, such as volume snapshots, copy-on-write, and extended copy implementations, have been applied to storage arrays to enhance massive data transfer. ODX provides a highly secure and efficient front-end interface between the host server and the copy manager of the storage systems. ODX is the first implementation based on the T10 XCOPY Lite specification. The application can interact with the storage device servers to perform offload read/write operations across storage arrays under the same copy manager.
In the future, platform design will need to involve the server cluster and storage cluster, because data movement and transactions can also be performed by the copy manager of the storage cluster. The storage cluster can span different vendors' storage arrays when the ROD token format is secured and recognized by different vendors' products. Because we expect the T10 Committee to continue developing the ROD token format, offload read and write operations could be implemented in all industry-standard storage products in the future.
Resources
T10 XCOPY Lite Specification (11-059r8)
http://www.t10.org/cgi-bin/ac.pl?t=d&f=11-059r8.pdf
Windows Offloaded Data Transfer Logo Requirement
http://msdn.microsoft.com/en-us/windows/hardware/gg487403
