
Huawei Storage Portfolio


FusionData DR Backup Archive


Device Management Intelligent Storage Management Intelligent O&M


DeviceManager OceanStor DJ eService

All-Flash Storage: OceanStor Dorado 18000 V6, 8000 V6, 6000 V6, 5000 V6, and 3000 V6
Hybrid Flash Storage: OceanStor 18500/18800 V5, 6800 V5, 5300/5500/5600/5800 V5, and 2200/2600 V3
Distributed Storage: OceanStor 100D and OceanStor 9000

Huawei Converged Storage
Architecture and Technical Overview
1 Introduction to Converged Storage Products

2 Converged Storage Architecture

3 Software and Features

Huawei Converged Storage Overview

Entry-Level Mid-Range High-End

OceanStor 6800 V5 OceanStor 18500/18800 V5

OceanStor 5800 V5
OceanStor 5600 V5
OceanStor 5500 V5
2200/2600 V3 5300 V5

Specification                 Entry-Level       Mid-Range         High-End
Controller enclosure height   2U                2U                4U
Controller expansion          2 to 8            2 to 16           2 to 32
Max number of disks           300 to 500        1200 to 2400      3200 to 9600
Cache specifications          16 GB to 256 GB   128 GB to 12 TB   512 GB to 32 TB

Multi-Level Convergence Makes Core Services More Agile

V5 V3

• Multiple storage types: Interconnection between different types, levels, and generations of flash storage
• SAN and NAS: Support for multiple types of services; industry-leading performance and functions
• SSD and HDD: HDDs and SSDs converged to meet the performance requirements of complex services
• Multiple storage resource pools: Pooling of heterogeneous storage resources; unified management and automated service orchestration
• A-A for SAN and NAS: Gateway-free converged data DR solution; smooth upgrade to 3DC

Multi-level convergence
99.9999% service availability, satisfying complex service requirements


OceanStor Converged Storage V5 Architecture: Comparison
Converged and parallel file and block SAN over NAS file system SAN over NAS file system
services architecture architecture

iSCSI/FC NFS/CIFS/FTP/HTTP File services Block services NAS


Block services File services WAFL


RAID Manager
Storage pool RAID 2.0+

Storage subsystems


• Converged block and file services • Converged block and file storage
• Parallel processing of file system • WAFL-based architecture • Standalone NAS gateway
services and block services in a • Unified file & block manager • NAS storage pool consisting of
storage pool • Physical RAID group one or more LUNs mapped
• Storage pool based on RAID 2.0+ from SAN storage

OceanStor V5 Software Architecture Overview

iSCSI/FC NFS/CIFS Operation Unified: NAS and SAN

System management software stacks are
Smart series: SmartThin, SmartDedupe, Deploy parallel. File systems are
Data service
SmartCompression, SmartQoS, SmartPartition, in redirect-on-write (ROW)
Cluster SmartTier, and SmartQuota Expand
service mode and LUNs are in
Hyper series: HyperSnap, HyperReplication, HyperVault, Upgrade
HyperClone, HyperLock, and HyperMetro
copy-on-write (COW)
License mode, adaptive to different
object Monitor
management Block service File service
service Alert Converged: NAS and SAN
Log resources are converged in
System object allocation on the
management Dashboard
Storage pool RAID 2.0+ management plane. SAN
and NAS resources are
Heterogeneous directly allocated from the
Kernel and Parallel Memory I/O
driver computing management scheduler Protection storage pool. All resources
Device object
management (disks, CPUs, and memory)
are shared, fully utilizing
SSD SAS NL-SAS Backup resources.

Convergence of SAN and NAS: How to Implement
Disk Domain Storage Pool LUN & FS Protocol

Grain Level-1
(Thin & iSCSI/FC
CKG Level-2 cache Extent Dedupe &
Tiered cache
(RAID group) (Stripe cache) (Tiering unit) Compression) Thick LUN (LUN & FS cache)


Level-1 cache
Block-level Tiered
tiering Thin LUN

Level-2 cache
SAS Not tiered
Thin LUN

Directly divided Grain File system
from CKG

NFS share and CIFS share

Tier2 File-level tiering

OceanStor converged storage architecture minimizes the I/O paths of SAN and NAS, and provides optimal performance.

OceanStor V5: Reliable Scale-out Storage

Data reliability Device reliability Solution reliability

Reliable RAID technology Reliable cluster system Reliable solution


App App



RAID 2.0+ architecture SmartMatrix 3.0 architecture Active-active architecture

RAID 2.0+ SmartMatrix 3.0 HyperMetro

Fastest reconstruction Tolerance of 3-controller failure RPO = 0, RTO ≈ 0

Key Technology: RAID 2.0+ Architecture

Hot Hot
spare spare

Traditional LUN RAID 2.0+ block

RAID virtualization virtualization
EMC: VMAX Huawei: RAID 2.0+
NetApp: FAS

20-fold faster data reconstruction

• Huawei RAID 2.0+: bottom-layer media virtualization + upper-layer resource virtualization for fast data reconstruction and intelligent resource allocation
• Fast reconstruction: Data reconstruction time is shortened from 10 hours to only 30 minutes. The data reconstruction speed is improved 20-fold. Adverse service impacts and disk failure rates are reduced.
• All disks in a storage pool participate in reconstruction, and only service data is reconstructed. The traditional RAID's many-to-one reconstruction mode is transformed to the many-to-many fast reconstruction mode.

More Reliable System with 20-Fold Faster Data Reconstruction

Reconstruction principle of traditional RAID: many-to-one reconstruction. A hot spare disk is required, and the reconstruction time is long.

Reconstruction principle of RAID 2.0+: many-to-many reconstruction. Reconstruction runs in parallel, and the reconstruction time is short.

With RAID 2.0+, data reconstruction time plummets from 10 hours to 30 minutes

Time for reconstructing a 1 TB NL-SAS disk: 10 hours with traditional technology, 30 minutes with Huawei's quick recovery technology

• The reconstruction speed of a traditional RAID group is 30 MB/s, and it takes 10 hours to reconstruct 1 TB data.
• According to Huawei's test results of reconstructing 1 TB data, RAID 2.0+ shortens the time from 10 hours to 30 minutes.
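The figures quoted above can be checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and treats the 20-fold speedup as a plain division of the rebuild time; the function and variable names are illustrative only.

```python
TB = 10**12  # assumption: decimal terabyte
MB = 10**6   # assumption: decimal megabyte


def rebuild_hours(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """Time in hours to rebuild a given capacity at a sustained rate."""
    return capacity_bytes / rate_bytes_per_s / 3600


# Traditional RAID: 1 TB rebuilt onto a single hot spare at 30 MB/s.
traditional_hours = rebuild_hours(1 * TB, 30 * MB)

# RAID 2.0+: the slides quote a 20-fold faster reconstruction.
fast_minutes = traditional_hours / 20 * 60

print(f"traditional: {traditional_hours:.2f} h, RAID 2.0+: {fast_minutes:.1f} min")
```

Under these assumptions the traditional rebuild takes a little over 9 hours and the 20-fold faster rebuild just under 28 minutes, which is consistent with the rounded "10 hours" and "30 minutes" on the slide.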

SmartMatrix 3.0 Overview

Fully interconnected and shared architecture for high-end storage (OceanStor 6800/18500/18800 V5)

• Four-controller interconnection: Front-end Fibre Channel interface modules, back-end interface modules, switch cards, and controllers are fully interconnected. Front-end and back-end I/Os are not forwarded.
• Single-link upgrade: When a host connects to a single controller and the controller is upgraded, interface modules automatically forward I/Os to other controllers without affecting the host.
• Non-disruptive reset: When a controller is reset or faulty, interface modules automatically forward I/Os to other controllers without affecting the host.
• New-gen power protection technology: Controllers house BBUs. When a controller is removed, its BBU provides power for flushing cache data to system disks. Even when multiple controllers are concurrently removed, data is not lost.

Note: The iSCSI and NAS protocols do not support front-end interconnect I/O modules.

Key Technology: SmartMatrix 3.0 Front-End Interconnect I/O Module (FIM)

• External FC links are established between hosts and FIMs. Each FC port on an FIM connects to four controllers using independent PCIe physical links, allowing a host to access the four controllers via any port.
• An FIM intelligently identifies host I/Os. Host I/Os are sent directly to the most appropriate controller without preprocessing by the controllers, preventing forwarding between controllers.
• In the figure, if controller 1 is faulty, services on controller 1 are switched over to other controllers within 1s. At the same time, the FIM detects that the link to controller 1 is disconnected and redistributes host I/Os to other functioning controllers by using the intelligent algorithm. The entire process is completed quickly and is transparent to hosts, without interrupting FC links between FIMs and hosts.

Key Technology: SmartMatrix 3.0 Persistent Cache

A A* C C* A C C* A C
B* B D* D B D* D B D

A* B* C* A*
D* B*


Cache mirroring states shown in the figure: normal; failure of one controller (such as controller A); failure of more controllers (such as controllers A and D)

• If controller A fails, controller B takes over its cache, and the cache of controller B (including that of controller A) is mirrored to controller C or D. If controller D is also faulty, controller B or C mirrors its cache.
• If a controller fails, services are switched rapidly to the mirror controller, and the mirror relationship between it and other controllers is re-established, so each controller has a cache mirror. In this way, write-back (instead of write-through) for service requests is ensured. This guarantees the performance after the controller failure and ensures system reliability, because data written into the cache has mirror redundancy.

Key Technology: Load Balancing of SmartMatrix 3.0 Persistent Cache
Controller A Controller B Controller A Controller B

A1 B1' B1 A1' B1 A3'

A2 C2' B2 D2' B2 D2'
A3 D3' B3 C3' B3 C3'
A1 D3'

C1 D1' D1 C1'
C1 D1' D1 C1' C2 A1' D2 B2'
C2 A2' D2 B2' C3 B3' D3 A2'
C3 B3' D3 A3' A2 B1' A3 C2'

Controller C Controller D Controller C Controller D

Work cache Mirror cache Work cache Mirror cache

Load Balancing of SmartMatrix 3.0 Persistent Cache

• Load balancing among four controllers: The cache of each controller's LUN is evenly mirrored to the cache of the other three controllers. The cache is persistently mirrored.
• Four-controller balanced takeover: If one controller is faulty, all LUNs on the faulty controller are evenly taken over by the other three controllers.
• Switchover of four owning controllers: The owning controller of LUNs and file systems can be switched to any of the other three controllers in seconds.
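The balanced takeover described above can be illustrated with a small sketch. The slides do not say how LUNs are assigned to the surviving controllers, so the round-robin assignment below is an assumption, and `balanced_takeover` is a hypothetical helper rather than a product interface.

```python
def balanced_takeover(lun_owner: dict[str, str], failed: str,
                      controllers: list[str]) -> dict[str, str]:
    """Return a new LUN-to-controller map after one controller fails.

    LUNs owned by the failed controller are spread round-robin over the
    survivors (assumed policy); all other LUNs keep their owner.
    """
    survivors = [c for c in controllers if c != failed]
    if not survivors:
        raise ValueError("no surviving controller to take over LUNs")
    new_owner = dict(lun_owner)
    orphaned = sorted(lun for lun, owner in lun_owner.items() if owner == failed)
    for i, lun in enumerate(orphaned):
        new_owner[lun] = survivors[i % len(survivors)]
    return new_owner


controllers = ["A", "B", "C", "D"]
owners = {f"{c}{n}": c for c in controllers for n in (1, 2, 3)}
after = balanced_takeover(owners, failed="A", controllers=controllers)
print({lun: after[lun] for lun in ("A1", "A2", "A3")})
```

With three LUNs on the failed controller and three survivors, each surviving controller takes over exactly one LUN, which matches the even distribution the slide describes.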

Key Technology: HyperMetro (Block & File)

Working Principles
• One device: Gateway-free, one device uses HyperMetro to support both active-active files and databases.
• One quorum system: SAN and NAS share one quorum site. Services are provided by the same site in the event of link failures, ensuring data consistency.
• One network: The heartbeat, configuration, and physical links between two sites are integrated into one link. One network supports both SAN and NAS transmission.

• Active-Active, RPO = 0, RTO ≈ 0
• Requires no gateway devices, simplifying networks, saving costs, and eliminating gateway-caused latency.
• Supports two quorum servers, improving reliability.
• Supports flexible combination of high-end, mid-range, and entry-level storage arrays for active-active solutions, saving investment.
• Supports smooth upgrade from active-active or active-passive solutions to 3DC solutions without service interruption.
• Flexibly supports 10GE or FC networks for intra-city interconnection and IP networks for quorum links.

Figure: A host application cluster (shared volumes mounted to HyperMetro file systems) spans Site A and Site B. Storage networks between storage arrays and hosts use IP and FC. The production storage systems at the two sites (SAN and NAS) mirror data in real time and dual-write heartbeats and configurations over FC/IP, and each connects to the quorum site over IP.


OceanStor Converged Storage Features

SAN Features


• SmartThin: Intelligent thin provisioning
• SmartQoS: Intelligent service quality control
• SmartPartition: Intelligent cache partitioning
• SmartCache: Intelligent SSD cache
• SmartDedupe: Intelligent inline deduplication
• SmartCompression: Intelligent inline compression
• SmartMulti-Tenant: vStore (tenant) administration
• SmartTier: Intelligent block tiering
• HyperSnap: Snapshot
• HyperReplication: Synchronous or asynchronous remote replication
• HyperMirror: Local active-active LUNs
• HyperCopy: LUN copy
• HyperClone: LUN clone
• HyperMetro: Synchronous mirroring for automatic failover
• VVol Support: Support for VMware VVols
• Encryption/Key Mgmt.: Internal Key Manager, self-encrypting drive (SED)

OceanStor Converged Storage Features
NAS Features

Protocols: CIFS (v1/v2/v3), NFS (v3/v4), HTTP/FTP/NDMP

• SmartThin: Intelligent thin provisioning
• SmartQoS: Intelligent service quality control
• SmartPartition: Intelligent cache partitioning
• SmartCache: Intelligent SSD cache
• SmartDedupe&Compression: Intelligent inline deduplication & compression
• SmartQuota: Quota management
• SmartMulti-Tenant: vStore (tenant) administration
• SmartTier: Intelligent file tiering
• HyperSnap: Snapshot
• HyperReplication: Internal and external asynchronous replication
• HyperVault: Inter-array backup
• HyperMetro: A-P cluster
• HyperClone: FS virtual clone
• GNS: Global namespace
• Internal DNS Service: Auto-load balance for NAS
• HyperLock: WORM

SmartMulti-Tenant Architecture: Network, Protocol, and Resource Virtualization

Protocol virtualization
vStore vStore
AD Client


Transmission (TCP/IP)
Authentication server User/Group manager
Group Share Share &NIS Share
DB Share AD & DNS Share LDAP & NIS

CIFS service NFS service NFS service

CIFS service NFS service

Lock manager
FS 0 FS 1 FS 2 FS 3 VFS

Storage pool A Storage pool B

• With protocol virtualization, each vStore has separate NAS service instances. Each vStore can configure NAS services separately, isolate I/O requests, and configure different AD/LDAP/NIS services from other vStores.

HyperMetro for NAS

Working Principles
High-availability synchronous mirror at a file-system level: When data is written to the primary file system, it will be synchronously replicated to the secondary file system. If the primary site or file system fails, the secondary site or file system will automatically take over services without any data loss or application disruption.

HyperMetro vStore pairs: (vStore1 <-> vStore1'), (vStore2 <-> vStore2')
HyperMetro pairs: (FS1 <-> FS1'), (FS2 <-> FS2'), (FS3 <-> FS3'), (FS4 <-> FS4')

Highlights
• Gateway-free deployment
• 1 network type between sites
• 2 components required for smooth upgrade
• 3 automatic fault recovery scenarios
• 4x scalability
• 5x switching speed

Figure: Site A hosts vStore1 to vStore3 with their file systems, and Site B hosts vStore1' to vStore3' with the paired file systems. Data and configurations are synchronized between the two sites, and both sites connect to a quorum server.

HyperMetro Architecture
LDAP, AD, and NIS servers

vStore A at the primary site vStore A' at the secondary site


User & AD & LDAP User & AD & LDAP

Share Share Share Share
Group DNS & NIS Group DNS & NIS
Protocol instance


File system
FS 0 FS 1 FS 2 FS' 0 FS' 1 FS' 2

Storage pool A Storage pool B

• HyperMetro enables synchronous mirroring of access networks, protocol instances, and file systems in a vStore, ensuring a seamless service failover to the secondary
site when the primary site fails.
• vStore A' on the secondary site is in the passive state. File systems in vStore A' cannot be accessed. After the failover, vStore A' is in the active state, and LIF, protocol
instances, and file systems are enabled. Service data, protocol configuration data, and network configurations will be identical, and the client can recover services by
retrying requests.

Service and Access Processes in the Normal State
1-7: Configurations made by the administrator on vStore A are synchronized to vStore A' in real time, such as the quota, qtree, NFS service, CIFS service, security strategy, user and user group, user mapping, share, share permission, DNS, AD domain, LDAP, and NIS. If a failure occurs, the changed configurations are saved in the CCDB log and the vStore pair status is set to "to be synchronized". After the link is recovered, the configurations in the CCDB log are automatically synchronized to vStore A'.

8-18: When a NAS share is mounted to the client, the storage system obtains the access permission of the share path based on the client IP address. If the network group or host name has the permission, the client obtains the handle of the shared directory. If a user writes data into a file, the NAS service processes the request and converts it into a read/write request of the file system. If it is a write request, the data synchronization module writes the data to the caches of both sites simultaneously, and then returns the execution result to the client.

Figure: The admin and the NAS client access vStore A on storage system A, which is paired with vStore A' on storage system B. Each side shows the NAS service, CFG sync, file system, CCDB, object set, data sync, cache, and storage pool, with concurrent writes to both caches (numbered steps 1 to 18).
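The configuration path in steps 1-7 can be modeled roughly as follows. This is a minimal sketch, not the product's implementation: the class, its attributes, and the "normal" status label are assumptions, while the "to be synchronized" status and the log replay on link recovery follow the description above.

```python
class VStorePair:
    """Toy model of configuration sync between vStore A and vStore A'."""

    def __init__(self) -> None:
        self.primary_cfg: dict[str, str] = {}
        self.secondary_cfg: dict[str, str] = {}
        self.ccdb_log: list[tuple[str, str]] = []  # changes not yet replicated
        self.link_up = True
        self.status = "normal"  # assumed name for the healthy state

    def set_config(self, key: str, value: str) -> None:
        """Apply a change on vStore A and replicate it, or log it if the link is down."""
        self.primary_cfg[key] = value
        if self.link_up:
            self.secondary_cfg[key] = value
        else:
            self.ccdb_log.append((key, value))
            self.status = "to be synchronized"

    def recover_link(self) -> None:
        """Replay the logged changes to vStore A' in order, then clear the log."""
        self.link_up = True
        for key, value in self.ccdb_log:
            self.secondary_cfg[key] = value
        self.ccdb_log.clear()
        self.status = "normal"


pair = VStorePair()
pair.set_config("quota:/fs1", "10GB")    # replicated in real time
pair.link_up = False                     # simulate a replication link failure
pair.set_config("share:/fs1", "nfs-rw")  # saved in the CCDB log only
pair.recover_link()                      # log is replayed to vStore A'
print(pair.secondary_cfg == pair.primary_cfg, pair.status)
```

Replaying the log in order means a key changed several times during the outage ends up with its latest value on the secondary side.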

Service and Access Processes During a Failover

1-8: When vStore A is faulty, vStore A' detects the faulty pair status and applies for arbitration from the quorum server. After obtaining the arbitration, vStore A' activates the file system, NAS service, and LIF status. NAS service configuration differences are recorded in the CCDB log, and data differences are recorded in the data change log (DCL). In this manner, vStore A can synchronize incremental configurations and data upon recovery.

The CCDB log and DCL are configured with power failure protection and have high performance.

Figure: The admin and the NAS client access vStore A' on storage system B after the failover. Each side shows the NAS service, CFG sync, file system, CCDB, object set, cache, and storage pool; storage system B additionally records the CCDB log and the DCL, and an LDAP/NIS server is shown (numbered steps 1 to 14).
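The role of the data change log (DCL) can be sketched as a set of changed block addresses that drives an incremental resynchronization. The block-address granularity and both function names are assumptions made for illustration; the slides only state that data differences are recorded in the DCL and synchronized upon recovery.

```python
def write_during_failover(store: dict[int, bytes], dcl: set[int],
                          addr: int, data: bytes) -> None:
    """Write on the surviving site and remember which address changed."""
    store[addr] = data
    dcl.add(addr)


def resync(source: dict[int, bytes], dcl: set[int],
           target: dict[int, bytes]) -> int:
    """Copy only the addresses recorded in the DCL; return how many were copied."""
    copied = 0
    for addr in sorted(dcl):
        target[addr] = source[addr]
        copied += 1
    dcl.clear()
    return copied


site_a = {0: b"a", 1: b"b", 2: b"c"}  # failed site, stale after the failover
site_b = dict(site_a)                 # surviving site
dcl: set[int] = set()

write_during_failover(site_b, dcl, 1, b"B")
write_during_failover(site_b, dcl, 1, b"BB")  # same address is logged once
write_during_failover(site_b, dcl, 3, b"d")

copied = resync(site_b, dcl, site_a)
print(copied, site_a == site_b)
```

Only two addresses are copied back even though three writes occurred, which is the point of tracking differences instead of performing a full copy.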

NFS Lock Failover Process

Synchronize a client's IP address pair
1. HyperMetro backs up a client's IP address pair to remote storage.

Notify the client
1. The NAS storage reads the list of IP address pairs from the CCDB.
2. The NAS storage sends NOTIFY packets to all clients to reclaim locks.

The client reclaims the lock
1. The client sends a lock reclaiming command to the storage.
2. The storage recovers byte-range locks.

Figure: In each of the three phases, the client mount (Mount1) is shown against the services of vStore A and vStore B. Client information is backed up through configuration synchronization, then read from the configuration before the notify and reclaim phases.

NAS HyperMetro: FastWrite

General solution
• Write I/Os undergo two interactions at two sites (write command and data transfer).
• 100 km transfer link: two round trip time (RTT) delays

FastWrite
• A proprietary protocol is used to combine the two interactions (write command and data transfer). The cross-site write I/O interactions are reduced by 50%.
• 100 km transfer link: only one RTT, improving service performance by 30%

Figure: In both cases a host and OceanStor V5 storage at Site A connect over a 100 km FC or 10GE link to OceanStor V5 storage at Site B. In the general solution the exchange runs from step 1 (write command), step 2 (transfer ready), and step 3 (data transfer) through step 8 (status good). With FastWrite it ends at step 5 (status good), saving one RTT.
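The link-delay part of this comparison can be estimated with a short calculation. The sketch assumes a one-way propagation delay of about 5 microseconds per kilometre of optical fiber, which is a common rule of thumb rather than a figure from the slides, and it ignores switching, protocol, and storage processing time.

```python
FIBER_US_PER_KM = 5.0  # assumption: ~5 us one-way propagation delay per km of fiber


def rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds for a link of the given length."""
    return 2 * distance_km * FIBER_US_PER_KM / 1000


def cross_site_write_ms(distance_km: float, round_trips: int) -> float:
    """Propagation delay added to one write that needs the given number of round trips."""
    return round_trips * rtt_ms(distance_km)


general_ms = cross_site_write_ms(100, round_trips=2)    # write command + data transfer
fastwrite_ms = cross_site_write_ms(100, round_trips=1)  # combined into one interaction

print(f"general: {general_ms:.1f} ms, FastWrite: {fastwrite_ms:.1f} ms")
```

Under this assumption the general solution adds about 2 ms of link delay per write and FastWrite about 1 ms. The 30% performance improvement quoted above is the source's own figure and is not derived here, since total write latency also includes components that do not depend on the link.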

Copy-on-write (COW): used by LUNs of OceanStor V5 storage
• Before snapshot creation: the active volume holds data blocks A, B, C, and D.
• During snapshot creation: a snapshot mapping table is created that references the same blocks A, B, C, and D.
• Modify data after a snapshot is created: the host modifies data D. Data block D is copied and the mapping table is modified, so the active volume holds A, B, C, and the modified data D1, while the snapshot still references the original D.

Redirect-on-write (ROW): used by file systems of OceanStor V5 storage
• Before snapshot creation: the active file system holds data blocks A, B, C, and D.
• During snapshot creation: the snapshot references the same blocks A, B, C, and D.
• Modify data: the snapshot keeps A, B, C, and D, while the active file system shows B as deleted data, D1 as modified data, and E1 and E2 as new data.
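The difference between the two snapshot modes can be made concrete with a toy model. Both classes below are illustrative assumptions that work on an in-memory list of blocks; they show the general COW and ROW techniques, not the internal layout of OceanStor LUNs or file systems.

```python
class CowVolume:
    """Copy-on-write: the old block is saved before it is overwritten in place."""

    def __init__(self, blocks: list[str]) -> None:
        self.active = list(blocks)
        self.saved: dict[int, str] | None = None  # index -> preserved old data

    def snapshot(self) -> None:
        self.saved = {}

    def write(self, idx: int, data: str) -> None:
        if self.saved is not None and idx not in self.saved:
            self.saved[idx] = self.active[idx]  # extra copy on the first overwrite
        self.active[idx] = data

    def read_snapshot(self) -> list[str]:
        if self.saved is None:
            raise RuntimeError("no snapshot has been created")
        return [self.saved.get(i, b) for i, b in enumerate(self.active)]


class RowFileSystem:
    """Redirect-on-write: new data goes to a new location; only pointers change."""

    def __init__(self, blocks: list[str]) -> None:
        self.store = list(blocks)                  # append-only physical blocks
        self.active_map = list(range(len(blocks)))
        self.snap_map: list[int] | None = None

    def snapshot(self) -> None:
        self.snap_map = list(self.active_map)      # freeze the pointers, copy no data

    def write(self, idx: int, data: str) -> None:
        self.store.append(data)
        self.active_map[idx] = len(self.store) - 1

    def read_active(self) -> list[str]:
        return [self.store[p] for p in self.active_map]

    def read_snapshot(self) -> list[str]:
        if self.snap_map is None:
            raise RuntimeError("no snapshot has been created")
        return [self.store[p] for p in self.snap_map]


cow = CowVolume(["A", "B", "C", "D"])
cow.snapshot()
cow.write(3, "D1")

row = RowFileSystem(["A", "B", "C", "D"])
row.snapshot()
row.write(3, "D1")

print(cow.active, cow.read_snapshot())
print(row.read_active(), row.read_snapshot())
```

In the COW model the first overwrite of a block costs an additional copy of the old data, whereas in the ROW model a write after the snapshot only appends a block and updates one pointer.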

2.1 Working Principles (1)
• The initial backup of HyperVault is a full backup, and subsequent backups are incremental backups.
• Because HyperVault is based on file systems, the backups are completely transparent to hosts and applications.
• Each copy at the backup file system contains full service data, not only the incremental data.
• The data at the backup file system is stored in the original format and is readable after the backup is complete.

Policy-based automatic backup

• Backup policy: local backup (up to four policies) and remote backup (up to four policies)
• Number of backups supported by each policy: 3 to 256
• Backup period: monthly, weekly, or daily

• Incremental backup and restoration
• Earliest backup deleted automatically based on a backup policy
• Both local backup and remote backup supported for restoration
• Interoperability among high-end, mid-range, and entry-level storage arrays

DR Star (SAN)
I/O process:
1. The host delivers I/Os to the primary LUN-A.
2. The primary site dual-writes the I/Os to the secondary LUN-B.
3. A write success is returned to the host.
4. Asynchronous replication starts and triggers LUN-A to activate the time slice Ta+1. New data written to LUN-A is stored in this time slice, and the Ta slice is used as the data source for the standby asynchronous replication.
5. LUN-B activates a new time slice Tb+1, and the new data is stored in this time slice. LUN-C activates a new time slice Tc+1 as the target of asynchronous replication. Tc is the protection point of asynchronous replication rollback.
6. LUN-B (Tb) is the data source for asynchronous replication to LUN-C (Tc+1).

Data in DC1 and DC2 is synchronous. After the data is copied from Tb to Tc+1, the data in Ta is also copied to Tc+1. This process is equivalent to the asynchronous replication between DC1 and DC3. If DC2 is faulty, DC1 and DC3 are switched to asynchronous replication. Incremental data is replicated from Ta to DC3.

Compared with the common 3DC solution:
1. There is a replication relationship between every two sites. Only one of the two asynchronous replication relationships has I/O replication services and the other one is in the standby state.
2. If the working asynchronous replication link is faulty or one of the active-active sites is switched over, the working link is switched to the standby link. Then, incremental synchronization can be implemented.
3. You only need to configure DR Star at one site.
4. DR Star supports active-active + asynchronous + standby and synchronous + asynchronous + standby networking modes. The asynchronous + asynchronous + standby networking mode is not supported.

Item                                                               Huawei      H**             E**
Active-active + asynchronous remote replication                    Supported   Supported       Not supported
Synchronous remote replication + asynchronous remote replication   Supported   Not supported   Supported
Configured at one site                                             Supported   Not supported   Supported

Figure: DC 1 holds LUN-A (Ta, Ta+1), DC 2 holds LUN-B (Tb, Tb+1), and DC 3 holds LUN-C (Tc, Tc+1). DC 1 and DC 2 are active-active, asynchronous replication runs from DC 2 to DC 3, and a standby asynchronous replication relationship exists between DC 1 and DC 3.

SmartTier (Intelligent Tiering)


• Extent I/O monitoring: Collects statistics on the activity levels of each extent.
• Data distribution analysis: Ranks the activity level of each extent.
• Data relocation: Relocates data based on the rank and relocation policy.

Tiers: Tier0: SSD, Tier1: SAS, Tier2: NL-SAS
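The three stages above (monitor, rank, relocate) can be sketched as a ranking followed by a fill from the fastest tier downwards. The I/O counters, tier capacities, and the "hottest extents first" policy are assumptions chosen for illustration; the slides do not specify the actual ranking or relocation policy.

```python
def plan_relocation(io_counts: dict[str, int],
                    tier_capacity: list[tuple[str, int]]) -> dict[str, str]:
    """Assign extents to tiers, hottest first, filling tiers in the given order.

    io_counts maps an extent ID to its monitored I/O count; tier_capacity lists
    (tier name, number of extents it can hold) from fastest to slowest.
    """
    # Rank extents by activity; ties are broken by extent ID for determinism.
    ranked = sorted(io_counts, key=lambda ext: (-io_counts[ext], ext))
    placement: dict[str, str] = {}
    start = 0
    for tier, capacity in tier_capacity:
        for extent in ranked[start:start + capacity]:
            placement[extent] = tier
        start += capacity
    if start < len(ranked):
        raise ValueError("tiers cannot hold all extents")
    return placement


io_counts = {"e1": 900, "e2": 5, "e3": 300, "e4": 40, "e5": 0}
tiers = [("SSD", 1), ("SAS", 2), ("NL-SAS", 2)]
placement = plan_relocation(io_counts, tiers)
print(placement)
```

The most active extent lands on the SSD tier, the next two on SAS, and the least active on NL-SAS. A real relocation would then move only the extents whose planned tier differs from their current one.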


File system (Dir, Dir)

• File policy: Indicates the user-defined file write policy and relocation policy. The supported attributes include the file
• File distribution analysis: Scans the list of files to be relocated based on the file policy.
• File relocation: Relocates files based on policy.

Tiers: Tier0: SSD, Tier1: SAS/NL-SAS

SmartTier for NAS (Intelligent File Tiering)

Tiering policy

Automatic relocation mode:
Files are first written into SSDs, and then dynamically relocated between SSDs and HDDs based on the SSD usage and file access frequency.

Customized relocation mode:
Users are allowed to specify file policies (including the file name, file name extension, size, creation time, access time, and modification time) and relocation policies (weekly, in intervals, and immediately).

Figure: File scanning moves files in a file system between the performance tier (SSD) and the capacity tier (HDD).
Comparison between SmartTier for Block and SmartTier for File
Data Relocation
Feature Tier Scope
Relocation Speed Relocation Mode

High, medium, and

SmartTier for Block Three tiers (SSD/SAS/NL-SAS) One storage pool Extent Automatic
Automatic and customized
SmartTier for File Two tiers (SSDs and HDDs) One file system File Automatic

SmartTier File Relocation Principles

Relocation process between the performance tier (SSD) and the capacity tier (NL-SAS/SAS HDD):
1. Scan the file system, obtain file attributes, and identify hot and cold files based on user configurations.
2. Add files to the background relocation task.
3. Relocate these files to the specified media.
4. The relocation is complete.

SmartTier policy

Initial file write policy
• Preferentially writes data into the performance tier.
• Preferentially writes data into the capacity tier.
• Determines the tier to which the file is written based on the file attributes.

Relocation period
• Specifies the start time.
• Specifies the running duration.
• Can be paused.

Relocation condition
• File time (Atime/Ctime/Mtime/Crtime)
• File size
• File name
• File name extension
• SSD utilization

Policy specifications
• A maximum of 10 file policies can be created for each file system.
• Each policy supports multiple
• The priority of file policies can be set.
SmartTier for NAS and Background Deduplication & Compression
Configure SmartTier to improve performance and save space:
• Enable SmartTier for the file system and configure the automatic relocation mode where all data is written into the performance tier (SSD tier).
• Set the SmartTier relocation time from 22:00 to 05:00.
• In SmartTier, enable deduplication and compression during relocation.

SmartTier policy example

0: Create a file system.
1 (8 A.M. to 8 P.M.): New data is written to SSDs in the performance tier.
2 (10 P.M. to 5 A.M.): Data is deduplicated and compressed when relocated to HDDs (SAS) in the capacity tier.
3 (5 A.M. to 8 A.M.): Deduplication and compression are complete.
4 (8 A.M. to 8 P.M.): New data is written to SSDs.

Copyright © 2020 Huawei Technologies Co., Ltd. All Rights Reserved.

The information in this document may contain predictive statements including, without limitation, statements regarding the future financial and operating results, future product portfolio, new
technology, etc. There are a number of factors that could cause actual results and developments to differ materially from those expressed or implied in the predictive statements. Therefore, such
information is provided for reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the information at any time without notice.