
COMMVAULT HYPERSCALE™

STORAGE POOL & APPLIANCE

{NEW MODULE}

Welcome to the Commvault HyperScale™ Software & Appliance Solution Architecture Module.

1
LEARNING GOALS

Understand
▪ What is Commvault HyperScale™ Software
▪ Why use HyperScale Software
▪ HyperScale architecture and resiliency
▪ HyperScale Appliance architecture
▪ HyperScale StoragePool sizing and resiliency

Consider
▪ HyperScale server options and components
▪ Validated Reference Design Program
▪ Linux VSA for virtual server protection
▪ HyperScale StoragePool and Appliance deployment examples

In this module you will learn more about the Commvault HyperScale™ solution.
This module covers both the HyperScale software solution in the form of the
HyperScale Storage Pool and the Commvault HyperScale Appliance. You will hear
how the HyperScale architecture provides increased scalability and resiliency
through the use of Gluster File System with Erasure Coding.

At the end of this module you will be able to position and explain the different
HyperScale offerings and follow the documented sizing guidelines and best-
practice recommendations for designing a complete Commvault solution utilizing
HyperScale.

2
Commvault HyperScale™ Storage Pool &
Reference Architecture

3
WHAT IS A COMMVAULT HYPERSCALE™ STORAGE POOL?

HyperScale Storage Pool

▪ Software defined architecture
▪ Server based storage nodes
▪ Fast, easy deployment
▪ Horizontal & vertical scaling
▪ Inherent resiliency

Diagram: Blocks of nodes with internal disk, connected over an internal network, with dynamic Block expansion

Narrative:

Commvault HyperScale™ is a software defined architecture to manage secondary storage, where traditional data management techniques are inadequate.
HyperScale can create a scalable Storage Pool from commodity big-data servers
containing local hard drives, based on a reference architecture {CLICK} capable of
achieving multi-petabyte storage that can be used for various purposes, without
having to purchase expensive storage arrays or NAS devices.

{CLICK}
HyperScale utilizes Red Hat Linux with the Gluster File System and erasure coding on standardized servers. The data itself is stored as redundant, erasure-coded fragments, allowing for multiple disk or node failures without using a traditional shared storage approach for resiliency and recovery. Under the hood, {CLICK} each HyperScale node is a MediaAgent, which means that the compute power to process data management tasks scales horizontally with the actual size of the environment.

4
WHY USE A COMMVAULT HYPERSCALE™ STORAGE POOL?

Standard Scale-Up Architecture
▪ Design Servers and Storage
▪ Configure Storage
  – RAID groups, multi-pathing, volumes, zoning, mount paths
▪ Configure Servers
  – Install OS, mount & prepare volumes, configure networking
▪ Configure Software
  – Create disk libraries, global dedupe policy

Diagram: 4 Node Grid with RAID controllers, optional NAS and expansion disk shelves; 500 TB – 2 PB usable capacity

Narrative:

A standard, scale-up architecture requires dedicated hardware with fully redundant servers and storage components, with increased costs. The storage resiliency is based on RAID, and the loss of a component can often have a knock-on effect on other components, as well as data access. In the end, management, repair, replacement, and even expansion require significant downtime and manual intervention. For example, {CLICK} if a MediaAgent goes down, it tends to affect the storage that is associated with that MediaAgent. If a disk fails in an array, that affects the RAID group that is accessed by that MediaAgent. A traditional hardware failure on scale-up architecture has a real impact by requiring manual intervention and downtime, which will impact other applications and affect end users when it occurs.

The standard scale-up architecture requires that you architect storage redundancy separately from software and MediaAgent based redundancy, thus increasing the
number of components, complexity and ultimately the cost of implementing a
solution.

Commvault HyperScale™ collapses these requirements by automating the deployment and the redundancy of the solution, scaling out the storage pool without having to deal with the complexity of building blocks. The whole system is designed to be flexible enough to add or replace nodes without requiring manual configuration. HyperScale removes the manual effort required to design, configure and implement the storage, servers, and software infrastructure to support data management tasks, with all those tasks automated through a simple install that allows for easy expansion and replacement of nodes.

5
COMMVAULT HYPERSCALE™ ARCHITECTURE OVERVIEW

HyperScale Storage Pool

Diagram: Block-1 through Block-4, each containing Node 1, Node 2 and Node 3, with bricks grouped into sub-volumes

▪ HyperScale Storage Pool requires a minimum of 3 identical servers (nodes)
  – No RAID controller required
  – 2 x network interfaces (10GigE)
  – Internal HDDs (bricks) should be of similar type and capacity
▪ Grow Storage Pool by adding more Blocks
▪ Blocks must contain an equal number of nodes
▪ Grow Blocks by adding more disks

Narrative:

Now let’s discuss the architecture of the HyperScale Storage Pool, which as
mentioned earlier, is built on a Gluster File System that uses Erasure Coding and
commodity servers, all based on a reference architecture.

{CLICK} Each HyperScale Storage Pool requires a minimum of three servers, known
as nodes. The nodes do not require a RAID controller as the Erasure coding takes
care of data redundancy at the software layer. The nodes are to have similar
resources of CPU, memory, storage and network. Individual nodes are required to
have a minimum of two network ports to support public data protection traffic,
and for private traffic between the nodes in the hyperconverged cluster. 10 Gigabit Ethernet is recommended.

{CLICK} Each group of nodes is known as a Block, while the Storage Pool itself
grows in increments of Blocks. Each Block must contain the same number of
nodes, either 3 or 6, depending on the level of resiliency configured…

{CLICK} Each node in a block contains an equal number of internal hard disks. In
Gluster terms a brick is a basic unit of storage, represented by an export directory, stored on the underlying filesystem. For Commvault HyperScale, we create one brick on each physical hard disk, {CLICK Block2/3 animate in} which should be of similar type and density.

{CLICK} Sets of bricks automatically form what are known as sub-volumes using Erasure coding within the Gluster File System. This provides software defined
resiliency and recovery from hardware failures non-disruptively. The illustration
here shows 4+2 erasure coding, more on this in a moment. This means that each
sub-volume will contain 6 bricks.

{CLICK} In addition to adding blocks, blocks themselves can be scaled vertically by adding additional hard disks (or bricks). {CLICK} In the case of (4+2) erasure coding
this means we need to scale in sets of 6 hard disks (bricks).
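
As a worked example of this arithmetic, the short Python sketch below (a hypothetical helper of my own, not part of any Commvault tool) computes the bricks, sub-volumes and expansion increment for a block; the 3-node, 12-drive case yields the 6 sub-volumes shown in the illustration.

```python
# A minimal sketch (not Commvault tooling) of the brick/sub-volume arithmetic
# described above, assuming one brick per physical disk and 4+2 erasure coding.

def subvolume_layout(nodes_per_block: int, drives_per_node: int,
                     data_fragments: int = 4, parity_fragments: int = 2):
    """Return brick count, sub-volume size, and sub-volumes per block."""
    bricks_per_block = nodes_per_block * drives_per_node
    subvolume_size = data_fragments + parity_fragments      # bricks per sub-volume
    if bricks_per_block % subvolume_size:
        raise ValueError("Bricks per block must be a multiple of the sub-volume size")
    return {
        "bricks_per_block": bricks_per_block,
        "bricks_per_subvolume": subvolume_size,
        "subvolumes_per_block": bricks_per_block // subvolume_size,
        "expansion_increment_drives": subvolume_size,        # scale in sets of 6 for 4+2
    }

# Example: the 3-node, 12-drive block from the slide -> 36 bricks, 6 sub-volumes.
print(subvolume_layout(nodes_per_block=3, drives_per_node=12))
```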

6
ERASURE CODING: RESILIENCY WITH STORAGE EFFICIENCY
Software driven RAID-like redundancy

Diagram: File 1 (2 MB), File 2 (4 MB) and File 3 (3 MB) pass through Erasure Coding Calculations, producing 6 x 0.5 MB, 6 x 1 MB and 6 x 0.75 MB segments spread across three 12-drive nodes (Node 1, Node 2, Node 3)

Erasure Code – 3 node, 4 data, 2 redundancy (4+2)
▪ Tolerates the loss of 2 drives or 1 node without affecting data
▪ Consumes 1.5X capacity
  – 2 MB file occupies 3 MB on disk, 4 MB file occupies 6 MB
  – 67% efficiency (33% of raw space used for resiliency)

Compared to 2 node replication
▪ Tolerates the loss of just 1 drive or 1 node
▪ Consumes 2X capacity
  – 2 MB file occupies 4 MB on disk
  – 50% efficiency (50% of raw space used for resiliency)

3 node replication
▪ Tolerates the loss of 2 drives or 2 nodes
▪ Consumes 3X capacity
  – 2 MB file occupies 6 MB on disk
  – 33% efficiency (67% of raw space used for resiliency)

Narrative:

Let’s take a closer look at how Erasure coding in the Gluster File System provides both storage resiliency and efficiency.

What is “erasure coding”? It’s an industry-accepted method for doing RAID-like redundancy without RAID-like rebuild times. It’s a software driven method of data storage protection in which data blocks are broken into fragments, expanded and encoded with redundant information to allow rebuild of missing blocks, and those blocks are then stored across a set of different locations or storage media. Where RAID stores parity on other disks, and thus is subject to failure if the array goes down, along with expensive and time-consuming rebuild times, {CLICK} erasure coding spreads parity across NODES, allows nearly-instant recovery and resiliency, and can tolerate more failures than traditional RAID.

So, how do we leverage that? It’s built into the platform. It’s achieved through a segmentation and redundant encoding process that is resilient and tolerant enough to allow for multiple node failures without impacting data access, and allows for rebuild as replaced or repaired nodes are brought back online.

Now, how you build and configure your nodes in a storage pool
affects availability, resiliency, and redundancy. Let’s look at an
example.

{CLICK} Here, we’re starting with a 3 node block, each node containing 12 drives, per our standard architecture recommendation.
The pool is configured for 4+2 erasure encoding. Each data block is
broken into 4 chunks written as 4 blocks across 4 different drives,
with 2 parity (or redundancy) blocks that can allow for the rebuild
after the loss of any of the other blocks.
{CLICK} Let’s say we have a 1 or 2MB file and I need to read it from
the storage pool and write it to some system over here. I read the
file pieces, run it through the erasure coding algorithm and get back
a 2MB file from those segments I read.

A key characteristic of erasure coding is that if a file gets broken into multiple segments, as in this context of 4+2 erasure coding, at any
given time only 4 of the written blocks are needed to bring the file
back. {CLICK} I can read ANY 4 (such as two data segments and 2
parity segments), run it through the erasure coding mechanism, and
reconstitute the file.
As you can see, this is why each node is comprised of compute in
addition to storage. The resiliency afforded by this method requires
some calculations for every read and write, using compute resources
distributed across your scale-out storage pool.

{CLICK} Let’s take another example… a 4MB file. This file, when written, will be broken {CLICK} into six segments, each of those 1MB in size. {CLICK} We take those same segments and land them
on different drives spread out across the pool.
Let’s say in this context I lose two drives, on the same server or on
different servers. {CLICK} I still have four segments available,
{CLICK} so I’ll be able to read those four segments, run it through
the erasure coding algorithm, and produce my original 4MB file.

{CLICK} Final example, with a 3MB file broken into six 0.75MB segments, and written as before in the pool. {CLICK} In this case,
instead of losing individual drives or data, I lose the entire node.
{CLICK} Regardless, I still have 4 available segments and can
reconstitute the original file.

{CLICK} So what’s the benefit of this? Your 4+2 configuration can tolerate the loss of 2 drives or an entire node without suffering data
loss or access issues. It doesn’t require 2x the storage space like
RAID1 mirroring would require, nor does it suffer the incredibly long
data-outage rebuild times that other RAID levels have. Rather, it
provides real-time resiliency while only using 50% more space than
the original data. 67% efficiency, using just 33% of the raw space to
provide this level of resiliency.
{CLICK} Compare that to 2 node replication, or RAID 1, which can
tolerate the loss of just one drive or node, yet consumes 2x the
space of the original data. It’s only 50% efficient.
{CLICK} 3 node replication, as is used in HDFS with a replication
factor of 3, sounds better, as it can tolerate the loss of 2 drives or
nodes, but it consumes 3X the storage! That’s a miserable 33%
efficiency rate!
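
The capacity comparison just described boils down to a simple ratio. The Python sketch below is illustrative only (the overhead helper is a hypothetical name, not a Commvault or Gluster API); it reproduces the 1.5x/2x/3x multipliers and the 67%/50%/33% efficiency figures from the slide.

```python
# A minimal, illustrative sketch (not a Commvault tool) of the capacity math
# narrated above: raw-space multiplier and efficiency for 4+2 erasure coding
# versus 2-way and 3-way replication.

def overhead(data_fragments: int, redundancy_fragments: int) -> dict:
    """Capacity multiplier and efficiency for a k+m erasure-coding scheme.
    Replication is the special case k=1 (m extra full copies of the data)."""
    multiplier = (data_fragments + redundancy_fragments) / data_fragments
    return {"multiplier": multiplier, "efficiency": 1 / multiplier}

schemes = {
    "4+2 erasure coding": overhead(4, 2),   # 1.5x on disk, 67% efficient
    "2 node replication": overhead(1, 1),   # 2.0x on disk, 50% efficient
    "3 node replication": overhead(1, 2),   # 3.0x on disk, 33% efficient
}

for name, o in schemes.items():
    on_disk_2mb = 2 * o["multiplier"]       # e.g. a 2 MB file occupies 3 MB under 4+2
    print(f"{name}: {o['multiplier']:.1f}x capacity, "
          f"{o['efficiency']:.0%} efficient, 2 MB file -> {on_disk_2mb:.0f} MB on disk")
```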

7
CONFIGURABLE RESILIENCY

Good – 4+2 erasure code (default)
▪ 3 nodes, 12 drives per node: 6 sub-volumes per block; tolerates failure of 2 drives per sub-volume (12 drives per block*) and 1 node failure per 3-node block
▪ 3 nodes, 24 drives per node: 12 sub-volumes per block; tolerates failure of 2 drives per sub-volume (24 drives per block*) and 1 node failure per 3-node block
▪ Recommendation: provides the smallest expansion granularity within a single block

Better – 8+4 erasure code
▪ 3 nodes, 12 drives per node: 3 sub-volumes per block; tolerates failure of 4 drives per sub-volume (12 drives per block*) and 1 node failure per 3-node block
▪ 3 nodes, 24 drives per node: 6 sub-volumes per block; tolerates failure of 4 drives per sub-volume (24 drives per block*) and 1 node failure per 3-node block
▪ Recommendation: higher dispersion may affect performance slightly; especially critical to establish failure domains across discrete racks to tolerate loss of an entire rack

Best – 8+4 erasure code
▪ 6 nodes, 12 drives per node: 6 sub-volumes per block; tolerates 4 drive failures per sub-volume (24 drives per block*) and 2 node failures per 6-node block
▪ 6 nodes, 24 drives per node: 12 sub-volumes per block; tolerates 4 drive failures per sub-volume (48 drives per block*) and 2 node failures per 6-node block
▪ Recommendation: recommended for large UCS S3260 based deployments starting at near 1 PB of consumable capacity

*If failed drives are evenly distributed across sub-volumes. In case of uneven drive failures, some portion of data access may be affected.

Narrative:

The following table shows how each HyperScale Storage Pool can be designed and
configured for various levels of resiliency using either the default 4+2 or for higher
resiliency 8+4.

Take a moment to study the table and click next to proceed when you are ready.

8
COMMVAULT HYPERSCALE™ STORAGE POOL SIZING

Block size (nodes per block): 3 or 6
▪ 3-node block: 6, 12 or 24 HDDs per node = 18, 36 or 72 HDDs in the block
▪ 6-node block: 6, 12 or 24 HDDs per node = 36, 72 or 144 HDDs in the block

https://documentation.commvault.com

Narrative:

Sizing a reference architecture (RA) designed specifically for data management translates to a decision on the number and capacity of HDDs to be used at the server
node level. This is because compute and network resources are predetermined to
deliver a required level of performance. The choice of server node is to be based
on current disk capacity needs while also providing for future growth.
Following is a platform sizing table with total usable space for varying block sizes,
HDD count and capacity.

Total disk capacity requirement is dependent on such customer driven data protection Service Level Agreement (SLA) criteria as frequency of backups, rate of
change of data, retention period and number of data replicas. The idea is to
translate customer data protection requirements into disk capacity needs. Then,
usable capacity from the above sizing and resiliency tables may be used to arrive
at a platform configuration that meets customer requirements.

{CLICK} The latest Commvault HyperScale™ Storage Pool Sizing and Scalability
guidelines can be found on the Commvault® documentation website.

9
KNOWLEDGE CHECK

Match each description to the corresponding component (Node, Block, Sub-Volume, Storage Pool, Brick, Erasure Coding):

▪ A commodity “big-data” server containing SSD and internal HDD
▪ A group of bricks coded with Erasure coding
▪ A scalable software defined storage layer
▪ Algorithmic software for RAID-like data redundancy
▪ A HDD contained within a node
▪ A group of 3 or 6 HyperScale nodes
Narrative:

Drag the Commvault HyperScale™ descriptions to match the corresponding component.

10
COMMVAULT HYPERSCALE™ SERVER OPTIONS

▪ Not an appliance. There is no factory integration, nor is there integrated hardware support from Commvault®
▪ Hardware issues must be raised with the server vendor separately
▪ Can combine 12 drive form factor and 24 drive form factor in a single StoragePool

80 TB – 600 TB per site
▪ 2U 12 Drive LFF servers with internal NVMe or SSD cache
▪ Example vendors: Dell, HPE, SuperMicro, UCS
▪ Validated Designs completed and available
▪ Support partially filled nodes, 6 drives per node to start with. Expand with 6 additional drives

200 TB – multi-PB per site
▪ 2U 24 Drive servers with internal NVMe or SSD cache
▪ Example vendors: HPE, SuperMicro, UCS
▪ More Validated Designs being added; Release Candidates available
▪ Support partially filled nodes, 6 drives per node to start with. Expand with 6 additional drives

Narrative:

Now that you have a good understanding of the architecture and components of a Commvault HyperScale™ Storage Pool, let’s look at some of the different Server
Options for building a solution.

{CLICK}
For an 80 to 600 terabyte back-end capacity, you will be looking at the 2U, 12 drive, large form factor servers, with internal NVMe or SSD drives for the index cache and deduplication store. Vendors such as Dell, HPE, SuperMicro, and Cisco
have appropriate server models in this category that have been validated by
Commvault® for the HyperScale Storage Pool.

One of the flexibilities of a 12 drive server node is that the chassis can be partially populated with 6 drives per node. Then when the customer approaches the
usable capacity you can simply expand the Storage pool by installing an additional
6 drives in each node.

{CLICK}
Moving up to large or enterprise sites, when you are looking at over 200 terabytes to multi-petabyte capacities, you should consider the high-density 2U, 24 drive server options. Examples of vendors with offerings in this space are HPE, SuperMicro, and Cisco. Again, you can partially fill each node and expand capacity by installing additional hard disk drives in increments of 6 as required.

11
VALIDATED REFERENCE DESIGN PROGRAM FOR COMMVAULT
HYPERSCALE™ SOFTWARE

▪ Ease of Acquisition
▪ Simple Installation and Integration
▪ Centralized Manageability
▪ Single Patch and Firmware Update
▪ One Point of Contact

While a build-your-own solution allows for greater customization, it often requires staffing resources and time to bring the system into production. The
Commvault® Validated Reference Design Program simplifies designing, planning,
and implementing solutions using Commvault HyperScale™. Commvault Validated
Reference Designs deliver tested configurations with leading hardware vendor
technology that provide validated designs complemented by best practice
configurations that will accelerate ROI, reduce complexity, and add customer
value.

{CLICK} Commvault is actively working with 8 vendors in the program today, with
potential for other vendors to be qualified based on customer and market
demands. The validated reference designs go through a series of development
stages starting with vendor qualification and creating release candidates all the
way through to testing, validation and finally - external publication.

{CLICK} There are now several validated reference designs, generally available via
commvault.com. We encourage our partners to work with their local Commvault
representative, who may also engage the dedicated alliances team responsible for
the Validated Reference Designs, on opportunities involving vendor reference design candidates.

12
LINUX VSA AND HYPERSCALE

Virtual Server Agent (VSA) integrates seamlessly with HyperScale™

▪ The Virtual Server Agent is installed by default on each HyperScale Node
▪ No additional VSA Proxies required for hypervisors that support protection through a Linux VSA
▪ Windows VSA proxies required for certain tasks
  – Refer to documentation.commvault.com

Diagram: Hypervisor protected by the HyperScale Grid, with an optional VSA Proxy Group

Narrative:

The Virtual Server Agent or VSA has been ported to Linux {CLICK} and is included
on every HyperScale node. {CLICK} This means that any Hypervisors that rely on a
Linux VSA traditionally, {CLICK} do not require separate proxies to perform data
protection and recovery.
{CLICK}
For hypervisor features that have traditionally relied on a Windows based VSA, {CLICK} the solution may require an optional VSA proxy group running
Windows. For example, at the time of writing this training, granular file level
recovery in Windows and VMware SAN mode operations that utilize stand-alone
proxies with HBAs connected to the SAN, would both require a Windows VSA
proxy.

As enhancements are added to the Linux VSA on a regular basis and feature parity
is sought, {CLICK} it is recommended to check the Commvault documentation
website for the latest requirements and currently supported features.

13
Commvault HyperScale™ Appliance

14
CHALLENGES OF BUILD YOUR OWN SOLUTIONS

Lifecycle: Acquire – Install – Admin – Patch – Support
Components: Backup Application, Server Platform, Storage, Operating System, Add-on Components

▪ Too many moving parts
▪ Too many vendors
▪ Complex interop
▪ Unpredictable performance and cost

Building your own data management infrastructure is often challenging. The typical steps the customers have to go through include Acquisition, Installation &
Integration, Administration, Patching and Upgrades, and Support.

There are numerous components which must be ordered, interoperability of the components must be verified, install and integration takes hours, sometimes days,
configuration is tested and performance tweaked. For administering the
infrastructure, the customer is faced with multiple components to manage,
and multiple patches and firmware updates that are separate and need to be tested. Support often turns into a finger-pointing game with 3, 4 or even 5 vendors. Lastly, adding capacity without adding more compute often leads to unpredictable performance and may lead to unforeseen costs.

15
COMMVAULT HYPERSCALE™ APPLIANCE
Extending the value of Commvault® Data Platform with a turnkey solution

▪ Rich feature set – Commvault Data Platform


▪ Converged solution – server, storage, and
all software included
▪ Highly distributed scale out architecture
▪ Linear scalability of capacity and
performance
▪ Enterprise grade resiliency

Commvault® is now extending the benefits of the Commvault data platform into a
turnkey appliance solution, Commvault HyperScale™ appliance. {CLICK1}
Commvault HyperScale appliance is a converged data management solution that
combines {CLICK2} server, storage and all data management capabilities into an
easy to use appliance form factor. {CLICK3} Built on a highly distributed scale-out architecture, Commvault HyperScale appliances eliminate many of the limitations encountered in legacy scale-up data protection architectures, such as performance bottlenecks and disruptive upgrades.

With HyperScale appliances, {CLICK4} customers grow compute and storage capacity linearly, which provides predictable growth of capacity and performance. Lastly,
{CLICK5} the appliances are built with enterprise grade resiliency in mind, which
means no single point of failure in any of the underlying components.

16
COMMVAULT HYPERSCALE™ APPLIANCE – SPECIFICATION

Commodity Hardware

▪ 1U appliance node, 3 node initial configuration

▪ 4 HDDs and 2 SSDs per node

▪ *Available in 32, 48, 64, 80 TB usable capacities

Intelligent Software Layer


▪ Scale-out architecture

▪ Modular scalability

▪ No single point of failure

▪ Self healing architecture


* Capacities shown in base10 format

Now let’s discuss the technical specification of the Commvault HyperScale™ Appliance.

The hardware consists of commodity components. Each node is a 1U server and, like the HyperScale Storage Pool, each appliance consists of a 3 node
configuration, which is required for the initial deployment. Each node consists of
4 Hard disk drives and 2 solid state drives. The 2 SSDs (1 SATA SSD and 1 NVMe)
host the pieces of the architecture that require the highest levels of performance.
All protected data is stored on the larger Hard disk drives which are available in 4
capacities – 4, 6, 8, and 10TB. After the erasure coding overhead, the available
space in the 4 offerings is 32, 48, 64, or 80 TB respectively.

On top of the commodity hardware runs the intelligent software layer, a scale-out shared-nothing architecture that provides features such as the ability to scale modularly, no single point of failure, and a self-healing architecture.
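
The usable capacities quoted above follow directly from the drive count and the 4+2 erasure coding overhead; the short Python sketch below (an illustrative calculation, not an official Commvault sizing tool) reproduces the 32/48/64/80 TB figures.

```python
# A small illustrative check (assumptions: 3 nodes x 4 HDDs per appliance and
# 4+2 erasure coding, as described above; not an official sizing tool).

NODES, HDDS_PER_NODE = 3, 4
DATA_FRAGMENTS, TOTAL_FRAGMENTS = 4, 6          # 4+2 erasure coding

for drive_tb in (4, 6, 8, 10):
    raw_tb = NODES * HDDS_PER_NODE * drive_tb
    usable_tb = raw_tb * DATA_FRAGMENTS / TOTAL_FRAGMENTS
    print(f"{drive_tb} TB drives: {raw_tb} TB raw -> {usable_tb:.0f} TB usable")
# Prints 32, 48, 64 and 80 TB usable, matching the four appliance offerings.
```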

17
COMMVAULT HYPERSCALE™ APPLIANCE – ARCHITECTURE

Diagram callouts:
▪ Hypervisor Layer for Highly Available CommServe®
▪ Nodes and Storage Pool
▪ Sub-Volumes used for Erasure Coding
▪ Partitioned Deduplication Database
▪ NVMe Flash – used for Deduplication Database and Index Cache
▪ SATA SSD – used for Operating System and Commvault® Binaries
▪ 4 drives attached to each node will form a storage pool to act as a single mount path
▪ Dual port 10GbE adapter – used for data and private storage cluster networks

The Commvault HyperScale™ Appliance consists of compute and storage blocks which form a hyperconverged scale-out Storage Pool. {CLICK} Each node has
dedicated NVMe flash used for the Deduplication Database and Index Cache,
while the {CLICK} SATA solid state drives are dedicated for the operating
system and Commvault® binaries. {CLICK} The twelve SAS disk drives are
dedicated to form the Commvault data storage pool. The {CLICK} Commvault HyperScale Appliance combines erasure coding with a KVM hypervisor layer, providing node resiliency and a highly available CommServe® with maximum storage capacity.

{CLICK} Each node features a dual port 10 Gigabit Ethernet adapter with an LC
SFP+ transceiver installed in each port. These can be used for 10GbE fiber
cabling or can be removed for supported copper Twinax cabling if desired.
Remember to ensure that the customer has a 10GbE capable network infrastructure, and that they supply the required fiber cables to connect the appliance.

All data management tasks, including backups and restores, as well as virtual CommServe connectivity are established through the 10GbE data port. All
storage related tasks including all cluster connectivity for the backend storage
cluster network will be through the private storage network 10GbE port.

Each node is a dedicated Linux MediaAgent, making the appliance easy to fit into an existing Commvault environment or to promote the appliance as the CommServe & MediaAgent combination.

A hypervisor layer is created to host a VM, which will include the CommServe component, making it highly available.

18
Deployment Examples

We’ll now look at some deployment examples for both the Commvault HyperScale™ Storage Pool and HyperScale Appliance…

19
DEPLOYMENT EXAMPLE #1: SMALL TO MEDIUM SITES

Single Site: Lead with Appliance, Expand with Appliance
OR
Multi-Site Replication: Site 1 and Site 2 with Cross Site Replication

Considerations
▪ New Customer
  – CS deployed as VM on the node
  – No additional hardware needed
▪ Existing Customer
  – Use existing CS, external to appliance
  – Optionally, migrate CS to VM on appliance

Narrative:

The first example is a small to medium site.

{CLICK} In a single site you can start with a Commvault HyperScale™ appliance and
expand using an additional appliance, up to a maximum of 6 nodes.

{CLICK} Or in a multi-site scenario, an appliance can be used in each site for local protection with cross site replication.

{CLICK} For new Commvault® customers, the CommServe® is deployed as a highly available virtual machine on the HyperScale appliance, meaning that no
additional hardware is required. For existing CommCells, the appliance can use a
CommServe that is external or optionally migrate the CommServe role onto the
appliance itself.

20
DEPLOYMENT EXAMPLE #2: MANY SMALL SITES TO CLOUD

Diagram: Site 1 and Site 2 with appliances (up to 6 nodes), Site 3 with < 20 TB, all copying to Public Cloud for Secondary Storage and DR

▪ Multiple protection sites
▪ Secondary copy stored in Public Cloud for DR
▪ Appliances used in larger sites, small site uses stand-alone MediaAgent
▪ Flexible & Economical

Narrative:

In the second example we have multiple sites that are creating a secondary copy
of data inside the Public Cloud for Disaster Recovery. In this particular example we
also have a very small site, with less than a 20 terabyte requirement. In this
situation we have an appliance deployed at site 1 and site 2, again with either 3 or
6 nodes, and then you can choose just a stand-alone MediaAgent in site 3. This
provides increased flexibility and economy.

21
DEPLOYMENT EXAMPLE #3: MANY SMALL SITES TO LARGE DATA CENTER

Diagram: Site 1 and Site 2 appliances and a Site 3 HyperScale Storage Pool replicating to a central data center running Commvault HyperScale Software as a secondary Storage Pool

▪ Large central DC containing CS & HyperScale Storage Pool
▪ Site 1 and Site 2 utilize a Commvault HyperScale™ Appliance
▪ Site 3 utilizes a HyperScale Storage Pool reference architecture

Narrative:

In this third example, we have multiple small sites, copying data back to a larger
central data center. Site 1 and Site 2 fall within the documented sizing
recommendations for a HyperScale appliance. Site 3 has a larger data footprint
and is utilizing a HyperScale Storage Pool reference architecture deployment.

The central data center contains both the CommServe® and a larger HyperScale
Storage Pool, capable of retaining secondary copy data from all three smaller
sites.

22
MIXING COMMVAULT HYPERSCALE™ OPTIONS

Start small, expand to large

Diagram: Sites start small with an appliance (20 TB – 80 TB), expand with an additional appliance, then add reference architecture blocks (72 TB – 480 TB each) to scale out; a central site runs Commvault HyperScale Software as a secondary Storage Pool

Narrative:

It is also worth mentioning that you can mix and match HyperScale options within
the same Storage Pool. For example, a customer may start small with the appliances but, once they have grown beyond 6 nodes, may add reference architecture to scale out the solution.

23
Commvault HyperScale™ Sizing

24
RIGHT SIZING HYPERSCALE™ INITIAL DEPLOYMENT & EXPANSION

▪ Standard scale-up sizing – Beginning quantity of data to protect + anticipated annual growth x 3 to 5 years = purchase all the storage and servers up front
▪ New HyperScale sizing – Beginning workload to protect + 6 to 9 months of growth = Initial block(s) of storage
▪ As Storage Pool approaches full capacity, grow “on demand” – add another block

Diagram: Year 1 through Year 5 workload growth with servers purchased up front (scale-up), versus current workload plus 6-9 months of growth with blocks added on demand (HyperScale)

Standard scale-up sizing
✓ “Bulk” hardware discount
✓ “Cushion” for growth
x High up front cost
x Under utilized hardware
x Un-utilized at EOL
x Cost of ongoing maintenance
x Using old technology

New HyperScale sizing
✓ Low up front cost
✓ Grow as needed, no reconfig
✓ Nothing else to think about
✓ No future reconfiguration
✓ Evergreen storage pool
✓ Moore’s & Kryder’s Laws
x Makes people nervous…

Narrative:

Sizing the initial deployment is important when comparing HyperScale with a standard scale-up architecture.

{CLICK1} With the standard scale-up model, {CLICK2} you would typically start
with the current quantity of data that needs to be protected, {CLICK3} plus the
anticipated annual growth, usually for the next 3 or 5 years. {CLICK4} Many
customers have become accustomed to purchasing all of the servers and storage
upfront, including {CLICK5} the associated installation, {CLICK6} maintenance,
{CLICK7} rack space, {CLICK8} power, and {CLICK9} cooling. {CLICK10} The
advantage of this approach is that the customer often negotiates a bulk
hardware discount due to the size of the purchase, {CLICK11} and although they
may not use all of that compute and storage capacity, {CLICK12} they feel like they
have a cushion for growth. On the negative side, {CLICK13} although they receive
a bulk discount, {CLICK14} the up front cost is still significantly high. {CLICK15}
There is also the risk of hardware being underutilized or reaching end of life.
Additionally, {CLICK16} there is the cost of ongoing maintenance and the risk of
being forced to run with older technology.

{CLICK17}
When you begin to size a HyperScale opportunity, begin with sizing the initial
blocks based on the size of the workload to protect, plus 6 to 9 months of growth.
{CLICK18} As the storage pool approaches capacity, you can grow it by simply adding additional blocks of servers.

{CLICK19}
This will provide the customer with a low up front cost, because they are just buying what they need. Customers are getting more familiar with this {CLICK20} cloud-economic model, whereby they only consume {CLICK21} what they need, {CLICK22} when they need it. {CLICK23} As it is easy to add blocks to the
storage pool, there is almost no additional configuration. You do not have to go
back and re-balance the storage or reconfigure clients etc. This follows Moore’s &
Kryder’s laws, whereby customers can expand at a later time {CLICK24} by adding
the same hardware at a cheaper price, {CLICK25} or faster, higher-density
hardware at the same price as the previous purchase. In that sense customers will
be able to always take advantage of the most current technology by adding it into
the StoragePool. {CLICK26} When servers in the pool reach end of life and are no
longer serviceable, {CLICK27} the data can be migrated to newer hardware and
removed without impacting the environment. {CLICK28} You can help reassure
people who may be nervous or unfamiliar with this approach. {CLICK29} You can
even use the admin console to show how easy it is to view the current storage
pool usage and predict when additional capacity will be required.

25
COMMVAULT HYPERSCALE™ BLOCK SIZING CHART

Columns: Nodes per Block | Drive Size | Drives per Node | Drives in Block | Raw Capacity per Block | Usable Capacity per Block | Usable Capacity Base 2 | Suggested Protection Workload (1): # of VMs (2), Size of Files (3), Application Size (4)

Commvault HyperScale Appliance
3 | 4TB  | 4  | 12 | 48TB  | 32TB  | 29TB  | 240   | 24TB  | 19TB
3 | 6TB  | 4  | 12 | 72TB  | 48TB  | 43TB  | 350   | 35TB  | 28TB
3 | 8TB  | 4  | 12 | 96TB  | 64TB  | 58TB  | 480   | 48TB  | 38TB
3 | 10TB | 4  | 12 | 120TB | 80TB  | 72TB  | 600   | 60TB  | 48TB

Commvault HyperScale Software (Partially Filled Nodes/Partial Node Expansion)
3 | 6TB  | 6  | 18 | 108TB | 72TB  | 65TB  | 550   | 55TB  | 42TB
3 | 8TB  | 6  | 18 | 144TB | 96TB  | 87TB  | 750   | 75TB  | 56TB
3 | 10TB | 6  | 18 | 180TB | 120TB | 109TB | 950   | 95TB  | 70TB

Commvault HyperScale Software (Fully Populated Nodes)
3 | 6TB  | 12 | 36 | 216TB | 144TB | 130TB | 1,100 | 110TB | 84TB
3 | 8TB  | 12 | 36 | 288TB | 192TB | 116TB | 1,500 | 150TB | 112TB
3 | 10TB | 12 | 36 | 360TB | 240TB | 238TB | 1,900 | 190TB | 140TB
3 | 6TB  | 24 | 72 | 432TB | 288TB | 261TB | 2,250 | 225TB | 168TB
3 | 8TB  | 24 | 72 | 576TB | 384TB | 349TB | 3,000 | 300TB | 224TB
3 | 10TB | 24 | 72 | 720TB | 480TB | 436TB | 3,800 | 380TB | 336TB

▪ No Additional MediaAgents required
▪ Additional External Windows Proxies may be needed
▪ On the Appliance, CommServe® runs on the nodes
▪ On Reference Designs CS is external

1 Typical Daily Change Rate of 1-2%, 30-90 day retention. Combination of workloads could yield different outcomes.
2 Average VM size is 50 GB – 100 GB protected using VSA. Larger VMs may be considered as Apps. Incremental forever with Change Block Tracking.
3 Assumes Incremental forever for files and no database dumps backed up as files.
4 Applications and DBs have higher daily change rate. Protected using Application Agents.
* Cisco UCS S3260 partially loaded will require an extra frame.

Narrative:

The following is a sizing chart to help you size the different block options.

You will notice that we are also showing Base2 capacity, which is what the
Commvault® licensing is based on.
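
For reference, the Base 2 column is simply the decimal usable capacity re-expressed in binary terabytes. The small Python sketch below is an illustrative conversion only, not Commvault tooling; truncating to whole TB is an assumption made here to match the chart.

```python
# Illustrative only: how a base-10 "usable TB" figure maps to the Base 2 column
# used for licensing (decimal TB -> binary TB). Not an official Commvault calculator.

def base2_tb(usable_tb_base10: int) -> int:
    """Convert decimal terabytes to binary terabytes (TiB), truncated to whole TB."""
    return int(usable_tb_base10 * 10**12 // 2**40)

for usable in (32, 48, 64, 80):                  # the appliance usable capacities
    print(f"{usable} TB usable (base 10) -> ~{base2_tb(usable)} TB (base 2)")
# Prints 29, 43, 58 and 72, matching the Base 2 column for the appliance rows.
```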

You can see that Commvault offers a number of block sizes to address different
customer capacity requirements. As mentioned earlier in the module, you have
the ability with the HyperScale StoragePool to deploy a block with partially filled
nodes and then grow the capacity by simply adding hard disk drives into the
empty spaces.

This sizing chart is accurate at the time of writing this training. Please refer to the
Commvault documentation website for the latest sizing metrics. Partners should
contact their local distributor or Commvault representative if they require further
assistance with sizing HyperScale.

Click next to move to the next section after you have finished reviewing the chart.

26
SIZING METHODOLOGY: EXAMPLE

Information Needed
▪ # of Sites
▪ For each site: # of VMs, File Server Capacity, Application Capacity
▪ # of Copies to retain (on-premises and in the Cloud)

Example
▪ # of Sites: 2
▪ Site 1: 500 VMs, 200 TB File Server Capacity, 50 TB Application Capacity
▪ Site 2: 800 VMs, 500 TB File Server Capacity, 80 TB Application Capacity
▪ # of Copies to retain: 2 on-premises, 1 in the Cloud

Narrative:

Here is a simple sizing methodology for Commvault HyperScale™.

This methodology harks back to earlier modules when we discussed high-level design and data profiling. We start with the number of sites to protect and then,
for each site, the number of VMs, and then the File Server capacity and
application server capacity, both measured in front end terabytes.

{CLICK}
For this example we have 2 sites, please take a moment to review the sizing
information collected and click next when you are ready to go to the example.

27
SIZING EXAMPLE

Assumptions (per copy)
▪ VMs: 100 GB FET per VM; Normal Retention: 1.2X FET; Long Retention (1 year): 2X FET
▪ File Servers: Normal Retention: 1.2X FET; Long Retention (1 year): 1.5X FET
▪ Applications: Normal Retention: 1.5X FET; Long Retention (1 year): 2.5X FET

Calculation                    Site 1       Site 2
FET for VMs:                   50 TB        80 TB
Space needed for VMs:          60 TB        96 TB
Space needed for Files:        240 TB       600 TB
Space needed for Apps:         75 TB        120 TB
Space for Local Backup:        375 TB       816 TB
Space for Other Site Copy:     816 TB       375 TB
Total Space for Site:          1,191 TB     1,191 TB
Space required in Cloud:       375 TB       816 TB
Suggested Nodes:               9X 24 Drive Nodes per site, 36 TB of SSD per site

SDT is being simplified and refreshed. Use for advanced configurations with appropriate assumptions.

Let’s look at how we calculate the sizing from the previous example.

{CLICK} First, we are using known customer averages for each data set. For example, the average VM size protected across all reported CommCells is approximately 60 gigabytes; we have padded that out here to 100 gigabytes to allow for some variation. For normal retention, that’s between 30 and 60 days, you will need 1.2 times the front-end terabytes of capacity in order to retain that data. For long term retention of a year, the capacity requirement is 2 times the front end terabytes.

{CLICK} File servers also have a 1.2 times ratio for normal retention and 1.5x for
long term retention.

{CLICK} Applications require 1.5x and 2.5x for normal and long term retention
respectively.

{CLICK} Let's look at how this maps out for the example on the previous slide. As
you can see we have 375TB of local space required for Site 1 and 816TB space for
site 2. As we included 2 copies in the requirements, each site has an allowance for replicating its data to the alternate site, in addition to the space required by the third copy in the cloud.

This equates to approximately 1.2 petabytes of space per site.

{CLICK} We can accomplish this requirement with nine nodes, each with 24 x 10 terabyte drives installed.
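
To make the arithmetic explicit, here is a small Python sketch of the calculation above. It is a hedged illustration that simply reuses the slide’s multipliers and the 100 GB-per-VM figure; the function name and structure are my own, not an official Commvault sizing tool.

```python
# A hedged sketch of the sizing arithmetic walked through above (illustrative
# only; the retention multipliers and 100 GB-per-VM figure come from the slide).

VM_SIZE_TB = 0.1          # 100 GB FET per VM
VM_FACTOR, FILE_FACTOR, APP_FACTOR = 1.2, 1.2, 1.5   # normal-retention multipliers

def local_backup_tb(num_vms: int, file_fet_tb: float, app_fet_tb: float) -> float:
    """Space for one local backup copy of a site, in TB."""
    return (num_vms * VM_SIZE_TB * VM_FACTOR
            + file_fet_tb * FILE_FACTOR
            + app_fet_tb * APP_FACTOR)

site1 = local_backup_tb(500, 200, 50)    # -> 375 TB
site2 = local_backup_tb(800, 500, 80)    # -> 816 TB

# Each site also holds the other site's replica; the third copy goes to the cloud.
print(f"Site 1: local {site1:.0f} TB, total {site1 + site2:,.0f} TB, cloud {site1:.0f} TB")
print(f"Site 2: local {site2:.0f} TB, total {site1 + site2:,.0f} TB, cloud {site2:.0f} TB")
```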

28
COMMVAULT HYPERSCALE™ SIZING STEPS

1. Determine the best fit solution for the size and type of customer
2. Perform high-level design and data-profiling tasks
3. Consider other aspects, e.g. growth factor, redundancy level (4+2 or 8+4), additional proxies, 10GbE networking
4. Size the Commvault HyperScale™ infrastructure using the latest documented metrics

Diagram: Active Data Center (Production) and Passive Data Center (DR), with a secondary CommServe® (DR), secondary MediaAgent and DR copy of the data, shown alongside the block sizing chart from the previous slide

Here are the HyperScale sizing steps:

1. One, determine the best fit solution. For example, with small to medium
enterprises you may decide to lead with the Commvault HyperScale™
appliance, and for larger enterprise customers lead with a scale out solution
using a validated reference design. Additionally, very small sites may utilize a
stand-alone MediaAgent if necessary. Remember you can also mix the
HyperScale Appliance with the reference architecture to scale out the
solution beyond 6 nodes.

2. Two, perform the high-level design steps and data profiling tasks as outlined
in the earlier modules.

3. Three, remember to consider other aspects of the solution, for example the growth factor required initially. Remember not to oversize the storage pool; you now have the knowledge to educate the customer on how they can easily scale out, non-disruptively, on demand.

4. You can then perform the actual sizing of the HyperScale solution, in terms
of nodes and storage capacity, using the latest sizing metrics provided on the Commvault® documentation website.

29
WRAP-UP

▪ Commvault HyperScale™ Software Overview


▪ HyperScale architecture and resiliency
▪ HyperScale Appliance architecture
▪ HyperScale StoragePool sizing and resiliency

▪ HyperScale server options and reference architecture


▪ Linux VSA for virtual server protection
▪ HyperScale StoragePool and Appliance deployment examples

Narrative:

Thank you for watching.

In this module you learned more about the Commvault HyperScale™ solution. We
discussed the HyperScale software solution in the form of the HyperScale Storage
Pool and Commvault HyperScale Appliance. You then heard how the HyperScale
architecture provides increased scalability and resiliency through the use of
Gluster File System with Erasure Coding. We discussed some server options for
the HyperScale StoragePool followed by the Commvault® Validated Reference
Design program.

We then walked through some deployment examples and finally you learned how
to position and size a HyperScale solution effectively.

30
Questions?
Suggestions?
TECHENABLEMENT@COMMVAULT.COM

31
