
Technical white paper

HP StoreOnce Backup systems, G3 BBxxxA models


Best practices for VTL, NAS, StoreOnce Catalyst
and Replication implementations

Contents

Abstract 5
Related products 5
Validity 5
Executive summary 6
General StoreOnce best practices at a glance 6
VTL best practices at a glance 7
NAS best practices at a glance 7
Catalyst best practices at a glance 7
HP StoreOnce Technology 8
Key factors for performance considerations with deduplication
that occurs on the StoreOnce Backup system 8
StoreOnce Catalyst overview 8
Key features 9
VTL and NAS Replication overview 9
Housekeeping overview 10
Backup Application considerations 11
Multi-stream or Multiplex 11
Use multiple backup streams 11
Data compression and encryption backup application features 12
Network and Fibre Channel best practices 15
Network configuration guidelines 15
Single Port configurations 16
Dual Port configurations 17
Bonded port configurations (recommended) 18
10GbE Ethernet ports on StoreOnce 4420/4430 Backup systems 19
Network configuration for CIFS AD 19
Fibre Channel configuration guidelines 23
Switched fabric 23
Direct Attach (Private Loop) 24
Zoning 24
Use soft zoning for high availability 25
Diagnostic Fibre Channel devices 26
StoreOnce Catalyst configuration guidelines 28
Catalyst technology 28
Generic sizing rule for media servers running Catalyst API 29
Catalyst Copy 29
Maximum concurrent jobs and blackout windows 30
Configuring client access 32
For more information 32
VTL configuration guidelines 33
Summary of best practices 33
Tape library emulation 33
Emulation types 33
Cartridge sizing 34
Number of libraries per appliance 34
Backup application configuration 35
Blocksize and transfer size 35
Rotation schemes and retention policy 35
Retention policy 35
Rotation scheme 35
StoreOnce NAS configuration guidelines 37
Introduction to StoreOnce NAS backup targets 37
Overview of NAS best practices 37
Shares and deduplication stores 37
Maximum concurrently open files 38
Backup application configuration 38
Backup file size 38
Disk space pre-allocation 39
Block / transfer size 40
Concurrent operations 40
Buffering 40
Overwrite versus append 40
Compression and encryption 40
Verify 41
Synthetic full backups 41
CIFS share authentication 41
StoreOnce Replication 42
StoreOnce VTL and NAS replication overview 42
Best practices overview 42
Replication usage models (VTL and NAS only) 43
What to replicate 45
Appliance, library and share replication fan in/out 46
Concurrent replication jobs 46
Apparent replication throughput 46
What actually happens in replication? 47
Limiting replication concurrency 47
WAN link sizing 47
Seeding and why it is required 48

Seeding methods in more detail 50
Seeding over a WAN link 50
Co-location (seed over LAN) 52
Floating StoreOnce appliance method of seeding 54
Seeding using physical tape or portable disk drive and backup application copy utilities 56
Replication and other StoreOnce operations 58
Replication blackout windows 58
Replication bandwidth limiting 58
Source Appliance Permissions 59
Replication and Catalyst monitoring 60
Configurable Synchronisation Progress Logging and Out of Sync Notification 60
Activity monitor 61
Replication throughput totals 61
Catalyst throughput totals 62
Using HP Replication Manager to monitor replication and Catalyst copy status 63
Housekeeping monitoring and control 65
Terminology 65
Tape Offload 69
Terminology 69
Direct Tape Offload 69
Backup application Tape Offload/Copy from StoreOnce Backup system 69
Backup application Mirrored Backup from Data Source 69
Tape Offload/Copy from StoreOnce Backup system versus Mirrored Backup from Data Source 69
When is Tape Offload Required? 69
Catalyst device types 70
VTL and NAS device types 71
Key performance factors in Tape Offload performance 72
Summary of Best Practices 73
Appendix A Key reference information 74
Appendix B – Fully Worked Example 76
Hardware and site configuration 76
Backup requirements specification 77
Remote Sites A/D 77
Remote sites B/C 77
Data Center E 77
Using the HP Storage sizing tool 78
Configure replication environment 78
Remote Sites A/D 79
Remote sites B/C 85
Data Center E 87
Sizing Tool output 90
Understanding the HTML output from the Sizing Tool 90
Configure StoreOnce source devices and replication target configuration 95
Sites A and D 95
Sites B and C 95
Site E 95
Map out the interaction of backup, housekeeping and replication for sources and target 96
Tune the solution using replication windows and housekeeping windows 97
Worked example – backup, replication and housekeeping overlaps 98

Catalyst Sizing example 102
StoreOnce Catalyst support in the Sizing Tool 102
Worked Example 102
Appendix C: Guidelines on integrating HP StoreOnce with HP Data Protector 7,
Symantec NetBackup 7.x and Symantec Backup Exec 2012 106
HP StoreOnce Catalyst: Configuration, Display and Set-up 107
Status tab 107
Settings tab 107
Permissions tab (per store) 108
Catalyst stores 108
Data stored within Catalyst 109
Catalyst copy jobs 109
Examples of NetBackup Data Job entries 109
Examples of HP Data Protector Item entries 110
Examples of Backup Exec Item entries 111
Catalyst Implementation in HP Data Protector 7 112
Integrating HP Data Protector 7 with StoreOnce Catalyst 112
HP Data Protector gateways 112
Deduplication types with the explicit gateway 113
Key Points: 113
Example scenario 114
Configuring Data Protector 115
Key Points: 117
Creating a Data Protector Specification for backup to a StoreOnce Catalyst store 117
Backup using source-side deduplication (implicit gateway) 118
Backup using server-side deduplication (explicit gateway) 118
Selecting gateways in HP Data Protector 119
Catalyst Copy Implementation in HP Data Protector 7 (Object Copy) 119
Key Points: 121
Setting up Object Copy 121
HP Data Protector 7 – Catalyst best practices summary 124
HP StoreOnce Catalyst stores and Symantec products 126
StoreOnce Catalyst implementation with Symantec NetBackup 128
Integrating HP StoreOnce Catalyst stores with Symantec NetBackup 128
Configuring a StoreOnce Catalyst store in Symantec NetBackup 129
Catalyst Copy Implementation in Symantec NetBackup (Storage Lifecycle policy – duplicate) 132
Symantec NetBackup 7.x – Recovery from Catalyst copies 136
Symantec NetBackup 7.x – Catalyst best practices summary 137
Integrating StoreOnce Catalyst with Symantec Backup Exec 139
Device Configuration in Backup Exec 2012 140
Configuring a backup Job in Backup Exec 2012 147
Catalyst Copy Implementation in Symantec Backup Exec (Duplicate) 150
Symantec Backup Exec – Recovery from Catalyst copies 153
Image Clean up 156
Symantec Backup Exec 2012 – Catalyst best practices summary 157
Catalyst Low Bandwidth backups over High Latency Links 159
Index 161
For more information 164

Abstract
The HP StoreOnce Backup system products with Dynamic Data Deduplication are Virtual Tape Library, NAS share and Catalyst store appliances
designed to provide a cost-effective, consolidated backup solution for business data and fast restore of data in the event of loss.

To get the best performance from an HP StoreOnce Backup system there are some configuration best practices that can be applied; these are
described in this document.

Related products
Information in this document relates to the following products:
Product                           Generation   Product Number
HP StoreOnce 2610 iSCSI Backup    G3           N/A
HP StoreOnce 2620 iSCSI Backup    G3           BB852A
HP StoreOnce 4210 iSCSI Backup    G3           BB853A
HP StoreOnce 4210 FC Backup       G3           BB854A
HP StoreOnce 4220 Backup          G3           BB855A
HP StoreOnce 4420 Backup          G3           BB856A
HP StoreOnce 4430 Backup          G3           BB857A

Validity
This document provides, for G3 products only, an equivalent to “HP StorageWorks D2D Backup System Best Practices for Performance
Optimization” (HP Document Part Number EH990-90921). Please note that EH990-90921 is relevant only to the older G1 and G2 StorageWorks
D2D product family.

Note: G3 products run software versions 3.x.x. G2 and G1 products run software versions 2.x.x and 1.x.x, respectively.

Best practices identified in this document are predicated on using up-to-date StoreOnce system software (check www.hp.com/support for
available software upgrades). In order to achieve optimum performance after upgrading from older software there may be some pre-requisite
steps; see the release notes that are available with the software download for more information.

Executive summary
This document contains detailed information on best practices to get good performance from an HP StoreOnce Backup system with HP StoreOnce
Deduplication Technology.

HP StoreOnce Technology is designed to increase the amount of historical backup data that can be stored without increasing the disk space
needed. A backup product using deduplication combines efficient disk usage with the fast single file recovery of random access disk and also
enables the use of low bandwidth replication to provide a very cost-effective disaster recovery solution.

As a quick reference these are the important configuration options to take into account when designing a backup solution.

General StoreOnce best practices at a glance


 Always use the HP StoreOnce Sizing tool to size your StoreOnce solution. It is available at:
http://h30144.www3.hp.com/SWDSizerWeb/default.htm.
 Always ensure that the appliance software in your HP StoreOnce Backup system is fully up-to-date. Software upgrades also contain all the
necessary component firmware upgrades. (Check at http://www.hp.com/support.)
 Where possible send the same data types to the same device configured on the StoreOnce system; this will maximize deduplication ratios.
 Run multiple backups in parallel to improve aggregate throughput to a StoreOnce appliance; the maximum number of simultaneous streams
per configured device is shown in Appendix A.
 Create separate windows for backup, replication, housekeeping and offload to physical tape so that overall performance is much more
predictable than when all processes interact with each other in overlapping windows.
 Configure multiple ports in a network bond to achieve maximum available network throughput.
 Identify other performance bottlenecks in your backup environment such as slow clients and media servers.
 Best throughput is achieved with multiple streams; the actual number per device/appliance varies by model – see Appendix A.
 Choose the backup device type that best fits your needs and is supported by your backup software provider; VTL, NAS (CIFS & NFS) and
StoreOnce Catalyst device types are available.

Virtual Tape (VTL)
  Key features:  Uses virtual tape drives and virtual slots to emulate physical tape libraries.
  Best used in:  Enterprise FC SAN environments (B6200 and 4210 FC/4220/4420/4430). HP StoreOnce also supports iSCSI VTL (not B6200).
  Comments:      Tried and tested; well understood with traditional backup applications. Uses the Robot and Drives device type.

NAS (CIFS/NFS shares)
  Key features:  NAS shares can be easily configured and viewed by the operating system: CIFS shares for Windows, NFS shares for Unix.
  Best used in:  Specific environments that do not support tape emulation backup or that prefer to back up directly to disk. In some cases the
                 licensing may be lower cost for NAS shares as a backup target. Consider this device type for virtualized environments.
  Comments:      This is a NAS target for backup – not recommended for random NAS file type access. Uses the Basic Disk device type.

StoreOnce Catalyst
  Key features:  Backup software has total control over the HP StoreOnce appliance, providing source-based deduplication, replication control,
                 improved DR and so on.
  Best used in:  Environments that require a single management console for all backup and replication activities and the ability to implement
                 federated deduplication*. Wherever possible HP recommend the use of HP StoreOnce Catalyst.
  Comments:      May require additional plug-in components on Media Servers. Uses the OpenStorage device type (Symantec) or Backup to Disk
                 (HP Data Protector) device type.

Table 1: HP StoreOnce Device Types


* Federated deduplication is an HP term referring to the ability to distribute the deduplication load across Media servers. This feature is
sometimes known as source-based deduplication or low bandwidth backup.

VTL best practices at a glance
 Make use of multiple network or Fibre Channel ports throughout your storage network to eliminate bottlenecks.
 For FC configurations, split virtual tape libraries and drives across multiple FC ports (FC VTL is available on StoreOnce B6200, 4210 FC/
4220/4420/4430 models).
 Configure multiple VTLs and separate data types across them; for example SQL to VTL1, File to VTL2, and so on.
 Configure larger “block sizes” within the backup application to improve performance.
 Disable any multiplexing configuration within the backup application.
 Disable any compression or encryption of data before it is sent to the StoreOnce appliance.
 Best throughput is achieved with multiple streams, the actual number per device/appliance varies by model – see Appendix A.
 Schedule physical tape offload/copy operations outside of other backup, replication or housekeeping activities.

NAS best practices at a glance


 Configure multiple shares and separate data types into their own shares.
 Adhere to the suggested maximum number of concurrent operations per share/appliance, as shown in Appendix A.
 Choose backup application container file sizes so that the entire backup job fits within that container size. For example, if a full backup is
500 GB, set the container size to be at least 500 GB.
 Each NAS share has a 25,000 file limit, and some backup applications create large numbers of small control files during backup to disk. If this is
the case, it may be necessary to create additional shares and distribute the backup across multiple shares.
 Disable software compression, deduplication and synthetic full backups.
 Do not pre-allocate disk space for backup files within the backup application.
 Best throughput is achieved with multiple streams, the actual number per device/appliance varies by model – see Appendix A.
 For NFS shares ensure the correct mount options are used to ensure in-order delivery and provide better deduplication ratios. See the “HP
StoreOnce Linux and UNIX Configuration Guide” for specific details.

StoreOnce Catalyst best practices at a glance


HP StoreOnce Catalyst is a unique interface and is fundamentally different from virtual tape or NAS. It provides the backup application with full
control of backup and replication (called Catalyst Copy). For this reason, best practices are dependent upon the backup application. See Appendix
C for more details. Generic best practices for HP StoreOnce Catalyst implementations are:
 Ensure that the media servers where Catalyst low bandwidth backup is to be deployed are sized accordingly; otherwise, the implementation
will not work well.

 As with other device types, the best deduplication ratios are achieved when similar data types are sent to the same device.
 Best throughput is achieved with multiple streams; the actual number per device/appliance varies by model. Because a Catalyst store can
act as both a backup target and an inbound replication target, the maximum value applies to the two target types combined (although inbound
copy jobs would not normally run at the same time as backups) – see Appendix A.
 Although Catalyst copy is controlled by the backup software, the copy blackout window overrides the backup software scheduling. Check for
conflicts.
 The first Catalyst low bandwidth backup will take longer than subsequent low bandwidth backups because a seeding process has to take
place.

 If you are implementing multi-hop or one-to-many Catalyst copies, remember that these copies happen serially not in parallel.
 Ensure the backup clean-up scripts that regularly check for expired Catalyst Items run at a frequency that avoids using excessive storage to
hold expired backups (every 24 hours is recommended).

 There are several specific tuning parameters dependent on backup application implementation – please see Appendix C for more details.

HP StoreOnce Technology
A basic understanding of the way that HP StoreOnce Technology works is necessary in order to understand factors that may impact performance
of the overall system and to ensure optimal performance of your backup solution.

HP StoreOnce Technology is an “inline” data deduplication process. It uses hash-based chunking technology, which analyzes incoming backup
data in “chunks” that average 4K in size. The hashing algorithm generates a unique hash value that identifies each chunk and points to its location
in the deduplication store.

Hash values are stored in an index that is referenced when subsequent backups are performed. When data generates a hash value that already
exists in the index, the data is not stored a second time, but rather a count is increased showing how many times that hash code has been seen.
Unique data generates a new hash code and that is stored on the appliance. Typically about 2% of every new backup is new data that generates
new hash codes.
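The chunk-hash-index mechanism described above can be sketched in a few lines of Python. This is a toy illustration only, not HP's implementation: it uses fixed 4 KB chunk boundaries rather than the variable-size chunking (averaging ~4 KB) that StoreOnce actually performs, and SHA-256 plus a Python dictionary stand in for the real hashing algorithm and index.

```python
import hashlib

CHUNK_SIZE = 4096  # stand-in for StoreOnce's variable-size ~4 KB chunks

class DedupStore:
    """Toy model of a hash-indexed deduplication store."""

    def __init__(self):
        self.index = {}   # chunk hash -> reference count
        self.chunks = {}  # chunk hash -> chunk data, stored only once

    def ingest(self, stream: bytes) -> int:
        """Store one backup stream; return how many bytes were truly new."""
        new_bytes = 0
        for i in range(0, len(stream), CHUNK_SIZE):
            chunk = stream[i:i + CHUNK_SIZE]
            h = hashlib.sha256(chunk).hexdigest()
            if h in self.index:
                self.index[h] += 1        # seen before: just bump the count
            else:
                self.index[h] = 1         # unique: index it and store the data
                self.chunks[h] = chunk
                new_bytes += len(chunk)
        return new_bytes

store = DedupStore()
backup = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))  # 16 KB, 4 chunks
first_run = store.ingest(backup)    # every chunk is new: 16384 bytes stored
second_run = store.ingest(backup)   # identical repeat backup: 0 new bytes
```

An unchanged repeat backup consumes no new chunk storage; only the reference counts grow, which is why repeated full backups of largely static data deduplicate so well.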

With Virtual Tape Library and NAS shares, deduplication always occurs on the StoreOnce Backup system. With Catalyst stores, deduplication may
be configured to occur on the media server (recommended) or on the StoreOnce Backup system.

Key factors for performance considerations with deduplication that occurs on the StoreOnce
Backup system
 The inline nature of the deduplication process means that it is a very processor and memory intensive task. HP StoreOnce appliances have been
designed with appropriate processing power and memory to minimize the backup performance impact of deduplication.
 Best performance will be obtained by configuring a larger number of libraries/shares/Catalyst stores with multiple backup streams to each
device, although this has a trade off with overall deduplication ratio.
o If servers with lots of similar data are to be backed up, a higher deduplication ratio can be achieved by backing them all up to
the same library/share/Catalyst store, even if this means directing different media servers to the same data type device
configured on the StoreOnce appliance.
o If servers contain dissimilar data types, the best deduplication ratio/performance compromise will be achieved by grouping
servers with similar data types together into their own dedicated libraries/shares/Catalyst stores. For example, a requirement
to back up a set of exchange servers, SQL database servers, file servers and application servers would be best served by
creating four virtual libraries, NAS shares or Catalyst stores; one for each server data type.
 The best backup performance to a device configured on a StoreOnce appliance is achieved using somewhere below the maximum number of
streams per device (the maximum number of streams varies between models – see Appendix A for more details and the section Understanding
StoreOnce Performance factors).
 When restoring data from a deduplicating device, the device must reconstruct the original un-deduplicated data stream from all of the data
chunks contained in the deduplication stores. This can result in lower performance than that of the backup process (typically 80% of backup
speed). Restores also typically use only a single stream.
 Full backup jobs will result in higher deduplication ratios and better restore performance. Incremental and differential backups will not
deduplicate as well.

StoreOnce Catalyst overview


HP StoreOnce Catalyst delivers a single, integrated, enterprise-wide deduplication algorithm. It allows the seamless movement of deduplicated
data across the enterprise to other StoreOnce Catalyst systems without rehydration. This means that you can benefit from:
 Simplified management of data movement from the backup application: tighter integration with the backup software to manage file
replication centrally across the enterprise from the backup application GUI.
 Seamless control across complex environments: supporting a range of flexible configurations that enable the concurrent movement of data
from one site to multiple sites, and the ability to cascade data around the enterprise (sometimes referred to as multi-hop).
 Enhanced performance: distributed deduplication processing using StoreOnce Catalyst stores on the StoreOnce Backup system and on
multiple servers can optimize network loading and appliance throughput.
 Faster time to backup to meet shrinking backup windows: up to 100TB/hour* aggregate throughput, which is up to 4 times faster than
backup to a NAS target.

*Actual performance is dependent upon the specific StoreOnce appliance, configuration, data set type, compression levels, number of data
streams, number of devices and number of concurrent tasks, such as housekeeping or replication.

All HP StoreOnce Backup systems can support Catalyst stores, Virtual Tape libraries and NAS (CIFS/NFS) shares on the same system, which makes
them ideal for customers who have legacy requirements for VTL and NAS but who wish to move to HP StoreOnce Catalyst technology.

HP StoreOnce Catalyst stores do require a separate license on both source and target; VTL/NAS devices only require licenses if they are replication
targets.

Key features
The following are the key points to be aware of with StoreOnce Catalyst:
 Optional deduplication at the backup server enables greater overall StoreOnce appliance performance and reduced backup bandwidth
requirements. This can be controlled at backup session/job level.

 HP StoreOnce Catalyst enables advanced features such as duplication of backups between appliances in a network-efficient manner under
control of the backup application.

 Catalyst stores can be copied using low-bandwidth links – just like NAS and VTL devices. The key difference here is that there is no need to
set up replication mappings (required with VTL and NAS); the whole of the Catalyst copy process is controlled by the backup software itself.
 HP StoreOnce Catalyst enables space occupied by expired backups to be returned for re-use in an automated manner because of close
integration with the backup application

 HP StoreOnce Catalyst enables asymmetric expiry of data. For example: retain 2 weeks on the source, 4 weeks on the target device.
 HP StoreOnce Catalyst store creation can be controlled by the backup application, if required, from within Data Protector (not available with
Symantec products).

 StoreOnce Catalyst is fully monitored in the Storage Reporting section of the StoreOnce Management GUI and StoreOnce Catalyst Copy can
be monitored on a global basis by using HP Replication Manager V2.1 or above.

 HP StoreOnce Catalyst is an additional licensable feature both on the StoreOnce appliance and within the backup software because of the
advanced functionality it delivers.
 HP StoreOnce Catalyst is supported only with HP Data Protector 7.01, Symantec NetBackup 7.x and Symantec Backup Exec 2012.

VTL and NAS Replication overview


Deduplication technology is the key enabling technology for efficient replication because only the new data created at the source site needs to
replicate to the target site once seeding is complete. This efficiency in understanding precisely which data needs to replicate can result in
bandwidth savings in excess of 95% compared to having to transmit the full contents of a cartridge/share from the source site. The bandwidth
saving will be dependent on the backup data change rate at the source site.

There is some overhead of control data, known as manifest data, that also needs to pass across the replication link; in addition, any hash codes
(and their associated chunks) that are not present on the remote site also need to be transferred. Typically these “overhead components” are less
than 2% of the total virtual cartridge/file size to replicate.

Replication throughput can be “throttled” by using bandwidth limits as a percentage of an existing link, so as not to affect the performance of
other applications running on the same WAN link.

Key factors for performance considerations with replication:


 Define your “seeding” (first replication) strategy before implementation – several methods are available depending on your replication model:
active/passive, active/active or many-to-one. See Seeding methods in more detail.
 If a lot of similar data exists on remote office StoreOnce libraries, replicating these into a single target VTL library will give a better
deduplication ratio on the target StoreOnce Backup system. Consolidation of remote sites into a single device at the target is available with VTL
device types. (Catalyst targets can also be used to consolidate replication from various source sites into a single Catalyst store at a DR site.)
 Replication starts when the cartridge is unloaded or the NAS share file is closed and when a replication window is enabled. If a backup spans
multiple cartridges or NAS files, replication will start on the first cartridge/ file as soon as the job spans to the second, unless a replication
blackout window is in force.
 Size the WAN link appropriately to allow for replication and normal business traffic, taking into account data change rates. A temporary increase
in WAN speed may be desirable for the initial seeding process if it is to be performed over the WAN.
 Apply replication bandwidth limits or apply replication blackout windows to prevent bandwidth hogging.
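As a rough cross-check of the sizing bullet above (the HP StoreOnce Sizing tool remains the authoritative sizing method), the post-seeding bandwidth requirement can be estimated from the daily change rate and the ~2% manifest overhead. All figures in this sketch, including the site values, are hypothetical.

```python
def required_wan_mbit_s(full_backup_gb, change_rate, overhead, window_hours):
    """Estimate sustained WAN bandwidth (Mbit/s) needed to replicate one
    backup cycle after seeding: only changed data plus the manifest/control
    overhead crosses the link."""
    gb_to_send = full_backup_gb * (change_rate + overhead)
    return gb_to_send * 8 * 1000 / (window_hours * 3600)  # GB -> megabits

# Hypothetical remote site: 500 GB full backup, 2% daily change rate,
# ~2% manifest/control overhead, replicated in an 8-hour overnight window.
need = required_wan_mbit_s(500, 0.02, 0.02, 8)   # about 5.6 Mbit/s sustained
```

Any remaining link capacity must still carry normal business traffic, and the initial seeding pass needs far more bandwidth than this steady state, which is why a temporary link upgrade or one of the offline seeding methods is often used.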

The maximum number of concurrent replication jobs supported by source and target StoreOnce appliances can be varied in the StoreOnce
Management GUI to also manage throughput and bandwidth utilization. The table below shows the default settings for each product.

Parameter (HP StoreOnce model)    2620   4210   4220   4420   4430

Appliance Fan-In                     8     16     24     50     50
Appliance Fan-Out                    2      4      4      8      8
Device Fan-In (VTL only)             1      8      8     16     16
Device Fan-Out                       1      1      1      1      1
Max Concurrent Outbound Jobs        12     24     24     48     48
Max Concurrent Inbound Jobs         24     48     48     96     96

Note: Fan-In is the maximum number of source appliances that may replicate to a device acting as a replication target.

Housekeeping overview
If data is deleted from the StoreOnce Backup system (e.g. a virtual cartridge is overwritten or erased), any unique chunks are marked for
removal, and any non-unique chunks are de-referenced and their reference count decremented. The process of removing chunks of data is not an
inline operation, because that would significantly impact performance. Instead this process, termed “housekeeping”, runs on the appliance as a
background operation.
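The deferred reclamation just described can be modelled in a short sketch (a hypothetical model, not HP's implementation): expiring a backup only decrements reference counts inline, while the disk space itself is reclaimed later by the background housekeeping pass.

```python
class ChunkStore:
    """Toy model of reference counting with deferred housekeeping."""

    def __init__(self):
        self.refs = {}        # chunk hash -> reference count
        self.pending = set()  # unreferenced chunks awaiting housekeeping

    def expire_backup(self, chunk_hashes):
        """Overwrite/erase a backup: cheap inline bookkeeping only."""
        for h in chunk_hashes:
            self.refs[h] -= 1
            if self.refs[h] == 0:
                self.pending.add(h)  # chunk now unique: mark for removal

    def housekeeping(self):
        """Background pass: physically remove unreferenced chunks."""
        reclaimed = len(self.pending)
        for h in self.pending:
            del self.refs[h]
        self.pending.clear()
        return reclaimed

store = ChunkStore()
store.refs = {"a": 2, "b": 1}     # "a" shared by two backups, "b" unique
store.expire_backup(["a", "b"])   # fast: no data is actually moved yet
freed = store.housekeeping()      # later: 1 chunk ("b") is reclaimed
```

Erasing many cartridges at once floods the pending set, which is exactly the large, unpredictable housekeeping load that controlled rotation schemes are meant to avoid.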

Housekeeping is triggered in different ways depending on device type and backup application:
 VTL: media on which the data retention period has expired will be overwritten by the backup application. The act of overwriting triggers
the housekeeping of the expired data. If media is not overwritten (if the backup application chooses to use blank media in preference to
overwriting), the expired media continues to occupy disk space.
 NAS shares: Some backup applications overwrite with the same file names after expiration; others do an expiry check before writing
new data to the share; others might do a quota check before overwriting. Any of these actions triggers housekeeping.
 Catalyst stores: The backup application clean-up process, the running of which is configurable, regularly checks for expired backups
and removes catalog entries. This provides a much more structured space reclamation process.
One final comment: housekeeping blackout windows are configurable (up to two periods in any 24 hours), so even if the “clean up” scripts
run in the backup software, housekeeping will not be triggered until the blackout window has closed.

Housekeeping is an important process in order to maximize the deduplication efficiency of the appliance and, as such, it is important to ensure
that it has enough time to complete. Running backup, restore, tape offload and replication operations with no break (i.e. 24 hours a day) will
result in housekeeping never being able to complete. Configuring backup rotation schemes correctly is very important to ensure the maximum
efficiency of the product; correct configuration of backup rotation schemes reduces the amount of housekeeping that is required and creates a
predictable load.

Large housekeeping loads are created if large numbers of cartridges are manually erased or re-formatted. In general all media overwrites should
be controlled by the backup rotation scheme so that they are predictable.

Backup Application considerations
Multi-stream or Multiplex
Multi-streaming is often confused with multiplexing; they are, however, two different (but related) concepts. Multi-streaming is when multiple data
streams are sent to the StoreOnce Backup system simultaneously but separately. Multiplexing is a configuration whereby data from multiple
sources (for example multiple client servers) is backed up to a single tape drive device by interleaving blocks of data from each server
simultaneously, combined into a single stream. Multiplexing is a hangover from using physical tape devices; it was required to maintain good
performance where source servers were slow, because it aggregates multiple slow source server backups into a single fast stream.

A multiplexed data stream configuration is NOT recommended for use with a StoreOnce system or any other deduplicating device. This is because
the interleaving of data from multiple sources is not consistent from one backup to the next, which significantly reduces the ability of the
deduplication process to work effectively; it also reduces restore performance. Care must be taken to ensure that multiplexing is not happening
by default in a backup application configuration. For example, when using HP Data Protector to back up multiple client servers in a single backup
job, it defaults to multiplexing four client servers into a single stream. This must be disabled by reducing the “Concurrency”
configuration value for the tape device from 4 to 1.
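A small sketch shows why multiplexing defeats deduplication. The sizes are illustrative assumptions: the backup application interleaves 1 KB blocks from two clients whose data never changes, while the deduplication engine chunks at 4 KB, so the chunk contents depend on the interleaving order, which differs from night to night.

```python
import hashlib

BLOCK, CHUNK = 1024, 4096  # app interleave granularity vs dedup chunk size

def chunk_hashes(stream):
    """The set of chunk hashes a deduplication index would hold."""
    return {hashlib.sha256(stream[i:i + CHUNK]).hexdigest()
            for i in range(0, len(stream), CHUNK)}

def block_data(client, i):
    """A distinct 1 KB block of (unchanging) data for one client."""
    return (f"{client}-{i:02d}-".encode() * BLOCK)[:BLOCK]

clients = {c: [block_data(c, i) for i in range(16)] for c in "AB"}

def multiplexed(order):
    """Single stream interleaving 1 KB blocks from both clients."""
    return b"".join(clients[c][i] for i in range(16) for c in order)

# Same unchanged client data, but the interleave order varies per night:
night1 = chunk_hashes(multiplexed("AB"))
night2 = chunk_hashes(multiplexed("BA"))
mux_matches = len(night1 & night2)   # 0 - no chunk repeats, nothing dedupes

# Multi-streaming instead: one stream per client is byte-identical nightly,
# so every chunk hash repeats and the second night stores almost nothing.
solo_night1 = {c: chunk_hashes(b"".join(b)) for c, b in clients.items()}
solo_night2 = {c: chunk_hashes(b"".join(b)) for c, b in clients.items()}
solo_match = solo_night1 == solo_night2
```

Even though neither client's data changed, the multiplexed stream yields entirely new chunks on the second night, while the separate streams deduplicate perfectly.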

Understanding StoreOnce Performance factors


The HP StoreOnce Backup system performs best with multiple backup streams sent to it simultaneously.

The following graph, Figure 1, illustrates the relationship between the number of active data streams and performance; the appliance is assumed
to be one of the larger models where more than 24 streams (if fast enough) can achieve best throughput. The throughput values shown are for
example only. See Appendix A for the maximum number of streams recommended for best throughput per model.

Along the x axis is the number of concurrent streams. A stream is a data path to a device configured on StoreOnce; on VTL it is the number of
virtual tape drives, on NAS the number of writers, on Catalyst stores the number of streams.

Along the Y axis is the overall throughput in MB/sec that the StoreOnce device can process – this ultimately dictates the backup window. As a
backup window begins, the number of streams gradually increases and we aim to have as many streams running as possible to get the best
possible throughput to the StoreOnce device. As the backup jobs come to an end, the stream count starts to decrease and so the overall
throughput to the StoreOnce device starts to reduce.

The StoreOnce device itself also has a limit which we call the maximum ingest rate. In this example it is 1000MB/sec. The > 24 streams value is
calculated using “Infinite performance hosts” to characterize the HP StoreOnce ingest performance.

As long as we can supply around 24 data streams at the required performance levels, we keep the StoreOnce device in its “saturation zone” of
maximum ingest performance.
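The relationship shown in Figure 1 reduces to a simple model: aggregate throughput is the sum of the active stream rates, capped at the appliance's maximum ingest rate. The rates below are illustrative only, echoing the worked examples later in this section.

```python
def aggregate_throughput(stream_rates_mb_s, max_ingest_mb_s):
    """Appliance throughput: sum of the source stream rates, capped at
    the appliance's maximum ingest rate (the saturation-zone ceiling)."""
    return min(sum(stream_rates_mb_s), max_ingest_mb_s)

# 100 slow hosts at 8 MB/sec: 800 MB/sec, below the 1000 MB/sec ceiling.
slow_hosts = aggregate_throughput([8] * 100, 1000)
# 10 fast hosts at 200 MB/sec: capped at the 1000 MB/sec ceiling.
fast_hosts = aggregate_throughput([200] * 10, 1000)
# Only 5 streams at 100 MB/sec on a 600 MB/sec model: stream-limited.
few_streams = aggregate_throughput([100] * 5, 600)
```

Adding streams only helps until the cap is reached; beyond that point the appliance, not the sources, is the bottleneck.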

Figure 1: Relationship between active data streams and performance

Note 1: Stream source data rates will vary; some streams will run at 8 MB/sec, others at 50 MB/sec, and maybe some others at 200 MB/sec. This
means that, as the stream count increases, it is the aggregate total of the streams that drives the unit to saturation, which is the goal. Some of the
factors that influence source data rate are the compressibility of the data, the number of disks in the disk group feeding the stream, the RAID type
and others.
Note 2: With 5 streams at 100 MB/sec we do not reach the maximum throughput of the node (server), which can support 600MB/sec in this
example. This is the maximum possible ingest rate of the device for a specific model based on 5 streams. This ingest rate is the maximum even if
each stream is capable of 200MB/sec, because it represents the maximum amount of data the machine can process.
Note 3: The number of streams available varies throughout the backup window. The curve representing backup streams increases as the backup
jobs begin ramping into the appliance (to VTL, NAS share or Catalyst store target devices) and then declines towards the finish of the backup,
when throughput rates decline as backup jobs complete. This highlights the importance of maintaining enough backup threads from sources to
ensure that, while backups are running, sufficient source “data pump” is maintained to hold the StoreOnce device in saturation.

Notes for the color-coded circles

In example 1 (red circle) we are supplying much more than 24 streams (100 actually) but they are all slow hosts and the cumulative ingest rate is
800MB/sec (below our maximum ingest rate).

In example 2 (green circle) we have some high performance hosts that can supply data at a rate higher than the StoreOnce maximum ingest rate;
and so the performance is capped at 1000MB/sec.

In example 3 (blue circle) we have some very high performance hosts but can only configure 5 backup streams because of the way the data is
constructed on the hosts. In this case the maximum ingest of the StoreOnce appliance is 600MB/sec but we can only achieve 500 MB/sec because
that is as fast as we can supply the data (because of stream limitations). If we could re-configure the backups to provide more streams, we could
get higher throughput.
In example 4 (brown circle) we show a more realistic situation with a mixture of different hosts at different performance levels. Most
importantly, we have 30 streams and a total throughput capability of 950MB/sec, which puts us very close to the maximum ingest rate.
The maximum ingest rates vary according to each StoreOnce model. Typically, on the larger StoreOnce units about 48 streams spread across the
configured devices give the best throughput; more streams only help to sustain the throughput with each stream being throttled appropriately.
For example, if 96 streams are configured, the throughput is still the same as if 48 streams were configured – it is just that each stream runs
slower as resources are shared. See Appendix A for more details.
Once we understand the basic streams versus performance concept we can start to apply best practices for the number of devices to configure.
With these factors in mind we have recommended some VTL configurations for the above performance examples, which are illustrated in Figure 2
below.

Figure 2: Relationship between active data streams and device configuration (VTLs shown)
Note on Figure 2: In general, per configured device, we get the best throughput between 12-16 streams, and the best throughput per appliance when
we reach 48 streams or more. So, for 100 streams, we could configure 6 devices with around 17 streams each, or 20 devices with 5 streams each.
Six devices is preferable because:
a) Fewer devices are easier to manage, but we can still group similar data types into the same device
b) They provide the best possible throughput when we have the higher stream count to a device

Data compression and encryption backup application features
Both software compression and encryption will randomize the source data and will, therefore, not result in a high deduplication ratio for these
data sources. Consequently, performance will also suffer. The StoreOnce Backup system will compress the data at the end of deduplication
processing anyway, before finally writing the data to disk.

For these reasons it is best to do the following, if efficient deduplication and optimum performance are required:
 Ensure that there is no encryption of data before it is sent to the StoreOnce appliance.
 Ensure that software compression is turned off within the backup application.

Not all data sources will result in high deduplication ratios; deduplication ratios are data type dependent, change rate dependent and retention
period dependent. Deduplication performance can, therefore, vary across different data sources. Digital images, video, audio and compressed file
archives will typically all yield low deduplication ratios. If this data predominantly comes from a small number of server sources, consider setting
up a separate library/share/Catalyst store for these sources for better deduplication performance. In general, high change rates yield low dedupe
ratios, whilst low change rates yield high dedupe ratios over the same retention period. As you might expect, multiple full backups yield high
dedupe ratios compared to Full and Incremental backup regimes.
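The effect can be illustrated with a toy chunk-and-hash sketch. This is not the StoreOnce algorithm (StoreOnce uses its own variable-chunking deduplication; fixed 4 KB chunks and SHA-256 here are illustrative only), but it shows the principle: pre-compressed or encrypted data looks random, so almost every chunk is unique and little deduplication is possible.

```python
import hashlib
import os

def unique_chunk_fraction(data, chunk_size=4096):
    """Fraction of fixed-size chunks that are unique.
    Lower values mean the data would deduplicate better."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return len({hashlib.sha256(c).digest() for c in chunks}) / len(chunks)

repetitive = b"customer record\n" * 262144          # highly repetitive source data
random_like = os.urandom(len(repetitive))           # stands in for encrypted/compressed data

print(unique_chunk_fraction(repetitive))   # 0.0009765625 - nearly everything deduplicates
print(unique_chunk_fraction(random_like))  # 1.0 - every chunk unique, no deduplication
```

This is why encrypting or compressing data before it reaches the appliance defeats deduplication: the randomization destroys the repeated patterns the dedupe engine relies on.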

Network and Fibre Channel best practices
The following table shows which network and fibre channel ports are present on each model of StoreOnce appliance.

Product/Model Name     Part Number   Ethernet Connection     Fibre Channel Connection

StoreOnce 2620         BB852A        2 x 1GbE                None
StoreOnce 4210 iSCSI   BB853A        2 x 1GbE                None
StoreOnce 4210 FC      BB854A        2 x 1GbE                2 x 8Gb FC
StoreOnce 4220         BB855A        2 x 1GbE                2 x 8Gb FC
StoreOnce 4420         BB856A        2 x 1GbE, 2 x 10GbE     2 x 8Gb FC
StoreOnce 4430         BB857A        2 x 1GbE, 2 x 10GbE     2 x 8Gb FC

Correct configuration of these interfaces is important for optimal data transfer.

Key factors when considering performance.


 A mixture of iSCSI and FC port virtual libraries and NAS shares can be configured on the same StoreOnce appliance to balance performance
needs if required.

Ethernet factors

 It is important to consider the whole network when considering backup performance. Any server acting as a backup server should be configured
where possible with multiple network ports that are bonded in order to provide a fast connection to the LAN. Client servers (those that backup
via a backup server) may be connected with only a single port if backups are to be aggregated through the backup server.
 Ensure that no sub-1GbE network components are in the backup path as this will significantly restrict backup performance.
 Configure bonded (Mode 6) network ports inside the StoreOnce appliance to achieve maximum available network bandwidth and a level of high
availability.
 For StoreOnce 4420/4430 Backup systems that support a 10GbE connection configure a Network SAN on the 10GbE ports, which is dedicated to
backup traffic.

FC factors

 Virtual library devices are assigned to an individual interface. Therefore, for best performance, configure both FC ports and balance the virtual
devices across both interfaces to ensure that one link is not saturated whilst the other is idle.
 Switched fabric mode is preferred for optimal performance on medium to large SANs since zoning can be used.
 Use zoning (by Worldwide Name) to ensure high availability.
 When using switched fabric mode, Fibre Channel devices should be zoned on the switch to be only accessible from a single backup server
device. This ensures that other SAN events, such as the addition and removal of other FC devices, do not cause unnecessary traffic to be sent to
devices. It also ensures that SAN polling applications cannot reduce the performance of individual devices.
 Either or both of the two FC ports may be connected to a FC fabric and each virtual library may be associated with one or both of these FC ports
but each drive can only be associated with one port. Port 1 and 2 is the recommended option in the GUI to achieve efficient load balancing. Only
the robotics (medium changer) part of the VTL is presented to Port 1 and Port 2 initially, with the number of virtual tape drives defined being
presented 50% to Port 1 and 50% to Port 2. This also ensures that in the event of a fabric failure at least half of the drives will still be available
to the hosts. (The initial virtual tape drive allocation to ports (50/50) can be edited later, if required.)
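The default Ports 1 & 2 presentation described above can be modelled as follows. The names and the alternating odd/even split are illustrative; the actual allocation is configurable in the GUI.

```python
def default_vtl_presentation(num_drives):
    """Model of the default 'Port 1 & 2' VTL presentation: the robot
    (medium changer) is visible on both FC ports, while the configured
    drives are split 50/50 between Port 1 and Port 2."""
    devices = {"robot": ("Port 1", "Port 2")}
    for n in range(1, num_drives + 1):
        devices[f"drive{n}"] = ("Port 1",) if n % 2 else ("Port 2",)
    return devices

layout = default_vtl_presentation(4)
# robot -> both ports; drives 1 and 3 -> Port 1; drives 2 and 4 -> Port 2.
# If one fabric fails, the robot and half the drives remain reachable.
```

This is the basis of the high-availability zoning configuration discussed later: losing one fabric still leaves the robot plus 50% of the drives accessible.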

Network configuration guidelines


The Ethernet ports are used for data transfer to iSCSI VTL devices, StoreOnce Catalyst and CIFS/NFS shares and also for replication data transfer
and management access to the StoreOnce Web and CLI Management Interfaces.

Configured backup devices and the management interfaces are all available on all network IP addresses configured for the appliance.
In order to deliver best performance when backing up data over the Ethernet ports it will be necessary to configure the appliance network ports,
and also backup servers and network infrastructure to maximize available bandwidth to the StoreOnce device.

Each pair of network ports on the appliance can be configured either on separate subnets or in a bond with each other (1GbE and 10GbE ports
cannot be bonded together).

Single Node StoreOnce appliances have a factory default network configuration where the first 1GbE port (Port 1 /eth0) is enabled in DHCP mode.
This enables quick access to the StoreOnce CLI and Management GUI for customers using networks with DHCP servers and DNS lookup because
the appliance hostname is printed on a label on the appliance itself.

The StoreOnce appliances provide “Mode 6” “Adaptive Load Balancing” when ports are bonded. This provides port failover and load balancing
across the physical ports. There is no need for any network switch configuration in this mode. This network bonding mode requires that the same
switch is used for each network port or that spanning tree protocol is enabled, if separate switches are used for each port.

If external switch ports are configured for LACP (Mode 4) bonding then this must be un-configured in order for Mode 6 bonding to work.

Network configuration on StoreOnce Backup systems is performed via the CLI Management interface.

For detailed information about supported network modes and how to configure them, please refer to the “HP StoreOnce 2620, 4210/4220,
and 4420/4430 Backup system Installation and Configuration guide”.

Single Port configurations

Figure 3: Network configuration, single port mode

The example shows the simplest configuration: a single subnet containing just one 1GbE network port. Generally this configuration is likely to
be used:
 Only if the network interface is required solely for management of the appliance, or
 Only if lower performance and resiliency for backup and restore are acceptable.
A single 10GbE port could also be configured in this way (on 4420 and 4430 appliances), providing both a backup data interface and management
interface. This could deliver good performance, however, bonded ports are recommended for resiliency and maximum performance.

Dual Port configurations (no bonding)


This example describes configuring multiple subnets in separate IP address ranges for each pair of network ports. A maximum of 4 separate
subnets can be configured on a StoreOnce 4420 or 4430 appliance (2 x 1GbE and 2 x 10GbE).

Use this mode:


 If servers to be backed up are split across two physical networks which need independent access to the appliance. In this case, virtual libraries
and shares and Catalyst stores will be available on both network ports; the host configuration defines which port is used.
 If separate data (“Network SAN”) and management LANs are being used, i.e. each server has a port for business network traffic and another for
data backup. In this case, one port on the appliance can be used solely for access to the StoreOnce Management Interface with the other used
for data transfer.

Figure 4: Network configuration, dual port mode

In the case of a separate Network SAN being used, configuration of CIFS backup shares with Active Directory authentication requires careful
consideration; see Network configuration for CIFS AD for more information.

Bonded port configurations (recommended)
If two network ports are configured within the same subnet they will be presented on a single IP address and will be bonded using Mode 6 bonding
as described at the beginning of this chapter.

This mode is generally recommended for backup data performance and also for resiliency of both data and management network connectivity.
It should be noted that when using bonded ports the full performance of both links will only be realized if multiple host servers are providing
data, otherwise data will still use only one network path from the single server.

Figure 5: Network configuration, bonded; applies to both 1GbE ports and 10GbE ports

10GbE Ethernet ports on StoreOnce 4420/4430 Backup systems
10GbE Ethernet is provided as a viable alternative to the Fibre Channel interface for providing maximum iSCSI VTL performance and also
comparable NAS performance. 10GbE ports also provide good performance when using StoreOnce Catalyst low and high bandwidth backup as
well as Catalyst copy or VTL/NAS replication between appliances. When using 10GbE Ethernet it is common to configure a “Network SAN”, which is
a dedicated network for backup that is separate to the normal business data network; only backup data is transmitted over this network.

Figure 6: Network configuration, HP StoreOnce 4420/4430 with 10GbE ports. As well as CIFS and NFS shares the devices configured could equally
be Catalyst stores.

When a separate network SAN is used, configuration of CIFS backup shares with Active Directory authentication requires careful consideration,
see the next section for more information.

Network configuration for CIFS AD


When using CIFS shares for backup on a StoreOnce device in a Microsoft Active Directory environment the appliance CIFS server may be made a
member of the AD Domain so that Active Directory users can be authenticated against CIFS shares on the StoreOnce Backup system.

However, in order to make this possible the AD Domain controller must be accessible from the StoreOnce device. Broadly there are two possible
configurations which allow both:
 Access to the Active Directory server for AD authentication and
 Separation of Corporate LAN and Network SAN traffic

Option 1: HP StoreOnce Backup system on Corporate SAN and Network SAN
In this option, the StoreOnce device has a port in the Corporate SAN which has access to the Active Directory Domain Controller. This link is then
used to authenticate CIFS share access.

The port(s) on the Network SAN are used to transfer the actual data.

This configuration is relatively simple to configure:


 On StoreOnce devices with only 1GbE ports: Two subnets should be configured with one port in each. The ports are connected and configured
for either the Corporate LAN or Network SAN. In this case one data port is “lost” for authentication traffic, so this solution will not provide
optimal performance.
 On HP 4420/4430 devices with both 10GbE and 1GbE ports: the 10GbE ports can be configured in a bonded network mode and configured for
access to the Network SAN. One or both of the 1GbE ports can then be connected to the Corporate LAN for authentication traffic. In this case
optimal performance can be maintained – see below.
The backup application media server also needs network connections into both the Corporate LAN and Network SAN. The diagram below shows
this configuration with an HP StoreOnce 4430/4420 Backup system.

Figure 7: HP StoreOnce Backup system on Corporate SAN and Network SAN

Option 2: HP StoreOnce Backup system on Network SAN only with Gateway
In this option the StoreOnce appliance has connections only to the Network SAN, but there is a network router or Gateway server providing access
to the Active Directory domain controller on the Corporate LAN. In order to ensure two-way communication between the Network SAN and
Corporate LAN the subnet of the Network SAN should be a subnet of the Corporate LAN subnet.

Once configured, authentication traffic for CIFS shares will be routed to the AD controller but data traffic from media servers with a connection to
both networks will travel only on the Network SAN. This configuration allows both 1GbE network connections to be used for data transfer but also
allows authentication with the Active Directory Domain controller. The illustration shows a simple Class C network for a medium-sized LAN
configuration.
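The containment requirement for Option 2 (the Network SAN range must sit inside the Corporate LAN range so the gateway can route authentication traffic both ways) can be sanity-checked with Python's ipaddress module. The addresses below are examples only, not a recommended addressing plan.

```python
import ipaddress

corporate_lan = ipaddress.ip_network("192.168.1.0/24")    # Corporate LAN (Class C)
network_san = ipaddress.ip_network("192.168.1.192/26")    # Network SAN carved from it

# The gateway can route between the two only if this containment holds.
print(network_san.subnet_of(corporate_lan))  # True
```

If the check returns False, CIFS authentication traffic from the StoreOnce device would have no route back to the AD Domain Controller on the Corporate LAN.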

Figure 8: HP StoreOnce Backup system on Network SAN only with Gateway

The screenshot below shows where the 1GbE and 10GbE IP addresses are displayed in the GUI.

Note: There can be up to four IP addresses displayed here (depending on whether bonding is used), one per port. All ports can be used for either
Management or Data.

Fibre Channel configuration guidelines
The HP StoreOnce Backup systems support both switched fabric and direct attach (private loop) topologies.
Switched fabric using NPIV (N Port ID Virtualisation) offers a number of advantages and is the preferred topology for StoreOnce appliances.

Switched fabric
A switched fabric topology utilizes one or more fabric switches configured in one or more storage area networks (SANs) to provide a flexible
configuration between several Fibre Channel hosts and Fibre Channel targets such as the StoreOnce appliance virtual libraries. Switches may be
cascaded or meshed together to form large fabrics.

Figure 9: Fibre Channel, switched fabric topology

StoreOnce does not implement any selective virtual device presentation, and so each virtual library will be visible to all hosts connected to the
same fabric. It is recommended that each virtual library is zoned to be visible to only the hosts that require access. Unlike the iSCSI virtual
libraries, FC virtual libraries can be configured to be used by multiple hosts, if required.

Figure 9 shows the flexibility of the configuration:


 VTL1 is connected to FC Port 1 exclusively
 VTL3 is connected to FC Port 2 exclusively
 VTL2 is spread across FC Port 1 and FC Port 2. The medium changer is connected to both ports, whereas the drives by default are
connected 50% to each port (two each in this case). This mode is useful in high availability SANs – see Figure 11

Direct Attach (Private Loop)
A direct attach (private loop) topology is implemented by connecting the StoreOnce appliance ports directly to a Host Bus Adapter (HBA). In this
configuration the Fibre Channel private loop protocol must be used.

Figure 10: Fibre Channel, direct attach (private loop) topology

Either of the FC ports on a StoreOnce Backup system may be connected to a FC private loop, direct attach topology. The FC port configuration of
the StoreOnce appliance should be changed from the default N_Port topology setting to Loop. This topology supports only a single host
connected to each private-loop-configured FC port. In private loop mode the medium changer cannot be shared across FC Port 1 and FC Port 2.

Zoning
Zoning is only required if a switched fabric topology is used; it provides a way to ensure that servers, disk arrays, and tape libraries see only
the hosts and targets that they need. Some of the benefits of zoning include:
 Limiting unnecessary discoveries on the StoreOnce appliance
 Reducing stress on the StoreOnce appliance and its library devices by polling agents
 Reducing the time it takes to debug and resolve anomalies in the backup/restore environment
 Reducing the potential for conflict with untested third-party products
 Ensuring, through the zoning implementation, that the StoreOnce FC diagnostic device is not presented to hosts.

Zoning may not always be required for configurations that are already small or simple. Typically the larger the SAN, the more zoning is needed.
Use the following guidelines to determine how and when to use zoning.
 Small fabric (16 ports or less)—may not need zoning.
 Small to medium fabric (16 - 128 ports)—use host-centric zoning. Host-centric zoning is implemented by creating a specific zone for each
server or host, and adding only those storage elements to be utilized by that host. Host-centric zoning prevents a server from detecting any
other devices on the SAN or including other servers, and it simplifies the device discovery process.
 Disk and tape on the same pair of HBAs is supported along with the coexistence of array multipath software (no multipath to tape or library
devices on the HP StoreOnce Backup system, but coexistence of the multipath software and tape devices).
 Large fabric (128 ports or more)—use host-centric zoning and split disk and tape targets. Splitting disk and tape targets into separate zones
will help to keep the HP StoreOnce Backup system free from discovering disk controllers that it does not need. For optimal performance, where
practical, dedicate HBAs for disk and tape.

Using zoning for high availability


The StoreOnce appliance allows the robot and tape drives to be presented to the different FC ports that are connected to the customer’s fabric.
The diagram below shows a way of utilizing this feature to add higher availability to your StoreOnce VTL deployment. When the VTL is created
there is the option to present the device to Port 1, Port 2 or Ports 1 & 2. If the customer chooses to present the VTL to Ports 1 & 2, the
robot is presented to both Port 1 and Port 2, 50% of the configured drives are presented to Port 1 and the other 50% to Port 2 (this
can be changed if required). With this configuration, in the event of a fabric failure the robot and 50% of the drives remain available for backup.
The downside is that, after such a failure, only a single 8Gb FC link is available for backups.

Figure 11: VTL Fibre Channel resiliency using WWN zoning (WWPN)

In our example the arrows illustrate accessibility, not data flow.
FC configuration:
 Dual fabrics
 Multiple switches within each fabric
 Zoning by WWPN
 Each zone to include a host and the required targets on the HP StoreOnce Backup system

StoreOnce VTL configuration:
 Default library configuration is 50% of drives presented to Port 1, 50% presented to Port 2. The robot appears on Port 1 and Port 2.
 Up to 120 WWNs can be presented to Port 1 and Port 2.
 If Fabric 1 fails, all VTL libraries on the HP StoreOnce Backup system still have access to Fabric 2. As long as Hosts A, B and C also have
access to Fabric 2, then all backup devices are still available to Hosts A, B and C.

Use the StoreOnce Management GUI to find out the WWPN for use in zoning. The WW port names are on the VTL-Libraries-Interface Information
tab.

Diagnostic Fibre Channel devices


For each StoreOnce FC port there is a Diagnostic Fibre Channel Device presented to the Fabric. There will be one per active FC physical port. This
means there are two per HP StoreOnce Backup system that has two Fibre Channel ports.

The Diagnostic Fibre Channel Device can be identified by the following example text.
Symbolic Port Name "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H
Port-1"
Symbolic Node Name "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H"

A virtual driver or loader would be identified by the following example text:


Symbolic Port Name "HP D2D S/N-CZJ1440JBS HP Ultrium 4-SCSI Fibre Channel S/N-CZJ1440JC5
Port-0"
Symbolic Node Name "HP S/N-CZJ1440JBS HP Ultrium 4-SCSI Fibre Channel S/N-CZJ1440JC5"

In the above, the S/N-CZJ1440JBS string should be identical for all devices. If this is Node Port 1, the Node Name string will be as above; if Port 2,
the Node Name string will end with “Port-2”. Often the diagnostic device will be listed above the other virtual devices as it logs in first, ahead of
the virtual devices. The S/N-MY5040204H string is the serial number of the QLogic HBA, not the serial number of the appliance/node.

At this time these devices are part of the StoreOnce VTL implementation and are not an error or fault condition. It is imperative that these
devices be removed from any switch zone that is also used for virtual drives and loaders, to avoid data being sent to diagnostic devices.

Sizing StoreOnce solutions
The following figure provides a simple sizing guide for the HP StoreOnce Generation 3 product family for backups and replication; it illustrates
the typical amount of data that can be protected by an HP StoreOnce Backup system on a daily backup.

Figure 12: HP StoreOnce Backup system Gen 3 simple sizing guide

Note: Assumes fully configured product, compression rate of 1.5, data change rate of 1%, data retention period of 6 months and a 12-hour
backup window. Actual performance is dependent upon data set type, compression levels, number of data streams, number of devices emulated
and number of concurrent tasks, such as housekeeping or replication. Additional time is required for periodic physical tape copy, which would
reduce the amount of data that can be protected in 24 hours.
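A rough back-of-envelope version of this kind of sizing is easy to express; the 1000 MB/sec figure below is an arbitrary example, not a published rating for any model, and real sizing should use the HP sizing tool referenced next.

```python
def protectable_per_window_tb(ingest_mb_s, window_hours):
    """Approximate data volume (TB) that fits through a backup window
    at a sustained ingest rate, ignoring overheads such as housekeeping,
    replication and tape copy."""
    return ingest_mb_s * window_hours * 3600 / 1_000_000

# e.g. a sustained 1000 MB/sec ingest over a 12-hour backup window:
print(protectable_per_window_tb(1000, 12))  # 43.2 TB
```

In practice the sustained rate depends on stream counts, data type and concurrent tasks, which is why the downloadable sizing tool gives far more accurate results.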

HP also provides a downloadable tool to assist in the sizing of StoreOnce-based data protection solutions at
http://h30144.www3.hp.com/SWDSizerWeb/.

The use of this tool enables more accurate capacity sizing, retention period decisions and replication link sizing and performance for the most
complex StoreOnce environments.

A fully worked example using the Sizing Tool and best practices is contained later in the document, see Appendix B.

Introduction to HP StoreOnce Catalyst and device configuration guidelines
StoreOnce Catalyst technology
The following diagram shows the basic concept of a StoreOnce Catalyst store; it is a network-based (not FC-based) backup target that
exists alongside VTL and NAS targets. The main difference between a Catalyst store and VTL or NAS devices is that the processor-intensive part of
deduplication (hashing/chunking and compressing) can be configured to occur on either the media server or the StoreOnce appliance.
 If deduplication is configured to occur on the media server supplying data to the Catalyst store, this is known as low bandwidth backup
or source deduplication.
 If deduplication is configured to occur on the StoreOnce appliance where the Catalyst store is located, this is known as target-side
deduplication or high-bandwidth backup. ALL the deduplication takes place on the StoreOnce appliance.

The low bandwidth mode is expected to account for the majority of Catalyst implementations since it has the net effect of improving the overall
throughput of the StoreOnce appliance whilst reducing backup bandwidth consumed. It can also be used to allow remote offices to back up
directly to a central StoreOnce Appliance over a WAN link for the first time. Catalyst stores are tolerant of high latency links – this has been tested
by HP. The net effect is the same in both cases – a significant reduction in bandwidth consumed by the data path to the backup storage target.
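The bandwidth effect of choosing source-side (low bandwidth) versus target-side (high bandwidth) deduplication can be approximated as follows. The 10:1 ratio and 500 GB backup size are purely illustrative assumptions.

```python
def wire_data_gb(backup_size_gb, dedupe_ratio, source_side=True):
    """Approximate data sent over the network for one backup job.
    With source-side (low bandwidth) dedup, only unique data travels;
    with target-side (high bandwidth) dedup, the full stream travels."""
    return backup_size_gb / dedupe_ratio if source_side else backup_size_gb

print(wire_data_gb(500, 10, source_side=True))   # 50.0 GB on the wire
print(wire_data_gb(500, 10, source_side=False))  # 500 GB on the wire
```

This order-of-magnitude reduction is what makes backing up remote offices to a central appliance over a WAN link practical with Catalyst.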

Figure 13: HP StoreOnce interfaces with HP StoreOnce Catalyst

The deduplication offload into the media server is implemented in different ways in different backup applications.
 With HP Data Protector the StoreOnce deduplication engine is embedded in the HP Data Protector Media Agent that talks to the Catalyst API.
 In Symantec products HP has developed an OpenStorage (OST) Plug-in to NetBackup and Backup Exec that creates the interface between
Symantec products and the StoreOnce Catalyst store API.
Catalyst stores can also be copied using low bandwidth links – just like NAS and VTL devices. The key difference here is that there is no need to set
up replication mappings (required with VTL and NAS); the whole of the Catalyst copy process is controlled by the backup software itself. This is
implemented by sending “Catalyst Copy” commands to the Catalyst API that exists on the source StoreOnce appliance. This simple fact, that the
backup application controls the copy process and is aware of all the copies of the data held in Catalyst stores, solves many of the problems

involved in Disaster Recovery scenarios involving replicated copies. No import is necessary because all entries for all copies of data already exist
in the backup application’s Database.

Generic sizing rule for media servers running Catalyst API


The majority of customers using HP StoreOnce Catalyst will use the low bandwidth backup mode because of the bandwidth this saves, and
because of the overall performance improvement this gives on the HP StoreOnce appliance; offloading the deduplication process allows the
appliance itself to run faster because it is less loaded. But be aware that the media server is now expected to do more work in terms of the
hashing, chunking and compressing of the data for deduplication and, as such, must be sized correctly. The rule of thumb for this is shown below;
with even the most basic media servers now being dual or quad core, this is not too onerous a task, but the process of evaluating this should
always be conducted.

 Allow 50MB/s of stream data per GHz of CPU core and 30MB of RAM (allow 2 cores for the backup application media agent software)
 Allow at least 16GB of RAM overall
 Ignore hyperthreading (for example, treat 12 cores as 12, not as 24 with hyperthreading).
Example:
Dual Hex-core CPU running at 3.4GHz (12 cores).
10 cores x 3.4GHz = 34GHz (remember 2 cores are allocated to the media agent).
34 streams @ 50MB/s = 1700 MB/s (providing the sources of the data are not the bottleneck)
A media server with dual hex-core (6-core) processors (12 cores in total), of which two are used for the media agent and 10 are available for
deduplication, should be able to deliver Catalyst backup streams (already deduplicated) at a rate of 1700 MB/sec.
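The rule of thumb above can be written as a small helper. The 50 MB/s-per-GHz figure and the two reserved cores come straight from the rule in the text; the result is a planning estimate, not a guaranteed rate.

```python
def catalyst_media_server_rate(total_cores, ghz_per_core, reserved_cores=2):
    """Estimated deliverable low-bandwidth Catalyst backup rate in MB/s,
    per the 50 MB/s-per-GHz-of-core rule of thumb. Physical cores only
    (ignore hyperthreading)."""
    usable_cores = max(total_cores - reserved_cores, 0)
    return usable_cores * ghz_per_core * 50

# Worked example from the text: dual hex-core at 3.4 GHz (12 physical cores).
print(catalyst_media_server_rate(12, 3.4))  # 1700.0 MB/s
```

Remember the estimate assumes the data sources themselves are not the bottleneck.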

Catalyst Copy
Catalyst Copy is the equivalent of Virtual library and NAS share replication. The same principles apply in that only the new data created at the
source site needs to be copied (replicated) to the target site. The fundamental difference is that the copy jobs are created by the backup
application and can, therefore, be tracked and monitored within the backup application catalog as well as from the StoreOnce Management GUI.
Should it be necessary to restore from a Catalyst copy, the backup application is able to restore from a duplicate copy without the need to
re-import data to the catalog database.

Catalyst Copy should not be considered in the same way as VTL and NAS replication, since there are effectively no hard constraints other than
capacity on how many Catalyst stores can be copied (replicated) into a Catalyst store at a central site. Furthermore, because Catalyst copies are
controlled by the backup application, multi-hop replication is possible using Catalyst devices. However, Catalyst replication blackout windows can
be set on the StoreOnce appliance to dictate when the copy job actually happens, and bandwidth throttling can also be enforced to limit the
amount of WAN link consumed by StoreOnce Catalyst copy; in this respect it is similar to NAS and VTL replication.
Catalyst Copy has the following features:
 The copy job is configurable from within the backup application software – see Appendix C.
 Several source Catalyst stores can be copied into a single target Catalyst store.
 Multi-hop copy is configurable via the backup software – Source to Target 1 and then on to Target 2.
 One-to-many copy is also configurable but happens serially, one copy after the other.
 With the Catalyst agents running on remote office media servers, HP StoreOnce Catalyst technology has the ability to back up directly
from remote sites to a central site, using what is known as low bandwidth backup – essentially this uses HP StoreOnce replication
technology.

Figure 14: Catalyst Copy options

Catalyst concurrency and blackout windows


As with VTL and NAS replication, each StoreOnce appliance supports a maximum number of concurrent jobs.

                                                       HP StoreOnce models
Parameter                                            2620   4210   4220   4420   4430

Maximum concurrent outbound copy jobs
(per appliance)                                        12     24     24     48     48
Maximum concurrent data in AND inbound copy jobs
(per appliance)                                        48     96     96    192    192

Note:
 Outbound copy jobs = replication (out)
 Data in = Backup jobs
 Inbound copy jobs = replication (in)
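When planning copy schedules, these limits can be sanity-checked with simple arithmetic. The following sketch is illustrative only (the helper function is not part of any StoreOnce tooling); the limits themselves are taken from the table above:

```python
# Illustrative helper: check a planned concurrent Catalyst workload against
# the per-appliance limits from the table above.
LIMITS = {
    # model: (max outbound copy jobs, max data-in + inbound copy jobs)
    "2620": (12, 48),
    "4210": (24, 96),
    "4220": (24, 96),
    "4420": (48, 192),
    "4430": (48, 192),
}

def within_limits(model, outbound_copies, backups, inbound_copies):
    """Return True if the planned concurrent jobs fit the appliance limits."""
    max_out, max_in = LIMITS[model]
    return outbound_copies <= max_out and (backups + inbound_copies) <= max_in

print(within_limits("4220", outbound_copies=20, backups=60, inbound_copies=30))  # True
print(within_limits("2620", outbound_copies=16, backups=10, inbound_copies=5))   # False
```

Note that backup jobs and inbound copy jobs share a single budget, so a heavily loaded central appliance leaves less headroom for inbound copies during the backup window.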

The concurrency settings for Catalyst are configured by selecting the StoreOnce Catalyst - Settings tab and Edit. In this case, the parameters are
Outbound copy jobs and Data and Inbound Copy jobs. Bear in mind that a Catalyst store can act as both an inbound and an outbound copy target
when used in multi-hop mode.

The following screen illustrates how Catalyst copy blackout windows are configured (from the StoreOnce Catalyst-Blackout Windows tab).

The user can also configure bandwidth limiting (from the StoreOnce Catalyst-Bandwidth Limiting Windows tab).

Configuring client access
Client access to Catalyst stores can be controlled on a per-store basis.

First, overall client access permission checking is enabled (from the StoreOnce Catalyst – Settings tab).

Then each Catalyst store is given a list of clients that are allowed to access it (from the StoreOnce Catalyst – Stores – Permissions tab). In
our example, Catalyst store “SQL_VSS_Keep” can only be accessed by Client “DestinyStores.” The backup applications also have this Client name
configured into their Catalyst Backup and Copy Jobs in order to send the Catalyst API calls to the stores. (Note that the All Clients permissions
option provides open access.)

For more information


Appendix C of this document includes guidelines on integrating StoreOnce Catalyst with HP Data Protector 7, Symantec NetBackup 7.x and
Symantec Backup Exec 2012.

VTL configuration guidelines
Summary of best practices
 Tape drive emulation types have no effect on performance or functionality.
 Configuring multiple tape drives per library enables multi-streaming operations per library for good aggregate throughput performance.
 Do not exceed the recommended maximum concurrent backup streams per library and appliance if maximum performance is required. See
Appendix A.
 Target the backup jobs to run simultaneously across multiple drives within the library and across multiple libraries. Keep the concurrent stream
count high for best throughput.
 Create multiple libraries on the larger StoreOnce appliances to achieve best aggregate performance.
 Configure dedicated individual libraries for backing up larger servers.
 Configure other libraries for consolidated backups of smaller servers.
 Separate libraries by data type if the best trade-off between deduplication ratio and performance is needed.
 Cartridge capacities should be set either to allow a full backup to fit on one cartridge or to match the physical tape size for offload (whichever is
the smaller).
 Use a block size of 256 KB or greater. For HP Data Protector and EMC NetWorker software a block size of 512 KB has been found to provide the
best balance of deduplication ratio and performance.
 Disable the backup application verify pass for best performance.
 Remember that virtual cartridges cost nothing and use up very little space overhead. Don’t be afraid of creating “too many” cartridges. Define
slot counts to match required retention policy. The D2DBS, ESL and EML virtual library emulations can have a large number of configurable
slots and drives to give most flexibility in matching customer requirements.
 Design backup policies to overwrite media so that space is not lost to a large expired media pool and media does not have different retention
periods on the same piece of media.
 Reduce the number of appends per tape by specifying separate cartridges for each incremental backup; this improves replication performance
and capacity utilization.

Tape library emulation


Emulation types
HP StoreOnce Backup systems can emulate several types of physical HP Tape Library device; the maximum number of drives and cartridge slots is
defined by the type of library configured.

Performance, however, is not related to the library emulation type except insofar as the emulation determines the ability to configure multiple
drives per library and thus enable multiple simultaneous backup streams (multi-streaming operation).

To achieve the best performance from the larger StoreOnce appliances, more than one virtual library will be required to meet the multi-stream
needs. The appliance provides a pool of virtual drives that can be allocated to libraries flexibly, up to the maximum number per library defined by
the library emulation type. The number of cartridges per library can also be configured. The table below lists the key parameters for all StoreOnce
products.

To achieve best performance the recommended maximum concurrent backup streams per library and appliance in the table should be followed.
As an example, while it is possible to configure 200 drives per library on a 4420 appliance, for best performance no more than 12 of these drives
should be actively writing or reading at any one time.

                                                     StoreOnce 2620  StoreOnce 4210/4220  StoreOnce 4420/4430
Maximum VTL drives per library/appliance             32              64/96                200
Maximum slots per library (D2DBS, EML-E, ESL-E)      96              1024                 4096
Maximum slots (MSL2024, MSL4048, MSL8096)            24, 48, 96      24, 48, 96           24, 48, 96
Maximum active streams per store                     48              64/96                128
Recommended maximum concurrent backup streams
per appliance                                        24              48                   64
Recommended maximum concurrent backup streams
per library                                          4               6                    12

The HP D2DBS emulation type and the ESL/EML types provide the most flexibility in the numbers of cartridges and drives. This has two main benefits:
 It allows for more concurrent streams on backups which are throttled due to host application throughput, such as multi-streamed backups
from a database.
 It allows for a single library (and therefore Deduplication Store) to contain similar data from more backups, which then increases deduplication
ratio.
The D2DBS emulation type has an added benefit in that it is clearly identified in most backup applications as a virtual tape library, which aids
supportability. It is the recommended option for this reason.

There are a number of other limitations from an infrastructure point of view that need to be considered when allocating the number of drives per
library. As a general rule, it is recommended that the number of tape drives per library does not exceed 64, due to the restrictions below:
 For iSCSI VTL devices, a single Windows or Linux host can only access a maximum of 64 devices. A single library with 63 drives is therefore the
most that a single host can access; configuring a single library with more than 63 drives will result in not all devices in the library being seen
(which may include the library device itself). The same limitation could be hit with multiple libraries and fewer drives per library.
 A similar limitation exists for Fibre Channel. Although there is a theoretical limit of 255 devices per FC port on a host or switch, the actual limit
appears to be 128 for many switches and HBAs. You should either balance drives across FC ports or configure fewer than 128 drives per library.
 Some backup applications will deliver less than optimum performance if managing many concurrent backup tape drives/streams. Balancing the
load across multiple backup application media servers can help here.
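The iSCSI device-count limit above is simple to check: each virtual library presents one library (medium changer) device plus one device per drive. A minimal sketch of that arithmetic (the helper is illustrative, not a StoreOnce tool):

```python
# Illustrative: total iSCSI devices a single host sees from a set of
# virtual libraries. Each library = 1 changer device + N drive devices.
MAX_ISCSI_DEVICES_PER_HOST = 64

def host_device_count(drives_per_library):
    """drives_per_library: one entry per library presented to the host."""
    return sum(1 + drives for drives in drives_per_library)

# One library with 63 drives: 64 devices -- exactly at the limit.
print(host_device_count([63]) <= MAX_ISCSI_DEVICES_PER_HOST)  # True
# Two libraries with 32 drives each: 66 devices -- exceeds the limit.
print(host_device_count([32, 32]) <= MAX_ISCSI_DEVICES_PER_HOST)  # False
```

This also shows how the limit can be hit with multiple libraries even when no single library exceeds 63 drives.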

Cartridge sizing
The size of a virtual cartridge has no impact on its performance and cartridges do not pre-allocate storage. It is recommended that cartridges are
created to match the amount of data being backed up. For example, if a full backup is 500 GB, the next larger configurable cartridge size is 800
GB, so this should be selected.

Note that if backups are to be offloaded to physical media elsewhere in the network, it is recommended that the cartridge sizing matches that of
the physical media to be used.
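The sizing rule above can be expressed as a small selection function. Note that the capacity list below is an assumption for illustration (typical configurable sizes mirroring physical tape capacities); check the sizes actually offered by your StoreOnce model:

```python
# Illustrative cartridge-size picker. CARTRIDGE_SIZES_GB is an assumption
# (typical configurable capacities); verify against your appliance's options.
CARTRIDGE_SIZES_GB = [100, 200, 400, 800, 1600, 3200]

def pick_cartridge_size(full_backup_gb, physical_tape_gb=None):
    """Smallest configurable size that holds a full backup; if offload to
    physical tape is planned, cap at the tape capacity (whichever is smaller)."""
    target = full_backup_gb
    if physical_tape_gb is not None:
        target = min(target, physical_tape_gb)
    for size in CARTRIDGE_SIZES_GB:
        if size >= target:
            return size
    return CARTRIDGE_SIZES_GB[-1]

print(pick_cartridge_size(500))  # 800 -- matches the 500 GB example above
```

For a 2 TB full backup offloaded to 800 GB physical tape, the function returns 800, matching the "whichever is smaller" guidance.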

Number of libraries per appliance


The StoreOnce appliance supports the creation of multiple virtual library devices. If large amounts of data are being backed up from multiple
hosts or for multiple disk LUNs on a single host, it is good practice to separate these across several libraries (and consequently into multiple
backup jobs). Each library has a separate deduplication “store” associated with it. Reducing the amount of data in, and complexity of, each store
will improve its performance.

Creating a number of smaller deduplication “stores” rather than one large store which receives data from multiple backup hosts could have an
impact on the overall effectiveness of deduplication. However, generally, the cross-server deduplication effect is quite low unless a lot of
common data is being stored. If a lot of common data is present on two servers, it is recommended that these are backed up to the same virtual
library.
 For best backup performance, configure multiple virtual libraries and use them all concurrently.
 For best deduplication performance, use a single virtual library and fully utilize all the drives in that one library.

Backup application configuration
In general, backup application configurations for physical tape devices can be readily ported over to target a deduplicating virtual library with no
changes; this is one of the key benefits of virtual libraries – seamless integration. However, considering deduplication in the design of a backup
application configuration can improve performance, deduplication ratio, or ease of data recovery, so some time spent optimizing the backup
application configuration is worthwhile.

Blocksize and transfer size


As with physical tape, larger tape block sizes and host transfer sizes are beneficial, because they reduce the overhead of headers added by the
backup application and by the transport interface. The recommended minimum block size is 256 KB, and up to 1 MB is suggested if the backup
application and operating system support it.

For HP Data Protector and EMC NetWorker software, a block size of 512 KB has been found to provide the best balance of deduplication ratio and
performance and is the recommended block size for these applications.

Some minor setting changes to upstream infrastructure might be required to allow backups with greater than 256 KB block size to be performed.
For example, Microsoft’s iSCSI initiator implementation, by default, does not allow block sizes that are greater than 256 KB. To use a block size
greater than this you need to modify the following registry setting:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters

Change the REG_DWORD MaxTransferLength to “80000” hex (524,288 bytes), and restart the media server – this will restart the iSCSI
initiator with the new value.

Rotation schemes and retention policy


Retention policy
The most important consideration is the type of backup rotation scheme and associated retention policy to employ. With data deduplication there
is little penalty for using a large number of virtual cartridges in a rotation scheme and therefore a long retention policy for cartridges because
most data will be the same between backups and will therefore be deduplicated.

A long retention policy provides a more granular set of recovery points with a greater likelihood that a file that needs to be recovered will be
available for longer and in many more versions.

Rotation scheme
There are two aspects to a rotation scheme which need to be considered:
 Full versus Incremental/Differential backups
 Overwrite versus Append of media

Full versus Incremental/Differential backups


The requirement for full or incremental backups is based on two factors: how often offsite copies of virtual cartridges are required, and the speed
of data recovery. If regular physical media copies are required, the best approach is for these to be full backups on a single cartridge. Speed of
data recovery is less of a concern with a virtual library appliance than it is with physical media. For example, if a server fails and needs to be fully
recovered from backup, this recovery will require the last full backup plus every incremental backup since (or the last differential backup). With
physical tape it can be a time-consuming process to find and load multiple physical cartridges. With virtual tape, however, there is no need to find
all of the pieces of media and, because the data is stored on disk, the time to restore single files is lower due to the ability to seek randomly within
a backup and to load a second cartridge instantly.

Overwrite versus append of media


Overwriting versus appending to cartridges is another area where virtual tape has a benefit. With physical media it is often sensible to append
multiple backup jobs to a single cartridge in order to reduce media costs; the downside of this is that cartridges cannot be overwritten until the
retention policy for the last backup on that cartridge has expired. The diagram below shows a cartridge containing multiple appended backup
sessions, some of which are expired and others that are valid. Space will be used by the StoreOnce appliance to store the expired sessions as well
as the valid sessions. Moving to an overwrite strategy will avoid this.

With virtual tape a large number of cartridges can be configured for “free” and their sizes can be configured so that they are appropriate to the
amount of data stored in a specific backup. Appended backups are of no benefit because media costs are not relevant in the case of VTL.

Figure 15: Cartridges with appended backups (not recommended)

Our recommendations are:


 Target full backup jobs to specific cartridges, sized appropriately
 Reduce the number of appends by specifying separate cartridges for each incremental backup

Taking the above factors into consideration, an example of a good rotation scheme where the customer requires weekly full backups sent offsite
and a recovery point objective of every day in the last week, every week in the last month, every month in the last year and every year in the last 5
years might be as follows:
 4 daily backup cartridges, Monday to Thursday, incremental backup, overwritten every week.
 4 weekly backup cartridges, Fridays, full backup, overwritten every fifth week.
 12 monthly backup cartridges, last Friday of month, overwritten every 13th month.
 5 yearly backup cartridges, last day of year, overwritten every 5 years.
This means that in the steady state daily backups will be small, and whilst they will always overwrite the previous week's cartridges, the amount
of data overwritten will be small. Weekly full backups will always overwrite, but housekeeping has plenty of time to run over the following day or
weekend (or whenever it is scheduled to run); the same is true for monthly and yearly backups.

Total virtual tapes required in above rotation = 25


Each backup job effectively has its own virtual tape.
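The arithmetic behind the 25-tape figure can be written out directly:

```python
# Tape count for the example rotation scheme described above.
rotation = {
    "daily (Mon-Thu incremental, weekly overwrite)": 4,
    "weekly (Friday full, 5-week overwrite)":        4,
    "monthly (last Friday, 13-month overwrite)":     12,
    "yearly (last day of year, 5-year overwrite)":   5,
}
total = sum(rotation.values())
print(total)  # 25
```

Because virtual cartridges consume negligible space until written, extending any tier of this scheme (for example, 8 weekly tapes instead of 4) costs only the incremental deduplicated data retained.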

StoreOnce NAS configuration guidelines
Introduction to StoreOnce NAS backup targets
The HP StoreOnce Backup system supports the ability to create a NAS (CIFS or NFS) share to be used as a target for backup applications.

The NAS shares provide data deduplication in order to make efficient use of the physical disk capacity when performing backup workloads.
The StoreOnce device is designed to be used for backup not for primary storage or general purpose NAS (drag and drop storage – random access).
Backup applications provide many configuration parameters that can improve the performance of backup to NAS targets, so some time spent
tuning the backup environment is required in order to ensure best performance.

Overview of NAS best practices


 Configure bonded network ports for best performance.
 Configure multiple shares and separate data types into their own shares.
 Adhere to the suggested maximum number of concurrent operations per share/appliance.
 Choose disk container backup file sizes in the backup software to match the maximum size of the backup data. If this is not possible, make the
backup container size as large as possible.
 Disable software compression, deduplication or synthetic full backups.
 Do not pre-allocate disk space for backup files.
 Monitor the number of files created in the share at regular intervals, as there is a 25,000 file limit per share.
 For CIFS shares the recommended implementation is to use AD authentication (see CIFS share authentication, later in this section).
 For NFS shares there is a specific mount option which ensures all data to the NFS share is sent “in order”, which enables the best deduplication
ratio. The name of the mount option varies according to the operating system; some operating systems also require an update package to be
installed to enable this. See the “HP StoreOnce Linux and UNIX Configuration Guide” for more details.

OS       Version  Option
SLES     10       "sync" patch and "-o sync" mount option
         11       "-o sync" mount option
RHEL     5        "sync" patch and "-o sync" mount option
         6        "-o sync" mount option
HP-UX    11iv2    "-o forcedirectio" option
         11iv3    "-o forcedirectio" option
Solaris  9        "-o forcedirectio" option
         10       "-o forcedirectio" option
AIX      5.3      "-o forcedirectio" option
         6.1      "-o forcedirectio" option
         7.1      "-o forcedirectio" option
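For example, the mounts below show the two option families in use. The server name and paths are hypothetical; pick the option for your OS from the table above:

```shell
# Linux (e.g. SLES 11, RHEL 6) -- force in-order writes to the share:
mount -t nfs -o sync storeonce1:/nas/share1 /mnt/storeonce

# HP-UX, Solaris, AIX -- use forcedirectio instead:
mount -o forcedirectio storeonce1:/nas/share1 /mnt/storeonce
```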

Shares and deduplication stores


Each NAS share created on the StoreOnce system has its own deduplication “store”; any data backed up to a share will be deduplicated against all
of the other data in that store. There is no option to create non-deduplicating NAS shares, and there is no deduplication between different shares
on the same StoreOnce appliance.

Once a StoreOnce CIFS share is created, subdirectories can be created via Explorer. This enables multiple host servers to back up to a single NAS
share but each server can back up to a specific sub-directory on that share. Alternatively a separate share for each host can be created.

The backup usage model for StoreOnce has driven several optimizations in the NAS implementation which must be accommodated when
designing a backup regime:

 Only backup files larger than 24 MB will be deduplicated; this works well with backup applications because they generally create large backup
files and store them in configurable larger containers. Note that simply copying a collection of files to the share (by drag and drop, for example)
will not result in the smaller files being deduplicated.
 There is a limit of 25,000 files per NAS share; applying this limit ensures good replication responsiveness to data change. This is not an issue
with most backup applications because they create large files, and it is very unlikely that there will be a need to store more than 25,000 files on a
single share.
 A limit on the number of concurrently open files, both above and below the deduplication file size threshold (24 MB), is applied. This prevents
overloading of the deduplication system and thus loss of performance. See Appendix A for the values for each specific model.
When protecting a large amount of data from several servers with a StoreOnce NAS solution, it is sensible to split the data across several shares in
order to get the best performance from the entire system by improving the responsiveness of each store. Smaller stores have less work to do in
order to match new data to existing chunks, so they can perform faster.

The best way to do this whilst still maintaining a good deduplication ratio is to group similar data from several servers in the same store. For
example: keep file data from several servers in one share, and Oracle database backups in another share.

Maximum concurrently open files


The table below shows the maximum number of concurrently open files per share and per StoreOnce appliance for files above and below the 24
MB dedupe threshold size.
A backup job may consist of several small metadata/control files (that are constantly being updated) and at least one large data file. In some
cases, backup applications will hold open more than one large file. It is important not to exceed the maximum number of concurrent backup
operations; see Concurrent operations on page 40.

If these thresholds are breached the backup application will receive an error from the StoreOnce appliance indicating that a file could not be
opened and the backup will fail.

Open files values                        StoreOnce 2620  StoreOnce 4210/4220  StoreOnce 4420/4430
Max open files per share > 24 MB         48              64                   128
Max open files per appliance > 24 MB     48              64                   128
Max open files per appliance: Total      112             128                  640

The numbers of concurrently open files in the table above do not guarantee that the StoreOnce appliance will perform optimally with this number
of concurrent backups. Nor do they take into account the fact that host systems may report a file as closed before the actual close takes place,
which means that the limits in the table could be exceeded without realizing it.

Should the open file limit be exceeded, an entry is made in the StoreOnce Event Log so that the user knows this has happened. The corrective
action is to reduce the number of concurrent backups that caused too many files to be open at once, for example by rescheduling some of the
backup jobs to take place at a different time.

Backup application configuration


The HP StoreOnce Backup system NAS functionality is designed to be used with backup applications that create large “backup files” containing all
of the server backup data rather than applications that simply copy the file system contents to a share.

When using a backup application with StoreOnce NAS shares, the user will need to configure a new type of device in the backup application. Each
application varies as to what it calls a backup device that is located on a StoreOnce appliance; for example, it may be called a File Library, Backup
to Disk Folder, or even Virtual Tape Library.

Most backup applications allow the operator to set various parameters for the NAS backup device that is created; these parameters are
important in ensuring good performance in different backup configurations. Generic best practices can be applied to all applications as follows.

Backup file size


Backup applications using disk/NAS targets will create one or more large backup files per backup stream; these contain all of the backed-up data.
Generally a limit is set on the size this file can reach before a new one is created (usually defaulting to 4–5 GB). A backup file is analogous to a
virtual cartridge for VTL devices, but default file sizes will be much smaller than a typical virtual cartridge size (e.g. a virtual cartridge may be
800 GB).

In addition to the data files, there will also be a small number of metadata files, such as catalogue and lock files; these will generally be smaller
than the 24 MB dedupe threshold size and will not be deduplicated. These files are frequently updated throughout the backup process, so
allowing them to be accessed randomly without deduplication ensures that they can be accessed quickly. The first 24 MB of any file is not
deduplicated: for metadata files this means that the whole file is not deduplicated; for backup data files only the first 24 MB is not deduplicated.
This architecture is completely invisible to the backup application, which sees its files in the same way as on any ordinary NAS share.

Figure 16: Backup size of data file

It is possible that the backup application will modify data within the deduplicated data region; this is referred to as a write-in-place operation. This
is expected to occur rarely with standard backup applications because these generally perform stream backups and either create a new file or
append to the end of an existing file rather than accessing a file in the middle.

If a write-in-place operation does occur, the StoreOnce appliance will create a new backup item that is not deduplicated; a pointer to this new item
is then created so that when the file is read, the new write-in-place item is accessed instead of the original data within the backup file.

Figure 17: Backup size of data file with write-in-place item

If a backup application were to perform a large amount of write-in-place operations, there would be an impact on backup performance – because
of the random access nature that write in place creates.

Some backup applications provide the ability to perform “Synthetic Full” backups; these may produce a lot of write-in-place operations or open a
large number of files at once. It is therefore recommended that Synthetic Full backup techniques are not used; see Synthetic full backups on
page 41 for more information.

Generally configuring larger backup container file sizes will improve backup performance and deduplication ratio because:
1. The overhead of the 24 MB dedupe region is reduced.
2. The backup application can stream data for longer without having to close and create new files.
3. There is a lower percentage overhead of control data within the file that the backup application uses to manage its data files.
4. There is no penalty to using larger backup files as disk space is not usually pre-allocated by the backup application.

If possible, the best practice is to configure a container file size that is larger than the complete backup will be (allowing for some data growth
over time), so that only one file is used for each backup. Some applications limit the maximum size to something smaller than this, in which case
using the largest configurable size is the best approach.
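Point 1 above is easy to quantify: the fixed 24 MB non-deduplicated region shrinks as a fraction of each container file as the container grows. An illustrative calculation (not from the white paper, just the arithmetic):

```python
# Illustrative: fraction of a single backup container file that falls in
# the non-deduplicated 24 MB region, for several container sizes.
DEDUPE_THRESHOLD_MB = 24

def non_dedupe_overhead(container_gb):
    """Return the non-deduplicated fraction of one container file."""
    return DEDUPE_THRESHOLD_MB / (container_gb * 1024)

for size_gb in (4, 64, 512):
    print(f"{size_gb:>4} GB container: {non_dedupe_overhead(size_gb):.4%} not deduplicated")
```

With a 4 GB default container the overhead is roughly 0.6% per file; at 512 GB it becomes negligible, which is one reason larger containers improve the deduplication ratio.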

Disk space pre-allocation


Some backup applications allow the user to choose whether to “pre-allocate” the disk space for each file at creation time, i.e. as soon as a backup
file is created, an empty file of the maximum size that the backup file can reach is created. This is done to ensure that there is enough disk space
available to write the entire backup file. This setting has no value for StoreOnce devices because, due to the deduplication system, it will not
result in any physical disk space actually being allocated.

It is advised that this setting is NOT used because it can result in unrealistically high deduplication ratios being presented when pre-allocated files
are not completely filled with backup data or, in extreme cases, it will cause a backup failure due to a timeout if the application tries to write a
small amount of data at the end of a large empty file. This results in the entire file having to be padded-out with zeros at creation time, which is a
very time consuming operation.

Block / transfer size


Some backup applications provide a setting for block or transfer size for backup data in the same way as for tape type devices. Larger block sizes
are beneficial in the same way for NAS devices as they are for virtual tape devices because they allow for more efficient use of the network
interface by reducing the amount of metadata required for each data transfer. In general, set block or transfer size to the largest value allowed by
the backup application.

Concurrent operations
For best StoreOnce performance it is important to either perform multiple concurrent backup jobs or use multiple streams for each backup (whilst
staying within the limit of concurrently open files per NAS share). Backup applications provide an option to set the maximum number of
concurrent backup streams per file device; this parameter is generally referred to as the number of writers. Setting this to the maximum values
shown in the table below ensures that multiple backups or streams can run concurrently whilst remaining within the concurrent file limits for each
StoreOnce share.

The table below shows the recommended maximum number of backup streams or jobs per share to ensure that backups will not fail due to
exceeding the maximum number of concurrently open files. Note however that optimal performance may be achieved at a lower number of
concurrent backup streams.

These values are based on standard “file” backup using most major backup applications.

If backing up using application agents (e.g. Exchange, SQL, Oracle) it is recommended that only one backup per share is run concurrently because
these application agents frequently open more concurrent files than standard file type backups.

                                            StoreOnce 2620  StoreOnce 4210/4220  StoreOnce 4420/4430
Suggested maximum concurrent backup
streams per share                           4               6                    12
Suggested maximum concurrent backup
streams per appliance                       32              48                   64

Overall best performance is achieved by running a number of concurrent backup streams across several shares; the exact number of streams
depends upon the StoreOnce model being used and also the performance of the backup servers.

Buffering
If the backup application provides a setting to enable buffering for Read and/or Write operations, this will generally improve performance by
ensuring that the application does not wait for write or read operations to report completion before sending the next command. However, this
setting could result in the backup application inadvertently causing the StoreOnce appliance to have more concurrently open files than the
specified limits (because files may not have had time to close before a new open request is sent). If backup failures occur, disabling buffered
writes and reads may fix the problem; in that case, reducing the number of concurrent backup streams and then re-enabling buffering will
provide the best performance.

Overwrite versus append


This setting allows the backup application either to always start a new backup file for each backup job (overwrite) or to continue to fill any
backup file that has not reached its size limit before starting new ones (append).

Appended backups should not be used: the append model offers no benefit on a StoreOnce device because, unlike physical media, appending does
not save disk space.

Compression and encryption


Most backup applications provide the option to compress the backup data in software before sending; this should not be enabled.

Software compression will have the following negative impacts:


1. Consumption of system resources on the backup server and an associated performance impact.
2. Introduction of randomness into the data stream between backups, which will reduce the effectiveness of StoreOnce deduplication.

Some backup applications now also provide software encryption; this technology prevents either the restoration of data to another system or
interception of the data during transfer. Unfortunately it also has a very detrimental effect on deduplication, because encrypted data looks
different in every backup, preventing the matching of similar data blocks.

The best practice is to disable software encryption and compression for all backups to the HP StoreOnce Backup system.

Verify
By default most backup applications will perform a verify pass on each backup job, in which they read the backup data back from the StoreOnce
appliance and check it against the original data.

Due to the nature of deduplication, reading data is slower than writing because the data needs to be rehydrated; running a verify pass will
therefore more than double the overall backup time. If possible, verify should be disabled for all backup jobs to the StoreOnce appliance, but trial
restores should still be performed on a regular basis.

Synthetic full backups


Some backup applications have introduced the concept of a “Synthetic Full” backup where, after an initial full backup, only file- or block-based
incremental backups are taken. The backup application then constructs a full system recovery for a specific point in time from the original full
backup and all of the changes up to the specified recovery point.
In most cases this model will not work well with a NAS target on a StoreOnce Backup system, for one of two reasons:
 The backup application may post-process each incremental backup to apply the changes to the original full backup. This performs a lot of
random read, write, and write-in-place operations, which are very inefficient for the deduplication system, resulting in poor performance and
dedupe ratio.
 If the backup application does not post-process the data, it will need to perform a reconstruction operation when the data is restored. This will
open and read a large number of incremental backup files, each containing only a small part of the final recovery image, so the access will be
very random in nature and therefore slow.
An exception to this restriction is the HP Data Protector Synthetic full backup, which works well. However, the HP Data Protector Virtual Synthetic
full backup, which uses a distributed file system and creates thousands of open files, does not. Check with your backup application vendor or HP
Sales person for more details.

CIFS share authentication


The StoreOnce device provides three possible authentication options for the CIFS server:
 None – All shares created are accessible to any user from any client (least secure).
 User – Local (StoreOnce) User account authentication.
 AD – Active Directory User account authentication.

None – This authentication mode requires no username or password and is the simplest configuration. Backup applications will
always be able to use shares configured in this mode with no changes to either server or backup application configuration. However, this mode
provides no data security as anyone can access the shares and add or delete data.

User – In this mode it is possible to create “local StoreOnce users” from the StoreOnce management interface. This mode requires the
configuration of a respective local user on the backup application media server as well as configuration changes to the backup application
services. Individual users can then be assigned access to individual shares on the StoreOnce appliance. This authentication mode is ONLY
recommended when the backup application media server is not a member of an AD Domain.

AD – In this mode the StoreOnce CIFS server becomes a member of an Active Directory domain. To join an AD domain, you must provide the
credentials of a user who has permission to add computers and users to the domain. After joining, access to each share is controlled by domain
management tools, and domain users or groups can be given access to individual shares on the StoreOnce appliance. This is the recommended
and preferred authentication mode if the backup application media server is a member of an AD domain.

Refer to the “HP StoreOnce Backup system user guide” for more information about configuring authentication.

41
StoreOnce Replication
StoreOnce replication is a concept that is used with VTL and NAS devices. The equivalent concept for Catalyst store is called Catalyst Copy, which
is described in Appendix C. All three device types use a deduplication-enabled, low bandwidth transfer policy to replicate data from a device on a
“replication source” StoreOnce Backup system to an equivalent device on another “replication target” StoreOnce Backup system. The
fundamental difference is that the backup application controls Catalyst store copy operations, whereas all VTL and NAS replication is configured
and managed on the StoreOnce Management GUI.

Replication provides a point-in-time “mirror” of the data on the source StoreOnce device at a target StoreOnce Backup system on another site;
this enables quick recovery from a disaster that has resulted in the loss of both the original and backup versions of the data on the source site.

Replication does not, however, provide any ability to roll back to previously backed-up versions of data that have been lost from the source
StoreOnce Backup system. For example, if a file is accidentally deleted from a server (and therefore not included in the next backup) and all
previous backup versions on the source StoreOnce Backup system have also been deleted, those files will also be deleted from the replication
target device, because the target is an exact mirror of the source device. The only exception is a Catalyst device type, because the retention
periods of data on the target can be different from (greater, in most cases, than) the retention periods at the source – giving an additional
margin of data protection.

StoreOnce VTL and NAS replication overview


The StoreOnce Backup system utilizes a proprietary protocol for replication traffic over the Ethernet ports; this protocol is optimized for
deduplication-enabled replication traffic. An item (VTL Cartridge or NAS file) will be marked ready for replication as soon as it is closed (or the VTL
cartridge returned to its slot). Replication works in a “round robin” process through the libraries and shares on a StoreOnce Backup system; when
it gets to an item that is ready for replication it will start a replication job for that item assuming there is not already the maximum number of
replication jobs underway. Replication will first exchange metadata information between source and target to identify the blocks of deduplicated
data that are different; it will then synchronize the changes between the two appliances by transferring the changed blocks or marking blocks for
removal at the target appliance. Replication does trigger housekeeping on the Target Appliance.

Replication will not prevent backup or restore operations from taking place. If an item is re-opened for further backups or restore, replication of
that item will be paused and resumed later, or cancelled if the item is changed.

Replication can also be configured to occur at specific times (via configurable blackout windows) in order to optimize bandwidth usage and not
affect other applications that might be sharing the same WAN link.

VTL and NAS replication is configured between devices using “Mappings”; it is not visible to the backup software but is controlled entirely by the
StoreOnce appliance. Catalyst Copy is controlled entirely by the backup software and has no Mappings within the device to configure. A data
import process is necessary to recover data from a target NAS or VTL device, but with Catalyst no backup application import is required because
the additional copies are already known to the backup software.

Best practices overview


 Use the StoreOnce Sizing tool to accurately specify StoreOnce Backup system and WAN link requirements.
 Use replication blackout windows to avoid overlap with backup operations.
 Use bandwidth throttling when necessary to prevent oversubscription of the WAN link.
 Replication jobs may be paused or cancelled if insufficient WAN bandwidth is available. Limit the number of concurrent replication jobs if only a
small WAN bandwidth is available.
 When creating a VTL replication mapping, select only the subset of cartridges that you need to replicate, for example a daily backup. Similarly
for Catalyst stores only certain backup jobs can be selected for copy.
 With VTL replication and Catalyst Copy several sources (VTL slots or Catalyst stores) can be replicated into a single target device (VTL or
Catalyst Store) – this provides the option of consolidation in a single target for easier administration at the target site. It also provides a single
Disaster Recovery pool. NAS source and replication targets have a 1:1 mapping only. Use this consolidation approach when there is similar data
on source sites that lends itself to consolidation at a central site.

42
Replication usage models (VTL and NAS only)
There are four main usage models for replication using StoreOnce VTL and NAS devices shown below.
 Active/Passive – A StoreOnce system at an alternate site is dedicated solely as a target for replication from a StoreOnce system at a primary
location.
 Active/Active – Both StoreOnce systems are backing up local data as well as receiving replicated data from each other.
 Many-to-One – A target StoreOnce system at a data center is receiving replicated data from many other StoreOnce systems at other locations.
 N-Way – A collection of StoreOnce systems on several sites are acting as replication targets for other sites.
The usage model employed will have some bearing on the best practices that can be employed to provide best performance. The following
diagrams show the usage models using VTL device types.

Figure 18: Active to Passive configuration

Figure 19: Active to Active configuration

43
Figure 20: Many to One configuration

44
Figure 21: N-way configuration

In most cases StoreOnce VTL and StoreOnce NAS replication behave the same. The only significant configuration difference is that VTL replication
allows multiple source libraries to replicate into a single target library, whereas NAS mappings are 1:1 – one replication target share may only
receive data from a single replication source share. In both cases, source libraries or shares may only replicate into a single target.
Additionally, with VTL replication a subset of the cartridges within a library may be configured for replication (a share may only be replicated in its
entirety).

What to replicate
StoreOnce VTL replication allows for a subset of the cartridges within a library to be mapped for replication rather than the entire library (NAS
replication does not allow this).

Some retention policies may not require that all backups are replicated, for example daily incremental backups may not need to go offsite but
weekly and monthly full backups do, in which case it is possible to configure replication to only replicate those cartridges that are used for the full
backups.

Reducing the number of cartridges that make up the replication mapping may also be useful when replicating several source libraries from
different StoreOnce devices into a single target library at a data center, for example. Limited slots in the target library are better utilized by
accepting replication of full backup cartridges only, rather than incremental backup cartridges as well.

Configuring this reduced mapping does require that the backup administrator has control over which cartridges in the source library are used for
which type of backup. Generally this is done by creating media pools with the backup application then manually assigning source library
cartridges into the relevant pools. For example the backup administrator may configure 3 pools:
 Daily Incremental, 5 cartridge slots (overwritten each week)
 Weekly Full, 4 cartridge slots (overwritten every 4 weeks)
 Monthly Full, 12 cartridge slots (overwritten yearly)

45
Replicating only the slots that will contain full backup cartridges saves five slots on the replication target device which could be better utilized to
accept replication from another source library.
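As an illustration, the slot arithmetic above can be sketched in a few lines of Python. The pool names and slot counts are taken from the example; the script itself is only a sketch, not an HP tool:

```python
# Hypothetical media pools from the example above: name -> slots and whether
# the pool's cartridge slots are included in the replication mapping.
pools = {
    "Daily Incremental": {"slots": 5, "replicate": False},
    "Weekly Full":       {"slots": 4, "replicate": True},
    "Monthly Full":      {"slots": 12, "replicate": True},
}

total_slots = sum(p["slots"] for p in pools.values())
replicated_slots = sum(p["slots"] for p in pools.values() if p["replicate"])
saved_slots = total_slots - replicated_slots

print(f"Source library slots:         {total_slots}")       # 21
print(f"Slots mapped for replication: {replicated_slots}")  # 16
print(f"Target slots saved:           {saved_slots}")       # 5
```

The five saved slots are exactly the daily incremental slots that never leave the source site.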

Note: The Catalyst equivalent of this requires the actual Backup policies to define which backups to Catalyst stores are to be copied and which are
not – so for example, you could configure only Full backups to be copied to Catalyst stores.

Appliance, library and share replication fan in/out


Each StoreOnce model has a different level of support for the number of other StoreOnce appliances that can be involved in replication mappings
with it and also the number of libraries that may replicate into a single library on the device. The configuration settings are defined below; please
see Appendix A for actual values per StoreOnce model.

Max Appliance Fan-out – The maximum number of target appliances that a source appliance can be paired with.
Max Appliance Fan-in – The maximum number of source appliances that a target appliance can be paired with.
Max Library Fan-in – The maximum number of source libraries that may replicate into a single target library on this type of appliance.
Max Library Fan-out – The maximum number of target libraries that may be replicated into from a single source library on this type of appliance.
Max Share Fan-in – The maximum number of source NAS shares that may replicate into a single target NAS share on this type of appliance.
Max Share Fan-out – The maximum number of target NAS shares that may be replicated into from a single source NAS share on this type of appliance.

It is important to note that when utilizing a VTL replication fan-in model (where multiple source libraries are replicated to a single target library),
the deduplication ratio may be better than is achieved by each individual source library, due to deduplication across all of the data in the single
target library. However, over a long period of time the performance of this solution will be slower than configuring individual target libraries,
because the deduplication stores will be larger and therefore require more processing for each new replication job.

Parameter HP StoreOnce 2620 HP StoreOnce 4210 HP StoreOnce 4220 HP StoreOnce 4420 / 4430

Appliance Fan-In 8 16 24 50
Appliance Fan-Out 2 4 4 8
Device Fan-In (VTL only) 1 8 8 16
Device Fan-out 1 1 1 1
Max Concurrent Outbound Jobs 12 24 24 48
Max Concurrent Inbound Jobs 24 48 48 96
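For planning purposes, the limits in the table above can be captured in a small lookup. This is an illustrative sketch, not an HP utility, and the values should always be checked against Appendix A for your model:

```python
# Replication limits per model, transcribed from the table above.
LIMITS = {
    "2620": {"fan_in": 8,  "fan_out": 2, "vtl_fan_in": 1,  "out_jobs": 12, "in_jobs": 24},
    "4210": {"fan_in": 16, "fan_out": 4, "vtl_fan_in": 8,  "out_jobs": 24, "in_jobs": 48},
    "4220": {"fan_in": 24, "fan_out": 4, "vtl_fan_in": 8,  "out_jobs": 24, "in_jobs": 48},
    "4420": {"fan_in": 50, "fan_out": 8, "vtl_fan_in": 16, "out_jobs": 48, "in_jobs": 96},
}

def check_fan_in(target_model: str, source_appliances: int) -> bool:
    """Return True if the planned number of source appliances fits this target model."""
    return source_appliances <= LIMITS[target_model]["fan_in"]

print(check_fan_in("4420", 30))  # True  - within the 50-appliance limit
print(check_fan_in("2620", 10))  # False - exceeds the 8-appliance limit
```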

Concurrent replication jobs


Each StoreOnce model has a different maximum number of concurrently running replication jobs when it is acting as a source or target for
replication; Appendix A and the table above show these values. When many items are available for replication, this is the maximum number of
jobs that will be running at any one time. As soon as one item has finished replicating, another will start.

For example, an HP StoreOnce 2620 may be replicating up to 4 jobs to a StoreOnce 4430, which may also be accepting another 44 source items
from other StoreOnce systems. Because the target concurrency for a 4430 is 96, the target is not the bottleneck to replication performance. If the
total number of source replication jobs is greater than 96, the StoreOnce 4430 will limit replication throughput and replication jobs will queue
until a slot becomes available.
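The queuing behaviour described above amounts to a simple calculation (illustrative only; the 96-job figure is the 4430 target concurrency from the table above):

```python
def target_limited(source_jobs: int, target_concurrency: int = 96) -> int:
    """Number of jobs that must queue because the target's concurrency limit is reached."""
    return max(0, source_jobs - target_concurrency)

# 4 jobs from the 2620 plus 44 from other sources: no queuing on a 4430.
print(target_limited(4 + 44))   # 0
# 120 concurrent source jobs would leave 24 queued until a slot becomes free.
print(target_limited(120))      # 24
```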

Apparent replication throughput


In characterizing replication performance we use the concept of “apparent throughput”. Since replication passes only unique data between sites,
there is a relationship between the speed at which the unique data is sent and the rate at which the whole backup is apparently replicated
between sites. In all reporting on the StoreOnce Management GUI, the throughput in MB/sec is apparent throughput – think of this as the rate at
which we are apparently replicating the backup data between sites.
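As a worked example (the backup size, change rate and elapsed time below are assumptions chosen for illustration, not measured figures):

```python
def apparent_throughput_mb_s(logical_backup_mb: float, elapsed_s: float) -> float:
    """Rate at which the whole backup item appears to move between sites."""
    return logical_backup_mb / elapsed_s

def physical_throughput_mb_s(unique_mb: float, elapsed_s: float) -> float:
    """Rate at which unique (deduplicated) data actually crosses the WAN."""
    return unique_mb / elapsed_s

# Assumed example: a 100 GB cartridge with a 2% change rate, replicated in 1 hour.
logical_mb, change_rate, elapsed = 100 * 1024, 0.02, 3600.0
print(round(apparent_throughput_mb_s(logical_mb, elapsed), 1))                 # 28.4 MB/s apparent
print(round(physical_throughput_mb_s(logical_mb * change_rate, elapsed), 2))   # 0.57 MB/s on the wire
```

The GUI would report the first figure, even though only the second actually crossed the link.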

46
What actually happens in replication?
Assuming the seeding process is complete (seeding is when the initial data is transferred to the target device), the basic replication process works
like this:
 The source has a cartridge (VTL) or file (NAS) to replicate.
 The source sends the target a “Manifest”: a list of all the hash codes that make up the cartridge/file/item it wants to send.
 The target replies: “I have 98% of those hash codes already – just send the 2% I don’t have.”
 The source sends the 2% of hash codes the target requested.
 The VTL or NAS replication job executes and completes.

The bigger the change rate of data, the more “mismatch” there will be and the higher the volume of unique data that must be replicated over the
WAN.
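The exchange above can be modelled with ordinary set operations. This is a toy illustration of the principle only, not the actual StoreOnce wire protocol:

```python
# Toy model of the manifest exchange: the source offers hash codes,
# the target asks only for the ones it lacks.

def replicate(source_hashes: set, target_hashes: set) -> set:
    """Return the hashes the source must actually send over the WAN."""
    manifest = source_hashes                # 1. source sends its manifest
    missing = manifest - target_hashes      # 2. target computes what it lacks
    target_hashes |= missing                # 3. source sends only the missing chunks
    return missing

source = {f"hash{i}" for i in range(100)}   # hashes making up the cartridge/file
target = {f"hash{i}" for i in range(98)}    # target already holds 98% of them

sent = replicate(source, target)
print(f"sent {len(sent)} of {len(source)} chunks ({len(sent)/len(source):.0%})")  # sent 2 of 100 chunks (2%)
```

A higher change rate simply means a larger `missing` set, which is why change rate drives WAN traffic.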

Limiting replication concurrency


In some cases it may be useful to limit the number of replication jobs that can run concurrently on either the source or target appliance. These
conditions might be:
1. There is a requirement to reduce the activity on either the source or target appliance in order to allow other operations (e.g. backup/restore)
to have more available disk I/O.
2. The WAN bandwidth is too low to support the number of jobs that may run concurrently. It is recommended that a minimum WAN bandwidth
of 2 Mb/s is available per replication job. If a target device can support, for example, 6 concurrent jobs, then 12 Mb/s of bandwidth is required
for that target appliance alone. If there are multiple target appliances, the overall requirement is even higher. Limiting the maximum number
of concurrent jobs at the target appliance therefore prevents the WAN bandwidth from being oversubscribed, with the possible result of
replication failures or impact on other WAN traffic.
The Maximum jobs configuration is available from the StoreOnce Management GUI on the Local Settings tab of the Replication – Configuration
page. Click Edit to edit the various replication parameters on the current device. In the example below we have left the Source Concurrency at 48
and Target Concurrency at 96 – the maximum for a StoreOnce 4420.

Other tabs seen on the above screenshot can be used to control the bandwidth throttling used for replication and the blackout windows that
prevents replication from happening at certain times.

WAN link sizing


One of the most important aspects in ensuring that a replication will work in a specific environment is the available bandwidth between
replication source and target StoreOnce systems. In most cases a WAN link will be used to transfer the data between sites unless the replication
environment is all on the same campus LAN.

It is recommended that the HP Sizing Tool (http://h30144.www3.hp.com/SWDSizerWeb/default.htm) is used to identify the product and WAN
link requirements because the required bandwidth is complex and depends on the following:

47
 Amount of data in each backup
 Data change per backup (deduplication ratio)
 Number of StoreOnce systems replicating
 Number of concurrent replication jobs from each source
 Number of concurrent replication jobs to each target
 Link latency (governs link efficiency)

As a general rule of thumb, however, a minimum bandwidth of 2 Mb/s per replication job should be allowed. For example, if a replication target is
capable of accepting 8 concurrent replication jobs (HP 4220) and there are enough concurrently running source jobs to reach that maximum, the
WAN link needs to provide 16 Mb/s to ensure that replication runs correctly at maximum efficiency – below this threshold, replication jobs may
begin to pause and restart due to link contention. It is important to note that this minimum value does not ensure that replication will meet the
performance requirements of the replication solution; considerably more bandwidth may be required to deliver optimal performance.
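The 2 Mb/s-per-job rule of thumb translates directly into a minimum-bandwidth calculation. This is a sketch of the rule above, not a substitute for the HP Sizing Tool:

```python
def min_wan_bandwidth_mbps(max_concurrent_jobs: int, per_job_mbps: float = 2.0) -> float:
    """Minimum WAN bandwidth to keep every concurrent replication job above the
    recommended 2 Mb/s floor. This is a floor, not a performance guarantee."""
    return max_concurrent_jobs * per_job_mbps

print(min_wan_bandwidth_mbps(8))   # 16.0 Mb/s for a target accepting 8 concurrent jobs
print(min_wan_bandwidth_mbps(96))  # 192.0 Mb/s if a 4430 target runs at full concurrency
```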

Seeding and why it is required


One of the benefits of deduplication is the ability to identify unique data, which then enables us to replicate between a source and a target
StoreOnce Backup system, only transferring the unique data identified. This process only requires low bandwidth WAN links, which is a great
advantage to the customer because it delivers automated disaster recovery in a very cost-effective manner. The StoreOnce Management GUI
reports bandwidth saving as a key metric of the replication process and in general it is around the 95-98% mark (depending on data change rate).

However prior to being able to replicate only unique data between source and target StoreOnce Backup system, we must first ensure that each
site has the same hash codes or “bulk data” loaded on it – this can be thought of as the reference data against which future backups are compared
to see if the hash codes exist already on the target. The process of getting the same bulk data or reference data loaded on the StoreOnce source
and StoreOnce target is known as “seeding”.

Note: With Catalyst the very first low bandwidth backup effectively performs its very own seeding operation.

Seeding is generally a one-time operation which must take place before steady-state, low bandwidth replication can commence. Seeding can take
place in a number of ways:
 Over the WAN link – although this can take some time for large volumes of data. A temporary increase in WAN bandwidth provision by
your telco can often alleviate this problem.
 Using co-location, where two devices are physically in the same location and can use a GbE replication link for seeding (best for
Active/Active and Active/Passive configurations). After seeding is complete, one unit is physically shipped to its permanent destination.
 Using a “floating” StoreOnce device which moves between multiple remote sites (best for many-to-one replication scenarios).
 Using a form of removable media (physical tape or portable USB disks) to “ship data” between sites.

The recommended way to accelerate seeding is by co-location of the source and target systems on the same LAN whilst performing the first
replicate. This process will obviously involve moving one or both of the appliances and will thus prevent them from running their normal backup
routines. In order to minimize disruption seeding should ideally only be done once; in this case all backup jobs that are going to be replicated must
have completed their first full backup to the source appliance before commencing a seeding operation.

Once seeding is complete there will typically be a 90+% hit rate, meaning most of the hash codes are already loaded on the source and target and
only the unique data will be transferred during replication.

It is good practice to plan for seeding time in your StoreOnce Backup system deployment plan as it can sometimes be very time consuming or
manually intensive work. The Sizing Tool calculates expected seeding times over WAN and LAN to help set expectations for how long seeding will
take. In practice, a gradual migration of backup jobs to the StoreOnce appliance ensures there is not a sudden surge in seeding requirements
but a gradual one, with weekends being used to perform high volume seeding jobs.
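For a rough feel of seeding times (the HP Sizing Tool remains the authoritative calculator), a back-of-envelope estimate simply divides the data volume by the effective link rate; the 80% efficiency factor below is an assumption, not an HP figure:

```python
def seeding_hours(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Rough first-replication (seeding) time: all data crosses the link once.
    'efficiency' is an assumed derating for protocol overhead and latency."""
    data_megabits = data_tb * 1024 * 1024 * 8      # TB -> megabits
    return data_megabits / (link_mbps * efficiency) / 3600

print(round(seeding_hours(5, 100), 1))    # 5 TB over a 100 Mb/s WAN: ~145.6 hours
print(round(seeding_hours(5, 1000), 1))   # same data over GbE co-location: ~14.6 hours
```

The order-of-magnitude gap between the two figures is why co-location seeding is recommended for larger volumes.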

During the seeding process it is recommended that no other operations are taking place on the source StoreOnce Backup system, such as further
backups or tape copies. It is also important to ensure that the StoreOnce Backup system has no failed disks and that RAID parity initialization is
complete because these will impact performance.

When seeding over fast networks (co-located StoreOnce devices) it should be expected that performance to replicate a cartridge or file is similar
to the performance of the original backup.

48
Replication models and seeding
The diagrams in Replication usage models starting on page 43 indicate the different replication models supported by HP StoreOnce Backup
systems; the complexity of the replication model has a direct influence on which seeding process is best. For example, an Active-Passive
replication model can easily use co-location to quickly seed the target device, whereas co-location may not be the best seeding method to use
with a 50:1, many-to-one replication model.
Note: HP StoreOnce Catalyst copy seeding follows the same processes outlined below with the added condition that for multi-hop and one to
many replication scenarios the seeding process may have to occur multiple times.

Summary of possible seeding methods and likely usage models

Technique: Seed over the WAN link
 Best for: Active–Passive and Many to 1 replication models with initial small volumes of backup data, OR a gradual migration of larger backup
volumes/jobs to StoreOnce over time.
 Concerns: This type of seeding should be scheduled to occur over weekends where at all possible.
 Comments: Seeding time over WAN is calculated automatically when using the Sizing tool for StoreOnce. It is perfectly acceptable for customers
to ask their link providers for a higher link speed just for the period where seeding is to take place.

Technique: Co-location (seed over LAN)
 Best for: Active–Passive, Active–Active and Many to 1 replication models with significant volumes of data (> 1TB) to seed quickly, and where it
would simply take too long to seed using a WAN link (> 5 days).
 Concerns: This process involves the transportation of complete StoreOnce units. It may not be practical for large fan-in implementations (e.g.
50:1) because of the time delays involved in transportation. This process can only really be used as a “one off” when replication is first
implemented.
 Comments: Seeding time over LAN is calculated automatically when using the Sizing tool for StoreOnce.

Technique: Floating StoreOnce
 Best for: Many to 1 replication models with high fan-in ratios where the target must be seeded with several remote sites at once.
 Concerns: Careful control over the device creation and co-location replication at the target site is required. See example below. Using the
floating StoreOnce approach means the device is ready to be used again and again for future expansion where more remote sites might be added
to the configuration.
 Comments: This is really co-location using a spare StoreOnce. The last remote site StoreOnce can be used as the floating unit.

Technique: Backup application tape offload/copy from source and copy onto target
 Best for: Suitable for all replication models, especially where remote sites are large (inter-continental) distances apart. Well suited to target
sites that plan to have a physical tape archive as part of the final solution. Best suited for StoreOnce VTL deployments.
 Concerns: Relies on the backup application supporting the copy process, e.g. media copy or “object copy” or “duplicate” or “cloning”.
 Comments: Reduced shipping costs of physical tape media over actual StoreOnce units. Requires physical tape connectivity at all sites, AND
media server capability at each site even if only for the seeding process. Backup application licensing costs for each remote site may be
applicable.

Technique: Use of portable disk drives – backup application copy or drag and drop
 Best for: USB portable disks, such as the HP RDX series, can be configured as disk file libraries within the backup application software and used
for “copies”; OR backup data can be dragged and dropped onto the portable disk drive, transported, and then dragged and dropped onto the
StoreOnce target. Best used for StoreOnce NAS deployments.
 Concerns: Multiple drives can be used – single drive maximum capacity is about 3TB currently.
 Comments: USB disks are typically easier to integrate into systems than physical tape or SAS/FC disks. RDX ruggedized disks are suitable for
easy shipment between sites and cost effective.
49
Seeding methods in more detail
Seeding over a WAN link
With this seeding method the final replication set-up (mappings) can be established immediately.

Active/Passive: WAN seeding over the first backup is, in fact, the first wholesale replication.

Active/Active: WAN seeding after the first backup at each location is, in fact, the first wholesale replication in each direction.
Figure 22: Seeding over a WAN link

50
Many to One

WAN seeding over the first backup is, in fact, the first wholesale replication from the many remote sites to the Target site. Care must be taken not
to run too many replications simultaneously or the Target site may become overloaded. Stagger the seeding process from each remote site.

Figure 23: Many-to-One seeding

51
Co-location (seed over LAN)
With this seeding method it is important to define the replication set-up (mappings) in advance so that in say the Many to One example the correct
mapping is established at each site the target StoreOnce appliance visits before the target StoreOnce appliance is finally shipped to the Data
Center Site and the replication “re-established” for the final time.

Active/Passive
Co-location seeding at Source (remote) site:

1. Initial backup.
2. Replication over GbE link.
3. Ship appliance to Data Center site.
4. Re-establish replication.

Figure 24: Seeding over LAN, using co-location

52
Many to One
Co-location seeding at Source (remote) sites; transport the target StoreOnce appliance between remote sites:

1. Initial backup at each remote site.
2. Replication to Target StoreOnce appliance over GbE at each remote site.
3. Move Target StoreOnce appliance between remote sites and repeat replication.
4. Finally take Target StoreOnce appliance to Data Center site.
5. Re-establish replication.

Figure 25: Co-location seeding over WAN, many-to-one

53
Floating StoreOnce appliance method of seeding

Many to One seeding with floating StoreOnce target – for large fan-in scenarios.
Co-location seeding at Source (remote) sites: transport the floating target StoreOnce appliance between remote sites, then perform replication
at the Data Center site. Repeat as necessary.

1. Initial backup at each remote site.
2. Replication to floating Target StoreOnce appliance over GbE at each remote site.
3. Move floating Target StoreOnce appliance between remote sites and repeat replication.
4. Take floating Target StoreOnce appliance to Data Center site.
5. Establish replication from floating StoreOnce Target (now a Source) with Target StoreOnce at Data Center. Delete devices on floating
Target StoreOnce appliance. Repeat the process for further remote sites until all data has been loaded onto the Data Center Target
StoreOnce appliance. You may be able to accommodate 4 or 5 sites of replicated data on a single floating StoreOnce appliance.
6. Establish final replication with remote sites.

Figure 26: Seeding using floating StoreOnce Backup system

This “floating StoreOnce appliance” method is more complex because, for large fan-in (many source sites replicating into a single target site), the
initial replication set-up on the floating StoreOnce appliance changes as it is transported to the data center, where the final replication
mappings are configured.

54
The sequence of events is as follows:
1. Plan the final master replication mappings from sources to target that are required and document them. Use an appropriate naming
convention e.g. SVTL1, SNASshare1, TVTL1, TNASshare1.
2. At each remote site perform a full system backup to the source StoreOnce appliance and then configure a 1:1 mapping relationship with the
floating StoreOnce appliance, e.g. SVTL1 on Remote Site A - FTVTL1 on the floating StoreOnce (FTVTL1 = floating target VTL1).
3. Seeding remote site A to the floating StoreOnce appliance will take place over the GbE link and should take only a few hours.
4. On the Source StoreOnce appliance at the remote site DELETE the replication mappings – this effectively isolates the data that is now on the
floating StoreOnce appliance.
5. Repeat the process steps 1-4 at Remote sites B and C.
6. When the floating StoreOnce appliance arrives at the central site, the floating StoreOnce appliance effectively becomes the Source device to
replicate INTO the StoreOnce appliance at the data center site.
7. On the floating StoreOnce appliance we will have devices (previously named FTVTL1, FTNASshare1) that we can see from the Web
Management Interface. Using the same master naming convention as in step 1, set up replication, which will necessitate the creation of
the necessary devices (VTL or NAS) on the StoreOnce 4220 at the Data Center site, e.g. TVTL1, TNASshare1.
8. This time when replication starts up the contents of the floating StoreOnce appliance will be replicated to the data center StoreOnce appliance
over the GbE connection at the data center site and will take several hours. In this example Remote Site A, B, C data will be replicated and
seeded into the StoreOnce 4220. When this replication step is complete, DELETE the replication mappings on the floating StoreOnce
appliance, to isolate the data on the floating StoreOnce appliance and then DELETE the actual devices on the floating StoreOnce appliance, so
the device is ready for the next batch of remote sites.
9. Repeat steps 1-8 for the next series of remote sites until all the remote site data has been seeded into the StoreOnce 4220.
10. Now we have to set up the final replication mappings using our agreed naming convention decided in Step 1. This time we go to the Remote
sites and configure replication again to the Data Center site but being careful to use the agreed naming convention at the data center site e.g.
TVTL1, TNASshare1 etc.
This time when we set up replication the StoreOnce 4220 at the target site presents a list of possible target replication devices available to
the remote site A. So in this example we would select TVTL1 or TNASshare1 from the drop-down list presented to Remote Site A when we are
configuring the final replication mappings. This time when the replication starts almost all the necessary data is already seeded on the
StoreOnce 4220 for Remote site A and the synchronization process happens very quickly.

Note: If using this approach with Catalyst stores that do not rely on “mappings”, the Floating StoreOnce appliance can be simply used to
collect all the Catalyst Items at the Remote sites if a consolidation model is to be deployed. If not, create a separate Catalyst store on the
Floating StoreOnce Appliance for each site.

Seeding using physical tape or portable disk drive and backup application copy utilities

Many-to-one seeding using physical tape or portable disk drives

Physical tape-based or portable disk drive seeding

1. Initial backup to StoreOnce appliance.
2. Copy to tape(s) or a disk using backup application software on the media server; for NAS devices only, use simple drag and drop to a
portable disk. This technique is not possible at Sites A & B unless a media server is present.
3. Ship tapes/disks to the data center site.
4. Copy tapes/disks into the target appliance using backup application software on the media server (or, for portable disks only, use drag and
drop onto a NAS share on the StoreOnce target).
5. Establish replication.

Figure 27: Seeding using physical tape and backup application

In this method of seeding we use a removable piece of media (like LTO physical tape or removable RDX disk drive acting as a disk Library or file
library*) to move data from the remote sites to the central data center site. This method requires the use of the backup application software and
additional hardware to put the data onto the removable media.

* Different backup software describes “disk targets for backup” in different ways, e.g. HP Data Protector calls StoreOnce NAS shares “DP File
Libraries” and Commvault Simpana calls them “disk libraries.”

Proceed as follows
1. Perform full system backup to the StoreOnce Backup system at the remote site using the local media server, e.g. at remote site C.
The media server must also be able to see additional devices such as a physical LTO tape library or a removable disk device configured as a
disk target for backup.

2. Use the backup application software to perform a full media copy of the contents of the StoreOnce Backup system to a physical tape or
removable disk target for backup also attached to the media server.
In the case of removable USB disk drives the capacity is probably limited to 2 TB; in the case of physical LTO5 media it is limited to about 3 TB
per tape, but of course multiple tapes are supported if a tape library is available. For USB disks, a separate disk backup target
would need to be created on each removable RDX drive, because multiple RDX removable disk drives cannot be spanned.
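As a rough planning aid, the number of pieces of removable media needed for a given amount of seed data can be estimated as follows (a simple sketch using the approximate capacities quoted above):

```python
import math

def media_needed(data_tb, media_capacity_tb):
    """Return how many pieces of removable media are needed to hold the
    seed data, assuming media cannot be spanned (as with RDX disks)."""
    return math.ceil(data_tb / media_capacity_tb)

# Example: seeding 7 TB of backup data
print(media_needed(7, 3))  # LTO5 tapes (~3 TB each) -> 3
print(media_needed(7, 2))  # 2 TB USB/RDX disks     -> 4
```

Remember that for RDX the constraint is stricter than the count alone suggests: each individual backup target must fit entirely on one disk.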

3. The media from the remote sites is then shipped (or even posted!) to the data center site.

4. Place the removable media into a library or connect the USB disk drive to the media server and let the media server at the data center site
discover the removable media devices.
The media server at the data center site typically has no information about what is on these pieces of removable media, so the data must be
made visible to it. This generally takes the form of what is known as an “import” operation, in which the
removable media is registered in the catalog/database of the media server at the data center site.

5. Create devices on the StoreOnce Backup system at the data center site using an agreed convention e.g. TVTL1, TNASshare1. Discover these
devices through the backup application so that the media server at the data center site has visibility of both the removable media devices AND
the devices configured on the StoreOnce Backup system.

6. Once the removable media has been imported into the media server at the data center site it can be copied onto the StoreOnce Backup system
at the data center site (in the same way as before at step 2) and, in the process of copying the data, we seed the StoreOnce Backup system at
the data center site. It is important to copy physical tape media into the VTL device that has been created on the StoreOnce Backup system,
and to copy the disk backup target (RDX) onto the StoreOnce NAS share that has been created on the StoreOnce Backup system
at the data center site.

7. Now we have to set up the final replication mappings using our agreed naming convention. Go to the remote sites and configure replication
again to the data center site, being careful to use the agreed naming convention at the data center site, e.g. TVTL1, TNASshare1. This time
when we set up replication, the StoreOnce 4220 at the target site presents a list of possible target replication devices available to the remote
site. So in this example we would select TVTL1 or TNASshare1 from the drop-down list presented to remote site C when configuring
the final replication mappings. When the replication starts, almost all the necessary data is already seeded on the StoreOnce 4220
for remote site C, so the synchronization process happens very quickly.

The media servers are likely to be permanently present at the remote sites and data center site, so this makes good use of existing equipment.
Physical tape drives/libraries at the various sites require a SAS or FC connection. For removable disk drives such as RDX, a USB
connection is the most likely choice because it is available on all servers at no extra cost.

If the StoreOnce deployment is going to use StoreOnce NAS shares at source and target sites the seeding process can be simplified even further
by using the portable disk drives to drag and drop backup data from the source system onto the portable disk. Then transport the portable disk to
the target StoreOnce site and connect it to a server with access to the StoreOnce NAS share at the target site. Perform a drag and drop from
portable disk onto the StoreOnce NAS share and this then performs the seeding for you!

Note: Drag and drop is NOT to be used for day to day use of StoreOnce NAS devices for backup; but for seeding large volumes of sequential data
this usage model is acceptable.
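Where the seeding copy is scripted rather than done by manual drag and drop, a minimal sketch of such a copy might look like this in Python (the paths are illustrative; the share would first be mounted via CIFS or NFS):

```python
import os
import shutil

def seed_share(source_dir, share_mount):
    """Copy a seed-data tree onto a mounted StoreOnce NAS share.
    Suitable only for one-off seeding of large sequential data,
    not for day-to-day backup use (see note above)."""
    for name in os.listdir(source_dir):
        src = os.path.join(source_dir, name)
        dst = os.path.join(share_mount, name)
        if os.path.isdir(src):
            shutil.copytree(src, dst)   # copy whole directory trees
        else:
            shutil.copy2(src, dst)      # copy files with metadata

# Example (paths are illustrative only):
# seed_share("/backup_staging/siteC", "/mnt/storeonce/TNASshare1")
```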

Only HP Data Protector, Symantec NetBackup and Symantec Backup Exec support HP StoreOnce Catalyst, but Catalyst stores can be “copied” to
tape or USB disk using object copy (Data Protector) or duplicate commands (NetBackup). See Appendix C for more details.

Controlling Replication
In order to either optimize the performance of replication or minimize the impact of replication on other StoreOnce operations it is important to
consider the complete workload being placed on the StoreOnce Backup system.
By default replication will start quickly after a backup completes; this window of time immediately after a backup may become very crowded if
nothing is done to separate tasks. In this time the following are likely to be taking place:
 Other backups to the StoreOnce Backup system which have not yet finished
 Housekeeping of the current and other completed overwrite backups
 Possible copies to physical tape media of the completed backups

These operations will all impact each other’s performance; some best practices to avoid these overlaps are:
 Set replication blackout windows to cover the backup window period, so that replication will not occur whilst backups are taking place.
 Set housekeeping blackout windows to cover the replication period; some tuning may be required in order to set the housekeeping window
correctly and allow enough time for housekeeping to run.
 Delay physical tape copies to run at a later time when housekeeping and replication have completed, preferably at the weekend.

Replication blackout windows


The replication process can be delayed from running using blackout windows, which may be configured using the StoreOnce Web Management
Interface. Up to two separate windows per day may be configured, and these may be set at different times for each day of the week.

The best practice is to set a blackout window throughout the backup window so that replication does not interfere with backup operations.
If tape copy operations are also scheduled, a blackout window for replication should also cover this time.

Care must be taken, however, to ensure that enough time is left for replication to complete. If it is not, some items will never be synchronized
between source and target and the StoreOnce Backup system will start to issue warnings about these items.

The replication blackout window settings can be found on the StoreOnce Management Interface on the HP StoreOnce - Replication - Local
Settings – Blackout Windows page.

Replication bandwidth limiting


In addition to replication blackout windows, the user can also define replication bandwidth limiting; this ensures that StoreOnce replication does
not swamp the WAN with traffic if it runs during the normal working day.

This enables blackout windows to be set to cover the backup window over the night time period but also allow replication to run during the day
without impacting normal business operation.

Bandwidth limiting is configured by defining the speed of the WAN link between the replication source and target, then specifying a maximum
percentage of that link that may be used.

Again, however, care must be taken to ensure that enough bandwidth is made available to replication so that at least the minimum speed (2 Mb/s
per job) is available, and more depending on the amount of data to be transferred in the required time.

Replication bandwidth limiting is applied to all outbound (source) replication jobs from an appliance; the bandwidth limit set is the maximum
bandwidth that the StoreOnce Backup system can use for replication across all replication jobs.

The replication bandwidth limiting settings can be found on the StoreOnce Management Interface on the HP StoreOnce - Replication - Local
Settings - Bandwidth Limiting page.

There are two ways in which replication bandwidth limits can be applied:
 General Bandwidth limit – this applies when no other limit windows are in place.
 Bandwidth limiting windows – these can apply different bandwidth limits for times of the day
A bandwidth limit calculator is supplied to assist with defining suitable limits.
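The arithmetic behind a suitable limit is straightforward; the following sketch (figures illustrative) also checks the 2 Mb/s-per-job minimum mentioned above:

```python
def replication_limit_mbit(link_speed_mbit, percent_allowed, concurrent_jobs):
    """Return the replication bandwidth limit in Mb/s, or raise if the
    limit cannot sustain the ~2 Mb/s minimum needed per replication job."""
    limit = link_speed_mbit * percent_allowed / 100.0
    if limit < 2 * concurrent_jobs:
        raise ValueError(
            "limit too low for %d jobs at 2 Mb/s each" % concurrent_jobs)
    return limit

# Example: a 100 Mb/s WAN link, 25% allowed for replication, 4 concurrent jobs
print(replication_limit_mbit(100, 25, 4))  # -> 25.0 Mb/s, above the 8 Mb/s floor
```

A limit that passes this check still needs headroom for the amount of data to be moved in the required time, as noted above.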

Source Appliance Permissions


It is a good practice to use the Source Appliance Permissions functionality provided on the Replication - Partner Appliances tab to prevent
malicious or accidental configuration of replication mappings from unknown or unauthorized source appliances.

See the “HP StoreOnce Backup system user guide” for information on how to configure Source Appliance Permissions.

Note the following changes to replication functionality when Source Appliance Permissions are enabled:
 Source appliances will only have visibility of, and be able to create mappings with, libraries and shares that they have already been
given permission to access.
 Source appliances will not be able to create new libraries and shares as part of the replication wizard process; instead, these shares and
libraries must be created ahead of time on the target appliance.

Replication and Catalyst Copy monitoring
The aim of replication is to ensure that data is “moved” offsite as quickly as possible after a backup job completes. The “maximum time to
offsite” varies depending on business requirements. The StoreOnce Backup system provides tools to help monitor replication performance and
alert system administrators if requirements are not being met. For larger replication deployments HP Replication Manager v2.1 is recommended
and is free to download once a replication license or Catalyst license has been purchased. There is a top level replication monitoring status for VTL
and NAS as shown below within the main StoreOnce GUI.

Configurable Synchronisation Progress Logging and Out of Sync Notification


These configurable logging settings allow the system administrator to take snapshots of replication progress as log events and at fixed hourly
points in order to get a historical view of progress that can be compared to previous days and weeks to check for changes in replication
completion time.

Out of Sync notifications can be configured so that an alert is sent if the required maximum time to offsite is exceeded.

If these logs and alerts indicate a problem, best practices may be applied in order to get replication times back within required ranges.

Note: There are no such Out of Sync thresholds for Catalyst Copy as failure to copy is reported by the backup software’s Administration console.

Activity monitor in the StoreOnce GUI
The Activity page has a graph to show Replication and Catalyst Data throughput (inbound and outbound) over the last five minutes. The
throughput is the sum of all replication jobs and is averaged over several minutes. It provides some basic information about replication
performance but should be used mainly to indicate the general performance of replication jobs at the current time. This single activity report
supports all device types VTL, NAS and Catalyst.

(The graph legend distinguishes Backup & Restore throughput from Replication & Catalyst Copy throughput.)

Replication throughput totals


Whilst replication jobs are running the Replication - Status - Source/Target Active Jobs pages (see below) show some detailed performance
information averaged over several minutes.
The following information is provided:
 Source / Target jobs running: The number of replication jobs that this appliance is running concurrently.
 Transmit / Receive Bandwidth: Amount of LAN/WAN Bandwidth in use
 Outbound / Inbound Throughput: Apparent data throughput, i.e. the throughput standardized to indicate the effective
transfer rate. For example, a 100 GB cartridge transfer may only be transferring 1% of unique data (1 GB); the apparent rate is an
indication of how fast the 100 GB cartridge is replicated in MB/sec as it proceeds through the complete cartridge replication.
This information can be used to assess how much bandwidth is being used, and also how much efficiency deduplication is providing to the
replication process, which is reported as bandwidth savings. It can also show whether replication is able to utilize multiple jobs to improve
performance or whether only small numbers of jobs are running due to backups completing at different times.
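The cartridge example above can be reproduced numerically; a simple sketch, with an assumed one-hour transfer time:

```python
def replication_stats(cartridge_gb, unique_fraction, transfer_minutes):
    """Apparent throughput (MB/s) of a cartridge replication versus the
    bandwidth actually consumed, plus the deduplication bandwidth saving."""
    secs = transfer_minutes * 60
    apparent_mb_s = cartridge_gb * 1024 / secs                 # effective rate over the whole cartridge
    actual_mb_s = cartridge_gb * unique_fraction * 1024 / secs  # unique data actually on the wire
    saving_pct = round((1 - unique_fraction) * 100, 2)          # reported as bandwidth savings
    return apparent_mb_s, actual_mb_s, saving_pct

# 100 GB cartridge, only 1% unique data, replicated in 60 minutes
apparent, actual, saving = replication_stats(100, 0.01, 60)
print(round(apparent, 1), round(actual, 2), saving)  # -> 28.4 0.28 99.0
```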

A best practice is to use blackout windows so that replication jobs all run concurrently at a time when backup jobs are not running.

Replication Activity can be further monitored by clicking on the individual replication mappings as shown below.

Catalyst throughput totals


The screen shot below shows the comparable parameters for Catalyst jobs. This screen monitors data jobs (low bandwidth backups) and copy
jobs.

The equivalent of monitoring replication “mappings” in Catalyst is to look at the individual Catalyst stores and their inbound and outbound copy
jobs, as shown below.

Using HP Replication Manager to monitor replication and Catalyst copy status


HP Replication Manager offers:
 Browser-based Centralized Management Console with a single pane of glass to monitor health, capacity and replication/copy status
across a large set of StoreOnce devices, and to provide reporting and trending analyses for VTLs, NAS shares and StoreOnce Catalyst stores.
 Ability to launch the StoreOnce Management GUI when a device fails or is in a state requiring attention.
 Logical grouping of devices. Virtual libraries, NAS shares and StoreOnce Catalyst Stores that are spread across different StoreOnce
devices but belong to one logical entity (group) can be assigned to one or more managed users.
 Detailed reports on Device Usage Statistics, Analysis and Trending.
 Management of up to 400 StoreOnce devices.

From HP Replication Manager 2.1 onwards Catalyst stores are also supported for monitoring. HP Replication Manager is a free download if either
a replication licence or StoreOnce Catalyst licence has been purchased.
The Replication Manager software can be downloaded from http://www.software.hp.com/kiosk using the login and password supplied with the
replication/Catalyst licence purchase.

Full details are available in the HP Replication Manager 2.1 user guide. One of the most useful features of HP Replication Manager is the
Topology Viewer, which can be seen below.

The Topology shows the Device Status, Name and Replication status between devices. A tool tip is available when the cursor is over a device. This
tool tip contains additional information about the device.

Figure 28: Topology viewer in Replication Manager

Use the Page Navigation options at the bottom of the page to move to other islands.

Click on Show Legend to display the legend used in the Topology.

Housekeeping monitoring and control
Terminology
Housekeeping: If data is deleted from the StoreOnce system (e.g. a virtual cartridge is overwritten or erased), any unused chunks will be marked
for removal, so space can be freed up (space reclamation). The process of removing chunks of data is not an inline operation because this would
significantly impact performance. This process, termed “housekeeping”, runs on the appliance as a background operation. It runs on a per
cartridge, Catalyst store and NAS file basis, and will run as soon as the cartridge is unloaded and returned to its storage slot or a NAS file has
completed writing and has been closed by the appliance, unless a housekeeping blackout window is set. Housekeeping also applies when data is
replicated from a source StoreOnce appliance to a target StoreOnce appliance – the replicated data on the target StoreOnce appliance triggers
housekeeping on the target StoreOnce appliance to take place. Blackout windows are also configurable on the target devices.

Blackout Window: This is a period of time (up to 2 separate periods in any 24 hours) that can be configured in the StoreOnce appliance during
which the I/O intensive process of Housekeeping WILL NOT run. The main use of a blackout window is to ensure that other activities such as
backup and replication can run uninterrupted and therefore give more predictable performance. Blackout windows must be set on BOTH the
source StoreOnce appliance and Target StoreOnce appliance.

This guide includes a fully worked example of configuring a complex StoreOnce environment, including setting housekeeping windows; see
Appendix B. An example from a StoreOnce source in the worked example is shown below:

In the above example we can see backups in green, housekeeping in yellow and replication from the source in blue. In this example we have
already set a replication blackout window which enables replication to run at 20:00.

Without a housekeeping blackout window set we can see how in this scenario where four separate servers are being backed up to the StoreOnce
Backup system, the housekeeping can interfere with the backup jobs. For example the housekeeping associated with DIR1 starts to affect the end
of backup DIR2 since the backup of DIR2 and the housekeeping of DIR1 are both competing for disk I/O.

By setting a housekeeping blackout window appropriately from 12:00 to 00:00 we can ensure the backups and replication run at maximum speed
as can be seen below. The housekeeping is scheduled to run when the device is idle.

However some tuning is required to determine how long to set the housekeeping windows and to do this we must use the StoreOnce Management
Interface and the reporting capabilities which we will now explain.

On the StoreOnce Management Interface go to the Housekeeping page; a series of graphs and a configuration capability is displayed. Let us look
at how to analyse the information the graphs portray.

There are four tabs on the Housekeeping page: Overall, Libraries, Shares and StoreOnce Catalyst. The Overall tab shows the total housekeeping
load on the appliance. The other tabs can be used to select the device type and monitor housekeeping load on individual named VTL, NAS shares
or Catalyst stores. Note how the Housekeeping blackout window configuration setting is shown below the Housekeeping status. The
housekeeping blackout window is set on an appliance basis not an individual device type basis.

Housekeeping jobs received versus housekeeping jobs processed

The Housekeeping load on the target replication devices is generally higher than on the source devices and must be monitored/observed on those
devices – you cannot monitor the target housekeeping load from the source device.

The key features within this section are:


 Housekeeping Statistics:
Status has three options: OK if housekeeping has been idle within the last 24 hours, Warning if housekeeping has been processing nonstop for
the last 24 hours, Caution if housekeeping has been processing nonstop for the last 7 days.
Last Idle is the date and time when the housekeeping processing was last idle.
Time Idle (Last 24 Hours) is a percentage of the idle time in the last 24 hours.
Time Idle (Last 7 Days) is a percentage of the idle time in the last 7 days.

 Load graph (top graph): will display what levels of load the StoreOnce appliance is under when housekeeping is being processed. However this
graph is intended for use when housekeeping is affecting the performance of the StoreOnce appliance (e.g. housekeeping has been running
nonstop for a couple of hours), therefore if housekeeping is idle most of the time no information will be displayed.

1. Housekeeping under control


2. Housekeeping out of control, not being reduced over time

In the above graph we show two examples: one where the housekeeping load increases and then subsides, which is normal, and another where
the housekeeping load continues to grow over time. The second condition is a strong indication that the housekeeping jobs are
not being dealt with efficiently; perhaps the housekeeping activity window is too short (housekeeping blackout window too large), or we may be
overloading the StoreOnce appliance with backup and replication jobs and the unit may be undersized.

Another indicator is the Time Idle status, which is a measure of the housekeeping empty queue time. If the % idle over 24 hours is 0, the
appliance is fully occupied, which is not healthy; however, this may be acceptable if the % idle over 7 days is not also 0. For example, if the appliance is 30%
idle over 7 days then we are probably operating within reasonably safe limits.
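The status rules and idle-time guidance above can be expressed as a simple classification; this is a sketch of the logic as described, not the appliance's actual implementation:

```python
def housekeeping_status(idle_pct_24h, idle_pct_7d):
    """Classify housekeeping health from the idle percentages, following
    the rules described above: 'Warning' after 24 hours of nonstop
    processing, 'Caution' after 7 days of nonstop processing."""
    if idle_pct_7d == 0:
        return "Caution"   # no idle time at all over the last 7 days
    if idle_pct_24h == 0:
        return "Warning"   # nonstop over the last 24 hours
    return "OK"

print(housekeeping_status(15, 30))  # OK
print(housekeeping_status(0, 30))   # Warning: a busy day, but healthy over the week
print(housekeeping_status(0, 0))    # Caution: appliance may be undersized
```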

Signs of housekeeping becoming too high are that backups may start to slow down or backup performance becomes unpredictable.
Corrective actions if idle time is low or the load continues to increase are:
a) Use a larger StoreOnce appliance or add additional shelves to increase I/O performance.
b) Restructure the backup regime to remove appends on tape or keep appends on separate cartridges, because the bigger the tapes grow
(through appends), the more housekeeping they generate when they are overwritten.
c) Increase the time allowed for housekeeping to run by reducing the housekeeping blackout windows.

If you do set up housekeeping blackout windows (up to two periods per day, 7 days per week), be aware that you cannot set a blackout end time
of 00:00; to run a window from, say, 18:00 to midnight you must set the end time to 23:59. In addition there is a Pause Housekeeping button, but
use this with caution because it pauses housekeeping indefinitely until you restart it!
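The end-time restriction can be caught with a simple validation; a sketch of the rule as stated above:

```python
def validate_blackout(start, end):
    """Validate an HH:MM blackout window. The StoreOnce interface does not
    accept 00:00 as an end time; use 23:59 to run to the end of the day."""
    for t in (start, end):
        h, m = map(int, t.split(":"))
        if not (0 <= h <= 23 and 0 <= m <= 59):
            raise ValueError("invalid time: " + t)
    if end == "00:00":
        raise ValueError("end time 00:00 not accepted; use 23:59 instead")
    return (start, end)

print(validate_blackout("18:00", "23:59"))   # accepted
# validate_blackout("18:00", "00:00")        # would raise ValueError
```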

Finally, remember that it is best practice to set housekeeping blackout windows on both the source and target devices. The diagram below shows
the target device from the worked example later in this document where replication from several source sites is arriving. Two replication blackout
windows are set on the target device, 10:00 to 14:00 and 20:00 to 02:00 (see below). Note how the replication process of data received at the
target (shown here in Blue) triggers housekeeping which must be managed. If housekeeping is not controlled at the target it can start to impact
replication performance from the source. In general the housekeeping load at the target in many to one replication scenarios is higher than that
of any individual source and so a larger housekeeping period must be provisioned.

Consider improving the situation by imposing two Housekeeping Windows on the Target Device as shown below

(Figure: 24-hour activity chart for the target device, plotting backup, replication and housekeeping activity for each share and local backup, with
the start of the replication window, the two housekeeping windows, spare time for physical tape offload, and the resulting reduced load marked.)

Tape Offload
Terminology
Direct Tape Offload
This is when a physical tape library device is connected directly to the rear of the StoreOnce Backup system.
This offload feature is not currently supported on HP StoreOnce Backup systems.

Backup application Tape Offload/Copy from StoreOnce Backup system


This is the preferred way of moving data from a StoreOnce Backup system to physical tape. The data transfer is managed entirely by the backup
software, multiple streams can be copied simultaneously and StoreOnce NAS, Catalyst store and VTL emulations can be copied to physical tape.
Both the StoreOnce Backup system and the physical tape library must be visible to the backup application media server doing the copy and some
additional licensing costs may be incurred by the presence of the physical tape library. Using this method, entire pieces of media (complete virtual
tapes or NAS shares) may be copied OR the user can select to take only certain sessions from the StoreOnce Backup system and copy and merge
them onto physical tape. These techniques are known as “media copy” or “object copy” respectively. All copies of the data are tracked by the
backup application software using this method and it is the tape offload method HP recommends.

When reading data in this manner from the StoreOnce Backup system, the data to be copied must be read from the StoreOnce appliance and
“reconstructed”, then copied to physical tape. Just as the backup proceeds faster with more parallel backup streams to the StoreOnce appliance,
the larger the number of parallel reads generated for the tape copy, the faster the copy to tape will take place, even if this means less than
optimal usage of physical tape media.

Scheduling tape offload to occur at less busy periods, such as weekends, is also highly recommended, so that the read process has maximum I/O
available to it.

Backup application Mirrored Backup from Data Source


This again uses the backup application software to write the same backup to two devices simultaneously and create two copies of the same data.
For example, if the monthly backups must be archived to tape, a special policy can be set up for these mirror copy backups. The advantage of this
method is that the backup to physical tape will be faster and you do not need to allocate specific time slots for copying from StoreOnce Backup
system to physical tape. All copies of the data are tracked by the backup application software.

Tape Offload/Copy from StoreOnce Backup system versus Mirrored Backup from Data Source
A summary of the supported methods is shown below.

For easiest integration: backup application copy to tape

The backup application controls the copy from the StoreOnce appliance to the network-attached tape drive, so that:
 It is easier to find the correct backup tape
 The scheduling of copy to tape can be automated within the backup process

Constraints:
 Streaming performance will be slower because data must be reconstructed

For optimum performance: separate physical tape mirrored backup

This is a parallel activity: the host backs up to the StoreOnce appliance and the host backs up to tape. It has the following benefits:
 The backup application still controls the copy location
 It has the highest performance because there are no read operations and reconstruction from the StoreOnce appliance

Constraints:
 It requires the scheduling of specific mirrored backup policies
 This method is generally only available at the “Source” side of the backup process. Offloading to tape at the target site can only use the
backup application copy to tape method.

When is Tape Offload Required?


 Compliance reasons or company strategy dictate that weekly, monthly or yearly copies of data be put on tape and archived or sent to a DR site,
or a customer wants the peace of mind of being able to physically “hold” the data on a removable piece of media.
 In a StoreOnce Replication model it makes perfect sense for the data at the StoreOnce DR site or central site to be periodically copied to
physical tape and the physical tape be stored at the StoreOnce site (avoiding offsite costs) yet still providing long term data retention.

 The same applies in a StoreOnce Catalyst Copy model. However, the StoreOnce Catalyst Copy feature allows the backup application to
incorporate tape offload, as well as Catalyst Store copy between StoreOnce appliances into a single backup job specification. The following
examples relate to StoreOnce Replication. Please see Appendix C for examples that are more relevant to the StoreOnce Catalyst model.

Catalyst device types

1. Catalyst Copy command.
2. Low bandwidth Catalyst Copy.
3. Rehydration and full bandwidth copy to tape under the control of the ISV software.

Figure 29: StoreOnce Catalyst Copy to tape drive model


This visibility, flexibility and integration of Catalyst stores into the backup software is one of the key advantages of HP StoreOnce Catalyst -
especially because the replicated copies are already known to the backup application.

VTL and NAS device types

1. Backup data written to StoreOnce Source.
2. StoreOnce low bandwidth replication.
3. All data stored safely at DR site. Data at the StoreOnce target (written by the StoreOnce source via replication) must be imported to
Backup Server B before it can be copied to tape.

Figure 30: Backup application tape offload at StoreOnce target site for VTL and NAS device types

Note: Target Offload can vary from one backup application to another in terms of import functionality. Please check with your vendor.

1. Copy StoreOnce device to physical tape; this uses a backup application copy job to copy data from the StoreOnce appliance to physical tape.
It is easy to automate and schedule, but copy performance is slower.
2. Mirrored backup; specific backup policy used to back up to StoreOnce and Physical Tape simultaneously (mirrored write) at certain times
(monthly). This is a faster copy to tape method.

Figure 31: Backup application tape offload at StoreOnce source site for VTL and NAS device types

As can be seen in the diagrams above – offload to tape at the source site is somewhat easier because the backup server has written the data to
the StoreOnce Backup system at the source site. In the StoreOnce Target site scenario (Figure 30), some of the data on the StoreOnce Backup
system may have been written by Backup Server B (local DR site backups, maybe) but the majority of the data will be on the StoreOnce Target via
low bandwidth replication from StoreOnce Source. In this case, the Backup Server B has to “learn” about the contents of the StoreOnce target
before it can copy them and the typical way this is done is via “importing” the replicated data at the StoreOnce target into the catalog at Backup
Server B, so that it knows what is on each replicated virtual tape or StoreOnce NAS share. Copy to physical tape can then take place. These
limitations do not exist if HP StoreOnce Catalyst device types are used.

Key performance factors in Tape Offload performance


Note in the diagram below how the read performance from a StoreOnce 4420 (red line) increases with the number of read streams – just like with
backup.

If the StoreOnce 4420 reads with a single stream (to physical tape) the copy rate is low. However, if the copy jobs are configured to use multiple
readers and multiple writers, then with four streams being read, for example, it is possible to achieve much higher copy performance, although
this requires more physical tape drives in the library or the use of multiplexing to tape.
What this means in practice is that you must schedule specific time periods for tape offloads when the StoreOnce Backup system is not busy,
and use as many parallel copy streams (tape devices) as practical to improve the copy performance.

Figure 32: Read performance graph. (1) Single-stream read performance; (2) much higher read throughput (for tape offload) with four streams.
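The scheduling advice above reduces to simple arithmetic. The sketch below estimates offload time as a function of parallel read streams; the 2 TB data size, the 50 MB/s per-stream rate and the linear scaling are illustrative assumptions, not measured StoreOnce figures:

```python
def offload_hours(data_gb, streams, per_stream_mb_s):
    """Estimate hours to copy data_gb to physical tape using a number
    of parallel read streams, assuming aggregate throughput scales
    linearly with stream count (an optimistic simplification)."""
    aggregate_mb_s = streams * per_stream_mb_s
    return (data_gb * 1024) / aggregate_mb_s / 3600

# Illustrative 2 TB offload at an assumed 50 MB/s per stream
single_stream = offload_hours(2048, 1, 50)
four_streams = offload_hours(2048, 4, 50)
print(round(single_stream, 1), round(four_streams, 1))
```

Even with optimistic linear scaling, a single-stream multi-terabyte offload consumes most of a working day, which is why a dedicated offload window and multiple parallel copy streams are recommended.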

Summary of Best Practices
1. The first recommendation is to genuinely assess the need for tape offload: with StoreOnce replication or a Catalyst copy in place, is another copy of the data really necessary? How frequently does the offload to tape really need to happen? Monthly offloads to tape are probably acceptable for most scenarios that must have offload to tape.
2. Catalyst “Integrated tape copies” are the easiest way to implement a tape offload strategy from within backup policy definitions, without the need for imports at the DR site, but this functionality is only available with HP Data Protector, Symantec NetBackup and Symantec Backup Exec backup software.
3. For “Media Copies” it is always best to try and match the StoreOnce VTL cartridge size with the physical media cartridge size to avoid wastage.
For example: if using physical LTO4 drives (800 GB tapes) then when configuring StoreOnce Virtual Tape Libraries the StoreOnce cartridge
size should also be configured to 800 GB.
Schedule time for Tape Offloads: The StoreOnce Backup system is running many different functions – Backup and Deduplication, Replication and Housekeeping – and Tape Offload is another task (one that involves reading data from the appliance). To ensure the offload to tape performs as well as it can, no other processes should be running. In reality this means actively planning downtime for tape offloads to occur.

Offload a little at a time or all at once? Depending on the usage model of the StoreOnce Backup system and the amount of data to be
offloaded you may be able to support many hours dedicated to tape offload once per month. Or, if the StoreOnce Backup system is very busy
most of the time, you may have to schedule smaller offloads to tape on a more frequent weekly basis.

The example below shows where 2 hours per day have been dedicated to tape offload (purple) at the target site
(taken from our worked example from Appendix B).

[Schedule chart: GMT timeline 10:00 through 08:00 at the target site, showing replication of A & D Shares 1–4 (filesystem data DIR 1–3, SQL, Filesystem 2 DIR 1–2, Special App data), B & C shares (filesystem DIR 1–2, SQL, Special App data) and local Exchange and Special App backups, with a reduced-load period reserved for tape offload.]

4. Importing data? When copying to tape at a StoreOnce target site, it is only possible to copy to physical tape after the backup server is aware of the replicated data from the StoreOnce source. It is important to walk through this import process and schedule it to occur in advance of the copy process; otherwise the copy process will be unaware of the data that must be copied. In the case of HP Data Protector, specific scripts have been developed that can poll the StoreOnce target to interrogate newly replicated cartridges and NAS files. HP Data Protector can then automatically schedule import jobs in the background to import the cartridges/shares, so that when the copy job runs all is well. Other backup applications’ methods may vary in this area. For example, for most backup applications the StoreOnce target can be read-only to enable the copy to tape, but Symantec Backup Exec requires read/write access, which involves breaking the replication mappings for this to be possible. Please check with your backup application provider before relying on the tape offload process, or perform a Disaster Recovery offload to tape to test the end-to-end solution.

Appendix A
Key reference information

StoreOnce Single Node Products, Software 3.4.x and later

| Parameter | 2610 iSCSI G3 | 2620 iSCSI G3 | 4210 iSCSI/FC G3 | 4220 G3 | 4420 G3 | 4430 G3 |
|---|---|---|---|---|---|---|
| Devices | | | | | | |
| Usable Disk Capacity (TB) (with full expansion) | 1 | 2.5 | 9 | 18 | 38 | 76 |
| Max Number Devices (VTL/NAS) | 4 | 8 | 16 | 24 | 50 | 50 |
| Replication | | | | | | |
| Max VTL Library Rep Fan Out | 1 | 1 | 1 | 1 | 1 | 1 |
| Max VTL Library Rep Fan In | 1 | 1 | 8 | 8 | 16 | 16 |
| Max Appliance Rep Fan Out | 2 | 2 | 4 | 4 | 8 | 8 |
| Max Appliance Rep Fan In | 4 | 8 | 16 | 24 | 50 | 50 |
| Max Appliance Concurrent Rep Jobs Source | 12 | 12 | 24 | 24 | 48 | 48 |
| Max Appliance Concurrent Rep Jobs Target | 24 | 48 | 48 | 48 | 96 | 96 |
| Physical Tape Copy Support | | | | | | |
| Supports direct attach of physical tape device | No | No | No | No | No | No |
| Max Concurrent Tape Attach Jobs Appliance | N/A | N/A | N/A | N/A | N/A | N/A |
| VTL | | | | | | |
| Max VTL Drives Per Library/Appliance | 16 | 32 | 64 | 96 | 200 | 200 |
| Max Cartridge Size (TB) | 3.2 | 3.2 | 3.2 | 3.2 | 3.2 | 3.2 |
| Max Slots Per Library (D2DBS, EML-E, ESL-E Lib Type) | 96 | 96 | 1024 | 1024 | 4096 | 4096 |
| Max Slots Per Library (MSL2024, MSL4048, MSL8096 Lib Type) | 24/48/96 | 24/48/96 | 24/48/96 | 24/48/96 | 24/48/96 | 24/48/96 |
| Max active streams per store | 32 | 48 | 64 | 96 | 128 | 128 |
| Recommended Max Concurrent Backup Streams per appliance | 16 | 24 | 48 | 48 | 64 | 64 |
| Recommended Max Concurrent Backup Streams per Library | 4 | 4 | 6 | 6 | 12 | 12 |
| NAS | | | | | | |
| Max files per share | 25000 | 25000 | 25000 | 25000 | 25000 | 25000 |
| Max NAS Open Files Per Share > DDThreshold* | 32 | 48 | 64 | 64 | 128 | 128 |
| Max NAS Open Files Per Appliance > DDThreshold* | 32 | 48 | 64 | 64 | 128 | 128 |
| Max NAS Open Files Per Appliance concurrent | 96 | 112 | 128 | 128 | 640 | 640 |
| Recommended Max Concurrent Backup Streams per appliance | 16 | 24 | 48 | 48 | 64 | 64 |
| Recommended Max Concurrent Backup Streams per Share | 4 | 4 | 6 | 6 | 12 | 12 |
| StoreOnce Catalyst | | | | | | |
| Catalyst Command Sessions | 16 | 16 | 32 | 32 | 64 | 64 |
| Maximum Concurrent outbound copy jobs per appliance | 12 | 48 | 24 | 24 | 48 | 48 |
| Maximum Concurrent inbound data and copy jobs per appliance | 12 | 48 | 96 | 96 | 192 | 192 |
| Performance | | | | | | |
| Max Aggregate Write Throughput Catalyst Low Bandwidth (TB/hr) | 1 | 1 | 2.2 | 2.2 | 10.8 | 12.5 |
| Max Aggregate Write Throughput Non-Catalyst (TB/hr) | 0.67 | 0.67 | 2.9 | 3.3 | 4.8 | 4.8 |
| Min streams required to achieve max aggregate throughput** | 6 | 6 | 12 | 16 | 16 | 20 |

* DDThreshold is the size a file must reach before it is deduplicated; it is set to 24 MB.
** Assumes no backup client performance limitations.

Appendix B – Fully Worked Examples
In this section we will work through a complete multi-site, multi-region StoreOnce design, configuration and deployment tuning. The following
steps will be undertaken:
 Hardware and site configuration definition
 Backup requirements specification
 Using the HP Storage Sizing Tool, size the hardware requirements, link speeds and likely costs
 Work out each StoreOnce device configuration - NAS, VTL, number of shares, and so on - using best practices that have been articulated earlier
in this document.
 Configure StoreOnce source devices and replication target configuration
 Map out for sources and target the interaction of backup, housekeeping and replication.
 Fine tune the solution using replication blackout windows and housekeeping blackout windows
The worked example below may seem rather complicated at times but it is specifically designed to tease out many different facets of the design
considerations required to produce a comprehensive and high performance solution.

A Catalyst worked example is also shown later in this Appendix. It is expected that Catalyst deployments will be “all Catalyst”, with little or no mixing with VTL and NAS emulations. Catalyst deployments are also limited to HP Data Protector, Symantec NetBackup and Symantec Backup Exec software, and the preferred usage model is low bandwidth Catalyst implementations.

Hardware and site configuration


Please study the example below. It consists of four remote sites (A, B, C, D) with varying backup requirements; these four remote sites replicate to Data Center E over low bandwidth links. At Data Center E a larger StoreOnce appliance is used both as the replication target (for sites A, B, C, D) and to back up local servers at site E.

A fully worked example now follows.

Figure 33: Environment for Sizing Tool example

Backup requirements specification

Remote Sites A/D


NAS emulations required, growth 20% size for 1 year
Server 1 – Filesystem 1, 100 GB, spread across 3 mount points
Server 2 – SQL data, 100GB
Server 3 – Filesystem 2, 100GB, spread across 2 mount points
Server 4 – Special App Data , 100GB
Rotation Scheme – Weekly Fulls, 10% Incrementals during week, Keep 4 weeks Fulls and 1 monthly backup
12 hour backup window
12 Hour replication window

Remote sites B/C


iSCSI VTL emulations required, growth 20% size for 1 year
Server 1– Filesystem, 200 GB, spread across 2 mount points C,D
Server 2 – SQL data, 200GB
Server 3 – Special App data, 200GB
Rotation Scheme – Weekly Fulls, 10% Incrementals during week, Keep 4 weeks Fulls and 1 monthly backup
12 hour backup window
12 hour replication window

Data Center E
Fibre Channel VTL emulations required for local backups, growth 20% size for 1 year
Server 1 - 500GB Exchange => Lib 1 - 4 backup streams
Server 2 - 500GB Special App => Lib 2 – 1 backup stream
Rotation Scheme – 500GB Daily Full, retained for 1 month.
12 hour backup window
Replication target for sites A, B, C, D (this means we have to size for replication capacity AND local backups on Site E)
Monthly archive required at Site E from StoreOnce to physical tape

One of the key parameters in sizing a solution such as this is trying to estimate the daily block level change rate for data in each backup job. In this
example we will use the default value of 2% in the HP StoreOnce sizing tool. http://h30144.www3.hp.com/SWDSizerWeb/default.htm

Working through this example is strongly recommended because valuable insight can be gained by performing the practical sizing exercise.
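As a first-order sanity check on what a 2% change rate implies, after initial seeding the daily replicated volume is roughly the protected capacity multiplied by the block-level change rate. This is a deliberately simplified sketch (it ignores full-backup days, stream concurrency and protocol overhead, all of which the sizing tool models); the 720 GB figure is illustrative:

```python
def daily_replicated_gb(protected_gb, daily_change_rate):
    """First-order estimate: after seeding, only changed blocks are
    replicated, so one day's traffic is roughly the protected
    capacity multiplied by the block-level change rate."""
    return protected_gb * daily_change_rate

# 720 GB protected at the default 2% daily change rate
print(daily_replicated_gb(720, 0.02))
```

The true worst case sized for the WAN is higher, because full-backup days replicate more than the incremental-day estimate above.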

Using the HP Storage sizing tool
Configure replication environment
1. Click on Backup Calculators and then Design StoreOnce Replication Over WAN to get started.

2. Configure the replication environment for 4 source appliances to 1 target appliance (five appliances in total).
This replication model is known as Many-to-One replication.
The Replication Window allowed is 12 hours.
The size of the target device is initially based on capacity, so select “Any” as the Target Device type.
Because A and D, and B and C, are identical sites, we can define three sites: enter the data once for Site A and Site B, then create two identical sites for D and C within the HP Storage Sizing tool.

3. Click Launch Sites.

For each Source and then for the Target, enter the backup sizes and rotation schemes. Source A and Target E are shown as examples in the Sizing
Tool screenshots below. Inputs for Source A are shown below; the inputs for Source D, which are identical, can also be added by incrementing the
number of similar sites in the Total Source Sites drop-down list shown in the screenshot on the following page. For simplicity we will size for a
single year’s growth of 20%.

Remote Sites A/D


NAS emulations required, growth 20% size for 1 year, no data compression, 2% change rate
Server 1 – Filesystem 1, 100 GB, spread across 3 mount points
Server 2 – SQL data, 100GB – single stream
Server 3 – Filesystem 2, 100GB, spread across 2 mount points
Server 4 – Special App Data, 100GB – single stream
Rotation Scheme for all jobs – Weekly Fulls, 10% Incrementals during week, Keep 4 weeks Fulls and 1 monthly backup
12 hour backup window
Emulations are all set to NAS (CIFS), growth at 20% data compression =1, block change rate 2%

IT IS VERY IMPORTANT that, when you are creating the backup specifications in the HP Storage Sizing Tool, you pay particular attention to the field Number of parallel Backup Streams. This field determines the backup throughput and the number of concurrent replications that are possible, BUT it may require a conscious change to the customer’s backup policies to make it happen.
The following screenshot illustrates how you enter the backup data sizes for Sites A and D.

Screenshot annotations: The number of parallel backup streams will determine overall throughput. The backup specification says Filesystem1 has 3 mount points, which allows us to run 3 parallel backup streams. Sites A and D are identical, so we can specify 2 identical sites.

More about parallel backup streams


A single backup stream to a StoreOnce device may run, for example, at 30 MB/sec, but running three streams simultaneously to a StoreOnce may run at, say, 80 MB/sec aggregate. The Sizing Tool has all these per-stream throughputs modeled with actual numbers from device testing. So, for best throughput, HP recommends four streams or more. What this means in practice is re-specifying backup jobs to use separate mount points. For example, instead of backing up “Filesystem1” spread across drives C, D, E on site A as one job, create three jobs (C:/Dir1, D:/Dir2, E:/Dir3) so that we can run three parallel backup streams.

In the case of sites A and D, when we enter all the backup jobs, we will have seven backup jobs running in parallel, which will give us good throughput and backup performance.
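The stream-scaling behaviour described above can be sketched as backup-window arithmetic. The 30 MB/sec single-stream and 80 MB/sec three-stream figures are the illustrative ones quoted in the text, not guaranteed device ratings:

```python
def backup_hours(data_gb, aggregate_mb_s):
    """Hours needed to back up data_gb at a given aggregate rate."""
    return (data_gb * 1024) / aggregate_mb_s / 3600

# 300 GB of filesystem data (Filesystem1 across 3 mount points)
one_stream = backup_hours(300, 30)     # single stream at ~30 MB/s
three_streams = backup_hours(300, 80)  # three streams at ~80 MB/s total
print(round(one_stream, 1), round(three_streams, 1))
```

Splitting one job into three parallel streams here cuts the backup time roughly from 2.8 hours to 1.1 hours under these assumed rates.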

Dedupe-Input tab
Click on the Dedupe Tab to set up the data types/change rate and retention periods.
The following screen shows the input for Job Filesystem1 from a dedupe perspective: 2% daily change rate, incrementals 10% of the Full. It
shows deduplication and retention periods for Sites A and D.

Note the Data Type is “unstructured” for Filesystem data.

Screenshot annotation: This parameter is the change rate of data per day at a block level and will determine, along with the retention period, the dedupe ratio achieved and the amount of data to be replicated. The default is 2%; for dynamic change environments increase this number.

To create our specific retention policy, click Create Retention and input the retention policy creating a series of steps with names: daily, weekly,
monthly etc.

The Retention Planner can schedule a wide range of retention schemes; below we see the daily incremental, 4 x weekly full and single monthly
required in the specification for sites A and D.

After the retention times have been configured for FileSystem1, click OK.

Dedupe Output tab

Let’s now look at the Output tab of the Dedupe section of the job specification; it displays the predicted dedupe ratio you will achieve over the
period of time for which you are sizing. Use the slider to see the dedupe ratio at the right of the table.

Screenshot annotation: drag the slider to see growth YoY.

If you click on YoY Growth you get the full picture of deduplication performance over 10 years with the predicted growth rate.

Dedupe analysis

Click Add Job when finished for FileSystem1 and repeat for other data type entries as per the specification above.

All job definitions loaded


The final result should be as shown in the next screenshot – 4 jobs, all of 100GB, with varying numbers of streams. In total we will be backing up 7 streams simultaneously.

The following screenshot shows all Backup data entered for Sites A and D.

All job definitions loaded

We will now repeat the process for Sites B and C – by selecting HP StoreOnce Source #2 in the left hand navigation pane.

Remote sites B/C


iSCSI VTL emulations required, growth 20% size for 1 year, 2% data change rate
Server 1– Filesystem, 200 GB, spread across 2 mount points C,D
Server 2 – SQL data, 200GB – 1 stream
Server 3 – Special App data, 200GB – 1 stream
Rotation Scheme – Weekly Fulls, 10% Incrementals during week, Keep 4 weeks Fulls and 1 monthly backup
12 hour backup window

The following screenshot shows how to enter the backup data sizes, dedupe parameters and retention periods for Sites B and C each of which
provides 4 backup streams.

Note Target Emulation is now VTL, Total Source Sites =2 (B&C). Retention is created the same as the previous example.

Data Center E
Fibre Channel VTL emulations required for local backups (note selection of FC device as mandatory), growth 20% size for 1 year, 2% change rate.
Server 1 - 500GB Exchange => Lib 1 - 4 backup streams
Server 2 - 500GB Special App => Lib 2 – 1 backup stream
Rotation Scheme – 500GB Daily Full, retained for 1 month.
12 hour backup window
Input job entries for Site E on the Backup tab. It requires full backups every day for 29 days and also requires FC attach, so check FC is mandatory
in the System interface area. The rotation scheme for Site E is Fulls & Fulls.

The following screenshot shows how to enter the backup schedule (fulls every day) for Sites E along with the requirement for FC connectivity.

Fulls every day

On the Dedupe – Input tab, the retention period is set to Fulls for 29 days.

The following screenshot shows how to enter the Data retention periods for Target Site E.

If we were to look at the Dedupe - Output tab we would see that, because all these backups are fulls over a month, the deduplication ratio is much higher.

You have now finished defining the jobs for all five appliances (four sources and one target).

Press the Solve/Submit button and the Sizing Tool will do the rest.

The following screenshot shows the Target Output tab and location of the Solve/Submit button.

Fulls every day give a higher dedupe rate

Sizing Tool output
The Sizing Tool creates two outputs.

 It creates an Excel spreadsheet with all the parts required for the solution, including service and support, and any licenses required, together with list pricing (list pricing not shown for commercial reasons).

 It creates an HTML solution overview (see example in next section) which indicates the types of devices to be used at source and target, the amount of data to be replicated to and stored on the target, and the link speeds at source and target for the specified replication window.

Understanding the HTML output from the Sizing Tool


The sizing for replication assumes the following:
a) The number of backup streams translates into the number of concurrent replication streams, be they streams to tape drives, streams to NAS shares or streams to Catalyst stores.
b) The unit of replication in a VTL is a cartridge, in a NAS share a file, and in a Catalyst store an item.
c) The replication sizing algorithm averages all the jobs from all the sources to make the overall replication assessment.
d) The sizing algorithm copes with source replication jobs exceeding the target concurrency, and with source replication jobs below the target concurrency.
e) The default is 100% of WAN bandwidth available for replication (this can be changed).
f) Link efficiency is driven by link latency: the higher the latency, the lower the efficiency, and the longer the distance, the higher the latency.
Let’s look at the Sizing Tool output stage by stage. In the output below it has sized HP StoreOnce 2620 iSCSI appliances for the sources and an HP StoreOnce 4210 FC single unit for the target; this is because the sizing in this example is mainly determined by capacity requirements, as the backup window is very generous. In more complex solutions there are three passes of the sizer: a first pass for capacity, a second pass for throughput, then a third pass for replication requirements. Whichever pass requires the most resources determines the solution/sizing that is presented to the user.

Screenshot annotations: Source and target link sizes required. Amount of data in GB transmitted source to target, worst case (fulls). Models sized (including additional shelves for capacity or performance). Replication concurrency (derived from backup streams) versus the maximum source or target concurrency of the device itself; this can be useful in identifying bottlenecks. In this case the target concurrency is not exceeded (using 22 of the 48 streams available).

Observations
 We are not saturating the target device; we only have 22 concurrent replication jobs running and the target can accept 48 concurrently.
This allows more remote sites to be brought on line without exceeding the target concurrency.

 Link efficiency is high due to low latency (0–50 ms).

 You might wonder why, for Sources 3 and 4, the link size is smaller when the replicated data (TxGB) is higher than for Sources 1 and 2. The link size does not depend only on the data being replicated; two other key factors determine the link bandwidth of each source site:
1) the concurrency level
2) the data change level, i.e. the ratio Total TxGB/Total Data GB.

For each source site 3, 4 (B, C):
Effective link bandwidth = number of concurrent replication streams (Used Ccy) * average replication throughput per stream in MB/sec * (Total TxGB / Total Data GB) * 8.0
Effective link bandwidth = 4 * 3.03 * (19.73 / 720) * 8.0 = 2.66 Mbit/sec
Final link size = 2.95 Mbit/sec (after 90% link efficiency)
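The link-sizing arithmetic above can be reproduced directly. The figures (4 streams, 3.03 MB/sec per stream, 19.73 TxGB of 720 GB total data, 90% link efficiency) are the report values quoted in the text:

```python
def effective_link_mbit(streams, mb_per_sec_per_stream, tx_gb, data_gb):
    """Effective replication bandwidth in Mbit/sec: concurrent
    replication streams * average per-stream throughput (MB/sec)
    * change ratio (Total TxGB / Total Data GB) * 8 bits per byte."""
    return streams * mb_per_sec_per_stream * (tx_gb / data_gb) * 8.0

def sized_link_mbit(effective_mbit, link_efficiency=0.9):
    """Final link size after allowing for link efficiency."""
    return effective_mbit / link_efficiency

eff = effective_link_mbit(4, 3.03, 19.73, 720)
print(round(eff, 2), round(sized_link_mbit(eff), 2))  # 2.66 and 2.95
```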

For each source site 1, 2 (A, D):

We sized a bigger link since the concurrency level was higher (7 streams instead of 4). Please note that this concurrency is a combination of your job input and what the specific systems can support. Throughput in this report is apparent throughput; this means it is the effective overall replication rate as we progress through replicating a cartridge (VTL), a file (NAS) or an item (Catalyst).

 In this case the target WAN link is almost the sum of the separate source WAN links but this is not always the case – especially if the
target is the bottleneck because its concurrency level is exceeded. It is pointless paying for WAN bandwidth if you can’t use it.

 We have to size worst case for the WAN link and the worst case is when a full backup is run because more data is replicated and it still
has to meet the 12-hour backup window in our example.

Screenshot annotation: these are the jobs that were input previously.

The HTML output also provides an estimate of seeding times (see below). The WAN-based estimate uses the calculated bandwidth, and seeding can take a long time over a small link; a temporary increase in WAN link size from your telco may help here.

The Seeding Hours LAN estimate is based on using “co-location” and a direct 1GbE network between the StoreOnce systems.
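A back-of-envelope seeding calculation shows why co-location over 1GbE is attractive. The 2.95 Mbit/sec figure is the sized WAN link from this example, the 720 GB of data to seed is illustrative, and 1 GB is taken as 8,000 Mbit (decimal units):

```python
def seeding_hours(data_gb, link_mbit_per_sec):
    """Hours to seed data_gb over a link, assuming the link is fully
    available for replication (1 GB taken as 8,000 Mbit, decimal)."""
    return (data_gb * 8 * 1000) / link_mbit_per_sec / 3600

wan_hours = seeding_hours(720, 2.95)   # sized WAN link from the example
lan_hours = seeding_hours(720, 1000)   # co-located direct 1GbE link
print(round(wan_hours), round(lan_hours, 1))
```

Under these assumptions the WAN seed takes weeks while the co-located 1GbE seed takes under two hours, which is exactly why the tool reports both figures.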

If the target is the bottleneck because its replication concurrency is being exceeded, you can experiment with the Sizing Tool by forcing the target to be sized as a larger device so that the target concurrency is not exceeded; this makes for more effective replication and more headroom for growth.

There are two ways to do this, both of which are illustrated.


a) Use the Next configuration option on the Target backup calculator, so it sizes the next model up.
b) Change the Target Device in the replication designer

Selecting the Next configuration option

Or go to the Replication Designer and choose a specific target model there.

Selecting the Target Device option

Configure StoreOnce source devices and replication target configuration
This next stage looks at the detailed configuration that will be required when we deploy this configuration.

Sites A and D
The customer has already told us he wants NAS emulation at sites A and D.
Server 1 – Filesystem data 1, 100GB, spread across 3 mount points
Server 2 – SQL data, 100GB
Server 3 – Filesystem 2, 100GB, spread across 2 mount points
Server 4 -- Special App Data,100GB

On sites A and D the StoreOnce units will be configured with four NAS shares (one for each server); the filesystem servers will be configured with subdirectories for each of the mount points. These subdirectories can be created on the StoreOnce NAS CIFS share by using Windows Explorer (e.g. Dir1, Dir2, Dir3) and the backup jobs can be configured separately as shown below, but all run in parallel.
For example: Server 1 mount points C, D, E
C: → StoreOnce NAS Share/Dir1
D: → StoreOnce NAS Share/Dir2
E: → StoreOnce NAS Share/Dir3

This has two key advantages – all filesystem type data goes into a single NAS share which will then yield high deduplication ratios and, because
we have created the separate directories, we ensure the three backup streams can run in parallel hence increasing overall backup throughput.

This creation of multiple NAS shares on Sites A and D (4 in total) with different data types allows best potential for good deduplication whilst
keeping the stores small enough to provide good performance. This would also mean that a total of 8 NAS replication targets need to be created
at Site E, since NAS shares require a 1:1 source to target mapping.

Sites B and C
For sites B and C the customer has requested VTL emulations.
Server 1 – 200 GB Filesystem, spread across 2 mount points C,D
Server 2 – 200GB SQL data
Server 3 – Special App data, 200GB

In this case we would configure 3 x VTL Libraries (because we have 3 different data types) with the following drive configurations:
Server 1 VTL – 2 drives (to support 2 backup streams) say 12 slots
Server 2 VTL – 1 drive (since only one backup stream) say 12 slots
Server 3 VTL – 1 drive (since only one backup stream) say 12 slots

The monthly backup cycle means a total of 11 cartridges will be used in each cycle, which has guided us to select 12 cartridges per configured
library. The fixed emulations in HP StoreOnce like “MSL2024” would mean we would have to use a full set of 24 slots but, if the customer chooses
the “D2DBS” emulation, the number of cartridges in the VTL libraries is configurable. Note: the backup software must recognize the “D2DBS” as a
supported device and must be configured to overwrite media once expired to prevent expired cartridges “hogging” valuable storage space.

As a general rule of thumb, configure the cartridge size to be the size of the full backup + 25% and, if tape offload via the backup application is to be used, less than the cartridge size of the physical tape drive cartridges. So let us create cartridges of 250 GB for these devices (200 GB * 1.25).
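The rule of thumb above can be written as a small helper. This is a sketch of the stated rule; the physical-cartridge cap applies only when tape offload via the backup application is planned:

```python
def vtl_cartridge_gb(full_backup_gb, physical_cartridge_gb=None):
    """Size a StoreOnce VTL cartridge as the full backup size plus 25%
    headroom, capped below the physical tape cartridge size when a
    tape offload via the backup application is planned."""
    size = full_backup_gb * 1.25
    if physical_cartridge_gb is not None:
        size = min(size, physical_cartridge_gb)
    return size

print(vtl_cartridge_gb(200))        # sites B and C: 250 GB cartridges
print(vtl_cartridge_gb(700, 800))   # capped at LTO4 800 GB media size
```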

Site E
We will require 8 NAS replication shares for Sites A and D.

For sites B and C with VTL emulations we have a choice, because with VTL replication we can use “slot mapping” functionality to map multiple source devices into a single target device, allowing easier management and better deduplication ratios on the target device. So, we can either create 6 VTL replication libraries in the StoreOnce at Site E, or merge the slots from the 3 VTLs on each of sites B and C into 3 x 24-slot VTLs on Site E. This allows the file system data, SQL data and Special App data to be replicated to VTLs on Site E holding the same data type, again benefiting from maximum dedupe capability.

We also need to provision two VTL devices for daily full Exchange backups that are retained for 1 month – 4 stream backups for Exchange plus a
single stream backup for the Special Application data on Site E for local backups.
 VTL 1 = 4 drives, at least 31 slots (to hold 1 month retention of daily fulls)
 VTL 2 = 1 drive, at least 31 slots (to hold 1 month retention of daily fulls)

The final total source and target configuration is shown below.

Local
backups
on Site E

Figure 34: Example NAS and VTL configurations

Map out the interaction of backup, housekeeping and replication for sources and target
With HP StoreOnce Backup systems it is important to understand that the device cannot do everything at once; it is best to think in terms of “windows” of activity. Ideally, at any one time, the device should be either receiving backups, replicating, housekeeping, or offloading to physical tape. However, this is only possible with some careful tuning and analysis.

Housekeeping (or space reclamation, as it is sometimes known) is the process whereby the StoreOnce updates its records of how often the
various hash codes that have been computed are being used. When hash codes are no longer being used they are deleted and the space they were
using is reclaimed. As we get into a regular “overwriting pattern” of backups, every time a backup finishes, housekeeping is triggered to happen
and the deduplication stores are scanned to see what space can be reclaimed. This is an I/O intensive operation. Some care is needed to avoid
housekeeping causing backups or replication to slow down as can be seen below.

HP StoreOnce Backup systems have the ability to set blackout windows for replication and housekeeping, when no replication or housekeeping
will take place – this is deliberate in order to ensure replication is configured to run ideally when no backups or housekeeping are running. We can
configure two blackout windows in any 24-hour period.

In the worked example let us assume the following time zones:


Sites A and D = GMT + 6 (based in APJ)
Sites B and C = GMT - 6 (CST in US)
Site E = GMT (based in UK)

All time references below are standardized on GMT. Replication blackout windows are set to ensure replication only happens within the prescribed hours. In our sizing example we input the backup and replication windows as 12 hours, but we would have to edit this to 8 hours to conform to the plan below. This shows how sizing can sometimes be an iterative process: decreasing the backup window could result in larger models being sized to give improved throughput, along with larger WAN links.

| Site | Backup | Replication | Housekeeping |
|------|--------|-------------|--------------|
| A | 12:00 – 20:00 | 20:00 – 04:00 | 04:00 – 08:00 |
| B | 24:00 – 08:00 | 08:00 – 16:00 | 16:00 – 20:00 |
| C | 24:00 – 08:00 | 08:00 – 16:00 | 16:00 – 20:00 |
| D | 12:00 – 20:00 | 20:00 – 04:00 | 04:00 – 08:00 |
| E | 18:00 – 02:00 | 08:00 – 04:00 | 02:00 – 06:00 and 14:00 – 20:00 |
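The overlap analysis that follows can be sketched programmatically. This is a minimal check, using the GMT windows from the table above, that flags the hours in which inbound source replication collides with Site E’s local backup window:

```python
def hours(start, end):
    """Expand a GMT window (start hour, end hour) into the set of
    hours it covers, handling windows that wrap past midnight."""
    if start <= end:
        return set(range(start, end))
    return set(range(start, 24)) | set(range(0, end))

# Source replication windows and Site E's local backup window,
# taken from the GMT schedule table above.
replication = {"A": hours(20, 4), "B": hours(8, 16),
               "C": hours(8, 16), "D": hours(20, 4)}
target_backup = hours(18, 2)  # Site E backs up locally 18:00-02:00

# Hours in which inbound replication collides with local backup at E
overlap = sorted({h for w in replication.values()
                  for h in (w & target_backup)})
print(overlap)
```

The collision hours reported here (20:00 through 02:00) match the heavy-load period identified for the target device later in this section.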

As you can see from the above worst case example, with such a worldwide coverage, the target device E cannot easily separate out its local
backup (18:00 – 02:00) so that it does not happen at the same time as the replication jobs from sites A, B, C, D and the housekeeping required on
Site E. The Housekeeping load is generally always higher at the target site.

What this means is that the replication window on the target device must be open almost 24 hours a day or at least 08:00 to 04:00. The target
device essentially has a replication blackout window set only between the hours of 04:00 and 08:00 GMT.

In this situation the user has little alternative but to OVERSIZE the target device E to the next model up with higher I/O and throughput capabilities
in order to handle this unavoidable overlap of local backup, replication from geographically diverse regions and local housekeeping time. How to
upsize units has been explained in the Sizing Tool explanation above.

Tune the solution using replication windows and housekeeping windows


The objective of this section is to allow the solution architect to design for predictable behavior and performance. Predictable configurations may
not always give the fastest time to complete but in the long run they will prevent unexpected performance degradation due to unexpected
overlaps of activities.
In order to show the considerations that need to be taken into account the diagrams below show the stages of “tuning” required at each source
site and target over time.
1. No changes to existing backup policies – all backups start at the same pre-defined time.
2. Set replication windows.
3. Tune the target device to handle replication from sources, housekeeping associated with replication, and local backups, again making use of replication windows and housekeeping windows.

Worked example – backup, replication and housekeeping overlaps
The example below does not correspond exactly to the sized example above. For instance, the backups are completing within the 12 hours allocated, and the replication is also completing within the 12 hours allocated. This example serves merely to show how consideration must be given to the way blackout windows can affect overall performance, and how analyzing the performance on the target is key to a successful configuration.

Chart legend: Backup | Replication | Housekeeping | Start of Replication Window | Spare Time for Physical Tape Offload | Housekeeping Window

During a rotation scheme cartridges/shares are being overwritten daily, so housekeeping will happen daily

Sites A & D

Initial configuration with NO replication blackout window set: lots of overlapping activities; performance is not predictable.

[Schedule chart: GMT timeline 12:00 through 06:00 showing backup, replication and housekeeping for Share 1 (Filesystem data, DIR 1–3), Share 2 (SQL), Share 3 (Filesystem 2, DIR 1–2) and Share 4 (Special App data), with the activities overlapping.]
Initial configuration with replication blackout window set; all 7 jobs can replicate concurrently (concurrency 12). Applying the replication window has the effect of reducing backup times by avoiding contention. There is still some contention, however, with housekeeping during the replication window.

[Timeline chart, GMT 12:00–06:00: the same shares as above, with replication now confined to the replication window]

Adding a replication window allows us to force replication activities to happen only outside of the backup window. Housekeeping still happens
when backup is complete unless we set a housekeeping blackout window.

Applying a Housekeeping blackout window at sites A and D now improves replication performance. This provides some free time for future
capacity growth and for the increases in replication time associated with this growth.

[Timeline chart, GMT 12:00–02:00: the same shares, with both replication and housekeeping blackout windows applied]
Housekeeping windows can be configured, but must be monitored to ensure the Housekeeping load is not growing day by day.

A similar analysis to the above can take place at Sites B and C.

Site E, Data Center
Let us now analyze the replication situation at Site E, the Disaster Recovery center.

 At Site E we have replication jobs from A&D and B&C as well as local backups
 Replication jobs also trigger Housekeeping at the Target site
 Replication window set to 14 hours on the Target device initially
 Concurrency level of the Target is 24

TARGET Initial configuration


[Timeline chart, GMT 10:00–08:00: replication jobs from Sites A&D (Filesystem data, SQL, Filesystem 2, Special App Data) and from Sites B&C (Filesystem, SQL, Special App Data), plus local Exchange and Special App backups. HIGH LOAD between 20:00 and 02:00]

Some effort is required to map all the activity at the target, but it is clear that, between 20:00 and 02:00, the target has a very heavy load
because local backups, replication jobs from sites A and D, and housekeeping associated with replication jobs from sites B and C are all running
at the same time.

Consider improving the situation by imposing two Housekeeping Windows on the Target Device as shown below

[Timeline chart, GMT 10:00–08:00: the same replication and local backup jobs, now with two housekeeping windows applied. REDUCED LOAD]

This allows more efficient replication because there is no contention, which in turn can free up spare time on the Target StoreOnce that could
then be used to schedule tape offloads (shown in purple).

By implementing two housekeeping windows we can successfully reduce the peak load on the target, which then allows the local backups at
Target E to complete faster.

Catalyst Sizing example
The main impact of Catalyst from a sizing perspective is that low bandwidth backups, from either inside the data center (using less bandwidth)
or from remote sites backing up directly over the WAN, can be directed to the HP StoreOnce appliance.

This has the effect of achieving higher overall apparent backup throughput.

One of the most common usage models expected is that of several remote sites using Catalyst in low-bandwidth backup mode to send data over
the WAN to a centralized StoreOnce appliance. In this case the heavy-duty deduplication work is done on the media server at the remote site.
Indeed this technique can often negate the need for a StoreOnce appliance on remote sites but still provide the customer with a full Disaster
Recovery capability at a very low cost – which is obviously appealing.

StoreOnce Catalyst support in the Sizing Tool


• StoreOnce Catalyst is only supported on Symantec and HP Data Protector software
• The Catalyst emulation type is selected from the Target Emulation drop-down box (see examples below)

Note:
Use CatalystLowBw for low bandwidth backups in the data centre and direct remote site backups over WAN
Use CatalystHighBw for backups to StoreOnce Catalyst stores in the Data Center using high bandwidth (for servers that cannot support the
deduplication load)

HP expects 99% of Catalyst deployments to use Low Bandwidth mode because of the bandwidth savings it delivers.

Worked Example
Consider 20 backup jobs of 1 TB in 4Hrs from various servers in a data centre or from 20 remote sites. The retention schedule is fulls & daily
incrementals over 29 days. Incremental is 10% of Full backup size. 2% daily block level change rate.
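The throughput and capacity these inputs imply can be sketched with back-of-envelope arithmetic (a hedged estimate only, not the HP sizing tool; the weekly-full rotation and the simple "store only changed blocks" deduplication model are assumptions):

```python
import math

jobs, full_tb, window_hr = 20, 1.0, 4
inc_ratio, change_rate, retention_days = 0.10, 0.02, 29

# Aggregate ingest rate needed when all 20 fulls share the 4-hour window
ingest_tb_per_hr = jobs * full_tb / window_hr

# Logical (pre-deduplication) data retained, assuming one full per week
# plus daily incrementals on the remaining days
fulls_retained = math.ceil(retention_days / 7)     # 5 fulls in 29 days
incs_retained = retention_days - fulls_retained    # 24 incrementals
logical_tb = jobs * (fulls_retained * full_tb
                     + incs_retained * full_tb * inc_ratio)

# Crude physical estimate: the first full is stored whole, every later day
# adds only the ~2% of changed blocks (ignores compression and metadata)
physical_tb = jobs * full_tb * (1 + (retention_days - 1) * change_rate)

print(ingest_tb_per_hr)        # 5.0 TB/hr
print(round(logical_tb, 1))    # 148.0 TB logical
print(round(physical_tb, 1))   # 31.2 TB physical (approx)
```

Figures like these are what the sizing tool refines with its real deduplication model when you press Solve/Submit.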

Size a solution using Catalyst Low Bandwidth.

Job definition – note the use of “copies” because all jobs are the same.
Press Solve/Submit.

The sized solution is one HP 4430 Backup system with two additional shelves.

Prices removed for Commercial reasons

Because with Catalyst low bandwidth backup all the deduplication work is being done on the media servers supplying the data, the HP StoreOnce
4430 has to work less hard and so can match the performance requirements with a smaller configuration. Note also the additional Catalyst
license requirement.

Just for comparison if we size the SAME solution but with VTL emulation (where all the deduplication is done on the target – not distributed on the
media servers) the inputs are as follows:

When we press Solve/Submit we get:

The solution now changes to recommend a B6200 model, which is much more expensive. Catalyst can save you money!

Appendix C:
Guidelines on integrating HP StoreOnce with HP Data Protector 7,
Symantec NetBackup 7.x and Symantec Backup Exec 2012

The information in this Appendix is valid for both single node and B6200 StoreOnce Backup systems. Some of the set-up documented in this
section was performed on an HP B6200 Backup system.

HP StoreOnce Catalyst: Configuration, Display and Set-up
This section will provide a guided tour of Catalyst device types within HP StoreOnce Backup systems. For further details please refer to the HP
StoreOnce Backup system user guide or online help.

Catalyst stores are identified as HP StoreOnce device types, just like VTL and NAS, but there are no replication mappings in the left hand
navigation pane for StoreOnce Catalyst because all Catalyst copy operations are controlled by the backup software.

Status tab
The StoreOnce Catalyst Status tab is displayed initially and shows an overview of overall status. Further tabs provide access to Settings, Clients
(to configure access control), Blackout Windows (to control precisely when replication occurs), and Bandwidth Limiting Windows (to limit WAN
bandwidth usage for outbound copy jobs).

Settings tab
The Settings tab shows the default maximum number of outbound copy jobs and the maximum number of data and inbound copy jobs; these
maximums vary by StoreOnce model type (see Appendix A). A customer would only reduce these values if too much overall bandwidth were
being consumed. This tab is also used to enable Client Access Permission Checking.

Permissions tab (per store)
After overall permission checking has been enabled on the Settings tab, the access per store must also be configured (if different users are
limited to using different stores). The Clients tab is used to set up clients who are allowed to access various StoreOnce Catalyst stores – these
clients are integrated into the different backup applications that support Catalyst (HP Data Protector, Symantec NetBackup and Symantec Backup
Exec).

In NetBackup and Data Protector the client name can be set up on the StoreOnce appliance first and used in access control in the backup software,
but for Backup Exec the client must be an actual user defined in Active Directory as well.

Once Clients have been set up, the Edit function on the Stores–Permissions page can then be used to allow different users to access specific
stores.

Catalyst stores
Note: HP Data Protector also supports Catalyst Store creation directly from the software.

To create a new Catalyst store, click on Stores in the left-hand navigation pane, then click on Create in the top right-hand corner and provide a
Name and Description for the store you are about to create. The Primary and Secondary Transfer Policies should both be set to either High
Bandwidth or Low Bandwidth for NetBackup and Backup Exec; only HP Data Protector can support different transfer policies on one store.

HP expects the vast majority (99%) of customers to use the low bandwidth mode, as this has the ability to improve overall backup throughput to
the StoreOnce appliance by offloading most of the deduplication load to the media servers/media agents in the customer environment
(assuming these have been adequately sized to perform these tasks).

Data stored within Catalyst
Let’s now take a look at how the data is stored in a HP StoreOnce Catalyst store.

When examining the actual contents of a Catalyst store the following definitions apply.

Items: This is the unit of storage within an HP StoreOnce Catalyst store. All item names are defined by the backup software.
Permissions: This controls which clients can access this store.
Data Jobs: These are the backup jobs written to a StoreOnce Catalyst store; their format is determined by the backup software.
Outbound Copy jobs: This is the terminology used to refer to Catalyst stores being replicated (out) to another Catalyst store.
Inbound Copy Jobs: Because StoreOnce Catalyst can support multi-hop replication under backup application control, there is now a
concept of Inbound Copy jobs; these are Catalyst stores being replicated (in) to this Catalyst store from elsewhere in the environment.

The naming of the Items is determined by the backup application. The items can be searched by means of a search engine (see below). Click Show
Items to see actual entries. Examples are shown later.

Catalyst copy jobs


Setting up the Catalyst store in the backup application as both a backup target and as a copy (replication) source to another device is the most
complex task that you have to carry out when using HP StoreOnce Catalyst, because backup applications do this in different ways. It is described
in detail in the different sections of this document.
To implement the Catalyst copy the following terminology is used.
 HP Data Protector: Object Copy, licensing is per TB of storage
 Symantec NetBackup: Duplicate under the control of a Storage Lifecycle policy, which requires the Enterprise Disk License and the
Front End Terabytes license
 Symantec Backup Exec: Optimized Duplication, which is a licensed feature requiring a deduplication license per media server as well as
the Front End Terabytes license.
StoreOnce Backup systems require a Catalyst license (per appliance). There are separate Catalyst appliance licenses for HP Data Protector and
for Symantec (a single Symantec license covers NetBackup and Backup Exec).

Examples of NetBackup Data Job entries


The Data Jobs tab shown below illustrates the Item format from NetBackup – each backup consists of a header file and a data file and, therefore,
contains multiple items.

The above NetBackup Data Jobs tab illustrates another point about Catalyst stores used in low bandwidth mode. The first backup to a low
bandwidth Catalyst store is similar to replication: a form of seeding is required, because much of the data being deduplicated on the media server
does not yet exist in the Catalyst store and must be physically transmitted. As you can see in the above NetBackup Catalyst store, the first low
bandwidth backup provided only a 23.1% bandwidth saving, whereas the second backup achieved a 98.6% bandwidth saving. This shows the
power of StoreOnce Catalyst backup devices – vastly reducing the bandwidth required between the media server and the Catalyst store.
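The bandwidth-saving percentage reported on the Data Jobs tab can be understood as the fraction of the logical backup that never had to cross the wire. A small illustration (the 100 GiB job size is invented; only the 23.1% and 98.6% figures come from the example above):

```python
def bandwidth_saving(logical_bytes, transferred_bytes):
    """Percentage of the logical backup that did not need to be sent."""
    return 100.0 * (1 - transferred_bytes / logical_bytes)

GiB = 2**30
logical = 100 * GiB                             # a hypothetical 100 GiB backup

first = bandwidth_saving(logical, 76.9 * GiB)   # seeding: most chunks are new
second = bandwidth_saving(logical, 1.4 * GiB)   # most chunks already in store

print(round(first, 1))   # 23.1
print(round(second, 1))  # 98.6
```

The jump between the first and second figures is the seeding effect described above: once the store holds a full backup's chunks, only new or changed chunks are physically transmitted.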

Examples of HP Data Protector Item entries


The Item Summary tab below shows the HP Data Protector Item format. In this multi-stream backup of SQL the item names are references in the
HP Data Protector catalog.

The actual replication jobs initiated by the backup application can be monitored in the Outbound Copy Jobs tab of the Catalyst store – below we
see an example of Catalyst copy via backup application control using HP Data Protector.

Note the following:

 The Item naming convention of HP Data Protector


 Catalyst copy, like replication, is also very bandwidth efficient – in this example saving up to 99.4%
 Multi-hop copy is possible – here a dpstore1 item is being copied to two Catalyst stores, dpstore2 and dpstore3. Copy takes place
serially, not concurrently

Examples of Backup Exec Item entries


With Backup Exec several control files are also sent to the Catalyst store, and there are also Tag List entries describing the content. Item names
are prefixed by “BEOST” and the whole naming convention is similar to the concept of tape media in a tape library, with “0” byte entries also
being recorded (see below) for unused slots.

Catalyst Implementation in HP Data Protector 7
In this section we shall describe:
 How StoreOnce Catalyst is integrated with HP Data Protector 7
 An example scenario
 How to create a Data Protector specification for backup to a StoreOnce Catalyst store
 How to recover from Catalyst copies
 Best practices when using StoreOnce Catalyst with HP Data Protector 7

Integrating HP Data Protector 7 with StoreOnce Catalyst


The following diagram illustrates how StoreOnce Catalyst may be integrated with HP Data Protector 7. The Catalyst API is embedded in the HP
Data Protector 7 media agent, which may be installed (or not installed) on various clients. HP Data Protector uses a concept called gateways to
control access to Catalyst stores.

The high or low bandwidth mode is shown by the pink (high) and yellow (low) gateways in the drawing.

Figure 35: HP Data Protector 7 with StoreOnce Catalyst

HP Data Protector gateways


Gateways may be implicit or explicit and this is defined within the backup specification. The implicit gateway does not support the Data Protector
Object Copy function.

Explicit gateways are server-side gateways. There should always be at least one explicit gateway defined. Explicit gateways are assigned, as
required, to any server running the Data Protector 7 media agent. The explicit gateway configuration can specify the maximum number of
streams and where the deduplication process is to be performed – on the media server or on the HP StoreOnce Backup system. If deduplication
in the media agent is selected (by selecting server-side deduplication in the device properties), this results in a low bandwidth transfer of
deduplicated data. The explicit gateway supports HP Data Protector Object Copy (Catalyst store replication). The explicit gateway could really be
considered as “client-side” deduplication: every client (with a no-cost media agent loaded) can access the Catalyst store using a gateway of the
same name and settings – HP Data Protector knows which client to start the media agent code on to perform this task. The explicit gateway can
also be configured with server-side deduplication turned off, in which case ALL deduplication will take place in the StoreOnce appliance. This is
called ‘target-side’ deduplication.

The implicit gateway is a source-side gateway and is optional. It is configured once only but can be used by any of the clients in the cell that have
a media agent installed. The implicit gateway does not have an assigned media agent but will start media agents on any server equipped with
media agent software. In effect, it is like a ‘virtual’ gateway for every media-agent-equipped server belonging to the cell where ‘source-side’
deduplication is specified for backup (see configuration examples later in this section).

The implicit gateway has the same configuration parameters on every media-agent-equipped backup server. It is designed so that only files or
data resident on the media-agent-equipped backup server can be backed up via this gateway. Files or data resident on an application server with
only the disk agent or application agent installed cannot be backed up or restored via an implicit gateway.

The implicit gateway always invokes deduplication in the media agent (which is referred to as source-side deduplication on the Data Protector
Add Device screen). HP Data Protector Object Copy (Catalyst store replication) is not available using the implicit gateway. There is a setting in the
implicit gateway which can be used to limit the number of parallel streams. This setting, as you would expect, applies to every server with a
media agent using the gateway; so if set at 2, each media-agent-equipped client can have a maximum of 2 streams.

Users can, of course, configure both source- and server-side gateways. So you could have a backup server which normally uses the source-side
gateway for deduplication in the server but also has a server-side gateway with deduplication not selected, for ‘target-side’ deduplication
(useful for reducing server load at the expense of backup performance).

There is currently an overall limit of 32 gateway instances per session.

Data Protector also queries the StoreOnce device for the maximum number of Catalyst sessions available. (These would be inbound data jobs).

So let’s look at an example:

A DP cell has 100 media-agent-equipped clients and a source-side gateway is configured. The maximum number of parallel streams is set at 2 at
the gateway. So only 16 clients are processed at a time, each with 2 streams, because we hit the 32-instance limit. When a session ends, another
client is activated.
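The arithmetic behind this example can be sketched as follows (illustrative only; it assumes one gateway instance is consumed per stream):

```python
SESSION_GATEWAY_LIMIT = 32   # overall gateway-instance limit per session

def concurrent_clients(total_clients, streams_per_client):
    """Clients that can be active at once before the limit is reached."""
    return min(total_clients, SESSION_GATEWAY_LIMIT // streams_per_client)

print(concurrent_clients(100, 2))  # 16 - as in the example above
print(concurrent_clients(100, 1))  # 32 - one stream each doubles the clients
```

This is why the stream limit set on the gateway directly trades per-client throughput against the number of clients that can back up concurrently.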

So best practices are:

 Use the source-side gateway for mass deployment to smaller servers which really only send one stream. Set the limit in the gateway at 2.
 Use the server-side gateway for large application servers where the backup could feed many streams, and limit to around 12 streams per
Catalyst store. Each B6200 service set could support 8 of these large servers.
 Use a server-side gateway when target-side deduplication is required and the user has a fast 10GbE connection but does not wish to load the
server, e.g. online database backup.

Important!
There is no difference in the deduplication method for either source-side or server-side deduplication. The different gateways essentially
support different deployment methods.

Deduplication types with the explicit gateway


The explicit gateway supports both low bandwidth (server-side) and high bandwidth (target-side) deduplication.

 Server-side deduplication – deduplication of data is performed within the dedicated backup server. Server-side deduplication can be used
for data held locally on the backup server and from servers that have a disk agent installed. In this case, data is transferred over the network
to the backup server and then processed by the media agent and sent on to the Catalyst store. Server-side deduplication must use the
‘explicit’ gateway for the backup destination and that gateway requires server-side deduplication selected in the advanced options.

 Target-side deduplication – data is held on a client with only a disk agent installed. This system can be remote from the backup server.
Server-side deduplication is not selected. All data is transferred at high bandwidth across the LAN or WAN to a backup server hosting a
gateway to the StoreOnce Catalyst appliance. This may be necessary when Data Protector 7 has only application/disk agent support for a
particular data type (such as OpenVMS backup).

Key Points:
 At least one explicit gateway must be configured. You cannot configure just an implicit gateway.
 For files or data held on an application server with only a DP7 disk agent installed, backups must be directed to an explicit gateway.
 The implicit gateway is used for source-side deduplication on any server in the cell running a media agent. A server running just a disk or
application agent cannot select the implicit gateway.
 The parameters (maximum streams, and so on) are the same for every server using the implicit gateway. This is useful for limiting
server loading.
 The implicit gateway does not support Object Copy (Catalyst store replication).
 Target side deduplication is useful when the extra load of deduplication is not wanted on the backup server.
 Only 64-bit servers can be configured for a gateway.
 The deduplication process is exactly the same for server-side and source-side deduplication.

 The StoreOnce deduplication and Catalyst client binaries are built in to the media agent code. There is no requirement for a plug-in
software module, as required for Symantec NetBackup and Backup Exec integration.
 In HP Data Protector terminology a Catalyst store is referred to as a Backup to Disk device type.

Example scenario
First, we need to configure a store on the HP StoreOnce Backup system (see Catalyst stores). Then, we need to configure the device using the Data
Protector Management Client.

This guide will cover configuration of both gateway types. There are four servers:
 The cell manager is on server ‘Zen’ and, following HP Data Protector best practice, this is a separate server. The cell manager creates a
significant loading, which is not desirable on a media server.
 There are three other servers: Bill, Ben and Zip. The server ‘Zip’ has only a Data Protector 7 disk agent installed.
 The HP B6200 Backup System being used in this example is configured for ‘template1’ network configuration, which uses 10GbE for
data and 1GbE for B6200 management. DNS is in use and the B6200 VIFs (virtual IP addresses) will be referenced by their fully qualified
domain name.
 For the purpose of this exercise four Catalyst stores will be configured as dpstore1 – 4 within the B6200

The figure below shows the example layout. Let us assume dpstore1 has been created on the B6200 Backup System; the configuration of Data
Protector can now proceed. The device will be called ‘B2D1’ and will be configured as a Backup to Disk device.

Figure 36: Example scenario using HP Data Protector 7

So, in the above example, we could configure in the following ways:

 Use a single implicit gateway for Bill and Ben – but only data specific to Bill and Ben could be backed up to the Catalyst store, and no
object copy of Bill and Ben data would be possible. Zip cannot be backed up to a Catalyst store.
 Use two named explicit gateways for Bill and Ben – this means they can support object copy, and backups from Zip can be mapped to one
of the explicit gateways. All the data is transferred over the network to Bill or Ben, where it is deduplicated, and then only small amounts
of backup data are sent (low bandwidth) to the Catalyst store accessed by the specific explicit gateway.

Configuring Data Protector
1. Start the Data Protector management GUI and select Devices & Media from the drop down box at the top left of the page.

2. Right click on Device and select Add Device.

3. On the screen displayed add the chosen Device Name (B2D1), Description (optional), and select Backup to Disk as the Device Type and
StoreOnce Backup System as the Interface Type.

4. Select Next to continue the configuration.

5. The next screen is used to select (or create) the HP StoreOnce Catalyst store located on the appliance and to configure the gateways.
Specify the IP address or FQDN of the Deduplication System
6. In the Store section of the screen, you can browse and create Catalyst stores.
If client access permissions have been selected on the StoreOnce Backup system GUI, you must enter the Client ID to browse or create a
store on the backup application GUI. If the Client ID is not entered any pre-made stores will not be accessible.
If the store has not been pre-configured on the StoreOnce appliance, a new store will be created by HP Data Protector providing the
Client ID is specified correctly.
Note: HP Data Protector is the only backup application that is able to create Catalyst stores from the backup application GUI.

The screenshot below shows the critical stage of the HP Data Protector Catalyst configuration. We have two options for dpstore1: configure
access to it via an implicit gateway (Source-side deduplication ticked), OR, if we don’t tick Source-side deduplication, use the Add function to
define numerous explicit gateways (each residing on a Data Protector media server) that are allowed to access dpstore1. This allows multiple
media servers to write to the same Catalyst store and, if the Catalyst store is configured for a particular data type and the media servers send
only that data type to that Catalyst store, then the deduplication will be much improved.

[Screenshot callouts: set source-side dedupe (implicit); set server-side dedupe (explicit); multiple explicit gateways can be added here]

7. The lower part of the screen is used to configure Gateways.


To configure an implicit gateway (optional), select Source-side deduplication.

Click Properties to access the advanced settings, where the maximum number of streams per client may be specified. The default
setting is two and will be applied by any media server when using source-side deduplication (i.e. when the implicit gateway is being
used).
This is an important setting when optimizing the number of streams of data per Catalyst store – 16 is recommended. The blocksize
setting is not used by StoreOnce Catalyst but is still used by disk and application agents; it is recommended to increase this to at least
256 KB.

8. The section below the red dotted line (above) adds explicit gateways. At least one explicit gateway must be configured. These gateways
are applied individually to each Data Protector client that has media agent software installed.
The client server names are shown in the drop-down box (servers must be added as clients and the media agent installed prior to
gateway configuration).
Select each server that requires a gateway and click Add. The Properties dialog for explicit gateways allows selection of Server-side
deduplication, shown below.

If the Server-side deduplication box is ticked in the Advanced Options tab, the deduplication occurs on the server hosting this gateway
and low bandwidth backup occurs. (This is as the equivalent of selecting source-side deduplication when setting up an implicit gateway.)
If this box is not ticked, all deduplication will take place on the StoreOnce appliance and the backup will be high bandwidth, which means
target-side deduplication.

9. You may also use this dialog to specify the maximum number of streams per gateway.
Maximum Number of Parallel Streams per Gateway: The maximum recommended value is 16 for a single Catalyst store. However, the
best overall throughput on a StoreOnce device is achieved if we have 8 separate devices with 6 streams to each. So, in our example, we
have set the value to 6.

10. Whilst you are in the Advanced Options GUI for gateway properties, click on the Sizes tab and select Blocksize. This should be set as high
as possible; 1024 kB is recommended but may not be possible with all Host Bus Adaptors. A high value minimizes the number of HP Data
Protector headers inserted into the data stream and hence improves deduplication ratios.

11. Click OK to apply the Advanced Settings. This returns you to the Add Device screen, shown in step 6.

12. Use the Check button to the right of the server list to check successful communication between the backup server and the HP StoreOnce
Backup system.
Note: With HP B6200 Backup systems the network may be configured to use two subnets (such as 10GbE for data and 1GbE for
management). If using DNS, although only the data path to the B6200 Backup System is used for the Catalyst data and commands, both
subnets must be capable of resolving their respective service set VIFs for data on the 10GbE network and the management VIF on the
1GbE network.

13. Select Finish at the bottom of the Add Device screen and the stores are now ready for use.

Key Points:
 The optional implicit gateway, when selected, will start media agents on any media-agent-equipped server but only for local data on that
server. It uses the same settings for every server. Data for backup must reside on the same server. It is used for source-side deduplication
only and is useful for providing an overall limit on data streams to match backup server specification.

 The explicit gateways can be configured individually on each media-agent-equipped server that is registered as a client in the cell. They can
be used for server-side or target-side deduplication and can backup data that is resident on other servers via the network.

 For each data stream a media agent process is started. For each mount point a disk agent is started.

 The maximum number of connections per store can be set by Data Protector.

Creating a Data Protector Specification for backup to a StoreOnce Catalyst store


Using the configuration we have just described, we will now create a backup specification to perform a backup of data using source-side or
server-side deduplication.

Note: Target-side deduplication is the deduplication method used by VTL, NAS and Catalyst high bandwidth configurations and is not shown in
this document, which is illustrating Catalyst low bandwidth scenarios.

Backup using source-side deduplication (implicit gateway)


Deduplication occurs at the backup server; this is also referred to as low-bandwidth backup.

1. From the Data Protector Management screen select Backup from the drop-down box at the top of the page.

2. Right click on Filesystem and select Add Backup.

3. Select Blank Filesystem and check the Source-side deduplication box as the backup specification option. Click Next.

4. Select some files for backup and click Next to display a list of destination devices. In our example, the destination is the device ‘B2D1’.

Note that the explicit gateway on the server called Bill (b2d1_gw1) is not available and is ‘grayed’ out because we ticked Source-side
deduplication in the backup job definition.

5. Select the source-side gateway. Remember this gateway is selectable only if ‘source-side’ deduplication is specified.
Highlighting the gateway will allow the Properties button to be selected. This is used to specify a media pool. A default media pool is
created for the ‘backup to disk’ device but additional media pools may be created, if desired.

6. Click Next and specify the required options for retention. The backup specification options and schedule may be modified, if desired. It is
often useful to tick the Display statistical information box.

7. Save the backup specification.

Backup using server-side deduplication (explicit gateway)


Deduplication occurs at the backup server but data can be backed up that is not stored on the server running the media agent. This requires
server-side deduplication using an explicit gateway and is used for backing up clients that only have a disk agent loaded; in our example this is the
server called ‘Zip’.

1. From the Data Protector Management screen select Backup from the drop-down box at the top of the page.

2. Right click on Filesystem and select Add Backup.

3. Select Blank Filesystem. DO NOT check the Source-side deduplication box in the backup specification. Click Next.

4. Select some files for backup and click Next to display a list of destination devices. In our example, the destination is the device ‘B2D1’.

5. Expand B2D1; now the Source-side gateway is ‘grayed’ out and the explicit gateway on the backup server ‘Bill’ is available. This
gateway has ‘server-side’ deduplication selected in the advanced options. Select the options required.

6. Click Next and specify the required options for retention. The backup specification options and schedule may be modified, if desired. It is
often useful to tick the Display statistical information box.

7. Save the backup specification.

Selecting gateways in HP Data Protector


The selection of where deduplication takes place is backup job dependent; you select the appropriate gateway when creating the backup job
from the choice on offer. In the example below b2d3 (backup to disk3) is a single Catalyst store that can be accessed by any of the four
gateways configured.

HP StoreOnce Catalyst has the added advantage that expired backups may be removed automatically as Catalyst items and the space made
available for other users of the store. Please note that as the data stored is deduplicated, space is not returned until the Housekeeping
process has run on the StoreOnce appliance (this is configurable to run for up to two periods within any 24 hours).

HP Data Protector 7, by default, automatically returns space occupied by expired backups every 24 hours. This may be modified to hourly by
modifying the global options file. This file is located at: \ProgramData\OmniBack\Config\server\options\global

Finally, ensure that HP Data Protector Windows Disk Agent version A.07.00 or later is installed on your clients; this version is enhanced to
minimize the header inserts into the data stream, improving deduplication ratios. To check, use the “Clients” context drop-down and
inspect the installed HP Data Protector components under “Properties”.

Catalyst Copy Implementation in HP Data Protector 7 (Object Copy)


HP StoreOnce Catalyst has the ability to move data between StoreOnce Catalyst stores without re-hydration. This means that bandwidth-efficient
transfers take place, making it ideal for moving data quickly offsite.

HP StoreOnce Catalyst Copy differs from VTL or NAS replication in that data can be duplicated to more than one appliance for extra resilience.
Additionally, the HP Data Protector internal database is now aware of all StoreOnce Catalyst copies and can restore from any copy without
complex scripting arrangements or imports. When required, data can also be copied to real tape for long-term storage. (When data is moved to
tape the deduplicated data must be re-hydrated.)

Data Protector 7 Object Copy offers a rich selection of options. This document will cover only basic object copy functions - replicating a backup to
one Catalyst store then on to another store. For more detail please consult the appropriate HP Data Protector documentation.

All transfers between StoreOnce Catalyst stores are bandwidth efficient (low-bandwidth). The StoreOnce Catalyst protocol can now control the
StoreOnce appliance, so there is no need for data to flow through a backup server.

Object copies can be interactive (useful for ad-hoc copies), or scheduled. Additionally they can be set to occur ‘post-backup’ after the backup
completes. It is not possible to make backups to multiple destinations at the same time. Copies are made sequentially from one store to another.

Figure 37: Object Copy implementation with StoreOnce Catalyst

The above figure illustrates backups of user data, performed using a server-side gateway to Catalyst store #1 located in data center #1.
Post backup (or scheduled) HP DP7 object copy can move the backup offsite via the WAN to Catalyst store #2 in data center #2. This is performed
in a bandwidth-efficient manner and, after the first transfer (seeding), only unique data chunks will be transferred. The expiry date of the original
backup can be shorter or even immediate once data is offsite.

The backup can then be duplicated to Catalyst store #3 in data center #3. This gives extra resilience. The Data Protector 7 Object Copy
configuration could also support moving the data onto tape if required.

Transfer of data is direct from source StoreOnce Catalyst store to target StoreOnce Catalyst store. Although it may appear the gateways are
involved, they only pass the commands to perform the replication. The cell manager internal database tracks the copies. It is important to back up
the cell manager database after every backup session.

Key Points:
 Use the Data Protector Object Copy function to move data offsite.
 Design the WAN link using the Sizer tool in order to complete the duplication in the appropriate time window. Allocate separate time
slots for the duplication process and the backup process.
 Always select server-side deduplication (explicit gateway) for backups. Because of the virtual nature of the source-side (implicit)
gateway it is not possible to use it for duplication/object copy. Destination source-side (implicit) gateways are ‘grayed’ out. If you select a
source-side gateway for backup, jobs will fail unless the object copy job is remapped to use the explicit gateways.
 Duplication can either be post backup or scheduled separately.
 Object copy to tape is referred to as a ‘copy’ as distinct from a Catalyst copy, which is replication.
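The WAN sizing advice above can be sanity-checked with simple arithmetic. The sketch below is illustrative only (the function name, the assumed unique-data fraction and the link speed are assumptions, not StoreOnce figures); use the HP Storage sizing tool for real designs.

```python
def replication_window_hours(backup_gb, unique_fraction, wan_mbit_per_s):
    """Rough time to object-copy one backup over a WAN after seeding.

    After the first transfer (seeding), only unique data chunks cross
    the link, so the volume sent is roughly backup_gb * unique_fraction.
    unique_fraction is an assumed figure (e.g. a daily change rate),
    not a value reported by the StoreOnce appliance.
    """
    gb_to_send = backup_gb * unique_fraction
    wan_gb_per_hour = wan_mbit_per_s / 8 / 1024 * 3600  # Mbit/s -> GB/hour
    return gb_to_send / wan_gb_per_hour

# Example: a 2000 GB backup with 2% unique data over a 100 Mbit/s link
# needs roughly an hour, which must fit the allocated duplication slot.
hours = replication_window_hours(2000, 0.02, 100)
```

If the estimate does not fit the duplication time slot, either the link must be upgraded or the backup and duplication windows rescheduled.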

Setting up Object Copy


Use the Object Operations drop down in the context box of HP Data Protector to set up the Object Copy (Catalyst store copy using low bandwidth
replication). There are three options:
 Automated (immediately after backup completes)
 Scheduled
 Interactive

Simply sequence your way through the tabs (Backup Specifications ->Copy Specifications, and so on) to construct the Object Copy job.

The following example shows a post backup Catalyst replication process being configured through Object Copy in HP Data Protector. There are
several configuration tabs, but be sure to set the following parameters:

 On the Copy Specifications tab, only StoreOnce Catalyst store devices should be enabled for replication.

 On the Options tab remember to set Use replication.

Multi-hop copy is possible to multiple sites (the copy operations, however, happen serially) and Catalyst Copy can also take place to physical tape,
if long term retention or copies are required. Consult the HP Data Protector Object Copy documentation for more details and descriptions of all
the tabs.

In this example we have object-copied data from B2D1 to B2D2 (even though these are in the same physical B6200, they represent a copy
across different sites).
HP Data Protector 7 – Recovery from Catalyst copies
The screenshot below illustrates how it is possible to restore from Copies within the HP Data Protector Restore function.

Select the required Filesystem and display its Properties to see all the versions of that backup. In our example, the original is on Catalyst store
B2D1 whilst a copy is on the Catalyst store B2D2 (typically on a Disaster Recovery site).


If the primary copy on B2D1 is lost or the site damaged, then the data can be restored immediately from the copy residing on B2D2. Enable the
Select source copy manually box and select the required copy from the list.

Selecting a copy also selects the correct media pool on the correct Catalyst store.

HP Data Protector 7 – Catalyst best practices summary
1. Best practice is to keep similar data in the same Catalyst store. For example: dedicate a B6200 Catalyst store to Oracle backups and have a
different Catalyst store for SQL server.

2. The more streams that can be supplied concurrently to a Catalyst store, the better the throughput. For best throughput, 16 streams is about
the maximum you can send before contention for resources occurs within a single Catalyst store. The source data selection will dictate
how many streams are sent in a particular backup specification. If data is selected for backup from multiple mount points, each mount point
will have a disk agent started. It is also possible to use the backup specifications to select multiple directory selections for backup to produce
multiple streams.

3. Multiplexing cannot be configured within Data Protector 7 with Catalyst devices. (This is known as ‘Concurrency’ within Data Protector.)

4. It is necessary to select source data correctly for multiple streams. For example: for Filesystem backup separate mount points, drive letters
or directory selections are required.

5. Backup servers running multiple streams and deduplication must be sized appropriately.

6. Use the HP Storage sizing tool – this is calibrated with the latest test results from HP R&D and will take into consideration data retention,
data change and growth. It will also size any WAN links or show the bandwidth savings when using Catalyst.

7. Use the implicit gateway and use source-side deduplication if the source data is located on the server running the DP media agent.

8. Use the explicit gateway and server-side deduplication for individual server settings and when source data may be located on other servers
and object copy functionality is required.

9. When setting up the gateway, remember that the implicit gateway (source-side deduplication) has a default limit of 2 streams per client. This
will apply to all media servers using this gateway. The advanced setting for each explicit gateway also has a setting that allows you to
specify the maximum number of streams.

10. When load balancing is selected at backup job run time, there is also a limit on streams and this overrides the gateway limit. If the gateway
stream limit is set to 6 and load balancing is set at 5 (default), then ONLY 5 streams will be possible for a job specification. The default
setting may be modified. Load balancing across two different StoreOnce Catalyst stores is not recommended.

11. Expired backups are deleted from the StoreOnce Catalyst store at intervals specified by the global options file (default 24 hours).

12. Exporting backup media from a Catalyst store will leave ‘orphaned’ items in the Catalyst store – avoid doing this.

13. Use the Data Protector Object Copy function to move Catalyst data to another device - typically offsite.

14. Always select an explicit server-side gateway for backups that require duplicating to another site. Because of the virtual nature of the
source-side (implicit) gateway it is not possible to use it for duplication; destination source-side (implicit) gateways are ‘grayed’ out. If you
select a source-side implicit gateway for backup, jobs will fail unless the object copy job is remapped to use the explicit gateways.

15. The HP Data Protector Object Copy function supports either post-backup or scheduled duplication, depending on customer requirements.
Interactive mode is generally reserved for one-offs and testing only.

16. Object copy to tape is referred to as a ‘copy’ as distinct from a Catalyst copy which is ‘replication’.

17. HP StoreOnce Catalyst does not use Active Directory services for access control.

18. In a Disaster Recovery situation a StoreOnce Catalyst user is likely to have backup copies on multiple sites, and possibly to tape as well – all
copies have entries in the IDB (internal database). Best practice is to back up the IDB and then replicate offsite.

19. If the original cell manager is lost a new one can be created and then updated with the Cell Manager IDB. However, it will be necessary to
‘import’ the Catalyst items for the relevant store.

20. To import Catalyst stores to a new Cell Manager proceed as follows:

 With a new cell manager configure the B2D device which is intact on the DR site. The original cell manager has been ‘lost’ in the disaster.

 Obtain a list of StoreOnce Catalyst objects in the store using the following command:
(commands in \Program Files\Omniback\bin)
omnib2dinfo.exe -list_objects -type OS -host <<IP address or VIF of the B6200 service set>> -name <<storename>>

Note: The storename is the name of the store on the StoreOnce Backup system and not the DP name.

 Update the repository for each Catalyst object, using the Store name given in DP7.
omnimm -add_slots <<Store name in DP7>> <<catalyst object name>>

Catalyst Object names are typically of the form “e74d2262_502255d2_06b8_0004”.

 Either use the GUI to import the catalog data or the command:
omnimm -import <<logical StoreOnce device>> -slot SlotID
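The import steps above lend themselves to scripting when a store contains many Catalyst objects. The sketch below is a hypothetical wrapper around the documented commands: the assumption that object names appear verbatim in the omnib2dinfo listing, and the helper names, are ours, and the per-slot catalog import is left as a comment.

```python
import re
import subprocess

# Catalyst object names are typically of the form "e74d2262_502255d2_06b8_0004"
OBJECT_RE = re.compile(r"\b[0-9a-f]{8}_[0-9a-f]{8}_[0-9a-f]{4}_[0-9a-f]{4}\b")


def parse_catalyst_objects(listing):
    """Extract Catalyst object names from omnib2dinfo -list_objects output.

    Assumes the object names appear verbatim in the listing; adjust the
    pattern against real command output.
    """
    return OBJECT_RE.findall(listing)


def reimport_store(dp_store_name, host, storename):
    """Re-register every Catalyst object in a store with a new Cell Manager,
    mirroring the manual omnib2dinfo / omnimm -add_slots sequence above."""
    listing = subprocess.run(
        ["omnib2dinfo.exe", "-list_objects", "-type", "OS",
         "-host", host, "-name", storename],
        capture_output=True, text=True, check=True).stdout
    for obj in parse_catalyst_objects(listing):
        subprocess.run(["omnimm", "-add_slots", dp_store_name, obj],
                       check=True)
    # The catalog data import then follows per slot:
    # omnimm -import <logical StoreOnce device> -slot <SlotID>
```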

HP StoreOnce Catalyst stores and Symantec products
StoreOnce Catalyst may be integrated with Symantec NetBackup and Backup Exec. HP StoreOnce Catalyst stores are represented in Symantec
NetBackup and Backup Exec as the device type: Open Storage devices.

For both products, HP StoreOnce Catalyst Open Storage Technology (OST) plug-ins must be downloaded from here and installed onto any
Symantec NetBackup or Backup Exec Media servers that will be used to write to an HP StoreOnce Catalyst store. This allows HP StoreOnce
Catalyst stores to be seen within Symantec products as an Open Storage device – hp-StoreOnceCatalyst is the name of the device.

Different OST plug-ins are required for NetBackup and Backup Exec.

The next diagram shows the implementation of HP StoreOnce Catalyst within Symantec products.

Figure 38: Integrating HP StoreOnce Catalyst with Symantec NetBackup or Backup Exec

Once installed, the same concepts apply as with HP Data Protector integration – low and high bandwidth backups to a Symantec Open storage
device and then optimized duplication (Symantec terminology) of the Catalyst stores to another device. The following diagram illustrates
StoreOnce Catalyst functions with Symantec NetBackup.

Figure 39: StoreOnce Catalyst functions with Symantec NetBackup

StoreOnce Catalyst implementation with Symantec NetBackup
In this section we shall describe:

 How to integrate HP StoreOnce Catalyst stores with Symantec NetBackup


 How to configure a StoreOnce Catalyst store in Symantec NetBackup
 How to implement a Catalyst Copy in Symantec NetBackup using a Storage Lifecycle
 How to recover from a Catalyst Copy in Symantec NetBackup
 Best practices when using StoreOnce Catalyst with Symantec NetBackup

Integrating HP StoreOnce Catalyst stores with Symantec NetBackup


The HP StoreOnce Catalyst device fully utilizes the Open Storage API device support with NetBackup.
To integrate HP StoreOnce Catalyst with NetBackup you must:
1. Install the HP StoreOnce Catalyst OST 2.0 plug-in on any Media Server that is required to write to HP Catalyst stores. The plug-in is an
executable and requires the NetBackup service to be down whilst the plug-in is installed (the plug-in must be installed on each media
server wishing to write to Catalyst stores).

2. Set up and configure a Catalyst store on the HP StoreOnce Backup System.


3. Configure the Catalyst store into the NetBackup Storage Server/Disk Pool/Storage Unit architecture, as described in this section of the
document.
4. Configure backups to the Catalyst store, as described in this section of the document.
5. Set up low bandwidth replication via NetBackup Storage Lifecycle Policies (optimized copies), as described in this section of the
document.

The NetBackup hierarchy is shown below; a backup policy sends backups to a Storage Unit (which can be VTL, Basic Disk or Open Storage
device).


Figure 40: NetBackup hierarchy

Configuring a StoreOnce Catalyst store in Symantec NetBackup


Assume we have configured a Catalyst store as shown previously. In the example the Catalyst store is called Netbackup75.
1. To configure a Catalyst store in NetBackup use the Storage Server Configuration Wizard, as shown below. A Storage Server is NetBackup
terminology that categorizes intelligent appliances integrated into the NetBackup hierarchy.

Storage server name: This is the IP address (or equivalent DNS name) of the system where the Catalyst store resides. Note that on
StoreOnce B6200 systems this should be the virtual IP address that identifies the Data Path to the Catalyst store. The number of
Catalyst stores configurable depends on the model number.

Storage server type: This is always hp-StoreOnceCatalyst because this is the unique identifier used by the OST2.0 plug-ins that HP has
developed.

Media server: This is the Media Server where the OST2.0 plug-in is installed. In this case we will connect it to our “Heartofgold” Media
Server (where part of the deduplication process will now be performed because we have set up a low-bandwidth Catalyst store).

User name: This is the user that you created on the HP StoreOnce appliance; the password within NetBackup can be of your choosing.
The Use Symantec OpenStorage plug-in for network controlled storage server should not be ticked as this is a Symantec special
device type.

2. Having created a storage server, we must filter this down to create a Storage Unit for a backup policy to access. The first stage is to
integrate the StoreOnce Catalyst stores that NetBackup has detected as Disk Pools and give them a name. In the example below: on HP
StoreOnce appliance B6200ss1.nearline.local there are two HP Catalyst stores, dpstore1 and Netbackup75. Netbackup75 is the Catalyst
store we have created for presentation to NetBackup 7.5. dpstore1 is being used by another backup software vendor.

3. The final part of the disk pool process is to assign it a Disk pool name. Because we configured Netbackup75 Catalyst store in low
bandwidth mode – we have named the disk pool Netbackup75Lowbandwidth.

4. Finally, within a Disk Pool we create a Storage Unit that can be accessed by a backup policy. At this stage we can also configure the
number of concurrent streams that can be written to the Catalyst store (max of 16 is recommended) and the maximum file size that can
be written to the Catalyst store/storage unit (524 GB).

5. As can be seen below, Netbackup75LowbandwidthStore appears as a storage unit in the NetBackup Administration Console, which
means we can now configure backup policies and direct them to the Storage Unit. In this case, the backups from the media server will
include deduplication performed on the NetBackup media server “Heartofgold” because the Catalyst store has been configured in low
bandwidth mode on the StoreOnce appliance.

6. A typical backup job to a Catalyst storage unit can then be configured, selecting Netbackup75Lowbandwidth as the policy storage unit
as shown below.

Catalyst Copy Implementation in Symantec NetBackup (Storage Lifecycle policy – duplicate)


Catalyst copies are implemented in Symantec NetBackup using a function called a Storage Lifecycle policy. A Storage Lifecycle Policy (SLP) is a
storage plan for a set of backups, which is configured within the Storage Lifecycle Policies utility. An SLP contains instructions in the form of
storage operations to be applied to the data that is backed up by a backup policy. Operations are added to the SLP that determine how the data is
stored, copied, replicated, imported and retained. NetBackup retries the copies as necessary to ensure that all copies are created.

SLPs offer users the opportunity to assign a classification to the data at the policy level. A data classification represents a set of backup
requirements, which makes it easier to configure backups for data with different requirements; for example, email data and financial data. SLPs
can be set up to provide staging behavior. They simplify data management by applying a prescribed behavior to all the backup images that are
included in the SLP. This process allows the NetBackup administrator to leverage the advantages of disk-based backups in the near term. It also
preserves the advantages of tape-based backups for long-term storage.

The beauty of OST is that the “duplicate” feature of storage lifecycle policies can be used to enable Catalyst stores to be copied (using low
bandwidth copy), all under the control of the backup software. Furthermore, multiple copies (or duplicates) can be created at multiple sites to
ensure even better Disaster Recovery capabilities. All copies are logged in the NetBackup catalog.

There are two stages when setting up Symantec OST Duplication/Catalyst store low bandwidth copy via NetBackup Storage Lifecycle Policies.

 First, we need to establish the copy target on another HP StoreOnce appliance (e.g. NetBackup75reptarget) as a new Storage Server and
create the associated Disk Pool and Storage Unit within the NetBackup Master server domain.
This is the same process as described in the previous section, but be sure to establish the Storage Server Name (or IP), which is
displayed in the StoreOnce GUI, before you start.

 Once the NetBackup75reptarget is established, we create a Storage Lifecycle Policy to incorporate the existing StoreOnce Catalyst
backup policy along with one or more “duplicate” commands.

1. From the NetBackup Administration Console under Storage, right click Storage Lifecycle Policies (SLP) and select New Storage
Lifecycle Policy.

2. Give the SLP a name, in this case StoreOnce Catalyst LCP, and then start to specify a sequence of events – choosing between backup and
duplicate functions. As you can see below the first stage is always the backup to a particular storage unit (in this case
Netbackup75LowbandwidthStore) and the backup retention policy (in this case 2 weeks).

3. Additional operations can be added, as shown below. For Catalyst Copy the next operation would be a Duplicate (Catalyst copy) to
another storage unit called Netbackup75reptarget.

Note: The retention period for the duplicate can be different to the retention period for the original backup (in this case 3 months) – so
copies held at a central DR site can be retained for longer than those at the remote site.

4. A second duplicate operation can be added to duplicate the original backup (using low bandwidth technology) to yet another Catalyst
store and finally, if required, a third duplicate operation can be added to copy the backup to physical tape.

5. To implement this Storage Lifecycle policy select StoreOnceCatalyst from the Policies section of the left-hand navigation and then
select StoreOnceCatalystLCP from the Policy storage drop down list in the Change Policy screen. Instead of choosing a specific backup
device for the backup, you are now choosing a storage lifecycle policy that utilizes several devices.

6. Run the StoreOnceCatalyst backup policy, which uses StoreOnceCatalystLCP for its directive, and look at the Activity Monitor. Note how,
from the single policy, we invoke first the Backup and then, some 30 minutes later, the Duplication (Catalyst copy) job. The 30 minutes is
configurable and, ideally, priority should be given to completing all backups first. A future release of NetBackup may allow the SLP
duplicate operations to be scheduled more specifically.

Backup job runs, followed 30 minutes later by a Duplication operation

7. Finally, one of the options when defining the Storage Lifecycle policy is to set the Retention type of the original Backup to Expire after
copy. This means as soon as the backup is copied offsite the original backup can be deleted – thus ensuring minimal backup storage
requirements at the original site where the backup takes place.
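The lifecycle just built (one backup operation, then serial duplicate operations with independent retentions) can be modeled compactly. This is an illustrative data-structure sketch, not NetBackup behavior; the tape storage unit name and the retentions beyond those used in the walkthrough are invented.

```python
from dataclasses import dataclass


@dataclass
class SlpOperation:
    kind: str           # "backup" or "duplicate"
    storage_unit: str
    retention: str


# Mirrors the example SLP: local backup, low-bandwidth Catalyst copy,
# then an optional copy to physical tape (hypothetical name/retention).
slp = [
    SlpOperation("backup",    "Netbackup75LowbandwidthStore", "2 weeks"),
    SlpOperation("duplicate", "Netbackup75reptarget",         "3 months"),
    SlpOperation("duplicate", "TapeStorageUnit",              "7 years"),
]


def valid_slp(operations):
    """An SLP starts with exactly one backup operation; any further
    operations are duplicates, executed serially in list order."""
    return (bool(operations)
            and operations[0].kind == "backup"
            and all(op.kind == "duplicate" for op in operations[1:]))
```

The serial ordering matters: each duplicate waits for its predecessor, which is why keeping the number of duplicates per SLP small eases monitoring.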

Symantec NetBackup 7.x – Recovery from Catalyst copies


One of the major benefits of HP StoreOnce Catalyst is that the backup software tracks all copies of data in the NetBackup catalog. If the
first copy is unavailable (for example, after a disaster on the primary site), the data can be recovered from a copy that exists on the DR site.
Simply search the NetBackup catalog for duplicates as shown below and restore from “Copy 2”.

There is also a useful feature in NetBackup where a Copy can be promoted to Primary copy; the Primary copy is always the default for recovery.

Symantec NetBackup 7.x – Catalyst best practices summary
1. Best practice is to keep similar data in the same Catalyst store. For example, dedicate a StoreOnce B6200, 4210/20, 4420/30 or 2620
Catalyst store to Oracle backups and a different store for SQL server.
2. Use of HP StoreOnce Catalyst within NetBackup requires the addition of Enterprise Disk licenses relevant to the amount of Front End
Terabytes (FETB) being protected by NetBackup on a Catalyst store.
3. The more streams that can be supplied concurrently to a Catalyst store, the better the throughput. For best throughput to a single
Catalyst store, 16 streams is recommended. The source data structure will dictate how many streams are sent in a particular backup
specification. If data is selected for backup from multiple mount points then each mount point will have a disk agent started. It is also
possible to use the backup specifications to select multiple directory selections for backup to produce multiple streams. Within
NetBackup there are several places that control the number of streams configurable; see the NetBackup implementation guide,
available to download here.
4. Backup media servers running multiple streams and deduplication must be sized appropriately, using the rule of thumb described
earlier.
5. Use the HP Storage sizing tool – this is calibrated with the latest test results from HP R&D and will take into consideration data
retention, data change and growth. It will also size any WAN links or show the bandwidth savings when using Catalyst.
6. Catalyst stores must be configured on StoreOnce B6200, 4210/20, 4420/30 or 2620 with the transfer protocol settings either both set
to low bandwidth or both set to high bandwidth.
7. When configuring the storage server, the user name must be the one created on the B6200, 4210/20, 4420/30 or 2620, e.g.
Netbackup75user. The address or IP of the Storage Server where the Catalyst store is located is displayed on the HP StoreOnce GUI. With
StoreOnce B6200 Backup systems this is the VIF (virtual IP address or DNS name) of the data path.
8. When creating the disk pools use the default high and low watermarks.
9. When the storage unit corresponding to the Catalyst store is configured ensure the stream count is > 16 and the NetBackup Fragment
size is the maximum possible.
10. Backup policies should have the Allow multiple data streams tick box enabled and the Take checkpoints every x mins section completed,
especially if used with StoreOnce B6200, which supports autonomic failover. This enables backups to restart close to where they were
interrupted in the event of failover being invoked on the StoreOnce B6200.
11. Expired backups are deleted from the StoreOnce Catalyst store at intervals whenever the Image Cleanup job in NetBackup runs.
However, the space will not be reclaimed until the associated Housekeeping within the HP StoreOnce Backup system has run.
Housekeeping is configurable to run at up to two periods within any 24 hour period (by the use of blackout windows) and should be
allowed to run during periods of inactivity (not when backups or duplication are running).

12. Use the Symantec Storage Lifecycle policy function to move Catalyst data to another device - typically offsite. Remember if multiple
duplicates are specified they happen in serial fashion, not in parallel. The default period between backup and duplication or subsequent
duplications is 30 minutes. This might cause a high workload delaying the total backup job’s completion time. If you wish to extend the
30 minute timeout, or change the trigger point for replication jobs, create and edit the LIFECYCLE_PARAMETERS file stored in
C:\Program Files\Veritas\NetBackup\db\config, as shown below.

The file contains three parameters:

MIN_GB_SIZE_PER_DUPLICATION_JOB <N>: Adjusting this value, indicated in gigabytes, affects the number and size of duplication
jobs. If this setting is small, more duplication jobs are created. If it is large, fewer, larger duplication jobs are created.
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB <N>: Reducing this value allows duplication jobs to be submitted that do not
meet the minimum size criterion. The default value is 30 minutes.
DUPLICATION_SESSION_INTERVAL_MINUTES <N>: This parameter indicates how frequently nbstserv checks whether enough backups
have completed and decides whether or not it is time to submit duplication job(s).
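Putting the three parameters together, a LIFECYCLE_PARAMETERS file tuned so that small duplication jobs are forced through quickly might look as follows (the parameter names are as documented above; the values shown are illustrative choices, not NetBackup defaults):

```
MIN_GB_SIZE_PER_DUPLICATION_JOB 8
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 2
DUPLICATION_SESSION_INTERVAL_MINUTES 5
```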

In the above example any duplication jobs < 8GB are forced to duplicate within two minutes.

Additional duplication commands are also available and a full inspection of the NetBackup Administration Guide is recommended. (It is
anticipated that this area of duplication will be simplified by a scheduling process in a future release of NetBackup.)

Storage Lifecycle duplicate jobs will fail (but be retried according to parameters in the LIFECYCLE_PARAMETERS file) if the replication
blackout window is set on the HP StoreOnce Backup appliance. So, ultimately, the StoreOnce appliance has control over the replication
schedule.
13. The number of duplicate jobs within a single storage lifecycle policy should be kept to a maximum of 4 for ease of monitoring. Duplicate
jobs can be monitored in the HP StoreOnce Backup systems using the Inbound and Outbound Copy Jobs tabs of the Catalyst store in
the StoreOnce GUI. A better way of monitoring Catalyst replication is to use HP Replication Manager 2.1, which is supplied free of charge
when a VTL or NAS replication license is purchased or when an HP StoreOnce Catalyst license is purchased.
14. Depending on customer requirements the Retention period can be set from weeks to months or expire upon duplication within the
Storage Lifecycle Policy.
15. StoreOnce Catalyst within NetBackup does not use Active Directory services for access control.
16. In a DR situation an HP StoreOnce Catalyst user has many options available which make DR much more efficient.
a. In a single Master Server environment no catalog import is required to recover from the duplicated copies of backups on
another site; the user can recover the data from “Copy2” found in the NetBackup Catalog search, if the primary copy has been
deleted or the device is unreachable.
b. Catalyst stores can be added into a Storage Unit group, which allows automated re-direction of backups if a specific Catalyst
store is unreachable. This improves high availability because the backup can be redirected to any other available device in the
Storage Unit group.
c. In a multiple Master Server environment the AIR (Automated image replication) function can be used to transfer Catalyst store
backup information between different catalogs on different Master Servers. This feature is expected to be included in the next
revision of the HP OST plug-ins for NetBackup.

Integrating StoreOnce Catalyst with Symantec Backup Exec
In this section we shall describe:

 How StoreOnce Catalyst stores are represented in Symantec Backup Exec


 How to create a Backup Exec device for backup to a StoreOnce Catalyst store
 How to configure a backup job in Backup Exec
 How to implement a Catalyst Copy in Symantec Backup Exec
 How to recover from a Catalyst Copy in Symantec Backup Exec
 Best practices when using StoreOnce Catalyst with Symantec Backup Exec

Representation of HP StoreOnce Catalyst stores in Symantec Backup Exec


The implementation of HP StoreOnce Catalyst in Symantec Backup Exec 2012 is similar to the NetBackup implementation.
 The same Catalyst license is used for Symantec NetBackup and Backup Exec on HP StoreOnce B6200/42xx/44xx/2620
 Catalyst stores should have the same transfer protocol set for primary and secondary (high-bandwidth or low-bandwidth for both
protocols - only HP Data Protector supports the “mix” of low and high transfer protocols within the same Catalyst store).

 HP StoreOnce Catalyst is represented in Symantec Backup Exec as an Open Storage device and requires the HP-OST plug-ins for
Symantec Backup Exec to be installed on every media server that requires access to an HP Catalyst store. Plugins are downloadable
from here.

The Backup Exec implementation of HP StoreOnce Catalyst has the following features:

 The correct OST plug-in must be downloaded. There is a single HP OST plug-in for Windows.
 A Deduplication Option license is required for Backup Exec.

 The logon credentials for the HP StoreOnce Catalyst device must be an actual user configured in Active Directory and MUST
match the user configured on the StoreOnce Backup system for client access control.

Device Configuration in Backup Exec 2012


Please refer to the section entitled HP StoreOnce Catalyst: Configuration, Display and Set-up on page 107 to configure a Catalyst store before configuring it in Backup Exec. Primary and secondary transfer policies must both be set to low bandwidth. In our example the Catalyst store is called CatalystStore1.
You must also configure a client user for your Catalyst store, which we will call beclient1. Finally, there must be an entry in Active Directory for a user called beclient1.
1. Click the Configure Storage icon in the top section of the Backup Exec window.

2. Select Network Storage and click Next.

3. Select OpenStorage and click Next.

4. Provide a Name and Description for the OpenStorage device.

5. Select a Provider for the OpenStorage device.
Select the hp-StoreOnceCatalyst OpenStorage device. (If this is not displayed, the OST plug-ins for HP StoreOnce Catalyst have not been
loaded.)

6. Enter the connection information for the OpenStorage device. This is the IP address or FQDN of the StoreOnce appliance on which the
Catalyst store has been created. (If this is an HP StoreOnce B6200 Backup system, be sure to provide the Data Path VIF.) In our example,
we shall use b6200ss3.nearline.local. A client access user called beclient1 has previously been created on the HP StoreOnce appliance and in Active Directory for the domain.

7. Click Add/Edit – you will be prompted to create a logon account.
NEARLINE\beclient1 is the user in Active Directory (see below) and beclient1 is also the client configured on the HP StoreOnce appliance
(see below). The password can be anything you choose. Click OK.

If required, you can check that your user has been created as follows.

This screen shows BE Client1 in Active Directory.

Below we can see an extract where beclient1 has already been created on the StoreOnce Appliance.

8. We must now set the account name NEARLINE\beclient1 as the default logon account because the System logon account cannot be used
for OpenStorage devices. Select NEARLINE\beclient1 as the logon account and click Set as Default. Click on OK.

9. Click Next. (The error message in our example occurred because we were using an Administrator logon, but can be ignored because we
have now created a new user beclient1.)

10. Select a storage location on the StoreOnce appliance.
Select CatalystStore1, which we have already created on the source StoreOnce appliance, and click Next.

11. Set the number of concurrent operations to 16 to enable maximum throughput and click Next.

12. The Storage Configuration Summary is displayed. Click Finish if you are satisfied with the details (or Back to make changes).

13. The device is configured successfully and available for backup. Services must be restarted to bring the device online. Click Yes.

14. The device comes online in the storage section of the Backup Exec GUI.

The configuration of the Catalyst store can be edited by clicking the icon shown above in the Storage section. Note how the deduplication ratio is
now available directly to the backup software.

The data stream size should not affect Catalyst stores because there is buffering in the Catalyst client. The span size (or max object size) likewise has no real impact with Catalyst stores because Catalyst can support as many objects as needed; there is no 25,000-file limit as with NAS. HP therefore recommends leaving these values at their defaults within Symantec Backup Exec.

Configuring a backup Job in Backup Exec 2012


1. Click the Backup and Restore icon in the top banner of the GUI, then Backup. Select the simple Backup to Deduplication Disk Storage.

2. A default backup job is displayed, which you can edit in terms of data to be backed up and devices to be used. The default is Full &
Incremental and any disk storage.

3. Click on the Edit button in the Backup section (above). Click on Storage in the left-hand navigation and select our Backup Exec Catalyst
store from the drop down list.

4. Click on the Edit button in the Source System section (see step 2 screenshot) if you want to be more selective about which data is
backed up.

5. Click OK to complete the backup job.

6. You can use the One Time backup method to test the interface, if required.

Once the backup has completed successfully the representation of the Backup Exec backup in the HP StoreOnce Catalyst store is a little unusual,
as illustrated below.

Entries in the HP StoreOnce Catalyst store are known as items. A set of BEOST control files is written; these are constantly referenced and updated. The Backup Exec implementation of OST is similar to a virtual tape emulation, with a control file emulating drives and robots, an inquiry string, and additional entries representing the slots of a tape library. The main full backup occupies a single entry (analogous to a tape in a slot), and other item entries represent slots allocated to future incremental backup jobs. An item can have no "user data size" but can have a metadata size, as shown below.

Catalyst Copy Implementation in Symantec Backup Exec (Duplicate)


The Catalyst Copy command in Symantec Backup Exec is executed not by Object Copy (DP) or Storage Lifecycle policy (NetBackup) but by means
of a variant of the initial backup configuration (see below) called Back Up to Deduplication Disk Storage and then Duplicate to Deduplication
Disk Storage as shown below.

Before we can put this to the test, we create another Catalyst store, called CatalystStore2, on another site to demonstrate the low bandwidth copy between sites. It is essential that the client we created, beclient1, is also configured as a client with access to the second Catalyst store; otherwise the backup software controlled copy (optimized duplication in Backup Exec terminology) will fail.
1. Once created, the second Catalyst store comes online to Backup Exec when viewed through the storage tab.

2. Returning to the Backup and Duplicate Properties screen, you can see we now have backup data to select. The backup job is configured to Run Now to the Backup Exec Catalyst store. The Duplicate operation is scheduled to run immediately after the backup completes, to Backup Exec Catalyst store 2.
These can all be changed by using the Edit box associated with each operation. Note the difference in retention periods between Backup
and Duplicate.

3. The backup runs as previously and, when complete, the Duplicate operation kicks in straight away. The first time duplication occurs, 100% of the data must be replicated (sometimes called seeding). Subsequent duplications complete much faster once the initial seeding is done.
The Backup and Duplicate functions run as two separate jobs.

4. The Backup and Duplicate is recorded in the Job Log, as shown below (double click an entry to get more details).

5. The Duplicate job can also be “scheduled” to occur by changing the options in the Duplicate stage as shown below – tick According to
schedule. The Replication blackout windows set on the StoreOnce Appliance override the duplicate schedule set on the duplicate job.

6. By clicking Add Stage in the Duplicate section, further multi-hop duplication stages can be added (see Duplicate 2 below).

If a physical tape library is connected to the Backup Exec Domain, the Duplicate 2 Edit function can be used to direct the second duplication to a physical tape device, thereby implementing a disk-to-disk-to-tape solution. The data from StoreOnce would, of course, be rehydrated before being copied to physical tape.

The CASO (Central Admin Server Option) in Backup Exec allows duplication to take place across Backup Exec Domains.
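The note in step 3 that the first duplication must replicate 100% of the data means the initial seeding time is governed by the WAN link. A rough rule-of-thumb sketch follows; the backup size, link speed and 80% efficiency factor are illustrative assumptions, not measured values.

```python
def seeding_hours(backup_gb, wan_mbit_per_s, efficiency=0.8):
    """Hours needed to replicate an initial full backup over a WAN
    link, assuming only a fraction of nominal bandwidth is usable."""
    usable_mb_per_s = wan_mbit_per_s / 8 * efficiency  # Mbit/s -> MB/s
    return backup_gb * 1024 / usable_mb_per_s / 3600

# e.g. a 500 GB first full backup over a 44.736 Mbit/s (T3) link:
print(round(seeding_hours(500, 44.736), 1))  # about 31.8 hours
```

Estimates like this help decide whether to seed over the WAN or use one of the alternative seeding methods (co-location, physical tape) covered earlier in this guide.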

Symantec Backup Exec – Recovery from Catalyst copies


This process is not intuitive. When searching the Backup Exec catalog for files to recover there is no mention of multiple copies; Backup Exec is
trying to hide the complexity from the user.
To test this feature:
1. We performed a backup to Catalyst Store 1, followed immediately by a duplicate to Catalyst Store 2. We then searched the backup
catalog and selected a file to restore.

2. We performed a test restore of a file from the C:/BackupDataSets directory (shown above) and confirmed, using the StoreOnce GUI, that the restore was taking place from Catalyst store 1.
3. The restore worked successfully.

4. We then looked for a way to remove from the Backup Exec catalog the C:/BackupDataSets backups on Catalyst store 1 only, to prove that we can recover from the duplicate copy.

We can view these through the Backup Sets perspective (see next screenshot) but the only insight we have as to which Catalyst store
these backup sets are on is via the expiration date.

(4-week retention = Catalyst store 2; 2-week retention = Catalyst store 1)

Alternatively we can view Backups residing on particular storage using the Storage view of Catalyst store 1 and Catalyst store 2.

Click on Catalyst store 1.

We have deleted the large 22 GB backup of C:\BackupDataSet from Catalyst store 1 for this trial.

5. If we now re-run the restore job, it fails because the data it is trying to restore is no longer on Catalyst store 1 (Copy 1 or Primary
version).

6. So now we have to run the restore wizard again and re-specify the criteria.

7. Backup Exec keeps all the catalog information for each restore job but, because the backup data has been removed (artificially) in this example, the restore job fails. In a real-world scenario, if the restore had taken place after Dec 25th (when the copy of the backup expired on Catalyst store 1), the restore would automatically have come from Catalyst store 2 (see the expiration dates above). We are effectively forcing this to happen quickly in this worked example, so we have to redefine the restore job, since we have removed the backup but not the entry from the catalog.

8. This time in the Restore Wizard (see below) we can see Copy 2, time-stamped at 2:45 (the original backup, Copy 1, was time-stamped at 2:34). We select this backup, which exists on Catalyst store 2, to restore.

The copy on Catalyst store 2 is visible but not explicitly identified as Catalyst store 2.

9. This modified restore operation now completes successfully, and even indicates it is being restored from Catalyst store 2!

10. To be certain, we use the Activity reporting within the StoreOnce GUI to observe data being read from Catalyst store 2 (as seen below).

Restore from Catalyst store 2

Image Cleanup
Backup Exec has a concept of Data Lifecycle Management (equivalent to the Image Cleanup utility that runs in NetBackup). This process is automatic and follows the rules outlined below. As a result, Backup Exec removes expired data from the HP StoreOnce Catalyst stores automatically and proactively, releasing the space back to the free pool for future backups.

Figure 40: Data Lifecycle Management Fundamentals

Symantec Backup Exec 2012 – Catalyst best practices summary
1. Best practice is to keep similar data in the same Catalyst store. For example: dedicate a StoreOnce B6200/42xx/44xx/2xxx Catalyst
store to File/Print data and dedicate a different store for SQL server in order to get the best deduplication ratios.

2. Use of HP StoreOnce Catalyst within Symantec Backup Exec requires the addition of a deduplication license in Backup Exec for each
media server that is involved with deduplication as well as Front End Terabyte licenses (FETB) for the amount of data being protected by
Backup Exec on a Catalyst store.

3. The HP StoreOnce appliance also requires an additional Catalyst license to support this advanced functionality on every appliance.

4. Backup Exec 2012 SP1a must be loaded for HP StoreOnce Catalyst licenses to be properly recognized.

5. The default in Backup Exec is for Verify to be turned on, which significantly extends the backup and replication times. Consider disabling this to reduce backup and replication windows.

6. The more streams that can be supplied concurrently to a Catalyst store, the better the throughput. For best throughput, 16 or more streams are recommended.

7. Backup media servers running multiple streams and deduplication must be sized appropriately using the “rule of thumb” mentioned in
this guide.

8. Use the HP Storage sizing tool – this is calibrated with the latest test results from HP R&D and will take into consideration data
retention, data change and growth. It will also size any WAN links or show the bandwidth savings when using Catalyst.

9. Catalyst stores must be configured on the HP StoreOnce Backup system with the transfer protocol settings either both set to low
bandwidth or both set to high bandwidth.

10. In Backup Exec, to set the concurrency level of the Catalyst store and the data stream split level, edit the Catalyst store parameters as shown below under "Storage" and "Properties" in the left-hand navigation pane. The data stream size should not affect Catalyst stores because there is buffering in the Catalyst client. The span size (or max object size) likewise has no real impact with Catalyst stores because Catalyst can support as many objects as needed; there is no 25,000-file limit as with NAS. HP therefore recommends leaving these values at their defaults within Symantec Backup Exec.

11. If you are using Backup Exec Catalyst with an HP StoreOnce B6200, which supports failover, it is advisable to enable checkpoint restart by using the Edit feature on each backup job and then selecting the Advanced Open File Options in the left-hand navigation.

12. Similarly, for the Duplicate stage of the process you can set different retention periods, as shown below, by editing the Duplicate function and then selecting Storage in the left-hand navigation pane. More importantly, in this section the Compression and Encryption type settings should be set to None: Catalyst replication uses its own compression, and encryption should not be applied because it will reduce the deduplication ratio.

StoreOnce Catalyst low bandwidth backups over high latency links

One of the primary usage models for HP StoreOnce Catalyst is that remote sites (ROBO) can, for the first time, send a backup directly to an HP StoreOnce appliance at a central site over a WAN. This is possible because the volume of data actually sent across the WAN is small: the majority of the deduplication work takes place on the media server, so only unique data needs to be transmitted to the central site.
The chart below shows the impact of latency on HP StoreOnce Catalyst low bandwidth backups with a 1% data change rate, using a single-stream low bandwidth Catalyst backup at various link speeds from T1 to T5.

[Chart: HP StoreOnce Catalyst perceived throughput (99% deduplicated writes, MB/s) versus link latency of 0, 50, 100, 150 and 200 ms, with one curve per link speed: 1.544 Mbit/s (T1), 6.312 Mbit/s (T2), 44.736 Mbit/s (T3), 274.176 Mbit/s (T4) and 400.352 Mbit/s (T5).]

Figure 41: Catalyst low bandwidth backups over high latency links

You can see the dramatic impact on backup performance as the latency increases up to and beyond 50 ms. HP StoreOnce Catalyst is designed to tolerate high latency links without "dropping out", but the main concern is ensuring that backups can complete in the appropriate window over high latency links. Within a particular country or state most WAN links can be guaranteed to have latency below 50 ms, but when traffic is switched between different networks, or goes intercontinental, latency can start to creep up.
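The shape of the curves in Figure 41 can be approximated with a simple stop-and-wait model: each payload must be transmitted and acknowledged before the next is sent, so the round-trip latency is paid once per payload. This is an illustrative model only, not a description of the actual Catalyst protocol, and the link speed and latency figures below are assumed values.

```python
def effective_throughput(segment_mb, link_mb_per_s, rtt_s):
    """Stop-and-wait model: each segment is transmitted and then
    waits one round trip for acknowledgement before the next is sent."""
    transmit_s = segment_mb / link_mb_per_s
    return segment_mb / (transmit_s + rtt_s)

# On a 50 MB/s link with a 200 ms round trip, doubling the payload
# size amortizes the per-segment round trip and raises throughput.
print(round(effective_throughput(10, 50, 0.2), 1))  # 25.0 MB/s
print(round(effective_throughput(20, 50, 0.2), 1))  # 33.3 MB/s
```

This is why a larger per-transfer payload helps on high latency links, as discussed below.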

A best practice in these situations is not to throttle the bandwidth (e.g. to 50% of a T1 line for the Catalyst backup), but to apply a Quality of Service policy that gives the Catalyst low bandwidth backup the full available bandwidth for a pre-defined period when other T1 traffic is not critical (out of business hours).

There are ways to improve Catalyst low bandwidth backup throughput under these high latency circumstances. A parameter called the Catalyst segment size is effectively the payload size used between Catalyst data transfers. The default value is 10 MB, but in high latency situations (>100 ms), where acknowledgement times increase, it is advisable to increase the Catalyst segment size to 20 MB.

This can be done as follows:

1. For HP Data Protector, set OB2D2D_BANDWIDTH_BUFF_SIZE=20 in the omnirc Data Protector configuration file.

2. For Symantec NetBackup and Backup Exec it is necessary to change the parameter in the OST 2.x plug-in. The plug-in's behavior is controlled by the hpost.conf file, which lives at:
 %SystemRoot%\Program Files\Hewlett-Packard\OpenStorage20\config (for Windows)
 /usr/openv/hp/ost20/config (for Linux)

The parameter to change is:

LBBUFFERSIZE:<server>:<store name>:<MBs>

This controls the size of the buffer in MB for low bandwidth operations. The default is 10 MB. For example:

LBBUFFERSIZE:10.11.220.8:Store3:20

where <server> is the media server IP address and <store name> is the Catalyst store name the user has chosen.
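As an illustrative sketch of adding this entry on a Linux media server (the IP address and store name are the example values above; the file is written to a local copy here rather than the real plug-in path, which is /usr/openv/hp/ost20/config/hpost.conf):

```shell
# Append the low-bandwidth buffer entry (raising it to 20 MB) to a
# local copy of hpost.conf, then show the resulting line.
CONF=./hpost.conf
echo 'LBBUFFERSIZE:10.11.220.8:Store3:20' >> "$CONF"
grep '^LBBUFFERSIZE' "$CONF"
```

After editing the file, restart the backup services on the media server so the plug-in re-reads its configuration.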

For more information

To read more about the HP StoreOnce Backup system, go to www.hp.com/go/storeonce

The following documents [and websites] provide related information:

 HP StoreOnce Backup system CLI Reference Guide (PDF): This guide describes the StoreOnce CLI commands and how to use them.
 HP StoreOnce Backup system User Guide (PDF): This guide describes the StoreOnce GUI and how to use it.
 HP StoreOnce Linux and UNIX Configuration Guide (PDF): This guide contains information about configuring and using HP StoreOnce
Backup systems with Linux and UNIX.

You can find these documents on the Manuals page of the HP Business Support Center website:
http://www.hp.com/support/manuals

Go to Storage – Disk Storage Systems – StoreOnce Backup.


© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained


herein is subject to change without notice. The only warranties for HP products and services
are set forth in the express warranty statements accompanying such products and services.
Nothing herein should be construed as constituting an additional warranty. HP shall not be
liable for technical or editorial errors or omissions contained herein.

BB852-90925, Created March 2013
