Contents
Abstract 5
Related products 5
Validity 5
Executive summary 6
General StoreOnce best practices at a glance 6
VTL best practices at a glance 7
NAS best practices at a glance 7
Catalyst best practices at a glance 7
HP StoreOnce Technology 8
Key factors for performance considerations with deduplication that occurs on the StoreOnce Backup system 8
StoreOnce Catalyst overview 8
Key features 9
VTL and NAS Replication overview 9
Housekeeping overview 10
Backup Application considerations 11
Multi-stream or Multiplex 11
Use multiple backup streams 11
Data compression and encryption backup application features 12
Network and Fibre Channel best practices 15
Network configuration guidelines 15
Single Port configurations 16
Dual Port configurations 17
Bonded port configurations (recommended) 18
10GbE Ethernet ports on StoreOnce 4420/4430 Backup systems 19
Network configuration for CIFS AD 19
Fibre Channel configuration guidelines 23
Switched fabric 23
Direct Attach (Private Loop) 24
Zoning 24
Use soft zoning for high availability 25
Diagnostic Fibre Channel devices 26
StoreOnce Catalyst configuration guidelines 28
Catalyst technology 28
Generic sizing rule for media servers running Catalyst API 29
Catalyst Copy 29
Maximum concurrent jobs and blackout windows 30
Configuring client access 32
For more information 32
VTL configuration guidelines 33
Summary of best practices 33
Tape library emulation 33
Emulation types 33
Cartridge sizing 34
Number of libraries per appliance 34
Backup application configuration 35
Blocksize and transfer size 35
Rotation schemes and retention policy 35
Retention policy 35
Rotation scheme 35
StoreOnce NAS configuration guidelines 37
Introduction to StoreOnce NAS backup targets 37
Overview of NAS best practices 37
Shares and deduplication stores 37
Maximum concurrently open files 38
Backup application configuration 38
Backup file size 38
Disk space pre-allocation 39
Block / transfer size 40
Concurrent operations 40
Buffering 40
Overwrite versus append 40
Compression and encryption 40
Verify 41
Synthetic full backups 41
CIFS share authentication 41
StoreOnce Replication 42
StoreOnce VTL and NAS replication overview 42
Best practices overview 42
Replication usage models (VTL and NAS only) 43
What to replicate 45
Appliance, library and share replication fan in/out 46
Concurrent replication jobs 46
Apparent replication throughput 46
What actually happens in replication? 47
Limiting replication concurrency 47
WAN link sizing 47
Seeding and why it is required 48
Seeding methods in more detail 50
Seeding over a WAN link 50
Co-location (seed over LAN) 52
Floating StoreOnce appliance method of seeding 54
Seeding using physical tape or portable disk drive and backup application copy utilities 56
Replication and other StoreOnce operations 58
Replication blackout windows 58
Replication bandwidth limiting 58
Source Appliance Permissions 59
Replication and Catalyst monitoring 60
Configurable Synchronisation Progress Logging and Out of Sync Notification 60
Activity monitor 61
Replication throughput totals 61
Catalyst throughput totals 62
Using HP Replication Manager to monitor replication and Catalyst copy status 63
Housekeeping monitoring and control 65
Terminology 65
Tape Offload 69
Terminology 69
Direct Tape Offload 69
Backup application Tape Offload/Copy from StoreOnce Backup system 69
Backup application Mirrored Backup from Data Source 69
Tape Offload/Copy from StoreOnce Backup system versus Mirrored Backup from Data Source 69
When is Tape Offload Required? 69
Catalyst device types 70
VTL and NAS device types 71
Key performance factors in Tape Offload performance 72
Summary of Best Practices 73
Appendix A Key reference information 74
Appendix B – Fully Worked Example 76
Hardware and site configuration 76
Backup requirements specification 77
Remote Sites A/D 77
Remote sites B/C 77
Data Center E 77
Using the HP Storage sizing tool 78
Configure replication environment 78
Remote Sites A/D 79
Remote sites B/C 85
Data Center E 87
Sizing Tool output 90
Understanding the HTML output from the Sizing Tool 90
Configure StoreOnce source devices and replication target configuration 95
Sites A and D 95
Sites B and C 95
Site E 95
Map out the interaction of backup, housekeeping and replication for sources and target 96
Tune the solution using replication windows and housekeeping windows 97
Worked example – backup, replication and housekeeping overlaps 98
Catalyst Sizing example 102
StoreOnce Catalyst support in the Sizing Tool 102
Worked Example 102
Appendix C: Guidelines on integrating HP StoreOnce with HP Data Protector 7, Symantec NetBackup 7.x and Symantec Backup Exec 2012 106
HP StoreOnce Catalyst: Configuration, Display and Set-up 107
Status tab 107
Settings tab 107
Permissions tab (per store) 108
Catalyst stores 108
Data stored within Catalyst 109
Catalyst copy jobs 109
Examples of NetBackup Data Job entries 109
Examples of HP Data Protector Item entries 110
Examples of Backup Exec Item entries 111
Catalyst Implementation in HP Data Protector 7 112
Integrating HP Data Protector 7 with StoreOnce Catalyst 112
HP Data Protector gateways 112
Deduplication types with the explicit gateway 113
Key Points: 113
Example scenario 114
Configuring Data Protector 115
Key Points: 117
Creating a Data Protector Specification for backup to a StoreOnce Catalyst store 117
Backup using source-side deduplication (implicit gateway) 118
Backup using server-side deduplication (explicit gateway) 118
Selecting gateways in HP Data Protector 119
Catalyst Copy Implementation in HP Data Protector 7 (Object Copy) 119
Key Points: 121
Setting up Object Copy 121
HP Data Protector 7 – Catalyst best practices summary 124
HP StoreOnce Catalyst stores and Symantec products 126
StoreOnce Catalyst implementation with Symantec NetBackup 128
Integrating HP StoreOnce Catalyst stores with Symantec NetBackup 128
Configuring a StoreOnce Catalyst store in Symantec NetBackup 129
Catalyst Copy Implementation in Symantec NetBackup (Storage Lifecycle policy – duplicate) 132
Symantec NetBackup 7.x – Recovery from Catalyst copies 136
Symantec NetBackup 7.x – Catalyst best practices summary 137
Integrating StoreOnce Catalyst with Symantec Backup Exec 139
Device Configuration in Backup Exec 2012 140
Configuring a backup Job in Backup Exec 2012 147
Catalyst Copy Implementation in Symantec Backup Exec (Duplicate) 150
Symantec Backup Exec – Recovery from Catalyst copies 153
Image Clean up 156
Symantec Backup Exec 2012 – Catalyst best practices summary 157
Catalyst Low Bandwidth backups over High Latency Links 159
Index 161
For more information 164
Abstract
The HP StoreOnce Backup system products with Dynamic Data Deduplication are Virtual Tape library, NAS share and Catalyst store appliances
designed to provide a cost-effective, consolidated backup solution for business data and fast restore of data in the event of loss.
In order to get the best performance from an HP StoreOnce Backup system there are some configuration best practices that can be applied. These
are described in this document.
Related products
Information in this document relates to the following products:
Product Generation Product Number
HP StoreOnce 2610 iSCSI Backup G3 N/A
HP StoreOnce 2620 iSCSI Backup G3 BB852A
HP StoreOnce 4210 iSCSI Backup G3 BB853A
HP StoreOnce 4210 FC Backup G3 BB854A
HP StoreOnce 4220 Backup G3 BB855A
HP StoreOnce 4420 Backup G3 BB856A
HP StoreOnce 4430 Backup G3 BB857A
Validity
For G3 products, this document is the equivalent of “HP StorageWorks D2D Backup System Best Practices for Performance Optimization” (HP Document Part Number EH990-90921). Please note that EH990-90921 is only relevant to the older G1 and G2 StorageWorks D2D product family.
Note: G3 products run software versions 3.x.x; G2 and G1 products run software versions 2.x.x and 1.x.x respectively.
Best practices identified in this document are predicated on using up-to-date StoreOnce system software (check www.hp.com/support for available software upgrades). In order to achieve optimum performance after upgrading from older software there may be some prerequisite steps; see the release notes that are available with the software download for more information.
Executive summary
This document contains detailed information on best practices to get good performance from an HP StoreOnce Backup system with HP StoreOnce
Deduplication Technology.
HP StoreOnce Technology is designed to increase the amount of historical backup data that can be stored without increasing the disk space
needed. A backup product using deduplication combines efficient disk usage with the fast single file recovery of random access disk and also
enables the use of low bandwidth replication to provide a very cost-effective disaster recovery solution.
As a quick reference these are the important configuration options to take into account when designing a backup solution.
VTL best practices at a glance
Make use of multiple network or Fibre Channel ports throughout your storage network to eliminate bottlenecks.
For FC configurations, split virtual tape libraries and drives across multiple FC ports (FC VTL is available on StoreOnce B6200, 4210 FC, 4220, 4420 and 4430 models).
Configure multiple VTLs and separate data types across them; for example SQL to VTL1, File to VTL2, and so on.
Configure larger “block sizes” within the backup application to improve performance.
Disable any multiplexing configuration within the backup application.
Disable any compression or encryption of data before it is sent to the StoreOnce appliance.
Best throughput is achieved with multiple streams, the actual number per device/appliance varies by model – see Appendix A.
Schedule physical tape offload/copy operations outside of other backup, replication or housekeeping activities.
As with other device types, the best deduplication ratios are achieved when similar data types are sent to the same device.
Catalyst best practices at a glance
Best throughput is achieved with multiple streams; the actual number per device/appliance varies by model. Because Catalyst stores can act as both a backup target and an inbound copy target, the maximum value applies to the two target types combined (although inbound copy jobs would not normally run at the same time as backups) – see Appendix A.
Although Catalyst copy is controlled by the backup software, the copy blackout window overrides the backup software scheduling. Check for
conflicts.
The first Catalyst low bandwidth backup will take longer than subsequent low bandwidth backups because a seeding process has to take
place.
If you are implementing multi-hop or one-to-many Catalyst copies, remember that these copies happen serially not in parallel.
Ensure the backup clean-up scripts that regularly check for expired Catalyst Items run at a frequency that avoids using excessive storage to
hold expired backups (every 24 hours is recommended).
There are several specific tuning parameters dependent on backup application implementation – please see Appendix C for more details.
HP StoreOnce Technology
A basic understanding of the way that HP StoreOnce Technology works is necessary in order to understand factors that may impact performance
of the overall system and to ensure optimal performance of your backup solution.
HP StoreOnce Technology is an “inline” data deduplication process. It uses hash-based chunking technology, which analyzes incoming backup
data in “chunks” that average 4K in size. The hashing algorithm generates a unique hash value that identifies each chunk and points to its location
in the deduplication store.
Hash values are stored in an index that is referenced when subsequent backups are performed. When data generates a hash value that already
exists in the index, the data is not stored a second time, but rather a count is increased showing how many times that hash code has been seen.
Unique data generates a new hash code and that is stored on the appliance. Typically about 2% of every new backup is new data that generates
new hash codes.
With Virtual Tape Library and NAS shares, deduplication always occurs on the StoreOnce Backup system. With Catalyst stores, deduplication may
be configured to occur on the media server (recommended) or on the StoreOnce Backup system.
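The chunk-index-and-reference-count mechanism described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: fixed-size chunks, a SHA-256 index and in-memory dictionaries, whereas the real system uses variable-size chunks averaging 4K and on-disk stores; all class and method names here are illustrative, not StoreOnce APIs.

```python
import hashlib
from collections import defaultdict

class DedupStore:
    """Toy model of hash-based inline deduplication: the incoming stream is
    split into chunks, each chunk is indexed by its hash, and a chunk whose
    hash is already in the index is counted rather than stored again."""

    def __init__(self, chunk_size=4096):      # real chunks *average* 4K and
        self.chunk_size = chunk_size          # are variable-size in practice
        self.chunks = {}                      # hash -> chunk data
        self.refs = defaultdict(int)          # hash -> reference count

    def backup(self, stream):
        recipe = []                           # ordered hash list: the "backup"
        for i in range(0, len(stream), self.chunk_size):
            chunk = stream[i:i + self.chunk_size]
            h = hashlib.sha256(chunk).digest()
            if h not in self.chunks:          # unseen chunk: store the data
                self.chunks[h] = chunk
            self.refs[h] += 1                 # every reference is counted
            recipe.append(h)
        return recipe

    def restore(self, recipe):
        # Restore rehydrates the original stream from the stored chunks,
        # which is why restores read many scattered chunks.
        return b"".join(self.chunks[h] for h in recipe)
```

Backing up the same data twice stores the chunk data only once; the second backup merely increments reference counts, which is the source of the deduplication ratio.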
Key factors for performance considerations with deduplication that occurs on the StoreOnce Backup system
The inline nature of the deduplication process means that it is a very processor and memory intensive task. HP StoreOnce appliances have been
designed with appropriate processing power and memory to minimize the backup performance impact of deduplication.
Best performance will be obtained by configuring a larger number of libraries/shares/Catalyst stores with multiple backup streams to each device, although this involves a trade-off with the overall deduplication ratio.
o If servers with lots of similar data are to be backed up, a higher deduplication ratio can be achieved by backing them all up to
the same library/share/Catalyst store, even if this means directing different media servers to the same data type device
configured on the StoreOnce appliance.
o If servers contain dissimilar data types, the best deduplication ratio/performance compromise will be achieved by grouping servers with similar data types together into their own dedicated libraries/shares/Catalyst stores. For example, a requirement to back up a set of Exchange servers, SQL database servers, file servers and application servers would be best served by creating four virtual libraries, NAS shares or Catalyst stores; one for each server data type.
The best backup performance to a device configured on a StoreOnce appliance is achieved using somewhere below the maximum number of streams per device (the maximum number of streams varies between models – see Appendix A for more details and the section StoreOnce performance explained).
When restoring data, a deduplicating device must reconstruct the original un-deduplicated data stream from all of the data chunks contained in the deduplication stores. This can result in lower performance than that of the backup process (typically 80% of backup performance). Restores also typically use only a single stream.
Full backup jobs will result in higher deduplication ratios and better restore performance. Incremental and differential backups will not
deduplicate as well.
*Actual performance is dependent upon the specific StoreOnce appliance, configuration, data set type, compression levels, number of data
streams, number of devices and number of concurrent tasks, such as housekeeping or replication.
StoreOnce Catalyst overview
All HP StoreOnce Backup systems can support Catalyst stores, Virtual Tape Libraries and NAS (CIFS/NFS) shares on the same system, which makes them ideal for customers who have legacy requirements for VTL and NAS but who wish to move to HP StoreOnce Catalyst technology.
HP StoreOnce Catalyst stores do require a separate license on both source and target; VTL/NAS devices only require licenses if they are replication
targets.
Key features
The following are the key points to be aware of with StoreOnce Catalyst:
Optional deduplication at the backup server enables greater overall StoreOnce appliance performance and reduced backup bandwidth
requirements. This can be controlled at backup session/job level.
HP StoreOnce Catalyst enables advanced features such as duplication of backups between appliances in a network-efficient manner under
control of the backup application.
Catalyst stores can be copied using low-bandwidth links – just like NAS and VTL devices. The key difference here is that there is no need to
set up replication mappings (required with VTL and NAS); the whole of the Catalyst copy process is controlled by the backup software itself.
HP StoreOnce Catalyst enables space occupied by expired backups to be returned for re-use in an automated manner because of close integration with the backup application.
HP StoreOnce Catalyst enables asymmetric expiry of data. For example: retain 2 weeks on the source, 4 weeks on the target device.
HP StoreOnce Catalyst store creation can be controlled by the backup application, if required, from within Data Protector (not available with
Symantec products).
StoreOnce Catalyst is fully monitored in the Storage Reporting section of the StoreOnce Management GUI and StoreOnce Catalyst Copy can
be monitored on a global basis by using HP Replication Manager V2.1 or above.
HP StoreOnce Catalyst is an additional licensable feature both on the StoreOnce appliance and within the backup software because of the
advanced functionality it delivers.
HP StoreOnce Catalyst is only supported with HP Data Protector 7.01, Symantec NetBackup 7.x and Symantec Backup Exec 2012.
There is some overhead of control data that also needs to pass across the replication link; this is known as manifest data. A final component is any hash codes that are not present on the remote site and may also need to be transferred. Typically the “overhead components” are less than 2% of the total virtual cartridge/file size to replicate.
Replication throughput can be “throttled” by using bandwidth limits as a percentage of an existing link, so as not to affect the performance of
other applications running on the same WAN link.
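As a rough sizing illustration of the two points above (the ~2% manifest overhead and percentage-based bandwidth throttling), the following sketch uses hypothetical helper names; it is not a StoreOnce tool, just back-of-envelope arithmetic.

```python
def replication_transfer_mb(cartridge_mb, unique_fraction, manifest_overhead=0.02):
    """Data crossing the WAN for one cartridge/file: only the chunks missing
    at the target, plus roughly 2% of control (manifest) data."""
    return cartridge_mb * (unique_fraction + manifest_overhead)

def replication_hours(transfer_mb, link_mbit_per_s, bandwidth_limit=1.0):
    """Time to replicate over a link, with an optional bandwidth limit
    expressed as a fraction of the link so other traffic is not starved."""
    effective_mb_per_s = (link_mbit_per_s / 8.0) * bandwidth_limit
    return transfer_mb / effective_mb_per_s / 3600.0

# A 500GB cartridge with 2% new data over a 100Mbit/s link throttled to 50%:
transfer = replication_transfer_mb(500_000, 0.02)   # ~20,000MB on the wire
hours = replication_hours(transfer, 100, 0.5)       # ~0.9 hours
```

The same arithmetic, run in reverse, is a quick sanity check when sizing WAN links for a given replication window.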
The maximum number of concurrent replication jobs supported by source and target StoreOnce appliances can be varied in the StoreOnce
Management GUI to also manage throughput and bandwidth utilization. The table below shows the default settings for each product.
Default setting                    Values by model
Appliance Fan-In                    8    16    24    50    50
Appliance Fan-Out                   2     4     4     8     8
Device Fan-In (VTL only)            1     8     8    16    16
Device Fan-Out                      1     1     1     1     1
Max Concurrent Outbound Jobs       12    24    24    48    48
Max Concurrent Inbound Jobs        24    48    48    96    96
Note: Fan-in is the maximum number of source appliances that may replicate to a device acting as a replication target.
Housekeeping overview
If data is deleted from the StoreOnce Backup system (e.g. a virtual cartridge is overwritten or erased), any unique chunks will be marked for
removal, any non-unique chunks are de-referenced and their reference count decremented. The process of removing chunks of data is not an
inline operation because this would significantly impact performance. This process, termed “housekeeping”, runs on the appliance as a
background operation.
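The deferred reclamation described above can be modelled as follows. This is an illustrative in-memory sketch with hypothetical names, not the appliance's implementation; it only shows why deletes are cheap and why the background pass is needed.

```python
from collections import defaultdict

class Store:
    def __init__(self):
        self.chunks = {}                 # hash -> chunk data
        self.refs = defaultdict(int)     # hash -> reference count
        self.pending = set()             # hashes awaiting housekeeping

    def delete_backup(self, recipe):
        """Deleting or overwriting a backup only de-references its chunks;
        nothing is removed inline, so the delete itself is fast."""
        for h in recipe:
            self.refs[h] -= 1
            if self.refs[h] == 0:
                self.pending.add(h)      # now-unique chunk: mark for removal

    def housekeeping(self):
        """Background pass: reclaim space for chunks no backup references.
        On the appliance this runs outside any configured blackout window."""
        for h in self.pending:
            if self.refs[h] == 0:        # still unreferenced?
                del self.chunks[h]
                del self.refs[h]
        self.pending.clear()
```

Note that a chunk shared with another, still-live backup keeps a non-zero reference count and survives housekeeping; only genuinely unreferenced chunks are removed.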
Housekeeping is triggered in different ways depending on device type and backup application:
VTL: media on which the data retention period has expired will be overwritten by the backup application. The act of overwriting triggers the housekeeping of the expired data. If media is not overwritten (for example, if the backup application chooses to use blank media in preference to overwriting), the expired media continues to occupy disk space.
NAS shares: Some backup applications overwrite with the same file names after expiration; others do an expiry check before writing
new data to the share; others might do a quota check before overwriting. Any of these actions triggers housekeeping.
Catalyst stores: The backup application clean-up process, the running of which is configurable, regularly checks for expired backups
and removes catalog entries. This provides a much more structured space reclamation process.
One final comment: housekeeping blackout windows are configurable (up to two periods in any 24 hours), so even if the “clean-up” scripts run in the backup software, the housekeeping will not run until the blackout window has closed.
Housekeeping is an important process in order to maximize the deduplication efficiency of the appliance and, as such, it is important to ensure
that it has enough time to complete. Running backup, restore, tape offload and replication operations with no break (i.e. 24 hours a day) will
result in housekeeping never being able to complete. Configuring backup rotation schemes correctly is very important to ensure the maximum
efficiency of the product; correct configuration of backup rotation schemes reduces the amount of housekeeping that is required and creates a
predictable load.
Large housekeeping loads are created if large numbers of cartridges are manually erased or re-formatted. In general all media overwrites should
be controlled by the backup rotation scheme so that they are predictable.
Backup Application considerations
Multi-stream or Multiplex
Multi-streaming is often confused with multiplexing; they are, however, two different (but related) concepts. Multi-streaming is when multiple data streams are sent to the StoreOnce Backup system simultaneously but separately. Multiplexing is a configuration whereby data from multiple sources (for example, multiple client servers) is backed up to a single tape drive device by interleaving blocks of data from each server simultaneously, combined into a single stream. Multiplexing is a hangover from using physical tape devices; it was required in order to maintain good performance where source servers were slow, because it aggregates multiple source server backups into a single stream.
A multiplexed data stream configuration is NOT recommended for use with a StoreOnce system or any other deduplicating device. This is because
the interleaving of data from multiple sources is not consistent from one backup to the next and significantly reduces the ability of the
deduplication process to work effectively; it also reduces restore performance. Care must be taken to ensure that multiplexing is not happening
by default in a backup application configuration. For example when using HP Data Protector to back up multiple client servers in a single backup
job, it will default to writing four concurrent multiplexed servers in a single stream. This must be disabled by reducing the “Concurrency”
configuration value for the tape device from 4 to 1.
The following graph, Figure 1, illustrates the relationship between the number of active data streams and performance; the appliance is assumed
to be one of the larger models where more than 24 streams (if fast enough) can achieve best throughput. The throughput values shown are for
example only. See Appendix A for the maximum number of streams recommended for best throughput per model.
Along the X axis is the number of concurrent streams. A stream is a data path to a device configured on the StoreOnce appliance; on a VTL it is the number of virtual tape drives, on NAS the number of writers, and on Catalyst stores the number of streams.
Along the Y axis is the overall throughput in MB/sec that the StoreOnce device can process – this ultimately dictates the backup window. As a
backup window begins, the number of streams gradually increases and we aim to have as many streams running as possible to get the best
possible throughput to the StoreOnce device. As the backup jobs come to an end, the stream count starts to decrease and so the overall
throughput to the StoreOnce device starts to reduce.
The StoreOnce device itself also has a limit which we call the maximum ingest rate. In this example it is 1000MB/sec. The > 24 streams value is
calculated using “Infinite performance hosts” to characterize the HP StoreOnce ingest performance.
As long as we can supply around 24 data streams at the required performance levels, we keep the StoreOnce device in its “saturation zone” of maximum ingest performance.
Figure 1: Relationship between active data streams and performance
Note 1: Stream source data rates will vary; some streams will run at 8MB/sec, others at 50MB/sec, and maybe some others at 200MB/sec. This means that as the stream count increases, it is the aggregate total of the streams that drives the unit to saturation, which is the goal. Some of the factors that influence source data rate are the compressibility of the data, the number of disks in the disk group feeding the stream, the RAID type, and others.
Note 2: With 5 streams at 100MB/sec we do not reach the maximum throughput of the node (server), which can support 600MB/sec in this example. This is the maximum possible ingest rate of the device for a specific model based on 5 streams. This ingest rate is the maximum even if each stream is capable of 200MB/sec, because it represents the maximum amount of data the machine can process.
Note 3: The number of streams available varies throughout the backup window. The curve representing backup streams increases as the backup
jobs begin ramping into the appliance (to VTL, NAS share or Catalyst store target devices) and then declines towards the finish of the backup,
when throughput rates decline as backup jobs complete. This highlights the importance of maintaining enough backup threads from sources to
ensure that, while backups are running, sufficient source “data pump” is maintained to hold the StoreOnce device in saturation.
In example 1 (red circle) we are supplying much more than 24 streams (100 actually) but they are all slow hosts and the cumulative ingest rate is
800MB/sec (below our maximum ingest rate).
In example 2 (green circle) we have some high performance hosts that can supply data at a rate higher than the StoreOnce maximum ingest rate;
and so the performance is capped at 1000MB/sec.
In example 3 (blue circle) we have some very high performance hosts but can only configure 5 backup streams because of the way the data is constructed on the hosts. In this case the maximum ingest of the StoreOnce appliance is 600MB/sec but we can only achieve 500MB/sec because that is as fast as we can supply the data (because of stream limitations). If we could re-configure the backups to provide more streams, we could get higher throughput.
In example 4 (brown circle) we show a more realistic situation where we have a mixture of different hosts with different performance levels. Most importantly, we have 30 streams and a total throughput capability of 950MB/sec, which puts us very close to the maximum ingest rate.
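The four examples can be reproduced with a simple model: achieved throughput is the lesser of what the hosts can supply and what the appliance can ingest at that stream count. The stream-count-to-ceiling curve below is read off Figure 1 (600MB/sec at 5 streams, saturating at 1000MB/sec from about 24 streams) and is illustrative only.

```python
# Illustrative ingest ceiling in MB/sec, interpolated from Figure 1.
CEILING_POINTS = [(0, 0.0), (5, 600.0), (24, 1000.0)]

def ingest_ceiling(n_streams):
    """Appliance ingest capability at a given stream count (linear
    interpolation between the example points; saturates beyond 24)."""
    if n_streams >= CEILING_POINTS[-1][0]:
        return CEILING_POINTS[-1][1]
    for (x0, y0), (x1, y1) in zip(CEILING_POINTS, CEILING_POINTS[1:]):
        if n_streams <= x1:
            return y0 + (y1 - y0) * (n_streams - x0) / (x1 - x0)

def achieved_throughput(stream_rates_mb_s):
    # Throughput is supply-limited or ingest-limited, whichever is lower.
    return min(sum(stream_rates_mb_s), ingest_ceiling(len(stream_rates_mb_s)))

# Example 1: 100 slow hosts at 8MB/sec  -> supply-limited at 800MB/sec.
# Example 2: 30 fast hosts at 50MB/sec  -> capped at the 1000MB/sec ingest rate.
# Example 3: 5 hosts at 100MB/sec       -> 500MB/sec (the ceiling would be 600).
# Example 4: 30 mixed hosts, 950MB/sec total -> 950MB/sec, near saturation.
```

The host counts in examples 2 and 4 are assumptions for illustration; the document only gives the totals.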
The maximum ingest rates vary according to each StoreOnce model. Typically, on the larger StoreOnce units about 48 streams spread across the
configured devices give the best throughput; more streams only help to sustain the throughput with each stream being throttled appropriately.
For example, if 96 streams are configured, the throughput is still the same as if 48 streams were configured – it is just that each stream runs
slower as resources are shared. See Appendix A for more details.
Once we understand the basic streams versus performance concept we can start to apply best practices for the number of devices to configure.
With these factors in mind we have recommended some VTL configurations for the above performance examples, which are illustrated in Figure 2
below.
Figure 2: Relationship between active data streams and device configuration ( VTLs shown)
Note on Figure 2 above: In general, per configured device we get the best throughput between 12-16 streams, and the best throughput per appliance when we reach 48 streams or more. So, for 100 streams we could configure 6 devices with, say, 17 streams each, or 20 devices with 5 streams each. 6 devices is preferable because:
a) Fewer devices are easier to manage, but we can still group similar data types into the same device
b) They provide the best possible throughput when we have the higher stream count to a device
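The 100-stream example above follows from a simple heuristic, sketched here with a hypothetical helper (the 12-16 streams-per-device sweet spot comes from the text; the function is not a StoreOnce tool):

```python
import math

def suggest_device_count(total_streams, streams_per_device=16):
    """Aim for roughly 12-16 streams per device: round the device count
    to the nearest whole number, then spread the streams across devices."""
    devices = max(1, round(total_streams / streams_per_device))
    return devices, math.ceil(total_streams / devices)

# suggest_device_count(100) -> (6, 17): 6 devices with ~17 streams each,
# matching the worked example above.
```

Fewer, busier devices are preferred because each device performs best in the 12-16 stream range while still allowing similar data types to be grouped together.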
Data compression and encryption backup application features
Both software compression and encryption will randomize the source data and will, therefore, not result in a high deduplication ratio for these
data sources. Consequently, performance will also suffer. The StoreOnce Backup system will compress the data at the end of deduplication
processing anyway, before finally writing the data to disk.
For these reasons it is best to do the following, if efficient deduplication and optimum performance are required:
Ensure that there is no encryption of data before it is sent to the StoreOnce appliance.
Ensure that software compression is turned off within the backup application.
Not all data sources will result in high deduplication ratios; deduplication ratios are data type dependent, change rate dependent and retention period dependent. Deduplication performance can, therefore, vary across different data sources. Digital images, video, audio and compressed file archives will typically all yield low deduplication ratios. If this data predominantly comes from a small number of server sources, consider setting up a separate library/share/Catalyst store for these sources for better deduplication performance. In general, high change rates yield low dedupe ratios, whilst low change rates yield high dedupe ratios over the same retention period. As you might expect, multiple full backups yield high dedupe ratios compared to full and incremental backup regimes.
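The effect of scrambling data before it reaches the appliance is easy to demonstrate: two backups that share almost all their data deduplicate well in the clear, but share nothing once each stream has been encrypted with its own key. The XOR-keystream cipher below is a toy stand-in for real encryption, and fixed-size chunking stands in for StoreOnce chunking; both are illustrative assumptions.

```python
import hashlib
import os

CHUNK = 4096

def chunk_hashes(data):
    """Hash each fixed-size chunk, standing in for StoreOnce chunking."""
    return {hashlib.sha256(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)}

def toy_encrypt(data, key):
    # XOR with a counter-mode keystream: enough to show that ciphertexts
    # produced under different keys share no chunks at all.
    out = bytearray()
    for i in range(0, len(data), 32):
        ks = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], ks))
    return bytes(out)

payload = os.urandom(1024 * 1024)      # 1MB of unchanged file data
day1 = os.urandom(CHUNK) + payload     # each backup: a new header + payload
day2 = os.urandom(CHUNK) + payload

shared_plain = chunk_hashes(day1) & chunk_hashes(day2)        # ~all payload chunks
shared_cipher = (chunk_hashes(toy_encrypt(day1, b"key-1")) &
                 chunk_hashes(toy_encrypt(day2, b"key-2")))   # none
```

Compression has the same character: the output bytes depend on the whole input, so even a small change produces a stream that no longer matches previously stored chunks.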
Network and Fibre Channel best practices
The following table shows which network and Fibre Channel ports are present on each model of StoreOnce appliance.
Ethernet factors
It is important to consider the whole network when considering backup performance. Any server acting as a backup server should be configured where possible with multiple network ports that are bonded in order to provide a fast connection to the LAN. Client servers (those that back up via a backup server) may be connected with only a single port if backups are to be aggregated through the backup server.
Ensure that no sub-1GbE network components are in the backup path as this will significantly restrict backup performance.
Configure bonded (Mode 6) network ports inside the StoreOnce appliance to achieve maximum available network bandwidth and a level of high
availability.
For StoreOnce 4420/4430 Backup systems that support a 10GbE connection, configure a network SAN on the 10GbE ports that is dedicated to backup traffic.
FC factors
Virtual library devices are assigned to an individual interface. Therefore, for best performance, configure both FC ports and balance the virtual
devices across both interfaces to ensure that one link is not saturated whilst the other is idle.
Switched fabric mode is preferred for optimal performance on medium to large SANs since zoning can be used.
Use zoning (by Worldwide Name) to ensure high availability.
When using switched fabric mode, Fibre Channel devices should be zoned on the switch to be only accessible from a single backup server
device. This ensures that other SAN events, such as the addition and removal of other FC devices, do not cause unnecessary traffic to be sent to
devices. It also ensures that SAN polling applications cannot reduce the performance of individual devices.
Either or both of the two FC ports may be connected to an FC fabric and each virtual library may be associated with one or both of these FC ports, but each drive can only be associated with one port. “Port 1 and 2” is the recommended option in the GUI to achieve efficient load balancing. Only the robotics (medium changer) part of the VTL is presented to Port 1 and Port 2 initially, with 50% of the defined virtual tape drives presented to Port 1 and 50% to Port 2. This also ensures that in the event of a fabric failure at least half of the drives will still be available to the hosts. (The initial 50/50 virtual tape drive allocation to ports can be edited later, if required.)
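A sketch of that default presentation (a hypothetical function, purely illustrative; it is not a StoreOnce API):

```python
def initial_port_assignment(n_drives):
    """Model of the default VTL presentation on a two-port FC appliance:
    the robotics (medium changer) appears on both ports, and the virtual
    drives are split 50/50 so half survive a single-fabric failure."""
    assignment = {"port1": ["robotics"], "port2": ["robotics"]}
    for d in range(1, n_drives + 1):
        port = "port1" if d % 2 else "port2"     # alternate drives per port
        assignment[port].append(f"drive{d}")
    return assignment
```

With four drives, each port carries the robotics plus two drives, so a failure of either fabric still leaves a working changer and half the drives.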
Configured backup devices and the management interfaces are all available on all network IP addresses configured for the appliance.
In order to deliver the best performance when backing up data over the Ethernet ports, it will be necessary to configure the appliance network ports, backup servers and network infrastructure to maximize the available bandwidth to the StoreOnce device.
Each pair of network ports on the appliance can be configured either on separate subnets or in a bond with each other (1GbE and 10GbE ports
cannot be bonded together).
Single Node StoreOnce appliances have a factory default network configuration where the first 1GbE port (Port 1 /eth0) is enabled in DHCP mode.
This enables quick access to the StoreOnce CLI and Management GUI for customers using networks with DHCP servers and DNS lookup because
the appliance hostname is printed on a label on the appliance itself.
The StoreOnce appliances provide “Mode 6” (Adaptive Load Balancing) bonding when ports are bonded. This provides port failover and load balancing across the physical ports. No network switch configuration is needed in this mode. This network bonding mode requires that the same switch is used for each network port, or that spanning tree protocol is enabled if separate switches are used for each port.
If external switch ports are configured for LACP (Mode 4) bonding, this must be unconfigured in order for Mode 6 bonding to work.
Network configuration on StoreOnce Backup systems is performed via the CLI Management interface.
For detailed information about supported network modes and how to configure them, please refer to the “HP StoreOnce 2620, 4210/4220, and 4420/4430 Backup system Installation and Configuration guide”.
Single Port configurations
The example shows the simplest configuration: a single subnet containing just one 1GbE network port. Generally this configuration is likely to be used only if:
The network interface is required solely for management of the appliance, or
Low performance and resiliency for backup and restore are acceptable.
A single 10GbE port could also be configured in this way (on 4420 and 4430 appliances), providing both a backup data interface and management
interface. This could deliver good performance, however, bonded ports are recommended for resiliency and maximum performance.
In the case of a separate network SAN being used, configuration of CIFS backup shares with Active Directory authentication requires careful
consideration, see Network Configurations for CIFS AD on page 19 for more information.
Bonded port configurations (recommended)
If two network ports are configured within the same subnet they will be presented on a single IP address and will be bonded using Mode 6 bonding
as described at the beginning of this chapter.
This mode is generally recommended for backup data performance and also for resiliency of both data and management network connectivity.
It should be noted that when using bonded ports the full performance of both links will only be realized if multiple host servers are providing
data, otherwise data will still use only one network path from the single server.
Figure 5: Network configuration, bonded; applies to both 1GbE ports and 10GbE ports
10GbE Ethernet ports on StoreOnce 4420/4430 Backup systems
10GbE Ethernet is provided as a viable alternative to the Fibre Channel interface for providing maximum iSCSI VTL performance and also
comparable NAS performance. 10GbE ports also provide good performance when using StoreOnce Catalyst low and high bandwidth backup as
well as Catalyst copy or VTL/NAS replication between appliances. When using 10GbE Ethernet it is common to configure a “Network SAN”, which is
a dedicated network for backup that is separate to the normal business data network; only backup data is transmitted over this network.
Figure 6: Network configuration, HP StoreOnce 4420/4430 with 10GbE ports. As well as CIFS and NFS shares, the devices configured could equally be Catalyst stores.
When a separate network SAN is used, configuration of CIFS backup shares with Active Directory authentication requires careful consideration,
see the next section for more information.
Network configuration for CIFS AD
To authenticate CIFS backup shares against Active Directory, the AD Domain Controller must be accessible from the StoreOnce device. Broadly there are two possible configurations which allow both:
Access to the Active Directory server for AD authentication and
Separation of Corporate LAN and Network SAN traffic
Option 1: HP StoreOnce Backup system on Corporate SAN and Network SAN
In this option, the StoreOnce device has a port in the Corporate SAN which has access to the Active Directory Domain Controller. This link is then
used to authenticate CIFS share access.
The port(s) on the Network SAN are used to transfer the actual data.
Option 2: HP StoreOnce Backup system on Network SAN only with Gateway
In this option the StoreOnce appliance has connections only to the Network SAN, but there is a network router or Gateway server providing access
to the Active Directory domain controller on the Corporate LAN. In order to ensure two-way communication between the Network SAN and
Corporate LAN the subnet of the Network SAN should be a subnet of the Corporate LAN subnet.
Once configured, authentication traffic for CIFS shares will be routed to the AD controller but data traffic from media servers with a connection to
both networks will travel only on the Network SAN. This configuration allows both 1GbE network connections to be used for data transfer but also
allows authentication with the Active Directory Domain controller. The illustration shows a simple Class C network for a medium-sized LAN
configuration.
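The subnet containment requirement for Option 2 can be checked with Python's standard ipaddress module. The address ranges below are hypothetical examples, not values from the text.

```python
import ipaddress

# Hypothetical addressing: a Corporate LAN and a dedicated Network SAN
# carved out of it as a Class C subnet.
corporate_lan = ipaddress.ip_network("192.168.0.0/16")
network_san = ipaddress.ip_network("192.168.10.0/24")

# Option 2 routing works only when the Network SAN's subnet is
# contained within the Corporate LAN's address range, so that CIFS
# authentication traffic can reach the AD Domain Controller.
assert network_san.subnet_of(corporate_lan)
```

If the check fails, authentication traffic from the StoreOnce appliance cannot be routed to the AD controller via the gateway, and CIFS share access with AD authentication will not work in this topology.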
The screenshot below shows where the 1GbE and 10GbE IP addresses are displayed in the GUI.
There can be up to 4 addresses here (depending on whether bonding is used), one per port. All ports can be used for either management or data.
Fibre Channel configuration guidelines
The HP StoreOnce Backup systems support both switched fabric and direct attach (private loop) topologies.
Switched fabric using NPIV (N Port ID Virtualisation) offers a number of advantages and is the preferred topology for StoreOnce appliances.
Switched fabric
A switched fabric topology utilizes one or more fabric switches configured in one or more storage area networks (SANs) to provide a flexible
configuration between several Fibre Channel hosts and Fibre Channel targets such as the StoreOnce appliance virtual libraries. Switches may be
cascaded or meshed together to form large fabrics.
StoreOnce does not implement any selective virtual device presentation, and so each virtual library will be visible to all hosts connected to the
same fabric. It is recommended that each virtual library is zoned to be visible to only the hosts that require access. Unlike the iSCSI virtual
libraries, FC virtual libraries can be configured to be used by multiple hosts, if required.
Direct Attach (Private Loop)
A direct attach (private loop) topology is implemented by connecting the StoreOnce appliance ports directly to a Host Bus Adapter (HBA). In this
configuration the Fibre Channel private loop protocol must be used.
Either of the FC ports on a StoreOnce Backup system may be connected to a FC private loop, direct attach topology. The FC port configuration of the StoreOnce appliance should be changed from the default N_Port topology setting to Loop. This topology only supports a single host connected to each private-loop-configured FC port. In private loop mode the medium changer cannot be shared across FC Port 1 and FC Port 2.
Zoning
Zoning is only required if a switched fabric topology is used. It provides a way to ensure that servers, disk arrays, and tape libraries see only the hosts and targets that they need. Some of the benefits of zoning include:
Limiting unnecessary discoveries on the StoreOnce appliance
Reducing stress on the StoreOnce appliance and its library devices by polling agents
Reducing the time it takes to debug and resolve anomalies in the backup/restore environment
Reducing the potential for conflict with untested third-party products
Zoning implementation needs to ensure that the StoreOnce FC diagnostic device is not presented to hosts.
Zoning may not always be required for small or simple configurations. Typically, the larger the SAN, the more zoning is needed.
Use the following guidelines to determine how and when to use zoning.
Small fabric (16 ports or less)—may not need zoning.
Small to medium fabric (16 - 128 ports)—use host-centric zoning. Host-centric zoning is implemented by creating a specific zone for each server or host, and adding only those storage elements to be utilized by that host. Host-centric zoning prevents a server from detecting other devices or servers on the SAN, and it simplifies the device discovery process.
Disk and tape on the same pair of HBAs is supported along with the coexistence of array multipath software (no multipath to tape or library
devices on the HP StoreOnce Backup system, but coexistence of the multipath software and tape devices).
Large fabric (128 ports or more)—use host-centric zoning and split disk and tape targets. Splitting disk and tape targets into separate zones
will help to keep the HP StoreOnce Backup system free from discovering disk controllers that it does not need. For optimal performance, where
practical, dedicate HBAs for disk and tape.
Figure 11: VTL Fibre Channel resiliency using WWN zoning (WWPN)
In our example the arrows illustrate accessibility, not data flow.
FC configuration:
Dual fabrics
Multiple switches within each fabric
Zoning by WWPN
Each zone to include a host and the required targets on the HP StoreOnce Backup system

StoreOnce VTL configuration:
Default library configuration is 50% of drives presented to Port 1 and 50% presented to Port 2. The robot appears on Port 1 and Port 2.
Up to 120 WWNs can be presented to Port 1 and Port 2.
If Fabric 1 fails, all VTL libraries on the HP StoreOnce Backup system still have access to Fabric 2. As long as Hosts A, B and C also have access to Fabric 2, all backup devices are still available to Hosts A, B and C.
Use the StoreOnce Management GUI to find out the WWPN for use in zoning. The WW port names are on the VTL-Libraries-Interface Information
tab.
The Diagnostic Fibre Channel Device can be identified by the following example text.
Symbolic Port Name "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H
Port-1"
Symbolic Node Name "HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H"
In the above, the S/N-CZJ1440JBS string should be identical for all devices. For Port 1 the Symbolic Port Name string ends with “Port-1” as shown; for Port 2 it ends with “Port-2”. Often the diagnostic device will be listed above the other virtual devices because it logs in first, ahead of the virtual devices. The S/N-MY5040204H string indicates the serial number of the QLC HBA, not the serial number of an appliance/node.
At this time these devices are part of the StoreOnce VTL implementation and are not an error or fault condition. It is imperative that these devices be removed from any switch zone that is also used for virtual drives and loaders, to avoid data being sent to diagnostic devices.
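As a sketch of how a zoning script might exclude the diagnostic device, the filter below matches on the “Diagnostic Fibre Channel” phrase shown in the example text. The first discovered entry is taken from the text; the second is a hypothetical virtual tape drive name used only for illustration.

```python
def is_diagnostic_device(symbolic_port_name):
    # The diagnostic device announces itself with this phrase in its
    # Symbolic Port Name, as shown in the example text above.
    return "Diagnostic Fibre Channel" in symbolic_port_name

# One diagnostic entry (from the text) and one hypothetical drive entry.
discovered = [
    'HP D2D S/N-CZJ1440JBS HP D2DBS Diagnostic Fibre Channel S/N-MY5040204H Port-1',
    'HP D2D S/N-CZJ1440JBS HP Ultrium 5-SCSI Port-1',
]

# Only non-diagnostic devices belong in the zone used for backup traffic.
zone_members = [d for d in discovered if not is_diagnostic_device(d)]
```

Applying the filter leaves only the tape drive entry as a zone member, keeping the diagnostic device out of the data path as the text requires.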
Sizing StoreOnce solutions
The following figure provides a simple sizing guide for the HP StoreOnce Generation 3 product family for backups and replication, illustrating the typical amount of data that can be protected by an HP StoreOnce Backup system in a daily backup.
Note: Assumes fully configured product, compression rate of 1.5, data change rate of 1%, data retention period of 6 months and a 12-hour
backup window. Actual performance is dependent upon data set type, compression levels, number of data streams, number of devices emulated
and number of concurrent tasks, such as housekeeping or replication. Additional time is required for periodic physical tape copy, which would
reduce the amount of data that can be protected in 24 hours.
HP also provides a downloadable tool to assist in the sizing of StoreOnce-based data protection solutions at
http://h30144.www3.hp.com/SWDSizerWeb/.
The use of this tool enables more accurate capacity sizing, retention period decisions and replication link sizing and performance for the most
complex StoreOnce environments.
A fully worked example using the Sizing Tool and best practices is contained later in the document, see Appendix B.
Introduction to HP StoreOnce Catalyst and device configuration guidelines
StoreOnce Catalyst technology
The following diagram shows the basic concept of a StoreOnce Catalyst store; it is a network-based (not FC-based) backup target that exists alongside VTL and NAS targets. The main difference between a Catalyst store and VTL or NAS devices is that the processor-intensive part of deduplication (hashing/chunking and compressing) can be configured to occur on either the media server or the StoreOnce appliance.
If deduplication is configured to occur on the media server supplying data to the Catalyst store, this is known as low bandwidth backup or source-side deduplication.
If deduplication is configured to occur on the StoreOnce appliance where the Catalyst store is located, this is known as target-side deduplication or high bandwidth backup; all of the deduplication takes place on the StoreOnce appliance.
The low bandwidth mode is expected to account for the majority of Catalyst implementations since it has the net effect of improving the overall
throughput of the StoreOnce appliance whilst reducing backup bandwidth consumed. It can also be used to allow remote offices to back up
directly to a central StoreOnce Appliance over a WAN link for the first time. Catalyst stores are tolerant of high latency links – this has been tested
by HP. The net effect is the same in both cases – a significant reduction in bandwidth consumed by the data path to the backup storage target.
The deduplication offload into the media server is implemented in different ways in different backup applications.
With HP Data Protector the StoreOnce deduplication engine is embedded in the HP Data Protector Media Agent that talks to the Catalyst API.
In Symantec products HP has developed an OpenStorage (OST) Plug-in to NetBackup and Backup Exec that creates the interface between
Symantec products and the StoreOnce Catalyst store API.
Catalyst stores can also be copied using low bandwidth links – just like NAS and VTL devices. The key difference here is that there is no need to set
up replication mappings (required with VTL and NAS); the whole of the Catalyst copy process is controlled by the backup software itself. This is
implemented by sending “Catalyst Copy” commands to the Catalyst API that exists on the source StoreOnce appliance. This simple fact, that the
backup application controls the copy process and is aware of all the copies of the data held in Catalyst stores, solves many of the problems
involved in Disaster Recovery scenarios involving replicated copies. No import is necessary because all entries for all copies of data already exist
in the backup application’s Database.
Generic sizing rule for media servers running Catalyst API
Allow 50 MB/s of stream data per GHz of CPU core and 30 MB of RAM (allow 2 cores for the backup application media agent software).
Allow at least 16 GB of RAM overall.
Ignore hyperthreading (for example, 12 physical cores present as 24 with hyperthreading; count 12).
Example:
Dual hex-core CPU running at 3.4 GHz (12 cores).
10 cores x 3.4 GHz = 34 GHz (remember 2 cores are allocated to the media agent).
34 GHz x 50 MB/s per GHz = 1700 MB/s (providing the sources of the data are not the bottleneck).
A media server with a dual hex (6) core processor (12 cores), of which two are used for the media agent and 10 are available for deduplication, should be able to deliver Catalyst backup streams (already deduplicated) at a rate of 1700 MB/s.
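The sizing rule above can be expressed as a small calculator. The 50 MB/s-per-GHz figure and the 2-core media agent reservation are taken directly from the text; the function itself is just a convenience wrapper around that arithmetic.

```python
def catalyst_media_server_throughput(physical_cores, ghz_per_core):
    """Estimate low-bandwidth (source dedupe) throughput in MB/s for a
    media server, using the generic sizing rule: 50 MB/s of stream data
    per GHz of CPU, with 2 cores reserved for the backup application
    media agent. Hyperthreading is ignored (physical cores only)."""
    dedupe_cores = max(physical_cores - 2, 0)
    available_ghz = dedupe_cores * ghz_per_core
    return 50 * available_ghz
```

Running the worked example from the text, a dual hex-core server at 3.4 GHz gives 10 cores x 3.4 GHz = 34 GHz, for an estimated 1700 MB/s of deduplicated backup stream throughput.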
Catalyst Copy
Catalyst Copy is the equivalent of Virtual library and NAS share replication. The same principles apply in that only the new data created at the
source site needs to be copied (replicated) to the target site. The fundamental difference is that the copy jobs are created by the backup
application and can, therefore, be tracked and monitored within the backup application catalog as well as from the StoreOnce Management GUI.
Should it be necessary to restore from a Catalyst copy, the backup application is able to restore from a duplicate copy without the need to re-import data to the catalog database.
Catalyst Copy should not be considered in the same way as VTL and NAS replication, since there are effectively no hard constraints other than capacity on how many Catalyst stores can be copied (replicated) into a Catalyst store at a central site. Furthermore, because Catalyst copies are controlled by the backup application, multi-hop replication is possible using Catalyst devices. However, Catalyst replication blackout windows can be set on the StoreOnce appliance to dictate when the copy job actually happens, and bandwidth throttling can also be enforced to limit the amount of WAN link consumed by StoreOnce Catalyst copy; in this respect it is similar to NAS and VTL replication.
Catalyst Copy has the following features:
The Copy job is configurable from within the backup application software – see Appendix C
Several source Catalyst stores can be copied into a single target Catalyst store.
Multi-hop copy is configurable via the backup software – Source to Target 1 then onto Target 2.
One to many copy is also configurable but happens serially one after the other.
With the Catalyst agents running on remote office media servers, HP StoreOnce Catalyst technology has the ability to back up directly from remote sites to a central site, using what is known as low bandwidth backup – essentially this uses HP StoreOnce replication technology.
Figure 14: Catalyst Copy options
Note:
Outbound copy jobs = replication (out)
Data in = Backup jobs
Inbound copy jobs = replication (in)
The concurrency settings for Catalyst are configured by selecting the StoreOnce Catalyst - Settings tab and Edit. In this case, the parameters are
Outbound copy jobs and Data and Inbound Copy jobs. Bear in mind that Catalyst stores can act as both inbound and outbound copies when used
in multi-hop mode.
The following screen illustrates how Catalyst copy blackout windows are configured (from the StoreOnce Catalyst-Blackout Windows tab).
The user can also configure bandwidth limiting (from the StoreOnce Catalyst-Bandwidth Limiting Windows tab).
Configuring client access
Access to Catalyst stores can be controlled on a per-client basis.
First, overall client access permission checking is enabled (from the StoreOnce Catalyst – Settings tab).
Then each Catalyst store has a list of clients defined who are allowed to access it (from the StoreOnce Catalyst – Stores – Permissions tab). In
our example, Catalyst store “SQL_VSS_Keep” can only be accessed by Client “DestinyStores.” The backup applications also have this Client name
configured into their Catalyst Backup and Copy Jobs in order to send the Catalyst API calls to the stores. (Note the permissions option for All
Clients provides an open access option.)
VTL configuration guidelines
Summary of best practices
Tape drive emulation types have no effect on performance or functionality.
Configuring multiple tape drives per library enables multi-streaming operations per library for good aggregate throughput performance.
Do not exceed the recommended maximum concurrent backup streams per library and appliance if maximum performance is required. See
Appendix A.
Target the backup jobs to run simultaneously across multiple drives within the library and across multiple libraries. Keep the concurrent stream
count high for best throughput.
Create multiple libraries on the larger StoreOnce appliances to achieve best aggregate performance.
Configure dedicated individual libraries for backing up larger servers.
Configure other libraries for consolidated backups of smaller servers.
Separate libraries by data type if the best trade-off between deduplication ratio and performance is needed.
Cartridge capacities should be set either to allow a full backup to fit on one cartridge or to match the physical tape size for offload (whichever is the smaller).
Use a block size of 256KB or greater. For HP Data Protector and EMC Networker software a block size of 512 KB has been found to provide the
best deduplication ratio and performance balance.
Disable the backup application verify pass for best performance.
Remember that virtual cartridges cost nothing and use up very little space overhead. Don’t be afraid of creating “too many” cartridges. Define
slot counts to match required retention policy. The D2DBS, ESL and EML virtual library emulations can have a large number of configurable
slots and drives to give most flexibility in matching customer requirements.
Design backup policies to overwrite media so that space is not lost to a large expired media pool and media does not have different retention
periods on the same piece of media.
Reduce the number of appends per tape by specifying separate cartridges for each incremental backup; this improves replication performance and capacity utilization.
Performance, however, is not related to library emulation type other than through the ability to configure multiple drives per library and thus enable multiple simultaneous backup streams (multi-streaming operation).
To achieve the best performance from the larger StoreOnce appliances, more than one virtual library will be required to meet the multi-stream needs. The appliance provides a pool of drives that can be allocated to libraries in a flexible manner, so many drives per library can be configured, up to a maximum defined by the library emulation type. The number of cartridges per library can also be configured. The table below lists the key parameters for all StoreOnce products.
To achieve best performance the recommended maximum concurrent backup streams per library and appliance in the table should be followed.
As an example, while it is possible to configure 200 drives per library on a 4420 appliance, for best performance no more than 12 of these drives
should be actively writing or reading at any one time.
                                                          StoreOnce 2620   StoreOnce 4210/4220   StoreOnce 4420/4430
Maximum VTL drives per library/appliance                  32               64/96                 200
Maximum slots per library (D2DBS, EML-E, ESL-E)           96               1024                  4096
Maximum slots (MSL2024, MSL4048, MSL8096)                 24, 48, 96       24, 48, 96            24, 48, 96
Maximum active streams per store                          48               64/96                 128
Recommended maximum concurrent backup streams
per appliance                                             24               48                    64
Recommended maximum concurrent backup streams
per library                                               4                6                     12
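A planned backup layout can be checked against the recommended maximums in the table with a sketch like the following. The per-library and per-appliance limits are copied from the table; the validation function itself is illustrative, not an HP tool.

```python
# Recommended maximum concurrent backup streams, from the table above.
RECOMMENDED_STREAMS = {
    "StoreOnce 2620":      {"appliance": 24, "library": 4},
    "StoreOnce 4210/4220": {"appliance": 48, "library": 6},
    "StoreOnce 4420/4430": {"appliance": 64, "library": 12},
}

def streams_within_limits(model, streams_per_library):
    """Check a planned layout (one concurrent-stream count per virtual
    library) against the recommended per-library and per-appliance
    maximums for the given StoreOnce model."""
    limits = RECOMMENDED_STREAMS[model]
    return (sum(streams_per_library) <= limits["appliance"]
            and all(s <= limits["library"] for s in streams_per_library))
```

For example, three libraries with 12 active streams each on a 4420/4430 stays within limits (36 total, 12 per library), whereas a single library driving 13 streams would exceed the per-library recommendation.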
The HP D2DBS emulation type and the ESL/EML type provide the most flexibility in numbers of cartridges and drives. This has two main benefits:
It allows for more concurrent streams on backups which are throttled due to host application throughput, such as multi-streamed backups
from a database.
It allows for a single library (and therefore Deduplication Store) to contain similar data from more backups, which then increases deduplication
ratio.
The D2DBS emulation type has an added benefit in that it is also clearly identified in most backup applications as a virtual tape library and so is
easier for supportability. It is the recommended option for this reason.
There are a number of other limitations from an infrastructure point of view that need to be considered when allocating the number of drives per
library. As a general point it is recommended that the number of tape drives per library does not exceed 64 due to the restrictions below:
For iSCSI VTL devices a single Windows or Linux host can only access a maximum of 64 devices. A single library with 63 drives is the most that a
single host can access. Configuring a single library with more than 63 drives will result in not all devices in the library being seen (which may
include the library device). The same limitation could be hit with multiple libraries and fewer drives per library.
A similar limitation exists for Fibre Channel. Although there is a theoretical limit of 255 devices per FC port on a host or switch, the actual limit appears to be 128 for many switches and HBAs. You should either balance drives across FC ports or configure fewer than 128 drives per library.
Some backup applications will deliver less than optimum performance if managing many concurrent backup tape drives/streams. Balancing the
load across multiple backup application media servers can help here.
Cartridge sizing
The size of a virtual cartridge has no impact on its performance and cartridges do not pre-allocate storage. It is recommended that cartridges are
created to match the amount of data being backed up. For example, if a full backup is 500 GB, the next larger configurable cartridge size is 800
GB, so this should be selected.
Note that if backups are to be offloaded to physical media elsewhere in the network, it is recommended that the cartridge sizing matches that of
the physical media to be used.
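The cartridge sizing guidance above can be sketched as follows. The list of configurable cartridge sizes is a hypothetical example (the text only confirms that 800 GB is the next configurable size above a 500 GB full backup); the capping rule implements the "whichever is the smaller" guidance from the best-practices summary.

```python
def choose_cartridge_size(full_backup_gb, configurable_sizes_gb,
                          physical_tape_gb=None):
    """Pick the smallest configurable cartridge size that holds a full
    backup; if backups will be offloaded to physical tape, cap the
    choice at the physical media size (whichever is the smaller)."""
    candidates = [s for s in sorted(configurable_sizes_gb)
                  if s >= full_backup_gb]
    size = candidates[0] if candidates else max(configurable_sizes_gb)
    if physical_tape_gb is not None:
        size = min(size, physical_tape_gb)
    return size

# Hypothetical list of configurable sizes in GB.
sizes_gb = [100, 200, 400, 800, 1600, 3200]
```

With these sizes, a 500 GB full backup selects the 800 GB cartridge, as in the text's example; if offload to 400 GB physical media were planned, the cartridge size would be capped at 400 GB instead.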
Creating a number of smaller deduplication “stores” rather than one large store which receives data from multiple backup hosts could have an
impact on the overall effectiveness of deduplication. However, generally, the cross-server deduplication effect is quite low unless a lot of
common data is being stored. If a lot of common data is present on two servers, it is recommended that these are backed up to the same virtual
library.
For best backup performance, configure multiple virtual libraries and use them all concurrently.
For best deduplication performance, use a single virtual library and fully utilize all the drives in that one library.
Backup application configuration
In general, backup application configurations for physical tape devices can be readily ported over to target a deduplicating virtual library with no changes; this is one of the key benefits of virtual libraries – seamless integration. However, considering deduplication in the design of a backup application configuration can improve performance, deduplication ratio, or ease of data recovery, so some time spent optimizing the backup application configuration is valuable.
For HP Data Protector and EMC Networker software a block size of 512 KB has been found to provide the best deduplication ratio and performance balance and is the recommended block size for these applications.
Some minor setting changes to upstream infrastructure might be required to allow backups with greater than 256 KB block size to be performed.
For example, Microsoft’s iSCSI initiator implementation, by default, does not allow block sizes that are greater than 256 KB. To use a block size
greater than this you need to modify the following registry setting:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters
Change the REG_DWORD MaxTransferLength to “80000” hex (524,288 bytes), and restart the media server – this will restart the iSCSI
initiator with the new value.
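The registry value can be sanity-checked with simple arithmetic; this only confirms that the "80000" hex value corresponds to a 512 KB block size, as the text states.

```python
# MaxTransferLength is specified in bytes. For a 512 KB block size:
block_size_kb = 512
max_transfer_length = block_size_kb * 1024   # 524,288 bytes

# 524,288 decimal is 0x80000, the hex value entered in the registry.
assert max_transfer_length == 0x80000
```

The same conversion applies if a different block size were ever needed: multiply the block size in KB by 1024 and express the result in hexadecimal.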
Retention policy
A long retention policy provides a more granular set of recovery points, with a greater likelihood that a file that needs to be recovered will be available for longer and in many more versions.
Rotation scheme
There are two aspects to a rotation scheme which need to be considered:
Full versus Incremental/Differential backups
Overwrite versus Append of media
With virtual tape a large number of cartridges can be configured for “free” and their sizes can be configured so that they are appropriate to the
amount of data stored in a specific backup. Appended backups are of no benefit because media costs are not relevant in the case of VTL.
Figure 15: Cartridges with appended backups (not recommended)
Taking the above factors into consideration, an example of a good rotation scheme where the customer requires weekly full backups sent offsite
and a recovery point objective of every day in the last week, every week in the last month, every month in the last year and every year in the last 5
years might be as follows:
4 daily backup cartridges, Monday to Thursday, incremental backup, overwritten every week.
4 weekly backup cartridges, Fridays, full backup, overwritten every fifth week
12 monthly backup cartridges, last Friday of month, overwritten every 13th month.
5 yearly backup cartridges, last day of year, overwritten every 5 years.
This means that in the steady state, daily backups will be small, and whilst they will always overwrite the previous week's, the amount of data overwritten will be small. Weekly full backups will always overwrite, but housekeeping has plenty of time to run over the following day or weekend, or whenever it is scheduled to run; the same is true for monthly and yearly backups.
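The example rotation scheme above can be expressed as a simple data structure, which makes it easy to confirm the total number of cartridges (25) that the virtual library needs to provide. The structure mirrors the four pools listed in the text; nothing beyond those figures is assumed.

```python
# The example rotation scheme, expressed as cartridge pools.
rotation_scheme = {
    "daily":   {"cartridges": 4,  "level": "incremental", "overwrite": "every week"},
    "weekly":  {"cartridges": 4,  "level": "full",        "overwrite": "every 5th week"},
    "monthly": {"cartridges": 12, "level": "full",        "overwrite": "every 13th month"},
    "yearly":  {"cartridges": 5,  "level": "full",        "overwrite": "every 5 years"},
}

# Total cartridges the virtual library must be configured with.
total_cartridges = sum(pool["cartridges"] for pool in rotation_scheme.values())
```

Since virtual cartridges cost nothing and pre-allocate no space, defining all 25 up front (matching the retention policy, as the best practices recommend) carries no storage penalty.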
StoreOnce NAS configuration guidelines
Introduction to StoreOnce NAS backup targets
The HP StoreOnce Backup system supports the ability to create a NAS (CIFS or NFS) share to be used as a target for backup applications.
The NAS shares provide data deduplication in order to make efficient use of the physical disk capacity when performing backup workloads.
The StoreOnce device is designed to be used for backup not for primary storage or general purpose NAS (drag and drop storage – random access).
Backup applications provide many configuration parameters that can improve the performance of backup to NAS targets, so some time spent
tuning the backup environment is required in order to ensure best performance.
OS       Version  Option
SLES     10       "sync" patch and "-o sync" mount option
SLES     11       "-o sync" mount option
RHEL     5        "sync" patch and "-o sync" mount option
RHEL     6        "-o sync" mount option
HP-UX    11iv2    "-o forcedirectio" option
HP-UX    11iv3    "-o forcedirectio" option
Solaris  9        "-o forcedirectio" option
Solaris  10       "-o forcedirectio" option
AIX      5.3      "-o forcedirectio" option
AIX      6.1      "-o forcedirectio" option
AIX      7.1      "-o forcedirectio" option
Once a StoreOnce CIFS share is created, subdirectories can be created via Explorer. This enables multiple host servers to back up to a single NAS
share but each server can back up to a specific sub-directory on that share. Alternatively a separate share for each host can be created.
The backup usage model for StoreOnce has driven several optimisations in the NAS implementation which require accommodation when creating
a backup regime:
Only backup files larger than 24 MB will be deduplicated. This works well with backup applications because they generally create large backup files and store them in configurable larger containers. Please note that simply copying (by drag and drop, for example) a collection of files to the share will not result in the smaller files being deduplicated.
There is a limit of 25000 files per NAS share; applying this limit ensures good replication responsiveness to data change. This is not an issue with many backup applications because they create large files, and it is very unlikely that there will be a need to store more than 25000 files on a single share.
A limit in the number of concurrently open files both above and below the deduplication file size threshold (24 MB) is applied. This prevents
overloading of the deduplication system and thus loss of performance. See Appendix A for values for each specific model.
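The 24 MB deduplication threshold and the 25000-file share limit can be checked for a planned set of backup files with a sketch like this. The two thresholds come from the text; the checking function itself is illustrative.

```python
DEDUPE_THRESHOLD_MB = 24     # only files larger than this are deduplicated
MAX_FILES_PER_SHARE = 25000  # file-count limit per NAS share

def check_share_layout(file_sizes_mb):
    """Report whether a planned set of backup files fits the NAS share
    file-count limit, and how many of them will be deduplicated."""
    dedup = [s for s in file_sizes_mb if s > DEDUPE_THRESHOLD_MB]
    return {
        "within_file_limit": len(file_sizes_mb) <= MAX_FILES_PER_SHARE,
        "deduplicated_files": len(dedup),
        "non_deduplicated_files": len(file_sizes_mb) - len(dedup),
    }
```

For instance, a share holding a 100 MB and a 30 MB backup container plus a 10 MB catalogue file would see the two large files deduplicated and the small metadata file stored without deduplication, which matches the behaviour described in this section.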
When protecting a large amount of data from several servers with a StoreOnce NAS solution it is sensible to split the data across several shares in
order to realise best performance from the entire system by improving the responsiveness of each store. Smaller stores have less work to do in
order to match new data to existing chunks so they can perform faster.
The best way to do this whilst still maintaining a good deduplication ratio is to group similar data from several servers in the same store. For
example: keep file data from several servers in one share, and Oracle database backups in another share.
If these open file thresholds are breached, the backup application will receive an error from the StoreOnce appliance indicating that a file could not be opened, and the backup will fail.
The numbers of concurrently open files in the table above do not guarantee that the StoreOnce appliance will perform optimally with that many concurrent backups. Nor do they take into account the fact that host systems may report a file as closed before the actual close takes place, which means that the limits provided in the table could be exceeded without realizing it.
Should the open file limit be exceeded, an entry is made in the StoreOnce Event Log so the user knows this has happened. The corrective action is to reduce the overall number of concurrent backups that caused too many files to be opened at once, for example by re-scheduling some of the backup jobs to run at a different time.
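As an illustration, the schedule-versus-limit check described above can be sketched as follows. The function name, the files-per-job figure, and the limit of 24 are hypothetical; the real per-model limits are listed in Appendix A.

```python
def within_open_file_limit(concurrent_jobs, files_per_job, share_open_file_limit):
    """Check a planned backup schedule against a share's concurrently
    open file limit (per-model limits are listed in Appendix A)."""
    return concurrent_jobs * files_per_job <= share_open_file_limit

# e.g. 12 concurrent jobs, each holding 2 files open, against a
# hypothetical limit of 24 open files per share stays within bounds;
# a 13th job would exceed it and should be re-scheduled
```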
When using a backup application with StoreOnce NAS shares, the user will need to configure a new type of device in the backup application. Applications vary in what they call a backup device located on a StoreOnce appliance; for example, it may be called a File Library, a Backup to Disk Folder, or even a Virtual Tape Library.
Most backup applications allow the operator to set various parameters on the NAS backup device that is created; these parameters are important in ensuring good performance in different backup configurations. The following generic best practices can be applied to all applications.
In addition to the data files, there will also be a small number of metadata files such as catalogue and lock files. These will generally be smaller than the 24 MB dedupe threshold size and will not be deduplicated. Metadata files are frequently updated throughout the backup process, so allowing them to be accessed randomly without deduplication ensures that they can be accessed quickly. The first 24 MB of any backup file is not deduplicated: for metadata files this means the whole file is not deduplicated, whereas for a backup file only the first 24 MB is not deduplicated. This architecture is completely invisible to the backup application, which is presented with its files in the same way as on any ordinary NAS share.
It is possible that the backup application will modify data within the deduplicated data region; this is referred to as a write-in-place operation. It is expected to occur rarely with standard backup applications because these generally perform stream backups and either create a new file or append to the end of an existing file rather than modifying a file in the middle.
If a write-in-place operation does occur, the StoreOnce appliance creates a new backup item that is not deduplicated. A pointer to this new item is then created so that when the file is read, the new write-in-place item is accessed instead of the original data within the backup file.
If a backup application were to perform a large number of write-in-place operations, backup performance would suffer because of the random access pattern that write-in-place creates.
Some backup applications provide the ability to perform “Synthetic Full” backups. These may produce many write-in-place operations or open a large number of files all at once; it is therefore recommended that Synthetic Full backup techniques are not used. See Synthetic full backups on page 41 for more information.
Generally, configuring larger backup container file sizes will improve backup performance and deduplication ratio because:
1. The overhead of the 24 MB dedupe region is reduced.
2. The backup application can stream data for longer without having to close and create new files.
3. There is a lower percentage overhead of control data within the file that the backup application uses to manage its data files.
4. There is no penalty to using larger backup files as disk space is not usually pre-allocated by the backup application.
If possible, the best practice is to configure a container file size that is larger than the complete backup will be (allowing for some data growth over time), so that only one file is used for each backup. Some applications limit the maximum size to something smaller than this; in that case, using the largest configurable size is the best approach.
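This sizing rule can be sketched as below; the function names and the 20% growth allowance are illustrative assumptions, not values taken from the product documentation.

```python
def container_file_size_gb(full_backup_gb, growth_factor=1.2):
    """Size the container file larger than the complete backup,
    allowing for some data growth over time (assumed 20% here)."""
    return full_backup_gb * growth_factor

def configured_size_gb(desired_gb, app_max_gb=None):
    """Some applications cap the configurable file size; if so,
    use the largest size the application allows."""
    if app_max_gb is not None and desired_gb > app_max_gb:
        return app_max_gb
    return desired_gb

# e.g. a 1000 GB full backup suggests a 1200 GB container file,
# clamped down if the backup application only supports 800 GB files
```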
Disk space pre-allocation
It is advised that disk space pre-allocation is NOT used, because it can result in unrealistically high deduplication ratios being reported when pre-allocated files are not completely filled with backup data. In extreme cases it can cause a backup failure due to a timeout if the application tries to write a small amount of data at the end of a large empty file, because the entire file must be padded out with zeros at creation time, which is a very time-consuming operation.
Concurrent operations
For best StoreOnce performance it is important to either perform multiple concurrent backup jobs or use multiple streams for each backup (whilst
staying within the limit of concurrently open files per NAS share). Backup applications provide an option to set the maximum number of
concurrent backup streams per file device; this parameter is generally referred to as the number of writers. Setting this to the maximum values
shown in the table below ensures that multiple backups or streams can run concurrently whilst remaining within the concurrent file limits for each
StoreOnce share.
The table below shows the recommended maximum number of backup streams or jobs per share to ensure that backups will not fail due to
exceeding the maximum number of concurrently open files. Note however that optimal performance may be achieved at a lower number of
concurrent backup streams.
These values are based on standard “file” backup using most major backup applications.
If backing up using application agents (e.g. Exchange, SQL, Oracle) it is recommended that only one backup per share is run concurrently because
these application agents frequently open more concurrent files than standard file type backups.
Overall best performance is achieved by running a number of concurrent backup streams across several shares; the exact number of streams
depends upon the StoreOnce model being used and also the performance of the backup servers.
Buffering
If the backup application provides a setting to enable buffering for reads and/or writes, this will generally improve performance by ensuring that the application does not wait for write or read operations to report completion before sending the next command. However, this setting could result in the backup application inadvertently causing the StoreOnce appliance to have more concurrently open files than the specified limits allow (because files may not have had time to close before a new open request is sent). If backup failures occur, disabling buffered writes and reads may fix the problem; in that case, reducing the number of concurrent backup streams and then re-enabling buffering will provide the best performance.
Appended backups should not be used: there is no benefit to the append model because it does not save on the disk space used.
Some backup applications also provide software encryption. This technology prevents both the restoration of data to another system and the interception of data during transfer. Unfortunately, it also has a very detrimental effect on deduplication, because the data backed up will look different in every backup, preventing similar data blocks from being matched.
The best practice is to disable software encryption and compression for all backups to the HP StoreOnce Backup system.
Verify
By default, most backup applications perform a verify pass on each backup job, in which they read the backup data back from the StoreOnce appliance and check it against the original data.
Due to the nature of deduplication, reading data is slower than writing because the data needs to be rehydrated. Running a verify will therefore more than double the overall backup time. If possible, verify should be disabled for all backup jobs to StoreOnce, but trial restores should still be performed on a regular basis.
None – This authentication mode requires no username or password authentication and is the simplest configuration. Backup applications will
always be able to use shares configured in this mode with no changes to either server or backup application configuration. However this mode
provides no data security as anyone can access the shares and add or delete data.
User – In this mode it is possible to create “local StoreOnce users” from the StoreOnce management interface. This mode requires the
configuration of a respective local user on the backup application media server as well as configuration changes to the backup application
services. Individual users can then be assigned access to individual shares on the StoreOnce appliance. This authentication mode is ONLY
recommended when the backup application media server is not a member of an AD Domain.
AD – In this mode the StoreOnce CIFS server becomes a member of an Active Directory domain. In order to join an AD domain, the user needs to provide credentials of a user who has permission to add computers and users to the domain. After joining, access to each share is controlled by the domain management tools, and domain users or groups can be given access to individual shares on the StoreOnce appliance. This is the recommended and preferred authentication mode if the backup application media server is a member of an AD domain.
Refer to the “HP StoreOnce Backup system user guide” for more information about configuring authentication.
StoreOnce Replication
StoreOnce replication is a concept that is used with VTL and NAS devices. The equivalent concept for Catalyst store is called Catalyst Copy, which
is described in Appendix C. All three device types use a deduplication-enabled, low bandwidth transfer policy to replicate data from a device on a
“replication source” StoreOnce Backup system to an equivalent device on another “replication target” StoreOnce Backup system. The
fundamental difference is that the backup application controls Catalyst store copy operations, whereas all VTL and NAS replication is configured
and managed on the StoreOnce Management GUI.
Replication provides a point-in-time “mirror” of the data on the source StoreOnce device at a target StoreOnce Backup system on another site;
this enables quick recovery from a disaster that has resulted in the loss of both the original and backup versions of the data on the source site.
Replication does not however provide any ability to roll-back to previously backed-up versions of data that have been lost from the source
StoreOnce Backup system. For example, if a file is accidentally deleted from a server and therefore not included in the next backup, and all
previous versions of backup on the source StoreOnce Backup system have also been deleted, those files will also be deleted from a replication
target device as the target is a mirror of exactly what is on the source device. The only exception to this is if a Catalyst device type is used because
the retention periods of data on the Target can be different (greater in most cases) than the retention periods at the source – giving an additional
margin on data protection.
Replication will not prevent backup or restore operations from taking place. If an item is re-opened for further backups or restore, then
replication of that item will be paused to be resumed later or cancelled if the item is changed.
Replication can also be configured to occur at specific times (via configurable blackout windows) in order to optimize bandwidth usage and not
affect other applications that might be sharing the same WAN link.
VTL and NAS replication is configured between devices using “Mappings”; it is not known to the backup software but is controlled entirely by the StoreOnce appliance. Catalyst Copy is controlled entirely by the backup software and has no mappings to configure within the device. A data import process is necessary to recover data from a target NAS or VTL device, but with Catalyst no backup application import is required because the additional copies are already known to the backup software.
Replication usage models (VTL and NAS only)
There are four main usage models for replication using StoreOnce VTL and NAS devices shown below.
Active/Passive – A StoreOnce system at an alternate site is dedicated solely as a target for replication from a StoreOnce system at a primary
location.
Active/Active – Both StoreOnce systems are backing up local data as well as receiving replicated data from each other.
Many-to-One – A target StoreOnce system at a data center is receiving replicated data from many other StoreOnce systems at other locations.
N-Way – A collection of StoreOnce systems on several sites are acting as replication targets for other sites.
The usage model employed will have some bearing on the best practices that can be employed to provide best performance. The following
diagrams show the usage models using VTL device types.
Figure 20: Many to One configuration
Figure 21: N-way configuration
In most respects StoreOnce VTL and StoreOnce NAS replication are the same. The only significant configuration difference is that VTL replication allows multiple source libraries to replicate into a single target library, whereas NAS mappings are 1:1; a target share may only receive data from a single source share. In both cases, source libraries or shares may only replicate into a single target. In addition, with VTL replication a subset of the cartridges within a library may be configured for replication, whereas a share may only be replicated in its entirety.
What to replicate
StoreOnce VTL replication allows for a subset of the cartridges within a library to be mapped for replication rather than the entire library (NAS
replication does not allow this).
Some retention policies may not require that all backups are replicated, for example daily incremental backups may not need to go offsite but
weekly and monthly full backups do, in which case it is possible to configure replication to only replicate those cartridges that are used for the full
backups.
Reducing the number of cartridges that make up the replication mapping may also be useful when replicating several source libraries from
different StoreOnce devices into a single target library at a data center, for example. Limited slots in the target library can be better utilized to
take only replication of full backup cartridges rather than incremental backup cartridges as well.
Configuring this reduced mapping does require that the backup administrator has control over which cartridges in the source library are used for
which type of backup. Generally this is done by creating media pools with the backup application then manually assigning source library
cartridges into the relevant pools. For example the backup administrator may configure 3 pools:
Daily Incremental, 5 cartridge slots (overwritten each week)
Weekly Full, 4 cartridge slots (overwritten every 4 weeks)
Monthly Full, 12 cartridge slots (overwritten yearly)
Replicating only the slots that will contain full backup cartridges saves five slots on the replication target device which could be better utilized to
accept replication from another source library.
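The slot arithmetic behind this example can be sketched as follows; the pool structure is hypothetical and simply mirrors the three pools described above.

```python
# Hypothetical media pools matching the rotation scheme above.
# Only the full-backup pools are mapped for replication.
pools = {
    "Daily Incremental": {"slots": 5, "replicate": False},  # stays on site
    "Weekly Full":       {"slots": 4, "replicate": True},
    "Monthly Full":      {"slots": 12, "replicate": True},
}

# Slots needed in the target library vs. slots freed for other sources
target_slots = sum(p["slots"] for p in pools.values() if p["replicate"])
saved_slots = sum(p["slots"] for p in pools.values() if not p["replicate"])
# target_slots == 16, saved_slots == 5
```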
Note: The Catalyst equivalent of this requires the actual Backup policies to define which backups to Catalyst stores are to be copied and which are
not – so for example, you could configure only Full backups to be copied to Catalyst stores.
Max Appliance Fan-out – The maximum number of target appliances that a source appliance can be paired with.
Max Appliance Fan-in – The maximum number of source appliances that a target appliance can be paired with.
Max Library Fan-in – The maximum number of source libraries that may replicate into a single target library on this type of appliance.
Max Library Fan-out – The maximum number of target libraries that may be replicated into from a single source library on this type of appliance.
Max Share Fan-in – The maximum number of source NAS shares that may replicate into a single target NAS share on this type of appliance.
Max Share Fan-out – The maximum number of target NAS shares that may be replicated into from a single source NAS share on this type of appliance.
It is important to note that when utilizing a VTL replication Fan-in model (where multiple source libraries are replicated to a single target library),
the deduplication ratio may be better than is achieved by each individual source library due to the deduplication across all of the data in the single
target library. However, over a large period of time the performance of this solution will be slower than configuring individual target libraries
because the deduplication stores will be larger and therefore require more processing for each new replication job.
Parameter HP StoreOnce 2620 HP StoreOnce 4210 HP StoreOnce 4220 HP StoreOnce 4420 / 4430
Appliance Fan-In 8 16 24 50
Appliance Fan-Out 2 4 4 8
Device Fan-In (VTL only) 1 8 8 16
Device Fan-out 1 1 1 1
Max Concurrent Outbound Jobs 12 24 24 48
Max Concurrent Inbound Jobs 24 48 48 96
For example, an HP StoreOnce 2620 may be replicating up to 4 jobs to a StoreOnce 4430, which may also be accepting another 44 source items from other StoreOnce systems. The target concurrency for a 4430 is 96, so the target is not the bottleneck to replication performance. If the total number of source replication jobs is greater than 96, the StoreOnce 4430 will limit replication throughput and replication jobs will queue until a slot becomes available.
What actually happens in replication?
Assuming the seeding process is complete (seeding is when the initial data is transferred to the target device), the basic replication process works
like this:
1. The source has a cartridge (VTL) or file (NAS) to replicate.
2. The source sends the target a “Manifest”: a list of all the hash codes that make up the cartridge or file and that it wants to send to the target.
3. The target replies: “I have 98% of those hash codes already – just send the 2% I don’t have.”
4. The source sends the 2% the target requested.
5. The VTL or NAS replication job executes and completes.
The bigger the change rate of data, the more “mismatch” there will be and the higher the volume of unique data that must be replicated over the
WAN.
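The manifest exchange above can be sketched in a few lines. This is a simplified illustration only: the chunk size, the use of SHA-1, and the dictionary-based target store are assumptions made for the sketch and do not reflect the actual StoreOnce hashing scheme.

```python
import hashlib

def replicate(source_data, target_store, chunk_size=4096):
    """Send the target a manifest of chunk hashes, then transfer only
    the chunks whose hashes the target does not already hold."""
    chunks = [source_data[i:i + chunk_size]
              for i in range(0, len(source_data), chunk_size)]
    manifest = [hashlib.sha1(c).hexdigest() for c in chunks]   # the "Manifest"
    missing = {h for h in manifest if h not in target_store}   # target's reply
    for h, c in zip(manifest, chunks):                         # send unique data only
        if h in missing:
            target_store[h] = c
    return len(missing), len(manifest)
```

With a higher change rate, fewer hashes match and `missing` grows, which is exactly why more unique data must cross the WAN.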
Additional tabs in the StoreOnce Management GUI can be used to control the bandwidth throttling used for replication and the blackout windows that prevent replication from happening at certain times.
It is recommended that the HP Sizing Tool (http://h30144.www3.hp.com/SWDSizerWeb/default.htm) is used to identify the product and WAN
link requirements because the required bandwidth is complex and depends on the following:
Amount of data in each backup
Data change per backup (deduplication ratio)
Number of StoreOnce systems replicating
Number of concurrent replication jobs from each source
Number of concurrent replication jobs to each target
Link latency (governs link efficiency)
As a general rule of thumb, however, a minimum bandwidth of 2 Mb/s per replication job should be allowed. For example, if a replication target is capable of accepting 8 concurrent replication jobs (HP 4220) and there are enough concurrently running source jobs to reach that maximum, the WAN link needs to provide 16 Mb/s to ensure that replication runs correctly at maximum efficiency; below this threshold, replication jobs may begin to pause and restart due to link contention. Note that this minimum value does not ensure that replication will meet the performance requirements of the replication solution; considerably more bandwidth may be required to deliver optimal performance.
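The rule of thumb above can be expressed as a one-line calculation; the function name is illustrative.

```python
def min_wan_bandwidth_mbps(max_concurrent_jobs, per_job_mbps=2):
    """Rule-of-thumb minimum WAN bandwidth: 2 Mb/s per concurrent
    replication job the target can accept."""
    return max_concurrent_jobs * per_job_mbps

# e.g. a target accepting 8 concurrent replication jobs needs
# at least 16 Mb/s of WAN bandwidth
```

Remember this is a floor, not a sizing result; the HP Sizing Tool should be used to determine the bandwidth actually required.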
However, before only unique data can be replicated between the source and target StoreOnce Backup systems, we must first ensure that each site has the same hash codes or “bulk data” loaded on it. This can be thought of as the reference data against which future backups are compared to see if the hash codes already exist on the target. The process of getting the same bulk data or reference data loaded on the StoreOnce source and StoreOnce target is known as “seeding”.
Note: With Catalyst the very first low bandwidth backup effectively performs its very own seeding operation.
Seeding is generally a one-time operation which must take place before steady-state, low bandwidth replication can commence. Seeding can take
place in a number of ways:
Over the WAN link – although this can take some time for large volumes of data. A temporary increase in WAN bandwidth provision by your telco can often alleviate this problem.
Using co-location, where the two devices are physically in the same location and can use a GbE replication link for seeding (this is best for Active/Active and Active/Passive configurations). After seeding is complete, one unit is physically shipped to its permanent destination.
Using a “floating” StoreOnce device which moves between multiple remote sites (best for many-to-one replication scenarios).
Using a form of removable media (physical tape or portable USB disks) to “ship data” between sites.
The recommended way to accelerate seeding is by co-location of the source and target systems on the same LAN whilst performing the first
replicate. This process will obviously involve moving one or both of the appliances and will thus prevent them from running their normal backup
routines. In order to minimize disruption seeding should ideally only be done once; in this case all backup jobs that are going to be replicated must
have completed their first full backup to the source appliance before commencing a seeding operation.
Once seeding is complete there will typically be a 90+% hit rate, meaning most of the hash codes are already loaded on the source and target and
only the unique data will be transferred during replication.
It is good practice to plan for seeding time in your StoreOnce Backup system deployment plan, as it can sometimes be very time consuming or manually intensive work. The Sizing Tool calculates expected seeding times over WAN and LAN to help set expectations for how long seeding will take. In practice, a gradual migration of backup jobs to the StoreOnce appliance ensures that there is not a sudden surge in seeding requirements but a gradual one, with weekends being used to perform high-volume seeding jobs.
During the seeding process it is recommended that no other operations are taking place on the source StoreOnce Backup system, such as further
backups or tape copies. It is also important to ensure that the StoreOnce Backup system has no failed disks and that RAID parity initialization is
complete because these will impact performance.
When seeding over fast networks (co-located StoreOnce devices) it should be expected that performance to replicate a cartridge or file is similar
to the performance of the original backup.
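As a rough illustration of why seeding over a WAN can take days, the transfer time can be estimated as below. The 80% link-efficiency factor is an assumption for the sketch; the Sizing Tool should be used for real planning.

```python
def seeding_hours(data_tb, link_mbps, link_efficiency=0.8):
    """Rough seeding-time estimate: data volume divided by the
    effective link rate. link_efficiency is an assumed factor
    covering latency and protocol overhead."""
    bits_to_send = data_tb * 8e12                        # decimal TB -> bits
    effective_rate = link_mbps * 1e6 * link_efficiency   # bits per second
    return bits_to_send / effective_rate / 3600          # seconds -> hours

# e.g. 1 TB over a 100 Mb/s WAN at 80% efficiency is roughly 28 hours,
# while the same transfer over a GbE co-location link is under 3 hours
```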
Replication models and seeding
The diagrams in Replication usage models starting on page 43 indicate the different replication models supported by HP StoreOnce Backup systems; the complexity of the replication model has a direct influence on which seeding process is best. For example, an Active/Passive replication model can easily use co-location to quickly seed the target device, whereas co-location may not be the best seeding method for a 50:1, many-to-one replication model.
Note: HP StoreOnce Catalyst copy seeding follows the same processes outlined below with the added condition that for multi-hop and one to
many replication scenarios the seeding process may have to occur multiple times.
Co-location (seed over LAN)
When to use: Active/Passive, Active/Active and Many-to-1 replication models with significant volumes of data (> 1 TB) to seed quickly, and where it would simply take too long to seed using a WAN link (> 5 days).
Considerations: This process involves the transportation of complete StoreOnce units. It may not be practical for large fan-in implementations, e.g. 50:1, because of the time delays involved in transportation. It can only really be used as a “one off” when replication is first implemented.
Notes: Seeding time over LAN is calculated automatically when using the Sizing tool for StoreOnce.

Floating StoreOnce
When to use: Many-to-1 replication models with high fan-in ratios where the target must be seeded with several remote sites at once.
Considerations: Careful control over the device creation and co-location replication at the target site is required; see the example below. Using the floating StoreOnce approach means the device is ready to be used again and again for future expansion where more remote sites might be added to the configuration.
Notes: This is really co-location using a spare StoreOnce. The last remote site StoreOnce can be used as the floating unit.

Backup application tape offload/copy from source and copy onto target
When to use: Suitable for all replication models, especially where remote sites are large (inter-continental) distances apart. Well suited to target sites that plan to have a physical tape archive as part of the final solution. Best suited for StoreOnce VTL deployments.
Considerations: Relies on the backup application supporting the copy process, e.g. “media copy”, “object copy”, “duplicate” or “cloning”.
Notes: Reduced shipping costs of physical tape media over actual StoreOnce units. Requires physical tape connectivity at all sites AND media server capability at each site, even if only for the seeding process. Backup application licensing costs for each remote site may be applicable.

Use of portable disk drives – backup application copy or drag and drop
When to use: USB portable disks, such as the HP RDX series, can be configured as disk file libraries within the backup application software and used for “copies”, or backup data can be dragged and dropped onto the portable disk drive, transported, and then dragged and dropped onto the StoreOnce target. Best used for StoreOnce NAS deployments.
Considerations: Multiple drives can be used – single drive maximum capacity is about 3 TB currently.
Notes: USB disks are typically easier to integrate into systems than physical tape or SAS/FC disks. RDX ruggedized disks are suitable for easy shipment between sites and cost effective.
Seeding methods in more detail
Seeding over a WAN link
With this seeding method the final replication set-up (mappings) can be established immediately.
Active/Passive: WAN seeding over the first backup is, in fact, the first wholesale replication.
Active/Active: WAN seeding after the first backup at each location is, in fact, the first wholesale replication in each direction.
Many to One
WAN seeding over the first backup is, in fact, the first wholesale replication from the many remote sites to the Target site. Care must be taken not
to run too many replications simultaneously or the Target site may become overloaded. Stagger the seeding process from each remote site.
Co-location (seed over LAN)
With this seeding method it is important to define the replication set-up (mappings) in advance so that, in the Many-to-One example for instance, the correct mapping is established at each site the target StoreOnce appliance visits before it is finally shipped to the Data Center site and the replication is “re-established” for the final time.
Active/Passive
Co-location seeding at Source (remote) site
Many to One
Co-location seeding at Source (remote) sites; transport target StoreOnce appliance
between remote sites.
1. Initial backup at each remote site.
2. Replication to Target StoreOnce appliance over GbE at each remote site.
3. Move Target StoreOnce appliance between remote sites and repeat replication.
4. Finally take Target StoreOnce appliance to Data Center site.
5. Re-establish replication.
Floating StoreOnce appliance method of seeding
Many-to-One seeding with floating StoreOnce target – for large fan-in scenarios
Co-location seeding at Source (remote) sites.
Transport floating target StoreOnce appliance between remote sites then perform replication at the
Data Center site. Repeat as necessary.
1. Initial backup at each remote site.
2. Replication to floating Target StoreOnce appliance over GbE at each remote site.
3. Move floating Target StoreOnce appliance between remote sites and repeat replication.
4. Take floating Target StoreOnce appliance to Data Center site.
5. Establish replication from the floating StoreOnce Target (now a Source) with the Target StoreOnce at the Data Center. Delete the devices on the floating Target StoreOnce appliance.
Repeat the process for further remote sites until all data has been loaded onto the Data Center Target StoreOnce appliance. You may be able to accommodate 4 or 5 sites of replicated data on a single floating StoreOnce appliance.
This “floating StoreOnce appliance” method is more complex because for large fan-in (many source sites replicating into single target site) the
initial replication set up on the floating StoreOnce appliance changes as it is then transported to the data center, where the final replication
mappings are configured.
The sequence of events is as follows:
1. Plan the final master replication mappings from sources to target that are required and document them. Use an appropriate naming
convention e.g. SVTL1, SNASshare1, TVTL1, TNASshare1.
2. At each remote site perform a full system backup to the source StoreOnce appliance and then configure a 1:1 mapping relationship with the floating StoreOnce appliance, e.g. SVTL1 on Remote Site A maps to FTVTL1 on the floating StoreOnce (FTVTL1 = floating target VTL1).
3. Seeding remote site A to the floating StoreOnce appliance will take place over the GbE link and should take only a few hours.
4. On the Source StoreOnce appliance at the remote site DELETE the replication mappings – this effectively isolates the data that is now on the
floating StoreOnce appliance.
5. Repeat the process steps 1-4 at Remote sites B and C.
6. When the floating StoreOnce appliance arrives at the central site, the floating StoreOnce appliance effectively becomes the Source device to
replicate INTO the StoreOnce appliance at the data center site.
7. On the Floating StoreOnce appliance we will have devices (previously named as FTVTL1, FTNASshare 1) that we can see from the Web
Management Interface. Using the same master naming convention as we did in step 1, set up replication which will necessitate the creation of
the necessary devices (VTL or NAS) on the StoreOnce 4220 at the Data Center site e.g. TVTL1, TNASshare 1.
8. This time when replication starts up the contents of the floating StoreOnce appliance will be replicated to the data center StoreOnce appliance
over the GbE connection at the data center site and will take several hours. In this example Remote Site A, B, C data will be replicated and
seeded into the StoreOnce 4220. When this replication step is complete, DELETE the replication mappings on the floating StoreOnce
appliance, to isolate the data on the floating StoreOnce appliance and then DELETE the actual devices on the floating StoreOnce appliance, so
the device is ready for the next batch of remote sites.
9. Repeat steps 1-8 for the next series of remote sites until all the remote site data has been seeded into the StoreOnce 4220.
10. Now we have to set up the final replication mappings using our agreed naming convention decided in Step 1. This time we go to the Remote
sites and configure replication again to the Data Center site but being careful to use the agreed naming convention at the data center site e.g.
TVTL1, TNASshare1 etc.
This time when we set up replication the StoreOnce 4220 at the target site presents a list of possible target replication devices available to
the remote site A. So in this example we would select TVTL1 or TNASshare1 from the drop-down list presented to Remote Site A when we are
configuring the final replication mappings. This time when the replication starts almost all the necessary data is already seeded on the
StoreOnce 4220 for Remote site A and the synchronization process happens very quickly.
Note: If using this approach with Catalyst stores that do not rely on “mappings”, the Floating StoreOnce appliance can be simply used to
collect all the Catalyst Items at the Remote sites if a consolidation model is to be deployed. If not, create a separate Catalyst store on the
Floating StoreOnce Appliance for each site.
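A minimal sketch of the naming convention from steps 1 and 2, assuming the FTVTL1 = "floating target VTL1" pattern generalizes by prefixing "F"; the site and device names are examples only.

```python
# Hypothetical master mapping plan (step 1):
# (remote site, source device) -> final target device at the Data Center
master_mappings = {
    ("RemoteSiteA", "SVTL1"):      "TVTL1",
    ("RemoteSiteA", "SNASshare1"): "TNASshare1",
    ("RemoteSiteB", "SVTL1"):      "TVTL2",
}

def floating_device_name(final_target_name):
    """Derive the temporary device name used on the floating appliance
    during seeding (step 2), e.g. TVTL1 -> FTVTL1."""
    return "F" + final_target_name
```

Documenting the plan up front, as step 1 requires, makes it straightforward to delete the temporary floating devices and re-establish the final mappings without mismatches.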
Seeding using physical tape or portable disk drive and backup application copy utilities
1. Initial backup to StoreOnce appliance.
2. Copy to tape(s) or a disk using backup application software on the Media Server. (For NAS devices only, simple drag and drop to a portable disk may be used instead. This technique is not possible at Sites A & B unless a media server is present.)
3. Ship tapes/disks to the Data Center site.
4. Copy tapes/disks into the target appliance using backup application software on the Media Server (or, for portable disks only, use drag and drop onto a NAS share on the StoreOnce target).
5. Establish replication.
In this method of seeding we use a removable piece of media (like LTO physical tape or removable RDX disk drive acting as a disk Library or file
library*) to move data from the remote sites to the central data center site. This method requires the use of the backup application software and
additional hardware to put the data onto the removable media.
* Different backup software describes “disk targets for backup” in different ways e.g. HP Data Protector calls StoreOnce NAS shares “ DP File
Libraries”, Commvault Simpana calls StoreOnce NAS shares “Disk libraries.”
Proceed as follows:
1. Perform full system backup to the StoreOnce Backup system at the remote site using the local media server, e.g. at remote site C.
The media server must also be able to see additional devices such as a physical LTO tape library or a removable disk device configured as a
disk target for backup.
2. Use the backup application software to perform a full media copy of the contents of the StoreOnce Backup system to a physical tape or
removable disk target for backup also attached to the media server.
In the case of removable USB disk drives, the capacity is probably limited to 2 TB; in the case of physical LTO-5 media it is limited to about 3 TB
per tape, though multiple tapes are supported if a tape library is available. For USB disks, a separate disk backup target would need
to be created on each removable RDX drive because multiple RDX removable disk drives cannot be spanned.
3. The media from the remote sites is then shipped (or even posted!) to the data center site.
4. Place the removable media into a library or connect the USB disk drive to the media server and let the media server at the data center site
discover the removable media devices.
The media server at the data center site typically has no information about what is on these pieces of removable media, and the data must be
made visible to it. This generally takes the form of what is known as an "import" operation, in which the
removable media is registered into the catalog/database of the media server at the data center site.
5. Create devices on the StoreOnce Backup system at the data center site using an agreed convention e.g. TVTL1, TNASshare1. Discover these
devices through the backup application so that the media server at the data center site has visibility of both the removable media devices AND
the devices configured on the StoreOnce Backup system.
6. Once the removable media has been imported into the media server at the data center site, it can be copied onto the StoreOnce Backup system
at the data center site (in the same way as at step 2) and, in the process of copying the data, we seed the StoreOnce Backup system at
the data center site. It is important to copy physical tape media into the VTL device that has been created on the StoreOnce Backup system,
and to copy the disk backup target device (RDX) onto the StoreOnce NAS share that has been created on the StoreOnce Backup system
at the data center site.
7. Now we have to set up the final replication mappings using our agreed naming convention. Go to the remote sites and configure replication
again to the data center site, being careful to use the agreed naming convention at the data center site e.g. TVTL1, TNASshare1 etc. This time
when we set up replication the StoreOnce 4220 at the target site presents a list of possible target replication devices available to the remote
site. So in this example we would select TVTL1 or TNASshare1 from the drop-down list presented to remote site C when configuring
the final replication mappings. This time when the replication starts, almost all the necessary data is already seeded on the StoreOnce 4220
for remote site C, so the synchronization process happens very quickly.
The media servers are likely to be permanently present at the remote sites and data center site so this is making good use of existing equipment.
For physical tape drives/library connection at the various sites SAS or FC connection is required. For removable disk drives such as RDX a USB
connection is the most likely connection because it is available on all servers at no extra cost.
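As a quick sanity check before shipping media, the arithmetic behind the copy step can be sketched as follows. This is an illustrative Python sketch, not an HP tool; the ~3 TB LTO-5 and ~2 TB RDX capacities are the approximations quoted above and should be replaced with your actual media capacities.

```python
import math

def media_needed(data_tb, media_capacity_tb, spanning_supported=True):
    """Estimate how many pieces of removable media a seeding copy needs.

    If spanning is not supported (RDX disks cannot be spanned), each
    disk must be configured as its own backup target at both sites.
    """
    count = math.ceil(data_tb / media_capacity_tb)
    if not spanning_supported and count > 1:
        return count, "create one disk target per RDX drive"
    return count, "single target, media can be spanned"

# Example: 7 TB of remote-site data to seed
print(media_needed(7, 3.0))         # ~3 TB LTO-5 tapes in a library
print(media_needed(7, 2.0, False))  # ~2 TB RDX disks, no spanning
```

The second case illustrates the constraint noted above: because RDX drives cannot be spanned, each of the four disks becomes a separate backup target that must be imported individually at the data center site.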
If the StoreOnce deployment is going to use StoreOnce NAS shares at source and target sites the seeding process can be simplified even further
by using the portable disk drives to drag and drop backup data from the source system onto the portable disk. Then transport the portable disk to
the target StoreOnce site and connect it to a server with access to the StoreOnce NAS share at the target site. Perform a drag and drop from
portable disk onto the StoreOnce NAS share and this then performs the seeding for you!
Note: Drag and drop is NOT to be used for day to day use of StoreOnce NAS devices for backup; but for seeding large volumes of sequential data
this usage model is acceptable.
Only HP Data Protector, Symantec NetBackup and Symantec Backup Exec support HP StoreOnce Catalyst, but Catalyst stores can be "copied" to
tape or USB disk using object copy (Data Protector) or duplicate commands (NetBackup). See Appendix C for more details.
Controlling Replication
In order to either optimize the performance of replication or minimize the impact of replication on other StoreOnce operations it is important to
consider the complete workload being placed on the StoreOnce Backup system.
By default replication will start quickly after a backup completes; this window of time immediately after a backup may become very crowded if
nothing is done to separate tasks. In this time the following are likely to be taking place:
• Other backups to the StoreOnce Backup system which have not yet finished
• Housekeeping of the current and other completed overwrite backups
• Possible copies to physical tape media of the completed backups
These operations will all impact each other's performance. Some best practices to avoid these overlaps are:
• Set replication blackout windows to cover the backup window period, so that replication will not occur while backups are taking place.
• Set housekeeping blackout windows to cover the replication period; some tuning may be required to set the housekeeping window
correctly and allow enough time for housekeeping to run.
• Delay physical tape copies to run at a later time, preferably at the weekend, when housekeeping and replication have completed.
The best practice is to set a blackout window throughout the backup window so that replication does not interfere with backup operations.
If tape copy operations are also scheduled, a blackout window for replication should also cover this time.
Care must be taken, however, to ensure that enough time is left for replication to complete. If it is not, some items will never be synchronized
between source and target and the StoreOnce Backup system will start to issue warnings about these items.
The replication blackout window settings can be found on the StoreOnce Management Interface on the HP StoreOnce - Replication - Local
Settings – Blackout Windows page.
This enables blackout windows to be set to cover the backup window over the night time period but also allow replication to run during the day
without impacting normal business operation.
Bandwidth limiting is configured by defining the speed of the WAN link between the replication source and target, then specifying a maximum
percentage of that link that may be used.
Again, however, care must be taken to ensure that enough bandwidth is made available to replication to ensure that at least the minimum (2 Mb/s
per job) speed is available and more, depending on the amount of data to be transferred, in the required time.
Replication bandwidth limiting is applied to all outbound (source) replication jobs from an appliance; the bandwidth limit set is the maximum
bandwidth that the StoreOnce Backup system can use for replication across all replication jobs.
The replication bandwidth limiting settings can be found on the StoreOnce Management Interface on the HP StoreOnce - Replication - Local
Settings - Bandwidth Limiting page.
There are two ways in which replication bandwidth limits can be applied:
• General bandwidth limit – this applies when no other limit windows are in place.
• Bandwidth limiting windows – these can apply different bandwidth limits at different times of the day.
A bandwidth limit calculator is supplied to assist with defining suitable limits.
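The arithmetic such a calculator performs can be sketched as follows. This is an illustrative sketch based only on the 2 Mb/s per-job minimum stated above, not HP's actual calculator; the link speed, percentage and job count are example inputs.

```python
def replication_bandwidth_ok(link_mbit, limit_percent, concurrent_jobs,
                             min_per_job_mbit=2.0):
    """Return (allowed Mbit/s, True if every concurrent replication job
    still receives at least the per-job minimum bandwidth)."""
    allowed = link_mbit * limit_percent / 100.0
    return allowed, allowed / concurrent_jobs >= min_per_job_mbit

# Example: 100 Mbit/s WAN link, replication capped at 30%, 12 concurrent jobs
allowed, ok = replication_bandwidth_ok(100, 30, 12)
print(allowed, ok)
```

If the check fails, either raise the percentage limit, reduce the number of concurrent replication jobs, or schedule replication into a window with a more generous bandwidth limit.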
See the “HP StoreOnce Backup system user guide” for information on how to configure Source Appliance Permissions.
Note the following changes to replication functionality when Source Appliance Permissions are enabled:
• Source appliances will only have visibility of, and be able to create mappings with, libraries and shares that they have already been
given permission to access.
• Source appliances will not be able to create new libraries and shares as part of the replication wizard process; instead these shares and
libraries must be created ahead of time on the target appliance.
Replication and Catalyst Copy monitoring
The aim of replication is to ensure that data is “moved” offsite as quickly as possible after a backup job completes. The “maximum time to
offsite” varies depending on business requirements. The StoreOnce Backup system provides tools to help monitor replication performance and
alert system administrators if requirements are not being met. For larger replication deployments HP Replication Manager v2.1 is recommended
and is free to download once a replication license or Catalyst license has been purchased. There is a top level replication monitoring status for VTL
and NAS as shown below within the main StoreOnce GUI.
Out of Sync notifications can be configured so that an alert is sent if the required maximum time to offsite is exceeded.
If these logs and alerts indicate a problem, best practices may be applied in order to get replication times back within required ranges.
Note: There are no such Out of Sync thresholds for Catalyst Copy as failure to copy is reported by the backup software’s Administration console.
Activity monitor in the StoreOnce GUI
The Activity page has a graph to show Replication and Catalyst Data throughput (inbound and outbound) over the last five minutes. The
throughput is the sum of all replication jobs and is averaged over several minutes. It provides some basic information about replication
performance but should be used mainly to indicate the general performance of replication jobs at the current time. This single activity report
supports all device types VTL, NAS and Catalyst.
A best practice is to use blackout windows so that replication jobs all run concurrently at a time when backup jobs are not running.
Replication Activity can be further monitored by clicking on the individual replication mappings as shown below.
The equivalent to monitoring replication “mappings” in Catalyst is performed by looking at the individual Catalyst stores and looking at the
inbound and outbound copy jobs as shown below.
From HP Replication Manager 2.1 onwards Catalyst stores are also supported for monitoring. HP Replication Manager is a free download if either
a replication licence or StoreOnce Catalyst licence has been purchased.
The Replication Manager software can be downloaded from http://www.software.hp.com/kiosk with the login and password supplied with the
replication/Catalyst licence purchase.
Full details are available in the HP Replication Manager 2.1 user guide. One of the most useful features of HP Replication Manager is the
Topology Viewer, which can be seen below.
The Topology shows the Device Status, Name and Replication status between devices. A tool tip is available when the cursor is over a device. This
tool tip contains additional information about the device.
Use the Page Navigation options at the bottom of the page to move to other islands.
Housekeeping monitoring and control
Terminology
Housekeeping: If data is deleted from the StoreOnce system (e.g. a virtual cartridge is overwritten or erased), any unused chunks will be marked
for removal, so space can be freed up (space reclamation). The process of removing chunks of data is not an inline operation because this would
significantly impact performance. This process, termed “housekeeping”, runs on the appliance as a background operation. It runs on a per
cartridge, Catalyst store and NAS file basis, and will run as soon as the cartridge is unloaded and returned to its storage slot or a NAS file has
completed writing and has been closed by the appliance, unless a housekeeping blackout window is set. Housekeeping also applies when data is
replicated from a source StoreOnce appliance to a target StoreOnce appliance – the replicated data on the target StoreOnce appliance triggers
housekeeping on the target StoreOnce appliance to take place. Blackout windows are also configurable on the target devices.
Blackout Window: This is a period of time (up to 2 separate periods in any 24 hours) that can be configured in the StoreOnce appliance during
which the I/O intensive process of Housekeeping WILL NOT run. The main use of a blackout window is to ensure that other activities such as
backup and replication can run uninterrupted and therefore give more predictable performance. Blackout windows must be set on BOTH the
source StoreOnce appliance and Target StoreOnce appliance.
This guide includes a fully worked example of configuring a complex StoreOnce environment, including setting housekeeping windows; see
Appendix B. An example from a StoreOnce source in the worked example is shown below:
In the above example we can see backups in green, housekeeping in yellow and replication from the source in blue. In this example we have
already set a replication blackout window which enables replication to run at 20:00.
Without a housekeeping blackout window set we can see how in this scenario where four separate servers are being backed up to the StoreOnce
Backup system, the housekeeping can interfere with the backup jobs. For example the housekeeping associated with DIR1 starts to affect the end
of backup DIR2 since the backup of DIR2 and the housekeeping of DIR1 are both competing for disk I/O.
By setting a housekeeping blackout window appropriately from 12:00 to 00:00 we can ensure the backups and replication run at maximum speed
as can be seen below. The housekeeping is scheduled to run when the device is idle.
However some tuning is required to determine how long to set the housekeeping windows and to do this we must use the StoreOnce Management
Interface and the reporting capabilities which we will now explain.
On the StoreOnce Management Interface go to the Housekeeping page; a series of graphs and a configuration capability is displayed. Let us look
at how to analyse the information the graphs portray.
There are four tabs on the Housekeeping page: Overall, Libraries, Shares and StoreOnce Catalyst. The Overall tab shows the total housekeeping
load on the appliance. The other tabs can be used to select the device type and monitor housekeeping load on individual named VTL, NAS shares
or Catalyst stores. Note how the Housekeeping blackout window configuration setting is shown below the Housekeeping status. The
housekeeping blackout window is set on an appliance basis not an individual device type basis.
(Graph legend: housekeeping jobs received versus housekeeping jobs processed.)
The Housekeeping load on the target replication devices is generally higher than on the source devices and must be monitored/observed on those
devices – you cannot monitor the target housekeeping load from the source device.
Load graph (top graph): displays the level of load the StoreOnce appliance is under while housekeeping is being processed. This
graph is intended for use when housekeeping is affecting the performance of the StoreOnce appliance (e.g. housekeeping has been running
nonstop for a couple of hours); if housekeeping is idle most of the time, no information will be displayed.
In the above graph we show two examples: one where the housekeeping load increases and then subsides, which is normal, and another where
the housekeeping job queue continues to grow over time. This second condition is a strong indication that housekeeping jobs are
not being dealt with efficiently: perhaps the housekeeping activity window is too short (housekeeping blackout window too large), or the
StoreOnce appliance is being overloaded with backup and replication jobs and the unit may be undersized.
Another indicator is the Time Idle status, which is a measure of the housekeeping empty-queue time. If the % idle over 24 hours is 0, the
box is fully occupied, which is not healthy; but this may be acceptable if the % idle over 7 days is not also 0. For example, if the appliance is 30%
idle over 7 days then it is probably operating within reasonably safe limits.
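The rule of thumb above can be expressed as a simple check. This is an illustrative sketch of the guidance, not an HP-supplied diagnostic; the thresholds are the examples quoted in the text.

```python
def housekeeping_health(idle_24h_pct, idle_7d_pct):
    """Classify housekeeping load from the Time Idle statistics.

    Encodes the guidance above: 0% idle over 24 hours is tolerable
    only if the 7-day idle figure is still above zero.
    """
    if idle_24h_pct > 0:
        return "healthy"
    if idle_7d_pct > 0:
        return "busy day, but OK over the week"
    return "overloaded: resize appliance or widen housekeeping window"

print(housekeeping_health(15, 40))
print(housekeeping_health(0, 30))
print(housekeeping_health(0, 0))
```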
Signs that the housekeeping load is becoming too high are that backups may start to slow down or backup performance becomes unpredictable.
Corrective actions if idle time is low or the load continues to increase are:
a) Use a larger StoreOnce appliance or add additional shelves to increase I/O performance.
b) Restructure the backup regime to remove appends on tape, or keep appends on separate cartridges; the bigger the tapes become (through
appends), the more housekeeping they generate when they are overwritten.
c) Increase the time allowed for housekeeping to run by reducing the housekeeping blackout windows.
If you do set up housekeeping blackout windows (up to two periods per day, 7 days per week), note that a window cannot end at 00:00; to
cover, say, 18:00 to midnight you must set 18:00 to 23:59. In addition there is a Pause Housekeeping button, but use this with caution because
it pauses housekeeping indefinitely until you restart it.
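The end-time rule can be captured in a small validation sketch. This is illustrative only; the StoreOnce Management Interface performs its own checks, and the "HH:MM" string representation here is an assumption for the example.

```python
def validate_blackout(start, end):
    """Check a blackout window against the UI rules noted above.

    Times are zero-padded "HH:MM" strings, which compare correctly
    as plain strings. An end of 00:00 is rejected (use 23:59), and a
    window may not wrap past midnight (use two windows instead).
    """
    if end == "00:00":
        raise ValueError('end of 00:00 is not accepted; use "23:59"')
    if end <= start:
        raise ValueError("window must not wrap past midnight; "
                         "split it into two windows instead")
    return (start, end)

print(validate_blackout("18:00", "23:59"))
```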
Finally, remember that it is best practice to set housekeeping blackout windows on both the source and target devices. The diagram below shows
the target device from the worked example later in this document where replication from several source sites is arriving. Two replication blackout
windows are set on the target device, 10:00 to 14:00 and 20:00 to 02:00 (see below). Note how the replication process of data received at the
target (shown here in Blue) triggers housekeeping which must be managed. If housekeeping is not controlled at the target it can start to impact
replication performance from the source. In general the housekeeping load at the target in many to one replication scenarios is higher than that
of any individual source and so a larger housekeeping period must be provisioned.
Consider improving the situation by imposing two Housekeeping Windows on the Target Device as shown below
(Figure: worked-example schedule for the target device on a 24-hour GMT timeline, showing backup, replication and housekeeping activity for
each job (DIR1-DIR3, A&D shares, SQL, Special, Local Backup App), the start of the replication window, the two housekeeping windows, the
resulting reduced load, and the spare time left for physical tape offload.)
Tape Offload
Terminology
Direct Tape Offload
This is when a physical tape library device is connected directly to the rear of the StoreOnce Backup system.
This offload feature is not currently supported on HP StoreOnce Backup systems.
When reading data in this manner from the StoreOnce Backup system, the data to be copied must be read from the StoreOnce appliance and
"reconstructed", then copied to physical tape. Just as the backup proceeds faster with more parallel backup streams to the StoreOnce
appliance, the larger the number of parallel reads generated for the tape copy, the faster the copy to tape will take place, even if this means
less than optimal usage of physical tape media.
Scheduling tape offload to occur at less busy periods, such as weekends, is also highly recommended, so that the read process has maximum I/O
available to it.
Tape Offload/Copy from StoreOnce Backup system versus Mirrored Backup from Data Source
A summary of the supported methods is shown below.
Copy from the StoreOnce Backup system (backup application copy):
• The backup application controls the copy from the StoreOnce appliance to the network-attached tape drive, so that:
  – It is easier to find the correct backup tape.
  – The scheduling of copy to tape can be automated within the backup process.
• Constraints: streaming performance will be slower because data must be reconstructed.
Mirrored backup from the data source:
• This is a parallel activity: the host backs up to the StoreOnce appliance and to tape simultaneously. It has the following benefits:
  – The backup application still controls the copy location.
  – It has the highest performance because there are no read operations and reconstruction from the StoreOnce appliance.
• Constraints: it requires the scheduling of specific mirrored backup policies, and this method is generally only available at the "Source"
side of the backup process; offloading to tape at the target site can only use the backup application copy to tape method.
The same applies in a StoreOnce Catalyst Copy model. However, the StoreOnce Catalyst Copy feature allows the backup application to
incorporate tape offload, as well as Catalyst Store copy between StoreOnce appliances into a single backup job specification. The following
examples relate to StoreOnce Replication. Please see Appendix C for examples that are more relevant to the StoreOnce Catalyst model.
1. Catalyst Copy command.
2. Low bandwidth Catalyst Copy.
3. Rehydration and full bandwidth copy to tape under the control of the ISV software.
VTL and NAS device types
1. Backup data written to StoreOnce Source.
2. StoreOnce low bandwidth replication.
3. All data stored safely at DR site. Data at the StoreOnce target (written by the StoreOnce source via replication) must be imported to
Backup Server B before it can be copied to tape.
Figure 30: Backup application tape offload at StoreOnce target site for VTL and NAS device types
Note: Target Offload can vary from one backup application to another in terms of import functionality. Please check with your vendor.
1. Copy StoreOnce device to physical tape: this uses a backup application copy job to copy data from the StoreOnce appliance to physical tape;
it is easy to automate and schedule but has slower copy performance.
2. Mirrored backup: a specific backup policy used to back up to StoreOnce and physical tape simultaneously (mirrored write) at certain times
(e.g. monthly). This is a faster copy to tape method.
Figure 31: Backup application tape offload at StoreOnce source site for VTL and NAS device types
As can be seen in the diagrams above – offload to tape at the source site is somewhat easier because the backup server has written the data to
the StoreOnce Backup system at the source site. In the StoreOnce Target site scenario (Figure 30), some of the data on the StoreOnce Backup
system may have been written by Backup Server B (local DR site backups, maybe) but the majority of the data will be on the StoreOnce Target via
low bandwidth replication from StoreOnce Source. In this case, the Backup Server B has to “learn” about the contents of the StoreOnce target
before it can copy them and the typical way this is done is via “importing” the replicated data at the StoreOnce target into the catalog at Backup
Server B, so that it knows what is on each replicated virtual tape or StoreOnce NAS share. Copy to physical tape can then take place. These
limitations do not exist if HP StoreOnce Catalyst device types are used.
If the StoreOnce 4420 reads with a single stream (to physical tape), the copy rate is low. However, if the copy jobs are configured to use multiple
readers and multiple writers (for example, four streams being read), it is possible to achieve much higher copy performance, although this
requires more physical tape drives in the library or the use of multiplexing to tape.
What this means in practice is that you must schedule specific time periods for tape offloads when the StoreOnce Backup system is not busy and
use as many parallel copy streams (tape devices) as practical to improve the copy performance.
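A back-of-envelope estimate shows why the number of streams matters. The per-stream rate below is a placeholder assumption, not an HP specification; measure your own appliance and substitute the observed figure.

```python
def offload_hours(data_tb, streams, per_stream_tb_per_hr=0.25):
    """Estimate tape-offload duration assuming aggregate read
    throughput scales with the number of parallel streams (which in
    turn requires that many tape drives, or multiplexing to tape)."""
    return data_tb / (streams * per_stream_tb_per_hr)

# Offloading 8 TB from the appliance:
print(offload_hours(8, 1))   # single stream
print(offload_hours(8, 4))   # four streams, four drives or multiplexing
```

At the assumed rate, moving from one to four streams cuts an 8 TB offload from 32 hours to 8 hours, which is why a dedicated, multi-stream offload window is recommended.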
1. Single stream read performance.
2. Much higher read throughput (for tape offload) with four streams.
Summary of Best Practices
1. The first recommendation is to assess the real need for tape offload: with StoreOnce replication or Catalyst Copy in place, is another copy
of the data really necessary? How frequent do the offloads to tape really need to be? Monthly offloads to tape are probably acceptable for
most scenarios that must have offload to tape.
2. Catalyst "integrated tape copies" are the easiest way to implement a tape offload strategy from within backup policy definitions, without the
need for imports at the DR site, but this functionality is only available with HP Data Protector, Symantec NetBackup and Symantec Backup
Exec backup software.
3. For "media copies" it is always best to match the StoreOnce VTL cartridge size with the physical media cartridge size to avoid wastage.
For example, if using physical LTO-4 drives (800 GB tapes), then when configuring StoreOnce Virtual Tape Libraries the StoreOnce cartridge
size should also be configured to 800 GB.
Schedule time for tape offloads: the StoreOnce Backup system runs many different functions (backup and deduplication, replication
and housekeeping), and tape offload is another task that involves reading data from the appliance. To ensure the offload to tape performs
as well as it can, no other processes should be running; in practice this means actively planning downtime for tape offloads.
Offload a little at a time or all at once? Depending on the usage model of the StoreOnce Backup system and the amount of data to be
offloaded, you may be able to dedicate many hours to tape offload once per month. Or, if the StoreOnce Backup system is very busy
most of the time, you may have to schedule smaller, more frequent weekly offloads to tape.
The example below shows where 2 hours per day have been dedicated to tape offload (purple) at the target site
(taken from our worked example from Appendix B).
(Figure: 24-hour GMT schedule from the worked example, showing backup, replication and housekeeping activity for each job, the reduced
load, and the daily two-hour tape offload slots.)
4. Importing data? When copying to tape at a StoreOnce target site, it is only possible to copy to physical tape after the backup server is aware
of the replicated data from the StoreOnce source. It is important to walk through this import process and schedule it to occur in advance
of the copy process; otherwise the copy process will be unaware of the data that must be copied. In the case of HP Data Protector,
specific scripts have been developed that can poll the StoreOnce target to interrogate newly replicated cartridges and NAS files; HP Data
Protector can then automatically schedule import jobs in the background to import the cartridges/shares so that when the copy job runs, all
is well. Other backup applications' methods may vary in this area. For example, for most backup applications the StoreOnce target can be
read-only to enable the copy to tape, but Symantec Backup Exec requires read/write access, which involves breaking the replication mappings.
Please check with your backup application provider before relying on the tape offload process, or perform a Disaster Recovery offload to
tape to test the end-to-end solution.
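For point 3 above, the cost of mismatched cartridge sizes can be estimated with a simple calculation. This is an illustrative sketch; it assumes one virtual cartridge is copied per physical tape, the common media-copy case.

```python
def tape_wastage_pct(virtual_cart_gb, physical_cart_gb):
    """Percent of each physical tape left unused when one StoreOnce
    virtual cartridge is copied per physical tape."""
    used = min(virtual_cart_gb, physical_cart_gb)
    return round(100 * (1 - used / physical_cart_gb), 1)

print(tape_wastage_pct(800, 800))  # matched LTO-4 sizing: no wastage
print(tape_wastage_pct(400, 800))  # half-size virtual cartridge
```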
Appendix A
Key reference information
StoreOnce Single Node Products, Software 3.4.x and later
Values are listed in model order: 2610 iSCSI G3 / 2620 iSCSI G3 / 4210 iSCSI/FC G3 / 4220 G3 / 4420 G3 / 4430 G3

Devices
Usable Disk Capacity (TB, with full expansion): 1 / 2.5 / 9 / 18 / 38 / 76
Max Number Devices (VTL/NAS): 4 / 8 / 16 / 24 / 50 / 50

Replication
Max VTL Library Rep Fan Out: 1 / 1 / 1 / 1 / 1 / 1
Max VTL Library Rep Fan In: 1 / 1 / 8 / 8 / 16 / 16
Max Appliance Rep Fan Out: 2 / 2 / 4 / 4 / 8 / 8
Max Appliance Rep Fan In: 4 / 8 / 16 / 24 / 50 / 50
Max Appliance Concurrent Rep Jobs Source: 12 / 12 / 24 / 24 / 48 / 48
Max Appliance Concurrent Rep Jobs Target: 24 / 48 / 48 / 48 / 96 / 96

Physical Tape Copy Support
Supports direct attach of physical tape device: No (all models)
Max Concurrent Tape Attach Jobs Appliance: N/A (all models)

VTL
Max VTL Drives Per Library/Appliance: 16 / 32 / 64 / 96 / 200 / 200
Max Cartridge Size (TB): 3.2 (all models)
Max Slots Per Library (D2DBS, EML-E, ESL-E Lib Type): 96 / 96 / 1024 / 1024 / 4096 / 4096
Max Slots Per Library (MSL2024, MSL4048, MSL8096 Lib Type): 24, 48, 96 (all models)
Max active streams per store: 32 / 48 / 64 / 96 / 128 / 128
Recommended Max Concurrent Backup Streams per appliance: 16 / 24 / 48 / 48 / 64 / 64
Recommended Max Concurrent Backup Streams per Library: 4 / 4 / 6 / 6 / 12 / 12

NAS
Max files per share: 25000 (all models)
Max NAS Open Files Per Share > DDThreshold*: 32 / 48 / 64 / 64 / 128 / 128
Max NAS Open Files Per Appliance > DDThreshold*: 32 / 48 / 64 / 64 / 128 / 128
Max NAS Open Files Per Appliance (concurrent): 96 / 112 / 128 / 128 / 640 / 640
Recommended Max Concurrent Backup Streams per appliance: 16 / 24 / 48 / 48 / 64 / 64
Recommended Max Concurrent Backup Streams per Share: 4 / 4 / 6 / 6 / 12 / 12

StoreOnce Catalyst
Catalyst Command Sessions: 16 / 16 / 32 / 32 / 64 / 64
Maximum Concurrent outbound copy jobs per appliance: 12 / 48 / 24 / 24 / 48 / 48
Maximum Concurrent inbound data and copy jobs per appliance: 12 / 48 / 96 / 96 / 192 / 192

Performance
Max Aggregate Write Throughput, Catalyst Low Bandwidth (TB/hr): 1 / 1 / 2.2 / 2.2 / 10.8 / 12.5
Max Aggregate Write Throughput, Non-Catalyst (TB/hr): 0.67 / 0.67 / 2.9 / 3.3 / 4.8 / 4.8
Min streams required to achieve max aggregate throughput**: 6 / 6 / 12 / 16 / 16 / 20
* DDThreshold is the size a file must reach before it is deduplicated; it is set to 24 MB.
** Assumes no backup client performance limitations.
Appendix B – Fully Worked Examples
In this section we will work through a complete multi-site, multi-region StoreOnce design, configuration and deployment tuning. The following
steps will be undertaken:
• Hardware and site configuration definition
• Backup requirements specification
• Using the HP Storage Sizing Tool, size the hardware requirements, link speeds and likely costs
• Work out each StoreOnce device configuration (NAS, VTL, number of shares, and so on) using best practices articulated earlier in
this document
• Configure StoreOnce source devices and the replication target configuration
• Map out, for sources and target, the interaction of backup, housekeeping and replication
• Fine-tune the solution using replication blackout windows and housekeeping blackout windows
The worked example below may seem rather complicated at times but it is specifically designed to tease out many different facets of the design
considerations required to produce a comprehensive and high performance solution.
A Catalyst worked example is also shown later in this Appendix. It is expected that Catalyst deployments will be "all Catalyst" with little or no
mixing with VTL and NAS emulations. Catalyst deployments are also limited to HP Data Protector, Symantec NetBackup and Symantec Backup
Exec software, and the preferred usage model is low bandwidth Catalyst implementations.
Backup requirements specification
Data Center E
Fibre Channel VTL emulations required for local backups, growth 20% size for 1 year
Server 1 - 500GB Exchange => Lib 1 - 4 backup streams
Server 2 - 500GB Special App => Lib 2 – 1 backup stream
Rotation Scheme – 500GB Daily Full, retained for 1 month.
12 hour backup window
Replication target for sites A, B, C, D (this means we have to size for replication capacity AND local backups on Site E)
Monthly archive required at Site E from StoreOnce to Physical Tape
One of the key parameters in sizing a solution such as this is trying to estimate the daily block level change rate for data in each backup job. In this
example we will use the default value of 2% in the HP StoreOnce sizing tool. http://h30144.www3.hp.com/SWDSizerWeb/default.htm
Working through this example is strongly recommended because valuable insight can be gained by performing the practical sizing exercise.
Using the HP Storage sizing tool
Configure replication environment
1. Click on Backup Calculators and then Design StoreOnce Replication Over WAN to get started.
2. Configure the replication environment for 4 source appliances to 1 target appliance (five appliances in total).
This Replication Model is known as Many-to-One replication.
The Replication Window allowed is 12 hours.
The size of the target device is initially based on capacity, so select “Any” as the Target Device type.
Because A and D, and B and C, are identical sites we can choose three sites: enter the data once for Site A and Site B, then create two identical
sites for D and C within the HP Storage Sizing tool.
3. Click Launch Sites.
For each Source and then for the Target, enter the backup sizes and rotation schemes. Source A and Target E are shown as examples in the Sizing
Tool screenshots below. Inputs for Source A are shown below; the inputs for Source D, which are identical, can also be added by incrementing the
number of similar sites in the Total Source Sites drop-down list shown in the screenshot on the following page. For simplicity we will size for a
single year’s growth of 20%.
IT IS VERY IMPORTANT that when you are creating the Backup specifications in the HP Storage Sizing Tool you pay particular attention to the Field
Number of parallel Backup Streams. This field determines the backup throughput and the amount of concurrent replications that are possible
BUT it may require a conscious change by the customer in backup policies to make it happen.
The following screenshot illustrates how you enter the backup data sizes for Sites A and D.
In the case of sites A and D, when we enter all the backup jobs, we will have seven backup jobs running in parallel which will give us good
throughput and backup performance.
Dedupe-Input tab
Click on the Dedupe Tab to set up the data types/change rate and retention periods.
The following screen shows the input for Job Filesystem1 from a dedupe perspective: 2% daily change rate, incrementals 10% of Full. It
shows deduplication and retention periods for Sites A and D.
To create our specific retention policy, click Create Retention and input the retention policy, creating a series of steps with names: daily, weekly,
monthly, etc.
The Retention Planner can schedule a wide range of retention schemes; below we see the daily incremental, 4 x weekly full and single monthly
full required in the specification for sites A and D.
After the retention times have been configured for FileSystem 1, click OK.
Dedupe Output tab
Let’s now look at the Output tab of the Dedupe section of the job specification; it displays the predicted dedupe ratio you will achieve over the
period of time for which you are sizing. Use the slider to see the dedupe ratio at the right of the table.
If you click on YoY Growth you get the full picture of deduplication performance over 10 years with the predicted growth rate.
Dedupe analysis
Click Add Job when finished for FileSystem1 and repeat for other data type entries as per the specification above.
The following screenshot shows all Backup data entered for Sites A and D.
We will now repeat the process for Sites B and C – by selecting HP StoreOnce Source #2 in the left hand navigation pane.
The following screenshot shows how to enter the backup data sizes, dedupe parameters and retention periods for Sites B and C each of which
provides 4 backup streams.
Note Target Emulation is now VTL, Total Source Sites =2 (B&C). Retention is created the same as the previous example.
Data Center E
Fibre Channel VTL emulations required for local backups (note selection of FC device as mandatory), growth 20% size for 1 year, 2% change rate.
Server 1 - 500GB Exchange => Lib 1 - 4 backup streams
Server 2 - 500GB Special App => Lib 2 – 1 backup stream
Rotation Scheme – 500GB Daily Full, retained for 1 month.
12 hour backup window
Input job entries for Site E on the Backup tab. It requires full backups every day for 29 days and also requires FC attach, so check FC is mandatory
in the System interface area. The rotation scheme for Site E is Fulls & Fulls.
The following screenshot shows how to enter the backup schedule (fulls every day) for Sites E along with the requirement for FC connectivity.
On the Dedupe – Input tab, the retention period is set to Fulls for 29 days.
The following screenshot shows how to enter the Data retention periods for Target Site E.
If we were to look at the Dedupe - Output tab we would see, because all these backups are fulls over a month, the deduplication ratio is much
higher.
You have now finished defining the jobs for all five appliances (four sources and one target).
Press the Solve/Submit button and the Sizing Tool will do the rest.
The following screenshot shows the Target Output tab and location of the Solve/Submit button.
Sizing Tool output
The Sizing Tool creates two outputs.
It creates an Excel spreadsheet with all the parts required for the solution, including service and support, and any licenses required,
together with the list pricing (list pricing not shown for commercial reasons).
It creates an HTML solution overview (see example in the next section) which indicates the types of devices to be used at source and target, the
amount of data to be replicated to and stored on the target, and the link speeds at source and target for the specified replication window.
The HTML solution overview shows: the source and target link sizes required; the amount of data in GB transmitted from source to target in the
worst case (fulls); the models sized (including additional shelves for capacity or performance); and the replication concurrency (derived from
backup streams) versus the maximum source or target concurrency of the device itself. This can be useful in identifying bottlenecks. In this case
the target concurrency is not exceeded (using 22 of 48 streams available).
Observations
We are not saturating the target device; we only have 22 concurrent replication jobs running and the target can accept 48 concurrently.
This allows more remote sites to be brought online without exceeding the target concurrency.
You might wonder why, for Sources 3 and 4, the link size is smaller when the replicated data (TxGB) is higher than for Sources 1 and 2. The link
size does not depend only on the data being replicated; two other key factors determine the link bandwidth of each source site:
1) concurrency level
2) data change level – the ratio Total TxGB/Total Data GB.
For each of source sites 3 and 4 (B, D):
Effective Link bandwidth = number of concurrent replication streams (Used Ccy) * average replication throughput per stream in MB/sec
* (Total TxGB/Total Data GB) * 8.0
Effective Link bandwidth = 4 * 3.03 * (19.73/720) * 8.0 = 2.66 Mbit/sec
Final Link Size = 2.95 Mbit/sec (after 90% link efficiency)
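The calculation above can be reproduced with a short script. This is a sketch using the figures from the sizing-tool output; the function name is illustrative and is not part of any HP tool:

```python
# Sketch of the per-site link sizing shown above (illustrative only).

def effective_link_mbit(streams, mb_per_s_per_stream, tx_gb, total_gb):
    """Effective WAN bandwidth in Mbit/sec for one source site."""
    return streams * mb_per_s_per_stream * (tx_gb / total_gb) * 8.0

# Figures for source sites 3 and 4: 4 streams, 3.03 MB/sec per stream,
# 19.73 GB transmitted out of 720 GB of total backup data.
effective = effective_link_mbit(4, 3.03, 19.73, 720)
sized = effective / 0.90   # allow for 90% link efficiency

print(f"Effective link bandwidth: {effective:.2f} Mbit/sec")  # 2.66
print(f"Final link size:          {sized:.2f} Mbit/sec")      # 2.95
```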
In this case the target WAN link is almost the sum of the separate source WAN links but this is not always the case – especially if the
target is the bottleneck because its concurrency level is exceeded. It is pointless paying for WAN bandwidth if you can’t use it.
We have to size worst case for the WAN link and the worst case is when a full backup is run because more data is replicated and it still
has to meet the 12-hour backup window in our example.
The HTML output also provides an estimate of seeding times (see below). The WAN-based figure uses the calculated bandwidth – and it can take
a long time to seed over a small link; a temporary increase in WAN link size from your telco may help here.
The Seeding Hours LAN figure is based on using "co-location" and a direct 1GbE network between StoreOnce systems.
If the target was the bottleneck because its replication concurrency was being exceeded then you can experiment with the Sizing Tool by forcing
the target to be sized as a larger device so that the target concurrency is not exceeded – this makes for more effective replication and higher
headroom for growth.
Or go to the Replication Designer and choose a specific target model there.
Configure StoreOnce source devices and replication target configuration
This next stage looks at the detailed configuration that will be required when we deploy this configuration.
Sites A and D
The customer has already told us he wants NAS emulation at sites A and D.
Server 1 – Filesystem data 1, 100GB, spread across 3 mount points
Server 2 – SQL data, 100GB
Server 3 – Filesystem 2, 100GB, spread across 2 mount points
Server 4 – Special App Data, 100GB
On sites A and D the StoreOnce units will be configured with four NAS shares (one for each server); the filesystem servers will be configured with
subdirectories for each of the mount points. These subdirectories can be created on the StoreOnce NAS CIFS share by using Windows Explorer
(e.g. Dir1, Dir2, Dir3) and the backup jobs can be configured separately as shown below, but all run in parallel.
For example: Server 1 Mount points C, D, E
C: StoreOnce NAS Share/Dir1
D: StoreOnce NAS Share/Dir2
E: StoreOnce NAS Share/Dir3
This has two key advantages – all filesystem type data goes into a single NAS share which will then yield high deduplication ratios and, because
we have created the separate directories, we ensure the three backup streams can run in parallel hence increasing overall backup throughput.
This creation of multiple NAS shares on Sites A and D (4 in total) with different data types allows best potential for good deduplication whilst
keeping the stores small enough to provide good performance. This would also mean that a total of 8 NAS replication targets need to be created
at Site E, since NAS shares require a 1:1 source to target mapping.
Sites B and C
For sites B and C the customer has requested VTL emulations.
Server 1 – 200 GB Filesystem, spread across 2 mount points C,D
Server 2 – 200GB SQL data
Server 3 – Special App data, 200GB
In this case we would configure 3 x VTL Libraries (because we have 3 different data types) with the following drive configurations:
Server 1 VTL – 2 drives (to support 2 backup streams) say 12 slots
Server 2 VTL – 1 drive (since only one backup stream) say 12 slots
Server 3 VTL – 1 drive (since only one backup stream) say 12 slots
The monthly backup cycle means a total of 11 cartridges will be used in each cycle, which has guided us to select 12 cartridges per configured
library. The fixed emulations in HP StoreOnce like “MSL2024” would mean we would have to use a full set of 24 slots but, if the customer chooses
the “D2DBS” emulation, the number of cartridges in the VTL libraries is configurable. Note: the backup software must recognize the “D2DBS” as a
supported device and must be configured to overwrite media once expired to prevent expired cartridges “hogging” valuable storage space.
As a general rule of thumb, configure the cartridge size to be the size of the full backup + 25% and, if tape offload via the backup application is to
be used, less than the cartridge size of the physical tape drive cartridges. So let us create cartridges of 250 GB for these devices (200GB * 1.25).
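The rule of thumb can be written down directly. This is a minimal sketch; the function name and its parameters are illustrative, not part of any HP tooling:

```python
# Cartridge sizing rule of thumb: full backup size + 25% headroom, and
# (when tape offload is planned) smaller than the physical tape cartridge.

def cartridge_size_gb(full_backup_gb, physical_cartridge_gb=None, headroom=0.25):
    size = full_backup_gb * (1 + headroom)
    if physical_cartridge_gb is not None and size >= physical_cartridge_gb:
        raise ValueError("configured cartridge must be smaller than the "
                         "physical tape cartridge used for offload")
    return size

print(cartridge_size_gb(200))   # 250.0 GB, as chosen for sites B and C
```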
Site E
We will require 8 NAS replication shares for Sites A and D.
For sites B and C with VTL emulations we have a choice, because with VTL replication we can use “slot mappings” functionality to map multiple
source devices into a single target device, allowing easier management and better deduplication ratios on the target device. So, we can either
create 6 * VTL replication libraries in the StoreOnce at Site E or merge the slots from 3 * VTL on sites B and C into 3 x 24 slot VTLs on Site E. This
allows the file system data, SQL data and the Special App data to be replicated to VTLs on Site E with the same data type - again benefiting from
maximum dedupe capability.
We also need to provision two VTL devices for the local backups on Site E, which are daily fulls retained for 1 month – a 4-stream backup for
Exchange plus a single-stream backup for the Special Application data.
VTL 1 = 4 drives, at least 31 slots (to hold 1 month retention of daily fulls)
VTL 2 = 1 drive, at least 31 slots (to hold 1 month retention of daily fulls)
The final total source and target configuration is shown below.
Map out the interaction of backup, housekeeping and replication for sources and target
With HP StoreOnce Backup systems it is important to understand that the device cannot do everything at once; it is best to think of "windows" of
activity. Ideally, at any one time, the device should be either receiving backups, replicating, housekeeping, or offloading to physical tape. However,
this is only possible with some careful tuning and analysis.
Housekeeping (or space reclamation, as it is sometimes known) is the process whereby the StoreOnce updates its records of how often the
various hash codes that have been computed are being used. When hash codes are no longer being used they are deleted and the space they were
using is reclaimed. As we get into a regular “overwriting pattern” of backups, every time a backup finishes, housekeeping is triggered to happen
and the deduplication stores are scanned to see what space can be reclaimed. This is an I/O intensive operation. Some care is needed to avoid
housekeeping causing backups or replication to slow down as can be seen below.
HP StoreOnce Backup systems have the ability to set blackout windows for replication and housekeeping, during which no replication or
housekeeping will take place. This is deliberate: it ensures replication can be configured to run, ideally, when no backups or housekeeping are
running. We can configure two blackout windows in any 24-hour period.
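To see how backup, replication and housekeeping windows collide, a simple planning sketch can help. This is purely illustrative: it is not a StoreOnce API, and the example windows are hypothetical.

```python
# Illustrative planner: expand (start_hour, end_hour) GMT windows, wrapping
# midnight, and report hours claimed by more than one activity.

def window_hours(window):
    start, end = window
    h = start
    while h != end:
        yield h
        h = (h + 1) % 24

def collisions(schedule):
    """schedule: {activity: (start_hour, end_hour)} ->
    {hour: [activities]} for hours used by 2+ activities."""
    by_hour = {}
    for name, window in schedule.items():
        for h in window_hours(window):
            by_hour.setdefault(h, []).append(name)
    return {h: names for h, names in by_hour.items() if len(names) > 1}

# A local backup at 18:00-02:00 against a replication window left open
# 08:00-04:00 overlaps for 8 hours - the kind of contention discussed above.
clashes = collisions({"local backup": (18, 2), "replication": (8, 4)})
print(sorted(clashes))
```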
All time references below are standardized on GMT time. Replication blackout windows are set to ensure replication only happens within
prescribed hours. In our sizing example we input the backup and replication window as 12 hours, but we would have to edit this to 8 hours to
conform to the plan below. This shows how sizing can sometimes be an iterative process. Decreasing the backup window could result in larger
models being sized to give improved throughput – along with larger WAN links.
As you can see from the above worst case example, with such worldwide coverage the target device E cannot easily separate out its local
backup (18:00 – 02:00) so that it does not happen at the same time as the replication jobs from sites A, B, C, D and the housekeeping required on
Site E. The housekeeping load is generally higher at the target site.
What this means is that the replication window on the target device must be open almost 24 hours a day or at least 08:00 to 04:00. The target
device essentially has a replication blackout window set only between the hours of 04:00 and 08:00 GMT.
In this situation the user has little alternative but to OVERSIZE the target device E to the next model up with higher I/O and throughput capabilities
in order to handle this unavoidable overlap of local backup, replication from geographically diverse regions and local housekeeping time. How to
upsize units has been explained in the Sizing Tool explanation above.
Worked example – backup, replication and housekeeping overlaps
The example below does not correspond exactly to the sized example above. For instance, the backups are completing within the 12 hours
allocated and the replication is completing also within the 12 hours allocated. This example serves merely to show how consideration must be
given to how the blackout windows can affect overall performance and how analysing the performance on the target is key to a successful
configuration.
Key to the charts below: Backup; Replication; Housekeeping; Start of Replication Window; Spare Time for Physical Tape Offload; Housekeeping
Window. During a rotation scheme cartridges/shares are being overwritten daily, so housekeeping will happen daily.
Sites A & D
Initial config with NO replication blackout window set – lots of overlapping activities; performance not predictable. (The chart plots backup,
replication and housekeeping activity for Share 1 Filesystem data (DIR 1-3), Share 2 SQL data and Share 3 Filesystem 2 (DIR 1-2) against GMT
hours 12:00 to 06:00.)
Initial config with replication blackout window set: all 7 jobs can replicate concurrently (concurrency 12). Applying the replication window has
the effect of reducing backup times by avoiding contention. There is still some contention, however, with housekeeping during the replication
window.
(The chart plots backup, replication and housekeeping activity for Share 1 Filesystem data (DIR 1-3), Share 2 SQL data, Share 3 Filesystem 2
(DIR 1-2) and Share 4 Special App Data against GMT hours 12:00 to 06:00.)
Applying a housekeeping blackout window at sites A and D now improves replication performance. This provides some free time for future
capacity growth and the associated increases in replication time.
Housekeeping windows can be configured but must be monitored to see that the housekeeping load is not growing overall day by day.
Site E, Data Center
Let us now analyze the replication situation at Site E, the Disaster Recovery center.
At Site E we have replication jobs from A&D + B&C as well as local backups
Replication jobs also trigger Housekeeping at the Target site
Replication window set to 14 Hours on Target device initially
Concurrency level of Target is 24
(The chart plots the inbound replication jobs together with the local backups of Exchange and the Special App on Site E – HIGH LOAD.)
Consider improving the situation by imposing two Housekeeping Windows on the Target Device as shown below
(The chart plots, against GMT hours 10:00 to 08:00, the replication of A & D Share 1 Filesystem data (DIR 1-3), A & D SQL data and A & D Share 3
Filesystem 2 (DIR 1-2), together with the local backups of Exchange and the Special App – REDUCED LOAD.)
This allows more efficient replication because there is no contention, which in turn can free up spare time on the Target StoreOnce that could
then be used to schedule tape offloads (shown in purple).
Catalyst Sizing example
The main impact of Catalyst from a sizing perspective is that low bandwidth backups can be configured to the HP StoreOnce appliance either
from inside the Data Center (using less bandwidth) or from remote sites backing up directly over the WAN.
This has the effect of achieving higher overall apparent backup throughput.
One of the most common usage models expected is that of several remote sites using Catalyst in low-bandwidth backup mode to send data over
the WAN to a centralized StoreOnce appliance. In this case the heavy-duty deduplication work is done on the media server at the remote site.
Indeed this technique can often negate the need for a StoreOnce appliance on remote sites but still provide the customer with a full Disaster
Recovery capability at a very low cost – which is obviously appealing.
Note:
Use CatalystLowBw for low bandwidth backups in the data centre and direct remote site backups over WAN
Use CatalystHighBw for backups to StoreOnce Catalyst stores in the Data Center using high bandwidth (for servers that cannot support the
deduplication load)
HP expects 99% of Catalyst deployments to use Low Bandwidth mode because of the bandwidth savings it delivers.
Worked Example
Consider 20 backup jobs of 1 TB in 4 hours from various servers in a data centre or from 20 remote sites. The retention schedule is fulls & daily
incrementals over 29 days. Incrementals are 10% of the full backup size, with a 2% daily block-level change rate.
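Before running the tool it is worth sanity-checking the raw ingest rate these requirements imply. This is simple arithmetic, not the sizing tool's dedupe model:

```python
# Aggregate ingest implied by 20 x 1 TB full backups in a 4-hour window.
jobs, full_tb, window_hr = 20, 1.0, 4

aggregate_tb_per_hr = jobs * full_tb / window_hr            # 5.0 TB/hr
aggregate_mb_per_s = aggregate_tb_per_hr * 1024 * 1024 / 3600

print(f"Aggregate ingest: {aggregate_tb_per_hr} TB/hr "
      f"(~{aggregate_mb_per_s:.0f} MB/sec)")
```

In Catalyst low bandwidth mode much of this load is deduplicated on the media servers before transmission, which is why a single appliance can absorb it.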
Size a solution using Catalyst Low Bandwidth.
Job definition – note the use of "copies" because all the jobs are the same.
Press Solve/Submit.
The sized solution is one HP 4430 Backup system with two additional shelves.
Because with Catalyst low bandwidth backup all the deduplication work is being done on the media servers supplying the data, the HP StoreOnce
4430 has to work less hard and so can match the performance requirements with a smaller configuration. Note also the additional Catalyst
license requirement.
Just for comparison, if we size the SAME solution but with VTL emulation (where all the deduplication is done on the target – not distributed on
the media servers) the inputs are as follows:
The solution now changes to recommend a B6200 model, which is much more expensive. Catalyst can save you money!
Appendix C:
Guidelines on integrating HP StoreOnce with HP Data Protector 7,
Symantec NetBackup 7.x and Symantec Backup Exec 2012
The information in this Appendix is valid for both single node and B6200 StoreOnce Backup systems. Some of the set-up documented in this
section was performed on an HP B6200 Backup system.
HP StoreOnce Catalyst: Configuration, Display and Set-up
This section will provide a guided tour of Catalyst device types within HP StoreOnce Backup systems. For further details please refer to the HP
StoreOnce Backup system user guide or online help.
Catalyst stores are identified as HP StoreOnce device types, just like VTL and NAS, but there are no replication mappings in the left hand
navigation pane for StoreOnce Catalyst because all Catalyst copy operations are controlled by the backup software.
Status tab
The StoreOnce Catalyst Status tab is displayed initially, which shows an overview of overall status. Further tabs provide access to Settings;
Clients, to configure access control; Blackout Windows, to control precisely when replication occurs; and Bandwidth Limiting Windows, to limit
the WAN bandwidth usage of outbound copy jobs.
Settings tab
The Settings tab defaults to the maximum outbound copy jobs and maximum data and inbound copy jobs; these maximums vary by StoreOnce
model type (see Appendix A). A customer would only reduce this value if too much overall bandwidth was being consumed. This tab is also used
to enable Client Access Permission Checking.
Permissions tab (per store)
After overall permission checking has been enabled on the Settings tab, the access per store must also be configured (if different users are
limited to using different stores). The Clients tab is used to set up clients who are allowed to access various StoreOnce Catalyst stores – these
clients are integrated into the different backup applications that support Catalyst (HP Data Protector, Symantec NetBackup and Symantec Backup
Exec).
In NetBackup and Data Protector the client name can be set up on the StoreOnce appliance first and used in access control in the backup software,
but for Backup Exec the client must be an actual user defined in Active Directory as well.
Once Clients have been set up, the Edit function from the Stores-Permissions page can then be used to allow different users to access specific
stores.
Catalyst stores
Note: HP Data Protector also supports Catalyst Store creation directly from the software.
To create a new Catalyst store, click on Stores in the left hand Navigation pane, then click Create in the top right hand corner and provide a Name
and Description for the store you are about to create. The Primary and Secondary Transfer Policies should both be set to either High Bandwidth
or Low Bandwidth for NetBackup and Backup Exec; only HP Data Protector can support different transfer policies on one store.
HP expects the vast majority (99%) of customers to use the low bandwidth mode, as this has the ability to improve overall backup throughput to
the StoreOnce appliance by offloading most of the deduplication load to the media servers/media agents in the customer environment (assuming
these have been adequately sized to perform these tasks).
Data stored within Catalyst
Let’s now take a look at how the data is stored in a HP StoreOnce Catalyst store.
When examining the actual contents of a Catalyst store the following definitions apply.
Items: This is the unit of storage within an HP StoreOnce Catalyst store. All item names are defined by the backup software.
Permissions: This controls which clients can access this store.
Data Jobs: These are the backup jobs to a StoreOnce Catalyst store and are directly relevant to the backup software format.
Outbound Copy jobs: This is the terminology used to refer to Catalyst stores being replicated (out) to another Catalyst store.
Inbound Copy Jobs: Because StoreOnce Catalyst can support multi-hop replication under backup application control, there is now a
concept of inbound copy jobs; these are Catalyst stores being replicated (in) to the Catalyst store from elsewhere in the environment.
The naming of the Items is determined by the backup application. The items can be searched by means of a search engine (see below). Click Show
Items to see actual entries. Examples are shown later.
The above NetBackup Data Jobs tab illustrates another point about Catalyst stores used in low bandwidth mode. The first backup to a low
bandwidth Catalyst store is similar to replication: a form of seeding is required, because much of the data being deduplicated on the media server
does not yet exist in the Catalyst store and must be physically transmitted. As you can see in the above NetBackup Catalyst store, the first low
bandwidth backup only provided a 23.1% bandwidth saving, whereas the second backup provided a 98.6% bandwidth saving. This shows the
power of StoreOnce Catalyst backup devices – vastly reducing the bandwidth required between the media server and the Catalyst store.
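The bandwidth-saving figure reported per job is simply the fraction of logical backup data that did not have to be physically transmitted. In this sketch the GB figures are invented purely to reproduce the percentages seen in the screenshot:

```python
# Bandwidth saving: share of logical data NOT physically sent over the wire.

def bandwidth_saving_pct(logical_gb, transmitted_gb):
    return (1 - transmitted_gb / logical_gb) * 100

# A first (seeding) backup must transmit most of its data; later backups
# transmit only new chunks. Illustrative figures:
print(f"{bandwidth_saving_pct(100, 76.9):.1f}%")   # first backup:  23.1%
print(f"{bandwidth_saving_pct(100, 1.4):.1f}%")    # second backup: 98.6%
```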
The actual replication jobs initiated by the backup application can be monitored in the Outbound Copy Jobs tab of the Catalyst store – below we
see an example of Catalyst copy via backup application control using HP Data Protector.
Catalyst Implementation in HP Data Protector 7
In this section we shall describe:
How StoreOnce Catalyst is integrated with HP Data Protector 7
An example scenario
How to create a Data Protector specification for backup to a StoreOnce Catalyst store
How to recover from Catalyst copies
Best practices when using StoreOnce Catalyst with HP Data Protector 7
The high or low bandwidth mode is shown by the pink (high) and yellow (low) gateways in the drawing.
Explicit gateways are server-side gateways. There should always be at least one explicit gateway defined. Explicit gateways are assigned, as
required, to any server running the Data Protector 7 media agent. The explicit gateway configuration can specify the maximum number of
streams and where the deduplication process is to be performed – on the media server or on the HP StoreOnce Backup system. If deduplication in
the media agent is selected (by selecting server-side deduplication in the device properties), this results in a low bandwidth transfer of
deduplicated data. The explicit gateway supports HP Data Protector Object Copy (Catalyst store replication). The explicit gateway could really be
considered as "client-side" deduplication, where every client (with a no-cost media agent loaded) can access a Catalyst store using a gateway of
the same name and settings; HP Data Protector knows which client to start the media agent code on to perform this task. The explicit gateway
can also be configured with server-side deduplication turned off, in which case ALL deduplication takes place in the StoreOnce appliance. This
would be called 'target-side' deduplication.
The implicit gateway is a source-side gateway and is optional. It is configured once only but can be used by any of the clients in the cell that have
a media agent installed. The implicit gateway does not have an assigned media agent but will start media agents on any server equipped with
media agent software. In effect, it is like a ‘virtual’ gateway for every media-agent-equipped server belonging to the cell where ‘source-side’
deduplication is specified for backup (see configuration examples later in this section).
The implicit gateway has the same configuration parameters on every media-agent-equipped backup server. It is designed so that only files or
data resident on the media-agent-equipped backup server can be backed up via this gateway. Files or data resident on an application server with
only the disk agent or application agent installed cannot be backed up or restored via an implicit gateway.
The implicit gateway always invokes deduplication in the media agent (which is referred to as source-side deduplication on the Data Protector
Add Device screen). HP Data Protector Object Copy (Catalyst store replication) is not available using the implicit gateway. There is a setting in the
implicit gateway which can be used to limit the number of parallel streams. As you would expect, this setting applies to every server with a media
agent using the gateway; so if it is set to 2, each media-agent-equipped client can have a maximum of 2 streams.
Users can, of course, configure both source-side and server-side gateways. So you could have a backup server that normally uses the
source-side gateway for deduplication in the server, but also has a server-side gateway with deduplication not selected, for 'target-side'
deduplication (useful for reducing server load at the expense of backup performance).
Data Protector also queries the StoreOnce device for the maximum number of Catalyst sessions available. (These would be inbound data jobs).
Use the source-side gateway for mass deployment to smaller servers which really only send one stream. Set the limit in the gateway to 2.
Use the server-side gateway for large application servers where the backup could feed many streams, and limit these to around 12 streams per
Catalyst store. Each B6200 service set could support 8 of these large servers.
Use a server-side gateway when target-side deduplication is required and the user has a fast 10GbE connection but does not wish to load the
server, e.g. for online database backup.
Important!
There is no difference in the deduplication method for source-side and server-side deduplication. The different gateways are essentially for
different deployment methods:
Server-side deduplication – deduplication of data is performed within the dedicated backup server. Server-side deduplication can be used
for data held locally on the backup server and from servers that have a disk agent installed. In this case, data is transferred over the network
to the backup server and then processed by the media agent and sent on to the Catalyst store. Server-side deduplication must use the
‘explicit’ gateway for the backup destination and that gateway requires server-side deduplication selected in the advanced options.
Target-side deduplication – data is held on a client with only a disk agent installed. This system can be remote from the backup server.
Server-side deduplication is not selected. All data is transferred at high bandwidth across the LAN or WAN to a backup server hosting a
gateway to the StoreOnce Catalyst appliance. This may be necessary when Data Protector 7 has only application/disk agent support for a
particular data type (such as OpenVMS backup).
Key Points:
At least one explicit gateway must be configured. You cannot configure just an implicit gateway.
For files or data held on an application server with only a DP7 disk agent installed, backups must be directed to an explicit gateway.
The implicit gateway is used for source-side deduplication on any server in the cell running a media agent. A server running just a disk or
application agent cannot select the implicit gateway.
The parameters (maximum streams, and so on) are the same for every server using the implicit gateway. This is useful for limiting
server loading.
The implicit gateway does not support Object Copy (Catalyst store replication).
Target side deduplication is useful when the extra load of deduplication is not wanted on the backup server.
Only 64-bit servers can be configured for a gateway.
The deduplication process is exactly the same for server-side and source-side deduplication.
The StoreOnce deduplication and Catalyst client binaries are built into the media agent code. There is no requirement for a plug-in
software module, as is required for Symantec NetBackup and Backup Exec integration.
In HP Data Protector terminology a Catalyst store is referred to as a Backup to Disk device type.
Example scenario
First, we need to configure a store on the HP StoreOnce Backup system (see Catalyst stores). Then, we need to configure the device using the Data
Protector Management Client.
This guide will cover configuration of both gateway types. There are four servers:
The cell manager is on server ‘Zen’ and, following HP Data Protector best practice, this is a separate server. The cell manager creates a
significant load, which is not desirable on a media server.
There are three other servers: Bill, Ben and Zip. The server ‘Zip’ has only a Data Protector 7 disk agent installed.
The HP B6200 Backup System being used in this example is configured for ‘template1’ network configuration, which uses 10GbE for
data and 1GbE for B6200 management. DNS is in use and the B6200 VIFs (virtual IP addresses) will be referenced by their fully qualified
domain name.
For the purpose of this exercise four Catalyst stores will be configured, dpstore1 – 4, within the B6200.
The figure below shows the example layout. Assuming dpstore1 has been created on the B6200 Backup System, configuration of Data
Protector can now proceed. The Catalyst store will be called ‘B2D1’ and will be configured as a Backup to Disk device.
Using a single implicit gateway for Bill and Ben: only data local to Bill and Ben could be backed up to the Catalyst store, no object copy
of Bill and Ben data would be possible, and Zip could not be backed up to a Catalyst store at all.
Using two named explicit gateways for Bill and Ben: these support object copy, and backups from Zip can be mapped to one of the
explicit gateways. All of Zip’s data is transferred over the network to Bill or Ben, where it is deduplicated, and only the small amount
of unique backup data is then sent, low bandwidth, to the Catalyst store accessed by that explicit gateway.
Configuring Data Protector
1. Start the Data Protector management GUI and select Devices & Media from the drop-down box at the top left of the page.
3. On the screen displayed add the chosen Device Name (B2D1), Description (optional), and select Backup to Disk as the Device Type and
StoreOnce Backup System as the Interface Type.
5. The next screen is used to select (or create) the HP StoreOnce Catalyst store located on the appliance and to configure the gateways.
Specify the IP address or FQDN of the Deduplication System
6. In the Store section of the screen, you can browse and create Catalyst stores.
If client access permissions have been selected on the StoreOnce Backup system GUI, you must enter the Client ID to browse or create a
store from the backup application GUI. If the Client ID is not entered, any pre-created stores will not be accessible.
If the store has not been pre-configured on the StoreOnce appliance, a new store will be created by HP Data Protector, provided the
Client ID is specified correctly.
Note: HP Data Protector is the only backup application that is able to create Catalyst stores from the backup application GUI.
The screenshot below shows the critical stage of the HP Data Protector Catalyst configuration. We have two options for dpstore1:
configure access to it via an implicit gateway (Source-side deduplication ticked), or leave Source-side deduplication unticked and
use the Add function to define multiple explicit gateways (each residing on a Data Protector media server) that are allowed to access
dpstore1. The latter allows multiple media servers to write to the same Catalyst store and, if the Catalyst store is dedicated to a
particular data type and the media servers send only that data type to it, deduplication will be much improved.
Click Properties to access the advanced settings, where the maximum number of streams per client may be specified. The default
setting is two and will be applied by any media server when using source-side deduplication (i.e. when the implicit gateway is being
used).
This is an important setting when optimizing the number of data streams per Catalyst store – 16 is recommended. The blocksize
setting is not used by StoreOnce Catalyst but is still used by disk and application agents; it is recommended to increase it to at least
256 kB.
8. The section below the red dotted line in the screenshot above adds explicit gateways. At least one explicit gateway must be configured.
These gateways are applied individually to each Data Protector client that has media agent software installed.
The client server names are shown in the drop-down box (servers must be added as clients and the media agent installed prior to
gateway configuration).
Select each server that requires a gateway and click Add. The Properties dialog for explicit gateways allows selection of Server-side
deduplication, as shown below.
If the Server-side deduplication box is ticked in the Advanced Options tab, deduplication occurs on the server hosting this gateway
and a low bandwidth backup results. (This is the equivalent of selecting source-side deduplication when setting up an implicit gateway.)
If this box is not ticked, all deduplication will take place on the StoreOnce appliance and the backup will be high bandwidth, which means
target-side deduplication.
9. You may also use this dialog to specify the maximum number of streams per gateway.
Maximum Number of Parallel Streams per Gateway: The maximum recommended value is 16 for a single Catalyst store. However, the
best overall throughput on a StoreOnce device is achieved if we have 8 separate devices with 6 streams to each. So, in our example, we
have set the value to 6.
10. Whilst you are in the Advanced options GUI for Gateway properties, click the Sizes tab and select Blocksize. This must be set as high as
possible; 1024 kB is recommended, but may not be possible with all Host Bus Adaptors. A high value minimizes the number of HP Data
Protector headers inserted into the data stream and hence improves deduplication ratios.
11. Click OK to apply the Advanced Settings. This returns you to the Add Device screen, shown in step 6.
12. Use the Check button to the right of the server list to check successful communication between the backup server and the HP StoreOnce
Backup system.
Note: With HP B6200 Backup systems the network may be configured to use two subnets (such as 10GbE for data and 1GbE for
management). If using DNS, although only the data path to the B6200 Backup System is used for the Catalyst data and commands, both
subnets must be capable of resolving their respective service set VIFs for data on the 10GbE network and the management VIF on the
1GbE network.
13. Select Finish at the bottom of the Add Device screen and the stores are now ready for use.
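The effect of blocksize on deduplication can be illustrated with a small sketch. This is not the actual Data Protector header format – the 16 kB per-block header figure is a hypothetical value chosen purely to show why larger blocks dilute the non-deduplicatable header metadata in the stream:

```python
def header_overhead(block_size_kb, header_kb=16):
    """Fraction of the data stream occupied by backup-format headers,
    assuming (hypothetically) one fixed-size header per data block."""
    return header_kb / (block_size_kb + header_kb)

# Larger blocks mean fewer headers per GB of backed-up data, so less
# unique metadata is mixed into the stream between backups.
print(f"256 kB blocks: {header_overhead(256):.1%} header overhead")
print(f"1024 kB blocks: {header_overhead(1024):.1%} header overhead")
```

Whatever the true header size, quadrupling the blocksize cuts the header density roughly fourfold, which is why the guide recommends the largest blocksize the Host Bus Adaptor supports.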
Key Points:
The optional implicit gateway, when selected, will start media agents on any media-agent-equipped server but only for local data on that
server. It uses the same settings for every server. Data for backup must reside on the same server. It is used for source-side deduplication
only and is useful for providing an overall limit on data streams to match backup server specification.
The explicit gateways can be configured individually on each media-agent-equipped server that is registered as a client in the cell. They can
be used for server-side or target-side deduplication and can back up data that is resident on other servers via the network.
For each data stream a media agent process is started. For each mount point a disk agent is started.
The maximum number of connections per store can be set by Data Protector.
Note: Target-side deduplication is the deduplication method used by VTL, NAS and Catalyst high bandwidth configurations and is not shown in
this document, which is illustrating Catalyst low bandwidth scenarios.
To configure a backup using source-side deduplication (implicit gateway):
1. From the Data Protector Management screen select Backup from the drop-down box at the top of the page.
3. Select Blank Filesystem and check the Source-side deduplication box as the backup specification option. Click Next.
4. Select some files for backup and click Next to display a list of destination devices. In our example, the destination is the device ‘B2D1’.
Note that the explicit gateway on the server called Bill (b2d1_gw1) is not available and is ‘grayed’ out because we ticked Source-side
deduplication in the backup job definition.
5. Select the source-side gateway. Remember this gateway is selectable only if ‘source-side’ deduplication is specified.
Highlighting the gateway will allow the Properties button to be selected. This is used to specify a media pool. A default media pool is
created for the ‘backup to disk’ device but additional media pools may be created, if desired.
6. Click Next and specify the required options for retention. The backup specification options and schedule may be modified, if desired. It is
often useful to tick the Display statistical information box.
To configure a backup using server-side deduplication (explicit gateway):
1. From the Data Protector Management screen select Backup from the drop-down box at the top of the page.
2. Select Blank Filesystem. DO NOT check the Source-side deduplication box in the backup specification. Click Next.
3. Select some files for backup and click Next to display a list of destination devices. In our example, the destination is the device ‘B2D1’.
4. Expand B2D1 and now the Source-side gateway is ‘grayed’ out and the explicit gateway on the backup server ‘bill’ is available. This
gateway has ‘server-side’ deduplication selected in the advanced options. Select the options required and save the backup specification.
5. Click Next and specify the required options for retention. The backup specification options and schedule may be modified, if desired. It is
often useful to tick the Display statistical information box.
HP StoreOnce Catalyst has the added advantage that expired backups may be removed automatically as Catalyst items and the space made
available for other users of the store. Please note that as the data stored is deduplicated, space is not returned until the Housekeeping
process has run on the StoreOnce appliance (this is configurable to run for up to two periods within any 24 hours).
HP Data Protector 7, by default, automatically returns space occupied by expired backups every 24 hours. This can be changed to hourly by
editing the global options file, located at: \ProgramData\OmniBack\Config\server\options\global
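As an illustration, the relevant entry in the global options file might look like the following. The option name shown is an assumption based on common Data Protector configurations – verify it against the comments inside your own global file before editing:

```
# \ProgramData\OmniBack\Config\server\options\global
# Assumed option name - check the comments in your global file first.
# 1 = check for (and remove) expired Backup to Disk objects every hour.
DeleteUnprotectedMediaFreq=1
```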
Finally, ensure you have an HP Data Protector Windows Disk Agent version later than A.07.00 installed on your clients; this version is
enhanced to minimize the header inserts into the data stream, improving deduplication ratios. This can be checked by using the Clients
context drop-down and checking the HP Data Protector installed components under Properties.
HP StoreOnce Catalyst Copy differs from VTL or NAS replication in that data can be duplicated to more than one appliance for extra resilience.
Additionally, the HP Data Protector internal database is now aware of all StoreOnce Catalyst copies and can restore from any copy without
complex scripting arrangements or imports. When required, data can also be copied to real tape for long-term storage. (When data is moved to
tape the deduplicated data must be re-hydrated.)
Data Protector 7 Object Copy offers a rich selection of options. This document will cover only basic object copy functions - replicating a backup to
one Catalyst store then on to another store. For more detail please consult the appropriate HP Data Protector documentation.
All transfers between StoreOnce Catalyst stores are bandwidth efficient (low-bandwidth). The StoreOnce Catalyst protocol can now control the
StoreOnce appliance, so there is no need for data to flow through a backup server.
Object copies can be interactive (useful for ad-hoc copies) or scheduled. Additionally, they can be set to occur ‘post-backup’, after the backup
completes. It is not possible to make backups to multiple destinations at the same time; copies are made sequentially from one store to another.
The above figure illustrates backups of user data, performed using a server-side gateway to Catalyst store #1 located in data center #1.
Post backup (or scheduled) HP DP7 object copy can move the backup offsite via the WAN to Catalyst store #2 in data center #2. This is performed
in a bandwidth-efficient manner and, after the first transfer (seeding), only unique data chunks will be transferred. The expiry date of the original
backup can be shorter or even immediate once data is offsite.
The backup can then be duplicated to Catalyst store #3 in data center #3. This gives extra resilience. The Data Protector 7 Object Copy
configuration could also support moving the data onto tape if required.
Transfer of data is direct from source StoreOnce Catalyst store to target StoreOnce Catalyst store. Although it may appear the gateways are
involved, they only pass the commands to perform the replication. The cell manager internal database tracks the copies. It is important to back up
the cell manager database after every backup session.
Key Points:
Use the Data Protector Object Copy function to move data offsite.
Design the WAN link using the Sizer tool in order to complete the duplication in the appropriate time window. Allocate separate time
slots for the duplication process and the backup process.
Always select server-side deduplication (explicit gateway) for backups. Because of the virtual nature of the source-side (implicit)
gateway, it is not possible to use it for duplication/object copy; destination source-side (implicit) gateways are ‘grayed’ out. If you select
a source-side gateway for backup, jobs will fail unless the object copy job is remapped to use the explicit gateways.
Duplication can either be post backup or scheduled separately.
Object copy to tape is referred to as a ‘copy’ as distinct from a Catalyst copy, which is replication.
Simply sequence your way through the tabs (Backup Specifications ->Copy Specifications, and so on) to construct the Object Copy job.
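As a rough illustration of the WAN sizing mentioned above, the following sketch estimates the object copy window from the daily backup size, the post-deduplication change rate and the link speed. All figures are hypothetical; the HP Sizer tool should be used for real designs:

```python
def copy_window_hours(daily_backup_gb, dedup_change_rate, wan_mbit_s,
                      utilisation=0.7):
    """Rough time to object-copy one day's backup over a WAN link,
    assuming only unique (post-deduplication) chunks cross the wire
    after seeding. Parameters are illustrative only."""
    unique_gb = daily_backup_gb * dedup_change_rate
    effective_mbyte_s = wan_mbit_s / 8 * utilisation  # bits -> bytes, minus overhead
    return unique_gb * 1024 / effective_mbyte_s / 3600

# e.g. 2 TB nightly backup, 2% unique data after dedup, 100 Mbit/s link
print(f"{copy_window_hours(2048, 0.02, 100):.1f} hours")
```

This is why low-bandwidth Catalyst copy makes offsite duplication practical over links that could never carry the full backup: only the unique fraction is transferred.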
The following example shows a post backup Catalyst replication process being configured through Object Copy in HP Data Protector. There are
several configuration tabs, but be sure to set the following parameters:
On the Copy Specifications tab, only StoreOnce Catalyst store devices should be enabled for replication.
Multi-hop copy is possible to multiple sites (the copy operations, however, happen serially) and Catalyst Copy can also take place to physical tape,
if long term retention or copies are required. Consult the HP Data Protector Object Copy documentation for more details and descriptions of all
the tabs.
In this example we have object copied data from B2D1 to B2D2 (even though these are in the same physical B6200, they represent a copy
across different sites).
HP Data Protector 7 – Recovery from Catalyst copies
The screenshot below illustrates how it is possible to restore from Copies within the HP Data Protector Restore function.
Select the required Filesystem and display its Properties to see all the versions of that backup. In our example, the original is on Catalyst store
B2D1 whilst a copy is on the Catalyst store B2D2 (typically on a Disaster Recovery site).
If the primary copy on B2D1 is lost or the site damaged, then the data can be restored immediately from the copy residing on B2D2. Enable the
Select source copy manually box and select the required copy from the list.
Selecting a copy also selects the correct media pool on the correct Catalyst store.
HP Data Protector 7 – Catalyst best practices summary
1. Best practice is to keep similar data in the same Catalyst store. For example: dedicate a B6200 Catalyst store to Oracle backups and have a
different Catalyst store for SQL server.
2. The more streams that can be supplied concurrently to a Catalyst store, the better the throughput – for best throughput, 16 streams is about
the maximum you can send before contention for resources occurs within a single Catalyst store. The source data selection will dictate
how many streams are sent in a particular backup specification. If data is selected for backup from multiple mount points, each mount point
will have a disk agent started. It is also possible to use the backup specification to select multiple directories for backup to produce
multiple streams.
3. Multiplexing cannot be configured within Data Protector 7 with Catalyst devices. (This is known as ‘Concurrency’ within Data Protector.)
4. It is necessary to select source data correctly for multiple streams. For example: for Filesystem backup separate mount points, drive letters
or directory selections are required.
5. Backup servers running multiple streams and deduplication must be sized appropriately.
6. Use the HP Storage sizing tool – this is calibrated with the latest test results from HP R&D and will take into consideration data retention,
data change and growth. It will also size any WAN links or show the bandwidth savings when using Catalyst.
7. Use the implicit gateway and use source-side deduplication if the source data is located on the server running the DP media agent.
8. Use the explicit gateway and server-side deduplication for individual server settings and when source data may be located on other servers
and object copy functionality is required.
9. When setting up the gateway, remember that the implicit gateway (source-side deduplication) has a default limit of 2 streams per client. This
will apply to all media servers using this gateway. The advanced setting for each explicit gateway also has a setting that allows you to
specify the maximum number of streams.
10. When load balancing is selected at backup job run time, there is also a limit on streams and this overrides the gateway limit. If the gateway
stream limit is set to 6 and load balancing is set at 5 (the default), then only 5 streams will be possible for a job specification. The default
setting may be modified. Load balancing across two different StoreOnce Catalyst stores is not recommended.
11. Expired backups are deleted from the StoreOnce Catalyst store at intervals specified by the global options file (default: 24 hours).
12. Exporting backup media from a Catalyst store will leave ‘orphaned’ items in the Catalyst store – avoid doing this.
13. Use the Data Protector Object Copy function to move Catalyst data to another device - typically offsite.
14. Always select an explicit server-side gateway for backups that require duplicating to another site. Because of the virtual nature of the
source-side (implicit) gateway, it is not possible to use it for duplication; destination source-side (implicit) gateways are ‘grayed’ out. If you
select a source-side implicit gateway for backup, jobs will fail unless the object copy job is remapped to use the explicit gateways.
15. The HP Data Protector Object Copy function supports either post backup or scheduled deduplication, depending on customer requirements.
Interactive mode is generally reserved for one-offs and testing only.
16. Object copy to tape is referred to as a ‘copy’ as distinct from a Catalyst copy which is ‘replication’.
17. HP StoreOnce Catalyst does not use Active Directory services for access control.
18. In a Disaster Recovery situation a StoreOnce Catalyst user is likely to have backup copies on multiple sites, and possibly to tape as well – all
copies have entries in the IDB (internal database). Best practice is to back up the IDB and then replicate offsite.
19. If the original cell manager is lost a new one can be created and then updated with the Cell Manager IDB. However, it will be necessary to
‘import’ the Catalyst items for the relevant store.
20. To import Catalyst stores to a new Cell Manager proceed as follows:
With a new cell manager configure the B2D device which is intact on the DR site. The original cell manager has been ‘lost’ in the disaster.
Obtain a list of StoreOnce Catalyst objects in the store using the following command (commands are in \Program Files\Omniback\bin):
omnib2dinfo.exe -list_objects -type OS -host <<IP address or VIF of the B6200 service set>> -name <<storename>>
Note: The storename is the name of the store on the StoreOnce Backup system and not the DP name.
Update the repository for each Catalyst object, using the Store name given in DP7:
omnimm -add_slots <<Store name in DP7>> <<catalyst object name>>
Either use the GUI to import the catalog data or use the command:
omnimm -import <<logical StoreOnce device>> -slot SlotID
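The import sequence in step 20 can be scripted. The sketch below only builds the command lines from the steps above – the host, store and object names are hypothetical, and in practice the object list would be parsed from the omnib2dinfo output rather than hard-coded:

```python
# Builds the DR-import command lines from step 20 for each Catalyst
# object found in the store. All names below are hypothetical examples.
BIN = r"\Program Files\Omniback\bin"

def import_commands(b6200_vif, store_name, dp_device, objects):
    """Return the list, add_slots and import commands as strings."""
    cmds = [rf'"{BIN}\omnib2dinfo.exe" -list_objects -type OS '
            rf'-host {b6200_vif} -name {store_name}']
    for obj in objects:
        # Register the object as a slot, then import its catalog data.
        cmds.append(rf'"{BIN}\omnimm" -add_slots {dp_device} {obj}')
        cmds.append(rf'"{BIN}\omnimm" -import {dp_device} -slot {obj}')
    return cmds

for c in import_commands("b6200-ss2.example.com", "dpstore1", "B2D1",
                         ["obj_0001", "obj_0002"]):
    print(c)
```

A wrapper like this is useful when a store contains many objects, since each one needs its own add_slots and import step.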
HP StoreOnce Catalyst stores and Symantec products
StoreOnce Catalyst may be integrated with Symantec NetBackup and Backup Exec. HP StoreOnce Catalyst stores are represented in Symantec
NetBackup and Backup Exec as Open Storage devices.
For both products, HP StoreOnce Catalyst Open Storage Technology (OST) plug-ins must be downloaded and installed onto any
Symantec NetBackup or Backup Exec media servers that will be used to write to an HP StoreOnce Catalyst store. This allows HP StoreOnce
Catalyst stores to be seen within Symantec products as an Open Storage device – hp-StoreOnceCatalyst is the name of the device type.
Different OST plug-ins are required for NetBackup and Backup Exec.
The next diagram shows the implementation of HP StoreOnce Catalyst within Symantec products.
Figure 38: Integrating HP StoreOnce Catalyst with Symantec NetBackup or Backup Exec
Once installed, the same concepts apply as with HP Data Protector integration – low and high bandwidth backups to a Symantec Open storage
device and then optimized duplication (Symantec terminology) of the Catalyst stores to another device. The following diagram illustrates
StoreOnce Catalyst functions with Symantec NetBackup.
StoreOnce Catalyst implementation with Symantec NetBackup
In this section we shall describe how to configure a Storage Server, Disk Pool and Storage Unit for a Catalyst store, and how to direct backup
policies and Storage Lifecycle Policies to them.
The NetBackup hierarchy is shown below; a backup policy sends backups to a Storage Unit (which can be VTL, Basic Disk or Open Storage
device).
Storage server name: This is the IP address (or equivalent DNS name) identifying where the Catalyst store resides. Note that on
StoreOnce B6200 systems this should be the virtual IP address that identifies the data path to the Catalyst store.
The number of configurable Catalyst stores depends on the model number.
Storage server type: This is always hp-StoreOnceCatalyst because this is the unique identifier used by the OST2.0 plug-ins that HP has
developed.
Media server: This is the Media Server where the OST2.0 plug-in is installed. In this case we will connect it to our “Heartofgold” Media
Server (where part of the deduplication process will now be performed because we have set up a low-bandwidth Catalyst store).
User name: This is the user that you created on the HP StoreOnce appliance; the password within NetBackup can be of your choosing.
The ‘Use Symantec OpenStorage plug-in for network controlled storage server’ box should not be ticked, as this applies to a
Symantec-specific device type.
2. Having created a storage server, we must filter this down to create a Storage Unit for a backup policy to access. The first stage is to
integrate the StoreOnce Catalyst stores that NetBackup has detected as Disk Pools and give them a name. In the example below: on HP
StoreOnce appliance B6200ss1.nearline.local there are two HP Catalyst stores, dpstore1 and Netbackup75. Netbackup75 is the Catalyst
store we have created for presentation to NetBackup 7.5. dpstore1 is being used by another backup software vendor.
3. The final part of the disk pool process is to assign it a Disk pool name. Because we configured Netbackup75 Catalyst store in low
bandwidth mode – we have named the disk pool Netbackup75Lowbandwidth.
4. Finally, within a Disk Pool we create a Storage Unit that can be accessed by a backup policy. At this stage we can also configure the
number of concurrent streams that can be written to the Catalyst store (max of 16 is recommended) and the maximum file size that can
be written to the Catalyst store/storage unit (524 GB).
5. As can be seen below, Netbackup75LowbandwidthStore appears as a storage unit in the NetBackup Administration Console, which
means we can now configure backup policies and direct them to the Storage Unit. In this case, backups from the media server will
include deduplication performed on the NetBackup media server “Heartofgold”, because the Catalyst store has been configured in low
bandwidth mode on the StoreOnce appliance.
6. A typical backup job to a Catalyst storage unit can then be configured, selecting Netbackup75Lowbandwidth as the policy storage unit
as shown below.
SLPs offer users the opportunity to assign a classification to the data at the policy level. A data classification represents a set of backup
requirements, which makes it easier to configure backups for data with different requirements; for example, email data and financial data. SLPs
can be set up to provide staging behavior. They simplify data management by applying a prescribed behavior to all the backup images that are
included in the SLP. This process allows the NetBackup administrator to leverage the advantages of disk-based backups in the near term. It also
preserves the advantages of tape-based backups for long-term storage.
The beauty of OST is that the “duplicate” feature of storage lifecycle policies can be used to enable Catalyst stores to be copied (using low
bandwidth copy), all under the control of the backup software. Furthermore, multiple copies (or duplicates) can be created at multiple sites to
ensure even better Disaster Recovery capabilities. All copies are logged in the NetBackup catalog.
There are two stages when setting up Symantec OST Duplication/Catalyst store low bandwidth copy via NetBackup Storage Lifecycle Policies.
First, we need to establish the copy target on another HP StoreOnce appliance (e.g. NetBackup75reptarget) as a new Storage Server and
create the associated Disk Pool and Storage Unit within the NetBackup Master server domain.
This is the same process as described in the previous section, but be sure to establish the Storage Server Name (or IP), which is
displayed in the StoreOnce GUI, before you start.
Once the NetBackup75reptarget is established, we create a Storage Lifecycle Policy to incorporate the existing StoreOnce Catalyst
backup policy along with one or more “duplicate” commands.
1. From the NetBackup Administration Console under Storage, right-click Storage Lifecycle Policies (SLP) and select New Storage
Lifecycle Policy.
2. Give the SLP a name, in this case StoreOnceCatalystLCP, and then start to specify a sequence of events – choosing between backup and
duplicate functions. As you can see below, the first stage is always the backup to a particular storage unit (in this case
Netbackup75LowbandwidthStore) and the backup retention policy (in this case 2 weeks).
3. Additional operations can be added, as shown below. For Catalyst Copy the next operation would be a Duplicate (Catalyst copy) to
another storage unit called Netbackup75reptarget.
Note: The retention period for the duplicate can be different to the retention period for the original backup (in this case 3 months) – so
copies held at a central DR site can be retained for longer than those at the remote site.
4. A second duplicate operation can be added to duplicate the original backup (using low bandwidth technology) to yet another Catalyst
store and finally, if required, a third duplicate operation can be added to copy the backup to physical tape.
5. To implement this Storage Lifecycle policy select StoreOnceCatalyst from the Policies section of the left-hand navigation and then
select StoreOnceCatalystLCP from the Policy storage drop down list in the Change Policy screen. Instead of choosing a specific backup
device for the backup, you are now choosing a storage lifecycle policy that utilizes several devices.
6. Run the StoreOnceCatalyst backup policy, which uses StoreOnceCatalystLCP for its directive, and look at the Activity Monitor. Note how,
from the single policy, we invoke both the Backup and then, some 30 minutes later, the Duplication (Catalyst copy) job. The 30 minutes is
configurable and, ideally, priority should be given to completing all backups first. A future release of NetBackup may allow the SLP
duplicate operations to be scheduled more specifically.
7. Finally, one of the options when defining the Storage Lifecycle policy is to set the Retention type of the original Backup to Expire after
copy. This means as soon as the backup is copied offsite the original backup can be deleted – thus ensuring minimal backup storage
requirements at the original site where the backup takes place.
There is also a useful feature in NetBackup where a Copy can be promoted to Primary copy; the Primary copy is always the default for recovery.
Symantec NetBackup 7.x – Catalyst best practices summary
1. Best practice is to keep similar data in the same Catalyst store. For example, dedicate a StoreOnce B6200, 4210/20, 4420/30 or 2620
Catalyst store to Oracle backups and a different store for SQL server.
2. Use of HP StoreOnce Catalyst within NetBackup requires the addition of Enterprise Disk licenses relevant to the amount of Front End
Terabytes (FETB) being protected by NetBackup on a Catalyst store.
3. The more streams that can be supplied concurrently to a Catalyst store, the better the throughput; for best throughput to a single
Catalyst store, 16 streams is recommended. The source data structure will dictate how many streams are sent in a particular backup
specification. If data is selected for backup from multiple mount points, then each mount point will have a disk agent started. It is also
possible to use the backup specification to select multiple directories for backup to produce multiple streams. Within
NetBackup there are several places that control the number of configurable streams; see the NetBackup implementation guide for
details.
4. Backup media servers running multiple streams and deduplication must be sized appropriately, using the rule of thumb described
earlier.
5. Use the HP Storage sizing tool – this is calibrated with the latest test results from HP R&D and will take into consideration data
retention, data change and growth. It will also size any WAN links or show the bandwidth savings when using Catalyst.
6. Catalyst stores must be configured on StoreOnce B6200, 4210/20, 4420/30 or 2620 with the transfer protocol settings either both set
to low bandwidth or both set to high bandwidth.
7. When configuring the storage server, the user name must be the one created on the B6200, 4210/20, 4420/30 or 2620, e.g.
Netbackup75user. The address or IP of the storage server where the Catalyst store is located is displayed on the HP StoreOnce GUI. With
StoreOnce B6200 Backup systems this is the VIF (virtual IP address or DNS name) of the data path.
8. When creating the disk pools use the default high and low watermarks.
9. When the storage unit corresponding to the Catalyst store is configured, ensure the stream count is set appropriately (16 is the
recommended maximum) and the NetBackup fragment size is the maximum possible.
10. Backup policies should have the Allow multiple data streams tick box enabled and the Take checkpoints every x mins section completed,
especially if used with a StoreOnce B6200, which supports autonomic failover. This enables backups to restart close to where they were
interrupted in the event of failover being invoked on the StoreOnce B6200.
11. Expired backups are deleted from the StoreOnce Catalyst store whenever the Image Cleanup job in NetBackup runs.
However, the space will not be reclaimed until the associated Housekeeping within the HP StoreOnce Backup system has run.
Housekeeping is configurable to run at up to two periods within any 24 hour period (by the use of blackout windows) and should be
allowed to run during periods of inactivity (not when backups or duplication are running).
12. Use the Symantec Storage Lifecycle Policy function to move Catalyst data to another device – typically offsite. Remember that if multiple
duplicates are specified, they happen serially, not in parallel. The default period between backup and duplication, or between subsequent
duplications, is 30 minutes. This might cause a high workload and delay the total backup job’s completion time. If you wish to extend the
30-minute timeout, or change the trigger point for replication jobs, create and edit the LIFECYCLE_PARAMETERS file stored in
C:\Program Files\Veritas\NetBackup\db\config as shown below.
In the above example, any duplication job smaller than 8 GB is forced to duplicate within two minutes.
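The screenshot of the LIFECYCLE_PARAMETERS file is not reproduced here; the fragment below is a sketch of what such a file might contain to achieve the behavior described above. The parameter names are standard NetBackup 7.x lifecycle tunables, but the values are illustrative only and should be adjusted to your environment:

```text
# LIFECYCLE_PARAMETERS – one parameter per line
# Check for pending duplication work more often than the 30-minute default
DUPLICATION_SESSION_INTERVAL_MINUTES 5
# Batch duplication jobs until at least 8 GB of images have accumulated...
MIN_GB_SIZE_PER_DUPLICATION_JOB 8
# ...but force any smaller batch to duplicate after 2 minutes
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 2
```

After editing the file, the changes take effect for subsequent duplication sessions; consult the NetBackup Administration Guide for the full list of supported parameters.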
Additional duplication parameters are also available, and a full inspection of the NetBackup Administration Guide is recommended. (It is
anticipated that this area of duplication will be simplified by a scheduling process in a future release of NetBackup.)
Storage Lifecycle duplicate jobs will fail (but be retried according to parameters in the LIFECYCLE_PARAMETERS file) if a replication
blackout window is set on the HP StoreOnce Backup appliance. So, ultimately, the StoreOnce appliance has control over the replication
schedule.
13. The number of duplicate jobs within a single storage lifecycle policy should be kept to a maximum of 4 for ease of monitoring. Duplicate
jobs can be monitored on HP StoreOnce Backup systems using the Inbound and Outbound Copy Jobs tabs of the Catalyst store in
the StoreOnce GUI. A better way of monitoring Catalyst replication is to use HP Replication Manager 2.1, which is supplied free of charge
when a VTL or NAS replication license is purchased or when an HP StoreOnce Catalyst license is purchased.
14. Depending on customer requirements, the retention period can be set from weeks to months, or to expire upon duplication, within the
Storage Lifecycle Policy.
15. StoreOnce Catalyst within NetBackup does not use Active Directory services for access control.
16. In a DR situation an HP StoreOnce Catalyst user has many options available which make DR much more efficient.
a. In a single Master Server environment no catalog import is required to recover from the duplicated copies of backups on
another site; the user can recover the data from “Copy2” found in the NetBackup Catalog search, if the primary copy has been
deleted or the device is unreachable.
b. Catalyst stores can be added into a Storage Unit group, which allows automated re-direction of backups if a specific Catalyst
store is unreachable. This improves high availability because the backup can be redirected to any other available device in the
Storage Unit group.
c. In a multiple Master Server environment the AIR (Automated Image Replication) function can be used to transfer Catalyst store
backup information between catalogs on different Master Servers. This feature is expected to be included in the next
revision of the HP OST plug-ins for NetBackup.
Integrating StoreOnce Catalyst with Symantec Backup Exec
This section describes how to integrate HP StoreOnce Catalyst with Symantec Backup Exec.
HP StoreOnce Catalyst is represented in Symantec Backup Exec as an OpenStorage device and requires the HP OST plug-ins for
Symantec Backup Exec to be installed on every media server that requires access to an HP Catalyst store. The plug-ins are downloadable
from here.
The Backup Exec implementation of HP StoreOnce Catalyst has the following characteristics:
The correct OST plug-in must be downloaded. There is a single HP OST plug-in for Windows.
A Deduplication Option license is required for Backup Exec.
The logon credentials for the HP StoreOnce Catalyst device must be an actual user configured in Active Directory and MUST
align with the same user configured on the StoreOnce Backup system for client access control.
2. Select Network Storage and click Next.
5. Select a Provider for the OpenStorage device.
Select the hp-StoreOnceCatalyst OpenStorage device. (If this is not displayed, the OST plug-ins for HP StoreOnce Catalyst have not been
loaded.)
6. Enter the connection information for the OpenStorage device. This is the IP address or FQDN of the StoreOnce appliance on which the
Catalyst store has been created. (If this is an HP StoreOnce B6200 Backup system, be sure to provide the Data Path VIF.) In our example,
we shall use b6200ss3.nearline.local. A client access user called beclient1 has previously been created on the HP StoreOnce appliance
and in Active Directory for the domain.
7. Click Add/Edit – you will be prompted to create a logon account.
NEARLINE\beclient1 is the user in Active Directory (see below) and beclient1 is also the client configured on the HP StoreOnce appliance
(see below). The password can be anything you choose. Click OK.
If required, you can check that your user has been created as follows.
Below we can see an extract where beclient1 has already been created on the StoreOnce Appliance.
8. We must now set the account name NEARLINE\beclient1 as the default logon account because the System logon account cannot be used
for OpenStorage devices. Select NEARLINE\beclient1 as the logon account and click Set as Default. Click on OK.
9. Click Next. (The error message in our example occurred because we were using an Administrator logon, but can be ignored because we
have now created a new user beclient1.)
10. Select a storage location on the StoreOnce appliance.
Select CatalystStore 1, which we have already created on the source StoreOnce appliance and click Next.
11. Set the number of concurrent operations to 16 to enable maximum throughput and click Next.
12. The Storage Configuration Summary is displayed. Click Finish if you are satisfied with the details (or Back to make changes).
13. The device is configured successfully and available for backup. Services must be restarted to bring the device online. Click Yes.
14. The device comes online in the storage section of the Backup Exec GUI.
The configuration of the Catalyst store can be edited by clicking the icon shown above in the Storage section. Note how the deduplication ratio is
now available directly to the backup software.
The data stream size should not affect Catalyst stores because there is buffering in the Catalyst client. The span size (or maximum object size)
likewise has no real impact with Catalyst stores because Catalyst can support as many objects as needed; there is no 25,000-file limit as
with NAS. So, HP recommends leaving these values at the default values within Symantec Backup Exec.
2. A default backup job is displayed, which you can edit in terms of data to be backed up and devices to be used. The default is Full &
Incremental and any disk storage.
3. Click on the Edit button in the Backup section (above). Click on Storage in the left-hand navigation and select our Backup Exec Catalyst
store from the drop down list.
4. Click on the Edit button in the Source System section (see step 2 screenshot) if you want to be more selective about which data is
backed up.
5. Click OK to complete the backup job.
6. You can use the One Time backup method to test the interface, if required.
Once the backup has completed successfully the representation of the Backup Exec backup in the HP StoreOnce Catalyst store is a little unusual,
as illustrated below.
Entries in the HP StoreOnce Catalyst store are known as items. A set of BEOST control files is written; these are constantly referenced and
updated. The Backup Exec implementation of OST is similar to a virtual tape emulation – with control files emulating drives and robots,
an inquiry string, and additional entries representing the slots of a tape library. The main full backup occupies a single entry (analogous to a
tape in a slot), and other item entries represent slots allocated to future incremental backup jobs. An item can have no “user data size” but
can have a metadata size, as shown below.
Before we can put this to the test, we create another Catalyst store, called CatalystStore2, on another site to demonstrate the low bandwidth copy
between sites. It is essential that the client we created, beclient1, is also configured as a client with access to the second Catalyst store; otherwise
the backup software controlled copy (optimized duplication in Backup Exec terminology) will fail.
1. Once created, the second Catalyst store comes online to Backup Exec when viewed through the storage tab.
2. Returning to the Backup and Duplicate Properties screen, you can see that we now have backup data to select. The backup job is configured
to Run Now to Backup Exec Catalyst store. The Duplicate stage is scheduled to run immediately after the backup completes, to
Backup Exec Catalyst store 2.
These can all be changed by using the Edit box associated with each operation. Note the difference in retention periods between Backup
and Duplicate.
3. The backup runs as previously and, when complete, the Duplicate stage starts straight away. The first time that duplication
occurs, 100% of the data must be replicated (sometimes called seeding). Subsequent duplications complete much faster once the
initial seeding is done.
The Backup and Duplicate functions run as two separate jobs.
4. The Backup and Duplicate is recorded in the Job Log, as shown below (double click an entry to get more details).
5. The Duplicate job can also be “scheduled” to occur by changing the options in the Duplicate stage as shown below – tick According to
schedule. The Replication blackout windows set on the StoreOnce Appliance override the duplicate schedule set on the duplicate job.
6. By clicking Add Stage in the Duplicate section, further multi-hop duplication stages can be added (see Duplicate 2 below).
If a physical tape library is connected to the Backup Exec domain, the Duplicate 2 Edit function could be used to direct the second
duplication to a physical tape device, thereby implementing a disk-to-disk-to-tape solution. The data from StoreOnce would, of course, be
re-hydrated before being copied to physical tape.
The CASO option with Backup Exec allows duplication to take place across Backup Exec Domains.
2. We performed a test restore of a file from the C:/BackupDataSets directory (shown above) and confirmed, using the StoreOnce GUI, that
this restore was taking place from Catalyst store 1.
3. The restore worked successfully.
4. We looked for a way to remove from the Backup Exec catalog the C:/BackupDataSets backups on Catalyst store 1 only, to prove we
can recover from the duplicate copy.
We can view these through the Backup Sets perspective (see next screenshot) but the only insight we have as to which Catalyst store
these backup sets are on is via the expiration date.
(In the screenshot, a 4-week retention indicates Catalyst store 2; a 2-week retention indicates Catalyst store 1.)
Alternatively we can view Backups residing on particular storage using the Storage view of Catalyst store 1 and Catalyst store 2.
5. If we now re-run the restore job, it fails because the data it is trying to restore is no longer on Catalyst store 1 (Copy 1 or Primary
version).
6. So now we have to run the restore wizard again and re-specify the criteria.
7. Backup Exec keeps all the catalog information for each restore job but, because the backup data has been removed (artificially) in this
example, the restore job fails. In a real-world scenario, if the restore had taken place after Dec 25th (when the copy of the backup
expired on Catalyst store 1), the restore would automatically have come from Catalyst store 2 (see the expiration dates above).
We are effectively forcing this to happen quickly in this worked example and so have to redefine the restore job, since we
removed the backup but not its entry from the catalog.
8. This time in the Restore Wizard (see below) we can see Copy 2, time stamped at 2:45 (the original backup, Copy 1, was time stamped
2:34), and we select this backup, which exists on Catalyst store 2, to restore.
9. This modified restore operation now completes successfully, and even indicates it is being restored from Catalyst store 2!
10. Just to be certain, we use the Activity reporting within the StoreOnce GUI to observe the reading of data from Catalyst store 2.
Image Cleanup
There is a concept in Backup Exec of Data Lifecycle Management (equivalent to the Image Cleanup utility that runs in NetBackup). This process is
automatic and follows the rules outlined below. The effect is that Backup Exec removes expired data from the HP StoreOnce Catalyst
stores automatically and proactively – releasing the space back to the free pool for future backups.
Symantec Backup Exec 2012 – Catalyst best practices summary
1. Best practice is to keep similar data in the same Catalyst store. For example: dedicate a StoreOnce B6200/42xx/44xx/2xxx Catalyst
store to File/Print data and dedicate a different store for SQL server in order to get the best deduplication ratios.
2. Use of HP StoreOnce Catalyst within Symantec Backup Exec requires a deduplication license in Backup Exec for each
media server that is involved with deduplication, as well as Front End Terabyte (FETB) licenses for the amount of data being protected by
Backup Exec on a Catalyst store.
3. The HP StoreOnce appliance also requires an additional Catalyst license to support this advanced functionality on every appliance.
4. Backup Exec 2012 SP1a must be loaded for HP StoreOnce Catalyst licenses to be properly recognized.
5. The default in Backup Exec is for Verify to be turned on, which significantly extends the backup and replication times. Consider disabling
it to reduce backup and replication windows.
6. The more streams that can be supplied concurrently to a Catalyst store, the better the throughput. For best throughput 16 streams or
above is recommended.
7. Backup media servers running multiple streams and deduplication must be sized appropriately using the “rule of thumb” mentioned in
this guide.
8. Use the HP Storage sizing tool – this is calibrated with the latest test results from HP R&D and takes into consideration data
retention, data change and growth. It will also size any WAN links and show the bandwidth savings when using Catalyst.
9. Catalyst stores must be configured on the HP StoreOnce Backup system with the transfer protocol settings either both set to low
bandwidth or both set to high bandwidth.
10. In Backup Exec, to set the concurrency level of the Catalyst store and the data stream split level, edit the Catalyst store
parameters under Storage and Properties in the left-hand navigation pane. The data stream size should not affect
Catalyst stores because there is buffering in the Catalyst client. The span size (or maximum object size) likewise has no real impact with
Catalyst stores because Catalyst can support as many objects as needed; there is no 25,000-file limit as with NAS. So HP
recommends leaving these values at the default values within Symantec Backup Exec.
11. If you are using Backup Exec Catalyst with an HP StoreOnce B6200, which supports failover, it is advisable to enable Checkpoint Restart
by using the Edit feature on each backup job and then selecting Advanced Open File Options in the left-hand navigation.
12. Similarly, for the Duplicate stage of the process you can set different retention periods by editing the duplicate
function and then selecting Storage in the left-hand navigation pane. More importantly, in this section the Compression and Encryption
type settings should be set to None: Catalyst replication uses its own compression, and encryption should not be applied because it
will reduce the deduplication ratio.
StoreOnce Catalyst low bandwidth backups over high latency links
One of the primary usage models for HP StoreOnce Catalyst is to allow remote (ROBO) sites, for the first time, to send backups directly to an HP
StoreOnce appliance at a central site over a WAN. This is possible because the volume of data actually sent across the WAN is small: the majority
of the deduplication load has already taken place on the media server, so only the unique data needs to be transmitted to the central site.
The figure below shows the impact of latency on HP StoreOnce Catalyst low bandwidth backups, with a 1% data change rate, using a single-
stream low bandwidth Catalyst backup at various link speeds from T1 to T5.
[Figure 41: Catalyst low bandwidth backups over high latency links – throughput (MB/s) versus link latency, plotted for link speeds from 1.554 Mbit/s (T1) and 6.312 Mbit/s (T2) upwards]
You can see the dramatic impact on backup performance as the latency increases up to and beyond 50 ms. HP StoreOnce Catalyst is designed to
tolerate high latency links without “dropping out”, but the main concern is ensuring that the backups can complete in the appropriate window over
high latency links. Within a single country or state most WAN links can be expected to have a latency below 50 ms, but when traffic is switched
between different networks, or travels intercontinentally, the latency can start to creep up.
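The shape of the curve in Figure 41 can be understood with a simplified back-of-envelope model (this is an illustration, not HP's actual protocol behavior): if each data segment incurs one round trip of acknowledgement latency on top of its transmission time, effective throughput drops as latency grows, and larger segments amortize the per-segment latency cost better.

```python
def effective_throughput_mb_s(segment_mb, link_mbit_s, rtt_s):
    """Effective throughput (MB/s), assuming each segment costs its
    transmission time plus one round trip for the acknowledgement."""
    link_mb_s = link_mbit_s / 8.0          # convert Mbit/s to MB/s
    transmit_s = segment_mb / link_mb_s    # time on the wire
    return segment_mb / (transmit_s + rtt_s)

# Illustration on a 45 Mbit/s (T3-class) link with 200 ms round-trip latency:
t10 = effective_throughput_mb_s(10, 45, 0.2)   # 10 MB segments (default)
t20 = effective_throughput_mb_s(20, 45, 0.2)   # 20 MB segments
```

Under these assumptions, doubling the segment size halves the fraction of time lost to acknowledgements, which is consistent with the later recommendation to raise the Catalyst segment size on links above 100 ms latency.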
A best practice in these situations, rather than attempting to throttle the bandwidth (e.g. 50% of a T1 line for Catalyst backup), is to apply a Quality
of Service metric that allows the Catalyst low bandwidth backup the full available bandwidth for a pre-defined period when other T1 traffic is not
critical (out of business hours).
There are ways to improve Catalyst low bandwidth backup throughput under these high latency conditions. A parameter called
the Catalyst segment size is effectively the payload size used between Catalyst data transfers. The default value is 10 MB, but in high
latency situations (> 100 ms), where acknowledgement times increase, it is advisable to increase the Catalyst segment size to 20 MB.
This can be done as follows:
For Symantec NetBackup and Backup Exec it is necessary to change the parameter in the OST 2.x plug-in. The file hpost.conf controls the
plug-in’s behavior and is located at:
%SystemRoot%\Program Files\Hewlett-Packard\OpenStorage20\config (for Windows)
/usr/openv/hp/ost20/config (for Linux)
The LBBUFFERSIZE parameter controls the size of the buffer, in MB, for low bandwidth operations. The default is 10 MB. For example:
LBBUFFERSIZE:10.11.220.8:Store3:20
where <server> is the media server IP address and <store name> is the Catalyst store name the user has chosen.
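Putting the pieces above together, the relevant hpost.conf entry might look like the fragment below. The server address and store name are taken from the example above; the comment lines are illustrative only, so check the plug-in documentation for the exact syntax your version supports:

```text
# hpost.conf – HP OST plug-in configuration
# Syntax: LBBUFFERSIZE:<server>:<store name>:<buffer size in MB>
# Raise the low bandwidth buffer from the 10 MB default to 20 MB
# for high latency (> 100 ms) links.
LBBUFFERSIZE:10.11.220.8:Store3:20
```

The setting is per server and per store, so an entry is needed for each Catalyst store whose backups traverse the high latency link.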
Index
10GbE Ethernet, 16

A
Active Directory, 16, 37
active/active seeding, 46
active/passive seeding, 46
active-to-active replication, 39
active-to-passive replication, 39
activity monitor, 57
apparent replication throughput, 42
appending cartridges, 31
appending NAS files, 36
authentication, 37

B
backup application
  and NAS, 34
backup application considerations
  VTL, 31
backup applications
  integrating with StoreOnce, 103
backup file size (NAS), 34
backup job
  recommendations, 32
bandwidth limiting, 27, 54
best practices
  Catalyst, 7
  general, 6
  NAS, 7, 33
  network and FC, 12
  replication, 38
  tape offload, 69
  VTL, 7, 29
blackout window, 27, 54, 61
block size, 29, 31, 35
bonded port network, 15
buffering (NAS), 36

C
cartridge sizing, 30
Catalyst
  bandwidth limiting, 27
  blackout windows, 27
  client access permissions, 28
  concurrency settings, 27
  configuration on StoreOnce, 104
  key features, 9
  overview, 8
  sizing rule, 25
Catalyst bandwidths
  explanation of, 24
Catalyst Copy, 25
  advantages of, 24
Catalyst monitoring, 56
Catalyst Sizing example, 98
Catalyst store
  create, 105
Catalyst technology, 24
Catalyst throughput, 58
CIFS AD, 16
CIFS share
  authentication, 37
  sub-directories, 33
client access permissions, 28
co-location (many-to-one) seeding, 49
co-location (over LAN) seeding, 48
comparing device types, 6
compression, 11, 36
concurrency, 42
concurrency settings, 27
concurrent backup streams
  recommendations, 29
concurrent replication jobs, 9

D
D2DBS emulation type, 29, 30
deduplication
  performance considerations, 8
device types
  comparison of, 6
diagnostic FC devices, 22
differential backups, 31
direct attach (private loop), 20
disk space pre-allocation, 35
dual port network, 14

E
emulation types, 29
encryption, 11, 36

F
fan in/fan out, 42
fibre channel
  diagnostic FC devices, 22
  soft zoning, 21
Fibre Channel
  best practices, 12
Fibre Channel topologies, 19
floating appliance seeding, 50

H
high availability network, 15
housekeeping
  overview, 10, 61
housekeeping load, 63
housekeeping statistics, 63
housekeeping tab, 62
housekeeping triggers, 10
HP Data Protector
  implementing StoreOnce Catalyst, 109

I
incremental backups, 31

K
key parameters, 70

L
libraries per appliance
  performance, 30

M
many to one seeding, 47
many-to-one replication, 39
Mode 6 bonding, 13
monitoring, 56
multi-hop Catalyst Copy, 25
multiplex, 10
multi-streaming, 10, 35

N
NAS
  backup application considerations, 34
  best practices, 33
  open files, 34
  performance considerations, 34, 35
NAS backup targets, 33
network
  bonded port configuration, 15
  dual port configuration, 14
  high availability, 15
  on two subnets, 17
  one subnet with gateway, 18
  performance considerations, 12
  single port configuration, 13
network configuration, 12
  best practices, 12
  for CIFS AD, 16
n-way replication, 39

O
open file limit (NAS), 34
open files, 34
out of sync notifications, 56
overwriting cartridges, 31
overwriting NAS files, 36

P
parallel backup streams, 76
performance
  deduplication, 8
  multi-streaming, 11
  NAS, 34, 35
  network, 12
  replication, 9
physical tape seeding, 52
product numbers, 5

R
replication
  bandwidth limiting, 54
  best practices, 38
  blackout windows, 54
  concurrency, 42
  fan in/out, 42
  guidelines, 41
  impact on other operations, 54
  overview, 9, 38
  performance considerations, 9
  seeding, 44
  source appliance permissions, 55
  usage models, 39
  WAN link sizing, 43
replication activity monitor, 57
replication concurrency
  limiting, 43
Replication Manager, 59
replication monitoring, 56
replication throughput, 57
retention policy, 31
rotation scheme, 31
  example, 32

S
seeding
  co-location (over LAN), 48
  co-location at source, 48
  floating appliance, 50
  many to one, 47
  methods, 45
  over a WAN link, 46
  overview, 44
  using physical tape, 52
single port network, 13
single subnet with gateway network, 18
sizing guide, 23
Sizing tool
  with Catalyst, 98
  with VTL and NAS, 74
Sizing tool output, 86
soft zoning, 21
source, 55
StoreOnce
  integrating with backup applications, 103
StoreOnce technology, 8
switched fabric, 19
Symantec Backup Exec
  implementing StoreOnce Catalyst, 136
Symantec NetBackup
  implementing StoreOnce Catalyst, 124
synthetic backups, 36

T
tape offload
  best practices, 69
  overview, 65
  performance factors, 68
  supported methods, 66
  when required, 66
Time Idle status, 63
Topology Viewer, 60
transfer size, 31, 35
two subnet network, 17

V
verify operation, 36
VTL
  best practices, 29
VTL performance
  libraries per appliance, 30
  maximum concurrent backup jobs, 29

W
WAN link sizing, 43
worked example, 72
write-in-place operation, 35

Z
zoning, 20
For more information
HP StoreOnce Backup system CLI Reference Guide (PDF): This guide describes the StoreOnce CLI commands and how to use them.
HP StoreOnce Backup system User Guide (PDF): This guide describes the StoreOnce GUI and how to use it.
HP StoreOnce Linux and UNIX Configuration Guide (PDF): This guide contains information about configuring and using HP StoreOnce
Backup systems with Linux and UNIX.
You can find these documents on the Manuals page of the HP Business Support Center website:
http://www.hp.com/support/manuals