Abstract
A conceptual understanding of OpenStorage software and its API-based integration with Symantec NetBackup provides a clear view of the business value and technical merits of the integration. This guide moves past the conceptual stage to solution planning and deployment. Best practice guidelines are covered with the goal of eliminating implementation challenges. Knowledge and experience gained from assisting early adopters are presented for the benefit of those deploying an OpenStorage solution.
DEDUPLICATION STORAGE
Table of Contents
1 Introduction
  1.1 Target Audience
  1.2 Executive Summary
2 Planning
  2.1 Naming Conventions
  2.2 Networks
  2.3 Documentation
3 Optimized Duplication
  3.1 Storage Units and Storage Server Access
  3.2 Network Considerations
  3.3 Throttling Optimized Duplication
  3.4 Optimized Duplication Failures
  3.5 Seeding Remote Data Domain Systems
  3.6 Duplication Job Configuration Options
4 Duplication to Tape
  4.1 Tape Creation from the Primary NetBackup Copy
  4.2 Tape Creation from a Non-Primary NetBackup Copy
5 Disaster Recovery
  5.1 Within the Same NetBackup Domain
  5.2 To a Different NetBackup Domain
6 Additional References
7 Conclusion
8 Appendix: Migration to OpenStorage
  8.1 Multiple Protocols on One Data Domain System
  8.2 Existing Backups: Retain or Duplicate?
  8.3 Are Storage Lifecycle Policies Required?
  8.4 NetBackup Policy Modification
  8.5 Legacy Replication
  8.6 Deleting Legacy Storage Units
1 Introduction
Data Domain deduplication storage with OpenStorage software is not difficult to install or configure. Deployment is straightforward in most environments. However, deployments involving multiple sites and a complex environment may experience issues with naming conventions, network infrastructure, or site generated documentation detailing the installation. Therefore, OpenStorage implementations should be well planned and documented so that they can be deployed more quickly with fewer challenges when compared to the use of ad-hoc techniques. Deployment is often followed by a series of trials or a period of testing intended to prove that the solution is functioning as planned. In this guide, OpenStorage best practices are examined and discussed to assist in eliminating the bottlenecks associated with deployment and functional testing of the solution.
• NetBackup media server load balancing, eliminating the need to manually divide client backups across NetBackup media servers utilizing OpenStorage storage units.
• Tape consolidation: backup images from remote locations and branch offices can be replicated to a centralized location where they can be duplicated to tape under the control of NetBackup.
2 Planning
Deciding to change naming conventions halfway through a deployment can be painful, even more painful if production backups were executed to previously named components that later need to be deleted such that they can be renamed. Likewise, reconfiguring portions of the IP network that connect NetBackup media servers and Data Domain deduplication storage systems halfway through a deployment can also create a less than optimal experience when testing with production backups. Combining name changes with network changes is made worse when nothing is properly documented. While configuration changes are both possible and supported, a production environment may not be the best place to learn these techniques for the first time. Production environments differ from lab environments in that the severity of a situation is likely to be less pronounced in the lab. Creating a plan and documenting the configuration forms the foundation for a successful deployment and subsequent test phases.
server owns the storage capacity associated with a specific Data Domain system configured as a storage server.
• Create one storage unit group per OpenStorage storage unit using a -sug extension.
• The preferred storage unit could be the first storage unit in the list of storage units.
• A second storage unit can be added to the group for use should the first storage unit be rendered non-operational.
• Select the Failover storage unit selection algorithm.
The use of storage unit groups is optional. Use of the Failover selection algorithm is a best practice because it sends the same backups to the same Data Domain OpenStorage Server, which equates to a higher data deduplication ratio. In the event that the preferred storage unit enters a non-operational state, backups will be sent to an alternate storage unit. This methodology may be of interest for mission critical or otherwise important backup jobs. Alternatively, you may elect not to use storage unit groups for backups that do not require N+1 redundancy.
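The Failover behavior described above can be pictured with a short Python sketch. It is an illustrative model only, not NetBackup code: the StorageUnit class, its operational flag, and the dd120b-stu name are assumptions made for the example, while the -stu and -sug suffixes follow the naming used in this guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageUnit:
    """Hypothetical stand-in for an OpenStorage storage unit."""
    name: str
    operational: bool = True


def group_name(storage_unit_name: str) -> str:
    """Derive a storage unit group name by replacing a -stu suffix with -sug."""
    suffix = "-stu"
    if storage_unit_name.endswith(suffix):
        storage_unit_name = storage_unit_name[: -len(suffix)]
    return storage_unit_name + "-sug"


def select_failover(group: List[StorageUnit]) -> Optional[StorageUnit]:
    """Return the first operational storage unit in list order.

    The preferred unit is always chosen while it is operational, so the
    same backups keep landing on the same Data Domain system.
    """
    for unit in group:
        if unit.operational:
            return unit
    return None


if __name__ == "__main__":
    group = [StorageUnit("dd120a-stu"), StorageUnit("dd120b-stu")]
    print(group_name(group[0].name), "->", select_failover(group).name)
    group[0].operational = False  # preferred unit goes down
    print(group_name(group[0].name), "->", select_failover(group).name)
```

Because the list order alone decides the outcome, the alternate storage unit only receives backups while the preferred unit is non-operational.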
2.2 Networks
Varying degrees of network complexity are associated with a given OpenStorage deployment. At a minimum, a single Data Domain system configured as a storage server is network connected to a NetBackup media server. NetBackup optimized duplication adds additional requirements, as does a configuration that leverages media server load balancing. This section reviews sample network topologies:
• NetBackup media server and Data Domain systems sharing a common LAN configured for optimized duplication.
• NetBackup media servers and Data Domain systems in geographically different locations configured for optimized duplication.
• NetBackup media server load balancing with a Data Domain system.
Figure 3: NetBackup media server load balancing
Figure 3 shows a NetBackup client that can be backed up through a number of different NetBackup media servers. The OpenStorage storage unit has been configured so that each NetBackup media server can access its resources. This enables NetBackup media server load balancing, where the least loaded media server is used to fulfill a backup request. Additionally, this configuration allows NetBackup to bypass an offline media server when fulfilling a backup or restore request.
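The selection behavior described for figure 3 can be sketched in a few lines of Python. This is a simplified model and not NetBackup's implementation: the guide does not say how load is measured, so the active job count used here, along with the MediaServer class and the server names, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MediaServer:
    """Hypothetical view of a NetBackup media server."""
    name: str
    active_jobs: int  # assumed load metric
    online: bool = True


def pick_media_server(servers: List[MediaServer]) -> Optional[MediaServer]:
    """Pick the least loaded online media server, bypassing offline ones."""
    candidates = [s for s in servers if s.online]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.active_jobs)


if __name__ == "__main__":
    servers = [
        MediaServer("media1", active_jobs=5),
        MediaServer("media2", active_jobs=2),
        MediaServer("media3", active_jobs=0, online=False),
    ]
    print(pick_media_server(servers).name)
```

The offline server is never considered, even though it carries the lightest load, which mirrors the bypass behavior noted above.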
Figure 1: Optimized duplication with a common LAN
Figure 1 shows a simple example of a NetBackup master/media server LAN connected to two Data Domain systems. In this use case both backup and optimized duplication traffic use the same NIC (Network Interface Card) on a given Data Domain system.
Typical deployments may employ a combination of local and geographically dispersed components that leverage NetBackup media server load balancing as well as optimized duplication.
Other solutions that may pose similar issues are worth noting.
• Combining an OpenStorage storage unit and a basic disk storage unit on the same Data Domain system can create capacity reporting issues. While the simultaneous use of all Data Domain supported protocols is possible, NetBackup intelligent capacity management will not be aware of space allocated to basic disk or VTL (Virtual Tape Library) operations.
Figure 5: Combining OpenStorage and basic disk storage units
Using a single Data Domain system as an OpenStorage storage unit and a basic disk storage unit (NFS mount or CIFS share) is not recommended. NetBackup assumes complete and total ownership of any OpenStorage storage unit space. This example is also applicable to the sharing of a Data Domain system between OpenStorage and VTL protocols.
Figure 4: Attempting to overcome the single 1 GbE bottleneck
This configuration requires using four unique network names for a single Data Domain system. As a result it also requires four unique storage server instances within NetBackup. Adding four unique LSUs to the configuration yields the ability to create four unique disk pools on the single Data Domain system. The four disk pools are used in configuring four storage units. This configuration results in numerous issues and should be avoided.
Figure 6: NetBackup limitations
NetBackup catalog backups cannot be written to DiskPool type storage units, which include OpenStorage storage units. This NetBackup limitation is known to exist in product versions 6.5.0 through 6.5.2. The limitation may eventually be removed in a future version of NetBackup.
Dedicated backup networks provide a number of tangible benefits.
• Dedicated backup networks segregate NetBackup media server and storage unit traffic from other network traffic. Contention issues are constrained to backup and recovery jobs. Known available bandwidth can be managed from the perspective of achieving aggressive data protection and recovery service levels.
• Dedicated backup networks lay the foundation for a scalable infrastructure should data protection network bandwidth requirements change over time.
• Data Domain recommends the use of a 10 GbE network infrastructure in cases where single stream or aggregate data transfer rates in excess of 125 MB/s are required between a single NetBackup media server and the Data Domain system. When deploying a Data Domain system that can accommodate data transfer rates exceeding the capabilities of 1 GbE networks, the use of a 10 GbE infrastructure overcomes data transfer rate bottlenecks.
• Network topology without a 10 GbE infrastructure: As discussed previously in the naming conventions section of this document, the use of only one storage server and LSU per Data Domain system is strongly recommended. This restricts the ability to use multiple 1 GbE interfaces between a single NetBackup media server and a single Data Domain system configured as a storage server. The network connecting NetBackup media servers to a given Data Domain system can incorporate the use of multiple 1 GbE interfaces so long as there is only one connection per NetBackup media server.
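The 125 MB/s figure follows from the line rate of a 1 GbE link: 1 Gb/s divided by 8 bits per byte is 125 MB/s before protocol overhead. A minimal Python sketch of the resulting sizing check is shown here. The function name and the sample stream rates are assumptions for illustration; only the 125 MB/s threshold comes from this guide.

```python
from typing import Iterable

# Threshold quoted in this guide for a single 1 GbE connection.
# 1 Gb/s / 8 bits per byte = 125 MB/s, before protocol overhead.
GBE_1_LIMIT_MB_S = 125.0


def needs_10gbe(stream_rates_mb_s: Iterable[float]) -> bool:
    """Check the required rates for ONE media server against the 1 GbE limit.

    Returns True when any single stream, or the aggregate of all streams
    from that media server, exceeds 125 MB/s.
    """
    rates = list(stream_rates_mb_s)
    if not rates:
        return False
    single_stream_exceeds = max(rates) > GBE_1_LIMIT_MB_S
    aggregate_exceeds = sum(rates) > GBE_1_LIMIT_MB_S
    return single_stream_exceeds or aggregate_exceeds


if __name__ == "__main__":
    print(needs_10gbe([60, 60]))  # two streams, 120 MB/s aggregate
    print(needs_10gbe([60, 70]))  # two streams, 130 MB/s aggregate
```

The check applies per media server. Several media servers, each below the threshold on its own 1 GbE connection, can still be served without 10 GbE.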
Figure 7: Unsupported storage server in multiple NetBackup domains
Using OpenStorage storage units in a multiple NetBackup master server configuration appears attractive, as two sites could potentially replicate to each other and effectively serve as disaster recovery vehicles for each other. However, the configuration is not supported by NetBackup, as only one NetBackup master server can effectively own or control a given storage server.
Figure 10: Separate replication network for optimized duplication
Using a separate network for replication controlled by NetBackup optimized duplication is optional, and is usually deployed in cases where the source and destination Data Domain systems reside in geographically different locations. Note that this also serves to separate regular backup and restore network traffic from replication traffic that may be traveling over a wide area network.
Figure 9: Recommended use of multiple 1 GbE networks
In the example shown in figure 9, each of four NetBackup media servers connects to a specific NIC on a Data Domain system configured as a single storage server. Each NetBackup media server is configured to use DNS or a local host file such that the storage server name resolves to a specific interface on the Data Domain system. This configuration accommodates NetBackup media server load balancing as it utilizes a single storage server, a single disk pool, and a single storage unit. By default the storage unit is defined to allow all four NetBackup media servers to use the shared disk pool resource. This 1 GbE topology imposes limits on maximum single stream as well as aggregate data transfer rates from any single NetBackup media server to the Data Domain system. However, the combined traffic of all NetBackup media servers can yield an aggregate data transfer rate that better utilizes resources and approaches the maximum throughput possible on the Data Domain system.
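The name resolution arrangement in figure 9 can be illustrated with a small Python sketch that produces the local host file line each media server would carry. All names and addresses here are hypothetical: dd120a stands in for the storage server name, media1 through media4 for the media servers, and the 192.0.2.x addresses come from a range reserved for documentation.

```python
from typing import Dict


def hosts_entries(storage_server_name: str,
                  interface_by_media_server: Dict[str, str]) -> Dict[str, str]:
    """Map each media server to the host file line it would use.

    Every media server resolves the SAME storage server name, but to a
    different interface address on the Data Domain system.
    """
    return {
        media_server: f"{ip}\t{storage_server_name}"
        for media_server, ip in interface_by_media_server.items()
    }


if __name__ == "__main__":
    entries = hosts_entries("dd120a", {
        "media1": "192.0.2.11",
        "media2": "192.0.2.12",
        "media3": "192.0.2.13",
        "media4": "192.0.2.14",
    })
    for media_server, line in entries.items():
        print(f"{media_server}: {line}")
```

Keeping one storage server name across all four media servers is what preserves the single storage server, single disk pool, and single storage unit arrangement recommended above.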
2.3 Documentation
With multiple NetBackup media servers, multiple Data Domain systems, and the potential use of multiple networks combined with different geographical locations, the importance of documenting the deployed solution cannot be overemphasized. Proper documentation enables various site and vendor groups including management, data protection administrators, and network administrators to understand and maintain the deployed solution. Should the need arise to modify, alter, or enhance the solution, documentation lays the groundwork for moving forward. Should the need for technical support or other assistance arise, documentation can assist in rapid problem isolation and resolution.
• Topology Diagram: This basic diagram consists of a map of physical components labeled using the recommended naming conventions. Also included are the individual networks and IP addresses of the components. This common sense approach makes it possible for others within or outside of the organization to quickly understand the overall view of the deployed solution should one person be on vacation or otherwise unable to assist when needed.
• Data Collection: Collecting and recording relevant configuration information is consistent with the creation of best practice documentation. On the NetBackup master server and each NetBackup media server used for the OpenStorage solution the following commands (or their equivalent) should be executed:
• Optimized duplication traffic can use the same network connection as the NetBackup media server, or it can use an alternate NIC. Backup and recovery data streams are fully inflated, where every byte of the stream passes over a network connection. Optimized duplication replicates deduplicated data between source and destination targets, and typically requires only a fraction of the network bandwidth consumed by backup or recovery jobs. The choice of network interface for optimized duplication is usually based on deployment requirements. In cases where optimized duplication traffic flows between geographically different locations, some customers have chosen to use a separate dedicated network connection. This connection links source and destination Data Domain systems specifically for the purpose of replication controlled by NetBackup initiated optimized duplication. Users who need to track WAN link usage may also prefer this approach.
3 Optimized Duplication
Simple in principle, optimized duplication is also simple to configure once requirements are understood.
Figure 11: Separate source and destination NetBackup media servers
Figure 11 depicts optimized duplication between two OpenStorage storage units. The NetBackup media server initiating an optimized duplication job needs to have credentials to access both the source and destination OpenStorage storage units.
Credentials are set by means of the NetBackup tpconfig command on each NetBackup media server requiring access to a given OpenStorage storage unit. This allows the NetBackup media server to use the OpenStorage storage unit for backup and recovery jobs, as well as for optimized duplication. In cases where optimized duplication uses a destination OpenStorage storage unit that may be geographically distant from the NetBackup media server initiating optimized duplication, the storage unit definition should not allow the geographically distant NetBackup media server to use the storage unit for backup or recovery jobs. This is easily accomplished from within the NetBackup storage unit dialog window as shown in figure 12.
The net effect of network bandwidth throttling may impact recovery point objectives for disaster recovery. Other effects might include the queuing of jobs, as optimized duplication jobs count toward the destination storage unit's concurrent jobs. Once a storage unit's maximum concurrent jobs parameter has been reached, new jobs requiring the use of the storage unit will be queued as they await storage unit resource availability. Based on service level requirements, it might be possible to limit the quantity of network bandwidth required by limiting the amount of data that needs to be replicated. One possibility worth considering is the optimized duplication of only full backups, where incremental backups are not duplicated. Different storage lifecycle policies can be employed for full and incremental backups should this methodology align with service level objectives.
At present there is no known way to configure a NetBackup storage lifecycle policy such that a failed optimized duplication job will not be retried. However, a manually driven NetBackup utility nbstlutil can be used to cancel pending duplication operations.
• Data Domain recommends using NetBackup storage lifecycle policies to control optimized duplication.
• Storage lifecycle policies facilitate setting different retention periods for backup and duplication jobs.
• Data Domain recommends using Fixed retention periods versus Staged capacity managed and Expire after duplication retention period types.
• Data classification can be used in conjunction with storage lifecycle policies if desired, but doing so is not a requirement for optimized duplication.
Figure 13: Storage Unit dialog, Maximum concurrent jobs
Figure 13 shows the storage unit Maximum concurrent jobs parameter set to a value of zero. Use this technique when relocating a Data Domain system from a local site to a final destination site so that related optimized duplication jobs will enter a queued state instead of failing.
Figure 14: Storage lifecycle policy
The example shown in figure 14 contains a backup storage destination equal to storage unit dd120a-stu with a fixed retention period of one week. The example also includes a duplication destination equal to dd120b-stu with a fixed retention period of six months. The storage lifecycle policy has optionally been assigned a data classification value equal to Platinum. When the duplication task is executed it will result in an optimized duplication job that appears in the NetBackup activity monitor.
Storage lifecycle policy duplication relies upon certain default settings that control the point at which a duplication job will be launched. An optional configuration file can be created to customize lifecycles to run duplication jobs based on customer requirements. Out of the box defaults for NetBackup version 6.5 include:
• MIN_KB_SIZE_PER_DUPLICATION_JOB 8192
• MAX_KB_SIZE_PER_DUPLICATION_JOB 25600
• MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 30
Optimized duplication testing with backup images less than 8 GB in size may be delayed by up to 30 minutes as a result of the default settings. In a test only environment it may make sense to alter the default value for MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB to a value of less than 30 minutes. In some environments the default settings may be appropriate. The default settings can be adjusted by creating a LIFECYCLE_PARAMETERS file. The Veritas NetBackup Administrator's Guide, Volume I should be consulted for additional information before adjusting these values.
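As an illustration, the following Python sketch renders the text of a LIFECYCLE_PARAMETERS file with a shortened wait time for a test environment. The parameter names and default values are those listed above. The one "NAME value" pair per line layout is an assumption based on how the defaults are presented in this guide, and the file location is deliberately not shown; both should be confirmed against the NetBackup Administrator's Guide before use. The helper name and the 5 minute value are hypothetical.

```python
from typing import Dict, Optional

# Defaults for NetBackup 6.5 as listed in this guide.
DEFAULTS: Dict[str, int] = {
    "MIN_KB_SIZE_PER_DUPLICATION_JOB": 8192,
    "MAX_KB_SIZE_PER_DUPLICATION_JOB": 25600,
    "MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB": 30,
}


def render_lifecycle_parameters(overrides: Optional[Dict[str, int]] = None) -> str:
    """Return file text with one 'NAME value' pair per line (assumed layout)."""
    params = dict(DEFAULTS)
    for name, value in (overrides or {}).items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown lifecycle parameter: {name}")
        params[name] = int(value)
    return "".join(f"{name} {value}\n" for name, value in params.items())


if __name__ == "__main__":
    # Test-only environment: force small duplication jobs after 5 minutes.
    print(render_lifecycle_parameters(
        {"MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB": 5}
    ), end="")
```

Rejecting unknown names guards against a misspelled parameter silently leaving the default in effect.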
4 Duplication to Tape
Requirements to retain long term copies of backup images on removable tape media are easily integrated with OpenStorage solutions. Multiple means of accomplishing this objective currently exist, with additional functionality likely to be forthcoming in new NetBackup versions.
• NetBackup supports the duplication of backup images, not media or specific tape cartridges.
• NetBackup supports the creation, cataloging, and tracking of up to ten copies of a particular backup image.
• The default value for Maximum backup copies is two.
• The Maximum backup copies parameter can be adjusted with the NetBackup administrative GUI via Host Properties > Master Servers > Global Attributes.
Take for example the execution of a storage lifecycle policy used in conjunction with optimized duplication. The initial backup to the first Data Domain OpenStorage Server will be copy 1, and it will also be the primary copy. Optimized duplication of this copy to a second Data Domain OpenStorage Server will result in the creation of copy 2. Copy 2 will be a non-primary copy as long as copy 1 is still being retained, or until copy 2 is manually set to primary. Users seeking to create tape-based copies from copy 2 may elect to allow copy 1, the primary copy, to expire. At the point where copy 1 expires, copy 2 is set to primary by NetBackup. Execution of a properly configured NetBackup Vault Option policy will then use copy 2 to create copy 3. A NetBackup Vault policy that accomplishes this objective uses selection criteria that pick backups which occurred far enough in the past that copy 1 no longer exists.
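The copy numbering and primary copy behavior described above can be modeled in a few lines of Python. This is a simplified sketch, not NetBackup catalog logic: the ImageCopy class is hypothetical, and the rule that the lowest-numbered unexpired copy becomes primary is an assumed generalization of the copy 1 to copy 2 case given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageCopy:
    """Hypothetical record for one copy of a backup image."""
    number: int
    location: str
    expired: bool = False


def primary_copy(copies: List[ImageCopy],
                 manual_primary: Optional[int] = None) -> Optional[ImageCopy]:
    """Return the copy that would serve as primary.

    A manually selected copy wins if it still exists; otherwise the
    lowest-numbered unexpired copy is used (assumed rule).
    """
    live = [c for c in copies if not c.expired]
    if manual_primary is not None:
        for c in live:
            if c.number == manual_primary:
                return c
    return min(live, key=lambda c: c.number, default=None)


if __name__ == "__main__":
    copies = [ImageCopy(1, "dd120a-stu"), ImageCopy(2, "dd120b-stu")]
    print(primary_copy(copies).number)  # copy 1 while it is retained
    copies[0].expired = True            # copy 1 reaches its retention limit
    print(primary_copy(copies).number)  # copy 2 takes over as primary
```

Once copy 2 is primary, a duplication to tape that reads from the primary copy produces copy 3 from the second Data Domain system.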
Figure 15: Maximum backup copies
By default the NetBackup global attribute Maximum backup copies is set to a value of two. As shown in figure 15, altering the value to accommodate additional copies is easily performed via the administrative GUI.
Figure 16: NetBackup Vault Option profile
The NetBackup Vault Option provides the ability to specify granular selection criteria of backup images for duplication. In the example shown in figure 16, backups started between 16 and 15 days prior to the execution of a Vault job will be selected for inclusion. This strategy allows copy 1 of a backup image to expire, and copy 2 (which has been set to primary by NetBackup) to be used to fulfill a duplication request.
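The selection window in this example is simple date arithmetic, sketched below in Python. The function names are hypothetical, and treating the window as inclusive at its older edge and exclusive at its newer edge is an assumption; the guide only states that backups started between 16 and 15 days before the Vault job are selected.

```python
from datetime import datetime, timedelta
from typing import Tuple


def vault_window(run_time: datetime,
                 oldest_days: int = 16,
                 newest_days: int = 15) -> Tuple[datetime, datetime]:
    """Return (start, end) of the backup start-time window for a Vault run."""
    return (run_time - timedelta(days=oldest_days),
            run_time - timedelta(days=newest_days))


def is_selected(backup_start: datetime, run_time: datetime) -> bool:
    """True when the backup started inside the 16-to-15-day window."""
    start, end = vault_window(run_time)
    return start <= backup_start < end


if __name__ == "__main__":
    run = datetime(2008, 12, 31)
    start, end = vault_window(run)
    print(start.date(), "to", end.date())
    print(is_selected(datetime(2008, 12, 15, 12, 0), run))
```

With a retention period of one week on copy 1, as in the storage lifecycle policy example earlier in this guide, any backup inside this window has already lost copy 1, so copy 2 is the primary copy by the time the Vault job runs.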
5 Disaster Recovery
Using optimized duplication to create duplicate backup images assists in accommodating a variety of disaster recovery scenarios.
Setting a non-primary copy to primary is easily accomplished by right-clicking the image and then selecting Set Primary Copy from the pop-up menu.
Figure 19: Setting primary copy
Setting a non-primary backup image copy to primary can be accomplished via a pop-up menu. This enables the use of a particular backup image to fulfill restore requests. This can be useful in cases where recovery from a specific geographical location is desired or in cases where the original primary copy is not available.
6 Additional References
Data Domain secure access customer support site: https://my.datadomain.com/
Figure 17: NetBackup catalog, copy 1 primary copy
The NetBackup catalog utility can be used to select the primary copy of a backup image.
OpenStorage (OST) User Guide
OpenStorage (OST) Quick Start
Symantec: http://www.symantec.com/business/support/documentation.jsp?language=english&view=manuals&pid=15143
NetBackup Administration Guides
NetBackup Shared Storage Guide
NetBackup Vault Administrator's Guide
NetBackup Command Guides
Figure 18: NetBackup catalog, copy 2 not primary
The NetBackup catalog utility can be used to select copy 2 of a backup image.
7 Conclusion
Data Domain support for Symantec NetBackup OpenStorage advances the ability to use disk as disk, stores more data on disk with inline deduplication, and simplifies the creation of backup copies with optimized duplication. Creating duplicate backup copies with optimized duplication enables advanced disaster recovery strategies. Disaster recovery copies of backup images are created faster, and are available at the disaster recovery location sooner, when compared to tape-based solutions.
Data Domain | 2421 Mission College Blvd., Santa Clara, CA 95054 | 866-WE-DDUPE, 408-980-4800
Copyright 2008 Data Domain, Inc. All rights reserved. Data Domain, Inc. believes information in this publication is accurate as of its publication date. This publication could include technical inaccuracies or typographical errors. The information is subject to change without notice. Changes are periodically added to the information herein; these changes will be incorporated in new editions of the publication. Data Domain, Inc. may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time. Reproduction of this publication without prior written permission is forbidden. The information in this publication is provided as is. Data Domain, Inc. makes no representations or warranties of any kind, with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Data Domain and Global Compression are trademarks of Data Domain, Inc. All other brands, products, service names, trademarks, or registered service marks are used to identify the products or services of their respective owners.
WP-OSTBPG-1208
www.datadomain.com