
IBM Spectrum Protect

High Availability and Disaster Recovery with IBM DB2 HADR and IBM ProtecTIER native IP Replication

Authors:
Joerg Walter (jowalter@de.ibm.com)
Andre Gaschler (gaschler@de.ibm.com)
Erik Franz (erik.franz@de.ibm.com)

Version: 1.0 (18.11.2015)

Spectrum Protect with ProtecTIER HA & DR

Contents
1. Introduction
2. What is DB2 HADR?
2.1 HADR Synchronization Types and Scenarios
2.2 HADR Configuration and Validation
2.2.1 How to prime the designated standby databases?
2.2.2 Which HADR parameters need to be configured?
2.2.3 How to configure HADR on the different DB nodes?
2.2.4 How to start the databases with their appropriate HADR roles?
2.2.5 How to validate the HADR setup and the role assignment?
2.3 Performing a TSM & DB2 HADR Failover
2.3.1 How to perform a failover?
2.4 What about the TSM storage pools?
3. IBM ProtecTIER at a glance
3.1 IBM ProtecTIER with native IP Replication
4. Multi-site redundant Backup Environment
4.1 Failover in multi-site redundant Backup Environment
5. Summary
Appendix A: Contents of Figures
Appendix B: Contents of Tables
Appendix C: References
Disclaimer
Trademarks

IBM Corporation 2015

IBM and BP Internal Use


1. Introduction
IBM Spectrum Protect (formerly known as Tivoli Storage Manager / TSM) [1] is designed to protect
business-critical data and applications that require continuous availability and disaster protection.
Highly available Spectrum Protect infrastructures, in conjunction with the data deduplication and
replication features of IBM ProtecTIER [2], lead to minimal RTO/RPO times combined with maximum space
efficiency.
IBM's ProtecTIER solution offers data deduplication and replication features that allow efficient
replication of backup data to offsite locations, without the need to move physical tapes. In
addition, ProtecTIER provides the following benefits:

• Deduplication performance of up to 2,500 MB/s for backup, and even higher performance for
  restore operations
• Single-system capacity of up to 1 PB physical repository size
• LAN-free backup and restore capabilities
• Near-synchronous replication for improved RPO
• Support for multi-site replication requirements

In Spectrum Protect and ProtecTIER backup environments, ProtecTIER deduplicates and replicates
the backup data, while Spectrum Protect manages and replicates the backup catalog (metadata), which
is stored in an integrated IBM DB2 database [3].
By combining the IP-based replication features of ProtecTIER and Spectrum Protect, it is possible to
design a flexible data protection environment with multi-site redundancy.
This article describes the setup of a multi-site redundant backup environment using Spectrum Protect
together with ProtecTIER. It is based on the experience we gained during a customer implementation and
various tests in the ESCC Mainz Storage Systems Lab.
We give a short introduction to the DB2 HADR feature and to the ProtecTIER solution. After that,
we explain how a multi-site redundant backup environment based on Spectrum Protect together with
ProtecTIER is designed.


2. What is DB2 HADR?


High Availability Disaster Recovery (HADR) is a data replication feature that provides a high availability
solution for DB2 databases.
HADR protects against data loss by replicating changes from a source database (Primary) to a target
database (Standby).

Figure 1: IBM DB2 HADR Overview

The following list describes why you should consider using DB2 HADR in your Spectrum Protect
environment:

• HADR is a standard feature of DB2, which is included with TSM beginning with version 6.x, so it
  is ready to use.
• Using HADR only for the DB2 bundled with TSM requires no additional licenses.
• HADR communication is managed by the database using standard TCP/IP networks, so there
  are no special requirements regarding disk subsystems or other hardware or software.
• HADR is easy to set up and manage. Only a few commands are required to configure HADR on
  an existing TSM instance.
• HADR allows cluster features to be implemented at the application layer, with no need for
  operating system cluster support.
• HADR supports both HA and DR scenarios:
  o HADR sync peers provide warm standby for HA
  o HADR async peers provide warm standby for DR
  o Both variants can be combined in one environment

Starting with DB2 v10.1, up to three HADR standby databases can be set up for a primary database. This
feature is available with Spectrum Protect (TSM) v7.1, which contains DB2 v10.5.
One system needs to be designated as the Principal Standby, while additional standby systems can be
added as Auxiliary Standbys.
All of the HADR sync modes are supported on the principal standby, but the auxiliary standbys
always use SUPERASYNC mode.


2.1 HADR Synchronization Types and Scenarios


HADR supports a variety of log shipping synchronization modes to balance performance and data
protection:

• SYNC: Log write on primary requires replication to the persistent storage on the standby.
• NEARSYNC: Log write on primary requires replication to the memory on the standby.
• ASYNC: Log write on primary requires a successful send to standby (receive is not guaranteed).
• SUPERASYNC: Log write on primary has no dependency on replication to standby.

The following diagram shows an example of a DB2 HADR multiple standby environment:

Figure 2: DB2 HADR multiple standby environment

2.2 HADR Configuration and Validation


The following list shows the high-level tasks that have to be performed to set up HADR:
1. Synchronize the time on all involved servers and components (NTP)
2. Prime the DB2 database on the designated HADR standby(s)
3. Configure HADR parameters on primary and standby database
4. Start HADR on the standby server(s)
5. Start HADR on the primary server
6. Validate the HADR state
7. Start the TSM application on the primary server



2.2.1 How to prime the designated standby databases?

In order to allow log updates from the primary DB to be applied to the standby DB, the standby
database needs to be initialized. This is done by restoring an offline backup of the primary DB on
the target DB:
1. Back up the DB on the primary TSM server host:
   • Halt the TSM server
   • Switch to the instance owner user ID (su - tsminst1)
   • Back up the DB2 database to a shared medium (db2 backup db tsmdb1 to /nfsdir/hadr)
   Note: At this point, do not start the TSM server application again!
2. Restore the DB backup on the standby server host:
   • Halt the TSM server (if it is running)
   • Switch to the instance owner user ID (su - tsminst1)
   • Drop the existing DB2 database (db2 drop db tsmdb1)
   • Restore the DB (db2 restore db tsmdb1 from /nfsdir/hadr)
3. Continue to configure HADR for both DBs, still without starting TSM.
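
The priming commands above can be collected in a small dry-run helper. This is only a sketch: the function name prime_cmds is our own (hypothetical), the database name and backup directory are the example values used in this section, and the script prints the commands rather than executing them, so it can be reviewed before anything is run as the instance owner.

```shell
#!/bin/sh
# Dry-run sketch: print the priming commands instead of executing them.
# DB and BACKUP_DIR are the example values from the text; adjust as needed.
DB=tsmdb1
BACKUP_DIR=/nfsdir/hadr

prime_cmds() {
    # $1 = role of the host the commands are meant for: primary | standby
    case "$1" in
        primary)
            # Offline backup; the TSM server must already be halted.
            echo "db2 backup db $DB to $BACKUP_DIR"
            ;;
        standby)
            # Drop the existing database, then restore the primary's backup.
            echo "db2 drop db $DB"
            echo "db2 restore db $DB from $BACKUP_DIR"
            ;;
        *)
            echo "usage: prime_cmds primary|standby" >&2
            return 1
            ;;
    esac
}

prime_cmds primary
prime_cmds standby
```

The printed commands are meant to be run as the instance owner (tsminst1) on the respective host, with TSM halted on both sides.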

2.2.2 Which HADR parameters need to be configured?

The following table shows the DB2 HADR parameters that need to be configured properly in a multi-target
environment:

Parameter           Description
hadr_local_host     The local hostname
hadr_local_svc      TCP/IP port to be assigned to the local HADR process
                    (Note: Check /etc/services for a free port on all systems)
hadr_target_list    List of all databases (host:port) that participate in an HADR multiple standby
                    environment. It starts with the principal standby, followed by all auxiliary
                    standbys (assuming the local system becomes the primary DB)
hadr_remote_host    Hostname of the HADR remote DB/peer (the principal standby in case the local
                    system becomes the primary DB)
hadr_remote_inst    Name of the remote TSM instance (e.g. tsminst1)
hadr_remote_svc     TCP/IP port to be assigned to the remote HADR process
hadr_syncmode       Log shipping synchronization mode between primary and principal standby
                    (auxiliary standbys always use SUPERASYNC)
hadr_timeout        Wait time (in seconds) before HADR considers a peering attempt as failed

Table 1: IBM DB2 configuration parameters



2.2.3 How to configure HADR on the different DB nodes?

Example: Configure three TSM servers with HADR

TSM_SERVER_A (PRIMARY)
TSM_SERVER_B (PRINCIPAL STANDBY)
TSM_SERVER_C (AUXILIARY STANDBY)

All systems will use TCP port 60111 for the remote HADR service.
The primary server will use a replication type of SYNC to the principal standby.

Configuration of TSM_SERVER_A:
su - tsminst1
db2 update db cfg for tsmdb1 using hadr_local_host TSM_SERVER_A
db2 update db cfg for tsmdb1 using hadr_local_svc 60111
db2 update db cfg for tsmdb1 using hadr_target_list "TSM_SERVER_B:60111|TSM_SERVER_C:60111"
db2 update db cfg for tsmdb1 using hadr_remote_host TSM_SERVER_B
db2 update db cfg for tsmdb1 using hadr_remote_inst tsminst1
db2 update db cfg for tsmdb1 using hadr_remote_svc 60111
db2 update db cfg for tsmdb1 using hadr_syncmode SYNC
db2 update db cfg for tsmdb1 using hadr_timeout 120

Configuration of TSM_SERVER_B:
su - tsminst1
db2 update db cfg for tsmdb1 using hadr_local_host TSM_SERVER_B
db2 update db cfg for tsmdb1 using hadr_local_svc 60111
db2 update db cfg for tsmdb1 using hadr_target_list "TSM_SERVER_A:60111|TSM_SERVER_C:60111"
db2 update db cfg for tsmdb1 using hadr_remote_host TSM_SERVER_A
db2 update db cfg for tsmdb1 using hadr_remote_inst tsminst1
db2 update db cfg for tsmdb1 using hadr_remote_svc 60111
db2 update db cfg for tsmdb1 using hadr_syncmode SYNC
db2 update db cfg for tsmdb1 using hadr_timeout 120

Configuration of TSM_SERVER_C:
su - tsminst1
db2 update db cfg for tsmdb1 using hadr_local_host TSM_SERVER_C
db2 update db cfg for tsmdb1 using hadr_local_svc 60111
db2 update db cfg for tsmdb1 using hadr_target_list "TSM_SERVER_A:60111|TSM_SERVER_B:60111"
db2 update db cfg for tsmdb1 using hadr_remote_host TSM_SERVER_A
db2 update db cfg for tsmdb1 using hadr_remote_inst tsminst1
db2 update db cfg for tsmdb1 using hadr_remote_svc 60111
db2 update db cfg for tsmdb1 using hadr_syncmode SUPERASYNC
db2 update db cfg for tsmdb1 using hadr_timeout 120
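
Because the three configurations differ only in host names, target list and sync mode, the command set can be generated from a few arguments. The following is a minimal sketch under that assumption; the function name hadr_cfg_cmds is hypothetical, and the script only prints the commands shown above so they can be compared against the intended setup before being executed.

```shell
#!/bin/sh
# Dry-run sketch: print the HADR configuration commands for one node.
# DB, INST, PORT and the timeout are the example values from the text.
DB=tsmdb1
INST=tsminst1
PORT=60111

hadr_cfg_cmds() {
    # $1 = local host
    # $2 = remote host (principal standby if the local host becomes primary)
    # $3 = target list, e.g. "HOST_B:60111|HOST_C:60111"
    # $4 = sync mode (SYNC, NEARSYNC, ASYNC or SUPERASYNC)
    for kv in \
        "hadr_local_host $1" \
        "hadr_local_svc $PORT" \
        "hadr_target_list \"$3\"" \
        "hadr_remote_host $2" \
        "hadr_remote_inst $INST" \
        "hadr_remote_svc $PORT" \
        "hadr_syncmode $4" \
        "hadr_timeout 120"
    do
        echo "db2 update db cfg for $DB using $kv"
    done
}

# Example: the commands for TSM_SERVER_A from the configuration above.
hadr_cfg_cmds TSM_SERVER_A TSM_SERVER_B "TSM_SERVER_B:60111|TSM_SERVER_C:60111" SYNC
```

Printing instead of executing keeps the helper safe to run anywhere; the output can be reviewed and then pasted into a session of the instance owner.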



2.2.4 How to start the databases with their appropriate HADR roles?

The following tasks have to be performed in order to start all involved HADR databases, standby(s) first
and primary last:
1. Start the database on the Principal Standby:
su - tsminst1
db2 start hadr on db tsmdb1 as standby
2. Start the database on the Auxiliary Standby(s):
su - tsminst1
db2 start hadr on db tsmdb1 as standby
3. Start the database on the Primary:
su - tsminst1
db2 start hadr on db tsmdb1 as primary
4. Start TSM on the primary server
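
The start order matters (standbys before the primary), so it can help to write the plan down before executing it. The sketch below prints one "host: command" line per step in that order; the function name hadr_start_plan and the output format are our own (hypothetical), and nothing is executed.

```shell
#!/bin/sh
# Dry-run sketch: print the HADR start sequence, standbys first, primary last.
DB=tsmdb1

hadr_start_plan() {
    # $1 = primary host, remaining arguments = standby hosts (principal first)
    primary=$1
    shift
    for host in "$@"; do
        echo "$host: db2 start hadr on db $DB as standby"
    done
    echo "$primary: db2 start hadr on db $DB as primary"
}

hadr_start_plan TSM_SERVER_A TSM_SERVER_B TSM_SERVER_C
```

Each printed command is meant to be run as the instance owner on the named host; TSM is started on the primary only after the last step.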

2.2.5 How to validate the HADR setup and the role assignment?

Executing the db2pd command on the primary system provides an at-a-glance view of all peers:
[tsminst1@TSM_Server_A ~]$ db2pd -db tsmdb1 -hadr
Database Member 0 -- Database TSMDB1 -- Active -- Up 8 days 23:41:07 -- Date 2015-04-04-16.56.03.593986

HADR_ROLE = PRIMARY
REPLAY_TYPE = PHYSICAL
HADR_SYNCMODE = SYNC
STANDBY_ID = 1
HADR_STATE = PEER
PRIMARY_MEMBER_HOST = TSM_SERVER_A
PRIMARY_INSTANCE = tsminst1
STANDBY_MEMBER_HOST = TSM_SERVER_B
STANDBY_INSTANCE = tsminst1
HADR_CONNECT_STATUS = CONNECTED
HADR_CONNECT_STATUS_TIME = 03/26/2015 16:15:03.336112 (1427382903)
HEARTBEAT_INTERVAL(seconds) = 30
HADR_TIMEOUT(seconds) = 120
PRIMARY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_REPLAY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
PRIMARY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_REPLAY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)

HADR_ROLE = PRIMARY
REPLAY_TYPE = PHYSICAL
HADR_SYNCMODE = SUPERASYNC
STANDBY_ID = 2
HADR_STATE = REMOTE_CATCHUP
PRIMARY_MEMBER_HOST = TSM_SERVER_A
PRIMARY_INSTANCE = tsminst1
STANDBY_MEMBER_HOST = TSM_SERVER_C
STANDBY_INSTANCE = tsminst1
HADR_CONNECT_STATUS = CONNECTED
HADR_CONNECT_STATUS_TIME = 03/26/2015 16:15:06.459274 (1427382906)
HEARTBEAT_INTERVAL(seconds) = 30
HADR_TIMEOUT(seconds) = 120
PRIMARY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
STANDBY_REPLAY_LOG_FILE,PAGE,POS = S0000068.LOG, 3577, 36381110307
PRIMARY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
STANDBY_REPLAY_LOG_TIME = 04/04/2015 16:51:51.000000 (1428159111)
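
For a quick health check, the db2pd output can be condensed to one line per standby. The following sketch assumes the "KEY = VALUE" layout shown above and uses awk to pick out the standby host, sync mode, HADR state and connect status; the function name hadr_summary is hypothetical, and the sample input is a shortened copy of the output above.

```shell
#!/bin/sh
# Sketch: condense "KEY = VALUE" lines from db2pd -hadr to one line per standby.
hadr_summary() {
    awk -F' = ' '
        { gsub(/^[ \t]+|[ \t]+$/, "", $1) }     # trim padding around the key
        $1 == "HADR_SYNCMODE"       { mode  = $2 }
        $1 == "HADR_STATE"          { state = $2 }
        $1 == "STANDBY_MEMBER_HOST" { host  = $2 }
        # HADR_CONNECT_STATUS follows the fields above within each block
        $1 == "HADR_CONNECT_STATUS" { print host, mode, state, $2 }
    '
}

# Sample input taken from the output above (shortened).
hadr_summary <<'EOF'
HADR_SYNCMODE = SYNC
HADR_STATE = PEER
STANDBY_MEMBER_HOST = TSM_SERVER_B
HADR_CONNECT_STATUS = CONNECTED
HADR_CONNECT_STATUS_TIME = 03/26/2015 16:15:03.336112 (1427382903)
EOF
```

On a live system the same function would be fed from the command itself, for example: db2pd -db tsmdb1 -hadr | hadr_summary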


2.3 Performing a TSM & DB2 HADR Failover


Consider the following when planning for a TSM failover:

• HADR takes care of the TSM server database, but what about the storage pool data?
• Ensure that the TSM client backup data is also accessible at the failover location. This can be
  achieved e.g. by using a shared file system, a copy pool, or other data replication
  techniques (e.g. ProtecTIER).
• Ensure that the TSM clients can access the TSM instance on the failover host, e.g. by using
  service IP addresses or DNS changes.
• Properly size the LAN / WAN connections from the TSM clients to the TSM failover system.

A DB2 HADR standby database can take over the primary role, e.g. if the primary system fails or if
maintenance is planned. Accordingly, there are two failover types:

• Graceful Failover:
  If the primary and standby systems are both available, they can switch their roles. This is used
  e.g. for maintenance purposes.
• Forced Failover:
  This method is used to bring up a standby system in the primary role after the primary system
  has failed.

2.3.1 How to perform a failover?

Perform the following steps to gracefully fail over from a primary server to a standby system:
On the primary server (skip steps 1 and 2 for a forced failover):
1. Halt the TSM application (this also stops DB2)
2. Restart DB2 in the standby role:
su - tsminst1
db2start
db2 start hadr on db tsmdb1 as standby
On one of the standby servers (preferably the principal standby):
3. Execute the takeover command and validate that it was successful:
su - tsminst1
db2 takeover hadr on db tsmdb1 by force
db2pd -db tsmdb1 -hadr
4. Start the TSM application
To fail back, perform the steps above in the opposite direction.
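
The difference between the two failover types is only whether the steps on the old primary are performed. The sketch below prints the step sequence for either case as "host: command" lines; the function name failover_plan and the output format are hypothetical, the commands are the ones listed above, and the TSM halt/start steps are printed as comments because they are TSM actions rather than DB2 commands.

```shell
#!/bin/sh
# Dry-run sketch: print the failover steps described above; nothing is executed.
DB=tsmdb1

failover_plan() {
    # $1 = graceful | forced
    # $2 = current primary host, $3 = standby host that takes over
    case "$1" in
        graceful)
            echo "$2: # halt the TSM application (this also stops DB2)"
            echo "$2: db2start"
            echo "$2: db2 start hadr on db $DB as standby"
            ;;
        forced)
            : # the primary has failed, so there are no steps to run on it
            ;;
        *)
            echo "usage: failover_plan graceful|forced PRIMARY STANDBY" >&2
            return 1
            ;;
    esac
    echo "$3: db2 takeover hadr on db $DB by force"
    echo "$3: db2pd -db $DB -hadr"
    echo "$3: # start the TSM application"
}

failover_plan graceful TSM_SERVER_A TSM_SERVER_B
```

Swapping the two host arguments gives the plan for the failback.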


2.4 What about the TSM storage pools?


As discussed before, the TSM storage pool data (or a copy of it) needs to be available at the failover
location. This can be achieved by various methods, e.g.:

• Copy storage pools on disk or tape
  If HADR is used to provide HA only (e.g. with TSM servers in two DCs on the same campus),
  disk or tape storage devices could be attached via Fibre Channel to both HADR servers.
• Shared NAS file systems
  This method could be used e.g. to provide a file-based TSM copy pool to a location that is too far
  away for a Fibre Channel attachment.
• IBM ProtecTIER with native IP replication
  This is the preferred method, which we explain further in this document.


3. IBM ProtecTIER at a glance


IBM ProtecTIER with HyperFactor is software running on Linux that provides in-line data deduplication
for backup data (e.g. from Spectrum Protect, NetBackup, etc.).
Various configurations are available, e.g.:

• Small TS7620 appliance or TS7650G gateway (single node or cluster)
• Fibre Channel-attached Virtual Tape Library emulation
• Ethernet-attached CIFS, NFS or OST interfaces

Figure 3: IBM ProtecTIER Overview


3.1 IBM ProtecTIER with native IP Replication


IBM ProtecTIER native IP replication provides an option to replicate virtual cartridges (for VTLs) or files
(for systems using the File System Interface / FSI) from one ProtecTIER system to another PT system via
standard TCP/IP networks. Due to a high degree of parallelism, this option makes it possible to move huge
amounts of backup data to an offsite location over large distances. Only the deduplicated portion of the
data is replicated. The following diagram gives an overview of an IP replication scenario:

Figure 4: IBM ProtecTIER Replication Overview


4. Multi-site redundant Backup Environment


The following diagram shows an example for a three-site environment:

Figure 5: Multi-site redundant Backup Environment

• A Spectrum Protect server (HADR primary) has two HADR standby servers. The principal
  standby is in a second data center in the main location and acts as a failover system, e.g. for
  hardware maintenance purposes. The auxiliary standby acts as a failover system for DR purposes.
• Each server has a ProtecTIER TS7650G (VTL) attached.
• Virtual cartridges are replicated from PT_A to PT_B and PT_C.

4.1 Failover in multi-site redundant Backup Environment


The following steps have to be performed to fail over TSM and PT from DC1 to DC2:
1. Fail over the TSM application:
   • Check out all libvolumes from the VTL library (remove=no, checklabel=no)
   • Halt TSM in DC1
   • Restart DB2 in DC1 in the standby role
   • Perform the DB2 HADR takeover from DC1 to DC2, monitor peering, and finally start TSM in DC2
2. Fail over the PT-based storage pool(s):
   • Update the VTL library definition to serial=autodetect
   • Delete all drives and all paths to the VTL library in TSM (e.g. by using perform libaction)
   • Re-define the library path and all drives using the proper device names on the failover host
   • Enable DR mode for the ProtecTIER in DC2 to stop incoming replication traffic from DC1
   • Use the PT GUI to move the replicated cartridges to the prepared VTL partition in DC2
   • Check in the libvolumes to the re-defined library in TSM (first check in scratch, then private)
3. Optional: Prepare for continuing operations in DC2:
   • All replicated tape cartridges are read-only, which allows data to be restored
   • In order to perform new backups at the failover site, create new virtual tape cartridges
     (read-write)

The failback from DC2 to DC1 follows the same procedure in the opposite direction.
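
The TSM library commands involved in these steps can be prepared as text before the failover. The sketch below only prints TSM administrative commands; it does not run them. The library name VTL_LIB, the volume names and the function names are hypothetical. The checkout parameters and serial=autodetect come from the steps above, while action=delete, search=yes and checklabel=barcode are our assumptions for a typical VTL setup and should be checked against the server documentation. The manual steps (re-defining paths and drives, ProtecTIER DR mode, moving cartridges in the PT GUI) are deliberately left out.

```shell
#!/bin/sh
# Dry-run sketch: print TSM administrative commands for the library failover.
# LIB is a hypothetical library name.
LIB=VTL_LIB

checkout_cmds() {
    # arguments: names of the volumes to check out before halting TSM in DC1
    for vol in "$@"; do
        echo "checkout libvolume $LIB $vol remove=no checklabel=no"
    done
}

library_reset_cmds() {
    # run in DC2 before re-defining the library path and drives
    echo "update library $LIB serial=autodetect"
    echo "perform libaction $LIB action=delete"   # assumption: deletes drives and paths
}

checkin_cmds() {
    # scratch volumes first, then private volumes, as described above
    echo "checkin libvolume $LIB search=yes status=scratch checklabel=barcode"
    echo "checkin libvolume $LIB search=yes status=private checklabel=barcode"
}

checkout_cmds V00001L3 V00002L3
library_reset_cmds
checkin_cmds
```

The printed lines can be pasted into an administrative command-line session or a TSM macro once they have been reviewed.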


5. Summary
DB2 HADR offers an effective way to replicate a Spectrum Protect server database (the metadata) to
one or more (standby) target sites. Combined with the native IP replication feature of the IBM ProtecTIER
VTL system, it is possible to build easy-to-use, efficient, highly available, high-capacity and
high-performance backup solutions that also provide disaster protection.


Appendix A: Contents of Figures


Figure 1: IBM DB2 HADR Overview
Figure 2: DB2 HADR multiple standby environment
Figure 3: IBM ProtecTIER Overview
Figure 4: IBM ProtecTIER Replication Overview
Figure 5: Multi-site redundant Backup Environment


Appendix B: Contents of Tables


Table 1: IBM DB2 configuration parameters


Appendix C: References
[1] IBM Spectrum Protect (TSM) home page:
    http://www-03.ibm.com/software/products/en/tivoli-storage-manager-family
[2] IBM System Storage TS7650G ProtecTIER Deduplication Gateway:
    http://www-03.ibm.com/systems/storage/tape/ts7650g/index.html
[3] IBM DB2 database software:
    http://www-01.ibm.com/software/data/db2/


Disclaimer
The information contained in this documentation is provided for informational purposes only. While efforts
were made to verify the completeness and accuracy of the information provided, it is provided as is
without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising
out of the use of, or otherwise related to, this documentation or any other documentation. Nothing
contained in this documentation is intended to, nor shall have the effect of, creating any warranties or
representations from IBM (or its suppliers or licensors), or altering the terms and conditions of the
applicable license agreement governing the use of IBM software.
The Techdocs information, tools and documentation ("Materials") are being provided to IBM Business
Partners to assist them with customer installations. Such Materials are provided by IBM on an "as-is"
basis. IBM makes no representations or warranties regarding these Materials and does not provide any
guarantee or assurance that the use of such Materials will result in a successful customer installation.
These Materials may only be used by authorized IBM Business Partners for installation of IBM products
and otherwise in compliance with the IBM Business Partner Agreement.

Trademarks
The following terms are trademarks or registered trademarks of the IBM Corporation in the United States
or other countries or both: IBM, ProtecTIER, System Storage, Spectrum Protect and Tivoli.
Linux is a registered trademark of Linus Torvalds.
Other company, product, and service names may be trademarks or service marks of others.
