Lijun (June) Gu
David (Danhai) Cao
Joachim Dirker
Roger E. Sanders
Michael T. Terrell
Roland Tretau
ibm.com/redbooks
International Technical Support Organization

DB2 UDB Exploitation of NAS Technology

July 2002
SG24-6538-00
Take Note! Before using this information and the product it supports, be sure to read the general information in Notices on page xv.
First Edition (July 2002)

This edition applies to IBM TotalStorage Network Attached Storage (NAS) and Network Appliance filer products with the Windows Powered Operating System and Linux.

Copyright Network Appliance Inc. 2002. All rights reserved.
Comments may be addressed to:
IBM Corporation, International Technical Support Organization
Dept. QXXE Building 80-E2
650 Harry Road
San Jose, California 95120-6099

When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.
Copyright International Business Machines Corporation 2002. All rights reserved.

Note to U.S. Government Users: Documentation related to restricted rights. Use, duplication, or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
Contents
Figures  ix
Tables  xiii
Notices  xv
Trademarks  xvi
Preface  xvii
The team that wrote this redbook  xvii
Special notice  xix
Comments welcome  xx

Part 1. NAS and NetApp filer  1

Chapter 1. Introduction to DB2 UDB, NAS, and SAN  3
1.1 Introduction to DB2 UDB  4
1.1.1 DB2 Universal Database packaging  4
1.1.2 The Universal Database  5
1.1.3 DB2's query optimizer  8
1.1.4 DB2 utilities  8
1.2 Introduction to Network Attached Storage  13
1.2.1 File servers  13
1.2.2 Network appliances  14
1.2.3 Benefits of NAS  14
1.3 Introduction to Storage Area Networks  17
1.3.1 Storage Area Network  17
1.3.2 Benefits of SAN  18

Chapter 2. DB2 UDB, NAS, and SAN terminology and concepts  23
2.1 DB2 terminology and concepts  24
2.1.1 Instances  24
2.1.2 Databases  25
2.1.3 Buffer pools  25
2.1.4 Table spaces  26
2.1.5 Tables, indexes, and long data  30
2.1.6 DB2 UDB and parallelism  31
2.1.7 Registry and environment variables  32
2.1.8 A word about DB2EMPFA  34
2.1.9 Backup and recovery  34
2.2 NAS terminology and concepts  38
2.2.1 Network file system protocols  38
2.2.2 File I/O  40
2.2.3 Local Area Networks (LANs)  41
2.3 Storage Area Network terminology and concepts  41
2.3.1 SAN storage  41
2.3.2 SAN fabric  42
2.3.3 SAN applications  42

Chapter 3. Introduction to the NetApp filer  45
3.1 The Network Appliance Filer  46
3.2 System architecture  46
3.2.1 NVRAM implementation  48
3.2.2 RAID environment  48
3.2.3 Write Anywhere File Layout (WAFL)  49
3.2.4 Snapshots  49

Chapter 4. NetApp filer terminology and concepts  51
4.1 Understanding RAID  52
4.1.1 Levels of RAID  52
4.1.2 Eliminating the parity disk bottleneck  55
4.1.3 Using multiple RAID groups  56
4.1.4 Performance and RAID configuration  58
4.2 WAFL implementation  58
4.2.1 Meta-data lives in files  59
4.2.2 A tree of blocks  60
4.2.3 A word about write allocation  61
4.3 Snapshots  62
4.3.1 Snapshots and the block-map file  64
4.4 Volumes and Quota Trees  65
4.4.1 Quota trees  66

Chapter 5. DB2 and the NetApp filer  67
5.1 DB2/NetApp filer design considerations  68
5.2 Interacting with a Network Appliance filer  69
5.2.1 Using FilerView  71
5.3 Creating volumes on a Network Appliance filer  72
5.4 Creating qtrees on a Network Appliance filer  74
5.5 Managing NFS exports (UNIX only)  79
5.6 Filer volumes and qtrees with DB2 UDB  82
5.7 Creating DB2 UDB databases on a filer  86
5.7.1 Setting the appropriate environment/registry variables  86
5.7.2 Creating DB2 UDB databases  87
5.7.3 Verifying the location of a database  89
5.7.4 Improving the performance of SMS table spaces  91
5.7.5 Changing the storage location of database log files  92

Chapter 6. Backup and recovery options for databases that reside on NetApp filers  95
6.1 Backup methods available  96
6.2 Designing a DB2 database with filer  96
6.3 Suspending and resuming database I/O  98
6.3.1 WRITE SUSPEND  98
6.3.2 WRITE RESUME  99
6.3.3 DB2INIDB  99
6.4 Using NetApp Snapshots with a DB2 database  100
6.4.1 Taking a Snapshot  101
6.4.2 Restoring a DB2 UDB database from a filer Snapshot  101
6.4.3 DataLink considerations  104

Chapter 7. Diagnostics and performance monitoring  105
7.1 The DB2 Database System Monitor  106
7.1.1 The snapshot monitor  106
7.1.2 Event monitors  112
7.2 Operating system monitoring tools  114
7.2.1 The top program  115
7.2.2 Virtual memory statistics (vmstat)  115
7.2.3 Process state (ps)  116
7.3 Network Appliance filer monitoring tools  117
7.3.1 sysstat  118
7.3.2 ifstat  119
7.3.3 netstat  119
7.3.4 df  121

Part 2. DB2 working with IBM NAS  123

Chapter 8. Terminology and concepts of IBM NAS  125
8.1 The IBM TotalStorage NAS 200 and 300 concept  126
8.1.1 System architecture  126
8.1.2 NAS Server Engine  127
8.1.3 Storage subsystems  128
8.1.4 Pre-loaded code  129
8.2 IBM NAS terminology  129
8.2.1 Hard disks and adapters  129
8.2.2 Arrays, logical disks, and volumes  130
8.2.3 RAID support  132
8.2.4 File system I/O  134
8.2.5 Backup and recovery functions  135
8.3 Backup and recovery in IBM NAS products  136
8.4 IBM NAS Persistent Storage Manager (PSM)  137
8.4.1 How PSM works: overview  137
8.4.2 PSM cache contents  139
8.4.3 PSM True Image: read-only or read-write  144

Chapter 9. Introduction to IBM NAS  147
9.1 IBM Network Attached Storage overview  148
9.2 IBM TotalStorage Network Attached Storage  148
9.2.1 The IBM TotalStorage Network Attached Storage 200  148
9.2.2 The IBM TotalStorage Network Attached Storage 300  150

Chapter 10. Configuration of IBM NAS 200 and 300  153
10.1 Our environment  154
10.1.1 Create db2 user account  155
10.1.2 Add computer to domain  158
10.2 Setting up IBM NAS 200  161
10.2.1 Connecting to the NAS 200  161
10.2.2 Default configuration  162
10.2.3 Setting up storage  163
10.2.4 Add NAS 200 to domain  165
10.2.5 Creating a share volume  165
10.3 Setting up the IBM NAS 300  169
10.3.1 Default configuration  169
10.3.2 Setting up storage on the NAS 300  170
10.3.3 Setting up the Cluster Server  174
10.3.4 Create clustered share volume  189
10.4 Getting connected to NAS  197
10.4.1 Accessing the shares from our Windows clients  197
10.4.2 Accessing the shares for DB2 user  197

Chapter 11. DB2 installation on IBM NAS  199
11.1 DB2 for Windows on IBM NAS  200
11.1.1 DB2 for Windows Objects on IBM NAS  201

Chapter 12. Backup and recovery options for DB2 UDB and IBM NAS  205
12.1 Backup and recovery considerations on IBM NAS  206
12.1.1 DB2 UDB standard backup and recovery methods  208
12.1.2 DB2 UDB NAS True Image support  209
12.2 DB2 UDB considerations for PSM True Images  212
12.2.1 Getting DB2 UDB prepared for IBM NAS True Image  212
12.2.2 PSM configuration  213
12.2.3 Options for IBM NAS True Image copies  216
12.2.4 Creating an IBM NAS True Image  217
12.2.5 Restoring an IBM NAS True Image  218
12.2.6 Accessing a True Image copy: overview  220
12.2.7 Some considerations about cache size and location  223
12.3 Using IBM NAS True Image with DB2 UDB  224
12.3.1 System environment  224
12.3.2 Taking a True Image of an offline DB2 UDB database  226
12.3.3 Taking a True Image of an online DB2 UDB database  228
12.3.4 PSM True Image copy as DB2 UDB True Image database  229
12.3.5 Creating a DB2 backup from a True Image  235
12.3.6 Version recovery from a PSM True Image  236
12.3.7 Roll-forward recovery from a True Image  239

Chapter 13. IBM NAS high availability  241
13.1 NAS 200 high availability  242
13.2 NAS 300 high availability  243
13.3 Failover tests on NAS 300  243
13.3.1 Creating a failover event  244
13.3.2 Failover response  244
13.3.3 Load balancing  247
13.3.4 Administration considerations for NAS  247

Abbreviations and acronyms  249

Related publications  257
IBM Redbooks  257
Other resources  258
Referenced Web sites  259
How to get IBM Redbooks  260
IBM Redbooks collections  260
Index  261
Figures
1-1 The implementation of NAS in a typical storage network  15
1-2 Storage consolidation  19
1-3 Logical storage consolidation  20
1-4 Loading the IP network  22
2-1 Relationship between buffer pools, table spaces, and instances  28
2-2 Table space containers and extents  29
2-3 IBM NAS devices use File I/O  40
3-1 Network Appliance System Architecture  47
4-1 Network Appliance's RAID 4 disk layout  53
4-2 FFS and WAFL disk write operation patterns  55
4-3 Layout used by the WAFL file system  59
4-4 WAFL's tree of blocks  60
4-5 How WAFL creates a Snapshot in an active file system  62
4-6 Life cycle of a block-map file entry  64
4-7 NetApp filer with multiple volumes composed of multiple RAID groups  65
5-1 Infrastructure used to test DB2 UDB and a Network Appliance filer  69
5-2 Initial page of the Network Appliance filer Web interface  70
5-3 FilerView's main screen  71
5-4 Manage Volumes screen  72
5-5 Add New Volume data entry screen  75
5-6 Volumes Report screen  76
5-7 Manage Qtrees screen  77
5-8 Create a new Qtree dialog  77
5-9 Manage Qtrees screen after qtrees for test environment were created  78
5-10 Manage NFS Exports screen  79
5-11 Create a New /etc/exports Line dialog  80
5-12 Adding permissions to an export entry  81
5-13 Add Option dialog  81
5-14 NFS permissions for our test environment  82
5-15 /etc/fstab file used in Linux test environment  84
5-16 Output from df after qtrees were mounted on our Linux server  85
5-17 Output from LIST DATABASE DIRECTORY command  90
5-18 Output from LIST TABLESPACES command  91
5-19 Output from LIST TABLESPACE CONTAINERS command  92
7-1 Sample GET MONITOR SWITCHES output  109
7-2 Sample GET DBM MONITOR SWITCHES output  110
7-3 Sample table space-level snapshot output  110
7-4 Sample Table-level snapshot output  111
7-5 Sample top output  115
7-6 Sample vmstat output  116
7-7 Sample ps output  117
7-8 Sample sysstat output  118
7-9 Sample ifstat output  119
7-10 Sample netstat output  120
7-11 Sample df output  121
8-1 IBM NAS Appliance System Architecture  127
8-2 IBM NAS I/O mapping  128
8-3 IBM NAS array support  131
8-4 IBM NAS logical drives  131
8-5 IBM NAS drive partitions  132
8-6 PSM copy-on-write operation  138
8-7 PSM read from persistent image  139
10-1 Our environment  154
10-2 Create new user  155
10-3 Change user properties  156
10-4 Change nas_db2_user group: select group  157
10-5 Change nas_db2_user group: result  158
10-6 Create computer account in Windows domain  159
10-7 Add to domain  159
10-8 Add to domain: the domain user account  160
10-9 Add user to local Administrator group  160
10-10 Using IBM NAS 200 Server RAID Manager  164
10-11 Share folder  166
10-12 Add user for share folder  167
10-13 Set permissions for db2_data folder  168
10-14 Create disk array  170
10-15 Create disk array  171
10-16 Create logical disk  172
10-17 Create partition  172
10-18 Create partition: select disk size  173
10-19 Create partition: format disk  173
10-20 Set up public network  178
10-21 Configure public network  179
10-22 Set up first node  180
10-23 Set up cluster on first node  181
10-24 Set up second node  182
10-25 Result of setting up second node  183
10-26 Adjust cluster quorum log size  184
10-27 Increase private network priority  185
10-28 Set private network to internal communication  186
10-29 Resource balance for disk group 1  187
10-30 Set up threshold for failover of Disk Group 1  188
10-31 Set up failback for Disk Group 1  188
10-32 Create IP address resource  191
10-33 Enter IP address resource information  191
10-34 Select possible owner of IP address resource  192
10-35 Enter IP address for IP address resource  192
10-36 Bring IP address resource online  192
10-37 Enter network name resource information  193
10-38 Enter dependencies information  193
10-39 Enter network name  194
10-40 Bring network name online  194
10-41 Create share volume  195
10-42 Enter dependencies for Share Volume resource  195
10-43 Enter Share Volume information  196
11-1 DB2 Control Center launching database wizard  202
11-2 DB2 UDB 7.2 Create Database Wizard  203
12-1 Database Backup from True Image Copy  207
12-2 Version recovery from True Image  207
12-3 DB2relocatedb scenario  211
12-4 PSM True Image: Global Settings  214
12-5 PSM True Image: Select Volume for configuration  215
12-6 PSM True Image: Volume Settings  215
12-7 PSM True Image: Volume List  216
12-8 PSM True Image Copy: Create new Copy  217
12-9 PSM True Image Copy: Volume Selection  217
12-10 PSM True Image Copy: Persistent Images List  218
12-11 PSM True Image Copy: Restore read-write True Image  219
12-12 NAS System Log  219
12-13 PSM True Image Copy: System Log Details  220
12-14 NAS Volumes with allocated PSM cache  220
12-15 NAS volumes with PSM cache  221
12-16 How to access PSM True Image copies  222
12-17 Our test environment  224
12-18 NAS directory structure for scenario environment  225
12-19 Directory structure for primary and secondary images  226
12-20 True Image of an offline DB2 database  227
12-21 True Image of an online database  228
12-22 Accessing a True Image copy from a secondary server  230
12-23 Initiate database as DB2 True Image database  231
12-24 Screen Capture of the db2inidb command sequence  232
12-25 Different directory structure for database True Image copy  233
12-26 The db2inidb RELOCATE command sequence  234
12-27 DB2 Backup from a True Image Copy  235
Version recovery from a PSM True Image
Roll-forward recovery from database True Image
Network resource response in type 3 failover
Error message in the DB2 command center
Tables
4-1 Volumes compared to quota trees  66
5-1 Mount option descriptions  84
7-1 Snapshot monitor switches  107
8-1 Layout of disk after instant virtual copy is made  140
8-2 Layout of PSM cache after instant virtual copy is made  140
8-3 Layout of disk immediately after file is deleted  141
8-4 Layout of PSM cache immediately after file is deleted  141
8-5 Layout of disk after changing "time" to "date"  141
8-6 Layout of PSM cache after changing "time" to "date"  141
8-7 Layout of disk after changing "men" to "women"  142
8-8 Layout of PSM cache after changing "men" to "women"  142
8-9 Layout of disk after changes without free space detection  143
8-10 Layout of PSM cache after changes without free space detection  143
8-11 Layout of disk after changes with free space detection  144
8-12 Layout of PSM cache after changes with free space detection  144
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: AFS, AIX, AIX 5L, DB2, DB2 Connect, DB2 Universal Database, DFS, Enterprise Storage Server, ESCON, Everyplace, FlashCopy, IBM, IBM.COM, IMS, Informix, Micro Channel, Netfinity, OS/2, OS/390, OS/400, PAL, Perform, PowerPC, Predictive Failure Analysis, RACF, RAMAC, Redbooks, Redbooks (logo), RMF, S/390, SANergy, Sequent, ServeRAID, SP, SP2, TCS, Tivoli, TotalStorage, xSeries.
The following terms are trademarks of International Business Machines Corporation and Lotus Development Corporation in the United States, other countries, or both: Approach, Lotus, Word Pro.
The following terms are trademarks of other companies: ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. C-bus is a trademark of Corollary, Inc. in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. SET, SET Secure Electronic Transaction, and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC. Other company, product, and service names may be trademarks or service marks of others.
Preface
This IBM Redbook is an informative guide that describes how DB2 Universal Database (UDB) can take advantage of Network Attached Storage (NAS) and Storage Area Networks (SAN) technology. Specifically, this book provides detailed information to help you learn about Network Appliance filers and IBM NAS 200/300 appliances and to show you how DB2 UDB databases can be stored on these devices. This easy-to-follow guide documents the generic network, software, and hardware requirements, as well as the basic procedures needed to set up, configure, and integrate DB2 UDB databases with Network Appliance Filers and IBM NAS 200 and 300 appliances. These procedures start with the basics of initializing the NAS device for DB2 UDB and then continue with the more advanced topics of backing up databases stored on NAS devices using the True Image technology that is supplied with each. This book also provides general information on how DB2 UDB databases, which are stored on NAS devices as well as the NAS devices themselves, can be monitored for performance.
Joachim Dirker is a Data Management Pre-Sales Specialist with IBM Germany. He has over 10 years of experience in the Data Management field. Roger E. Sanders is a Database Performance Engineer with Network Appliance, Inc. He has been designing and programming software applications for IBM PCs for more than 15 years, and he has worked with DB2 Universal Database and its predecessors for the past 10 years. He has written several computer magazine articles, presented at two International DB2 User's Group (IDUG) conferences, and is the author of All-In-One DB2 Administration Exam Guide, DB2 Universal Database SQL Developer's Guide, DB2 Universal Database API Developer's Guide, DB2 Universal Database CLI Developer's Guide, ODBC 3.5 Developer's Guide, and The Developer's Handbook to DB2 for Common Servers. His background in database application design and development is extensive, and he holds the following professional certifications: IBM Certified Advanced Technical Expert DB2 for Clusters; IBM Certified Solutions Expert DB2 UDB V7.1 Database Administration for UNIX, Windows, and OS/2; IBM Certified Solutions Expert DB2 UDB V6.1 Application Development for UNIX, Windows, and OS/2; and IBM Certified Specialist DB2 UDB V6/V7 User. Michael T. Terrell is an IT Specialist with the IBM Data Management group. He has 15 years of experience in database application design and development. He has worked for Informix Software and IBM for 9 years, where he held positions as a trainer, consultant, and IT Specialist. He holds the following professional certifications: IBM Certified Specialist DB2 UDB V6/V7 User and IBM Certified Solutions Expert DB2 UDB V7.1 Database Administration for UNIX, Windows, and OS/2. Roland Tretau is a Project Leader with the IBM International Technical Support Organization, San Jose Center. Before joining the ITSO in April 2001, Roland worked in Germany as an IT Architect for Cross Platform Solutions and Microsoft Technologies.
He holds a master's degree in Electrical Engineering with a focus in Telecommunications.

We would especially like to thank the following people for their contributions in providing equipment and content to be incorporated within these pages:

Barry Warwick, Rob Davis, Frank Tutone, Benjamin L. (Ben) Stern, Brenda Haynes, Barbara Gallimore, Susan Grey, Ling Pong
IBM US

Bob Jancer
Network Appliance, Inc.
Thanks to the following people for their contributions to this project:

Rakesh Goenka, Enzo Cialini, Dale M. McInnis
IBM DB2 Toronto Lab

Mark Hayakawa, Michael Sowers, Brent Barnum, Jeff Browning, Joe Richart, Dave Hitz, Michael Marchi, James Lau, Michael Malcolm, Karl L. Swartz, Keith Brown, Jeff Katcher, Rex Walters, Andy Watson, Francine Bellet
Network Appliance, Inc.

Jay Knott, Ken E. Quarles, J. M. Lake, Sushama Paranjape, Sandy Albu, John M. Zoltek, Garry Rawlins, Sasha A. Loose, Tina DeAnglis
IBM US

Michael Baker, Cheryl Block, Margaret Hockett
Critical Thinking Books & Software, CA USA

Nagraj Alur, Corinne Baragoin, Tom Cady, Will Carney, Mary Comianos, Rowell Hernandez, Emma Jacobs, Yvonne Lyon, Deanna Polm, Journel Saniel, Patrick Vabre, Bart Steegmans, Osamu Takagiwa
International Technical Support Organization, San Jose Center
Special notice
This publication is intended to help DB2 database administrators, database specialists, and network/storage administrators install, configure, back up, and restore DB2 using the IBM TotalStorage NAS 200, IBM TotalStorage NAS 300, or Network Appliance filer. The information in this publication is not intended as the specification of any programming interfaces that are provided by IBM TotalStorage NAS 200 and 300. See the PUBLICATIONS section of the IBM Programming Announcement for the IBM TotalStorage NAS 200 and 300 for more information about what publications are considered to be product documentation.
Comments welcome
Your comments are important to us! We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways: Use the online Contact us review redbook form found at:
ibm.com/redbooks
Part 1
In this part of the book, we first introduce the basic concepts of Network Attached Storage (NAS), Storage Area Networks (SAN), and DB2 Universal Database (DB2 UDB), along with Network Appliance (NetApp) filer terminology and concepts. Next, we show you how DB2 UDB works with the Network Appliance filer. Finally, we describe the various methods that can be used for diagnostics and performance monitoring of DB2 UDB.
Chapter 1. Introduction to DB2 UDB, NAS, and SAN
Universal access
DB2 UDB provides universal access to all forms of electronic data. This includes traditional relational data as well as structured and unstructured binary information, documents and text in many languages, graphics, images, multimedia (audio and video), information specific to operations, such as engineering drawings, maps, insurance claim forms, numerical control streams, or any type of electronic information. Access to a wide variety of data sources can be accomplished with the use of DB2 UDB and its complementary products: Relational Connect, DB2 Connect, Data Joiner, and Classic Connect. Sources that can be accessed include: DB2 UDB for OS/390, DB2 UDB for OS/400, IMS, Oracle, MS/SQL Server, Sybase, NCR Teradata, and IBM Informix databases.
Universal application
DB2 UDB supports a wide variety of application types. It can be configured to perform well for both online transaction processing (OLTP) as well as for decision support systems (DSS). It can also be used as the underlying database for an online analytical processing (OLAP) system. DB2 UDB is also accessible from and/or can be integrated into a wide variety of application development environments. In addition to being able to embed SQL statements within source code files written in standard programming languages such as C/C++, COBOL, and Visual Basic, DB2 UDB fully supports Java technology and is accessible from Java applets, servlets, and applications. DB2 UDB also participates in Microsoft's OLE DB as both a provider and a consumer.
Universal extensibility
Data is stored in most relational databases according to its data type and DB2 UDB is no exception. In order to support a wide variety of data types and formats, DB2 UDB contains a rich set of built-in data types, along with a set of functions that are designed to manipulate each of these data types. DB2 UDB also provides a way to create user-defined data types (UDTs) and supporting user-defined functions (UDFs); consequently, the base data types provided can be extended to provide data types that are specific to your business needs.
Using UDTs and UDFs, IBM went one step farther and created several different sets of user-defined data types and functions to manage particular kinds of data that have begun to emerge over the last few years. Collectively, these sets of data types and functions are referred to as extenders. Currently, five different extender products are available for DB2 UDB; together they provide the capability to store and manipulate image, audio, video, text, XML, and spatial data, just to name a few.
Universal scalability
DB2 UDB scales from pervasive/handheld devices, in which DB2 Everyplace is used, all the way up to Massively Parallel Processing (MPP) environments, in which DB2 UDB Enterprise - Extended Edition (EEE) is used. The various editions of DB2 UDB outlined above will run on Palmtops, Laptops, Distributed Servers, and Central Servers, as well as clustered server configurations. The superior scalability of DB2 UDB is made possible through a combination of features that are built into the base product. These include intra-partition parallelism as well as inter-partition parallelism. With intra-partition parallelism, database operations are subdivided into multiple parts, which are then executed in parallel within a single database partition. With inter-partition parallelism, database operations are subdivided into multiple parts, which are then executed in parallel across one or more partitions of a multi-partition database. In addition, DB2 UDB's database engine is designed to take advantage of I/O parallelism (the process of reading or writing to two or more I/O devices at the same time) whenever possible, and it is capable of interacting with disk I/O subsystems that have been designed with RAID technology in mind.
Universal reliability
DB2 UDB runs reliably across multiple hardware and operating systems; sometimes, however, unforeseen events (such as power or media failures) can cause a database system to become unstable or unusable. DB2 UDB uses write-ahead transaction logging as a preventive measure, and as a result, it can usually resolve database problems that are caused by power interruptions and/or application failures without any additional intervention. Unfortunately, this is not the case when problems arise because the storage media being used to hold a database's files becomes corrupted or fails. To address these types of problems, some kind of backup (and recovery) program must be established. And to help establish such a program, DB2 UDB provides a set of utilities that can be used to: Create a backup image of a database. Return a database to the state it was in when a backup image was made (by restoring it from a backup image).
Reapply (or roll-forward) some or all changes made to a database since the last backup image was made, once the database has been returned to the state it was in when the backup image was made. Backups can be scheduled to run automatically, and both full and incremental backup images can be made. Backups can also be tailored for a single table space or for an entire database, and backup images can be taken while a database is online or offline.
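A minimal command sequence for these three utilities might look like the following sketch (the database name, target path, and timestamp are invented for illustration):

```
db2 backup database sample to /db2/backup
db2 restore database sample from /db2/backup taken at 20020715120000
db2 rollforward database sample to end of logs and complete
```

The backup image is written to the target path, the restore returns the database to the state captured in the image identified by the timestamp, and the roll-forward reapplies logged changes made after that image was taken.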
Universal management
The primary management tool for DB2 UDB is the Control Center, which, together with a common integrated tool set provided, is used to manage local and remote databases across all software and hardware client platforms from a single terminal. Components of the Control Center include:

The Command Center, which is a GUI window that provides for inputting database or operating system commands, while allowing for storage, retrieval, and browsing of previous commands.
The Script Center, which is a GUI that allows for the creation, modification, and execution of database or operating system scripts.
The Journal Facility, which is a GUI tool that provides for managing jobs, recovery, alerts, and messages.
The Visual Explain facility, which provides a graphical means to display optimization-associated cost information and visual drill-down views of a query's access plan.
The Event Analyzer, which is a flexible GUI tool that provides summary and historical analysis of performance.
The Performance Monitor, which is a GUI tool that supports online monitoring of buffer pools, sorts, locks, I/O, and CPU activity.
SmartGuides, which are GUIs that guide database administrators through tasks such as backup/recovery, performance configuration, and object definition.
The Alert Center, which is a GUI tool that displays objects which are in an exception status.
The Index Creation wizard, which is a GUI tool that helps database administrators build the best possible indexes for a given query workload.

All of the information presented in the Control Center can also be accessed via a command line interface known as the Command Line Processor.
This book does not attempt to provide a complete list of the DB2 utilities. Instead, we focus on those utilities that are most likely to be used with network attached storage. For more information on the DB2 UDB utilities, please refer to the appropriate sections of the DB2 UDB Administration Guide (all three volumes), the DB2 Command Reference, and the DB2 UDB Data Movement Guide.
Export
The Export utility is used to extract specific portions of data from a database and externalize it to ASCII Delimited (DEL), Worksheet (WSF), or PC Integrated Exchange Format (PC/IXF or IXF) formatted files. Such files can then be used to populate tables in a variety of databases (including the database the data was extracted from) or to provide input to software applications such as spreadsheets and word processors.
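For example, the contents of a table could be externalized to a PC/IXF formatted file with a command sequence such as the following (the database, table, and file names are illustrative):

```
db2 connect to sample
db2 "export to staff.ixf of ixf messages export.msg select * from staff"
```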
Import
The Import utility provides a way to read data directly from DEL, ASC, WSF, or PC/IXF formatted files and store it in a specific database table. When the Export utility is used to externalize data in a table to a PC/IXF formatted file, the table structure and definitions of all of the tables associated indexes are written to the file along with the data. Because of this, the Import utility can create/re-create a table and its indexes as well as populate the table if data is being imported from a PC/IXF formatted file. When any other file format is used, if the table or updateable view receiving the data already contains data values, the data being imported can either replace or be appended to the existing data, provided the base table receiving the data does not contain a primary key that is referenced by a foreign key of another table. In some situations, data being imported can also be used to update existing rows in a base table.
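For example, a table and its indexes could be re-created and populated from a PC/IXF formatted file with a command such as this (the file and table names are illustrative):

```
db2 "import from staff.ixf of ixf messages import.msg create into newstaff"
```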
Load
Like the Import utility, the Load utility is designed to read data directly from DEL, ASC, or PC/IXF formatted files and store it in a specific database table. However, unlike with the Import utility, the table that the data is stored in must already exist in the database before the load operation is initiated; the Load utility ignores the table structure and index definition information stored in PC/IXF formatted files. Likewise, the Load utility does not create new indexes for a table; it only rebuilds indexes that have already been defined for the table being loaded. The most important difference between the Import utility and the Load utility relates to performance. Because the Import utility inserts data into a table one row at a time, each row inserted must be checked for constraint compliance (such as foreign key constraints and table check constraints), and all activity performed must be recorded in the database's transaction log files. The Load utility, on the other hand, inserts data into a table much faster than the Import utility: instead of inserting data into a table one row at a time, it builds data pages using several individual rows of data and then writes those pages directly to the table space container in which the table's structure and any preexisting data have been stored. Existing primary/unique indexes are then rebuilt once all constructed data pages have been written to the container, and duplicate rows that violate primary or unique key constraints are deleted (and copied to an exception table, if appropriate).
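A basic load operation might be invoked as follows (the file and table names are illustrative; the target table must already exist):

```
db2 "load from staff.del of del messages load.msg insert into staff"
```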
Autoloader
In a partitioned database environment, the Autoloader utility is used to perform the necessary steps needed to balance data being loaded from single or multiple flat files into the partitions that comprise a partitioned table. The Autoloader utility splits all data being loaded using the partition map for the table, pipes the split data to the appropriate partitions, and concurrently loads each portion of the data into each partition of the partitioned table.
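In DB2 UDB V7, the Autoloader is driven by a configuration file that identifies, among other things, the input data, the target table, and the partitions to load; the utility is then invoked with the db2atld command. A sketch of the invocation, with an invented configuration file name, is:

```
db2atld -c autoload.cfg
```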
DB2MOVE
The DB2MOVE utility facilitates the movement of a large number of tables between DB2 databases. This utility queries the system catalog tables for a particular database and compiles a list of all user tables found. It then exports the contents and table structure of each table found to a PC/IXF formatted file. The set of files produced can then be imported or loaded to another DB2 database on the same system, or they can be transferred to another workstation platform and imported or loaded to a DB2 database that resides on that platform. (This is the best method to use when copying or moving an existing database from one platform to another.) The DB2MOVE utility can be run in one of three modes: EXPORT, IMPORT, or LOAD. When run in EXPORT mode, the DB2MOVE utility invokes the Export utility to extract data from one or more tables and externalize it to PC/IXF formatted files. It also creates a file named db2move.lst that contains the names of all tables processed, along with the names of the files that each table's data was written to. When run in IMPORT mode, the DB2MOVE utility invokes the Import utility to re-create a table and its indexes from data stored in PC/IXF formatted files. In this mode, the file db2move.lst is used to establish a link between the PC/IXF formatted files needed and the tables into which data will be imported. When run in LOAD mode, the DB2MOVE utility invokes the Load utility to populate tables that have already been created with data stored in PC/IXF formatted files. Again, the file db2move.lst is used to establish a link between the PC/IXF formatted files needed and the tables into which data will be loaded.
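For example, all of a database's user tables could be exported on the source system and re-created on the target system with commands such as these (the database names are illustrative; run the second command in the directory holding the exported files and db2move.lst):

```
db2move sample export
db2move targetdb import
```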
DB2LOOK
The DB2LOOK utility is a special utility that will generate the DDL SQL statements needed to re-create existing objects in a given database. In addition to generating DDL statements, DB2LOOK can also collect statistical information that has been generated for objects in a database from the system catalog tables and save it (in readable format) in an external file. In fact, by using DB2LOOK, it is possible to create a clone of an existing database that contains both its data objects and current statistical information about each of those objects.
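For example, the DDL for all objects in a database, along with UPDATE statements that mimic its current catalog statistics, could be captured with a command such as the following (the database and output file names are illustrative):

```
db2look -d sample -e -m -o sample.ddl
```

Here -e extracts the DDL statements, -m generates the statistics-mimicking statements, and -o names the output file.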
The file server was responsible for accurately managing I/O requests, queuing requests as necessary, fulfilling requests and returning appropriate information to the correct initiator. In addition, the file server handled all aspects of security and lock management. If one user had a file open for updating, no one else was allowed to update the file until it was released. The file server kept track of connected clients by means of their network IDs, addresses, and so on.
Resource pooling
A NAS appliance enables disk storage capacity to be consolidated and pooled on a shared network resource, which may be located at a great distance from the clients and servers that will access it. Thus, a NAS device can be configured as one or more file systems, each residing on specified disk volumes (Figure 1-1).
All users accessing the same file system are assigned space within it on demand. This contrasts with individual DAS storage, where some users may have too little storage and others may have too much. Consolidation of files onto a centralized NAS device can also minimize or eliminate the need to have multiple copies of files spread across several distributed clients. Thus overall hardware costs can be reduced. Additionally, NAS pooling can reduce the need to physically reassign capacity among users. The results can be lower overall costs through better utilization of the storage, lower management costs, increased flexibility, and increased control.
Simple to implement
Because NAS devices attach to mature, standard LAN implementations, and have standard LAN addresses, they are typically extremely easy to install, operate, and administer. This plug-and-play operation results in lower risk, ease of use, and fewer operator errors, all of which contribute to a lower cost of ownership.
Enhanced choice
With NAS, the storage decision is separated from the server decision, thus enabling the buyer to exercise more choice in selecting equipment to meet the business needs.
Connectivity
LAN implementation allows any-to-any connectivity across the network. Often, NAS appliances allow for concurrent attachment to multiple networks; thus, one NAS device can support many users simultaneously.
Scalability
Typically, NAS appliances can scale in capacity and performance within the allowed configuration limits of the individual appliance. However, this scalability may be restricted by considerations such as LAN bandwidth constraints, and the need to avoid restricting other LAN traffic.
Enhanced backup
NAS appliance backup is a common feature of most popular backup software packages. For instance, the IBM NAS 200 and 300 appliances all provide TSM client software support. Some NAS appliances have an integrated, automated backup facility to tape, enhanced by the availability of advanced functions such as the IBM NAS appliance facility called Persistent Storage Manager (PSM). This enables multiple point-in-time copies of files to be created on disk, which can be used to make backup copies to tape in the background. This is similar in concept to features such as IBM's Snapshot function on the IBM RAMAC Virtual Array (RVA).
Improved manageability
By providing consolidated storage, which supports multiple application systems, storage management is centralized. This enables a storage administrator to manage more capacity on a NAS appliance than typically would be possible for distributed, directly attached storage. To summarize, an appliance is an easy-to-use device, which is designed to perform a specific function, such as serving files to be shared among multiple clients. In fact, a NAS appliance performs this task very well. It is important to recognize that a NAS is not a general-purpose server, and should not be used (indeed, due to its customized OS, probably cannot be used) for general-purpose server tasks. However, it does provide a good solution for appropriately selected shared storage applications. In this book, we focus on implementing DB2 UDB EE on NAS as a storage networking solution. Reading this book should adequately equip you to implement a DB2 and NAS solution using one or more products we describe to meet your networked storage requirements.
A SAN can be shared between servers and/or dedicated to one server. It can be local or can be extended over geographical distances. SAN interfaces can be Enterprise Systems Connection (ESCON), Small Computer System Interface (SCSI), Serial Storage Architecture (SSA), High Performance Parallel Interface (HIPPI), Fibre Channel (FC), or whatever new physical connectivity emerges. The diagram in Figure 1-3 shows a schematic overview of a SAN connecting multiple servers to multiple storage systems.
Physical consolidation
Data from disparate storage subsystems can be combined onto large, enterprise class shared disk arrays, which may be located at some distance from the servers. The capacity of these disk arrays can be shared by multiple servers, and users may also benefit from the advanced functions typically offered with such subsystems. This may include RAID capabilities, remote mirroring, and instantaneous data replication functions, which might not be available with smaller, integrated disks. The array capacity may be partitioned, so that each server has an appropriate portion of the available gigabytes.
Free space
Available capacity can be dynamically allocated to any server requiring additional space. Capacity not required by a server application can be re-allocated to other servers. This avoids the inefficiency associated with free disk capacity attached to one server not being usable by other servers. Extra capacity may be added in a non-disruptive manner.
Logical consolidation
It is possible to achieve shared resource benefits from the SAN, but without moving existing equipment. A SAN relationship can be established between a client and a group of storage devices that are not physically co-located (excluding devices which are internally attached to servers). A logical view of the combined disk resources may allow available capacity to be allocated and re-allocated between different applications running on distributed servers, to achieve better utilization. Consolidation is covered in greater depth in IBM Storage Solutions for Server Consolidation, SG24-5355.
Data sharing
The term data sharing is used somewhat loosely by users and some vendors. It is sometimes interpreted to mean the replication of files or databases to enable two or more users, or applications, to concurrently use separate copies of the data. The applications concerned may operate on different host platforms. A SAN may ease the creation of such duplicated copies of data using facilities such as remote mirroring. Data sharing may also be used to describe multiple users accessing a single copy of a file. This could be called true data sharing. In a homogeneous server environment, with appropriate application software controls, multiple servers may access a single copy of data stored on a consolidated storage subsystem. If attached servers are heterogeneous platforms (for example, a mix of UNIX and Windows NT), sharing of data between such unlike operating system environments is complex. This is due to differences in file systems, data formats, and encoding structures. IBM, however, uniquely offers a true data sharing capability, with concurrent update, for selected heterogeneous server environments, using the Tivoli SANergy File Sharing solution.
Tape pooling
Providing tape drives to each server is costly, and also involves the added administrative overhead of scheduling the tasks and managing the tape media. SANs allow for greater connectivity of tape drives and tape libraries, especially at greater distances. Tape pooling is the ability for more than one server to logically share tape drives within an automated library. This can be achieved by software management, using tools such as Tivoli Storage Manager, or with tape libraries with outboard management, such as IBM's 3494.
A SAN provides the solution by enabling the elimination of backup and recovery data movement across the LAN. Fibre Channel's high bandwidth and multi-path switched fabric capabilities enable multiple servers to stream backup data concurrently to high-speed tape drives, freeing the LAN for other application traffic. IBM's Tivoli software solution for LAN-free backup offers the capability for clients to move data directly to tape using the SAN. A future enhancement to be provided by IBM Tivoli will allow data to be read directly from disk to tape (and tape to disk), bypassing the server. This solution is known as server-free backup.
Chapter 2.
2.1.1 Instances
DB2 Universal Database sees the world as a hierarchy of several different types of objects. Workstations on which any edition of DB2 Universal Database has been installed are known as system objects, and they occupy the highest level of this hierarchy. System objects can represent systems that are accessible to other DB2 clients or servers within a network, or they can represent stand-alone systems that neither have access to nor can be accessed from other DB2 clients or servers.

When any edition of DB2 Universal Database is installed on a particular workstation (or system), program files for the DB2 Database Manager are physically copied to a specific location on that workstation, and one instance of the DB2 Database Manager is created and assigned to the system as part of the installation process. (Instances comprise the next level in the object hierarchy.) If needed, additional instances of the DB2 Database Manager can be created for a particular system; multiple instances can be used to separate the development environment from the production environment, tune the DB2 Database Manager for a particular environment, and protect sensitive information from unauthorized access.

Each time a new instance is created, it references the DB2 Database Manager program files that were stored on that workstation during the installation process; thus, each instance behaves like a separate installation of DB2 Universal Database, even though all instances within a particular system share the same binary code. Although all instances share the same physical code, each can be run concurrently with the others, and each has its own environment, which can be modified by altering the contents of its configuration file.
2.1.2 Databases
In its simplest form, a DB2 Universal Database database is a set of related database objects. In fact, when you create a DB2 UDB database, you are establishing an administrative relational database entity that provides an underlying structure for an eventual collection of database objects (such as tables, views, indexes, and so on). This underlying structure consists of a set of system catalog tables (along with a set of corresponding views), a set of table spaces in which both the system catalog tables and the eventual collection of database objects will reside, and a set of files that will be used to handle database recovery and other bookkeeping details. DB2 UDB allows multiple databases to be defined within a single instance. Each database has its own configuration file, which allows characteristics of the database, such as memory usage and logging, to be fine-tuned for optimum performance.
Page cleaners
To prevent a buffer pool from becoming full, page cleaner agents write modified pages to disk once a predetermined threshold is reached (by default, when 60 percent of the pages in the buffer pool have been changed) to guarantee the availability of buffer pool pages for future read operations. For example, if you have updated a large amount of data in a table, many data pages in the buffer pool may be updated but not yet written to disk storage (these pages are known as dirty pages). Since prefetchers cannot place fetched data pages on top of the dirty pages in the buffer pool, these dirty pages must be flushed to disk storage so that prefetchers can store needed data pages in the buffer pool.
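The page-cleaning trigger described above can be illustrated with a minimal Python sketch. This is a toy model, not DB2 internals; the class and method names are invented for illustration, and the threshold simply mirrors the documented 60 percent default.

```python
# Toy model of a buffer pool whose page cleaners flush dirty pages
# once the fraction of changed pages crosses a threshold (default 60%).
# Illustrative only -- not DB2 code.

class BufferPoolSketch:
    def __init__(self, num_pages, dirty_threshold=0.60):
        self.num_pages = num_pages
        self.dirty_threshold = dirty_threshold
        self.dirty_pages = set()
        self.flushes = 0

    def modify_page(self, page_id):
        """Mark a page dirty; trigger page cleaning at the threshold."""
        self.dirty_pages.add(page_id)
        if len(self.dirty_pages) / self.num_pages >= self.dirty_threshold:
            self.flush_dirty_pages()

    def flush_dirty_pages(self):
        """Write dirty pages to disk so their slots can be reused
        by prefetchers for incoming pages."""
        self.flushes += 1
        self.dirty_pages.clear()

pool = BufferPoolSketch(num_pages=10)
for page in range(6):            # dirtying 6 of 10 pages hits the 60% mark
    pool.modify_page(page)
print(pool.flushes, len(pool.dirty_pages))   # 1 0
```

After the sixth page is dirtied, one flush occurs and the pool again has free, clean slots for prefetched pages.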
Regardless of how they are managed, three types of table spaces can exist: regular, temporary, and long. Tables that contain user data can reside in regular DMS table spaces. (Indexes can also be stored in regular DMS table spaces.) Tables that contain long field data or large object (LOB) data, such as multimedia objects, can reside in long DMS table spaces. Temporary table spaces are classified as either system or user; system temporary table spaces are used to store internal temporary data that is required during SQL operations such as sorting, table reorganization, index creation, and table joins. User temporary table spaces are used to store declared global temporary tables that, in turn, are used to store application-specific temporary data.
Containers
Every table space is made up of at least one container, which is essentially an allocation of physical storage that the DB2 Database Manager is given unique access to. Containers provide a way of defining what location on a specific storage device will be made available for storing database objects. Containers may be assigned from file systems by specifying a directory; such containers are identified as PATH containers. Containers may also reference files that reside within a directory; these containers are identified as FILE containers and, when used, a specific file size must be specified. Finally, containers may reference raw devices; such containers are identified as DEVICE containers, and the device specified must already exist on the system before a DEVICE container can be used. A single table space can span many containers, but each container can belong to only one table space. Figure 2-1 illustrates the relationship between buffer pools, table spaces, and containers.
[Figure 2-1 Relationship between buffer pools, table spaces, and containers — a system contains an instance, which contains a database; the database's buffer pool serves table spaces 1, 2, and 3, each made up of containers]
Page size
With DB2 UDB, data is transferred between table space containers and buffer pools in discrete blocks that are called pages. (The memory reserved to buffer a page transfer is called an I/O buffer.) The actual page size used by a particular table space is determined by the page size of the buffer pool the table space is associated with. Four different page sizes are available: 4K, 8K, 16K, and 32K. By default, all table spaces that are created as part of the database creation process are assigned a 4K page size.
Extent size
An extent is a unit of space within a container that makes up a table space. When a table space spans multiple containers, data associated with that table space is stored on all of its respective containers in a round-robin fashion; the extent size of a table space represents the number of pages of table data that are to be written to one container before moving to the next container in the list. This helps balance data across all containers that belong to a given table space (assuming all extent sizes specified are equal). Figure 2-2 illustrates how extents are used to balance data across multiple containers.
[Figure 2-2 Extents striped round-robin across containers — extents 0, 2, and 4 on container 0; extents 1 and 3 on container 1]
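The round-robin extent placement described above can be sketched in a few lines of Python. The function name is illustrative only, not part of any DB2 API; the point is simply that extent n lands on container n modulo the number of containers.

```python
# Conceptual sketch of round-robin extent placement across the
# containers of a table space (illustrative, not a DB2 API).

def container_for_extent(extent_number, num_containers):
    """Extent n is written to container n mod num_containers."""
    return extent_number % num_containers

# Reproduce the layout of Figure 2-2: five extents over two containers.
layout = {c: [] for c in range(2)}
for extent in range(5):
    layout[container_for_extent(extent, 2)].append(extent)
print(layout)   # {0: [0, 2, 4], 1: [1, 3]}
```

With equally sized containers, this scheme spreads data (and therefore I/O) evenly across all of the table space's containers.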
Prefetch size
Prefetching is a technique that the DB2 Database Manager uses to fetch pages of data that it thinks a user is about to need into one or more buffer pools before requests for the data are actually made. Thus, the prefetch size of a table space identifies the number of pages of table data that are to be read in advance of the pages currently being referenced by a query, in anticipation that they will be needed to resolve the query. The overall objective of sequential prefetching is to reduce query response time. This can be achieved if page prefetching can occur asynchronously to query execution.
Sequential prefetches read consecutive pages into the buffer pool before they are needed. List prefetches, however, are more complex; in this case, the DB2 optimizer attempts to optimize the retrieval of randomly located data. The amount of data being prefetched determines the amount of parallel I/O activity. Ordinarily, the database administrator should define a prefetch value large enough to allow parallel use of all of the available containers, and therefore all of the physical devices used.
Tables
Tables are uniquely identified units of storage that are maintained within a table space. Each table is a logical structure that is used to present data as a collection of unordered rows with a fixed number of columns. Every column contains a set of values of the same data type (or one of its subtypes), and the definition of the columns in a table makes up the table structure (the rows contain the actual table data). The storage representation of a row is called a record, and the storage representation of a column is called a field. Each intersection of a row and column in a database table contains a specific data item called a value. Data in a table is typically logically related, and additional relationships, known as referential constraints, can be defined between two or more tables.
Indexes
An index is an object that contains an ordered set of pointers that refer to a key in a base table. When indexes are used, the DB2 Database Manager can access data directly and more efficiently, because each index provides a direct path to the data through pointers that have been ordered based on the values of the columns that the index is associated with. When an index is created, the DB2 Database Manager uses a balanced tree (a hierarchical data structure in which each element has at most one predecessor but may have many successors) to order the values of the key columns in the base table that the index refers to. More than one index may be defined for a given table, and indexes provide a way to assist in the clustering of data.
Long data
All data is classified, to some extent, according to its type (for example, some data might be numerical, whereas other data might be textual). Because a table is comprised of one or more columns that are designed to hold data values, each column must be assigned a specific data type. This data type determines the internal representation that will be used to store the data, what the range of the data's values is, and what set of operators and functions can be used to manipulate that data once it has been stored. DB2 Universal Database supports 19 different built-in data types (along with an unlimited number of user-defined data types that are based on the built-in data types). Of these built-in data types, five are designed to store data values that exceed 32,700 bytes in length:
- Varying-length long character string (LONG VARCHAR)
- Varying-length double-byte long character string (LONG VARGRAPHIC)
- Binary large object (BLOB)
- Character large object (CLOB)
- Double-byte character large object (DBCLOB)
These objects, although logically referenced as part of the table, may be stored in their own table space when the base table is stored in a DMS table space. This allows for more efficient access of both the long data and the related table data.
I/O parallelism
Parallel I/O refers to the process of writing to, or reading from, two or more I/O devices simultaneously. The DB2 Database Manager can take advantage of parallel I/O in situations where multiple storage containers exist for a single table space. When used, I/O parallelism can provide significant improvements in data throughput.
Query parallelism
Query parallelism controls how database operations are performed. DB2 UDB supports two different types of query parallelism: inter-query parallelism and intra-query parallelism. Inter-query parallelism refers to the ability of multiple applications to query a database at the same time; each query executes independently of the others, but all are executed concurrently. Intra-query parallelism refers to the simultaneous processing of individual parts of a single query, using either intra-partition parallelism, inter-partition parallelism, or both. When intra-partition parallelism is used, what is usually considered to be a single database operation, such as index creation, database loading, or an SQL query, is subdivided into multiple parts, many or all of which can be run in parallel within a single database partition. When inter-partition parallelism is used, such an operation is subdivided into multiple parts, many or all of which are run in parallel across multiple partitions of a partitioned database. Inter-partition parallelism only applies to DB2 UDB Enterprise-Extended Edition (EEE).
DB2_PARALLEL_IO
When reading data from, or writing data to table space containers, DB2 may use parallel I/O if the number of containers in the database is greater than 1. However, there are situations when it would be beneficial to have parallel I/O enabled for single container table spaces. For example, if the container is created on a RAID device that is composed of more than one physical disk, performance may be improved if read and write calls are issued in parallel.
To force DB2 UDB to use parallel I/O for a table space that only has one container, you use the DB2_PARALLEL_IO registry variable. This variable can be set to asterisk (*), meaning every table space is to use parallel I/O, or it can be set to a list of table space IDs that are separated by commas. For example, this command would turn parallel I/O on for all table spaces:
db2set DB2_PARALLEL_IO=*
However, the following command would only turn parallel I/O on for table spaces 1, 2, 4, and 8:
db2set DB2_PARALLEL_IO=1,2,4,8
The DB2_PARALLEL_IO registry variable also affects table spaces with more than one container defined. If this registry variable is not set, the I/O parallelism used is equal to the number of containers in the table space. However, if this registry variable is set, the I/O parallelism used is equal to the result of (prefetch size / extent size). For example, suppose a table space has 2 containers and the prefetch size is 4 times the extent size. If the DB2_PARALLEL_IO registry variable is not set, a prefetch request for this table space will be broken into 2 requests (each request will be for 2 extents); provided that prefetchers are available to do the work, 2 prefetchers can service these requests in parallel. If the DB2_PARALLEL_IO registry variable is set, the prefetch request will be broken into 4 requests (1 extent per request), with the possibility of 4 prefetchers servicing the requests in parallel. In this example, if each of the 2 containers had a single disk dedicated to it, setting the DB2_PARALLEL_IO registry variable might result in contention on those disks, since 2 prefetchers would be accessing each disk at once. However, if each of the 2 containers was striped across multiple disks, setting the registry variable would potentially allow access to 4 different disks at the same time.
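The request-splitting arithmetic in this example can be captured in a short Python sketch. This is an illustrative model of the documented behavior, not DB2 code; the function name and the 32-page extent size are invented for the example.

```python
# Sketch of how a prefetch request is split into parallel I/O requests:
# with DB2_PARALLEL_IO unset, one request per container; with it set,
# one request per extent (prefetch_size / extent_size requests).
# Illustrative only -- not DB2 internals.

def prefetch_requests(prefetch_pages, extent_pages, num_containers,
                      parallel_io_set):
    if parallel_io_set:
        return prefetch_pages // extent_pages   # one extent per request
    return num_containers                       # one request per container

# 2 containers, prefetch size = 4 extents of 32 pages each:
print(prefetch_requests(4 * 32, 32, 2, parallel_io_set=False))  # 2
print(prefetch_requests(4 * 32, 32, 2, parallel_io_set=True))   # 4
```

The two printed values correspond to the 2-request and 4-request cases worked through in the paragraph above.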
DB2_STRIPED_CONTAINERS
When creating a DMS table space, a one-page tag is stored at the beginning of each container used for identification purposes. The remaining pages are available for storage by DB2 and are grouped into extent-size blocks of data. When using RAID devices for table space containers, it is suggested that the table space be created with an extent size that is equal to, or a multiple of, the RAID stripe size. However, because of this one-page container tag, the extents will not line up with the RAID stripes; this may cause I/O requests to access more physical disks than would be optimal. This can have a significant impact when the RAID devices are not cached, and do not have special sequential detection and prefetch mechanisms.
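The alignment problem caused by the one-page container tag can be illustrated with a small Python sketch. The page and stripe sizes here are hypothetical, chosen only to show how a one-page tag shifts every extent off the RAID stripe boundaries, while reserving a full extent for the tag restores alignment.

```python
# Sketch of the container-tag alignment issue (illustrative sizes).
# With a one-page tag, extent boundaries no longer coincide with RAID
# stripe boundaries; a full-extent tag keeps them aligned.

def extent_start_page(extent_number, extent_pages, tag_pages):
    """First page of an extent, after the container tag."""
    return tag_pages + extent_number * extent_pages

extent_pages = 8          # extent size, in pages
stripe_pages = 8          # RAID stripe size equal to extent size (recommended)

# One-page tag: extents start at pages 1, 9, 17, ... (misaligned).
misaligned = [extent_start_page(n, extent_pages, tag_pages=1) for n in range(3)]
# Full-extent tag: extents start at pages 8, 16, 24, ... (aligned).
aligned = [extent_start_page(n, extent_pages, tag_pages=extent_pages)
           for n in range(3)]

print(misaligned, [p % stripe_pages == 0 for p in misaligned])
print(aligned, [p % stripe_pages == 0 for p in aligned])
```

Every misaligned extent straddles two RAID stripes, so a one-extent read may touch one more physical disk than necessary; the aligned layout avoids this.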
To eliminate this problem, the DB2_STRIPED_CONTAINERS registry variable can be used to tell DB2 UDB to use a full extent to store the identification tag in each container used. If this variable is set to ON, every table space created will use a full extent for each container's tag. For example, this command would cause identification tags to be stored in one extent, rather than in one page:

db2set DB2_STRIPED_CONTAINERS=ON
If the autorestart parameter in a database's configuration file is set to ON, the DB2 Database Manager will automatically execute the RESTART DATABASE command whenever it determines that the database is in an inconsistent state. (The DB2 Database Manager checks the state of a database when it attempts to establish the first connection to that database.) The RESTART DATABASE command is designed to handle problems that are caused by power interruptions and application failures. However, it cannot correct problems that are caused by media or storage failure. In order to resolve these types of problems, a backup image of the database must exist.

Database backup images can be created at any time by executing the BACKUP DATABASE command. After one or more backup images have been created, they can be used to rebuild the database or any of its table spaces if either becomes damaged or corrupted.

The first time a backup image of a database is created, a special file, known as the database recovery history file, is also created. This file is then updated with summary information each time subsequent backup images are made. Because the database recovery history file contains summary information about each backup image available, it is used as a tracking mechanism during a recovery (restore) operation. (Each backup image contains special information in its header that is checked against the records in the recovery history file to verify that the backup image being used corresponds to the database being restored.)

A damaged or corrupted database can be restored to the state it was in when a particular backup image was made by executing the RESTORE DATABASE command. When a database is restored from a backup image, all changes made to that database since the backup image was created will be lost unless roll-forward recovery for that database has been enabled.
(Roll-forward recovery is enabled by setting the logretain and/or the userexit parameter in a database's configuration file to ON.) When enabled, roll-forward recovery uses information stored in a database's transaction logs to reapply some or all of the changes made to a database since the last backup image was taken. By reapplying changes stored in the transaction logs, a database can be returned to the state it was in just before the restore/roll-forward operation began. The roll-forward process is initiated by executing the ROLLFORWARD DATABASE command. Usually, a roll-forward recovery operation is performed immediately after a full database restore operation is completed.

Note: If a roll-forward recovery operation is to follow a full database restore operation, all database log files associated with that database must be copied to a separate directory, if possible, before the restore operation is performed (otherwise they will be overwritten by the log files stored in the backup image). These log files must then be copied back to their original location before the roll-forward recovery operation is started.
Transaction logging
Transaction logging is simply a process that is used to keep track of changes that are made to a database, as they are made. Each time a change is made to a row in a table (by an insert, update, or delete operation), records that reflect that change are written to a log buffer, which is simply a designated area in memory.

When a transaction terminates by executing a COMMIT or a ROLLBACK SQL statement, when pages are flushed from a buffer pool by a page cleaner, or when the log buffer becomes full, all log records associated with that transaction or page (or stored in the log buffer) are immediately written from the log buffer to one or more log files stored on disk. Only after all log records associated with the transaction have been externalized to one or more log files does the transaction receive confirmation that the commit or rollback operation has been successfully completed. This ensures that the log records of a completed transaction will not be lost due to a system failure. (Although log records may be written to disk before a commit or a rollback operation is performed, for example, if the log buffer becomes full, such early writes do not affect data consistency because the execution of the COMMIT or ROLLBACK statement itself is eventually logged as well.)

The transaction logger and the buffer pool manager cooperate to ensure that updated information for a data page is not written to the database before its associated log record(s) have been written to the log file(s). This behavior ensures that the DB2 Database Manager can obtain enough information from the logs to recover a database that has been left in an inconsistent state, for example as a result of a power or application failure.

Two methods of transaction logging are available: circular and archive. Each logging method provides a different level of recovery capability. As long as a database is active, an active transaction log is available, regardless of which method is used.
If the DB2 Database Manager cannot find a log to write to, it will suspend processing until a log file becomes available.
Circular logging
With circular logging, a group of primary online log files are defined and used in a round-robin fashion to provide transaction logging support. Records are written to a log file as a transaction is processed, and records are removed from a log file when the transaction the records are associated with is either committed and externalized to disk or rolled back. Once all records in a primary log file have been processed, the log file is freed up for reuse and will be repopulated the next time it becomes the active log file in the circular cycle. If the DB2 Database Manager determines that the next primary log file needed is unavailable, one or more secondary log files will be allocated and used until all records in the primary log file needed have been processed. Logging will then continue with that primary log file, and records stored in any secondary log files allocated will be
processed. When the DB2 Database Manager determines that a secondary log file is no longer needed (because all of its records have been processed), that log file is removed and its associated memory is freed. Circular logging is the default logging behavior used when a new database is created. With circular logging, only full, offline backup images of the database can be taken, and roll-forward recovery cannot be performed. When a database is placed in an inconsistent state, it can be returned to a consistent state by utilizing the records stored in the active log to resolve any in-doubt or in-flight transactions. However, if a database becomes damaged or destroyed, it can only be salvaged by using the RESTORE DATABASE command in conjunction with a full, offline backup image that was taken earlier. Unfortunately, any changes made to the database after the backup image was made will be lost.
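The reuse cycle of circular logging can be modeled with a small Python sketch. This is a toy model, not DB2's actual log manager; the class and method names are invented, and secondary log files are omitted for brevity.

```python
# Toy model of circular logging: a fixed set of primary log files used
# in round-robin order; a file is reused only after the transactions it
# holds have been committed or rolled back. Illustrative only.

class CircularLogSketch:
    def __init__(self, num_primary):
        self.num_primary = num_primary
        self.active = 0                       # index of the active log file
        self.busy = [False] * num_primary     # True while records are pending

    def fill_active_log(self):
        """The active log is full: advance to the next file in the cycle."""
        self.busy[self.active] = True
        self.active = (self.active + 1) % self.num_primary

    def release(self, index):
        """All records in a log file processed: free it for reuse."""
        self.busy[index] = False

logs = CircularLogSketch(num_primary=3)
logs.fill_active_log()         # log 0 full -> log 1 becomes active
logs.fill_active_log()         # log 1 full -> log 2 becomes active
logs.release(0)                # log 0's transactions committed
logs.fill_active_log()         # log 2 full -> cycle wraps back to log 0
print(logs.active, logs.busy)  # 0 [False, True, True]
```

If log 0 had still been busy when the cycle wrapped, a real log manager would allocate a secondary log file (or suspend processing) instead of overwriting pending records.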
Archival logging
Archived logs contain the same log data as circular logs. However, archive log files are not reused by the DB2 Database Manager; instead, they are retained specifically for roll-forward recovery. Here's how archive logging works: once the active log file becomes full, a new active log file is created in the database's log directory, and the current active log file becomes an archive log file. An archive log file can be classified as either online or offline. An online archive log file is immediately available to DB2 for roll-forward recovery, and can be found in the database's log directory. An offline archive log file is not immediately available because it has been moved to a location other than the database's log directory.
When an online backup operation is performed, all transactions continue to be logged, and can be recreated in a future roll-forward recovery operation. When a database is restored from an online backup image, it must be rolled forward at least to the point in time at which the backup operation was completed. For this to happen, the active log file and any archived log files (online or offline) needed must be available when the roll-forward recovery process is initiated. That is because the DB2 Database Manager must be able to access the log files needed, in the proper sequence (whether they are active, online, or offline), to perform the roll-forward recovery operation. Obviously, the more archive log files you have online, the faster the recovery process will be. Because every change to a row includes the before- and after-image of the row, online archive log files can potentially become quite large. Thus, a NAS device is an excellent location for these objects, as well as for the database itself.
[Figure 2-3 IBM NAS devices use File I/O — the application server directs a file I/O request over the LAN (IP protocol) to the remote file system in the NAS appliance; the file system in the NAS appliance then initiates block I/O to the NAS disk]
Data-copy sharing
Data-copy sharing allows different platforms to access the same data by sending a copy of the data from one platform to the other. There are two approaches to data-copy sharing between platforms: flat file transfer and piping.
Clustering
Clustering is usually thought of as a server process providing failover to a redundant server, or as scalable processing using multiple servers in parallel. In a cluster environment, SAN provides the data pipe, allowing storage to be shared.
Chapter 3.
Network Appliance filers are easy-to-manage appliances that are designed specifically for today's scalable, network-centric IT system architectures. Network Appliance filers provide up to 12 terabytes of disk storage and can be attached directly to a network, rather than to a specific network server. In this chapter we provide an overview of Network Appliance filer technology, and discuss some of the functionality a Network Appliance filer provides.
A network interface driver within Data ONTAP is responsible for receiving all incoming NFS, CIFS, HTTP, and FTP requests. As each request is received, it is logged in non-volatile RAM (NVRAM), an acknowledgement is immediately sent back to the requestor, and the processing that is needed to satisfy the request is initiated. Once initiated, such processing runs uninterrupted (and continuously), so far as possible. This approach differs from that of traditional file servers, which employ separate processes for handling the network protocol stack, the remote file system semantics, the local file system, and the disk subsystem.

Network Appliance filers use RAID 4 parity protection for all data stored in the disk subsystem. In the event that any disk drive in the RAID subsystem fails, a hot spare disk drive is allocated to that RAID group, and data on the failed drive is reconstructed on the hot spare, using information stored on the parity disk in the RAID group. While reconstruction occurs, requests for data from the failed disk are served by reconstructing the data on the fly, with no interruption in file service.

The WAFL file system is a UNIX-compatible file system that has been optimized specifically for network file access. Network Appliance's WAFL and RAID technologies were designed together to eliminate many of the performance problems that most file systems experience with RAID, and as a result, RAID management is integrated directly into the WAFL file system. By integrating the file system and RAID management, problems that result when RAID management sits on top of the file system (which is how RAID management is usually implemented) are eliminated.
In WAFL's RAID 4 environment, data is written to the data disks using blocks that are 4 KB in size; a group of blocks (known as a stripe) spans each data disk in a RAID group, and the corresponding parity value for each stripe is written to the parity disk. If one block on a disk goes bad, the parity disk within that disk's RAID group is used to recreate the data in that block, and a new block containing the recreated data is created on the disk. If an entire disk fails, the parity disk prevents any data from being lost; when the failed disk is replaced, the parity disk is used to recalculate its entire contents.
3.2.4 Snapshots
One of the benefits the WAFL file system provides is the ability to make read-only copies of the way its entire file system looks at any given point in time, and to make those copies available to system administrators via special subdirectories that appear in the current (active) file system. Each read-only copy of the file system is called a Snapshot, and a Network Appliance filer can maintain up to 31 Snapshots concurrently.
On any type of file system, each user-visible file and directory is comprised of a set of blocks that reside on the physical disk media, and in this respect, WAFL is no different. Since Snapshots operate at the block level of the WAFL file system, when a Snapshot is first taken, every file and directory that resides within the new Snapshot uses the same set of disk blocks that make up the file or directory in the active file system. Because of this, a Snapshot can be created in just a few seconds (since the blocks themselves are not duplicated), and each new Snapshot requires only a minimal amount of additional disk storage space.

As files are changed or deleted, new blocks reflecting the changes are created, and the original blocks are marked for reuse unless they are part of a Snapshot (in which case they are retained until the Snapshot that uses them is deleted). Most file systems would implement snapshots by copying the original data to a new block before modifying the original block. WAFL, however, retains the original blocks because the files or directories that reference them still reside in a Snapshot. Hence, Snapshots only start to consume disk space as the file system changes.

Snapshots add a fourth dimension, time, to the file system's contents as a whole. Data can be viewed in its current state, or it can be viewed as it existed at selected instances in the past. And because all blocks that are referenced by files and directories in a Snapshot remain as they were at the point in time the Snapshot was taken, Snapshots provide an efficient way to back up an active system without having to take that system offline.
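The block-sharing behavior behind Snapshots can be sketched with a toy Python model, with dictionaries standing in for block pointers. This is an illustration of the concept, not WAFL code; the file and block names are invented.

```python
# Toy model of Snapshot block sharing: taking a Snapshot copies only
# the block pointers, not the blocks; a changed file gets a new block,
# while the Snapshot keeps referencing the old one. Illustrative only.

active = {"fileA": "block1", "fileB": "block2"}   # active file system

# Taking a Snapshot duplicates the pointers -- a near-instant operation
# that initially consumes almost no extra disk space.
snapshot = dict(active)

# Modifying fileA writes a new block; the old block stays allocated
# because the Snapshot still references it.
active["fileA"] = "block3"

print(snapshot["fileA"], active["fileA"])    # block1 block3
print(snapshot["fileB"] == active["fileB"])  # True (block2 still shared)
```

Only the modified file consumes new space; the unmodified file's block remains shared between the Snapshot and the active file system, which is why Snapshots grow only as the file system changes.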
Chapter 4.
created. Instead, if the size of an existing RAID 5 array needs to be increased, the number of disks to be added must match the current size of the array. Thus, if a RAID 5 implementation uses 7 disks in each array, then disks must be added to the array 7 at a time.

The Network Appliance filer uses RAID 4 technology to protect against disk failure. However, unlike generic RAID 4 and RAID 5 implementations, which are architected without thought to file system structure and activity, Network Appliance's RAID 4 implementation is heavily optimized to work in tandem with the Data ONTAP file system. By optimizing the file system and the RAID layer together, the Network Appliance RAID design provides all the benefits of RAID parity protection, without incurring the performance disadvantages that are often associated with general-purpose RAID 4 solutions. And because Network Appliance's RAID 4 design does not interleave parity information like a generic implementation of RAID 5, the overall system can be expanded quickly and easily, even though RAID protection is present. The disk layout for Network Appliance's RAID 4 technology is illustrated in Figure 4-1.
Figure 4-1 Network Appliance RAID 4 disk layout (one parity disk plus data disks; each 4 KB stripe contains one block per disk)
As you can see in Figure 4-1 on page 53, a single Network Appliance RAID 4 array consists of one disk that is used for parity and up to twenty-seven disks that are used for storing data. Each disk in the RAID 4 array (or RAID group) is made up of 4 KB blocks; therefore, each stripe consists of one block from each data disk and one block from the parity disk. The parity block in each stripe allows data to be recalculated if any one block (on a data disk) in the stripe is lost. (Figure 4-1 on page 53 also shows how a filer's RAID group is divided into stripes, each of which consists of one 4 KB block on the parity disk and one 4 KB block on each of the data disks in the disk array.)

In order to understand how the parity disk works, it helps to think of each 4 KB disk block as if it were a very large integer (32,768 bits long), and that RAID 4 is responsible for performing simple math operations using these integers. With this in mind, the parity block can be thought of as being the big integer that is the sum of all data blocks in the stripe. For example:

Parity: 12
Data 1: 3
Data 2: 7
Data 3: 2
Thus, if one of the data disks in the RAID group fails, for instance Data 2, then the data stored on that disk can be reconstructed, again by performing simple arithmetic:

Data 2 = Parity - Data 1 - Data 3 = 12 - 3 - 2 = 7

In reality, the RAID system uses EXCLUSIVE-OR instead of addition and subtraction, and the numbers are much larger. But the math works out the same; using addition and subtraction on small numbers simply makes the technique easier to understand. Lost data is recalculated on the fly so that the system continues to run, even if one of the data disks in the RAID group has failed. (The entire contents of a failed disk are recalculated and written to the new disk when the failed disk is replaced.) Of course, if two blocks in a single stripe fail, there is no longer sufficient information available to recalculate the lost data. On the other hand, if the parity disk itself fails, it must be replaced (after which parity values will be recalculated), but no data stored on the data disks in the RAID group will be lost.
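The EXCLUSIVE-OR scheme just described can be sketched in a few lines of code. This is an illustrative model only (tiny integers stand in for 4 KB blocks, and the function names are our own), not Data ONTAP's implementation:

```python
from functools import reduce

def compute_parity(data_blocks):
    # Parity is the bitwise XOR of every data block in the stripe.
    return reduce(lambda a, b: a ^ b, data_blocks)

def reconstruct(parity, surviving_blocks):
    # A lost block is the XOR of the parity block and the surviving blocks.
    return reduce(lambda a, b: a ^ b, surviving_blocks, parity)

# A stripe with three data blocks (tiny integers stand in for 4 KB blocks).
stripe = [3, 7, 2]
parity = compute_parity(stripe)

# Simulate the loss of "Data 2" (value 7) and rebuild it on the fly.
recovered = reconstruct(parity, [stripe[0], stripe[2]])
assert recovered == 7
```

XOR has the convenient property that it is its own inverse, which is why the same operation serves both to compute parity and to rebuild a lost block.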
Figure 4-2 Block allocation on a RAID 4 array: FFS (left) scatters writes across many stripes, while WAFL (right) concentrates them in a few adjacent stripes
Since the FFS file system is not aware of the underlying RAID 4 layout, when read operations are performed, it tends to generate requests for data that is scattered throughout the data disks, which causes the parity disk to seek excessively.
The WAFL file system, on the other hand, writes blocks in a pattern that is designed to minimize seek operations on the parity disk. The illustration on the right of Figure 4-2 on page 55 shows how WAFL allocates the same blocks to make RAID 4 operate efficiently. WAFL always writes blocks to stripes that are near each other, eliminating long seeks on the parity disk, and it writes multiple blocks to the same stripe whenever possible, further reducing traffic on the parity disk. Notice that FFS uses six separate stripes in Figure 4-2 on page 55, so six parity blocks must be updated. In contrast, WAFL uses only three stripes, so only three parity blocks are updated, and they are all located near each other. As a WAFL file system becomes full, it uses more stripes to write a given number of blocks, which increases the number of parity blocks that need to be updated. Even in a very full file system, however, a small range of cylinders contains many free blocks, so the more important benefit of reducing seeks on the parity disk remains. Like FFS, WAFL reserves 10% of disk space to improve overall performance.
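The parity-traffic difference can be illustrated with a tiny counting model. The stripe assignments below are hypothetical (they merely mirror the six-stripe versus three-stripe contrast in Figure 4-2); the point is that every distinct stripe touched by a write episode costs one parity-block update:

```python
def parity_updates(stripe_of_each_block):
    # Each distinct stripe touched during a write episode requires one
    # parity-block update on the dedicated parity disk.
    return len(set(stripe_of_each_block))

# Eight blocks to write; stripe assignments are hypothetical, mirroring
# the contrast shown in Figure 4-2.
ffs_layout = [0, 1, 2, 3, 4, 5, 0, 1]    # scattered over six stripes
wafl_layout = [0, 0, 0, 1, 1, 1, 2, 2]   # packed into three nearby stripes

assert parity_updates(ffs_layout) == 6
assert parity_updates(wafl_layout) == 3
```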
RAID reconstruction
Whenever a disk drive in the disk subsystem of a Network Appliance filer fails, the filer is placed in what is known as degraded mode. Requests for data from the failed disk are served by reconstructing the data on the fly, with no interruption in file service. A new disk drive can be substituted for the failed one at any time, and the image of the data stored on the failed disk will automatically be rebuilt on the replacement disk, again without any interruption in file service.
Furthermore, one or more hot spare disk drives may be configured on a filer. A hot spare will immediately be substituted for a failed disk, without human intervention, as soon as a filer enters degraded mode. Additional hot spares allow a subsequent failed disk to be replaced if the first failed disk drive has not yet been physically replaced. Concurrent drive failures can also be accommodated, as long as no two failed drives are in the same RAID group.

Reconstruction of data onto a spare can be prioritized relative to the servicing of incoming client file service requests. This is done by means of the raid.reconstruct_speed filer option. If this option is set to a value of 1, reconstruction proceeds at low priority (and takes a long time to complete), but incoming file service requests are processed expeditiously. If, however, this option is set to its maximum value of 10, almost all of the filer's resources are devoted to reconstruction (and it completes quickly), but clients experience sluggish filer performance. The default value for the raid.reconstruct_speed option is 4.

It is important to note that the raid.reconstruct_speed option controls the total amount of CPU resources that the filer devotes to reconstruction. For a given value, starting an additional reconstruction in another RAID group will not cause more resources to be spent on reconstruction. Instead, each reconstruction will take longer, but client performance will not be significantly reduced except where impacted by competition for access to disks.
RAID scrubbing
Far more likely than a second disk drive failing in a RAID group before reconstruction of a previous disk failure has completed is the possibility that there is an unknown bad block (media error) on an otherwise intact disk. If there are no failed disks within a RAID group, the filer compensates for a bad block when the data in that block is accessed: it uses parity information to recompute the bad block's original contents, which are then remapped to a spare block elsewhere on the disk. However, if a bad block is encountered while the filer is in degraded mode (after a disk failure, but before reconstruction has completed), then that block's data is irrecoverably lost.

To protect against this scenario, filers routinely verify all data stored in the file system by using a process known as RAID scrubbing. By default, this process is performed once per week, early on Sunday morning, although it can be rescheduled or suppressed altogether. During the RAID scrubbing process, all data blocks are read from RAID groups that have no failed drives. If a media error is encountered, the bad block's data value is recomputed and rewritten to a spare block. Otherwise, parity is recomputed and verified; if the computed parity value does not match the corresponding parity value stored on disk, the parity value on disk is rewritten. It is important to note that all non-degraded RAID groups are scrubbed in parallel.
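The per-stripe scrubbing logic can be sketched as follows. This is a simplified conceptual model (XOR over small integer "blocks", one stripe at a time, with names of our own choosing), not the Data ONTAP implementation:

```python
from functools import reduce

def xor(blocks):
    return reduce(lambda a, b: a ^ b, blocks, 0)

def scrub_stripe(data_blocks, parity_block, media_error_at=None):
    """Scrub one stripe of a non-degraded RAID group: repair a known
    media error from parity, or rewrite parity if it is stale."""
    data_blocks = list(data_blocks)
    if media_error_at is not None:
        # Recompute the bad block's contents from parity plus the good
        # blocks (in reality the result is remapped to a spare block).
        survivors = [b for i, b in enumerate(data_blocks) if i != media_error_at]
        data_blocks[media_error_at] = xor(survivors + [parity_block])
    elif xor(data_blocks) != parity_block:
        # Computed parity disagrees with the stored value: rewrite it.
        parity_block = xor(data_blocks)
    return data_blocks, parity_block

# A stripe whose middle block has a media error is repaired from parity:
good_parity = 3 ^ 7 ^ 2
repaired, _ = scrub_stripe([3, 0, 2], good_parity, media_error_at=1)
assert repaired[1] == 7
```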
Figure 4-3 WAFL stores meta-data in files: the root inode describes the inode file, which in turn describes all other files, including the block-map file and the inode-map file
By keeping meta-data in files, WAFL can write meta-data blocks anywhere on disk (this is where the name WAFL, which stands for Write Anywhere File Layout, comes from). This write-anywhere design allows WAFL to operate efficiently with the RAID disk subsystem by scheduling multiple writes to the same RAID stripe whenever possible, to avoid the 4-to-1 write penalty that RAID traditionally incurs when just one block in a stripe is updated. Keeping meta-data in files also makes it easy to increase the size of the file system on the fly: when a new disk is added, the file server automatically increases the sizes of the meta-data files (and the system administrator can increase the number of inodes in the file system manually if the default is too small). Finally, the write-anywhere design enables the copy-on-write technique used by Snapshots: in order for Snapshots to work, WAFL must be able to write all new data, including meta-data, to new locations on disk instead of overwriting existing data with new values. If WAFL stored meta-data at fixed locations on disk, this would not be possible.
Figure 4-4 The WAFL file system as a tree of blocks: the root inode, the block-map and inode-map files, and small and large regular files
As you can see in Figure 4-4, files are made up of individual blocks and large files have additional layers of indirection between the inode and the blocks that contain the actual file data. In order for WAFL to boot, it must be able to find the root of this tree, so the one exception to WAFL's write-anywhere rule is that the block containing the root inode must live at a fixed location on disk where WAFL can find it.
WAFL reduces head-contention when reading large files by placing sequential blocks for a file on a single disk in the RAID array (rather than across multiple disks) whenever possible.
4.3 Snapshots
Understanding that the WAFL file system is a tree of blocks rooted by the root inode is the key to understanding Snapshots. To create a virtual copy of this tree of blocks, WAFL simply duplicates the root inode that describes the inode file; the disk blocks themselves are not copied. Instead, every block in the volume's file system is recognized as belonging to the Snapshot as well as to the active file system on the volume. (This meta-data is what is physically stored in the reserved area of the disks.) WAFL avoids changing blocks that a Snapshot refers to by writing modified data to new locations on disk. Figure 4-5 illustrates how Snapshots work with an active file system.
Figure 4-5 is a simplified diagram of the file system shown in Figure 4-4 (before and after a Snapshot is taken) that leaves out internal nodes in the tree, such as inodes and indirect blocks. Section (a) of Figure 4-5 shows how the active file system (or root inode) looks before a Snapshot is taken. In this scenario, the active file system is contained on the four disk blocks A, B, C, and D. Section (b) of Figure 4-5 shows how WAFL creates a new Snapshot by making a duplicate copy of the root inode.
This duplicate inode becomes the root of a tree of blocks representing the Snapshot, just as the root inode represents the active file system. When the Snapshot inode is created, it points to exactly the same disk blocks as the root inode, so a brand-new Snapshot consumes no disk space except for the Snapshot inode itself. (The disk blocks A, B, C, and D are associated with both the active file system and the Snapshot.) Section (c) of Figure 4-5 on page 62 shows what happens when a user modifies data block D. WAFL writes the new data to block D' on disk and changes the active file system to point to the new block. The Snapshot still references the original block D, which is unmodified on disk. Because disk block D is participating in a Snapshot, it is marked by Data ONTAP as being in use and is not returned to the available disk block pool. If another Snapshot were taken at this time, it would describe disk blocks A, B, C, and D'.

WAFL would be very inefficient if it wrote the blocks associated with each NFS write request as they came in. Instead, WAFL gathers up many hundreds of NFS requests and stores them in a write buffer before actually writing to disk. During a write operation, WAFL allocates disk space for all the dirty data in the cache and schedules the required disk I/O. As a result, commonly modified blocks, such as indirect blocks and blocks in the inode file, are written only once per write episode instead of once per NFS write request.

Initially, a Snapshot takes up an insignificant amount of disk space because all blocks referenced by the Snapshot are also referenced by the active file system. Over time, as files in the active file system are modified or deleted, a Snapshot may reference more and more blocks that are no longer used in the active file system. Thus, the storage requirements for a Snapshot will increase if it is kept for any extended period of time.
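The copy-on-write behavior of blocks A through D can be captured in a toy model. This is purely conceptual (the class, its methods, and the use of a dict as a "root inode" are our own illustration, not WAFL's data structures):

```python
class WaflModel:
    """Toy model of copy-on-write Snapshots: a 'root inode' is just a
    dict mapping file names to block IDs."""

    def __init__(self):
        self.blocks = {}      # block ID -> contents (stands in for the disk)
        self.active = {}      # root of the active file system
        self.snapshots = []   # each Snapshot is a duplicated root

    def write(self, name, data):
        # New data always goes to a brand-new block; old blocks that a
        # Snapshot still references are left untouched on disk.
        block_id = len(self.blocks)
        self.blocks[block_id] = data
        self.active[name] = block_id

    def snapshot(self):
        # Creating a Snapshot duplicates only the root, never the blocks.
        self.snapshots.append(dict(self.active))

    def read(self, root, name):
        return self.blocks[root[name]]

fs = WaflModel()
for name in "ABCD":
    fs.write(name, name + " v1")       # active file system on blocks A..D
fs.snapshot()                          # section (b): duplicate the root
fs.write("D", "D v2")                  # section (c): new data lands in D'
assert fs.read(fs.active, "D") == "D v2"
assert fs.read(fs.snapshots[0], "D") == "D v1"
assert fs.active["A"] == fs.snapshots[0]["A"]   # A, B, C remain shared
```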
By default, 20% of the disk space available to a volume is reserved for Snapshot data. This amount of disk space can be increased or decreased to accommodate the requirements of the Snapshot maintenance plan implemented by the filer system administrator.

To understand just how efficient Snapshots are, it helps to compare WAFL's Snapshots with the IBM TransArc Episode file system's fileset clones. Instead of duplicating the root inode, Episode creates a clone by copying the entire inode file. This approach generates considerable disk I/O and consumes a lot of disk space. For instance, a 10 GB file system with one inode for every 4 KB of disk space would have 320 MB of inodes. In such a file system, creating a Snapshot by duplicating the inodes would generate 320 MB of disk I/O and consume 320 MB of disk space. Creating 10 such Snapshots would consume almost one-third of the file system's space, even before any data blocks were modified.
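The arithmetic behind these figures checks out, assuming (as the example implies) 128-byte inodes; a quick sketch:

```python
GB, MB, KB = 2**30, 2**20, 2**10

fs_size = 10 * GB
inode_count = fs_size // (4 * KB)   # one inode for every 4 KB of disk space
inode_size = 128                    # bytes per inode; implied by the example
inode_file = inode_count * inode_size

assert inode_count == 2_621_440
assert inode_file == 320 * MB       # copying the inode file moves 320 MB
assert 10 * inode_file / fs_size == 0.3125   # ten clones: ~one-third of space
```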
By duplicating just the root inode, WAFL can create Snapshots that require very little disk I/O. And because just the root inode is duplicated, Snapshots can be created every few seconds to guarantee quick recovery after an unclean system shutdown.
In this example, at time t1 the block-map entry is completely clear, indicating that the block is available. At time t2, WAFL allocates the block and stores file data in it. When Snapshots are created, at times t3 and t4, WAFL copies the active file system bit into the bit indicating membership in the Snapshot. The block is deleted from the active file system at time t5. This can occur either because the file containing the block is removed, or because the contents of the block are updated and the new contents are written to a new location on disk. The block cannot be reused, however, until no Snapshot references it. In Figure 4-6, this occurs at time t8, after both Snapshots that reference the block have been removed.
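The lifecycle of a block-map entry can be modeled with a small bitmask, where bit 0 means "in the active file system" and higher bits mean "in Snapshot n". This is a sketch of the concept only, not WAFL's on-disk block-map format:

```python
ACTIVE = 1 << 0                 # bit 0: block belongs to the active file system

def available(entry):
    # A block can be reused only when neither the active file system
    # nor any Snapshot references it (all bits clear).
    return entry == 0

entry = 0                       # t1: entry clear, block available
entry |= ACTIVE                 # t2: WAFL allocates the block for file data
entry |= 1 << 1                 # t3: Snapshot 1 copies the active bit
entry |= 1 << 2                 # t4: Snapshot 2 copies the active bit
entry &= ~ACTIVE                # t5: block deleted from the active file system
assert not available(entry)     # still pinned by two Snapshots
entry &= ~(1 << 1)              # Snapshot 1 is removed
entry &= ~(1 << 2)              # t8: Snapshot 2 is removed
assert available(entry)         # the block can finally be reused
```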
Figure 4-7 NetApp filer with multiple volumes composed of multiple RAID groups
Each volume in turn contains its own WAFL file system, with its own inodes, block-map file, inode-map file, and so on. Although some systems from other vendors allow system administrators to create volumes that can contain other types of objects, in a Network Appliance filer a volume always contains a WAFL file system, and thus the two terms are almost (but not quite) interchangeable. Some system administrators choose to use two smaller filers rather than a single, larger filer. In some cases, this is to limit the risk of data loss due to multiple disk failures (which was discussed earlier). In other cases, it is because it may be more efficient to manage several smaller file systems than one large file system.
The multiple volume feature of the Data ONTAP software provides this management flexibility without requiring the user to purchase multiple physical devices. In effect, it allows multiple logical filers to exist within a single appliance. (A filer can currently have up to 23 volumes, each composed of an integral number of RAID groups, which in turn are comprised of physical disks.)
Chapter 5.
Figure 5-1 Infrastructure used to test DB2 UDB and a Network Appliance filer
The filer's Web interface is accessed with a URL of the form http://filer_ID/na_admin, where filer_ID is the host IP address or name assigned to the filer. In our test environment, the filer was named terminator and assigned the IP address 9.1.39.40. Therefore, in order to access the Web interface, we used the URLs:
http://9.1.39.40/na_admin
http://terminator/na_admin
Note: In order to use the filer name in a URL, information that associates the filer's name with its IP address must reside on the database server. (Usually, this is done in the /etc/hosts file that resides on the DB2 server.) If you do not know the IP address or name that has been assigned to the filer, contact your system administrator.

Figure 5-2 shows the initial screen of the Network Appliance filer Web interface that appears once a valid filer URL is provided in a Web browser.
Figure 5-2 Initial page of the Network Appliance filer Web interface
From the initial screen of the filer Web interface, you can install filer documentation, view filer documentation, invoke the filer administration tool named FilerView, invoke the filer monitor tool named Filer At-A-Glance, or initiate a technical support call.
As you can see here, the main screen of FilerView consists of a collapsible menu and a display area. As menu items are selected, data entry forms or statistical information associated with the menu item selected are shown in the display area.
Once the Manage Volumes screen is displayed, new volumes can be created by selecting the Add New Volume link located at the top of the Manage Volumes screen (refer to Figure 5-4). When this link is selected, the Add New Volume data entry screen will be displayed in place of the Manage Volumes screen.
From the Add New Volume screen, you define the properties of each new volume that is to be created. Properties include the name to assign to the volume, the size of the RAID group to use in the volume, the language to use with the volume, the number of disks to assign to the volume, the size of the disks used, and whether or not specific disks are to be used by the volume. When adding a new volume, you have the option of letting the filer automatically select the disks to use (in which case you specify the number of data disks desired), or you can manually select each disk that is to make up the volume.

Note: A RAID group can consist of 2 to 28 disks. By using a smaller RAID group size, the potential for a double disk failure to occur within a single RAID group is reduced (because fewer disks are used). In a normal transactional database environment, a good RAID group size to use is 8 or 14. It is important to note that the RAID group size specified applies to the current RAID group as well as to future RAID groups that may be added to the volume.

Figure 5-5 on page 75 illustrates what the Add New Volume data entry screen might look like after its data fields have been populated. New volumes can also be created by using a telnet session or a remote shell (rsh command) to issue the following command at the filer:
vol create volname [-r raidsize] [-l language_code] { ndisks[@size] | -d disk1 [disk2 ...] }
For more information on the vol create command, refer to the Network Appliance Manual Pages.

In order to be able to perform a roll-forward on any DB2 UDB database, the database's log files must be accessible and kept up-to-date. We recommend that you store database files on one volume and database log files on a separate volume, so that they can be backed up independently of each other using Network Appliance Snapshots. Another alternative is to store the database logs on a separate Network Appliance filer.

For our test environment, we created two volumes with the following characteristics:

Volume 1
  Name: db2_data
  RAID Group Size: 8
  Language: English (US)
  Automatic Disk Selection: Yes
  Number of Disks: 7
  Disk Size: Any size disks
Volume 2
  Name: db2_logs
  RAID Group Size: 8
  Language: English (US)
  Automatic Disk Selection: Yes
  Number of Disks: 5
  Disk Size: Any size disks

These volumes were created by using the Add New Volume screen. Alternatively, these volumes could have been created by using a telnet session or a remote shell (rsh command) to issue the following commands at the filer:
vol create db2_data -r 8 -l en_US 7
vol create db2_logs -r 8 -l en_US 5
The easiest way to obtain and view information about existing qtrees on a Network Appliance filer is by selecting Volumes>Qtrees>Manage from the FilerView menu (Figure 5-7 on page 77). This sequence of menu selections will cause the Manage Qtrees screen to be displayed in the FilerView display area. Figure 5-7 on page 77 shows how the Manage Qtrees screen might look after three volumes (one for Data ONTAP, one for database data, and one for database log files) have been created.
New qtrees can be created for a particular volume by highlighting the desired volume shown on the Manage Qtrees screen and then clicking the Create button shown just below the volume list (Figure 5-8). When this button is clicked, the Create a new Qtree dialog is displayed, and the user is prompted to provide the name to be assigned to the qtree that is to be created. Figure 5-8 illustrates how the Create a new Qtree dialog might look after its data fields have been populated. New qtrees can also be created by using a telnet session or a remote shell (rsh command) to issue the following command at the filer:
qtree create [qtree_name]
For more information on the qtree create command, refer to the Network Appliance Manual Pages.
For our test environment, we created two qtrees for each volume; we used a combination of volume names and operating system names to produce the following qtrees:
Volume 1 (db2_data)
  db2_data_aix
  db2_data_linux

Volume 2 (db2_logs)
  db2_logs_aix
  db2_logs_linux

These qtrees were created by using the Create a new Qtree dialog. Alternatively, these qtrees could have been created by using a telnet session or a remote shell (rsh command) to issue the following set of commands at the filer:
qtree create /vol/db2_data/db2_data_aix
qtree create /vol/db2_data/db2_data_linux
qtree create /vol/db2_logs/db2_logs_aix
qtree create /vol/db2_logs/db2_logs_linux
Figure 5-9 shows how the Manage Qtrees screen looked after these qtrees were created.
Figure 5-9 Manage Qtrees screen after qtrees for test environment were created
To add a line to the /etc/exports file (which, in turn, makes a newly created volume or qtree available to an NFS client), highlight a blank line to activate the Apply and Insert Line buttons, then select the Insert Line button to display the Create a New /etc/exports Line dialog. Figure 5-11 illustrates how the Create a New /etc/exports Line dialog might look after its data fields have been populated.
Once an entry for an NFS export has been made in the /etc/exports file, appropriate data access permissions must be specified for that entry. To specify permissions for an entry in the /etc/exports file, highlight the appropriate entry in the Manage NFS Exports screen and select the Add Option button to display the Add Option dialog. Figure 5-12 illustrates how the Add Option dialog is activated. Figure 5-14 illustrates how the Add Option dialog might look after its data fields have been populated. To assign permissions from the Add Option dialog, simply select the access level from the drop-down list box provided and click the OK button. The following types of permissions are available:
Access - Allows users from the host or network group to access the filer.
RW - Allows read and write operations on the filer.
Root - Allows root privileges; root can mount directories, change permissions and ownerships, and create or delete directories and files.
A single entry in the /etc/exports file can be assigned multiple permissions; just highlight the appropriate entry and add a new option for each permission needed. Once the permissions have been assigned, click the Apply button located on the Manage NFS Exports screen to write all changes to disk, then select the Export All button to make all changes made to the /etc/exports file effective.
You can also make changes to the /etc/exports file become effective by using a telnet session or a remote shell (rsh command) to issue the following command at the filer:
exportfs [-aiuv] [-o options] [pathname]
For more information on the exportfs command, refer to the Network Appliance Manual Pages.
Figure 5-14 shows how the Manage NFS Exports screen looked just after we set the appropriate permissions for our test environment and just before we made those changes effective.
3. Modifying the appropriate file that is used to establish NFS mount points (on Linux, this file is /etc/fstab; on other UNIX-based systems it is /etc/vfstab) by adding new mount points that associate the appropriate volumes and/or qtrees on the filer with the directories just created.
4. Mounting the filer volumes and/or qtrees to the directories created.
5. Verifying that the mount operation was successful.

For example, the following is the step-by-step procedure we used to make the qtrees we created on the Network Appliance filer available to the Linux DB2 UDB server used in our test environment:

1. Log on to the server as root.
2. Create two mount point directories by executing the following commands:
   mkdir /db2_data
   mkdir /db2_logs
   Important: Make sure the file permissions for these directories are set such that the appropriate users can both read from and write to them. File permissions can be set by executing the UNIX chmod command, along with the appropriate options, for each directory created.
3. Add the following lines to the file /etc/fstab:
   terminator:/vol/db2_data/db2_data_linux /db2_data nfs
   terminator:/vol/db2_logs/db2_logs_linux /db2_logs nfs
4. Save the modified file. (Figure 5-15 shows how /etc/fstab looked when this step was completed.)
5. Mount the filer qtrees by executing the following commands:
   mount /db2_data -o hard,intr,vers=3,proto=udp,suid,rsize=32768,wsize=32768
   mount /db2_logs -o hard,intr,vers=3,proto=udp,suid,rsize=32768,wsize=32768
   Refer to Table 5-1 for a description of each of these options.
   Note: The options shown with the mount commands could have been stored in the /etc/fstab file instead of being provided with the mount commands. If they are stored in /etc/fstab, the options will be used to remount the mount points each time the DB2 server is rebooted.
6. Verify that the filer qtrees have been mounted by executing the following command:
   df -k
   (Figure 5-16 shows the output that was produced when this step was completed on the Linux server used in our test environment.)
7. Log off the server.
Figure 5-15 /etc/fstab file used in Linux test environment

Table 5-1 Mount option descriptions
Option: hard
Description: Indicates that the mount point should never time out and that the DB2 server workstation should not come online without it. When used, this option will cause the DB2 server to hang if the Network Appliance filer is not responding to NFS for any reason. If the DB2 server is booting and the filer cannot be found, the DB2 server will not complete the boot process and DB2 UDB will not start. If the DB2 server is already up and running and the filer quits responding, all I/O to and from the filer will be suspended until the filer is available again.

Option: intr
Description: Allows operator-generated keyboard interrupts to kill a process that is hung while waiting for a response from the Network Appliance filer.

Option: vers
Description: Specifies which NFS version should be used. Some versions of UNIX have been reported to have serious performance problems when running with NFS Version 3; others perform better using NFS Version 3 instead of Version 2. The system administrator should try the vers option with both NFS versions and run with the version that provides the best performance. This option is supported in recent releases of UNIX.

Option: proto
Description: Along with the vers option, this option gives the system administrator the choice of whether the UDP or TCP protocol should be used. For NFS over local area networks, UDP offers less overhead (and therefore better performance) than TCP. However, if the network connection path between the Network Appliance filer and the DB2 host is prone to losing packets, dropping frames, or introducing checksum errors, then TCP can improve performance compared to UDP. We recommend that you run using UDP on a dedicated network connection with a crossover cable between the DB2 server and the filer. If you use UDP, be sure to enable UDP checksums on the DB2 server workstation.

Option: suid
Description: Tells the DB2 server that it should honor the set-uid bit on files mounted at this mount point. If any of the DB2 executables are located on the filer, then using this option is important. If you are putting only the database files on the filer, then this option can be omitted. If you use this option, you must also export the file system with the -anon=0 option. For example, the /etc/exports file on the filer should read something like:
/vol/vol0 -anon=0,root=somepc
/vol/db2_data -anon=0
/vol/db2_logs -anon=0

Option: rsize
Description: Tells the DB2 server the size of the read block being used. The default is 32 KB.

Option: wsize
Description: Tells the DB2 server the size of the write block being used. The default is 32 KB.
Figure 5-16 Output from df after qtrees were mounted on our Linux server
Once the operating system on the DB2 UDB server has been configured to access the volumes and/or qtrees on the filer, DB2 UDB can use those volumes and/or qtrees as a repository for database data and database transaction log files, in the same way that direct-attached storage would be used for the same purpose.
Note: If you do not specify table space parameters with the CREATE DATABASE command, the DB2 Database Manager will create three system managed storage (SMS) table spaces using directory containers. These directory containers are created in the subdirectory that is created for the database. Please refer to the IBM DB2 UDB Command Reference for more information about the CREATE DATABASE command.
Important: Make sure the file permissions for these directories are set such that the appropriate users can both read from and write to them. File permissions can be set by executing the UNIX chmod command, along with the appropriate options, for each directory created.

2. Create a new database (named TEST_DB) on the filer that has three SMS table spaces (also stored on the filer) by executing the following command:

db2 create database TEST_DB on /db2_data
  USER TABLESPACE MANAGED BY SYSTEM USING (/db2_data/user)
  CATALOG TABLESPACE MANAGED BY SYSTEM USING (/db2_data/system)
  TEMPORARY TABLESPACE MANAGED BY SYSTEM USING (/db2_data/temp)
As you can see in this sample output, the database TEST_DB was successfully created in the Network Appliance filer qtree that the mount point /db2_data was associated with.
Once you have the internal ID for a particular table space, you can find out where the data for that table space is physically located by executing the LIST TABLESPACE CONTAINERS command. Thus, the LIST TABLESPACES command can be used in conjunction with the LIST TABLESPACE CONTAINERS command to verify that table spaces were created as expected if the third method is used to create a DB2 UDB database on a Network Appliance filer. Figure 5-18 shows the output that was produced by the LIST TABLESPACES command, when it was executed in our test environment after the database TEST_DB was created using the third method available. Figure 5-19 shows the output that was produced by the LIST TABLESPACE CONTAINERS command, when it was executed in our test environment to obtain specific information about the table space TEMPSPACE1.
To force DB2 UDB to expand SMS table spaces one extent at a time, rather than one page at a time, you use the db2empfa utility. The db2empfa tool is located in the bin subdirectory of the sqllib directory in which the DB2 UDB product is installed. Running it causes the multipage_alloc database configuration parameter (which is a read-only configuration parameter) to be set to YES. For our test environment, we used the db2empfa utility to tell the DB2 Database Manager to perform multi-page allocation for our test database by executing the following command from a system prompt:

db2empfa TEST_DB
The location that database log files are written to is specified by setting the value of the newlogpath database configuration parameter. Here is the step-by-step procedure we followed to make the database we created in our test environment store its log files on a separate volume of the filer:

1. Store the mount point of the db2_log_linux qtree on the filer in the newlogpath database configuration parameter by executing the following command:

db2 update db cfg for TEST_DB using newlogpath /db2_logs

2. Force all connections to the database to be terminated, so the change takes effect, by executing the following command:

db2 force application all
Chapter 6. Backup and recovery options for databases that reside on NetApp filers
In this chapter we describe the steps used to back up and restore a DB2 UDB database using the WRITE SUSPEND, WRITE RESUME, and DB2INIDB commands (introduced in DB2 UDB v7.1 FixPak 2) in conjunction with Network Appliance's Snapshot technology.
With these relationships in mind, you should then try to group related database objects together on the same logical filer volume or group of volumes. Placing database objects with dissimilar backup requirements or functions on the same logical filer volume complicates the use of Snapshots and typically makes recovery that much harder. Keeping similar data objects together, on the other hand, makes the recovery process much easier. Note that archive logs should be stored on a volume that is separate from the one the data is stored on, particularly if roll-forward recovery is to be enabled.

The basic data container in DB2 Universal Database is the table space object. A table space provides a transparent relationship between all other objects in a database and the underlying physical storage they reside on. Essentially, table spaces provide a way to assign the location of objects and data directly to one or more containers (which can be a directory, a file, or a raw device). If a single table space spans more than one container, the DB2 Database Manager attempts to balance the data load across all containers used.

Specifying quota trees (qtrees for short) on a Network Appliance filer as table space containers is the simplest way to keep logically related DB2 database objects together. Qtrees, which are essentially subdirectories on a logical filer volume, make it easy to control the placement of related database objects on a Network Appliance filer. By creating table spaces that use qtrees as containers, a single Snapshot of a volume can be used to back up multiple related database objects at one time.

One of the advantages of using a Network Appliance filer is that you can create logical volumes from many different physical disk drives. The current maximum size of a logical volume on a Network Appliance filer is 1.4 TB. The current maximum capacity of a Network Appliance F840 filer is 6 TB.
The use of multiple RAID groups increases the number of parity disks and reduces the already low probability of double disk failures. This means that a very large database can be kept in a small number of containers, all of which reside on the same logical Network Appliance filer volume.

Data stored in a database can have different access and update frequencies. Some tables, such as look-up tables, may contain static data that changes rarely, if ever. Other tables may contain volatile data that is updated or altered several times a second. The backup requirements for each are different: tables containing static data need to be backed up far less often than tables containing volatile data. Because Snapshots of volumes on a Network Appliance filer can be taken at different intervals, the database recovery process can be improved by storing tables that hold these two types of data on different logical volumes.
For example, one volume could be defined and used to hold static data, with a Snapshot taken once a week; another volume could be defined and used to hold volatile data, with a Snapshot taken every hour.

Note: The Network Appliance filer allows you to schedule Snapshots of selected volumes to be taken automatically. However, this feature cannot be used if the databases that reside on the selected volumes will be active at the time a Snapshot is taken. That's because database logging must be suspended before, and resumed after, each Snapshot is taken. On the other hand, if a database is normally taken offline at regular intervals, the automated Snapshot feature of the Network Appliance filer can be used to capture the desired Snapshots during those intervals.
A database connection must exist before this command can be submitted. It is also recommended that the WRITE RESUME command, which must follow a write suspension, be executed in the same session.
6.3.3 DB2INIDB
Many storage vendors, including Network Appliance, provide storage solutions that ensure that data is constantly available. One such offering is the ability to make a mirrored copy of a database and then make that mirrored copy available for processing by the same or a different server. To take advantage of these offerings, DB2 Universal Database provides a utility designed specifically to work with mirrored copies of a database. This utility, which is invoked by executing the db2inidb command, was also introduced in Version 7.2. The db2inidb command looks like this:

db2inidb [DatabaseAlias] as [snapshot | standby | mirror]

When executed, this command works with a mirrored copy of a database to do one of the following:

- Perform database recovery using a mirrored copy of a database.
- Put a mirrored copy of a database in the roll-forward pending state so that it can be synchronized with the primary database.
- Allow a mirrored copy of a database to be backed up, thus providing a way to back up a large database without having to take it offline.

Tip: A database connection does not have to exist before this command can be submitted.
Which of these actions is performed is determined by the option specified when the db2inidb command is executed:

snapshot: The mirrored copy of the database will be initialized as a read-only clone of the primary database. (The DB2INIDB snapshot should not be confused with the Network Appliance filer Snapshot.)

standby: The mirrored copy of the database will be placed in roll-forward pending state. New logs from the primary database can be retrieved and applied to the mirrored copy of the database, which can then be used in place of the primary database if, for some reason, the primary goes down.

mirror: The mirrored copy of the database will be placed in roll-forward pending state and is to be used as a backup image from which the primary database can be restored. If the database is in an inconsistent state, it will remain in that state and any in-flight transactions will remain outstanding.
The location used to store a database's log files is determined by the value of the logpath parameter in the database's configuration file. To change the location of a database's log path, issue the following command:

db2 UPDATE DB CFG FOR [DatabaseAlias] USING NEWLOGPATH [Location]

In this command, DatabaseAlias is the alias of the database whose configuration is to be modified, and Location is the location where the database log files are to be stored.
Version recovery
To restore a DB2 UDB database to the state it was in at the point in time that a filer Snapshot was taken (using an existing Snapshot):

1. Shut down the DB2 Database Manager instance by issuing the following command:

db2stop

2. If the DB2 Database Manager instance cannot be shut down because one or more processes are still active, issue the following commands instead:

db2 force application all
db2stop

3. Using a remote shell, restore the database from the Snapshot taken of the filer volume that contains the database by executing the following command:

rsh -l root db2filer1 vol snaprestore [DataVolName] -f -s [DataSnapshotName]

The database can also be restored by copying the database files, including the log control file (SQLOGCTL.LFH), from the appropriate Snapshot directory on the filer over the existing database and log control files. (Essentially, all files and directories contained in the database directory should be copied. If you created table spaces that use containers residing in other directories that were captured in the filer Snapshot, those files must be copied as well.)

Note: Once a volume has been restored from a particular Snapshot, any Snapshots taken after that Snapshot are returned to the available block pool, effectively eliminating them. Therefore, Snapshot recovery should be performed in descending time sequence, where the most current Snapshot is used first, if applicable.
4. Place the restored database (which is a mirrored copy of the database) in a consistent state by executing the following command:

db2inidb [DatabaseAlias] as snapshot

5. Restart the DB2 Database Manager instance by issuing the following command:

db2start

The database should now be available for use. However, any changes made to the database after the filer Snapshot was taken will no longer be reflected.
Roll-forward recovery
To restore a database to the state it was in at a specific point in time by reapplying changes stored in the associated transaction log files:

1. Shut down the DB2 Database Manager instance by issuing the following command:

db2stop

2. If the DB2 Database Manager instance cannot be shut down because one or more processes are still active, issue the following commands instead:

db2 force application all
db2stop

3. Using a remote shell, restore the database from the Snapshot taken of the filer volume that contains the database by executing the following command:

rsh -l root db2filer1 vol snaprestore [DataVolName] -f -s [DataSnapshotName]

The database can also be restored by copying the database files, including the log control file (SQLOGCTL.LFH), from the appropriate Snapshot directory on the filer over the existing database and log control files. (Essentially, all files and directories contained in the database directory should be copied. If you created table spaces that use containers residing in other directories that were captured in the Snapshot, those files must be copied as well.)

Important: Do not restore the log files from any Snapshot if you want to retain the ability to perform roll-forward recovery on the restored database.

4. Restart the DB2 Database Manager instance by issuing the following command:

db2start

5. Place the restored database (which is a mirrored copy of the database) in roll-forward pending state, and indicate that it is to be used as a backup image, by executing the following command:

db2inidb [DatabaseAlias] as mirror

6. Perform a roll-forward recovery operation on the database, using records stored in the database's log files, by executing the following command:

db2 rollforward database [DatabaseAlias] to end of logs and stop

The database should now be available for use.
Chapter 7.
Table 7-1 Snapshot monitor switches and their corresponding configuration parameters

Monitor group     Switch name    DBM configuration parameter
Locks             LOCK           DFT_MON_LOCK
Tables            TABLE          DFT_MON_TABLE
Buffer pools      BUFFERPOOL     DFT_MON_BUFPOOL
Unit of work      UOW            DFT_MON_UOW
SQL statements    STATEMENT      DFT_MON_STMT
As you can see from the information provided in Table 7-1 on page 107, each snapshot monitor switch available has a corresponding parameter in the DB2 Database Manager configuration file. By setting a snapshot monitor switch through a DB2 Database Manager configuration parameter, snapshot monitor information can be collected at the instance level, as opposed to the application level. Snapshot monitor switches are set at the instance level using the UPDATE DBM CFG command; they are set at the application level using the UPDATE MONITOR SWITCHES command.

When a snapshot monitor switch is activated from the application level (for example, by issuing the UPDATE MONITOR SWITCHES command from the Command Line Processor), an instance connection is made and all data collected for the selected switch group(s) is made available to the application/user until the instance connection is terminated. The data collected will differ from that collected by any other application/user that turns on the same snapshot monitor switch(es) at a different point in time. To make the snapshot information available and consistent for all instance connections, the default monitor switches should be turned on using the appropriate DB2 Database Manager configuration parameters.

Note: Typically, when you change the value of a DB2 Database Manager configuration parameter, you need to stop and restart the DB2 Database Manager instance before the change takes effect. However, changes made to the parameters that correspond to the snapshot monitor switches are effective immediately, so you do not need to stop and restart the instance. Instead, you need to terminate and reestablish any active connections before the changes take effect.
The easiest way to examine the current state of the DB2 Database Manager-level snapshot monitor switches is to execute the GET DBM MONITOR SWITCHES command. Figure 7-2 illustrates how output from the GET DBM MONITOR SWITCHES command looks; again, the timestamp values shown correspond to the date and time a particular snapshot monitor switch was reset or turned on.
In the output shown in Figure 7-3, we can see the disk read and write operations that have been performed at the table space level. If there are multiple tables in a table space, the command GET SNAPSHOT FOR TABLES ON [Database] can be used to determine which tables are the most active. Figure 7-4 shows an example of snapshot data that was collected at the table level and returned.
TABLES: Records an event record for each active table when the last application disconnects from the database. An active table is a table that has changed since the first connection to the database was established.

DEADLOCKS: Records an event record for each deadlock event.

TABLESPACES: Records an event record for each active table space when the last application disconnects from the database.

BUFFERPOOLS: Records an event record for each buffer pool when the last application disconnects from the database.

CONNECTIONS: Records an event record for each database connection event each time an application disconnects from the database.

STATEMENTS: Records an event record each time an SQL statement is issued by an application.

TRANSACTIONS: Records an event record each time a transaction completes (by executing a COMMIT or ROLLBACK statement).

Event monitors are created by executing the CREATE EVENT MONITOR SQL statement.

Note: SYSADM or DBADM authority is required to create an event monitor.
Using the top program, you can quickly locate a process that is consuming a large amount of system resources.
Figure 7-5 shows an example of the data collected by the vmstat program. In this example, 10 lines of data were collected using a 2-second interval (that is, vmstat 2 10).
To determine which DB2 UDB processes are running, use the command: ps -ef | grep db2 To determine which DB2 UDB agents are idle, which agents are handling a database connection, and which agents are handling an instance connection, execute the command: ps -ef | grep db2agent
Note: In order to use these utilities, you must be remotely connected to the filer.
7.3.1 sysstat
The sysstat utility is used to display aggregated filer performance statistics such as the current CPU utilization and the current amounts of network, disk, and tape I/O being performed. When invoked with no arguments, sysstat prints a new line of statistics every 15 seconds. The sysstat utility can also be invoked by issuing one of the following commands:

sysstat [interval]
sysstat [-c count] [-s] [-u | -x] [interval]

Figure 7-8 shows an example of the output that is provided by the sysstat utility.
If sysstat is started without a count (-c) specified, it can be stopped by pressing Control-C. For more information on the sysstat utility, refer to the Network Appliance Manual Pages.
7.3.2 ifstat
The ifstat utility is used to display statistics about packets that have been received and sent on a specified network interface, or on all network interfaces. The statistics returned by ifstat are cumulative totals collected since the filer was booted. The ifstat utility can be invoked as follows:

ifstat [-z] [-a | interface_name]

If specified, the -z argument causes the statistics to be cleared. The -a argument causes statistics for all network interfaces, including the virtual host and the loopback address, to be displayed. If you don't use the -a argument, the name of a specific network interface should be provided. Figure 7-9 shows an example of the output that is provided by the ifstat utility.
For more information on the ifstat utility, refer to the Network Appliance Manual Pages.
7.3.3 netstat
The netstat utility is used to symbolically display the contents of various network-related data structures. There are a number of output formats available, depending on the options specified for the information to be presented. The netstat utility can be invoked by issuing one of the following commands:

netstat [-anx]
netstat [-mnrs]
netstat [-i | -I interface] [-dn] [-f {wide | normal}]
netstat [-w interval] [-i | -I interface] [-dn]
netstat [-p protocol]

The first form of the netstat command displays a list of active sockets for each protocol used. The second form presents the contents of one of the other network data structures, according to the option selected. The third form displays cumulative statistics for all interfaces, or for the interface specified using the -I option; this form also displays the sum of the cumulative statistics for all configured network interfaces. The fourth form continuously displays information regarding packet traffic on the interface that was configured first, or on the interface specified using the -I option; this form also displays the sum of the cumulative traffic information for all configured network interfaces. The fifth form displays statistics for the specified protocol. Figure 7-10 shows an example of the output that is provided by the netstat utility.
For more information on the netstat utility, refer to the Network Appliance Manual Pages.
7.3.4 df
The df utility is used to display statistics about the amount of free disk space remaining in one or all volumes on a filer. All sizes are reported in 1024-byte blocks. The df utility can be invoked by issuing the following command:

df [-i] [pathname]

In this command, pathname identifies the path name of a specific volume. If a path name is specified, df reports only on the corresponding volume; otherwise, it reports on every volume that is currently online. Figure 7-11 shows an example of the output that is provided by the df utility.
When executed, df displays statistics about the snapshots for each volume on a separate line from the statistics about the active file system. The snapshot line displays information about the amount of disk space consumed by all the snapshots in the system. Blocks that are referenced by both the active file system and by one or more snapshots are counted only in the active file system line; their count is not reflected in the snapshot line.

If snapshots consume more space than has been reserved for them (20% by default), the excess space consumed by snapshots is reported as used by the active file system as well as by the snapshots. In this case, it may appear that more blocks have been used in total than are actually present in the file system. When invoked with the -i option specified, the df utility displays statistics about the number of free inodes available.
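The overflow accounting described above can be sketched numerically. The block counts here are invented for illustration (the real df reports per-volume usage in 1024-byte blocks); the point is that snapshot space beyond the reserve is charged to the active file system as well, so the combined "used" figures can exceed the blocks physically present:

```python
def df_report(volume_blocks, active_used, snapshot_used, snap_reserve=0.20):
    """Approximate how the filer's df accounts for snapshot overflow.

    Blocks held only by snapshots come out of the snapshot reserve
    (20% of the volume by default); any excess is also reported as
    used by the active file system.
    """
    reserve_blocks = int(volume_blocks * snap_reserve)
    overflow = max(0, snapshot_used - reserve_blocks)
    return {
        "active_used": active_used + overflow,  # excess is double-counted here
        "snapshot_used": snapshot_used,
    }

# Snapshots hold 300 blocks but only 200 are reserved on a 1000-block volume.
report = df_report(1000, active_used=800, snapshot_used=300)
print(report)  # {'active_used': 900, 'snapshot_used': 300}

# The two lines together report 1200 blocks used on a 1000-block volume.
print(report["active_used"] + report["snapshot_used"])  # 1200
```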
Part 2
In this part of the book, we first introduce the IBM NAS 200 and NAS 300 and the terminology and concepts for IBM NAS. We then describe how to configure the IBM NAS 200 and 300 and how to install DB2 on Windows. Next, we walk you through the backup and recovery options using the IBM NAS Persistent Storage Manager (PSM). Finally, we show the IBM NAS high availability features and the test results with DB2 UDB.
Chapter 8.
The basic architecture of the IBM NAS appliance can be seen in Figure 8-1.
Figure 8-1 IBM NAS Appliance System Architecture (storage; integrated OS; TCP/IP; Novell NetWare; system management, UNIX services, disaster recovery, PSM, DHCP, DNS, FTP, and LDAP services)
The main building blocks of the IBM NAS appliance architecture are:

- NAS Server Engine
- Storage Subsystem
- Pre-loaded Software
The IBM NAS connects to the LAN through Ethernet adapters. The NAS 200 includes two Ethernet controllers: one is a PCI slot card, and the other is integrated directly on the motherboard. Both adapters are configured by default to use a DHCP server. The onboard adapter is used for administration tasks, and the other is used for the public network. The NAS 300 comes with an integrated 10/100 Ethernet controller, which is used exclusively for communication between the two engine nodes. At least one Ethernet adapter must be ordered with each engine of the configuration to connect to the Ethernet LAN for access by users.
Hard disks
The NAS 200 and 300 can contain up to 48 disks in a variety of capacities. Currently, 36 GB and 73 GB disk drives are available, which allows a total capacity of 3.5 TB for the NAS 200 and 6.5 TB for the NAS 300. Both disk types are hot-swappable, designed for high performance (10,000 rpm), and feature Predictive Failure Analysis (PFA). The integrated disk adapter provides almost all of the RAID functions within a NAS appliance, such as parity calculations, disk rebuild, and sparing.
As shown in Figure 8-3, several arrays with different RAID level support can be assigned to a set of IBM NAS disks.
As Figure 8-4 shows, one or more logical disks can be defined within an array. For logical disks, the term logical drives is used. If required, you can dynamically add more disk space from the assigned array and increase the logical drive size.
Figure 8-4 IBM NAS logical drives (hard disks grouped into an array, from which logical drives are defined)
In order to make disk space accessible to applications, drive partitions have to be defined. A drive partition is defined by assigning space from one or more logical drives to it; if more than one logical drive is used, it is called a spanned logical partition. Logical partitions, or partitions for short, are identified by drive letters such as F: and E:, as shown in Figure 8-5.
Figure 8-5 Drive partitions (Partition F:, Partition E:)
For logical disk and drive partition configuration and management, you can use the ServeRAID Manager program on the NAS 200 (the ServeRAID Manager program is part of the IBM Advanced Appliance Configuration Utility (IAACU) and can be accessed either through Windows Terminal Services or Internet Explorer), or the IBM Netfinity Fibre Channel Storage Manager on the NAS 300. The NAS system comes pre-configured with a default setup for arrays, logical drives, and drive partitions. (For the NAS 200, by default, an array A with three logical drives has been set up. The logical drives are mapped to drive letters C:, D:, and E:.)
The NAS 300 incorporates the IBM TotalStorage 5191 external RAID controller. The NAS 300 has a battery-backed cache that will protect any unwritten data (that was still in cache when the failure occurred) for up to 72 hours. It supports RAID levels 0, 1, 3 and 5. The 5191 is designed for high-availability applications requiring a high degree of component redundancy. It features two hot-plug RAID controllers and two hot-plug power supplies and redundant fans.
RAID 0
RAID 0 allows multiple physical drives to be logically concatenated into a single logical disk drive. A technique called data striping is applied to the physical disk drives. This technique interleaves blocks of data across the disks. The layout is such that a sequential read of data on the logical drive results in parallel reads to each of the physical drives. RAID 0 requires a minimum of two drives. RAID 0 provides no redundancy protection such as parity protection or data mirroring. If a single disk fails, all data is lost, and all disks must be reformatted.
RAID 1
RAID 1 uses the concept of data mirroring, which duplicates the data from a single logical drive across two physical drives. Data written to the logical drive is written to both physical disk drives. This creates a pair of drives that contain the same data. If one of these physical drives fails, the data is still available from the remaining disk drive.
RAID 3
RAID 3 stripes data across all the data drives, writing a single block across all drives. This type of striping is referred to as byte-level striping. Parity data is then stored on a dedicated drive. Parity data can be used to reconstruct the data if a single disk drive fails. RAID 3 requires a minimum of three drives (two data disks and one parity disk).
RAID 4
RAID 4 is very similar to RAID 3, except that it uses block-level striping instead of byte-level striping. With block-level striping, a complete block is written to a single disk. The use of larger stripes improves the write performance over RAID 3. It still maintains the use of a dedicated parity drive and requires a minimum of three drives, as does RAID 3.
RAID 5
RAID 5 uses block-level striping and distributed parity. This eliminates the bottleneck of writing to the dedicated parity drive and does not require the duplicate disk drives of RAID 1. Both the data and parity information are spread across the disks one block at a time. RAID 5 requires a minimum of three drives.
As with RAID 4, the one performance penalty is in the read-modify-write cycle for writes smaller than a full stripe. A RAID array operating with a failed drive is said to be in degraded mode. RAID 5 arrays synthesize the requested data for the failed drive by reading the parity information for the corresponding data stripes from the remaining drives in the array. A failed drive in a RAID 1 or RAID 5 array can be replaced by physically swapping in a new drive or by a designated hot spare.
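RAID 3, 4, and 5 all compute parity as the bitwise XOR of the data blocks in a stripe, which is what makes the single-drive reconstruction described above possible: the missing block is simply the XOR of all surviving blocks (data plus parity). A toy sketch of that property, not a controller implementation:

```python
from functools import reduce

def parity(blocks):
    """Byte-wise XOR across the blocks of one stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe spread across three data drives, plus its parity block.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
p = parity(stripe)

# Simulate losing drive 1: rebuild its block from parity and the survivors.
survivors = [stripe[0], stripe[2], p]
rebuilt = parity(survivors)
print(rebuilt)  # b'BBBB'
```

The same XOR also explains the read-modify-write penalty: updating one data block requires reading the old data and old parity so the new parity can be computed.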
RAID 5E
RAID 5E (Enhanced) puts hot spares to work to improve reliability and performance. A hot spare is normally inactive during array operation and is not used until a drive fails. By utilizing deallocated space on the drives in the array, a virtual hot spare is created. By putting the hot spare to work, performance improves because more heads are writing the data. In the event of a drive failure, the RAID controller will start rearranging the data from the failed disk into the spare space on the other drives in the array.
Point-in-time images
IBM Network Attached Storage products provide point-in-time images of the file volumes through the Persistent Storage Manager (PSM) function. This function uses a storage cache that is privately managed by the PSM code. The point-in-time image function of PSM is similar to functions in other products, such as:

- The FlashCopy function on the IBM Enterprise Storage Server (ESS)
- The SnapShot function on the Network Appliance products
- The SnapShot function on StorageTek or IBM RAMAC products

In IBM Network Attached Storage product documents, all of the following terms refer to this functionality: Persistent Image, True Image, Point-in-Time Image, or Instant Virtual Copy.

Attention: Throughout this redbook, we will always use the term True Image copy when we are referring to a PSM point-in-time image.
Archival backup
IBM NAS products offer support for archival backup of the NAS operating system and archival backup of NAS user data. The archival backup of the NAS operating system is supported by the pre-loaded NTBackup software or by separately purchased backup programs such as Tivoli Storage Manager (TSM). For archival backup of user data, the same backup programs can be used, either the pre-loaded NTBackup or a separately purchased program such as Tivoli Storage Manager.

With the pre-loaded NTBackup, full, incremental, or differential backups of NAS user data can be taken. When a full backup is taken, all selected files are backed up without exception. A differential backup image contains all files changed since the previous full backup; thus, no matter how many differential backups are made, only one differential backup plus the original full backup are needed for any restore operation. With an incremental backup, the backup image includes all files that have changed since the previous incremental backup.
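The difference between the three backup modes comes down to which change marker each file is compared against. A minimal sketch (file names and timestamps are invented for illustration):

```python
def files_to_back_up(files, mode, last_full, last_backup):
    """Select files for one backup run.

    files:        {name: modification time}
    full:         everything selected, unconditionally
    differential: files changed since the last FULL backup
    incremental:  files changed since the last backup of ANY kind
    """
    if mode == "full":
        return set(files)
    cutoff = last_full if mode == "differential" else last_backup
    return {name for name, mtime in files.items() if mtime > cutoff}

# Last full backup at t=50; most recent (incremental) backup at t=90.
files = {"static.tbl": 10, "hot.tbl": 95, "warm.tbl": 60}
print(files_to_back_up(files, "full", last_full=50, last_backup=90))
print(files_to_back_up(files, "differential", last_full=50, last_backup=90))  # hot + warm
print(files_to_back_up(files, "incremental", last_full=50, last_backup=90))   # hot only
```

This is why a restore needs only the last full plus the last differential image, whereas with incrementals every image since the last full must be applied in order.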
TSM backup
The IBM NAS products come pre-installed with the Tivoli Storage Manager (TSM) client. The TSM client enables the backup of data in the NAS appliance. Because this is only a client, a separate TSM server is required to perform the actual backup. Based on the TSM server's configuration, the final destination of the NAS appliance's backup can be located either in the TSM server's disk storage or in an attached tape subsystem.
Figure 8-6 Copy-on-write operation (NAS file system, PSM software, PSM cache, and disk: 1. Write request to update disk; 3. Write completes to disk)
4. As additional write-sector requests are made, PSM again saves a private copy of the original data in the PSM-specific cache. This process is called a copy-on-write operation and continues until that virtual copy is deleted from the system. Note that, over time, the PSM-specific cache will grow larger; however, only the original sector contents are saved, not each individual change.

5. When an application wants to read the virtual copy instead of the actively changing (normal) data, PSM substitutes the original sectors for the changed sectors. Of course, read-sector requests against the normal (actively changing) data pass through unmodified; see Figure 8-7.
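The copy-on-write and read-substitution behavior just described can be modeled in a few lines of Python. This is a conceptual sketch, not the actual PSM implementation:

```python
class PsmVolume:
    """Toy sector store with one persistent image and a copy-on-write cache."""

    def __init__(self, sectors):
        self.sectors = dict(sectors)  # live (actively changing) data
        self.cache = {}               # original contents, saved on first write

    def write(self, n, data):
        # Copy-on-write: save the ORIGINAL sector once, then update in place.
        if n not in self.cache:
            self.cache[n] = self.sectors[n]
        self.sectors[n] = data

    def read_live(self, n):
        return self.sectors[n]

    def read_image(self, n):
        # Changed sectors come from the cache; unchanged ones from disk.
        return self.cache.get(n, self.sectors[n])

vol = PsmVolume({0: "time", 1: "for a"})
vol.write(0, "date")
vol.write(0, "gate")        # only the FIRST write caches the original
print(vol.read_live(0))     # gate
print(vol.read_image(0))    # time  (persistent image still sees the original)
print(vol.read_image(1))    # for a (unchanged: read from the regular location)
```

Note that the cache holds a single entry for sector 0 no matter how many times it is rewritten, which is why the PSM cache grows with the number of changed sectors, not the number of writes.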
Figure 8-7 Reading data from a Persistent Image (1. Read from the persistent image copy. 2a. Sectors that have not changed are read from the regular location. 2b. For sectors that have changed, the previously saved original sector data is retrieved from the PSM cache. 3. For changed sectors, PSM substitutes the original from its cache when it sends the data to the NAS file system.)
By design, processes (such as backup or restore) that access data through a persistent image have a lower priority than the normal read and write operations. Therefore, should a tape backup program run while the NAS is experiencing heavy client utilization, the tape backup's access to the PSM image is limited while normal production performance is favored, which helps minimize the impact on normal users. While creating a PSM image happens very quickly, it might take a few minutes before that image is available and visible to users. In particular, the very first image will generally take much longer to become available than subsequent images. Because PSM runs at a lower priority than regular traffic, this delay can be longer than normal if the system is heavily utilized.
In these examples, we assume that the disk originally contained only the phrase "Now is the time for all good men to come to the aid of their country." In the examples below, the expression (FS) represents the sector(s) containing the file system meta-data; this, of course, is updated on every write operation. Empty (free space) sectors are indicated as #0001, #0002, and so on. The disk/cache examples A through D below are not cumulative; that is, in each case we are comparing against example A.

A. Immediately after a persistent image (instant virtual copy) is made

Table 8-1 shows how the disk would appear immediately after the instant virtual copy is made. Note that nothing has really changed. (Although pointers and control blocks have changed, for simplicity those details are not shown here.)
Table 8-1 Layout of disk after instant virtual copy is made
Now i   e aid   #0019   #0028
s the   of t    #0020   #0029
time    heir    #0021   #0030
for a   count   #0022   #0031
ll go   ry.     #0023   #0032
od me   #0015   #0024   #0033
n to    #0016   #0025   #0034
come    #0017   #0026   #0035
to th   #0018   #0027   (FS)
Table 8-2 shows the layout of the PSM cache after instant virtual copy is made. Notice that it contains empty cells.
Table 8-2 Layout of PSM cache after instant virtual copy is made
B. Immediately after a file is deleted Table 8-3 shows the layout of how the disk would appear immediately after the original file was erased. Note that a copy of the original file system (meta-data, etc.) is all that is saved.
Table 8-4 shows the layout of the PSM cache immediately after file is deleted. Notice that the PSM cache contains a copy of the original file system data.
Table 8-4 Layout of PSM cache immediately after file is deleted:
(FS)
C. Immediately after an update in place changing "time" to "date" Table 8-5 shows how the disk would appear if the word "time" was changed to "date". For this example to be truly correct, we would further assume that the application program wrote back only the changed sectors. As explained below, this is not typical. The table illustrates how the sectors might appear.
Table 8-5 Layout of disk after changing time to date
Now i   e aid   #0019   #0028
s the   of t    #0020   #0029
date    heir    #0021   #0030
for a   count   #0022   #0031
ll go   ry.     #0023   #0032
od me   #0015   #0024   #0033
n to    #0016   #0025   #0034
come    #0017   #0026   #0035
to th   #0018   #0027   (FS)
Table 8-6 shows how the PSM cache would contain the original sector contents for the word "time" together with the file system's meta-data:
Table 8-6 Layout of PSM cache after changing time to date
time (FS)
D. Immediately after an update in place changing "men" to "women" Table 8-7 shows how the disk would appear if the change requires more space. Since more space is required, the data following the word "women" must also change. The original contents of all changed sectors would have to be saved in the PSM cache. Note that this example is not cumulative with examples B or C above.
Table 8-7 Layout of disk after changing men to women.
Now i   the a   #0019   #0028
s the   id of   #0020   #0029
time    their   #0021   #0030
for a   r cou   #0022   #0031
ll go   ntry.   #0023   #0032
od wo   #0015   #0024   #0033
men t   #0016   #0025   #0034
o com   #0017   #0026   #0035
e to    #0018   #0027   (FS)
Table 8-8 shows that the PSM cache would contain all the changed sectors, starting with the sector containing "men", plus the data that slid to the right, together with the original file system's meta-data.
Table 8-8 Layout of PSM cache after changing men to women:
od me   (FS)
n to
come
to th
e aid
of t
heir
count
ry.
E. Appearance for most file updates In the above examples, we assumed that the change was an update in place, where the changes were written back to the very same sectors containing the original data. Most databases do an update in place. However, most desktop applications, such as Freelance, WordPro, Notepad, and so on, perform a "write and erase original" update: when these desktop applications write a change to the file system, they actually write a new copy to the disk, and after that write completes, they erase the original copy. Individual sectors on a disk always have some ones and zeros stored in every byte. Sectors are either allocated (in use) or free space (not in use or empty; the specific bit pattern there is considered garbage). The disk file system keeps track of which data is in which sector, and also which sectors are free space.
For the NAS code that shipped on 9 March 2001, PSM is unaware of free space in the file system. Therefore, if something is written to the disk, even to sectors that are deallocated (free) storage, the underlying sectors are copied to the PSM cache. The following example illustrates this: Table 8-9 shows how the disk would appear following a save operation after changing the word "time" to "date", assuming no free space detection and no update in place. Note again that this example is not cumulative with examples A, B, C, and D above.
Table 8-9 Layout of disk after changes without free space detection
#0001   #0010   ll go   ry.
#0002   #0011   od me   #0029
#0003   #0012   n to    #0030
#0004   #0013   come    #0031
#0005   #0014   to th   #0032
#0006   Now i   e aid   #0033
#0007   s the   of t    #0034
#0008   date    heir    #0035
#0009   for a   count   (FS)
After this save is complete, the new, saved information has been written into free-space sectors #0015-#0028, and the original location sectors then turn into free space, as indicated by #0001-#0014 above. Since the PSM cache works at the sector level, and since this version of the PSM code is unaware of free space, PSM would copy the previous free-space sectors to its cache, as shown in Table 8-10 below:
Table 8-10 Layout of PSM cache after changes without free space detection
#0015   #0024
#0016   #0025
#0017   #0026
#0018   #0027
#0019   #0028
#0020   (FS)
#0021
#0022
#0023
F. Appearance for most file updates, with free space detection For the NAS code that shipped on 28 April 2001, PSM is enhanced to detect free space in the file system. Therefore, if data is written to the disk's free-space sectors, those free-space sectors are not copied to the PSM cache. Table 8-11 shows the layout of the disk after a save operation changing the word "time" to "date", with free space detection but no update in place. Again, this example is NOT cumulative with examples B, C, D, and E above.
Table 8-11 Layout of disk after changes with free space detection
#0001   #0010   ll go   ry.
#0002   #0011   od me   #0029
#0003   #0012   n to    #0030
#0004   #0013   come    #0031
#0005   #0014   to th   #0032
#0006   Now i   e aid   #0033
#0007   s the   of t    #0034
#0008   date    heir    #0035
#0009   for a   count   (FS)
Table 8-12 shows the layout of the PSM cache after saving the changes from "time" to "date". Because PSM is aware that the new phrase is being stored in free space, it does not copy the original free-space contents into the cache; it only updates the file system information containing the pointers to the data.
Table 8-12 Layout of PSM cache after changes with free space detection
(FS)
Finally, note that because the recycle bin is active on the NAS, these save operations tend to walk through disk storage, writing into free-space sectors. Therefore, with free space detection (the 28 April 2001 code), the recycle bin should be set to a higher number to minimize cache writes and cache size. For the 9 March 2001 code, the recycle bin should be set to a low number or turned off, to minimize cache size. Eventually a save operation will need to use sectors that were not free space when the original persistent image was made; at that point the original contents will be copied into the PSM cache.
The ability to create a read-write copy is particularly valuable for test environments when bringing up a new test system. Specifically, using PSM, a True Image copy can be made of a live database, and this True Image copy can be configured as read-write. A separate non-production test system can then use the True Image copy for test purposes. During debugging of the non-production system, the tester can select Undo Writes to reset the test-system database to its original True Image copy. All of this testing is kept completely separate from the ongoing active system, and a full copy is not required. By design, processes (such as the test system in this example) having data access through a True Image copy have a lower process priority than the normal read and write operations, thus minimizing the performance impact on the production database.
Chapter 9.
The IBM TotalStorage NAS Models 201 and 226 are designed to be high-throughput, two-way SMP-capable appliances with excellent scalability. They incorporate a powerful 1.133 GHz processor with 512 KB advanced transfer L2 cache. The advanced transfer cache is the result of a new backside bus that is 256 bits wide. The quad-wide cache line can transfer four 64-bit cache line segments at one time to deliver full-speed capability. Two Intel Pentium III connectors are standard on the system board to support installation of a second processor. The second 1.133 GHz processor is standard on the Model 226; it may be added as an option on the Model 201. When both processors are present, they share the workload and are load-balanced. High-speed, 133 MHz SDRAM is optimized for 133 MHz processor-to-memory subsystem performance. The IBM TotalStorage NAS Models 201 and 226 use the ServerWorks HE-SL chip set to maximize throughput from processors to memory, and to the 64-bit and 32-bit PCI buses. The NAS 200 models scale from 109 GB to over 3.49 TB total storage. Their rapid, non-disruptive deployment capabilities mean you can easily add storage on demand. Capitalizing on IBM experience with RAID technology, system design, and firmware, together with the Windows Powered operating system (a derivative of Windows 2000 Advanced Server software) and multi-file system support, the NAS 200 delivers high throughput to support rapid data delivery.
The TotalStorage NAS 300 Model 326 base machine consists of the following machines: one IBM 5186 NAS Rack Model 36U, two IBM 5187 NAS Engine Model 6RZs, two 3534 SAN Fibre Channel Managed Hub Model 1RUs, and one IBM 5191 NAS RAID Storage Controller. The NAS 300 Model 326 is preinstalled in the rack with a fully integrated suite of optimized software preloaded. It is designed to be installed quickly and easily, and the entire system is tested by IBM prior to delivery. For high-performance data handling, each NAS 300 Model 326 engine employs dual 1.133 GHz Pentium III processors and 1 GB of memory standard. High-performance Fibre Channel Hubs and cabling are used for disk connectivity. Optionally, the NAS 300 Model 326 can be enhanced by adding an additional 5191 NAS RAID Storage Controller Model 0RU and up to seven 5192 NAS Storage Unit Model 0RUs. The rack is preconfigured with Fibre Channel cabling for all such expansions. The base NAS 300 Model 326 is configured with either 109 GB or 218 GB of HDD. These minimums consist of three 36.4 GB or 73.4 GB HDDs in the first (required) RAID Controller, and can be expanded by adding one to seven additional HDDs to the first RAID Controller. The recommended practical minimum is six to ten HDDs. Additional capacity can be added in up to eight increments of 109.2 GB to 728 GB each; this level of granularity is achieved using three to ten 36.4 GB or 73.4 GB HDDs in each additional 5191 and 5192. The maximum configuration of 6.61 TB is achieved by adding seven IBM 5192 NAS Storage Unit Model 0RUs and the additional IBM 5191 NAS RAID Storage Controller Model 0RU, all using 73.4 GB HDDs.
The NAS 300 base configuration features the following:
- One Rack 36U (with state-of-the-art Power Distribution Unit)
- Two Engines, each with:
  - Dual 1.13 GHz Pentium III processors
  - 1 GB memory
  - Two redundant, hot-swap power supplies/fans
  - Support for 36.4 GB and 73.4 GB HDDs
- 364 GB starting capacity (10 x 36.4 GB hot-swappable HDDs), expandable to over 6.57 TB
- Two Fibre Channel Managed Hubs
- One RAID Controller
- Ten 36.4 GB hot-swappable HDDs
Optionally, it supports the following:
- An additional RAID Controller
- A maximum of seven Storage Expansion Units, each populated with ten 36.4 GB hot-swappable HDDs
The system comes standard with dual-node engines for clustering and failover protection. The dual Fibre Channel Hubs provide IT administrators with high-performance paths to the RAID storage controllers using fibre-to-fibre technology. The preloaded operating system and application code is tuned for the network storage server function, and designed to provide 24x7 uptime. The simple point-and-click restore feature makes backup extremely simple. With multi-level persistent image capability, recovery is quickly managed to ensure the highest availability and reliability.
Chapter 10.
[Figure: Test network layout, showing STATION 1, STATION 2, the DB2 server, the NAS 200, and the NAS 300 on the same local network.]
The network is 10/100 Ethernet; all the computers are in the same local network. We created a Windows Active Directory domain named NAS-DB2.ITSO.IBM.COM, and the PDC is running Windows 2000 Server. We have three stations: two are running Windows NT, and one is running Windows 2000. In the following sections we describe the steps to set up the test network, including creating the domain user account for DB2 and adding the computers to the domain. This is the main network environment. In addition, we also set up a trusted domain and an isolated network for the NAS 300, but the setup procedure is similar.
After entering the user information, type the password. The final message shows that the user has been successfully created.
Now change the user's properties and add this user to the domain admin group. Go to Active Directory Users and Computers, right-click the user nas_db2_user, and select Properties (Figure 10-3).
Select the Member of tab, and click Add. Select the Domain Admins and Administrators groups from the domain and click OK (see Figure 10-4).
Now the groups to which user nas_db2_user belongs are listed; click OK to confirm (see Figure 10-5).
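The user creation and group membership steps above can also be sketched from a command prompt with the net user and net group commands; the account name and password are the ones from our scenario, and the commands must be run with domain administrator rights:

```shell
REM Create the DB2 domain user account (run with domain admin rights;
REM the password shown is just our scenario's example value).
net user nas_db2_user nas_db2_user /add /domain

REM Add the account to the Domain Admins group, matching the GUI steps.
net group "Domain Admins" nas_db2_user /add /domain
```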
Because we haven't created a computer account in the domain, the wizard will ask for the account information (see Figure 10-7). Note: If you don't have domain administrator privileges, your domain administrator must create a computer account for you before this step. In that case, the system won't require you to create a computer account here.
After you enter the computer account information, click Next, and the wizard will ask for a domain user account which has the permission to add this computer into the domain. Use the nas_db2_user account (see Figure 10-8).
Now it will take a little while (about 10 seconds). If successful, a message box, "Welcome to the NAS-DB2 domain", will be shown (see Figure 10-9).
After selecting the Administrators group, this computer has been added to the domain. Restarting is required; just click OK. After restarting the computer, the login screen is changed, and the Login into Domain option is available now. Note: This step is not required if the nas_db2_user belongs to the domain admin group, because the domain admin user is automatically added to the local administrators group.
After you enter the URL, the system pops up a Windows login prompt for the user name and password. Note: Use a domain user to log in if you already have a domain user account set up, and remember to enter the domain name. If you use the local default Administrator account, you will not be able to access other network resources directly.
Disk
The NAS 200 includes a ServeRAID 4H with 4 channels. The first channel is connected to 6 disks of 36 GB each. By default, a drive array A with 3 logical drives has been set up in the NAS 200: Array Drive 1, C drive, 3123 MB Array Drive 2, D drive, 6498 MB Array Drive 3, E drive, 157535 MB
If you want to customize the array configuration, you need to run the ServeRAID Manager program.
Network
The NAS 200 includes two Ethernet controllers. One is a PCI slot card and the other is integrated directly on the motherboard. The Windows 2000 operating system shows these adapters as an IBM Netfinity Fault Tolerance PCI Adapter for the on-board adapter, and as an IBM 10/100 for the PCI-slot adapter. Both adapters are configured by default to use a DHCP server. We recommend that you change this to static IP addresses. The on-board adapter is used for a cluster, and the other one is used for the public network connection. By default, the appliance's name is composed of the IBM Machine Type Model plus the IBM Serial Number. We recommend that you change this according to your company naming standard.
2. Select NAS Management -> Storage -> ServeRAID Manager -> Server Raid Manager (see Figure 10-10).
Creating arrays
The following steps show how to create drive arrays: 1. In the storage browsing tree, right-click the ServeRAID controller that you want to configure. 2. Click Create Arrays. 3. Click the Custom configuration button. 4. Click Next and the Create Arrays window opens. 5. Right-click the drive or SCSI channel icons in the Main Tree to select the drives that you want to add to your arrays, delete from your arrays, or define as hot-spare drives; then select a choice from the pop-up list. If you want to create a spanned array, click the Span Arrays box. 6. After you select the ready drives for your arrays and define your hot-spare drives, click Next. If you are not creating spanned arrays, here you can select the RAID level.
7. To finish the procedure, click Apply. 8. Click Yes in answer to the question "Do you want to apply the new configuration?". 9. Right-click the new array to synchronize it; the synchronization time depends on the RAID level and the number of drives. 10. Now the array is ready, and the logical drive can be set up in the operating system.
1. Right-click the folder (or drive) you want to share, select the Sharing tab, and enter the Share name (see Figure 10-11).
2. Select Permissions. By default, everyone can read and change. For security reasons, the default permissions should be removed, and permissions for a DB2 user should be added.
3. Select the db2 user from the domain: nas_db2_user (see Figure 10-12).
4. Select your access type for nas_db2_user and click OK (see Figure 10-13).
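As an alternative sketch, the share itself can be created with the net share command; on Windows 2000 this command cannot grant per-user share permissions, so those still have to be set through the GUI as shown above. The share name is from our scenario, and the folder path is an assumption:

```shell
REM Create the share (F:\db2data is an illustrative folder path).
net share nasdb2sf1=F:\db2data

REM List the share to verify that it was created.
net share nasdb2sf1
```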
Disk
Each engine on the NAS 300 has one 9 GB Ultra SCSI hard disk. It is divided into two partitions: one 3 GB partition for the system, and the rest (6 GB) reserved for maintenance. The system partition contains the Windows Powered Operating System files. The Model 326 comes with a preconfigured shared storage RAID configuration on the first IBM 5191 RAID Storage Controller Model 0RU. The storage configuration application is the IBM FAStT Storage Manager 7 Client. The storage is formatted as an array, at RAID level 5, consisting of the following LUNs: a LUN of 500 MB for the Quorum drive (the drive letter will be G), which is used by Microsoft Cluster Service to manage clustered resources; and a second LUN, composed of the remaining space, used as a shared drive with one built-in hot spare.
Network
There are at least four network interface cards (NICs) per engine. One on-board card is called an IBM 10/100 Netfinity Fault Tolerant Adapter; and the other add-on PCI cards are called IBM 10/100 Ethernet Server Adapters.
After creating the array, the result in the RAID controller is shown in Figure 10-15.
4. Disk 3 is the new disk we just created; right-click it, select Create Partition, and click Next (see Figure 10-17).
5. Select primary partition, and click Next. 6. Select the disk size (see Figure 10-18).
7. Assign drive letter I. 8. Format the drive using NTFS. Dont select compression (see Figure 10-19).
Formatting takes tens of minutes. After it is done, the I: drive is ready for the application.
Checklist
Before you start, you should make sure you have all the information you will need.
Network
Both nodes must be able to connect to the domain controller. "Connect" means they are in the same broadcast network, or that DNS and the gateway are properly configured for both nodes to find the Domain Controller. You need a domain admin account. DNS: The PDC running Windows 2000 is a DNS server. You may have an official DNS server to resolve domain names outside of this domain. Gateway: A gateway for each network to which the node is connected is needed for the NAS to connect outside of the broadcast network.
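The network checklist items can be verified from each node with standard commands, using the names and addresses from our environment:

```shell
REM Confirm that the node can reach the domain controller.
ping db2w2ksvr1.nas-db2.itso.ibm.com

REM Confirm name resolution against the PDC's DNS server.
nslookup db2w2ksvr1.nas-db2.itso.ibm.com 192.168.100.10
```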
Cluster information
This is the information you will need regarding the cluster. Cluster name: The Virtual Name is the host name the clients will use to address the NAS 300. This host name can bind and fail over to any of the engines/nodes on the appliance. Cluster IP address: the address bound to the Virtual Name. Drive letter of the Quorum drive: this is where the cluster information is kept. It is recommended to use drive G (which is the default on the NAS 300). Virtual server information: the virtual server is the network name resource created in the cluster for the client to access the share volume: virtual server name, virtual server IP address, and share volume information. Note: The virtual server is not required to finish the cluster setup; you can add a new virtual server or share volume later. In our environment, we use Avocent DSView to access the NAS 300 appliance. The following configuration was used: Avocent DSView access: userID: sndiop\nas_db2_user (this is a completely different user); password: nas_db2_user
Domain environment:
  Domain name: nas-db2
  Domain userID: nas_db2_user
  Domain user password: nas_db2_user
  Gateway: 192.168.100.10 (since this is an isolated network, the gateway is not required)
  DNS: 192.168.100.10
Server 1 (PDC):
  Host name: db2w2ksvr1.nas-db2.itso.ibm.com
  IP address: 192.168.100.10
  OS: Windows 2000 Advanced Server SP2
  Software: DB2 7.2 Server Enterprise Edition, DB2 7.2 Client
Server 2:
  Host name: db2w2ksvr2.nas-db2.itso.ibm.com
  IP address: 192.168.100.20
  OS and software: same as Server 1
NAS 300 (V2):
  Node 1: host name nasdb2n1, IP address 192.168.100.30
  Node 2: host name nasdb2n2, IP address 192.168.100.31
Cluster information:
  Cluster name: db2cluster
  Cluster IP address: 192.168.100.32
Virtual Server 1 information:
  Virtual server name: nasdb2nn1
  IP address: 192.168.100.33
  Share volumes: nasdb2sf1, nasdb2sf2
Network setup
Networking needs to be set up for both nodes. Each node has an interconnect network and three public networks. Only the networks in use need to be configured. In our environment, there is only one public network.
Note: When you do these renaming steps for the joining node, ensure that the local area connection name for each physically connected network is identical on each server. 4. Right-click My Network Places, click Properties, right-click the Public icon, then click Properties, select Internet Protocol (TCP/IP), and click Properties. 5. Use the networking information in the checklist to enter the networking addresses (Figure 10-20), including: IP address, subnet mask, default gateway, and preferred DNS server.
6. If needed, configure DNS, WINS, HOSTS, or whichever method you will be using for name resolution. You can view this information by clicking the Advanced button on the Properties window.
7. Click OK on each panel to return to the Properties window. You need to check the network binding order. The Cluster function requires the Private Network to be the first network binding. To change it: a. Right-click My Network Places and then select Properties. b. Select Advanced, then Advanced Settings. c. Reorder the position of the adapters by selecting them, pressing the up or down arrow keys, and then clicking OK. The result of the network configuration is shown in Figure 10-21.
3. The Cluster information panel appears. Enter the following information (see Figure 10-22): Administrators account info Domain Name Cluster IP address Subnet mask Quorum drive
4. On the confirmation window, select Yes; it will take a few minutes to finish. The cluster administration utility will automatically start, showing the first node with its groups and resources. After it finishes, the cluster and node 1 information is shown in the Cluster Administrator window. See Figure 10-23.
5. You may be prompted for domain admin account information if you are not logged in as a domain admin user.
You will see a message that configuration will take a few minutes. After it is completed, the Cluster Administration function starts on the second node (see Figure 10-25).
2. Change private network priority: Select Network Priority to view all networks acknowledged by the cluster server, select the private network connection, and move it to the top for cluster communication priority (see Figure 10-27).
3. Set private network to Internal communication: Open the Properties for the private network and select Internal cluster communication only to ensure that no client traffic will be placed on the private network (see Figure 10-28).
3. In the available nodes panel, select a node and click the arrow button (see Figure 10-29).
Click OK, and the preferred owners will be shown on the Disk Group 1 property window. Each disk group has a preferred owner, so that, when both nodes are running, all resources contained within each disk group will have a node defined as the owner of those resources. Even though a disk group has a preferred owner, its resources can run on the other node after a cluster failover. If you restart a cluster node, those resources that are preferentially owned by the restarted node will fallback to that node once the cluster service detects that the node is operational, and provided that the defined failover policy allows this to occur. If you have not defined the node as the preferred owner for the resources, then they do not fallback to the node.
Failover setup
Failover of the resources in a disk group on a node allows users to continue accessing the resources if the node goes down. Individual resources contained in a group cannot be moved to the other node; rather, the group containing them is moved. If a disk group contains a large number of resources and any one of those resources fails, then the whole group will fail over according to the group's failover policy.
The setup of the failover policies is critical to data availability. To set up the failover function (see Figure 10-30): 1. Open the Properties panel for the disk group. 2. Select the Failover tab to set the Threshold for Disk Group Failure.
For example, if a network name fails, clustering services attempts to failover the group 10 times within 6 hours, but if the resource fails an eleventh time, the resource will remain in a failed state and Administrator action is required to correct the failure. 3. Select the Fallback tab to allow, or prevent, failback of the disk group to the preferred owner, if defined (see Figure 10-31).
When fallback of a group occurs, there is a slight delay as the resources move from one node to the other. The group can be instructed to fall back as soon as the preferred node becomes available, or to fall back only during specific off-peak usage hours.
Each resource under each disk group has individual resource properties. The properties range from restart properties, polling intervals to check if resource is operational, to a time-out to return to an online state. The default settings for these properties are selected from average conditions and moderate daily usage.
Physical disk
This is the base resource in which to store user data. It is not dependent on the other resources except for the physical disk that it defines. The disk resource must also have the same drive letters on both nodes so that the definitions of resources that depend on it will remain if the resource is moved to the other node.
Static IP address
This is a virtual address that will bind onto an existing IP address on one of the cluster's public networks. This IP address provides access for clients and is not dependent on a particular node, but rather on a subnet that both nodes can access. Because this address is not the physical adapter's permanent address, it can bind and unbind to its paired adapter on the same network on the other node in the cluster. You can create multiple IP addresses through the Cluster Administrator on the same physical network. Note: The cluster IP address is not to be used for file shares. That address is reserved for connecting to and managing the cluster through the network on which it is defined.
2. Enter an IP address name, for example, ipaddr2, and change the resource type to IP Address. Select Run this resource in a separate Resource Monitor, and click Next (Figure 10-33).
b. A list of possible owners displays, and both nodes should remain as assigned. Click Next (Figure 10-34).
c. There are no resource dependencies on this panel, so click Next in the resource dependencies panel. d. Enter your TCP/IP parameters (Figure 10-35). This will be the first virtual IP Address. The value in the Network field identifies to the system which network the address is located on. Click Finish to create the resource.
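The same IP address resource can also be sketched with the cluster.exe command-line utility that ships with Windows 2000; the subnet mask and the network name "Public" are assumptions for our environment:

```shell
REM Create the IP Address resource in Disk Group 1.
cluster res "ipaddr2" /create /group:"Disk Group 1" /type:"IP Address"

REM Set the TCP/IP parameters (mask and network name are assumptions).
cluster res "ipaddr2" /priv Address=192.168.100.33 SubnetMask=255.255.255.0 Network="Public"

REM Bring the new resource online.
cluster res "ipaddr2" /online
```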
3. Creating the network name resource: a. Right-click Disk Group 1, and select New-> Resource. b. Enter the virtual server name you want to use, for example, NN2, select Network Name as the resource type, and click Next (Figure 10-37).
c. Both nodes will be possible owners. Click Next. d. Add the IP address you created as a resource dependency in Step 1 and click Next (Figure 10-38).
e. Enter the virtual server name NASDB2NN1 into the Network Name Parameters field and click Finish (Figure 10-39).
f. It takes a few moments to register the virtual server name with your Name Server. After this has completed, bring the resource online (Figure 10-40).
4. Creating the CIFS file share resource: a. Right-click Disk Group 1, and select New -> Resource. b. Enter a file share name, for example, FS2, and select either File Share or NFS Share (Figure 10-41).
c. Both nodes are possible owners. Click Next. d. Add the resource dependencies for the Physical Disk and Network Name that the file share will use and click Next (Figure 10-42).
e. Enter the share name of FS2 and the path to the disk in this group, either a drive or a sub-directory. You can then set User Limit, Permissions, and Advanced File Share options (Figure 10-43).
You can also use the /interactive switch to see the commands that are executed by AutoExNT when the system starts and a user has logged on.
3. Right-click My Computer and select Manage. Select Services and Applications. Double-click Services. 4. Right-click AutoExNT and select Properties. Select the General tab. Under Startup type, make sure that it is set to Automatic, as shown in Figure B-2 (it should be set to Automatic by default). 5. Select the Log On tab. Under Log on as, click the button beside This account. 6. Click Browse and select the domain administrator account (for example, ITSOSJ\Administrator). Supply the password and confirm it. Click Apply, then click OK. 7. Create a file named autoexnt.bat and save it in the directory %systemroot%\system32. You may leave it empty for the moment; the actual content will depend on the applications that you are going to use and the drive mappings that you will need before starting those applications. Note: We used the administrator account in our scenario, but we recommend that you create a special service account with the special rights Act as part of the operating system and Log on as a service.
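As an illustration, an autoexnt.bat for this scenario might map the NAS share and then start the DB2 service; the share path and service name are assumptions based on our environment:

```shell
REM autoexnt.bat - executed at system startup by the AutoExNT service.
REM Map the NAS share before DB2 starts (names are scenario examples).
net use F: \\nasdb2nn1\nasdb2sf1 /persistent:no

REM Start the DB2 instance service once the drive mapping is in place.
net start DB2
```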
Chapter 11.
1. From the IBM DB2 program folder, open the Control Center. Open the instance where you are going to create the SAMPLE database. 2. Right-click on the Databases folder. Click Create and select Database using the wizard; see Figure 11-1.
3. For database name, type SAMPLE. For Default drive, choose F:. For Alias, type SAMPLE. See Figure 11-2.
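The same database can be created from the DB2 command line processor instead of the wizard; a minimal sketch:

```shell
REM Create the SAMPLE database on drive F: (the alias defaults to the
REM database name, matching the wizard entries above).
db2 create database SAMPLE on F:
```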
Chapter 12.
Backup and recovery options for DB2 UDB and IBM NAS
In this chapter we describe how IBM NAS True Images can be used for DB2 backup and recovery solutions. We start with a brief description of DB2 standard backup and recovery options, followed by a short overview of the DB2 True Image support. Finally, we describe DB2 backup and recovery scenarios using IBM NAS True Images.
In this example, a True Image which was taken from a database on the primary system (instance DB2A) is accessed by the secondary system (instance DB2B). On the secondary system, the DB2 backup utility is used to create a backup of the True Image copy. This backup image represents a valid point-in-time image of the primary database on system DB2A. An example of version recovery from a True Image is shown in Figure 12-2. In this example, a True Image copy is used to recover a database of instance DB2A to the point in time when the True Image copy for this database was taken.
The usage of True Images is not restricted to backup and recovery. True Image copies can also be used for tasks such as the following:
- Provide a transactionally consistent True Image of the database at the current point in time. Such a copy can be used to off-load user queries that don't need the most current version of the database.
- Provide a standby database that can be accessed as part of a disaster recovery strategy if the primary database is not available. All logs from the primary database are applied to the secondary database so that it represents the most current transactionally consistent version of the primary database.
Standard backup
The most common approach to creating a backup image of a database is to terminate all connections to the database, take the database offline, and then use the DB2 BACKUP DATABASE command to make a full backup image of the database. Another approach is to isolate and back up specific portions of the database, again by using the DB2 BACKUP DATABASE command. With this approach, full, incremental, and/or delta backup images can be made while the database remains online.
Standard recovery
A database that has been backed up by means of the DB2 backup utility can be recovered using either version recovery or roll-forward recovery. With version recovery, the database is returned to the state it was in at the time the last backup was taken; any changes made since that time are lost. With roll-forward recovery, a database can be returned to its state at a specific point in time: it is first returned to the state it was in when the last backup was taken, and then rolled forward, using DB2's transaction log files, to the specified point in time.
Suspend I/O
With PSM, a point-in-time True Image of a database can be taken. Because this function works at the file and logical volume level rather than at the application (database) level, special means must be in place to allow PSM to take a consistent point-in-time copy of the database. This can be achieved either by taking the database offline while the True Image copy is being created or by providing special commands that suspend database I/O for that period of time. Beginning with Version 7.1 (FixPak 2), new DB2 commands were introduced that provide the capability to use True Image and split mirroring technology while DB2 is online. Suspend I/O supports continuous system availability by providing a full implementation for taking a True Image without shutting down the database. The new DB2 commands are SET WRITE SUSPEND FOR DATABASE and SET WRITE RESUME FOR DATABASE.
WRITE SUSPEND
When the write suspend command (SET WRITE SUSPEND FOR DATABASE) is executed, all write operations to table spaces and log files for that particular database are suspended. Read-only transactions are not suspended and are able to continue execution against the suspended database, provided they do not request a resource that is being held by the suspended I/O process. In addition, while I/O is suspended, applications can continue to process insert, update, and delete operations using data that has been cached in the database's buffer pool(s). A database connection must exist before this command can be submitted.
WRITE RESUME
The write resume command (SET WRITE RESUME FOR DATABASE) lifts an active suspension and allows all write operations to table spaces and log files used by a particular DB2 UDB database to continue. You must be connected to the database before this command can be submitted.
Database reallocation
Beginning with Version 7.2 (Version 7.1 FixPak 4), the new command db2relocatedb was introduced. This command allows you to rename or relocate a database or parts of it (for example, a container or the log directory). The intended changes are specified in a configuration file which has to be provided by the user of the command.
This command is needed if you plan to use a database name for your True Image copy that is different from the name of the primary image. Furthermore, if the directory structure of your True Image and primary image is not the same, db2relocatedb can be used to make the necessary changes to the DB2 instance and the database; see 12.2.6, Accessing True Image copy overview on page 220 for a detailed description. A potential relocate database scenario is depicted in Figure 12-3, where the True Image copy will be accessed as database nasdb. Furthermore, the mapping of the volume directories to drives differs between the primary and secondary site (on the primary site, the directories for DB2 data and DB2 logs are mapped to drives F: and I:; on the secondary site, the corresponding True Image copies are mapped to G: and K:). In order to make the True Image copy accessible to the secondary site, the database name and the drive settings need to be adjusted with the db2relocatedb command.
Mounts on the primary site:
    directory for DB2 data --> F:\
    directory for DB2 logs --> I:\

Mounts on the secondary site:
    mount ...snapshot.1\db2_data\db2 --> G:\
    mount ...snapshot.1\db2_logs     --> K:\

db2relocatedb command used:
    db2relocatedb rname_db.sql

rname_db.sql:
    db_name=sample,nasdb
    db_path=f:,g:
    instance=db2
    log_dir=i:,k:
Another way to make the necessary changes to a True Image copy is to use the new RELOCATE option of the db2inidb command (available since Version 7.2 / Version 7.1 FixPak 4). For our example, the syntax would be (assuming that we initiate the True Image copy as a DB2 snapshot):

    db2inidb NASDB as snapshot RELOCATE USING rname_db.sql
12.2.1 Getting DB2 UDB prepared for IBM NAS True Image
To create a virtual copy of your database image, you have to make sure that all the required files are captured. For DB2 UDB, this includes the following objects: containers (SMS or DMS files), DB2 control files, DB2 configuration files, and DB2 log files. For the setup of your database on IBM NAS, we recommend that the database files and the database's corresponding log files be physically stored in two separate IBM NAS volumes. In the event that a database recovery operation becomes necessary, maintaining separate volumes will enable you to easily restore the database files from the appropriate True Image of the database volume, and then perform a roll-forward recovery operation using the original database archive log files.
The location used to store a database's log files is determined by the value of the log path parameter in the database's configuration file. To change the location of a database's log path, issue the following command:

    db2 UPDATE DB CFG FOR [Database Alias] USING NEWLOGPATH [Location]

In this command, Database Alias is the alias for the database whose configuration is to be modified, and Location is the location where the database log files are to be stored.
The Persistent Image Global Settings offer the following options:
- Maximum persistent images: The maximum number of images that you can create per volume. The default value is 250.
- Inactive period (quiescent period): The idle time (on the volume) that PSM waits for before creating a True Image or persistent image. The default value is 5 seconds.
- Inactive time-out (quiescent time-out): The time that PSM is willing to wait for quiescence. If the quiescent period (for example, 5 seconds) does not occur within the specified quiescent time-out (for example, 15 minutes), PSM forces a persistent image creation. The default value is 15 minutes.
- Persistent image files location: The drive where the images will be created. Note that you can only select one location for all the images (of all your volumes), so if you're planning to create several images of each volume, you need to have enough space on this drive. The exact image size will depend on the changes made to the volume. The default value is D: (maintenance), but you may want to put it on a fault-tolerant array.

For our example we set the maximum persistent images to 20. For the inactive period, we assumed that because of the DB2 WRITE SUSPEND the lowest available inactive period, which is 1 second, would be sufficient.
Volume settings
With the menu option Volume Settings you get a list of all available NAS volumes in your system; see Figure 12-5. Remember that these disks are NAS logical disks! From the list of available NAS volumes, choose the one you want to configure, which means, define the PSM settings for that NAS volume.
For volume settings, the following options are available (see Figure 12-6):
- Warning threshold reached (cache full warning threshold): The percentage of the cache size at which warnings are sent. This informs the NAS administrator that it is time to save the images before unwanted deletion of the first few persistent images occurs. The logs for this option are saved in the NT Event Viewer, so you can check them using either Internet Explorer or a Terminal Services client.
- Begin persistent image deletions: The percentage of cache size at which images begin to be deleted on a first-in, first-out basis. The default value is cache 90% full.
- Cache size: The size of the PSM cache allocated from the PSM volume location. The default value is 1 GB.

In our example we set the cache size for volume H: to 10% because this volume was used for test purposes only. As a rule of thumb, we recommend setting the volume cache sizes to 20% for production environments. Figure 12-7 on page 216 shows that the allocated PSM cache size has changed to 10% of the total volume capacity.
From the Persistent Images menu, select the New function. The Create Persistent Image box appears; see Figure 12-9.
Here you can select the volumes to be included in your True Image. For this example, the F: drive was chosen. Furthermore, you can choose whether to create a read-only or a read-write PSM True Image copy, the retention weight, and the name of the PSM True Image copy. The creation of a PSM True Image copy might take a few seconds, depending on the size of the data you included in the copy. After completion, the Persistent Images list shows all new and existing PSM True Image copies in your system. In our example we created, besides a read-write copy of volume F:, three additional copies; see Figure 12-10.
Figure 12-11 PSM True Image Copy: Restore read-write True Image
The successful completion of a True Image restore is recorded in the NAS system log. To check the NAS system log, select the following functions from the NAS Administration Program; see Figure 12-12:
1. Click Maintenance.
2. Click Logs.
3. Click System Log.
From the list of events, select the one with the appropriate time stamp (column Time) and select Event Details; see Figure 12-13.
Figure 12-14 on page 220 shows the directory structure of an IBM NAS machine as seen by the NAS administrator. In our example, the NAS volume F: (FDrive) is dedicated to all database data in directory DB2_data, except the DB2 log files, which are allocated on NAS volume I: (IDrive) in directory db2_logs. Besides the directories DB2_data and db2_logs (in our example defined as Windows shared folders), dedicated PSM cache directories (SNAPSHOTS) are allocated on NAS volumes FDrive and IDrive. The name of the PSM cache folder was set to SNAPSHOTS by the NAS administrator at PSM configuration time (see PSM configuration on page 213). For each True Image copy of a NAS volume, a dedicated subdirectory is allocated in the PSM cache by PSM at the time the True Image is taken. In our example, a True Image copy of the NAS volume FDrive is stored in subdirectory snapshot.1. Figure 12-15 shows that a PSM True Image copy is an exact (virtual) copy of a NAS volume. In our example, a (virtual) copy of the FDrive (directory structure and all files) is created under F:\SNAPSHOTS\snapshot.1.
In our example, the NAS directory F:\DB2_data\DB2... where we created our sample database is mounted as a network mapped drive on server DB2A. The NAS directory F:\SNAPSHOTS\snapshot.1\DB2_data\DB2..., which contains the True Image copy of this database, is mounted as a network mapped drive on server DB2B. Because the directories on each server are mounted as f:\DB2\..., both servers see the same directory structure; in this way the different directory structures of the primary image and the True Image copy are masked.
Figure 12-16 How to access PSM True Image copies
If, for any reason, the same drive letter or mount point cannot be used for the primary image and the True Image copy (this is the case if you plan to access more than one True Image copy of a primary image at a time), a different drive letter or mount point can be used. In this case, however, the path settings of the True Image need to be adjusted at True Image initiation time using the RELOCATE option of the db2inidb command; a RELOCATE example can be found in 12.3.4, PSM True Image copy as DB2 UDB True Image database on page 229.
A DB2 instance was created on each NT server (DB2A and DB2B). Our test database SAMPLE was created on DB2A. Therefore we call this machine the primary server and the database image on that server the primary image. The second instance (DB2B) was used to initiate and access True Image copies of the SAMPLE database. Therefore we call this machine the secondary server.
Directory structure
For the primary (database) image, two dedicated NAS volumes were reserved: one for the DB2 logs and a second for the remaining database objects (containers and control files). The directory structure we used is shown in Figure 12-18.
The directories DB2_data and db2_logs were set up as shared folders. The folder names we used were \\db2_production_data for the DB2_data directory and \\db2_production_logs for the db2_logs directory. Both shared folders were mounted on the primary server: \\db2_production_data was mapped to drive letter F:, and \\db2_production_logs to drive letter I:. The SNAPSHOTS directory on each of the NAS volumes (FDrive and IDrive) is a dedicated PSM cache directory that was allocated during the PSM configuration; for details on the PSM configuration, refer to 12.2.2, PSM configuration on page 213. For each True Image we took, PSM allocated a dedicated subdirectory in the SNAPSHOTS directory, for example snapshot.1. In order to make a True Image copy accessible to our secondary server, the directory of that image is set up as a shared folder and mounted on the server. In our example, the True Image copies of the primary database are allocated in the following directories: F:\SNAPSHOTS\snapshot.1\DB2_data and I:\SNAPSHOTS\snapshot.1\db2_logs. Both directories were set up as shared folders \\db2_snapshot_data and \\db2_snapshot_logs and mounted on drives F: and I: on the secondary server (DB2B).
Database setup
In our test environment, a DB2 instance was created on both the primary and the secondary server (so binaries and instance-related DB2 objects were allocated on disks local to each server). The test database SAMPLE was created on the primary server on the F: drive (the drive maps to the shared folder DB2_data on the NAS volume FDrive). After database creation, the DB2 log path was changed to drive I: (the drive maps to the shared folder db2_logs on the NAS volume IDrive) and logretain was switched on.
Overview
Figure 12-20 shows the necessary steps for an offline True Image. Before you can start taking a True Image, the database needs to be taken offline (t2). As soon as the database is offline, a True Image can be taken (t3). After completion, the database can be brought back online (t4).
Required steps
In order to create a True Image copy of an offline database, follow these steps:
1. Set the database offline: Disconnect all users from the database.
2. Disconnect from the database:
       db2 connect reset
   Terminate all command line back-end processes by issuing:
       db2 terminate
3. Create the True Image: Create a True Image of the required volumes (for a detailed description of the required steps, refer to 12.2.4, Creating an IBM NAS True Image on page 217).
4. Resume access to the database: Issue the following command:
       db2 connect to <database alias>
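The steps above can be collected into a small script. This is only a sketch, assuming a POSIX shell and the DB2 command line processor invocable as db2; the function name offline_true_image and the DB2 override variable are our own illustration (set DB2=echo for a dry run), and the PSM True Image itself is still taken through the NAS administration interface:

```shell
# Sketch of the offline True Image sequence (our own wrapper, not a
# product script). Override DB2 (for example, DB2=echo) for a dry run.
offline_true_image() {
    db=$1
    ${DB2:-db2} connect reset      # step 2: disconnect this session
    ${DB2:-db2} terminate          # end the CLP back-end process
    echo ">>> take the PSM True Image of the NAS volumes now (see 12.2.4)"
    ${DB2:-db2} connect to "$db"   # step 4: resume access
}
```

A dry run with DB2=echo prints the command sequence without touching a database.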
Overview
Figure 12-21 shows the necessary steps for taking an online True Image. Before taking a True Image, all write I/O to the database must be suspended (t2). After suspending write I/O, a point-in-time image of the database can be created (t3). After the PSM True Image has finished, I/O to the primary image (database image) can be resumed (t4).
Required steps
In order to create a True Image copy of an online database, follow these steps:
1. Write suspend for the database: Connect to the database, then issue the following command:
       db2 set write suspend for database
2. Create the True Image: Create a True Image of the required NAS volumes (for a detailed description of the required steps, refer to 12.2.4, Creating an IBM NAS True Image on page 217).
3. Write resume for the database: Issue the following command:
       db2 set write resume for database
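The online sequence can likewise be sketched as a POSIX-shell wrapper around the DB2 CLP. The function name online_true_image and the DB2 override variable are our own illustration, not part of the product; the True Image itself is still created through PSM:

```shell
# Sketch of the online True Image sequence: suspend writes, take the
# PSM True Image, resume writes. Set DB2=echo for a dry run.
online_true_image() {
    db=$1
    ${DB2:-db2} connect to "$db"
    ${DB2:-db2} set write suspend for database
    echo ">>> take the PSM True Image of the NAS volumes now (see 12.2.4)"
    ${DB2:-db2} set write resume for database
}
```

Keeping the suspend window this short matters: writes to table spaces and logs are blocked between the two commands.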
12.3.4 PSM True Image copy as DB2 UDB True Image database
If a quick copy of a DB2 database is required to populate a test or development system, then a True Image of the production system can be utilized.
Overview
For this scenario, we used a second server with a dedicated DB2 instance. The True Image copy we used was accessible through a shared directory (a shared folder on Windows, NFS-mounted on UNIX). Figure 12-22 gives a high-level description of that environment.
Figure 12-22 Accessing a True Image copy from a secondary server
Here, the directory of the True Image (f:\SNAPSHOTS\snapshot.1\DB2\...) is mounted as f:\DB2 on the secondary server (DB2B). Therefore, both servers work with the same directory structure. In cases where this is not possible, the DB2 path settings have to be adjusted to the directory structure of the secondary server by using the RELOCATE option of the db2inidb command. Figure 12-23 briefly describes the required steps. At t1 an online True Image is taken. The created True Image copy is allocated in a PSM cache directory by PSM. After the PSM cache directory is made accessible to server DB2B (t2), the db2inidb command is used to make the True Image copy of the primary database image accessible to the secondary server (t3). In our example, the True Image is accessed as a DB2 True Image database.
Attention: If the directory structure of your True Image copy is the same as on your primary image, the RELOCATE option of the db2inidb command is not required unless you intend to change the name of the database, for example, from SAMPLE to NASDB.
3. Access your database: The database should now be ready for access! Figure 12-24 shows the command sequence we used in our scenario. Note that the first attempt to connect to the True Image copy of the SAMPLE database failed. Because this True Image copy was taken from an online database, the image itself was still in write suspend mode! In order to get access, the True Image copy of the primary database needs to be initiated with the db2inidb command.
C:\PROGRA~1\SQLLIB\BIN>db2 list database directory

 System Database Directory

 Number of entries in the directory = 1

Database 1 entry:

 Database alias          = SAMPLE
 Database name           = SAMPLE
 Database drive          = F:\DB2
 Database release level  = 9.00
 Comment                 =
 Directory entry type    = Indirect
 Catalog node number     = 0

C:\PROGRA~1\SQLLIB\BIN>db2 connect to sample
SQL20153N  The database's split image is in the suspended state.
SQLSTATE=55040

C:\PROGRA~1\SQLLIB\BIN>db2inidb sample as snapshot
Operation was successful.

C:\PROGRA~1\SQLLIB\BIN>db2 connect to sample

   Database Connection Information

 Database server         = DB2/NT 7.2.2
 SQL authorization ID    = NAS_DB2_...
 Local database alias    = SAMPLE
Mounts on the primary site:
    directory for DB2 data --> F:\
    directory for DB2 logs --> I:\

Mounts on the secondary site:
    mount ...snapshot.1\db2_data\db2 --> G:\
    mount ...snapshot.1\db2_logs     --> K:\

db2relocatedb command used:
    db2relocatedb rname_db.sql

rname_db.sql:
    db_name=sample,nasdb
    db_path=f:,g:
    instance=db2
    log_dir=i:,k:
Figure 12-25 Different directory structure for database True Image copy
In order to create a DB2 UDB True Image database that needs to be relocated, the following steps are required:
1. Create a True Image copy of the online database: Create a True Image of the required NAS volumes (for a detailed description of the required steps, refer to 12.2.4, Creating an IBM NAS True Image on page 217).
2. Initiate the True Image copy from the secondary server: Log in to your secondary server. Make sure that the shared folder with the True Image copy is accessible, either as a network mapped device on Windows or as a mountable device on UNIX. Catalog the database. Initiate the database as a True Image by issuing:
       db2inidb <database-name> as snapshot RELOCATE USING <file-name>
3. Access your database: The database should now be ready for access! Figure 12-26 shows the command sequence we used in our scenario.
C:\>db2inidb nasdb as snapshot relocate using rname_db_g.sql
Relocating database...
Files and control structures were changed successfully.
Database was cataloged successfully.
Database relocation was successful.
Operation was successful.

C:\>db2 list database directory

Database 1 entry:

 Database alias          = NASDB
 Database name           = NASDB
 Database drive          = G:\DB2
 Database release level  = 9.00
 Comment                 = Cataloged by db2relocatedb
 Directory entry type    = Indirect
 Catalog node number     = 0
Overview
As a starting point for a DB2 backup copy from a True Image, a valid point-in-time copy (True Image) of the primary database system is required (t1); see 12.3.3, Taking a True Image of an online DB2 UDB database on page 228. At this point the True Image copy is a valid virtual image of the primary database. Using the virtual image of the primary database requires that a DB2 instance, either local or remote, can access the True Image copy for backup purposes (t2 and t3).
Required steps
In order to create a backup copy of an online database, follow these steps:
1. Create a True Image copy of the online database: Create a True Image of the required NAS volumes (for a detailed description of the required steps, refer to 12.2.4, Creating an IBM NAS True Image on page 217).
2. Get access to the True Image copy: Log in to your secondary server. Make sure that the shared folder with the True Image copy is accessible, either as a network mapped device on Windows or as a mountable device on UNIX. Catalog the database. You should see the database name of the primary database; remember, you are accessing a (virtual) copy of the primary image!
       db2 list database directory
3. Take the backup of the True Image copy: Issue the following command:
       db2 backup database <database> to <device>
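As a sketch, the secondary-server side of this scenario might look as follows in a POSIX shell. The function name backup_from_copy and the DB2/DB2INIDB override variables are our own illustration; the db2inidb step applies only when the True Image was taken online, because such an image is still in the write suspend state:

```shell
# Sketch of backing up a True Image copy from the secondary server.
# Override DB2/DB2INIDB (for example, with echo) for a dry run.
backup_from_copy() {
    db=$1; dev=$2
    ${DB2:-db2} list database directory       # confirm the copy is cataloged
    ${DB2INIDB:-db2inidb} "$db" as snapshot   # only for an online-taken image
    ${DB2:-db2} backup database "$db" to "$dev"
}
```

The resulting backup image is a valid point-in-time backup of the primary database, taken without loading the primary server.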
Overview
This scenario requires that a valid True Image copy of the primary database exists. Before the True Image can be restored, the database must be taken offline (t1). For the restore, a True Image copy, which represents a valid image of the database at a prior point in time, is used to restore the primary database (t2). After the restore has completed successfully, the database image needs to be initialized if the True Image was taken from an online database (t3). If the True Image was taken from an offline database, initialization of the restored database is not required.
Note: For an offline True Image, we do not need to suspend I/O. Therefore, the True Image copy is not in a write suspend state.
Figure 12-28 Version recovery from a PSM True Image
At this point you might shut down the DB2 instance in order to prevent users from reconnecting to the database, by issuing:
       db2stop
2. Restore the database from the True Image copy: On the IBM NAS Administration console, use the Restore Persistent Images menu to select and restore the appropriate True Image copy; see 12.2.5, Restoring an IBM NAS True Image on page 218.
3. Restart the DB2 instance, if it was stopped, by issuing:
       db2start
4. Connect to the database: Because we used a True Image of an offline database for the version restore, no database initiation needs to be done after restoring the True Image:
       db2 connect to <database>
At this point you might shut down the DB2 instance in order to prevent users from reconnecting to the database, by issuing:
       db2stop
2. Restore the database from the True Image copy: On the IBM NAS Administration console, use the Restore Persistent Images menu to select and restore the appropriate True Image copy; see 12.2.5, Restoring an IBM NAS True Image on page 218.
3. Restart the DB2 instance, if it was stopped, by issuing:
       db2start
4. Initiate the database as a True Image: Because we used a True Image of an online database for the version restore, a database initiation needs to be done after restoring the True Image:
       db2inidb <database> as snapshot
Overview
This scenario requires that an online True Image of the primary database exists. Before the True Image can be restored, the database must be taken offline (t1); see Figure 12-29. A True Image copy of the database, which represents a valid image of this database at a prior point in time, is used to restore the primary database (t2). After restoring the True Image, the database needs to be initialized with the db2inidb <database> as MIRROR command (t3). This places the database image in roll-forward pending state. In t2 the DB2 logs of the primary site were not replaced. Using these logs, we can roll forward the primary database with the ROLLFORWARD command (db2 rollforward <database> to end of logs and complete) (t4).
At this point you might shut down the DB2 instance in order to prevent users from reconnecting to the database, by issuing:
       db2stop
2. Restore the database from the True Image copy: On the IBM NAS Administration console, use the Restore Persistent Images menu to select and restore the appropriate True Image copy; see 12.2.5, Restoring an IBM NAS True Image on page 218.
3. Restart the DB2 instance, if it was stopped, by issuing:
       db2start
4. Initiate the database as MIRROR: Because we used a True Image of an online database, we need to initialize the database after restoring the True Image:
       db2inidb <database> as mirror
   This places the database in roll-forward pending state.
5. Roll forward the database to end of logs and complete:
       db2 rollforward <database> to end of logs and complete
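The DB2 side of this roll-forward recovery can be sketched as a POSIX-shell wrapper. The function name rollforward_recovery and the DB2/DB2INIDB override variables are our own illustration; the PSM restore itself happens through the NAS Administration console:

```shell
# Sketch of the roll-forward recovery sequence: after the PSM restore,
# initialize the image as a mirror (leaving it roll-forward pending),
# then roll forward through the surviving primary logs.
rollforward_recovery() {
    db=$1
    echo ">>> restore the PSM True Image of the NAS volumes now (see 12.2.5)"
    ${DB2INIDB:-db2inidb} "$db" as mirror
    ${DB2:-db2} rollforward "$db" to end of logs and complete
}
```

Note that this works only because the DB2 log volume was not overwritten by the restore, which is one reason for keeping data and logs on separate NAS volumes.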
Chapter 13.
Moving resources between nodes is also used to make sure that the cluster is successfully set up. Refer to 10.3.3, Setting up the Cluster Server on page 174 for details.
2. Restarting the Primary Node:
The second scenario we tested was restarting the primary node. This is very easy because it can be done using a Windows session connection, but the connection needs to be reconnected to the server manually after the restart, and it is difficult to know when the node is back. We did some failover tests of this type when we only had a remote connection. In reality, this type of event only happens when the OS is forced to restart by some crashed application. The client response is the same as for a type 1 event, because the cluster software has enough time to notify the other node when it is going down.
3. Powering off the Primary Node:
Turning off the power of the primary node was the last scenario we tested; here the network behaved differently (see Figure 13-1). In the response analysis below, we only analyzed events 2 and 3, since events 1 and 2 had the same results for the client.
The result shows that in the type 2 event, the clustered resources, including the cluster IP and virtual IP addresses, were switched to the other node without breaking. In the type 3 event (Figure 13-1), all the clustered resources on the primary node became unavailable after the primary node was powered off. It took less than 10 seconds for the cluster to come back. One of the virtual servers (IP addresses) came back almost at the same time as the cluster IP address; the other virtual server took a few seconds more to come back.
If a database client tries to access the database during that 20-45 second blackout period, it will get a connection error. The only solution for this is to design the application so that it retries the connection until the cluster resources are back.
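A minimal sketch of such retry logic, assuming a POSIX-shell wrapper around the DB2 CLP (the function name connect_with_retry and the DB2 override variable are our own illustration, not part of the product): bound the number of attempts and the delay so the retry rides out the blackout without looping forever.

```shell
# Retry "db2 connect" up to $attempts times with $delay seconds between
# attempts, to survive a 20-45 second cluster failover blackout.
# Override DB2 (for example, DB2=echo or DB2=false) for testing.
connect_with_retry() {
    db=$1; attempts=${2:-10}; delay=${3:-5}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if ${DB2:-db2} connect to "$db" >/dev/null 2>&1; then
            echo "connected on attempt $i"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "giving up after $attempts attempts" >&2
    return 1
}
```

With 10 attempts at 5-second intervals, the client keeps trying for about 50 seconds, which covers the observed 20-45 second window.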
Application Binary Interface Access Control Entries Access Control List Microsoft Active Directory ADSTAR Distributed Storage Manager Andrew File System Advanced Interactive eXecutive American National Standards Institute All Points Addressable Application Programming Interface Advanced Program-to-Program Advanced Peer-to-Peer Networking Advanced RISC Computer Advanced Research Projects Agency American National Standard Code for Information Interchange Asynchronous Terminal Emulation Asynchronous Transfer Mode Audio Video Interleaved Backup Domain Controller
BIND BNU BOS BRI BSD BSOD BUMP CA CAL C-SPOC CDE CDMF CDS CERT CGI CHAP CIDR CIFS CMA CO COPS
Berkeley Internet Name Domain Basic Network Utilities Base Operating System Basic Rate Interface Berkeley Software Distribution Blue Screen of Death Bring-Up Microprocessor Certification Authorities Client Access License Cluster single point of control Common Desktop Environment Commercial Data Masking Facility Cell Directory Service Computer Emergency Response Team Common Gateway Interface Challenge Handshake Authentication Classless InterDomain Routing Common Internet File System Concert Multi-threaded Architecture Central Office Computer Oracle and Password System
249
CPI-C	Common Programming Interface for Communications
CPU	Central Processing Unit
CSNW	Client Service for NetWare
CSR	Client/server Runtime
DAC	Discretionary Access Controls
DARPA	Defense Advanced Research Projects Agency
DASD	Direct Access Storage Device
DBM	Database Management
DCE	Distributed Computing Environment
DCOM	Distributed Component Object Model
DDE	Dynamic Data Exchange
DDNS	Dynamic Domain Name System
DEN	Directory Enabled Network
DES	Data Encryption Standard
DFS	Distributed File System
DHCP	Dynamic Host Configuration Protocol
DLC	Data Link Control
DLL	Dynamic Load Library
DS	Differentiated Service
DSA	Directory Service Agent
DSE	Directory Specific Entry
DNS	Domain Name System
DTS	Distributed Time Service
EFS	Encrypting File Systems
EGID	Effective Group Identifier
EISA	Extended Industry Standard Architecture
EMS	Event Management Services
EPROM	Erasable Programmable Read-Only Memory
ERD	Emergency Repair Disk
ERP	Enterprise Resources Planning
ERRM	Event Response Resource Manager
ESCON	Enterprise System Connection
ESP	Encapsulating Security Payload
ESS	Enterprise Storage Server
EUID	Effective User Identifier
FAT	File Allocation Table
FC	Fibre Channel
FDDI	Fiber Distributed Data Interface
FDPR	Feedback Directed Program Restructure
FIFO	First In/First Out
FIRST	Forum of Incident Response and Security Teams
FQDN	Fully Qualified Domain Name
FSF	File Storage Facility
FTP	File Transfer Protocol
FtDisk	Fault-Tolerant Disk
GC	Global Catalog
GDA	Global Directory Agent
GDI	Graphical Device Interface
GDS	Global Directory Service
GID	Group Identifier
GL	Graphics Library
GSNW	Gateway Service for NetWare
GUI	Graphical User Interface
HA	High Availability
HACMP	High Availability Cluster Multiprocessing
HAL	Hardware Abstraction Layer
HBA	Host Bus Adapter
HCL	Hardware Compatibility List
HSM	Hierarchical Storage Management
HTTP	Hypertext Transfer Protocol
IBM	International Business Machines Corporation
ICCM	Inter-Client Conventions Manual
IDE	Integrated Drive Electronics
IDL	Interface Definition Language
IDS	Intelligent Disk Subsystem
IEEE	Institute of Electrical and Electronic Engineers
IETF	Internet Engineering Task Force
IGMP	Internet Group Management Protocol
IIS	Internet Information Server
IKE	Internet Key Exchange
IMAP	Internet Message Access Protocol
I/O	Input/Output
IP	Internet Protocol
IPC	Interprocess Communication
IPL	Initial Program Load
IPSec	Internet Protocol Security
IPX	Internetwork Packet eXchange
ISA	Industry Standard Architecture
iSCSI	SCSI over IP
ISDN	Integrated Services Digital Network
ISNO	Interface-specific Network Options
ISO	International Organization for Standardization
ISS	Interactive Session Support
ISV	Independent Software Vendor
ITSEC	Initial Technology Security Evaluation
ITSO	International Technical Support Organization
ITU	International Telecommunications Union
IXC	Inter Exchange Carrier
JBOD	Just a Bunch of Disks
JFS	Journaled File System
JIT	Just-In-Time
L2F	Layer 2 Forwarding
L2TP	Layer 2 Tunneling Protocol
LAN	Local Area Network
LCN	Logical Cluster Number
LDAP	Lightweight Directory Access Protocol
LFS	Log File Service (Windows NT)
LFS	Logical File System (AIX)
LFT	Low Function Terminal
JNDI	Java Naming and Directory Interface
LOS	Layered Operating System
LP	Logical Partition
LPC	Local Procedure Call
LPD	Line Printer Daemon
LPP	Licensed Program Product
LRU	Least Recently Used
LSA	Local Security Authority
LTG	Local Transfer Group
LUID	Login User Identifier
LUN	Logical Unit Number
LVCB	Logical Volume Control Block
LVDD	Logical Volume Device Driver
LVM	Logical Volume Manager
MBR	Master Boot Record
MCA	Micro Channel Architecture
MDC	Meta Data Controller
MFT	Master File Table
MIPS	Million Instructions Per Second
MMC	Microsoft Management Console
MOCL	Managed Object Class Library
MPTN	Multi-protocol Transport Network
MS-DOS	Microsoft Disk Operating System
MSCS	Microsoft Cluster Server
MSS	Maximum Segment Size
MSS	Modular Storage Server
MWC	Mirror Write Consistency
NAS	Network Attached Storage
NBC	Network Buffer Cache
NBF	NetBEUI Frame
NBPI	Number of Bytes per I-node
NCP	NetWare Core Protocol
NCS	Network Computing System
NCSC	National Computer Security Center
NDIS	Network Device Interface Specification
NDMP	Network Data Management Protocol
NDS	NetWare Directory Service
NETID	Network Identifier
NFS	Network File System
NIM	Network Installation Management
NIS	Network Information System
NIST	National Institute of Standards and Technology
NLS	National Language Support
NNS	Novell Network Services
NSAPI	Netscape Commerce Server's Application Programming Interface
NTFS	NT File System
NTLDR	NT Loader
NTLM	NT LAN Manager
NTP	Network Time Protocol
NTVDM	NT Virtual DOS Machine
NVRAM	Non-Volatile Random Access Memory
NetBEUI	NetBIOS Extended User Interface
NetDDE	Network Dynamic Data Exchange
OCS	On-Chip Sequencer
ODBC	Open Database Connectivity
ODM	Object Data Manager
OLTP	OnLine Transaction Processing
OMG	Object Management Group
ONC	Open Network Computing
OS	Operating System
OSF	Open Software Foundation
PAL	Platform Abstract Layer
PAM	Pluggable Authentication Module
PAP	Password Authentication Protocol
PBX	Private Branch Exchange
PCI	Peripheral Component Interconnect
PCMCIA	Personal Computer Memory Card International Association
PDC	Primary Domain Controller
PDF	Portable Document Format
PDT	Performance Diagnostic Tool
PEX	PHIGS Extension to X
PFS	Physical File System
PHB	Per Hop Behavior
PHIGS	Programmer's Hierarchical Interactive Graphics System
PID	Process Identification Number
PIN	Personal Identification Number
PMTU	Path Maximum Transfer Unit
POP	Post Office Protocol
POSIX	Portable Operating System Interface for Computer Environment
POST	Power-On Self Test
PP	Physical Partition
PPP	Point-to-Point Protocol
PPTP	Point-to-Point Tunneling Protocol
PReP	PowerPC Reference Platform
PSM	Persistent Storage Manager
PSN	Program Sector Number
PSSP	Parallel System Support Program
PV	Physical Volume
PVID	Physical Volume Identifier
QoS	Quality of Service
RACF	Resource Access Control Facility
RAID	Redundant Array of Independent Disks
RAS	Remote Access Service
RDBMS	Relational Database Management System
RFC	Request for Comments
RGID	Real Group Identifier
RISC	Reduced Instruction Set Computer
RMC	Resource Monitoring and Control
RMSS	Reduced-Memory System Simulator
ROLTP	Relative OnLine Transaction Processing
ROS	Read-Only Storage
RPC	Remote Procedure Call
RRIP	Rock Ridge Internet Protocol
RSCT	Reliable Scalable Cluster Technology
RSM	Removable Storage Management
RSVP	Resource Reservation Protocol
SACK	Selective Acknowledgments
SAK	Secure Attention Key
SAM	Security Account Manager
SAN	Storage Area Network
SASL	Simple Authentication and Security Layer
SATAN	Security Analysis Tool for Auditing
SCSI	Small Computer System Interface
SDK	Software Developer's Kit
SFG	Shared Folders Gateway
SFU	Services for UNIX
SID	Security Identifier
SLIP	Serial Line Internet Protocol
SMB	Server Message Block
SMIT	System Management Interface Tool
SMP	Symmetric Multiprocessor
SMS	Systems Management Server
SNA	Systems Network Architecture
SNAPI	SNA Interactive Transaction Program
SNMP	Simple Network Management Protocol
SP	System Parallel
SPX	Sequenced Packet eXchange
SQL	Structured Query Language
SRM	Security Reference Monitor
SSA	Serial Storage Architecture
SSL	Secure Sockets Layer
SUSP	System Use Sharing Protocol
SVC	Serviceability
SWS	Silly Window Syndrome
TAPI	Telephone Application Program Interface
TCB	Trusted Computing Base
TCP/IP	Transmission Control Protocol/Internet Protocol
TCSEC	Trusted Computer System Evaluation Criteria
TDI	Transport Data Interface
TDP	Tivoli Data Protection
TLS	Transport Layer Security
TOS	Type of Service
TSM	Tivoli Storage Manager
TTL	Time to Live
UCS	Universal Code Set
UDB	Universal Database
UDF	Universal Disk Format
UDP	User Datagram Protocol
UFS	UNIX File System
UID	User Identifier
UMS	Ultimedia Services
UNC	Universal Naming Convention
UPS	Uninterruptable Power Supply
URL	Universal Resource Locator
USB	Universal Serial Bus
UTC	Universal Time Coordinated
UUCP	UNIX to UNIX Communication Protocol
UUID	Universally Unique Identifier
VAX	Virtual Address eXtension
VCN	Virtual Cluster Name
VFS	Virtual File System
VG	Volume Group
VGDA	Volume Group Descriptor Area
VGSA	Volume Group Status Area
VGID	Volume Group Identifier
VIPA	Virtual IP Address
VMM	Virtual Memory Manager
VP	Virtual Processor
VPD	Vital Product Data
VPN	Virtual Private Network
VRMF	Version, Release, Modification, Fix
VSM	Virtual System Management
W3C	World Wide Web Consortium
WAN	Wide Area Network
WFW	Windows for Workgroups
WINS	Windows Internet Name Service
WLM	Workload Manager
WOW	Windows-16 on Win32
WWW	World Wide Web
WYSIWYG	What You See Is What You Get
WinMSD	Windows Microsoft Diagnostics
XCMF	X/Open Common Management Framework
XDM	X Display Manager
XDMCP	X Display Manager Control Protocol
XDR	eXternal Data Representation
XNS	XEROX Network Systems
XPG4	X/Open Portability Guide
Related publications
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.
IBM Redbooks
For information on ordering these publications, see How to get IBM Redbooks on page 260:
Implementing the IBM TotalStorage NAS 300G: High Speed Cross Platform Storage and Tivoli SANergy!, SG24-6278-00
The IBM TotalStorage NAS 200 and 300 Integration Guide, SG24-6505-00
Using iSCSI Planning and Implementing Solutions, SG24-6291-00
DB2 UDB e-business Guide, SG24-6539
IP Storage Networking: NAS and iSCSI Solutions, SG24-6240
A Practical Guide to Tivoli SANergy, SG24-6146
Tivoli SANergy Administrator's Guide, GC26-7389
IBM SAN Survival Guide, SG24-6143
IBM Storage Solutions for Server Consolidation, SG24-5355
Tivoli Storage Management Concepts, SG24-4877
Getting Started with Tivoli Storage Manager: Implementation Guide, SG24-5416
Using Tivoli Storage Manager in a SAN Environment, SG24-6132
Tivoli Storage Manager Version 4.2: Technical Guide, SG24-6277
Red Hat Linux Integration Guide for IBM eServer xSeries and Netfinity, SG24-5853
AIX 5L and Windows 2000: Side by Side, SG24-4784
Migrating IBM Netfinity Servers to Microsoft Windows 2000, SG24-5854
Using TSM in a Clustered NT Environment, SG24-5742
ESS Solutions for Open Systems Storage: Compaq Alpha Server, HP and SUN, SG24-6119
Backing Up DB2 Using Tivoli Storage Manager, SG24-6247-00
Other resources
These publications are also relevant as further information sources:
Roger E. Sanders, DB2 Administration, McGraw-Hill/Osborne, 2002, ISBN 0-07-213375-9
Larry Peterson and Bruce Davie, Computer Networks: A Systems Approach, Morgan Kaufmann Publishers, 1996, ISBN 1558603689
A. S. Tanenbaum, Computer Networks, Prentice Hall, 1996, ISBN 0133499456
M. Schwartz, Telecommunication Networks: Protocols, Modeling and Analysis, Addison-Wesley, 1986, ISBN 020116423X
Matt Welsh, Matthias Kalle Dalheimer, and Lar Kaufman, Running Linux (3rd Edition), O'Reilly, 1999, ISBN 156592469X
Scott M. Ballew, Managing IP Networks with Cisco Routers, O'Reilly, 1997, ISBN 1565923200
Ellen Siever, et al., Linux in a Nutshell (3rd Edition), O'Reilly, 2000, ISBN 0596000251
Andreas Siegert, The AIX Survival Guide, Addison-Wesley, 1996, ISBN 0201593882
William Boswell, Inside Windows 2000 Server, New Riders, 1999, ISBN 1562059297
Paul Albitz and Cricket Liu, DNS and BIND (4th Edition), O'Reilly, 2001, ISBN 0596001584
Gary L. Olsen and Ty Loren Carlson, Windows 2000 Active Directory Design and Deployment, New Riders, 2000, ISBN 1578702429
Microsoft Windows 2000 Professional Resource Kit, Microsoft Press, 2000, ISBN 1572318082
D. Libertone, Windows 2000 Cluster Server Guidebook, Prentice Hall, 2000, ISBN 0130284696
Microsoft Services for UNIX version 2 white paper, found at: http://www.microsoft.com/WINDOWS2000/sfu/sfu2wp.asp
C. J. Date, An Introduction to Database Systems (7th Edition), Addison-Wesley, 1999, ISBN 0201385902
George Baklarz and Bill Wong, DB2 Universal Database V7.1, Prentice Hall, 2001, ISBN 0130913669
Referenced Web sites
Red Hat Linux: http://www.redhat.com/
SUSE Linux: http://www.suse.com/index_us.html
SAS Institute Inc.: http://www.sas.com/
IBM/SAS alliance: http://www.sas.com/partners/directory/ibm
SAS Administrator documentation: http://www.sas.com/service/admin/admindoc.html
SAS/ACCESS sample programs for UNIX: http://www.sas.com/service/techsup/sample/unix_access.html
Oracle: http://www.oracle.com/
You can also download additional materials (code samples or diskette/CD-ROM images) from that site.
Index
A
Access 80 active log file 37 Adapters for NAS200 and 300 130 Add Option dialog 80 Add Volume 73 age cleaner agents 25 AIX commands vmstat 115 Alert Center 7 application failure 34 Archival backup 135 Archival logging 37 archive logging 92 Arrays, logical disks, and volumes 130 ASCII Delimited 10 AutoExNT Service 197 autorestart 35 Avocent 175
B
Backup 205 backup 123 Backup and recovery 34 Backup and recovery functions 135 Backup and recovery in IBM NAS products 136 BACKUP DATABASE 35, 96, 208 backup image 6 balanced binary tree 30 base table 30 Berkeley Fast File System 58 bin 34 Binary large object 31 BLOB 31 Block I/O 40, 134 IBM NAS 200 126 buffer pool manager 36 Buffer pools 25 Bufferpool 8 bulk insert 34
C
caching 25 Character large object 31 Check list 175 CIFS 13, 38-39, 46, 134, 195 Circular logging 36 circular logging 92 CLOB 31 clone 11 Cluster information 175 Cluster resource balancing 186 Cluster Server 174 Clustering 43 coherency control 43 column 30 Command Center 7 command line interface 69 COMMIT 36 Common Internet File System 39, 46 Concurrent Copy 43 Configuration of DB2 Create db2 user account 153 Connectivity 16 consistent state 34 container 26 Containers 27 Control Center 7 copy-on-write operation 139 Create a new Qtree dialog 76 CREATE DATABASE 87 Create db2 user account 155 cumulative backup 9 customized operating systems 14
D
DAS 40 data accessibility 41 Data Backup to Tape 128 Data maintenance utilities 12 data mining 4 Data ONTAP 46, 65, 72 Data Protection 43 Data Protection on Disk 128
Data Protection Technology 128 Data sharing 20 data sharing 42 data throughput 31 data type 30 Data Vaulting 42 data warehousing 4 database 201 database configuration 31 database engine 6 database loading 32 database managed space 26, 68 Database Reallocation 210 database recovery 25 database recovery history file 35 Databases 25 Data-Copy Sharing 42 DataLink 104 DB2 199, 205 DB2 clients 24 DB2 Command Reference 9 DB2 Connect 5 DB2 Database Manager 24 DB2 Everyplace 6 DB2 Optimizer 12 DB2 optimizer 30 DB2 UDB 45 DB2 UDB 7.1 Create database 201 DB2 UDB for OS/400 5 DB2 UDB Utilities 8 DB2 Universal Database 4 DB2 Universal Database packaging 4 DB2_PARALLEL_IO 32, 86 DB2_STRIPED_CONTAINERS 33 DB2_STRIPPED_CONTAINERS 86 DB2's query optimizer 8 DB2EMPFA 34 db2empfa 92 DB2INIDB 96, 99 db2inidb 210-211 db2relocatedb 210 DB2SET 32 db2start 32 db2stop 32 DBCLOB 31 DEL 10 delta 9 DEVICE 27
DEVICE containers 27 dftdbpath 87 diagnostics and performance monitoring 1 dirty pages 25 Disaster Recovery 43 Disk 162, 169 DMS 26, 68, 212 Domain 165 domain 158 Double-byte character large object 31 DSS 5 DSView 175
E
EE 4 EEE 4, 32 engineering drawings 5 Enterprise Systems Connection 18 Enterprise-Extended Edition 32 environment variables 32 ESCON 18 Event Analyzer 7 Event Monitor partial record identifier 114 Export 10 exportfs 81 extenders 6 Extent size 29
F
fabric 42 Failover 187 FC 18 FC disks 128 Fibre Channel 17-18 Fibre Channel SAN 15 Fibre Channel switching technology 42 field 30 FILE 27 File I/O 40 file level I/O protocols 40 file locking 39 file permissions 39 File servers 13 File System Formats (File I/O) 127 File system I/O 134 file systems 27 FilerView 70-71
FlashCopy 43, 135 flat file transfer 42 free block-map file 59 free inode-map file 59 FTP 38 functions 31
G
gateways 17 global temporary tables 27 graphics 5 Group 39
H
HA 43 Hard disks and adapters 129 hardware environment 31 Heterogeneous file sharing 16 heuristic algorithms 25 hierarchical data structure 30 High Availability 43 high bandwidth 17 High Performance Parallel Interface 18 HIPPI 18 history file 38 homogeneous server environment 20 HTTP 38 hubs 17 Hyper Text Transfer Protocol 46
I
I/O buffer 28 I/O Parallelism 31 I/O parallelism 31, 33 IAACU 162 IBM NAS 125, 153, 199, 205 NAS 200 126 NAS 300 128 IBM NAS 200 Model 201 (5194-200) 130 IBM NAS 200 Model 226 (5194-225) 130 IBM NAS 300 (5195-325) 130 IBM NAS Persistent Storage Manager (PSM) 137 IBM NAS True Image 205 IBM Network Attached Storage - Overview 148 IBM RAMAC Virtual Array (RVA) 16 IBM Total Storage NAS 200 147 IBM TotalStorage NAS Models 201 149 IBM TransArc Episode 58, 63 IBM's 3494 21 IBMDEFAULTBP 25 images 5 Import 10 IMS 5 Incremental 9 Index creation wizard 7 index creation 32 Indexes 30 indexes 25 inode file 59 installation 199 Instances 24 instantaneous data replication functions 18 insurance claim forms 5 integrated disks 18 inter-partition parallelism 32 inter-query parallelism 32 intra-partition parallelism 32 intra-query parallelism 32 Introduction to IBM NAS 147 ISO 41 ISV backup software 136
J
Java 5 Journal Facility 7
L
LAN 13 LAN-free 21 Large database systems 3 large object 27 LIST DATABASE DIRECTORY 89 List prefetches 30 LIST TABLESPACE CONTAINERS 91 LIST TABLESPACES 90 Load 10 LOB 27 Local Area Networks 41 Local database directory 89 log buffer 36 Logical consolidation 19 logical database design 26 logical drive 171 logical drives 165
logical structure 30 logretain 35 Long data 31 long DMS table spaces 27 long field data 27 LONG VARCHAR 31 LONG VARGRAPHIC 31
M
Manage NFS Exports 79 Manage Qtrees 75 Manage Volumes 72 Microsoft's Server Message Block 39 mirror 210 Model 326 169 mount 39 mount point 82-83 MPPs 4 MS Windows Terminal Client 161 MSCS 174 multimedia 5 multipage_alloc 34
N
NAS 200 153, 161 NAS 200 Model 201 149 NAS 200 Model 226 150 NAS 300 150, 153, 169-170, 174 NAS 300 base configuration 151 NAS 300 Model 326 151 NAS device 37 NAS Server Engine 127 NAS300 147 NetApp filer 1 Network 163, 169 Network Appliance filers 3 Network appliances 14 network attached storage 3 network availability 41 Network File System 39, 46 NFS 13, 38-39, 46, 134 NFS exports 79 NLM 39 node 179 Node directory 89 Non-disruptive scalability for growth 21 non-volatile RAM 48 NVRAM 48
O
object hierarchy 24 object-relational database 4 offline 209 offline archive log file 37 offline backup 37 OLAP 5, 8 OLAP SQL extensions 8 OLTP 5, 8 online archive log file 37 Open Systems Interconnection 41 operators 31 optimum performance 28 optional Network Lock Manager 39 OSI 41 Other 39
P
Page cleaners 25 Page size 28 pages 28 parallel query processing 8 parallelism 31 Parity Disk 53 partition 172 partitioned database environment 9 PATH 27 PC Integrated Exchange Format 10 PC/IXF or IXF 10 PDC 154 PE 4 performance 25 Performance Monitor 7 Persistent Storage Manager 123 Persistent Storage Manager (PSM) 129 Personal Developer's Edition 4 Physical consolidation 18 physical storage on a system 26 pointers 30 Point-in-time images 135 power interruptions 34 Predictive Failure Analysis (PFA) 130 prefetch mechanisms 33 Prefetch size 29 prefetch size / extent size 33 Pre-loaded code 129 Primary Domain Controller 155 primary online log files 36 primary partition 173 protocol 39 PSM 137, 206 PSM cache contents 139 PSM True Image 212
Q
qtree 66, 97 qtree create 76 qtrees 74 Query Parallelism 32 Query parallelism 31 Quota Trees 66 quota trees 97
R
RAID 1 52 RAID 3 52 RAID 4 52, 133 RAID 5 52, 133 RAID capabilities 18 RAID Implementation 128 RAID reconstruction 56 RAID scrubbing 57 RAID stripe size 33 RAID support 132 RAID 0 133 RAID 1 133 RAID 3 133 RAID 4 133 RAID 5 133 RAID 5E 134 RAID support 132 raw devices 27 RECONCILE 104 record 30 recovery 205 Recovery History file 38 Redbooks Web site 260 Contact us xx REDISTRIBUTE 13 Redistribute Data 13 Redundant Array of Inexpensive Disks 48 referential constraints 30 Registry 32 registry variable 33 regular DMS table spaces. 27 Relational Connect 5
reliability 6 remote cluster storage 43 Remote Copy 43 remote copy 43 Remote file sharing 16 remote mirroring 18 REORG 12, 38 Reorganize Table 12 Resource pooling 15 resource sharing 41 RESTART DATABASE 34 RESTORE DATABASE 35, 37 ROLLBACK 36 rolled back 36 ROLLFORWARD DATABASE 35 roll-forward recovery 35, 37, 208 Root 80 round-robin fashion 29 routers 17 rows 30 Run Statistics 12 RUNSTATS 12, 38 RW 80
S
SAN 1 SAN Fabric 42 SAN Storage 41 Scalability 16 scalability 6 Script Center 7 SCSI 18 second redundant copies of the data 43 secondary log files 36 sequential detection 33 Sequential prefetches 30 serial storage architecture 18 serialization 43 server process 43 ServeRAID Manage 163 server-free 21 SET WRITE RESUME FOR DATABASE 209 SET WRITE SUSPEND FOR DATABASE 209 Share Volume 165 Shared repository 42 sharing of resources 18 single database operation 32 Small computer systems interface 18
SMB 39 SMPs 4 SMS 26, 68, 212 SMS table spaces 34 snapshot 210 SnapShot Copy 43 SnapShot function 135 Snapshots 49, 62 spatial data 6 sqllib directory 34 ssa 18 stand-alone 24 Standard backup 208 Standard recovery 208 standby 210 Starburst 8 Storage consolidation 18 storage failure 34 storage mirroring 43 Storage Subsystem 128 Subsystem local copy services 43 subtypes 30 Suspend I/O 209 switches 17 SYSCATSPACE 26 System Architecture 126 system catalog tables 25 system catalog tables and views 26 System database directory 89 system manageability 41 system managed space 26, 68 system temporary table spaces 27
T
Table spaces 26 table spaces 9, 25 Tables 30 tables 25 Tape pooling 21 temporary table space 26 TEMPSPACE1 26 Terminal Service 161 Tivoli SANergy File Sharing 20 transaction 36 transaction log 34, 208 transaction logger 36 Transaction logging 36 transactions 34 true data sharing 20 TSM 9 TSM backup 136 TSM client software support 16
U
UDB for OS/390 5 UDE 4 UDFs 5 UDTs 5 units of work 34 Universal Database Enterprise Edition 4 Universal Database Enterprise-Extended Edition 4 Universal Database Personal Edition 4 Universal Database Workgroup Edition 4 Universal Developer's Edition 4 Universal Extensibility 5 Universal management 7 unstable stat 34 UPDATE DB CFG 213 UPDATE MONITOR SWITCHES 108 URL 70 user table space 26 User temporary table spaces 27 user-defined data types 5, 31 user-defined functions 5 user-defined objects 26 userexit 35 USERSPACE1 26 Utilities Backup and Recovery 9 Data movement 9 Utility Autoloader 11 DB2LOOK 11 DB2MOVE 11
V
value 30 Varying-length double-byte long character string 31 Varying-length long character string 31 Version Recovery 207 views 25 Virtual copies 137 Virtual Server 169 virtual server 196 Visual Explain 7 VLANs 39
W
WAFL 49, 58 WAN 41 WANs 17 warm standby techniques 43 WE 4 web interface 69 Wide Area Network 41 Windows clients 197 Windows NT backup and NAS backup assistant 136 Worksheet (WSF) 10 Write Anywhere File Layout 49 WRITE RESUME 96, 99 WRITE SUSPEND 96, 98
X
XML 6 XRC 43
Back cover
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, customers, and partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.