
Front cover

DB2 UDB Exploitation of NAS Technology


Integrate DB2 UDB and NAS using this hands-on guide
Learn how NAS can enhance your DB2 environment
Configure DB2 for optimal NAS usage

Lijun (June) Gu David (Danhai) Cao Joachim Dirker Roger E. Sanders Michael T. Terrell Roland Tretau

ibm.com/redbooks

International Technical Support Organization

DB2 UDB Exploitation of NAS Technology

July 2002

SG24-6538-00

Take Note! Before using this information and the product it supports, be sure to read the general information in Notices on page xv.

First Edition (July 2002) This edition applies to IBM TotalStorage Network Attached Storage (NAS) and Network Appliance filer products with the Windows Powered Operating System and Linux. Copyright Network Appliance Inc. 2002. All rights reserved.

Comments may be addressed to: IBM Corporation, International Technical Support Organization, Dept. QXXE Building 80-E2, 650 Harry Road, San Jose, California 95120-6099. When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.
Copyright International Business Machines Corporation 2002. All rights reserved. Note to U.S. Government Users: Documentation related to restricted rights. Use, duplication, or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

Contents
Figures . . . . . ix
Tables . . . . . xiii
Notices . . . . . xv
Trademarks . . . . . xvi
Preface . . . . . xvii
The team that wrote this redbook . . . . . xvii
Special notice . . . . . xix
Comments welcome . . . . . xx

Part 1. NAS and NetApp filer . . . . . 1

Chapter 1. Introduction to DB2 UDB, NAS, and SAN . . . . . 3
1.1 Introduction to DB2 UDB . . . . . 4
1.1.1 DB2 Universal Database packaging . . . . . 4
1.1.2 The Universal Database . . . . . 5
1.1.3 DB2's query optimizer . . . . . 8
1.1.4 DB2 utilities . . . . . 8
1.2 Introduction to Network Attached Storage . . . . . 13
1.2.1 File servers . . . . . 13
1.2.2 Network appliances . . . . . 14
1.2.3 Benefits of NAS . . . . . 14
1.3 Introduction to Storage Area Networks . . . . . 17
1.3.1 Storage Area Network . . . . . 17
1.3.2 Benefits of SAN . . . . . 18

Chapter 2. DB2 UDB, NAS, and SAN terminology and concepts . . . . . 23
2.1 DB2 terminology and concepts . . . . . 24
2.1.1 Instances . . . . . 24
2.1.2 Databases . . . . . 25
2.1.3 Buffer pools . . . . . 25
2.1.4 Table spaces . . . . . 26
2.1.5 Tables, indexes, and long data . . . . . 30
2.1.6 DB2 UDB and parallelism . . . . . 31
2.1.7 Registry and environment variables . . . . . 32
2.1.8 A word about DB2EMPFA . . . . . 34
2.1.9 Backup and recovery . . . . . 34

Copyright IBM Corp. 2002


2.2 NAS terminology and concepts . . . . . 38
2.2.1 Network file system protocols . . . . . 38
2.2.2 File I/O . . . . . 40
2.2.3 Local Area Networks (LANs) . . . . . 41
2.3 Storage Area Network terminology and concepts . . . . . 41
2.3.1 SAN storage . . . . . 41
2.3.2 SAN fabric . . . . . 42
2.3.3 SAN applications . . . . . 42

Chapter 3. Introduction to the NetApp filer . . . . . 45
3.1 The Network Appliance Filer . . . . . 46
3.2 System architecture . . . . . 46
3.2.1 NVRAM implementation . . . . . 48
3.2.2 RAID environment . . . . . 48
3.2.3 Write Anywhere File Layout (WAFL) . . . . . 49
3.2.4 Snapshots . . . . . 49

Chapter 4. NetApp filer terminology and concepts . . . . . 51
4.1 Understanding RAID . . . . . 52
4.1.1 Levels of RAID . . . . . 52
4.1.2 Eliminating the parity disk bottleneck . . . . . 55
4.1.3 Using multiple RAID groups . . . . . 56
4.1.4 Performance and RAID configuration . . . . . 58
4.2 WAFL implementation . . . . . 58
4.2.1 Meta-data lives in files . . . . . 59
4.2.2 A tree of blocks . . . . . 60
4.2.3 A word about write allocation . . . . . 61
4.3 Snapshots . . . . . 62
4.3.1 Snapshots and the block-map file . . . . . 64
4.4 Volumes and Quota Trees . . . . . 65
4.4.1 Quota trees . . . . . 66

Chapter 5. DB2 and the NetApp filer . . . . . 67
5.1 DB2/NetApp filer design considerations . . . . . 68
5.2 Interacting with a Network Appliance filer . . . . . 69
5.2.1 Using FilerView . . . . . 71
5.3 Creating volumes on a Network Appliance filer . . . . . 72
5.4 Creating qtrees on a Network Appliance filer . . . . . 74
5.5 Managing NFS exports (UNIX only) . . . . . 79
5.6 Filer volumes and qtrees with DB2 UDB . . . . . 82
5.7 Creating DB2 UDB databases on a filer . . . . . 86
5.7.1 Setting the appropriate environment/registry variables . . . . . 86
5.7.2 Creating DB2 UDB databases . . . . . 87
5.7.3 Verifying the location of a database . . . . . 89


5.7.4 Improving the performance of SMS table spaces . . . . . 91
5.7.5 Changing the storage location of database log files . . . . . 92

Chapter 6. Backup and recovery options for databases that reside on NetApp filers . . . . . 95
6.1 Backup methods available . . . . . 96
6.2 Designing a DB2 database with filer . . . . . 96
6.3 Suspending and resuming database I/O . . . . . 98
6.3.1 WRITE SUSPEND . . . . . 98
6.3.2 WRITE RESUME . . . . . 99
6.3.3 DB2INIDB . . . . . 99
6.4 Using NetApp Snapshots with a DB2 database . . . . . 100
6.4.1 Taking a Snapshot . . . . . 101
6.4.2 Restoring a DB2 UDB database from a filer Snapshot . . . . . 101
6.4.3 DataLink considerations . . . . . 104

Chapter 7. Diagnostics and performance monitoring . . . . . 105
7.1 The DB2 Database System Monitor . . . . . 106
7.1.1 The snapshot monitor . . . . . 106
7.1.2 Event monitors . . . . . 112
7.2 Operating system monitoring tools . . . . . 114
7.2.1 The top program . . . . . 115
7.2.2 Virtual memory statistics - vmstat . . . . . 115
7.2.3 Process state - ps . . . . . 116
7.3 Network Appliance filer monitoring tools . . . . . 117
7.3.1 sysstat . . . . . 118
7.3.2 ifstat . . . . . 119
7.3.3 netstat . . . . . 119
7.3.4 df . . . . . 121

Part 2. DB2 working with IBM NAS . . . . . 123

Chapter 8. Terminology and concepts of IBM NAS . . . . . 125
8.1 The IBM TotalStorage NAS 200 and 300 concept . . . . . 126
8.1.1 System architecture . . . . . 126
8.1.2 NAS Server Engine . . . . . 127
8.1.3 Storage subsystems . . . . . 128
8.1.4 Pre-loaded code . . . . . 129
8.2 IBM NAS terminology . . . . . 129
8.2.1 Hard disks and adapters . . . . . 129
8.2.2 Arrays, logical disks, and volumes . . . . . 130
8.2.3 RAID support . . . . . 132
8.2.4 File system I/O . . . . . 134
8.2.5 Backup and recovery functions . . . . . 135


8.3 Backup and recovery in IBM NAS products . . . . . 136
8.4 IBM NAS Persistent Storage Manager (PSM) . . . . . 137
8.4.1 How PSM works - overview . . . . . 137
8.4.2 PSM cache contents . . . . . 139
8.4.3 PSM True Image: read-only or read-write . . . . . 144

Chapter 9. Introduction to IBM NAS . . . . . 147
9.1 IBM Network Attached Storage overview . . . . . 148
9.2 IBM TotalStorage Network Attached Storage . . . . . 148
9.2.1 The IBM TotalStorage Network Attached Storage 200 . . . . . 148
9.2.2 The IBM TotalStorage Network Attached Storage 300 . . . . . 150

Chapter 10. Configuration of IBM NAS 200 and 300 . . . . . 153
10.1 Our environment . . . . . 154
10.1.1 Create db2 user account . . . . . 155
10.1.2 Add computer to domain . . . . . 158
10.2 Setting up IBM NAS 200 . . . . . 161
10.2.1 Connecting to the NAS 200 . . . . . 161
10.2.2 Default configuration . . . . . 162
10.2.3 Setting up storage . . . . . 163
10.2.4 Add NAS 200 to domain . . . . . 165
10.2.5 Creating a share volume . . . . . 165
10.3 Setting up the IBM NAS 300 . . . . . 169
10.3.1 Default configuration . . . . . 169
10.3.2 Setting up storage on the NAS 300 . . . . . 170
10.3.3 Setting up the Cluster Server . . . . . 174
10.3.4 Create clustered share volume . . . . . 189
10.4 Getting connected to NAS . . . . . 197
10.4.1 Accessing the shares from our Windows clients . . . . . 197
10.4.2 Accessing the shares for DB2 user . . . . . 197

Chapter 11. DB2 installation on IBM NAS . . . . . 199
11.1 DB2 for Windows on IBM NAS . . . . . 200
11.1.1 DB2 for Windows Objects on IBM NAS . . . . . 201

Chapter 12. Backup and recovery options for DB2 UDB and IBM NAS . . . . . 205
12.1 Backup and recovery considerations on IBM NAS . . . . . 206
12.1.1 DB2 UDB standard backup and recovery methods . . . . . 208
12.1.2 DB2 UDB NAS True Image support . . . . . 209
12.2 DB2 UDB considerations for PSM True Images . . . . . 212
12.2.1 Getting DB2 UDB prepared for IBM NAS True Image . . . . . 212
12.2.2 PSM configuration . . . . . 213
12.2.3 Options for IBM NAS True Image copies . . . . . 216
12.2.4 Creating an IBM NAS True Image . . . . . 217


12.2.5 Restoring an IBM NAS True Image . . . . . 218
12.2.6 Accessing True Image copy - overview . . . . . 220
12.2.7 Some considerations about cache size and location . . . . . 223
12.3 Using IBM NAS True Image with DB2 UDB . . . . . 224
12.3.1 System environment . . . . . 224
12.3.2 Taking a True Image of an offline DB2 UDB database . . . . . 226
12.3.3 Taking a True Image of an online DB2 UDB database . . . . . 228
12.3.4 PSM True Image copy as DB2 UDB True Image database . . . . . 229
12.3.5 Creating a DB2 backup from a True Image . . . . . 235
12.3.6 Version recovery from a PSM True Image . . . . . 236
12.3.7 Roll-forward recovery from a True Image . . . . . 239

Chapter 13. IBM NAS high availability . . . . . 241
13.1 NAS 200 high availability . . . . . 242
13.2 NAS 300 high availability . . . . . 243
13.3 Failover tests on NAS 300 . . . . . 243
13.3.1 Creating a failover event . . . . . 244
13.3.2 Failover response . . . . . 244
13.3.3 Load balancing . . . . . 247
13.3.4 Administration considerations for NAS . . . . . 247

Abbreviations and acronyms . . . . . 249

Related publications . . . . . 257
IBM Redbooks . . . . . 257
Other resources . . . . . 258
Referenced Web sites . . . . . 259
How to get IBM Redbooks . . . . . 260
IBM Redbooks collections . . . . . 260

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261



Figures
1-1 The implementation of NAS in a typical storage network . . . . . 15
1-2 Storage consolidation . . . . . 19
1-3 Logical storage consolidation . . . . . 20
1-4 Loading the IP network . . . . . 22
2-1 Relationship between buffer pools, table spaces, and instances . . . . . 28
2-2 Table space containers and extents . . . . . 29
2-3 IBM NAS devices use File I/O . . . . . 40
3-1 Network Appliance System Architecture . . . . . 47
4-1 Network Appliance's RAID 4 disk layout . . . . . 53
4-2 FFS and WAFL disk write operation patterns . . . . . 55
4-3 Layout used by the WAFL file system . . . . . 59
4-4 WAFL's tree of blocks . . . . . 60
4-5 How WAFL creates a Snapshot in an active file system . . . . . 62
4-6 Life cycle of a block-map file entry . . . . . 64
4-7 NetApp filer with multiple volumes composed of multiple RAID groups . . . . . 65
5-1 Infrastructure used to test DB2 UDB and a Network Appliance filer . . . . . 69
5-2 Initial page of the Network Appliance filer Web interface . . . . . 70
5-3 FilerView's main screen . . . . . 71
5-4 Manage Volumes screen . . . . . 72
5-5 Add New Volume data entry screen . . . . . 75
5-6 Volumes Report screen . . . . . 76
5-7 Manage Qtrees screen . . . . . 77
5-8 Create a new Qtree dialog . . . . . 77
5-9 Manage Qtrees screen after qtrees for test environment were created . . . . . 78
5-10 Manage NFS Exports screen . . . . . 79
5-11 Create a New /etc/exports Line dialog . . . . . 80
5-12 Adding permissions to an export entry . . . . . 81
5-13 Add Option dialog . . . . . 81
5-14 NFS permissions for our test environment . . . . . 82
5-15 /etc/fstab file used in Linux test environment . . . . . 84
5-16 Output from df after qtrees were mounted on our Linux server . . . . . 85
5-17 Output from LIST DATABASE DIRECTORY command . . . . . 90
5-18 Output from LIST TABLESPACES command . . . . . 91
5-19 Output from LIST TABLESPACE CONTAINERS command . . . . . 92
7-1 Sample GET MONITOR SWITCHES output . . . . . 109
7-2 Sample GET DBM MONITOR SWITCHES output . . . . . 110
7-3 Sample table space-level snapshot output . . . . . 110
7-4 Sample Table-level snapshot output . . . . . 111


7-5 Sample top output . . . . . 115
7-6 Sample vmstat output . . . . . 116
7-7 Sample ps output . . . . . 117
7-8 Sample sysstat output . . . . . 118
7-9 Sample ifstat output . . . . . 119
7-10 Sample netstat output . . . . . 120
7-11 Sample df output . . . . . 121
8-1 IBM NAS Appliance System Architecture . . . . . 127
8-2 IBM NAS I/O mapping . . . . . 128
8-3 IBM NAS array support . . . . . 131
8-4 IBM NAS logical drives . . . . . 131
8-5 IBM NAS drive partitions . . . . . 132
8-6 PSM copy-on-write operation . . . . . 138
8-7 PSM read from persistent image . . . . . 139
10-1 Our environment . . . . . 154
10-2 Create new user . . . . . 155
10-3 Change user properties . . . . . 156
10-4 Change nas_db2_user group - select group . . . . . 157
10-5 Change nas_db2_user group - result . . . . . 158
10-6 Create computer account in Windows domain . . . . . 159
10-7 Add to domain . . . . . 159
10-8 Add to domain - the domain user account . . . . . 160
10-9 Add user to local Administrator group . . . . . 160
10-10 Using IBM NAS 200 Server RAID Manager . . . . . 164
10-11 Share folder . . . . . 166
10-12 Add user for share folder . . . . . 167
10-13 Set permissions for db2_data folder . . . . . 168
10-14 Create disk array . . . . . 170
10-15 Create disk array . . . . . 171
10-16 Create logical disk . . . . . 172
10-17 Create partition . . . . . 172
10-18 Create partition - select disk size . . . . . 173
10-19 Create partition - format disk . . . . . 173
10-20 Set up public network . . . . . 178
10-21 Configure public network . . . . . 179
10-22 Set up first node . . . . . 180
10-23 Set up cluster on first node . . . . . 181
10-24 Set up second node . . . . . 182
10-25 Result of setting up second node . . . . . 183
10-26 Adjust cluster quorum log size . . . . . 184
10-27 Increase private network priority . . . . . 185
10-28 Set private network to internal communication . . . . . 186
10-29 Resource balance for disk group 1 . . . . . 187


10-30 Set up threshold for failover of Disk Group 1 . . . . . 188
10-31 Set up failback for Disk Group 1 . . . . . 188
10-32 Create IP address resource . . . . . 191
10-33 Enter IP address resource information . . . . . 191
10-34 Select possible owner of IP address resource . . . . . 192
10-35 Enter IP address for IP address resource . . . . . 192
10-36 Bring IP address resource online . . . . . 192
10-37 Enter network name resource information . . . . . 193
10-38 Enter dependencies information . . . . . 193
10-39 Enter network name . . . . . 194
10-40 Bring network name online . . . . . 194
10-41 Create share volume . . . . . 195
10-42 Enter dependencies for Share Volume resource . . . . . 195
10-43 Enter Share Volume information . . . . . 196
11-1 DB2 Control Center - launching database wizard . . . . . 202
11-2 DB2 UDB 7.2 Create Database Wizard . . . . . 203
12-1 Database Backup from True Image Copy . . . . . 207
12-2 Version recovery from True Image . . . . . 207
12-3 db2relocatedb scenario . . . . . 211
12-4 PSM True Image: Global Settings . . . . . 214
12-5 PSM True Image: Select Volume for configuration . . . . . 215
12-6 PSM True Image: Volume Settings . . . . . 215
12-7 PSM True Image: Volume List . . . . . 216
12-8 PSM True Image Copy: Create new Copy . . . . . 217
12-9 PSM True Image Copy: Volume Selection . . . . . 217
12-10 PSM True Image Copy: Persistent Images List . . . . . 218
12-11 PSM True Image Copy: Restore read-write True Image . . . . . 219
12-12 NAS System Log . . . . . 219
12-13 PSM True Image Copy: System Log Details . . . . . 220
12-14 NAS Volumes with allocated PSM cache . . . . . 220
12-15 NAS volumes with PSM cache . . . . . 221
12-16 How to access PSM True Image copies . . . . . 222
12-17 Our test environment . . . . . 224
12-18 NAS directory structure for scenario environment . . . . . 225
12-19 Directory structure for primary and secondary images . . . . . 226
12-20 True Image of an offline DB2 database . . . . . 227
12-21 True Image of an online database . . . . . 228
12-22 Accessing a True Image copy from a secondary server . . . . . 230
12-23 Initiate database as DB2 True Image database . . . . . 231
12-24 Screen Capture of the db2inidb command sequence . . . . . 232
12-25 Different directory structure for database True Image copy . . . . . 233
12-26 The db2inidb RELOCATE command sequence . . . . . 234
12-27 DB2 Backup from a True Image Copy . . . . . 235


12-28 Version recovery from a PSM True Image . . . . . 237
12-29 Roll-forward recovery from database True Image . . . . . 239
13-1 Network resource response in type 3 failover . . . . . 245
13-2 Error message in the DB2 command center . . . . . 246


Tables
4-1 Volumes compared to quota trees . . . . . 66
5-1 Mount option descriptions . . . . . 84
7-1 Snapshot monitor switches . . . . . 107
8-1 Layout of disk after instant virtual copy is made . . . . . 140
8-2 Layout of PSM cache after instant virtual copy is made . . . . . 140
8-3 Layout of disk immediately after file is deleted . . . . . 141
8-4 Layout of PSM cache immediately after file is deleted . . . . . 141
8-5 Layout of disk after changing "time" to "date" . . . . . 141
8-6 Layout of PSM cache after changing "time" to "date" . . . . . 141
8-7 Layout of disk after changing "men" to "women" . . . . . 142
8-8 Layout of PSM cache after changing "men" to "women" . . . . . 142
8-9 Layout of disk after changes without free space detection . . . . . 143
8-10 Layout of PSM cache after changes without free space detection . . . . . 143
8-11 Layout of disk after changes with free space detection . . . . . 144
8-12 Layout of PSM cache after changes with free space detection . . . . . 144

Copyright IBM Corp. 2002


Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. 
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. 
You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.


Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: AFS AIX AIX 5L DB2 DB2 Connect DB2 Universal Database DFS Enterprise Storage Server ESCON Everyplace FlashCopy IBM IBM.COM IMS Informix Micro Channel Netfinity OS/2 OS/390 OS/400 PAL Perform PowerPC Predictive Failure Analysis RACF RAMAC Redbooks Redbooks(logo) RMF S/390 SANergy Sequent ServeRAID SP SP2 TCS Tivoli TotalStorage xSeries

The following terms are trademarks of International Business Machines Corporation and Lotus Development Corporation in the United States, other countries, or both: Approach Lotus Word Pro

The following terms are trademarks of other companies: ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. C-bus is a trademark of Corollary, Inc. in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. SET, SET Secure Electronic Transaction, and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC. Other company, product, and service names may be trademarks or service marks of others.


Preface
This IBM Redbook is an informative guide that describes how DB2 Universal Database (UDB) can take advantage of Network Attached Storage (NAS) and Storage Area Network (SAN) technology. Specifically, this book provides detailed information to help you learn about Network Appliance filers and IBM NAS 200/300 appliances and to show you how DB2 UDB databases can be stored on these devices. This easy-to-follow guide documents the generic network, software, and hardware requirements, as well as the basic procedures needed to set up, configure, and integrate DB2 UDB databases with Network Appliance filers and IBM NAS 200 and 300 appliances. These procedures start with the basics of initializing the NAS device for DB2 UDB and then continue with the more advanced topics of backing up databases stored on NAS devices using the True Image technology that is supplied with each. This book also provides general information on how DB2 UDB databases that are stored on NAS devices, as well as the NAS devices themselves, can be monitored for performance.

The team that wrote this redbook


This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization, San Jose Center. Lijun (June) Gu is a Project Leader with the IBM International Technical Support Organization (ITSO), San Jose Center, California (USA), where she conducts projects on all aspects of DB2 Universal Database (DB2 UDB). She is an IBM-Certified Solutions Expert (DB2 UDB Database Administrator) and an IBM-Certified Specialist (DB2 UDB User). She has extensive experience in DB2 UDB and ADSM administration, as well as database design and modeling. She holds three master's degrees: an MS in Computer Science, an MS in Analytical Chemistry, and an MS in Soil Science. David (Danhai) Cao is Director of Software Development with Critical Thinking Books & Software, an education publisher in Monterey, California. He has nearly 10 years of experience in application architecture, development, and system administration. His previous experience includes working as a Senior Engineer at Oracle E-Travel and as a consultant for IBM, Sun, and Reuters Asia (Singapore). His areas of expertise include working as an application architect on the Java platform and system administration. He holds a master's degree in Computer Networking.


Joachim Dirker is a Data Management Pre-Sales Specialist with IBM Germany. He has over 10 years of experience in the Data Management field. Roger E. Sanders is a Database Performance Engineer with Network Appliance, Inc. He has been designing and programming software applications for IBM PCs for more than 15 years, and he has worked with DB2 Universal Database and its predecessors for the past 10 years. He has written several computer magazine articles, presented at two International DB2 User's Group (IDUG) conferences, and is the author of All-In-One DB2 Administration Exam Guide, DB2 Universal Database SQL Developer's Guide, DB2 Universal Database API Developer's Guide, DB2 Universal Database CLI Developer's Guide, ODBC 3.5 Developer's Guide, and The Developer's Handbook to DB2 for Common Servers. His background in database application design and development is extensive, and he holds the following professional certifications: IBM Certified Advanced Technical Expert, DB2 for Clusters; IBM Certified Solutions Expert, DB2 UDB V7.1 Database Administration for UNIX, Windows, and OS/2; IBM Certified Solutions Expert, DB2 UDB V6.1 Application Development for UNIX, Windows, and OS/2; and IBM Certified Specialist, DB2 UDB V6/V7 User. Michael T. Terrell is an IT Specialist with the IBM Data Management group. He has 15 years of experience in database application design and development. He has worked for Informix Software and IBM for 9 years, where he held positions as a trainer, consultant, and IT Specialist. He holds the following professional certifications: IBM Certified Specialist, DB2 UDB V6/V7 User; and IBM Certified Solutions Expert, DB2 UDB V7.1 Database Administration for UNIX, Windows, and OS/2. Roland Tretau is a Project Leader with the IBM International Technical Support Organization, San Jose Center. Before joining the ITSO in April 2001, Roland worked in Germany as an IT Architect for Cross Platform Solutions and Microsoft Technologies.
He holds a master's degree in Electrical Engineering with a focus on Telecommunications.

We would especially like to thank the following people for their contributions in providing equipment and content to be incorporated within these pages:

Barry Warwick, Rob Davis, Frank Tutone, Benjamin L. (Ben) Stern, Brenda Haynes, Barbara Gallimore, Susan Grey, Ling Pong
IBM US

Bob Jancer
Network Appliance Incorporation


Thanks to the following people for their contributions to this project:

Rakesh Goenka, Enzo Cialini, Dale M. McInnis
IBM DB2 Toronto Lab

Mark Hayakawa, Michael Sowers, Brent Barnum, Jeff Browning, Joe Richart, Dave Hitz, Michael Marchi, James Lau, Michael Malcolm, Karl L. Swartz, Keith Brown, Jeff Katcher, Rex Walters, Andy Watson, Francine Bellet
Network Appliance Incorporation

Jay Knott, Ken E. Quarles, J. M. Lake, Sushama Paranjape, Sandy Albu, John M. Zoltek, Garry Rawlins, Sasha A. Loose, Tina DeAnglis
IBM US

Michael Baker, Cheryl Block, Margaret Hockett
Critical Thinking Books & Software, CA USA

Nagraj Alur, Corinne Baragoin, Tom Cady, Will Carney, Mary Comianos, Rowell Hernandez, Emma Jacobs, Yvonne Lyon, Deanna Polm, Journel Saniel, Patrick Vabre, Bart Steegmans, Osamu Takagiwa
International Technical Support Organization, San Jose Center

Special notice
This publication is intended to help DB2 database administrators, database specialists, and network/storage administrators to install, configure, and back up and restore DB2 using the IBM TotalStorage NAS 200, IBM NAS 300, or Network Appliance filer. The information in this publication is not intended as the specification of any programming interfaces that are provided by the IBM TotalStorage NAS 200 and 300. See the PUBLICATIONS section of the IBM Programming Announcement for the IBM TotalStorage NAS 200 and 300 for more information about what publications are considered to be product documentation.


Comments welcome
Your comments are important to us! We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways: Use the online "Contact us" review redbook form found at:
ibm.com/redbooks

Send your comments in an Internet note to:


redbook@us.ibm.com

Mail your comments to the address on page ii.


Part 1

NAS and NetApp filer

In this part of the book, we first introduce the basic concepts of Network Attached Storage (NAS), Storage Area Networks (SAN), DB2 Universal Database (DB2 UDB), and the Network Appliance (NetApp) filer, along with its terminology and concepts. Next, we show you how DB2 UDB works with the Network Appliance filer. Finally, we describe the various methods that can be used for diagnostics and performance monitoring of DB2 UDB.


Chapter 1. Introduction to DB2 UDB, NAS, and SAN


Large database systems can be complex environments comprising multiple hardware and software components. In order for these systems to be successful, each component needs to work seamlessly with the others, taking advantage of each other's strengths. This chapter provides an overview of the major components of a large database system that takes advantage of network attached storage: the database management system (DB2 Universal Database) and the storage subsystem (Network Appliance filers or IBM NAS appliances). We discuss each of these components in high-level detail and outline some of the major terms that are used throughout this book.


1.1 Introduction to DB2 UDB


DB2 Universal Database Version 7 (DB2 UDB) is IBM's object-relational database for UNIX, Linux, OS/2, and Windows operating environments. IBM offers DB2 Universal Database packages that provide easy installation, integrated functionality, a rich bundle of development tools, full Web enablement, OLAP capabilities, and flexibility to scale and change platforms.

1.1.1 DB2 Universal Database packaging


DB2 UDB (V7.2) for UNIX, Windows, and OS/2 environments is available in the following package options:

- DB2 Universal Database Personal Developer's Edition (PDE) provides all the tools for one software developer to develop desktop business tools and applications for DB2 Universal Database Personal Edition.
- DB2 Universal Database Personal Edition (PE) provides a single-user object-relational database management system for your PC-based desktop that is ideal for mobile applications or the power user.
- DB2 Universal Developer's Edition (UDE) provides all the tools required for one software developer to develop client/server applications to run on DB2 Universal Database on any supported platform. Database servers and gateways can be set up for development purposes only.
- DB2 Universal Database Workgroup Edition (WE) is a multi-user object-relational database for applications and data shared in a workgroup or department setting on PC-based LANs. It is ideal for small businesses or departments.
- DB2 Universal Database Enterprise Edition (EE) is a multi-user object-relational database for complex configurations and large database needs on Intel or UNIX platforms, ranging from uniprocessors to the largest symmetric multiprocessors (SMPs). It is ideal for midsize to large businesses and departments, particularly where Internet and/or enterprise connectivity is important.
- DB2 Universal Database Enterprise-Extended Edition (EEE) provides a high-performance mechanism to support large databases and offers greater scalability on Massively Parallel Processors (MPPs) or clustered servers. It is ideal for applications requiring parallel processing, particularly data warehousing and data mining.


1.1.2 The Universal Database


DB2 UDB is truly the universal database supporting a wide variety of platforms and applications.

Universal access
DB2 UDB provides universal access to all forms of electronic data. This includes traditional relational data as well as structured and unstructured binary information, documents and text in many languages, graphics, images, multimedia (audio and video), and information specific to operations, such as engineering drawings, maps, insurance claim forms, numerical control streams, or any other type of electronic information. Access to a wide variety of data sources can be accomplished with the use of DB2 UDB and its complementary products: Relational Connect, DB2 Connect, Data Joiner, and Classic Connect. Sources that can be accessed include DB2 UDB for OS/390, DB2 UDB for OS/400, IMS, Oracle, Microsoft SQL Server, Sybase, NCR Teradata, and IBM Informix databases.

Universal application
DB2 UDB supports a wide variety of application types. It can be configured to perform well for online transaction processing (OLTP) as well as for decision support systems (DSS). It can also be used as the underlying database for an online analytical processing (OLAP) system. DB2 UDB is also accessible from, and/or can be integrated into, a wide variety of application development environments. In addition to allowing SQL statements to be embedded within source code files written in standard programming languages such as C/C++, COBOL, and Visual Basic, DB2 UDB fully supports Java technology and is accessible from Java applets, servlets, and applications. DB2 UDB also participates in Microsoft's OLE DB as both a provider and a consumer.

Universal extensibility
Data is stored in most relational databases according to its data type and DB2 UDB is no exception. In order to support a wide variety of data types and formats, DB2 UDB contains a rich set of built-in data types, along with a set of functions that are designed to manipulate each of these data types. DB2 UDB also provides a way to create user-defined data types (UDTs) and supporting user-defined functions (UDFs); consequently, the base data types provided can be extended to provide data types that are specific to your business needs.
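As a sketch of what this extensibility looks like in practice, the following CLP statements create a distinct type and a simple SQL-bodied UDF that operates on it. The type, table, and function names are illustrative only (not from this book's scenarios), and a connected DB2 V7 environment is assumed:

```shell
# Create a distinct type for monetary values, based on DECIMAL
db2 "CREATE DISTINCT TYPE dollars AS DECIMAL(9,2) WITH COMPARISONS"

# Use the new type in a table definition
db2 "CREATE TABLE sales (item VARCHAR(40), price dollars)"

# A simple SQL-bodied UDF on the new type; the casting functions
# DECIMAL(dollars) and DOLLARS(decimal) are generated automatically
# when the distinct type is created
db2 "CREATE FUNCTION with_tax(amount dollars)
     RETURNS dollars
     LANGUAGE SQL CONTAINS SQL
     RETURN dollars(CAST(DECIMAL(amount) * 1.07 AS DECIMAL(9,2)))"
```

The strong typing of distinct types means that a `dollars` column cannot accidentally be compared with or assigned an unrelated `DECIMAL` value without an explicit cast.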


Using UDTs and UDFs, IBM went one step further and created several different sets of user-defined data types and functions to manage particular kinds of data that have begun to emerge over the last few years. Collectively, these sets of data types and functions are referred to as extenders. Currently, five different extender products are available for DB2 UDB; together they provide the capability to store and manipulate image, audio, video, text, XML, and spatial data.

Universal scalability
DB2 UDB scales from pervasive/handheld devices, on which DB2 Everyplace is used, all the way up to Massively Parallel Processing (MPP) environments, in which DB2 UDB Enterprise-Extended Edition (EEE) is used. The various editions of DB2 UDB outlined above will run on palmtops, laptops, distributed servers, and central servers, as well as on clustered server configurations. The superior scalability of DB2 UDB is made possible through a combination of features that are built into the base product. These include intra-partition parallelism as well as inter-partition parallelism. With intra-partition parallelism, database operations are subdivided into multiple parts, which are then executed in parallel within a single database partition. With inter-partition parallelism, database operations are subdivided into multiple parts, which are then executed in parallel across one or more partitions of a multi-partition database. In addition, DB2 UDB's database engine is designed to take advantage of I/O parallelism (the process of reading from or writing to two or more I/O devices at the same time) whenever possible, and it is capable of interacting with disk I/O subsystems that have been designed with RAID technology in mind.
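Intra-partition parallelism, for example, is controlled through configuration parameters. A minimal sketch, assuming a DB2 V7 instance and a hypothetical SAMPLE database:

```shell
# Enable intra-partition parallelism for the instance
db2 "UPDATE DBM CFG USING INTRA_PARALLEL YES"

# Let the optimizer choose the degree of parallelism per query
db2 "UPDATE DB CFG FOR sample USING DFT_DEGREE ANY"

# Restart the instance so the database manager change takes effect
db2stop
db2start
```

With `DFT_DEGREE ANY`, the optimizer picks a degree of parallelism based on the number of processors and the characteristics of each query, rather than a fixed value.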

Universal reliability
DB2 UDB runs reliably across multiple hardware platforms and operating systems; however, unforeseen events (such as a power or media failure) can sometimes cause a database system to become unstable or unusable. DB2 UDB uses write-ahead transaction logging as a preventative measure, and as a result, it can usually resolve database problems that are caused by power interruptions and/or application failures without any additional intervention. Unfortunately, this is not the case when problems arise because the storage media holding a database's files becomes corrupted or fails. To address these types of problems, some kind of backup (and recovery) program must be established. To help establish such a program, DB2 UDB provides a set of utilities that can be used to:

- Create a backup image of a database.
- Return a database to the state it was in when a backup image was made (by restoring it from a backup image).


- Reapply (or roll forward) some or all of the changes made to a database since the last backup image was made, once the database has been returned to the state it was in when that image was taken.

Backups can be scheduled to run automatically, and both full and incremental backup images can be made. Backups can also be tailored for a single table space or for an entire database, and backup images can be taken while a database is online or offline.
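The three recovery operations just described map directly onto CLP commands. A hedged sketch against a hypothetical SAMPLE database (the backup path and timestamp are placeholders; roll-forward recovery additionally requires archival logging, for example LOGRETAIN, to be enabled):

```shell
# Create a full offline backup image of the database
db2 "BACKUP DATABASE sample TO /db2/backups"

# Return the database to the state captured in a specific image
db2 "RESTORE DATABASE sample FROM /db2/backups TAKEN AT 20020715120000"

# Reapply logged changes made after the backup image was taken
db2 "ROLLFORWARD DATABASE sample TO END OF LOGS AND STOP"
```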

Universal management
The primary management tool for DB2 UDB is the Control Center, which, together with the common integrated tool set provided, is used to manage local and remote databases across all software and hardware client platforms from a single terminal. Components of the Control Center include:

- The Command Center, a GUI window that provides for inputting database or operating system commands, while allowing for storage, retrieval, and browsing of previous commands.
- The Script Center, a GUI that allows for the creation, modification, and execution of database or operating system scripts.
- The Journal facility, a GUI tool that provides for managing jobs, recovery, alerts, and messages.
- The Visual Explain facility, which provides a graphical means to display optimization-associated cost information and visual drill-down views of a query's access plan.
- The Event Analyzer, a flexible GUI tool that provides summary and historical analysis of performance.
- The Performance Monitor, a GUI tool that supports online monitoring of buffer pools, sorts, locks, I/O, and CPU activity.
- SmartGuides, which are GUIs that guide database administrators through tasks such as backup/recovery, performance configuration, and object definition.
- The Alert Center, a GUI tool that displays objects that are in an exception status.
- The Index Creation wizard, a GUI tool that helps database administrators build the best possible indexes for a given query workload.

All of the information presented in the Control Center can also be accessed via a command line interface known as the Command Line Processor.


1.1.3 DB2s query optimizer


The query optimizer is the key component in the performance of any enterprise database server. IBM has made more than a 25-year investment in enhancements to DB2's cost-based, rule-driven optimizer, with the goal of keeping it the best in the industry. A major component of this effort is the implementation of the Starburst extensible optimizer in DB2 UDB. This allows IBM to add new intelligence to the optimizer without having to modify the entire query-compilation process. As DB2 UDB is enhanced with new features and functionality, optimizer performance improves. An example is extending the optimizer to understand OLAP SQL extensions and multiple levels of parallel query processing. The latter is particularly important in clusters where each node is an SMP. IBM has also built into the optimizer knowledge about how to work with underlying disk subsystems. The optimizer takes advantage of its knowledge of the characteristics of the hardware environment: CPU speed, I/O speed, network bandwidth (for federated queries), and buffer pool allocations. In addition, it knows the characteristics of the data itself: the size of the table being queried, the partitioning scheme used (if any), the uniqueness of the data, the existence of indexes, and the existence of automatic summary tables. All of these are taken into account when choosing an optimal execution strategy for a given SQL statement. Such evaluations become more important as the size of a database increases. The query optimizer can also perform query rewrite operations automatically, if necessary. This feature allows DB2 UDB to generate query data access plans that have been optimized for performance, without changing the intent of the original queries or the result sets produced. This capability is especially important for environments where the SQL is being generated by a tool, which is the case for many decision support applications.

1.1.4 DB2 utilities


In addition to providing a rich object-relational database management system, each edition of DB2 UDB, with the exception of DB2 Everyplace, contains a comprehensive set of utilities that enable you to work with objects and data stored in DB2 UDB databases. These utilities are designed to support a wide range of traditional database configurations, as well as small OLAP and OLTP database environments; they incorporate a highly scalable software architecture which allows them to execute on a wide variety of hardware platforms.


This book does not attempt to provide a complete list of the DB2 utilities. Instead, we concentrate on those utilities that are most likely to be used with network attached storage. For more information on the DB2 UDB utilities, please refer to the appropriate sections of the DB2 UDB Administration Guide (all three volumes), the DB2 Command Reference, and the DB2 UDB Data Movement Guide.

Backup and recovery utilities


One of the basic tools used to prevent catastrophic data loss when a media failure occurs is a backup image. A backup image is essentially a copy of an entire database or of one or more of the table spaces that make up a database. Once created, a backup image can be used at any time to return a database to the state it was in at the time the image was made. DB2 UDB provides a backup utility and a restore utility, each of which can be run at several levels of granularity and in parallel. With these utilities:

- Online or offline backup/restore operations can be performed.
- An entire database, a single table space, or multiple table spaces can be backed up or restored.
- Incremental, delta, or cumulative backup images can be made.
- In a partitioned database environment, all partitions or a subset of the partitions available can be backed up or restored.

The degree of parallelism achieved during the backup and restore process is determined by the number of backup devices used. This parallel capability results in a linear reduction in the time needed to perform a backup or restore operation. The backup and restore utilities provided with DB2 UDB have the ability to interface with storage management products such as the IBM Tivoli Storage Manager (TSM) product, which manages backup jobs and log archives from multiple servers. These products (integrated through the use of DB2 user exits) allow for the automated archival and retrieval of backup images and archive log files to and from offline storage.
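The granularity options above appear as clauses on the BACKUP and RESTORE commands. A sketch assuming a hypothetical SAMPLE database with a table space named USERSPACE1 and a configured TSM client (the INCREMENTAL option requires V7.2 and the TRACKMOD database configuration parameter):

```shell
# Online backup of a single table space, written directly to TSM
db2 "BACKUP DATABASE sample TABLESPACE (USERSPACE1) ONLINE USE TSM"

# Incremental online backup of the whole database
db2 "BACKUP DATABASE sample ONLINE INCREMENTAL USE TSM"

# Restore just that table space from the most recent TSM image,
# while the rest of the database remains available
db2 "RESTORE DATABASE sample TABLESPACE (USERSPACE1) ONLINE USE TSM"
```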

Data movement utilities


Although a database is normally thought of as a single self-contained entity, there are times when it becomes necessary for a database to exchange data with the outside world. For this reason, several of the utilities provided with DB2 UDB are used to move data between databases and external files.


Export
The Export utility is used to extract specific portions of data from a database and externalize it to ASCII Delimited (DEL), Worksheet (WSF), or PC Integrated Exchange Format (PC/IXF or IXF) formatted files. Such files can then be used to populate tables in a variety of databases (including the database the data was extracted from) or to provide input to software applications such as spreadsheets and word processors.
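A minimal CLP sketch of the Export utility, assuming a connection to a database containing the sample STAFF table (file names are placeholders):

```shell
# Export the STAFF table to a PC/IXF file; warnings go to expmsgs.txt
db2 "EXPORT TO staff.ixf OF IXF MESSAGES expmsgs.txt SELECT * FROM staff"

# Export the same data to a comma-delimited (DEL) file instead
db2 "EXPORT TO staff.del OF DEL MESSAGES expmsgs.txt SELECT * FROM staff"
```

Because any SELECT statement can drive the export, a subset of rows or columns can be externalized just as easily as a whole table.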

Import
The Import utility provides a way to read data directly from DEL, ASC, WSF, or PC/IXF formatted files and store it in a specific database table. When the Export utility is used to externalize data in a table to a PC/IXF formatted file, the table structure and the definitions of all of the table's associated indexes are written to the file along with the data. Because of this, the Import utility can create or re-create a table and its indexes, as well as populate the table, if data is being imported from a PC/IXF formatted file. When any other file format is used and the table or updateable view receiving the data already contains data values, the data being imported can either replace or be appended to the existing data, provided the base table receiving the data does not contain a primary key that is referenced by a foreign key of another table. In some situations, data being imported can also be used to update existing rows in a base table.
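The two main modes described above look like this in the CLP; table and file names are placeholders, and a database connection is assumed:

```shell
# Re-create the table (structure, indexes, and data) from a PC/IXF file
db2 "IMPORT FROM staff.ixf OF IXF MESSAGES impmsgs.txt CREATE INTO newstaff"

# Append DEL-format rows to an existing table
db2 "IMPORT FROM staff.del OF DEL MESSAGES impmsgs.txt INSERT INTO staff"
```

CREATE INTO is only available with PC/IXF input, since only that format carries the table and index definitions.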

Load
Like the Import utility, the Load utility is designed to read data directly from DEL, ASC, or PC/IXF formatted files and store it in a specific database table. However, unlike with the Import utility, the table that the data is stored in must already exist in the database before the load operation is initiated; the Load utility ignores the table structure and index definition information stored in PC/IXF formatted files. Likewise, the Load utility does not create new indexes for a table; it only rebuilds indexes that have already been defined for the table being loaded. The most important difference between the Import utility and the Load utility relates to performance. Because the Import utility inserts data into a table one row at a time, each row inserted must be checked for constraint compliance (such as foreign key constraints and table check constraints), and all activity performed must be recorded in the database's transaction log files. The Load utility, on the other hand, inserts data into a table much faster because, instead of inserting one row of data at a time, it builds data pages using several individual rows of data and then writes those pages directly to the table space container in which the table's structure and any preexisting data have been stored. Existing primary/unique indexes are then rebuilt once all data pages constructed have been written to the container, and duplicate rows that violate primary or unique key constraints are deleted (and copied to an exception table, if appropriate).
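A hedged sketch of a load operation, assuming the target table and an exception table with a compatible definition already exist (all names are placeholders):

```shell
# Load DEL-format data into an existing table; rows that violate a
# unique index are removed and copied to the exception table
db2 "LOAD FROM staff.del OF DEL MESSAGES loadmsgs.txt
     INSERT INTO staff FOR EXCEPTION staff_exc"
```

Because the Load utility bypasses normal row-at-a-time logging and constraint checking, it is the usual choice for populating large tables, with constraint checking deferred until after the load completes.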


Autoloader
In a partitioned database environment, the Autoloader utility performs the steps needed to distribute data being loaded from one or more flat files across the partitions that comprise a partitioned table. The Autoloader utility splits all data being loaded using the partitioning map for the table, pipes the split data to the appropriate partitions, and concurrently loads each portion of the data into each partition of the partitioned table.

DB2MOVE
The DB2MOVE utility facilitates the movement of a large number of tables between DB2 databases. This utility queries the system catalog tables for a particular database and compiles a list of all user tables found. It then exports the contents and table structure of each table found to a PC/IXF formatted file. The set of files produced can then be imported or loaded into another DB2 database on the same system, or they can be transferred to another workstation platform and imported or loaded into a DB2 database that resides on that platform. (This is the best method to use when copying or moving an existing database from one platform to another.) The DB2MOVE utility can be run in one of three modes: EXPORT, IMPORT, or LOAD. When run in EXPORT mode, the DB2MOVE utility invokes the Export utility to extract data from one or more tables and externalize it to PC/IXF formatted files. It also creates a file named db2move.lst that contains the names of all tables processed, along with the names of the files that each table's data was written to. When run in IMPORT mode, the DB2MOVE utility invokes the Import utility to re-create each table and its indexes from data stored in PC/IXF formatted files. In this mode, the file db2move.lst is used to establish a link between the PC/IXF formatted files needed and the tables into which data will be imported. When run in LOAD mode, the DB2MOVE utility invokes the Load utility to populate tables that have already been created with data stored in PC/IXF formatted files. Again, the file db2move.lst is used to establish a link between the PC/IXF formatted files needed and the tables into which data will be loaded.
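The three modes are selected on the command line. A sketch assuming a hypothetical SAMPLE database on the source system and an empty (or pre-created) target database of the same name:

```shell
# On the source system: export every user table to PC/IXF files,
# producing db2move.lst alongside them
db2move sample export

# On the target system: re-create and populate the tables from
# the transferred files
db2move sample import

# Or, for faster population of tables that already exist
db2move sample load
```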

DB2LOOK
The DB2LOOK utility is a special utility that will generate the DDL SQL statements needed to re-create existing objects in a given database. In addition to generating DDL statements, DB2LOOK can also collect statistical information that has been generated for objects in a database from the system catalog tables and save it (in readable format) in an external file. In fact, by using DB2LOOK, it is possible to create a clone of an existing database that contains both its data objects and current statistical information about each of those objects.
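For example, the following invocation sketch (the database and output file names are placeholders) extracts DDL and mimics the catalog statistics:

```
# Extract DDL (-e), generate UPDATE statements that mimic catalog
# statistics (-m), include table space and buffer pool DDL (-l),
# and write everything to sample.ddl (-o)
db2look -d sample -e -m -l -o sample.ddl

# Replaying the script against an empty database yields a clone of the
# original database's objects and statistics
db2 -tvf sample.ddl
```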

Chapter 1. Introduction to DB2 UDB, NAS, and SAN

11

Data maintenance utilities


The way data stored in tables is physically distributed across table space containers can have a significant impact on the performance of applications that access the data. How data is stored in a table is affected by insert, update, and delete operations that are performed on the table. For example, a delete operation may leave empty pages that for some reason never get reused. Or update operations performed on variable-length columns may result in larger column values that cause an entire row to be moved to a different page because it no longer fits on the original page. In both scenarios, internal gaps are created in the underlying table space containers. As a consequence, the DB2 Database Manager may have to read more physical pages into memory in order to retrieve the data needed to satisfy a query. Because situations such as these are almost unavoidable, DB2 UDB provides a set of data maintenance utilities that are used to optimize the physical distribution of all data stored in a table.

Reorganize Table (REORG)


The Reorganize Table utility removes gaps in table space containers by rewriting the data associated with a table to contiguous pages in storage (similar to the way a disk defragmenter works). With the help of an index, the Reorganize Table utility can also place the data rows of a table in a specific physical sequence, thereby increasing the cluster ratio of the selected index. This approach also has an attractive side effect: if the DB2 Database Manager finds the data needed to satisfy a query stored in contiguous space and in the desired sort order, the overall performance of the query will be improved, because the seek time needed to read the data will be shorter and a sort operation may no longer be necessary.
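The utility is typically driven from the DB2 command line processor. A hedged example follows; the schema, table, and index names are invented for illustration:

```
# See whether the catalog statistics suggest a reorganization is needed
db2 reorgchk current statistics on table db2inst1.employee

# Rewrite the table into contiguous pages, clustering rows by an index
db2 "REORG TABLE db2inst1.employee INDEX db2inst1.x_empno"
```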

Run Statistics (RUNSTATS)


Although the system catalog tables contain information such as the number of rows in a particular table, the way storage space is utilized by tables and indexes, and the number of different values found in a column, this information is not automatically kept up-to-date. Instead, it has to be generated periodically by running the Run Statistics utility. The information that is collected by the Run Statistics utility is used in two ways: to provide information about the physical organization of the data in a table and to provide information that the DB2 Optimizer can use when selecting the best path to use to access data that will satisfy the requirements of a query.
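A typical invocation might look like this (the object names are illustrative):

```
# Collect table, column-distribution, and detailed index statistics
db2 "RUNSTATS ON TABLE db2inst1.employee WITH DISTRIBUTION AND DETAILED INDEXES ALL"

# Static SQL packages must be rebound before they can use the new statistics
db2 rebind package emppkg
```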

12

DB2 UDB exploitation of NAS technology

Redistribute Data (REDISTRIBUTE)


The Redistribute Data utility is designed to physically move data between partitions when data is stored in a partitioned database environment. This utility is typically used to move data between partitions when a new logical data partition is added to or removed from a nodegroup. It can also be used to redistribute data if the way data is distributed across existing database partitions is not uniform.
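For instance (the nodegroup name shown is the DB2 default; this assumes a connection to a partitioned database):

```
# After adding or dropping a partition, spread the data evenly
# across all partitions in the nodegroup
db2 "REDISTRIBUTE NODEGROUP ibmdefaultgroup UNIFORM"
```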

1.2 Introduction to Network Attached Storage


Given the expansive growth in both storage and network technology during the past decade, it is only natural that an easy way to implement a scalable solution that meets various storage needs has evolved. Storage devices which optimize the concept of file sharing across the network have come to be known as Network Attached Storage (NAS). NAS solutions utilize the mature Ethernet IP network technology of the LAN; data is sent to and from NAS devices over the LAN using TCP/IP. By making storage devices LAN addressable, the storage is freed from its direct attachment to a specific server, and any-to-any connectivity is facilitated using the LAN fabric. In principle, any user running any operating system can access files on the remote storage device by means of a common network access protocol; for example, NFS for UNIX servers, and CIFS for Windows servers.

A NAS device cannot simply attach to a LAN; in order to manage the transfer and organization of data on the device, it needs additional intelligence, which is provided by a dedicated server to which the storage is attached. It is important to understand this concept: NAS consists of a server, an operating system, plus storage which is shared across the network by many other servers and clients. So NAS is a device, rather than a network infrastructure, and shared storage is either internal to the NAS device or remotely attached to it.
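From a database server's point of view, NAS storage simply appears as a mounted network file system. As a sketch (the filer name, export path, mount point, and drive letter are invented for illustration):

```
# UNIX/Linux: mount an NFS export from a NAS appliance
mount -t nfs -o rw,hard,intr filer1:/vol/vol1/db2data /db2data

# Windows: map a CIFS share from the same appliance
net use Z: \\filer1\db2data
```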

1.2.1 File servers


Early NAS implementations in the late 1980s used a standard UNIX or NT server with NFS or CIFS software to operate as a remote file server. In such implementations, clients and other application servers were able to access the files stored on the remote file server, as though they were located on their local disks. The location of the file was transparent to the user; several hundred users could work on information stored on the file server, each one unaware that the data is located on another system.


The file server was responsible for accurately managing I/O requests, queuing requests as necessary, fulfilling requests, and returning appropriate information to the correct initiator. In addition, the NAS server handled all aspects of security and lock management. If one user had a file open for updating, no one else was allowed to update the file until it was released. The file server kept track of connected clients by means of their network IDs, addresses, and so on.

1.2.2 Network appliances


More recent NAS implementations use application-specific, specialized, thin server configurations with customized operating systems, usually comprising a stripped-down UNIX kernel, a reduced Linux OS, a specialized Windows 2000 kernel (as with the IBM NAS products IBM TotalStorage NAS 200 and NAS 300), or an embedded proprietary OS (as with Network Appliance filers). With these customized operating systems, many of the functions provided in a full server operating system are not supported. The objective is to improve performance and reduce costs by eliminating unnecessary functions normally found in standard hardware and software. Some NAS implementations also employ specialized data mover engines and separate interface processors to further boost performance.

These specialized file servers, together with their specialized OS, are typically known as appliances. The term appliance originated with household electrical devices, such as a coffee maker or a toaster, each of which is a specialized, application-specific tool. NAS appliances typically come with pre-configured software and hardware, and with no monitor or keyboard for user access (which is why such a system is commonly termed headless). In order to manage the disk resources, a storage administrator must access the appliance from a remote console.

One of the typical characteristics of a NAS appliance is its ability to be installed rapidly, with minimal time and effort needed to configure the system. Thus, NAS can be integrated seamlessly into an existing network, as shown in Figure 1-1. This makes NAS appliances especially attractive when lack of time and skills are elements that must be taken into consideration.

1.2.3 Benefits of NAS


NAS appliances offer a number of benefits that address some of the limitations of directly attached storage devices, and that overcome some of the complexities associated with SANs.


Resource pooling
A NAS appliance enables disk storage capacity to be consolidated and pooled on a shared network resource, which may be located at great distances from the clients and servers which will access it. Thus a NAS device can be configured as one or more file systems, each residing on specified disk volumes (Figure 1-1).

Figure 1-1 The implementation of NAS in a typical storage network (a specialized NAS appliance attached, via Ethernet, to the IP network shared by clients and servers)

All users accessing the same file system are assigned space within it on demand. This contrasts with individual directly attached storage, where some users may have too little storage and others may have too much. Consolidation of files onto a centralized NAS device can also minimize or eliminate the need to have multiple copies of files spread across several distributed clients. Thus overall hardware costs can be reduced. Additionally, NAS pooling can reduce the need to physically reassign capacity among users. The results can be lower overall costs through better utilization of the storage, lower management costs, increased flexibility, and increased control.

Exploits existing infrastructure


Because NAS utilizes the existing LAN infrastructure, the costs associated with implementation are minimal (whereas introducing a new network infrastructure, such as a Fibre Channel SAN, can cause significant hardware costs to be incurred). In addition, relatively few new skills must be acquired in order to use NAS; with a SAN, by contrast, new skills must be acquired, and a project of any size will need careful planning and monitoring to bring it to completion.


Simple to implement
Because NAS devices attach to mature, standard LAN implementations, and have standard LAN addresses, they are typically extremely easy to install, operate, and administer. This plug-and-play operation results in lower risk, ease of use, and fewer operator errors, all of which contribute to a lower cost of ownership.

Enhanced choice
With NAS, the storage decision is separated from the server decision, thus enabling the buyer to exercise more choice in selecting equipment to meet the business needs.

Connectivity
LAN implementation allows any-to-any connectivity across the network. Often, NAS appliances allow for concurrent attachment to multiple networks; thus, one NAS device has the capability to support many users simultaneously.

Scalability
Typically, NAS appliances can scale in capacity and performance within the allowed configuration limits of the individual appliance. However, this scalability may be restricted by considerations such as LAN bandwidth constraints, and the need to avoid restricting other LAN traffic.

Heterogeneous file sharing


Remote file sharing is one of the basic functions of any NAS appliance. Multiple client systems can have access to the same file; access control is serialized by NFS or CIFS. Heterogeneous file sharing may be enabled by the provision of translation facilities between NFS and CIFS.

Enhanced backup
NAS appliance backup is a common feature of most popular backup software packages. For instance, the IBM NAS 200 and 300 appliances all provide TSM client software support. Some NAS appliances provide an integrated, automated backup facility to tape, enhanced by the availability of advanced functions such as the IBM NAS appliance facility called Persistent Storage Manager (PSM). This enables multiple point-in-time copies of files to be created on disk, which can be used to make backup copies to tape in the background. This is similar in concept to features such as IBM's Snapshot function on the IBM RAMAC Virtual Array (RVA).


Improved manageability
By providing consolidated storage, which supports multiple application systems, storage management is centralized. This enables a storage administrator to manage more capacity on a NAS appliance than typically would be possible for distributed, directly attached storage.

To summarize, an appliance is an easy-to-use device designed to perform a specific function, such as serving files to be shared among multiple clients, and a NAS appliance performs this task very well. It is important to recognize that a NAS is not a general purpose server, and should not be used (indeed, due to its customized OS, probably cannot be used) for general purpose server tasks. However, it does provide a good solution for appropriately selected shared storage applications.

In this book, we focus on implementing DB2 UDB EE on NAS as a storage networking solution. Reading this book should adequately equip you to implement a DB2 and NAS solution, using one or more products we describe, to meet your networked storage requirements.

1.3 Introduction to Storage Area Networks


SANs create new methods of attaching storage to servers. These new methods promise great improvements in both availability and performance. Today's SANs are used to connect shared storage arrays to multiple servers, and are used by clustered servers for failover. They can interconnect mainframe disk or tape to network servers or clients, and can create parallel data paths for high bandwidth computing environments. A SAN is another network that differs from traditional networks because it is constructed from storage interfaces. Often it is referred to as "the network behind the server."

1.3.1 Storage Area Network


In today's Storage Area Network (SAN) environment, the storage devices in the bottom tier are centralized and interconnected, which represents, in effect, a move back to the central storage model of the host or mainframe. One definition of a SAN is a high-speed network, comparable to a LAN, that allows the establishment of direct connections between storage devices and processors (servers), centralized to the extent supported by the distance of Fibre Channel. The SAN can be viewed as an extension to the storage bus concept that enables storage devices and servers to be interconnected using similar elements as in Local Area Networks (LANs) and Wide Area Networks (WANs): routers, hubs, switches, and gateways.


A SAN can be shared between servers and/or dedicated to one server. It can be local, or can be extended over geographical distances. SAN interfaces can be Enterprise Systems Connection (ESCON), Small Computer Systems Interface (SCSI), Serial Storage Architecture (SSA), High Performance Parallel Interface (HIPPI), Fibre Channel (FC), or whatever new physical connectivity emerges. The diagram in Figure 1-3 shows a schematic overview of a SAN connecting multiple servers to multiple storage systems.

1.3.2 Benefits of SAN


Today's business environment creates many challenges for the enterprise IT planner. This is a true statement, and it relates to more than just business continuance, so perhaps now is a good time to look at whether deploying a SAN will solve more than just one problem. It may be an opportunity to look at where you are today and where you want to be in three years' time. Is it better to plan for migration to a SAN from the start, or to try to implement one later, after other solutions have been considered and possibly implemented? Are you sure that the equipment that you install today will still be usable in three years' time? Is there any use that you can make of it outside of business continuance? A journey of a thousand miles begins with one step. In the topics that follow, we remind you of some of the business benefits that SANs can provide. We have identified some of the operational problems that a business faces today, and which could potentially be solved by a SAN implementation.

Storage consolidation and sharing of resources


By enabling storage capacity to be connected to servers at a greater distance, and by disconnecting storage resource management from individual hosts, a SAN enables disk storage capacity to be consolidated. The results can be lower overall costs through better utilization of the storage, lower management costs, increased flexibility, and increased control. This can be achieved physically or logically.

Physical consolidation
Data from disparate storage subsystems can be combined on to large, enterprise class shared disk arrays, which may be located at some distance from the servers. The capacity of these disk arrays can be shared by multiple servers, and users may also benefit from the advanced functions typically offered with such subsystems. This may include RAID capabilities, remote mirroring, and instantaneous data replication functions, which might not be available with smaller, integrated disks. The array capacity may be partitioned, so that each server has an appropriate portion of the available gigabytes.


Physical consolidation of storage is shown in Figure 1-2.

Figure 1-2 Storage consolidation (servers A, B, and C share partitions of a consolidated shared disk array, with free space available for allocation)

Available capacity can be dynamically allocated to any server requiring additional space. Capacity not required by a server application can be re-allocated to other servers. This avoids the inefficiency associated with free disk capacity attached to one server not being usable by other servers. Extra capacity may be added in a non-disruptive manner.

Logical consolidation
It is possible to achieve shared resource benefits from the SAN, but without moving existing equipment. A SAN relationship can be established between a client and a group of storage devices that are not physically co-located (excluding devices which are internally attached to servers). A logical view of the combined disk resources may allow available capacity to be allocated and re-allocated between different applications running on distributed servers, to achieve better utilization. Consolidation is covered in greater depth in IBM Storage Solutions for Server Consolidation, SG24-5355.


In Figure 1-3, we show a logical consolidation of storage.

Figure 1-3 Logical storage consolidation (heterogeneous NT, AIX, Solaris, and Linux clients, each with an installable file system and cache, reach shared storage devices over a Fibre Channel SAN fabric; a cluster of meta-data servers with a shared persistent store provides load balancing, failover processing, and scalability, while the existing IP network carries NFS, CIFS, FTP, and HTTP client/server traffic)

Data sharing
The term data sharing is used somewhat loosely by users and some vendors. It is sometimes interpreted to mean the replication of files or databases to enable two or more users, or applications, to concurrently use separate copies of the data. The applications concerned may operate on different host platforms. A SAN may ease the creation of such duplicated copies of data using facilities such as remote mirroring.

Data sharing may also be used to describe multiple users accessing a single copy of a file. This could be called true data sharing. In a homogeneous server environment, with appropriate application software controls, multiple servers may access a single copy of data stored on a consolidated storage subsystem.

If attached servers are heterogeneous platforms (for example, a mix of UNIX and Windows NT), sharing of data between such unlike operating system environments is complex. This is due to differences in file systems, data formats, and encoding structures. IBM, however, uniquely offers a true data sharing capability, with concurrent update, for selected heterogeneous server environments, using the Tivoli SANergy File Sharing solution.


Non-disruptive scalability for growth


There is an explosion in the quantity of data stored by the majority of organizations. This is fueled by the implementation of applications such as e-business, e-mail, Business Intelligence, Data Warehouse, and Enterprise Resource Planning. Industry analysts, such as IDC and Gartner Group, estimate that electronically stored data is doubling every year. In the case of e-business applications, which open the business to the Internet, there have been reports of data growing by more than 10 times annually. This is a nightmare for planners, as it is increasingly difficult to predict storage requirements. A finite amount of disk storage can be connected physically to an individual server due to adapter, cabling, and distance limitations. With a SAN, new capacity can be added as required, without disrupting ongoing operations. SANs enable disk storage to be scaled independently of servers.

Improved backup and recovery


With data doubling every year, what effect does this have on the backup window? Backup to tape, and recovery, are operations that are problematic in parallel SCSI or LAN-based environments. For disk subsystems attached to specific servers, two options exist for tape backup: either it must be done to a server-attached tape subsystem, or by moving data across the LAN.

Tape pooling
Providing tape drives to each server is costly, and also involves the added administrative overhead of scheduling the tasks and managing the tape media. SANs allow for greater connectivity of tape drives and tape libraries, especially at greater distances. Tape pooling is the ability for more than one server to logically share tape drives within an automated library. This can be achieved by software management, using tools such as Tivoli Storage Manager, or with tape libraries with outboard management, such as IBM's 3494.

LAN-free and server-free data movement


Backup using the LAN moves the administration to centralized tape drives or automated tape libraries. However, at the same time, the LAN experiences very high traffic volume during the backup or recovery operations, and this can be extremely disruptive to normal application access to the network. Although backups can be scheduled during non-peak periods, this may not allow sufficient time. Also, it may not be practical in an enterprise which operates in multiple time zones.


We illustrate loading the IP network in Figure 1-4.

Figure 1-4 Loading the IP network (in LAN backup/restore today, both backup/restore control information and data move between Storage Manager clients and the Storage Manager server across the existing IP network used for client/server communications)

A SAN provides the solution by enabling the elimination of backup and recovery data movement across the LAN. Fibre Channel's high bandwidth and multi-path switched fabric capabilities enable multiple servers to stream backup data concurrently to high speed tape drives. This frees the LAN for other application traffic. IBM's Tivoli software solution for LAN-free backup offers the capability for clients to move data directly to tape using the SAN. A future enhancement to be provided by IBM Tivoli will allow data to be read directly from disk to tape (and tape to disk), bypassing the server. This solution is known as server-free backup.


Chapter 2. DB2 UDB, NAS, and SAN terminology and concepts


When establishing a DB2 UDB server environment that takes advantage of network attached storage or Storage Area Network, it is helpful to have an understanding of some of the terminology and concepts that might be encountered in product documentation. This chapter is designed to introduce some of the DB2 UDB, NAS, and SAN terminology and concepts that might be encountered in documentation and that will be used later in the book.

© Copyright IBM Corp. 2002. © Copyright Network Appliance Inc. 2002.


2.1 DB2 terminology and concepts


In this section we describe some of the terminology and concepts that are related to DB2 Universal Database.

2.1.1 Instances
DB2 Universal Database sees the world as a hierarchy of several different types of objects. Workstations on which any edition of DB2 Universal Database has been installed are known as system objects, and they occupy the highest level of this hierarchy. System objects can represent systems that are accessible to other DB2 clients or servers within a network, or they can represent stand-alone systems that neither have access to nor can be accessed from other DB2 clients or servers.

When any edition of DB2 Universal Database is installed on a particular workstation (or system), program files for the DB2 Database Manager are physically copied to a specific location on that workstation, and one instance of the DB2 Database Manager is created and assigned to the system as part of the installation process. (Instances comprise the next level in the object hierarchy.) If needed, additional instances of the DB2 Database Manager can be created for a particular system; multiple instances can be used to separate the development environment from the production environment, tune the DB2 Database Manager for a particular environment, and protect sensitive information from unauthorized access.

Each time a new instance is created, it references the DB2 Database Manager program files that were stored on that workstation during the installation process; thus, each instance behaves like a separate installation of DB2 Universal Database, even though all instances within a particular system share the same binary code. Although all instances share the same physical code, each can be run concurrently with the others, and each has its own environment, which can be modified by altering the contents of its configuration file.
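As a sketch of how instances are managed on a UNIX server (the instance and user names are examples; db2icrt is normally run as root):

```
# Create a second instance named db2inst2, with db2fenc1 as its fenced user
db2icrt -u db2fenc1 db2inst2

# List all instances defined on this system
db2ilist

# Display and modify the configuration file of the current instance
db2 get dbm cfg
db2 update dbm cfg using SVCENAME db2c_db2inst2
```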


2.1.2 Databases
In its simplest form, a DB2 Universal Database database is a set of related database objects. In fact, when you create a DB2 Universal Database, you are establishing an administrative relational database entity that provides an underlying structure for an eventual collection of database objects (such as tables, views, indexes, and so on). This underlying structure consists of a set of system catalog tables (along with a set of corresponding views), a set of table spaces in which both the system catalog tables and the eventual collection of database objects will reside, and a set of files that will be used to handle database recovery and other bookkeeping details. DB2 UDB allows multiple databases to be defined within a single database instance. Each database has its own configuration file, which allows characteristics of the database, such as memory usage and logging, to be fine tuned for optimum performance.
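A minimal sketch (the database name and storage path are examples):

```
# Create a database whose default storage resides under /db2data
db2 create database mydb on /db2data

# Each database carries its own configuration file
db2 get db cfg for mydb
db2 update db cfg for mydb using LOGRETAIN ON
```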

2.1.3 Buffer pools


A buffer pool is an area of main memory that has been allocated to the DB2 Database Manager for the purpose of caching table and index data pages as they are read from disk or modified. Using a set of heuristic algorithms, the DB2 Database Manager prefetches pages of data that it thinks a user is about to need into one or more buffer pools, and it moves pages of data that it thinks are no longer needed back to disk. This approach improves overall system performance because data can be accessed much faster from memory than from disk. (The fewer times the DB2 Database Manager needs to perform disk I/O, the better the performance.) Each time a new database is created, one buffer pool, named IBMDEFAULTBP, is also created as part of the database initialization process. On UNIX platforms, this buffer pool consists of 1,000 4K (kilobyte) pages of memory; on all other platforms, this buffer pool consists of 250 4K pages of memory.

Page cleaners
To prevent a buffer pool from becoming full, page cleaner agents are used to write modified pages to disk at a predetermined interval (by default, when the buffer pool is 60 percent full) to guarantee the availability of buffer pool pages for future read operations. For example, if you have updated a large amount of data in a table, many data pages in the buffer pool may be updated but not written to disk storage (these pages are known as dirty pages). Since prefetchers cannot place fetched data pages onto the dirty pages in the buffer pool, these dirty pages must be flushed to disk storage so that prefetchers can store needed data pages in the buffer pool.
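The cleaning threshold and the number of page cleaner agents are database configuration parameters; a hedged example (the values shown are illustrative):

```
# Start cleaning when 50% of buffer pool pages are dirty,
# using four asynchronous page cleaner agents
db2 update db cfg for sample using CHNGPGS_THRESH 50 NUM_IOCLEANERS 4
```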


2.1.4 Table spaces


When setting up a new database, one of the first tasks that must be performed is to map the logical database design to physical storage on a system. This is where table spaces come into play. Table spaces are used to control where data in a database is physically stored on a system and to provide a layer of indirection between the database and the container objects in which the actual data resides.

When a database is first created, the following three table spaces are also created and associated with the default buffer pool IBMDEFAULTBP as part of the database initialization process: a catalog table space named SYSCATSPACE, which is used to store the system catalog tables and views associated with the database; a user table space named USERSPACE1, which is used to store all user-defined objects (such as tables, indexes, and so on) along with user data; and a temporary table space named TEMPSPACE1, which is used to store temporary tables that might be created in order to resolve a query. Additional table spaces can be created as needed.

All table spaces are classified according to how their storage space is managed: a table space can be either a system managed space (SMS) or a database managed space (DMS). With SMS table spaces, the operating system's file manager is responsible for allocating and managing the storage space used by the table space. SMS table spaces typically consist of several individual files (representing data objects such as tables and indexes) that are stored in the file system. With DMS table spaces, the table space creator (and in some cases, the DB2 Database Manager) is responsible for allocating the storage space used, and the DB2 Database Manager is responsible for managing it. Essentially, a DMS table space is an implementation of a special-purpose file system that has been designed specifically to meet the needs of the DB2 Database Manager.
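The distinction shows up directly in the CREATE TABLESPACE statement; a sketch (table space names, paths, and sizes are invented for illustration):

```
# SMS: the operating system's file manager allocates space in a directory
db2 "CREATE TABLESPACE sms_data MANAGED BY SYSTEM USING ('/db2/smsdata')"

# DMS: space is pre-allocated; here, a single 5000-page FILE container
db2 "CREATE TABLESPACE dms_data MANAGED BY DATABASE USING (FILE '/db2/dms.cnt' 5000)"
```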


Regardless of how they are managed, three types of table spaces can exist: regular, temporary, and long. Tables that contain user data can reside in regular DMS table spaces. (Indexes can also be stored in regular DMS table spaces.) Tables that contain long field data or large object (LOB) data, such as multimedia objects, can reside in long DMS table spaces. Temporary table spaces are classified as either system or user; system temporary table spaces are used to store internal temporary data that is required during SQL operations such as sorting, reorganizing tables, index creation, and table joins. User temporary table spaces are used to store declared global temporary tables that, in turn, are used to store application specific temporary data.

Containers
Every table space is made up of at least one container, which is essentially an allocation of physical storage that the DB2 Database Manager is given unique access to. Containers essentially provide a way of defining what location on a specific storage device will be made available for storing database objects. Containers may be assigned from file systems by specifying a directory; such containers are identified as PATH containers. Containers may also reference files which reside within a directory. These types of containers are identified as FILE containers and, when used, a specific file size must be specified. Finally, containers may also reference raw devices. Such containers are identified as DEVICE containers, and the device specified must already exist on the system before a DEVICE container can be used. A single table space can span many containers, but each container can only belong to one table space. Figure 2-1 illustrates the relationship between buffer pools, table spaces, and containers.

Chapter 2. DB2 UDB, NAS, and SAN terminology and concepts


Figure 2-1 Relationship between buffer pools, table spaces, and instances (the figure shows the storage hierarchy System > Instance > Database > Buffer Pool, with table spaces TABLESPACE 1, TABLESPACE 2, and TABLESPACE 3 defined within the database and associated with the buffer pool)

Characteristics that affect table space performance


In addition to how a table space is managed, three other table space characteristics must be taken into consideration in order to design a database for optimum performance. These characteristics are:

- Page size
- Extent size
- Prefetch size

Page size
With DB2 UDB, data is transferred between table space containers and buffer pools in discrete blocks that are called pages. (The memory reserved to buffer a page transfer is called an I/O buffer.) The actual page size used by a particular table space is determined by the page size of the buffer pool the table space is associated with. Four different page sizes are available: 4K, 8K, 16K, and 32K. By default, all table spaces that are created as part of the database creation process are assigned a 4K page size.


Extent size
An extent is a unit of space within a container that makes up a table space. When a table space spans multiple containers, data associated with that table space is stored on all of its respective containers in a round-robin fashion; the extent size of a table space represents the number of pages of table data that are to be written to one container before moving to the next container in the list. This helps balance data across all containers that belong to a given table space (assuming all extent sizes specified are equal). Figure 2-2 illustrates how extents are used to balance data across multiple containers.

Figure 2-2 Table space containers and extents (1 extent = 32 pages by default; extents 0, 2, and 4 are placed on container 0, and extents 1 and 3 on container 1, with data written in a round-robin manner)
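The round-robin placement shown in Figure 2-2 can be modeled in a few lines of Python (a conceptual sketch, not DB2 code; the extent and container numbering simply mirrors the figure):

```python
# Conceptual model of DB2's round-robin extent placement: extent n of a
# table space lands on container (n mod number-of-containers).

def place_extents(num_extents, num_containers):
    """Map each container to the list of extents stored on it."""
    layout = {c: [] for c in range(num_containers)}
    for extent in range(num_extents):
        layout[extent % num_containers].append(extent)
    return layout

# Five extents across the two containers of Figure 2-2:
print(place_extents(5, 2))  # {0: [0, 2, 4], 1: [1, 3]}
```

With equal extent sizes this scheme keeps the containers balanced, which is exactly why the text recommends matching extent sizes across containers.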

Prefetch size
Prefetching is a technique that the DB2 Database Manager uses to fetch pages of data that it thinks a user is about to need into one or more buffer pools before requests for the data are actually made. Thus, the prefetch size of a table space identifies the number of pages of table data that are to be read in advance of the pages currently being referenced by a query, in anticipation that they will be needed to resolve the query. The overall objective of sequential prefetching is to reduce query response time. This can be achieved if page prefetching can occur asynchronously to query execution.


Sequential prefetches read consecutive pages into the buffer pool before they are needed. List prefetches, however, are more complex: in this case the DB2 optimizer attempts to optimize the retrieval of randomly located data. The amount of data being prefetched determines the amount of parallel I/O activity. Ordinarily, the database administrator should define a prefetch value large enough to allow parallel use of all of the available containers, and therefore of all the physical devices used.

2.1.5 Tables, indexes, and long data


If you look closely at how most data is stored in a database, you will find that it is stored as three separate objects: as a data object, which is where regular user data is stored; as an index object, which is where indexes defined on the table are stored; and as a long field object, which is where long field data is stored if the table contains one or more long data columns. Each of these three objects is stored separately and each can be stored in its own table space provided DMS table spaces are used.

Tables
Tables are uniquely identified units of storage that are maintained within a table space. Each table is a logical structure that is used to present data as a collection of unordered rows with a fixed number of columns. Every column contains a set of values of the same data type (or one of its subtypes) and the definition of the columns in a table make up the table structure (the rows contain the actual table data). The storage representation of a row is called a record, and the storage representation of a column is called a field. Each intersection of a row and column in a database table contains a specific data item called a value. Data in a table is typically logically related and additional relationships, known as referential constraints, can be defined between two or more tables.

Indexes
An index is an object that contains an ordered set of pointers that refer to a key in a base table. When indexes are used, the DB2 Database Manager can access data directly and more efficiently, because each index provides a direct path to the data through pointers that have been ordered based on the values of the columns the index is associated with. When an index is created, the DB2 Database Manager uses a balanced binary tree (a hierarchical data structure in which each element may have at most one predecessor but may have many successors) to order the values of the key columns in the base table that the index refers to. More than one index may be defined for a given table, and indexes provide a way to assist in the clustering of data.


Long data
All data is classified, to some extent, according to its type (for example, some data might be numerical, whereas other data might be textual). Because a table is composed of one or more columns that are designed to hold data values, each column must be assigned a specific data type. This data type determines the internal representation that will be used to store the data, what the ranges of the data's values are, and what set of operators and functions can be used to manipulate that data once it has been stored. DB2 Universal Database supports 19 different built-in data types (along with an infinite number of user-defined data types that are based on the built-in data types). Of these built-in data types, 5 are designed to store data values that exceed 32,700 bytes in length:

- Varying-length long character string (LONG VARCHAR)
- Varying-length double-byte long character string (LONG VARGRAPHIC)
- Binary large object (BLOB)
- Character large object (CLOB)
- Double-byte character large object (DBCLOB)

These objects, although logically referenced as part of the table, may be stored in their own table space when the base table is stored in a DMS table space. This allows for more efficient access of both the long data and the related table data.

2.1.6 DB2 UDB and parallelism


DB2 UDB uses parallelism to optimize performance when accessing a database. There are different ways in which a task can be performed in parallel. Three factors determine how DB2 will perform a task in parallel: the nature of the task, the database configuration, and the hardware environment. Using these factors, DB2 UDB can initiate either of the following types of parallelism:

- I/O parallelism
- Query parallelism

I/O parallelism
Parallel I/O refers to the process of writing to, or reading from, two or more I/O devices simultaneously. The DB2 Database Manager can take advantage of parallel I/O in situations where multiple storage containers exist for a single table space. When used, I/O parallelism can provide significant improvements in data throughput.


Query parallelism
Query parallelism controls how database operations are performed. DB2 UDB supports two different types of query parallelism: inter-query parallelism and intra-query parallelism. Inter-query parallelism refers to the ability of multiple applications to query a database at the same time. Each query executes independently of the others, but all are executed at the same time. Intra-query parallelism refers to the simultaneous processing of individual parts of a single query, using either intra-partition parallelism, inter-partition parallelism, or both. When intra-partition parallelism is used, what is usually considered to be a single database operation, such as index creation, database loading, or an SQL query, is subdivided into multiple parts, many or all of which can be run in parallel within a single database partition. When inter-partition parallelism is used, what is usually considered to be a single database operation is subdivided into multiple parts, many or all of which are run in parallel across multiple partitions of a partitioned database. Inter-partition parallelism only applies to DB2 UDB Enterprise-Extended Edition (EEE).

2.1.7 Registry and environment variables


In addition to DB2 Database Manager and database configuration parameters, DB2 UDB utilizes several registry and environment variables to configure the system where DB2 UDB has been installed. Because changes made to registry and environment variables impact the entire system, changes should be carefully considered before they are made. (The system command db2set is used to display, set, or remove DB2 registry profile variables.) After setting any registry variable, the DB2 Database Manager must be stopped (db2stop) and then restarted (db2start) in order for the changes to take effect. Two registry variables that should be set when working with NAS are:

- DB2_PARALLEL_IO
- DB2_STRIPED_CONTAINERS

DB2_PARALLEL_IO
When reading data from, or writing data to table space containers, DB2 may use parallel I/O if the number of containers in the database is greater than 1. However, there are situations when it would be beneficial to have parallel I/O enabled for single container table spaces. For example, if the container is created on a RAID device that is composed of more than one physical disk, performance may be improved if read and write calls are issued in parallel.


To force DB2 UDB to use parallel I/O for a table space that only has one container, you use the DB2_PARALLEL_IO registry variable. This variable can be set to an asterisk (*), meaning every table space is to use parallel I/O, or it can be set to a list of table space IDs separated by commas. For example, this command would turn parallel I/O on for all table spaces:
db2set DB2_PARALLEL_IO=*

However, the following command would only turn parallel I/O on for table spaces 1, 2, 4, and 8:
db2set DB2_PARALLEL_IO=1,2,4,8

The DB2_PARALLEL_IO registry variable also affects table spaces with more than one container defined. If this registry variable is not set, the I/O parallelism used is equal to the number of containers in the table space. However, if this registry variable is set, the I/O parallelism used is equal to the result of (prefetch size / extent size). For example, suppose a table space has 2 containers and the prefetch size is 4 times the extent size. If the DB2_PARALLEL_IO registry variable is not set, a prefetch request for this table space will be broken into 2 requests (each request will be for 2 extents). Provided that prefetchers are available to do the work, 2 prefetchers can work on these requests in parallel. If the DB2_PARALLEL_IO registry variable is set, a prefetch request for this table space will be broken into 4 requests (1 extent per request), with the possibility of 4 prefetchers servicing the requests in parallel. In this example, if each of the 2 containers had a single disk dedicated to it, setting the DB2_PARALLEL_IO registry variable might result in contention on those disks, since 2 prefetchers would be accessing each of the 2 disks at once. However, if each of the 2 containers was striped across multiple disks, setting the DB2_PARALLEL_IO registry variable would potentially allow access to 4 different disks at the same time.
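The arithmetic in the example above can be made concrete with a small Python sketch (a conceptual model only; prefetcher scheduling inside DB2 is of course more involved):

```python
# Number of parallel prefetch requests for one prefetch operation:
# without DB2_PARALLEL_IO, one request per container; with it, one
# request per extent (prefetch size / extent size).

def prefetch_request_count(prefetch_pages, extent_pages, containers,
                           parallel_io_set):
    if parallel_io_set:
        return prefetch_pages // extent_pages
    return containers

# 2 containers, extent size 32 pages, prefetch size 4 x extent size:
print(prefetch_request_count(128, 32, 2, parallel_io_set=False))  # 2
print(prefetch_request_count(128, 32, 2, parallel_io_set=True))   # 4
```

The model shows why the setting helps striped containers (more requests can land on more disks) but can hurt single-disk containers (more requests contend for the same disk).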

DB2_STRIPED_CONTAINERS
When creating a DMS table space, a one-page tag is stored at the beginning of each container used for identification purposes. The remaining pages are available for storage by DB2 and are grouped into extent-size blocks of data. When using RAID devices for table space containers, it is suggested that the table space be created with an extent size that is equal to, or a multiple of, the RAID stripe size. However, because of this one-page container tag, the extents will not line up with the RAID stripes; this may cause I/O requests to access more physical disks than would be optimal. This can have a significant impact when the RAID devices are not cached, and do not have special sequential detection and prefetch mechanisms.


To eliminate this problem, the DB2_STRIPED_CONTAINERS registry variable can be used to tell DB2 UDB to use a full extent to store the identification tag in each container used. If this variable is set to ON, every table space created will use a full extent for each container's tag. For example, this command would cause identification tags to be stored in one extent, rather than in one page:
db2set DB2_STRIPED_CONTAINERS=ON
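The alignment problem the tag causes can be illustrated numerically (an assumption-laden sketch: the RAID stripe size is taken to equal the 32-page extent size, and page numbers are idealized):

```python
# Starting page of each data extent in a container, given the size of
# the leading container tag. Extents align with 32-page RAID stripes
# only when the tag itself occupies a full extent.

EXTENT = 32  # pages; assumed equal to the RAID stripe size

def extent_starts(tag_pages, num_extents):
    return [tag_pages + i * EXTENT for i in range(num_extents)]

one_page_tag = extent_starts(1, 3)          # [1, 33, 65]  -> misaligned
full_extent_tag = extent_starts(EXTENT, 3)  # [32, 64, 96] -> aligned

print(all(start % EXTENT == 0 for start in one_page_tag))     # False
print(all(start % EXTENT == 0 for start in full_extent_tag))  # True
```

With the default one-page tag, every extent straddles a stripe boundary, so each extent-sized I/O touches two stripes instead of one.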

2.1.8 A word about DB2EMPFA


As mentioned earlier, because SMS table spaces are managed by the file system rather than by the DB2 Database Manager, a size for each container used does not have to be specified. That's because the file system is responsible for allocating additional storage space as it is needed. By default, SMS table spaces are expanded one page at a time. However, in certain workloads (for example, when doing a bulk insert) it might be desirable to have storage space allocated in extents rather than pages. To force DB2 UDB to expand SMS table spaces one extent at a time, rather than one page at a time, you use the db2empfa utility. The db2empfa tool is located in the bin subdirectory of the sqllib directory in which the DB2 UDB product is installed. Running it causes the multipage_alloc database configuration parameter (which is a read-only configuration parameter) to be set to YES.
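The effect of multipage allocation can be sketched as a simple count of file-system allocation requests (illustrative numbers, not a measurement of DB2 itself):

```python
# Allocation requests needed to grow an SMS table space by a given
# number of pages: one per page by default, one per extent once
# db2empfa has set multipage_alloc to YES.

import math

def allocation_requests(pages_needed, extent_pages, multipage_alloc):
    if multipage_alloc:
        return math.ceil(pages_needed / extent_pages)
    return pages_needed

# Bulk-inserting enough data to add 1000 pages, with 32-page extents:
print(allocation_requests(1000, 32, multipage_alloc=False))  # 1000
print(allocation_requests(1000, 32, multipage_alloc=True))   # 32
```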

2.1.9 Backup and recovery


Most problems that occur with DB2 UDB databases are directly related to media or storage failure, power interruptions, and/or application failure. When one or more transactions (otherwise known as units of work) are unexpectedly interrupted by any of these types of situations, all databases that have been accessed by those transactions are placed in an inconsistent or unstable state. Such databases must be returned to a consistent state before they can be used again. An inconsistent database is returned to a consistent state by executing the RESTART DATABASE command. When this command is executed, all incomplete and/or in-doubt transactions that were still in memory when the interruption occurred are rolled back and all completed transactions that were recorded in the transaction log file, but not applied to the database itself, are committed. (Additional, more complex actions will be performed in a partitioned database environment.)


If the autorestart parameter in a database's configuration file is set to ON, the DB2 Database Manager will automatically execute the RESTART DATABASE command whenever it determines that the database is in an inconsistent state. (The DB2 Database Manager checks the state of a database when it attempts to establish the first connection to that database.)

The RESTART DATABASE command is designed to handle problems that are caused by power interruptions and application failures. However, it cannot correct problems that are caused by media or storage failure. In order to resolve these types of problems, a backup image of the database must exist. Database backup images can be created at any time by executing the BACKUP DATABASE command. After one or more backup images have been created, they can be used to rebuild the database or any of its table spaces if either becomes damaged or corrupted.

The first time a backup image of a database is created, a special file, known as the database recovery history file, is also created. This file is then updated with summary information each time subsequent backup images are made. Because the database recovery history file contains summary information about each backup image available, it is used as a tracking mechanism during a recovery (restore) operation. (Each backup image contains special information in its header that is checked against the records in the recovery history file to verify that the backup image being used corresponds to the database being restored.)

A damaged or corrupted database can be restored to the state it was in when a particular backup image was made by executing the RESTORE DATABASE command. When a database is restored from a backup image, all changes made to that database since the backup image was created will be lost unless roll-forward recovery for that database has been enabled.
Roll-forward recovery is enabled by setting the logretain and/or the userexit parameter in a database's configuration file to ON. When enabled, roll-forward recovery uses information stored in a database's transaction logs to reapply some or all of the changes made to the database since the last backup image was taken. By reapplying changes stored in the transaction logs, a database can be returned to the state it was in just before the restore/roll-forward operation began. The roll-forward process is initiated by executing the ROLLFORWARD DATABASE command. Usually, a roll-forward recovery operation is performed immediately after a full database restore operation is completed.

Note: If a roll-forward recovery operation is to follow a full database restore operation, all database log files associated with that database must be copied to a separate directory, if possible, before the restore operation is performed (otherwise they will be overwritten by the log files stored in the backup image). These log files must then be copied back to their original location before the roll-forward recovery operation is started.
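As a rough sketch of what crash recovery does with the log, consider this toy model in Python (the log format is hypothetical; it redoes committed work and simply omits uncommitted work, whereas real recovery must also undo uncommitted changes that already reached disk):

```python
# Toy RESTART DATABASE: rebuild a consistent view of the data by
# reapplying logged writes of committed transactions and discarding
# writes of transactions that never logged a COMMIT record.

def restart_database(on_disk, log):
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    db = dict(on_disk)
    for rec in log:
        if rec[1] == "write" and rec[0] in committed:
            _, _, key, value = rec
            db[key] = value  # redo committed, logged-but-unapplied work
    return db

log = [
    ("T1", "write", "balance", 500),
    ("T1", "commit"),
    ("T2", "write", "balance", 900),  # T2 was in flight at the failure
]
print(restart_database({"balance": 100}, log))  # {'balance': 500}
```

T1's committed change survives the failure because it was externalized to the log; T2's change is discarded, returning the database to a consistent state.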


Transaction logging
Transaction logging is simply a process that is used to keep track of changes that are made to a database, as they are made. Each time a change is made to a row in a table (by an insert, update, or delete operation), records that reflect that change are written to a log buffer, which is simply a designated area in memory. When a transaction terminates by executing a COMMIT or a ROLLBACK SQL statement, when pages are flushed from a buffer pool by a page cleaner, or when the log buffer becomes full, all log records associated with that transaction or page (or stored in the log buffer) are immediately written from the log buffer to one or more log files stored on disk. Only after all log records associated with the transaction have been externalized to one or more log files does a transaction receive confirmation that the commit or rollback operation has been successfully completed. This ensures that the log records of a completed transaction will not be lost due to a system failure. (Although log records may be written to disk before a commit or a rollback operation is performed, for example when the log buffer becomes full, such early writes do not affect data consistency, because the execution of the COMMIT or ROLLBACK statement itself is eventually logged as well.)

The transaction logger and the buffer pool manager cooperate to ensure that updated information for a data page is not written to the database before its associated log records have been written to the log files. This behavior ensures that the DB2 Database Manager can obtain enough information from the logs to recover a database that has been left in an inconsistent state, for example as a result of a power or application failure.

Two methods of transaction logging are available: circular and archive. Each logging method provides a different level of recovery capability. As long as a database is active, an active transaction log is available, regardless of which method is used.
If the DB2 Database Manager cannot find a log to write to, it will suspend processing until a log file becomes available.

Circular logging
With circular logging, a group of primary online log files is defined and used in a round-robin fashion to provide transaction logging support. Records are written to a log file as a transaction is processed, and records are removed from a log file when the transaction the records are associated with is either committed and externalized to disk or rolled back. Once all records in a primary log file have been processed, the log file is freed up for reuse and will be repopulated the next time it becomes the active log file in the circular cycle. If the DB2 Database Manager determines that the next primary log file needed is unavailable, one or more secondary log files will be allocated and used until all records in the needed primary log file have been processed. Logging will then continue with that primary log file, and the records stored in any allocated secondary log files will be processed. When the DB2 Database Manager determines that a secondary log file is no longer needed (because all of its records have been processed), that log file is removed and its associated memory is freed.

Circular logging is the default logging behavior used when a new database is created. With circular logging, only full, offline backup images of the database can be taken, and roll-forward recovery cannot be performed. When a database is placed in an inconsistent state, it can be returned to a consistent state by utilizing the records stored in the active log to resolve any in-doubt or in-flight transactions. However, if a database becomes damaged or destroyed, it can only be salvaged by using the RESTORE DATABASE command in conjunction with a full, offline backup image that was taken earlier. Unfortunately, any changes made to the database after the backup image was made will be lost.
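The reuse cycle can be sketched as follows (conceptual; the file names follow DB2's Snnnnnnn.LOG pattern but are otherwise illustrative):

```python
# Circular logging: with N primary log files, the k-th log-file switch
# reuses file (k mod N), once that file's records are no longer needed
# for crash recovery.

def log_file_for_switch(switch_number, primary_logs):
    return primary_logs[switch_number % len(primary_logs)]

primaries = ["S0000000.LOG", "S0000001.LOG", "S0000002.LOG"]
cycle = [log_file_for_switch(k, primaries) for k in range(5)]
print(cycle)
# ['S0000000.LOG', 'S0000001.LOG', 'S0000002.LOG',
#  'S0000000.LOG', 'S0000001.LOG']
```

Because old log files are overwritten rather than retained, there is nothing to roll forward through, which is why circular logging permits only full, offline backups.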

Archival logging
Archived logs contain the same log data as circular logs. However, archive log files are not reused by the DB2 Database Manager; instead, they are retained specifically for roll-forward recovery. Here's how archive logging works: once the active log file becomes full, a new active log file is created in the database's log directory, and the current active log file becomes an archive log file. An archive log file can be classified as either online or offline. An online archive log file is immediately available to DB2 for roll-forward recovery and can be found in the database's log directory. An offline archive log file is not immediately available, because it has been moved to a location other than the database's log directory.
When an online backup operation is performed, all transactions continue to be logged, and can be recreated in a future roll-forward recovery operation. When a database is restored from an online backup image, it must be rolled forward at least to the point in time at which the backup operation was completed. For this to happen, the active log file and any archived log files (online or offline) needed must be available when the roll-forward recovery process is initiated. That is because the DB2 Database Manager must be able to access the log files needed, in the proper sequence (whether they are active, online, or offline), to perform the roll-forward recovery operation. Obviously, the more archive log files you have online, the faster the recovery process will be. Because every change to a row includes the before- and after-images of the row, online archive log files can potentially become quite large. Thus, a NAS device is an excellent location for these objects, as well as for the database itself.


The recovery history file


The recovery history file contains certain historical information about major actions that have been performed against a database. Information recorded in the recovery history file is used to assist with the recovery of the database in the event of a failure. The following is a list of the actions that will generate an entry in the history file:

- Backing up the database or a table space
- Restoring the database or a table space from a backup image
- Performing a roll-forward recovery operation on the database or a table space
- Loading a table
- Altering a table space's definition
- Quiescing a table space
- Reorganizing a table (REORG)
- Updating statistics for a table (RUNSTATS)
- Dropping a table

The recovery history file can be queried to obtain information such as when a database was last backed up and when a database was last restored.

2.2 NAS terminology and concepts


In this section we describe some of the terminology and concepts that are related to Network Attached Storage.

2.2.1 Network file system protocols


The two most common file-level protocols used to share files across networks are the Network File System (NFS) for UNIX and the Common Internet File System (CIFS) for Windows. Both are network-based client/server protocols that enable hosts to share resources across a network using TCP/IP. Users manipulate shared files, directories, and devices such as printers as if they were local to, or attached to, the user's own computer. The NAS devices described in this book are pre-configured to support both NFS and CIFS. IBM NAS can also support HTTP and FTP.


Network File System (NFS)


NFS servers make their file systems available to other systems in the network by exporting directories and files over the network. Once exported, an NFS client can mount a remote file system from the exported directory location. NFS controls access by granting system-level user authorization on the client, based on the assumption that a user who is authorized on the system must be trustworthy. Although this type of security is adequate for some environments, it is open to abuse by anyone who can access a UNIX system via the network. (To get around this, NFS can be made secure by using an isolated network in conjunction with VLANs to control which systems have access across the network.) For directory- and file-level security, NFS uses the UNIX concept of file permissions with User (the owner's ID), Group (a set of users sharing a common ID), and Other (meaning all other user IDs). For every NFS request, the IDs are verified against the UNIX file permissions.

NFS is a stateless service. Therefore, any failure in the link will be transparent to both client and server; when the session is re-established, the two can immediately continue to work together again. NFS handles file locking by providing an advisory lock that informs subsequent applications that the file is in use by another application. The ensuing applications can decide whether they want to abide by the lock request or not. This has the advantage of allowing any UNIX application to access any file at any time, even if it is in use. The system relies on good-neighbor responsibility which, though often convenient, clearly is not foolproof. This can be avoided by using the optional Network Lock Manager (NLM), which provides file locking support to prevent conflicting access to open files.
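Advisory locking of this kind can be demonstrated locally with POSIX flock (an illustration of the concept rather than the NFS/NLM wire protocol; like an NFS advisory lock, flock restrains only applications that actually ask about the lock, and this sketch assumes a Unix-like system):

```python
# A polite application checks the advisory lock and backs off; an
# application that never asks can still read the file regardless.

import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"shared data")
os.close(fd)

holder = open(path, "r+")
fcntl.flock(holder, fcntl.LOCK_EX)  # first application takes the lock

contender = open(path, "r+")
try:
    fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
    blocked = False
except BlockingIOError:
    blocked = True  # the cooperating application honors the lock

ignorer = open(path)   # a non-cooperating application...
data = ignorer.read()  # ...reads the "locked" file anyway
print(blocked, data)   # True shared data
```

The lock stops the contender only because the contender asked; the third application never checks and is never stopped, which is exactly the "good-neighbor" property described above.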

Common Internet File System (CIFS)


Another method used to share resources across a network uses CIFS, which is a protocol based on Microsoft's Server Message Block (SMB) protocol. Using CIFS, servers create file shares which are accessible by authorized clients. Clients subsequently connect to the server's shares to gain access to the resource. Security is controlled at both the user and share level: client authentication information is sent to the server before the server will grant access, and CIFS uses access control lists that are associated with the shares, directories, and files, so authentication is required for access. A CIFS session is connection-oriented and stateful. This means that both client and server share a history of what is happening during a session, and they are aware of the activities occurring. If there is a problem and the session has to be re-initiated, a new authentication process must be completed.


2.2.2 File I/O


One of the key differences of a NAS device, compared to direct access storage (DAS), is that all I/O operations use file-level I/O protocols. File I/O is a high-level type of request that, in essence, specifies only the file to be accessed; it does not directly address the storage device. That is done later by other operating system functions in the remote NAS device. A File I/O request specifies the file and the offset into the file. For instance, the I/O may specify "go to byte 1000 in the file (as if the file were a set of contiguous bytes), and read the next 256 bytes beginning at that position." Unlike Block I/O, there is no awareness of a disk volume or disk sectors in a File I/O request. Inside the NAS device, the operating system keeps track of where files are located on disk, and it issues Block I/O requests to the disks to fulfill the File I/O read and write requests it receives.

The network access methods, NFS and CIFS, can only handle File I/O requests to the remote file system. I/O requests are packaged by the node initiating the I/O request into packets to move across the network. The remote NAS file system converts the request to Block I/O and reads or writes the data to the NAS disk storage. To return data to the requesting client application, the NAS software re-packages the data in TCP/IP protocols to move it back across the network. This is illustrated in Figure 2-3.
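The byte-offset request in the example above is exactly what ordinary file I/O looks like in code (a local Python illustration; over NFS or CIFS the same call simply travels across the network to the remote file system):

```python
# File I/O names a file, an offset, and a length -- never a disk block.
# Block addressing is done later, by the file system that owns the disk
# (in the NAS case, the file system inside the appliance).

import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, bytes(range(256)) * 8)  # a 2048-byte demo file
os.close(fd)

with open(path, "rb") as f:
    f.seek(1000)         # "go to byte 1000 in the file..."
    chunk = f.read(256)  # "...and read the next 256 bytes"

print(len(chunk))  # 256
```

Nothing in the request identifies a volume or sector; the offset-to-block translation is entirely the file system's concern.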

Figure 2-3 IBM NAS devices use File I/O (the application server directs a File I/O request over the IP network to the remote file system in the NAS appliance; the file system in the NAS appliance then initiates Block I/O to the NAS disk)


2.2.3 Local Area Networks (LANs)


A Local Area Network (LAN) is simply the connection of two or more computers (nodes) to facilitate data and resource sharing. LANs proliferated from the mid-1980s to address the problem of "islands of information" which occurred with standalone computers within departments and enterprises. LANs typically reside in a single building, or in multiple buildings confined to a limited geographic area; larger areas are spanned by connecting two or more LANs together to form a Wide Area Network (WAN). LAN design is typically based on open systems networking concepts, as described in the network model of the Open Systems Interconnection (OSI) standards of the International Standards Organization (ISO). LAN types are defined by their topology, which is simply how nodes on the network are physically connected together. A LAN may rely on a single topology throughout the entire network, but it typically has a combination of topologies connected using additional hardware.

2.3 Storage Area Network terminology and concepts


The server infrastructure is the underlying reason for all SAN solutions. This infrastructure includes a mix of server platforms such as Windows NT, UNIX, and OS/390. With initiatives such as server consolidation and e-business, the need for SANs will increase.

2.3.1 SAN storage


The storage infrastructure is the foundation on which information relies, and it must therefore support a company's business objectives and business model. In this environment, simply deploying more and faster storage devices is not enough; a new kind of infrastructure is needed, one that provides more enhanced network availability, data accessibility, and system manageability than today's infrastructure provides. The SAN meets this challenge. The SAN liberates the storage device, so it is not on a particular server bus, and attaches it directly to the network. In other words, storage is externalized and can be functionally distributed across the organization. The SAN also enables the centralizing of storage devices and the clustering of servers, which makes for easier and less expensive administration.

Chapter 2. DB2 UDB, NAS, and SAN terminology and concepts

41

2.3.2 SAN fabric


The first element that must be considered in any SAN implementation is the connectivity of storage and server components using technologies such as Fibre Channel. SANs, like LANs, interconnect storage interfaces into a variety of network configurations and across long distances. Much of the terminology used for SANs has its origin in IP network terminology. The hardware that enables workstations and servers to work with storage devices in a SAN is referred to as a fabric. The SAN fabric gives any server the ability to connect to any storage device through the use of Fibre Channel switching technology.

2.3.3 SAN applications


Storage Area Networks (SANs) enable a number of applications that provide enhanced performance, manageability, and scalability to IT infrastructures. These applications are driven in part by technology capabilities, and as the technology matures over time, we are likely to see more and more applications. A few of these applications are described below:

Shared repository and data sharing


SANs enable storage to be externalized from the server and centralized, and in so doing, allow data to be shared among multiple host servers without impacting system performance. The term data sharing describes the access of common data for processing by multiple computer platforms or servers. Data sharing can be between platforms that are similar or different; this is also referred to as homogeneous and heterogeneous data sharing.

Data-copy sharing
Data-copy sharing allows different platforms to access the same data by sending a copy of the data from one platform to the other. There are two approaches to data-copy sharing between platforms: flat file transfer and piping.

Data vaulting and data backup


In most present-day scenarios of data vaulting and data backup to near-line or off-line storage, the primary network (LAN or WAN) is the medium used to transfer both server data (from file and database servers) and end-user client data to the storage media. SANs enable data vaulting and data backup operations on servers to be faster and independent of the primary network, which has led to the delivery of new data movement applications such as LAN-less backup and server-free backup.

42

DB2 UDB exploitation of NAS technology

Clustering
Clustering is usually thought of as a server process providing failover to a redundant server, or as scalable processing using multiple servers in parallel. In a cluster environment, SAN provides the data pipe, allowing storage to be shared.

Data protection and disaster recovery


The highest level of application availability requires avoiding traditional recovery techniques, such as recovery from tape. Instead, new techniques that duplicate systems and data must be architected so that, in the event of a failure, another system is ready to go. Techniques to duplicate the data portion include remote copy and warm standby.

Data protection in environments with the highest level of availability is best achieved by creating secondary, redundant copies of the data through storage mirroring, remote cluster storage, Remote Copy and Extended Remote Copy (XRC), Concurrent Copy, and other High Availability (HA) data protection solutions; these copies are then used in disaster recovery situations. SAN any-to-any connectivity enables these redundant data/storage solutions to be dynamic without impacting the primary network and servers, including serialization and coherency control. Subsystem local copy services, such as SnapShot Copy or FlashCopy, assist in creating copies in high availability environments and for traditional backup, and are therefore not directly applicable to disaster recovery, nor are they part of the SAN itself. See Figure 1-3 on page 20.



Chapter 3. Introduction to the NetApp filer

Network Appliance filers are easy-to-manage appliances that are designed specifically for today's scalable, network-centric IT system architectures. Network Appliance filers provide up to 12 terabytes of disk storage and can be attached directly to a network, rather than to a specific network server. In this chapter we provide an overview of Network Appliance filer technology, and discuss some of the functionality a Network Appliance filer provides.

Copyright IBM Corp. 2002 Copyright Network Appliance Inc. 2002

45

3.1 The Network Appliance Filer


A popular and accelerating trend in networking has been to use appliances (devices that perform a single function very well) rather than general-purpose computers to provide common services to a network environment. Appliances have been successful because they are easier to use, more reliable, and have better price/performance than general-purpose computers. Since 1993, Network Appliance storage appliances, known as filers, have provided fast and simple network data storage to clients using the Network File System (NFS) protocol. In 1996, support for two additional remote file access protocols was added: Common Internet File System (CIFS) for Microsoft Windows networking clients and Hypertext Transfer Protocol (HTTP), primarily for Web clients. Today, a single Network Appliance filer can provide up to 12 terabytes of data storage to any network that uses any combination of the NFS, CIFS, HTTP, and FTP protocols. Network Appliance filers provide fast, scalable, and reliable data management solutions that overcome the challenges of sharing, managing, and protecting data in today's global, high-growth infrastructures. As dedicated appliances, Network Appliance filers are optimized to serve and manage data. They are also designed to dramatically simplify data management, improve overall performance, and ensure continuous availability.

3.2 System architecture


Network Appliance's storage architecture is driven by a robust, tightly-coupled, multi-tasking, real-time micro-kernel called Data ONTAP. This compact, pre-tuned micro-kernel consists of three primary elements:

- A real-time mechanism that is responsible for processing all incoming requests
- A RAID manager
- The Write Anywhere File Layout (WAFL) file system

The basic architecture of a Network Appliance filer can be seen in Figure 3-1.


Figure 3-1 Network Appliance System Architecture

A network interface driver within Data ONTAP is responsible for receiving all incoming NFS, CIFS, HTTP, and FTP requests. As each request is received, it is logged in non-volatile RAM (NVRAM), an acknowledgement is immediately sent back to the requestor, and the processing needed to satisfy the request is initiated. Once initiated, such processing runs uninterrupted, as far as possible. This approach differs from that of traditional file servers, which employ separate processes for handling the network protocol stack, the remote file system semantics, the local file system, and the disk subsystem.

Network Appliance filers use RAID 4 parity protection for all data stored in the disk subsystem. In the event that any disk drive in the RAID subsystem fails, a hot spare disk drive is allocated to that RAID group, and the data on the failed drive is reconstructed on the hot spare, using information stored on the parity disk in the RAID group. While reconstruction occurs, requests for data from the failed disk are served by reconstructing the data on the fly, with no interruption in file service.

The WAFL file system is a UNIX-compatible file system that has been optimized specifically for network file access. Network Appliance's WAFL and RAID technologies were designed together to eliminate many of the performance problems that most file systems experience with RAID; as a result, RAID management is integrated directly into the WAFL file system. By integrating the file system and RAID management, the problems that result when RAID management sits on top of the file system (which is how RAID management is usually implemented) are eliminated.


3.2.1 NVRAM implementation


The filer uses non-volatile RAM (NVRAM) to improve overall response time for network transactions and to prevent NFS/CIFS requests from being dropped because of delays in acknowledging received packets. (NVRAM is special memory that is equipped with battery backup, which allows it to store data for days, even when system power is off.) During a normal system shutdown, the filer turns off the NFS/CIFS service, flushes all cached operations to disk, and turns off the NVRAM. However, if a power failure were to occur, all cached NFS/CIFS requests that have not yet been flushed to disk remain in the NVRAM; as soon as the filer is restarted, the last consistent state on the disks is located and all outstanding requests stored in NVRAM are replayed.

Using NVRAM to maintain a log of uncommitted requests is very different from using NVRAM as a disk cache. When NVRAM is used at the disk layer, it may contain data that is critical to file system consistency. In this case, if the NVRAM fails, the file system may become inconsistent in ways that fsck (a UNIX file system repair utility) cannot correct. WAFL uses NVRAM as a file system journal, not as a cache of disk blocks that need to be changed on the drives. As such, WAFL's use of NVRAM space is extremely efficient. For example, a request to create a file can be described in just a few hundred bytes of information, whereas the actual operation of creating a file on disk might involve changing a dozen or more blocks of information. Because WAFL uses NVRAM as a journal of operations that need to be performed on the drives, rather than the result of the operations themselves, thousands of operations can be journaled in a typical filer NVRAM log.
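To illustrate the journal-versus-cache distinction, the following sketch (hypothetical Python, not Data ONTAP code; the record format and the log_request and replay helpers are our own inventions) shows how a file-creation request can be journaled as one small record and replayed against the last consistent state after a failure:

```python
# Illustrative sketch contrasting an operation journal with a block cache:
# a "create file" request is journaled as one compact record, while
# applying it on disk may touch a dozen or more 4 KB blocks.

import json

journal = []  # stands in for the NVRAM log


def log_request(op: str, **args) -> None:
    """Append a compact description of the incoming request to the journal."""
    journal.append(json.dumps({"op": op, **args}))


def replay(journal, fs) -> None:
    """After a crash, re-apply every journaled request to the last
    consistent file system state."""
    for record in journal:
        entry = json.loads(record)
        if entry["op"] == "create":
            fs[entry["path"]] = b""  # may internally change many blocks


# A create request occupies a few dozen bytes in the journal, not the
# dozen-plus blocks that the on-disk operation would change.
log_request("create", path="/db/data01.dbf")

# Replaying the journal against the last consistent state reconstructs
# the same result the flushed operation would have produced.
fs = {}  # last consistent on-disk state
replay(journal, fs)
assert "/db/data01.dbf" in fs
```

This is why thousands of operations fit in a typical NVRAM log: only the request descriptions are stored, never the resulting disk blocks.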

3.2.2 RAID environment


Redundant Array of Inexpensive Disks (RAID) technology is designed to protect against loss of data in the event a disk failure occurs. Although RAID technology can be implemented at several different levels (each of which has its own advantages and disadvantages), levels 1, 3, and 5 are the most common forms used. The Network Appliance filer uses RAID 4 technology (in which each array consists of a single parity disk and one or more data disks) to protect against disk failure. However, unlike generic RAID 4 and RAID 5 implementations, which are architected without thought to file system structure and activity, WAFL's RAID 4 implementation is heavily optimized to work in tandem with the filer's file system. By optimizing the file system and the RAID layer together, the Network Appliance RAID design provides all the benefits of RAID parity protection without incurring the performance disadvantages that are often associated with general-purpose RAID 4 solutions.


With WAFL's RAID 4 environment, data is written to the data disks in blocks that are 4 KB in size; a group of blocks (known as a stripe) spans the data disks in a RAID group, one block per disk, and the corresponding parity value for each stripe is written to the parity disk. If one block on a disk goes bad, the parity disk within that disk's RAID group is used to recreate the data in that block, and a new block containing the recreated data is created on the disk. If an entire disk fails, the parity disk prevents any data from being lost; when the failed disk is replaced, the parity disk is used to recalculate its entire contents.

3.2.3 Write Anywhere File Layout (WAFL)


WAFL is a UNIX-compatible file system that was written specifically for the Network Appliance filer. In many ways, WAFL is similar to other UNIX file systems, such as the Berkeley Fast File System (FFS) and the IBM TransArc Episode file system:

- WAFL is block based (4 KB blocks, no fragments)
- WAFL uses inodes to describe its files
- WAFL treats directories as specially formatted files

Like the IBM TransArc Episode file system, WAFL uses a set of special-purpose files to store meta-data. WAFL's three most important meta-data files are the inode file, which contains all inodes for the file system; the free block-map file, which identifies all free blocks available; and the free inode-map file, which identifies all free inodes available. (The terms block-map and inode-map are used instead of bit map because these files require more than one bit for each entry.) By keeping meta-data in files, meta-data blocks can be written anywhere on disk; in fact, that is where the name WAFL came from. WAFL has complete flexibility in its write allocation policies because no blocks are permanently assigned to fixed disk locations (as they are in FFS). WAFL uses this flexibility to optimize write performance for the filer's RAID features.

3.2.4 Snapshots
One of the benefits the WAFL file system provides is the ability to make read-only copies of the entire file system as it exists at any given point in time, and to make those copies available to system administrators through special subdirectories that appear in the current (active) file system. Each read-only copy of the file system is called a Snapshot, and a Network Appliance filer can maintain up to 31 Snapshots concurrently.


On any type of file system, each user-visible file and directory is made up of a set of blocks that reside on the physical disk media, and in this respect, WAFL is no different. Because Snapshots operate at the block level of the WAFL file system, when a Snapshot is first taken, every file and directory within the new Snapshot uses the same set of disk blocks that make up the file or directory in the active file system. Because of this, a Snapshot can be created in just a few seconds (since the blocks themselves are not duplicated), and each new Snapshot requires only a minimal amount of additional disk storage space. As files are changed or deleted, new blocks reflecting the changes are created, and the original blocks are marked for reuse unless they are part of a Snapshot (in which case they are retained until the Snapshot that uses them is deleted). Most file systems would implement snapshots by copying the original data to a new block before modifying the original block. WAFL, however, retains the original blocks because the files or directories that reference them still reside in a Snapshot. Hence, Snapshots only start to consume disk space as the file system changes.

Snapshots add a fourth dimension, time, to the file system's contents as a whole. Data can be viewed in its current state, or it can be viewed as it existed at selected instants in the past. And because all blocks that are referenced by files and directories in a Snapshot remain as they were at the point in time the Snapshot was taken, Snapshots provide an efficient way to back up an active system without having to take that system offline.
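The block-sharing behavior described above can be sketched in a few lines. The following is an illustrative model (assumed names, not WAFL internals): taking a Snapshot copies only the block references, and a later change allocates a new block while the Snapshot keeps the old one alive:

```python
# Minimal model of copy-on-write Snapshots: files map to lists of block
# IDs, and a Snapshot is just a second set of references to those blocks.

active = {"file.txt": ["block1", "block2"]}        # file -> block IDs
blocks = {"block1": b"old-a", "block2": b"old-b"}  # block ID -> contents

# Taking a Snapshot copies only the references, not the blocks, which is
# why it completes in seconds and initially consumes almost no space.
snapshot = {name: list(ids) for name, ids in active.items()}

# Changing the file writes a NEW block and repoints the active file.
# The original block is retained because the Snapshot still references it.
blocks["block3"] = b"new-a"
active["file.txt"] = ["block3", "block2"]

assert snapshot["file.txt"] == ["block1", "block2"]  # point-in-time view
assert blocks["block1"] == b"old-a"                  # old data retained
```

Disk space is consumed only as the active file system diverges from the Snapshot, exactly as the text describes.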


Chapter 4. NetApp filer terminology and concepts


In this chapter we provide an in-depth look at some of the Network Appliance filer concepts that were introduced in the previous chapter. We also attempt to identify and define relevant Network Appliance filer terminology that will be used in later chapters.


4.1 Understanding RAID


Redundant Array of Inexpensive Disks (RAID) technology is designed to protect against loss of data in the event a disk failure occurs. In order to understand how the Network Appliance filer implements RAID technology, it is important to understand what levels of RAID are available and how RAID actually works.

4.1.1 Levels of RAID


Although RAID technology can be implemented at several different levels (each of which has its own advantages and disadvantages), levels 1, 3, and 5 are the most common forms used.

RAID level 1 is simply disk mirroring, in which all data is duplicated on two separate disks. RAID 1 is very safe, but it doubles the amount of disk space normally required.

RAID 3 uses a single parity disk and one or more data disks. Data is written to the data disks in stripes, but the stripe size used is so small that each individual read or write operation performed must access all disks in the RAID array. For instance, the first byte in a block of data might be on the first disk, the second byte on the second disk, and so on. RAID 3 is a good fit for applications that require very high data rates for single large files, such as super-computing and graphics processing. However, it performs poorly when used with multi-user applications that require many unrelated disk operations to be performed in parallel, because every operation performed causes traffic to be generated on each disk in the array.

Like RAID 3, RAID 4 uses a single parity disk and one or more data disks. However, with RAID 4, data is written to the data disks in stripes that are much larger. As a result, each data disk in a RAID 4 array can usually satisfy a separate user request at the same time. With RAID 4, if one block on a disk goes bad, the parity disk within that disk's RAID group is used to recalculate the data in that block, and the block is mapped to a new location on disk. If an entire disk fails, the parity disk prevents any data from being lost; when the failed disk is replaced, the parity disk is used to recalculate its contents automatically.

RAID 5 is like RAID 4, but instead of keeping the parity blocks on a single parity disk, it cycles parity blocks among all of the disks in the RAID array (parity for the first stripe is on the first disk, parity for the second stripe on the second disk, and so on). The primary advantage of RAID 5 is that it eliminates the need for a single parity drive, which can become a bottleneck if large amounts of data are written to the RAID group in parallel. The primary disadvantage of RAID 5 is that it is not practical to add a single disk to a disk array once the array has been


created. Instead, if the size of an existing RAID 5 array needs to be increased, the number of disks added must match the current size of the array. Thus, if a RAID 5 implementation uses 7 disks in each array, then disks must be added to the array 7 at a time.

The Network Appliance filer uses RAID 4 technology to protect against disk failure. However, unlike generic RAID 4 and RAID 5 implementations, which are architected without thought to file system structure and activity, Network Appliance's RAID 4 implementation is heavily optimized to work in tandem with the WAFL file system. By optimizing the file system and the RAID layer together, the Network Appliance RAID design provides all the benefits of RAID parity protection without incurring the performance disadvantages that are often associated with general-purpose RAID 4 solutions. And because Network Appliance's RAID 4 design does not interleave parity information like a generic implementation of RAID 5, the overall system can be expanded quickly and easily, even though RAID protection is present. The disk layout for Network Appliance's RAID 4 technology is illustrated in Figure 4-1.

Figure 4-1 Network Appliance's RAID 4 disk layout (one parity disk plus up to 27 data disks; one 4 KB block from each disk forms a stripe)

How the parity disk is used


With RAID 4, if one block on a data disk goes bad, the parity disk within that data disk's RAID group is used to recalculate the data stored in that block, and the block is recreated at a new location on disk. A similar approach is used if an entire data disk fails; when the failed disk is replaced, the parity disk is used to recreate its contents, thereby preventing any data from being lost.

Chapter 4. NetApp filer terminology and concepts

53

As you can see in Figure 4-1 on page 53, a single Network Appliance RAID 4 array consists of one disk that is used for parity and up to twenty-seven disks that are used for storing data. Each disk in the RAID 4 array (or RAID group) is made up of 4 KB blocks; therefore each stripe consists of one block from each data disk and one block from the parity disk. The parity block in each stripe allows data to be recalculated if any one block (on a data disk) in the stripe is lost. (Figure 4-1 on page 53 also shows how a filer's RAID group is divided into stripes, each of which consists of one 4 KB block on the parity disk and one 4 KB block on each of the data disks in the disk array.)

In order to understand how the parity disk works, it helps to think of each 4 KB disk block as if it were a really big integer (32,768 bits long), and of RAID 4 as being responsible for performing simple math operations using these integers. With this in mind, the parity block can be thought of as being the big integer that is the sum of all the data blocks in the stripe. For example:

   Parity = 12, Data 1 = 3, Data 2 = 7, Data 3 = 2   (12 = 3 + 7 + 2)

Thus, if one of the data disks in the RAID group fails, for instance Data 2, then the data stored on that disk can be reconstructed, again by performing simple arithmetic:

   Data 2 = Parity - Data 1 - Data 3 = 12 - 3 - 2 = 7

In reality, the RAID system uses EXCLUSIVE-OR instead of addition and subtraction, and the numbers are much larger. But the math works out the same; using addition and subtraction on small numbers simply makes the technique easier to understand. Lost data is recalculated on the fly so that the system will still run, even if one of the data disks in the RAID group has failed. (The entire contents of a failed disk are recalculated and written to the new disk when the failed disk is replaced.) Of course, if two blocks in a single stripe fail, there is no longer sufficient information available to recalculate the lost data. On the other hand, if the parity disk itself fails, it must be replaced (after which the parity values will be recalculated), but no data stored on the data disks in the RAID group will be lost.
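The EXCLUSIVE-OR version of this arithmetic can be demonstrated in a few lines. The following sketch is purely illustrative (Python, with blocks shrunk from 4 KB to 4 bytes): it builds a parity block for one stripe and then reconstructs a lost data block from the survivors:

```python
# RAID 4 parity by XOR: the parity block is the XOR of all data blocks
# in a stripe, and any single lost block is the XOR of the rest.

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

# One stripe: three data blocks (tiny 4-byte blocks stand in for 4 KB).
data1 = bytes([3, 0, 0, 0])
data2 = bytes([7, 0, 0, 0])
data3 = bytes([2, 0, 0, 0])

# The parity disk stores the XOR of the data blocks in the stripe.
parity = xor_blocks(data1, data2, data3)

# If the disk holding data2 fails, XOR-ing the parity block with the
# surviving data blocks reconstructs the lost block exactly.
recovered = xor_blocks(parity, data1, data3)
assert recovered == data2
```

XOR is its own inverse, which is why the same xor_blocks routine serves both to compute parity and to reconstruct a lost block; with addition, reconstruction would need subtraction instead.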


4.1.2 Eliminating the parity disk bottleneck


Most vendors of RAID peripherals for UNIX and Windows have avoided using RAID 4 technology because, with general-purpose file systems, the parity disk can become a bottleneck. The WAFL file system, on the other hand, uses the flexibility of its write-anywhere layout to write blocks to locations that are efficient for RAID 4. To better understand how the parity disk can become a bottleneck, and how WAFL helps eliminate this problem, it helps to examine how both WAFL and a general-purpose file system behave on top of RAID 4. For example, the Berkeley Fast File System (FFS) was designed to optimize write operations for individual files. Because of this design, FFS typically writes blocks for different files to widely separated locations on disk. The illustration on the left in Figure 4-2 shows how FFS might allocate blocks for 3 unrelated files in a RAID 4 array. While each data disk in the example receives only 2 write operations, the parity disk receives 6 (three times as many). More importantly, the parity writes are widely spread, causing time-consuming seek operations to be performed.

Figure 4-2 FFS and WAFL disk write operation patterns (left: RAID 4 with FFS, parity writes scattered across six stripes; right: RAID 4 with WAFL, writes packed into three adjacent stripes)

Since the FFS file system is not aware of the underlying RAID 4 layout, when read operations are performed, it tends to generate requests for data that is scattered throughout the data disks, which causes the parity disk to seek excessively.


The WAFL file system, on the other hand, writes blocks in a pattern that is designed to minimize seek operations on the parity disk. The illustration on the right of Figure 4-2 on page 55 shows how WAFL allocates the same blocks to make RAID 4 operate efficiently. WAFL always writes blocks to stripes that are near each other, eliminating long seeks on the parity disk. WAFL also writes multiple blocks to the same stripe whenever possible, further reducing traffic on the parity disk. Notice that FFS uses six separate stripes in Figure 4-2 on page 55, so six parity blocks must be updated. In contrast, WAFL uses only 3 stripes, so only 3 parity blocks are updated, and they are all located near each other. As a WAFL file system becomes full, it uses more stripes to write a given number of blocks, which increases the number of parity blocks that need to be updated. Even in a very full file system, however, a small range of cylinders contains many free blocks, so the more important benefit of reducing seeks on the parity disk remains. Like FFS, WAFL reserves 10% of disk space to improve overall performance.
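The parity-traffic difference comes down to counting distinct stripes: the parity disk must be rewritten once for every stripe touched by a batch of data writes. The sketch below makes the comparison concrete (the block placements are hypothetical, chosen only to mimic the scattered-versus-packed patterns of Figure 4-2):

```python
# Each data-block write is a (disk, stripe) pair. The parity disk must
# be updated once per DISTINCT stripe, so fewer stripes = fewer parity
# writes and shorter seeks between them.

def parity_writes(placements) -> int:
    """Count parity-disk updates for a batch of (disk, stripe) writes."""
    return len({stripe for _disk, stripe in placements})

# Six data writes scattered over six widely separated stripes (FFS-like).
ffs_style = [(1, 0), (1, 9), (2, 3), (2, 12), (3, 6), (3, 15)]

# The same six data writes packed into three adjacent stripes (WAFL-like).
wafl_style = [(1, 0), (2, 0), (3, 0), (1, 1), (2, 1), (3, 2)]

assert parity_writes(ffs_style) == 6   # one parity write per data write
assert parity_writes(wafl_style) == 3  # half the parity traffic
```

The packed layout also keeps the touched stripes adjacent, which is what eliminates the long parity-disk seeks described above.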

4.1.3 Using multiple RAID groups


Although data is usually protected from loss when RAID 4 technology is used (because of parity protection), the possibility of a double disk drive failure can never be eliminated. (A given RAID group can survive the loss of a single disk drive with no data loss, but failure of a second disk in the same RAID group, before the first failed disk has been reconstructed, defeats this data protection mechanism.) Not only can the possibility of a double disk drive failure never be eliminated, but protection against double disk drive failure does not scale well to very large filers (or volumes), since the probability of a double disk failure is approximately proportional to the square of the number of data disks used in a RAID group. In other words, doubling the number of disks roughly quadruples the probability of a double disk failure. Network Appliance's Data ONTAP software addresses this potential problem by providing support for multiple RAID groups. The remainder of this section discusses disk failure probability and how multiple RAID groups can be used to reduce this risk.
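The "square of the number of disks" claim can be sanity-checked with a rough model. The sketch below is our simplification (independent failures, each disk failing with probability p during a reconstruction window), not a statement about actual filer failure rates:

```python
# There are C(n, 2) pairs of disks in an n-disk group, and a double
# failure means some pair fails together, so the risk grows roughly
# with n squared.

from math import comb  # Python 3.8+


def p_double_failure(n_disks: int, p: float) -> float:
    """Approximate probability of a double failure: C(n,2) * p^2."""
    return comb(n_disks, 2) * p * p


small = p_double_failure(7, 0.001)   # one 7-disk RAID group
large = p_double_failure(14, 0.001)  # one 14-disk RAID group

# Doubling the disks multiplies the risk by C(14,2)/C(7,2) = 91/21,
# roughly quadrupling it, as the text states.
assert abs(large / small - 91 / 21) < 1e-9

# Splitting the same 14 disks into two 7-disk groups cuts the exposed
# pairs from 91 to 2 * 21 = 42, which is the motivation for supporting
# multiple RAID groups.
two_groups = 2 * p_double_failure(7, 0.001)
assert two_groups < large
```

The last two lines preview the point of the following subsections: multiple smaller RAID groups bound the double-failure exposure that one very large group would have.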

RAID reconstruction
Whenever a disk drive in the disk subsystem of a Network Appliance filer fails, the filer is placed in what is known as degraded mode. Requests for data from the failed disk are served by reconstructing the data on the fly, with no interruption in file service. A new disk drive can be substituted for the failed one at any time, and the image of the data stored on the failed disk will automatically be rebuilt on the replacement disk, still without any interruption in file service.


Furthermore, one or more hot spare disk drives may be configured on a filer. A hot spare is immediately substituted for a failed disk, without human intervention, as soon as the filer enters degraded mode. Additional hot spares allow for the replacement of a subsequent failed disk if physical replacement of the first failed disk drive has not yet been accomplished. Concurrent drive failures can also be accommodated, so long as no two failed drives are in the same RAID group.

Reconstruction of data onto a spare can be prioritized relative to the servicing of incoming client file service requests. This is done by means of the raid.reconstruct_speed filer option. If this option is set to a value of 1, reconstruction proceeds at low priority (and takes a long time to complete), but incoming file service requests are processed expeditiously. However, if this option is set to its maximum value of 10, almost all of the filer's resources are devoted to reconstruction (and it completes quickly), but clients will experience sluggish filer performance. The default value for the raid.reconstruct_speed option is 4.

It is important to note that the raid.reconstruct_speed option controls the total amount of CPU resources that the filer devotes to reconstruction. For a given value, starting an additional reconstruction in another RAID group will not cause more resources to be spent on reconstruction. Instead, each reconstruction will take longer, but client performance will not be significantly reduced except where impacted by competition for access to disks.
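For reference, adjusting this priority from the filer console might look like the following. The `options` command syntax shown here is assumed from the description above; verify it against the documentation for your Data ONTAP release before use:

```shell
# Favor client requests: reconstruction runs at low priority (slow rebuild)
options raid.reconstruct_speed 1

# Favor reconstruction: rebuild completes quickly, clients see sluggish service
options raid.reconstruct_speed 10

# Restore the documented default balance between the two
options raid.reconstruct_speed 4
```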

RAID scrubbing
Far more likely than a second disk drive failing in a RAID group before reconstruction of a previous disk failure has completed is the possibility of an unknown bad block (media error) on an otherwise intact disk. If there are no failed disks within a RAID group, the filer compensates for a bad block by using parity information to recompute the block's original contents, which are then remapped to a spare block elsewhere on the disk when the data in that block is accessed. However, if a bad block is encountered while the filer is in degraded mode (after a disk failure but before reconstruction has completed), then that block's data is irrecoverably lost.

To protect against this scenario, filers routinely verify all data stored in the file system by using a process known as RAID scrubbing. By default, this process is performed once per week, early on Sunday morning, although it can be rescheduled or suppressed altogether. During the RAID scrubbing process, all data blocks are read from RAID groups that have no failed drives. If a media error is encountered, the bad block's data value is recomputed and rewritten to a spare block. Otherwise, parity is recomputed and verified; if the computed parity value does not match the corresponding parity value stored on disk, the parity value on disk is rewritten. It is important to note that all non-degraded RAID groups are scrubbed in parallel.
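The scrub pass can be pictured as a loop over stripes. The sketch below is illustrative Python, not ONTAP code (one-byte "blocks" stand in for 4 KB blocks): it recomputes parity for each stripe of a healthy RAID group and counts any stored parity value that disagrees and would be rewritten:

```python
# A scrub of a non-degraded RAID group: for each stripe, recompute the
# XOR of the data blocks and compare it with the stored parity block.

def xor_all(blocks):
    """XOR a list of equal-length blocks together."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            out[i] ^= b
    return bytes(out)


def scrub(stripes) -> int:
    """stripes: list of (data_blocks, stored_parity) pairs.
    Returns how many parity blocks disagreed and needed rewriting."""
    repaired = 0
    for data_blocks, stored_parity in stripes:
        expected = xor_all(data_blocks)
        if stored_parity != expected:
            # In the real process, the corrected parity value would be
            # rewritten to disk here.
            repaired += 1
    return repaired


good = ([b"\x03", b"\x07"], b"\x04")  # 3 XOR 7 = 4: parity consistent
bad = ([b"\x03", b"\x07"], b"\x05")   # stale parity: would be rewritten
assert scrub([good, bad]) == 1
```

Running this check regularly is what removes latent media errors before a disk failure puts the group into degraded mode, where such an error would mean unrecoverable data.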


4.1.4 Performance and RAID configuration


The addition of support for multiple RAID groups (and multiple volumes) was intended to have no impact on filer performance during normal operation. However, careless administration of these features can lead to a substantial degradation of performance. It may be tempting to start off with a small volume consisting of a single RAID group, adding one disk at a time as demand for storage increases. This works fine until the limit of 28 disks (27 data disks plus one parity disk) within a RAID group is reached. But once the RAID group limit is reached, two disks must be added instead of one. That is because a new RAID group must be started, and that requires two disks: one as the parity disk for the new RAID group, and one as the new data disk.

The Network Appliance filer attempts to distribute writes evenly between all RAID groups in a volume. Therefore, a filer administrator should plan the expansion of volumes such that each RAID group in the volume has approximately the same capacity, and such that no RAID group has fewer than three data disks if performance is a concern. (An example of an exception might be the filer's root volume, which holds little more than the filer's operating system and some host and security information. It would not be beneficial to devote three data disks to this relatively small amount of data, and performance is not a major concern for this volume.) Just as more disks on a filer can provide much better performance, more disks per RAID group also vastly improves filer performance.
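The expansion rule above (one disk at a time inside a group, two disks when a new group must be started) can be captured in a small helper. This is an illustrative sketch; the 28-disk group limit is taken from the text, and the helper name is our own:

```python
# How many disks must be added to gain one more data disk's worth of
# capacity, given the 28-disk (27 data + 1 parity) RAID group limit.

GROUP_LIMIT = 28  # disks per RAID group, including the parity disk


def disks_to_add(current_disks: int) -> int:
    """Disks needed for the next increment of data capacity."""
    if current_disks % GROUP_LIMIT == 0:
        # Current group is full: starting a new group costs two disks,
        # one for the new group's parity and one for the new data.
        return 2
    return 1  # room remains in the current group: just add a data disk


assert disks_to_add(14) == 1  # group not yet full
assert disks_to_add(28) == 2  # group full: next data disk needs a new group
```

An administrator planning growth would also, per the guidance above, keep the resulting groups close to equal capacity rather than leaving a new group with only one or two data disks.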

4.2 WAFL implementation


WAFL is a UNIX-compatible file system that was written specifically for the Network Appliance filer. In many ways, WAFL is similar to other UNIX file systems, such as the Berkeley Fast File System (FFS) and the IBM TransArc Episode file system:

- WAFL is block based (4 KB blocks, no fragments)
- WAFL uses inodes to describe its files
- WAFL treats directories as specially formatted files

Each WAFL inode contains 16 block pointers to indicate which blocks belong to which files. Unlike FFS, all the block pointers in a WAFL inode refer to blocks at the same level. Thus, inodes for files smaller than 64 KB use the 16 block pointers to point to the data blocks that contain the actual file data. Inodes for files larger than 64 KB point to indirect blocks which, in turn, point to the data blocks that contain the actual file data. Inodes for files larger than 64 MB point to doubly indirect blocks, whereas inodes for very small files contain the actual file data itself, rather than block pointers.
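These size thresholds follow directly from the block and pointer sizes. The sketch below assumes 4 KB blocks and 16 pointers per inode as described, plus 4-byte block pointers (so 1,024 pointers fit in one indirect block); the pointer width is our assumption, not stated in the text:

```python
# Maximum file size reachable at each level of inode indirection,
# given the stated WAFL parameters.

BLOCK = 4 * 1024          # 4 KB blocks
POINTERS_IN_INODE = 16    # block pointers per inode
POINTERS_PER_BLOCK = 1024  # 4 KB indirect block / 4-byte pointer (assumed)

direct_limit = POINTERS_IN_INODE * BLOCK                        # 64 KB
single_limit = POINTERS_IN_INODE * POINTERS_PER_BLOCK * BLOCK   # 64 MB


def indirection_level(size: int) -> str:
    """Which level of pointers an inode needs for a file of this size."""
    if size <= direct_limit:
        return "direct"           # inode pointers reference data blocks
    if size <= single_limit:
        return "single indirect"  # inode pointers reference indirect blocks
    return "double indirect"      # inode pointers reference doubly indirect blocks


assert direct_limit == 64 * 1024          # 16 pointers x 4 KB = 64 KB
assert single_limit == 64 * 1024 * 1024   # 16 x 1024 x 4 KB = 64 MB
assert indirection_level(50 * 1024) == "direct"
assert indirection_level(1024 * 1024) == "single indirect"
```

The arithmetic reproduces the 64 KB and 64 MB breakpoints: 16 pointers of 4 KB blocks give 64 KB directly, and routing each of the 16 pointers through a 1,024-entry indirect block multiplies that by 1,024 to give 64 MB.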

58

DB2 UDB exploitation of NAS technology

4.2.1 Meta-data lives in files


Like the IBM TransArc Episode file system, WAFL uses a set of special-purpose files to store meta-data. WAFL's three most important meta-data files are the inode file, which contains all inodes for the file system; the block-map file, which identifies all free blocks; and the inode-map file, which identifies all free inodes. (The terms block-map and inode-map are used instead of bit map because these files require more than one bit for each entry.) The layout of these three meta-data files is shown in Figure 4-3.

Figure 4-3 Layout used by the WAFL file system

By keeping meta-data in files, WAFL can write meta-data blocks anywhere on disk (this is where the name WAFL, which stands for Write Anywhere File Layout, comes from). This write-anywhere design allows WAFL to operate efficiently with the RAID disk subsystem by scheduling multiple writes to the same RAID stripe whenever possible, to avoid the 4-to-1 write penalty that RAID traditionally incurs when just one block in a stripe is updated.

Keeping meta-data in files also makes it easy to increase the size of the file system on the fly. When a new disk is added, the file server automatically increases the sizes of the meta-data files (and the system administrator can increase the number of inodes in the file system manually if the default is too small).

Finally, the write-anywhere design enables the copy-on-write technique that is used by Snapshots. In order for Snapshots to work, WAFL must be able to write all new data, including meta-data, to new locations on disk, instead of overwriting existing data with new data values. If WAFL stored meta-data at fixed locations on disk, this would not be possible.

Chapter 4. NetApp filer terminology and concepts

59

4.2.2 A tree of blocks


A WAFL file system is essentially a tree of blocks. At the root of the tree is the root inode, as shown in Figure 4-3 on page 59. The root inode is a special inode that describes the inode file. The branches of the tree consist of the inodes that describe the rest of the files in the file system, including the block-map and inode-map files. The leaves of the tree are the actual data blocks of all the files stored in the system. Figure 4-4 shows a detailed view of this tree of blocks.

Figure 4-4 WAFL's tree of blocks

As you can see in Figure 4-4, files are made up of individual blocks and large files have additional layers of indirection between the inode and the blocks that contain the actual file data. In order for WAFL to boot, it must be able to find the root of this tree, so the one exception to WAFL's write-anywhere rule is that the block containing the root inode must live at a fixed location on disk where WAFL can find it.


4.2.3 A word about write allocation


Write performance is especially important for network file servers. It has been observed that as read caches get larger at both the client and server, writes begin to dominate the I/O subsystem. This effect is especially pronounced with NFS, which allows very little client-side write caching. The result is that the disks on an NFS server may have 5 times as many write operations as reads. WAFL's design was motivated largely by a desire to maximize the flexibility of its write allocation policies. This flexibility takes three forms:

1. WAFL can write any file system block (except the one containing the root inode) to any location on disk. In FFS, meta-data, such as inodes and bit maps, is kept in fixed locations on disk. This prevents FFS from optimizing writes by, for example, putting both the data for a newly updated file and its inode right next to each other on disk. Since WAFL can write meta-data anywhere on disk, it can optimize writes more creatively.

2. WAFL can write blocks to disk in any order. FFS writes blocks to disk in a carefully determined order so that fsck (a UNIX file system repair utility) can be used to restore file system consistency after an unclean shutdown has occurred. WAFL can write blocks in any order because the on-disk image of the file system changes only when WAFL writes a consistency point. The one constraint is that WAFL must write all the blocks in a new consistency point before it writes the root inode for the consistency point.

3. WAFL can allocate disk space for many NFS operations at once in a single write episode. FFS allocates disk space as it processes each NFS request. WAFL gathers up hundreds of NFS requests before scheduling a consistency point, at which time it allocates blocks for all requests in the consistency point at once.
Deferring write allocation improves the latency of NFS operations by removing disk allocation from the processing path of the reply, and it avoids wasting time allocating space for blocks that are removed before they reach disk. Together, these three features give WAFL extraordinary flexibility in its write allocation policies. The ability to schedule writes for many requests at once enables more intelligent allocation policies, and the fact that blocks can be written to any location and in any order allows a wide variety of strategies. In short:

- WAFL improves RAID performance by writing multiple stripes to disk using the same number of I/O operations that other RAID implementations use to write a single stripe.
- WAFL reduces seek time by writing blocks to locations that are near each other on disk.


- WAFL reduces head-contention when reading large files by placing sequential blocks for a file on a single disk in the RAID array (rather than across multiple disks) whenever possible.

4.3 Snapshots
Understanding that the WAFL file system is a tree of blocks rooted by the root inode is the key to understanding Snapshots. To create a virtual copy of this tree of blocks, WAFL simply duplicates the root inode. (Disk blocks themselves are not copied; rather, every block in the volume's file system is recorded as belonging both to the Snapshot and to the active file system on the volume. This meta-data is what is physically stored in the reserved area of the disks.) WAFL creates a Snapshot by duplicating the root inode that describes the inode file, and it avoids changing blocks that a Snapshot refers to by writing modified data to new locations on disk. Figure 4-5 illustrates how Snapshots work with an active file system.

Figure 4-5 How WAFL creates a Snapshot in an active file system

Figure 4-5 is a simplified diagram of the file system shown in Figure 4-4 (before and after a Snapshot is taken) that leaves out internal nodes in the tree, such as inodes and indirect blocks. Section (a) of Figure 4-5 shows how the active file system (or root inode) looks before a Snapshot is taken. In this scenario, the active file system is contained on the four disk blocks A, B, C, and D. Section (b) of Figure 4-5 shows how WAFL creates a new Snapshot by making a duplicate copy of the root inode.


This duplicate inode becomes the root of a tree of blocks representing the Snapshot, just as the root inode represents the active file system. When the Snapshot inode is created, it points to exactly the same disk blocks as the root inode, so a brand new Snapshot consumes no disk space except for the Snapshot inode itself. (The disk blocks A, B, C, and D are associated with both the active file system and the Snapshot.)

Section (c) of Figure 4-5 on page 62 shows what happens when a user modifies data block D. WAFL writes the new data to block D' on disk and changes the active file system to point to the new block. The Snapshot still references the original block D, which is unmodified on disk. Because disk block D is participating in a Snapshot, it is marked by Data ONTAP as being in use and is not returned to the available disk block pool. If another Snapshot were taken at this time, the meta-data it would describe would cover disk blocks A, B, C, and D'.

WAFL would be very inefficient if it wrote the blocks associated with each NFS write request as they came in. Instead, WAFL gathers up many hundreds of NFS requests and stores them in a write buffer before actually writing to disk. During a write operation, WAFL allocates disk space for all the dirty data in the cache and schedules the required disk I/O. As a result, commonly modified blocks, such as indirect blocks and blocks in the inode file, are written only once per write episode instead of once per NFS write request.

Initially, a Snapshot takes up an insignificant amount of disk space because all blocks referenced by the Snapshot are also referenced by the active file system. Over time, as files in the active file system are modified or deleted, a Snapshot may reference more and more blocks that are no longer used in the active file system. Thus the storage requirements for a Snapshot will increase if it is kept for any extended period of time.
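The block D / block D' behavior can be imitated on any UNIX system with hard links. This is only an analogy for the copy-on-write idea, not how WAFL is implemented; the file and directory names are invented for the sketch:

```shell
# Copy-on-write by analogy: a "Snapshot" keeps a second name for the old inode,
# so rewriting the active copy to a new inode leaves the old contents intact.
mkdir -p active snap1
echo "original D" > active/D
ln active/D snap1/D                 # "Snapshot": same inode, second directory entry
echo "modified D" > active/D.new    # write new data to a new location (new inode)
mv active/D.new active/D            # point the active file system at the new block
cat snap1/D                         # Snapshot still sees "original D"
cat active/D                        # active file system sees "modified D"
```

As with WAFL, the old data block is not freed while the "Snapshot" name still references it.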
By default, 20% of the disk space available to a volume is reserved for Snapshot data. This amount of disk space can be increased or decreased to accommodate the requirements of the Snapshot maintenance plan implemented by the filer system administrator.

To understand just how efficient Snapshots are, it helps to compare WAFL's Snapshots with the IBM TransArc Episode file system's fileset clones. Instead of duplicating the root inode, Episode creates a clone by copying the entire inode file. This approach generates considerable disk I/O and consumes a lot of disk space. For instance, a 10 GB file system with one inode for every 4 KB of disk space would have 320 MB of inodes. In such a file system, creating a Snapshot by duplicating the inodes would generate 320 MB of disk I/O and consume 320 MB of disk space. Creating 10 such Snapshots would consume almost one-third of the file system's space even before any data blocks were modified.
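The 320 MB figure can be checked with shell arithmetic. It works out if each inode occupies 128 bytes, which is our assumption; the inode size is not stated above:

```shell
# One inode per 4 KB of a 10 GB file system, at an assumed 128 bytes per inode
fs_bytes=$(( 10 * 1024 * 1024 * 1024 ))
inodes=$(( fs_bytes / 4096 ))                  # 2,621,440 inodes
inode_file=$(( inodes * 128 ))                 # size of the copied inode file
echo "inode file: $(( inode_file / 1024 / 1024 )) MB"
echo "10 clones:  $(( 10 * inode_file * 100 / fs_bytes ))% of the file system"
```

Ten copies of the inode file come to roughly 31% of the 10 GB file system, matching the "almost one-third" claim above.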


By duplicating just the root inode, WAFL can create Snapshots that require very little disk I/O. And because just the root inode is duplicated, Snapshots can be created every few seconds to guarantee quick recovery after an unclean system shutdown.

4.3.1 Snapshots and the block-map file


Most file systems keep track of free blocks using a bit map with one bit per disk block. If the bit is set, the block is in use. This technique does not work for WAFL because many Snapshots can reference a block at the same time. WAFL's block-map file contains a 32-bit entry for each 4 KB disk block. Bit 0 is set if the active file system references the block, bit 1 is set if the first Snapshot references the block, and so on. A block is in use if any of the bits in its block-map entry are set. Figure 4-6 shows the life cycle of a typical block-map entry.

Figure 4-6 Life cycle of a block-map file entry

In this example, at time t1, the block-map entry is completely clear, indicating that the block is available. At time t2, WAFL allocates the block and stores file data in it. When Snapshots are created, at times t3 and t4, WAFL copies the active file system bit into the bit indicating membership in the Snapshot. The block is deleted from the active file system at time t5. This can occur either because the file containing the block is removed, or because the contents of the block are updated and the new contents are written to a new location on disk. The block can't be reused, however, until no Snapshot references it. In Figure 4-6, this occurs at time t8, after both Snapshots that reference the block have been removed.
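The entry's life cycle can be sketched with bitwise arithmetic. The bit assignments follow the description above, and the t1..t8 labels in the comments refer to the stages of Figure 4-6:

```shell
# A 32-bit block-map entry: bit 0 = active file system, bits 1+ = Snapshots
entry=0                              # t1: clear, block is free
entry=$(( entry | 1 ))               # t2: allocated by the active file system
entry=$(( entry | 1 << 1 ))          # t3: first Snapshot copies the active bit
entry=$(( entry | 1 << 2 ))          # t4: second Snapshot
entry=$(( entry & ~1 ))              # t5: deleted from the active file system
echo "in use after deletion: $(( entry != 0 ))"   # still 1: the Snapshots hold it
entry=$(( entry & ~(1 << 1) ))       # first Snapshot removed
entry=$(( entry & ~(1 << 2) ))       # t8: second Snapshot removed
echo "free again: $(( entry == 0 ))"
```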


4.4 Volumes and Quota Trees


Network Appliance's filer software, Data ONTAP, allows system administrators to create multiple volumes on a filer, each of which may be composed of multiple RAID groups, as illustrated in Figure 4-7.

Figure 4-7 NetApp filer with multiple volumes composed of multiple RAID groups

Each volume in turn contains its own WAFL file system, with its own inodes, block-map file, inode-map file, and so on. Although some systems from other vendors allow system administrators to create volumes which can contain other types of objects, in a Network Appliance filer a volume always contains a WAFL file system, and thus the two terms are almost (but not quite) interchangeable. Some system administrators choose to use two smaller filers rather than a single, larger filer. In some cases, this is to limit the risk of data loss due to multiple disk failures (which was discussed earlier). In other cases, it is because it may be more efficient to manage several smaller file systems than one large file system.


The multiple volume feature of the Data ONTAP software provides this management flexibility without requiring the user to purchase multiple physical devices. In effect, it allows multiple logical filers to exist within a single appliance. (A filer can currently have up to 23 volumes, each composed of an integral number of RAID groups, which in turn are comprised of physical disks.)

4.4.1 Quota trees


Quota trees (or qtrees) allow a limit to be placed on the size of a directory tree within a filer, independent of user and group quotas. This is somewhat like the limits enforced on collections of data by the size of a partition in a traditional UNIX or Windows file system, but with the flexibility to subsequently change the limit on a live file system, since quota trees have no connection to a specific range of blocks on disk. Volumes and quota trees bear a superficial resemblance to each other, since both allow chunks of a filer to be carved off in pre-defined sizes and independently exported. However, volumes are mapped to specific collections of disks (RAID groups) and thus are more like partitions in traditional systems. Quota trees are implemented at a higher level than volumes and can therefore offer more flexibility, as shown in Table 4-1. In fact, each volume may contain one or more quota trees.
Table 4-1 Volumes compared to quota trees

Feature                             Volumes                                     Quota trees
Limit size of collection of data    Yes                                         Yes
Implementation                      Mapped to collections of physical devices   An abstraction in software
Granularity                         RAID group (n disks)                        Kilobytes
May reduce allocation               No                                          Yes
May over-commit                     No                                          Yes
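A qtree's size limit is set through the filer's /etc/quotas file rather than at creation time. A sketch of a tree quota entry follows; the 50 GB limit and the qtree path are illustrative values, and the exact syntax should be confirmed against the Data ONTAP documentation for your release:

```shell
# /etc/quotas entry placing a 50 GB tree quota on one qtree (sketch)
/vol/db2_data/db2_data_linux   tree   50G
```

Because the limit is just an entry in this file, it can be raised or lowered later on a live file system, which is the flexibility Table 4-1 refers to.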


Chapter 5. DB2 and the NetApp filer


Now that you have an understanding of DB2 UDB and Network Appliance technology and concepts, we will examine how a DB2 UDB database system can be configured to take advantage of a Network Appliance filer. In this chapter we discuss the steps that we performed to create a test environment using DB2 UDB EE, a Network Appliance F840 filer, and servers running the Linux and AIX operating systems. Specifically, this chapter focuses on how to:

- Configure the Network Appliance filer
- Create filer mount points on the DB2 UDB server
- Configure DB2 so it will use filer mount points

Note: This redbook does not discuss the steps needed to install the network or the steps needed to add a Network Appliance filer to the network. It should be noted, however, that Network Appliance recommends using Gigabit Ethernet, enabling flow control, and using full-duplex on both the database server and the Network Appliance filer, as well as having a dedicated network available to the database system.

Copyright Network Appliance Inc. 2002 Copyright IBM Corp. 2002

67

5.1 DB2/NetApp filer design considerations


You may recall that in Chapter 2, DB2 UDB, NAS, and SAN terminology and concepts on page 23, we saw that all table spaces are classified according to how their storage space is managed: a table space can be either a system managed space (SMS) or a database managed space (DMS). With SMS table spaces, the operating system's file manager is responsible for allocating and managing the storage space used by the table space. SMS table spaces typically consist of several individual files (representing data objects such as tables and indexes) that are stored in a file system. With DMS table spaces, the table space creator (and in some cases, the DB2 Database Manager) is responsible for allocating the storage space used, and the DB2 Database Manager is responsible for managing it. Essentially, a DMS table space is an implementation of a special-purpose file system that has been designed specifically to meet the needs of the DB2 Database Manager.

When designing a DB2 UDB database that is to reside on a Network Appliance filer, SMS table spaces are the easiest type of table space to use. It is possible to use DMS table spaces; however, at this time, only file containers are supported (in a future release of Data ONTAP, device containers will be supported as well). With SMS table spaces, the filer can manage storage space allocation, whereas if DMS table spaces are used, the system or database administrator must monitor the amount of storage space used and manually allocate additional storage space as it is needed.

The key to successfully managing a transaction-based DB2 UDB database (such as an OLTP database) that resides on a Network Appliance filer is to divide the available filer disks into three or more volumes. (Chapter 4, NetApp filer terminology and concepts on page 51, has more information on filer volumes.) By default, one root volume must exist for the Data ONTAP operating system.
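As a sketch, an SMS table space whose container lives on a filer mount point can be created through the DB2 command line processor. The table space name, the database name sample, and the path /db2_data/tbsp1 are invented for the example; any directory on the mounted filer volume would do:

```shell
# Create an SMS table space whose container is a directory on the
# NFS-mounted filer volume (names and paths are hypothetical)
db2 connect to sample
db2 "CREATE TABLESPACE filer_sms MANAGED BY SYSTEM USING ('/db2_data/tbsp1')"
```

With this arrangement the filer's file system grows and shrinks the container files as tables are populated, which is the storage-management advantage of SMS described above.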
At a minimum, two additional volumes should be used to store a database: one to hold the database itself, and one to hold the database's transaction log files. (Additional volumes may be required, depending upon the anticipated database size, the number of table spaces needed, the number of table space containers desired, and the amount of logging activity expected.) Such a configuration reduces the overhead required for system and database administration.

Figure 5-1 illustrates the configuration infrastructure used in the test environment we created to produce this section of our redbook. In our test environment, we stored the database on one volume that consisted of eight 36 GB disks (7 data, 1 parity), and we stored the database transaction log files on a second volume that consisted of five 36 GB disks (4 data, 1 parity).


Figure 5-1 Infrastructure used to test DB2 UDB and a Network Appliance filer

5.2 Interacting with a Network Appliance filer


Network Appliance provides two different methods for configuring and managing a filer: a Web interface and a command line interface. In our test environment, we used both methods. To access the Network Appliance filer Web interface, open a Web browser such as Internet Explorer and provide the following URL:
http://[filer_ID]/na_admin

Here, filer_ID is the host IP address or name assigned to the filer. In our test environment, the filer was named terminator and assigned the IP address 9.1.39.40. Therefore, in order to access the Web interface, we used the URLs:
http://9.1.39.40/na_admin
http://terminator/na_admin


Note: In order to use the filer name in a URL, information that associates the filer's name with its IP address must reside on the database server. (Usually, this is done in the /etc/hosts file that resides on the DB2 server.) If you do not know the IP address or name that has been assigned to the filer, contact your system administrator.

Figure 5-2 shows the initial screen of the Network Appliance filer Web interface that appears once a valid filer URL is provided in a Web browser.

Figure 5-2 Initial page of the Network Appliance filer Web interface

From the initial screen of the filer Web interface, you can install filer documentation, view filer documentation, invoke the filer administration tool named FilerView, invoke the filer monitor tool named Filer At-A-Glance, or initiate a technical support call.


5.2.1 Using FilerView


The Web interface tool that is actually used to perform administrative tasks on a Network Appliance filer is named FilerView. FilerView is activated by selecting the FilerView link provided on the initial screen of the filer's Web interface. (Figure 5-2 on page 70 identifies the link that must be selected in order to activate FilerView.) Figure 5-3 shows the screen that appears once the FilerView administrative tool has been activated.

Figure 5-3 FilerView's main screen

As you can see here, the main screen of FilerView consists of a collapsible menu and a display area. As menu items are selected, data entry forms or statistical information associated with the menu item selected are shown in the display area.


5.3 Creating volumes on a Network Appliance filer


If you recall, in Chapter 4, NetApp filer terminology and concepts on page 51, we saw that a volume is a logical collection of two or more disks on a Network Appliance filer. Before a filer can be used to store a DB2 UDB database, one or more volumes must be created on it, and the mount points to those volumes must be created on the database server. The easiest way to obtain and view information about existing volumes on a Network Appliance filer is by selecting Volumes>Manage from the FilerView menu. This sequence of menu selections will cause the Manage Volumes screen to be displayed in the FilerView display area. Figure 5-4 shows how the Manage Volumes screen might look after a filer has just been initialized. From this example, you can see that only one volume (the root volume that contains the Data ONTAP operating system) exists on the filer.

Figure 5-4 Manage Volumes screen

Once the Manage Volumes screen is displayed, new volumes can be created by selecting the Add New Volume link located at the top of the Manage Volumes screen (refer to Figure 5-4). When this link is selected, the Add New Volume data entry screen will be displayed in place of the Manage Volumes screen.


From the Add New Volume screen, you define the properties of each new volume that is to be created. Properties include the name to assign to the volume, the size of the RAID group to use in the volume, the language to use with the volume, the number of disks to assign to the volume, the size of the disks used, and whether or not specific disks are to be used by the volume. When adding a new volume, you have the option of letting the filer automatically select the disks to use (in which case you specify the number of data disks desired), or you can manually select each disk that is to make up the volume.

Note: A RAID group can consist of 2 to 28 disks. By using a smaller RAID group size, the potential for a double disk failure to occur within a single RAID group is reduced (because fewer disks are used). In a normal transaction type of database environment, a good RAID group size to use is 8 or 14. It is important to note that the RAID group size specified is applicable to the current RAID group as well as to future RAID groups that may be added to the volume.

Figure 5-5 on page 75 illustrates what the Add New Volume data entry screen might look like after its data fields have been populated. New volumes can also be created by using a telnet session or a remote shell (rsh command) to issue the following command at the filer:
vol create volname [-r raidsize] [-l language_code] { ndisks[@size] | -d disk1 [disk2 ...] }

For more information on the vol create command, refer to the Network Appliance Manual Pages.

In order to be able to perform roll-forward recovery on any DB2 UDB database, the database's log files must be accessible and kept up-to-date. We recommend that you store database files on one volume and database log files on a separate volume so that they can be backed up independently of each other using Network Appliance Snapshots. Another alternative would be to store the database logs on a separate Network Appliance filer.

For our test environment, we created two volumes that had the following characteristics:

Volume 1
  Name: db2_data
  RAID Group Size: 8
  Language: English (US)
  Automatic Disk Selection: Yes
  Number of Disks: 7
  Disk Size: Any size disks


Volume 2
  Name: db2_logs
  RAID Group Size: 8
  Language: English (US)
  Automatic Disk Selection: Yes
  Number of Disks: 5
  Disk Size: Any size disks

These volumes were created by using the Add New Volume screen. Alternately, these volumes could have been created by using a telnet session or a remote shell (rsh command) to issue the following commands at the filer:
vol create db2_data -r 8 -l en_US 7
vol create db2_logs -r 8 -l en_US 5

Viewing volume statistics


Once new volumes have been created on a Network Appliance filer, statistical information about those volumes can be obtained by viewing the Volumes Report screen. The Volumes Report screen is activated by selecting Volumes>Reports from the FilerView menu. Figure 5-6 on page 76 shows how the Volumes Report screen might look after three volumes (one for Data ONTAP, one for database data, and one for database log files) have been created.

5.4 Creating qtrees on a Network Appliance filer


If you plan on storing several different databases on a single Network Appliance filer, those databases can be stored in individual volumes, or they can reside in individual qtrees that have been created within a single volume. One of the advantages of storing multiple databases within a single volume is that several different databases can be backed up with a single Network Appliance Snapshot. In our test environment, we wanted to create one database using a Linux server and a second database using an AIX server. Because we chose to store these two databases in the same volume, we created individual qtrees for each database and for each database's transaction log files.


Figure 5-5 Add New Volume data entry screen

The easiest way to obtain and view information about existing qtrees on a Network Appliance filer is by selecting Volumes>Qtrees>Manage from the FilerView menu (Figure 5-7 on page 77). This sequence of menu selections will cause the Manage Qtrees screen to be displayed in the FilerView display area. Figure 5-7 on page 77 shows how the Manage Qtrees screen might look after three volumes (one for Data ONTAP, one for database data, and one for database log files) have been created.


Figure 5-6 Volumes Report screen

New qtrees can be created for a particular volume by highlighting the desired volume shown on the Manage Qtrees screen and then clicking the Create button shown just below the volume list (Figure 5-8). When this button is clicked, the Create a new Qtree dialog is displayed, and the user is prompted to provide the name to be assigned to the qtree that is to be created. Figure 5-8 illustrates how the Create a new Qtree dialog might look after its data fields have been populated. New qtrees can also be created by using a telnet session or a remote shell (rsh command) to issue the following command at the filer:
qtree create [qtree_name]

For more information on the qtree create command, refer to the Network Appliance Manual Pages.


Figure 5-7 Manage Qtrees screen

Figure 5-8 shows the Create a new Qtree dialog box.

Figure 5-8 Create a new Qtree dialog

For our test environment, we created two qtrees for each volume; we used a combination of volume names and operating system names to produce the following qtrees:


Volume 1 (db2_data)
  db2_data_aix
  db2_data_linux

Volume 2 (db2_logs)
  db2_logs_aix
  db2_logs_linux

These qtrees were created by using the Create a new Qtree dialog. Alternately, these qtrees could have been created by using a telnet session or a remote shell (rsh command) to issue the following set of commands at the filer:
qtree create /vol/db2_data/db2_data_aix
qtree create /vol/db2_data/db2_data_linux
qtree create /vol/db2_logs/db2_logs_aix
qtree create /vol/db2_logs/db2_logs_linux

Figure 5-9 shows how the Manage Qtrees screen looked after these qtrees were created.

Figure 5-9 Manage Qtrees screen after qtrees for test environment were created


5.5 Managing NFS exports (UNIX only)


Once the appropriate volumes (and, if necessary, qtrees) have been created on the Network Appliance filer, they must be made available to the database server's operating system and subsequently to DB2 UDB. When NFS clients are used to access the filer, volumes and qtrees are made available through NFS exports. NFS exports are managed by a special file named exports that resides in the /etc directory of the root volume that contains the Data ONTAP operating system. The easiest way to obtain, view, and modify information stored in the /etc/exports file is by selecting NFS>Manage Exports from the FilerView menu. This sequence of menu selections will cause the Manage NFS Exports screen to be displayed in the FilerView display area. Figure 5-10 shows how the Manage NFS Exports screen might look after a filer has just been initialized.

Figure 5-10 Manage NFS Exports screen


To add a line to the /etc/exports file (which, in turn, makes a newly created volume or qtree available to an NFS client), highlight a blank line to activate the Apply and Insert Line buttons, then select the Insert Line button to display the Create a New /etc/exports Line dialog. Figure 5-11 illustrates how the Create a New /etc/exports Line dialog might look after its data fields have been populated.

Figure 5-11 Create a New /etc/exports Line dialog

Once an entry for an NFS export has been made in the /etc/exports file, appropriate data access permissions must be specified for that entry. To specify permissions for an entry in the /etc/exports file, highlight the appropriate entry in the Manage NFS Exports screen and select the Add Option button to display the Add Option dialog. Figure 5-12 illustrates how the Add Option dialog is activated; Figure 5-13 illustrates how the Add Option dialog might look after its data fields have been populated. To assign permissions from the Add Option dialog, simply select the access level from the drop-down list box provided and click the OK button. The following types of permissions are available:

Access - allows users from the host or network group to access the filer.
RW - allows read and write operations on the filer.
Root - allows root privileges; a root user can mount directories, change permissions and ownerships, and create or delete directories and files.
A single entry in the /etc/exports file can be assigned multiple permissions; just highlight the appropriate entry and add a new option for each permission needed. Once the permissions have been assigned, click the Apply button located on the Manage NFS Exports screen to write all changes to disk, then select the Export All button to make all changes made to the /etc/exports file effective.
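For reference, the resulting /etc/exports entries take roughly this form. The host name db2server is invented for the sketch, and option spellings may vary between Data ONTAP releases:

```shell
# /etc/exports lines exporting the DB2 qtrees to one database server (sketch)
/vol/db2_data/db2_data_linux  -access=db2server,root=db2server,rw=db2server
/vol/db2_logs/db2_logs_linux  -access=db2server,root=db2server,rw=db2server
```

Granting root as well as rw matters here, because the DB2 server's administrator must be able to mount the exports and set directory permissions on them.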


You can also make changes to the /etc/exports file effective by using a telnet session or a remote shell (rsh command) to issue the following command at the filer:
exportfs [-aiuv] [-o options] [pathname]

For more information on the exportfs command, refer to the Network Appliance Manual Pages.

Figure 5-12 Adding permissions to an export entry

Figure 5-13 Add Option dialog

Chapter 5. DB2 and the NetApp filer

81

Figure 5-14 shows how the Manage NFS Exports screen looked just after we set the appropriate permissions for our test environment and just before we made those changes effective.

Figure 5-14 NFS permissions for our test environment

5.6 Filer volumes and qtrees with DB2 UDB


Once volumes and qtrees have been defined, and the NFS permissions required to use those volumes and qtrees have been set, the next step toward using a Network Appliance filer to store a DB2 UDB database is to configure the operating system on the DB2 UDB server so that it can access the volumes and/or qtrees on the filer. This is a relatively simple process, provided you have system administrator authorization on the database server workstation. Making filer volumes and qtrees available to an operating system typically involves:

1. Logging into the system as a system administrator (usually root on UNIX-based systems).
2. Creating one or more directories that will be used as mount points.


3. Modifying the appropriate file that is used to establish NFS mount points (on Linux, this file is /etc/fstab; on many other UNIX-based systems it is /etc/vfstab) by adding new mount points that associate the appropriate volumes and/or qtrees on the filer with the directories just created.
4. Mounting the filer volumes and/or qtrees to the directories created.
5. Verifying that the mount operation was successful.

For example, the following is the step-by-step procedure we used to make the qtrees we created on the Network Appliance filer available to the Linux DB2 UDB server used in our test environment:

1. Log on to the server as root.
2. Create two mount point directories by executing the following commands:
   mkdir /db2_data
   mkdir /db2_logs

   Important: Make sure the file permissions for these directories are set such that the appropriate users can both read from and write to them. File permissions can be set by executing the UNIX chmod command, along with the appropriate options, for each directory created.

3. Add the following lines to the file /etc/fstab:
   terminator:/vol/db2_data/db2_data_linux /db2_data nfs
   terminator:/vol/db2_logs/db2_logs_linux /db2_logs nfs
4. Save the modified file. (Figure 5-15 shows how /etc/fstab looked when this step was completed.)
5. Mount the filer qtrees by executing the following commands:
   mount /db2_data -o hard,intr,vers=3,proto=udp,suid,rsize=32768,wsize=32768
   mount /db2_logs -o hard,intr,vers=3,proto=udp,suid,rsize=32768,wsize=32768
   Refer to Table 5-1 for a description of each of these options.

   Note: The options shown with the mount commands could instead have been stored in the file /etc/fstab. Storing the options in /etc/fstab ensures that they are used to remount the mount points each time the DB2 server is rebooted.


6. Verify that the filer qtrees have been mounted by executing the following command:
   df -k
   (Figure 5-16 shows the output that was produced when this step was completed on the Linux server used in our test environment.)
7. Log off the server.
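The mount procedure can also be captured in a small script. The sketch below only prints the /etc/fstab entries and mount commands for review; the filer name terminator and the qtree paths match our test environment, so adjust them for yours. To apply the configuration, run the printed lines as root instead of printing them.

```shell
# Print the /etc/fstab entries and mount commands used in our test setup (review only).
FILER=terminator
OUT=""
for name in data logs; do
    qtree="/vol/db2_${name}/db2_${name}_linux"
    mnt="/db2_${name}"
    line="${FILER}:${qtree} ${mnt} nfs"
    cmd="mount ${mnt} -o hard,intr,vers=3,proto=udp,suid,rsize=32768,wsize=32768"
    OUT="${OUT}${line}
${cmd}
"
done
printf '%s' "$OUT"
```

Generating the lines first makes it easy to double-check the qtree-to-mount-point mapping before touching /etc/fstab on a production server.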

Figure 5-15 /etc/fstab file used in Linux test environment

Table 5-1 Mount option descriptions

hard: Indicates that the mount point should never time out and that the DB2 server workstation should not come online without it. When used, this option will cause the DB2 server to hang if the Network Appliance filer is not responding to NFS for any reason. If the DB2 server is booting and the filer cannot be found, the DB2 server will not complete the boot process and DB2 UDB will not start. If the DB2 server is already up and running and the filer quits responding, all I/O to and from the filer will be suspended until the filer is available again.

intr: Allows operator-generated keyboard interrupts to kill a process that is hung while waiting for a response from the Network Appliance filer.

vers: Specifies which NFS version should be used. Some versions of UNIX have been reported to have serious performance problems when running with NFS Version 3; others perform better using NFS Version 3 instead of Version 2. The system administrator should try the vers option with both NFS versions and run with the version that provides the best performance. This option is supported in recent releases of UNIX.

proto: Along with the vers option, this option lets the system administrator choose whether the UDP or TCP protocol should be used. For NFS over local area networks, UDP offers less overhead (and therefore better performance) than TCP. However, if the network connection path between the Network Appliance filer and the DB2 host is prone to lose packets, drop frames, or introduce checksum errors, then TCP can improve performance compared to UDP. We recommend that you run using UDP on a dedicated network connection with a crossover cable between the DB2 server and the filer. If you use UDP, be sure to enable UDP checksums on the DB2 server workstation.

suid: Tells the DB2 server that it should honor the set-uid bit on files mounted at this mount point. If any of the DB2 executables are located on the filer, then using this option is important. If you are putting only the database files on the filer, this option can be omitted. If you use this option, you must also export the file system with the -anon=0 option. For example, the /etc/exports file on the filer should read something like:
   /vol/vol0 -anon=0,root=somepc
   /vol/db2_data -anon=0
   /vol/db2_logs -anon=0

rsize: Tells the DB2 server the size of the read block to use. The default is 32K.

wsize: Tells the DB2 server the size of the write block to use. The default is 32K.
Figure 5-16 Output from df after qtrees were mounted on our Linux server


Once the operating system on the DB2 UDB server has been configured to access the volumes and/or qtrees on the filer, DB2 UDB can use those volumes and/or qtrees as a repository for database data and database transaction log files in the same way that direct-attached storage would be used for the same purpose.

5.7 Creating DB2 UDB databases on a filer


In most database systems that take advantage of network attached storage (NAS), the RDBMS program executables are stored on a server workstation and the database files and logs are stored on the disks provided by the NAS device. We used this same approach in our test environment: we installed the DB2 UDB executables and instance on the local drive of a Linux server, and we created the test databases so that they were stored on the Network Appliance filer.

Note: This redbook does not discuss the steps needed to install the DB2 UDB software on a server. Refer to the appropriate Quick Beginnings Guide for your operating system for specific information on how to install DB2 UDB.

5.7.1 Setting the appropriate environment/registry variables


After DB2 UDB is installed on the server, but before any databases are created, the DB2 UDB environment/registry variables that are applicable to NAS must be set. Recall from Chapter 2, "DB2 UDB, NAS, and SAN terminology and concepts" on page 23, that two DB2 UDB environment/registry variables that should be set when working with NAS are DB2_PARALLEL_IO and DB2_STRIPED_CONTAINERS. (Refer to the "Registry and environment variables" section of that chapter for more information about what these variables are used for and how they are set.) For our test environment, we set the NAS-specific environment/registry variables by executing the following DB2 commands from a system prompt:

db2set DB2_PARALLEL_IO=*
db2set DB2_STRIPED_CONTAINERS=ON
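Before creating any databases, it can be worth confirming that both variables are actually set. The exact output formatting varies by DB2 version, but listing the full registry should show both entries:

```
db2set -all
```

The listing should include DB2_PARALLEL_IO=* and DB2_STRIPED_CONTAINERS=ON; if either is missing, re-issue the corresponding db2set command before proceeding.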


5.7.2 Creating DB2 UDB databases


The process used to create a database on a Network Appliance filer is straightforward; there are three methods that can be used:

- Modify the default database location (path) stored in the DB2 Database Manager configuration file parameter dftdbpath so that it refers to a filer volume or qtree that has been mounted on the server.
- Specify a filer volume or qtree that has been mounted on the server as the location for the database during the database creation process (via the CREATE DATABASE command or the Create Database Wizard that is launched from the Control Center).
- Specify a directory or file in a filer volume or qtree that has been mounted on the server as the location for each of the three default table spaces (SYSCATSPACE, USERSPACE1, and TEMPSPACE1) that are created as part of the database creation process. (Again, this information can be specified via the CREATE DATABASE command or the Create Database Wizard.)

Important: If the third method is used, the second method should also be used to ensure that all files related to a database are physically located on the filer. Otherwise, a filer Snapshot will not capture everything it needs to recover the database (because part of the database will reside on local storage).

Creating a DB2 UDB database using the first method


If the default database location (path) stored in the DB2 Database Manager configuration file parameter dftdbpath is modified so that it refers to a Network Appliance filer volume or qtree that has been mounted on the server, all databases will automatically be created on the filer unless a different location is explicitly specified. The following is the step-by-step procedure we followed to create a database in our test environment using this approach:

1. Store the mount point of the db2_data_linux qtree on the filer in the DB2 Database Manager configuration file parameter dftdbpath by executing the following command:
   db2 update dbm cfg using dftdbpath /db2_data
2. Stop and restart the DB2 Database Manager so the change will take effect by executing the following commands:
   db2stop
   db2start
3. Create a new database (named TEST_DB) by executing the following command:
   db2 create database TEST_DB


Note: If you do not specify table space parameters with the CREATE DATABASE command, the DB2 Database Manager will create three system managed storage (SMS) table spaces using directory containers. These directory containers are created in the subdirectory that is created for the database. Please refer to the IBM DB2 UDB Command Reference for more information about the CREATE DATABASE command.

Creating a DB2 UDB database using the second method


If a Network Appliance filer volume or qtree that has been mounted on the server is specified as the location for the database during the database creation process (via the CREATE DATABASE command or the Create Database Wizard that is launched from the Control Center), the database will be created on the filer, regardless of the location stored in the DB2 Database Manager configuration file parameter dftdbpath. The following is the step-by-step procedure we followed to create a database in our test environment using this approach:

1. Create a database in the db2_data_linux qtree on the filer by executing the following command:
   db2 create database TEST_DB on /db2_data

Creating a DB2 UDB database using the third method


If a directory or file in a filer volume or qtree that has been mounted on the server is specified as the location for each of the three default table spaces (SYSCATSPACE, USERSPACE1, and TEMPSPACE1) that are created as part of the database creation process, the database itself will be created on local storage or on the filer, and all of its data will be stored on the filer. Again, this information can be specified via the CREATE DATABASE command or the Create Database Wizard. The following is the step-by-step procedure we followed to create a database in our test environment using this approach:

1. Create three directories in the db2_data_linux qtree on the filer by executing the following commands:
   mkdir /db2_data/system
   mkdir /db2_data/user
   mkdir /db2_data/temp


Important: Make sure the file permissions for these directories are set such that the appropriate users can both read from and write to them. File permissions can be set by executing the UNIX chmod command, along with the appropriate options, for each directory created.

2. Create a new database (named TEST_DB) on the filer that has three SMS table spaces (also stored on the filer) by executing the following command:
   db2 create database TEST_DB on /db2_data
      USER TABLESPACE MANAGED BY SYSTEM USING ('/db2_data/user')
      CATALOG TABLESPACE MANAGED BY SYSTEM USING ('/db2_data/system')
      TEMPORARY TABLESPACE MANAGED BY SYSTEM USING ('/db2_data/temp')

5.7.3 Verifying the location of a database


After executing the CREATE DATABASE command or using the Create Database Wizard, it is usually a good idea to verify that the database was actually created where you wanted it. DB2 Universal Database uses a set of special files to keep track of where databases are stored and to provide access to both local and remote databases. Because the information stored in these files is used much like the information in an office-building directory, they are referred to as directory files. The following types of directory files are available:

- Local database directory files
- System database directory files
- Node directory files

A local database directory file exists on each path (called a drive on many operating systems) in which a database has been created. This file contains one entry for each database that is physically stored at that location.

A system database directory file exists for each DB2 Database Manager instance. This file resides on the logical disk drive where the DB2 Universal Database product software is installed, and it contains one entry for each database that has been cataloged for a particular instance.

A node directory file is created on each client workstation when the first database partition is cataloged. Like the system database directory file, the node directory file resides on the logical disk drive where the DB2 Universal Database product software is installed. Entries in the node directory are used in conjunction with entries in the system database directory when making connections and instance attachments to remote DB2 database servers.

The contents of the system database directory file can be examined by executing the LIST DATABASE DIRECTORY command. Figure 5-17 shows the output that was produced by the LIST DATABASE DIRECTORY command when it was executed in our test environment after the database TEST_DB was created.


Figure 5-17 Output from LIST DATABASE DIRECTORY command

As you can see in this sample output, the database TEST_DB was successfully created in the Network Appliance filer qtree that the mount point /db2_data was associated with.

Viewing information about database table spaces


When a table space is created, information about that table space is recorded in the database's system catalog tables. Thus, you can view specific information about every table space in a database by querying the appropriate system catalog table. You can also view specific information about table spaces by issuing the LIST TABLESPACES command. When executed, this command obtains and displays the following information about each table space that has been defined for a particular database:

- The internal ID that the DB2 Database Manager assigned to the table space when it was created
- The name that was assigned to the table space
- The method used to manage the table space's storage space (SMS or DMS)
- The type of data the table space was designed to hold (regular data, long data, or temporary data)
- The current state the table space is in


Once you have the internal ID for a particular table space, you can find out where the data for that table space is physically located by executing the LIST TABLESPACE CONTAINERS command. Thus, the LIST TABLESPACES command can be used in conjunction with the LIST TABLESPACE CONTAINERS command to verify that table spaces were created as expected if the third method is used to create a DB2 UDB database on a Network Appliance filer. Figure 5-18 shows the output that was produced by the LIST TABLESPACES command, when it was executed in our test environment after the database TEST_DB was created using the third method available. Figure 5-19 shows the output that was produced by the LIST TABLESPACE CONTAINERS command, when it was executed in our test environment to obtain specific information about the table space TEMPSPACE1.
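Pairing the two commands can be scripted: first extract each table space ID from the LIST TABLESPACES output, then issue a LIST TABLESPACE CONTAINERS command for each ID. The sketch below parses a captured sample of the output (the sample text is illustrative of the output layout); against a live database you would pipe the output of db2 list tablespaces instead, after connecting with db2 connect to TEST_DB.

```shell
# Sample of LIST TABLESPACES output (illustrative; field alignment matches the real command)
sample_output=' Tablespace ID                        = 0
 Name                                 = SYSCATSPACE
 Tablespace ID                        = 1
 Name                                 = TEMPSPACE1
 Tablespace ID                        = 2
 Name                                 = USERSPACE1'

# Extract the numeric IDs, then emit one LIST TABLESPACE CONTAINERS command per ID
ids=$(printf '%s\n' "$sample_output" | awk -F'= ' '/Tablespace ID/ {print $2}')
cmds=""
for id in $ids; do
    cmds="${cmds}db2 list tablespace containers for ${id}
"
done
printf '%s' "$cmds"
```

Running the emitted commands against TEST_DB shows, container by container, whether each table space landed on the filer mount point as intended.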

5.7.4 Improving the performance of SMS table spaces


It was mentioned in Chapter 2, "DB2 UDB, NAS, and SAN terminology and concepts" on page 23, that by default, SMS table spaces are expanded a single page at a time. However, for certain workloads (for example, bulk inserts), it might be desirable to have storage space allocated in extents rather than pages.

Figure 5-18 Output from LIST TABLESPACES command


Figure 5-19 Output from LIST TABLESPACE CONTAINERS command

To force DB2 UDB to expand SMS table spaces one extent at a time, rather than one page at a time, use the db2empfa utility. The db2empfa tool is located in the bin subdirectory of the sqllib directory in which the DB2 UDB product is installed. Running it causes the multipage_alloc database configuration parameter (which is a read-only configuration parameter) to be set to YES. For our test environment, we told the DB2 Database Manager to perform multi-page allocation for our test database by executing the following command from a system prompt:

db2empfa TEST_DB

5.7.5 Changing the storage location of database log files


By default, when a database is created, its transaction log files are written to a subdirectory (SQLOGDIR) of the directory in which the database is created (the actual location can be determined by examining the database configuration parameter logpath), and circular logging is used. If you choose to use archive logging with a database that is stored on a Network Appliance filer, it is usually a good idea to store the database's transaction log files on a volume that is separate from the volume that contains the database. Such a configuration will provide an increase in performance for transaction-based DB2 UDB databases that perform a significant amount of transaction logging.


The location that database log files are to be written to is specified by setting the value of the database configuration parameter newlogpath. Here is the step-by-step procedure we followed to make the database we created in our test environment store its log files on a separate volume of the filer:

1. Store the mount point of the db2_logs_linux qtree on the filer in the database configuration parameter newlogpath by executing the following command:
   db2 update db cfg for TEST_DB using newlogpath /db2_logs
2. Force all connections to the database to be terminated so the change will take effect by executing the following command:
   db2 force applications all


Chapter 6. Backup and recovery options for databases that reside on NetApp filers
In this chapter we describe the steps used to back up and restore a DB2 UDB database using the WRITE SUSPEND, WRITE RESUME, and DB2INIDB commands (which were introduced in DB2 UDB V7.1 FixPak 2) in conjunction with Network Appliance's Snapshot technology.


6.1 Backup methods available


The most common approach to creating a backup image of a database is to terminate all connections to the database, take the database offline, and then, using the BACKUP DATABASE command, make a full backup image of the database. Another approach is to isolate and back up specific portions of a database (table spaces), again by using the BACKUP DATABASE command. With this approach, full, incremental, and/or delta backup images can be made while the database remains online. When an online backup operation is performed, all transactions continue to be logged and can be replayed in a future roll-forward recovery operation.

Once a database has been restored from an online backup image, it must be rolled forward at least to the point in time at which the backup operation was completed. However, in order for this to happen, the active log file and any archived log files (online or offline) needed must be available when the roll-forward recovery process is initiated. That's because the DB2 Database Manager must be able to access the log files needed, in the proper sequence (whether they are active, online, or offline), in order to perform a roll-forward recovery operation. Obviously, the more archive log files you have online, the faster the recovery process will be. Because every change to a row is logged with both the before- and after-images of the row, online archive log files can potentially become quite large. Thus, a Network Appliance filer is an excellent location for these objects, as well as for the database itself.

DB2 Universal Database databases residing on Network Appliance filers can use a faster, more efficient approach to backing up a database: execution of the WRITE SUSPEND, WRITE RESUME, and DB2INIDB commands, coupled with Network Appliance's Snapshot technology. This approach uses the Snapshot capabilities of the Network Appliance filer to create a logical copy of the physical disks on which the database being backed up and its associated log files reside.

6.2 Designing a DB2 database with a filer


When designing a new database that will ultimately reside on a Network Appliance filer, it is important to take backup and recovery needs into consideration. As part of the database design process, you should identify all relationships that will potentially exist between the various objects of the database. These relationships can be at an application level, where transactions will work with one or more tables, as well as at a database level, where referential integrity constraints may exist between tables, or where events against one table might activate triggers that perform operations against other tables.


With these relationships in mind, you should then try to group related database objects together on the same logical filer volume or group of volumes. Placing database objects with dissimilar backup requirements or functions on the same logical filer volume will complicate the use of Snapshots and typically make recovery that much harder. On the other hand, by keeping similar data objects together, the recovery process will be much easier. Note that the archive logs should be stored on a volume that is separate from the one the data is stored on, particularly if roll-forward recovery is to be enabled.

The basic data container in DB2 Universal Database is the table space object. A table space provides a transparent relationship between all other objects in a database and the underlying physical storage they reside on. Essentially, table spaces provide a way to assign the location of objects and data directly to one or more containers (which can be a directory, a file, or a raw device). If a single table space spans more than one container, the DB2 Database Manager will attempt to balance the data load across all containers used.

Specifying quota trees (qtrees for short) on a Network Appliance filer as table space containers is the simplest way to keep logically related DB2 database objects together. Qtrees, which are essentially subdirectories on a logical filer volume, make it easy to control the placement of related database objects on a Network Appliance filer. By creating table spaces that use qtrees as containers, a single Snapshot of a volume can be used to back up multiple related database objects at one time.

One of the advantages of using a Network Appliance filer is that you can create logical volumes from many different physical disk drives. The current maximum size of a logical volume on a Network Appliance filer is 1.4 TB; the current maximum capacity of a Network Appliance F840 filer is 6 TB.
The use of multiple RAID groups increases the number of parity disks and reduces the already low probability of double disk failures. This means that a very large database can be kept in a small number of containers, all of which reside on the same logical Network Appliance filer volume.

Data stored in a database can have different access and update frequencies. Some tables, such as look-up tables, may contain static data that changes rarely, if ever. Other tables may contain volatile data that is updated or altered several times a second. The backup requirements for each are different: tables containing static data need to be backed up far less often than tables containing volatile data. Because Snapshots of volumes on a Network Appliance filer can be taken at different intervals, the database recovery process can be improved by storing tables that hold these two types of data on different logical volumes.


For example, one volume could be defined and used to hold static data, and a Snapshot of this volume could be taken once a week; another volume could be defined and used to hold volatile data, and a Snapshot of this volume could be taken every hour.

Note: The Network Appliance filer will allow you to schedule when Snapshots for selected volumes are to be taken automatically. However, this feature cannot be used if the databases that reside on the selected volumes will be active at the time the Snapshot is to be taken. That's because database logging must be suspended before, and resumed after, each Snapshot is taken. On the other hand, if a database is normally taken offline at regular intervals, the automated Snapshot feature of the Network Appliance filer can be used to capture the desired Snapshots during those intervals.
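For databases that are offline on a predictable schedule, the filer's automatic Snapshot schedule can be set with the snap sched command. The sketch below is an assumption-laden example: the volume name db2_data comes from our test setup, and the arguments are the weekly, nightly, and hourly Snapshot counts, with an optional list of hours for the hourly Snapshots; verify the exact syntax against your Data ONTAP release:

```
snap sched db2_data 0 1 4@8,12,16,20
```

Read as: keep no weekly Snapshots, one nightly Snapshot, and four hourly Snapshots taken at 08:00, 12:00, 16:00, and 20:00.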

6.3 Suspending and resuming database I/O


As databases increase in size, and as heavy usage demands require database systems to be available twenty-four hours a day, seven days a week, the time and hardware required to back up and restore a database also increase substantially. Backing up an entire database, or the table spaces of a large database, can put a strain on system resources, require a considerable amount of additional storage space (to hold the backup images), and reduce the availability of the database system (particularly if the system has to be taken offline before it can be backed up). To help reduce the impact of backing up large databases, DB2 Universal Database Version 7.2 added the ability to suspend and resume database I/O while a database is online. This functionality is provided in the form of three new commands: WRITE SUSPEND, WRITE RESUME, and DB2INIDB.

6.3.1 WRITE SUSPEND


When executed, the write suspend command (SET WRITE SUSPEND FOR DATABASE) suspends all write operations to the table spaces and log files that are used by a particular DB2 Universal Database database. (The suspension of writes to the active log file is designed to prevent partial page writes from occurring until the suspension is lifted.) Read-only transactions are not suspended and can continue execution against the suspended database, provided they do not request a resource that is being held by the suspended I/O process. In addition, while I/O is suspended, applications can continue to process insert, update, and delete operations using data that has been cached in the database's buffer pool(s).


A database connection must exist before this command can be submitted. It is also recommended that the write resume command, which must follow a write suspension, be executed in the same session.

6.3.2 WRITE RESUME


When executed, the write resume command (SET WRITE RESUME FOR DATABASE) lifts an active suspension and allows all write operations to the table spaces and log files that are used by a particular DB2 Universal Database database to continue. Again, a database connection must exist before this command can be submitted.

6.3.3 DB2INIDB
Many storage vendors, including Network Appliance, provide storage solutions that ensure that data is constantly available. One such offering is the ability to make a mirrored copy of a database and then make that mirrored copy available for processing by the same or a different server. To take advantage of these offerings, DB2 Universal Database introduced a utility that is designed specifically to work with mirrored copies of a database. This utility, which is invoked by executing the db2inidb command, was also introduced in Version 7.2. The db2inidb command looks like this:

db2inidb [DatabaseAlias] as [snapshot | standby | mirror]

When executed, this command works with a mirrored copy of a database to do one of the following:

- Perform database recovery using a mirrored copy of a database.
- Put a mirrored copy of a database in the roll-forward pending state so that it can be synchronized with the primary database.
- Allow a mirrored copy of a database to be backed up, thus providing a way to back up a large database without having to take it offline.

Tip: A database connection does not have to exist before this command can be submitted.


Which of these actions is performed is determined by the option that is specified when the db2inidb command is executed:

- snapshot: The mirrored copy of the database will be initialized as a read-only clone of the primary database. (The DB2INIDB snapshot should not be confused with the Network Appliance filer Snapshot.)
- standby: The mirrored copy of the database will be placed in roll-forward pending state. New logs from the primary database can be retrieved and applied to the mirrored copy of the database. The mirrored copy of the database can then be used in place of the primary database if, for some reason, the primary goes down.
- mirror: The mirrored copy of the database will be placed in roll-forward pending state and is to be used as a backup image, which can be used to restore the primary database. If the database is in an inconsistent state, it will remain in that state and any in-flight transactions will remain outstanding.
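As an illustration, the three modes might be invoked as follows against a mirrored copy cataloged under the alias MIRROR_DB (the alias is a hypothetical name used only for this sketch):

```
db2inidb MIRROR_DB as snapshot    # read-only clone for reporting or testing
db2inidb MIRROR_DB as standby     # roll-forward pending; apply the primary's logs
db2inidb MIRROR_DB as mirror      # backup image used to restore the primary
```

In each case the mirrored copy must already be accessible to the server (for example, mounted from a filer Snapshot) and cataloged before db2inidb is run.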

6.4 Using NetApp Snapshots with a DB2 database


Because a Network Appliance Snapshot is a read-only, online image of an entire volume's file system, it is easy to see how a Snapshot, when used in conjunction with the WRITE SUSPEND and WRITE RESUME commands, can be used to quickly create a mirrored copy of an active database. And it is easy to imagine the various ways that such a mirrored copy can be used with the DB2INIDB command. However, when planning to use Snapshot, SnapMirror, and/or SnapRestore technology in conjunction with one or more DB2 Universal Database databases on a Network Appliance filer, it is essential that the database files and the database's corresponding log files be physically stored in two separate volumes on the filer. In the event that a database recovery operation becomes necessary, maintaining separate volumes will enable you to easily restore the database files from the appropriate Snapshot of the database volume, and then perform a roll-forward recovery operation using the original database archive log files. In addition, this will facilitate the use of Network Appliance's Snapshot technology, since Snapshots are used to restore a filer at the volume level.

100

DB2 UDB exploitation of NAS technology

The location used to store a database's log files is determined by the value of the logpath parameter in the database's configuration file. To change a database's log path, issue the following command:

db2 UPDATE DB CFG FOR [DatabaseAlias] USING NEWLOGPATH [Location]

In this command, DatabaseAlias is the alias of the database whose configuration is to be modified, and Location is the location where the database log files are to be stored.
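As a concrete illustration, the command could be issued as shown below. The database alias (SAMPLE) and the new log path (/db2nas/logvol, a directory on a separately mounted filer log volume) are assumptions for illustration only. With DRY_RUN left at its default of echo, the script only prints the command, so it can be reviewed without a DB2 instance present:

```shell
#!/bin/sh
# Hypothetical sketch: move the SAMPLE database's log files to a
# directory on a separate filer log volume. Both names are assumptions.
DB_ALIAS=SAMPLE
NEW_LOG_PATH=/db2nas/logvol

# DRY_RUN defaults to echo, so the command is printed rather than
# executed; set DRY_RUN to empty to run it for real.
DRY_RUN=${DRY_RUN:-echo}

$DRY_RUN db2 UPDATE DB CFG FOR "$DB_ALIAS" USING NEWLOGPATH "$NEW_LOG_PATH"
```

Note that the new log path does not take effect until all connections to the database have terminated and the database is restarted.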

6.4.1 Taking a Snapshot


To take a Snapshot of a DB2 UDB database (and its associated log files) that is stored on a Network Appliance filer:

1. Suspend all I/O being performed by the database by executing the following command:
   db2 set write suspend for database
2. Using a remote shell, create a Snapshot of the filer volume that contains the database by executing the following command:
   rsh -l root db2filer1 snap create [DataVolName] [DataSnapshotName]
3. Using a remote shell, create a Snapshot of the filer volume that contains the database log files by executing the following command:
   rsh -l root db2filer1 snap create [LogVolName] [LogSnapshotName]
4. Resume database I/O by executing the following command:
   db2 set write resume for database

Snapshots of the database and its log files should now exist, and the original database should again be available to all users and applications (in a consistent state).
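Assuming a filer named db2filer1 with a data volume called dbvol and a log volume called logvol (all names are illustrative assumptions), the steps above can be sketched as a small script. Because DRY_RUN defaults to echo, the commands are only printed, so the sequence can be reviewed without a filer or a DB2 instance present:

```shell
#!/bin/sh
# Hedged sketch of the suspend/snapshot/resume sequence; filer and
# volume names are assumptions for illustration.
FILER=db2filer1
DATA_VOL=dbvol
LOG_VOL=logvol
STAMP=$(date +%Y%m%d%H%M%S)   # timestamp used to name the Snapshots

# DRY_RUN defaults to echo (print only); set it to empty to execute.
DRY_RUN=${DRY_RUN:-echo}

$DRY_RUN db2 set write suspend for database                    # 1. suspend I/O
$DRY_RUN rsh -l root $FILER snap create $DATA_VOL "db_$STAMP"  # 2. data volume
$DRY_RUN rsh -l root $FILER snap create $LOG_VOL "log_$STAMP"  # 3. log volume
$DRY_RUN db2 set write resume for database                     # 4. resume I/O
```

Keeping the suspend window this short minimizes the time during which applications see write activity blocked.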

6.4.2 Restoring a DB2 UDB database from a filer Snapshot


A database that has been backed up with a filer Snapshot can be recovered using one of two methods: version recovery and roll-forward recovery. With version recovery, the database is returned to the state it was in the last time a filer Snapshot was taken (that is, when the backup image was made); any changes made since that time are lost. With roll-forward recovery, a database can be returned to the state it was in at a specific point in time, by returning it to the state it was in the last time a filer Snapshot was taken and then rolling it forward, using records stored in its associated transaction log files, to that specific point in time.

Chapter 6. Backup and recovery options for databases that reside on NetApp filers

101

Version recovery
To restore a DB2 UDB database to the state it was in at the point in time that a filer Snapshot was taken (using an existing Snapshot):

1. Shut down the DB2 Database Manager instance by issuing the following command:
   db2stop
2. If the DB2 Database Manager instance cannot be shut down because one or more processes are still active, issue the following commands instead:
   db2 force application all
   db2stop
3. Using a remote shell, restore the database from the Snapshot taken of the filer volume that contains the database by executing the following command:
   rsh -l root db2filer1 vol snaprestore [DataVolName] -f -s [DataSnapshotName]
   The database can also be restored by copying the database files, including the log control file (SQLOGCTL.LFH), from the appropriate Snapshot directory on the filer over the existing database and log control files. (Essentially, all files and directories contained in the database directory should be copied. If you created table spaces that use containers residing in other directories that were captured in the filer Snapshot, those files must be copied as well.)

   Note: Once a volume has been restored from a particular Snapshot, any Snapshots that were taken after that Snapshot will be returned to the available block pool, effectively eliminating them. Therefore, Snapshot recovery should be performed in descending time sequence, using the most current applicable Snapshot first.

4. Place the restored database (which is a mirrored copy of the database) in a consistent state by executing the following command:
   db2inidb [DatabaseAlias] as snapshot
5. Restart the DB2 Database Manager instance by issuing the following command:
   db2start

The database should now be available for use. However, any changes made to the database after the filer Snapshot was taken will no longer be reflected.
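The version-recovery sequence can be sketched as the dry-run script below. The filer name (db2filer1), data volume (dbvol), Snapshot name (db_snap1), and database alias (SAMPLE) are all assumptions for illustration; with DRY_RUN left at its default of echo, the commands are printed rather than executed:

```shell
#!/bin/sh
# Hedged sketch of version recovery from a filer Snapshot; all names
# below are illustrative assumptions.
FILER=db2filer1
DATA_VOL=dbvol
SNAPSHOT=db_snap1
DB_ALIAS=SAMPLE
DRY_RUN=${DRY_RUN:-echo}   # print only; set empty to execute

$DRY_RUN db2 force application all        # ensure no active processes
$DRY_RUN db2stop                          # shut down the instance
$DRY_RUN rsh -l root $FILER vol snaprestore $DATA_VOL -f -s $SNAPSHOT
$DRY_RUN db2inidb $DB_ALIAS as snapshot   # make the restored copy consistent
$DRY_RUN db2start                         # restart the instance
```

The order of the last two steps follows the procedure in the text; only the data volume is restored here, matching the version-recovery scenario.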


Roll-forward recovery
To restore a database to the state it was in at a specific point in time by reapplying changes stored in its associated transaction log files:

1. Shut down the DB2 Database Manager instance by issuing the following command:
   db2stop
2. If the DB2 Database Manager instance cannot be shut down because one or more processes are still active, issue the following commands instead:
   db2 force application all
   db2stop
3. Using a remote shell, restore the database from the Snapshot taken of the filer volume that contains the database by executing the following command:
   rsh -l root db2filer1 vol snaprestore [DataVolName] -f -s [DataSnapshotName]
   The database can also be restored by copying the database files, including the log control file (SQLOGCTL.LFH), from the appropriate Snapshot directory on the filer over the existing database and log control files. (Essentially, all files and directories contained in the database directory should be copied. If you created table spaces that use containers residing in other directories that were captured in the Snapshot, those files must be copied as well.)

   Important: Do not restore the log files from any Snapshot if you want to retain the ability to perform roll-forward recovery on the restored database.

4. Restart the DB2 Database Manager instance by issuing the following command:
   db2start
5. Place the restored database (which is a mirrored copy of the database) in roll-forward pending state, and indicate that it is to be used as a backup image, by executing the following command:
   db2inidb [DatabaseAlias] as mirror
6. Perform a roll-forward recovery operation on the database, using records stored in the database's log files, by executing the following command:
   rollforward database [DatabaseAlias] to end of logs and stop

The database should now be available for use.
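The roll-forward sequence can likewise be sketched as a dry-run script. The filer name (db2filer1), data volume (dbvol), Snapshot name (db_snap1), and database alias (SAMPLE) are illustrative assumptions, and the rollforward command is issued through the db2 CLP; with DRY_RUN at its default of echo, the commands are only printed:

```shell
#!/bin/sh
# Hedged sketch of roll-forward recovery from a filer Snapshot; all
# names below are illustrative assumptions. Note: only the data
# volume is restored, so the current log files survive for replay.
FILER=db2filer1
DATA_VOL=dbvol
SNAPSHOT=db_snap1
DB_ALIAS=SAMPLE
DRY_RUN=${DRY_RUN:-echo}   # print only; set empty to execute

$DRY_RUN db2 force application all
$DRY_RUN db2stop
$DRY_RUN rsh -l root $FILER vol snaprestore $DATA_VOL -f -s $SNAPSHOT
$DRY_RUN db2start
$DRY_RUN db2inidb $DB_ALIAS as mirror     # roll-forward pending state
$DRY_RUN db2 rollforward database $DB_ALIAS to end of logs and stop
```

Restoring the data volume only (never the log volume) is what preserves the log records the final step replays.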


6.4.3 DataLink considerations


Files that have been associated with DATALINK columns defined with the RECOVERY=YES option will be asynchronously backed up by the normal DataLinks Manager processes when the file association is first made in the database. However, filer Snapshot backups of a DB2 database will NOT back up external data that has been linked to the database through a DATALINK column.

If a table that has a DATALINK column is involved in a Snapshot, any roll-forward recovery operation performed after the database is restored should properly re-synchronize DB2 with the DataLinks Manager, because the roll-forward recovery process communicates with the DataLinks Manager to ensure that the proper files are associated with their DATALINK columns.

If a problem is encountered with a DATALINK column during a roll-forward recovery operation, the affected table will be placed in one of several DataLink Reconcile pending states. Executing the RECONCILE command against the table in question will nullify the DATALINK column and place an entry into an exception table, which can then be used to assist with further recovery actions.


Chapter 7. Diagnostics and performance monitoring


In Chapter 1, Introduction to DB2 UDB, NAS, and SAN on page 3, we described the major components that make up a large database system that takes advantage of network attached storage. At times, it may become necessary to analyze the performance of such a database system in order to locate and resolve one or more problem areas that are having an adverse effect. Unfortunately, there is no single tool that can be used to help locate and diagnose problems across all components used in such a system. However, there are tools available for monitoring each individual component of the system. In this chapter, we discuss some of these tools and show you how and when they can or should be used.

Copyright IBM Corp. 2002 Copyright Network Appliance Inc. 2002


7.1 The DB2 Database System Monitor


A powerful tool provided with DB2 Universal Database, known as the Database System Monitor, can acquire information about the current state of a database system, or about its state over a specified period of time. Once collected, this information can be used to:

- Monitor database activity
- Assist in problem determination
- Analyze database system performance
- Aid in configuring and/or tuning the database system

Although the Database System Monitor is often referred to as a single monitor, in reality it consists of several individual monitors that have distinct, but related, purposes. One of these individual monitors is known as the snapshot monitor (not to be confused with Network Appliance Snapshots); the rest are known as event monitors. Both types of monitors can be controlled using graphical user interfaces provided with the Control Center, administrative application programming interface (API) functions, and/or DB2 commands.

7.1.1 The snapshot monitor


The snapshot monitor is designed to provide information about the state of a DB2 UDB instance and the data it controls, and to call attention to situations that appear to be peculiar, irregular, abnormal, or difficult to classify. This information is provided in the form of a series of snapshots, each of which represents what the system looks like at a specific point in time. The information collected by the snapshot monitor is maintained as a count value, a high water mark, or a timestamp value that identifies the last time a specific activity was performed. Snapshot monitor information can be collected for the following items:

- The DB2 Database Manager
- Databases (local, remote, or DCS)
- Applications (local, remote, or DCS)
- The Fast Communications Manager (for internal communications between DB2 agents)
- Buffer pools
- Table spaces
- Tables
- Locks
- Dynamic SQL statements


Snapshot monitor switches


In some cases, obtaining data collected by the snapshot monitor requires additional processing overhead. For example, in order to calculate the execution time of an SQL statement, the DB2 Database Manager must make a call to the operating system to obtain timestamps before and after the statement is executed. Such system calls are generally expensive. Because of this, the snapshot monitor provides system administrators with a great deal of flexibility in choosing what information is collected when a snapshot is taken: the type and amount of information returned (and the amount of overhead required) is determined by the way one or more special switches (known as snapshot monitor switches) have been set. Table 7-1 shows the snapshot monitor switches available, along with a description of the type of information that is collected when each one is set.

Table 7-1 Snapshot monitor switches


Group            Monitor switch   DBM CFG parameter   Information provided
Sorts            SORT             DFT_MON_SORT        Number of heaps used, overflows, sorts performance
Locks            LOCK             DFT_MON_LOCK        Number of locks held, number of deadlocks
Tables           TABLE            DFT_MON_TABLE       Measure of activity (rows read, rows written)
Buffer pools     BUFFERPOOL       DFT_MON_BUFPOOL     Number of reads and writes, time taken
Unit of work     UOW              DFT_MON_UOW         Start times, end times, completion status
SQL statements   STATEMENT        DFT_MON_STMT        Start time, stop time, statement identification


As you can see from the information provided in Table 7-1 on page 107, each snapshot monitor switch available has a corresponding parameter value in the DB2 Database Manager configuration file. By setting a snapshot monitor switch using a DB2 Database Manager configuration parameter, snapshot monitor information can be collected at the instance level, as opposed to the application level. Snapshot monitor switches are set at the instance level using the UPDATE DBM CFG command; they are set at the application level using the UPDATE MONITOR SWITCHES command.

When a snapshot monitor switch is activated from the application level (for example, by issuing the UPDATE MONITOR SWITCHES command from the Command Line Processor), an instance connection is made, and all data collected for the selected switch group(s) is made available to that application/user until the instance connection is terminated. The data collected will differ from that collected by any other application/user that turns on the same snapshot monitor switch(es) at a different point in time. To make snapshot information available and consistent across all instance connections, the default monitor switches should be turned on using the appropriate DB2 Database Manager configuration file parameters.

Note: Typically, when you change the value of a DB2 Database Manager configuration file parameter, you need to stop and restart the DB2 Database Manager instance before the change takes effect. However, changes made to parameters that correspond to the snapshot monitor switches are effective immediately, so you do not need to stop and restart the instance. You do, however, need to terminate and reestablish any active connections before the changes take effect.
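Taking the buffer pool switch as an example, the two levels look like this in practice. This is a dry-run sketch (DRY_RUN defaults to echo, so the commands are printed rather than executed against an instance):

```shell
#!/bin/sh
# Hedged sketch: the two ways to turn on the buffer pool snapshot
# monitor switch, shown as a dry run.
DRY_RUN=${DRY_RUN:-echo}

# Instance level: collected data is available to all connections.
$DRY_RUN db2 UPDATE DBM CFG USING DFT_MON_BUFPOOL ON

# Application level: collected data is private to this instance
# connection and lasts only until the connection terminates.
$DRY_RUN db2 UPDATE MONITOR SWITCHES USING BUFFERPOOL ON
```

The other switches in Table 7-1 follow the same pattern, substituting the appropriate parameter or switch name.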

Examining the current state of the snapshot monitor switches


Before a snapshot is taken, one or more snapshot monitor switches must be turned on. (If all of the available snapshot monitor switches are turned off, only very basic snapshot monitor information will be collected.) Before a particular snapshot monitor switch is turned on, it's a good idea to examine the current state of the available snapshot monitor switches. The easiest way to do this is by executing the GET MONITOR SWITCHES command. Figure 7-1 illustrates how output from the GET MONITOR SWITCHES command looks (the timestamp values shown correspond to the date and time a particular snapshot monitor switch was reset or turned on).


Figure 7-1 Sample GET MONITOR SWITCHES output

The easiest way to examine the current state of the available DB2 Database Manager-level snapshot monitor switches is by executing the GET DBM MONITOR SWITCHES command. Figure 7-2 illustrates how output from the GET DBM MONITOR SWITCHES command looks; again, the timestamp values shown correspond to the date and time a particular snapshot monitor switch was reset or turned on.

Capturing snapshot monitor information


Once a snapshot monitor switch has been turned on, the snapshot monitor collects the appropriate monitor data until the switch is turned back off. To capture and view this data at a specific point in time, a snapshot of monitor values must be taken. Snapshots can be taken by embedding the appropriate API in an application program (see the Administrative API Reference, SC09-2947 for details) or by executing the GET SNAPSHOT command. Figure 7-3 shows an example of snapshot data that was collected and returned for table spaces. Note that the BUFFERPOOL snapshot monitor switch had to be turned ON in order to collect the information shown in Figure 7-3.


Figure 7-2 Sample GET DBM MONITOR SWITCHES output

Figure 7-3 Sample table space-level snapshot output

In the output shown in Figure 7-3, we can see the disk read and write operations that have been performed at the table space level. If there are multiple tables in a table space, the command GET SNAPSHOT FOR TABLES ON [Database] can be used to determine which tables are the most active. Figure 7-4 shows an example of snapshot data that was collected at the table level and returned.


Figure 7-4 Sample Table-level snapshot output

Resetting snapshot monitor counters


As you can see from the examples provided, the output produced by a snapshot contains, among other things, cumulative information about how often a particular activity was performed within a particular time frame. Such cumulative information is collected and stored in a wide variety of counters whose current values are retrieved each time a snapshot is taken. So when, exactly, does the counting start? Counting begins:

- When a snapshot monitor switch is turned on (or when the DB2 Database Manager is restarted after one or more of the configuration parameters that correspond to a snapshot monitor switch have been turned on)
- Each time the counters are manually reset

To use the snapshot monitor effectively, it is usually desirable to obtain snapshot information after a specific period of time has elapsed. To control the window of time that is monitored, the appropriate counters are usually set to zero when monitoring is to begin, and a snapshot is taken once the desired period of time has elapsed. Although snapshot monitor counters can be set to zero by turning all appropriate snapshot monitor switches off and back on, the easiest way to reset all snapshot monitor counters to zero is by executing the RESET MONITOR command. (To reset the counters for all databases within an instance, the command RESET MONITOR ALL should be used.) Resetting the counters to zero effectively restarts all counting, and future snapshots will contain new counter values.
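A timed monitoring window could therefore be scripted as follows. The database name (SAMPLE) and the 60-second window are assumptions for illustration; with DRY_RUN at its default of echo, the commands (including the sleep) are only printed:

```shell
#!/bin/sh
# Hedged sketch of a timed monitoring window: reset the counters,
# wait for the observation period, then take the snapshot.
DB=SAMPLE
WINDOW=60                   # seconds to observe (illustrative)
DRY_RUN=${DRY_RUN:-echo}    # print only; set empty to execute

$DRY_RUN db2 RESET MONITOR FOR DATABASE $DB   # zero the counters
$DRY_RUN sleep $WINDOW                        # let activity accumulate
$DRY_RUN db2 GET SNAPSHOT FOR TABLES ON $DB   # capture the window's counters
```

The snapshot then reflects only the activity that occurred during the window, rather than everything since the switches were turned on.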


7.1.2 Event monitors


Unfortunately, some database activities cannot be monitored easily with the snapshot monitor. Take, for instance, when a deadlock cycle occurs. If the deadlock detector awakes and discovers that a deadlock exists in the database locking system, it randomly selects, rolls back, and terminates one of the transactions involved in the deadlock cycle. Information about this series of events cannot be easily captured by the snapshot monitor because, in all likelihood, the deadlock cycle will have been resolved long before a snapshot can be taken. An event monitor, on the other hand, can capture such information.

Unlike the snapshot monitor, which is used to record the state of database activity at a specific point in time, an event monitor is used to record database activity as soon as a specific event or transition occurs. And although event monitors return information that is very similar to the information returned by the snapshot monitor, the event itself controls when the information is collected. Specifically, event monitors can capture and write system monitor data to a file or a named pipe whenever any of the following events occur:

- A transaction is terminated.
- An SQL statement is executed.
- A deadlock cycle is detected.
- A connection to a database is established.
- A connection to a database is terminated.
- A database is activated.
- A database is deactivated.
- An SQL statement's subsection completes processing (when a database is partitioned).
- The FLUSH EVENT MONITOR SQL statement is executed.

Unlike the snapshot monitor, which resides in the background and is controlled by the settings of the snapshot monitor switches (or the corresponding DB2 Database Manager configuration parameter values), event monitors are created using Data Definition Language (DDL) statements. Because of this, event monitors only gather information for the database in which they have been defined.
Thus, event monitors cannot be used to collect information at the DB2 Database Manager instance level. When an event monitor is created, the event types that will be monitored must be included as part of the event monitor's definition. An event monitor can monitor any of the following types of events:

- DATABASE: Records an event record when the last application disconnects from the database.


- TABLES: Records an event record for each active table when the last application disconnects from the database. (An active table is a table that has changed since the first connection to the database was established.)
- DEADLOCKS: Records an event record for each deadlock event.
- TABLESPACES: Records an event record for each active table space when the last application disconnects from the database.
- BUFFERPOOLS: Records an event record for each buffer pool when the last application disconnects from the database.
- CONNECTIONS: Records an event record for each database connection event, each time an application disconnects from the database.
- STATEMENTS: Records an event record each time an SQL statement is issued by an application.
- TRANSACTIONS: Records an event record each time a transaction completes (by executing a COMMIT or ROLLBACK statement).

Event monitors are created by executing the CREATE EVENT MONITOR SQL statement.

Note: SYSADM or DBADM authority is required to create an event monitor.

Starting and stopping event monitors


Just as one or more snapshot monitor switches must be turned on before a snapshot can be taken, one or more event monitors must be turned on before event monitor data will be collected. (An event monitor is turned on, or made active, automatically each time the database is started if the AUTOSTART option was specified when the event monitor was created.) Event monitors can be turned on or off by executing the SET EVENT MONITOR STATE SQL statement.

Once an event monitor is activated, it sits quietly in the background and waits for one of the events it is associated with to occur. When such an event takes place, the event monitor collects information that is appropriate for the event that fired it and writes that information to the event monitor's target location. If the target location is a named pipe, a stream of data is written directly to the named pipe. If the target location is a directory, the stream of data is written to one or more files. These files are sequentially numbered and have the extension .evt (such as 00000000.evt, 00000001.evt, and so on).

The application at the receiving end of the named pipe an event monitor is writing to is responsible for promptly reading the information written; otherwise, the event monitor will turn itself off if the pipe becomes full. Likewise, an event monitor will turn itself off if all of the file space that was allocated for its output when the event monitor was created is consumed.
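Put together, creating and activating a deadlock event monitor might look like the dry-run sketch below. The monitor name (dlmon) and target directory (/db2data/evmon) are assumptions for illustration, and SYSADM or DBADM authority would be required; with DRY_RUN at its default of echo, the commands are only printed:

```shell
#!/bin/sh
# Hedged sketch: create a deadlock event monitor that writes to a
# hypothetical directory, then activate it. Names are assumptions.
DRY_RUN=${DRY_RUN:-echo}

# Create the monitor; AUTOSTART makes it activate at database startup.
$DRY_RUN db2 "CREATE EVENT MONITOR dlmon FOR DEADLOCKS WRITE TO FILE '/db2data/evmon' AUTOSTART"

# Activate it now (STATE 1 = on, STATE 0 = off).
$DRY_RUN db2 "SET EVENT MONITOR dlmon STATE 1"
```

The target directory must already exist and be writable by the instance owner before the monitor is activated.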


Forcing an event monitor to provide data before it is activated


Because some events, such as database events, do not activate event monitors as frequently as others, it can often be desirable to have an event monitor write its current monitor values to its target location before the triggering event occurs. In such situations, a system administrator can force an event monitor to write all information collected so far to the appropriate output location by executing the FLUSH EVENT MONITOR SQL statement. The event records generated by this statement are recorded in the event monitor log with a partial record identifier. Be aware that flushing an event monitor does not cause its monitor values to be reset. This means that the event monitor record that would have been generated if the FLUSH EVENT MONITOR statement had not been executed will still be generated when the flushed event monitor is triggered normally.

Viewing event monitor data


Because the data files produced by an event monitor are written in binary format, their contents cannot be viewed directly with a text editor. Instead, one of two special utilities provided with DB2 Universal Database must be used:

- Event Analyzer: A GUI tool that reads the information stored in an event monitor data file and produces a listing
- db2evmon: A text-based tool that reads the information stored in an event monitor data file and generates a report

The Event Analyzer is activated by entering the command db2eva at the system command prompt; the db2evmon tool is activated by entering the command db2evmon at the system command prompt. By default, the report produced by the db2evmon utility is displayed on the screen. However, if the report is quite long, you may find it advantageous to redirect the output to a file, which can then be opened with a scrollable text editor or printed.
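For example, db2evmon can be pointed either at the monitor's target directory or at the database and monitor name. The paths and names below (/db2data/evmon, SAMPLE, dlmon) are assumptions for illustration; with DRY_RUN at its default of echo, the commands are only printed:

```shell
#!/bin/sh
# Hedged sketch of two ways to format event monitor output with
# db2evmon; directory, database, and monitor names are assumptions.
DRY_RUN=${DRY_RUN:-echo}

# Format the .evt files found in the monitor's target directory.
$DRY_RUN db2evmon -path /db2data/evmon

# Or look up the monitor's target through the database catalog.
$DRY_RUN db2evmon -db SAMPLE -evm dlmon
```

In practice, appending a redirection such as > report.txt captures the (often long) report in a file for browsing.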

7.2 Operating system monitoring tools


Tools used for monitoring activity at the operating system level can vary from platform to platform. In our test environment we used the Linux operating system. Therefore, in this section, we briefly cover three of the Linux system monitoring tools available: top, vmstat, and ps.


7.2.1 The top program


Linux is a multiuser, multitasking operating system. Therefore, at any given time more than one program (otherwise known as a process) can be running on a Linux system. While Linux makes it appear as if all processes are executing simultaneously, the reality is that only one process can be executing on each CPU available at a time. A useful tool for monitoring processes that are either using a CPU or are waiting to use a CPU in a Linux system is the top program. The top program is a full screen application that presents a display of process information, such as CPU and memory usage, in real-time. If executed without any of the flags that are available, the top program displays a regularly updated list of running processes, which are ordered by the percentage of CPU resources they consume. Figure 7-5 shows an example of process data that was collected by the top program.

Figure 7-5 Sample top output

Using the top program, you can quickly locate a process that is consuming a large amount of system resources.

7.2.2 Virtual memory statistics vmstat


Another useful tool for obtaining information about processes is a program called vmstat. Specifically, the vmstat program is used to collect and display statistical information about kernel threads in the run and wait queues, memory usage, paging, disk usage, interrupts, system calls, context switches, and CPU activity. If the vmstat command is used without any of the available options, or with only the interval parameter (and, optionally, the count parameter, as in vmstat 2), then the first line of numbers returned is an average of all activity since the system was rebooted.


Figure 7-6 shows an example of process data that was collected by the vmstat program. In this example, 10 lines of data were collected and a 2-second interval was used.

Figure 7-6 Sample vmstat output

7.2.3 Process state ps


The ps utility is another tool that can be used to display information about the processes that are running on a Linux system. Using this tool, it is possible to display the current state of a process, which at any given time can be any of the following:

- Runnable: A runnable process is waiting in the scheduler's run queue.
- Sleeping: A sleeping process is waiting for something to occur.
- Swapped: A swapped process has been completely removed from the computer's main memory and written out to disk.
- Zombie: A zombie process has died but did not exit cleanly.
- Stopped: A stopped process is marked not runnable by the kernel.

Figure 7-7 shows an example of process data that was collected by the ps tool.


Figure 7-7 Sample ps output

To determine which DB2 UDB processes are running, use the command: ps -ef | grep db2 To determine which DB2 UDB agents are idle, which agents are handling a database connection, and which agents are handling an instance connection, execute the command: ps -ef | grep db2agent

7.3 Network Appliance filer monitoring tools


The Data ONTAP operating system that drives the Network Appliance filer contains several utilities that allow a filer administrator to monitor filer activity. In our test environment we used some of these utilities; in this section, we briefly cover four of the Network Appliance filer monitoring tools available: sysstat, ifstat, netstat, and df. Information about the commands used to invoke these tools can be obtained by entering the keyword help, followed by the command itself (for example, help sysstat).


Note: In order to use these utilities, you must be remotely connected to the filer.

7.3.1 sysstat
The sysstat utility is used to display aggregated filer performance statistics, such as the current CPU utilization and the current amounts of network, disk, and tape I/O being performed. When invoked with no arguments, sysstat prints a new line of statistics every 15 seconds. The sysstat utility can also be invoked by issuing one of the following commands:

sysstat [interval]
sysstat [-c count] [-s] [-u | -x] [interval]

Figure 7-8 shows an example of the output that is provided by the sysstat utility.

Figure 7-8 Sample sysstat output

If sysstat is started with no interval count specified, it can be stopped by using Control-C. For more information on the sysstat utility, refer to the Network Appliance Manual Pages.


7.3.2 ifstat
The ifstat utility is used to display statistics about packets that have been received and sent on a specified network interface, or on all network interfaces. The statistics returned by ifstat are a cumulative total collected since the filer was booted. The ifstat utility can be invoked as follows:

ifstat [-z] [-a | interface_name]

If specified, the -z argument causes the statistics to be cleared. The -a argument causes statistics for all network interfaces, including the virtual host and the loopback address, to be displayed. If you don't use the -a argument, the name of a specific network interface should be provided. Figure 7-9 shows an example of the output that is provided by the ifstat utility.

Figure 7-9 Sample ifstat output

For more information on the ifstat utility, refer to the Network Appliance Manual Pages.

7.3.3 netstat
The netstat utility is used to symbolically display the contents of various network-related data structures. There are a number of output formats available, depending on the options specified for the information to be presented. The netstat utility can be invoked by issuing one of the following commands: netstat [-anx] netstat [-mnrs]


netstat [-i | -I interface] [-dn] [-f {wide | normal}]
netstat [-w interval] [-i | -I interface] [-dn]
netstat [-p protocol]

The first form of the netstat command displays a list of active sockets for each protocol used. The second form presents the contents of one of the other network data structures, according to the option selected. The third form displays cumulative statistics for all interfaces, or for the interface specified with the -I option; this form also displays the sum of the cumulative statistics for all configured network interfaces. The fourth form continuously displays information about packet traffic on the interface that was configured first, or on the interface specified with the -I option; this form also displays the sum of the cumulative traffic information for all configured network interfaces. The fifth form displays statistics about the specified protocol. Figure 7-10 shows an example of the output that is provided by the netstat utility.

Figure 7-10 Sample netstat output

For more information on the netstat utility, refer to the Network Appliance Manual Pages.


7.3.4 df
The df utility is used to display statistics about the amount of free disk space remaining in one or all volumes on a filer. All sizes are reported in 1024-byte blocks. The df utility can be invoked by issuing the following command:

df [-i] [pathname]

In this command, pathname identifies the path name of a specific volume. If a path name is specified, df reports only on the corresponding volume; otherwise, it reports on every volume that is currently online. Figure 7-11 shows an example of the output that is provided by the df utility.

Figure 7-11 Sample df output

When executed, df displays statistics about snapshots for each volume on a separate line from the statistics about the active file system. The snapshot line displays information about the amount of disk space that is consumed by all the snapshots in the system. Blocks that are referenced by both the active file system and by one or more snapshots are counted only in the active file system line; their count is not reflected in the snapshot line. If snapshots consume more space than has been reserved for them (20% by default), the excess space consumed by snapshots is reported as used by the active file system as well as by the snapshots. In this case, it may appear that more blocks have been used in total than are actually present in the file system. When invoked with the -i option specified, the df utility displays statistics about the number of free inodes available.
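The snapshot accounting just described can be sketched in a few lines. This is an illustrative model only; the function name and the block-set representation are assumptions, not filer code:

```python
# Illustrative sketch of df-style snapshot accounting: a block referenced by
# both the active file system and a snapshot is charged only to the active
# file system line, never to the snapshot line.

def df_report(active_blocks, snapshot_blocks, total_blocks):
    """Return (active_used, snapshot_used, free), all in blocks.

    active_blocks   -- set of block numbers referenced by the active file system
    snapshot_blocks -- set of block numbers referenced by any snapshot
    total_blocks    -- capacity of the volume in blocks
    """
    active_used = len(active_blocks)
    # Blocks shared with the active file system are charged to it, not to
    # the snapshot line.
    snapshot_used = len(snapshot_blocks - active_blocks)
    free = total_blocks - active_used - snapshot_used
    return active_used, snapshot_used, free

# A volume of 100 blocks: the active file system uses blocks 0-59; snapshots
# still reference blocks 50-69 (50-59 are shared with the active file system).
active = set(range(0, 60))
snaps = set(range(50, 70))
print(df_report(active, snaps, 100))   # (60, 10, 30)
```

Only the 10 blocks referenced exclusively by snapshots appear on the snapshot line; the 10 shared blocks are counted once, against the active file system.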



Part 2


DB2 working with IBM NAS

In this part of the book, we first introduce the IBM NAS 200 and NAS 300, along with IBM NAS terminology and concepts. We then describe how to configure the IBM NAS 200 and 300 and how to install DB2 on Windows. Next, we walk you through the backup and recovery options using the IBM NAS Persistent Storage Manager (PSM). Finally, we show the IBM NAS high availability features and the test results with DB2 UDB.

Copyright IBM Corp. 2002


Chapter 8.

Terminology and concepts of IBM NAS


IBM TotalStorage Network Attached Storage solutions are easy-to-manage appliances that are designed specifically for today's scalable, network-centric IT system architectures. IBM Network Attached Storage provides up to 7 terabytes of disk storage and can be attached directly to a network, rather than to a specific network server. In this chapter we provide an overview of IBM Network Attached Storage technology, and discuss some of the functionality an IBM NAS appliance provides.


8.1 The IBM TotalStorage NAS 200 and 300 concept


A popular and accelerating trend in networking has been to use appliances (devices that perform a single function very well) rather than general-purpose computers to provide common services to a network environment. Appliances have been successful because they are easier to use, more reliable, and have better price/performance than general-purpose computers. IBM Network Attached Storage provides fast, scalable, and reliable data management solutions that overcome the challenges of sharing, managing, and protecting data in today's global, high-growth infrastructures. As dedicated appliances, IBM Network Attached Storage systems are optimized to serve and manage data; they are designed to dramatically simplify data management, improve overall performance, and ensure continuous availability.

8.1.1 System architecture


The IBM NAS 200 and 300 appliances come with a fully integrated suite of optimized software pre-loaded. IBM NAS appliances are designed to be installed quickly and easily, and the entire system is tested by IBM prior to delivery. A network interface driver is responsible for receiving all incoming NFS, CIFS, HTTP, and FTP requests. IBM NAS allows you to exploit different RAID levels for all data stored in the disk subsystem. The integrated Windows Powered OS is based on Windows 2000 Advanced Server code that has been optimized specifically for network file access. By integrating the file system and RAID management, problems that result when RAID management sits on top of the file system (which is how RAID management is usually implemented) are eliminated.


The basic architecture of the IBM NAS appliance can be seen in Figure 8-1.

[Figure content: the integrated OS provides Windows file services (CIFS), UNIX file services (NFS), Web services (HTTP), and Novell NetWare support over TCP/IP; system management covers UNIX services, disaster recovery, PSM, DHCP, DNS, FTP, and LDAP; administration and monitoring is handled by the IAACU and IBM Director manageability services; connectivity is via 10/100 Mbit Ethernet and Gigabit Ethernet SX adapters, with PCI Fibre Channel, PCI SCSI LVD/SE, and PCI Fast/Wide Ultra SCSI adapters attaching the storage subsystem.]
Figure 8-1 IBM NAS Appliance System Architecture

The main building blocks of the IBM NAS appliance architecture are:
- NAS Server Engine
- Storage Subsystem
- Pre-loaded Software

8.1.2 NAS Server Engine


The NAS Server Engine is directly connected to an IP network and acts as an intermediate layer that maps incoming I/O requests from the network to the attached storage subsystem. All data enters and exits a NAS appliance via an IP network in file system format (File I/O). The NAS OS internally converts the File I/O into Block I/O format and stores all data in Block I/O format on the integrated disk arrays via SCSI commands.


[Figure content: application servers transfer File I/O over TCP/IP to the NAS file system, which performs Block I/O to the disks via SCSI.]

Figure 8-2 IBM NAS I/O mapping

IBM NAS connects to the LAN through Ethernet adapters:
- The NAS 200 includes two Ethernet controllers. One is a PCI slot card and the other is integrated directly on the motherboard. Both adapters are configured by default to use a DHCP server. The on-board adapter is used for administration tasks, and the other is used for the public network.
- The NAS 300 comes with an integrated 10/100 Ethernet controller, which is used exclusively for communication between the two engine nodes. At least one Ethernet adapter must be ordered with each engine of the configuration to connect to the Ethernet LAN for access by the users.

8.1.3 Storage subsystems


For the NAS 300, high performance Fibre Channel hubs and cabling are used for disk connectivity (SCSI or FC disks). Data protection technology includes:
- RAID implementation
- Data protection on disk
- Data backup to tape


8.1.4 Pre-loaded code


Each NAS 200 or 300 is pre-loaded at the factory with its base operating system, installation, and administration software. The code is loaded on the system's hard disk, with a backup copy provided on an emergency recovery CD-ROM. The operating system and NAS application code have been specifically tuned to enable the NAS 200 and 300 as high performance NAS server appliances. The difference between the NAS 200 and 300 is that the NAS 300 comes with additional software for clustering. In addition to the operating system and application software, each unit contains tools which simplify remote configuration and administration tasks. Additionally, included network management agents provide options for managing the units. Specifically, the units come pre-configured with the following functions:
- Windows Powered OS based on Windows 2000 Advanced Server code, optimized for the IBM TotalStorage NAS 200 and 300 models
- Multi-protocol support for CIFS (Windows), NFS (UNIX), FTP, HTTP, Apple File Protocol, and Novell file systems
- Multiple file transfer services
- UNIX services, such as pre-configured NFS support, Microsoft Services for UNIX V2.2, and NFS V3.0 (IETF RFC 1830)
- IBM Director 2.22 with UM Server Extensions
- ServeRAID Manager RAID Configuration and Monitoring: provides configuration tools and RAID management of xSeries appliances using ServeRAID-4 controllers
- Columbia Data Products Persistent Storage Manager: Persistent Storage Manager (PSM) creates point-in-time persistent images of any or all system and data volumes. All persistent images survive system power loss or a planned or unplanned reboot.
- Advanced Appliance Configuration Utility: designed to let you manage all your appliances from a single client with this Web-based application set

8.2 IBM NAS terminology


In this section we discuss IBM NAS terminology.

8.2.1 Hard disks and adapters


In this section we discuss hard disks and adapters for the IBM NAS 200 and IBM NAS 300.


Hard disks
The NAS 200 and 300 can contain up to 48 disks in a variety of capacities. Currently there are 36 GB and 73 GB disk drives available, which allows a total capacity of 3.5 TB for the NAS 200 and 6.5 TB for the NAS 300. Both disk types are hot-swappable, designed for high performance (10,000 rpm HDD), and feature Predictive Failure Analysis (PFA). The integrated disk adapter provides almost all the RAID functions within a NAS appliance, such as parity calculations, disk rebuild, and sparing.

Adapters for the NAS 200 and 300


The ServeRAID-4L adapter used in the IBM NAS 200 Model 201 (5194-200) workgroup machine has 16 MB of internal RAM, most of which is used as a disk-read cache. This adapter does not have a battery-backed write cache. If this RAID adapter is used in write-back mode, a failure at the wrong moment will result in permanently lost data, even if the data is written to a redundant RAID configuration (such as RAID 1 or RAID 5). The ServeRAID-4H adapter used in the IBM NAS 200 Model 226 (5194-225) departmental machine has a 128 MB ECC battery-backed cache, 32 MB of on-board processor memory, and 1 MB of L2 cache for complex RAID algorithms, which allows this RAID controller to be safely configured for write-back operations. In the IBM NAS 300 (5195-325), the RAID subsystem is not contained in the engine enclosure itself, but in the first storage unit enclosure. Within this storage unit enclosure are dual RAID controllers and dual power supplies to provide a completely redundant solution with no single point of failure. (Large 5195-325 system configurations have a second identical RAID subsystem, which also has dual RAID controllers.) Each of the dual RAID controllers has 128 MB of internal battery-backed ECC write RAM.

8.2.2 Arrays, logical disks, and volumes


An IBM NAS array represents a logical storage device with a configurable RAID level and physical disk space. Arrays can be configured and associated/mapped to physical devices by the ServeRAID Manager. Arrays are expandable entities; they can be grown by adding another storage unit (disk drive) to the system. This can be done on the fly, while the IBM NAS server is up and running.


As shown in Figure 8-3, several arrays with different RAID level support can be assigned to a set of IBM NAS disks.

[Figure content: a RAID-5 array and a RAID-1 array configured across a set of IBM NAS disks.]


Figure 8-3 IBM NAS array support

Figure 8-4 shows that one or more logical disks can be defined within an array. For logical disks, the term logical drives is used. If required, you can dynamically add more disk space from the assigned array and increase the logical drive size.

[Figure content: a logical drive defined on the hard disks within an array.]

Figure 8-4 IBM NAS logical drives

In order to make disk space accessible to applications, drive partitions have to be defined. A drive partition is defined by assigning space from one or more logical drives to it; if more than one logical drive is used, it is called a spanned logical partition. Logical partitions, or partitions for short, are identified by drive letters like F:, E:, etc., as shown in Figure 8-5.


[Figure content: partitions F: and E: defined on logical drives.]

Figure 8-5 IBM NAS drive partitions

For logical disk and drive partition configuration and management, you can use the ServeRAID Manager program on the NAS 200 (the ServeRAID Manager program is part of the IBM Advanced Appliance Configuration Utility (IAACU) and can be accessed either by using Windows Terminal Services or Internet Explorer), or the IBM Netfinity Fibre Channel Storage Manager on the NAS 300. The NAS system comes pre-configured with a default setup for arrays, logical drives, and drive partitions. (For the NAS 200, by default, an array A with three logical drives has been set up. The logical drives are mapped to drive letters C:, D:, and E:.)

8.2.3 RAID support


IBM TotalStorage NAS products support internal and external hardware RAID solutions. The IBM TotalStorage NAS 200 uses an internal RAID controller, while the IBM TotalStorage NAS 300 uses an external RAID controller for increased performance and redundancy of the RAID subsystem. The IBM NAS 200 Model 201 uses the IBM ServeRAID-4L Ultra160 SCSI internal RAID controller. The NAS 200 Model 226 uses the ServeRAID-4H UltraSCSI internal RAID controller for more performance and storage capacity support. These controllers support nine different levels of RAID, including RAID 0, 1, 5, and 5E. This gives you the flexibility to tailor the RAID level to your typical workload. Support for Logical Drive Migration allows various complex RAID setup and maintenance operations to be administered in a simple, straightforward manner.


The NAS 300 incorporates the IBM TotalStorage 5191 external RAID controller. The NAS 300 has a battery-backed cache that will protect any unwritten data (data still in cache when a failure occurred) for up to 72 hours. It supports RAID levels 0, 1, 3, and 5. The 5191 is designed for high-availability applications requiring a high degree of component redundancy. It features two hot-plug RAID controllers, two hot-plug power supplies, and redundant fans.

RAID 0
RAID 0 allows multiple physical drives to be logically concatenated into a single logical disk drive. A technique called data striping is applied to the physical disk drives. This technique interleaves blocks of data across the disks. The layout is such that a sequential read of data on the logical drive results in parallel reads to each of the physical drives. RAID 0 requires a minimum of two drives. RAID 0 provides no redundancy protection such as parity protection or data mirroring. If a single disk fails, all data is lost, and all disks must be reformatted.
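The striping layout described above can be sketched as a simple block-to-disk mapping. The round-robin function below is an illustrative assumption (real controllers stripe in larger, configurable units), not an IBM NAS implementation:

```python
# Minimal sketch of RAID 0 data striping: logical blocks are interleaved
# round-robin across the physical disks, so a sequential read is served by
# all disks in parallel. Disk count and stripe unit are assumptions.

def raid0_map(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks

# With 4 disks, sequential logical blocks 0..7 land on disks 0,1,2,3,0,1,2,3.
layout = [raid0_map(b, 4) for b in range(8)]
print(layout)
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
```

The mapping also makes the failure behavior obvious: every disk holds a share of every large file, so losing one disk loses part of everything, which is why RAID 0 offers no redundancy.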

RAID 1
RAID 1 uses the concept of data mirroring, which duplicates the data from a single logical drive across two physical drives. Data written to the logical drive is written to both physical disk drives. This creates a pair of drives that contain the same data. If one of these physical drives fails, the data is still available from the remaining disk drive.

RAID 3
RAID 3 stripes data across all the data drives, writing a single block across all drives. This type of striping is referred to as byte-level striping. Parity data is then stored on a dedicated drive. Parity data can be used to reconstruct the data if a single disk drive fails. RAID 3 requires a minimum of three drives (two data disks and one parity disk).

RAID 4
RAID 4 is very similar to RAID 3, except that it uses block-level striping instead of byte-level striping. With block-level striping, a complete block is written to a single disk. The use of larger stripes improves the write performance over RAID 3. It still maintains the use of a dedicated parity drive and requires a minimum of three drives, as does RAID 3.

RAID 5
RAID 5 uses block-level striping and distributed parity. This eliminates the bottleneck of writing to the dedicated parity drive and does not require the duplicate disk drives of RAID 1. Both the data and parity information are spread across the disks one block at a time. RAID 5 requires a minimum of three drives.


As with RAID 4, the one performance penalty is in the read-modify-write cycle for writes smaller than a full stripe. A RAID array operating with a failed drive is said to be in degraded mode. RAID 5 arrays synthesize the requested data for the failed drive by reading the parity information for the corresponding data stripes from the remaining drives in the array. A failed drive in a RAID 1 or RAID 5 array can be replaced by physically swapping in a new drive or by a designated hot spare.
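The parity-based reconstruction used in degraded mode can be illustrated with XOR arithmetic. This is a simplified sketch, not controller code (real RAID 5 also rotates the parity position across the drives, which is omitted here):

```python
# Sketch of RAID 5 parity: the parity block is the XOR of the data blocks in
# a stripe, so any single failed block can be rebuilt by XOR-ing together
# the surviving blocks and the parity block.

from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks of one stripe
parity = xor_blocks(data)            # written to the parity position

# The drive holding data[1] fails; rebuild its block from the survivors
# plus the parity block (this is what a degraded-mode read synthesizes).
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])   # True
```

The same arithmetic explains the small-write penalty mentioned above: updating one data block requires reading the old data and old parity, XOR-ing in the change, and writing both back (the read-modify-write cycle).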

RAID 5E
RAID 5E (Enhanced) puts hot spares to work to improve reliability and performance. A hot spare is normally inactive during array operation and is not used until a drive fails. By utilizing deallocated space on the drives in the array, a virtual hot spare is created. By putting the hot spare to work, performance improves because more heads are writing the data. In the event of a drive failure, the RAID controller will start rearranging the data from the failed disk into the spare space on the other drives in the array.

8.2.4 File system I/O


One of the key differences of a NAS disk device, compared to direct-attached storage (DAS), is that all I/O operations use file-level I/O protocols. File I/O is a high-level type of request that, in essence, specifies only the file to be accessed; it does not directly address the storage device. That is done later by other operating system functions in the remote NAS appliance. A File I/O request specifies the file and the offset into the file. For instance, the I/O may specify: go to byte 1000 in the file (as if the file were a set of contiguous bytes), and read the next 256 bytes beginning at that position. Unlike Block I/O, there is no awareness of a disk volume or disk sectors in a File I/O request. Inside the NAS appliance, the operating system keeps track of where files are located on disk. The OS issues Block I/O requests to the disks to fulfill the File I/O read and write requests it receives. The network access methods, NFS and CIFS, can only handle File I/O requests to the remote file system. I/O requests are packaged by the node initiating the I/O request into packets to move across the network. The remote NAS file system converts the request to Block I/O and reads or writes the data to the NAS disk storage. To return data to the requesting client application, the NAS appliance software re-packages the data in TCP/IP protocols to move it back across the network.
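The File I/O to Block I/O translation can be sketched as follows. The 512-byte block size and the block map are illustrative assumptions; the real translation is performed by the appliance's file system, not by code like this:

```python
# Sketch: how a file-level request ("go to byte 1000 in the file and read the
# next 256 bytes") becomes block-level reads inside the appliance. The client
# never sees the physical block numbers; only the NAS OS does.

BLOCK_SIZE = 512  # illustrative block size

def file_to_block_io(offset, length, block_map):
    """Translate a (byte offset, length) File I/O request into the list of
    physical disk blocks to read. block_map[i] gives the physical block
    holding logical block i of the file."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return [block_map[i] for i in range(first, last + 1)]

# A file whose logical blocks 0..3 live at scattered physical blocks:
block_map = {0: 900, 1: 17, 2: 421, 3: 55}
# "Read 256 bytes starting at byte 1000" touches logical blocks 1 and 2.
print(file_to_block_io(1000, 256, block_map))   # [17, 421]
```

Note that the request names only a file offset and length; the scattered physical block numbers are an internal detail of the NAS appliance, which is exactly the division of labor the text describes.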


8.2.5 Backup and recovery functions


The IBM TotalStorage NAS products use two types of backup implementations: point-in-time image copies and archival backup.

Point-in-time images
IBM Network Attached Storage products provide point-in-time images of the file volumes through the Persistent Storage Manager (PSM) function. This function uses a storage cache that is privately managed by the PSM code. The point-in-time image function of PSM is similar to functions in other products, such as:
- FlashCopy function on the IBM Enterprise Storage Server (ESS)
- SnapShot function on the Network Appliance products
- SnapShot function on StorageTek or IBM RAMAC products

In IBM Network Attached Storage product documents, all of the following terms refer to this functionality: Persistent Image, True Image, Point-in-Time Image, or Instant Virtual Copy.

Attention: Throughout this redbook, we will always use the term True Image copy when we are referring to a PSM point-in-time image.

Archival backup
IBM NAS products offer support for archival backup of the NAS operating system and archival backup of NAS user data. Archival backup of the NAS operating system is supported by the pre-loaded NTBackup software or by separately purchased backup programs such as Tivoli Storage Manager (TSM). For archival backup of user data, the same backup programs, either the pre-loaded NTBackup or a separately purchased program such as Tivoli Storage Manager, can be used. With the pre-loaded NTBackup, full, incremental, or differential backups of NAS user data can be taken. When a full backup is taken, all selected files are backed up without exception. A differential backup image contains all files changed since the previous full backup; thus, no matter how many differential backups are made, only one differential backup plus the original full backup are needed for any restore operation. With an incremental backup, the backup image includes all files that changed since the previous incremental backup.
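The three backup types differ only in their file-selection rule, which can be summarized in a short sketch. The function name and timestamps below are hypothetical, not NTBackup internals:

```python
# Illustrative file-selection rules for the three backup types described
# above: full takes everything; differential takes files changed since the
# last FULL backup; incremental takes files changed since the last backup
# of ANY kind. Timestamps are plain numbers for clarity.

def select_files(files, last_full, last_any, mode):
    """files: dict of name -> modification time. Returns the sorted list of
    file names that this backup mode would include."""
    if mode == "full":
        return sorted(files)
    if mode == "differential":
        return sorted(n for n, t in files.items() if t > last_full)
    if mode == "incremental":
        return sorted(n for n, t in files.items() if t > last_any)
    raise ValueError("unknown backup mode: " + mode)

files = {"a.txt": 5, "b.txt": 12, "c.txt": 20}
# Last full backup at time 10; last backup of any kind (an incremental) at 15.
print(select_files(files, 10, 15, "differential"))  # ['b.txt', 'c.txt']
print(select_files(files, 10, 15, "incremental"))   # ['c.txt']
```

The differential rule is what makes restores simple: because every differential is measured against the same full backup, the latest differential plus that full backup always suffice, whereas a restore from incrementals needs the whole chain.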


8.3 Backup and recovery in IBM NAS products


In this section, we discuss Windows NTBackup and the IBM NAS Backup Assistant, TSM backup, and ISV backup software.

Windows NTBackup and NAS Backup Assistant


IBM NAS products come pre-loaded with Windows NTBackup and the NAS Backup Assistant. These programs can be used to back up operating system data or user data. Backups can be made to disk or tape. The pre-loaded PSM function is the recommended method of resolving the open file problem.

TSM backup
The IBM NAS products come pre-installed with the Tivoli Storage Manager (TSM) client. The TSM client enables the backup of data on the NAS appliance. Because this is only a client, a separate TSM server is required to perform the actual backup. Based on the TSM server's configuration, the final destination of the NAS appliance's backup can be either the TSM server's disk storage or an attached tape subsystem.

ISV backup software


The IBM appliances are sold as fixed-function boxes, and are in general not intended to be modified or changed by the customer. IBM and its vendors have cooperated to tune the performance and testing of these products in NAS environments. Additionally, the license agreements between IBM and its software vendors, and between IBM and its customers, prohibit the use of these appliances as general-purpose servers. Therefore, addition or modification of software in the NAS system may void any support by IBM. However, a limited number of add-on applications have been tested with these NAS products, and customers may add those specific software applications to the system. Should a customer have problems with non-IBM software that they have added to this appliance, the customer should contact the vendor directly, as IBM does not provide on-site or remote telephone support for those non-IBM products. IBM will continue to support hardware and software that is shipped with the NAS appliance. However, in certain circumstances, any non-IBM software may have to be uninstalled for IBM service to provide problem determination on the IBM hardware and software.


8.4 IBM NAS Persistent Storage Manager (PSM)


Point-in-time images provide a near instant virtual copy of an entire storage volume. These point-in-time copies are referred to as True Image copies and are created and managed by the PSM software. These instant virtual copies have the following characteristics:
- Normal reads and writes to the disk continue as usual, as if the copy had not been made.
- Virtual copies are created very quickly and with little performance impact, as the entire volume is not truly copied at that time.
- Virtual copies appear exactly as the volume appeared when the virtual copy was made.
- Virtual copies typically take up only a fraction of the space of the original volume.

Because these virtual copies are both created very quickly and relatively small in size, functions that would otherwise have been too slow or too costly are now made possible. The use of these persistent images may allow individual users to restore their own files without any system administrator intervention.

8.4.1 How PSM works overview


The PSM software connects into the file system of the NAS product, and monitors all file reads and writes with minimal performance impact. When a persistent image copy is requested, the following activities occur:

1. The moment a persistent image is requested, PSM begins monitoring the file system, looking for a period of inactivity. This monitoring is required to make sure that ongoing write operations are committed before the instant virtual copy is made. The period of inactivity is necessary so that PSM can be sure that any data in a write-back buffer has a chance to get flushed to disk before the instant copy is made. The period of inactivity that PSM requires is configurable by the NAS administrator. The NAS administrator can also configure how long PSM should search for this inactivity window. If that inactivity is not found within that time, the virtual instant copy will not be made.

2. An instant virtual copy is then made. At this point in time, PSM sets up control blocks and pointers. This virtual copy is created very quickly.

3. The PSM code continues to monitor the file system for write-sector requests. When a write-sector request occurs, the PSM code intercepts this request by first reading the data that is to be overwritten, then saving the original data in a PSM-specific cache file (which is also stored on disk). After a copy of the original data is saved, the write-sector request is allowed to complete; see Figure 8-6.


[Figure content: copy-on-write operation. (1) A write request arrives at the NAS file system to update the disk. (2) Copy-on-write saves the original (unmodified) contents of the sector in the PSM cache. (3) The write completes to disk. Note: the PSM cache is actually also on disk, but is shown separately for simplicity.]

Figure 8-6 PSM copy-on-write operation

4. As additional write-sector requests are made, PSM again saves a private copy of the original data in the PSM-specific cache. This process is called a copy-on-write operation, and it continues from then on until that virtual copy is deleted from the system. Note that over time, the PSM-specific cache will grow larger. However, only the original sector contents are saved, not each individual change.

5. When an application wants to read the virtual copy instead of the actively changing (normal) data, PSM substitutes the original sectors for the changed sectors. Of course, read-sector requests against the normal (actively changing) data pass through unmodified; see Figure 8-7.


[Figure content: reading data from a persistent image. (1) A read is issued against the persistent image copy. (2a) Sectors that have not changed are read from their regular location. (2b) For sectors that have changed, the previously saved original sector data is retrieved from the PSM cache. (3) For changed sectors, PSM substitutes the originals from its cache when it sends the data to the NAS file system.]

Figure 8-7 PSM read from persistent image
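The copy-on-write and snapshot-read behavior described in the steps above can be modeled in a few lines. This toy class is an illustrative sketch under simplifying assumptions (one snapshot, sectors as strings), not the PSM implementation:

```python
# Toy model of PSM's copy-on-write: on the FIRST write to a sector after a
# persistent image is taken, the original sector contents are saved to a
# cache; reads of the image substitute cached originals for changed sectors.

class CopyOnWriteVolume:
    def __init__(self, sectors):
        self.sectors = list(sectors)   # live (actively changing) data
        self.cache = {}                # sector index -> original contents
        self.snapshot_taken = False

    def take_snapshot(self):
        # The "instant virtual copy": just pointers/control state, no copying.
        self.snapshot_taken = True
        self.cache = {}

    def write(self, index, data):
        if self.snapshot_taken and index not in self.cache:
            self.cache[index] = self.sectors[index]   # save the original once
        self.sectors[index] = data

    def read_snapshot(self, index):
        # Changed sectors come from the cache; unchanged ones pass through.
        return self.cache.get(index, self.sectors[index])

vol = CopyOnWriteVolume(["Now i", "s the", " time"])
vol.take_snapshot()
vol.write(2, " date")
print(repr(vol.sectors[2]), repr(vol.read_snapshot(2)))   # ' date' ' time'
```

Note that a second write to the same sector leaves the cache untouched: only the original contents are preserved, which is why the cache grows with the number of distinct sectors changed, not with the number of writes.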

By design, processes (such as backup or restore) that access data through a persistent image have a lower priority than the normal read and write operations. Therefore, should a tape backup program run at the same time the NAS is experiencing heavy client utilization, the tape backup's access to the PSM image is limited while normal production performance is favored, which helps to minimize the impact on normal users. While creating the PSM image happens very quickly, it might take a few minutes before that image is available and visible to users. In particular, the very first image will generally take much longer to become available than subsequent images. By design, PSM runs at a lower priority than regular traffic, so if the system is heavily utilized, this delay can be longer than normal.

8.4.2 PSM cache contents


The following examples illustrate how data in the sectors are updated by the PSM software during a copy-on-write operation.


In these examples, we assume that the disk originally contained only the phrase: Now is the time for all good men to come to the aid of their country. In the examples below, the expression (FS) represents the sector(s) containing the file system meta-data. This, of course, is updated on every write operation. Empty (free space) sectors are indicated as #0001, #0002, etc. The disk/cache picture examples A through D below are not cumulative; that is, in each case we are comparing against example A.

A. Immediately after a persistent image (instant virtual copy) is made

Table 8-1 shows the layout of how the disk would appear immediately after the instant virtual copy is made. Note that nothing has really changed. Although pointers and control blocks have changed, for simplicity those details are not shown here.
Table 8-1 Layout of disk after instant virtual copy is made
Now i | s the | time  | for a | ll go | od me | n to  | come  | to th
e aid | of t  | heir  | count | ry.   | #0015 | #0016 | #0017 | #0018
#0019 | #0020 | #0021 | #0022 | #0023 | #0024 | #0025 | #0026 | #0027
#0028 | #0029 | #0030 | #0031 | #0032 | #0033 | #0034 | #0035 | (FS)

Table 8-2 shows the layout of the PSM cache after instant virtual copy is made. Notice that it contains empty cells.
Table 8-2 Layout of PSM cache after instant virtual copy is made

B. Immediately after a file is deleted

Table 8-3 shows the layout of how the disk would appear immediately after the original file was erased. Note that a copy of the original file system (meta-data, etc.) is all that is saved.


Table 8-3 Layout of disk immediately after file is deleted


#0001 | #0002 | #0003 | #0004 | #0005 | #0006 | #0007 | #0008 | #0009
#0010 | #0011 | #0012 | #0013 | #0014 | #0015 | #0016 | #0017 | #0018
#0019 | #0020 | #0021 | #0022 | #0023 | #0024 | #0025 | #0026 | #0027
#0028 | #0029 | #0030 | #0031 | #0032 | #0033 | #0034 | #0035 | (FS)

Table 8-4 shows the layout of the PSM cache immediately after file is deleted. Notice that the PSM cache contains a copy of the original file system data.
Table 8-4 Layout of PSM cache immediately after file is deleted:
(FS)

C. Immediately after an update in place, changing time to date

Table 8-5 shows the layout of how the disk would appear if the word time was changed to date. For this example to be truly correct, we would further assume that the application program wrote back only the changed sectors. As explained below, this is not typical. The picture below illustrates how the sectors might appear.
Table 8-5 Layout of disk after changing time to date
Now i | s the | date  | for a | ll go | od me | n to  | come  | to th
e aid | of t  | heir  | count | ry.   | #0015 | #0016 | #0017 | #0018
#0019 | #0020 | #0021 | #0022 | #0023 | #0024 | #0025 | #0026 | #0027
#0028 | #0029 | #0030 | #0031 | #0032 | #0033 | #0034 | #0035 | (FS)

Table 8-6 shows that the PSM cache would contain the original sector contents for the word time and the file system's meta-data:
Table 8-6 Layout of PSM cache after changing time to date
time (FS)


D. Immediately after an update in place, changing men to women

Table 8-7 shows the layout of how the disk would appear if the change requires more space. Since more space is required, the data following the word women would obviously change as well. The original contents of all changed sectors would have to be saved in the PSM cache. Note that this example is not cumulative with examples B or C above.
Table 8-7 Layout of disk after changing men to women.
Now i | s the | time  | for a | ll go | od wo | men t | o com | e to
the a | id of | thei  | r cou | ntry. | #0015 | #0016 | #0017 | #0018
#0019 | #0020 | #0021 | #0022 | #0023 | #0024 | #0025 | #0026 | #0027
#0028 | #0029 | #0030 | #0031 | #0032 | #0033 | #0034 | #0035 | (FS)

Table 8-8 shows the layout of the PSM cache, which would contain all the changed sectors, starting with the sector containing men, plus the data that slid to the right, together with the original file system's meta-data.
Table 8-8 Layout of PSM cache after changing men to women:
od me | n to | come | to th | e aid | of t | heir | count | ry.
(FS)

E. Appearance for most file updates

In the above examples, we assumed that the change was an update in place, where the changes were written back to the very same sectors containing the original data. Most databases do an update in place. However, most desktop applications, such as Freelance, WordPro, Notepad, etc., perform a write and erase original update. When these desktop applications write a change to the file system, they actually write a new copy to the disk. After that write is completed, they erase the original copy. Individual sectors on a disk always have some ones and zeros stored in every byte. Sectors are either allocated (in use) or free space (not in use or empty, where the specific data bit pattern is considered garbage). The disk file system keeps track of which data is in what sector, and also which sectors are free space.


DB2 UDB exploitation of NAS technology

For the NAS code that shipped on 9 March 2001, PSM is unaware of free space in the file system. Therefore, if something is written to the disk, even if it is written to deallocated (free-space) sectors, the underlying sectors are copied to the PSM cache. The following example illustrates this: Table 8-9 shows how the disk would appear following a save operation after changing the word "time" to "date". This assumes no free space detection and no update in place. Note again that this example is not cumulative with examples A, B, C, and D above.
Table 8-9 Layout of disk after changes without free space detection
Sectors #0001–#0014: free space (the original copy's location); sectors #0015–#0028: | Now i | s the | date | for a | ll go | od me | n to | come | to th | e aid | of t | heir | count | ry. |; sectors #0029–#0035: free space; (FS) marks the file system metadata.

After this save is complete, the new, saved information is written into free space sectors #0015-#0028, and the original location sectors then turn into free space, as indicated by #0001-#0014 above. Since the PSM cache works at the sector level and since this version of PSM code is unaware of free space, PSM would copy the previous free-space sectors to its cache as shown in Table 8-10 below:
Table 8-10 Layout of PSM cache after changes without free space detection
Sectors #0015–#0028 (the previous free-space contents) | (FS)

F. Appearance for most file updates, with free space detection
For the NAS code that shipped on 28 April 2001, PSM is enhanced to detect free space in the file system. Therefore, if data is written to the disk's free-space sectors, those free-space sectors are not copied to the PSM cache. Table 8-11 shows the layout of the disk in the event of a save operation after changing the word "time" to "date", with free space detection but without update in place. Again, this example is NOT cumulative with examples B, C, D, and E above.


Table 8-11 Layout of disk after changes with free space detection
Sectors #0001–#0014: free space (the original copy's location); sectors #0015–#0028: | Now i | s the | date | for a | ll go | od me | n to | come | to th | e aid | of t | heir | count | ry. |; sectors #0029–#0035: free space; (FS) marks the file system metadata.

Table 8-12 shows the layout of the PSM cache after saving the changes from "time" to "date". Here, since PSM is aware that the new phrase is being stored in free space, it does not copy the original free-space contents into the cache; it only updates the file system information containing the pointers to the data.
Table 8-12 Layout of PSM cache after changes with free space detection
(FS)

Finally, note that in this situation, when the recycle bin is active on the NAS, these save operations will tend to walk through disk storage and write in free-space sectors. Therefore, with free space detection (the 28 April 2001 code), the recycle bin should be set to a higher number to minimize cache writes and cache size. For the 9 March 2001 code, the recycle bin should be set to a low number or turned off to minimize cache size. Eventually, a save operation will need to use sectors that were not free space when the original persistent image was made; at that point, the original contents will be copied into the PSM cache.
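As an illustration only, the sector-level copy-on-write behavior described above can be sketched with ordinary shell tools. The file names (disk.img, psm_cache/) and the 512-byte sector size are assumptions for the sketch, not part of the NAS implementation:

```shell
# Sketch of PSM-style copy-on-write: before a sector is overwritten,
# its original contents are saved once into a cache area.
SECTOR=512
printf 'Now is the time for all good men to come to the aid of their country.' > disk.img
mkdir -p psm_cache

write_sector() {  # usage: write_sector <sector_no> <new_data>
  cache="psm_cache/sector_$1"
  # Copy-on-write: save the original sector only the first time it changes
  [ -f "$cache" ] || dd if=disk.img of="$cache" bs=$SECTOR skip=$1 count=1 2>/dev/null
  printf '%s' "$2" | dd of=disk.img bs=$SECTOR seek=$1 conv=notrunc 2>/dev/null
}

# Update in place: change "time" to "date" in sector 0
write_sector 0 'Now is the date for all good men to come to the aid of their country.'
# disk.img now holds the new text; psm_cache/sector_0 holds the original
```

Reverting the persistent image then amounts to copying the cached sectors back over the changed ones.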

8.4.3 PSM True Image: read-only or read-write


A persistent image is read-only by default, so no modifications can be made to it. However, the persistent image can be set to read-write, which allows it to be modified. When a persistent image is changed, the modifications made are also persistent (they survive a reboot of the system). Changing a persistent image from read-write to read-only resets the persistent image to its state at the time that the persistent image was taken, as does selecting Undo Writes for a read-write persistent image from the Persistent Images panel.


The ability to create a read-write copy is particularly valuable for test environments when bringing up a new test system. Specifically, using PSM, a True Image copy can be made of a live database, and this True Image copy could be configured as read-write. Then, a separate non-production test system could use the True Image copy for test purposes. During debug of the non-production system, the tester could select Undo Writes to reset the test-system database to its original True Image copy. All of this testing would be kept completely separate from the ongoing active system, and a full copy would not be required. By design, processes (such as the test system in this example) having data access through a True Image copy have a lower process priority than the normal read and write operations, thus minimizing the performance impact to the production database use.



Chapter 9.

Introduction to IBM NAS


In this chapter we provide an overview of the IBM TotalStorage NAS 200 and NAS 300.

Copyright IBM Corp. 2002


9.1 IBM Network Attached Storage overview


IBM NAS uses file I/O. In an IBM NAS appliance, all data enters and exits via an IP network in file-system format (file I/O). The NAS OS internally converts the file I/O into block I/O and stores all data in block format on the integrated disk arrays via SCSI commands. This was illustrated in Figure 2-3 on page 40.

9.2 IBM TotalStorage Network Attached Storage


The TotalStorage Network Attached Storage family of products is specifically designed, configured, and packaged to provide solutions that help overcome the challenges of cost-effectively sharing, managing, and protecting data within complex network infrastructures. The TotalStorage NAS 200 and 300 series products are designed to fill the storage and file-serving needs of workgroup, departmental, and small enterprise customers. With their scalability, high reliability, high performance, and affordability, the NAS 200 and 300 products are designed to meet a host of requirements in demanding environments.

9.2.1 The IBM TotalStorage Network Attached Storage 200


The IBM TotalStorage NAS 200 is a storage appliance family that consists of two machines and associated optional features. The 5194 Model 201 is a tower-based engine with HDD storage. The 5194 Model 226 is a rack-mounted engine with HDD storage; its storage capacity can be extended by attaching Storage Expansion Units. The two models have been developed for use in a variety of workgroup and departmental environments. They support file-serving requirements across NT and UNIX clients, e-business, and similar applications. In addition, these devices support Ethernet LAN environments with large or shared end-user work space storage, remote running of executable programs, remote user data access, and personal data migration. Both models are simple to install, and feature an easy-to-use Web browser interface that simplifies setup and ongoing system management. Both NAS models come with pre-loaded, pre-configured, pre-tuned, and pre-tested operating system, system management, and RAID management software.


The IBM TotalStorage NAS Models 201 and 226 are designed to be high-throughput, two-way SMP-capable appliances with excellent scalability. They incorporate a powerful 1.133 GHz processor with 512 KB advanced transfer L2 cache. The advanced transfer cache is the result of a new backside bus that is 256 bits wide. The quad-wide cache line can transfer four 64-bit cache line segments at one time to deliver full-speed capability. Two Intel Pentium III connectors are standard on the system board to support installation of a second processor. The second 1.133 GHz processor is standard on the Model 226; it may be added as an option on the Model 201. When both processors are present, they share the workload and are load-balanced. High-speed, 133 MHz SDRAM is optimized for 133 MHz processor-to-memory subsystem performance. The IBM TotalStorage NAS Models 201 and 226 use the ServerWorks HE-SL chipset to maximize throughput from the processors to memory, and to the 64-bit and 32-bit PCI buses. The NAS 200 models scale from 109 GB to over 3.49 TB of total storage. Their rapid, non-disruptive deployment capabilities mean you can easily add storage on demand. Capitalizing on IBM experience with RAID technology, system design, and firmware, together with the Windows Powered operating system (a derivative of Windows 2000 Advanced Server software) and multi-file-system support, the NAS 200 delivers high throughput to support rapid data delivery.

IBM NAS 200 Model 201


The NAS 200 tower offers many features found in larger systems, but at an entry-level price. The tower model is powered by a single 1.13 GHz Pentium III processor inside a single engine with a single-channel hardware RAID controller and six internal storage bays. With a basic storage capacity of 109 GB (3 x 36.4 GB disk drives), this model can be expanded up to 440.4 GB. The entry-level configuration of this model consists of the following hardware components:
- Compact tower configuration
- One 1.133 GHz Pentium III processor (a second, dual processor is optional)
- 512 MB of ECC 133 MHz memory standard, expandable to 2.5 GB
- ServeRAID-4Lx economical, single-channel RAID
- One 10/100 Ethernet connection
- Dual-channel, 160 MB/s Ultra160 SCSI controller
- Can be configured with 3 to 6 HDDs, either 36.4 or 73.4 GB (109.2 GB up to 440.4 GB)


IBM NAS 200 Model 226


The departmental model is a higher-capacity, rack-configured appliance for larger client/server networks. With dual 1.13 GHz Pentium III processors, a four-channel hardware RAID controller, and storage capacity from 109.2 GB to 3.49 TB, this model is designed to provide the performance and storage capabilities for more demanding environments. The base configuration of this model consists of the following hardware components:
- Rack-optimized configuration
- Two 1.133 GHz Pentium III processors
- 1 GB of ECC 133 MHz memory standard, expandable to 3 GB
- ServeRAID-4H high-function, four-channel RAID
- One 10/100 Ethernet connection
- A dual-channel, 160 MB/s Ultra160 SCSI controller
- Can be configured with 3 to 6 HDDs, either 36.4 or 73.4 GB (109.2 GB up to 440.4 GB)
- Expandable up to 3.49 TB using up to three IBM 5194 NAS Storage Unit Model EXPs
The Storage Expansion Unit Model EXP (an option on the 226 only) features:
- 3U form factor
- Configurable with 3 to 14 HDDs, either 36.4 or 73.4 GB (109.2 GB up to 1.027 TB)

9.2.2 The IBM TotalStorage Network Attached Storage 300


IBM's TotalStorage Network Attached Storage 300 (5195 Model 326) is an integrated storage product that is system-tested and comes with all components completely assembled into a 36U rack. It is designed to be installed quickly and easily, and the entire system is tested by IBM prior to delivery. The NAS 300 appliance provides the same features and benefits as the IBM NAS 200 series products. In addition, with its second engine, it provides increased reliability and availability through the use of clustering software built into the appliance. The NAS 300 also provides scalability, fault tolerance, and performance for demanding and mission-critical applications. The NAS 300 consists of a dual-node chassis with fail-over features. It has dual Fibre Channel hubs and a Fibre Channel RAID controller. The 300 is pre-loaded with a task-optimized Windows Powered Operating System. With its fault-tolerant, dual-engine design, the 300 provides a significant performance boost over the 200 series.


The TotalStorage NAS 300 Model 326 base machine consists of the following machines: one IBM 5186 NAS Rack Model 36U, two IBM 5187 NAS Engine Model 6RZs, two 3534 SAN Fibre Channel Managed Hub Model 1RUs, and one IBM 5191 NAS RAID Storage Controller. The NAS 300 Model 326 is preinstalled in the rack with a fully integrated suite of optimized software preloaded. For high-performance data handling, each NAS 300 Model 326 engine employs dual 1.133 GHz Pentium III processors and 1 GB of memory standard. High-performance Fibre Channel hubs and cabling are used for disk connectivity. Optionally, the NAS 300 Model 326 can be enhanced by adding an additional 5191 NAS RAID Storage Controller Model 0RU and up to seven 5192 NAS Storage Unit Model 0RUs; the rack is preconfigured with Fibre Channel cabling for all such expansions. The base NAS 300 Model 326 is configured with either 109 GB or 218 GB of HDD. These minimums consist of the minimum three 36.4 GB or 73.4 GB HDDs in the first (required) RAID Controller, and can be expanded by adding one to seven additional HDDs to the first RAID Controller. It is recommended that the practical minimum be six to ten HDDs. Additional HDDs can be added in up to eight increments of 109.2 GB to 728 GB each. This level of granularity is achieved using three to ten 36.4 GB or 73.4 GB HDDs in each additional 5191 and 5192. The maximum configuration of 6.61 TB is achieved by adding the seven IBM 5192 NAS Storage Unit Model 0RU machines and the additional IBM 5191 NAS RAID Storage Controller Model 0RU, all using 73.4 GB HDDs.
The NAS 300 base configuration features the following:
- One Rack 36U (with state-of-the-art Power Distribution Unit)
- Two Engines, each with dual 1.13 GHz Pentium III processors, 1 GB memory, and two redundant, hot-swap power supplies/fans
- Support for 36.4 GB and 73.4 GB HDDs
- 364 GB starting capacity (10 x 36.4 GB hot-swappable HDDs), expandable to over 6.57 TB
- Two Fibre Channel Managed Hubs
- One RAID Controller
- Ten 36.4 GB hot-swappable HDDs


Optionally, it supports the following:
- An additional RAID Controller
- A maximum of 7 Storage Expansion Units, each populated with ten 36.4 GB hot-swappable HDDs
The system comes standard with dual-node engines for clustering and fail-over protection. The dual Fibre Channel hubs provide IT administrators with high-performance paths to the RAID storage controllers using fibre-to-fibre technology. The pre-loaded operating system and application code is tuned for the network storage server function, and is designed to provide 24x7 uptime. The simple point-and-click restore feature makes backup extremely simple. With multi-level persistent image capability, recovery is quickly managed to ensure the highest availability and reliability.



Chapter 10.

Configuration of IBM NAS 200 and 300


In this chapter, we cover the following topics:
- Setting up the NAS 200
- Setting up the NAS 300
- Accessing the NAS from the client


10.1 Our environment


In order to test the NAS 200 and NAS 300, the environment was set up as shown in Figure 10-1.

Figure 10-1 Our environment (a PDC running Windows 2000, a DB2 server, the NAS 200 and NAS 300, and client stations)

The network is 10/100 Ethernet, and all the computers are in the same local network. We created a Windows Active Directory domain named NAS-DB2.ITSO.IBM.COM, and the PDC is running Windows 2000 Server. We have three stations: two are running Windows NT, and one is running Windows 2000. In the following sections we describe the steps to set up the test network, including creating the domain user account for DB2 and adding the computers to the domain. This is the main network environment. In addition, we also set up a trusted domain and an isolated network for the NAS 300, but the setup procedure is similar.


10.1.1 Create db2 user account


Log on to the Primary Domain Controller of nas-db2.itso.ibm.com, open Active Directory Users and Computers, right-click Users, and select New User (Figure 10-2).

Figure 10-2 Create new user

After entering the user information, type the password. The final message shows that the user has been successfully created.


Now change the user's properties, and add this user to the Domain Admins group. Go to Active Directory Users and Computers, right-click the user nas_db2_user, and select Properties (Figure 10-3).

Figure 10-3 Change user properties


Select the Member Of tab, and click Add. Select the Domain Admins and Administrators groups from the domain and click OK (see Figure 10-4).

Figure 10-4 Change nas_db2_user group select group


The groups to which user nas_db2_user belongs are now listed; click OK to confirm (see Figure 10-5).

Figure 10-5 Change nas_db2_user group result
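As a sketch, the same account setup can also be done from a command prompt on the domain controller using the standard net commands (the account name is the one used in this chapter; net user with * prompts for the password):

```shell
rem Create the domain user account and add it to the Domain Admins group
net user nas_db2_user * /ADD /DOMAIN
net group "Domain Admins" nas_db2_user /ADD /DOMAIN
```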

10.1.2 Add computer to domain


After creating the user, all the stations need to be added to the domain. The following paragraphs show a sample of adding one station to the domain. First, log on to the computer as an Administrator user (otherwise you can't add the computer to the domain), then go to the Control Panel, double-click System, select the Network Identification tab, and click Network ID. The Network Identification Wizard starts and gives you a welcome message; click Next. Select My company uses a network with a domain. The wizard will pop up a message box telling you that a domain account (ID, password, domain name) is needed; just click Next. We will use the user account we created for DB2 to add this computer to the domain (see Figure 10-6).


Figure 10-6 Create computer account in Windows domain

Because we haven't created a computer account in the domain, the wizard will ask for the account information (see Figure 10-7). Note: If you don't have domain administrator privileges, your domain administrator must create a computer account for you before this step. In that case, the system won't require you to create a computer account at this step.

Figure 10-7 Add to domain

After you enter the computer account information, click Next, and the wizard will ask for a domain user account that has permission to add this computer to the domain. Use the nas_db2_user account (see Figure 10-8).


Figure 10-8 Add to domain the domain user account

This will take a little while (about 10 seconds). If successful, a message box, Welcome to the NAS-DB2 domain, is shown (see Figure 10-9).

Figure 10-9 Add user to local Administrator group

After selecting the Administrators group, the computer has been added to the domain. A restart is required; just click OK. After restarting the computer, the login screen changes, and the Log on to domain option is now available. Note: This step is not required if nas_db2_user belongs to the Domain Admins group, because a domain admin user is automatically added to the local Administrators group.
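For unattended setups, the domain join can, as a sketch, also be scripted with the netdom utility from the Windows 2000 Support Tools (the station name station1 is a hypothetical example; the account is the one created earlier):

```shell
rem Join the station to the NAS-DB2 domain, then reboot for it to take effect
netdom join station1 /domain:NAS-DB2 /userd:nas_db2_user /passwordd:*
```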


10.2 Setting up IBM NAS 200


This section explains how to set up the NAS 200.

10.2.1 Connecting to the NAS 200


There are three major ways to get connected to your NAS unit:
- Direct connect to a NAS unit
- Connect to a NAS Windows session
- Connect to NAS using a utility program

Direct connect to a NAS unit


Using a keyboard, mouse, and monitor
The NAS 200 and NAS 300 units are designed as headless appliances. However, the keyboard, mouse, and monitor can be attached directly to the system. This is sometimes more convenient than using the remote utility to configure the NAS system.

Using virtual keyboard, mouse, and monitor


The server console can also be accessed directly using a utility program. In our environment, DSView was used: the computer's monitor, keyboard, and mouse are connected to a server through the network, and a client computer can then connect to the DSView server to access the console of the computer.

Connect to NAS Windows session


Using MS Windows Terminal Client
The Windows Powered OS pre-installed on the NAS also runs Windows Terminal Services, so you can use a Terminal Services Client to get access to the NAS unit. The Terminal Services Client is included on the IBM TotalStorage Network Attached Storage System Supplementary CD. If you don't have a Terminal Services Client installation source (it is not included on the Windows installation CD), you can use any Windows server that has Windows Terminal Services installed to build the client installation. You should be able to find the client installation source in: %WINDOWS_HOME%/system32/client/TSClients

Using MS Internet Explorer to access Terminal Service


You can access Terminal Services through Internet Explorer 5.0 or higher. The screen itself is the same as with the MS Terminal Services Client; it is a little slower, but still convenient.


Connect to NAS using a utility program


Using Internet Explorer
In order to have access to the NAS appliance, you can type this URL:
http://nas-ipaddress:8099

After entering the URL, the system pops up a Windows login prompt for the user name and password. Note: Please use a domain user to log in if you already have a domain user account set up, and remember to enter the domain name. If you use the local default Administrator account, you will not be able to access other network resources directly.

Using the IBM Advanced Appliance Configuration Utility (IAACU)


The IBM Advanced Appliance Configuration Utility can also be used to connect to the NAS. It is included on the IBM TotalStorage Network Attached Storage System Supplementary CD, which is delivered with the server. This tool should be installed on a workstation connected to the same network segment as the NAS unit. With this tool, you can discover any NAS appliances in the network and see their characteristics and specific data. Note: Most of the tasks you need to do with the NAS can be done with any of these types of connections, but there are exceptions: 1. Setting up a cluster on the NAS 300 can only be done by direct connection, since it requires you to shut down one node. 2. For performance, the Terminal Services Client is best if you have a fast network, while Internet Explorer or a virtual connection works better on a slow network.

10.2.2 Default configuration


In this section we describe the default configuration for the NAS unit.

Disk
The NAS 200 includes a ServeRAID-4H with 4 channels. The first channel is connected to 6 disks of 36 GB each. By default, a drive array A with 3 logical drives has been set up on the NAS 200:
- Array Drive 1, C drive, 3123 MB
- Array Drive 2, D drive, 6498 MB
- Array Drive 3, E drive, 157535 MB


If you want to customize the array configuration, you need to run the ServeRAID Manager program.

Network
The NAS 200 includes two Ethernet controllers. One is a PCI slot card and the other is integrated directly on the motherboard. The Windows 2000 operating system shows these adapters as an IBM Netfinity Fault Tolerance PCI Adapter (the on-board adapter) and as an IBM 10/100 (the PCI-slot adapter). Both adapters are configured by default to use a DHCP server; we recommend that you change them to static IP addresses. The on-board adapter is used for a cluster, and the other one is used for the public network connection. By default, the appliance's name is composed of the IBM machine type/model plus the IBM serial number. We recommend that you change this according to your company's naming standard.

10.2.3 Setting up storage


In this section we explain how to set up storage.

Using ServeRAID Manager


You can use the ServeRAID Manager program to configure your ServeRAID controllers, view the ServeRAID configuration and associated devices, create arrays and logical drives, delete an array, dynamically increase the logical-drive size, change RAID levels, and perform many more functions for your disk management needs. The following steps explain how to use the ServeRAID manager: 1. Log on to the NAS 200 with an administrative account, and open the IBM NAS ADMIN.MMC from the desktop or start menu.


2. Select NAS Management -> Storage -> ServeRAID Manager -> Server Raid Manager (see Figure 10-10).

Figure 10-10 Using IBM NAS 200 Server RAID Manager

Creating arrays
The following steps show how to create drive arrays:
1. In the storage browsing tree, right-click the ServeRAID controller that you want to configure.
2. Click Create Arrays.
3. Click the Custom configuration button.
4. Click Next and the Create Arrays window opens.
5. Right-click the drive or SCSI channel icons in the Main Tree to select the drives that you want to add to your arrays, delete from your arrays, or define as hot-spare drives; then select a choice from the pop-up list. If you want to create a spanned array, click the Span Arrays box.
6. After you select the ready drives for your arrays and define your hot-spare drives, click Next. If you are not creating spanned arrays, you can select the RAID level here.


7. To finish the procedure, click Apply.
8. Click Yes in answer to the question Do you want to apply the new configuration?
9. Right-click the new array to synchronize it; the synchronization time depends on the RAID level and the number of drives.
10. The array is now ready to be accessed by the operating system.

Creating logical drives


Now the array is ready to be accessed by the operating system. Follow these steps to create new logical drives:
1. Open IBM NAS ADMIN.MMC from the desktop, click Disk Management (Local), and choose the new drive.
2. Right-click and select Create Partition; the wizard window appears. Choose a primary partition, and click Next.
3. Set your desired size and click Next.
4. Assign a drive letter (we used H).
5. Select the file system; we recommend NTFS in general.
6. The drive letter now shows in the Disk Management window. Drive H shows Formatting; this takes from a few minutes to tens of minutes.

10.2.4 Add NAS 200 to domain


Now we need to add this computer to the domain; refer to Figure 10-5 on page 158.

10.2.5 Creating a share volume


Creating a share volume on the NAS 200 is simple. Just as when sharing any drive on a Windows Server, you must give the DB2 user read and write access for the volume created for DB2:


1. Right-click the folder (or drive) you want to share, select the Sharing tab, and enter the Share name (see Figure 10-11).

Figure 10-11 Share folder

2. Select Permissions. By default, Everyone can read and change. For security reasons, the default permissions should be removed, and permissions for the DB2 user should be added.


3. Select the db2 user from the domain: nas_db2_user (see Figure 10-12).

Figure 10-12 Add user for share folder


4. Select your access type for nas_db2_user and click OK (see Figure 10-13).

Figure 10-13 Set permissions for db2_data folder
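The share and permission steps above can, as a sketch, also be done from a command prompt (the path E:\db2_data and the host name nas200 are hypothetical examples; the share and account names are the ones used in this chapter):

```shell
rem Share the DB2 data folder on the NAS
net share db2_data=E:\db2_data
rem Grant the DB2 domain account Change access on the underlying NTFS folder
cacls E:\db2_data /E /G NAS-DB2\nas_db2_user:C
rem From a client station, map the share using the DB2 account
net use N: \\nas200\db2_data /user:NAS-DB2\nas_db2_user
```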


10.3 Setting up the IBM NAS 300


This section describes how to set up the NAS 300. Connecting to the NAS 300 is almost the same as connecting to the NAS 200. For details, refer to 10.2.1, Connecting to the NAS 200 on page 161. There are two differences: 1. The NAS 300 has two nodes, so you need to connect to and configure the two nodes separately. 2. You can only use the Terminal Services Client, an HTTP client, or a utility program to connect to a virtual server, including the Cluster Server and Virtual Server.

10.3.1 Default configuration


In this section we discuss the default configuration.

Disk
Each engine on the NAS 300 has one 9 GB Ultra SCSI hard disk. It is divided into two partitions: one 3 GB partition for the system, with the rest (6 GB) reserved for maintenance. The system partition contains the Windows Powered Operating System files. The Model 326 comes with a preconfigured shared-storage RAID configuration on the first IBM 5191 RAID Storage Controller Model 0RU. The storage configuration application is the IBM FAST Storage Manager 7 Client. The storage is formatted as an array, at RAID level 5, consisting of the following LUNs:
- A LUN of 500 MB for the Quorum drive (the drive letter will be G). The Quorum drive is used by Microsoft Cluster Service to manage clustered resources.
- A second LUN, composed of the remaining space, used as a shared drive with one built-in hot spare.

Network
There are at least four network interface cards (NICs) per engine. One on-board card is called an IBM 10/100 Netfinity Fault Tolerant Adapter; and the other add-on PCI cards are called IBM 10/100 Ethernet Server Adapters.


10.3.2 Setting up storage on the NAS 300


Just like the NAS 200, the NAS 300 system uses a disk utility to manage its storage. This is called the IBM Netfinity Fibre Channel Storage Manager. Storage Manager 7 provides a GUI for managing storage subsystems. It manages the storage subsystems through the Fibre Channel I/O path between the engines and the RAID controllers (host-agent method). It is pre-installed on both nodes, and you can access it through a Terminal Services session. Caution: When you set up the shared storage with Storage Manager 7, you need to power off the other node.

Creating drive arrays


Using the Netfinity Fibre Channel Storage Manager to create arrays is similar to creating arrays on the NAS 200; refer to Creating arrays on page 164. In our environment, we created two arrays: db2_data and db2_logs. Figure 10-14 shows the creation of the db2_logs array.

Figure 10-14 Create disk array


After creating the array, the result in the RAID controller is shown in Figure 10-15.

Figure 10-15 Create disk array

Creating a logical drive


Creating a logical drive here is slightly different than on the NAS 200. Follow these steps:
1. After rebooting, open IBM NAS Admin and select Disk Management (Local) in the storage folder.
2. A Write Signature and Upgrade Disk Wizard pops up; click Next.
3. Select the disk you want to write a disk signature to, and click Next. You should see an informational screen. Click Finish (see Figure 10-16).


Figure 10-16 Create logical disk

4. Disk 3 is the new disk we just created; right-click it, select Create Partition, and click Next (see Figure 10-17).

Figure 10-17 Create partition


5. Select primary partition, and click Next. 6. Select the disk size (see Figure 10-18).

Figure 10-18 Create partition select disk size

7. Assign drive letter I.
8. Format the drive using NTFS. Don't select compression (see Figure 10-19).

Figure 10-19 Create partition format disk

Formatting can take tens of minutes. When it is done, the I drive is ready for the application.


Assigning sticky drive letters


You need to assign sticky drive letters to the Quorum and data drives, so that each drive is recognized exactly the same on each node. In other words, the drive mappings on both nodes need to be set up exactly the same way, because by default, when the Windows system finds an available disk drive, it assigns the first available drive letter to the disk. (So, if you create a logical drive at node 1 as the I drive, it may get a different drive letter on node 2.) You also need to make sure the drives are configured as Basic. These are requirements of Microsoft Cluster Server. To achieve this, do the following on each node (one at a time):
1. Open the IBM NAS Admin utility and select Storage.
2. Choose Disk Management (Local).
3. Right-click the drive and select Change Drive Letter and Path. In Edit Drive Letter or Path, click the pull-down arrow and select the drive letter you want (again, it is recommended to use G: for the Quorum).
4. Then click OK.

10.3.3 Setting up the Cluster Server


The IBM TotalStorage NAS 300 uses Microsoft's Cluster Server (MSCS) software to provide high availability. Clustering the NAS 300 is similar to clustering in a client/server environment; it ensures availability at the share volume service level, regardless of individual service or computer failure. The availability of the physical storage is ensured by the RAID disk controller. In this cluster environment, the two nodes are connected together via an Ethernet cross-over cable. This cable acts as the lifeline of the cluster and carries the heartbeat, continually checking whether all nodes in the cluster are functional. The hardware is already set up correctly for clustering.
Note: Novell NetWare and Apple Macintosh shares are available on both nodes, but not through clustering services. If either node fails, the shares become unavailable until the node is brought back up.
The following paragraphs guide you through the configuration and setup of a NAS 300 cluster using MSCS on NAS 300 Version 2. The MSCS software is included with the NAS 300. We use an isolated network with two Windows 2000 Advanced Servers running DB2.


Checklist
Before you start, you should make sure you have all the information you will need.

Network
Both nodes must be able to connect to the domain controller. Connect means they are in the same broadcast network, or DNS and gateway are properly configured for both nodes to find the domain controller. You need a domain admin account.
DNS: The PDC running Windows 2000 is a DNS server. You may have an official DNS server to resolve domain names outside of this domain.
Gateway: A gateway for each network to which the node is connected is needed for the NAS to connect outside of the broadcast network.

Cluster information
This is the information you will need regarding the cluster:
Cluster name: The virtual name is the host name the clients will use to address the NAS 300. This host name can bind and fail over to any of the engines/nodes on the appliance.
Cluster IP address: The cluster IP address is the address bound to the virtual name.
Drive letter of Quorum drive: This is where the cluster information is kept. It is recommended to use drive G: (which is the default on the NAS 300).
Virtual server information: The virtual server is the network name resource created in the cluster for the client to access the share volume:
- Virtual server name
- Virtual server IP address
- Share volume information
Note: The virtual server is not required to finish the cluster setup; you can add a new virtual server or share volume later.
In our environment, we use Avocent DSView to access the NAS 300 appliance. The following configuration was used:
Avocent DSView access: userID: sndiop\nas_db2_user (this is a completely different user); password: nas_db2_user


Domain environment:
- Domain name: nas-db2
- Domain userID: nas_db2_user
- Domain user password: nas_db2_user
- Gateway: 192.168.100.10 (since this is an isolated network, the gateway is not required)
- DNS: 192.168.100.10
Server 1 (PDC):
- Host name: db2w2ksvr1.nas-db2.itso.ibm.com
- IP address: 192.168.100.10
- OS: Windows 2000 Advanced Server SP2
- Software: DB2 7.2 Server Enterprise Edition, DB2 7.2 Client
Server 2:
- Host name: db2w2ksvr2.nas-db2.itso.ibm.com
- IP address: 192.168.100.20
- OS and software: same as Server 1
NAS 300 (V2):
- Node 1: Host name: nasdb2n1; IP address: 192.168.100.30
- Node 2: Host name: nasdb2n2; IP address: 192.168.100.31
Cluster information:
- Cluster name: db2cluster
- Cluster IP address: 192.168.100.32
Virtual Server 1 information:
- Virtual server name: nasdb2nn1
- IP address: 192.168.100.33
- Share volumes: nasdb2sf1, nasdb2sf2

Network setup
Networking for both nodes needs to be set up. Each node has an interconnect network and three public networks. Only the networks in use need to be configured. In our environment, there is only one public network.


Set up interconnect network


Do the following steps on both nodes to configure the interconnect (private) network adapter. The private connection is the heartbeat interconnect for the cluster.
1. Right-click My Network Places and then select Properties.
2. Select the network connection that uses the IBM 10/100 Adapter. If only the network name is shown, select View -> Detail; the device name will then show.
3. Right-click the connection, select Rename, and set the new name to Private Network.
4. Right-click the Private Network and click Properties.
5. Click Configure, select the Advanced tab, and verify that the Link Speed and Duplex are set to 100 Mbps/Full Duplex.
6. Select Internet Protocol (TCP/IP) and click Properties.
7. The default IP address should be 10.1.1.1 for the first node, and 10.1.1.2 for the joining node. If not, it is recommended that you set them to those values. Ensure that the subnet mask is 255.255.255.0.
8. Select Advanced, and select the WINS tab.
9. Select the Disable NetBIOS over TCP/IP radio button and click OK.
10. Select Yes when you are asked Do you want to continue using an empty Primary WINS address?
11. Click OK as needed to return to Network and Dial-up Connections.

Set up public network


You must set up the proper public network on both nodes; the default is to use DHCP. You should change all the public network IP addresses to static IP addresses so that the cluster nodes are still available if the DHCP server goes down. Follow these steps on each node to configure each public local area connection:
1. Right-click My Network Places, then click Properties.
2. Select the desired Local Area Connection. The connection that uses the IBM 10/100 Adapter is the private connection; the other active connection is the public connection. Use the other active connection for this step and the next step.
3. Right-click the connection, select Rename, type Local Network Area Public 1, and press Enter. Ensure that local area connection names are unique.


Note: When you do these renaming steps on the joining node, ensure that the local area connection name for each physically connected network is identical on each server.
4. Right-click My Network Places, click Properties, right-click the Public icon, then click Properties, select Internet Protocol (TCP/IP), and click Properties.
5. Use the networking information in the checklist to enter the networking addresses (Figure 10-20), including IP address, subnet mask, default gateway, and preferred DNS server.

Figure 10-20 Set up public network

6. If needed, configure the DNS, WINS, HOSTS, or whatever method you will be using for name resolution. You can view this information by clicking the Advanced button on the Properties window.


7. Click OK on each panel to return to the Properties window.
You also need to check the network binding order. The cluster function requires the Private Network to be the first network binding. To change it:
a. Right-click My Network Places and then select Properties.
b. Select Advanced, then Advanced Settings.
c. Reorder the position of the adapters by selecting them and pressing the up or down arrow keys, then clicking OK.
The result of the network configuration is shown in Figure 10-21.

Figure 10-21 Configure public network

Add both NAS 300 nodes to domain


You need to add both nodes to the domain. Refer to 10.2.4, Add NAS 200 to domain on page 165 for details; the procedure is the same.

Set up the first node


Before you start this step, you need to have completed the setup procedure on each node. The nodes are equivalent in hardware and software, so first node and joining node refer only to the sequence in which you set up the nodes. The first node will become the default primary node.
Important: Before you start to install the cluster server on the first node, make sure that the joining node is powered off. This is required to prevent corruption of data on the shared storage devices. Corruption could occur if both nodes were to try to simultaneously write to the same shared disk that is not yet protected by the clustering software.
1. Open IBM NAS Admin, select the Cluster Tools folder, and click the Cluster Setup icon. A welcome screen is shown; select Continue.
2. On the Nodes type window, click First Node.


3. The Cluster information panel appears. Enter the following information (see Figure 10-22):
- Administrator's account information
- Domain name
- Cluster IP address
- Subnet mask
- Quorum drive

Figure 10-22 Set up first node

Note: You must use a domain account.


4. On the confirmation window, select Yes; it will take a few minutes to finish. The Cluster Administration utility will automatically start, showing the first node with its groups and resources. After it finishes, the cluster and node 1 information is shown in the Cluster Administrator window. See Figure 10-23.

Figure 10-23 Set up cluster on first node


Set up the joining node


Setting up the joining node is much simpler; you need to log in using a domain admin account (see Figure 10-24).
1. Open the IBM NAS Admin utility, and click Cluster Tools.
2. Choose Cluster Administration, and click Continue.
3. On the Nodes selection window, click Joining Node.
4. The First Node Information panel appears. Enter the name of the first node.

Figure 10-24 Set up second node

5. You may be asked for domain admin account information if you are not logged in as a domain admin user.


You will see a message that the configuration will take a few minutes. After it has completed, the Cluster Administration function starts on the second node (see Figure 10-25).

Figure 10-25 Result of setting up second node


Recommended adjustment setting


The following tasks are not required to complete the cluster setup, but they will make the cluster work more efficiently. All tasks are done in the Cluster Administrator.
1. Increase the log file size: Right-click the cluster name and select Properties, select the Quorum tab, and change the quorum log size from 64 KB to 4096 KB (see Figure 10-26).

Figure 10-26 Adjust cluster quorum log size


2. Change the private network priority: Select Network Priority to view all networks acknowledged by the cluster server, select the private network connection, and move it to the top of the cluster communication priority list (see Figure 10-27).

Figure 10-27 Increase private network priority


3. Set the private network to internal communication: Open the Properties for the private network and select Internal cluster communication only to ensure that no client traffic is placed on the private network (see Figure 10-28).

Figure 10-28 Set private network to internal communication

Cluster resource balancing


The MS Cluster Server cannot do load balancing on the same resource, so you need to manually balance the disk groups between the two nodes to distribute the cluster resource functions. This allows for a more efficient response time. To set up cluster resource balancing, do the following steps:
1. Right-click a disk group to bring up its Properties panel, and select the General tab.
2. Click the Modify button to the right of Preferred Owners.


3. In the available nodes panel, select a node and click the arrow button (see Figure 10-29).

Figure 10-29 Resource balance for disk group 1

Click OK, and the preferred owner will be shown on the Disk Group 1 Properties window. Each disk group has a preferred owner, so that, when both nodes are running, all resources contained within each disk group have a node defined as the owner of those resources. Even though a disk group has a preferred owner, its resources can run on the other node after a cluster failover. If you restart a cluster node, those resources that are preferentially owned by the restarted node will fail back to that node once the cluster service detects that the node is operational, provided that the defined failover policy allows this to occur. If you have not defined the node as the preferred owner for the resources, then they do not fail back to the node.

Failover setup
The failover of resources under a disk group on a node allows users to continue accessing the resources if the node goes down. Individual resources contained in a group cannot be moved to the other node; rather, the group they are contained in is moved. If a disk group contains a large number of resources and any one of those resources fails, the whole group will fail over according to the group's failover policy.


The setup of the failover policies is critical to data availability. To set up the failover function (see Figure 10-30): 1. Open the Properties panel for the disk group. 2. Select the Failover tab to set the Threshold for Disk Group Failure.

Figure 10-30 Set up threshold for failover of Disk Group 1

For example, if a network name fails, clustering services attempt to fail over the group 10 times within 6 hours; if the resource fails an eleventh time, the resource remains in a failed state and administrator action is required to correct the failure.
3. Select the Failback tab to allow, or prevent, failback of the disk group to the preferred owner, if defined (see Figure 10-31).

Figure 10-31 Set up failback for Disk Group 1

When failback of a group occurs, there is a slight delay as the resources move from one node to the other. The group can also be configured to fail back as soon as the preferred node becomes available, or to fail back only during specific off-peak usage hours.
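These failover and failback policies can also be set from the command line with the cluster.exe utility that ships with MSCS. This is only a sketch: the group name is taken from our example configuration, and the property values mirror the threshold of 10 failures within a 6-hour period described above.

```
REM Set the failover threshold and period for the disk group
cluster group "Disk Group 1" /prop FailoverThreshold=10
cluster group "Disk Group 1" /prop FailoverPeriod=6
REM Allow failback to the preferred owner (0 = prevent, 1 = allow)
cluster group "Disk Group 1" /prop AutoFailbackType=1
```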


Each resource under each disk group has individual resource properties. The properties range from restart properties and polling intervals (to check whether the resource is operational) to a time-out to return to an online state. The default settings for these properties are chosen for average conditions and moderate daily usage.

Test the cluster


After the failover setup is completed, the cluster configuration is finished. To test the cluster, you can simply move one disk group from its current node to the other node. If this succeeds, the cluster server is ready to use.
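The disk group can be moved from the Cluster Administrator, or, as a sketch, with the cluster.exe command line; the group and node names below come from our example configuration:

```
REM Move the disk group to the other node, then verify its state
cluster group "Disk Group 1" /moveto:nasdb2n2
cluster group "Disk Group 1" /status
```

If the group comes online on the second node, failover is working.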

10.3.4 Create clustered share volume


The creation of file shares in a cluster server involves dependencies on a physical disk, a static IP address, and a network name. These dependencies allow resources that are defined in the same disk group to move as a group, and they assure the necessary access for a given resource. So this step is different from the traditional way to set up and share volumes in Windows.

Physical disk
This is the base resource on which user data is stored. It is not dependent on any other resource except the physical disk that it defines. The disk resource must also have the same drive letter on both nodes, so that the definitions of resources that depend on it remain valid if the resource is moved to the other node.

Static IP address
This is a virtual address that binds onto an existing IP address on one of the cluster's public networks. This IP address provides access for clients and is not dependent on a particular node, but rather on a subnet that both nodes can access. Because this address is not the physical adapter's permanent address, it can bind and unbind to its paired adapter on the same network on the other node in the cluster. You can create multiple IP addresses on the same physical network through the Cluster Administrator.
Note: The cluster IP address is not to be used for file shares. That address is reserved for connecting to and managing the cluster through the network on which it is defined.


Network name (Virtual Server Name)


This provides an alternate computer name for an existing named computer. It is dependent on an IP address on one of the public networks. Together, an IP address resource and a network name become a virtual server: they provide an identity to the group that is not associated with a specific node, can be failed over to another node in the cluster, and can be accessed by a client computer as a server in the Windows network. Users access the groups using this virtual server. A network name in the Windows network looks the same as a server to a network client computer: you can see it in the Network Neighborhood, and you can use a Terminal Services client to access it; it is a virtual server. What you need to do is create a file share resource on the virtual server. Your shared folder will then be available to the clients of this virtual server, and you can create more than one shared folder on one virtual server.
When creating a basic file share that is publicized to the network under a single name, you need to set it up to be dependent on the physical disk and the network name in the disk group in which you are creating the file share. The network name is already dependent on the IP address, so do not add that to the dependency list. You can also set the share permissions and advanced share resources. Users will access the cluster resources using \\<network_name>\<fileshare_name>.
The following steps explain how to create a DB2 data share volume:
1. Creating the IP address resource:
a. Right-click Disk Group 2, and select New -> Resource (Figure 10-32).


Figure 10-32 Create IP address resource

b. Enter an IP address resource name, for example, ipaddr2, and change the resource type to IP Address. Select Run this resource in a separate Resource Monitor, and click Next (Figure 10-33).

Figure 10-33 Enter IP address resource information

c. A list of possible owners is displayed; both nodes should remain assigned. Click Next (Figure 10-34).

Figure 10-34 Select possible owner of IP address resource

d. There are no resource dependencies for this resource, so click Next on the resource dependencies panel.
e. Enter your TCP/IP parameters (Figure 10-35). This will be the first virtual IP address. The value in the Network field identifies to the system which network the address is located on. Click Finish to create the resource.

Figure 10-35 Enter IP address for IP address resource

f. Right-click the resource and bring it online (Figure 10-36).

Figure 10-36 Bring IP address resource online


2. Creating the network name resource:
a. Right-click Disk Group 1, and select New -> Resource.
b. Enter the virtual server name you want to use, for example, NN2, select Network Name as the resource type, and click Next (Figure 10-37).

Figure 10-37 Enter network name resource information

c. Both nodes will be possible owners. Click Next.
d. Add the IP address resource you created in step 1 as a resource dependency and click Next (Figure 10-38).

Figure 10-38 Enter dependencies information


e. Enter the virtual server name NASDB2NN1 into the Network Name Parameters field and click Finish (Figure 10-39).

Figure 10-39 Enter network name

f. It takes a few moments to register the virtual server name with your name server. After this has completed, bring the resource online (Figure 10-40).

Figure 10-40 Bring network name online


3. Creating the CIFS file share resource:
a. Right-click Disk Group 1, and select New -> Resource.
b. Enter a file share name, for example, FS2, and select either File Share or NFS Share (Figure 10-41).

Figure 10-41 Create share volume

c. Both nodes are possible owners. Click Next.
d. Add resource dependencies for the physical disk and network name that the file share will use and click Next (Figure 10-42).

Figure 10-42 Enter dependencies for Share Volume resource


e. Enter the share name FS2 and the path to the disk in this group (either a drive or a subdirectory). You can then set the user limit, permissions, and advanced file share options (Figure 10-43).

Figure 10-43 Enter Share Volume information

f. Click Finish and bring the resource online.

Create multiple volumes on one virtual server


You can create multiple share volumes on the one virtual server you created. Just add more file share resources to the same disk group, using the same IP address and network name resources.
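The file share resources described above can also be created with the cluster.exe command line. The following is only a sketch: the resource name FS3, the physical disk resource name, and the share path are assumptions for illustration, not values from our configuration:

```
REM Create an additional file share resource in the disk group
cluster resource "FS3" /create /group:"Disk Group 1" /type:"File Share"
REM Make it dependent on the physical disk and the network name
cluster resource "FS3" /adddep:"Disk I:"
cluster resource "FS3" /adddep:"NN2"
REM Set the share name and path, then bring the resource online
cluster resource "FS3" /priv ShareName=FS3 Path=I:\db2share
cluster resource "FS3" /online
```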


10.4 Getting connected to NAS


This section explains how to connect to IBM NAS.

10.4.1 Accessing the shares from our Windows clients


From Windows, accessing the share was extremely straightforward. We went to the Network Neighborhood (or My Network Places, as Windows 2000 prefers to call it), drilled down to the NAS 200, supplied a user name and password, right-clicked the shared directory, and chose Map Network Drive. We were then presented with a window requesting a drive letter and folder.
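The same mapping can be made from a command prompt with the net use command; the share and account names below are illustrative, not the actual values from our setup:

```
REM Map drive H: to a share on the NAS 200 (hypothetical names)
net use H: \\NAS200\share1 /user:NAS-DB2\nas_db2_user
```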

10.4.2 Accessing the shares for DB2 user


For DB2 to access the shares on the NAS, each of the shares in Windows NT or Windows 2000 must be defined as a service, and the service has to be available before the DB2 service starts. DB2 checks disk availability when its service starts. It remembers all the drives available to it at that time, and a drive added after a user logs in (when you map a network drive) is labeled not available for DB2.
To connect to the NAS before the DB2 service starts, you must install a service named AutoExNT; it works just like autoexec.bat for Windows 3.x and 9.x. It can be used to run any .bat file as a service before a user logs in to the Windows system, and this service must start before the DB2 service starts. AutoExNT is a Windows NT/2000 utility that is included in the Resource Kit. In our scenario it can be used to start the redirector, establish drive mappings, and start the applications installed as an NT service. It is a Windows NT/2000 service that executes a batch file called autoexnt.bat.
Here are the steps to install AutoExNT:
1. From the Resource Kit, copy the following files from the \ntreskit folder to the %Systemroot%\System32 directory:
- autoexnt.exe
- instexnt.exe
- servmess.dll
2. Install AutoExNT by typing the following at the command prompt from %Systemroot%\System32:
instexnt install

You can also use the /interactive switch to see the commands that are executed by AutoExNT when the system starts and a user has logged on.


3. Right-click My Computer and select Manage. Select Services and Applications, then double-click Services.
4. Right-click AutoExNT and select Properties. Select the General tab. Under Startup type, make sure that it is set to Automatic as shown in Figure B-2 (it should be set to Automatic by default).
5. Select the Log On tab. Under Log on as, click the button beside This account.
6. Click Browse and select the domain administrator account (for example, ITSOSJ\Administrator). Supply the password and confirm it. Click Apply, then click OK.
7. Create a file named autoexnt.bat and save it in the directory %systemroot%\system32. You may leave it empty for the moment. The actual content will depend on the applications that you are going to use and the drive mappings that you will need before starting the applications.
Note: We used the administrator account in our scenario, but we recommend that you create a special service account with the special rights Act as part of the operating system and Log on as a service.
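As an illustration, an autoexnt.bat for this scenario might contain nothing more than the drive mappings needed before the DB2 service starts; the server and share names here are hypothetical:

```
@echo off
REM Map the NAS share before the DB2 service starts (hypothetical names)
net use F: \\NAS300\DB2share /persistent:no
```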


Chapter 11. DB2 installation on IBM NAS


In this chapter, we describe the installation of DB2 on IBM NAS.

Copyright IBM Corp. 2002


11.1 DB2 for Windows on IBM NAS


Before installing DB2 UDB for Windows on IBM NAS, the following prerequisites have to be in place:
- The server machine where DB2 UDB is installed needs to be part of a domain or Active Directory. Make sure that the server has joined the domain prior to the DB2 installation, because the domain security context is used for the DB2 UDB service account.
- A DB2 UDB service account with the special rights Act as part of the operating system and Log on as a service needs to be in place (a detailed description of the service account setup can be found in the Windows NT/2000 documentation).
- The installation process of DB2 UDB requires a domain user account that has administrative rights on the machine where you perform the installation. If you plan to install DB2 UDB on IBM NAS devices, make sure that the service account has access rights to the required NAS devices.
- The default installation folder for DB2 for Windows is C:\Program Files\SQLLIB. If you plan to install DB2 UDB on an IBM NAS device, make sure that the target NAS volume or directory is mapped either to the default installation path or to the path you choose for the installation. If you decide to install DB2 on a NAS device instead of a locally attached disk, you have to make this device accessible to your server before you begin the installation, as described in the next section.
- During the installation process, you will be asked to provide a service user account that will be used by DB2 UDB 7.2. The default service user account generated by the setup program is db2admin. You can accept the default user account or create your own. If you choose to use an existing account, you must use the password that was previously set for this user account.
For our test environment, we decided to install the DB2 UDB code on the internal disks of the server.


11.1.1 DB2 for Windows Objects on IBM NAS


Before you can create a database that resides on a NAS device, you need to create space on the NAS system and make this space accessible to the DB2 server.

Mapped network drive


IBM NAS devices are accessed as mapped network drives by DB2 UDB. The mapping of the network drives must occur before the DB2 UDB services start. The required network mapping is created using the AutoExNT service from the Microsoft Windows NT/2000 Resource Kit:
1. Install AutoExNT (...)
2. Create a shared folder on the NAS with the necessary access for the user that you used in the installation process.
3. Edit or create an AUTOEXNT.BAT file in the directory C:\WINNT\SYSTEM32 and append this line:
NET USE F: \\DB2_production_data

Create DB2 database


For DB2 UDB, it makes no difference whether a database is created on an IBM NAS device or on a disk that is local to the server, provided that the setup of the network drive in the previous step was successful. In the following example, we used the DB2 Create Database Wizard to create a database named SAMPLE. The database was created on the mapped network drive F: (drive F: was defined as a shared folder on the NAS volume FDrive on our IBM NAS).


1. From the IBM DB2 program folder, open the Control Center. Open the instance where you are going to create the SAMPLE database.
2. Right-click the Databases folder. Click Create and select Database using the wizard; see Figure 11-1.

Figure 11-1 DB2 Control Center launching database wizard


3. For the database name, type SAMPLE. For the default drive, choose F:. For the alias, type SAMPLE. See Figure 11-2.

Figure 11-2 DB2 UDB 7.2 Create Database Wizard
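The same database can also be created from the DB2 command line processor; this sketch is equivalent to the wizard settings above:

```
REM Create the SAMPLE database on the mapped NAS drive F:
db2 create database SAMPLE on F:
```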


Chapter 12. Backup and recovery options for DB2 UDB and IBM NAS
In this chapter we describe how IBM NAS True Images can be used for DB2 backup and recovery solutions. We start with a brief description of DB2 standard backup and recovery options, followed by a short overview of DB2 True Image support. Finally, we describe DB2 backup and recovery scenarios using IBM NAS True Images.


12.1 Backup and recovery considerations on IBM NAS


PSM point-in-time images provide a near-instant virtual copy of an entire storage volume. These point-in-time copies are referred to as IBM NAS True Images and are created and managed by the PSM software. These instant virtual copies have the following characteristics:
- Normal reads and writes to the disk continue as usual, as if the copy had not been made.
- Virtual copies are created very quickly and with little performance impact, as the entire volume is not truly copied at that time.
- Virtual copies appear exactly as the volume appeared when the virtual copy was made.
- Virtual copies typically take up only a fraction of the space of the original volume.
Because these virtual copies are created very quickly and are relatively small, functions that would otherwise have been too slow or too costly are now made possible. Combining this technology with DB2 UDB backup features allows you to implement new kinds of backup and recovery solutions. With a point-in-time copy of a database, a copy of production data can be produced with minimal application downtime.
For backup and recovery purposes, True Images of databases can be utilized for tasks such as the following:
- Off-load the database backup process from the primary database system. A DB2 backup can be performed on the secondary system. The DB2 backup can then be restored on either the primary system or on another system. A roll-forward can then be issued to bring the database to a particular point in time, or until the end of the logs is reached.
- Perform a fast primary database restore. An IBM NAS True Image can be used to reestablish the primary copy to the point in time at which the True Image was taken. Then a roll-forward can be issued on the primary database to bring the database to a particular point in time, or until the end of the logs is reached.
An example of how IBM NAS True Images could be used for off-loading a database backup is shown in Figure 12-1.


Figure 12-1 Database Backup from True Image Copy

In this example, a True Image taken from a database on the primary system (instance DB2A) is accessed by the secondary system (instance DB2B). On the secondary system, the DB2 backup utility is used to create a backup of the True Image copy. This backup image represents a valid point-in-time image of the primary database on system DB2A. An example of version recovery from a True Image is shown in Figure 12-2. In this example, a True Image copy is used to recover a database of instance DB2A to the point in time when the True Image copy of this database was taken.

Figure 12-2 Version recovery from True Image


The usage of True Images is not restricted to backup and recovery. True Image copies can also be utilized for tasks such as the following:
- Provide a transactionally consistent True Image of the database at the current point in time. This database can be used to off-load user queries that don't need the most current version of the database.
- Provide a standby database that can be accessed as a disaster recovery strategy if the primary database is not available. All logs from the primary database are applied to the secondary database so that it represents the most current transactionally consistent version of the primary database.

12.1.1 DB2 UDB standard backup and recovery methods


With DB2 UDB, a variety of backup and recovery scenarios can be implemented. Here we present a brief overview of the standard backup and recovery capabilities of DB2 UDB. A more detailed description of DB2's backup and recovery capabilities can be found in Chapter 6, Backup and recovery options for databases that reside on NetApp filers on page 95.

Standard backup
The most common approach to creating a backup image of a database is to terminate all connections to the database, take the database offline, and then, using the DB2 BACKUP DATABASE command, make a full backup image of the database. Another approach is to isolate and back up specific portions of the database, again by using the DB2 BACKUP DATABASE command. With this approach, full, incremental, and/or delta backup images can be made while the database remains online.

Standard recovery
A database that has been backed up by means of the DB2 backup utility can be recovered using either version recovery or roll-forward recovery. With version recovery, the database is returned to the state it was in at the time the last backup was taken; any changes made since that time are lost. With roll-forward recovery, a database can be returned to the state it was in at a specific point in time by first returning it to the state it was in the last time a backup was taken, and then rolling it forward, using DB2's transaction log files, to a specific point in time.
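As an illustration of these two methods, the following command sequence sketches an offline backup followed by a restore and roll-forward recovery. The database alias SAMPLE, the backup location X:\backups, and the timestamp are illustrative values only, not taken from this environment:

   db2 backup database sample to X:\backups
   db2 restore database sample from X:\backups taken at 20020715120000
   db2 rollforward database sample to end of logs and stop

The timestamp used with TAKEN AT is the one reported when the backup completes.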


12.1.2 DB2 UDB NAS True Image support


Backup and recovery of DB2 UDB databases residing on IBM NAS volumes can take a faster, more efficient approach by using the True Image capabilities of IBM NAS PSM. In order to use these capabilities, specific DB2 features and functions have to be in place.

Suspend I/O
With PSM, a point-in-time True Image of a database can be taken. Because this function works at the file and logical volume level rather than at the application (database) level, special means have to be in place that allow PSM to take a consistent point-in-time copy of the database. This can be achieved either by taking the database offline while a True Image copy is running, or by providing special commands that suspend database I/O for this period of time. Beginning with Version 7.1 (FixPak 2), new DB2 commands were introduced that provide the capability to use True Image and split mirroring technology while DB2 is online. Suspend I/O supports continuous system availability by providing a full implementation for taking a True Image without shutting down the database. The new DB2 commands are SET WRITE SUSPEND FOR DATABASE and SET WRITE RESUME FOR DATABASE.

WRITE SUSPEND
When the write suspend command (SET WRITE SUSPEND FOR DATABASE) is executed, all write operations to table spaces and log files for that particular database are suspended. Read-only transactions are not suspended and are able to continue execution against the suspended database, provided they do not request a resource that is being held by the suspended I/O process. In addition, while I/O is suspended, applications can continue to process insert, update, and delete operations using data that has been cached in the database's buffer pool(s). A database connection must exist before this command can be submitted.

WRITE RESUME
The write resume command (SET WRITE RESUME FOR DATABASE) lifts an active suspension and allows all write operations to table spaces and log files that are used by a particular DB2 UDB database to continue. You need to be connected to that database before this command can be submitted.
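Putting the two commands together, write activity typically needs to be suspended only for the short time it takes to create the persistent image. A minimal sketch for a database with the illustrative alias SAMPLE (the persistent image itself is created through the NAS Administration Program, as described in 12.2.4):

   db2 connect to sample
   db2 set write suspend for database
   (create the PSM True Image of the NAS volumes now)
   db2 set write resume for database
   db2 connect reset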


Initialize True Image copies


Taking a True Image copy from an online database requires that you suspend all I/O on that database (SET WRITE SUSPEND FOR DATABASE). Therefore, the True Image copy of the database will be in write suspend mode, too. In order to access a True Image copy from a DB2 instance, the True Image needs to be initialized. With Version 7.1 FixPak 2, a new DB2 command (db2inidb) was introduced which is used to initialize True Image copies and make them accessible to a DB2 instance. Executing the db2inidb command against a True Image copy of a database will perform one of the following actions:
 - Perform database recovery.
 - Put a mirrored copy of a database in the roll-forward pending state so that it can be synchronized with the primary database.
 - Allow a mirrored copy of a database to be backed up, thus providing a way to back up a large database without having to take it offline.

Which of these actions is performed is determined by the option that is specified when the db2inidb command is executed:
 - snapshot: The mirrored copy of the database will be initialized as a read-only clone of the primary database.
 - standby: The mirrored copy of the database will be placed in roll-forward pending state. New logs from the primary database can be retrieved and applied to the mirrored copy of the database. The mirrored copy of the database can then be used in place of the primary database if, for some reason, it goes down.
 - mirror: The mirrored copy of the database will be placed in roll-forward pending state and is to be used as a backup image, which can be used to restore the primary database. If the database is in an inconsistent state, it will remain in that state and any in-flight transactions will remain outstanding.
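Assuming the True Image copy is cataloged under the illustrative database name NASDB, the three options translate into the following invocations (only one of them is issued, depending on the intended use of the copy):

   db2inidb nasdb as snapshot   (read-only clone, for example for reporting or backup)
   db2inidb nasdb as standby    (roll-forward pending; logs from the primary can be applied)
   db2inidb nasdb as mirror     (roll-forward pending; used to restore the primary database)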

Database reallocation
Beginning with Version 7.2 (Version 7.1 FixPak 4), the new command db2relocatedb was introduced. This command allows you to rename or relocate a database or parts of it (for example, containers or the log directory). The intended changes are specified in a configuration file which has to be provided by the user of the command.


This command is needed if you plan to have a database name for your True Image copy that is different from the name of the primary image. Furthermore, if the directory structure of your True Image and primary image is not the same, db2relocatedb can be used to make the necessary changes to the DB2 instance and the database (see 12.2.6, Accessing True Image copy overview on page 220 for a detailed description). A potential relocate database scenario is depicted in Figure 12-3, where the True Image copy will be accessed as database NASDB. Furthermore, the mapping of the volume directories to drives differs between the Primary and Secondary Site (on the Primary Site the directories for DB2 data and DB2 logs are mapped to drives F: and I:; on the Secondary Site the corresponding True Image copies are mapped to G: and K:). In order to make the True Image copy accessible to the Secondary Site, the database name and the drive settings need to be adjusted with the db2relocatedb command.

Mounts on the Primary Site:
   directory for DB2 data --> F:\db2_data\db2
   directory for DB2 logs --> I:\db2_logs

Mounts on the Secondary Site:
   mount ...snapshot.1\db2_data\db2 --> G:\
   mount ...snapshot.1\db2_logs     --> K:\

db2relocatedb command used:
   db2relocatedb rname_db.sql

Contents of rname_db.sql:
   db_name=sample,nasdb
   db_path=f:,g:
   instance=db2
   log_dir=i:,k:

Figure 12-3 DB2relocatedb scenario

Another way to make the necessary changes to a True Image copy is to use the new RELOCATE option of the db2inidb command (available since Version 7.2/Version 7.1 FixPak 4). For our example, the syntax would be (assuming that we initialize the True Image copy as a DB2 snapshot): db2inidb NASDB as snapshot RELOCATE USING rname_db.sql


12.2 DB2 UDB considerations for PSM True Images


DB2 UDB databases residing on IBM NAS volumes can use a faster, more efficient approach for backup and recovery by using the True Image capabilities of IBM NAS PSM. Furthermore, True Images provide an elegant, fast, and easy way to create read-only copies of your database for reporting and data mining purposes. Before exploiting this technology, however, you should keep the following in mind:
 - DB2 UDB raw devices are currently not supported by IBM NAS True Image.
 - A PSM True Image copy can be created as a read-only or read-write True Image copy. If you plan to use True Images for backup, reporting, or test purposes, the True Images must be created as read-write copies.
 - Depending on the True Image solution you want to implement, the allocation of your database files (containers, DB2 control files, and log files) has to be planned accordingly.
 - IBM NAS True Image is not a substitute for backup and recovery on tape or dedicated disks. Keep in mind that IBM NAS True Image produces an image of your database that resides on a virtual disk. If a disk crash hits your primary database image, the True Image copy is also damaged. Therefore, it is still strongly recommended to keep backup images of your database on tapes or dedicated disks!

12.2.1 Getting DB2 UDB prepared for IBM NAS True Image
To create a virtual copy of your database image, you have to make sure that all the required files are captured. For DB2 UDB, this includes the following objects:
 - Containers (SMS or DMS files)
 - DB2 control files
 - DB2 configuration files and DB2 log files

For the setup of your database on IBM NAS, we recommend that the database files and the database's corresponding log files be physically stored in two separate IBM NAS volumes. In the event that a database recovery operation becomes necessary, maintaining separate volumes will enable you to easily restore the database files from the appropriate True Image of the database volume, and then perform a roll-forward recovery operation using the original database archive log files.


The location used to store a database's log files is determined by the value of the log path parameter in the database's configuration file. To change the location of a database's log path, issue the following command:
   db2 UPDATE DB CFG FOR [Database Alias] USING NEWLOGPATH [Location]
In this command, Database Alias is the alias for the database whose configuration is to be modified, and Location is the location where the database log files are to be stored.
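For example, to move the log files of a database with the illustrative alias SAMPLE to a dedicated directory on drive I:, you would issue:

   db2 UPDATE DB CFG FOR sample USING NEWLOGPATH I:\db2_logs

Note that the new log path does not take effect until all applications have disconnected from the database and the database is activated again.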

12.2.2 PSM configuration


Before you can create an IBM NAS True Image, you first have to configure the Persistent Storage Manager. The configuration requires you to create PSM caches and to specify some general settings for the PSM cache management policy. The PSM cache management policy is defined via the Persistent Image Global Settings menu of the IBM NAS administration application, and the PSM caches themselves are set up via the Volume Settings menu. Both tasks can be accomplished by accessing the NAS Administration Program, either via the Web interface or using Terminal Services. The Disks and Volumes task of the main menu includes both functions (Persistent Image Global Settings and Volume Settings).

Persistent Image Global Settings


For Persistent Image Global Settings, choose the following from the main menu options: 1. Click Disks. 2. Click Persistent Storage Manager. 3. Click Persistent Image Global Settings.


Figure 12-4 PSM True Image: Global Settings

The Persistent Image Global Settings offer the following options:
 - Maximum persistent images: The maximum number of images that you can create per volume. The default value is 250.
 - Inactive period (quiescent period): The idle time (on the volume) that PSM will wait for before creating a True Image or persistent image. The default value is 5 seconds.
 - Inactive time-out (quiescent time-out): The time that PSM is willing to wait for quiescence. If the quiescent period (for example, 5 seconds) does not occur within the specified quiescent time-out (for example, 15 minutes), PSM will force a persistent image creation. The default value is 15 minutes.
 - Persistent image files location: The drive where the images will be created. Note that you can only select one location for all the images (of your volumes). So if you're planning to create several images of each volume, you need to have enough space on this drive. The exact image size will depend on the changes made to the volume. The default value is D: (maintenance), but you may want to put it on a fault tolerant array.

For our example, we set the maximum persistent images to 20. For the inactive period, we assume that because of the DB2 WRITE SUSPEND the lowest available inactive period, which is 1 second, would be sufficient.


Volume settings
The menu option Volume Settings presents a list of all available NAS volumes in your system (see Figure 12-5). Remember that these disks are NAS logical disks! From the list of available NAS volumes, choose the one you want to configure, that is, the one for which you want to define the PSM settings.

Figure 12-5 PSM True Image: Select Volume for configuration

Figure 12-6 PSM True Image: Volume Settings

For Volume Settings, the following options are available (see Figure 12-6):
 - Warning threshold reached... (cache full warning threshold): The percentage of the cache size at which warnings are sent. This is done to inform the NAS administrator that it is time to save the images before unwanted deletion of the first few persistent images occurs. The logs for this option are saved in the NT Event Viewer, so you can check for them using either Internet Explorer or a Terminal Services client.


 - Begin persistent image deletions: The percentage of cache size that, if reached, will begin deleting images on a First In First Out basis. The default value is cache 90% full.
 - Cache size: The size of the PSM cache allocated from the PSM volume location. The default value is 1 GB.

In our example, we set the cache size for volume H: (drive) to 10% because this volume was used for test purposes only. As a rule of thumb, we recommend setting the volume cache sizes to 20% for production environments. Figure 12-7 on page 216 shows that the allocated PSM cache size has changed to 10% of the total volume capacity.

Figure 12-7 PSM True Image: Volume List

12.2.3 Options for IBM NAS True Image copies


A PSM True Image copy can be created as a read-only or read-write True Image (see 8.4.3, PSM True Image: read-only or read-write on page 144). If you plan to access a True Image copy from a non-production or secondary system for backup, reporting, or test purposes, you have to create the True Image as a read-write PSM True Image copy. This is because read-write access to some DB2 objects is needed when the True Image copy is initialized with the db2inidb command.

Important: If you intend to use IBM NAS True Image for version recovery or roll-forward recovery of the primary or production system, we strongly recommend that you do not use the read-write option for the PSM True Image copy!


12.2.4 Creating an IBM NAS True Image


A PSM True Image copy can be created by accessing the NAS system either via Web interface or using Terminal Services. In order to do so, you have to select the following functions from the NAS Administration Program: 1. Click Disks. 2. Click Persistent Storage Manager. 3. Click Persistent Images.

Figure 12-8 PSM True Image Copy: Create new Copy

From the Persistent Images menu, select the New function. The Create Persistent Image box will appear (see Figure 12-9).

Figure 12-9 PSM True Image Copy: Volume Selection


Here you can select the volumes you want to be included in your True Image. For this example, the F: drive was chosen. Furthermore, you can choose whether to create a read-only or a read-write PSM True Image copy, and specify the retention weight and the name of the PSM True Image copy. The creation of a PSM True Image copy might take a few seconds, depending on the size of the data you included in the copy. After completion, the Persistent Images list will show all new and existing PSM True Image copies in your system. In our example, besides a read-write copy of volume F:, we created three additional copies (see Figure 12-10).

Figure 12-10 PSM True Image Copy: Persistent Images List

12.2.5 Restoring an IBM NAS True Image


Restoring an IBM NAS True Image copy can be done with the NAS Administration Program. In order to do so, you have to select the following functions: 1. Click Disks. 2. Click Persistent Storage Manager. 3. Click Restore Persistent Images. From the list of available True Image copies, select the one you want to restore and click the Restore button (see Figure 12-11).


Figure 12-11 PSM True Image Copy: Restore read-write True Image

The successful completion of a True Image restore is recorded in the NAS system log. To check the NAS system log, select the following functions from the NAS Administration Program (see Figure 12-12): 1. Click Maintenance. 2. Click Logs. 3. Click System Log.

Figure 12-12 NAS System Log

From the list of events, select the one with the appropriate time stamp (column Time) and select Event Details... (see Figure 12-13).


Figure 12-13 PSM True Image Copy: System Log Details

12.2.6 Accessing True Image copy overview


True Image copies appear to the end user as a mounted drive (or directory on UNIX), that is, as special (virtual) subdirectories. If enabled by the NAS administrator, each user can have access to copies of his or her files, as saved in the persistent images.

NAS directory structure


Figure 12-14 illustrates how files might appear.

Figure 12-14 NAS Volumes with allocated PSM cache


Figure 12-14 on page 220 shows the directory structure of an IBM NAS machine as seen by the NAS administrator. In our example, the NAS volume F: (FDrive) is dedicated to all database data in directory DB2_data, except the DB2 log files, which are allocated on NAS volume I: (IDrive) in directory db2_logs. Besides the directories DB2_data and db2_logs (in our example defined as Windows shared folders), dedicated PSM cache directories (SNAPSHOTS) are allocated on NAS volumes FDrive and IDrive. The name of the PSM cache folder was set to SNAPSHOTS by the NAS administrator at PSM configuration time (see PSM configuration on page 213).

For each True Image copy of a NAS volume, a dedicated subdirectory is allocated in the PSM cache by PSM at the time the True Image is taken. In our example, a True Image copy of the NAS volume FDrive is stored in subdirectory snapshot.1. Figure 12-15 shows that PSM True Image copy creates an exact (virtual) copy of a NAS volume. In our example, a (virtual) copy of the FDrive (directory structure and all files) is created under F:\SNAPSHOTS\snapshot.1.

Figure 12-15 NAS volumes with PSM cache

Accessing a True Image copy


NAS volumes or directories can be made accessible to end users by defining them as Windows shared folders or UNIX mountable devices that can be mapped or mounted on the end user server. In the example depicted in Figure 12-16, the NAS administrator has set up the directories F:\DB2_data and F:\SNAPSHOTS\snapshot.1\DB2_data as shared folders. In order to access the shared folders, they have to be mounted on a server (as mapped network drives on Windows or mountable devices on UNIX).


In our example, the NAS directory F:\DB2_data\DB2..., where we created our sample database, is mounted as a network mapped drive on server DB2A. The NAS directory F:\SNAPSHOTS\snapshot.1\DB2_data\DB2..., which contains the True Image copy of this database, is mounted as a network mapped drive on server DB2B. Because the directory on each server is mounted as f:\DB2\..., both servers see the same directory structure; in this way, the differing directory structures of the primary image and the True Image copy are masked.

Figure 12-16 How to access PSM True Image copies (both DB2A and DB2B mount f:\DB2\... from the IBM NAS)

If, for any reason, the same drive letter or mount point cannot be used for the primary image and the True Image copy (this is the case if you plan to access more than one True Image copy of a primary image at a time), a different drive letter or mount point can be used. In this case, however, the path settings of the True Image need to be adjusted at True Image initialization time using the RELOCATE option of the db2inidb command (a RELOCATE example can be found in 12.3.4, PSM True Image copy as DB2 UDB True Image database on page 229).


12.2.7 Some considerations about cache size and location


Once a persistent image is created, the PSM cache must keep a copy of any and all changes to the original file. Therefore, the cache for a specific True Image copy could eventually grow to be as big as the original volume. The maximum cache storage size is configured by the administrator.

Important: If insufficient storage is allocated, then not all the changes can be stored. The PSM cache would then be made invalid, as it would have some good information and some missing information. For this reason, if the PSM cache size is exceeded, the caches will be deleted, starting with the oldest cache first. It is highly recommended that the NAS administrator configure a warning threshold that will signal if the cache exceeds the warning level. PSM caches can neither be backed up nor restored from tape. Therefore, the tape-archive backup program should not be configured to back up the PSM caches.

Tip: To calculate the initial size of the PSM cache, sum up 20% of the capacity of each volume from which you will create True Images. For example, suppose you have the following disk configuration, and you are planning to create a True Image of each disk drive:
   F: 9.75 GB
   I: 9.75 GB
   PSM cache size = (9.75 + 9.75) x 20% ~ 3.9 GB
Therefore, the Persistent Image files location should be greater than or equal to 4 GB.


12.3 Using IBM NAS True Image with DB2 UDB


In this section we provide an overview of how to utilize IBM NAS True Image capabilities for implementing advanced backup and recovery solutions. We will start with a description of our test environment, and then offer some more general guidelines on how to prepare and take IBM NAS True Images of online and offline DB2 databases.

12.3.1 System environment


Figure 12-17 shows the system environment we used for the scenarios in the subsequent sections. It consisted of two NT servers, a Primary Domain Controller (PDC) running Windows 2000, and two IBM NAS appliance machines (IBM NAS 200 and IBM NAS 300).

Figure 12-17 Our test environment (servers DB2A and DB2B, the PDC running Windows 2000, NAS 200, and NAS 300)

A DB2 instance was created on each NT server (DB2A and DB2B). Our test database SAMPLE was created on DB2A. Therefore we call this machine the primary server and the database image on that server the primary image. The second instance (DB2B) was used to initiate and access True Image copies of the SAMPLE database. Therefore we call this machine the secondary server.


Directory structure
For the primary (database) image, two dedicated NAS volumes were reserved: one for the DB2 logs and a second for the remaining database objects (containers and control files). The directory structure we used is shown in Figure 12-18.

Figure 12-18 NAS directory structure for scenario environment

The directories DB2_data and db2_logs were set up as shared folders. The folder names we used were \\db2_production_data for the DB2_data directory and \\db2_production_logs for the db2_logs directory. Both shared folders were mounted on the primary server: \\db2_production_data was mapped to drive letter F: and \\db2_production_logs to drive letter I:.

The SNAPSHOTS directory on each of the NAS volumes (FDrive and IDrive) is a dedicated PSM cache directory which was allocated during the PSM configuration (for details on the PSM configuration, refer to 12.2.2, PSM configuration on page 213). For each True Image we took, PSM allocated a dedicated subdirectory in the SNAPSHOTS directory, for example, snapshot.1.

In order to make a True Image copy accessible to our secondary server, the directory of that image is set up as a shared folder and mounted on the server. In our example, the True Image copies of the primary database are allocated in the following directories: F:\SNAPSHOTS\snapshot.1\DB2_data and I:\SNAPSHOTS\snapshot.1\db2_logs. Both directories were set up as shared folders (\\db2_snapshot_data and \\db2_snapshot_logs) and mounted on drives F: and I: on the secondary server (DB2B).


Figure 12-19 Directory structure for primary and secondary images

Database setup
In our test environment, a DB2 instance was created on each of the primary and secondary servers (so binaries and instance related DB2 objects were allocated on disks local to each server). The test database SAMPLE was created on the primary server on the F: drive (the drive maps to the shared folder DB2_data on the NAS volume FDrive). After database creation, the DB2 log path was changed to drive I: (the drive maps to the shared folder db2_logs on the NAS volume IDrive) and logretain was switched on.
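The database setup just described can be sketched as the following command sequence; the drive letters and alias match our environment, but treat the exact values as illustrative for your own setup:

   db2 create database sample on F:
   db2 update db cfg for sample using NEWLOGPATH I:\db2_logs
   db2 update db cfg for sample using LOGRETAIN RECOVERY

Note that after LOGRETAIN is switched on, the database is placed in backup pending state, so an initial full backup is required before it can be used.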

12.3.2 Taking a True Image of an offline DB2 UDB database


True Image copies taken from an offline database can be utilized in different ways. They can be used, for example, as an easy way to create a read-only copy of a database image, or to take a virtual backup copy of a database that can be used as a backup image for a fast version restore. True Images from an offline database do not require you to put the database into suspend and resume I/O mode. Hence, after restoring a True Image copy of the database, a re-initialization of the database is not required.

Overview
Figure 12-20 shows the necessary steps for an offline True Image. Before you can start taking a True Image, the database needs to be taken offline (t2). As soon as the database is offline, a True Image can be taken (t3). After completion, the database can be brought back online (t4).


Figure 12-20 True Image of an offline DB2 database (t1: db is online; t2: get db offline; t3: take True Image; t4: get db online)

Required steps
In order to create a True Image copy of an offline database, follow these steps:
1. Set the database offline: Disconnect all users from the database.
2. Disconnect from the database:
   db2 connect reset
   Terminate all command line back-end processes by issuing:
   db2 terminate
3. Create the True Image: Create a True Image of the required volumes (for a detailed description of the required steps, please refer to 12.2.4, Creating an IBM NAS True Image on page 217).
4. Resume access to the database: Issue the following command:
   db2 connect to <database alias>


12.3.3 Taking a True Image of an online DB2 UDB database


True Image copies taken from an online database can be utilized in different ways. This type of True Image can be used for creating a read-only copy of a database, as well as for creating virtual backup images that can be used for version recovery or roll-forward recovery. Taking a True Image of an online database requires special attention to the log files of the primary DB2 instance. If you intend to use IBM NAS True Image for roll-forward recovery, for example, you have to ensure that the log files of your primary site do not get overwritten by a True Image restore. Having a dedicated NAS volume for the DB2 log files and separating these files from other DB2 files and data is one way to deal with this requirement.

Overview
Figure 12-21 shows the necessary steps for taking an online True Image. Before taking a True Image, all write I/O to the database needs to be suspended (t2). After suspending write I/O, a point-in-time image of the database can be created (t3). After the PSM True Image has finished, I/O to the primary image (database image) can be resumed (t4).

Figure 12-21 True Image of an online database (t1: read/write access; t2: suspend I/O; t3: take True Image; t4: resume I/O)


Required steps
In order to create a True Image copy of an online database, follow these steps:
1. Write suspend for database: Connect to the database, then issue the following command:
   db2 set write suspend for database
2. Create the True Image: Create a True Image of the required NAS volumes (for a detailed description of the required steps, please refer to 12.2.4, Creating an IBM NAS True Image on page 217).
3. Write resume for database: Issue the following command:
   db2 set write resume for database

12.3.4 PSM True Image copy as DB2 UDB True Image database
If a quick copy of a DB2 database is required to populate a test or development system, then a True Image of the production system can be utilized.

Overview
For this scenario, we used a second server with a dedicated DB2 instance. The True Image copy we used was accessible through a shared directory (a shared folder on Windows, NFS mounted on UNIX). Figure 12-22 gives a high-level description of that environment.


Figure 12-22 Accessing a True Image copy from a secondary server (DB2A and DB2B both mount their database directory as f:\DB2\...; on DB2B this maps to the True Image copy in f:\SNAPSHOTS\snapshot.1\DB2\...)

Here, the directory of the True Image (f:\SNAPSHOTS\snapshot.1\DB2\...) is mounted as f:\DB2 on the secondary server (DB2B). Therefore, both servers work with the same directory structure. In cases where this is not possible, the DB2 path settings have to be adjusted to the directory structure of the secondary server by using the RELOCATE option of the db2inidb command. Figure 12-23 briefly describes the required steps. At t1, an online True Image is taken. The created True Image copy is allocated in a PSM cache directory by PSM. After the PSM cache directory is made accessible to server DB2B (t2), the db2inidb command is used to make the True Image copy of the primary database image accessible to the secondary server (t3). In our example, the True Image is accessed as a DB2 True Image database.


Figure 12-23 Initiate database as DB2 True Image database (t1: take True Image; t2: get access to PSM cache; t3: db2inidb db as snapshot)

Required steps without RELOCATE option


In order to create a DB2 UDB True Image database, the following steps are required:
1. Create a True Image copy of the online database: Create a True Image of the required NAS volumes (for a detailed description of the required steps, please refer to 12.2.4, Creating an IBM NAS True Image on page 217).
2. Initialize the True Image copy from the secondary server: Log in to your secondary server. Make sure that the shared folder with the True Image copy is accessible, either as a network mapped device on Windows or a mountable device on UNIX. Catalog the database. You should see the database name of the primary database (remember, you are accessing a (virtual) copy of the primary image!):
   db2 list database directory
   Initialize the database as a True Image by issuing:
   db2inidb <database-name> as snapshot


Attention: If the directory structure of your True Image copy is the same as that of your primary image, the RELOCATE option of the db2inidb command is not required unless you intend to change the name of the database, for example, from SAMPLE to NASDB.
3. Access your database: The database should now be ready for access!

Figure 12-24 shows the command sequence we used in our scenario. Note that the first attempt to connect to the True Image copy of the SAMPLE database failed. Because this True Image copy was taken from an online database, the image itself was still in write suspend mode! In order to get access, the True Image copy of the primary database needs to be initialized with the db2inidb command.
C:\PROGRA~1\SQLLIB\BIN>db2 list database directory

 System Database Directory

 Number of entries in the directory = 1

Database 1 entry:

 Database alias          = SAMPLE
 Database name           = SAMPLE
 Database drive          = F:\DB2
 Database release level  = 9.00
 Comment                 =
 Directory entry type    = Indirect
 Catalog node number     = 0

C:\PROGRA~1\SQLLIB\BIN>db2 connect to sample
SQL20153N  The database's split image is in the suspended state.  SQLSTATE=55040

C:\PROGRA~1\SQLLIB\BIN>db2inidb sample as snapshot
Operation was successful.

C:\PROGRA~1\SQLLIB\BIN>db2 connect to sample

 Database Connection Information

 Database server         = DB2/NT 7.2.2
 SQL authorization ID    = NAS_DB2_...
 Local database alias    = SAMPLE

Figure 12-24 Screen Capture of the db2inidb command sequence
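The sequence in Figure 12-24 can also be scripted. The following Python sketch is our own illustration (the helper name build_snapshot_init is not part of DB2); it only assembles the CLP command strings, which you could then run on the secondary server:

```python
# Build the CLP commands that make an online True Image copy usable
# as a snapshot database (hypothetical helper, for illustration only).
def build_snapshot_init(database):
    return [
        "db2 list database directory",           # verify the copy is cataloged
        f"db2inidb {database} as snapshot",      # clear the write-suspend state
        f"db2 connect to {database}",            # the connect now succeeds
    ]

for cmd in build_snapshot_init("sample"):
    print(cmd)
```

Printing the commands first, instead of executing them blindly, lets you review the sequence before touching the snapshot copy.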


Required steps with RELOCATE option


If your True Image copy cannot be allocated to the same drives or directories as your primary copy, you need to use the RELOCATE option when initiating a DB2 True Image database. Figure 12-25 illustrates a scenario where the mount points of the primary and secondary servers are different. In order to access the database True Image copy on the second server, the DB2 path settings for this image have to be adjusted. In our scenario we use the RELOCATE option of the db2inidb command. It reads a configuration file (rname_db.sql) in which the new settings are specified.

Mounts on the Primary Site:
  directory for DB2 data --> F:\db2_data\db2
  directory for DB2 logs --> I:\db2_logs

Mounts on the Secondary Site:
  mount ...snapshot.1\db2_data\db2 --> G:\
  mount ...snapshot.1\db2_logs     --> K:\

Command used:
  db2relocatedb rname_db.sql

rname_db.sql:
  db_name=sample,nasdb
  db_path=f:,g:
  instance=db2
  log_dir=i:,k:
Figure 12-25 Different directory structure for database True Image copy
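Because rname_db.sql is plain text, it can be generated programmatically. A minimal Python sketch using the values from our scenario (the function name build_relocate_config is our own, not a DB2 utility):

```python
# Generate a relocation configuration file in the key=old,new format
# shown in Figure 12-25 (hypothetical helper, for illustration only).
def build_relocate_config(db, new_db, db_path, new_db_path,
                          instance, log_dir, new_log_dir):
    lines = [
        f"db_name={db},{new_db}",
        f"db_path={db_path},{new_db_path}",
        f"instance={instance}",
        f"log_dir={log_dir},{new_log_dir}",
    ]
    return "\n".join(lines) + "\n"

# Values from our scenario:
cfg = build_relocate_config("sample", "nasdb", "f:", "g:", "db2", "i:", "k:")
print(cfg, end="")
```

The resulting text would be written to rname_db.sql and passed to db2inidb ... RELOCATE USING, or to db2relocatedb.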

In order to create a DB2 UDB True Image database that needs to be relocated, the following steps are required:

1. Create a True Image copy of the online database:
   Create a True Image of the required NAS volumes (for a detailed description of the required steps, please refer to 12.2.4, Creating an IBM NAS True Image on page 217).


2. Initiate the True Image copy from the secondary server:
   Log in to your secondary server. Make sure that the shared folder with the True Image copy is accessible, either as a network mapped device on Windows or as a mountable device on UNIX. Catalog the database. Initiate the database as a True Image by issuing:
db2inidb <database-name> as snapshot RELOCATE USING <file-name>

3. Access your database: The database should now be ready for access! Figure 12-26 shows the command sequence we used in our scenario.

C:\>db2 list database directory on g:

 Local Database Directory on g:

 Number of entries in the directory = 1

Database 1 entry:

 Database alias          = SAMPLE
 Database name           = SAMPLE
 Database directory      = SQL00001
 Database release level  = 9.00
 Comment                 =
 Directory entry type    = Home
 Catalog node number     = 0
 Node number             = 0

C:\>db2inidb nasdb as snapshot relocate using rname_db_g.sql
Relocating database...
Files and control structures were changed successfully.
Database was cataloged successfully.
Database relocation was successful.
Operation was successful.

C:\>db2 list database directory

Database 1 entry:

 Database alias          = NASDB
 Database name           = NASDB
 Database drive          = G:\DB2
 Database release level  = 9.00
 Comment                 = Cataloged by db2relocatedb
 Directory entry type    = Indirect
 Catalog node number     = 0

Figure 12-26 The db2inidb RELOCATE command sequence


12.3.5 Creating a DB2 backup from a True Image


Creating a DB2 backup from a True Image copy of a database allows you to perform an online database backup with minimal impact on the production system. Furthermore, the backup process itself can be off-loaded from the primary to the secondary database system.

Overview
As a starting point for a DB2 backup from a True Image, a valid point-in-time copy (True Image) of the primary database system is required (t1); see 12.3.3, Taking a True Image of an online DB2 UDB database on page 228. At this point the True Image copy is a valid virtual image of the primary database. Using this virtual image requires that a DB2 instance, either local or remote, can access the True Image copy for backup purposes (t2 and t3).

Figure 12-27 DB2 Backup from a True Image Copy (panels: t1: True Image of online db; t2: get access to PSM cache; t3: database backup to tape)


Required steps
In order to create a backup copy of an online database, follow these steps:

1. Create a True Image copy of the online database:
   Create a True Image of the required NAS volumes (for a detailed description of the required steps, please refer to 12.2.4, Creating an IBM NAS True Image on page 217).

2. Get access to the True Image copy:
   Log in to your secondary server. Make sure that the shared folder with the True Image copy is accessible, either as a network mapped device on Windows or as a mountable device on UNIX. Catalog the database. You should see the database name of the primary database; remember, you are accessing a (virtual) copy of the primary image!

   db2 list database directory

3. Take the backup of the True Image copy by issuing the following command:

   db2 backup database <database> to <device>
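Steps 2 and 3 can be strung together in a small script. A hedged Python sketch (the helper names and the dry-run flag are our own; by default the commands are only printed, not executed):

```python
import subprocess

# Commands for an off-loaded backup from the True Image copy
# (hypothetical helper, for illustration only).
def offloaded_backup_cmds(database, device):
    return [
        "db2 list database directory",                  # step 2: verify catalog
        f"db2 backup database {database} to {device}",  # step 3: take backup
    ]

def run_cmds(cmds, dry_run=True):
    for cmd in cmds:
        if dry_run:
            print(cmd)                        # inspect before running for real
        else:
            subprocess.run(cmd, shell=True, check=True)

run_cmds(offloaded_backup_cmds("sample", "/backup"))
```

With check=True, a failing step stops the sequence instead of silently continuing.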

12.3.6 Version recovery from a PSM True Image


For version recovery, the primary database image is replaced by a True Image copy of the database. In this way a database can be restored to the state it was in at the point in time the True Image was taken. This scenario assumes that only the database-related volumes (containers, control files, logs) are restored, and not the instance-related volumes (%SQLLIB%)!

Overview
This scenario requires that a valid True Image copy of the primary database exists. Before the True Image can be restored, the database must be taken offline (t1). For the restore, a True Image copy, which represents a valid image of the database at a prior point in time, is used to restore the primary database (t2). After the restore has completed successfully, the database image needs to be initialized if the True Image was taken from an online database (t3). If the True Image was taken from an offline database, initiation of the restored database is not required.

Note: For an offline True Image, we do not need to suspend I/O. Therefore, the True Image copy is not in a write-suspend state.


Figure 12-28 Version recovery from a PSM True Image (panels: t1: set db offline; t2: PSM restore True Image; t3: get db online)

Steps for restoring a True Image of an offline database


In order to perform a version recovery from a True Image that was taken while the database was offline, follow these steps:

1. Set the database offline: Terminate all database connections to the database by issuing:
db2 connect reset

Terminate all CLP back-end processes by issuing:


db2 terminate

At this point you might shut down the instance in order to prevent users from reconnecting to the database, by issuing:
db2stop

2. Restore the database from the True Image copy: On the IBM NAS Administration console, use the Restore Persistent Images menu to select and restore the appropriate True Image copy; see 12.2.5, Restoring an IBM NAS True Image on page 218.

3. Restart the DB2 instance, if it was stopped, by issuing:
db2start


4. Connect to the database: Because we used a True Image of an offline database for the version restore, no database initiation needs to be done after restoring the True Image.
db2 connect to <database>
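The offline version recovery sequence above can be captured as a checklist script. A Python sketch (our own illustration; the PSM restore itself is a manual step on the NAS Administration console, so it appears only as a placeholder line, never as a command to execute):

```python
# Checklist of CLP commands for version recovery from an offline True
# Image (hypothetical helper, for illustration only; commands are printed).
def offline_version_recovery_cmds(database):
    return [
        "db2 connect reset",           # step 1: drop connections
        "db2 terminate",               #         end CLP back-end processes
        "db2stop",                     #         optional: stop the instance
        "REM restore the True Image via the NAS Restore Persistent Images menu",
        "db2start",                    # step 3: restart the instance
        f"db2 connect to {database}",  # step 4: no db2inidb needed (offline copy)
    ]

for cmd in offline_version_recovery_cmds("sample"):
    print(cmd)
```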

Required steps for online True Image


In order to perform a version recovery from an online True Image, follow these steps:

1. Set the database offline: Terminate all database connections to the database by issuing:
db2 connect reset

Terminate all CLP back-end processes by issuing:


db2 terminate

At this point you might shut down the instance in order to prevent users from reconnecting to the database, by issuing:
db2stop

2. Restore the database from the True Image copy: On the IBM NAS Administration console, use the Restore Persistent Images menu to select and restore the appropriate True Image copy; see 12.2.5, Restoring an IBM NAS True Image on page 218.

3. Restart the DB2 instance, if it was stopped, by issuing:
db2start

4. Initiate the database as a True Image: Because we used a True Image of an online database for the version restore, a database initiation needs to be done after restoring the True Image:
db2inidb <database> as snapshot

5. Connect to the database: Issue the following command:


db2 connect to <database>


12.3.7 Roll-forward recovery from a True Image


For roll-forward recovery, the primary database image is replaced by a True Image copy of the database, followed by applying the changes stored in the log files of the primary image. In this way a database can be restored to the most current point in time.
Note: If you plan to use a True Image copy for roll-forward recovery, the True Image copy of the DB2 logs should not be restored. If this is not possible (because all database objects are in the same True Image copy), you can save the logs of the primary image before the mirrored logs are copied over to the primary system. Then, you must copy back the primary image logs.

Overview
This scenario requires that an online True Image of the primary database exists. Before the True Image can be restored, the database must be taken offline (t1); see Figure 12-29. A True Image copy of the database, which represents a valid image of this database at a prior point in time, is used to restore the primary database (t2). After restoring the True Image, the database needs to be initialized with the db2inidb <database> as mirror command (t3). This places the database image in roll-forward pending state. In t2 the DB2 logs of the primary site were not replaced. By using these logs we can roll forward the primary database with the ROLLFORWARD command (rollforward <database> to end of logs and complete).

Figure 12-29 Roll-forward recovery from database True Image (panels: t1: set db offline; t2: PSM restore True Image; t3: db2inidb db as mirror; t4: roll-forward db)


Steps for restoring a True Image of an online database


In order to perform a roll-forward recovery from a True Image that was taken while the database was online, follow these steps:

1. Set the database offline: Terminate all database connections to the database by issuing:
db2 connect reset

Terminate all CLP back-end processes by issuing:


db2 terminate

At this point you might shut down the instance in order to prevent users from reconnecting to the database, by issuing:
db2stop

2. Restore the database from the True Image copy: On the IBM NAS Administration console, use the Restore Persistent Images menu to select and restore the appropriate True Image copy; see 12.2.5, Restoring an IBM NAS True Image on page 218.

3. Restart the DB2 instance, if it was stopped, by issuing:
db2start

4. Initiate the database as a mirror: Because we used a True Image of an online database, we need to initialize the database after restoring the True Image:
db2inidb <database> as mirror

This places the database in roll-forward pending state.

5. Roll the database forward to end of logs and complete:
db2 rollforward <database> to end of logs and complete
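The whole roll-forward recovery sequence can likewise be captured as a checklist script. A Python sketch (our own illustration; the PSM restore is a manual step on the NAS Administration console and appears only as a placeholder line):

```python
# Checklist of CLP commands for roll-forward recovery from an online
# True Image (hypothetical helper, for illustration only; commands are printed).
def rollforward_recovery_cmds(database):
    return [
        "db2 connect reset",                 # step 1: drop connections
        "db2 terminate",                     #         end CLP back-end processes
        "db2stop",                           #         optional: stop the instance
        "REM restore the True Image via the NAS Restore Persistent Images menu",
        "db2start",                          # step 3: restart the instance
        f"db2inidb {database} as mirror",    # step 4: roll-forward pending state
        f"db2 rollforward {database} to end of logs and complete",  # step 5
    ]

for cmd in rollforward_recovery_cmds("sample"):
    print(cmd)
```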


Chapter 13. IBM NAS high availability


In this chapter we describe the high availability features of IBM NAS, which are among the most important features of NAS (the NAS 300 in particular offers even more features for high availability). We also describe some client responses to NAS 300 failover.


13.1 NAS 200 high availability


The NAS product is designed for high availability. For NAS 200, reliability and serviceability are delivered via these features:

- Six hot-swappable HDD bays with SCA-2 connectors support SAF-TE functions
- Standard ServeRAID-4L (model 200) or -4H (model 225) controllers support Active PCI failover
- RAID levels 0, 1, 1E, 5, 5E, 00, 10, 1E0, and 50
- ECC DIMMs combined with an integrated ECC memory controller correct soft and hard single-bit memory errors, while minimizing disruption of service to LAN clients
- Memory hardware scrubbing corrects soft memory errors automatically without software intervention
- ECC L2 cache processors ensure data integrity while reducing downtime
- Predictive Failure Analysis on HDD options, memory, processors, VRMs, and fans alerts the system administrator of an imminent component failure
- Three worldwide, voltage-sensing 250-watt power supplies feature auto restart and redundancy
- An integrated Advanced System Management Processor (ASMP) provides diagnostic, reset, Power On Self Test (POST), and auto recovery functions from remote locations, plus monitoring of temperature, voltage, and fan speed, with alerts generated when thresholds are exceeded. An optional ASM PCI adapter also allows for SNMP alerts via network connection when the administrator console is running either Tivoli NetView or Netfinity Director.
- Information LED panel provides visual indications of system well-being
- Light-Path Diagnostics and on-board diagnostics provide an LED map to a failing component, designed to reduce downtime and service costs. Easy access is provided to the system board, adapter cards, processor, and memory.
- CPU failure recovery in Symmetric Multi Processor (SMP) configurations does the following:
  - Forces the failed processor offline
  - Automatically reboots the server
  - Generates alerts
  - Continues operations with the working processor (if present)


13.2 NAS 300 high availability


The NAS 300 appliance provides all the high availability features of the NAS 200. In addition, its cluster feature provides increased reliability and availability. The system comes standard with dual node engines for clustering and failover protection. The dual Fibre Channel hubs provide IT administrators with high performance paths to the RAID storage controllers using fibre-to-fibre technology. The preloaded operating system and application code is tuned for the network storage server function, and designed to provide 24x7 uptime. The simple point-and-click restore feature makes backup extremely simple. With multi-level persistent image capability, recovery is quickly managed to ensure the highest availability and reliability.

The IBM TotalStorage NAS 300 connects to an Ethernet LAN. This rack-mounted system provides for power distribution, but sufficient power must be provided to the rack. The high availability of the IBM NAS 300 also comes from the following features:

- Designed for 24x7 operation
- Predictive Failure Analysis alerts the system administrator of an imminent component failure
- Dual node engines for clustering and failover
- Dual Fibre Channel hubs for high speed data transfer and contention
- Hot-swap power supplies for system redundancy
- Connectivity
- Support for multiple RAID levels: 0, 1, 3 and 5

Setting up a NAS 300 cluster was discussed in Chapter 10, Configuration of IBM NAS 200 and 300 on page 153. Therefore, in the following section, we discuss failover tests on the NAS 300.

13.3 Failover tests on NAS 300


We ran a number of failover tests on the NAS 300 in our test environment. First, we simply moved resources from one node to the other. After this was successful, we ran two other failover scenarios to track the NAS and Windows client response to the failover. Our last step was to track the DB2 system response to a failover on the NAS 300. All the tests were successful. In the following sections, we describe each type of failover event and the client response.


13.3.1 Creating a failover event


We created three types of failover events:
1. Moving resources between nodes using the Cluster Administrator:

Moving resources between nodes is also used to make sure the cluster is successfully set up. Please refer to 10.3.3, Setting up the Cluster Server on page 174 for details.
2. Restarting the Primary Node:

The second scenario we used was to restart the primary node. This is very easy, because it can be done using a Windows session connection; however, the client needs to reconnect to the server manually after the restart, and it is difficult to know when the node is back. We used this type of event for some failover tests when we only had a remote connection. In reality, this type of event only happens when the OS is forced to restart by a crashed application. Its client response is the same as for a type 1 event, because the cluster software has enough time to notify the other node when it is going down.
3. Powering off the Primary Node:

Turning off the power of the primary node was the last scenario we tested; here the network behaved differently (see Figure 13-1). In the responses analyzed below, we only consider events 2 and 3, since events 1 and 2 had the same results for the client.

13.3.2 Failover response


The cluster we used had two shared volumes, each one on a different network name (two virtual servers).

Failover response on network resource


Our test method was simple: keep pinging all the network resources. Because the IP address resource is the basic resource, if the IP address is not up, the server and its services cannot be available to the client.


Figure 13-1 Network resource response in type 3 failover

The result shows that in the type 2 event, the clustered resources, including the cluster IP and virtual IP addresses, were switched to the other node without interruption. In the type 3 event (Figure 13-1), all the clustered resources on the primary node became unavailable after the primary node was powered off. It took less than 10 seconds for the cluster to come back. One of the virtual servers (IP addresses) came back almost at the same time as the cluster IP address; the other virtual server took a few seconds more to come back.


Failover response on Windows client


In the Windows client, the response is a little different. We created a shared folder and kept writing a file to the folder from the client. The timing results are not accurate, because each time the shared volume on the NAS 300 is not available, the Windows client just keeps retrying. After a long wait, it returns either a successful result or a disk error. The network shared volume takes much longer than the IP address to come back. This is because shared volume access is a client-server communication: although the IP address is available almost immediately in a type 1 event, the connection still needs to be rebuilt. Fortunately, this is taken care of by the OS.

Failover response on DB2


Failover on DB2 is a bit more complex, because there are different levels of error response. The best scenario is when you do not try to access the database during the blackout period; you will not notice any change. But that is not going to happen if you have an online database.

Figure 13-2 Error message in the DB2 command center

If a database client tries to access the database during that 20-45 second blackout period, it will get a connection error. The only solution for this is to design the application so that it retries the connection until the resource is back.
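Such a retry wrapper might look like the following Python sketch (connect_with_retry is our own helper, not a DB2 API; the connect argument would wrap whatever DB2 connect call the application actually uses):

```python
import time

def connect_with_retry(connect, attempts=10, delay=5.0):
    """Call `connect` until it succeeds or the attempts run out.

    `connect` is any callable that raises on connection failure, for
    example a wrapper around a DB2 CLI/ODBC connect call. This retry
    wrapper is our own sketch, not part of DB2.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return connect()
        except Exception as error:       # e.g. a communication error
            last_error = error
            time.sleep(delay)            # wait out part of the blackout
    raise last_error
```

With the defaults of 10 attempts and a 5 second delay, the wrapper rides out the 20-45 second blackout described above.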


13.3.3 Load balancing


The MS Cluster Server cannot do load balancing on the same resource, so you need to manually distribute the disk groups across the two nodes to balance the cluster resource functions between them. This improves response time. It is recommended that you put storage for different applications into different disk groups, each using a different network name (a different virtual server), so the resources can be balanced between the nodes.

13.3.4 Administration considerations for NAS


Although the NAS 300 provides a failover feature, the system administrator still needs to be notified when a failover event has happened. Because many different problems can cause a failover to occur, it is important for the administrator to find out the reason for the failover in order to prevent it from happening again.


Abbreviations and acronyms

ABI  Application Binary Interface
ACE  Access Control Entries
ACL  Access Control List
AD  Microsoft Active Directory
ADSM  ADSTAR Distributed Storage Manager
AFS  Andrew File System
AIX  Advanced Interactive eXecutive
ANSI  American National Standards Institute
APA  All Points Addressable
API  Application Programming Interface
APPC  Advanced Program-to-Program Communication
APPN  Advanced Peer-to-Peer Networking
ARC  Advanced RISC Computer
ARPA  Advanced Research Projects Agency
ASCII  American National Standard Code for Information Interchange
ATE  Asynchronous Terminal Emulation
ATM  Asynchronous Transfer Mode
AVI  Audio Video Interleaved
BDC  Backup Domain Controller
BIND  Berkeley Internet Name Domain
BNU  Basic Network Utilities
BOS  Base Operating System
BRI  Basic Rate Interface
BSD  Berkeley Software Distribution
BSOD  Blue Screen of Death
BUMP  Bring-Up Microprocessor
CA  Certification Authorities
CAL  Client Access License
C-SPOC  Cluster single point of control
CDE  Common Desktop Environment
CDMF  Commercial Data Masking Facility
CDS  Cell Directory Service
CERT  Computer Emergency Response Team
CGI  Common Gateway Interface
CHAP  Challenge Handshake Authentication Protocol
CIDR  Classless InterDomain Routing
CIFS  Common Internet File System
CMA  Concert Multi-threaded Architecture
CO  Central Office
COPS  Computer Oracle and Password System
CPI-C  Common Programming Interface for Communications
CPU  Central Processing Unit
CSNW  Client Service for NetWare
CSR  Client/server Runtime
DAC  Discretionary Access Controls
DARPA  Defense Advanced Research Projects Agency
DASD  Direct Access Storage Device
DBM  Database Management
DCE  Distributed Computing Environment
DCOM  Distributed Component Object Model
DDE  Dynamic Data Exchange
DDNS  Dynamic Domain Name System
DEN  Directory Enabled Network
DES  Data Encryption Standard
DFS  Distributed File System
DHCP  Dynamic Host Configuration Protocol
DLC  Data Link Control
DLL  Dynamic Load Library
DNS  Domain Name System
DS  Differentiated Service
DSA  Directory Service Agent
DSE  Directory Specific Entry
DTS  Distributed Time Service
EFS  Encrypting File Systems
EGID  Effective Group Identifier
EISA  Extended Industry Standard Architecture
EMS  Event Management Services
EPROM  Erasable Programmable Read-Only Memory
ERD  Emergency Repair Disk
ERP  Enterprise Resources Planning
ERRM  Event Response Resource Manager
ESCON  Enterprise System Connection
ESP  Encapsulating Security Payload
ESS  Enterprise Storage Server
EUID  Effective User Identifier
FAT  File Allocation Table
FC  Fibre Channel
FDDI  Fiber Distributed Data Interface
FDPR  Feedback Directed Program Restructure
FIFO  First In/First Out
FIRST  Forum of Incident Response and Security Teams
FQDN  Fully Qualified Domain Name
FSF  File Storage Facility
FTP  File Transfer Protocol
FtDisk  Fault-Tolerant Disk
GC  Global Catalog
GDA  Global Directory Agent
GDI  Graphical Device Interface
GDS  Global Directory Service
GID  Group Identifier
GL  Graphics Library
GSNW  Gateway Service for NetWare
GUI  Graphical User Interface
HA  High Availability
HACMP  High Availability Cluster Multiprocessing
HAL  Hardware Abstraction Layer
HBA  Host Bus Adapter
HCL  Hardware Compatibility List
HSM  Hierarchical Storage Management
HTTP  Hypertext Transfer Protocol
IBM  International Business Machines Corporation
ICCM  Inter-Client Conventions Manual
IDE  Integrated Drive Electronics
IDL  Interface Definition Language
IDS  Intelligent Disk Subsystem
IEEE  Institute of Electrical and Electronic Engineers
IETF  Internet Engineering Task Force
IGMP  Internet Group Management Protocol
IIS  Internet Information Server
IKE  Internet Key Exchange
IMAP  Internet Message Access Protocol
I/O  Input/Output
IP  Internet Protocol
IPC  Interprocess Communication
IPL  Initial Program Load
IPsec  Internet Protocol Security
IPX  Internetwork Packet eXchange
ISA  Industry Standard Architecture
iSCSI  SCSI over IP
ISDN  Integrated Services Digital Network
ISNO  Interface-specific Network Options
ISO  International Organization for Standardization
ISS  Interactive Session Support
ISV  Independent Software Vendor
ITSEC  Information Technology Security Evaluation Criteria
ITSO  International Technical Support Organization
ITU  International Telecommunications Union
IXC  Inter Exchange Carrier
JBOD  Just a Bunch of Disks
JFS  Journaled File System
JIT  Just-In-Time
JNDI  Java Naming and Directory Interface
L2F  Layer 2 Forwarding
L2TP  Layer 2 Tunneling Protocol
LAN  Local Area Network
LCN  Logical Cluster Number
LDAP  Lightweight Directory Access Protocol
LFS  Log File Service (Windows NT)
LFS  Logical File System (AIX)
LFT  Low Function Terminal
LOS  Layered Operating System
LP  Logical Partition
LPC  Local Procedure Call
LPD  Line Printer Daemon
LPP  Licensed Program Product
LRU  Least Recently Used
LSA  Local Security Authority
LTG  Local Transfer Group
LUID  Login User Identifier
LUN  Logical Unit Number
LVCB  Logical Volume Control Block
LVDD  Logical Volume Device Driver
LVM  Logical Volume Manager
MBR  Master Boot Record
MCA  Micro Channel Architecture
MDC  Meta Data Controller
MFT  Master File Table
MIPS  Million Instructions Per Second
MMC  Microsoft Management Console
MOCL  Managed Object Class Library
MPTN  Multi-protocol Transport Network
MS-DOS  Microsoft Disk Operating System
MSCS  Microsoft Cluster Server
MSS  Maximum Segment Size
MSS  Modular Storage Server
MWC  Mirror Write Consistency
NAS  Network Attached Storage
NBC  Network Buffer Cache
NBF  NetBEUI Frame
NBPI  Number of Bytes per I-node
NCP  NetWare Core Protocol
NCS  Network Computing System
NCSC  National Computer Security Center
NDIS  Network Device Interface Specification
NDMP  Network Data Management Protocol
NDS  NetWare Directory Service
NETID  Network Identifier
NFS  Network File System
NIM  Network Installation Management
NIS  Network Information System
NIST  National Institute of Standards and Technology
NLS  National Language Support
NNS  Novell Network Services
NSAPI  Netscape Commerce Server's Application Programming Interface
NTFS  NT File System
NTLDR  NT Loader
NTLM  NT LAN Manager
NTP  Network Time Protocol
NTVDM  NT Virtual DOS Machine
NVRAM  Non-Volatile Random Access Memory
NetBEUI  NetBIOS Extended User Interface
NetDDE  Network Dynamic Data Exchange
OCS  On-Chip Sequencer
ODBC  Open Database Connectivity
ODM  Object Data Manager
OLTP  OnLine Transaction Processing
OMG  Object Management Group
ONC  Open Network Computing
OS  Operating System
OSF  Open Software Foundation
PAL  Platform Abstract Layer
PAM  Pluggable Authentication Module
PAP  Password Authentication Protocol
PBX  Private Branch Exchange
PCI  Peripheral Component Interconnect
PCMCIA  Personal Computer Memory Card International Association
PDC  Primary Domain Controller
PDF  Portable Document Format
PDT  Performance Diagnostic Tool
PEX  PHIGS Extension to X
PFS  Physical File System
PHB  Per Hop Behavior
PHIGS  Programmer's Hierarchical Interactive Graphics System
PID  Process Identification Number
PIN  Personal Identification Number
PMTU  Path Maximum Transfer Unit
POP  Post Office Protocol
POSIX  Portable Operating System Interface for Computer Environment
POST  Power-On Self Test
PP  Physical Partition
PPP  Point-to-Point Protocol
PPTP  Point-to-Point Tunneling Protocol
PReP  PowerPC Reference Platform
PSM  Persistent Storage Manager
PSN  Program Sector Number
PSSP  Parallel System Support Program
PV  Physical Volume
PVID  Physical Volume Identifier
QoS  Quality of Service
RACF  Resource Access Control Facility
RAID  Redundant Array of Independent Disks
RAS  Remote Access Service
RDBMS  Relational Database Management System
RFC  Request for Comments
RGID  Real Group Identifier
RISC  Reduced Instruction Set Computer
RMC  Resource Monitoring and Control
RMSS  Reduced-Memory System Simulator
ROLTP  Relative OnLine Transaction Processing
ROS  Read-Only Storage
RPC  Remote Procedure Call
RRIP  Rock Ridge Internet Protocol
RSCT  Reliable Scalable Cluster Technology
RSM  Removable Storage Management
RSVP  Resource Reservation Protocol
SACK  Selective Acknowledgments
SAK  Secure Attention Key
SAM  Security Account Manager
SAN  Storage Area Network
SASL  Simple Authentication and Security Layer
SATAN  Security Analysis Tool for Auditing Networks
SCSI  Small Computer System Interface
SDK  Software Developer's Kit
SFG  Shared Folders Gateway
SFU  Services for UNIX
SID  Security Identifier
SLIP  Serial Line Internet Protocol
SMB  Server Message Block
SMIT  System Management Interface Tool
SMP  Symmetric Multiprocessor
SMS  Systems Management Server
SNA  Systems Network Architecture
SNAPI  SNA Interactive Transaction Program
SNMP  Simple Network Management Protocol
SP  System Parallel
SPX  Sequenced Packet eXchange
SQL  Structured Query Language
SRM  Security Reference Monitor
SSA  Serial Storage Architecture
SSL  Secure Sockets Layer
SUSP  System Use Sharing Protocol
SVC  Serviceability
SWS  Silly Window Syndrome
TAPI  Telephone Application Program Interface
TCB  Trusted Computing Base
TCP/IP  Transmission Control Protocol/Internet Protocol
TCSEC  Trusted Computer System Evaluation Criteria
TDI  Transport Data Interface
TDP  Tivoli Data Protection
TLS  Transport Layer Security
TOS  Type of Service
TSM  Tivoli Storage Manager
TTL  Time to Live
UCS  Universal Code Set
UDB  Universal Database
UDF  Universal Disk Format
UDP  User Datagram Protocol
UFS  UNIX File System
UID  User Identifier
UMS  Ultimedia Services
UNC  Universal Naming Convention
UPS  Uninterruptable Power Supply
URL  Universal Resource Locator
USB  Universal Serial Bus
UTC  Universal Time Coordinated
UUCP  UNIX to UNIX Communication Protocol
UUID  Universally Unique Identifier
VAX  Virtual Address eXtension
VCN  Virtual Cluster Name
VFS  Virtual File System
VG  Volume Group
VGDA  Volume Group Descriptor Area
VGSA  Volume Group Status Area
VGID  Volume Group Identifier
VIPA  Virtual IP Address
VMM  Virtual Memory Manager
VP  Virtual Processor
VPD  Vital Product Data
VPN  Virtual Private Network
VRMF  Version, Release, Modification, Fix
VSM  Virtual System Management
W3C  World Wide Web Consortium
WAN  Wide Area Network
WFW  Windows for Workgroups
WINS  Windows Internet Name Service
WLM  Workload Manager
WOW  Windows-16 on Win32
WWW  World Wide Web
WYSIWYG  What You See Is What You Get
WinMSD  Windows Microsoft Diagnostics
XCMF  X/Open Common Management Framework
XDM  X Display Manager
XDMCP  X Display Manager Control Protocol
XDR  eXternal Data Representation
XNS  XEROX Network Systems
XPG4  X/Open Portability Guide

Related publications
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.

IBM Redbooks
For information on ordering these publications, see How to get IBM Redbooks on page 260:

- Implementing the IBM TotalStorage NAS 300G: High Speed Cross Platform Storage and Tivoli SANergy!, SG24-6278-00
- The IBM TotalStorage NAS 200 and 300 Integration Guide, SG24-6505-00
- Using iSCSI Planning and Implementing Solutions, SG24-6291-00
- DB2 UDB e-business Guide, SG24-6539
- IP Storage Networking: NAS and iSCSI Solutions, SG24-6240
- A Practical Guide to Tivoli SANergy, SG24-6146
- Tivoli SANergy Administrator's Guide, GC26-7389
- IBM SAN Survival Guide, SG24-6143
- IBM Storage Solutions for Server Consolidation, SG24-5355
- Tivoli Storage Management Concepts, SG24-4877
- Getting Started with Tivoli Storage Manager: Implementation Guide, SG24-5416
- Using Tivoli Storage Manager in a SAN Environment, SG24-6132
- Tivoli Storage Manager Version 4.2: Technical Guide, SG24-6277
- Red Hat Linux Integration Guide for IBM eServer xSeries and Netfinity, SG24-5853
- AIX 5L and Windows 2000: Side by Side, SG24-4784
- Migrating IBM Netfinity Servers to Microsoft Windows 2000, SG24-5854
- Using TSM in a Clustered NT Environment, SG24-5742
- ESS Solutions for Open Systems Storage: Compaq Alpha Server, HP and SUN, SG24-6119
- Backing Up DB2 Using Tivoli Storage Manager, SG24-6247-00

Copyright IBM Corp. 2002


Other resources
These publications are also relevant as further information sources:
- Roger E. Sanders, DB2 Administration, McGraw-Hill/Osborne, 2002, ISBN 0-07-213375-9
- Larry Peterson and Bruce Davie, Computer Networks: A Systems Approach, Morgan Kaufmann Publishers, 1996, ISBN 1558603689
- A. S. Tanenbaum, Computer Networks, Prentice Hall, 1996, ISBN 0133499456
- M. Schwartz, Telecommunication Networks: Protocols, Modeling and Analysis, Addison-Wesley, 1986, ISBN 020116423X
- Matt Welsh, Matthias Kalle Dalheimer, and Lar Kaufman, Running Linux (3rd Edition), O'Reilly, 1999, ISBN 156592469X
- Scott M. Ballew, Managing IP Networks with Cisco Routers, O'Reilly, 1997, ISBN 1565923200
- Ellen Siever, et al., Linux in a Nutshell (3rd Edition), O'Reilly, 2000, ISBN 0596000251
- Andreas Siegert, The AIX Survival Guide, Addison-Wesley, 1996, ISBN 0201593882
- William Boswell, Inside Windows 2000 Server, New Riders, 1999, ISBN 1562059297
- Paul Albitz and Cricket Liu, DNS and BIND (4th Edition), O'Reilly, 2001, ISBN 0596001584
- Gary L. Olsen and Ty Loren Carlson, Windows 2000 Active Directory Design and Deployment, New Riders, 2000, ISBN 1578702429
- Microsoft Windows 2000 Professional Resource Kit, Microsoft Press, 2000, ISBN 1572318082
- D. Libertone, Windows 2000 Cluster Server Guidebook, Prentice Hall, 2000, ISBN 0130284696
- Microsoft Services for UNIX version 2 white paper, found at:
  http://www.microsoft.com/WINDOWS2000/sfu/sfu2wp.asp
- C. J. Date, An Introduction to Database Systems (7th Edition), Addison-Wesley, 1999, ISBN 0201385902
- George Baklarz and Bill Wong, DB2 Universal Database V7.1, Prentice Hall, 2001, ISBN 0130913669


Referenced Web sites


These Web sites are also relevant as further information sources:
- IBM Storage
  http://www.storage.ibm.com/
- IBM TotalStorage
  http://www.storage.ibm.com/ssg
- IBM NAS
  http://www.storage.ibm.com/snetwork/nas/index.html
- IBM TotalStorage 200
  http://www.storage.ibm.com/snetwork/nas/200/index.html
- IBM TotalStorage 300
  http://www.storage.ibm.com/snetwork/nas/300/index.html
- IBM TotalStorage 300G
  http://www.storage.ibm.com/snetwork/nas/300g_product_page.htm
- IBM FAStT200
  http://www.storage.ibm.com/hardsoft/products/fast200/fast200.htm
- IBM Enterprise Storage Server (formerly known as Shark)
  http://www.storage.ibm.com/hardsoft/products/ess/ess.htm
- Microsoft Technical Library
  http://www.microsoft.com/windows2000/techinfo/default.asp
- Microsoft Services for UNIX
  http://www.microsoft.com/WINDOWS2000/sfu/default.asp
- Tivoli
  http://www.tivoli.com/
- Tivoli SANergy Support
  http://www.tivoli.com/support/sanergy
- Brocade
  http://www.brocade.com/
- Storage Networking Industry Association
  http://www.snia.org/
- Sysinternals Microsoft Tools
  http://www.sysinternals.com/
- Linux Documentation
  http://www.linuxdoc.org/
- Linux Kernel Resource
  http://www.kernel.org/


- Red Hat Linux
  http://www.redhat.com/
- SUSE Linux
  http://www.suse.com/index_us.html
- SAS Institute Inc.
  http://www.sas.com/
- IBM/SAS alliance
  http://www.sas.com/partners/directory/ibm
- SAS Administrator documentation
  http://www.sas.com/service/admin/admindoc.html
- SAS/ACCESS sample programs for UNIX
  http://www.sas.com/service/techsup/sample/unix_access.html
- Oracle
  http://www.oracle.com/

How to get IBM Redbooks


You can order hardcopy Redbooks, as well as view, download, or search for Redbooks at the following Web site:
ibm.com/redbooks

You can also download additional materials (code samples or diskette/CD-ROM images) from that site.

IBM Redbooks collections


Redbooks are also available on CD-ROMs. Click the CD-ROMs button on the Redbooks Web site for information about all the CD-ROMs offered, as well as updates and formats.


Index
A
Access 80
active log file 37
Adapters for NAS200 and 300 130
Add Option dialog 80
Add Volume 73
age cleaner agents 25
AIX commands
  vmstat 115
Alert Center 7
application failure 34
Archival backup 135
Archival logging 37
archive logging 92
Arrays, logical disks, and volumes 130
ASCII Delimited 10
AutoExNT Service 197
autorestart 35
Avocent 175

B
Backup 205
backup 123
Backup and recovery 34
Backup and recovery functions 135
Backup and recovery in IBM NAS products 136
BACKUP DATABASE 35, 96, 208
backup image 6
balanced binary tree 30
base table 30
Berkeley Fast File System 58
bin 34
Binary large object 31
BLOB 31
Block I/O 40, 134
IBM NAS 200 126
buffer pool manager 36
Buffer pools 25
Bufferpool 8
bulk insert 34

C
caching 25
Character large object 31
Check list 175
CIFS 13, 38-39, 46, 134, 195
Circular logging 36
circular logging 92
CLOB 31
clone 11
Cluster information 175
Cluster resource balancing 186
Cluster Server 174
Clustering 43
coherency control 43
column 30
Command Center 7
command line interface 69
COMMIT 36
Common Internet File System 39, 46
Concurrent Copy 43
Configuration of DB2
  Create db2 user account 153
Connectivity 16
consistent state 34
container 26
Containers 27
Control Center 7
copy-on-write operation 139
Create a new Qtree dialog 76
CREATE DATABASE 87
Create db2 user account 155
cumulative backup 9
customized operating systems 14

D
DAS 40
data accessibility 41
Data Backup to Tape 128
Data maintenance utilities 12
data mining 4
Data ONTAP 46, 65, 72
Data Protection 43
Data Protection on Disk 128
Data Protection Technology 128
Data sharing 20
data sharing 42
data throughput 31
data type 30
Data Vaulting 42
data warehousing 4
database 201
database configuration 31
database engine 6
database loading 32
database managed space 26, 68
Database Reallocation 210
database recovery 25
database recovery history file 35
Databases 25
Data-Copy Sharing 42
DataLink 104
DB2 199, 205
DB2 clients 24
DB2 Command Reference 9
DB2 Connect 5
DB2 Database Manager 24
DB2 Everyplace 6
DB2 Optimizer 12
DB2 optimizer 30
DB2 UDB 45
DB2 UDB 7.1
  Create database 201
DB2 UDB for OS/400 5
DB2 UDB Utilities 8
DB2 Universal Database 4
DB2 Universal Database packaging 4
DB2_PARALLEL_IO 32, 86
DB2_STRIPED_CONTAINERS 33
DB2_STRIPPED_CONTAINERS 86
DB2's query optimizer 8
DB2EMPFA 34
db2empfa 92
DB2INIDB 96, 99
db2inidb 210-211
db2relocatedb 210
DB2SET 32
db2start 32
db2stop 32
DBCLOB 31
DEL 10
delta 9
DEVICE 27
DEVICE containers 27
dftdbpath 87
diagnostics and performance monitoring 1
dirty pages 25
Disaster Recovery 43
Disk 162, 169
DMS 26, 68, 212
Domain 165
domain 158
Double-byte character large object 31
DSS 5
DSView 175

E
EE 4
EEE 4, 32
engineering drawings 5
Enterprise Systems Connection 18
Enterprise-Extended Edition 32
environment variables 32
ESCON 18
Event Analyzer 7
Event Monitor
  partial record identifier 114
Export 10
exportfs 81
extenders 6
Extent size 29

F
fabric 42
Failover 187
FC 18
FC disks 128
Fibre Channel 17-18
Fibre Channel SAN 15
Fibre Channel switching technology 42
field 30
FILE 27
File I/O 40
file level I/O protocols 40
file locking 39
file permissions 39
File servers 13
File System Formats (File I/O) 127
File system I/O 134
file systems 27
FilerView 70-71
FlashCopy 43, 135
flat file transfer 42
free block-map file 59
free inode-map file 59
FTP 38
functions 31

G
gateways 17
global temporary tables 27
graphics 5
Group 39

H
HA 43
Hard disks and adapters 129
hardware environment 31
Heterogeneous file sharing 16
heuristic algorithms 25
hierarchical data structure 30
High Availability 43
high bandwidth 17
High Performance Parallel Interface 18
HIPPI 18
history file 38
homogeneous server environment 20
HTTP 38
hubs 17
Hyper Text Transfer Protocol 46

I
I/O buffer 28
I/O Parallelism 31
I/O parallelism 31, 33
IAACU 162
IBM NAS 125, 153, 199, 205
  NAS 200 126
  NAS 300 128
IBM NAS 200 Model 201 (5194-200) 130
IBM NAS 200 Model 226 (5194-225) 130
IBM NAS 300 (5195-325) 130
IBM NAS Persistent Storage Manager (PSM) 137
IBM NAS True Image 205
IBM Network Attached Storage - Overview 148
IBM RAMAC Virtual Array (RVA) 16
IBM Total Storage NAS 200 147
IBM TotalStorage NAS Models 201 149
IBM TransArc Episode 58, 63
IBM's 3494 21
IBMDEFAULTBP 25
images 5
Import 10
IMS 5
Incremental 9
Index creation wizard 7
index creation 32
Indexes 30
indexes 25
inode file 59
installation 199
Instances 24
instantaneous data replication functions 18
insurance claim forms 5
integrated disks 18
inter-partition parallelism 32
inter-query parallelism 32
intra-partition parallelism 32
intra-query parallelism 32
Introduction to IBM NAS 147
ISO 41
ISV backup software 136

J
Java 5
Journal Facility 7

L
LAN 13
LAN-free 21
Large database systems 3
large object 27
LIST DATABASE DIRECTORY 89
List prefetches 30
LIST TABLESPACE CONTAINERS 91
LIST TABLESPACES 90
Load 10
LOB 27
Local Area Networks 41
Local database directory 89
log buffer 36
Logical consolidation 19
logical database design 26
logical drive 171
logical drives 165


logical structure 30
logretain 35
Long data 31
long DMS table spaces 27
long field data 27
LONG VARCHAR 31
LONG VARGRAPHIC 31

M
Manage NFS Exports 79
Manage Qtrees 75
Manage Volumes 72
Microsoft's Server Message Block 39
mirror 210
Model 326 169
mount 39
mount point 82-83
MPPs 4
MS Windows Terminal Client 161
MSCS 174
multimedia 5
multipage_alloc 34

N
NAS 200 153, 161
NAS 200 Model 201 149
NAS 200 Model 226 150
NAS 300 150, 153, 169-170, 174
NAS 300 base configuration 151
NAS 300 Model 326 151
NAS device 37
NAS Server Engine 127
NAS300 147
NetApp filer 1
Network 163, 169
Network Appliance filers 3
Network appliances 14
network attached storage 3
network availability 41
Network File System 39, 46
NFS 13, 38-39, 46, 134
NFS exports 79
NLM 39
node 179
Node directory 89
Non-disruptive scalability for growth 21
non-volatile RAM 48
NVRAM 48

O
object hierarchy 24
object-relational database 4
offline 209
offline archive log file 37
offline backup 37
OLAP 5, 8
OLAP SQL extensions 8
OLTP 5, 8
online archive log file 37
Open Systems Interconnection 41
operators 31
optimum performance 28
optional Network Lock Manager 39
OSI 41
Other 39

P
Page cleaners 25
Page size 28
pages 28
parallel query processing 8
parallelism 31
Parity Disk 53
partition 172
partitioned database environment 9
PATH 27
PC Integrated Exchange Format 10
PC/IXF or IXF 10
PDC 154
PE 4
performance 25
Performance Monitor 7
Persistent Storage Manager 123
Persistent Storage Manager (PSM) 129
Personal Developer's Edition 4
Physical consolidation 18
physical storage on a system 26
pointers 30
Point-in-time images 135
power interruptions 34
Predictive Failure Analysis (PFA) 130
prefetch mechanisms 33
Prefetch size 29
prefetch size / extent size 33
Pre-loaded code 129
Primary Domain Controller 155
primary online log files 36
primary partition 173
protocol 39
PSM 137, 206
PSM cache contents 139
PSM True Image 212

Q
qtree 66, 97
qtree create 76
qtrees 74
Query Parallelism 32
Query parallelism 31
Quota Trees 66
quota trees 97

R
RAID 1 52
RAID 3 52
RAID 4 52, 133
RAID 5 52, 133
RAID capabilities 18
RAID Implementation 128
RAID reconstruction 56
RAID scrubbing 57
RAID stripe size 33
RAID support 132
  RAID 0 133
  RAID 1 133
  RAID 3 133
  RAID 4 133
  RAID 5 133
  RAID 5E 134
raw devices 27
RECONCILE 104
record 30
recovery 205
Recovery History file 38
Redbooks Web site 260
  Contact us xx
REDISTRIBUTE 13
Redistribute Data 13
Redundant Array of Inexpensive Disks 48
referential constraints 30
Registry 32
registry variable 33
regular DMS table spaces 27
Relational Connect 5
reliability 6
remote cluster storage 43
Remote Copy 43
remote copy 43
Remote file sharing 16
remote mirroring 18
REORG 12, 38
Reorganize Table 12
Resource pooling 15
resource sharing 41
RESTART DATABASE 34
RESTORE DATABASE 35, 37
ROLLBACK 36
rolled back 36
ROLLFORWARD DATABASE 35
roll-forward recovery 35, 37, 208
Root 80
round-robin fashion 29
routers 17
rows 30
Run Statistics 12
RUNSTATS 12, 38
RW 80

S
SAN 1
SAN Fabric 42
SAN Storage 41
Scalability 16
scalability 6
Script Center 7
SCSI 18
second redundant copies of the data 43
secondary log files 36
sequential detection 33
Sequential prefetches 30
serial storage architecture 18
serialization 43
server process 43
ServeRAID Manager 163
server-free 21
SET WRITE RESUME FOR DATABASE 209
SET WRITE SUSPEND FOR DATABASE 209
Share Volume 165
Shared repository 42
sharing of resources 18
single database operation 32
Small computer systems interface 18
SMB 39
SMPs 4
SMS 26, 68, 212
SMS table spaces 34
snapshot 210
SnapShot Copy 43
SnapShot function 135
Snapshots 49, 62
spatial data 6
sqllib directory 34
ssa 18
stand-alone 24
Standard backup 208
Standard recovery 208
standby 210
Starburst 8
Storage consolidation 18
storage failure 34
storage mirroring 43
Storage Subsystem 128
Subsystem local copy services 43
subtypes 30
Suspend I/O 209
switches 17
SYSCATSPACE 26
System Architecture 126
system catalog tables 25
system catalog tables and views 26
System database directory 89
system manageability 41
system managed space 26, 68
system temporary table spaces 27

T
Table spaces 26
table spaces 9, 25
Tables 30
tables 25
Tape pooling 21
temporary table space 26
TEMPSPACE1 26
Terminal Service 161
Tivoli SANergy File Sharing 20
transaction 36
transaction log 34, 208
transaction logger 36
Transaction logging 36
transactions 34
true data sharing 20
TSM 9
TSM backup 136
TSM client software support 16

U
UDB for OS/390 5
UDE 4
UDFs 5
UDTs 5
units of work 34
Universal Database Enterprise Edition 4
Universal Database Enterprise-Extended Edition 4
Universal Database Personal Edition 4
Universal Database Workgroup Edition 4
Universal Developer's Edition 4
Universal Extensibility 5
Universal management 7
unstable stat 34
UPDATE DB CFG 213
UPDATE MONITOR SWITCHES 108
URL 70
user table space 26
User temporary table spaces 27
user-defined data types 5, 31
user-defined functions 5
user-defined objects 26
userexit 35
USERSPACE1 26
Utilities
  Backup and Recovery 9
  Data movement 9
Utility
  Autoloader 11
  DB2LOOK 11
  DB2MOVE 11

V
value 30
Varying-length double-byte long character string 31
Varying-length long character string 31
Version Recovery 207
views 25
Virtual copies 137
Virtual Server 169
virtual server 196
Visual Explain 7
VLANs 39
vol create 73
volume 72, 196
Volume Reports 74

W
WAFL 49, 58
WAN 41
WANs 17
warm standby techniques 43
WE 4
web interface 69
Wide Area Network 41
Windows clients 197
Windows NT backup and NAS backup assistant 136
Worksheet (WSF) 10
Write Anywhere File Layout 49
WRITE RESUME 96, 99
WRITE SUSPEND 96, 98

X
XML 6
XRC 43



Back cover

DB2 UDB Exploitation of NAS Technology


Integrate DB2 UDB and NAS using this hands-on guide

Learn how NAS can enhance your DB2 environment

Configure DB2 for optimal NAS usage
This IBM Redbook is an informative guide that describes how DB2 Universal Database (UDB) can take advantage of Network Attached Storage (NAS) technology. Specifically, this book provides detailed information to help you learn about Network Appliance™ filers and IBM NAS 200/300 appliances, and to show you how DB2 UDB databases can be stored on these devices.

This easy-to-follow guide documents the generic network, software, and hardware requirements, as well as the basic procedures needed to set up, configure, and integrate DB2 UDB databases with Network Appliance filers and IBM NAS 200 and 300 appliances. These procedures start with the basics of initializing the NAS/SAN device for DB2 UDB, and then continue with the more advanced topics of backing up databases stored on NAS devices using the Snapshot technology that is supplied with each.

This book also provides general information on how to monitor the performance of DB2 UDB databases stored on NAS devices, as well as of the NAS devices themselves.

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION

BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, customers, and partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.

For more information: ibm.com/redbooks


SG24-6538-00 ISBN 0738425222
