Ingres Backup and Recovery

Bruno Bompar
Senior Manager Customer Support

Abstract

Proper backup is crucial in any production DBMS
installation, and Ingres is no exception. And backups
are useless unless you can recover from them. This
session explains how Ingres backup and recovery
work. We will also cover some ideas on how best to
do a regular backup and how to do a safe recovery.

Agenda
Why backup and recovery?
Disaster scenarios
Ingres features
Housekeeping
Customisation
Issues to Consider
Tips and cautions

Why backup and recovery?


Insurance
What if?
Cost to business
Critical functionality
One part of overall process

Scenarios to Consider
System Crash
Database Corruption
Lost Table
Accidental Transaction

System Crash
Automated Recovery
After a crash Ingres will
Scan the transaction log file
Roll back uncompleted transactions
Apply completed transactions

Databases will be consistent
Depends on the crash

Database Corruption
Databases can be recovered
Only if a valid Ingres backup is available!
ckpdb command to back up
rollforwarddb to recover

Backup Mechanisms

OS backup
invalid unless done with Ingres shut down cleanly
important for backing up Ingres installation, journals, checkpoints, dumps
useless for backing up databases unless you can guarantee a clean shutdown

unloaddb
an archiving or porting tool, not a backup tool
no way to ensure a consistent snapshot without locking out all users (an "offline" archive)

Backup Mechanisms

In order to get the most out of a backup mechanism, two things are needed:
a way to take a static snapshot of the database without interfering too greatly with active users
a way to record incremental changes since that static snapshot

Ingres does both via checkpoints and journals
a checkpoint is the static backup or snapshot
the journals are the ongoing change records

Backup Mechanisms

Terminology note! Ingres differs from other DBMSs in its use of the word "checkpoint"
Ingres:
a checkpoint is a backup snapshot
a consistency point (CP) is a buffer and log flush

Other DBMSs:
a checkpoint means a buffer flush
a backup is just called a backup

Database Checkpoints
Backup the whole database
Online or Offline
Enable / Disable journaling
Can be performed in parallel
Written to
Tape
Disk

Don't forget iidbdb!

Online versus Offline


Offline
Requires exclusive access to database

Online
Users carry on working
No DDL statements
Slower than offline
Can cause transaction log file to fill


Online Checkpointing

An online checkpoint (the ckpdb command) has three phases:
quiescing the database
file copying with change logging
completion recording

Online Checkpointing

File copying is controlled by the checkpoint template (cktmpl.def)
can be modified by Ingres administrator
change copy command, add file compression, etc.
amazing things are possible

DML allowed during file copying
but not DDL - no file creation/deletion

Changes during file copying are specially logged
before-images sent to dump files

Checkpointing

After copying is complete, the checkpoint success or failure is recorded in the database config file
aaaaaaaa.cnf
another copy left in cnnnnnnn.dmp in dump location
note that the checkpoint itself does not contain a record of the checkpoint completion

Config file records last N checkpoint attempts, successful or not
N = 99 for recent releases of Ingres
N = 16 for older versions (2.0 and older)

Online Checkpointing

When it's all over, you have
one or more checkpoint files (one for each data location)
in disk checkpoint area, or on tape
zero or more dump files containing changes made while file-copying
an updated database config file
plus an updated copy in the dump location
a new set of journal files
a fresh journal file is started at the end of the database quiescent phase

Checkpointing

What to save after the checkpoint completes:
the checkpoint and dump locations
you need both
infodb output (human readable listing of the database config file)
output of: select * from iifile_info
for manual table level recovery and emergencies
optional but recommended
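
A minimal sketch of capturing those two listings right after a checkpoint. It assumes a POSIX shell with the Ingres utilities infodb and sql (the terminal monitor) on the PATH. The function name save_ckp_info, the output file names, and the INFODB/SQL override variables are illustrative, not part of Ingres.

```shell
#!/bin/sh
# Hypothetical helper: capture post-checkpoint listings for one database.
# INFODB and SQL default to the Ingres utilities; override them for a dry run.
INFODB=${INFODB:-infodb}
SQL=${SQL:-sql}

save_ckp_info() {
    db=$1
    outdir=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$outdir" || return 1
    # Human-readable listing of the database config file.
    "$INFODB" "$db" > "$outdir/$db.infodb.$stamp" || return 1
    # File-to-table mapping, useful for manual table-level recovery.
    printf 'select * from iifile_info;\n\\g\n\\q\n' \
        | "$SQL" "$db" > "$outdir/$db.iifile_info.$stamp" || return 1
    # Report where the infodb listing was written.
    echo "$outdir/$db.infodb.$stamp"
}
```

Calling save_ckp_info dbname /some/safe/dir after each ckpdb keeps a dated copy of both listings; store them with the checkpoint and dump files.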


Journals

Audit trail of all changes made to selected tables
written in batches by the archiver (dmfacp)

Default for tables is journaling ON
journaling also needs to be enabled for the database using ckpdb +j
this is an offline checkpoint; no users allowed

Journal files grow to a target size, then a new one is started
current expected size and sequence number are stored in the database config file
each checkpoint starts a fresh set of journal files

Database Checkpoint - Examples


Command line
Online checkpoint
ckpdb dbname
Offline checkpoint enabling journaling
ckpdb +j dbname #m3
Offline checkpoint disabling journaling
ckpdb -j dbname
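
A small sketch of wrapping the three variants in a script, plus one practical caution when typing them on Unix. The helper names ckpdb_cmd and count_args and the mode keywords are made up for illustration. The functions only print command lines; they do not run ckpdb.

```shell
#!/bin/sh
# Print (rather than run) the ckpdb command line for one of the modes above.
# The function name and mode keywords are illustrative only.
ckpdb_cmd() {
    mode=$1
    db=$2
    case $mode in
        online)      printf 'ckpdb %s\n' "$db" ;;
        journal-on)  printf 'ckpdb +j %s\n' "$db" ;;
        journal-off) printf 'ckpdb -j %s\n' "$db" ;;
        *)           echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# In a shell script an unquoted word that starts with '#' begins a comment,
# so an option such as #m3 would be silently dropped. Quote it:
#   ckpdb +j dbname '#m3'
count_args() { echo "$#"; }
count_args ckpdb +j dbname '#m3'    # all four words reach the command
```

The same quoting applies to any other option that begins with #.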


Database Checkpoint - Examples


Visual DBA


Recovery

Recovery is a two-step process
one command (rollforwarddb) with two distinct phases

First, restore the database to a point in time (a checkpoint)
Second, replay journals
optional
all journals, or stop at a given time

Recovery

The database must exist before it can be recovered
All required data locations must exist
A valid config file must be available
recovery looks in the data location first, then the dump location
config file is renamed to aaaaaaaa.rfc

The last checkpoint must be valid
can ask for an earlier checkpoint with #cn option

When Recovery Is Needed

Stay calm!
you have practiced recovery, right?
haste makes mistakes
turn off the mobile phone, pager, etc
the database will be ready when it's ready

Save your current database config
ideally, make a copy of the dump location and the data location aaaaaaaa.cnf
as a minimum save aaaaaaaa.cnf
allows you to try again if something goes wrong
if you have time, save everything in sight
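
The "as a minimum" step can be sketched as follows. The function name save_cnf and the dated directory layout are assumptions for illustration. The path of the database directory inside the data location is installation-specific, so it is passed in rather than guessed.

```shell
#!/bin/sh
# Copy the database config file to a time-stamped directory before recovery.
# data_dir is the database's directory in its primary data location.
save_cnf() {
    data_dir=$1
    save_root=$2
    dest="$save_root/cnf-$(date +%Y%m%d-%H%M%S)"
    if [ ! -f "$data_dir/aaaaaaaa.cnf" ]; then
        echo "no config file in $data_dir" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # -p keeps timestamps and permissions on the saved copy.
    cp -p "$data_dir/aaaaaaaa.cnf" "$dest/" || return 1
    # Print the directory that now holds the saved copy.
    echo "$dest"
}
```

Run it before every rollforwarddb attempt so each attempt has its own saved copy to fall back on.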


Database Recovery
Point in time recovery
Last checkpoint only
Last checkpoint + 10 hours work
5 checkpoints ago

Based on available files


Database Recovery - Examples


Command Line
Last checkpoint only, no journals
rollforwarddb +c -j dbname
Last checkpoint, journals to 12:32 on 10/05/02
rollforwarddb +c +j dbname -e10-may-2002:12:32:00
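
A mistyped stop time is easy to produce under pressure, so a script can check its shape before building the command. The sketch below assumes the dd-mon-yyyy:hh:mm:ss form shown in the example. The function name rollforward_cmd is hypothetical, and it only prints the command line.

```shell
#!/bin/sh
# Print the rollforwarddb command for "checkpoint only" or "journals up to
# a stop time". The stop time must look like 10-may-2002:12:32:00.
rollforward_cmd() {
    db=$1
    stop=$2
    if [ -z "$stop" ]; then
        # No stop time given: restore the checkpoint, skip the journals.
        printf 'rollforwarddb +c -j %s\n' "$db"
        return 0
    fi
    case $stop in
        [0-3][0-9]-[a-z][a-z][a-z]-[0-9][0-9][0-9][0-9]:[0-2][0-9]:[0-5][0-9]:[0-5][0-9])
            printf 'rollforwarddb +c +j %s -e%s\n' "$db" "$stop" ;;
        *)
            echo "bad stop time: $stop" >&2
            return 1 ;;
    esac
}
```

The pattern checks only the layout of the time string, not whether the date is real.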


Database Recovery - Examples


Visual DBA
Last checkpoint only, no journals

Database Recovery - Examples


Visual DBA
Last checkpoint, journals to 12:32 on 10/05/02

Recovery Scenarios

Data area is lost
shut down Ingres if it's not down
restore data directories with db config file
restart Ingres
transaction log contents can be moved to journals only if a valid config file is available!
rollforwarddb
up-to-the-minute recovery should be possible

Recovery Scenarios

Transaction log is lost


wasn't it mirrored?
recreate transaction log
rollforwarddb
most recent transactions not moved to journals will be lost


Recovery Scenarios

Checkpoint or dump location is lost


recreate location directories
take fresh checkpoint
loss of checkpoint area should not affect running database


Recovery Scenarios

Journal location is lost
installation will continue to run until transaction log fills up
recreate journal directory
alterdb -disable_journaling to halt journaling
restart archiver which will have stopped due to inability to write journals
ckpdb +j to restart journaling
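
The order of these steps matters, so it can help to print them as a per-database checklist. The sketch below executes nothing. The function name journal_loss_steps is hypothetical, the placement of the database name on the alterdb line is an assumption to check against the Command Reference Guide, and the archiver restart is left as a manual step because the slide does not name a command for it.

```shell
#!/bin/sh
# Print the recovery steps for a lost journal location as a checklist.
# Nothing is executed; the operator runs each step by hand.
journal_loss_steps() {
    db=$1
    journal_dir=$2
    printf '1. mkdir -p %s\n' "$journal_dir"
    # Argument order on the alterdb line is an assumption.
    printf '2. alterdb %s -disable_journaling\n' "$db"
    printf '3. restart the archiver (dmfacp)\n'
    printf '4. ckpdb +j %s\n' "$db"
}
```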


Recovery Scenarios

Software or human error is discovered
If mistake is discovered immediately:
crash/restart Ingres, or remove all user sessions
rollforwarddb with -e option to replay journals, stopping short of the time of mistake

If mistake isn't discovered until later, recovery is more complicated
Ingres Journal Analyzer (IJA) can help

Accidental Transaction
AuditDB
Filter against
Table
Users
Time

Scan Journal files


Generate SQL
Execute


Accidental Transaction
Ingres Journal Analyzer
Auditdb with Knobs on
Connect to remote servers
Force Log Flush
Point and Click


Recovery Scenarios

Disaster
Use OS backups to restore Ingres system directories, all data, work, checkpoint, dump, journal directories
rollforwarddb iidbdb
you have been checkpointing iidbdb, right?
restores users, locations, database privileges, etc.

rollforwarddb databases
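
The ordering above (iidbdb first, then the user databases) can be sketched as a loop. The names run and recover_all and the DRY_RUN variable are illustrative. Treating a non-zero exit status from rollforwarddb as failure is an assumption.

```shell
#!/bin/sh
# Roll forward iidbdb first, then each user database, stopping at the first
# failure. With DRY_RUN set, commands are printed instead of executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_all() {
    # iidbdb goes first: it holds users, locations and database privileges.
    for db in iidbdb "$@"; do
        run rollforwarddb "$db" || {
            echo "rollforwarddb failed for $db; stopping" >&2
            return 1
        }
    done
}
```

With DRY_RUN=1, recover_all sales hr prints the three commands in the order they would run.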


Recovery Scenarios

Rollforwarddb failure
restore the config or dump info you saved before attempting rollforwarddb
rename aaaaaaaa.rfc back to aaaaaaaa.cnf if it exists
cure any other rollforwarddb complaints
try again
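
The rename step can be sketched as below. The function name restore_cnf is hypothetical. Setting aside any aaaaaaaa.cnf left by the failed attempt as aaaaaaaa.cnf.failed, rather than overwriting it, is a cautious choice of this sketch and not something the slide prescribes.

```shell
#!/bin/sh
# After a failed rollforwarddb, put the renamed config file back so the
# next attempt starts from aaaaaaaa.cnf. data_dir is passed in by the caller.
restore_cnf() {
    data_dir=$1
    rfc="$data_dir/aaaaaaaa.rfc"
    cnf="$data_dir/aaaaaaaa.cnf"
    if [ ! -f "$rfc" ]; then
        echo "no aaaaaaaa.rfc in $data_dir; nothing to rename"
        return 0
    fi
    # Keep any config left by the failed attempt instead of overwriting it.
    if [ -f "$cnf" ]; then
        mv "$cnf" "$cnf.failed" || return 1
    fi
    mv "$rfc" "$cnf" || return 1
    echo "renamed aaaaaaaa.rfc to aaaaaaaa.cnf"
}
```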

Last checkpoint didn't work
use rollforwarddb #cn to restore an older one
you do have more than one checkpoint around, right?

Lost Table
Table can be recovered
From table checkpoint only
Enforce logical consistency
Journaling must be enabled


Table Checkpoints - Examples


Command line
Checkpoint table t1
ckpdb dbname -table=t1
Checkpoint tables t1 and t2
ckpdb dbname -table=t1,t2

Table Recovery - Examples


From table checkpoint only
Command line
Recover table t1
rollforwarddb dbname -table=t1
Recover tables t1 and t2
rollforwarddb dbname -table=t1,t2

Housekeeping Ingres
Infodb
Checkpoints
Dumps
Journals


Infodb / aaaaaaaa.cnf
Shows meta-data about database
Locations
Checkpoint sequence
Valid / Invalid
Dump / Journal sequence
Counters
Last table id
Last valid checkpoint


Infodb / aaaaaaaa.cnf
Info stored in aaaaaaaa.cnf
Three copies
Primary database location
Dump location as aaaaaaaa.cnf
Dump location as cxxxx.dmp

Infodb reads CNF file in database area
Copy to dump area with every change
II_DUMP
Database's own dump area

Checkpoint files
Stored in 1 location
II_CHECKPOINT
Database defined checkpoint area

One file for each location


Format depends on archiver used


Dump files
Changes during ONLINE checkpoint
Required for recovery
Single location
II_DUMP
Database defined dump area


Journal Files
Record of changes
Table configuration

Facilitates point in time recovery


Files stored in single location
II_JOURNAL
Database defined journal area


Backing up the backup files


OFFLINE Checkpoint
Database aaaaaaaa.cnf
Dump aaaaaaaa.cnf
Output from infodb
Checkpoint
Journals

ONLINE Checkpoint
All above
Dump files


Cleaning up
ckpdb -d
All but the last checkpoint
Dump, journal files deleted as well

alterdb -delete_oldest_ckp
Oldest checkpoint only
Maintain set of checkpoints
Dump, journal files deleted as well

Customisation

cktmpl.def
Defines actions
Before / During / After
Tape
Disk
$II_SYSTEM/ingres/files

II_CKTMPL_FILE
ingsetenv only

Most common entries to change:
WSDD: work phase of regular checkpoint
WRDD: work phase of regular rollforward

Some things you can do:
add compression/decompression
use a different utility (e.g. star instead of tar)
wild and crazy stuff

Test both checkpoint and restore after modifying the template

Issues To Consider
Files
Ingres supports large files
OS archiver utility may not

POSIX standard
tar
cpio
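
The slide does not name a specific limit. One well-known case is the classic POSIX ustar tar header, which stores the file size in 11 octal digits and so cannot describe a file larger than 8589934591 bytes. A minimal sketch of checking a data location against that limit follows. The function name oversized_files and its optional second argument are illustrative; check the limits of the archiver actually named in your checkpoint template.

```shell
#!/bin/sh
# List files too large for a classic ustar-format tar archive, whose header
# stores the size in 11 octal digits (maximum 8589934591 bytes).
USTAR_MAX_BYTES=8589934591

oversized_files() {
    dir=$1
    # Optional second argument overrides the limit (handy for testing).
    limit=${2:-$USTAR_MAX_BYTES}
    # -size +Nc matches files strictly larger than N bytes.
    find "$dir" -type f -size +"${limit}c" -print
}
```

An empty result means no file in the directory exceeds the limit.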


Tips and Cautions

Hardware "solutions" aren't solutions
"I don't need to back up, I have the magic solution of the moment"
RAID 5, mirroring, whatever
you aren't protected against software failures
you aren't protected against human failures
you aren't protected against disasters
you may not be protected against multiple hardware failures
you are putting all your eggs in one basket

Tips and Cautions

Backups are no good if they don't work
make sure that ckpdb works
automatic verification is better than manual verification

not ensuring that checkpoints are working may be the #1 cause of recovery failure
Automate as much as possible
error checking
disk space checking
old-checkpoint deletion
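
A sketch of the first two automation items, space checking and error checking, around a single checkpoint. It assumes ckpdb reports failure through a non-zero exit status; confirming the result with infodb afterwards is a stronger check. The names free_kb and checkpoint_db, the CKPDB override variable, and the space threshold are all illustrative.

```shell
#!/bin/sh
# Checkpoint wrapper sketch: refuse to start without enough free space and
# report failure loudly. CKPDB can be overridden for testing.
CKPDB=${CKPDB:-ckpdb}

free_kb() {
    # POSIX df -Pk: field 4 of the second line is available 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

checkpoint_db() {
    db=$1
    ckp_dir=$2
    need_kb=$3
    avail=$(free_kb "$ckp_dir")
    if [ -z "$avail" ]; then
        echo "cannot determine free space in $ckp_dir" >&2
        return 1
    fi
    if [ "$avail" -lt "$need_kb" ]; then
        echo "only ${avail} KB free in $ckp_dir, need ${need_kb} KB" >&2
        return 2
    fi
    if "$CKPDB" "$db"; then
        echo "checkpoint of $db succeeded"
    else
        echo "checkpoint of $db FAILED" >&2
        return 1
    fi
}
```

A scheduler can act on the return code: 2 means the checkpoint was never attempted, 1 means it was attempted and failed.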


Tips and Cautions

A choice of checkpoints is better than just one
avoid ckpdb -d (delete all prior checkpoints)
alterdb -delete_oldest_ckp is better
manual (or scripted) deletion of old checkpoints is often best
maintains checkpoint history in the config file

Keep as many checkpoints as you can
gives you more recovery options
don't skimp on checkpoint disk space (disks are cheap!)
you can delete checkpoints but keep journals
it's all on OS backups, right??

Tips and Cautions


Be wary of checkpointing to tape
nasty, unreliable devices they are
"oops, there wasn't a tape in the drive"
if you must use tape, verify your backups regularly
tape drives have been known to write unreadable tapes

Keep checkpoint and dump locations together


on the same file system or drive
keep them on the same OS backup schedule
checkpoints are worthless without the dump info


Tips and Cautions


Practice is essential
not just once, but regularly
practice on look-alike installation if production is not available
practice on production at least occasionally
clean Ingres shutdown
OS backup everything in sight
verify the OS backup, then run your recovery tests

you need hardware resources to support your recovery practice

Tips and Cautions


Document your recovery procedures
let someone else do a trial recovery
keep the procedures up to date
make sure that more than one person knows how to do a recovery

make sure that more than one person knows where to find the documentation
keep a copy offsite or in a safe place

Tips and Cautions


Backing up and archiving are different
a backup has a short useful lifetime
an archive (unload) is good indefinitely

Backup planning and disaster recovery planning are different
recoverable backups are just one aspect of a complete disaster recovery plan

More Information
Ingres DBA guide
Chapter 15 (2.6)

Ingres Command Reference Guide


Compressed Checkpoints
Servicedesk Doc ID 409751


Summary
Backups deserve more than lip service
Ensuring 100% recoverable backups takes time, effort, and money
Ingres checkpoint and rollforward capabilities are simple yet powerful and customisable
With proper practice and procedures, a recovery is nothing to be afraid of

Questions & Answers