
Oracle Active Data Guard

Performance
Geovanni Vega Velasquez
Database Brand Manager, Oracle Mexico

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Note to viewer

These slides provide performance data for Data Guard and Active Data Guard; we are in the process of updating them for Oracle Database 12c.

They can be shared with customers, but are not intended as a canned presentation ready to deliver in its entirety.

They provide SCs with data that can be used to substantiate Data Guard performance or to provide focused answers to particular concerns that may be expressed by customers.

Note to viewer
See this FAQ for more customer and sales collateral
http://database.us.oracle.com/pls/htmldb/f?p=301:75:101451461043366::::P75_ID,P75_AREAID:21704,2

Agenda: Data Guard Performance

Failover and Switchover Timings
SYNC Transport Performance
ASYNC Transport Performance
Primary Performance with Multiple Standby Databases
Redo Transport Compression
Standby Apply Performance

Data Guard 12.1 Example: Faster Failover

[Charts: failover time vs. number of database sessions on primary and standby. Failover completed in 48 seconds and 43 seconds with 2,000 sessions on both primary and standby.]

Data Guard 12.1 Example: Faster Switchover

[Charts: switchover time vs. number of database sessions on primary and standby. Switchover completed in 72 seconds with 1,000 sessions and in 83 seconds with 500 sessions on both primary and standby.]
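For reference, a minimal sketch of how these role transitions can be initiated with the single-command SQL syntax introduced in Oracle Database 12.1 (the standby name "boston" is hypothetical; Data Guard Broker provides equivalent DGMGRL commands):

```sql
-- Sketch only: Oracle Database 12.1 single-command role transitions.
-- "boston" is a hypothetical DB_UNIQUE_NAME for the target standby.

-- Planned, zero-data-loss role reversal (run on the primary):
ALTER DATABASE SWITCHOVER TO boston;

-- Unplanned transition after losing the primary (run on the standby):
ALTER DATABASE FAILOVER TO boston;
```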

Agenda: Data Guard Performance

Failover and Switchover Timings
SYNC Transport Performance
ASYNC Transport Performance
Primary Performance with Multiple Standby Databases
Redo Transport Compression
Standby Apply Performance

Synchronous Redo Transport: Zero Data Loss

Primary database performance is impacted by the total round-trip time for acknowledgement to be received from the standby database
The Data Guard NSS process transmits redo to the standby directly from the log buffer, in parallel with the local log file write
The standby receives redo, writes it to a standby redo log file (SRL), then returns an ACK
The primary receives the standby ACK, then acknowledges commit success to the application
The following performance tests show the impact of SYNC transport on the primary database using various workloads and latencies
In all cases, transport was able to keep pace with redo generation (no lag)
We are working on test data for Fast Sync (SYNC NOAFFIRM) in Oracle Database 12c: the same process as above, but the standby acknowledges the primary as soon as redo is received in memory; it does not wait for the SRL write. A configuration sketch follows.
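As a reference point, a minimal sketch of enabling SYNC transport to a standby (the destination number, service, and DB_UNIQUE_NAME "boston" are hypothetical):

```sql
-- Sketch: zero-data-loss SYNC transport to a hypothetical standby "boston".
-- AFFIRM = the commit waits for the standby SRL disk write.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 =
  'SERVICE=boston SYNC AFFIRM NET_TIMEOUT=30
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston';

-- 12c Fast Sync: the standby acknowledges once redo is received in memory,
-- without waiting for the SRL write.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 =
  'SERVICE=boston SYNC NOAFFIRM NET_TIMEOUT=30
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston';
```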

Test 1) Synchronous Redo Transport

OLTP with Random Small Inserts, <1ms RTT Network Latency
Workload: random small inserts (OLTP) to 9 tables with 787 commits per second
132K redo size, 1,368 logical reads, 692 block changes per transaction

Sun Fire X4800 M2 (Exadata X2-8)
1 TB RAM, 64 cores, Oracle Database 11.2.0.3, Oracle Linux
InfiniBand, seven Exadata cells, Exadata Software 11.2.3.2
Exadata Smart Flash, Smart Flash Logging, and Write-Back Flash enabled provided significant gains

Test 1) Synchronous Redo Transport

OLTP with Random Small Inserts and <1ms RTT Network Latency (local standby)

                                     Txn Rate (txn/sec)   Redo Rate (bytes/sec)
Data Guard SYNC transport enabled    787                  104,051,368.80
Data Guard transport disabled        790.6                104,143,368.00

At <1ms RTT and a 99MB/s redo rate: <1% impact on database throughput, 1% impact on transaction rate
RTT = network round-trip time

Test 2) Synchronous Redo Transport

Swingbench OLTP Workload with Metro-Area Network Latency
Exadata X2-8, 2-node RAC database
Smart flash logging, smart write-back flash
Swingbench OLTP workload: random DMLs, 1ms think time, 400 users, 6,000+ transactions per second, 30MB/s peak redo rate (different from Test 1)
Transaction profile: 5K redo size, 120 logical reads, 30 block changes per transaction
1 and 5ms RTT network latency

Test 2) Synchronous Redo Transport

Swingbench OLTP Workload with Metro-Area Network Latency (30 MB/s redo)

                                      Transactions per second
Baseline (no Data Guard)              6,363
Data Guard SYNC, 1ms RTT latency      6,151
Data Guard SYNC, 5ms RTT latency      6,077

3% impact at 1ms RTT, 5% impact at 5ms RTT

Test 3) Synchronous Redo Transport

Large Insert OLTP Workload with Metro-Area Network Latency
Exadata X2-8, 2-node RAC database
Smart flash logging, smart write-back flash
Large insert OLTP workload: 180+ transactions per second, 83MB/s peak redo rate, random tables
Transaction profile: 440K redo size, 6,000 logical reads, 2,100 block changes per transaction
1, 2 and 5ms RTT network latency

Test 3) Synchronous Redo Transport

Large Insert OLTP Workload with Metro-Area Network Latency (83 MB/s redo)

                                      Transactions per second
Baseline (no Data Guard)              189
Data Guard SYNC, 1ms RTT latency      188
Data Guard SYNC, 2ms RTT latency      177
Data Guard SYNC, 5ms RTT latency      167

<1% impact at 1ms RTT, 7% impact at 2ms RTT, 12% impact at 5ms RTT

Test 4) Synchronous Redo Transport

Mixed OLTP Workload with Metro-Area Network Latency
Exadata X2-8, 2-node RAC database
Smart flash logging, smart write-back flash
Mixed workload with high TPS: Swingbench plus large insert workloads, 26,000+ transactions per second and 112 MB/sec peak redo rate
Transaction profile: 4K redo size, 51 logical reads, 22 block changes per transaction
1, 2 and 5ms RTT network latency

Test 4) Synchronous Redo Transport

Mixed OLTP Workload with Metro-Area Network Latency (Swingbench plus large insert, 112 MB/s redo)

RTT Latency   Txns/s   Redo Rate (MB/sec)   % of Baseline Workload
No SYNC       29,496   116                  100%
0ms (<1ms)    28,751   112                  97%
2ms           27,995   109                  95%
5ms           27,581   107                  94%
10ms          26,860   104                  91%
20ms          26,206   102                  89%

3% impact at <1ms RTT, 5% impact at 2ms RTT, 6% impact at 5ms RTT
Note: 0ms latency on the graph represents values falling in the range <1ms

Additional SYNC Configuration Details

For the Preceding Series of Synchronous Transport Tests
No system bottlenecks (CPU, I/O, or memory) were encountered during any of the test runs
Primary and standby databases had 4GB online redo logs
Log buffer was set to the maximum of 256MB
OS max TCP socket buffer size was set to 128MB on both primary and standby
Oracle Net was configured on both sides to send and receive 128MB, with an SDU of 32K
Redo was shipped over a 10GigE network between the two systems
Approximately 8-12 checkpoints/log switches occurred per run
A sketch of these settings follows.
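A minimal sketch of how such a test configuration might be set up, using the values from the slide (the Oracle Net buffer and SDU settings live in sqlnet.ora or tnsnames.ora rather than in SQL):

```sql
-- Sketch only: instance setting used for the SYNC tests (per the slide).
-- 256MB log buffer; static parameter, so it takes effect after a restart.
ALTER SYSTEM SET log_buffer = 268435456 SCOPE = SPFILE;

-- Oracle Net settings belong in sqlnet.ora (or per-connection in
-- tnsnames.ora), not in SQL, e.g.:
--   RECV_BUF_SIZE = 134217728   -- 128MB receive buffer
--   SEND_BUF_SIZE = 134217728   -- 128MB send buffer
--   DEFAULT_SDU_SIZE = 32767    -- ~32K session data unit
```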

Customer References for SYNC Transport

Fannie Mae: case study that includes performance data
Other SYNC references:
Amazon
Intel
MorphoTrak (formerly the biometrics division of Motorola): case study, podcast, presentation
Enterprise Holdings
Discover Financial Services: podcast, presentation
Paychex
VocaLink

Synchronous Redo Transport

Caveat that Applies to ALL SYNC Performance Comparisons
Redo rates achieved are influenced by network latency, redo-write size, and commit concurrency, in a dynamic relationship that will vary for every environment and application
Test results illustrate how an example workload can scale with minimal impact on primary database performance
Actual mileage will vary with each application and environment
Oracle recommends that customers conduct their own tests using their own workload and environment; Oracle tests are not a substitute

Agenda

Failover and Switchover Timings
SYNC Transport Performance
ASYNC Transport Performance
Primary Performance with Multiple Standby Databases
Redo Transport Compression
Standby Apply Performance

Asynchronous Redo Transport: Near-Zero Data Loss

The primary does not wait for acknowledgement from the standby
A Data Guard NSA process transmits redo directly from the log buffer, in parallel with the local log file write
NSA reads from disk (the online redo log file) if the log buffer is recycled before redo transmission completes
ASYNC has minimal impact on primary database performance
Network latency has little, if any, impact on transport throughput
Uses the Data Guard 11g streaming protocol and correctly sized TCP send/receive buffers
Performance tests are useful to characterize the maximum redo volume that ASYNC can support without transport lag
The goal is to ship redo as fast as it is generated without impacting primary performance; the query sketched below can confirm whether any lag exists
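One way to check for transport (and apply) lag is the documented V$DATAGUARD_STATS view on the standby; a minimal sketch:

```sql
-- Run on the standby: current transport and apply lag as interval strings.
SELECT name, value, time_computed
  FROM V$DATAGUARD_STATS
 WHERE name IN ('transport lag', 'apply lag');
```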

Asynchronous Test Configuration

Details
100GB online redo logs
Log buffer set to the maximum of 256MB
OS max TCP socket buffer size set to 128MB on primary and standby
Oracle Net configured on both sides to send and receive 128MB
Read buffer size set to 256 (_log_read_buffer_size=256) and archive buffers set to 256 (_log_archive_buffers=256) on primary and standby (see the sketch below)
Redo is shipped over the InfiniBand network between primary and standby nodes (ensures that transport is not bandwidth constrained)
Near-zero network latency, approximate throughput of 1,200MB/sec
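A hedged sketch of setting the two underscore parameters named on the slide; these are hidden parameters, normally changed only under Oracle Support guidance, and they require a restart:

```sql
-- Sketch only: hidden parameters from the slide, set on primary and standby.
ALTER SYSTEM SET "_log_read_buffer_size" = 256 SCOPE = SPFILE;
ALTER SYSTEM SET "_log_archive_buffers"  = 256 SCOPE = SPFILE;
```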

ASYNC Redo Transport Performance Test

Oracle Database 11.2

[Chart: redo transport rate of 484 MB/sec on a single instance]

Data Guard ASYNC transport can sustain very high rates
484 MB/sec on a single node with zero transport lag
Add RAC nodes to scale transport performance
Each node generates its own redo thread and has a dedicated Data Guard transport process
Performance will scale as nodes are added, assuming adequate CPU, I/O, and network resources
A 10GigE NIC on the standby receives data at a maximum of 1.2 GB/second
The standby can be configured to receive redo across two or more instances

Data Guard 11g Streaming Network Protocol

High Network Latency has Negligible Impact on Network Throughput

[Chart: ASYNC redo transport rate in MB/sec at 0ms, 25ms, 50ms, and 100ms RTT network latency]

The streaming protocol is new with Data Guard 11g
The test measured throughput with 0-100ms RTT
ASYNC tuning best practices:
Set the correct TCP send/receive buffer size = 3 x BDP (bandwidth delay product), where BDP = bandwidth x round-trip network latency (worked example below)
Increase the log buffer size if needed to keep the NSA process reading from memory
See support note 951152.1
Use X$LOGBUF_READHIST to determine the buffer hit rate
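A worked example of this sizing rule, using illustrative numbers that are not from the slide: a 1 Gbit/s link (125 MB/s) with 25ms RTT.

```latex
\mathrm{BDP} = 125~\mathrm{MB/s} \times 0.025~\mathrm{s} = 3.125~\mathrm{MB}
\qquad\Rightarrow\qquad
\text{TCP buffer} = 3 \times \mathrm{BDP} \approx 9.4~\mathrm{MB}
```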

Agenda

Failover and Switchover Timings
SYNC Transport Performance
ASYNC Transport Performance
Primary Performance with Multiple Standby Databases
Redo Transport Compression
Standby Apply Performance

Multi-Standby Configuration

[Diagram: Primary A ships redo via SYNC to Local Standby B and via ASYNC to Remote Standby C]

A growing number of customers use multi-standby Data Guard configurations
Additional standbys are used for:
Local zero-data-loss HA failover combined with remote DR
Rolling maintenance to reduce planned downtime
Offloading backups, reporting, and recovery from the primary
Reader farms to scale read-only performance
This leads to the question: how is primary database performance affected as the number of remote transport destinations increases? (A configuration sketch follows.)
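A minimal sketch of a primary configured with one local SYNC standby and one remote ASYNC standby (all database names are hypothetical):

```sql
-- Sketch: hypothetical multi-standby setup on a primary "chicago".
-- Local standby "boston" receives redo synchronously for zero data loss:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 =
  'SERVICE=boston SYNC AFFIRM NET_TIMEOUT=30
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=boston';

-- Remote standby "denver" receives redo asynchronously for DR:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_3 =
  'SERVICE=denver ASYNC
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=denver';
```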

Redo Transport in Multi-Standby Configuration

Primary Performance Impact: 14 Asynchronous Transport Destinations

[Charts: increase in CPU and change in redo volume, each compared to baseline, as the number of ASYNC destinations grows from 0 to 14]

Redo Transport in Multi-Standby Configuration

Primary Performance Impact: 1 SYNC and Multiple ASYNC Destinations

[Charts: increase in CPU and change in redo volume, each compared to baseline, for zero, 1/0, 1/1, and 1/14 SYNC/ASYNC destinations]

Redo Transport for Gap Resolution

Standby databases can be configured to request log files needed to resolve gaps from other standbys in a multi-standby configuration
A standby database that is local to the primary database is normally the preferred location to service gap requests
Local standby databases are least likely to be impacted by network outages
Other standbys are listed next
The primary database services gap requests only as a last resort
Utilizing a standby for gap resolution avoids any overhead on the primary database (see the sketch below)
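A minimal sketch using the FAL_SERVER parameter, which lists where a standby fetches missing log files from (the service names are hypothetical; the order expresses preference):

```sql
-- Run on remote standby "denver": fetch gaps first from local standby
-- "boston", then from primary "chicago" only as a last resort.
ALTER SYSTEM SET FAL_SERVER = 'boston', 'chicago';
```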

Agenda

Failover and Switchover Timings
SYNC Transport Performance
ASYNC Transport Performance
Primary Performance with Multiple Standby Databases
Redo Transport Compression
Standby Apply Performance

Redo Transport Compression

Conserve Bandwidth and Improve RPO when Bandwidth Constrained

[Chart: transport lag in MB over elapsed time in minutes, for 22 MB/sec uncompressed redo vs. 12 MB/sec compressed redo]

Test configuration: 12.5 MB/second bandwidth, 22 MB/second redo volume
Uncompressed volume exceeds available bandwidth: the Recovery Point Objective (RPO) is impossible to achieve and transport lag increases perpetually
A 50% compression ratio results in volume < bandwidth, so the RPO is achieved (the ratio will vary across workloads)
Requires the Advanced Compression option (configuration sketch below)
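A minimal sketch of enabling compression on an ASYNC destination (requires the Advanced Compression license; the service and DB_UNIQUE_NAME "denver" are hypothetical):

```sql
-- Sketch: compress redo sent to a bandwidth-constrained remote standby.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_3 =
  'SERVICE=denver ASYNC COMPRESSION=ENABLE
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=denver';
```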

Agenda

Failover and Switchover Timings
SYNC Transport Performance
ASYNC Transport Performance
Primary Performance with Multiple Standby Databases
Redo Transport Compression
Standby Apply Performance

Standby Apply Performance Test

Redo apply was first disabled to accumulate a large number of log files at the standby database; redo apply was then restarted to evaluate the maximum apply rate for this workload (see the sketch below)
All standby log files were written to disk in the Fast Recovery Area
Exadata Write-Back Flash Cache increased the redo apply rate from 72MB/second to 174MB/second using the test workload (Oracle 11.2.0.3)
Apply rates will vary based upon platform and workload
Achieved volumes do not represent physical limits
They represent only this particular test configuration and workload; higher apply rates have been achieved in practice by production customers
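A sketch of the documented commands for this style of test: stop redo apply, let log files queue up, restart apply, and watch the apply rate via V$RECOVERY_PROGRESS:

```sql
-- On the standby: stop redo apply so received log files accumulate.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- Later, restart apply in the background (real-time apply in 11g):
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- Observe the apply rate while the backlog drains:
SELECT item, sofar, units
  FROM V$RECOVERY_PROGRESS
 WHERE item IN ('Active Apply Rate', 'Average Apply Rate');
```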

Apply Performance at Standby Database

Test 1: no write-back flash cache
On Exadata X2-2 quarter rack
Swingbench OLTP workload
72 MB/second apply rate
I/O bound during checkpoints
1,762ms for checkpoint complete
110ms DB File Parallel Write

Apply Performance at Standby Database

Test 2: a repeat of the previous test, but with write-back flash cache enabled
On Exadata X2-2 quarter rack
Swingbench OLTP workload
174 MB/second apply rate
Checkpoint completes in 633ms vs. 1,762ms
DB File Parallel Write is 21ms vs. 110ms

Two Production Customer Examples

Data Guard Redo Apply Performance
Thomson-Reuters: Data Warehouse on Exadata, prior to write-back flash cache; while resolving a gap, an average apply rate of 580MB/second was observed
Allstate Insurance: Data Warehouse ETL processing resulted in an average apply rate of 668MB/second over a 3-hour period, with peaks hitting 900MB/second

Redo Apply Performance for Different Releases

Range of Observed Apply Rates for Batch and OLTP

[Chart: standby apply rate in MB/sec, high end for batch and high end for OLTP, across Oracle Database 9i, Oracle Database 10g, Oracle Database 11g (non-Exadata), and Oracle Database 11g (Exadata)]
