InfoSphere Data Replication Technical Enablement, CDL, IBM
Q Replication
Performance Tuning & Best Practice
Information Management
Objectives
Abstract
– Q Replication is a high-performance log-capture / transaction-replay replication technology that uses IBM WebSphere® MQ to transmit and stage data between source and target database systems.
– Q Replication can be used in various business scenarios, such as offload reporting, real-time warehousing, data synchronization, information integration, high availability, and active/active solutions. Its performance is key to success in these usage scenarios.
– This presentation describes the performance challenges that Q Replication faces, and the best practices to achieve optimal performance.
Objectives
– Understand the components that may impact Q Replication performance
– Learn how to identify Q Replication performance bottlenecks
– Learn the key performance monitoring areas and tuning parameters
– Learn best practices to achieve optimal performance, for example in application design
Outline
Q Replication overview
Performance bottleneck analysis
Tuning source database
Tuning Q Capture
Tuning MQ
Tuning Q Apply
Tuning target database
Application design considerations
Q Replication scalability
Q Replication Overview
Q Replication
Technical characteristics
– Asynchronous replication – not limited by geographical distance
– Application oriented – can replicate a subset of tables
– Log-based change data capture – lowest impact on the source system
– Only changed data is delivered – minimal data processing
– Data is transferred and staged in IBM WebSphere® MQ queues – excellent data recovery capability
– Target data is always transactionally consistent – target data is available at any time
– Parallel data apply – high performance
– Runs as a service – no batch window
Q Replication Architecture
[Diagram: source and target database systems (SOURCE1/SOURCE2, TARGET1/TARGET2) connected through WebSphere MQ; Capture reads the DB log at the source.]
A Capture program reads changed data from the database recovery log and puts the data directly into WebSphere MQ queues.
WebSphere MQ delivers the data to a target system where an Apply program runs.
The Apply program pulls the data from the queues and applies it to the target tables.
Usage Scenarios
Log-based, real-time capture – asynchronous – changed data only
…
– Information Integration
– Synchronization
– Distribution and Consolidation
2) One target system for:
– Offload Query/Reporting
– Real-time BI (Dynamic Warehouse)
– Audit
Customers
– DB2® Advanced Enterprise Server Edition (LUW)
• Free when replicating with another DB2 LUW server (v9.7)
• Free when replicating with two other DB2 LUW servers (v10)
– InfoSphere® Warehouse (LUW)
• Free when replicating with another DB2 LUW server (v9.7)
• Free when replicating with two other DB2 LUW servers (v10)
Performance Bottleneck Analysis
Performance In Q Replication
[Diagram: end-to-end data flow between Site A (source) and Site B (target)]
1. Applications execute SQL; DB2 transactions are written to the database recovery log.
2. The Q Capture log reader thread (logrdr) reads captured DB2 transactions from the log (DB2 for z/OS: IFI 306; DB2 for LUW: the logRead API); monster transactions may be spilled to a spill file.
3. The Q Capture publisher thread MQPUTs transactions (1 per MQ message) into the send/transmit queues and issues MQCMIT.
4. The MQ channel transmits messages over TCP/IP to the receive queue(s) at the target.
5. The Q Apply browser thread browses/MQGETs messages from the receive queue.
6. Q Apply agents apply the transactions to the user tables via SQL.
7. The Q Apply pruner thread deletes applied messages from the receive queue.
Both queue managers persist messages through MQ buffer pools, MQ page sets, and the MQ recovery log.
Q Replication Latencies
Symptom matrix – each column is a scenario observed via the CAPMON/APPLYMON monitor tables (column 1 is the healthy baseline):

Metric                               1      2      3      4      5      6      7      8      9      10     11
Latency
  CAPTURE_LATENCY                    Small  Big    Big    Big    Big    Small  Small  Small  Small  Small  Small
  QLATENCY                           Small  Small  Small  Small  Small  Big    Big    Big    Big    Big    Big
  APPLY_LATENCY                      Small  Small  Small  Small  Small  Small  Small  Small  Small  Big    Big
  DBMS_TIME (*)                      Small  Small  Small  Small  Small  Small  Small  Small  Small  Small  Big
CPU time
  LOGREAD_API_TIME                   Normal Big    Normal Normal Normal Normal Normal Normal Normal Normal Normal
  LOGRDR_SLEEPTIME                   Normal Busy   Busy   Normal Normal Normal Normal Normal Normal Normal Normal
  MQPUT_TIME                         Normal Normal Normal Normal Big    Normal Normal Normal Normal Normal Normal
  MQGET_TIME                         Normal Normal Normal Normal Normal Normal Big    Normal Big    Normal Normal
  APPLY_SLEEP_TIME                   Normal Normal Normal Normal Normal Normal Normal Normal Normal Busy   Busy
Memory / storage
  Capture memory (CURRENT_MEMORY)    Normal Normal Normal Full   Full   Full   Full   Full   Full   Full   Full
  XMITQDEPTH                         Small  Small  Small  Small  Small  Jam    Jam    Jam    Jam    Jam    Jam
  Receive queue depth (QDEPTH)       Small  Small  Small  Small  Small  Small  Jam    Jam    Jam    Jam    Jam
  Apply memory (CURRENT_MEMORY)      Normal Normal Normal Normal Normal Normal Normal Normal Normal Full   Full
  DONEMSG row count                  Normal Normal Normal Normal Normal Normal Normal Normal Big    Normal Normal
15 © 2013 IBM Corporation
Key monitor columns
At the SOURCE database – IBMQREP_CAPQMON (one row per send queue)
– MQ_MESSAGES: total number of messages put into MQ by Q Capture
– MQPUT_TIME: time spent on MQPUT calls
– XMITQDEPTH: number of messages currently in the MQ transmit queue
At the TARGET database – IBMQREP_APPLYMON (one row per receive queue)
– OLDEST_TRANS: Q Replication synchronization point – all source transactions prior to this timestamp have been applied
– ROWS_APPLIED: total number of rows applied to the target database
– CURRENT_MEMORY: the amount of memory used by Q Apply to read transactions
– MQGET_TIME: time spent on MQGET calls
– QDEPTH: number of messages currently in the MQ receive queue
– END2END_LATENCY: average end-to-end latency for all transactions applied in this monitor interval – between source DB commit and target DB commit
– Breakdown by components:
• CAPTURE_LATENCY: latency spent in Capture – between source DB commit and source MQ commit
• QLATENCY: latency spent in MQ – between source MQ commit and target MQGET
• APPLY_LATENCY: latency spent in Apply – between target MQGET and target DB commit
• DBMS_TIME: latency spent in the target database for SQL processing
– APPLY_SLEEP_TIME: total sleep time of all apply agents in this monitor interval
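The latency columns compose additively: END2END_LATENCY is roughly CAPTURE_LATENCY + QLATENCY + APPLY_LATENCY, so the first diagnostic step is finding which component dominates. A minimal sketch of that step (the plain dict and millisecond units are illustrative assumptions, not the exact APPLYMON schema):

```python
def dominant_latency(row):
    """Return (component_name, share_of_total) for the largest of the
    three components that make up END2END_LATENCY."""
    components = {
        "CAPTURE_LATENCY": row["CAPTURE_LATENCY"],  # source DB commit -> source MQ commit
        "QLATENCY": row["QLATENCY"],                # source MQ commit -> target MQGET
        "APPLY_LATENCY": row["APPLY_LATENCY"],      # target MQGET -> target DB commit
    }
    total = sum(components.values()) or 1  # avoid division by zero on an idle system
    name = max(components, key=components.get)
    return name, components[name] / total

# Example monitor interval (values in ms): a dominant QLATENCY points
# at MQ transmission or a jammed receive queue, per the matrix above.
name, share = dominant_latency(
    {"CAPTURE_LATENCY": 40, "QLATENCY": 820, "APPLY_LATENCY": 90})
```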
Tuning Source Database
Description
– Performance issue in the source database when merging and returning log records to Q Capture
Typical causes
– Log files are not accessible (e.g., they have been moved to tape by automatic archival)
– Poor log file I/O performance
– Log read contention inside the source database
– Performance issues when merging log records from various members
Key symptoms – used to identify the bottleneck
– Log read latency is increasing
– Log reader API time is big
Reference symptoms
– Capture latency and end-to-end latency are increasing
– Log read throughput is low
– Q Capture used memory is low
Database log
– Use separate disk storage for the database log
– Use disk striping to improve performance
– Adjust the log archival strategy to keep log files on disk until they have been captured
Database parameters (LUW)
– logbufsz
• A value between 64 and 128 pages should be adequate
• Do not set this value to more than 512 pages, to avoid performance degradation
• Do not set this value to more than 35% of the database heap size
Database parameters (z/OS)
– CACHEDYN = YES
– DEALLCT = NOLIMIT
Tuning Q Capture
Description
– Performance issue in the Q Capture log reader thread when requesting source log records via the log read API and/or constructing source transactions in internal memory
Typical causes
– Q Capture spills monster transactions to disk
– Q Capture job priority is too low
Key symptoms – used to identify the bottleneck
– Log read latency is increasing
– Log reader API time is low
– Log reader sleep time is low (busy)
Reference symptoms
– Capture latency and end-to-end latency are increasing
– Log read throughput is low
– Q Capture used memory is low
Description
– Performance issue in the Q Capture publisher thread when constructing MQ messages and/or publishing them into MQ
– This bottleneck is seldom observed.
Typical causes
– LOB columns whose values need to be fetched from source tables
– Expensive row filtering
– Too many queues for the Q Capture publisher thread to handle
– Too many columns subscribed
– Q Capture job priority is too low
Key symptoms – used to identify the bottleneck
– Log read latency is normal
– Capture latency is increasing
– MQPUT time cost is normal
Reference symptoms
– Capture latency and end-to-end latency are increasing
– Capture throughput is low
– Q Capture used memory is full
– Q Capture publisher thread is busy
SLEEP_INTERVAL (Q Capture)
Description
– Defined in the IBMQREP_CAPPARMS control table
– How long the Q Capture log reader thread sleeps when
• It reaches the end of log (EOL)
• Or it reaches the end of an IFI 306 call scope
• Or memory usage would exceed MEMORY_LIMIT
– This parameter can be changed dynamically using the Q Capture “chgparms” command
Default value
– 500 milliseconds (0.5 second)
Tuning recommendations
– If the workload volume is high
• Usually no need to tune this parameter, since the log reader thread is continually reading the log
– If the workload volume is low
• A big SLEEP_INTERVAL reduces CPU usage, but results in higher replication latency
• A small SLEEP_INTERVAL reduces replication latency, but results in higher CPU usage
MEMORY_LIMIT (Q Capture)
Description
– Defined in the IBMQREP_CAPPARMS control table
– The amount of memory Q Capture uses to build transactions
– At most 32,000 transactions are buffered
• For OLTP workloads with small transactions, Q Capture may be unable to use memory up to MEMORY_LIMIT
– If there is a monster transaction that is bigger than MEMORY_LIMIT, Q Capture spills the transaction data into a spill file
• LUW: spilled to a disk file in CAPTURE_PATH
• z/OS: spilled to VIO or to the file specified by the CAPSPILL DD
– This parameter can be changed dynamically using the Q Capture “chgparms” command on LUW, but cannot be changed dynamically on z/OS
Default value
– LUW: 500 megabytes
– z/OS: 0 – Q Capture calculates the memory allocation based on region size
Tuning recommendations
– If IBMQREP_CAPMON.TRANS_SPILLED > 0
• Increase MEMORY_LIMIT to avoid transaction spilling
• This is usually needed for large transactions or batch processing
Best practice
– Avoid monster transactions in application design
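The TRANS_SPILLED check above can be sketched as a simple sizing rule. The doubling step and the cap are illustrative assumptions, not product guidance; the real adjustment should be validated against actual spill behavior:

```python
def suggest_capture_memory_limit(memory_limit_mb, trans_spilled, cap_mb=4096):
    """Return a suggested MEMORY_LIMIT (MB) for Q Capture. If any
    transaction spilled to a spill file during the monitor interval
    (IBMQREP_CAPMON.TRANS_SPILLED > 0), grow the limit until spilling
    stops; otherwise keep the current value."""
    if trans_spilled > 0:
        return min(memory_limit_mb * 2, cap_mb)  # grow, but stay bounded
    return memory_limit_mb
```

Starting from the LUW default of 500 MB with spills observed, the sketch suggests 1000 MB for the next interval.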
COMMIT_INTERVAL (Q Capture)
Description
– Defined in the IBMQREP_CAPPARMS control table
– How often the Q Capture publisher thread issues MQCMIT to commit MQPUT operations
– The Q Capture publisher issues MQCMIT when
• COMMIT_INTERVAL is reached
• Or 128 source transactions have been put on send/xmit queues since the last MQCMIT
– This parameter can be changed dynamically using the Q Capture “chgparms” command
Default value
– 500 milliseconds (0.5 second)
Tuning recommendations
– If the workload transaction rate (TPS) is high
• Usually no need to tune this parameter, since the publisher thread frequently issues MQCMIT due to the 128-source-transaction condition
– If the workload transaction rate (TPS) is low
• A big COMMIT_INTERVAL reduces CPU usage, but results in higher replication latency
• A small COMMIT_INTERVAL reduces replication latency, but results in higher CPU usage
TRANS_BATCH_SZ (Q Capture)
Description
– Defined on the Q Capture command line
– How many source transactions are packaged by the Q Capture publisher thread into a single MQ message
– The purpose is to avoid too-small MQ messages
– This single MQ message will be processed as a single source transaction by Q Apply at the target side, resulting in possibly more transaction dependencies
– This parameter cannot be changed dynamically
Default value
– 1 (no batching)
Tuning recommendations
– For OLTP systems with small transaction sizes and high TPS
• The throughput of a single MQ channel is poor when messages are small. In this situation, increasing TRANS_BATCH_SZ may create bigger messages and improve MQ channel throughput.
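The batching effect on message size can be estimated with simple arithmetic. The per-message overhead constant below is an assumed illustrative figure, not a product value:

```python
def batched_message_size(avg_txn_bytes, trans_batch_sz, header_bytes=300):
    """Rough average MQ message size when TRANS_BATCH_SZ source
    transactions are packed into one message. header_bytes is an
    assumed fixed per-message overhead for headers and framing."""
    return header_bytes + avg_txn_bytes * trans_batch_sz

# For an OLTP workload with 400-byte transactions, batching 8 of them
# turns many tiny messages into one larger one.
size = batched_message_size(400, 8)
```

Larger messages amortize per-message channel costs, at the price of coarser-grained transactions on the apply side.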
MAX_MESSAGE_SIZE (Q Capture)
Description
– Defined in the IBMQREP_SENDQUEUES control table
– Determines the maximum size of the MQ messages that Q Capture will publish into queues
– For transactions whose size is larger than MAX_MESSAGE_SIZE
• The transaction is split into multiple messages
• The transaction is broken at a row boundary. This requires that no single row exceed MAX_MESSAGE_SIZE
• For LOB columns, if LOB_SEND_OPTION='S', LOB columns are sent in separate messages and segmented as necessary
– This parameter can be changed dynamically using the Q Capture “reinitq” command
Default value
– 64 kilobytes
Tuning recommendations
– If there are many transactions bigger than MAX_MESSAGE_SIZE
• This can be concluded if IBMQREP_CAPQMON.MQ_MESSAGES is much bigger than IBMQREP_CAPQMON.TRANS_PUBLISHED
• Increase MAX_MESSAGE_SIZE to reduce the number of segmented MQ messages
CHANGED_COLS_ONLY (Q Capture)
Description
– Defined in the IBMQREP_SUBS control table
– Determines whether Q Capture will publish non-key columns that have not changed
• 'Y': Q Capture publishes only the key columns and changed non-key columns. This mode is also called “column suppression”
• 'N': Q Capture publishes all subscribed columns
– This option impacts UPDATE operations only
– This parameter can be changed dynamically using the Q Capture “reinit” command
Default value
– 'Y'
Tuning recommendations
– For tables that have many columns and only a few are updated
• CHANGED_COLS_ONLY='Y' lets Q Capture send out only key and changed data, resulting in smaller MQ messages and saving network bandwidth
• At the target side, CONFLICT_ACTION cannot be 'F', since the received row data is incomplete
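The savings from column suppression can be counted directly. A minimal sketch (counting column values published per UPDATE; byte sizes would scale the same way):

```python
def update_payload_cols(total_cols, key_cols, changed_cols, changed_cols_only):
    """Number of column values Q Capture publishes for one UPDATE.
    With 'Y' (column suppression) only the key plus the changed
    non-key columns go out; with 'N' every subscribed column does."""
    if changed_cols_only == "Y":
        return key_cols + changed_cols
    return total_cols

# A 100-column table with a 1-column key where 2 columns change:
# suppression shrinks the published row from 100 values to 3.
suppressed = update_payload_cols(100, 1, 2, "Y")
full = update_payload_cols(100, 1, 2, "N")
```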
Tuning WebSphere MQ
Description
– Performance issue in the source MQ queue manager when Q Capture puts MQ messages into send/transmit queues
Typical causes
– Poor MQ logging performance
– MQ I/O contention (e.g., at a checkpoint)
– MQ job priority is too low
Key symptoms – used to identify the bottleneck
– Log read latency is normal
– Capture latency is increasing
– MQPUT time cost is big
Reference symptoms
– Capture latency and end-to-end latency are increasing
– Capture throughput is low
– Q Capture used memory is full
– Depth of the send/transmit queue is small
Description
– Performance issue in the target MQ queue manager when putting MQ messages into receive queues from channels and reading/getting them out to Q Apply
Typical causes
– Poor MQ logging performance
– MQ I/O contention (e.g., at a checkpoint)
– MQ job priority is too low
– The target MQ read-ahead feature is not enabled
– Target MQ buffer pool shortage
Key symptoms – used to identify the bottleneck
– Q latency is big
– Apply latency is normal
– MQGET time cost is big
– Depth of the receive queue is big
Reference symptoms
– End-to-end latency is increasing
– Capture performance (latency and throughput) is also impacted
MQ Buffer Pools (MQ)
z/OS only
MQ performance is greatly impacted by buffer pool shortage
– MQ accounting will show long elapsed times for MQPUT/MQGET operations when there are lots of messages accumulated in transmit or receive queues
– In such cases, MQPUT/MQGET involves disk I/O operations
– A buffer pool page is written to the disk page set
• At an MQ checkpoint or shutdown
• At the 15% free threshold – causes asynchronous writes
• At 5% free pages – causes synchronous writes
– Currently the maximum buffer pool size is limited
• Usually 1 GB – limited by the 31-bit MQ address space
Tuning recommendations
– Allocate a buffer pool as large as possible in advance
– Enable the MQ read-ahead feature, which can greatly improve MQPUT/MQGET performance when the MQ buffer pool is full
BATCHSZ (MQ)
Description
– The number of messages that are committed by the queue manager between two sync points.
– A channel batch is committed when:
• BATCHSZ messages have been sent, or
• The transmission queue is empty and BATCHINT has been exceeded
– A big BATCHSZ reduces the number of MQ commits
Default value
– 50
Tuning recommendations
– If the workload transaction rate (TPS) is low
• Usually no need to tune this parameter.
– If the workload transaction rate (TPS) is high
• Use a bigger value for BATCHSZ, usually between 50 and 640, to reduce the number of commits and improve channel throughput, with a trade-off of more memory consumed for uncommitted messages.
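The commit-rate effect of BATCHSZ is just a division when the channel is busy (BATCHINT never triggers). A minimal sketch under that assumption:

```python
def channel_commits_per_second(msgs_per_second, batchsz):
    """Approximate channel sync points per second on a busy channel:
    one commit per BATCHSZ messages (assumes the transmission queue
    never drains, so the BATCHINT path is not taken)."""
    return msgs_per_second / batchsz

# At 10,000 msgs/s, raising BATCHSZ from 50 to 640 cuts commits from
# 200/s to about 16/s, at the cost of more uncommitted messages in memory.
low = channel_commits_per_second(10_000, 50)
high = channel_commits_per_second(10_000, 640)
```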
Tuning Q Apply
Description
– Performance issue in the Q Apply browser thread when getting MQ messages from receive queues and constructing transactions in internal memory
Typical causes
– Poor Q Apply performance when calculating transaction dependencies
– Long waits due to synchronization between different consistency groups
– Q Apply job priority is too low
Key symptoms – used to identify the bottleneck
– Q latency is big
– Apply latency is normal
– MQGET time cost is small
– Depth of the receive queue is big
Reference symptoms
– End-to-end latency is increasing
– Capture performance (latency and throughput) is also impacted
Description
– Performance issue in the Q Apply agent threads when applying data changes to the target database
– This bottleneck is seldom observed.
Typical causes
– Q Apply job priority is too low
Key symptoms – used to identify the bottleneck
– Apply latency is big
– Target DB2 response time is normal
– Depth of the receive queue is big
– Q Apply used memory is full
Reference symptoms
– End-to-end latency is increasing
– Q latency is big
– Capture performance (latency and throughput) is also impacted
Description
– Performance issue in the Q Apply pruner/housekeeping thread when pruning applied transactions from the IBMQREP_DONEMSG table and/or receive queues.
– This bottleneck is seldom observed.
Typical causes
– Poor performance when deleting records from the IBMQREP_DONEMSG table.
– Poor performance when deleting MQ messages from receive queues.
– Q Apply job priority is too low
Key symptoms – used to identify the bottleneck
– Q Apply latency is normal
– Q Apply throughput is normal or equal to Q Capture throughput
– Row count of IBMQREP_DONEMSG is big
Reference symptoms
– Depth of the receive queue is big
– Q Apply used memory is normal
NUM_APPLY_AGENTS (Q Apply)
Description
– Defined in the IBMQREP_RECVQUEUES control table
– The number of Q Apply agent threads that concurrently apply transactions to the target database for a single receive queue (consistency group).
– Q Apply agents fetch independent transactions from the internal WORKQ and apply them in parallel
– This parameter can be changed dynamically using the Q Apply “reinitq” command
Default value
– 16 agents
Tuning recommendations
– More agents may increase apply throughput, but may cause more contention at the target database.
– Start tuning from the default value (16 agents)
– If the agent threads are busy (IBMQREP_APPLYMON.APPLY_SLEEP_TIME is nearly 0), increase the number of agent threads
– If there is lock contention in the target database between Q Apply agents, decrease the number of agent threads to avoid unnecessary CPU usage
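The parallelism available to the agents depends on how many transactions are mutually independent. A simplified model of the dependency rule (transactions touching disjoint target rows can run on different agents; this greedy wave grouping is an illustration, not Q Apply's actual scheduler, and it preserves commit order between waves):

```python
def parallel_waves(transactions):
    """Group transactions (each a list of touched row keys, given in
    commit order) into 'waves'. All transactions within one wave touch
    disjoint keys and could be applied by separate agents in parallel;
    a transaction that conflicts with the current wave starts the next
    one, so dependent work never jumps ahead of its predecessor."""
    waves = []  # each wave: (set_of_keys_touched, [transaction_indexes])
    for i, keys in enumerate(transactions):
        if waves and waves[-1][0].isdisjoint(keys):
            waves[-1][0].update(keys)   # independent: join the current wave
            waves[-1][1].append(i)
        else:
            waves.append((set(keys), [i]))  # conflict: start a new wave
    return [members for _, members in waves]

# Transactions 0 and 1 touch different rows (parallel); transaction 2
# conflicts with 0 on key 1, so it must wait for the first wave.
waves = parallel_waves([[1], [2], [1, 3], [4]])
```

Workloads that repeatedly update the same rows (the "hot row" pattern) collapse into many single-transaction waves, which is why extra agents cannot help there.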
MEMORY_LIMIT (Q Apply)
Description
– Defined in the IBMQREP_RECVQUEUES control table
– The amount of memory that a Q Apply program can use as a buffer to process transactions from one receive queue (consistency group).
– When the used memory exceeds MEMORY_LIMIT, the Q Apply browser thread stops reading more messages from the receive queue
– If a single transaction is bigger than MEMORY_LIMIT, Q Apply applies the transaction in serial mode
– This parameter can be changed dynamically using the Q Apply “reinitq” command
Default value
– 32 megabytes
Tuning recommendations
– A bigger MEMORY_LIMIT does not always result in better performance
– Start tuning from the default value (32 MB)
– Usually, full memory (IBMQREP_APPLYMON.MEM_FULL_TIME is big) indicates that the agents or the target database do not perform well. Increasing browser memory cannot solve such issues.
– If the agent threads are idle, the target database response time is normal, and the browser memory is full, try increasing MEMORY_LIMIT
MAXAGENTS_CORRELID (Q Apply)
Description
– Defined in the IBMQREP_RECVQUEUES control table
– The maximum number of Q Apply agent threads that can concurrently apply transactions belonging to the same correlation ID for a single receive queue (consistency group).
– This is designed for batch processing. One batch job has a single correlation ID, and users can use MAXAGENTS_CORRELID to control the degree of parallelism for each batch job.
– This parameter can be changed dynamically using the Q Apply “reinitq” command
Default value
– 0 (no limit – use as many agents as possible)
Tuning recommendations
– Start tuning from the default value (same effect as NUM_APPLY_AGENTS)
– Check the values of IBMQREP_APPLYMON.DEADLOCK_RETRIES and IBMQREP_APPLYMON.JOB_DEPENDENCIES. If they are high, try decreasing MAXAGENTS_CORRELID
Tuning Target Database
Description
– Performance issue in the target database when executing SQL statements from Q Apply.
Typical causes
– Lack of indexes that can be utilized when executing the SQL statements from Q Apply
– Poor performance when accessing Q Apply control tables (such as IBMQREP_DONEMSG)
– Poor performance inside the target database (such as lock contention, I/O contention, etc.)
Key symptoms – used to identify the bottleneck
– Target DB2 response time is big
Reference symptoms
– Apply latency is big
– Depth of receive queue is big
– Q Apply used memory is full
– End-to-end latency is increasing
– Capture performance (latency and throughput) is impacted
– Q latency is big
General configuration
– The target database should be tuned in a similar way as the source database to optimize performance
Special configuration
– Unique indexes
• Unique indexes at the target side are necessary for Q Apply to generate effective SQL statements, and are very important to Q Apply performance.
– Lock size
• Configure target table spaces to use row-level locking to avoid lock contention.
Special considerations for tuning Q Apply control tables
– IBMQREP_DONEMSG
• This is a HOT table introduced by Q Replication
• There are two additional operations (1 insert + 1 delete) on this table for each replicated source transaction
• The table is by default defined with the APPEND ON and VOLATILE keywords. Users should periodically run REORG against the table space and index space.
Application Design Considerations
During application design, special consideration is required in database and workload design to guarantee the functionality and performance of Q Replication
– SEQUENCE and IDENTITY columns
– Unique indexes on the target table
– Triggers
– RI constraints
– Large Object (LOB) data types
– Big transactions
– Long-running transaction jobs
– Hot rows
– Non-logged operations
– Multi-row update statements
The full list of application considerations is documented at:
– http://pic.dhe.ibm.com/infocenter/dzichelp/v2r2/topic/com.ibm.swg.im.iis.repl.qrepl.doc/topics/iiyrqplnconstructs.html
Large Object (LOB) Data Types
Non-logged LOB
– Q Capture fetches the LOB column value from the table when publishing row changes
– Performance is very poor
Logged LOB
– In-lined LOB
• Supported since DB2 for z/OS v10 and DB2 for LUW v9.7
• Q Capture fetches the LOB column value directly from DB2 recovery log records
• Users should define LOBs as inline in DB2 whenever possible
– Not in-lined LOB
• Q Capture fetches the LOB column value from the table when publishing row changes
• Performance is very poor
• Since DB2 for LUW v10.1, Q Capture fetches LOB column values directly from DB2 recovery log records, whether or not they are in-lined
Best practice
– Avoid using LOBs, or use in-lined LOBs as much as possible
Big Transactions
Best practice
– Avoid very big transactions as much as possible; for example, force a commit after a fixed number (such as 1,000) of row changes
– Use the Q Capture “warntxsz” parameter to detect the existence of very large transactions, and tune the MEMORY_LIMIT parameters
– Some customers choose to exclude certain very large transactions (e.g., a purge job that deletes GBs of data in a single DB2 transaction) from replication by using the IBMQREP_IGNTRAN table, and manually execute the job at each site
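The "commit after a fixed number of row changes" practice can be sketched as a chunked purge loop. `delete_batch` and `commit` are hypothetical callbacks standing in for the application's actual database calls:

```python
def purge_in_chunks(row_ids, delete_batch, commit, chunk_size=1000):
    """Delete rows in chunks of chunk_size, committing after each
    chunk, so no single replicated transaction grows into a monster
    transaction that spills out of Q Capture's memory."""
    for start in range(0, len(row_ids), chunk_size):
        delete_batch(row_ids[start:start + chunk_size])  # hypothetical DB call
        commit()                                         # hypothetical DB call
```

A 2,500-row purge then replicates as three modest transactions instead of one, keeping each well under MEMORY_LIMIT.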
Hot Row
Q Replication Scalability
[Diagram: scale-out configurations – multiple Q Capture and Q Apply instances, each pair connected through its own send and receive queues]
References