Beruflich Dokumente
Kultur Dokumente
Solutions
Sun Hongda
Abstract
The redo log plays a prime role in Oracle databases core functionality. However it
imposes disk i/o for the redo logs inherent functionality, increased i/o can highly affect
the Oracle performance. In this paper I am analyzing the performance issues related
to the Oracle redo log, and solutions to address those issues, also I am covering how
to minimize the overhead of redo logs and improve the overall database performance.
Concept review
The redo log buffer is a RAM area defined by the initialization parameter log_buffer
that works to save changes to database, in case something fails and Oracle has to put
it back into its original state (a rollback). When Oracle SQL updates a table, redo
images are created and stored in the redo log buffer. Since RAM is faster than disk,
this makes the storage of redo very fast.
Oracle will eventually flush the redo log buffer to disk, called online redo log. This can
happen in a number of special cases, but whats really important is that Oracle
guarantees that the redo log buffer will be flushed to disk after a commit operation
occurs. When you make changes in the database you must commit them to make the
changes permanent and visible to other users.
In case of Oracles Archive log mode , online redo can be achieved and called as
offline or achieved redo log ., In case of No archive log mode Oracle writes in rotation
through its redo log groups, it will eventually overwrite a group which it has already
written to. Data that is being overwritten would of course be useless for a recovery
scenario. In order to prevent that, oracle has introduced the archive log mode,. In
case of archive log mode, Oracle makes sure that online redo log files are not
overwritten unless they have been safely archived at some other place in disk. Some
key points are -
A database can be recovered from media failure only if it runs under archive log
mode.
Oracles background process LGWR process is responsible to write the redo log
buffers to disk.
Oracles background process ARCn is in-charge for archiving redo logs , if
automatic archiving is enabled by keeping achieve log mode On. All changes that
are covered by redo are first written into the log buffer (RAM). The idea of storing
redo log in the RAM is to reduce disk I/O. Of-course, when a transaction commits,
the redo log buffer must be flushed to disk, otherwise the recovery for that commit
could not be guaranteed. It is LGWR (Log Writer) process which is responsible for
redo log buffer flushing into disk .
Issues
Excessive redo log generation is one of the potential issue related to redo log, Impact
of excessive redo generation is very high and impacted many areas;
Higher CPU usage - Generating redo records and copying them to log buffer
consumes CPU resources
LGWR process need to work hard if the redo log generation rate is very high.
High redo log generation resulted into very frequent log switches and might
increase the checkpoint frequency which in turn slow down the overall disk
performance
In case of archive log mode,
Arch process leads to generate more archive log files. This again introduces
additional CPU and disk usage.
Even backup of these archive log files uses more CPU, disk and possibly
tape resources.
Redo entries are copied in to redo buffer under the protection of various latches,
and thus excessive redo log leads to increase in contention for redo latches
Another potential issue is the disk performance, which is related to the above Issus, if
database generates excessive redo logs, it will slowdown the disk performance
drastically. Database performance is worse in case of heavily loaded environment,
more load on database server leads to more disk I / o due to inherent nature of redo
logs to be written to disk.
We can take a conclusion that, the main issues of redo logs are its rate of
generation and disk performance, so the key aspect of
redo log performance
tuning is to control excessive redo log generation and improve the disk performance.
Number
1
2
3
4
5
6
Factor
the number of index
the number of threads
the type of SQL statement
number of columns
commit frequency
cache size
Table 1, the factors affecting redo log size(1)
Emrick.S.E.(2006) also done an excellent research about oracle business
reinforcement rules affect the oracle redo log performance. He divided the factors
related to business reinforcement rules affect the redo log into following categories,
please refer the Table 2 below Number Factor
1 the Uniqueness Enforcement through
primary key or unique key constraint
2 the referential integrity constraint
Table 2 the factors which affect redo log size(2)
However their research did not conclude the most affecting factors to reduce
excessive redo log generation , In order to enhance the solution I came-up with a
case study based on above factors to find out the rationale of redo log size changes
under different conditions , listed in below table Table3.
In order to find out the factors which can reduce the size of redo log generation
among those factor listed in above research, I have summarized their results in one
table and created a chart by using data in Shamsudeen.A.(2008) and ratio R in
Emrick.S.E.(2006), The results are listed in table 3 with test number.
Number
Test case
Redo size
No indices(10000rows)
30107436
One index(10000rows)
38043080
18432752
26020552
4612968
4612968
202172680
21735972
4608084
10
419249316
11
261210412
12
145939212
13
7620572
14
4296472
15
4002344
16
4217152
17
4238944
18
19
cache size
10mb
4975192
20
2628168
21
2393420
22
23
15419184
52601052.24
91725801.84
161992525.3
table) fail
update 1 varchar2(100) columns and 3 number columns with Non unique
25
357551069.4
index(parent table) fail
26
Deleted parent table and insert child table failed transaction(Unique index)
64990556.28
27
Deleted parent table and insert child table failed transaction( non Unique index)
592365952.4
210152465.3
table) fail
update 1 varchar2(100) columns and 3 number columns with non-unique index(child
29
103616840.5
table) fail
As illustrated in Figure 1, we can see the different cases of redo log generation are
changing under different conditions,
419249316
357551069.4
592365952.4
To conclude, I have summarizes top five test cases which generate less redo log in
table 5.
Number
Test case
Redo size
21
2393420
20
2628168
15
4002344
16
4217152
17
4238944
rows
As we can spot in the Figure 2, the SSD Server is faster than HD server. It shows SSD
4 Server has highest TPS, while the number of client reaching 300 and HD2 Server
has lowest TPS at the same time. There are certain other alternative solutions
available in market to achieve the similar disk performance as SSD. One of solution is
adding faster disk chains to the computer, the other solution is to increase the disk
buffer in memory. Matthew .Y.(2008) has done the research to compare the
performance of those solutions
If the type of disk changes from Raptor to MFT , it will sharply increasing TPM from
285.15 to 4328.49 i.e. 15x faster than Raptor. However, by increasing memory up to
2.75GB increases Disk TPM from 285.15 to 625.8 i.e. only 2x faster.
The Figure 3 is also indicates if Disk Raptor is changed to MFT, and the innodb Buffer
size become 1GB, the disk TPM increases from 285.15 to 6046.75, i.e. 21x faster. It
conclude that the faster disk is more important in order to speed up disk performance.
But at the same time it is even better to add more memory as well . For the system
engineers to improve the disk i/o performance , recommended solution in priority
order given below 1. Faster disk with higher main memory
2. Faster disk
Summary
In order to reduce the redo log generation, DBAs can increase the main memory
cache size, one should also increase the commit frequency to reduce the disk
i/o. However, for the faster i/o, DBAs can add more disk chains, for an economic
solution, if cost is not the factor then it is suggested to use SSD disks. It is also
recommended that one should not use the RAID 5 for system reliability as it will
marginally lower the disk performance.
Reference
Shamsudeen.A.(2008) redo internals and tuning by redo reduction[Electronic version]. retrieved
May
10,2010
from
http://orainternals.files.wordpress.com/2008/07/riyaj_redo_internals_and_tuning_by_redo_red
uction_doc.pdf
Emrick.S.E.(2006) The Oracle Redo Generation [Electronic version]. Retrieved May 25,2010 from
http://hosteddocs.ittoolbox.com/EE010306.pdf
Confio software .(2009) Log File Switch Completion [Electronic version]. retrieved June 1,2010
from http://www.confio.com/English/Tips/Log_File_Switch_Completion.php
retrieved June
Hutsell,W and Mike.A.(2009) Faster Oracle Performance with Solid State Disks retrieved May
1,2010 http://www.texmemsys.com/files/f000139.pdf
ACKNOWLEDGEMENTS
I greatly appreciate the guidance, valuable feedback and insight from Shailesh
Paliwal , Denise Kalm for all her encouragement.
Authors Profile
Sun Hongda is Cisco Certified Network Associate, Certified Internet Web Security
Analyst, Certified Internet Web (CIW) professional, MSc in Networking and Web
Technology (UK).He is working in TAIJI Computer Corporation Company limited,
E-government research center, which is The Leading supplier of government
IT-service in CHINA. He works as system engineer and mainly focuses on Web
system performance and reliability.