
Case Study: Diagnosing Another Buffer Busy Waits Issue

Author: Stephan Haisley, Consulting Technical Advisor, Oracle Corporation
Editor: Vickie Carbonneau, Principal Support Engineer, Oracle Corporation

Skill Level Rating for this Case Study: Intermediate

About Oracle Case Studies

Oracle Case Studies are intended as learning tools and for sharing information or knowledge related to a complex event, process, procedure, or series of related events. Each case study is written based on the experience of its writer or writers.

Each Case Study contains a skill level rating. The rating indicates the skill level the reader should have in relation to the information in the case study. Ratings are:

Expert: significant experience with the subject matter
Intermediate: some experience with the subject matter
Beginner: little experience with the subject matter

Case Study Abstract

This case study details how to diagnose a buffer busy waits issue on a table undergoing large amounts of concurrent inserts. This is typical of a table used for web-based OLTP transactions in which sales data is recorded. The recommended solution for such problems involves increasing the number of process freelists for the table to spread out the use of available datablocks. The problem presented in this case study persisted when this solution was implemented.

When diagnosing the problem, several diagnostic events were set to highlight which part of the freelist search algorithm was causing the buffer contention. These events should not be set unless advised by Oracle Support Services.

Case History

The issue reported by the customer in this case study was:

We have a big 1.4TB table and every day the application inserts around 6 to 9 million rows. The row contains a long raw column and is around 3KB. During the time of the inserts we see lots of sessions waiting on buffer busy waits on the same datablocks.

The customer reported that they were carrying out the inserts from a number of concurrent sessions and had set the number of process freelists on the table to 23. The table involved had the following structure:

Name         Null?    Type
------------ -------- ----------
ID           NOT NULL NUMBER(38)
X_SIZE       NOT NULL NUMBER(4)
Y_SIZE       NOT NULL NUMBER(4)
BEGIN_DATE   NOT NULL DATE
FINISH_DATE  NOT NULL DATE
PICTURE      NOT NULL LONG RAW
PICTURE_LEN  NOT NULL NUMBER(38)

Three indexes existed on the table:

1. Unique index on ID column
2. Non unique index on BEGIN_DATE column
3. Non unique index on FINISH_DATE column

It was reported that these indexes showed no signs of contention, and that the customer could always see one session waiting on a db file sequential read while all the others waited on buffer busy waits with reason code 120, which indicates that the session is waiting for the block to be read into the buffer cache.

The table had been created some time ago, and the sudden appearance of the buffer busy waits problem was explained simply as an increase in the volume of data being inserted.

The puzzling thing was that processes are supposed to map to different process freelists based on the following algorithm:

Process free list entry = (P % NFL) + 1

where
P   : Oracle PID of the process (index in V$PROCESS)
NFL : number of process free lists, as defined by FREELISTS
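The mapping is easy to check by hand. The following Python sketch is an illustration only, not Oracle code; the function name process_freelist and the sample PIDs are hypothetical. It shows that two processes land on the same process freelist exactly when their PIDs differ by a multiple of NFL.

```python
def process_freelist(pid: int, nfl: int) -> int:
    """Return the 1-based process freelist for an Oracle PID, per (P % NFL) + 1."""
    if nfl < 1:
        raise ValueError("FREELISTS must be at least 1")
    return (pid % nfl) + 1


NFL = 23  # FREELISTS setting reported by the customer

# Hypothetical PIDs, chosen only to illustrate the mapping.
for pid in (9, 10, 32, 33):
    print(pid, "->", process_freelist(pid, NFL))
```

With 23 process freelists, PIDs 9 and 32 share a freelist while PIDs 9 and 10 do not, so a few dozen concurrent sessions would normally be spread over most of the lists.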

If no free blocks can be found on the assigned process freelist (PFL), the process searches the Master Freelist (MFL) for space. If no space is found, committed Transaction Freelists are merged into the MFL and those blocks are scanned for usefulness. If still no space is found, the High Water Mark is incremented and the new blocks can be moved to the PFL and used. The customer was reporting all processes waiting on the same datablocks whilst carrying out the insert statements, which indicates some problem with the freelist search mechanism, since the multiple process freelists should have mitigated this kind of contention. This is where the data gathering and analysis begins.

The customer's database version used in this case study is 8.1.7.4 on Solaris. However, this issue could occur on any platform with database version 8.0 or higher.

Analysis Summary

A number of different data items were collected to help determine a cause. These included:

- Statspack reports to show what the largest contention points were for the database
- I/O statistics from the operating system to see if we were running into bad I/O times slowing down the freelist search
- Datablock dumps for the segment header and other blocks being waited on to see how the freelists were changing
- Several diagnostic events that dump information to a trace file and show how the freelist search mechanism was working

Using the collected data it was possible to build a test case that reproduced the same problem on a multiple CPU box using concurrent sessions inserting into the same table. The test case did in fact show that the freelist search mechanism is a single point of contention when there are few rows per block and a high concurrency level of DML activity. An in-depth review of the collected data is provided in the next section.

Detailed Analysis

The following sections describe how each piece of data was collected and what it shows. Once all the data has been described, a final cause determination is presented, tying everything together.

The Statspack report clearly showed massive I/O contention issues:

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                          Wait  % Total
Event                          Waits  Time (cs)  Wt Time
------------------------- ---------- ---------- --------
db file sequential read    3,045,656  5,483,001    82.74
buffer busy waits          1,253,698  1,013,605    15.30
latch free                   100,955     95,862     1.45
log file sync                 30,198     13,384      .20
db file parallel write        19,057      8,318      .13
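As a quick sanity check on the figures above, the average time per wait can be derived by dividing the wait time by the number of waits. The sketch below is illustrative Python, not part of the original analysis; the helper name avg_wait_ms is hypothetical, and the only assumption is that Statspack reports time in centiseconds (1 cs = 10 ms), as the column heading states.

```python
# (waits, total wait time in centiseconds), copied from the Top 5 Wait Events section.
TOP_WAITS = {
    "db file sequential read": (3_045_656, 5_483_001),
    "buffer busy waits": (1_253_698, 1_013_605),
    "latch free": (100_955, 95_862),
    "log file sync": (30_198, 13_384),
    "db file parallel write": (19_057, 8_318),
}


def avg_wait_ms(waits: int, time_cs: int) -> float:
    """Average time per wait in milliseconds (1 centisecond = 10 ms)."""
    return time_cs * 10 / waits


for event, (waits, time_cs) in TOP_WAITS.items():
    print(f"{event:<25} {avg_wait_ms(waits, time_cs):6.1f} ms per wait")
```

This works out to roughly 18 ms per single-block read and roughly 8 ms per buffer busy wait.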

Looking further down the Statspack report, it is clear which tablespace is having all the waits against it:

Tablespace IO Summary for DB:

                        Avg Read               Total Avg Wait
Tablespace        Reads     (ms)   Writes      Waits     (ms)
------------ ---------- -------- -------- ---------- --------
DATA01        2,670,571     19.2   58,417  1,244,996      8.1
INDX01          378,959     18.6    9,345      2,513      9.4
RBS02                14   ######    6,648         89      0.9
USERS01             248    113.1      248          0      0.0
SYSTEM               39    181.3        7          0      0.0
TEMP02                3   ######        2          0      0.0

Ideally the average read times should be at most 10-20ms, so although the DATA01 tablespace is in the upper band, this alone does not explain why the customer is reporting large numbers of buffer busy waits. The datafile I/O statistics section of the Statspack reports showed many of the datafiles with waits against them, and four of the fourteen disks being used for the datafiles showed average read times of 20-25ms. I/O statistics from the operating system using vxstat also showed the same four disks with a higher average read time than the rest. The other disks also showed relatively high read times:

                   OPERATIONS         BLOCKS       AVG TIME(ms)
TYP NAME          READ  WRITE     READ   WRITE     READ  WRITE

Tue Jun 07 11:01:00 2005
vol data01       10620    445   169920    7120     24.7    5.5
vol data02       10148    231   162368    3696     25.7    5.3
vol data03       11559    221   184944    3536     24.6    5.3
vol data04       12018    202   192288    3232     26.7    4.5
vol data05       13061    199   208976    3184     23.0    4.8
vol data06       12254    215   196064    3440     23.8    6.0
vol data07       13153    187   210448    2992     23.9    4.4
vol data08       18047    275   288752    4400     23.9    4.7
vol data09       16159    474   258544    7584     23.6    5.2
vol data10       16492    355   263872    5680     29.6    6.5
vol data11       15840    275   253440    4400     26.0    3.8
vol data12       29629    190   474064    3040     44.2   19.2
vol data13       35560    174   568960    2784     41.8   12.8
vol data14       28231    127   451696    2032     36.3   14.0

The I/O statistics certainly indicate a performance problem but do not explain the earlier reports of buffer busy waits on the same blocks, which contradict the expected way in which process freelists spread out the use of datablocks amongst processes.

The customer accepted that the I/O was certainly not helping performance, but wanted to concentrate on why the process freelists did not appear to be used correctly. The investigation moved in this direction.

The buffer busy wait statistics in the Statspack report show the class of block being waited on:

Buffer wait Statistics
                                 Tot Wait       Avg
Class                    Waits  Time (cs) Time (cs)
------------------ ----------- ---------- ---------
data block           1,224,478  1,011,605         1
segment header          19,946        753         0
undo header                 79          6         0
undo block                  10          2         0

This indicates a hot block issue, so we needed to find out what these hot blocks were and why they appeared so hot.

In order to find the hot block(s), v$session_wait was queried to show which blocks were being contended for. The results proved interesting, as the particular block being sought was always changing:

select b.sid, b.username, event, wait_time, p1, p2, p3, b.sql_hash_value, b.status
from v$session_wait a, v$session b
where event not like 'SQL*Net message%'
and event not like 'rdbms%'
and a.sid = b.sid
and b.sql_hash_value = 4290940428
and b.sid > 8
order by sql_hash_value;

SID USERNAME EVENT                   WAIT_TIME  P1     P2  P3
--- -------- ----------------------- --------- --- ------ ---
 16 GALLERY  buffer busy waits               0  42 249961 120
 44 GALLERY  buffer busy waits               0  42 249961 120
 55 GALLERY  buffer busy waits               0  42 249961 120
111 GALLERY  buffer busy waits               0  42 249961 120
117 GALLERY  buffer busy waits               0  42 249961 120
313 GALLERY  buffer busy waits               0  42 249961 120
316 GALLERY  buffer busy waits               0  42 249961 120
282 GALLERY  buffer busy waits               0  42 249961 120
179 GALLERY  buffer busy waits               0  42 249961 120
200 GALLERY  buffer busy waits               0  42 249961 120
226 GALLERY  db file sequential read         0  42 249961   1

SQL> /


SID USERNAME EVENT                   WAIT_TIME  P1     P2  P3
--- -------- ----------------------- --------- --- ------ ---
 16 GALLERY  buffer busy waits               0 257 101465 120
 44 GALLERY  buffer busy waits               0 257 101465 120
 86 GALLERY  buffer busy waits               0 257 101465 120
104 GALLERY  buffer busy waits               0 257 101465 120
147 GALLERY  buffer busy waits               0 257 101465 120
179 GALLERY  buffer busy waits               0 257 101465 120
200 GALLERY  buffer busy waits               0 257 101465 120
226 GALLERY  buffer busy waits               0 257 101465 120
254 GALLERY  buffer busy waits               0 257 101465 120
316 GALLERY  buffer busy waits               0 257 101465 120
313 GALLERY  buffer busy waits               4 257 101465 120
292 GALLERY  buffer busy waits               0 257 101465 120
184 GALLERY  buffer busy waits               0 257 101465 120
164 GALLERY  buffer busy waits               0 257 101465 120
111 GALLERY  db file sequential read         0 257 101465   1

Note: In v$session_wait, the P1, P2, and P3 columns identify the file number, block number, and buffer busy reason code, respectively.

We then needed to find out which object these blocks belonged to so that we could dump the segment header over a period of time to see if the freelists were changing:

SELECT owner, segment_name, segment_type
FROM dba_extents
WHERE file_id = 42
AND 249961 BETWEEN block_id AND block_id + blocks - 1;

OWNER        SEGMENT_NAME SEGMENT_TYPE
------------ ------------ ------------
GALLERY      ITEMS        TABLE

Note: Each time we ran this query with the blocks being waited on, it was always this same table.

Dumping the segment header twice, a few seconds apart, showed that the process freelists were being used as expected, because many of the freelist structures changed between the two dumps:

SQL> SELECT header_file, header_block
FROM dba_segments
WHERE owner = 'GALLERY'
AND segment_name = 'ITEMS';

HEADER_FILE HEADER_BLOCK
----------- ------------
         14            2

SQL> ALTER SYSTEM DUMP DATAFILE 14 BLOCK 2;

-- wait a few seconds

SQL> ALTER SYSTEM DUMP DATAFILE 14 BLOCK 2;

The first dump shows (only selected information shown):

nfl = 23, nfb = 1 typ = 1 nxf = 177 ccnt = 235781012
SEG LST:: flg: USED lhd: 0xd9c2e918 ltl: 0xdf80c500 <- Master Freelist
SEG LST:: flg: USED lhd: 0x09c28b4b ltl: 0x00825c2d <- Process Freelist #1
SEG LST:: flg: USED lhd: 0xe583a1ea ltl: 0xac02742c
SEG LST:: flg: USED lhd: 0x66c37711 ltl: 0x468034f9
SEG LST:: flg: USED lhd: 0x5502bfed ltl: 0x42c2c252
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x1d8081ce ltl: 0x1503c119
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x09c29f25 ltl: 0x09c29f25
SEG LST:: flg: USED lhd: 0xa5822a00 ltl: 0xa5822a00
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0xa0036033 ltl: 0x33038d14
SEG LST:: flg: USED lhd: 0xa1c0d61d ltl: 0x8702d4dc
SEG LST:: flg: USED lhd: 0x9483ab8e ltl: 0x0f83baec
SEG LST:: flg: USED lhd: 0x6ac071bb ltl: 0x61c23e99
SEG LST:: flg: USED lhd: 0xd401cd2c ltl: 0xaa02eabb
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x97432a9a ltl: 0x84c3cced
SEG LST:: flg: USED lhd: 0xa88062fb ltl: 0x9483911c
SEG LST:: flg: USED lhd: 0x79436ef5 ltl: 0x790102af
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x8000b9d1 ltl: 0x1a42881d
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x1cc20fc7 ltl: 0x19005a17 <- Process Freelist #23
XCT LST:: flg: UNUSED lhd: 0x00000000 ltl: 0x00000000 xid:0x0000.000.00000000
... <- Transaction freelists populated when a transaction frees more blocks
XCT LST:: flg: USED lhd: 0x0d03d289 ltl: 0x0d03d289 xid:0x0001.00e.00870924
XCT LST:: flg: USED lhd: 0x2900d4fd ltl: 0x2900d4fd xid:0x0004.05e.006b4288
XCT LST:: flg: USED lhd: 0x1d42e271 ltl: 0x1d42e271 xid:0x0002.052.0086f6c3
XCT LST:: flg: USED lhd: 0xda41e80c ltl: 0xda425b83 xid:0x0004.018.006b42f4
End dump data blocks tsn: 4 file#: 14 minblk 2 maxblk 2

The second dump showed:

nfl = 23, nfb = 1 typ = 1 nxf = 177 ccnt = 235781120
SEG LST:: flg: USED lhd: 0xd60242c5 ltl: 0xdf80c500 <- MFL header changed
SEG LST:: flg: USED lhd: 0xde413a87 ltl: 0xd103aa92 <- PFL #1 changed
SEG LST:: flg: USED lhd: 0xe583a1ea ltl: 0xac02742c
SEG LST:: flg: USED lhd: 0x66c37711 ltl: 0x468034f9
SEG LST:: flg: USED lhd: 0x5502bfed ltl: 0x42c2c252
SEG LST:: flg: USED lhd: 0xdac13e6b ltl: 0xdac13e6b <- PFL #5 changed
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000 <- PFL #6 changed
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000 <- PFL #8 changed
SEG LST:: flg: USED lhd: 0x3603b1a1 ltl: 0x3603b1a1 <- PFL #9 changed
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000 <- PFL #11 changed
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000 <- PFL #12 changed
SEG LST:: flg: USED lhd: 0x29c010c8 ltl: 0x2543d499 <- PFL #13 changed
SEG LST:: flg: USED lhd: 0xb3c27294 ltl: 0xb3c27294 <- PFL #14 changed
SEG LST:: flg: USED lhd: 0xd401cd2c ltl: 0xaa02eabb
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0x94825278 ltl: 0x4103444f <- PFL #17 changed
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000 <- PFL #18 changed
SEG LST:: flg: USED lhd: 0x2242077d ltl: 0xd8000f08 <- PFL #19 changed
SEG LST:: flg: USED lhd: 0x00000000 ltl: 0x00000000
SEG LST:: flg: USED lhd: 0xdbc351d6 ltl: 0x1a42881d <- PFL #21 changed
SEG LST:: flg: USED lhd: 0x9b00c830 ltl: 0x8d009d0e <- PFL #22 changed
SEG LST:: flg: USED lhd: 0x1cc20fc7 ltl: 0x19005a17
XCT LST:: flg: UNUSED lhd: 0x00000000 ltl: 0x00000000 xid:0x0000.000.00000000
...
XCT LST:: flg: USED lhd: 0x0d03d289 ltl: 0x0d03d289 xid:0x0001.00e.00870924
XCT LST:: flg: USED lhd: 0x2900d4fd ltl: 0x2900d4fd xid:0x0004.05e.006b4288
XCT LST:: flg: USED lhd: 0x1d42e271 ltl: 0x1d42e271 xid:0x0002.052.0086f6c3
XCT LST:: flg: USED lhd: 0xda41e80c ltl: 0xda425b83 xid:0x0004.018.006b42f4
End dump data blocks tsn: 4 file#: 14 minblk 2 maxblk 2
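The two header dumps can also be compared mechanically rather than by eye. The Python sketch below is an illustration added for this purpose and was not part of the original diagnosis; the helper names parse_lists and changed_lists are hypothetical. The head (lhd) and tail (ltl) values of the 24 SEG LST entries are transcribed from the dumps, with index 0 being the master freelist and indexes 1 to 23 the process freelists.

```python
def parse_lists(text):
    """Turn 'lhd ltl' hex pairs, one pair per line, into a list of int tuples."""
    return [tuple(int(value, 16) for value in line.split())
            for line in text.strip().splitlines()]


DUMP_1 = parse_lists("""
d9c2e918 df80c500
09c28b4b 00825c2d
e583a1ea ac02742c
66c37711 468034f9
5502bfed 42c2c252
00000000 00000000
1d8081ce 1503c119
00000000 00000000
09c29f25 09c29f25
a5822a00 a5822a00
00000000 00000000
a0036033 33038d14
a1c0d61d 8702d4dc
9483ab8e 0f83baec
6ac071bb 61c23e99
d401cd2c aa02eabb
00000000 00000000
97432a9a 84c3cced
a88062fb 9483911c
79436ef5 790102af
00000000 00000000
8000b9d1 1a42881d
00000000 00000000
1cc20fc7 19005a17
""")

DUMP_2 = parse_lists("""
d60242c5 df80c500
de413a87 d103aa92
e583a1ea ac02742c
66c37711 468034f9
5502bfed 42c2c252
dac13e6b dac13e6b
00000000 00000000
00000000 00000000
00000000 00000000
3603b1a1 3603b1a1
00000000 00000000
00000000 00000000
00000000 00000000
29c010c8 2543d499
b3c27294 b3c27294
d401cd2c aa02eabb
00000000 00000000
94825278 4103444f
00000000 00000000
2242077d d8000f08
00000000 00000000
dbc351d6 1a42881d
9b00c830 8d009d0e
1cc20fc7 19005a17
""")


def changed_lists(before, after):
    """Return the indexes whose head or tail pointer differs between two dumps."""
    return [i for i, (old, new) in enumerate(zip(before, after)) if old != new]


changed = changed_lists(DUMP_1, DUMP_2)
changed_pfls = [i for i in changed if i != 0]
print("master freelist changed:", 0 in changed)
print("process freelists changed:", changed_pfls)
print("lists changed in total:", len(changed))
```

On the values shown, the master freelist head changes while its tail does not, and the changed process freelists are exactly those annotated in the second dump.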

The transaction freelists (TFL) are emptied when a process cannot find any free blocks on the MFL (as described earlier). The fact that the TFLs have not been emptied implies that either the MFL has always managed to supply enough blocks for the requesting processes or all transactions are still uncommitted (which seems unlikely).

The second segment header dump shows that 15 of the 24 freelists have changed: 14 of the 23 process freelists plus the master freelist. The tail of the master freelist has not changed but the head has, which indicates the master freelist has enough free blocks on it to satisfy all searches within the monitored time period. The fact that many of the freelists are changing indicates the process freelist assignment is working correctly. What this data doesn't prove is whether the searching mechanism is working correctly, and it certainly doesn't highlight any cause of the buffer busy waits issue.

In order to find out more about what was happening with several of the waiting sessions during the free space search, we set a few diagnostic events to gather tracing information. The events that were used:

Event  Level  Reason
10320  1      Trace which process freelist and block is found during search
10022  1      Trace the getting of a block to be used after free space search
10085  1      Trace when blocks moved from TFL to MFL
10082  1      Trace part of free space search
10080  1      Trace changing of freelist (removing blocks from list)
10046  12     Trace SQL statements with wait and bind data

Note: It is only recommended to set the freelist tracing events under advice from Oracle Support. These events can produce a large amount of trace data, so they should be set only for short periods of time.

The customer was instructed to set the events for three waiting sessions when they saw a high number of buffer busy waits with the 120 reason code. After tracing for 5 minutes, all the events were turned off. The PL/SQL code used to enable and disable the events is listed below.

create table tracing(sid number, serial# number, event number) tablespace users; -- change tablespace if different

create or replace procedure trace_freelists(what IN NUMBER, onoff IN number) as
  -- retrieve top 3 sessions ordered by time_waited for buffer busy waits
  -- (note: ROWNUM is applied before ORDER BY, so this picks the first three
  -- matching sessions and then sorts them)
  cursor c1 is
    SELECT s.sid, s.serial#
    FROM v$session s, v$session_event se
    WHERE s.sid = se.sid
    and se.event = 'buffer busy waits'
    and s.sid>8
    and s.server = 'DEDICATED'
    and rownum<4
    ORDER BY se.time_waited desc;

  -- sessions already recorded in the tracing table for a given event
  cursor c2 (wevent NUMBER) is
    select sid,serial# from tracing where event=wevent;
BEGIN
  if what=10320 -- Freelist tracing
  then
    if onoff=1 -- Turn freelist tracing ON
    then
      for rec in c1 loop
        dbms_system.set_ev(rec.sid,rec.serial#,10320,1,'');
        insert into tracing values (rec.sid, rec.serial#, 10320);
        commit;
      end loop;
    elsif onoff=0 -- Turn freelist tracing OFF
    then
      for rec in c2(10320) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10320,0,'');
        delete from tracing where sid=rec.sid and event=10320;
        commit;
      end loop;
    end if;
  elsif what=10022 -- Freelist 10022 tracing
  then
    if onoff=1 -- Turn freelist tracing ON
    then
      for rec in c2(10320) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10022,1,'');
        insert into tracing values (rec.sid, rec.serial#, 10022);
        commit;
      end loop;
    elsif onoff=0 -- Turn freelist tracing OFF
    then
      for rec in c2(10022) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10022,0,'');
        delete from tracing where sid=rec.sid and event=10022;
        commit;
      end loop;
    end if;
  elsif what=10085 -- Freelist 10085 tracing
  then
    if onoff=1 -- Turn freelist tracing ON
    then
      for rec in c2(10320) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10085,1,'');
        insert into tracing values (rec.sid, rec.serial#, 10085);
        commit;
      end loop;
    elsif onoff=0 -- Turn freelist tracing OFF
    then
      for rec in c2(10085) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10085,0,'');
        delete from tracing where sid=rec.sid and event=10085;
        commit;
      end loop;
    end if;
  elsif what=10080 -- Freelist 10080 tracing
  then
    if onoff=1 -- Turn freelist tracing ON
    then
      for rec in c2(10320) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10080,1,'');
        insert into tracing values (rec.sid, rec.serial#, 10080);
        commit;
      end loop;
    elsif onoff=0 -- Turn freelist tracing OFF
    then
      for rec in c2(10080) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10080,0,'');
        delete from tracing where sid=rec.sid and event=10080;
        commit;
      end loop;
    end if;
  elsif what=10082 -- Freelist 10082 tracing
  then
    if onoff=1 -- Turn freelist tracing ON
    then
      for rec in c2(10320) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10082,1,'');
        insert into tracing values (rec.sid, rec.serial#, 10082);
        commit;
      end loop;
    elsif onoff=0 -- Turn freelist tracing OFF
    then
      for rec in c2(10082) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10082,0,'');
        delete from tracing where sid=rec.sid and event=10082;
        commit;
      end loop;
    end if;
  elsif what=10046 -- SQL Trace - SHOULD BE TURNED ON AFTER 10320 TRACE
  then
    if onoff=1 -- Turn SQL trace ON
    then
      for rec in c2(10320) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10046,12,'');
        insert into tracing values (rec.sid, rec.serial#, 10046);
        commit;
      end loop;
    elsif onoff=0 -- Turn SQL trace OFF
    then
      for rec in c2(10046) loop
        dbms_system.set_ev(rec.sid,rec.serial#,10046,0,'');
        delete from tracing where sid=rec.sid and event=10046;
        commit;
      end loop;
    end if;
  end if;
end trace_freelists;
/

To enable the events:

exec trace_freelists(10320,1);
exec dbms_lock.sleep(10); -- need to wait 10secs for next event to work
exec trace_freelists(10022,1);
exec dbms_lock.sleep(10);
exec trace_freelists(10085,1);
exec dbms_lock.sleep(10);
exec trace_freelists(10080,1);
exec dbms_lock.sleep(10);
exec trace_freelists(10082,1);
exec dbms_lock.sleep(10);
exec trace_freelists(10046,1);

Wait 5 minutes, then turn off each event in the following order:

exec trace_freelists(10046,0);
exec dbms_lock.sleep(10);
exec trace_freelists(10082,0);
exec dbms_lock.sleep(10);
exec trace_freelists(10080,0);
exec dbms_lock.sleep(10);
exec trace_freelists(10085,0);
exec dbms_lock.sleep(10);
exec trace_freelists(10022,0);
exec dbms_lock.sleep(10);
exec trace_freelists(10320,0);

The trace files generated confirmed that the sessions were assigned different process freelists and used different blocks for some of the inserts:

Session #1:
*** 2005-07-13 05:56:24.594
KTSGSP: flag = 0xa7, seg free list = 7, tsn = 4 block = 0xb1c3721d
*** 2005-07-13 05:56:24.655
KTSGSP: flag = 0x24, seg free list = 7, tsn = 4 block = 0xca030308

Session #2:
*** 2005-07-13 05:56:11.672
KTSGSP: flag = 0xa7, seg free list = 9, tsn = 4 block = 0xbf02e415
*** 2005-07-13 05:56:11.730
KTSGSP: flag = 0xa7, seg free list = 9, tsn = 4 block = 0xb4c265e7

Session #3:
*** 2005-07-13 05:56:05.362
KTSGSP: flag = 0xa7, seg free list = 20, tsn = 4 block = 0xabc191fa
*** 2005-07-13 05:56:05.369
KTSGSP: flag = 0xa7, seg free list = 20, tsn = 4 block = 0xabc17d9b

But it also showed times when the sessions would be checking the same blocks for use:

Session #1:
KDTGSP: seg:0x1c000002 wlk:0 rls:0 options:KTS_EXCHANGE KTS_UNLINK pdba:0xbf40185d
WAIT #3: nam='buffer busy waits' ela= 2 p1=438 p2=91992 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=429 p2=117294 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=429 p2=117294 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=429 p2=117294 p3=120
... Waits on many different blocks with an occasional db file seq read
WAIT #3: nam='buffer busy waits' ela= 1 p1=839 p2=53567 p3=120 <- Starts waiting on same blocks here
WAIT #3: nam='buffer busy waits' ela= 0 p1=839 p2=53561 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=832 p2=187797 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=832 p2=187797 p3=220
WAIT #3: nam='buffer busy waits' ela= 1 p1=832 p2=187781 p3=120 <- waiting on session #2 to read the block
WAIT #3: nam='buffer busy waits' ela= 0 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 2 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=758 p2=29094 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=758 p2=29094 p3=120
... This continues for at least another 30-40 blocks

Session #2:
KDTGSP: seg:0x1c000002 wlk:0 rls:0 options:KTS_EXCHANGE KTS_UNLINK pdba:0x35429ced
WAIT #3: nam='buffer busy waits' ela= 1 p1=839 p2=53567 p3=120 <- Goes straight to the MFL here as looking for common blocks
WAIT #3: nam='buffer busy waits' ela= 0 p1=839 p2=53561 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=839 p2=53561 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=839 p2=53561 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=832 p2=187797 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=832 p2=187797 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=832 p2=187797 p3=120
WAIT #3: nam='db file sequential read' ela= 1 p1=832 p2=187781 p3=1 <- first process to need block so reads in
WAIT #3: nam='buffer busy waits' ela= 0 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 2 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=758 p2=29094 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=758 p2=29094 p3=120
... This continues for at least another 30-40 blocks
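The overlap between the two session traces can be extracted mechanically. The Python sketch below is an illustration only and was not part of the original diagnosis; the names WAIT_RE and blocks_touched are hypothetical, and the sample text holds one WAIT line per distinct block copied from the excerpts above. It collects the (file#, block#) pairs each session waited on or read and intersects them.

```python
import re

# Matches the WAIT lines written to the trace file by event 10046.
WAIT_RE = re.compile(r"nam='([^']+)' ela=\s*(\d+) p1=(\d+) p2=(\d+) p3=(\d+)")
BLOCK_EVENTS = {"buffer busy waits", "db file sequential read"}


def blocks_touched(trace_text):
    """Return the set of (file#, block#) pairs a session waited on or read."""
    return {
        (int(p1), int(p2))
        for name, _ela, p1, p2, _p3 in WAIT_RE.findall(trace_text)
        if name in BLOCK_EVENTS
    }


SESSION_1 = """
WAIT #3: nam='buffer busy waits' ela= 2 p1=438 p2=91992 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=429 p2=117294 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=839 p2=53567 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=839 p2=53561 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=832 p2=187797 p3=120
WAIT #3: nam='buffer busy waits' ela= 1 p1=832 p2=187781 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=758 p2=29094 p3=120
"""

SESSION_2 = """
WAIT #3: nam='buffer busy waits' ela= 1 p1=839 p2=53567 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=839 p2=53561 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=832 p2=187797 p3=120
WAIT #3: nam='db file sequential read' ela= 1 p1=832 p2=187781 p3=1
WAIT #3: nam='buffer busy waits' ela= 0 p1=764 p2=65082 p3=120
WAIT #3: nam='buffer busy waits' ela= 0 p1=758 p2=29094 p3=120
"""

shared = blocks_touched(SESSION_1) & blocks_touched(SESSION_2)
for file_no, block_no in sorted(shared):
    print(f"file {file_no} block {block_no}")
```

Every block in the session #2 excerpt also appears in session #1, even though the two sessions were assigned process freelists 7 and 9.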

The second trace output shows that session #1 first starts to traverse a set of blocks that another session (it is not known which) is currently reading from disk. We know this because it is waiting on buffer busy waits with a p3 (reason code) value of 120. Then it starts to traverse blocks that session #2 is also trying to read. This indicates that both sessions are reading from the master freelist at the same time. We know this because earlier in the trace file we saw that they are assigned to different process freelists. Because all processes move on to searching the master freelist if no suitable blocks are found on their process freelist, this highlights a possible limitation that can occur when several process freelists are empty, resulting in a number of processes trying to search the same master freelist. The severity of this seemed unexpected, so a bug (4523986) was opened to get some input from Oracle Development. Development came back with the following thoughts:

We attempt to move a section of the MFL in one go - in this case 5 blocks - and obtain the necessary info with shared locks. We then attempt the move, taking an exclusive lock. If the sublist we've identified to move has been changed, we start over.

The crux of the problem is that all sessions are reading the same five blocks at the same time (and only one of them will eventually succeed in moving them to their PFL, which means the problem repeats on another five-block list for the rest of them). Your I/O issue is probably making this worse as presumably the cache is running slowly; the five-block lists are read in current shared mode.

With this input from development and the diagnostic data we have analyzed, we can state the cause as allowing concurrent processes to traverse and manipulate the master freelist when looking for space. If the assigned process freelists are empty of suitable blocks, the processes move to searching the master freelist. By default only one master freelist is created, and it is controlled by the segment header, which becomes the new serialized point of contention.

Conclusion and Learnings

Now that we had confirmed a problem with serialization of searching and moving blocks from the master freelist to a process freelist by concurrent sessions, we needed to find a solution to relieve the buffer busy waits.

In the bug, development had first suggested increasing the _bump_highwater_mark_count initialization parameter. By default this parameter is set to 5, and it works by bumping the HWM by 5 multiplied by the number of PFLs+1 (24 in this case). When a process does not find any available blocks on the assigned PFL, it attempts to move 5 blocks from the MFL to the PFL. Increasing this parameter only increases the jump in HWM movement and not the number of blocks being transferred to a PFL, which still defaults to a maximum of 5 blocks, or fewer than 5 if an extent boundary is reached.
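The arithmetic in this paragraph can be written out as a short sketch. This is illustrative Python based only on the description above, not on Oracle source code; the function names and the sample value of 20 are hypothetical.

```python
MAX_BLOCKS_PER_MOVE = 5  # per the text: at most 5 blocks move from the MFL to a PFL


def hwm_bump_blocks(bump_count: int, process_freelists: int) -> int:
    """Blocks added when the HWM is bumped: bump count x (number of PFLs + 1)."""
    return bump_count * (process_freelists + 1)


def blocks_moved_to_pfl(blocks_left_in_extent: int) -> int:
    """Blocks transferred to a PFL in one move; unaffected by the bump count."""
    return min(MAX_BLOCKS_PER_MOVE, blocks_left_in_extent)


print(hwm_bump_blocks(5, 23))   # default setting with 23 process freelists
print(hwm_bump_blocks(20, 23))  # hypothetical larger setting
print(blocks_moved_to_pfl(3))   # only 3 blocks left before the extent boundary
```

With the default of 5 and 23 process freelists the HWM moves by 120 blocks at a time. Raising the parameter makes each bump larger but leaves the five-block transfer, and therefore the contention on it, unchanged.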

I tested an increase of the _bump_highwater_mark_count parameter in a test system using the customer test case. It didn't decrease the buffer busy waits, but it did increase the number of free buffer waits. This is due to bringing more blocks into the buffer cache to be formatted when increasing the HWM by larger amounts.

Development had made some suggestions for solutions to this issue:

The three possible resolutions we had to this problem were:

1. Introducing a local enqueue to serialize the walk of the MFL and acquiring an enqueue whenever the blocks move from MFL to PFL. The problems with this resolution are:
(a) Acquiring an enqueue at this point can be very costly, and can make the system slow.
(b) The customer would need to rebuild the whole system.
(c) Possible deadlocks.

2. Pinning the Segment Header when walking the MFL: slightly costly but could be worked out.

3. Acquiring the block in CR mode, so that the other processes could do some work rather than waiting.

A possible workaround we looked at for this problem was:

Use FREELIST GROUPS (even in single instance this can make a difference). I came across a note stating possible resolutions for high buffer busy waits.

Freelist groups are mapped as follows (in a non-OPS environment, or an OPS environment running a single instance):

Free list group = (Process ID % Number of freelist groups) + 1
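To see how this mapping spreads sessions over master freelists, the Python sketch below counts how many processes end up sharing the busiest freelist group. It is an illustration only: the function names are hypothetical, and it assumes 12 sessions with consecutive Oracle PIDs (10 to 21), which need not hold on a real system.

```python
from collections import Counter


def freelist_group(pid: int, groups: int) -> int:
    """1-based freelist group for a process, per (Process ID % groups) + 1."""
    return (pid % groups) + 1


def busiest_group_size(pids, groups: int) -> int:
    """Largest number of processes that share a single freelist group."""
    counts = Counter(freelist_group(pid, groups) for pid in pids)
    return max(counts.values())


pids = range(10, 22)  # 12 hypothetical sessions with consecutive PIDs
for groups in (1, 7, 17):
    print(f"FREELIST GROUPS = {groups:2d}: "
          f"up to {busiest_group_size(pids, groups)} sessions per master freelist")
```

Under these assumptions a single group puts all 12 sessions on one master freelist, 7 groups leave at most 2 sessions sharing one, and 17 groups give every session its own.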

Making RDBMS kernel code changes for 8.1.7.4 would provide a more comprehensive solution for all tables that have this high concurrency issue, but would also have further implications. For example, it may simply move contention from buffer busy waits to enqueue waits, which might cause a worse bottleneck.

In this customer's case, the suggested workaround of using freelist groups makes perfect sense. When an object is created it defaults to FREELIST GROUPS 1, where all the freelist information is maintained in the segment header, as in this case. When a value greater than 1 is used, additional datablocks are created after the segment header, each of which stores a master freelist, a number of process freelists, and transaction freelists. The segment header then contains only a single master freelist, the master of all freelists if you like. Freelist groups were originally designed for use with OPS (Oracle Parallel Server), so that each instance is assigned a different freelist group and the processes connecting to that instance do not interfere with free space searching on another instance, which would cause blocks to ping between them. Within a single instance environment, using freelist groups can still provide some benefit because processes are spread across different master freelists and sets of process freelists. Increasing the freelist groups from 1 reduces the contention on the single master freelist, as processes now search the master freelist in their assigned freelist group block. A process will only search the MFL in the segment header if no space can be found in its freelist group block.

I decided to test using freelist groups in my test environment; the results are listed below:

FREELIST GROUPS = 1
Avg time for each session to complete inserts: 2:57mins

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                          Wait  % Total
Event                          Waits  Time (cs)  Wt Time
------------------------- ---------- ---------- --------
buffer busy waits            113,465     60,129    52.25
free buffer waits                599     40,244    34.97
db file sequential read       21,781      4,455     3.87
rdbms ipc reply                  106      3,543     3.08
latch free                     1,659      3,038     2.64

FREELIST GROUPS = 7
Avg time for each session to complete inserts: 2:20mins

                                           Wait  % Total
Event                          Waits  Time (cs)  Wt Time
------------------------- ---------- ---------- --------
free buffer waits              1,988     39,364    45.20
db file sequential read       23,666     17,588    20.20
buffer busy waits              5,491      9,534    10.95
rdbms ipc reply                  126      9,444    10.84
latch free                       793      5,793     6.65

FREELIST GROUPS = 17
Avg time for each session to complete inserts: 2:31mins

                                           Wait  % Total
Event                          Waits  Time (cs)  Wt Time
------------------------- ---------- ---------- --------
free buffer waits                462     41,452    46.90
db file sequential read       25,048     24,291    27.48
latch free                     2,045      9,825    11.12
rdbms ipc reply                  124      6,072     6.87
log file parallel write          341      2,937     3.32

The test involved 12 concurrent sessions inserting 10,000 rows into a table with the same structure as the one shown at the beginning of this case study.

It is clear from my test results that using multiple freelist groups can increase performance significantly by decreasing the number of buffer busy waits. Once the buffer busy waits decreased and freelist contention was no longer an issue, especially with 17 freelist groups, it became apparent that my test system has an I/O issue, as free buffer waits and db file sequential read waits now dominate the wait time. This could also be due to an incorrectly sized buffer cache, so further investigation would be required to remove this new area of contention.
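The size of the improvement can be quantified from the reported figures. The Python sketch below is illustrative only; the helper names are hypothetical and the inputs are the average elapsed times and buffer busy wait counts from the three test runs. The 17-group run did not list buffer busy waits in its top five events, so no count is available for it.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' elapsed time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def pct_reduction(before: float, after: float) -> float:
    """Percentage reduction from before to after."""
    return (before - after) / before * 100


# FREELIST GROUPS -> (avg elapsed time, buffer busy waits or None if not reported)
RESULTS = {
    1: ("2:57", 113_465),
    7: ("2:20", 5_491),
    17: ("2:31", None),
}

base_time, base_waits = RESULTS[1]
for groups in (7, 17):
    elapsed, waits = RESULTS[groups]
    line = (f"FREELIST GROUPS = {groups}: elapsed time down "
            f"{pct_reduction(to_seconds(base_time), to_seconds(elapsed)):.1f}%")
    if waits is not None:
        line += f", buffer busy waits down {pct_reduction(base_waits, waits):.1f}%"
    print(line)
```

With 7 freelist groups the buffer busy waits fall by about 95% while the elapsed time falls by about 21%, which is consistent with the remaining time being spent on the I/O-related waits described above.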

NOTE: Freelist groups cannot be dynamically added to objects. The object must be recreated with a new FREELIST GROUPS setting and then repopulated with data.

In conclusion, when a high number of sessions are waiting on buffer busy waits for the same datablock, the datablock being waited on is continually changing, and the table has a number of process freelists defined, it is possible you are running into the serialization problem with searching and moving blocks from the master freelist to the assigned process freelist. The diagnostic steps outlined in this case study describe an approach to determining the cause of such waits. A workaround is also provided: rebuilding the table with multiple freelist groups, which was demonstrated to relieve the buffer busy waits. It is important to note that removing one area of contention often highlights a different area that needs further optimization.

References

Note 157250.1 Freelist Management with Oracle 8i, Stephan Haisley, Center of Expertise
