Beruflich Dokumente
Kultur Dokumente
1
Session Objectives
Define wait events
Discuss how to use the wait event interface
Walk through four examples of how wait
event information was used to troubleshoot
production problems
2
“Wait Event” Defined
We say an Oracle process is “busy” when it wants
CPU time.
When an Oracle process is not busy, it is waiting
for something to happen.
There are only so many things an Oracle process
could be waiting for, and the kernel developers at
Oracle have attached names to them all.
These are wait events.
3
Wait Event Examples
An Oracle process waiting for the client
application to submit a SQL statement waits
on a “SQL*Net message from client” event.
An Oracle process waiting on another
session to release a row-level lock waits on
an “enqueue” event.
4
Wait Event Interface
Each Oracle process identifies the event it
is waiting for each time a wait begins.
The instance collects cumulative statistics
about events waited upon since instance
startup.
You can access this information through v$
views and tracing events.
These make up the wait event interface.
5
Viewing Wait Events
http://dbrx.dbspecialists.com/pls/dbrx/vie
w_report
6
Why Wait Event Information
Is Useful
Wait events touch all areas of Oracle—from
I/O to latches to parallelism to network
traffic.
Wait event data can be remarkably detailed.
“Waited 0.02 seconds to read 8 blocks from
file 42 starting at block 18042.”
Analyzing wait event data will yield a path
toward a solution for almost any problem.
7
Important Wait Events
There were 102 wait events in Oracle 7.3.
There are 217 wait events in Oracle 8i Release 3
(8.1.7).
Most come up infrequently or are rarely significant
for troubleshooting performance.
Different wait events are significant in different
environments, depending on which Oracle
features have been deployed.
8
A Few Common Wait Events
enqueue log file sequential read
library cache pin log file parallel write
library cache load lock log file sync
latch free db file scattered read
buffer busy waits db file sequential read
control file sequential read db file parallel read
control file parallel write db file parallel write
log buffer space direct path read / write
9
Idle Events
Sometimes an Oracle process is not busy
simply because it has nothing to do.
In this case the process will be waiting on
an event that we call an “idle event.”
Idle events are usually not interesting from
the tuning and troubleshooting perspective.
10
Common Idle Events
client message parallel query dequeue
dispatcher timer rdbms ipc message
Null event SQL*Net message from client
smon timer SQL*Net message to client
PX Idle Wait SQL*Net more data from client
pipe get wakeup time manager
PL/SQL lock timer virtual circuit status
pmon timer lock manager wait for remote
message
11
Accounted For By The Wait
Event Interface
Time spent waiting for something to do (idle
events)
Time spent waiting for something to happen
so that work may continue (non-idle events)
12
Not Accounted For By The
Wait Event Interface
Time spent using a CPU
Time spent waiting for a CPU
Time spent waiting for virtual memory to be
swapped back into physical memory
13
Timed Statistics
The wait event interface will not collect timing
information unless timed statistics are enabled.
Enable timed statistics dynamically at the instance
or session level:
ALTER SYSTEM SET timed_statistics = TRUE;
ALTER SESSION SET timed_statistics = TRUE;
15
The v$system_event View
Shows one row for each wait event name, along with
cumulative statistics since instance startup. Wait
events that have not occurred at least once since
instance startup do not appear in this view.
Column Name Data Type
-------------------------- ------------
EVENT VARCHAR2(64)
TOTAL_WAITS NUMBER
TOTAL_TIMEOUTS NUMBER
TIME_WAITED NUMBER
AVERAGE_WAIT NUMBER
16
Columns In v$system_event
EVENT: The name of a wait event
TOTAL_WAITS: Total number of times a process
has waited for this event since instance startup
TOTAL_TIMEOUTS: Total number of timeouts while
waiting for this event since instance startup
TIME_WAITED: Total time waited for this wait event
by all processes since instance startup (in
centiseconds)
AVERAGE_WAIT: The average length of a wait for
this event since instance startup (in centiseconds)
17
Sample v$system_event
Query
SQL> SELECT event, time_waited
2 FROM v$system_event
3 WHERE event IN ('smon timer',
4 'SQL*Net message from client',
5 'db file sequential read',
6 'log file parallel write');
EVENT TIME_WAITED
--------------------------------- -----------
log file parallel write 159692
db file sequential read 28657
smon timer 130673837
SQL*Net message from client 16528989
18
The v$session_event View
Shows one row for each wait event name within
each session, along with cumulative statistics since
session start.
Column Name Data Type
-------------------------- ------------
SID NUMBER
EVENT VARCHAR2(64)
TOTAL_WAITS NUMBER
TOTAL_TIMEOUTS NUMBER
TIME_WAITED NUMBER
AVERAGE_WAIT NUMBER
MAX_WAIT NUMBER
19
Columns In v$session_event
SID: The ID of a session (from v$session)
EVENT: The name of a wait event
TOTAL_WAITS: Total number of times this session has waited for
this event
TOTAL_TIMEOUTS: Total number of timeouts while this session
has waited for this event
TIME_WAITED: Total time waited for this event by this session (in
centiseconds)
AVERAGE_WAIT: The average length of a wait for this event in
this session (in centiseconds)
MAX_WAIT: The maximum amount of time the session had to
wait for this event (in centiseconds)
20
Sample v$session_event
Query
SQL> SELECT event, total_waits, time_waited
2 FROM v$session_event
3 WHERE SID =
4 (SELECT sid FROM v$session
5 WHERE audsid =
6 USERENV ('sessionid') );
21
The v$event_name View
Shows one row for each wait event name known
to the Oracle kernel, along with names of up to
three parameters associated with the wait event.
Column Name Data Type
-------------------------- ------------
EVENT# NUMBER
NAME VARCHAR2(64)
PARAMETER1 VARCHAR2(64)
PARAMETER2 VARCHAR2(64)
PARAMETER3 VARCHAR2(64)
22
Columns In v$event_name
EVENT#: An internal ID
NAME: The name of a wait event
PARAMETERn: The name of a parameter
associated with the wait event
23
Sample v$event_name
Query
SQL> SELECT *
2 FROM v$event_name
3 WHERE name = 'db file scattered read';
EVENT# NAME
---------- ------------------------------
PARAMETER1 PARAMETER2 PARAMETER3
------------- ------------- -------------
95 db file scattered read
file# block# blocks
24
The v$session_wait View
Shows one row for each session, providing detailed information
about the current or most recent wait event.
Column Name Data Type
-------------------------- ------------
SID NUMBER
SEQ# NUMBER
EVENT VARCHAR2(64)
P1TEXT VARCHAR2(64)
P1 NUMBER
P1RAW RAW(4)
P2TEXT VARCHAR2(64)
P2 NUMBER
P2RAW RAW(4)
P3TEXT VARCHAR2(64)
P3 NUMBER
P3RAW RAW(4)
WAIT_TIME NUMBER
SECONDS_IN_WAIT NUMBER
STATE VARCHAR2(19)
25
Columns In v$session_wait
SID: The ID of a session
SEQ#: A number that increments by one on each new
wait
STATE: An indicator of the session status:
– ‘WAITING’: The session is currently waiting, and details of
the wait event are provided.
– ‘WAITED KNOWN TIME’: The session is not waiting, but
information about the most recent wait is provided.
– ‘WAITED SHORT TIME’ or ‘WAITED UNKNOWN TIME’:
The session is not waiting, but partial information about the
most recent wait is provided.
26
Columns In v$session_wait
(continued)
EVENT: The name of a wait event
PnTEXT: The name of a parameter associated with the
wait event
Pn: The value of the parameter in decimal form
PnRAW: The value of the parameter in raw form
WAIT_TIME: Length of most recent wait (in centiseconds)
if STATE = ‘WAITED KNOWN TIME’
SECONDS_IN_WAIT: How long current wait has been so
far if STATE = ‘WAITING’
27
Sample v$session_wait
Query
SQL> SELECT * FROM v$session_wait WHERE sid = 16;
28
System Event 10046
Setting event 10046 enables SQL trace, and
can optionally include wait event information
and bind variable data in trace files as well.
Methods for setting system events:
“event” instance parameter
dbms_system.set_ev
oradebug
ALTER SESSION SET events
29
System Event 10046
Settings
ALTER SESSION SET events
'10046 trace name context forever, level N’;
Value of N Effect
1 Enables ordinary SQL trace
4 Enables SQL trace with bind variable
values included in trace file
8 Enables SQL trace with wait event
information included in trace file
12 Equivalent of level 4 and level 8 together
30
Sample Trace Output
=====================
PARSING IN CURSOR #1 len=80 dep=0 uid=502 oct=3 lid=502
tim=2293771931 hv=2293373707 ad='511dca20'
SELECT /*+ FULL */ SUM (LENGTH(notes))
FROM customer_calls
WHERE status = :x
END OF STMT
PARSE #1:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=0,tim=2293771931
BINDS #1:
bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0
size=24 offset=0
bfp=09717724 bln=22 avl=02 flg=05
value=43
EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=2293771931
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=675562835 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=931 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=939 p3=8
WAIT #1: nam='db file sequential read' ela= 0 p1=17 p2=947 p3=1
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1657 p3=8
31
Using Wait Event
Information
Four examples of how wait event information
was used to diagnose production problems
32
Example #1: A Slow Web
Page
A dynamic web page took several seconds
to come up. Developers tracked the
bottleneck down to one query. The execution
plan showed that the query was using an
index, so the developers thought there might
be a “database problem.”
33
The Slow Query
SELECT COUNT (*)
FROM customer_inquiries
WHERE status_code = :b1
AND status_date > :b2;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'CUSTOMER_INQUIRIES'
3 2 INDEX (RANGE SCAN) OF 'CUSTOMER_INQUIRIES_N2' (NON-UNIQUE)
35
The Path To Problem
Resolution
What we learned from wait event information:
– The query performed a large number of index
lookups.
– 1.40 seconds were spent waiting on the index
lookups, plus any CPU overhead.
Areas to research further:
– Was the database server CPU starved?
– Was the index lookup selective?
– Idea: Modify the query to use a full table scan instead
of the index.
36
Research Results
The database server was CPU starved. The run
queue length often exceeded twice the number of
CPUs on the server.
Using just the status_code column of the
CUSTOMER_INQUIRIES_N2 index made for a
very unselective index lookup. Over 90% of the
rows in the table had a status code of 12.
A full table scan against CUSTOMER_INQUIRIES
appeared to run faster than using the index.
37
Problem Resolution
A query against v$session_event after the modified
query ran in isolation yielded:
TOTAL TIME
EVENT WAITS WAITED
------------------------------ ----- ------
db file scattered read 460 13
db file sequential read 3 1
latch free 1 0
SQL*Net message to client 10 0
SQL*Net message from client 9 18317
38
Analyzing The Results
The rule of thumb that a full table scan is better
than a scan of an unselective index is true.
I/O systems can perform a few multi-block I/O
requests much faster than many single-block I/O
requests.
Physical reads require a small amount of CPU
time. Lack of available CPU can make an I/O
intensive statement run even slower, although the
wait event interface will not show this.
39
Example #2: Slow Batch
Processing
An additional data feed program was added
to the nightly batch processing job queue,
and the overnight processing no longer
finished before the morning deadline. More
CPUs were added to the database server,
but this did not improve processing speed
significantly.
40
Summarizing Wait Events
During A Period Of Time
v$system_event shows wait event totals since
instance startup.
v$session_event shows wait event totals since
the beginning of a session.
You can capture view contents at different points
in time and compute the delta in order to get wait
event information for a specific period of time.
Statspack and many third-party tools can do this
for you.
41
Simple Script To See Wait
Events During A 30 Second
Time Period
CREATE TABLE previous_events AS
SELECT SYSDATE timestamp, v$system_event.*
FROM v$system_event;
SELECT A.event,
A.total_waits
- NVL (B.total_waits, 0) total_waits,
A.time_waited
- NVL (B.time_waited, 0) time_waited
FROM v$system_event A, previous_events B
WHERE B.event (+) = A.event
ORDER BY A.event;
42
Wait Events During 30
Seconds Of Batch
EVENT
Processing TOTAL_WAITS TIME_WAITED
------------------------------ ----------- -----------
LGWR wait for redo copy 115 41
buffer busy waits 53 26
control file parallel write 45 44
db file scattered read 932 107
db file sequential read 76089 6726
direct path read 211 19
direct path write 212 15
enqueue 37 5646
free buffer waits 11 711
latch free 52 44
log buffer space 2 8
log file parallel write 4388 1047
log file sequential read 153 91
log file single write 2 6
log file switch completion 2 24
write complete waits 6 517
43
The Path To Problem
Resolution
What we learned from wait event information:
– There appeared to be significant lock contention.
– In 30 seconds of elapsed time, sessions spent over 56
seconds waiting for locks.
Areas to research further:
– What type of locks are being waited on? Row-level
locks? Table-level locks? Others?
– If the locks are table-level or row-level, then which
database tables are experiencing contention? Which
SQL statements are causing the contention?
44
Tracing Waits In A Session
The following commands were used to enable wait
event tracing in the process with Oracle PID 13:
45
Trace File Contents
EXEC #5:c=0,e=0,p=3,cr=2,cu=1,mis=0,r=1,dep=1,og=4,tim=2313020980
XCTEND rlbk=0, rd_only=0
WAIT #1: nam='write complete waits' ela= 11 p1=3 p2=2 p3=0
WAIT #4: nam='db file sequential read' ela= 4 p1=10 p2=12815 p3=1
WAIT #4: nam='db file sequential read' ela= 1 p1=10 p2=12865 p3=1
WAIT #4: nam='db file sequential read' ela= 5 p1=3 p2=858 p3=1
=====================
PARSING IN CURSOR #4 len=65 dep=1 uid=502 oct=6 lid=502
tim=2313021001 hv=417623354 ad='55855844'
UPDATE CUSTOMER_CALLS SET ATTR_3 = :b1 WHERE CUSTOMER_CALL_ID=:b2
END OF STMT
EXEC #4:c=1,e=10,p=3,cr=2,cu=3,mis=0,r=1,dep=1,og=4,tim=2313021001
WAIT #4: nam='db file sequential read' ela= 0 p1=10 p2=5789 p3=1
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=196705 p3=6744
WAIT #4: nam='enqueue' ela= 307 p1=1415053318 p2=196705 p3=6744
WAIT #4: nam='enqueue' ela= 53 p1=1415053318 p2=196705 p3=6744
WAIT #4: nam='db file sequential read' ela= 0 p1=10 p2=586 p3=1
WAIT #4: nam='db file sequential read' ela= 1 p1=3 p2=858 p3=1
EXEC #4:c=0,e=668,p=3,cr=5,cu=3,mis=0,r=1,dep=1,og=4,tim=2313021669
46
Understanding The enqueue
Wait Event
SQL> SELECT parameter1,parameter2,parameter3
2 FROM v$event_name
3 WHERE name = 'enqueue';
CH LOCK_MODE
-- ----------
TX 6
47
Analyzing The Results
Contention for exclusive locks on rows in the
customer_calls table was responsible for
substantial delays in processing.
Looking at the row_wait_obj# and
row_wait_row# columns in v$session would
have identified the exact rows undergoing
contention.
48
Problem Resolution
Multiple programs were attempting to
update the same rows in tables at the same
time. Contention could be reduced by doing
one or more of the following:
– Running conflicting programs separately
– Reducing lock scope
– Reducing lock duration
49
Example #3: A Slow
Client/Server Application
A client/server application was taking several
seconds to bring up a certain screen. The
delay was occurring during startup before
the user had a chance to kick off a query.
The only thing happening in the form on
startup was some fetching of basic reference
data. All of the SQL had been tuned and was
known to run very quickly.
50
Manipulating
timed_statistics
The timed_statistics parameter can be changed at
any time at the session level with the following
commands:
52
The Path To Problem
Resolution
What we learned from wait event information:
– There were over 18,000 network roundtrips during form startup,
almost exactly two for every logical read.
– The Oracle process spent over 10 seconds waiting for activity from
the client. Since timed statistics were disabled at the end of the
form startup logic, this does not include time spent waiting on the
end user.
Areas to research further:
– How many rows of data does the form read from the database
during the startup phase?
– Does the form really need to fetch all of this data?
– Is the form fetching one row at a time or is it using Oracle’s array
processing interface?
53
Research Results
The form was fetching 9245 rows of
reference data during startup.
All of this data was necessary; none could
be eliminated.
All data was fetched one row at a time.
54
Problem Resolution
The startup logic of the form was modified to fetch 100 rows at a
time. This yielded the following information in v$session_event:
TOTAL TIME
EVENT WAITS WAITED
------------------------------ ----- ------
SQL*Net message to client 200 0
SQL*Net message from client 199 28
55
Analyzing The Results
Fetching rows 100 at a time instead of one at a
time dramatically reduced network roundtrips.
Reducing network roundtrips reduced time spent
waiting on the network.
Fetching rows 100 at a time also significantly
reduced the number of logical reads, and therefore
the amount of CPU time required.
56
Example #4: A Floundering
Database Server
The DBA group discovered that one of the
database servers was completely
overwhelmed. Connecting to the database
took a few seconds, selecting from SYS.dual
took more than a second. Everything on the
system ran very slowly.
57
Longest Waits In
v$system_event
EVENT TIME_WAITED
------------------------------ ----------------
log file sync 326284
write complete waits 402284
control file parallel write 501697
db file scattered read 612671
db file sequential read 2459961
pmon timer 31839833
smon timer 31974216
db file parallel write 1353916234
rdbms ipc message 6579264389
latch free 8161581692
SQL*Net message from client 15517359160
58
The Path To Problem
Resolution
What we learned from wait event information:
– Most of the waits involved idle events or I/O events.
– A large amount of time was spent waiting on latches.
Areas to research further:
– How long has the instance been up?
– Which latches are experiencing contention?
59
Research Results
The instance had been up for about seven days.
The latch contention was in the shared pool and library cache,
as evidenced by a query against v$latch_misses:
PARENT_NAME SUM(LONGHOLD_COUNT)
------------------------------ -------------------
enqueue hash chains 614
enqueues 637
Checkpoint queue latch 790
session allocation 1131
messages 1328
session idle bit 2106
latch wait list 5977
modify parameter values 6242
cache buffers chains 9876
row cache objects 38899
cache buffers lru chain 125352
shared pool 4041451
library cache 4423229 60
Further Research Results
The shared pool was 400 Mb in size.
There were over 36,000 statements in the shared
pool, almost all executed exactly once.
The application was not using bind variables.
Modifying the application to use bind variables
was not an option.
Setting the cursor_sharing parameter to FORCE
was also not an option.
61
Problem Resolution
Bigger is not always better!
62
A Summary Of Wait Event
Techniques
Isolating a statement and analyzing its wait
events
Collecting wait event data for a session or
the entire instance at two different times and
computing the difference to find the wait
events during a specific period of time
Enabling wait event tracing in a session
63
A Summary Of Wait Event
Techniques (continued)
Enabling and disabling timed statistics
dynamically to measure wait event times for
a specific section of code
Ranking cumulative wait event data in order
to see which wait events account for the
most wait time
64
In Conclusion
The wait event interface gives you access to a
detailed accounting of how Oracle processes
spend their time.
Wait events touch all aspects of the Oracle
database server.
The wait event interface will not always give you
the answer to every performance problem, but it
will just about always give you insights that guide
you down the proper path to problem resolution.
65
The White Paper
A companion white paper to this presentation
is available for free download from my
company’s website at:
www.dbspecialists.com/present.html
66
Resources from
Database Specialists
The Specialist newsletter
– www.dbspecialists.com/specialist.html
Database Rx®
– dbrx.dbspecialists.com/guest
• Provides secure, automated monitoring, alert
notification, and analysis of your Oracle
databases
67
Contact Information
Roger Schrag
Database Specialists, Inc.
388 Market Street, Suite 400
San Francisco, CA 94111
Tel: 415/344-0500
Email: rschrag@dbspecialists.com
Web: www.dbspecialists.com
68