
Presenter: Susan White

Senior Engineer, Oracle Corporation

Oracle Wait Interface


Performance Diagnostics of Common Wait Events

Agenda
Introduction to Wait Events
Components
Common Wait Events
Interpreting Common I/O Related Wait Events
Interpreting Locks Related Wait Events
Interpreting Common Latency Related Wait Events
Dumps and Traces

Introduction to Wait Events


A little history on wait events:
Version 7.0: 104 wait events
Version 8.0: 140 wait events
Version 8i: 220 wait events
Version 9i: approx. 400 wait events
Version 10G: approx. 800 wait events

Introduction to Wait Events


The Oracle Wait Interface (OWI) is a performance tracking and tuning methodology that focuses on process bottlenecks, better known as wait events. These include waits for I/O, locks, latches, enqueues, background process activity, network latency, memory, and so on.
Each time a process has to wait for some resource, Oracle collects statistics about the wait.
These statistics are available in several V$ views.

Introduction to Wait Events


When looking at wait events, the focus is on the response time for a session:
Response Time = Service Time (CPU) + Wait Time
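A rough sketch of this breakdown for a single session, assuming TIMED_STATISTICS is enabled and that the bind variable :sid (a placeholder) holds the session ID of interest. Both figures are reported in centiseconds. Idle events such as SQL*Net message from client will appear in the list and usually need to be discounted by hand.

```sql
-- Service time (CPU) and wait time per event for one session, in centiseconds.
-- :sid is a placeholder bind variable for the session being examined.
SELECT 'CPU (service time)' AS component,
       st.value             AS centiseconds
FROM   v$sesstat st
JOIN   v$statname sn ON sn.statistic# = st.statistic#
WHERE  st.sid  = :sid
AND    sn.name = 'CPU used by this session'
UNION ALL
SELECT 'Wait: ' || se.event,
       se.time_waited
FROM   v$session_event se
WHERE  se.sid = :sid
ORDER  BY 2 DESC;
```

Summing the rows gives an approximation of the session's response time since logon.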

Components
The OWI (Oracle Wait Interface) is a collection of a few dynamic performance views and an extended SQL trace file.
The TIMED_STATISTICS parameter must be set to TRUE. This parameter does not add any appreciable overhead to your database.
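As a minimal sketch, the setting can be checked and, if necessary, enabled for the running instance as follows. Whether the change should also be persisted (for example in an spfile) depends on your setup and is not shown here.

```sql
-- Check the current value of the parameter.
SELECT name, value
FROM   v$parameter
WHERE  name = 'timed_statistics';

-- Enable timed statistics for the running instance.
ALTER SYSTEM SET timed_statistics = TRUE;
```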

Components
The following four views are the key components of the OWI:
V$EVENT_NAME
V$SESSION_WAIT
V$SESSION_EVENT
V$SYSTEM_EVENT

Components
Oracle 10G adds the following:
V$SYSTEM_WAIT_CLASS
V$SESSION_WAIT_CLASS
V$SESSION_WAIT_HISTORY
V$EVENT_HISTOGRAM
V$ACTIVE_SESSION_HISTORY

Components
V$EVENT_NAME is a view whose contents do not change while the instance is running; it lists all the wait events defined in your database.
SQL> descr v$event_name
Name                 Null?    Type
-------------------- -------- ------------
EVENT#                        NUMBER
EVENT_ID                      NUMBER
NAME                          VARCHAR2(64)
PARAMETER1                    VARCHAR2(64)
PARAMETER2                    VARCHAR2(64)
PARAMETER3                    VARCHAR2(64)
WAIT_CLASS_ID                 NUMBER
WAIT_CLASS#                   NUMBER
WAIT_CLASS                    VARCHAR2(64)

Components
The V$SYSTEM_EVENT view displays aggregated statistics of all wait events encountered by all Oracle sessions.
SQL> descr v$system_event
Name                 Null?    Type
-------------------- -------- ------------
EVENT                         VARCHAR2(64)
TOTAL_WAITS                   NUMBER
TOTAL_TIMEOUTS                NUMBER
TIME_WAITED                   NUMBER
AVERAGE_WAIT                  NUMBER
TIME_WAITED_MICRO             NUMBER
EVENT_ID                      NUMBER
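A typical use of this view is a "top wait events" listing. The following sketch returns the five events with the most time waited since instance startup. It does not filter out idle events, which you would normally exclude before drawing conclusions.

```sql
-- Top five wait events by total time waited (centiseconds) since startup.
SELECT *
FROM  (SELECT event, total_waits, total_timeouts, time_waited, average_wait
       FROM   v$system_event
       ORDER  BY time_waited DESC)
WHERE  ROWNUM <= 5;
```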

Components
The V$SESSION_EVENT view contains aggregated wait event statistics by session for all sessions currently connected.
SQL> descr v$session_event;
Name                 Null?    Type
-------------------- -------- ------------
SID                           NUMBER
EVENT                         VARCHAR2(64)
TOTAL_WAITS                   NUMBER
TOTAL_TIMEOUTS                NUMBER
TIME_WAITED                   NUMBER
AVERAGE_WAIT                  NUMBER
MAX_WAIT                      NUMBER
TIME_WAITED_MICRO             NUMBER
EVENT_ID                      NUMBER

Components
The V$SESSION_WAIT view provides detailed information about the event or resource that each session is waiting for. This is a real-time view.
(Note: in 10G, V$SESSION_WAIT is wholly incorporated into the V$SESSION view, so you do not need a join to get the session information.)

Components
SQL> descr v$session_wait;
Name                 Type
-------------------- ------------
SID                  NUMBER
SEQ#                 NUMBER
EVENT                VARCHAR2(64)
P1TEXT               VARCHAR2(64)
P1                   NUMBER
P1RAW                RAW(8)
P2TEXT               VARCHAR2(64)
WAIT_CLASS_ID        NUMBER
WAIT_CLASS#          NUMBER
WAIT_CLASS           VARCHAR2(64)
WAIT_TIME            NUMBER
SECONDS_IN_WAIT      NUMBER
STATE                VARCHAR2(19)
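A minimal sketch of how this view is usually queried to see what sessions are waiting on right now. It uses the P2, P2TEXT, P3 and P3TEXT columns, which exist in the view although the abbreviated listing above does not show all of them. Excluding only one idle event is an illustrative simplification.

```sql
-- Sessions currently in a wait, longest waits first.
SELECT sid, event,
       p1text, p1,
       p2text, p2,
       p3text, p3,
       state, seconds_in_wait
FROM   v$session_wait
WHERE  state = 'WAITING'
AND    event <> 'SQL*Net message from client'
ORDER  BY seconds_in_wait DESC;
```

The meaning of P1, P2 and P3 differs per event; the P1TEXT to P3TEXT columns describe them.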

Components: Tracing
Trace event 10046 is the extended SQL trace. When you can't monitor events interactively, you can diagnose a performance problem by recording the wait events in a trace file for more detailed analysis.

Components: Tracing
You can enable trace event 10046 at the instance level or at the session level.
For the instance level, set the event parameter and restart the instance. (Do not do this!)
event = "10046 trace name context forever, level 8"

Set the event at the session level:
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
Run the SQL statement(s) you wish to trace.
ALTER SESSION SET EVENTS '10046 trace name context off';

You can also use the dbms_support.start_trace procedure to do your tracing.

Components: Tracing
When you want to trace someone else's session:
EXEC dbms_support.start_trace_in_session(sid=><sid>, serial#=><serial#>, waits=>true, binds=>true);
Use stop_trace_in_session to end your tracing.

Components: Tracing
To find your trace file, look in the user_dump_dest directory. List the directory contents by time; one of the most recent files will be your trace file.
You can also use tracefile_identifier to name your trace file for easy reference:
ALTER SESSION SET tracefile_identifier = 'Tracefilesql1';
Once you find your trace file, the tkprof utility (Transient Kernel Profiler) will summarize the wait events.
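If you do not know where the dump directory is, a query along these lines shows it from within the database.

```sql
-- Directory in which user session trace files are written.
SELECT value AS trace_directory
FROM   v$parameter
WHERE  name = 'user_dump_dest';
```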

New Views in 10G


V$SESSION_WAIT_HISTORY
V$SYSTEM_WAIT_CLASS
V$SESSION_WAIT_CLASS
V$EVENT_HISTOGRAM
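Two short sketches of how these 10G views might be used. The first lists the most recent waits recorded for one session; the second shows the distribution of wait times for a single event. The bind variable :sid and the choice of event are placeholders for illustration.

```sql
-- Most recent waits recorded for one session.
SELECT seq#, event, p1, p2, p3, wait_time
FROM   v$session_wait_history
WHERE  sid = :sid
ORDER  BY seq#;

-- Distribution of wait times (in millisecond buckets) for one event.
SELECT event, wait_time_milli, wait_count
FROM   v$event_histogram
WHERE  event = 'db file sequential read'
ORDER  BY wait_time_milli;
```

The histogram is useful when an average hides a small number of very long waits.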

OWI Limitations
No CPU statistics.
No end-to-end visibility.
No historical data: you must capture and create your own history.
Some inaccuracies when computing total time: times are rounded, and the centisecond unit of measurement is too coarse-grained for today's fast computers.

Common Wait Events


Buffer busy waits: occurs when a session wants to access a data block in the buffer cache that is currently in use by some other session. (10G: read by other session)
P1: absolute file number where the block in question resides
P2: the actual block number
P3: 9i: reason for the wait; 10G: class of the block, as listed in the V$WAITSTAT view
Wait time: 100 cs, or 1 second
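Because P1 and P2 identify a file and block, they can be mapped back to a segment. The sketch below assumes that :p1 and :p2 (placeholder bind variables) hold the values seen in V$SESSION_WAIT and that the file number in DBA_EXTENTS matches the one reported in P1. The same lookup applies to other events whose parameters are a file number and a block number.

```sql
-- Find the segment that contains the block a session is waiting on.
SELECT owner, segment_name, segment_type
FROM   dba_extents
WHERE  file_id = :p1
AND    :p2 BETWEEN block_id AND block_id + blocks - 1;
```

On databases with many extents this query can be slow, so run it sparingly.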

Common Wait Events


Control file parallel write: occurs when the session waits for the completion of the write requests to all the control files.
P1: number of control files being written to
P2: total number of blocks
P3: number of I/O requests
Wait time: no delay; actual elapsed time to complete all I/O requests.

Common Wait Events


Db file parallel read: occurs when a process reads multiple noncontiguous single blocks from one or more data files, or during a database recovery operation when database blocks that need changes as part of recovery are read in from the datafiles.
P1: number of files to read from
P2: total number of blocks to read
P3: total number of I/O requests
Wait time: no timeouts; the session waits until all I/O is completed.

Common Wait Events


Db file parallel write: occurs when the database writer (DBWR) writes the dirty blocks to the datafiles in a write batch.
P1: number of files to write to
P2: total number of blocks to write
P3: from 9.2 onwards, shows the timeout value in centiseconds to wait for the I/O completion
Wait time: no timeouts

Common Wait Events


Db file scattered read: occurs when the session issues an I/O request to read multiple data blocks. The blocks read from the datafiles are scattered into the buffer cache. Typical of full table scans or index fast full scans. DB_FILE_MULTIBLOCK_READ_COUNT determines the maximum number of blocks to read.
P1: file number to read the blocks from
P2: starting block number
P3: number of blocks to read
Wait time: no timeouts

Common Wait Events


Db file sequential read: occurs when the process waits for an I/O completion for a sequential read. This is a single-block read operation. This event is posted when reading from an index, rollback or undo segments, table access by rowid, datafile headers, or some temporary segments.
P1: file number to read the data block from
P2: starting block number to read
P3: in most cases this is 1, but for temporary segments it can be more than 1
Wait time: no timeouts

Common Wait Events


Db file single write: occurs when Oracle is updating datafile headers, typically during a checkpoint. You may notice this event when your database has a lot of datafiles.
P1: file number to write to
P2: starting block to write to
P3: typically 1
Wait time: no timeouts

Common Wait Events


Direct path read: occurs when Oracle is reading data blocks directly into the session's PGA instead of the buffer cache in the SGA. Direct read I/O is normally used while accessing the temporary segments that reside on disk for sorts, parallel queries and hash joins.
P1: absolute file number to read from
P2: starting block to read from
P3: number of blocks to read
Wait time: no timeouts

Common Wait Events


Direct path write: the opposite of the direct path read. Oracle writes buffers directly from the PGA to the datafiles. Normally used when writing to temporary segments, in direct data loads, or in parallel DML operations.

Common Wait Events


Enqueue: a shared memory structure used by Oracle to serialize access to database resources. The process waits in a queue for its turn to acquire the enqueue lock. Various types of enqueues are used to serialize access:
ST: space management
SQ: sequence numbers
TX: transactions

Common Wait Events


P1: enqueue name and mode requested by the waiting process
P2: resource identifier ID1 for the requested lock
P3: resource identifier ID2 for the requested lock (same IDs as in V$LOCK)
Wait time: depends on the enqueue name. Oracle can wait up to 3 seconds, or until the enqueue resource becomes available, whichever comes first.
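P1 packs the two-character enqueue name into its high-order bytes and the requested mode into its low-order bytes. The following sketch unpacks it with BITAND. The event filter covers both the 9i name (enqueue) and the 10G style names that start with "enq:", which is an assumption about the naming you will see on your version.

```sql
-- Decode the enqueue name and requested mode from P1.
SELECT sid,
       CHR(BITAND(p1, 4278190080) / 16777216) ||
       CHR(BITAND(p1, 16711680) / 65536)  AS enqueue_name,
       BITAND(p1, 65535)                  AS requested_mode,
       p2                                 AS id1,
       p3                                 AS id2
FROM   v$session_wait
WHERE  event = 'enqueue'
OR     event LIKE 'enq:%';
```

For example, a P1 value of 1415053318 decodes to enqueue name TX and mode 6.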

Common Wait Events


Free buffer waits: occurs when the session cannot find free buffers in the database buffer cache to read data blocks into or to build a consistent read image. This signals DBWR to free up dirty buffers.
P1: file number
P2: block number from the file that needs to be read into the cache
P3: 9i: not used; 10G: shows the ID of the LRU list in the buffer cache
Wait time: Oracle waits up to 1 second for free buffers, then tries again.

Common Wait Events


Latch free: occurs when the process waits to acquire a latch that is currently held by another process. Processes needing a latch do not have to wait in a queue. If the process fails to acquire a latch, it spins and then tries again. The most common latches are cache buffers chains, library cache and shared pool.
P1: address of the latch
P2: number of the latch (V$LATCHNAME.LATCH#)
P3: number of tries
Wait time: increases exponentially.
In 10G, latches have their own wait events.
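Since P2 is the latch number, joining to V$LATCHNAME turns a latch free wait into a latch name. A minimal sketch:

```sql
-- Name the latch that sessions are currently waiting for.
SELECT sw.sid,
       ln.name AS latch_name,
       sw.p3   AS tries
FROM   v$session_wait sw
JOIN   v$latchname ln ON ln.latch# = sw.p2
WHERE  sw.event = 'latch free';
```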

Common Wait Events


Library cache pin / library cache lock: occurs when the session tries to pin an object in the library cache to modify or examine it. The session must acquire a pin to ensure that the object does not change.
P1: address of the library object
P2: address of the load lock
P3: mode plus the namespace from V$DB_OBJECT_CACHE
Wait time: PMON waits 1 second; all other processes wait 3 seconds.

Common Wait Events


Log buffer space: occurs when the process has to wait for space to become available in the log buffer.
P1 to P3: not used
Wait time: 1 second, but can be 5 seconds if it has to wait for a log file switch

Common Wait Events


Log file parallel write: occurs when the session waits for LGWR to write to all the members of the redo group.
P1: number of log files to write to
P2: number of OS blocks to write
P3: number of I/O requests
Wait time: actual elapsed time for the writes to all the log files.

Common Wait Events


Log file sequential read: occurs when the process waits for blocks to be read from the online redo log files.
P1: relative sequence number of the redo log file
P2: block number to start reading from
P3: number of OS blocks to read
Wait time: actual elapsed time.

Common Wait Events


Log file switch (archiving needed): indicates that the ARCH process is not keeping up with LGWR.
Log file switch (checkpoint incomplete): the checkpoint must complete before the log file can be reused; indicates that the redo log files may be too small.

Common Wait Events


Log file sync: when a transaction completes (commit or rollback), the redo information must be written to the log files. This wait occurs while the process is waiting for the redo information to be written out. It is often seen in applications that have too many short transactions and commit too frequently; batch the commits for better throughput.
P1: the number of the buffer in the log buffer that needs to be synchronized
Wait time: 1 second

Common Wait Events


SQL*Net message from/to client: the session is waiting for the client to respond. Excessively long wait times can indicate a network issue; however, this does not degrade database performance.
P1: ASCII value that shows the type of network driver
P2: number of bytes
P3: not used
Wait time: actual time it takes for the message to be sent or received.

Common Wait Events


Tracking CPU and other statistics: V$SESSTAT / V$SYSSTAT
CPU used by this session
CPU used when call started
Recursive CPU usage
Parse time CPU
Session logical reads
Physical reads
Physical writes
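These statistics can be read for a single session by joining V$SESSTAT to V$STATNAME. The sketch below assumes a placeholder bind variable :sid; the same statistic names can be queried directly from V$SYSSTAT for instance-wide totals.

```sql
-- CPU and I/O statistics for one session.
SELECT sn.name, st.value
FROM   v$sesstat st
JOIN   v$statname sn ON sn.statistic# = st.statistic#
WHERE  st.sid = :sid
AND    sn.name IN ('CPU used by this session',
                   'CPU used when call started',
                   'recursive cpu usage',
                   'parse time cpu',
                   'session logical reads',
                   'physical reads',
                   'physical writes')
ORDER  BY sn.name;
```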

Wait Events: Root Cause Analysis


In order to do a root cause analysis (RCA), you must have some type of historical data collector for the wait event statistics.
Trace 10046: gives fine-grained performance data, but the overhead is too high.
Statspack: does not give session-level data.
Collect historical data with a BEFORE LOGOFF trigger: low overhead (disadvantage: no data if the session hangs or is killed).
Build a table and keep approx. 7 days' worth of data. You can archive the data for long-term comparisons.
Store both the wait events and the SQL statement that generated the events (V$SQLTEXT).
Alternatively, collect historical data by sampling with PL/SQL and SQL.
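A minimal sketch of the logoff-trigger approach. The table name session_wait_hist and trigger name capture_waits_at_logoff are hypothetical, and the sketch assumes the trigger owner has been granted direct SELECT access to the underlying V$ views and has the privilege to create database-level triggers. It captures only wait events, not the SQL text.

```sql
-- Hypothetical history table for per-session wait statistics.
CREATE TABLE session_wait_hist (
  logoff_time    DATE,
  username       VARCHAR2(30),
  sid            NUMBER,
  event          VARCHAR2(64),
  total_waits    NUMBER,
  total_timeouts NUMBER,
  time_waited    NUMBER,
  max_wait       NUMBER
);

-- Copy the session's wait statistics into the history table at logoff.
CREATE OR REPLACE TRIGGER capture_waits_at_logoff
BEFORE LOGOFF ON DATABASE
BEGIN
  INSERT INTO session_wait_hist
  SELECT SYSDATE, USER, se.sid, se.event, se.total_waits,
         se.total_timeouts, se.time_waited, se.max_wait
  FROM   v$session_event se
  WHERE  se.sid = (SELECT sid FROM v$mystat WHERE ROWNUM = 1);
END;
/

-- Keep roughly seven days of history; run this from a scheduled job.
DELETE FROM session_wait_hist
WHERE  logoff_time < SYSDATE - 7;
```

Test a trigger like this on a non-production system first, since an invalid logoff trigger affects every session.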

Interpreting Common Wait Events: How Do I find and fix the Problems?

Interpreting Common I/O Related Wait Events


I/O operations remain the slowest activity in any computer system.
The seven most common I/O related wait events:
Db file sequential read
Db file scattered read
Direct path read
Direct path write
Log file parallel write
Db file parallel write
Control file parallel write

Interpreting Common I/O Related Wait Events


Db file sequential read: the Oracle process wants a block that is not currently in the SGA.
Look at TIME_WAITED and AVERAGE_WAIT in V$SESSION_EVENT.
This event is normally one of your top 5 wait events when looking at system-wide events.
Significant time waited is most likely an application issue.
Average wait times on today's newer storage systems should be about 1 cs, or 0.4 to 0.8 cs on a SAN (due to memory caching).

Interpreting Common I/O Related Wait Events


If the object with the db file sequential reads is an index, your application may be doing a lot of index reads (SAP).
Inspect the application: would parallel full table scans be more efficient?
Check the clustering factor of the index: this affects the number of I/Os.
Check the following two init.ora parameters; inappropriate use of them can cause significant I/O waits:
- optimizer_index_cost_adj
- optimizer_index_caching
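The checks above can be sketched as two queries. The bind variables :owner and :table_name are placeholders. As a rule of thumb, a clustering factor close to the number of table blocks is good, while one close to the number of table rows means index range scans will visit many different blocks.

```sql
-- Compare each index's clustering factor with the table's size.
SELECT i.index_name,
       i.clustering_factor,
       t.blocks   AS table_blocks,
       t.num_rows AS table_rows
FROM   dba_indexes i
JOIN   dba_tables  t ON t.owner = i.table_owner
                    AND t.table_name = i.table_name
WHERE  i.table_owner = :owner
AND    i.table_name  = :table_name;

-- Current values of the two optimizer parameters.
SELECT name, value
FROM   v$parameter
WHERE  name IN ('optimizer_index_cost_adj', 'optimizer_index_caching');
```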

Interpreting Common I/O Related Wait Events


If the object with the db file sequential reads is a table, keep in mind that access by rowid after the index read is sequential.
Check the AVERAGE_WAIT time in V$SYSTEM_EVENT.
This event is usually one of your top 5 wait events and does not by itself indicate a problem. If it does not appear among your top 5 wait events, you have issues with other wait events that need to be addressed.
If AVERAGE_WAIT is excessively high, check for hot spots on your disks. In 10G, use ASM for load balancing.

Interpreting Common I/O Related Wait Events


With the db file scattered read event, the Oracle session is using db_file_multiblock_read_count to read your blocks from disk into the SGA.
Multiblock I/O requests are associated with full table scans and index fast full scans.
A significantly high db file scattered read wait is usually an application issue.
Ideally you want to have more single-block reads (sequential) than multiblock reads (scattered).
Parallel operations can be implemented to decrease the wait time if the full scan is appropriate.
If an application has been running efficiently for a while and then suddenly starts generating db file scattered read waits, look for a dropped index or one that has become unusable.

Interpreting Common I/O Related Wait Events


Optimizer parameters that can skew a database toward full table scans are db_file_multiblock_read_count (MBRC), hash_area_size and optimizer_index_cost_adj.
Sometimes this wait event goes up because the database has not been analyzed lately. Check the last_analyzed date.
This event can increase if a table has many chained rows. Check chain_cnt after analyzing.
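Both checks can be made from the data dictionary. In this sketch :owner is a placeholder bind variable. Note that CHAIN_CNT is filled in by the ANALYZE command rather than by DBMS_STATS, so it may be empty or stale if only DBMS_STATS is used.

```sql
-- Tables with old or missing statistics, and their chained row counts.
SELECT table_name, last_analyzed, num_rows, chain_cnt
FROM   dba_tables
WHERE  owner = :owner
ORDER  BY last_analyzed NULLS FIRST;
```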

Interpreting Common I/O Related Wait Events


Direct path read waits come from SQL statements that perform direct read operations from temp or regular tablespaces: sorts, order by, group by, union, distinct, rollup, merge joins, parallel scans.
Evaluate your sorts with the V$SESSTAT view.
The init.ora parameters that cause more of your sorts to be performed in memory are sort_area_size and pga_aggregate_target.
The goal is to tune the queries to do less sorting. Use union all instead of union, hash joins instead of sort merge joins, and nested loops instead of hash joins.
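A short sketch of the sort evaluation for one session, with :sid as a placeholder bind variable. A high number of disk sorts relative to memory sorts suggests that sort memory is too small or that the SQL sorts more than it needs to.

```sql
-- Memory sorts versus disk sorts for one session.
SELECT sn.name, st.value
FROM   v$sesstat st
JOIN   v$statname sn ON sn.statistic# = st.statistic#
WHERE  st.sid = :sid
AND    sn.name IN ('sorts (memory)', 'sorts (disk)', 'sorts (rows)');
```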

Interpreting Common I/O Related Wait Events


Direct path write waits occur when the PGA is writing to temporary tablespaces or data files.
Direct writes come from SQL operations such as sorts, CTAS, hash joins, index builds, and sqlldr running in direct mode.
The direct path write can be tuned the same way as the direct path read.
Note: tuning your SQL statements first will have the greatest impact on both direct reads and writes.

Interpreting Common I/O Related Wait Events


Db file parallel write belongs to the DBWR process.
Significant waits for this event are most likely an I/O issue.
User sessions will not have this wait event but may show waits on write complete waits or free buffer waits.
Check whether your system supports asynchronous I/O; if so, use it. If it does not (HP-UX supports it only on raw devices), consider using multiple DBWRs.

Interpreting Common I/O Related Wait Events


Log file parallel write belongs to LGWR. This is a system-wide wait; a user session may wait on log file sync instead.
Look at TIME_WAITED and AVERAGE_WAIT in the V$SYSTEM_EVENT view. If the average is greater than 1 cs, you could be experiencing slow throughput.
Address this the same way as for DBWR: use asynchronous I/O where available. Unfortunately you can't add multiple LGWRs.
Check the placement of your log files to ensure there is no contention.
Unless you have a standby database, consider using NOLOGGING for some of your operations.

Interpreting Common I/O Related Wait Events


Control file parallel write waits are usually a symptom of a high number of log switches. Increase the size of your redo logs.

Interpreting Common Locks Related Wait Events

Interpreting Common Locks Related Wait Events


Difference between a latch and a lock:
Latch: the only purpose of a latch is to provide exclusive access to memory structures. Latches protect memory objects. Two modes: willing-to-wait or no-wait.
Lock: two purposes: 1) to allow multiple processes to share the same resource when the modes are compatible; 2) to enforce exclusive access when the modes are incompatible. Locks protect database objects. Six modes: null, row share, row exclusive, share, share row exclusive, or exclusive.

Interpreting Common Locks Related Wait Events


A latch free wait occurs when a process fails to obtain a latch.
Look at TOTAL_WAITS in the V$SYSTEM_EVENT view.
Latches can be monitored with the V$LATCH view.
Latch contention is common in high-concurrency environments and should be expected.
The 5 most common latch waits are for the shared pool, library cache, cache buffers chains, cache buffers lru chain, and row cache objects latches.
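A minimal sketch of monitoring latches through V$LATCH, listing the latches that processes have actually had to sleep on, worst first:

```sql
-- Latches with contention, ordered by number of sleeps.
SELECT name, gets, misses, sleeps
FROM   v$latch
WHERE  sleeps > 0
ORDER  BY sleeps DESC;
```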

Interpreting Common Locks Related Wait Events


Shared pool and library cache latch waits are mainly due to intense hard parsing.
This can be seen in applications that use literals in their SQL statements.
Convert these SQL statements to use bind variables.
Alternatively, set CURSOR_SHARING to FORCE.
Identify the SQL statements by looking at V$SQLAREA.PARSE_CALLS.
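One common way to spot literal SQL is to group statements by their leading characters: statements that differ only in a literal value tend to share the same prefix. The 60-character prefix and the threshold of 10 copies in this sketch are arbitrary illustrative choices.

```sql
-- Groups of near-identical statements that likely differ only by literals.
SELECT SUBSTR(sql_text, 1, 60) AS sql_prefix,
       COUNT(*)                AS copies,
       SUM(parse_calls)        AS parse_calls
FROM   v$sqlarea
GROUP  BY SUBSTR(sql_text, 1, 60)
HAVING COUNT(*) > 10
ORDER  BY copies DESC;
```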

Interpreting Common Locks Related Wait Events


Cache buffers chains latch waits are caused by inefficient SQL statements.
You can get this contention when the application opens multiple concurrent sessions that execute the same SQL statements against the same data set.
Another cause is hot blocks in the buffer cache. To determine whether you have hot blocks, examine the P1RAW column of the V$SESSION_WAIT view.
You can reorganize a table or index with hot blocks by exp/imp, raising PCTFREE to spread the data out more.
Consider reducing the block size for tables with many hot blocks; 9i supports multiple block sizes.

Interpreting Common Locks Related Wait Events


Cache buffers LRU chain latch waits occur when there is intense buffer cache activity.
Statements that repeatedly scan large unselective indexes or perform full table scans are the main problem SQL.
There is no easy fix: tune the SQL.

Interpreting Common Locks Related Wait Events


The row cache objects latch protects the data dictionary cache.
The only way to influence this latch is to increase the shared pool size.
This problem should be seen less often in 9i and above, because those versions use multiple child latches.

Interpreting Common Locks Related Wait Events


Enqueues are locks that apply to database resources. They are initiated by the application requests. The wait event is the enqueue wait. An enqueue wait can occur for many different reasons. Query the V$ENQUEUE_STAT view for information.
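A minimal sketch of a V$ENQUEUE_STAT query that shows which enqueue types have accumulated the most wait time since instance startup:

```sql
-- Enqueue types ranked by cumulative wait time.
SELECT eq_type, total_req#, total_wait#, cum_wait_time
FROM   v$enqueue_stat
WHERE  total_wait# > 0
ORDER  BY cum_wait_time DESC;
```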

Interpreting Common Locks Related Wait Events


The most common enqueue wait is for the TX enqueue in mode 6. This is a row-level lock.
Look in V$LOCK to determine which session is blocking, and kill the blocking session or have the user log off.
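The V$LOCK lookup can be sketched as a self-join that pairs each waiting session with the session holding the same resource. This form assumes a single-instance database, where the holder is flagged with BLOCK = 1.

```sql
-- Pair each waiting session with the session that blocks it.
SELECT h.sid     AS holding_sid,
       w.sid     AS waiting_sid,
       w.type,
       w.id1,
       w.id2,
       h.lmode   AS held_mode,
       w.request AS requested_mode
FROM   v$lock w
JOIN   v$lock h ON h.type = w.type
               AND h.id1  = w.id1
               AND h.id2  = w.id2
WHERE  w.request > 0
AND    h.block   = 1
AND    h.sid    <> w.sid;
```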

Interpreting Common Locks Related Wait Events


Another common enqueue wait is for the TX enqueue in mode 4. This is usually caused by unique key enforcement or by a wait for an ITL (interested transaction list) slot in the data block.
For ITL problems, check the V$SEGMENT_STATISTICS view to determine the volume of ITL waits in your database (statistic name: ITL waits).
The fix is to recreate the object with a higher INITRANS or PCTFREE storage option.
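A short sketch of the V$SEGMENT_STATISTICS check, listing the segments that have recorded ITL waits:

```sql
-- Segments that have experienced ITL waits, worst first.
SELECT owner, object_name, object_type, value AS itl_waits
FROM   v$segment_statistics
WHERE  statistic_name = 'ITL waits'
AND    value > 0
ORDER  BY value DESC;
```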

Interpreting Common Locks Related Wait Events


Unique key enforcement problems can occur if you have multiple concurrent users inserting the same key value into a table that has unique constraints.
The action is to find the blocking lock and find out why the application allows users to try to insert duplicates at the same time.

Interpreting Common Locks Related Wait Events


Another common enqueue wait is for the ST enqueue. There is only one ST lock per database.
Actions that modify UET$ and FET$ require this lock.
The fix is to use locally managed tablespaces and to recreate all temporary tablespaces with TEMPFILEs.
If you can't change from dictionary-managed tablespaces, increase the next extent sizes of all your high-growth segments, and preallocate extents for your growing segments.

Interpreting Common Locks Related Wait Events


Another common enqueue wait is for the TM enqueue in mode 3.
Unindexed foreign keys are the primary reason for this lock contention.
This lock can also occur if the application issues an explicit LOCK TABLE command. Query V$SQLAREA for this statement.
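A simplified sketch for finding candidate unindexed foreign keys in one schema (:owner is a placeholder bind variable). It reports foreign key columns that have no index column at the same position, which is a heuristic rather than an exact test: an index with the columns in a different order may still be adequate, so review the output by hand.

```sql
-- Foreign key columns without an index column at the matching position.
SELECT c.table_name, c.constraint_name, cc.column_name, cc.position
FROM   dba_constraints  c
JOIN   dba_cons_columns cc ON cc.owner = c.owner
                          AND cc.constraint_name = c.constraint_name
WHERE  c.constraint_type = 'R'
AND    c.owner = :owner
AND    NOT EXISTS (SELECT 1
                   FROM   dba_ind_columns ic
                   WHERE  ic.table_owner     = c.owner
                   AND    ic.table_name      = c.table_name
                   AND    ic.column_name     = cc.column_name
                   AND    ic.column_position = cc.position)
ORDER  BY c.table_name, c.constraint_name, cc.position;
```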

Interpreting Common Locks Related Wait Events


Buffer busy waits mainly occur because multiple sessions are either trying to read the block into memory or trying to pin the block in memory.
To fix, try to reduce the level of concurrency, or consider increasing FREELISTS or FREELIST GROUPS of the object.
You can also try to rebuild the object with a smaller block size, or reduce the number of rows in the block with the PCTFREE storage option.
If the majority of your waits are on segment headers, check your NEXT extent size and make sure that the gap between PCTFREE and PCTUSED is not too small.
If the majority of your waits are for undo segment headers, you may have too few rollback segments; use system managed undo.
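To see which class of block the waits fall on (data block, segment header, undo header and so on), V$WAITSTAT can be queried as in this sketch:

```sql
-- Buffer busy waits broken down by block class.
SELECT class, count, time
FROM   v$waitstat
WHERE  count > 0
ORDER  BY time DESC;
```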

Interpreting Common Latency Related Wait Events

Interpreting Common Latency Related Wait Events


Log file sync waits occur when a session waits on LGWR to write out the buffer. There are 3 main causes of this wait:
High commit frequency: application related.
Slow I/O subsystem: the only solution is faster hardware. Consider raw devices, RAID 0 vs. RAID 5, and fiber channel connections.
Oversized log buffer (greater than 1 MB).
Note: you may also want to check your PROCESSES parameter. If it is set too high, this may increase your log file sync waits.

Interpreting Common Latency Related Wait Events


Log buffer space waits can occur if sessions wait to copy redo entries to the log buffer because of insufficient space, or because the LGWR process is not fast enough.
If the log buffer is too small (less than 1 MB), increase it.
If the I/O subsystem is too slow, consider using NOLOGGING, upgrading the hardware, or tuning any disk contention.

Interpreting Common Latency Related Wait Events


Free buffer waits occur when the DBWR is writing out dirty buffers from the SGA. Reasons are:
Poorly written SQL
Not enough DBWRs
Slow I/O
Delayed block cleanouts
Small buffer cache

Interpreting Common Latency Related Wait Events


Delayed block cleanouts: the first process to scan a table that has just been loaded will be penalized. Full scan a newly loaded table to minimize this issue.
Small buffer cache: this is not a common problem, as most DBAs oversize the buffer cache. Try adding more DBWRs before increasing the size of the cache.

Interpreting Common Latency Related Wait Events


Write complete waits are symptomatic of foreground processes waiting on the DBWR to write out blocks.
This problem will be secondary to other DBWR problems; check your write complete waits and db file parallel write waits.

Interpreting Common Latency Related Wait Events


Log file switch completion waits occur when the redo log files are too small and transactions generate a lot of redo entries.
The fix is to create larger log files. Your ideal target is to have no more than 4 to 6 log switches per hour.
Log file switch (checkpoint incomplete) is a closely related wait event. The solution is the same; you may also want to add more log file groups.
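The number of log switches per hour can be counted from V$LOG_HISTORY, as in this sketch. How far back the history goes depends on how much log history your control file retains.

```sql
-- Redo log switches per hour.
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS switch_hour,
       COUNT(*)                               AS log_switches
FROM   v$log_history
GROUP  BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER  BY 1;
```

Hours that show many more than 4 to 6 switches are candidates for larger redo log files.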

Additional Topics
Wait events in 10G: new additions
Wait events in a RAC environment

Dumps and Traces


You will often encounter ORA-00600 internal errors and ORA-07445 core dumps. When you contact support, they ask for additional trace files, dumps or memory dumps so that the root cause of the issue can be identified.

Dumps and Traces


As a DBA you should become familiar with ORADEBUG. This utility does not have a lot of documentation; however, it does have extensive help:
SQL> oradebug help
SQL> oradebug dumplist

Dumps and Traces


When investigating block corruptions you may also want to dump data or index blocks:
SQL> alter system dump datafile <file#> block <block#>;

Dumps and Traces


Control files can also be dumped. This can be useful when tracking recovery-related issues or SCN synchronization issues.
This can be done with SQL:
SQL> alter session set events 'immediate trace name controlf level 10';
Or with oradebug:
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump controlf 10

Dumps and Traces


Heap dumps are often requested when there is a shared pool issue.
SQL> alter session set events 'immediate trace name heapdump level <level>';
Or with oradebug:
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump heapdump <level>

Dumps and Traces


Library cache dumps give details about the objects in the library cache.
SQL> alter session set events 'immediate trace name library_cache level 10';
Or with oradebug:
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump library_cache 10

Dumps and Traces


Processstate dumps are used when diagnosing memory corruptions or deadlock errors.
SQL> alter session set events 'immediate trace name processstate level <level>';
Or with oradebug:
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump processstate <level>

Dumps and Traces


System state dumps are used when diagnosing database hang conditions.
SQL> alter session set events 'immediate trace name systemstate level <level>';
Or with oradebug:
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate <level>

Summary
Become familiar with the various views used to diagnose any waits occurring in your database.
Set up a historical tracking system so that you can answer the question "What happened?" after the fact.
Become familiar with the most common wait events and their causes.

References
www.hotsos.com - Why are Oracle's read events named backwards?
www.ioug.org - If your memory serves you right

References Recommended Reading


Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning
By Richmond Shee, K. Deshpande, K Gopalakrishnan

Questions and Answers
