Beruflich Dokumente
Kultur Dokumente
pro-active - try to see the problem before it happens i.e. warning alert on a tablespace threshold before it becomes a
problem
pro-active means that it does not impact the end user where as reactive is generally when the user tells you he/she has a
problem. OEM provides many graphs that can shown you trends about a particular area of the database for more
information see OEM.
One rule is first try and fix the SQL code, especially if it is old code, no matter how you tune the database if the code is
badly written it won't make any difference, however oracle does provide some parameters that can help with poorly written
code.
Oracle Query process
Query processing requires the transformation of your SQL query into an efficient execution plan, Query optimization
requires that the best execution plan is the one executed, the goal is to use the least amount of resources possible (CPU and
I/O), the more resources a query uses the more impact it has on the general performance of the database.
A users query will go through 3 phases
At this stage the syntax ( decomposed into a relational algebra that's analyzed to see if its syntactically
correct) and semantics (make sure that tables and columns exist and you have permissions to access the
objects) are checked, at the end you have a parse tree which represents the query's structure.
Parsing
The statement is normalized so that it can be processed more efficiently, once all the checks have completed
it is considered a valid parse tree and is sent to the logical query plan generation stage.
All this is done in the library cache of the SGA
Optimization
The optimizer is used at this stage, which is a cost-based optimizer (CBO - see below for more info), this
chooses the best access method to retrieve the requested data. It uses statistics and any hints specified in the
SQL query, the CBO produces an optimal execution plan for the SQL statement.
The optimization process can be divided in two parts
Query rewrite phase : The parse tree is converted into an abstract logical query plan, the various nodes and
branches are replaced by operators of relational algebra.
Execution plan generation phase: Oracle transforms the logical query plan into a physical query plan, the
physical query or execution plan takes into account the following factors
The best way to pass data from operation to another during the query
The optimizer may well come up with multiple physical plans, all of which are potential execution plans.
The optimizer then chooses among them by estimating the costs of each possible plan (based on table and
index statistics) and selecting the plan with the lowest cost. This is called the cost-based query optimization
(CBO).
Execution
The final stage is to execute the physical query plan that was selected by the CBO
Optimization Mode: In previous versions of Oracle you had two choices RBO (rule-based optimizer) or CBO (cost-based
optimizer), RBO is available in Oracle 10g but is a deprecated product the CBO is the preferred and default method.
RBO
Rule-Base
Optimizer
used a heuristic method to select among several alternative access paths with the help of certain rules. All
paths were ranked and the lowest was chosen i.e. using the ROWID was a cost of 1, a full table scan was
a cost of 19.
CBO
Cost-based
Optimizer
The CBO uses statistics on tables and indexes, the order of tables and columns in the SQL statements,
available indexes, and any user-supplied hints to pick the most efficient way to access the data requested.
CBO almost always performs better than RBO.
Cost-Based Optimizer (CBO): The optimizers job is to find out the optimal or best plan to execute your DML statements
(select, insert, update and delete). The CBO uses statistics on tables and indexes, the order of tables and columns in the SQL
statements, available indexes, and any user-supplied hints to pick the most efficient way to access the data requested, the
CBO uses the least costly method (cost being CPU and I/O which is the most expensive) to get at the data.
Oracle 10g has a automatic job that collects statistics which is used the the CBO, the gather_stats_job is run between 10pm6am everyday. The job collects statistics on all tables that either have no statistics or stale statistics (more than 10% of data
has changed in the table since the last collection).
Once the statistics have been gathered a number of columns will have been updated in a table or index
Cardinality (the number of columns with similar values for each column)
System statistics, which include I/O characteristics of your system and CPU statistics which include CPU speed and
other related statistics.
The CBO will use all of the above statistics and other statistics (CPU, I/O and O/S statistics) to help with finding the
optimal plan, using the above statistics the CBO can estimate costs of individual operations. The less number of statistics
collected will result in less physical plans the CBO can come up with, thus the less number of choices the CBO can make.
Column statistics
Optimizer Configuration
The mode level can be set to a number of levels, depending on what your application requirements are you can start sending
data to the user quickly, if the user requires all the data to be seen altogether then the ALL_ROWS option is best but if the
user does not care about getting all the data only getting something fast then FIRST_ROWS may be an option, also by
using hints you can force the optimizer to take a chosen path seehints below for more details
Mode Levels
ALL_ROWS (default)
FIRST_ROWS_n
FIRST_ROWS
CBO Drawbacks
The CBO may not use the same plan for the same SQL statement every time, you need to watch out for the following
Execution plans can change across oracle versions, to make sure that the same plan is used then you need to use
stored outlines
The application developer may know of a better plan than the CBO, in which case hints can be used to direct or help
the CBO.
The CBO depends enormously on current statistics, if the statistics are absent or outdated the optimizer can make
poor decisions
Manually Collect Statistics: To collect statistics manually you need to use the package dbms_stats, you can sample row or
block and depending on the size of the table
Database
dbms_stats.gather_database_stats (
estimate_percent => NULL,
method_opt => 'auto',
granularity => 'all,
cascade => 'true',
option => 'gather_auto'
);
Note: estimate - refers to the percentage of rows that should be used to estimate the statistics, null means all
rows
method_opt - can specify several things , like include histograms, AUTO means collect histograms
granularity - only applies to tables, ALL means collect statistics for subpartitions, partitions, etc
cascade - collect all statistics for both table and index
option - gather = collect statistics for all objects even it the object has stale or fresh statistics
gather_auto = Oracle will decide on what to collect
gather_empty = collect only for objects that do not have any statistics
gather_stale = collect only for objects with stale statistics
Schema dbms_stats.gather_schema_stats( ownname => 'VALLEP');
Table
Index
System
Analyze a table
DBA_TABLES
look at the last_analyzed column to make sure statistics are being collected
DBA_OBJECT_TABLES look at the last_analyzed column to make sure statistics are being collected
Dynamic Sampling
Dynamic sampling is controlled by the optimizer_dynamic_sampling parameter which accepts values from 0 (off) to 10
(aggressive sampling) with 2 as the default.
Dynamic sampling can be a benefit
The sample time is small compared to the overall query execution time
Display
Instance
Session
SQL Hint
Efficient SQL: To elimate the number of rows that the optimizer has to retrieve we use where clauses, however the
optimizer may not end up writing the best execution plan, you have better knowledge of the application than the optimizer
and with this knowledge you can use hints which force the optimizer to use that knowledge. Using where clauses efficiently
will reduce the I/O, thus increasing performance, the optimizer will use statistics (row count) to determine how to create the
best plan, if no statistics are available then it has no option but to perform a full table scan.
Sometimes the optimizer will not use a index, even if you know one exists, the possible reason for these could be any of the
following
If you think heavy data skew is in the table, using histograms will help the optimizer to provide more accurate
representations of the data distribution in the table.
If for some reason it still refuses to use the index you can force it by using a hint.
Try using the where clause instead of the having clause, as the having clause incurs the additional overhead of sorting and
summing.
Hints: Sometimes the CBO does not know best and needs help to point it in the right direction, this is where hints help by
forcing the optimizer to take a path that you have knowledge about, hints can alter the join methods, join order or even
access paths. The are many hint options you would need to see the Oracle documentation for a full listing but i have listed
some of the more common one
all_rows - optimize throughput, not optimizer the response time of the statement
index_ffs - force a full scan of a index, can use parallelization for this to improve performance.
Examples
Join Methods
You should choose a join method based on how many rows you expect to be returned from the join
Cartesian
Joins
cartesian joins are normally a result of not using a where clause, it basically joins every row in all tables, so
for an example if one table as 50,000 rows and the other table has 100 rows then a cartesian join of the tables
would be 50,000 * 100 = 500,000 rows
Nested
Loops
If you are join tables with few than 10,000 rows then a Nested Loop would be the way to go, see hintssection
above for more details on how to use a Nested Loop
Hash Join
Use if the join will produce large subsets of data or a substantial proportion of a table is going to be joined.
Merge Join If the tables in the join are being joined with an inequality condition (not an equi join), then use a merge join
Bitmap Join
Basically used for data warehouses, do not use if running OLTP system, they are used with low cardinality
columns (columns having low distinct values - gender, marital _status, relation, etc)
See join methods for more information regarding different types of joins
Index Strategy: An index is a data structure that takes the value of one or more columns of a table (the key) and returns all
rows (or the requested columns in that row) with that value of the column quickly. The efficiency of the index is that it lets
you find the necessary rows without having to perform a full table scan, this leads to few I/O's.
As a general rule you so only use a index if you select about 10-15% of a table, when using a index this prevents you from
performing a full table scan. When you want to retrieve a row, oracle has to perform a lookup in the index to obtain the
ROWID, using this ROWID it can then retrieve the requested row. Using to many indexes can degrade performance as
when a table is updated the index has to be updated as well, so only index when the trade-off is better, in other words if you
use a OLTP database then limit the amount of indexes that you use. Here are some guidelines on when you should index
Index columns with high selectivity (few rows with identical values)
Try to use index-only query plans, in other words try to retrieve the query using the index only.
Use secondary indexes on columns frequently involved in order by and group by as well as sorting
operations union and distinct
B-Tree index
this is the default index type, you will probably use it for almost all the indexes in a typical OLTP
application.
Bitmap Index
This is ideal for column data that has a low cardinality (few distinct values). Do not use if you have
heavy DML going on the table, this is ideal in a data warehouse environment.
IOT
IOT's place all the table data in its primary key index, thus eliminating the need for a separate index.
(index organized The data is sorted and rows are stored in primary key order. This type of index will save storage space
table)
when compared to a normal B-Tree index.
Concatenated
Index
Concatenated or Composite indexes are indexes that include more than one column and are excellent for
improving selectivity of the where predicates. You are trying to elimate the optimizer performing full
table scans thus reducing I/O.
Funcation-Based are efficient in frequently used statements that involve functions or complex expression on columns,
Index
they can quickly return the computed value of the function or expression directly from the index.
Reverse-Key
Index
These are ideal when you have a heavy insert application, the reverse key index provide an efficient
way to distribute the index values more evenly and thus improve performance.
each partition can be distributed across disks thus increasing performance, also each partition can be
Partioned Index maintained separately (backup, remove) without affecting any other partition You can also use parallel
query options to improve performance.
It is worth rebuilding indexes regularly so queries can run faster, see indexes for more information.
Bind Variables: In a ideal world you should parse just once and use the same parsed version of the statement for repeated
executions, this operation is much more expensive than actually executing the statement. In order you use the same
execution plan for the same query the SQL statement must be identical, you use bind variables to archive this.
The way oracle knows if the query is identical is that the statement is hashed and a hash key stored, if this key does not
match then the statement is not the same, so even you add a space or replace a letter with a upper case letter the hash key
will be different.
There is a parameter you can set to force statements that fail to use bind variables to do so
Force bind variable
usage
Materialized Views: If you are dealing with very large amounts of data, you should consider using materialized views to
improve response time. Materialized views are objects with summary data from the underlying table. Expensive table joins
can be done beforehand and saved in the materialized view. You can use the package dbms_olap package to get
recommendations on ideal materialized views.
Stored Outlines: The CBO doesn't always use the same execution strategy, changes within the database can force the CBO
to change its plan. You can force oracle to use the same plan by using the plan stability feature stored outlines to preserve
the current execution plans, even if the statistics and optimizer mode changes. The only catch is that the SQL statement
must be identical if you wish to use the stored outline.
Stored outlines are use when you plan to migrate from one oracle version to another, thus you can cut risks and preserve the
applications present performance via stored outlines. Outlines ensure that the execution paths the queries used in a test
instance successfully carry over to the production instance. Also it can be used to override the code that is imbedded in the
application.
The more common name for this feature is called optimizer stability. All the information on stored outlines is stored in the
OUTLN schema in two tables OL$ and OL$HINTS (these are created with you install Oracle).
System parameters
Stored Outlines
(database)
Create Outline
Use Outline
Edit Outline
Remove Outline
DBA_OUTLINES
OL$
OL$HINTS
OL$NODES
DBMS_OUTLN
height-based - divide column values into bands, with each band containing a roughly equal number of rows
frequency-based - determine the number of buckets based on the distinct values in the column, each bucket contains
all the data that has the same value
height-based
begin
dbms_stats.gather_table_stats( ownname => 'HR', tabname => 'benefits',
method_opt => 'for column size 10 number_of_vists');
end;
/
begin
dbms_stats.gather_table_stats( ownname => 'HR', tabname => 'benefits',
method_opt => 'for column size skewonly');
end;
/
begin
dbms_stats.gather_table_stats( ownname => 'HR', tabname => 'BENEFITS',
method_opt => 'for all columns size auto');
end;
/
Note: for column - your own histogram creation auto - let oracle decide what to do
based on data distribution and workload skewonly - based the decision only on the
data distribution of the columns
frequency-based
begin
dbms_stats.gather_table_stats( ownname => 'HR', tabname => 'benefits',
method_opt => 'for column size 20 department_i-d');
end;
/
Useful Views
DBA_TAB_HISTOGRAMS
DBA_HISTOGRAMS
DBA_PART_HISTOGRAMS
provides the histogram data (end-points per histogram) for histograms on all table
partitions in the database
DBA_SUBPART_HISTOGRAMS
lists actual histogram data (end-points per histogram) for histograms on all table
subpartitions in the database
PL/SQL Performance: When you are developing a database there are four system parameters that can help the developer
PLSQL_WARNING
(disabled by default)
PLSQL_DEBUG
(false by default)