
Relational Optimization

The optimizer is the heart of a relational
database management system.

The optimizer is an inference engine for
determining the best possible database
navigation strategy for any given SQL request.
Relational optimization is very powerful because it
allows queries to adapt to a changing database
environment. The optimizer can react to changes by
formulating new access paths without requiring changes
to application code.

Physical data independence is the separation of access
criteria from physical storage characteristics.
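
To illustrate how an unchanged SQL statement can receive a new access path, here is a minimal sketch using Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN. The table emp, the index emp_dept_ix, and the helper plan() are hypothetical names chosen for this example, and the exact plan wording depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO emp (dept, salary) VALUES (?, ?)",
    [(f"D{i % 50}", 1000.0 + i) for i in range(5000)],
)

# The application SQL never changes in this example.
QUERY = "SELECT id FROM emp WHERE dept = 'D7'"


def plan(connection, sql):
    """Return the optimizer's plan description for sql as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, QUERY)   # no index on dept yet: the table is scanned
conn.execute("CREATE INDEX emp_dept_ix ON emp (dept)")
plan_after = plan(conn, QUERY)    # same SQL, but the optimizer can now use the index

print(plan_before)
print(plan_after)
```

Only the physical design changed (an index was added); the query text stayed the same, which is the point of physical data independence.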
To Optimize SQL
The relational optimizer must analyze each SQL
statement by parsing it to determine the tables and
columns that must be accessed.

The optimizer will access statistics stored by the RDBMS
in either the system catalog or the database objects
themselves.
Every RDBMS has an embedded
relational optimizer that renders SQL
statements into executable access
paths.
Modern relational optimizers are cost based,
meaning that the optimizer will attempt to
formulate an access path for each query that
reduces overall cost.
CPU and I/O Costs
The optimizer can arrive at a rough estimate of the
CPU time required to run the query using each
access path it analyzes.
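
The idea of comparing access paths by estimated cost can be sketched as follows. This is a toy model, not the formula of any real optimizer: the AccessPath class, the two weights, and the page and row counts are all assumptions made up for illustration, and the I/O term is included only because the slide title mentions it.

```python
from dataclasses import dataclass


@dataclass
class AccessPath:
    name: str
    pages_read: int      # estimated I/O: pages that must be fetched
    rows_examined: int   # estimated CPU: rows that must be evaluated


# Hypothetical weights; a real optimizer uses its own internal constants.
IO_COST_PER_PAGE = 1.0
CPU_COST_PER_ROW = 0.01


def estimated_cost(path):
    """Combine the I/O and CPU estimates into a single cost figure."""
    return path.pages_read * IO_COST_PER_PAGE + path.rows_examined * CPU_COST_PER_ROW


def cheapest(paths):
    """Pick the candidate access path with the lowest estimated cost."""
    return min(paths, key=estimated_cost)


candidates = [
    AccessPath("full table scan", pages_read=1000, rows_examined=100_000),
    AccessPath("index lookup", pages_read=40, rows_examined=500),
]
best = cheapest(candidates)
print(best.name, estimated_cost(best))
```

With these made-up numbers the index lookup wins, but different statistics could just as easily favor the scan, which is why the estimates depend so heavily on accurate statistics.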
Database Statistics
A relational optimizer is of little use without accurate
statistics about the data stored in the database. The DBMS
provides a utility program or command to gather statistics
about database objects and to store them for use by the
optimizer.
The DBA should collect fresh statistics whenever a
significant volume of data has been added or modified.
Failure to do so will result in the optimizer basing its cost
estimates on inaccurate statistics, which may be detrimental
to query performance.
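
As one concrete example of such a command, SQLite provides ANALYZE, which stores its results in the sqlite_stat1 catalog table. The sketch below uses Python's built-in sqlite3 module; the orders table and orders_status_ix index are hypothetical, and other DBMSs use different commands and catalog tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE INDEX orders_status_ix ON orders (status)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("open" if i % 10 == 0 else "closed",) for i in range(1000)],
)

# Gather statistics; SQLite stores them in the sqlite_stat1 catalog table.
conn.execute("ANALYZE")

stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

After a large load or update, running the statistics command again refreshes the figures the optimizer relies on.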
The DBMS collects statistical information such as:
Number of unique values stored in the column

Most frequently occurring values for columns

Index key density

Details on the ratio of clustering for clustered tables

Correlation of columns to other columns

Structural state of the index or tablespace

Amount of storage used by the database object
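
Two of the statistics listed above, the number of unique values and the most frequently occurring values, can be sketched for a single column like this. The function column_statistics and the sample data are hypothetical; a real DBMS typically samples the data and stores the results in its catalog.

```python
from collections import Counter


def column_statistics(values, top_n=3):
    """Compute simple per-column statistics from a list of column values."""
    counts = Counter(values)
    return {
        "row_count": len(values),
        "distinct_values": len(counts),
        "most_frequent": counts.most_common(top_n),
    }


status_column = ["closed"] * 6 + ["open"] * 3 + ["hold"]
stats = column_statistics(status_column)
print(stats)
```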
Query Analysis
The optimizer scans the SQL statement to determine its
overall complexity. The formulation of the SQL statement
is a significant factor in determining the access paths
chosen by the optimizer.

The complexity of the query, the number and
type of predicates, the presence of functions, and the
presence of ordering clauses enter into the estimated cost
that is calculated by the optimizer.

During query analysis, the optimizer determines:

Which tables in which database are required

Whether any views are required to be broken down into
underlying tables

Whether table joins or subselects are required

Which indexes, if any, can be used

How many predicates must be satisfied

Which functions must be executed

Whether the SQL uses OR or AND



How the DBMS processes each component of the SQL
statement

How much memory has been assigned to the data caches
used by the tables in the SQL statement

How much memory is available for sorting if the query
requires a sort.
Density
Density is the average percentage of duplicate values
stored in the index key column; it is recorded as a
percentage.
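
The slide does not give a formula for density. One widely used convention, assumed here, computes it as 1 divided by the number of distinct key values, expressed as a percentage; a particular DBMS may define it differently. The function name key_density_percent is hypothetical.

```python
def key_density_percent(key_values):
    """Density as 100 / (number of distinct key values).

    Assumption: this follows the common 1 / distinct-values convention,
    which may not match the exact definition used by a given DBMS.
    """
    distinct = len(set(key_values))
    if distinct == 0:
        raise ValueError("density is undefined for an empty column")
    return 100.0 / distinct


index_keys = ["A", "A", "B", "C", "D", "D"]
density = key_density_percent(index_keys)
print(density)
```

Under this convention, a lower density means a more selective key: with four distinct values, an equality predicate is expected to match about a quarter of the rows on average.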


Joins
Joining combines information from multiple tables.

When multiple tables are accessed, the optimizer
figures out how to combine the tables in the most efficient
manner.

When determining the access path for a join, the
optimizer must determine the order in which the tables will
be joined.

Choose the table to process first, called the OUTER TABLE.

A series of operations is performed on the outer table to
prepare it for joining.

Rows from that table are then combined with rows from
the second table, called the INNER TABLE.
Two Common Join Methods

Nested-loop join

Merge-scan join
Nested-loop Join
Works by comparing qualifying rows of the outer
table to the inner table. A qualifying row is identified in the
outer table, and then the inner table is scanned for a
match.
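
The nested-loop method described above can be sketched over in-memory rows. This is a simplified illustration, not how any particular DBMS implements it; the function nested_loop_join, its qualifies parameter, and the sample tables are hypothetical.

```python
def nested_loop_join(outer_rows, inner_rows, key, qualifies=lambda row: True):
    """Join two lists of dict rows on a shared key column."""
    joined = []
    for outer in outer_rows:          # one pass over the outer table
        if not qualifies(outer):      # skip outer rows that do not qualify
            continue
        for inner in inner_rows:      # scan the inner table for matches
            if inner[key] == outer[key]:
                joined.append({**outer, **inner})
    return joined


departments = [
    {"dept_id": 1, "dept": "Eng"},
    {"dept_id": 2, "dept": "Ops"},
    {"dept_id": 3, "dept": "Legal"},
]
employees = [
    {"dept_id": 1, "name": "Ann"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
]

rows = nested_loop_join(
    departments, employees, "dept_id",
    qualifies=lambda d: d["dept"] != "Ops",
)
print(rows)
```

The inner table is scanned once for every qualifying outer row, so the choice of outer table and the availability of an index on the inner table's join column strongly affect the cost.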
Merge-scan Join
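The tables to be joined are ordered by the join keys. This
ordering can be accomplished by a sort or by access via an
index. Once both tables are in key order, each is read
sequentially and rows with matching keys are combined.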
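
A minimal sketch of the merge-scan method follows. Here the ordering step is done with an in-memory sort; in a DBMS it might instead come from an index. The function merge_scan_join and the sample rows are hypothetical.

```python
from operator import itemgetter


def merge_scan_join(left_rows, right_rows, key):
    """Sort both inputs on the join key, then merge them in one pass."""
    left = sorted(left_rows, key=itemgetter(key))
    right = sorted(right_rows, key=itemgetter(key))
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1                    # left side is behind: advance it
        elif left_key > right_key:
            j += 1                    # right side is behind: advance it
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][key] == left_key:
                for right_row in right[j:j_end]:
                    joined.append({**left[i], **right_row})
                i += 1
            j = j_end
    return joined


departments = [{"dept_id": 2, "dept": "Ops"}, {"dept_id": 1, "dept": "Eng"}]
employees = [
    {"dept_id": 1, "name": "Ann"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
    {"dept_id": 3, "name": "Di"},
]

merged = merge_scan_join(departments, employees, "dept_id")
print(merged)
```

After the ordering step, each input is read only once, which is why this method can be attractive when both inputs are large or already sorted.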
Join Order
The optimizer reviews each join in a query and
analyzes the appropriate statistics to determine the optimal
order in which the tables should be accessed to complete
the join.
To find the optimal join access path, the optimizer uses
built-in algorithms containing knowledge about joins and
data volume.
It matches this intelligence against the join
predicates, database statistics, and available indexes to
estimate which order is most efficient.
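
To make the idea of comparing join orders concrete, here is a toy sketch that tries every order of three tables and scores each by the estimated size of its intermediate results. Everything in it is an assumption for illustration: the table names, row counts, selectivities, and the cost measure. Real optimizers use more sophisticated search strategies and cost formulas than exhaustive enumeration.

```python
from itertools import permutations

# Hypothetical statistics: row counts per table.
ROW_COUNTS = {"customers": 1_000, "orders": 50_000, "regions": 10}

# Hypothetical join selectivities for pairs that share a join predicate.
# Pairs not listed have no predicate and behave like a cross product.
SELECTIVITY = {
    frozenset(("customers", "orders")): 1 / 1_000,
    frozenset(("customers", "regions")): 1 / 10,
}


def plan_cost(order):
    """Sum the estimated intermediate result sizes for a left-to-right join order."""
    joined = [order[0]]
    rows = float(ROW_COUNTS[order[0]])
    cost = 0.0
    for table in order[1:]:
        selectivity = 1.0
        for previous in joined:
            selectivity *= SELECTIVITY.get(frozenset((previous, table)), 1.0)
        rows = rows * ROW_COUNTS[table] * selectivity
        cost += rows
        joined.append(table)
    return cost


best_order = min(permutations(ROW_COUNTS), key=plan_cost)
print(best_order, plan_cost(best_order))
```

With these made-up figures, orders that join the small regions table to customers first keep the intermediate results small, while orders that pair two tables with no join predicate are penalized heavily.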
