
CALCULATION OF COST OF NESTED LOOP JOINS

Sometimes while analyzing SQL for tuning you may come across scenarios where you feel that the CBO is not producing the optimal plan, even though you have tried all the possible combinations of tuning the database and the SQL queries. In that scenario there are two options left to the performance engineer:

1. Use hints to force the optimizer to use your desired plan.


2. Analyze why the CBO is making a different plan and what changes are required to put the CBO in a situation where it picks up the optimal plan on its own.

The first option seems pretty simple, but it will affect the performance of the query under different data load conditions. Suppose you have tuned it for a high-volume data condition; it may not be acceptable for low-volume data, or vice versa. So this type of fix is recommended only when you are left with no other option.

The second option is to take a dump of the optimizer plans and find out how the optimizer is building the plan, and what can be changed or added so that it picks a much more efficient and desirable plan. That requires some detailed analysis.

This paper focuses on this second approach: how the CBO calculates the cost of SQL plans, so that you can make the changes that lead the optimizer to pick a different plan. With this understanding you will be able to find out which factors affect the CBO cost calculations and which can suitably be changed to make sure that the optimizer picks the correct plan.

PREREQUISITES

One of the most important prerequisites for this process is that you are using the CBO; for the CBO to calculate the cost of your plan it is essential that all your tables and indexes are analyzed, so that the dictionary views contain accurate information.

Furthermore, to analyze a particular statement the following needs to be done:

ALTER SESSION SET SQL_TRACE=TRUE;


ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';

You just don’t see any of the plans which “lost out” – unless you activate the 10053 event trace. There you see all the access plans the CBO evaluated and the costs assigned to them. Event 10053 details the choices made by the CBO in evaluating the execution path for a query. It externalizes most of the information that the optimizer uses in generating a plan for a query.
HOW TO SET EVENT 10053

1. For your own session


ALTER SESSION SET EVENTS '10053 trace name context forever [, level {1|2}]'
ALTER SESSION SET EVENTS '10053 trace name context off'

2. For another session


SYS.DBMS_SYSTEM.SET_EV (<sid>, <serial#>, 10053, {1|2}, '')
SYS.DBMS_SYSTEM.SET_EV (<sid>, <serial#>, 10053, 0, '')
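To trace another session you first need its sid and serial#. A minimal sketch, with an illustrative username and the returned identifiers plugged into SET_EV:

-- Find the target session's identifiers
SELECT sid, serial#
FROM v$session
WHERE username = 'APP_USER';   -- hypothetical user

-- Then switch the 10053 trace on at level 1 for that session
EXECUTE SYS.DBMS_SYSTEM.SET_EV(123, 45678, 10053, 1, '')   -- 123/45678: the sid/serial# returned above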

Unlike other events, where higher levels mean more detail, the 10053 event trace at level 2 produces less detail than the trace at level 1. Like the sql_trace (a.k.a. 10046 event trace), the 10053 event trace is written to user_dump_dest.
The trace is only generated if the query is parsed by the cost based optimizer. Note that this entails two conditions: the query must be (hard) parsed, and it must be parsed by the CBO. If the session for which the 10053 trace has been enabled is executing only SQL that is already parsed and is being reused, no trace is produced. Likewise, if the SQL statement is parsed by the rule based optimizer, the trace output will consist of only the SQL query and none of the other information.
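For reference, the location of the trace files can be read from the instance parameters:

-- 10053 (and 10046) trace files are written to user_dump_dest
SELECT value
FROM v$parameter
WHERE name = 'user_dump_dest';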

TRACE CONTENTS

The trace consists of 6 sections:


• Query
• Parameters used by the optimizer
• Base Statistical Information
• Base Table Access Cost
• General Plans
• Recosting for special features

INTERPRETATION OF TRACE FILES

Query

This part of the trace contains the SQL you want to trace. If the query is parsed by the rule based optimizer, the trace file ends right here. The rule based optimizer will be used in the following cases:

· An explicit RULE hint is provided
· Optimizer_mode or optimizer_goal is set to rule
· No dictionary information is present, i.e. none of the tables are analyzed
PARSING IN CURSOR #1 len=69 dep=0 uid=82 oct=42 lid=82 tim=1021693927016571 hv=552079412
ad='308995f8'
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1'
END OF STMT
PARSE #1:c=0,e=126,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1021693927016559
EXEC #1:c=0,e=176,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1021693927016870
*** 2003-02-25 18:10:58.251
QUERY
explain plan set statement_id = 'Q' for
SELECT OCT.OPO_OMRPLANID,
OCT.ITR_REQUIREMENT_NO,
OCT.ITR_VERSION_NO,
ITM.DTY_DECLTYPE_FK,
ITR.LBL_BILLNO_FK,
ITM.TACC_IND,
ITR.COMMCLR_IND
FROM OP_CURPLAN_TRANSIMPREQS OCT,
OP_TRANSIMP_MOVES ITM,
OP_TRANSIMP_REQS ITR
WHERE OCT.OPO_OMRPLANID=:B1
AND OCT.ITR_VERSION_NO=ITM.ITR_VERSION_NO
AND OCT.ITR_REQUIREMENT_NO=ITM.ITR_REQUIREMENT_NO
AND ITR.VERSION_NO=ITM.ITR_VERSION_NO
AND ITR.REQUIREMENT_NO=ITM.ITR_REQUIREMENT_NO

Terminology Used

Parsing in Cursor:
len    Length of the query
dep    Recursive depth of the query
uid    User ID parsing the query
oct    Oracle command type
lid    Privilege user ID
tim    Timestamp at the end of the parse
hv     Hash value of the query
ad     Library cache address of the cursor

Parse / Exec:
c      CPU time
e      Elapsed time
p      Number of physical blocks read
cr     Blocks read in consistent mode
cu     Blocks read in current mode
mis    Library cache misses
r      Number of rows processed
dep    Recursive depth
og     Optimizer goal
tim    Time taken to parse or execute the query
DESCRIPTION OF TABLES

OP_CURPLAN_TRANSIMPREQS (OCT): ITR_REQUIREMENT_NO, ITR_VERSION_NO, OPO_OMRPLANID, CURRENT_IND

OP_TRANSIMP_MOVES (ITM): ITR_REQUIREMENT_NO, ITR_VERSION_NO, ITEM_NO, MOVE_SEQNO, MOVE_TYPE, DTY_DECLTYPE_FK, PREFERRED_MODE, CREATEDBY, CREATE_DATETM, LASTUPDATEDBY, LASTUPDATE_DATETM, FAC_FROMFACILITY_FK, FAC_TOFACILITY_FK, FAC_DECLENDFCLTY_FK, CUS_DECLARANTCODE_FK, DECLARANT_NAME, COMBINE_DECL_IND, BONDED_CARRIER_ID, APP_DATETM, EARL_PICKUP_DATETM, PRV_DECL_NO, CRSNGTPT_MEANS_ID, CRSNGTPT_MEANS_NAT, RELEASE_PARTY_CODE, RELEASE_PARTY_NAME, DECL_REF, RCC_PRVCUSTOFFCODE_FK, PRV_CUSTOMSOFF_NAME, RCC_DEPCUSTOMSOFFCODE_FK, DEP_CUSTOMSOFF_NAME, RCC_DESCUSTOMSOFFCODE_FK, DES_CUSTOMSOFF_NAME, DECL_CNTRY_DEST, DECL_CNTRY_DSPTCH, DECL_CNSG_NAME, DECL_CNSG_ADDR1, DECL_CNSG_ADDR2, DECL_CNSG_ADDR3, DECL_CNSG_ADDR4, POSTCODE, CNT_COUNTRYCODE_FK, CONTACT_PERSON, JOBTITLE, PHONE_NO, DECL_REMARK_1, DECL_REMARK_2, DECL_REMARK_3, DECL_INSTR_NO, DECL_INSTRITEM_NO, DECLARATION_NO, DECL_STATUS_CODE, TACC_IND, DECL_INSTR_STATUS, LAST_ISSCANC_BY, LAST_ISSCANC_DATETM, TACC_UPDATEDBY, TACC_UPDATED_DATETM, DECLARANT, SEPARATE_PICKUP_DATETM, CUSTOFF_DEST_CNTRY, CUST_PICK_DEL_REF

OP_TRANSIMP_REQS (ITR): REQUIREMENT_NO, VERSION_NO, GAT_ACTIVITYTYPE_FK, LOC_DISCHARGEPORT_FK, LBL_BILLNO_FK, LOC_PLD_FK, OST_ORGAREAID_FK, IMPHAULAGE, SHIPMENT_TYPE, VADCD_IND, CREATEDBY, CREATE_DATETM, LASTUPDATEDBY, LASTUPDATE_DATETM, EQM_EQUIPMENTNO_FK, ROT_OWNINGTEAM_FK, SHO_SHIPMENTOWNER_FK, LOC_INTERMEDIATELOC_FK, RESTITUTION_LOCATION, WHARFAGE_CLR_NO, CUS_CONSIGNEE_FK, CONSIGNEE_NAME, CUS_CNPARTY1_FK, CNPARTY1_NAME, CUS_CNPARTY2_FK, CNPARTY2_NAME, DEST_CUST_REF1, DEST_CUST_REF2, DEST_CUST_REF3, INTERNAL_NOTES1, INTERNAL_NOTES2, INTERNAL_NOTES3, STOWAGE_REQUEST, TPT_INTERNAL_NOTES1, TPT_INTERNAL_NOTES2, TPT_INTERNAL_NOTES3, DECL_INSTR_NO, COMMCLR_IND, COMMCLR_EXPIRY_DATETM, COMMCLR_UPDATEDBY, COMMCLR_UPDATED_DATETM, CONFIRMED_IND, BILLTYPE, BILL_SPLIT_IND, PORT_OCC_NO
PARAMETERS USED BY THE OPTIMIZER

In this section of the trace the optimizer lists all the init.ora parameters that have an influence on the access plan. The list changes from Oracle version to version; this is the list for 9.2. The description of these parameters can be found in the Oracle documentation.

OPTIMIZER_FEATURES_ENABLE = 9.2.0 _UNNEST_SUBQUERY = TRUE


OPTIMIZER_MODE/GOAL = Choose _PUSH_JOIN_UNION_VIEW = TRUE
_OPTIMIZER_PERCENT_PARALLEL = 101 _FAST_FULL_SCAN_ENABLED = TRUE
HASH_AREA_SIZE = 1048576 _OPTIM_ENHANCE_NNULL_DETECTION = TRUE
HASH_JOIN_ENABLED = TRUE _ORDERED_NESTED_LOOP = TRUE
HASH_MULTIBLOCK_IO_COUNT = 0 _NESTED_LOOP_FUDGE = 100
SORT_AREA_SIZE = 524288 _NO_OR_EXPANSION = FALSE
OPTIMIZER_SEARCH_LIMIT = 5 _QUERY_COST_REWRITE = TRUE
PARTITION_VIEW_ENABLED = FALSE QUERY_REWRITE_EXPRESSION = TRUE
_ALWAYS_STAR_TRANSFORMATION = FALSE _IMPROVED_ROW_LENGTH_ENABLED = TRUE
_B_TREE_BITMAP_PLANS = TRUE _USE_NOSEGMENT_INDEXES = FALSE
STAR_TRANSFORMATION_ENABLED = FALSE _ENABLE_TYPE_DEP_SELECTIVITY = TRUE
_COMPLEX_VIEW_MERGING = TRUE _IMPROVED_OUTERJOIN_CARD = TRUE
_PUSH_JOIN_PREDICATE = TRUE _OPTIMIZER_ADJUST_FOR_NULLS = TRUE
PARALLEL_BROADCAST_ENABLED = TRUE _OPTIMIZER_CHOOSE_PERMUTATION = 0
OPTIMIZER_MAX_PERMUTATIONS = 2000 _USE_COLUMN_STATS_FOR_FUNCTION = TRUE
OPTIMIZER_INDEX_CACHING = 0 _SUBQUERY_PRUNING_ENABLED = TRUE
_SYSTEM_INDEX_CACHING = 0 _SUBQUERY_PRUNING_REDUCTION_FACTOR = 50
OPTIMIZER_INDEX_COST_ADJ = 100 _SUBQUERY_PRUNING_COST_FACTOR = 20
OPTIMIZER_DYNAMIC_SAMPLING = 1 _LIKE_WITH_BIND_AS_EQUALITY = FALSE
_OPTIMIZER_DYN_SMP_BLKS = 32 _TABLE_SCAN_COST_PLUS_ONE = TRUE
QUERY_REWRITE_ENABLED = FALSE _SORTMERGE_INEQUALITY_JOIN_OFF = FALSE
QUERY_REWRITE_INTEGRITY = ENFORCED _DEFAULT_NON_EQUALITY_SEL_CHECK = TRUE
_INDEX_JOIN_ENABLED = TRUE _ONESIDE_COLSTAT_FOR_EQUIJOINS = TRUE
_SORT_ELIMINATION_COST_RATIO = 0 _OPTIMIZER_COST_MODEL = CHOOSE
_OR_EXPAND_NVL_PREDICATE = TRUE _GSETS_ALWAYS_USE_TEMPTABLES = FALSE
_NEW_INITIAL_JOIN_ORDERS = TRUE DB_FILE_MULTIBLOCK_READ_COUNT = 16
ALWAYS_ANTI_JOIN = CHOOSE _NEW_SORT_COST_ESTIMATE = TRUE
ALWAYS_SEMI_JOIN = CHOOSE _GS_ANTI_SEMI_JOIN_ALLOWED = TRUE
_OPTIMIZER_MODE_FORCE = TRUE _CPU_TO_IO = 0
_OPTIMIZER_UNDO_CHANGES = FALSE _PRED_MOVE_AROUND = TRUE
BASE STATISTICAL INFORMATION

Table stats Table: OP_TRANSIMP_REQS Alias: ITR
TOTAL :: CDN: 509576 NBLKS: 70091 AVG_ROW_LEN: 939
Column: VERSION_NO Col#: 2 Table: OP_TRANSIMP_REQS Alias: ITR
NDV: 5 NULLS: 0 DENS: 2.0000e-01 LO: 1 HI: 5
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: REQUIREMEN Col#: 1 Table: OP_TRANSIMP_REQS Alias: ITR
NDV: 502424 NULLS: 0 DENS: 1.9904e-06 LO: 1 HI: 13019515
NO HISTOGRAM: #BKT: 1 #VAL: 2
-- Index stats
INDEX NAME: OP_TRANSIMP_REQS_PK COL#: 1 2
TOTAL :: LVLS: 2 #LB: 1183 #DK: 509576 LB/K: 1 DB/K: 1 CLUF: 86847

Table stats Table: OP_TRANSIMP_MOVES Alias: ITM
TOTAL :: CDN: 1126964 NBLKS: 180975 AVG_ROW_LEN: 1155
Column: ITR_VERSIO Col#: 2 Table: OP_TRANSIMP_MOVES Alias: ITM
NDV: 5 NULLS: 0 DENS: 2.0000e-01 LO: 1 HI: 5
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: ITR_REQUIR Col#: 1 Table: OP_TRANSIMP_MOVES Alias: ITM
NDV: 376424 NULLS: 0 DENS: 2.6566e-06 LO: 1 HI: 13019515
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: ITR_VERSIO Col#: 2 Table: OP_TRANSIMP_MOVES Alias: ITM
NDV: 5 NULLS: 0 DENS: 2.0000e-01 LO: 1 HI: 5
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: ITR_REQUIR Col#: 1 Table: OP_TRANSIMP_MOVES Alias: ITM
NDV: 376424 NULLS: 0 DENS: 2.6566e-06 LO: 1 HI: 13019515
NO HISTOGRAM: #BKT: 1 #VAL: 2
-- Index stats
INDEX NAME: OP_TRANSIMP_MOVES_NU1 COL#: 1 2
TOTAL :: LVLS: 2 #LB: 3651 #DK: 1126964 LB/K: 1 DB/K: 1 CLUF: 198364
INDEX NAME: OP_TRANSIMP_MOVES_PK COL#: 1 2 3
TOTAL :: LVLS: 2 #LB: 3038 #DK: 1126964 LB/K: 1 DB/K: 1 CLUF: 198364

Table stats Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT
TOTAL :: CDN: 24736 NBLKS: 84 AVG_ROW_LEN: 20
Column: ITR_REQUIR Col#: 1 Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT
NDV: 16424 NULLS: 0 DENS: 6.0887e-05 LO: 13000128 HI: 13019515
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: ITR_VERSIO Col#: 2 Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT
NDV: 5 NULLS: 0 DENS: 2.0000e-01 LO: 1 HI: 5
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: ITR_VERSIO Col#: 2 Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT
NDV: 5 NULLS: 0 DENS: 2.0000e-01 LO: 1 HI: 5
NO HISTOGRAM: #BKT: 1 #VAL: 2
Column: ITR_REQUIR Col#: 1 Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT
NDV: 16424 NULLS: 0 DENS: 6.0887e-05 LO: 13000128 HI: 13019515
NO HISTOGRAM: #BKT: 1 #VAL: 2
-- Index stats
INDEX NAME: OP_CURPLAN_TRANSIMPREQS_NU1 COL#: 1 4 2
TOTAL :: LVLS: 1 #LB: 95 #DK: 23574 LB/K: 1 DB/K: 1 CLUF: 9913
INDEX NAME: OP_CURPLAN_TRANSIMPREQS_PK COL#: 3 2 1
TOTAL :: LVLS: 1 #LB: 133 #DK: 24736 LB/K: 1 DB/K: 1 CLUF: 9069
_OPTIMIZER_PERCENT_PARALLEL = 0

Terminology Used

For Tables:
CDN            Cardinality (number of rows in the table)
NBLKS          The number of blocks below the high water mark
AVG_ROW_LEN    Average length of a row

For Indexes:
Index#, col#   The object# of the index and the column_id of the columns. Oracle 9 brings an improvement by using the index name rather than the index#.
LVLS           The height of the index b-tree
#LB            The number of leaf blocks
#DK            The number of distinct keys of the index
LB/K           The average number of leaf blocks per key
DB/K           The average number of data blocks per key
CLUF           The clustering factor of the index
BASE TABLE ACCESS COST

Now the optimizer uses this information to evaluate access plans. First the CBO looks at the different possibilities and costs of accessing each of the tables in the SQL by itself, taking into consideration all applicable predicates except join predicates.
Generally the optimizer considers table scan, index unique scan, index range scan, index and-equal, and index fast full scan methods.

Single Table Access Paths

SINGLE TABLE ACCESS PATH
Column: OPO_OMRPLA Col#: 3 Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT
NDV: 24732 NULLS: 0 DENS: 4.0433e-05 LO: 2098117 HI: 2191648
NO HISTOGRAM: #BKT: 1 #VAL: 2
TABLE: OP_CURPLAN_TRANSIMPREQS ORIG CDN: 24736 ROUNDED CDN: 1 CMPTD CDN: 1
Access path: tsc Resc: 10 Resp: 10
Access path: index (iff)
Index: OP_CURPLAN_TRANSIMPREQS_PK
TABLE: OP_CURPLAN_TRANSIMPREQS
RSC_CPU: 0 RSC_IO: 14
IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00
Access path: iff Resc: 14 Resp: 14
Skip scan: ss-sel 0 andv 5
ss cost 5
index io scan cost 0
Access path: index (index-only)
Index: OP_CURPLAN_TRANSIMPREQS_PK
TABLE: OP_CURPLAN_TRANSIMPREQS
RSC_CPU: 0 RSC_IO: 2
IX_SEL: 4.0433e-05 TB_SEL: 4.0433e-05
BEST_CST: 2.00 PATH: 4 Degree: 1

SINGLE TABLE ACCESS PATH
TABLE: OP_TRANSIMP_MOVES ORIG CDN: 1126964 ROUNDED CDN: 1126964 CMPTD CDN: 1126964
Access path: tsc Resc: 17407 Resp: 17407
BEST_CST: 17407.00 PATH: 2 Degree: 1

SINGLE TABLE ACCESS PATH
TABLE: OP_TRANSIMP_REQS ORIG CDN: 509576 ROUNDED CDN: 509576 CMPTD CDN: 509576
Access path: tsc Resc: 6743 Resp: 6743
BEST_CST: 6743.00 PATH: 2 Degree: 1
Table: OP_TRANSIMP_MOVES
Join index: 31106

Terminology Used

NDV      Number of distinct values for the column
NULLS    Number of rows with a null “value” for the column
DENS     Density of the column (1/NDV when there is no histogram on the column)
LO       The lowest value of the column (only for numeric columns)
HI       The highest value of the column (only for numeric columns)

The information regarding these values can be obtained from dba_tab_columns, if the corresponding tables are properly analyzed.
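As a cross-check outside the trace, the same figures can be pulled straight from the dictionary. A minimal sketch, assuming the tables have been analyzed:

-- Table and column statistics as the CBO sees them
SELECT t.num_rows AS cdn,
       t.blocks AS nblks,
       t.avg_row_len,
       c.column_name,
       c.num_distinct AS ndv,
       c.num_nulls AS nulls,
       c.density AS dens
FROM dba_tables t, dba_tab_columns c
WHERE t.owner = c.owner
AND t.table_name = c.table_name
AND t.table_name = 'OP_CURPLAN_TRANSIMPREQS';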

I will now describe the way the optimizer has calculated the cost of the single table access paths for OP_CURPLAN_TRANSIMPREQS.

Description of the individual sections:

Full Table Scan Cost

Column: OPO_OMRPLA Col#: 3 Table: OP_CURPLAN_TRANSIMPREQS Alias: OCT


NDV: 24732 NULLS: 0 DENS: 4.0433e-05 LO: 2098117 HI: 2191648
NO HISTOGRAM: #BKT: 1 #VAL: 2
TABLE: OP_CURPLAN_TRANSIMPREQS ORIG CDN: 24736 ROUNDED CDN: 1 CMPTD CDN: 1
Access path: tsc Resc: 10 Resp: 10

DENS = 1/NDV => 4.0433e-05

CMPTD CDN = FF * ORIG CDN => DENS * ORIG CDN => 4.0433e-05 * 24736 => 1 (FF => Filter Factor, find the details below)

Tsc Resc (Table Scan Cost) = NBLKS / effective DB_FILE_MULTIBLOCK_READ_COUNT

This is a somewhat rounded calculation. From it you can calculate the effective value of DB_FILE_MULTIBLOCK_READ_COUNT being used for this database/OS:

effective DB_FILE_MULTIBLOCK_READ_COUNT = NBLKS / Tsc Resc = 84/10 => 8.4

This effective value (the “k” factor) is smaller than the actual DB_FILE_MULTIBLOCK_READ_COUNT setting, and the values it takes for the various settings can be established empirically in the same way.

Keep in mind that the thus established k factor is only used in the CBO’s estimate of the cost of a full table scan – or index fast full scan. The actual I/O cost of a full table scan depends on other factors besides db_file_multiblock_read_count, such as proper extent planning and management, and whether, and how many, data blocks of the table are already present in the buffer pool.
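As a sketch, assuming the k factor of 8.4 derived above, the CBO’s full table scan estimate can be reproduced from the dictionary:

-- CBO-style FTS cost estimate: blocks below the HWM divided by the
-- effective multiblock read count (8.4 is the value derived above)
SELECT table_name,
       blocks,
       CEIL(blocks / 8.4) AS estimated_tsc_cost   -- 84/8.4 => 10
FROM dba_tables
WHERE table_name = 'OP_CURPLAN_TRANSIMPREQS';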
PREDICATES AND FILTER FACTORS

In order to understand the index access cost calculations it is necessary to discuss filter factors
and their relationship to the query’s predicates. A filter factor is a number between 0 and 1 and,
in a nutshell, is a measure for the selectivity of a predicate, or, in mathematical terms, the
probability that a particular row will match a predicate or set of predicates. If a column has 10
distinct values in a table and a query is looking for all rows where the column is equal to one of
the values, you intuitively expect that it will return 1/10 of all rows, presuming an equal
distribution. That is exactly the filter factor of a single column for an equal predicate:

FF = 1/NDV = density

Both statistics, NDV (a.k.a. num_distinct) and density, are in dba_tab_columns, but the optimizer uses the value of density in most of its calculations. This has ramifications, as we will see.
Here is the relationship between predicates and the resulting filter factors:

WITHOUT BIND VARIABLES


predicate                    Filter factor
c1 = value                   1/c1.num_distinct
c1 like value                1/c1.num_distinct
c1 > value                   (Hi - value) / (Hi - Lo)
c1 >= value                  (Hi - value) / (Hi - Lo) + 1/c1.num_distinct
c1 < value                   (value - Lo) / (Hi - Lo)
c1 <= value                  (value - Lo) / (Hi - Lo) + 1/c1.num_distinct
c1 between val1 and val2     (val2 - val1) / (Hi - Lo) + 2 * 1/c1.num_distinct

WHEN USING BIND VARIABLES

predicate                               Filter factor
col1 = :b1                              col1.density
col1 {like | > | >= | < | <=} :b1       {5.0000e-02 | col1.density}
col1 between :b1 and :b2                5.0000e-02 * 5.0000e-02
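To tie this back to our trace: the only non-join predicate in the query, OCT.OPO_OMRPLANID = :B1, is a bind variable equality, so its filter factor is simply the column’s density. A minimal sketch of the resulting cardinality arithmetic:

-- FF = density of OPO_OMRPLANID = 4.0433e-05 (bind variable equality)
-- CMPTD CDN = FF * ORIG CDN, exactly as the trace shows
SELECT ROUND(4.0433e-05 * 24736) AS cmptd_cdn FROM dual;   -- => 1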

COMBINING PREDICATES / FILTER FACTORS

Predicate                       Filter factor
predicate 1 and predicate 2     FF1 * FF2
predicate 1 or predicate 2      FF1 + FF2 - FF1 * FF2
The rules on how filter factors are calculated when predicates are combined with “and” or “or”
are straight out of probability theory. Given that fact, it should be noted that probability theory
stipulates that these formulas for combining probabilities are only valid if “the predicates are
independent”.
Just as the basic column densities presume a uniform distribution of the distinct column values,
the CBO cost calculations for a plan presume independence of the predicates. But while there is
some remedy in the form of histograms if the data distribution is not uniform, there is no remedy
for cases where the predicates are not independent.
For the calculation of the filter factors for ranges of string literals, the value of the literal is the weighted sum of the ASCII values of its characters. Strings of different lengths are “right padded” with zeros:
ADAMS = 65*256^4 + 68*256^3 + 65*256^2 + 77*256^1 + 83*256^0 = 2.8032e+11
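The weighted sum is easy to verify (A=65, D=68, M=77, S=83 are the standard ASCII codes):

-- 'ADAMS' as the CBO values it for range filter factor arithmetic
SELECT 65*POWER(256,4) + 68*POWER(256,3) + 65*POWER(256,2)
       + 77*256 + 83 AS adams_value
FROM dual;   -- => 280318004563, i.e. about 2.8032e+11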
COLUMN STATS WITH HISTOGRAM

In the absence of histogram data, all filter factors are derived from the density of the columns
involved, or from fixed values if no statistics have been gathered at all. Histograms complicate
the filter factor calculations and go beyond the scope of this paper. In spite of that, we’ll take a
brief look at what changes when histograms are calculated on a column and correct the record
on a couple of myths on the way. Histograms are gathered when the option

ANALYZE:                          “for {all {indexed}} columns {col1 {, col2…}} {size [n|75]}”

DBMS_STATS.GATHER_TABLE_STATS:    “for {all {indexed}} columns size n {col1 {, col2…}}”

is used with the analyze command or the DBMS_STATS.GATHER_TABLE_STATS procedure.


Note the subtle difference in the phrasing between the two methods.
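A minimal sketch of the two methods side by side on one of our tables; the column and bucket count are illustrative, the point is the placement of SIZE:

-- ANALYZE: the size clause follows the column list
ANALYZE TABLE op_transimp_reqs COMPUTE STATISTICS
FOR COLUMNS version_no SIZE 75;

-- DBMS_STATS: the size clause precedes the column list
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'OP_TRANSIMP_REQS',
    method_opt => 'FOR COLUMNS SIZE 75 VERSION_NO');
END;
/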

There are two types of histograms:

• Value-Based Histograms
The number of buckets is equal to the number of distinct values and the “endpoint” of each
bucket is the number of rows in the table with that value. Oracle builds a value based histogram
if the size in the analyze for a column is larger than the number of distinct values for the column.

• Height-Based Histograms

Height-based histograms place approximately the same number of values into each bucket, so
that the endpoints of the bucket are determined by how many values are in that bucket. Oracle
builds a height based histogram if the size in the analyze for a column is smaller than the
number of distinct values for the column.
A commonly held belief is that histograms are useless, i.e. have no effect on the access plan, if bind variables are used, since the value is not known at parse time and the CBO – histograms are only ever used by the cost based optimizer – cannot determine from the histogram whether it should use an available index or not. While the latter is true, the gathering of histograms can still change the access plan. Why and how?

Because
a) the optimizer uses the density in its filter factor calculation, not NDV, and
b) the density is calculated differently for columns with histograms, not simply as 1/NDV.
If the density changes, the costs of plan segments and the cardinality estimates of row sources
change and hence the entire plan may change. I have successfully exploited that aspect of
histograms in tuning. Another popular myth is that there is no point in gathering histograms on
non-indexed columns, likely born from the assumption that the only role of a histogram is to let
the optimizer decide between a tablescan and an index access. However, the CBO uses filter
factors, derived from column densities, to calculate the costs of different access plans, and
ultimately choose an access plan; and filter factors are used in two places in this calculation of
access plan costs:

1. In the calculation of index access costs.


2. In the calculation of row source cardinalities.

In the latter calculation, the filter factors of predicates on non-indexed columns do get used.
What is more, the row source cardinality has ultimately the more decisive effect as it guides the
composition of the overall access plan. In my experience, the cause for a poor access plan is
more often the incorrect estimate of the cardinality of a row source than the incorrect estimate of
an index access cost.

INDEX ACCESS COSTS

Having discussed filter factors, we are now ready to look at the other part of the single table
access path evaluation – the calculation of the cost of accessing the needed rows via an index.

Access path: index (iff)


Index: OP_CURPLAN_TRANSIMPREQS_PK
TABLE: OP_CURPLAN_TRANSIMPREQS
RSC_CPU: 0 RSC_IO: 14
IX_SEL: 0.0000e+00 TB_SEL: 1.0000e+00
Access path: iff Resc: 14 Resp: 14

The formula used to calculate the cost of an index fast full scan is the same as that for a full table scan:
(blevel + leaf blocks) / effective DB_FILE_MULTIBLOCK_READ_COUNT

INDEX NAME: OP_CURPLAN_TRANSIMPREQS_PK COL#: 3 2 1


TOTAL: LVLS: 1 #LB: 133 #DK: 24736 LB/K: 1 DB/K: 1 CLUF: 9069

So the cost of the IFF scan is:

iff Resc = (1+133)/10.3 => 13.01 => 14

Access path: index (index-only)


Index: OP_CURPLAN_TRANSIMPREQS_PK
TABLE: OP_CURPLAN_TRANSIMPREQS
RSC_CPU: 0 RSC_IO: 2
IX_SEL: 4.0433e-05 TB_SEL: 4.0433e-05
BEST_CST: 2.00 PATH: 4 Degree: 1

For a unique primary key scan:

Cost = blevel+1

INDEX NAME: OP_CURPLAN_TRANSIMPREQS_PK COL#: 3 2 1


TOTAL : LVLS: 1 #LB: 133 #DK: 24736 LB/K: 1 DB/K: 1 CLUF: 9069

RSC_IO = blevel + 1
=2

In general, the cost calculations for the different index access methods are:

Unique scan          blevel + 1
Fast full scan       leaf_blocks / effective DB_FILE_MULTIBLOCK_READ_COUNT
Index-only           blevel + FF * leaf_blocks
Index range scan     blevel + FF * leaf_blocks + FF * clustering_factor
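Plugging the OP_CURPLAN_TRANSIMPREQS_PK statistics (LVLS: 1, #LB: 133, CLUF: 9069) and the filter factor 4.0433e-05 into these formulas reproduces the trace’s index-only cost; the range scan line is a sketch of what the clustering factor term would add:

-- Index-only: blevel + FF*leaf_blocks, rounded up (trace: BEST_CST: 2.00)
SELECT CEIL(1 + 4.0433e-05 * 133) AS index_only_cost FROM dual;   -- => 2

-- Range scan: blevel + FF*leaf_blocks + FF*clustering_factor
SELECT CEIL(1 + 4.0433e-05 * 133 + 4.0433e-05 * 9069) AS range_scan_cost
FROM dual;   -- => 2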
Conclusions that can be drawn from these cost formulae:

· Leaf_blocks contributes to all but the unique index access cost. Index compression, where appropriate, reduces the number of leaf blocks, lowering the index access costs, and can therefore result in the CBO using an index where before it did not.
· Except for a unique index access, the height of the index (blevel) contributes negligibly to the cost.
· The clustering factor affects only an index range scan, but then heavily, given that it is typically orders of magnitude bigger than LEAF_BLOCKS.

The most important thing that the 10053 trace proves is:

· An index is not used – indeed, not even considered and entered into any plan cost calculation – if its leading column is not among the predicates.

DEFAULT TABLE, INDEX, AND COLUMN STATISTICS

Remember that the rule based optimizer parses statements where none of the tables have statistics. What if only one, or a few but not all, tables, indexes, or columns have no statistics? There are different claims for what Oracle does in that case. The most popular is that Oracle uses the rule based optimizer for such a table. But parsing is not a mix and match exercise – a statement is either parsed entirely by the CBO or entirely by the RBO. If at least one table in the query has statistics (and the optimizer goal is not rule) then the cost based optimizer does parse the query. Another claim is that Oracle will dynamically, at runtime, estimate statistics for the objects without statistics. I have not seen any evidence of that.

Let us examine what the 10053 trace shows if the statistics on any of the tables are deleted, or if a particular table is not analyzed:

1. AVG_ROW_LEN for “NOT ANALYZED” tables defaults to 100


2. NBLKS and hence TABLE_SCAN_COST are identical to those of the analyzed table. How
is that possible?

The answer is actually quite simple and has interesting ramifications:


The NBLKS statistic for an unanalyzed table is taken from the table’s segment header and is
therefore more accurate than the NBLKS statistic of an analyzed table which may be stale.
The access plan of a query on analyzed tables does not change as long as the statistics and the
init.ora parameters do not change (i.e. as long as the tables are not re-analyzed), even if the
tables grow or change in other significant ways. This is no longer true if one of the tables is
“unanalyzed”. A change in its size (NBLKS) is immediately reflected not only in its TSC but also,
as will be demonstrated shortly, in the defaults for the table’s cardinality and column densities
and can therefore result in a change of access plan.
As a corollary, if you are using DBMS_STATS to transport the statistics from a production
database to a test database in order to do your SQL analysis and tuning there, watch out for
tables without statistics. Since CBO uses the actual number of blocks from the segment header,
in this case from the test database rather than production, you can easily get different access
plans.
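Before transporting statistics it is therefore worth checking for tables that have none. A minimal sketch, with an illustrative schema name:

-- Tables with no statistics: for these the CBO falls back on defaults
-- derived from the actual segment header (NBLKS)
SELECT table_name
FROM dba_tables
WHERE owner = 'APP_OWNER'        -- hypothetical schema
AND last_analyzed IS NULL;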

3. The cardinality is a function of a mixture of actual (NBLKS) and default (AVG_ROW_LEN) values:

CDN = NBLKS * (db_block_size - block overhead) / AVG_ROW_LEN

4. The statistics for unanalyzed indexes default to:

LVLS  #LB  #DK  LB/K  DB/K  CLUF
1     25   100  1     1     800

To find the default column statistics, remember that column statistics are not listed under “BASE
STATISTICAL INFORMATION” but under “SINGLE TABLE ACCESS PATH”:

The defaults for NDV and DENS do not look like nice round defaults like the ones for index statistics. Note also that the density is not 1/NDV. Checking the default column statistics for differently sized tables confirms the suspicion that the column statistics defaults, like the table statistics defaults, are not static but are derived from the NBLKS value. Examining and plotting the default column density of tables of different sizes in a scatter diagram against the tables’ number of blocks shows not only a correlation but a clear functional dependency:

density = 0.7821 * NBLKS^(-0.9992), or practically density = 0.7821 / NBLKS

Note that again this is an empirically derived formula. For small values of NBLKS it does not
yield the exact same densities as observed in the 10053 trace. Note also that the equation of
the correlation function between density and NBLKS is different for different db_block_size
values. The actual formula is not really important, but the fact that there is a dependency of the
default column density on NBLKS and db_block_size is interesting.
Similar to filter factors for range predicates with bind variables, the optimizer uses defaults for
missing/unknown statistics of “not analyzed” tables, indexes, or columns. These defaults range
from plain static values (index statistics and avg_row_size) to actual values (NBLKS) and, as we
have seen, some values (CDN and column densities) derived from NBLKS using complicated
formulas. The analysis of these default values was done on Oracle 8.1.7. It should surprise nobody if some of the default values or calculations have changed, and will continue to change, from Oracle release to release.

GENERAL PLANS

This concludes the single table costing part of the 10053 CBO trace. The next section in the
10053 event trace starts with the heading “GENERAL PLANS”. For all but the simplest SQL
statements, this section makes up the largest part of the trace. This is where the CBO looks at the costs of all the different orders and ways of joining the individual tables and comes up with the best access plan.
The cost based optimizer has three join methods in its arsenal. These are the three join
methods and their costing formulas:

1. NL - NESTED LOOP JOIN


Join cost = cost of accessing outer table
+ (cardinality of outer table * cost of accessing inner table )

2. SM – SORT MERGE JOIN


Join cost = (cost of accessing outer table + outer sort cost)
+ (cost of accessing inner table + inner sort cost)

3. HA – HASH JOIN
Join cost = (cost of accessing outer table)
+ (Cost of building hash table)
+ (Cost of accessing inner table)
We will look at the join costing of each method for our simple query.
JOIN ORDER [N]

In the GENERAL PLANS section, the optimizer evaluates every possible permutation of the
tables in the query. An exception is tables in a correlated subquery where it is semantically
impossible to access the table(s) in the subquery before any tables of the outer query and thus
not all permutations constitute a valid plan. Apart from this situation, each permutation
evaluation in the trace is given a number and is then followed by the list of the tables – with their
aliases to distinguish them if a table occurs multiple times in the query.
The initial join order is chosen by ordering the tables in order of increasing computed cardinality.
In this simple case, the CBO is examining the cost of accessing first the
OP_CURPLAN_TRANSIMPREQS table and then joining the OP_TRANSIMP_REQS table and
finally joining the table OP_TRANSIMP_MOVES.
”Now Joining:” is always the beginning of the next batch of join evaluations, introducing the table
joining the fray – no pun intended.
Join order[1]: OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_REQS [ITR] OP_TRANSIMP_MOVES [ITM]
Now joining: OP_TRANSIMP_REQS [ITR] *******
NL Join
Outer table: cost: 2 cdn: 1 rcz: 12 resp: 2
Inner table: OP_TRANSIMP_REQS
Access path: tsc Resc: 6743
Join: Resc: 6745 Resp: 6745
Join cardinality: 509658 = outer (1) * inner (509576) * sel (1.0000e+00) [flag=0]
Best NL cost: 6745 resp: 6745
Join result: cost: 6745 cdn: 509658 rcz: 37
Now joining: OP_TRANSIMP_MOVES [ITM] *******
NL Join
Outer table: cost: 6745 cdn: 509658 rcz: 37 resp: 6745
Inner table: OP_TRANSIMP_MOVES
Access path: tsc Resc: 17407
Join: Resc: 8871623551 Resp: 8871623551
Access path: index (scan)
Index: OP_TRANSIMP_MOVES_NU1
TABLE: OP_TRANSIMP_MOVES
RSC_CPU: 0 RSC_IO: 3
IX_SEL: 5.3132e-07 TB_SEL: 2.8230e-13
Join: resc: 1535719 resp: 1535719
Access path: index (scan)
Index: OP_TRANSIMP_MOVES_PK
TABLE: OP_TRANSIMP_MOVES
RSC_CPU: 0 RSC_IO: 3
IX_SEL: 5.3132e-07 TB_SEL: 2.8230e-13
Join: resc: 1535719 resp: 1535719
Join cardinality: 1 = outer (509658) * inner (1126964) * sel (1.0427e-12) [flag=0]
Using index (ndv = 509576 sel = 3.9807e-07)
Best NL cost: 1535719 resp: 1535719
SM Join
Outer table:
resc: 6745 cdn: 509658 rcz: 37 deg: 1 resp: 6745
Inner table: OP_TRANSIMP_MOVES
resc: 17407 cdn: 1126964 rcz: 12 deg: 1 resp: 17407
using join:1 distribution:2 #groups:1
SORT resource Sort statistics
Sort width: 2 Area size: 131072 Max Area size: 1257472 Degree: 1
Blocks to Sort: 3183 Row size: 51 Rows: 509658
Initial runs: 21 Merge passes: 5 IO Cost / pass: 4775
Total IO sort cost: 13529
Total CPU sort cost: 0
Total Temp space used: 53208000
SORT resource Sort statistics
Sort width: 2 Area size: 131072 Max Area size: 1257472 Degree: 1
Blocks to Sort: 3312 Row size: 24 Rows: 1126964
Initial runs: 22 Merge passes: 5 IO Cost / pass: 4968
Total IO sort cost: 14076
Total CPU sort cost: 0
Total Temp space used: 54322000
Merge join Cost: 51757 Resp: 51757
HA Join
Outer table:
resc: 6745 cdn: 509658 rcz: 37 deg: 1 resp: 6745
Inner table: OP_TRANSIMP_MOVES
resc: 17407 cdn: 1126964 rcz: 12 deg: 1 resp: 17407
using join:8 distribution:2 #groups:1
Hash join one ptn Resc: 1355 Deg: 1
hash_area: 60 (max=307) buildfrag: 3049 probefrag: 3302 ppasses: 1
Hash join Resc: 25507 Resp: 25507
Join result: cost: 25507 cdn: 1 rcz: 49
Best so far: TABLE#: 0 CST: 2 CDN: 1 BYTES: 12
Best so far: TABLE#: 1 CST: 6745 CDN: 509658 BYTES: 18857346
Best so far: TABLE#: 2 CST: 25507 CDN: 1 BYTES: 49
Join order[2]: OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_MOVES [ITM] OP_TRANSIMP_REQS [ITR]
Now joining: OP_TRANSIMP_MOVES [ITM] *******
NL Join
Outer table: cost: 2 cdn: 1 rcz: 12 resp: 2
Inner table: OP_TRANSIMP_MOVES
Access path: tsc Resc: 17407
Join: Resc: 17409 Resp: 17409
Access path: index (scan)
Index: OP_TRANSIMP_MOVES_NU1
TABLE: OP_TRANSIMP_MOVES
RSC_CPU: 0 RSC_IO: 3
IX_SEL: 5.3132e-07 TB_SEL: 5.3132e-07
Join: resc: 5 resp: 5
Access path: index (scan)
Index: OP_TRANSIMP_MOVES_PK
TABLE: OP_TRANSIMP_MOVES
RSC_CPU: 0 RSC_IO: 3
IX_SEL: 5.3132e-07 TB_SEL: 5.3132e-07
Join: resc: 5 resp: 5
Join cardinality: 1 = outer (1) * inner (1126964) * sel (5.3132e-07) [flag=0]
Best NL cost: 5 resp: 5
SM Join
Outer table:
resc: 2 cdn: 1 rcz: 12 deg: 1 resp: 2
Inner table: OP_TRANSIMP_MOVES
resc: 17407 cdn: 1126964 rcz: 12 deg: 1 resp: 17407
using join:1 distribution:2 #groups:1
SORT resource Sort statistics
Sort width: 2 Area size: 131072 Max Area size: 1257472 Degree: 1
Blocks to Sort: 3312 Row size: 24 Rows: 1126964
Initial runs: 22 Merge passes: 5 IO Cost / pass: 4968
Total IO sort cost: 14076
Total CPU sort cost: 0
Total Temp space used: 54322000
Merge join Cost: 31485 Resp: 31485
HA Join
Outer table:
resc: 2 cdn: 1 rcz: 12 deg: 1 resp: 2
Inner table: OP_TRANSIMP_MOVES
resc: 17407 cdn: 1126964 rcz: 12 deg: 1 resp: 17407
using join:8 distribution:2 #groups:1
Hash join one ptn Resc: 12 Deg: 1
hash_area: 60 (max=307) buildfrag: 1 probefrag: 3302 ppasses: 1
Hash join Resc: 17421 Resp: 17421
Join result: cost: 5 cdn: 1 rcz: 24
Now joining: OP_TRANSIMP_REQS [ITR] *******
NL Join
Outer table: cost: 5 cdn: 1 rcz: 24 resp: 5
Inner table: OP_TRANSIMP_REQS
Access path: tsc Resc: 6743
Join: Resc: 6748 Resp: 6748
Access path: index (unique)
Index: OP_TRANSIMP_REQS_PK
TABLE: OP_TRANSIMP_REQS
RSC_CPU: 0 RSC_IO: 2
IX_SEL: 1.9624e-06 TB_SEL: 1.9624e-06
Join: resc: 7 resp: 7
Access path: index (eq-unique)
Index: OP_TRANSIMP_REQS_PK
TABLE: OP_TRANSIMP_REQS
RSC_CPU: 0 RSC_IO: 2
IX_SEL: 0.0000e+00 TB_SEL: 0.0000e+00
Join: resc: 7 resp: 7
Join cardinality: 1 = outer (1) * inner (509576) * sel (1.9624e-06) [flag=0]
Using index (ndv = 509576 sel = 3.9807e-07)
Best NL cost: 7 resp: 7
SM Join
Outer table:
resc: 5 cdn: 1 rcz: 24 deg: 1 resp: 5
Inner table: OP_TRANSIMP_REQS
resc: 6743 cdn: 509576 rcz: 25 deg: 1 resp: 6743
using join:1 distribution:2 #groups:1
SORT resource Sort statistics
Sort width: 2 Area size: 131072 Max Area size: 1257472 Degree: 1
Blocks to Sort: 1 Row size: 37 Rows: 1
Initial runs: 1 Merge passes: 1 IO Cost / pass: 2
Total IO sort cost: 2
Total CPU sort cost: 0
Total Temp space used: 0
SORT resource Sort statistics
Sort width: 2 Area size: 131072 Max Area size: 1257472 Degree: 1
Blocks to Sort: 2371 Row size: 38 Rows: 509576
Initial runs: 16 Merge passes: 4 IO Cost / pass: 3557
Total IO sort cost: 8300
Total CPU sort cost: 0
Total Temp space used: 40936000
Merge join Cost: 15049 Resp: 15049
HA Join
Outer table:
resc: 5 cdn: 1 rcz: 24 deg: 1 resp: 5
Inner table: OP_TRANSIMP_REQS
resc: 6743 cdn: 509576 rcz: 25 deg: 1 resp: 6743
using join:8 distribution:2 #groups:1
Hash join one ptn Resc: 8 Deg: 1
hash_area: 60 (max=307) buildfrag: 1 probefrag: 2302 ppasses: 1
Hash join Resc: 6756 Resp: 6756
Join result: cost: 7 cdn: 1 rcz: 49
Best so far: TABLE#: 0 CST: 2 CDN: 1 BYTES: 12
Best so far: TABLE#: 2 CST: 5 CDN: 1 BYTES: 24
Best so far: TABLE#: 1 CST: 7 CDN: 1 BYTES: 49

Join order[3]: OP_TRANSIMP_REQS [ITR] OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_MOVES [ITM]


Join order[4]: OP_TRANSIMP_REQS [ITR] OP_TRANSIMP_MOVES [ITM] OP_CURPLAN_TRANSIMPREQS [OCT]
Join order[5]: OP_TRANSIMP_MOVES [ITM] OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_REQS [ITR]
Join order[6]: OP_TRANSIMP_MOVES [ITM] OP_TRANSIMP_REQS [ITR] OP_CURPLAN_TRANSIMPREQS [OCT]

Final:
CST: 7 CDN: 1 RSC: 7 RSP: 7 BYTES: 49
IO-RSC: 7 IO-RSP: 7 CPU-RSC: 0 CPU-RSP: 0

I am going to explain the second join order, as it is the best one, giving the lowest cost. You can work through the first join order in the same way.

Join order[2]: OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_MOVES [ITM] OP_TRANSIMP_REQS [ITR]


Now joining: OP_TRANSIMP_MOVES [ITM] *******
NL Join
Outer table: cost: 2 cdn: 1 rcz: 12 resp: 2
Inner table: OP_TRANSIMP_MOVES
Access path: tsc Resc: 17407
Join: Resc: 17409 Resp: 17409
Access path: index (scan)
Index: OP_TRANSIMP_MOVES_NU1
TABLE: OP_TRANSIMP_MOVES
RSC_CPU: 0 RSC_IO: 3
IX_SEL: 5.3132e-07 TB_SEL: 5.3132e-07
Join: resc: 5 resp: 5
Access path: index (scan)
Index: OP_TRANSIMP_MOVES_PK
TABLE: OP_TRANSIMP_MOVES
RSC_CPU: 0 RSC_IO: 3
IX_SEL: 5.3132e-07 TB_SEL: 5.3132e-07
Join: resc: 5 resp: 5
Join cardinality: 1 = outer (1) * inner (1126964) * sel (5.3132e-07) [flag=0]
Best NL cost: 5 resp: 5
As we have the initialization parameters set as:
OPTIMIZER_INDEX_CACHING = 0
OPTIMIZER_INDEX_COST_ADJ = 100
the optimizer considers scans on indexes and on tables to be equally costly, and that is the reason this plan does not arrive at the optimum cost. If these parameters are set to OPTIMIZER_INDEX_CACHING = 90 and OPTIMIZER_INDEX_COST_ADJ = 30 then the cost would be less. I have tested both scenarios and the details are as follows:

Alter session set OPTIMIZER_INDEX_CACHING = 90;


Alter session set OPTIMIZER_INDEX_COST_ADJ = 30;
Set autot trace exp
SELECT OCT.OPO_OMRPLANID,
OCT.ITR_REQUIREMENT_NO,
OCT.ITR_VERSION_NO,
ITM.DTY_DECLTYPE_FK,
ITR.LBL_BILLNO_FK,
ITM.TACC_IND,
ITR.COMMCLR_IND
FROM OP_CURPLAN_TRANSIMPREQS OCT,
OP_TRANSIMP_MOVES ITM,
OP_TRANSIMP_REQS ITR
WHERE OCT.OPO_OMRPLANID=:B1
AND OCT.ITR_VERSION_NO=ITM.ITR_VERSION_NO
AND OCT.ITR_REQUIREMENT_NO=ITM.ITR_REQUIREMENT_NO
AND ITR.VERSION_NO=ITM.ITR_VERSION_NO
AND ITR.REQUIREMENT_NO=ITM.ITR_REQUIREMENT_NO;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=49)
1 0 NESTED LOOPS (Cost=4 Card=1 Bytes=49)
2 1 NESTED LOOPS (Cost=3 Card=1 Bytes=24)
3 2 INDEX (RANGE SCAN) OF 'OP_CURPLAN_TRANSIMPREQS_PK' (UNIQUE) (Cost=2 Card=1 Bytes=12)
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'OP_TRANSIMP_MOVES' (Cost=2 Card=1 Bytes=12)
5 4 INDEX (RANGE SCAN) OF 'OP_TRANSIMP_MOVES_PK' (UNIQUE) (Cost=1 Card=1)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'OP_TRANSIMP_REQS' (Cost=2 Card=1 Bytes=25)
7 6 INDEX (UNIQUE SCAN) OF 'OP_TRANSIMP_REQS_PK' (UNIQUE)

The reason the cost comes out lower here is that the index scan costs are scaled down in the cost calculations for the columns used in the WHERE clause. In our case below, the cost is higher because index scans and table scans are given equal weightage and are considered equally costly; also, in the case below the optimizer does not assume that index blocks are cached in preference to table blocks, so it does not expect to find them in the cache.
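The arithmetic behind OPTIMIZER_INDEX_COST_ADJ is straightforward: the computed cost of an index access path is scaled by the parameter value taken as a percentage (OPTIMIZER_INDEX_CACHING additionally discounts the fraction of index blocks assumed to be cached). A sketch:

-- With OPTIMIZER_INDEX_COST_ADJ = 30, an index access path costed at
-- 3 I/Os is recosted at 30% of that:
SELECT CEIL(3 * 30 / 100) AS adjusted_cost FROM dual;   -- => 1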

If these parameters are left at 0/100, the plan comes out as follows (as in our case):
Alter session set OPTIMIZER_INDEX_CACHING = 0;
Alter session set OPTIMIZER_INDEX_COST_ADJ = 100;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=49)
1 0 NESTED LOOPS (Cost=7 Card=1 Bytes=49)
2 1 NESTED LOOPS (Cost=5 Card=1 Bytes=24)
3 2 INDEX (RANGE SCAN) OF 'OP_CURPLAN_TRANSIMPREQS_PK' (UNIQUE) (Cost=2 Card=1 Bytes=12)
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'OP_TRANSIMP_MOVES' (Cost=3 Card=1 Bytes=12)
5 4 INDEX (RANGE SCAN) OF 'OP_TRANSIMP_MOVES_NU1' (NON-UNIQUE) (Cost=2 Card=1)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'OP_TRANSIMP_REQS' (Cost=2 Card=1 Bytes=25)
7 6 INDEX (UNIQUE SCAN) OF 'OP_TRANSIMP_REQS_PK' (UNIQUE) (Cost=1 Card=1)

This is the plan we get with the parameters at their default settings of 0/100. If these parameters are changed from 0/100 to 90/30, the cost drops from 7 to 4, a reduction of roughly 43%.
The rest of the join is self-explanatory:
Join cardinality = cardinality of outer * cardinality of inner * selectivity of inner table
Cost = cost of outer + cardinality of outer * cost of inner
Cost = 2 + 1*3
     = 5

I am skipping the portions for hash and sort merge joins, as a lot more work is needed there to reach a conclusion on how Oracle calculates the cost of sorting, hashing, and finally merging.

Join result: cost: 5 cdn: 1 rcz: 24


Now joining: OP_TRANSIMP_REQS [ITR] *******
NL Join
Outer table: cost: 5 cdn: 1 rcz: 24 resp: 5
Inner table: OP_TRANSIMP_REQS
Access path: tsc Resc: 6743
Join: Resc: 6748 Resp: 6748
Access path: index (unique)
Index: OP_TRANSIMP_REQS_PK
TABLE: OP_TRANSIMP_REQS
RSC_CPU: 0 RSC_IO: 2
IX_SEL: 1.9624e-06 TB_SEL: 1.9624e-06
Join: resc: 7 resp: 7
Access path: index (eq-unique)
Index: OP_TRANSIMP_REQS_PK
TABLE: OP_TRANSIMP_REQS
RSC_CPU: 0 RSC_IO: 2
IX_SEL: 0.0000e+00 TB_SEL: 0.0000e+00
Join: resc: 7 resp: 7
Join cardinality: 1 = outer (1) * inner (509576) * sel (1.9624e-06) [flag=0]
Using index (ndv = 509576 sel = 3.9807e-07)
Best NL cost: 7 resp: 7

cost = cost of outer + cardinality of outer * cost of inner


= 5 + 1*2
=7
MULTI-TABLE JOINS

It joins 3 tables, enough to have some permutations of join orders to consider (6), but not so many that one gets lost following the trail. With 4 tables there would be 24 permutations, with 5 tables 120.
In general, there are n! (n factorial) possible permutations for joining n tables, so the number of permutations, and the cost and time to evaluate them, rises dramatically as the number of tables in the SQL increases. The init.ora parameter “optimizer_max_permutations” can be used to limit the number of permutations the CBO will evaluate.
There are also plenty of predicates to give rise to many different base table access considerations, but we are not going to go into those details. I only want to demonstrate the join permutations of our case as follows:

Join order[3]: OP_TRANSIMP_REQS [ITR] OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_MOVES [ITM]


Join order[4]: OP_TRANSIMP_REQS [ITR] OP_TRANSIMP_MOVES [ITM] OP_CURPLAN_TRANSIMPREQS [OCT]
Join order[5]: OP_TRANSIMP_MOVES [ITM] OP_CURPLAN_TRANSIMPREQS [OCT] OP_TRANSIMP_REQS [ITR]
Join order[6]: OP_TRANSIMP_MOVES [ITM] OP_TRANSIMP_REQS [ITR] OP_CURPLAN_TRANSIMPREQS [OCT]

Join order[1] and join order[2] here are completely evaluated by the CBO. Once one of the remaining join orders starts and the CBO finds the cost of the join exceeding, at any stage, the optimum cost found so far (7 in our case, from join order[2]), it abandons that plan there and then; it does not evaluate the plan any further. Here the cost of accessing the first table by itself is already more than 7 (accessing the table OP_TRANSIMP_MOVES costs 17407 and accessing the table OP_TRANSIMP_REQS costs 6743), so there is no point in carrying forward and doing any join analysis for these join orders.
CONCLUSION

If you take one piece of advice from this paper then it is this:

· Pay close attention to the cardinality values in an explain plan. Wrong estimates by the CBO for the cardinalities can have a devastating effect on the choice of access plan and the performance of the SQL statement. Assuming for the moment that the estimate of the cardinality for any of the tables in the plan above is incorrect and too low, that would invalidate the entire plan costs and perhaps make the optimizer choose a different, and better, plan. Armed with the knowledge of how the CBO arrived at this number, you know what to change in order to help the optimizer make a better assumption: the filter factor of the combined predicates on this table, i.e. ultimately the densities and NDVs of the predicate columns. Here are some means of doing that:

· Use histograms on some columns and experiment with the sizes (number of buckets). Even if the histogram data itself is never used, the density of a column with a histogram changes. Generally, collecting a value-based histogram for a column significantly reduces its density, often by orders of magnitude, whereas collecting a height-based histogram increases the density.
· Delete the statistics of a column – now possible with the DBMS_STATS procedures – an index, or an entire table, and let the optimizer use the default densities.
· “Seed” a table with rows that artificially either increase the cardinality of a column (more than that of the table) and thus lower its density, or increase the cardinality of the table without changing that of the column and thus raise the column’s density.
· Use brute force and set the density directly. This is possible as of Oracle 8.0 with the advent of DBMS_STATS.SET_xxx_STATS. I recommend using export_table_stats to export the statistics into a STATTAB table, modifying the value(s) there, and then importing the statistics back into the dictionary, as sketched below. Of course you’ll make two exports – one into a “work” statid and one into a “backup” statid – so that you can restore the original statistics in case the change does not have the desired effect.
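A minimal sketch of that export/modify/import workflow; the table and statid names are illustrative:

BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(USER, 'STATTAB');
  -- backup copy, to be able to restore the original statistics
  DBMS_STATS.EXPORT_TABLE_STATS(USER, 'OP_TRANSIMP_REQS',
                                stattab => 'STATTAB', statid => 'BACKUP');
  -- working copy
  DBMS_STATS.EXPORT_TABLE_STATS(USER, 'OP_TRANSIMP_REQS',
                                stattab => 'STATTAB', statid => 'WORK');
  -- ... update the density/NDV values in STATTAB for statid 'WORK' ...
  DBMS_STATS.IMPORT_TABLE_STATS(USER, 'OP_TRANSIMP_REQS',
                                stattab => 'STATTAB', statid => 'WORK');
END;
/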

Submitted By:

Sumit Popli
Tata Consultancy Services
sumitp@delhi.tcs.co.in
popli_sumit@yahoo.com
