Introduction to SQL
What is SQL
SQL (Structured Query Language) is a computer
language for storing, manipulating, and retrieving
data stored in a relational database.
Avoid using DISTINCT: The DISTINCT keyword removes duplicate rows from the result
set. If duplicates cannot occur, or are acceptable in the result, avoid DISTINCT,
because eliminating duplicates adds sorting or hashing overhead.
Use Equivalent Data Types: Use the same data type and length when
comparing column values to host variables or literals. This eliminates the need
for data conversion. For example, when a character column is compared with a
numeric literal, Oracle implicitly converts the column values, and an ordinary
index on that column is then generally not used. The same applies when there is
a mismatch in data type between a column and a host variable that are being
compared.
Use BETWEEN instead of <= and >=:
The BETWEEN predicate states a closed range in a single predicate and is
equivalent to a pair of >= and <= predicates. Oracle's optimizer generally
treats the two forms alike, so the gain is mainly in readability rather than
in a different access path.
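Use the same letter case for all tables and views accessed in SELECT statements.
Oracle shares a parsed statement in the shared pool only when the SQL text matches
exactly, so consistent casing lets the shared pool be used efficiently.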
Limit updating indexed columns:
When columns in indexes are updated, a corresponding update is applied to
all indexes in which the column participates. This will reduce the
performance considerably because of additional I/O overhead.
Types Of Index
B-tree indexes
This is the standard tree index that Oracle has been using since the earliest
releases.
Bitmap indexes
Bitmap indexes are used where an index column has a relatively small
number of distinct values (low cardinality). They are very fast for read-only
databases, but are not suitable for systems with frequent updates.
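To illustrate why low cardinality suits a bitmap index, the following Python sketch keeps one bitmap per distinct column value and combines two bitmaps with a bitwise AND. It is a simplified model and not Oracle's storage format; the function names and the sample columns are hypothetical.

```python
def build_bitmap_index(column_values):
    """Return {distinct value: bitmap}, where bit i is set if row i has that value."""
    bitmaps = {}
    for row_no, value in enumerate(column_values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row_no)
    return bitmaps


def matching_rows(bitmap):
    """List the row numbers whose bit is set in the bitmap."""
    return [row_no for row_no in range(bitmap.bit_length()) if (bitmap >> row_no) & 1]


# Hypothetical sample data: two low-cardinality columns of a five-row table.
sex_index = build_bitmap_index(["M", "F", "F", "M", "F"])
status_index = build_bitmap_index(["A", "A", "I", "A", "A"])

# Rows with sex = 'F' AND status = 'A': a single bitwise AND of two bitmaps.
both = sex_index["F"] & status_index["A"]
print(matching_rows(both))
```

With few distinct values there are few bitmaps to keep, which is why the structure stays compact. Changing one row means rewriting the bitmaps that cover it, which hints at why frequent updates are a poor fit.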
Oracle manages the allocation of pointers within index blocks, and this is the
reason why we are unable to specify a PCTUSED value (the freelist re-link
threshold) for indexes. When we examine an index block structure, we see
that the number of entries within each index node is a function of two values:
1) The length of the symbolic key
2) The blocksize for the index tablespace
Each data block within the index contains "nodes" in the index tree, with the
bottom nodes (leaf blocks), containing pairs of symbolic keys and ROWID
values. As an Oracle tree grows (via inserting rows into the table), Oracle
fills the block, and when the block is full, it splits, creating new index nodes
(data blocks) to manage the symbolic keys within the index. Hence, an
Oracle index block may contain pointers to other index nodes or
ROWID/Symbolic-key pairs.
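The fill-and-split behavior described above can be sketched in Python. This is a toy model under stated assumptions: `capacity` is a hypothetical stand-in for the number of entries that fit in a block (which in Oracle depends on key length and block size), only the leaf level is modeled, and a full block is simply split in half.

```python
from bisect import insort


def insert_key(leaf_blocks, key, capacity):
    """Insert key into an ordered list of sorted leaf blocks, splitting a block that overflows."""
    # Pick the first block whose highest key is >= key, else the last block.
    target = len(leaf_blocks) - 1
    for position, block in enumerate(leaf_blocks):
        if block and key <= block[-1]:
            target = position
            break
    block = leaf_blocks[target]
    insort(block, key)
    if len(block) > capacity:
        # The block is over capacity: split it into two blocks (assumed 50/50 split).
        middle = len(block) // 2
        leaf_blocks[target:target + 1] = [block[:middle], block[middle:]]


leaf_blocks = [[]]
for key in [10, 20, 30, 40, 25, 50]:
    insert_key(leaf_blocks, key, capacity=4)
print(leaf_blocks)
```

A longer key or a smaller block size corresponds to a smaller `capacity` in this model, so the same inserts would produce more blocks.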
The EXPLAIN PLAN results let you see which execution plan the optimizer
selected, including the join methods and scan methods it uses. They also help you
understand the optimizer's decisions, such as why the optimizer chose a
nested loops join instead of a hash join.
To explain a SQL statement, use the EXPLAIN PLAN FOR clause
immediately before the statement.
For example:
EXPLAIN PLAN FOR SELECT last_name FROM employees;
You can display the PLAN_TABLE output by using the following SQL statement:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
Reading Explain Plan
What is COST ?
Cost of the operation as estimated by the optimizer's query approach. Cost
is not determined for table access operations. The value of this column does
not have any particular unit of measurement; it is merely a weighted value
used to compare costs of execution plans. The value of this column is a
function of the CPU_COST and IO_COST columns.
CPU_COST
CPU cost of the operation as estimated by the query optimizer's approach.
The value of this column is proportional to the number of machine cycles
required for the operation. For statements that use the rule-based approach,
this column is null.
IO_COST
I/O cost of the operation as estimated by the query optimizer's approach.
The value of this column is proportional to the number of data blocks read by
the operation. For statements that use the rule-based approach, this column
is null.
What is CARDINALITY ?
Estimate by the query optimization approach of the number of rows
accessed by the operation.
What is BYTES?
Estimate by the query optimization approach of the number of bytes
accessed by the operation.
The plan output has other columns as well, such as OBJECT_OWNER, which gives the
owner of the object being accessed, and OBJECT_NAME.
Significance of Access Paths
To choose an access path, the optimizer first determines which access paths
are available by examining the conditions in the statement's WHERE clause.
The optimizer then generates a set of possible execution plans using
available access paths and estimates the cost of each plan using the
statistics for the index, columns, and tables accessible to the statement. The
optimizer then chooses the execution plan with the lowest estimated cost.
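The final selection step can be sketched in a few lines of Python. The candidate plans and their cost figures below are made up for illustration; the real optimizer derives its estimates from statistics, which this sketch does not model.

```python
def choose_plan(candidate_plans):
    """Return the candidate plan with the lowest estimated cost."""
    return min(candidate_plans, key=lambda plan: plan["cost"])


# Hypothetical candidates for one statement, with invented cost estimates.
candidates = [
    {"access_path": "full table scan", "cost": 120},
    {"access_path": "index range scan", "cost": 35},
    {"access_path": "index fast full scan", "cost": 60},
]
best = choose_plan(candidates)
print(best["access_path"])
```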
A full table scan reads all rows from a table and filters out those that do not
meet the selection criteria.
When Oracle performs a full table scan, the blocks are read sequentially.
Each block is read only once.
Why a Full Table Scan Is Faster for Accessing Large Amounts of Data ?
This is because full table scans can use larger I/O calls, and making fewer
large I/O calls is cheaper than making many smaller calls.
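A rough way to see the difference is to count I/O calls. The Python sketch below assumes a hypothetical table of 10,000 blocks and a multiblock read count of 16; both numbers are illustrative, and the model ignores caching and everything else that affects real timings.

```python
import math


def multiblock_io_calls(table_blocks, multiblock_read_count):
    """I/O calls needed to read every block when each call fetches several blocks."""
    return math.ceil(table_blocks / multiblock_read_count)


def single_block_io_calls(blocks_visited):
    """I/O calls needed when each call fetches exactly one block."""
    return blocks_visited


# Hypothetical figures: a 10,000-block table read 16 blocks per call.
full_scan_calls = multiblock_io_calls(10_000, 16)
one_at_a_time_calls = single_block_io_calls(10_000)
print(full_scan_calls, one_at_a_time_calls)
```

Under these assumptions the full scan issues far fewer calls to cover the same blocks, which is the effect described above.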
The rowid of a row specifies the datafile and data block containing the row
and the location of the row in that block. Locating a row by specifying its
rowid is the fastest way to retrieve a single row, because the exact location
of the row in the database is specified.
To access a table by rowid, Oracle first obtains the rowids of the selected
rows, either from the statement's WHERE clause or through an index scan
of one or more of the table's indexes. Oracle then locates each selected row
in the table based on its rowid.
Table access by rowid is generally the second step after retrieving the rowid from an index.
The table access might be required for any columns in the statement not
present in the index.
Access by rowid does not need to follow every index scan. If the index
contains all the columns needed for the statement, then table access by
rowid might not occur.
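The two-step access, and the case where the second step is skipped, can be sketched in Python. Everything here is a hypothetical model: the table is a list of blocks, a rowid is reduced to a (block number, slot number) pair, and the index is a plain dictionary on empno.

```python
# Hypothetical table stored as blocks; a rowid here is a (block_no, slot_no) pair.
TABLE_BLOCKS = [
    [{"empno": 1, "ename": "ADAMS"}, {"empno": 2, "ename": "BAKER"}],
    [{"empno": 3, "ename": "CLARK"}],
]
# Hypothetical index on empno: key -> rowid.
EMPNO_INDEX = {1: (0, 0), 2: (0, 1), 3: (1, 0)}
INDEXED_COLUMNS = {"empno"}


def fetch(empno, columns):
    """Return (row, steps): the requested columns and the access steps that were needed."""
    steps = ["index scan"]
    rowid = EMPNO_INDEX.get(empno)
    if rowid is None:
        return None, steps
    if set(columns) <= INDEXED_COLUMNS:
        # Every requested column is in the index: no table access is needed.
        return {"empno": empno}, steps
    block_no, slot_no = rowid
    row = TABLE_BLOCKS[block_no][slot_no]
    steps.append("table access by rowid")
    return {name: row[name] for name in columns}, steps


print(fetch(3, ["empno"]))
print(fetch(3, ["empno", "ename"]))
```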
INDEX SCAN
In this method, a row is retrieved by traversing the index, using the indexed
column values specified by the statement. An index scan retrieves data from
an index based on the value of one or more columns in the index. To
perform an index scan, Oracle searches the index for the indexed column
values accessed by the statement. If the statement accesses only columns
of the index, then Oracle reads the indexed column values directly from the
index, rather than from the table.
Full Scans
Index Joins
Bitmap Indexes
Assessing I/O for Blocks, not Rows
Oracle does I/O by blocks. Therefore, the optimizer's decision to use full
table scans is influenced by the percentage of blocks accessed, not
rows. This is called the index clustering factor. If blocks contain single
rows, then rows accessed and blocks accessed are the same.
Case 1:
The index clustering factor is low for the rows as they are arranged in the
following diagram.
Block 1    Block 2    Block 3
-------    -------    -------
A A A      B B B      C C C
This is because the rows that have the same indexed column values for col1
are located within the same physical blocks in the table. The cost of using a
range scan to return all of the rows that have the value A is low, because
only one block in the table needs to be read.
Case 2:
If the same rows in the table are rearranged so that the index values are
scattered across the table blocks, then the index clustering factor is higher.
This is because all three blocks in the table must be read in order to retrieve
all rows with the value A in col1.
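The two cases can be compared with a small Python sketch. It uses a commonly described way of counting the clustering factor: walk the index entries in key order and count how often the table block changes. The Case 2 layout (one A, B, and C in every block) is an assumption, since only the Case 1 diagram is given above, and the function name is hypothetical.

```python
def clustering_factor(table_blocks):
    """Count block changes while walking index entries in (key, block, slot) order."""
    entries = sorted(
        (key, block_no, slot_no)
        for block_no, rows in enumerate(table_blocks)
        for slot_no, key in enumerate(rows)
    )
    factor = 0
    previous_block = None
    for _key, block_no, _slot_no in entries:
        if block_no != previous_block:
            factor += 1
            previous_block = block_no
    return factor


case_1 = [["A", "A", "A"], ["B", "B", "B"], ["C", "C", "C"]]
case_2 = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "C"]]  # assumed scattered layout

print(clustering_factor(case_1), clustering_factor(case_2))
```

In this model the clustered layout gives a factor equal to the number of blocks, while the scattered layout gives a factor equal to the number of rows.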
Index Unique Scan :
An index unique scan occurs when the Oracle database engine uses an
index to retrieve a single specific row from a table. This generally requires a
unique index on the column used in the equality condition.
SELECT *
FROM emp e
WHERE e.empno = 1234;
Explain Plan
SELECT STATEMENT ()
  TABLE ACCESS (BY INDEX ROWID) EMP
    INDEX (UNIQUE SCAN) UQ_EMP
Index Range Scan :
An index range scan occurs when the condition on the indexed column can match
more than one row, for example a range predicate such as >.
SELECT *
FROM emp e
WHERE e.empno > 1234;
Explain Plan
SELECT STATEMENT ()
  TABLE ACCESS (BY INDEX ROWID) EMP
    INDEX (RANGE SCAN) UQ_EMP
Index Skip Scan:
An index skip scan lets Oracle use a composite index even when the leading
column of the index is not specified in the query. It is typically faster than a
full scan of the index, because fewer reads are performed.
Consider, for example, a table employees (sex, employee_id, address) with
a composite index on (sex, employee_id). Oracle logically splits this composite
index into two subindexes, one for each value of the leading column:
The first subindex has the keys with the value F.
The second subindex has the keys with the value M.
The column sex is skipped in the following query:
SELECT * FROM employees WHERE employee_id = 101;
A complete scan of the index is not performed. Instead, the subindex with the
value F is searched first, followed by a search of the subindex with the value
M.
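The probing of each logical subindex can be sketched in Python. The composite index is modeled as a sorted list of (sex, employee_id) tuples, and `skip_scan` is a hypothetical helper; this is a sketch of the idea and not Oracle's implementation.

```python
from bisect import bisect_left, bisect_right


def skip_scan(index_entries, employee_id):
    """Probe each leading-column subindex of a sorted (sex, employee_id) index."""
    matches = []
    position = 0
    while position < len(index_entries):
        leading_value = index_entries[position][0]
        # Binary search inside the subindex for this leading value.
        probe = bisect_left(index_entries, (leading_value, employee_id), position)
        if probe < len(index_entries) and index_entries[probe] == (leading_value, employee_id):
            matches.append(index_entries[probe])
        # Skip to the first entry of the next leading value.
        position = bisect_right(index_entries, (leading_value, float("inf")), position)
    return matches


# Hypothetical composite index on (sex, employee_id), kept in sorted order.
index_entries = sorted([("F", 98), ("F", 100), ("F", 102), ("M", 99), ("M", 101), ("M", 103)])
print(skip_scan(index_entries, 101))
```

The F subindex is probed first and yields nothing, then the M subindex is probed and yields the match, so only a few entries are inspected.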
Fast Full Index Scans
Fast full index scans are an alternative to a full table scan when the index
contains all the columns that are needed for the query, and at least one
column in the index key has the NOT NULL constraint. A fast full scan
accesses the data in the index itself, without accessing the table. It cannot
be used to eliminate a sort operation, because the data is not ordered by the
index key. It reads the entire index using multiblock reads, unlike a full index
scan, and can be parallelized.
Because it uses multiblock I/O and can be parallelized just like a table scan,
a fast full scan is faster than a normal full index scan.
When an index contains all of the values required to satisfy the query,
table access is not required. The fast full index scan execution plan reads
the entire index with multiblock reads (using db_file_multiblock_read_count)
and returns the rows in unsorted order.
The fast full scan usually requires fewer physical I/Os than a full table scan,
allowing the query to be resolved faster.
Suppose there is a concatenated index on the columns empno, ename, and deptno:
SELECT e.empno,e.ename,e.deptno
FROM emp e
WHERE e.deptno = 30;
----Explain Plan----
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=5)
1 0 SORT (AGGREGATE)
2 1 INDEX (FAST FULL SCAN)
Since all of the columns in the SQL statement are in the index, a fast full
scan is available.
Index fast full scans are commonly performed during joins in which only the
indexed join key columns are queried.
Oracle Joining Methods :
Types of Joins:
Nested loop
Sort merge joins
Hash joins
Index joins
Nested loop :
Oracle reads the first row from the first row source and then checks the
second row source for matches.
All matches are then placed in the result set.
This continues until all rows in the first row source have been processed.
The first row source is often called the outer or driving table, and the
second row source is called the inner table.
It is the fastest method of returning the first rows from a join.
Suppose somebody gave you a telephone book and a list of 20 names
to look up, and asked you to write down each person’s name and
corresponding telephone number. You would probably go down the list
of names, looking up each one in the telephone book one at a time.
This task would be pretty easy because the telephone book is
alphabetized by name. Moreover, somebody looking over your
shoulder could begin calling the first few numbers you write down
while you are still looking up the rest. This scene describes a NESTED
LOOPS join.
NESTED LOOPS joins are ideal when the driving row source (the
records you are looking for) is small and the joined columns of the
inner row source are uniquely indexed or have a highly selective non-
unique index.
However, NESTED LOOPS joins can be very inefficient if the inner row
source (second table accessed) does not have an index on the joined
columns, if the index is not highly selective, or if the driving row source
(the records retrieved from the driving table) is quite large.
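The telephone book scene can be written as a short Python sketch. The dictionary stands in for an index on the inner row source, and the names and numbers are invented sample data. The function is a generator, so the first matches are available before the rest of the list has been processed.

```python
def nested_loops_join(driving_rows, inner_index):
    """Yield (name, number) pairs, looking up each driving row in the inner index."""
    for name in driving_rows:                      # outer loop: driving row source
        for number in inner_index.get(name, []):   # inner loop: indexed lookup
            yield name, number


# Hypothetical inner row source, indexed by name like a telephone book.
phone_book = {
    "Adams": ["555-0101"],
    "Baker": ["555-0102", "555-0199"],
}
names_to_look_up = ["Baker", "Clark", "Adams"]

results = nested_loops_join(names_to_look_up, phone_book)
first_match = next(results)        # available before the other names are looked up
remaining = list(results)
print(first_match, remaining)
```

Without the dictionary, each lookup would have to scan the whole inner row source, which is the inefficient case described above.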