
Design of Indexing

Sanjay Rajput
Indexing and Clustering

Indexes are one of the major levers for improving performance in DB2.


Changes to the indexes on the DB2 tables used in the SQL queries of a
batch application have a direct impact on query performance and
hence on the batch cycle time. Indexing and clustering should therefore
be treated as a vital part of any performance tuning exercise.

Let’s check out some of the important highlights of the concept of indexing and
clustering.

Points to Keep in Mind:

1) The primary advantage of indexes is the ability to process a small
percentage of the rows efficiently with minimal I/O and CPU usage.

2) A clustering index improves performance when processing a larger
percentage of rows (though still less than 30 to 50 percent).

3) Whenever DB2 loads or reorganizes a table, it must build or rebuild each
index on it.

4) The costs of changing indexes are often more than the costs of changing the
data.

5) Primary keys and foreign keys are often searched or joined over a small
percentage of rows and are good candidates for indexes. Indeed, the
primary key must have a unique index to guarantee unique values in the
column.

6) If there is no index on the foreign key, an update of a primary key value
requires a table space scan of each dependent table.

7) When a row is deleted from a parent table and no index exists on the
foreign key, it is necessary to do a table space scan on each dependent table
to enforce the delete rule.

8) Joins are often performed on the primary key and foreign key columns;
therefore, an index on these columns makes the join much more efficient in
most cases.

Sanjay Rajput, Syntel Inc 2


9) The keyword CLUSTER, specified when the index is created, instructs
DB2 to maintain the rows on the data pages in sequence according to the
indexed column.

10) The optimizer is likely to use the clustering index to avoid a sort for
ORDER BY, GROUP BY, DISTINCT, and join processing.

11) Columns frequently searched or joined over a range of values using the
operators BETWEEN, >, <, and LIKE are good candidates for clustering.

12) A clustering index means that values are maintained in sequence on the
data pages.

13) The REORG utility does not resequence rows if a clustering index is not
explicitly declared.

14) The SORTDATA parameter on the REORG utility statement is ignored if
there is no clustering index declared on the table.

15) In one case, SORTDATA required 74 percent less elapsed time when
reorganizing data with a cluster ratio of 80 percent.

16) If equal predicates are used on a column with a unique index, clustering
has no advantages.

17) With the exception of batch processing, there is generally no advantage to
having a clustering index on a primary key with a unique index if equal
predicates are used.

18) Minimize the number of indexes when inserting, updating, and deleting
more than about 10 percent of the rows on a weekly basis.

19) Composite indexes are useful when columns are frequently referenced
together.

20) In most cases, indexes should be created after a table is created and before
the LOAD utility is used to populate the table.

21) The LOAD utility builds the indexes after extracting the indexed values
while inserting rows into the table.



22) The extracted indexed values are sorted and the indexes are built
efficiently in parallel or one at a time serially depending on the parameter
specified to the LOAD utility.

23) If a table already has rows when an index is created, a table space scan is
performed to extract the indexed values.

24) For data accessed sequentially, cluster ratio has a huge impact on
performance; for randomly accessed data, cluster ratio is not as important.

25) The CLUSTERRATIO column in SYSINDEXES contains the percentage of rows
that are in sequence according to the values in the column of the
clustering index, that is, the index that was created with the CLUSTER
parameter.

26) A value of 'Y' in the CLUSTERING column indicates that the row
describes the clustering index.

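The points on clustering (12, 24 and 25 in particular) can be illustrated with a small model. The Python sketch below is not DB2 code and does not reproduce DB2's actual CLUSTERRATIO calculation. It assumes a simplified definition (the percentage of rows whose key is not lower than that of the row physically before it), a hypothetical page size of 10 rows, and made-up function names. It only shows why a range scan touches far fewer data pages when rows are stored in key order.

```python
import random

ROWS_PER_PAGE = 10  # hypothetical page capacity, chosen only for illustration


def cluster_ratio(keys_in_physical_order):
    """Percentage of rows stored in key sequence.

    Simplified stand-in for CLUSTERRATIO, not DB2's actual formula:
    a row counts as "in sequence" if its key is not lower than the
    key of the row stored immediately before it.
    """
    n = len(keys_in_physical_order)
    if n < 2:
        return 100.0
    pairs = zip(keys_in_physical_order, keys_in_physical_order[1:])
    in_sequence = 1 + sum(1 for prev, cur in pairs if cur >= prev)
    return 100.0 * in_sequence / n


def page_fetches_for_range(keys_in_physical_order, low, high):
    """Count page switches when rows with low <= key <= high are read in key order."""
    # (key, page number) for each qualifying row, in the order an
    # index scan over the range would deliver them
    hits = sorted(
        (key, position // ROWS_PER_PAGE)
        for position, key in enumerate(keys_in_physical_order)
        if low <= key <= high
    )
    fetches = 0
    last_page = None
    for _, page in hits:
        if page != last_page:  # moving to a different data page costs a fetch
            fetches += 1
            last_page = page
    return fetches


clustered = list(range(1000))        # rows stored in key order
scattered = clustered[:]
random.Random(0).shuffle(scattered)  # same rows, random physical order

clustered_fetches = page_fetches_for_range(clustered, 100, 299)
scattered_fetches = page_fetches_for_range(scattered, 100, 299)

print("cluster ratio, clustered:", cluster_ratio(clustered))
print("cluster ratio, scattered:", cluster_ratio(scattered))
print("page fetches, clustered:", clustered_fetches)
print("page fetches, scattered:", scattered_fetches)
```

In this model the clustered layout reads the 200 qualifying rows from 20 consecutive pages, while the shuffled layout changes page on almost every row. That matches the point above that cluster ratio matters most for sequential (range) access and much less for single-row lookups.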

I hope these points serve as a helpful introduction to the concepts of
indexing and clustering for a performance tuning exercise.

