Sie sind auf Seite 1von 3

Role of Cardinality in Query Optimization

One of the hardest problems in query optimization is to accurately estimate the costs of alternative query plans. Optimizers cost query plans using a mathematical model of query execution costs that relies heavily on estimates of the cardinality, or number of tuples, flowing through each edge in a query plan. Cardinality estimation in turn depends on estimates of the selection factor of predicates in the query. Traditionally, database systems estimate selectivitys through fairly detailed statistics on the distribution of values in each column, such as histograms. Cardinality represents the number of rows in a row set. Here, the row set can be a base table, a view, or the result of a join or GROUP BY operator. Cardinality estimation is the problem of estimating the number of tuples returned by a query; it is a fundamentally important task in data management, used in query optimization, progress estimation, and resource provisioning. The accuracy of cardinality estimates is crucial for obtaining a good query execution plan. Today's optimizers make several simplifying assumptions during cardinality estimation that can lead to large errors and hence poor plans. In a scenario such as query optimizer testing it is very desirable to obtain the "best" plan, i.e., the plan produced when the cardinality of each relevant expression is exact. Such a plan serves as a baseline against which plans produced by using the existing cardinality estimation module in the query optimizer can be compared. However, obtaining all exact cardinalities by executing appropriate sub expressions can be prohibitively expensive. Cardinality is estimated by using explain plan method. The term cardinality is define in explain plan (The explain plan is the sequence of operations performed by oracle to execute the statement. by examining the explain plan, we can identify

inefficient sql statements) Following table shows the plan table of explain plan where cardinality is define with data type Number(38).
CREATE TABLE PLAN_TABLE ( STATEMENT_ID VARCHAR2(30), DATE, TIMESTAMP REMARKS VARCHAR2(80), OPERATION VARCHAR2(30), OPTIONS VARCHAR2(30), OBJECT_NODE VARCHAR2(128), OBJECT_OWNER VARCHAR2(30), VARCHAR2(30), OBJECT_NAME OBJECT_INSTANCE NUMBER(38), OBJECT_TYPE VARCHAR2(30), OPTIMIZER VARCHAR2(255), SEARCH_COLUMNS NUMBER, ID NUMBER(38), PARENT_ID NUMBER(38), POSITION NUMBER(38), COST NUMBER(38), CARDINALITY NUMBER(38), BYTES NUMBER(38), OTHER_TAG VARCHAR2(255), PARTITION_START VARCHAR2(255), PARTITION_STOP VARCHAR2(255), PARTITION_ID NUMBER(38), OTHER LONG, DISTRIBUTION VARCHAR2(30) ) ;

For finding cardinality we execute SET AUTOTRACE ON command on Oracle platform (SET AUTOTRACE ON gives report that includes optimizer execution path and the SQL statement execution statistics). Table below shows the cardinality of a query after applying explain plan.

Execution Plan for the Non-optimize Query in RDB for GROUP BY Clause ----------------------------------------------------------------------------------------------------------------------------0 SELECT STATEMENT Optimizer=CHOOSE (Cost=830 Card=2 Bytes=34) 1 0 SORT (GROUP BY) (Cost=830 Card=2 Bytes=34) 2 1 TABLE ACCESS (BY INDEX ROWID) OF 'RBRANCH (Cost=826 Card=2 Bytes=34) 3 2 INDEX (FULL SCAN) OF 'SYS_C002721' (UNIQUE) (Cost=26 Card=2)

COST = 2512

CARD = 8

BYTES = 104

In mathematics, the cardinality of a set is a measure of the "number of elements of the set". For example, the set A = {2, 4, 6} contains 3 elements, and therefore A has a cardinality of 3. There are two approaches to cardinality one which compares sets directly using bijections and injections, and another which uses cardinal numbers.[1]

In SQL (Structured Query Language), the term cardinality refers to the uniqueness of data values contained in a particular column (attribute) of a database table. The lower the cardinality, the more duplicated elements in a column. Thus, a column with the lowest possible cardinality would have the same value for every row. SQL databases use cardinality to help determine the optimal query plan for a given query.

Values of Cardinality
When dealing with columnar value sets, there are 3 types of cardinality: high-cardinality, normal-cardinality, and low-cardinality. High-cardinality refers to columns with values that are very uncommon or unique. Highcardinality column values are typically identification numbers, email addresses, or user names. An example of a data table column with high-cardinality would be a USERS table with a column named USER_ID. This column would contain unique values of 1-n. Each time a new user is created in the USERS table, a new number would be created in the USER_ID column to identify them uniquely. Since the values held in the USER_ID column are unique, this column's cardinality type would be referred to as high-cardinality. Normal-cardinality refers to columns with values that are somewhat uncommon. Normalcardinality column values are typically names, street addresses, or vehicle types. An example of a data table column with normal-cardinality would be a CUSTOMER table with a column named LAST_NAME, containing the last names of customers. While some people have common last names, such as Smith, others have uncommon last names. Therefore, an examination of all of the values held in the LAST_NAME column would show "clumps" of names in some places (e.g.: a lot of Smith's ) surrounded on both sides by a long series of unique values. Since there is a variety of possible values held in this column, its cardinality type would be referred to as normalcardinality. Low-cardinality refers to columns with few unique values. Low-cardinality column values are typically status flags, Boolean values, or major classifications such as gender. An example of a data table column with low-cardinality would be a CUSTOMER table with a column named NEW_CUSTOMER. This column would contain only 2 distinct values: Y or N, denoting whether the customer was new or not. Since there are only 2 possible values held in this column, its cardinality type would be referred to as low-cardinality.

Das könnte Ihnen auch gefallen