Beruflich Dokumente
Kultur Dokumente
ANJU GARG
1
About me
• More than 11 years of experience in IT Industry
• Sr. Corporate Trainer (Oracle DBA) with Koenig Solutions Pvt. Ltd.
• Oracle blog : http://oracleinaction.com/
• Email : anjugarg66@gmail.com
• Oracle Certified
• Pre-12c Histograms
– Frequency Histograms
– Height Balanced Histograms
• Histograms in 12c
– Top Frequency Histograms
– Hybrid Histograms
– Hybrid Histograms - Corollary
• Conclusion
• References
• Q &A
Nb – Number of buckets
NDV – No. of Distinct Values
Nb – Number of buckets
NDV – No. of Distinct Values Sangam 14 Anju Garg 6
Frequency Histogram
•
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 538080257
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50 | 50200 | 7 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| HIST | 50 | 50200 | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------
Nb – Number of buckets
NDV – No. of Distinct Values
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
Plan hash value: 538080257
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 48 | 48192 | 7 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| HIST | 48 | 48192 | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=8)
IS
IS NO NO
IS SAMPLE_SIZE
NDV > 2048
Nb < NDV ? =
?
AUTO ?
YES YES
FREQUENCY
NO HISTOGRAM
USING
SAMPLE
IS
SAMPLE_SIZE NO HEIGHT
= BALANCED
HISTOGRAM
AUTO ?
YES
DO HIGHLY
POPULAR NO HYBRID
VALUES HISTOGRAM
DOMINATE ?
YES
TOP FREQUENCY
HISTOGRAM
USING
FTS
• Prerequisites
NDV > Nb
The percentage of rows occupied by the top Nb frequent values is equal to or
greater than threshold p, where p is (1-(1/Nb))*100.
The ESTIMATE_PERCENT is set to AUTO_SAMPLE_SIZE in the DBMS_STATS
statistics gathering procedure.
• Features
One value per bucket
One value contained entirely in one bucket
Cardinality of the value = Bucket size
Variable bucket size
Precise for all the endpoints captured
DB12c>exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size
20', cascade => true);
select table_name, column_name,histogram,num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
Nb – Number of buckets
NDV – No. of Distinct Values
Sangam 14 Anju Garg 18
Top Frequency Histogram
Nb – Number of buckets
NDV – No. of Distinct Values
Nb – Number of buckets
Sangam 14 Anju Garg 24
NDV – No. of Distinct Values
Hybrid Histogram
BUCKET ENDPOINT
NUMBER VALUES _ VALUE
1 1 1 1 1 1
2 2 2 3 3
3 4 4 5 5
4 6 6 7 7 7 7
5 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
6 9 9 9 10 10 10
7 11 11 11 11 11 11 11
8 12 12 12 12 12 12 12
9 13 13 13 13 13 13 13
10 14 14 14 14
11 15 15 15 15 15 15
12 16 16 16 16
13 17 17 17 17
14 18 19 19
15 20 20 20 20 20 20
16 21 21 21
17 22 22
18 23 23
19 24 24 24
20 25 25 25 26 26 26
HYBRID HISTOGRAM
Sangam 14 Anju Garg 25
Hybrid Histograms
Conclusion
• Hybrid histograms have features of both frequency and height balanced
histograms
• Features similar to frequency histograms
All occurrences of a value are placed in one bucket
ENDPOINT_NUMBER stores cumulative frequency
Variable bucket size
• Features similar to Height Balanced histograms
One bucket can have multiple values.
• Captures more endpoints
• Now we have a better estimate of the data distribution of “non-popular”
values.
• Hybrid histograms are an improvement over height balanced histograms when
top frequency histograms cannot be used
DB12c>exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST',method_opt => 'FOR COLUMNS ID
size 13', cascade => true);
Nb – Number of buckets
NDV – No. of Distinct Values
26 100 2 7 13 13 13 13 13 13 14 14 14 14
13 rows selected. 8 15 15 15 15 15 16 16 16 16
9 17 17 17 18 18
ENDPOINT_VALUE : Largest value in a bucket 10 19 20 20 20 20 20 20
ENDPOINT_NUMBER : Cumulative frequency 11 21 21 22 22
ENDPOINT_REPEAT_COUNT :Frequency of endpoint 12 23 24 24 24
13 25 25 25 26 26 26
Nb – Number of buckets
NDV – No. of Distinct Values
Nb – Number of buckets
NDV – No. of Distinct Values
• http://jimczuprynski.files.wordpress.com/2014/04/czuprynski-select-q2-2014.pdf
• http://jonathanlewis.wordpress.com/2013/09/01/histograms/