Data Warehousing > Database

Partitioned Primary Indexes

Jerry Klindt
(updated by Paul Sinclair)
October 20, 2004

Table of Contents

Executive Summary
Introduction
Definitions and Basics
How Much Can PPI Improve Performance?
How PPI Solves the Business Problem: Example One
Can the First Example Be Improved Further?
A Second Example
A Final Example
Specifics of Defining a PPI Table
High-Level Partitioning Guidelines
High-Level Trade-off Considerations
Summary

Executive Summary

Partitioned primary indexes, introduced in Teradata Database V2R5, provide an opportunity to greatly improve performance of certain queries, and to improve the performance of high-volume insert, update, and delete operations. The feature is flexible, yet easy to use, and is largely transparent to end users.

Introduction

Some common business queries generally require a full-table scan of a large table even though it's predictable that a fairly small percentage of the rows will qualify. One example of such a query is a trend analysis application that compares current month sales to the previous month, or to the same month of the previous year, using a table with several years of sales detail. Another example is an application that compares customer behavior in one geographic region to another region.

Prior to Teradata Database V2R5, there were few viable opportunities for a Database Administrator (DBA) to structure the data warehouse in a manner that allowed such queries to avoid full-table scans. Starting with Teradata Database V2R5, the DBA has a flexible and powerful tool to structure tables to allow automatic optimization of frequently used queries of this class. That tool is the partitioned primary index (PPI). A PPI allows a table to be partitioned on columns of interest while retaining the traditional use of the primary index (PI) for data distribution and efficient access when the PI values are specified in the query.

A carefully-chosen partitioning expression can result in partial-table scans instead of full-table scans with dramatic improvements in resource consumption and elapsed time (elapsed time decreases of 99% or more are possible). Batch insert and update times may also be improved when the partitioning column is chosen to match the arrival pattern of the data (elapsed time decreases of 90% or more are possible).

EB-1889 > 1204 > PAGE 2 OF 14

The process for physically defining the partitioning expression, via the CREATE TABLE statement, is simple and straightforward. This paper gives some examples.

As is true for all physical database design decisions, there are trade-off considerations associated with each possible choice. It's beyond the scope of this paper to discuss the trade-off considerations at length.

The objective of this paper is to provide realistic examples and actual performance comparisons using PPI and non-PPI solutions.

Definitions and Basics

In the context of PPI, partitioning refers to the physical ordering of rows within the table. The ordering is automatically provided by the database management software, and is determined by a user-specified expression called the partitioning expression. A PPI table physically is substantially the same as a non-PPI table except for the ordering of rows. More specifically, the PI value is hashed to distribute a row to a particular AMP in an identical fashion for PPI and non-PPI tables. Within each AMP, rows are ordered by PI hash for non-PPI tables, and by partition number first then PI hash for PPI tables.

The partitioning expression is specified on the CREATE TABLE statement in a PARTITION BY clause following the PRIMARY INDEX definition. The result of the expression must be an integer value or a value that can be cast to integer, and the result indicates the partition number. The columns referenced in the partitioning expression are called the partitioning columns. A partition number must be between 1 and 65,535, inclusive; therefore, the maximum number of partitions that can be defined for a table is 65,535.

Accessing a particular partition of a table means accessing a subset of the table beginning with the data block containing the first row belonging to the partition (on each AMP), and extending to the data block containing the last row belonging to the partition. The number of data blocks will be zero if there are no rows belonging to that partition (although it may be necessary to read one data block to determine that there are no rows for the partition).

The term partition elimination refers to an automatic optimization in which the Optimizer determines, based on query conditions and the partitioning expression, that some partitions cannot contain qualifying rows, and causes those partitions to be skipped. Partitions that are skipped for a particular query are called eliminated partitions. Generally, the greatest benefit of a PPI table is obtained from partition elimination.

The term direct merge join is used to describe a join in which the table of interest is not spooled in preparation for a merge join. The Optimizer may choose a direct merge join when all columns of the PI are specified in equality join terms.

The term direct product join is used to describe a join in which the table of interest is not spooled in preparation for a product join. The Optimizer may choose a direct product join when all the partitioning columns are specified in equality join terms.

The ordering of rows within a table is transparent to application developers, but there are trade-off considerations involving queries with partitioning column conditions, queries that specify one or a few PI values, and queries that perform joins on the PI columns. We will briefly discuss these trade-off considerations in subsequent sections.
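
The row ordering and partition elimination defined in this section can be illustrated with a small model. The following is a minimal Python sketch, not Teradata code: the row values, hash values, and the function name `scan_partition` are hypothetical, and a real AMP works on data blocks rather than an in-memory list. It only shows why ordering by partition number first makes the rows of one partition contiguous, so the other partitions can be skipped.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows on one AMP: (partition_number, pi_hash, payload).
rows = [
    (2, 0x1A, "row-c"),
    (1, 0x7F, "row-a"),
    (3, 0x05, "row-e"),
    (1, 0x90, "row-b"),
    (2, 0x44, "row-d"),
]

# Non-PPI table: rows on the AMP are ordered by PI hash only.
non_ppi_order = sorted(rows, key=lambda r: r[1])

# PPI table: rows are ordered by partition number first, then by PI hash.
ppi_order = sorted(rows, key=lambda r: (r[0], r[1]))


def scan_partition(ordered_rows, partition):
    """Return the contiguous run of rows belonging to one partition.

    Assumes ordered_rows is sorted by partition number first, as in the
    PPI ordering above. All other partitions are never touched.
    """
    keys = [r[0] for r in ordered_rows]
    lo = bisect_left(keys, partition)
    hi = bisect_right(keys, partition)
    return ordered_rows[lo:hi]


partition_2 = scan_partition(ppi_order, 2)
```

In the non-PPI ordering the two rows of partition 2 are separated only by hash value and could sit anywhere in the table, which is why a query on the partitioning column has to scan everything.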

How Much Can PPI Improve Performance?

The performance gain depends on the number of partitions and the specific query being measured. In the best case, the elapsed time reduction factor for a specific query against a single table can approach the reciprocal of the number of partitions in the table. This means that best-case PPI queries can take less than 1/100 of one percent of the time they would take with a non-PPI table. The best performance improvement occurs when there are many partitions with reasonably even distribution of rows among the partitions, and partition elimination excludes all except one partition.

| Test Description | Baseline | PPI | Improvement |
|---|---|---|---|
| Select rows that have a specified value of the partitioning column (200 partitions with roughly the same number of rows each) | 59 seconds | one second | 98% reduction in elapsed time |
| Select a month of activity from one partition containing six months of data (11 years of data contained in 40 partitions of unequal size) | 58 seconds | two seconds | 96% reduction in elapsed time |
| Delete rows that have a specified value of the partitioning column (200 partitions of equal size) | 239 seconds | one second | more than 99% reduction in elapsed time |
| Update one column in each row that has a specified value of the partitioning column (200 partitions of equal size) | 237 seconds | three seconds | 98% reduction in elapsed time |
| MultiLoad insert a number of rows equal to 1% of the table size into one partition (of 200) | 1394 rows per second per node (larger numbers are better) | 14,742 rows per second per node | more than ten times faster |
| MultiLoad insert a number of rows equal to 1% of the table size into one partition (of 200) with one NUSI defined on the table | 841 rows per second per node | 5666 rows per second per node | more than six times faster |

Figure 1. Actual Performance Test Results

Figure 1 shows the results of actual performance tests. The Baseline column is the performance for a non-PPI table, and the PPI column is the performance for a PPI counterpart table. These tests are considered to be realistic, but your results may vary.

How PPI Solves the Business Problem: the First Example

We start the discussion of when a PPI is most appropriate by showing the differences between a PPI and non-PPI table for a few examples. For the first example, we stipulate a table and some processing requirements, discuss the options available prior to Teradata Database V2R5, and discuss the optimization opportunities a PPI provides.

Our hypothetical company has a large sales table containing the details of each transaction for the previous 24 full months plus the current month-to-date. Once per month, the transactions from the oldest month are deleted. Current transactions are added to the table nightly using Teradata MultiLoad. Most transactions are added on the date they occur, but a small percentage of transactions may be reported a few days after they occur. The number of transactions per month is roughly the same for all months.

Each row contains, among other things, the product code for the item, the transaction date, an identifier for the sales agent, and the quantity sold. The rows are short, and the data blocks are large. The PI is a composite of product code, transaction date, and the agent identification. The non-PPI definition of this table, showing only a few of the most important columns, is as follows:

    CREATE TABLE SalesTable (
        product_code CHAR(8),
        sales_date DATE,
        agent_id CHAR(8),
        quantity_sold INTEGER,
        other_columns CHAR(50))
    PRIMARY INDEX (product_code, sales_date, agent_id);

There are four major categories of queries against this table:
> A modest number of short-running queries specify the PI values.
> Many ad hoc queries have the following general pattern:
    - Compare one month of activity to another month, or
    - Compare current-month-to-date sales to the same days of the previous month or to the same days of the same month of the previous year for a few product code values.
> Some queries analyze agent performance, usually over an interval of a calendar quarter or less.
> Some queries examine sales trends over the previous 24 full months, usually for most or all product code values.

No other tables have the same PI definition. The sales table is frequently joined to relatively small tables containing information about each product code and each sales agent.

The DBA, prior to Teradata Database V2R5, had a need to speed up ad hoc queries and agent analysis queries. The DBA considered creating a value-ordered secondary index or join index on the transaction date column, and had set up tests for those scenarios. After running and analyzing EXPLAINs, the DBA had found that the Optimizer had determined that neither index was selective enough to be an improvement over a full-table scan.

The DBA then considered splitting the table into 25 separate tables, each containing transactions for a calendar month. Then, the DBA would create a view with a UNION of all the tables for use by the applications that analyze 24 months of sales history. The DBA concluded that this solution could indeed speed up the targeted queries, but it added too much complexity for the end users. Users would have to understand the structure and change the table names in their queries, code more complicated UNION statements, and select appropriate date ranges and product code ranges. The need to know the appropriate table name (from the 25 different tables) would also apply to applications submitting short-running queries that specify the primary index. This solution would also complicate nightly load jobs, especially in the first few days of a month when a few of the transactions would be from the prior month. The solution would also complicate the archive strategy. In the end, this solution was rejected as being too complicated and error-prone.

With PPI, there's an excellent solution for this example scenario. By adding a PARTITION BY clause to the definition of the replacement PPI table, it would be easy to create 25 partitions, one for each month (assuming the current date is in October 2004).

    CREATE TABLE PPI_SalesTable (
        product_code CHAR(8),
        sales_date DATE,
        agent_id CHAR(8),
        quantity_sold INTEGER,
        other_columns CHAR(50))
    PRIMARY INDEX (product_code, sales_date, agent_id)
    PARTITION BY RANGE_N (sales_date BETWEEN
        DATE '2002-10-01' AND DATE '2004-10-31'
        EACH INTERVAL '1' MONTH);

The RANGE_N function was used in this scenario to specify the beginning and ending dates and the granularity of the partitioning.

By converting the sales table into a table partitioned by transaction month, many of the queries would run faster (in this scenario) with no significant negative trade-off considerations. Let's examine each element of the stated workload as it applies to the newly-partitioned table in more detail.
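
To make the mapping from transaction date to partition number concrete, the following is a minimal Python sketch that models the one-month RANGE_N partitioning used for the sales table in this example. It is an illustration only, not how the database evaluates the expression: the function name `month_partition` is hypothetical, and returning `None` for a date outside the defined range is a modeling assumption for "no matching range".

```python
from datetime import date


def month_partition(sales_date,
                    first=date(2002, 10, 1),
                    last=date(2004, 10, 31)):
    """Model a one-month RANGE_N: return the 1-based partition number.

    Assumes `first` is the first day of a month, so each range lines up
    with a calendar month. Returns None when the date is outside the
    range (a modeling assumption, see the lead-in).
    """
    if not (first <= sales_date <= last):
        return None
    months_since_first = ((sales_date.year - first.year) * 12
                          + (sales_date.month - first.month))
    return months_since_first + 1


oldest = month_partition(date(2002, 10, 15))    # October 2002
current = month_partition(date(2004, 10, 31))   # October 2004
outside = month_partition(date(2004, 11, 1))    # not yet defined
```

With the range from October 2002 through October 2004, the oldest month maps to partition 1 and the current month to partition 25, which matches the 25 partitions described above.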

Faster Monthly Deletes

Instead of using Teradata MultiLoad to delete rows, the DBA could submit an ALTER TABLE statement on a monthly basis (see the next example) to drop the oldest partition and delete its rows, and at the same time create a new partition that would contain data for the upcoming month. Additional partitions for future months could be added if desired. A delete of all the rows in a partition is optimized in much the same way that a delete of all rows in a table has historically been optimized. In both cases, there is no need to record the individual rows in the transient journal as they're deleted. The rows for the month being deleted are physically stored contiguously (on each AMP) instead of being scattered more or less evenly among all the data blocks, as in the non-PPI table, so there would be fewer data blocks with rows to be deleted. Most of the deletes would be full-block deletes so the data block would not have to be read or rewritten. Only one data block per AMP would contain rows for the oldest month plus the second oldest month, and that would be the only data block read, updated, and rewritten. There is also no need to touch any of the rows for the other month partitions. Dropping the oldest partition(s) with an ALTER TABLE statement is a nearly instantaneous operation assuming there are no secondary indexes or join indexes that require updates, there are no retained or added partitions (such as NO RANGE) to move the rows, and the option to make a copy of the deleted rows is not specified. For example, to drop the partition and delete the rows for October 2002, and create a partition for November 2004, you would submit:

    ALTER TABLE SalesTable MODIFY
    PRIMARY INDEX (product_code, sales_date, agent_id)
    DROP RANGE BETWEEN
        DATE '2002-10-01' AND DATE '2002-10-31'
    ADD RANGE BETWEEN
        DATE '2004-11-01' AND DATE '2004-11-30'
    WITH DELETE;

Faster Teradata MultiLoad Inserts

The nightly Teradata MultiLoad insert job would run faster than it did for the non-PPI table. Instead of the inserted rows distributing more or less evenly among all the data blocks of the table, as with the non-PPI table, the inserted rows would be concentrated in data blocks for the proper month. This would increase the average "hits per block" count (a key measure of Teradata MultiLoad efficiency) and reduce the number of data blocks that must be read and rewritten.

Virtually No Change to Short-Running Queries

Short-running queries that specify primary index values would run approximately as fast as on the non-PPI table. Since the partitioning column is part of the primary index, the PI access performance would not be significantly changed.

Significant Performance Gains in Ad hoc Queries

Large gains would be seen in ad hoc queries that, for example, compare a recent month of sales data to a prior month. Due to partition elimination, only two of the 25 partitions would be read instead of the full-table scan required on the non-PPI table. This means that the number of disk reads would be reduced by roughly 92% with a proportional reduction in elapsed time. The 92% figure applies to the step that reads the sales table, not to the sum of all the steps used to accomplish the query. Given the stated assumptions, the other steps should take roughly the same amount of time as for the non-PPI table.

The same considerations apply to the agent analysis queries. The number of partitions read is determined by the time period specified in the query. Even if the analysis is for twelve full months, there is still roughly a 50% gain in reading twelve of 25 partitions for the step that reads the sales table.

No Degradation to Queries Requiring a Full-Table Scan

Decision support queries that analyze 24 months of sales data would take roughly the same time and resources as for the non-PPI table. There would be a small gain from reading 24 instead of 25 partitions. If the analysis is for 24 months plus the current month (i.e., the entire table), the resource usage is the same as for the non-PPI table.
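
The percentages quoted for the 25-partition sales table follow from simple arithmetic. The following is a minimal Python sketch of that calculation; the function name `fraction_avoided` is hypothetical, and the model assumes, as the example does, that every monthly partition holds roughly the same number of rows, so the share of partitions skipped approximates the share of disk reads avoided in the step that reads the sales table.

```python
def fraction_avoided(partitions_read, total_partitions):
    """Share of a full-table scan avoided by partition elimination.

    Assumes evenly sized partitions, as stated in the example.
    """
    return 1 - partitions_read / total_partitions


# Two months compared (2 of 25 partitions read).
two_month_compare = fraction_avoided(2, 25)
# Twelve-month agent analysis (12 of 25 partitions read).
twelve_month_analysis = fraction_avoided(12, 25)
# 24-month trend analysis (24 of 25 partitions read).
trend_analysis = fraction_avoided(24, 25)

for label, value in [("2 of 25", two_month_compare),
                     ("12 of 25", twelve_month_analysis),
                     ("24 of 25", trend_analysis)]:
    print(f"{label}: about {value:.0%} of the scan avoided")
```

The results are about 92%, 52%, and 4%, in line with the "roughly 92%", "roughly a 50% gain", and "small gain" stated in the text.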

| Activity | Non-PPI Table | PPI Table | Improvement | Comments |
|---|---|---|---|---|
| Nightly inserts | Inserted rows scattered throughout table | Inserted rows concentrated in one partition | Faster performance | No changes to load script needed. |
| Monthly delete of one month of data | MultiLoad job reads most data blocks, updates most data blocks | ALTER TABLE statement deletes partition | Much faster performance | Easier maintenance |
| Primary index access | One data block read | One data block read | No change | No SQL changes needed |
| Comparison of current month to prior month | All data blocks read | Two partitions read | Step is 12 times faster (two partitions of 25 read) | No SQL changes needed |
| Trend analysis over entire table | All data blocks read | All data blocks read | Little change | Rows are two bytes longer for PPI. 2% more data blocks for 100-byte rows. |
| Joins | No direct merge joins | No direct merge joins | Little change | No direct merge joins due to choice of primary index. |
| Archive/Restore (in Teradata Database V2R6) | Entire table | Entire table or selected partitions | Faster archives for selected partitions | Saves having to re-archive data already archived |

Figure 2. Example of PPI Improvement Opportunities

Virtually No Degradation for Joins

Joins would take roughly the same amount of time. In this example, since there are no other tables with the same primary index, there are no direct merge joins to the sales table. Joins to the product table and agent table would most likely use the same join strategy as when the sales table was not partitioned. The join strategy would typically be either a duplication of a small table followed by a product join to the sales table, or a redistribution of a spool file followed by a merge join. Neither strategy is less efficient with the partitioning of the sales table. Joins could even be faster depending on the specific query conditions and the possibility of partition elimination.

More Efficient Archiving and Restoring

In Teradata Database V2R6, partitions can also be selectively archived, restored, and copied. This can significantly reduce the time to archive data by only archiving the recently changed partitions. Restores of selected partitions can be used to quickly reload critical partitions.

Additional Disk Space Required

The partitioned sales table would require somewhat more disk space than the non-partitioned counterpart due to the two-byte partition number recorded in each row. For this example, the percentage of increase would be less than 3%.

Figure 2 summarizes the improvement opportunities for the example.

Can the First Example Be
Improved Further?
The first PPI solution, outlined above,
was to partition by month since many of
the queries use a month as their basic unit
of time. Another option to consider is
partitioning to a finer level. Let's compare
partitioning by month to partitioning by
day using the following PARTITION BY
clause:
    PARTITION BY RANGE_N (sales_date BETWEEN
        DATE '2002-10-01' AND DATE '2004-10-31'
        EACH INTERVAL '1' DAY);

The table would now have about 760


partitions (two years with 365 days each
plus the current month of about 30 days).
Some small number of partitions, the
ones corresponding to future dates in the
current month, would be empty.
Virtually No Impact to the
Monthly Deletes
The monthly process deleting the oldest
month of data would virtually be the
same. Depending on the month, between
28 and 31 smaller partitions would be
deleted instead of one larger partition.
However, the same number of rows would
be deleted, and the run time for the job
would be roughly the same.

Faster Nightly Inserts

Nightly inserts would benefit from the finer partitioning. Instead of being concentrated in one or two partitions out of the 25 large partitions, as in the last example, the rows would be inserted into three to five smaller partitions of the 760 daily partitions, well under one percent of the total. Most of the inserts would be directed to the one partition that contains the day's activity. This would increase the hits per block, thereby improving the performance of the inserts.

No Impact to Short-Running Queries

Having 760 partitions instead of 25 would not impact short-running PI access queries. This is because in this example the partitioning column is part of the primary index. In other situations, there could be a significant impact.

Modest Improvement for Some Ad hoc Queries

Ad hoc queries that analyze two full months of data would not be impacted. They would now access about 60 partitions out of the 760, instead of two out of 25, roughly the same percentage of the table. However, when queries vary by the time of month, there would be some gain by having the larger number of partitions. For example, a query submitted on the fifth day of the current month might analyze four days for each of two months, while a query submitted on the last day of the current month might analyze about 30 days for each of two months. Instead of two out of 25 monthly partitions (between 32 and 36 days of data), the query on the fifth day of the current month would involve eight out of 760 partitions (eight days of data), which is a smaller percentage of the table. The query at the end of the month would examine about 60 out of 760 partitions, which is substantially the same as two out of 25 monthly partitions.

Analysis queries that examine 24 months of data would run in about the same time as they are examining most of the table in either case.

The number of partitions would not significantly impact the joins since there are no direct merge joins against this table in this scenario.

In summary, for this example, having a larger number of smaller partitions would produce modest gains and no degradation to performance. The greatest gains would be for queries that analyze only a few days of transactions, and for the nightly loads. Additionally in Teradata Database V2R6, a day's transactions (that is, a small partition of data) could be selectively archived or restored.
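
The partition counts used in the daily-partitioning comparison can be checked with a short calculation. The following is a minimal Python sketch using only the date range given in the PARTITION BY clause; the variable names are hypothetical. The exact count of daily partitions comes out slightly above the rounded "about 760" because October has 31 days and 2004 is a leap year.

```python
from datetime import date

first = date(2002, 10, 1)
last = date(2004, 10, 31)

# One partition per calendar day in the inclusive range.
daily_partitions = (last - first).days + 1

# Share of the table read by a query run on the fifth day of the month
# (four days in each of two months), daily versus monthly partitioning.
share_daily = 8 / daily_partitions
share_monthly = 2 / 25

print(f"daily partitions: {daily_partitions}")
print(f"eight daily partitions: {share_daily:.1%} of the table")
print(f"two monthly partitions: {share_monthly:.1%} of the table")
```

Eight daily partitions cover roughly 1% of the table, against 8% for two monthly partitions, which is the gain the text describes for queries that touch only a few days.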

A Second Example

While transaction date is frequently a good choice for the partitioning column, it is not the only choice. Let's consider a telephone company's table with detailed information about phone calls. There is a row for each outgoing call with the originating phone number, the timestamp for the start of the call, and the call duration, among other things. The rows are retained for a variable length of time based on the call date and the monthly bill preparation date. This is not the same for every customer, and the retention period is rarely more than six weeks. The primary index is the phone number and the call-start timestamp. This implies the primary index was chosen to provide good data distribution across the AMPs. It is also obvious that the primary index was not chosen for data access or to facilitate direct merge joins. Some queries analyze all calls from a particular phone number. Other queries analyze all calls for a particular period of time, perhaps for as long as a month, for customers meeting certain criteria. A non-PPI definition of this table, showing only a few critical columns, follows:

    CREATE TABLE CallDetail (
        phone_number DECIMAL(10) NOT NULL,
        call_start TIMESTAMP,
        call_duration INTEGER,
        other_columns CHAR(30))
    PRIMARY INDEX (phone_number, call_start);

One possibility for partitioning this table would be to cast call_start as a date and partition by date, similarly to the solution in the first example. This would help with inserting new activity in the same manner as in the previous example. Deletion of rows would not get the same performance gain since the deletes are not strictly by call date and, therefore, the deleted rows would not be clustered in a partition. In this case, the ALTER TABLE statement could not be used, and the process would not reap the same performance benefit that deleting entire partitions provides.

The analysis queries that are based on the date of the call would benefit, with queries specifying a range of a few days getting the greatest gain.

Another choice would be to use the phone number as the partitioning column. Phone numbers contain too many digits to give each number its own partition, but a subset of the digits could be used. If the first (high-order) three digits are used, there would be 1000 partitions, some of which would always be empty because of the way phone numbers are assigned.

This partitioning expression would not improve the performance of bulk inserts or deletes, which would be scattered across all partitions. It would not help with date-based queries, but would allow queries specifying a phone number to run much faster as only one partition would be read out of maybe 500 or more non-empty partitions. A second advantage would be to benefit geographic area analysis since (in some parts of the world, at least) the first three digits identify a particular area.

If 1000 partitions improve performance, 10,000 partitions (the first four digits of the phone number) would probably be even better. If 10,000 partitions were good, maybe 50,000 would be better yet. We cannot have 100,000 partitions, but we could use the first five digits and assign two consecutive numbers to each partition. Some partitions might be empty due to the way phone numbers are assigned.

For example, this table definition creates 50,000 partitions using the first five digits of the phone number:

    CREATE TABLE PPI_CallDetail (
        phone_number DECIMAL(10) NOT NULL,
        call_start TIMESTAMP,
        call_duration INTEGER,
        other_columns CHAR(30))
    PRIMARY INDEX (phone_number, call_start)
    PARTITION BY RANGE_N (
        CAST(phone_number / 100000.00000 AS INTEGER)
        BETWEEN 0 AND 99999 EACH 2);

If it's not important to be able to map a geographic area to one or more partitions, another option would be to maximize the number of partitions by using the partitioning expression (phone_number mod 65535) + 1. If the table contains about 3.276 billion rows, on average each
partition would contain about 50,000 rows. For a system with 100 AMPs, each AMP would on average contain about 500 rows per partition, a number of rows that might fit in one data block if the row width was fairly small. The decrease in response time of a one-partition scan for all activity for a particular phone number would be dramatic compared to the full-table scan that would result with the non-PPI table. A query to return activity for one phone number is a best-case scenario for single-table response time improvement due to PPI. Disregarding the overhead cost of initiating the query and returning the answer set, the elapsed time could be reduced to 1/65535 of the time using the non-PPI table. Including the query initiation and termination overhead, the total query time improvement would be somewhat less than a factor of 65,535, but could be less than 1/10000 of the non-PPI time. Here is a table definition to use this partitioning:

    CREATE TABLE PPI_CallDetail (
        phone_number DECIMAL(10) NOT NULL,
        call_start TIMESTAMP,
        call_duration INTEGER,
        other_columns CHAR(30))
    PRIMARY INDEX (phone_number, call_start)
    PARTITION BY phone_number MOD 65535 + 1;

The best choice, if any, of these proposed partitioning expressions depends on the mix of anticipated queries. The extended logical data model can serve as the starting point for making the decision, but some amount of testing of different scenarios will often be required.

A Final Example

The previous examples illustrate scenarios where a PPI table is the correct choice. For this example, we examine a more ambiguous situation in which more trade-off considerations apply, and the correct solution is not as evident.

An invoice table contains data about each invoice issued in the past four years. The unique primary index is invoice number. New rows are added nightly using Teradata MultiLoad, and the oldest month of data is deleted once per month. There is a moderately heavy volume of queries that get information about one specified invoice. There are ad hoc analysis queries that examine all invoices for some period of time, usually less than a year. Other tables have invoice number as their primary index, but do not have an invoice date column. There are frequent joins with those other tables.

The DBA is considering whether it would be advantageous to partition the invoice table on invoice date using one-month ranges.

The following are some of the considerations that will apply:

Additional Disk Space Required

The primary index is currently defined as unique, but would have to be defined as non-unique if the table was partitioned. There is a business requirement to guarantee that invoice numbers are unique. Therefore, the DBA would have to define a unique secondary index on the invoice number column. This secondary index would increase processing times on insert, delete, and update operations, and consume additional disk space. The base table would also be larger, by two bytes per row, further increasing the required disk space.

Slower Short-Running Queries

PI access queries would now use the unique secondary index to access the row. As a rule of thumb, accessing the row using a secondary index would take roughly two to three times as long as using the primary index for the non-PPI table. On a positive note, the PI access is a very fast, usually a sub-second, operation. Doubling or tripling the response time is likely to go unnoticed to the users who issue those queries.

Slower Long-Running Queries

Direct merge joins (without partition elimination) would at best require more memory and CPU time, and may be measurably slower compared to a similar

non-PPI table. The amount of performance degradation will depend on the query conditions, how many partitions can be eliminated, and the specific join plan chosen by the Optimizer. Actual measurement of representative queries will be required to determine the overall difference in performance.

Impact on Table Maintenance

Nightly inserts would benefit in the same way as in the first example for the same reasons. However, the additional index on invoice number would partially offset the benefit. Since Teradata MultiLoad does not support unique secondary indexes, the index would need to be dropped prior to the MultiLoad job and then recreated after the job. Alternatively, this may be an opportunity to move to a near-real-time load strategy using, for example, Teradata TPump.

One or more columns can make up the partitioning expression although it's anticipated that, for most tables, one column will be specified. The partitioning columns can be part of the primary index, but are not required to be. The result of the partitioning expression must be a scalar value that is INTEGER or can be cast to INTEGER. Most deterministic functions can be used within the expression. The expression must not require character or graphic comparisons, although character or graphic columns can be referenced in some circumstances. If the partitioning columns are not all part of the primary index, the primary index cannot be defined as unique although a unique secondary index can be defined on the same columns as the primary index.

of queries, and determine how much each query type contributes to the overall workload involving this table. This will provide an estimate of the overall workload performance with and without a PPI table. If the difference between a PPI and non-PPI table performance is substantial in either direction, the choice will be evident for the overall workload. But the DBA should also consider the relative importance of the various activities. For example, if the nightly insert volume is starting to overwhelm the time set aside for inserting new activity, even a small improvement in load time might be considered sufficiently important to offset larger degradations in queries. Similarly, if the response time of PI queries is critical, even a small degradation in those queries might be considered unacceptable even if overall workload performance is improved. In short, measurement and analysis is


The same considerations as in the first

required to come to a rational decision

example apply to the monthly deletes.

for this case.

Similarly, in Teradata Database V2R6,


benefits may occur with archives and
restores of selected partitions.
Faster Ad hoc Queries
Ad hoc queries examining several months
of invoices would benefit in the same
way as in the first example. The benefit
would be greatest when fewer months are
examined.

Only base tables can be PPI tables. This


excludes global temporary tables, volatile
tables, join indexes, hash indexes, and
secondary indexes. This restriction does
not mean that a PPI table cannot have

Specifics of Defining a
PPI Table

secondary indexes or cannot be referenced

The PRIMARY INDEX clause of the

TION BY clause is not available on a

CREATE TABLE statement may be

CREATE GLOBAL TEMPORARY TABLE,

followed by an optional PARTITION BY

CREATE VOLATILE TABLE, CREATE

partitioning_expression clause. The parti-

INDEX, CREATE JOIN INDEX, or

tioning expression is a general expression

CREATE HASH INDEX statement.

in the definition of a join index or hash


index. It merely means that the PARTI-

allowing wide flexibility in tailoring the


partitioning expression to the unique

In the general case, there can be up to

Would it be worthwhile to convert the

characteristics of the table. Two functions,

65,535 partitions numbered from one.

invoice table to use a PPI? The DBA will

RANGE_N and CASE_N, are provided

As rows are inserted into the table, the

need to measure the amount of improve-

to simplify the creation of partitioning

partitioning expression is evaluated to

ment and degradation in the various types

expressions.

determine the proper partition placement

EB-1889 > 1204 > PAGE 11 OF 14

for that row. A two-byte internal representation of the partition number is embedded in the row as part of the row identifier, making PPI rows two bytes wider than they would be if the table wasn't partitioned. Secondary indexes referencing PPI tables use the wider row identifier, making those rows wider as well.[1] Except for the embedded internal partition number, PPI rows have the same format as non-PPI rows. A data block can contain rows from multiple consecutive partitions. There are no new control structures to implement the partitioning expression.

Sample uses of partitioning expressions were shown in the discussions of the examples that were presented earlier. While the examples were simple, the partitioning expression is a general expression, which makes it possible to define complex partitioning schemes tailored to the processing needs of individual tables. However, a simple partitioning expression (for instance, RANGE_N on a single date column) may provide the best opportunities for partition elimination in queries.

The Optimizer does partition elimination for a query by analyzing the constraints on the partitioning columns in the context of the partitioning expression. Constraints that compare the partitioning columns to be equal to constant expressions provide partition elimination. Also, range constraints on the partitioning column where the partitioning column is compared to constant expressions and the partitioning expression is a single column or a RANGE_N function on a single column can provide partition elimination. In some cases, the constant expressions may contain USING variables and still provide partition elimination.

Joins on the primary index columns of a partitioned table that are equated to the columns of another table are also optimized when there are a small number of non-eliminated partitions. In this case, a set of partitions can be directly read in a sliding window of merge joins and, thereby, avoid spooling the partitioned table prior to the join. If also joined by equality on the partitioning columns, a rowkey merge join simplifies and improves the performance of the merge join.

In Teradata Database V2R5.1, dynamic partition elimination can occur when there is an equality constraint between the partitioning column of one table and a column of another table. This is useful when looking up a row in one table and matching those rows to corresponding partitions (using a product join) instead of a product join to the entire table. Teradata Database V2R6 further extends dynamic partition elimination to merge joins.

Another enhancement in Teradata Database V2R6 provides partition elimination on the referencing rowids of a secondary index. Instead of looking up all the rows in the base table for a particular index value, only rows in the base table referenced by rowids pointing to non-eliminated partitions are read.

Teradata Database V2R6 also makes a Non-Unique Secondary Index (NUSI) access a single-AMP operation if the NUSI is on the same columns as the Non-Unique Primary Index (NUPI) with an equality condition on the NUSI. Note that a NUSI on the same columns as the NUPI is only allowed for a PPI table. This potentially provides a faster access path than using the NUPI but with the same single-AMP and rowhash locking characteristics. This can occur when the number of occurrences of a NUSI value is less than the number of partitions.

As mentioned earlier, Teradata Database V2R6 provides for archives and restores of selected partitions.

The ALTER TABLE statement has been extended to support PPI. An example was shown in the section "How PPI Solves the Business Problem: the First Example" to drop the partition containing the oldest transactions and create expansion partitions for future dates. This is a simple example, but it does illustrate the capability.

The ability to ALTER a PPI table provides a simple and convenient mechanism for the DBA to perform periodic maintenance on a range-based PPI table.

[1] A join index or hash index that references a table using a row identifier uses the wider format whether or not the table has a partitioned primary index starting with Teradata Database V2R5.

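As a minimal sketch of the PARTITION BY clause and RANGE_N function described in this section, the invoice table of the final example might be defined along the following lines. The table name, column names, and date range are hypothetical (the paper does not give them), and the syntax should be checked against the reference manual for the release in use.

```sql
-- Hypothetical invoice table: primary index on invoice number,
-- partitioned into 24 one-month ranges on invoice date.
CREATE TABLE invoice (
    invoice_number  INTEGER NOT NULL,
    invoice_date    DATE NOT NULL,
    customer_id     INTEGER,
    amount          DECIMAL(12,2))
PRIMARY INDEX (invoice_number)   -- cannot be UNIQUE: invoice_date is not part of the PI
PARTITION BY RANGE_N(invoice_date BETWEEN DATE '2003-01-01'
                                  AND     DATE '2004-12-31'
                                  EACH INTERVAL '1' MONTH);

-- Uniqueness of invoice number can still be enforced with a
-- unique secondary index on the same column as the primary index.
CREATE UNIQUE INDEX (invoice_number) ON invoice;

-- A range constraint against constants on the partitioning column
-- lets the Optimizer consider only the two matching monthly
-- partitions instead of scanning all 24.
SELECT customer_id, SUM(amount)
FROM invoice
WHERE invoice_date BETWEEN DATE '2004-09-01' AND DATE '2004-10-31'
GROUP BY customer_id;
```

The primary index stays non-unique here because the partitioning column is not part of it, which is the situation in which the text suggests a unique secondary index.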

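The periodic maintenance that ALTER TABLE enables on a range-based PPI table can be sketched as follows. This is an illustration only: the table, its columns, and the date ranges are hypothetical, and the MODIFY PRIMARY INDEX form shown is an assumption about the syntax that should be confirmed against the Teradata documentation for the release in use.

```sql
-- Hypothetical range-partitioned table holding two years of sales detail.
CREATE TABLE sales_detail (
    store_id   INTEGER NOT NULL,
    sale_date  DATE NOT NULL,
    amount     DECIMAL(12,2))
PRIMARY INDEX (store_id)
PARTITION BY RANGE_N(sale_date BETWEEN DATE '2003-01-01'
                               AND     DATE '2004-12-31'
                               EACH INTERVAL '1' MONTH);

-- Monthly maintenance: drop the oldest month and add an expansion
-- partition for the next future month. WITH DELETE discards the rows
-- that belonged to the dropped range.
ALTER TABLE sales_detail
MODIFY PRIMARY INDEX
DROP RANGE BETWEEN DATE '2003-01-01' AND DATE '2003-01-31'
           EACH INTERVAL '1' MONTH
ADD RANGE  BETWEEN DATE '2005-01-01' AND DATE '2005-01-31'
           EACH INTERVAL '1' MONTH
WITH DELETE;
```

Because the oldest rows occupy a partition of their own, removing them this way avoids a row-by-row delete over the whole table.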

High-Level Partitioning Guidelines
Here are some general guidelines, with limited discussion, to help determine whether and how to partition a table:
1. Large tables are good candidates for partitioning.
2. Partition on a column that is frequently used as a restrictive query condition.
3. If other factors are equal, partition on a column that is part of the primary index in preference to a column that is not, unless the primary index is seldom, if ever, used for access or joins.
4. While there are few restrictions on the partitioning expression, a partitioning expression is only useful if the Optimizer can effectively apply partition elimination to queries. A simple partitioning expression is more likely to give the maximum amount of partition elimination than a more complex expression. For example, a RANGE_N function on a date column can often be an effective partitioning expression for queries with range constraints on the partitioning column.
5. Use RANGE_N or CASE_N in preference to direct use of a column in most situations. The Optimizer can determine the maximum number of partitions when RANGE_N or CASE_N is used, and will have to assume 65,535 partitions otherwise. Join costing, in particular, can be more accurate when the actual number of partitions is known and fairly small than when the number is assumed to be 65,535.
6. Unless the PI is rarely used for access or direct merge joins, keep the number of partitions fairly small when the partitioning expression uses columns that are not part of the PI.
7. The same considerations regarding the selection of the primary index apply to PPI tables as non-PPI tables. Choose PI columns that provide good distribution and avoid large clumps of duplicate PI values, and which are most commonly used to access individual rows in the table. Sometimes those two considerations conflict, and a reasonable compromise between the two must be reached.

A more detailed description of partitioning guidelines may be found in the Teradata Orange Book: Partitioned Primary Index Usage.

High-Level Trade-off Considerations
The greatest potential gain derived from partitioning a table is the ability to read a small subset of the table instead of the entire table. For example, a query that examines two months of sales from a table with two years of sales history would read about one-twelfth of the table instead of all of it. This can provide a large performance boost for a wide range of queries, day after day, and is automatic. SQL authors need not be aware of the partitioning structure, and no changes are required to existing SQL.

A second potential advantage is faster batch loads. If the table is partitioned by transaction date, nightly loads of transactions for the current day can be dramatically improved. Similarly, the time to delete old rows no longer needed can be dramatically faster (nearly instantaneous in some cases) when the table is partitioned by transaction date.

Finally with Teradata Database V2R6, you can perform archives and restores of selected partitions. This allows for more frequent, but less costly archives. For restores, critical data (for example, in the most recent partitions) can be restored quickly and made available to users without waiting for the entire table to be restored.

In the above situations, the improvement may be even greater when the partitioning structure makes one or more secondary indexes or join indexes redundant, allowing those indexes to be dropped.

Offsetting these gains are some potential disadvantages of partitioning. The first disadvantage is that PI access of the table may be slower when a partitioning column is not part of the PI. This disadvantage can
be offset by choosing partitioning columns that are part of the PI, specifying the values of the partitioning columns and the PI columns, or, in some situations, by defining a secondary index.

A second disadvantage is that direct merge joins involving a partitioned table may be slower unless both tables can be identically partitioned. The disadvantage can be offset when the query conditions allow some partitions to be eliminated from the join.

As in all physical design choices, you must weigh the trade-off considerations and test assumptions to get the best results.

Whether and how to partition the primary index of a table is a physical design choice. The trade-off considerations associated with a PPI should be understood and considered when making the physical design decisions.

The extended logical data model can serve as the starting point for making physical design decisions, but some amount of testing of different scenarios will often be required. As with other physical design decisions, the total workload and relative importance of the workload components must be examined to determine whether the benefits will outweigh the disadvantages for each design decision.

A more detailed description of trade-off considerations may be found in the Teradata Orange Book: Partitioned Primary Index Usage.

Summary
PPI tables can dramatically improve performance of certain types of queries, especially those that access only a small part of a large table. High-volume data load and data maintenance times can also be improved when, for example, the transaction date is specified as the partitioning column.

A partitioned primary index is flexible and easy to use. PPI tables retain the traditional uses of primary indexes to distribute data evenly and provide very fast access when the primary index value is specified in the query.

No changes to existing SQL are necessary. Users accessing a PPI table will see no difference, except perhaps for different average response times.

Teradata and NCR are registered trademarks of NCR Corporation. NCR continually enhances products as new technologies and components become available.
NCR, therefore, reserves the right to change specifications without prior notice. All features, functions, and operations described herein may not be marketed in
all parts of the world. Consult your Teradata representative or visit Teradata.com for more information. No part of this publication may be reprinted or otherwise
reproduced without permission from Teradata.
This document, which includes the information contained herein, is the exclusive property of NCR Corporation. Any person is hereby authorized to view, copy, print,
and distribute this document subject to the following conditions. This document may be used for non-commercial, informational purposes only and is provided on
an AS-IS basis. Any copy of this document or portion thereof must include this copyright notice and all other restrictive legends appearing in this document.
Note that any product, process or technology described in the document may be the subject of other intellectual property rights reserved by NCR and are not
licensed hereunder. No license rights will be implied. Use, duplication or disclosure by the United States government is subject to the restrictions set forth in DFARS
252.227-7013 (c) (1) (ii) and FAR 52.227-19.
2004 NCR Corporation

Dayton, OH U.S.A.

EB-1889 > 1204 > PAGE 14 OF 14

Produced in U.S.A.

All Rights Reserved.
