Teradata SQL

Unleash the Power


Michael J. Larkins
and
Thomas L. Coffing, Jr.
Third Edition 2003
(Includes V2R5 functionality)
Written by Michael J. Larkins and Thomas L. Coffing
Web Page: www.CoffingDW.com
E-Mail addresses:
Mike: TeraTeach@Consultant.com
Tom: Tom.Coffing@CoffingDW.com
Teradata, NCR, and BYNET are registered trademarks of NCR
Corporation, Dayton, Ohio, U.S.A.; IBM and DB2 are registered
trademarks of IBM Corporation; ANSI is a registered trademark of the
American National Standards Institute. The Jeopardy game is a
registered trademark of Parker Brothers and Merv Griffin. In addition
to these product names, all brands and product names in this
document are registered names or trademarks of their respective
holders.
Coffing Data Warehousing shall have neither liability nor responsibility
to any person or entity with respect to any loss or damages arising
from the information contained in this book or from the use of
programs or program segments that are included. The manual is not a
publication of NCR Corporation, nor was it produced in conjunction
with NCR Corporation.
Copyright 2001 by Coffing Publishing
All rights reserved. No part of this book shall be reproduced, stored in
a retrieval system, or transmitted by any means, electronic,
mechanical, photocopying, recording, or otherwise, without written
permission from the publisher. No patent liability is assumed with
respect to the use of information contained herein. Although every
precaution has been taken in the preparation of this book, the
publisher and author assume no responsibility for errors or omissions,
neither is any liability assumed for damages resulting from the use of
information contained herein. For information, address:
Coffing Publishing
7810 Kiester Rd.
Middletown, OH 4504
International Standard Book Number: !"#$ 0%&704&80%'%&
Printed in the United States of America
All terms mentioned in this book that are known to be trademarks or
service marks have been stated. Coffing Publishing cannot attest to the
accuracy of this information. Use of a term in this book should not be
regarded as affecting the validity of any trademark or service mark.

Acknowledgements and Special Thanks
Todd Walter, NCR, for providing access to his people regarding V2R4
system information.
Paul Sinclair, NCR, for providing V2R4 information on Stored
Procedures and reviewing the stored procedures chapter, information
on the new V2R4 OLAP functionality, and for information regarding the
new UPSERT command.
Fred Pluebell, JCPenney Corp., for providing V2R4 system availability
while we were teaching in Dallas.
A special thanks to the staff at Nationwide Insurance for letting us
teach an early V2R4 update class and helping finalize some additional
syntax when creating stored procedures.
Larry Carter and Paul DeRouin, NCR, for information on changes to
triggers in V2R4.
Bill Putnam for assistance in obtaining V2R4 information.
Chris Coffing, Coffing Data Warehousing, for dedication in getting our
system up on V2R4 so that we didn't have to "borrow" so much
system time.
We have a very special thank you for Loraine Larkins. She is Mike's
Mom and an excellent proof-reader and barometer for the ease of
understanding the material. This is especially true for someone who
was not SQL literate when this whole thing started.
Last, but far from least, we want to thank God for providing us with
the inspiration, dedication and fortitude to finish this book.

Teradata Introduction
The world's largest data warehouses commonly use the superior
technology of NCR's Teradata relational database management system
(RDBMS). A data warehouse is normally loaded directly from
operational data. The majority, if not all of this data will be collected
on-line as a result of normal business operations. The data warehouse
therefore acts as a central repository of the data that reflects the
effectiveness of the methodologies used in running a business.
As a result, the data loaded into the warehouse is mostly historic in
nature. To get a true representation of the business, normally this data
is not changed once it is loaded. Instead, it is interrogated repeatedly
to transform data into useful information, to discover trends and the
effectiveness of operational procedures. This interrogation is based on
business rules to determine such aspects as profitability, return on
investment and evaluation of risk.
For example, an airline might load all of its maintenance activity on
every aircraft into the database. Subsequent investigation of the data
could indicate the frequency at which certain parts tend to fail. Further
analysis might show that the parts are failing more often on certain
models of aircraft. The first benefit of the new found knowledge
regards the ability to plan for the next failure and maybe even the type
of airplane on which the part will fail. Therefore, the part can be on
hand when and maybe where it is needed, or the part might be
proactively changed prior to its failure.
If the information reveals that the part is failing more frequently on a
particular model of aircraft, this could be an indication that the aircraft
manufacturer has a problem with the design or production of that
aircraft. Another possible cause is that the maintenance crew is doing
something incorrectly and contributing to the situation. Either way, you
cannot fix a problem if you do not know that a problem exists. There is
incredible power and savings in this type of knowledge.
Another business area where the Teradata database excels is in retail.
It provides an environment that can store billions of sales. This is a
critical capability when you are recording and analyzing the sales of
every item in every store around the world. Whether it is used for
inventory control, marketing research or credit analysis, the data
provides an insight into the business. This type of knowledge is not
easily attainable without detailed data that records every aspect of the
business. Tracking inventory turns, stock replenishment, or predicting
the number of goods needed in a particular store yields a priceless
perspective into the operation of a retail outlet. This information is
what enables one retailer to thrive while others go out of business.
Teradata is flourishing with the realization that detail data is critical to
the survival of a business in a competitive, lower margin environment.
Continually, businesses are forced to do more with less. Therefore, it is
vital to maximize the efforts that work well to improve profit and
minimize or correct those that do not work.
One computer vendor used these same techniques to determine that it
cost more to sell into the desktop environment than was realized in
profit. Prior to this realization, the sales effort had attempted to make
up the loss by selling more computers. Unfortunately, increased sales
meant increased losses. Today, that company is doing much better and
has made a huge step into profitability by discontinuing the small
computer line.

Teradata Architecture
The Teradata database normally runs on NCR Corporation's
WorldMark Systems in the UNIX MP-RAS environment. Some of these
systems consist of a single processing node (computer) while others
are several hundred nodes working together in a single system. The
NCR nodes are based entirely on industry standard CPU processor
chips, standard internal and external bus architectures like PCI and
SCSI, and standard memory modules with 4-way interleaving for
speed.
At the same time, Teradata can run on any hardware server in the
single node environment when the system runs Microsoft NT or
Windows 2000. This single node may be any computer from a large
server to a laptop.
Whether the system consists of a single node or is a massively parallel
system with hundreds of nodes, the Teradata RDBMS uses the exact
same components executing on all the nodes in parallel. The only
difference between small and large systems is the number of
processing components.
When these components exist on different nodes, it is essential that
the components communicate with each other at high speed. To
facilitate the communications, the multi-node systems use the BYNET
interconnect. It is a high speed, multi-path, dual redundant
communications channel. Another amazing capability of the BYNET is
that the bandwidth increases with each consecutive node added into
the system. There is more detail on the BYNET later in this chapter.
Teradata Components
As previously mentioned, Teradata is the superior product today
because of its parallel operations based on its architectural design. It is
the parallel processing by the major components that provides the
power to move mountains of data. Teradata works more like the early
Egyptians who built the pyramids without heavy equipment using
parallel, coordinated human efforts. It uses smaller nodes running
several processing components all working together on the same user
request. Therefore, a monumental task is completed in record time.
Teradata operates with three major components to achieve the parallel
operations. These components are called: Parsing Engine Processors,
Access Module Processors and the Message Passing Layer. The role of
each component is discussed in the next sections to provide a better
understanding of Teradata. Once we understand how Teradata works,
we will pursue the SQL that allows storage and access of the data.
Parsing Engine Processor (PEP or PE)
The Parsing Engine Processor (PEP) or Parsing Engine (PE), for short,
is one of the two primary types of processing tasks used by Teradata.
It provides the entry point into the database for users on mainframe
and networked computer systems. It is the primary director task
within Teradata.
As users "logon" to the database they establish a Teradata session.
Each PE can manage 120 concurrent user sessions. Within each of
these sessions users submit SQL as a request for the database server
to take an action on their behalf. The PE will then parse the SQL
statement to establish which database objects are involved. For now,
let's assume that the database object is a table. A table is a two-
dimensional array that consists of rows and columns. A row represents
an entity stored in a table and it is defined using columns. An example
of a row might be the sale of an item and its columns include the UPC,
a description and the quantity sold.
Any action a user requests must also go through a security check to
validate their privileges as defined by the database administrator. Once
their authorization at the object level is verified, the PE will verify that
the columns requested actually exist within the objects referenced.
Next, the PE optimizes the SQL to create an execution plan that is as
efficient as possible based on the amount of data in each table, the
indices defined, the type of indices, the selectivity level of the indices,
and the number of processing steps needed to retrieve the data. The
PE is responsible for passing the optimized execution plan to other
components as the best way to gather the data.
An execution plan might use the primary index column assigned to the
table, a secondary index or a full table scan. The use of an index is
preferable and will be discussed later in this chapter. For now, it is
sufficient to say that a full table scan means that all rows in the table
must be read and compared to locate the requested data.
Although a full table scan sounds really bad, within the architecture of
Teradata, it is not necessarily a bad thing because the data is divided
up and distributed to multiple, parallel components throughout the
database. We will look next at the AMPs that perform the parallel disk
access using their file system logic. The AMPs manage all data storage
on disks. The PE has no disks.
Activities of a PE:
Convert incoming requests from EBCDIC to ASCII (if from an IBM mainframe)
Parse the SQL to determine type and validity
Validate user privileges
Optimize the access path(s) to retrieve the rows
Build an execution plan with necessary steps for row access
Send the plan steps to Access Module Processors (AMP) involved
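
The PE activities listed above can be pictured as a small pipeline. The Python sketch below is only an illustration under stated assumptions: the catalog, the privilege list, the function names and the tiny SELECT parser are all made up for this example and are not part of Teradata. It leaves out the EBCDIC conversion and any real optimization, and it always plans a full table scan.

```python
# Toy model of the PE steps. Every name here is hypothetical;
# this is not Teradata code and not a Teradata API.

CATALOG = {"sales": {"upc", "description", "quantity"}}  # table -> its columns
PRIVILEGES = {("mike", "sales")}                          # (user, table) pairs allowed to read


def parse(sql):
    """Tiny parser: accepts only 'SELECT col[, col...] FROM table'."""
    words = sql.replace(",", " ").split()
    if len(words) < 4 or words[0].upper() != "SELECT" or words[-2].upper() != "FROM":
        raise ValueError("only SELECT <columns> FROM <table> is supported")
    return {"columns": [w.lower() for w in words[1:-2]], "table": words[-1].lower()}


def build_plan(user, sql, amp_count):
    """Parse, check privileges, check columns, then emit one step per AMP."""
    request = parse(sql)                                  # determine type and validity
    table, columns = request["table"], request["columns"]
    if (user, table) not in PRIVILEGES:                   # validate user privileges
        raise PermissionError(f"{user} may not read {table}")
    missing = [c for c in columns if c not in CATALOG[table]]
    if missing:                                           # verify that the columns exist
        raise LookupError(f"unknown columns: {missing}")
    # No index information is modeled, so the "plan" is always a full
    # table scan: one step is sent to every AMP.
    return [{"amp": amp, "action": "scan", "table": table, "columns": columns}
            for amp in range(amp_count)]
```

Calling build_plan("mike", "SELECT upc, quantity FROM sales", 4) returns four steps, one for each AMP, while the same request from a user who is not in the privilege list stops at the security check.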
Access Module Processor (AMP)
The next major component of Teradata's parallel architecture is called
an Access Module Processor (AMP). It stores and retrieves the
distributed data in parallel. Ideally, the data rows of each table are
distributed evenly across all the AMPs. The AMPs read and write data
and are the workhorses of the database. Their job is to receive the
optimized plan steps, built by the PE after it completes the
optimization, and execute them. The AMPs are designed to work in
parallel to complete the request in the shortest possible time.
Optimally, every AMP should contain a subset of all the rows loaded
into every table. Dividing up the data automatically divides up
the work of retrieving the data. Remember, all work comes as a result
of a user's SQL request. If the SQL asks for a specific row, that row
exists in its entirety (all columns) on a single AMP and other rows exist
on the other AMPs.
If the user request asks for all of the rows in a table, every AMP should
participate along with all the other AMPs to complete the retrieval of all
rows. This type of processing is called an all AMP operation and an all
rows scan. However, each AMP is only responsible for its rows, not the
rows that belong to a different AMP. As far as each AMP is concerned,
it owns all of the rows. Within Teradata, the AMP environment is
a "shared nothing" configuration. The AMPs cannot access each other's
data rows, and there is no need for them to do so.
Once the rows have been selected, the last step is to return them to
the client program that initiated the SQL request. Since the rows are
scattered across multiple AMPs, they must be consolidated before
reaching the client. This consolidation process is accomplished as a
part of the transmission to the client so that a final comprehensive sort
of all the rows is never performed. Instead, all AMPs sort only their
rows (at the same time, in parallel) and the Message Passing Layer is
used to merge the rows as they are transmitted from all the AMPs.
Therefore, when a client wishes to sequence the rows of an answer
set, this technique causes the sort of all the rows to be done in
parallel. Each AMP sorts only its subset of the rows at the same time
all the other AMPs sort their rows. Once all of the individual sorts are
complete, the BYNET merges the sorted rows. Pretty brilliant!
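
The sort-then-merge idea can be tried out in a few lines of Python. This is a minimal sketch, assuming each AMP's rows are a simple list of numbers. The function names are invented for the illustration, and the per-AMP sorts run one after another here, where the real AMPs sort at the same time.

```python
import heapq


def amp_sort(rows):
    """Each AMP sorts only the rows that it owns."""
    return sorted(rows)


def merge_answer_set(rows_per_amp):
    """Merge the already-sorted AMP streams; no global re-sort is done."""
    sorted_streams = [amp_sort(rows) for rows in rows_per_amp]
    # heapq.merge repeatedly takes the smallest head among the streams,
    # which plays the role the text gives to the BYNET merge.
    return list(heapq.merge(*sorted_streams))


rows_per_amp = [[42, 7, 19], [3, 88, 21], [56, 1, 30]]
answer = merge_answer_set(rows_per_amp)
print(answer)  # [1, 3, 7, 19, 21, 30, 42, 56, 88]
```

The merge only ever compares the next row from each AMP, so the cost of ordering the full answer set is spread across the AMPs instead of landing on one component.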
Activities of the AMP:
Store and retrieve data rows using the file system
Aggregate data
Join processing between multiple tables
Convert ASCII returned data to EBCDIC (IBM mainframes only)
Sort and format output data
Message Passing Layer (BYNET)
The Message Passing Layer varies depending on the specific hardware
on which the Teradata database is executing. In the latter part of the
20th century, most Teradata database systems executed under the
UNIX operating system. However, in 1998, Teradata was released on
Microsoft's NT operating system. Today it also executes under Windows
2000. The initial release of Teradata, on the Microsoft systems, is for a
single node.
When using the UNIX operating system, Teradata supports up to 512
nodes. This massively parallel system establishes the basis for storing
and retrieving data from the largest commercial databases in the
world, Teradata. Today, the largest system in the world consists of 176
nodes. There is much room for growth as the databases begin to
exceed 40 or 50 terabytes.
For the NCR UNIX systems, the Message Passing Layer is called the
BYNET. The amazing thing about the BYNET is its capacity. Instead of a
fixed bandwidth that is shared among multiple nodes, the bandwidth of
the BYNET increases as the number of nodes increases. This feat is
accomplished as a result of using virtual circuits instead of using a
single fixed cable or a twisted pair configuration.
To understand the workings of the BYNET, think of a telephone switch
used by local and long distance carriers. As more and more people
place phone calls, no one needs to speak slower. As one switch
becomes saturated, another switch is automatically used. When your
phone call is routed through a different switch, you do not need to
speak slower. If a natural or other type of disaster occurs and a switch
is destroyed, all subsequent calls are routed through other switches.
The BYNET is designed to work like a telephone switching network.
An additional aspect of the BYNET is that it is really two connection
paths, like having two phone lines for a business. The redundancy
allows for two different aspects of its performance. The first aspect is
speed. Each path of the BYNET provides bandwidth of 10 Megabytes
(MB) per second with Version 1 and 60 MB per second with Version 2.
Therefore the aggregate speed of the two connections is 20MB/second
or 120MB/second. However, as mentioned earlier, the bandwidth grows
linearly as more nodes are added. Using Version 1 any two nodes
communicate at 40MB/second (10MB/second x 2 BYNETs x 2 nodes).
Therefore, 10 nodes can utilize 200MB/second and 100 nodes have
2000MB/second available between them. When using the Version 2
BYNET, the same 100 nodes communicate at 12,000MB/second
(60MB/second x 2 BYNETs x 100 nodes).
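
The bandwidth figures above all follow one formula: per-path speed times the number of paths times the number of nodes. The short Python sketch below simply restates that arithmetic. The function and constant names are our own, and the linear model is the one described in the text, not a measurement.

```python
# MB/second per BYNET path, by version (figures quoted in the text).
PER_PATH_MB_PER_SEC = {1: 10, 2: 60}
PATHS = 2  # the BYNET is two connection paths


def bynet_bandwidth(version, nodes, working_paths=PATHS):
    """Aggregate MB/second under the text's linear-growth model."""
    return PER_PATH_MB_PER_SEC[version] * working_paths * nodes


print(bynet_bandwidth(1, 2))     # 40
print(bynet_bandwidth(1, 10))    # 200
print(bynet_bandwidth(1, 100))   # 2000
print(bynet_bandwidth(2, 100))   # 12000
```

Passing working_paths=1 models an outage of one connection, which halves each of these numbers.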
The second and equally important aspect of the BYNET uses the two
connections for availability. Regardless of the speed associated with
each BYNET connection, if one of the connections should fail, the
second is completely independent and can continue to function at its
individual speed without the other connection. Therefore,
communications continue to pass between all nodes.
Although the BYNET is performing at half the capacity during an
outage, it is still operational and SQL is able to complete without
failing. In reality, when the BYNET is performing at only 10MB/second
per node, it is still a lot faster than many normal networks that
typically transfer messages at 10MB per second.
All messages going across the BYNET offer guaranteed delivery. So,
any messages not successfully delivered because of a failure on one
connection automatically route across the other connection. Since half
of the BYNET is not working, the bandwidth reduces by half. However,
when the failed connection is returned to service, its topology is
automatically configured back into service and it begins transferring
messages along with the other connection. Once this occurs, the
capacity returns to normal.

A Teradata Database
Within Teradata, a database is a storage location for database objects
(tables, views, macros, and triggers). An administrator can use Data
Definition Language (DDL) to establish a database by using a CREATE
DATABASE command.
A database may have PERMANENT (PERM) space allocated to it. This
PERM space establishes the maximum amount of disk space for storing
user data rows in any table located in the database. However, if no
tables are stored within a database, it is not required to have PERM
space. Although a database without PERM space cannot store tables, it
can store views and macros because they are physically stored in the
Data Dictionary (DD) PERM space and require no user storage space.
The DD is in a "database" called DBC.
Teradata allocates PERM space to tables, up to the maximum, as rows
are inserted. The space is not pre-allocated. Instead, it is allocated as
rows are stored in blocks on disk. The maximum block size is defined
either at a system level in the DBS Control Record, at the database
level or individually for each table. Like PERM, the block size is a
maximum size. Yet, it is only a maximum for blocks that contain
multiple rows. By nature, the blocks are variable in length. So, disk
space is not pre-allocated; instead, it is allocated on an as needed
basis, one sector (512 bytes) at a time. Therefore, the largest possible
wasted disk space in a block is 511 bytes.
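
The 511-byte figure is plain ceiling arithmetic on 512-byte sectors. A minimal Python sketch, with function names of our own choosing, makes the reasoning concrete:

```python
SECTOR_BYTES = 512  # one sector, as stated in the text


def sectors_needed(block_bytes):
    """Whole sectors required to hold a block (ceiling division)."""
    return -(-block_bytes // SECTOR_BYTES)


def wasted_bytes(block_bytes):
    """Allocated bytes that the block does not actually use."""
    return sectors_needed(block_bytes) * SECTOR_BYTES - block_bytes


print(wasted_bytes(1024))  # 0   (exactly two sectors)
print(wasted_bytes(513))   # 511 (one byte spills into a second sector)
```

The worst case is a block that needs just one byte of a new sector, which leaves the other 511 bytes of that sector unused.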
A database can also have SPOOL space associated with it. All users
who run queries need workspace at some point in time. This SPOOL
space is workspace used for the temporary storage of rows during the
execution of user SQL statements. Like PERM space, SPOOL is defined
as a maximum amount that can be used within a database or by a
user. Since PERM is not pre-allocated, unused PERM space is
automatically available for use as SPOOL. This maximizes the disk
space throughout the system.
It is a common practice in Teradata to have some databases with
PERM space that contain only tables. Then, other databases contain
only views. These view databases require no PERM space and are the
only databases that users have privileges to access. The views in these
databases control all access to the real tables in other databases. They
insulate the actual tables from user access. There will be more on
views later in this book.
The newest type of space allocation within Teradata is
TEMPORARY (TEMP) space. A database may or may not have TEMP
space; however, it is required if Global Temporary Tables are used. The
use of temporary tables is also covered in more detail later in the SQL
portion of this book.
A database is defined using a series of parameter values at creation
time. The majority of the parameters can easily be changed after a
database has been created using the MODIFY DATABASE command.
However, when attempting to increase PERM or TEMP space
maximums, there must be sufficient disk space available even though
it is not immediately allocated. There may not be more PERM space
defined than actual disk on the system.
A number of additional database parameters are listed below along
with the user parameters in the next section. These parameters are
tools for the database administrator and other experienced users when
establishing databases for tables and views.
CREATE / MODIFY DATABASE Parameters
PERMANENT
TEMPORARY
SPOOL
ACCOUNT
FALLBACK
JOURNAL
DEFAULT JOURNAL

Teradata Users
In Teradata, a user is the same as a database with one exception. A
user is able to logon to the system and a database cannot. Therefore,
to authenticate the user, a password must be established. The
password is normally established at the same time that the CREATE
USER statement is executed. The password can also be changed using
a MODIFY USER command.
Like a database, a user area can contain database objects (tables,
views, macros and triggers). A user can have PERM and TEMP space
and can also have spool space. On the other hand, a user might not
have any of these types of space, exactly the same as a database.
The biggest difference between a database and a user is that a user
must have a password. Otherwise, the similarity between the two makes
administering the system easier and allows for default values that all
databases and users can inherit.
The next two lists regard the creation and modification of databases
and users.
{ CREATE | MODIFY } DATABASE or USER (in common)
PERMANENT
TEMPORARY
SPOOL
ACCOUNT
FALLBACK
JOURNAL
DEFAULT JOURNAL
{ CREATE | MODIFY } USER (only)
PASSWORD
STARTUP
DEFAULT DATABASE
By no means are these all of the parameters. It is not the intent of this
chapter, nor the intent of this book to teach database administration.
There are reference manuals and courses available to use. Teradata
administration warrants a book by itself.

Symbols Used in this Book
Since there are no standard symbols for teaching SQL, it is necessary
to understand some of the symbols used in our syntax diagrams
throughout this book.
This chart should be used as a reference for SQL syntax used in the
book:
<database-name>          Substitute an actual database name in this location
<table-name>             Substitute an actual table name in this location
<comparison>             Substitute a comparison in this location, e.g. a=1
<column-name>            Substitute an actual column name in this location
<data-value>             Substitute a literal data value in this location
[ optional entry ]       Everything between the [ ] is optional, not required
                         to be valid syntax, use when needed
{ use this | or this }   Use one of the keywords or symbols on either side of
                         the "|", but not both. E.g. { LEFT | RIGHT } means
                         use either "LEFT" or "RIGHT" but not both
Figure 1-1

DATABASE Command
When users negotiate a successful logon to Teradata, they are
automatically positioned in a default database as defined by the
database administrator. When an SQL request is executed, by default,
it looks in the current database for all referenced objects.
There may be times when the object is not in the current database.
When this happens, the user has one of two choices to resolve this
situation. One solution is to qualify the name of the object along with
the name of the database in which it resides. To do this, the user
simply associates the database name to the object name by
connecting them with a period (.) or dot as shown below:
<database-name>.<table-name>
The second solution is to use the DATABASE command. It repositions
the user to the specified database. After the DATABASE command is
executed, there is no longer a need to qualify the objects in that
database. Of course, if the SQL statement references additional
objects in another database, they will have to be qualified in order for
the system to locate them. Normally, you will DATABASE to the
database that contains most of the objects that you need. This
reduces the number of object names requiring qualification.
The following is the syntax for the DATABASE command.
DATABASE <database-name> ;
If you are not sure what database you are in, either the HELP
SESSION or SELECT DATABASE command may be used to make that
determination. These commands and other HELP functions are covered
in the SQL portion of this book.

Use of an Index
Although a relational data model uses Primary Keys and Foreign
Keys to establish the relationships between tables, that design is a
Logical Model. Each vendor uses specialized techniques to implement a
Physical Model. Teradata does not use keys in its physical model.
Instead, Teradata is implemented using indices, both primary and
secondary.
The Primary Index (PI) is the most important index in all of Teradata.
The performance of Teradata can be linked directly to the selection of
this index. The data value in the PI column(s) is submitted to the
hashing function. The resulting row hash value is used to map the row
to a specific AMP for data distribution and storage.
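
The mapping from PI value to AMP can be sketched in Python. Please treat this as an assumption-laden illustration only: Teradata's real hashing function is not shown in this book, so the sketch borrows CRC-32 from Python's zlib module as a stand-in, and it maps the hash to an AMP with a simple modulo. The function names are our own.

```python
import zlib


def row_hash(pi_values):
    """32-bit stand-in for the row hash; NOT Teradata's algorithm."""
    key = "|".join(str(v) for v in pi_values).encode("utf-8")
    return zlib.crc32(key)


def owning_amp(pi_values, amp_count):
    """Map the row hash to one AMP (a plain modulo, assumed for the sketch)."""
    return row_hash(pi_values) % amp_count


# The same PI value always lands on the same AMP.
first = owning_amp(("A100",), 8)
second = owning_amp(("A100",), 8)
print(first == second)  # True
```

What matters is the property, not the particular hash: the calculation is repeatable, so the AMP that stored a row can be found again from the PI value alone.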
To illustrate this concept, I have on several occasions used two decks
of cards. Imagine if you will, fourteen people in a room. To the largest,
most powerful looking man in the room, you give one of the decks of
cards. His large hands allow him to hold all fifty-two cards at one time,
with some degree of success. The cards are arranged with the ace of
spades continuing through the king of spades in ascending order. After
the spades, are the hearts, then the clubs and last, the diamonds.
Each suit is arranged starting with the ace and ascending up to the
king. The cards are partitioned by suit.
The other deck of cards is divided among the other thirteen people.
Using this procedure, all cards with the same value (e.g. aces) go to
the same person. Likewise, all the deuces, treys and subsequent cards
each go to one of the thirteen people. Each of the four cards will be in
the same order as the suits contained in the single deck that went to
the lone man: spades, hearts, clubs and diamonds. Once all the cards
have been distributed, each of the thirteen people will be holding four
cards of the same value (4x13=52). Now, the game can begin.
The requests in this game come in the form of "give-me," one or more
cards.
To make it easy for the lone player, we first request: give-me the ace
of spades. The person with four aces finds their ace, as does the lone
player with all 52 cards, both on the top of their cards. That was
easy!
As the difficulty of the give-me requests increases, the level of difficulty
dramatically increases for the lone man. For instance, when the give-
me request is for all of the twos, one of the thirteen people holds up all
four of their cards and they are done. The lone man must locate the 2
of spades between the ace and trey. Then, go and locate the 2 of
hearts, thirteen cards later between the ace and trey. Then, find the 2
of clubs, thirteen cards after that, as well as the 2 of diamonds,
thirteen cards after that to finally complete the request.
Another request might be give-me all of the diamonds. For the thirteen
people, each person locates and holds up one of their cards and
the request is finished. For the lone person with the single deck, the
request means finding and holding up the last thirteen cards in their
deck of fifty-two. In each of these give-me requests, the lone man had
to negotiate all fifty-two cards while the thirteen other people only
needed to determine which of the four cards applied to the request, if
any. This is the same procedure used by Teradata. It divides up the
data like we divided up the cards.
As illustrated, the thirteen people are faster than the lone man.
However, the game is not limited to thirteen players. If there were 26
people who wished to play on the same team, the cards simply need to
be divided or distributed differently.
When using the value (ace through king) there are only 13 unique
values. In order for 26 people to play, we need a way to come up with
26 unique values for 26 people. To make the cards more unique, we
might combine the value of the card (e.g. ace) with the color.
Therefore, we have two red aces and two black aces as well as two
sets for every other card. Now when we distribute the cards, each of
the twenty-six people receives only two cards instead of the original
four. The distribution is still based on fifty-two cards (2 times 26).
At the same time, the optimum number of people for the game is not
26. Based on what has been discussed so far, what is the optimum
number of people?
If your answer is 52, then you are absolutely correct.
With this many people, each person has one and only one card. Any
time a give-me is requested of the participants, their one card either
qualifies or it does not. It doesn't get any simpler or faster than this
situation.
As easy as this sounds, to accomplish this distribution the value of the
card alone is not sufficient to manifest 52 unique values. Neither is
using the value and the color. That combination only gives us a
distribution of 26 unique values when 52 unique values are desired.
To achieve this distribution we need to establish still more uniqueness.
Fortunately, we can use the suit along with the value. Therefore, the
ace of spades is different than the ace of hearts, which is different
from the ace of clubs and the ace of diamonds. In other words, there
are now 52 unique identities to use for distribution.
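
The counts in the card game (13, 26 and 52) can be checked by building a deck and counting distinct keys. The Python below is just a sketch of that counting; the names DECK and distinct_keys are made up for the illustration.

```python
from itertools import product

VALUES = ["ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
          "jack", "queen", "king"]
SUITS = {"spades": "black", "hearts": "red",
         "clubs": "black", "diamonds": "red"}   # suit -> color

# One (value, suit) pair per card: 13 values x 4 suits = 52 cards.
DECK = [(value, suit) for suit, value in product(SUITS, VALUES)]


def distinct_keys(key_of):
    """How many different distribution keys does this choice produce?"""
    return len({key_of(card) for card in DECK})


by_value = distinct_keys(lambda card: card[0])                         # value only
by_value_color = distinct_keys(lambda card: (card[0], SUITS[card[1]])) # value + color
by_value_suit = distinct_keys(lambda card: card)                       # value + suit

print(by_value, by_value_color, by_value_suit)  # 13 26 52
```

Each extra column in the key doubles the number of distinct identities here, which is exactly why the players can grow from 13 to 26 to 52.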
To relate this distribution to Teradata, one or more columns of a table
are chosen to be the Primary Index.
Primary Index
The Primary Index can consist of up to sixteen different columns.
These columns, when considered together, provide a comprehensive
technique to derive a Unique Primary Index (UPI, pronounced as "you-
pea") value as we discussed previously regarding the card analogy.
That is the good news.
To store the data, the value(s) in the PI are hashed via a calculation to
determine which AMP will own the data. The same data values always
hash to the same row hash and therefore are always associated with the
same AMP.
The advantage to using up to sixteen columns is that row distribution
is very smooth, or even, because it is based on unique values. This
simply means that each AMP contains about the same number of rows.
At the same time, there is a downside to using several columns for a PI.
The PE needs every data value for each column as input to the hashing
calculation to directly access a particular row. If a single column value
is missing, a full table scan will result because the row hash cannot be
recreated. Any row retrieval that supplies a value for every PI column
is always an efficient, one AMP operation.
4lthough uniAueness is good in most cases, Teradata does not reAuire
that a 2'I &e used. It also allo)s for a -on+2niAue 'rimar#
Inde<(-2'I, 0ronounced as ne)+0ea$. The 0otential do)nside of a
-2'I is that if se5eral du0licate 5alues (-2'I du0s$ are stored, the# all
go to the same 4M'. This can cause an une5en distri&ution that 0laces
more ro)s on some of the 4M's than on others. This means that an#
time an 4M' )ith a larger num&er of ro)s is in5ol5ed, it has to )ork
harder than the other 4M's. The other 4M's )ill finish &efore the
slo)er 4M'. The time to 0rocess a single user reAuest is al)a#s &ased
on the slo)est 4M'. Therefore, serious consideration should &e used
)hen making the decision to use a -2'I.
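The effect of NUPI duplicates on distribution can be illustrated with a small simulation. This is a sketch under stated assumptions: MD5 again stands in for the real hash, four AMPs are assumed, and the row counts are invented purely for the demonstration.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size


def amp_for(value):
    """Assumed stand-in for the row hash: MD5 of the value, modulo AMP count."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


def rows_per_amp(pi_values):
    """Count how many rows each AMP would own for the given PI values."""
    counts = Counter(amp_for(v) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(NUM_AMPS)]


# 1000 distinct values (UPI-like) versus 900 copies of one value plus
# 100 distinct values (a heavily duplicated NUPI).
upi_counts = rows_per_amp(range(1000))
nupi_counts = rows_per_amp(["FR"] * 900 + list(range(100)))

print("UPI :", upi_counts)
print("NUPI:", nupi_counts)
```

In the second case one AMP owns at least 900 of the 1000 rows, so every all-AMP request waits on that AMP.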
Every table must have a PI and it is established when the table is
created. If the CREATE TABLE statement contains: UNIQUE PRIMARY
INDEX ( <column-list> ), the value in the column(s) will be distributed
to an AMP as a UPI. However, if the statement reads: PRIMARY INDEX
( <column-list> ), the value in the column(s) will be distributed as a
NUPI and allow duplicate values. Again, all the same values will go to
the same AMP.
If the DDL statement does not specify a PI, but it specifies a PRIMARY
KEY (PK), the named column(s) are used as the UPI. Although
Teradata does not use primary keys, the DDL may be ported from
another vendor's database system.
A UPI is used because a primary key must be unique and cannot be
null. By default, both UPIs and NUPIs allow a null value to be stored
unless the column definition indicates that null values are not allowed
using a NOT NULL constraint.
Now, with that being said, when considering JOIN accesses on the
tables, sometimes it is advantageous to use a NUPI. This is because
the rows being joined between tables must be on the same AMP. If
they are not on the same AMP, one of the rows must be moved to the
same AMP as the matching row. Teradata will use one of two different
strategies to temporarily move rows. It can copy all needed rows to all
AMPs or it can redistribute them using the hashing mechanism on the
column defined as the join domain that is a PI. However, if neither join
column is a PI, it might be necessary to redistribute all participating
rows from both tables by hash code to get them together on a single
AMP.
Planning data distribution, using access characteristics, can reduce the
amount of data movement and therefore improve join performance.
This works fine as long as there is a consistent number of duplicate
values or only a small number of duplicate values. The logical data
model needs to be extended with usage information in order to know
the best way to distribute the data rows. This is done during the
physical implementation phase before creating tables.
Secondary Index
A Secondary Index (SI) is used in Teradata as a way to directly access
rows in the data, sometimes called the base table, without requiring
the use of PI values. Unlike the PI, an SI does not affect the
distribution of the data rows. Instead, it is an alternate read path and
allows for a method to locate the PI value using the SI. Once the PI is
obtained, the row can be directly accessed using the PI. Like the PI, an
SI can consist of up to 16 columns.
In order for an SI to retrieve the data row by way of the PI, it must
store and retrieve an index row. To accomplish this Teradata creates,
maintains and uses a subtable. The PI of the subtable is the value in
the column(s) that are defined as the SI. The "data" stored in the
subtable row is the previously hashed value of the real PI for the data
row or rows in the base table. The SI is a pointer to the real data row
desired by the request. An SI can also be unique (USI, pronounced as
"you-sea") or non-unique (NUSI, pronounced as "new-sea").
The rows of the subtable contain the row hashed value of the SI, the
actual data value(s) of the SI, and the row hashed value of the PI as
the row ID. Once the row ID of the PI is obtained from the subtable
row, using the hashed value of the SI, the last step is to get the actual
data row from the AMP where it is stored. The action and hashing for
an SI is exactly the same as when starting with a PI. When using a
USI, the access of the subtable is a one AMP operation and then
accessing the data row from the base table is another one AMP
operation. Therefore, USI accesses are always a two AMP operation
based on two separate row hash operations.
When using a NUSI, the subtable access is always an all AMP operation.
Since the data is distributed by the PI, NUSI duplicate values may
exist and probably do exist on multiple AMPs. So, the best plan is to go
to all AMPs and check for the requested NUSI value. To make this
more efficient, each AMP scans its subtable. These subtable rows
contain the row hash of the NUSI, the value of the data that created
the NUSI and one or more row IDs for all the PI rows on that AMP. This
is still a fast operation because these rows are quite small and several
are stored in a single block. If the AMP determines that it contains no
rows for the value of the NUSI requested, it is finished with its portion
of the request. However, if an AMP has one or more rows with the
NUSI value requested, it then goes and retrieves the data rows into
spool space using the index.
With this said, the SQL optimizer may decide that there are too many
base table data rows to make index access efficient. When this
happens, the AMPs will do a full base table scan to locate the data
rows and ignore the NUSI. This situation is called a weakly selective
NUSI. Even using old-fashioned indexed sequential files, it has always
been more efficient to read the entire file and not use an index if more
than 15% of the records were needed. This is compounded with
Teradata because the "file" is read in parallel instead of all data from a
single file. So, the efficiency percentage is probably closer to being less
than 3% of all the rows in order to use the NUSI.
If the SQL does not use a NUSI, you should consider dropping it, due
to the fact that the subtable takes up PERM space with no benefit to
the users. The Teradata EXPLAIN is covered in this book and it is the
easiest way to determine if your SQL is using a NUSI. Furthermore,
the optimizer will never use a NUSI without STATISTICS.
There has been another evolution in the use of NUSI processing. It is
called NUSI Bitmapping. If a table has two different NUSI indices that
are individually weakly selective, but that can be bitmapped together
to eliminate most of the non-conforming rows, the optimizer will use
the two NUSI columns together because in combination they become
highly selective. Therefore, many times, it is better to use smaller
individual NUSI indices instead of a large composite (more than one
column) NUSI.
There is another feature related to NUSI processing that can improve
access time when a value range comparison is requested. When using
hash values, it is impossible to determine any value within the range.
This is because large data values can generate small hash values and
small data values can produce large hash values. So, to overcome the
issue associated with a hashed value, there is a range feature called
Value Ordered NUSIs. At this time, it may only be used with a four
byte or smaller numeric data column. Based on its functionality, a
Value Ordered NUSI is perfect for date processing. See the DDL chapter
in this book for more details on USI and NUSI usage.

Determining the Release of Your Teradata System:
SELECT * FROM DBC.DBCINFO;
InfoKey   InfoData
-------   ---------------
RELEASE   V2R.04.00.02.26
VERSION   04.00.02.27

Fundamental Structured Query Language (SQL)
The access language for all modern relational database systems
(RDBMS) is Structured Query Language (SQL). It has evolved over
time to be the standard. The ANSI SQL group defines which commands
and functionality all vendors should provide within their RDBMS.
There are three levels of compliance within the standard: Entry,
Intermediate and Full. The three level definitions are based on specific
commands, data types and functionalities. So, it is not that a vendor
has incorporated some percentage of the commands; it is more that
each command is categorized as belonging to one of the three levels.
For instance, most data types are Entry level compliant. Yet, there are
some that fall into the Intermediate and Full definitions.
Since the standard continues to grow with more options being added,
it is difficult to stay fully ANSI compliant. Additionally, all
RDBMS vendors provide extra functionality and options that are not
part of the standard. These extra functions are called extensions
because they extend or offer a benefit beyond those in the standard
definition.
At the writing of this book, Teradata was fully ANSI Entry level
compliant based on the 1992 Standards document. NCR also provides
much of the Intermediate and some of the Full capabilities. This book
indicates feature by feature which SQL capabilities are ANSI and which
are Teradata specific, or extensions. It is to NCR's benefit to be as
compliant as possible in order to make it easier for customers of other
RDBMS vendors to port their data warehouse to Teradata.
As indicated earlier, SQL is used to access, store, remove and modify
data stored within a relational database, like Teradata. The SQL is
actually comprised of three types of statements. They are: Data
Definition Language (DDL), Data Control Language (DCL) and Data
Manipulation Language (DML). The primary focus of this book is on
DML and DDL. Both DDL and DCL are, for the most part, used for
administering an RDBMS. Since the SELECT statement is used the vast
majority of the time, we are concentrating on its functionality,
variations and capabilities.
Everything in the first part of this chapter describes ANSI
standard capabilities of the SELECT command. As the statements
become more involved, each capability will be designated as either
ANSI or a Teradata Extension.

Basic SELECT Command
Using the SELECT has been described like playing the game, Jeopardy.
The answer is there; all you have to do is come up with the correct
question.
The basic structure of the SELECT statement indicates which column
values are desired and the tables that contain them. To aid in the
learning of SQL, this book will capitalize the SQL keywords. However,
when SQL is written for Teradata, the case of the statement is not
important. The SQL statements can be written using all uppercase,
lowercase or a combination; it does not matter to the Teradata PE.
The SELECT is used to return the data value(s) stored in the columns
named within the SELECT command. The requested columns must be
valid names defined in the table(s) listed in the FROM portion of the
SELECT.
The following shows the format of a basic SELECT statement. In this
book, the syntax uses expressions like: <column-name> (see Figure
1-1) to represent the location of one or more names required to
construct a valid SQL statement:
SEL[ECT] <column-name>
FROM <table-name>
;
The structure of the above command places all keywords on the left in
uppercase and the variable information such as column and table
names to the right. Like using capital letters, this positioning is to aid
in learning SQL. Although the use of SEL is acceptable in Teradata,
with [ECT] in square brackets being optional, it is not ANSI
standard.
Lastly, when multiple column names are requested in the SELECT, a
comma must separate them. Without the separator, the optimizer
cannot determine where one ends and the next begins.
The following syntax format is also acceptable:
SEL[ECT] <column-name> FROM <table-name> ;
Both of these SELECT statements produce the same output report, but
the multi-line style is easier to read and debug for complex queries.
The output display might appear as:
3 Rows Returned
<column-name>
aaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbb
cccccccccccccccccc
In the output, the column name becomes the default heading for the
report. Then, the data contained in the selected column is displayed
once for each row returned.
The next variation of the SELECT statement returns all of the columns
defined in the table indicated in the FROM portion of the SELECT.
SEL[ECT] *
FROM <table-name>
;
The output of the above request uses each column name as the
heading and the columns are displayed in the same sequence as they
are defined in the table. Depending on the tool used to submit the
request, care should be taken, because if the returned display is wider
than the media (e.g. terminal=80 and paper=133), it may be
truncated.
At times, it is desirable to select the same column twice. This is
permitted and to accomplish it, the column name is simply listed in the
SELECT column list more than once. This technique might often be
used when doing aggregations or calculating a value, both of which
are covered in later chapters.
The table below is used to demonstrate the results of various requests.
It is a small table with a total of ten rows for easy comparison.
Student Table - contains 10 students
(key and index markers shown in the figure: PK, FK; UPI, NUSI, NUSI)
Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
123250      Phillips    Martin      SR          3.00
125634      Hanson      Henry       FR          2.88
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80
260000      Johnson     Stanley     ?           ?
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
333450      Smith       Andy        SO          2.00
423400      Larkins     Michael     FR          0.00
Figure 2-1
For example, the next SELECT might be used with Figure 2-1 to
display the student number, the last name, first name, the class code
and grade point for all of the students in the Student table:
SELECT *
FROM Student_Table ;
10 Rows returned
Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
423400      Larkins     Michael     FR          0.00
125634      Hanson      Henry       FR          2.88
280023      McRoberts   Richard     JR          1.90
260000      Johnson     Stanley     ?           ?
231222      Wilson      Susie       SO          3.80
234121      Thomas      Wendy       FR          4.00
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
322133      Bond        Jimmy       JR          3.95
333450      Smith       Andy        SO          2.00
Notice that Johnson has question marks in the grade point and class
code columns. Most client software uses the question mark to
represent missing data or an unknown value (NULL). More discussion
on this condition will appear throughout this book. The other thing to
note is that character data is aligned from left to right, the same as we
read it, and numeric data is aligned from right to left, from the decimal.
This SELECT returns all of the columns except the Student ID from the
Student table:
SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;
10 Rows returned
First_Name  Last_Name   Class_Code  Grade_Pt
Michael     Larkins     FR          0.00
Henry       Hanson      FR          2.88
Richard     McRoberts   JR          1.90
Stanley     Johnson     ?           ?
Susie       Wilson      SO          3.80
Wendy       Thomas      FR          4.00
Danny       Delaney     SR          3.35
Martin      Phillips    SR          3.00
Jimmy       Bond        JR          3.95
Andy        Smith       SO          2.00
There is no short cut for selecting all columns except one or two. Also,
notice that the columns are displayed in the output in the same
sequence they are requested in the SELECT statement.

WHERE Clause
The previous "unconstrained" SELECT statement returned every row
from the table. Since the Teradata database is most often used as a
data warehouse, a table might contain millions of rows. So, it is wise
to request only certain types of rows for return.
By adding a WHERE clause to the SELECT, a constraint is established
to potentially limit which rows are returned based on a TRUE
comparison to specific criteria or set of conditions.
The conditional check in the WHERE can use the ANSI comparison
operators (symbols are ANSI / alphabetic is Teradata Extension):
Comparison              ANSI symbol  Teradata Extension
Equal                   =            EQ
Not Equal               <>           NE
Less Than               <            LT
Greater Than            >            GT
Less Than or Equal      <=           LE
Greater Than or Equal   >=           GE
Figure 2-2
The following SELECT can be used to return the students with a B
(3.0) average or better from the Student table:
5 Rows returned
Student_ID  Last_Name  Grade_Pt
231222      Wilson     3.80
234121      Thomas     4.00
324652      Delaney    3.35
123250      Phillips   3.00
322133      Bond       3.95
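A request consistent with the five rows above can be sketched as follows. This is an assumption-laden illustration, not the book's own statement: SQLite (through Python's sqlite3 module) stands in for Teradata, and the table is cut down to the three columns the sketch needs.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1;
# None is stored as NULL.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

# Only rows for which the comparison is TRUE come back; Johnson's NULL
# grade point makes the comparison unknown, so that row is not returned.
rows = conn.execute(
    """
    SELECT Last_Name, Grade_Pt
    FROM Student_Table
    WHERE Grade_Pt >= 3.0
    ORDER BY Last_Name
    """
).fetchall()

for last_name, grade_pt in rows:
    print(last_name, grade_pt)
```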
Without the WHERE clause, the AMPs return all of the rows in the table
to the user. More and more Teradata user systems are getting to the
point where they are storing billions of rows in a single table. There
must be a very good reason for needing to see all of them. More
simply put, you will always use a WHERE clause whenever you want to
see only a portion of the rows in a table.

Compound Comparisons ( AND / OR )
Many times a single comparison is not sufficient to specify the desired
rows. To add more functionality to the WHERE it is common to use
more than one comparison. Unlike the column names in the SELECT
list, the multiple condition checks are not separated by commas.
Instead, they must be connected using a logical operator.
The following is the syntax for using the AND logical operator:
Notice that the column name is listed for each comparison separated
by a logical operator; this will be true even when it is the same column
being compared twice. The AND signifies that each individual
comparison on both sides of the AND must be true. The final result of
the comparison must be TRUE for a row to be returned.
This Truth Table illustrates this point using AND.
First Test Result   AND   Second Test Result   Final Result
True                      True                 True
True                      False                False
False                     True                 False
False                     False                False
Figure 2-3
When using AND with equality comparisons, different columns must be
used because a single column can never contain more than a single
data value.
Therefore, it does not make good sense to issue the next SELECT
using an AND on the same column because no rows will ever be
returned.
No rows found
The above SELECT will never return any rows. It is impossible for a
column to contain more than one value. No student has a 3.0 grade
average AND a 4.0 average. They might have one or the other, but
never both at the same time. The AND operator indicates both must be
TRUE and should never be used between two equality comparisons on
the same column.
By substituting an OR logical operator for the previous AND, rows will
now be returned.
The following is the syntax for using OR:
2 Rows returned
Student_ID  Last_Name  First_Name  Grade_Pt
234121      Thomas     Wendy       4.00
123250      Phillips   Martin      3.00
The OR signifies that only one of the comparisons on each side of the
OR needs to be true for the entire test to result in a true and the row
to be selected.
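The contrast between AND and OR on the same column can be reproduced with a small sketch. As an assumption, SQLite (via Python's sqlite3 module) stands in for Teradata here, the table is reduced to three columns, and the two statements are reconstructions that match the outputs shown rather than the book's exact text.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

# AND on the same column: no single value can equal 3.0 and 4.0 at once.
and_rows = conn.execute(
    "SELECT Last_Name, Grade_Pt FROM Student_Table "
    "WHERE Grade_Pt = 3.0 AND Grade_Pt = 4.0"
).fetchall()

# OR on the same column: either value qualifies the row.
or_rows = conn.execute(
    "SELECT Last_Name, Grade_Pt FROM Student_Table "
    "WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0 ORDER BY Last_Name"
).fetchall()

print(and_rows)
print(or_rows)
```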
This Truth Table illustrates the results for the OR:
First Test Result   OR   Second Test Result   Final Result
True                     True                 True
True                     False                True
False                    True                 True
False                    False                False
Figure 2-4
When using the OR, the same column or different column names may
be used. In this case, it makes sense to use the same column because
a row is returned when a column contains either of the specified values
as opposed to both values as seen with AND.
It is perfectly legal and common practice to combine the AND with the
OR in a single SELECT statement.
The next SELECT contains both an AND as well as an OR:
2 Rows returned
Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00
At first glance, it appears that the comparison worked correctly.
However, upon closer evaluation it is incorrect because Phillips is a
senior and not a freshman.
When mixing AND with OR in the same WHERE clause, it is important
to know that the AND is evaluated first. The previous SELECT actually
returns all rows with a grade point of 3.0. Hence, Phillips was returned.
The second comparison returned Thomas with a grade point of 4.0 and
a class code of 'FR'.
When it is necessary for the OR to be evaluated before the AND, the
use of parentheses changes the priority of evaluation. A different result
is seen when doing the OR first. Here is how the statement should be
written:
1 Row returned
Last_Name  Class_Code  Grade_Pt
Thomas     FR          4.00
Now, only Thomas is returned and the output is correct.
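The precedence rule can be checked with a short sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Teradata and a three-column version of the Student table; the two WHERE clauses are reconstructions that reproduce the two outputs above.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

# AND binds tighter than OR, so this reads as:
#   Grade_Pt = 3.0 OR (Grade_Pt = 4.0 AND Class_Code = 'FR')
no_parens = conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0 AND Class_Code = 'FR' "
    "ORDER BY Last_Name"
).fetchall()

# Parentheses force the OR to be evaluated first.
with_parens = conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE (Grade_Pt = 3.0 OR Grade_Pt = 4.0) AND Class_Code = 'FR' "
    "ORDER BY Last_Name"
).fetchall()

print(no_parens)
print(with_parens)
```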

Impact of NULL on Compound Comparisons
NULL is an SQL reserved word. It represents missing or unknown data
in a column. Since NULL is an unknown value, a normal comparison
cannot be used to determine whether it is true or false. All
comparisons of any value to a NULL result in an unknown; it is neither
true nor false. The only valid test for a null uses the keyword NULL
without the normal comparison symbols and is explained in this
chapter.
When a table is created in Teradata, the default for a column is for it to
allow a NULL value to be stored. So, unless the default is over-ridden
and NULL values are not allowed, it is a good idea to understand how
they work.
A SHOW TABLE command (chapter 3) can be used to determine
whether a NULL is allowed. If the column contains a NOT NULL
constraint, you need not be concerned about the presence of a NULL
because it is disallowed.
This AND Truth Table must now be used for compound tests when
NULL values are allowed:
First Test Result   AND   Second Test Result   Final Result
True                      Unknown              Unknown
Unknown                   True                 Unknown
False                     Unknown              False
Unknown                   False                False
Unknown                   Unknown              Unknown
Figure 2-5
This OR Truth Table must now be used for compound tests when
NULL values are allowed:
First Test Result   OR   Second Test Result   Final Result
True                     Unknown              True
Unknown                  True                 True
False                    Unknown              Unknown
Unknown                  False                Unknown
Unknown                  Unknown              Unknown
Figure 2-6
For most comparisons, an unknown (null) is functionally equivalent to
a false because it is not a true. Therefore, when using any comparison
symbol a row is not returned when it contains a NULL.
At the same time, the next SELECT does not return Johnson because
all comparisons against a NULL are unknown:
No rows found
V2R5: *** Failure 3731 The user must use IS NULL or IS NOT NULL to
test for NULL values.
As seen in the above Truth tables, a comparison test cannot be used to
find a NULL.
To find a NULL, it becomes necessary to make a slight change in the
syntax of the conditional comparison. The coding necessary to find a
NULL is seen in the next section.

Using NOT in SQL Comparisons
It can be fairly straightforward to request exactly which rows are
needed. However, sometimes rows are needed that contain any value
other than a specific value. When this is the case, it might be easier to
write the SELECT to find what is not needed instead of what is needed.
Then convert it to return everything else. This might be the situation
when there are 100 potential values stored in the database table and
99 of them are needed. So, it is easier to eliminate the one value than
it is to specifically list the desired 99 different values individually.
Either of the next two SELECT formats can be used to accomplish the
elimination of the one value:
This second version of the SELECT is normally used when compound
conditions are required. This is because it is usually easier to code the
SELECT to get what is not wanted and then to enclose the entire set of
comparisons in parentheses and put one NOT in front of it. Otherwise,
with a single comparison, it is easier to put NOT in front of the
comparison operator without requiring the use of parentheses.
The next SELECT uses the NOT with an AND comparison to display
seniors and lower classmen with grade points less than 3.0:
6 Rows returned
Last_Name   First_Name  Class_Code  Grade_Pt
McRoberts   Richard     JR          1.90
Hanson      Henry       FR          2.88
Delaney     Danny       SR          3.35
Larkins     Michael     FR          0.00
Phillips    Martin      SR          3.00
Smith       Andy        SO          2.00
Without using the above technique of a single NOT, it is necessary to
change every individual comparison. The following SELECT shows this
approach; notice the other change necessary below: NOT AND is an
OR.
Since you cannot have conditions like: NOT >= and NOT <>, they
must be converted to < (not > and not =) and = (not, not =). It
returns the same 6 rows, but also notice that the AND is now an OR:
6 Rows returned
Last_Name   First_Name  Class_Code  Grade_Pt
McRoberts   Richard     JR          1.90
Hanson      Henry       FR          2.88
Delaney     Danny       SR          3.35
Phillips    Martin      SR          3.00
Larkins     Michael     FR          0.00
Smith       Andy        SO          2.00
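Both forms can be compared side by side in a sketch. The assumptions are the same as in the other sketches: SQLite (via Python's sqlite3 module) stands in for Teradata, the table is cut down to three columns, and the two WHERE clauses are reconstructions chosen because they reproduce the six rows shown.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

# One NOT in front of the whole compound condition.
single_not = conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE NOT (Grade_Pt >= 3.0 AND Class_Code <> 'SR') ORDER BY Last_Name"
).fetchall()

# Every comparison reversed individually: >= becomes <, <> becomes =,
# and the AND becomes an OR.
each_changed = conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt < 3.0 OR Class_Code = 'SR' ORDER BY Last_Name"
).fetchall()

print(single_not)
print(single_not == each_changed)
```

Johnson is missing from both results because every comparison against the NULL values is unknown.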
Chart of individual conditions and NOT:
Condition   Opposite condition   NOT condition
>=          <                    NOT >=
<>          =                    NOT <>
AND         OR                   OR
OR          AND                  AND
Figure 2-7
To maintain the integrity of the statement, all portions of the WHERE
must be changed, including AND, as well as OR. The following two
SELECT statements illustrate the same concept when using an OR:
1 Row returned
Last_Name
Hanson
In the earlier Truth table, the NULL value returned an unknown when
checked with a comparison operator. When looking for specific
conditions, an unknown was functionally equivalent to a false, but
really it is an unknown.
These two Truth tables can be used together as a tool when mixing
AND and OR together in the WHERE clause along with NOT.
This Truth Table helps to gauge returned rows when using NOT with
AND:
First Test Result        AND   Second Test Result       Result
NOT(True) = False              NOT(Unknown) = Unknown   False
NOT(Unknown) = Unknown         NOT(True) = False        False
NOT(False) = True              NOT(Unknown) = Unknown   Unknown
NOT(Unknown) = Unknown         NOT(False) = True        Unknown
NOT(Unknown) = Unknown         NOT(Unknown) = Unknown   Unknown
Figure 2-8
This Truth Table can be used to gauge returned rows when using NOT
with OR:
First Test Result        OR    Second Test Result       Result
NOT(True) = False              NOT(Unknown) = Unknown   Unknown
NOT(Unknown) = Unknown         NOT(True) = False        Unknown
NOT(False) = True              NOT(Unknown) = Unknown   True
NOT(Unknown) = Unknown         NOT(False) = True        True
NOT(Unknown) = Unknown         NOT(Unknown) = Unknown   Unknown
Figure 2-9
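The Truth Tables in Figures 2-5, 2-6, 2-8 and 2-9 all follow from three small rules, which can be written out as a sketch. Here Python's None plays the role of unknown, and the function names and3, or3 and not3 are hypothetical, chosen only for this illustration.

```python
def and3(a, b):
    """Three-valued AND: any False wins, otherwise any unknown wins."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Three-valued OR: any True wins, otherwise any unknown wins."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """Three-valued NOT: the NOT of an unknown is still unknown."""
    return None if a is None else (not a)


# A row is returned only when the final result is True.
print(and3(True, None))               # unknown, row not returned
print(or3(True, None))                # True, row returned
print(or3(not3(False), not3(None)))   # True, as in Figure 2-9
```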
There is an issue associated with using NOT. When a NOT is done on a
true condition, the result is a false. Likewise, the NOT of a false is a
true. However, when a NOT is done with an unknown, the result is still
an unknown. Whenever a NULL appears in the data for any of the
columns being compared, the test can evaluate to unknown; the row is
then not returned and the answer set will not be what is expected.
Another area where care must be taken is when allowing NULL values
to be stored in one or both of the columns. As mentioned earlier,
previous versions of Teradata had no concept of "unknown" and if a
compare didn't result in a true, it was false. With the emphasis on
ANSI compatibility the unknown was introduced.
If NULL values are allowed and there is potential for the NULL to
impact the final outcome of compound tests, additional tests are
required to eliminate them. One way to eliminate this concern is to
never allow a NULL value in any columns. However, this may not be
appropriate and it will require more storage space because a NULL can
be compressed. Therefore, when a NULL is allowed, the SQL needs to
simply check for a NULL.
Using the expression IS NOT NULL is a good technique
when NULL is allowed in a column and the NOT is used with a single or
a compound comparison. This does require another comparison and
could be written as:
7 Rows returned
Last_Name   First_Name  Class_Code  Grade_Pt
Larkins     Michael     FR          0.00
Hanson      Henry       FR          2.88
McRoberts   Richard     JR          1.90
Johnson     Stanley     ?           ?
Delaney     Danny       SR          3.35
Phillips    Martin      SR          3.00
Smith       Andy        SO          2.00
Notice that Johnson came back this time and did not appear previously
because of the NULL values.
Later in this book, the COALESCE will be explored as another way to
eliminate NULL values directly in the SQL instead of in the database.
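The effect of adding IS NOT NULL tests inside the NOT can be sketched as follows. As assumptions, SQLite (via Python's sqlite3 module) stands in for Teradata, the table is reduced to three columns, and the WHERE clause is one possible reconstruction that yields the seven rows shown earlier, Johnson included.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

# For Johnson the IS NOT NULL tests are False, so the whole AND is False
# (not unknown) and the NOT turns it into True.
rows = conn.execute(
    """
    SELECT Last_Name
    FROM Student_Table
    WHERE NOT (Grade_Pt >= 3.0 AND Grade_Pt IS NOT NULL
               AND Class_Code <> 'SR' AND Class_Code IS NOT NULL)
    ORDER BY Last_Name
    """
).fetchall()

print([r[0] for r in rows])
```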

Multiple Value Search (IN)
Previously, it was shown that adding a WHERE clause to the SELECT
limited the returned rows to those that meet the criteria. The IN
comparison is an alternative to using one or more OR comparisons on
the same column in the WHERE clause of a SELECT statement and the
IN comparison also makes it a bit easier to code:
The value list normally consists of multiple values separated by
commas. When the value in the column being compared matches one
of the values in the list, the row is returned.
The following is an example for the alternative method when any one
of the conditions is enough to satisfy the request using IN:
3 Rows returned
Last_Name  Class_Code  Grade_Pt
Phillips   SR          3.00
Thomas     FR          4.00
Smith      SO          2.00
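A sketch of an IN request that produces these three rows, next to the equivalent compound OR, is shown here. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Teradata and a three-column version of the Student table; the value list (2.0, 3.0, 4.0) is inferred from the output.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

in_rows = conn.execute(
    "SELECT Last_Name, Class_Code, Grade_Pt FROM Student_Table "
    "WHERE Grade_Pt IN (2.0, 3.0, 4.0) ORDER BY Last_Name"
).fetchall()

# The same request written as a compound OR on one column.
or_rows = conn.execute(
    "SELECT Last_Name, Class_Code, Grade_Pt FROM Student_Table "
    "WHERE Grade_Pt = 2.0 OR Grade_Pt = 3.0 OR Grade_Pt = 4.0 "
    "ORDER BY Last_Name"
).fetchall()

print(in_rows)
print(in_rows == or_rows)
```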
Multiple conditional checks and the IN can be used in the same
SELECT request. Considerations include the use of AND for
declaring that multiple conditions must all be true. Earlier, we saw the
solution using a compound OR.
Using NOT IN
As seen earlier, sometimes the unwanted values are not known or it is
easier to eliminate a few values than to specify all the values needed.
When this is the case, it is a common practice to use the NOT IN as
coded below.
The next statement eliminates the rows that match and returns those
that do not match:
6 Rows returned
Last_Name   Grade_Pt
McRoberts   1.90
Hanson      2.88
Wilson      3.80
Delaney     3.35
Larkins     0.00
Bond        3.95
The following SELECT is a better way to make sure that all rows are
returned when using a NOT IN:
7 Rows returned
Last_Name   Class_Code  Grade_Pt
Larkins     FR          0.00
Hanson      FR          2.88
McRoberts   JR          1.90
Johnson     ?           ?
Wilson      SO          3.80
Delaney     SR          3.35
Bond        JR          3.95
Notice that Johnson came back in this list and not the previous request
using the NOT IN. You may be thinking that if the NULL reserved word
is used within the IN list it will cover the situation. Unfortunately, you
are forgetting that this comparison always returns an unknown.
Therefore, the next request will NEVER return any rows:
No Rows found
Making this mistake will cause no rows to ever be returned. This is
because every time the column is compared against the value list the
NULL is an unknown and the Truth table shows that the NOT of an
unknown is always an unknown for all rows.
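The three NOT IN cases can be run side by side in a sketch. The assumptions are that SQLite (via Python's sqlite3 module) treats a NULL in the value list the same way as described here, that three columns of the Student table are enough, and that the value list (2.0, 3.0, 4.0) matches the outputs shown.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)


def names(where_clause):
    """Run one SELECT with the given WHERE text and return sorted last names."""
    sql = "SELECT Last_Name FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


# Plain NOT IN: Johnson's NULL grade point is silently left out.
not_in = names("Grade_Pt NOT IN (2.0, 3.0, 4.0)")

# Adding an explicit NULL test brings Johnson back.
not_in_or_null = names("Grade_Pt NOT IN (2.0, 3.0, 4.0) OR Grade_Pt IS NULL")

# Putting NULL in the list makes every non-matching comparison unknown.
not_in_with_null = names("Grade_Pt NOT IN (2.0, 3.0, 4.0, NULL)")

print(not_in)
print(not_in_or_null)
print(not_in_with_null)
```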
If you are not sure about this, do an EXPLAIN (chapter 3) of the NOT
IN and a subquery to see that the AMP step will actually be skipped
when a NULL exists in the list. There are also extra AMP steps to
compensate for this condition. It makes the SQL VERY inefficient.

Using Quantifiers Versus IN
There is another alternative to using the IN. Quantifiers can be used to
allow for normal comparison operators without requiring compound
conditional checks.
The following is equivalent to an IN:
This next request uses ANY instead of IN:
3 Rows returned
Last_Name  Class_Code  Grade_Pt
Phillips   SR          3.00
Thomas     FR          4.00
Smith      SO          2.00
Using a quantifier, the equivalent to a NOT IN is:
Notice that like adding a NOT to the compound condition, all elements
need to be changed here as well. To reverse the = ANY, it becomes
NOT = ALL. This is important, because the NOT = ANY selects all the
rows except those containing a NULL. The reason is that as soon as a
value is not equal to any one of the values in the list, it is returned.
The following SELECT is converted from an earlier NOT IN:
6 Rows returned
Last_Name   Grade_Pt
McRoberts   1.90
Larkins     0.00
Hanson      2.88
Wilson      3.80
Delaney     3.35
Bond        3.95
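The difference between the quantifiers can be modeled directly with Python's any() and all(). This is a sketch of the logic only, not Teradata syntax: the names GRADES, VALUE_LIST and select are hypothetical, and the value list (2.0, 3.0, 4.0) is assumed from the outputs shown.

```python
# Grade points from Figure 2-1; None marks the NULL stored for Johnson.
GRADES = {
    "Phillips": 3.00, "Hanson": 2.88, "Thomas": 4.00, "Wilson": 3.80,
    "Johnson": None, "McRoberts": 1.90, "Bond": 3.95, "Delaney": 3.35,
    "Smith": 2.00, "Larkins": 0.00,
}
VALUE_LIST = (2.0, 3.0, 4.0)


def select(predicate):
    """Return the names whose grade satisfies the predicate.

    A NULL grade makes every comparison unknown, so that row is skipped.
    """
    return sorted(
        name for name, grade in GRADES.items()
        if grade is not None and predicate(grade)
    )


# = ANY behaves like IN: equal to at least one listed value.
eq_any = select(lambda g: any(g == v for v in VALUE_LIST))

# NOT = ALL behaves like NOT IN: different from every listed value.
ne_all = select(lambda g: all(g != v for v in VALUE_LIST))

# NOT = ANY: different from at least one listed value, which is true
# for every non-NULL grade when the list holds more than one value.
ne_any = select(lambda g: any(g != v for v in VALUE_LIST))

print(eq_any)
print(ne_all)
print(len(ne_any))
```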

Multiple Value Range Search (BETWEEN)
The BETWEEN comparison can be used as another technique to
request multiple values for a column that are all in a specific range. It
is easier than writing a compound OR comparison or a long value list
of sequential numbers when using the IN.
This is a good time to point out that this chapter is incrementally
adding new ways to compare for values within a WHERE clause.
However, all of these techniques can be used together in a single
WHERE clause. One method does not eliminate the ability to use one
or more of the others using logical operators between each
comparison.
The next SELECT shows the syntax format for using the BETWEEN:
The first and second values specified are inclusive for the purposes of
the search. In other words, when these values are found in the data,
the rows are included in the output.
As an example, the following code returns all students whose grade
points are 2.0, 4.0 or any value between them:
7 Rows returned
Grade_Pt
3.00
2.88
4.00
3.80
3.95
3.35
2.00
Notice that due to the inclusive nature of the BETWEEN, both 2.0 and
4.0 were included in the answer set. The first value of the BETWEEN
must be the lower value, otherwise, no rows will be returned. This is
because it looks for all values that are greater than or equal to the
first value and less than or equal to the second value.
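The inclusive bounds and the effect of reversing them can be seen in a short sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Teradata and a three-column version of the Student table; the statements are reconstructions matching the output above.

```python
import sqlite3

# (Last_Name, Class_Code, Grade_Pt) for the ten students in Figure 2-1.
STUDENTS = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", STUDENTS)

# Lower value first: both end points are included.
in_range = [
    row[0] for row in conn.execute(
        "SELECT Grade_Pt FROM Student_Table "
        "WHERE Grade_Pt BETWEEN 2.0 AND 4.0 ORDER BY Grade_Pt"
    )
]

# Higher value first: nothing is >= 4.0 and <= 2.0 at the same time.
reversed_bounds = conn.execute(
    "SELECT Grade_Pt FROM Student_Table WHERE Grade_Pt BETWEEN 4.0 AND 2.0"
).fetchall()

print(in_range)
print(reversed_bounds)
```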
A BETWEEN can also be used to search for character values. When
doing this, care must be taken to ensure that rows are received with
the values that are needed. The system can only compare character
values that are the same length. So, if one column or value is shorter
than the other, the shortest will automatically be padded with spaces
out to the same length as the longer value.
Comparing 'CA' and 'CALIFORNIA' never constitutes a match. In reality,
the database is comparing 'CA        ' with 'CALIFORNIA' and they are
not equal. Sometimes, it is easier to use the LIKE comparison operator
which will be covered in the next section. Although easier to code, it
does not always mean faster to execute. There is always a trade-off to
consider.
The next SELECT finds all of the students whose last name starts with
an L:
1 Row returned
Last_Name
Larkins
In reality, the WHERE could have used BETWEEN 'L' and 'M' as long as
no student's last name was 'M'. The data needs to be understood when
using BETWEEN for character comparisons.
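As a sketch of the character range just described, assuming the table name Student_Table (the statement actually used for the output above is not reproduced in this copy):

```sql
-- Assumed table name; the 'L' and 'M' boundaries come from the text.
-- Both ends are inclusive, so a last name of exactly 'M' would also be returned.
SELECT Last_Name
FROM   Student_Table
WHERE  Last_Name BETWEEN 'L' AND 'M' ;
```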

Character String Search (LIKE)
The LIKE is used exclusively to search for character data strings. The
major difference between the LIKE and the BETWEEN is that the
BETWEEN looks for specific values within a range. The LIKE is normally
used when looking for a string of characters within a column. Also, the
LIKE has the capability to use "wildcard" characters.
The wildcard characters are:
Wildcard symbol      What it does
_ (underscore)       matches any single character, but a character
                     must be present
% (percent sign)     matches any single character, a series of
                     characters or the absence of characters
Figure 2-10
The next SELECT finds all rows that have a character string that begins
with 'Sm':
1 Row returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
333450       Smith       Andy         SO           2.00
The fact that the 's' is in the first position dictates its location in the
data. Therefore, the 'm' must be in the second position. Then, the '%'
indicates that any number of characters (including none) may be in the
third and subsequent positions. So, if the WHERE clause contained:
LIKE '%sm', it only looks for strings that end in "SM." On the other
hand, if it were written as: LIKE '%sm%', then all character strings
containing "sm" anywhere are returned. Also, remember that in
Teradata mode, the database is not case sensitive. However, in ANSI
mode, the case of the letters must match exactly and the previous
request must be written as 'Sm%' to obtain the same result. Care
should be taken regarding case when working in ANSI mode.
Otherwise, case does not matter.
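The LIKE statement is missing from this copy. A sketch of the three pattern placements discussed above, assuming the table name Student_Table and the Last_Name column as the search target:

```sql
-- Assumed table and column; the patterns come from the text.
-- Begins with "Sm" (written with an uppercase S so it also works in ANSI mode).
SELECT * FROM Student_Table WHERE Last_Name LIKE 'Sm%' ;

-- Ends with "sm".
SELECT * FROM Student_Table WHERE Last_Name LIKE '%sm' ;

-- Contains "sm" anywhere in the string.
SELECT * FROM Student_Table WHERE Last_Name LIKE '%sm%' ;
```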
The '_' wildcard can be used to force a search to a specific location in
the character string. Anything in that position is considered a match.
However, a character must be in that position.
The following SELECT uses a LIKE to find all last names with an "A" in
the second position of the last name:
2 Rows returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
423400       Larkins     Michael      FR           0.00
125634       Hanson      Henry        FR           2.88
In the above example, the "_" allows any character in the first
position, but requires a character to be there.
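A sketch of such a request, assuming the table name Student_Table; the lowercase 'a' is chosen here so the pattern matches the stored names in ANSI mode as well as Teradata mode:

```sql
-- Assumed table name. One underscore = exactly one character in position 1,
-- 'a' must be in position 2, and % accepts anything (or nothing) after it.
SELECT *
FROM   Student_Table
WHERE  Last_Name LIKE '_a%' ;
```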
The keywords ALL, ANY, or SOME can be used to further define the
values being searched. They are the same quantifiers used with the
IN. Here, the quantifiers are used to extend the flexibility of the
LIKE clause.
Normally, the LIKE will look for a single set of characters within the
data. Sometimes, that is not sufficient for the task at hand. There will
be times when the characters to search are not consecutive, nor are
they in the same sequence.
The next SELECT returns rows with both an 's' and an 'm' because of
the ALL.
3 Rows returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
280023       McRoberts   Richard      JR           1.90
234121       Thomas      Wendy        FR           4.00
333450       Smith       Andy         SO           2.00
It does not matter if the 's' appears first or the 'm' appears first, as
long as both are contained in the string.
Below, ANSI is case sensitive and only 1 row returns due to the fact
that the 'S' is uppercase, so Thomas and McRoberts are not returned:
1 Row returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
333450       Smith       Andy         SO           2.00
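Neither statement survives in this copy. A sketch consistent with the two outputs above, assuming the table name Student_Table and that the patterns were written as shown:

```sql
-- Assumed table name and pattern spelling.
-- Teradata mode (not case sensitive): both patterns must match.
SELECT *
FROM   Student_Table
WHERE  Last_Name LIKE ALL ('%s%', '%m%') ;

-- ANSI mode (case sensitive): an uppercase 'S' and a lowercase 'm' are both required.
SELECT *
FROM   Student_Table
WHERE  Last_Name LIKE ALL ('%S%', '%m%') ;
```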
If, in the above statement, the ALL quantifier is changed to ANY (ANSI
standard) or SOME (Teradata extension), then a character string
containing either of the characters, 's' or 'm', in either order is
returned. It uses the OR comparison.
This next SELECT returns any row where the last name contains either
an 's' or an 'm':
8 Rows returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
423400       Larkins     Michael      FR           0.00
125634       Hanson      Henry        FR           2.88
280023       McRoberts   Richard      JR           1.90
260000       Johnson     Stanley      ?            ?
231222       Wilson      Susie        SO           3.80
234121       Thomas      Wendy        FR           4.00
333450       Smith       Andy         SO           2.00
123250       Phillips    Martin       SR           3.00
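A sketch of the ANY form that is consistent with the eight rows above in Teradata mode, again assuming the table name Student_Table:

```sql
-- Assumed table name. ANY (or SOME) turns the list into an OR:
-- a row qualifies when at least one pattern matches.
SELECT *
FROM   Student_Table
WHERE  Last_Name LIKE ANY ('%s%', '%m%') ;
```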
Always be aware of the issue regarding case sensitivity when using
ANSI Mode. It will normally affect the number of rows returned and
usually reduces the number of rows.
There is a specialty operation that can be performed in conjunction
with the LIKE. Since the search uses the "_" and the "%" as wildcard
characters, how can you search for actual data that contains a "_" or
"%" in the data?
Now that we know how to use the wildcard characters, there is a way
to take away the special meaning and literally make the wildcard
characters an '_' and a '%'. That is the purpose of ESCAPE. It tells the
PE to not match anything, but instead, match the actual character of
'_' or '%'.
The next SELECT uses the ESCAPE to find all table names that have a
"_" in the 8th position of the name from the Data Dictionary.
2 Rows returned
Tablename
Student_Table
Student_Course_Table
In the above output, the only thing that matters is the '_' in position
eight, because the first seven '_' characters are still wildcards.
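The ESCAPE statement is missing from this copy. A sketch of the technique, assuming the Data Dictionary view DBC.Tables with its TableName column and a backslash as the escape character (the original may have used a different view or escape character):

```sql
-- Assumptions: DBC.Tables as the dictionary view, '\' as the escape character.
-- Seven plain underscores are wildcards for positions 1 through 7;
-- the escaped underscore must literally be '_' in position 8.
SELECT TableName
FROM   DBC.Tables
WHERE  TableName LIKE '_______\_%' ESCAPE '\' ;
```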

Derived Columns
The majority of the time, columns in the SELECT statement exist
within a database table. However, sometimes it is more advantageous
to calculate a value than to store it.
An example might be the salary. In the employee table, we store the
annual salary. However, a request comes in asking to display the
monthly salary. Does the table need to be changed to create a column
for storing the monthly salary? Must we go through and update all of
the rows (one per employee) and store the monthly salary into the
new column just so we can select it for display?
The answer is no, we do not need to do any of this. Instead of storing
the monthly salary, we can calculate it from the annual salary using
division. If the annual salary is divided by 12 (months per year), we
"derive" the monthly salary using mathematics.
Chart of ANSI operators for math operations:
Operator   Operation performed
( )        parentheses (all math operations in parentheses done first)
**         exponentiation (10**12 derives 1,000,000,000,000 or 1 trillion)
*          multiplication (10*12 derives 120)
/          division (10/12 derives 0, both are integers and truncation
           of decimal occurs)
+          addition (10+12 derives 22)
-          subtraction (10-12 derives -2, since 12 is greater than 10
           and negative values are allowed)
Figure 2-11
These math functions have a priority associated with their order of
execution when mixed in the same formula. The sequence is basically
the same as their order in the chart. All exponentiation is performed
first. Then, all multiplication and division is performed and lastly, all
addition and subtraction is done. Whenever two different operators are
at the same priority, like addition and subtraction, they are performed
based on their appearance in the equation from left to right.
Although the above is the default priority, it can be overridden within
the SQL. Normally an equation like 2+4*5 yields 22 as the answer.
This is because the 4*5 = 20 is done first and then the 2 is added to it.
However, if it is written as (2+4)*5, now the answer becomes 30
(2+4 = 6, and 6*5 = 30).
The following SELECT shows these and the results of an assortment of
mathematics:
1 Row Returned
2+4*5   (2+4)*5   2+4/5   (2+4)/5   2+4.0/5   (2+4.0)/5        10**9
   22        30       2         1       2.8         1.2   1000000000
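The statement is missing from this copy, but the column headings above are the expressions themselves, so it can be sketched directly from them (the layout is an assumption):

```sql
-- Expressions taken from the column headings of the output above.
SELECT 2+4*5        -- multiplication first, then addition
      ,(2+4)*5      -- parentheses override the default priority
      ,2+4/5        -- integer division: 4/5 truncates to 0
      ,(2+4)/5      -- integer division: 6/5 truncates to 1
      ,2+4.0/5      -- a decimal operand gives a decimal result
      ,(2+4.0)/5
      ,10**9 ;
```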
Note: starting with integer values, as in the above, the answer is an
integer. If decimals are used, the result is a decimal answer.
Otherwise, a conversion can be used to change the characteristics of
the data before being used in any calculation. Adding the decimal
makes a difference in the precision of the final answer. So, if the SQL is
not providing the answer expected from the data, convert the data
first (CAST function later in this book).
The next SELECT shows how the SQL can be written to implement the
earlier example with annual and monthly salaries:
2 Rows returned
salary       salary/12
48,024.00    4,002.00
10,800.00      900.00
Since the column name is the default column heading, the derived
column is called salary/12, which is probably not what we wish to see
there. The next section covers the usage of an alias to temporarily
change the name of a column during the life of the SQL.
Derived data can be used in the WHERE clause as well as the SELECT.
The following SQL will only return the rows where the monthly
salary is greater than $1,000.00:
1 Row returned
salary       salary/12
48,024.00    4,002.00
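Both salary statements are missing from this copy. A sketch of each, assuming a table named Employee_Table (a hypothetical name; the text only says "the employee table") with the salary column shown in the headings:

```sql
-- Employee_Table is an assumed name; salary comes from the output headings.
-- Derive the monthly salary in the SELECT list.
SELECT salary
      ,salary/12
FROM   Employee_Table ;

-- Use the same derived value in the WHERE clause.
SELECT salary
      ,salary/12
FROM   Employee_Table
WHERE  salary/12 > 1000 ;
```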
Teradata contains several functions that allow a user to derive data for
business and engineering. This is a chart of those Teradata arithmetic,
trigonometric and hyperbolic math functions:
Operator     Operation performed
MOD x        Modulo returns the remainder from a division (1 mod 2
             derives 1, as the remainder of division, 2 goes into 1,
             0 times with a remainder of 1. Then, 2 mod 10 derives 2,
             10 goes into 2, 0 times with a remainder of 2). MOD
             always returns 0 thru x-1. As such, MOD 2 returns 0 for
             even numbers and 1 for odd; MOD 7 can be used to
             determine the day of the week; and MOD 10, MOD 100,
             MOD 1000, etc. can be used to shift the decimal of any
             number to the left by the number of zeroes in the MOD
             operator.
ABS(x)       Absolute value, the absolute value of a negative number
             is the same number as a positive x. (ABS(10-12) = 2)
EXP(x)       Exponentiation, e raised to a power (EXP(10) derives
             2.20264657948067E004)
LOG(x)       Logarithm calculus function (LOG(10) derives the value
             1.00000000000000E000)
LN(x)        Natural logarithm (LN(10) derives the value
             2.30258509299405E000)
SQRT(x)      Square root (SQRT(10) derives the value
             3.16227766016838E000)
COS(x)       Takes an angle in radians (x) and returns the ratio of
             two sides of a right triangle. The ratio is the length of
             the side adjacent to the angle divided by the length of
             the hypotenuse. The result lies in the range -1 to 1,
             inclusive, where x is any valid number expression that
             expresses an angle in radians.
SIN(x)       Takes an angle in radians (x) and returns the ratio of
             two sides of a right triangle. The ratio is the length of
             the side opposite to the angle divided by the length of
             the hypotenuse. The result lies in the range -1 to 1,
             inclusive, where x is any valid number expression that
             expresses an angle in radians.
TAN(x)       Takes an angle in radians (x) and returns the ratio of
             two sides of a right triangle. The ratio is the length of
             the side opposite to the angle divided by the length of
             the side adjacent to the angle, where x is any valid
             number expression that expresses an angle in radians.
ACOS(x)      Returns the arccosine of x. The arccosine is the angle
             whose cosine is x. The values of x must be between -1
             and 1, inclusive. The returned angle is in the range 0 to
             π radians, inclusive.
ASIN(x)      Returns the arcsine of x. The arcsine is the angle whose
             sine is x. The values of x must be between -1 and 1,
             inclusive. The returned angle is in the range -π/2 to π/2
             radians, inclusive.
ATAN(x)      Returns the arctangent of x. The arctangent is the angle
             whose tangent is x. The returned angle is in the range
             -π/2 to π/2 radians, inclusive.
ATAN2(x,y)   Returns the arctangent of the specified (x,y)
             coordinates. The arctangent is the angle from the x-axis
             to a line containing the origin (0,0) and a point with
             coordinates (x,y). The returned angle is between -π and
             π radians, excluding -π. A positive result represents a
             counterclockwise angle from the x-axis where a negative
             result represents a clockwise angle. The ATAN2(x,y)
             equals ATAN(y/x), except that x can be 0 in ATAN2(x,y)
             and x cannot be 0 in ATAN(y/x) since this will result in a
             divide by zero error. If both x and y are 0, an error is
             returned.
COSH(x)      Returns the hyperbolic cosine of x where x is any real
             number.
SINH(x)      Returns the hyperbolic sine of x where x is any real
             number.
TANH(x)      Returns the hyperbolic tangent of x where x is any real
             number.
ACOSH(x)     Returns the inverse hyperbolic cosine of x. The inverse
             hyperbolic cosine is the value whose hyperbolic cosine
             is x, where x is any real number equal to, or greater
             than, 1.
ASINH(x)     Returns the inverse hyperbolic sine of x. The inverse
             hyperbolic sine is the value whose hyperbolic sine is x,
             where x is any real number.
ATANH(x)     Returns the inverse hyperbolic tangent of x. The inverse
             hyperbolic tangent is the value whose hyperbolic
             tangent is x, where x is any real number between -1
             and 1, excluding -1 and 1.
Figure 2-12
Some of these functions are demonstrated below and throughout this
book. Here they are also using alias names for the columns. Their
application will be specific to the type of application being written. It is
not the intent of this book to teach their meaning and use in
engineering and trigonometry, but more to educate regarding their
existence.

Creating a Column Alias Name
Since the name of the selected column or derived data formula
appears as the heading for the column, it makes for strange looking
results. To make the output look better, it is a good idea to use an alias
to dress up the heading name used in the output. Besides making the
output look better, an alias also makes the SQL easier to write because
the new column name can be used anywhere in the SQL statement.
AS
Compliance: ANSI
The previous SELECT used salary/12, which is probably not what we
wish to see in the heading. Therefore, it is preferable to alias the
column within the execution of the SQL. This means that a temporary
name is assigned to the selected column for use only in this
statement.
To alias a column, use an AS and any legal Teradata name after the
real column name requested or math formula using the following
technique:
2 Rows returned
Annual_salary   Monthly_salary
48024.00        4002.00
10800.00         900.00
Once the alias name has been assigned, it is literally the name of the
column for the life of the SQL statement.
The next request is a valid example of using the alias in the WHERE
clause:
1 Row returned
annual_salary   monthly_salary
$48,024.00      $4,002.00
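The two alias statements are missing from this copy. A sketch of the technique, assuming the hypothetical table name Employee_Table; the dollar-sign formatting seen in the second output is left out because its original specification is not known:

```sql
-- Employee_Table is an assumed name; the alias names come from the output headings.
SELECT salary     AS Annual_salary
      ,salary/12  AS Monthly_salary
FROM   Employee_Table ;

-- The alias can then be referenced in the WHERE clause.
SELECT salary     AS annual_salary
      ,salary/12  AS monthly_salary
FROM   Employee_Table
WHERE  monthly_salary > 1000 ;
```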
The math functions are very helpful for calculating and evaluating
characteristics of the data. The following examples incorporate most of
the functions to demonstrate their operational functionality.
The next SELECT uses literals and aliases to show the data being input
and results for each of the most common business applicable
operations:
1 Row returned
Div200   Last2   Even   Odd   WasPositive   PositiveNow   SqRoot
     2       4      0     1             1             1     2.00
The output of the SELECT shows some interesting results. The division
is easy; we learned that in elementary school. The first MOD 100
results in 4, because the result of the division is 2, but the remainder
is 4 (204 - 200 = 4). A MOD 100 can result in any value between 0
and 99. In reality, the MOD 100 moves the decimal point two positions
to the left. On the other hand, the MOD 2 will always be 0 for even
numbers and 1 for odd numbers. The ABS always returns the positive
value of any number and lastly, 2 is the square root of 4.
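The SELECT is missing from this copy. In the sketch below the alias names come from the output headings, but every literal is an assumption chosen only to reproduce the values shown (for example 204 MOD 100 for the remainder of 4 that the text describes):

```sql
-- All input literals are assumed; only the aliases and results come from the text.
SELECT 200/100      AS Div200        -- 2
      ,204 MOD 100  AS Last2         -- remainder of 4
      ,2 MOD 2      AS Even          -- 0 for an even number
      ,3 MOD 2      AS Odd           -- 1 for an odd number
      ,ABS(1)       AS WasPositive   -- already positive, stays 1
      ,ABS(-1)      AS PositiveNow   -- negative becomes 1
      ,SQRT(4)      AS SqRoot ;      -- square root of 4
```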
Many of these will be incorporated into SQL throughout this book to
demonstrate additional business applications.
NAMED
Compliance: Teradata Extension
Prior to the AS becoming the ANSI standard, Teradata used NAMED as
the keyword to establish an alias. Although both currently work, it is
strongly suggested that an AS be used for compatibility. Also, as hard
as it is to believe, I have heard that NAMED may not work in future
releases.
The following is the same SELECT as seen earlier, but here it uses the
NAMED instead of the AS:
2 Rows returned
Annual_salary   Monthly_salary
48024.00        4002.00
10800.00         900.00
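The NAMED statement is missing from this copy. A sketch of the Teradata form, assuming the hypothetical table name Employee_Table:

```sql
-- Employee_Table is an assumed name. NAMED goes in parentheses after the expression.
SELECT salary     (NAMED Annual_salary)
      ,salary/12  (NAMED Monthly_salary)
FROM   Employee_Table ;
```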
Naming conventions
When creating an alias only valid Teradata naming characters are
allowed. The alias becomes the name of the column for the life of the
SQL statement. The only difference is that it is not stored in the Data
Dictionary.
The charts below list the valid characters to use and then the rules (on
the left) to follow when ANSI compliance is desired. Also listed are the
more flexible Teradata (on the right) allowable characters and
extended character sets with its rules.
Chart of Valid Characters for ANSI and Teradata:
ANSI Characters Allowed        Teradata Characters Allowed
(up to 18 in a single name)    (up to 30 in a single name)
A through Z                    A through Z and a through z
0 through 9                    0 through 9
_ (underscore / underline)     _ (underscore / underline)
                               # (octothorpe / pound sign / number sign)
                               $ (dollar sign / currency sign)
Figure 2-13
Chart of ANSI and Teradata Naming Conventions
ANSI Rules for column names      Teradata Rules for column names
Must be entirely in upper case   Can be all upper, all lower or a mixture
                                 of case using any of these characters
Must start with A through Z      Can start with any valid character
Must end with underscore _       Can end with any valid character
Figure 2-14
Teradata uses all of the ANSI characters as well as the additional ones
listed in the above charts.
Breaking Conventions
It is not recommended to break these conventions. However,
sometimes it is necessary or desirable to use non-standard characters
in a name. Also, sometimes words have been used as table or column
names and then in a later release, the name becomes a reserved
word. There needs to be a technique to assist you when either of these
requirements becomes necessary.
The technique uses double quotes (") around the name. This technique
tells the PE that the word is not a reserved word and makes it a valid
name. This is the only place that Teradata uses a double quote instead
of a single quote (').
As an example, the previous SELECT has been modified to use double
quotes (") instead of NAMED:
2 Rows returned
Annual Salary   Monthly Salary
10800.00         900.00
48024.00        4002.00
Although it is not obvious due to the underlining, the column heading
for the first column is Annual Salary, including the space. A space is
not a valid naming character, but this is the column name and it is
valid because of the double quotes. This can be seen in the ORDER
BY where it uses the column name. The next section provides more
details on the use of ORDER BY.
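The quoted-name statement is missing from this copy. A sketch consistent with the output above, assuming the hypothetical table name Employee_Table:

```sql
-- Employee_Table is an assumed name. Double quotes make a name containing
-- a space valid, and the same quoted name is used in the ORDER BY.
SELECT salary     AS "Annual Salary"
      ,salary/12  AS "Monthly Salary"
FROM   Employee_Table
ORDER BY "Annual Salary" ;
```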

ORDER BY
The Teradata AMPs generally bring data back randomly unless the user
specifies a sort. The addition of the ORDER BY requests a sort
operation to be performed. The sort arranges the rows returned in
ascending sequence unless you specifically request descending. One or
more columns may be used for the sort operation. The first column
listed is the major sort sequence. Any subsequent columns specified
are minor sort values in the order of their appearance in the list.
The syntax for using an ORDER BY:
In Teradata, if the sequence of the rows being displayed is important,
then an ORDER BY should be used in the SELECT. Many other
databases store their data sequentially by the value of the primary
key. As a result, the data will appear in sequence when it is returned.
To be faster, Teradata stores it differently.
Teradata organizes data rows in ascending sequence on disk based on
a row ID value, not the data value. This is the same value that is
calculated to determine which AMP should be responsible for storing
and retrieving each data row.
When the ORDER BY is not used, the data will appear vaguely in row
hash sequence and is not predictable. Therefore, it is recommended to
use the ORDER BY in a SELECT or the data will come back randomly.
Remember, everything in Teradata is done in parallel; this includes the
sorting process.
The next SELECT retrieves all columns and sorts by the grade point
average:
4 Rows returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
324652       Delaney     Danny        SR           3.35
231222       Wilson      Susie        SO           3.80
322133       Bond        Jimmy        JR           3.95
234121       Thomas      Wendy        FR           4.00
Notice that the default sequence for the ORDER BY is ascending (ASC),
lowest value to highest. This can be overridden using DESC to indicate
a descending sequence as shown using the following SELECT:
4 Rows returned
Student_ID   Last_Name   First_Name   Class_Code   Grade_Pt
234121       Thomas      Wendy        FR           4.00
322133       Bond        Jimmy        JR           3.95
231222       Wilson      Susie        SO           3.80
324652       Delaney     Danny        SR           3.35
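Neither ORDER BY statement survives in this copy. A sketch consistent with the two outputs above, assuming the table name Student_Table and a WHERE clause of Grade_Pt > 3 (inferred from the four rows shown, not stated in the text):

```sql
-- Assumed table name and row filter.
-- Ascending is the default, so ASC may be omitted.
SELECT *
FROM   Student_Table
WHERE  Grade_Pt > 3
ORDER BY Grade_Pt ;

-- DESC reverses the sequence, highest value first.
SELECT *
FROM   Student_Table
WHERE  Grade_Pt > 3
ORDER BY Grade_Pt DESC ;
```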
As an alternative to using the column name in an ORDER BY, a number
can be used. The number reflects the column's position in the SELECT
list. The above SELECT could also be written this way to obtain the
same result:
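The statement is missing from this copy. A sketch of the positional form, assuming the table name Student_Table and the same inferred Grade_Pt > 3 filter as the descending example:

```sql
-- Assumed table name and row filter. With SELECT *, position 5 is the fifth
-- column of the table definition, which is Grade_Pt in the outputs shown.
SELECT *
FROM   Student_Table
WHERE  Grade_Pt > 3
ORDER BY 5 DESC ;
```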
In this case, the grade point column is the fifth column in the table
definition because of its location in the table and the SELECT uses * for
all columns. This adds flexibility to the writing of the SELECT. However,
always watch out for the ability words, like flexibility, because it adds
another ability word: responsibility. When using the column number, if
the column that is used for the sort is moved to another location in the
select list, a different column is now used for the sort. Therefore, it is
important to be responsible to change the list and the number in the
ORDER BY.
Many times it is necessary that the value in one column needs to be
sorted within the sequence of a second column. This technique is said
to have a major sort column or key and one or more minor sort keys.
The first column listed in the ORDER BY is the major sort key.
Likewise, the last column listed is the most minor sort key within the
sequence. The minor keys are referred to as being sorted within the
major sort key. Additionally, some columns can ascend while others
descend.
This SELECT sorts two different columns: the last name (minor sort)
ascending (ASC), within the class code (major sort) descending
(DESC):
10 Rows returned
Last_Name    Class_Code   Grade_Pt
Delaney      SR           3.35
Phillips     SR           3.00
Smith        SO           2.00
Wilson       SO           3.80
Bond         JR           3.95
McRoberts    JR           1.90
Hanson       FR           2.88
Larkins      FR           0.00
Thomas       FR           4.00
Johnson      ?            ?
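The statement is missing from this copy. A sketch consistent with the ten rows above, assuming the table name Student_Table; the positions 2 and 1 are the ones named in the discussion of this output:

```sql
-- Assumed table name. Position 2 (Class_Code) is the major key, descending;
-- position 1 (Last_Name) is the minor key, ascending within each class code.
SELECT Last_Name
      ,Class_Code
      ,Grade_Pt
FROM   Student_Table
ORDER BY 2 DESC, 1 ASC ;
```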
Notice, in the above statement, the use of relative column numbers
instead of column names in the ORDER BY for the sort. The numbers 2
and 1 were used instead of Class_Code and Last_Name. When you
select columns and then use numbers in the sort, the numbers relate
to the order of the columns after the keyword SELECT. When you
SELECT * (all columns in the table) then the sort numbers reflect the
order of columns within the table.
An additional capability of Teradata is that a column can be used in the
ORDER BY that is not selected. This is possible because the database
uses a tag sort for speed and flexibility. In other words, it builds a tag
area that consists of all the columns specified in the ORDER BY as well
as the columns that are being selected.
This diagram shows the layout of a row in SPOOL used with an ORDER
BY:
Tagcolumn1 ... TagcolumnN | AMP# | Selectcolumn1 | Selectcolumn2 | ... SelectcolumnN
Figure 2-15
Although it can sort on a column that is not selected, the sequence of
the output may appear to be completely random. This is because the
sorted value is not seen in the display.
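As a small illustration of sorting on a column that is not selected, assuming the table name Student_Table (this statement is an illustration only and does not come from the text):

```sql
-- Assumed table name. Grade_Pt drives the sort but is not displayed,
-- so the names returned may look as if they are in no particular order.
SELECT Last_Name
      ,First_Name
FROM   Student_Table
ORDER BY Grade_Pt DESC ;
```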
Additionally, within a Teradata session the user can request a Collation
Sequence and a Code Set for the system to use. By requesting a
Collation Sequence of EBCDIC, the sort puts the data into the proper
sequence for the IBM mainframe system. Therefore, EBCDIC is the
automatic default code set when connecting from the mainframe.
Likewise, if a user were extracting to a UNIX computer, the normal
code set is ASCII. However, if the file is transferred from UNIX to a
mainframe and converted there, it is in the wrong sequence. When it is
known ahead of time that the file will be used on a mainframe but
extracted to a different computer, the Collation Sequence can be set to
EBCDIC. Therefore, when the file code set is converted, the file is in
the correct sequence for the mainframe without doing another sort.
Like the Collation Sequence, the Code Set can also be set. So, a file
can be in EBCDIC sequence and the data in ASCII or sorted in ASCII
sequence with the data in EBCDIC. The final use of the file needs to be
considered when making this choice.

DISTINCT Function
All of the previous operations of the SELECT returned a row from a
table based on its existence in a table. As a result, if multiple rows
contain the same value, they all are displayed.
Sometimes it is only necessary to see one of the values, not all.
Instead of contemplating a WHERE clause to accomplish this task, the
DISTINCT can be added in the SELECT to return unique values by
eliminating duplicate values.
The syntax for using DISTINCT:
The next SELECT uses DISTINCT to return only one row for display
when a value exists:
5 Rows Returned
Class_code
?
FR
JR
SO
SR
There are a couple of noteworthy situations in the above output. First,
although there are three freshmen, two sophomores, two juniors, two
seniors and one row without a class code, only one output row is
returned for each of these values. Lastly, the NULL is considered a
unique value whether there is one row or multiple rows containing it.
So, it is displayed one time.
The main considerations for using DISTINCT, it must:
1. Appear only once
2. Apply to all columns listed in the SELECT to determine
uniqueness
3. Appear before the first column name
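The DISTINCT statement is missing from this copy. A sketch consistent with the five-row output above, assuming the table name Student_Table; the ORDER BY is added here only because the output appears sorted and may not have been in the original:

```sql
-- Assumed table name; the ORDER BY is an assumption.
-- DISTINCT appears once, directly after SELECT and before the first column name.
SELECT DISTINCT Class_Code
FROM   Student_Table
ORDER BY 1 ;
```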
The following SELECT uses more than one column with a DISTINCT:
10 Rows Returned
class_code   grade_pt
?            ?
FR           0.00
FR           2.88
FR           4.00
JR           1.90
JR           3.95
SO           2.00
SO           3.80
SR           3.00
SR           3.35
The DISTINCT in this SELECT returned all ten rows of the table. This is
due to the fact that when the class code and the grade point are
combined for comparison, they are all unique. The only potential for a
duplicate exists when two students in the same class have the same
grade point average. Therefore, as more and more columns are listed
in a SELECT with a DISTINCT, there is a greater opportunity for more
rows to be returned due to a higher likelihood for unique values.
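A sketch of the two-column form, assuming the table name Student_Table; as before, the ORDER BY is an assumption based on the sorted appearance of the output:

```sql
-- Assumed table name and ORDER BY. DISTINCT applies to the combination
-- of Class_Code and Grade_Pt, not to each column separately.
SELECT DISTINCT Class_Code
      ,Grade_Pt
FROM   Student_Table
ORDER BY 1, 2 ;
```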
If, when using DISTINCT, spool space is exceeded, see chapter 5 and
the use of the GROUP BY versus DISTINCT for eliminating duplicate
rows. It may solve the problem and that chapter tells the reason for it.

HELP commands
The Teradata Database offers several types of help using an interactive
client. For convenience, this reduces or eliminates the need to look
information up in a hardcopy manual or on a CD-ROM. Therefore,
using the help and show operations in this chapter can save you a
large amount of time and make you more productive. Since Teradata
allows you to organize database objects into a variety of locations,
sometimes you need to determine where certain objects are stored
and other detail information about them.
This chart is a list of available HELP commands on Objects:
HELP DATABASE <database-name> ;
    Displays the names of all the tables (T), views (V), macros (M),
    and triggers (G) stored in a database and user written table
    comments.
HELP USER <user-name> ;
    Displays the names of all the tables (T), views (V), macros (M),
    and triggers (G) stored in a user area and user written table
    comments.
HELP TABLE <table-name> ;
    Displays the column names, type identifier, and any user written
    comments on the columns within a table.
HELP VOLATILE TABLE ;
    Displays the names of all Volatile temporary tables active for
    the user session.
HELP VIEW <view-name> ;
    Displays the column names, type identifier, and any user written
    comments on the columns within a view.
HELP MACRO <macro-name> ;
    Displays the characteristics of parameters passed to it at
    execution time.
HELP PROCEDURE <procedure-name> ;
    Displays the characteristics of parameters passed to it at
    execution time.
HELP TRIGGER <trigger-name> ;
    Displays details created for a trigger, like action time and
    sequence.
HELP COLUMN <table-name>.* ;
HELP COLUMN <view-name>.* ;
HELP COLUMN <table-name>.<column-name>, ... ;
    Displays detail data describing the column level characteristics.
Figure 3-1
To see the database objects stored in a Database or User area, either
of the following HELP commands may be used:
HELP DATABASE My_DB ;
Or
HELP USER My_User ;
4 Rows Returned
Table/View/Macro name   Kind   Comment
employee                T      T=Table with 1 row per employee
employee_v              V      V=View for accessing Employee Table
Employee_m1             M      M=Macro to report on Employee Table
Employee_Trig           G      G=Trigger to update Employee Table
Since Teradata considers a database and a user to be equivalent, both
can store the same types of objects and therefore, the two commands
produce similar output.
Now that you have seen the names of the objects in a database or
user area, further investigation displays the names and the types of
columns contained within the object. For tables and views, use the
following commands:
HELP TABLE My_Table ;
7 Rows Returned
ColumnName  Type  Comment                           Nullable  Format       Title
Column1     I     This column is an integer         Y         -(10)9       ?
Column2     I2    This column is a smallint         Y         -(5)9        ?
Column3     I1    This column is a byteint          Y         -(3)9        ?
Column4     CF    This column is a fixed length     Y         X(20)        ?
Column5     CV    This column is a variable length  Y         X(20)        ?
Column6     DA    This column is a date             Y         YYYY-MM-DD   ?
Column7     D     This column is a decimal          Y         --------.99  ?

MaxLength  DecimalTotalDigits  DecimalFractionalDigits  RangeLow  RangeHigh
4          ?                   ?                        ?         ?          N
2          ?                   ?                        ?         ?          N
1          ?                   ?                        ?         ?          N
20         ?                   ?                        ?         ?          N
20         ?                   ?                        ?         ?          N
4          ?                   ?                        ?         ?          N
4          9                   2                        ?         ?          N

UpperCase  Table/View?  DefaultValue  CharType  IdColType
N          T            ?             ?         ?
N          T            ?             ?         ?
N          T            ?             ?         ?
N          T            ?             1         ?
N          T            ?             1         ?
N          T            ?             ?         ?
N          T            ?             ?         ?
The above output has been wrapped to multiple lines to show all the
detail information available on the columns of a table.
HELP VIEW My_View ;
(notice that the vast majority of the column data is not available for a
view; it comes from the table, not the SELECT that creates a view)
7 Rows Returned
ColumnName  Type  Comment                           Nullable  Format  Title
Column1     ?     This column is an integer         ?         ?       ?
Column2     ?     This column is a smallint         ?         ?       ?
Column3     ?     This column is a byteint          ?         ?       ?
Column4     ?     This column is a fixed length     ?         ?       ?
Column5     ?     This column is a variable length  ?         ?       ?
Column6     ?     This column is a date             ?         ?       ?
Column7     ?     This column is a decimal          ?         ?       ?

MaxLength  DecimalTotalDigits  DecimalFractionalDigits  RangeLow  RangeHigh
?          ?                   ?                        ?         ?          ?
?          ?                   ?                        ?         ?          ?
?          ?                   ?                        ?         ?          ?
?          ?                   ?                        ?         ?          ?
?          ?                   ?                        ?         ?          ?
?          ?                   ?                        ?         ?          ?
?          ?                   ?                        ?         ?          ?

UpperCase  Table/View?  DefaultValue  CharType  IdColType
?          ?            ?             ?         ?
?          ?            ?             ?         ?
?          ?            ?             ?         ?
?          ?            ?             1         ?
?          ?            ?             1         ?
?          ?            ?             ?         ?
?          ?            ?             ?         ?
The above output is wrapped to multiple lines and displays the column
name and the kind, which equates to the data type and any comment
added to a column. Notice that a view does not know the data type of
the columns from a real table. Teradata provides a COMMENT
command to add these comments on tables and columns.
The following COMMENT commands add a comment to a table and a
view:
This COMMENT command adds a comment to a column:
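The COMMENT statements are missing from this copy. A sketch of the three forms, reusing the object names My_Table, My_View and Column1 from the HELP examples in this chapter; the comment text for the table and the view is made up for illustration:

```sql
-- Object names reused from the HELP examples; table and view comment text is assumed.
COMMENT ON TABLE My_Table IS 'T=Table used for the HELP examples' ;

COMMENT ON VIEW My_View IS 'V=View for accessing My_Table' ;

-- A column is qualified by the table that owns it.
COMMENT ON COLUMN My_Table.Column1 IS 'This column is an integer' ;
```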
The above column information is helpful for most of the column types,
such as INTEGER (I), SMALLINT (I2) and DATE (DA) because the size
and the value range is a constant. However, the lengths of the
DECIMAL (D) and the character columns (CF, CV) are not shown here.
These are the most common of the data types. See chapter 18 (DDL)
for more details on data types.
The next HELP COLUMN command provides more details for all of the
columns:
The output is not shown again, since it is exactly the same as the
newer version of the HELP TABLE command.
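The HELP COLUMN statement is missing from this copy. Following the syntax chart in Figure 3-1 and reusing the My_Table name from the earlier example, it might be written as:

```sql
-- My_Table and the column names are reused from the HELP TABLE example.
-- The asterisk requests every column of the table.
HELP COLUMN My_Table.* ;

-- Individual columns can be listed instead.
HELP COLUMN My_Table.Column1, My_Table.Column7 ;
```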
The next chart shows HELP commands for information on database
tables and sessions, as well as SQL and SPL commands:
Help Commands:
HELP INDEX <table-name> ;
    Displays the indexes and their characteristics, like unique or
    non-unique, and the column or columns involved in the index.
    This data is used by the Optimizer to create a plan for SQL.
HELP STATISTICS <table-name> ;
    Displays values associated with the data demographics collected
    on the table. This data is used by the Optimizer to create a
    plan for SQL.
HELP CONSTRAINT <table-name>.<constraint-name> ;
    Displays the checks to be made on the data when it is inserted
    or updated and the columns that are involved.
HELP SESSION ;
    Displays the user name, account name, logon date and time,
    current database name, collation code set and character set
    being used, transaction semantics, time zone and character set
    data.
HELP 'SQL' ;
    Displays a list of available SQL commands and functions.
HELP 'SQL <command>' ;
    Displays the basic syntax and options for the actual SQL command
    inserted in place of the <command>.
HELP 'SPL' ;
    Displays a list of available SPL commands.
HELP 'SPL <command>' ;
    Displays the basic syntax and options for the actual SPL command
    inserted in place of the <command>.
Figure 3-2
The above chart does a pretty good job of explaining the HELP
functions. These functions only provide additional information if the
table object has one of these characteristics defined on it. The INDEX,
STATISTICS and CONSTRAINT functions will be further discussed in the
Data Definition Language Chapter (DDL) because of their relationship
to the objects.
At this point in learning SQL, and in the interest of getting to other
SQL functions, one of the most useful of these HELP functions is the
HELP SESSION.
The following HELP returns index information on the
Department_table:
HELP INDEX Department_table ;
3 Rows Returned
Unique?  Primary or   Column Names      Index  Approximate  Index  Ordered or
         Secondary?                     Id     Count        Name   Partitioned?
Y        P            Dept_No           1      8.00         ?      H
N        S            Department_name   4      8.00         ?      H
N        S            Mgr_No            8      6.00         ?      H
The following HELP returns information on the session from the PE:
HELP SESSION ;
1 Row Returned (columns wrapped for viewing)
User Name  Account Name  Logon Date  Logon Time  Current Database  Collation  Character Set
DBC        DBC           99/12/12    11:45:13    Personnel         ASCII      ASCII

Transaction Semantics  Current DateForm  Time Zone  Default Character Type  Export Latin
Teradata               Integerdate       00:00      LATIN                   1

Export Unicode  Export Unicode Adjust  Export KanjiSJIS  Export Graphic
1               0                      1                 0

Default Date Format  Radix Separator  Group Separator  Grouping Rule
YY/MM/DD             .                ,                3

Currency Radix Separator  Currency Group Separator  Currency Grouping Rule  Currency Name
.                         ,                         3                       US Dollars

Currency  ISOCurrency  Dual Currency Name  Dual Currency  Dual ISOCurrency
$         USD          US Dollars          $              USD

The above output has been wrapped for easier viewing. Normally, all
headings and values are on a single line.
The current date form, time zone and everything that follows them in
the output are new with the V2R3 release of Teradata. These columns
have been added to make their reference here easier than digging
through the Data Dictionary using SQL.
When using a tool like BTEQ, the line is truncated. So, for easier
viewing, the .SIDETITLES and .FOLDLINE commands show the output
in a vertical display.
The next sequence of commands can be used within BTEQ:
.sidetitles on
.foldline on
HELP SESSION;
1 Row Returned
User Name               MIKEL
Account Name            DBC
Logon Date              00/06/25
Logon Time              01:02:52
Current DataBase        MIKEL
Collation               ASCII
Character Set           ASCII
Transaction Semantics   Teradata
Current DateForm        IntegerDate
Session Time Zone       00:00
Default Character Type  LATIN
Export Latin            1
Export Unicode          1
Export Unicode Adjust   0
Export KanjiSJIS        1
Export Graphic          0
To reset the display to the normal line, use either of the following
commands:
.DEFAULTS
or
.SIDETITLES OFF
.FOLDLINE OFF
In BTEQ, any command that starts with a dot (.) does not have to end
with a semi-colon (;).
The next HELP command returns a list of the available SQL commands
and functions:
HELP 'SQL';
41 Rows Returned
On-Line Help
DBS SQL COMMANDS:
ABORT               ALTER TABLE        BEGIN LOGGING
BEGIN TRANSACTION   CHECKPOINT         COLLECT STATISTICS
COMMIT              COMMENT            CREATE DATABASE
CREATE INDEX        CREATE MACRO       CREATE TABLE
CREATE USER         CREATE VIEW        DATABASE
DELETE              DELETE DATABASE    DELETE USER
DROP DATABASE       DROP INDEX         DROP MACRO
DROP TABLE          DROP VIEW          DROP STATISTICS
ECHO                END LOGGING        END TRANSACTION
. . .

DBS SQL FUNCTIONS:
ABS                 ADD_MONTHS         AVERAGE
CHARACTERS          CAST               CHAR2HEXINT
COUNT               CORR               COVAR_POP
CSUM                EXP                EXTRACT
FORMAT              INDEX              HASHAMP
HASHBAKAMP          HASHBUCKET         HASHROW
KURTOSIS            LN                 LOG
MAVG                MAXIMUM            MCHARACTERS
MDIFF               MINDEX             MINIMUM
MLINREG             MSUBSTR            MSUM
NAMED               NULLIFZERO         OCTET_LENGTH
QUANTILE            REGR_INTERCEPT     REGR_SLOPE
RANDOM              RANK               SKEW
SQRT                STDDEV_POP         STDDEV_SAMP
SUBSTR              SUM                TITLE
TRIM                TYPE               UPPER
VARGRAPHIC          VAR_POP            VAR_SAMP
ZEROIFNULL
The above output is not a complete list of the commands. The three
dots in the center represent the location where commands were
omitted so it fit onto a single page. All commands are seen when
performed on a terminal.
Once this output has been used to find the command, then the
following HELP command provides additional information on it:
HELP 'SQL END TRANSACTION' ;
5 Rows Returned
Since the terminal is used most of the time to access the database,
take advantage of it and use the terminal for your HELP commands.
Tools like Queryman also have a variety of HELP commands and
individual menus. Always look for ways to make the task easier.
SET SESSION command
The Teradata Database provides user access only by allocating a
session with a Parsing Engine. The Parsing Engine will use default
attributes based on the user and host computer from which the user is
connecting. When a different session option is needed, the SET
SESSION command is used. It overrides the default for this session
only. The next time the user logs into Teradata, the original default will
be used again.
Syntax for SET SESSION:
The SET SESSION can be abbreviated as: SS.
Collation sequence: ASCII, EBCDIC, MULTINATIONAL (European
(diacritical) character or Kanji character), CHARSET_COLL (binary
ordering based on the current client character set), JIS_COLL (logical
ordering of characters based on the Japanese Industrial Standards
collation), HOST (EBCDIC for IBM channel-attached clients and ASCII
for all other clients - default collation).
Account-id: allows for the temporary changing of accounting data for
charge back and priority. The account-id specified must be a valid one
assigned to the user and the priority can only be downgraded.
INTEGERDATE: uses the YY/MM/DD format and ANSIDATE uses the
YYYY-MM-DD format for a date.
Database-name: becomes the database to use as the current
database for SQL operations during this session.
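The options listed above can be sketched as the following statements. This is an illustration under assumptions rather than the book's own syntax diagram: the account-id and database name are made up, and the options accepted vary by release.

```sql
-- Each statement overrides a default for the current session only.
SET SESSION COLLATION ASCII ;

-- Display dates as YYYY-MM-DD instead of YY/MM/DD.
SET SESSION DATEFORM = ANSIDATE ;

-- Hypothetical account-id; it must be one already assigned to the user.
SET SESSION ACCOUNT = '$L13678' FOR SESSION ;

-- Personnel is an assumed database name.
SET SESSION DATABASE Personnel ;

-- SS is the abbreviation for SET SESSION.
SS DATEFORM = INTEGERDATE ;
```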

SHOW commands
There are times when you need to recreate a table, view, or macro
that you already have, or you need to create another object of the
same type that is either identical or very similar to an object that is
already created. When this is the case, the SHOW command is a way
to accomplish what you need.
We will be discussing all of these object types and their associated
Data Definition Language (DDL) commands later in this course.
The intent of the SHOW command is to output the CREATE statement
that could be used to recreate the object of the type specified.
This chart shows the commands and their formats:
SHOW TABLE <table-name> ;
    Displays the CREATE TABLE statement needed to create this table.
SHOW VIEW <view-name> ;
    Displays the CREATE VIEW statement needed to create this view.
SHOW MACRO <macro-name> ;
    Displays the CREATE MACRO statement needed to create this macro.
SHOW TRIGGER <trigger-name> ;
    Displays the CREATE TRIGGER statement needed to create this
    trigger.
SHOW PROCEDURE <procedure-name> ;
    Displays the CREATE PROCEDURE statement needed to create this
    stored procedure.
SHOW <SQL-statement> ;
    Displays the CREATE TABLE statements for all tables/views
    referenced by the SQL statement.
Figure 3-3
To see the CREATE TABLE command for the Employee table, we use the
command:
SHOW TABLE Employee ;
13 Rows Returned
To see the CREATE VIEW command, we use a command like:
SHOW VIEW TODAY ;
3 Rows Returned
To see the CREATE MACRO command for the macro called MYREPORT,
we use a command like:
SHOW MACRO MYREPORT ;
9 Rows Returned
To see the CREATE TRIGGER command for AVG_SAL_T, we use:
SHOW TRIGGER AVG_SAL_T ;
;< Rows Returned
Since the SHOW command returns the DDL, it can be a real time
saver. It is a very helpful tool when a database object needs to be
recreated, a copy of an existing object is needed, or another object is
needed that has similar characteristics to an existing object. Plus, what
a great way to get a reminder on the syntax needed for creating a
table, view, macro, or trigger.
It is a good idea to save the output of the SHOW command in case it is
needed at a later date. However, if the object's structure changes, the
SHOW command should be re-executed and the new output saved. It
returns the DDL that can be used to create a new table exactly the
same as the current table. Normally, at a minimum, the table name is
changed before executing the command.
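The last form in the SHOW chart, where SHOW precedes an SQL statement, can be sketched as follows. The table and column names are assumptions; the request is not executed, and only the DDL for the referenced objects is returned.

```sql
-- Employee_table, Last_name and Salary are assumed names.
-- The SELECT itself does not run; SHOW returns the CREATE text
-- for every table or view the statement references.
SHOW SELECT Last_name
           ,Salary
FROM Employee_table
WHERE Salary > 50000 ;
```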

EXPLAIN
The EXPLAIN command is a powerful tool provided with the Teradata
database. It is designed to provide an English explanation of what
steps the AMP must complete to satisfy the SQL request. The EXPLAIN
is based on the PE's execution plan.
The Parsing Engine (PE) does the optimization of the submitted SQL,
the creation of the AMP steps and the dispatch to any AMP involved in
accessing the data. The EXPLAIN is an SQL modifier; it modifies the
way the SQL operates.
When an SQL statement is submitted using the EXPLAIN, the PE still
does the same optimization step as normal. However, instead of
building the AMP steps, it builds the English explanation and sends it
back to the client software, not to the AMP. This gives users the ability
to see resource utilization, use of indices, and row and time estimates.
Therefore, it can predict a Cartesian product join in seconds, instead of
hours later when the user gets suspicious that the request should have
been finished. The EXPLAIN should be run every time changes to an
object's structure occur, when a request is first put into production and
at other key times during the life of an application. Some companies
require that the EXPLAIN always be run before execution of any new
queries.
The syntax for using the EXPLAIN is simple: just type the EXPLAIN
keyword preceding your valid SQL statement. For example:
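A minimal sketch of the form, assuming a table named Student_table exists in the current database:

```sql
-- Student_table is an assumed table name.
-- The SELECT is optimized but not executed; the plan text is returned.
EXPLAIN
SELECT *
FROM Student_table ;
```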
The EXPLAIN can be used to translate the actions for all valid SQL. It
cannot provide a translation when syntax errors are present. The SQL
must be able to execute in order to be explained.
Chart for some of the keywords that may be seen in the output of an
EXPLAIN:
Locking Pseudo Table
    Serial lock on a symbolic table. Every table has one. Used to
    prevent deadlock situations between users.
Locking table for
    Indicates that an ACCESS, READ, WRITE, or EXCLUSIVE lock has
    been placed on the table.
Locking rows for <type>
    Indicates that an ACCESS, READ, or WRITE lock is placed on rows
    as they are read or written.
Do an ABORT test
    Guarantees a transaction is not in progress for this user.
All AMPs retrieve
    All AMPs are receiving the AMP steps and are involved in
    providing the answer set.
By way of an all rows scan
    Rows are read sequentially on all AMPs.
By way of primary index
    Rows are read using the Primary index column(s).
By way of index number
    Rows are read using the Secondary index; the number is the one
    shown by HELP INDEX.
BMSMS
    BitMap Set Manipulation Step, an alternative direct access
    technique when multiple NUSI columns are referenced in the
    WHERE clause.
Residual conditions
    WHERE clause conditions, other than those of a join.
Eliminating duplicate rows
    Providing unique values, normally the result of DISTINCT,
    GROUP BY or a subquery.
Where unknown comparison will be ignored
    Indicates that NULL values will not compare to a TRUE or FALSE.
    Might be seen in a subquery using NOT IN or NOT = ALL because
    no rows will be returned if the comparison is ignored.
Merge join
    Rows of one table are matched to the other table on common
    domain columns after being sorted into the same sequence,
    normally Row Hash.
Product join
    Rows of one table are matched to all the rows of the other table
    without concern for a domain match.
Duplicated on all AMPs
    Participating rows for the table (normally the smaller table) of a
    join are duplicated on all AMPs.
Hash redistributed on all AMPs
    Participating rows of a join are hashed on the join column and
    sent to the same AMP that stores the matching row of the table
    to join.
SMS
    Set Manipulation Step, the result of an INTERSECT, UNION,
    EXCEPT or MINUS operation.
Last use
    SPOOL file is no longer needed after the step and its space is
    released.
Built locally on the AMPs
    As rows are read, they are put into SPOOL on the same AMP.
Aggregate Intermediate Results are computed locally
    The aggregation values are all on the same AMP, so there is no
    need to redistribute them to work with rows on other AMPs.
Aggregate Intermediate Results are computed globally
    The aggregation values are not all on the same AMP and must be
    redistributed to one AMP, to accompany the same value from the
    other AMPs.
Figure 3-4
Once you attain more experience with Teradata and SQL, these terms
lead you to a more detailed understanding of the work involved in any
SQL request.
The first area to check in the output of the EXPLAIN is the estimated
number of rows that will be returned. This number is an educated
guess that the PE has made based on information available at the
time of the EXPLAIN. This number may or may not be accurate. If
there are current STATISTICS on the table, the numbers are more
accurate. Otherwise, the PE calculates a guess by asking a random
AMP for the number of rows it contains. Then, it multiplies the answer
by the number of AMPs to guess a "total row count." At the same
time, it lets you know how accurate the number provided might be
using the terms in the next chart.
This chart is for phrases that accompany the estimated number of
rows:
No confidence
    The PE has no degree of certainty with the values used. This is
    normally a result of not collecting STATISTICS and working with
    multiple steps in SPOOL.
Low confidence
    The PE is not sure of the values being used. This is normally a
    result of processing involving several steps in SPOOL instead of
    the actual rows in a table.
High confidence
    Normally indicates that STATISTICS have been collected on the
    columns or indices of a table. Allows the optimizer to be more
    aggressive in the access plan.
Index Join confidence
    Indicates that a join is being done that uses a join condition via
    a unique index.
Figure 3-5
The second area to check in the output of the EXPLAIN is the
estimated cost, expressed in time, to complete the SQL request.
Although it is expressed in time, do not confuse it with either wall-
clock or CPU time. It is strictly a cost factor calculated by the optimizer
for comparison purposes only. It does not take the number of users,
the current workload or other system related factors into account.
After looking at the potential execution plans, the plan with the lowest
cost value is selected for execution. Once these two values are
checked, the question that should be asked is: Are these values
reasonable?
For instance, if the table contains one million rows and the estimate is
one million rows in 45 seconds, that is probably reasonable if there is
not a WHERE clause. However, if the table contains a million rows and
is being joined to a table with two thousand rows and the estimate is
that two hundred trillion rows will be returned and it will take fifty
days, this is not reasonable.
The following EXPLAIN is for a full table scan of the Student Table:
12 Rows Returned
The EXPLAIN estimates 8 rows and .15 seconds. Since there are 10
rows in the table, the EXPLAIN is slightly off in its estimate. However,
this is reasonable based on the contents of the table and the SELECT
statement submitted.
The next EXPLAIN is for a join that has an error in it. Can you find it?
The EXPLAIN estimates nearly 512 rows will be returned and it will
take .39 seconds. Although the time estimate sounds acceptable, these
are very small tables. The estimate is 512 rows returned, yet there are
only 14 rows in the largest of these tables. This is not reasonable
based on the contents of the tables.
Upon further examination, the product join in step 6 is using (1=1) as
the join condition where it should be a merge join. Therefore, this is a
Cartesian product join. A careful analysis of the SELECT shows a single
join condition in the WHERE clause. However, this is a three-table join
and should have two join conditions. The WHERE clause needs to be
fixed and by using the EXPLAIN we have saved valuable time.
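The kind of mistake described here can be sketched as follows. The three tables and their columns are hypothetical stand-ins, not the statement the EXPLAIN above was run against:

```sql
-- Hypothetical tables and columns, for illustration only.
-- Problem: three tables but only one join condition, so Course_table
-- is unconstrained and is matched to every row (a product join).
EXPLAIN
SELECT S.Last_name
      ,C.Course_name
FROM Student_table        AS S
    ,Student_Course_table AS SC
    ,Course_table         AS C
WHERE S.Student_ID = SC.Student_ID ;

-- Corrected: a three-table join carries two join conditions.
EXPLAIN
SELECT S.Last_name
      ,C.Course_name
FROM Student_table        AS S
    ,Student_Course_table AS SC
    ,Course_table         AS C
WHERE S.Student_ID = SC.Student_ID
  AND SC.Course_ID = C.Course_ID ;
```

Comparing the row estimates of the two EXPLAIN outputs is a quick check that the second condition removed the product join.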
If you can get to the point of using the EXPLAIN in this manner, you
are way ahead of the game. No one will ever have to slap your hand
for writing SQL that runs for days, uses up large amounts of system
resources and accomplishes absolutely nothing. You say, "Doctor, it
hurts when I do this." The Doctor says, "Don't do that." We are saying,
"Don't put extensive SELECT requests into production without doing an
EXPLAIN on them."
Remember, always examine the EXPLAIN for reasonable results. Then,
save the EXPLAIN output as a benchmark against any future EXPLAIN
output. Then, if the SQL starts executing slower or using more
resources, you have a basis for comparison. You might also use the
benchmark if you decide to add a secondary index. This prototyping
allows you to see exactly what your SQL is doing.
Some users have quit using the EXPLAIN because they have gotten
inaccurate results. From our experience, when the numbers are
consistently different from the actual rows being returned and the cost
estimate is completely wrong, it is normally an indicator that
STATISTICS should be collected or updated on the involved tables.
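As a hedged sketch of collecting and then checking statistics (Student_table and Student_ID are assumed names, and the DDL chapter covers the full syntax):

```sql
-- Student_table and Student_ID are assumed names.
-- Collect demographics on one column.
COLLECT STATISTICS ON Student_table COLUMN Student_ID ;

-- Later, refresh whatever statistics were previously defined on the table.
COLLECT STATISTICS ON Student_table ;

-- Review what has been collected.
HELP STATISTICS Student_table ;
```

Re-running the EXPLAIN afterward shows whether the confidence phrases and row estimates have improved.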

Adding Comments
Sometimes it is necessary or desirable to document the logic used in
an SQL statement within the query. A comment is not executed and is
ignored by the PE at syntax checking and resolution time.
ANSI Comment
To comment a line using the ANSI standard form of a comment:
-- the double dash at the start of a single line denotes a comment is
on that line
Each line that is a comment must be started with the same two
dashes. This is the only technique available for commenting using
ANSI compliancy.
At the writing of this book, Queryman sometimes gets confused and
regards all lines after the -- as part of the comment. So, be careful
regarding various client tools.
-- This is an ANSI form of comment that consists of a single line of
user explanation or
-- add notes to an SQL command. This is a second line and needs
additional dashes
Teradata Comment
To comment a line using the Teradata form of a comment:
/* the slash asterisk at the start of a line denotes the beginning of a
comment
*/ the asterisk slash (reversed from the start of a comment) is used to
end a comment.
Both the start and the end of a comment can be on a single line or on
multiple lines. This is the most common form of comment seen in
Teradata SQL, primarily since it was the original technique available.
/* This is the Teradata form of comment that consists of a single line
of user explanation or add notes to an SQL command. Several lines of
comment can be added within a single notation. This is the end of the
comment. */
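Both comment forms can be sketched inside one request. The table and column names are assumptions:

```sql
-- ANSI form: each comment line starts with its own double dash.
-- Employee_table, Last_name and Salary are assumed names.
SELECT Last_name
      ,Salary
/* Teradata form: one opening slash asterisk can cover
   several lines, up to the closing asterisk slash. */
FROM Employee_table ;
```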

User Information Functions
The Teradata RDBMS (Relational DataBase Management System) has
incorporated into it functions that provide data regarding a user who
has performed a logon connection to the system. The following
functions make that data available to a user for display or storage.
ACCOUNT Function
Compatibility: Teradata Extension
A user within the Teradata database has an account number. This
number is used to identify the user, provide a basis for charge back, if
desired, and establish a basic priority.
Previously, this number was used exclusively by the database
administrator to control and monitor access to the system. Now, it is
available for viewing by the user via SQL.
Syntax for using the ACCOUNT function:
As an example, the following returns the account information for my
user:
SELECT ACCOUNT;
1 Row returned
ACCOUNT
$M13678
If your account starts with a $M, you are running at a medium priority,
where $L is low and $H is high. At the same time, the account does
not have to begin with one of these and can be any site specific value.
DATABASE Function
Compatibility: Teradata Extension
Chapter 1 of this book discussed the concept of a database and user
area within the Teradata RDBMS. Knowing the current database within
Teradata is sometimes an important piece of information needed by a
user. As mentioned above, the HELP SESSION is one way to determine
it. However, a lot of other information is also presented. Sometimes it
is advantageous to have only that single tidbit of data, not only to see
but also for storage. When this is the case, the DATABASE function is
available.
Syntax for using the DATABASE function:
As an example, the following returns the current database for my
user:
SELECT DATABASE;
1 Row returned
DATABASE
Mikel
SESSION Function
Compatibility: Teradata Extension
Chapter 1 of this book discussed the PE and the concept of a session
and its role involving the user's SQL requests. The HELP
SESSION provides a wealth of information regarding the individual
session established for a user. One of those pieces of data is the
session number. It uniquely identifies every user session in existence
at any point in time. Teradata now makes the session number available
using SQL.
Syntax for using the SESSION function:
As an example, the following returns the session number for my
user:
SELECT SESSION;
1 Row returned
SESSION
1059
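Since these functions are available for display or storage, a short sketch of both uses follows. Audit_log is a hypothetical table assumed to have three columns of suitable types:

```sql
-- Display all three values in one request.
SELECT ACCOUNT
      ,DATABASE
      ,SESSION ;

-- Store them: Audit_log is a hypothetical three-column table.
INSERT INTO Audit_log
SELECT ACCOUNT
      ,DATABASE
      ,SESSION ;
```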

Data Conversions
In order for data to be managed and used, it must have characteristics
associated with it. These characteristics are called attributes and
include a data type and a length. The values that a column can store
are directly related to these two attributes.
There are times when the data type or length defined is not convenient
for the use or output display needed. For instance, when character
data is too long for display, an option might be to reduce its length. At
other times, the defined numeric data type is not sufficient to store the
result of a mathematical operation. Therefore, conversion to a larger
numeric type may be the only way to successfully complete the
request.
When one of these situations interrupts the execution of the SQL, it is
necessary to use one or more of the conversion techniques. They are
covered here in detail to enhance the understanding and the use of
these capabilities.
In normal practice, there should be little need to convert from a
number to a character on a regular basis. This requirement is one
indicator that the table or column design is questionable. However, if a
conversion must be performed, it is much safer to use the ANSI
Standard CAST (Convert And Store) function when going from numeric
to character instead of the older Teradata implied conversion. Both of
these techniques are discussed here.
Conversions should be used only when absolutely necessary because
they are intensive on system resources. As an example, I saw an SQL
statement that converted four columns six different times. There were
around a million rows in the table. The SQL did a lot of processing and
it took about one hour to run. By eliminating these 6 million
conversions, the SQL ran in under five minutes. Conversions can have
an impact, but sometimes you need them. Use them only when
absolutely necessary!

Data Types
Teradata supports many formats for storing data on disk and most of
the data types conform to the ANSI standard. At the same time, there
are data types specific to Teradata. Most of these unique data types
are provided to save storage space on disk or support an international
code set.
Since Teradata was originally designed to store terabytes worth of data
in millions or billions of rows, saving a single byte one million times
becomes a space savings of nearly a megabyte. The savings increase
as more rows are added and more bytes per row are saved. This
space savings can be very significant.
Likewise, the speed advantage associated with smaller rows cannot be
ignored. Since data is read from a disk in a block, smaller rows mean
that more rows are stored in a single block. Therefore, fewer blocks
need to be read and it is faster.
The following charts indicate the data types currently supported by
Teradata. The first chart shows the ANSI standard types and the
second is for the additional data types that are extensions to the
standard.
This chart indicates which data types Teradata currently supports
as ANSI Standards:
Data Type / Description / Data Value Range
INTEGER
    Description: Signed whole number
    Data Value Range: -2,147,483,648 to 2,147,483,647
SMALLINT
    Description: Signed smaller whole number
    Data Value Range: -32,768 to 32,767
DECIMAL(X,Y)
    Where: X=1 thru 18, total number of digits in the number
    And Y=0 thru 18 digits to the right of the decimal
    Description: Signed decimal number, 18 digits on either side of
    the decimal point
    Data Value Range: Largest value DEC(18,0), smallest value
    DEC(18,18)
NUMERIC(X,Y)
    Same as DECIMAL
    Description: Synonym for DECIMAL
    Data Value Range: Same as DECIMAL
FLOAT
    Description: Floating Point Format (IEEE)
    Data Value Range: <value> x 10^307 to <value> x 10^-308
REAL
    Description: Stored internally as FLOAT
PRECISION
    Description: Stored internally as FLOAT
DOUBLE PRECISION
    Description: Stored internally as FLOAT
CHARACTER(X)
CHAR(X)
    Where: X=1 thru 64000
    Description: Fixed length character string, 1 byte of storage per
    character
    Data Value Range: 1 to 64,000 characters long, pads to length
    with space
VARCHAR(X)
CHARACTER VARYING(X)
CHAR VARYING(X)
    Where: X=1 thru 64000
    Description: Variable length character string, 1 byte of storage
    per character, plus 2 bytes to record length of actual data
    Data Value Range: 1 to 64,000 characters as a maximum. The
    system only stores the characters presented to it.
DATE
    Description: Signed internal representation of YYYMMDD (YYY
    represents the number of years from 1900, i.e. 100 for Year
    2000)
    Data Value Range: Currently to the year 3500 as a positive
    number and back into AD years as a negative number.
TIME
    Description: Identifies a field as a TIME value with Hour,
    Minutes and Seconds
TIMESTAMP
    Description: Identifies a field as a TIMESTAMP value with Year,
    Month, Day, Hour, Minute, and Seconds
Figure 4-1
This chart indicates which data types Teradata currently supports
as extensions:
Data Type / Description / Data Value Range
BYTEINT
    Description: Signed whole number
    Data Value Range: -128 to 127
BYTE (X)
    Where: X=1 thru 64000
    Description: Binary
    Data Value Range: 1 to 64,000 bytes
VARBYTE (X)
    Where: X=1 thru 64000
    Description: Variable length binary
    Data Value Range: 1 to 64,000 bytes
LONG VARCHAR
    Description: Variable length string
    Data Value Range: 64,000 characters (maximum data length). The
    system only stores the characters provided, not trailing spaces.
GRAPHIC (X)
    Where: X=1 thru 32000
    Description: Fixed length string of 16-bit bytes (2 bytes per
    character)
    Data Value Range: 1 to 32,000 KANJI characters
VARGRAPHIC (X)
    Where: X=1 thru 32000
    Description: Variable length string of 16-bit bytes
    Data Value Range: 1 to 32,000 characters as a maximum. The
    system only stores characters provided.
Figure 4-2
These data types are all available for use within Teradata. Notice that
there are fixed and variable length data formats. The fixed data types
always require the entire defined length on disk for the column. The
variable types can be used to maximize data storage within a block by
storing only the data provided within a row by the client software.
You should use the appropriate type for the specific data. It is a good
idea to use a VAR data type when most of the data is shorter than the
maximum size. The trade-off is the extra 2-byte length indicator that
is stored along with the actual data, so a column that is nearly always
full is better defined as a fixed length type.
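The choice between fixed and variable types can be sketched in a table definition. The table and its columns are hypothetical, and CREATE TABLE itself is covered in the DDL chapter:

```sql
-- Hypothetical table, shown only to contrast the type choices.
CREATE TABLE Customer_sketch
( Customer_id   INTEGER         -- 4 bytes
 ,Region_code   BYTEINT         -- 1 byte, values -128 to 127
 ,State_code    CHAR(2)         -- always full, so fixed length fits
 ,Street_addr   VARCHAR(60)     -- actual data plus a 2-byte length indicator
 ,Balance       DECIMAL(9,2)
 ,Open_date     DATE
)
UNIQUE PRIMARY INDEX (Customer_id) ;
```

Here State_code is always two characters, so VARCHAR(2) would cost two extra bytes per row for no benefit, while Street_addr is usually far shorter than 60 characters.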

CAST
Compatibility: ANSI
Under most conditions, the data types defined and stored in a table
should be appropriate. However, sometimes it is neither convenient
nor desirable to use the defined type. Data can be converted from one
type to another by using the CAST function. As long as the data
involved does not break any data rules (i.e. placing alphabetic or
special characters into a numeric data type) the conversion works. The
name of the CAST function comes from the Convert And STore
operation that it performs.
Care must also be taken when converting data to manage any
potential length issues. In Teradata mode, truncation occurs if a length
is requested that is shorter than the original data. However, in ANSI
mode, an SQL error is the result because ANSI says, "Thou shall not
truncate data."
The basic syntax of the CAST statement follows:
Examples using CAST:
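A hedged sketch of the general form and a few uses follows. Employee_table and its columns are assumed names, and these are not the book's original examples:

```sql
-- General form (angle brackets mark placeholders):
--   CAST( <column-name or expression> AS <data-type>[(<length>)] )

-- Employee_table and its columns are assumed names.
SELECT CAST(Last_name AS CHAR(10))          -- change a character length
      ,CAST(Dept_no AS CHAR(5))             -- numeric to character
      ,CAST(Salary AS INTEGER)              -- decimal to whole number
      ,CAST(Salary * 12 AS DECIMAL(12,2))   -- widen the result of arithmetic
FROM Employee_table
WHERE CAST(Dept_no AS INTEGER) = 400 ;      -- CAST inside a comparison
```

If Last_name holds more than 10 characters, the first CAST truncates in Teradata mode and fails in ANSI mode, as described above.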
These are only some of the potential conversions and are primarily
here for illustration of how to code a CAST. The CAST could also be
used within the WHERE clause to control the length characteristics or
the type of the data to compare.
Again, when using the CAST in ANSI mode, any attempt to truncate
data causes the SQL to fail because ANSI does not allow truncation.
The next SELECT uses literal values to show the results of conversion:
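A SELECT consistent with the column names and explanation in this section might look like the following sketch. The five-character literal 'ABCDE' is an assumption; the other literals are taken from the discussion of the output.

```sql
-- Reconstruction under assumptions, not the original statement.
SELECT CAST('ABCDE' AS CHAR(1))       AS Trunc     -- 'ABCDE' is an assumed literal
      ,CAST(128 AS CHAR(3))           AS OK
      ,CAST(127 AS INTEGER)           AS Bigger
      ,CAST(121.53 AS SMALLINT)       AS Whole
      ,CAST(121.53 AS DECIMAL(3,0))   AS Rounder ;
```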
1 Row Returned
Trunc  OK        Bigger  Whole  Rounder
A      128          127    121     122
In the above example, the first CAST truncates the five characters (left
to right) to form the single character 'A'. In the second CAST, the
integer 128 is converted to three characters and left justified in the
output. The 127 was initially stored in a SMALLINT (5 digits - up to
32767) and then converted to an INTEGER. Hence, it uses 11
character positions for its display, ten numeric digits and a sign
(positive assumed), and is right justified as numeric.
The value of 121.53 is an interesting case for two reasons. First, it was
initially stored as a DECIMAL with 5 total digits and 2 of them to the
right of the decimal point. Then it is converted to a SMALLINT using
CAST to remove the decimal positions. Therefore, it truncates data by
stripping off the decimal portion. It does not round data using this data
type. On the other hand, the CAST in the fifth column called Rounder
is converted to a
DECIMAL as 3 digits with no digits (3,0) to the right of the decimal, so
it will round data values instead of truncating. Since .53 is greater
than .5, it is rounded up to 122.

Implied CAST
Compatibility: Teradata Extension
Although the CAST function is the ANSI standard, it has not always
been that way. Prior to the CAST function, Teradata had the ability to
convert data from one type to another.
This conversion is requested by placing the "implied" data type
conversion in parentheses after the column name. Therefore, it
becomes a part of the select list and the column request. The new data
type is written as an attribute for the column name.
The following is the format for requesting a conversion:
At first glance, this appears to be the best and shortest technique for
doing conversions. However, there is a hidden danger here when
converting from numeric to character that is demonstrated in this
SELECT that uses the same data as above to do implied
CAST conversions:
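The format and a SELECT consistent with the output and explanation in this section can be sketched as follows. The literals are assumptions inferred from that discussion, not the original statement.

```sql
-- General form:  <column-name or literal> ( <data-type> )

-- Reconstruction under assumptions; 'ABCDE' is an assumed literal.
SELECT 'ABCDE'  (CHAR(1))    AS Shortened
      ,128      (CHAR(3))    AS OOPS1    -- numeric to character: the danger case
      ,-128     (CHAR(3))    AS OOPS2
      ,128      (INTEGER)    AS Bigger
      ,121.53   (SMALLINT)   AS Whole ;
```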
1 Row Returned
Shortened  OOPS1  OOPS2  Bigger  Whole
A                   -       128    121
What happened in the columns named OOPS1 and OOPS2?
The answer to this question is: the value 128 is 1 greater than 127
and therefore too large a value to store in a BYTEINT. So it is
automatically stored as a SMALLINT (5 digits plus a sign) before the
conversion. The implicit conversion changes it to a character type with
the first 3 characters being returned. As a result, only the first 3
spaces are seen in the report (_ _ _ 128). Likewise, OOPS2 is stored
as (_ _ -128) with the first three characters (2 spaces and - ) shown in
the output. Always think about the impact of the sign as a valid part of
the data when converting from numeric to character. As mentioned
earlier, if you find that conversions of this type are regularly necessary,
the table design needs to be re-examined.
As demonstrated in the above output, it is always safer to use
CAST when going from numeric to character data types.

Formatted Data
Compatibility: Teradata Extension
Remember that truncation works in Teradata mode, but not in ANSI
mode. So, another way to make data appear to be truncated is to use
the Teradata FORMAT in the SELECT list with one or more columns
when using a tool like BTEQ. Since FORMAT does not truncate data, it
works in ANSI mode.
The syntax for using FORMAT is:
The next SELECT demonstrates the use of FORMAT:
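A hedged sketch of the form and a SELECT along the lines of the output in this section follows. The literals and masks are assumptions chosen to resemble that output, not the original statement.

```sql
-- General form:  <column-name or literal> ( FORMAT '<format-mask>' )

-- Reconstruction under assumptions; literals and masks are illustrative.
SELECT 'ABCDE'            (FORMAT 'XXX')            AS Shorter      -- mask shorter than the data
      ,121.53             (FORMAT '99999')          AS Fmt_121      -- mask has no decimal positions
      ,121.53             (FORMAT '999.99')                         -- mask matches the data
      ,DATE '1999-10-01'  (FORMAT 'mm/dd/yyyy')     AS Fmt_NumDate
      ,DATE '1999-10-01'  (FORMAT 'mmmbdd,byyyy')   AS Fmt_Date ;   -- b requests a blank
```

Only the display is affected; the full value remains in spool, which is why this approach also works in ANSI mode.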
1 Row Returned
Shorter  Fmt_121  121.53  Fmt_NumDate  Fmt_Date
ABC      00121    121.53  10/01/1999   OCT 01, 1999
There are a couple of things to notice in this output. First, it works in
ANSI mode because truncation does not occur. The distinction is that
all of the data from the column is in spool. It is only the output that is
shortened, not truncated. The character data types use the 'X' for the
formatting character. Second, formatting does not round a data value;
as with the 121.53, the display is shortened. The numeric data types
use a '9' as the basic formatting character. Others are shown in this
chapter.
Next, DATE type data uses the 'M' for month, the 'D' for day of the
month and 'Y' for the year portion of a valid date. Lastly, the case of
the formatting characters does not matter. The formatting characters
can be written in all uppercase, lowercase, or a mixture of both cases.
The two following charts show the valid formatting characters for
Teradata and provide an explanation of the impact each one has on the
output display when using BTEQ:
Basic Numeric and Character Data Formatting Symbols
Symbol   Mask character and how used
X or x   Character data. Each X represents one character. Can repeat
         value, i.e. XXXXX or X(5).
9        Decimal digit. Holds place for numeric digit for a display 0
         through 9. All leading zeroes are shown if the format mask is
         longer than the data value. Can repeat value, i.e. 99999 or
         9(5).
V or v   Implied decimal point. Aligns data on a decimal value.
         Primarily used on imported data without actual decimal
         point.
E or e   Exponential. Aligns the end of the mantissa and the
         beginning of the exponent.
G or g   Graphic data. Each G represents one logical (double byte,
         KANJI or Katakana) character. Can repeat value, i.e. GGGGG
         or G(5).
Figure 4-3
Advanced Numeric and Character Formatting Symbols
Symbol   Mask character and how used
$        Fixed or floating dollar sign. Inserts a $ or leaves spaces
         and moves (floats) over to the first character of a currency
         value. With the proper keyboard, additional currency signs
         are available: Cent, Pound and Yen.
,        Comma. Inserted where it appears in format mask. Used
         primarily to make large numbers easier to read.
.        Period. Primary use is to align decimal point position. Also
         used for: dates and comma in some currencies.
-        Dash character. Inserted where it appears in format mask.
         Used primarily for dates and negative numeric values. Also
         used for: phone numbers, zip codes, and social security
         (USA).
/        Slash character. Inserted where it appears in format mask.
         Used primarily for dates.
%        Percent character. Inserted where it appears in format mask.
         Used primarily for display of percentage, i.e. 99% vs. .99
Z or z   Zero-suppressed decimal digit. Holds place for numeric digit
         displays 1 through 9 and 0, when significant. All leading
         zeroes (insignificant) are shown as space since their
         presence does not change the value of the number being
         displayed.
B or b   Blank data. Inserts a space where it appears in format mask.
Figure 4-4
The next chart shows the formatting characters used in conjunction
with DATE data:
Date Formatting Symbols
Symbol   Mask character and how used (not case specific)
M or m   Month. Allows month to be displayed anywhere in the date
         display. When 'MM' is specified, the numeric (01-12) value
         is available. When 'MMM' is specified, the three character
         (JAN-DEC) value is available.
D or d   Day. Allows day to be displayed anywhere in the date
         display. When 'DD' is specified, the numeric (01-31) value
         is available. When 'DDD' is specified, the three-digit day of
         the year (001-366) value is available.
Y or y   Year. Allows year to be displayed anywhere in the date
         display. The normal 'YY' has been used for many years for
         the 20th century with the 19YY assumed. However, since we
         have moved into the 21st century, it is recommended that
         the 'YYYY' be used.
Figure 4-5
There is additional information on date formatting in a later chapter
dedicated exclusively to date processing.
The next SELECT demonstrates some of the additional formatting
symbols:
1 Row Returned
Fmt_Shorter  Fmt_Phone     Z_Press  Fmt_Julian  Fmt_Pay
ABC          201-485-9999  1021.53  99274       $991,001.00
There are only two things that need to be watched when using the
FORMAT function. First, the data type must match the formatting
character used or a syntax error is returned. So, if the data is numeric,
use a numeric formatting character and the same condition for
character data. The other concern is configuring the format mask big
enough for the largest data column. If the mask is too short, the SQL
command executes; however, the output contains a series of
************* to indicate a format overflow, as demonstrated by the
following SELECT:
1 Row Returned
Fmt_Phone
*********
All of these FORMAT requests work wonderfully if the client software is
BTEQ. After all, it is a report writer and these are report writer options.
The issue is that the ODBC and Queryman look at the data as data,
not as a report. Since many of the formatting symbols are "characters"
they cannot be numeric. Therefore, the ODBC strips off the symbols
and presents the numeric data to the client software for display.
Tricking the ODBC to Allow Formatted Data
If a tool uses the ODBC, the FORMAT in the SELECT is ignored and the
data comes back as data, not as a formatted field. This is especially
noticeable with numeric data and dates.
To force tools like Queryman to format the data, the software must be
tricked into thinking the data is character type, which it leaves alone.
This can be done using the CAST function.
The next SELECT uses the CAST operation to trick the software into
thinking the formatted data is character:
1 Row Returned
Fmt_CAST_Phone  Fmt_CAST_Date  Fmt_CAST_Pay
485-9999        1999.10.01     $991,001.00
Do not let the presence of AS in the above SELECT confuse you. The
first AS, inside the parentheses, goes with the new data type for the
CAST. Notice that the parentheses enclose both the data and the
FORMAT so that they are treated as a single entity. The second AS is
outside the parentheses and is used to name the alias.

TITLE Attribute for Data Columns
Compatibility: Teradata Extension
As seen earlier, an alias may be used to change the column name. This
can be done for ease of reference or to alter the heading for the
column in the output. The TITLE is an alternative to using an alias
name when a column heading needs to be changed. There is a big
difference between TITLE and an alias. Although an alias does change
the title on a report, it is normally used to rename a column
(throughout the SQL) as a new name. The TITLE only changes the
column heading.
The syntax for using TITLE follows:
Like FORMAT, TITLE changes the attribute of the displayed data.
Therefore, it is written in parentheses also. Also like FORMAT, tools
using the ODBC may not work as well as they do in BTEQ, the report
writer. This is especially true when using the // stacking symbols. In
tools like Queryman, the title literally contains // and is probably not
the intent. Also, if you attempt to use TITLE in Queryman and it does
not work, there is a configuration option in the ODBC. When "Use
Column Names" is checked, it will not use the title designation.
The following SELECT uses the TITLE to show the result:
1 Row Returned
                Character
Character Data  Data            Numeric Data
Character Data  Character Data  123
Notice that the word 'Character' is stacked over the 'Data' portion of
the heading for the second column using BTEQ. So, as an alternative,
a TITLE can be used instead of an alias and allows the user to include
spaces in the output title.
Another neat trick for TITLE is to use two single quotes together
(TITLE ''). This technique creates a zero length TITLE, or no title at all,
as seen in the next SELECT:
1 Row Returned
Character Data
Character Data  Character Data  123
Remember, this TITLE is two separate single quotes, not a single
double quote. A double quote by itself does not work because it is
unbalanced without a second double quote.

Transaction Modes
Transaction mode is an area where the perspectives of the Teradata
RDBMS and ANSI part ways. Teradata, by default, is
completely non-case specific. ANSI requires just the opposite
condition: everything is case specific and, as we saw earlier, it dictates
that table and column names be in capital letters.
This is probably a little restrictive and I tend to agree completely with
the Teradata implementation. At the same time, Teradata allows the
user to work in either mode within a session when connected to the
RDBMS. The choice is up to the user when BTEQ is the client interface
software.
For instance, within BTEQ either of the following commands can be
used before logging onto the database:
.SET SESSION TRANSACTION ANSI
Or
.SET SESSION TRANSACTION BTET
The BTET transaction is simply an acronym made from a consolidation
of the BEGIN TRANSACTION (BT) and END TRANSACTION (ET)
commands to represent Teradata mode.
The system administrator defines the system default mode for
Teradata. A setting in the DBS Control record determines the default
session mode. The above commands allow the default to be overridden
for each logon session. The SET command must be executed
before the logon to establish the transaction mode for the next
session(s).
However, not all client software supports the ability to change modes
between Teradata and ANSI. When it is desirable for functionality or
processing characteristics of the other mode, other options are
available and are presented below. There is more information on
transactional processing later in this book.

Case Sensitivity of Data
It has been discussed earlier that there is no need for concern
regarding the use of lower or upper case characters when coding the
SQL. As a matter of fact, the different case letters can be mixed in a
single statement. Normally, the Teradata database does not care about
the case when comparing the stored data either.
However, the ANSI mode implementation of the Teradata RDBMS is
case sensitive regarding the data. This means that it knows the
difference between a lower case letter like 'a' and an upper case letter
'A'. At the same time, when using Teradata mode within the Teradata
database, it does not distinguish between upper and lower case
letters. It is the mode of the session that dictates the case sensitivity
of the data.
The SQL can always execute ANSI standard commands in Teradata
mode and likewise, can always execute Teradata extensions in ANSI
mode. The SQL is always the same regardless of the mode being used.
The difference comes when comparing the results of the data rows
being returned based on the mode.
For example, earlier in this chapter, it was stated that ANSI mode does
not allow truncation. Therefore, the FORMAT could be used in either
mode because it did not truncate data.
To demonstrate this issue, the following uses the different modes in
BTEQ:
No Rows Returned
The above SQL execution is case specific due to ANSI mode and 'A' is
different than 'a'. The same SQL is executed again here; however, the
transaction mode for the session is set to Teradata mode (BTET) prior
to the logon:
1 Row Returned
Do They Match?
They match
Now that the defaults have been demonstrated, the following functions
can be used to mimic the operation of each mode while executing in
the other (ANSI vs Teradata) where case sensitivity is concerned.

CASESPECIFIC
Compatibility: Teradata Extension
The CASESPECIFIC attribute may be used to request that Teradata
compare data values with a distinction made between upper and lower
case. The logic behind this designation is that even in Teradata mode,
case sensitivity can be requested to make the SQL work the same as
ANSI mode, which is case specific. Therefore, when CASESPECIFIC is
used, it normally appears in the WHERE clause.
The syntax of the next two statements execute exactly the same:
Or, it may be abbreviated as CS:
Conversely, if ANSI is the current mode and there is a need for it to be
non-case specific, the NOT can be used to adjust the default operation
of the SQL within a mode.
The following SQL forces ANSI to be non-case specific:
Or, it may be abbreviated as:
The next SELECT demonstrates the functionality of CASESPECIFIC and
CS for comparing an equality condition like it executed above in ANSI
mode:
No Rows Returned
No rows are returned, because 'A' is different than 'a' when case
sensitivity is used. At first glance, this seems to be unnecessary since
the mode can be set to use either ANSI or Teradata. However, the dot
(.) commands are BTEQ commands. They do not work in Queryman. If
case sensitivity is needed when using other tools, this is one of the
options available to mimic ANSI comparisons while in Teradata mode.
The SQL extensions in Teradata may be used to eliminate the absolute
need to log off to reset the mode and then log back onto Teradata in
order to use a characteristic like case sensitivity. Instead, Teradata
mode can be forced to use a case specific comparison, like ANSI mode,
by incorporating the CASESPECIFIC (CS) into the SQL. The case
specific option is not a statement level feature; it must be specified for
each column needing this type of comparison in both BTEQ and
Queryman.

LOWER Function
Compatibility: ANSI
The LOWER case function is used to convert all characters stored in a
column to lower case letters for display or comparison. It is a function
and therefore requires that the data be passed to it.
The syntax for using LOWER:
The following SELECT uses an upper case literal value as input and
outputs the same value, but in lower case:
SELECT LOWER('ABCDE') AS Result ;
1 Row Returned
Result
abcde
When LOWER is used in a WHERE clause, the result is a predictable
string of all lowercase characters. When compared to a lowercase
value, the result is a case blind comparison. This is true regardless of
how the data was originally stored.
SELECT 'They match' (title 'Do They match?')
WHERE LOWER('aBcDe') = 'abcde' ;
1 Row Returned
Do They match?
They match

UPPER Function
Compatibility: ANSI
The UPPER case function is used to convert all characters stored in a
column to the same characters in upper case. It is a function and
therefore requires that data be passed to it.
The syntax for using UPPER:
The next example uses a literal value within UPPER to show the output
all in upper case:
SELECT UPPER('aBcDe') AS Result ;
1 Row Returned
Result
ABCDE
It is also possible to use both the LOWER and UPPER case functions
within the WHERE clause. This technique can be used to make ANSI
non-case specific, like Teradata, by converting all the data to a known
state, regardless of the starting case. Thus, it does not check the
original data, but instead it checks the data after the conversion.
The following SELECT uses the UPPER function in the WHERE:
1 Row Returned
Do They match?
They match
When the data does not meet the requirements of the output format,
it is time to convert the data. The UPPER and LOWER functions can be
used to change the appearance or characteristics of the data to a
known state.
When case sensitivity is needed, ANSI mode is one way to accomplish
it. If that is not an option, the CASESPECIFIC attribute can be
incorporated into the SQL.
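The case-blind technique can be tried outside Teradata. The sketch
below is an illustration only: it runs the comparisons through Python's
built-in sqlite3 module as a stand-in database, whose default string
comparison happens to be case specific, much like ANSI mode. The helper
name matches is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def matches(where_clause: str) -> bool:
    """Return True when the SELECT with the given WHERE clause returns a row."""
    row = conn.execute(f"SELECT 'They match' WHERE {where_clause}").fetchone()
    return row is not None

case_specific = matches("'aBcDe' = 'abcde'")      # raw, case specific comparison
with_lower = matches("LOWER('aBcDe') = 'abcde'")  # compare in a known lower case state
with_upper = matches("UPPER('aBcDe') = 'ABCDE'")  # compare in a known upper case state
print(case_specific, with_lower, with_upper)
```

The raw comparison finds no row, while both converted comparisons
return the 'They match' row.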

Aggregate Processing
The aggregate functions are used to summarize column data values
stored in rows. Aggregates eliminate the detail information from the
rows and only return the answer. Therefore, the result is one or more
aggregated values as a single line or one line per unique value, as a
group. The other characteristic of these functions is that they all ignore
null values stored in column data passed to them.
Math Aggregates
The math aggregates are the original functions used to provide simple
types of arithmetic operations for the data values. Their names are
descriptive of the operation performed. The functions are listed below
with examples following their descriptions. The newer, V2R4 statistical
aggregates are covered later in this chapter.
The SUM Function
Accumulates the values for the named column and prints one total
from the addition.
The AVG Function
Accumulates the values for the named column and counts the number
of values added for the final division to obtain the average.
The MIN Function
Compares all the values in the named column and returns the smallest
value.
The MAX Function
Compares all the values in the named column and returns the largest
value.
The COUNT Function
Adds one to the counter each time a value other than null is
encountered.
The aggregates can all be used together in a single request on the
same column, or individually on different columns, depending on your
needs.
The following syntax shows all five aggregate functions in a single
SELECT to produce a single line answer set:
The following table is used to demonstrate the aggregate functions:
Student Table - contains 10 students
Student_ID  Last_name  First_name  Class_code  Grade_Pt
PK                                 FK
UPI         NUSI                   NUSI
123250      Phillips   Martin      SR          3.00
125634      Hanson     Henry       FR          2.88
234121      Thomas     Wendy       FR          4.00
231222      Wilson     Susie       SO          3.80
260000      Johnson    Stanley     ?           ?
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
324652      Delaney    Danny       SR          3.35
333450      Smith      Andy        SO          2.00
423400      Larkins    Michael     FR          0.00
Figure 5-1
The next SELECT uses the Student table to show all aggregates in one
statement working on the same column:
1 Row Returned
SUM(Grade_pt)  AVG(Grade_pt)  MIN(Grade_pt)  MAX(Grade_pt)  COUNT(Grade_pt)
24.88          2.76           0.00           4.00           9
Notice that Stanley's row is not included in the functions due to the
null in his grade point average. Also notice that no individual grade
point data is displayed because the aggregates eliminate this level of
column and row detail and only return the summarized result for all
included rows. The way to eliminate rows from being included in the
aggregation is through the use of a WHERE clause. Since the name of
the selected column appears as the heading for the column, aggregate
names make for funny looking headings. To make the output look
better, it is a good idea to use an alias to dress up the name used in
the output. Additionally, the alias can be used elsewhere in the SQL as
the column name.
The next SELECT demonstrates the use of alias names for the
aggregates:
1 Row Returned
Total  Average  Smallest  Largest  Count
24.88  2.76     0.00      4.00     9
Notice that when using aliases in the above SELECT they appear as the
heading for each column. Also the words Total, Average and Count are
in double quotes. As mentioned earlier in this book, the double quoting
technique is used to tell the PE that this is a column name, as opposed
to being the reserved word. Whereas, the single quotes are used to
identify a literal data value.
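The aliased aggregate request can be reproduced with the Student table
data from Figure 5-1. This is a minimal sketch that uses Python's
built-in sqlite3 module as a stand-in for Teradata; the table name
Student_Table and the exact SELECT wording are assumptions, not the
statement used to produce the output above.

```python
import sqlite3

# Last name, class code and grade point from Figure 5-1 (None represents a null).
students = [
    ("Phillips", "SR", 3.00), ("Hanson", "FR", 2.88), ("Thomas", "FR", 4.00),
    ("Wilson", "SO", 3.80), ("Johnson", None, None), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Delaney", "SR", 3.35), ("Smith", "SO", 2.00),
    ("Larkins", "FR", 0.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_name TEXT, Class_code TEXT, Grade_pt REAL)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", students)

# Double quotes mark Total, Average and Count as column names, not keywords.
total, average, smallest, largest, count = conn.execute("""
    SELECT SUM(Grade_pt)   AS "Total",
           AVG(Grade_pt)   AS "Average",
           MIN(Grade_pt)   AS Smallest,
           MAX(Grade_pt)   AS Largest,
           COUNT(Grade_pt) AS "Count"
    FROM Student_Table
""").fetchone()
print(round(total, 2), round(average, 2), smallest, largest, count)
```

Ten rows are stored, but the count is nine because the null grade point
is ignored by every aggregate.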
Aggregates and Derived Data
The various aggregates can work on any column. However, most of the
aggregates only work with numeric data. The COUNT function might
be the primary one used on either character or numeric data. The
aggregates can also be used with derived data.
The following table is used to demonstrate derived data and
aggregation:
Employee Table - contains 9 employees
Employee_No  Last_name   First_name  Salary      Dept_No
PK                                               FK
UPI          NUSI                                NUSI
1232578      Chambers    Mandee      $48,850.00  100
1256349      Harrison    Herbert     $54,500.00  400
2341218      Reilly      William     $36,000.00  400
2312225      Larkins     Loraine     $40,200.00  300
2000000      Jones       Squiggy     $32,800.50  ?
1000234      Smythe      Richard     $64,300.00  10
1121334      Strickling  Cletus      $54,500.00  400
1324657      Coffing     Billy       $41,888.88  200
1333454      Smith       John        $48,000.00  200
Figure 5-2
This SELECT totals the salaries for all employees and shows what the
total salaries will be if everyone is given a 5% or a 10% raise:
1 Row Returned
                                                         Computed
                                         Average         Average
Salary Total  +5% Raise    +10% Raise    Salary          Salary
$421,039.38   $442,091.35  $463,143.32   $46,782.15      $46,782.15
Notice that since both TITLE and FORMAT require parentheses, they
can share the same set. Also, the AVG function and dividing the SUM
by the COUNT provide the same answer.
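The derived totals are easy to verify from the salaries in Figure 5-2.
The following sketch again uses sqlite3 as a stand-in database; the
table name Employee_Table and the form of the derived expressions are
assumptions made for illustration, and the TITLE and FORMAT attributes
are left out because they are Teradata extensions.

```python
import sqlite3

# The nine salaries from Figure 5-2.
salaries = [48850.00, 54500.00, 36000.00, 40200.00, 32800.50,
            64300.00, 54500.00, 41888.88, 48000.00]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?)", [(s,) for s in salaries])

# Aggregates applied to derived data, plus AVG compared with SUM / COUNT.
total, raise_5, raise_10, average, computed = conn.execute("""
    SELECT SUM(Salary),
           SUM(Salary * 1.05),
           SUM(Salary * 1.10),
           AVG(Salary),
           SUM(Salary) / COUNT(Salary)
    FROM Employee_Table
""").fetchone()
print(round(total, 2), round(raise_5, 2), round(raise_10, 2),
      round(average, 2), round(computed, 2))
```

Rounded to two decimals, the five values agree with the output line
shown above.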

GROUP BY
It has been shown that aggregates produce one row of output with one
value per aggregate. However, the above SELECT is inconvenient if
individual aggregates are needed based on different values in another
column, like the class code. For example, you might want to see each
aggregate for freshmen, sophomores, juniors, and seniors.
The following SQL might be run once for each unique value specified in
the WHERE clause for class code; here the aggregates only work on
the senior class ('SR'):
1 Row Returned
Total  Average  Smallest  Largest  Count
6.35   3.175    3.00      3.35     2
Although this technique works for finding each class, it is not very
convenient. The first issue is that each unique class value needs to be
known ahead of time for each execution. Second, each WHERE clause
must be manually modified for the different values needed. Lastly,
each time the SELECT is executed, it produces a separate output. In
reality, it might be better to have all the results in a single report
format.
Since the results of aggregates are incorporated into a single output
line, it is necessary to create a way to provide one line returned per
unique data value. To provide a unique value, it is necessary to select
a column with a value that groups various rows together. This column
is simply selected and not used in an aggregate. Therefore, it is not
an aggregated column.
However, when aggregates and "non-aggregates" (normal columns)
are selected at the same time, a 3504 error message is returned to
indicate the mixture and that the non-aggregate is not part of an
associated group. Therefore, the GROUP BY is required in the SQL
statement to identify every column selected that is not an aggregate.
The resulting output consists of one line for all aggregate values for
each unique data value stored in the column(s) named in the GROUP
BY. For example, if the department number is used from the Employee
table, the output consists of one line per department with at least one
employee working in it.
The next SELECT uses the GROUP BY to create one line of output per
unique value in the class code column:
5 Rows Returned
Class_code  Total  Average  Smallest  Largest  Count
FR          6.88   2.29     0.00      4.00     3
?           ?      ?        ?         ?        0
JR          5.85   2.925    1.90      3.95     2
SR          6.35   3.175    3.00      3.35     2
SO          5.80   2.9      2.00      3.80     2
Notice that the null value in the class code column is returned. At first,
this may seem contrary to the aggregates ignoring nulls. However,
class code is not being aggregated and is selected as a "unique value."
All the aggregate values on the grade point for this row are null,
except for COUNT. The COUNT is zero, and this does indicate
that the null value is ignored. The COUNT value initially starts at zero,
so: 0 + 0 = 0. The GROUP BY is only required when a non-aggregate
column is selected along with one or more aggregates. Without both a
non-aggregate and a GROUP BY clause, the aggregates return only
one row. Whereas, with a non-aggregate and a GROUP BY clause
designating the column(s), the aggregates return one row per unique
value in the column, as seen above.
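The one-line-per-class behavior, including the null class code, can be
checked with the Figure 5-1 data. This is a sketch only: sqlite3 stands
in for Teradata, the table name Student_Table is an assumption, and the
SELECT is one plausible way to write the request rather than the
original statement.

```python
import sqlite3

# Class code and grade point from Figure 5-1 (None represents a null).
grades = [("SR", 3.00), ("FR", 2.88), ("FR", 4.00), ("SO", 3.80), (None, None),
          ("JR", 1.90), ("JR", 3.95), ("SR", 3.35), ("SO", 2.00), ("FR", 0.00)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Class_code TEXT, Grade_pt REAL)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", grades)

rows = conn.execute("""
    SELECT Class_code, SUM(Grade_pt), AVG(Grade_pt),
           MIN(Grade_pt), MAX(Grade_pt), COUNT(Grade_pt)
    FROM Student_Table
    GROUP BY Class_code
""").fetchall()

# Index the result by class code; the key None is the null class code.
by_class = {row[0]: row[1:] for row in rows}
for code, values in by_class.items():
    print(code, values)
```

The group for the null class code reports null for SUM, AVG, MIN and
MAX, and a COUNT of zero.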
Additionally, more than one non-aggregate column can be specified in
the SELECT and in the GROUP BY clause. The normal result of this is
that more rows are returned. This is because one row appears
whenever any single column value changes; the combination of each
column constitutes a new value. Remember, all non-aggregates
selected with an aggregate must be included in the GROUP BY, or a
3504 error is returned.
As an example, the last name might be added as a second
non-aggregate. Then, each combination of last name and class code is
compared to other students in the same class. This combination
creates more lines of output. As a result, each aggregate value is
primarily the aggregation of a single row. The only time multiple rows
are processed together is when multiple students have the same last
name and are in the same class. Then they group together based on
the values in both columns being equal.
This SELECT demonstrates the correct syntax when using multiple
non-aggregates with aggregates and the output is one line of output
for each student:
10 Rows Returned
Last_name  Class_code  Total  Average  Smallest  Largest  Count
Johnson    ?           ?      ?        ?         ?        0
Thomas     FR          4.00   4.00     4.00      4.00     1
Smith      SO          2.00   2.00     2.00      2.00     1
McRoberts  JR          1.90   1.90     1.90      1.90     1
Larkins    FR          0.00   0.00     0.00      0.00     1
Phillips   SR          3.00   3.00     3.00      3.00     1
Delaney    SR          3.35   3.35     3.35      3.35     1
Wilson     SO          3.80   3.80     3.80      3.80     1
Bond       JR          3.95   3.95     3.95      3.95     1
Hanson     FR          2.88   2.88     2.88      2.88     1
Beyond showing the correct syntax for multiple non-aggregates, the
above output reveals that it is possible to request too many
non-aggregates. As seen above, every output line is a single row.
Therefore, every aggregated value consists of a single row and the
aggregate is meaningless because it is the same as the original
data value. Also notice that without an ORDER BY, the GROUP BY does
not sort the output rows.
Like the ORDER BY, the number associated with the column's relative
position within the SELECT can also be used in the GROUP BY. In the
above example, the two columns are the first ones in the SELECT and
therefore, it is written using the shorter format: GROUP BY 1,2.
Caution: Using the shorter technique can cause problems if the
location of a non-aggregate is changed in the SELECT list and the
GROUP BY is not changed. The most common problem is a 3504 error
message indicating that a non-aggregate is not included in the GROUP
BY, so the SELECT does not execute.
As previously shown, the default for a column heading is the column
name. It is not very pretty to see the name of the aggregate and
column used as a heading. Therefore, an alias is suggested in all tools
or optionally, a TITLE in BTEQ to define a heading.
Also seen earlier, a COUNT on the grade point for the null class code
is zero. Actually, this is misleading in that 1 row contains a null, not
zero rows. But, because of the null value, the row is not counted. A
better technique might be the use of COUNT(*), for a row count.
Although this implies counting all columns, in reality it counts the row.
The objective of this request is to find any column that contains a
non-null data value.
Another method to provide the same result is to count any column that
is defined as NOT NULL. However, since it takes time to determine
such a column and its name is longer than typing an asterisk (*), it is
easier to use the COUNT(*).
Again, the GROUP BY clause creates one line of output per unique
value, but does not perform a sort. It only creates the distinct
grouping for all of the columns specified. Therefore, it is suggested
that you always include an ORDER BY to sort the output.
The following might be a better way to code the previous request,
using the COUNT(*) and an ORDER BY:
5 Rows Returned
Class_code  Total  Average  Smallest  Largest  Count
?           ?      ?        ?         ?        1
FR          6.88   2.29     0.00      4.00     3
JR          5.85   2.925    1.90      3.95     2
SO          5.80   2.9      2.00      3.80     2
SR          6.35   3.175    3.00      3.35     2
Now the output is sorted by the class code with the null appearing
first, as the lowest "value." Also notice the count is one for the row
containing mostly NULL data. The COUNT(*) counts the row.
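The difference between counting a column and counting the row shows up
clearly when both are requested side by side. The sketch below uses
sqlite3 as a stand-in database with the Figure 5-1 data; the table name
Student_Table is an assumption. SQLite also happens to sort nulls first
in ascending order, which matches the output above.

```python
import sqlite3

# Class code and grade point from Figure 5-1 (None represents a null).
grades = [("SR", 3.00), ("FR", 2.88), ("FR", 4.00), ("SO", 3.80), (None, None),
          ("JR", 1.90), ("JR", 3.95), ("SR", 3.35), ("SO", 2.00), ("FR", 0.00)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Class_code TEXT, Grade_pt REAL)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", grades)

# COUNT(Grade_pt) skips nulls; COUNT(*) counts every row in the group.
rows = conn.execute("""
    SELECT Class_code, COUNT(Grade_pt), COUNT(*)
    FROM Student_Table
    GROUP BY Class_code
    ORDER BY Class_code
""").fetchall()

codes = [row[0] for row in rows]
for row in rows:
    print(row)
```

Only the null class code row differs: the column count is zero while
the row count is one.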

Limiting Output Values Using HAVING
As in any SELECT statement, a WHERE clause can always be used to
limit the number or types of rows used in the aggregate processing.
However, something besides a WHERE is needed to evaluate
aggregate values because the aggregate is not finished until all eligible
rows have been read. Again, a WHERE clause eliminates rows during
the process of reading the base table rows. To allow for the elimination
of specific aggregate results, the HAVING clause is used to make the
final comparison before the aggregate results are returned.
The previous SELECT is modified below to compare the aggregates and
only return the classes from spool with a grade point average of B
(3.0) or better:
1 Row Returned
Class_code  Total  Average  Count
SR          6.35   3.18     2
Notice that all of the previously seen output with an average value less
than 3.00 has been eliminated as a result of using the HAVING clause.
The WHERE clause eliminates rows; the HAVING provides the last
comparison after the calculation of the aggregate and before results
are returned to the user client.
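The HAVING comparison can be sketched with the same Figure 5-1 data.
As before, sqlite3 is only a stand-in for Teradata, the table name
Student_Table is an assumption, and the SELECT is one plausible form of
the request rather than the original statement.

```python
import sqlite3

# Class code and grade point from Figure 5-1 (None represents a null).
grades = [("SR", 3.00), ("FR", 2.88), ("FR", 4.00), ("SO", 3.80), (None, None),
          ("JR", 1.90), ("JR", 3.95), ("SR", 3.35), ("SO", 2.00), ("FR", 0.00)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Class_code TEXT, Grade_pt REAL)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", grades)

# HAVING is evaluated after each group's aggregate has been calculated.
rows = conn.execute("""
    SELECT Class_code, SUM(Grade_pt), AVG(Grade_pt), COUNT(Grade_pt)
    FROM Student_Table
    GROUP BY Class_code
    HAVING AVG(Grade_pt) >= 3.0
    ORDER BY Class_code
""").fetchall()
print(rows)
```

Only the senior class survives the comparison, because its average of
3.175 is the only one at or above 3.0.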

Statistical Aggregates
In Teradata Release 4 (V2R4) there are several new aggregates that
perform statistical operations. Many of them are used in other internal
functions and now they are available for use within SQL.
Not only are these statistical functions the newest, but there are two
types of statistical functions. They are unary (single input value)
functions, and binary (dual input value) functions.
The unary functions look at individual column values for each row
included and compare all of the values for trends, similarities and
groupings. All the original aggregate functions are unary in that they
accept a single value to perform their processing.
The statistical unary functions are:
Kurtosis
Skew
Standard Deviation of a sample
Standard Deviation of a population
Variance of a sample
Variance of a population
The binary functions examine the relationship between the two
different values. Normally these two values represent two separate
points on an X-axis and Y-axis.
The binary functions are:
Correlation
Covariance
Regression Line Intercept
Regression Line Slope
The results from the statistical functions are not as obvious to
demonstrate and figure out as the original functions, like SUM or AVG.
The Stats table in Figure 5-3 is used to demonstrate the statistical
functions. Its column values have certain patterns in them. For
instance Col1 increases sequentially from 1 to 30 while Col4
decreases sequentially from 30 to 1. The remaining columns tend to
have the same value repeated and some values repeat more than
others. These values are used in both the unary and binary functions
to illustrate the types of answers generated using these statistical
functions.
The following table demonstrates the operation and output from the
new statistical aggregate functions in V2R4.
Stats Table - contains 30 rows
Col1  Col2  Col3  Col4  Col5  Col6
PK
1     1     1     30    1     0
2     1     1     29    2     5
3     3     10    28    3     10
4     3     10    27    4     15
5     3     10    26    5     20
6     4     10    25    6     30
7     5     10    24    7     30
8     5     10    23    8     30
9     5     10    22    9     35
10    5     20    21    10    35
11    7     20    20    22    40
12    7     20    19    12    40
13    9     20    18    13    45
14    9     20    17    14    45
15    9     20    16    15    50
16    9     20    15    14    55
17    10    20    14    13    55
18    10    20    13    12    60
19    10    20    12    11    60
20    10    20    11    9     65
21    10    20    10    8     65
22    10    20    9     7     65
23    13    20    8     6     70
24    13    30    7     5     70
25    13    30    6     4     80
26    14    40    5     3     85
27    15    40    4     2     90
28    15    50    3     1     95
29    16    50    2     1     100
30    16    60    1     1     100
Figure 5-3
The KURTOSIS Function
The KURTOSIS function is used to return a number that represents the
sharpness of a peak on a plotted curve of a probability function for a
distribution compared with the normal distribution.
A high value result is referred to as leptokurtic, a medium result
is referred to as mesokurtic, and a low result is referred to as
platykurtic.
A positive value indicates a sharp or peaked distribution and a
negative number represents a flat distribution. A peaked distribution
means that one value exists more often than the other values. A flat
distribution means that the same quantity of values exists for each
number.
If you compare this to the row distribution associated within Teradata,
most of the time a flat distribution is best, with the same number of
rows stored on each AMP. Having skewed data represents more of a
lumpy distribution.
Syntax for using KURTOSIS:
KURTOSIS(<column-name>)
The next SELECT uses KURTOSIS to compare the distribution of the
Stats table:
1 Row Returned
KofCol1  KofCol2  KofCol3  KofCol4  KofCol5  KofCol6
-1       -1       1        -1       -1       -1
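To see where values such as -1 and 1 come from, the calculation can be
sketched in Python. The formula below is the common sample-based excess
kurtosis; treating it as the definition behind the KURTOSIS aggregate is
an assumption made for illustration, and the function name kurtosis is
hypothetical. The inputs are Col1 and Col3 from Figure 5-3.

```python
import math

def kurtosis(values):
    """Sample excess kurtosis: negative for flat data, positive for peaked data."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

col1 = list(range(1, 31))  # 1 through 30, every value exactly once
col3 = [1] * 2 + [10] * 7 + [20] * 14 + [30] * 2 + [40] * 2 + [50] * 2 + [60]

k_col1 = kurtosis(col1)
k_col3 = kurtosis(col3)
print(round(k_col1, 2), round(k_col3, 2))
```

Col1 is perfectly flat, so its result is negative, while Col3 has a
pronounced peak at the value 20 and produces a positive result. Rounded
to whole numbers, these agree with the first and third columns of the
output above.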
The SKEW Function
The Skew indicates that a distribution does not have equal probabilities
above and below the mean (average). In a skewed distribution, the
median and the mean are not coincident, or equal.
Where:
a median value < mean value = a positive skew
a median value > mean value = a negative skew
a median value = mean value = no skew
Syntax for using SKEW:
SKEW(<column-name>)
The following SELECT uses SKEW to compare the distribution of the
Stats table:
1 Row Returned
SKofCol1  SKofCol2  SKofCol3  SKofCol4  SKofCol5  SKofCol6
0         -0        1         0         0         -0
T"e STDDE%+POP !nction
The standard de5iation function is a statistical measure of s0read or
dis0ersion of 5alues. It is the root=s sAuare of the difference of the
mean (a5erage$. This measure is to com0are the amount &# )hich a
set of 5alues differs from the arithmetical mean.
The 3T**EW'1' function is one of t)o that calculates the standard
de5iation. The 0o0ulation is of all the ro)s included &ased on the
com0arison in the %CE!E clause.
3#nta< for using 3T**EW'1'(
ST&&E1_+O+=<column-name>>
The ne<t 3ELECT uses 3T**EW'1' to determine the standard
de5iation on all columns of all ro)s )ithin the 3tats ta&le(
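One way this could be written, assuming the Stats_table and col1
through col6 names (both are assumptions). With no WHERE clause, every
row of the table forms the population:

```sql
-- Assumed names: Stats_table, col1 .. col6
SELECT STDDEV_POP(col1) AS SDPofCol1, STDDEV_POP(col2) AS SDPofCol2
      ,STDDEV_POP(col3) AS SDPofCol3, STDDEV_POP(col4) AS SDPofCol4
      ,STDDEV_POP(col5) AS SDPofCol5, STDDEV_POP(col6) AS SDPofCol6
FROM   Stats_table;
```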
1 Row Returned
SDPofCol1  SDPofCol2  SDPofCol3  SDPofCol4  SDPofCol5  SDPofCol6
9          4          14         9          4          27
The STDDEV_SAMP Function
The standard deviation function is a statistical measure of spread or
dispersion of values. It is the square root of the average squared
difference from the mean (average). It measures the amount by which a
set of values differs from the arithmetic mean.
The STDDEV_SAMP function is one of two that calculate the standard
deviation. It treats the rows that satisfy the WHERE clause as a
sample taken from a larger population, whereas STDDEV_POP treats
those same rows as the entire population.
Syntax for using STDDEV_SAMP:
STDDEV_SAMP(<column-name>)
The following SELECT uses STDDEV_SAMP to determine the standard
deviation on all columns, treating the rows of the Stats table as a
sample:
1 Row Returned
SDSofCol1  SDSofCol2  SDSofCol3  SDSofCol4  SDSofCol5  SDSofCol6
9          4          14         9          5          27
The VAR_POP Function
The variance function is a measure of dispersion (spread of the
distribution), calculated as the square of the standard deviation.
There are two forms of variance in Teradata. VAR_POP is for the
entire population of data rows allowed by the WHERE clause.
Although standard deviation and variance are regularly used in
statistical calculations, the meaning of variance is not easy to
elaborate. Most often variance is used in theoretical work where a
variance of the sample is needed.
There are two methods for using variance. These are the
Kruskal-Wallis one-way Analysis of Variance and the Friedman two-way
Analysis of Variance by rank.
Syntax for using VAR_POP:
VAR_POP(<column-name>)
The following SELECT uses VAR_POP to compare the variance of the
distribution on all rows from the Stats table:
1 Row Returned
VPofCol1  VPofCol2  VPofCol3  VPofCol4  VPofCol5  VPofCol6
75        19        191       75        20        723
The VAR_SAMP Function
The variance function is a measure of dispersion (spread of the
distribution), calculated as the square of the standard deviation.
There are two forms of variance in Teradata. VAR_SAMP treats the data
rows allowed through by the WHERE clause as a sample of a larger
population.
Although standard deviation and variance are regularly used in
statistical calculations, the meaning of variance is not easy to
elaborate. Most often variance is used in theoretical work where a
variance of the sample is needed to look for consistency.
There are two methods for using variance. These are the
Kruskal-Wallis one-way Analysis of Variance and the Friedman two-way
Analysis of Variance by rank.
Syntax for using VAR_SAMP:
VAR_SAMP(<column-name>)
The next SELECT uses VAR_SAMP to compare the variance of the
distribution, treating the rows of the Stats table as a sample:
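A sketch that places the sample and population forms side by side for
one column. For n rows, the sample variance divides the sum of squared
deviations by n - 1, where the population variance divides by n, so
the sample figure is slightly larger. Stats_table and col1 are assumed
names:

```sql
-- Assumed names: Stats_table, col1
SELECT VAR_SAMP(col1) AS VSofCol1   -- divides by n - 1
      ,VAR_POP(col1)  AS VPofCol1   -- divides by n
FROM   Stats_table;
```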
1 Row Returned
VSofCol1  VSofCol2  VSofCol3  VSofCol4  VSofCol5  VSofCol6
78        20        198       78        20        748
The CORR Function
The CORR function is a binary function, meaning that two variables
are used as input to it. It measures the association between two
random variables. If the variables are such that when one changes the
other does so in a related manner, they are correlated. Independent
variables are not correlated because a change in one does not
necessarily cause the other to change.
The correlation coefficient is a number between -1 and 1. It is
calculated from a number of pairs of observations or linear points
(X,Y).
Where:
1 = perfect positive correlation
0 = no correlation
-1 = perfect negative correlation
Syntax for using CORR:
CORR(<column-name>, <column-name>)
The following SELECT uses CORR to compare the association of values
stored in various columns from the Stats table:
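A sketch of such a SELECT, pairing col1 with each of the other
columns. Stats_table, the column names, and the alias spelling
(CofCol1#2 and so on) are assumptions read from the result headings:

```sql
-- Assumed names: Stats_table, col1 .. col6
SELECT CORR(col1, col2) AS CofCol1#2
      ,CORR(col1, col3) AS CofCol1#3
      ,CORR(col1, col4) AS CofCol1#4
      ,CORR(col1, col5) AS CofCol1#5
      ,CORR(col1, col6) AS CofCol1#6
FROM   Stats_table;
```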
1 Row Returned
CofCol1#2  CofCol1#3  CofCol1#4  CofCol1#5  CofCol1#6
0.986480   0.885155   -1.000000  -0.151877  0.991612
Since two column values are passed to this function and the first
example uses data values that sequentially ascend, the next example
uses col4 as the first value because it sequentially descends. It
demonstrates the impact of this sequence change on the result:
1 Row Returned
CofCol4#2  CofCol4#3  CofCol4#1  CofCol4#5  CofCol4#6
-0.986480  -0.885155  -1.000000  0.151877   -0.991612
The COVAR Function
The covariance is a statistical measure of the tendency of two
variables to change in conjunction with each other. It is equal to
the product of their standard deviations and their correlation
coefficient.
The covariance is a statistic used for bivariate samples or a
bivariate distribution. It is used for working out the equations for
regression lines and the product-moment correlation coefficient.
Syntax:
COVAR(<column-name>, <column-name>)
The next SELECT uses COVAR to compare the covariance association of
values stored in various columns from the Stats table:
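A sketch under two assumptions. First, Stats_table and col1 through
col6 are assumed names. Second, current Teradata manuals document the
population covariance under the name COVAR_POP, so the sketch uses
that spelling rather than the COVAR spelling shown in the syntax line:

```sql
-- Assumed names: Stats_table, col1 .. col6
-- COVAR_POP is used here in place of the COVAR spelling in the text
SELECT COVAR_POP(col1, col2) AS CVofCol1#2
      ,COVAR_POP(col1, col3) AS CVofCol1#3
      ,COVAR_POP(col1, col4) AS CVofCol1#4
      ,COVAR_POP(col1, col5) AS CVofCol1#5
      ,COVAR_POP(col1, col6) AS CVofCol1#6
FROM   Stats_table;
```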
1 Row Returned
CVofCol1#2  CVofCol1#3  CVofCol1#4  CVofCol1#5  CVofCol1#6
38          106         -75         -6          231
Since two column values are passed to this function and the first
example uses data values that sequentially ascend, the next example
uses col4 as the first value because it sequentially descends. It
demonstrates the impact of this sequence change on the result:
1 Row Returned
CvofCol4#2  CvofCol4#3  CvofCol4#1  CvofCol4#5  CvofCol4#6
-37         -106        -75         6           -231
The REGR_INTERCEPT Function
A regression line is a line of best fit, drawn through a set of
points on a graph of X and Y coordinates. It uses the Y coordinate as
the Dependent Variable and the X value as the Independent Variable.
Two regression lines always meet or intercept at the mean of the data
points (x,y), where x=AVG(x) and y=AVG(y). This point is not usually
one of the original data points.
Syntax for using REGR_INTERCEPT:
REGR_INTERCEPT(dependent-expression, independent-expression)
The following SELECT uses REGR_INTERCEPT to find the intercept point
between the values stored in various columns from the Stats table:
1 Row Returned
RIofCol1#2  RIofCol1#3  RIofCol1#4  RIofCol1#5  RIofCol1#6
-1          3           31          18          -1
Since two column values are passed to this function and the first
example uses data values that sequentially ascend, the next example
uses col4 as the first value because it sequentially descends. It
demonstrates the impact of this sequence change on the result:
1 Row Returned
RIofCol4#2  RIofCol4#3  RIofCol4#1  RIofCol4#5  RIofCol4#6
32          28          0           13          32
The REGR_SLOPE Function
A regression line is a line of best fit, drawn through a set of
points on a graph of X and Y coordinates. It uses the Y coordinate as
the Dependent Variable and the X value as the Independent Variable.
The slope of the line is the angle at which it moves on the X and Y
coordinates. The vertical slope is Y on X and the horizontal slope is
X on Y.
Syntax for using REGR_SLOPE:
REGR_SLOPE(dependent-expression, independent-expression)
The next SELECT uses REGR_SLOPE to find the slope for the values
stored in various columns from the Stats table:
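A sketch of such a SELECT. Stats_table and the column names are
assumptions, and so is the argument order: here col1 is treated as
the dependent expression and the second column as the independent
expression, following the syntax line above:

```sql
-- Assumed names: Stats_table, col1 .. col6
-- First argument = dependent (Y), second argument = independent (X)
SELECT REGR_SLOPE(col1, col2) AS RSofCol1#2
      ,REGR_SLOPE(col1, col3) AS RSofCol1#3
      ,REGR_SLOPE(col1, col4) AS RSofCol1#4
      ,REGR_SLOPE(col1, col5) AS RSofCol1#5
      ,REGR_SLOPE(col1, col6) AS RSofCol1#6
FROM   Stats_table;
```

REGR_INTERCEPT takes the same two arguments, so the same pattern
would apply to it.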
1 Row Returned
RSofCol1#2  RSofCol1#3  RSofCol1#4  RSofCol1#5  RSofCol1#6
2           1           -1          -0          0
Since two column values are passed to this function and the first
example uses data values that sequentially ascend, the next example
uses col4 as the first value because it sequentially descends. It
demonstrates the impact of this sequence change on the result:
1 Row Returned
RSofCol4#2  RSofCol4#3  RSofCol4#1  RSofCol4#5  RSofCol4#6
-2          -1          1           0           -0
Using GROUP BY
Like the original aggregates, the new statistical aggregates may be
combined with non-aggregate columns. The GROUP BY is used to identify
and form a group for each unique value in the selected non-aggregate
column.
Likewise, the new statistical aggregates are compatible with the
original aggregates, as seen in the following SELECT:
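A sketch of a SELECT of this kind, grouping on col3 and mixing COUNT
and AVG with the statistical aggregates. Stats_table and the column
names are assumptions; the aliases follow the result headings:

```sql
-- Assumed names: Stats_table, col1, col3, col4, col6
SELECT col3
      ,COUNT(*)         AS Cnt
      ,AVG(col1)        AS Avg1
      ,STDDEV_POP(col1) AS SD1
      ,VAR_POP(col1)    AS VP1
      ,AVG(col4)        AS Avg4
      ,STDDEV_POP(col4) AS SD4
      ,VAR_POP(col4)    AS VP4
      ,AVG(col6)        AS Avg6
      ,STDDEV_POP(col6) AS SD6
      ,VAR_POP(col6)    AS VP6
FROM   Stats_table
GROUP BY col3
ORDER BY col3;
```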
7 Rows Returned
col3  Cnt  Avg1  SD1  VP1  Avg4  SD4  VP4  Avg6  SD6  VP6
1     2    2     0    0    30    0    0    2     2    6
10    7    6     2    4    25    2    4    24    9    74
20    14   16    4    16   14    4    16   54    11   116
30    2    24    0    0    6     0    0    75    5    25
40    2    26    0    0    4     0    0    88    2    6
50    2    28    0    0    2     0    0    92    2    6
60    1    30    0    0    1     0    0    100   0    0
Use of HAVING
Also like the original aggregates, the HAVING may be used to
eliminate specific output lines based on one or more of the final
aggregate values.
The next SELECT uses the HAVING to perform a compound comparison on
both the count and the covariance:
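A sketch of a compound HAVING. The table and column names, the
columns passed to the covariance, and both comparison values are
hypothetical and only illustrate the shape of the clause. COVAR_POP is
the name under which current Teradata manuals document the population
covariance:

```sql
-- Assumed names and hypothetical thresholds throughout
SELECT col3
      ,COUNT(*)         AS Cnt
      ,AVG(col1)        AS Avg1
      ,STDDEV_POP(col1) AS SD1
      ,VAR_POP(col1)    AS VP1
FROM   Stats_table
GROUP BY col3
HAVING COUNT(*) > 2                  -- keep only the larger groups
   AND COVAR_POP(col1, col4) < 0     -- and only where the pair moves inversely
ORDER BY col3;
```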
2 Rows Returned
col3  Cnt  Avg1  SD1  VP1
10    7    6     2    4
20    14   16    4    16

Using the DISTINCT Function with Aggregates
At times throughout this book, examples show a function used within a
function and the power that provides. The COUNT aggregate provides
another opportunity to demonstrate a capability that might prove
useful. It combines the DISTINCT and aggregate functions.
The following may be used to determine how many different courses are
being taken, instead of the total number of students (10) with a
valid class code:
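A sketch of such a SELECT. The table name Student_table and the
column name grade_pt are assumptions; class_code appears in a later
result in this chapter:

```sql
-- Assumed names: Student_table, class_code, grade_pt
SELECT COUNT(DISTINCT class_code) AS Unique_Courses
      ,COUNT(DISTINCT grade_pt)   AS Unique_GPA
FROM   Student_table;
```

Dropping the DISTINCT keyword from both COUNT calls gives the count of
all non-null values, duplicates included.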
1 Row Returned
Unique_Courses  Unique_GPA
4               9
Note: Prior to V2R5, only a single column could be used for all
DISTINCT operations inside of aggregates.
Versus using all of the values:
1 Row Returned
Courses  GPAs
9        9
It is allowable to use the DISTINCT in multiple aggregates within a
SELECT. However, prior to V2R5 there was a restriction that required
every DISTINCT aggregate in the SELECT to use the same column. Now,
each can use a different column name.

Aggregates and Very Large Data Bases (VLDB)
As great as huge databases might be, there are considerations to take
into account when processing large numbers of rows. This section
enumerates a few of the situations that might be encountered. Read
them and think about the requirement or benefit of incorporating them
into your SQL.
Potential of Execution Error
Aggregates use the data type of the column they are aggregating. On
most databases, this works fine. However, when working on a VLDB,
this may cause the SELECT to fail on a numeric overflow condition. An
overflow occurs when the value being calculated exceeds the maximum
size or value for the data type being used.
For example, one billion (1,000,000,000) is a valid value for an
integer column because it is less than 2,147,483,647. However, if
three rows each have one billion as their value and a SUM operation
is performed, it fails on the third row.
Try the following series of commands to demonstrate an overflow and
its fix:
Create a table called Overflow_tbl with 2 columns
CT Overflow_tbl (Ovr_byte BYTEINT, Ovr_int INT);
Insert 3 rows with very large values of 1 billion, where the max
value is 2,147,483,647
INS overflow_tbl values (1, 10**9);
INS overflow_tbl values (2, 10**9);
INS overflow_tbl values (3, 10**9);
A SUM aggregate on these values will result in 3 billion
SEL SUM(ovr_int) AS sum_col FROM overflow_tbl;
***** 2616 numeric overflow
Attempting this SUM, as written, results in a 2616 numeric overflow
error. That is because 3 billion is too large to be stored in the
default data type of integer. Integer is the default because it is
the data type of the column being used within the aggregate. To fix
it, use either of the following techniques to convert the data column
to a different type before performing the aggregation.
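Two ways the conversion could be written against the overflow_tbl
created above. The target type DECIMAL(12,0) is an assumption; any
type wide enough to hold 3 billion would serve:

```sql
-- ANSI-style CAST applied to the column before it is summed
SEL SUM(CAST(ovr_int AS DECIMAL(12,0))) AS sum_col FROM overflow_tbl;

-- Teradata-style conversion of the column inside the aggregate
SEL SUM(ovr_int (DECIMAL(12,0))) AS sum_col FROM overflow_tbl;
```

In both forms the conversion happens inside the SUM, so the running
total is carried in the wider type rather than in INTEGER.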
1 Row Returned
sum_col
3,000,000,000
Whenever you find yourself in a situation where the SQL is failing
due to a numeric overflow, it is most likely due to the inherited
data type of the column. When this happens, be sure to convert the
type before doing the math.
GROUP BY versus DISTINCT
As seen in chapter 2, DISTINCT is used to eliminate duplicate values.
In this chapter, the GROUP BY is used to consolidate multiple rows
with the same value into the same group. It does the consolidation by
eliminating duplicates. On the surface, they provide the same
functionality.
The next SELECT uses GROUP BY without aggregation to eliminate
duplicates:
5 Rows Returned
class_code
?
FR
JR
SO
SR
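A result like the one above could come from a statement along these
lines. The table name Student_table is an assumption. The DISTINCT
form is shown next to it for comparison:

```sql
-- Assumed name: Student_table
-- GROUP BY with no aggregate: one output row per class_code value
SELECT class_code
FROM   Student_table
GROUP BY class_code
ORDER BY class_code;

-- The DISTINCT form returns the same rows
SELECT DISTINCT class_code
FROM   Student_table
ORDER BY class_code;
```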
The GROUP BY without aggregation returns the same rows as the
DISTINCT. So the obvious question becomes, which is more efficient?
The answer is not a simple one. Instead, something must be known
about the characteristics of the data. Generally, with more duplicate
data values, GROUP BY is more efficient. However, if only a few
duplicates exist, DISTINCT is more efficient. To understand the
reason, it is important to know how each of them eliminates the
duplicate values.
Technique used to eliminate duplicates (can be seen in EXPLAIN):
DISTINCT
- Reads a row on each AMP
- Hashes the column(s) value identified in the DISTINCT
- Redistributes the row value to the appropriate AMP
- Once all participating rows have been redistributed:
  - Sorts the data to combine duplicates on each AMP
  - Eliminates duplicates on each AMP
GROUP BY
- Reads all the participating rows
- Eliminates duplicates on each AMP using "buckets"
- Hashes the unique values on each AMP
- Redistributes the unique values to the appropriate AMP
- Once all unique values have been redistributed from every AMP:
  - Sorts the unique values to combine duplicates on each AMP
  - Eliminates duplicates on each AMP
Back to the original question: which is more efficient?
Since DISTINCT redistributes the rows immediately, more data may move
between the AMPs, compared to GROUP BY, which only sends unique
values between the AMPs. So, GROUP BY sounds more efficient. However,
if the data is nearly unique, GROUP BY spends time attempting to
eliminate duplicates that do not exist. It wastes the time spent
checking for duplicates the first time, and then it must redistribute
the same amount of data anyway.
Therefore, for efficiency, when there are:
- Many duplicates: use GROUP BY
- Few to no duplicates: use DISTINCT
- SPOOL space is exceeded: try GROUP BY

Performance Opportunities
The Teradata optimizer has always had options available to it when
performing SQL. It always attempts to use the most efficient path to
provide the answer set. This is true for aggregation as well.
When performing aggregation, the main shortcut available might
include the use of a secondary index. The index row is maintained in
a subtable. This row contains the row ID (row hash + uniqueness
value) and the actual data value stored in the data row. Therefore,
an index row is normally much shorter than a data row. Hence, more
index rows exist in an index block than data rows in a data block.
As a result, the read of an index block makes more values available
than the read of a data block. Since I/O is the slowest operation on
all computer systems, less I/O equates to faster processing. If the
optimizer can obtain all the values it needs for processing by using
the secondary index, it will. This is referred to as a "covered
query."
The creation of a secondary index is covered in this book as part of
the Data Definition Language (DDL) chapter.

Subquery
The subquery is a commonly used technique and a powerful way to
select rows from one table based on values in another table. It is
predicated on the use of a SELECT statement within a SELECT and takes
advantage of the relationships built into a relational database.
The basic concept behind a subquery is that it retrieves a list of
values that are used for comparison against one or more columns in
the main query. To accomplish the comparison, the subquery is written
in the WHERE clause, normally as part of an IN list.
In an earlier chapter, the IN was used to build a value list for
comparison against the rows of a table to determine which rows to
select. The next example illustrates how this technique can be used
to SELECT all the columns for rows containing any of the three
different values 10, 20 and 30:
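A sketch of such an IN list. The table name My_table is hypothetical;
column1 follows the heading in the result:

```sql
-- My_table is a hypothetical table name
SELECT *
FROM   My_table
WHERE  column1 IN (10, 20, 30);
```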
4 Rows Returned
Column1  Column2
10       A row with 10 in column1
30       A row with 30 in column1
10       A row with 10 in column1
20       A row with 20 in column1
As powerful as this is, what if the three values turned into a
thousand values? That is too much work and leaves too many
opportunities to forget one of the values. Instead of writing the
values manually, a subquery can be used to generate the values
automatically.
The coding technique of a subquery replaces the values previously
written in the IN list with a valid SELECT. Then the subquery SELECT
dynamically generates the value list. Once the values have been
retrieved, it eliminates the duplicates by automatically performing a
DISTINCT.
The following is the syntax for a subquery:
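The general shape can be illustrated with two hypothetical tables,
Main_table and Lookup_table, that share a column named column1. The
SELECT in parentheses takes the place of the hand-written value list:

```sql
-- Main_table, Lookup_table and column1 are hypothetical names
SELECT *
FROM   Main_table
WHERE  column1 IN
       ( SELECT column1          -- builds the IN list at run time
         FROM   Lookup_table );
```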
Conceptually, the subquery is processed first so that all the values
are expanded into the list for comparison with the column specified
in the WHERE clause. The values from the subquery SELECT can only be
used for comparison against the column or columns referenced in the
WHERE.
Columns inside the subquery SELECT cannot be returned to the user via
the main SELECT. The only columns available to the client are those
in the tables named in the main (first) FROM clause. The query in
parentheses is called the subquery and it is responsible for building
the IN list.
At the writing of this document, Teradata allows up to 64 tables in a
single query. Therefore, if each SELECT accessed only one table, a
query might contain 63 subqueries in a single statement.
The next two tables are used to demonstrate the functionality of
subqueries:
Customer Table - contains 5 customers
Customer_number  Customer_name        Phone_number
PK
UPI              NUSI                 NUSI
11111111         Billy's Best Choice  555-1234
31313131         Acme Products        555-1111
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012
Figure 6-1
Order Table - contains 5 orders
Order_number  Customer_number  Order_date  Order_total
PK            FK
UPI           NUSI             NUSI
123456        11111111         980504      12347.53
123512        11111111         990101      08005.91
123552        31323134         991001      05111.47
123585        87323456         991010      15231.62
123777        57896883         990909      23454.84
Figure 6-2
The next SELECT uses a subquery to find all customers that have an
order of more than $10,000.00:
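A sketch of that subquery. The table names Customer_table and
Order_table are assumptions; the column names follow Figures 6-1 and
6-2:

```sql
-- Assumed table names: Customer_table, Order_table
SELECT Customer_name
      ,Phone_number
FROM   Customer_table
WHERE  Customer_number IN
       ( SELECT Customer_number
         FROM   Order_table
         WHERE  Order_total > 10000 );
```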
3 Rows Returned
Customer_name        Phone_number
Billy's Best Choice  555-1234
XYZ Plumbing         347-8954
Databases N-U        322-1012
This is an appropriate place to mention that the columns being
compared between the main query and the subquery must be from the
same domain. Otherwise, if no equal condition exists, no rows are
returned. The above SELECT uses the customer number (FK) in the Order
table to match the customer number (PK) in the Customer table. They
are both customer numbers and therefore have the opportunity to
compare equal from both tables.
The next subquery swaps the queries to find all the orders by a
specific customer:
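A sketch with the roles of the two tables swapped. The table names
and the LIKE pattern used to pick one customer are assumptions:

```sql
-- Assumed table names: Order_table, Customer_table
SELECT Order_number
      ,Order_total
FROM   Order_table
WHERE  Customer_number IN
       ( SELECT Customer_number
         FROM   Customer_table
         WHERE  Customer_name LIKE 'Billy%' );   -- hypothetical filter
```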
2 Rows Returned
Order_number  Order_total
123456        12347.53
123512        8005.91
Notice that the Customer table is used in the main query to answer a
customer question and the Order table is used in the main query to
answer an order question. However, they both compare on the customer
number as the common domain between the two tables.
Both of the previous subqueries work fine for comparing a single
column in the main table to a value list in the subquery. Thus, it is
possible to answer questions like, "Which customer has placed the
largest order?" However, they cannot answer this question: "What is
the maximum order for each customer?"
To make subqueries more sophisticated and powerful, they can compare
more than one column at a time. The multiple columns are referenced
in the WHERE clause of the main query and are enclosed in
parentheses.
The key is this: if multiple columns are named before the IN portion
of the WHERE clause, the exact same number of columns must be
referenced in the SELECT of the subquery to obtain all the required
values for comparison.
Furthermore, the corresponding columns (outside and inside the
subquery) must respectively be of the same domain. Each of the
columns must be equal to a corresponding value in order for the row
to be returned. It works like an AND comparison.
The following SELECT uses a subquery to match two columns with two
values in the subquery to find the highest dollar orders for each
customer:
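A sketch of a two-column comparison. The table name Order_table and
the alias Customer are assumptions. Both columns in parentheses must
match a row produced by the subquery:

```sql
-- Assumed table name: Order_table
SELECT Customer_number AS Customer
      ,Order_number
      ,Order_total
FROM   Order_table
WHERE  (Customer_number, Order_total) IN
       ( SELECT Customer_number
               ,MAX(Order_total)       -- one maximum per customer
         FROM   Order_table
         GROUP BY Customer_number );
```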
4 Rows Returned
Customer  Order_number  Order_total
11111111  123456        12347.53
57896883  123777        23454.84
31323134  123552        5111.47
87323456  123585        15231.62
Although this works well for MIN and MAX types of values
(equalities), it does not work well for finding values greater than
or less than an average. For this type of processing, a correlated
subquery is the best solution and will be demonstrated later in this
chapter.
Since 64 tables can be in a single Teradata SQL statement, as
mentioned previously, a maximum of 63 subqueries can be written into
a single statement. The following shows a 3-table access using two
separate subqueries. Additional subqueries simply follow the same
pattern.
From the above tables, it is also possible to find the customer who
has placed the single highest dollar amount order. To accomplish
this, the Order table must be used to determine the maximum order.
Then, the Order table is used again to compare the maximum with each
order, and finally the result is compared to the Customer table to
determine which customer placed the order.
The next subquery can be used to find them:
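A sketch of the nested form described above, with one subquery inside
another. The table names are assumptions:

```sql
-- Assumed table names: Customer_table, Order_table
SELECT Customer_name
      ,Phone_number
FROM   Customer_table
WHERE  Customer_number IN
       ( SELECT Customer_number              -- who placed that order
         FROM   Order_table
         WHERE  Order_total IN
                ( SELECT MAX(Order_total)    -- the largest single order
                  FROM   Order_table ) );
```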
1 Row Returned
Customer_name  Phone_number
XYZ Plumbing   347-8954
It is now known that XYZ Plumbing has the highest dollar order. What
is not known is the amount of the order. Since the order total is in
the Order table, which is not referenced in the main query, it cannot
be part of the SELECT list.
In order to see the order total, a join must be used. Joins will be
covered in the next chapter.
Using NOT IN
As seen in a previous chapter, when using the IN and a value list,
the NOT IN can be used to find all of the rows that do not match.
Using this technique, the subquery above can be modified to find the
customers without an order. The only changes made are to 1) add the
NOT before the IN and 2) eliminate the WHERE clause in the subquery:
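A sketch of the modified statement, assuming the table names
Customer_table and Order_table:

```sql
-- Assumed table names: Customer_table, Order_table
SELECT Customer_name
      ,Phone_number
FROM   Customer_table
WHERE  Customer_number NOT IN
       ( SELECT Customer_number
         FROM   Order_table );
```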
1 Row Returned
Customer_name   Phone_number
Databases R Us  322-1012
Caution needs to be used regarding the NOT IN when there is a
potential for including a NULL in the value list. Since the
comparison of a NULL and any other value is unknown, and the NOT of
an unknown is still an unknown, no rows are returned. Therefore, when
there is potential for a NULL in the subquery, it is best to also
code a compound comparison as seen in the following SELECT:
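One way to code that guard is to keep NULLs out of the value list
inside the subquery. The table names are assumptions:

```sql
-- Assumed table names: Customer_table, Order_table
SELECT Customer_name
      ,Phone_number
FROM   Customer_table
WHERE  Customer_number NOT IN
       ( SELECT Customer_number
         FROM   Order_table
         WHERE  Customer_number IS NOT NULL );  -- exclude NULLs from the list
```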
Using Quantifiers
In other RDBMS systems and early Teradata versions, using an equality
symbol (=) in a comparison normally proved to be more efficient than
using an IN list. The reason was that it allowed indices, if they
existed, to be used instead of a sequential read of all rows. Today,
Teradata automatically uses indices whenever they are more efficient.
So, the use of quantifiers is optional and an IN works exactly the
same.
Another powerful use for quantifiers is with inequalities. It is
sometimes necessary to find all rows that are greater than or less
than one or more other values.
To use quantifiers, replace the IN with an =, <, >, ANY, SOME or ALL
as demonstrated in the following syntax:
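A sketch of a quantified comparison with an inequality, using the
Order table from Figure 6-2. The table name Order_table and the
question being asked are assumptions made for illustration:

```sql
-- Assumed table name: Order_table
-- Orders larger than every order placed by customer 11111111
SELECT Order_number
      ,Order_total
FROM   Order_table
WHERE  Order_total > ALL
       ( SELECT Order_total
         FROM   Order_table
         WHERE  Customer_number = 11111111 );
```

Replacing > ALL with > ANY would instead keep the orders larger than
at least one of that customer's orders.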
Earlier in this chapter, a two level subquery was used to find the
customer who spent the most money on a single order. It used an IN
list to find equal values. The next SELECT uses = ANY in a similar
two level subquery to find customers:
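A sketch of a two level subquery written with = ANY. It follows the
description of comparing each order with the average order amount;
the table names and the exact comparison are assumptions:

```sql
-- Assumed table names: Customer_table, Order_table
SELECT Customer_name
      ,Phone_number
FROM   Customer_table
WHERE  Customer_number = ANY
       ( SELECT Customer_number
         FROM   Order_table
         WHERE  Order_total >
                ( SELECT AVG(Order_total)   -- average of all orders
                  FROM   Order_table ) );
```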
2 Rows Returned
Customer_name        Phone_number
Billy's Best Choice  555-1234
XYZ Plumbing         347-8954
In order to accomplish this, the Order table is first used to
determine the average order amount. Then, the Order table is used
again to compare the average with each order, and finally the result
is compared to the Customer table to determine which of the customers
qualify.
The quantifiers SOME and ANY are interchangeable. However, the use of
ANY conforms to the ANSI standard and SOME is the Teradata extension.
The = ANY is functionally equivalent to using an IN list.
The ALL and the = are more limited in their scope. In order for them
to work, there can only be a single value from the subquery for each
of the values in the WHERE clause. However, earlier the NOT IN was
explored. When using quantifiers and the NOT, consider the following:
Equivalency Chart
IN      is equivalent to  = ANY
NOT IN  is equivalent to  NOT = ALL
Figure 6-3
Of these, the NOT = ALL takes the most thought. It forces the system
to examine every value in the list to make sure that the value being
compared is checked against all the values. Otherwise, as soon as any
of the values is different, the row is returned without looking at
the other values (ALL).
Although the above describes the conceptual approach of a subquery,
the Teradata optimizer will normally use a join to optimize and
locate the rows that are needed from within the database. This usage
may be seen using the EXPLAIN. Joins are discussed in the next
chapter.

Qualifying Table Names and Creating a Table Alias
This section provides techniques to specifically reference tables and
columns throughout all databases and to temporarily rename tables
with an alias name. Both of these techniques are necessary to provide
specific and unique names to the optimizer at SQL execution time.
Qualifying Column Names
Since column names within a table must be unique, the system knows
which data to access simply by using the column name. However, when
more than one table is referenced by the FROM in a single SELECT,
this may not be the case. The potential exists for columns of the
same domain to have the same name in more than one table. When this
happens, the system does not guess which column to reference. The SQL
must explicitly declare which table to use for accessing the column.
This declaration is called qualifying the column name. If the SQL
does not qualify a column name that appears in more than one table,
the system displays an error message that indicates too much
ambiguity exists in the query. Correlated subqueries, addressed next,
and join processing, in the next chapter, both make use of more than
one table at the same time. Therefore, many times it is important to
make sure the system knows which table's columns to use for all
portions of the SQL statement.
To qualify a column name, the table name and column name are
connected using a period, sometimes referred to as a dot (.). The dot
connects the names without a space to make the two names work as a
single reference name. However, if the column has different names in
the multiple tables, there is no confusion within the system and
therefore no need to qualify the name.
To illustrate this concept, let's consider people instead of tables.
For instance, Mike is a common name. If two Mikes are in different
rooms and someone uses the name in either room, there is no
confusion. However, if both Mikes are in the same room and someone
uses the name, both Mikes respond and therefore confusion exists. To
eliminate the conflict, the use of the first and last names makes the
identification unique.
The syntax for using qualification levels follows:
3-level reference: <database-name>.<table-name>.<column-name>
2-level reference: <database-name>.<table-name>
2-level reference: <table-name>.<column-name>
Whenever all 3 levels are used, the first name is always the
database, the second is the table and the last is the column.
However, when two names appear in a 2-level qualification, the
location of the names within the SQL must be examined to know for
sure their meaning. Since the FROM names the tables, the first name
of a qualified name there is a database name and the second is a
table. Since columns are referenced in the SELECT list and WHERE
clause, the first name there is a table name and the second is an *
(all columns) or a single column.
In Teradata, the following is a valid statement, including the
abbreviation for SELECT and the missing FROM:
SEL DBC.TABLES.* ;
This technique is not ANSI standard. However, the PE has everything
needed to get all columns and rows out of the TABLES table in the DBC
database.
Creating an Alias for a Table
Since table names can be up to 30 characters long, a commonly used
technique to save typing when the name is used more than once is to
provide a temporary name for the table within the SELECT. The new
temporary name for a table is called an alias name.
Once the alias is created for the table, it is important to use the
alias name throughout the request. Otherwise the system treats the
use of the full table name as another table and it causes undesirable
results. To establish an alias for a table, in the FROM, simply
follow the name of the table with an AS: FROM <table-name> AS
<table-alias-name>.
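A sketch that combines both techniques. Customer_number exists in both
tables of Figures 6-1 and 6-2, so it must be qualified. The table
names and the aliases cust and ord are assumptions:

```sql
-- Assumed table names and hypothetical aliases (cust, ord)
SELECT cust.Customer_name
      ,ord.Order_number
      ,ord.Order_total
FROM   Customer_table AS cust
      ,Order_table    AS ord
WHERE  cust.Customer_number = ord.Customer_number;  -- alias qualifies each column
```

Every reference uses the alias rather than the full table name, in
line with the advice above.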

Correlated Subquery Processing
The correlated subquery is a very powerful tool. It is an excellent
technique to use when there is a need to determine which rows to
SELECT based on one or more values from another table. This is
especially true when the value for comparison is based on an
aggregate. It combines subquery processing and join processing into a
single request.
For example, one Teradata user has the need to bill their customers
and incorporate the latest payment date. Therefore, the latest date
needs to be obtained from the table. So, the payment date is found
using the MAX aggregate in the subquery. However, it must be the
latest payment date for that customer, which might be different for
each customer. The processing involves the subquery locating the
maximum date for only one customer account at a time.
The correlated subquery is perfect for this processing. It is more
efficient and faster than using a normal subquery with multiple
values. One reason for its speed is that it can perform some
processing steps in parallel, as seen in an EXPLAIN. The other reason
is that it only finds the maximum date when a particular account is
read for processing, not for all accounts like a normal subquery.
The operation of a correlated subquery differs from that of a normal
subquery. Instead of comparing the selected subquery values against
all the rows in the main query, the correlated subquery works
backward. It first reads a row in the main query, and then goes into
the subquery to find all the rows that match the specified column
value. Then, it gets the next row in the main query and retrieves all
the subquery rows that match the value in that row. This processing
continues until all the qualifying rows from the main SELECT are
satisfied. Although this sounds terribly inefficient, and is
inefficient on other databases, it is extremely efficient in
Teradata. This is due to the way the AMPs handle this type of
request. The AMPs are smart enough to remember and share each value
that is located.
Thus, when a second row comes into the comparison that contains the
same value as an earlier row, there is no need to re-read the
matching rows. That operation has already been done once and the AMPs
remember the answer from the first comparison.
The following is the syntax for writing a correlated subquery:
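The general shape can be sketched with the billing example above.
Payment_table, its columns, and the aliases pay and p2 are all
hypothetical. The subquery refers to a column of the main query, which
is what makes it correlated:

```sql
-- Hypothetical table, columns and aliases
SELECT pay.Customer_number
      ,pay.Payment_date
FROM   Payment_table AS pay
WHERE  pay.Payment_date =
       ( SELECT MAX(p2.Payment_date)
         FROM   Payment_table AS p2
         WHERE  p2.Customer_number = pay.Customer_number );  -- the correlation
```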
The subquery does not have a semi-colon of its own. The SELECT in the
subquery is part of the same primary query and shares the one
semi-colon.
The aggregate value is normally obtained using MIN, MAX or AVG. This
aggregate value is in turn used to locate the row or rows within a
table that compare equal to, less than or greater than this value.
This table is used to demonstrate correlated subqueries:
Employee Table - contains 9 employees
Employee_No  Last_Name   First_name  Salary      Dept_No
PK                                               FK
UPI          NUSI                                NUSI
1232578      Chambers    Mandee      $48,850.00  100
1256349      Harrison    Herbert     $54,500.00  400
2341218      Reilly      William     $36,000.00  400
2312225      Larkins     Loraine     $40,200.00  300
2000000      Jones       Squiggy     $32,800.50
1000234      Smythe      Richard     $64,300.00  10
1121334      Strickling  Cletus      $54,500.00  400
1324657      Coffing     Billy       $41,888.88  200
1333454      Smith       John        $48,000.00  200
Figure 6-4
Using the above table, this correlated subquery finds the highest
paid employee in each department:
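A sketch of that correlated subquery. The table name Employee_table
and the ORDER BY are assumptions; emp and emt are the alias names
used in this section:

```sql
-- Assumed table name: Employee_table
SELECT Last_name
      ,First_name
      ,Dept_no
      ,Salary
FROM   Employee_table AS emp
WHERE  Salary =
       ( SELECT MAX(Salary)
         FROM   Employee_table AS emt
         WHERE  emp.Dept_no = emt.Dept_no )   -- same department as the outer row
ORDER BY Dept_no, Last_name;
```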
I !o)s !eturned
Last6name )irst6name #ept6no Salary
3m#the !ichard 80 YI:,300.00
Cham&ers Mandee 800 Y:G,G"0.00
3mith John 200 Y:G,000.00
Larkins Loraine 300 Y:0,200.00
Carrison Cer&ert :00 Y":,"00.00
3trickling Cletus :00 Y":,"00.00
Notice that both of the tables have been assigned alias names (emp
for the main query and emt for the correlated subquery). Because the
same Employee table is used in the main query and the subquery, one
of them must be assigned an alias. The aliases are used in the
subquery to qualify and match the common domain values between
the two tables. This coding technique "correlates" the main query table
to the one in the subquery.
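A query of the kind described above might look like the following sketch. The aliases emp and emt and the column names come from the text and the output heading; the table name Employee_table and the ORDER BY are assumptions made for illustration.

```sql
-- Highest paid employee(s) in each department.
-- Employee_table is an assumed table name.
SELECT emp.Last_name
      ,emp.First_name
      ,emp.Dept_no
      ,emp.Salary
FROM   Employee_table AS emp
WHERE  emp.Salary =
       ( SELECT MAX(emt.Salary)
         FROM   Employee_table AS emt
         WHERE  emt.Dept_no = emp.Dept_no )   -- correlates emt to emp
ORDER BY emp.Dept_no, emp.Last_name ;
```

An employee whose department number is NULL can never satisfy the equality in the subquery, so such a row is not returned.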
The following Correlated subquery uses the AVG function to find all
employees who earn the average pay in their department or less:
5 Rows Returned
Last_name  First_name  Dept_no  Salary
Smythe     Richard     10       $64,300.00
Chambers   Mandee      100      $48,850.00
Coffing    Billy       200      $41,888.88
Larkins    Loraine     300      $40,200.00
Reilly     William     400      $36,000.00
Earlier in this chapter, it was indicated that a column from the
subquery cannot be referenced in the main query. This is still true.
However, nothing is wrong with using one or more column references
from the main query within the subquery to create a Correlated
subquery.
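A sketch of the AVG version follows. The table name Employee_table is an assumption, and the comparison is written as <= because the output above includes employees who are the only member of their department and therefore earn exactly the department average.

```sql
-- Employees earning the average pay of their department or less.
-- Employee_table is an assumed table name.
SELECT emp.Last_name
      ,emp.First_name
      ,emp.Dept_no
      ,emp.Salary
FROM   Employee_table AS emp
WHERE  emp.Salary <=
       ( SELECT AVG(emt.Salary)
         FROM   Employee_table AS emt
         WHERE  emt.Dept_no = emp.Dept_no )   -- main query column used in the subquery
ORDER BY emp.Dept_no, emp.Last_name ;
```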

EXISTS
Another powerful resource that can be used with a correlated subquery
is the EXISTS. It provides a true-false test within the WHERE clause.
In the syntax that follows, it is used to test whether or not a single row
is returned from the subquery SELECT:
If a row is found, the EXISTS test is true, and conversely, if a row is
not found, the result is false. When a true condition is determined, the
value in the SELECT is returned from the main query. When the
condition is determined to be false, no rows are selected.
Since EXISTS returns one or no rows, it is a fast way to determine
whether or not a condition is present within one or more database
tables. The correlated subquery can also be part of a join to add
another level of test. It has potential to be very sophisticated.
As an example, to find all customers that have not placed an order the
NOT IN subquery can be used. Remember, when you use the NOT IN
clause the NULL needs to be considered and eliminated using the IS
NOT NULL check in the subquery. When using the NOT EXISTS with a
correlated subquery, the same answer is obtained, it is faster than a
normal subquery and there is no concern for getting a null into the
subquery. These next examples show the EXISTS and the NOT EXISTS
tests.
Notice that the next SELECT is the same correlated subquery as seen
earlier, except here it is utilizing the subquery to find all customers
with orders:
4 Rows Returned
Customer_name
Ace Consulting
Databases R Us
Billy's Best Choice
XYZ Plumbing
By changing the EXISTS to a NOT EXISTS, the next SELECT finds all
customers without orders:
1 Row Returned
Customer_name
Acme Products
Since the Customer and Order tables are used in the above Correlated
subquery, the table names did not require an alias. However, it was
done to shorten the names to ease the equality coding in the
subquery.
An added benefit of this technique (NOT EXISTS) is that the presence
of a NULL does not affect the performance. Notice that in both
subqueries, the asterisk (*) is used for the columns. Since it is a true
or false test, the columns are not used and it is the shortest way to
code the SELECT. If the column in the subquery table is a Primary
Index or a Unique Secondary Index, the correlated subquery can be
very fast.
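The two tests discussed above can be sketched as shown here. The table names Customer_table and Order_table and the aliases cust and ord are assumptions for illustration; the column names come from the figures in this book.

```sql
-- Customers with at least one order (EXISTS).
SELECT cust.Customer_name
FROM   Customer_table AS cust
WHERE  EXISTS
       ( SELECT *
         FROM   Order_table AS ord
         WHERE  ord.Customer_number = cust.Customer_number ) ;

-- Customers without any order (NOT EXISTS).
SELECT cust.Customer_name
FROM   Customer_table AS cust
WHERE  NOT EXISTS
       ( SELECT *
         FROM   Order_table AS ord
         WHERE  ord.Customer_number = cust.Customer_number ) ;
```

Only the keyword NOT differs between the two statements. Because the subquery is a true or false test, no IS NOT NULL check is needed in it.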
The examples in this chapter only use a single column for the
correlation. However, it is common to use more than one column from
the main query in the correlated subquery. Although the techniques
presented in this last chapter seem relatively simple, they can be very
powerful. Understanding subqueries and Correlated subqueries can
help you unleash the power.

Join Processing
A join is the combination of two or more tables in the same FROM of a
single SELECT statement. When writing a join, the key is to locate a
column in both tables that is from a common domain. Like the
correlated subquery, joins are normally based on an equal comparison
between the join columns.
An example of a common domain column might be a customer
number. Whether it represents a particular customer, as the primary
key, in the Customer table, or the customer that placed a specific
order, as a foreign key, in the Order table, it represents the same
entity in both tables. Without a common value, a match cannot be
made and therefore, no rows can be selected using a join. An equality
join returns matching rows.
Any answer set that a subquery can return, a join can also provide.
Unlike the subquery, a join lists all of its tables in the same FROM
clause of the SELECT. Therefore, columns from multiple tables are
available for return to the user. The desired columns are the main
factor in deciding whether to use a join or a subquery. If only
columns from a single table are desired, either a subquery or a join
works fine. However, if columns from more than one table are needed,
a join must be used. In Version 2 Release 3, the number of tables
allowed in a single join increased from sixteen (16) to sixty-four (64)
tables.

Original Join Syntax
The SQL join is a traditional and powerful tool in a relational database.
The first difference between a join and a single table SELECT is that
multiple tables are listed using the FROM clause. The first technique,
shown below, uses a comma between the table names. This is the
same technique used when listing multiple columns in the SELECT,
ORDER BY or most other areas that allow for the identification of more
than one object.
The following is the original join syntax for a two-table join:
The following tables will be used to demonstrate the join syntax:
Customer Table - contains 5 customers
Customer_number  Customer_name        Phone_number
PK
UPI NUSI NUSI
11111111         Billy's Best Choice  555-1234
31313131         Acme Products        555-1111
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012
Figure 7-1
Order Table - contains 5 orders
Order_number  Customer_number  Order_date  Order_total
PK FK
UPI NUSI NUSI
123456        11111111         980504      12347.53
123512        11111111         990101      08005.91
123552        31323134         991001      05111.47
123585        87323456         991010      05111.47
123777        57896883         990909      23454.84
Figure 7-2
The common domain between these two tables is the customer
number. It is used in the WHERE clause with the equal condition to find
all the rows from both tables with matching values. Since the column
has exactly the same name in both tables, it becomes mandatory to
qualify this column's name so that the PE knows which table to
reference for the data. Every appearance of the customer number in
the SELECT must be qualified.
The next SELECT finds all of the orders for each customer and shows
the Customer's name, Order number and Order total using a join:
5 Rows Returned
Customer_number  Customer_name        Order_number  Order_total
31323134         ACE Consulting       123552        $5,111.47
11111111         Billy's Best Choice  123456        $12,347.53
11111111         Billy's Best Choice  123512        $8,005.91
87323456         Databases N-U        123585        $15,231.62
57896883         XYZ Plumbing         123777        $23,454.84
In the above output, all of the customers, except one, have a single
order on file.
However, Billy's Best Choice has placed two orders and is displayed
twice, once for each order. Notice that the Customer number in the
SELECT list is qualified and returned from the Customer table. Does it
matter, in this join, which table is used to obtain the value for the
Customer number?
Your answer should be no. This is because the values in the two tables
are checked for equality in the WHERE clause of the join. Therefore, the
value is the same regardless of which table is used. However, as
mentioned earlier, you must use the table name to qualify any column
name that exists in more than one table with the same name. Teradata
will not assume which column to use.
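A two-table join in the original syntax, as discussed above, might be written like this sketch. The table names Customer_table and Order_table, the aliases and the ORDER BY are assumptions; the column names follow Figures 7-1 and 7-2.

```sql
-- Original (comma) join syntax for two tables.
SELECT cust.Customer_number      -- exists in both tables, so it must be qualified
      ,Customer_name
      ,Order_number
      ,Order_total
FROM   Customer_table AS cust
      ,Order_table    AS ord
WHERE  cust.Customer_number = ord.Customer_number   -- the join condition
ORDER BY Customer_name ;
```

Qualifying Customer_number with ord instead of cust would return the same values, because the WHERE clause only keeps rows where the two are equal.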
The following shows the syntax for a three-table join:
The next three tables are used to demonstrate a three-table join:
Course Table - contains 7 courses
Course_ID  Course_Name               Credits  Seats
PK FK
UPI NUSI
100        Teradata Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16
500        Logical Database Design   2        24
Figure 7-3
Student Table - contains 10 students
Student_ID  Last_Name  First_name  Class_code  Grade_Pt
PK FK
UPI NUSI NUSI
123250      Phillips   Martin      SR          3.00
125634      Hanson     Henry       FR          2.88
234121      Thomas     Wendy       FR          4.00
231222      Wilson     Susie       SO          3.80
260000      Johnson    Stanley     ?           ?
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
324652      Delaney    Danny       SR          3.35
333450      Smith      Andy        SO          2.00
423400      Larkins    Michael     FR          0.00
Figure 7-4
Student_Course Table (associative table)
Student_ID  Course_ID
PK
NUPI NUSI
123250      100
125634      100
125634      200
125634      220
234121      100
231222      210
231222      220
260000      400
280023      210
322133      220
322133      300
324652      200
333450      400
Figure 7-5
The first two tables represent the students and courses they can
attend. Since a student can take more than one class, the third table
Student_Course is used to associate the two main tables. It allows for
one student to take many classes and one class to be taken by many
students (a many-to-many relationship).
The following SELECT joins these three tables on the common domain
columns to find all courses being taken by the students:
13 Rows Returned
Last Name  First    Student_ID  Course
McRoberts  Richard  280023      Advanced SQL
Wilson     Susie    231222      Advanced SQL
Johnson    Stanley  260000      Database Administration
Smith      Andy     333450      Database Administration
Delaney    Danny    324652      Introduction to SQL
Hanson     Henry    125634      Introduction to SQL
Bond       Jimmy    322133      Physical Database Design
Hanson     Henry    125634      Teradata Concepts
Phillips   Martin   123250      Teradata Concepts
Thomas     Wendy    234121      Teradata Concepts
Bond       Jimmy    322133      V2R3 SQL Features
Hanson     Henry    125634      V2R3 SQL Features
Wilson     Susie    231222      V2R3 SQL Features
It is required to have one less equality test in the WHERE than the
number of tables being joined. Here there are three tables and two
equalities on common domain columns in the tables. If the maximum
of 64 tables is used, this means that there must be 63 comparisons
connected by 62 AND logical operators. If one comparison is forgotten,
the result is not a syntax error; it is a Cartesian product join.
Many times the request adds some residual conditions to further refine
the output. For instance, the need might be to see all the students that
have taken the V2R3 SQL class. This is very common since most tables
will have thousands or millions of rows. A way is needed to limit the
rows returned. The residual conditions also appear in the WHERE
clause.
In the next join, the WHERE of the previous SELECT has been modified
to add an additional comparison for the course:
3 Rows Returned
Last Name  FirstName  Student_ID  Course
Bond       Jimmy      322133      V2R3 SQL Features
Hanson     Henry      125634      V2R3 SQL Features
Wilson     Susie      231222      V2R3 SQL Features
The added residual condition does not replace the join conditions.
Instead it adds a third condition for the course. If one of the join
conditions is omitted, the result is a Cartesian product join (explained
next).
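The three-table join with a residual condition can be sketched as follows. The table names, the aliases S, C and SC, and the LIKE pattern used to pick the V2R3 course are assumptions for illustration; the column names follow Figures 7-3 to 7-5.

```sql
-- Three tables, two join conditions, one residual condition.
SELECT Last_name
      ,First_name
      ,S.Student_ID
      ,Course_name
FROM   Student_table        AS S
      ,Course_table         AS C
      ,Student_Course_table AS SC
WHERE  S.Student_ID = SC.Student_ID      -- join condition 1
  AND  C.Course_ID  = SC.Course_ID       -- join condition 2
  AND  Course_name LIKE '%V2R3%'         -- residual condition (assumed pattern)
ORDER BY Course_name, Last_name ;
```

Removing the last AND gives the unrestricted three-table join. Removing either of the first two comparisons turns the request into a product join.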

Product Join
It is very important to use an equal condition in the WHERE clause.
Otherwise you get a product join. This means that one row of a table is
joined to multiple rows of another table. A mathematic product means
that multiplication is used.
The next join example uses a WHERE clause, but it only limits which
rows participate in the join and does not provide a join condition:
5 Rows Returned
Customer_name        Order_number  Order_total
Billy's Best Choice  123456        12347.53
Billy's Best Choice  123512        8005.91
Billy's Best Choice  123552        5111.47
Billy's Best Choice  123585        5111.47
Billy's Best Choice  123777        23454.84
The above output resulted from 1 row in the customer table being
joined to all the rows of the order table. The WHERE limited the
customer rows that participated in the join, but did not specify an
equal comparison between the join columns. As a result, it looks like
Billy placed five orders, which is not correct. So, be careful when using
product joins because SQL answers the question as asked, not
necessarily as intended.
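A request that behaves this way might look like the sketch below. The table names and the LIKE pattern are assumptions; the point is only that the WHERE clause filters one table and never compares the two customer number columns.

```sql
-- Product join: the WHERE limits the Customer rows but is not a join condition.
SELECT Customer_name
      ,Order_number
      ,Order_total
FROM   Customer_table AS cust
      ,Order_table    AS ord
WHERE  Customer_name LIKE 'Billy%' ;   -- no cust.Customer_number = ord.Customer_number
```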
When all rows of one table are joined to all rows of another table, it is
called a Cartesian product join or an unconstrained product join. Think
about this: if one table has one million rows and the other table
contains one thousand rows, the output is one billion rows (1,000,000
rows * 1,000 rows = 1,000,000,000 rows).
As seen above, the vast majority of the time, a product join has no
meaningful output and is usually a mistake. The mistake is either that
the WHERE clause is omitted, a column comparison is omitted for one
of the tables using an AND, or the table is given an alias and the alias
is not used (the system treats the unaliased name as an additional
table without a comparison).
The next SELECT is the same as the one above, except this time the
entire WHERE clause has been commented out using /* and */:
Since the join condition is converted into a comment, the output from
the SELECT is a Cartesian product that will return 980 rows
(10*7*14=980) using these very small tables. The output is
completely meaningless and implies that every student is taking every
class. This output does not reflect the correct situation.
Forgetting to include the WHERE clause does not make the join syntax
incorrect. Instead, it results in a Cartesian product join. Always use the
EXPLAIN to verify that the result of the join is reasonable before
executing the actual join. The following shows the
output from an EXPLAIN of the above classic Cartesian product join.
Notice that steps 6 and 7 indicate a product join on the condition that
(1=1). Since 1 is always equal to 1 every time a row is read, all rows
are joined with all rows.
The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.56 seconds.
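A request of the kind being explained can be sketched like this. The table names, aliases and the commented-out conditions are assumptions made for illustration; EXPLAIN in front of the SELECT asks for the plan without running the join.

```sql
-- EXPLAIN a three-table request whose WHERE clause is commented out.
EXPLAIN
SELECT Last_name
      ,First_name
      ,S.Student_ID
      ,Course_name
FROM   Student_table        AS S
      ,Course_table         AS C
      ,Student_Course_table AS SC
/* WHERE  S.Student_ID = SC.Student_ID
     AND  C.Course_ID  = SC.Course_ID */
;
```

With no join condition left, the plan can be expected to report product join steps rather than joins on the student and course columns.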
If you remember from Chapter 3, the EXPLAIN shows immediately that
this situation will occur if the SELECT is executed. This is better than
waiting, potentially hours, to determine that the SELECT is running too
long, stealing valuable computer cycles, doing data transfer, and
interfering with valid SQL from other users. Be a good corporate citizen
and database user: EXPLAIN your join syntax before executing! Make
sure the estimates are reasonable for the size of the database tables
involved.

Newer ANSI Join Syntax
The ANSI committee has created a new form of the join syntax. Like
most ANSI compliant code, it is a bit longer to write. However, I
personally believe that it is worth the time and the effort due to better
functionality and safeguards. Plus, it is more difficult to get an
accidental product join using this form of syntax. This chapter
describes and demonstrates the use of the INNER JOIN, the OUTER
JOIN, the CROSS JOIN and the Self-join.
INNER JOIN
Although the original syntax still works, there is a newer version of the
join using the INNER JOIN syntax. It works exactly the same as the
original join, but is written slightly differently.
The following syntax is for a two-table INNER JOIN:
There are two primary differences between the new INNER JOIN and
the original join syntax. The first difference is that a comma (,) no
longer separates the table names. Instead of a comma, the words
INNER JOIN are used. As shown in the above syntax format, the word
INNER is optional. If only the JOIN appears, it defaults to an INNER
JOIN.
The other difference is that the WHERE clause for the join condition is
changed to an ON to declare an equal comparison on the common
domain columns. If the ON is omitted, a syntax error is reported and
the SELECT does not execute. So, the result is not a Cartesian product
join as seen in the original syntax. Therefore, it is safer to use.
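The earlier Customer and Order join, rewritten in this form, might look like the following sketch. The table names and aliases are assumptions for illustration.

```sql
-- Two-table INNER JOIN: the comma becomes INNER JOIN, the WHERE becomes ON.
SELECT cust.Customer_number
      ,Customer_name
      ,Order_number
      ,Order_total
FROM   Customer_table AS cust
       INNER JOIN Order_table AS ord          -- the word INNER is optional
       ON cust.Customer_number = ord.Customer_number
ORDER BY Customer_name ;
```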
Although the INNER JOIN is a slightly longer SQL statement to code, it
does have advantages. The first advantage, mentioned above, is fewer
accidental Cartesian product joins because the ON is required. In the
original syntax, when the WHERE is omitted the syntax is still correct.
However, without a comparison, all rows of both tables are joined with
each other, which results in a Cartesian product.
The last and most compelling advantage of the newer syntax is that
the INNER JOIN and OUTER JOIN statements can easily be combined
into a single SQL statement. The OUTER JOIN syntax, explanation and
significance are covered in this chapter.
The following is the same join that was performed earlier using the
original join syntax. Here, it has been converted to use an INNER
JOIN:
5 Rows Returned
Customer_number  Customer_name        Order_number  Order_total
31323134         ACE Consulting       123552        $5,111.47
11111111         Billy's Best Choice  123456        $12,347.53
11111111         Billy's Best Choice  123512        $8,005.91
87323456         Databases N-U        123585        $15,231.62
57896883         XYZ Plumbing         123777        $23,454.84
Like the original syntax, more than two tables can be joined in a single
INNER JOIN. Each consecutive table name follows an INNER JOIN and
associated ON clause to tell which columns to match. Therefore, a
ten-table join has nine JOIN and nine ON clauses to identify each table
and the columns being compared. There is always one less JOIN / ON
combination than the number of tables referenced in the FROM.
The following syntax is for an INNER JOIN with more than two tables:
The <table-nameN> reference above is intended to represent a
variable number of tables. It might be a 3-table, a 10-table or up to a
64-table join. The same approach is used regardless of the number of
tables being joined together in a single SELECT.
The other difference between these two join formats is that regardless
of the number of tables in the original syntax, there was only a single
WHERE clause. Here, each additional INNER JOIN has its own ON
condition. If one ON is omitted from the INNER JOIN, an error code of
3706 will be returned. This error keeps the join from executing, unlike
the original syntax, where a forgotten join condition in the WHERE is
allowed, but creates an accidental Cartesian product join.
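A three-table INNER JOIN with the course test written as a compound ON might look like this sketch. Table names, aliases and the LIKE pattern are assumptions for illustration.

```sql
-- Three tables: two JOIN / ON pairs, the second ON is compound (AND).
SELECT Last_name
      ,First_name
      ,S.Student_ID
      ,Course_name
FROM   Student_table AS S
       INNER JOIN Student_Course_table AS SC
       ON S.Student_ID = SC.Student_ID
       INNER JOIN Course_table AS C
       ON  C.Course_ID = SC.Course_ID
       AND Course_name LIKE '%V2R3%'        -- residual test placed in the ON
ORDER BY Course_name, Last_name ;
```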
The next INNER JOIN is converted from the 3-table join seen earlier:
3 Rows Returned
Last Name  FirstName  Student_ID  Course
Bond       Jimmy      322133      V2R3 SQL Features
Hanson     Henry      125634      V2R3 SQL Features
Wilson     Susie      231222      V2R3 SQL Features
The INNER JOIN syntax can use a WHERE clause instead of a
compound ON comparison. It can be used to add one or more residual
conditions. A residual condition is a comparison that is in addition to
the join condition. When it is used, the intent is to potentially eliminate
rows from one or more of the tables.
In other words, as rows are read the WHERE clause compares each
row with a condition to decide whether or not it should be included or
eliminated from the join processing. The WHERE clause is applied as
rows are read, before the ON clause. Eliminated rows do not
participate in the join against rows from another table. For more
details, read the section on WHERE clauses at the end of this chapter.
The following is the same SELECT using a WHERE to compare the
Course name as a residual condition instead of a compound (AND)
comparison in the ON:
As far as the INNER JOIN processing is concerned, the PE will normally
optimize both of these last two joins exactly the same. The EXPLAIN is
the best way to determine how the optimizer uses specific Teradata
tables in a join operation.
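The WHERE form of the same request can be sketched as follows, again with assumed table names, aliases and LIKE pattern. Each ON carries only its join condition and the course test moves to the WHERE.

```sql
-- Same three-table INNER JOIN, residual condition in a WHERE clause.
SELECT Last_name
      ,First_name
      ,S.Student_ID
      ,Course_name
FROM   Student_table AS S
       INNER JOIN Student_Course_table AS SC
       ON S.Student_ID = SC.Student_ID
       INNER JOIN Course_table AS C
       ON C.Course_ID = SC.Course_ID
WHERE  Course_name LIKE '%V2R3%'            -- residual condition
ORDER BY Course_name, Last_name ;
```

Putting EXPLAIN in front of either version is the way to confirm how the two forms are actually planned on a given system.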
OUTER JOIN
As seen previously, the join processing matches rows from multiple
tables on a column containing values from a common domain. Most of
the time, each row in a table has a matching row in the other table.
However, we do not live in a perfect world and sometimes our data is
not perfect. Imperfect data is never returned when a normal join is
used and the imperfection may go unnoticed.
The sole purpose of an OUTER JOIN is to find and return rows that do
not match at least one row from another table. It is for "exception"
reporting, but at the same time, it does the INNER JOIN processing
too. Therefore, the intersecting (matching) common domain rows are
returned along with all rows without a matching value from another
table. This non-matching condition might be due to the existence of a
NULL or invalid data value in the join column(s).
For instance, if the employee and department tables are joined using
an INNER JOIN, it displays all the employees who work in a valid
department. Mechanically, this means it returns all of the employee
rows that contain a value in the department number column, as a
foreign key, that matches a department number value in the
department table, as a primary key.
What it does not display are employees without a department number
(NULL) and employees with invalid department numbers (breaks
referential integrity rules). These additional rows can be returned with
the intersecting rows using one of the three formats for an OUTER
JOIN listed below.
The three formats of an OUTER JOIN are:
Left_table LEFT OUTER JOIN Right_table - left table is outer table
Left_table RIGHT OUTER JOIN Right_table - right table is outer table
Left_table FULL OUTER JOIN Right_table - both are outer tables
The OUTER JOIN has an outer table. The outer table is used to direct
which exception rows are output. Simply put, it is the controlling table
of the OUTER JOIN. As a result of this feature, all the rows from the
outer table will be returned, those containing matching domain values
and those with non-matching values. The INNER JOIN has only inner
tables. To code an OUTER JOIN it is wise to start with an INNER JOIN.
Once the join is working, the next step is to replace the word INNER
with LEFT OUTER, RIGHT OUTER or FULL OUTER.
The SELECT list for matching rows can display data from any of the
tables in the FROM. This is because a row with a matching row exists
in the tables. However, all non-matching rows with NULL or invalid
data in the outer table do not have a matching row in the inner table.
Therefore, the entire inner table row is missing and no column is
available for the SELECT list. This is the equivalent of a NULL. Since
the exception row is missing, there is no data available for display. All
referenced columns from the missing inner table rows will be
represented as a NULL in the display.
The basic syntax for a two-table OUTER JOIN follows:
Unlike the INNER JOIN, there is no original join syntax operation for an
OUTER JOIN. The OUTER JOIN is a unique answer set. The closest
functionality to an OUTER JOIN comes from the UNION set operator,
which is covered later in this book. The other fantastic quality of the
newer INNER and OUTER join syntax is that they both can be used in
the same SELECT with three or more tables.
The next several sections explain and demonstrate all three formats of
the OUTER JOIN. The primary issue when using an OUTER JOIN is that
only one format can be used in a SELECT between any two tables. The
FROM list determines the outer table for processing. It is important to
understand the functionality in order to choose the correct outer join.
LEFT OUTER JOIN
The outer table is determined by its location in the FROM clause of the
SELECT as shown here:
<Outer-table> LEFT OUTER JOIN <Inner-table>
Or
<Outer-table> LEFT JOIN <Inner-table>
In this format, the Customer table is the one on the left of the word
JOIN. Since this is a LEFT OUTER JOIN, the Customer is the outer
table. This syntax can return all customer rows that match a valid
order number (INNER JOIN) and Customers with NULL or invalid order
numbers (OUTER JOIN).
The next SELECT shows customers with matching orders and those
that need to be called because they have not placed an order:
6 Rows Returned
Customer_name        Order_number  Order_total
Ace Consulting       123552        $5,111.47
Acme Products        ?             ?
Billy's Best Choice  123456        $12,347.53
Billy's Best Choice  123512        $8,005.91
Databases N-U        123585        $15,231.62
XYZ Plumbing         123777        $23,454.84
The above output consists of all the rows from the Customer table
because it is the outer table and there are no residual conditions.
Unlike the earlier INNER JOIN, Acme Products is now easily seen as
the only customer without an order. Since Acme Products has no order
at this time, the order number and the order total are both extended
with the "?" to represent a NULL, or missing value from a non-
matching row of the inner table. This is a very important concept.
The result of the SELECT provides the matching rows like the INNER
JOIN and the non-matching rows, or exceptions that are missed by the
INNER JOIN. It is possible to add the order number to an ORDER BY to
put all exceptions either at the front (ASC) or back (DESC) of the
output report.
When using an OUTER JOIN, the results of this join are stored in the
spool area and contain all of the rows from the outer table. This
includes the rows that match and all the rows that do not match from
the join step. The only difference is that the non-matching rows are
carrying the NULL values for all columns for missing rows from the
inner table.
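The LEFT OUTER JOIN of customers to orders discussed above can be sketched as shown here. The table names, aliases and ORDER BY are assumptions for illustration.

```sql
-- Customer_table is written to the left of the JOIN, so it is the outer table.
SELECT Customer_name
      ,Order_number
      ,Order_total
FROM   Customer_table AS cust
       LEFT OUTER JOIN Order_table AS ord
       ON cust.Customer_number = ord.Customer_number
ORDER BY Customer_name ;
```

A customer with no matching order is still returned, with NULL in Order_number and Order_total.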
The concept of a LEFT OUTER JOIN is pretty straightforward with two
tables. However, additional thought is required when using more than
two tables to preserve rows from the first outer table.
Remember that the result of the first join is saved in spool. This same
spool is then used to perform all subsequent joins against any
additional tables, or other spool areas. So if you join 3 tables using an
outer join, the first two tables are joined together, with the spooled
results representing the new outer table, and then joined with the third
table, which becomes the RIGHT table.
Using the Student, Course and Student_Course tables, the following
SELECT preserves the exception rows from the Student table as the
outer table, throughout the entire join. Since both joins are written
using the LEFT OUTER JOIN and the Student table is the table name
that is the furthest to the left it remains as the outer table:
14 Rows Returned
Last Name  FirstName  Student_ID  Course
Larkins    Michael    423400      ?
McRoberts  Richard    280023      Advanced SQL
Wilson     Susie      231222      Advanced SQL
Johnson    Stanley    260000      Database Administration
Smith      Andy       333450      Database Administration
Delaney    Danny      324652      Introduction to SQL
Hanson     Henry      125634      Introduction to SQL
Bond       Jimmy      322133      Physical Database Design
Hanson     Henry      125634      Teradata Concepts
Phillips   Martin     123250      Teradata Concepts
Thomas     Wendy      234121      Teradata Concepts
Bond       Jimmy      322133      V2R3 SQL Features
Hanson     Henry      125634      V2R3 SQL Features
Wilson     Susie      231222      V2R3 SQL Features
The above output contains all the rows from the Student table as the
outer table in the three-table LEFT OUTER JOIN. The OUTER JOIN
returns a row for a student named Michael Larkins even though he is
not taking a course. Since his course row is missing, no course name
is available for display. As a result, the row is extended with a
NULL in course name, but still becomes part of the answer set.
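A three-table LEFT OUTER JOIN that keeps the Student table as the outer table throughout might be written like this sketch. Table names, aliases and the ORDER BY are assumptions for illustration.

```sql
-- Student_table is furthest to the left and both joins are LEFT OUTER,
-- so every student row is preserved through both join steps.
SELECT Last_name
      ,First_name
      ,S.Student_ID
      ,Course_name
FROM   Student_table AS S
       LEFT OUTER JOIN Student_Course_table AS SC
       ON S.Student_ID = SC.Student_ID
       LEFT OUTER JOIN Course_table AS C
       ON C.Course_ID = SC.Course_ID
ORDER BY Course_name, Last_name ;
```

If the second join were an INNER JOIN instead, the student without a course would be dropped again, because the spooled row has a NULL Course_ID that matches nothing.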
Now, it is known that a student isn't taking a course. It might be
important to know if there are any courses without students. The
previous join can be converted to determine this fact by rearranging
the table names in the FROM to make the Course table the outer table,
or by using the RIGHT OUTER JOIN.
RIGHT OUTER JOIN
As indicated earlier, the outer table is determined by its position in the
FROM clause of the SELECT. Consider the following:
<Inner-table> RIGHT OUTER JOIN <Outer-table>
Or
<Inner-table> RIGHT JOIN <Outer-table>
In the next example, the Customer table is still written before the
Order table. Since it is now a RIGHT OUTER JOIN and the Order table
is on the right of the word JOIN, it is now the outer table. Remember,
all rows can be returned from the outer table!
To include the orders without customers, the previously seen LEFT
OUTER JOIN has been converted to a RIGHT OUTER JOIN. It can be
used to return all of the rows in the Order table, those that match
customer rows and those that do not match customers.
The following is converted to a RIGHT OUTER JOIN to find all orders:
6 Rows Returned
Customer_name        Order_number  Order_total
?                    999999        $1.00-
Ace Consulting       123552        $5,111.47
Billy's Best Choice  123456        $12,347.53
Billy's Best Choice  123512        $8,005.91
Databases N-U        123585        $15,231.62
XYZ Plumbing         123777        $23,454.84
The above output from the SELECT consists of all the rows from the
Order table, which is the outer table. In a 2-table OUTER JOIN without
a WHERE clause, the number of rows returned is at least the number
of rows in the outer table, and equal to it when no outer row matches
more than one inner row. In this case, the outer table is the
Order table. It contains 6 rows and all 6 rows are returned.
This join returns all orders with a valid customer ID (like the INNER
JOIN) and orders with a missing or an invalid customer ID (OUTER
JOIN). Either of these last two conditions constitutes a critical business
problem that needs immediate attention. It is important to determine
that orders were placed, but that the buyer of them is not known.
Since the output was sorted by the customer name, the exception row
is returned first. This technique makes the exception easy to find,
especially in a large report. Not only is the customer missing for this
order, it obviously has additional problems. The total is negative and
the order number is all nines. We can now correct a situation we knew
nothing about or correct the procedure or policy that allowed for the
error to occur.
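The RIGHT OUTER JOIN version of the Customer and Order request can be sketched as follows. The table names and aliases are assumptions; the ORDER BY on the customer name reflects the sorting mentioned above.

```sql
-- Order_table is written to the right of the JOIN, so it is the outer table.
SELECT Customer_name
      ,Order_number
      ,Order_total
FROM   Customer_table AS cust
       RIGHT OUTER JOIN Order_table AS ord
       ON cust.Customer_number = ord.Customer_number
ORDER BY Customer_name ;   -- the row with a NULL customer name sorts first
```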
Using the same Student and Course tables from the previous 3-table
join, it can be converted from the two LEFT OUTER JOIN operations to
two RIGHT OUTER JOIN operations in order to find the students taking
courses and also find any courses without students enrolled:
8 Rows Returned
Last Name  FirstName  Student_ID  Course
McRoberts  Richard    280023      Advanced SQL
Wilson     Susie      231222      Advanced SQL
Delaney    Danny      324652      Introduction to SQL
Hanson     Henry      125634      Introduction to SQL
?          ?          ?           Logical Database Design
Bond       Jimmy      322133      V2R3 SQL Features
Hanson     Henry      125634      V2R3 SQL Features
Wilson     Susie      231222      V2R3 SQL Features
Now, using the output from the OUTER JOIN on the Course table, it is
apparent that no one is enrolled in the Logical Database Design
course. The enrollment needs to be increased or the room needs to be
freed up for another course. Where inner joins are great at finding
matches, outer joins are great at finding both matches and problems.
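The conversion to two RIGHT OUTER JOIN operations might look like the sketch below. Table names, aliases and the ORDER BY are assumptions, and the sketch contains no residual condition, so it is not claimed to reproduce the exact row count shown above.

```sql
-- Course_table is furthest to the right and both joins are RIGHT OUTER,
-- so every course row is preserved, including courses with no students.
SELECT Last_name
      ,First_name
      ,S.Student_ID
      ,Course_name
FROM   Student_table AS S
       RIGHT OUTER JOIN Student_Course_table AS SC
       ON S.Student_ID = SC.Student_ID
       RIGHT OUTER JOIN Course_table AS C
       ON C.Course_ID = SC.Course_ID
ORDER BY Course_name, Last_name ;
```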
FULL OUTER JOIN
The last form of the OUTER JOIN is a FULL OUTER JOIN. If both
Customer and Order exceptions are to be included in the output
report, then the syntax should appear as:
<Outer-table> FULL OUTER JOIN <Outer-table>
Or
<Outer-table> FULL JOIN <Outer-table>
A FULL OUTER JOIN uses both of the tables as outer tables. The
exceptions are returned from both tables and the missing column
values from either table are extended with NULL. This puts the LEFT
and RIGHT OUTER JOIN output into a single report.
To return the customers with orders, and include the orders without
customers and customers without orders, the following FULL OUTER
JOIN can be used:
7 Rows Returned
Customer_name        Order_number  Order_total
?                    999999        $1.00-
Ace Consulting       123552        $5,111.47
Acme Products        ?             ?
Billy's Best Choice  123512        $8,005.91
Billy's Best Choice  123456        $12,347.53
Databases N-U        123585        $15,231.62
XYZ Plumbing         123777        $23,454.84
The output from the SELECT consists of all the rows from the Order
and Customer tables because they are now both outer tables in a FULL
OUTER JOIN.
The total number of rows returned is more difficult to predict with a
FULL OUTER JOIN. The answer set contains: one row for each pair of
matching rows from the tables, plus one row for each row in the left
table that has no match in the right table, plus one row for each row
in the right table that has no match in the left table.
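The FULL OUTER JOIN of customers and orders can be sketched like this, with assumed table names and aliases:

```sql
-- Both tables are outer tables: unmatched customers and unmatched orders
-- are returned along with the matching rows.
SELECT Customer_name
      ,Order_number
      ,Order_total
FROM   Customer_table AS cust
       FULL OUTER JOIN Order_table AS ord
       ON cust.Customer_number = ord.Customer_number
ORDER BY Customer_name ;
```

With the output shown above, the count works out as 5 matching rows, plus 1 customer without an order, plus 1 order without a customer, for 7 rows.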
3ince &oth ta&les are outer ta&les, not as much thought is reAuired for
choosing the outer ta&le. Co)e5er, as mentioned earlier the I--E!
and 12TE! 7oin 0rocessing can &e com&ined in a single 3ELECT. The
I--E! J1I- still eliminates all non+matching ro)s. This is )hen the
most consideration needs to &e gi5en to the a00ro0riate outer ta&les.
Like all joins, more than two tables can be joined using a FULL OUTER
JOIN, up to 64 tables. The next FULL OUTER JOIN syntax uses Student
and Course tables for the outer tables through the entire join process:
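A three-table request of this kind might be sketched as follows. The
table names and the Student_ID / Course_ID join columns are
assumptions made for illustration:

```sql
-- Sketch only: table names and join columns are assumed.
SELECT   S.Last_name
        ,S.First_name
        ,S.Student_ID
        ,C.Course_name
FROM     Student_table AS S
         FULL OUTER JOIN Student_Course_table AS SC
            ON S.Student_ID = SC.Student_ID
         FULL OUTER JOIN Course_table AS C
            ON C.Course_ID = SC.Course_ID
ORDER BY C.Course_name, S.Last_name ;
```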
15 Rows Returned
Last Name    FirstName   Student_ID   Course
Larkins      Michael     423400       ?
McRoberts    Richard     280023       Advanced SQL
Wilson       Susie       231222       Advanced SQL
Johnson      Stanley     260000       Database Administration
Smith        Andy        333450       Database Administration
Delaney      Danny       324652       Introduction to SQL
Hanson       Henry       125634       Introduction to SQL
?            ?           ?            Logical Database Design
Bond         Jimmy       322133       Physical Database Design
Hanson       Henry       125634       Teradata Concepts
Phillips     Martin      123250       Teradata Concepts
Thomas       Wendy       234121       Teradata Concepts
Bond         Jimmy       322133       V2R3 SQL Features
Hanson       Henry       125634       V2R3 SQL Features
Wilson       Susie       231222       V2R3 SQL Features
The above SELECT uses the Student, Course and "Student Course"
(associative) tables in a FULL OUTER JOIN. All three tables are outer
tables. The above includes one non-matching row from the Student
table with a null in the course name and one non-matching row from
the Course table with nulls in all three columns from the Student table.
Since the Student Course table is also an outer table, any
non-matching rows in it would also be returned, with nulls in the
columns of the tables they fail to match. However, since it is an
associative table used only for a many-to-many relationship between
the Student and Course tables, missing rows in it would indicate a
serious business problem.
As a reminder, the result of the first join step is stored in spool, which
is temporary work space that the system uses to complete each step
of the SELECT. Then, the spool area is used for each consecutive JOIN
step. This continues until all of the tables have been joined together,
two at a time. However, the spool areas are not held until the end of
the SELECT. Instead, when the spool is no longer needed, it is released
immediately. This makes more spool available for another step, or for
another user. The release can be seen in the EXPLAIN output as (Last
Use) for a spool area.
Also, when using Teradata, do not spend a lot of time worrying about
which tables to join first. The optimizer makes this choice at execution
time. The optimizer always looks for the fastest method to obtain the
requested rows. It uses data distribution and index demographics to
make its final decision on a methodology. So, the tables joined first in
the syntax might be the last tables joined in the execution plan.
All databases join tables two at a time, but most databases just pick
which tables to join based on their position in the FROM. Sometimes
when the SQL runs slow, the user just changes the order of the tables
in the join. Otherwise, join schemas must be built to tell the
RDBMS how to join specific tables.
Teradata is smart enough, using explicit or implicit STATISTICS, to
evaluate which tables to join together first. Whenever possible, four
tables might be joined at the same time, but it is still done as two,
two-table joins in parallel. Joins involving millions of rows are
considered difficult for most databases, but Teradata joins them with
ease.
It is a good idea to use the Teradata EXPLAIN to see what steps the
optimizer plans to use to accomplish the request. Primarily, in the
beginning you are looking for an estimate of the number of rows that
will be returned and the time cost to accomplish it. I recommend using
the EXPLAIN before each join as you are learning, to make sure that
the result is reasonable.
If these numbers appear to be too high for the tables involved, it is
probably a Cartesian product, which is not good. The EXPLAIN
discovers the product join within seconds instead of hours. If it were
actually running, it would be wasting resources by doing all the extra
work to accomplish nothing. Use the EXPLAIN to learn this fact the
easy way and fix it.
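As a brief illustration, EXPLAIN is simply placed in front of the
request; the plan text is returned and the join itself is not run. The
table and column names below are assumptions:

```sql
-- Sketch only: table and column names are assumed.
-- Returns the optimizer's plan, with row and time estimates per step.
EXPLAIN
SELECT   cust.Customer_name
        ,ord.Order_number
FROM     Customer_table AS cust
         INNER JOIN Order_table AS ord
            ON cust.Customer_number = ord.Customer_number ;
```

If the estimates in the plan are far larger than the tables involved,
a missing or incorrect join condition is the first thing to check.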
CROSS JOIN
A CROSS JOIN is the ANSI way to write a product join. This means
that it joins one or more rows participating from one table with all the
participating rows from the other table. As mentioned earlier in this
chapter, there is not a large application for a product join and even
fewer for a Cartesian join.
Although there are not many applications for a CROSS JOIN, consider
this: an airline might use one to determine the location and number of
routes needed to fly from one hub to all of the other cities they serve.
A potential route "joins" every city to the hub. Therefore, the result
needs a product join. Probably what should still be avoided is to fly
from every city to every other city (Cartesian join). A CROSS JOIN is
controlled using a WHERE clause. Unlike the other join syntax, a
CROSS JOIN results in a syntax error if an ON clause is used.
The following is the syntax for the CROSS JOIN:
The next SELECT performs a CROSS JOIN (product join) using the
Student and Course tables:
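A sketch of such a request, assuming tables named Student_table and
Course_table and a WHERE clause that restricts the Course side to a
single course (the restriction on Course_name is an assumption chosen
to be consistent with the report that follows):

```sql
-- Sketch only: table names and the WHERE restriction are assumed.
-- General shape: FROM <table> CROSS JOIN <table> [WHERE ...], with no ON clause.
SELECT   S.Last_name
        ,C.Course_name
FROM     Student_table AS S
         CROSS JOIN Course_table AS C
WHERE    C.Course_name = 'Teradata Concepts' ;
```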
10 Rows Returned
Last_name    Course_name
Phillips     Teradata Concepts
Hanson       Teradata Concepts
Thomas       Teradata Concepts
Wilson       Teradata Concepts
Johnson      Teradata Concepts
McRoberts    Teradata Concepts
Bond         Teradata Concepts
Delaney      Teradata Concepts
Smith        Teradata Concepts
Larkins      Teradata Concepts
Since not every student is taking every course, this output has very
little meaning from a student and course perspective. However, this
same data can be valuable in sizing a worst-case situation, such as
the resources needed to handle maximum room capacities.
For example, it helps if the Dean wants to know the maximum number
of seats needed in a classroom if every student were to enroll for
every SQL class. However, the rows are probably counted (COUNT(*))
and not displayed.
This SELECT uses a CROSS JOIN to populate a derived table (discussed
later), which is then used to obtain the final count:
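One way such a request could be written, assuming the same
hypothetical Student_table and Course_table names and a LIKE test to
pick out the SQL courses:

```sql
-- Sketch only: table names and the LIKE test are assumed.
-- The WHERE inside the derived table limits the rows that are built.
SELECT   COUNT(*)  (TITLE 'Total SQL Seats Needed')
FROM     (SELECT  S.Student_ID
                 ,C.Course_name
          FROM    Student_table AS S
                  CROSS JOIN Course_table AS C
          WHERE   C.Course_name LIKE '%SQL%')
         AS DT (Student_ID, Course_name) ;
```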
1 Row Returned
Total SQL Seats Needed
                    30
The previous SELECT can also be written with the WHERE clause on
the main SELECT, so that it compares the rows of the derived table
called DT after they are built instead of restricting which rows are
built. Compare the previous SELECT with the next one and determine
which is more efficient.
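Under the assumption of hypothetical Student_table and Course_table
names, the variation with the WHERE moved to the main SELECT might
look like this:

```sql
-- Sketch only: table names and the LIKE test are assumed.
-- The derived table DT is an unconstrained CROSS JOIN;
-- the outer WHERE filters its rows afterwards.
SELECT   COUNT(*)  (TITLE 'Total SQL Seats Needed')
FROM     (SELECT  S.Student_ID
                 ,C.Course_name
          FROM    Student_table AS S
                  CROSS JOIN Course_table AS C)
         AS DT (Student_ID, Course_name)
WHERE    DT.Course_name LIKE '%SQL%' ;
```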
Which do you find to be more efficient?
At first glance, it would appear that the first is more efficient because
the CROSS JOIN inside the parentheses for a derived table is not a
Cartesian product. Instead, the CROSS JOIN that populates the
derived table is constrained in the WHERE to only SQL courses rather
than all courses. However, the PE optimizes them the same. I told you
that Teradata was smart!
Self Join
A Self Join is simply a join that uses the same table more than once in
a single join operation. The first requirement for this type of join is
that the table must contain two different columns of the same domain.
This may involve de-normalized tables.
For instance, if the Employee table contained a column for the
manager's employee number and the manager is an employee, these
two columns have the same domain. By joining on these two columns
in the Employee table, the managers can be joined to the employees.
The next SELECT joins the Employee table to itself as an employee
table and also as a manager table to find managers. Then, the
managers are joined to the Department table to return the first ten
characters of the manager's name and their entire department name:
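A sketch of the idea follows. Every table and column name in it is
hypothetical, in particular Mgr_employee_no, the column assumed to
hold the manager's employee number:

```sql
-- Sketch only: all names are hypothetical; Mgr_employee_no is an
-- assumed column holding the employee number of each employee's manager.
SELECT DISTINCT
         SUBSTRING(Mgr.Last_name FROM 1 FOR 10)  AS Manager_name
        ,Dept.Department_name
FROM     Employee_table AS Emp
         INNER JOIN Employee_table AS Mgr
            ON Emp.Mgr_employee_no = Mgr.Employee_no
         INNER JOIN Department_table AS Dept
            ON Mgr.Dept_no = Dept.Dept_no ;
```

The two aliases, Emp and Mgr, are what let the same table play two
roles in one request.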
The self join can be the original syntax (table , table), an INNER,
OUTER, or CROSS join. Another requirement is that at least one of the
table references must be assigned an alias. Since the alias name
becomes the table name, the table is now treated as two completely
different tables.
Normally, a self join requires some degree of de-normalization to allow
for two columns in the same table to be part of the same domain.
Since our Employee table does not contain the manager's employee
number, the output cannot be shown. However, the concept is shown
here.
Alternative JOIN / ON Coding
There is another format that may be used for coding both the INNER
and OUTER JOIN processing. Previously, all of the examples and
syntax for joins of more than two tables used an ON immediately
following the JOIN table list.
The following demonstrates the other coding syntax technique:
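The general shape of this style can be sketched as below, using the
placeholder names tablename1, tablename2 and tablenameN; the column
names are hypothetical:

```sql
-- Sketch only: table and column names are placeholders.
SELECT   t1.column_a
        ,t2.column_b
        ,tN.column_c
FROM     tablename1 AS t1
         INNER JOIN tablename2 AS t2
         INNER JOIN tablenameN AS tN
            ON t2.join_col2 = tN.join_col2    -- belongs to the last JOIN
            ON t1.join_col1 = t2.join_col1 ;  -- belongs to the first JOIN
```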
When using this technique, care should be taken to sequence the JOIN
and ON portions correctly. There are two primary differences with this
style compared to the earlier syntax. First, the JOIN statements and
table names are all together. In one sense, this is more like the syntax
of: tablename1, tablename2 as seen in the original join.
Second, the ON statement sequence is reversed. In the above syntax
diagram, the ON reference for tablename2 and tablenameN is before
the ON reference for tablename1 and tablename2. However, the JOIN
for <table-name1> and <table-name2> is still before the JOIN of
<table-name2> and <table-nameN>. In other words, the first ON
goes with the last JOIN when they are nested using this technique.
The following three-table INNER JOIN seen earlier is converted here to
use this reversed form of the ON comparisons:
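A sketch of the converted join, assuming hypothetical table names
Student_table, Student_Course_table and Course_table and join columns
Student_ID and Course_ID:

```sql
-- Sketch only: table names and join columns are assumed.
SELECT   S.Last_name
        ,S.First_name
        ,S.Student_ID
        ,C.Course_name
FROM     Student_table AS S
         INNER JOIN Student_Course_table AS SC
         INNER JOIN Course_table AS C
            ON C.Course_ID = SC.Course_ID       -- pairs with the last JOIN
            ON S.Student_ID = SC.Student_ID ;   -- pairs with the first JOIN
```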
Personally, we prefer the first technique in which every JOIN is
followed immediately by its ON condition. Here are our reasons:
• It is harder to accidentally forget to code an ON for a JOIN,
  because they are together.
• Less debugging time is needed, and when it is needed, it is
  easier.
• Because the join allows 64 tables in a single SELECT, the SQL
  involving several tables may be longer than a single page can
  display. Therefore, many of the JOIN clauses will be on a
  different page than their corresponding ON conditions. It might
  require paging back and forth multiple times to locate all of
  the ON conditions for every JOIN clause. This involves too
  much effort. Using the JOIN / ON, they are physically next to
  each other.
• With the alternative style, adding another table into the join
  requires careful thought and placement for both the JOIN and
  the ON. When using the JOIN / ON, they can be placed almost
  anywhere in the FROM clause.
Adding Residual Conditions to a Join
Most of the examples in this book have included all rows from the
tables being joined. However, in the world of Teradata with millions of
rows being stored in a single table, additional comparisons are
probably needed to reduce the number of rows returned. There are
two ways to code residual conditions with the new JOIN syntax: a
compound condition in the ON, or a WHERE clause. These residual
conditions are in addition to the join equality in the ON clause.
Consideration should be given to the type of join when including the
WHERE clause. The following paragraphs discuss the operational
aspects of mixing an ON with a WHERE for INNER and OUTER
JOIN operations.
INNER JOIN
The WHERE clause works exactly the same when used with the INNER
JOIN as it does on all other forms of the SELECT. It eliminates rows at
read time based on the condition being checked and any index
columns involved in the comparison.
Normally, the fewer rows that are read, the faster the SQL will run. It
is more efficient because fewer resources such as disk, I/O, cache
space, spool space, and CPU are needed. Therefore, whenever
possible, it is best to eliminate unneeded rows using a WHERE
condition with an INNER JOIN. I like the use of WHERE because all
residual conditions are located in one place.
The following samples are the same join that was performed earlier in
this chapter. Here, one uses a WHERE clause and the other a
compound comparison via the ON:
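The WHERE form might be sketched like this, assuming tables named
Customer_table and Order_table joined on Customer_number and a
residual condition on the customer name (all assumptions made for
illustration):

```sql
-- Sketch only: table names, join column and residual condition are assumed.
SELECT   cust.Customer_name
        ,ord.Order_number
        ,ord.Order_total
FROM     Customer_table AS cust
         INNER JOIN Order_table AS ord
            ON cust.Customer_number = ord.Customer_number
WHERE    cust.Customer_name LIKE 'Billy%'
ORDER BY 3 DESC ;
```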
Or
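The compound ON form might be sketched like this, assuming tables
named Customer_table and Order_table joined on Customer_number, with
the residual condition on the customer name folded into the ON:

```sql
-- Sketch only: table names, join column and residual condition are assumed.
SELECT   cust.Customer_name
        ,ord.Order_number
        ,ord.Order_total
FROM     Customer_table AS cust
         INNER JOIN Order_table AS ord
            ON  cust.Customer_number = ord.Customer_number
            AND cust.Customer_name LIKE 'Billy%'
ORDER BY 3 DESC ;
```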
2 Rows Returned
Customer_name          Order_number   Order_total
Billy's Best Choice    123456          $12,347.53
Billy's Best Choice    123512           $8,005.91
The output is exactly the same with both coding methods. This can be
verified using the EXPLAIN. We recommend using the WHERE clause
with an inner join because it consolidates all residual conditions in a
single location that is easy to find when changes are needed. Although
there are multiple ON comparisons, there is only one WHERE clause.
OUTER JOIN
Like the INNER JOIN, the WHERE clause can also be used with the
OUTER JOIN. However, its processing is the opposite of the technique
used with an INNER JOIN and other SQL constructs. If you remember,
with the INNER JOIN the intent of the WHERE clause was to eliminate
rows from one or all tables referenced by the SELECT.
When the WHERE clause is coded with an OUTER JOIN, it is executed
last, instead of first. Remember, the OUTER JOIN returns exceptions.
The exceptions must be determined using the join (matching and
non-matching rows) and therefore rows cannot be eliminated at read
time. Instead, they go into the join and into spool. Then, just before
the rows are returned to the client, the WHERE checks to see if rows
can be eliminated from the spooled join rows.
The following demonstrates the difference when using the same two
techniques in the OUTER JOIN. Notice that the results are different:
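The WHERE version might be sketched as follows, assuming the Course
table is the outer table and using hypothetical table names and join
columns:

```sql
-- Sketch only: table names and join columns are assumed.
-- The WHERE is applied after the outer join has been built.
SELECT   S.Last_name
        ,S.First_name
        ,S.Student_ID
        ,C.Course_name
FROM     Course_table AS C
         LEFT OUTER JOIN Student_Course_table AS SC
            ON C.Course_ID = SC.Course_ID
         LEFT OUTER JOIN Student_table AS S
            ON SC.Student_ID = S.Student_ID
WHERE    C.Course_name LIKE '%SQL%'
ORDER BY C.Course_name, S.Last_name ;
```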
7 Rows Returned
Last Name    FirstName   Student_ID   Course
McRoberts    Richard     280023       Advanced SQL
Wilson       Susie       231222       Advanced SQL
Delaney      Danny       324652       Introduction to SQL
Hanson       Henry       125634       Introduction to SQL
Bond         Jimmy       322133       V2R3 SQL Features
Hanson       Henry       125634       V2R3 SQL Features
Wilson       Susie       231222       V2R3 SQL Features
Notice that only courses with SQL as part of the name are returned.
The next SELECT, which uses the same condition as a compound
comparison in the ON, has a different result:
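The compound ON version might be sketched as follows, assuming the
Course table is the outer table and using hypothetical table names
and join columns:

```sql
-- Sketch only: table names and join columns are assumed.
-- The LIKE test is part of the ON, so it only decides which rows match;
-- every Course row is still kept by the outer join.
SELECT   S.Last_name
        ,S.First_name
        ,S.Student_ID
        ,C.Course_name
FROM     Course_table AS C
         LEFT OUTER JOIN Student_Course_table AS SC
            ON  C.Course_ID = SC.Course_ID
            AND C.Course_name LIKE '%SQL%'
         LEFT OUTER JOIN Student_table AS S
            ON SC.Student_ID = S.Student_ID
ORDER BY C.Course_name, S.Last_name ;
```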
11 Rows Returned
Last Name    FirstName   Student_ID   Course
McRoberts    Richard     280023       Advanced SQL
Wilson       Susie       231222       Advanced SQL
?            ?           ?            Database Administration
Delaney      Danny       324652       Introduction to SQL
Hanson       Henry       125634       Introduction to SQL
?            ?           ?            Logical Database Design
?            ?           ?            Physical Database Design
?            ?           ?            Teradata Concepts
Bond         Jimmy       322133       V2R3 SQL Features
Hanson       Henry       125634       V2R3 SQL Features
Wilson       Susie       231222       V2R3 SQL Features
The reason for the difference makes sense after you think about the
functionality of the OUTER JOIN. Remember that an OUTER JOIN
retains all rows from the outer table, those that match and those that
do not match the ON comparison. Therefore, a course without SQL in
its name still shows up, but as a non-matching row instead of as a
matching row.
There is one last consideration when using a WHERE clause with an
OUTER JOIN. Always use columns from the outer table in the WHERE.
The reason: if columns of the inner table are referenced in a WHERE,
the optimizer will perform an INNER JOIN and not an OUTER JOIN, as
coded. It does this because a comparison against an inner table
column cannot be true for the rows that were extended with NULL, so
only matching rows could be returned. Therefore, an INNER JOIN is
more efficient. The phrase "merge join" can be found in the EXPLAIN
output instead of "outer join" to verify this event.
The next SELECT was executed earlier as an inner join and returned 2
rows. Here it has been converted to an outer join. However, the output
from the EXPLAIN shows in step 5 that an inner (merge) join will be
used because customer name is a column from the inner table
(Customer table):
EXPLAIN
Explanation
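A request of the kind described above could be sketched as follows,
assuming tables named Order_table (outer) and Customer_table (inner)
joined on Customer_number. The exact wording of the plan depends on
the system, so none is reproduced here:

```sql
-- Sketch only: table names, join column and WHERE condition are assumed.
-- The WHERE references Customer_name, a column of the inner table.
EXPLAIN
SELECT   cust.Customer_name
        ,ord.Order_number
        ,ord.Order_total
FROM     Order_table AS ord
         LEFT OUTER JOIN Customer_table AS cust
            ON cust.Customer_number = ord.Customer_number
WHERE    cust.Customer_name LIKE 'Billy%'
ORDER BY 3 DESC ;
```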

OUTER JOIN Hints
The easiest way to begin writing an OUTER JOIN is to:
1. Start with an INNER JOIN and convert to an OUTER JOIN.
   Once the INNER JOIN is working, change the appropriate INNER
   descriptors to LEFT OUTER, RIGHT OUTER or FULL OUTER join
   based on the desire to include the exception rows. Since INNER
   and OUTER joins can be used together, one join at a time can be
   changed to validate the output. Use the join diagram below to
   convert the INNER JOIN to an OUTER JOIN.
2. For joins with greater than two tables, think of it as: JOIN two
   tables at a time.
   It makes the entire process easier by concentrating on only two
   tables instead of all tables. The optimizer will always join two
   tables, whether serially or in parallel, and it is smart enough to
   do it in the most efficient manner possible.
3. Don't worry about which tables you join first.
   The optimizer will determine which tables should be joined first
   for the optimal plan.
4. The WHERE clause, if used in an OUTER JOIN to eliminate rows:
   A. It is applied after the join is complete, not when rows are
      read like the Inner Join.
   B. It should reference columns from the outer table. If
      columns from the Inner table are referenced in a WHERE
      clause, the optimizer will most likely perform a merge join
      (INNER) for efficiency. This is actually an INNER
      JOIN operation and can be seen in the EXPLAIN output.
Join Diagram:
Where:
Table I rows
   A = that match Table II rows and match Table III rows
       (INNER join perfect data)
   B = that match Table II rows, but not Table III rows
   C = that do not match Table II rows or Table III rows
   D = that do not match Table II rows, but do match Table III
       rows
Table II rows
   E = that do not match Table I, nor Table III rows
   F = that do not match Table I, but do match Table III
Table III rows
   G = that do not match Table I or Table II
Parallel Join Processing
There are four basic types of joins that Teradata can perform
depending on the characteristics of the table definition. When the join
domain is the primary index (PI) column, with a unique secondary
index (USI), the join is referred to as a nested join and involves, at
most, three AMPs. The second type of join is a merge join, with three
different forms of a merge join, based on the request. The newest type
of join in Teradata is the Row Hash join using the pre-sorted Row Hash
value instead of a sorted data value match. This is beneficial since the
data row is stored based on the row hash value and not the data
value. The last type is the product join.
In Teradata, each AMP performs all join processing in parallel locally.
This means that matching values in the join columns must be on the
same AMP to be matched. When the rows are not distributed and
stored on the same AMP, they must be temporarily moved to the same
AMP, in spool. Remember, rows are distributed on the value in the PI
column(s). If joins are performed on the PI of both tables, no row
movement is necessary. This is because the rows with the same PI
value are on the same AMP: easy, but not always practical. Most joins
use a primary key, which might be the UPI, and a foreign key, which is
probably not the PI.
Regardless of the join type, in a parallel environment, the movement
of at least one row is normally required. This movement puts all
matching rows together on the same AMP. The movement is usually
required due to the user's choice of a PI. Remember, it is the PI data
value that is used for hashing and row distribution to an AMP.
Therefore, since the joined columns are mostly columns other than the
PI, rows need to be redistributed to another AMP. The redistributed
rows will be temporarily stored in spool space and used from there for
the join processing.
The optimizer will attempt to determine the most efficient path for
data row movement. Its choice will be based on the amount of data
involved. The three join strategies available are: (1) duplicate all rows
of one table onto every AMP, (2) redistribute the rows of one table by
hashing the non-PI join column and sending them to the AMP
containing the matching PI row, and (3) redistribute both tables by
hashed join column value.
The duplication of all rows is a popular approach when the non-PI
column is on a small table. In that case, copying all rows is faster than
hashing and distributing all rows. This technique is also used when
doing a product join and, worse, a Cartesian product join.
When both tables are large, the redistribution of the non-PI column
row to the AMP with the PI column will be used to save space on each
AMP. All participating rows are redistributed so that they are on the
same AMP with the same data value used by the PI for the other table.
The last choice is the redistribution of all participating rows from both
tables by hashing on the join column. This is required when the join is
on a column that is not the PI in either table. Using this last type of
join strategy will require the most spool space. Still, this technique
allows Teradata to quickly join tables together in a parallel
environment. By combining the speed of the BYNET, the experience of
the PE optimizer, and the hashing capabilities of Teradata, the data can
be temporarily moved to meet the demands of the SQL query. Do not
underestimate the importance or brilliance of this capability. As queries
change and place new demands on the data, Teradata is flexible and
powerful enough to move the data temporarily and quickly to the
proper location.
Redistribution requires overhead processing. It has nothing to do with
the join processing, but everything to do with preparing for the join.
This is the primary reason that many tables will use a column that is
not the primary key column as a NUPI. This way, the join columns
used in the WHERE or the ON are used for distribution and the rows
are stored on the same AMP. Therefore, the join is performed without
the need to redistribute data. However, normally some re-distribution
is needed. So, make sure to COLLECT STATISTICS (see DDL chapter)
on the join columns. The strategy that the optimizer chooses can be
seen in the output from an EXPLAIN.
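As an illustration of the last two points, the sketch below defines a
hypothetical table whose NUPI is its join column and then collects
statistics on that column. The table name, columns and data types are
all assumptions:

```sql
-- Sketch only: the table, its columns and data types are hypothetical.
-- PRIMARY INDEX without UNIQUE defines a NUPI on the join column.
CREATE TABLE Order_by_customer
( Order_number      INTEGER NOT NULL
 ,Customer_number   INTEGER
 ,Order_date        DATE
 ,Order_total       DECIMAL(10,2) )
PRIMARY INDEX (Customer_number) ;

-- Give the optimizer demographics for the join column.
COLLECT STATISTICS ON Order_by_customer COLUMN Customer_number ;
```

Running an EXPLAIN on a join before and after such a change is a
reasonable way to see whether a redistribution step disappears.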

Join Index Processing
Sometimes, regardless of the join plan or indices defined, certain joins
cannot be performed in a short enough time frame to satisfy the users.
When this is the case, another alternative must be explored. Later
chapters in this book discuss temporary tables and summary tables as
available techniques. If none of these provide a viable solution, yet
another option is needed.
The other way to improve join processing is the use of a JOIN INDEX.
It is a pre-join that stores the joined rows. Then, when the join index
"covers" the user's SELECT columns, the optimizer automatically uses
the stored join index rows to retrieve the pre-joined rows from
multiple tables instead of doing the join again. The term used here is
covers. It means that if all columns requested by the user are present
in the join index, it is used. If even one column is requested that is not
in the join index, it cannot be used. Therefore, the actual join must be
processed to get that extra column.
The speed of the join index is its main advantage. To enhance its
on-going use, whenever a value in a column in a row for a table used
within a join index is changed, the corresponding value in the join
index row(s) is also changed. This keeps the join index consistent with
the rows in the actual tables.
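A rough sketch of what defining a join index could look like is shown
here. The index name, table names and columns are hypothetical, and
the full set of options is described in the join index chapter:

```sql
-- Sketch only: index, table and column names are hypothetical.
-- A SELECT that asks only for these columns is "covered" by the index.
CREATE JOIN INDEX Cust_Ord_JI AS
SELECT   cust.Customer_number
        ,cust.Customer_name
        ,ord.Order_number
        ,ord.Order_total
FROM     Customer_table AS cust
         INNER JOIN Order_table AS ord
            ON cust.Customer_number = ord.Customer_number
PRIMARY INDEX (Customer_number) ;
```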
For more information on join index usage, see Chapter 18 in this book.
DATE, TIME, and TIMESTAMP
Teradata has a date function and a time function built into the
database and the ability to request this data from the system. In the
early releases, DATE was a valid data type for storing the combination
of year, month and day, but TIME was not. Now, TIME and
TIMESTAMP are both valid data types that can be defined and stored
within a table.
The Teradata RDBMS stores the date in YYYMMDD format on disk. The
YYY is an offset value from the base year of 1900. The MM is the
month value from 1 to 12 and the DD is the day of the month. Using
this format, the database can currently work with dates beyond the
year 3000. So, it appears that Teradata is Y3K compliant. Teradata
always stores a date as a numeric INTEGER value.
The following calculation demonstrates how Teradata converts a date
to the YYYMMDD date format, for storage of January 1, 1999:
Formula for INTEGERDATE = ((Year - 1900) * 10000) + (Month * 100) + Day
The stored data for the date January 1, 1999 is converted to:
Year  = (1999 - 1900) * 10000 =  0990000 (year portion)
Month = 01 * 100              =    +0100 (month portion)
Day   = 01                    =      +01 (day portion)
                                 0990101 stored on disk
Although years prior to 2000 look fairly "normal" with an implied year
for the 20th Century, after 2000 years do not look like the normal
concept of a year (100). Fortunately, Teradata automatically does all
the conversion and makes it transparent to the user. The remainder of
this book will provide SQL examples using both a numeric date as well
as the character formats of 'YY/MM/DD' and 'YYYY-MM-DD'.
The next conversion shows the data stored for January 1, 2000 (notice
that YYY=100 or 100 years from 1900):
Year  = (2000 - 1900) * 10000 =  1000000 (year portion)
Month = 01 * 100              =    +0100 (month portion)
Day   = 01                    =      +01 (day portion)
                                 1000101 stored on disk
Additionally, since the date is stored as an integer and an integer is a
signed value, dates prior to the base year of 1900 can also be stored.
The same formula applies for the date conversion regardless of which
century. However, since dates prior to 1900, like 1800, are smaller
values, the result of the subtraction is a negative number.
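The stored value can be looked at directly by converting a date to an
INTEGER. The sketch below uses ANSI date literals and column aliases
chosen only for illustration; the first two columns should agree with
the two worked conversions above, and the third repeats the formula
for January 1, 1999 as plain arithmetic:

```sql
-- Sketch only: the alias names are illustrative.
SELECT   CAST(DATE '1999-01-01' AS INTEGER)      AS Stored_1999
        ,CAST(DATE '2000-01-01' AS INTEGER)      AS Stored_2000
        ,(1999 - 1900) * 10000 + (1 * 100) + 1   AS Formula_1999 ;
```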

ANSI Standard DATE Reference
CURRENT_DATE is the ANSI Standard name for the date function. All
references to the original DATE function continue to work and return
the same date information. Furthermore, they both display the date in
the same format.
INTEGERDATE
INTEGERDATE is the default display format for most Teradata database
client utilities. It is in the form of YY/MM/DD. It has nothing to do with
the way the data is stored on disk, only the format of the output
display. The current exception to this is Queryman. Since it uses the
ODBC, it displays only the ANSI date, as seen below.
Later in this book, the Teradata FORMAT function is also addressed to
demonstrate alternative arrangements regarding year, month and day
for output presentation.
/* Display today's date, this example assumes Oct. 1, 2001 */
Traditional Teradata      ANSI
SELECT DATE;              SELECT CURRENT_DATE;
    DATE                  CURRENT_DATE
01/10/01                      01/10/01
Figure 8-1
To change the output default display, see the DATEFORM options in the
next section of this chapter.
ANSIDATE
Teradata was updated in release V2R3 to include the ANSI date display
and reserved name. The ANSI format is: YYYY-MM-DD.
/* Display today's date, this example assumes Oct. 1, 2001 */
Traditional Teradata      ANSI
SELECT DATE;              SELECT CURRENT_DATE;
      DATE                CURRENT_DATE
2001-10-01                  2001-10-01
Figure 8-2
Since we are now beyond the year 1999, it is advisable to use this
ANSI format to guarantee that everyone knows the difference between
all the years of each century as: 2000, 1900 and 1800. If you regularly
use tools via the ODBC, which is software for Open Data Base
Connectivity, this is the default display format for the date.
DATEFORM
Teradata has traditionally been Y2K compliant. In reality, it is
compliant to the years beyond 3000. However, the default display
format using YY/MM/DD is not ANSI compliant.
In Teradata, release V2R3 allows a choice of whether to display the
date in the original display format (YY/MM/DD) or the newer ANSI
format (YYYY-MM-DD). When installed, Teradata defaults at the system
level to the original format, called INTEGERDATE. However, this system
default DATEFORM may be over-ridden by updating the DBS
Control record.
The DATEFORM:
• Controls default display of selected dates
• Controls expected format for import and export of dates as character
  strings ('YY/MM/DD' or 'YYYY-MM-DD') in the load utilities
• Can be over-ridden by USER or within a Session at any time.
System Level Definition
MODIFY GENERAL 14 = 0 /* INTEGERDATE (YY/MM/DD) */
MODIFY GENERAL 14 = 1 /* ANSIDATE (YYYY-MM-DD) */
User Level Definition
CREATE USER username ...
   DATEFORM = {INTEGERDATE | ANSIDATE} ;
Session Level Declaration
In addition to setting the system default in the control record, a user
can request the format for their individual session. The syntax is:
SET SESSION DATEFORM = {ANSIDATE | INTEGERDATE} ;
In the above settings, the " | " is used to represent an OR condition.
The setting can be ANSIDATE or INTEGERDATE. Regardless of the
DATEFORM being used, ANSIDATE or INTEGERDATE, these define load
and display characteristics only. Remember, the date is always stored
on disk in the YYYMMDD format, but the DATEFORM allows you to
select the format for display.
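As a short illustration of the session level declaration, the sketch
below switches the display format back and forth within one session,
assuming a client such as BTEQ that honors the DATEFORM setting:

```sql
-- Sketch only: assumes a client (for example BTEQ) that honors DATEFORM.
SET SESSION DATEFORM = ANSIDATE ;
SELECT CURRENT_DATE ;      -- displayed as YYYY-MM-DD

SET SESSION DATEFORM = INTEGERDATE ;
SELECT CURRENT_DATE ;      -- displayed as YY/MM/DD
```

Only the display changes; the value stored on disk is the same in
both cases.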

DATE Processing
Much of the time spent processing dates is dedicated to storage and
reference. Yet, there are times that one date yields or derives a second
date. For instance, once a bill has been sent to a customer, the
expectation is that payment comes 60 days later. The challenge
becomes the correct calculation of the exact due date.
Since Teradata stores the date as an INTEGER, it allows simple and
complex mathematics to calculate new dates from dates. The next
SELECT operation uses the Teradata date arithmetic and
DATEFORM=INTEGERDATE to show the month and day of the payment
due date in 60 days:
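A sketch of such a request, assuming a table named Order_table and a
filter that keeps only the 1999 orders (both are assumptions; the
column names come from the report headings):

```sql
-- Sketch only: the table name and the WHERE filter are assumed.
SELECT   (Order_date + 60)  (TITLE 'Due Date')
        ,Order_date
        ,Order_total  (FORMAT '$$$$,$$$.99')
FROM     Order_table
WHERE    Order_date > 981231 ;    -- numeric date: December 31, 1998
```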
4 Rows Returned
Due Date   Order_date   Order_total
99/12/09   99/10/10      $15,231.62
99/03/02   99/01/01       $8,005.91
99/11/08   99/09/09      $23,454.84
99/11/30   99/10/01       $5,111.47
Besides a due date, the SQL can also calculate a discount period date
10 days prior to the payment due date using the alias name:
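One way this could be coded is sketched below. The table name, the
alias Due_Date, the filter and the 2 percent discount factor are
assumptions, the last chosen to be consistent with the report that
follows:

```sql
-- Sketch only: table name, alias, filter and discount factor are assumed.
SELECT   Order_date
        ,(Order_date + 60)    (TITLE 'Due Date')  AS Due_Date
        ,Order_total          (FORMAT '$$$$,$$$.99')
        ,(Due_Date - 10)      (TITLE 'Discount Date')
        ,(Order_total * .98)  (FORMAT '$$$$,$$$.99', TITLE 'Discounted')
FROM     Order_table
WHERE    Order_date > 981231 ;
```

The fourth column reuses the alias Due_Date instead of repeating the
date arithmetic.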
4 Rows Returned
Order_date   Due Date   Order_total   Discount Date   Discounted
99/10/10     99/12/09    $15,231.62   99/11/29        $14,926.99
99/01/01     99/03/02     $8,005.91   99/02/20         $7,845.79
99/09/09     99/11/08    $23,454.84   99/10/29        $22,985.74
99/10/01     99/11/30     $5,111.47   99/11/20         $5,009.24
In the above example, it was demonstrated that a DATE + or - an
INTEGER results in a new date (date { + | - } integer = date).
However, it probably does not make a lot of sense to multiply or divide
a date by a number.
As seen earlier in this chapter, the stored format of the date is
YYYMMDD. Since DD is the lowest component, the 60 being added to
the order date in the above SELECT is assumed to be days. The
system is smart enough to know that it is dealing with a date.
Therefore, it is smart enough to know that a normal year contains 365
days.
The rules of algebra tell us that an equation can be rearranged and
still be valid. Therefore, a DATE - a DATE results in an INTEGER
(date - date = integer). This INTEGER represents the number of days
between the dates.
This chart summarizes the math operations on dates:
Operation              Result
DATE - DATE            Interval (days between dates)
DATE + or - integer    DATE
Figure 8-3
This SELECT uses this principle to display the number of days I was
alive on my last birthday:
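A sketch of the subtraction, using numeric dates for October 1, 2000
and October 1, 1952 (the day of the month for the 1952 date is an
assumption):

```sql
-- Sketch only: 1001001 is October 1, 2000; 521001 is assumed to be
-- the 1952 birth date, October 1, 1952.
SELECT   ((1001001(DATE)) - (521001(DATE)))
            (TITLE 'Mike''s Age in Days') ;
```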
1 Row Returned
Mike's Age in Days
             17532
The above example subtracted my actual birth date in 1952 from one
of my birthdays (October 1, 2000). Notice how awful an age looks in
days! More importantly, notice how I slipped into the Title the fact
that you can use two single quotes to store or display a literal single
quote in a character string.
As mentioned above, an age in days looks awful and that is probably
why we do not use that format. I am not ready to tell someone I am
just a little over 17000. Instead, we think about ages in years. To
convert the days to years, again math can be used as seen in the
following SELECT:
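A sketch of the conversion, dividing the day count by 365 with
integer arithmetic (the 1952 date of October 1 is an assumption):

```sql
-- Sketch only: 521001 is assumed to be the 1952 birth date.
-- Integer division drops the fraction of a year.
SELECT   (((1001001(DATE)) - (521001(DATE))) / 365)
            (TITLE 'Mike''s Age in Years') ;
```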
1 Row Returned
Mike's Age in Years
                 48
Wow! I feel so much younger now. This is where division begins to
make sense, but remember, the INTEGER is not a DATE. At the same
time, it assumes that all years have 365 days. It only does the math
operations specified in the SQL statement.
Now, what day was he born?
The next SELECT uses the concatenation, date arithmetic and a blank
TITLE to produce the desired output:
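The arithmetic at the heart of that request can be sketched on its
own. This simplified version leaves out the concatenation and uses an
ordinary TITLE, and the 1952 date of October 1 is an assumption:

```sql
-- Sketch only: simplified, without the concatenation.
-- 101(DATE) is January 1, 1900; 521001 is assumed to be October 1, 1952.
SELECT   (((521001(DATE)) - (101(DATE))) MOD 7)
            (TITLE 'Mike was born on day') ;
```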
1 Row Returned
Mike was born on day
2
The above subtraction results in the number of days between the two
dates. Then, the MOD 7 divides by 7 to get rid of the number of weeks
and results in the remainder. A MOD 7 can only result in values 0 thru
6 (the largest is always 1 less than the MOD operand). Since January 1,
1900 ( 101(date) ) is a Monday, Mike was born on a Wednesday.
This chart can be used for the day of the week based on the above
formula and 101(date):
Result   Day of the Week
0        Monday
1        Tuesday
2        Wednesday
3        Thursday
4        Friday
5        Saturday
6        Sunday
Figure 8-4
The following SELECT uses a year's worth of days to derive a new date
that is 365 days away:
5 Rows Returned
Order_date   Year Later Date   Order_total
98/05/04     99/05/04           $12,347.53
99/01/01     00/01/01            $8,005.91
99/09/09     00/09/08           $23,454.84
99/10/01     00/09/30            $5,111.47
99/10/10     00/10/09           $15,231.62
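A request that produces a report like the one above might be sketched
as follows, assuming a table named Order_table:

```sql
-- Sketch only: the table name is assumed.
SELECT   Order_date
        ,(Order_date + 365)  (TITLE 'Year Later Date')
        ,Order_total  (FORMAT '$$$$,$$$.99')
FROM     Order_table
ORDER BY 1 ;
```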
In the a&o5e, the #ear 8FFF )as not a lea0 #ear. Therefore, the 5alue
of 3I" is used. Like)ise, had the &eginning #ear &een 2000, then 3II
needs to &e used &ecause it is a Lea0 /ear. !emem&er, the s#stem is
sim0l# doing the math that is indicated in the 3@L statement. If a #ear
)ere al)a#s needed, regardless of the num&er of da#s, see the
4**WM1-TC3function.
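A SELECT of roughly this shape would add 365 days to each stored date. It is only a sketch: the table name Order_table is an assumption (the text calls it the Order table), and the ORDER BY is added to match the sequence of the rows shown.

```sql
-- Integer arithmetic on a DATE: the 365 is treated as a number of days.
SELECT   Order_date
        ,Order_date + 365 (TITLE 'Year Later Date')
        ,Order_total
FROM     Order_table        -- assumed table name
ORDER BY 1;
```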
ADD_MONTHS
Compatibility: Teradata Extension
The Teradata ADD_MONTHS function can be used to calculate a new
date. This date may be in the future (addition) or in the past
(subtraction). The calendar intelligence is built-in for the number of
days in a month as well as leap year processing. Since the ANSI
CURRENT_DATE and CURRENT_TIME are compatible with the original
DATE and TIME functions, the ADD_MONTHS works with them as well.
Below is the syntax for the ADD_MONTHS function:
ADD_MONTHS ( <date-column>, <number-of-months> )
The next SELECT uses literals instead of table rows to demonstrate the
calendar logic used by the ADD_MONTHS function when beginning with
the last day of a month and arriving at the last day of February:
1 Row Returned
FEB_Non_Leap  Oct_P120  FEB_Leap_Yr  Oct_M240  FEB_Leap_Yr2  Oct_4Yrs
2001-02-28    01/02/27  2000-02-29   00/03/04  2004-10-30    04/10/30
Notice, when using the ADD_MONTHS function, that all the output
displays in ANSI date form. This is true when using BTEQ or
Queryman. Conversely, the date arithmetic uses the default date
format. Likewise, the second ADD_MONTHS uses -8, which equates to
subtraction or going back in time versus ahead. Additionally, because
months have a varying number of days, the output from math is likely
to be different than the ADD_MONTHS.
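The comparison between ADD_MONTHS and plain date math could be sketched as follows. The starting date of October 30, 2000 and the day counts are inferred from the output shown above rather than taken from the original statement, so treat them as assumptions.

```sql
-- ADD_MONTHS clamps to the last valid day of the target month,
-- while integer math simply counts days.
SELECT ADD_MONTHS(DATE '2000-10-30', 4)     AS FEB_Non_Leap
      ,DATE '2000-10-30' + 120              AS Oct_P120
      ,ADD_MONTHS(DATE '2000-10-30', -8)    AS FEB_Leap_Yr
      ,DATE '2000-10-30' - 240              AS Oct_M240
      ,ADD_MONTHS(DATE '2000-10-30', 12*4)  AS FEB_Leap_Yr2
      ,DATE '2000-10-30' + (365*4 + 1)      AS Oct_4Yrs;
```

Four months after October 30 would be February 30, so ADD_MONTHS settles on February 28, 2001, whereas adding 120 days lands on February 27.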
The next SELECT uses the ADD_MONTHS function as an alternative to
the previous SELECT operations for showing the month and day of the
payment due date in 2 months:
5 Rows Returned
Due Date     Order_date   Order_total
1998-07-04   1998-05-04   $12,347.53
1999-03-01   1999-01-01   $8,005.91
1999-11-09   1999-09-09   $23,454.84
1999-12-01   1999-10-01   $5,111.47
1999-12-10   1999-10-10   $15,231.62
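A due date two months out might be requested like this. The table name Order_table is assumed, and the ORDER BY is only there to match the row sequence shown.

```sql
-- Two calendar months after each order date.
SELECT   ADD_MONTHS(Order_date, 2) (TITLE 'Due Date')
        ,Order_date
        ,Order_total
FROM     Order_table        -- assumed table name
ORDER BY 2;
```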
The ADD_MONTHS function also takes into account the last day of
each month. The following goes from the last day of one month to the
last day of another month:
1 Row Returned
Leap_Ahead_2yrs  Leap_Back_2yrs  With30_31_
2000-02-29       2000-02-29      2001-07-31
Whether going forward or backward in time, a leap year is still
recognized using ADD_MONTHS.

ANSI TIME
Teradata has also been updated in V2R3 to include the ANSI time
display, reserved name and the new TIME data type. Additionally, the
clock is now intelligent and can carry seconds over into minutes.
CURRENT_TIME is the ANSI name of the time function. All current SQL
references to the original Teradata TIME function continue to work.
/* Display the time, this example assumes 12:15PM */
Traditional Teradata     ANSI
SELECT TIME;             SELECT CURRENT_TIME;
TIME                     CURRENT_TIME
12:15:00                 12:15:00
Figure 8-5
Although the time could be displayed prior to release V2R3, when
stored, it was converted to a character column type. Now, TIME is also
a valid data type, may be defined in a table, and retains the
HH:MM:SS properties.
As well as creating a TIME data type, intelligence has been added to
the clock software. It can increment or decrement TIME with the result
increasing to the next minute or decreasing from the previous minute
based on the addition or subtraction of seconds.
When storing TIME on disk, this chart indicates the amount of storage
required:
TIME(n) as: HH:MM:SS.nnnnnn   n = 0-6 (maximum is 6 digits to the right
of the decimal, default = 6)
HH stored as byteint (1 byte)
MM stored as byteint (1 byte)
SS stored as decimal(8,6) (4 bytes)
Figure 8-6
TIME representation character display length:
TIME (0) - 10:14:38          CHAR(8)
TIME (6) - 10:14:38.201163   CHAR(15)

EXTRACT
Compatibility: ANSI
Both DATE and TIME data are special in terms of relational design,
since each is comprised of 3 parts and they are decomposable.
Decomposable data is data that is not at its most granular level. For
example, you may only want to see the hour.
The EXTRACT function is designed to do the decomposition on these
data types. It works with both the DATE and TIME functions. This
includes the original and newer ANSI expressions. The operation is to
pull a specific portion out of the date or time value.
The syntax for EXTRACT:
EXTRACT ( { YEAR | MONTH | DAY | HOUR | MINUTE | SECOND } FROM <date-or-time-value> )
The next SELECT uses the EXTRACT with date and time literals to
demonstrate the coding technique and the resulting output:
1 Row Returned
Yr_Part  Mth_Part  Day_Part  Hr_Part  Min_Part  Sec_Part
2000     10        01        10       1         30
The EXTRACT can be very helpful when there is a need to have a
single component for controlling access to data or the presentation of
data. For instance, when calculating aggregates, it might be necessary
to group the output on a change in the month. Since the data
represents daily activity, the month portion needs to be evaluated
separately.
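An EXTRACT against literals, as in the one-row example earlier in this section, could look like the sketch below. The literal values (October 1, 2000 and 10:01:30) are inferred from that output and are assumptions.

```sql
-- One EXTRACT per portion; DATE supplies year, month and day,
-- TIME supplies hour, minute and second.
SELECT EXTRACT(YEAR   FROM DATE '2000-10-01') AS Yr_Part
      ,EXTRACT(MONTH  FROM DATE '2000-10-01') AS Mth_Part
      ,EXTRACT(DAY    FROM DATE '2000-10-01') AS Day_Part
      ,EXTRACT(HOUR   FROM TIME '10:01:30')   AS Hr_Part
      ,EXTRACT(MINUTE FROM TIME '10:01:30')   AS Min_Part
      ,EXTRACT(SECOND FROM TIME '10:01:30')   AS Sec_Part;
```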
The Order table below is used to demonstrate the EXTRACT function in
a SELECT:
Order Table - contains 5 orders
Order_number  Customer_number  Order_date  Order_total
PK            FK
UPI           NUSI             NUSI
123456        11111111         980504      12347.53
123512        11111111         990101      08005.91
123552        31323134         991001      05111.47
123585        87323456         991010      15231.62
123777        57896883         990909      23454.84
Figure 8-7
The following SELECT uses the EXTRACT to only display the month and
also to control the number of aggregates displayed in the GROUP BY:
4 Rows Returned
EXTRACT(MONTH FROM Order_date)  Nbr_of_rows  Average(Order_total)
1                               1            8005.91
5                               1            12347.53
9                               1            23454.84
10                              2            10171.54
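Grouping on the extracted month might be written as follows. This is a sketch under the assumption that the table is named Order_table; the positional GROUP BY 1 refers to the EXTRACT expression.

```sql
-- One output row per calendar month found in Order_date.
SELECT   EXTRACT(MONTH FROM Order_date)
        ,COUNT(*)          AS Nbr_of_rows
        ,AVG(Order_total)
FROM     Order_table       -- assumed table name
GROUP BY 1
ORDER BY 1;
```

The two October orders fall into a single group, which is why five orders produce only four rows.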
The next SELECT operation uses entirely ANSI compliant code with
DATEFORM=ANSIDATE to show the month and day of the payment due
date in 2 months and 4 days. Notice it uses double quotes to allow
reserved words as alias names and ANSIDATE in the comparison and
display:
4 Rows Returned
           Month  Day  Year  Order_date    Order_total
Due Date:  3      6    1999  Jan 01, 1999  8005.91
Due Date:  11     12   1999  Aug 09, 1999  23454.84
Due Date:  12     4    1999  Oct 10, 1999  5111.47
Due Date:  12     13   1999  Oct 10, 1999  15231.62

Implied Extract of Day, Month and Year
Compatibility: Teradata Extension
Although the EXTRACT works great and it is ANSI compliant, it is a
function. Therefore, it must be executed and the parameters passed to
it to identify the desired portion as data. Then, it must pass back the
answer. As a result, there is additional overhead processing required to
use it.
It was mentioned earlier that Teradata stores a date as an integer and
therefore allows math operations to be performed on a date.
The syntax for implied extract:
The following SELECT uses math to extract the three portions of Mike's
literal birthday:
1 Row Returned
Day_portion  Month_portion  Year_portion
1            10             2001
Remember that the date is stored as yyymmdd. The literal values are
used here to provide a date of Oct. 1, 2001. The day portion is
obtained here by making the dd portion (last 2 digits) the remainder
from the MOD 100. The month portion is obtained by dividing by 100
to eliminate the dd, which leaves the mm (new last 2 digits) portion
as the remainder of the MOD 100. The year portion is the trickiest.
Dividing by 10000 eliminates both the mm and the dd. Since the year is
stored as yyy (yyyy - 1900), we must then add 1900 to the stored value
to convert it back to the yyyy format. What do you suppose the
EXTRACT function does? Same thing.
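The arithmetic just described can be sketched with the integer form of October 1, 2001, which is 1011001. The use of the Teradata-style integer-to-DATE conversion below is an assumption about how the original statement was written.

```sql
-- 1011001 MOD 100           = 1      (dd)
-- 1011001 / 100   = 10110;  10110 MOD 100 = 10   (mm)
-- 1011001 / 10000 = 101;    101 + 1900    = 2001 (yyyy)
SELECT (1011001 (DATE)) MOD 100            AS Day_portion
      ,((1011001 (DATE)) / 100) MOD 100    AS Month_portion
      ,(1011001 (DATE)) / 10000 + 1900     AS Year_portion;
```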

ANSI TIMESTAMP
Another new data type, added to Teradata in V2R3 to comply with the
ANSI standard, is the TIMESTAMP. TIMESTAMP is now a display format,
a reserved name and a new data type. It is a combination of the
DATE and TIME data types combined together into a single column data
type.
Since this is entirely new, there is no previous compatibility to
contrast.
Teradata                   ANSI
Did not previously exist   SELECT CURRENT_TIMESTAMP;
                           CURRENT_TIMESTAMP
                           2000-10-01 12:15:00
Figure 8-8
Timestamp representation character display length:
TIMESTAMP(0)  1998-12-07 11:37:58         CHAR(19)
TIMESTAMP(6)  1998-12-07 11:37:58.213000  CHAR(26)
Notice that there is a space between the DATE and TIME portions of a
timestamp. This is a required element to delimit or separate the day
from the hour.

TIME ZONES
In V2R3, Teradata has the ability to access and store both the hours
and the minutes reflecting the difference between the user's time zone
and the system time zone. From a World perspective, this difference is
normally the number of hours between a specific location on Earth and
the United Kingdom location that was historically called Greenwich
Mean Time (GMT). Since the Greenwich observatory has been
"decommissioned," the new reference to this same time zone is called
Coordinated Universal Time (UTC).
A time zone relative to London (UTC) might be:
LA      Miami   Frankfurt  Hong Kong
+8:00   +05:00  00:00      -08:00
A time zone relative to New York (EST) might be:
LA      Miami   Frankfurt  Hong Kong
+3:00   00:00   -05:00     -13:00
Here, the time zones used are represented from the perspective of the
system at EST. In the above, it appears to be backward. This is
because the time zone is set using the number of hours that the
system is from the user.
To show an example of TIME values, we randomly chose a time just
after 10:00AM. Below, the various TIME with time zone values are
designated as:
The default, for both TIME and TIMESTAMP, is to display six digits of
decimal precision in the second's portion. Time zones are set either at
the system level (DBS Control), the user level (when user is created or
modified), or at the session level as an override.
SETTING TIME ZONES
A Time Zone should be established for the system and every user in
each different time zone.
Setting the system default time zone:
MODIFY GENERAL 16 = n /* Hours, n = -12 to 13 */
MODIFY GENERAL 17 = n /* Minutes, n = -59 to 59 */
Setting a User's time zone requires choosing either LOCAL, NULL, or a
variety of explicit values:
A Teradata session can modify the time zone during normal operations
without requiring a logoff and logon.
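The user-level and session-level settings might be sketched as shown below. The user name mjl is hypothetical, only the LOCAL and NULL choices named in the text are shown at the user level, and the session-level offset of five hours is an arbitrary illustration.

```sql
-- User level: take the system default, or store no time zone at all.
MODIFY USER mjl AS TIME ZONE = LOCAL;   -- mjl is a hypothetical user
MODIFY USER mjl AS TIME ZONE = NULL;

-- Session level: override without logging off.
SET TIME ZONE LOCAL;
SET TIME ZONE INTERVAL -'05:00' HOUR TO MINUTE;
```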
Using TIME ZONES
A user's time zone is now part of the information maintained by
Teradata. The settings can be seen in the extended information
available in the HELP SESSION request.
1 Row Returned
User Name               MJL
Account Name            MJL
Logon Date              00/10/15
Logon Time              08:43:45
Current DataBase        Accounting
Collation               ASCII
Character Set           ASCII
Transaction Semantics   Teradata
Current DateForm        IntegerDate
Session Time Zone       00:00
Default Character Type  LATIN
Export Latin            1
Export Unicode          1
Export Unicode Adjust   0
Export KanjiSJIS        1
Export Graphic          0
By creating a table and requesting the WITH TIME ZONE option for a
TIME or TIMESTAMP data type, this additional offset is also stored.
The following SHOW command displays a table containing one
timestamp column with TIME ZONE and one column as a timestamp
column without TIME ZONE:
SHOW TABLE Tstamp_test;
Text of DDL Statement Returned
As rows were inserted into the table, the time zone of the user's
session was automatically captured along with the data for
TS_with_zone. Storing the time zone requires an additional 2 bytes of
storage beyond the date+time requirements.
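A table definition consistent with the description above might resemble this sketch. The column names come from the output shown in this section, while the CHAR(3) label column, the precision of 6 and the index choice are assumptions.

```sql
CREATE TABLE Tstamp_test
( TS_zone          CHAR(3)                        -- label such as 'EST'
 ,TS_with_zone     TIMESTAMP(6) WITH TIME ZONE    -- stores the offset too
 ,TS_without_zone  TIMESTAMP(6) )
UNIQUE PRIMARY INDEX (TS_zone);
```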
The next SELECT shows the data rows currently in the table:
SELECT * FROM Tstamp_test ;
4 Rows Returned
TS_zone  TS_with_zone                      TS_without_zone
UTC      2000-10-01 08:12:00.000000+05:00  2000-10-01 08:12:00.000000
EST      2000-10-01 08:12:00.000000+00:00  2000-10-01 08:12:00.000000
PST      2000-10-01 08:12:00.000000-03:00  2000-10-01 08:12:00.000000
HKT      2000-10-01 08:12:00.000000-11:00  2000-10-01 08:12:00.000000
Normalizing TIME ZONES
Teradata has the ability to incorporate the use of time zones into SQL
for a relative view of the data based on one locality versus another.
This SELECT adjusts the data rows based on their TIME ZONE data in
the table:
4 Rows Returned
TS_zone  TS_with_zone                      T_Normal
UTC      2000-10-01 08:12:00.000000+05:00  2000-10-01 03:12:00.000000
EST      2000-10-01 08:12:00.000000+00:00  2000-10-01 08:12:00.000000
PST      2000-10-01 08:12:00.000000-03:00  2000-10-01 11:12:00.000000
HKT      2000-10-01 08:12:00.000000-11:00  2000-10-01 19:12:00.000000
Notice that the Time Zone value was added to or subtracted from the
time portion of the time stamp to adjust them to a perspective of the
same time zone. As a result, at that moment, it has normalized the
different Time Zones in respect to the system time.
As an illustration, when the transaction occurred at 8:12 AM locally in
the PST Time Zone, it was already 11:12 AM in EST, the location of the
system. The times in the columns have been normalized in respect to
the time zone of the system.
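One way such a normalized column could be produced is by casting the zoned timestamp to a plain TIMESTAMP, which expresses each value in the session's own time zone. This is an assumption about the technique, not necessarily the statement used for the output above.

```sql
-- Dropping the zone via CAST adjusts each value to the session time zone.
SELECT TS_zone
      ,TS_with_zone
      ,CAST(TS_with_zone AS TIMESTAMP(6)) AS T_Normal
FROM   Tstamp_test;
```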

DATE and TIME Intervals
To make Teradata SQL more ANSI compliant and compatible with other
RDBMS SQL, NCR has added INTERVAL processing. Intervals are used
to perform DATE, TIME and TIMESTAMP arithmetic and conversion.
Although Teradata allowed arithmetic on DATE and TIME, it was not
performed in accordance to ANSI standards and therefore, an
extension instead of a standard. With INTERVAL being a standard
instead of an extension, more SQL can be ported directly from an ANSI
compliant database to Teradata without conversion.
Additionally, when a data value was used to perform date or time
math, it was always "assumed" to be at the lowest level for the
definition (days for DATE and seconds for TIME). Now, any portion of
either can be expressed and used.
INTERVAL Chart
The simple intervals are:   The more involved intervals are:
YEAR                        DAY TO HOUR
MONTH                       DAY TO MINUTE
DAY                         DAY TO SECOND
HOUR                        HOUR TO MINUTE
MINUTE                      HOUR TO SECOND
SECOND                      MINUTE TO SECOND
Figure 8-9
Using Intervals
To use the ANSI syntax for intervals, the SQL statement must be very
specific as to what the data values mean and the format in which they
are coded. ANSI standards tend to be lengthier to write and more
restrictive as to what is and what is not allowed regarding the values
and their use.
Simple INTERVAL Examples using literals:
INTERVAL '500' DAY(3)
INTERVAL '3' MONTH
INTERVAL -'28' HOUR
Complex INTERVAL Examples using literals:
INTERVAL '45 18:30:10' DAY TO SECOND
INTERVAL '12:12' HOUR TO MINUTE
INTERVAL '12:12' MINUTE TO SECOND
For several of the INTERVAL literals, their use seems obvious based on
the non-numeric characters used in the literal. However, notice that
the HOUR TO MINUTE and the MINUTE TO SECOND above contain the same
literal and are not so obvious. Therefore, the declaration of the
meaning is important.
For instance, notice that they are coded as character literals. This
allows for use of a slash (/), colon (:) and space as part of the
literal. Also, notice the use of a negative time frame requires a "-"
sign to be outside of the quotes. The presence of the quotes also
denotes that the numeric values are treated as character for
conversion to a point in time.
The format of a timestamp requires the space between the day and
hour when using intervals. For example, notice the blank space
between the day and hour in the compound DAY TO SECOND interval.
Without the space, it is an error.
INTERVAL Arithmetic with DATE and TIME
To use DATE and TIME arithmetic, it is important to keep in mind the
results of various operations.
The chart below shows the Teradata implied arithmetic results.
DATE and TIME arithmetic Results prior to intervals:
DATE - DATE = Integer (days)
DATE MOD 100 = Integer (day of month)
DATE / 100 = Integer (year and month)
DATE / 10000 = Integer (year)
TIME - TIME = Integer (hours)
DATE + or - Integer = DATE
Figure 8-10
The chart below shows the ANSI explicit arithmetic results.
DATE and TIME arithmetic Results using intervals:
DATE - DATE = Interval
TIME - TIME = Interval
TIMESTAMP - TIMESTAMP = Interval
DATE + or - Interval = DATE
TIME + or - Interval = TIME
TIMESTAMP + or - Interval = TIMESTAMP
INTERVAL + or - Interval = Interval
Figure 8-11
Note: It makes little sense to add two dates together.
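A few rows of Figure 8-11 can be illustrated with literals. The values below are arbitrary choices for the sketch; each expression keeps the data type of its left-hand operand.

```sql
SELECT DATE '1999-10-01' + INTERVAL '3' MONTH        -- DATE + Interval = DATE
      ,TIME '10:14:38' - INTERVAL '30' MINUTE        -- TIME - Interval = TIME
      ,TIMESTAMP '1998-12-07 11:37:58'
         + INTERVAL '45 18:30:10' DAY TO SECOND;     -- TIMESTAMP + Interval = TIMESTAMP
```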
Traditionally, the output of the subtraction is an integer, up to 2.147
billion. However, Teradata knows that when an integer is used in a
formula with a date, it must represent a number of days. The following
uses the ANSI representation for a DATE:
SELECT (DATE '1999-10-01' - DATE '1988-10-01') AS Assumed_Days ;
1 Row Returned
Assumed_Days
4017
The next SELECT uses the ANSI explicit DAY interval:
SELECT (DATE '1999-10-01' - DATE '1988-10-01') DAY AS Actual_Days ;
**** Failure 7453 Interval Field Overflow
The above request fails on an overflow of the INTERVAL. Using this
ANSI interval, the output of the subtraction is an interval with 4 digits.
The default for all intervals is 2 digits and therefore the overflow
occurs until the SELECT is modified with DAY(4), below:
SELECT (DATE '1999-10-01' - DATE '1988-10-01') DAY(4) AS Actual_Days ;
1 Row Returned
Actual_Days
4017
Normally, a date minus a date yields the number of days between
them. To see months instead, the following SELECT operations use
literals to demonstrate the conversions performed on various DATE and
INTERVAL data:
SELECT (DATE '2000-10-01' - DATE '1999-10-01') MONTH (Title 'Months') ;
1 Row Returned
Months
12
The next SELECT shows INTERVAL operations used with TIME:
1 Row Returned
Actual_hours  Actual_minutes  Actual_seconds  Actual_seconds4
2             155             9300.000000     9300.0000
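The TIME interval output above could come from a statement shaped like this sketch. The two time literals are assumptions chosen so that they differ by 2 hours and 35 minutes (155 minutes, or 9,300 seconds), and the precisions are sized to hold those values.

```sql
SELECT (TIME '12:45:01' - TIME '10:10:01') HOUR         AS Actual_hours
      ,(TIME '12:45:01' - TIME '10:10:01') MINUTE(3)    AS Actual_minutes
      ,(TIME '12:45:01' - TIME '10:10:01') SECOND(4)    AS Actual_seconds
      ,(TIME '12:45:01' - TIME '10:10:01') SECOND(4,4)  AS Actual_seconds4;
```

MINUTE and SECOND need the wider leading precision here because 155 and 9300 do not fit in the default 2 digits.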
Although Intervals tend to be more accurate, they are more restrictive
and therefore, more care is required when coding them into the SQL
constructs. However, one miscalculation, like in the overflow example,
and the SQL fails. Additionally, 9999 is the largest value for any
interval. Therefore, it might be required to use a combination of
intervals, such as: MONTHS to DAYS in order to receive an answer
without an overflow occurring.
CAST Using Intervals
Compliance: ANSI
The CAST function was seen in an earlier chapter as the ANSI method
for converting data from one type to another. It can also be used to
convert one INTERVAL to another INTERVAL representation. Although
the CAST is normally used in the SELECT list, it works in the WHERE
clause for comparison reasons.
Below is the syntax for using the CAST with a date:
The following converts an INTERVAL of 6 years and 2 months to an
INTERVAL number of months:
SELECT CAST( (INTERVAL '6-02' YEAR TO MONTH) AS INTERVAL MONTH );
1 Row Returned
6-02
74
Logic seems to dictate that if months can be shown, the years and
months should also be available. This request attempts to convert
1300 months to show the number of years and months:
*** Failure 7453 Interval Field Overflow.
The above failed because the number of months takes more than two
digits to hold a number of years greater than 99. The fix is to change
the YEAR to YEAR(3) and rerun:
1 Row Returned
Years & Months
100-02
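The failing and corrected requests might be sketched as below. Note an assumption in the literal: a result of 100-02 corresponds to 1,202 months (1,300 months would be 108-04), so 1202 is used here to agree with the output shown.

```sql
-- Fails with an interval overflow: YEAR defaults to 2 digits.
SELECT CAST( (INTERVAL '1202' MONTH(4)) AS INTERVAL YEAR TO MONTH )
       (TITLE 'Years & Months');

-- Works: YEAR(3) leaves room for 100 years.
SELECT CAST( (INTERVAL '1202' MONTH(4)) AS INTERVAL YEAR(3) TO MONTH )
       (TITLE 'Years & Months');
```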
The biggest advantage in using the INTERVAL processing is that SQL
written on another system is now compatible with Teradata.
At the same time, care must be taken to use a representation that is
large enough to contain the answer. The default is 2 digits and
anything larger, 4 digits maximum, must be literally requested. The
incorrect size results in an SQL runtime error. The next section on the
System Calendar demonstrates another way to convert from one
interval of time to another.

OVERLAPS
Compatibility: Teradata Extension
When working with dates and times, sometimes it is necessary to
determine whether two different ranges have common points in time.
Teradata provides a Boolean function to make this test for you. It is
called OVERLAPS; it evaluates true, if multiple points are in common,
otherwise it returns a false.
The syntax of the OVERLAPS is:
( <start-1>, <end-1> ) OVERLAPS ( <start-2>, <end-2> )
The following SELECT tests two literal dates and uses the OVERLAPS to
determine whether or not to display the character literal:
1 Row Returned
The dates overlap
The literal is returned because both date ranges have from October 15
through November 30 in common.
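An OVERLAPS test of this kind might be coded as in the sketch below. Only October 15 and November 30 are given in the text; the year and the outer endpoints of each range are assumptions.

```sql
-- The literal is returned only when the two date ranges share days.
SELECT 'The dates overlap' (TITLE '')
WHERE  (DATE '2001-01-01', DATE '2001-11-30')
       OVERLAPS
       (DATE '2001-10-15', DATE '2001-12-31');
```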
The next SELECT tests two literal dates and uses the OVERLAPS to
determine whether or not to display the character literal:
No Rows Found
The literal was not selected because the ranges do not overlap. So, the
common single date of November 30 does not constitute an overlap.
When dates are used, 2 days must be involved and when time is used,
2 seconds must be contained in both ranges.
The following SELECT tests two literal times and uses the OVERLAPS to
determine whether or not to display the character literal:
The times overlap
This is a tricky example and it is shown to prove a point. At first
glance, it appears as if this answer is incorrect because 02:01:00 looks
like it starts 1 minute after the first range ends. However, the system
works on a 24-hour clock when a date and time (timestamp) is not
used together. Therefore, the system considers the earlier time of 2 AM
as the start and the later time of 8 AM as the end of the range.
Therefore, not only do they overlap, the second range is entirely
contained in the first range.
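The tricky time example could be reproduced with something like the following. The 8 AM, 2 AM and 02:01:00 values follow the discussion above; the end of the second range (04:15:00) is an assumption.

```sql
-- With TIME alone, the earlier value of a pair is treated as the start,
-- so the first range is read as 02:00:00 through 08:00:00.
SELECT 'The times overlap' (TITLE '')
WHERE  (TIME '08:00:00', TIME '02:00:00')
       OVERLAPS
       (TIME '02:01:00', TIME '04:15:00');
```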
The following SELECT tests two literal dates and uses the OVERLAPS to
determine whether or not to display the character literal:
No Rows Found
When using the OVERLAPS function, there are a couple of situations to
keep in mind:
1. A single point in time, i.e. the same date, does not constitute an
overlap. There must be at least one second of time in common for
TIME or one day when using DATE.
2. Using a NULL as one of the parameters, the other DATE or
TIME constitutes a single point in time versus a range.

System Calendar
Compatibility: Teradata Extension
Also in V2R3, Teradata has a system calendar that is very helpful when
date comparisons more complex than month, day and year are
needed. For example, most businesses require comparisons from
1st quarter to 2nd quarter. It is best used to avoid maintaining your
own calendar table or performing your own sophisticated SQL
calculations to derive the needed date perspective.
Teradata's calendar is implemented using a base date table named
caldates with a single column named CDATES. The base table is never
referenced directly. Instead, it is accessed through the view named
CALENDAR. The base table contains rows with dates January 1, 1900
through December 31, 2100. The system calendar table and views are
stored in the Sys_calendar database. This is a calendar from January
through December and has nothing to do with fiscal calendars.
The purpose of the system calendar is to provide an easy way to
compare dates. For example, comparing activities from the first
quarter of this year with the same quarter of last year can be quite
valuable. The System Calendar makes these comparisons easy
compared to trying to figure out the complexity of the various dates.
The list below contains the column names, their respective data
types, and a brief explanation of the potential values calculated for
each when using the CALENDAR view:
Column Name          Data Type  Description
calendar_date        DATE       Standard Teradata date
                                Equivalency: DATE
day_of_week          BYTEINT    1-7, where 1 is Sunday
                                Equivalency: (DATE - DATE) MOD 7
day_of_month         BYTEINT    1-31, some months have less
                                Equivalency: DATE MOD 100 or EXTRACT Day
day_of_year          SMALLINT   1-366, Julian day of the year
                                Equivalency: None known
day_of_calendar      INTEGER    Number of days since 01/01/1900
                                Equivalency: DATE - 101(date)
weekday_of_month     BYTEINT    The sequence of a day within a month,
                                first Sunday=1, second Sunday=2, etc
                                Equivalency: None known
week_of_month        BYTEINT    0-5, sequential week number within a
                                month, partial week starts at 0
                                Equivalency: None known
week_of_year         BYTEINT    0-53, sequential week number within a
                                year, partial week starts at 0
                                Equivalency: None known
week_of_calendar     INTEGER    Number of weeks since 01/01/1900
                                Equivalency: (DATE - 101(date))/7
month_of_quarter     BYTEINT    1-3, each quarter has 3 months
                                Equivalency: CASE EXTRACT Month
month_of_year        BYTEINT    1-12, up to 12 months per year
                                Equivalency: DATE/100 MOD 100 or EXTRACT Month
month_of_calendar    INTEGER    Number of months since 01/01/1900
                                Equivalency: None needed
quarter_of_year      BYTEINT    1-4, up to 4 quarters per year
                                Equivalency: CASE EXTRACT Month
quarter_of_calendar  INTEGER    Number of quarters since 01/01/1900
                                Equivalency: None needed
year_of_calendar     SMALLINT   Starts at 1900
                                Equivalency: EXTRACT Year
It appears that the least useful of these columns are all the names
that end with "_of_calendar." As seen in the above descriptions, these
values are all calculated starting at the calendar reference date of
January 1, 1900. Unless a business transaction occurred on that date,
they are meaningless.
The biggest benefit of the System Calendar is for determining the
following: Day of the Week, Week of the Month, Week of the Year,
Month of the Quarter and Quarter of the Year.
Most of the values are very straightforward. However, the column
called Week_of_Month deserves some discussion. The description
indicates that a partial week is week number 0. A partial week is any
first week of a month that does not start on a Sunday. Therefore, not
all months have a week 0 because some do start on Sunday.
Having these column references available, there is less need to make
as many compound comparisons in SQL. For instance, to simply
determine a quarter requires 3 comparisons, one for each month in
that quarter. Worse yet, each quarter of the year will have 3 different
months. Therefore, the SQL might require modification each time a
different quarter was desired.
The next SELECT uses the System Calendar to obtain the various date
related rows for October 1, 2001:
1 Row Returned
calendar_date        01/10/01
day_of_week          2
day_of_month         1
day_of_year          274
day_of_calendar      37164
weekday_of_month     1
week_of_month        0
week_of_year         39
week_of_calendar     5309
month_of_quarter     1
month_of_year        10
month_of_calendar    1222
quarter_of_year      3
quarter_of_calendar  408
year_of_calendar     2001
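A request for every calendar column of a single date might look like this sketch, which uses the Sys_calendar database and CALENDAR view named above. The ANSI date literal is an assumption about how the date was supplied.

```sql
-- One row of the view, materialized for October 1, 2001.
SELECT *
FROM   Sys_calendar.Calendar
WHERE  calendar_date = DATE '2001-10-01';
```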
Since the calendar is a view, it is used like any other table and
columns are selected or compared from it. However, not all columns of
all rows are needed for every application. It will also be faster than
a user created calendar. The primary reason for this is due to reduced
input requirements.
Each date is only 4 bytes stored as DATE. The desired column values
are materialized from the stored date. It makes sense that less IO
equates to a faster response. So, 4 bytes per date are read instead of
32 or more bytes per date needed. There may be hundreds of different
dates in a table with millions of rows. Therefore, utilizing the Teradata
system calendar makes good sense.
Since the system calendar is a view or virtual table, its primary access
is via a join to a stored date (i.e. billing or payment date). Whether
the date is the current date or a stored date, it can be joined to the
calendar. When a join is performed, a row is materialized in cache to
represent the various aspects of that date.
The following examples demonstrate the use of the WHERE clause for
these comparisons using quarters instead of months (WHERE
Quarter_of_Year = 1 vs. WHERE Month_of_Year = 1 OR Month_of_Year = 2
OR Month_of_Year = 3) and the Day_of_week column instead of
DATE MOD 7 to simplify coding:
1 Row Returned
Order_date  Order_total  Quarter_of_Year  Week_of_Month
99/09/09    $23,454.84   3                1
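A join between stored order dates and the calendar view could be sketched as follows. The table name Order_table and the exact WHERE conditions are assumptions; with the sample orders, only the September 9, 1999 order falls in the third quarter.

```sql
-- Join on the date, then filter with calendar columns
-- instead of compound month comparisons.
SELECT Order_date
      ,Order_total
      ,Quarter_of_Year
      ,Week_of_Month
FROM   Order_table                      -- assumed table name
INNER JOIN Sys_calendar.Calendar
   ON  Order_date = calendar_date
WHERE  Quarter_of_Year = 3
  AND  Week_of_Month   = 1;
```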
As nice as it is to have a number that represents the day of the week,
it still isn't as clear as it might be with a little creativity.
This CREATE TABLE builds a table called Week_Days and populates it
with the English name of the week days:
Once the table is available, it can be incorporated into SQL to make
the output easier to read and understand, like the following:
2 Rows Returned
Order_date  Order_total  Day_of_Week  Wkday_Day
99/09/09    $23,454.84   5            Thursday
99/10/01    $5,111.47    6            Friday
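The Week_Days table and its use might be sketched as below. Only the table name Week_Days and the output column Wkday_Day come from the text; the key column Wkday_No, the data types, the table name Order_table and the WHERE filter are all assumptions for illustration.

```sql
CREATE TABLE Week_Days
( Wkday_No   BYTEINT      -- hypothetical key: 1 = Sunday ... 7 = Saturday
 ,Wkday_Day  CHAR(9) )
UNIQUE PRIMARY INDEX (Wkday_No);

INSERT INTO Week_Days VALUES (1, 'Sunday');
INSERT INTO Week_Days VALUES (2, 'Monday');
INSERT INTO Week_Days VALUES (3, 'Tuesday');
INSERT INTO Week_Days VALUES (4, 'Wednesday');
INSERT INTO Week_Days VALUES (5, 'Thursday');
INSERT INTO Week_Days VALUES (6, 'Friday');
INSERT INTO Week_Days VALUES (7, 'Saturday');

-- Translate the calendar's day number into a readable name.
SELECT Order_date
      ,Order_total
      ,Day_of_Week
      ,Wkday_Day
FROM   Order_table                      -- assumed table name
INNER JOIN Sys_calendar.Calendar
   ON  Order_date = calendar_date
INNER JOIN Week_Days
   ON  Day_of_Week = Wkday_No
WHERE  Quarter_of_Year IN (3, 4)        -- assumed filter
  AND  Day_of_Week IN (5, 6);
```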
As demonstrated in this chapter, there are many ways to incorporate
dates and date logic into SQL. The format of the date can be adjusted
using the DATEFORM. The SQL may use ANSI functions or Teradata
capabilities and functions. Now you are ready to go back and forth with
a date (pun intended).

Transforming Character Data
Most of the time, it is acceptable to display data directly as it is stored
in the database. However, there are times when it is not acceptable
and the character data must be temporarily transformed. It might
need shortening or something as simple as eliminating undesired
spaces from a value. The tools to make these changes are discussed
here.
Earlier, we saw the CAST function as a technique to convert data. It
can be used to truncate data unless running in ANSI mode, which does
not allow truncation. These functions provide an alternative to using
CAST, because they do not truncate data. Instead, they allow a portion
of the data to be returned. This is a slight distinction, but enough to
allow the processing to provide some interesting capabilities.
We will examine the CHARACTERS, TRIM, SUBSTRING, SUBSTR,
POSITION and INDEX functions. Alone, each function provides a
capability that can be useful within SQL. However, when combined,
they provide some powerful functionality.
This is an excellent time to remember one of the primary differences
between ANSI mode and Teradata mode. ANSI mode is case sensitive
and Teradata mode is not. Therefore, the output from most of these
functions is shown here in both modes.

CHARACTERS Function
Compatibility: Teradata Extension
The CHARACTERS function is used to count the number of characters
stored in a data column. It is easiest to use and the most helpful when
the characters being counted are stored in a variable length
VARCHAR column. A VARCHAR stores only the characters input and no
trailing spaces after the last non-space character.
When referencing a fixed length CHAR column, the CHARACTERS
function always returns a number that represents the maximum
number of characters defined. This is because the database must store
the data and pad to the full length using literal spaces. A space is a
valid character and therefore, the CHARACTERS function counts every
space.
The syntax of the CHARACTERS function:
CHARACTERS ( <column-name> )
Or
CHAR ( <column-name> )
To use the CHARACTERS (can be abbreviated as CHAR) function,
simply pass it a column name. When referenced in the SELECT list, it
displays the number of characters. When written into the WHERE
clause, it can be used as a comparison value to decide whether or not
the row should be returned.
The Employee table is used to demonstrate the functions in this
chapter. The contents of this table are listed below:
Employee Table - contains 9 employees
Employee_No  Last_Name   First_name  Salary     Dept_No
PK                                              FK
UPI          NUSI                               NUSI
1232578      Chambers    Mandee      48,850.00  100
1256349      Harrison    Herbert     54,500.00  400
2341218      Reilly      William     36,000.00  400
2312225      Larkins     Loraine     40,200.00  300
2000000      Jones       Squiggy     32,800.50  ?
1000234      Smythe      Richard     64,300.00  10
1121334      Strickling  Cletus      54,500.00  400
1324657      Coffing     Billy       41,888.88  200
1333454      Smith       John        48,000.00  200
Figure 9-1
The next SELECT demonstrates how to code using the CHAR function in
both the SELECT list as well as in the WHERE, plus the answer set:
4 Rows Returned
First_name  C_length
Mandee      6
Cletus      6
Billy       5
John        4
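A statement producing this answer set might be sketched as follows. The table name Employee_table and the comparison value are assumptions; the four names returned are the only first names in Figure 9-1 shorter than 7 characters, and First_name is assumed to be a VARCHAR column.

```sql
-- CHARACTERS in the SELECT list shows the length;
-- in the WHERE clause it decides which rows qualify.
SELECT First_name
      ,CHARACTERS(First_name) AS C_length
FROM   Employee_table                   -- assumed table name
WHERE  CHARACTERS(First_name) < 7;
```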
If there are leading and imbedded spaces stored within the column,
the CHAR function counts them as valid or significant data characters.
The answer is exactly the same using CHAR in the SELECT list and the
alias in the WHERE instead of repeating the CHAR function:
4 Rows Returned
First_name  C_length
Mandee      6
Cletus      6
Billy       5
John        4
As mentioned earlier, the CHAR function works best on VARCHAR data.
The following demonstrates its result on CHAR data by retrieving the
last name and the length of the last name where the first name
contains fewer than 7 characters:
4 Rows Returned
Last_name   C_length
Chambers    20
Coffing     20
Smith       20
Strickling  20
Again, the space characters are present in the data and therefore
counted. Hence, all the last names are 20 characters long. The
comparison is on the first name but the display is based entirely on the
last name.
The CHAR function is helpful for determining demographic information
regarding the VARCHAR data stored within the Teradata database.
However, sometimes this same information is needed on fixed length
CHAR data. When this is the case, the TRIM function is helpful.

CHARACTER_LENGTH Function
Compatibility: ANSI
The CHARACTER_LENGTH function is used to count the number of
characters stored in a data column. It is the ANSI equivalent of the
Teradata CHARACTERS function
available in V2R4. Like CHARACTERS, it's easiest to use and the most
helpful when the characters being counted are stored in a variable
length VARCHAR column. A VARCHAR stores only the characters input
and no trailing spaces.
When referencing a fixed length CHAR column, the
CHARACTER_LENGTH function always returns a number that
represents the maximum number of characters defined. This is
because the database must store the data and pad to the full length
using literal spaces. A space is a valid character and therefore, the
CHARACTER_LENGTH function counts every space.
The syntax of the CHARACTER_LENGTH function:
CHARACTER_LENGTH ( <column-name> )
To use the CHARACTER_LENGTH function, simply pass it a column
name. When referenced in the SELECT list, it displays the number of
characters. When written into the WHERE clause, it can be used as a
comparison value to decide whether or not the row should be returned.
The contents of the same Employee table above are also used to
demonstrate the CHARACTER_LENGTH function.
The next SELECT demonstrates how to code the
CHARACTER_LENGTH function in both the SELECT list and the
WHERE clause, plus the answer set:
4 Rows Returned
First_name  C_length
Mandee      6
Cletus      6
Billy       5
John        4
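The ANSI form of this query could be sketched as follows. The original statement is missing from this copy, so the table name Employee_table and the threshold of 7 are assumptions chosen to match the answer set.

```sql
-- Sketch only: ANSI spelling of the length function, assumed table name.
SELECT  First_name
       ,CHARACTER_LENGTH(First_name) AS C_length
FROM    Employee_table
WHERE   CHARACTER_LENGTH(First_name) < 7;
```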
If there are leading and embedded spaces stored within the column,
the CHARACTER_LENGTH function counts them as valid or significant
data characters.
As mentioned earlier, the CHARACTER_LENGTH function works best on
VARCHAR data. The following demonstrates its result on CHAR data by
retrieving the last name and the length of the last name where the
first name contains fewer than 7 characters:
4 Rows Returned
Last_name   C_length
Chambers    20
Coffing     20
Smith       20
Strickling  20
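Written with CHARACTER_LENGTH, a query along these lines would give such a result. As before, Employee_table and the first-name condition are assumptions reconstructed from the rows shown.

```sql
-- Sketch only: display is on the CHAR column, filter on the VARCHAR column.
SELECT  Last_name
       ,CHARACTER_LENGTH(Last_name) AS C_length
FROM    Employee_table
WHERE   CHARACTER_LENGTH(First_name) < 7;
```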
Again, the space characters are present in the data and therefore
counted. Hence, all the last names are 20 characters long. The
comparison is on the first name but the display is based entirely on the
last name.
The CHARACTER_LENGTH function is helpful for determining
demographic information regarding the VARCHAR data stored within
the Teradata database. However, sometimes this same information is
needed on fixed length CHAR data. When this is the case, the
TRIM function is helpful.

OCTET_LENGTH Function
Compatibility: ANSI
The OCTET_LENGTH function is used to count the number of
octets (bytes) stored in a data column; for single-byte character
data this is the same as the number of characters. It is another ANSI
equivalent of the Teradata CHARACTERS function available in V2R4.
Like CHARACTERS, it's easiest to use and the most helpful when the
characters being counted are stored in a variable length
VARCHAR column. A VARCHAR stores only the characters input and no
trailing spaces.
When referencing a fixed length CHAR column, the
OCTET_LENGTH function always returns a number that represents the
maximum number of characters defined. This is because the database
must store the data and pad to the full length using literal spaces. A
space is a valid character and therefore, the OCTET_LENGTH function
counts every space.
The syntax of the OCTET_LENGTH function:
OCTET_LENGTH ( <column-name> )
To use the OCTET_LENGTH function, simply pass it a column name.
When referenced in the SELECT list, it displays the number of
characters. When written into the WHERE clause, it can be used as a
comparison value to decide whether or not the row should be returned.
The contents of the same Employee table above are also used to
demonstrate the OCTET_LENGTH function.
The next SELECT demonstrates how to code the
OCTET_LENGTH function in both the SELECT list and the
WHERE clause, plus the answer set:
4 Rows Returned
First_name  C_length
Mandee      6
Cletus      6
Billy       5
John        4
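One way the OCTET_LENGTH version might be coded is sketched below. The table name Employee_table and the threshold are assumptions. The counts match the earlier examples only if the column holds single-byte character data.

```sql
-- Sketch only: assumed table name; result is a count of octets (bytes).
SELECT  First_name
       ,OCTET_LENGTH(First_name) AS C_length
FROM    Employee_table
WHERE   OCTET_LENGTH(First_name) < 7;
```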
If there are leading and embedded spaces stored within the column,
the OCTET_LENGTH function counts them as valid or significant data
characters.
As mentioned earlier, the OCTET_LENGTH function works best on
VARCHAR data. The following demonstrates its result on CHAR data by
retrieving the last name and the length of the last name where the
first name contains fewer than 7 characters:
4 Rows Returned
Last_name   C_length
Chambers    20
Coffing     20
Smith       20
Strickling  20
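For completeness, the fixed length case with OCTET_LENGTH could look like the sketch below. The table name and filter are again assumptions inferred from the answer set.

```sql
-- Sketch only: each pad space in the CHAR column is counted as one octet
-- when the data is single-byte.
SELECT  Last_name
       ,OCTET_LENGTH(Last_name) AS C_length
FROM    Employee_table
WHERE   OCTET_LENGTH(First_name) < 7;
```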
Again, the space characters are present in the data and therefore
counted. Hence, all the last names are 20 characters long. The
comparison is on the first name but the display is based entirely on the
last name. The OCTET_LENGTH function is helpful for determining
demographic information regarding the VARCHAR data stored within
the Teradata database. However, sometimes this same information is
needed on fixed length CHAR data. When this is the case, the
TRIM function is helpful.
