Included in This Chapter:

- The Relational Model
- What is a Database?
- Along Came Codd
- Data Structure
- Data Integrity
- Data Manipulation

The database is ubiquitous in modern information processing; data is stored in collections of some type in nearly every application of note. It was the automation of mammoth collections of data and the commensurate ability to sort, search, and maintain those records that led to the advancement, at breakneck speed, of the powerful computer hardware that has become common today. Data is power, and organizations have come to recognize that the ability to harness these information resources is not only a competitive edge, but critical to their survival.

Initially modeled after the hierarchical structure of the paper files that it replaced, the computerized database has been through numerous incarnations in a search for the most efficient method of maintaining the integrity and efficiency of the data sets. In the 1970s, Dr. E. F. Codd proposed a database model based on mathematical objects known as relations and the processes that can be applied to them. This intellectual treatise became known as the Relational Database Model, and it has revolutionized the design and development of data management software.

Delphi, with its tight binding to the Borland Database Engine, allows the developer unprecedented freedom in the design and creation of software that works with database resources on nearly all platforms. This freedom, however, comes at a heavy cost. To reap the benefits of this powerful environment, it is up to the developer to create a design that combines the beneficial aspects of the freedom with the discipline of the relational model. Before diving headlong into the nuts and bolts of Delphi coding, it is important for you to form a solid foundation of knowledge in the theory and reasoning that drive this specialized area of programming. Database development is easy to do and even easier to do badly. The first three chapters of this book are designed to provide you with the background skills needed to make effective design choices, decisions that let your software derive the maximum benefits from the wide variety of tools and implementation choices provided by Borland.

The Relational Model

The initial step to take in this foray is to establish clearly what the relational model is. Based on the certainty of mathematics, the defining rules for the relational model describe a system that provides consistent and accurate results in all conditions while maintaining the integrity of the data elements as the highest priority. An important note to make at this point is that the relational model is not a blueprint for the physical implementation of data storage. How closely a particular vendor chooses to apply the rules is up to that vendor. The RDBMSs (Relational Database Management Systems) and development tools available vary widely in how closely they are bound to the relational rules in their efforts to generate greater performance from their storage and retrieval mechanisms. Delphi, as a development tool, and the Borland Database Engine (BDE) in particular, allow a great deal of freedom in how a database is built using the product. Understanding the rules of the road that follow will allow your efforts to benefit from the data integrity offered by the relational model and the internal power of the physical engine driving your database.


What is a Database?

The term “database” has been applied in a number of different ways, many specific to the development context in which the word is mentioned. Ignoring the marketing-driven terminology for the moment, a database is best described as simply a collection of related data. The relationship is defined by some natural or forced affinity between the items, or records, that make up the collection. Figure 1.1 shows a file of invoices, all related by what they represent and the shared information contained within each invoice.

Figure 1.1 The paper file system was emulated in the first database systems. (Invoices shown: INVOICE 1098, INVOICE 1322, INVOICE 1779.)

Chapter 1 - Introduction to the Relational Database

A computerized database is an electronic representation of this file. It is a repository for electronically storing data records, each record being a set of the individual data elements that describe each item that the records are modeled upon.

The simplest form of a database is a single file that contains all of the data. Similar to a paper file containing certain records, there are actions that will commonly be performed with the computerized file:

- Adding new, empty records to the database
- Inserting records into a specific position within the database
- Deleting unneeded records
- Retrieving specific records from the database
- Modifying the records contained in the database

The flat file database in Figure 1.2 contains the names of our friends.

FRIENDS
Geurii     Sharon    722 Boston Ln    909 221-4456
Rollins    Hank      PO Box 89        818 986-2091
Geurii     Tom       722 Boston Ln    909 221-4456
Wayne      Marion    Warhawk Dr       303 458-8095

Figure 1.2 The FRIENDS flat file database
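The five record operations listed above can be sketched against an in-memory list standing in for the flat file. This is only an illustration: the `FlatFile` class, its method names, and the field names (`last`, `first`, `address`, `phone`) are assumptions made for the example and do not come from the text; the sample rows are those of Figure 1.2.

```python
from dataclasses import dataclass, field


@dataclass
class FlatFile:
    """A toy flat file: one ordered list of records, with no keys or relations."""
    records: list = field(default_factory=list)

    def add(self, record):
        # Add a new record at the end of the file.
        self.records.append(record)

    def insert(self, position, record):
        # Insert a record at a specific position within the file.
        self.records.insert(position, record)

    def delete(self, position):
        # Delete the record found at the given position.
        del self.records[position]

    def retrieve(self, **criteria):
        # Scan record by record, keeping those that match every criterion.
        return [r for r in self.records
                if all(r.get(k) == v for k, v in criteria.items())]

    def modify(self, position, **changes):
        # Change fields of one record, addressed by its position.
        self.records[position].update(changes)


friends = FlatFile()
friends.add({"last": "Geurii", "first": "Sharon",
             "address": "722 Boston Ln", "phone": "909 221-4456"})
friends.add({"last": "Rollins", "first": "Hank",
             "address": "PO Box 89", "phone": "818 986-2091"})
friends.add({"last": "Geurii", "first": "Tom",
             "address": "722 Boston Ln", "phone": "909 221-4456"})
friends.add({"last": "Wayne", "first": "Marion",
             "address": "Warhawk Dr", "phone": "303 458-8095"})

print([r["first"] for r in friends.retrieve(last="Geurii")])  # ['Sharon', 'Tom']
```

Note that every operation addresses records one at a time, by position or by a scan; nothing in the structure knows that the two Geurii records are related.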

Part I - The Relational Database

Each record in this table contains one entry about each of the people that we know. Despite the advantages provided by its simplicity, the flat file poses a number of different problems for the developer and user alike. For example, because all instances of a “friend” must be contained in a single file, the possibility of update anomalies affecting the integrity of the data is great. Notice that two records are needed to maintain the telephone number for the Geurii family members; two records are needed because both Tom and Sharon are our friends and this is the only way of representing both of their names. Modifying the telephone number that they share requires two discrete operations, one to modify Tom’s record and a second to modify Sharon’s. Due to the duplication of data, the possibility exists that both entries may not be updated correctly, or at all, leading to problems with the integrity of the data. The next time that the phone book database is used to telephone Sharon, we might discover that a disconnected telephone number has been stored rather than the correct exchange.

A common problem encountered with paper-based files also appears as a glaring weakness in the flat file. If the records are filed according to the spelling of the last name, a simple misspelling causes otherwise matching records to be separated. Tom might be filed correctly as Geurii but Sharon’s record could easily be misspelled as Guerii. While this problem is certainly not limited to the flat file paradigm, the record-by-record methods that are required when updating data compound this problem. If the update action relies on a search of last names, only one of the Geurii family records will be updated with their new address, causing a bill to be sent to the old address and never received.
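The update anomaly described in the last two paragraphs can be reproduced in a few lines. The sketch below assumes a list of dictionaries as the flat file; the function name `update_phone_by_last_name`, the field names, and the new telephone number are invented for the illustration.

```python
# The shared phone number is stored twice, once per family member.
friends = [
    {"last": "Guerii", "first": "Sharon", "phone": "909 221-4456"},  # misspelled on entry
    {"last": "Rollins", "first": "Hank", "phone": "818 986-2091"},
    {"last": "Geurii", "first": "Tom", "phone": "909 221-4456"},
    {"last": "Wayne", "first": "Marion", "phone": "303 458-8095"},
]


def update_phone_by_last_name(records, last, new_phone):
    """Record-by-record update; returns how many records were changed."""
    changed = 0
    for record in records:
        if record["last"] == last:
            record["phone"] = new_phone
            changed += 1
    return changed


# The family gets a new number (a made-up value for this example).
changed = update_phone_by_last_name(friends, "Geurii", "909 555-0100")
phones = {r["first"]: r["phone"] for r in friends}

print(changed)           # 1
print(phones["Tom"])     # 909 555-0100
print(phones["Sharon"])  # 909 221-4456
```

Only Tom's record is found by the search, so Sharon's entry silently keeps the old number. Storing the shared fact once, rather than once per person, is what removes this class of error.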

Flat file database implementations are usually tightly bound to the physical storage method used to place the records on disk. Because of this, the user access methods provided for working with the table closely match their actual implementation methods. The user perceives that flat files are being accessed with a record-by-record approach. The database access methods are coded in such a way that each record is approached discretely, not recognizing a connection between the related records.

The flat file database was a logical intermediate solution when the leap was made from paper-based record keeping to the electronic storage of data. The problems that plagued paper records followed the data to disk and the speed at which the technology brought the problems to the surface made it clear that a new approach was necessary.


Along Came Codd

In 1968, while working as a researcher for IBM, Dr. E. F. Codd began to examine the possibility of applying the uniformity and consistency of mathematics to the undisciplined field of database management. His research resulted in the now-classic paper, “A Relational Model of Data for Large Shared Data Banks,” which was published in 1970 and ...

Columns are Single Valued

This first property is of primary importance, as it sets the stage for many of the later properties and methods of the relational database. This rule implies that there are no repeating groups in the table, a repeating group being duplicate entries in the table representing the same entity. The power, flexibility, and integrity of the relational database model are derived from the strict application of this standard.

Application of the single-value rule ensures that each row in the table is unique and data manipulation and retrieval operations can be designed to operate with mathematical precision. No gray areas need to be accounted for.

The VENDOR table displayed in Figure 1.3 shows a spreadsheet approach to managing the items supplied by each vendor.

Kingfish, Inc.      Smelt
Tidepool Company    Crab      Clam
The Crab Walk       Tuna      Shark      Swordfish
Devil Ray Foods     Oyster    Abalone    Clam

Figure 1.3 VENDOR table with repeating columns

Vendors have a single row devoted to their inventory. This design is problematic; attempting to design access logic to take into account the number of possible item columns would be a nightmare. Locating a specific item and the vendor that supplies it would be difficult at best. A better approach, enforced through the relational paradigm, is displayed in Figure 1.4.

Figure 1.4 VENDOR table with repeating columns removed

In this relation, none of the columns are repeated and it is now a simple matter to associate an item with a vendor.
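The move from repeating item columns to a single item column can be sketched as a small transformation. The code below is an illustration under stated assumptions: the function names are invented, and the assignment of items to vendors follows one reading of Figure 1.3.

```python
# One row per vendor, with a variable number of item columns (Figure 1.3 style).
vendor_wide = [
    ("Kingfish, Inc.", "Smelt", None, None),
    ("Tidepool Company", "Crab", "Clam", None),
    ("The Crab Walk", "Tuna", "Shark", "Swordfish"),
    ("Devil Ray Foods", "Oyster", "Abalone", "Clam"),
]


def remove_repeating_columns(rows):
    """Return one (vendor, item) row per item, skipping empty item slots."""
    return [(vendor, item)
            for vendor, *items in rows
            for item in items
            if item is not None]


def vendors_supplying(rows, item):
    """With a single item column, the lookup is one uniform comparison."""
    return [vendor for vendor, supplied in rows if supplied == item]


vendor_items = remove_repeating_columns(vendor_wide)

print(len(vendor_items))                        # 9
print(vendors_supplying(vendor_items, "Clam"))  # ['Tidepool Company', 'Devil Ray Foods']
```

In the wide layout the same question requires checking every item column in turn, and the access logic must change whenever another item column is added.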
Entries in Each Column are from the Same Domain

A domain is the defined set of values from which the data in a column can be drawn. This pool of values describes the entire range of acceptable data elements. For example, the domain of the column Employee-ID is the range of valid employee ID numbers. Items not belonging to that collection of values, such as the employee’s last name or an ID number beyond or below the valid range, fall outside of the domain for that column.

With a domain for each column defined, the user is rewarded with consistency in their data. Since the columns make up the attributes, or description, of each row added to the table, the user knows that each row retrieved will have the same description. Applying this rule also simplifies data validation. Because you will always apply this rule, business rules can more readily be applied to each data element.
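A domain can be modeled as a membership test that every value must pass before it is stored in a column. This is a minimal sketch: the `IntegerDomain` class, the `validate` helper, and the ID range 1000 to 9999 are assumptions chosen for the example, not values given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegerDomain:
    """The set of whole numbers between low and high, inclusive."""
    low: int
    high: int

    def __contains__(self, value):
        # Names, booleans, and out-of-range numbers all fall outside the domain.
        return (isinstance(value, int)
                and not isinstance(value, bool)
                and self.low <= value <= self.high)


# Hypothetical range of valid employee ID numbers.
EMPLOYEE_ID = IntegerDomain(1000, 9999)


def validate(value, domain, column):
    """Return the value unchanged, or raise if it is outside the column's domain."""
    if value not in domain:
        raise ValueError(f"{value!r} is outside the domain of {column}")
    return value


print(1234 in EMPLOYEE_ID)     # True
print("Smith" in EMPLOYEE_ID)  # False
print(10000 in EMPLOYEE_ID)    # False
```

Because the check is attached to the column rather than to each routine that writes to it, every row is held to the same description.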

Each Row is Unique

No two rows in a relational table are identical; at least one column, or combination of columns, uniquely identifies the contents of each row. The power of this property should not be underestimated. In a non-relational data collection, it is possible, and even probable, that there is considerable duplication of information. Retrieval operations must then be designed to seek and compare, record by record, the retrieval request against all of the columns in the table in order to satisfy the request. Modification and manipulation requests are equally at risk; slight differences among a number of conceptually matching records lead to the possibility that they will not be selected for updating.

The uniqueness of each row is enforced by the presence of a key value. A more extensive discussion of key values is presented in Chapter 2, but for now, the definition of a primary key is a column, or a minimal set of columns, that uniquely describes each row. Two simple rules apply to the primary key of a relation. First, it cannot be null. A null value breaks the rules immediately; it does nothing to describe the row. Second, the primary key must be unique within the table. A duplicate key value would indicate duplicated data in other fields, destroying the integrity of the relational operations.

The existence of a key value gives the designers of the data access and manipulation operations the freedom to build their routines around fast and simple search methods. With the uniqueness characteristic guaranteed, the access method can be confident that once it has located a key value that matches the retrieval request, all of the rows matching the request have been located and returned to the user.
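The two key rules, and the fast lookup they make possible, can be sketched with a dictionary indexed by the key value. The `KeyedTable` class and the column name `vendor_id` are assumptions made for this example; the vendor numbers are the ones that appear later in Figure 1.6.

```python
class KeyedTable:
    """Rows stored under a primary key that must be present and unique."""

    def __init__(self, key_column):
        self.key_column = key_column
        self._rows = {}

    def insert(self, row):
        key = row.get(self.key_column)
        if key is None:
            raise ValueError("the primary key cannot be null")
        if key in self._rows:
            raise ValueError(f"duplicate primary key: {key!r}")
        self._rows[key] = dict(row)

    def find(self, key):
        # One keyed lookup; uniqueness means no other row can match.
        return self._rows.get(key)


vendors = KeyedTable("vendor_id")
for vendor_id, name in [(8213, "Kingfish, Inc."), (9471, "Tidepool Company"),
                        (7790, "The Crab Walk"), (9022, "Devil Ray Foods")]:
    vendors.insert({"vendor_id": vendor_id, "name": name})

print(vendors.find(9471)["name"])  # Tidepool Company
```

Contrast this with the flat file, where a retrieval has to scan every record because any number of them might match.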

Referring back to Figure 1.3, we see that the Vendor column uniquely identified each of the rows. With the conversion performed to create the table in Figure 1.4, the database now is at risk of containing duplicate rows. As we will see in Chapter 2, it’s easy to design a solution to meet both criteria.

The Sequence of the Columns is Insignificant

By not placing any importance on the order of the columns, the relational table is ensuring that there is no hidden meaning in the order of the data in each row. Without concern for the sequence of the columns, the database user is free to retrieve the data in any combination or order, viewing the data as they see fit, not as enforced by the database.

The VENDOR table in Figure 1.5 is shown in two configurations.

Kingfish, Inc.      366 Old Salt Lane    Coos Bay          OR    90333
Tidepool Company    Keiko Avenue         Reedsport         OR    90421
The Crab Walk       El Perro Blvd        Lincoln City      OR    90231
Devil Ray Foods     Navy Way             Rockaway Beach    OR    90311

90333    OR    Coos Bay          366 Old Salt Lane    Kingfish, Inc.
90421    OR    Reedsport         Keiko Avenue         Tidepool Company
90231    OR    Lincoln City      El Perro Blvd        The Crab Walk
90311    OR    Rockaway Beach    Navy Way             Devil Ray Foods

Figure 1.5 Insignificance of column order

The sequence of columns in version two is reversed, but the underlying data remains the same. Though the columnar sequence has changed, each row still represents the same entity. In your database and code designs it is important to maintain this condition. Designing data access routines that rely on an artificial sequencing of the columns in your data tables will result in maintenance difficulties prevented by the relational database design.
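The point about column order can be checked directly: when rows are addressed by column name, the two configurations compare as equal, while position-based access sees them as different. The column names used here (`name`, `street`, `city`, `state`, `zip`) are assumptions, since the figure shows no headings, and only two of the four rows are used to keep the example short.

```python
columns_v1 = ("name", "street", "city", "state", "zip")
rows_v1 = [
    ("Kingfish, Inc.", "366 Old Salt Lane", "Coos Bay", "OR", "90333"),
    ("Tidepool Company", "Keiko Avenue", "Reedsport", "OR", "90421"),
]

# Version two: the same data with the column sequence reversed.
columns_v2 = tuple(reversed(columns_v1))
rows_v2 = [tuple(reversed(row)) for row in rows_v1]


def as_records(columns, rows):
    """Pair each value with its column name so position no longer matters."""
    return [dict(zip(columns, row)) for row in rows]


records_v1 = as_records(columns_v1, rows_v1)
records_v2 = as_records(columns_v2, rows_v2)

print(records_v1 == records_v2)   # True
print(rows_v1 == rows_v2)         # False
print(records_v2[0]["name"])      # Kingfish, Inc.
```

A routine written as `row[0]` breaks as soon as the layout changes; one written as `record["name"]` does not.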
:’
The Sequence of the Rows is Insignificant

Of even less consequence is the ordering of the rows in a table. Without any encumbrance placed on the user by the order of the rows, the user is able to resequence the data to be meaningful to them. The sort order of any resulting table can be determined by the data contained in any of the columns.

Each of the Columns is Uniquely Named

The user will refer to the columns of a relational table by name rather than by column position; because of this, each of the columns in a table must have a unique description. The benefits from this characteristic are twofold. First, this will ensure that the data requested by the user is the data that is received by the user. Secondly, since the physical positioning of the column is unimportant, the table can be easily modified without destroying the work done by the user in the past. These same benefits are not achieved using access and manipulation methods that are dependent upon a column being located in a specific position within the table.

Data Integrity

One of the most critical aspects of any data management scheme is the integrity of the data and how well its internal methods ensure it. The integrity rules of the relational database model focus on constraining the values contained within the columns of each table; without the constraints, values in the columns would be free to assume incorrect values. Simplicity is the key in this regard; there are only two rules that encompass the whole of the integrity constraints, the entity integrity rule and the referential integrity rule.

The two relational integrity rules represent business rules, not technical considerations. The relational database definition requires that the tools needed to maintain the integrity of the database be an integral part of any relational implementation. At the same time, the user should never need to be concerned with them from a technical standpoint; once set through technical means, integrity should be transparent to the end user.
Entity Integrity

The first rule is simply a repeat of the earlier structure requirement that each row in a table be unique. The entity integrity rule requires that no part of the primary key of a table be allowed to accept null values. Blank data fields do not lend themselves to an adequate description of the row, so requiring data elements in the column or columns that make up the key to the row makes perfect sense. Requiring that the key for each row be unique fulfills the mission of the relational model.
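Entity integrity is normally declared once and then enforced by the database itself. The sketch below uses Python's standard `sqlite3` module rather than the BDE, and the `vendor_item` table with its two-column key is a hypothetical design invented for the example. SQLite needs an explicit NOT NULL on each key column, because it otherwise tolerates nulls in most primary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vendor_item (
        vendor TEXT NOT NULL,         -- no part of the key may be null
        item   TEXT NOT NULL,
        PRIMARY KEY (vendor, item)    -- the key as a whole must be unique
    )
    """
)


def try_insert(vendor, item):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO vendor_item (vendor, item) VALUES (?, ?)",
            (vendor, item),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("Tidepool Company", "Clam")      # accepted
null_part = try_insert("Tidepool Company", None)    # rejected: part of the key is null
duplicate = try_insert("Tidepool Company", "Clam")  # rejected: key already exists
other = try_insert("Devil Ray Foods", "Clam")       # accepted: the full key differs

print(first, null_part, duplicate, other)  # True False False True
```

The calling code never inspects the key itself; the rule is part of the table definition, which is the transparency the text asks for.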
Referential Integrity

The rule regarding referential integrity introduces a new topic to the discussion, the foreign key. A foreign key is a column or a combination of columns that serve as the primary key in another table. The reason it exists in the table of focus is to provide a method of linkage between the two tables. The relationship between the tables is often described as a parent-child relationship, and, as shown in Figure 1.6, this relationship must be maintained at all times.

VENDOR (Parent)
Kingfish, Inc.      8213
Tidepool Company    9471
The Crab Walk       7790
Devil Ray Foods     9022

PRODUCTS (Child)
130    8213    Smelt
230    9471    Crab

Figure 1.6 The parent-child relationship

The parent-child relationship is displayed in Figure 1.6. The VENDOR table contains one entry for each of the suppliers that we do business with; each row is unique. The number of products purchased from a supplier usually is greater than one. Each of the items in the PRODUCTS table is linked to its parent record, which contains the shipping information for each vendor, through the Vendor ID field. As in real life, no child can exist without a parent. No products can be entered without having a vendor ID, which can only be generated by an entry in the VENDOR table. Likewise, referential integrity will prevent a vendor from being deleted while there are still items in the products table using its vendor ID.
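The parent-child rule of Figure 1.6 can be tried out with Python's standard `sqlite3` module. This is a sketch, not the BDE mechanism the book goes on to use: the column names and the `attempt` helper are assumptions, and the product number 330 and vendor number 1111 are invented to provoke a violation. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute(
    "CREATE TABLE vendor (vendor_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        vendor_id  INTEGER NOT NULL REFERENCES vendor (vendor_id),
        item       TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO vendor VALUES (?, ?)",
    [(8213, "Kingfish, Inc."), (9471, "Tidepool Company"),
     (7790, "The Crab Walk"), (9022, "Devil Ray Foods")],
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(130, 8213, "Smelt"), (230, 9471, "Crab")],
)


def attempt(sql, params=()):
    """Run a statement; report False if an integrity rule rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A child row whose vendor ID has no parent row.
orphan_ok = attempt("INSERT INTO products VALUES (?, ?, ?)", (330, 1111, "Tuna"))
# A parent that still has children.
delete_parent_ok = attempt("DELETE FROM vendor WHERE vendor_id = ?", (8213,))
# A parent with no children.
delete_childless_ok = attempt("DELETE FROM vendor WHERE vendor_id = ?", (7790,))

print(orphan_ok, delete_parent_ok, delete_childless_ok)  # False False True
```

Both the orphaned product and the deletion of a vendor that still has products are refused by the database, without any checking code in the application.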

Data Manipulation

The third aspect of the relational database model concerns itself with the manipulation of data. The model defines two categories of operations that can be performed using the relations:

1. Assignment of relations to other relations

2. Manipulation of the data using eight defined operators

Both categories are in reality intertwined. Data manipulation operations such as Select result in the selected data being placed into a new table.

The eight relational database operators share two characteristics. First, the relational operators are set processing commands; they apply to and result in relational tables. The second characteristic is that the operators are unaffected by how the data is physically stored. Remember that the relational model calls for a strict separation of the logical and physical implementation. The operators to be discussed are:

- Select
- Project
- Product
- Join
- Union
- Intersection
- Difference
- Division

These operators are not intended to be a specification for a language. Rather, the intention of defining this set of operations is to protect the user from the burden of having to be familiar with the technical details of data manipulation and retrieval. SQL is covered extensively in Chapter 3, which also focuses on the operations supported by Delphi and BDE. The following pages are meant to act as an introduction and to round out the relational database definition.

Select

The select operation retrieves a set of rows into a new relation. This set is composed of rows in the base relation in which the column values match the criteria provided in the query.

Project

The project operator retrieves a subset of columns from a relational table, placing them into a new relation. In the process, it also removes duplicate rows from the result. The necessity for this initially seems odd, but consider the situation in which the key columns are not retrieved as a part of the operation. Without these values, it is realistic to expect that there may be some duplication among the values in the non-key columns of the resulting table. The project operation ensures unique values in the new table.

Product

The product operation pairs each row of one table with every row of a separate table, putting the two rows together in the resultant table. The new relation contains the columns of both base tables, making it as wide as the two combined.

Join

The join operation combines the Product and Select operations to produce the new relation. Rather than simply combining the columns of two tables, the rows to be combined are defined through a Select operation.
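The four operators described so far can be sketched over relations held as lists of dictionaries. This is an illustrative sketch, not the way Delphi or the BDE implements them: the function names, the column names, and the small `supplies` relation are assumptions, while the vendor and product rows follow Figure 1.6.

```python
def select(relation, predicate):
    """Rows of the relation for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Keep only the named columns, dropping duplicate rows."""
    result = []
    for row in relation:
        reduced = {c: row[c] for c in columns}
        if reduced not in result:
            result.append(reduced)
    return result


def product(left, right):
    """Pair every row of left with every row of right (column names must differ)."""
    return [{**l, **r} for l in left for r in right]


def join(left, right, predicate):
    """A product followed by a select on the combined rows."""
    return select(product(left, right), predicate)


vendor = [
    {"vendor_id": 8213, "name": "Kingfish, Inc."},
    {"vendor_id": 9471, "name": "Tidepool Company"},
    {"vendor_id": 7790, "name": "The Crab Walk"},
    {"vendor_id": 9022, "name": "Devil Ray Foods"},
]
products = [
    {"product_id": 130, "supplier_id": 8213, "item": "Smelt"},
    {"product_id": 230, "supplier_id": 9471, "item": "Crab"},
]
supplies = [
    {"vendor": "Tidepool Company", "item": "Crab"},
    {"vendor": "Tidepool Company", "item": "Clam"},
    {"vendor": "Devil Ray Foods", "item": "Clam"},
]

crab = select(products, lambda row: row["item"] == "Crab")
items = project(supplies, ["item"])  # "Clam" appears once, not twice
pairs = product(vendor, products)    # 4 x 2 = 8 combined rows
joined = join(vendor, products,
              lambda row: row["vendor_id"] == row["supplier_id"])

print(len(pairs))                         # 8
print(items)                              # [{'item': 'Crab'}, {'item': 'Clam'}]
print(project(joined, ["name", "item"]))
```

Each function takes relations and returns a new relation, so the operators compose freely: the join is literally a select applied to a product, and its result can be projected again.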

Union

The union operation vertically combines the data in the rows of one relation with the rows in another table, removing duplicate rows from the resulting table. Vertical combination requires that the columns for each of the base tables be defined exactly alike.

intersection

Difference

Division
Division
be defined exactly alike. intersection Difference Division The intersection of two tables is a relation containing
be defined exactly alike. intersection Difference Division The intersection of two tables is a relation containing
be defined exactly alike. intersection Difference Division The intersection of two tables is a relation containing
be defined exactly alike. intersection Difference Division The intersection of two tables is a relation containing

The intersection of two tables is a relation containing those rows that are common to both tables. The intersection operator evaluates the contents of matching columns in each table to determine if they are a match.

Difference

The difference of two tables is a relation that contains those rows that exist in one of the two tables but not the other. The unique rows of each relation are selected after a comparison of column values.
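The three row-wise operators map closely onto set operations, so a small sketch with Python sets may help. The two relations and their rows are invented for illustration; both are assumed to have identically defined columns, as the union operation requires.

```python
# Two hypothetical relations with the same columns: (vendor_id, city).
east = {(100, "Denver"), (200, "Boulder"), (300, "Pueblo")}
west = {(300, "Pueblo"), (400, "Aspen")}

union = east | west          # every row from both tables, duplicates removed
intersection = east & west   # rows common to both tables
difference = east - west     # rows in east that do not appear in west
either_only = east ^ west    # rows that appear in exactly one of the tables
```

Note that `east - west` and `west - east` give different answers; the `^` form collects the unique rows of both relations at once.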

Division

The division operator results in a relation that contains columns from one table for which there are other matching column values corresponding to every row in another table. In other words, a relation is going to be divided by another relation with the quotient being a new relation. Consider the example shown in Figure 1.7. Dividing the INGREDIENT relation by the RECIPE relation results in a new relation. The resulting table answers the question: Who can provide all of the ingredients for this recipe?
SUPPLIERS
100  Kingfish, Inc.    33c     Crab
100  Kingfish, Inc.    4 T     Tuna
200  The Crab Walk     c22     Crab
300  Devil Ray Foods   CRXXI   Crab
100  Kingfish, Inc.    v 3     Onion
300  Devil Ray Foods   VGXX9   Onion
200  The Crab Walk     BR      Bread Crumbs
300  Devil Ray Foods   Fuo<67  Bread Crumbs

RECIPE

RESULTS OF DIVISION
Devil Ray Foods

Figure 1.7 The division operator
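A minimal sketch of the division in Figure 1.7 follows. The supplier and ingredient pairs come from the SUPPLIERS table in the figure; the contents of the RECIPE relation are an assumption (crab, onion, and bread crumbs), and the `divide` helper is a hypothetical name used only here.

```python
# (supplier, ingredient) pairs from the SUPPLIERS table in Figure 1.7.
supplies = {
    ("Kingfish, Inc.", "Crab"),
    ("Kingfish, Inc.", "Tuna"),
    ("Kingfish, Inc.", "Onion"),
    ("The Crab Walk", "Crab"),
    ("The Crab Walk", "Bread Crumbs"),
    ("Devil Ray Foods", "Crab"),
    ("Devil Ray Foods", "Onion"),
    ("Devil Ray Foods", "Bread Crumbs"),
}

# Assumed contents of the RECIPE relation (not stated in the text).
recipe = {"Crab", "Onion", "Bread Crumbs"}

def divide(pairs, divisor):
    """Return the left-hand values that are paired with every divisor value."""
    candidates = {left for left, _ in pairs}
    return {c for c in candidates if all((c, d) in pairs for d in divisor)}

result = divide(supplies, recipe)
```

Under that assumed recipe, only the supplier paired with all three ingredients survives the division, which agrees with the single row shown under RESULTS OF DIVISION.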
The relational database model is the de facto standard for data management software and development tools. As a developer, it is critical that you understand how closely your choice of tools follows the rules of the relational paradigm. By understanding the theory behind the implementation, you are in a much better position to exploit the power of the database. The following two chapters continue to provide the bedrock knowledge that is the mark of the experienced database developer.

Chapter 2 discusses the practice of designing the relational database. Careful design, as in all software projects, pays off in the measure of integrity and performance evident in the final database.

Chapter 2: Logical and Physical Database Design

Included in This Chapter:

- Diagramming the Database
- Case Study
- Normalization
- Normalization Case Study
- Translating the Logical Design into a Physical Design


As with any programming effort, taking the time to plan and create a proper design pays off when the rubber hits the road and the time comes to develop a database application. The application creation process is simplified, as numerous problems and issues have already been addressed up front, before the first Begin statement is coded. If you don't care to spend the time to carefully consider your database up front, you can simply pay the price in stability, accuracy, and programming effort needed to make it right on the back end.

Logical database modeling, the subject of this chapter, is very similar to the practice of object modeling; in both processes you are attempting to identify the real-life items, relationships, and processes that you're trying to emulate with your software. The model that we create is going to meet all of the requirements of the relational database model discussed in the preceding chapter, and because of that, it will reap all of the benefits. There are going to be three areas of activity described, and in the course of examining them, we will become familiar with the Redwood Fish Foods Company, our case study.


First, the design for the database will be mapped through the creation of entity-relationship diagrams, a standard tool that makes the database members crystal clear and puts the information into a format that can be quickly converted into relations. Secondly, the process of normalizing the relations described in the ER diagrams will be examined in detail. Normalization is the process by which the relations you have designed are tested against the rules of the relational database, and through multiple design iterations they are manipulated into place. Finally, the process of mapping the design to a physical data structure will be explored, setting up your design to be converted to a physical implementation.

Diagramming the Database

To arrive at any destination, the most efficient process is to follow a map. Developing a database is no different. The mapping used for this type of development effort is called an entity-relationship (ER) diagram; it's a graphical representation of all of the items that will be contained in the database. The diagram completely describes the database to the level of detail necessary to transfer the logical design directly to a physical implementation.

In nearly every instance, the diagram will be composed of representations of the following items:

- Entities
- Relationships
- Primary keys
- Alternate keys
- Foreign keys
- Business rules


In reviewing the ER diagram, it is easy to see that the two most critical components of the database model are the entities and the relationships between them. Before applying these diagramming tools, establishing some definitions is in order.

Entities

An entity is any object that the user wants to represent in the database and which she wants to record facts about. Examples of entities are VENDORS or ITEMS or CDs; any people, places, or things involved with the subject of the database. Entities are composed of attributes, facts that describe the object. For example, a VENDOR will have a name, street address, and city attributes that serve to make that instance of VENDOR different from all the rest. Attributes should be simple, discrete pieces of data such as the city or state, atomic in nature, meaning that they describe only one element of the object.

Relationships

Relationships are the linking agents between the identified entities. In the case of two entities, VENDOR and ITEMS, we can identify a relationship as supplies, as in the VENDOR supplies ITEMS. Three broad categories, generally described by verbs or prepositions, are general indicators of the existence of a relationship:

- Existence relationship: a VENDOR has EMPLOYEES
- Functional relationship: an EMPLOYEE fills ORDERS
- Event relationship: a VENDOR delivers ITEMS

The entity-relationship diagram displayed in Figure 2.1 demonstrates a property of the defined relationships; each has a cardinality.

One-to-One (1:1)
One-to-Many (1:M): INVOICE, LINE ITEMS
Many-to-Many (M:M): PRODUCTS, PARTS

Figure 2.1 Cardinality of relationships

Cardinality is the number of expected occurrences of the relationship between two entities. A one-to-one relationship means that for each record (set of attributes) in the first entity, there is exactly one occurrence of a related record in the other entity in the relationship. In Figure 2.1, the first example says that for each instance of WAREHOUSE there is exactly one WAREHOUSE MANAGER.

Tip

There are many accepted methods for the graphical representation of an entity-relationship diagram. For purposes of simplicity, we are going to document the cardinality degree of our relationships with one of two symbols. One-to-one relationships will be denoted with a single arrowhead (->). Relationships with multiple cardinality will be shown with two arrowheads on the relationship line (->>).

The second example, a one-to-many cardinality, is demonstrated by the invoice to line items relationship. In this relationship, one record in the first entity is related to one or more records in the other entity. The example diagrams a relationship in which each INVOICE contains one or more LINE ITEMS.

The last relationship in the example is the most complex. A many-to-many cardinality represents that one or more records of one of the entities has relationships with one or more records of the other entity. The many-to-many relationship is difficult to diagram and even more difficult to translate into a logical and physical database design. The recommended course for handling relationships of this type is that they be further decomposed, if possible, into a pair of one-to-many relationships. Figure 2.2 is an example of this process.

The original relationship had two programmer entities assigned to one or more of the project entities. Creating a new entity called TASKS and then establishing two one-to-many relationships between the relations creates a clearer representation of the relationship. The first, between TASKS and PROGRAMMERS, states that one PROGRAMMER is assigned to one or more TASKS. A second relationship is then established that says that the PROJECT entity has one or more TASKS assigned to it. No information is lost in this process and you end up with a much more understandable design.
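One way to picture the decomposition is as three tables, with TASKS on the "many" side of both relationships. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in for a relational engine; the table layouts, column names, and sample rows are assumptions made for illustration and are not taken from the case study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE programmers (
        programmer_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    );
    CREATE TABLE projects (
        project_id INTEGER PRIMARY KEY,
        title      TEXT NOT NULL
    );
    -- Each task belongs to exactly one programmer and exactly one project.
    CREATE TABLE tasks (
        task_id       INTEGER PRIMARY KEY,
        programmer_id INTEGER NOT NULL REFERENCES programmers(programmer_id),
        project_id    INTEGER NOT NULL REFERENCES projects(project_id),
        description   TEXT NOT NULL
    );
    INSERT INTO programmers VALUES (1, 'Ann'), (2, 'Raj');
    INSERT INTO projects VALUES (10, 'Order Entry'), (20, 'Inventory');
    INSERT INTO tasks VALUES
        (1, 1, 10, 'Design tables'),
        (2, 1, 20, 'Write reports'),
        (3, 2, 10, 'Build forms');
""")

# The original many-to-many pairing is recovered by joining through TASKS.
pairs = conn.execute("""
    SELECT p.name, j.title
    FROM tasks AS t
    JOIN programmers AS p ON p.programmer_id = t.programmer_id
    JOIN projects AS j ON j.project_id = t.project_id
    ORDER BY p.name, j.title
""").fetchall()
```

Because every programmer-to-project assignment is still reachable through a task row, the query returns the same pairings the many-to-many relationship expressed.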


Relationships also have direction, unlike some of the programmers. The direction of the relationship describes which entity is the "from" entity and which is the "to." The "from" is also referred to as the parent while the "to" entity takes the label of child. The direction of a one-to-one relationship is for the most part arbitrary. A one-to-many cardinality requires a little more thought to discover the direction, although often the verb or preposition that describes the relationship provides a strong indicator. For example, an INVOICE has LINE ITEMS; the INVOICE entity has the LINE ITEMS, making it the parent to the LINE ITEMS child.

Figure 2.2 A many-to-many relationship decomposed

Primary Keys

The singular nature of items in a relational database system is the key concept that drives the paradigm. By ensuring that each record is unique in some way, the results of either a query or data manipulation operation can be guaranteed to be correct by the database system. A primary key is an attribute value on which the non-redundancy rules are heavily enforced. Though attributes in the remainder of the record may match similar attributes in other records, in tandem with the key value, they are unique.

The first step taken in this part of the design process is to identify an attribute or a minimal set of attributes that can be used to uniquely identify a record to become the primary key. Sometimes it may become necessary to introduce a new attribute to the relation if a single attribute or a set of attributes (a composite key) cannot be identified. As an example, in an EMPLOYEE relation, this may take the form of an identification number that is applied to each record. A key may also be composed from a minimal set of attributes, the smallest number of attributes that are necessary to uniquely identify a record. Testing to determine if it is truly the minimal set of attributes involves the removal of any one of the attributes to determine if it results in a loss of the uniqueness. Any attribute or attributes that are selected as candidate keys must meet the requirements of the primary key, uniqueness and always being non-null.
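The minimality test described above can be sketched in a few lines. The helper names and the sample vendor rows are hypothetical, and a check against sample data can only disprove a key; whether a set of attributes is truly unique is a property of the business, not of the rows that happen to be on hand.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no value is null and no two rows share the same attrs values."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if any(value is None for value in key) or key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """Unique and non-null, and removing any one attribute loses uniqueness."""
    if not is_unique(rows, attrs):
        return False
    smaller = combinations(attrs, len(attrs) - 1)
    return not any(is_unique(rows, subset) for subset in smaller if subset)

# Hypothetical sample rows: two divisions of one vendor share a name.
vendors = [
    {"name": "Kingfish, Inc.", "zip": "80202", "city": "Denver"},
    {"name": "Kingfish, Inc.", "zip": "80301", "city": "Boulder"},
    {"name": "Devil Ray Foods", "zip": "80202", "city": "Denver"},
]

name_only = is_candidate_key(vendors, ("name",))
name_zip = is_candidate_key(vendors, ("name", "zip"))
name_zip_city = is_candidate_key(vendors, ("name", "zip", "city"))
```

Only subsets that are one attribute smaller need checking: if any smaller subset were unique, every larger set containing it would be unique as well.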

In the VENDORS entity, the name of the company alone may serve to uniquely identify each record. An ITEMS table might select the item number as a candidate key since no two products will share the same identifying number. On the other hand, if we deal with a number of divisions of the same vendor, the name alone will not suffice to ensure uniqueness. In this case we will need to create a composite key of multiple fields. Combining the vendor name and zip code creates a unique identifier that can be used.


Alternate Keys

The candidate keys that were not selected as the primary keys become alternate keys. These keys will become tools to help implement easier access to the data for your users, perhaps becoming indexes. The purpose of identifying the alternate keys is to provide substitute access paths.

Foreign Keys

A foreign key serves a critical purpose in a relationship. It is an attribute or set of attributes that identifies the parent record. In other words, the foreign key is the attribute that links the child occurrences to the parent entity occurrence through matching key values. The foreign key is artificially present in the child entity and is the primary key of the parent.


For example, in the relationship VENDOR supplies ITEMS, VENDOR-ID is the primary key in the VENDORS entity. This attribute appears in the ITEMS entity as a foreign key, linking each item instance back to a specific vendor. Figure 2.3 shows the resulting relation between the two entities.

Figure 2.3 The VENDOR-ID attribute is the primary key in the VENDORS entity and a foreign key in the ITEMS entity.

Tip

Do not select alternate keys as a foreign key. Alternate keys allow null values, leading to linkage problems not present if the primary key is used. In addition, the use of alternate keys leads to unnecessary complication of the database structure because their use requires indirect referencing back to the primary key anyway.

Business Rules

Recall that one of the major benefits of the relational database model is its increased data integrity. The main tools for implementing the integrity constraints are called business rules. These constraints are just the rules that govern the data and the transactions that are modeled by the database. For example, the Waytag Appliance service center will only make service calls to cities in the state of Colorado. Because of this, no area code other than 303, 719, 720, or 970 would be acceptable on a work order. When written as a business rule, this constraint is described as follows:

The customer’s area code must be equal to 303 or 719 or 720 or 970.

Each attribute in every table should be reviewed with the user to determine the constraints necessary for maintaining the integrity of the data. Acceptable ranges of data as well as format and type are all items that work together to implement a business rule. The rules will become a part of the physical database implementation, either directly implemented through the data structure or handled by the supporting code.

There is a second aspect of the rules that needs to be discussed, the issue of triggering. Business rules become a part of the field definition for each relation. Obedience to the rule is tested during specific operations on the database. These actions are called triggering operations. Three database operations trigger the business rules tests:

- Adding data to a database
- Deleting data from a database
- Updating the data in fields of a database

Triggering operations cause the relational system to validate that the database is being changed according to the rules defined for the particular field. For example, when adding a record to the SERVICE CALL table, the data being entered for the Area Code field is validated against the four acceptable codes that were defined. Entering 415 would generate an exception and the RDBMS control mechanisms would prevent the user from entering this data.
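The area code rule could be implemented directly in the data structure. The following is only a sketch, using Python's sqlite3 module and a CHECK constraint as one possible mechanism; the table and column names and the `add_call` helper are assumptions, and the engine you actually deploy on may express the rule differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE service_call (
        call_id   INTEGER PRIMARY KEY,
        area_code TEXT NOT NULL
            CHECK (area_code IN ('303', '719', '720', '970'))
    )
""")

def add_call(area_code):
    """Try to add a service call; return True only if the rule allowed it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO service_call (area_code) VALUES (?)",
                (area_code,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = add_call("303")
rejected = add_call("415")
```

The same constraint is also tested when an existing row is updated, so both the adding and the updating operations trigger the rule.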


Likewise, a deletion operation should trigger a more complex validation procedure. Take the example of an INVOICES table that draws the vendor's address from the VENDOR table. A business rule applicable to this situation is:

Every vendor ID that appears on an invoice must be a verifiable and valid vendor ID.

This indicates that as long as open invoices contain a vendor's ID number, there must be a record in the VENDOR table that matches that ID. If the user tries to delete a vendor from the VENDOR table, a validation procedure must examine all open invoices to determine if that ID is still in use. If it is, an exception must be raised to alert the user to that fact and prevent her from removing the record.
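A foreign key constraint is one way to have the database perform this validation itself. The sketch below again uses Python's sqlite3 module as a stand-in engine; the table layouts and the `delete_vendor` helper are assumptions, and for simplicity every invoice is treated as open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
conn.executescript("""
    CREATE TABLE vendor (
        vendor_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE invoices (
        invoice_id INTEGER PRIMARY KEY,
        vendor_id  INTEGER NOT NULL REFERENCES vendor(vendor_id)
    );
    INSERT INTO vendor VALUES (100, 'Kingfish, Inc.'), (200, 'The Crab Walk');
    INSERT INTO invoices VALUES (1, 100);
""")

def delete_vendor(vendor_id):
    """Try to delete a vendor; return False if an invoice still refers to it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM vendor WHERE vendor_id = ?", (vendor_id,))
        return True
    except sqlite3.IntegrityError:
        return False

kingfish_deleted = delete_vendor(100)   # invoice 1 still references vendor 100
crab_walk_deleted = delete_vendor(200)  # no invoice references vendor 200
```

The application can catch the raised exception and tell the user why the vendor record cannot be removed.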


Case Study

The best way to apply these conceptual tools is to work our way through an example. To do this we'll examine a common database application, the order entry system. This business model is easy to understand: A company sells items to customers. The customers contact the company and provide their name and address and payment information, which the company retains for future reference. The customer selects the items that they would like to purchase and orders them. The company, in turn, pulls the item from inventory, prepares an invoice as the offsetting entry to the customer's payment, and ships the items. Vendors ship the items to us that we sell to the customers. The Redwood Fish Foods Company will be the company that opens its doors to our exploration.

The Players

In this example, it is relatively easy to pick out the entities. Remember that the entities are those people, places, or things about which we want to record facts in the database. Figure 2.4 collects the identified entities for Redwood Fish Foods and, as shown, we will be using the standard rectangular representation of an entity in all of the diagrams.

Figure 2.4 The entities of the Redwood Fish Foods Company

Five entities were identified from the case description:

- Vendors
- Customers
- Items
- Invoices
- Credit terms


In reality, the process of identifying entities is a laborious one and would result in the identification of numerous other entities. The practice of interviewing users, reviewing reports, examining business practices, and looking at the current database would yield many more items for us to work with. In the interest of simplicity, we will limit our discussions to these few entities.


Their Attributes

Finding the positive attributes of an object (person, place, or thing) is a good exercise in any walk of life, but it is an especially important skill in database design. Remember that the attributes of an entity are the smallest component that describes the entity. In other words, the attributes are the individual items that make up each record and will directly translate into the fields of the database records. In reviewing the entities that you have identified for the database, the next task is to identify the components of each of them.

We'll start the process with the VENDORS entity. Without worrying about the normalization of the data (that step will arrive soon enough), list all of the attributes you can identify that describe the vendors that will inhabit the table. In considering each attribute, you should be sure that the element cannot be decomposed further. If it can, it should be reduced into two or more attributes. For example, the Address attribute of a vendor entry should not contain the street, city, state, and zip codes of the vendor. This attribute can be decomposed further into its individual elements and should be represented by individual fields for the street, city, state, and zip code.

The VENDOR entity is composed of the following attributes:

- Name
- Street
- City
- State
- Zip Code
- Telephone
- Fax
- Contact Last Name
- Contact First Name

This set of attributes adequately describes the vendors in the table. We must now consider which of the items uniquely describes each entity. The fields that we select will be the candidate keys, and nearly all of the fields in the VENDOR entity could stand a reasonable chance of being the primary key. In one way or another, each of them could be expected to be unique in each record. As we review them, let's first consider if any of the values of the attributes is likely to change as the database is put to use. Primary key fields should be the most stable in the database since so many items are linked by and through them. The Street, City, State, Zip_Code, Telephone, Fax, and Contact fields could all be susceptible to change during usage, causing all manner of update repercussions. They will be removed from consideration, leaving the vendor Name field as the only candidate.

This field would be a natural choice for a primary key. Very few companies will share business names, so it meets the requirements for uniqueness, and it isn't very often that a company changes its name, making it relatively stable. One last consideration is whether or not it will ever be a null value. All of the vendors that we deal with will have a business name, so we are okay on this front as well. However, what will happen if the companies begin to provide products to us from multiple locations such as different warehouses or processing plants? To keep our books in order, it is necessary to maintain the source of the goods information for each order. In this case, the names are now in danger of being duplicated. This destroys the uniqueness of the Name field.

In situations such as this, it is best to introduce an additional attribute to the entity that will ensure the uniqueness of each record. The easiest method of producing this result for the VENDORS relation is to add a Vendor_ID attribute to each record. This will serve two purposes, the first being that each record will now be guaranteed to be unique by the Vendor_ID attribute. Secondly, each vendor record can now be freely modified without fear of harming any relationships within the database.
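
The effect of the added key can be tried out in a few lines. The following is only a minimal sketch that uses Python's built-in sqlite3 module rather than the tools discussed in this book; the second location (Astoria), the reduced column list, and the ID value 101 are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE VENDORS (
        Vendor_ID INTEGER PRIMARY KEY,  -- added key; other tables link to this
        Name      TEXT NOT NULL,
        City      TEXT,
        State     TEXT
    )
    """
)

# Hypothetical data: one company shipping from two locations.
# The Name repeats, but each record is still unique by Vendor_ID.
conn.executemany(
    "INSERT INTO VENDORS (Vendor_ID, Name, City, State) VALUES (?, ?, ?, ?)",
    [
        (100, "Coos Bay Fish", "Coos Bay", "OR"),
        (101, "Coos Bay Fish", "Astoria", "OR"),
    ],
)

# A descriptive attribute changes; the key that relationships rely on does not.
conn.execute("UPDATE VENDORS SET City = ? WHERE Vendor_ID = ?", ("Newport", 101))

rows = conn.execute(
    "SELECT Vendor_ID, Name, City FROM VENDORS ORDER BY Vendor_ID"
).fetchall()
print(rows)
```

Because nothing outside the table depends on Name or City, either can be edited freely, which is the second purpose described above.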

I Can Relate to That

As we seek to identify the entities to be represented in the database, there should also be some idea forming about how they relate to one another. The relationship describes transactions, communications, and ownership between the entities. As we identify and examine the relationships, the database designer is looking to identify the cardinality, the type of participation, and the degree of participation. The fundamental relationships are shown in Figure 2.5.

Figure 2.5 Relationships between the entities

Identifying the relationships and the details of each is again performed through an analysis of the existing database, interviews with users, and your own understanding of business practices. Examine the entities one by one and try to determine how many of the other tables are related to it. Often, stating the relationship in sentence form helps to identify the nature of the relationship. In general, a verb or preposition will be the connecting word between the two entities and will readily identify a relationship. For example, VENDORS supply ITEMS or CUSTOMERS generate INVOICES.

Once you have established that a relationship exists, one of the primary aspects of the affinity that you must identify is which of the flavors of cardinality you will be implementing. The type of relationship determines the use and extent of foreign keys or linking tables in your database design.

A one-to-one relationship is one in which a single record in one table is related to only one record in another table. In addition, a record in the second table can only be related to a single record in the first table. The example shown in Figure 2.6 is a relationship in which each Customer record relates to only one record in the CREDIT TERMS table. Likewise, each entry in the CREDIT TERMS relation is offset by a single record in the CUSTOMER table.
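
One common way to hold a relationship to one-to-one is to let the second table use the first table's key as its own primary key. The sketch below assumes Python's sqlite3 module; the column names Customer_ID and Net_Days, the table name CREDIT_TERMS, and the sample customer are hypothetical and are not taken from the Redwood Fish case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript(
    """
    CREATE TABLE CUSTOMER (
        Customer_ID INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL
    );
    CREATE TABLE CREDIT_TERMS (
        -- Primary key and foreign key at once: at most one row per customer.
        Customer_ID INTEGER PRIMARY KEY REFERENCES CUSTOMER (Customer_ID),
        Net_Days    INTEGER NOT NULL
    );
    INSERT INTO CUSTOMER VALUES (1, 'Bayside Aquarium');
    INSERT INTO CREDIT_TERMS VALUES (1, 30);
    """
)


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second set of terms for the same customer violates the primary key.
second_terms_ok = try_insert("INSERT INTO CREDIT_TERMS VALUES (1, 60)")
# Terms for a customer that does not exist violate the foreign key.
orphan_terms_ok = try_insert("INSERT INTO CREDIT_TERMS VALUES (2, 30)")
print(second_terms_ok, orphan_terms_ok)
```

Both inserts are rejected, so each CREDIT_TERMS row is matched by exactly one CUSTOMER row and no customer can have two sets of terms.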

CUSTOMER --> CREDIT TERMS
Figure 2.6 A one-to-one relationship

Figure 2.7 demonstrates a one-to-many relationship between the INVOICE and the ITEMS tables: An invoice contains one or more items. Though the items themselves may be duplicated in the table, each instance is unique because of the addition of the key, Invoice Number, in the list of attributes for the table.

INVOICE --> ITEMS
Figure 2.7 A one-to-many relationship

The one-to-many relationship allows that a single record in one table, the parent, or dominant, table, is related to one or more records in the child, or subordinate, table. The relationship also states the inverse, that one or more records in the subordinate table is related to a single record in the dominant table.

The many-to-many relationship is the most complex of the three in both design and implementation. A many-to-many relationship consists of two tables in which a single record in the first table is related to one or more records in the second table while at the same time a single record in the second table is related to one or more records in the first table. The example shown in Figure 2.8 demonstrates this concept.

VENDORS                      ITEMS
100  Coos Bay Fish           SQ01  Squid
200  Portland Sea Food       SQ02  Cuttlefish
300  Pacific Coast Crab      CR10  King Crab
400  Maine Lobster Shack     CR50  Blue Crab
                             OY36  Oyster

Figure 2.8 A many-to-many relationship

Each vendor listed in the VENDORS table could supply one or more of the products that we carry in the ITEMS table. Additionally, our crab could be purchased from one or more vendors. Due to the amount of redundant data that this relationship requires when implemented with two tables alone, a third table is going to be introduced to the database.

This third table is called a linking table. The linking table is created by taking the primary key from each of the two tables involved in the relationship and making these the only attributes of the linking table.

VENDORS --> VENDORSITEMS <-- ITEMS
Figure 2.9 The many-to-many relationship is decomposed and simplified by the addition of a linking table.

The structure of the VENDORSITEMS linking table is now:

- Vendor_ID
- Item_ID

In looking at this new table you are now wondering how this fulfills the primary rule of the relation: no redundant data. The reason that this table works is that the key for the table is a composite key. The key is composed of both fields together, which serves to make each record unique. In examining the new relationships created in Figure 2.9 you will find something completely different. The many-to-many relationship is now composed of two one-to-many relationships. This makes the database much easier to understand and implement.
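
A linking table with a composite key might be sketched as follows. This is an illustration only, again assuming Python's sqlite3 module; the Description column and the particular vendor-to-item pairings are assumptions, since the text does not state which vendor supplies which item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE VENDORS (Vendor_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE ITEMS   (Item_ID TEXT PRIMARY KEY, Description TEXT NOT NULL);

    -- The linking table holds only the two keys; together they form its key.
    CREATE TABLE VENDORSITEMS (
        Vendor_ID INTEGER NOT NULL REFERENCES VENDORS (Vendor_ID),
        Item_ID   TEXT    NOT NULL REFERENCES ITEMS (Item_ID),
        PRIMARY KEY (Vendor_ID, Item_ID)
    );

    INSERT INTO VENDORS VALUES (300, 'Pacific Coast Crab'), (400, 'Maine Lobster Shack');
    INSERT INTO ITEMS VALUES ('CR10', 'King Crab'), ('CR50', 'Blue Crab');
    -- Hypothetical pairings: vendor 300 supplies both crabs, vendor 400 only Blue Crab.
    INSERT INTO VENDORSITEMS VALUES (300, 'CR10'), (300, 'CR50'), (400, 'CR50');
    """
)

# Repeating an existing vendor/item pair violates the composite key.
try:
    conn.execute("INSERT INTO VENDORSITEMS VALUES (300, 'CR10')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# One item, many vendors: follow the link from ITEMS back to VENDORS.
blue_crab_vendors = [
    name
    for (name,) in conn.execute(
        """
        SELECT v.Name
        FROM VENDORS v
        JOIN VENDORSITEMS vi ON vi.Vendor_ID = v.Vendor_ID
        WHERE vi.Item_ID = 'CR50'
        ORDER BY v.Vendor_ID
        """
    )
]
print(duplicate_rejected, blue_crab_vendors)
```

Each side of the link is an ordinary one-to-many relationship, so the same join pattern answers both "which vendors supply this item?" and "which items does this vendor supply?".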

Your goal to this point in a project is to have a complete ER diagram for your project. This should include the entities, their attributes, and the relationships that have been identified. The next step in the process is the normalization of the data.


Normalization

Normalization is the process of decomposing relations to ensure maximum stability and minimal data redundancy. The process is one in which relations and their structures are refined in such a way that no data is lost and no artificial structures are introduced. A fully normalized relation is one that most closely matches the guidelines of the relational model and exhibits correctness, consistency, stability, and non-redundancy.

A table is said to be in normal form when its structure and data meet the requirements of one of the stages of normalization. There are stages labeled first through fifth normal form, Boyce/Codd normal form, and domain-key normal form. For purposes of our discussion, we will limit our detailed discussion to the first through third normal forms, as this level of normalization is sufficient for the vast majority of database programs. We will also step away from the Redwood Fish example to explore other relations.

First Normal Form (1NF)

To be in first normal form, a relation must have no repeating groups or multi-valued attributes. The class and grade fields in Figure 2.10 are a good example of a non-relational design.

Figure 2.10 Decomposition to first normal form (1NF)

Each student record maintains the names of up to four classes and grades. To be a relational table, each attribute, or field, must represent a unique, discrete fact about the entity. The fields in the example relations represent duplication of the data.

It is a simple matter to reduce this relation to first normal form by merely introducing a child relation to the database called GRADES. The new structure of STUDENTS will now contain only unique records while the entries in the GRADES relation will be made unique by the combination of the Student_ID and Class fields into a composite key. The connection between these two tables will be a one-to-many relationship based upon the Student_ID being the primary key in the STUDENTS table and a foreign key in the GRADES table.

The benefits of a table being in first normal form are easy to enumerate. First, you are now working with simpler data structures. Second, the tables are now in a state from which further normalization can occur, and finally, these structures are much easier to move from the logical data model to a physical one.
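
The decomposition to first normal form can be sketched in plain Python. The field names Class1 through Class4 and Grade1 through Grade4 are assumed stand-ins for the repeating group, and the sample student is invented for the illustration.

```python
def to_first_normal_form(wide_rows):
    """Split student rows with a repeating class/grade group into two relations.

    Returns (students, grades): students holds one tuple per student, and
    grades holds one (Student_ID, Class, Grade) tuple per class taken.
    """
    students, grades = [], []
    for row in wide_rows:
        students.append((row["Student_ID"], row["Name"]))
        for n in range(1, 5):  # up to four class/grade slots per record
            class_name = row.get(f"Class{n}")
            if class_name is not None:
                grades.append((row["Student_ID"], class_name, row.get(f"Grade{n}")))
    return students, grades


# Hypothetical record using only two of the four slots.
wide = [
    {"Student_ID": 1, "Name": "Pat Doe",
     "Class1": "Algebra", "Grade1": "A",
     "Class2": "Biology", "Grade2": "B"},
]

students, grades = to_first_normal_form(wide)
print(students)
print(grades)
```

Each GRADES tuple is identified by the Student_ID and Class pair, and a fifth class no longer requires a change to the table structure.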

Second Normal Form (2NF)

Second normal form further refines the database structures. To be in 2NF a table must be in 1NF and all of the attributes must be fully dependent upon the whole primary key. Every non-key attribute must be fully dependent upon the primary key. The TRANSCRIPT relation in Figure 2.11 is probably not in 2NF.

The TRANSCRIPT table is in 1NF; there are no repeating groups or multi-valued attributes. Your next step is to then consider the dependency of the fields on the primary key of Student_ID + Class. The STUDENT_NAME, STREET, CITY, STATE, and ZIP are fully dependent upon the STUDENT_ID attribute for their meaning. On the other hand, these attributes certainly have no dependency on the field Class. The reason for the second-normal-form refinement to the relation becomes clearer when you consider a situation in which the student moves and needs to update his address. Each instance of the student in the TRANSCRIPT relation would have to be updated, a tedious and potentially error-introducing operation.

Figure 2.12 shows the decomposed TRANSCRIPT table that has removed the non-dependent fields into another entity called STUDENTS.

TRANSCRIPT: STUDENT ID, CLASS, GRADE
STUDENTS: STUDENT ID, STUDENT NAME, STREET, CITY, STATE, ZIP
Figure 2.12 Decomposition to second normal form (2NF)

Each record in the TRANSCRIPT relation remains unique based on the primary key values and the relationship is set by the values in the Student_ID fields in both TRANSCRIPT and STUDENTS. The update advantages are now apparent; changes to the student record will only involve a single operation, leaving far less opportunity for error.
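
The single-operation update can be demonstrated with a small sketch, assuming Python's sqlite3 module. The student, the classes, and both addresses are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE STUDENTS (
        Student_ID   INTEGER PRIMARY KEY,
        Student_Name TEXT, Street TEXT, City TEXT, State TEXT, Zip TEXT
    );
    CREATE TABLE TRANSCRIPT (
        Student_ID INTEGER NOT NULL REFERENCES STUDENTS (Student_ID),
        Class      TEXT    NOT NULL,
        Grade      TEXT,
        PRIMARY KEY (Student_ID, Class)
    );
    INSERT INTO STUDENTS VALUES (1, 'Pat Doe', '1 Elm St', 'Salem', 'OR', '97301');
    INSERT INTO TRANSCRIPT VALUES
        (1, 'Algebra', 'A'), (1, 'Biology', 'B'), (1, 'Chemistry', 'A');
    """
)

# The student moves: one row changes, however many classes are on record.
cursor = conn.execute(
    "UPDATE STUDENTS SET Street = ?, City = ?, Zip = ? WHERE Student_ID = ?",
    ("9 Oak Ave", "Eugene", "97401", 1),
)
rows_touched = cursor.rowcount

# Every transcript line picks up the new address through the Student_ID link.
addresses = conn.execute(
    """
    SELECT t.Class, s.Street
    FROM TRANSCRIPT t
    JOIN STUDENTS s ON s.Student_ID = t.Student_ID
    ORDER BY t.Class
    """
).fetchall()
print(rows_touched, addresses)
```

In the undecomposed TRANSCRIPT table the same move would have meant three updates, one per class, with a chance of missing one.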

Third Normal Form (3NF)

Third normal form is achieved by first massaging the relation into second normal form. In 3NF, each non-key attribute must now be fully dependent upon the whole primary key and must not depend on any other non-key attribute. 3NF introduces the concept of transitive dependency. This type of dependency indicates a functional dependency between two or more non-key attributes. The FRESHNESS relation in Figure 2.13 has a transitive dependency in the Hours attribute.

FRESHNESS
100  Coos Bay   Denver
200  Portland   Los Angeles
300  Seattle    Portland

VENDOR_DISTANCE
Coos Bay   Denver
Portland   Los Angeles
Seattle    Portland

FRESHNESS_INDEX
Coos Bay   Denver
Portland   Los Angeles
Seattle    Portland

Figure 2.13 The FRESHNESS relation contains transitive dependencies.

The values contained in the Hours field are dependent on the Origin and Destination fields but not the Vendor_ID value. Origin and Destination are non-key attributes of the FRESHNESS relation. Third normal form removes these transitive dependencies to a child relation in which the attribute is fully dependent upon the primary key.

The FRESHNESS relation has been decomposed into two new relations, VENDOR_DISTANCE and FRESHNESS_INDEX. Now, each of the attributes in the new relations is fully dependent on the primary key of the table. Why go to this much trouble? This normalization exercise removes update anomalies such as the deletion anomaly surrounding the removal of vendor 100. Because of the transitive dependency we lose the freshness index between Coos Bay and Denver that is useful for other vendors in that location.
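
The deletion anomaly is easy to reproduce with a few tuples. In this plain-Python sketch the Hours values (30, 24, and 4) are invented for the illustration; only the vendor numbers and city pairs come from the example.

```python
# FRESHNESS as one relation: (Vendor_ID, Origin, Destination, Hours).
# The Hours figures are hypothetical.
freshness = [
    (100, "Coos Bay", "Denver", 30),
    (200, "Portland", "Los Angeles", 24),
    (300, "Seattle", "Portland", 4),
]

# Single relation: deleting vendor 100 also deletes the Coos Bay to Denver hours.
flat_after_delete = [row for row in freshness if row[0] != 100]
routes_left = {(origin, dest) for (_vid, origin, dest, _hours) in flat_after_delete}
hours_lost = ("Coos Bay", "Denver") not in routes_left

# Decomposed: the vendor's route and the route's hours live in separate relations.
vendor_distance = [(vid, origin, dest) for (vid, origin, dest, _hours) in freshness]
freshness_index = {(origin, dest): hours for (_vid, origin, dest, hours) in freshness}

# Deleting vendor 100 now touches only VENDOR_DISTANCE.
vendor_distance = [row for row in vendor_distance if row[0] != 100]
hours_kept = freshness_index.get(("Coos Bay", "Denver"))

print(hours_lost, hours_kept)
```

After the decomposition the route's hours survive the vendor's removal and remain available to any other vendor shipping from Coos Bay to Denver.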

Though we have examined the normalization process in detail through third normal form, that should not be taken as an indication that this is where it ends. 3NF is sufficient for most applications, but there is still the possibility of some specific anomalies being found in your database. The other normalization steps are summarized in the following paragraphs.

Boyce/Codd Normal Form (BCNF)

A table in third normal form is in Boyce/Codd normal form when every determinant is a candidate key. A determinant is the attribute on the left-hand side of the arrow in a functional dependency. Dr. Codd and R.F. Boyce were responsible for identifying the anomalies that occurred with some tables in 3NF and suggested this stronger form as a solution.

Fourth Normal Form (4NF)

A relation is in fourth normal form if it is in BCNF and it contains no multi-valued dependencies. A multi-valued dependency is a type of dependency that exists when there are at least three attributes in a primary key. To best explain this normal form requires an example. An entity, STUDENTREASONS, represents a student who takes many classes and has many educational objectives, both of which are independent of each other. Redundancy is rampant in this structure because the objectives will have to be entered for each class. To normalize this relation requires that the STUDENTREASONS relation be decomposed into two separate relations, one containing the attributes Student and Class and the other composed of Student and Objective.
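
The redundancy, and the fact that the decomposition loses nothing, can be checked with Python sets. The student name, classes, and objectives below are invented sample values.

```python
from itertools import product

# Hypothetical data: one student, two classes, two independent objectives.
classes = ["Algebra", "Biology"]
objectives = ["Degree", "Transfer"]

# STUDENTREASONS must pair every class with every objective.
student_reasons = {("Pat", c, o) for c, o in product(classes, objectives)}

# Decompose into the two relations named in the text.
student_class = {(s, c) for (s, c, _o) in student_reasons}
student_objective = {(s, o) for (s, _c, o) in student_reasons}

# Joining the two relations on Student rebuilds the original rows exactly.
rejoined = {
    (s, c, o)
    for (s, c) in student_class
    for (s2, o) in student_objective
    if s == s2
}

print(len(student_reasons), len(student_class), len(student_objective))
print(rejoined == student_reasons)
```

Four rows shrink to two plus two here; with more classes and objectives the single relation grows as their product while the two smaller relations grow only as their sum.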

Fifth Normal Form (5NF)

A relation is in fifth normal form if it is in fourth normal form and does not have a join dependency. A table that has a join dependency cannot be decomposed into two tables and then have the resulting tables be recombined to form the original table.

Domain-Key Normal Form (DK/NF)

Domain-key normalization is a simplification of the entire process that, unfortunately, lacks a methodology for achieving it. Its theorist states that if a relation is in DK/NF then it is automatically in 5NF, 4NF, etc. A relation is in domain-key normal form if every constraint on the relation is a logical consequence of key constraints and domain constraints.

The normalization of the relations in a database marks the difference between a professional developer and those who will wantonly lash together a set of tables to meet an end result. The application may initially function correctly, but over time, the anomalies will begin to surface and beat the program down. The well-designed and normalized set of relations will encounter far fewer problems when placed into service. One caveat to follow during the normalization considerations is not to over-normalize. It is possible to zealously decompose relations to