Dr. Philip Cannata
Normalization
Designing Databases Using Normalization

Oh no! There's another way to design a good set of tables:

Normalization
FDs, 1NF, 2NF, 3NF, BCNF, MVDs, F+, A+, FD Preservation, Lossless Join
Why does the emp table need to be separate from the dept table?
DEPTNO EMPNO ENAME  JOB       MGR  HIREDATE  SAL  COMM DNAME      LOC
------ ----- ------ --------- ---- --------- ---- ---- ---------- --------
10     7782  CLARK  MANAGER   7839 09-JUN-81 2450 100  ACCOUNTING NEW YORK
10     7839  KING   PRESIDENT      17-NOV-81 5000 0    ACCOUNTING NEW YORK
10     7934  MILLER CLERK     7782 23-JAN-82 1300 100  ACCOUNTING NEW YORK
20     7369  SMITH  CLERK     7902 17-DEC-80 800  0    RESEARCH   DALLAS
20     7876  ADAMS  CLERK     7788 12-JAN-83 1100 100  RESEARCH   DALLAS
20     7902  FORD   ANALYST   7566 03-DEC-81 3000 100  RESEARCH   DALLAS
20     7788  SCOTT  ANALYST   7566 09-DEC-82 3000 100  RESEARCH   DALLAS
20     7566  JONES  MANAGER   7839 02-APR-81 2975 100  RESEARCH   DALLAS
30     7499  ALLEN  SALESMAN  7698 20-FEB-81 1600 100  SALES      CHICAGO
What's wrong with this picture?
Insertion Anomalies
You can't insert information about Department 40 unless it has an employee. So
where do you store information about Department 40?
Deletion Anomalies
If you delete EMPNO 7499, you lose all information about Department 30.
Update Anomalies
If you change CLARK's DNAME and leave his DEPTNO=10, you need to make sure
the DNAME for DEPT 10 is changed in every tuple.
Functional Dependencies
Require that the value for a certain set of attributes
determines uniquely the value for another set of
attributes.
A functional dependency is a generalization of the
notion of a key.
Functional Dependencies
Let R be a relation and
α ⊆ R and β ⊆ R
The functional dependency
α → β
holds on R if and only if for any tuples t1 and t2 that
agree on the attributes α, they also agree on the
attributes β.
That is, t1[α] = t2[α] ⇒ t1[β] = t2[β]
Functional Dependencies
Example: Consider r(empno, ename, deptno) with the following
instance of r.
EMPNO      ENAME      DEPTNO
---------- ---------- ----------
7876       ADAMS      20
7499       ALLEN      30
7698       BLAKE      30
7600       BLAKE      40
7782       CLARK      10
7902       FORD       20
7900       JAMES      30
7566       JONES      20
Is this true?      Yes   No
empno → ename
empno → deptno
ename → empno
ename → deptno
deptno → empno
deptno → ename
Functional Dependencies
Answers for the same instance of r:
Is this true?      Yes   No
empno → ename      X
empno → deptno     X
ename → empno            X
ename → deptno           X
deptno → empno           X
deptno → ename           X
The ename dependencies fail because BLAKE appears with two different EMPNO and
DEPTNO values; the deptno dependencies fail because departments 20 and 30 each
contain several employees.
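The definition of a functional dependency can be checked mechanically against a table instance. The following is a minimal sketch; the function name fd_holds and the list-of-dicts representation are illustrative assumptions, not part of the slides. Note that an instance can only refute a dependency: passing the check on one instance does not prove the dependency holds on the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds on this particular instance.

    rows is a list of dicts; lhs and rhs are lists of column names.
    (Hypothetical helper, written only for this illustration.)
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


r = [
    {"empno": 7876, "ename": "ADAMS", "deptno": 20},
    {"empno": 7499, "ename": "ALLEN", "deptno": 30},
    {"empno": 7698, "ename": "BLAKE", "deptno": 30},
    {"empno": 7600, "ename": "BLAKE", "deptno": 40},
    {"empno": 7782, "ename": "CLARK", "deptno": 10},
    {"empno": 7902, "ename": "FORD", "deptno": 20},
    {"empno": 7900, "ename": "JAMES", "deptno": 30},
    {"empno": 7566, "ename": "JONES", "deptno": 20},
]

for lhs, rhs in [("empno", "ename"), ("empno", "deptno"),
                 ("ename", "empno"), ("ename", "deptno"),
                 ("deptno", "empno"), ("deptno", "ename")]:
    print(lhs, "->", rhs, ":", fd_holds(r, [lhs], [rhs]))
```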
Keys
Superkey - K is a superkey for relation R if and only if K → R
Candidate Key - K is a candidate key for R if and only if
K → R (i.e., K is a superkey), and for no α ⊂ K, α → R
(i.e., K is minimal: no attribute can be removed from K without losing the
superkey property)
Prime attributes - attributes that belong to any candidate key
Primary Key - Pick one from the candidate keys
Finding Keys (Armstrong's Axioms)
Armstrong's Axioms:
if β ⊆ α, then α → β (reflexivity) (e.g., if α = AB then AB → A and AB → B)
(these are the trivial dependencies)
if α → β, then γα → γβ (augmentation)
and γα → β
if α → β, and β → γ, then α → γ (transitivity)
Finding Keys (additional rules)
Additional rules implied by Armstrong's Axioms:
if α → β holds and α → γ holds, then α → βγ holds (union)
if α → βγ holds, then α → β holds and α → γ holds (decomposition)
if α → β holds and γβ → δ holds, then αγ → δ holds (pseudotransitivity)
Finding Keys (F+)
To compute the closure of a set of functional dependencies F:

F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
Procedure for Computing F+
Example
R = (A, B, C, G, H, I)
F = { A → B
      A → C
      CG → H
      CG → I
      B → H }

some members of F+
A → H
    by transitivity from A → B and B → H
AG → I
    by augmenting A → C with G, to get AG → CG
    and then transitivity with CG → I
CG → HI
    from CG → H and CG → I: the union rule can be inferred from the
    definition of functional dependencies, or by
    augmentation of CG → I to infer CG → CGI, augmentation of
    CG → H to infer CGI → HI, and then transitivity
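The three derivations above can be replayed step by step. This is a small sketch in which a dependency is a pair of frozensets; the helper names fd, augment, and transitive are made up for the illustration and only cover the two axioms used here.

```python
def fd(lhs, rhs):
    """Build a dependency lhs -> rhs from strings of one-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))


def augment(f, extra):
    """Augmentation: from a -> b infer (a + extra) -> (b + extra)."""
    lhs, rhs = f
    return (lhs | frozenset(extra), rhs | frozenset(extra))


def transitive(f1, f2):
    """Transitivity: from a -> b and b -> c infer a -> c."""
    if f1[1] != f2[0]:
        raise ValueError("right side of f1 must equal left side of f2")
    return (f1[0], f2[1])


# A -> H from A -> B and B -> H
a_h = transitive(fd("A", "B"), fd("B", "H"))

# AG -> I: augment A -> C with G, then use CG -> I
ag_i = transitive(augment(fd("A", "C"), "G"), fd("CG", "I"))

# CG -> HI: CG -> CGI and CGI -> HI, then transitivity
cg_hi = transitive(augment(fd("CG", "I"), "CG"), augment(fd("CG", "H"), "I"))

print(a_h == fd("A", "H"), ag_i == fd("AG", "I"), cg_hi == fd("CG", "HI"))
```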

Finding Keys (α+)
Closure of a set of attributes:
Given a set of attributes α, define the closure of α under F
(denoted by α+) as the set of attributes that are functionally
determined by α under F:
if α → β is in F+ then β ⊆ α+ (and conversely)

Algorithm to compute α+, the closure of α under F
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
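The closure algorithm translates almost line for line into code. A minimal sketch, assuming dependencies are given as (left, right) pairs of strings of one-letter attribute names; the name closure and this representation are choices made for the example.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds.

    fds is a list of (lhs, rhs) pairs; each side is an iterable of
    attribute names (here: strings of single-letter attributes).
    """
    result = set(attrs)
    changed = True
    while changed:                      # "while (changes to result) do"
        changed = False
        for lhs, rhs in fds:
            # if lhs is contained in result, add rhs to result
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(closure("A", F)))
print(sorted(closure("CG", F)))
print(sorted(closure("AG", F)))
```

With the example F used in these slides, only the last call reaches every attribute of R = (A, B, C, G, H, I).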
Finding Keys
α is a superkey if α+ contains all attributes of R. It is also a
candidate key if no proper subset of α is a superkey.
Finding Keys - Example
R = (A, B, C, G, H, I)
F = { A → B
      A → C
      CG → H
      CG → I
      B → H }
After trying A+, B+, C+, G+, H+, I+, (AB)+, (AC)+ with no success, (AG)+ = R
1. result = AG
2. result = ABCG (from A → C and A → B)
3. result = ABCGH (from CG → H)
4. result = ABCGHI (from CG → I)
AG is a superkey because:
AG → R
AG is a candidate key because:
A+ does not contain all of R and G+ does not contain all of R
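The trial-and-error search above can be automated by trying attribute sets in order of increasing size. This is a brute-force sketch (exponential in the number of attributes, so only suitable for small examples); candidate_keys and closure are names chosen for the illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return all candidate keys, smallest first."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            # A set that contains an already-found key is not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(candidate_keys(R, F))
```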
Lossless-Join Decomposition

Example of Non Lossless-Join Decomposition
Decompose empdept (create table empdept as select * from emp natural join dept)
into 2 tables:
create table empdept1 as
    select empno, ename, job, mgr, hiredate, sal, comm from empdept;
create table empdept2 as
    select job, deptno, dname, loc from empdept;
Then try to recreate empdept by issuing the following query:
select * from empdept1 natural join empdept2;
Because the only shared column, job, is not a superkey of either table, the join
returns spurious rows that were never in empdept.
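The effect of this decomposition can be reproduced on two rows taken from the empdept table. The sketch below trims the columns to keep it short, and the project and natural_join helpers are hypothetical stand-ins for the SQL statements, not a real library API.

```python
# Two empdept rows (columns trimmed for brevity; an assumption of this sketch).
empdept = [
    {"empno": 7934, "ename": "MILLER", "job": "CLERK", "deptno": 10, "dname": "ACCOUNTING"},
    {"empno": 7369, "ename": "SMITH", "job": "CLERK", "deptno": 20, "dname": "RESEARCH"},
]


def project(rows, cols):
    """Keep only cols and drop duplicate rows."""
    out = []
    for row in rows:
        projected = {c: row[c] for c in cols}
        if projected not in out:
            out.append(projected)
    return out


def natural_join(left, right):
    """Join rows that agree on every column name they share."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[c] == r[c] for c in common):
                out.append({**l, **r})
    return out


empdept1 = project(empdept, ["empno", "ename", "job"])
empdept2 = project(empdept, ["job", "deptno", "dname"])

rejoined = natural_join(empdept1, empdept2)
spurious = [row for row in rejoined if row not in empdept]

print(len(empdept), "original rows,", len(rejoined), "rows after the join")
for row in spurious:
    print("spurious:", row)
```

Both employees are clerks, so joining on job pairs each of them with both departments; the extra rows are the information loss.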
Lossless-Join Decomposition

Test for Lossless-Join Decomposition

Let R be a relation that you want to decompose into R1 and R2.
The decomposition is Lossless if
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
In other words, if R1 ∩ R2 forms a superkey of either R1 or R2
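This test reduces to one attribute-closure computation on the shared attributes. A minimal sketch, assuming a simplified empdept schema and the dependencies empno → (ename, job, deptno) and deptno → (dname, loc); the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if the shared attributes determine all of r1 or all of r2."""
    reach = closure(set(r1) & set(r2), fds)
    return set(r1) <= reach or set(r2) <= reach


# Assumed dependencies for a trimmed-down empdept.
F = [
    ({"empno"}, {"ename", "job", "deptno"}),
    ({"deptno"}, {"dname", "loc"}),
]

# Split on job (the shared column is not a key of either side).
bad = is_lossless({"empno", "ename", "job"},
                  {"job", "deptno", "dname", "loc"}, F)

# Split on deptno (deptno is a key of the department side).
good = is_lossless({"empno", "ename", "job", "deptno"},
                   {"deptno", "dname", "loc"}, F)

print("split on job lossless?   ", bad)
print("split on deptno lossless?", good)
```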
Dependency Preservation Decomposition

Example of Non Dependency Preservation Decomposition
Decompose empdept (create table empdept as select * from emp natural join dept)
into 2 tables:
create table empdept3 as
    select empno, ename, job, mgr, hiredate, sal, comm, deptno from empdept;
create table empdept4 as
    select distinct empno, dname, loc from empdept;
No Lossless-Join problem, but a test for the FDs deptno → dname and deptno → loc
requires a join.
Dependency Preservation Decomposition
Test for Dependency Preservation Decomposition
Let R be a relation that you want to decompose into R1, R2, …, Rn and
let each relation have the following functional dependencies:
R has F, R1 has F1, R2 has F2, …, and Rn has Fn
The decomposition has Dependency Preservation if
(F1 ∪ F2 ∪ … ∪ Fn)+ = F+
For our example F is empno → R and deptno → deptno, dname, loc
F1 is empno → R1
F2 is empno → R2
(F1 ∪ F2)+ does not contain deptno → deptno, dname, loc
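Computing F+ explicitly is expensive, so a common textbook alternative checks each dependency in F separately: grow the left-hand side using only closures restricted to one decomposed table at a time, and see whether the right-hand side is reached. The sketch below uses that approach (it is a substitute for the F+ comparison stated above, not the slide's own procedure) on a trimmed-down empdept; all names are assumptions of the example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(fd, schemas, fds):
    """True if fd can be enforced using the decomposed tables without a join."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            # What the attributes known so far determine inside this table.
            gained = closure(result & schema, fds) & schema
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


F = [
    ({"empno"}, {"ename", "deptno", "dname", "loc"}),
    ({"deptno"}, {"dname", "loc"}),
]
empdept3 = {"empno", "ename", "deptno"}
empdept4 = {"empno", "dname", "loc"}

for fd in F:
    print(sorted(fd[0]), "->", sorted(fd[1]), ":",
          preserves(fd, [empdept3, empdept4], F))
```

With the usual emp/dept split, {empno, ename, deptno} and {deptno, dname, loc}, both dependencies pass the same check.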
2NF - Relation R is in 2NF if:
R is in 1NF and
no nonprime attribute (i.e., an attribute that's not part of any
candidate key) is partially dependent on any key
(no nonprime attribute is functionally determined by a proper subset
of a candidate key)
in other words, each nonprime attribute in relation R is fully
dependent upon every key
Note: If relation R has no compound keys, R is automatically in 2NF
2NF - A good way to remember this.

If you have a SINGLE table that is a mapping from the following
object model (instead of the normal three tables),

[Figure: object model with two classes, A and B]

the single table will not be in 2NF.
The problem is that some non-key attributes are functionally
dependent on PART of the key (this is not allowed for 2NF).
Decomposition into 2NF
[Figure: a table (A, B, C, D) with key AB and partial dependencies A → D and
B → C is converted to the tables (A, D), (B, C), and (A, B)]
This will be a Lossless-Join and Dependency Preserving
decomposition.
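Partial dependencies can be listed by closing every proper subset of each candidate key and looking for nonprime attributes in the result. The sketch below assumes a hypothetical relation R(A, B, C, D) with dependencies A → D and B → C (so AB is the only candidate key); the function names are made up for the illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for all minimal superkeys."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if any(k <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == attrs:
                keys.append(set(combo))
    return keys


def partial_dependencies(attrs, fds):
    """List (part_of_key, nonprime_attributes) pairs that violate 2NF."""
    keys = candidate_keys(attrs, fds)
    nonprime = set(attrs) - set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):          # proper subsets only
            for part in combinations(sorted(key), size):
                hit = closure(part, fds) & nonprime
                if hit:
                    found.append((set(part), hit))
    return found


F = [("A", "D"), ("B", "C")]
print(partial_dependencies("ABCD", F))   # violations in the single table
print(partial_dependencies("AD", [("A", "D")]))   # none after the split
```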
3NF - Relation R is in 3NF if:
R is in 2NF and
No nonprime attribute functionally determines any other
nonprime attribute
3NF - A good way to remember this.

If you have a SINGLE table that is a mapping from the
following object model (instead of the normal two tables),

[Figure: object model with two classes, A and B]

the single table will not be in 3NF.
The problem is that some non-key attribute(s) are functionally
dependent on a non-key attribute (this is not allowed for 3NF).
Decomposition into 3NF
[Figure: a table (A, B, C) with key A and dependency B → C is converted to the
tables (A, B) and (B, C)]
This will be a Lossless-Join and Dependency Preserving
decomposition.
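One common textbook formulation of the 3NF test is: for every nontrivial dependency α → β in F, either α is a superkey or every attribute in β - α is prime. The sketch below applies that formulation (which is stricter in wording than the informal rule above, since it also rejects partial dependencies) to a hypothetical R(A, B, C) with A → B and B → C; the names are assumptions of the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for all minimal superkeys."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if any(k <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == attrs:
                keys.append(set(combo))
    return keys


def violations_3nf(attrs, fds):
    """Return the dependencies in fds that break the 3NF condition."""
    attrs = set(attrs)
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)             # nontrivial part of the FD
        is_superkey = closure(lhs, fds) == attrs
        if extra and not is_superkey and not extra <= prime:
            bad.append((lhs, rhs))
    return bad


F = [("A", "B"), ("B", "C")]
print(violations_3nf("ABC", F))            # B -> C breaks 3NF
print(violations_3nf("AB", [("A", "B")]))  # fine after the split
print(violations_3nf("BC", [("B", "C")]))  # fine after the split
```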
BCNF - Relation R is in BCNF if:
R is in 1NF and
all functional dependencies are either
trivial
or have left-hand sides that are superkeys
Note: This means
all nonprime attributes must be fully dependent on every key
(2NF)
no nonprime attribute functionally determines any other
nonprime attribute (3NF)
(i.e., all BCNF relations are in 3NF, but not all 3NF relations
are in BCNF)
BCNF Decomposition Algorithm
result := {R};
done := false;
compute F+;
while (not done) do
    if (there is an Ri in result that is not in BCNF)
    then begin
        let α → β be a nontrivial functional
        dependency that holds on Ri
        such that α → Ri is not in F+ (i.e., α is not a superkey),
        and α ∩ β = ∅;
        result := (result - Ri) ∪ (Ri - β) ∪ (α, β);
    end
    else done := true;
Note: each Ri is in BCNF, and the decomposition is Lossless-Join but not always
Dependency Preserving
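The decomposition loop can be sketched without materializing F+: for a table Ri, try each subset α of its attributes and compute α+ restricted to Ri; if α determines something new but not all of Ri, it is a violating left-hand side. This brute-force version is exponential and meant only for small examples; the function names and the sample relation R(A, B, C) with A → B and B → C are assumptions of the illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(schema, fds):
    """Return (alpha, beta) for a BCNF violation inside schema, or None."""
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            alpha = set(combo)
            reach = closure(alpha, fds) & schema
            beta = reach - alpha                 # so alpha and beta are disjoint
            if beta and reach != schema:         # nontrivial, alpha not a superkey
                return alpha, beta
    return None


def bcnf_decompose(schema, fds):
    """Split schema until no table has a BCNF violation."""
    result = [set(schema)]
    done = False
    while not done:
        done = True
        for i, ri in enumerate(result):
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                # Replace Ri by (Ri - beta) and (alpha, beta).
                result[i:i + 1] = [ri - beta, alpha | beta]
                done = False
                break
    return result


F = [("A", "B"), ("B", "C")]
tables = bcnf_decompose("ABC", F)
print(tables)
```

Each split shares α between the two new tables and α is a key of (α, β), which is why every step stays lossless.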
[Figure: A Dependency Diagram]
[Figure: Second Normal Form (2NF) Conversion Results]
[Figure: Third Normal Form (3NF) Conversion Results]
[Figure: A Table That is in 3NF but not in BCNF]
[Figure: Decomposition to BCNF]
[Figure: Sample Data for a BCNF Conversion]
[Figure: Another BCNF Decomposition]
Fourth Normal Form (4NF)
A table is in fourth normal form (4NF) when both of the following are true:
It is in 3NF
It has no multiple sets of multivalued dependencies
4NF is largely academic if tables conform to the following two rules:
All attributes must be dependent on the primary key, but independent of
each other
No row contains two or more multivalued facts about an entity