
Normal Forms

First Normal Form (1NF)


• Objectives of 1NF
- The schema of an unorganized relation gives no clues to
which attributes can have multiple values.
- The semantics of a 1NF relation are more explicit.
- The relational operators are applicable only to flat, that
is, 1NF relations.
• Problems:
All the anomalies discussed previously
• Update Anomalies
• Deletion Anomalies
• Insertion Anomalies
Full Functional Dependency
Definition:
In a relation R, attribute B of R is “fully functionally
dependent” on an attribute or set of attributes A of R if B is
functionally dependent on A but not functionally dependent
on any proper subset of A.
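
The definition can be tried out on a small relation instance. The following is a minimal sketch, not a definitive test: an instance can only show that a dependency is not violated by the data at hand. The attribute names (`course`, `stuid`, `stuname`, `grade`) and the rows are hypothetical, and only non-empty proper subsets of the determinant are tried.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, val) != val:
            return False
    return True

def fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False
    return True

# Hypothetical instance: the grade depends on student and course together,
# while the student's name depends on the student alone.
rows = [
    {"course": "DB101", "stuid": "S1", "stuname": "Ann", "grade": "A"},
    {"course": "DB101", "stuid": "S2", "stuname": "Bob", "grade": "B"},
    {"course": "OS201", "stuid": "S1", "stuname": "Ann", "grade": "C"},
]
full_grade = fully_dependent(rows, ("course", "stuid"), ("grade",))
full_name = fully_dependent(rows, ("course", "stuid"), ("stuname",))
```

Here `full_grade` is true, while `full_name` is false because `stuid` alone already determines `stuname`.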
Example:
Let's consider the following relation:

FD’s:
Second Normal Form (2NF)
Definition:
A relation is in second normal form (2NF) if and only if it
is in 1NF and all nonkey attributes are fully functionally
dependent on the key.

Clearly, if the relation is in 1NF and the key consists of a
single attribute, the relation is automatically in 2NF.

For example the previous relation is not in 2NF.


The class relation is not in 2NF.

• Intuitively, a relation might not be in 2NF if it is trying to
describe information for more than one entity
(i.e., through a many-to-many relationship).
- In the previous example: the student and course entities.
- In EMP_PROJ: information about projects and
employees.
Transforming a 1NF relation into a 2NF relation:
1. Identify each non-full functional dependency.
2. Form projections by removing the attributes that depend
on each of the determinants so identified.
3. Place these determinants in separate relations along
with their dependent attributes.
4. The original relation still contains the composite key
and any attributes that are fully functionally dependent on
it.

This type of projection is called a “lossless projection”
because the original relation can be reconstructed by taking
the natural join of the resulting projections.
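
The four steps and the reconstruction by natural join can be sketched on data. This is an illustration under stated assumptions: the relation `class_rel`, its attribute names, and its rows are hypothetical, with key (`course`, `stuid`) and `stuname` depending on `stuid` alone.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result

def natural_join(left, right):
    """Natural join on all attributes the two relations share."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out

# Hypothetical 1NF relation, key (course, stuid); stuname depends on stuid only.
class_rel = [
    {"course": "DB101", "stuid": "S1", "stuname": "Ann", "grade": "A"},
    {"course": "DB101", "stuid": "S2", "stuname": "Bob", "grade": "B"},
    {"course": "OS201", "stuid": "S1", "stuname": "Ann", "grade": "C"},
]
# Steps 2-3: move the partially dependent attribute out with its determinant.
student = project(class_rel, ["stuid", "stuname"])
# Step 4: the original keeps the composite key and the fully dependent attribute.
enroll = project(class_rel, ["course", "stuid", "grade"])
# The natural join of the two projections rebuilds the original relation.
rebuilt = natural_join(enroll, student)
```

The student name is now stored once per student rather than once per enrollment, and `rebuilt` contains exactly the rows of `class_rel`.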
Example:

FD’s:

2NF
• Instance of the previous relations.
Example:
Objectives of 2NF:

• The semantics of a 2NF relation are more explicit: all the
nonkey attributes are dependent on the entire primary key.
• Databases designed with 2NF relations avoid
undesirable update anomalies present in 1NF
relations.
• The schema of a 1NF relation gives no clue as to which
attributes are dependent on which other attributes.
- Knowing that a relation is in 2NF means that no
nonkey attribute is dependent on only part of the key.
REVIEW:
• Transitive Dependency (Armstrong's 3rd axiom)
- Let's consider the following relation:

• FD’s:
• STUID functionally determines STATUS in two
ways: directly, and transitively through CREDITS.

• So the attribute STATUS is said to be transitively
dependent on the attribute STUID.
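
Transitivity can be made concrete by computing an attribute closure. The sketch below assumes only the chain STUID → CREDITS → STATUS stated above; the function name `closure` and the representation of FDs as pairs of frozensets are choices made for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already derivable, so are its dependents.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Assumed FDs: only the chain STUID -> CREDITS -> STATUS from the text.
fds = [
    (frozenset({"STUID"}), frozenset({"CREDITS"})),
    (frozenset({"CREDITS"}), frozenset({"STATUS"})),
]
stuid_closure = closure({"STUID"}, fds)
credits_closure = closure({"CREDITS"}, fds)
```

STATUS ends up in `stuid_closure` even though no FD in the list has STUID as the determinant of STATUS, which is exactly the transitive dependency.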
Third Normal Form (3NF)
Definition:
A relation is in third normal form (3NF) if and only if
- it is in 2NF and
- no nonkey attribute is “transitively dependent” on the
key.
Example:
- the following relation is in 2NF but not in 3NF.

- because the nonkey attribute STATUS is transitively
dependent on the key, STUID.
• Clearly a 2NF relation with one nonkey attribute must
always be a 3NF relation.
• Transforming a 2NF relation into a 3NF relation:
1. Look to see whether any nonkey attribute is functionally
dependent on another nonkey attribute.
2. Remove the functionally dependent attribute from the
relation, placing it in a new relation with its determinant.
3. The determinant can remain in the original relation.
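
At the schema level, the three steps can be sketched as follows. This is a simplified illustration, assuming a single candidate key and one pass over the FDs; the function name `to_3nf` and the attribute STUNAME are hypothetical, while STUID, CREDITS and STATUS come from the example above.

```python
def to_3nf(attributes, key, fds):
    """Split off each FD whose determinant contains no key attribute.

    fds is a list of (lhs, rhs) pairs of attribute-name tuples.
    Returns (remaining_schema, list_of_new_schemas).
    """
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in fds:
        if not set(lhs) & set(key):        # step 1: determinant is nonkey
            for a in rhs:                  # step 2: move the dependents out
                if a in remaining:
                    remaining.remove(a)
            new_relations.append(list(lhs) + list(rhs))
            # step 3: lhs is left in `remaining` to link the two relations
    return remaining, new_relations

attributes = ["STUID", "STUNAME", "CREDITS", "STATUS"]
fds = [
    (("STUID",), ("STUNAME", "CREDITS")),
    (("CREDITS",), ("STATUS",)),
]
student, others = to_3nf(attributes, ("STUID",), fds)
```

The result keeps CREDITS in the original relation and places STATUS with its determinant in a new relation.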

• Example:

3NF
Example:

• Intuitively, we see that ED1 and ED2 represent independent
entity facts about employees and departments.
• The NATURAL JOIN operation on ED1 and ED2 will
recover the original relation EMP_DEPT without generating
spurious tuples.
Objectives of 3NF:

• The semantics of a 3NF relation are more explicit: all the
nonkey attributes are dependent ONLY on the primary key.
• Databases designed with 3NF relations avoid
undesirable update anomalies present in 2NF
relations.
• The schema of a 2NF relation gives no clue as to which
nonkey attributes are dependent on which other
nonkey attributes.
- Knowing that a relation is in 3NF means that no
nonkey attribute is dependent on another nonkey attribute.
BOYCE-CODD Normal form (BCNF)
• The definition of 3NF given so far assumes relations that
have a single candidate key.
• It was found to be deficient in cases where
there are:
- multiple candidate keys
- composite candidate keys
Example:

Constraints:
1. No two faculty members within a single department have the
same name.
2. Each faculty member has only one office.
3. A department may have several faculty offices.
4. Faculty members from the same department may share
offices.
Resulting FDs:

- 2NF?
- 3NF?
Definition:
A relation is in Boyce-Codd normal form (BCNF) if
and only if
- every determinant is a candidate key

Q: Is the previous relation in BCNF?
- No, because OFFICE is a determinant but not a candidate key.
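
The BCNF condition can be checked mechanically from a set of FDs. The sketch below is an assumption-laden illustration: the attribute names FACNAME and DEPT and the two FDs are inferred from the constraints listed for the faculty example, not taken from the original FD list, and the check tests whether each determinant is a superkey, which is the usual way the condition is applied.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if closure(lhs, fds) != set(attributes)]

# Assumed schema and FDs, inferred from the stated constraints.
attributes = {"FACNAME", "DEPT", "OFFICE"}
fds = [
    (frozenset({"FACNAME", "DEPT"}), frozenset({"OFFICE"})),
    (frozenset({"OFFICE"}), frozenset({"DEPT"})),
]
violations = bcnf_violations(attributes, fds)
```

Only OFFICE → DEPT is reported: OFFICE determines DEPT but does not determine FACNAME, so it is a determinant that is not a key.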

A relational schema in BCNF is:


Objectives of BCNF:
• The semantics of multiple candidate keys are more
explicit: all the attributes are dependent ONLY on the
candidate key.
• Databases designed with BCNF relations avoid
undesirable update anomalies present in 3NF
relations.
• In the previous example:
We cannot delete a faculty member from a
department without losing information about an
office (assuming he is the only occupant).
- That is because OFFICE is not a candidate key.
Example of Functional Dependencies and Normal
Forms.

• Consider the following universal relation that stores
information about projects in a large business.

Semantics:
1. Each project has a unique name, but names of
employees and managers are not unique.
2. Each project has one manager whose name is stored
in PROJMGR.
3. Many employees may be assigned to work on each
project, and an employee may be assigned to more
than one project.
4. HOURS tells the number of hours per week that a
particular employee is assigned to work on a
particular project.
5. BUDGET stores the amount budgeted for a project
and STARTDATE gives the starting date for the
project.
6. SALARY gives the annual salary of an employee.
7. EMPMGR gives the name of the employee’s
manager, who is not the same as the project
manager.
8. EMPDEPT gives the employee’s department.
Department names are unique. The employee’s
manager is the manager of the employee’s
department.
9. RATING gives the employee’s rating for a
particular project. The project manager assigns the
rating at the end of the employee’s work on the
project.
Solution
FD’s:

NORMAL FORMS:
1NF? With our composite key, each cell holds a single value, so
WORK is in 1NF.
2NF? No, because we have the following partial dependency.
We transform the relation into an equivalent set of 2NF
relations by projection, resulting in:

3NF?
PROJ and WORK1 are in 3NF but EMP is not because
we have a transitive dependency:

Our new set of 3NF relations is therefore

BCNF?
Yes since in each relation the only determinant is the
primary key.
The Normalization Process
• The process of finding a stable set of relations that is a
faithful model of the enterprise.

• Decomposition (top-down process)
- start with a universal relation
- identify functional dependencies
- use decomposition techniques to split the universal
relation into a set of smaller relations.
The previous example was based on the
decomposition approach.
Synthesis (bottom-up process)

• Begin with attributes and combine them into
related groups, using functional dependencies to
develop a set of normalized relations.
• A synthesis algorithm was developed by
Bernstein.
• Basic steps:
- make a list of all FDs
- group together those with the same determinant
- construct a relation for each group.
Problems:
1. Some FDs have more attributes in the determinant
than needed.
- We must eliminate extraneous attributes or 2NF
relations might not result.
2. Redundant FDs must be eliminated before grouping, or
3NF relations will not result.
3. Two relations may appear to have different keys
when in fact the keys are equivalent.
Improved Algorithm:
1. Make a list of all FDs.
2. Eliminate extraneous attributes in each FD.
3. Remove any redundant FDs to find a nonredundant
covering of the input FDs.
4. Group together those with the same determinant.
- Combine FD groups with equivalent keys.
5. Construct a relation for each group.
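
The improved algorithm can be sketched in a few functions. This is a simplified illustration, not Bernstein's full algorithm: it omits the merging of groups with equivalent keys, and the FDs over attributes A, B, C, D are hypothetical, chosen so that one FD has an extraneous determinant attribute and one FD is redundant.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: list the FDs, one dependent attribute per FD.
    work = [(frozenset(l), frozenset({a})) for l, r in fds for a in r]
    # Step 2: drop extraneous attributes from each determinant.
    reduced = []
    for lhs, rhs in work:
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and rhs <= closure(smaller, work):
                lhs = smaller
        reduced.append((lhs, rhs))
    # Step 3: drop FDs that the remaining FDs already imply.
    cover = list(dict.fromkeys(reduced))   # removes exact duplicates, keeps order
    for fd in list(cover):
        rest = [f for f in cover if f != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest
    return cover

def synthesize(fds):
    groups = {}
    for lhs, rhs in minimal_cover(fds):    # step 4: group by determinant
        groups.setdefault(lhs, set()).update(rhs)
    # Step 5: one relation per group, returned as (key, all attributes).
    return [(set(k), set(k) | v) for k, v in groups.items()]

# Hypothetical input: A -> C is redundant, and B is extraneous in AB -> D.
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"}), ({"A", "B"}, {"D"})]
relations = synthesize(fds)
```

For this input the sketch produces two relations: one with key A and attributes A, B, D, and one with key B and attributes B, C.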
Example: Consider the following set of FDs.

Question: Using the synthesis approach, construct a set of 3NF
relations.
Multivalued Dependencies
Consider the following relation:

Assume that:
1. A faculty member can belong to more than one
department.
2. A faculty member can belong to several college-wide
committees.
3. There is no relation between department and
committee.
Consider the following figure.
• The resulting relation is in BCNF, but we still have update,
insertion, and deletion anomalies, e.g.
– Changing a committee that F101 belongs to from Budget to Advancement.
• A faculty member is not associated with only one department, but
with a particular set of departments and a particular
set of committees that are independent of each other.
– This independence is the cause of the problem.
Definition:

• Let R be a relation having attributes or sets of attributes A, B and
C. There is a “multivalued dependency” of attribute B on
attribute A if and only if:
• The set of B values associated with a given A value is independent of the
C values.

• This definition makes the following true:
– Consider two values of C: C1 and C2. The set of values of B in rows of
R with a given value of A and with C-value C1 must be exactly the same
as the set of values of B in rows of R with that same A-value and with
C-value C2.
• Unlike the rules for functional dependencies, which make
certain tuples illegal, multivalued dependencies make certain
tuples essential in a relation.
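
The definition translates directly into a check on a relation instance. The sketch below handles single attributes A, B and C only; the attribute names `facid`, `dept`, `committee` and the department and Curriculum values are hypothetical, while F101, Budget and Advancement come from the example above.

```python
from collections import defaultdict

def mvd_holds(rows, a, b, c):
    """Check the multivalued dependency of b on a, with c the remaining attribute."""
    b_sets = defaultdict(set)          # (A-value, C-value) -> set of B-values
    for row in rows:
        b_sets[(row[a], row[c])].add(row[b])
    by_a = defaultdict(list)           # A-value -> list of those B-value sets
    for (a_val, _), bs in b_sets.items():
        by_a[a_val].append(bs)
    # For each A-value, every C-value must see exactly the same set of B-values.
    return all(bs == group[0] for group in by_a.values() for bs in group)

faculty = [
    {"facid": "F101", "dept": "CSC", "committee": "Budget"},
    {"facid": "F101", "dept": "Math", "committee": "Budget"},
    {"facid": "F101", "dept": "CSC", "committee": "Curriculum"},
    {"facid": "F101", "dept": "Math", "committee": "Curriculum"},
]
ok_before = mvd_holds(faculty, "facid", "dept", "committee")

# Changing the committee in only one row breaks the dependency.
changed = [dict(r) for r in faculty]
changed[0]["committee"] = "Advancement"
ok_after = mvd_holds(changed, "facid", "dept", "committee")
```

The second check fails because the update was applied to one row only, which is the update anomaly described earlier: the change has to be made in every row that pairs F101 with Budget.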
Fourth Normal Form (4NF)
Definition:
A relation is in 4NF <==> it is in BCNF and there are no nontrivial
multivalued dependencies.
• The dependency A →→ B is called a trivial multivalued
dependency if B is a subset of A or A ∪ B = R.
Example:
The faculty relation is not in 4NF because of the nontrivial
multivalued dependencies:

4NF
Objectives of 4NF:
• The semantics are more explicit:
- all dependencies are related.
• Databases designed with 4NF relations avoid undesirable
update anomalies present in BCNF relations.
- In the previous example: We cannot drop a faculty member
from a committee without losing information about the
faculty member (assuming he belongs to only one committee).
• The schema of a BCNF relation gives no clue as to whether
there are multivalued dependencies among the primary
key’s components, nor is it clear which components of the
primary key are independent of one another.
- Knowing that a relation is in 4NF means that no
component of the key is independent of any other
component.
Lossless Decomposition
Definition:
A decomposition of a relation R is a set of relations
{R1, R2, …, Rn} such that:
- each Ri is a subset of R ( Ri ⊆ R )
- the union of the Ri is R ( ∪ Ri = R )

Definition of “Lossless Decomposition”:
A decomposition {R1, R2, …, Rn} of a relation R is
called a “lossless decomposition” for R if the natural
join of R1, R2, …, Rn produces exactly the relation R.

• Not every decomposition is lossless.


Example: Consider the relation

and the following decomposition which is not lossless


• We can guarantee that the decomposition is lossless if:
– For each pair of relations that will be joined, the set of common attributes
is a determinant of one of the relations.
• We can do this by placing functionally dependent attributes in a
relation with their determinant and keeping the determinants
themselves in the original relation.
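
A decomposition that ignores this rule can be seen to produce spurious tuples. The sketch below uses a hypothetical relation with attributes `emp`, `role` and `proj` in which the common attribute `role` determines neither of the other two; relations are represented as sets of frozensets of (attribute, value) pairs.

```python
def project(rows, attrs):
    """Projection of a list of dicts, returned as a set of frozensets of pairs."""
    return {frozenset((a, r[a]) for a in attrs) for r in rows}

def natural_join(left, right):
    """Natural join of two relations in the set-of-frozensets representation."""
    out = set()
    for l in left:
        for r in right:
            merged = dict(l)
            # setdefault keeps an existing value, so shared attributes must match
            if all(merged.setdefault(a, v) == v for a, v in r):
                out.add(frozenset(merged.items()))
    return out

# Hypothetical relation: role is shared, but determines neither emp nor proj.
rows = [
    {"emp": "Ann", "role": "Dev", "proj": "P1"},
    {"emp": "Bob", "role": "Dev", "proj": "P2"},
]
original = project(rows, ["emp", "role", "proj"])
r1 = project(rows, ["emp", "role"])
r2 = project(rows, ["role", "proj"])
joined = natural_join(r1, r2)
spurious = joined - original
```

The join returns four tuples instead of two; the extra ones pair Ann with P2 and Bob with P1, so the original assignments can no longer be told apart from the invented ones.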
Formal Definition:

• If R is decomposed into two relations {R1, R2}, the join is
lossless <==> either of the following holds in the closure of the
set of FD’s for R:
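
A minimal sketch of the two-relation test, assuming its usual textbook form: the common attributes R1 ∩ R2 functionally determine either R1 − R2 or R2 − R1. The function names and the STUID, CREDITS, STATUS dependencies reused from the 3NF example are choices made for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """True if the common attributes determine R1 - R2 or R2 - R1."""
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined

fds = [
    (frozenset({"STUID"}), frozenset({"CREDITS"})),
    (frozenset({"CREDITS"}), frozenset({"STATUS"})),
]
# Splitting on CREDITS, the determinant of STATUS, is lossless.
good = lossless_binary({"STUID", "CREDITS"}, {"CREDITS", "STATUS"}, fds)
# Splitting on STATUS, which determines nothing, is lossy.
bad = lossless_binary({"STUID", "STATUS"}, {"CREDITS", "STATUS"}, fds)
```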

• For a decomposition involving more than two relations, the
previous test cannot be used.
• Testing for “lossless decomposition”:
– Given a relation schema R (A1, A2, …, An), a set of functional
dependencies F and a decomposition p = {R1, R2, …, Rm}.
• The following algorithm can be used to test whether the
decomposition has a lossless join.

Steps:
1. Construct an m by n table S, with a column for each of the n
attributes in R and a row for each of the m relations in the
decomposition.
2. For each cell S(i,j) of S,
if the attribute for the column, Aj is in the relation
for the row, Ri, then
set S(i,j) = a(j)
else set S(i,j) = b(i,j)

3. Consider each FD X → Y ∈ F in turn, repeating until no more
changes can be made to S.
• Look for rows that agree on all the X-columns.
• Equate their Y-column symbols: if either symbol is an a(j),
set both to a(j); otherwise set both to the same b symbol.
4. If, after all possible changes have been made to S, a row
is made up entirely of symbols a(1), a(2), …, a(n), the
join is lossless. If there is no such row, the join is lossy.
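
The four steps can be sketched as follows. The symbols a(j) and b(i,j) are represented as the tuples `("a", j)` and `("b", i, j)`; the function name and the example schema with attributes A, B, C, D are hypothetical. When two symbols are equated, every occurrence of the replaced symbol in that column is rewritten, which is an implementation choice that keeps the table consistent.

```python
def lossless_join(attributes, decomposition, fds):
    """attributes: list of names; decomposition: list of sets; fds: (lhs, rhs) sets."""
    # Steps 1-2: one row per relation, a(j) where the attribute is present.
    table = [
        {a: ("a", j) if a in ri else ("b", i, j)
         for j, a in enumerate(attributes, start=1)}
        for i, ri in enumerate(decomposition, start=1)
    ]
    changed = True
    while changed:                         # step 3: repeat until S is stable
        changed = False
        for lhs, rhs in fds:
            for r1 in table:
                for r2 in table:
                    if r1 is r2 or any(r1[x] != r2[x] for x in lhs):
                        continue
                    for y in rhs:
                        if r1[y] != r2[y]:
                            # ("a", j) sorts before any ("b", i, j), so an
                            # a-symbol always wins when one is present.
                            new, old = sorted((r1[y], r2[y]))
                            for row in table:
                                if row[y] == old:
                                    row[y] = new
                            changed = True
    # Step 4: lossless if some row consists of a-symbols only.
    return any(
        all(row[a] == ("a", j) for j, a in enumerate(attributes, start=1))
        for row in table
    )

attrs = ["A", "B", "C", "D"]
parts = [{"A", "B"}, {"B", "C"}, {"C", "D"}]
lossless_3 = lossless_join(attrs, parts, [({"B"}, {"C"}), ({"C"}, {"D"})])
lossy_3 = lossless_join(attrs, parts, [({"C"}, {"D"})])
```

With both B → C and C → D, the first row becomes all a-symbols, so the three-way join is lossless; with C → D alone no such row appears.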
Fifth Normal Form
Definition:
A relation is in 5NF if no remaining nonloss
projections are possible, except the trivial one in
which the key appears in each projection.
Definition:
A decomposition p preserves a set of FD’s F if the
union of all FD’s that hold in the Ri implies all the dependencies
in F.
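
Dependency preservation can be tested without computing the projected FDs explicitly. The sketch below uses the common closure-based procedure: for each FD X → Y, grow X using only what each Ri can derive within its own attributes, then check that Y was reached. The FACNAME and DEPT attribute names and the faculty FDs are assumptions inferred from the BCNF example; the student FDs come from the 3NF example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def preserves(decomposition, fds):
    """True if every FD in fds is implied by the FDs that hold in the Ri."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                # What Ri alone can derive from the attributes reached so far.
                gained = closure(result & ri, fds) & ri
                if not gained <= result:
                    result |= gained
                    changed = True
        if not rhs <= result:
            return False
    return True

student_fds = [({"STUID"}, {"CREDITS"}), ({"CREDITS"}, {"STATUS"})]
kept = preserves([{"STUID", "CREDITS"}, {"CREDITS", "STATUS"}], student_fds)

faculty_fds = [({"FACNAME", "DEPT"}, {"OFFICE"}), ({"OFFICE"}, {"DEPT"})]
lost = preserves([{"OFFICE", "DEPT"}, {"FACNAME", "OFFICE"}], faculty_fds)
```

The 3NF decomposition of the student relation keeps both dependencies, whereas splitting the faculty relation on OFFICE loses the dependency of OFFICE on FACNAME and DEPT, since no single relation holds all three attributes.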
