
Unit 4: Data Normalization

Pratian Technologies (India) Pvt. Ltd.


www.pratian.com

Topics
Pitfalls in Relational Database Design
What is normalization?
First, Second and Third Normal Form

Copyright 2008 Pratian Technologies www.pratian.com

Pitfalls in Relational Database Design

Title              Publisher    Pub_City  Author   Aut_City
Internet Security  Pearson      Delhi     Navathe  Mumbai
DBMS               Pearson      Delhi     Navathe  Mumbai
Oracle             McGraw Hill  Delhi     Mukund   Delhi


Pitfalls in Relational Database Design

Spreadsheet design
Too much data
Compound fields
Missing keys
Bad keys
Missing relations
Unnecessary relationships
Incorrect relations
Duplicate field names
Cryptic field and table names
Missing or incorrect business rules


Decomposition
Consider a Relation - Lending


Decompose into two relations


Reconstruct the lending relation by performing a natural join on the two new schemas
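
The reconstruction step can be sketched in a few lines. The Lending relation from the slide is not reproduced here, so the rows and attribute names below (branch, customer, loan_no, amount) are hypothetical stand-ins, and the helper functions are illustrative rather than part of any library. The sketch compares a decomposition that shares a key attribute with one that shares only a non-key attribute.

```python
def natural_join(r, s):
    """Join two relations (lists of dicts) on every attribute name they share."""
    joined = []
    for a in r:
        for b in s:
            common = set(a) & set(b)
            if all(a[k] == b[k] for k in common):
                joined.append({**a, **b})
    return joined


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, t)) for t in sorted(seen)]


def as_set(rows):
    """Turn a list of dict rows into a set so relations can be compared."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical lending rows (not taken from the slides).
lending = [
    {"branch": "Downtown", "customer": "Jones", "loan_no": "L-17", "amount": 1000},
    {"branch": "Uptown", "customer": "Jones", "loan_no": "L-23", "amount": 2000},
]

# Decomposition 1: both parts share loan_no, which identifies a loan.
branch_loan = project(lending, ["loan_no", "branch"])
customer_loan = project(lending, ["loan_no", "customer", "amount"])
lossless = natural_join(branch_loan, customer_loan)

# Decomposition 2: the parts share only customer, which does not identify a loan.
branch_customer = project(lending, ["branch", "customer"])
loan_by_customer = project(lending, ["customer", "loan_no", "amount"])
lossy = natural_join(branch_customer, loan_by_customer)

print(as_set(lossless) == as_set(lending))  # the join rebuilds the original rows
print(len(lossy), "rows after the second join, including spurious ones")
```

In this sample the first join returns exactly the original rows, while the second returns extra rows that were never in the relation. A decomposition is only safe when the natural join gives back the original relation and nothing more.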


What is Normalisation?
Normalisation is the process of transforming data from a problem into relations while ensuring data integrity and eliminating data redundancy.
Normalisation should remove redundancy, but not at the expense of data integrity.


Normal Forms
First Normal Form
Second Normal Form
Third Normal Form
Boyce Codd Normal Form
Fourth Normal Form
Fifth Normal Form


First Normal Form


A relation is in 1NF if, and only if, it contains no repeating attributes or groups of attributes.
To remove the repeating group, either:
Flatten the table and extend the key, or
Decompose the relation, leading to First Normal Form


What is a Repeating Group?

Figure 4.2 is inserted here.

A repeating group (for example, any project can have a group of data entries) should not appear in a relational table.


Example
matric_no  name      date_of_birth  subject    grade
960100     Smith, J  14/11/1977     Databases  C
                                    Soft_Dev   A
                                    ISDE       D
960105     White, A  10/05/1975     Soft_Dev   B
                                    ISDE       B
960120     Moore, T  11/03/1970     Databases  A
                                    Soft_Dev   B
                                    Workshop   C
960145     Smith, J  09/01/1972     Databases  B
960150     Black, D  21/08/1973     Databases  B
                                    Soft_Dev   D
                                    ISDE       C
                                    Workshop   D


Student(matric_no, name, date_of_birth, ( subject, grade ) )
name, date_of_birth -> matric_no

Flattened Tables
matric_no  name      date_of_birth  subject    grade
960100     Smith, J  14/11/1977     Databases  C
960100     Smith, J  14/11/1977     Soft_Dev   A
960100     Smith, J  14/11/1977     ISDE       D
960105     White, A  10/05/1975     Soft_Dev   B
960105     White, A  10/05/1975     ISDE       B
960120     Moore, T  11/03/1970     Databases  A
960120     Moore, T  11/03/1970     Soft_Dev   B
960120     Moore, T  11/03/1970     Workshop   C
960145     Smith, J  09/01/1972     Databases  B
960150     Black, D  21/08/1973     Databases  B
960150     Black, D  21/08/1973     Soft_Dev   D
960150     Black, D  21/08/1973     ISDE       C
960150     Black, D  21/08/1973     Workshop   B
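
Flattening can be pictured as copying the student's details onto every (subject, grade) entry. The sketch below is a minimal illustration using the first two students from the example; the field name "results" for the repeating group is an assumption made for this sketch, not something the slides define.

```python
# Unnormalised records: each student carries a repeating group of (subject, grade).
# "results" is a hypothetical field name used only for this sketch.
students = [
    {"matric_no": "960100", "name": "Smith, J", "date_of_birth": "14/11/1977",
     "results": [("Databases", "C"), ("Soft_Dev", "A"), ("ISDE", "D")]},
    {"matric_no": "960105", "name": "White, A", "date_of_birth": "10/05/1975",
     "results": [("Soft_Dev", "B"), ("ISDE", "B")]},
]


def flatten(records):
    """Emit one flat row per (student, subject) pair."""
    flat = []
    for rec in records:
        for subject, grade in rec["results"]:
            flat.append((rec["matric_no"], rec["name"],
                         rec["date_of_birth"], subject, grade))
    return flat


flat_rows = flatten(students)

# matric_no alone no longer identifies a row, so the key is extended
# to the pair (matric_no, subject).
extended_keys = [(row[0], row[3]) for row in flat_rows]

for row in flat_rows:
    print(row)
```

The student details are now repeated on every row, which is the redundancy that the later normal forms remove.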

Decomposing the relation


Record
matric_no  subject    grade
960100     Databases  C
960100     Soft_Dev   A
960100     ISDE       D
960105     Soft_Dev   B
960105     ISDE       B
...        ...        ...
960150     Workshop   B

Student
matric_no  name     date_of_birth
960100     Smith,J  14/11/1977
960105     White,A  10/05/1975
960120     Moore,T  11/03/1970
960145     Smith,J  09/01/1972
960150     Black,D  21/08/1973


Relations
We now have two relations, Student and Record.
Student(matric_no, name, date_of_birth)
Record(matric_no, subject, grade)
Without repeating groups, we say the relations are in First Normal Form (1NF).
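
As a rough illustration, the two relations could be declared in SQLite through Python's built-in sqlite3 module. The column types, the foreign key, and the choice of (matric_no, subject) as the key of Record are assumptions made for this sketch; only the first two students from the example are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE student (
    matric_no     TEXT PRIMARY KEY,
    name          TEXT NOT NULL,
    date_of_birth TEXT NOT NULL
);
CREATE TABLE record (
    matric_no TEXT NOT NULL REFERENCES student(matric_no),
    subject   TEXT NOT NULL,
    grade     TEXT NOT NULL,
    PRIMARY KEY (matric_no, subject)
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    ("960100", "Smith, J", "14/11/1977"),
    ("960105", "White, A", "10/05/1975"),
])
conn.executemany("INSERT INTO record VALUES (?, ?, ?)", [
    ("960100", "Databases", "C"),
    ("960100", "Soft_Dev", "A"),
    ("960100", "ISDE", "D"),
    ("960105", "Soft_Dev", "B"),
    ("960105", "ISDE", "B"),
])

# A grade for a student who does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO record VALUES ('999999', 'Databases', 'A')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Joining the two relations gives back the flattened view.
rows = conn.execute("""
    SELECT s.matric_no, s.name, s.date_of_birth, r.subject, r.grade
    FROM student AS s
    JOIN record AS r ON r.matric_no = s.matric_no
    ORDER BY s.matric_no, r.subject
""").fetchall()

for row in rows:
    print(row)
```

Each student's name and date of birth are stored once, and the join recovers the flat table whenever it is needed.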


Second Normal Form


A relation is in 2NF if, and only if, it is in 1NF and every non-key attribute is fully functionally dependent on the whole key. Another way of saying this is that there must be no partial key dependencies (PKDs).


Example 1
Consider again the flattened Student relation:

matric_no  name      date_of_birth  subject    grade
960100     Smith, J  14/11/1977     Databases  C
960100     Smith, J  14/11/1977     Soft_Dev   A
960100     Smith, J  14/11/1977     ISDE       D
960105     White, A  10/05/1975     Soft_Dev   B
960105     White, A  10/05/1975     ISDE       B
960120     Moore, T  11/03/1970     Databases  A
960120     Moore, T  11/03/1970     Soft_Dev   B
960120     Moore, T  11/03/1970     Workshop   C
960145     Smith, J  09/01/1972     Databases  B
960150     Black, D  21/08/1973     Databases  B
960150     Black, D  21/08/1973     Soft_Dev   D
960150     Black, D  21/08/1973     ISDE       C
960150     Black, D  21/08/1973     Workshop   B


Dependency Diagram
A dependency diagram is used to show how non-key attributes relate to each part or combination of parts in the primary key.

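Student(matric_no, name, date_of_birth, subject, grade)
Primary key: (matric_no, subject)

PKD: matric_no -> name, date_of_birth
Fully dependent: matric_no, subject -> grade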


Student Details
matric_no name date_of_birth

Student
matric_no subject grade

All attributes in each relation are fully functionally dependent upon its primary key.
These relations are now in 2NF.
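
A partial key dependency can be spotted by testing candidate functional dependencies against sample rows. The helper below, holds, is a hypothetical function written for this sketch, and it runs on a few rows of the flattened Student table. A dependency that holds in sample data is only evidence: whether it is a real rule depends on the meaning of the data.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # The first row with a given left-hand value fixes the right-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


# A few rows from the flattened Student table.
rows = [
    {"matric_no": "960100", "name": "Smith, J", "subject": "Databases", "grade": "C"},
    {"matric_no": "960100", "name": "Smith, J", "subject": "Soft_Dev", "grade": "A"},
    {"matric_no": "960100", "name": "Smith, J", "subject": "ISDE", "grade": "D"},
    {"matric_no": "960105", "name": "White, A", "subject": "Soft_Dev", "grade": "B"},
    {"matric_no": "960105", "name": "White, A", "subject": "ISDE", "grade": "B"},
]

# name depends on part of the key only: a partial key dependency.
name_on_part_of_key = holds(rows, ["matric_no"], ["name"])

# grade needs the whole key.
grade_on_matric_only = holds(rows, ["matric_no"], ["grade"])
grade_on_whole_key = holds(rows, ["matric_no", "subject"], ["grade"])

print(name_on_part_of_key, grade_on_matric_only, grade_on_whole_key)
```

Attributes that pass the test on part of the key move into their own relation keyed by that part, which is the split shown above.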


Third Normal Form


Every column must depend directly on the primary key.
Remove columns that are not directly dependent upon the primary key.
A relation is in 3NF if, and only if, it is in 2NF and there are no transitive functional dependencies.


Example
project_no  manager  address
p1          Black,B  32 High Street
p2          Smith,J  11 New Street
p3          Black,B  32 High Street
p4          Black,B  32 High Street

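Project has more than one non-key field, so we must check for transitive dependency. Here address depends on manager, and manager depends on project_no, so address depends on the key only transitively.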


Project
project_no  manager
p1          Black,B
p2          Smith,J
p3          Black,B
p4          Black,B

Manager
manager  address
Black,B  32 High Street
Smith,J  11 New Street
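
The effect of removing the transitive dependency can be sketched with plain dictionaries. The rows are the Project example above; the variable names and the new address used in the update are made up for illustration.

```python
# The original Project rows: (project_no, manager, address).
project_flat = [
    ("p1", "Black,B", "32 High Street"),
    ("p2", "Smith,J", "11 New Street"),
    ("p3", "Black,B", "32 High Street"),
    ("p4", "Black,B", "32 High Street"),
]

# Project(project_no, manager) and Manager(manager, address).
project = {p: m for p, m, _ in project_flat}
manager = {m: a for _, m, a in project_flat}

# Joining on manager rebuilds the original rows.
rebuilt = [(p, m, manager[m]) for p, m in sorted(project.items())]

# A change of address is now a single update instead of one per project.
# "1 Example Road" is a made-up value.
moved = {**manager, "Black,B": "1 Example Road"}
after_move = [(p, m, moved[m]) for p, m in sorted(project.items())]

print(rebuilt == project_flat)
print(after_move)
```

In the unnormalised table the same change would have to be applied to three rows, and missing one would leave the data inconsistent.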


Summary: 1NF
A relation is in 1NF if it contains no repeating groups.


Summary: 2NF
A relation is in 2NF if it contains no repeating groups and no partial key functional dependencies.


Summary: 3NF
A relation is in 3NF if it contains no repeating groups, no partial key functional dependencies, and no transitive functional dependencies.


A Sample Report Layout


Table 4.1 should be here.

Convert to First Normal Form



Data Organization: 1NF


Figure 4.3 (the primary key attributes are marked PK)


Dependency Diagram (1NF)


Above: Desired Dependencies
Composite primary key
Below: Less Desired Dependencies

Convert to Second Normal Form



2NF Conversion Results


Figure 4.5

Convert to Third Normal Form



Third Normal Form (3NF) Conversion Results


Question time
Please try to limit the questions to the topics discussed during the session. Thank you.


Time to check our understanding


Normalize the following table


Class Enrolment

Class Code  Class Description  Student Number  Name
503         Mgt Info Systems   00001           Masters, Rick
                               00003           Smith, Steve
                               00005           Jones, Terry
540         Quant Methods      00002           Wallace, Fred
                               00003           Smith, Steve
                               00004           Nurk, Sterling


Solution
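R1(ClassCode, ClassDescription)
R2(ClassCode, StudentNumber, Name)

Removing the repeating group gives these two relations, which are in 1NF. In R2 the key is (ClassCode, StudentNumber), but Name depends on StudentNumber alone, so R2 still contains a partial key dependency. Splitting R2 completes the normalisation:

R2a(StudentNumber, Name)
R2b(ClassCode, StudentNumber)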
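
The decomposition can be checked against the sample data. The sketch below flattens the Class Enrolment table and projects it onto three relations, one for classes, one for students and one for enrolments. Student details are split out as well because, in this sample, each student number maps to a single name; the variable names are illustrative.

```python
# Flattened Class Enrolment rows:
# (class_code, class_description, student_number, name).
enrolment_flat = [
    ("503", "Mgt Info Systems", "00001", "Masters, Rick"),
    ("503", "Mgt Info Systems", "00003", "Smith, Steve"),
    ("503", "Mgt Info Systems", "00005", "Jones, Terry"),
    ("540", "Quant Methods", "00002", "Wallace, Fred"),
    ("540", "Quant Methods", "00003", "Smith, Steve"),
    ("540", "Quant Methods", "00004", "Nurk, Sterling"),
]

# Project onto each relation; building a set removes the duplicates.
classes = sorted({(c, d) for c, d, _, _ in enrolment_flat})
students = sorted({(s, n) for _, _, s, n in enrolment_flat})
enrolled = sorted({(c, s) for c, _, s, _ in enrolment_flat})

# Joining the three relations back together rebuilds the flat rows.
class_desc = dict(classes)
student_name = dict(students)
rebuilt = [(c, class_desc[c], s, student_name[s]) for c, s in enrolled]

print(classes)
print(students)
print(enrolled)
```

Student 00003 is enrolled in both classes but appears only once in the student relation, so a change of name would be made in one place.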

