V1.2.2

Relational Database Design

(Course Code CF18)

Student Notebook
ERC 2.0

IBM Learning Services


Worldwide Certified Material

Trademarks
IBM® is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:
DB2, DB2 Universal Database, RACF, z/OS, 400
Other company, product, and service names may be trademarks or service marks of
others.

February 2002 Edition

The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. The original
repository material for this course has been certified as being Year 2000 compliant.

© Copyright International Business Machines Corporation 2000, 2002. All rights reserved.
This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.

Contents
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

Course Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

Unit 1. Relational Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
1.1 Tables and Guidelines Relating to Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Components of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
Uniqueness of Rows and Columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Order of Rows and Columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
Linkage of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11
Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13

Unit 2. Views and Results During Database Design . . . . . . . . . . . . . . . . . . . . . . 2-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
2.1 Data Views, Steps, and Results During Design . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Data Views During Database Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
Conceptual View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Storage View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Logical View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
Design Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16

Unit 3. Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3.1 Problem Statement for Application Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
Purpose and Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Contents of Problem Statement (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Sample Problem Statement (1 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Sample Problem Statement (2 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
Sample Problem Statement (3 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
Contents of Problem Statement (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11
Sample Problem Statement (4 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
Sample Problem Statement (5 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
Contents of Problem Statement (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
Sample Problem Statement (6 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
Sample Problem Statement (7 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-19
Sample Problem Statement (8 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-21
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-23
Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26

© Copyright IBM Corp. 2000, 2002 Contents iii


Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.

Unit 4. Entity-Relationship Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-2
4.1 Entity Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
ER Model in Design Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-6
Entity Types, Entity Instances, Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-8
Properties of Entity Types and Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-10
Representation of Entity Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-13
Determining the Entity Types (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-15
Determining the Entity Types (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-17
Entity Types - A Piece of Advice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-19
Entity Types for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-21
4.2 Relationship Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-23
Relationship Types Between Entity Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-24
Relationship Types in ER Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26
Relationship Instance Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-29
Multiple Relationship Types for Entity Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-30
Unary Relationship Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-32
A Special Relationship Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-34
Relationship Types - Generalized Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-35
Relationship Type on Relationship Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-36
Relationship Type Versus Attribute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-38
Relationship Types for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-40
Cardinalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-41
Cardinalities (Example 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-44
Cardinalities (Example 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-46
Defining Attributes and Relationship Key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-48
Relationship Key (Example 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-50
Relationship Key (Example 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-51
Cardinalities for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-53
4.3 Dependent Entity Types, Supertypes, and Subtypes . . . . . . . . . . . . . . . . . . . 4-55
A First Correction of the CAB Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-56
Dependent Entity Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-58
Dependent Entity Types - Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-60
Nondefining Attributes for Relationship Types . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-62
Nondefining Attributes - Sample Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-65
Attributes for a Sample Relationship Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-66
Relationships on Owning Relationship Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-68
Controlling Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-69
Cascading Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-71
Controlling for Relationship Type Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-73
A Second Correction of the CAB Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-74
Supertype and Subtypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-76
Bundle Cardinalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-79
An Alternate Maintenance Record Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-82
ER Model for CAB Without Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-84
4.4 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-87
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-88

Constraints in ER Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-90
Constraints (Example 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-92
Constraints (Example 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-94
Constraints (Example 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-96
Constraints (Example 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-98
4.5 Splitting and Combining Entity-Relationship Models . . . . . . . . . . . . . . . . . . . 4-101
Subdivision of ER Model into Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-102
Pilot View of ER Model for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-104
Maintenance View of ER Model for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-106
Building an Enterprise-Wide ER Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-108
Problems During Consolidation of ER Models . . . . . . . . . . . . . . . . . . . . . . . . . . 4-110
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-112
Unit Summary (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-121
Unit Summary (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-122
Unit Summary (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-123

Unit 5. Data and Process Inventories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
5.1 Data Inventory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Data and Process Inventories in Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
Data Inventory - Purpose and Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
Contents of Data Inventory (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
Sample Data Types (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
Sample Data Types (2 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
Sample Data Types (3 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-15
Sample Data Types (4 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-16
Contents of Data Inventory (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17
Contents of Data Inventory (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
Sample Data Elements and Groups (1 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-22
Sample Data Elements and Groups (2 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-24
Sample Data Elements and Groups (3 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-25
Sample Data Elements and Groups (4 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-26
Sample Data Elements and Groups (5 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-27
Sample Data Elements and Groups (6 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28
Sample Data Elements and Groups (7 of 7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
Methods for Establishing a Data Inventory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
Survey of Departments of Expertise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-32
Review of Existing Data and Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-34
Coupling of Data and Process Inventories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-36
5.2 Process Inventory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-39
Process Inventory - Purpose and Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . 5-40
Contents for a Business Process (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-42
Contents for a Business Process (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-44
Sample Business Process (1 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-46
Sample Business Process (2 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-47
Sample Business Process (3 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-48
A Walk Through the ER Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-50

Sample Business Process (4 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-56
Sample Business Process (5 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-57
Process Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-58
Process Decomposition for CAB (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-60
Process Decomposition for CAB (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-62
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-63
Unit Summary (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-67
Unit Summary (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-68

Unit 6. Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-2
6.1 Establishing Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Tuple Types in Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4
Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-5
Characteristics of Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-7
Tuple Types for Entity Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-9
Tuple Types for Relationship Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-11
No Tuple Type for Relationship Type (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-13
No Tuple Type for Relationship Type (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-14
No Tuple Type for Relationship Type (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-16
Required Tuple Types for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-17
Documentation of Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-19
Tuple Types With Roles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-21
Some Sample Tuple Types for CAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-23
A Special Consideration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-25
6.2 Normalization of Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-27
Normalization - An Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-28
First Normal Form - Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-30
First Normal Form - Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-31
First Normal Form - Instance Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-33
First Normal Form - ER Model Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-34
First Normal Form - 2nd Example (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-35
First Normal Form - 2nd Example (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-37
Second Normal Form - Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-39
Second Normal Form - Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-41
Third Normal Form - Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-43
Third Normal Form - Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-45
Third Normal Form - Instance Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-47
Third Normal Form - ER Model Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-48
Third Normal Form in Multiple Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-50
3rd NF in Multiple Tuple Types (Alternative 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-51
3rd NF in Multiple Tuple Types (Alternative 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-53
Fourth Normal Form - Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-54
Fourth Normal Form - Sample Tuple Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-56
Fourth Normal Form - Instance Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-58
Fourth Normal Form - Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-60
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-62

Unit Summary (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-66
Unit Summary (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-67

Unit 7. From Tuple Types to Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.1 Combining and Splitting Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
Tables in Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
Tables for Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
Conversion of Tuple Types into Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
Problems With One-to-One Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
Merging Partial Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
Finding Partial Tuple Types from ER Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
Imbedding Detail Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
Finding Detail Tuple Types from ER Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20
Decomposition of Super Tuple Types (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
Decomposition of Super Tuple Types (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25
Combining Tuple Types - Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-26
Limitations and Consequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28
Denormalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
Vertical Splitting of Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33
Horizontal Splitting of Tuple Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
7.2 Physical Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-37
Built-In Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-38
Column Attributes - Nullable Columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-41
Nullable Columns and Cardinalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-43
Column Attributes - Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-45
Selection of Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-47
Considerations for Abstract Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-49
User Defined Distinct Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-51
User Defined Distinct Types - Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-53
User Defined Functions (UDFs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-55
UDFs - Definition and Invocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-57
Check Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-59
Check Constraints - Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-61
Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-62
Triggers - Some Additional Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-64
A Sample Abstract Data Type - Name Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-66
Setting Up the Abstract Data Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-67
INSERT Triggers for Abstract Data Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-69
UPDATE Triggers for Abstract Data Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-71
Abstract Data Type - Inserting and Updating . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-72
Abstract Data Type - Selecting Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-74
An Alternate Implementation (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-76
An Alternate Implementation (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-77
Token Translation Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-78
Token Translation Tables - An Alternative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-79
7.3 Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-81

Documenting User Defined Distinct Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-82
Documenting User Defined Functions (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-83
Documenting User Defined Functions (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-84
Documenting Check Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-85
Documenting Tables - Table Info (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-87
Documenting Tables - Table Info (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-88
Documenting Tables - Column Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-89
Documenting Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-90
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-91
Unit Summary (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-97
Unit Summary (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-98
Unit Summary (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-99

Unit 8. Integrity Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1


Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-2
8.1 Referential Integrity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
Integrity Rules in Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-4
Integrity - Areas of Concern and Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-5
Referential Integrity - Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-7
Referential Integrity - Insert Rules
Referential Integrity - Delete Rules
Referential Integrity - Update Rules
Delete Rules and ER Model (1 of 8)
Delete Rules and ER Model (2 of 8)
Delete Rules and ER Model (3 of 8)
Delete Rules and ER Model (4 of 8)
Delete Rules and ER Model (5 of 8)
Delete Rules and ER Model (6 of 8)
Delete Rules and ER Model (7 of 8)
Delete Rules and ER Model (8 of 8)
Delete Rules for an Imbed Case
Delete Connection
Referential Cycles
Definition of Referential Constraints
Referential Integrity - Documentation
Maintenance View - Updated ER Model
Referential Structure
Referential Structure - Constraint Summary
8.2 Other Types of Integrity
Domain Integrity
Redundancy Integrity
Violation of Normal Forms - Trigger
Derivable Data - Sample Triggers
Constraint Integrity
Constraint Integrity - Example 1
Constraint Integrity - Example 2 (1 of 2)
Constraint Integrity - Example 2 (2 of 2)

viii Relational DB Design © Copyright IBM Corp. 2000, 2002


Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
V1.2.2
Student Notebook

Constraint Integrity - Example 3 (1 of 2)
Constraint Integrity - Example 3 (2 of 2)
Checkpoint
Unit Summary

Unit 9. Indexes

Unit Objectives
9.1 Structure, Options, and Usage of Indexes
Indexes in Design Process
Purpose of an Index
Structure of Indexes
Searching Via an Index
Unique and Nonunique Indexes
Clustering Index
Clustering Index - First Insertion (1 of 2)
Clustering Index - First Insertion (2 of 2)
Clustering Index - Second Insertion (1 of 2)
Clustering Index - Second Insertion (2 of 2)
Partitioning Index
Use of Indexes
No Index for Leading Foreign Key
Indexes - Documentation
Checkpoint
Unit Summary

Unit 10. Logical Data Structures

Unit Objectives
10.1 Logical Data Structures
Logical Data Structures in Design Process
Logical Data Structures - Purpose
Logical Data Structures - Responsibilities
Sample Business Process
Sample Structure Diagram
Sample Path and Table Summaries
An Alternate Representation
Processes and Logical Data Structures
Example 2 - Business Process
Example 2 - Structure Diagrams
Example 2 - Path and Table Summaries (1 of 3)
Example 2 - Path and Table Summaries (2 of 3)
Example 2 - Path and Table Summaries (3 of 3)
Characteristics of Views
Usage of Views
Checkpoint
Unit Summary


Appendix A. Sample Problem Statement

Overview
Business Object Types
Business Relationship Types
Business Constraints

Appendix B. Checkpoint Solutions

Index


Trademarks
The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:
IBM® is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:
DB2, DB2 Universal Database, RACF, z/OS, 400
Other company, product, and service names may be trademarks or service marks of
others.


Course Description

Relational Database Design

Duration: 4.0 days

Purpose
This course presents a methodology for modeling and designing relational databases.

Audience
People responsible for designing relational databases and people who need an in-depth
understanding of data modeling.

Prerequisites
The course does not require any special prerequisites.

Objectives
After completing this course, you should be able to:
• Design relational databases.
• Consider logical and physical aspects including integrity requirements during the design.

Contents
This course covers the following major topics:
• Relational concepts
• Views and results during database design
• Problem statement
• Entity-relationship modeling
• Data and process inventories
• Tuple types
• From tuple types to tables
• Integrity rules
• Indexes
• Logical data structures and views


Agenda

Day 1
Relational Concepts
Views and Results During Design
Problem Statement
Exercises: Problem Statement
Review Exercises: Problem Statement
Entity-Relationship Model (Part 1)

Day 2
Entity-Relationship Model (Part 2)
Exercises: ER Model
Review Exercises: ER Model
Data and Process Inventories
Exercises: Data and Process Inventories

Day 3
Review Exercises: Data and Process Inventories
Tuple Types
Exercises: Tuple Types
Review Exercises: Tuple Types
From Tuple Types to Tables
Exercises: From Tuple Types to Tables

Day 4
Review Exercises: From Tuple Types to Tables
Integrity Rules
Exercises: Integrity Rules
Review Exercises: Integrity Rules
Indexes
Exercises: Indexes
Review Exercises: Indexes
Logical Data Structures


Unit 1. Relational Concepts

What This Unit Is About


This unit describes relational concepts important for designing
relational databases.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Identify the components of tables.
• Explain the rules defined by the relational data model regarding:
- The uniqueness of rows and columns
- The physical ordering of rows and columns
- The linkage of tables

How You Will Check Your Progress


Accountability:
• Checkpoint questions


Unit Objectives

After completion of this unit, you should be able to:

Identify the components of tables

Explain the guidelines defined by the relational data model pertaining to:
The uniqueness of rows and columns
The physical ordering of rows and columns
The linkage of tables

Figure 1-1. Unit Objectives

Notes:
The relational data model describes the conceptual representation of the data objects of
relational databases and gives guidelines for their implementation. In this unit, we will
discuss the main relational data object, the table, and some of the guidelines applicable to
the implementation of tables.
Conceptually, all data in relational databases is stored in tables. Also, when data is
presented to a user externally, it has the appearance of a table. Tables consist of rows and
columns as we will discuss in this unit.
In this unit, we will also discuss guidelines of the relational data model pertaining to the
uniqueness of rows and columns, the physical ordering of rows and columns, i.e., the
stored sequence of rows and columns, and the linkage of tables. The discussions will
emphasize the implications of these guidelines for the design and processing of tables.


1.1 Tables and Guidelines Relating to Tables


Components of Tables

AIRCRAFT_MODEL
TYPE  MODEL  CATEGORY  MANUFACTURER  ENGINES
A340  100    JET       AIRBUS        4
B737  500    JET       BOEING        2
B737  700    JET       BOEING        2
A320  200    JET       AIRBUS        2
B747  400    JET       BOEING        4

Callouts on the visual: COLUMN, ROW, FIELD, VALUE

Figure 1-2. Components of Tables

Notes:
Tables are the main data object described by the relational data model. Conceptually, all
data of relational databases is stored in tables. Also, all data returned to a user is
presented in the form of a table.
Structurally, as with tables in books or newspapers, a table is subdivided into rows and
columns. Horizontally, a table is subdivided into rows. The data stored in a row is logically
related and belongs to a single object, such as a person or an aircraft model. Conversely,
the data for a single object is stored in a single row.
You can compare rows to records in flat files for regular access methods or to segments in
hierarchical databases. From the access method's point of view, records are unstructured.
In contrast, from the database management system's perspective, rows are structured.
Their structure is determined by the columns of the table.
Vertically, a table is subdivided into columns. All data stored in a column has the same
semantic meaning and is of the same type. Columns have names. You can define the
name of a column and should choose it in such a way that it expresses the semantic
meaning of the column.


The columns of a table subdivide the rows of the table into fields. The fields are the actual
receptacles for the data stored in a table. All rows of a table are subdivided in the same
manner, i.e., have the same columns in the same order.
A field may or may not contain data. The data in a field is also referred to as the value of
the field or the value of the column for the appropriate row. From the relational database
management system's point of view, the data in a field is atomic and unstructured. This
means that, from the relational database management system's point of view, a field
contains a single value. This does not prevent the relational database management
system from offering (column) functions that allow you to further manipulate the data of a
column.
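
The components described above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for a relational database management system (the course itself does not prescribe one). The first five rows come from the visual; the row with type X999 is hypothetical and only serves to show an empty field. SQLite's substr function is used here merely as one example of a function that works on an otherwise atomic value.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")

# The columns give every row of the table the same structure.
conn.execute(
    """
    CREATE TABLE AIRCRAFT_MODEL (
        TYPE         TEXT,
        MODEL        TEXT,
        CATEGORY     TEXT,
        MANUFACTURER TEXT,
        ENGINES      INTEGER
    )
    """
)

# Each tuple becomes one row; the first five rows are taken from the visual.
conn.executemany(
    "INSERT INTO AIRCRAFT_MODEL VALUES (?, ?, ?, ?, ?)",
    [
        ("A340", "100", "JET", "AIRBUS", 4),
        ("B737", "500", "JET", "BOEING", 2),
        ("B737", "700", "JET", "BOEING", 2),
        ("A320", "200", "JET", "AIRBUS", 2),
        ("B747", "400", "JET", "BOEING", 4),
        # Hypothetical row (not in the visual) whose MANUFACTURER field is empty.
        ("X999", "001", "JET", None, 2),
    ],
)

# A field is the intersection of one row and one column.
engines_a340 = conn.execute(
    "SELECT ENGINES FROM AIRCRAFT_MODEL WHERE TYPE = 'A340' AND MODEL = '100'"
).fetchone()[0]

# A field without data is returned as None (SQL NULL).
empty_field = conn.execute(
    "SELECT MANUFACTURER FROM AIRCRAFT_MODEL WHERE TYPE = 'X999'"
).fetchone()[0]

# The DBMS treats the value as atomic, but a function can still work on it.
first_three = conn.execute(
    "SELECT substr(MANUFACTURER, 1, 3) FROM AIRCRAFT_MODEL WHERE TYPE = 'B747'"
).fetchone()[0]

print(engines_a340, empty_field, first_three)
```

The sketch addresses a single field by naming its column and selecting its row by column values, which is the only way the relational data model lets you reach it.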


Uniqueness of Rows and Columns

AIRCRAFT
SERIAL_NUMBER  ACQUIRED    ENGINE    ENGINE    ENGINE    ENGINE
B238725737     1994-07-21  P0102313  P0102314
B238768737     1997-05-12  R0942497  R0942498
B167029747     1992-10-20  G0015237  G0015240  G0025635  G0025678
A11599320      1994-02-19  R0307023  R0307025
A11599320      1994-02-19  R0307023  R0307025
A203623340     1996-08-01  R0346723  R0346724  R0346743  R0346744

Column names must be unique: the four ENGINE columns become ENGINE_1, ENGINE_2, ENGINE_3, ENGINE_4
Rows should be unique: the row for A11599320 appears twice

Figure 1-3. Uniqueness of Rows and Columns

Notes:
In contrast to records, which are always retrieved in their entirety, you need not retrieve all
columns of the rows. You can select particular columns by providing their names. You can
also change only selected columns of the rows of a table. For this reason, the relational
data model requires that all column names of a table be unique. Thus, in the example on
the visual, you cannot have four columns with the name ENGINE. If you need all four
columns, you must name them differently. In the example, the four columns have been
renamed to ENGINE_1, ENGINE_2, ENGINE_3, and ENGINE_4.
In many cases, if you have naming conflicts for columns of a table, the semantics of the
conflicting columns have not been defined sufficiently. By better defining the meaning of the
columns, you may find different, more meaningful names for the columns, as is the case for
the illustrated example. (The engines of an aircraft are generally referred to as Engine 1,
Engine 2, and so on.)
From a design point of view, the illustrated solution may not even be desirable.
What happens, for example, if new aircraft models are introduced whose aircraft have
more than four engines? This will be discussed later in the course.
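
As a small illustration of why column names must be unique, the following sketch again assumes Python's sqlite3 module as a stand-in database system. It shows that a table definition with two columns named ENGINE is rejected, and that uniquely named columns can be retrieved and changed selectively. The replacement engine number P0109999 is hypothetical and does not appear on the visual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two columns with the same name cannot be told apart, so SQLite rejects them.
try:
    conn.execute(
        "CREATE TABLE AIRCRAFT (SERIAL_NUMBER TEXT, ENGINE TEXT, ENGINE TEXT)"
    )
    duplicate_names_rejected = False
except sqlite3.OperationalError:
    duplicate_names_rejected = True

# With unique names, the table can be created.
conn.execute(
    """
    CREATE TABLE AIRCRAFT (
        SERIAL_NUMBER TEXT,
        ACQUIRED      TEXT,
        ENGINE_1      TEXT,
        ENGINE_2      TEXT,
        ENGINE_3      TEXT,
        ENGINE_4      TEXT
    )
    """
)
conn.execute(
    "INSERT INTO AIRCRAFT VALUES (?, ?, ?, ?, ?, ?)",
    ("B238725737", "1994-07-21", "P0102313", "P0102314", None, None),
)

# Retrieve only particular columns by naming them.
selected = conn.execute("SELECT SERIAL_NUMBER, ENGINE_2 FROM AIRCRAFT").fetchone()

# Change only one column of the row (hypothetical new engine number).
conn.execute(
    "UPDATE AIRCRAFT SET ENGINE_2 = ? WHERE SERIAL_NUMBER = ?",
    ("P0109999", "B238725737"),
)
after_update = conn.execute("SELECT ENGINE_1, ENGINE_2 FROM AIRCRAFT").fetchone()

print(duplicate_names_rejected, selected, after_update)
```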


In the same way that you can retrieve or update selected columns of a table, you can
retrieve, update, or delete specific rows of a table. To ensure that you can do this, the
relational data model recommends that all rows be unique, i.e., that no two rows contain
exactly the same data. Many relational database management systems do not enforce this
rule, but some do. Therefore, if your database design is to be
system-independent, you should make sure that the rows of your tables are unique. We will
see later in the course how you can achieve this.
There are also other design considerations that make it highly advisable to ensure
that all rows of a table are unique. A design should not be short-lived; it should
last. At this moment, duplicate rows in a table may be fine because you might
not intend to retrieve, update, or delete rows individually. However, your perception may
change as new applications are introduced.
Ask yourself why you might want multiple identical rows. If you only need them to
determine how often the event creating the rows occurred, you might be better off adding a
column to the table that counts the occurrences and removing the duplicates. This may reduce
the space required for your table and improve performance.
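
The idea of replacing identical rows by a counting column can be sketched as follows. The tables EVENT_LOG and EVENT_COUNT, their columns, and the event codes are hypothetical names chosen for this illustration; Python's sqlite3 module is assumed as the database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table that records every occurrence of an event as an identical row.
conn.execute("CREATE TABLE EVENT_LOG (EVENT_CODE TEXT)")
conn.executemany(
    "INSERT INTO EVENT_LOG VALUES (?)",
    [("E1",), ("E1",), ("E2",), ("E1",)],
)

# Alternative design: one row per event plus a column counting the occurrences.
conn.execute(
    "CREATE TABLE EVENT_COUNT (EVENT_CODE TEXT PRIMARY KEY, OCCURRENCES INTEGER NOT NULL)"
)
conn.execute(
    "INSERT INTO EVENT_COUNT "
    "SELECT EVENT_CODE, COUNT(*) FROM EVENT_LOG GROUP BY EVENT_CODE"
)

counts = conn.execute(
    "SELECT EVENT_CODE, OCCURRENCES FROM EVENT_COUNT ORDER BY EVENT_CODE"
).fetchall()
print(counts)
```

Four stored rows shrink to two, and every remaining row can be addressed individually by its event code.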
The design methodology taught in this course will insist that rows in the resulting tables are
unique.
DB2 allows duplicate rows in tables, but it can also ensure automatically that no two rows
of a table are alike (for example, by means of a unique index defined on the table).
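
How a database system can enforce row uniqueness is sketched below. The sketch assumes SQLite (through Python's sqlite3 module) purely as a stand-in; the syntax and behavior of DB2 may differ. The table name AIRCRAFT_NO_KEY is hypothetical, and the choice of SERIAL_NUMBER as the primary key is an assumption made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = ("A11599320", "1994-02-19")

# Without any uniqueness rule, the same row can be stored twice.
conn.execute("CREATE TABLE AIRCRAFT_NO_KEY (SERIAL_NUMBER TEXT, ACQUIRED TEXT)")
conn.execute("INSERT INTO AIRCRAFT_NO_KEY VALUES (?, ?)", row)
conn.execute("INSERT INTO AIRCRAFT_NO_KEY VALUES (?, ?)", row)
duplicates_stored = conn.execute(
    "SELECT COUNT(*) FROM AIRCRAFT_NO_KEY"
).fetchone()[0]

# With a primary key, the second insert of the same serial number is rejected.
conn.execute("CREATE TABLE AIRCRAFT (SERIAL_NUMBER TEXT PRIMARY KEY, ACQUIRED TEXT)")
conn.execute("INSERT INTO AIRCRAFT VALUES (?, ?)", row)
try:
    conn.execute("INSERT INTO AIRCRAFT VALUES (?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicates_stored, duplicate_rejected)
```

Once one column (or a combination of columns) is guaranteed to be unique, no two rows of the table can be identical.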


Order of Rows and Columns


First time:
TYPE  MODEL  CATEGORY  MANUFACTURER  ENGINES
A340  100    JET       AIRBUS        4
B737  500    JET       BOEING        2
B737  700    JET       BOEING        2
A320  200    JET       AIRBUS        2
B747  400    JET       BOEING        4

Unordered Retrieval
Next time, rows may be returned in a different sequence
Next time, columns may be returned in a different sequence

Next time:
TYPE  MODEL  CATEGORY  ENGINES  MANUFACTURER
A320  200    JET       2        AIRBUS
A340  100    JET       4        AIRBUS
B737  500    JET       2        BOEING
B737  700    JET       2        BOEING
B747  400    JET       4        BOEING

Figure 1-4. Order of Rows and Columns

Notes:
According to the relational data model, the sequence in which the rows and columns of a
table are physically stored in a relational database is completely up to the relational
database management system. Conversely, the physical sequence of the rows and
columns does not imply the sequence in which the rows or columns are returned if an
ordering has not been requested by the end user or application. As a matter of fact, the
same (unordered) retrieval request issued twice may return the rows and columns in a
different sequence the second time it is issued.
This means that an application cannot rely on the physical sequence of the rows or
columns in the database. If the order of the rows or columns is important to the application
during retrieval, it must tell the relational database management system how the returned
rows should be ordered. The ordering of the rows can be based on the values of one or
more columns of the table; the order can be by ascending or descending column values.
The order that can be requested is always a logical order and never a physical order.
The application can define the order in which the columns are to be returned by specifying
the column names in the desired sequence in the retrieval request.
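
Requesting a logical order can be sketched as follows, again assuming Python's sqlite3 module as the database system. The rows are ordered by ascending TYPE and, within the same type, by descending MODEL; the column sequence is fixed by listing the column names in the request. Note that MODEL is stored as text in this sketch, so its values are compared as character strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AIRCRAFT_MODEL "
    "(TYPE TEXT, MODEL TEXT, CATEGORY TEXT, MANUFACTURER TEXT, ENGINES INTEGER)"
)
conn.executemany(
    "INSERT INTO AIRCRAFT_MODEL VALUES (?, ?, ?, ?, ?)",
    [
        ("A340", "100", "JET", "AIRBUS", 4),
        ("B737", "500", "JET", "BOEING", 2),
        ("B737", "700", "JET", "BOEING", 2),
        ("A320", "200", "JET", "AIRBUS", 2),
        ("B747", "400", "JET", "BOEING", 4),
    ],
)

# The column list defines the column sequence; ORDER BY defines the row sequence.
ordered = conn.execute(
    "SELECT TYPE, MODEL, ENGINES, MANUFACTURER "
    "FROM AIRCRAFT_MODEL "
    "ORDER BY TYPE ASC, MODEL DESC"
).fetchall()

for row in ordered:
    print(row)
```

Without the ORDER BY clause, the application should not assume any particular sequence of the returned rows, even if a given system happens to return them in insertion order.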

Linkage of Tables

AIRCRAFT_MODEL
TYPE  MODEL  CATEGORY  MANUFACTURER  ENGINES
A340  100    JET       AIRBUS        4
B737  500    JET       BOEING        2
B737  700    JET       BOEING        2
A320  200    JET       AIRBUS        2
B747  400    JET       BOEING        4

MANUFACTURER
MID     NAME                CITY
AIRBUS  AIRBUS INDUSTRIES   TOULOUSE
BOEING  BOEING CORPORATION  SEATTLE

Figure 1-5. Linkage of Tables

Notes:
A table seldom stands on its own: a relational database normally consists of
multiple tables that are logically interconnected.
The visual shows two tables, AIRCRAFT_MODEL and MANUFACTURER. There is clearly
an interconnection between the two tables. Each row of table AIRCRAFT_MODEL contains
an identifier for the manufacturer of the corresponding aircraft model, but does not give any
details for the manufacturer. The details for the manufacturer are contained in table
MANUFACTURER. Logically, each row of table AIRCRAFT_MODEL with a specific
manufacturer-id is interconnected with the row of table MANUFACTURER having the same
manufacturer-id.
The relational data model prescribes that logical associations are not physically
implemented in the relational database and that they are dynamically established, by
means of Join operations, on a request-by-request basis. In particular, there are no
physical pointers, such as addresses, in the columns referring to rows of other tables. The
request-based joining of tables is accomplished by means of the values of the columns
named in the join operation.


In the example of the visual, the joining of the rows for aircraft models A340, Model 100,
and A320, Model 200, in table AIRCRAFT_MODEL with the proper manufacturer is
achieved by having the same value (AIRBUS) in columns MANUFACTURER and MID,
respectively. Of course, the columns must be specified in the request performing the join
operation.
Similarly, all rows for aircraft models having the value BOEING in the MANUFACTURER
column are joined with the appropriate row of table MANUFACTURER.
The important point is that, during relational database design, you need not worry about
physical pointers. However, you will have to worry about logical relationships, which are
realized through column values rather than pointers. Column values are not affected by
reorganizations, whereas physical pointers may be.
The relational data model disallows externally visible pointers, but does not prohibit internal
pointers (e.g., in index entries) that are not externally visible.
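
The value-based linkage of the two tables on the visual can be sketched with a join request. The sketch assumes Python's sqlite3 module as the database system; the table and column names and the data are those of the visual, whereas the correlation names A and M are introduced here only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AIRCRAFT_MODEL "
    "(TYPE TEXT, MODEL TEXT, CATEGORY TEXT, MANUFACTURER TEXT, ENGINES INTEGER)"
)
conn.execute("CREATE TABLE MANUFACTURER (MID TEXT, NAME TEXT, CITY TEXT)")
conn.executemany(
    "INSERT INTO AIRCRAFT_MODEL VALUES (?, ?, ?, ?, ?)",
    [
        ("A340", "100", "JET", "AIRBUS", 4),
        ("B737", "500", "JET", "BOEING", 2),
        ("B737", "700", "JET", "BOEING", 2),
        ("A320", "200", "JET", "AIRBUS", 2),
        ("B747", "400", "JET", "BOEING", 4),
    ],
)
conn.executemany(
    "INSERT INTO MANUFACTURER VALUES (?, ?, ?)",
    [
        ("AIRBUS", "AIRBUS INDUSTRIES", "TOULOUSE"),
        ("BOEING", "BOEING CORPORATION", "SEATTLE"),
    ],
)

# The join is established at request time by comparing column values;
# no stored pointer connects the two tables.
joined = conn.execute(
    """
    SELECT A.TYPE, A.MODEL, M.NAME, M.CITY
    FROM AIRCRAFT_MODEL A
    JOIN MANUFACTURER M ON A.MANUFACTURER = M.MID
    ORDER BY A.TYPE, A.MODEL
    """
).fetchall()

for row in joined:
    print(row)
```

Because the connection rests only on equal values in MANUFACTURER and MID, it survives any reorganization of the stored data.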


Checkpoint

Exercise — Unit Checkpoint


1. What is the purpose of the relational data model?
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. Which one of the following choices is the main data object of relational databases?
a. Column.
b. Row.
c. Table.
d. Field.

3. A table is horizontally structured into columns and vertically into rows? (T/F)

4. What is a field of a table?
_____________________________________________________
_____________________________________________________
_____________________________________________________

5. A field may or may not contain a value? (T/F)

6. All rows of a table have the same structure (columns)? (T/F)

7. Which of the following statements are true?
a. All columns of a table must have a name.
b. The names of the columns of a table must be unique.
c. The rows of a table must be unique.
d. The rows of a table should be unique.


8. Why should all rows of a table be different?
_____________________________________________________
_____________________________________________________
_____________________________________________________

9. If a logical order of the rows is not requested, the rows of a table
are always made available in the sequence they are physically
stored in. (T/F)

10. For an unordered retrieval request, different executions of the
request may return the retrieved rows in a different sequence. (T/F)

11. Which of the following statements are true?
a. Tables in a relational database are interconnected by means of
pointers.
b. Even internally, relational database management systems do
not use pointers.
c. Tables are joined based on column values.

Unit Summary

In relational databases, all data is stored in tables
Tables are structured in rows and columns
- Horizontally, into rows
- Vertically, into columns
Fields may or may not contain data
The data in fields is considered atomic
The column names of a table must be unique
All rows of a table should be different
Unless you request a specific order, you cannot
assume that the rows returned are in any particular order
Unless you request a specific order, you cannot
assume that the columns returned are in any particular order

Figure 1-6. Unit Summary


Unit 2. Views and Results During Database Design

What This Unit Is About


This unit describes the different views assumed for the data of the
application domain during relational database design. It also outlines
the steps performed during design and their results.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Explain the different views assumed for the data during database
design:
- The conceptual view
- The storage view
- The logical view
• Summarize the steps performed during database design and their
results.
• Relate the steps and results to the data views.

How You Will Check Your Progress


Accountability:
• Checkpoint questions


Unit Objectives

After completion of this unit, you should be able to:

Explain the different views assumed for the data during database design:
The Conceptual View
The Storage View
The Logical View

Summarize the steps performed during database design and their results

Relate the results of the steps to the data views

Figure 2-1. Unit Objectives

Notes:
When designing a relational database, different views are assumed for the data of the
subject application domain. These views are:
• The conceptual view
• The storage view
• The logical view
During this unit, we will discuss these data views, give an overview of the steps performed
during database design, list their results, and relate the results of the steps to the data
views.

2.1 Data Views, Steps, and Results During Design


Data Views During Database Design


Diagram boxes: Problem Statement, Entity-Relationship Model, Process Inventory, Data Inventory, Tuple Types, Tables, Integrity Rules, Indexes, Logical Data Structures
Diagram regions: Conceptual View, Storage View, Logical View

Figure 2-2. Data Views During Database Design

Notes:
When designing the database for an application domain, you start with a problem
statement for the application domain, i.e., a document describing the types of business
objects for the application domain, the relationships between them, and the business
constraints for both of them. The problem statement must be established by an application
domain expert (analyst). In general, it is not produced by the database designer, who does
not have the domain expertise; rather, it serves as input for him/her.
Starting with the problem statement, a series of steps is performed during the design.
These steps look at the data of the application domain from three different angles, called
views:
• The conceptual view
• The storage view
• The logical view
For each of these views, a set of results is produced during database design. You can
associate a view with its results and describe it by its results. For this reason, it is quite
common to say "the ... view consists of ..." rather than "the ... view establishes ...". In the
latter case, the view is seen more as the activity of looking at the application domain from a
specific angle and producing certain results whereas, in the former case, the view is seen
as the results produced. During this course, both terminologies are used.
The conceptual view scrutinizes and structures the data of the application domain based
on their semantic meaning, i.e., their meaning for the business (application domain). It
does this independently of the business processes accessing the data and without regard
to any existing or planned method for storing the data.
Thus, during the conceptual view, the process- and implementation-independent
architecture of the data of the application domain is established.
The storage view looks at the data of the application domain from a storage point of view.
During the storage view, in a series of steps, the objects of the conceptual view are
mapped into objects (in particular, tables) of the relational database management system
chosen for the implementation of the data. Thus, the storage view is not an
implementation-independent view of the data. Rather, it is an implementation-oriented view
of the data during which the conceptual view is physically implemented in the selected
relational database management system.
The logical view looks at the data of the application domain from a process point of view.
Generally, a particular business process does not access all data of the application
domain, but only a part of the data. Thus, it has its own process-dependent view of the data
of the application domain. Accordingly, during the logical view, the process-dependent
views for the business processes of the application domain are established.


Conceptual View

[Diagram: overview of the design methodology (Problem Statement, Conceptual View, Storage View, Logical View, and their results); the full diagram is Figure 2-6.]

Figure 2-3. Conceptual View

Notes:
As mentioned before, during the conceptual view, the process- and
implementation-independent architecture of the data of the application domain is
established. The application domain, described by the problem statement, is scrutinized for
its business object types, the relationships between them, and the business constraints.
As a result of this scrutiny, an entity-relationship model is established visualizing and
structuring the business object types of the application domain as entity types; illustrating
the relationships between the business object types by means of relationship types; and
modeling the constraints for the entity types and relationship types imposed by the
business constraints.
In a second step, which is not directly performed by the database designer, but requires
his/her participation, the elementary data of the application domain, referred to as data
elements, are identified and described in detail. The descriptions are recorded in a
document, the data inventory.
As the data elements are collected, they are assigned to the business object types to which
they belong. More precisely, they are assigned to the corresponding entity types verifying
whether or not the entity-relationship model established during the previous step is
complete.
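The completeness check described above can be sketched in a few lines of Python. This is only an illustration: the entity types, the data elements, and the dictionary layout of the data inventory are hypothetical and are not taken from the course's data inventory.

```python
# Hypothetical entity types of an entity-relationship model.
entity_types = {"AIRCRAFT MODEL", "AIRCRAFT"}

# Hypothetical data inventory: data element -> business object type it belongs to.
data_inventory = {
    "type code": "AIRCRAFT MODEL",
    "model number": "AIRCRAFT MODEL",
    "serial number": "AIRCRAFT",
    "airport code": "AIRPORT",
}

# A data element whose business object type has no entity type
# points to a gap in the entity-relationship model.
unassigned = sorted(
    element for element, owner in data_inventory.items()
    if owner not in entity_types
)
missing_entity_types = sorted({data_inventory[element] for element in unassigned})

print(unassigned)            # data elements that could not be assigned
print(missing_entity_types)  # entity types the model appears to lack
```

In this made-up case, the element "airport code" cannot be assigned, which suggests that the model is missing an entity type for airports.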


Storage View

[Diagram: overview of the design methodology (Problem Statement, Conceptual View, Storage View, Logical View, and their results); the full diagram is Figure 2-6.]

Figure 2-4. Storage View

Notes:
During the storage view, the conceptual view is physically implemented in a relational database
management system. More precisely, the results of the conceptual view are implemented.
The steps executed during the storage view transform the results of the conceptual view into
tables and related objects of the chosen relational database management system. The
initial steps are mostly independent of the chosen relational database management
system. The further you proceed, the more system-dependent aspects have to be
considered although many of the considerations are of a global nature.
The first step of the storage view uses the data inventory to construct tuple types for the
entity types and relationship types of the entity-relationship model developed during the
conceptual view and normalizes them. Tuple types are the precursors of tables and provide
the basis for the computerized processing of the entity types and relationship types for the
application domain. During the normalization of the tuple types, data redundancies and
anomalies that could lead to data inconsistencies are removed.
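To make the effect of normalization more tangible, here is a minimal Python sketch. It assumes a hypothetical tuple type for aircraft that also carries the category of the aircraft model; the attribute names and the serial numbers are invented for illustration and are not the course's actual tuple types.

```python
# Unnormalized tuple type: the category of an aircraft model is repeated
# for every aircraft of that model (a redundancy that can become inconsistent).
aircraft_unnormalized = [
    # (serial_no, type_code, model_no, category)
    ("SN001", "B737", "500", "JET"),
    ("SN002", "B737", "500", "JET"),
    ("SN003", "A320", "200", "JET"),
]

# Normalized: facts about a model are stored once, keyed by (type_code, model_no);
# the aircraft tuples keep only a reference to their model.
aircraft_model = {}
aircraft = []
for serial_no, type_code, model_no, category in aircraft_unnormalized:
    aircraft_model[(type_code, model_no)] = category
    aircraft.append((serial_no, type_code, model_no))

print(aircraft_model)
print(aircraft)
```

After the split, changing the category of a model touches exactly one entry, so two aircraft of the same model can no longer disagree about it.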

Based on a prescribed set of rules, the next step of the storage view converts the tuple
types into tables of the chosen relational database management system taking into
account the supported functions and features.
Also as part of the storage view, any rules concerning the integrity of the data, including the
constraints defined as part of the entity-relationship model, must be converted into rules for
the tables created by the previous step and implemented if the chosen relational database
management system provides the necessary functions such as check constraints,
referential constraints, and triggers.
Some of the associations between the tables (especially, those implied by referential
constraints) make it imperative that indexes be defined for certain columns of the tables.
The last step of the storage view defines these indexes.
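The following sketch shows what tables, integrity rules, and an index can look like in practice. It uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in for the chosen relational database management system; the table names, column names, and index name are assumptions borrowed loosely from the airline sample used later in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    CREATE TABLE aircraft_model (
        type_code TEXT NOT NULL,
        model_no  TEXT NOT NULL,
        -- check constraint: only known categories are accepted
        category  TEXT NOT NULL CHECK (category IN ('JET', 'TURBOPROP')),
        PRIMARY KEY (type_code, model_no)
    );
    CREATE TABLE aircraft (
        serial_no TEXT PRIMARY KEY,
        type_code TEXT NOT NULL,
        model_no  TEXT NOT NULL,
        -- referential constraint: every aircraft refers to an existing model
        FOREIGN KEY (type_code, model_no)
            REFERENCES aircraft_model (type_code, model_no)
    );
    -- index on the columns that carry the referential constraint
    CREATE INDEX aircraft_model_ix ON aircraft (type_code, model_no);
    INSERT INTO aircraft_model VALUES ('B737', '500', 'JET');
""")

def rejected(sql):
    """Return True if the statement violates an integrity rule."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

check_violation = rejected("INSERT INTO aircraft_model VALUES ('A320', '200', 'GLIDER')")
fk_violation = rejected("INSERT INTO aircraft VALUES ('SN001', 'A320', '200')")
index_names = [row[1] for row in conn.execute("PRAGMA index_list('aircraft')")]
print(check_violation, fk_violation, index_names)
```

Other database management systems offer comparable functions with different syntax and defaults, so the statements above should be read as one possible form rather than as the target system's exact DDL.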


Logical View

[Diagram: overview of the design methodology (Problem Statement, Conceptual View, Storage View, Logical View, and their results); the full diagram is Figure 2-6.]

Figure 2-5. Logical View

Notes:
The logical view looks at the data of the application domain from the perspective of the
processes for the application domain. During the logical view, the required process-specific
views for the processes of the application domain are established.
The first step during the logical view for an application domain describes, in an
implementation- and database-independent fashion, the processes retrieving and/or
manipulating the data of the application domain. The process descriptions are collected in
a document referred to as the process inventory. For each process, they must identify the data
elements used by the process.
After the tables for the application domain and the integrity rules for them have been
defined, as part of the logical view, the necessary logical data structures are established for all
processes described in the process inventory. Each logical data structure describes a view
that a process (or part of a process) has of the tables defined during the storage view. More
precisely, the logical data structure describes the subset of the tables for the application
domain required by the process or a part of the process. It also illustrates how the process
or the part of the process must logically navigate through the tables to achieve its function.
Thus, it reflects the tables needed, the subsets required, and the flow between the tables.
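One way to picture a logical data structure is as a small record that names the tables a process needs, the subset of columns it uses, and the order in which it moves between the tables. The Python sketch below is an assumption made for illustration: the class, its fields, the process, and the table and column names are hypothetical and do not reproduce the notation the course uses for logical data structures.

```python
from dataclasses import dataclass

@dataclass
class LogicalDataStructure:
    process: str
    tables: dict      # table name -> columns the process needs (a subset)
    navigation: list  # ordered (from_table, to_table) steps

    def is_consistent(self):
        """Each navigation step must stay within the tables the process uses."""
        return all(src in self.tables and dst in self.tables
                   for src, dst in self.navigation)

lds = LogicalDataStructure(
    process="List the aircraft of an aircraft model",  # hypothetical process
    tables={
        "AIRCRAFT_MODEL": ["type_code", "model_no"],
        "AIRCRAFT": ["serial_no", "type_code", "model_no"],
    },
    navigation=[("AIRCRAFT_MODEL", "AIRCRAFT")],
)
print(lds.is_consistent())
```

The consistency check is a simple sanity test: a navigation step that leads to a table outside the listed subset indicates that the structure is incomplete.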
As you can see by now, the steps of the various views are interconnected and may be
dependent on each other. The process inventory of the logical view is the primary source
for the data inventory since it identifies the data elements used by the processes. Similarly,
the tables and the integrity rules of the storage view are required input for the logical data
structures.


Design Methodology

[Diagram: starting from the Problem Statement, the three views and their results. Conceptual view: Entity-Relationship Model, Data Inventory. Storage view: Tuple Types, Tables, Integrity Rules, Indexes. Logical view: Process Inventory, Logical Data Structures.]

Figure 2-6. Design Methodology

Notes:
This visual illustrates the complete design methodology used during the course. The
entity-relationship model developed during the conceptual view is used and updated by the
later steps of the design as additional knowledge becomes available. This ensures that the
model remains valid and useful at all times.
The methodology described on the previous pages and illustrated by the diagram on this
visual is not a pure top-down approach. Design is and must be an iterative process. When
you start with the design, your knowledge of the application domain is most likely
incomplete even if the problem statement was prepared carefully. No matter how
thoroughly you execute the various steps, subsequent steps will detect holes and errors in
the results of the preceding steps that will force you to revisit these steps.
Unless the problem statement is incomplete, you should always start an iteration with the
entity-relationship model. Check if the required change impacts the entity-relationship
model. If it does not, proceed to the next step and verify its results.
If the problem detected reveals that the problem statement is incomplete or incorrect, have
it extended or corrected by the application domain expert. As the database designer, it is
not your responsibility to change the problem statement. This must be done by a person
with the proper domain competence. However, it is your responsibility to make the
application domain expert aware of the problem. After the problem statement has been
corrected, continue the iteration with the entity-relationship model as before.
The fact that relational database design is an iterative process should not make you sloppy.
The better the problem statement and the more carefully the various steps are performed,
the better your design will be. However, it does not make sense to dwell endlessly on a
specific step.


Checkpoint

Exercise — Unit Checkpoint


1. What is the problem statement?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. Who normally establishes the problem statement for an application domain?
a. The database designer.
b. The database administrator.
c. An application domain expert.
d. An application programmer.
e. The end users.

3. Match the following terms with the corresponding data views:


a. Physical implementation ____ Conceptual view
b. Process-dependent views ____ Storage view
c. Independent data architecture ____ Logical view

4. During the conceptual view, the data of the application domain are
structured taking into account the business processes for the
application domain. (T/F)

5. The logical view looks at the data of the application domain from
the viewpoint of the business processes for the application domain.
(T/F)

6. Match the three data views with the results produced by them:
a. Conceptual view ____ Tuple types
b. Storage View ____ Entity-relationship model
c. Logical view ____ Data inventory
____ Process inventory
____ Tables
____ Integrity rules
____ Logical data structures
____ Indexes

7. What is the purpose of an entity-relationship model?


_____________________________________________________
_____________________________________________________
_____________________________________________________

8. What is the data inventory?


_____________________________________________________
_____________________________________________________

9. Which of the following statements are true?


a. Tuple types are the precursors of tables.
b. Tuple types provide the basis for the computerized processing
of the business objects and of the relationships between them.

10. What is the purpose of a logical data structure?


_____________________________________________________
_____________________________________________________
_____________________________________________________

11. The design methodology taught in this unit is a waterfall approach,
i.e., the various steps of the methodology are processed from top
to bottom and are not reiterated. (T/F)


Unit Summary

The data views taken during database design are:
• The conceptual view
• The storage view
• The logical view

The conceptual view looks at the data from the application domain perspective
• Process- and implementation-independent architecture of data of application domain

The storage view looks at the data from a physical (storage) point of view
• Physical implementation of conceptual view

The logical view looks at the data from the process point of view
• Collection of process-dependent views

Figure 2-7. Unit Summary

Notes:

Unit 3. Problem Statement

What This Unit Is About


This unit describes the purpose and contents of the problem statement
for an application domain. It discusses the responsibility for the
problem statement and the role of the database designer in its
creation.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Explain the purpose of the problem statement for database design.
• Understand who has the responsibility for the creation of the
problem statement.
• Describe the role of the database designer in the creation of the
problem statement.
• Describe the contents of a problem statement.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises


Unit Objectives

After completion of this unit, you should be able to:
• Explain the purpose of the problem statement for database design
• Understand who has the responsibility for the creation of the problem statement
• Describe the role of the database designer in the creation of the problem statement
• Describe the contents of a problem statement

Figure 3-1. Unit Objectives

Notes:
When designing the database for an application domain, you start with a problem
statement for the application domain. This unit discusses the problem statement in detail
and describes:
• The purpose of the problem statement
• Who is responsible for the creation of the problem statement
• The role of the database designer in the creation of the problem statement
• The contents the problem statement should have to be a usable input for database
design.

3.1 Problem Statement for Application Domain


Purpose and Responsibilities

• Created by application domain expert
• Input for database designer
• Global description of application domain
• Must allow the database designer to:
  - Gain a basic understanding of the application domain
  - Create an entity-relationship model for the application domain
• Does not contain detailed information
• Database designer should work with application domain expert

Figure 3-2. Purpose and Responsibilities

Notes:
The problem statement must be created by someone who has detailed knowledge of the
application domain, i.e., an application domain expert. Only then will the problem statement
reflect the application domain correctly and completely.
The problem statement is input for the database designer and is a global description of the
application domain for which the database designer is to develop a database. It is a global
description rather than a detailed description. This means it describes the important
characteristics of the application domain rather than the various data elements and
processes of the application domain or any implementation-dependent details. It should be
a functional description of the application domain. It should not describe the current or a
planned implementation. It must allow the database designer to:
• Gain a basic understanding of the application domain so that he/she can comprehend
the context of business objects, business relationships, and business constraints
important for the design; detect inconsistencies; and discuss problems detected during
the design with sufficient ease and knowledge with the application domain expert or the
responsible department.

• Create an entity-relationship model for the application domain visualizing and clarifying
the types of business objects and business relationships and the business constraints to
be implemented in the database.
As mentioned before, the problem statement should not contain detailed information about
the application domain. The detailed information about the application domain is provided
by the data inventory and the process inventory discussed later.
Although the application domain expert is responsible for the problem statement, the
database designer should work with the application domain expert during the creation of
the problem statement. The database designer knows best which input he/she needs for
the design of the appropriate database and, thus, can provide the necessary guidance to
the application domain expert.
Furthermore, by working with the application domain expert, the database designer will
gain a better understanding of the application domain, which eases his/her work considerably.


Contents of Problem Statement (1 of 3)

• A short textual description (overview) of the application domain
  - What the application domain does
  - What the application domain wants to achieve by means of the database management system
• A listing of all business object types about which information is to be stored including:
  - A textual description of the business object types
      No details yet
  - How the individual objects of the business object type can be identified

Figure 3-3. Contents of Problem Statement (1 of 3)

Notes:
First, the problem statement should contain a textual description, an overview, of the
application domain. This overview should describe, in simple words, what the application
domain does so that the database designer gets at least a certain idea of what is going on.
Furthermore, the overview should indicate what the application domain (or more precisely,
the appropriate departments) wants to achieve by using a database management system.
In particular, the overview should point out which areas of the application domain should be
implemented in the target database. The actual application domain may be much larger,
and it may not be intended or possible to implement the entire domain in the database.
Secondly, the problem statement should list and describe all categories (types) of
business objects which are important for the application domain and about which
information is to be stored in the target database. These categories are referred to as
business object types.
As a category, a business object type represents all business objects having the same
meaning and characteristics rather than distinct business objects. For an airline company,
for example, the problem statement should describe that information about aircraft models
in general is to be stored rather than about a specific aircraft model such as a Boeing 747,
Model 400.
For each business object type to be implemented in the target database, the following
information should be provided:
• A textual description of the semantics of the business object type without going into
details such as the individual attributes of the business object type. The details for the
business object types will be provided via the data inventory discussed later.
• How the distinct objects belonging to the business object type can be identified.


Sample Problem Statement (1 of 8)

Overview

Come ABoard (CAB) is an airline servicing a set of airports with its aircraft. As employees,
it has pilots flying the aircraft, mechanics maintaining and servicing the aircraft, and other
personnel for various service functions.

CAB wants to administer flight planning, pilot assignment, and aircraft maintenance
activities by means of a database management system.

Figure 3-4. Sample Problem Statement (1 of 8)

Notes:
Throughout this course, we will use a sample application domain to demonstrate the
various items discussed. This application domain comprises the flight planning, pilot
assignment, and aircraft maintenance for an airline called Come Aboard or, in short, CAB.
This visual illustrates the overview section of the problem statement for our sample
application domain. The amount of information provided in the overview depends on the
general familiarity of the application domain. If the application domain is less known or
more complex, the overview will require more information. The sample application domain
used in this course is generally well-known and is not really complex. Consequently, the
short description on the visual should be sufficient.
You should note that the second paragraph limits the application domain being considered
to the flight planning, the pilot assignment, and the aircraft maintenance. Without this
restriction, the application domain for the management of an airline would comprise
additional areas such as flight reservation or seat selection.
The entire problem statement for the sample application domain can be found in Appendix
A - Sample Problem Statement.

Sample Problem Statement (2 of 8)

Business Object Types

CAB wants to store information about the following business object types in
its database:

AIRCRAFT MODELS
For its flying activities, CAB uses aircraft of different types or, more precisely,
models such as Boeing 737, Model 500, or Airbus A320, Model 200. For the
aircraft models it owns or has on order, CAB wants to maintain information in
its database such as their category (e.g., JET or TURBOPROP), length,
height, wing span, or number of engines.
The aircraft models can be uniquely identified by their type code (e.g., B737)
together with their model number (e.g., 500). [Callout: unique identifier]

Figure 3-5. Sample Problem Statement (2 of 8)

Notes:
This visual illustrates a business object type for our sample airline company called CAB.
Aircraft Models is a business object type for the application domain being considered since
flight planning, pilot assignment, and aircraft maintenance are dealing with aircraft models.
When a flight is planned and an aircraft is assigned to the flight, that aircraft cannot be an
arbitrary aircraft. It must be an aircraft of a specific aircraft model because the model is
published in the timetables and the starting and landing airports require the aircraft to be of
a certain model.
Similarly, pilots are only allowed to fly aircraft of those models they have a license for, and
mechanics may only service aircraft of models they have been trained for.
Aircraft Models is a business object type rather than a business object because it
represents a set of objects with the same meaning (being models of aircraft) and the same
characteristics such as manufacturer, category (jet or turboprop), or number of engines.
As highlighted on the visual, the individual aircraft models can be uniquely identified by their type
code (e.g., B737) together with their model number (e.g., 500).
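An identifier made up of two parts, such as type code together with model number, can be sketched in Python as follows. The class name, the `register` helper, and the sample values are hypothetical and serve only to show that neither part alone identifies an aircraft model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AircraftModelId:
    """Identifier of an aircraft model: type code together with model number."""
    type_code: str
    model_no: str

models = {}

def register(type_code, model_no, category):
    """Store a model; reject a second model with the same combined identifier."""
    key = AircraftModelId(type_code, model_no)
    if key in models:
        raise ValueError(f"duplicate aircraft model {key}")
    models[key] = category

register("B737", "500", "JET")
register("B737", "400", "JET")  # same type code, different model number: allowed

try:
    register("B737", "500", "JET")  # same type code and model number: rejected
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(len(models), duplicate_rejected)
```

Because the dataclass is frozen, its instances are hashable and compare by both fields, which is what lets the pair act as a single dictionary key.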


Sample Problem Statement (3 of 8)

AIRCRAFT
CAB owns multiple aircraft of the various aircraft models. For the aircraft it
owns, CAB wants to maintain information such as the date when the aircraft
was acquired, the engines mounted on the aircraft, or the seats of the
aircraft.
Each aircraft has a unique serial number. This serial number is unique
across aircraft models. [Callout: unique identifier]
AIRPORTS
CAB services a set of airports with its aircraft. For these airports as well as
for airports CAB plans to service in the near future, CAB wants to keep
information in its database such as the airport code, the location of the
airport, the address of CAB's city ticketing office, or the address of CAB's
airport office.
The airport codes uniquely identify the various airports.

Figure 3-6. Sample Problem Statement (3 of 8) CF182.0

Notes:
This visual illustrates two further business object types for our sample application domain:
Aircraft and Airports. Again, both of these types are of interest for the application domain
and represent true categories. Each of them represents a set of objects having the same
meaning and the same characteristics.
As highlighted on the visual, the various aircraft of the business object type Aircraft can be
identified by means of a unique serial number. Even for aircraft of different models, this
serial number is unique, i.e., no two aircraft can have the same serial number.
For airports, their international airport codes (e.g., SFO for San Francisco, CA, JFK for
John F. Kennedy Airport in New York, NY, or STR for Stuttgart, Germany) serve as unique
identifiers.
The full set of business object types for the CAB application domain can be found in
Appendix A - Sample Problem Statement.

Contents of Problem Statement (2 of 3)

• A listing of all types of relationships between the business objects including:
  - A textual description of the type of relationship
      A business relationship always concerns at least two business objects
      The objects can belong to the same or different object types
      All relationships of the same type have the same meaning and concern objects of the same business object types
      If a first object has a relationship with a second object, the second object also has a relationship with the first object
  - How many relationships of the same type an object can have
      If the object may have many or at most one relationship of that type
  - If the type of relationship requires an object to have at least one relationship
  - If the objects having a relationship with an object must be deleted when the other object is deleted

Figure 3-7. Contents of Problem Statement (2 of 3)

Notes:
As a third item, the problem statement should contain a listing of all types (categories) of
logical relationships that exist between business objects of the various business object
types. A business relationship logically interconnects two or more business objects which
may belong to different or to the same business object type. The objects may even be
identical, i.e., an object may have a relationship with itself.
Business relationships of the same type always interconnect business objects of the same
respective types. For example, if r1 and r2 are business relationships of the same type and
r1 associates an object of business object type O1 with an object of business object type
O2, then r2 must also interconnect an object of O1 with an object of O2.
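The rule for r1 and r2 can be expressed as a small check. The sketch below is an illustration under assumptions: the classes, the `consistent` function, and the sample objects (including the serial numbers) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BusinessObject:
    object_type: str  # e.g. "AIRCRAFT"
    identifier: str

@dataclass(frozen=True)
class Relationship:
    relationship_type: str
    first: BusinessObject
    second: BusinessObject

def consistent(r1, r2):
    """Relationships of the same type must connect objects of the same respective types."""
    if r1.relationship_type != r2.relationship_type:
        return True  # the rule only constrains relationships of the same type
    return (r1.first.object_type, r1.second.object_type) == (
        r2.first.object_type, r2.second.object_type)

rel_type = "AIRCRAFT MODELS - AIRCRAFT"
r1 = Relationship(rel_type,
                  BusinessObject("AIRCRAFT MODEL", "B737-500"),
                  BusinessObject("AIRCRAFT", "SN001"))
r2 = Relationship(rel_type,
                  BusinessObject("AIRCRAFT MODEL", "A320-200"),
                  BusinessObject("AIRCRAFT", "SN002"))
# r3 claims the same relationship type but connects a model to an airport.
r3 = Relationship(rel_type,
                  BusinessObject("AIRCRAFT MODEL", "B737-500"),
                  BusinessObject("AIRPORT", "SFO"))

print(consistent(r1, r2), consistent(r1, r3))
```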
Note that we are talking about types or categories of relationships, referred to as business
relationship types, rather than individual relationships between business objects. For the
problem statement, it is not important which specific business objects have a relationship
with each other. It is only important to identify the type of the business relationship and to
understand its semantics and characteristics. For the sample airline application domain, for
example, it is only important to know that aircraft belong to aircraft models and that an
individual aircraft always belongs to one and only one model. For the problem statement, it
is not important to know that the aircraft with serial number B238725737 is a Boeing 737,
Model 500.
Depending on the business object type from which you look at the business relationship
type, there are different (directional) views of the same business relationship type. For the
above airline example, you may look at the business relationship type from Aircraft's point
of view or from Aircraft Models' point of view. From Aircraft's point of view, the semantics is
that an aircraft belongs to an aircraft model; from Aircraft Models' point of view, the
meaning is that a specific aircraft model comprises an aircraft. As expected, the meanings
are complementary. You can think of them as separate directional business relationship
types that make up a single nondirectional (or bidirectional) business relationship type.
For each business relationship type, the problem statement should include:
• A textual description of the business relationship type, i.e., describe its meaning and the
business object types involved.
• How many relationships of the same type an object can have. The important fact is
whether it can have many relationships or at most one.
• If the type of the relationship requires every existing object of an object type to have at
least one relationship of the considered business relationship type.
• If the objects having a relationship of the considered type with an object must be deleted
as well if that object is deleted, i.e., the consequences of delete operations on objects
that are interconnected by means of relationships.
It is possible that the objects of two business object types are interconnected by multiple
(different) business relationship types.
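The items listed above for a business relationship type can be collected in a simple record, one per directional view. The following Python sketch is a hypothetical layout, not a format prescribed by the course; the field names are assumptions, and the delete rule is left unset because it is not stated for this relationship type in the text so far.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DirectionalView:
    """One directional view of a business relationship type."""
    from_type: str
    to_type: str
    meaning: str
    many: bool        # may an object have many relationships, or at most one?
    mandatory: bool   # must every object have at least one relationship?
    delete_related: Optional[bool] = None  # None: not (yet) stated

    def cardinality(self):
        low = "1" if self.mandatory else "0"
        high = "many" if self.many else "1"
        return f"{low} to {high}"

# The two complementary views of the Aircraft Models - Aircraft relationship type.
comprises = DirectionalView("AIRCRAFT MODELS", "AIRCRAFT",
                            "an aircraft model comprises aircraft",
                            many=True, mandatory=False)
belongs_to = DirectionalView("AIRCRAFT", "AIRCRAFT MODELS",
                             "an aircraft belongs to an aircraft model",
                             many=False, mandatory=True)

print(comprises.cardinality(), "/", belongs_to.cardinality())
```

Keeping the two directions as separate records mirrors the idea that a nondirectional relationship type is made up of two directional ones with complementary meanings.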

Sample Problem Statement (4 of 8)

Business Relationship Types

The following types of relationships exist between the business objects
which CAB wants to implement in its database:

AIRCRAFT MODELS - AIRCRAFT

For an aircraft model, CAB may have any number of aircraft. In particular, it
is possible that there are no aircraft (yet) for an aircraft model. Conversely,
an aircraft belongs to one and only one aircraft model.

1 aircraft model ~ 0 to many aircraft
1 aircraft ~ 1 to 1 aircraft model (mandatory relationship type)

Figure 3-8. Sample Problem Statement (4 of 8) CF182.0

Notes:
This visual illustrates a business relationship type for our sample application domain. As
mentioned before, there exists a business relationship type linking objects of business
object type Aircraft Models to objects of business object type Aircraft.
From Aircraft Models' point of view, the meaning of the business relationship type is that an
aircraft model comprises an aircraft. Conversely, from Aircraft's point of view, the meaning
is that an aircraft belongs to an aircraft model.
As highlighted on the visual, there may be many aircraft for an aircraft model, but it is also
possible that an aircraft model does not have any aircraft.
A given aircraft, in contrast, can only belong to a single aircraft model. Furthermore, an
aircraft must always belong to an aircraft model. Accordingly, every (existing) aircraft must
have a relationship to an aircraft model. Thus, from Aircraft's point of view, the business
relationship type is a mandatory business relationship type. From Aircraft Models' point of
view, the business relationship type is not mandatory because an aircraft model need not
have a relationship to an aircraft.
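The problem statement itself is implementation-independent, but it can help to see where these two facts may eventually end up. The following is only a minimal sketch, assuming SQLite through Python's sqlite3 module; the table names, the column names, and the model identifier 'B737-500' are hypothetical and not part of the course material. A single NOT NULL foreign key column expresses that an aircraft belongs to one and only one aircraft model, while nothing forces an aircraft model to have aircraft.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE aircraft_model (
        model_id TEXT PRIMARY KEY
    );
    CREATE TABLE aircraft (
        serial_no TEXT PRIMARY KEY,
        -- One column: an aircraft belongs to at most one model.
        -- NOT NULL:   the relationship is mandatory for aircraft.
        model_id  TEXT NOT NULL REFERENCES aircraft_model (model_id)
    );
""")

# An aircraft model without any aircraft is allowed (0 to many).
conn.execute("INSERT INTO aircraft_model VALUES ('B737-500')")
conn.execute("INSERT INTO aircraft VALUES ('B238725737', 'B737-500')")


def try_insert(serial_no, model_id):
    """Return True if the aircraft row is accepted, False if it is rejected."""
    try:
        conn.execute("INSERT INTO aircraft VALUES (?, ?)", (serial_no, model_id))
        return True
    except sqlite3.IntegrityError:
        return False


no_model = try_insert("X1", None)         # rejected: aircraft without a model
unknown_model = try_insert("X2", "A999")  # rejected: model does not exist
```

Both rejected inserts leave the table unchanged, so only the aircraft with serial number B238725737 remains.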

© Copyright IBM Corp. 2000, 2002 Unit 3. Problem Statement 3-13



Sample Problem Statement (5 of 8)

AIRCRAFT - MAINTENANCE RECORDS


For as long as an aircraft is owned by CAB, all maintenance records for the
aircraft are kept. A maintenance record applies to one and only one aircraft.
For an aircraft, there may be multiple maintenance records.

1 aircraft ~ 0 to many maintenance records
1 maintenance record ~ 1 to 1 aircraft

When the aircraft is removed from the list of aircraft, its maintenance
records are deleted as well.

Cascading relationship type

Figure 3-9. Sample Problem Statement (5 of 8) CF182.0

Notes:
This visual illustrates another business relationship type for CAB. This business
relationship type interrelates aircraft and maintenance records: The objects of business
object type Aircraft (may) have relationships with objects of business object type
Maintenance Records (Aircraft Has Maintenance Record). Conversely, each object of
Maintenance Records must have a relationship to one and only one object of Aircraft
(Maintenance Record for Aircraft).
As the description states, all maintenance records for an aircraft are to be deleted when the
aircraft is deleted. From Aircraft's point of view, the business relationship type is a
cascading business relationship type because delete operations are cascaded down to the
associated objects of the other business object type.
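To make the cascading behavior concrete, here is a minimal sketch, assuming SQLite through Python's sqlite3 module. The table and column names are hypothetical; the problem statement only says that the maintenance records disappear together with their aircraft, not how this is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE in SQLite
conn.executescript("""
    CREATE TABLE aircraft (
        serial_no TEXT PRIMARY KEY
    );
    CREATE TABLE maintenance_record (
        record_id INTEGER PRIMARY KEY,
        -- Each record applies to one and only one aircraft; deleting the
        -- aircraft cascades down to its maintenance records.
        serial_no TEXT NOT NULL
                  REFERENCES aircraft (serial_no) ON DELETE CASCADE
    );
    INSERT INTO aircraft VALUES ('B238725737');
    INSERT INTO maintenance_record VALUES (1, 'B238725737'), (2, 'B238725737');
""")

# Removing the aircraft from the list of aircraft ...
conn.execute("DELETE FROM aircraft WHERE serial_no = 'B238725737'")

# ... removes its maintenance records as well.
remaining = conn.execute("SELECT COUNT(*) FROM maintenance_record").fetchone()[0]
```

Note that the cascade is declared on the dependent side (the maintenance records), although the business relationship type is called cascading from Aircraft's point of view.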
The above description of the business relationship type does not match the description
in Appendix A - Sample Problem Statement since we want to illustrate a cascading
business relationship type. The description in Appendix A - Sample Problem Statement has
some peculiarities which will be discussed later.

The remaining business relationship types for our sample application domain can be found
in Appendix A - Sample Problem Statement.


Contents of Problem Statement (3 of 3)

A listing of all constraints for the business object types
and/or business relationship types including:
• A textual description of the constraint, i.e., the
  restriction that must be adhered to
• The business object types and business relationship
  types to which the constraint applies
  • Constraints may apply to a single or to multiple business
    object types or business relationship types
  • Constraints may apply to a mixture of business object
    types and business relationship types
• When the constraint is to be applied
  • Under which circumstances/conditions
  • When object or relationship is added, changed, or removed
• Action to be performed when constraint is violated

Figure 3-10. Contents of Problem Statement (3 of 3) CF182.0

Notes:
The fourth section of the problem statement should list all business constraints for the
business object types and business relationship types of the application domain.
Business constraints represent restrictions that exist for the objects of business object
types or the relationships of business relationship types or a mixture thereof. For example,
such a restriction could require that for each (existing) business object of business object
type O1 a corresponding business object of business object type O2 must exist. We will
see further, more intuitive, examples for our sample airline application domain on the
subsequent visuals.
For each business constraint, the problem statement should contain the following
information:
• A textual description of the business constraint, i.e., of the restriction the business
objects or business relationships involved must adhere to.
• The description should identify the business object types and/or business relationship
types to whose objects or relationships the restriction applies, i.e., whose insert, update,
or delete operations are limited by the business constraint. As mentioned before, a
single business constraint can restrict the objects or relationships of a single or multiple
business object types or business relationship types, or of a mixture thereof.
• The description should specify when the appropriate restriction is to be applied, i.e.,
what triggers the application of the restriction. This has two facets:
1. There may be circumstances or conditions attached to a business constraint
specifying that the restriction is to be exercised only if these conditions are met. For
example, the condition could specify that the restriction only concerns aircraft
manufactured by Boeing or that the restriction only applies to aircraft put in service
before January 1, 1985.
2. For the affected business object types or business relationship types, the description
should specify the type of operations (insert, update, or delete) for which the
constraint must be enforced provided that the before-mentioned conditions are met.
• The description of the business constraint should specify the action to be performed
when the constraint is violated. The simplest form of action is to reject the operation.
However, there are more complex actions possible. For example, the violation of the
constraint could trigger the creation of a business object for another business object
type.
As you can see from the description, a business constraint may not only involve the
business object types or business relationship types to which its restriction applies, but also
other business object types or business relationship types for evaluating the triggering
condition or for the action to be performed if the constraint is violated.
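The items listed above can be thought of as the fields of one record per business constraint. The following sketch is only an illustration of that structure; the class name and field names are hypothetical and not prescribed by the course. It is filled in with the engine constraint of the sample problem statement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessConstraint:
    """One entry in the constraints section of a problem statement (hypothetical layout)."""
    description: str        # the restriction that must be adhered to
    applies_to: List[str]   # object/relationship types whose operations are limited
    condition: str          # circumstances under which the restriction is exercised
    operations: List[str]   # operations for which the constraint must be enforced
    action: str             # what to do when the constraint is violated
    # Other types needed only to evaluate the condition or to perform the action.
    also_involves: List[str] = field(default_factory=list)


engines = BusinessConstraint(
    description="An aircraft cannot have more engines mounted than the aircraft model allows.",
    applies_to=["Aircraft"],
    condition="none (the constraint always applies)",
    operations=["an engine is added to an aircraft"],
    action="reject the request",
    also_involves=["Aircraft Belongs to Aircraft Model", "Aircraft Models"],
)
```

Keeping the types a constraint applies to separate from the types it merely needs for verification mirrors the distinction made in the last paragraph above.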


Sample Problem Statement (6 of 8)

Business Constraints

The following constraints exist for the business object and relationship types
that CAB wants to maintain in its database:

NUMBER OF ENGINES ON AIRCRAFT

An aircraft cannot have more engines mounted than the aircraft model allows.
(To what the constraint applies: business object type)

To be enforced when an engine is added to an aircraft.
(When constraint is to be applied)

The request to add an engine to an aircraft must be rejected if it violates the constraint.
(Action if constraint is violated)

Figure 3-11. Sample Problem Statement (6 of 8) CF182.0

Notes:
This visual illustrates a business constraint for our sample airline called Come Aboard. The
business constraint limits the number of engines for an aircraft to the number of engines for
the corresponding aircraft model.
The business object type to which the constraint applies is Aircraft. The restriction controls
how many engines an aircraft can have.
There is not a particular condition under which the constraint is to be applied. The
constraint must be verified (enforced) whenever an engine is added to (mounted on) an
aircraft.
The request to add an engine to an aircraft should be rejected if the limit for the
corresponding aircraft model would be exceeded, i.e., if the constraint would be violated.
As mentioned before, the business constraint applies to business object type Aircraft.
However, in order to verify it, business relationship type Aircraft Belongs to Aircraft Model
and business object type Aircraft Models are needed.
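As an illustration of how such a constraint might later be enforced, here is a minimal sketch, assuming SQLite through Python's sqlite3 module. The schema, the trigger name, and the column max_engines are hypothetical. The trigger fires when an engine is added to an aircraft, follows the aircraft's relationship to its aircraft model to find the limit, and rejects the request if the limit would be exceeded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE aircraft_model (
        model_id    TEXT PRIMARY KEY,
        max_engines INTEGER NOT NULL
    );
    CREATE TABLE aircraft (
        serial_no TEXT PRIMARY KEY,
        model_id  TEXT NOT NULL REFERENCES aircraft_model (model_id)
    );
    CREATE TABLE engine (
        engine_no TEXT PRIMARY KEY,
        serial_no TEXT REFERENCES aircraft (serial_no)  -- NULL: not mounted
    );
    -- Enforced when an engine is added to an aircraft. This sketch covers
    -- INSERT only; mounting by UPDATE would need a second trigger.
    CREATE TRIGGER engine_limit BEFORE INSERT ON engine
    WHEN NEW.serial_no IS NOT NULL
    BEGIN
        SELECT RAISE(ABORT, 'too many engines for aircraft model')
        WHERE (SELECT COUNT(*) FROM engine WHERE serial_no = NEW.serial_no)
              >= (SELECT m.max_engines
                  FROM aircraft a
                  JOIN aircraft_model m ON m.model_id = a.model_id
                  WHERE a.serial_no = NEW.serial_no);
    END;
    INSERT INTO aircraft_model VALUES ('B737-500', 2);
    INSERT INTO aircraft VALUES ('B238725737', 'B737-500');
""")


def mount_engine(engine_no, serial_no):
    """Return True if the engine was mounted, False if the request was rejected."""
    try:
        conn.execute("INSERT INTO engine VALUES (?, ?)", (engine_no, serial_no))
        return True
    except sqlite3.DatabaseError:
        return False


# The third request exceeds the limit of the aircraft model and is rejected.
results = [mount_engine("E%d" % i, "B238725737") for i in (1, 2, 3)]
```

The constraint applies to engines on aircraft, yet the trigger has to read both aircraft and aircraft_model to verify it, which matches the observation in the notes above.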

Sample Problem Statement (7 of 8)

CAPTAIN AND COPILOT MUST BE DIFFERENT

A pilot cannot be captain and copilot for the same flight.
(To what the constraint applies: business relationship type)

To be enforced when a pilot is assigned to a flight or the pilot assignment is changed.
(When constraint is to be applied)

The pilot assignment must be rejected if the pilot does not qualify for the flight.
(Action if constraint is violated)

Figure 3-12. Sample Problem Statement (7 of 8) CF182.0

Notes:
The above business constraint requires that the captain and the copilot for a flight must be
different.
This business constraint applies to the relationships of a business relationship type rather
than to the objects of a business object type. It applies to business relationship type Pilot
for Flight which interconnects business object types Pilots and Flights.
The corresponding restriction must be applied in two cases:
• When a pilot is assigned to a flight, i.e., when a new relationship for the business
relationship type is added.
• When a pilot assignment is changed, i.e., an existing relationship for the business
relationship type is changed. (You could also view this as the deletion of the old
business relationship followed by the addition of a new business relationship.)
The pilot assignment (new or changed) should be rejected if captain and copilot would be
the same.


Note that the business constraint applies to Pilot for Flight and also needs Pilot for Flight to
check if the constraint has been violated.
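Because this constraint only needs the relationship it applies to, it could later be enforced very locally. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module and assuming, purely for illustration, that Pilot for Flight is stored as one row per flight with a captain and a copilot column. The flight and pilot identifiers are hypothetical. A CHECK constraint is verified both when a row is inserted and when it is updated, which corresponds to the two cases listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pilot_for_flight (
        flight_id  TEXT PRIMARY KEY,
        captain_id TEXT NOT NULL,
        copilot_id TEXT NOT NULL,
        -- Captain and copilot must be different.
        CHECK (captain_id <> copilot_id)
    );
""")


def run(sql, params):
    """Return True if the statement is accepted, False if it is rejected."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A valid assignment is accepted.
assigned = run("INSERT INTO pilot_for_flight VALUES (?, ?, ?)", ("CAB100", "P1", "P2"))
# A new assignment with the same pilot in both roles is rejected.
same_on_insert = run("INSERT INTO pilot_for_flight VALUES (?, ?, ?)", ("CAB200", "P3", "P3"))
# Changing an existing assignment so that both roles coincide is rejected as well.
same_on_update = run(
    "UPDATE pilot_for_flight SET copilot_id = captain_id WHERE flight_id = ?",
    ("CAB100",),
)
```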

Sample Problem Statement (8 of 8)

PILOTS FOR FLIGHT MUST HAVE LICENSE FOR AIRCRAFT MODEL FOR LEG

A pilot for a flight must have the license to fly the aircraft model for the leg for
the flight.
(To what the constraint applies: business relationship type)

To be checked when a pilot is assigned to a flight or when a previous
pilot assignment is changed.
(When constraint is to be applied)

The pilot assignment is to be rejected if the pilot does not qualify for the flight.
(Action if constraint is violated)

Also to be verified if the aircraft model for a leg of an itinerary is changed.
(When constraint is to be applied)

In this case, previous pilot assignments for flights for the leg must be
canceled and appropriate notifications must be given.
(Action if constraint is violated)

Figure 3-13. Sample Problem Statement (8 of 8) CF182.0

Notes:
The business constraint on this visual requires that the pilots for a flight must be licensed to
fly the aircraft model used for the leg for the flight. This means that the above business
constraint applies to the same business relationship type as the previous example: Pilot for
Flight.
However, there is a peculiarity for this business constraint. The associated restriction is to
be checked:
1. When a pilot is assigned to a flight or a pilot assignment is changed.
2. When the aircraft model for the leg for the flight is changed. As a consequence, pilots
previously assigned to flights for the leg might no longer be licensed to fly the new
aircraft model.
The point illustrated here is that a constraint may also have to be enforced when business
objects or business relationships of another business object type or business relationship
type are inserted, updated, or deleted. In case of the above example, the constraint has to
be verified when a relationship of business relationship type Aircraft Model for Leg is
changed.
As shown on the visual, the action to be performed when the constraint is violated depends
on what was causing the violation, the assignment of a pilot or the reassignment of an
aircraft model.
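The two triggering operations and their different reactions can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the dictionaries stand in for the business object and relationship types, and all identifiers (P1, P2, LEG1, CAB100, the model names) as well as the notification text are hypothetical.

```python
# Hypothetical sample data standing in for the business relationship types.
licenses = {"P1": {"B737-500"}, "P2": {"B737-500", "A320"}}  # pilot -> licensed models
leg_model = {"LEG1": "B737-500"}                             # Aircraft Model for Leg
flight_leg = {"CAB100": "LEG1"}                              # flight -> leg
assignments = set()                                          # Pilot for Flight: (pilot, flight)


def assign_pilot(pilot, flight):
    """Trigger 1: a pilot is assigned. Violation -> reject the assignment."""
    model = leg_model[flight_leg[flight]]
    if model not in licenses.get(pilot, set()):
        return False
    assignments.add((pilot, flight))
    return True


def change_leg_model(leg, new_model):
    """Trigger 2: the aircraft model for a leg changes.

    Violation -> cancel the affected pilot assignments and return notifications.
    """
    leg_model[leg] = new_model
    canceled = sorted(
        (pilot, flight)
        for (pilot, flight) in assignments
        if flight_leg[flight] == leg and new_model not in licenses.get(pilot, set())
    )
    assignments.difference_update(canceled)
    return [
        "Pilot %s removed from flight %s: no license for %s" % (pilot, flight, new_model)
        for pilot, flight in canceled
    ]


ok1 = assign_pilot("P1", "CAB100")
ok2 = assign_pilot("P2", "CAB100")
notices = change_leg_model("LEG1", "A320")  # P1 is not licensed for the new model
```

The same restriction is checked in both functions, but the first rejects the triggering operation while the second accepts it and repairs the affected relationships instead.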

Checkpoint

Exercise — Unit Checkpoint


1. Which of the following statements are true?
a. The problem statement should give the database designer a
basic understanding of the application domain.
b. The problem statement should allow the database designer to
create an entity-relationship model for the application domain.
c. The problem statement is input for the database designer.
d. The problem statement is created by the database designer.
e. The database designer should assist the application domain
expert in creating the problem statement.
f. The problem statement should describe the current
implementation of the application domain.
g. The problem statement should be a global, functional,
description of the application domain.

2. What are the main sections of a problem statement?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

3. What is the purpose of the overview section?


_____________________________________________________
_____________________________________________________

4. Which of the following statements are true?


a. Business object types list the possible business objects for the
application domain.
b. Business object types represent the categories of business
objects that are important to the application domain.
c. All business objects belonging to the same business object type
have the same meaning and characteristics.

5. For each business object type, the problem statement should
describe how its objects can be identified. (T/F)

6. What is a business relationship type?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

7. List at least three things that a problem statement should describe
for each business relationship type.
_____________________________________________________
_____________________________________________________
_____________________________________________________

8. For a mandatory business relationship type, there exists at least
one business object type whose objects must always have (at
least) one business relationship of that type. (T/F)

9. What do you call a business relationship type that is based on a
business object type the deletion of whose objects causes the
deletion of all objects having a relationship with the deleted
objects?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

10. What is a business constraint?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

11. List the items that the problem statement should contain for each
business constraint.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

12. The checking of a business constraint can be triggered by insert,
delete, or update operations for business object types or business
relationship types other than the one to which the restriction
applies. (T/F)

Unit Summary

Problem Statement
• Established by application domain expert
  • With assistance of database designer
• Input for database design
• Global description of application domain
• Describes business object types
  • Textual description
  • How to identify objects
• Describes relationship types between business objects
  • Who with whom
  • If always
  • With how many
  • What if partner deleted
• Describes business constraints
  • For whom
  • Under which circumstances, when, which reaction

Figure 3-14. Unit Summary CF182.0

Notes:

Unit 4. Entity-Relationship Model

What This Unit Is About


This unit describes how to establish an entity-relationship model
visualizing and structuring the business object types of the application
domain as entity types; illustrating the relationships between the
business object types as relationship types; and modeling the
constraints for the entity types and relationship types imposed by the
business constraints. It describes the constructs needed to establish
an entity-relationship model based on a sample application domain
and discusses alternate solutions.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Define the entity types for an application domain based on a
problem statement.
• Define the relationship types for an application domain based on a
problem statement.
• Define the supertypes and subtypes for an application domain.
• Identify the constraints for the entity types and relationship types of
an application domain.
• Establish an entity-relationship model for an application domain.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises


Unit Objectives
After completion of this unit, you should be able to:

• Define the entity types for an application
  domain based on a problem statement:
  • Basic and dependent entity types
• Define the relationship types for an application
  domain based on a problem statement
  • With and without attributes
• Define the supertypes and subtypes for an
  application domain
• Identify the constraints for the entity types and
  relationship types of an application domain
• Establish an Entity-Relationship Model for
  an application domain

Figure 4-1. Unit Objectives CF182.0

Notes:
This unit describes how to develop an entity-relationship model for an
application domain. You will learn to analyze a given problem statement and to define the
entity types for the application domain represented by the problem statement. There are
basic entity types corresponding to the truly independent business object types and
dependent entity types that are based on other entity types. Their instances require the
existence of corresponding instances of the entity type they are based upon.
Furthermore, you will learn to determine the relationship types for an application domain
and to represent them in an entity-relationship model. Most of the relationship types of an
entity-relationship model correspond to the business relationship types described in the
problem statement for the application domain, but there will also be others. Most
relationship types do not have attributes further describing them, but you will experience
relationship types with attributes as well and learn how to represent their attributes in an
entity-relationship model.
Supertypes and subtypes are a special category of entity types. They are interconnected
by so-called is-bundles (relationship types). They allow you to form categories and classify
the represented entity instances, i.e., the objects represented by the entity types.
Supertypes and subtypes are advanced modeling constructs.
A further topic of this unit is the constraints for the entity types and relationship types
of an application domain. Most of the constraints are derived from the business constraints
for the application domain, but there are also others.
All elements discussed in this unit are ingredients of entity-relationship models. Thus, you
will learn in this unit how to establish an entity-relationship model for an application domain.
During the unit, we will use the sample application domain for the airline company called
Come Aboard (CAB) we have used in the previous unit. Based on the problem statement
described in Appendix A - Sample Problem Statement, we will establish an
entity-relationship model for our sample airline company. We will pick specific items of the
problem statement and illustrate how they are modeled.

4.1 Entity Types


ER Model in Design Methodology


[Diagram of the design methodology. Labels shown: Problem Statement; Conceptual View;
Entity-Relationship Model; Process Inventory; Data Inventory; Tuple Types; Logical View;
Tables; Logical Data Structures; Integrity Rules; Storage View; Indexes.]

Figure 4-2. ER Model in Design Methodology CF182.0

Notes:
The development of an entity-relationship model for an application domain is the first step
of the conceptual view during which the process- and implementation-independent
architecture of the data of the application domain is established.
The application domain, described by the problem statement, is scrutinized for its business
object types, the relationships between them, and for business constraints. As a result of
the scrutiny, an entity-relationship model is established visualizing and structuring the
business object types of the application domain as entity types; illustrating the relationship
types between the entity types resulting from the business relationship types of the
application domain; and modeling the constraints for the entity types and relationship types
imposed by the business constraints.
As a general rule, the better the problem statement describing the application domain, the
easier it will be to establish the corresponding entity-relationship model. Therefore, you
should insist on a good problem statement as outlined in Unit 3 - Problem Statement and
assist the domain expert in producing it.

Even if you have a good problem statement, you will most likely encounter items that are
obscure. Consult the domain expert or the appropriate department of expertise to clarify
the open issues. Do not make assumptions on your own that are not based upon
knowledge of the application domain, but are your own speculations. They might be wrong!
The entity-relationship model is the basis for all subsequent design steps. A wrong
assumption for the entity-relationship model will produce incorrect results for the
subsequent steps. If the problem is detected during a later step, you must reiterate all
preceding steps, starting with the entity-relationship model or even the problem statement,
and correct the erroneous results. Therefore, it is advisable to solve open questions
concerning the application domain with the competent people right away and not to make
assumptions not based on knowledge of the application domain.
The development of the entity-relationship model is not a one-time affair. The
entity-relationship model is maintained constantly and changed as the subsequent steps
reveal errors or discover undocumented business object types, business relationship
types, or business constraints. If the problems found concern undocumented items of the
application domain or items not properly described in the problem statement, the problem
statement must be corrected as well. It should be corrected by the domain expert.


Entity Types, Entity Instances, Attributes

Entity Type
An independent conceptual unit representing a class of
objects with the same meaning and characteristics
about which information is to be stored and maintained

Entity (Instance)
An actual object belonging to an entity type

Attribute
A conceptual piece of information with a
distinct meaning stored for the instances of an entity type,
not actual values

Figure 4-3. Entity Types, Entity Instances, Attributes CF182.0

Notes:
As the name entity-relationship model suggests, entity types are one of the building blocks
of an entity-relationship model. They constitute independent conceptual units
representing classes of objects with the same meaning and characteristics about
which information is to be stored and maintained.
Many of these classes are derived from the business object types for the application
domain, but not necessarily all of them. Some of the entity types, especially those added
later in the design process, result from design rules, e.g., the rules for avoiding
redundancies in the information stored.
You should realize that an entity type represents a class of objects rather than a specific
object. It is a conceptual category of items. The items may physically exist, such as an
aircraft, or they may not physically exist and only be imaginary, such as an aircraft model.
(An aircraft model does not physically exist, it only exists on paper, i.e., it is imaginary.)
An entity type must be an independent conceptual unit. This means that the objects of the
entity type must have a conceptual meaning by themselves; that the information
represented by the objects is understandable by itself; and that, from the application
domain's point of view, it makes sense to process the information represented by the
objects independently. For our sample airline, it makes sense to have an entity type PILOT
representing the pilots of the company and providing information such as the name, age,
and even shoe size of the pilots (if CAB provides the shoes for the pilots as part of their
uniforms). However, it would not make sense to have an entity type combining the shoe
size of pilots with the wing span of aircraft models because the individual objects of the
entity type would not have a reasonable conceptual meaning.
The term independent in this context does not mean that the objects of an entity type are
completely unrelated to objects of other entity types. On the contrary, in a real-life
entity-relationship model, there are many interconnections between the objects of the
various entity types. It is fairly suspicious if there are entity types having no associations
with other entity types. For our sample airline company, PILOT and AIRCRAFT MODEL are
two apparent entity types making sense on their own, but, nevertheless, being interrelated
with each other: Pilots have licenses to fly aircraft models.
Up to now, we talked about the objects belonging to an entity type. In modeling
terminology, the actual objects belonging to an entity type are referred to as instances of
the entity type, entity instances, or simply entities.
The term attribute is used to denote a conceptual piece of information with a distinct
meaning stored for the instances of an entity type. Attributes represent the conceptual type
of information stored, such as last name, and not actual values (such as MILLER for last
name). Therefore, it would be better to talk about attribute types, but this is not the
terminology generally used.
Attributes represent partial information for an entity type. Whether several pieces of
information together form a distinct entity type or just a set of attributes of a larger entity
type depends on their importance for the application domain and their independence. For
example, addresses consisting of country, state, city, and street only represent a set of
attributes for the pilots of our sample airline company. They would form a separate entity
type, identifying buildings, for a shipping company.
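For readers with a programming background, the distinction between entity type, entity instance, and attribute can be compared loosely to the distinction between a class, an object, and a field. The following sketch is only an analogy, not a modeling notation used in this course; the attribute employee_no and the sample values other than MILLER are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Pilot:             # entity type PILOT: a class of objects, not a specific object
    employee_no: str     # attribute (hypothetical identifying attribute)
    last_name: str       # attribute: the conceptual piece of information ...
    age: int


# An entity instance: an actual object belonging to the entity type.
# 'MILLER' is an actual value of attribute last_name, not the attribute itself.
miller = Pilot(employee_no="P1", last_name="MILLER", age=41)

# The attributes belong to the entity type and are the same for all instances.
attribute_names = [f.name for f in fields(Pilot)]
```

The analogy has limits: as the address example shows, whether a group of fields deserves its own entity type is decided by the application domain, not by programming convenience.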


Properties of Entity Types and Attributes


• Each entity type receives a unique name
  • Should be generic class name in singular form
  • Examples: AIRCRAFT MODEL, PILOT, AIRPORT
• For all entity instances, the same common characteristics (attributes)
  are stored
• The attributes stored for the instances of an entity type have a direct
  bearing on the meaning of the entity type
• The attributes for an entity type receive a unique name
  • Should be the descriptive term for the characteristics
  • Examples: Manufacturer, Last Name, Airport Code
• For an entity instance, an attribute may assume no, one, or multiple values
  • All having the same meaning
  • Minimum and maximum number of values depends on entity type
• Attributes can be elementary (indivisible) or composite (have components)
• Every entity type must have a set of one or more attributes whose values
  together uniquely and permanently identify its (possible) entity instances
  • Entity key

Figure 4-4. Properties of Entity Types and Attributes CF182.0

Notes:
Entity types and attributes have the following properties:
• Each entity type receives a unique name. This should be the generic class name
expressing the function of the instances of the entity type. By convention, the class
name is used in the singular form, as is done for biological genera, where you talk
about the class Human Being and not about the class Human Beings. All capital letters
will be used for the name. Since the name for an entity type is used for reference
purposes, it must be unique.
For our sample airline application domain, entity types are for example: AIRCRAFT
MODEL, AIRCRAFT, PILOT, and AIRPORT.
• For all instances of an entity type, the same common characteristics are stored.
Primarily, this means that the same attributes are stored. However, as we will see later
on, it also means the same types of relationships and/or constraints are recorded.
• The attributes stored for the instances of an entity type have a direct bearing on the
meaning of the entity type. In other words, attributes are not stored for the instances of
an entity type if they have nothing to do with the semantics of the entity type. For
example, Wing Span should not be an attribute of entity type PILOT. It has nothing to do
with pilots. It is an attribute of aircraft models and, thus, of entity type AIRCRAFT
MODEL.
Obvious as this statement may seem, attributes are assigned to the wrong entity type
again and again.
• The attributes for an entity type receive a unique name. The name should clearly
express the meaning of the attribute, i.e., the characteristic it represents. The name of
an attribute may consist of multiple words. We will start each word with a capital letter
except for connecting words such as of, for, or and.
Examples of attributes are:
For entity type AIRCRAFT MODEL:
  Number of Engines, i.e., the number of engines for an aircraft model.
  Manufacturer, i.e., the company manufacturing an aircraft model.
For entity type AIRCRAFT:
  Aircraft Number, i.e., the unique serial number identifying an aircraft.
  Seat, i.e., the seats on an aircraft.
For entity type PILOT:
  Last Name, i.e., the last name of a pilot.
For entity type AIRPORT:
  Airport Code, i.e., the three-letter designator used in aviation for the
  various airports.
• For an entity instance, an attribute may assume no, one, or multiple values (an array of
values). However, all values assumed have the same meaning, namely, the meaning
imposed by the attribute. For example, attribute Seat for entity type AIRCRAFT may
assume multiple values: one for each seat on the particular aircraft. How many values
an attribute must assume at least or at most depends on the entity type.
• Attributes can be elementary or composite. From the perspective of the application
domain, the values of an elementary attribute are (logically) indivisible. This means they
cannot be subdivided into smaller units that, by themselves, have a meaning for the
application domain. Thus, they are not structured. For our sample airline company,
examples of elementary attributes are: the serial number for an aircraft and the number
of engines for an aircraft model.
In contrast, composite attributes consist of components. This means their values can be
decomposed into smaller units that have their own meaning for the application domain. All
values have the same structure imposed by the components. The components of a
composite attribute are logically related. They can be elementary attributes or again
composite attributes. Each value of a composite attribute for an entity instance is
composed of the appropriate values of its components for the entity instance.

For our sample airline company, attribute Manufacturer mentioned above is a
composite attribute for entity type AIRCRAFT MODEL. Some of its components are:
Manufacturer Code, Name of Manufacturer, Address, and Phone Number. Address is
again a composite attribute.
It depends on the application domain whether an attribute is elementary or composite.
For Come Aboard, Name of Person would be a composite attribute for entity type
PILOT consisting of elementary attributes Last Name, First Name, and Middle Initial.
For a different application domain, the entire name may be considered indivisible since
last name, first name, and middle initial are not important as separate pieces of
information and are not identifiable.
The elementary attributes for an entity type are the actual carriers of information.
Composite attributes just group attributes. The information they represent is the
information of their components. If an entity type contains a composite attribute, it
means that it contains all components of the composite attribute.
Composite attributes are not absolutely necessary. However, they are very helpful in
identifying information of an entity type that logically belongs together and referring to it.
• It is an absolute requirement that the instances of an entity type be uniquely and reliably
identifiable. Therefore, every entity type must have a set of one or more attributes
uniquely and permanently identifying all possible entity instances for the entity type.
Such a set of attributes is referred to as entity key.
For the entity key, it is not sufficient that its attributes uniquely identify the possible
entity instances. It is also required that all attributes of the set are necessary for the
unique identification, i.e., that none of the attributes can be omitted without losing the
unique identifiability. This is referred to as the minimum principle for entity keys.
It is conceivable that an entity type has multiple potential entity keys. If so, you should
choose the one that most naturally, with regard to the application domain, represents
the entity instances. If several keys are equally natural, choose the one with the fewest
attributes since this eases referencing the entity instances. (Note that different
candidate entity keys may have different numbers of attributes.)
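The uniqueness requirement and the minimum principle can be tried out on sample data. The following Python sketch is for illustration only. The helper names is_unique and is_minimal_key are assumptions, the first two rows are taken from the AIRCRAFT MODEL instances used in this unit, and the third row is invented. Note that a check over sample instances can only refute a candidate key. It cannot prove that the key identifies all possible entity instances. That knowledge must come from the application domain.

```python
from itertools import combinations

# Sample instances of entity type AIRCRAFT MODEL.
# The third row is invented so that neither attribute is unique on its own.
SAMPLE_AIRCRAFT_MODELS = [
    {"Type Code": "B747", "Model Number": "400", "Category": "Jet"},
    {"Type Code": "A310", "Model Number": "300", "Category": "Jet"},
    {"Type Code": "B747", "Model Number": "300", "Category": "Jet"},
]


def is_unique(instances, attributes):
    """Return True if the attribute values together distinguish all instances."""
    seen = set()
    for instance in instances:
        key_value = tuple(instance[name] for name in attributes)
        if key_value in seen:
            return False
        seen.add(key_value)
    return True


def is_minimal_key(instances, attributes):
    """Unique, and no attribute can be omitted without losing uniqueness."""
    if not is_unique(instances, attributes):
        return False
    for size in range(1, len(attributes)):
        for subset in combinations(attributes, size):
            if is_unique(instances, subset):
                return False
    return True


key = ("Type Code", "Model Number")
print(is_unique(SAMPLE_AIRCRAFT_MODELS, ("Type Code",)))
print(is_minimal_key(SAMPLE_AIRCRAFT_MODELS, key))
print(is_minimal_key(SAMPLE_AIRCRAFT_MODELS, key + ("Category",)))
```

In this sample, Type Code alone is not unique, the pair Type Code and Model Number satisfies the minimum principle, and adding Category violates it because Category can be omitted without losing unique identifiability.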

Representation of Entity Types
Entity Type

Standard Representation:

    AIRCRAFT MODEL

Attribute Representation:

    AIRCRAFT MODEL
    --------------------
    K Type Code
    K Model Number
      Dimensions
        Length
        Height
        Wing Span
      Category

Entity Instances

    AIRCRAFT MODEL              AIRCRAFT MODEL
    ------------------------    ------------------------
    Type Code: B747             Type Code: A310
    Model Number: 400           Model Number: 300
    Dimensions                  Dimensions
      Length: 70.67 m             Length: 46.67 m
      Height: 19.33 m             Height: 15.81 m
      Wing Span: 64.31 m          Wing Span: 43.90 m
    Category: Jet               Category: Jet

Figure 4-5. Representation of Entity Types CF182.0

Notes:
In an entity-relationship model, entity types are represented as rectangles. Most of the
time, the rectangle for an entity type just contains the name of the entity type because of
the limited size of the drawing area available for the entity-relationship model. This
representation is referred to as standard representation of the entity types.
A more detailed representation of an entity type includes attributes for the entity type. In
this case, the rectangle contains a header separated from the rest of the information by a
horizontal line. The header contains the name of the entity type. Below the header, the
attributes for the entity type are listed. For a composite attribute, its components may be
shown as well and are indented to identify them as components.
The attributes belonging to the entity key are preceded by the letter K, the nonkey
attributes are not. If a composite attribute (i.e., all its components) belongs to the entity key,
it is marked appropriately and not its components. This representation of entity types is
referred to as attribute representation of the entity types.
To make the key attributes more visible, their names are italicized throughout this
document, especially when representing entity instances.


Frequently, only a few sample attributes are shown to restrict the size of the rectangle. In
general, the illustrated attributes include the attributes of the entity key. Often, only the
attributes of the entity key are shown.
Because it tends to reduce the clarity of the entity-relationship model and because of the
limited size of the drawing area, generally, the attribute representation is only used if:
• the attributes are important for the understanding
• a small portion of the entity-relationship model is illustrated.
In general, tools only show the standard representation, i.e., the rectangles with the names.
If you click with the mouse on the rectangle for an entity type, a separate window is opened
providing a textual description of the entity type and listing the attributes as far as they have
been entered. A similar approach can be applied when using paper for the
entity-relationship model: The entity-relationship model is drawn in standard representation
and, for each entity type, a page is added providing details about the entity type including
the name, a textual description, and the attributes known.
Sometimes, it is desirable to illustrate a few sample instances for an entity type. An entity
instance is represented as a rectangular box with a header containing the name of the
entity type. The header is followed by a line for each desired attribute. For an elementary
attribute, the name of the attribute is followed by a colon (:) which, in turn, is followed by the
values of the attribute for the represented entity instance. If the attribute assumes multiple
values for the entity instance, the values are separated by commas.
If the components for a composite attribute are shown, the line for a composite attribute
only contains the name of the composite attribute. The components of composite attributes
are indented as for the attribute representation. If the components for a composite attribute
are not shown, the line for a composite attribute has the same format as a line for an
elementary attribute. The components of a value are enclosed in parentheses and
separated by commas.
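The textual layout of an entity instance described above can be sketched in a few lines of Python. This is an illustration only. The function names format_lines and render_instance are assumptions, and the sketch covers composite attributes only in the form with their components shown (the parenthesized form is left out). A composite attribute is modeled as a nested dictionary and a multivalued attribute as a list.

```python
def format_lines(attributes, depth=0):
    """Return one text line per attribute, indenting components of composites."""
    lines = []
    pad = "  " * depth
    for name, value in attributes.items():
        if isinstance(value, dict):
            # Composite attribute with components shown: name only,
            # followed by the indented components.
            lines.append(f"{pad}{name}")
            lines.extend(format_lines(value, depth + 1))
        elif isinstance(value, (list, tuple)):
            # Multiple values for one attribute are separated by commas.
            lines.append(f"{pad}{name}: " + ", ".join(str(v) for v in value))
        else:
            lines.append(f"{pad}{name}: {value}")
    return lines


def render_instance(entity_type, attributes):
    """Header with the entity type name, followed by the attribute lines."""
    return "\n".join([entity_type] + format_lines(attributes))


# Instance taken from the visual for entity type AIRCRAFT MODEL.
b747 = {
    "Type Code": "B747",
    "Model Number": "400",
    "Dimensions": {
        "Length": "70.67 m",
        "Height": "19.33 m",
        "Wing Span": "64.31 m",
    },
    "Category": "Jet",
}

print(render_instance("AIRCRAFT MODEL", b747))
```

The key attributes cannot be italicized in plain text output, so a real tool would need some other marking for them.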
As described for the attribute representation, we will italicize the lines (name and values)
for the key attributes throughout this course. If a composite attribute belongs to the entity
key and its components are shown, only the line for the composite attribute is italicized.
The examples on the visual list both key and nonkey attributes. Only a subset of the
attributes for entity type AIRCRAFT MODEL is shown. Generally, when developing the
initial entity-relationship model, you do not know all attributes for the entity types yet.
However, you should know the entity keys! If the key for an entity type cannot be derived
from the description of the related business object type in the problem statement, contact
the domain expert to identify the entity key.
The visual illustrates both the standard representation and the attribute representation for
entity type AIRCRAFT MODEL. It also shows two entity instances, a Boeing 747, Model
400, and an Airbus 310, Model 300.

Determining the Entity Types (1 of 2)

Primary source is problem statement for application domain

Business Object Type

(Basic) Entity Type

There will be others as the design progresses!!!

BUT . . .

Figure 4-6. Determining the Entity Types (1 of 2) CF182.0

Notes:
Since the entity-relationship model must reflect the application domain and visualize its
business object types, the problem statement for the application domain constitutes the
primary source for determining the entity types. For each business object type of the
application domain, there is normally an entity type in the entity-relationship model. The
entity types derived this way are usually referred to as basic entity types since they are
inherent (basic) to the application domain.
This illustrates how important it is that a good problem statement is available when the
modeling begins. Therefore, the database designer should insist on a good problem
statement being established by the domain expert (with the help of the database designer
to ensure that it contains the proper information). The better the problem statement, the
easier it is to develop the corresponding entity-relationship model.
You should realize that the final entity-relationship model will contain additional entity types
that were not apparent from the problem statement. For some of these entity types, the
corresponding business object types were simply forgotten in the problem statement, and
the problem statement should be corrected accordingly by the domain expert. Other entity
types were part of more complex business object types and must be separated out. We will
see such cases later in this unit. For them, you should also request an update of the
problem statement by the domain expert.
Furthermore, structuring requirements of the later steps of the design process may
introduce additional entity types. The entity-relationship model is updated as these entity
types are found. You may rightfully ask if these entity types do not have corresponding
business object types? In many cases (if not all), they indeed should have corresponding
business object types. However, these business object types are frequently not
immediately obvious to the domain expert because they play a secondary role from the
perspective of the application domain. It is highly advisable that the database designer
discusses these entity types with the domain expert and convinces him/her to update the
problem statement accordingly.

Determining the Entity Types (2 of 2)
Ask yourself the following questions:
• Considered by themselves, do the entity instances have a
  meaning for the application domain?
  - What is the generic class name?
  - Would the application domain conceivably process the
    instances on their own?
• How can the entity instances be uniquely identified?
  - What is the entity key?
• Do the entity instances also have nonkey attributes?
• Will there eventually be multiple instances of that type
  for the application domain?

If necessary, go back and ask the domain expert!

If you can answer all questions affirmatively, it is most likely an entity type
Figure 4-7. Determining the Entity Types (2 of 2) CF182.0

Notes:
This visual lists a set of questions you might want to ask yourself before accepting
something (a business object type) as an entity type:
• Considered by themselves, do the instances of the candidate entity type have a
reasonable meaning for the application domain?
The term by themselves emphasizes the independence of the instances. Further
subquestions leading to the answer are:
- What would be the generic class name and does it make sense in the context of the
application domain? Does it indeed represent a conceptual entity compatible with the
application domain?
- Would the application domain conceivably process the instances on their own, i.e., do
the instances have a meaning by themselves, or are the instances only meaningful
when processed together with the instances of another entity type? In the latter case,
you may rather be dealing with a subset of attributes for the other entity type and not
with a separate entity type.


• How would the instances of the candidate entity type be uniquely identified, i.e., what
would be the entity key?
You should be able to find a set of attributes that identifies the entity instances in a
manner natural to the application domain.
• Do the instances of the candidate entity type also have nonkey attributes?
It is possible, but very rare, that all attributes of an entity type belong to the entity key.
Therefore, you should be suspicious if there are not any nonkey attributes.
• Will the candidate entity type eventually contain multiple instances or will there always
be only a single instance for the entity type?
Again, it is possible that an entity type will always contain just a single entity instance
(something like a control record), but it is very unusual and should make you suspicious.
If the problem statement does not contain the answers to the above questions, go back to
the domain expert or the appropriate department of expertise. Do not make unfounded
assumptions!
If you can answer all the above questions satisfactorily and affirmatively, the candidate
entity type is most likely a real entity type. However, you should be aware that the
affirmative answers only provide clues and not proofs that something is an entity type. That
is because the entity types depend on the application domain.

Entity Types - A Piece of Advice
You should establish the entity types carefully

However:
Do not linger on endlessly!!!
You may not have all information yet
Some entity types may be hidden and reveal themselves in
the subsequent steps
Some entity types may turn out not to be entity types as you get
more information and become wiser

Remember: It is an iterative process!!!
Figure 4-8. Entity Types - A Piece of Advice CF182.0

Notes:
Since the entity-relationship model is the basis for all further steps of the design process,
you should establish the entity types very carefully. However, you should not linger on
endlessly. Projects have failed in the past because the participants fought endlessly over
what the entity types for the application domain were.
At such an early stage of the design process, you may not have all information to be a
hundred percent certain of the entity types, especially since the problem statement does
not list all data elements yet that play a role for the application domain. It only lists a few
sample data elements for each business object type.
Despite all good intentions when writing the problem statement, some entity types may
be hidden and only emerge in the subsequent steps or when all data elements are
compiled.
Conversely, some of the entity types may have been overrated and prove not to be entity
types after all as more information becomes available during the subsequent steps and you
become more familiar with the application domain.


Thus, establish the entity types carefully, but continue on to the next steps of the design
after you feel confident with what you have done. Remember that the design methodology
used in this course represents an iterative approach allowing you to continuously improve
the entity-relationship model and the dependent results.

Entity Types for CAB

MECHANIC              PILOT

AIRCRAFT MODEL        AIRPORT

AIRCRAFT

MAINTENANCE RECORD

ITINERARY             FLIGHT

Figure 4-9. Entity Types for CAB CF182.0

Notes:
This visual illustrates the entity types for our sample airline company called Come Aboard.
The entity types were derived from the problem statement in Appendix A - Sample Problem
Statement. Since this is a fairly good problem statement, there is an entity type for each
business object type. However, later in this unit, we will see that some additional entity
types will have to be added.
The entity types have the following entity keys:

Entity Type           Entity Key
--------------------  ------------------------------------------------------
AIRCRAFT MODEL        Type Code, Model Number
AIRCRAFT              Aircraft Number
PILOT                 Employee Number
MECHANIC              Employee Number
AIRPORT               Airport Code
ITINERARY             Flight Number
FLIGHT                Flight Number, From (airport of departure), To (airport
                      of arrival), Flight Locator
MAINTENANCE RECORD    Maintenance Number

4.2 Relationship Types

Relationship Types Between Entity Types

Relationship Type
A conceptual association between the entity
instances, one each, of two not necessarily different entity types

Relationship (Instance)
A specific interrelation of a given relationship type
between specific entity instances of the entity types for
the relationship type

Relationship type = Entirety of all relationships with the same meaning

Figure 4-10. Relationship Types Between Entity Types CF182.0

Notes:
As the name already suggests, relationship types form the second component of
entity-relationship models. Initially, we will concentrate on relationship types between entity
types. Later, we will expand, i.e., generalize, the relationship type definition given here.
A relationship type (between entity types) is a conceptual association between the entity
instances, one each, of two not necessarily different entity types. Thus, it describes a class
of interrelationships, having the same characteristics, connecting the entity instances of
two entity types.
The terms relationship instance and relationship are used to denote a specific
interrelationship of a given relationship type between specific instances of the entity types
for the relationship type.
Relationship instances of the same relationship type always interconnect instances of the
same respective entity types. If r1 and r2 are relationship instances of the same
relationship type and r1 associates an instance of entity type E1 with an instance of entity
type E2, then r2 must also interconnect an instance of E1 to an instance of E2.
Furthermore, r1 and r2 must have the same meaning and characteristics.


Taking this into account, you can conceive of a relationship type as the entirety of all
(potential) relationships, with the same meaning, between entity instances of two (not
necessarily different) entity types.
By definition, relationship instances exist only as long as the entity instances they
interconnect exist. If one of the instances is deleted, the relationship instance no longer
exists.
By definition, relationship types are binary in the sense that their instances always
interconnect two entity instances. At first glance, this seems to be restrictive, but it will
prove not to be the case when the relationship type definition is extended later in this unit.
Please note the similarity of the relationship type definition to the definition of business
relationship types given in Unit 3 - Problem Statement. Therefore, you may already suspect
that the business relationship types will be the primary source for the relationship types of
the entity-relationship model. This is indeed the case, but there will be additional
relationship types as we will see later in this unit.
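The definitions above can be made concrete with a small Python sketch. It is a simplified illustration, not part of the design methodology. The class RelationshipType, its method names, and the representation of an entity instance as a pair of entity type name and key value are all assumptions. The sketch enforces that every relationship instance connects an instance of the source entity type to an instance of the target entity type, and that relationship instances disappear when one of their entity instances is deleted.

```python
from dataclasses import dataclass, field


@dataclass
class RelationshipType:
    """A binary relationship type holding its relationship instances."""
    name: str
    source_type: str
    target_type: str
    instances: set = field(default_factory=set)

    def connect(self, source, target):
        """Add a relationship instance between two entity instances.

        Entity instances are (entity type name, key value) pairs.
        """
        if source[0] != self.source_type or target[0] != self.target_type:
            raise TypeError("instance does not connect the entity types "
                            "of this relationship type")
        self.instances.add((source, target))

    def remove_entity(self, entity):
        """Drop every relationship instance the deleted entity took part in."""
        self.instances = {pair for pair in self.instances if entity not in pair}


# Entity instances identified by their entity keys.
miller = ("PILOT", "0491337")
b747 = ("AIRCRAFT MODEL", ("B747", "400"))
a340 = ("AIRCRAFT MODEL", ("A340", "100"))

can_fly = RelationshipType("PILOT_can_fly_AIRCRAFT MODEL",
                           "PILOT", "AIRCRAFT MODEL")
can_fly.connect(miller, b747)
can_fly.connect(miller, a340)
```

Calling can_fly.remove_entity(b747) would leave only the relationship instance between pilot 0491337 and the A340, Model 100, and calling can_fly.connect(b747, miller) would be rejected because the entity types are in the wrong order.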


Relationship Types in ER Model

PILOT
    Source for primary direction / Target for inverse direction

      |   _can_fly_ (_can_be_flown_by_)
      |   _can_fly_: name for primary direction
      |   (_can_be_flown_by_): name for inverse direction
      v   Arrow for primary direction

AIRCRAFT MODEL
    Target for primary direction / Source for inverse direction

Figure 4-11. Relationship Types in ER Model CF182.0

Notes:
In the entity-relationship model, the entity types for a relationship type are interconnected.

Since relationship types are binary by definition as explained before, each relationship type
can be viewed from two directions. One of the directions is referred to as the primary
direction, the other as the inverse direction. The term primary seems to indicate that one of
the directions is more important than the other. From a data modeling perspective, this is
not the case and it is irrelevant which direction is chosen as the primary direction. From an
application point of view, you may want to choose as the primary direction the one that is
more important to the application.
In the above example, the relationship type interconnecting the entity types PILOT and
AIRCRAFT MODEL can be looked at from PILOT's point of view meaning that a pilot can
fly an aircraft model. The relationship type can also be looked at from AIRCRAFT MODEL's
point of view. Then, the meaning is that an aircraft model can be flown by a pilot. As
expected, the meanings are complementary. Let us choose the direction from PILOT to
AIRCRAFT MODEL as the primary direction.


In the entity-relationship model, the primary direction of a relationship type is indicated by
an arrow specifying the direction of the view.
To allow referencing them, all relationship types are uniquely named. More precisely, each
direction receives a unique name. In the entity-relationship model, the names for the
directions are placed next to the connecting arrow and the name for the inverse direction is
enclosed in parentheses. This convention together with the arrow for the primary direction
allows you to understand and interpret the relationship type correctly from the
entity-relationship model.
When talking about a direction of a relationship type, it makes sense to talk about the
source and the target of the direction. The source of the direction is the entity type from
which you look at the relationship type. The target is the opposite entity type.
In the example on the visual, _can_fly_ (more precisely,
PILOT_can_fly_AIRCRAFT MODEL as we will see in a minute) is the name of the primary
direction. PILOT is the source for the primary direction and AIRCRAFT MODEL its target.
For the inverse direction, the name is _can_be_flown_by_ (more precisely, AIRCRAFT
MODEL_can_be_flown_by_PILOT); AIRCRAFT MODEL is the source; and PILOT is the
target.
From a data modeling point of view, it is only important to be able to identify the relationship
type as such and not the various directions. For this, it is sufficient to list a single name for
the relationship type in the entity-relationship model. For simplicity, the name of the primary
direction is used since it does not require the enclosing parentheses.
People often talk about the source and target of a relationship type without mentioning a
specific direction. In this case, the source and the target of the primary direction are meant.
We will follow this convention as well throughout this course.
As mentioned before, the directions of a relationship type receive unique names. To avoid
overly lengthy names in the entity-relationship model, we are using the following naming
convention throughout the course:
• The full name for a direction always starts with the name of the source followed by an
underscore and always ends with the name of the target preceded by an underscore. All
words in between the source and target names are separated by underscores rather
than blanks.
• In the entity-relationship model, only the part of the name is shown that follows the name
of the source and precedes the name of the target. The names of the source and the
target are not shown. Thus, the illustrated name portion (abbreviated name) always
starts with an underscore and always ends with an underscore signaling the absence of
the source and target names.
This convention allows us to use the same abbreviated name in the entity-relationship
model for the directions of different relationship types or for both directions of a relationship
type and still to be able to determine the full unique names for them.
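The naming convention is mechanical enough to be expressed as two small Python helpers. They are a sketch only. The function names full_name and abbreviated_name are assumptions introduced for illustration and are not part of any modeling tool.

```python
def full_name(source, abbreviated, target):
    """Build the full direction name from source, abbreviated name, and target."""
    if not (abbreviated.startswith("_") and abbreviated.endswith("_")):
        raise ValueError("abbreviated name must start and end with an underscore")
    return f"{source}{abbreviated}{target}"


def abbreviated_name(full, source, target):
    """Strip the source and target names, keeping the enclosing underscores."""
    if not (full.startswith(source + "_") and full.endswith("_" + target)):
        raise ValueError("full name does not match the given source and target")
    return full[len(source):len(full) - len(target)]


primary = full_name("PILOT", "_can_fly_", "AIRCRAFT MODEL")
inverse = full_name("AIRCRAFT MODEL", "_can_be_flown_by_", "PILOT")
print(primary)   # PILOT_can_fly_AIRCRAFT MODEL
print(inverse)   # AIRCRAFT MODEL_can_be_flown_by_PILOT
print(abbreviated_name(inverse, "AIRCRAFT MODEL", "PILOT"))   # _can_be_flown_by_
```

Because the source and target names are part of the full name, the same abbreviated name used between different pairs of entity types still yields distinct full names.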


In addition to illustrating the relationship types in the entity-relationship model, you should
provide a detailed description for them on a separate piece of paper including the names
for both directions, the names of their sources and targets, and a textual description of the
meaning of the relationship type. When following the above naming convention, the names
of the source and target for a direction are implicitly identified and need not be specified
explicitly.

Relationship Instance Diagram

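PILOT  --- _can_fly_ (_can_be_flown_by_) -->  AIRCRAFT MODEL

Entity instances:

PILOT
Employee Number: 0491337
Last Name: Miller
First Name: Jack
... ...

PILOT
Employee Number: 1662951
Last Name: Smith
First Name: Joe
... ...

AIRCRAFT MODEL
Type Code: B747
Model Number: 400
Cruising Speed: 930 km/h
... ...

AIRCRAFT MODEL
Type Code: A340
Model Number: 100
Cruising Speed: 890 km/h
... ...

AIRCRAFT MODEL
Type Code: A310
Model Number: 300
Cruising Speed: 860 km/h
... ...

Relationship instances (_can_fly_ arrows):

PILOT 0491337 (Miller)  _can_fly_  AIRCRAFT MODEL B747, 400
PILOT 0491337 (Miller)  _can_fly_  AIRCRAFT MODEL A340, 100
PILOT 1662951 (Smith)   _can_fly_  AIRCRAFT MODEL A340, 100
PILOT 1662951 (Smith)   _can_fly_  AIRCRAFT MODEL A310, 300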

Figure 4-12. Relationship Instance Diagram CF182.0

Notes:
Relationship instance diagrams are a useful means to illustrate a relationship type by
example. They cannot replace entity-relationship models. They can only help to better
visualize small parts of an entity-relationship model by means of examples.
In a relationship instance diagram, sample entity instances are interconnected by named
arrows in the manner intended by the subject relationship type.
The topmost part of the above visual shows how the relationship type is represented in an
entity-relationship model. The representation is followed by a relationship instance diagram
for the relationship type. The relationship instance diagram shows that pilot Miller, Jack
(employee number 0491337) can fly Boeing 747, Model 400 (type code B747, model
number 400), and Airbus 340, Model 100 (type code A340, model number 100). Pilot
Smith, Joe can fly Airbus 340, Model 100, and Airbus 310, Model 300.
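As an illustration, the instances on the visual can be written down as plain data. This is only a sketch: it assumes that the employee number identifies a pilot and that type code plus model number identify an aircraft model, and the function names are hypothetical.

```python
# Sample entity instances from the visual, keyed by their (assumed) keys.
pilots = {
    "0491337": ("Miller", "Jack"),
    "1662951": ("Smith", "Joe"),
}
aircraft_models = {
    ("B747", "400"): "930 km/h",
    ("A340", "100"): "890 km/h",
    ("A310", "300"): "860 km/h",
}

# Each relationship instance of PILOT_can_fly_AIRCRAFT MODEL pairs one
# pilot (source) with one aircraft model (target).
can_fly = {
    ("0491337", ("B747", "400")),
    ("0491337", ("A340", "100")),
    ("1662951", ("A340", "100")),
    ("1662951", ("A310", "300")),
}


def models_flown_by(employee_number):
    """Follow the primary direction _can_fly_ from one pilot."""
    return sorted(model for pilot, model in can_fly if pilot == employee_number)


def pilots_for(model):
    """Follow the inverse direction _can_be_flown_by_ from one aircraft model."""
    return sorted(pilot for pilot, m in can_fly if m == model)


print(models_flown_by("0491337"))
print(pilots_for(("A340", "100")))
```

The same set of pairs serves both directions of the relationship type; only the side from which it is read changes.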

Multiple Relationship Types for Entity Types


Entity-relationship model:

    PILOT ---- _captain_for_ ----> FLIGHT
    PILOT .... _copilot_for_ ....> FLIGHT

Relationship instances:

    PILOT (Employee Number: 0491337, Last Name: Miller, First Name: Jack, ...)
        _captain_for_  FLIGHT YY1842
    PILOT (Employee Number: 1662951, Last Name: Smith, First Name: Joe, ...)
        _copilot_for_  FLIGHT YY1842
        _captain_for_  FLIGHT YY2843
    PILOT (Employee Number: 0844092, Last Name: Ferguson, First Name: Jane, ...)
        _copilot_for_  FLIGHT YY2843

    FLIGHT (Flight Number: YY1842, From: FRA, To: JFK, Flight Locator: 453,
            Planned Departure: Departure Date: 1999-07-21, Departure Time: 10:30, ...)
    FLIGHT (Flight Number: YY2843, From: ATL, To: SJC, Flight Locator: 210,
            Planned Departure: Departure Date: 1999-08-01, Departure Time: 16:35, ...)

Figure 4-13. Multiple Relationship Types for Entity Types

Notes:
The above visual demonstrates that multiple different relationship types may exist
between two entity types, which underlines why it is important to name the relationship
types (more precisely, their directions). The names allow you to differentiate the various
relationship types.
As described by the problem statement for our sample airline company called Come
Aboard, to each flight, one pilot is assigned as (flight) captain and another as copilot. This
gives rise to two relationship types between entity types PILOT and FLIGHT:
PILOT_captain_for_FLIGHT
PILOT_copilot_for_FLIGHT
The upper part of the picture illustrates their representation in an entity-relationship model.
Only the primary names are shown for the relationship types. For better distinguishability,
the connecting arrow for PILOT_copilot_for_FLIGHT has been dotted. This does not imply
a special meaning.

The lower part of the picture illustrates a relationship instance diagram comprising the two
relationship types. Pilot Miller, Jack with employee number 0491337 is captain for flight
YY1842, flight locator 453, from Frankfurt (FRA) to New York Kennedy airport (JFK). Pilot
Smith, Joe (employee number 1662951) is captain for flight YY2843, flight locator 210, from
Atlanta (ATL) to San Jose, California (SJC). Smith, Joe is also copilot for the flight Miller,
Jack is captain for.
Pilot Ferguson, Jane is copilot for captain Smith's flight from Atlanta to San Jose.
The requirement that captain and copilot for a flight must be different cannot be modeled by
these relationship types. It must be expressed by means of constraints discussed later.
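To make the missing rule concrete, the following sketch checks it against the instances of the visual. It is not the constraint notation introduced later in the course; the function name is hypothetical, and identifying a flight by flight number and flight locator is an assumption made for this example.

```python
# Relationship instances from the visual: (pilot employee number, flight).
# A flight is identified here by (flight number, flight locator); that
# choice of key is an assumption made for this sketch.
captain_for = {
    ("0491337", ("YY1842", 453)),
    ("1662951", ("YY2843", 210)),
}
copilot_for = {
    ("1662951", ("YY1842", 453)),
    ("0844092", ("YY2843", 210)),
}


def same_pilot_in_both_roles(captains, copilots):
    """Return the (pilot, flight) pairs that occur in both relationship types."""
    return captains & copilots


print(same_pilot_in_both_roles(captain_for, copilot_for))  # set()
```

Nothing in the two relationship types themselves would prevent a pair from appearing in both sets, which is why a separate constraint is needed.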

Unary Relationship Types

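Entity-relationship model:

    AIRPORT ---- _nonstop_to_ ----> AIRPORT   (the arrow returns to the entity type it starts from)

Relationship instances:

    AIRPORT (Airport Code: SJC, Country: USA, City: San Jose, ...)
        _nonstop_to_  AIRPORT STR
    AIRPORT (Airport Code: ATL, Country: USA, City: Atlanta, ...)
        _nonstop_to_  AIRPORT SJC
    AIRPORT (Airport Code: STR, Country: Germany, City: Stuttgart, ...)
        _nonstop_to_  AIRPORT ATL
        _nonstop_to_  AIRPORT SJC

Figure 4-14. Unary Relationship Types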

Notes:
The problem statement for our sample airline company states that itineraries consist of
ordered collections of nonstop connections between airports. As the term connection
implies, nonstop connections are relationships between two airports: an airport has a
nonstop connection to another airport. As indicated on the visual, the abbreviated name for
the relationship type is _nonstop_to_. Accordingly, the full name is
AIRPORT_nonstop_to_AIRPORT. The first airport is the airport of departure, the second
airport the airport of arrival.
Even though the individual relationship instances are binary in that they interconnect two
entity instances, a relationship type interconnecting instances of the same entity type is
referred to as a unary relationship type.
The upper part of the visual illustrates the representation of a unary relationship type in an
entity-relationship model: the arrow returns to the entity type it starts from.
The lower part of the visual illustrates instances for the represented relationship type.
Atlanta (ATL) has a nonstop connection to San Jose, California (SJC). Stuttgart, Germany
(STR) has nonstop connections to Atlanta and San Jose. San Jose has a nonstop
connection to Stuttgart.

A Special Relationship Type

Business Relationship Types

AIRPORTS - ITINERARIES
An itinerary consists of one or more legs. The legs are nonstop connections
between two airports, the starting and ending airports for the leg. Airports
can be the starting or ending points for legs of multiple itineraries.

• Legs are relationship instances, not entity instances
  - They only reflect the fact that two airports are interconnected, i.e., have a relationship
• Itineraries have relationships with relationships
• Need to extend the definition of relationship types

Figure 4-15. A Special Relationship Type

Notes:
Let us look closer at the business relationship type for Come Aboard associating airports
with itineraries. It states that itineraries consist of legs. The legs are nonstop connections
between two airports, the starting airport (airport of departure) and the ending airport
(airport of arrival) for the respective leg.
As explained on the previous visual, the legs (nonstop connections) are relationship
instances and not entity instances because they only reflect the fact that two airports are
interconnected, i.e., have a relationship. The appropriate relationship type was called
AIRPORT_nonstop_to_AIRPORT.
Since itineraries consist of one or more legs, they have relationships with legs, i.e., with
relationships (relationship instances). This means that we need to extend the definition of
relationship types to allow relationship types as source or target.

Relationship Types - Generalized Definition

Relationship Type
A conceptual association between:
• The entity instances, one each, of two not necessarily different entity types
• The relationship instances, one each, of two not necessarily different relationship types
• The entity instances and the relationship instances, one of each, of an entity type and a
relationship type

Relationship (Instance)
A specific interrelation of a given relationship type

Figure 4-16. Relationship Types - Generalized Definition

Notes:
This visual contains the general relationship type definition. It extends the previous
definition, which only allowed entity types as source or target, by allowing entity types or
relationship types as source or target of relationship types. All kinds of combinations are
allowed:
• The relationship instances can interconnect the instances of two (not necessarily
different) entity types. This was the initial, restricted, definition of relationship types.
• The relationship instances can interconnect the instances of two (not necessarily
different) relationship types.
• The relationship instances can interconnect an entity instance and a relationship
instance. Either one can be the source or the target. It is not important here which one is
the source or the target because the role can be reversed by selecting the other
direction of the relationship type as the primary direction.
A relationship instance in this extended sense is nothing other than a specific interrelation of
the considered relationship type.
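One way to picture the generalized definition is a data structure in which the source and target of a relationship instance may each be an entity instance or another relationship instance. The class names are hypothetical, and using the flight number as the key of ITINERARY is an assumption made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EntityInstance:
    entity_type: str
    key: tuple


@dataclass(frozen=True)
class RelationshipInstance:
    relationship_type: str
    # Source and target may each be an entity instance or,
    # in the generalized definition, a relationship instance.
    source: Union[EntityInstance, "RelationshipInstance"]
    target: Union[EntityInstance, "RelationshipInstance"]


atlanta = EntityInstance("AIRPORT", ("ATL",))
stuttgart = EntityInstance("AIRPORT", ("STR",))
itinerary = EntityInstance("ITINERARY", ("YY3367",))  # assumed key

# Entity instance to entity instance (the initial, restricted definition).
leg = RelationshipInstance("AIRPORT_nonstop_to_AIRPORT", atlanta, stuttgart)

# Relationship instance to entity instance (the generalized definition).
leg_in_itinerary = RelationshipInstance(
    "AIRPORT_nonstop_to_AIRPORT_in_ITINERARY", leg, itinerary
)

print(type(leg_in_itinerary.source).__name__)
```

Each relationship instance still interconnects exactly two instances, so the structure stays binary.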

Relationship Type on Relationship Type


Entity-relationship model:

    AIRPORT ---- _nonstop_to_ ----> AIRPORT
    (AIRPORT_nonstop_to_AIRPORT) ---- _in_ ----> ITINERARY

Relationship instances:

    ITINERARY (Flight Number: YY3367, ...)
        (ATL _nonstop_to_ STR)  _in_
        (STR _nonstop_to_ FRA)  _in_
    ITINERARY (Flight Number: YY0025, ...)
        (STR _nonstop_to_ FRA)  _in_
        (FRA _nonstop_to_ ATL)  _in_
        (ATL _nonstop_to_ SFO)  _in_
    ITINERARY (Flight Number: YY0100, ...)
        (SFO _nonstop_to_ ATL)  _in_
        (ATL _nonstop_to_ STR)  _in_

Figure 4-17. Relationship Type on Relationship Type

Notes:
As described in the problem statement for Come Aboard, an itinerary consists of one or
more legs. We have already determined that the legs are relationship instances of
relationship type AIRPORT_nonstop_to_AIRPORT. Accordingly, we need a relationship
type interconnecting this relationship type and entity type ITINERARY. In the
entity-relationship model portion above, this relationship type is represented as an arrow
from relationship type AIRPORT_nonstop_to_AIRPORT (source) to entity type ITINERARY
(target). Its abbreviated name is _in_. According to our naming convention, the full name of
the (primary direction of the) relationship type is:
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY
If necessary, parentheses may be used to avoid duplicate names or any
misunderstandings.
The lower part of the visual illustrates a relationship instance diagram for the
entity-relationship model portion of the upper part:

• The itinerary for flight number YY3367 is composed of two nonstop connections: one
from Atlanta (ATL) to Stuttgart (STR) and one from Stuttgart to Frankfurt (FRA).
• The itinerary for flight YY0025 consists of three legs (nonstop connections): one from
Stuttgart to Frankfurt, one from Frankfurt to Atlanta, and one from Atlanta to San
Francisco (SFO).
• The itinerary for flight number YY0100 consists of two legs: one from San Francisco to
Atlanta and one from Atlanta to Stuttgart.
If the airline company had round flights (e.g., sightseeing flights), airports could be
connected to themselves.
The model does not make a statement about the order of the legs although the problem
statement specifies that an itinerary is an ordered collection of nonstop connections. For
the sample itineraries above, we have used the implicit rule that the starting airport for the
next leg must be the ending airport for the previous leg. This rule may not always hold true
or it may not provide the order of the legs if the starting and ending airports for an itinerary
are the same (around-the-world trips). It is possible to model the order of the legs. We will
do this after we have talked about the necessary modeling constructs later in this unit.
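The implicit ordering rule and its limits can be sketched as follows. This is an illustration only, not the modeling construct introduced later in the unit; the function name chain_legs is hypothetical, and the round trip in the usage note is made-up sample data.

```python
# Relationship instances of AIRPORT_nonstop_to_AIRPORT_in_ITINERARY from
# the visual: each itinerary maps to an unordered set of legs.
legs_in = {
    "YY3367": {("ATL", "STR"), ("STR", "FRA")},
    "YY0025": {("STR", "FRA"), ("FRA", "ATL"), ("ATL", "SFO")},
    "YY0100": {("SFO", "ATL"), ("ATL", "STR")},
}


def chain_legs(legs):
    """Order legs so that each leg starts where the previous one ended.

    Returns None when the implicit rule does not yield a unique order
    (no unique starting airport, or the legs do not form a single chain).
    """
    arrivals = {arrival for _, arrival in legs}
    starts = [leg for leg in legs if leg[0] not in arrivals]
    if len(starts) != 1:
        return None
    ordered = [starts[0]]
    remaining = set(legs) - {starts[0]}
    while remaining:
        candidates = [leg for leg in remaining if leg[0] == ordered[-1][1]]
        if len(candidates) != 1:
            return None
        ordered.append(candidates[0])
        remaining.remove(candidates[0])
    return ordered


print(chain_legs(legs_in["YY0025"]))
# A round trip has no unique starting airport, so the rule gives no order.
print(chain_legs({("FRA", "JFK"), ("JFK", "FRA")}))
```

For the three sample itineraries the rule recovers an order; for the round trip it does not, which is why the order has to be modeled explicitly.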
As defined, relationship types are binary in nature in that all relationship instances
interconnect two instances. Many modeling methodologies only allow entity types as
source or target of relationship types and, in order to compensate for the loss of
functionality, introduce n-ary relationship types.
N-ary relationship types interconnect the instances of n entity types. These methodologies
would treat the business relationship type Airports - Itineraries used for this visual as a
ternary (3-ary) relationship type whose instances interconnect three entity
instances: a starting airport, an ending airport, and the itinerary. For the correct
interpretation of n-ary relationship types, you need to define the roles of the entity types
within the relationship types.
In case of our sample ternary relationship type, you need to specify that the first airport is
the starting airport for the leg of the itinerary and the second airport the ending airport of
that leg. By doing this, you implicitly define a relationship between the two airports, namely,
that they are the starting and ending airports for a nonstop connection. This is the
relationship type explicitly implemented in the entity-relationship model portion of the
visual. It more clearly expresses the actual situation.
Binary relationship types are sufficient if relationship types are allowed as source or target.
By using only binary relationship types as we have defined them, the application domain is
much better structured and hidden relationship types are revealed. Furthermore, using only
binary relationship types avoids violations of the Fourth Normal Form and the Fifth Normal
Form.

Relationship Type Versus Attribute

AIRCRAFT - MAINTENANCE RECORDS


As an aircraft is serviced, a maintenance record for the aircraft is established. A
maintenance record applies to one and only one aircraft. For an aircraft, there may be
multiple maintenance records.
The maintenance records for an aircraft contain the serial number for the aircraft. All
maintenance records for an aircraft must be kept for the time the aircraft is owned by
CAB and for two years thereafter. This implies that the maintenance records must
still be kept after the remaining information for the aircraft has been deleted.

• Relationships require that source and target instances exist
  - Not guaranteed here
• Aircraft Number must be an attribute

    AIRCRAFT ---- _has_ ----> MAINTENANCE RECORD   (cannot be expressed by a relationship type)

    MAINTENANCE RECORD
    K  Maintenance Number
       ...
       Aircraft Number

Figure 4-18. Relationship Type Versus Attribute

Notes:
It is not always clear whether a business relationship type must be modeled as a
relationship type or just constitutes an attribute of an entity type. This is illustrated on the
visual by means of business relationship type AIRCRAFT - MAINTENANCE RECORDS for
our sample airline company.
The fact that there is a business relationship type seems to indicate that the
entity-relationship model should include a relationship type
AIRCRAFT_has_MAINTENANCE RECORD interconnecting entity types AIRCRAFT and
MAINTENANCE RECORD.
However, the description of the business relationship type states that the maintenance
records for an aircraft contain the serial number for the aircraft (aircraft number). This
seems to indicate that the aircraft number should be an attribute of entity type
MAINTENANCE RECORD. But, do not be fooled! The aforementioned text only
expresses that the aircraft number is displayed with a maintenance record (e.g., in the
maintenance-record form on paper or in a window on a screen). It does not describe how
the associations between aircraft and maintenance records are internally stored in a
database. For a relational database management system, they would not be stored as part
of the maintenance records if a maintenance record belonged to multiple aircraft. (In case
of our sample airline company, it can only belong to one aircraft.)
Moreover, the entity-relationship model is part of the conceptual view during which only the
conceptual interrelationships, and not any physical implementations, should be considered.
Accordingly, the fact that a maintenance record contains the aircraft number rather
expresses the relationship between maintenance records and aircraft (the inverse direction
of relationship type AIRCRAFT_has_MAINTENANCE RECORD).
Well, we must disappoint you in this case ... Unfortunately, the description of the business
relationship type includes the remark that the maintenance records (including the aircraft
number) must be kept even after the remaining information for the aircraft has been
deleted. This means that the association with the aircraft must be maintained.
This requirement prevents modeling the business relationship type between aircraft and
maintenance records as a relationship type in the entity-relationship model since
relationship instances, at all times, require the existence of their source and target
instances. A relationship instance (automatically) disappears when its source or target
instance is deleted.
Thus, the considered business relationship type cannot be expressed as a relationship
type. It must be expressed as an attribute of entity type MAINTENANCE RECORD.
If such an anomaly, as exemplified by the considered business relationship type, does not
exist, an association between two entity types should always be expressed as a
relationship type in the entity-relationship model regardless of any future implementation
considerations.
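The difference between the two modeling choices can be illustrated with a small sketch. The aircraft number A-0001 is a made-up value, and the function and variable names are hypothetical.

```python
aircraft = {"A-0001"}  # hypothetical aircraft number

# Variant 1: the association as relationship instances
# (aircraft number, maintenance number).
has_record = {("A-0001", 10385), ("A-0001", 10386)}

# Variant 2: the aircraft number as an attribute of MAINTENANCE RECORD.
maintenance_records = {
    10385: {"aircraft_number": "A-0001"},
    10386: {"aircraft_number": "A-0001"},
}


def delete_aircraft(number):
    """Delete an aircraft together with the relationship instances using it."""
    aircraft.discard(number)
    # Relationship instances cannot outlive their source instance.
    for pair in [p for p in has_record if p[0] == number]:
        has_record.remove(pair)
    # Attribute values of the maintenance records are left untouched.


delete_aircraft("A-0001")
print(has_record)  # set()
print(maintenance_records[10385]["aircraft_number"])  # A-0001
```

After the deletion, only the attribute variant still tells us which aircraft a maintenance record was written for.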

Relationship Types for CAB

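[Figure: entity-relationship model for CAB. The connecting arrows are not recoverable from
the extracted text.
Entity types shown: MECHANIC, PILOT, AIRCRAFT MODEL, AIRPORT, AIRCRAFT,
MAINTENANCE RECORD, ITINERARY, FLIGHT.
Relationship type names shown: _can_fly_, _trained_for_, _can_land_at_, _copilot_for_,
_captain_for_, _from_, _for_ (several), _nonstop_to_, _in_, _scheduled_for_, _belongs_to_.]

Figure 4-19. Relationship Types for CAB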

Notes:
This visual contains the relationship types for our sample airline company called Come
Aboard. The relationship types were derived from the problem statement contained in
Appendix A - Sample Problem Statement. Since this is a fairly good problem statement, the
relationship types could easily be derived from the business relationship types described
by the problem statement. However, later in this unit, we will see that some additional
relationship types will have to be added.
Note that there is no relationship type between MAINTENANCE RECORD and AIRCRAFT
as discussed before.

Cardinalities

    AIRCRAFT MODEL  1..1 ---- _for_ (_belongs_to_) ----> 0..m  AIRCRAFT
        1..1: An aircraft belongs to one and only one aircraft model
        0..m: An aircraft model may be for many aircraft

    Possible cardinalities: 0..1 (abbreviated 1), 0..m (abbreviated m), 1..1, 1..m

    AIRCRAFT  0..1 ---- _for_ (_has_been_assigned_) ----> 0..m  FLIGHT
        0..1: At most one aircraft can be assigned to a flight
        0..m: An aircraft may be (used) for many flights

Figure 4-20. Cardinalities

Notes:
For modeling purposes and for the transformation of an entity-relationship model into tuple
types and tables, it is important to know if an instance of the source of a relationship type
can have relationships with multiple target instances, or vice versa, or only with a single
target or source instance. It is also important to know if a source or target instance must
always be connected to at least one target or source instance, respectively.
Since the relationship types of the entity-relationship model mostly correspond to the
business relationship types of the problem statement, the multiplicities for the relationship
types should be reflected by the descriptions of the corresponding business relationship
types in the problem statement. The multiplicities were required input for the business
relationship types. Since they are application-domain specific, the database designer
should not make assumptions about them on his/her own. He/She should consult the
domain expert or the appropriate department of expertise to obtain the correct information.
In the entity-relationship model, the multiplicities are expressed by cardinalities:

• The cardinality for the target describes how many target instances may be associated
with a single source instance and is placed close to the connecting arrow at the target
(end) of the relationship type.
• The cardinality for the source expresses how many source instances may be associated
with a single target instance and is placed close to the connection arrow at the source
(end) of the relationship type.
• A cardinality consists of two values, a minimum value and a maximum value separated
by two periods:
minimum .. maximum
Minimum can be 0 (zero) or 1. Maximum can be 1 or m where m is used as abbreviation
for many.
A minimum of 0 for the cardinality of the target (source) means that a source (target)
instance may not necessarily have a relationship with a target (source) instance.
A minimum of 1 for the cardinality of the target (source) means that a source (target)
instance must always have at least one relationship with the instances of the target
(source).
A maximum of 1 for the cardinality of the target (source) means that a source (target)
instance cannot have more than one relationship with the instances of the target
(source).
A maximum of m for the cardinality of the target (source) means that a source (target)
instance can have many relationships with the instances of the target (source).
The upper relationship type on the visual interconnects aircraft models and aircraft. For the
corresponding business relationship type, the problem statement states the following:
• An aircraft belongs to one and only one aircraft model.
• An aircraft model may apply to multiple aircraft.
The fact that an aircraft belongs to one and only one aircraft model is expressed by a
cardinality of 1..1 at the AIRCRAFT MODEL end of the relationship type since it describes
the cardinality for the source. The fact that an aircraft model may apply to multiple aircraft is
expressed by a cardinality of 0..m at the AIRCRAFT end of the relationship type. The
minimum value of zero allows for aircraft models for which there is no aircraft.
The lower part of the visual illustrates the cardinalities for relationship type
AIRCRAFT_for_FLIGHT for our sample airline company. According to the problem
statement, an aircraft may be used for many flights resulting in a target cardinality of 0..m.
Note that aircraft need not be assigned to flights at all times. According to the problem
statement, at most one aircraft can be assigned to a flight, but there need not be an aircraft
assigned to a flight. This results in a cardinality of 0..1 for the source of the relationship
type.
Because of the minimum and maximum values they can assume, the possible cardinalities
are:

0..1 = at most one
0..m = any number
1..1 = one and only one
1..m = one or more
Since 0..1 and 0..m are the most common cardinalities, 1 and m can be used as
abbreviations for them.
Relationship types with cardinalities ..m (meaning 0..m or 1..m) at both ends are referred to
as m:m relationship types (m to m).
Relationship types with cardinalities ..1 (meaning 0..1 or 1..1) at both ends are referred to
as 1:1 relationship types (one to one).
Relationship types with cardinality ..m at one end and ..1 at the other end are referred to as
1:m relationship types (one to m).
A further classification of the relationship types is the following:
Relationship types with cardinality 1.. (meaning 1..1 or 1..m) at one or both ends are
referred to as mandatory relationship types (mandatory for the source or target or both). A
relationship type is mandatory for the source if the target cardinality is 1..1 or 1..m and
mandatory for the target if the source cardinality is 1..1 or 1..m.
Relationship types with cardinality 0.. (meaning 0..1 or 0..m) at one or both ends are
referred to as optional or conditional relationship types (conditional for the source or target
or both).
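The classifications above can be summarized in a short sketch. The function name classify and the dictionary layout are hypothetical; cardinalities are passed as strings such as "1..1" or the abbreviations "1" and "m".

```python
def classify(source_card: str, target_card: str) -> dict:
    """Classify a relationship type by its source and target cardinalities."""

    def expand(card):
        # "1" and "m" abbreviate "0..1" and "0..m".
        if card in ("1", "m"):
            card = "0.." + card
        minimum, maximum = card.split("..")
        return minimum, maximum

    s_min, s_max = expand(source_card)
    t_min, t_max = expand(target_card)

    if s_max == "m" and t_max == "m":
        kind = "m:m"
    elif s_max == "1" and t_max == "1":
        kind = "1:1"
    else:
        kind = "1:m"

    return {
        "kind": kind,
        # Mandatory for the source if the target cardinality is 1..;
        # mandatory for the target if the source cardinality is 1..
        "mandatory_for_source": t_min == "1",
        "mandatory_for_target": s_min == "1",
    }


# AIRCRAFT MODEL_for_AIRCRAFT: source 1..1, target 0..m
print(classify("1..1", "0..m"))
# AIRCRAFT_for_FLIGHT: source 0..1, target 0..m
print(classify("0..1", "0..m"))
```

For AIRCRAFT MODEL_for_AIRCRAFT, the sketch reports a 1:m relationship type that is mandatory for the target: every aircraft must belong to an aircraft model, while an aircraft model need not have any aircraft.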

Cardinalities (Example 1)

    MAINTENANCE RECORD  m ---- _from_ ----> 1..1  MECHANIC
        m:    Possibly many maintenance records from a mechanic
        1..1: Maintenance record from at least one mechanic
              Maintenance record from at most one mechanic

Relationship instances:

    MAINTENANCE RECORD (Maintenance Number: 10385, ...)  _from_  MECHANIC (Employee Number: 9163488, ...)
    MAINTENANCE RECORD (Maintenance Number: 10386, ...)  _from_  MECHANIC (Employee Number: 0275912, ...)
    MAINTENANCE RECORD (Maintenance Number: 10404, ...)  _from_  MECHANIC (Employee Number: 0275912, ...)
    MECHANIC (Employee Number: 4712002, ...)  has no maintenance records

Figure 4-21. Cardinalities (Example 1)

Notes:
The above visual illustrates the cardinalities for relationship type MAINTENANCE
RECORD_from_MECHANIC describing the interrelationships between maintenance
records and mechanics for our sample airline company.
A maintenance record must be from at least one mechanic as indicated by the minimum
value of 1 for the target cardinality (at the MECHANIC end of the relationship type).
Accordingly, in the relationship instance diagram, there must be at least one connection
from each maintenance record to a mechanic. The maximum value of 1 for the cardinality
reflects that a maintenance record can be from at most one mechanic. Consequently, there
must not be more than one connection from a maintenance record to mechanics.
The source cardinality of m is equivalent to a cardinality of 0..m. It specifies that a
mechanic may be responsible for multiple (many) maintenance records, but need not be
responsible for any:
• Mechanic 9163488 is responsible for a single maintenance record.
• Mechanic 0275912 is responsible for two, i.e., multiple, maintenance records.

• Mechanic 4712002 (currently) is not responsible for any maintenance records.
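The sample instances can be checked against the two cardinalities with a small sketch. The function names and the representation of a cardinality as a (minimum, maximum) pair, with the string "m" standing for many, are assumptions made for this illustration.

```python
from collections import Counter

maintenance_records = [10385, 10386, 10404]
mechanics = ["9163488", "0275912", "4712002"]
# MAINTENANCE RECORD_from_MECHANIC instances: (record, mechanic).
from_ = [(10385, "9163488"), (10386, "0275912"), (10404, "0275912")]


def within(count, minimum, maximum):
    """maximum is an int or the string "m" for many."""
    return count >= minimum and (maximum == "m" or count <= maximum)


def check(instances, sources, targets, source_card, target_card):
    """Check relationship instances against (minimum, maximum) cardinalities."""
    per_source = Counter(source for source, _ in instances)
    per_target = Counter(target for _, target in instances)
    # Target cardinality: target instances per single source instance.
    target_ok = all(within(per_source[s], *target_card) for s in sources)
    # Source cardinality: source instances per single target instance.
    source_ok = all(within(per_target[t], *source_card) for t in targets)
    return source_ok and target_ok


print(check(from_, maintenance_records, mechanics, (0, "m"), (1, 1)))  # True
```

Removing the connection for maintenance record 10404 would make the check fail, because the target cardinality of 1..1 requires every maintenance record to be from a mechanic.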

Cardinalities (Example 2)

    AIRPORT  m ---- _nonstop_to_ ----> m  AIRPORT
    (AIRPORT_nonstop_to_AIRPORT)  1..m ---- _in_ ----> m  ITINERARY
        1..m: An itinerary has at least one leg (nonstop portion)
              An itinerary can have many legs (nonstop portions)
        m:    A leg may occur in zero, one, or many itineraries

Figure 4-22. Cardinalities (Example 2)

Notes:
On the above visual, AIRPORT_nonstop_to_AIRPORT is an m:m relationship type: An
airport can be the airport of arrival or the airport of departure for any number of nonstop
connections (legs).
According to the problem statement for Come Aboard, an itinerary must always have at
least one leg and can have multiple legs. Thus, the source cardinality must be 1..m for the
_in_ relationship type. Note the way the cardinality is written on the visual to save space.
The target cardinality for the _in_ relationship type is m meaning that the legs may be part
of multiple itineraries, but need not belong to any itineraries. It is fair to ask whether a leg
should not always belong to at least one itinerary, resulting in a target cardinality of 1..m
rather than 0..m. Why have a nonstop connection otherwise? The problem statement for
our airline company is not precise in this regard and we must consult the domain expert for
the correct cardinality. His/Her answer is that CAB wants to record planned nonstop
connections between airports even before itineraries are established. This means that the
cardinality of m is correct.

Relationship types with cardinalities of 1.. at both ends represent a kind of "chicken and
egg" problem when adding instances for the source or target. If an instance for the source
is added, the corresponding target instance, if it does not exist yet, and the interconnecting
relationship instance must be added at the same time. A similar scenario applies to adding
a target instance. The transaction concept of relational database management systems
allows this provided that the completeness check is performed at the end of the
transaction, i.e., when the transaction is committed, and not when the source or target
instance is inserted.
To avoid the problem from the beginning, it may be preferable to change one of the
minimum cardinalities to 0. In case of our example, this allows the legs to be established
(first) without a check being performed. However, you should note that this is an
implementation problem and not a conceptual design problem. Therefore, you should use
1.. cardinalities at both ends, if that is what the application domain requires, and handle the
resulting problem during the later design phases.
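The idea of checking completeness only at commit time can be sketched in plain Python. The Transaction class below is hypothetical and is not the API of any database management system; it only checks the 1..m side of the _in_ relationship type (every itinerary needs at least one leg).

```python
class Transaction:
    """Collects changes and checks minimum cardinalities only at commit."""

    def __init__(self):
        self.itineraries = set()
        self.legs_in = set()  # (leg, itinerary) relationship instances

    def add_itinerary(self, flight_number):
        # No check here: the itinerary may briefly exist without a leg.
        self.itineraries.add(flight_number)

    def add_leg_to_itinerary(self, leg, flight_number):
        self.legs_in.add((leg, flight_number))

    def commit(self):
        # Completeness check: every itinerary needs at least one leg.
        with_legs = {itinerary for _, itinerary in self.legs_in}
        incomplete = self.itineraries - with_legs
        if incomplete:
            raise ValueError(f"itineraries without a leg: {sorted(incomplete)}")
        return True


tx = Transaction()
tx.add_itinerary("YY3367")  # would fail an immediate check
tx.add_leg_to_itinerary(("ATL", "STR"), "YY3367")
print(tx.commit())  # True
```

If the check ran inside add_itinerary instead, the first call would already fail, which is the situation the deferred check avoids.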

Defining Attributes and Relationship Key


Defining Attributes of a Relationship Type = Key of Source AND Key of Target
(independent of cardinalities)

Relationship Key = Key of ...

    Source cardinality    Target cardinality    Relationship key
    ..1                   ..1                   Source OR Target
    ..1                   ..m                   Target
    ..m                   ..1                   Source
    ..m                   ..m                   Source AND Target

Figure 4-23. Defining Attributes and Relationship Key

Notes:
To fully describe a relationship instance, you must specify the source and target instances
interconnected by the relationship instance. The source and target instances can be
identified by means of the values of their keys. If the source and target of the relationship
type are entity types, the keys are the respective entity keys. We will see in a moment what
the key is if the source or the target is a relationship type.
Since the keys of source and target completely describe and define the possible
relationship instances, they are referred to as defining attributes of the relationship type.
The defining attributes of a relationship type are completely independent of the cardinalities
for the relationship type.
Similar to the introduction of the term entity key, the term relationship key is introduced to
denote a subset of the defining attributes of a relationship type that can be used to uniquely
identify the potential relationship instances and does not contain any defining attributes not
needed for the unique identification (minimum principle).
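The dependence of the relationship key on the cardinalities, as summarized on the visual (Figure 4-23), can be sketched as a small function. The function name and the representation of keys as tuples of attribute names are assumptions made for this illustration.

```python
def relationship_key(source_key, target_key, source_max, target_max):
    """Return the possible relationship keys as a list of attribute tuples.

    source_max and target_max are the maximum values ("1" or "m") of the
    source and target cardinalities.
    """
    if source_max == "m" and target_max == "m":
        return [source_key + target_key]  # source AND target
    if source_max == "1" and target_max == "m":
        return [target_key]
    if source_max == "m" and target_max == "1":
        return [source_key]
    return [source_key, target_key]  # both ..1: source OR target


# MAINTENANCE RECORD_from_MECHANIC: source cardinality m, target 1..1
print(relationship_key(("Maintenance Number",), ("Employee Number",), "m", "1"))
# AIRPORT_nonstop_to_AIRPORT: m at both ends
print(relationship_key(("Airport Code (from)",), ("Airport Code (to)",), "m", "m"))
```

The defining attributes are the same in every case (the keys of source and target); only the subset needed for unique identification changes with the cardinalities.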
It depends on the cardinalities for the relationship type which of the defining attributes can
form the relationship key:

• If both cardinalities of the relationship type are ..m cardinalities (i.e., 0..m or 1..m), each
source instance can be associated with multiple target instances and each target
instance with multiple source instances. Thus, each source or target key value may
occur as defining attribute of multiple relationship instances.
Consequently, a relationship instance can only be uniquely identified (referred to) by
providing both the key value for the source and the target. Accordingly, the relationship
key for the relationship type consists of the key of the source and the key of the target.
• If the cardinality of the source is ..1 and the cardinality of the target is ..m, there may be
multiple target instances for each source instance, but there may be only one source
instance for any target instance. Thus, a source key value may occur as defining
attribute of multiple relationship instances whereas a target key value can only occur as
defining attribute of a single relationship instance.
Consequently, a relationship instance can uniquely be identified by providing the value
of its target defining attribute, i.e., the key value of its target instance. In other words, the
relationship key consists of (the attributes of) the key of the target of the relationship
type.
• Similarly, if the cardinality of the target is ..1 and the cardinality of the source is ..m, there
may be multiple source instances for each target instance, but there may be only one
target instance for any source instance. Thus, the target key value may occur as
defining attribute of multiple relationship instances whereas a source key value can only
occur as defining attribute of a single relationship instance.
Consequently, a relationship instance can uniquely be identified by providing the value
of its source defining attribute, i.e., the key value of its source instance. In other words,
the relationship key consists of (the attributes of) the key of the source of the relationship
type.
• If both cardinalities are ..1, for every source instance there may only be one target
instance and vice versa. Thus, each source or target key value may occur once as
defining attribute of a relationship instance. A relationship instance can be uniquely
identified by providing the key value of its source instance or the key value of its target
instance. Only one is required.
Accordingly, you can choose as relationship key either the key of the source or the key
of the target of the relationship type, but not both (minimum principle).
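
The four cases above can be sketched as a small function. This is only an illustration under stated assumptions: the function names, the representation of keys as tuples of attribute names, and the cardinality strings ("1..1", "0..m", "m", and so on) are choices made for this sketch and are not part of the course material.

```python
from typing import List, Tuple

Key = Tuple[str, ...]


def is_many(cardinality: str) -> bool:
    """True for ..m cardinalities such as 'm', '0..m', or '1..m'."""
    return cardinality.split("..")[-1] == "m"


def candidate_relationship_keys(source_key: Key, target_key: Key,
                                source_card: str, target_card: str) -> List[Key]:
    """Return the possible relationship keys for the given cardinalities."""
    source_many = is_many(source_card)
    target_many = is_many(target_card)
    if source_many and target_many:
        # m:m, both keys are needed
        return [source_key + target_key]
    if target_many:
        # source ..1, target ..m: a target key value occurs only once
        return [target_key]
    if source_many:
        # source ..m, target ..1: a source key value occurs only once
        return [source_key]
    # both ..1: either key is sufficient, choose one (minimum principle)
    return [source_key, target_key]


# AIRCRAFT MODEL_for_AIRCRAFT: one model per aircraft, many aircraft per model
aircraft_model_key = ("Type Code", "Model Number")
aircraft_key = ("Aircraft Number",)
print(candidate_relationship_keys(aircraft_model_key, aircraft_key, "1..1", "m"))
# prints [('Aircraft Number',)]
```

For the both-..1 case the function returns two alternatives; the designer picks exactly one of them.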

Relationship Key (Example 1)

AIRCRAFT MODEL (K Type Code, K Model Number, ...)  1..1  _for_  m  AIRCRAFT (K Aircraft Number, ...)

Defining Attributes: Type Code, Model Number, Aircraft Number
Relationship Key:    Aircraft Number

Figure 4-24. Relationship Key (Example 1)

Notes:
The visual shows relationship type AIRCRAFT MODEL_for_AIRCRAFT, a 1:m relationship
type since there may be multiple aircraft for each aircraft model, but one and only one
aircraft model for each aircraft.
The defining attributes for the relationship type are the keys of the source and the target,
i.e., Type Code and Model Number from AIRCRAFT MODEL and Aircraft Number from
AIRCRAFT.
Since there is only one aircraft model for each aircraft, Aircraft Number, i.e., the key of
entity type AIRCRAFT, becomes the relationship key.

Relationship Key (Example 2)
AIRPORT (K Airport Code, ...)  _nonstop_to_  AIRPORT, with roles From (m) and To (m)
AIRPORT_nonstop_to_AIRPORT  _in_  ITINERARY (K Flight Number, ...)

_nonstop_to_
    Defining Attributes: From (Airport Code), To (Airport Code)
    Relationship Key:    From (Airport Code), To (Airport Code)

_in_
    Defining Attributes: Flight Number, From (Airport Code), To (Airport Code)
    Relationship Key:    Flight Number, From (Airport Code), To (Airport Code)

Figure 4-25. Relationship Key (Example 2)

Notes:
If we want to determine the defining attributes or the relationship key of relationship type
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY, we first need to find the relationship key
of relationship type AIRPORT_nonstop_to_AIRPORT. Its source and target are entity types
so that we can immediately derive its defining attributes and relationship key. The defining
attributes are twice Airport Code, once playing the role of the airport of departure (From)
and once the role of the airport of arrival (To).
To make this apparent, you can (and should) indicate the respective roles at the
appropriate ends of the relationship type. The defining attributes for the relationship type
should be named accordingly. As done on the visual, you should add, in parentheses, the
original name of the attributes since the roles only act as synonyms for them.
AIRPORT_nonstop_to_AIRPORT is a m:m relationship type. Therefore, the relationship
key consists of all defining attributes.
After having determined the relationship key of AIRPORT_nonstop_to_AIRPORT, we also
know the defining attributes of relationship type
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY. They consist of the key of the target
and the key of the source for the relationship type, i.e., of Flight Number (from ITINERARY)
and From and To (from AIRPORT_nonstop_to_AIRPORT). The sequence of the attributes
is not important.
Since AIRPORT_nonstop_to_AIRPORT_in_ITINERARY is a m:m relationship type, its
relationship key consists of its defining attributes, i.e., Flight Number, From, and To.
When determining the defining attributes or the relationship key of a relationship type, you
must back-step until you finally reach relationship types whose source and target are entity
types. Start determining the defining attributes and the relationship keys from there. If the
source or target of the _nonstop_to_ relationship type had been relationship types, you
would have had to back-step further to determine the defining attributes and the relationship
key.
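
The back-stepping described above is naturally recursive. The following is a minimal sketch, assuming hypothetical classes EntityType and RelationshipType and Boolean flags for ..m cardinalities; none of these names come from the course material. Roles such as From and To are rendered as "Role (Attribute)" as on the visual.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class EntityType:
    name: str
    key: Tuple[str, ...]


@dataclass(frozen=True)
class RelationshipType:
    name: str
    source: Union[EntityType, "RelationshipType"]
    target: Union[EntityType, "RelationshipType"]
    source_many: bool                  # True if the source cardinality is ..m
    target_many: bool                  # True if the target cardinality is ..m
    source_role: Optional[str] = None
    target_role: Optional[str] = None


def key_of(node):
    """Entity key, or relationship key found by back-stepping."""
    if isinstance(node, EntityType):
        return node.key
    return relationship_key(node)


def with_role(key, role):
    """Rename key attributes to 'Role (Attribute)' if a role is given."""
    if role is None:
        return key
    return tuple(f"{role} ({attribute})" for attribute in key)


def defining_attributes(rel):
    """Key of the source plus key of the target, whatever the cardinalities."""
    return (with_role(key_of(rel.source), rel.source_role)
            + with_role(key_of(rel.target), rel.target_role))


def relationship_key(rel):
    source_part = with_role(key_of(rel.source), rel.source_role)
    target_part = with_role(key_of(rel.target), rel.target_role)
    if rel.source_many and rel.target_many:
        return source_part + target_part
    if rel.target_many:
        return target_part
    # source ..m and target ..1, or both ..1 (this sketch then picks the source)
    return source_part


airport = EntityType("AIRPORT", ("Airport Code",))
itinerary = EntityType("ITINERARY", ("Flight Number",))
nonstop_to = RelationshipType("AIRPORT_nonstop_to_AIRPORT", airport, airport,
                              True, True, "From", "To")
in_itinerary = RelationshipType("AIRPORT_nonstop_to_AIRPORT_in_ITINERARY",
                                nonstop_to, itinerary, True, True)
print(relationship_key(in_itinerary))
# prints ('From (Airport Code)', 'To (Airport Code)', 'Flight Number')
```

The attributes appear in a different order than on the visual; as stated above, the sequence of the attributes is not important.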

Cardinalities for CAB

[Visual: entity-relationship diagram for Come Aboard showing entity types MECHANIC, PILOT, AIRCRAFT MODEL, AIRPORT, AIRCRAFT, MAINTENANCE RECORD, ITINERARY, and FLIGHT, connected by relationship types such as _can_fly_, _trained_for_, _can_land_at_, _nonstop_to_, _in_, _captain_for_, _copilot_for_, _scheduled_for_, _from_, _for_, and _belongs_to_, each annotated with its cardinalities.]

Figure 4-26. Cardinalities for CAB

Notes:
This visual contains the cardinalities for the relationship types for our sample airline
company called Come Aboard. The cardinalities for the relationship types were derived
from the description of the business relationship types contained in the problem statement
in Appendix A - Sample Problem Statement.
Based on the cardinalities, the relationship types have the following relationship keys:

Relationship Type                                           Relationship Key

AIRCRAFT MODEL_for_AIRCRAFT                                 Aircraft Number
AIRPORT_nonstop_to_AIRPORT                                  From (Airport Code), To (Airport Code)
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY                     Flight Number, From (Airport Code), To (Airport Code)
AIRCRAFT_for_FLIGHT                                         Flight Number, From (Airport Code), To (Airport Code), Flight Locator
MAINTENANCE RECORD_from_MECHANIC                            Maintenance Number
AIRCRAFT MODEL_can_land_at_AIRPORT                          Type Code, Model Number, Airport Code
PILOT_can_fly_AIRCRAFT MODEL                                Employee Number, Type Code, Model Number
PILOT_captain_for_FLIGHT                                    Flight Number, From (Airport Code), To (Airport Code), Flight Locator
PILOT_copilot_for_FLIGHT                                    Flight Number, From (Airport Code), To (Airport Code), Flight Locator
AIRCRAFT MODEL_for_AIRPORT_nonstop_to_AIRPORT_in_ITINERARY  Flight Number, From (Airport Code), To (Airport Code)
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_for_FLIGHT          Flight Number, From (Airport Code), To (Airport Code), Flight Locator
MECHANIC_trained_for_AIRCRAFT MODEL                         Employee Number, Type Code, Model Number
MECHANIC_scheduled_for_AIRCRAFT                             Employee Number, Aircraft Number
MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD            Maintenance Number

4.3 Dependent Entity Types, Supertypes, and Subtypes

A First Correction of the CAB Model


Before the correction:

AIRCRAFT MODEL
    K Type Code
    K Model Number
    Category             (not model specific, only type specific)
    Manufacturer         (not model specific, only type specific)
    Number of Engines    (not model specific, only type specific)
    Dimensions
        Length
        Height
        Wing Span
    ...

After the correction:

AIRCRAFT TYPE (K Type Code, Category, Manufacturer, Number of Engines, ...)
    1..1  _for_  1..m
AIRCRAFT MODEL (K Type Code, K Model Number, Dimensions (Length, Height, Wing Span), ...)

However, this is a very special relationship type.

Figure 4-27. A First Correction of the CAB Model

Notes:
A closer look at entity type AIRCRAFT MODEL for our sample airline company reveals that
it contains some attributes that are not really aircraft model specific, but rather aircraft type
specific. For example, Category (JET, TURBOPROP, etc.), Manufacturer, and Number of
Engines are only dependent on Type Code, i.e., the type of the aircraft (e.g., B747), and not
on the specific model. They are the same for all models of the same type.
This leads to the conclusion that business object type Aircraft Models as described by the
problem statement is rather a combination of two entity types, namely, of entity types
AIRCRAFT TYPE and AIRCRAFT MODEL as illustrated on the right-hand side of the
visual. AIRCRAFT TYPE only contains the type-specific attributes and AIRCRAFT MODEL
the model-specific attributes (e.g., Dimensions consisting of Length, Height, and Wing
Span) that may be different for the models of a type.
The entity key of AIRCRAFT TYPE is Type Code. For AIRCRAFT MODEL, it consists of
Type Code and Model Number as before since Model Number alone is not unique.
Of course, entity types AIRCRAFT TYPE and AIRCRAFT MODEL are interconnected by a
relationship type:

AIRCRAFT TYPE_for_AIRCRAFT MODEL


An aircraft model belongs to one and only one aircraft type. For an aircraft type, there may
be many different aircraft models. The cardinality of 1..m for AIRCRAFT MODEL assumes
that CAB keeps the information about an aircraft type only if it keeps at least one aircraft
model for that type.
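
The split can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the function name, the dictionary representation, and the sample rows (in particular, the Length values are placeholders, not real dimensions). The check inside the loop expresses the observation that the type-specific attributes depend only on Type Code.

```python
# Attributes that depend only on Type Code (type specific, not model specific)
TYPE_SPECIFIC = ("Category", "Manufacturer", "Number of Engines")


def split_aircraft_models(rows):
    """Split combined AIRCRAFT MODEL rows into AIRCRAFT TYPE and AIRCRAFT MODEL."""
    aircraft_types = {}    # Type Code -> type-specific attributes
    aircraft_models = {}   # (Type Code, Model Number) -> model-specific attributes
    for row in rows:
        type_code = row["Type Code"]
        type_part = {name: row[name] for name in TYPE_SPECIFIC}
        # All models of one type must agree on the type-specific attributes
        if aircraft_types.setdefault(type_code, type_part) != type_part:
            raise ValueError(f"{type_code}: type-specific attributes differ")
        aircraft_models[(type_code, row["Model Number"])] = {"Length": row["Length"]}
    return aircraft_types, aircraft_models


# Illustrative rows; the Length values are invented placeholders
combined = [
    {"Type Code": "B747", "Model Number": "400", "Category": "JET",
     "Manufacturer": "Boeing", "Number of Engines": 4, "Length": 1.0},
    {"Type Code": "B747", "Model Number": "200", "Category": "JET",
     "Manufacturer": "Boeing", "Number of Engines": 4, "Length": 2.0},
    {"Type Code": "A310", "Model Number": "300", "Category": "JET",
     "Manufacturer": "Airbus", "Number of Engines": 2, "Length": 3.0},
]
aircraft_types, aircraft_models = split_aircraft_models(combined)
print(sorted(aircraft_types))    # prints ['A310', 'B747']
print(len(aircraft_models))      # prints 3
```

Because every AIRCRAFT TYPE entry is derived from at least one model row, the result also respects the 1..m cardinality for AIRCRAFT MODEL.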
Splitting an entity type into two entity types as we have done here requires a reevaluation
of the relationship types for which the former, combined, entity type was the source or
target. For each relationship type, we must determine which of the new entity types will be
its source or target.
Furthermore, the problem statement should be updated by the domain expert (in
cooperation with the database designer) to reflect the application-domain aspects of the
split entity types and of the new relationship type.
A closer look at the new relationship type reveals that an aircraft model cannot be
connected to an arbitrary aircraft type. It can only be connected to specific aircraft types as
illustrated on the next visual.

Dependent Entity Types

AIRCRAFT TYPE  _for_  D 1..m  AIRCRAFT MODEL
(No cardinality at the AIRCRAFT TYPE end of the diagram since it is always 1..1)

Sample instances (the Type Code values of connected instances must be equal):

AIRCRAFT TYPE (Type Code: B747)
    _for_ AIRCRAFT MODEL (Type Code: B747, Model Number: 400)
    _for_ AIRCRAFT MODEL (Type Code: B747, Model Number: 200)
AIRCRAFT TYPE (Type Code: A310)
    _for_ AIRCRAFT MODEL (Type Code: A310, Model Number: 200)
    _for_ AIRCRAFT MODEL (Type Code: A310, Model Number: 300)

Figure 4-28. Dependent Entity Types

Notes:
An aircraft model cannot be connected to an arbitrary aircraft type. It can only be
associated with the aircraft type having the same type code as the aircraft model: A Boeing
747, Model 400 (Type Code = B747, Model Number = 400) is a Boeing 747 (type) and,
therefore, can only be associated with the instance of AIRCRAFT TYPE having the entity
key value B747.
An entity type being dependent on another entity type or on a relationship type in such a
way that
• a part of its entity key or its full key is the key of the other entity type or the key of the
relationship type
• each of its instances must be connected to, and only to, the entity instance or
relationship instance with the matching key value
is referred to as a dependent entity type. In the entity-relationship model, the dependent
entity type is identified by the letter D at its end of the relationship type establishing the
dependency.

Because of this key interdependency, each dependent entity instance must belong to one
and only one parent instance. Thus, the cardinality at the parent end of the relationship
type establishing the dependency must always be 1..1 and is omitted to simplify the
diagram.
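
A small sketch may help to see why the parent is fully determined by the key of the dependent instance. The function name and the data layout (a set of Type Code values and a list of key tuples) are assumptions for this illustration only.

```python
def owning_relationship(type_codes, model_keys):
    """Map each AIRCRAFT MODEL key to the key of its one and only parent.

    type_codes: set of AIRCRAFT TYPE key values (Type Code)
    model_keys: iterable of AIRCRAFT MODEL keys (Type Code, Model Number)
    """
    relationship = {}
    for type_code, model_number in model_keys:
        # The parent is the AIRCRAFT TYPE instance with the matching Type Code
        if type_code not in type_codes:
            raise ValueError(f"no AIRCRAFT TYPE for model {type_code}/{model_number}")
        relationship[(type_code, model_number)] = type_code
    return relationship


type_codes = {"B747", "A310"}
model_keys = [("B747", "400"), ("B747", "200"), ("A310", "200"), ("A310", "300")]
print(owning_relationship(type_codes, model_keys)[("B747", "400")])
# prints B747
```

Since the mapping is computed from the dependent key alone, each dependent instance has exactly one parent, which corresponds to the implied 1..1 cardinality.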

Dependent Entity Types - Characteristics


Parent Entity Type:        AIRCRAFT TYPE (no cardinality since always 1..1)
Owning Relationship Type:  _for_
Dependent Entity Type:     AIRCRAFT MODEL (D, 1..m)

• The source can be an entity type or a relationship type
• The target must be an entity type
• The key of the source must be part of the key of the target
  - Only instances with matching values are associated with each other
• For every target instance, there must be a source instance
• An entity type must not be dependent on more than one source
  - Otherwise, relationship type missing between parents
  - Dependence should be on that relationship type
• Defining attributes and relationship key for owning relationship type:
  Key of dependent entity type

Figure 4-29. Dependent Entity Types - Characteristics

Notes:
As described before, a dependent entity type is an entity type fulfilling the following
conditions:
• A part of its key or its entire key is equal to the key of another entity type or of a
relationship type. This entity type or relationship type is referred to as parent entity type
or parent relationship type, respectively.
• There must exist a relationship type between the parent entity type or relationship type
and the dependent entity type with the following characteristics:
- Each instance of the dependent entity type is, at all times, connected to one and only
one parent instance.
- The dependent and parent instances interconnected are those with matching values:
The value of the appropriate key portion of the dependent entity instance must be
equal to the key value of the parent instance.
The relationship type interconnecting the parent entity type or relationship type and the
dependent entity type is referred to as owning relationship type.

An entity type must not be dependent on more than one entity type or relationship type.
Should you see the need for a dependency on two parents, a relationship type between the
parents is missing and should be established. Should you see a dependency on more than
two parents, multiple interrelated relationship types are missing, and the dependent entity
type is to be based on the last of them.
As discussed before, the defining attributes of a relationship type are the keys of its source
and target. However, because of the matching values of the key/key portion, the key of the
dependent entity type is sufficient to completely describe the owning relationship type.
Therefore, the key of the parent is omitted.
As a consequence of the implied 1..1 cardinality at the parent end, the key of the
dependent entity type is also the key of the owning relationship type.

Nondefining Attributes for Relationship Types

AIRPORT (K Airport Code, ...)  _nonstop_to_  AIRPORT, with roles From (m) and To (m)
    Relationship Key: From (Airport Code), To (Airport Code)

AIRPORT_nonstop_to_AIRPORT  _in_  ITINERARY (K Flight Number, ...)
    Relationship Key: Flight Number, From (Airport Code), To (Airport Code)

AIRPORT_nonstop_to_AIRPORT_in_ITINERARY  _as_  D 1..1  LEG
    LEG: K Flight Number, K From, K To, Leg Number

Figure 4-30. Nondefining Attributes for Relationship Types

Notes:
Besides the defining attributes, relationship types may have additional attributes further
characterizing them. These attributes are referred to as nondefining attributes of the
relationship type.
When talking about the nonstop connections for an itinerary, we observed that the legs of
an itinerary must be ordered. Each instance of relationship type _in_
(AIRPORT_nonstop_to_AIRPORT_in_ITINERARY) having as target the considered
itinerary represents a leg of that itinerary. Its defining attributes specify the flight number for
the itinerary and the nonstop connection (starting and ending airports) for the leg.
To order the legs, it is necessary to assign an attribute (Leg Number) to relationship type
_in_ by means of which the sequence of the legs for the itinerary can be established. The
attribute cannot simply be added to entity type ITINERARY since, in this case, the
itineraries would be ordered without considering the legs. Nor can it be an
attribute of relationship type _nonstop_to_ (AIRPORT_nonstop_to_AIRPORT) since, in
this case, the nonstop connections would be ordered without consideration for the itineraries.
The order for a leg, however, depends on both the itinerary and the nonstop connection for
the leg. Consequently, Leg Number must be an attribute for relationship type _in_
(AIRPORT_nonstop_to_AIRPORT_in_ITINERARY).
Nondefining attributes are assigned to a relationship type by basing a dependent entity
type on the relationship type. The dependent entity type contains the nondefining attributes
and the relationship key. Thus, zero, one, or more instances of the dependent entity
type are attached to each instance of the relationship type. The cardinality for the
dependent entity type determines how many
instances of the dependent entity type can and must be attached to a relationship instance.
In the example on the visual, a dependent entity type containing nonkey attribute Leg Number
is based on relationship type _in_. For each relationship instance, i.e., each leg
of an itinerary, it specifies the sequence number (leg number) for the appropriate nonstop
connection for the itinerary. Since the dependent entity type further describes the legs of
the itinerary, we have called it LEG in the above visual. The abbreviated name of the
owning relationship type is _as_. According to our naming convention, its full name is:
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_as_LEG
Because each leg only receives a single leg number, the cardinality for the dependent
entity type must be 0..1 or 1..1. Which of these it is depends on how you want to model it. If
you also want to assign a leg number to the only leg of a one-leg itinerary, the cardinality
must be 1..1. If you only want to sequence the legs of multi-leg itineraries, the cardinality
should be 0..1. We have chosen the first alternative because it treats itineraries more
uniformly, ensures that the legs of a multi-leg itinerary are always sequenced, and tends to be
more general.
In addition to the leg number, the dependent entity type contains the key of the parent
relationship type, i.e., the attributes Flight Number (coming from ITINERARY), From, and
To (both from AIRPORT_nonstop_to_AIRPORT).
Because of the maximum cardinality of 1, the key of the (parent) relationship type becomes
the key of the dependent entity type.
In addition to the key attributes, the dependent entity type may contain any number of
(nondefining) attributes for the relationship type (e.g., the planned departure and arrival
times for the leg) as long as the maximum value of the cardinality for the dependent
entity type is observed. However, dependent entity types should follow the rules for entity
types we have established before. In particular, the dependent entity type should have a
sensible meaning for the relationship type (and application domain) and its attributes
should all support that meaning. The dependent entity type should not be a garbage
collection. If necessary, introduce multiple dependent entity types for the relationship type
each having a well-defined meaning.
You also may want to introduce multiple dependent entity types for the relationship type if
many of the nondefining attributes are optional. In this case, you might prefer to have a
dependent entity type with cardinality 1..1 for the mandatory attributes and one or more
others with cardinality 0..1 for the optional attributes. All of the attributes of such an entity
type should be optional at the same time: If one of the attributes does not apply, the others
do not apply either. The resulting dependent entity types should again have a well-defined
meaning for the relationship type whose nondefining attributes they contain.
For attributes requiring a different maximum cardinality, you need different dependent entity
types (possibly multiple ones in accordance with the discussions above).

Nondefining Attributes - Sample Diagram

Instance diagram for itinerary YY3367:

AIRPORT (Airport Code: ATL) _nonstop_to_ AIRPORT (Airport Code: STR) _in_ ITINERARY (Flight Number: YY3367)
    _as_ LEG (Flight Number: YY3367, From: ATL, To: STR, Leg Number: 1)

AIRPORT (Airport Code: STR) _nonstop_to_ AIRPORT (Airport Code: FRA) _in_ ITINERARY (Flight Number: YY3367)
    _as_ LEG (Flight Number: YY3367, From: STR, To: FRA, Leg Number: 2)

Figure 4-31. Nondefining Attributes - Sample Diagram

Notes:
The above instance diagram illustrates dependent entity type LEG, containing the
nondefining attributes for relationship type
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY (= _in_), for a sample itinerary. Because
of cardinality 1..1 for dependent entity type LEG, there is only one dependent entity
instance for each instance of relationship type _in_.
The dependent entity instance contains the key of its parent relationship instance and the
assigned leg number. The nonstop connection from Atlanta (ATL) to Stuttgart (STR) is the
first leg (Leg Number = 1) for itinerary YY3367. The nonstop connection from Stuttgart to
Frankfurt (FRA) is the second leg (Leg Number = 2) of the itinerary.
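
The sample instances can be written down as a small data structure. The sketch below assumes a dictionary keyed by the relationship key of _in_ (Flight Number, From, To); the variable and function names are chosen for this illustration. Using the relationship key as dictionary key mirrors the 1..1 cardinality: there is at most one LEG instance per relationship instance.

```python
# LEG instances keyed by the key of the parent relationship type _in_:
# (Flight Number, From, To) -> nondefining attributes
legs = {
    ("YY3367", "ATL", "STR"): {"Leg Number": 1},
    ("YY3367", "STR", "FRA"): {"Leg Number": 2},
}


def ordered_legs(legs, flight_number):
    """Return the (From, To) pairs of an itinerary in leg-number order."""
    own = [
        (attributes["Leg Number"], origin, destination)
        for (flight, origin, destination), attributes in legs.items()
        if flight == flight_number
    ]
    return [(origin, destination) for _, origin, destination in sorted(own)]


print(ordered_legs(legs, "YY3367"))
# prints [('ATL', 'STR'), ('STR', 'FRA')]
```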

Attributes for a Sample Relationship Type


Upper part (two relationship types):

PILOT  1  _captain_for_  m  FLIGHT
PILOT  1  _copilot_for_  m  FLIGHT

Lower part (one relationship type with a dependent entity type):

PILOT (K Employee Number, ...)  m  _assigned_to_  m  FLIGHT (K Flight Number, K From, K To, K Flight Locator, ...)
    _by_  D 1..1  PILOT ASSIGNMENT
    PILOT ASSIGNMENT: K Flight Number, K From, K To, K Flight Locator, K Employee Number, Pilot Function

Figure 4-32. Attributes for a Sample Relationship Type

Notes:
Using (nondefining) attributes, i.e., a dependent entity type, you can replace the two
relationship types PILOT_captain_for_FLIGHT and PILOT_copilot_for_FLIGHT by a single
relationship type PILOT_assigned_to_FLIGHT as illustrated in the lower part of the above
visual. Each instance of the new relationship type has associated with it an instance of
dependent entity type PILOT ASSIGNMENT specifying the function (CAPTAIN or
COPILOT) for the selected pilot on the selected flight.
Since a pilot can be assigned to multiple flights and multiple pilots can be assigned to a
flight, relationship type PILOT_assigned_to_FLIGHT is a m:m relationship type.
Accordingly, its key consists of the keys for PILOT and FLIGHT.
Since the cardinality for PILOT ASSIGNMENT is 1..1 (a pilot assigned to a flight has one
and only one function for that flight), no additional attributes are needed to achieve
uniqueness of the entity instances and the key of the parent relationship type becomes the
entity key of the dependent entity type.
Both approaches have advantages and disadvantages. The first approach of using two
relationship types ensures that not more than two pilots are assigned to a flight and not
more than one pilot as captain or copilot, respectively. However, without additional
constraints, it does not prevent a pilot from being assigned as captain and copilot to the
same flight. (Constraints are discussed later in this unit.)
Without additional constraints, the second approach, using a single relationship type, does not
prevent the assignment of multiple captains or copilots to a flight. Nor does it prevent
more than two pilots from being assigned to a flight. However, because of the uniqueness
requirement for the entity key, it ensures that a pilot only assumes one role for a flight.
The second solution is more flexible and open-ended. By removing the appropriate
constraints and allowing additional values for attribute Pilot Function, it enables Come
Aboard to introduce substitute captains and copilots (i.e., standbys for pilots that fall sick) or
to assign multiple captains or copilots to long flights for which the maximum flying period for
pilots would otherwise be exceeded. However, before introducing these new functions, they must be
discussed with and approved by the domain expert or the appropriate department of
expertise. In case of multiple captains and copilots for long flights, you can easily think of
additional attributes for dependent entity type PILOT ASSIGNMENT: for example, the time
in the flight when a pilot is captain or copilot.
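
The trade-off of the second approach can be sketched as follows. The class name, the optional per-function limits, and the sample values (flight locator "LOC1", employee numbers "E1" and "E2") are hypothetical and only serve this illustration. The entity key of PILOT ASSIGNMENT guarantees one function per pilot and flight; the limits stand for the additional constraints that the model itself does not provide.

```python
class PilotAssignments:
    """PILOT ASSIGNMENT instances keyed by flight key plus Employee Number."""

    def __init__(self, limits=None):
        self.functions = {}          # entity key -> Pilot Function
        self.limits = limits or {}   # optional constraint: function -> max per flight

    def assign(self, flight_key, employee_number, function):
        key = flight_key + (employee_number,)
        # Uniqueness of the entity key: one function per pilot and flight
        if key in self.functions:
            raise KeyError(f"pilot {employee_number} is already assigned to this flight")
        # Additional constraint, not implied by the key
        limit = self.limits.get(function)
        if limit is not None:
            holders = sum(
                1 for existing, held in self.functions.items()
                if existing[:-1] == flight_key and held == function
            )
            if holders >= limit:
                raise ValueError(f"flight already has {limit} {function}(s)")
        self.functions[key] = function


# Flight key: (Flight Number, From, To, Flight Locator)
flight = ("YY3367", "ATL", "STR", "LOC1")
assignments = PilotAssignments(limits={"CAPTAIN": 1, "COPILOT": 1})
assignments.assign(flight, "E1", "CAPTAIN")
assignments.assign(flight, "E2", "COPILOT")
print(len(assignments.functions))   # prints 2
```

Dropping the limits and allowing further values for Pilot Function corresponds to the more flexible, open-ended use described above.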

Relationships on Owning Relationship Type

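[Visual: A parent and its dependent entity type (D) are connected by the owning relationship type, whose relationship key is the key of the dependent entity type. A second relationship type leads to a target. Whether its source is the owning relationship type or the dependent entity type, its defining attributes are the same: key of dependent entity type, key of target.]

Figure 4-33. Relationships on Owning Relationship Type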

Notes:
It is conceivable that an owning relationship type is the source or target of another
relationship type. However, in this case, you can base the other relationship type on the
dependent entity type rather than on the owning relationship type as explained in the
following.
For simplicity, let us assume that the owning relationship type is the source of the second
relationship type. As explained before, the key of the owning relationship type is the key of
the dependent entity type. Therefore, the defining attributes for the second relationship
type are the key of the dependent entity type and the key of the target.
If the second relationship type had as source the dependent entity type, its defining
attributes would also be the key of the dependent entity type and the key of the target. This
means that the potential relationship instances are the same in both cases and that the two
implementations of the second relationship type are equivalent. Consequently, you can
base the second relationship type on the dependent entity type rather than on the owning
relationship type, which simplifies the entity-relationship model.

Controlling Property

MAINTENANCE RECORD  _belongs_to_  MAINTENANCE RECORD (unary relationship type)
    Owner end:     1
    Subrecord end: C m

Sample instances (deletion of a relationship instance leads to deletion of the controlled instance):

MAINTENANCE RECORD (Maintenance Number: 004712) _belongs_to_ MAINTENANCE RECORD (Maintenance Number: 004711)
MAINTENANCE RECORD (Maintenance Number: 004713) _belongs_to_ MAINTENANCE RECORD (Maintenance Number: 004711)
MAINTENANCE RECORD (Maintenance Number: 004714) _belongs_to_ MAINTENANCE RECORD (Maintenance Number: 004711)

Figure 4-34. Controlling Property

Notes:
The controlling property can be specified for the source or the target of a relationship type
or for both. In the entity-relationship model, it is indicated by the letter C at the end of the
relationship type to which it applies.
If you specify the controlling property for the source (target), the source (target) instance
belonging to a relationship instance is to be deleted when the relationship instance is
deleted.
As a modeling construct, the controlling property can only describe what should happen if a
relationship instance is deleted. Nevertheless, when talking about an example, people
often say: If this relationship instance is deleted, then this source (or target) instance is
deleted. This means that they talk about the effects of the controlling property when it is
implemented.
The above visual illustrates the controlling property for relationship type
MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD. The problem
statement for our sample airline company Come Aboard specifies that a maintenance
record should be deleted if its owning maintenance record is deleted. CAB's maintenance
records are hierarchically structured. A maintenance record can belong to another (one)
maintenance record, the owning maintenance record, and can have multiple subrecords.
The implied deletion of the subrecords is modeled by specifying the controlling property for
the subrecord end (the source) of relationship type
MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD. As indicated on the
visual, the controlling property implies that, as a result of the deletion of the relationship
instance connecting maintenance record 004712 to maintenance record 004711,
maintenance record 004712 is to be deleted.
The deletion of a maintenance record implies the deletion of all relationship instances
having the deleted maintenance record as their target. Consequently, the controlling
property for the source of the relationship type implies that all subrecords of a maintenance
record are to be deleted if the maintenance record is deleted. Thus, if maintenance record
004711 is deleted, maintenance records 004712, 004713, and 004714 should be deleted
as well.
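The following minimal sketch shows one way the controlling property for the source could behave once it is implemented. It assumes a relational table with a self-referencing foreign key and a cascading delete rule; the table name, the column names, and the use of SQLite are illustrative assumptions and not part of the CAB model.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    """
    CREATE TABLE maintenance_record (
        maintenance_number TEXT PRIMARY KEY,
        owner_number TEXT
            REFERENCES maintenance_record (maintenance_number)
            ON DELETE CASCADE
    )
    """
)
conn.executemany(
    "INSERT INTO maintenance_record VALUES (?, ?)",
    [
        ("004711", None),       # owning maintenance record
        ("004712", "004711"),   # subrecords of 004711
        ("004713", "004711"),
        ("004714", "004711"),
    ],
)

# Deleting the owner removes the subrecords as well.
conn.execute("DELETE FROM maintenance_record WHERE maintenance_number = '004711'")
remaining = [
    row[0] for row in conn.execute("SELECT maintenance_number FROM maintenance_record")
]
```

After the delete, no rows should remain: the subrecords 004712, 004713, and 004714 disappear together with their owner.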

Cascading Effect

[ER diagram: unary relationship type MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD
with the controlling property (C) and cardinality m at the source and cardinality 1 at the
target (Owner). The instance diagram shows maintenance records 004711, 004721, 004722,
004801, 004802, and 002907 connected by _belongs_to_ relationship instances. The numbers
1 to 7 mark the order of the deletions described in the notes.]

Figure 4-35. Cascading Effect CF182.0

Notes:
The controlling property may have a cascading effect: One deletion may "cause" many
others. This is especially true for unary relationship types as illustrated on the above visual:
• The maintenance record originally being deleted is the record with maintenance number
004711 ( 1 ).
• The deletion of maintenance record 004711 causes the deletion of the relationship
instances associating maintenance record 004711 with maintenance records 004721
and 004722 ( 2 ) because their target instance is deleted.
• The deletion of the two relationship instances, in conjunction with the controlling
property, implies that maintenance records 004721 and 004722 are to be deleted ( 3 ).
• Since maintenance record 004722 was the target of the relationship instance connecting
maintenance record 004801 to it, the relationship instance is deleted as well ( 4 ).
• Together with the controlling property, the deletion of the relationship instance
interconnecting maintenance records 004722 and 004801 implies that maintenance
record 004801 is to be deleted ( 5 ).
• The deletion of maintenance record 004801 causes the deletion of the relationship
instance interconnecting maintenance records 004801 and 004802 because the target
of the relationship instance has been deleted ( 6 ).
• Finally, the deletion of the relationship instance implies that maintenance record 004802
is to be deleted ( 7 ).
Thus, due to the controlling property, the deletion of a single maintenance record implies
the deletion of all maintenance records except maintenance record 002907. Maintenance
record 002907 is not interconnected to any of the deleted maintenance records.
The example illustrates very clearly that you must be careful when using the controlling
property and must understand its explicit and implicit effects. If the cascading effect is what
the application domain wants to achieve (as is the case for the maintenance records), the
usage of the controlling property is perfectly all right.
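The cascade described above can be traced with a short sketch. It assumes that the relationship instances are held in a dictionary mapping each subrecord (source) to its owning record (target). The maintenance numbers come from the visual; the data layout and the function name are assumptions made for illustration.

```python
from collections import deque

# Each entry maps a subrecord (source) to its owning record (target).
belongs_to = {
    "004721": "004711",
    "004722": "004711",
    "004801": "004722",
    "004802": "004801",
}
records = {"004711", "004721", "004722", "004801", "004802", "002907"}


def cascade_delete(start, records, belongs_to):
    """Return the set of records deleted when `start` is deleted."""
    deleted = set()
    queue = deque([start])
    while queue:
        owner = queue.popleft()
        if owner in deleted or owner not in records:
            continue
        deleted.add(owner)
        # Relationship instances whose target was deleted disappear, and the
        # controlling property then removes their source instances.
        queue.extend(sub for sub, own in belongs_to.items() if own == owner)
    return deleted


deleted = cascade_delete("004711", records, belongs_to)
survivors = records - deleted
```

Starting from 004711, the traversal reaches every record except 002907, which matches the outcome stated in the notes.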

Controlling for Relationship Type Attributes

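[ER diagram: relationship type AIRPORT_nonstop_to_AIRPORT_in_ITINERARY connects AIRPORT
(K Airport Code, ...) in the roles From and To with ITINERARY (K Flight Number, ...).
Dependent entity type LEG (K Flight Number, K From, K To, Leg Number), marked C and D, is
attached to the relationship type by _as_.]

Figure 4-36. Controlling for Relationship Type Attributes CF182.0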

Notes:
As we discussed before, the nondefining attributes for relationship types are modeled by
means of dependent entity types. When a relationship instance is deleted, the dependent
entity instance or instances containing the nondefining attributes (values) for the
relationship instance must be deleted as well. This can be achieved by means of the
controlling property for the dependent entity type.
The visual illustrates this for dependent entity type LEG. If a nonstop connection is
removed from an itinerary, its leg number (a nondefining attribute for relationship type
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY) should be deleted as well, as indicated
by the controlling property for LEG.

A Second Correction of the CAB Model

                       PILOT                    MECHANIC
General employee       K Employee Number        K Employee Number
information              Last Name                Last Name
                         First Name               First Name
                         Address                  Address
                         Date of Birth            Date of Birth
                         ...                      ...
Pilot-specific /         Date Last Checkup        Area of Expertise
mechanic-specific        Result Last Checkup      Type of Certification
information              Date Next Checkup        Date Certification
                         Last Flown On            Security Status
                         ...                      ...

Figure 4-37. A Second Correction of the CAB Model CF182.0

Notes:
If you scrutinize the attributes for entity types PILOT and MECHANIC for our sample airline
company Come Aboard, you will realize that they have attributes (e.g., Last Name, First
Name, Address, and Date of Birth) that are common to both of them. They also have
attributes that are specific to the particular entity type: Date Last Checkup, Result Last
Checkup, Date Next Checkup, and Last Flown On only apply to pilots; Area of Expertise,
Type of Certification, Date of Certification, and Security Status only to mechanics.
The common attributes are not specific to pilots or mechanics. Rather, they are common to
all employees. Pilots and mechanics are subcategories (or subtypes) of employees. Because
they are employees, the common attributes apply to them as well.
Since CAB does not want to distinguish the different types of employees when only
processing the employee information, it makes sense to introduce another entity type,
called EMPLOYEE, which functions as a supertype and contains the attributes common to
all employees. The common attributes are removed from PILOT and MECHANIC so they
only contain the attributes that are specific to pilots or mechanics, respectively.

The introduction of entity type EMPLOYEE is illustrated on the next visual. It leads to
supertypes and subtypes.

Supertype and Subtypes


EMPLOYEE (supertype, marked S)
K Employee Number
  Last Name
  First Name
  Address
  Date of Birth
  ...

_is_ fork from EMPLOYEE to the two subtypes, each marked C and D with cardinality 1:

PILOT                          MECHANIC
K Employee Number              K Employee Number
  Date Last Checkup              Area of Expertise
  Result Last Checkup            Type of Certification
  Date Next Checkup              Date Certification
  Last Flown On                  Security Status
  ...                            ...

Total attributes for a pilot: attributes of EMPLOYEE plus attributes of PILOT.
Total attributes for a mechanic: attributes of EMPLOYEE plus attributes of MECHANIC.

Figure 4-38. Supertype and Subtypes CF182.0

Notes:
When categorizing items, you form classes and subclasses. The subclasses structure the
elements of the classes. They do not contain different elements. Each member of a
subclass also belongs to the (superior) class to which the subclass belongs.
In modeling, the items categorized are the instances of entity types. The superior class is
called supertype entity type or supertype. The subclasses are referred to as subtype entity
types or subtypes. A supertype may have one or more subtypes. The term class structure
is used to denote the structure consisting of a supertype and its subtypes.
For each instance of a subtype, the supertype contains one, and only one, corresponding
instance reflecting the fact that a member of a subclass, at the same time, is a member of
the corresponding superior class. In the example on the visual, pilots and mechanics are,
at the same time, employees. Therefore, for each instance of entity types PILOT or
MECHANIC, there must be a corresponding instance in entity type EMPLOYEE.
A supertype can be considered as the (common) generalization of its subtypes.
Conversely, the subtypes can be considered as specializations of the supertype. Therefore,
the terms generalization and specialization are used in conjunction with supertypes and
subtypes.
Whereas there must be a supertype instance for each subtype instance, there need not be
a subtype instance for every supertype instance. This means that the specialization can be
incomplete (partial). Come Aboard, for example, has employees (other than pilots or
mechanics) who do not have specific attributes. For them, there is not a subtype.
The supertype/subtype concept implies that the total set of attributes for the conceptual
object represented by a subtype instance consists of its subtype attributes and the
attributes for the corresponding supertype instance. In our example, the total set of
attributes for a pilot consists of his/her pilot-specific attributes (as represented by the PILOT
instance) and his/her attributes as an employee (i.e., the attributes of the corresponding
EMPLOYEE instance).
Processing-wise, you want both the attributes of the subtype instance and of the
corresponding supertype instance when referring to a subtype instance. In contrast, when
referring to a supertype instance, i.e., when processing the represented object in the
quality expressed by the supertype, you only want the attributes of the supertype instance
and not the attributes of any subtype instances associated with it.
When a supertype instance is deleted, any associated subtype instances must be deleted
as well because the conceptual object associated with the supertype instance no longer
exists. By themselves, a subtype instance and its corresponding supertype instance can be
considered as partial instances. Together, they form the complete instance.
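The processing rules above can be illustrated with a small sketch. It assumes that supertype and subtype instances are kept in separate dictionaries keyed by Employee Number. The employee numbers, attribute values, and function names are invented for illustration only.

```python
# Hypothetical in-memory stores keyed by Employee Number (sample data is made up).
employees = {
    "E100": {"last_name": "Smith", "first_name": "Ann"},
    "E200": {"last_name": "Jones", "first_name": "Bob"},
}
pilots = {"E100": {"date_last_checkup": "2001-11-05"}}
mechanics = {}


def pilot_view(employee_number):
    """Total attribute set of a pilot: supertype attributes plus subtype attributes."""
    return {**employees[employee_number], **pilots[employee_number]}


def employee_view(employee_number):
    """Processing an employee as such uses the supertype attributes only."""
    return dict(employees[employee_number])


def delete_employee(employee_number):
    """Deleting the supertype instance removes any subtype instances as well."""
    del employees[employee_number]
    pilots.pop(employee_number, None)
    mechanics.pop(employee_number, None)
```

Here `pilot_view` combines the two partial instances into the complete instance, whereas `employee_view` deliberately ignores the subtypes.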
Logically, the supertype has a relationship type of the form
supertype_is_subtype
with each subtype (e.g., EMPLOYEE_is_PILOT and EMPLOYEE_is_MECHANIC). To
indicate that these relationship types belong to the same class structure, they are
combined into a fork whose handle starts at the supertype. (Note that an entity type may be
structured into subtypes in more than one way, making it necessary to group the
relationship types for a class structure.)
In addition, the supertype is identified by the letter S next to it. Without this indication, the
supertype could not be identified for class structures having just one subtype.
The set of _is_ relationship types interconnecting a supertype and its subtypes is referred
to as is-bundle. All relationship types of the is-bundle have a supertype cardinality of 1..1
since there must be one and only one supertype instance for every subtype instance.
Therefore, the supertype cardinality is omitted. It is considered implied by the letter S for
the supertype.
For each supertype instance, a subtype may contain at most one instance. Consequently,
the cardinality for the subtype of the _is_ relationship type must be ..1.
As entity types, the subtypes must have an entity key. The most natural choice is the entity
key of the supertype. In this case, a subtype instance is always to be connected to the
supertype instance with the matching key value. Accordingly, the subtype becomes a
dependent entity type and is marked as such.
Since a subtype instance is to be deleted when its supertype instance is deleted, the
controlling property applies to the subtypes.

Bundle Cardinalities

[ER diagram: supertype EMPLOYEE (S) with an _is_ fork to subtypes PILOT and MECHANIC,
each marked C and D with subtype cardinality 1. The bundle cardinality is written at the
fork.]

Possible bundle cardinalities:
0..1 or 1   Employee may be a pilot or a mechanic, but not both
1..1        Employee must be a pilot or a mechanic, but not both
0..m or m   Employee may be a pilot and/or a mechanic
1..m        Employee must be a pilot and/or a mechanic

..1   Exclusive
1..   Covering

Figure 4-39. Bundle Cardinalities CF182.0

Notes:
As discussed before, for each subtype instance, there must be one, and only one,
supertype instance. The reversal of this statement is not true. There need not necessarily
be a subtype instance for a supertype instance. For a supertype instance, there may also
be instances in multiple subtypes. However, a subtype may contain at most one instance
for any supertype instance.
One characteristic property of a class structure is whether, for every supertype instance,
at least one of the subtypes must contain a corresponding subtype instance.
Another characteristic property is whether, for a supertype instance, multiple
subtypes can contain a corresponding subtype instance.
These two properties are controlled by the bundle cardinality. The bundle cardinality is a
cardinality for the is-bundle rather than for the sources and targets of the relationship types
it comprises. The bundle cardinality specifies how many relationship instances of the
is-bundle (is-relationship instances) a supertype instance must have at least and may have at
most. In other words, it specifies how many prongs the fork must have at least and may
have at most for a supertype instance.

In the entity-relationship model, the bundle cardinality is specified at the point of the fork for
the is-bundle where the handle and the prongs meet. It can assume the following values:
0..1 or 1
A supertype instance may have at most one is-relationship instance. This means that a
supertype instance need not have a corresponding subtype instance in any of the
subtypes. It may have a corresponding instance in at most one subtype.
For the example on the visual, this would mean that an employee need not be a pilot or a
mechanic. The employee can be a pilot or mechanic, but cannot be both.
1..1
Every supertype instance must have one and only one is-relationship instance. This
means that a supertype instance must have a corresponding subtype instance in one and
only one subtype.
For the example on the visual, this would mean that an employee must be a pilot or a
mechanic, but cannot be both.
0..m or m
A supertype instance may have any number of is-relationship instances. This means that
a supertype instance need not have a corresponding subtype instance in any of the
subtypes. It may have corresponding instances (one each) in multiple subtypes.
For the example on the visual, this would mean that an employee need not be a pilot or a
mechanic, but can be a pilot or mechanic and can be both.
1..m
A supertype instance must have one or more is-relationship instances. This means that a
supertype instance must have corresponding subtype instances (one each) in at least one
subtype. It may have corresponding instances in multiple subtypes.
For the example on the visual, this would mean that an employee must be a pilot or
mechanic and can be both.
The bundle cardinality is only specified if the supertype has more than one subtype. In case
of only a single subtype, the subtype cardinality is sufficient.
As always, the correct choice of the bundle cardinality depends on the application domain.
The problem statement for our sample airline company implies that there are employees
other than pilots and mechanics. Thus, the bundle cardinality can only be 0..1 or 0..m.
Since the business constraints for Come Aboard state that a pilot cannot be a mechanic at
the same time, the bundle cardinality must be 0..1 (= 1) for the illustrated example.
Bundle cardinalities of the form ..1 specify that a supertype instance may have subtype
instances in at most one subtype. Therefore, the subtype set (the set of subtypes) is
referred to as exclusive.

Bundle cardinalities of the form 1.. specify that a supertype instance must have a
corresponding subtype instance in at least one subtype. Therefore, the subtype set is
referred to as covering.
You should ensure that the subtype cardinalities and the bundle cardinality are compatible.
If at least one of the subtype cardinalities is 1..1 (meaning that, for each supertype
instance, there must be a corresponding instance in this subtype), the bundle cardinality
should be 1.. (1..1 or 1..m).
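As a rough illustration of how a bundle cardinality restricts a supertype instance, the sketch below counts the subtypes that contain a given key and compares the count with the minimum and maximum of the bundle cardinality. The representation of cardinalities as (minimum, maximum) pairs, the function names, and the sample keys are assumptions made for this example.

```python
import math

# Bundle cardinalities from the visual as (minimum, maximum) pairs;
# math.inf stands for "m".
BUNDLE = {
    "0..1": (0, 1),
    "1..1": (1, 1),
    "0..m": (0, math.inf),
    "1..m": (1, math.inf),
}


def satisfies_bundle(key, subtypes, cardinality):
    """Check one supertype instance against a bundle cardinality.

    `subtypes` maps a subtype name to the set of keys it contains.
    """
    low, high = BUNDLE[cardinality]
    count = sum(1 for keys in subtypes.values() if key in keys)
    return low <= count <= high


def is_exclusive(cardinality):
    """Form ..1: at most one subtype may hold a corresponding instance."""
    return BUNDLE[cardinality][1] == 1


def is_covering(cardinality):
    """Form 1..: at least one subtype must hold a corresponding instance."""
    return BUNDLE[cardinality][0] == 1


# Made-up sample keys: E100 is a pilot, E200 a mechanic, E300 neither.
subtypes = {"PILOT": {"E100"}, "MECHANIC": {"E200"}}
```

With this sample data, an employee such as E300 who is neither a pilot nor a mechanic satisfies 0..1 and 0..m but violates 1..1 and 1..m, which is why the CAB model cannot use a covering subtype set.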

An Alternate Maintenance Record Solution

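[ER diagram: supertype MAINTENANCE RECORD (S; K Maintenance Number, Date of Maintenance,
Type of Maintenance, ...) with the unary relationship type _belongs_to_ (C and m at the
source, 1 at the Owner end). An _is_ fork with bundle cardinality 1..1 leads to the dependent
subtypes ARCHIVE RECORD (K Maintenance Number, Aircraft Number, Retention Date, ...) and
ACTIVE RECORD (K Maintenance Number, ... ???). ACTIVE RECORD is connected to AIRCRAFT by
relationship type _for_ (m at ACTIVE RECORD, 1..1 at AIRCRAFT).]

Figure 4-40. An Alternate Maintenance Record Solution CF182.0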

Notes:
When we discussed the business relationship type between aircraft and maintenance
records before, we determined that the business relationship type cannot be expressed by
a relationship type in the entity-relationship model. The reason was that the maintenance
records for an aircraft, including the aircraft number, must be kept even after the remaining
information about the aircraft has been deleted. This led to the conclusion that the aircraft
number must be an attribute of entity type MAINTENANCE RECORD.
Using a class structure for the maintenance records, the business relationship type can be
expressed by means of a relationship type, but only for the maintenance records of
existing aircraft:
• The maintenance records are subdivided into two subtypes: Maintenance records for
aircraft owned by CAB (entity type ACTIVE RECORD) and maintenance records for
aircraft no longer owned by CAB (entity type ARCHIVE RECORD).
Since a maintenance record must be either an active record or an archive record, the
bundle cardinality must be 1..1. Both entity types are dependent entity types and the key
of the supertype (Maintenance Number) is also the entity key of the subtypes.
• Since the remaining aircraft information no longer exists for archive maintenance
records, subtype ARCHIVE RECORD must include the serial number of the aircraft to
which it belonged (Aircraft Number). It may contain other attributes, such as the date
until when the maintenance record must be retained (Retention Date), that only exist for
archive maintenance records.
• Besides the entity key, subtype ACTIVE RECORD may contain additional attributes that
only exist for active maintenance records. But are there any? This illustrates the
possibility of entity types just containing the entity key. As we mentioned before, you
should be suspicious of entity types not having nonkey attributes. You should be
suspicious here as well and question whether this is a good solution.
• By definition, active maintenance records belong to aircraft owned by Come Aboard.
Therefore, their relationship to aircraft can be expressed by a relationship type between
ACTIVE RECORD and AIRCRAFT.
• In general, other relationship types having MAINTENANCE RECORD as their source or
target are not affected by the introduction of the class structure.
If an aircraft is no longer owned by CAB and its entity instance is removed, the appropriate
instances for its maintenance records must be moved from subtype ACTIVE RECORD to
subtype ARCHIVE RECORD. This is enforced by the entity-relationship model:
• The target cardinality of 1..1 for relationship type ACTIVE RECORD_for_AIRCRAFT
requires, at all times, an aircraft for an active maintenance record. Consequently, if an
aircraft is deleted, its active maintenance record instances must either be assigned to
other aircraft or removed from ACTIVE RECORD. Since it would be incorrect to assign
them to other aircraft, they must be removed from ACTIVE RECORD.
• On the other hand, bundle cardinality 1..1 requires that each instance of
MAINTENANCE RECORD has an instance in either ARCHIVE RECORD or ACTIVE
RECORD. If an aircraft is deleted, its maintenance records cannot have instances in
ACTIVE RECORD as explained before. Thus, they must have instances in ARCHIVE
RECORD.
As an instance is moved from ACTIVE RECORD to ARCHIVE RECORD, the serial number
for the aircraft it belonged to (Aircraft Number) must be added along with any other
attributes for archive maintenance records.
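The move from ACTIVE RECORD to ARCHIVE RECORD can be sketched as follows. The sketch assumes that the entity types are dictionaries keyed by Maintenance Number or Aircraft Number. The aircraft number "A-17", the retention date, and the function names are made up for illustration.

```python
# Hypothetical stores keyed by Aircraft Number or Maintenance Number.
aircraft = {"A-17": {}}
maintenance_records = {"004711": {"type_of_maintenance": "engine check"}}
active_records = {"004711": {"aircraft_number": "A-17"}}  # ACTIVE RECORD_for_AIRCRAFT
archive_records = {}


def remove_aircraft(aircraft_number, retention_date):
    """Delete an aircraft and move its maintenance records to ARCHIVE RECORD."""
    affected = [
        number
        for number, record in active_records.items()
        if record["aircraft_number"] == aircraft_number
    ]
    for number in affected:
        del active_records[number]
        # The archive subtype carries the aircraft number as an attribute.
        archive_records[number] = {
            "aircraft_number": aircraft_number,
            "retention_date": retention_date,
        }
    del aircraft[aircraft_number]


def bundle_ok():
    """Bundle cardinality 1..1: every record is in exactly one subtype."""
    return all(
        (number in active_records) != (number in archive_records)
        for number in maintenance_records
    )
```

Calling `remove_aircraft("A-17", "2012-01-01")` leaves record 004711 in `archive_records` only, so `bundle_ok()` still holds after the aircraft is gone.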

ER Model for CAB Without Constraints


[ER diagram of the complete CAB model without constraints. Entity types shown include
EMPLOYEE (supertype S) with subtypes PILOT and MECHANIC, AIRCRAFT TYPE, AIRCRAFT MODEL,
AIRCRAFT, AIRPORT, ITINERARY, LEG, FLIGHT, and MAINTENANCE RECORD, together with the
relationship types, cardinalities, and the D and C markers discussed in the notes.]

Figure 4-41. ER Model for CAB Without Constraints CF182.0

Notes:
The above entity-relationship model for our sample airline company includes the changes
discussed since we established the cardinalities for the relationship types. However, it does
not include the alternate maintenance record solution on the previous visual since it does
not really provide an improvement in our case.
Note the following changes:
• Entity type AIRCRAFT TYPE has been introduced. Entity type AIRCRAFT MODEL
becomes dependent on it.
• Relationship type MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD is
controlling for its source.
• Relationship type AIRPORT_nonstop_to_AIRPORT_in_ITINERARY has Leg Number
as nondefining attribute in dependent entity type LEG.
• Relationship type AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_for_FLIGHT has
been replaced by relationship type FLIGHT_for_LEG. Note that entity type FLIGHT
becomes dependent on LEG: The values of a portion of its entity key must always match
up with the appropriate values of the key of LEG.
The new relationship type seems to be more natural because its target is entity type
LEG. However, we can do this only because:
- We chose to have a leg number even for one-leg itineraries (cardinality of LEG is 1..1
in AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_as_LEG); otherwise, there
would not be an entity instance of LEG for nonstop connections without leg number to
which we could connect flights.
- The two relationship types have the same defining attributes, the key of entity type
FLIGHT. (Also for the old relationship type, FLIGHT would have been a dependent
entity type.)
• AIRCRAFT MODEL_for_AIRPORT_nonstop_to_AIRPORT_in_ITINERARY has been
replaced by relationship type AIRCRAFT MODEL_for_LEG. Again, this can only be
done because the cardinality of LEG is 1..1 in relationship type
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_as_LEG and the defining attributes
of the new relationship type are the same as for the old relationship type. (Note that the
key of LEG is the same as the key of
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY.)
• The two relationship types PILOT_captain_for_FLIGHT and
PILOT_copilot_for_FLIGHT have been replaced by relationship type
PILOT_assigned_to_FLIGHT as discussed.
• EMPLOYEE has been introduced as a supertype for PILOT and MECHANIC.

4.4 Constraints

Constraints

Constraint
An interdependency between objects of an entity-relationship model restricting the
possible instances of entity or relationship types
• The interdependent objects can be attributes, entity types, or relationship types
• A single constraint can restrict the instances of multiple entity or relationship types
• The primary source is business constraints, but there are others

Figure 4-42. Constraints CF182.0

Notes:
Constraints are interdependencies between the objects of an entity-relationship model
restricting the possible instances that entity types or relationship types can assume. The
interdependent objects can be attributes, entity types, or relationship types. A single
constraint can restrict the instances of multiple entity types and/or relationship types.
Logically, a constraint consists of three components: a set of constraining objects, a set of
constrained objects, and a rule. The constraining objects can be attributes, entity types, or
relationship types. Their values (attributes) or instances (entity types or relationship types)
restrict the instances of the constrained objects. The constrained objects may be entity
types or relationship types. The rule specifies how the values or instances of the
constraining objects restrict the instances of the constrained objects.
There may be all kinds of constraints for the entity types and relationship types of an
entity-relationship model. The simplest form of a constraint restricts the values of an
attribute of an entity type and, thus, the instances that the entity type can assume. In this
case, the constraining object is the attribute and the constrained object is the entity type.
The rule describes how the values of the attribute, and, thus, the instances of the entity
type, are constrained.
In principle, the value ranges (domains) of attributes could be considered as constraints.
However, these are not the constraints you would like to visualize in an entity-relationship
model since they would clutter it. You do need to document the domains of the attributes
(more precisely, of the data elements on which the attributes are based), but you do this
outside the entity-relationship model, namely, in the data inventory described in Unit 5 -
Data and Process Inventories.
In the entity-relationship model, you should only document restrictions for attributes that go
beyond the limitations imposed by the domains. The constraints that you really want to
visualize in an entity-relationship model are those where an attribute, entity type, or
relationship type restricts the instances of a different entity type or relationship type.
You should not formulate something as a constraint if it can reasonably be expressed by
other modeling constructs. However, there will always be constraints that cannot be
expressed by other modeling constructs even if additional modeling constructs were
introduced. The variety of possible constraints is so immense that it is impossible to cover
them all by additional modeling constructs.
The primary source of constraints is the set of business constraints for the considered
application domain. Generally, a nontrivial application domain will have many constraints.
So does the application domain for our sample airline company. For example, the business
constraint that an aircraft cannot have more engines mounted than the aircraft type (!)
allows gives rise to a constraint for entity type AIRCRAFT. We will study this and further
examples on the subsequent visuals.
Besides the constraints resulting from the business constraints, an entity-relationship
model may also contain other constraints that cannot directly be derived from business
constraints. We will see such an example as well.
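The three components of a constraint (constraining objects, constrained objects, and a rule) can be pictured with a small data structure. The sketch below is only one possible representation. The class name, the field names, and the encoding of the rule as a function are assumptions. The engine constraint is the one mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Constraint:
    identifier: str
    constraining: Tuple[str, ...]  # names of the constraining objects
    constrained: Tuple[str, ...]   # names of the constrained objects
    rule: Callable[..., bool]      # how the former restrict the latter


# An aircraft cannot have more engines mounted than the aircraft type allows.
engine_limit = Constraint(
    identifier="1",
    constraining=("AIRCRAFT TYPE.Number of Engines",),
    constrained=("AIRCRAFT",),
    rule=lambda mounted, allowed: mounted <= allowed,
)
```

An aircraft instance with two engines mounted passes `engine_limit.rule(2, 4)`, whereas five engines on a four-engine type would be rejected.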

Constraints in ER Model

• If a single object is constrained, constraint is placed near the constrained
  entity type or relationship type
• If multiple objects are constrained by a constraint, constrained objects are
  connected by a dotted line and constraint is positioned near connecting line
  - Alternatively, the constraint is repeated for each constrained object
• A dotted arrow may be drawn from the constraining object to the
  constrained object with the constraint placed next to it
• Format of a single constraint for an object:
      { identifier [ : rule ] }
  identifier: Unique identifier for description of constraint
  rule: Rule for constraint (optional if description outside ER model)
• Multiple constraints for same object separated by semicolons:
      { id-1 : rule-1 ; id-2 : rule-2 ; id-3 : rule-3 }

Figure 4-43. Constraints in ER Model CF182.0

Notes:
As mentioned before, a constraint can limit the instances of a single object or of multiple
objects. If a single entity type or relationship type is constrained, the constraint is positioned
near the constrained object.
If multiple objects are constrained by the same constraint, the constrained objects are
interconnected by a dotted line and the constraint is placed next to the connecting dotted
line. To avoid clutter and to keep the entity-relationship model clear, you
may prefer not to connect the constrained objects, but rather repeat the constraint for every
constrained object. If the constrained objects are far apart in the entity-relationship model,
it may be difficult or even impractical to interconnect them.
To visualize the interdependency, a dotted arrow may be drawn from the constraining object
(in case of an attribute from its entity type) to a constrained object and the constraint placed
next to it.
In the entity-relationship model, the constraints themselves are documented as follows:
• The constraints are enclosed in braces.
• Each constraint consists of a unique identifier which is optionally followed by a colon (:)
and the rule describing the interdependency.
• The unique identifier for the constraint can be anything you like. Usually, it is a number.
Its purpose is to tie together repetitions of the same constraint (if multiple objects are
constrained by a single constraint as explained above) and to identify a detailed
description of the constraint outside the diagram.
• The colon and the rule may only be omitted if an outside description of the constraint is
provided.
Multiple constraints for an object may be placed within the same enclosing braces. The
different constraints are separated by semicolons (;).
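The notation described above is regular enough to be read mechanically. The following is a minimal Python sketch, not part of the course material, that splits a constraint annotation into identifier and rule pairs. The function name parse_constraints is hypothetical, and the sketch assumes that a rule does not itself contain a semicolon.

```python
def parse_constraints(text):
    """Split '{ id : rule ; id : rule }' into (identifier, rule) pairs.

    The rule is None when the colon and the rule are omitted, which the
    notation allows only if an outside description exists.
    Assumption: a rule never contains a semicolon.
    """
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("constraints must be enclosed in braces")
    constraints = []
    for part in text[1:-1].split(";"):
        part = part.strip()
        if not part:
            continue
        # Only the first colon separates the identifier from the rule.
        identifier, colon, rule = part.partition(":")
        rule = rule.strip()
        constraints.append((identifier.strip(), rule if colon and rule else None))
    return constraints


print(parse_constraints("{ id-1 : rule-1 ; id-2 : rule-2 ; id-3 : rule-3 }"))
print(parse_constraints("{ 3 }"))
```

An identifier without a rule yields None for the rule, which signals that the detailed description must be looked up outside the diagram.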
For the rule, you may use conditional expressions or formulas, if applicable, or natural
language text. Natural language text may be easier to understand, but carries the risk of
ambiguity. However, many of the rules can only be formulated using natural language.
It would be possible to define a formal notation using conditional expressions,
mathematical symbols, set symbols, and functional operators covering most of the cases,
but this formal notation would be complex and not necessarily enhance the clarity of the
entity-relationship model. Most of the time, natural language may still be your best choice.
Therefore, we will use natural language in most of the examples in this document.


Constraints (Example 1)

AIRCRAFT TYPE
  K Type Code
    Category
    Manufacturer
    Number of Engines
    ...
AIRCRAFT TYPE _for_ AIRCRAFT MODEL   D, 1..m
AIRCRAFT MODEL 1..1 _for_ m AIRCRAFT
AIRCRAFT
  K Aircraft Number
    Date Acquired
    Engine: value-1, value-2, value-3, value-4
    ...
{ 1 : Number of engines for aircraft ≤ Number of engines for aircraft type }

Constraint No. 1
Rule: Number of engines for aircraft ≤ Number of engines for aircraft type
Explanation: An aircraft cannot have more engines mounted than the
aircraft type allows
Figure 4-44. Constraints (Example 1) CF182.0

Notes:
The problem statement for Come Aboard in Unit 3 - Problem Statement states as a
business constraint that an aircraft cannot have more engines mounted than the aircraft
model allows. In the meantime, we have learned that the number of engines is rather a
characteristic of the aircraft type and, therefore, an attribute of entity type AIRCRAFT TYPE
(and not of entity type AIRCRAFT MODEL) as illustrated in the above entity-relationship
model portion.
Attribute Number of Engines of entity type AIRCRAFT TYPE is the constraining object of
the constraint. It restricts how many values the attribute Engine may have for instances of
entity type AIRCRAFT. Thus, it constrains the instances of entity type AIRCRAFT, the
constrained object.
The dotted arrow from AIRCRAFT TYPE to AIRCRAFT visualizes who constrains whom.
The braces next to the arrow contain the identifier (1) and the rule for the constraint.
At the bottom of the visual, an outside (of the entity-relationship model) description of the
constraint is given. It repeats the rule and provides an explanation. A more complete
outside description should list the constraining objects (Number of Engines in entity type
AIRCRAFT TYPE) and the constrained objects (AIRCRAFT).
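To make the rule of constraint 1 concrete, here is a minimal Python sketch of the check. It is an illustration only: the class and field names are hypothetical, the sample values are invented, and the aircraft refers to its aircraft type directly, whereas in the entity-relationship model the type is reached through AIRCRAFT MODEL.

```python
from dataclasses import dataclass, field


@dataclass
class AircraftType:
    type_code: str
    number_of_engines: int  # constraining attribute


@dataclass
class Aircraft:
    aircraft_number: str
    aircraft_type: AircraftType  # simplification: reached via AIRCRAFT MODEL in the model
    engines: list = field(default_factory=list)  # values of multi-valued attribute Engine


def satisfies_constraint_1(aircraft):
    """Number of engines for aircraft <= Number of engines for aircraft type."""
    return len(aircraft.engines) <= aircraft.aircraft_type.number_of_engines


# Invented sample instances.
twin = AircraftType(type_code="T2", number_of_engines=2)
ok = Aircraft("AC-001", twin, ["E-17", "E-18"])
too_many = Aircraft("AC-002", twin, ["E-21", "E-22", "E-23"])
print(satisfies_constraint_1(ok), satisfies_constraint_1(too_many))
```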


Constraints (Example 2)

1 _captain_for_ m

PILOT { 2a } { 2b } FLIGHT
1 _copilot_for_ m

Constraint No. 2a
Rule: The captain for a flight cannot become the copilot at the same time
Explanation: A pilot that has been assigned as captain to a flight cannot become
copilot for the flight at the same time. This means that relationship
type PILOT_copilot_for_FLIGHT cannot receive a relationship
instance already contained in PILOT_captain_for_FLIGHT.
Constraint No. 2b
Rule: The copilot for a flight cannot become the captain at the same time
Explanation: A pilot that has been assigned as copilot to a flight cannot become
captain for the flight at the same time. This means that relationship
type PILOT_captain_for_FLIGHT cannot receive a relationship
instance already contained in PILOT_copilot_for_FLIGHT.
Figure 4-45. Constraints (Example 2) CF182.0

Notes:
This example demonstrates that a business constraint may result in multiple constraints for
the entity-relationship model.
CAB has a business constraint specifying that the captain and copilot for a flight must be
different. Assuming that the pilot assignment is modeled by the two relationship types
PILOT_captain_for_FLIGHT and PILOT_copilot_for_FLIGHT (original solution), the
translation of the business constraint into constraints for the entity-relationship model
results in two constraints:
• The first constraint (2a) constrains the instances of relationship type
PILOT_copilot_for_FLIGHT by requiring that a pilot that has already been assigned as
captain to a flight cannot become the copilot of the flight as well. Thus, the instances of
PILOT_copilot_for_FLIGHT are constrained by the instances of
PILOT_captain_for_FLIGHT. If PILOT_captain_for_FLIGHT already contains an
instance for a specified pilot and flight, an instance for them must not be added to
PILOT_copilot_for_FLIGHT.

Accordingly, the constraining object is relationship type PILOT_captain_for_FLIGHT and
the constrained object is PILOT_copilot_for_FLIGHT. The rule is: The captain for a flight
cannot become the copilot at the same time. A more formal notation for the rule could
be:
(pilot, flight) ∈ PILOT_captain_for_FLIGHT ⇒
(pilot, flight) ∉ PILOT_copilot_for_FLIGHT
• Conversely, the second constraint (2b) constrains the instances of relationship type
PILOT_captain_for_FLIGHT by requiring that a pilot that has already been assigned as
copilot to a flight cannot become the captain of the flight. Thus, the instances of
PILOT_captain_for_FLIGHT are constrained by the instances of
PILOT_copilot_for_FLIGHT. If PILOT_copilot_for_FLIGHT already contains an instance
for the specified pilot and flight, an instance for them cannot be added to
PILOT_captain_for_FLIGHT.
Accordingly, the constraining object is relationship type PILOT_copilot_for_FLIGHT and
the constrained object is PILOT_captain_for_FLIGHT. The rule is: The copilot for a flight
cannot become the captain at the same time. In this case, a more formal notation for the
rule could be:
(pilot, flight) ∈ PILOT_copilot_for_FLIGHT ⇒
(pilot, flight) ∉ PILOT_captain_for_FLIGHT
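The two rules can be sketched in Python by treating each relationship type as a set of (pilot, flight) pairs. This is only an illustration under stated assumptions: the class PilotAssignments and its method names are hypothetical, and the sketch checks nothing but constraints 2a and 2b (the cardinalities of the relationship types are not enforced).

```python
class PilotAssignments:
    """Relationship instances of the two relationship types as sets of pairs."""

    def __init__(self):
        self.captain_for = set()  # instances of PILOT_captain_for_FLIGHT
        self.copilot_for = set()  # instances of PILOT_copilot_for_FLIGHT

    def assign_captain(self, pilot, flight):
        # Constraint 2b: constrained by the instances of PILOT_copilot_for_FLIGHT.
        if (pilot, flight) in self.copilot_for:
            raise ValueError("constraint 2b: pilot is already copilot for this flight")
        self.captain_for.add((pilot, flight))

    def assign_copilot(self, pilot, flight):
        # Constraint 2a: constrained by the instances of PILOT_captain_for_FLIGHT.
        if (pilot, flight) in self.captain_for:
            raise ValueError("constraint 2a: pilot is already captain for this flight")
        self.copilot_for.add((pilot, flight))


assignments = PilotAssignments()
assignments.assign_captain("P1", "F100")
assignments.assign_copilot("P2", "F100")
```

Each method consults the other relationship type before adding an instance, which mirrors the roles of constraining and constrained object described above.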


Constraints (Example 3)

PILOT m _assigned_to_ m FLIGHT
PILOT_assigned_to_FLIGHT _by_ PILOT ASSIGNMENT   D, 1..1

PILOT
  K Employee Number
  ...
FLIGHT
  K Flight Number
  K From
  K To
  K Flight Locator
  ...
PILOT ASSIGNMENT
  K Flight Number
  K From
  K To
  K Flight Locator
  K Employee Number
    Pilot Function
{ 2 : Pilot Function must be unique for a flight }

Constraint No. 2
Rule: Pilot Function must be unique for a flight
Explanation: For a flight, each function (CAPTAIN or COPILOT) must only be
assigned once. This means the combination (Flight Number,
From, To, Flight Locator, Pilot Function) must be unique.
Figure 4-46. Constraints (Example 3) CF182.0

Notes:
This visual illustrates the constraint required if the pilot assignment is modeled using a
single relationship type with nondefining attributes (dependent entity type PILOT
ASSIGNMENT). As we discussed before, in this case, the uniqueness of the entity key of
dependent entity type PILOT ASSIGNMENT automatically takes care of the requirement
that a pilot be assigned only once to a flight.
However, in order to ensure that not more than one captain and not more than one copilot
are assigned to a flight, we need a constraint. The rule for the constraint is simply that the
value of attribute Pilot Function must be unique for each flight. In other words, the
quintuplet of attributes (Flight Number, From, To, Flight Locator, Pilot Function) must be
unique.
In this case, the five attributes are the constraining objects and entity type PILOT
ASSIGNMENT is the constrained object.
Note that the above constraint does not restrict the values of attribute Pilot Function to the
two values CAPTAIN and COPILOT. This should be achieved through the domain definition
for the appropriate data element.

Furthermore, this constraint is not a direct derivative of a business constraint. It originates
from the way we have modeled the pilot assignment. It does not enforce that the captain
and copilot for a flight are different. (This is achieved in another way.) It only enforces that
each function is only assigned once to a flight.
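The uniqueness rule of constraint 2 can be checked by counting how often each combination (Flight Number, From, To, Flight Locator, Pilot Function) occurs. The Python sketch below is an illustration only: the field names from_airport and to_airport are stand-ins for From and To (from is a reserved word in Python), and all sample values are invented.

```python
from collections import Counter
from typing import NamedTuple


class PilotAssignment(NamedTuple):
    flight_number: str
    from_airport: str   # attribute From
    to_airport: str     # attribute To
    flight_locator: str
    employee_number: str
    pilot_function: str


def violations_of_constraint_2(assignments):
    """Return the (flight key, Pilot Function) combinations that occur more than once."""
    counts = Counter(
        (a.flight_number, a.from_airport, a.to_airport, a.flight_locator, a.pilot_function)
        for a in assignments
    )
    return sorted(combo for combo, n in counts.items() if n > 1)


# Invented sample instances: two captains for the same flight.
sample = [
    PilotAssignment("CA100", "AAA", "BBB", "L1", "E01", "CAPTAIN"),
    PilotAssignment("CA100", "AAA", "BBB", "L1", "E02", "COPILOT"),
    PilotAssignment("CA100", "AAA", "BBB", "L1", "E03", "CAPTAIN"),
]
print(violations_of_constraint_2(sample))
```

The check does not look at Employee Number at all, which reflects the remark above that this constraint does not enforce that captain and copilot are different pilots.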


Constraints (Example 4)

AIRCRAFT _for_
LEG
MODEL 1. .1 m
m

_can_fly_ _for_
AND
m m D
{3}
m m
PILOT FLIGHT
_assigned_to_

Constraint No. 3
Rule: Pilot for flight must have license to fly aircraft model for leg
Explanation: A pilot assigned to a flight must be able, i.e., have the license,
to fly the aircraft model for the leg for the flight.

Figure 4-47. Constraints (Example 4) CF182.0

Notes:
Come Aboard has a business constraint requiring that the pilots for a flight must have the
license to fly the aircraft model for the leg for the flight, i.e., can fly the aircraft model. The
above visual illustrates how this business constraint can be translated into a constraint for
the entity-relationship model.
For the constraint, relationship types PILOT_can_fly_AIRCRAFT MODEL,
FLIGHT_for_LEG, and AIRCRAFT MODEL_for_LEG are the constraining objects since
their instances determine the pilots that can be assigned to a given flight: The aircraft
model must be for the leg of the flight and the pilot must be able to fly the aircraft model. In
other words, for a given flight, the aircraft model must be determined using relationship
types FLIGHT_for_LEG and AIRCRAFT MODEL_for_LEG. Then, the resulting aircraft
model must be used to determine the pilots that can fly the aircraft model.
The constrained object is relationship type PILOT_assigned_to_FLIGHT because only
special pilots can be assigned to a flight, namely, those that can fly the aircraft model for
the leg for the flight (rule).

Note that dependent entity type PILOT ASSIGNMENT for relationship type
PILOT_assigned_to_FLIGHT is not shown on the visual.
You might get the idea that you could avoid the constraint by having a relationship type
(PILOT_can_fly_AIRCRAFT MODEL)_assigned_to_(AIRCRAFT MODEL_for_LEG),
interconnecting PILOT_can_fly_AIRCRAFT MODEL and AIRCRAFT MODEL_for_LEG,
and basing the relationship type assigning pilots to flights on this relationship type rather
than on PILOT. This does not work because an instance of AIRCRAFT MODEL_for_LEG could
be paired with any instance of PILOT_can_fly_AIRCRAFT MODEL, even with one that has a
different aircraft model. Nothing in the relationship type definition enforces that only
particular instances can be interconnected. Only a constraint can ensure that only
instances with the same aircraft model are interconnected.
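The navigation described for constraint 3 (from the flight to its leg, from the leg to its aircraft model, and from the aircraft model to the licensed pilots) can be sketched in Python as follows. The function names and sample identifiers are hypothetical, and the sketch assumes that each flight is for exactly one leg and each leg has exactly one aircraft model, so that both relationship types can be held as simple lookups.

```python
def eligible_pilots(flight, flight_for_leg, model_for_leg, can_fly):
    """Pilots that may be assigned to the flight according to constraint 3."""
    leg = flight_for_leg[flight]   # FLIGHT_for_LEG (assumed: one leg per flight)
    model = model_for_leg[leg]     # AIRCRAFT MODEL_for_LEG (assumed: one model per leg)
    # PILOT_can_fly_AIRCRAFT MODEL is a set of (pilot, aircraft model) pairs.
    return {pilot for pilot, licensed_model in can_fly if licensed_model == model}


def violations_of_constraint_3(assigned_to, flight_for_leg, model_for_leg, can_fly):
    """Instances of PILOT_assigned_to_FLIGHT that break the rule."""
    return {
        (pilot, flight)
        for pilot, flight in assigned_to
        if pilot not in eligible_pilots(flight, flight_for_leg, model_for_leg, can_fly)
    }


# Invented sample instances.
flight_for_leg = {"F1": "LEG-A", "F2": "LEG-B"}
model_for_leg = {"LEG-A": "M-300", "LEG-B": "M-700"}
can_fly = {("P1", "M-300"), ("P2", "M-300"), ("P2", "M-700")}
assigned_to = {("P1", "F1"), ("P2", "F2"), ("P1", "F2")}
print(violations_of_constraint_3(assigned_to, flight_for_leg, model_for_leg, can_fly))
```

The three constraining relationship types appear only as inputs, while the constrained relationship type PILOT_assigned_to_FLIGHT is the one whose instances are filtered.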

4.5 Splitting and Combining Entity-Relationship Models


Subdivision of ER Model into Pages

Most of the time, an entity-relationship model will not fit onto one page
Must subdivide ER model into pieces that fit onto one page
Determine subareas and/or different views of application domain
Establish entity-relationship model for subareas or views
If subareas or views of application domain are still too large, try to
find smaller logical subsets you can break out
If nothing of the above helps, break out units in such a way that:
Entity and relationship types have as few relationship types to objects
on other pages as possible
Repeat entity or relationship types on other pages to illustrate
relationship types
The various submodels will overlap
Together, the submodels must cover the entire
entity-relationship model (application domain)

Figure 4-48. Subdivision of ER Model into Pages CF182.0

Notes:
Most of the time, the entity-relationship model for an application domain will not fit onto a
single page. Sure, you can use a bigger piece of paper, but this will only alleviate the
problem and not solve it. The consequence is that the entity-relationship model must be
split into pieces fitting on a single page.
Your first attempt should be to identify autonomous subareas of the application domain and
to separate their entity-relationship models. If you cannot find such subareas or their
entity-relationship models do not fit on a single page, try to identify different views with
which you can look at the application domain and establish the entity-relationship models
for them. A view comprises all objects of the entity-relationship model that a specific group
of people needs to know or that concerns them.
For our sample airline company called Come Aboard, a sample view would be the Pilot
View which comprises all entity types, relationship types, and constraints that pilots need to
know about or apply to them. Another view would be the Maintenance View including all
entity types, relationship types, and constraints concerning the aircraft maintenance. Both
views are illustrated on the subsequent visuals.

If the subareas or views are still too large, try to find smaller logical units that you can break
out and that will fit onto one page.
If none of the above helps, you just have to break out any pieces of the
entity-relationship model that will fit onto one page. Try to break them out in such a way that
the entity types and relationship types on that page have as few relationship types to entity
types or relationship types on other pages as possible. Of course, you need to illustrate the
page-crossing relationship types on other pages. There, you need to repeat the entity types
or relationship types of this page that are their sources or targets.
Generally, the submodels on the various pages will overlap. Some parts of the entire
entity-relationship model will occur on multiple pages. The submodels must not conflict with
each other. Together, they must cover the entire application domain, i.e., cover all portions
of the entire, undivided, entity-relationship model for the application domain.
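The requirement that the submodels overlap but together cover the entire model can be checked with simple set operations. The following Python sketch is an illustration only: the function name coverage_report is hypothetical, and the pages list just a small, abbreviated subset of the objects of the two views discussed in this topic.

```python
def coverage_report(full_model, pages):
    """Compare the objects on the pages with the undivided model.

    full_model: set of object names (entity types, relationship types, constraints)
    pages:      mapping of page or view name to the set of object names shown on it
    Returns (missing, repeated): objects on no page, and objects on several pages.
    """
    covered = set().union(*pages.values())
    missing = full_model - covered
    repeated = {
        obj for obj in covered
        if sum(obj in objects for objects in pages.values()) > 1
    }
    return missing, repeated


# Abbreviated, partly invented subset for illustration.
pages = {
    "Pilot View": {"PILOT", "FLIGHT", "AIRCRAFT", "AIRCRAFT MODEL"},
    "Maintenance View": {"MECHANIC", "AIRCRAFT", "AIRCRAFT MODEL"},
}
full_model = {"PILOT", "FLIGHT", "AIRCRAFT", "AIRCRAFT MODEL", "MECHANIC",
              "MAINTENANCE RECORD"}
missing, repeated = coverage_report(full_model, pages)
print(sorted(missing), sorted(repeated))
```

A nonempty set of missing objects means the pages do not yet cover the application domain. Repeated objects are expected, since they carry the page-crossing relationship types.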


Pilot View of ER Model for CAB

AIRCRAFT
PILOT
TYPE
m
_for_
{ 2 : Pilot Function must
be unique for flight } D 1. .m
m m PILOT DC _by_ _assigned AIRCRAFT
AIRPORT _to_
From To ASSIGNMENT 1. .1 MODEL
1. .1
_nonstop_to_ _for_
1. .m
m m
_as_ DC _for_ D _for_
_in_ LEG FLIGHT AIRCRAFT
1. .1 m m 1

ITINERARY

Figure 4-49. Pilot View of ER Model for CAB CF182.0

Notes:
This visual illustrates the Pilot View for Come Aboard. It comprises all entity types,
relationship types, and constraints that pilots need to know or are concerned with.
Pilots want to know the flights they have been assigned to and their function on the flight.
Therefore, the view needs to include, besides entity type PILOT, entity types FLIGHT and
PILOT ASSIGNMENT and relationship types PILOT_assigned_to_FLIGHT and
PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT.
Furthermore, pilots want to know to which leg of the itinerary a flight belongs and all
information about the airports for the leg. Thus, the view must include entity types LEG,
ITINERARY, and AIRPORT and relationship types FLIGHT_for_LEG,
AIRPORT_nonstop_to_AIRPORT, AIRPORT_nonstop_to_AIRPORT_in_ITINERARY, and
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_as_LEG.
In addition, pilots need to know everything about the aircraft for the flight including its model
and type. Consequently, the view must comprise entity types AIRCRAFT, AIRCRAFT
MODEL, and AIRCRAFT TYPE and relationship types AIRCRAFT_for_FLIGHT,
AIRCRAFT MODEL_for_AIRCRAFT, and AIRCRAFT TYPE_for_AIRCRAFT MODEL.

The illustrated constraint is the only one concerning the entity types and relationship types
of this entity-relationship model view. Business constraints Pilots for Flight Must Have
License for Aircraft Model for Leg and Only Aircraft Model With Start and Landing Rights
for Legs concern relationship type AIRCRAFT MODEL_for_LEG which is not part of this
view. Pilots need not necessarily know the corresponding constraints. The constraints
rather deal with flight planning and pilot assignment, done by different groups of people,
and would have to appear in the appropriate views.


Maintenance View of ER Model for CAB

Constraint No. 5
Rule: Only trained mechanics for aircraft maintenance
Explanation: A mechanic can only service an aircraft if he/she
has been trained for the appropriate aircraft model

AIRCRAFT TYPE
_for_
1. .m D
_trained
_for_ AIRCRAFT
m MODEL
1 Owner m
1. .1
MAINTENANCE _from_
_belongs_to_ MECHANIC _for_
RECORD m 1. .1 AND {5}
m m
m C
m
AIRCRAFT
_scheduled
_for_

{ 4 : New maintenance record only for existing aircraft }

Figure 4-50. Maintenance View of ER Model for CAB CF182.0

Notes:
The Maintenance View comprises all entity types, relationship types, and constraints
needed for the scheduling or performance of aircraft maintenance.
The scheduling concerns mechanics, aircraft, and aircraft models and must select
mechanics from those trained for the aircraft model for the aircraft to be serviced. Thus, the
maintenance view must include entity types MECHANIC, AIRCRAFT MODEL, and
AIRCRAFT and relationship types AIRCRAFT MODEL_for_AIRCRAFT,
MECHANIC_trained_for_AIRCRAFT MODEL, and MECHANIC_scheduled_for_AIRCRAFT.
During maintenance, mechanics must write maintenance records or look at them.
Therefore, the view must include entity type MAINTENANCE RECORD and relationship
types MAINTENANCE RECORD_from_MECHANIC and
MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD.
Furthermore, a mechanic needs to know information about the aircraft, its model, and its
type. Consequently, the view must also include entity type AIRCRAFT TYPE and
relationship type AIRCRAFT TYPE_for_AIRCRAFT MODEL.

There are two constraints applicable to the Maintenance View:
• The first constraint (4) requires that a new maintenance record is for an existing aircraft.
This constraint enforces that the aircraft number for a new maintenance record belongs
to an aircraft owned by CAB.
• The second constraint (5) ensures that only mechanics trained for the affiliated aircraft
model are scheduled for the service of an aircraft.
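Both constraints of the Maintenance View reduce to membership tests. The Python sketch below is an illustration only: the function names and sample identifiers are hypothetical, and it assumes that each aircraft has exactly one aircraft model.

```python
def satisfies_constraint_4(record_aircraft_number, aircraft_numbers):
    """New maintenance record only for existing aircraft."""
    return record_aircraft_number in aircraft_numbers


def satisfies_constraint_5(mechanic, aircraft_number, model_for_aircraft, trained_for):
    """Only mechanics trained for the aircraft model may be scheduled for the aircraft."""
    model = model_for_aircraft[aircraft_number]  # AIRCRAFT MODEL_for_AIRCRAFT
    return (mechanic, model) in trained_for      # MECHANIC_trained_for_AIRCRAFT MODEL


# Invented sample instances.
aircraft_numbers = {"AC-001", "AC-002"}
model_for_aircraft = {"AC-001": "M-300", "AC-002": "M-700"}
trained_for = {("MECH-1", "M-300"), ("MECH-2", "M-700")}

print(satisfies_constraint_4("AC-003", aircraft_numbers))
print(satisfies_constraint_5("MECH-1", "AC-002", model_for_aircraft, trained_for))
```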


Building an Enterprise-Wide ER Model


An enterprise may comprise many application domains
May be too complex to start building a single ER model
Must build separate models for the application domains
Consolidation needed for every step of design process before continuing!!!
Consolidation of problem statements before starting with ER models
Consolidation of ER models before continuing

Figure 4-51. Building an Enterprise-Wide ER Model CF182.0

Notes:
These days, many companies want to establish an enterprise-wide entity-relationship
model. For bigger companies, an enterprise-wide entity-relationship model comprises
multiple application domains. Such a model might be too complex to build in a single
pass, especially since you will rarely find a single domain expert who fully understands
all application domains involved.
As a consequence, it might be better to start with separate models for the various
application domains and then to consolidate them. It is necessary to consolidate the results
of every step of the design process for all application domains before continuing on to the
next step for any of the application domains. If you do not do this, the results most likely will
not fit together.
For the entity-relationship models, this means two things:
1. You should consolidate the problem statements of the various application domains
before developing the respective entity-relationship models.

2. You should consolidate the entity-relationship models before proceeding to the next step
of the design process for any of the application domains.


Problems During Consolidation of ER Models


Problems for Entity Types
• Entity types with same name may have a different meaning
• Entity types with different names may have the same meaning
• Entity types in one ER model may be attributes in other models
• Entity keys may be different

Problems for Relationship Types
• Relationship types with same name may have a different meaning
• Relationship types with different names may have the same meaning
• Cardinalities may be different
• Properties (controlling, dependent, supertype) may be different
• Relationship key may be different

Problems for Constraints
• Constraints may be missing
• Constraints may be conflicting

Figure 4-52. Problems During Consolidation of ER Models CF182.0

Notes:
During the consolidation of the entity-relationship models for the various application
domains, you may experience problems concerning the entity types, the relationship types,
or the constraints of the different models.
The different models may contain entity types that have the same names, but a different
meaning. Thus, the names must be changed to achieve uniqueness. Conversely, entity
types with different names may correspond to the same business object types and,
therefore, should be named the same.
Furthermore, because of the different perspectives of the individual application domains,
an entity type for one application domain may just be a set of attributes of another entity
type in another application domain. In this case, the set of attributes must become an entity
type in the other application domain as well and the necessary relationship types using this
new entity type as source or target must be established.
A further problem that may surface is that the entity keys for the same entity types in
different entity-relationship models may be different. This problem is easy to resolve. Since
the entity types are the same, they have the same attributes so that the same entity key
can be chosen for all entity-relationship models.
As for entity types, the different entity-relationship models may contain relationship types
with the same name, but with a different meaning. To remove the problem, the names must
be changed to achieve uniqueness. This applies to the names for both directions of the
relationship types. Conversely, differently named relationship types may have the same
meaning and, therefore, should be named the same.
The cardinalities of the same relationship types may be different in different
entity-relationship models. In this case, the true cardinalities must be determined and the
erroneous entity-relationship models changed accordingly.
Some of the properties for relationship types may be different. If there is a difference for the
controlling property, it must be determined if the deletion of the appropriate source or target
instances should indeed take place on an enterprise-wide scale. If so, the controlling
property must be added where it was omitted. If not, it must be dropped where specified.
If a relationship type is an owning relationship type in one entity-relationship model, but not
in another, it must be checked if the dependent entity type really fulfills the dependency
requirements and the proper corrections must be made in one of the models. Furthermore,
a class structure may have been recognized in one entity-relationship model, but not the
other. In this case, it must be introduced in the entity-relationship model where it is missing,
the supertype must appropriately be identified, and relationship types using the former
entity types as source or target must be verified.
In case of 1:1 relationship types, there are two choices for the relationship key. Thus,
different models may have chosen a different relationship key. Just choose the same
relationship key for all models concerned.
In some models, constraints may be missing that have been identified in other models. It
must be verified if the constraints have an enterprise-wide scope and the erroneous model
must be changed accordingly.
Furthermore, the different models may contain conflicting constraints. The conflicts must
be resolved by the domain experts and the models changed accordingly.
There may be other problems during the consolidation, but this list should already give you
a pretty good idea of what to look for.
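Some of the problems listed above can be found mechanically once the models are available in a comparable form, namely those where the same name carries different definitions (entity keys, cardinalities, relationship keys). The Python sketch below is an illustration only: the function name, the model names, and the sample definitions are hypothetical. Problems of meaning (same name with a different meaning, different names with the same meaning) cannot be detected this way and still need the domain experts.

```python
def conflicting_definitions(models):
    """Find objects that are defined differently in different models.

    models: mapping of model name to {object name: definition}, where a
            definition is a hashable value such as a tuple of key attributes
            or a tuple of cardinalities.
    Returns {object name: {model name: definition}} for every object whose
    definitions are not all equal.
    """
    by_object = {}
    for model_name, definitions in models.items():
        for object_name, definition in definitions.items():
            by_object.setdefault(object_name, {})[model_name] = definition
    return {
        name: per_model
        for name, per_model in by_object.items()
        if len(set(per_model.values())) > 1
    }


# Invented sample: the two models disagree on the cardinalities only.
models = {
    "Flight Operations": {
        "AIRCRAFT": ("Aircraft Number",),
        "AIRCRAFT MODEL_for_AIRCRAFT": ("1..1", "m"),
    },
    "Maintenance": {
        "AIRCRAFT": ("Aircraft Number",),
        "AIRCRAFT MODEL_for_AIRCRAFT": ("1", "m"),
    },
}
print(conflicting_definitions(models))
```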


Checkpoint

Exercise — Unit Checkpoint


1. Name the three major components of entity-relationship models.
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. The instances of an entity type may all have a different meaning.
(T/F)

3. Explain the difference between an entity type and an entity
instance.
_____________________________________________________
_____________________________________________________
_____________________________________________________

4. What is the purpose of entity keys? What is the minimum principle
for entity keys?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

5. The business object types for the application domain are the
primary source for the entity types of an entity-relationship model.
(T/F)

6. Describe what a relationship type is.
_____________________________________________________
_____________________________________________________
_____________________________________________________

7. The source of the primary direction of a relationship type is the
target of the inverse direction and the source of the inverse
direction is the target of the primary direction. (T/F)

8. A relationship type can again be the source or target of a
relationship type. (T/F)

9. A business relationship type can always be translated into a
relationship type of the entity-relationship model for the application
domain. (T/F)

10. Match the following catchwords with the corresponding
cardinalities:
a. at most one ____ 0..m
b. any number ____ 1
c. one and only one ____ 0..1
d. one or more ____ m
____ 1..m
____ 1..1

11. Assume that you have two entity types SEAT and PASSENGER
and a relationship type PASSENGER_has_SEAT expressing
which seats have been assigned to passengers. A passenger may
have zero, one, or multiple seats assigned to him/her. A seat can
only be assigned to a single passenger.
Specify the cardinalities for the source and target of relationship
type PASSENGER_has_SEAT:
a. Cardinality for source: _______
b. Cardinality for target: _______

12. Describe the terms 1:1 relationship type, 1:m relationship type, and
m:m relationship type.
_____________________________________________________
_____________________________________________________
_____________________________________________________


13. Assume that you have the following entity-relationship model:

r1
A B
1 m

r2

a. For relationship type r1, how many instances of entity type B
can at most be connected to an instance of entity type A?
__________________________________________________
b. For relationship type r1, how many instances of entity type B
must at least be connected to an instance of entity type A?
__________________________________________________
c. For relationship type r1, how many instances of entity type A
can at most be connected to an instance of entity type B?
__________________________________________________
d. For relationship type r1, how many instances of entity type A
must at least be connected to an instance of entity type B?
__________________________________________________
e. For relationship type r2, how many instances of entity type A
can at most be connected to an instance of entity type C?
__________________________________________________
f. For relationship type r2, how many instances of entity type A
must at least be connected to an instance of entity type C?
__________________________________________________

14. Based on the entity-relationship model for the previous checkpoint
question, assume that entity types A, B, and C have the following
instances:

Entity Type    Entity Instances
A              A1, A2, A3
B              B1, B2, B3, B4
C              C1

Are all of the following relationship instances for relationship type
r1 possible? If not, explain why they are not all possible.

Relationship Type    Relationship Instances
r1                   (A1, B3), (A1, B4), (A3, B1), (A3, B3)

_____________________________________________________
_____________________________________________________
Would you expect any relationship instances for relationship type
r2?
_____________________________________________________
_____________________________________________________


15. Assume that you have the following entity-relationship model:

[Diagram: entity-relationship model with relationship types r1 and r2. Surviving labels: r1, A, B, m, m, 1, r2, 1..m.]

List the defining attributes and the relationship keys for relationship
types r1 and r2. Use the term key of ... to describe them.
Defining attributes for r1: _______________________________
Relationship key for r1: _______________________________
Defining attributes for r2: _______________________________
Relationship key for r2: _______________________________

16. The entity key of a dependent entity type must be equal to the
entity key of another entity type or the relationship key of a
relationship type. (T/F)

17. Name the criteria that an entity type must fulfill to be a dependent
entity type.
_____________________________________________________
_____________________________________________________
_____________________________________________________

18. Assume that you have the following entity-relationship model:

[Diagram: entity-relationship model with owning relationship type r1. Surviving labels: r1, D, A, B, m.]

Furthermore, assume that A and B have the following entity
instances (just the keys are shown):

Entity Type    Entity Instances
A              A1, A2, A3
B              A1.B1, A1.B2, A2.B1, A3.B1

Can the owning relationship type r1 have the following relationship
instances?
(A1, A1.B1), (A1, A1.B2), (A1, A2.B1), (A2, A2.B1), (A3, A3.B1)
_____________________________________________________
_____________________________________________________
_____________________________________________________

19. How can you represent the nondefining attributes for a relationship
type in an entity-relationship model?
_____________________________________________________
_____________________________________________________
_____________________________________________________

20. If you specify the controlling property for the target of a relationship
type, a relationship instance is to be deleted when its target
instance is deleted. (T/F)

21. If you specify the controlling property for the target of a relationship
type, the target instance belonging to a relationship instance is to
be deleted when the relationship instance is deleted. (T/F)


22. Assume that you have the following entity-relationship model:

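[Diagram: entity-relationship model. Surviving labels: entity types A, B, C, D; relationship types r1, r2, r3, r4; cardinality labels m; additional C labels (placement not recoverable).]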

Furthermore, assume that the entity types and relationship types
have the following instances:

Object    Instances
A         A1, A2, A3
B         B1, B2
C         C1, C2, C3
D         D1, D2, D3
r1        (C1, A2), (C2, A3)
r2        (A1, B1), (A2, B1), (A3, B2)
r3        (C1, D1), (C1, D2), (C2, D3)
r4        ((A1, B1), (C1, D2)), ((A1, B1), (C2, D3))

Which instances will the various entity types and relationship types
have after entity instance C2 of entity type C has been deleted?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

23. The purpose of supertypes and subtypes is to categorize the
instances of entity types. (T/F)

24. What are the components of a class structure?


_____________________________________________________
_____________________________________________________
_____________________________________________________

25. What is the is-bundle?


_____________________________________________________
_____________________________________________________
_____________________________________________________

26. Match the following partial sentences with the proper bundle
cardinalities:
a. The subtype set is exclusive, but not covering if the bundle cardinality is ...
b. The subtype set is covering, but not exclusive if the bundle cardinality is ...
c. The subtype set is exclusive and covering if the bundle cardinality is ...
d. The subtype set is not covering and not exclusive if the bundle cardinality is ...
____ 1..1
____ 1..m
____ 0..1
____ 0..m

27. Which mechanism can you use to restrict the instances of entity
types or relationship types?
_____________________________________________________
_____________________________________________________

28. Name the three components of constraints.


_____________________________________________________
_____________________________________________________
_____________________________________________________


29. What is the format of a constraint in the entity-relationship model?


_____________________________________________________
_____________________________________________________
_____________________________________________________

Unit Summary (1 of 3)
• The three major components of entity-relationship models are:
  - Entity types, relationship types, constraints
• Entity types are conceptual units representing classes of objects
  with the same meaning and characteristics
  - Have attributes (conceptual pieces of information)
  - Instances uniquely identified by entity key
  - Primary source: business object types for application domain
• Relationship types are classes of interrelationships between
  the instances of entity types and/or relationship types
  - All interrelationships have the same meaning and characteristics
  - All interrelationships interconnect two instances
  - A relationship type has a primary and an inverse direction
  - Primary source: business relationship types for application domain
• Cardinalities for relationship types determine:
  - How many interrelationships a source instance can and must have
    with a target instance
  - How many interrelationships a target instance can and must have
    with a source instance

Figure 4-53. Unit Summary (1 of 3) CF182.0

Notes:


Unit Summary (2 of 3)
• Defining attributes completely describe instances of relationship type
• Relationship key uniquely identifies instances of relationship type
• Relationship type may have nondefining attributes
  - Modeled by means of dependent entity types
• Dependent entity type is an entity type connected to a parent
  entity type or a relationship type via an owning relationship type
  - Each dependent instance connected to one and only one parent instance
  - Key portion of dependent entity type = key of parent
  - Only instances interconnected with matching key portion/key values
• Controlling property for relationship type specifies if source or
  target instance to be deleted when relationship instance is deleted
  - Cascading effect
• Class structures allow the categorization of entity instances
  - Supertype = generalization of subtypes
  - Subtypes = specializations of supertype
  - Subtype set may be exclusive and/or covering

Figure 4-54. Unit Summary (2 of 3) CF182.0

Notes:

Unit Summary (3 of 3)

• Is-bundle = set of _is_ relationship types connecting the supertype
  to its subtypes
  - Represented as a fork with handle starting at supertype
  - Bundle cardinality specifies to how many subtype instances a supertype
    instance can be and must be connected
• Constraints are interdependencies between the objects of an ER model
  restricting the possible instances of entity types or relationship types
  - Consist of constraining objects, constrained objects, and a rule
  - Primary source: business constraints for application domain
• Large entity-relationship models must be split into pages
  - ER models for subareas, views, or logical subsets of application domain
  - Together submodels must cover entire entity-relationship model
• When building an enterprise-wide entity-relationship model:
  - Start with entity-relationship models for separate application domains
  - Consolidate results for every step of design process before moving on
    to next step

Figure 4-55. Unit Summary (3 of 3) CF182.0

Notes:

Unit 5. Data and Process Inventories

What This Unit Is About


This unit describes the purpose and content of data and process
inventories. Furthermore, it describes methods for developing them
and gives examples for their content.

What You Should Be Able to Do


After completing this unit you should be able to:
• Explain the purpose of data and process inventories.
• Explain the significance of data inventories for database design.
• Understand who has the responsibility for the creation of data and
process inventories.
• Describe the content data and process inventories should have for
database design.
• Summarize some methods for establishing data and process
inventories.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises


Unit Objectives
After completion of this unit, you should be able to:

• Explain the purpose of data and process inventories
• Explain the significance of data inventories for database design
• Understand who has the responsibility for the creation of data and
  process inventories
• Describe the content data and process inventories should have for
  database design
• Summarize some methods for establishing data and process inventories

Figure 5-1. Unit Objectives CF182.0

Notes:
Up to now, from the problem statement, the entity-relationship model for the application
domain has been developed. To develop the corresponding database, you must determine
the data that should be contained in the database before you can proceed. This means that
you must establish a list of all data for the application domain, that is, the data inventory.
In this unit, we will talk about the data inventory and the process inventory which is
interrelated with it. We will describe their purposes and explain the significance of the data
inventory for database design. You will find out whose responsibility it is to establish the
data and process inventories for the application domain.
In addition, you will learn what the content of data and process inventories should be from
the perspective of database design. The process inventory is primarily intended for
application programmers, but is important for database design: The descriptions of its
business processes reveal the data that should be contained in the database for the
application domain.

5.1 Data Inventory


Data and Process Inventories in Design Process


[Diagram: steps of the design process. Surviving labels: Problem Statement, Entity Relationship Model, Process Inventory, Data Inventory (Conceptual View); Tuple Types, Logical Data Structures, Integrity Rules (Logical View); Tables, Indexes (Storage View).]

Figure 5-2. Data and Process Inventories in Design Process CF182.0

Notes:
The preceding steps of the design process dealt with the problem statement and the
entity-relationship model for the application domain. To establish the database for the
application domain from the entity-relationship model, you need to know the various pieces
of data to be stored in the database. The data for the application domain are described in
the data inventory.
Data inventory and process inventory are developed in parallel during the conceptual view
of the design process. They are established after the entity-relationship model because the
entity-relationship model can be used in their development and is verified as part of their
development.
In principle, the data inventory can be developed without the process inventory. However,
the best method for developing the data inventory is to couple its development and the
development of the process inventory. The process inventory contains a description of all
business processes for the application domain. The description for a business process lists
the data used by the business process. Hence, the process inventory reveals the data
elements that should be contained in the database for the application domain and, thus,
should be described in the data inventory.
By coupling the data and process inventories, you can ensure that all data needed by
documented business processes of the application domain are contained in the data
inventory and only these data. Consequently, the database will contain precisely the
required data. Furthermore, you can ensure that the data inventory is updated as new
business processes are planned and recorded in the process inventory.
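The coupling described above can be pictured as a set comparison. The following is a minimal sketch, not a prescribed procedure; the process names, data element names, and the function compare_inventories are hypothetical and chosen only for illustration.

```python
# Hypothetical inventories: all names below are illustrative assumptions.
process_inventory = {
    "Schedule Flight": {"Flight Number", "Departure Time", "Arrival Time"},
    "Record Maintenance": {"Aircraft Number", "Maintenance Remark"},
}
data_inventory = {
    "Flight Number", "Departure Time", "Arrival Time",
    "Aircraft Number", "Frequent Flyer Tier",
}


def compare_inventories(processes, data):
    """Return (missing, unused) sets of data element names."""
    used = set().union(*processes.values()) if processes else set()
    missing = used - data  # needed by a process, but not yet described
    unused = data - used   # described, but no documented process uses it
    return missing, unused


missing, unused = compare_inventories(process_inventory, data_inventory)
print("missing:", sorted(missing))
print("unused:", sorted(unused))
```

An empty missing set means that all data needed by the documented business processes are described; an empty unused set means that the data inventory contains only these data.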


Data Inventory - Purpose and Responsibilities


• Detailed description of all data for application domain
  - Data elements and data groups
    Data element = indivisible piece of data
    Data group = group of logically related data elements and/or data groups
• Independent of entity types
  - Multiple entity types may use same data for different purposes
• Jointly created by:
  - Application domain expert
    Knows application domain
  - Database designer
    Knows what is needed for database design
    Knows entity types for data elements/data groups
• Input for database designer

Figure 5-3. Data Inventory - Purpose and Responsibilities CF182.0

Notes:
The data inventory contains a detailed description of all data for the application domain: It
describes all abstract data types, data elements, and data groups for the application
domain.
Data elements are indivisible pieces of data. They cannot be divided into smaller pieces
meaningful for the application domain.
In contrast, data groups are sets of logically related (for the application domain) data
elements and/or data groups. This is a recursive definition and implies that data groups can
contain data groups. The data elements or data groups of a data group are referred to as
components of the (owning) data group.
A data group can be viewed as a tree structure of one or more levels whose lowest level
nodes (terminal nodes) are data elements. An example of a data group may be Name of
Person, i.e., the name of a person, consisting of data elements Last Name, First Name,
and Middle Initial.
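The tree view of a data group can be sketched in a few lines. This is an illustration only: the class names, the function terminal_elements, and the two-level group Passenger Contact with its component Phone Number are assumptions; only Name of Person and its three data elements come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataElement:
    """Indivisible piece of data (a terminal node of the tree)."""
    name: str


@dataclass
class DataGroup:
    """Group of data elements and/or data groups (recursive definition)."""
    name: str
    components: List[Union["DataElement", "DataGroup"]] = field(default_factory=list)


def terminal_elements(node):
    """Return the names of all data elements below node, left to right."""
    if isinstance(node, DataElement):
        return [node.name]
    names = []
    for component in node.components:
        names.extend(terminal_elements(component))
    return names


name_of_person = DataGroup("Name of Person", [
    DataElement("Last Name"),
    DataElement("First Name"),
    DataElement("Middle Initial"),
])
# Hypothetical two-level data group that contains another data group.
passenger_contact = DataGroup("Passenger Contact", [
    name_of_person,
    DataElement("Phone Number"),
])
print(terminal_elements(passenger_contact))
```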

The correct identification of data groups is important for the later steps of the design
process since they identify items that logically belong together. In particular, they enable
the recognition of repetitive groups and of groups of data that can be separated out
(vertical splitting).
Since the data inventory is part of the conceptual view, it should be purely
application-domain oriented. It should not contain data elements or data groups caused by
implementation and not having a direct meaning for the application domain.
The entity-relationship model is input for the development of the data inventory. It helps
identify data elements and data groups. Data elements and data groups can be viewed as
abstractions or generalizations of elementary attributes and composite attributes,
respectively. They define elementary or composite data for the application domain
independent of their usage by entity types. Therefore, the definition of the data elements
and data groups should be independent of the entity types of the entity-relationship model.
In this context, the question arises whether you should have two different data elements or data
groups for data with the same fundamental meaning but a (slightly) different usage. For
example, for our sample airline company, we want to store the planned departure time and
the actual departure time for a flight. Should you have different data elements Planned
Departure Time and Actual Departure Time or just a single data element Departure Time?
Planned Departure Time and Actual Departure Time are certainly different attributes for
entity type FLIGHT.
The answer is that both solutions are feasible. If you choose a single data element, data
element Departure Time is used in two different roles (purposes) by entity type FLIGHT: It
is used as planned departure time (attribute name Planned Departure Time) and as actual
departure time (attribute name Actual Departure Time). If you choose two different data
elements, they are used by entity type FLIGHT in their fundamental meanings. In this case,
the attribute names can be the names of the data elements.
It is a matter of taste and judgement where you make the assignment of roles: on the data
element level, the data group level, or the entity type level. If you make the differentiation
on the data element level, you must make the description of the data elements more
restrictive, but need not deal with roles. If you differentiate on the data group or entity type
level, you can keep the description of the data elements more general, but have to deal
with roles. Although the roles must be described as well, the definition work may be
somewhat less in the latter case.
However, you should make a sensible trade-off. In the extreme, you could decide on having
a single data element for all data with the same data type and use roles for all usages. For
example, you could have a single data element Time representing a time and define all
kinds of roles for it: as planned time of departure, as actual time of departure, as planned
time of arrival, as actual time of arrival, and so on. By now, you should know that this is
certainly not the way to go. Having data elements Departure Time and Arrival Time would
probably be adequate in this case.
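The two feasible solutions for the departure times can be compared in a small sketch. The class AttributeUsage and its fields are hypothetical; the sketch only assumes, as described above, that a role supplies the attribute name when one data element is used for several purposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeUsage:
    """Usage of a data element by an entity type (hypothetical structure)."""
    entity_type: str
    data_element: str
    role: Optional[str] = None  # None: used in its fundamental meaning

    @property
    def attribute_name(self):
        # With a role, the role names the attribute;
        # otherwise the name of the data element is used.
        return self.role if self.role else self.data_element


# Solution 1: a single data element used in two roles.
single_element = [
    AttributeUsage("FLIGHT", "Departure Time", role="Planned Departure Time"),
    AttributeUsage("FLIGHT", "Departure Time", role="Actual Departure Time"),
]
# Solution 2: two data elements used in their fundamental meanings.
two_elements = [
    AttributeUsage("FLIGHT", "Planned Departure Time"),
    AttributeUsage("FLIGHT", "Actual Departure Time"),
]
print([usage.attribute_name for usage in single_element])
print([usage.attribute_name for usage in two_elements])
```

Both solutions yield the same two attributes for entity type FLIGHT. They differ only in where the distinction is described: in the roles or in the data elements.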


As described above, data elements or data groups may be used as components by other
data groups. They may also be used as attributes by entity types or tuple types (as we will
see later on).
Data elements and data groups as such do not have cardinalities. However, when used as a
component or attribute, a data element or data group has an associated cardinality. The
cardinality specifies how many values the data element/data group must assume at least
and at most and will assume on average for this usage. Note that two sets, having the
same data elements and data groups, are considered different data groups if the
cardinalities of the components are different.
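As a rough sketch of this rule, a component usage can carry its minimum, maximum, and average number of values. The class ComponentUsage, the function same_data_group, and the concrete numbers are assumptions made for illustration. In particular, the sketch assumes that only the minimum and maximum, not the average, decide whether two sets form the same data group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComponentUsage:
    """A data element or data group as used by a data group (hypothetical)."""
    name: str
    minimum: int
    maximum: Optional[int]  # None: no upper limit
    average: float


def same_data_group(components_a, components_b):
    """Compare two component sets by name and minimum/maximum cardinality."""
    def key(usage):
        return (usage.name, usage.minimum, usage.maximum)
    return {key(u) for u in components_a} == {key(u) for u in components_b}


# Same data elements, but Middle Initial is optional in one set only.
name_optional_initial = (
    ComponentUsage("Last Name", 1, 1, 1.0),
    ComponentUsage("First Name", 1, 1, 1.0),
    ComponentUsage("Middle Initial", 0, 1, 0.6),  # illustrative average
)
name_required_initial = (
    ComponentUsage("Last Name", 1, 1, 1.0),
    ComponentUsage("First Name", 1, 1, 1.0),
    ComponentUsage("Middle Initial", 1, 1, 1.0),
)
print(same_data_group(name_optional_initial, name_optional_initial))
print(same_data_group(name_optional_initial, name_required_initial))
```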
A data element or data group may be used by many data groups and entity types. A data
element or data group may even be used multiple times (for different purposes) by the
same data group or entity type.
When a data group is used by an entity type, it becomes a composite attribute of the entity
type. This means that the entity type contains elementary attributes for all data elements of
the tree structure for the data group.
The data inventory must be created by someone with detailed knowledge of the application
domain. Thus, the application domain expert must be involved in the creation of the data
inventory. However, he/she needs the help of the database designer. The database
designer knows best what is needed for database design. He/She knows the entity types of
the entity-relationship model that may contain the data elements or data groups and has a
better understanding of data types. The data inventory identifies the entity types using the
data elements or data groups and describes the abstract data types the data elements are
based upon.
According to the above, the application domain expert and the database designer must
jointly develop the data inventory.
The data inventory is input for the database designer who needs it during the later steps of
the design process. The data inventory also helps application programmers when
designing the application programs or queries for the business processes of the application
domain.

Contents of Data Inventory (1 of 3)
Abstract Data Types

For each data type:
• Signature: A unique name for the data type followed, in parentheses and
  separated by commas, by the parameters for the data type
• Values: A description of the values that data belonging to the data
  type can assume
  - For a finite number of values, a list of the possible values
  - Can be values of another data type or a subset thereof
  - Can be defined by a formula
  - Can be a textual description of the values
• Operations: A description of the operations that can be performed
  for data of the data type
  - Including operator name, operands, and results
  - Including Equal Comparison (=) determining when data
    of the data type are considered equal

Figure 5-4. Contents of Data Inventory (1 of 3) CF182.0

Notes:
The first component of the data inventory is the description of the abstract data types for
the application domain.
Data types describe the values data of that data type can assume and the operations that
can be performed with the data. Thus, by associating a data element with a data type, you
define the fundamental values and the operations for the data element.
In support of the SQL standard, all relational database management systems provide a set
of standard data types such as INTEGER, DECIMAL, CHARACTER, DATE, or TIME.
These standard data types are general-purpose data types covering many situations, but
are imprecise.
Abstract data types go beyond standard data types. You tailor them for your application
domain so they reflect the values the associated data elements can assume and the
operations that can be performed with the data elements.
As an example, take the employee numbers for pilots and mechanics of our sample airline
company called Come Aboard. They consist of digits, and you would be tempted to
assign them to the standard data type INTEGER. However, they are not really integers:
You should not perform the usual integer operations (such as integer addition and
subtraction) with them. Furthermore, they cannot be negative and leading zeros have a
meaning and should not be suppressed. You might suggest defining them as
CHARACTER data. However, this would cause a different problem: Employee numbers
could then contain letters, which is not correct either. The solution is an abstract data type
reflecting that employee numbers consist of digits and cannot be added or subtracted.
By implementing abstract data types, you can ensure that the values of the data of your
database are always "syntactically" correct. You can also prevent undesired or illegal
operations for them.
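To make the employee number example concrete, here is a minimal sketch of such an abstract data type at the application-program level. The class name EmployeeNumber and its interface are assumptions, not part of the course material. The sketch accepts only digits, preserves leading zeros, and deliberately defines no arithmetic operations.

```python
class EmployeeNumber:
    """Hypothetical abstract data type: a digit string without arithmetic."""

    def __init__(self, text):
        # Only the digits 0-9 are allowed; this also rejects signs and letters.
        if not (text.isascii() and text.isdigit()):
            raise ValueError("employee numbers consist of digits only")
        self._text = text  # stored as text so leading zeros are kept

    def __eq__(self, other):
        # Equal Comparison: digit-by-digit, leading zeros are significant.
        return isinstance(other, EmployeeNumber) and self._text == other._text

    def __hash__(self):
        return hash(self._text)

    def __str__(self):
        return self._text

    # No __add__ or __sub__: addition and subtraction are not defined.


number = EmployeeNumber("000123")
print(number)                                            # leading zeros kept
print(EmployeeNumber("0123") == EmployeeNumber("123"))   # different values

try:
    EmployeeNumber("12A4")
except ValueError as error:
    print("rejected:", error)

try:
    EmployeeNumber("1") + EmployeeNumber("2")
except TypeError:
    print("addition is not a defined operation")
```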
The same data type can be used by many data elements. Sometimes, different data
elements have only slightly different requirements on their data types raising the question:
can the same data type be used? For example, for two character-string type data elements,
only the allowable maximum length may be different.
To enable the common usage for slight differences, data types can be parameterized. For
each data element, you can specify different parameter values. For the character-string
example, the parameters could be the minimum length and the maximum length for the
respective character-strings.
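Parameterization can be sketched as one definition from which each data element derives its own check. The function make_string_type, its default values, and the two data elements below are hypothetical and serve only as an illustration.

```python
def make_string_type(minimum_length=1, maximum_length=None):
    """Return a validator for strings with the given length limits.

    The defaults (1 and unlimited) are assumptions of this sketch.
    """
    def validate(value):
        if len(value) < minimum_length:
            raise ValueError("string is shorter than the minimum length")
        if maximum_length is not None and len(value) > maximum_length:
            raise ValueError("string is longer than the maximum length")
        return value
    return validate


# Two hypothetical data elements sharing one parameterized data type.
validate_city_name = make_string_type(1, 30)
validate_remark = make_string_type(0, 200)

print(validate_city_name("Frankfurt"))
print(repr(validate_remark("")))

try:
    validate_city_name("")
except ValueError as error:
    print("rejected:", error)
```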
Data types can only be associated with data elements. They cannot be associated with
data groups since these just have a grouping function. Data groups may be composed of
many data elements all having different data types.
The data inventory should contain a description of all abstract data types for your
application domain. The descriptions of the data elements will refer to the data types.
Preferably, you should only use your own abstract data types. However, realistically, most
data inventories will also use the standard data types of the SQL standard. Include a list of
the standard data types used by your data elements.
For each abstract data type, you should describe the following:
• The Signature of the Abstract Data Type
The signature consists of a unique name for the data type followed, in parentheses and
separated by commas, by the parameters for the data type.
• The Values for the Abstract Data Type
The description of the values depends on the abstract data type:
- If the abstract data type can only assume a finite number of values, the values can
be listed.
- The values of the abstract data type may be the values of another abstract data
type, of a standard data type, or of a subset thereof. In this case, specify the
appropriate subset.
- The values of the abstract data type may be defined by a formula. This is especially
true for integer data which must satisfy a specific formula.
- Frequently, the values can only be described by text.
• The Operations for the Abstract Data Type
For each operation, specify the name of the operator, the operands, and the result of
the operation. You can provide them in a manner similar to that for the signature.
You always want to include the Equal Comparison (comparison for equality). This
operation determines whether or not, and thus when, two values of the data type are
considered equal. As we will see in the subsequent examples, multiple allowable values
of an abstract data type may be considered equal.
The abstract data types for your application domain must be implemented to become
effective. It depends on the features of the database management system if they can be
implemented as part of the database or must be implemented via application programs.
Sometimes, you want both: checks in the application programs to intercept invalid input as
soon as possible, and checks in the database to prevent update operations that directly use
the data manipulation language of the database management system from corrupting the data.

Sample Data Types (1 of 4)


Text Data
Signature:  TEXTDATA( [ minimum-length ] [ , maximum-length ] )
Values:     Any string of printable characters. Minimum-length and
            maximum-length specify how many characters the string
            has at least (default: 1) and at most (default: unlimited).
Operations:
• Normalize Text Data
  NORM(text-data-1) → text-data-2
  - Removes all leading and trailing blanks from text-data-1
  - Reduces intermediate groups of blanks for text-data-1 to a single blank each
• Equal Comparison
  EQUAL(text-data-1, text-data-2) → { TRUE | FALSE }
  - Normalizes text-data-1 and text-data-2 and compares them character by
    character
  - Result is TRUE if all characters are equal; FALSE otherwise

Figure 5-5. Sample Data Types (1 of 4) CF182.0

Notes:
The abstract data type described on the visual deals with text data, i.e., descriptive text
such as remarks added to the maintenance records for Come Aboard. The values consist
of arbitrary strings of printable characters. The strings must have at least as many
characters as specified by parameter minimum-length of the signature and not more than
specified by parameter maximum-length.
Both parameters of the signature are optional. If minimum-length is not specified, a default
minimum length of 1 is assumed. If maximum-length is not specified, the length of the
string is not limited.
Two operations are allowed for text data:
• Operation
NORM(text-data-1) → text-data-2
normalizes text-data-1. This means it removes leading and trailing blanks and replaces
intermediate groups of blanks by a single blank each. The result is again a text data
string, text-data-2.
For example, "   This   is    text ..." becomes "This is text ..." when normalized. (Note that
the surrounding double-quotes do not belong to the text. They are used here for clarity
purposes and delimit the text.)
• The second operation is the equal comparison for text data:
EQUAL(text-data-1, text-data-2) → {TRUE | FALSE}
Text-data-1 and text-data-2 are input for the operation. The operation normalizes both
strings and compares the normalized text data character by character. If the normalized
strings are character-wise identical, they are considered equal and the result is TRUE. If
they are character-wise different, the result is FALSE. In particular, the strings are not
considered equal if the corresponding normalized strings have a different length.
Accordingly, the strings " Equal strings" and "Equal strings " are considered equal.
In the database, you want text data to be stored in the normalized form. Normalization of
text data is important when searching for a specific string based on user input. User input
may not always be normalized.
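The two operations for text data can be sketched as follows. This is an illustration under assumptions, not the implementation of any database management system: the function names norm_text and equal_text are hypothetical, and the sketch treats only the space character as a blank.

```python
def norm_text(text):
    """NORM: remove leading/trailing blanks, reduce blank groups to one blank."""
    # Splitting on a single blank yields empty parts for every extra blank;
    # dropping them and rejoining leaves exactly one blank between words.
    return " ".join(part for part in text.split(" ") if part)


def equal_text(text_1, text_2):
    """EQUAL: compare the normalized strings character by character."""
    return norm_text(text_1) == norm_text(text_2)


print(repr(norm_text("   This   is    text ...")))
print(equal_text(" Equal strings", "Equal strings "))
print(equal_text("Equal strings", "Equal  string"))
```

Storing norm_text(value) rather than the raw input keeps stored text data in normalized form, so that a search based on normalized user input can find it.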


Sample Data Types (2 of 4)


Name Data
Signature: NAMEDATA( [ minimum-length ] [ , maximum-length ] )
Values: Any string of letters, blanks, and single dashes (-) or periods (.).
Minimum-length and maximum-length specify how many characters
the string has at least (default: 1) and at most (default: unlimited).
Operations:
Normalize Name Data
NORM(name-data-1) → name-data-2
• Removes all leading and trailing blanks from name-data-1
• Reduces intermediate groups of blanks in name-data-1 to a single blank each
• Uppercases all letters of name-data-1
Equal Comparison
EQUAL(name-data-1, name-data-2) → { TRUE | FALSE }
• Normalizes name-data-1 and name-data-2 and compares them character by character
• Result is TRUE if all characters are equal; FALSE otherwise
Figure 5-6. Sample Data Types (2 of 4) CF182.0

Notes:
The abstract data type described on the visual deals with name data such as the last name
or first name of a person. Names do not allow arbitrary characters. They allow letters,
blanks, and single dashes (-) or periods (.). Thus, the values for abstract data type Name
Data consist of strings of these characters of the specified minimum and maximum lengths.
For name data, you have the same operations as for text data. However, the normalization
of name data uppercases all letters. For example, "Miller", "MiLLer", and "miller" all become
"MILLER" when normalized. (Note that the surrounding double-quotes do not belong to the
text.)
As for text data, the equal comparison compares the normalized strings. Thus, "Miller",
"MiLLer", and "miller" are all considered equal.
In the database, you want name data to be stored in the normalized form. Normalization of
name data is important when searching for a specific name based on user input. User input
may not always be normalized.
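
A minimal Python sketch of the Name Data type follows. The names is_name_data, norm_name, and equal_name are hypothetical. Two points are assumptions of this sketch rather than statements of the course: "single dashes or periods" is read as "never two dashes or two periods in a row", and the length limits are checked against the value as given.

```python
from typing import Optional


def is_name_data(value: str, minimum_length: int = 1,
                 maximum_length: Optional[int] = None) -> bool:
    """Check a value against NAMEDATA(minimum-length, maximum-length)."""
    if len(value) < minimum_length:
        return False
    if maximum_length is not None and len(value) > maximum_length:
        return False
    # Assumption: "single" dashes or periods means no doubled ones.
    if "--" in value or ".." in value:
        return False
    return all(ch.isalpha() or ch in " -." for ch in value)


def norm_name(value: str) -> str:
    """NORM for name data: trim, collapse groups of blanks, uppercase."""
    collapsed = " ".join(piece for piece in value.split(" ") if piece)
    return collapsed.upper()


def equal_name(name_1: str, name_2: str) -> bool:
    """EQUAL for name data: compare the normalized strings."""
    return norm_name(name_1) == norm_name(name_2)
```

With these definitions, "Miller", "MiLLer", and "miller" all normalize to "MILLER" and therefore compare as equal.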

Sample Data Types (3 of 4)

Alphanumeric String
Signature: ALPHANUMERIC( [ minimum-length ] [ , maximum-length ] )
Values: Any string of alphanumeric characters. Minimum-length and
maximum-length specify how many characters the string must
have at least (default: 1) and at most (default: unlimited).
Operations:
Equal Comparison
EQUAL(alphanumeric-1, alphanumeric-2) → { TRUE | FALSE }
• Compares alphanumeric-1 and alphanumeric-2 character by character
• Result is TRUE if all characters are equal; FALSE otherwise

Figure 5-7. Sample Data Types (3 of 4) CF182.0

Notes:
The abstract data type on this visual deals with alphanumeric strings. They consist of
letters and digits. The aircraft serial numbers for Come Aboard are an example. Again, the
abstract data type is parameterized: The minimum and maximum lengths for the
alphanumeric string can be specified.
For this abstract data type, lowercase letters and uppercase letters are considered different
since a normalization operation has not been provided. Leading or trailing blanks are not
allowed. They are considered improper input.
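
As a sketch, the Alphanumeric String type could be checked as shown below. The function names are hypothetical, the sample serial number is invented, and restricting "letters and digits" to ASCII characters is an assumption of this sketch.

```python
from typing import Optional


def is_alphanumeric(value: str, minimum_length: int = 1,
                    maximum_length: Optional[int] = None) -> bool:
    """Check a value against ALPHANUMERIC(minimum-length, maximum-length)."""
    if len(value) < minimum_length:
        return False
    if maximum_length is not None and len(value) > maximum_length:
        return False
    # Blanks fail this test, so leading or trailing blanks are rejected
    # as improper input instead of being trimmed.
    return value.isascii() and all(ch.isalnum() for ch in value)


def equal_alphanumeric(value_1: str, value_2: str) -> bool:
    """EQUAL without normalization: uppercase and lowercase letters differ."""
    return value_1 == value_2
```

For example, is_alphanumeric("N123AB4567", 10, 10) accepts the invented 10-character serial number, whereas the same check rejects " N123AB456" because of the leading blank.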

Sample Data Types (4 of 4)

Airport Code
Signature: AIRPORT CODE
Values: One of the following acronyms:
ATL, CDG, DFW, FCO, FRA, JFK, LAS, LAX, MAD, ORD,
SAN, SFO, SJC, STR, ZRH, ...
Operations:
Equal Comparison
EQUAL(airport-code-1, airport-code-2) → { TRUE | FALSE }
• Compares airport-code-1 and airport-code-2 character by character
• Result is TRUE if all characters are equal; FALSE otherwise

Figure 5-8. Sample Data Types (4 of 4) CF182.0

Notes:
The values of the abstract data type on this visual are the international codes for the
airports serviced by Come Aboard. Thus, they form a finite set of values that can be listed. As
indicated by the ellipsis (three dots) at the end, only a few values are shown on the visual.
The abstract data type is not parameterized. It only supports the equal comparison.
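
An enumerated type like this one can be sketched as a fixed set of values. The set below contains only the codes shown on the visual; the codes hidden behind the ellipsis are not filled in, so the set is deliberately incomplete. The names AIRPORT_CODES, is_airport_code, and equal_airport_code are hypothetical.

```python
# Only the codes listed on the visual; the full list is elided there.
AIRPORT_CODES = frozenset({
    "ATL", "CDG", "DFW", "FCO", "FRA", "JFK", "LAS", "LAX",
    "MAD", "ORD", "SAN", "SFO", "SJC", "STR", "ZRH",
})


def is_airport_code(value: str) -> bool:
    """A value is valid only if it is one of the enumerated codes."""
    return value in AIRPORT_CODES


def equal_airport_code(code_1: str, code_2: str) -> bool:
    """EQUAL for airport codes: plain character-by-character comparison."""
    if not (is_airport_code(code_1) and is_airport_code(code_2)):
        raise ValueError("both operands must be valid airport codes")
    return code_1 == code_2
```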

Contents of Data Inventory (2 of 3)
Data Elements and Data Groups
For each data element/data group:
Name: A unique identifier for the data element or data group
One or more words
Each word starting with a capital letter except for connecting
words such as of or for
As natural as possible for application domain
Type: Data element or data group
Description: A detailed textual description of the meaning of the data
element or data group for the application domain
As precise as possible to avoid synonyms and homonyms
Data Type: If data element, data type for data element
Lengths: If data element:
Minimum Length, Maximum Length, Average Length, Number of Digits, Decimal Places
Domain: If data element, value constraints for data element over
and above those imposed by data type
Figure 5-9. Contents of Data Inventory (2 of 3) CF182.0

Notes:
The second component of the data inventory is the inventory of the data elements and data
groups for the application domain. For the application domain, data elements are indivisible
pieces of information. Data groups are groups of logically related data elements or data
groups as explained before.
For each data element or data group, you should provide the following basic items:
Name
The unique name for the data element or data group. Each data element and data group
receives a unique name. The name should clearly express the meaning of the data
element or data group for the application domain. It may consist of multiple words. We will
start each word with a capital letter except for connecting words such as of or for.
The names are used by the business processes, described in the process inventory, to
refer to the data elements and data groups they need.

Type
The type of the object described, i.e., whether the object is a data element or a data group. Thus,
the proper values are data element and data group, respectively.
Description
A detailed textual description of the meaning of the data element or data group for the
application domain. The description should be as precise as possible to avoid synonyms
and homonyms.
Synonyms are data elements or data groups that have different names but denote the
same object. Synonymous data elements or data groups can lead to the same information
being stored multiple times, i.e., to the redundant storage of information.
Homonyms are data elements or data groups whose names and descriptions are
ambiguous, so that they can be interpreted to mean different things. Their names
and descriptions should be made unambiguous. If necessary, they must be split into
multiple data elements or data groups.
Data Type
For data elements, the data type (standard data type or abstract data type) of the data
element including the applicable values for parameters of the data type.
For data groups, this item is not applicable. It should either be marked as not applicable or
be omitted.
Lengths
For data elements, the lengths applicable for them. For string-type data elements,
important lengths are: their minimum length, their maximum length, and their average
length. The average length is important for estimates made by the database administrator
when allocating space for tables.
For numbers, important values are: their number of digits and, if applicable, the number of
decimal places.
If the data type for the data element is parameterized, some of these values may already
have been specified as parameters for the data type.
For integers, you may prefer to specify a domain, that is, the range of values that can be
assumed. The number of digits can then be derived from the specified range.
For data groups, this item is not applicable. It should either be marked as not applicable or
be omitted.
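
Deriving the number of digits from an integer range is simple arithmetic. The sketch below shows one way to do it; the function name digits_for_range is hypothetical, and a sign, if any, is not counted as a digit.

```python
def digits_for_range(lowest: int, highest: int) -> int:
    """Number of digits needed to represent every integer in the range."""
    if lowest > highest:
        raise ValueError("lowest must not exceed highest")
    # The bound with the larger magnitude determines the digit count.
    return max(len(str(abs(lowest))), len(str(abs(highest))))
```

For a range of 0 to 4, a single digit is sufficient.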
Domain
For data elements, value constraints over and above those implied by the data type for the
data element.
If you use abstract data types extensively and correctly, you probably will not need
additional value constraints.

If you use standard data types, such as INTEGER, you might want additional value
constraints such as the minimum and maximum values the data element can assume.
If multiple data elements basically have the same data type and only differ marginally in
their allowable values, you might decide to use a single abstract data type covering the
values of all data elements concerned and define value constraints for the various data
elements.
For data groups, this item is not applicable. It should either be marked as not applicable or
be omitted.
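
The basic items of a data inventory entry can be pictured as a simple record. The sketch below is one possible representation, not a format prescribed by the course. The class and field names are hypothetical, and the rule that a data element must name a data type is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lengths:
    minimum_length: Optional[int] = None
    maximum_length: Optional[int] = None
    average_length: Optional[float] = None
    number_of_digits: Optional[int] = None
    decimal_places: Optional[int] = None


@dataclass
class InventoryEntry:
    name: str
    kind: str                       # "data element" or "data group"
    description: str
    data_type: Optional[str] = None
    lengths: Optional[Lengths] = None
    domain: Optional[str] = None

    def problems(self) -> List[str]:
        """List violations of the rules for the basic items."""
        found = []
        if self.kind not in ("data element", "data group"):
            found.append("Type must be data element or data group")
        if self.kind == "data group":
            # Data Type, Lengths, and Domain do not apply to data groups.
            if self.data_type or self.lengths or self.domain:
                found.append("Data Type, Lengths, and Domain are not applicable")
        elif self.kind == "data element" and not self.data_type:
            found.append("a data element needs a data type")
        return found
```

An entry for a last name with data type NAMEDATA(1, 60) and the matching lengths would pass the check with an empty list of problems.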

Contents of Data Inventory (3 of 3)


Data Groups: For each data group the data element or data group
belongs to and for each role it plays in the data group:
Data Group Name | Role Description | Role Name | Cardinality (Min, Max, Avg)
May play different roles in same data group
Cardinality may be different for each data group and role

Entity Types: For each entity type the data element or data group
belongs to and for each role it plays for the entity type:
Entity Type Name | Role Description | Role Name | Cardinality (Min, Max, Avg)

Entity-Relationship Model

Completeness Check
for Entity-Relationship Model
Figure 5-10. Contents of Data Inventory (3 of 3) CF182.0

Notes:
In addition to the basic items on the previous visual, you should provide the following items
for a data element or data group:
Data Groups
The data element or data group being described may be a component of other data
groups. It may belong to a data group more than once, but only in different roles. For each
data group the data element or data group belongs to and each role played, provide the
following:
• The name of the owning data group.
• The role played in the other data group. Provide a textual description of the role and the
name the data element or data group assumes in that role. Description and name need
only be provided if they are different from the fundamental purpose and name of the data
element or data group.
• The cardinality of the data element or data group for the role in the data group. Provide
minimum, maximum, and average cardinality. This means, specify how many values the
data element or data group must assume at least and at most and will assume on
average for this usage. If the maximum cardinality is not limited, use an asterisk (*).
Entity Types
Most of the time, data elements or data groups are used directly by entity types and
not indirectly through data groups. If the data element or data group is a direct attribute of
an entity type, provide the following:
• The name of the entity type. The entity-relationship model helps you determine the
appropriate entity types.
• The role played for the entity type. Provide a textual description of the role and the name
the data element or data group assumes in that role. Description and name need only be
provided if they are different from the fundamental purpose and name for the data
element or data group.
• The cardinality of the data element or data group for the role it plays in the entity type.
Provide minimum, maximum, and average cardinality. This means, specify how many
values the data element or data group must assume at least and at most and will
assume on average for this usage. If the maximum cardinality is not limited, use an
asterisk (*).
Do not provide an entry for this item if the data element or data group is not used
directly by entity types.

By determining the entity types for data elements and data groups, you verify the
completeness of the entity-relationship model. If you find a data element or data group that
cannot be associated with another data group or an entity type, the entity-relationship
model is incomplete and must be corrected.
Relationship types are not of interest in this context because their defining attributes are
derivatives of the keys of their source and target.
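
The completeness check described above lends itself to a small sketch: every entry must be used by at least one data group or entity type. The class names Usage and Entry, the function unassociated, and the entry "Orphan Element" in the usage note are hypothetical. A maximum of None stands for the asterisk (*).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Usage:
    owner: str                       # owning data group or entity type
    minimum: int
    maximum: Optional[int]           # None means unlimited (*)
    average: float
    role: Optional[str] = None       # only if it differs from the default
    role_name: Optional[str] = None


@dataclass
class Entry:
    name: str
    data_groups: List[Usage] = field(default_factory=list)
    entity_types: List[Usage] = field(default_factory=list)


def unassociated(entries: List[Entry]) -> List[str]:
    """Names of entries tied to neither a data group nor an entity type.
    A non-empty result signals an incomplete entity-relationship model."""
    return [e.name for e in entries
            if not e.data_groups and not e.entity_types]
```

If the inventory held Last Name (used by data group Name of Person), Name of Person (used by entity type EMPLOYEE), and an entry "Orphan Element" without any usage, the check would report only "Orphan Element".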

Sample Data Elements and Groups (1 of 7)

Last Name

Name Last Name


Type Data element
Description Last name of a person
Data Type NAMEDATA(1, 60)
Lengths Minimum Length: 1
Maximum Length: 60
Average Length: 8
Number of Digits: -
Decimal Places: -
Domain
Data Groups Name of Person
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1
Entity Types

Figure 5-11. Sample Data Elements and Groups (1 of 7) CF182.0

Notes:
The above visual illustrates a data element for our sample airline company. The data
element is a component of a data group. It is not used as direct attribute of an entity type.
The data element is called Last Name and represents the last name of a person (for
example, a pilot or mechanic). The data type for the data element is Name Data, defined
before as an abstract data type. The signature NAMEDATA(1, 60) specifies that a last
name must consist of at least one character and must not have more than 60 characters.
The abstract data type is described on page 5-14 (Figure 5-6, Sample Data Types (2 of 4)).
The lengths relevant for last names are the minimum length, the maximum length, and the
average length. Minimum length and maximum length must be the same as in the
signature for the data type.
There are no value restrictions beyond those for name data.
The data element is a component of data group Name of Person described on the next
visual. For each instance of the data group, it may assume one and only one value.
Therefore, Minimum, Maximum, and Average all have the value 1. Role and Role Name
have not been provided. The data element plays its fundamental role and is used under its
defined name (Last Name) in the data group. Repeating the name and description of the
data element would be redundant.
The data element is not used as a direct attribute by any entity type.

Sample Data Elements and Groups (2 of 7)

Name of Person

Name Name of Person


Type Data Group
Description Full name of a person consisting of last name, first name, and middle initial
Data Type N/A
Lengths N/A
Domain N/A
Data Groups
Entity Types EMPLOYEE
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1

Figure 5-12. Sample Data Elements and Groups (2 of 7) CF182.0

Notes:
This visual describes data group Name of Person, to which data element Last Name belongs.
The data group represents the full name for a person consisting of the last name, first
name, and middle initial for the person. The data inventory must contain descriptions for
the appropriate data elements. We have seen the description for data element Last Name.
Items Data Type, Lengths, and Domain do not apply to data groups.
Data group Name of Person is not itself a component of another data group. It is used as a
direct (composite) attribute by entity type EMPLOYEE. For each entity instance, it assumes
one and only one value (minimum cardinality = maximum cardinality = average cardinality
= 1).

Sample Data Elements and Groups (3 of 7)
Aircraft Number

Name Aircraft Number


Type Data Element
Description Universal serial number for aircraft
Data Type ALPHANUMERIC(10, 10)
Lengths Minimum Length: 10
Maximum Length: 10
Average Length: 10
Number of Digits: -
Decimal Places: -
Domain
Data Groups
Entity Types AIRCRAFT
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1
MAINTENANCE RECORD
Role: Aircraft for maintenance record
Role Name: Aircraft Number
Cardinality: Minimum = 1, Maximum = 1, Average = 1

Figure 5-13. Sample Data Elements and Groups (3 of 7) CF182.0

Notes:
Data element Aircraft Number, the universal serial number for aircraft, is an
alphanumeric string of 10 characters (data type ALPHANUMERIC(10, 10)). Therefore,
Minimum Length, Maximum Length, and Average Length all have the same value, 10.
The data element is not a component of a data group. It is used as a direct attribute
by entity types AIRCRAFT and MAINTENANCE RECORD.
For entity type AIRCRAFT, it is the unique identifier for the various aircraft that Come
Aboard owns. Since it plays a single role for the entity type, its fundamental role, the data
element need not be named differently. As a unique identifier, the data element assumes one
and only one value for every instance of entity type AIRCRAFT.
For entity type MAINTENANCE RECORD, the data element represents the aircraft serial
number of the aircraft for the maintenance record. In this case as well, there is no need to
rename the data element since it is used in a single role by the entity type. Its original name
clearly expresses the purpose it is used for. Since every maintenance record contains one
and only one aircraft number, minimum, maximum, and average cardinality are all 1.

Sample Data Elements and Groups (4 of 7)

Engine Number

Name Engine Number


Type Data Element
Description Serial number for an engine of an aircraft
Data Type ALPHANUMERIC(8, 12)
Lengths Minimum Length: 8
Maximum Length: 12
Average Length: 10
Number of Digits: -
Decimal Places: -
Domain
Data Groups Engine
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1
Entity Types

Figure 5-14. Sample Data Elements and Groups (4 of 7) CF182.0

Notes:
This visual illustrates another data element for Come Aboard, the serial number for aircraft
engines.
Data element Engine Number uses the same abstract data type as data element Aircraft
Number, but with different parameter values. Whereas aircraft serial numbers are
10 characters long, engine serial numbers may consist of 8 to 12 alphanumeric characters.
Using the same data type is perfectly all right as long as you want to allow the various
data elements to be compared with each other. If you do not want aircraft serial numbers
to be compared with engine serial numbers, you should define two different abstract data
types.
Engine Number is a component of data group Engine. It is not used by other data groups or
directly by entity types. For each instance of data group Engine, Engine Number assumes
one and only one value.
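
The advice to define two different abstract data types when values must not be compared can be sketched as follows. The class names, the sample serial numbers, and the decision to raise TypeError on a cross-type comparison are all choices made for this illustration; they are not taken from the course.

```python
class SerialNumber:
    """Common checks for the two hypothetical serial-number types."""
    minimum_length = 1
    maximum_length = None

    def __init__(self, value: str) -> None:
        too_short = len(value) < self.minimum_length
        too_long = (self.maximum_length is not None
                    and len(value) > self.maximum_length)
        if too_short or too_long or not (value.isascii() and value.isalnum()):
            raise ValueError(f"improper {type(self).__name__}: {value!r}")
        self.value = value

    def __eq__(self, other: object) -> bool:
        # Values of different abstract data types must not be compared.
        if type(other) is not type(self):
            raise TypeError("cannot compare values of different data types")
        return self.value == other.value

    def __hash__(self) -> int:
        return hash((type(self).__name__, self.value))


class AircraftNumber(SerialNumber):
    minimum_length = 10
    maximum_length = 10


class EngineNumber(SerialNumber):
    minimum_length = 8
    maximum_length = 12
```

Two AircraftNumber values can be compared as usual, whereas comparing an AircraftNumber with an EngineNumber fails immediately instead of silently returning FALSE.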

Sample Data Elements and Groups (5 of 7)

Engine

Name Engine
Type Data Group
Description An engine for an aircraft
Data Type N/A
Lengths N/A
Domain N/A
Data Groups
Entity Types AIRCRAFT
Role: -
Role Name: -
Cardinality: Minimum = 0, Maximum = 4, Average = 2

Figure 5-15. Sample Data Elements and Groups (5 of 7) CF182.0

Notes:
Data group Engine is the data group containing data element Engine Number. It has additional
components such as the type of the engine and information about the manufacturer for the
engine. Engine is a repetitive group for entity type AIRCRAFT. This is because an aircraft
may have multiple engines mounted. Consequently, for each instance of entity type
AIRCRAFT, Engine may assume multiple values each of which is composed of appropriate
values for the components of Engine.
The minimum cardinality of 0 signals that aircraft need not have engines mounted. The
maximum cardinality of 4 specifies that an aircraft cannot have more than four engines
mounted. The average cardinality of 2 indicates that, on the average, an aircraft has two
engines mounted.

Sample Data Elements and Groups (6 of 7)

Manufacturer

Name Manufacturer
Type Data Group
Description All information concerning a manufacturer (e.g., manufacturer code, company
name, complete address, and phone number)
Data Type N/A
Lengths N/A
Domain N/A
Data Groups Engine
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1
Entity Types AIRCRAFT TYPE
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1

Figure 5-16. Sample Data Elements and Groups (6 of 7) CF182.0

Notes:
Manufacturer is a data group consisting of all information pertaining to a manufacturer. In
particular, it includes:
• the manufacturer code (a unique identification for the manufacturer)
• the name of the manufacturer's company
• the address of the manufacturer
• the phone number of the manufacturer
The address of the manufacturer is again a data group.
Data group Manufacturer is a component of data group Engine of entity type AIRCRAFT. It
also is a direct attribute of entity type AIRCRAFT TYPE.
This example illustrates a hierarchy of data groups:
Address → Manufacturer → Engine
Address is a data group representing an address. Assuming that Address consists of the
data elements Street Address, Post Office Box, City, State, Country, and Postal Code, the
tree structure for Engine looks as follows:

Engine
    Engine Number
    Engine Type
    Manufacturer
        Manufacturer Code
        Company Name
        Address
            Street
            Post Office Box
            City
            State
            Country
            Postal Code
        Phone Number
Engine, Manufacturer, and Address are data groups. Indentation indicates the next level of the
tree structure. Items with the same indentation are on the same level of the tree structure.
Since data group Engine is used as a composite attribute by entity type AIRCRAFT, the
data elements at the terminal nodes become elementary attributes of entity type
AIRCRAFT.
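
How the terminal nodes of such a tree become elementary attributes can be sketched with a short recursive function. The representation chosen here (a data group as a pair of name and component list, a data element as a plain string) and the function name elementary_attributes are assumptions made for this illustration.

```python
# A data group is (name, [components]); a data element is a plain string.
ENGINE = ("Engine", [
    "Engine Number",
    "Engine Type",
    ("Manufacturer", [
        "Manufacturer Code",
        "Company Name",
        ("Address", [
            "Street", "Post Office Box", "City",
            "State", "Country", "Postal Code",
        ]),
        "Phone Number",
    ]),
])


def elementary_attributes(node):
    """Return the data elements at the terminal nodes, in tree order."""
    if isinstance(node, str):
        return [node]                # a data element is a terminal node
    _name, components = node
    leaves = []
    for component in components:
        leaves.extend(elementary_attributes(component))
    return leaves
```

Applied to ENGINE, the function returns the eleven data elements from Engine Number through Phone Number; the data group names themselves do not appear in the result.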

Sample Data Elements and Groups (7 of 7)

Number of Engines

Name Number of Engines


Type Data Element
Description Number of engines that an aircraft model may have
Data Type INTEGER
Lengths Determined by domain
Domain 0-4
Data Groups
Entity Types AIRCRAFT TYPE
Role: -
Role Name: -
Cardinality: Minimum = 1, Maximum = 1, Average = 1

Figure 5-17. Sample Data Elements and Groups (7 of 7) CF182.0

Notes:
In the illustrated example, data element Number of Engines is associated with standard
data type INTEGER which is not parameterized. This means that a value range cannot be
specified for the data type. However, the minimum value that Number of Engines can
assume is 0. The maximum value is 4. To indicate this, you can use a domain specification
as done in the example. The domain specification must be implemented by database
functions such as check constraints, if available, or by checking user input.
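
As an illustration of implementing a domain through a check constraint, the sketch below uses SQLite through Python's standard sqlite3 module. SQLite is used only because it ships with Python; it is not the database product this course addresses. The table name, the column names, and the helper functions are hypothetical.

```python
import sqlite3


def create_aircraft_type_table(connection: sqlite3.Connection) -> None:
    """Create a table whose CHECK constraint enforces the domain 0-4."""
    connection.execute(
        """
        CREATE TABLE aircraft_type (
            type_code         TEXT PRIMARY KEY,
            number_of_engines INTEGER NOT NULL
                CHECK (number_of_engines BETWEEN 0 AND 4)
        )
        """
    )


def try_insert(connection: sqlite3.Connection,
               type_code: str, number_of_engines: int) -> bool:
    """Insert a row; return False if the database rejects the value."""
    try:
        with connection:             # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO aircraft_type VALUES (?, ?)",
                (type_code, number_of_engines),
            )
    except sqlite3.IntegrityError:   # raised when a CHECK constraint fails
        return False
    return True
```

Enforcing the domain in the database guards against every program that writes to the table, whereas checking user input protects only the program that performs the check.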

Methods for Establishing a Data Inventory

Survey of departments of expertise

Review of existing data and programs

Parallel development with process inventory

Use a single method or a combination thereof

Figure 5-18. Methods for Establishing a Data Inventory CF182.0

Notes:
There are many ways to establish a data inventory. However, there are three methods that
are normally considered when creating a data inventory:
• You can survey the departments of expertise and ask their members for the data
elements and data groups needed by the application domain.
• If available, you can review existing data (in files or databases) and programs to
determine the data elements and data groups needed for the application domain.
• You can develop the data inventory in parallel with the process inventory which
describes the business processes for the application domain. As a business process is
described, the data elements and data groups it uses become apparent.
You can use one of these methods or a combination thereof. The advantages and
disadvantages of these methods are discussed next.

Survey of Departments of Expertise


Application domain expert asks members of
departments of expertise for data needed

Results depend on:
• Ability of domain expert to extract information from members of departments
• Ability of members of departments to communicate their expertise
• Willingness of members of departments to cooperate with application domain expert
Easy to forget something
Easily results in superfluous data elements and data groups in data inventory
One-time effort: later changes not reflected in data inventory

Only auxiliary method


Figure 5-19. Survey of Departments of Expertise CF182.0

Notes:
This method suggests that the application domain expert asks the members of the
departments of expertise for the data needed by their tasks.
The quality of the result depends on several communicative factors:
• It depends on the ability of the domain expert to extract the proper information from the
members of the departments of expertise. From the discussions in the unit so far, he/she
should know which information is needed. However, the answers received are frequently
not very well structured. They must be scrutinized and filtered to reveal the actual facts.
• It depends on the ability of the interviewees to tell the application domain expert precisely
what is needed. The members of the departments of expertise are not computer
experts. Frequently, they do not have a feeling for the information needed.
• It depends on the willingness of the members of the departments of expertise to
cooperate with the domain expert. Since they do not see a direct benefit for them, they
may find it tiresome and annoying to be involved in the interviews. Their willingness
largely depends on the pressure they are under as a consequence of their actual work.

Even if the above-mentioned problems do not occur, there are some other pitfalls with this
technique. The approach is fairly unstructured and, during a discussion, it is very easy to
forget something as you probably know from your own experience.
Conversely, during a discussion, things may surface that exist only in the mind or
imagination of the interviewee rather than being facts. This may lead to superfluous data elements
and data groups in the data inventory and, thus, unnecessary fields in the database being
designed.
A further disadvantage of this method is that it is a one-time effort. Consequently, later
changes (e.g., new data elements or extensions of data groups) are not reflected in the
data inventory.
In summary, surveying the departments of expertise is an auxiliary method rather than
the primary method to be used. Together with other methods, it may be quite helpful,
especially since it promotes contact with the members of the departments of expertise, the actual
"customers" of the database being developed.

Review of Existing Data and Programs

Existing data files (on paper, in flat files, etc.) and program
listings are screened for data of the application domain

May have to investigate a great variety of files and documents
Result depends on quality of documentation of data and programs
May easily overlook some data
Must check if data found are:
• Relevant for application domain
• Implementation-dependent

Feasible, but beware of implementation-dependent data elements or data groups
Figure 5-20. Review of Existing Data and Programs CF182.0

Notes:
This method screens existing data files (which may be on paper, in flat files, or in old
databases) and program listings for data used by the application domain. From the data
found, the data elements or data groups for the application domain are derived and
registered in the data inventory.
For a large application domain, a great number and variety of files and documents may
have to be inspected. This is not a problem as such because, whatever you do to arrive at
a data inventory, it will cost you considerable effort; otherwise, the data inventory will be
incomplete. The success depends on the availability and the quality of the documentation
of the data and the programs. The poorer the documentation, the more effort you must put
in.
The amount of information to be scanned may cause potential data elements or data
groups to be overlooked. Conversely, you may find data elements or data groups that are
not really objects of the application domain, but caused by the particular implementation
used so far. You do not want these data elements and data groups in the data inventory.

Summarizing the preceding points, we can say that this is a feasible method provided the
required information is available. However, you must be wary of implementation-dependent
data elements or data groups and ignore them.

Coupling of Data and Process Inventories


Jointly develop and maintain process and data inventories

Process inventory contains a description of all business processes for application domain
For each business process, process inventory lists all data used by process
Data elements and data groups identified for processes are described in data inventory
Including role of data element or data group
If process changes, data inventory is updated

Responsibility where it belongs: with processes
Only data needed by processes in data inventory
Figure 5-21. Coupling of Data and Process Inventories CF182.0

Notes:
The method discussed on this visual synchronizes the data inventory with the process
inventory. It couples the development and maintenance of the data and process
inventories. The process inventory contains a detailed description of the business
processes for the application domain. It is input for application programmers and enables
them to write the programs supporting the application domain. As we will see later in this
unit, the description of the business processes includes, for each business process, a list of
the data elements and data groups used.
As a business process is described or changed, the affected data elements or data groups
are described or their description is updated in the data inventory. Thus, the process
inventory and the data inventory remain synchronized at all times. If processes require new
data elements or data groups, they are associated with the proper entity types as described
before. If an entity type cannot be found for a data element or data group, the
entity-relationship model must be changed as well. As you can imagine, this will result in
changes for your database.

The advantage of this approach is that it leaves the responsibility for the data elements and
data groups where it belongs, namely, with the business processes. As a consequence,
the data inventory contains the data elements and data groups for the documented
processes and only for those. These may be existing or planned business processes. A
positive side effect is that planned business processes must have materialized at least far
enough to be documented; they can no longer be vague ideas in some people's heads
that are never realized.
The discussed method ensures that all needed data elements and data groups are in the
data inventory and, thus, will be in the database.
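
The coupling of the two inventories can be sketched in a few lines of Python. This is only an illustration, not part of the method as taught: the class names DataInventory and ProcessInventory, their methods, and the abridged sample entries are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataInventory:
    """Maps a data element or data group name to the roles it plays."""
    entries: dict[str, set[str]] = field(default_factory=dict)

    def describe(self, name: str, role: str) -> None:
        # Create the entry on first use; otherwise only add the new role.
        self.entries.setdefault(name, set()).add(role)


@dataclass
class ProcessInventory:
    """Business process descriptions, coupled to one data inventory."""
    data_inventory: DataInventory
    processes: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def describe_process(self, title: str, data_used: list[tuple[str, str]]) -> None:
        # Describing (or changing) a process updates the data inventory in
        # the same operation, so the two inventories stay synchronized.
        self.processes[title] = list(data_used)
        for name, role in data_used:
            self.data_inventory.describe(name, role)


data_inventory = DataInventory()
process_inventory = ProcessInventory(data_inventory)
process_inventory.describe_process(
    "Assign Captain for Flight",
    [
        ("Flight Number", "flight number for a flight"),
        ("Employee Number", "employee number for a pilot"),
    ],
)
```

Because entries are only ever created through describe_process, the data inventory in this sketch contains data for documented processes and for nothing else. Removing data when a process is dropped is deliberately left out.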

5.2 Process Inventory


Process Inventory - Purpose and Responsibilities


Detailed description of all business
processes for application domain
Strictly business oriented
Implementation independent
Created by application domain expert
Input for application programmers
Must allow application programmers to:
Understand processes for application domain
Develop programs for processes
Must identify all data for processes
References to data elements and data groups in
data inventory
Data inventory also input for application programmers
Allows verification of entity-relationship model
Database designer not involved
Figure 5-22. Process Inventory - Purpose and Responsibilities CF182.0

Notes:
As already mentioned before, the process inventory contains a detailed description of all
business processes for the application domain. The descriptions should be completely
business oriented. They should be independent of any implementation considerations.
The process inventory is established by the application domain expert because he/she has
the overall knowledge of the application domain required. Of course, he/she needs to
discuss the business processes and verify their descriptions with the departments of
expertise. Whereas the members of the departments of expertise are frequently not willing
to discuss the data elements or data groups, they generally are interested in talking about
the business processes. The reason is that the business processes represent their daily
work. They want to ensure that their implementation makes their work as easy as possible.
The process inventory is input for the application programmers. It must allow them to
understand the business processes for the application domain and to develop the required
programs, queries, etc.
The descriptions for the business processes must identify all data elements and data
groups for the business processes. Since the data elements and data groups are process
independent, they are not described in the process inventory, but in the data inventory. The
business processes only refer to the data elements and data groups in the data inventory.
Therefore, the data inventory is also input for the application programmers.
When describing a business process, the application domain expert should verify that the
entity-relationship model contains all entity types and relationship types necessary for the
implementation of the business process. We will discuss this in more detail on one of the
subsequent visuals.
Except for assisting the application domain expert in verifying the entity-relationship model,
the database designer is not involved in the establishment of the process inventory.


Contents for a Business Process (1 of 2)


Title:               The unique title under which the business process is
                     known throughout the application domain

Purpose:             A short description of the purpose of the business
                     process for the application domain

Input:               A description of all data, including their role, being
                     external input for the business process
                         For example, data entered by the end user
                         For example, aircraft number for maintenance record and
                         not just aircraft number

Textual Description: A detailed textual description of the steps of the business
                     process
                         Needed for departments of expertise and end users

Formal Description:  A formal description of the conditions, rules, and actions
                     for the business process
                         For application programmers
                         For example, decision tables

Figure 5-23. Contents for a Business Process (1 of 2) CF182.0

Notes:
For each business process, the process inventory should contain the following items:
Title
The unique title under which the business process is known throughout the application
domain. The business process should be easily recognizable from the title.
Purpose
A short description of the purpose of the business process, i.e., an outline of what, from a
business perspective, the business process is supposed to achieve.
Input
A description of all data, including their role, that are external input for the business
process. This means a description of all data that are perceived as input by the (end) users
of the business process. In particular, these may be data entered by them in entry fields or
selected via check boxes, radio buttons, or combination boxes.

As mentioned, for each input, its role should be identified. For example, the description
should not just say "aircraft number", but rather "aircraft number for the maintenance record
of the specified aircraft". This is important for the application programmers for two reasons:
1. They may have to provide an appropriate description for the corresponding input field on
a window or in help information.
2. They need to know which data to access. As you know already from the
entity-relationship model, the aircraft number may occur in multiple entity types and,
thus, later on, in multiple tables.
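
To illustrate the second reason, the following Python sketch resolves a data element to an entity type by means of its role. The mapping is an assumption made for the example: entity types AIRCRAFT and MAINTENANCE RECORD appear in the sample entity-relationship model, but their attributes are not spelled out here, and entity_type_for and candidate_entity_types are hypothetical helpers.

```python
# (data element, role) -> entity type; illustrative entries only.
ROLE_TO_ENTITY_TYPE = {
    ("Aircraft Number", "aircraft number for an aircraft"): "AIRCRAFT",
    ("Aircraft Number", "aircraft number for a maintenance record"): "MAINTENANCE RECORD",
}


def entity_type_for(data_element: str, role: str) -> str:
    """Return the entity type to access for a data element in a given role."""
    try:
        return ROLE_TO_ENTITY_TYPE[(data_element, role)]
    except KeyError:
        raise LookupError(f"{data_element!r} has no role {role!r}") from None


def candidate_entity_types(data_element: str) -> list[str]:
    """Without a role, every entity type containing the element is a candidate."""
    return sorted(
        {
            entity_type
            for (element, _role), entity_type in ROLE_TO_ENTITY_TYPE.items()
            if element == data_element
        }
    )
```

With only the name "Aircraft Number", candidate_entity_types returns both entity types; the role is what lets the programmer pick one.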
Textual Description
A detailed textual description of the various steps of the business process. A textual
description is necessary since the description must be verified by the departments of
expertise and will be available to the users of the business process. Generally, a formal
description of the business process is not understood by these people. The textual
description also helps the database designer when verifying design steps by means of the
business processes described in the process inventory.
Formal Description
A formal description (for example, by means of decision tables) of the conditions, rules, and
actions for the business process. Application programmers may prefer such a formal
description over a textual description because it is more precise and, thus, eases their task.


Contents for a Business Process (2 of 2)


Output: A description of all data, including their role, which are
external output for the business process
For example, data displayed on a screen or in a listing
For example, airport code for airport of departure for leg and
not just airport code
Data Read: A description of all data elements or data groups internally
read by the business process
For each data element or data group, provide:
Its name in the data inventory
All purposes for which it is read (roles)
Data Written: A description of all data elements or data groups internally
written by the business process
For each data element or data group provide:
Its name in the data inventory
All purposes for which it is written (roles)
Others: Other items needed by application programmers such as
window formats or listing formats

Figure 5-24. Contents for a Business Process (2 of 2) CF182.0

Notes:
In addition to the items on the previous visual, the description for a business process
should contain the following items:
Output
A description of all data, including their role, which are external output for the business
process, i.e., perceived as output by the users of the business process. In particular, these
may be data displayed in a window or in a listing. It may also be something as abstract as
an interrelationship established by the business process (e.g., the assignment of an aircraft
to a flight) or a message.
Furthermore, the output may be conditional. This means, it can depend on the input
provided for the business process and on situations encountered during its execution.
As mentioned, for each output, its role should be identified as far as applicable. For
example, the description should not just say "airport code", but rather "airport code for
airport of departure for leg". Application programmers need this information to properly
describe the output on windows or in listings.

Data Read
A detailed description of all data elements or data groups read internally when the business
process is executed. For each data element or data group, provide its name in the data
inventory and all purposes it is read for by the business process. The purposes identify the
roles the data element or data group plays for the business process. The roles/purposes
are important for the correct assignment of the data element or data group to entity types in
the data inventory.
Data Written
A detailed description of all data elements or data groups written internally when the
business process is executed. For each data element or data group, provide its name in
the data inventory and all purposes it is written for by the business process. The purposes
identify the roles the data element or data group plays for the business process. The
roles/purposes are important for the correct assignment of the data element or data group
to entity types in the data inventory.
Others
The description may contain many other items such as window or listing formats. However,
these are not of immediate interest for the database designer and, therefore, not discussed
here.
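
The items listed on the two visuals can be collected in a simple record structure. The following Python sketch is one possible representation, not a format prescribed by this course; the class and field names are assumptions derived from the item titles, and the sample values are an abridged version of the sample business process shown on the following visuals.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataUse:
    """A data element or data group as used by a business process."""
    name: str               # its name in the data inventory
    roles: tuple[str, ...]  # all purposes for which it is read or written


@dataclass
class BusinessProcess:
    title: str
    purpose: str
    inputs: list[str]                 # external input, each stated with its role
    textual_description: list[str]    # one entry per step
    formal_description: str = ""      # e.g., a reference to a decision table
    outputs: list[str] = field(default_factory=list)
    data_read: list[DataUse] = field(default_factory=list)
    data_written: list[DataUse] = field(default_factory=list)
    others: dict[str, str] = field(default_factory=dict)

    def data_inventory_names(self) -> set[str]:
        """Names that must exist in the data inventory for this process."""
        return {use.name for use in self.data_read + self.data_written}


process = BusinessProcess(
    title="Assign Captain for Flight",
    purpose="Assign the specified pilot as captain to the specified flight.",
    inputs=["Flight number for flight", "Employee number for pilot to be assigned to flight"],
    textual_description=["It is verified that the specified flight and pilot exist."],
    data_read=[DataUse("Employee Number", ("employee number for a pilot",))],
    data_written=[DataUse("Pilot Function", ("pilot function for a pilot assignment",))],
)
```

Keeping only names and roles in data_read and data_written reflects the rule that the data elements themselves are described in the data inventory, not in the process inventory.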


Sample Business Process (1 of 5)

Business Processes

Assign Captain for Flight

Purpose: Assign the specified pilot as captain to the specified flight.

Input:   Identifies flight:
             Flight number for flight
             Airport of departure for flight
             Airport of arrival for flight
             Flight locator for flight
         Identifies pilot:
             Employee number for pilot to be assigned to flight

Figure 5-25. Sample Business Process (1 of 5) CF182.0

Notes:
The next few visuals show the description of a business process for our sample airline
company called Come Aboard. The business process assigns a pilot as captain to a flight.
Appropriately enough, the unique title under which the business process is known
throughout the application domain is Assign Captain for Flight.
Item Purpose explains in more detail what the business process will accomplish: It will
assign the specified pilot as captain to the specified flight.
The input for the business process must identify the flight as well as the pilot who becomes
the captain for the flight. To identify the flight, the flight number, the airport of departure, the
airport of arrival, and the locator for the flight must be provided. Note that the addendum
"for flight" in the visual identifies the role of the input. This is important because flight number,
airport of departure, and airport of arrival can be used to identify different things (e.g., legs
rather than flights).

Sample Business Process (2 of 5)
Textual Description: This business process performs the following operations:
    1 It is verified that the specified flight and pilot exist.
      If flight or pilot do not exist, an appropriate error message is displayed
      and the business process ends.
    2 If pilot and flight exist, it is checked if the pilot has the license to fly the
      aircraft model for the leg for the flight.
      If the pilot cannot fly the aircraft model, an appropriate error message is
      displayed and the business process ends.
    3 If the pilot has the license to fly the aircraft model, it is checked if the
      pilot has already been assigned to the flight.
      If the pilot is already captain or copilot for the flight, an appropriate
      message is displayed and the business process ends.
    4 If the pilot has not yet been assigned to the flight, it is checked if
      another pilot is already captain for the flight.
      If so, a message is displayed containing employee number, last name,
      and first name of the current captain and the business process ends.
    5 If a captain has not yet been assigned to the flight, the specified pilot
      becomes the captain for the flight.
    6 A message is displayed confirming that the pilot has been assigned as
      captain to the flight. The message includes employee number, last
      name, and first name of the assigned captain.
Figure 5-26. Sample Business Process (2 of 5) CF182.0

Notes:
This visual lists the individual steps that, from a business perspective, must be performed
by the business process. It does not describe an implementation. The implementation may
look completely different, and in this case it will: It can make use of two constraints of
the entity-relationship model provided these are implemented. As a consequence of the
constraints, much of the checking for the business process need not be implemented since it
is handled by the constraints.
Because the description is self-explanatory, we need not discuss it further here.
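
A formal description of the same steps could take the form of a decision table. The following Python sketch shows one hypothetical encoding: the condition names, the outcome labels, and the convention that the first matching rule wins are assumptions chosen for illustration. The rules appear in the order of the textual description.

```python
from typing import NamedTuple


class Facts(NamedTuple):
    flight_exists: bool
    pilot_exists: bool
    pilot_licensed_for_model: bool
    pilot_already_assigned: bool  # as captain or copilot
    other_captain_assigned: bool


# Each rule pairs the conditions that must hold with the resulting action.
# A condition that is not named in a rule is a "don't care" for that rule.
DECISION_TABLE = [
    ({"flight_exists": False}, "error: flight does not exist"),
    ({"pilot_exists": False}, "error: pilot does not exist"),
    ({"pilot_licensed_for_model": False}, "error: pilot lacks license"),
    ({"pilot_already_assigned": True}, "message: pilot already assigned"),
    ({"other_captain_assigned": True}, "message: show current captain"),
    ({}, "assign pilot as captain and confirm"),
]


def decide(facts: Facts) -> str:
    """Return the action of the first rule whose conditions all hold."""
    for conditions, action in DECISION_TABLE:
        if all(getattr(facts, name) == value for name, value in conditions.items()):
            return action
    raise AssertionError("unreachable: the last rule matches everything")
```

For example, decide(Facts(True, True, True, False, False)) selects the last rule, which corresponds to steps 5 and 6 of the textual description.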


Sample Business Process (3 of 5)


Output:  Flight number for flight
         Airport of departure for flight
         Airport of arrival for flight
         Flight locator for flight

         If another pilot has already been assigned as captain to the flight:
             Employee number of currently assigned captain
             Last name of currently assigned captain
             First name of currently assigned captain
             Employee number of pilot not assigned to flight

         If the specified pilot has been assigned as captain to the flight:
             Pilot assigned as captain to flight
             Employee number of newly assigned captain
             Last name of newly assigned captain
             First name of newly assigned captain

Figure 5-27. Sample Business Process (3 of 5) CF182.0

Notes:
This visual illustrates the external output for the sample business process assigning a pilot
as captain to a flight.
The flight information, i.e., the flight number, the airport of departure, the airport of arrival,
and the locator for the flight, is always returned as output. It was also input for the
business process. Again, the addendum "for flight" on the visual indicates the role of the
output.
The further output is dependent on conditions encountered during the execution of the
business process:
If another pilot has already been assigned as captain to the flight, employee number, last
name, and first name of that pilot and the employee number of the specified pilot are
returned. (As a consequence, the specified pilot was not assigned to the flight.)
If the specified pilot has been assigned as captain to the flight, his/her employee number,
last name, and first name are returned. In addition, a message is issued that the pilot has
been assigned successfully. Accordingly, the fact that the pilot has been assigned to the
flight is perceived as an output by the user of the business process.
You could think of further conditions resulting in different output. One such condition is that the
pilot has already been assigned as copilot to the flight. These conditions and their output
should be described as well. We have not done this here to keep the output for the sample
business process on a single visual.
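
The conditional nature of the output can be made explicit in code. In the Python sketch below, the function build_output, its parameters, the dictionary keys, and all sample values are hypothetical; only the two conditions shown on the visual are covered.

```python
from typing import Optional


def build_output(flight: dict, pilot: dict, current_captain: Optional[dict]) -> dict:
    """Assemble the external output for Assign Captain for Flight."""
    # The flight information is always returned.
    output = {
        "flight number for flight": flight["flight_number"],
        "airport of departure for flight": flight["departure"],
        "airport of arrival for flight": flight["arrival"],
        "flight locator for flight": flight["locator"],
    }
    if current_captain is not None:
        # Another pilot is already captain: the specified pilot is not assigned.
        output.update({
            "employee number of currently assigned captain": current_captain["employee_number"],
            "last name of currently assigned captain": current_captain["last_name"],
            "first name of currently assigned captain": current_captain["first_name"],
            "employee number of pilot not assigned to flight": pilot["employee_number"],
        })
    else:
        # The specified pilot has been assigned as captain.
        output.update({
            "message": "Pilot assigned as captain to flight",
            "employee number of newly assigned captain": pilot["employee_number"],
            "last name of newly assigned captain": pilot["last_name"],
            "first name of newly assigned captain": pilot["first_name"],
        })
    return output


# Made-up sample values.
flight = {"flight_number": "CA100", "departure": "FRA", "arrival": "JFK", "locator": "L1"}
pilot = {"employee_number": 1002, "last_name": "Smith", "first_name": "Bob"}
captain = {"employee_number": 1001, "last_name": "Miller", "first_name": "Ann"}

rejected = build_output(flight, pilot, current_captain=captain)
assigned = build_output(flight, pilot, current_captain=None)
```

Each key states the role of the value, mirroring the requirement that output is described with its role rather than by the bare data element name.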


A Walk Through the ER Model


[Entity-relationship diagram for the sample airline. Entity types shown: EMPLOYEE, PILOT, MECHANIC, AIRCRAFT TYPE, AIRCRAFT MODEL, AIRCRAFT, AIRPORT, LEG, FLIGHT, ITINERARY, PILOT ASSIGNMENT, MAINTENANCE RECORD.]

Figure 5-28. A Walk Through the ER Model CF182.0

Notes:
For each business process, you should verify the entity-relationship model for the
application domain. You should check if it contains all required entity types and relationship
types by scrutinizing all steps of the business process.
To determine the entity types needed, you must determine the data elements and data
groups used by the steps. Thus, in the course of the verification, you determine all data
elements and data groups read or written by the business process.
When verifying the entity-relationship model for a business process, you perform a walk
through the entity-relationship model and determine the view needed for the business
process.
We will do this now for the sample business process assigning the captain for a flight. The
steps of the business process have been described on visual Sample Business Process (2 of 5). They will be repeated
here as far as required for the understanding:
1. The first step of the business process verifies that the specified pilot and flight exist. If
not, an appropriate message is displayed and the business process ends.

The specified flight exists if entity type FLIGHT contains an entity instance for the
specified flight number, airport of departure, airport of arrival, and flight locator. Thus,
entity type FLIGHT must have attributes for the following data elements:

Entity Type       Data Element/Data Group
FLIGHT            Flight Number (flight number for a flight)
                  Airport Code (airport of departure for a flight)
                  Airport Code (airport of arrival for a flight)
                  Flight Locator (flight locator for a flight)


Their roles are specified in parentheses. As you can see, data element Airport Code is
used in two roles. Logically, these attributes are read from entity type FLIGHT to verify
that the specified flight exists.
The specified pilot exists if entity type PILOT contains an entity instance for the specified
employee number. Thus, entity type PILOT must have an attribute for data element
Employee Number:

Entity Type       Data Element/Data Group
PILOT             Employee Number (employee number for a pilot)

As we know, this attribute is the entity key.
As you would expect, for entity type PILOT, the data element plays the role of employee
number for a pilot. It is immaterial that the business process reads the employee
number for the pilot specified as input since being specified as input does not constitute
a characteristic of pilots. It only limits the entity instances read.
2. The second step of the business process checks if the specified pilot has the license to
fly the aircraft model for the leg for the flight. If he/she does not have the license, an
appropriate message is displayed and the business process ends.
To determine if the pilot has the required license for the flight, we must first determine
the leg for the flight and the aircraft model for the leg. Then, we must see if the specified
pilot has the license to fly the aircraft model we have determined. Using the
entity-relationship model, we can accomplish this by:
• Navigating from entity type FLIGHT to entity type LEG via relationship type
  FLIGHT_for_LEG to determine the leg for the flight.
• Navigating from entity type LEG to entity type AIRCRAFT MODEL via relationship type
  AIRCRAFT MODEL_for_LEG to find the aircraft model for the leg for the flight.
• Checking if relationship type PILOT_can_fly_AIRCRAFT MODEL contains an instance
  for the specified pilot and the aircraft model just determined.


Thus, the entity-relationship model includes all entity types and relationship types
required for the step.
Accessing a relationship type means accessing its defining attributes since they
completely describe the relationship instances. As we know, the defining attributes are
the keys of the source and target for the relationship type. Consequently, source and
target of the relationship type are the primary receptacles for the data elements and data
groups corresponding to the defining attributes. If they do not contain them, the
relationship type cannot contain them. If they are their keys, the relationship type will
automatically contain them. Therefore, in the data inventory, the data elements/data
groups for the accessed defining attributes are associated with the source and target
entity types rather than with the relationship type.
In view of this convention, the walk through the entity-relationship model for this step of
the business process requires the following data elements for the indicated entity types.
The roles are included in parentheses:

Entity Type       Data Element/Data Group
FLIGHT            Flight Number (flight number for a flight)
                  Airport Code (airport of departure for a flight)
                  Airport Code (airport of arrival for a flight)
LEG               Flight Number (flight number for a leg)
                  Airport Code (airport of departure for a leg)
                  Airport Code (airport of arrival for a leg)
AIRCRAFT MODEL    Type Code (type code for an aircraft model)
                  Model Number (model number for an aircraft model)
PILOT             Employee Number (employee number for a pilot)

The data elements for entity types FLIGHT and PILOT have already been identified for
the previous step and are not repeated in the data inventory. For the roles, similar
considerations apply as for the first step.
3. The third step of the business process checks if the specified pilot has already been
assigned to the flight. If he/she has already been assigned to the flight, an appropriate
message is displayed and the process ends.
It does not matter whether the specified pilot has been assigned as captain or copilot. In
both cases, he/she cannot be assigned again to the flight.
In the entity-relationship model, relationship type PILOT_assigned_to_FLIGHT must be
used to determine if the pilot has already been assigned to the flight. Since it is not of
interest whether the pilot has been assigned as captain or copilot, entity type PILOT
ASSIGNMENT is not needed.
Accordingly, this step of the business process uses the following data elements in the
indicated entity types:

Entity Type       Data Element/Data Group
FLIGHT            Flight Number (flight number for a flight)
                  Airport Code (airport of departure for a flight)
                  Airport Code (airport of arrival for a flight)
                  Flight Locator (flight locator for a flight)
PILOT             Employee Number (employee number for a pilot)

All data elements and roles have already been identified before. Therefore, they are not
repeated in the data inventory.
4. The fourth step of the business process checks if another pilot is already captain for the
flight. If so, a message is displayed, containing employee number, last name, and first
name of that pilot, and the business process ends.
Using the entity-relationship model, you can accomplish this by navigating from entity
type FLIGHT to entity type PILOT ASSIGNMENT via relationship types
PILOT_assigned_to_FLIGHT and PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT
and inspecting the function for the pilot assignments.
Following the above convention concerning relationship types, this traversal of the
entity-relationship model requires that the following data elements have attributes in the
indicated entity types:

Entity Type       Data Element/Data Group
FLIGHT            Flight Number (flight number for a flight)
                  Airport Code (airport of departure for a flight)
                  Airport Code (airport of arrival for a flight)
                  Flight Locator (flight locator for a flight)
PILOT             Employee Number (employee number for a pilot)
PILOT ASSIGNMENT  Flight Number (flight number for a pilot assignment)
                  Airport Code (airport of departure for a pilot assignment)
                  Airport Code (airport of arrival for a pilot assignment)
                  Flight Locator (flight locator for a pilot assignment)
                  Employee Number (employee number for a pilot assignment)
                  Pilot Function (pilot function for a pilot assignment)


All data elements and roles for entity types FLIGHT and PILOT have been identified
before. They are not repeated in the data inventory.
If another captain has already been assigned to the flight, you must navigate from entity
type PILOT ASSIGNMENT to entity type PILOT via relationship types
PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT and
PILOT_assigned_to_FLIGHT. Using relationship type EMPLOYEE_is_PILOT, you must
continue on to entity type EMPLOYEE. There, you find the last name and first name of
the current captain for the flight needed for the message being issued.
This requires the following additional data elements for the indicated entity types:

Entity Type       Data Element/Data Group
EMPLOYEE          Employee Number (employee number for an employee)
                  Last Name (last name of an employee as part of data group
                  Name of Person used by entity type EMPLOYEE)
                  First Name (first name of an employee as part of data group
                  Name of Person used by entity type EMPLOYEE)

5. In the fifth step of the business process, the specified pilot becomes the captain of the
specified flight.
For the entity-relationship model this means that instances must be added to entity type
PILOT ASSIGNMENT and relationship types PILOT_assigned_to_FLIGHT and
PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT.

As a consequence, the following data elements in entity type PILOT ASSIGNMENT are
written (see also visual Sample Business Process (5 of 5)):

Entity Type       Data Element/Data Group
PILOT ASSIGNMENT  Flight Number (flight number for a pilot assignment)
                  Airport Code (airport of departure for a pilot assignment)
                  Airport Code (airport of arrival for a pilot assignment)
                  Flight Locator (flight locator for a pilot assignment)
                  Employee Number (employee number for a pilot assignment)
                  Pilot Function (pilot function for a pilot assignment)


The fact that instances are added to relationship types PILOT_assigned_to_FLIGHT
and PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT does not imply that the
data elements for their defining attributes are updated in their sources and targets.
Therefore, they will not be listed under Data Written in the description of the business
process. However, you may argue that they must be contained in the respective source
or target and, therefore, should be listed under Data Read.
6. The sixth step displays the message confirming that the pilot has been assigned as
captain to the flight. The message includes employee number, last name, and first name
of the newly assigned captain.
This requires us to access entity type EMPLOYEE for the pilot being assigned to obtain
his/her last name and first name.
As a consequence, the following data elements are needed (read) in the indicated entity
types:

Entity Type       Data Element/Data Group
EMPLOYEE          Employee Number (employee number for an employee)
                  Last Name (last name of an employee as part of data group
                  Name of Person used by entity type EMPLOYEE)
                  First Name (first name of an employee as part of data group
                  Name of Person used by entity type EMPLOYEE)

All data elements, data groups, and roles have been identified before and, therefore, are
not repeated in the data inventory.
We were able to identify all required entity types and relationship types for the
business process and assign all data elements or data groups to entity types. Thus, the
entity-relationship model is complete for this business process.
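
The walk through the entity-relationship model can be replayed on a small in-memory model. The Python sketch below is an illustration only: all instance values are made up, the variable and function names are assumptions, and entity type PILOT ASSIGNMENT together with its two relationship types is collapsed into a single dictionary for brevity.

```python
# Entity instances, identified by their entity keys (made-up sample data).
FLIGHT_KEY = ("CA100", "FRA", "JFK", "L1")  # number, departure, arrival, locator
LEG_KEY = ("CA100", "FRA", "JFK")           # number, departure, arrival
MODEL_KEY = ("B747", "400")                 # type code, model number

flights = {FLIGHT_KEY}
pilots = {1001, 1002}
employees = {1001: ("Miller", "Ann"), 1002: ("Smith", "Bob")}

# Relationship instances hold the keys of source and target (defining attributes).
flight_for_leg = {FLIGHT_KEY: LEG_KEY}                      # FLIGHT_for_LEG
aircraft_model_for_leg = {LEG_KEY: MODEL_KEY}               # AIRCRAFT MODEL_for_LEG
pilot_can_fly_model = {(1001, MODEL_KEY), (1002, MODEL_KEY)}
pilot_assignments = {}  # (flight key, employee number) -> pilot function


def assign_captain(flight_key, employee_number):
    # Step 1: the specified flight and pilot must exist.
    if flight_key not in flights or employee_number not in pilots:
        return "error: flight or pilot does not exist"
    # Step 2: FLIGHT -> LEG -> AIRCRAFT MODEL, then check the license.
    model = aircraft_model_for_leg[flight_for_leg[flight_key]]
    if (employee_number, model) not in pilot_can_fly_model:
        return "error: pilot has no license for the aircraft model"
    # Step 3: already assigned, whether as captain or copilot?
    if (flight_key, employee_number) in pilot_assignments:
        return "message: pilot already assigned to flight"
    # Step 4: is another pilot already captain? If so, continue to EMPLOYEE.
    for (assigned_flight, assigned_pilot), function in pilot_assignments.items():
        if assigned_flight == flight_key and function == "captain":
            last, first = employees[assigned_pilot]
            return f"message: captain is {assigned_pilot} {last}, {first}"
    # Step 5: write the pilot assignment.
    pilot_assignments[(flight_key, employee_number)] = "captain"
    # Step 6: read EMPLOYEE for the confirmation message.
    last, first = employees[employee_number]
    return f"assigned: {employee_number} {last}, {first}"
```

Every lookup in the function corresponds to an entity type or relationship type visited in the walk. Had a lookup been impossible to express, the model would have lacked an entity type or relationship type for the business process.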


Sample Business Process (4 of 5)


Data Element/Group  Role/Purpose                                 Contained In
Read:
Flight Number       Flight number for a flight                   FLIGHT
                    Flight number for a leg                      LEG
                    Flight number for a pilot assignment         PILOT ASSIGNMENT
Airport Code        Airport of departure for a flight            FLIGHT
                    Airport of arrival for a flight              FLIGHT
                    Airport of departure for a leg               LEG
                    Airport of arrival for a leg                 LEG
                    Airport of departure for a pilot assignment  PILOT ASSIGNMENT
                    Airport of arrival for a pilot assignment    PILOT ASSIGNMENT
Flight Locator      Flight locator for a flight                  FLIGHT
                    Flight locator for a pilot assignment        PILOT ASSIGNMENT
Employee Number     Employee number for a pilot                  PILOT
                    Employee number for a pilot assignment       PILOT ASSIGNMENT
                    Employee number for an employee              EMPLOYEE
Type Code           Type code for an aircraft model              AIRCRAFT MODEL
Model Number        Model number for an aircraft model           AIRCRAFT MODEL
Pilot Function      Pilot function for a pilot assignment        PILOT ASSIGNMENT
Last Name           Last name of an employee                     Name of Person
                                                                 EMPLOYEE
First Name          First name of an employee                    Name of Person
                                                                 EMPLOYEE
Figure 5-29. Sample Business Process (4 of 5) CF182.0

Notes:
The data elements and data groups on this visual have already been discussed in the
notes for the previous visual on page 5-50.
Note that column Contained In is not part of the description for a business process. It has
been added here to indicate the data groups and entity types the various data
elements/data groups will be associated with in the data inventory. It does not make sense
to describe this in the process inventory. The implementation of the business processes
must be based on the actual tables rather than on the entity types of the entity-relationship
model.

Sample Business Process (5 of 5)

Data Element/Group   Role/Purpose                                  Contained In
Written:
Flight Number        Flight number for a pilot assignment          PILOT ASSIGNMENT
Airport Code         Airport of departure for a pilot assignment   PILOT ASSIGNMENT
                     Airport of arrival for a pilot assignment     PILOT ASSIGNMENT
Flight Locator       Flight locator for a pilot assignment         PILOT ASSIGNMENT
Employee Number      Employee number for a pilot assignment        PILOT ASSIGNMENT
Pilot Function       Pilot function for a pilot assignment         PILOT ASSIGNMENT

Figure 5-30. Sample Business Process (5 of 5) CF182.0

Notes:
This visual illustrates the data elements written by the sample business process. They have
already been discussed on page 5-50 ff.
Note that column Contained In is not part of the description for a business process. It has
been added here to indicate the data groups and entity types the various data
elements/data groups will be associated with in the data inventory.


Process Decomposition
To attain a complete set of business processes for the application domain
Business process = Process actually performed by the application domain
Step-by-step decomposition of application domain into groups of
related business processes and, finally, individual business processes
Next iteration is a refinement of the previous iteration
Next iteration creates business-related subsets of groups for previous iteration
Iteration stops if group consists of a single business process
Independent of whether or not the business process will employ other
business processes to achieve its task (implementation detail)
Business process then described in process inventory
Result is a process tree
Lowest level are business processes to be described in process inventory
Higher levels are groups of related business processes
Only a grouping of business processes
Does not imply an implementation structure
Does not specify if a business process internally uses another business
process to accomplish its task

Figure 5-31. Process Decomposition CF182.0

Notes:
To ensure the completeness of the data inventory, you need a comprehensive process
inventory. This requires that you have a complete set of the business processes for the
application domain, i.e., the processes (tasks) actually performed by the application
domain.
One technique for obtaining a comprehensive set of business processes is process
decomposition. It is a step-by-step decomposition of the application domain into groups of
related business processes and, finally, individual business processes.
Process decomposition is an iterative process. The next iteration is a refinement of the
previous iteration and creates business-related subgroups (subsets) for the groups
resulting from the previous iteration. The iteration stops when a group finally consists of a
single business process.
Process decomposition is a pure grouping of the business processes based on the tasks
performed by the application domain. It just describes which business processes are
performed by the various subfunctions of the application domain. The same business
process may be performed by multiple subfunctions.

Process decomposition neither considers nor reflects whether or not a business process
internally uses other business processes to perform its work. For example, the business
process displaying all maintenance records for an aircraft may very well use the business
process displaying an individual maintenance record. However, this is not a concern of
process decomposition and not reflected in its output.
Nor does process decomposition concern itself with internally used modules or
invocation sequences. These are implementation details. Only externally visible tasks, i.e.,
tasks performed by the application domain, are considered and reflected. Remember that
we still are in the conceptual view. At this stage, you should not make any assumptions
about the implementation of the business processes.
As a business process is identified during process decomposition, it is described in the
process inventory.
The result of process decomposition is a process tree. The nodes at the lowest level of the
process tree are the business processes described in the process inventory. The nodes at
the higher levels are groups of related business processes. They act like folders of
directory structures. The process tree groups the business processes in accordance with
their usage by subfunctions of the application domain.
The process tree does not imply an implementation structure or an invocation sequence. It
does not specify if a business process internally uses another business process to
accomplish a task. It neither establishes nor enforces that separate business processes
become separate programs or queries.
The process tree should be incorporated in the process inventory. It provides an
overview of the business processes described in the process inventory.


Process Decomposition for CAB (1 of 2)

AVIATION
    AIRPORT MANAGEMENT
    ITINERARY MANAGEMENT
    FLIGHT MANAGEMENT
        AIRCRAFT ASSIGNMENT
            Assign Aircraft to Flight
            Change Aircraft for Flight
            Remove Aircraft for Flight
            Display Aircraft for Flight
            Display Aircraft Model for Flight
            Display Aircraft Type for Flight
            Display All Aircraft Information for Flight
            Display Flights for Aircraft
        PILOT ASSIGNMENT
            Assign Captain for Flight
            Change Captain for Flight
            Remove Captain for Flight
            Assign Copilot for Flight
            Change Copilot for Flight
            Remove Copilot for Flight
            Display Pilots for Flight
            Display Flights for Pilot
    AIRCRAFT MANAGEMENT
    AIRCRAFT MAINTENANCE

Figure 5-32. Process Decomposition for CAB (1 of 2) CF182.0

Notes:
This visual illustrates a part of the process tree for our sample airline company called Come
Aboard.
We have used folders for the higher-level nodes to demonstrate the similarity to directory
structures on personal computers. A folder contains a list of items. In our case, these are
business processes or other folders containing business processes.
The top-level folder (node), called Aviation, represents the entire application domain. It
covers all business processes for the application domain.
The first iteration of the process decomposition resulted in groups of business processes
for subfunctions Airport Management, Itinerary Management, Flight Management, Aircraft
Management, and Aircraft Maintenance. Some additional subfunctions (such as Employee
Management) are not shown on the visual.
The visual shows a second iteration for Flight Management resulting in groups of business
processes for subfunctions Aircraft Assignment and Pilot Assignment. The business
processes for these groups (third iteration) are also listed on the visual.
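The folder analogy can be made concrete with a small Python sketch. The nested-dictionary layout and the function name `business_processes` are assumptions made for illustration; the group and process names are taken from the visual, and only part of the Flight Management branch is filled in.

```python
# Hypothetical representation of part of the process tree: a dict is a group
# (folder), a list holds the business processes of a lowest-level group.
process_tree = {
    "Aviation": {
        "Airport Management": {},
        "Itinerary Management": {},
        "Flight Management": {
            "Aircraft Assignment": [
                "Assign Aircraft to Flight",
                "Change Aircraft for Flight",
                "Remove Aircraft for Flight",
            ],
            "Pilot Assignment": [
                "Assign Captain for Flight",
                "Change Captain for Flight",
                "Remove Captain for Flight",
            ],
        },
        "Aircraft Management": {},
        "Aircraft Maintenance": {},
    }
}


def business_processes(node, path=()):
    """Yield (path of groups, business process) for every leaf of the tree."""
    if isinstance(node, list):
        for title in node:
            yield path, title
    else:
        for group, child in node.items():
            yield from business_processes(child, path + (group,))


for groups, title in business_processes(process_tree):
    print(" / ".join(groups), "->", title)
```

The structure records grouping only. Nothing in it states that one business process invokes another, which matches the intent of the process tree.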

Many business processes access a single business object type or business relationship
type or, if you prefer, a single entity type or relationship type. However, there are also
business processes accessing multiple entity types and/or relationship types.
Note that business process Display All Aircraft Information for Flight might invoke business
processes Display Aircraft for Flight, Display Aircraft Model for Flight, and Display Aircraft
Type for Flight. Since this is an implementation detail, the process tree does not show it.
The actual implementation may look different.
Business processes Display Flights for Pilot and Display Flights for Aircraft may very well
be used by other subfunctions of the application domain as well. The first business process
may be used by Employee Management; the second by Aircraft Maintenance. The unique
title of a business process prevents it from being implemented twice. You can view the business
process as belonging to one subfunction (its major user) and the other subfunctions having
shortcuts for it.
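The idea of one owning subfunction plus shortcuts can be sketched as follows. The names `registry`, `shortcuts`, `register`, and `add_shortcut` are hypothetical, and the choice of owning subfunction for each business process is an assumption based on where the visual lists it.

```python
# Hypothetical sketch: business processes are registered once under their
# unique title; other subfunctions only hold shortcuts (the title itself).
registry = {}   # title -> owning subfunction (its major user)
shortcuts = {}  # subfunction -> set of titles it uses but does not own


def register(title, owner):
    """Describe a business process once; a second description is rejected."""
    if title in registry:
        raise ValueError(f"{title!r} is already described under {registry[title]!r}")
    registry[title] = owner


def add_shortcut(subfunction, title):
    """Let another subfunction refer to an already described business process."""
    if title not in registry:
        raise KeyError(f"{title!r} has not been described yet")
    shortcuts.setdefault(subfunction, set()).add(title)


register("Display Flights for Pilot", "Pilot Assignment")
register("Display Flights for Aircraft", "Aircraft Assignment")
add_shortcut("Employee Management", "Display Flights for Pilot")
add_shortcut("Aircraft Maintenance", "Display Flights for Aircraft")
```

Because the title is the lookup key, an attempt to describe the same business process a second time under another subfunction fails instead of silently creating a duplicate.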


Process Decomposition for CAB (2 of 2)

AVIATION
    AIRPORT MANAGEMENT
    ITINERARY MANAGEMENT
        Create Itinerary
        Change Itinerary
        Remove Itinerary
        Display Itinerary
        Add Single Leg to Itinerary
        Add All Legs for Itinerary
        Change Leg of Itinerary
        Remove Leg of Itinerary
        Display Single Leg of Itinerary
        Display All Legs of Itinerary
        Change Aircraft Model for Leg of Itinerary
        Display Flights for Leg of Itinerary
    FLIGHT MANAGEMENT
    AIRCRAFT MANAGEMENT
    AIRCRAFT MAINTENANCE

Figure 5-33. Process Decomposition for CAB (2 of 2) CF182.0

Notes:
The next iteration for subfunction Itinerary Management does not result in groups of
business processes. Rather, it immediately provides the business processes for Itinerary
Management. This illustrates that the number of iterations required may vary for different
parts of the process tree.
This part of the process tree also shows a business process, Display All Legs of Itinerary,
whose implementation might use another business process (Display Single Leg of
Itinerary). Again, the process tree is not supposed to show such implementation details.
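That different branches need a different number of iterations can be shown with a short Python sketch. The nested layout and the function name `iterations_needed` are assumptions made for illustration, and only a few business processes per group are listed.

```python
# Hypothetical excerpt of the process tree below the top-level node: dicts are
# groups, lists hold the business processes of a lowest-level group.
tree = {
    "Itinerary Management": [
        "Create Itinerary",
        "Display All Legs of Itinerary",
    ],
    "Flight Management": {
        "Pilot Assignment": ["Assign Captain for Flight"],
    },
}


def iterations_needed(node, iteration=1):
    """Map each business process to the iteration that produced it.

    `iteration` is the number of the iteration that produced the entries
    of `node`; the top-level groups result from iteration 1.
    """
    result = {}
    if isinstance(node, list):
        for title in node:
            result[title] = iteration
    else:
        for child in node.values():
            result.update(iterations_needed(child, iteration + 1))
    return result


levels = iterations_needed(tree)
for title, n in levels.items():
    print(f"{title}: reached in iteration {n}")
```

In this excerpt, the Itinerary Management processes are reached in the second iteration, whereas the Pilot Assignment process requires a third.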

Checkpoint

Exercise — Unit Checkpoint


1. Which of the following statements are true?
a. The data inventory describes all data for the application
domain.
b. The data inventory is jointly established by the application
domain expert and the database designer.
c. The database designer is not involved at all in the development
of the data inventory.
d. The data inventory should also describe data caused by the
implementation of the business processes, but not having a
business meaning.
e. The data inventory is input both for the database designer and
application programmers.
f. When establishing the data inventory, the entity-relationship
model is checked for completeness.

2. List the components of a data inventory.


_____________________________________________________
_____________________________________________________
_____________________________________________________

3. Describe the difference between a data element and a data group.


_____________________________________________________
_____________________________________________________
_____________________________________________________

4. What is the purpose of abstract data types?


_____________________________________________________
_____________________________________________________
_____________________________________________________


5. Name the three items that should be described for an abstract data
type.
_____________________________________________________
_____________________________________________________
_____________________________________________________

6. Which of the following items should you specify for a data group?
a. A unique name.
b. A textual description.
c. Its data type.
d. The data groups using it as components.
e. The entity types using it as attributes.
f. Its minimum, maximum, and average lengths.
g. A domain for its values.

7. Why should you associate data elements or data groups with entity
types when adding them to the data inventory?
_____________________________________________________
_____________________________________________________
_____________________________________________________

8. Name two methods for establishing a data inventory.


_____________________________________________________
_____________________________________________________
_____________________________________________________

9. List some of the problems with the method of surveying the
departments of expertise.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

10. Which principle is behind coupling the data and process
inventories?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

11. Which of the following statements are true?


a. The process inventory describes all data for the application
domain.
b. The process inventory describes all business processes for the
application domain.
c. The process inventory is input for the database designer.
d. The descriptions of the business processes refer to the data
elements and data groups of the data inventory by their unique
names.
e. The business processes can be used to verify the
completeness of the entity-relationship model.

12. Name at least six items that the description for a business process
should include.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

13. What is to be understood by data read for a business process?
Which information should be provided for data read?
_____________________________________________________
_____________________________________________________
_____________________________________________________


14. How can you use a business process to verify the
entity-relationship model?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

15. What is process decomposition and what is its purpose?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

Unit Summary (1 of 2)

DATA INVENTORY
A detailed description of all abstract data types, data elements,
and data groups for the application domain
Created jointly by application domain expert and database designer
Description of abstract data types includes signature, values, and
operations for data type
Description of data elements/data groups includes:
    Name, type, textual description, owning data groups and entity types
    Additionally, for data elements: data type, lengths, and domain
Methods for establishing a data inventory:
    Survey of departments of expertise
    Review of existing data and programs
    Parallel development with process inventory
Allows verification of entity-relationship model

Figure 5-34. Unit Summary (1 of 2) CF182.0

Notes:


Unit Summary (2 of 2)

PROCESS INVENTORY
A detailed description of all business processes for application domain
Created by application domain expert for application programmers
Contains process tree and descriptions for business processes
Process tree established by means of process decomposition for
application domain
    Functionally structures business processes for application domain
    Step-by-step process providing groups of functionally related
    business processes
    Provides a complete set of business processes
Descriptions for business processes include title, purpose, input,
textual and formal descriptions, output, data read, data written
Allows verification of entity-relationship model

Figure 5-35. Unit Summary (2 of 2) CF182.0

Notes:

Unit 6. Tuple Types

What This Unit Is About


This unit describes the purpose of tuple types and how to establish
them for entity types and relationship types of an entity-relationship
model.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Explain the purpose of tuple types and position them in the design
process.
• Identify the objects of an entity-relationship model for which tuple
types are established.
• Establish the tuple types for the appropriate objects of an
entity-relationship model.
• Explain the purpose and rules for the normalization of tuple types.
• Normalize the tuple types for an application domain.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises


Unit Objectives
After completion of this unit, you should be able to:

Explain the purpose of tuple types and position them in the design process
Identify the objects of an ER model for which tuple types are established
Establish the tuple types for the appropriate objects of an ER model
Explain the purpose and rules for the normalization of tuple types
Normalize the tuple types for an application domain

Figure 6-1. Unit Objectives CF182.0

Notes:
Up to now, the entity-relationship model and the data and process inventories for the
application domain have been established. Now, it is time to transform the information
collected so far into objects that are machine processable. This requires a sequence of
steps. The first step is to establish the tuple types for the application domain and to
normalize them.
In this unit, we will talk about the purpose of tuple types, describe for which objects of the
entity-relationship model they are established, and how they are established.
The tuple types established this way may contain anomalies and redundant information
and are therefore subjected to a process called Normalization. We will talk about the
purpose of normalization and Normal Forms.
Thus, after the completion of this unit, you should be able to establish the tuple types for an
application domain and to normalize them.

6.1 Establishing Tuple Types


Tuple Types in Design Process


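[Diagram: the design process. It leads from the Problem Statement through the
Entity-Relationship Model, Process Inventory, and Data Inventory (Conceptual View)
to Tuple Types, Tables, Logical Data Structures, Integrity Rules, and Indexes.
The diagram also labels a Logical View and a Storage View.]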

Figure 6-2. Tuple Types in Design Process CF182.0

Notes:
As part of the conceptual view, the entity-relationship model and the data and process
inventories for the application domain were established. The first step of the storage view
uses the data inventory to construct tuple types for the entity types and relationship types
of the entity-relationship model and normalizes them.
Tuple types are an intermediate result of the design process. They are the precursors of
tables and provide the basis for the computerized processing of the entity types and
relationship types for the application domain. They are part of the storage view since they
represent the first step in the physical implementation of the conceptual view.
You can view the design process as a layered approach transforming the objects of the
application domain step-by-step into more and more physical representations. Tuple types
are an intermediate result of this transformation process which, finally, results in the tables
and related objects of the target relational database management system.

Tuple Types

Tuple Type

A construct:
Representing a class of objects with the same meaning,
structure, and characteristics
Consisting of a set of attributes
Forming the basis for the computerized processing
of the objects belonging to the tuple type

Tuple
A specific instance of a given tuple type

Figure 6-3. Tuple Types CF182.0

Notes:
Tuple types are the first result of the storage view. They are established when the
entity-relationship model is transformed step-by-step into the physical objects for the target
relational database management system. They are not yet the tables for the target system.
They are an intermediate result.
Like entity types, tuple types are constructs representing classes of objects with the
same meaning, structure, and characteristics. As with entity types, they consist of attributes
which may be elementary or composite and can assume zero, one, or multiple values.
Nevertheless, they are not entity types. Rather, they can be seen as a generalization or
standardization of entity types and relationship types.
Tuple types form the basis for the computerized processing of the objects they represent.
Whereas entity types and relationship types were purely conceptual classes, tuple types
should be seen as semi-physical constructs. They can be compared with logical files as
further discussed on the subsequent visual.
A specific instance of a tuple type is referred to as a tuple.


In the literature, tuple types are frequently referred to as relations. We have chosen the
term tuple types to avoid confusion with relationships and relationship types and to
emphasize that they are classes of tuples.

Characteristics of Tuple Types
Tuple types can be viewed as logical data sets
Tuples form computational units and can be viewed as logical records
Attributes determine contents of logical records
All tuples of a tuple type have the same meaning, structure, and characteristics
Attributes can be elementary or composite
Attributes can assume zero, one, or multiple values for a tuple
    Cardinality determines number of values
Tuple type must have a set of attributes uniquely identifying its potential tuples
    Primary key
Each tuple type receives a unique name
Tuple types established for entity types and most relationship types
[Diagram: Entity Types and Relationship Types lead to Tuple Types, which lead to Tables]

Figure 6-4. Characteristics of Tuple Types CF182.0

Notes:
Tuple types can be viewed as logical data sets. They are the logical containers for the
structured information represented by the tuples. Accordingly, the tuples can be viewed as
logical records. They are the computational units being processed. The attributes of a tuple
type determine the structure and contents of the tuples.
As logical records of logical data sets, all tuples of a tuple type have the same meaning,
structure, and characteristics. This means that they are composed of the same type of
information (attributes) and that the same constraints apply to them.
The attributes for a tuple type can be elementary or composite attributes. For a tuple, an
attribute can assume zero, one, or multiple values. The cardinality for the attribute
determines how many values the attribute must assume at least and at most for each tuple.
Each tuple type must have a set of attributes whose values uniquely identify all potential
tuples of the tuple type. This set of attributes is referred to as the primary key of the tuple type.
For reference purposes, each tuple type receives a unique name. This should be the
unique class name expressing the function of the tuples.
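These characteristics can be illustrated with a minimal Python sketch. The classes `Attribute` and `TupleType`, the method `check`, and the attribute `Remarks` are hypothetical and exist only for this illustration; a tuple is modeled as a dictionary that maps each attribute name to a list of values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    min_card: int = 1  # minimum number of values per tuple
    max_card: int = 1  # maximum number of values per tuple


@dataclass
class TupleType:
    name: str
    attributes: list
    primary_key: tuple  # names of the identifying attributes

    def check(self, tuples):
        """Return the violations found in the given tuples (dicts of value lists)."""
        problems, seen = [], set()
        for t in tuples:
            for attr in self.attributes:
                count = len(t.get(attr.name, []))
                if not attr.min_card <= count <= attr.max_card:
                    problems.append(f"{attr.name}: {count} value(s) not allowed")
            key = tuple(tuple(t.get(name, [])) for name in self.primary_key)
            if key in seen:
                problems.append(f"duplicate primary key {key}")
            seen.add(key)
        return problems


aircraft_model = TupleType(
    name="AIRCRAFT MODEL",
    attributes=[
        Attribute("Type Code"),
        Attribute("Model Number"),
        Attribute("Remarks", min_card=0, max_card=3),  # hypothetical attribute
    ],
    primary_key=("Type Code", "Model Number"),
)
```

Calling `aircraft_model.check(...)` with a list of tuples reports attributes whose number of values falls outside their cardinality and tuples that repeat a primary key value.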


In the design process, tuple types are an intermediate result in the process of transforming
the entity types and relationship types of the entity-relationship model into tables of the
target relational database management system. Tuple types are established for all entity
types and most relationship types. As for entity types, the attributes of tuple types are
affiliated with data elements and data groups of the data inventory. Basically, when
establishing a tuple type, the data elements and data groups corresponding to the
attributes or defining attributes of the associated entity type or relationship type are
compiled.

Tuple Types for Entity Types

ONE tuple type for EVERY entity type of entity-relationship model for application domain
Name for tuple type = Name for entity type
Tuple type consists of all attributes for entity type
Attributes of entity key become attributes of primary key for tuple type
Equivalent constraints as for entity type

AIRCRAFT MODEL
K   Type Code
K   Model Number
    Dimensions
        Length
        Height
        Wing Span
    ...

Figure 6-5. Tuple Types for Entity Types CF182.0

Notes:
For every entity type of the entity-relationship model for the application domain, one tuple
type is established.
As name of the tuple type, we will use the name for the entity type. If you wish, you can use
a different name, but there is no need for that.
The tuple type consists of all attributes for the entity type. Thus, when forming the tuple
type, the data elements and data groups of the data inventory corresponding to the
attributes of the entity type are compiled and cardinalities assigned to them.
The primary key for the tuple type consists of the attributes for the entity key of the entity
type. Since entity keys satisfy the minimum principle (all attributes are necessary for the
unique identification of the entity instances), the primary key also follows the minimum
principle: All attributes are necessary for the unique identification of the individual tuples.
As you can imagine, the constraints for the entity type must be translated into equivalent
constraints for the tuple type. However, at this point in time, we will not worry about the
constraints since tuple types are only an intermediate result.


Note that we have already prepared the establishment of tuple types for entity types. In the
data inventory, we have recorded to which entity types the various data elements and data
groups belong. Thus, to obtain the tuple type for an entity type, you just need to compile the
data elements and data groups for the entity type.
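Compiling a tuple type from the data inventory can be sketched as follows. The record layout of `data_inventory` and the function name `tuple_type_for_entity_type` are assumptions made for illustration; the attribute names follow the AIRCRAFT MODEL example.

```python
# Hypothetical data inventory excerpt: each entry records the entity type a
# data element or data group belongs to and whether it is part of the entity key.
data_inventory = [
    {"name": "Type Code", "entity_type": "AIRCRAFT MODEL", "in_entity_key": True},
    {"name": "Model Number", "entity_type": "AIRCRAFT MODEL", "in_entity_key": True},
    {"name": "Dimensions", "entity_type": "AIRCRAFT MODEL", "in_entity_key": False},
    {"name": "Aircraft Number", "entity_type": "AIRCRAFT", "in_entity_key": True},
]


def tuple_type_for_entity_type(entity_type, inventory):
    """Compile the attributes and primary key of the tuple type for an entity type."""
    entries = [e for e in inventory if e["entity_type"] == entity_type]
    return {
        "name": entity_type,  # the tuple type takes the name of the entity type
        "attributes": [e["name"] for e in entries],
        "primary_key": [e["name"] for e in entries if e["in_entity_key"]],
    }


print(tuple_type_for_entity_type("AIRCRAFT MODEL", data_inventory))
```

Cardinalities and constraints are left out of this sketch; it only shows that the association recorded in the data inventory is enough to assemble the attribute list and the primary key.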

Tuple Types for Relationship Types
Name for tuple type = Full name for relationship type
Tuple type consists of defining attributes for relationship type
Attributes of relationship key become attributes of primary key
Equivalent constraints as for relationship type
Usually, ONE tuple type for each relationship type
    But NONE for ...

Example: AIRCRAFT MODEL (K Type Code, K Model Number, ...) 1..1 _for_ m AIRCRAFT (K Aircraft Number, ...)
    Defining Attributes: Type Code, Model Number, Aircraft Number
    Relationship Key: Aircraft Number

Figure 6-6. Tuple Types for Relationship Types CF182.0

Notes:
As with tuple types for entity types, we choose the full name of the relationship type as
the name of the corresponding tuple type.
Since the defining attributes describe the relationship type, the tuple type must consist of
all defining attributes of the relationship type. Thus, to form the tuple type, the data elements
and data groups of the data inventory for the defining attributes of the relationship type are compiled.
Since a tuple expresses a single relationship, all attributes must assume one and only one
value for each tuple. Accordingly, minimum cardinality and maximum cardinality must be 1
for all attributes of the tuple type.
As you would expect, the attributes of the relationship key become the attributes of the
primary key for the tuple type. Since the relationship key had to follow the minimum
principle, the primary key for the tuple type follows the minimum principle as well: All
attributes are required to uniquely identify the individual tuples of the tuple type.
The example on the visual shows relationship type AIRCRAFT MODEL_for_AIRCRAFT, a
1:m relationship type. As you know, the defining attributes for this relationship type are the
entity keys of its source and target: attributes Type Code and Model Number from
AIRCRAFT MODEL and attribute Aircraft Number from AIRCRAFT. They become the
attributes of tuple type AIRCRAFT MODEL_for_AIRCRAFT.
Since the relationship type is a 1:m relationship type, Aircraft Number, the entity key of
AIRCRAFT, becomes the relationship key. Therefore, it also becomes the primary key for
tuple type AIRCRAFT MODEL_for_AIRCRAFT.
Any constraints for the relationship type must be translated into equivalent constraints for
the tuple type. Again, we will not worry about them right now.
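
The derivation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the design method itself: the names EntityType and relationship_tuple_type and the kind parameter are made up for this example, and only binary 1:m and m:m relationship types between entity types are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityType:
    """Hypothetical stand-in for an entity type: a name and its entity key."""
    name: str
    key: tuple


def relationship_tuple_type(source, target, name, kind):
    """Derive (full name, attributes, primary key) for a relationship type.

    kind is "1:m" (one source instance, many target instances) or "m:m".
    """
    full_name = f"{source.name}{name}{target.name}"
    # Defining attributes: the entity keys of source and target.
    attributes = source.key + target.key
    if kind == "1:m":
        # Each target instance takes part in at most one relationship,
        # so the key of the target is the relationship key.
        primary_key = target.key
    elif kind == "m:m":
        # All defining attributes are needed to identify a tuple.
        primary_key = attributes
    else:
        raise ValueError(f"unsupported kind: {kind}")
    return full_name, attributes, primary_key


model = EntityType("AIRCRAFT MODEL", ("Type Code", "Model Number"))
aircraft = EntityType("AIRCRAFT", ("Aircraft Number",))
print(relationship_tuple_type(model, aircraft, "_for_", "1:m"))
```

For the example on the visual, the sketch yields the three defining attributes and Aircraft Number as the only primary key attribute.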
Up to now, we have only described how to establish the tuple type for a relationship type.
We have not yet answered the question of whether there is a tuple type for every relationship
type of the entity-relationship model. Usually, there is one tuple type for each relationship
type. However, there are some exceptions: for some relationship types, a tuple type must not
be provided. These cases are described on the subsequent visuals.

No Tuple Type for Relationship Type (1 of 3)

[Diagram: a parent (may be entity type or relationship type) is connected by an owning
relationship type (D) to a dependent entity type.]

Defining Attributes: Key of dependent entity type

• Key of dependent entity type includes key of parent
• Only instances with matching values interconnected
• Tuple type for dependent entity type already expresses owning relationship type
• Tuple type for owning relationship type would be redundant
• NO tuple type for owning relationship type

Figure 6-7. No Tuple Type for Relationship Type (1 of 3) CF182.0

Notes:
An owning relationship type connects an entity type or relationship type, the parent, to a
dependent entity type. (The rectangle with rounded corners indicates that the represented
object may be an entity type or a relationship type.) The key of the parent is part of the key
of the dependent entity type and only instances with matching values are interconnected.
As we have seen before, the defining attributes for an owning relationship type consist of
the key of the dependent entity type. Accordingly, the tuple type for the owning relationship
type would just consist of the key of the dependent entity type.
The tuple type for the dependent entity type also contains the key for the dependent entity
type. Since only instances with matching values are interconnected, the tuple type for the
dependent entity type expresses all interconnections, and only those, established via the
owning relationship type. Consequently, a tuple type for the owning relationship type would
be redundant. Therefore, a tuple type for the owning relationship type need not and must
not be provided.

No Tuple Type for Relationship Type (2 of 3)

[Diagram: r1 is an m:m relationship type (cardinalities ..m and ..m) between two objects,
each an entity type or relationship type. r2 is a relationship type with r1 as source, a
minimum target cardinality of 1 (1..), and an entity type or relationship type as target.]

Relationship Key of r1: Defining attributes of r1
Defining Attributes of r2: Defining attributes of r1, ...

• Tuple type for r2 includes defining attributes for r1
• For every instance of r1, there is a corresponding instance for r2
• Tuple type for r2 expresses all relationship instances for r1
• Tuple type for r1 would be redundant
• NO tuple type for r1

Figure 6-8. No Tuple Type for Relationship Type (2 of 3) CF182.0

Notes:
There is a second case when a tuple type must not be provided for a relationship type.
Assume that you have an m:m relationship type r1 which is the source of another
relationship type r2 with a minimum target cardinality of 1 (cardinality 1..). Because r1 is an
m:m relationship type, its relationship key consists of all its defining attributes.
Consequently, the defining attributes of r2 include the defining attributes of r1. Accordingly,
the tuple type for r2 includes all attributes of the tuple type for r1.
The target cardinality of 1.. of relationship type r2 implies that each instance of r1 is
connected to at least one target instance of r2. In turn, this entails that r2 contains an
instance for every instance of r1.
This means that the tuple type for r2 completely describes the instances for r1 and r2. As a
consequence, a tuple type for r1 would be redundant and, therefore, need not and must
not be provided.

Note that it is imperative that the minimum target cardinality of r2 be 1. Otherwise, there
would not necessarily be an instance of r2 for every instance of r1. Thus, the instances of
r2 would not describe all instances of r1, and a separate tuple type would be required for r1.
Note that it is also necessary that r1 is an m:m relationship type. Otherwise, the key of r1
would not consist of all defining attributes. Thus, the tuple type for r2 would not include all
defining attributes for r1 and, therefore, not completely describe the instances for r1.
Of course, a tuple type would also not be required if r1 were the target of a relationship type
r2 with a minimum source cardinality of 1. This case can be reduced to the case discussed
above by redefining the primary direction of r2.

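No Tuple Type for Relationship Type (3 of 3)

[Diagram: r1 is an m:m relationship type (cardinalities ..m and ..m) between two objects,
each an entity type or relationship type. r2 is an owning relationship type (D) from r1 to a
dependent entity type, with a minimum cardinality of 1 (1..) for the dependent entity type.]

• NO tuple type for r2
• NO tuple type for r1

Figure 6-9. No Tuple Type for Relationship Type (3 of 3) CF182.0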

Notes:
This visual combines the cases discussed on the previous two visuals. Thus, it is a
corollary of them.
If r1 is an m:m relationship type and r2 an owning relationship type with a minimum
cardinality of 1 for the dependent entity type, tuple types must not be provided for them.
A tuple type must not be provided for r1 because the tuple type for r2 would fully describe
all instances for r1. A tuple type for r2 must not be provided either because the tuple type
for the dependent entity type completely describes the appropriate instances.
In particular, the situation on the visual exists for mandatory nondefining attributes for m:m
relationship types.
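
The decision rules from the last three visuals can be summarized in a short Python sketch. It is a simplified illustration only: the class RelationshipType, its fields, and the function needs_tuple_type are assumptions made for this example, and the case in which r1 is the target of r2 is assumed to have been reduced to the source case by redefining the primary direction of r2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RelationshipType:
    """Hypothetical, minimal description of a relationship type."""
    name: str
    kind: str                       # "1:1", "1:m", or "m:m"
    owning: bool = False            # True for an owning relationship type
    min_target_cardinality: int = 0
    # Set if the source of this relationship type is itself a relationship type.
    source: Optional["RelationshipType"] = None


def needs_tuple_type(rel, all_rels):
    # Case 1: an owning relationship type is already expressed by the
    # tuple type of its dependent entity type.
    if rel.owning:
        return False
    # Case 2: an m:m relationship type that is the source of another
    # relationship type with a minimum target cardinality of 1.
    if rel.kind == "m:m":
        for other in all_rels:
            if other.source is rel and other.min_target_cardinality >= 1:
                return False
    return True


r1 = RelationshipType("PILOT_assigned_to_FLIGHT", "m:m")
r2 = RelationshipType("PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT", "1:1",
                      owning=True, min_target_cardinality=1, source=r1)
r3 = RelationshipType("MECHANIC_scheduled_for_AIRCRAFT", "m:m")
rels = [r1, r2, r3]
for rel in rels:
    print(rel.name, needs_tuple_type(rel, rels))
```

The kind given for r2 is a placeholder; for an owning relationship type, the sketch never looks at it.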

Required Tuple Types for CAB

[Diagram: entity-relationship model for Come Aboard (CAB) with entity types EMPLOYEE,
MECHANIC, PILOT, AIRCRAFT TYPE, AIRCRAFT MODEL, AIRCRAFT, MAINTENANCE RECORD,
AIRPORT, ITINERARY, LEG, FLIGHT, and PILOT ASSIGNMENT and the relationship types
connecting them. Entity types are shown in reverse video. The arrows of relationship types
that require a tuple type are highlighted; the arrows of the others are shaded.]

Figure 6-10. Required Tuple Types for CAB CF182.0

Notes:
The above visual shows which entity types and relationship types of our sample airline
company, Come Aboard, require tuple types:
• Tuple types are required for all entity types of the entity-relationship model for Come
Aboard. Therefore, the entity types are shown in reverse video.
• Because they are owning relationship types, tuple types must not be provided for
relationship types:
AIRCRAFT TYPE_for_AIRCRAFT MODEL
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY_as_LEG
FLIGHT_for_LEG
PILOT_assigned_to_FLIGHT_by_PILOT ASSIGNMENT
EMPLOYEE_is_MECHANIC
EMPLOYEE_is_PILOT
The arrows for these relationship types have been shaded.

Note that the is-bundle for supertype EMPLOYEE represents a set of relationship types
(two). All of them are owning relationship types.
• Tuple types must not be provided for relationship types:
AIRPORT_nonstop_to_AIRPORT_in_ITINERARY
PILOT_assigned_to_FLIGHT
They are m:m relationship types that are the source of other relationship types whose
minimum target cardinality is 1. Both are cases of m:m relationship types with mandatory
nondefining attributes.
The arrows for these relationship types have been shaded.
• Tuple types are required for all remaining relationship types. Their connecting arrows
have been highlighted.

Documentation of Tuple Types

AIRCRAFT TYPE
    Type Code, PK
    Category
    Manufacturer
        Manufacturer Code
        Company Name
        Address
            Street [0..1]
            Post Office Box [0..1]
            City
            State [0..1]
            Country
            Postal Code [0..1]
        Phone Number
    Number of Engines

Annotations on the visual:
• AIRCRAFT TYPE: Name of tuple type
• PK: Belongs to primary key
• Manufacturer, Address: Composite attributes
• Components of composite attributes are indented
• Cardinality of attribute: Minimum and maximum number of values for each instance
    Format: [ minimum .. maximum ]
    * for maximum if no upper limit
    [1..1] assumed if omitted
• Cardinalities are relative!!

Figure 6-11. Documentation of Tuple Types CF182.0

Notes:
As described before, tuple types consist of attributes. A tuple type for an entity type
consists of the attributes of the entity type. A tuple type for a relationship type consists of
the defining attributes for the relationship type. To describe a tuple type, you need to list its
attributes in a way that reflects that attributes may be composite.
Each line on the visual following the name for the tuple type represents an attribute. The
components of a composite attribute immediately follow the line for the composite attribute
itself and are indented. If a component is again a composite attribute, its components are
indented even further.
In the example, Manufacturer is a composite attribute having composite attribute Address
as a component.
To highlight the name of the tuple type, it is in boldface and has been underlined. The
names of composite attributes have been bold-faced as well.
Attributes belonging to the primary key are marked by the letters PK separated from the
name of the attribute by a comma. If a composite attribute belongs to the primary key (i.e.,
all its components belong to the primary key), the name of the composite attribute is
marked with the letters PK rather than the individual components.
As discussed before, attributes have cardinalities specifying the minimum number and
maximum number of values that an instance of the attribute must/can have. As part of the
documentation of a tuple type, we want to show the cardinalities for its attributes. They are
needed during normalization and in later steps of the design process.
The cardinality for an attribute follows its name or, if applicable, the letters PK and is
specified as follows: Minimum cardinality and maximum cardinality are separated by two
periods and enclosed in brackets:
[minimum .. maximum]
If there is no upper limit for the number of values the attribute can assume, an asterisk (*) is
used as maximum cardinality. Enclosing brackets are used in analogy to the dimension
specification for arrays in programming languages.
If the cardinality for an attribute is omitted, [1..1] is assumed.
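
To make the notation concrete, the following Python sketch reads a single documentation line and extracts the attribute name, the PK marker, and the cardinality. It is an illustration only: the function name parse_attribute_line and the returned dictionary layout are assumptions for this example, and the sketch ignores indentation and AS clauses.

```python
import re

# name, optional ", PK", optional "[minimum..maximum]" at the end of the line
LINE = re.compile(
    r"^(?P<name>.*?)(?P<pk>, PK)?(?:\s*\[(?P<min>\d+)\.\.(?P<max>\d+|\*)\])?$"
)


def parse_attribute_line(line):
    """Split one documentation line into name, PK flag, and cardinality."""
    match = LINE.match(line.strip())
    # [1..1] is assumed if the cardinality is omitted.
    minimum = int(match["min"]) if match["min"] is not None else 1
    if match["max"] is None:
        maximum = 1
    elif match["max"] == "*":
        maximum = "*"          # no upper limit
    else:
        maximum = int(match["max"])
    return {
        "name": match["name"],
        "pk": match["pk"] is not None,
        "minimum": minimum,
        "maximum": maximum,
    }


for text in ("Type Code, PK", "Street [0..1]", "Seat [0..*]"):
    print(parse_attribute_line(text))
```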
Note that the specified cardinalities are relative: If the attribute is a direct component of the
tuple type, the cardinality expresses how many values the attributes must/can assume for
each tuple of the tuple type. If the attribute is a component of a composite attribute, the
cardinality rather specifies how many values the attribute must/can assume within each
instance of the composite attribute.
As a consequence, it is possible that, despite a minimum cardinality of 1, an attribute
does not assume a value for a specific tuple! This happens if the owning composite
attribute has a minimum cardinality of 0 and does not assume a value for that tuple.
The cardinality for an attribute can be derived from the cardinality specifications for the data
element or data group the attribute is based upon. Thus, it would be possible to omit the
cardinalities in the documentation of a tuple type and to go back to the data inventory when
the cardinalities are needed. However, it is quite handy to have the cardinalities in the tuple
type documentation.
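
The effect of relative cardinalities can be sketched in Python. The example is an illustration under assumptions: the class Attribute and the functions document and effective_minimum are made up for this sketch. The sample attribute is Actual Departure from tuple type FLIGHT of Come Aboard, which has cardinality [0..1] and mandatory components.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """Hypothetical model of one attribute of a tuple type."""
    name: str
    minimum: int = 1            # [1..1] is assumed if omitted
    maximum: object = 1         # an int, or "*" for no upper limit
    pk: bool = False
    components: list = field(default_factory=list)


def document(attr, depth=1):
    """Yield one documentation line per attribute, components indented."""
    line = "    " * depth + attr.name
    if attr.pk:
        line += ", PK"
    if (attr.minimum, attr.maximum) != (1, 1):
        line += f" [{attr.minimum}..{attr.maximum}]"
    yield line
    for component in attr.components:
        yield from document(component, depth + 1)


def effective_minimum(path):
    """Minimum number of values per tuple for the last attribute in path.

    path lists the attribute together with its owning composite
    attributes, outermost first. The relative minimums are multiplied.
    """
    result = 1
    for attr in path:
        result *= attr.minimum
    return result


actual = Attribute("Departure AS Actual Departure", minimum=0, components=[
    Attribute("Departure Date"),
    Attribute("Departure Time"),
])
print("\n".join(document(actual)))
print(effective_minimum([actual, actual.components[0]]))
```

Departure Date has a relative minimum cardinality of 1, yet its minimum per tuple is 0 because the owning composite attribute is optional.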

Tuple Types With Roles

FLIGHT
    Flight Number, PK
    Airport Code AS From, PK
    Airport Code AS To, PK
    Flight Locator, PK
    Departure AS Planned Departure
        Departure Date
        Departure Time
    Arrival AS Planned Arrival
        Arrival Date
        Arrival Time
    Departure AS Actual Departure [0..1]
        Departure Date
        Departure Time
    Arrival AS Actual Arrival [0..1]
        Arrival Date
        Arrival Time

Annotations on the visual:
• Name of Data Element AS Attribute/Role Name (e.g., Airport Code AS From)
• Name of Data Group AS Attribute/Role Name (e.g., Arrival AS Planned Arrival)
• Qualified Name: Departure Time OF Actual Departure

Figure 6-12. Tuple Types With Roles CF182.0

Notes:
Generally, an attribute of a tuple type receives the same name as the data element or data
group it is based upon. However, you might want to give it a different name. In some cases,
you even have to. If a data element or data group is used by multiple attributes at the same
level in different roles, you need to give the attributes different names. Same level in this
context means as direct components of the tuple type or of a composite attribute.
For example, in tuple type FLIGHT, data element Airport Code is used twice as direct
attribute of the tuple type. Once it is used as airport code for the airport of departure, once
as airport code for the airport of arrival. Without naming them differently, the two roles could
not be differentiated. In the data inventory for Come Aboard, the two roles for the data
element have been identified with different role names (From and To). Therefore, the
names of the attributes should be the role names.
However, you still want to keep the link to the appropriate data element or data group in the
data inventory. You can achieve this by specifying the data element or data group name
and the attribute name by means of an AS clause as done on the visual.

In addition to data element Airport Code, tuple type FLIGHT uses data groups Departure
and Arrival in different roles. The different usages of data group Departure have been
highlighted. Departure is used as planned departure (role/attribute name Planned
Departure) and as actual departure (role/attribute name Actual Departure).
Data group Departure contains data elements Departure Date and Departure Time. This
raises the question of whether the attributes for the different usages must be named differently.
They need not be, because they are components of differently named composite attributes and
are unique within the scope of their composite attribute.
Formally, the full name of a component is qualified by the name of the composite attribute.
For example, the full name of attribute Departure Time of composite attribute Actual
Departure is Departure Time OF Actual Departure.
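
Qualified names can be generated mechanically, as the following Python sketch shows. It is an illustration only: the function qualified_names and the (name, components) representation are assumptions for this example. Chaining several OF qualifiers for more deeply nested components is also an assumption; the text above only shows a single level.

```python
def qualified_names(attribute, owners=()):
    """Yield the full name of every elementary component of an attribute.

    attribute is a (name, components) pair. owners lists the names of the
    enclosing composite attributes, innermost first.
    """
    name, components = attribute
    if not components:
        yield " OF ".join((name,) + owners)
        return
    for component in components:
        yield from qualified_names(component, (name,) + owners)


planned = ("Planned Departure", [("Departure Date", []), ("Departure Time", [])])
actual = ("Actual Departure", [("Departure Date", []), ("Departure Time", [])])

names = list(qualified_names(planned)) + list(qualified_names(actual))
for full_name in names:
    print(full_name)
```

Although both composite attributes contain components named Departure Date and Departure Time, the four qualified names are distinct.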

Some Sample Tuple Types for CAB

Tuple types for entity types:

AIRCRAFT MODEL
    Type Code, PK
    Model Number, PK
    Dimensions
        Length
        Height
        Wing Span
    Weights
        Net Weight
        Maximum Weight
    Cruising Speed

ITINERARY
    Flight Number, PK
    Established On
    Effective From [0..1]
    Effective Until [0..1]

MAINTENANCE RECORD
    Maintenance Number, PK
    Date of Maintenance
    Type of Maintenance
    Aircraft Number
    ...

Tuple types for relationship types:

AIRCRAFT_for_FLIGHT
    Aircraft Number
    Flight Number, PK
    Airport Code AS From, PK
    Airport Code AS To, PK
    Flight Locator, PK

MAINTENANCE RECORD_from_MECHANIC
    Maintenance Number, PK
    Employee Number

MECHANIC_scheduled_for_AIRCRAFT
    Employee Number, PK
    Aircraft Number, PK

Figure 6-13. Some Sample Tuple Types for CAB CF182.0

Notes:
The above visual illustrates some further tuple types for our sample airline company called
Come Aboard. The tuple types in the upper box are for entity types. Tuple type AIRCRAFT
MODEL has only mandatory attributes, i.e., all attributes have a minimum cardinality of 1.
They also all have a maximum cardinality of 1.
Tuple type ITINERARY has only a few attributes. You are probably missing the (starting)
weekdays on which the itinerary is operated as described in the problem statement for
Come Aboard. A closer examination reveals that the weekdays on which itineraries are
operated are not inherent characteristics of itineraries. Rather, they are characteristics of
the legs for itineraries. The (starting) weekdays for an itinerary can be derived from the
(starting) weekdays for its legs.
Only a few attributes of tuple type MAINTENANCE RECORD are shown. Note that the
tuple type contains an attribute Aircraft Number expressing to which aircraft the
maintenance record belongs. In Unit 4 - Entity-Relationship Model, we determined that the
interrelationship between maintenance records and aircraft could not be expressed by
means of a relationship type. It had to be expressed by an attribute. This is reflected in the
tuple type.
The lower box on the visual contains tuple types for relationship types for Come Aboard. As
a principle, all attributes of tuple types for relationship types have a cardinality of [1..1].
AIRCRAFT_for_FLIGHT and MAINTENANCE RECORD_from_MECHANIC are tuple types
for 1:m relationship types. Accordingly, their relationship keys only consist of some of the
defining attributes. This is reflected by only some of the attributes of the tuple types
belonging to the primary keys.
Tuple type MECHANIC_scheduled_for_AIRCRAFT is for an m:m relationship type.
Therefore, all its attributes belong to the primary key.

A Special Consideration

• Resulting tuple type for entity type may just consist of primary key
• Check with application domain expert if entity type is really necessary
• Remove entity type from entity-relationship model if not required by application domain
    Change relationship types accordingly
    Change tuple types accordingly
• But be careful!!!
    As such, the pure existence of an entity type or the existence of a relationship type
    using it as source or target are information that you may lose

Figure 6-14. A Special Consideration CF182.0

Notes:
It is possible that the tuple type for an entity type just consists of the primary key. However,
it is pretty unusual. Therefore, you should discuss with the application domain expert if the
appropriate entity type is really necessary. When establishing the problem statement for
the application domain, the application domain expert might have thought that there would
be information of that type. However, the data inventory may not contain any data elements
and data groups for the entity type.
If the application domain expert agrees, remove the entity type from the entity-relationship
model, adjust the relationship types using the entity type as source or target accordingly,
and correct the tuple types.
However, you really should examine the case carefully. The mere existence of an entity type,
or its use by relationship types as source or target, already constitutes information that
you may lose by removing the entity type. An entity type represents a class of objects with
the same meaning and characteristics. Being an instance of that class identifies the
appropriate object as a member of the class even if there are no further characteristics to
be stored for the object.


6.2 Normalization of Tuple Types


Normalization - An Introduction

Established Tuple Types . . .
• Generally, cannot be converted one-to-one into tables
    Attributes can assume multiple values whereas columns cannot
• May contain redundant information
    May lead to inconsistent tuples
• May contain insert, update, and delete anomalies
    Information cannot be stored because of missing unrelated information
    Information may become inconsistent due to updates
    Information may be lost when a tuple is deleted

Normalization
• Improves condition of tuple types by raising their quality level
• Normal Forms define quality levels of tuple types
    Five Normal Forms: 1st Normal Form through 5th Normal Form
    Subsequent Normal Form based on previous Normal Form
    The higher the Normal Form the better the quality of the tuple type
• Only first three Normal Forms of practical relevance

Figure 6-15. Normalization - An Introduction CF182.0

Notes:
The tuple types established so far may have the following problems:
• They may have attributes with a maximum cardinality other than 1, i.e., have repeating
groups. A tuple type with repeating groups cannot immediately be converted into a table.
This is because the columns of tables can only accept a single value.
• Even within tuple types, redundant information may be stored. This may lead to
inconsistencies between the tuples of a tuple type if update operations do not change all
affected tuples.
• The tuple types may contain insert, update, and delete anomalies. These anomalies
may prevent the storage of information, cause inconsistent tuples, or result in the loss of
information.
Normalization remedies these deficiencies within, but not across tuple types. It improves
the condition of the tuple types by raising their quality level step-by-step.
There are five quality levels defined for tuple types by means of Normal Forms. These
Normal Forms are referred to as First Normal Form, Second Normal Form, and so on.

Each subsequent Normal Form requires that the previous Normal Form is satisfied
together with some additional conditions. Thus, the higher the Normal Form for a tuple
type, the better and more stable it is and the fewer of the above-mentioned problems may
occur.
Only the first three Normal Forms are of practical relevance. Hardly anyone verifies that
their tuple types satisfy the Fourth Normal Form, let alone the Fifth Normal Form. Both
Normal Forms deal with n-ary many-to-many relationship types and are more of a
theoretical nature. They are very complex, and violations are extremely hard to detect.
Normally, when establishing tuple types based on an entity-relationship model with only
binary relationship types, you should not have violations of the Fourth Normal Form or the
Fifth Normal Form. This assumes that you have dutifully identified your relationship types
and not hidden and combined them in artificial entity types.
Because of the limited practical value of the remaining Normal Forms, we will concentrate
on the first three Normal Forms. However, to illustrate how difficult it is to verify the higher
Normal Forms, we will address the Fourth Normal Form as well, but skip the Fifth Normal
Form.

First Normal Form - Definition

A tuple type is in the First Normal Form if all its attributes, elementary or composite,
can have at most one value

ITINERARY (In 1st Normal Form)
    Flight Number, PK
    Established On
    Effective From [0..1]
    Effective Until [0..1]

AIRCRAFT (Not in 1st Normal Form)
    Aircraft Number, PK
    Date Manufactured
    Seat [0..*]                 (Repeating Group)
        Seat Number
        Seat Location
        Seat Class
        Section
    Date in Service [0..1]
    ...

Figure 6-16. First Normal Form - Definition CF182.0

Notes:
The First Normal Form deals with repeating groups, that is, with attributes having a
maximum cardinality higher than 1 (where *, meaning unlimited, also counts as higher
than 1). Repeating groups are a problem when mapping tuple types to tables because the
columns of tables only allow a single value. Therefore, the First Normal Form requires that
all attributes, elementary or composite, have at most one value. An attribute is allowed to
have no value for some tuples.
Tuple type ITINERARY for our sample airline company called Come Aboard is in First
Normal Form. None of its attributes has a maximum cardinality higher than 1.
Tuple type AIRCRAFT violates the First Normal Form because it contains a repeating
group. Composite attribute Seat has a maximum cardinality of *. This means that an aircraft
can have many seats and that no upper limit has been established.
Since Seat is a composite attribute, its values are composed of values for its components.
Each value of Seat consists of a value for Seat Number, Seat Location, Seat Class, and
Section. Effectively, this means that these attributes assume multiple values as well,
namely, as many as the composite attribute.
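
The definition translates into a simple check, sketched here in Python. The functions repeating_groups and in_first_normal_form and the (name, maximum, components) representation are assumptions made for this example. The two sample tuple types are ITINERARY and AIRCRAFT as shown on the visual, limited to the attributes spelled out there.

```python
def repeating_groups(attributes, path=()):
    """Return the paths of all attributes whose maximum cardinality exceeds 1.

    attributes is a list of (name, maximum, components) triples. maximum
    is an int or "*" (no upper limit). Components are checked as well.
    """
    found = []
    for name, maximum, components in attributes:
        here = path + (name,)
        if maximum == "*" or maximum > 1:
            found.append(here)
        found.extend(repeating_groups(components, here))
    return found


def in_first_normal_form(attributes):
    return not repeating_groups(attributes)


itinerary = [
    ("Flight Number", 1, []),
    ("Established On", 1, []),
    ("Effective From", 1, []),
    ("Effective Until", 1, []),
]
aircraft = [
    ("Aircraft Number", 1, []),
    ("Date Manufactured", 1, []),
    ("Seat", "*", [
        ("Seat Number", 1, []),
        ("Seat Location", 1, []),
        ("Seat Class", 1, []),
        ("Section", 1, []),
    ]),
    ("Date in Service", 1, []),
]
print(in_first_normal_form(itinerary), in_first_normal_form(aircraft))
print(repeating_groups(aircraft))
```

Only the minimum cardinality is ignored by the check: an optional attribute such as Effective From does not violate the First Normal Form.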

First Normal Form - Solution

AIRCRAFT
    Aircraft Number, PK
    Date Manufactured
    Date in Service [0..1]
    ...

SEAT
    Aircraft Number, PK
    Seat                        (Cardinality [1..1])
        Seat Number, PK
        Seat Location
        Seat Class
        Section

• May need to separate out multiple attributes together
    With same maximum cardinality
    Depends on composite attributes
• Resulting tuple types must again be inspected for violations of First Normal Form

Figure 6-17. First Normal Form - Solution CF182.0

Notes:
You can solve the violation of the First Normal Form as follows:
• Remove attribute Seat from tuple type AIRCRAFT and create a new tuple type SEAT.
The new tuple type contains one tuple for each seat on every aircraft. Accordingly, the
cardinality of composite attribute Seat in the new tuple type is [1..1].
• To not lose the interconnection to aircraft, the new tuple type must contain, for each
seat, the serial number of the aircraft to which the seat belongs (attribute Aircraft
Number).
• None of the attributes alone can form the primary key for the new tuple type since none
uniquely identifies the tuples of the tuple type. Seat numbers are not unique across
aircraft. Different aircraft may have the same seat numbers. However, seat numbers are
unique per aircraft. Therefore, the primary key must consist of two attributes:
Aircraft Number and Seat Number
Sometimes, it is necessary to introduce an additional attribute (e.g., a sequence
number) to attain the unique identification of the tuples. Sometimes, it is desirable to
introduce an additional attribute which, together with other attributes, uniquely identifies
the tuples. However, remember that the primary key is used to reference the individual
tuples of a tuple type. Therefore, it should be as natural as possible. A time which,
together with other attributes, could be used to uniquely identify the tuples is not a good
component for a primary key. Who remembers the various times for the tuples?!
When creating the new tuple type, all logically related attributes with the same maximum
cardinality should be moved to the same tuple type. If the data groups for the composite
attributes of a tuple type were established properly, all logically related attributes should be
part of the same composite attribute. In case of our example, they all belong to composite
attribute Seat. The composite attribute is then the only one (in addition to the primary key of
the original tuple type) to be moved to the new tuple type.
If data groups have not been established at all, or have been established improperly, you
must determine during normalization which attributes logically belong together and should be
moved together. In other words, the data groups must be established in any case. Why not
establish them correctly from the start, i.e., when the data inventory is established?!
Repeating groups may be nested and should be resolved from outside in. Thus, a tuple
type resulting from normalization must be inspected again for violations of the First Normal
Form.
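
The resolution step for one repeating group can be sketched in Python as follows. This is a simplified illustration, not a complete normalization procedure: the function split_repeating_group and the dictionary layout are assumptions for this example, cardinalities are left out, and the additional key attributes must be chosen by the designer.

```python
def split_repeating_group(original, group, new_name, extra_key):
    """Split composite attribute `group` out of tuple type `original`.

    A tuple type is a dict with "name", "pk" (list of attribute names),
    and "attributes" (dict: name -> list of component names, empty for
    elementary attributes). Returns (reduced original, new tuple type).
    """
    remaining = {n: c for n, c in original["attributes"].items() if n != group}
    reduced = {"name": original["name"], "pk": list(original["pk"]),
               "attributes": remaining}
    # The new tuple type carries the original primary key, so the link
    # to the owning tuple is not lost, plus the composite attribute,
    # which has cardinality [1..1] in the new tuple type.
    new_attributes = {k: [] for k in original["pk"]}
    new_attributes[group] = list(original["attributes"][group])
    new = {"name": new_name,
           "pk": list(original["pk"]) + list(extra_key),
           "attributes": new_attributes}
    return reduced, new


aircraft = {
    "name": "AIRCRAFT",
    "pk": ["Aircraft Number"],
    "attributes": {
        "Aircraft Number": [],
        "Date Manufactured": [],
        "Seat": ["Seat Number", "Seat Location", "Seat Class", "Section"],
        "Date in Service": [],
    },
}
reduced, seat = split_repeating_group(aircraft, "Seat", "SEAT", ["Seat Number"])
print(reduced["attributes"].keys())
print(seat["pk"], seat["attributes"])
```

Because repeating groups may be nested, the new tuple type would have to be inspected again, for example by applying the same step to it.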

First Normal Form - Instance Example

BEFORE

AIRCRAFT
                            |----------------- Seat -----------------|
Aircraft    Date            Seat    Seat      Seat      Section      Date in
Number      Manufactured    Number  Location  Class                  Service
B474001323  1994-10-12      1A      WINDOW    FIRST     N/SMOKING    1997-01-01
                            1B      MIDDLE    FIRST     N/SMOKING
                            1C      AISLE     FIRST     N/SMOKING
                            ...     ...       ...       ...
                            46J     WINDOW    ECONOMY   SMOKING
B171004217  1999-10-23      1A      WINDOW    BUSINESS  N/SMOKING    1999-11-15
                            1B      AISLE     BUSINESS  N/SMOKING
                            ...     ...       ...       ...
                            28G     WINDOW    ECONOMY   N/SMOKING

AFTER

AIRCRAFT
Aircraft    Date            Date in
Number      Manufactured    Service
B474001323  1994-10-12      1997-01-01
B171004217  1999-10-23      1999-11-15

SEAT
            |----------------- Seat -----------------|
Aircraft    Seat    Seat      Seat      Section
Number      Number  Location  Class
B474001323  1A      WINDOW    FIRST     N/SMOKING
B474001323  1B      MIDDLE    FIRST     N/SMOKING
B474001323  1C      AISLE     FIRST     N/SMOKING
...         ...     ...       ...       ...
B474001323  46J     WINDOW    ECONOMY   SMOKING
B171004217  1A      WINDOW    BUSINESS  N/SMOKING
B171004217  1B      AISLE     BUSINESS  N/SMOKING
...         ...     ...       ...       ...
B171004217  28G     WINDOW    ECONOMY   N/SMOKING

Figure 6-18. First Normal Form - Instance Example CF182.0

Notes:
This visual uses an instance example for the tuple types considered on the previous
visuals. The tuple types are represented as tables to illustrate some sample tuples for
them.
The top portion of the visual illustrates tuple type AIRCRAFT before normalization. For both
tuples shown, composite attribute Seat, and, thus, its components Seat Number, Seat
Location, Seat Class, and Section, assume many values. The component values in a line
belong together. They form the components of the appropriate value for the composite
attribute. (As you may correctly conclude from the visual, the components of a composite
attribute become separate columns in the tables of the relational database management
system.)
The bottom half of the visual illustrates the situation after normalization. Tuple type
AIRCRAFT no longer contains any seat information. The seat information is contained in
tuple type SEAT. For each seat on an aircraft, SEAT contains one tuple. Aircraft Number
identifies to which aircraft the seat belongs.
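The step from the top to the bottom half of the visual can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the key names, and the function name to_first_normal_form are hypothetical and not part of the course material; the sample values are a subset of those on the visual.

```python
# Hypothetical in-memory form of two unnormalized AIRCRAFT tuples.
# The "seat" key holds the repeating group (composite attribute Seat).
aircraft_unnormalized = [
    {
        "aircraft_number": "B474001323",
        "date_manufactured": "1994-10-12",
        "date_in_service": "1997-01-01",
        "seat": [
            {"seat_number": "1A", "seat_location": "WINDOW",
             "seat_class": "FIRST", "section": "N/SMOKING"},
            {"seat_number": "1B", "seat_location": "MIDDLE",
             "seat_class": "FIRST", "section": "N/SMOKING"},
        ],
    },
    {
        "aircraft_number": "B171004217",
        "date_manufactured": "1999-10-23",
        "date_in_service": "1999-11-15",
        "seat": [
            {"seat_number": "1A", "seat_location": "WINDOW",
             "seat_class": "BUSINESS", "section": "N/SMOKING"},
        ],
    },
]


def to_first_normal_form(rows, group="seat", key="aircraft_number"):
    """Split each row into a parent tuple and one child tuple per group value."""
    parents, children = [], []
    for row in rows:
        # The parent keeps everything except the repeating group.
        parents.append({k: v for k, v in row.items() if k != group})
        for value in row[group]:
            # The child repeats the parent's primary key, then its own components.
            children.append({key: row[key], **value})
    return parents, children


aircraft, seat = to_first_normal_form(aircraft_unnormalized)
```

After the call, aircraft holds one tuple per aircraft without seat information, and seat holds one tuple per seat, each carrying the aircraft number of the aircraft it belongs to.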

First Normal Form - ER Model Correction


[Visual: entity-relationship model for Come Aboard with entity types EMPLOYEE, MECHANIC,
PILOT, AIRCRAFT TYPE, AIRCRAFT MODEL, AIRPORT, AIRCRAFT, PILOT ASSIGNMENT,
MAINTENANCE RECORD, ITINERARY, LEG, and FLIGHT. New in this visual: dependent entity type
SEAT, attached to AIRCRAFT through an owning relationship type with target cardinality m.]

Figure 6-19. First Normal Form - ER Model Correction CF182.0

Notes:
The fact that a new tuple type has been created to achieve First Normal Form should be
reflected in the entity-relationship model. In case of our example, this means that a
dependent entity type SEAT for entity type AIRCRAFT must be introduced together with the
associated owning relationship type AIRCRAFT_has_SEAT. The entity type is indeed a
dependent entity type:
• The key of entity type AIRCRAFT is part of the entity key for SEAT.
• Instances with matching key/key portion values, and only those, are interconnected.
The target cardinality for the owning relationship type is m (0..m) because the cardinality for
composite attribute Seat was [0..*] in the original tuple type. This means that there are
aircraft without seats (cargo planes). If necessary, go back to the application domain expert
to verify the cardinality.
The problem statement for the application domain should be updated as well (by the
application domain expert).

First Normal Form - 2nd Example (1 of 2)
Tuple type with repeating group:

AIRCRAFT
   Aircraft Number, PK
   Date Manufactured
   Date in Service [0..1]
   Engine [0..4]                  (repeating group)
      Engine Number
      Engine Type
      Manufacturer
      Engine Position
   ...

Repeating group replaced by four composite attributes:

AIRCRAFT
   Aircraft Number, PK
   Date Manufactured
   Date in Service [0..1]
   Engine 1 [0..1]
      Engine Number
      Engine Type
      Manufacturer
      Engine Position
   Engine 2 [0..1]
      Engine Number
      Engine Type
      Manufacturer
      Engine Position
   Engine 3 [0..1]
      Engine Number
      Engine Type
      Manufacturer
      Engine Position
   Engine 4 [0..1]
      Engine Number
      Engine Type
      Manufacturer
      Engine Position
   ...

Are you really sure that this is the solution???
   Can you control that there will never be more than four engines?
   What about engines not mounted on aircraft?
Go back to the application domain expert and ...
Figure 6-20. First Normal Form - 2nd Example (1 of 2) CF182.0

Notes:
For the Seat example considered so far, there would have been another, though unattractive,
solution: You could have introduced a separate tuple in tuple type AIRCRAFT for each value of
composite attribute Seat, repeating the corresponding values of the other attributes.
However, this would have created a lot of redundancy and endangered the consistency of the
tuples, because an update operation might not change all related tuples. Thus, it is
not really a solution to be considered.
This visual discusses another possible solution for repeating groups with a low fixed
maximum cardinality. Look at the example on the visual. Tuple type AIRCRAFT has
another repeating group, namely, the engines belonging to the aircraft. In this repeating
group, Manufacturer is again a composite attribute. Its components have not been listed
since they are not relevant for the present discussion.
In contrast to the previous example, composite attribute Engine has a low fixed maximum
cardinality. Its maximum cardinality is four.

To abolish the repeating group, you could replace Engine by four composite attributes
Engine 1, Engine 2, Engine 3, and Engine 4. All of these would have the same components
as Engine, but a cardinality of [0..1].
Formally, the violation of the First Normal Form has vanished. However, you should ask
yourself if that is really the solution that you want because it has serious limitations and
drawbacks:
• Are you really sure that the maximum cardinality will not increase over time? Is it really
under your control that the maximum cardinality will not increase or can somebody else
just change the rules on you? If the maximum cardinality increases, you need additional
attributes reflecting the cardinality increase. This will cause changes in your queries and,
especially, your programs because they will handle the various engines individually.
In contrast, if you have a new tuple type with one tuple for each engine of an aircraft, you
can use loop processing. If the proper end-of-data conditions are tested, processing can
be independent of the number of engines mounted and the maximum number of
engines for an aircraft.
• Another question to consider for this solution (as well as for the original tuple type) is:
What happens with engines not mounted on an aircraft? Do you not keep the referenced
information for them as well? As the entity-relationship model and the tuple type for
Come Aboard stand right now, you would not know where to keep information about
engines not mounted.
The case on the visual reveals a problem with the conceptual view of your database
design, especially with the entity-relationship model. You should go back to the application
domain expert and ask whether the engine information must also be kept for engines that are
not mounted. If so, you should solve the violation of the First Normal Form by first correcting
your entity-relationship model and then changing your tuple types accordingly. This is
illustrated on the next visual.
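The difference between handling the engines individually and using loop processing can be sketched as follows. This is a minimal Python illustration, not taken from the course material: the names wide_aircraft, engine_tuples, mounted_engines_wide, and mounted_engines_long are hypothetical, and the engine numbers are invented sample values.

```python
# Alternative 1: one tuple per aircraft with fixed slots Engine 1 .. Engine 4.
wide_aircraft = {
    "aircraft_number": "B171004217",
    "engine_1_number": "E2001",
    "engine_2_number": "E2002",
    "engine_3_number": None,
    "engine_4_number": None,
}


def mounted_engines_wide(row):
    """Every slot is named individually; a fifth engine needs a code change."""
    slots = ("engine_1_number", "engine_2_number",
             "engine_3_number", "engine_4_number")
    return [row[s] for s in slots if row[s] is not None]


# Alternative 2: one tuple per mounted engine: (engine_number, aircraft_number).
engine_tuples = [
    ("E1001", "B474001323"), ("E1002", "B474001323"),
    ("E1003", "B474001323"), ("E1004", "B474001323"),
    ("E2001", "B171004217"), ("E2002", "B171004217"),
]


def mounted_engines_long(aircraft_number, tuples):
    """Loop until end of data; works for any number of engines."""
    engines = []
    for engine_number, on_aircraft in tuples:
        if on_aircraft == aircraft_number:
            engines.append(engine_number)
    return engines
```

Both functions return the same engines for aircraft B171004217, but only the second one keeps working unchanged if the maximum number of engines per aircraft ever increases.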

First Normal Form - 2nd Example (2 of 2)
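... get your ER model in order!!!

Entity-relationship model:
   AIRCRAFT (K Aircraft Number, ...)   1 ---- _on_ ---- m   ENGINE (K Engine Number, ...)
   ENGINE LOCATION (K Engine Number, Engine Position):
      dependent entity type of the _on_ relationship type,
      attached through owning relationship type _in_ (DC, 1..1)

Tuple types for the entity types and the relationship type:

AIRCRAFT
   Aircraft Number, PK
   Date Manufactured
   Date in Service [0..1]
   ...

ENGINE
   Engine Number, PK
   Engine Type
   Manufacturer

ENGINE LOCATION
   Engine Number, PK
   Engine Position

ENGINE_on_AIRCRAFT
   Engine Number, PK
   Aircraft Number

ENGINE LOCATION and ENGINE_on_AIRCRAFT combined:

ENGINE_on_AIRCRAFT
   Engine Number, PK
   Aircraft Number
   Engine Position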

Figure 6-21. First Normal Form - 2nd Example (2 of 2) CF182.0

Notes:
In case of our example, the application domain expert has confirmed that information about
engines is also required for engines not mounted on aircraft. Consequently, the engines
represent an independent conceptual unit, a class of objects with the same meaning and
characteristics. Therefore, they must be represented by an entity type in the
entity-relationship model. Accordingly, the entity-relationship model for Come Aboard is
incomplete. It should be corrected before the tuple types are corrected:
• An entity type ENGINE is introduced containing elementary attributes Engine Number
and Engine Type and composite attribute Manufacturer.
Since the serial numbers for engines are unique across engine manufacturers, Engine
Number becomes the entity key for ENGINE.
• In addition to the entity type, a relationship type ENGINE_on_AIRCRAFT must be
introduced specifying which engines are mounted on the individual aircraft.
• You may wonder why attribute Engine Position has not been added to entity type
ENGINE. Engine Position specifies in which position the appropriate engine is mounted
on an aircraft. The engine position is not a characteristic of the engine as such, but
rather a characteristic of the relationship linking the engine to an aircraft. Accordingly,
Engine Position is a nondefining attribute of relationship type ENGINE_on_AIRCRAFT.
As described in Unit 4 - Entity-Relationship Model, dependent entity types are used to
model the nondefining attributes of relationship types. Therefore, dependent entity type
ENGINE LOCATION is introduced containing attribute Engine Position. Its parent is
relationship type ENGINE_on_AIRCRAFT and its owning relationship type is
ENGINE_on_AIRCRAFT_in_ENGINE POSITION.
The target cardinality of the owning relationship type is 1..1 because each mounted
engine must be in one and only one position of the aircraft. The entity key of ENGINE
LOCATION is Engine Number, the relationship key of ENGINE_on_AIRCRAFT. The
cascading property for the target of the owning relationship type expresses the fact that
the engine position is to be deleted when the engine is taken off the aircraft.
After we have corrected the entity-relationship model, we can establish the corresponding
tuple types:
• We need tuple types for the three entity types, i.e., for AIRCRAFT, ENGINE, and
ENGINE LOCATION. The tuple type for AIRCRAFT no longer contains engine
information. The tuple type for ENGINE contains only the really engine-specific
information. Tuple type ENGINE LOCATION describes, for mounted engines, on which
engine position they are mounted. It does not specify on which aircraft the engine is
mounted.
• We need a tuple type for relationship type ENGINE_on_AIRCRAFT. The tuple type
contains the engine number of the mounted engine and the aircraft number of the
aircraft on which the engine is mounted.
• Since ENGINE_on_AIRCRAFT_in_ENGINE POSITION is an owning relationship type,
we must not have a tuple type for it.
Tuple types ENGINE LOCATION and ENGINE_on_AIRCRAFT have the same primary key.
Since every tuple of ENGINE LOCATION has a corresponding tuple with the same primary
key value in ENGINE_on_AIRCRAFT and vice versa, the two tuple types can be combined.
The resulting tuple type is again called ENGINE_on_AIRCRAFT. We will not further discuss
here when tuple types can be combined. We will leave this to the next unit.
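As a rough illustration of the combination just described, the following Python sketch merges two tuple types that share the same primary key. It is a sketch under assumptions only: the dictionaries keyed by Engine Number, the function name combine, the engine numbers, and the position values are all hypothetical.

```python
# Both tuple types are keyed by their common primary key, Engine Number.
engine_on_aircraft = {
    "E1001": {"aircraft_number": "B474001323"},
    "E1002": {"aircraft_number": "B474001323"},
    "E2001": {"aircraft_number": "B171004217"},
}
engine_location = {
    "E1001": {"engine_position": 1},
    "E1002": {"engine_position": 2},
    "E2001": {"engine_position": 1},
}


def combine(left, right):
    """Merge two tuple types whose primary key values match one to one."""
    if left.keys() != right.keys():
        # Without a one-to-one match, combining would lose or invent tuples.
        raise ValueError("primary key values do not match one to one")
    return {key: {**left[key], **right[key]} for key in left}


combined = combine(engine_on_aircraft, engine_location)
```

The check on the key sets mirrors the condition stated above: every tuple of one tuple type must have a corresponding tuple in the other, and vice versa.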

Second Normal Form - Definition
A tuple type is in the Second Normal Form if:
• It is in the First Normal Form
• All its elementary nonkey attributes are functionally dependent on the entire primary key

FLIGHT (in 2nd Normal Form)
   Flight Number, PK
   Airport Code AS From, PK
   Airport Code AS To, PK
   Flight Locator, PK
   Departure AS Planned Departure
      Departure Date
      Departure Time
   Arrival AS Planned Arrival
      Arrival Date
      Arrival Time
   Departure AS Actual Departure [0..1]
      Departure Date
      Departure Time
   Arrival AS Actual Arrival [0..1]
      Arrival Date
      Arrival Time

LEG (not in 2nd Normal Form)
   Flight Number, PK
   Airport Code AS From, PK
   Airport Code AS To, PK
   Leg Number
   Mileage Credit                 (only dependent on From and To)
   ...
Figure 6-22. Second Normal Form - Definition CF182.0

Notes:
Basically, the Second Normal Form deals with the improper assignment of attributes to
tuple types. It applies to tuple types whose primary keys consist of more than one
elementary attribute.
A tuple type is in the Second Normal Form if:
• It is in First Normal Form.
• All its elementary nonkey attributes are functionally dependent on the entire primary
key, i.e., on all attributes belonging to the primary key.
As mentioned before, if a composite attribute belongs to the primary key, all its
components belong to the primary key. Thus, the functional dependence must be on all
components of the composite attribute.
Similarly, all elementary components of a composite attribute must be functionally
dependent on the entire primary key for the Second Normal Form to be satisfied.
The primary key of tuple type FLIGHT for our sample airline company consists of four
attributes. All elementary nonkey attributes of the tuple type are functionally dependent on
the entire primary key, i.e., on all four attributes. Therefore, tuple type FLIGHT is in Second
Normal Form.
The primary key of tuple type LEG consists of the three attributes Flight Number, From and
To. From and To identify the airport of departure and the airport of arrival for the leg of the
considered flight. Leg Number depends on all attributes of the primary key: the same
nonstop connection may appear in a different itinerary (flight number) as a different leg.
In contrast, attribute Mileage Credit, i.e., the miles credited for the leg on frequent-flyer
accounts, does not depend on Flight Number. It only depends on the airport of
departure and the airport of arrival, i.e., on From and To. Thus, the tuple type violates the
Second Normal Form.

Second Normal Form - Solution
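Entity-relationship model:
   AIRPORT (From, m) _nonstop_to_ AIRPORT (To, m)
   AIRPORT_nonstop_to_AIRPORT _in_ NONSTOP CONNECTION   (dependent entity type, DC, 1..1)
   NONSTOP CONNECTION _in_ ITINERARY                     (cardinalities 1..m and m)
   NONSTOP CONNECTION_in_ITINERARY _as_ LEG              (dependent entity type, DC, 1..1)

Tuple types for AIRPORT and ITINERARY unchanged

NONSTOP CONNECTION
   Airport Code AS From, PK
   Airport Code AS To, PK
   Mileage Credit

LEG
   Flight Number, PK
   Airport Code AS From, PK
   Airport Code AS To, PK
   Leg Number
   ...

No tuple types for any of the relationship types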

Figure 6-23. Second Normal Form - Solution CF182.0

Notes:
As mentioned before, the Second Normal Form deals with attributes assigned to the wrong
tuple type. Attribute Mileage Credit in our example should not have been assigned to tuple
type LEG.
To determine the proper tuple type, you should consult the entity-relationship model for the
application domain. There are two possibilities:
• The entity-relationship model contains the entity type to which the improperly assigned
attribute really belongs. In this case, add the attribute to the tuple type for the entity type.
• The entity-relationship model is incomplete since it does not contain the proper entity
type for the attribute. In this case, correct the entity-relationship model and reestablish
the tuple types concerned based on the corrected entity-relationship model.
In case of our example, the entity-relationship model is missing the proper entity type for
attribute Mileage Credit. As a matter of fact, Mileage Credit is rather a nondefining attribute
for nonstop connections, i.e., for relationship type AIRPORT_nonstop_to_AIRPORT. Thus,
it is modeled as a dependent entity type for that relationship type as illustrated on the
visual. The dependent entity type is called NONSTOP CONNECTION.
The cardinality of 1..1 for the target of the owning relationship type requires the mileage
credit to be provided when the nonstop connection is established.
Having introduced dependent entity type NONSTOP CONNECTION, the relationship type
specifying the nonstop connections for the various itineraries can now interconnect entity
types NONSTOP CONNECTION and ITINERARY. It need no longer interconnect
relationship type AIRPORT_nonstop_to_AIRPORT and entity type ITINERARY. The new
relationship type is called NONSTOP CONNECTION_in_ITINERARY. As a consequence,
dependent entity type LEG must now be based on this relationship type.
Of course, the problem statement for the application domain and the data inventory should
be updated accordingly by the application domain expert and the database designer.
After we have corrected the entity-relationship model, we can reestablish the tuple types
for the entity types and relationship types concerned:
• The tuple types for entity types AIRPORT and ITINERARY remain unchanged.
• The new tuple type NONSTOP CONNECTION contains attribute Mileage Credit and the
key of the dependent entity type, i.e., the attributes From and To.
• The tuple type for entity type LEG no longer contains attribute Mileage Credit.
• Tuple types must not be provided for any of the relationship types on the visual for the
following reasons:
- Relationship types AIRPORT_nonstop_to_AIRPORT and
NONSTOP CONNECTION_in_ITINERARY are m:m relationship types being the
source of other relationship types with a minimum target cardinality of 1 (see
page 6-14).
- Relationship types AIRPORT_nonstop_to_AIRPORT_in_NONSTOP CONNECTION
and NONSTOP CONNECTION_in_ITINERARY_as_LEG are owning relationship
types (see page 6-13).
During the establishment of the entity-relationship model for Come Aboard, we already
resolved another violation of the Second Normal Form. The attributes of entity type
AIRCRAFT TYPE originally belonged to entity type AIRCRAFT MODEL which represented
a violation of the Second Normal Form.
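The effect of the corrected tuple types on the data can be sketched in Python. The sketch is illustrative only: the function names split_leg and rejoin, the flight numbers, the airport codes, and the mileage values are hypothetical sample data, not taken from the course material.

```python
# Unnormalized LEG tuples:
# (flight_number, from_airport, to_airport, leg_number, mileage_credit)
leg_unnormalized = [
    ("CA100", "FRA", "JFK", 1, 3850),
    ("CA100", "JFK", "SFO", 2, 2580),
    ("CA200", "FRA", "JFK", 1, 3850),  # same nonstop connection, other itinerary
]


def split_leg(rows):
    """Move Mileage Credit to a tuple type keyed by (From, To) only."""
    nonstop_connection = {}
    leg = []
    for flight, frm, to, leg_number, miles in rows:
        # Keep one mileage credit per nonstop connection and check consistency.
        stored = nonstop_connection.setdefault((frm, to), miles)
        if stored != miles:
            raise ValueError(f"conflicting mileage credit for {frm}-{to}")
        leg.append((flight, frm, to, leg_number))
    return nonstop_connection, leg


def rejoin(nonstop_connection, leg):
    """Rebuild the original tuples by looking up the mileage credit."""
    return [(f, a, b, n, nonstop_connection[(a, b)]) for f, a, b, n in leg]


nonstop_connection, leg = split_leg(leg_unnormalized)
```

The mileage credit for FRA to JFK is now stored once instead of once per itinerary, and rejoin shows that no information has been lost by the split.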

Third Normal Form - Definition
A tuple type is in the Third Normal Form if:
• It is in the Second Normal Form
• None of its elementary nonkey attributes is functionally dependent on other nonkey attributes

AIRCRAFT MODEL (in 3rd Normal Form)
   Type Code, PK
   Model Number, PK
   Dimensions
      Length
      Height
      Wing Span
   Weights
      Net Weight
      Maximum Weight
   Cruising Speed

AIRCRAFT TYPE (not in 3rd Normal Form)
   Type Code, PK
   Category
   Manufacturer
      Manufacturer Code
      Company Name                (functionally dependent on Manufacturer Code)
      Address                     (functionally dependent on Manufacturer Code)
         Street [0..1]
         Post Office Box [0..1]
         City
         State [0..1]
         Country
         Postal Code [0..1]
      Phone Number                (functionally dependent on Manufacturer Code)
   Number of Engines
Figure 6-24. Third Normal Form - Definition CF182.0

Notes:
The Third Normal Form requires that a tuple type is in Second Normal Form and none of its
elementary nonkey attributes is functionally dependent on other nonkey attributes.
If attribute-1 and attribute-2 are attributes of a tuple type, attribute-2 is functionally
dependent on attribute-1 if, for each occurrence of a value of attribute-1, attribute-2
assumes the same value. For different values of attribute-1, attribute-2 may assume
different values. However, for the same value of attribute-1, it must always assume the
same value. Functional dependence may not just exist on a single elementary attribute; it
can also exist on a composite attribute, meaning dependence on all components, or on a
set of attributes.
For the Third Normal Form, functional independence is not only required for the direct
elementary attributes of the tuple type, but for all components of composite attributes. This
means, it is required for all elementary attributes of the tree structure for the tuple type.
Furthermore, there must not be a functional dependence on components of composite
attributes.
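The definition of functional dependence given above can be tested against sample tuples with a few lines of Python. This is a sketch under assumptions: the function name is_functionally_dependent and the dictionary layout are hypothetical, and the rows are illustrative sample values. Note that sample tuples can only refute a functional dependence; whether it really holds is a property of the application domain and must be confirmed by the application domain expert.

```python
def is_functionally_dependent(rows, determinant, dependent):
    """Return False if two rows agree on determinant but differ on dependent."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in determinant)
        value = tuple(row[name] for name in dependent)
        # Remember the first value seen for this key; later rows must match it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Illustrative sample tuples for an unnormalized AIRCRAFT TYPE.
rows = [
    {"type_code": "B747", "manufacturer_code": "BOEING",
     "company_name": "BOEING CORPORATION", "number_of_engines": 4},
    {"type_code": "A310", "manufacturer_code": "AIRBUS",
     "company_name": "AIRBUS INDUSTRIES", "number_of_engines": 2},
    {"type_code": "A340", "manufacturer_code": "AIRBUS",
     "company_name": "AIRBUS INDUSTRIES", "number_of_engines": 4},
    {"type_code": "B737", "manufacturer_code": "BOEING",
     "company_name": "BOEING CORPORATION", "number_of_engines": 2},
]

name_depends_on_code = is_functionally_dependent(
    rows, ["manufacturer_code"], ["company_name"])
engines_depend_on_code = is_functionally_dependent(
    rows, ["manufacturer_code"], ["number_of_engines"])
```

For these rows, the company name is consistent with a dependence on the manufacturer code, whereas the number of engines is not: the same manufacturer code occurs with both 2 and 4 engines.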

Tuple type AIRCRAFT MODEL on the visual is in Third Normal Form because none of its
elementary nonkey attributes is dependent on other nonkey attributes. The dimensions,
weights, and the cruising speed are all functionally independent of each other. For the
same dimensions, different weights and cruising speeds may apply and vice versa.
In tuple type AIRCRAFT TYPE, elementary attributes Company Name, Phone Number,
and all components of composite attribute Address are functionally dependent on attribute
Manufacturer Code. Thus, tuple type AIRCRAFT TYPE is not in Third Normal Form.
Violations of the Third Normal Form can lead to inconsistent tuples as a consequence of
update operations changing only some of the tuples with the same dependent values. They
may also lead to the loss of the dependent information if the last tuple for a value is deleted.
If the data groups for the composite attributes of a tuple type were established properly, all
related functional dependences should be within the same composite attribute. For our
example, they are all in composite attribute Manufacturer. Thus, the usage of properly
created composite attributes can ease your task of determining functional dependences. If
you have not formed data groups/composite attributes or have not established them
correctly, functional dependences may exist across composite attributes.

Third Normal Form - Solution

AIRCRAFT TYPE
   Type Code, PK
   Category
   Manufacturer Code
   Number of Engines

MANUFACTURER
   Manufacturer Code, PK
   Company Name
   Address
      Street [0..1]
      Post Office Box [0..1]
      City
      State [0..1]
      Country
      Postal Code [0..1]
   Phone Number

Figure 6-25. Third Normal Form - Solution CF182.0

Notes:
To solve a violation of the Third Normal Form, you must move all attributes that are
functionally dependent on the same set of attributes to a new tuple type. The attributes on
which the moved attributes depended are repeated in the new tuple type. They become the
primary key of the new tuple type.
In case of our example, the attributes Company Name, Phone Number and all components
of Address are removed from tuple type AIRCRAFT TYPE. They become attributes of a
new tuple type MANUFACTURER. Attribute Manufacturer Code remains in tuple type
AIRCRAFT TYPE, but is repeated in MANUFACTURER. It becomes the primary key of
MANUFACTURER. In this way, the association between aircraft types and manufacturers is
maintained.
If the composite attributes for a tuple type have been formed correctly, the resolution of a
Third Normal Form violation incorporates the following:
• A new tuple type is created for the entire composite attribute having the functional
dependences.

• The primary key of the new tuple type is repeated (remains) in the original tuple type.
For our sample tuple type, the composite attributes have been formed correctly.
Accordingly, a new tuple type MANUFACTURER has been created for composite attribute
Manufacturer and the primary key of that tuple type is repeated in the original tuple type.

Third Normal Form - Instance Example
BEFORE

AIRCRAFT TYPE
      |----------------------- Manufacturer -----------------------|
Type  Manufacturer  Company Name        City      State  Country    Number of
Code  Code                                                          Engines
B747  BOEING        BOEING CORPORATION  SEATTLE   WA     USA        4
A310  AIRBUS        AIRBUS INDUSTRIES   TOULOUSE         FRANCE     2
A340  AIRBUS        AIRBUS INDUSTRIES   TOULOUSE         FRANCE     4
B737  BOEING        BOEING CORPORATION  SEATTLE   WA     USA        2
A319  AIRBUS        AIRBUS INDUSTRIES   TOULOUSE         FRANCE     2
B777  BOEING        BOEING CORPORATION  SEATTLE   WA     USA        2

AFTER

AIRCRAFT TYPE
Type  Manufacturer  Number of
Code  Code          Engines
B747  BOEING        4
A310  AIRBUS        2
A340  AIRBUS        4
B737  BOEING        2
A319  AIRBUS        2
B777  BOEING        2

MANUFACTURER
Manufacturer  Company Name        City      State  Country
Code
BOEING        BOEING CORPORATION  SEATTLE   WA     USA
AIRBUS        AIRBUS INDUSTRIES   TOULOUSE         FRANCE
Figure 6-26. Third Normal Form - Instance Example CF182.0

Notes:
This visual gives an instance example for the tuple types of the previous visuals. However,
because of the limited size of the visual, some attributes have been omitted: Category,
Street, Post Office Box, Postal Code, and Phone Number are not shown. The tuple types
have been presented in form of tables to show multiple instances for them.
The top portion of the visual illustrates tuple type AIRCRAFT TYPE before normalization.
The information for a manufacturer is (must be) repeated for each aircraft type produced by
that manufacturer. As you can envisage, this leads to inconsistent information if only some
of the tuples for a manufacturer are updated when the manufacturer information changes.
The bottom half of the visual illustrates the situation after normalization. Tuple type
AIRCRAFT TYPE now only contains the manufacturer code and no longer the information
functionally dependent on it. The information for a manufacturer is contained in tuple type
MANUFACTURER. Tuple type MANUFACTURER contains one tuple for every
manufacturer.
The new tuple type allows Come Aboard to store information about manufacturers without
having aircraft types from them. This was not possible before normalization.
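The normalized tuple types of this example could be tried out with Python's built-in sqlite3 module as sketched below. This is an illustration under assumptions, not the course's implementation: the table and column names, the replacement company name, and manufacturer NEWCO are hypothetical; the remaining values are those shown on the visual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.execute("""
    CREATE TABLE manufacturer (
        manufacturer_code TEXT PRIMARY KEY,
        company_name      TEXT NOT NULL,
        city              TEXT NOT NULL,
        state             TEXT,
        country           TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE aircraft_type (
        type_code         TEXT PRIMARY KEY,
        manufacturer_code TEXT NOT NULL REFERENCES manufacturer (manufacturer_code),
        number_of_engines INTEGER NOT NULL
    )""")
conn.executemany("INSERT INTO manufacturer VALUES (?, ?, ?, ?, ?)", [
    ("BOEING", "BOEING CORPORATION", "SEATTLE", "WA", "USA"),
    ("AIRBUS", "AIRBUS INDUSTRIES", "TOULOUSE", None, "FRANCE"),
    # Hypothetical manufacturer without any aircraft type (possible only after 3NF).
    ("NEWCO", "HYPOTHETICAL AIRCRAFT CO", "ANYTOWN", None, "NOWHERE"),
])
conn.executemany("INSERT INTO aircraft_type VALUES (?, ?, ?)", [
    ("B747", "BOEING", 4), ("A310", "AIRBUS", 2), ("A340", "AIRBUS", 4),
    ("B737", "BOEING", 2), ("A319", "AIRBUS", 2), ("B777", "BOEING", 2),
])

# A single UPDATE changes the name for every Airbus aircraft type at once.
conn.execute(
    "UPDATE manufacturer SET company_name = ? WHERE manufacturer_code = ?",
    ("AIRBUS RENAMED", "AIRBUS"))

# The join reconstructs the wide, pre-normalization view of the Airbus types.
airbus_rows = conn.execute("""
    SELECT t.type_code, m.company_name
    FROM aircraft_type AS t
    JOIN manufacturer AS m ON m.manufacturer_code = t.manufacturer_code
    WHERE m.manufacturer_code = 'AIRBUS'
    ORDER BY t.type_code""").fetchall()
```

Because the company name is stored only once, the three Airbus aircraft types cannot end up with different manufacturer names, which is exactly the inconsistency described for the unnormalized tuple type.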

Third Normal Form - ER Model Correction

Entity-relationship model:
   AIRCRAFT TYPE   m ---- _from_ ---- 1..1   MANUFACTURER

Tuple types for the entity types and the relationship type:

AIRCRAFT TYPE
   Type Code, PK
   Category
   Number of Engines

MANUFACTURER
   Manufacturer Code, PK
   Company Name
   Address
      Street [0..1]
      Post Office Box [0..1]
      City
      State [0..1]
      Country
      Postal Code [0..1]
   Phone Number

AIRCRAFT TYPE_from_MANUFACTURER
   Type Code, PK
   Manufacturer Code

AIRCRAFT TYPE and AIRCRAFT TYPE_from_MANUFACTURER combined:

AIRCRAFT TYPE
   Type Code, PK
   Category
   Manufacturer Code
   Number of Engines

Figure 6-27. Third Normal Form - ER Model Correction CF182.0

Notes:
The fact that a new tuple type has been created to comply with the Third Normal Form
should be reflected in the entity-relationship model. The creation of a new tuple type really
means that the appropriate information has become an independent conceptual unit
representing a class of objects with the same meaning and characteristics. Thus, it means
that the entity-relationship model should contain a new entity type.
Since the new tuple type has an association with the old tuple type, the new entity type
must have a relationship type with the entity type (or relationship type) for the old tuple
type.
In case of our example, the entity-relationship model must be extended by a new entity
type (MANUFACTURER) and a new relationship type between entity types AIRCRAFT
TYPE and MANUFACTURER. The relationship type is called AIRCRAFT
TYPE_from_MANUFACTURER. The relationship type is a 1:m relationship type: An
aircraft type can be from one and only one manufacturer, but a manufacturer may
manufacture multiple aircraft types. Accordingly, the key of the relationship type is Type
Code, the key of entity type AIRCRAFT TYPE.

The source cardinality of m (0..m) indicates that Come Aboard wants to keep information
about manufacturers even if it does not own one of their aircraft types. However, you must
verify this with the application domain expert. The unnormalized tuple type for AIRCRAFT
TYPE would not have allowed you to store information about manufacturers without an
aircraft type. This was a further reason for resolving the violation of the Third Normal Form.
As a matter of principle, you should correct the entity-relationship model first and then
reestablish the tuple types based on the corrected entity-relationship model. When
reestablishing the tuple types based on the corrected entity-relationship model, you get
tuple types for entity types AIRCRAFT TYPE and MANUFACTURER and for relationship
type AIRCRAFT TYPE_from_MANUFACTURER.
The tuple type for AIRCRAFT TYPE does not contain attribute Manufacturer Code! The
interrelationship between aircraft types and manufacturers is rather expressed by tuple
type AIRCRAFT TYPE_from_MANUFACTURER.
The fact that we get three tuple types seems to be conflicting with the solution developed
before. It is not. Tuple types AIRCRAFT TYPE and
AIRCRAFT TYPE_from_MANUFACTURER have the same primary key. For each tuple in
AIRCRAFT TYPE, AIRCRAFT TYPE_from_MANUFACTURER has a corresponding
tuple, and vice versa. Therefore, the two tuple types can be combined as will be discussed
further in the subsequent unit.

Third Normal Form in Multiple Tuple Types

Before normalization:

ENGINE
   Engine Number, PK
   Engine Type
   Manufacturer
      Manufacturer Code
      Company Name
      Address
         Street [0..1]
         Post Office Box [0..1]
         City
         State [0..1]
         Country
         Postal Code [0..1]
      Phone Number

After normalization (3NF):

ENGINE
   Engine Number, PK
   Engine Type
   Manufacturer Code

MANUFACTURER
   Manufacturer Code, PK
   Company Name
   Address
      Street [0..1]
      Post Office Box [0..1]
      City
      State [0..1]
      Country
      Postal Code [0..1]
   Phone Number

Is this the same MANUFACTURER as for AIRCRAFT TYPE???
Figure 6-28. Third Normal Form in Multiple Tuple Types CF182.0

Notes:
This visual illustrates another violation of the Third Normal Form for our sample airline
company. Tuple type ENGINE contains the same composite attribute Manufacturer as
tuple type AIRCRAFT TYPE before normalization. Thus, it violates the Third Normal Form
as well.
The resolution of the violation is the same as for AIRCRAFT TYPE. The composite attribute
forms its own tuple type (MANUFACTURER). Except for Manufacturer Code, the attributes
of composite attribute Manufacturer are removed from tuple type ENGINE.
This raises the question whether tuple type MANUFACTURER is the same as the one created for
AIRCRAFT TYPE. Both tuple types have the same attributes. As usual, the question must
be answered by the application domain expert. From the real world, we know that some
manufacturers produce both aircraft and engines whereas others only manufacture engines
or aircraft. Thus, how should we solve the problem? The alternatives are discussed on the
next two visuals.

3rd NF in Multiple Tuple Types (Alternative 1)

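[Visual: entity-relationship model fragment with entity types AIRCRAFT TYPE, MANUFACTURER,
AIRCRAFT MODEL, AIRCRAFT, ENGINE, and ENGINE LOCATION. AIRCRAFT TYPE (m) is connected to
MANUFACTURER (1..1) by a relationship type _from_, and ENGINE is connected to the same entity
type MANUFACTURER by a second relationship type _from_. The relationship types _for_, _on_,
and _in_ between the other entity types are unchanged.]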

Figure 6-29. 3rd NF in Multiple Tuple Types (Alternative 1) CF182.0

Notes:
This visual illustrates a possible solution for the problem raised on the previous visual. The
solution is discussed in terms of the changes required to the entity-relationship model. The tuple
types then follow automatically. Since the attributes for engine and aircraft manufacturers are the
same, you can use the same entity type MANUFACTURER (and, thus, tuple type) to store
information about both. For each manufacturer, the entity type contains one entity instance.
To complete the entity-relationship model, you need relationship types
AIRCRAFT TYPE_from_MANUFACTURER and ENGINE_from_MANUFACTURER
expressing the interrelationships between aircraft types and manufacturers and engines
and manufacturers.
However, the solution has one problem: It is possible to establish relationships between
aircraft types and manufacturers just producing engines and between engines and
manufacturers only manufacturing aircraft.
This problem is generally considered a data-entry problem. Your data would also be wrong
if you specified the wrong aircraft manufacturer for an aircraft type or the wrong engine
manufacturer for an engine. Therefore, most application domain experts and database
designers will just go with this solution without further constraints. To solve the problem
completely, you can:
• introduce an additional attribute Manufacturer Type specifying the type of manufacturer
(engine manufacturer, aircraft manufacturer, or both engine and aircraft manufacturer)
• define constraints for relationship types AIRCRAFT TYPE_from_MANUFACTURER and
ENGINE_from_MANUFACTURER restricting the instances of the relationship types
based on the values of attribute Manufacturer Type of entity type MANUFACTURER.
The constraints for the relationship types prevent the improper assignment of
manufacturers.
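
The two measures above could be approximated at the table level roughly as follows. This is
only a sketch using Python's sqlite3 module, not the syntax of any particular target database
management system. The table names, column names, one-letter type codes, and sample
manufacturers are assumptions made for illustration. Only the ENGINE side is shown, and only
for inserts; AIRCRAFT TYPE and update operations would need analogous rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manufacturer (
    manufacturer_code TEXT PRIMARY KEY,
    company_name      TEXT NOT NULL,
    -- Assumed codes: 'A' = aircraft only, 'E' = engines only, 'B' = both
    manufacturer_type TEXT NOT NULL
        CHECK (manufacturer_type IN ('A', 'E', 'B'))
);
CREATE TABLE engine (
    engine_number     TEXT PRIMARY KEY,
    engine_type       TEXT NOT NULL,
    manufacturer_code TEXT NOT NULL
        REFERENCES manufacturer (manufacturer_code)
);
-- Constraint for ENGINE_from_MANUFACTURER: reject manufacturers
-- whose Manufacturer Type says they only produce aircraft.
CREATE TRIGGER engine_from_manufacturer_check
BEFORE INSERT ON engine
BEGIN
    SELECT RAISE(ABORT, 'manufacturer does not produce engines')
    WHERE (SELECT manufacturer_type FROM manufacturer
           WHERE manufacturer_code = NEW.manufacturer_code) = 'A';
END;
""")

# Made-up sample manufacturers.
conn.execute("INSERT INTO manufacturer VALUES ('M1', 'Airframe Co', 'A')")
conn.execute("INSERT INTO manufacturer VALUES ('M2', 'Engine Co', 'E')")


def try_insert_engine(engine_number, manufacturer_code):
    """Return True if the engine row was accepted, False if it was rejected."""
    try:
        conn.execute(
            "INSERT INTO engine VALUES (?, 'turbofan', ?)",
            (engine_number, manufacturer_code),
        )
        return True
    except sqlite3.DatabaseError:
        return False
```

With this sketch, an engine assigned to the engine manufacturer M2 is accepted, whereas an
engine assigned to the aircraft-only manufacturer M1 is rejected.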

3rd NF in Multiple Tuple Types (Alternative 2)

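[Entity-relationship diagram. Entity types: AIRCRAFT TYPE, AIRCRAFT MODEL, AIRCRAFT,
ENGINE, ENGINE LOCATION, and MANUFACTURER. MANUFACTURER is a supertype (S) with
subtypes AIRCRAFT MANUFACTURER and ENGINE MANUFACTURER, connected by _is_
relationship types. AIRCRAFT TYPE is connected to AIRCRAFT MANUFACTURER, and ENGINE
to ENGINE MANUFACTURER, by _from_ relationship types.]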

Figure 6-30. 3rd NF in Multiple Tuple Types (Alternative 2) CF182.0

Notes:
This visual illustrates an alternate solution using supertypes and subtypes.
MANUFACTURER is made a supertype for subtypes AIRCRAFT MANUFACTURER and
ENGINE MANUFACTURER. MANUFACTURER contains instances for all manufacturers.
AIRCRAFT MANUFACTURER contains instances for manufacturers producing aircraft
(and possibly engines) and ENGINE MANUFACTURER instances for manufacturers
producing engines (and possibly aircraft).
In addition, relationship types are established between entity type AIRCRAFT TYPE and
subtype AIRCRAFT MANUFACTURER and entity type ENGINE and subtype ENGINE
MANUFACTURER.
Unless you want to store additional information for the different manufacturer types (a
possible by-product of the solution), subtypes AIRCRAFT MANUFACTURER and ENGINE
MANUFACTURER have a single attribute (Manufacturer Code). Therefore, if you do not
have additional information for the different manufacturer types, the solution would probably
be considered excessive.


Fourth Normal Form - Definition

A tuple type is in the Fourth Normal Form if:
• It is in the Third Normal Form
• Its attributes do not have multivalued dependencies on each other

Let attribute-1, attribute-2, and attribute-3 be attributes of a tuple type.
Attribute-3 is multivalued dependent on attribute-1 by the way of attribute-2 if:
For each value of attribute-2 occurring with a specific, but arbitrary, value of
attribute-1, the tuple type must contain tuples with the same values for attribute-3

Multivalued dependencies may lead to group inconsistencies
Figure 6-31. Fourth Normal Form - Definition CF182.0

Notes:
The Fourth Normal Form requires that:
• the tuple type is in the Third Normal Form and
• its attributes do not have multivalued dependencies on each other.
A multivalued dependency involves three attributes. If attribute-1, attribute-2, and
attribute-3 are attributes of the same tuple type, attribute-3 is said to be multivalued
dependent on attribute-1 by the way of attribute-2 if the following is true:
For each value of attribute-2 occurring with a specific, but arbitrary, value of attribute-1, the
tuple type must contain tuples with the same values for attribute-3.
To make this definition more understandable, let us assume a21, a22, and a23 are values
of attribute-2 occurring with value a11 for attribute-1 in tuples of the tuple type.
Furthermore, assume that the tuples for a11 and a21 have the following values for
attribute-3:


attribute-1 attribute-2 attribute-3


a11 a21 a31
a11 a21 a32
a11 a21 a33
a11 a21 a34
Then, multivalued dependency of attribute-3 on attribute-1 means that the following tuples
must exist for the combination a11 and a22:

attribute-1 attribute-2 attribute-3


a11 a22 a31
a11 a22 a32
a11 a22 a33
a11 a22 a34
Similarly, for the combination a11 and a23, the following tuples must exist:

attribute-1 attribute-2 attribute-3


a11 a23 a31
a11 a23 a32
a11 a23 a33
a11 a23 a34
Multivalued dependencies may lead to group inconsistencies due to insert, delete, or
update operations paying no attention to the multivalued dependency.
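
The definition can also be read as a mechanical check. The following Python sketch is an
illustration only: the function name and the representation of tuples as 3-element Python
tuples are assumptions. It groups the attribute-3 values by attribute-1 and attribute-2 and
verifies that, for one attribute-1 value, every attribute-2 value occurs with the same set of
attribute-3 values.

```python
from collections import defaultdict


def multivalued_dependent(tuples):
    """Check whether attribute-3 is multivalued dependent on attribute-1
    by the way of attribute-2 in a set of (a1, a2, a3) tuples."""
    # For every attribute-1 value, collect the attribute-3 values
    # occurring with each attribute-2 value.
    groups = defaultdict(lambda: defaultdict(set))
    for a1, a2, a3 in tuples:
        groups[a1][a2].add(a3)
    # Within one attribute-1 value, every attribute-2 value must occur
    # with the same set of attribute-3 values.
    for by_a2 in groups.values():
        sets = list(by_a2.values())
        if any(s != sets[0] for s in sets[1:]):
            return False
    return True


# The twelve tuples listed above for a11.
consistent = {
    ("a11", a2, a3)
    for a2 in ("a21", "a22", "a23")
    for a3 in ("a31", "a32", "a33", "a34")
}

# A delete operation paying no attention to the dependency.
inconsistent = consistent - {("a11", "a22", "a33")}
```

The check succeeds for the complete set of twelve tuples and fails as soon as a single tuple
is removed, which is the kind of group inconsistency meant above.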


Fourth Normal Form - Sample Tuple Type

PILOTS_and_MECHANICS_for_AIRCRAFT MODEL
   Aircraft Model, PK
      Type Code
      Model Number
   Employee Number AS Pilot Employee Number, PK
   Employee Number AS Mechanic Employee Number, PK

• Lists in same tuple type:
  - Pilots that can fly an aircraft model
  - Mechanics that can maintain (are trained for) an aircraft model
• A tuple consists of an aircraft model, a pilot employee number, and a mechanic
  employee number
• Aircraft Model, Pilot Employee Number, and Mechanic Employee Number all belong to
  primary key
  - An aircraft model can be flown by many pilots; a pilot can fly many aircraft models
  - Many mechanics may be trained for an aircraft model; a mechanic may be trained for
    many aircraft models
• No interdependencies between pilots and mechanics
Figure 6-32. Fourth Normal Form - Sample Tuple Type CF182.0

Notes:
The above tuple type did not result from establishing the tuple types for our sample airline
company called Come Aboard. It was created artificially to demonstrate a violation of the
Fourth Normal Form, by joining the tuple types for two m:m relationship types.
The tuple type lists, for the various aircraft models, both the pilots that can fly them and the
mechanics that are trained for them, i.e., can maintain them.
Each tuple contains an aircraft model (type code and model number), a pilot employee
number, and a mechanic employee number. Composite attribute Aircraft Model has been
used for clarity. It groups the two attributes Type Code and Model Number, which
uniquely identify aircraft models. All cardinalities of the tuple type are implicitly defined
and, therefore, are [1..1]. Thus, the tuple type is in First Normal Form.
All attributes belong to the primary key for the tuple type since:
• An aircraft model can be flown by many pilots, and a pilot can fly many aircraft models.
• Many mechanics may be trained for an aircraft model, and a mechanic may be trained
for many aircraft models.
Consequently, the tuple type is in the Second Normal Form and even in the Third Normal
Form.
It is a further assumption for the tuple type that there are not any special interdependencies
between pilots and mechanics.


Fourth Normal Form - Instance Example

Type Code   Model Number   Pilot Employee Number   Mechanic Employee Number
B747        400            0491337                 5219330
B747        400            0491337                 6027005
B747        400            0844092                 5219330
B747        400            0844092                 6027005
B747        400            0003613                 5219330
B747        400            0003613                 6027005
A310        300            1662951                 4421026
A310        300            1662951                 6027005
A310        300            1662951                 1427254
A310        300            3721040                 4421026
A310        300            3721040                 6027005
A310        300            3721040                 1427254

Same: within each aircraft model, every pilot occurs with the same mechanics.

Mechanic Employee Number multivalued dependent on Aircraft Model (Type Code, Model
Number) by the way of Pilot Employee Number


Figure 6-33. Fourth Normal Form - Instance Example CF182.0

Notes:
This visual illustrates an instance example for the tuple type explained on the previous
visual. Attribute Mechanic Employee Number is multivalued dependent on composite
attribute Aircraft Model, i.e., on Type Code and Model Number, by the way of Pilot
Employee Number:
• Take a specific aircraft model, for example, the Boeing B747, Model 400.
• It occurs together with pilot employee numbers 0491337, 0844092, and 0003613.
• For the selected aircraft model and pilot employee number 0491337, the mechanic
employee numbers are 5219330 and 6027005.
• Since the mechanics trained for an aircraft model have nothing to do with the pilots, the
same mechanics must be listed for pilot numbers 0844092 and 0003613.
• Similar considerations apply to any other aircraft model selected. For the Airbus A310,
Model 300, tuples with the same mechanic employee numbers must exist for pilots
3721040 and 1662951.

Accordingly, the tuple type violates the Fourth Normal Form. Improper insertions or
deletions of tuples could result in inconsistent data by violating the multivalued
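dependencies. For example, if the tuple for aircraft model B747, Model 400, pilot employee
number 0844092, and mechanic employee number 6027005 were deleted, the data would
be inconsistent: mechanic 6027005 would still be listed for the other two pilots of that
aircraft model, but no longer for pilot 0844092.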
By the way, multivalued dependencies always come in pairs. If attribute-3 is multivalued
dependent on attribute-1 by the way of attribute-2, then attribute-2 is multivalued dependent
on attribute-1 by the way of attribute-3. In case of our example, Pilot Employee Number is
multivalued dependent on Aircraft Model by the way of Mechanic Employee Number. The
proof is left to you.
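
The pairing can at least be checked for the instance example, although such a check is not a
proof. The following Python sketch rebuilds the twelve tuples of the instance example and
tests the dependency in both directions. The function name and the column positions are
assumptions for illustration.

```python
# Pilots and mechanics per aircraft model, taken from the instance example.
PILOTS = {
    ("B747", "400"): ["0491337", "0844092", "0003613"],
    ("A310", "300"): ["1662951", "3721040"],
}
MECHANICS = {
    ("B747", "400"): ["5219330", "6027005"],
    ("A310", "300"): ["4421026", "6027005", "1427254"],
}

# The twelve tuples: (type code, model number, pilot, mechanic).
ROWS = {
    model + (pilot, mechanic)
    for model in PILOTS
    for pilot in PILOTS[model]
    for mechanic in MECHANICS[model]
}

PILOT, MECHANIC = 2, 3  # column positions within a tuple


def mvd_holds(rows, via, dependent):
    """True if column `dependent` is multivalued dependent on the aircraft
    model (columns 0 and 1) by the way of column `via`."""
    values = {}
    for row in rows:
        by_via = values.setdefault(row[:2], {})
        by_via.setdefault(row[via], set()).add(row[dependent])
    # Per aircraft model, all `via` values must carry the same set.
    return all(
        len({frozenset(s) for s in by_via.values()}) == 1
        for by_via in values.values()
    )


# The deletion discussed above.
after_delete = ROWS - {("B747", "400", "0844092", "6027005")}
```

For the complete instance, both directions hold. After the deletion, both are violated at
once, which is consistent with the dependencies coming in pairs.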


Fourth Normal Form - Solution


PILOT_can_fly_AIRCRAFT MODEL
   Aircraft Model, PK
      Type Code
      Model Number
   Employee Number, PK

Type Code   Model Number   Pilot Employee Number
B747        400            0491337
B747        400            0844092
B747        400            0003613
A310        300            1662951
A310        300            3721040

MECHANIC_trained_for_AIRCRAFT MODEL
   Aircraft Model, PK
      Type Code
      Model Number
   Employee Number, PK

Type Code   Model Number   Mechanic Employee Number
B747        400            5219330
B747        400            6027005
A310        300            4421026
A310        300            6027005
A310        300            1427254

Violations of Fourth Normal Form should not occur if entity-relationship model
established correctly

Figure 6-34. Fourth Normal Form - Solution CF182.0

Notes:
To solve Fourth Normal Form violations, the multivalued interdependencies between the
attributes must be unbundled by creating separate tuple types. One tuple type is created
for each relationship type. Accordingly, you get a tuple type for:
• the interdependency between aircraft models and pilots which is nothing else than
relationship type PILOT_can_fly_AIRCRAFT MODEL
• the interdependency between aircraft models and mechanics corresponding to
relationship type MECHANIC_trained_for_AIRCRAFT MODEL
Since each tuple type only contains a single employee number, AS clauses need not be
used. The purpose of the employee numbers is apparent from the meaning of the tuple
types.
The instances for the two tuple types are shown on the visual alongside the tuple types.
If you have properly identified all relationship types in the entity-relationship model and
have not hidden them in entity types, you should not have violations of the Fourth Normal
Form. The above example was created by joining the tuple types for the two relationship
types.
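
As a small illustration of this last remark, the following Python sketch joins the two
separate tuple types on the aircraft model and obtains the twelve tuples of the violating
tuple type. The set-of-tuples representation and the function name are assumptions made for
the sketch.

```python
# The two tuple types as sets of (type code, model number, employee number).
pilot_can_fly = {
    ("B747", "400", "0491337"),
    ("B747", "400", "0844092"),
    ("B747", "400", "0003613"),
    ("A310", "300", "1662951"),
    ("A310", "300", "3721040"),
}
mechanic_trained_for = {
    ("B747", "400", "5219330"),
    ("B747", "400", "6027005"),
    ("A310", "300", "4421026"),
    ("A310", "300", "6027005"),
    ("A310", "300", "1427254"),
}


def join_on_model(pilots, mechanics):
    """Join the two tuple types on aircraft model (type code, model number)."""
    return {
        (type_code, model, pilot, mechanic)
        for type_code, model, pilot in pilots
        for type_code_2, model_2, mechanic in mechanics
        if (type_code, model) == (type_code_2, model_2)
    }


combined = join_on_model(pilot_can_fly, mechanic_trained_for)

# In the separate tuple types, withdrawing one mechanic from an aircraft
# model is a single-tuple delete that cannot leave the pilots inconsistent.
after_delete = mechanic_trained_for - {("B747", "400", "6027005")}
recombined = join_on_model(pilot_can_fly, after_delete)
```

Here `combined` contains the twelve tuples of the instance example, and `recombined` lists the
remaining mechanic for all three B747 pilots alike.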


Checkpoint

Exercise — Unit Checkpoint


1. Tuple types are the first result of storage view. (T/F)

2. Tuple types must not have composite attributes. (T/F)

3. What is the purpose of cardinalities for attributes of tuple types?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

4. Tuple types are established for:


a. The dependent entity types only of the entity-relationship
model.
b. All entity types of the entity-relationship model except
dependent entity types.
c. All entity types of the entity-relationship model.
d. All relationship types of the entity-relationship model.
e. Most relationship types of the entity-relationship model.
f. Owning relationship types.

5. For an entity type, multiple tuple types may be established. (T/F)

6. How are the tuple types for entity types established?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

7. For which relationship types must tuple types not be established?
_____________________________________________________
_____________________________________________________
_____________________________________________________

8. In the documentation of a tuple type, how are the components of a
composite attribute identified?
_____________________________________________________
_____________________________________________________
_____________________________________________________

9. In the documentation of a tuple type, what is the means for
identifying the role a data element or data group plays for an
attribute?
_____________________________________________________
_____________________________________________________
_____________________________________________________

10. Which cardinality is assumed if none has been specified for an
attribute in the tuple type documentation?
a. [0..1]
b. [0..*]
c. [1..*]
d. [1..1]

11. Establish the tuple type for relationship type
MAINTENANCE RECORD_belongs_to_ MAINTENANCE RECORD
for our sample airline company called Come Aboard.
_____________________________________________________
_____________________________________________________
_____________________________________________________

12. Which of the following choices are objectives the normalization of
tuple types wants to achieve? Normalization wants to:
a. Avoid difficulties with the conversion of tuple types into tables.
b. Improve the performance of the database being designed.
c. Reduce the size of the tuple types.
d. Remove redundancies within tuple types.
e. Remove redundancies across tuple types.
f. Avoid data inconsistencies resulting from insert, update, or
delete operations.

13. What do the Normal Forms define?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

14. What do you achieve by resolving violations of the First Normal
Form?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

15. How do you recognize repeating groups?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

16. Which modifications of the entity-relationship model does the
resolution of a First Normal Form violation generally require?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

17. If a tuple type is in Second Normal Form, none of its elementary
nonkey attributes is functionally dependent on other nonkey
attributes. (T/F)

18. If a tuple type is in Third Normal Form, all its elementary nonkey
attributes are functionally dependent on the entire primary key.
(T/F)

19. Which modifications does the resolution of a Third Normal Form
violation generally require for the entity-relationship model?
_____________________________________________________
_____________________________________________________
_____________________________________________________

20. How can data groups help during normalization?


_____________________________________________________
_____________________________________________________
_____________________________________________________

21. Tuple types for relationship types can never violate any of the
Normal Forms. (T/F)


Unit Summary (1 of 2)

• Tuple types are the first (intermediate) result of storage view
  - Form the basis for the computerized processing of the entity types
    and relationship types of the entity-relationship model
• Tuple types consist of attributes and have a primary key
  - Attributes may be elementary or composite
  - Attributes have cardinalities specifying how many values the attribute
    must assume at least and at most in the scope used
• Tuple types are established for all entity types and most relationship
  types of the entity-relationship model
  - None for owning relationship types
  - None for m:m relationship types being the source (target) of another
    relationship type with a minimum target (source) cardinality of 1
• Tuple types for entity types consist of attributes for entity type
  - Primary key consists of attributes of entity key
• Tuple types for relationship types consist of defining attributes for
  relationship type
  - Primary key consists of attributes of relationship key

Figure 6-35. Unit Summary (1 of 2) CF182.0

Notes:

Unit Summary (2 of 2)

• Established tuple types must be normalized to:
  - Make them convertible into tables
  - Remove redundancies within tuple types
  - Avoid data inconsistencies caused by insert, update, and delete anomalies
• Normal Forms define states or quality levels for the tuple types
  - Five Normal Forms
  - Only first three of practical relevance
  - Subsequent Normal Form based on previous Normal Form
• First Normal Form requires no repeating groups
• Second Normal Form requires functional dependence of nonkey
  attributes on entire primary key
• Third Normal Form requires no functional dependence of nonkey
  attributes on other nonkey attributes
• Properly established data groups/composite attributes help identify
  attributes to be moved together to new tuple types during normalization
• Update entity-relationship model, problem statement, and data inventory
  accordingly

Figure 6-36. Unit Summary (2 of 2) CF182.0

Notes:

Unit 7. From Tuple Types to Tables

What This Unit Is About


This unit describes how you get from the tuple types for the application
domain to the tables of the target database management system. It
explains how multiple tuple types can be combined into a single tuple
type and, thus, become a single table. Conversely, it discusses how a
tuple type can be split into multiple tuple types to cope with restrictions
imposed by the target database management system.
Furthermore, the unit outlines how the tables and the objects
associated with them are established.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Combine tuple types to reduce the number of tables required.
• Split tuple types to cope with database limitations or performance
degradations.
• Denormalize tuple types as required for performance reasons.
• Establish the tables for the tuple types including
- The translation of abstract data types for attributes.
- The definition of data types and column attributes for the
columns of the tables.
- The documentation of the necessary database objects.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises


Unit Objectives
After completion of this unit, you should be able to:
• Combine tuple types to reduce the number of tables required
• Split tuple types to cope with database limitations or performance degradations
• Denormalize tuple types as required for performance reasons
• Establish the tables for the tuple types including:
  - The translation of abstract data types for attributes
  - The definition of data types and column attributes for the columns of the tables
  - The documentation of the necessary database objects

Figure 7-1. Unit Objectives CF182.0

Notes:
Conceptually, the tuple types established so far could immediately be converted into tables
of the target database management system. However, this would result in more tables than
necessary making it harder and more expensive than necessary to retrieve and maintain
the data for the application domain. Therefore, it is desirable to combine multiple tuple
types into a single tuple type, and thus a single table, if possible and reasonable. We will
discuss in this unit when tuple types can be combined.
Furthermore, limitations of the target database management system may not allow you to
convert the tuple types one-to-one into tables. The limitations as well as performance
considerations may force you to split tuple types vertically or horizontally into multiple
smaller tuple types which then can be implemented as tables.
Performance considerations may induce you to reverse normalizations you performed and
to take care of the resulting problems in a different manner. You might also want to
denormalize tuple types that were separate and not created by normalizations.
After these steps, you can establish the tables for the application domain. This includes:

• Implementing the abstract data types for the attributes of the tuple types.
• Determining the data types (abstract or built-in) and column attributes for the columns of
the tables.
• Documenting the tables, their columns, and the related database objects for the
application domain.

7.1 Combining and Splitting Tuple Types


Tables in Design Process


[Design process diagram. Conceptual View: Problem Statement, Entity Relationship Model,
Process Inventory, Data Inventory. Storage View: Tuple Types, Tables, Integrity Rules,
Indexes. Logical View: Logical Data Structures.]

Figure 7-2. Tables in Design Process CF182.0

Notes:
This unit deals with the establishment of the tables for the target database management
system. The tables are the containers for the data of the application domain. Thus, we are
right in the heart of physical design, i.e., of the storage view.

Tables for Tuple Types
AIRCRAFT MODEL
   Type Code, PK
   Model Number, PK
   Dimensions
      Length
      Height
      Wing Span
   Weights
      Net Weight
      Maximum Weight
   Cruising Speed

Table AIRCRAFT_MODEL with columns:

TYPE_CODE | MODEL_NUMBER | LENGTH | HEIGHT | WING_SPAN | NET_WEIGHT | MAXIMUM_WEIGHT | CRUISING_SPEED

Figure 7-3. Tables for Tuple Types CF182.0

Notes:
Formally, the tuple types established so far can be translated into tables of the target
database management system as follows:
• For each tuple type, one table is created. The name for the table must follow the rules
for table names of the target database management system. There are length
restrictions for table names as well as restrictions on the characters they may include.
Unless you use delimited identifiers for table names, a table name may not, for example,
include blanks. However, it may generally include underscores (_). Thus, it is a
good idea to replace blanks in the names of the tuple types by underscores. Delimited
identifiers have the disadvantage that you need to specify enclosing double-quotes for
all references in SQL statements.
• Each (direct or indirect) elementary attribute of the tuple type becomes a column of the
table for the tuple type.
At present, composite attributes cannot be reflected in tables, only their elementary
components, the elementary components of their composite components, and so on.
Thus, tables cannot reflect the structure imposed by the composite attributes.


The columns receive names in the target database management system. The same
restrictions apply to column names as to table names. Furthermore, the names for the
columns of a table must be unique. Thus, if the same data group is used multiple times
(in different roles) by a tuple type, you must name the components of the corresponding
composite attributes differently. One way to achieve this is including the name of the
composite attribute (or part of it) in the column name. However, you must ensure that the
length restrictions for column names are adhered to.
• As for tuple types, a primary key is established for each table uniquely identifying the
rows of the table. The elementary attributes of the primary key for the tuple type become
the columns of the primary key for the table.
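
The formal translation described above can be sketched in a few lines of Python for tuple
type AIRCRAFT MODEL. The nested representation of the tuple type and all function names are
assumptions made for illustration. The generated text is only an outline of a table
definition: data types are deliberately left out, and the length restrictions of the target
database management system are not checked.

```python
# Assumed representation: an elementary attribute is a string, a composite
# attribute is a (name, [components]) pair.
AIRCRAFT_MODEL = ("AIRCRAFT MODEL", [
    "Type Code",
    "Model Number",
    ("Dimensions", ["Length", "Height", "Wing Span"]),
    ("Weights", ["Net Weight", "Maximum Weight"]),
    "Cruising Speed",
])
PRIMARY_KEY = ["Type Code", "Model Number"]


def sql_name(name):
    """Replace blanks by underscores so no delimited identifier is needed."""
    return name.upper().replace(" ", "_")


def elementary_attributes(attributes):
    """Yield the direct and indirect elementary attributes in order."""
    for attribute in attributes:
        if isinstance(attribute, tuple):
            # Composite attribute: only its components become columns.
            yield from elementary_attributes(attribute[1])
        else:
            yield attribute


def table_outline(tuple_type, primary_key):
    """Return an outline of the table definition for one tuple type."""
    name, attributes = tuple_type
    columns = [sql_name(a) for a in elementary_attributes(attributes)]
    if len(set(columns)) != len(columns):
        raise ValueError("column names must be unique within a table")
    key = ", ".join(sql_name(a) for a in primary_key)
    lines = [f"    {column}" for column in columns]
    lines.append(f"    PRIMARY KEY ({key})")
    return f"CREATE TABLE {sql_name(name)} (\n" + ",\n".join(lines) + "\n)"


outline = table_outline(AIRCRAFT_MODEL, PRIMARY_KEY)
```

The composite attributes Dimensions and Weights do not appear in the outline; only their
elementary components do, which matches the columns of table AIRCRAFT_MODEL on the visual.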

Conversion of Tuple Types into Tables

• Each tuple type becomes a table
  - Table name must follow rules for target DBMS
• Each elementary attribute of tuple type becomes a column of the table
  - Each elementary component of each composite attribute becomes a column
  - Currently, composite attributes themselves cannot be defined in all DBMS, but ...
  - Column names must follow rules for target DBMS
• Elementary attributes for primary key of tuple type become primary key for table

Data types for data elements associated with elementary attributes must
be implemented by means of:
• Built-in data types for target DBMS or user defined distinct types
• User defined functions, check constraints, and/or triggers

However . . .
Figure 7-4. Conversion of Tuple Types into Tables CF182.0

Notes:
The bullets in the gray box on this visual have already been described in the student notes
for the previous visual.
The elementary attributes for the tuple type are based on data elements in the data
inventory. In turn, the data elements are associated with data types. These data types must
be reflected in the target database management system. They can be implemented by
means of:
• Built-in data types for the target database management system or user defined distinct
types. Built-in data types are data types provided by the target database management
system. They are also referred to as standard data types. User defined distinct types are
data types that you can define yourself based on the built-in data types. Both built-in
data types and user defined distinct types will be discussed later in this unit.
• In addition, the implementation of the data types for the data elements may need user
defined functions, check constraints, and triggers. User defined functions allow you to
perform customized operations for your data. Check constraints allow you to introduce
value constraints for the columns of a table. Triggers allow you to perform selected
actions as the consequence of database insert, update, or delete operations.
All these items will be discussed later in this unit.
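The following minimal sketch illustrates the conversion for tuple type AIRCRAFT. It is not part of the course material: it uses Python's sqlite3 module as a stand-in for the target database management system, the column names and data types are assumptions, and the log table and trigger are hypothetical. SQLite does not offer user defined distinct types, so only built-in data types, a check constraint, and a trigger are shown.

```python
import sqlite3


def build_aircraft_db():
    """Create an in-memory database with a table for tuple type AIRCRAFT."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE AIRCRAFT (
            AIRCRAFT_NUMBER   CHAR(10) NOT NULL PRIMARY KEY,
            DATE_MANUFACTURED DATE     NOT NULL,
            DATE_IN_SERVICE   DATE,
            -- Check constraint: value constraint for the columns of the table.
            CHECK (DATE_IN_SERVICE IS NULL
                   OR DATE_IN_SERVICE >= DATE_MANUFACTURED)
        );
        -- Hypothetical log table filled by the trigger below.
        CREATE TABLE AIRCRAFT_LOG (AIRCRAFT_NUMBER CHAR(10), NOTE VARCHAR(40));
        -- Trigger: action performed as a consequence of an update operation.
        CREATE TRIGGER LOG_SERVICE_DATE
        AFTER UPDATE OF DATE_IN_SERVICE ON AIRCRAFT
        BEGIN
            INSERT INTO AIRCRAFT_LOG
            VALUES (NEW.AIRCRAFT_NUMBER, 'service date changed');
        END;
        """
    )
    return conn


def insert_aircraft(conn, number, manufactured, in_service=None):
    """Insert a row; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO AIRCRAFT VALUES (?, ?, ?)",
            (number, manufactured, in_service),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

A row whose service date precedes its manufacturing date is rejected by the check constraint, and every change of the service date leaves one row in the log table.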
As we mentioned before, the tuple types can formally be translated into tables in the
manner described. However, you should further manipulate the tuple types before
converting them into tables. The subsequent visuals will discuss why you should do this
and what you should do.

Problems With One-to-One Conversion

. . . you should further manipulate the tuple types before creating tables . . .

• Direct conversion of normalized tuple types may result in more tables than necessary
   - Unnecessary Join operations complicating queries and programs
   - Unnecessary Join operations impacting performance
  Remedy: Combine tuple types before creating tables
• Size limitations of target DBMS may not allow implementation of resulting tables
  Remedy: Vertically or horizontally split tuple type before creating tables
• Resulting tables may have columns with a different importance for application domain
   - Unnecessary retrieval of not required data
   - Negative performance impact on applications only using important data
  Remedy: Vertically split tuple type before creating tables


Figure 7-5. Problems With One-to-One Conversion CF182.0

Notes:
As described before, the normalized tuple types could formally be converted into tables
one-to-one. However, this may result in more tables than required, and the additional Join
operations unnecessarily complicate queries and programs. In addition, the Join operations
degrade the performance of queries and programs.
To avoid these problems, tuple types should be combined into a single tuple type where
possible and reasonable before converting them into tables.
Size limitations for the target database management system are a second problem
preventing the one-to-one conversion of tuple types into tables. Such limitations are upper
limits for the row size, the number of columns, or the table size. They may force you to split
a tuple type vertically or horizontally into multiple tuple types before creating the tables.
A third consideration is that the resulting tables may have columns with very different
usage characteristics and different importance for the application domain. Some of the
columns may never be used together. Some of them may only be used by business processes
that are neither important nor performance-critical, whereas the others are used by important,
performance-critical processes. In this case, it may also make sense to split the tuple
types vertically before creating the tables to separate columns with different usage profiles.

Merging Partial Tuple Types
BEFORE

Aircraft Number   Date Manufactured   Date in Service
B474001323        1994-10-12          1997-01-01
B373004518        1999-02-28          1999-03-15
B373004519        1999-03-31          1999-04-20
A103000534        1998-05-12          1998-07-21
A103003167        1997-08-01          1997-09-01
A402004217        1999-10-23          1999-11-15
AIRCRAFT

Aircraft Number   Type Code   Model Number
B474001323        B747        400
B373004518        B737        300
B373004519        B737        300
A103000534        A310        300
A103003167        A310        300
A402004217        A340        200
AIRCRAFT MODEL_for_AIRCRAFT

Same primary key
One-to-one correspondence of primary key values

AFTER

Aircraft Number   Date Manufactured   Date in Service   Type Code   Model Number
B474001323        1994-10-12          1997-01-01        B747        400
B373004518        1999-02-28          1999-03-15        B737        300
B373004519        1999-03-31          1999-04-20        B737        300
A103000534        1998-05-12          1998-07-21        A310        300
A103003167        1997-08-01          1997-09-01        A310        300
A402004217        1999-10-23          1999-11-15        A340        200
AIRCRAFT

Figure 7-6. Merging Partial Tuple Types CF182.0

Notes:
Tuple types having the same primary key can be united in a single tuple type if they
contain, at all times, tuples with corresponding primary key values. This means that each
primary key value in one tuple type also occurs in the other tuple type and vice versa.
The two tuple types can be combined by adding the nonkey attributes of one tuple type to
the other tuple type. Note that it may be necessary to rename some of the added attributes.
It does not matter which tuple type is integrated in the other tuple type. In general, you will
integrate the tuple type with the smaller number of nonkey attributes in the other tuple type.
You may consider renaming the unified tuple type.
Since the original tuple types form parts of the larger, unified, tuple type, the unification is
referred to as merging of partial tuple types.
The example on the visual merges tuple types AIRCRAFT and
AIRCRAFT MODEL_for_AIRCRAFT, which are the tuple types for an entity type and a
relationship type, respectively. For both tuple types, Aircraft Number is the primary key.

Since relationship type AIRCRAFT MODEL_for_AIRCRAFT is mandatory for entity type
AIRCRAFT, a relationship instance must exist for each aircraft expressing to which aircraft
model the aircraft belongs. Thus, for each tuple of tuple type AIRCRAFT, a tuple must exist
in tuple type AIRCRAFT MODEL_for_AIRCRAFT.
Since relationships always require that the corresponding source and target instances
exist, for each instance of relationship type AIRCRAFT MODEL_for_AIRCRAFT, the
appropriate aircraft must exist in entity type AIRCRAFT. Consequently, for each tuple in
tuple type AIRCRAFT MODEL_for_AIRCRAFT, a tuple with the same primary key value
must exist in tuple type AIRCRAFT.
Thus, the two tuple types must always contain tuples with the same primary key values and
can be combined. Attributes Type Code and Model Number of tuple type
AIRCRAFT MODEL_for_AIRCRAFT are added to tuple type AIRCRAFT to identify the
aircraft model for the aircraft.
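The merging of partial tuple types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each tuple type is modeled as a dictionary that maps a primary key value to a dictionary of nonkey attributes, and the function name merge_partial is hypothetical. The sketch refuses to merge if the primary key values do not correspond one-to-one or if attributes would have to be renamed first.

```python
def merge_partial(t1, t2):
    """Merge two partial tuple types.

    t1 and t2 map each primary key value to a dict of nonkey attributes.
    """
    # Each primary key value of one tuple type must occur in the other.
    if t1.keys() != t2.keys():
        raise ValueError("primary key values do not correspond one-to-one")
    merged = {}
    for key, row in t1.items():
        clash = row.keys() & t2[key].keys()
        if clash:
            # Attributes with the same name must be renamed before merging.
            raise ValueError(f"rename attributes first: {sorted(clash)}")
        # Add the nonkey attributes of t2 to the tuple of t1.
        merged[key] = {**row, **t2[key]}
    return merged


aircraft = {
    "B474001323": {"Date Manufactured": "1994-10-12", "Date in Service": "1997-01-01"},
    "A402004217": {"Date Manufactured": "1999-10-23", "Date in Service": "1999-11-15"},
}
aircraft_model_for_aircraft = {
    "B474001323": {"Type Code": "B747", "Model Number": "400"},
    "A402004217": {"Type Code": "A340", "Model Number": "200"},
}
merged_aircraft = merge_partial(aircraft, aircraft_model_for_aircraft)
```

In a database, the same result would typically be obtained with an inner join of the two tables on Aircraft Number.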

Finding Partial Tuple Types from ER Model
[Diagram: four constellations of the entity-relationship model whose tuple types (Tuple Type 1, Tuple Type 2, and, in the last case, Tuple Type 3) can be merged]
• Entity type or relationship type with a dependent entity type (D) of cardinality 1..1
• Relationship type between two entity types or relationship types with cardinalities ..m and 1..1
• Relationship type with cardinalities 0..1 and 1..1; relationship key = key of source
• Relationship type with cardinalities 1..1 and 1..1; merged with one end OR the other, depending on relationship key selected
Figure 7-7. Finding Partial Tuple Types from ER Model CF182.0

Notes:
For the sample tuple types on the previous visual, we used the entity-relationship model to
determine if the tuple types could be combined. This raises the question of whether the
entity-relationship model can generally be used to determine the partial tuple types that can
be merged. Indeed, the entity-relationship model helps to determine them.
In the following cases, the tuple types for entity types or relationship types represent partial
tuple types and can be merged:
• One of the tuple types is for a dependent entity type with a cardinality of 1..1 (for the
owning relationship type). The other tuple type may be for an entity type or a relationship
type. In this case, the two tuple types can be combined, for example, by integrating the
tuple type for the dependent entity type into the other tuple type.
Because of cardinality 1..1 for the dependent entity type, both tuple types have the same
primary key: Being a dependent entity type means that its own entity key includes the
key of the parent. Because of maximum cardinality 1, the entity key of the dependent
entity type need not and must not contain additional attributes.

Being a dependent entity type also means that, for every entity instance, the parent
contains an instance with the corresponding key value. Conversely, the minimum
cardinality of 1 requires that the dependent entity type contains an instance for every
parent instance.
• One of the tuple types is for a relationship type with cardinality 1..1 for one end (e.g., the
target) and maximum cardinality m for the other end. In this case, the tuple types for the
relationship type and for the end with maximum cardinality m can be combined. For
example, the tuple type for the relationship type can be integrated into the tuple type for
the end with maximum cardinality m.
Because of the cardinalities, the key for the relationship type consists of the key of the
end with maximum cardinality m. Thus, the corresponding tuple types have the same
primary key.
Cardinality 1..1 enforces that, for every instance of the end with maximum cardinality m,
the relationship type contains one and only one instance with the same key values.
Since source and target must exist for relationship instances, the end with maximum
cardinality m must contain, for every relationship instance, an instance with the same
key value. Thus, the corresponding tuple types are partial tuple types and can be
combined. For example, the tuple type for the relationship type can be integrated into
the tuple type for the end with maximum cardinality m.
This constellation represents the one on the previous visual.
• One of the tuple types is for a relationship type with cardinality 1..1 for one end (e.g., the
target) and cardinality 0..1 for the other end. In addition, the key of the relationship type
has been chosen to be the key of the end with cardinality 0..1. In this case, the tuple
types for the relationship type and for the end with cardinality 0..1 are partial tuple types
and can be combined. For example, the tuple type for the relationship type can be
integrated into the tuple type for the end with cardinality 0..1.
Since the key of the end with cardinality 0..1 has been chosen as relationship key, the
primary keys of the two tuple types are the same. (Note that there was a choice for the
relationship key because both maximum cardinalities were 1.) For the same reasons as
for the previous case, the two tuple types must at all times have corresponding primary
key values.
• One of the tuple types is for a relationship type with cardinality 1..1 for both ends. In this
case, the tuple type for the relationship type can be combined with the tuple type for the
source or with the tuple type for the target. With which tuple type it can be combined,
depends on which of the keys has been made the relationship key: If the key of the
source has been selected, the tuple type for the relationship type and the tuple type for
the source can be combined. If the key of the target has been selected, the tuple type for
the relationship type and the tuple type for the target can be combined.
• As you can imagine, combinations of the above cases may lead to cascaded mergers of
tuple types.

Theoretically, it is possible that other tuple types can be combined as well. However, you
should only combine tuple types that can be combined directly or through several mergers.
Tuple types that cannot be combined by subsequent mergers have nothing to do with each
other. They lead to columns in tables that are never used together and, therefore, may
negatively impact performance.
It must be decided case by case whether or not the combination of the tuple types
should be reflected in the entity-relationship model. If tuple types for relationship types are
involved, you do not want to reflect the merging of the tuple types in the entity-relationship
model. The entity-relationship model would no longer correctly describe the
interrelationships between entity types and relationship types.


Imbedding Detail Tuple Types


BEFORE

Engine Number   Engine Type   Manufacturer Code
PW9880193       PW4062        PW
PW9880194       PW4062        PW
PW9880195       PW4062        PW
PW9882345       PW4062        PW
PW9974034       PW4062        PW
A862946RR       RB211-524     RR
A59A350RR       RB211-524     RR
R375184566      CF6-80C2      GE
R375184567      CF6-80C2      GE
R375184568      CF6-80C2      GE
ENGINE

Engine Number   Aircraft Number   Engine Position
PW9880193       B474001323        1
PW9880194       B474001323        2
PW9882345       B474001323        3
PW9974034       B474001323        4
R375184566      A103003167        1
R375184568      A103003167        2
ENGINE_on_AIRCRAFT

Same primary key
At least one nonkey column always contains a value

AFTER

Engine Number   Engine Type   Manufacturer Code   Aircraft Number   Engine Position
PW9880193       PW4062        PW                  B474001323        1
PW9880194       PW4062        PW                  B474001323        2
PW9880195       PW4062        PW
PW9882345       PW4062        PW                  B474001323        3
PW9974034       PW4062        PW                  B474001323        4
A862946RR       RB211-524     RR
A59A350RR       RB211-524     RR
R375184566      CF6-80C2      GE                  A103003167        1
R375184567      CF6-80C2      GE
R375184568      CF6-80C2      GE                  A103003167        2
ENGINE

Figure 7-8. Imbedding Detail Tuple Types CF182.0

Notes:
Tuple type T2 can be imbedded into tuple type T1 if:
1. Both tuple types have the same primary key.
2. The primary key values of T2 form, at all times, a subset of the primary key values of T1.
3. For each tuple of T2, at least one of the nonkey attributes has a value. It need not
necessarily be the same attribute for all tuples.
The resulting extended tuple type T1 contains all attributes it contained before and the
nonkey attributes of tuple type T2. Note that it may be necessary to rename some of the
added attributes.
Tuples of old tuple type T1 not having a counterpart in T2 do not have a value for any
attributes added to new tuple type T1. Tuples of old tuple type T1 with a counterpart in T2
have a value for at least one attribute added to new tuple type T1 (third condition).
After the elimination of T2, it is still possible to determine the original tuple types (and, thus,
entity types or relationship types) for the various tuples. Thus, their (original) identity has
been preserved and no information has been lost.

Since the tuples of T2 provide additional details for tuples of T1, tuple type T2 is referred to as
a detail tuple type.
In the example on the visual, tuple type ENGINE_on_AIRCRAFT is a detail tuple type for
tuple type ENGINE. It provides further detail information for engines, namely, where they
are mounted. ENGINE_on_AIRCRAFT was created in Unit 6 - Tuple Types as a
consequence of normalization.
Both tuple types have the same primary key Engine Number. Since not all engines are
mounted on aircraft, the primary key values of ENGINE_on_AIRCRAFT form a subset of
the primary key values of tuple type ENGINE. Attribute Aircraft Number of tuple type
ENGINE_on_AIRCRAFT always has a value so that the third condition for the imbedding of
tuple types is satisfied. Accordingly, ENGINE_on_AIRCRAFT is indeed a detail tuple type
of ENGINE and can be imbedded.
Resulting new tuple type ENGINE contains all attributes it had before plus the nonkey
attributes of ENGINE_on_AIRCRAFT. Tuples of old tuple type ENGINE that did not have a
counterpart in ENGINE_on_AIRCRAFT do not have values for the attributes added to tuple
type ENGINE. Tuples that had a counterpart in ENGINE_on_AIRCRAFT have values for
the added attributes.
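In terms of tables, imbedding a detail tuple type corresponds to a left outer join on the common primary key. The sketch below is an illustration only, using Python's sqlite3 module and a subset of the rows from the visual; the table name ENGINE_NEW and the function name are hypothetical, and the column names are assumptions.

```python
import sqlite3


def imbed_engine_on_aircraft():
    """Imbed ENGINE_ON_AIRCRAFT into ENGINE and return the resulting rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ENGINE (
            ENGINE_NUMBER     TEXT PRIMARY KEY,
            ENGINE_TYPE       TEXT,
            MANUFACTURER_CODE TEXT
        );
        CREATE TABLE ENGINE_ON_AIRCRAFT (
            ENGINE_NUMBER   TEXT PRIMARY KEY,
            AIRCRAFT_NUMBER TEXT NOT NULL,  -- always has a value (third condition)
            ENGINE_POSITION INTEGER
        );
        INSERT INTO ENGINE VALUES
            ('PW9880193', 'PW4062', 'PW'),
            ('PW9880195', 'PW4062', 'PW'),
            ('R375184566', 'CF6-80C2', 'GE');
        INSERT INTO ENGINE_ON_AIRCRAFT VALUES
            ('PW9880193', 'B474001323', 1),
            ('R375184566', 'A103003167', 1);
        -- Engines without a counterpart keep NULL in the added columns.
        CREATE TABLE ENGINE_NEW AS
            SELECT E.ENGINE_NUMBER, E.ENGINE_TYPE, E.MANUFACTURER_CODE,
                   D.AIRCRAFT_NUMBER, D.ENGINE_POSITION
            FROM ENGINE E
            LEFT OUTER JOIN ENGINE_ON_AIRCRAFT D
                ON D.ENGINE_NUMBER = E.ENGINE_NUMBER;
        """
    )
    rows = conn.execute(
        "SELECT * FROM ENGINE_NEW ORDER BY ENGINE_NUMBER"
    ).fetchall()
    conn.close()
    return rows
```

Because Aircraft Number always has a value in the detail tuple type, a row of the new table with a null Aircraft Number can only stem from an engine that had no counterpart, which is how the original identity of the tuples is preserved.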


Finding Detail Tuple Types from ER Model


[Diagram: four constellations of the entity-relationship model whose tuple types (Tuple Type 1, Tuple Type 2, and, in the last case, Tuple Type 3) can be imbedded]
• Entity type or relationship type with a dependent entity type (D) of cardinality 0..1; for each instance, a nonkey attribute must have a value
• Relationship type between two entity types or relationship types with cardinalities ..m and 0..1; a defining attribute not belonging to the key always has a value
• Relationship type with cardinalities 1..1 and 0..1; relationship key = key of source; a defining attribute not belonging to the key always has a value
• Relationship type with cardinalities 0..1 and 0..1; imbedded in one end OR the other, depending on relationship key selected; a defining attribute not belonging to the key always has a value
Figure 7-9. Finding Detail Tuple Types from ER Model CF182.0

Notes:
As for partial tuple types, the entity-relationship model can be used to determine the detail
tuple types that can be imbedded into other tuple types.
In the following cases, the tuple types for entity types or relationship types represent detail
tuple types and can be imbedded in other tuple types:
• One of the tuple types is for a dependent entity type with a cardinality of 0..1 (for the
owning relationship type). The other tuple type may be for an entity type or a relationship
type. In addition, for each instance of the dependent entity type, at least one nonkey
attribute must always have a value. In this case, the tuple type for the dependent entity
type can be imbedded in the tuple type for the parent.
Because of cardinality 0..1 for the dependent entity type, both tuple types have the same
primary key: Being a dependent entity type means that its own entity key includes the
key of the parent. Because of maximum cardinality 1, the entity key of the dependent
entity type need not and must not contain additional attributes.

Being a dependent entity type also means that, for every entity instance, the parent
contains an instance with the corresponding key value. Minimum cardinality 0 permits
that the dependent entity type does not contain an instance for every parent instance.
• One of the tuple types is for a relationship type with cardinality 0..1 for one end (e.g., the
target) and maximum cardinality m for the other end. In this case, the tuple type for the
relationship type can be imbedded in the tuple type for the end with maximum cardinality
m.
Because of the cardinalities, the key for the relationship type consists of the key of the
end with maximum cardinality m. Thus, the corresponding tuple types have the same
primary key.
Cardinality 0..1 permits that the relationship type does not contain an instance for every
instance of the end with maximum cardinality m. Since source and target must exist for a
relationship instance, the end with maximum cardinality m must contain, for every
relationship instance, an instance with the same key value. Since the defining attributes
not being part of the relationship key contain a value for every relationship instance, the
third condition for detail tuple types is automatically satisfied. Thus, the tuple type for the
relationship type is a detail tuple type. It can be imbedded in the tuple type for the end
with maximum cardinality m.
• One of the tuple types is for a relationship type with cardinality 0..1 for one end (e.g., the
target) and cardinality 1..1 for the other end. In addition, the key of the relationship type
has been chosen to be the key of the end with cardinality 1..1. In this case, the tuple type
for the relationship type is a detail tuple type and can be imbedded in the tuple type for
the end with cardinality 1..1.
Since the key of the end with cardinality 1..1 has been chosen as relationship key, the
primary keys of the two tuple types are the same. (Note that there was a choice for the
relationship key because both maximum cardinalities were 1.)
• One of the tuple types is for a relationship type with cardinality 0..1 at both ends. In this
case, the tuple type for the relationship type can be imbedded in the tuple type for the
source or in the tuple type for the target: If the key of the source has been selected as
relationship key, the tuple type for the relationship type can be imbedded in the tuple
type for the source. If the key of the target has been selected as relationship key, the
tuple type for the relationship type can be imbedded in the tuple type for the target.
• As you can imagine, combinations of the above cases may lead to cascaded imbeds of
tuple types.
Theoretically, other cases are possible. However, in cases that are not equivalent to
cascaded imbeds, you should not imbed the detail tuple type. The two tuple types
concerned have nothing to do with each other. Imbedding the detail tuple type leads to
columns in tables that are never used together and, therefore, may negatively impact
performance.
It must be decided case by case whether or not the combination of the tuple types
should be reflected in the entity-relationship model. If tuple types for relationship types are
involved, you do not want to reflect the imbedding of tuple types in the entity-relationship
model. The entity-relationship model would no longer correctly describe the
interrelationships between entity types and relationship types.
As a conclusion of the previous three visuals, you can say:
Tuple types for 1:1 or 1:m relationship types can always be merged or imbedded.
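The relationship-type cases of the previous visuals can be condensed into one rule of thumb. The following sketch is one possible reading of those cases, not a definition from the course: it looks only at the cardinality of the end opposite to the end whose key has been made the relationship key. The function name and the representation of maximum cardinality m as the string "m" are assumptions made for the illustration.

```python
def combine_rule(opposite_min, opposite_max):
    """Classify the tuple type for a relationship type.

    opposite_min and opposite_max describe the cardinality of the end
    opposite to the end whose key is the relationship key. The maximum
    cardinality is either 1 or the string "m".
    """
    if opposite_max != 1:
        # The key of one end alone cannot serve as relationship key.
        return "keep separate"
    if opposite_min == 1:
        # Every instance of the key end takes part: partial tuple types.
        return "merge with key end"
    # Some instances of the key end may not take part: detail tuple type.
    return "imbed in key end"
```

For AIRCRAFT MODEL_for_AIRCRAFT, the relationship key is the key of AIRCRAFT and the opposite end has cardinality 1..1, so the rule yields a merger. For ENGINE_on_AIRCRAFT, not every engine is mounted, which corresponds to cardinality 0..1 and an imbedding.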

Decomposition of Super Tuple Types (1 of 2)
BEFORE

Employee Number   Last Name    First Name   Date of Birth
4627953           Miller       Jonathan     1968-02-29
7003001           Ambrose      Anna         1980-05-12
0562091           Repairmaid   Susan        1975-03-17
2342007           Handyman     Peter        1974-04-20
0491337           Miller       Jack         1961-07-21
1662951           Smith        Joe          1962-09-01
0844092           Ferguson     Jane         1965-04-15
EMPLOYEE

Employee Number   Pilot Level
0844092           Copilot
1662951           Captain
0491337           Captain
PILOT

Employee Number   Date of Certification
2342007           1998-03-31
0562091           1999-02-25
MECHANIC

• All tuple types have same primary key
• Key values of first tuple type occur in at most one of the other tuple types
• All key values of other tuple types occur in first tuple type

Figure 7-10. Decomposition of Super Tuple Types (1 of 2) CF182.0

Notes:
Let T, T1, T2, ..., Tn be tuple types with the following characteristics:
• All tuple types have the same primary key.
• At all times, each primary key value of T occurs in at most one of the tuple types T1
through Tn. This means that the sets of primary key values of T1 through Tn are disjoint.
• At all times, the primary key values of T1 through Tn occur in tuple type T.
By adding the nonkey attributes of T to each of the tuple types T1 through Tn, the primary
key value sets of T and T1 through Tn can be made disjoint. The tuples of T with
counterparts in T1 through Tn are removed from T and combined with the appropriate
tuples in T1 through Tn.
T1 through Tn are called a (partial) decomposition of T. Since the role of tuple type T has
changed, you should consider renaming it to correctly reflect its changed role.
If, at all times, each primary key value of T occurs in one of the tuple types T1 through Tn,
tuple type T can be eliminated. T1 through Tn then form a perfect decomposition of T.


The situation described here exists for class (supertype/subtype) structures with exclusive
subtype sets. For this reason, tuple type T is referred to as super tuple type. If the subtype
set is also covering, the super tuple type can be eliminated.
The example on the visual illustrates the tuple types for a class structure with an exclusive,
but not covering, subtype set. The employees of Come Aboard may be pilots, mechanics,
or other types of employees. However, they may not be pilots and mechanics at the same
time. Since pilots or mechanics are employees at the same time, each tuple of PILOT or
MECHANIC has a counterpart in EMPLOYEE.
As illustrated on the next visual, the sets of primary key values of EMPLOYEE, PILOT, and
MECHANIC can be made disjoint.
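The conditions and the decomposition itself can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: tuple types are modeled as dictionaries mapping a primary key value to a dictionary of nonkey attributes, only some attributes and rows of the visual are used, and the function name decompose is hypothetical.

```python
def decompose(super_t, subtypes):
    """Decompose super tuple type super_t.

    super_t maps primary key values to nonkey attributes; subtypes maps a
    tuple type name to a dict of the same shape. Returns the remaining
    tuples of the super tuple type and the extended subtype tuple types.
    """
    seen = set()
    for name, sub in subtypes.items():
        # All key values of T1..Tn must occur in T.
        if not set(sub) <= set(super_t):
            raise ValueError(f"{name} has key values missing in super tuple type")
        # Each key value of T may occur in at most one of T1..Tn.
        if seen.intersection(sub):
            raise ValueError(f"{name} overlaps with another subtype tuple type")
        seen.update(sub)
    # Add the nonkey attributes of T to the tuples of T1..Tn.
    extended = {
        name: {key: {**attrs, **super_t[key]} for key, attrs in sub.items()}
        for name, sub in subtypes.items()
    }
    # Tuples of T with a counterpart are removed from T.
    rest = {key: attrs for key, attrs in super_t.items() if key not in seen}
    return rest, extended


employee = {
    "4627953": {"Last Name": "Miller"},
    "0844092": {"Last Name": "Ferguson"},
    "2342007": {"Last Name": "Handyman"},
}
rest, extended = decompose(
    employee,
    {
        "PILOT": {"0844092": {"Pilot Level": "Copilot"}},
        "MECHANIC": {"2342007": {"Date of Certification": "1998-03-31"}},
    },
)
```

If the returned remainder is empty, the decomposition is a perfect decomposition and the super tuple type can be eliminated.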

Decomposition of Super Tuple Types (2 of 2)
AFTER

Employee Number   Last Name   First Name   Date of Birth
4627953           Miller      Jonathan     1968-02-29
7003001           Ambrose     Anna         1980-05-12
OTHER EMPLOYEE

Employee Number   Pilot Level   Last Name   First Name   Date of Birth
0844092           Copilot       Ferguson    Jane         1965-04-15
1662951           Captain       Smith       Joe          1962-09-01
0491337           Captain       Miller      Jack         1961-07-21
PILOT

Employee Number   Date of Certification   Last Name    First Name   Date of Birth
2342007           1998-03-31              Handyman     Peter        1974-04-20
0562091           1999-02-25              Repairmaid   Susan        1975-03-17
MECHANIC

Figure 7-11. Decomposition of Super Tuple Types (2 of 2) CF182.0

Notes:
After the decomposition, tuple types PILOT and MECHANIC include all nonkey attributes of
EMPLOYEE (e.g., Last Name, First Name, and Date of Birth). Tuple type EMPLOYEE has
been renamed to OTHER EMPLOYEE to emphasize its changed role. Now, an employee
is either in OTHER EMPLOYEE or in PILOT or in MECHANIC, but not in more than one.
If the employees of Come Aboard could only be pilots or mechanics, tuple type OTHER
EMPLOYEE would not be needed, i.e., tuple type EMPLOYEE would be eliminated completely.
You should note that, for the illustrated tuple types, generally, you would not perform a
decomposition of the super tuple type.
If the decomposition is a perfect decomposition, it should be reflected in the
entity-relationship model.


Combining Tuple Types - Considerations


• Do not combine tuple types if their attributes are not processed together or are processed together only by business processes that are not performance-critical
   - Otherwise, other critical business processes may experience a degradation
• When imbedding detail tuple types, other tuple types should not be referentially dependent on the detail tuple type
   - Otherwise, referential integrity cannot be enforced by referential integrity support of target DBMS
• When decomposing super tuple types, other tuple types should not be referentially dependent on the super tuple type
   - Otherwise, referential integrity cannot be enforced by referential integrity support of target DBMS
• When combining tuple types, limitations for the referential integrity support of the target DBMS may become effective that would not exist otherwise
   - For example, restrictions for referential cycles and delete-connected tables
• When combining tuple types, size limitations for the target DBMS may become effective forcing you to split the tuple type again

Figure 7-12. Combining Tuple Types - Considerations CF182.0

Notes:
The merging, imbedding, and decomposition of tuple types described on the preceding
visuals can be performed without the loss of information. However, there are a few
considerations that may lead you not to combine the tuple types:
• Do not combine tuple types which have nothing to do with each other; whose attributes
are never processed together; or whose attributes are only processed together by
business processes that are not performance-critical.
If you combined the tuple types, other critical business processes might experience a
performance degradation. The rows for the appropriate tables would become longer
resulting in fewer rows per page (physical blocks) and, thus, fewer rows per buffer. This
might increase the number of I/O operations required when processing or searching the
table sequentially.
• When imbedding a detail tuple type, other tuple types should not be referentially
dependent on the detail tuple type. A tuple type is referentially dependent on another
tuple type if the values of one or more of its attributes must always be a subset of the
values of a corresponding set of attributes of the other tuple type.

If you imbed the detail tuple type, the referential integrity of the dependent tuple types
can no longer be enforced by means of the referential integrity support of the target database
management system. You then must use other means to ensure the integrity of the data
(e.g., program logic or, if supported, triggers).
• When decomposing a super tuple type, other tuple types should not be referentially
dependent on the super tuple type.
If you decompose the super tuple type, the referential integrity of dependent tuple types
can no longer be enforced by means of the referential integrity support of the target
database management system.
• When combining tuple types, restrictions or limitations for the referential integrity support
of the target database management system may become effective which would not exist
otherwise. These limitations deal with referential cycles and delete-connected tables.
Referential cycles and delete-connected tables will be discussed in a later unit.
These restrictions can also become effective when you merge or imbed the tuple types
for 1:1 or 1:m relationship types, and you might consider not merging or imbedding them.
• When combining tuple types, size limitations for the target database management
system may become effective forcing you not to combine the tuple types or to split them
differently.
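The loss of referential integrity support after imbedding a detail tuple type can be made concrete with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in for the target database management system. Table MOUNT_CHECK is a hypothetical table that is assumed to be referentially dependent on the detail tuple type ENGINE_on_AIRCRAFT, i.e., it should only contain engines that are mounted on an aircraft; the function name is likewise an assumption.

```python
import sqlite3


def fk_accepts_unmounted_engine(imbedded):
    """Return True if a MOUNT_CHECK row for an unmounted engine is accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    if imbedded:
        # Detail tuple type imbedded: AIRCRAFT_NUMBER is a nullable column.
        conn.execute(
            "CREATE TABLE ENGINE ("
            "ENGINE_NUMBER TEXT PRIMARY KEY, AIRCRAFT_NUMBER TEXT)"
        )
        conn.execute("INSERT INTO ENGINE VALUES ('PW9880195', NULL)")
        parent = "ENGINE"
    else:
        conn.execute("CREATE TABLE ENGINE (ENGINE_NUMBER TEXT PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE ENGINE_ON_AIRCRAFT ("
            "ENGINE_NUMBER TEXT PRIMARY KEY REFERENCES ENGINE, "
            "AIRCRAFT_NUMBER TEXT NOT NULL)"
        )
        conn.execute("INSERT INTO ENGINE VALUES ('PW9880195')")
        parent = "ENGINE_ON_AIRCRAFT"
    # The dependent table can only reference the table that still exists.
    conn.execute(
        f"CREATE TABLE MOUNT_CHECK ("
        f"ENGINE_NUMBER TEXT REFERENCES {parent} (ENGINE_NUMBER))"
    )
    try:
        conn.execute("INSERT INTO MOUNT_CHECK VALUES ('PW9880195')")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
```

Before the imbedding, the foreign key rejects the row because engine PW9880195 is not mounted. After the imbedding, the foreign key can only refer to ENGINE and accepts the row, so the original rule would have to be enforced by program logic or triggers.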


Limitations and Consequences


Typical Limitations of Database Management Systems
• Most database management systems store rows for tables in fixed-length pages. The rows must fit into a single page
   - Maximum row length limited by chosen page size
   - Unused space if fixed-length rows are just a little longer than half the page size, a third of the page size, and so on
• There is an upper limit for the number of columns a table can have
• There is an upper limit for the amount of space that a table can occupy

Possible Consequences
• Cannot combine tuple types in a table that could be combined otherwise
• Cannot denormalize tuple types
• Must perform additional normalizations of tuple types
• Must vertically split tuple types
• Must horizontally split tuple types
Figure 7-13. Limitations and Consequences CF182.0

Notes:
All database management systems have limitations. The above visual illustrates the typical
limitations.
Most database management systems store the rows for tables into fixed-length pages, i.e.,
blocks of a fixed length. In general, a row must fit into a single page. The page size can be
chosen from predefined values and is the same for all pages of a table (or a set of tables).
For DB2 Universal Database, for example, the page size can be 4096, 8192, 16384, or
32768 bytes.
The selection of a page size causes two problems:
• The maximum length of a row is restricted by the chosen page size. As a solution, you
could choose a bigger page size provided the target database management system
supports a bigger page size.
However, for a few exceptional rows, you do not always want to choose a larger page
size. For the direct retrieval of rows, a larger page size may mean that you read more
data than necessary for the majority of rows. The I/O operation for the larger page size
takes longer, resulting in an undesirable performance degradation. Even for sequential
retrieval, a larger page size can negatively impact the overall system performance since
it may hamper concurrent requests for other tables.
• The fixed page size may result in a lot of unused space. Assume that all rows for a table
have the same fixed length and that the length is just a little over half a page. As a
consequence, only a single row fits into a page and nearly half the page is wasted. If the
row size is just over one third of the page size, you waste about one third of the space,
and so on. The smaller the row size relative to the page size, the less space can be
wasted in this way.
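The arithmetic behind the unused-space problem can be sketched as follows. This is a simplified illustration only: the function name page_usage is hypothetical, and the calculation ignores page headers and per-row overhead, which a real database management system would subtract from the usable space.

```python
def page_usage(page_size: int, row_length: int) -> tuple[int, float]:
    """Return (rows per page, fraction of the page left unused).

    Simplification: the whole page is assumed to be available for rows.
    """
    if not 0 < row_length <= page_size:
        raise ValueError("a row must fit into a single page")
    rows_per_page = page_size // row_length
    unused_fraction = (page_size - rows_per_page * row_length) / page_size
    return rows_per_page, unused_fraction


# Fixed row lengths just over 1/2, 1/3, and 1/4 of a 4096-byte page.
for row_length in (2049, 1366, 1025):
    rows, unused = page_usage(4096, row_length)
    print(f"row length {row_length}: {rows} row(s) per page, {unused:.1%} unused")
```

Running the loop for your own row lengths and candidate page sizes gives a quick first impression of how much space a particular combination would leave unused.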
As a second limitation, there is typically an upper limit for the number of columns that a
table can have.
The third limitation common to all database management systems is that there is an upper
limit for the amount of space a table can occupy. In the course of time, the last two
limitations have been relaxed and will be relaxed even more.
If you hit one of the limitations mentioned above, the consequences are that:
• You cannot combine tuple types that could be combined otherwise.
• You cannot denormalize tuple types although you would like to.
• You must perform additional normalizations you did not want to do.
• You must vertically split tuple types.
• You must horizontally split tuple types.

Denormalization
BEFORE

AIRCRAFT
Aircraft Number  Date Manufactured  Date in Service  Type Code  Model Number
B474001323       1994-10-12         1997-01-01       B747       400
B373004518       1999-02-28         1999-03-15       B737       300
B373004519       1999-03-31         1999-04-20       B737       300
A103000534       1998-05-12         1989-07-21       A310       300
A103003167       1997-08-01         1997-09-01       A310       300
A402004217       1999-10-23         1999-11-15       A340       200

AIRCRAFT MODEL
Type Code  Model Number  Length  Height
A340       200           59.40   16.91
A310       300           46.67   15.81
B737       300           33.41   11.13
B747       400           70.67   19.33

AFTER

AIRCRAFT
Aircraft Number  Date Manufactured  Date in Service  Type Code  Model Number  Length  Height
B474001323       1994-10-12         1997-01-01       B747       400           70.67   19.33
B373004518       1999-02-28         1999-03-15       B737       300           33.41   11.13
B373004519       1999-03-31         1999-04-20       B737       300           33.41   11.13
A103000534       1998-05-12         1989-07-21       A310       300           46.67   15.81
A103003167       1997-08-01         1997-09-01       A310       300           46.67   15.81
A402004217       1999-10-23         1999-11-15       A340       200           59.40   16.91
Figure 7-14. Denormalization CF182.0

Notes:
Let T1 and T2 be tuple types satisfying the following conditions:
• T1 and T2 have different primary keys.
• T1 has a set of attributes corresponding to the primary key of tuple type T2.
• At all times, the primary key of T2 and the corresponding attributes of T1 contain the
same values.
In this case, tuple type T2 can be integrated into tuple type T1 without loss of information
by adding the nonkey attributes of T2 to T1. However, as a consequence, information may
have to be stored redundantly in the integrated attributes.
This process is called denormalization since it represents a conscious violation of the
Second Normal Form or the Third Normal Form.
Frequently, the primary key values of T2 form a superset of the values of the corresponding
attributes of T1. In this case, you must decide if you can do without the tuples of T2 which
do not have a counterpart in T1. This means you accept the loss of information.

The example on the visual integrates tuple type AIRCRAFT MODEL into tuple type
AIRCRAFT. Together, attributes Type Code and Model Number of tuple type AIRCRAFT
(T1) correspond to the primary key of tuple type AIRCRAFT MODEL (T2). Since the source
cardinality for relationship type AIRCRAFT MODEL_for_AIRCRAFT, in the
entity-relationship model for Come Aboard, is 1..1, there is an aircraft model for every
aircraft. Thus, each value of attribute pair (Type Code, Model Number) in T1 occurs as
primary key value of T2. However, because of target cardinality m for the relationship type,
there need not be an aircraft for every aircraft model. Consequently, AIRCRAFT MODEL
cannot be integrated in AIRCRAFT unless your decision is not to keep information about
aircraft models for which there is not an aircraft.
Since denormalization can be seen as a reversal of normalization, it reintroduces the
problems you tried to solve by normalization:
• Since every primary key value of T2 may occur multiple times in T1, information is
redundantly stored in the resulting combined tuple type. Consequently, you must ensure
that the attributes of T2 added to T1 are changed for all tuples with the same primary
key value of T2 at the same time. This can be achieved by using proper mass UPDATE
SQL statements for the (table of the) resulting tuple type.
Similarly, when adding a new tuple, it must be ensured that redundant information is
consistent with information already contained in existing tuples. This can be achieved by
copying the corresponding information from the existing tuples rather than entering it
again.
To reduce the risk of inconsistent redundant information as much as possible, you
should not allow end users to issue UPDATE or INSERT SQL statements against tables
that are not normalized. Rather, you should provide front-ends (to be used by the end
users) that include the proper UPDATE and INSERT statements.
• If the last tuple for a former primary key value of T2 is deleted, all T2-related information
for this value is lost. Similarly, you cannot add information about a new primary key value
of T2 without adding T1-related information at the same time.
For the example on the visual, when you delete the last Boeing 747, Model 400 aircraft
(B474001323), the information about the aircraft model is lost as well. Also, as outlined
above, you cannot add information about a new aircraft model without entering
information about an aircraft for that aircraft model at the same time.
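The two problems described above can be reproduced with a small sketch. It uses Python's sqlite3 module purely as a stand-in for the target database management system. The lowercase table and column names, the table name aircraft_denorm, and the changed height value 11.15 are assumptions made for illustration; only a subset of the columns and rows from the visual is used.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE aircraft_model (
        type_code TEXT NOT NULL, model_number TEXT NOT NULL,
        length REAL, height REAL,
        PRIMARY KEY (type_code, model_number));
    CREATE TABLE aircraft (
        aircraft_number TEXT PRIMARY KEY,
        type_code TEXT NOT NULL, model_number TEXT NOT NULL);
    INSERT INTO aircraft_model VALUES
        ('B747', '400', 70.67, 19.33), ('B737', '300', 33.41, 11.13);
    INSERT INTO aircraft VALUES
        ('B474001323', 'B747', '400'),
        ('B373004518', 'B737', '300'),
        ('B373004519', 'B737', '300');
    -- Denormalized table: the nonkey attributes of AIRCRAFT MODEL are
    -- copied into every matching AIRCRAFT row.
    CREATE TABLE aircraft_denorm AS
        SELECT a.aircraft_number, a.type_code, a.model_number,
               m.length, m.height
        FROM aircraft a
        JOIN aircraft_model m
          ON m.type_code = a.type_code AND m.model_number = a.model_number;
""")

# Mass UPDATE: change the redundant value in all rows of the model at once.
cur = con.execute(
    "UPDATE aircraft_denorm SET height = ? "
    "WHERE type_code = ? AND model_number = ?",
    (11.15, "B737", "300"))
rows_updated = cur.rowcount

# Deleting the last B747-400 also removes all information about that model.
con.execute("DELETE FROM aircraft_denorm WHERE aircraft_number = 'B474001323'")
models_left = con.execute(
    "SELECT DISTINCT type_code, model_number FROM aircraft_denorm").fetchall()
print(rows_updated, models_left)
```

The UPDATE has to touch every row of the model, and after the DELETE the length and height of the Boeing 747, Model 400 can no longer be retrieved from the denormalized table.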
When denormalizing tuple types, other tuple types should not be referentially dependent on
the integrated tuple types. Otherwise, the referential integrity of dependent tuple types can
no longer be enforced by means of the referential integrity support of the target database
management system.
If you look at the entity-relationship model for Come Aboard, you will see that entity type
AIRCRAFT MODEL is source or target of many relationship types. This means that many
tuple types are referentially dependent on it. Therefore, you would never integrate tuple
type AIRCRAFT MODEL into tuple type AIRCRAFT.

It must be decided case by case whether the denormalization should be reflected in the
entity-relationship model. It should be reflected in the entity-relationship model if it
combines the tuple types for two entity types.
The primary reason for denormalization is performance. However, because of the problems
involved with denormalization, you should investigate very carefully if the gain is worth the
trouble. If the table for the integrated tuple type always contains only a very few rows (e.g.,
just a page), denormalization will not gain much. After the first request, the page will be in
the buffers of the target database management system. Immediate subsequent requests
will not require an I/O operation. Also, locating the appropriate rows in the page does not
dramatically add to the processor time. However, to come to a reliable decision, you should
use the tools provided by the target database management system (such as EXPLAIN) to
determine the behavior of critical requests.

Vertical Splitting of Tuple Types
BEFORE

AIRCRAFT MODEL (Length, Height, and Wing Span form the composite attribute Dimensions)
Type Code  Model Number  Length  Height  Wing Span  Net Weight  Maximum Weight  Cruising Speed  Range
A340       200           59.40   16.91   60.30      156500      274980          890             14800
A310       300           46.67   15.81   43.90       93710      164400          860              9600
B737       300           33.41   11.13   28.88       35805       62820          795              4175
B747       400           70.67   19.33   64.31      226237      396890          930             13570

AFTER

AIRCRAFT MODEL
Type Code  Model Number  Net Weight  Maximum Weight  Cruising Speed  Range
A340       200           156500      274980          890             14800
A310       300            93710      164400          860              9600
B737       300            35805       62820          795              4175
B747       400           226237      396890          930             13570

AIRCRAFT MODEL DIMENSIONS
Type Code  Model Number  Length  Height  Wing Span
A340       200           59.40   16.91   60.30
A310       300           46.67   15.81   43.90
B737       300           33.41   11.13   28.88
B747       400           70.67   19.33   64.31
Figure 7-15. Vertical Splitting of Tuple Types CF182.0

Notes:
Vertical splitting of a tuple type means that some attributes of the tuple type are moved to a
new tuple type with the same primary key. Of course, you should not arbitrarily split a tuple
type, but rather move attributes that logically belong together to the new tuple type. The
composite attributes for a tuple type identify attributes that belong together. They are a big
help when splitting a tuple type.
Limitations for the target database management system are one reason for splitting tuple
types. Another, equally important, reason is that the attributes of the tuple type have
different usage profiles:
• Some attributes are never used together with other attributes.
• Some attributes are used very seldom and, then, together with other attributes, only in
business processes that are not performance-critical.
In these cases, splitting the tuple type may increase the performance of other,
performance-critical, business processes. As a consequence of the splitting, the rows for
the important tables become shorter and more rows will fit into a page. Thus, more rows
can be made available with a single I/O operation and kept in buffers of the target database
management system.
You should note, however, that vertical splitting only makes sense if the rows for the
corresponding table are not already very small. Some database management systems limit
the maximum number of rows per page. Thus, if the row size becomes too small, you lose
space without gaining performance.
When vertically splitting a tuple type, you effectively create a new dependent entity type.
The dependent entity type should be reflected in the entity-relationship model.
In the example on the visual, the dimensions for aircraft models are removed from tuple
type AIRCRAFT MODEL. They are moved to a new tuple type called AIRCRAFT MODEL
DIMENSIONS. The dimensions are less frequently used than the weights for the aircraft
models. Dimensions was a composite attribute of old tuple type AIRCRAFT MODEL.
Vertical splitting is the inverse of merging and imbedding of tuple types. If the original tuple
type contained tuples not having a value for any of the removed attributes, the new
dependent tuple type contains fewer tuples than the parent tuple type. You need not and
should not keep tuples just consisting of a value for the primary key and not containing
other useful information.
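A vertical split can be sketched as follows, again with Python's sqlite3 module standing in for the target database management system. The lowercase names are assumptions, only some of the columns from the visual are used, and the dimensions for the B747 are deliberately left out (unlike on the visual) to show a dependent table with fewer rows than its parent.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE aircraft_model (
        type_code TEXT NOT NULL, model_number TEXT NOT NULL,
        net_weight INTEGER, maximum_weight INTEGER,
        PRIMARY KEY (type_code, model_number));
    -- New dependent table with the same primary key as its parent.
    CREATE TABLE aircraft_model_dimensions (
        type_code TEXT NOT NULL, model_number TEXT NOT NULL,
        length REAL, height REAL, wing_span REAL,
        PRIMARY KEY (type_code, model_number),
        FOREIGN KEY (type_code, model_number)
            REFERENCES aircraft_model (type_code, model_number));
    INSERT INTO aircraft_model VALUES
        ('A340', '200', 156500, 274980), ('B747', '400', 226237, 396890);
    INSERT INTO aircraft_model_dimensions VALUES
        ('A340', '200', 59.40, 16.91, 60.30);
""")

# Processes that need both parts rejoin them over the shared primary key.
# The outer join keeps models that have no row in the dimensions table.
rows = con.execute("""
    SELECT m.type_code, m.model_number, m.net_weight, d.length
    FROM aircraft_model m
    LEFT JOIN aircraft_model_dimensions d
      ON d.type_code = m.type_code AND d.model_number = m.model_number
    ORDER BY m.type_code
""").fetchall()
print(rows)
```

Performance-critical processes that only need the weights read the shorter aircraft_model rows; only the less frequent processes pay for the join.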

Horizontal Splitting of Tuple Types
BEFORE

ENGINE
Engine Number  Engine Type  Manufacturer Code  Aircraft Number  Engine Position
PW9880193      PW4062       PW                 B474001323       1
PW9880194      PW4062       PW                 B474001323       2
PW9880195      PW4062       PW
PW9882345      PW4062       PW                 B474001323       3
PW9974034      PW4062       PW                 B474001323       4
A862946RR      RB211-254    RR
A59A350RR      RB211-254    RR
R375184566     CF6-80C2     GE                 A103003167       1
R375184567     CF6-80C2     GE
R375184568     CF6-80C2     GE                 A103003167       2

AFTER

ENGINE
Engine Number  Engine Type  Manufacturer Code  Aircraft Number  Engine Position
PW9880193      PW4062       PW                 B474001323       1
PW9880194      PW4062       PW                 B474001323       2
PW9880195      PW4062       PW
PW9882345      PW4062       PW                 B474001323       3
PW9974034      PW4062       PW                 B474001323       4
R375184566     CF6-80C2     GE                 A103003167       1
R375184567     CF6-80C2     GE
R375184568     CF6-80C2     GE                 A103003167       2

RETIRED ENGINE
Engine Number  Engine Type  Manufacturer Code  Aircraft Number  Engine Position
A862946RR      RB211-254    RR
A59A350RR      RB211-254    RR

Figure 7-16. Horizontal Splitting of Tuple Types CF182.0

Notes:
Horizontal splitting of tuple types means that you partition the tuples of the tuple types.
Basically, you create multiple tuple types with the same attributes as the original tuple type.
Each of the new tuple types contains a part of the tuples of the old tuple type. The new
tuple types are referred to as partitions (of the old tuple type).
How the tuples are partitioned is completely up to the application domain, and you should
consult the application domain expert for advice. The partitioning need not be based on key
ranges for the primary key.
In the example on the visual, the engines are partitioned into active engines and retired
engines. Retired engines are engines permanently taken out of service. Active engines are
engines still used by aircraft, even though they may not be mounted at present. The
appropriate tuple types have been called ENGINE (for the active engines) and RETIRED
ENGINE. We could have called the tuple type for the active engines differently, but it
seemed handy to still call it ENGINE.
As illustrated on the visual, it might happen that, for a partition, some of the attributes do
not assume a value for any of the tuples. These attributes can be dropped from the tuple
type. On the visual, this is the case for attributes Aircraft Number and Engine Position of
tuple type RETIRED ENGINE: Retired engines are not and will not be mounted on aircraft.
The horizontal splitting should be reflected in the entity-relationship model, especially if
the partitions take on a new meaning.
When horizontally splitting tuple types, other tuple types should not be referentially
dependent on the split tuple type; otherwise, their referential integrity can no longer be
enforced by means of the referential integrity support of the target database management
system.
One reason for the horizontal splitting of tuple types is the size limitation for tables.
Another reason may be that you want to assign the tuples for different responsibilities,
branches, or uses to different tables to avoid concurrent access problems.
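The split into ENGINE and RETIRED ENGINE can be sketched as follows, with Python's sqlite3 module standing in for the target database management system. The names engine_before and all_engines and the list retired are assumptions for illustration. The visual does not show how retired engines are identified, so the sketch simply assumes that the application domain expert supplies their engine numbers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE engine_before (
        engine_number TEXT PRIMARY KEY,
        engine_type TEXT NOT NULL,
        manufacturer_code TEXT NOT NULL,
        aircraft_number TEXT,
        engine_position INTEGER);
    INSERT INTO engine_before VALUES
        ('PW9880193', 'PW4062', 'PW', 'B474001323', 1),
        ('PW9880195', 'PW4062', 'PW', NULL, NULL),
        ('A862946RR', 'RB211-254', 'RR', NULL, NULL),
        ('A59A350RR', 'RB211-254', 'RR', NULL, NULL);
    CREATE TABLE engine (
        engine_number TEXT PRIMARY KEY,
        engine_type TEXT NOT NULL,
        manufacturer_code TEXT NOT NULL,
        aircraft_number TEXT,
        engine_position INTEGER);
    -- Columns that never assume a value for this partition are dropped.
    CREATE TABLE retired_engine (
        engine_number TEXT PRIMARY KEY,
        engine_type TEXT NOT NULL,
        manufacturer_code TEXT NOT NULL);
""")

retired = ["A862946RR", "A59A350RR"]   # assumed input from the domain expert
marks = ", ".join("?" * len(retired))
con.execute(
    "INSERT INTO retired_engine "
    "SELECT engine_number, engine_type, manufacturer_code "
    f"FROM engine_before WHERE engine_number IN ({marks})", retired)
con.execute(
    "INSERT INTO engine SELECT * FROM engine_before "
    f"WHERE engine_number NOT IN ({marks})", retired)

# A view can present the partitions as one table again where that is needed.
con.execute("""
    CREATE VIEW all_engines AS
        SELECT engine_number, engine_type, manufacturer_code,
               aircraft_number, engine_position FROM engine
        UNION ALL
        SELECT engine_number, engine_type, manufacturer_code,
               NULL, NULL FROM retired_engine""")


def count(table):
    return con.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


print(count("engine"), count("retired_engine"), count("all_engines"))
```

Note that the view only helps queries. A foreign key of another table cannot reference the view, which is the referential integrity restriction mentioned above.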

7.2 Physical Implementation


Built-In Data Types


Built-In Data Types
    Character String Data
        Single-Byte String
            Fixed length
            Varying length
            Character large object
        Double-Byte String
            Fixed length
            Varying length
            Double-byte character large object
    Datetime
        Date
        Time
        Timestamp
    Binary String Data
        Binary large object
    Numeric
        Binary Integer
            Small
            Large
            Big
        Decimal
            Packed
        Floating Point
            Real
            Double

Figure 7-17. Built-In Data Types CF182.0

Notes:
When creating the tables for the target database management system, you must define the
columns for the tables. Defining the columns means that you have to specify a name, a
data type, and some additional column attributes for them. The names for the columns
must follow the rules for the target database management system and must be unique for
each table as was discussed earlier in this unit.
For the data types, you must translate the application-domain specific data types for the
corresponding data elements into data types supported by the target system. Each
database management system provides a set of built-in (standard) data types. For many
columns, the built-in data types are sufficient. For data elements based on abstract data
types, the built-in data types might not be sufficient and additional functions of the target
database management system must be used to simulate the abstract data types as closely
as possible. For now, let us concentrate on the built-in data types. Abstract data types will
be discussed later in this topic.
Most of the database management systems provide built-in data types for character
strings, numeric data, datetime data, and binary strings:

• The data types for numeric data generally support integers, decimal numbers, and
floating-point numbers of varying sizes. The data types intended for integers generally
have binary representations of two (SMALLINT), four (INTEGER), or eight (BIGINT)
bytes supporting integers of different sizes. Check the reference manuals for your
database management system to determine the data types supported and their value
ranges. , for example, currently does not support BIGINT.
Decimal numbers are generally specified by means of DECIMAL(m[,n]) or
NUMERIC(m[,n]). Both specifications represent the same data type. m specifies the
number of digits and n the number of decimal places. If n is not specified, zero is
assumed, i.e., the numbers are integers. Internally, decimal numbers are mostly stored
in packed format. This means that each digit and the sign occupy half a byte. Again,
check the reference manuals for the supported syntax and the value ranges.
Floating-point numbers are approximations of real numbers. Normally, the target
database management systems support data types for single precision (REAL) and
double precision (DOUBLE). DOUBLE provides a better approximation of the real
numbers, but occupies more storage. In general, the representations occupy four and
eight bytes, respectively. Because of the different internal representations, check the
reference manuals for your database management system for the types supported and
their value ranges.
• The data types for character strings support single-byte character strings and
double-byte character strings. Single-byte character strings are sequences of one-byte
characters. Thus, each byte of the string represents a character of the underlying
character set. Frequently, if the context is clear, the term character string is used to
denote single-byte character strings.
Double-byte character strings are also referred to as graphic strings. They are
sequences of two-byte characters as required, for example, for some Asian character
sets. Thus, every two bytes of the string represent a character of the underlying
character set.
Both for single-byte and double-byte character strings, there are data types for
fixed-length strings, short varying-length strings, and large varying-length strings. The
latter are referred to as character large objects. For single-byte character strings, the
appropriate data types are CHARACTER(), VARCHAR(), and CLOB(). For double-byte
strings, they are GRAPHIC(), VARGRAPHIC(), and DBCLOB(). The maximum length
for the various data types depends on the target database management system. Thus,
check the reference manuals for your database management system to determine the
types and the maximum lengths supported. Character large objects allow millions or
even billions of characters.
• The datetime data types include data types for the date, the time, and timestamps. The
appropriate data types are DATE, TIME, and TIMESTAMP. As usual, the date includes
two digits for the day, two digits for the month, and four digits for the year. The time
includes two digits each for the hour, the minute, and the second. Timestamps include
date, time, and microseconds.

• Binary strings can be binary large objects (data type BLOB()). They are strings of bytes.
Unlike character strings which usually contain text data, they are used to hold
nontraditional data such as pictures. The maximum length can be millions or even
billions of bytes.
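The sizes mentioned above for binary integers and packed decimal numbers translate into value ranges and storage requirements as sketched below. The function names are hypothetical, and the integer ranges assume the usual signed two's-complement representation. Always check the reference manuals of your database management system for the actual limits.

```python
def binary_integer_range(num_bytes: int) -> tuple[int, int]:
    """Value range of a signed two's-complement integer of the given size."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def packed_decimal_bytes(digits: int) -> int:
    """Bytes needed when each digit and the sign occupy half a byte."""
    half_bytes = digits + 1           # m digits plus one sign
    return (half_bytes + 1) // 2      # round up to whole bytes


for name, size in (("SMALLINT", 2), ("INTEGER", 4), ("BIGINT", 8)):
    low, high = binary_integer_range(size)
    print(f"{name}: {low} to {high}")

for m in (5, 7, 8, 15):
    print(f"DECIMAL({m}): {packed_decimal_bytes(m)} bytes")
```

For example, a column that must hold values up to 40000 does not fit into a two-byte integer, so the next bigger data type would be needed.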
Here are some design considerations for the built-in data types:
• Data that are numeric should be defined as numeric data and not as character strings
even if you will not perform calculations with them. When the data are defined as
numeric to the database management system, the database management system can
verify the correctness of the data for you. In addition, it can check if the data fall in the
supported or defined (check constraints) value ranges.
If the data are defined as character strings, all characters of the character set are valid
and the business processes must verify the correctness of the data themselves.
• For integer data, you have multiple choices for the data type. If binary integer data types
support the expected value range for the column, choose one of them because
binary-integer operations are generally cheaper. Choose the data type that best fits the
size of your expected data, but make sure that future extensions will not make the data
type obsolete. Rather, choose the next bigger data type. To change the data type
afterwards, you must delete the table and recreate it. This has consequences for the
objects based on the table and for authorizations you have granted.
• For character columns, you may have the choice between CHARACTER and
VARCHAR. If the actual length of the values varies, VARCHAR may save space.
However, you should be aware that the system adds two bytes for storing the length in
case of VARCHAR. Also, VARCHAR may slightly increase the processing time.
Furthermore, programmers do not like to work with varying-length data.
Therefore, only use VARCHAR if the length of the data varies considerably or you do
not have another choice because of the maximum length of the data. As a ballpark
figure, the difference between the average length and the maximum length for the
column should be greater than 25 bytes. The information for the corresponding data
element in the data inventory should tell you this.
If your target system supports compression, the space argument for VARCHAR
disappears and there is even less reason to use VARCHAR if you can use
CHARACTER instead.
• If you have VARCHAR columns, you should define them as last columns of the table to
save processing time. The sequence in which the columns are defined does not
mandate a sequence for their retrieval. For mass retrieval, some database
management systems calculate the offsets of the various columns once and not for
every row retrieved. They can only do this for the columns preceding the first
varying-length column and for the first varying-length column.
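The ballpark figure for choosing between CHARACTER and VARCHAR can be written down as a small helper. This is only a sketch of the rule of thumb given above: the function names are hypothetical, and the helper ignores the case where VARCHAR is the only choice because of the maximum length supported for CHARACTER.

```python
VARCHAR_LENGTH_PREFIX = 2    # bytes the system adds for storing the length
BALLPARK_THRESHOLD = 25      # bytes, the ballpark figure from the notes


def suggest_character_type(average_length: int, maximum_length: int) -> str:
    """Suggest VARCHAR only if the length varies considerably."""
    if maximum_length - average_length > BALLPARK_THRESHOLD:
        return f"VARCHAR({maximum_length})"
    return f"CHARACTER({maximum_length})"


def average_bytes_saved(average_length: int, maximum_length: int) -> int:
    """Average saving per value of VARCHAR over CHARACTER (may be negative)."""
    return maximum_length - (average_length + VARCHAR_LENGTH_PREFIX)


print(suggest_character_type(12, 20), average_bytes_saved(12, 20))
print(suggest_character_type(30, 200), average_bytes_saved(30, 200))
```

The average and maximum lengths would come from the data inventory entry for the corresponding data element.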

Column Attributes - Nullable Columns
Columns need not assume a value for every row, i.e., a value need not
necessarily be provided for each row
Characteristic (attribute) for column
Column referred to as nullable
Special indicator used to indicate if the column has a value for a row
For rows without values, column is said to assume a value of NULL

If the value for a column of a row is NULL, no value has been provided for it
Different from 0 (zero) for numeric columns
Different from blanks for fixed-length character columns
Different from a string of length 0 for varying-length character columns

ENGINE
Engine_Number  Engine_Type  Manufacturer_Code  Aircraft_Number  Engine_Position
PW9880193      PW4062       PW                 B474001323       1
PW9880194      PW4062       PW                 B474001323       2
PW9880195      PW4062       PW                 --NULL--         --NULL--         Is not mounted
PW9882345      PW4062       PW                 B474001323       3
PW9974034      PW4062       PW                 B474001323       4
M18940012      CFM56        CFM                A192003001       0                Is mounted on position 0 (may be a valid position)
M18940015      CFM56        CFM                A192003001       1
M18940168      CFM56        CFM                --NULL--         --NULL--         Is not mounted
Figure 7-18. Column Attributes - Nullable Columns CF182.0

Notes:
For a tuple type, some attributes (e.g., the primary key attributes) must assume a value for
every tuple whereas others need not. To correctly reflect this, it must be possible to specify
for the columns of the corresponding tables whether or not they must assume a value for
every row.
Indeed, it can be specified for a column, as a column attribute (characteristic), whether or
not a value must be provided for every row. A column that need not assume a value for
every row is referred to as a nullable column.
Internally, most database management systems use a special indicator, referred to as null
indicator, to indicate if the column has a value for a row. If a column does not have a value
for a row, it is said that the column has the value NULL for the row. This is a way of
speaking even though it is a contradiction in terms.
If the value for a column is NULL for a row, a value has not been provided for that row. For
numeric columns, this is different from a value of 0 (zero) for the column. For fixed-length
character columns, it is different from a value of all blanks. For varying-length character
columns, it is different from a character string of length 0.

In the example on the visual, engine M18940012 has an engine position of 0 meaning that
it is mounted on an aircraft in position 0. It does not mean that the engine is not mounted.
Zero may be a valid engine position.
In contrast, engines PW9880195 and M18940168 have an engine position of NULL. This
means that an engine position has not been provided for them: they are not mounted on an
aircraft.
You should be aware that NULL values may lead to different results for SQL functions or
operations than values of 0 or blanks or strings of length 0. For example, this is the case for
the column functions AVG and COUNT and for Join operations.
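The difference for AVG and COUNT can be seen in the following sketch, which uses Python's sqlite3 module as a stand-in for the target database management system and three of the engines from the visual. The lowercase names are assumptions. Column functions skip NULL values, whereas a value of 0 is counted and averaged like any other value.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE engine (
        engine_number TEXT PRIMARY KEY,
        engine_position INTEGER);
    INSERT INTO engine VALUES
        ('M18940012', 0), ('M18940015', 1), ('M18940168', NULL);
""")

# COUNT(*) counts rows; COUNT(column) and AVG(column) ignore NULL values.
count_rows, count_positions, avg_position = con.execute(
    "SELECT COUNT(*), COUNT(engine_position), AVG(engine_position) "
    "FROM engine").fetchone()

# Treating the missing position as 0 instead changes the average.
(avg_with_zero,) = con.execute(
    "SELECT AVG(COALESCE(engine_position, 0)) FROM engine").fetchone()

print(count_rows, count_positions, avg_position, avg_with_zero)
```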
Nullable columns occupy a little additional storage and their handling requires a little extra
processing time. However, the additional storage or processing time is insignificant. You
should define columns that, from the perspective of the application domain, need not
always contain a value as nullable, rather than trying to save the extra overhead.

Nullable Columns and Cardinalities
ENGINE
  Engine Number, PK                        NOT NULL
  Engine Type                              NOT NULL
  Manufacturer Code                        NOT NULL
  Aircraft Number [0..1]                   Nullable
  Engine Position [0..1]                   Nullable

FLIGHT
  Flight Number, PK                        NOT NULL
  Airport Code AS From, PK                 NOT NULL
  Airport Code AS To, PK                   NOT NULL
  Flight Locator, PK                       NOT NULL
  Departure AS Planned Departure
    Departure Date                         NOT NULL
    Departure Time                         NOT NULL
  Arrival AS Planned Arrival
    Arrival Time                           NOT NULL
    Arrival Date                           NOT NULL
  Departure AS Actual Departure [0..1]
    Departure Date                         Nullable
    Departure Time                         Nullable
  Arrival AS Actual Arrival [0..1]
    Arrival Date                           Nullable
    Arrival Time                           Nullable

Figure 7-19. Nullable Columns and Cardinalities CF182.0

Notes:
As you certainly remember, we have introduced cardinalities for the attributes of tuple
types. The minimum cardinality for an attribute determines whether or not, in the context
used, the attribute must always assume a value. Thus, the minimum cardinalities for the
attributes determine whether or not the corresponding columns must always have a value.
The first example on the visual shows tuple type ENGINE. Its first three attributes do not
have a cardinality specified. This means that their implied cardinality is [1..1]. Since their
minimum cardinality is 1, the attributes and, thus, the corresponding columns must always
have a value. This can be defined by specifying NOT NULL for the columns.
The last two attributes of tuple type ENGINE have a minimum cardinality of 0. Therefore,
they need not assume a value for every tuple. Accordingly, the corresponding columns
need not assume a value for every row, i.e., the columns are nullable. This can be defined
by not specifying NOT NULL for the columns. By default, columns are nullable.
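For tuple type ENGINE, the resulting column definitions might look like the following sketch. It uses Python's sqlite3 module as a stand-in for the target database management system. The lowercase column names, the data types, and the engine number PW9880196 are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE engine (
        engine_number     TEXT NOT NULL PRIMARY KEY,  -- implied [1..1]
        engine_type       TEXT NOT NULL,              -- implied [1..1]
        manufacturer_code TEXT NOT NULL,              -- implied [1..1]
        aircraft_number   TEXT,     -- [0..1]: NOT NULL omitted, so nullable
        engine_position   INTEGER   -- [0..1]: NOT NULL omitted, so nullable
    )""")

# An engine that is not mounted: the nullable columns are simply omitted.
con.execute(
    "INSERT INTO engine (engine_number, engine_type, manufacturer_code) "
    "VALUES ('PW9880195', 'PW4062', 'PW')")

# Omitting a NOT NULL column (here engine_type) is rejected by the system.
try:
    con.execute(
        "INSERT INTO engine (engine_number, manufacturer_code) "
        "VALUES ('PW9880196', 'PW')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row = con.execute(
    "SELECT aircraft_number, engine_position FROM engine").fetchone()
print(rejected, row)
```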
The second example on the visual illustrates the cardinalities for tuple type FLIGHT and
demonstrates that the cardinalities must be interpreted in the context of the comprising
structure: The first four attributes of FLIGHT are elementary attributes of the tuple type and
have an implied minimum cardinality of 1. Since they are direct attributes of the tuple type,
their minimum cardinality determines directly whether or not the corresponding columns
are nullable.
The other elementary attributes are components of composite attributes. All their minimum
cardinalities are 1. The minimum cardinalities of components do not alone determine
whether or not the corresponding columns are nullable. However, if the minimum
cardinality is 0, the column must be nullable.
If the minimum cardinality is 1, the associated column may still have to be nullable. This
depends on the minimum cardinality of the comprising composite attribute, the minimum
cardinality of the composite attribute comprising the composite attribute, and so on. If the
minimum cardinality of the comprising composite attribute is 1 and the composite attribute
is not again a component of another composite attribute, the corresponding column is not
nullable. It must be defined with NOT NULL. If the composite attribute is again contained in
a composite attribute, the minimum cardinality of the latter decides if the column will be
nullable.
If the minimum cardinality of the composite attribute comprising the elementary attribute is
0, the corresponding column must be defined as nullable.
In the example on the visual, composite attributes Planned Departure and Planned Arrival
have a minimum cardinality of 1. Since they are not again components of another
composite attribute, the columns for their elementary attributes must be defined with NOT
NULL. In contrast, composite attributes Actual Departure and Actual Arrival have a
minimum cardinality of 0. Accordingly, the columns associated with their elementary
attributes must be defined as nullable despite the minimum cardinality of 1 for the
elementary attributes.
This added complexity stems from the fact that relational database management systems
currently do not support composite attributes.
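The nullability rule described above can be sketched in a few lines of Python. This is only an illustration: the function name column_is_nullable and the list representation of the cardinality chain are assumptions made for this sketch, not part of the course material. The idea is that a column must be nullable as soon as the elementary attribute or any comprising composite attribute has a minimum cardinality of 0.

```python
def column_is_nullable(min_cardinalities):
    """Decide nullability for the column of an elementary attribute.

    min_cardinalities lists the minimum cardinality of the elementary
    attribute first, followed by the minimum cardinality of each
    comprising composite attribute, from the innermost to the outermost.
    (Hypothetical representation chosen for this sketch.)
    """
    # A single 0 anywhere in the chain means a row may lack the value.
    return any(cardinality == 0 for cardinality in min_cardinalities)


# Component of Planned Departure: own minimum 1, composite minimum 1.
planned = column_is_nullable([1, 1])   # NOT NULL column

# Component of Actual Departure: own minimum 1, composite minimum 0.
actual = column_is_nullable([1, 0])    # nullable column

# Direct elementary attribute of FLIGHT with implied minimum 1.
direct = column_is_nullable([1])       # NOT NULL column
```

For a direct elementary attribute the chain has one entry, so its own minimum cardinality decides, as stated in the notes.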

Column Attributes - Default Values
Default Values
• System Defaults
• User Defaults

System defaults by category of data type:
• Numeric Data: 0
• Fixed-Length Strings: Blanks
• Variable-Length Strings: String of length 0
• Datetime Data
  - Date: Current date
  - Time: Current time
  - Timestamp: Current timestamp

Figure 7-20. Column Attributes - Default Values CF182.0

Notes:
The discussions about columns that always must have a value or need not have a value for
a row raise some questions:
• Independent of whether or not the column is nullable, what happens if a value is not
provided for a row? Does the system provide a default value?
• For nullable columns, does the column receive the value NULL or another default value?
This and the next visual will answer these questions.
Most target database management systems allow you to specify that a default value should
be assumed if a value is not provided for a row. The default values assumed can be system
defaults or user defaults.
System defaults are default values used by the database management system if:
• The column may assume default values.
• The database administrator has not defined a specific default for the column.
• The user has not provided a column value for a row.

All three conditions must be satisfied.


User defaults are default values defined by the database administrator for columns. The
selected values can be different from the system default values. User default values are
assumed if:
• The column may assume default values.
• The database administrator has defined a specific default value for the column.
• The user has not provided a column value for a row.
All three conditions must be satisfied.
The visual illustrates the system default values for the various categories of data types.

Selection of Default Values
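column-name data-type
• (nullable)
  - WITH DEFAULT not specified: NULL
  - WITH DEFAULT, nothing specified: System default
  - WITH DEFAULT, value specified: User provided value
  - WITH DEFAULT, NULL specified: NULL
• NOT NULL
  - WITH DEFAULT, nothing specified: System default
  - WITH DEFAULT, value specified: User provided value
  - WITH DEFAULT not specified: Must provide value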

Figure 7-21. Selection of Default Values CF182.0

Notes:
When defining a column for a table, you specify if the column may assume default values
and which default value it should assume. This is controlled by the WITH DEFAULT
keywords.
If the column is nullable and you do not specify WITH DEFAULT, the implicit default for the
column is the NULL value. That is, the column will not contain a value for a row if the user
does not provide a value for the row on inserts.
If you specify WITH DEFAULT for nullable columns, the default value assumed depends on
whether or not you have provided a default value of your own. If you have not provided a
default value, the column will assume the system default value for the category of column. If
you provide your own default value, you can specify any value compatible with the data type
for the column or explicitly request that the column be set to NULL.
Similarly, for columns that always must assume a value (NOT NULL), you can request that
they assume a default value for a row if a value has not been provided. If you specify WITH
DEFAULT, but do not provide a default value of your own, the system default for the
appropriate category of data type is assumed. If you provide your own default value, it is
assumed.

Finally, if you do not specify WITH DEFAULT for a column that always must have a value, a
value must be provided for every row inserted; otherwise, the request fails.
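The selection rules of this visual can be summarized as a small decision function. The following Python sketch only models the rules stated in the notes; the function name stored_value, its parameters, and the use of None to represent NULL are assumptions made for illustration.

```python
MISSING = object()  # marks "nothing specified" or "no value supplied"


def stored_value(nullable, with_default, user_default=MISSING,
                 system_default=0, supplied=MISSING):
    """Return the value stored for a column when a row is inserted.

    None stands for NULL. system_default is the system default for the
    column's category of data type (0 for numeric data, for example).
    """
    if supplied is not MISSING:
        return supplied                  # the user provided a value
    if not with_default:
        if nullable:
            return None                  # implicit default is NULL
        raise ValueError("NOT NULL column without default: value required")
    if user_default is MISSING:
        return system_default            # WITH DEFAULT, nothing specified
    if user_default is None and not nullable:
        # The visual offers no NULL branch for NOT NULL columns.
        raise ValueError("NULL is not a valid default for a NOT NULL column")
    return user_default                  # WITH DEFAULT value (or NULL)


nullable_plain = stored_value(nullable=True, with_default=False)
nullable_system = stored_value(nullable=True, with_default=True)
not_null_user = stored_value(nullable=False, with_default=True,
                             user_default="UNKNOWN")
```

Calling stored_value(nullable=False, with_default=False) without a supplied value raises an error, which corresponds to the failing insert request described in the last paragraph of the notes.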

Considerations for Abstract Data Types

When implementing an abstract data type:
• You must ensure that the data for the abstract data type are properly
  represented in the database
• You must ensure that the data for the abstract data type satisfy any
  value and length constraints imposed on them
• You must ensure that the desired operations, and only those, can
  be performed with data of the abstract data type
• You want to provide functions converting external input into the
  stored format for the abstract data type

Figure 7-22. Considerations for Abstract Data Types CF182.0

Notes:
When implementing abstract data types as discussed in Unit 5 - Data and Process
Inventories, the following considerations apply.
• Each abstract data type has its own set of allowable values and you must ensure that
the values are properly represented in the database of the target database management
system.
In some cases (e.g., for our sample abstract data type called name data), you want to
store the data in a normalized format. Thus, you must ensure that the data in the
database are in the normalized format.
• Abstract data types can be parameterized. In particular, they may allow you to specify
minimum and maximum lengths for each usage by data elements. Thus, when
implementing the abstract data type, you must ensure that the length constraints for
data elements are reflected as constraints for the columns and enforced for each usage.
In addition, the data elements of the application domain may have domains, i.e., value
constraints further restricting the values of their abstract data types. When defining a
column based on such a data element, you must ensure that its value constraints are
adhered to.
• Abstract data types generally provide a set of operations. When implementing an
abstract data type, you must ensure that these operations can be performed. Also, you
want to ensure that other illegal operations cannot be performed with data of the
abstract data type.
• If data can be entered by end users in different formats, but you want to store the data in
a normalized format, you should provide functions converting the external input into the
normalized format. You need these functions for comparing entered data with the stored
data (e.g., in the WHERE clause of SELECT statements). If the data entered were
compared directly with the stored data, you would not necessarily find the requested
data.

User Defined Distinct Types
• Allow you to define your own data types based on built-in data types
• Cannot be parameterized
• Always have a fixed maximum length
  - Even if based on a varying-length built-in data type
  - For a varying-length source data type, the maximum length is the
    length specified when the user defined distinct type is created
  - Cannot specify a different (smaller) maximum length when the user
    defined distinct type is used by a column
  - Must define it with the maximum length intended for any columns
    and restrict the actual column lengths by other means
• Disallow all operations for source data type except comparisons
  - Can only compare data of same user-defined distinct type
  - Cannot compare directly with data of source data type
  - Must cast to source data type to compare with source data type
• Prevents illegal operations and incorrect comparisons

Figure 7-23. User Defined Distinct Types CF182.0

Notes:
User defined distinct types (UDTs) allow you to define your own data types based on the
built-in data types provided by the target database management system. However, they are
fairly simple-minded data types and cannot be parameterized.
When you create a user defined distinct type, you must select a built-in data type.
The built-in data type is referred to as source data type. If the source data type allows you
to specify a length, a number of digits, or a number of decimal places, you must specify the
appropriate values when you create the user defined distinct type.
Even if the user defined distinct type is based on a varying-length built-in data type, you
cannot specify a length later when the user defined distinct type is used as data type for a
column. The maximum length for the column is that defined for the user defined distinct
type. If you want to use the same user defined distinct type for multiple columns, you must
define it with the maximum length for any anticipated columns. You must use other means
to restrict the actual lengths of the columns. Alternatively, use different user defined distinct
types.

With the exception of the comparison operations, the user defined distinct type does not
inherit any functions or operations of its source data type. Without further actions, you
cannot use any scalar or column functions for the source data type.
The comparison operations inherited are limited to the comparison of data belonging to the
user defined distinct type. You cannot directly compare data of the user defined distinct
type with data of the source data type. When you create a user defined distinct type, cast
functions are provided allowing you to change source data to data of the user defined
distinct type and vice versa. The cast data can then be compared with data of the
appropriate data type. The cast function changing data of the source data type to data of
the user defined distinct type has the same name as the user defined distinct type:
udt-name(source-data) → udt-data
The cast function changing data of the user defined distinct type to data of the source data
type has the same name as the source data type:
source-name(udt-data) → source-data
By using user defined distinct types, you can prevent illegal operations and incorrect
comparisons for columns of the same source data type having different semantics.
User defined distinct types are not supported by all target database management systems.

User Defined Distinct Types - Example

Tuple type AIRCRAFT MODEL:
Type Code, PK
Model Number, PK
Dimensions
  Length
  Height
  Wing Span
Weights
  Net Weight
  Maximum Weight
Cruising Speed
Range

CREATE DISTINCT TYPE METER AS DECIMAL(5,2)
  WITH COMPARISONS
CREATE DISTINCT TYPE CM AS INTEGER
  WITH COMPARISONS
CREATE TABLE AIRCRAFT_MODEL
(
  ...
  Length_of_Model CM NOT NULL,
  Height_of_Model CM NOT NULL,
  Wing_Span METER NOT NULL,
  ...
)

SELECT * FROM AIRCRAFT_MODEL
  WHERE Length_of_Model < Wing_Span

ILLEGAL!!! Different user defined distinct types

Figure 7-24. User Defined Distinct Types - Example CF182.0

Notes:
The example on the visual creates two user defined distinct types: One user defined
distinct type is based on built-in data type DECIMAL and is intended to represent
measurements in meters; the other is based on built-in data type INTEGER and is
supposed to represent measurements in centimeters. Their names are METER and CM,
respectively.
When creating the table for tuple type AIRCRAFT MODEL, columns Length_of_Model and
Height_of_Model are defined with user defined distinct type CM. Column Wing_Span is
defined with user defined distinct type METER. (Note that you should really have defined
all three dimensions with the same user defined distinct type.)
If you want to determine all aircraft models whose length is smaller than their wing span,
you cannot specify Length_of_Model < Wing_Span in the WHERE clause of the SELECT
statement. This is because the user defined distinct types of Length_of_Model and
Wing_Span are different. The comparison would indeed provide an incorrect result and is,
therefore, considered illegal.

For a valid and correct comparison, you must cast the two columns to their source data
types and convert meters to centimeters in the WHERE clause:
WHERE INTEGER(Length_of_Model) < 100 * DECIMAL(Wing_Span)
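The behavior of the two distinct types can be mimicked outside the database to make the rule tangible. The following Python sketch is an analogy only, not DB2 behavior: the classes DistinctType, Meter, and Cm, the attribute value (standing in for the cast to the source data type), and the sample measurements are all assumptions made for illustration.

```python
from decimal import Decimal


class DistinctType:
    """Wraps a source value; comparisons work only within the same type."""

    def __init__(self, value):
        self.value = value  # reading .value plays the role of the cast

    def _require_same_type(self, other):
        if type(other) is not type(self):
            raise TypeError(
                f"cannot compare {type(self).__name__} "
                f"with {type(other).__name__}"
            )

    def __eq__(self, other):
        self._require_same_type(other)
        return self.value == other.value

    def __lt__(self, other):
        self._require_same_type(other)
        return self.value < other.value


class Meter(DistinctType):
    """Stands in for METER, based on DECIMAL(5,2)."""


class Cm(DistinctType):
    """Stands in for CM, based on INTEGER."""


length_of_model = Cm(5500)               # illustrative value only
wing_span = Meter(Decimal("64.44"))      # illustrative value only

try:
    length_of_model < wing_span          # different distinct types
    direct_comparison_allowed = True
except TypeError:
    direct_comparison_allowed = False

# Cast both to their source types and convert meters to centimeters.
shorter_than_span = length_of_model.value < 100 * wing_span.value
```

As in the WHERE clause above, the comparison only becomes legal, and correct, after both values have been taken back to their source data types and brought to the same unit.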

User Defined Functions (UDFs)
User Defined Functions
• External Functions
  - Scalar Functions
  - Table Functions
• Sourced Functions
  - Scalar Functions
  - Column Functions

• Allow you to write your own functions
  - To be used in SQL DML statements
  - To be used in SQL DDL statements
• External functions are based on programs written by you
• Sourced functions based on existing built-in or user defined functions
  - Allow to extend existing functions to new user defined distinct types
• Scalar functions are passed arguments and return a single value
• Column functions are passed a column and return a single value
• Table functions are passed arguments and return a table
  - One row for each invocation
  - Can only be used in FROM clause

Figure 7-25. User Defined Functions (UDFs) CF182.0

Notes:
User defined functions (UDFs) allow you to write your own functions for use in SQL
statements. The user defined functions you provide can be used in Data Manipulation
Language (DML) statements or Data Definition Language (DDL) statements. DML
statements are SELECT, INSERT, UPDATE, or DELETE. DDL statements are SQL
statements creating, altering, and deleting database objects, such as tables, indexes, user
defined distinct types, or user defined functions.
User defined functions can either be external functions or sourced functions. External
functions are based on programs, written in any of the programming languages supported
by the target database management system, that you provide. Of course, the functions
have to follow certain conventions concerning the passing and returning of arguments, but,
in the programs, you can pretty much do what you want. Depending on the database
management system, you may even issue SQL statements.
Sourced functions are based on existing built-in (system provided) functions or existing
user defined functions. Their primary purpose is to extend existing functions (e.g., the AVG
function or the LENGTH function) for the source data type to a newly created user defined
distinct type. They also allow you to rename an existing built-in function or user defined
function.
User defined functions can be scalar functions, column functions, or table functions. Scalar
functions are passed a set of arguments and return a single value. An example of a built-in
scalar function is the LENGTH function which returns the length of the expression
(argument) passed to it.
Column functions are passed the values of a column (or a subset thereof) and return a
single value which generally is derived from the values of the column. An example of a
built-in column function is the MIN function which returns the minimum of the column
values passed to it.
Table functions are passed a set of arguments and return a table row for each invocation.
They can only be used in the FROM clause of SELECT statements.
External functions can either be scalar functions or table functions. They cannot be column
functions. Sourced functions can only be scalar functions or column functions.
You can overload functions. You can define multiple functions with the same name as long
as the signatures of the various functions are different. This means that the data type of at
least one parameter must be different. Based on the data types of the arguments passed,
the database management system is capable of selecting the proper function.
User defined functions are not supported by all target database management systems.
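The idea of selecting a function by its signature can be illustrated with a small registry. This Python sketch is a simplification: it requires an exact match of the argument types, whereas a real database management system applies its own, more elaborate resolution rules. The names REGISTRY, define_function, invoke, and DESCRIBE are hypothetical.

```python
# Hypothetical registry mapping a signature (name, parameter types) to code.
REGISTRY = {}


def define_function(name, param_types, implementation):
    """Register a function; the same name may be reused with other types."""
    signature = (name, tuple(param_types))
    if signature in REGISTRY:
        raise ValueError(f"signature {signature} is already defined")
    REGISTRY[signature] = implementation


def invoke(name, *args):
    """Select the function whose parameter types match the arguments."""
    signature = (name, tuple(type(arg) for arg in args))
    if signature not in REGISTRY:
        raise LookupError(f"no function matches {signature}")
    return REGISTRY[signature](*args)


# Two functions with the same name but different parameter data types.
define_function("DESCRIBE", (str,), lambda s: f"string of length {len(s)}")
define_function("DESCRIBE", (int,), lambda n: f"integer {n}")

text_result = invoke("DESCRIBE", "FRA")
number_result = invoke("DESCRIBE", 4)
```

Defining a second function with an identical signature is rejected, which mirrors the requirement that the signatures of overloaded functions must differ.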

UDFs - Definition and Invocation
CREATE DISTINCT TYPE TEXTDATA
  AS VARCHAR(100)
  WITH COMPARISONS

CREATE FUNCTION NORM(TEXTDATA)
  RETURNS TEXTDATA
  EXTERNAL NAME 'program'
  LANGUAGE programming-language
  ...

NORM checks the text data string for correctness and converts it into stored
format (normalizes it). The external name identifies a program in a program library.

CREATE FUNCTION
  SUBSTR(TEXTDATA, INTEGER, INTEGER)
  RETURNS VARCHAR(100)
  SOURCE
  SYSIBM.SUBSTR(VARCHAR(), INTEGER, INTEGER)

• Must provide signature of function
  - Name of function
  - Data type(s) of parameters for function or of column passed
• Must describe output returned
  - For scalar or column functions, data type of value returned
  - For table functions, names and data types of columns returned
• Invocation: function-name ( expression , . . . )

Figure 7-26. UDFs - Definition and Invocation CF182.0

Notes:
This visual illustrates the definition of an external scalar and a sourced scalar user defined
function using user defined distinct type TEXTDATA.
The first user defined function, called NORM, checks data of user defined distinct type
TEXTDATA passed to it for correctness (it may only contain certain characters) and
converts it into a normalized text-data format.
Since the function is passed arguments and returns a single value, it is a scalar function.
When you define a function, you must describe the signature of the function. You must
specify:
• The name of the function.
• The data type(s) (including lengths) of the arguments passed to the function (i.e., of the
parameters for the function) or of the column passed. The latter applies to column
functions.

You must also describe the output returned. For scalar or column functions, you must
specify the data type of the value returned. For table functions, you must specify the names
and the data types of the columns returned.
Function NORM is an external scalar function since it is based on a user-provided program.
To allow the database management system to establish the connection to the program
when the function is used, the object program to be executed must be identified when the
function is defined. The programming language in which the program is written must be
specified as well.
The second function on the visual is a sourced function extending built-in function SUBSTR
to text data. When you define it, you must again specify its signature and output for the new
data type. Furthermore, you must tell the system on which existing function it is based
(SOURCE). For the source function, you must provide its signature as well. For the
parameters of the source function, you need not provide lengths or decimal places since
they are already known to the system. However, you must specify the enclosing
parentheses if the data type has parameters.
The qualifier SYSIBM in the example identifies the source function as a built-in function of
an IBM database management system.
A user defined function is invoked by specifying its name followed, in parentheses, by the
arguments passed to the function.
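To see an external scalar function being defined and invoked, the following Python sketch uses SQLite through the standard sqlite3 module as a stand-in. This is not DB2: SQLite has no CREATE FUNCTION statement or distinct types, and the function is registered from the host program instead. The table passenger, the column last_name, and the sample names are assumptions; the normalization steps follow the Name Data visual later in this unit (trim blanks, reduce groups of blanks, uppercase), and the correctness check mentioned on the visual is omitted.

```python
import re
import sqlite3


def norm(text):
    """Trim outer blanks, reduce inner groups of blanks, uppercase letters."""
    if text is None:
        return None
    return re.sub(r" +", " ", text.strip(" ")).upper()


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name NORM with one argument.
conn.create_function("NORM", 1, norm)

conn.execute("CREATE TABLE passenger (last_name TEXT NOT NULL)")
# Store the data in normalized format by invoking NORM on insert.
conn.execute("INSERT INTO passenger VALUES (NORM(?))", ("  van  der Berg ",))

entered = "Van der  Berg"  # the same name, entered in a different format

# Comparing the entered data directly with the stored data finds nothing.
direct_hits = conn.execute(
    "SELECT COUNT(*) FROM passenger WHERE last_name = ?", (entered,)
).fetchone()[0]

# Normalizing the entered data in the WHERE clause finds the row.
normalized_hits = conn.execute(
    "SELECT COUNT(*) FROM passenger WHERE last_name = NORM(?)", (entered,)
).fetchone()[0]
```

The function is invoked exactly as described above: its name followed, in parentheses, by the argument passed to it.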

Check Constraints
• Allow you to restrict acceptable values for columns
• Can be defined on column or table level
  - On column level, to restrict accepted values for column concerned
  - On table level, to restrict accepted values for columns of table in
    relationship to each other
• Basically, check expression is a search condition evaluating to true,
  false, or unknown
  - Predicates can be combined by AND and OR
• Restrictions for check expressions depending on target DBMS
  - For example, for DB2 UDB for OS/390
    Subselects not allowed
    Built-in or user-defined functions not allowed
    EXISTS and quantified predicates not allowed
    CASE expressions not allowed
    First operand of predicate must be a column
  - For example, for DB2 UDB for UNIX- and Intel-Based Platforms
    Subselects not allowed
    Some restrictions on use of user-defined functions
• Enforced during the insertion, updating, and loading of rows

Figure 7-27. Check Constraints CF182.0

Notes:
Check constraints allow you to restrict the accepted values for columns of tables beyond
the values permitted by the column's data type.
Check constraints can be defined on the column level or on the table level. This means that
they can be defined for a particular column or for the table as such. When a check
constraint is defined for a column, it can just restrict the values for the column concerned.
References to other columns are not allowed.
In contrast, a check constraint that is defined on the table level can refer to any defined
column of the table. Thus, it can restrict the values of columns in relationship to each other.
For example, you may enforce that the value of one column does not exceed the value of
another column of the same row.
Check constraints use check expressions. Basically, a check expression is a search
condition evaluating to true, false, or unknown. It may consist of predicates combined by
the logical operators AND and OR. A predicate specifies a condition that is true, false, or
unknown. The result is unknown, for example, if comparing with a NULL value.

If the check expression for a constraint evaluates to true or unknown, the constraint is
considered as satisfied.
Some database management systems impose severe restrictions on the check
expressions of check constraints. The visual lists some for DB2 Universal Database for
z/OS and DB2 Universal Database for UNIX- and Intel-Based Platforms. For the precise
restrictions, see the reference manuals for your database management system.
Check constraints are enforced during the insertion, updating, and loading of rows.
Check constraints need not be defined when the table is created. They can be added later.
However, they are only enforced during subsequent operations. Existing rows are not
automatically rechecked when a check constraint is added.

Check Constraints - Examples

Values of abstract data type:

CREATE TABLE AIRPORT
(
  Airport_code CHARACTER(3) NOT NULL
    CONSTRAINT APC
    CHECK( Airport_code IN ( 'ATL', 'CDG', 'DFW', 'FCO', 'FRA',
                             'JFK', 'LAS', 'LAX', 'MAD', 'ORD',
                             'SAN', 'SFO', 'SJC', 'STR', 'ZRH', . . . )
         ),
  ...
)

Domain for data element:

CREATE TABLE AIRCRAFT_TYPE
(
  ...
  Number_of_Engines INTEGER NOT NULL
    CONSTRAINT NO_ENGINES
    CHECK( Number_of_Engines BETWEEN 0 AND 4 ),
  ...
)

Figure 7-28. Check Constraints - Examples CF182.0

Notes:
The first example on the visual illustrates how abstract data type AIRPORT CODE defined
in Unit 5 - Data and Process Inventories could be implemented. The abstract data type has
a finite set of values, namely, the three-letter codes for airports. Columns of the abstract
data type could be defined as 3-character columns with the check constraint shown on the
visual. The check expression for the check constraint uses the IN predicate listing the valid
character strings. On the visual, only a few values are shown as indicated by the ellipsis.
The second example on the visual implements the domain for data element Number of
Engines defined in Unit 5 - Data and Process Inventories. It uses the BETWEEN predicate
to enforce that the values for column Number_of_Engines, i.e., the number of engines for
an aircraft type, are between 0 and 4.
Note that check constraints can be named.
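The effect of a named check constraint, including the rule that a check expression evaluating to unknown counts as satisfied, can be tried out with SQLite through Python's sqlite3 module. This is a sketch under assumptions: SQLite stands in for the target database management system, the column type_code is hypothetical, and Number_of_Engines is deliberately left nullable here (unlike on the visual) so that the unknown case can be shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE aircraft_type (
        type_code TEXT NOT NULL,
        number_of_engines INTEGER
            CONSTRAINT no_engines
            CHECK (number_of_engines BETWEEN 0 AND 4)
    )
    """
)


def try_insert(type_code, engines):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO aircraft_type VALUES (?, ?)", (type_code, engines)
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted_in_range = try_insert("A", 2)       # check expression is true
accepted_out_of_range = try_insert("B", 6)   # check expression is false
accepted_null = try_insert("C", None)        # check expression is unknown
```

The row with the NULL value is accepted because comparing with NULL yields unknown, and unknown satisfies the constraint. Defining the column with NOT NULL, as on the visual, closes that gap.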

Triggers
A trigger is a set of actions to be performed when a specific event occurs.

• Triggering Operations
  - INSERT
  - DELETE
  - UPDATE (any columns or selected columns)
• Granularity
  - For each row processed
  - Once for SQL statement
• Activation Time
  - Before change applied
  - After change applied
• Prerequisite Conditions
  - Triggered actions applied conditionally
  - Determined by search condition
  - WHEN clause
• Triggered Actions (a set of SQL statements)
  - Before triggers: Fullselect, SIGNAL SQLSTATE, SET transition-variable
  - After triggers: Fullselect, INSERT, DELETE, UPDATE, SIGNAL SQLSTATE

Figure 7-29. Triggers CF182.0

Notes:
A trigger defines a set of actions to be performed when a specific event occurs. Triggers are
defined for tables. The execution of the actions for the trigger can be triggered by insert,
update, or delete operations on the table for the trigger.
Triggers can be used to cause updates to other tables; automatically generate or transform
values for inserted or updated rows; or invoke functions to perform tasks such as issuing
alerts.
Triggers are a useful mechanism to define and enforce transitional business rules, i.e.,
rules involving different states of the data. Using triggers places the logic to enforce the
business rules in the database and relieves the business processes using the tables from
having to enforce it. Centralized logic means easier maintenance since no program
changes are required when the logic changes.
The following items must be considered when defining a trigger:

Triggering Operations


Triggers are defined for tables. When defining a trigger, you must specify the operation to
which the trigger applies, i.e., which will cause the actions for the trigger to be executed.
The operation can be:
• an insert operation (INSERT)
• a delete operation (DELETE)
• an update operation (UPDATE)
For update operations, the trigger can apply to the updating of selected columns or the
updating of arbitrary columns of the table.
Granularity
The trigger can be executed for each row inserted, updated, or deleted or once for the
INSERT, UPDATE, or DELETE statement. This is referred to as the granularity of the
trigger.
Activation Time
Triggers can be executed before the changes of the triggering operation are applied or after
they have been applied. Depending on the time when they are applied, triggers are
classified as before triggers or after triggers.
Before triggers can be used to set or change the values for insert or update operations. An
after trigger can be used, for example, to reflect changes to the table for the trigger in
another table. For example, as rows are added to or deleted from the table for the trigger, a
row count in another table can be increased or decreased.
Prerequisite Conditions
The execution of the actions for a trigger can be made conditional: The actions are only
performed if a specified prerequisite condition is met. The prerequisite condition, a search
condition, is specified by means of a WHEN clause. The actions for the trigger are only
executed if the search condition evaluates to true.
Triggered Actions
The actions for a trigger consist of one or more SQL statements. If a prerequisite condition
has been specified, they are only executed if the search condition for the trigger evaluates
to true. The SQL statements that can be part of the actions depend on the type of trigger.
If the trigger is a before trigger, the actions can generally include fullselects, signal SQL
states, or set transition variables. Transition variables allow you to refer to values of the
rows affected by the trigger.
If the trigger is an after trigger, the triggered actions can generally include fullselects,
INSERT, DELETE, or UPDATE statements, or signal SQL states.
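The row count example mentioned under Activation Time can be sketched with two after triggers. The sketch uses SQLite through Python's sqlite3 module as a stand-in for the target database management system, so the trigger syntax differs in details from DB2. The tables flight and flight_stats, the trigger names, and the flight numbers are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE flight (flight_number TEXT PRIMARY KEY);
    CREATE TABLE flight_stats (row_count INTEGER NOT NULL);
    INSERT INTO flight_stats VALUES (0);

    -- After trigger, executed for each row inserted.
    CREATE TRIGGER flight_added AFTER INSERT ON flight
    FOR EACH ROW
    BEGIN
        UPDATE flight_stats SET row_count = row_count + 1;
    END;

    -- After trigger, executed for each row deleted.
    CREATE TRIGGER flight_removed AFTER DELETE ON flight
    FOR EACH ROW
    BEGIN
        UPDATE flight_stats SET row_count = row_count - 1;
    END;
    """
)

conn.executemany(
    "INSERT INTO flight VALUES (?)", [("LH400",), ("LH401",), ("LH402",)]
)
conn.execute("DELETE FROM flight WHERE flight_number = 'LH401'")

row_count = conn.execute("SELECT row_count FROM flight_stats").fetchone()[0]
```

The application only inserts and deletes flights; keeping the count in the second table current is left entirely to the triggers, which is the centralization of logic the notes describe.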

Triggers - Some Additional Remarks

• Triggers can refer to values before (UPDATE, DELETE) and after
  (UPDATE, INSERT) the execution of the triggering SQL operation
  - Must use REFERENCING clause to identify version (OLD or NEW)
• Triggers may change data before it is stored
  - By means of SET transition-variable SQL statement
• Triggers can use built-in functions and user defined functions
• After triggers can cause other triggers to fire
  - Triggers for tables used by triggered actions
• Multiple triggers can be defined for same event
  - Trigger created first, fires first
• Triggers not effective during loading of rows
• Some (minor) restrictions may apply
• Not all target database management systems support triggers

Figure 7-30. Triggers - Some Additional Remarks CF182.0

Notes:
Triggers can reference the values of the affected rows. They can refer to the values before
the execution (update or delete operations) and/or after the execution (update or insert
operations) of the triggering SQL operation. The appropriate version of the data (OLD or
NEW) can be identified by means of the REFERENCING clause when defining the trigger.
As mentioned for the previous visual, before triggers can change the values of columns of
the affected rows. They can do this by setting transition variables via the SET transition-
variable SQL statement. Transition variables use the names of the columns, qualified by a
correlation name assigned to the version of the data via the REFERENCING clause. The
SET transition-variable SQL statement is also referred to as SET assignment SQL
statement.
In contrast to check expressions which, most of the time, are more restrictive, triggers can
generally use built-in functions and user defined functions. The functions can be used by
the search condition of the WHEN clause as well as by the triggered actions.

The actions of after triggers can cause other triggers to fire, namely, triggers for the tables
maintained by the triggered actions. Since INSERT, UPDATE, and DELETE statements are
not permitted for before triggers, they cannot cause other triggers to fire.
Multiple triggers can be defined for the same table. You can even define multiple triggers for
the same event. If multiple triggers are defined for the same event, the trigger created first
fires first.
Triggers have one drawback: they are not activated while rows are being loaded into a
table, so loaded data bypasses them.
Not all of the target database management systems support triggers. The various database
management systems supporting triggers may have restrictions. However, in general, the
restrictions are minor and less severe than the restrictions for check constraints.

A Sample Abstract Data Type - Name Data


Signature:  NAMEDATA( [ minimum-length ] [ , maximum-length ] )
Values:     Any string of letters, blanks, and single dashes (-) or periods (.).
            Minimum-length and maximum-length specify how many characters
            the string has at least (default: 1) and at most (default: unlimited).
Operations:
   Normalize Name Data
      NORM(name-data-1) -> name-data-2
      • Removes all leading and trailing blanks from name-data-1
      • Reduces intermediate groups of blanks for name-data-1 to a single blank each
      • Uppercases all letters of name-data-1
   Equal Comparison
      EQUAL(name-data-1, name-data-2) -> { TRUE | FALSE }
      • Normalizes name-data-1 and name-data-2 and compares them character by
        character
      • Result is TRUE if all characters are equal; FALSE otherwise

In the database, the data are to be stored normalized

Figure 7-31. A Sample Abstract Data Type - Name Data CF182.0

Notes:
Now, we want to illustrate the implementation of a sample abstract data type. We have
chosen abstract data type Name Data described in Unit 5 - Data and Process Inventories.
Its description is repeated on the visual. Its values consist of strings of letters, blanks, and
single dashes (-) or periods (.).
There are two operations defined for the abstract data type. The Normalization operation
(NORM) removes all leading and trailing blanks from a name-data string; reduces
intermediate groups of blanks to a single blank each; and uppercases all letters. In other
words, it produces a normalized version of the string.
The Equal Comparison operation (EQUAL) defines when two name-data strings are
considered equal. They are considered equal if their normalized versions are the same.
As you can see from the signature of the data type, it is parameterized. For a data element
using it, the minimum length and the maximum length of the accepted strings can be
specified.
In the database, we want to store all data in the normalized format.
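The two operations can be sketched in a few lines of code. The following Python fragment only illustrates the definitions above and is not part of the course material. The function names norm and equal are chosen for illustration, and the sketch assumes that only the blank character (not tabs or other white space) counts as a blank.

```python
import re


def norm(name_data: str) -> str:
    """Return the normalized version of a name-data string."""
    # Remove leading and trailing blanks (assumption: blanks only, no tabs).
    trimmed = name_data.strip(" ")
    # Reduce each intermediate group of blanks to a single blank.
    collapsed = re.sub(r" {2,}", " ", trimmed)
    # Uppercase all letters.
    return collapsed.upper()


def equal(name_data_1: str, name_data_2: str) -> bool:
    """Two name-data strings are equal if their normalized versions match."""
    return norm(name_data_1) == norm(name_data_2)


print(norm("  wright   bros. "))                  # WRIGHT BROS.
print(equal("Wright Bros.", " wright  bros. "))   # True
```

Because equality is defined on the normalized versions, storing only normalized strings means a plain character-by-character comparison of stored values already gives the EQUAL result.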

Setting Up the Abstract Data Type

CREATE DISTINCT TYPE NAMEDATA
   AS VARCHAR(100)
   WITH COMPARISONS

• Absolute maximum length allowed for columns using the data type
• Columns will use smaller maximum lengths
• Length ranges for columns will be limited by other means

CREATE FUNCTION NORM(NAMEDATA)
   RETURNS NAMEDATA
   EXTERNAL NAME 'program'
   LANGUAGE programming-language
   ...

• Normalization function
• Checks name data string for valid name data
• Returns nonzero SQL state if not valid name data
• Returns zero SQL state and normalized name data string otherwise

CREATE FUNCTION LENGTH(NAMEDATA)
   RETURNS INTEGER
   SOURCE SYSIBM.LENGTH(VARCHAR())

• Extends LENGTH built-in function to name data
• Required for enforcing length ranges for columns

Figure 7-32. Setting Up the Abstract Data Type CF182.0

Notes:
The approach chosen for the implementation of the abstract data type uses a user defined
distinct type for the abstract data type because we want to discuss some related problems.
It prevents the comparison of character strings that are not name data with name data. It
would be possible to implement the abstract data type without a user defined distinct type
which has some advantages, but also some disadvantages.
First, we define a user defined distinct type called NAMEDATA consisting of varying-length
character strings. When defining the user defined distinct type, you must provide a
maximum length for the source data type. Since the abstract data type is parameterized,
we need to specify the maximum length that any columns using it may have. However, you
must choose the maximum length carefully to ensure that the rows for the tables will fit into
the pages for the tables. The system will enforce this when the tables are created. Thus,
the lengths of the candidate columns should not vary too much.
In the example, we have restricted the maximum length of NAMEDATA columns to 100
characters. The columns using the data type may use smaller maximum lengths. We must
enforce the length ranges for the columns by other means. We will see later on how this
can be achieved.
Next, we define a user defined function, called NORM, corresponding to the Normalization
function. However, it is not quite the Normalization function since it performs some
additional validity checking. The function is an external scalar function accepting strings of
user defined distinct type NAMEDATA. It checks if the input string has a valid name-data
format, i.e., only contains letters (small or capital), blanks, and single dashes or periods. If
the string is invalid, a nonzero SQL state is returned by the function.
If the input string is valid, the function returns a zero SQL state and converts the input string
to its normalized name-data format. The data type for the output is NAMEDATA, the user
defined distinct type. Note that user defined functions must return an SQL state in
addition to the output described in their definition. The target database management
system checks the SQL state to determine whether to continue or terminate the operation
being performed.
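The logic of such an external function might look as follows. This is a minimal Python sketch under stated assumptions, not the program behind the visual: the exception class InvalidNameData is hypothetical and merely stands in for the nonzero SQL state, "single" dashes or periods is taken to mean that "--" and ".." are rejected, and any character for which Python's isalpha() is true counts as a letter.

```python
class InvalidNameData(ValueError):
    """Hypothetical stand-in for the nonzero SQL state returned by NORM."""


def norm_checked(value: str) -> str:
    """Check a string for valid name-data format, then normalize it."""
    for ch in value:
        # Only letters, blanks, dashes, and periods are allowed.
        if not (ch.isalpha() or ch in " -."):
            raise InvalidNameData(f"character not allowed: {ch!r}")
    # Assumption: dashes and periods must not be doubled.
    if "--" in value or ".." in value:
        raise InvalidNameData("dashes and periods must be single")
    # Splitting on blanks and dropping empty parts trims the string and
    # reduces each group of blanks to a single blank.
    return " ".join(part for part in value.split(" ") if part).upper()


print(norm_checked(" wright  bros. "))   # WRIGHT BROS.
try:
    norm_checked("wright bros..")
except InvalidNameData as err:
    print("rejected:", err)
```

Raising an exception plays the role of returning a nonzero SQL state: the caller cannot continue with a result and the surrounding operation is terminated.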
The function does not directly accept varying-length character strings that are not of
type NAMEDATA. If you want to use it to convert other character strings to normalized
name data, you must first apply the system-provided cast function for user defined distinct
type NAMEDATA:
NORM(NAMEDATA(character-string))
On page 7-55, we defined a user defined function NORM whose only input parameter was
of type TEXTDATA, another user defined distinct type. Note that both user defined
functions may exist at the same time because their signatures are different.
Since the enforcement of the length ranges for the columns needs to determine the length
of input data, we must extend the LENGTH built-in function to user defined distinct type
NAMEDATA. This is done by the second user defined function on the visual, a sourced
scalar function.

INSERT Triggers for Abstract Data Type

Trigger 1: Checks for correct data and normalizes input string

   CREATE TRIGGER INSNAME1
      NO CASCADE BEFORE
      INSERT ON table-name
      REFERENCING NEW AS N
      FOR EACH ROW
      MODE DB2SQL
      BEGIN ATOMIC
         SET N.column-name = NORM(N.column-name);
      END

Trigger 2: Checks for correct column length and sets SQL state

   CREATE TRIGGER INSNAME2
      NO CASCADE BEFORE
      INSERT ON table-name
      REFERENCING NEW AS N
      FOR EACH ROW
      MODE DB2SQL
      WHEN ( LENGTH(N.column-name)
             NOT BETWEEN minimum-length AND maximum-length )
      BEGIN ATOMIC
         SIGNAL SQLSTATE '72001' ('INVALID COLUMN LENGTH');
      END

Figure 7-33. INSERT Triggers for Abstract Data Type CF182.0

Notes:
By means of the user defined functions on the previous visual, we can enforce that:
• The data of the column is always valid, i.e., only contains characters and character
sequences permitted for the abstract data type.
• The data of the column is stored in normalized format: leading and trailing blanks are
removed, intermediate blanks are reduced to a single blank each, and alphabetical
characters are in upper case.
• The minimum length and the maximum length for the column are observed.
For insert operations, this can be achieved by the two triggers on this visual. The triggers
are defined for each table containing name-data columns. The table must be created
before the triggers for the table can be defined.
Symbolic variables (in italics) are used in the CREATE TRIGGER statements on the visual.
If you want to create the triggers, you must replace them by the actually applicable values.
Table-name, column-name, minimum-length, and maximum-length must be replaced by the
name of the table, the name of the column, the minimum length for the column, and the
maximum length, respectively.
Both triggers are activated for each row to be inserted. They are activated before the row is
inserted. The first trigger (INSNAME1) uses user defined function NORM, in a SET
transition-variable SQL statement, to verify the correctness of the input string and to
normalize it. You need the correlation name defined via the REFERENCING clause on both
sides of the equal sign. On the right-hand side, you need it for the user defined function to
refer to the entered value for the new row. On the left-hand side, you need it because you
are changing the column value for the new row.
If the user defined function returns a zero SQL state, the value of the column for the row
becomes the normalized string and this will be the value inserted. If the user defined
function returns a nonzero SQL state, the SET transition-variable SQL statement fails and
the INSERT statement fails.
Note that the input string for the column is of type NAMEDATA when the trigger receives it.
A character string entered as input for the column in the INSERT statement is converted to
type NAMEDATA by the cast function for the user defined distinct type.
The second trigger (INSNAME2) ensures that the length range for the column is enforced
during insert operations. The WHEN clause checks if the length of the column is outside
the range defined by minimum-length and maximum-length. If it is outside, the WHEN
condition evaluates to true and a nonzero SQL state is signaled by means of the SIGNAL
SQLSTATE SQL statement. The nonzero SQL state causes the INSERT statement to
terminate.
The sequence in which the triggers are defined is relevant. The triggers must be created in
the sequence on the visual. As a consequence, the length check is performed for the
normalized string (which may be shorter) and not for the original input string.
You may ask whether a check constraint for the column could be used instead of the
second trigger. The restrictions for check constraints are generally more severe than those
for triggers and your database management system may not allow you to use an equivalent
check constraint. For example, Version 6 of DB2 Universal Database for z/OS does not
allow you to use built-in functions or user defined functions in check expressions. In
addition, the result would not be quite the same. A check expression would verify the length
of the unnormalized string whereas the trigger verifies the length of the normalized string.
If you have multiple name-data columns for a table, you need not have two triggers for each
column. In the first trigger, you can use multiple SET transition-variable SQL statements as
triggered actions to check and normalize all columns. In the second trigger, you can
combine the length checks for all columns by logical ORs.
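Why the creation sequence matters can be illustrated with a small simulation. The Python sketch below does not run against a database: the functions insname1 and insname2 only mimic the two triggers, SqlStateError is a hypothetical stand-in for a signaled SQL state, and the column name "name" and the length range 2 to 12 are made up for the example.

```python
class SqlStateError(Exception):
    """Hypothetical stand-in for a nonzero SQL state ending the statement."""

    def __init__(self, sqlstate: str, message: str) -> None:
        super().__init__(f"SQLSTATE {sqlstate}: {message}")
        self.sqlstate = sqlstate


def insname1(row: dict) -> None:
    # Mimics: SET N.column-name = NORM(N.column-name)
    # (the validity checks of NORM are omitted for brevity)
    row["name"] = " ".join(p for p in row["name"].split(" ") if p).upper()


def make_insname2(minimum: int, maximum: int):
    def insname2(row: dict) -> None:
        # Mimics: WHEN ( LENGTH(...) NOT BETWEEN min AND max ) SIGNAL SQLSTATE
        if not (minimum <= len(row["name"]) <= maximum):
            raise SqlStateError("72001", "INVALID COLUMN LENGTH")
    return insname2


def insert(table: list, row: dict, triggers: list) -> None:
    new = dict(row)            # transition variables for the new row
    for trigger in triggers:   # the trigger created first fires first
        trigger(new)
    table.append(new)          # reached only if no trigger raised


table = []
triggers = [insname1, make_insname2(2, 12)]
raw = {"name": "   wright   bros.   "}   # 20 characters before normalization

insert(table, raw, triggers)
print(table)   # [{'name': 'WRIGHT BROS.'}]

try:
    insert([], raw, list(reversed(triggers)))   # length check fires first
except SqlStateError as err:
    print(err)   # SQLSTATE 72001: INVALID COLUMN LENGTH
```

With the sequence from the visual, the length check sees the 12-character normalized string and the row is accepted. With the sequence reversed, the same input is rejected because the check sees the 20-character original.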

UPDATE Triggers for Abstract Data Type

Trigger 1: Checks for correct data and normalizes input string

   CREATE TRIGGER UPDNAME1
      NO CASCADE BEFORE
      UPDATE OF column-name ON table-name
      REFERENCING NEW AS N
      FOR EACH ROW
      MODE DB2SQL
      BEGIN ATOMIC
         SET N.column-name = NORM(N.column-name);
      END

Trigger 2: Checks for correct column length and sets SQL state

   CREATE TRIGGER UPDNAME2
      NO CASCADE BEFORE
      UPDATE OF column-name ON table-name
      REFERENCING NEW AS N
      FOR EACH ROW
      MODE DB2SQL
      WHEN ( LENGTH(N.column-name)
             NOT BETWEEN minimum-length AND maximum-length )
      BEGIN ATOMIC
         SIGNAL SQLSTATE '72001' ('INVALID COLUMN LENGTH');
      END

Figure 7-34. UPDATE Triggers for Abstract Data Type CF182.0

Notes:
This visual illustrates the triggers needed for update operations to ensure the correctness
of the new column values; to ensure the observance of length constraints for the column;
and to store the new column values in normalized format.
Both triggers are activated for each row before the row is updated. They are only activated
if the appropriate column is updated (UPDATE OF column-name). Otherwise, the same
remarks apply as to the triggers for insert operations.

Abstract Data Type - Inserting and Updating


Column defined as NAMEDATA

INSERT INTO table-name ( . . . , column-name , . . . )
   VALUES( . . . , 'wright bros.' , . . . )

   'wright bros.'
      -> System-provided cast function for NAMEDATA
      -> User defined function NORM in first trigger
      -> 'WRIGHT BROS.'
      -> Length checks by second trigger

UPDATE table-name
   SET column-name = 'wright bros..'

   'wright bros..'
      -> System-provided cast function for NAMEDATA
      -> User defined function NORM in first trigger
      -> Nonzero SQL State

Casting since assignments of values

Figure 7-35. Abstract Data Type - Inserting and Updating CF182.0

Notes:
The above visual illustrates the flow of control and the conversions of input during insert
and update operations.
If a character string is assigned to a field defined as NAMEDATA, the system-provided cast
function for the user defined distinct type is automatically invoked. Thus, you need not
invoke it yourself. It casts the character string to user defined distinct type NAMEDATA, the
data type of the input parameter for user defined function NORM.
Next, the first (insert or update) trigger is activated which uses user defined function NORM
to normalize the value and assigns the normalized value to the column for the row. If the
value passes the length checks of the second trigger, the row is stored with the normalized
value for the column.
In the first example, an insert request, string 'wright bros.' is converted to
'WRIGHT BROS.' which then is assigned to the column and stored since it passes the
length checks.

The second example on the visual illustrates, for an update request, what happens if the
new value for a column, defined as NAMEDATA, is invalid. User defined function NORM,
called by the SET transition-variable SQL statement of the triggered action, determines that
the input string does not have a valid name-data format (two successive periods). It returns
a nonzero SQL state which is passed on by the trigger and causes the update request to
fail.

Abstract Data Type - Selecting Data

SELECT . . .
FROM table-name
WHERE column-name = NORM ( NAMEDATA ( string ) )

• NORM ( . . . ): Normalizes string and allows it to be compared with values in column
• NAMEDATA ( string ): Casts input string to NAMEDATA and allows it to be input for
  NORM function

Figure 7-36. Abstract Data Type - Selecting Data CF182.0

Notes:
To retrieve specific rows based on a search condition for a column defined as NAMEDATA,
you must use both the system-provided cast function and user defined function NORM.
As described for user defined distinct types, you cannot directly compare values of a
column of a user defined distinct type with values of the source type. Accordingly, you
cannot directly compare the values of a column of user defined distinct type NAMEDATA
with a character string. You must first convert the character string to user defined distinct
type NAMEDATA. Furthermore, the input string should be normalized to ensure that the
corresponding rows are found in the table independent of the way they have been entered.
Both are achieved by first applying system-provided cast function NAMEDATA to the string
and then user defined function NORM:
NORM(NAMEDATA(string))
System-provided cast function NAMEDATA casts the string to user defined distinct type
NAMEDATA. Only then can user defined function NORM be applied since its input must be
of type NAMEDATA. You cannot apply user defined function NORM directly to the
character string. Since the output of NORM is of type NAMEDATA, it can be compared with
the values of the column.
To avoid the invocation of two functions, you could define an additional user defined
function whose input parameter is of type VARCHAR(); whose output is of type
NAMEDATA; and which normalizes the input string.
Note that you will receive an operands-not-comparable SQL code if you compare the
character string directly with the values of the column.
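The strong typing described here can be mimicked outside the database. The following Python sketch is only an analogy: class NameData models the distinct type, its constructor plays the role of the system-provided cast function, and the TypeError stands in for the operands-not-comparable SQL code. The sample values are invented.

```python
class NameData:
    """Models distinct type NAMEDATA; the constructor acts as cast function."""

    def __init__(self, value: str) -> None:
        self.value = value

    def __eq__(self, other):
        # A distinct type cannot be compared with its source type.
        if not isinstance(other, NameData):
            raise TypeError(
                "operands not comparable: NameData and " + type(other).__name__
            )
        return self.value == other.value


def norm(name: NameData) -> NameData:
    """Accepts and returns NameData only, like NORM(NAMEDATA)."""
    if not isinstance(name, NameData):
        raise TypeError("NORM expects NameData")
    parts = [p for p in name.value.split(" ") if p]
    return NameData(" ".join(parts).upper())


# Stored column values are already normalized.
column_values = [NameData("WRIGHT BROS."), NameData("LILIENTHAL")]

# WHERE column-name = NORM ( NAMEDATA ( string ) )
search = norm(NameData(" wright  bros. "))
matches = [v.value for v in column_values if v == search]
print(matches)   # ['WRIGHT BROS.']

try:
    column_values[0] == "WRIGHT BROS."   # comparing directly with a string
except TypeError as err:
    print(err)
```

A helper that accepts a plain string and returns normalized NameData would correspond to the additional user defined function suggested above for avoiding two function invocations.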

An Alternate Implementation (1 of 2)

• No user defined distinct type
• Name-data columns defined as VARCHAR() with actual maximum lengths for columns
• User defined function NAMEDATA verifies correctness of input and normalizes it.
  Input and output are VARCHAR()

   CREATE FUNCTION NAMEDATA(VARCHAR(100))
      RETURNS VARCHAR(100)
      EXTERNAL NAME 'program'
      LANGUAGE programming-language
      ...

   VARCHAR(100): Maximum length of any name-data columns intended

• No sourced function LENGTH needed. Built-in function LENGTH can be used since a
  user defined distinct type is not used

Figure 7-37. An Alternate Implementation (1 of 2) CF182.0

Notes:
This visual and the next illustrate an alternate implementation for abstract data type
NAMEDATA. The implementation does not use a user defined distinct type. The name-data
columns for a table are defined with built-in data type VARCHAR(). As length of the
column, the actual maximum length for the column is chosen.
As before, we need a user defined function checking the correctness of input for the
columns and normalizing the input strings. This time, we call the function NAMEDATA. The
data type for its only parameter as well as for its output is VARCHAR(). As length, we use
the maximum length for any anticipated name-data column. This allows us to use the same
function for all columns.
Because we do not use a user defined distinct type, we need not define a sourced user
defined function LENGTH. For determining the length of strings, we can use the LENGTH
built-in function.

An Alternate Implementation (2 of 2)
• Triggers use function NAMEDATA and need only check for minimum length

   Trigger 1: Checks for correct data and normalizes input string

      CREATE TRIGGER INSNAME1
         NO CASCADE BEFORE INSERT ON table-name
         REFERENCING NEW AS N
         FOR EACH ROW MODE DB2SQL
         BEGIN ATOMIC
            SET N.column-name = NAMEDATA(N.column-name);
         END

   Trigger 2: Checks for correct column length and sets SQL state

      CREATE TRIGGER INSNAME2
         NO CASCADE BEFORE INSERT ON table-name
         REFERENCING NEW AS N
         FOR EACH ROW MODE DB2SQL
         WHEN ( LENGTH(N.column-name) < minimum-length )
         BEGIN ATOMIC
            SIGNAL SQLSTATE '72001' ('INVALID COLUMN LENGTH');
         END

• Similar triggers for UPDATE
• On SELECT, use user defined function NAMEDATA to normalize input

      SELECT . . .
         FROM table-name
         WHERE column-name = NAMEDATA ( string )

Figure 7-38. An Alternate Implementation (2 of 2) CF182.0

Notes:
The triggers needed are basically the same as for the other solution. The only differences
are:
• User defined function NAMEDATA is used instead of user defined function NORM.
• The second trigger only needs to check the minimum length. The maximum length is
enforced by the column length.
Again, you need two triggers each for insert and update operations.
On SELECT statements, you use the NAMEDATA function to normalize the search string.
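The alternate implementation can be tried out in miniature with Python's built-in sqlite3 module. SQLite serves only as a convenient stand-in for the target database management system, and it differs in one important respect: its before triggers cannot assign to the new row, so the sketch calls NAMEDATA directly in the INSERT statement instead of in a first trigger. Table manufacturer, the minimum length of 2, and the sample data are hypothetical.

```python
import sqlite3


def namedata(value: str) -> str:
    """Normalize a name-data string (validity checks omitted for brevity)."""
    return " ".join(part for part in value.split(" ") if part).upper()


conn = sqlite3.connect(":memory:")
# Register the Python function as user defined SQL function NAMEDATA.
conn.create_function("NAMEDATA", 1, namedata)

# SQLite does not enforce the VARCHAR length; DB2 would.
conn.execute("CREATE TABLE manufacturer (name VARCHAR(40) NOT NULL)")

# Counterpart of the second trigger: only the minimum length is checked.
conn.execute(
    """
    CREATE TRIGGER insname2 BEFORE INSERT ON manufacturer
    FOR EACH ROW WHEN length(NEW.name) < 2
    BEGIN
        SELECT RAISE(ABORT, 'INVALID COLUMN LENGTH');
    END
    """
)

conn.execute(
    "INSERT INTO manufacturer (name) VALUES (NAMEDATA(?))", ("wright bros.",)
)

# On SELECT, the search string is normalized by the same function.
rows = conn.execute(
    "SELECT name FROM manufacturer WHERE name = NAMEDATA(?)",
    ("  Wright   Bros. ",),
).fetchall()
print(rows)   # [('WRIGHT BROS.',)]

try:
    conn.execute("INSERT INTO manufacturer (name) VALUES (NAMEDATA(?))", ("x",))
except sqlite3.DatabaseError as err:
    print("rejected:", err)
```

Since the column is an ordinary character column, no cast is needed in the WHERE clause; the single function call normalizes the search string.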

Token Translation Tables


• To save space, frequently, tokens are stored instead of actual values
• Descriptions for tokens are kept in separate tables
   - Token translation tables
• Requires Join operations to display actual values
   - May create a performance problem
• Not recommendable if system supports compression

SEAT (with actual values)

Aircraft_Number  Seat_Number  Seat_Location  Seat_Class  Section
B474001323       1A           WINDOW         FIRST       N/SMOKING
B474001323       1B           MIDDLE         FIRST       N/SMOKING
B474001323       1C           AISLE          FIRST       N/SMOKING
...              ...          ...            ...         ...
B474001323       46J          WINDOW         ECONOMY     SMOKING
B171004217       1A           WINDOW         BUSINESS    N/SMOKING
B171004217       1B           AISLE          BUSINESS    N/SMOKING
...              ...          ...            ...         ...
B171004217       28G          WINDOW         ECONOMY     N/SMOKING

SEAT (with tokens)

Aircraft_Number  Seat_Number  Seat_Location  Seat_Class  Section
B474001323       1A           1              1           N
B474001323       1B           2              1           N
B474001323       1C           3              1           N
...              ...          ...            ...         ...
B474001323       46J          1              3           S
B171004217       1A           1              2           N
B171004217       1B           3              2           N
...              ...          ...            ...         ...
B171004217       28G          1              3           N

SEAT LOCATION                 SEAT CLASS                  SECTION

Seat_Location  Text           Seat_Class  Text            Section  Text
1              WINDOW         1           FIRST           N        N/SMOKING
2              MIDDLE         2           BUSINESS        S        SMOKING
3              AISLE          3           ECONOMY

Figure 7-39. Token Translation Tables CF182.0

Notes:
Frequently, the columns of tables contain a well-defined, previously known small set of
values. For table SEAT on the top of the visual, this is the case for columns
SEAT_LOCATION, SEAT_CLASS, and SECTION.
To save space, shorter tokens (often numbers) are frequently stored in the table
instead of the lengthy actual values. Descriptions for the tokens are kept in separate tables
as illustrated in the lower part of the visual. The descriptive tables are referred to as token
translation tables.
To display the rows of the main table with the actual values and not with the tokens, you
need Join operations to fill in the actual values. Even though the token translation tables
are small compared to the main table and their rows will probably be in the buffers of the
database management system, the Join operations may create a performance problem. In
addition, the Join operations will complicate the retrieval of the rows. Furthermore, the
number of tables that can be joined is generally limited.
The use of token translation tables is not advisable if compression is used since, in this
case, the space savings do not warrant the performance degradation and the extra effort.
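The Join operations required can be made concrete with a small runnable example. The sketch below uses Python's built-in sqlite3 module as a stand-in for the target database management system. The lowercase table names and the text column names (location_text, class_text, section_text) are chosen for this illustration; the rows are a subset of those on the visual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Token translation tables
    CREATE TABLE seat_location (seat_location INTEGER PRIMARY KEY,
                                location_text TEXT NOT NULL);
    CREATE TABLE seat_class    (seat_class INTEGER PRIMARY KEY,
                                class_text TEXT NOT NULL);
    CREATE TABLE section       (section TEXT PRIMARY KEY,
                                section_text TEXT NOT NULL);

    -- Main table storing tokens instead of actual values
    CREATE TABLE seat (
        aircraft_number TEXT    NOT NULL,
        seat_number     TEXT    NOT NULL,
        seat_location   INTEGER NOT NULL REFERENCES seat_location (seat_location),
        seat_class      INTEGER NOT NULL REFERENCES seat_class (seat_class),
        section         TEXT    NOT NULL REFERENCES section (section)
    );

    INSERT INTO seat_location VALUES (1, 'WINDOW'), (2, 'MIDDLE'), (3, 'AISLE');
    INSERT INTO seat_class    VALUES (1, 'FIRST'), (2, 'BUSINESS'), (3, 'ECONOMY');
    INSERT INTO section       VALUES ('N', 'N/SMOKING'), ('S', 'SMOKING');
    INSERT INTO seat VALUES
        ('B474001323', '1A',  1, 1, 'N'),
        ('B474001323', '46J', 1, 3, 'S'),
        ('B171004217', '1B',  3, 2, 'N');
    """
)

# Three joins are needed just to display one table with its actual values.
rows = conn.execute(
    """
    SELECT s.aircraft_number, s.seat_number,
           l.location_text, c.class_text, x.section_text
    FROM seat s
    JOIN seat_location l ON l.seat_location = s.seat_location
    JOIN seat_class    c ON c.seat_class    = s.seat_class
    JOIN section       x ON x.section       = s.section
    ORDER BY s.aircraft_number, s.seat_number
    """
).fetchall()

for row in rows:
    print(row)
```

Every additional tokenized column adds another table to the FROM clause, which is how the limit on the number of joined tables mentioned above can come into play.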

Token Translation Tables - An Alternative

CREATE TABLE SEAT
   (
   ...
   Seat_Class CHARACTER(1) NOT NULL
      CONSTRAINT CLASS
      CHECK( Seat_Class IN ( '1', '2', '3' ) ),
   ...
   )

Ensures correctness of values

SELECT . . . , CASE Seat_Class
                  WHEN '1' THEN 'FIRST'
                  WHEN '2' THEN 'BUSINESS'
                  WHEN '3' THEN 'ECONOMY'
               END AS Seat_Class, . . .
   FROM SEAT

Makes actual values available

Figure 7-40. Token Translation Tables - An Alternative CF182.0

Notes:
Instead of token translation tables, you can use check constraints in conjunction with CASE
expressions to achieve the same space savings without the problems of Joins.
For a column concerned, you can provide a check expression using the IN predicate to list
all allowed tokens. Using a check expression ensures that only correct values are in the
columns.
On retrieval, you use a CASE expression when selecting the column. The CASE
expression allows you to translate the tokens into the actual values that should be returned.
This method has one disadvantage you should be aware of: when new values are
added, you must change the SELECT statements. If they are contained in views, you must
drop and re-create the views. As a consequence, the authorizations for the views are lost
and must be reestablished. The check constraints do not pose a problem because they can
be dropped and added again without impact.
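The combination of check constraint and CASE expression can be tried out as follows. The sketch again uses Python's built-in sqlite3 module as a stand-in for the target database management system; the reduced column list of table seat and the sample rows are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE seat (
        aircraft_number CHARACTER(10) NOT NULL,
        seat_number     CHARACTER(3)  NOT NULL,
        seat_class      CHARACTER(1)  NOT NULL
            CONSTRAINT class CHECK (seat_class IN ('1', '2', '3'))
    )
    """
)
conn.executemany(
    "INSERT INTO seat VALUES (?, ?, ?)",
    [("B474001323", "1A", "1"), ("B474001323", "46J", "3")],
)

# The check constraint rejects tokens that are not listed.
try:
    conn.execute("INSERT INTO seat VALUES ('B474001323', '2A', '4')")
except sqlite3.IntegrityError as err:
    print("rejected:", err)

# The CASE expression translates tokens into actual values on retrieval.
rows = conn.execute(
    """
    SELECT seat_number,
           CASE seat_class
               WHEN '1' THEN 'FIRST'
               WHEN '2' THEN 'BUSINESS'
               WHEN '3' THEN 'ECONOMY'
           END AS seat_class
    FROM seat
    ORDER BY seat_number
    """
).fetchall()
print(rows)   # [('1A', 'FIRST'), ('46J', 'ECONOMY')]
```

Note that the list of tokens appears twice, once in the check constraint and once in the CASE expression. This duplication is the source of the maintenance effort described above when new values are added.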

7.3 Documentation

Documenting User Defined Distinct Types

For each user defined distinct type:

Name:         A unique name for the distinct type in compliance with
              the naming requirements of the target DBMS
Source Type:  Built-in data type on which the user defined distinct type
              is based including any lengths and decimal places
              • For fixed-length string data types, the length of the strings
              • For varying-length string data types, the maximum length
                for the distinct data type without considerations for columns
              • For decimal data types, number of digits and number of
                decimal places
Description:  A description for which type of data (columns) the
              distinct type should be used.

Figure 7-41. Documenting User Defined Distinct Types CF182.0

Notes:
For user defined distinct types, you just need to provide their name and source data type
and a description for which types of data (columns) they should be used. For the source
data type, the (maximum) length, the number of digits, and/or the number of decimal places
must be provided in accordance with the requirements for the source data type.
To use a user defined distinct type with a varying-length source data type for multiple
columns, you must specify the maximum length of any columns using it.

Documenting User Defined Functions (1 of 2)
For each user defined function:

Name:       The name for the user defined function in compliance
            with the naming requirements of the target DBMS
Signature:  Signature of function in the form
               name ( parameter-1, parameter-2, ...)
            For each input parameter, specify its built-in or user-defined
            data type including length, number of digits, and/or number
            of decimal places
Output      For scalar or column functions:
Returned:   • The built-in or user-defined data type returned including
              length, number of digits, and/or decimal places
            • A textual description of the output
            For each column returned by a table function:
            • Its column name
            • The built-in or user-defined data type for the column
              including length, number of digits, and/or decimal places
            • A textual description of the column

Figure 7-42. Documenting User Defined Functions (1 of 2) CF182.0

Notes:
The documentation for a user defined function includes:
• Name, signature, and output returned by the user defined function.
• The category of the user defined function (scalar function, column function, or table
function).
• The type of the user defined function (external or sourced).
• A textual description of the user defined function.
• For an external function, name, location, and programming language for the object
program used by the user defined function.
• For a sourced user defined function, the built-in or user defined function on which the
user defined function is sourced including the appropriate parameters.
The items are described on this visual and the next. For the name, signature, and the
output returned, all relevant information is contained on the current visual.

Documenting User Defined Functions (2 of 2)

Category:      Category for function: scalar function, column function,
               or table function

Type:          Type of function: external function or sourced function

Description:   Textual description of function

Program        Name of object program supporting the function and
Source:        name of library containing it (external functions only)

Programming    Programming language of program supporting function
Language:      (external functions only)

Source         Built-in or user defined function the current function is
Function:      based upon including data types of parameters for the
               function (sourced functions only)

Figure 7-43. Documenting User Defined Functions (2 of 2) CF182.0

Notes:
The textual description should outline in detail what the function does. This is especially
important for external functions. The description should include any SQL states returned
and their meaning.
For the program source, the name and library for the object program (load module or DLL),
should be provided if they are already known. The object program is invoked by the user
defined function, not the source program. (Note that the title of the item is Program Source
and not Source Program.) At the time the function is documented, some of the information
for this item may not yet be available. However, you can already select a name for the
object program. The missing information must be provided later.

Documenting Check Constraints

For each check constraint:

Table:        Name of table to which constraint applies

Column:       If constraint applies to a particular column,
              name of column to which constraint applies

Constraint    Name for constraint (unique for table)
Name:

Description:  Textual description of condition to be
              checked by constraint

Check         Search condition for condition to be checked
Condition:    by constraint

Figure 7-44. Documenting Check Constraints CF182.0

Notes:
For check constraints, the following items need to be documented:
• The name of the table to which the constraint applies.
• If the constraint applies to a particular column, the name of the column to which it
applies.
This item is not applicable to check constraints defined on the table level.
• Although the database management systems do not generally force you to specify a
name for a check constraint, you should give a name to each check constraint. This
eases the maintenance of check constraints.
The names for check constraints need only be unique for each table. Nevertheless, it is
recommended that you use unique names for all check constraints of your application
domain.
• A detailed textual description outlining what the check constraint achieves.
• The search condition for the check constraint. When specifying the search condition for
the check constraint, verify with your database administrator that it can be implemented,
i.e., only uses functions supported by your database management system.
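
As a minimal sketch of how the documented items map to a definition, the following Python script uses the SQLite engine bundled with Python as a stand-in for the target database management system. The constraint names and the columns Minimum_Crew and Maximum_Crew are assumptions made for illustration only. The syntax accepted by your target system may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE AIRCRAFT_TYPE (
        Type_Code         CHAR(4)  NOT NULL PRIMARY KEY,
        -- Column-level check constraint: refers to its own column only.
        Number_of_Engines SMALLINT NOT NULL
            CONSTRAINT Engine_Count_Range
            CHECK (Number_of_Engines BETWEEN 0 AND 4),
        Minimum_Crew      SMALLINT NOT NULL,   -- hypothetical column
        Maximum_Crew      SMALLINT NOT NULL,   -- hypothetical column
        -- Table-level check constraint: involves more than one column.
        CONSTRAINT Crew_Range CHECK (Minimum_Crew <= Maximum_Crew)
    )
""")


def try_insert(row):
    """Return None if the row is accepted, else the error text of the DBMS."""
    try:
        conn.execute("INSERT INTO AIRCRAFT_TYPE VALUES (?, ?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


accepted = try_insert(("B747", 4, 2, 3))          # satisfies both constraints
too_many_engines = try_insert(("X001", 6, 2, 3))  # violates Engine_Count_Range
crew_mismatch = try_insert(("X002", 2, 5, 3))     # violates Crew_Range
```

Giving each constraint a name, as recommended above, makes it possible to refer to the constraint in the documentation and during later maintenance.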

Documenting Tables - Table Info (1 of 2)
For each table:

Table Name:         The name of the table in the target DBMS. The maximum
                    length for table names depends on the target DBMS
Table Long Name:    Optional. An additional long name for the table referred to
                    as label. Can be stored in system tables. Cannot be used
                    in SQL statements
Description:        Optional. A textual description for the table referred to as
                    comment. Can be stored in system tables
Primary Key:        Names and sequence of columns belonging to primary
                    key for table
Check Constraints:  Names of check constraints for table rather than for
                    individual columns

Figure 7-45. Documenting Tables - Table Info (1 of 2) CF182.0

Notes:
The information to be documented for a table can be subdivided into table-related
information and column-related information. The current visual and the next describe the
table-related information to be documented.
The long table name can be stored into the system tables for the target database
management system by means of the LABEL ON TABLE SQL statement if that is
supported by the target database management system. The description can be stored by
means of the COMMENT ON TABLE SQL statement if that is supported by the target
database management system.
If the primary key consists of multiple columns, it is important to establish and specify the
logical sequence of the columns within the primary key. This will become relevant when
talking about foreign keys in a later unit.
Under the heading Check Constraints, only constraints should be listed that are not column
specific. The column-specific check constraints are listed for the columns.
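
The distinction between the logical sequence of the primary key columns and the physical column order can be sketched as follows. The script uses the SQLite engine bundled with Python as a stand-in for the target database management system. The data types are assumptions, and PRAGMA table_info is specific to SQLite. LABEL ON and COMMENT ON are not shown because SQLite does not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The physical column order (Length_of_Model first) deliberately differs
# from the logical sequence of the primary key columns.
conn.execute("""
    CREATE TABLE AIRCRAFT_MODEL (
        Length_of_Model DECIMAL(5,2),
        Model_Number    CHAR(3) NOT NULL,
        Type_Code       CHAR(4) NOT NULL,
        PRIMARY KEY (Type_Code, Model_Number)
    )
""")

# Each result row is (cid, name, type, notnull, dflt_value, pk). The pk field
# is the 1-based position of the column within the primary key, or 0 if the
# column does not belong to the primary key.
columns = conn.execute("PRAGMA table_info(AIRCRAFT_MODEL)").fetchall()
key_columns = [c for c in columns if c[5] > 0]
key_sequence = [c[1] for c in sorted(key_columns, key=lambda c: c[5])]
```

Here key_sequence lists Type_Code before Model_Number although Model_Number comes first physically. This is the sequence to record under the heading Primary Key.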

Documenting Tables - Table Info (2 of 2)

Number of Rows:         Number of rows initially in table
Inserts/Time Interval:  Expected number of inserts during a time interval
                        (e.g., a month)
Insert Pattern:         Distribution of inserts over primary key values (e.g.,
                        equally distributed or ever increasing key values)
Updates/Time Interval:  Expected number of updates during a time interval
                        (e.g., a week)
Length Changes:         Percentage of updates causing length changes of rows
Deletes/Time Interval:  Expected number of deletions during a time interval
                        (e.g., a month)
Delete Pattern:         Distribution of deletions over primary key values
                        (e.g., equally distributed or lowest key values)

Figure 7-46. Documenting Tables - Table Info (2 of 2) CF182.0

Notes:
The items on this visual represent information the database administrator needs to know
for the assignment of primary and secondary allocation units and for scheduling
reorganizations.
Length changes for rows during updates may cause rows to be relocated. This may
lead to indirect accesses decreasing performance.
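
To illustrate how the documented volumes might be used, the following sketch projects the number of rows in a table. It is not a sizing method prescribed by the course. It assumes constant insert and delete rates that refer to the same time interval, and all figures are hypothetical.

```python
def projected_rows(initial_rows, inserts_per_interval, deletes_per_interval, intervals):
    """Estimate the row count after a number of equally long time intervals.

    Assumes constant insert and delete rates. Updates do not change the count.
    """
    net_growth = inserts_per_interval - deletes_per_interval
    return max(0, initial_rows + net_growth * intervals)


# Hypothetical figures: 10,000 initial rows, 500 inserts and 200 deletions
# per month, projected over 12 months.
after_one_year = projected_rows(10_000, 500, 200, 12)
```

A projection like this gives the database administrator a first indication of how much space the table needs beyond its initial allocation.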

Documenting Tables - Column Information
For each column of a table:

Column Name:        The name of the column in the target DBMS. Maximum
                    length for column name depends on target DBMS
Column Long Name:   Optional. An additional long name for the column referred
                    to as label. Can be stored in system tables. Cannot be
                    used in SQL statements
Description:        Optional. A textual description for the column referred to
                    as comment. Can be stored in system tables
Data Type:          Built-in or user-defined data type for column including
                    length, number of digits, and/or number of decimal places
Column Attributes:  Additional attributes for column such as nullable,
                    NOT NULL, WITH DEFAULT, and default values
Check Constraints:  Names of check constraints for column
                    (column-specific check constraints only)

Figure 7-47. Documenting Tables - Column Information CF182.0

Notes:
The long column name can be stored into the system tables for the target database
management system by means of the LABEL ON [COLUMN] SQL statement if supported
by the target database management system. The description can be stored by means of
the COMMENT ON [COLUMN] SQL statement if that is supported by the target database
management system.
Under the heading Check Constraints, constraints just involving the column are to be listed
and not table-level constraints, i.e., constraints that involve multiple columns.
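
The effect of the column attributes can be sketched as follows, using the SQLite engine bundled with Python as a stand-in for the target database management system. The Status column and its default value are assumptions made for illustration. SQLite writes DEFAULT where some systems use WITH DEFAULT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE AIRCRAFT (
        Aircraft_Number   CHAR(10) NOT NULL PRIMARY KEY,
        Date_Manufactured DATE,                          -- nullable
        Status            CHAR(1)  NOT NULL DEFAULT 'A'  -- hypothetical column
    )
""")

# Only the key is supplied: the nullable column stays NULL and the
# default value is used for Status.
conn.execute("INSERT INTO AIRCRAFT (Aircraft_Number) VALUES ('B474001323')")
row = conn.execute("SELECT Date_Manufactured, Status FROM AIRCRAFT").fetchone()

# Explicitly supplying NULL for a NOT NULL column is rejected.
try:
    conn.execute(
        "INSERT INTO AIRCRAFT (Aircraft_Number, Status) "
        "VALUES ('B373004518', NULL)"
    )
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
```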

Documenting Triggers
For each trigger:

Name:                     Name for trigger
Table:                    Name of table to which the trigger applies
Triggering Operation:     Operation to which trigger applies (INSERT, DELETE,
                          UPDATE, or UPDATE OF column)
Definition Sequence:      If multiple triggers for same event, a number determining
                          the sequence in which triggers must be created
Granularity:              If trigger applies to each row or to SQL statement
Time Applied:             When trigger is applied: BEFORE operation or AFTER
                          operation
Prerequisite Conditions:  Search condition that must be TRUE for trigger to fire
Triggered Actions:        Actions to be performed when trigger fires (SQL
                          statements)

Figure 7-48. Documenting Triggers CF182.0

Notes:
For each trigger, all the items we discussed in detail should be documented. Verify with
your database administrator that the search condition for the WHEN clause of the trigger
can be implemented, i.e., only uses functions supported by your database management
system. Also verify that the intended actions are supported by the target database
management system.
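
The following sketch shows how the documented items for a trigger might appear in a trigger definition. It uses the SQLite engine bundled with Python as a stand-in for the target database management system. The trigger name and the ENGINE_REMOVAL table are assumptions made for illustration. SQLite supports row triggers only, so the statement granularity cannot be shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ENGINE (
        Engine_Number   CHAR(10) NOT NULL PRIMARY KEY,
        Engine_Type     CHAR(8)  NOT NULL,
        Aircraft_Number CHAR(10)
    );
    -- Hypothetical log table, introduced only for this sketch.
    CREATE TABLE ENGINE_REMOVAL (
        Engine_Number   CHAR(10) NOT NULL,
        Aircraft_Number CHAR(10) NOT NULL
    );

    CREATE TRIGGER Log_Engine_Removal              -- Name
        AFTER                                      -- Time Applied
        UPDATE OF Aircraft_Number                  -- Triggering Operation
        ON ENGINE                                  -- Table
        FOR EACH ROW                               -- Granularity
        WHEN OLD.Aircraft_Number IS NOT NULL
             AND NEW.Aircraft_Number IS NULL       -- Prerequisite Conditions
    BEGIN                                          -- Triggered Actions
        INSERT INTO ENGINE_REMOVAL
        VALUES (OLD.Engine_Number, OLD.Aircraft_Number);
    END;

    INSERT INTO ENGINE VALUES ('PW9880193', 'PW4062', 'B474001323');
    INSERT INTO ENGINE VALUES ('PW9880194', 'PW4062', 'B474001323');
""")

# Removing an engine from its aircraft satisfies the prerequisite condition.
conn.execute(
    "UPDATE ENGINE SET Aircraft_Number = NULL WHERE Engine_Number = 'PW9880193'"
)
# Updating another column is not the triggering operation.
conn.execute(
    "UPDATE ENGINE SET Engine_Type = 'PW4062' WHERE Engine_Number = 'PW9880194'"
)
removals = conn.execute("SELECT * FROM ENGINE_REMOVAL").fetchall()
```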

Checkpoint

Exercise — Unit Checkpoint


1. How are tuple types translated into tables?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. When can tuple types be merged?


_____________________________________________________
_____________________________________________________
_____________________________________________________

3. When can you imbed a tuple type into another tuple type?
_____________________________________________________
_____________________________________________________
_____________________________________________________

4. The tuple types for 1:1 or 1:m relationship types can always be
merged or imbedded. (T/F)

5. Assume that T and T1 through Tn are tuple types satisfying the
following conditions:
• They all have the same primary key.
• At all times, each primary key value of T1 through Tn occurs in
T.
Which further condition must be satisfied for T1 through Tn being a
perfect decomposition of T?
_____________________________________________________
_____________________________________________________
_____________________________________________________

6. Give two reasons why you may not want to combine two tuple
types that theoretically could be combined.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

7. Name three limitations typically existing for relational database
management systems.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

8. The fixed-length size of pages may cause space to be wasted. (T/F)

9. Denormalization causes the redundant storage of information. (T/F)

10. Denormalization consciously violates the First Normal Form. (T/F)

11. Vertical splitting moves some attributes of a tuple type to another
tuple type with the same primary key. (T/F)

12. Horizontal splitting of a tuple type always creates tuple types for
different primary key ranges of the original tuple type. (T/F)

13. Match the following categories with the listed built-in data types:
a. Binary integers
b. Decimal numbers
c. Floating-point numbers
d. Binary strings
e. Datetime data
f. Single-byte character strings
g. Double-byte character strings

____ VARCHAR
____ INTEGER
____ REAL
____ DECIMAL
____ BIGINT
____ BLOB
____ DATE
____ SMALLINT
____ CHARACTER
____ DOUBLE
____ GRAPHIC
____ CLOB
____ NUMERIC
____ TIMESTAMP

14. For varying-length character strings, a value of NULL has the same
meaning as a string of length 0. (T/F)

15. Whether or not a column must always assume a value is specified
by means of the keywords NOT NULL and NULL, respectively.
(T/F)

16. Describe the difference between system default values and user
default values.
_____________________________________________________
_____________________________________________________
_____________________________________________________

17. How can you provide your own default value for a column?
_____________________________________________________
_____________________________________________________
_____________________________________________________

18. User defined distinct types must be based on built-in data types.
They cannot be based on other user defined distinct types. (T/F)

19. When using a user defined distinct type that is based on
VARCHAR for a specific column, you can specify a maximum
length for the column that is different from the length specified for
the user defined distinct type. (T/F)

20. Describe the difference between external and sourced user defined
functions.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

21. A primary purpose of sourced user defined functions is to promote
existing functions to new user defined distinct types. (T/F)

22. Establish the proper relationships:
a. Scalar functions can be
b. Column functions can be
c. Table functions can be

____ Sourced functions only
____ External functions only
____ External or sourced functions

23. Describe the major purpose of check constraints.


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

24. Check constraints defined on the column level may refer to other
columns of the table. (T/F)

25. What is a trigger?


_____________________________________________________
_____________________________________________________

26. Triggers can be activated by SELECT, UPDATE, INSERT, or
DELETE statements. (T/F)

27. A trigger can be executed for each row processed or once for the
triggering SQL statement. (T/F)

28. When can triggers be activated?


_____________________________________________________
_____________________________________________________
_____________________________________________________

29. Which of the following SQL statements are allowed for before
triggers?
a. Fullselects.
b. INSERT statements.
c. UPDATE statements.
d. DELETE statements.
e. SIGNAL SQLSTATE statements.
f. SET transition-variable statements.

30. Which of the following SQL statements are allowed for after
triggers?
a. Fullselects.
b. INSERT statements.
c. UPDATE statements.
d. DELETE statements.
e. SIGNAL SQLSTATE statements.
f. SET transition-variable statements.

31. Triggers can change the values of columns before they are stored.
(T/F)

32. Generally, triggers can use user defined functions. (T/F)

Unit Summary (1 of 3)

• Tuple types with always corresponding primary key values can be merged
• Tuple types whose primary key values are always a subset of the primary
  key values of another tuple type can be imbedded in the other tuple type if:
  - For each potential tuple, at least one nonkey attribute has a value
• Tuple types for 1:1 and 1:m relationship types can always be merged or
  imbedded
• Tuple type for a supertype with an exclusive and covering subtype set can
  be eliminated (perfect decomposition)
• You do not always want to combine tuple types
  - If tuple types have nothing to do with each other
  - If other tuple types are referentially dependent on tuple type to be eliminated
  - If restrictions for database management system become effective
• If necessary for performance reasons, tuple types can be denormalized
  - (Re)introduces problems for not normalized tuple types
• For performance reasons or because of limitations for the target DBMS, you
  may want to split tuple types vertically or horizontally

Figure 7-49. Unit Summary (1 of 3) CF182.0

Notes:

Unit Summary (2 of 3)
• Tuple types are converted into tables as follows:
  - Each tuple type becomes a table
  - Each elementary attribute becomes a column
  - Each elementary primary key attribute becomes a primary key column
• Built-in data types include data types for:
  - Numeric data: INTEGER, SMALLINT, BIGINT, DECIMAL, NUMERIC,
    REAL, DOUBLE
  - Single-byte character strings: CHARACTER, VARCHAR, CLOB
  - Double-byte character strings: GRAPHIC, VARGRAPHIC, DBCLOB
  - Datetime data: DATE, TIME, TIMESTAMP
  - Binary strings: BLOB
• Columns can be defined as nullable or NOT NULL
  - Nullable: Column need not assume a value for every row
  - NOT NULL: Column must assume a value for every row
• Columns can assume system-provided or user-provided default values
• To implement an abstract data type, you need:
  - User defined distinct types
  - User defined functions
  - Check constraints and/or triggers

Figure 7-50. Unit Summary (2 of 3) CF182.0

Notes:

Unit Summary (3 of 3)

• User defined distinct types are based on built-in data types
  - Restrict allowed operations to comparisons for data of user defined distinct type
  - Prevent illegal comparisons between different data types
• User defined functions allow you to provide your own functions
• User defined functions can be external or sourced functions
  - External functions: Use a program provided by you
  - Sourced functions: Extend existing built-in or user defined functions
• User defined functions can be scalar functions, column functions, or table
  functions
• Check constraints allow you to restrict the values columns can assume
• A trigger is a set of actions (SQL statements) to be performed when a
  specific event occurs
• Triggers are activated:
  - By INSERT, UPDATE, or DELETE statements
  - For each row or once per statement
  - Before or after changes have been applied
  - Only when a prerequisite condition is satisfied

Figure 7-51. Unit Summary (3 of 3) CF182.0

Notes:

Unit 8. Integrity Rules

What This Unit Is About


This unit discusses the different types of integrity to be achieved for a
good design. It presents methods for implementing the various types
of integrity.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Describe the different types of integrity to be enforced for a
database.
• Explain the integrity rules for referential integrity.
• Establish the referential constraints for the tables of an application
domain.
• Draw the referential structure for the tables of an application
domain.
• Know how to ensure the integrity of redundant information.
• Implement business constraints.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises

Unit Objectives
After completion of this unit, you should be able to:

• Describe the different types of integrity to be enforced for a database
• Explain the integrity rules for referential integrity
• Establish the referential constraints for the tables of an application domain
• Draw the referential structure for the tables of an application domain
• Know how to ensure the integrity of redundant information
• Implement business constraints

Figure 8-1. Unit Objectives CF182.0

Notes:
This unit discusses the different types of integrity that must be enforced for a database.
They are:
• Referential integrity
• Domain integrity
• Redundancy integrity
• Constraint integrity
The unit will describe: the integrity rules that can be enforced to achieve referential
integrity; how the referential constraints can be implemented; and how to establish the
referential structure for an application domain. The referential structure provides a
graphical overview of the referential constraints.
The unit will not discuss domain integrity in detail since it has been discussed by the
previous unit. It will discuss how the integrity of redundant information can be achieved.
Furthermore, the unit will explain how constraint integrity can be enforced, i.e., how the
business constraints for an application domain can be implemented.

8.1 Referential Integrity

Integrity Rules in Design Process


[Diagram of the design process. Steps shown: Problem Statement, Entity Relationship
Model, Process Inventory, Data Inventory, Tuple Types, Tables, Logical Data Structures,
Integrity Rules, Indexes. The steps are grouped into a Conceptual View, a Logical View,
and a Storage View.]

Figure 8-2. Integrity Rules in Design Process CF182.0

Notes:
This unit deals with the establishment of the integrity rules for the database being
designed. Thus, we are in the third step of the storage view.

Integrity - Areas of Concern and Types

Area of concern                                  Type of integrity

Correctness of references to other tables        Referential Integrity
Correctness of values and domains for columns    Domain Integrity
Consistency of redundant information             Redundancy Integrity
Observance of business constraints               Constraint Integrity

Figure 8-3. Integrity - Areas of Concern and Types CF182.0

Notes:
As we have seen, multiple tables are using the same columns. For example, column
Type_Code occurs in tables AIRCRAFT_TYPE and AIRCRAFT_MODEL. The values it can
assume in table AIRCRAFT_MODEL are dependent on the values of the column in table
AIRCRAFT_TYPE since they are references to rows in table AIRCRAFT_TYPE. Therefore,
they must always be a subset of the current values of column Type_Code in
AIRCRAFT_TYPE.
It is a concern of database design to ensure that references to other tables are always
correct. The appropriate integrity is referred to as referential integrity.
A similar concern is the correctness of the values in the columns of the tables. A column
must only assume values allowed by the abstract data type for the data element associated
with the column. Furthermore, the values must be within the limits defined by the domain
for the data element.
Column Type_Code mentioned above must only assume codes for valid aircraft types.
Similarly, column Number_of_Engines for table AIRCRAFT_TYPE must only assume
integer values between 0 and 4.

The corresponding type of integrity is referred to as value integrity or, more commonly,
domain integrity.
The third cause for concern is the redundant storage of information in the tables of the
application domain. Redundancy can occur as the consequence of the repetitive storage of
data or the storage of data that can be derived from other stored data (derivable data). If
redundancy cannot be avoided (e.g., for performance reasons), any redundant
information on whose currency business processes depend must be kept consistent at
all times.
The corresponding type of integrity is referred to as redundancy integrity.
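
One possible way to keep derivable data consistent is to let triggers maintain it. The following sketch uses the SQLite engine bundled with Python as a stand-in for the target database management system. The column Number_of_Aircraft and the trigger names are assumptions made for illustration. Updates that move an aircraft to a different model are not covered and would need a further trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AIRCRAFT_MODEL (
        Type_Code          CHAR(4) NOT NULL,
        Model_Number       CHAR(3) NOT NULL,
        Number_of_Aircraft INTEGER NOT NULL DEFAULT 0,  -- derivable (hypothetical)
        PRIMARY KEY (Type_Code, Model_Number)
    );
    CREATE TABLE AIRCRAFT (
        Aircraft_Number CHAR(10) NOT NULL PRIMARY KEY,
        Type_Code       CHAR(4)  NOT NULL,
        Model_Number    CHAR(3)  NOT NULL
    );

    -- Keep the redundant count in step with every insert and delete.
    CREATE TRIGGER Count_Aircraft_Insert AFTER INSERT ON AIRCRAFT
    BEGIN
        UPDATE AIRCRAFT_MODEL
           SET Number_of_Aircraft = Number_of_Aircraft + 1
         WHERE Type_Code = NEW.Type_Code AND Model_Number = NEW.Model_Number;
    END;
    CREATE TRIGGER Count_Aircraft_Delete AFTER DELETE ON AIRCRAFT
    BEGIN
        UPDATE AIRCRAFT_MODEL
           SET Number_of_Aircraft = Number_of_Aircraft - 1
         WHERE Type_Code = OLD.Type_Code AND Model_Number = OLD.Model_Number;
    END;

    INSERT INTO AIRCRAFT_MODEL (Type_Code, Model_Number) VALUES ('B737', '300');
    INSERT INTO AIRCRAFT VALUES ('B373004518', 'B737', '300');
    INSERT INTO AIRCRAFT VALUES ('B373004519', 'B737', '300');
    DELETE FROM AIRCRAFT WHERE Aircraft_Number = 'B373004518';
""")

# The stored (redundant) value must equal the value derived from AIRCRAFT.
stored = conn.execute("SELECT Number_of_Aircraft FROM AIRCRAFT_MODEL").fetchone()[0]
derived = conn.execute("SELECT COUNT(*) FROM AIRCRAFT").fetchone()[0]
```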
The data in the tables are also not correct if they violate business constraints (business
rules) for the application domain. This may be a rule as simple as that an employee cannot
be a pilot and a mechanic at the same time. It may also be a more complex rule such as
that a mechanic can only be assigned to the maintenance of an aircraft if he/she has been
trained for the appropriate aircraft model.
The corresponding type of integrity is referred to as (business) constraint integrity.
The integrity of data can be jeopardized by maintenance operations, i.e., insert, delete, or
update operations. Therefore, to guarantee the integrity of the data, rules must be
established that govern these types of operations and must be followed whenever they are
performed. The rules are referred to as integrity rules. In accordance with the type of
operation to which they apply,
the rules are referred to as Insert Rules, Delete Rules, and Update Rules, respectively.
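
As an illustration of an insert rule for a business constraint, the following sketch enforces that an employee cannot be a pilot and a mechanic at the same time. It uses the SQLite engine bundled with Python as a stand-in for the target database management system. The tables PILOT and MECHANIC, their columns, and the trigger names are assumptions made for illustration. Corresponding update rules are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: one row per employee holding that job.
    CREATE TABLE PILOT    (Employee_Number INTEGER NOT NULL PRIMARY KEY);
    CREATE TABLE MECHANIC (Employee_Number INTEGER NOT NULL PRIMARY KEY);

    -- Insert rule for PILOT: reject employees who are already mechanics.
    CREATE TRIGGER Pilot_Not_Mechanic BEFORE INSERT ON PILOT
    WHEN EXISTS (SELECT 1 FROM MECHANIC
                  WHERE Employee_Number = NEW.Employee_Number)
    BEGIN
        SELECT RAISE(ABORT, 'employee is already a mechanic');
    END;
    -- Insert rule for MECHANIC: the mirror image.
    CREATE TRIGGER Mechanic_Not_Pilot BEFORE INSERT ON MECHANIC
    WHEN EXISTS (SELECT 1 FROM PILOT
                  WHERE Employee_Number = NEW.Employee_Number)
    BEGIN
        SELECT RAISE(ABORT, 'employee is already a pilot');
    END;

    INSERT INTO MECHANIC VALUES (1001);
    INSERT INTO PILOT VALUES (2002);
""")


def try_insert(table, employee_number):
    """Return None if the row is accepted, else the error text of the trigger."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?)", (employee_number,))
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)


new_pilot = try_insert("PILOT", 3003)   # holds no other job: accepted
both_jobs = try_insert("PILOT", 1001)   # already a mechanic: rejected
```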

Referential Integrity - Terminology
AIRCRAFT_MODEL (parent table)
Parent/primary key: (Type_Code, Model_Number)

Type_Code  Model_Number  Length_of_Model
A340       200           59.40
A310       300           46.67
B737       300           33.41
B747       400           70.67

Referential constraint: AIRCRAFT_MODEL (parent table) to AIRCRAFT (dependent table)

AIRCRAFT (dependent table of AIRCRAFT_MODEL, parent table of ENGINE)
Foreign key: (Type_Code, Model_Number)
Parent/primary key: Aircraft_Number

Aircraft_Number  Date_Manufactured  Type_Code  Model_Number
B474001323       1994-10-12         B747       400
B373004518       1999-02-28         B737       300
B373004519       1999-03-31         B737       300
A103000534       1998-05-12         A310       300
A103003167       1997-08-01         A310       300
A402004217       1999-10-23         A340       200

ENGINE (dependent table)
Foreign key: Aircraft_Number

Engine_Number  Engine_Type  Aircraft_Number
PW9880193      PW4062       B474001323
PW9880194      PW4062       B474001323
PW9880195      PW4062
PW9882345      PW4062       B474001323
PW9974034      PW4062       B474001323
R375184566     CF6-80C2     A103003167
R375184567     CF6-80C2
R375184568     CF6-80C2     A103003167

Figure 8-4. Referential Integrity - Terminology CF182.0

Notes:
In conjunction with referential integrity, some terms are used that you need to be familiar with:

Key
A logically ordered set of columns of a table. The physical order of the columns in the table
is not relevant. If the key consists of multiple columns, it is referred to as a composite key.

Parent Key
A (logically ordered) set of columns of a table that uniquely identifies the rows of the table.
This need not be the primary key of the table. However, since we have established a
primary key for every table and the primary key can be a parent key, this course will
assume the parent key is a primary key.
On the visual, the parent key is the primary key of table AIRCRAFT_MODEL. It is a
composite key. It consists of columns Type_Code and Model_Number. We will define that
Type_Code is the first column and Model_Number the second column.

Foreign Key
A key which relates to the parent key of another table or the same table and whose values
must always be a subset of the values of the related parent key. Meaning and order of the
parent-key and foreign-key columns must be the same. The names of the columns can be
different. There is a one-to-one correspondence of the columns.
As mentioned before, we will always use the primary key of a table as parent key so that
the foreign key relates to a primary key.
On the visual, columns Type_Code and Model_Number together, and in that order, are a
foreign key of table AIRCRAFT referring to primary key (Type_Code, Model_Number) of
table AIRCRAFT_MODEL.

Referential Constraint
The correlation existing between a foreign key and the corresponding parent key.
On the visual, the correlation between foreign key (Type_Code, Model_Number) of table
AIRCRAFT and primary key (Type_Code, Model_Number) of table AIRCRAFT_MODEL
represents a referential constraint.
The arrow illustrating a referential constraint in a diagram points from the parent key to the
foreign key. A single-headed arrow is used if a parent key value can occur only once as
foreign key value. A double-headed arrow is used if a parent key value can occur more
than once as foreign key value.
Since each foreign key value can only occur once as parent key value, an arrowhead is not
necessary for the inverse direction.

Parent Table
The table of a referential constraint that contains the parent key.
On the visual, AIRCRAFT_MODEL is the parent table for the referential constraint between
foreign key (Type_Code, Model_Number) of table AIRCRAFT and the primary key of
AIRCRAFT_MODEL.

Dependent Table
The table of a referential constraint that contains the foreign key.
On the visual, AIRCRAFT is the dependent table for the referential constraint between
foreign key (Type_Code, Model_Number) of table AIRCRAFT and the primary key of
AIRCRAFT_MODEL.

Self-Referencing Constraint


A referential constraint whose parent key and foreign key belong to the same table. For a
self-referencing constraint, the parent table and the dependent table are the same.
The referential constraint between columns Owning_Record (foreign key) and
Maintenance_Number (parent key) of table MAINTENANCE_RECORD is a
self-referencing constraint.

Self-Referencing Table
A table having a self-referencing constraint.
Table MAINTENANCE_RECORD for Come Aboard is a self-referencing table since it has
the self-referencing constraint mentioned before.
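
A self-referencing constraint can be sketched as follows, using the SQLite engine bundled with Python as a stand-in for the target database management system. The data types and the sample values are assumptions made for illustration. All other columns of MAINTENANCE_RECORD are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces referential constraints only if this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE MAINTENANCE_RECORD (
        Maintenance_Number INTEGER NOT NULL PRIMARY KEY,   -- parent key
        Owning_Record      INTEGER                         -- foreign key
            REFERENCES MAINTENANCE_RECORD (Maintenance_Number)
    )
""")

conn.execute("INSERT INTO MAINTENANCE_RECORD VALUES (1, NULL)")  # no owning record
conn.execute("INSERT INTO MAINTENANCE_RECORD VALUES (2, 1)")     # owned by record 1

try:
    conn.execute("INSERT INTO MAINTENANCE_RECORD VALUES (3, 99)")  # no record 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The parent table and the dependent table are the same table here, so every insert is subject to the check on the foreign key.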

Parent Row
A row of the parent table whose parent key value exists as foreign key value in the
dependent table.
On the visual, all rows of AIRCRAFT_MODEL are parent rows for the referential constraint
between foreign key (Type_Code, Model_Number) of table AIRCRAFT and the primary key
of AIRCRAFT_MODEL.

Dependent Row
A row of the dependent table whose foreign key contains a value.
On the visual, all rows of AIRCRAFT are dependent rows for the referential constraint
between foreign key (Type_Code, Model_Number) of table AIRCRAFT and the primary key
of AIRCRAFT_MODEL.

Referential Integrity
For a referential constraint, referential integrity exists if, for every foreign key value of the
dependent table, the appropriate parent key value exists in the parent table.
On the visual, referential integrity exists for the referential constraint between foreign key
(Type_Code, Model_Number) of table AIRCRAFT and the primary key of
AIRCRAFT_MODEL. The visual illustrates a second referential constraint: the referential
constraint between column Aircraft_Number (foreign key) of table ENGINE and the primary
key of table AIRCRAFT (parent key). For this referential constraint, AIRCRAFT is the
parent table and ENGINE the dependent table.
The row for aircraft B373004518 is not a parent row since none of ENGINE's rows is
dependent on it. The row for engine PW9880195 is not a dependent row since column
Aircraft_Number does not contain a value for it.
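
The terms parent row, dependent row, and referential integrity can be expressed as queries. The following sketch loads the key columns of AIRCRAFT and ENGINE from the visual into the SQLite engine bundled with Python, which serves as a stand-in for the target database management system. The data types are assumptions, and all other columns are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AIRCRAFT (
        Aircraft_Number CHAR(10) NOT NULL PRIMARY KEY
    );
    CREATE TABLE ENGINE (
        Engine_Number   CHAR(10) NOT NULL PRIMARY KEY,
        Aircraft_Number CHAR(10) REFERENCES AIRCRAFT (Aircraft_Number)
    );
    INSERT INTO AIRCRAFT VALUES
        ('B474001323'), ('B373004518'), ('B373004519'),
        ('A103000534'), ('A103003167'), ('A402004217');
    INSERT INTO ENGINE VALUES
        ('PW9880193', 'B474001323'), ('PW9880194', 'B474001323'),
        ('PW9880195', NULL),         ('PW9882345', 'B474001323'),
        ('PW9974034', 'B474001323'), ('R375184566', 'A103003167'),
        ('R375184567', NULL),        ('R375184568', 'A103003167');
""")

# Parent rows: rows of AIRCRAFT whose key value occurs as a foreign key value.
parent_rows = {r[0] for r in conn.execute(
    "SELECT Aircraft_Number FROM AIRCRAFT "
    "WHERE Aircraft_Number IN (SELECT Aircraft_Number FROM ENGINE)")}

# Dependent rows: rows of ENGINE whose foreign key contains a value.
dependent_rows = {r[0] for r in conn.execute(
    "SELECT Engine_Number FROM ENGINE WHERE Aircraft_Number IS NOT NULL")}

# Referential integrity exists if no foreign key value lacks a parent key value.
orphans = conn.execute(
    "SELECT COUNT(*) FROM ENGINE e "
    "WHERE e.Aircraft_Number IS NOT NULL AND NOT EXISTS "
    "(SELECT 1 FROM AIRCRAFT a WHERE a.Aircraft_Number = e.Aircraft_Number)"
).fetchone()[0]
```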

The referential integrity for a referential constraint must be controlled via insert, delete, and
update rules for the parent table and the dependent table. For different referential
constraints, the integrity rules may be different.
For each referential constraint, you can decide if you want the database management
system to enforce the referential integrity or if you want to take care of it yourself. You may
even decide not to care about referential integrity or to check it only periodically and correct
problems when you find time.

Referential Integrity - Insert Rules
AIRCRAFT_MODEL (parent table)

Type_Code  Model_Number  Length_of_Model
A340       200           59.40
A310       300           46.67
B737       300           33.41
B747       400           70.67

Always:

    INSERT INTO AIRCRAFT_MODEL
      ( Type_Code, Model_Number, . . . )
    VALUES
      ( 'B777', '200', . . . )

AIRCRAFT (dependent table)

Aircraft_Number  Date_Manufactured  Type_Code  Model_Number
B474001323       1994-10-12         B747       400
B373004518       1999-02-28         B737       300
B373004519       1999-03-31         B737       300
A103000534       1998-05-12         A310       300
A103003167       1997-08-01         A310       300
A402004217       1999-10-23         A340       200

Only if parent row exists:

    INSERT INTO AIRCRAFT
      ( Aircraft_Number, Type_Code, Model_Number, . . . )
    VALUES
      ( 'B373004863', 'B737', '300', . . . )

    INSERT INTO AIRCRAFT
      ( Aircraft_Number, Type_Code, Model_Number, . . . )
    VALUES
      ( 'A006003012', 'A300', '600', . . . )

Figure 8-5. Referential Integrity - Insert Rules CF182.0

Notes:
For the insertion of rows, the following rules ensure the integrity of referential constraints:
• If the referential constraint is not a self-referencing constraint, rows can be added to the
parent table at all times.
If the referential constraint is a self-referencing constraint, the parent table is the
dependent table as well and the restrictions for dependent tables apply.
The insertion of rows into the parent table for a referential constraint may also be
impaired by the parent table being the dependent table of another referential constraint.
• A row may be added to the dependent table for a referential constraint if:
- The foreign key for the row does not contain a value (if allowed for the columns of the
foreign key).
- The foreign key value has a matching parent key value in the parent table.
In short: A row can be inserted only if its foreign key does not have a value or matches an
existing parent key value, i.e., an appropriate parent row exists.


For the referential constraint on the visual, rows can be added to table AIRCRAFT_MODEL
without restrictions. The row for aircraft number B373004863 can be added to table
AIRCRAFT since its foreign key value (B737, 300) has a matching parent row in table
AIRCRAFT_MODEL.
The row for aircraft number A006003012 cannot be added to table AIRCRAFT because
AIRCRAFT_MODEL does not contain a row for aircraft model (A300, 600).
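The insert rules above can be tried out with the following minimal sketch. It assumes Python with SQLite as a stand-in for DB2, reduced column lists, and nullable foreign key columns in AIRCRAFT; the aircraft number N000000000 is a made-up row that shows the "no foreign key value" case.

```python
import sqlite3

def insert_rule_demo():
    """Try three inserts into the dependent table; report which are accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only if enabled
    conn.executescript("""
        CREATE TABLE aircraft_model (
            type_code       TEXT NOT NULL,
            model_number    TEXT NOT NULL,
            length_of_model REAL,
            PRIMARY KEY (type_code, model_number)
        );
        CREATE TABLE aircraft (
            aircraft_number TEXT PRIMARY KEY,
            type_code       TEXT,   -- nullable here (assumption)
            model_number    TEXT,   -- nullable here (assumption)
            FOREIGN KEY (type_code, model_number)
                REFERENCES aircraft_model (type_code, model_number)
        );
        INSERT INTO aircraft_model VALUES ('B737', '300', 33.41);
    """)
    attempts = [
        ("B373004863", "B737", "300"),  # matching parent row exists
        ("A006003012", "A300", "600"),  # no parent row for (A300, 600)
        ("N000000000", None, None),     # made-up row without a foreign key value
    ]
    accepted = {}
    for row in attempts:
        try:
            conn.execute("INSERT INTO aircraft VALUES (?, ?, ?)", row)
            accepted[row[0]] = True
        except sqlite3.IntegrityError:
            accepted[row[0]] = False
    return accepted
```

Rows can be added to the parent table AIRCRAFT_MODEL at any time in this sketch because it is not the dependent table of any other constraint.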

Referential Integrity - Delete Rules

AIRCRAFT_MODEL (parent table of AIRCRAFT, delete rule NA)
Type_Code  Model_Number  Length_of_Model
A340       200           59.40
A310       300           46.67
B737       300           33.41
B747       400           70.67

AIRCRAFT (parent table of ENGINE, delete rule SN, and of SEAT, delete rule C)
Aircraft_Number  Date_Manufactured  Type_Code  Model_Number
B474001323       1994-10-12         B747       400
B373004518       1999-02-28         B737       300
B373004519       1999-03-31         B737       300
A103000534       1998-05-12         A310       300
A103003167       1997-08-01         A310       300
A402004217       1999-10-23         A340       200

ENGINE
Engine_Number  Engine_Type  Aircraft_Number
PW9880193      PW4062       B474001323
PW9880194      PW4062       B474001323
PW9880195      PW4062
PW9882345      PW4062       B474001323
PW9974034      PW4062       B474001323
R375184566     CF6-80C2     A103003167
R375184567     CF6-80C2
R375184568     CF6-80C2     A103003167

SEAT
Aircraft_Number  Seat_Number
B474001323       1A
B474001323       1B
B474001323       1C
...              ...
B474001323       46J
B171004217       1A
B171004217       1B
...              ...
B171004217       28G

Figure 8-6. Referential Integrity - Delete Rules

Notes:
For the deletion of rows, the following delete rules ensure the integrity of referential
constraints:
• If the referential constraint is not a self-referencing constraint, a row can be deleted from
the dependent table at any time.
If the referential constraint is a self-referencing constraint, the dependent table is the
parent table at the same time. Since dependent rows may be parent rows at the same
time, the delete rule for the referential constraint may prevent the deletion of dependent
rows or cause the deletion of additional rows.
The deletion of rows from the dependent table of a referential constraint can also be
impaired by other referential constraints for which the dependent table is the parent
table.
• For the deletion of rows from the parent table of a referential constraint, one of the
following options can be chosen:


NO ACTION
For NO ACTION, rows of the parent table can only be deleted if none of the rows of the
dependent table becomes an orphan, i.e., does not have a matching parent key value
afterwards.
Conceptually, first, all rows requested by the delete request are deleted. After they have
been deleted, it is checked if the dependent table contains rows dependent on the
deleted rows and now being orphans. If so, the deletions are backed out and the request
is rejected. Thus, NO ACTION checks for conflicts after the deletion.
The subsequent RESTRICT option is very similar to NO ACTION, but its effects may be
different. NO ACTION is the SQL92 standard whereas RESTRICT is a DB2
implementation.
On the visual, NO ACTION (abbreviated as NA) has been chosen for the illustrated
constraint between tables AIRCRAFT_MODEL and AIRCRAFT. This means that an
aircraft model can only be deleted if an aircraft is no longer dependent on it.

RESTRICT (DB2 only)


For RESTRICT, rows of the parent table can only be deleted if they are not parent rows,
i.e., if none of the rows of the dependent table refer to them.
Conceptually, the checking for dependent rows is done before rows are deleted and not
after all rows have been deleted. If a conflict is detected, the request is rejected.
As you can see from this, RESTRICT has a performance advantage over NO ACTION in
case of conflicts.
The effects of NO ACTION and RESTRICT may be different for self-referencing
constraints because the dependent rows are in the same table as the deleted rows. If
considering each deletion of a row individually, as RESTRICT does, there may be
conflicts which disappear when considering the deletions collectively as NO ACTION
does. NO ACTION is successful if the dependent rows are deleted as well whereas
RESTRICT fails if there are any initial dependencies between the rows to be deleted.
Thus, DB2 doesn’t allow the usage of RESTRICT for self-referencing constraints.
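The difference between checking each deleted row individually and checking the deletions collectively can be observed with the following minimal sketch. It assumes Python with SQLite, which (unlike DB2 as described above) accepts RESTRICT on a self-referencing constraint; the table PART and its columns are hypothetical.

```python
import sqlite3

def delete_all(rule):
    """Delete a parent row and its dependent row in ONE statement.

    `rule` is 'NO ACTION' or 'RESTRICT'. Returns True if the DELETE succeeds.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE part (
            part_id   INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES part (part_id) ON DELETE {rule}
        );
        INSERT INTO part VALUES (1, NULL);  -- parent row
        INSERT INTO part VALUES (2, 1);     -- row dependent on row 1
    """)
    try:
        conn.execute("DELETE FROM part")    # request covers both rows
        return True
    except sqlite3.IntegrityError:
        return False
```

With NO ACTION, the request is expected to succeed because no orphan remains once both rows are gone. With RESTRICT, it is expected to fail because row 1 still has a dependent row at the moment it is deleted.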

SET NULL
The foreign key values of rows dependent on deleted parent rows are deleted, i.e., the
dependencies are removed.
SET NULL is only an option if the foreign key of the dependent table need not have a
value for every row. This raises the question of when a composite foreign key is considered
to have no value for a row. In general, the foreign key of a row is considered not to have
a value if at least one column of the foreign key does not have a value. Thus, SET NULL
is only an option if at least one of the foreign key columns has been defined as nullable.
SET NULL sets to NULL all foreign key columns that have been defined as nullable.

On the visual, SET NULL (abbreviated as SN) has been chosen for the illustrated
referential constraint between tables AIRCRAFT and ENGINE. This means that, for an
engine, the reference to the aircraft is removed if the aircraft is deleted. Practically, this
implies that the engine is no longer mounted on an aircraft.
SET NULL can be used because column Aircraft_Number of table ENGINE has been
defined as nullable.

CASCADE
Rows dependent on deleted parent rows are deleted as well.
On the visual, CASCADE (abbreviated as C) has been chosen for the illustrated
referential constraint between tables AIRCRAFT and SEAT. Thus, if an aircraft is
deleted, information about its seats is no longer kept.
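The three delete rules chosen on the visual can be sketched as follows. The example assumes Python with SQLite as a stand-in for DB2 and uses reduced column lists; only the table names, key columns, and sample values come from the visual.

```python
import sqlite3

def delete_rule_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE aircraft_model (
            type_code    TEXT NOT NULL,
            model_number TEXT NOT NULL,
            PRIMARY KEY (type_code, model_number)
        );
        CREATE TABLE aircraft (
            aircraft_number TEXT PRIMARY KEY,
            type_code       TEXT NOT NULL,
            model_number    TEXT NOT NULL,
            FOREIGN KEY (type_code, model_number)
                REFERENCES aircraft_model (type_code, model_number)
                ON DELETE NO ACTION
        );
        CREATE TABLE engine (
            engine_number   TEXT PRIMARY KEY,
            aircraft_number TEXT      -- nullable, so SET NULL is possible
                REFERENCES aircraft (aircraft_number) ON DELETE SET NULL
        );
        CREATE TABLE seat (
            aircraft_number TEXT NOT NULL
                REFERENCES aircraft (aircraft_number) ON DELETE CASCADE,
            seat_number     TEXT NOT NULL,
            PRIMARY KEY (aircraft_number, seat_number)
        );
        INSERT INTO aircraft_model VALUES ('B747', '400');
        INSERT INTO aircraft VALUES ('B474001323', 'B747', '400');
        INSERT INTO engine VALUES ('PW9880193', 'B474001323');
        INSERT INTO seat VALUES ('B474001323', '1A');
        INSERT INTO seat VALUES ('B474001323', '1B');
    """)
    # NO ACTION: the model still has a dependent aircraft, so this is rejected.
    try:
        conn.execute("DELETE FROM aircraft_model WHERE type_code = 'B747'")
        model_deleted = True
    except sqlite3.IntegrityError:
        model_deleted = False
    # Deleting the aircraft: SET NULL for ENGINE, CASCADE for SEAT.
    conn.execute("DELETE FROM aircraft WHERE aircraft_number = 'B474001323'")
    engine_fk = conn.execute(
        "SELECT aircraft_number FROM engine WHERE engine_number = 'PW9880193'"
    ).fetchone()[0]
    seat_count = conn.execute("SELECT COUNT(*) FROM seat").fetchone()[0]
    return model_deleted, engine_fk, seat_count
```

After the aircraft has been deleted, the engine row remains without a reference to an aircraft and no seat rows are left.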


Referential Integrity - Update Rules

Dependent Table
   Foreign key values can be changed to other existing parent key values
   of the parent table or can be deleted (set to NULL) if permitted

Parent Table
   NO ACTION
      The parent key values of parent rows can only be changed if the
      dependent table does not have orphans afterwards
   RESTRICT
      Can only change parent key values of rows that are not parent rows
   SET NULL
      If the parent key value of a parent row is changed, the foreign key
      values of all dependent rows are set to NULL (if permitted)
   CASCADE
      If the parent key value of a parent row is changed, the foreign key
      values of all dependent rows are changed accordingly
   Most systems only support NO ACTION or RESTRICT

Figure 8-7. Referential Integrity - Update Rules

Notes:
The relational data model defines the following update rules for referential constraints:
• The foreign key values of dependent rows can be changed to matching parent key
values of the parent table or can be deleted (set to NULL) if permitted. As explained for
the delete rules, the deletion of foreign key values generally requires at least one of the
foreign key columns being defined as nullable.
• For the updating of parent key values, the relational data model provides the following
options:

NO ACTION
For NO ACTION, the parent key values of rows of the parent table can only be changed
if none of the rows of the dependent table becomes an orphan, i.e., does not have a
matching parent value, afterwards.

As for the delete rules, conceptually, the checking for dependent rows is done after all
parent key values have been changed. If orphans are detected, the changes are rolled
back and the request is rejected.

RESTRICT (DB2 only)


For RESTRICT, the parent key values of rows of the parent table can only be changed if
they are not parent rows, i.e., if none of the rows of the dependent table refer to them.
As for the delete rules, conceptually, the checking for dependent rows is done before the
parent key values are changed. If dependent rows are detected, the request is rejected.

SET NULL
The foreign key values of all rows dependent on parent rows whose parent key values
are changed are deleted, i.e., the dependencies are removed.
SET NULL is only an option if the foreign key of the dependent table need not have a
value for every row. The same foreign key considerations apply as for the delete rules.

CASCADE
The foreign key values of all dependent rows are changed to the new parent key values
of their parent rows.
For parent tables, most database management systems (in particular, DB2) only support
NO ACTION or RESTRICT. To change the parent key value for a parent row, you can use
the following procedure:
1. Add an identical row with the new parent key value to the parent table.
2. Change the foreign key values of the dependent rows to the new parent key value.
3. Delete the former parent row. (Since the foreign key values of the formerly dependent
rows have been changed, the former parent row is no longer a parent row and,
therefore, can be deleted.)
Alternatively, you can temporarily remove the referential constraint and reestablish it after
the parent key values have been changed. However, this requires that the integrity of the
referential constraint is checked by means of utilities before the dependent table can be
processed again.
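The three-step procedure above can be sketched as follows, assuming Python with SQLite as a stand-in for DB2 and update and delete rules of NO ACTION. The new model number '30X' is a made-up value for illustration.

```python
import sqlite3

def change_parent_key():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE aircraft_model (
            type_code       TEXT NOT NULL,
            model_number    TEXT NOT NULL,
            length_of_model REAL,
            PRIMARY KEY (type_code, model_number)
        );
        CREATE TABLE aircraft (
            aircraft_number TEXT PRIMARY KEY,
            type_code       TEXT NOT NULL,
            model_number    TEXT NOT NULL,
            FOREIGN KEY (type_code, model_number)
                REFERENCES aircraft_model (type_code, model_number)
                ON DELETE NO ACTION ON UPDATE NO ACTION
        );
        INSERT INTO aircraft_model VALUES ('B737', '300', 33.41);
        INSERT INTO aircraft VALUES ('B373004518', 'B737', '300');
    """)
    # Changing the parent key directly would leave an orphan: rejected.
    try:
        conn.execute("UPDATE aircraft_model SET model_number = '30X' "
                     "WHERE type_code = 'B737' AND model_number = '300'")
        direct_update_ok = True
    except sqlite3.IntegrityError:
        direct_update_ok = False
    with conn:  # commit the three steps together
        # 1. Add an identical row with the new parent key value.
        conn.execute("INSERT INTO aircraft_model "
                     "SELECT type_code, '30X', length_of_model "
                     "FROM aircraft_model "
                     "WHERE type_code = 'B737' AND model_number = '300'")
        # 2. Point the dependent rows to the new parent key value.
        conn.execute("UPDATE aircraft SET model_number = '30X' "
                     "WHERE type_code = 'B737' AND model_number = '300'")
        # 3. Delete the former parent row, which is no longer a parent row.
        conn.execute("DELETE FROM aircraft_model "
                     "WHERE type_code = 'B737' AND model_number = '300'")
    models = conn.execute(
        "SELECT type_code, model_number FROM aircraft_model").fetchall()
    aircraft = conn.execute(
        "SELECT aircraft_number, model_number FROM aircraft").fetchall()
    return direct_update_ok, models, aircraft
```

Running the three steps in one transaction is a design choice of this sketch, so that other users never see both the old and the new parent row as permanent data.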


Delete Rules and ER Model (1 of 8)

Four cases of a dependent entity type (D) and the resulting delete rule for the
referential constraint between tables SOURCE and TARGET:

Case          Cardinality for D  Controlling (C)  Delete rule
Top left      ..m                no               NA
Top right     ..m                yes              C
Bottom left   ..1                no               NA
Bottom right  ..1                yes              C

Figure 8-8. Delete Rules and ER Model (1 of 8)

Notes:
As mentioned before, we will always use the primary key of the parent table as parent key.
Therefore, we will no longer use the term parent key and only talk about primary
key/foreign key relationships in conjunction with referential constraints in the remainder of
the unit.
The existence of a referential constraint means that rows of the dependent table refer to
rows of the parent table. This implies an interrelationship between the parent table and the
dependent table. Since the tables are derived from tuple types, which are derived from
entity types and relationship types, referential constraints are the consequence of
relationship types of the entity-relationship model. In many cases, the entity-relationship
model also helps you determine the proper delete rules for the referential constraints as
illustrated by the next series of visuals.
The above visual discusses the resulting delete rules if one of the tables is for (a tuple type
belonging to) a dependent entity type. The key of the dependent entity type contains, as a
part, the key of the parent entity type or relationship type. For each dependent instance, the
appropriate parent instance must exist at all times.

The interrelationships between source and target instances are expressed by the key of
the dependent entity type. As we know, a tuple type, and thus a table, is not established for
the owning relationship type. For the established tables, the interrelationships between the
instances (rows) are expressed by the primary key of the dependent table. The primary key
of the parent table is part of the primary key of the dependent table and constitutes a
foreign key for the dependent table.
The left two examples illustrate the delete rule to be chosen if the controlling property has
not been chosen for the dependent entity type. In these cases, the instances of the
dependent entity type, and thus the rows of the dependent table, are dependent on the
existence of the appropriate parent instances or rows. Since controlling has not been
specified for the dependent entity type, a parent instance (row) cannot be deleted as long
as an instance (row) is dependent on it. Consequently, the proper delete option for the
referential constraint is NO ACTION (or RESTRICT) independently of the cardinality for the
dependent entity type.
If the controlling property has been specified for the dependent entity type, dependent
instances, and thus rows, are to be deleted if the associated parent instances (rows) are
deleted. Thus, in this case, the proper delete option for the referential constraint is
CASCADE as illustrated for the right two examples on the visual.


Delete Rules and ER Model (2 of 8)

Four cases of an m:m relationship type r between Source and Target (tables SOURCE,
TARGET, and R) and the resulting delete rules:

Case          Source cardinality  Target cardinality  SOURCE-R  TARGET-R
Top left      m                   m                   C         C
Top right     1..m                m                   NA        C
Bottom left   1..m                1..m                NA        NA
Bottom right  m                   1..m                C         NA

Figure 8-9. Delete Rules and ER Model (2 of 8)

Notes:
The cases on this visual consider m:m relationship types. For them, tuple types and tables
are established for source, target, and relationship type. Let us call them SOURCE,
TARGET, and R, respectively. Since the relationship key consists of the keys of source and
target, the primary key of R consists of the primary keys of SOURCE and TARGET. None
of the columns of the primary key may be nullable.
For a relationship instance, the associated source and target instances must exist at all
times. Therefore, for each row of R, rows with corresponding primary key values must exist
in SOURCE and TARGET. Consequently, as part of the primary key of R, the primary keys
of SOURCE and TARGET constitute foreign keys for R.
For relationship instances, the rule applies that they are deleted if their source or target
instances are deleted and the relationship instance can be deleted. Whether or not and
when a relationship instance can be deleted is controlled by the minimum cardinalities for
the relationship type. The controlling property may also have a certain effect as will be
illustrated on the next visual. If a relationship instance cannot be deleted, its source or
target instances cannot be deleted.

The cases on the visual assume that the controlling property has not been specified for
either end of the relationship type. For the top left case, the minimum cardinalities for both
ends of the relationship type are 0. Thus, a relationship instance need not exist for a source
or target instance and nothing prevents the deletion of relationship instances. As a
consequence, if a parent row of one of the two parent tables is deleted, the corresponding
dependent rows must be deleted as well. Thus, CASCADE is the proper delete rule for
both referential constraints.
For the top right case, the minimum cardinality for the source is 1 and the minimum
cardinality for the target is 0. Consequently, for each target instance, at least one
relationship instance must exist. Since the controlling property has not been specified for
the target, the deletion of a relationship instance should not cause the automatic deletion of
its target instance.
Both points together disallow the deletion of a source instance if this would leave a target
instance without a relationship instance. They require a delete rule of NO ACTION (or
RESTRICT) for the referential constraint between the tables for the source and the
relationship type: You do not allow the deletion of a row of the table for the source as long
as a row is dependent on it; otherwise, you could delete the row for the last relationship
instance for the target.
Note that this does not completely match the meaning of minimum cardinality 1 for the
source of the relationship type, but this is as close as you can get.
Since the minimum target cardinality is 0, a source instance need not have a relationship
instance. Thus, a relationship instance can be and must be deleted if its target instance is
deleted. For the referential constraint between the tables for the target and the relationship
type, this translates into a delete rule of CASCADE.
For the bottom right case, the roles of source and target have been reversed. Thus, the
delete rules are: CASCADE for the referential constraint between the tables for the source
and the relationship type; NO ACTION (or RESTRICT) for the referential constraint
between the tables for the target and the relationship type.
For the bottom left case, both minimum cardinalities are 1. This means that a relationship
instance cannot be deleted if it is the last relationship instance for the source instance or
the target instance. This translates into delete rules of NO ACTION (or RESTRICT) for both
referential constraints with the caveat mentioned above.
Whenever you have a delete rule of NO ACTION or RESTRICT, you must get rid of
dependent rows (relationship instances) first. Only thereafter, you can delete the parent
rows.
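The four cases on the visual follow one pattern, which can be summarized in the following minimal sketch. The function name and the dictionary keys are hypothetical; the sketch assumes, as the notes do, that the controlling property has not been specified for either end.

```python
def mm_delete_rules(min_source_card, min_target_card):
    """Delete rules for the constraints SOURCE-R and TARGET-R of an m:m
    relationship type without the controlling property.

    The minimum cardinality at an end decides the rule for the constraint
    between that end's table and R: 1 -> NO ACTION (or RESTRICT), 0 -> CASCADE.
    """
    def rule(min_card):
        if min_card not in (0, 1):
            raise ValueError("minimum cardinality must be 0 or 1")
        return "NO ACTION" if min_card == 1 else "CASCADE"

    return {"SOURCE-R": rule(min_source_card),
            "TARGET-R": rule(min_target_card)}
```

For example, `mm_delete_rules(1, 0)` corresponds to the top right case on the visual.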


Delete Rules and ER Model (3 of 8)

m:m relationship type r with cardinality 1..m for Source and cardinality m plus the
controlling property (C) for Target (tables SOURCE, TARGET, and R). Two alternatives:

Alternative              SOURCE-R  TARGET-R
Without trigger          NA        C
   OR
With trigger (below)     C         C

CREATE TRIGGER . . .
   AFTER DELETE ON R
   REFERENCING OLD AS O
   FOR EACH ROW MODE DB2SQL
   BEGIN ATOMIC
      DELETE FROM TARGET
         WHERE primary-key = O.foreign-key;
   END

Figure 8-10. Delete Rules and ER Model (3 of 8)

Notes:
For the previous visual, we assumed that the controlling property had not been specified for
either end of the relationship type. As a consequence, the deletion of a relationship
instance did not affect any source or target instances. Likewise, the deletion of a row of the
table for the relationship type did not have an effect on rows of the tables for source and
target.
If the controlling property has been specified for an end of the relationship type, the
deletion of the row for a relationship instance affects rows in other tables: If the controlling
property has been specified for the source, the row for the source instance must be
deleted. If the controlling property has been specified for the target, the row for the target
instance must be deleted. As on the visual, let us assume that the controlling property has
been specified for the target of the relationship type. Then, the deletion of the row for a
relationship instance should automatically trigger the deletion of the row for the target
instance (from the target table).
Referential integrity does not provide for the automatic deletion of rows in tables other than
the dependent table. For the example on the visual, you can still use NO ACTION or
RESTRICT for the referential constraint between the tables for the source and the
relationship type. This would prevent rows for target instances without corresponding rows
for relationship instances, but would ignore the controlling property.
If you want to implement the controlling property correctly and your database management
system supports triggers, you can do the following:
• Use a delete rule of CASCADE for the referential constraint between the table for the
end opposing the controlling property and the table for the relationship type.
In the example, this is for the referential constraint between the tables for the source and
the relationship type. (The other delete rule is already CASCADE because of minimum
cardinality 0 for the target.)
• To achieve the automatic deletion of the source or target instance for the relationship
instance, you need a trigger for the table for the relationship type. This trigger must be
an after trigger activated on each deletion of a row from the table for the relationship
type. It must delete the row of the controlled end of the relationship type associated with
the row being deleted. The controlled end is the end for which the controlling property
has been specified.
In the example on the visual, the controlling property has been specified for the target.
Therefore, the appropriate row in the table for the target must be deleted. The trigger
shown on the visual will achieve this. The WHERE clause of the DELETE statement has
been simplified. It assumes that the primary key and the foreign key are not composite
keys. If they were composite keys, multiple predicates, combined by logical ANDs,
would be needed in the WHERE clause.
Note that the correlation name of the REFERENCING clause is needed to refer to the
foreign key value of the row for the relationship instance being deleted.
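The combination of CASCADE delete rules and an after trigger can be tried out with the following minimal sketch. It assumes Python with SQLite, whose trigger syntax differs from the DB2 statement on the visual (no REFERENCING clause or MODE DB2SQL; the deleted row is addressed as OLD). The trigger name, column names, and sample rows are hypothetical.

```python
import sqlite3

def trigger_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE source (source_id TEXT PRIMARY KEY);
        CREATE TABLE target (target_id TEXT PRIMARY KEY);
        CREATE TABLE r (
            source_id TEXT NOT NULL
                REFERENCES source (source_id) ON DELETE CASCADE,
            target_id TEXT NOT NULL
                REFERENCES target (target_id) ON DELETE CASCADE,
            PRIMARY KEY (source_id, target_id)
        );
        -- Controlling property for the target: deleting a relationship
        -- instance also deletes the row for its target instance.
        CREATE TRIGGER r_delete_target
        AFTER DELETE ON r
        FOR EACH ROW
        BEGIN
            DELETE FROM target WHERE target_id = OLD.target_id;
        END;
        INSERT INTO source VALUES ('S1');
        INSERT INTO source VALUES ('S2');
        INSERT INTO target VALUES ('T1');
        INSERT INTO target VALUES ('T2');
        INSERT INTO r VALUES ('S1', 'T1');
        INSERT INTO r VALUES ('S2', 'T1');
        INSERT INTO r VALUES ('S2', 'T2');
    """)
    # Delete one relationship instance of target T1.
    conn.execute("DELETE FROM r WHERE source_id = 'S1' AND target_id = 'T1'")
    targets = [row[0] for row in
               conn.execute("SELECT target_id FROM target ORDER BY target_id")]
    remaining = conn.execute(
        "SELECT source_id, target_id FROM r ORDER BY source_id, target_id"
    ).fetchall()
    return targets, remaining
```

In this sketch, the trigger removes target T1, and the CASCADE rule between the target table and R then removes the other relationship instance of T1 as well.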


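Delete Rules and ER Model (4 of 8)

Four cases of a 1:m relationship type between Source and Target and the resulting
delete rule for the referential constraint between tables SOURCE and TARGET:

Case          Source cardinality  Target cardinality  Controlling (C) for Target  Delete rule
Top left      1                   ..m                 no                          SN
Top right     1                   ..m                 yes                         C
Bottom left   1..1                ..m                 no                          NA
Bottom right  1..1                ..m                 yes                         C

Figure 8-11. Delete Rules and ER Model (4 of 8)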

Notes:
The cases on this visual consider 1:m relationship types. For simplicity, let us assume that
the maximum cardinality for the source is 1. (If the maximum cardinality for the target is 1
instead, just reverse the directions of the relationship type.) In this case, the tuple type for
the relationship type can be integrated or imbedded into the tuple type for the target.
Tables are created for the source tuple type and the extended target tuple type. As foreign
key, the table for the extended target tuple type contains the primary key of the table for the
source.
If the minimum source cardinality is 0, a relationship instance need not exist for a target
instance. Thus, if a source instance is deleted, the target does not prevent the deletion of
relationship instances for the source.
If the controlling property has not been specified for the target (top left case), this translates
into a delete rule of SET NULL for the referential constraint. CASCADE would be wrong
since it would delete the row for the target instance as well. NO ACTION or RESTRICT
would be too restrictive since it would prevent the deletion of the relationship instance.

If the controlling property has been specified (top right case), the corresponding target
instance should be deleted as well if a relationship instance is deleted. Thus, CASCADE is
the proper delete rule.
The bottom two cases deal with a minimum cardinality of 1 for the source of the relationship
type. Consequently, for each target instance, at least one relationship instance must exist.
For the bottom right case, the controlling property has been specified for the target
meaning that the target instance should be deleted as well if a relationship instance is
deleted. Thus, the minimum source cardinality of 1 does not block the deletion of the
relationship instance and CASCADE is the proper delete rule for the referential constraint.
For the bottom left case, the controlling property has not been specified for the target.
Accordingly, the deletion of a relationship instance should not cause the automatic deletion
of its target instance. Minimum cardinality 1 together with the absence of the controlling
property disallows the deletion of a source instance if this would leave a target instance
without a relationship instance. They require a delete rule of NO ACTION (or RESTRICT) for the
referential constraint: You do not allow the deletion of a row of the table for the source as
long as a row is dependent on it; otherwise, you could delete the row for the last
relationship instance for the target.
Note that this does not completely match the meaning of minimum cardinality 1 for the
source of the relationship type, but this is as close as you can get.
For 1:1 relationship types, the delete rules are determined in the same manner. The only
consideration that is different is that you have a choice for the relationship key. It can either
be the key of the source or the key of the target. Depending on the choice, the tuple type
for the relationship type can be integrated into the tuple type for the source or for the target.
The table for the tuple type into which the tuple type for the relationship type is integrated
contains the foreign key.
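The reasoning for the four 1:m cases can be condensed into the following minimal sketch. The function and parameter names are hypothetical; the sketch assumes that the maximum cardinality for the source is 1 and that the foreign key is kept in the table for the target.

```python
def one_to_many_delete_rule(min_source_card, target_controlling):
    """Delete rule for the referential constraint between the tables for
    the source (parent) and the target (dependent) of a 1:m relationship type.
    """
    if min_source_card not in (0, 1):
        raise ValueError("minimum source cardinality must be 0 or 1")
    if target_controlling:
        # The target instance is deleted together with its relationship instance.
        return "CASCADE"
    if min_source_card == 0:
        # The target instance may exist without a relationship instance.
        return "SET NULL"
    # The target instance must keep its relationship instance.
    return "NO ACTION"  # or RESTRICT
```

For example, `one_to_many_delete_rule(0, False)` corresponds to the top left case on the visual.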


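Delete Rules and ER Model (5 of 8)

r1 is an m:m relationship type with source A and target B; r2 has source r1 and
target C. Tables: TA, TB, TC, and TR2.

       Cardinalities for r1   Cardinalities for r2   Delete rules
Case   Source     Target      Source     Target      TA-TR2  TC-TR2  TB-TR2
Upper  m          m           m          1..m        C       NA      C
Lower  m          m           1..m       1..m        NA      NA      NA

Figure 8-12. Delete Rules and ER Model (5 of 8)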

Notes:
In Unit 6 - Tuple Types, we saw that tuple types must not be provided for m:m relationship
types being the source (target) of another relationship type with a minimum target (source)
cardinality of 1. We will study the delete rules for the appropriate cases.
On the current visual and the next four visuals, the m:m relationship type and the other
relationship type are called r1 and r2, respectively. The source and target of r1 are called A
and B. Without loss of generality, we assume that r1 is the source of r2. The target for r2 is
called C.
Furthermore, we assume for this visual and the next that the maximum source cardinality of
r2 is m; otherwise, the tuple type for r2 could be combined with the tuple type for its target.
As we saw in Unit 6 - Tuple Types, a tuple type for r1 must not be provided since the tuple
type for r2 accurately describes the relationship instances for r1. Therefore, tables must
only be created for A, B, C, and r2. Let us name them TA, TB, TC, and TR2, respectively.
To describe the relationship instances for r1, TR2 contains columns for the primary keys of
TA and TB. In addition, it contains columns for the primary key of TC.

Since the relationship key for r1 and the key of C are defining attributes for r2, the
appropriate columns of TR2 must not be nullable. They are even foreign keys of TR2
because their values must exist as primary key values in the respective parent tables.
For the cases on this visual, the minimum cardinality is 0 for both the source and the target
of r1. This means that the affiliated relationship instances can be deleted if a source or
target instance is deleted.
For the upper case on the visual, the minimum source cardinality for r2 is 0 implying that a
relationship instance need not exist for an instance of C. Consequently, the deletion of the
source or target instance for a relationship instance of r1 must remove the relationship
instance and any relationship instances of r2 for which it is the source. This translates into
delete rules of CASCADE for the referential constraints between TA and TR2 and TB and
TR2.
For the referential constraint between TC and TR2, the following considerations apply:
The minimum target cardinality of 1 for r2 requires an instance of r2 for each instance of r1.
Since the controlling property has not been specified for the source of r2, the delete rule
must be NO ACTION or RESTRICT; otherwise, the last row of TR2 for an instance of r1
could be deleted when a row of TC is deleted.
Again, note that delete rule NO ACTION (or RESTRICT) does not completely match the
meaning of the minimum target cardinality.
For the second case on the visual, the minimum source cardinality of r2 is 1. Therefore,
each instance of C requires an instance of r2. Since the controlling property has not been
specified for C, the delete rules for the referential constraints between TA and TR2 and TB
and TR2 must be NO ACTION or RESTRICT; otherwise, the last row of TR2 for an instance
of C could be deleted when rows of TA or TB are deleted.
Again, note that delete rule NO ACTION (or RESTRICT) does not completely match the
meaning of the minimum source cardinality.
If the controlling property for C were specified, delete rules of CASCADE could be used for
the referential constraints in conjunction with a trigger as outlined in the notes for Figure 8-10.
The trigger would need to delete the appropriate row in TC if a row of TR2 were deleted.
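The delete rules for the upper case on the visual (CASCADE for TA and TB, NO ACTION for TC) can be sketched as follows. The example assumes Python with SQLite as a stand-in for DB2; the column names, the single-column keys, and the primary key chosen for TR2 are assumptions made for illustration.

```python
import sqlite3

def tr2_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE ta (a_id TEXT PRIMARY KEY);
        CREATE TABLE tb (b_id TEXT PRIMARY KEY);
        CREATE TABLE tc (c_id TEXT PRIMARY KEY);
        -- No table for r1: the rows of TR2 describe its instances.
        CREATE TABLE tr2 (
            a_id TEXT NOT NULL REFERENCES ta (a_id) ON DELETE CASCADE,
            b_id TEXT NOT NULL REFERENCES tb (b_id) ON DELETE CASCADE,
            c_id TEXT NOT NULL REFERENCES tc (c_id) ON DELETE NO ACTION,
            PRIMARY KEY (a_id, b_id, c_id)   -- assumed relationship key of r2
        );
        INSERT INTO ta VALUES ('A1');
        INSERT INTO tb VALUES ('B1');
        INSERT INTO tc VALUES ('C1');
        INSERT INTO tr2 VALUES ('A1', 'B1', 'C1');
    """)
    # NO ACTION: a row of TC cannot be deleted while a row of TR2 depends on it.
    try:
        conn.execute("DELETE FROM tc WHERE c_id = 'C1'")
        tc_deleted = True
    except sqlite3.IntegrityError:
        tc_deleted = False
    # CASCADE: deleting the row of TA removes the dependent rows of TR2.
    conn.execute("DELETE FROM ta WHERE a_id = 'A1'")
    tr2_rows = conn.execute("SELECT COUNT(*) FROM tr2").fetchone()[0]
    return tc_deleted, tr2_rows
```

For the second case on the visual, the two CASCADE rules would be replaced by NO ACTION (or RESTRICT) as well.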


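Delete Rules and ER Model (6 of 8)

r1 is an m:m relationship type with source A and target B; r2 has source r1 and
target C. Tables: TA, TB, TC, and TR2.

       Cardinalities for r1   Cardinalities for r2   Delete rules
Case   Source     Target      Source     Target      TA-TR2  TC-TR2  TB-TR2
Upper  1..m       m           m          1..m        NA      NA      C
Lower  1..m       1..m        m          1..m        NA      NA      NA

Figure 8-13. Delete Rules and ER Model (6 of 8)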

Notes:
The cases on this visual illustrate the delete rules if the minimum cardinality of source or
target of r1 is 1.
If, for example, the minimum cardinality of the source of r1 is 1, a relationship instance
must exist for each target instance of r1. The deletion of a source instance must not cause
the deletion of the last relationship instance of r1 for a target instance. Consequently, the
delete rule for the referential constraint between TA and TR2 must be NO ACTION or
RESTRICT. (Again, with the caveat that this does not match completely the meaning of
minimum cardinality 1.)
For the second case, both minimum cardinalities of r1 are 1. As a consequence, the delete
rules between TA and TR2 and TB and TR2 must both be NO ACTION or RESTRICT.

Delete Rules and ER Model (7 of 8)

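[Visual: two cases, each showing the ER model (entity types A and B connected by the m:m relationship type r1; dependent entity type C connected to r1 by r2, cardinality 1..m) next to the tables TA, TB, and TC.
Case 1: C is marked D; delete rules shown: NA, NA.
Case 2: C is marked DC; delete rules shown: C, C.]

Figure 8-14. Delete Rules and ER Model (7 of 8)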

Notes:
As we saw in Unit 6 - Tuple Types, tuple types must not be provided for r1 and r2, if:
• r1 is an m:m relationship type,
• r2 is an owning relationship type whose source or target is r1, and
• the minimum cardinality for the dependent entity type is 1.
This is because the key of r1 is part of the key of the dependent entity type and a
dependent entity instance must exist for each instance of r1. Accordingly, tables are only
established for A, B, and C. They are called TA, TB, and TC, respectively. The primary key
of TC comprises foreign keys referring to TA and TB.
For the cases on the visual, the minimum cardinalities are 0 for the source and the target of
r1.
Reasoning similar to that for the previous visuals leads to delete rules of NO ACTION or
RESTRICT if the controlling property has not been specified for the dependent entity type.
The delete rules cannot be CASCADE since the deletion of rows of TA or TB would cause
the deletion of rows of TC. The delete rules cannot be SET NULL either. SET NULL could
cause instances for C without instances for r1 which is not allowed for dependent entity
types.
If the controlling property has been specified for the dependent entity type, both delete
rules must be CASCADE. If instances of A or B are deleted, affiliated relationship instances
of r1 should be deleted. Because of the controlling property for C, the associated
dependent entity instances are to be deleted as well.
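
As a minimal sketch, the second case on the visual (controlling property specified for C) might be declared as shown below. Only the table names TA, TB, and TC come from the notes; the column names, data types, and constraint names are assumptions made for illustration. Column C_NO is likewise an assumption, added to distinguish several instances of C that belong to the same instance of r1.

```sql
-- Hypothetical columns and constraint names; only TA, TB, TC are from the notes.
CREATE TABLE TA (A_ID INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE TB (B_ID INTEGER NOT NULL PRIMARY KEY);

-- TC covers r1, r2, and the dependent entity type C in a single table.
-- Its primary key contains the foreign keys referring to TA and TB.
CREATE TABLE TC (
    A_ID   INTEGER  NOT NULL,
    B_ID   INTEGER  NOT NULL,
    C_NO   SMALLINT NOT NULL,      -- assumed discriminator for instances of C
    C_DATA VARCHAR(100),           -- assumed nonkey attribute of C
    PRIMARY KEY (A_ID, B_ID, C_NO),
    -- Controlling property specified: deleting a row of TA or TB
    -- removes the dependent rows of TC as well.
    CONSTRAINT TC_TA FOREIGN KEY (A_ID) REFERENCES TA ON DELETE CASCADE,
    CONSTRAINT TC_TB FOREIGN KEY (B_ID) REFERENCES TB ON DELETE CASCADE
);
```

For the first case on the visual (controlling property not specified), both ON DELETE clauses would specify NO ACTION or RESTRICT instead.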

Delete Rules and ER Model (8 of 8)

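[Visual: two cases, each showing the ER model (entity types A and B connected by r1; dependent entity type C, marked DC, connected to r1 by r2, cardinality 1..m) next to the tables TA, TB, and TC.
Case 1: cardinalities of r1 are 1..m and m; delete rules shown: NA, C.
Case 2: cardinalities of r1 are 1..m and 1..m; delete rules shown: NA, NA.]

Figure 8-15. Delete Rules and ER Model (8 of 8)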

Notes:
This visual illustrates two cases for which the minimum cardinalities for source or target of
relationship type r1 are 1.
Let us discuss the first case for which the minimum cardinality of the source is 1. The
minimum cardinality of 1 for A blocks the deletion of the last relationship instance for an
instance of B if an instance of A is to be deleted. Thus, a delete rule of NO ACTION or
RESTRICT is appropriate for the referential constraint between TA and TC. (Again, the
caveat for the minimum cardinality of 1 applies.) Note that the delete rules are independent
of whether or not the controlling property has been specified for the dependent entity type:
In any case, you must block the deletion of the relationship instance for r1.
As a consequence of the controlling property for the dependent entity type, the delete rule
for the referential constraint between TB and TC must be CASCADE. If the controlling
property were not specified, the delete rule would be NO ACTION or RESTRICT.
For the second case on the visual, the minimum cardinalities are 1 for both ends of
relationship type r1. This leads to delete rules of NO ACTION or RESTRICT for both
referential constraints.


Delete Rules for an Imbed Case

[Visual: ER model with entity types A and B connected by the m:m relationship type r1, and C connected to r1 by r2 (cardinalities 1 and 1..m), next to the tables TA, TB, and TC. Delete rules shown: SN, SN.]

CREATE TRIGGER . . .
  AFTER UPDATE ON TC
  REFERENCING NEW AS N
  FOR EACH ROW MODE DB2SQL
  WHEN ( (N.foreign-key-TA IS NULL AND N.foreign-key-TB IS NOT NULL)
      OR (N.foreign-key-TB IS NULL AND N.foreign-key-TA IS NOT NULL) )
  BEGIN ATOMIC
    UPDATE TC
      SET foreign-key-TA = NULL, foreign-key-TB = NULL
      WHERE primary-key = N.primary-key;
  END

Figure 8-16. Delete Rules for an Imbed Case

Notes:
From the preceding discussions, we know that a tuple type is not required for relationship
type r1. Its relationship instances are completely described by the tuples for r2. However,
the source cardinality of 1 for r2 allows us to imbed the tuple type for r2 into the tuple type
for its target C.
Accordingly, tables are only established for A, B, and extended tuple type C. Let us call
them TA, TB, and TC, respectively. As the consequence of the imbedding, TC contains the
primary keys of TA and TB as foreign keys and the appropriate columns must be defined as
nullable.
If an instance of A or B is deleted, any relationship instances of r1 for it must be deleted as
well. In turn, the relationship instances of r2 being dependent on the deleted instances of r1
must be deleted. The cardinalities for r2 do not prevent the deletion of the instances of r2.
However, the target instances for the relationship instances (these are instances of C) must
not be deleted.
For the two referential constraints, this seems to translate into delete rules of SET NULL.
However, this is not quite sufficient. If a row of TA is deleted, only the references to it in TC are deleted.
Likewise, if a row of TB is deleted, only the references to it in TC are deleted. However, the
rows of TC describe relationship instances for r1 and not unrelated references to TA and
TB. A relationship instance consists of a pair of references and not a single reference.
Therefore, the references to TA and TB in a row of TC should be deleted at the same time.
You may decide not to care about a reference to TA or TB in TC if the other reference is
NULL. However, if you want to correctly implement relationship type r1, you need a trigger
synchronizing the foreign key columns when one of them is set to NULL. The trigger on the
visual achieves this.
The trigger is activated after a row of TC has been updated. The changing of foreign key
columns by the referential integrity support is considered an update of the row. Some
systems (e.g., DB2) do not consider it an update of the columns. Therefore, you should
not specify individual columns (UPDATE OF ...).
The WHEN clause ensures that the triggered action is executed only if one of the new
values for the foreign keys is NULL and the other is not. The triggered action, i.e., the
UPDATE statement, sets both foreign key values for a row to NULL.
In the trigger, the placeholders foreign-key-TA and foreign-key-TB denote the foreign
key columns in TC referring to TA and TB, respectively. The REFERENCING clause
enables us to refer to the new values of the updated rows of TC.
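
With concrete names substituted for the placeholders, the tables and the trigger might look as follows. This is a sketch only: the column names A_ID, B_ID, and C_ID, the data types, and the names TC_TA, TC_TB, and TC_SYNC_FK are assumptions and do not come from the course material. Because the trigger body contains a semicolon, the script assumes a statement terminator other than the semicolon (here @).

```sql
-- Assumed to be run with @ as statement terminator,
-- for example: db2 -td@ -f script.sql
CREATE TABLE TA (A_ID INTEGER NOT NULL PRIMARY KEY)@
CREATE TABLE TB (B_ID INTEGER NOT NULL PRIMARY KEY)@

-- TC holds the imbedded tuple type for r2: two nullable foreign keys.
CREATE TABLE TC (
    C_ID INTEGER NOT NULL PRIMARY KEY,   -- hypothetical primary key of C
    A_ID INTEGER,                        -- reference to TA, nullable
    B_ID INTEGER,                        -- reference to TB, nullable
    CONSTRAINT TC_TA FOREIGN KEY (A_ID) REFERENCES TA ON DELETE SET NULL,
    CONSTRAINT TC_TB FOREIGN KEY (B_ID) REFERENCES TB ON DELETE SET NULL
)@

-- Keeps the pair of references in step: if exactly one of them
-- has become NULL, the other one is set to NULL as well.
CREATE TRIGGER TC_SYNC_FK
    AFTER UPDATE ON TC
    REFERENCING NEW AS N
    FOR EACH ROW MODE DB2SQL
    WHEN ( (N.A_ID IS NULL AND N.B_ID IS NOT NULL)
        OR (N.B_ID IS NULL AND N.A_ID IS NOT NULL) )
    BEGIN ATOMIC
        UPDATE TC
           SET A_ID = NULL, B_ID = NULL
         WHERE C_ID = N.C_ID;
    END@
```

The UPDATE issued by the trigger activates the trigger again, but the WHEN condition is then false because both columns are NULL, so no further action is taken.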


Delete Connection

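[Visual: T1 is the parent of T2 and T4 (delete rule CASCADE for both); T2 is the parent of T3 (CASCADE); T3 and T4 are both parents of T (NO ACTION for both). The two constraints involving T are annotated "Must be the same and not SET NULL". T is delete-connected to T1.]

Figure 8-17. Delete Connection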

Notes:
A table T is delete-connected to a table T1 if the deletion of a row of T1 may require
(immediate or indirect) accesses to T. For example, for the deletion of a row of T1, it may
be necessary to determine if T contains rows with a foreign key value equal to the primary
key value of the deleted row.
In the example on the visual, T is delete-connected to T1 over two paths of referential
constraints:
• Let us first consider the left path of referential constraints. Delete rule CASCADE
between tables T1 and T2 may cause the deletion of rows of T2 if a row is deleted from
T1. Because of delete rule CASCADE between T2 and T3, this may, in turn, cause the
deletion of rows from T3. Delete rule NO ACTION between T3 and T requires that table
T is checked for matching foreign key values to determine if the rows of T3 can be
deleted.
Thus, T is delete-connected to T1 via the left path. Of course, T2 and T3 are also
delete-connected to T1.

• T is also delete-connected to T1 via the right path of referential constraints: Delete rule
CASCADE between tables T1 and T4 may cause the deletion of rows of T4 if a row is
deleted from T1. Delete rule NO ACTION between T4 and T requires that table T is
checked for matching foreign key values to determine if the rows of T4 can be deleted.
Of course, T4 is also delete-connected to T1.
For delete-connected tables, the following restriction applies:
If T is delete-connected to T1 via multiple paths with different referential constraints for T,
then the delete rules for the referential constraints involving T must be the same and must
not be SET NULL.
Otherwise, the result of the deletion of a row from T1 would depend on the sequence in
which the database management system processes the various paths. The relational data
model requires, however, that the result be independent of the sequence chosen. Also, the
number of possible sequences could be so large that checking whether the result is the
same for all of them is impractical.
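
The structure on the visual could be declared roughly as follows. The table names T1 to T4 and T and the delete rules are taken from the visual; all column and constraint names and the data types are assumptions introduced for this sketch.

```sql
CREATE TABLE T1 (ID INTEGER NOT NULL PRIMARY KEY);

-- Left path: T1 -> T2 -> T3, both CASCADE
CREATE TABLE T2 (
    ID    INTEGER NOT NULL PRIMARY KEY,
    T1_ID INTEGER NOT NULL,
    CONSTRAINT T2_T1 FOREIGN KEY (T1_ID) REFERENCES T1 ON DELETE CASCADE
);
CREATE TABLE T3 (
    ID    INTEGER NOT NULL PRIMARY KEY,
    T2_ID INTEGER NOT NULL,
    CONSTRAINT T3_T2 FOREIGN KEY (T2_ID) REFERENCES T2 ON DELETE CASCADE
);

-- Right path: T1 -> T4, CASCADE
CREATE TABLE T4 (
    ID    INTEGER NOT NULL PRIMARY KEY,
    T1_ID INTEGER NOT NULL,
    CONSTRAINT T4_T1 FOREIGN KEY (T1_ID) REFERENCES T1 ON DELETE CASCADE
);

-- T is delete-connected to T1 over both paths. The two constraints
-- involving T therefore use the same delete rule, and not SET NULL.
CREATE TABLE T (
    ID    INTEGER NOT NULL PRIMARY KEY,
    T3_ID INTEGER,
    T4_ID INTEGER,
    CONSTRAINT T_T3 FOREIGN KEY (T3_ID) REFERENCES T3 ON DELETE NO ACTION,
    CONSTRAINT T_T4 FOREIGN KEY (T4_ID) REFERENCES T4 ON DELETE NO ACTION
);
```

With these definitions, deleting a row of T1 fails whenever T still contains rows referring to one of the rows of T3 or T4 that the cascade would remove.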


Referential Cycles

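[Visual: a cycle of referential constraints involving tables T1, T2, and T3, annotated "At least two must not be CASCADE", and a self-referencing constraint, annotated "Must be NO ACTION or CASCADE". Delete rules appearing on the visual: CASCADE, NO ACTION, and SET NULL.]

Figure 8-18. Referential Cycles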

Notes:
A referential cycle is a sequence of referential constraints leading back to the same table.
The visual shows two cycles: First, it illustrates a cycle consisting of multiple referential
constraints involving tables T1, T2, and T3. Then, it illustrates a cycle just involving a single
table, i.e., a self-referencing constraint.
For referential cycles, the following restrictions apply:
1. In a referential cycle of two or more tables, the tables must not be delete-connected to
themselves.
This implies that at least two of the delete rules must not be CASCADE.
2. The delete rule for a self-referencing constraint must be CASCADE or NO ACTION. It
cannot be RESTRICT or SET NULL.
In both cases, violating the restriction would make the result of operations deleting multiple
rows from a table depend on the sequence in which the rows are deleted. The relational data
model requires the results to be independent of the processing sequences of the rows.

Note that the table of a self-referencing constraint is always delete-connected to itself.
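
The following sketch shows one way to declare a cycle and a self-referencing constraint that satisfy both restrictions. The assignment of the delete rules to the individual constraints, the separate table TS for the self-referencing constraint, and all column and constraint names are assumptions made for illustration; they are not taken from the visual.

```sql
-- Cycle T1 -> T2 -> T3 -> T1 with only one CASCADE rule,
-- so that no table is delete-connected to itself through the cycle.
CREATE TABLE T1 (ID INTEGER NOT NULL PRIMARY KEY, T3_ID INTEGER);

CREATE TABLE T2 (
    ID    INTEGER NOT NULL PRIMARY KEY,
    T1_ID INTEGER,
    CONSTRAINT T2_T1 FOREIGN KEY (T1_ID) REFERENCES T1 ON DELETE CASCADE
);

CREATE TABLE T3 (
    ID    INTEGER NOT NULL PRIMARY KEY,
    T2_ID INTEGER,
    CONSTRAINT T3_T2 FOREIGN KEY (T2_ID) REFERENCES T2 ON DELETE NO ACTION
);

-- The constraint closing the cycle can only be added after T3 exists.
ALTER TABLE T1
    ADD CONSTRAINT T1_T3 FOREIGN KEY (T3_ID) REFERENCES T3 ON DELETE SET NULL;

-- Self-referencing constraint: CASCADE or NO ACTION only.
CREATE TABLE TS (ID INTEGER NOT NULL PRIMARY KEY, OWNER_ID INTEGER);

ALTER TABLE TS
    ADD CONSTRAINT TS_SELF FOREIGN KEY (OWNER_ID) REFERENCES TS ON DELETE CASCADE;
```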


Definition of Referential Constraints

Parent Table
    PRIMARY KEY( pk-column-1, pk-column-2, . . . )

Dependent Table
    CONSTRAINT constraint-name
    FOREIGN KEY( fk-column-1, fk-column-2, . . . )
    REFERENCES parent-table
    ON DELETE delete-rule
    ON UPDATE update-rule

Figure 8-19. Definition of Referential Constraints

Notes:
If you want to use your database management system for enforcing a referential constraint
between two tables, you must define the referential constraint to the database
management system. A referential constraint concerns two tables: the parent table and the
dependent table.
Assuming that the parent key of the parent table is the primary key (as we do), you must
define which columns form the primary key for the table. If the primary key is a composite
key, you must define the sequence of the columns for the primary key. This sequence is
relevant for the foreign key of the dependent table. The referential constraint itself must be
defined for the dependent table. First, you must specify the columns of the foreign key.
They must be specified in the same sequence as the corresponding primary key columns.
Next, you must specify the parent table. In addition, you must specify the delete rule and, if
your database management system allows it and gives you a choice, the update rule for
the referential constraint.
You may also give the referential constraint a name. The name can be used to delete the
constraint again if it is no longer needed.
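
The following sketch illustrates these steps for a composite key. The table names PARENT_T and DEPENDENT_T, all column names, the data types, and the constraint name DEP_PARENT are hypothetical. The ON UPDATE clause is included only for illustration; whether it is accepted, and with which rules, depends on the database management system.

```sql
-- Parent table: the column sequence of the primary key is K1, K2.
CREATE TABLE PARENT_T (
    K1 CHAR(3)  NOT NULL,
    K2 SMALLINT NOT NULL,
    PRIMARY KEY (K1, K2)
);

-- Dependent table: the foreign key columns are listed in the same
-- sequence as the corresponding primary key columns. Their names
-- may differ from those of the primary key columns.
CREATE TABLE DEPENDENT_T (
    D_ID  INTEGER  NOT NULL PRIMARY KEY,
    P_K1  CHAR(3)  NOT NULL,
    P_K2  SMALLINT NOT NULL,
    CONSTRAINT DEP_PARENT
        FOREIGN KEY (P_K1, P_K2)
        REFERENCES PARENT_T
        ON DELETE NO ACTION
        ON UPDATE NO ACTION
);

-- The constraint name allows the constraint to be removed later.
ALTER TABLE DEPENDENT_T DROP FOREIGN KEY DEP_PARENT;
```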

Referential Integrity - Documentation

For each referential constraint, add to dependent table:

Constraint Name:      Name for referential constraint. Must be unique for
                      table. Should be unique for application domain
Foreign Key Columns:  Ordered list of columns for foreign key of referential
                      constraint
Parent Table:         Name of parent table
Delete Rule:          NO ACTION, RESTRICT, SET NULL, or CASCADE
Update Rule:          NO ACTION or RESTRICT
Constraint Number:    A unique number for the referential constraint.
                      Used to identify the constraint in referential structures

Figure 8-20. Referential Integrity - Documentation

Notes:
The documentation for a referential constraint should be added to the documentation for
the dependent table. For each referential constraint, provide the following information:
• A name for the referential constraint. You should name each referential constraint. The
name must be unique per dependent table, but we suggest making it unique for the
application domain.
• An ordered list of the columns making up the foreign key. The order of the columns
must match the order of the corresponding primary key columns. The names can be
different.
• The name of the parent table, i.e., the table containing the corresponding primary key.
• The delete rule for the referential constraint, i.e., NO ACTION, RESTRICT, SET NULL,
or CASCADE.
• If your database management system gives you a choice, the update rule for the
referential constraint. In most cases, this will be NO ACTION or RESTRICT.

• A unique constraint number. You should give each referential constraint of the
application domain a unique number. This number is not needed for defining the
referential constraint to the target database management system. It is used to identify
the constraint in referential structures (described later in this topic).
A referential structure provides an overview of the referential constraints for the
application domain or for a subset thereof. It shows how the tables for the application
domain are interconnected by referential constraints. The constraint number in the
referential structure serves as reference to the documentation for the referential
constraint. It can be used to find details about the referential constraint.

Maintenance View - Updated ER Model
[Visual: updated entity-relationship diagram of the Maintenance View. Entity types: MANUFACTURER, AIRCRAFT TYPE, AIRCRAFT MODEL, AIRCRAFT, SEAT, ENGINE, ENGINE LOCATION, MECHANIC, and MAINTENANCE RECORD. The relationship types between them are shown with their cardinalities and with the markers D, C, and DC.]

Figure 8-21. Maintenance View - Updated ER Model

Notes:
In Unit 4 - Entity-Relationship Model, we established the initial Maintenance View for our
sample airline company called Come Aboard. This visual shows an update of the
Maintenance View. It includes the changes caused by normalization.
The major changes are:
• Dependent entity type SEAT has been added as a result of normalization (First Normal
Form).
• Entity types ENGINE and ENGINE LOCATION and relationship types
ENGINE_on_AIRCRAFT and ENGINE_on_AIRCRAFT_in_ENGINE LOCATION have
been added due to normalization (First Normal Form).
• Entity type MANUFACTURER and relationship types
AIRCRAFT TYPE_from_MANUFACTURER and ENGINE_from_MANUFACTURER
have been added due to normalization (Third Normal Form).
On the visual, the relationship types for which tuple types and, thus, tables must not be
established have been grayed out.


Referential Structure
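[Visual: referential structure for the Maintenance View. Tables: MANUFACTURER, AIRCRAFT_TYPE, AIRCRAFT_MODEL, AIRCRAFT, SEAT, ENGINE, MECHANIC, MECHANIC_FOR_AM, MECHANIC_FOR_AC, and MAINTENANCE_RECORD. The arrows carry the constraint numbers 1 to 12 and the delete rules: NA for constraints 1, 2, 3, 5, and 11; C for constraints 4, 7, 8, 9, 10, and 12; SN for constraint 6.]

Figure 8-22. Referential Structure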

Notes:
A referential structure is a graphical representation of the referential constraints for an
application domain. It gives an overview of the referential constraints, not a detailed
description.
The referential structure for the entire application domain may not fit onto a single page
with the consequence that it must be split into subsets fitting onto a page. On the visual, we
have concentrated on the subset corresponding to the (updated) Maintenance View for our
sample airline company called Come Aboard.
The referential structure contains rectangles for all tables of the considered subset. The
rectangles contain the names of the tables. A referential constraint between two tables is
represented by a single-headed or double-headed arrow leading from the parent table to
the dependent table. A single-headed arrow is used if a primary key value can occur at
most once in the dependent table. A double-headed arrow is used if a primary key value
can occur more than once in the dependent table. (Note that a foreign key value can occur
only once in the parent table.)

Next to the arrowhead and next to the dependent table, the delete rule for the referential
constraint is specified. The abbreviations NA, R, SN, and C are used for NO ACTION,
RESTRICT, SET NULL, and CASCADE, respectively.
A little square with a number is placed on the arrow to identify the referential constraint. We
have already talked about that number, referred to as constraint number, in conjunction with
the documentation of referential constraints. The constraint number identifies the
documentation for the referential constraint being part of the documentation for the
dependent table. You could think of using the constraint name instead, but the constraint
name is generally too long and clumsy for use in diagrams.
You can also add a constraint summary, in the form of a listing or table (see Figure 8-23),
providing the names of the tables involved and the foreign key columns.
As mentioned before, the referential structure for the entire application domain may not fit
onto a single page with the consequence that it must be split into subsets fitting onto a
page. If possible, the subsets should correspond to the submodels you established for the
entity-relationship model of the application domain. Otherwise, proceed in the same
manner as for the entity-relationship model and establish referential (sub)structures for
autonomous subareas or different views of the application domain. Only if they will not fit
onto a single page, establish referential (sub)structures for sets of tables logically belonging
together and fitting onto a page.
Generally, the referential (sub)structures of the various pages will overlap. Some tables and
referential constraints will occur on multiple pages. The (sub)structures must not conflict
with each other. Together, they must cover all referential constraints for the application
domain.
Now, let us discuss the referential structure for the Maintenance View of Come Aboard in
more detail:
• Tables must be established for all entity types of the Maintenance View with the
exception of ENGINE LOCATION. Its tuple type together with the tuple type for
relationship type ENGINE_on_AIRCRAFT could be imbedded in the tuple type for
ENGINE.
Furthermore, tables must be established for all m:m relationship types, i.e., for
MECHANIC_trained_for_AIRCRAFT MODEL and
MECHANIC_scheduled_for_AIRCRAFT. The appropriate tables have been called
MECHANIC_FOR_AM and MECHANIC_FOR_AC, respectively.
Tables are not needed for any 1:m relationship types. Their tuple types can be combined
with tuple types for their source or target.
• Since the tuple type for relationship type AIRCRAFT TYPE_from_MANUFACTURER
has been imbedded into the tuple type for AIRCRAFT TYPE, table AIRCRAFT_TYPE
has a foreign key referring to table MANUFACTURER.
Because of minimum cardinality 1 for entity type MANUFACTURER and the absence of
the controlling property for AIRCRAFT TYPE, the delete rule for the referential constraint
must be NO ACTION. (An aircraft type must always have a manufacturer.)
• Because AIRCRAFT MODEL is a dependent entity type, table AIRCRAFT_MODEL
contains, as foreign key, the primary key of table AIRCRAFT_TYPE. Since the
controlling property has not been specified for the dependent entity type, the delete rule
must be NO ACTION. (An aircraft model must always have an aircraft type.)
• Since the tuple type for relationship type AIRCRAFT MODEL_for_AIRCRAFT has been
imbedded into the tuple type for AIRCRAFT, table AIRCRAFT has a foreign key referring
to table AIRCRAFT_MODEL.
Because of minimum cardinality 1 for entity type AIRCRAFT MODEL and the absence of
the controlling property for AIRCRAFT, the delete rule for the referential constraint must
be NO ACTION. (An aircraft must always have an aircraft model.)
• Because SEAT is a dependent entity type, table SEAT contains, as foreign key, the
primary key of table AIRCRAFT.
Since the controlling property has been specified for the dependent entity type, the
delete rule must be CASCADE. (If the aircraft is removed, information about the seats
on the aircraft need no longer be kept.)
• Since the tuple type for relationship type ENGINE_from_MANUFACTURER has been
imbedded into the tuple type for ENGINE, table ENGINE has a foreign key referring to
table MANUFACTURER.
Because of minimum cardinality 1 for entity type MANUFACTURER and the absence of
the controlling property for ENGINE, the delete rule for the referential constraint must be
NO ACTION. (An engine must always have a manufacturer.)
• As we mentioned before, the tuple types for entity type ENGINE LOCATION and
relationship type ENGINE_on_AIRCRAFT have been imbedded into the tuple type for
ENGINE. Therefore, table ENGINE has a foreign key referring to table AIRCRAFT.
Because the minimum cardinality of AIRCRAFT is 0 for relationship type
ENGINE_on_AIRCRAFT, the delete rule must be SET NULL. (An engine need not be
mounted on an aircraft.)
• Relationship type MECHANIC_trained_for_AIRCRAFT MODEL is an m:m relationship
type. Therefore, table MECHANIC_FOR_AM has foreign keys referring to tables
AIRCRAFT_MODEL and MECHANIC, respectively.
Since the minimum cardinalities for both ends of the relationship type are 0, both delete
rules must be CASCADE. (The relationship between an aircraft model and a mechanic
can be deleted if either one is "deleted".)
• Relationship type MECHANIC_scheduled_for_AIRCRAFT is an m:m relationship type.
Therefore, table MECHANIC_FOR_AC has foreign keys referring to tables AIRCRAFT
and MECHANIC, respectively.
Since the minimum cardinalities for both ends of the relationship type are 0, both delete
rules must be CASCADE. (The relationship between an aircraft and a mechanic can be
deleted if either one is "deleted".)
• Since the tuple type for relationship type MAINTENANCE RECORD_from_MECHANIC
has been imbedded into the tuple type for MAINTENANCE RECORD, table
MAINTENANCE_RECORD has a foreign key referring to table MECHANIC.
Because of minimum cardinality 1 for entity type MECHANIC and the absence of the
controlling property for MAINTENANCE RECORD, the delete rule for the referential
constraint must be NO ACTION. (A maintenance record must always have a mechanic.)
• Since the tuple type for relationship type
MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD has been imbedded
into the tuple type for MAINTENANCE RECORD, table MAINTENANCE_RECORD has
a self-referencing constraint.
Because of the controlling property for the target end of the relationship type, the delete
rule for the referential constraint must be CASCADE. (A maintenance record should be
thrown away if its owning maintenance record is deleted.)
If you assumed that the controlling property had not been specified, the delete rule
should be SET NULL. (If the owning maintenance record is deleted, the dependent
records are kept, but their references to the owning record are reset.) However, the
restrictions for self-referencing constraints would not allow us to choose SET NULL as
delete rule. We would have to choose NO ACTION because CASCADE, the other
alternative, would delete the dependent records.
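
As an illustration, constraints 5, 6, 9, and 10 of the referential structure might be defined as sketched below. The table names, the foreign key column names, and the delete rules follow the discussion above. The data types, the primary key column Engine_Number, the constraint names, and the reduction of the parent tables to their key columns are assumptions made to keep the sketch short.

```sql
-- Parent tables, reduced to their (assumed) primary key columns.
CREATE TABLE MANUFACTURER (Manufacturer_Code CHAR(4) NOT NULL PRIMARY KEY);
CREATE TABLE AIRCRAFT     (Aircraft_Number   INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE MECHANIC     (Employee_Number   INTEGER NOT NULL PRIMARY KEY);

CREATE TABLE ENGINE (
    Engine_Number     INTEGER NOT NULL PRIMARY KEY,  -- hypothetical key
    Manufacturer_Code CHAR(4) NOT NULL,
    Aircraft_Number   INTEGER,      -- nullable: engine need not be mounted
    -- Constraint 5: an engine must always have a manufacturer.
    CONSTRAINT RC05_ENGINE_MANUF FOREIGN KEY (Manufacturer_Code)
        REFERENCES MANUFACTURER ON DELETE NO ACTION,
    -- Constraint 6: removing the aircraft resets the reference.
    CONSTRAINT RC06_ENGINE_AC FOREIGN KEY (Aircraft_Number)
        REFERENCES AIRCRAFT ON DELETE SET NULL
);

-- Table for m:m relationship type MECHANIC_scheduled_for_AIRCRAFT.
CREATE TABLE MECHANIC_FOR_AC (
    Aircraft_Number INTEGER NOT NULL,
    Employee_Number INTEGER NOT NULL,
    PRIMARY KEY (Aircraft_Number, Employee_Number),
    -- Constraints 9 and 10: the relationship disappears with either end.
    CONSTRAINT RC09_MFAC_AC FOREIGN KEY (Aircraft_Number)
        REFERENCES AIRCRAFT ON DELETE CASCADE,
    CONSTRAINT RC10_MFAC_MECH FOREIGN KEY (Employee_Number)
        REFERENCES MECHANIC ON DELETE CASCADE
);
```

Including the constraint number in the constraint name, as done here, is only one possible convention for linking the definitions to the referential structure.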


Referential Structure - Constraint Summary

No.  Dependent Table      Parent Table         Foreign Key

 1   AIRCRAFT_TYPE        MANUFACTURER         Manufacturer_Code
 2   AIRCRAFT_MODEL       AIRCRAFT_TYPE        Type_Code
 3   AIRCRAFT             AIRCRAFT_MODEL       1: Type_Code, 2: Model_Number
 4   SEAT                 AIRCRAFT             Aircraft_Number
 5   ENGINE               MANUFACTURER         Manufacturer_Code
 6   ENGINE               AIRCRAFT             Aircraft_Number
 7   MECHANIC_FOR_AM      AIRCRAFT_MODEL       1: Type_Code, 2: Model_Number
 8   MECHANIC_FOR_AM      MECHANIC             Employee_Number
 9   MECHANIC_FOR_AC      AIRCRAFT             Aircraft_Number
10   MECHANIC_FOR_AC      MECHANIC             Employee_Number
11   MAINTENANCE_RECORD   MECHANIC             Employee_Number
12   MAINTENANCE_RECORD   MAINTENANCE_RECORD   Owning_Record

Figure 8-23. Referential Structure - Constraint Summary

Notes:
This visual shows the constraint summary for the Maintenance View. The line numbers
match the numbers for the constraints. For each constraint, the dependent table, the parent
table, and the foreign key columns are listed. The numbers in front of foreign key columns
specify their sequence in the foreign key.

8.2 Other Types of Integrity


Domain Integrity

Domain integrity = Correctness of values and domains for columns

• Handled by abstract data types
  - Including values for abstract data type
  - Including domains for data elements
  - Including restrictions for lengths
• Ingredients required
  - User defined distinct types
  - User defined functions
  - Triggers
  - Check Constraints
• Discussed during previous unit

Figure 8-24. Domain Integrity

Notes:
Domain integrity, also referred to as value integrity, deals with the correctness of values in
columns of tables.
A column must only assume values allowed by the abstract data type for the data element
associated with the column. Furthermore, the values must adhere to the domain
specifications (restrictions) for the column's data element. They must also observe length
requirements or restrictions for the data element such as the minimum length, the
maximum length, the number of digits, or the number of decimal places. The length
requirements may have been expressed by parameters for the abstract data type for the
data element.
For example, column Type_Code for table AIRCRAFT must only assume 3-letter codes for
valid airports. Column Number_of_Engines for table AIRCRAFT_TYPE must only assume
integer values between 0 and 4. Column Last_Name of table EMPLOYEE may only
assume values of abstract data type NAMEDATA and must not be longer than 50
characters.

The ingredients required to ensure domain integrity are user defined distinct types, user
defined functions, triggers, and check constraints.
Since the implementation of domain integrity is closely related to the implementation of
abstract data types described in Unit 7 - From Tuple Types to Tables, we need not discuss
it further.
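
As a brief reminder, two of the examples above might be expressed as follows. This is a sketch under assumptions: the base type chosen for NAMEDATA, the data types, the key columns, and the constraint name are not taken from the course material, and the definition of NAMEDATA in Unit 7 may differ.

```sql
-- User defined distinct type; the base type VARCHAR(50) is an assumption
-- that also expresses the maximum length of 50 characters.
CREATE DISTINCT TYPE NAMEDATA AS VARCHAR(50) WITH COMPARISONS;

CREATE TABLE EMPLOYEE (
    Employee_Number INTEGER  NOT NULL PRIMARY KEY,
    Last_Name       NAMEDATA NOT NULL
);

-- Check constraint restricting the domain of Number_of_Engines.
CREATE TABLE AIRCRAFT_TYPE (
    Type_Code         CHAR(3)  NOT NULL PRIMARY KEY,
    Number_of_Engines SMALLINT NOT NULL,
    CONSTRAINT ENGINES_0_TO_4
        CHECK (Number_of_Engines BETWEEN 0 AND 4)
);
```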


Redundancy Integrity
Redundancy Integrity

• Violations of 2nd and 3rd Normal Forms
  - Do not allow end users to maintain tables directly through SQL DML statements
  - Provide front-ends with proper SQL DML statements ensuring integrity
    Update all rows concerned at the same time
    On insert, copy existing redundant information
  - Can use triggers to maintain integrity for update operations
• Multiple Copies of Data
  - If copies must be consistent at all times
    Use triggers to ensure consistency of copies
  - If copies need not be consistent at all times
    Disallow inserts, updates, or deletes for copies
    Provide new versions periodically
• Derivable Data
  - Use triggers and user defined functions to derive data on updates, inserts, and deletes
  - Alternative is not to store derivable data and to derive them on retrieval

Figure 8-25. Redundancy Integrity

Notes:
Redundancy integrity deals with the redundant storage of information in tables.
There are three major causes for the redundant storage of data:
• Violations of the Second Normal Form or Third Normal Form lead to the redundant
storage of data in the same table. Nonkey columns are solely dependent on columns
that are not primary key columns (Third Normal Form) or only on some of the primary
key columns (Second Normal Form). To ensure consistency, the dependent columns
must have the same values for all rows having the same values for the columns on
which the dependency exists.
For this type of redundancy, you can maintain integrity as follows:
- Do not allow end users to maintain the tables concerned directly through SQL Data
Manipulation Language (DML) statements (INSERT, UPDATE, or DELETE) via
dynamic SQL.
- Instead, provide front-ends with attractive user interfaces for the business processes
concerned. In the front-ends, use the proper program logic and SQL statements to
ensure the consistency of the redundant data.
For update operations, if redundant information is changed, all rows having the same
values for the columns on which the functional dependency exists must be changed at
the same time.
If new rows are inserted, copy the redundant information from existing rows already
containing the information rather than having the end user enter the information
again. The end user must only provide the information the first time around, i.e., when
the information does not yet exist.
Even though this does not make the information inconsistent, you may want to
prevent the deletion of the last row for a value of the columns on which the functional
dependency exists. However, you only need to do this if the redundant information is
still needed.
- If your database management system supports triggers, you may be able to use
triggers to ensure consistency of the redundant information for update operations.
The next visual illustrates what such a trigger should look like.
• Multiple copies of the same data are a second cause for redundant information. For
performance reasons, you may have decided to:
- repeat columns in other tables
- provide multiple copies of entire tables
If the information in the various tables must be consistent at all times, you can use
triggers to enforce the consistency.
Frequently, if you have provided multiple copies of entire tables, one of the tables is the
master table and must be up-to-date at all times. The copies are only used for reference
purposes and need not be up-to-date at all times. In this case, you should disallow
inserts, updates, or deletes for the copies. In addition, you may want to provide new
versions (refreshes) of the copies periodically.
• Redundancy can also be caused by stored data that can be derived from other stored
data. Data that can be derived from other data is referred to as derivable data.
For our sample airline company, all seats for an aircraft have a row in table SEAT and the
number of seats on the aircraft is the number of rows in the table. Thus, the number of
seats can be derived from the information in table SEAT. To avoid scanning table SEAT
every time you need the number of seats, you may prefer to store the number of seats in
the rows for the aircraft in AIRCRAFT. If the seat arrangement for an aircraft changes
and you forget to update the appropriate row in table AIRCRAFT, the derivable data
becomes wrong.
If your database management system supports triggers, you can use triggers to
maintain the correctness of derivable data. Whenever data affecting the derivable data
are changed, a trigger must reevaluate and store the derivable data. The triggers
achieving this for the number of seats for our sample airline company are illustrated in
Figure 8-27.
An alternative is not to store derivable data and to derive them every time they are
needed. Which way is better depends on the usage profiles of your business processes.
Most of the time, retrieval operations are much more frequent than insert, update, or
delete operations (80-20 rule) and are more performance-critical. In that case, storing
the derivable data and maintaining them with triggers is preferable.
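The alternative of deriving the data on retrieval can be sketched as follows. This is a minimal illustration using Python's sqlite3 module rather than the DB2 dialect of the visuals; the column Seat_Number and the sample rows are assumptions made for the example.

```python
import sqlite3


def seats_per_aircraft(conn):
    """Derive the number of seats on retrieval instead of storing it."""
    rows = conn.execute(
        """
        SELECT a.Aircraft_Number, COUNT(s.Seat_Number)
        FROM AIRCRAFT AS a
        LEFT JOIN SEAT AS s ON s.Aircraft_Number = a.Aircraft_Number
        GROUP BY a.Aircraft_Number
        """
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AIRCRAFT (Aircraft_Number TEXT PRIMARY KEY);
    -- Seat_Number is a hypothetical column name for this sketch.
    CREATE TABLE SEAT (
        Aircraft_Number TEXT NOT NULL,
        Seat_Number     TEXT NOT NULL,
        PRIMARY KEY (Aircraft_Number, Seat_Number)
    );
    INSERT INTO AIRCRAFT VALUES ('A1'), ('A2');
    INSERT INTO SEAT VALUES ('A1', '1A'), ('A1', '1B'), ('A2', '1A');
    """
)

# No stored count exists, so it cannot become wrong; every retrieval pays
# the cost of reading SEAT instead.
derived = seats_per_aircraft(conn)
```

Because nothing is stored redundantly, a changed seat arrangement is reflected by the next retrieval without any trigger.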

Violation of Normal Forms - Trigger

CREATE TRIGGER . . .
   AFTER UPDATE OF dependent-column-1,
                   dependent-column-2,
                   ...
   ON table-name
   REFERENCING NEW AS N OLD AS O
   FOR EACH ROW MODE DB2SQL
   WHEN (
      N.dependent-column-1 <> O.dependent-column-1 OR
      N.dependent-column-2 <> O.dependent-column-2 OR
      ...
   )
   BEGIN ATOMIC
      UPDATE table-name              -- same table as in the ON clause
      SET dependent-column-1 = N.dependent-column-1,
          dependent-column-2 = N.dependent-column-2,
          ...
      WHERE reference-column = N.reference-column AND
            primary-key <> N.primary-key ;
   END

Figure 8-26. Violation of Normal Forms - Trigger CF182.0

Notes:
As we discussed, violations of the Second Normal Form or Third Normal Form lead to the
redundant storage of data in the same table. Nonkey columns are solely dependent on
columns that are not primary key columns (Third Normal Form) or only on some of the
primary key columns (Second Normal Form). To ensure consistency, the dependent
columns must have the same values for all rows having the same values for the columns
on which the dependency exists.
The visual illustrates how a trigger maintaining the integrity of the redundant information for
update operations should look. In the visual the dependent columns are called
dependent-column-1, dependent-column-2, and so on.
Furthermore, to simplify matters, it is assumed that the columns are dependent on a single
column and that the primary key for the table is not composite; otherwise, additional AND
operators would be needed in the WHERE clause. The column the dependent columns are
functionally dependent on is called reference-column on the visual.
The trigger is activated on update requests for the table violating the Normal Form. It is
executed for each updated row if the UPDATE statement assigns a value to any of the
dependent columns, i.e., the columns containing the redundant information
(UPDATE OF ...). However, the triggered action is only performed if the value of at least
one of the dependent columns has actually changed (WHEN clause).
The triggered action changes the same table as the table being updated. It changes the
values of the dependent columns of rows other than the row being updated (primary-key <>
N.primary-key) to the new values for the updated row (SET ... dependent-column-n =
N.dependent-column-n). It only changes rows that have the same value as the updated
row for the column the dependent columns are functionally dependent on
(reference-column = N.reference-column).
You can easily see the importance of the REFERENCING clause in this case because we
need to refer to three different states for a column: the state before the row was updated,
the state after the row has been updated, and the column for the rows being updated by the
triggered action.
Because the triggered action updates the same columns for the same table, the trigger is
invoked recursively. Thus, without the proper precautions, looping could occur. The search
condition of the WHEN clause prevents an endless recursion because all dependent
columns will have the same old and new value after some iterations.
Depending on how the iterations are performed by your database management system,
you may experience a serious performance degradation when using the trigger!
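
The pattern of the visual can be tried out with a small sketch. It uses Python's sqlite3 module, so the syntax differs from DB2: SQLite has no REFERENCING clause and uses NEW and OLD directly. Table EMPLOYEE and its columns are hypothetical and only serve as an example of a Third Normal Form violation (Department_Name depends on Department_Number). Note also that SQLite does not invoke triggers recursively unless PRAGMA recursive_triggers is switched on, so the recursion discussed above does not occur in this sketch.

```python
import sqlite3

# Hypothetical table: Department_Name depends only on Department_Number,
# not on the primary key Employee_Number.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (
        Employee_Number   INTEGER PRIMARY KEY,
        Department_Number INTEGER NOT NULL,
        Department_Name   TEXT    NOT NULL
    );

    CREATE TRIGGER SYNCDEPT
    AFTER UPDATE OF Department_Name ON EMPLOYEE
    FOR EACH ROW
    WHEN NEW.Department_Name <> OLD.Department_Name
    BEGIN
        UPDATE EMPLOYEE
        SET Department_Name = NEW.Department_Name
        WHERE Department_Number = NEW.Department_Number
          AND Employee_Number <> NEW.Employee_Number;
    END;

    INSERT INTO EMPLOYEE VALUES
        (1, 10, 'Maintenance'),
        (2, 10, 'Maintenance'),
        (3, 20, 'Flight Operations');
    """
)

# Change the redundant value in one row only; the trigger propagates it to
# the other rows of department 10 and leaves department 20 alone.
conn.execute(
    "UPDATE EMPLOYEE SET Department_Name = 'Hangar' WHERE Employee_Number = 1"
)
department_names = dict(
    conn.execute("SELECT Employee_Number, Department_Name FROM EMPLOYEE")
)
```

After the update, both rows of department 10 carry the new name although the UPDATE statement addressed only one of them.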

Derivable Data - Sample Triggers
In table AIRCRAFT, maintain number of seats on aircraft (Number_of_Seats)

CREATE TRIGGER ADDSEAT
   AFTER INSERT ON SEAT
   REFERENCING NEW AS N
   FOR EACH ROW MODE DB2SQL
   BEGIN ATOMIC
      UPDATE AIRCRAFT
      SET Number_of_Seats = Number_of_Seats + 1
      WHERE Aircraft_Number = N.Aircraft_Number;
   END

CREATE TRIGGER DELSEAT
   AFTER DELETE ON SEAT
   REFERENCING OLD AS O
   FOR EACH ROW MODE DB2SQL
   BEGIN ATOMIC
      UPDATE AIRCRAFT
      SET Number_of_Seats = Number_of_Seats - 1
      WHERE Aircraft_Number = O.Aircraft_Number;
   END

Figure 8-27. Derivable Data - Sample Triggers CF182.0

Notes:
The example on the visual illustrates how the integrity of stored derivable data can be
maintained by means of triggers.
For our sample airline company, all seats for an aircraft have a row in table SEAT and the
number of seats on the aircraft is the number of rows in the table. Thus, the number of
seats can be derived from the information in table SEAT. To avoid scanning table SEAT
each time the number of seats is needed, the number of seats for an aircraft is also kept in
table AIRCRAFT. The appropriate column is Number_of_Seats and must be maintained as
seats are added or deleted for an aircraft.
The first trigger (ADDSEAT) is activated each time a seat is added to table SEAT. For each
seat added, it increases the number of seats in the row for the aircraft to which the seat
belongs. This requires that column Number_of_Seats was initialized to zero when the row
for the aircraft was inserted into table AIRCRAFT (default values).
Note that the row for an aircraft must exist before seats can be added to the aircraft. Also
note that each row in table SEAT contains the serial number for the aircraft to which the
seat belongs.

The second trigger (DELSEAT) is activated each time a seat is deleted from table SEAT.
For each seat deleted for an aircraft, it decreases the number of seats in the row for the
aircraft in table AIRCRAFT.
For both triggers, the REFERENCING clause is needed to be able to refer to the aircraft
number for the seat added or deleted, respectively.
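
The two triggers can be exercised in a small sketch. It uses Python's sqlite3 module, whose trigger syntax differs from the DB2 syntax on the visual (NEW and OLD instead of a REFERENCING clause, no MODE DB2SQL). The column Seat_Number and the sample data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AIRCRAFT (
        Aircraft_Number TEXT PRIMARY KEY,
        Number_of_Seats INTEGER NOT NULL DEFAULT 0   -- must start at zero
    );
    -- Seat_Number is a hypothetical column name for this sketch.
    CREATE TABLE SEAT (
        Aircraft_Number TEXT NOT NULL,
        Seat_Number     TEXT NOT NULL,
        PRIMARY KEY (Aircraft_Number, Seat_Number)
    );

    CREATE TRIGGER ADDSEAT AFTER INSERT ON SEAT FOR EACH ROW
    BEGIN
        UPDATE AIRCRAFT
        SET Number_of_Seats = Number_of_Seats + 1
        WHERE Aircraft_Number = NEW.Aircraft_Number;
    END;

    CREATE TRIGGER DELSEAT AFTER DELETE ON SEAT FOR EACH ROW
    BEGIN
        UPDATE AIRCRAFT
        SET Number_of_Seats = Number_of_Seats - 1
        WHERE Aircraft_Number = OLD.Aircraft_Number;
    END;
    """
)


def stored_seats(aircraft_number):
    """Return the stored (derivable) seat count for an aircraft."""
    return conn.execute(
        "SELECT Number_of_Seats FROM AIRCRAFT WHERE Aircraft_Number = ?",
        (aircraft_number,),
    ).fetchone()[0]


# The aircraft row must exist before seats are added.
conn.execute("INSERT INTO AIRCRAFT (Aircraft_Number) VALUES ('A1')")
conn.executemany(
    "INSERT INTO SEAT VALUES ('A1', ?)", [("1A",), ("1B",), ("1C",)]
)
after_inserts = stored_seats("A1")

conn.execute("DELETE FROM SEAT WHERE Aircraft_Number = 'A1' AND Seat_Number = '1C'")
after_delete = stored_seats("A1")
```

The stored value follows the number of rows in SEAT without any application code touching table AIRCRAFT.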

Constraint Integrity

Ensures that business constraints are observed

Use triggers and user defined functions

OR
Do not allow end users to maintain tables
concerned by SQL DML statements
Provide proper front-ends to end users ensuring
that business constraints are observed

Sometimes, other items help such as unique indexes

Figure 8-28. Constraint Integrity CF182.0

Notes:
The data in the tables are also not correct if they violate business constraints (business
rules) for the application domain. This may be a rule as simple as that an employee cannot
be a pilot and a mechanic at the same time. It may also be a more complex rule such as
that a mechanic can only be assigned to the maintenance of an aircraft if he/she has been
trained for the appropriate aircraft model. Constraint integrity requires that the business
constraints for the application domain are observed.
In Unit 3 - Problem Statement, business constraints were discussed as part of the problem
statement for the application domain and the information to be provided for them was
listed. In Unit 4 - Entity-Relationship Model, it was described how business constraints are
represented in the entity-relationship model.
Some business constraints are expressed by basic modeling constructs in the
entity-relationship model. For example, the controlling property is really a business
constraint. In some cases, the controlling property translates into delete rule CASCADE for
a referential constraint. In other cases, it does not, for example, if it is specified for an
m:m relationship type, as we saw in the discussion of referential integrity. It
then has to be handled in the same manner as other business constraints.
Of course, the business constraints must be translated into constraints for the tables of the
application domain. During the discussions about referential constraints, we saw that the
controlling property could be implemented by means of triggers in some cases. In many
cases, triggers, possibly in conjunction with user defined functions, are indeed the means
for implementing business constraints within the database management system.
If your database management system does not support triggers or you do not want to use
them, you can enforce business constraints by:
• Not allowing end users to maintain the tables concerned by directly using INSERT,
UPDATE, or DELETE statements.
• Providing proper front-ends to the end users that ensure that the business constraints
are observed.
Sometimes, other functions of the database management system, such as unique indexes,
can enforce a business constraint. A unique index ensures that the set of columns for which
it is defined contains every value only once. Indexes will be discussed in a later unit.
A unique index would solve the business constraint we modeled in Unit 4 -
Entity-Relationship Model that, for a flight, each pilot function (CAPTAIN or COPILOT) must
only be assigned once. The business constraint translated into the requirement that the
combined values for attributes Flight Number, From, To, Flight Locator, and Pilot Function
of entity type PILOT ASSIGNMENT must be unique. The resulting implementation is a
unique index for the appropriate columns in table PILOT_ASSIGNMENT.
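
The effect of such a unique index can be sketched as follows, using Python's sqlite3 module. The column names are assumptions: From and To are reserved words in SQL, so the sketch uses the hypothetical names From_Airport and To_Airport, and the sample values (including the flight locator 'L1') are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PILOT_ASSIGNMENT (
        Flight_Number   TEXT    NOT NULL,
        From_Airport    TEXT    NOT NULL,
        To_Airport      TEXT    NOT NULL,
        Flight_Locator  TEXT    NOT NULL,
        Pilot_Function  TEXT    NOT NULL,
        Employee_Number INTEGER NOT NULL
    );
    -- Each pilot function may be assigned only once per flight.
    CREATE UNIQUE INDEX PILOT_FUNCTION_IX ON PILOT_ASSIGNMENT
        (Flight_Number, From_Airport, To_Airport, Flight_Locator, Pilot_Function);
    """
)


def assign(row):
    """Try to insert an assignment; return False if the index rejects it."""
    try:
        conn.execute("INSERT INTO PILOT_ASSIGNMENT VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


captain = assign(("CA100", "FRA", "JFK", "L1", "CAPTAIN", 4711))
copilot = assign(("CA100", "FRA", "JFK", "L1", "COPILOT", 4712))
second_captain = assign(("CA100", "FRA", "JFK", "L1", "CAPTAIN", 4713))
```

The first captain and the copilot are accepted; the attempt to assign a second captain to the same flight is rejected by the index without any trigger or front-end logic.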

Constraint Integrity - Example 1
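{ 4 : New maintenance record only for existing aircraft }

[Diagram: in the entity-relationship model, business constraint {4} connects entity types
AIRCRAFT and MAINTENANCE RECORD; in the referential structure, it connects tables
AIRCRAFT and MAINTENANCE_RECORD.]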

CREATE TRIGGER MRECORD
   NO CASCADE BEFORE INSERT ON MAINTENANCE_RECORD
   REFERENCING NEW AS N
   FOR EACH ROW MODE DB2SQL
   WHEN ( NOT EXISTS
          ( SELECT Aircraft_Number
            FROM AIRCRAFT
            WHERE Aircraft_Number = N.Aircraft_Number )
        )
   BEGIN ATOMIC
      SIGNAL SQLSTATE '72002'
         ('AIRCRAFT FOR MAINTENANCE RECORD DOES NOT EXIST');
   END

Figure 8-29. Constraint Integrity - Example 1 CF182.0

Notes:
As we have seen in Unit 4 - Entity-Relationship Model, maintenance records for Come
Aboard include the serial number of the aircraft the maintenance was performed for. The
business constraints for Come Aboard state that maintenance records must be kept even if
the aircraft is no longer owned by CAB. Even though the record for the aircraft no longer
exists, the maintenance records must still contain the serial number for the aircraft they
were established for. When a maintenance record is established, a record for the aircraft
must exist.
Because the aircraft number of maintenance records may point to aircraft no longer owned
by CAB, we could not model the interrelationship as a relationship type. Instead, we
introduced a business constraint between entity types AIRCRAFT and MAINTENANCE
RECORD: When an instance is added to entity type MAINTENANCE RECORD, entity type
AIRCRAFT must contain an instance for the aircraft the maintenance record is established
for.
As the entity types are converted into tuple types and into tables, the constraint must be
translated into an equivalent constraint for the tables: When a row is added to table
MAINTENANCE_RECORD, table AIRCRAFT must contain a row for the aircraft the
maintenance record is established for.
Because column Aircraft_Number of table MAINTENANCE_RECORD may contain aircraft
numbers not contained in table AIRCRAFT, Aircraft_Number is not a foreign key of table
MAINTENANCE_RECORD. Therefore, we cannot use referential constraints to ensure that
the aircraft for new maintenance records exist.
However, we can use a trigger to ensure the existence of the aircraft for new maintenance
records as illustrated by the bottom portion of the visual. Before a row is inserted into table
MAINTENANCE_RECORD, the WHEN clause of the trigger checks if the aircraft number
for the row exists in table AIRCRAFT. If it does not exist, a nonzero SQL state is raised
causing the INSERT statement to terminate.
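
The behavior of the trigger can be sketched with Python's sqlite3 module. SQLite has no SIGNAL SQLSTATE statement; the sketch uses SQLite's RAISE(ABORT, ...) function instead, which likewise terminates the INSERT statement. The column Record_Number and the sample data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AIRCRAFT (Aircraft_Number TEXT PRIMARY KEY);
    -- Aircraft_Number is deliberately NOT a foreign key here.
    CREATE TABLE MAINTENANCE_RECORD (
        Record_Number   INTEGER PRIMARY KEY,
        Aircraft_Number TEXT NOT NULL
    );

    CREATE TRIGGER MRECORD BEFORE INSERT ON MAINTENANCE_RECORD FOR EACH ROW
    WHEN NOT EXISTS (SELECT 1 FROM AIRCRAFT
                     WHERE Aircraft_Number = NEW.Aircraft_Number)
    BEGIN
        SELECT RAISE(ABORT, 'AIRCRAFT FOR MAINTENANCE RECORD DOES NOT EXIST');
    END;
    """
)


def add_record(record_number, aircraft_number):
    """Insert a maintenance record; return the error text if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO MAINTENANCE_RECORD VALUES (?, ?)",
            (record_number, aircraft_number),
        )
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)


conn.execute("INSERT INTO AIRCRAFT VALUES ('A1')")
accepted = add_record(1, "A1")      # aircraft exists
rejected = add_record(2, "ZZ")      # aircraft does not exist

# The aircraft is sold later; its maintenance record must be kept.
conn.execute("DELETE FROM AIRCRAFT WHERE Aircraft_Number = 'A1'")
records_kept = conn.execute(
    "SELECT COUNT(*) FROM MAINTENANCE_RECORD WHERE Aircraft_Number = 'A1'"
).fetchone()[0]
```

The sketch also shows why a referential constraint would not fit: the aircraft row can be deleted while its maintenance records remain.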

Constraint Integrity - Example 2 (1 of 2)

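{ 5 : Only trained mechanics for aircraft maintenance }

[Diagram: the entity-relationship model shows entity types MECHANIC, AIRCRAFT MODEL, and
AIRCRAFT with relationship types MECHANIC_trained_for_AIRCRAFT MODEL,
AIRCRAFT MODEL_for_AIRCRAFT, and MECHANIC_scheduled_for_AIRCRAFT; constraint {5}
applies to MECHANIC_scheduled_for_AIRCRAFT. The referential structure shows tables
MECHANIC, AIRCRAFT_MODEL, AIRCRAFT, MECHANIC_FOR_AM, and MECHANIC_FOR_AC;
constraint {5} applies to MECHANIC_FOR_AC.]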

Figure 8-30. Constraint Integrity - Example 2 (1 of 2) CF182.0

Notes:
In Unit 4 - Entity-Relationship Model, we also modeled the business constraint that
mechanics must only be scheduled for the maintenance of an aircraft if they have been
trained for the aircraft model. The business constraint applies to relationship type
MECHANIC_scheduled_for_AIRCRAFT. As input, it has relationship types
AIRCRAFT MODEL_for_AIRCRAFT and MECHANIC_trained_for_AIRCRAFT MODEL.
When translated into a constraint for tables, it applies to table MECHANIC_FOR_AC which
contains a row for each mechanic scheduled for the maintenance of an aircraft. Tables
AIRCRAFT and MECHANIC_FOR_AM are input for the constraint. Note that table
AIRCRAFT has, as foreign key, columns Type_Code and Model_Number specifying the
aircraft model for the various aircraft. Therefore, table AIRCRAFT_MODEL is not needed
as input for the business constraint.

Constraint Integrity - Example 2 (2 of 2)

CREATE TRIGGER MEFORAC
   NO CASCADE BEFORE INSERT ON MECHANIC_FOR_AC
   REFERENCING NEW AS N
   FOR EACH ROW MODE DB2SQL
   WHEN
      ( NOT EXISTS
        ( SELECT Employee_Number
          FROM MECHANIC_FOR_AM AS M
          JOIN AIRCRAFT AS AC
            ON AC.Type_Code = M.Type_Code AND
               AC.Model_Number = M.Model_Number
          WHERE AC.Aircraft_Number = N.Aircraft_Number AND
                M.Employee_Number = N.Employee_Number )
      )
   BEGIN ATOMIC
      SIGNAL SQLSTATE '72003'
         ('MECHANIC NOT TRAINED FOR AIRCRAFT MODEL');
   END

Figure 8-31. Constraint Integrity - Example 2 (2 of 2) CF182.0

Notes:
The visual illustrates the trigger for the business constraint discussed on the previous
visual. The trigger ensures that mechanics are scheduled for the maintenance of an aircraft
only if they have been trained for the appropriate aircraft model. The trigger achieves this
as follows:
• In the WHEN clause, it joins tables MECHANIC_FOR_AM and AIRCRAFT on columns
Type_Code and Model_Number. Each row of the intermediate result contains, for an
aircraft, the employee number of an employee that has been trained for the aircraft
model for the aircraft.
The WHERE clause extracts the rows for the aircraft number and the employee number
of the row to be inserted into table MECHANIC_FOR_AC. The NOT EXISTS predicate
determines if such rows were found. The result is true if rows were not found and false if
rows were found, i.e., the mechanic has been trained for the aircraft model.
• If the WHEN clause is true, i.e., if the mechanic has not been trained for the aircraft
model for the aircraft, the triggered action is performed.

The triggered action signals a nonzero SQL state that causes the INSERT statement for
table MECHANIC_FOR_AC to terminate. If the mechanic has been trained for the
aircraft model, the WHEN clause is false, the triggered action is not performed, and the
row can be inserted.
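
The check performed by the trigger can be sketched with Python's sqlite3 module. As an adaptation, SQLite's RAISE(ABORT, ...) replaces SIGNAL SQLSTATE, and NEW replaces the correlation name N. The tables contain only the columns needed for the constraint, and the sample data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AIRCRAFT (
        Aircraft_Number TEXT PRIMARY KEY,
        Type_Code       TEXT NOT NULL,
        Model_Number    TEXT NOT NULL
    );
    CREATE TABLE MECHANIC_FOR_AM (      -- mechanic trained for aircraft model
        Employee_Number INTEGER NOT NULL,
        Type_Code       TEXT    NOT NULL,
        Model_Number    TEXT    NOT NULL
    );
    CREATE TABLE MECHANIC_FOR_AC (      -- mechanic scheduled for aircraft
        Employee_Number INTEGER NOT NULL,
        Aircraft_Number TEXT    NOT NULL
    );

    CREATE TRIGGER MEFORAC BEFORE INSERT ON MECHANIC_FOR_AC FOR EACH ROW
    WHEN NOT EXISTS (
        SELECT 1
        FROM MECHANIC_FOR_AM AS M
        JOIN AIRCRAFT AS AC
          ON AC.Type_Code = M.Type_Code AND AC.Model_Number = M.Model_Number
        WHERE AC.Aircraft_Number = NEW.Aircraft_Number
          AND M.Employee_Number = NEW.Employee_Number
    )
    BEGIN
        SELECT RAISE(ABORT, 'MECHANIC NOT TRAINED FOR AIRCRAFT MODEL');
    END;

    INSERT INTO AIRCRAFT VALUES ('A1', 'B747', '400'), ('A2', 'A320', '200');
    INSERT INTO MECHANIC_FOR_AM VALUES (7, 'B747', '400');
    """
)


def schedule(employee_number, aircraft_number):
    """Try to schedule a mechanic; return False if the trigger rejects it."""
    try:
        conn.execute(
            "INSERT INTO MECHANIC_FOR_AC VALUES (?, ?)",
            (employee_number, aircraft_number),
        )
        return True
    except sqlite3.DatabaseError:
        return False


trained = schedule(7, "A1")            # mechanic 7 is trained for B747-400
wrong_model = schedule(7, "A2")        # not trained for A320-200
untrained = schedule(8, "A1")          # mechanic 8 has no training at all
```

Only the first assignment is stored; the other two are rejected before the row reaches table MECHANIC_FOR_AC.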

Constraint Integrity - Example 3 (1 of 2)


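{ 1 : Number of engines for aircraft <= Number of engines for aircraft type }

[Diagram: the entity-relationship model (left) shows entity types AIRCRAFT TYPE,
AIRCRAFT MODEL, AIRCRAFT, and ENGINE; constraint {1} is drawn between AIRCRAFT TYPE
and AIRCRAFT. The referential structure (right) shows tables AIRCRAFT_TYPE,
AIRCRAFT_MODEL, AIRCRAFT, and ENGINE; constraint {1} applies to ENGINE, with
AIRCRAFT_TYPE and AIRCRAFT as input.]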

Figure 8-32. Constraint Integrity - Example 3 (1 of 2) CF182.0

Notes:
Another business constraint for our sample airline company was that the number of
engines mounted on an aircraft must not be larger than the number of engines for the
aircraft type.
The left-hand portion of the visual repeats that, in the entity-relationship model, the
business constraint is modeled as a constraint between entity types AIRCRAFT TYPE and
AIRCRAFT. The aircraft type must be the one for the aircraft whose number of engines is
matched against the number of engines that can be mounted. Therefore, in principle,
relationship types AIRCRAFT MODEL_for_AIRCRAFT and
AIRCRAFT TYPE_for_AIRCRAFT MODEL are also input for the constraint. This could have
been indicated by dashed lines in the entity-relationship model. However, since it is
self-evident, it has been omitted to avoid cluttering the entity-relationship model.
When translating the business constraint into a constraint for the tables of the application
domain, it becomes a constraint between tables AIRCRAFT_TYPE and ENGINE. Again, a
similar remark applies: To reach the proper aircraft type for the aircraft on which an
engine is to be mounted, you must navigate from ENGINE via the referential constraints to
table AIRCRAFT_TYPE. Since entity type AIRCRAFT MODEL is a dependent entity type of
AIRCRAFT TYPE, you can go directly from table AIRCRAFT to table AIRCRAFT_TYPE.
To illustrate this, we have also shown table AIRCRAFT as input for the constraint in the
referential structure on the right-hand side of the visual.

Constraint Integrity - Example 3 (2 of 2)

CREATE TRIGGER ADDENGIN
   NO CASCADE BEFORE INSERT ON ENGINE
   REFERENCING NEW AS N
   FOR EACH ROW MODE DB2SQL
   WHEN (
      ( SELECT COUNT(*)
        FROM ENGINE
        WHERE Aircraft_Number = N.Aircraft_Number )
      >=
      ( SELECT Number_of_Engines
        FROM AIRCRAFT_TYPE AS AT
        JOIN AIRCRAFT AS AC
          ON AC.Type_Code = AT.Type_Code
        WHERE AC.Aircraft_Number = N.Aircraft_Number )
   )
   BEGIN ATOMIC
      SIGNAL SQLSTATE '72004' ('TOO MANY ENGINES FOR AIRCRAFT');
   END

Figure 8-33. Constraint Integrity - Example 3 (2 of 2) CF182.0

Notes:
The implementation of the business constraint that the number of engines mounted on an
aircraft must not exceed the number of engines for the aircraft type requires two triggers: a
trigger controlling insert operations and a trigger controlling update operations. The visual
illustrates the trigger for the insert operations:
• The trigger is activated each time a row is added to table ENGINE. It is activated before
the row is inserted and checks if the new engine violates the constraint.
• The appropriate check is made in the WHEN clause.
The first SELECT statement counts the number of engines for the aircraft for the engine
being added.
The second SELECT statement joins tables AIRCRAFT and AIRCRAFT_TYPE on
column Type_Code. The intermediate result contains, for each aircraft, the number of
engines for its aircraft type.
The SELECT statement further extracts the number of engines for the aircraft type of the
aircraft to which the new engine is to be added.

The results of the two SELECT statements are compared with each other.
• The triggered action is only performed if the WHEN condition evaluates to true, i.e., if the
aircraft already has as many engines as allowed for its type, so that the new engine
would exceed the limit. In this case, a nonzero SQL state is signaled which causes the
INSERT statement to terminate.
If the WHEN condition evaluates to false or unknown, the triggered action is not
performed and the new row can be added to table ENGINE.
The trigger for update operations looks the same except that the name for the trigger must
be different and the second line must read:
NO CASCADE BEFORE UPDATE ON ENGINE
Because column Aircraft_Number of table ENGINE can also be changed by the delete rule
of the referential constraint between tables AIRCRAFT and ENGINE, you should not
specify UPDATE OF Aircraft_Number ON ENGINE.
A further note of caution: Because of current restrictions, the trigger may not work on all
database management systems.
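
The insert trigger can be tried out in a small sketch based on Python's sqlite3 module. It is an adaptation, not the DB2 trigger itself: RAISE(ABORT, ...) replaces SIGNAL SQLSTATE, NEW replaces N, and the correlation name T replaces AT. The column Serial_Number and the sample data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AIRCRAFT_TYPE (
        Type_Code         TEXT PRIMARY KEY,
        Number_of_Engines INTEGER NOT NULL
    );
    CREATE TABLE AIRCRAFT (
        Aircraft_Number TEXT PRIMARY KEY,
        Type_Code       TEXT NOT NULL
    );
    CREATE TABLE ENGINE (
        Serial_Number   TEXT PRIMARY KEY,
        Aircraft_Number TEXT              -- NULL: engine not mounted
    );

    CREATE TRIGGER ADDENGIN BEFORE INSERT ON ENGINE FOR EACH ROW
    WHEN (SELECT COUNT(*) FROM ENGINE
          WHERE Aircraft_Number = NEW.Aircraft_Number)
         >=
         (SELECT T.Number_of_Engines
          FROM AIRCRAFT_TYPE AS T
          JOIN AIRCRAFT AS AC ON AC.Type_Code = T.Type_Code
          WHERE AC.Aircraft_Number = NEW.Aircraft_Number)
    BEGIN
        SELECT RAISE(ABORT, 'TOO MANY ENGINES FOR AIRCRAFT');
    END;

    INSERT INTO AIRCRAFT_TYPE VALUES ('A320', 2);
    INSERT INTO AIRCRAFT VALUES ('A1', 'A320');
    """
)


def mount(serial_number, aircraft_number):
    """Try to add an engine row; return False if the trigger rejects it."""
    try:
        conn.execute(
            "INSERT INTO ENGINE VALUES (?, ?)", (serial_number, aircraft_number)
        )
        return True
    except sqlite3.DatabaseError:
        return False


first = mount("E1", "A1")
second = mount("E2", "A1")
third = mount("E3", "A1")      # the type allows only two engines
spare = mount("E9", None)      # WHEN condition is unknown, so the row is added
```

The spare engine illustrates the remark about an unknown WHEN condition: with a null aircraft number, the second subquery returns no value, the comparison is unknown, and the triggered action is not performed.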

Checkpoint

Exercise — Unit Checkpoint


1. Name the four basic types of integrity that must be maintained for a
database.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. What is a foreign key?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

3. The order and meaning of the parent-key and foreign-key columns
must be the same, not their names. (T/F)

4. Match the following terms with the proper definitions:

a. Parent table
b. Dependent table
c. Self-referencing table
d. Referential constraint
e. Referential integrity for a referential constraint

____ All foreign key values have a matching parent key value.
____ The table containing the foreign key.
____ A correlation between a parent key and a foreign key.
____ The table containing the parent key.
____ A table containing both the parent key and the foreign key for a
     referential constraint.

5. Describe the difference between delete rules NO ACTION and
RESTRICT.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

6. Delete rule CASCADE causes the deletion of dependent rows.
(T/F)

7. Which are the update rules for referential constraints supported by
most database management systems?
a. NO ACTION.
b. RESTRICT.
c. SET NULL.
d. CASCADE.

8. Assume that the delete rule for a referential constraint is
CASCADE. Describe a case for which the deletion of a parent row
would still fail.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

9. Assume that the controlling property has been specified for the
source of an m:m relationship type. How can you ensure that the
row for a source instance is deleted if the row for an affiliated
relationship instance is deleted?
_____________________________________________________
_____________________________________________________
_____________________________________________________

10. When is table T delete-connected to table T1?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

11. If T is delete-connected to T1 via multiple referential paths with
different referential constraints for T, then the delete rules for the
referential constraints involving T must be the same and must be
CASCADE. (T/F)

12. A self-referencing constraint is a referential cycle. (T/F)

13. A self-referencing table is delete-connected to itself. (T/F)

14. What are the restrictions for referential cycles?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

15. Referential constraints must be defined for the parent table. (T/F)

16. What is the purpose of a referential structure?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

17. The arrow representing a referential constraint in a referential
structure points from the parent table to the dependent table. (T/F)

18. What is the meaning of a double-headed arrow in a referential
structure?
_____________________________________________________
_____________________________________________________
_____________________________________________________

19. What does domain integrity mean for your tables?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

20. Name two causes for the redundancy of data.


_____________________________________________________
_____________________________________________________
_____________________________________________________

21. The updating of redundant information for violations of the Third
Normal Form cannot be controlled by triggers. (T/F)

22. How can you ensure that derivable data are always correct?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

23. What is required to achieve constraint integrity?


_____________________________________________________
_____________________________________________________
_____________________________________________________

24. What are the main ingredients for achieving constraint integrity?
_____________________________________________________
_____________________________________________________
_____________________________________________________

Unit Summary

Referential integrity requires that all foreign key values have matching
parent key values
The delete rules for referential constraints are NO ACTION, RESTRICT,
SET NULL, and CASCADE
The update rules for referential constraints supported by most systems are NO
ACTION and RESTRICT
There exist restrictions for delete-connected tables and referential cycles
A referential structure provides an overview of the referential constraints for
the application domain or a subset thereof
Domain integrity requires the correctness of the values of the columns for
the tables of the application domain
Redundancy integrity requires the consistency of redundant information
Constraint integrity requires the observance of the business constraints for
the application domain
For achieving redundancy or constraint integrity, triggers can be
used (if necessary, in conjunction with user defined functions)

Figure 8-34. Unit Summary CF182.0

Notes:


Unit 9. Indexes

What This Unit Is About


This unit describes the structure and purpose of indexes and
discusses for which columns of the tables indexes should be
established from a database design perspective.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Describe the basic structure of indexes.
• Explain the various options for indexes.
• Describe for which columns you should establish indexes.

How You Will Check Your Progress


Accountability:
• Checkpoint questions
• Exercises


Unit Objectives

After completion of this unit, you should be able to:

Describe the basic structure of indexes

Explain the various options for indexes

Describe for which columns you should establish indexes

Figure 9-1. Unit Objectives CF182.0

Notes:
Conceptually, the tables established can be implemented without indexes. However, in this
case, accessing a row may mean searching the entire table for the row and may be very
time-consuming and expensive. Indexes present a means for directly accessing specific
rows and are needed to ensure performance.
In this unit, we will describe the basic structure of indexes and demonstrate how they are
used for directly accessing a row. Furthermore, we will talk about various options for (forms
of) indexes such as unique or nonunique indexes.
In addition, we will discuss for which columns, from a database design perspective, you
should establish indexes. We will not talk about the usage of indexes from the
business-process perspective. The requirements of the business processes for indexes
depend on their usage patterns and may change in the course of time. Therefore, indexes
for business processes should be established as and when needed and dropped when
they are no longer needed.
The database management systems generally provide means for analyzing queries to
determine the need for and effectiveness of indexes.


9.1 Structure, Options, and Usage of Indexes


Indexes in Design Process


[Visual: design-process diagram with the elements Problem Statement, Entity Relationship
Model, Process Inventory, Data Inventory, Tuple Types, Logical Data Structures, Tables,
Integrity Rules, and Indexes, arranged by Conceptual View, Logical View, and Storage View.]

Figure 9-2. Indexes in Design Process CF182.0

Notes:
This unit deals with the establishment of indexes for the tables of the application domain.
Therefore, it follows the establishment of the tables. Because the referential integrity
support of most database management systems requires indexes, the establishment of
indexes even follows the establishment of the integrity rules. It is the last step of the
storage view.

Purpose of an Index

To improve performance in cases in which the locating of a row would require the
scanning of the rows

Searching for B737 600

A340 200 A300 600 B737 300 A320 200 B737 600 B777 200 B747 400

Data Pages AIRCRAFT_MODEL

Indexes allow direct access to rows


Indexes provide for logical sequential processing
rather than physical sequential processing
Starting/ending with a selected row (e.g., BETWEEN)
Indexes can avoid internal sorting

Figure 9-3. Purpose of an Index CF182.0

Notes:
The main purpose of an index is to improve performance in cases in which, otherwise, the
rows of the table would have to be scanned for locating a row. The visual illustrates this for
table AIRCRAFT_MODEL for our sample airline company called Come Aboard. Without an
index, when searching for an aircraft model, the data pages with the rows of the table must
be retrieved and scanned until the model has been found.
If the row is not contained in the table or multiple rows may exist for the same search
criterion, all rows for the table must be inspected. As you can see, this may require a lot of
pages (blocks) to be read and, as a consequence, a lot of I/O operations and may be very
expensive. The situation can be remedied by an index.
Indexes allow the database management system to directly access individual rows rather
than having to scan the rows of the table.
As we will see on the next visual, indexes logically order the rows of the table according to
the columns to which they apply. Per se, they do not order the rows physically even though
they may be used to ensure that the physical order corresponds to the logical order as
closely as possible.

The logical order of an index allows, without sorting, the rows of the table to be processed
in that logical order rather than in their physical order. Combining the logical order with the
direct-access capability, an index allows you to start logical sequential processing at a
specific row and/or end it with a specific row. In particular, this supports the BETWEEN
predicate for SQL queries.
As already indicated, by using the logical ordering of an index, the database management
system may be able to avoid internal sorting of the rows retrieved. In particular, this may be
the case for SELECT statements using ORDER BY, GROUP BY, or DISTINCT.
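
The two capabilities described above, direct access and logical sequential processing, can be illustrated with a minimal Python sketch. It is not how a database management system implements an index: a sorted list searched with the standard `bisect` module stands in for the index, and the names `rows`, `index`, `lookup`, and `between` are assumptions made for this illustration. The sample keys are those of table AIRCRAFT_MODEL on the visual.

```python
from bisect import bisect_left, bisect_right

# Rows in their physical (unsorted) order, as on the data pages of the visual.
rows = [("A340", 200), ("A300", 600), ("B737", 300), ("A320", 200),
        ("B737", 600), ("B777", 200), ("B747", 400)]

# Stand-in for the index: key values in ascending logical order, each with a
# pointer to its row (here simply the position of the row in `rows`).
index = sorted((key, pos) for pos, key in enumerate(rows))
keys = [key for key, _ in index]


def lookup(key):
    """Direct access: return the position of the row for key, or None."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return index[i][1]
    return None


def between(low, high):
    """Logical sequential processing from low through high (inclusive)."""
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [rows[pos] for _, pos in index[start:end]]
```

A call such as `between(("A320", 200), ("B737", 300))` returns the rows in key order without sorting the table, which is the effect the notes describe for BETWEEN and ORDER BY.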

Structure of Indexes
Index Key = (Type_Code, Model_Number)

Root Page B737 300 B777 200 X'FF...FF'

Leaf Pages

A300 600 A320 200 A340 200 B737 300 B737 600 B747 400 B777 200

A340 200 A300 600 B737 300 A320 200 B737 600 B777 200 B747 400

Data Pages AIRCRAFT_MODEL

Figure 9-4. Structure of Indexes CF182.0

Notes:
An index is based on a key, i.e., an ordered set of columns of a table. It is a multilevel tree
structure logically ordering the rows of a table in accordance with the key for the index. The
order can be ascending or descending depending on what you have requested. You can
determine the order when defining the index.
Assuming that the physical order of the rows may be different from the logical order implied
by the key, the index must be a dense index. This means that all key values must be
reflected by index entries in the lowest index level.
On the bottom of the visual, you see data pages with sample rows for table
AIRCRAFT_MODEL. On the lowest index level, there must be an index entry for each row.
The index entries are generally grouped into index pages. Within an index page, the index
entries are sorted in the requested order in accordance with the key for the index. The key
ranges for the index pages do not overlap.
Each index entry contains a key value and a pointer to the appropriate row(s) as indicated
on the visual. Thus, all rows of the table must be pointed to by index entries (dense
indexes).

The pages of the lowest index level are referred to as leaf pages. In general, the leaf pages
are chained together, forward and backward, in the ordering sequence emphasizing that
the index logically orders the rows of the table. The chaining of the leaf pages is used for
the logical sequential processing of rows.
Since an index may consist of many leaf pages (even more than data pages), it would still
be very inefficient to search the leaf pages for a particular row. Therefore, higher index
levels are introduced, again consisting of index pages referred to as nonleaf pages. The
index entries of the second index level (the one above the leaf pages) order the leaf pages.
Each index entry contains a key value and a pointer to a leaf page. Assuming an ascending
key sequence, the key value must be a key value higher than or equal to the highest key
value of the leaf page and lower than or equal to the lowest key value of the logically next
leaf page. On the visual, the lowest key value of the logically next leaf page is used, as
DB2 does. This has an advantage when inserting rows as we will see later.
The last index entry on any higher index level has a key value of all bytes hexadecimal FF,
the highest key value possible.
Since the second index level orders pages rather than rows, it will generally contain only a
few pages. If it contains more than one nonleaf page, a third index level is introduced to
order the pages of the second index level, and so on. The tree structure stops with an index
level that has only one index page.
The index page of the highest index level is referred to as root page.
The indexes established this way are balanced trees meaning that the number of index
levels to be traversed from the root page to a row is the same for all rows. There are other
types of indexes possible, but balanced trees have proven to be the best especially if the
distribution of the key values is random and cannot be predicted in advance.
Most indexes have two or three levels.
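
As a rough illustration of how the second index level is derived from the leaf pages, the following Python sketch builds the root-page entries of the visual. It is a sketch under assumptions: the grouping of the seven keys into three leaf pages is inferred from the root page shown on the visual, `HIGH` is a stand-in for the key value X'FF...FF', and the function name `build_nonleaf_entries` is hypothetical.

```python
# Stand-in for the key value X'FF...FF' (compares higher than any sample key).
HIGH = ("\uffff", float("inf"))


def build_nonleaf_entries(leaf_pages):
    """Return the entries of the index level above the leaf pages.

    Each entry is (key value, leaf page number). The key value is the lowest
    key of the logically next leaf page; the last entry gets HIGH.
    """
    entries = []
    for page_no in range(len(leaf_pages)):
        if page_no + 1 < len(leaf_pages):
            key = leaf_pages[page_no + 1][0]
        else:
            key = HIGH
        entries.append((key, page_no))
    return entries


# Assumed grouping of the leaf-level keys into leaf pages.
leaf_pages = [
    [("A300", 600), ("A320", 200), ("A340", 200)],
    [("B737", 300), ("B737", 600), ("B747", 400)],
    [("B777", 200)],
]
root_page = build_nonleaf_entries(leaf_pages)
```

Because the three entries fit into one page, that page is the root page and the index has two levels.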

Searching Via an Index
Searching for B737 600

Index Key = (Type_Code, Model_Number)

B737 300 B777 200 X'FF...FF'

A300 600 A320 200 A340 200 B737 300 B737 600 B747 400 B777 200

A340 200 A300 600 B737 300 A320 200 B737 600 B777 200 B747 400

Data Pages AIRCRAFT_MODEL

Figure 9-5. Searching Via an Index CF182.0

Notes:
Using the index illustrated on this visual and the previous visual, the system searches for
aircraft model Boeing B737, Model 600. The index is in ascending order.
First, the root page of the index is searched for the proper index entry. The proper index
entry is the first index entry whose key value is higher than the given key value. This is the
search rule for all index levels above the leaf-page level.
In our example, the proper index entry is the second index entry of the root page, i.e., the
one with key value (B777, 200). The entry points to the second leaf page.
When searching leaf pages, you look for the last index entry whose key value is lower
than or equal to the given key value. If you find an index entry with the given key value,
the desired row exists and is pointed to by the index entry. If the key value of the index
entry is lower, the desired row does not exist.
In our case, the entry found is the second index entry of the second leaf page and has the
key value searched for. Thus, the row exists and, indeed, the index entry points to the row.
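
The two search rules can be traced with a small Python sketch. It assumes the two-level index of the visual with the leaf-page grouping implied by the root page; the row pointers are simply positions in the physical order of the data pages, and `HIGH`, `root`, `leaf_pages`, and `search` are names chosen for this illustration only.

```python
# Stand-in for the key value X'FF...FF'.
HIGH = ("\uffff", float("inf"))

# Root page: (key value, leaf page number).
root = [(("B737", 300), 0), (("B777", 200), 1), (HIGH, 2)]

# Leaf pages: (key value, pointer to row). The pointer is the position of the
# row in the physical order A340 200, A300 600, B737 300, A320 200,
# B737 600, B777 200, B747 400.
leaf_pages = [
    [(("A300", 600), 1), (("A320", 200), 3), (("A340", 200), 0)],
    [(("B737", 300), 2), (("B737", 600), 4), (("B747", 400), 6)],
    [(("B777", 200), 5)],
]


def search(key):
    """Return the row pointer for key, or None if the row does not exist."""
    # Rule above the leaf level: first entry whose key value is higher
    # than the given key value.
    leaf_no = next(ptr for k, ptr in root if k > key)
    # Rule on the leaf level: last entry whose key value is lower than or
    # equal to the given key value.
    candidates = [(k, ptr) for k, ptr in leaf_pages[leaf_no] if k <= key]
    if candidates and candidates[-1][0] == key:
        return candidates[-1][1]
    return None
```

Searching for `("B737", 600)` selects the second root entry, then the second entry of the second leaf page, as in the walkthrough above.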

Consider a table with 20,000 rows of 100 characters each and assume that the key for the
index is 8 characters long. Furthermore, assume that the size of the data pages and of the
index pages is 4K.
Under these assumptions, the rows occupy approximately 500 pages and the index
consists of only two index levels. Reading a row using the index requires three pages to be
accessed. In contrast, scanning the rows would require on the average 250 pages to be
accessed assuming the system stops scanning when it has found the row (which is
generally not the case). This illustrates very clearly the advantage of having an index.
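
The figures in this example can be reproduced with a short calculation. The sketch below makes simplifying assumptions that the notes do not state: a row pointer of 4 bytes in each index entry, no page overhead, no free space, and rows that do not span pages.

```python
import math

PAGE_SIZE = 4096       # 4K data pages and index pages
ROWS = 20_000
ROW_LENGTH = 100
KEY_LENGTH = 8
POINTER_LENGTH = 4     # assumption: not given in the notes

# Data pages needed when rows do not span pages.
rows_per_page = PAGE_SIZE // ROW_LENGTH
data_pages = math.ceil(ROWS / rows_per_page)

# Index levels: keep adding a level until one page holds all entries.
entries_per_page = PAGE_SIZE // (KEY_LENGTH + POINTER_LENGTH)
pages_on_level = math.ceil(ROWS / entries_per_page)   # leaf pages
levels = 1
while pages_on_level > 1:
    pages_on_level = math.ceil(pages_on_level / entries_per_page)
    levels += 1

pages_via_index = levels + 1          # one page per index level plus the data page
average_scan_pages = data_pages / 2   # scan stops at the row, on average halfway
```

Under these assumptions the calculation gives 500 data pages, a two-level index, three page accesses via the index, and 250 page accesses on average for a scan.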

Unique and Nonunique Indexes

(Plain) Unique Index
Every value of the key including the NULL value may occur at most once
Can be used to guarantee uniqueness of values for primary key of table
For foreign keys for merged 1:1 relationship types

Unique-Where-Not-NULL Index
Every value of the key excluding the NULL value may occur at most once;
NULL value may occur multiple times
For foreign keys for imbedded 1:1 relationship types

Nonunique Index
Every value of the key may occur multiple times
For foreign keys for merged or imbedded 1:m relationship types
For individual columns of composite keys

Figure 9-6. Unique and Nonunique Indexes CF182.0

Notes:
Unique indexes come in two flavors:
• Plain unique indexes consider the NULL value as a regular value and require/enforce
that each key value occurs in at most one row.
For an index key consisting of one column, this means that the NULL value may occur
in at most one row. Thus, uniqueness is enforced for all values including the NULL
value.
For an index key consisting of two columns, two key values (a,b) and (c,d) are
considered equal if a=c and b=d. This includes the NULL value: (a,NULL) and (c,NULL)
are considered different if a and c are different and identical if they are the same.
In particular, (plain) unique indexes can be used for the following two purposes:
- They can be used to guarantee the uniqueness of the values of the primary key for a
table.
- They can be used to guarantee the uniqueness of the values of a foreign key
resulting from the merging of the tuple type for a 1:1 relationship type.

Because the relationship type is a 1:1 relationship type, each defining attribute could
be the relationship key. Therefore, the corresponding (composite) attributes of the
related tuple type can assume each value only once. This also applies to the attribute
that has not been made the primary key of the tuple type.
For tuple types to be merged, they must, at all times, have the same primary key
values and, thus, number of tuples. As a consequence, the foreign key resulting from
the defining attribute that has not been made the primary key can assume each value
only once. Therefore, a (plain) unique index can be used to ensure the uniqueness of
the foreign key values.
• Unique-where-not-NULL indexes treat each occurrence of the NULL value as different
and require/enforce that each key value that is not NULL occurs in at most one row.
For an index key consisting of one column, this means that each value except the NULL
value must occur in at most one row. Thus, uniqueness is only enforced for those
values that are not NULL.
For a key consisting of two columns, each occurrence of (a,NULL) is considered
different. Thus, uniqueness is only enforced for those key values for which none of their
components is the NULL value.
Unique-where-not-NULL indexes can be used to guarantee the uniqueness of the
values (that are not NULL) of foreign keys resulting from the imbedding of tuple types
for 1:1 relationship types.
The rationale is similar to the one for merged tuple types of 1:1 relationship types.
However, the imbedded tuple type may have, at any point in time, fewer tuples than the
target tuple type. As a consequence, for some of the rows, the foreign key may not have
a value requiring a unique-where-not-NULL index rather than a plain unique index.
Nonunique indexes allow any value of the key to occur in any number of rows.
In particular, nonunique indexes can be used for:
• The foreign keys resulting from merged or imbedded 1:m relationship types.
• Individual columns of composite keys. Even though the values of the composite key
may have to be unique, the values of the individual columns need not be unique.
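
The difference between the three kinds of indexes can be summarized in a small Python sketch of the uniqueness check as the notes describe it. It models the rules only, not any particular system; the function name `violates_uniqueness` and the `kind` labels are hypothetical, and `None` stands for the NULL value.

```python
def violates_uniqueness(existing_keys, new_key, kind):
    """Return True if an index of the given kind would reject new_key.

    Keys are tuples with one element per key column; None stands for NULL.
    kind is "unique", "unique_where_not_null", or "nonunique".
    """
    if kind == "nonunique":
        # Any key value may occur in any number of rows.
        return False
    if kind == "unique_where_not_null" and None in new_key:
        # Each occurrence of NULL counts as different, so a key value with
        # a NULL component never collides with an existing key value.
        return False
    # Plain unique (and non-NULL keys otherwise): NULL is a regular value,
    # so (a, NULL) equals (c, NULL) exactly when a equals c.
    return new_key in existing_keys
```

For example, a second row with key `(None,)` is rejected by a plain unique index but accepted by a unique-where-not-NULL index.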

Clustering Index

Controls where new rows are inserted
Insertion point for new row determined via index

Attempts to make physical sequence of data pages equal to logical sequence
imposed by key for index
Only if free space available at insertion point
Should request free space during definition of corresponding space object
Row inserted elsewhere if space not available at insertion point

Only one clustering index for a table

Supported only by some database management systems
For example, DB2 Universal Database for OS/390

Figure 9-7. Clustering Index CF182.0

Notes:
Indexes marked as clustering indexes are used by the database management system to
control where new rows are inserted. They are used to determine the insertion point for the
new rows.
By using an index to determine the insertion point, the database management system
attempts to make the physical sequence of the data pages equal to the logical sequence
implied by the index. However, a new row is inserted at the point determined via the index
only if the located page contains enough free space for the row. Therefore, when defining
the space object for the table, you should request that free space is left in the data pages
for later insertions when the rows are loaded.
As mentioned before, the database management system attempts to insert the new row in
the page determined by means of the clustering index. If the data page does not contain
enough space for the row, the row is not inserted into the page. The data page is not split
either. Instead, the database management system inserts the new row into the closest
page with sufficient free space in the neighborhood of the ideal insertion point. If none of
the pages in the neighborhood has enough free space, the row is inserted somewhere
else.
Since the logical order of the index determines the physical order of the data pages and the
rows can only be ordered according to one criterion, there can only be one clustering index
for a table.
A clustering index is advantageous if you have business processes processing the rows in
the logical order imposed by the index. Since the data pages are largely in the logical
order of the key, the database management system need not jump constantly from one
place on the storage volume to another. It can efficiently use techniques such as sequential
prefetch to read a set of physically adjacent data pages with a single I/O operation.
Clustering indexes are only supported by a few database management systems. They are
supported, for example, by DB2 Universal Database for z/OS.

Clustering Index - First Insertion (1 of 2)
Inserting B757 300

Index Key = (Type_Code, Model_Number)

B747 400 B777 200 X'FF...FF'

A300 600 A320 200 A340 200 B747 400 B777 200

Insertion
Point

A300 600 A320 200 A340 200 B747 400 B777 200

Data Pages AIRCRAFT_MODEL

Figure 9-8. Clustering Index - First Insertion (1 of 2) CF182.0

Notes:
The visual illustrates how the insertion point for a new row of table AIRCRAFT_MODEL is
determined using a clustering index. The aircraft model to be inserted has type code B757
and model number 300.
The index is searched in precisely the same manner as for the retrieval of rows: The
second index entry of the root page is the first index entry with a higher key value than the
new row. Therefore, it is the one that points to the proper leaf page, the second leaf page
on the visual.
In the leaf page, you look for the last index entry with a key value lower than or equal to the
key value of the new row. Since the leaf page contains a single index entry, it is the entry
found and its key is lower. As a consequence, the new row will be inserted into the data
page pointed to by the index entry provided the data page has sufficient free space.
For the example, the new row is to be inserted into the third data page. It contains enough
free space. The new row will follow the row with key (B747, 400). This is indeed the proper
place for maintaining the data pages in the logical order implied by the index. The next
visual shows the data pages and the index after the insertion of the row.

Clustering Index - First Insertion (2 of 2)


Index Key = (Type_Code, Model_Number)

B747 400 B777 200 X'FF...FF'

A300 600 A320 200 A340 200 B747 400 B757 300 B777 200

A300 600 A320 200 A340 200 B747 400 B757 300 B777 200

Data Pages AIRCRAFT_MODEL

Figure 9-9. Clustering Index - First Insertion (2 of 2) CF182.0

Notes:
Since the third data page has enough free space for the new row, the row with key
(B757, 300) is inserted into this data page. An appropriate index entry is added to the
second leaf page.

Clustering Index - Second Insertion (1 of 2)
Inserting B767 200

Index Key = (Type_Code, Model_Number)

B747 400 B777 200 X'FF...FF'

A300 600 A320 200 A340 200 B747 400 B757 300 B777 200

Insertion
Point

A300 600 A320 200 A340 200 B747 400 B757 300 B777 200

Data Pages AIRCRAFT_MODEL

Figure 9-10. Clustering Index - Second Insertion (1 of 2) CF182.0

Notes:
The visual illustrates the locating of the insertion point for a second insertion: A new row
with key (B767, 200) is to be inserted.
This time, the second index entry of the second leaf page is the last index entry whose key
value is lower than or equal to the key value of the row to be inserted. The data page
pointed to by the index entry is the third data page, the same as for the previous insert
request.
Since the third data page does not have any free space, the new row cannot be inserted
into the data page. The system looks for the closest data page with enough free space. The
second data page and the fourth data page have enough free space and are equally close.
Since the index is in ascending order, later pages in logical order are preferred and the
fourth data page is chosen. The new row will be inserted into the free space of the fourth
data page, i.e., following the row with key (B777, 200).
The next visual illustrates the insertion of the row.

Clustering Index - Second Insertion (2 of 2)


Index Key = (Type_Code, Model_Number)

B747 400 B777 200 X'FF...FF'

A300 600 A320 200 A340 200 B747 400 B757 300 B767 200 B777 200

A300 600 A320 200 A340 200 B747 400 B757 300 B777 200 B767 200

Data Pages AIRCRAFT_MODEL

Figure 9-11. Clustering Index - Second Insertion (2 of 2) CF182.0

Notes:
Since the third data page does not have enough free space for the new row, the row with
key (B767, 200) is inserted into the fourth data page. An appropriate index entry is added
to the second leaf page.
As you can see, the physical order of the rows no longer coincides with the logical order.
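
The two insertions can be replayed with a simplified Python simulation. It rests on several assumptions that are not taken from the visuals: each data page holds at most two rows, the initial distribution of the rows over four data pages is as shown in the code, the index is rebuilt from the data pages instead of being maintained, and the search for free space covers all pages (the notes say that a real system only looks in the neighborhood and otherwise inserts elsewhere). The names `CAPACITY`, `data_pages`, and `insert` are chosen for this sketch.

```python
from bisect import bisect_right

CAPACITY = 2  # assumed maximum number of rows per data page

# Assumed initial layout: pages 2, 3, and 4 have free space, page 1 is full.
data_pages = [
    [("A300", 600), ("A320", 200)],
    [("A340", 200)],
    [("B747", 400)],
    [("B777", 200)],
]


def insert(pages, key):
    """Insert key using the clustering rule; return the 0-based page number."""
    # Leaf level of the index: (key value, data page number) in key order.
    index = sorted((k, p) for p, page in enumerate(pages) for k in page)
    keys = [k for k, _ in index]
    # Last index entry whose key value is lower than or equal to the new key.
    i = bisect_right(keys, key) - 1
    target = index[i][1] if i >= 0 else 0
    # Pages with free space; the closest page wins and, on a tie, the later
    # page in logical order is preferred (ascending index).
    free = [p for p, page in enumerate(pages) if len(page) < CAPACITY]
    if not free:
        raise RuntimeError("no free space left in this sketch")
    chosen = min(free, key=lambda p: (abs(p - target), -p))
    pages[chosen].append(key)
    return chosen


first = insert(data_pages, ("B757", 300))    # ideal page has free space
second = insert(data_pages, ("B767", 200))   # ideal page is full
```

With this layout, the first row goes into the third data page and the second row into the fourth, so the physical order of the fourth page no longer matches the logical order.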

Partitioning Index

A special clustering index

Subdivides rows for table into key ranges
Key ranges referred to as partitions
For example:
1st partition: All employees with Employee_Number <= 1350000
2nd partition: All employees with Employee_Number > 1350000
and Employee_Number <= 2999999
3rd partition: All employees with Employee_Number > 2999999

Partitions in different physical spaces
Rows always inserted in physical space for partition

Partitions can be processed separately and in parallel by utilities
SQL operations can access partitions in parallel, reducing run times

Figure 9-12. Partitioning Index CF182.0

Notes:
Partitioning indexes are a special form of clustering indexes. Thus, they have all the
features of clustering indexes. In addition, you must define key ranges for the key values of
the index. In turn, the key ranges for the index subdivide the rows for the table into
corresponding key ranges referred to as partitions.
Assuming that a partitioning index has been defined on column Employee_Number of table
EMPLOYEE for Come Aboard, the employees are partitioned in accordance with the key
ranges for the index. The example on the visual illustrates a partitioning into three
partitions: The first partition contains all rows for employees with an employee number
smaller than or equal to 1350000; the second partition all rows for employees with
employee numbers larger than 1350000, but not larger than 2999999; and the third
partition the rows for all remaining employees.
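
The assignment of a row to its partition can be sketched in a few lines of Python. The sketch assumes that each key range is described by its highest key value (here called a limit key) and uses the standard `bisect` module; the names `LIMIT_KEYS` and `partition_for` are chosen for this illustration and do not correspond to any DB2 syntax.

```python
from bisect import bisect_left

# Highest Employee_Number of every partition except the last one.
LIMIT_KEYS = [1350000, 2999999]


def partition_for(employee_number):
    """Return the 1-based number of the partition for an employee number."""
    # bisect_left counts the limit keys that are strictly lower than the
    # given number, so a number equal to a limit key stays in that partition.
    return bisect_left(LIMIT_KEYS, employee_number) + 1
```

Every employee number falls into exactly one partition; the boundary values 1350000 and 2999999 belong to the first and second partition, respectively.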
By itself, the partitioning has no effect. Its benefit comes from the functions that are
generally connected with the subdivision into partitions:
• The rows of the partitions are placed into different physical spaces which may reside on
different cylinders or even different volumes. The rows for a partition are always placed
into the physical space for that partition and never into the physical space for another
partition.
• Utilities such as a load utility, a backup utility, a recovery utility, or a
reorganization utility can process individual partitions, can process the partitions
separately and jointly, and can process the partitions in parallel.
• SQL operations can process the partitions in parallel using multiple tasks or processes
of the operating system. This can considerably reduce the run time for SQL operations,
especially queries.
These points imply that partitioning indexes are worth considering if you have large tables.
However, it is a prerequisite that you can reasonably subdivide the rows of the table into
partitions.
Since clustering indexes are only supported by a few database management systems,
partitioning indexes are also only supported by a few database management systems. For
example, they are supported by DB2 Universal Database for z/OS. Other systems use
other means to partition the rows of tables.

Use of Indexes

For each primary key, define a plain unique index
Ensures unique identification of the rows

For each foreign key, define an index
If index exists, it is used by referential integrity support for ensuring referential
integrity when deleting a row of the parent table
Can cause poor performance if index is missing

Need not have a second index if other index exists whose key contains:
Key columns for second index as leading key columns
Key columns for second index in same sequence

Because of maintenance overhead, create additional indexes only if really required
If the table does not change after loading, you can create any indexes

Good candidates for indexes are columns used (frequently) for
Joins, ORDER BY, GROUP BY, DISTINCT, or direct access of rows

Figure 9-13. Use of Indexes CF182.0

Notes:
From a database design perspective, the following rules apply for the use of indexes:
• For each primary key, define a (plain) unique index independent of the number of data
pages the table occupies. By allowing each primary key value to occur only once, the
index ensures the unique identification of the rows. Without the index, each primary key
value could occur more than once.
If you are using the referential integrity support of your system for a referential constraint,
most systems require a unique index for the primary key. The index (and only the index)
is used to check if the parent row exists for a row inserted into the dependent table.
• For each foreign key, define an index if the rows of the table occupy more than three data
pages.
When using the referential integrity support of your database management system, the
system will generally not force you to have an index for the foreign key. If an index exists
for the foreign key, it is used to ensure the referential integrity when deleting rows of the
parent table.

For delete rules NO ACTION or RESTRICT, the index (and only the index) is used to
check if the dependent table has dependent rows. For delete rule SET NULL, it is used
to determine the dependent rows whose foreign key values must be reset to NULL. For
delete rule CASCADE, it is used to determine the rows of the dependent table that must
be deleted.
The missing index for the foreign key of a referential constraint is often the reason for
complaints about the poor performance of the referential integrity support.
• If you have an index for a composite key, you do not need an additional index for the
leading columns of that key, provided that the columns are in the required order. The
system is generally able to use the index for the composite key since it is, in particular,
ordered in accordance with the required leading key columns.
• The maintenance of indexes due to insert, update, or delete operations cannot be
neglected. Therefore, you should introduce additional indexes only if they are really
required by the business processes and not as a precautionary measure. Of course, you
can add any indexes, as long as you do not mind the space they occupy, if the rows of
the table do not change after the loading of the table.
• Good candidates for indexes are columns that are used for Join operations, the SQL
ORDER BY, GROUP BY, or DISTINCT clauses/keywords, or for the direct access of
rows.
Most systems have tools that allow you to determine the effectiveness of an index.
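
The rule about leading key columns lends itself to a simple check during design. The following Python sketch tests whether a needed index key is already covered by an existing index; the function name `covered_by` is hypothetical, and the key (Type_Code, Model_Number) is the index key used for table AIRCRAFT_MODEL earlier in this unit.

```python
def covered_by(needed_key, existing_index_keys):
    """Return True if an existing index makes a separate index unnecessary.

    This is the case if some existing index key starts with the columns of
    needed_key, in the same sequence.
    """
    needed = tuple(needed_key)
    return any(tuple(key[:len(needed)]) == needed
               for key in existing_index_keys)


existing = [("Type_Code", "Model_Number")]
```

Here `covered_by(("Type_Code",), existing)` holds, whereas an index need on Model_Number alone, or on the columns in reverse sequence, is not covered.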

No Index for Leading Foreign Key
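[Visual: Tables MECHANIC (primary key Employee_Number) and AIRCRAFT (primary key
Aircraft_Number) are each connected by a referential constraint marked C to table
MECHANIC_FOR_AC with columns Employee_Number and Aircraft_Number.]

Index on primary key (Employee_Number, Aircraft_Number)

No index required for foreign key Employee_Number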

Figure 9-14. No Index for Leading Foreign Key CF182.0

Notes:
For m:m relationship types, you need a table containing columns for the defining attributes
for the relationship type. As you know, the defining attributes together form the relationship
key and, therefore, the primary key of the table. Thus, you should have a unique index
comprising all columns for the defining attributes.
For each of the defining attributes, the key consisting of the columns for the defining
attribute represents a foreign key. One of these keys is the first part of the primary key.
Thus, its columns are the leading columns of the primary key. Therefore, you do not need
an additional index for that key. However, you should have an index for the other foreign key
because the primary (key) index is ordered differently.
The visual illustrates this for table MECHANIC_FOR_AC, the table for m:m relationship
type MECHANIC_scheduled_for_AIRCRAFT. The table consists of the primary key
columns for tables MECHANIC and AIRCRAFT, i.e., columns Employee_Number and
Aircraft_Number. Together, the columns form the primary key. Let us assume that
Employee_Number is the first column of the primary key and that there is a unique index for
the primary key.


Individually, columns Employee_Number and Aircraft_Number are foreign keys as
indicated by the referential constraints on the visual. Since Employee_Number is the first
column of the primary key, you do not need an index for it. The primary (key) index is used
instead. However, you should have a nonunique index for column Aircraft_Number.
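
The rule for table MECHANIC_FOR_AC can be tried out with a small sketch. The following Python example uses the standard sqlite3 module rather than the database management system assumed in this course, so the access paths it reports are SQLite's and only illustrate the principle. The index name IX_MFA_AIRCRAFT and the integer column types are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MECHANIC (Employee_Number INTEGER PRIMARY KEY);
    CREATE TABLE AIRCRAFT (Aircraft_Number INTEGER PRIMARY KEY);
    CREATE TABLE MECHANIC_FOR_AC (
        Employee_Number INTEGER NOT NULL REFERENCES MECHANIC,
        Aircraft_Number INTEGER NOT NULL REFERENCES AIRCRAFT,
        -- The primary key gets a unique index with Employee_Number leading.
        PRIMARY KEY (Employee_Number, Aircraft_Number)
    );
""")


def plan(sql):
    """Return the access path descriptions SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


by_employee = "SELECT * FROM MECHANIC_FOR_AC WHERE Employee_Number = 1"
by_aircraft = "SELECT * FROM MECHANIC_FOR_AC WHERE Aircraft_Number = 1"

# Leading column of the primary key: the primary key index can be searched.
plan_employee = plan(by_employee)

# Second column of the primary key: no suitable index yet, so rows are scanned.
plan_aircraft_before = plan(by_aircraft)

# Nonunique index for the other foreign key (hypothetical index name).
conn.execute("CREATE INDEX IX_MFA_AIRCRAFT ON MECHANIC_FOR_AC (Aircraft_Number)")
plan_aircraft_after = plan(by_aircraft)

print("by employee          :", plan_employee)
print("by aircraft, no index:", plan_aircraft_before)
print("by aircraft, indexed :", plan_aircraft_after)
```

The exact wording of the reported plans depends on the SQLite version. What matters is that the lookup by Employee_Number is a search through the primary key index, whereas the lookup by Aircraft_Number becomes a search only after the additional index exists.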

Indexes - Documentation

For each index:

Table for Index:  Name of table to which index applies
Index Name:       Name for index. Should be unique for application domain
Index Key:        Ordered list of columns over which index is defined
Properties:       UNIQUE, UNIQUE WHERE NOT NULL, NONUNIQUE, CLUSTERING, PARTITIONING
Key Ranges:       For a partitioning index, key ranges for the partitions

Figure 9-15. Indexes - Documentation CF182.0

Notes:
For each index, you should provide the following information:
• The name of the table for which the index is established.
• You should select a name for each index in agreement with the naming rules for indexes
for your database management system. The name for the index should be unique for the
application domain and must not be the name of a table.
The name is only used to identify the index to the database management system. It is
not needed by end-users or for any other objects being defined. Therefore, you could
omit it and leave it to the database administrator to select a name when he/she defines
the index.
• The ordered list of columns making up the key for the index.
• The properties the index should have: If it should be a (plain) unique index, a
unique-where-not-NULL index, or a nonunique index; if it should be a clustering index or
a partitioning index or not.
• For a partitioning index, the key ranges for the partitions of the table.
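
The documentation items listed above could be captured in a small record type. The following Python sketch is only one possible representation. The class name IndexDoc, its field names, and the consistency checks are assumptions made for illustration and are not part of the course material.

```python
from dataclasses import dataclass, field
from typing import List, Optional

UNIQUENESS = ("UNIQUE", "UNIQUE WHERE NOT NULL", "NONUNIQUE")


@dataclass
class IndexDoc:
    """Documentation entry for one index (hypothetical structure)."""
    table: str                          # table to which the index applies
    key: List[str]                      # ordered list of key columns
    name: Optional[str] = None          # may be left to the database administrator
    uniqueness: str = "NONUNIQUE"       # one of UNIQUENESS
    clustering: bool = False
    partitioning: bool = False
    key_ranges: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of inconsistencies found in this entry."""
        found = []
        if not self.key:
            found.append("index key must list at least one column")
        if self.uniqueness not in UNIQUENESS:
            found.append(f"unknown uniqueness property: {self.uniqueness}")
        # Only the index's own table is checked here, not all table names.
        if self.name is not None and self.name == self.table:
            found.append("index name must not be the name of a table")
        if self.partitioning and not self.key_ranges:
            found.append("partitioning index needs key ranges")
        if self.key_ranges and not self.partitioning:
            found.append("key ranges apply only to a partitioning index")
        return found


good = IndexDoc(table="MECHANIC_FOR_AC", key=["Aircraft_Number"],
                name="IX_MFA_AIRCRAFT", uniqueness="NONUNIQUE")
bad = IndexDoc(table="AIRCRAFT", key=[], partitioning=True)

print(good.problems())
print(bad.problems())
```

Leaving name optional mirrors the remark above that the choice of the index name can be left to the database administrator.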


Checkpoint

Exercise — Unit Checkpoint


1. What is the main purpose of an index?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. Indexes allow the database management system to directly access
the rows of the table on which the index has been defined. (T/F)

3. Indexes may help the database management system to avoid the
internal sorting of rows. (T/F)

4. What does it mean that an index is a dense index?


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

5. The root page is always a nonleaf page. (T/F)

6. Assume that you have defined a plain unique index on a column of
a table. For how many rows of the table can the column contain the
NULL value?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________


7. For an index whose key consists of one column, match the
following definitions with the corresponding type of index:
a. Every value including the NULL value must occur in at most one row.
b. Every value excluding the NULL value must occur in at most one row.
c. Every value can occur in any number of rows.
____ Nonunique index
____ Plain unique index
____ Unique-where-not-NULL index

8. Describe two cases for the usage of plain unique indexes.


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

9. Describe a case for the usage of a unique-where-not-NULL index.


_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

10. What does the system attempt to do if you have a clustering index
for a table?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________


11. Which of the following actions are taken if the data page for a new
row determined by a clustering index does not have enough free
space for the row?
a. The row is not inserted.
b. The data page is split in half and the row is inserted into one of
the new data pages.
c. If there is a data page in the neighborhood that has enough free
space, the row is inserted into that data page; otherwise, it is
inserted somewhere else.
d. The row is always inserted at the end of the data pages.
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

12. From the database design perspective, for which columns should
you establish an index?
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________
_____________________________________________________

Unit Summary

Indexes allow the database management system to directly access the rows of
a table
Indexes support the logical sequential processing of rows without sorting
Indexes help avoid internal sorts by the database management system

Plain unique indexes ensure that each value of the index key including the
NULL value occurs in at most one row

Unique-where-not-NULL indexes ensure that each value of the index key
excluding the NULL value occurs in at most one row
Nonunique indexes allow any value to occur in any number of rows
Clustering indexes attempt to store the rows into the data pages in such a way
that the physical sequence agrees with the implied logical sequence

Establish a (plain) unique index for each primary key

You should establish an index for each foreign key

Figure 9-16. Unit Summary CF182.0

Notes:
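The two uniqueness rules from the summary can be sketched as simple checks over the values of a single-column index key. The following Python example is an illustration only. The function names are hypothetical, None stands in for the NULL value, and no particular database management system is modeled.

```python
from collections import Counter


def satisfies_plain_unique(values):
    """Every value, including NULL (None), occurs in at most one row."""
    return all(count <= 1 for count in Counter(values).values())


def satisfies_unique_where_not_null(values):
    """Every value other than NULL (None) occurs in at most one row."""
    return all(count <= 1
               for value, count in Counter(values).items()
               if value is not None)


one_null = [10, 20, None]
two_nulls = [10, None, None]
duplicate = [10, 10, None]

print(satisfies_plain_unique(one_null))            # True
print(satisfies_plain_unique(two_nulls))           # False
print(satisfies_unique_where_not_null(two_nulls))  # True
print(satisfies_unique_where_not_null(duplicate))  # False
```

A nonunique index imposes neither restriction, so it needs no check of this kind.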

Unit 10. Logical Data Structures

What This Unit Is About


This unit makes the transition from storage view to logical view. It
describes logical data structures and briefly discusses views which
complement the logical data structures. Logical data structures are
established for business processes and illustrate in which tables the
data for a business process are located and also show the
process-specific flow through the tables of the application domain.

What You Should Be Able to Do


After completing this unit, you should be able to:
• Explain the purpose of logical data structures.
• Understand who has the responsibility for the establishment of
logical data structures.
• Describe the components of logical data structures and their
representation.
• Explain the relationship between business processes and logical
data structures.
• Describe the interrelationship between logical data structures and
views.

How You Will Check Your Progress


Accountability:
• Checkpoint questions


Unit Objectives
After completion of this unit, you should be able to:

Explain the purpose of logical data structures
Understand who has the responsibility for the establishment of logical data structures
Describe the components of logical data structures and their representation
Explain the relationship between business processes and logical data structures
Describe the interrelationship between logical data structures and views

Figure 10-1. Unit Objectives CF182.0

Notes:
After having established the tables, integrity rules, and indexes for the application domain,
we must make the transition from storage view to logical view. The transition verifies the
design of the database and proves that it meets the requirements of the business
processes. The verification is accomplished by establishing the logical data structures for
the business processes described in the process inventory.
This unit describes logical data structures and briefly discusses views which complement
them. It describes:
• The purpose of logical data structures.
• Who is responsible for establishing the logical data structures for the business
processes and the role of the database designer.
• The components of logical data structures and how they are represented.
• The relationship between the business processes and logical data structures.
• The interrelationship between logical data structures and views. Views are relational
database objects describing subsets and combinations of one or more tables.

10.1 Logical Data Structures


Logical Data Structures in Design Process

[Visual: the design process from Problem Statement through Entity Relationship Model,
Process Inventory, Data Inventory, Tuple Types, Tables, Integrity Rules, and Indexes to
Logical Data Structures, with the steps labeled Conceptual View, Storage View, and
Logical View.]

Figure 10-2. Logical Data Structures in Design Process CF182.0

Notes:
After the tables, integrity rules, and indexes for the application domain have been
determined, it is time to verify that the database design meets the requirements of the
business processes. As part of logical view, the logical data structures are established for
all business processes described in the process inventory.
The logical view looks at the data of the application domain from the perspective of the
business processes for the application domain. Accordingly, the logical data structures
show which tables of the application domain contain the data needed by the business
processes. They also describe how to navigate from table to table when accessing the
data.
The tables established for the application domain are the primary input for the
establishment of the logical data structure. The integrity rules, more precisely, the
referential constraints between primary keys and foreign keys, are a second input because
they show the natural paths between the various tables.

Logical Data Structures - Purpose

[Visual: Process, Data Inventory (Data Elements and Data Groups), Tables, and
Logical Data Structures.]

Data elements used by a process form a subset of the tables
Logical data structures identify tables and columns needed by the process
Logical data structures identify how to navigate from table to table
Logical data structures describe logical view and data flow of process
Figure 10-3. Logical Data Structures - Purpose CF182.0

Notes:
As the last step of the conceptual view, the data needed by the business processes of the
process inventory were described as data elements and data groups in the data inventory.
Based on the data elements and data groups in the data inventory, the tables for the
application design were developed. The data elements became the columns of the tables.
The data groups only provided structural information needed for normalization and the
splitting of tuple types.
In general, the data elements for a single business process constitute a small subset of the
columns of the tables and may be located in different tables. Therefore, for the individual
business processes, it is necessary to identify:
• The columns and tables corresponding to the data elements used by the business
process.
• How the business process can find, using the data found in one table, related data in
other tables.


This is where the logical data structures come into play. The logical data structures for a
business process describe this. They describe the subset of tables and columns needed by
the business process or a part of it. They also illustrate how the business process or the
appropriate part must navigate logically through the tables to achieve its function. Thus,
they reflect the logical view the business process (or the part) has of the tables and the
data flow between the tables for the business process.

Logical Data Structures - Responsibilities
Logical views of tables and data flows between tables for processes
Must show application programmers:
   Which tables contain data for processes
   How to navigate from one table to the next
Joint effort between
   Database designer
      Knows tables
      Knows referential constraints
   and application programmers
      Must write programs for processes
      (Should) know processes
Input for application programmers
Allows verification of tables for application domain
Figure 10-4. Logical Data Structures - Responsibilities CF182.0

Notes:
The logical data structures describe the logical views the business processes have of the
tables and the flow of data between the tables for the processes. They must show the
application programmers which tables contain the data (columns) for the business
processes and how to navigate from one table to the next. Thus, when establishing the
logical data structures, the interfaces between the business processes and the tables are
exposed.
The development of the logical data structures is a joint effort between the database
designer and the application programmers. The database designer must participate in the
development because he/she knows the tables, their columns, and the referential
constraints between the tables. The referential constraints, representing relationships
between primary keys and foreign keys, provide natural paths between the tables. They
are the primary vehicles for interconnecting the various tables.
However, the database designer cannot establish the logical data structures on his/her
own. The establishment of the logical data structures requires a detailed knowledge of the
business processes and may already consider implementation details. Therefore, the
application programmers writing the programs or queries for the business processes must
participate in the development of the logical data structures. They should have a primary
interest in the logical data structures. They should also have the required knowledge of the
business processes. How else could they implement them?
Instead of the application programmers, the application domain expert could participate in
the development of the logical data structures. However, since implementation
considerations may affect the logical data structures for a business process, the
participation of the application programmers is preferable. Because the logical data
structures are input for the application programmers, they should be the driving force in
establishing them.
You may ask what interest the database designer has in the development of the logical data
structures. He/She has a very good reason for participating in their development. By
establishing the logical data structures, the correctness and completeness of his/her
database design is verified. In addition, some performance bottlenecks may be revealed
leading to additional denormalizations, the combining of tables, and the splitting of tables.
The detection of design problems requires a reiteration of the design process rather than
patches to the tables. By just patching the tables, the quality of the design is jeopardized
and the rationale for design decisions is easily abandoned. If the changes are minor, it
does not take much time to verify and correct the intermediate design steps and, thus,
validate the basic design concept. If the changes are major, you had better follow the design
steps from top to bottom when rectifying the problem.

Sample Business Process
Display Maintenance Record Summary

For a specified maintenance number, display the following information:

1. The date when the maintenance was performed and the type of
maintenance performed.

2. The employee number and the name of the mechanic who performed the
maintenance.

3. The aircraft number of the aircraft for which the maintenance was performed.
4. If the aircraft is still owned by CAB, the date when the aircraft was
manufactured, the date when the aircraft was put into service, the model
number and type code for the aircraft, and the name of the manufacturer.

5. For each subrecord (direct or indirect) for the maintenance record, the date
of the maintenance and the type of maintenance performed.

Figure 10-5. Sample Business Process CF182.0

Notes:
The business process on this visual is a business process for our sample airline company
called Come Aboard. For a given maintenance number, the business process displays
information about the maintenance record, the aircraft for the maintenance record, and the
subrecords for the maintenance record.
For the maintenance record itself, it displays the date when the maintenance was
performed and the type of maintenance performed. In addition, it displays the employee
number and the name (last name, first name, and middle initial) of the employee who
performed the maintenance. Furthermore, it displays the aircraft number of the aircraft for
which the maintenance was performed.
If Come Aboard still contains data about the aircraft, the date when the aircraft was
manufactured and the date when the aircraft was put into service are displayed. In addition,
the model number and type code for the aircraft and the name of the manufacturer are
displayed.
A maintenance record may have subrecords which again may have subrecords and so on.
For each subrecord, the date and type of maintenance are displayed.


We will come back to the various points on the visual when discussing the logical data
structure for the business process.
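
To make the sample business process concrete, the following Python sketch runs it against a tiny in-memory database using the standard sqlite3 module. It is an illustration under assumptions, not the course's implementation. The column names Date_Manufactured, Date_In_Service, and Model_Number, the column types, and all sample rows are invented for the example. The remaining column names are those used in the discussion of the logical data structure in this unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        Employee_Number INTEGER PRIMARY KEY,
        Last_Name TEXT, First_Name TEXT, Middle_Initial TEXT);
    CREATE TABLE MANUFACTURER (
        Manufacturer_Code TEXT PRIMARY KEY, Company_Name TEXT);
    CREATE TABLE AIRCRAFT_TYPE (
        Type_Code TEXT PRIMARY KEY, Manufacturer_Code TEXT);
    CREATE TABLE AIRCRAFT (
        Aircraft_Number INTEGER PRIMARY KEY,
        Date_Manufactured TEXT, Date_In_Service TEXT,  -- hypothetical names
        Model_Number TEXT, Type_Code TEXT);            -- Model_Number is hypothetical
    CREATE TABLE MAINTENANCE_RECORD (
        Maintenance_Number INTEGER PRIMARY KEY,
        Date_Maintenance TEXT, Type_Maintenance TEXT,
        Employee_Number INTEGER, Aircraft_Number INTEGER,
        Owning_Record INTEGER);

    -- Invented sample rows. Aircraft 99 is no longer stored in AIRCRAFT.
    INSERT INTO EMPLOYEE VALUES (7, 'Doe', 'Jane', 'Q');
    INSERT INTO MANUFACTURER VALUES ('MF1', 'Example Aircraft Co.');
    INSERT INTO AIRCRAFT_TYPE VALUES ('T1', 'MF1');
    INSERT INTO AIRCRAFT VALUES (55, '1995-03-01', '1995-06-15', 'M-10', 'T1');
    INSERT INTO MAINTENANCE_RECORD VALUES
        (100, '2001-05-02', 'Overhaul',     7, 55, NULL),
        (101, '2001-05-03', 'Engine check', 7, 55, 100),
        (102, '2001-05-04', 'Fuel pump',    7, 55, 101),
        (200, '2001-07-01', 'Inspection',   7, 99, NULL);
""")

# Points 1 to 4: LEFT JOINs keep the record even if the aircraft data is gone.
SUMMARY_SQL = """
    SELECT mr.Date_Maintenance, mr.Type_Maintenance,
           e.Employee_Number, e.Last_Name, e.First_Name, e.Middle_Initial,
           mr.Aircraft_Number,
           a.Date_Manufactured, a.Date_In_Service, a.Model_Number, a.Type_Code,
           m.Company_Name
    FROM MAINTENANCE_RECORD mr
    JOIN EMPLOYEE e ON e.Employee_Number = mr.Employee_Number
    LEFT JOIN AIRCRAFT a ON a.Aircraft_Number = mr.Aircraft_Number
    LEFT JOIN AIRCRAFT_TYPE t ON t.Type_Code = a.Type_Code
    LEFT JOIN MANUFACTURER m ON m.Manufacturer_Code = t.Manufacturer_Code
    WHERE mr.Maintenance_Number = ?
"""

# Point 5: direct and indirect subrecords via a recursive query.
SUBRECORDS_SQL = """
    WITH RECURSIVE sub(Maintenance_Number, Date_Maintenance, Type_Maintenance) AS (
        SELECT Maintenance_Number, Date_Maintenance, Type_Maintenance
        FROM MAINTENANCE_RECORD WHERE Owning_Record = ?
        UNION ALL
        SELECT c.Maintenance_Number, c.Date_Maintenance, c.Type_Maintenance
        FROM MAINTENANCE_RECORD c
        JOIN sub ON c.Owning_Record = sub.Maintenance_Number
    )
    SELECT Date_Maintenance, Type_Maintenance FROM sub ORDER BY Maintenance_Number
"""


def summary(maintenance_number):
    """Return one row with the data for points 1 to 4, or None."""
    return conn.execute(SUMMARY_SQL, (maintenance_number,)).fetchone()


def subrecords(maintenance_number):
    """Return date and type for all direct and indirect subrecords."""
    return conn.execute(SUBRECORDS_SQL, (maintenance_number,)).fetchall()


print(summary(100))
print(summary(200))     # aircraft columns are None: aircraft no longer stored
print(subrecords(100))  # includes the indirect subrecord 102
```

The inner join to EMPLOYEE assumes that the row for the mechanic always exists. The outer joins reflect the condition in point 4 that aircraft data is displayed only if it is still available.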

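Sample Structure Diagram

[Visual: structure diagram. INPUT points to MAINTENANCE_RECORD/1. From
MAINTENANCE_RECORD/1, arrows labeled 2, 3, and 6 lead to EMPLOYEE, AIRCRAFT, and
MAINTENANCE_RECORD/2. An arrow labeled 4 leads from AIRCRAFT to AIRCRAFT_TYPE, which
is followed by MANUFACTURER. The labels 7 and C appear at MAINTENANCE_RECORD/2.]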

Figure 10-6. Sample Structure Diagram CF182.0

Notes:
A logical data structure consists of three components:
• A Structure Diagram illustrating how the various tables for the logical data structure are
interconnected. Since the structure diagram is the component resembling most what
you would expect from a structure, the term logical data structure is frequently used
synonymously for it.
• A Path Summary describing the columns through which the tables of the structure
diagram are interconnected.
• A Table Summary listing the columns needed for the various tables of the structure
diagram.
The current visual illustrates the structure diagram for the logical data structure for our
sample business process. Basically, the structure diagram looks as follows:
• The rectangular boxes in the structure diagram represent the tables used by the
business process (or a part of it) associated with the logical data structure. The boxes
contain the names of the tables.


If a business process uses the same table multiple times, for the same purpose or for
different purposes, the table occurs multiple times in the structure diagram. A different
usage may require different columns of the table. To tell the different uses apart and
correctly assign the columns to their uses in the table summary, the names of tables
occurring multiple times are suffixed with "/n", where n uniquely numbers the different
uses. In the example, table MAINTENANCE_RECORD is used for two purposes as will be
described later.
• An arrow interconnecting two tables illustrates a data flow in the direction of the arrow.
The table at the beginning of the arrow is referred to as source table for the flow, the
table at the end as target table. A value found in the source table is used unmodified to
access the corresponding rows in the target table. For example, the employee number
found in a maintenance record is used to access the row for the appropriate employee in
table EMPLOYEE ( 2 ). This corresponds to a Join operation for the tables.
The tables can be joined through a single column or multiple columns. The columns in
the target table can be named differently, but their function must be the same.
• The arrows are labeled to establish a reference to the path summary for the logical data
structure. For each interconnection of two tables (path), the path summary lists the
columns of the source table as well as the columns of the target table. If the
interconnection is through multiple columns, the column names are preceded by
sequence numbers establishing the correspondence between the respective source and
target columns.
• As for referential structures, single-headed and double-headed arrows are used to
indicate how many rows may be found in the target table for a value. A single-headed
arrow means that at most one row with the source value may be found in the target
table. A double-headed arrow means that multiple rows with the source value may be
found in the target table.
• If a path corresponds to a referential constraint (a primary-key/foreign-key relationship)
in the direction of the arrow, the delete rule is specified at the target end. The referential
constraint may allow the application programmer to skip steps of the business process
because they are automatically done by the referential integrity support of the system.
• It may happen that a table is accessed recursively (for the same purpose). In this case,
the arrow for the path leads back to the same table as is the case for table
MAINTENANCE_RECORD/2.
It is conceivable that the recursive loop comprises multiple tables.
• The data flow for a business process (or a part of a business process) always starts with
a specific table referred to as entry table. The entry table is identified by an oval labeled
INPUT pointing to it. In case of the sample logical data structure, the business process
starts with table MAINTENANCE_RECORD/1.
The interconnection between the INPUT oval and the entry table is also labeled and
described in the path summary, the entry table being the target table.

Most of the time, a subset of the rows of the entry table is selected based on the values
of certain columns. The columns used for the selection are specified as target columns
in the path summary. Because they do not apply, the fields Source Table and Source
Columns remain blank.
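
The three components described above can be pictured as plain data. The following Python sketch is one possible representation and not a notation defined by this course. The class name Path, its field names, and the helper functions are hypothetical. The sample entries use the paths and columns named in the discussion of the sample business process in this unit. The columns listed for MAINTENANCE_RECORD/2 are an assumption, and the date columns of table AIRCRAFT are left out because their names are not given.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Path:
    """One row of the path summary (one labeled arrow of the structure diagram)."""
    label: int
    source_table: Optional[str]        # None for the arrow coming from INPUT
    source_columns: Tuple[str, ...]
    target_table: str
    target_columns: Tuple[str, ...]
    multiple_rows: bool = False        # True corresponds to a double-headed arrow


path_summary = [
    Path(1, None, (), "MAINTENANCE_RECORD/1", ("Maintenance_Number",)),
    Path(2, "MAINTENANCE_RECORD/1", ("Employee_Number",),
         "EMPLOYEE", ("Employee_Number",)),
    Path(3, "MAINTENANCE_RECORD/1", ("Aircraft_Number",),
         "AIRCRAFT", ("Aircraft_Number",)),
    Path(4, "AIRCRAFT", ("Type_Code",), "AIRCRAFT_TYPE", ("Type_Code",)),
    Path(5, "AIRCRAFT_TYPE", ("Manufacturer_Code",),
         "MANUFACTURER", ("Manufacturer_Code",)),
    Path(6, "MAINTENANCE_RECORD/1", ("Maintenance_Number",),
         "MAINTENANCE_RECORD/2", ("Owning_Record",), multiple_rows=True),
]

# Table summary: the columns needed for each use of a table.
table_summary: Dict[str, List[str]] = {
    "MAINTENANCE_RECORD/1": ["Maintenance_Number", "Date_Maintenance",
                             "Type_Maintenance", "Employee_Number",
                             "Aircraft_Number"],
    "EMPLOYEE": ["Employee_Number", "Last_Name", "First_Name", "Middle_Initial"],
    "AIRCRAFT": ["Aircraft_Number", "Type_Code"],   # date columns omitted
    "AIRCRAFT_TYPE": ["Type_Code", "Manufacturer_Code"],
    "MANUFACTURER": ["Manufacturer_Code", "Company_Name"],
    # Assumed columns for the second use of the table.
    "MAINTENANCE_RECORD/2": ["Owning_Record", "Date_Maintenance",
                             "Type_Maintenance"],
}


def entry_tables(paths):
    """Tables reached directly from INPUT."""
    return [p.target_table for p in paths if p.source_table is None]


def missing_columns(paths, tables):
    """List (label, table, column) for path columns absent from the table summary."""
    missing = []
    for p in paths:
        ends = [(p.target_table, p.target_columns)]
        if p.source_table is not None:
            ends.append((p.source_table, p.source_columns))
        for table, columns in ends:
            for column in columns:
                if column not in tables.get(table, []):
                    missing.append((p.label, table, column))
    return missing


print(entry_tables(path_summary))
print(missing_columns(path_summary, table_summary))
```

A check such as missing_columns is a simple way to confirm that the path summary and the table summary agree with each other.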
So much for the general description of the structure diagram. Now, let us explain how we
arrived at the logical data structure for the sample business process:
1. The business process displays information about the maintenance record whose
maintenance number is specified as input. Table MAINTENANCE_RECORD is the entry
table for the logical data structure and the oval labeled INPUT points to it. The
connecting arrow ( 1 ) is a single-headed arrow because table
MAINTENANCE_RECORD can only contain a single row with the specified
maintenance number.
The first step of the business process requests that the date of the maintenance record
and the type of maintenance performed be displayed.
2. As a consequence of the first step of the business process, the path summary for the
logical data structure contains a row for path 1 . The row identifies column
Maintenance_Number as target column for table MAINTENANCE_RECORD.
3. The first step of the business process needs the following columns of table
MAINTENANCE_RECORD: Maintenance_Number, Date_Maintenance, and
Type_Maintenance. Therefore, they are included in the table summary for table
MAINTENANCE_RECORD.
4. The second step of the business process requests that the employee number and name of
the mechanic who performed the maintenance be displayed.
Table MAINTENANCE_RECORD contains the employee number of the mechanic who
performed the maintenance. The employee number is used to retrieve the row for the
mechanic in table EMPLOYEE. The row contains the name of the mechanic.
Accordingly, we have a path ( 2 ) from table MAINTENANCE_RECORD to table
EMPLOYEE. The connecting arrow must be a single-headed arrow because table
EMPLOYEE contains a single row for the employee number.
Note that we do not need to access table MECHANIC since we do not need any data of that
table. Consequently, path 2 does not correspond to a relationship type of the
entity-relationship model or a referential constraint of the referential structure.
5. The path summary must include a row for path 2 . The row describes that tables
MAINTENANCE_RECORD and EMPLOYEE are joined via column Employee_Number.
The value found for column Employee_Number in table MAINTENANCE_RECORD is
used as search argument for column Employee_Number of table EMPLOYEE.
As a consequence of the second step of the business process, column
Employee_Number is added to table MAINTENANCE_RECORD in the table summary.


In addition, the table summary states that the following columns of table EMPLOYEE
are needed by the business process: Employee_Number, Last_Name, First_Name, and
Middle_Initial.
6. The third step of the business process requests that the aircraft number of the aircraft
for which the maintenance was performed be displayed. Since the aircraft number is
contained in the maintenance record, the structure diagram need not change.
7. Because of Step 3 of the business process, column Aircraft_Number must be added to
the columns needed from table MAINTENANCE_RECORD. The path summary remains
unchanged.
8. The fourth step of the business process requests that the date when the aircraft was
manufactured, the date when the aircraft was put into service, and the model number
and type code for the aircraft be displayed.
If Come Aboard still has information about the aircraft, the requested information is
contained in the row for the aircraft in table AIRCRAFT. To retrieve the row, the aircraft
number in the maintenance record is used (path 3 ). The arrow must be a
single-headed arrow because at most one row can be found in table AIRCRAFT for a
given aircraft number.
Note that there is no relationship type interconnecting entity types AIRCRAFT and
MAINTENANCE RECORD in the entity-relationship model. Remember that the
maintenance records for an aircraft must be kept even if the remaining information about
the aircraft is deleted. For that reason, there is also no referential constraint between the
tables.
9. Because of Step 4 of the business process, the path summary must contain a row for
path 3 . The row shows that column Aircraft_Number of table
MAINTENANCE_RECORD is used as search argument for column Aircraft_Number of
table AIRCRAFT.
The table summary comprises a row for table AIRCRAFT listing all columns requested
by Step 4 of the business process.
10. The fourth step of the business process also requests that the name of the
manufacturer of the aircraft be displayed. To find the manufacturer name, we must use
the type code for the aircraft found in table AIRCRAFT and retrieve the row for the
aircraft type from table AIRCRAFT_TYPE. (We need not go to table AIRCRAFT_MODEL
since we do not need model-specific information.)
The row retrieved contains the manufacturer code which is then used to retrieve the row
for the manufacturer from table MANUFACTURER. The retrieved row contains the name
of the manufacturer.
The structure diagram is extended by the two interconnections (4 and 5) required to
accomplish the requested task.

11. As a consequence of the retrieval of the manufacturer name, the path summary
contains two additional rows describing the transitions from table AIRCRAFT to table
AIRCRAFT_TYPE and from table AIRCRAFT_TYPE to table MANUFACTURER.
The table summary reflects that columns Type_Code and Manufacturer_Code of table
AIRCRAFT_TYPE and columns Manufacturer_Code and Company_Name of table
MANUFACTURER are needed.
12. The fifth step of the business process requests that, for all subrecords of the specified
maintenance record, the date and type of the maintenance be displayed.
To obtain the subrecords for the maintenance record, we must retrieve all rows of table
MAINTENANCE_RECORD for which the value of column Owning_Record is equal to
the maintenance number of the specified maintenance record. This is expressed by
path 6 whose source and target is table MAINTENANCE_RECORD.
The structure diagram shows table MAINTENANCE_RECORD twice, and not an arrow
returning to the same table, because we have two different uses of table
MAINTENANCE_RECORD: Once, it is used for the original maintenance record and
once for the subrecords. That the uses are different is underlined by the fact that the
columns needed for the subrecords are different (fewer) and that there are different
interconnections from the subrecords.
The arrow for path 6 must be double-headed because multiple subrecords may exist
for a maintenance record.
13. To obtain unique references for table MAINTENANCE_RECORD in the path summary
and the table summary, "/1" and "/2" are appended to the table name, respectively.
14. The path summary contains a row for path 6 describing that columns
Maintenance_Number of table MAINTENANCE_RECORD/1 and Owning_Record of
table MAINTENANCE_RECORD/2 are joined.
The table summary describes that columns Owning_Record, Date_Maintenance,
Type_Maintenance, and Maintenance_Number of table MAINTENANCE_RECORD/2
are needed by the business process. Even though not expressed explicitly by the
description of the business process, the maintenance numbers for the subrecords must
be displayed to identify the subrecords. Column Maintenance_Number is also needed
for a different reason as we will see in a moment.
15. Looking more closely at the description of Step 5 reveals that not only the immediate
subrecords of the maintenance record are needed, but also all indirect subrecords:
the subrecords of the subrecords, their subrecords in turn, and so on down the
hierarchy.
Thus, we need the recursion represented by path 7: The maintenance number of a
subrecord is used to locate all maintenance records whose column Owning_Record
contains that maintenance number. Since the interconnection corresponds to the
self-referencing constraint for table MAINTENANCE_RECORD, the delete rule
(CASCADE) is specified at the target end of the arrow.

16. The path summary must contain a row for path 7 describing the recursion. The table
summary remains unchanged since additional columns are not needed for table
MAINTENANCE_RECORD/2.
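
Paths 6 and 7 together amount to a recursive retrieval. The following is a minimal sketch of one way to express it, not part of the course material. It assumes Python's sqlite3 module and SQLite's WITH RECURSIVE clause; the table is reduced to the columns listed for MAINTENANCE_RECORD/2, and the sample rows and the function name subrecords are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MAINTENANCE_RECORD (
    Maintenance_Number INTEGER PRIMARY KEY,
    Owning_Record      INTEGER
        REFERENCES MAINTENANCE_RECORD (Maintenance_Number) ON DELETE CASCADE,
    Date_Maintenance   TEXT,
    Type_Maintenance   TEXT
);
-- Invented sample rows: record 100 owns 101 and 102, record 101 owns 103.
INSERT INTO MAINTENANCE_RECORD VALUES (100, NULL, '2001-03-01', 'OVERHAUL');
INSERT INTO MAINTENANCE_RECORD VALUES (101, 100,  '2001-03-02', 'ENGINE');
INSERT INTO MAINTENANCE_RECORD VALUES (102, 100,  '2001-03-03', 'CABIN');
INSERT INTO MAINTENANCE_RECORD VALUES (103, 101,  '2001-03-04', 'TURBINE');
INSERT INTO MAINTENANCE_RECORD VALUES (200, NULL, '2001-04-01', 'INSPECTION');
""")

def subrecords(conn, maintenance_number):
    """Return all direct and indirect subrecords of a maintenance record."""
    sql = """
    WITH RECURSIVE sub (Maintenance_Number, Date_Maintenance, Type_Maintenance) AS (
        -- path 6: immediate subrecords of the specified record
        SELECT Maintenance_Number, Date_Maintenance, Type_Maintenance
        FROM MAINTENANCE_RECORD
        WHERE Owning_Record = ?
        UNION ALL
        -- path 7: records whose Owning_Record is a subrecord found so far
        SELECT m.Maintenance_Number, m.Date_Maintenance, m.Type_Maintenance
        FROM MAINTENANCE_RECORD AS m
        JOIN sub ON m.Owning_Record = sub.Maintenance_Number
    )
    SELECT Maintenance_Number, Date_Maintenance, Type_Maintenance
    FROM sub
    ORDER BY Maintenance_Number
    """
    return conn.execute(sql, (maintenance_number,)).fetchall()

rows = subrecords(conn, 100)
```

The first SELECT corresponds to path 6 and the second to path 7, which is why the subrecords need column Maintenance_Number even though the business process does not ask for it explicitly.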

Sample Path and Table Summaries
Path Summary

#  Source Table          Source Columns       Target Table          Target Columns
1  -                     -                    MAINTENANCE_RECORD/1  Maintenance_Number
2  MAINTENANCE_RECORD/1  Employee_Number      EMPLOYEE              Employee_Number
3  MAINTENANCE_RECORD/1  Aircraft_Number      AIRCRAFT              Aircraft_Number
4  AIRCRAFT              Type_Code            AIRCRAFT_TYPE         Type_Code
5  AIRCRAFT_TYPE         Manufacturer_Code    MANUFACTURER          Manufacturer_Code
6  MAINTENANCE_RECORD/1  Maintenance_Number   MAINTENANCE_RECORD/2  Owning_Record
7  MAINTENANCE_RECORD/2  Maintenance_Number   MAINTENANCE_RECORD/2  Owning_Record

Table Summary

Table                 Columns
MAINTENANCE_RECORD/1  Maintenance_Number, Date_Maintenance, Type_Maintenance,
                      Employee_Number, Aircraft_Number
EMPLOYEE              Employee_Number, Last_Name, First_Name, Middle_Initial
AIRCRAFT              Aircraft_Number, Date_Manufactured, Date_in_Service, Type_Code,
                      Model_Number
AIRCRAFT_TYPE         Type_Code, Manufacturer_Code
MANUFACTURER          Manufacturer_Code, Company_Name
MAINTENANCE_RECORD/2  Owning_Record, Date_Maintenance, Type_Maintenance,
                      Maintenance_Number

Figure 10-7. Sample Path and Table Summaries CF182.0

Notes:
The visual illustrates the path summary and the table summary for the sample business
process described earlier in this unit. For each path of the structure diagram, the path
summary lists the source table and the target table. It also specifies the source-table and
target-table columns that are joined.
For each usage of a table, the table summary specifies the columns needed.
The notes for the previous visual describe how the path summary and the table summary
for the sample business process are derived.
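
Paths 1 through 5 translate fairly directly into a chain of joins. The sketch below is an illustration under stated assumptions rather than the course's own solution: it uses Python's sqlite3 module, reduces every table to the columns of the table summary, and uses invented sample rows. LEFT JOIN is our choice here so that a maintenance record is still returned when the aircraft information has been deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MANUFACTURER (Manufacturer_Code TEXT PRIMARY KEY, Company_Name TEXT);
CREATE TABLE AIRCRAFT_TYPE (Type_Code TEXT PRIMARY KEY, Manufacturer_Code TEXT);
CREATE TABLE AIRCRAFT (Aircraft_Number TEXT PRIMARY KEY, Date_Manufactured TEXT,
                       Date_in_Service TEXT, Type_Code TEXT, Model_Number TEXT);
CREATE TABLE EMPLOYEE (Employee_Number INTEGER PRIMARY KEY, Last_Name TEXT,
                       First_Name TEXT, Middle_Initial TEXT);
CREATE TABLE MAINTENANCE_RECORD (Maintenance_Number INTEGER PRIMARY KEY,
                                 Date_Maintenance TEXT, Type_Maintenance TEXT,
                                 Employee_Number INTEGER, Aircraft_Number TEXT);
-- Invented sample rows; aircraft A2 is no longer present in table AIRCRAFT.
INSERT INTO MANUFACTURER VALUES ('MF1', 'Example Aircraft Co.');
INSERT INTO AIRCRAFT_TYPE VALUES ('T1', 'MF1');
INSERT INTO AIRCRAFT VALUES ('A1', '1995-01-10', '1995-06-01', 'T1', 'M1');
INSERT INTO EMPLOYEE VALUES (7, 'Smith', 'Ann', 'B');
INSERT INTO MAINTENANCE_RECORD VALUES (100, '2001-03-01', 'OVERHAUL', 7, 'A1');
INSERT INTO MAINTENANCE_RECORD VALUES (200, '2001-04-01', 'INSPECTION', 7, 'A2');
""")

QUERY = """
SELECT mr.Maintenance_Number, mr.Date_Maintenance, mr.Type_Maintenance,
       e.Last_Name, e.First_Name, e.Middle_Initial,
       mr.Aircraft_Number, a.Date_Manufactured, a.Date_in_Service,
       a.Type_Code, a.Model_Number, m.Company_Name
FROM MAINTENANCE_RECORD AS mr
LEFT JOIN EMPLOYEE      AS e ON e.Employee_Number   = mr.Employee_Number   -- path 2
LEFT JOIN AIRCRAFT      AS a ON a.Aircraft_Number   = mr.Aircraft_Number   -- path 3
LEFT JOIN AIRCRAFT_TYPE AS t ON t.Type_Code         = a.Type_Code          -- path 4
LEFT JOIN MANUFACTURER  AS m ON m.Manufacturer_Code = t.Manufacturer_Code  -- path 5
WHERE mr.Maintenance_Number = ?                                            -- path 1
"""

def maintenance_details(conn, maintenance_number):
    """Follow paths 1 to 5 for one maintenance record; None if it does not exist."""
    return conn.execute(QUERY, (maintenance_number,)).fetchone()

with_aircraft = maintenance_details(conn, 100)
without_aircraft = maintenance_details(conn, 200)
```

Each join condition corresponds to one row of the path summary, and the select list corresponds to the columns of the table summary that the business process displays.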

An Alternate Representation

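[Figure: the structure diagram for the sample business process, with the joined columns
shown inside the table boxes and the arrows drawn from source column to target column.
Boxes: Input; MAINTENANCE_RECORD/1 (Employee_Number, Aircraft_Number,
Maintenance_Number); EMPLOYEE (Employee_Number); AIRCRAFT (Aircraft_Number, Type_Code);
MAINTENANCE_RECORD/2 (Owning_Record, Maintenance_Number); AIRCRAFT_TYPE (Type_Code,
Manufacturer_Code); MANUFACTURER (Manufacturer_Code).
Annotation: Still need table summary, but not path summary.]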

Figure 10-8. An Alternate Representation CF182.0

Notes:
You might already have wondered why the path summary is needed. Indeed, you can
show the joined columns immediately in the structure diagram as done on the above visual
for the sample business process used so far. The arrows then point from the source
column to the target column and labels are no longer needed for the arrows. The names for
the tables are outside the boxes, next to them. The table summary is still necessary since it
is impractical to incorporate all needed columns into the diagram.
The resulting diagram seems to be simpler and clearer. However, this representation does
not always work well. It works well for those cases where a single column is used to
navigate from one table to the next. The representation becomes complex and blurred if you
must join the tables on multiple columns and the columns are named differently in the two
tables. It becomes especially confusing if you must join a table with multiple other tables on
multiple columns and the columns overlap.
Furthermore, the above representation requires more space and you might find it more
difficult to squeeze it onto a single page.

In contrast, if you are using a path summary, you can even omit the structure diagram since
path summary and table summary together contain all necessary information. The
structure diagram just provides a graphical view of the flow between the tables.
Now, you have a choice. Make the best of it.
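
To illustrate that the flow between the tables can be read off the path summary alone, the following sketch holds the path summary of the first example as plain Python data and derives one arrow per path from it. The names Path, arrows, and join_condition are invented for this illustration, and path 1 is taken to lead from the input to MAINTENANCE_RECORD/1, which is our reading of the summary.

```python
from collections import namedtuple

# One row of a path summary.
Path = namedtuple(
    "Path", "number source_table source_columns target_table target_columns")

MR1, MR2 = "MAINTENANCE_RECORD/1", "MAINTENANCE_RECORD/2"

PATH_SUMMARY = [
    Path(1, None, (), MR1, ("Maintenance_Number",)),
    Path(2, MR1, ("Employee_Number",), "EMPLOYEE", ("Employee_Number",)),
    Path(3, MR1, ("Aircraft_Number",), "AIRCRAFT", ("Aircraft_Number",)),
    Path(4, "AIRCRAFT", ("Type_Code",), "AIRCRAFT_TYPE", ("Type_Code",)),
    Path(5, "AIRCRAFT_TYPE", ("Manufacturer_Code",),
         "MANUFACTURER", ("Manufacturer_Code",)),
    Path(6, MR1, ("Maintenance_Number",), MR2, ("Owning_Record",)),
    Path(7, MR2, ("Maintenance_Number",), MR2, ("Owning_Record",)),
]

def arrows(paths):
    """Render each path as one labeled arrow of the structure diagram."""
    return ["%s --%d--> %s" % (p.source_table or "INPUT", p.number, p.target_table)
            for p in paths]

def join_condition(path):
    """Spell out the column pairs joined by a path; None for the input path."""
    if path.source_table is None:
        return None
    return " AND ".join(
        "%s.%s = %s.%s" % (path.source_table, s, path.target_table, t)
        for s, t in zip(path.source_columns, path.target_columns))

diagram = arrows(PATH_SUMMARY)
```

Because source and target columns are stored as sequences, the same representation also covers joins on multiple, differently named columns, the case in which the alternate diagram becomes hard to read.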

Processes and Logical Data Structures


• A logical data structure describes a continuous flow through a subset of the tables of
  the application domain
  - Data found in one table used unmodified to select the rows of the next table
  - A business process may have many logical data structures
• Many logical data structures are trivial
  - Just access a single table
  - Structure diagram consists of input box, box for table, and connecting arrow
  - Table summary lists columns needed/modified in table
  - Path summary identifies search argument for table
• Many interconnections of the logical data structures represent
  primary-key/foreign-key interrelationships, but not all
• Most of the time, the input for a logical data structure is part of the input
  for the business process, but not always
• Columns needed for the tables should coincide with the data read or written
  by the business process as described in the process inventory

Figure 10-9. Processes and Logical Data Structures CF182.0

Notes:
The example we studied on the previous pages only required a single logical data
structure. As we have already discussed, a logical data structure describes a continuous
flow through a subset of the tables of the application domain. As for Join operations, the
data found in a row is used unmodified to select the rows of the next table. This entails that
many business processes will require multiple logical data structures since they do not just
use the values found to select the rows of the next table. Rather, they use additional criteria
(search arguments) or derived search arguments. Different or additional search arguments
require a separate logical data structure.
Many logical data structures are simple because they access a single table. In particular,
this applies to the logical data structures involving insert, update, or delete operations
because the corresponding SQL statements only allow the specification of a single table.
The structure diagrams consist of an input box, the box for the table, and an arrow
connecting them. The table summary lists the accessed columns of the table. The path
summary shows through which columns the table is entered, i.e., the search argument for
the rows retrieved, updated, or deleted or the columns inserted. You may opt to omit the
structure diagram for these logical data structures since it does not provide much
information.
Many paths in logical data structures represent primary-key/foreign-key interrelationships,
but not all, as we have seen for the previous example.
Most of the time, the input for a logical data structure is part of the input for the business
process, but not always. Secondary logical data structures may use a derived input.
The columns needed for the various tables should coincide with the data read or written by
the business process as described in the data inventory (see Unit 5 - Data and Process
Inventories).

Example 2 - Business Process


Assign Captain for Flight

1. It is verified that the specified flight and pilot exist. If the flight or the pilot does not exist, an
appropriate error message is displayed and the business process ends.
2. If pilot and flight exist, it is checked if the pilot has the license to fly the aircraft model
for the leg for the flight. If the pilot cannot fly the aircraft model, an appropriate error
message is displayed and the business process ends.
3. If the pilot has the license to fly the aircraft model, it is checked if the pilot has
already been assigned to the flight. If the pilot is already captain or copilot for the
flight, an appropriate message is displayed and the business process ends.
4. If the pilot has not yet been assigned to the flight, it is checked if another pilot is
already captain for the flight. If so, a message is displayed containing employee
number, last name, and first name of the current captain and the business process
ends.
5. If a captain has not yet been assigned to the flight, the specified pilot becomes the
captain for the flight.
6. A message is displayed confirming that the pilot has been assigned as captain to the
flight. The message includes employee number, last name, and first name of the
assigned captain.

Figure 10-10. Example 2 - Business Process CF182.0

Notes:
The visual displays the textual description of business process Assign Captain for Flight
which we already discussed in Unit 5 - Data and Process Inventories. This business
process will require multiple, fairly simple, logical data structures.

Example 2 - Structure Diagrams

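[Figure: five structure diagrams, each starting from an INPUT box; the numbers are the
path labels.
Structure 1: INPUT -(1)-> FLIGHT -(2)-> LEG
Structure 2: INPUT -(1)-> PILOT_FOR_AM
Structure 3: INPUT -(1)-> PILOT_ASSIGNMENT
Structure 4: INPUT -(1)-> EMPLOYEE
Structure 5: INPUT -(1)-> PILOT_ASSIGNMENT]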

Figure 10-11. Example 2 - Structure Diagrams CF182.0

Notes:
The sample business process described on the previous visual requires multiple logical
data structures as explained in the following:
1. The first two steps of the business process verify that the specified flight and pilot exist
and the pilot has the license to fly the aircraft model for (the leg for) the flight.
As a matter of fact, we need not explicitly verify that the specified employee is a pilot. It is
sufficient to verify that he/she belongs to the persons having the license to fly the aircraft
model for the flight. If we do not find him/her in the list of the persons, the business
process ends anyway. If he/she is in the list, we know that the specified employee is a
pilot. The referential constraints for table PILOT_FOR_AM enforce this. Table
PILOT_FOR_AM which contains a row for every valid pilot/aircraft model combination is
constrained by table PILOT.
The point discussed represents an implementation detail. It confirms that the application
programmers should participate in the establishment of the logical data structures.

To verify that the flight exists, we must access table FLIGHT using the specified flight
number, airport of departure, airport of arrival, and flight locator. Using the values found
in columns Flight_Number, From, and To, we must navigate to table LEG to determine
the aircraft model for the flight (columns Type_Code and Model_Number).
To verify that the specified pilot can fly the aircraft model, we have two choices:
• We can access table PILOT_FOR_AM just using the type code and the model number
for the aircraft model. In this case, we need to retrieve all pilots that can fly the aircraft
model until we have found the specified pilot or know that he/she is not in the list.
• We can access table PILOT_FOR_AM using the type code and the model number for
the aircraft model and the employee number of the specified pilot. In this case, we will
retrieve at most one row. If a row is returned, the specified employee is a pilot and can
fly the aircraft model. If a row is not returned, the specified employee is not a pilot or
cannot fly the aircraft model. In either case, he/she must not be considered for the flight.
If we took the first choice, we could continue the structure diagram to table
PILOT_FOR_AM since the value found in table LEG is used unmodified to navigate to
table PILOT_FOR_AM.
For the second choice, the search arguments for table PILOT_FOR_AM are the type
code and model number found in table LEG and the specified employee number. Thus,
a second logical data structure is required.
The first choice is a poor performer and we will choose the second alternative assuming
that an index is provided for the primary key of table PILOT_FOR_AM.
Since we choose the second alternative, the structure diagram for Structure 1 ends with
table LEG. Path summary and table summary for the logical data structure are shown in
Figure 10-12.
2. As explained before, we will use the type code and model number for the aircraft model
and the employee number for the pilot to access table PILOT_FOR_AM. Therefore, we
need a second logical data structure (Structure 2). Its structure diagram is extremely
simple since only one table is accessed. It consists of an input box, table
PILOT_FOR_AM, and the arrow interconnecting them. The structure diagram does not
continue further because we must use different inputs for the subsequent steps of the
business process.
Path summary and table summary for the logical data structure are shown in Figure 10-12.
3. Steps 3 and 4 of the business process check if the pilot has already been assigned to
the flight or if another pilot is already captain for the flight.
Both questions can be answered by a single access to table PILOT_ASSIGNMENT. For
this access, only the flight information (flight number, airport of departure, airport of
arrival, and flight locator) is used and not the employee number of the pilot. At most, two
rows are returned: one for the captain of the flight and one for the copilot. The returned
rows are then examined.

The appropriate logical data structure is Structure 3. Its path summary and table
summary are shown in Figure 10-13.
Note that columns Employee_Number and Pilot_Function must be retrieved to make the
necessary decisions.
4. If another pilot has already been assigned as captain to the flight, the fourth step of the
business process requests that employee number, last name, and first name of that pilot
be displayed. For this, we need a further logical data structure (Structure 4). Its path
summary and table summary are shown in Figure 10-13.
You might ask why Structure 3 is not continued to table EMPLOYEE. Continuing the
structure to table EMPLOYEE would mean that table EMPLOYEE is accessed for
every row retrieved from table PILOT_ASSIGNMENT. However, the employee data is not
needed for an existing copilot assignment, and we do not want to make unnecessary
accesses for the no-error cases.
5. The fifth step of the business process assigns the specified pilot as captain to the flight.
This gives rise to logical data structure Structure 5. Its path summary and table summary
are shown in Figure 10-14.
At first glance, the logical data structure seems to be the same as Structure 3.
However, Structure 3 was for retrieval whereas Structure 5 is for the insertion of rows
and its path summary is different. As target columns, it shows all columns of table
PILOT_ASSIGNMENT meaning that they are input for the insert request.
6. The final step of the business process (Step 6) requests that employee number, last
name, and first name of the newly assigned captain be displayed. This requires an
access to table EMPLOYEE. For this access, we do not need an additional logical data
structure. Structure 4 can be used with the employee number of the new captain for the
flight.
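
The five logical data structures can be sketched as one small program. This is a hedged illustration, not the course's implementation: it assumes Python's sqlite3 module, reduced tables with invented sample rows, a composite primary key for PILOT_FOR_AM, and the values 'CAPTAIN' and 'COPILOT' for column Pilot_Function. Columns From and To are quoted because they are SQL keywords.

```python
import sqlite3

def setup():
    """Create reduced tables with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript('''
    CREATE TABLE FLIGHT (Flight_Number TEXT, "From" TEXT, "To" TEXT,
                         Flight_Locator TEXT);
    CREATE TABLE LEG (Flight_Number TEXT, "From" TEXT, "To" TEXT,
                      Type_Code TEXT, Model_Number TEXT);
    CREATE TABLE PILOT_FOR_AM (Type_Code TEXT, Model_Number TEXT,
                               Employee_Number INTEGER,
                               PRIMARY KEY (Type_Code, Model_Number, Employee_Number));
    CREATE TABLE PILOT_ASSIGNMENT (Flight_Number TEXT, "From" TEXT, "To" TEXT,
                                   Flight_Locator TEXT, Employee_Number INTEGER,
                                   Pilot_Function TEXT);
    CREATE TABLE EMPLOYEE (Employee_Number INTEGER PRIMARY KEY,
                           Last_Name TEXT, First_Name TEXT);
    INSERT INTO FLIGHT VALUES ('CA100', 'FRA', 'JFK', '2002-02-01');
    INSERT INTO LEG VALUES ('CA100', 'FRA', 'JFK', 'T1', 'M1');
    INSERT INTO EMPLOYEE VALUES (1, 'Smith', 'Ann');
    INSERT INTO EMPLOYEE VALUES (2, 'Jones', 'Bob');
    INSERT INTO EMPLOYEE VALUES (3, 'Miller', 'Eve');
    INSERT INTO PILOT_FOR_AM VALUES ('T1', 'M1', 1);
    INSERT INTO PILOT_FOR_AM VALUES ('T1', 'M1', 2);
    ''')
    return conn

def employee(conn, employee_number):
    # Structure 4: INPUT -> EMPLOYEE
    return conn.execute(
        "SELECT Employee_Number, Last_Name, First_Name FROM EMPLOYEE"
        " WHERE Employee_Number = ?", (employee_number,)).fetchone()

def assign_captain(conn, flight, employee_number):
    """flight = (flight number, airport of departure, airport of arrival, locator)."""
    # Structure 1: INPUT -> FLIGHT -> LEG yields the aircraft model for the flight.
    model = conn.execute(
        '''SELECT l.Type_Code, l.Model_Number
           FROM FLIGHT AS f
           JOIN LEG AS l ON l.Flight_Number = f.Flight_Number
                        AND l."From" = f."From" AND l."To" = f."To"
           WHERE f.Flight_Number = ? AND f."From" = ? AND f."To" = ?
             AND f.Flight_Locator = ?''', flight).fetchone()
    if model is None:
        return "flight does not exist"
    # Structure 2: a single keyed access to PILOT_FOR_AM (the second choice above).
    licensed = conn.execute(
        '''SELECT 1 FROM PILOT_FOR_AM
           WHERE Type_Code = ? AND Model_Number = ? AND Employee_Number = ?''',
        (model[0], model[1], employee_number)).fetchone()
    if licensed is None:
        return "employee is not a pilot or cannot fly the aircraft model"
    # Structure 3: all assignments for the flight, examined by the program.
    assigned = conn.execute(
        '''SELECT Employee_Number, Pilot_Function FROM PILOT_ASSIGNMENT
           WHERE Flight_Number = ? AND "From" = ? AND "To" = ?
             AND Flight_Locator = ?''', flight).fetchall()
    if any(number == employee_number for number, _ in assigned):
        return "pilot already assigned to the flight"
    for number, function in assigned:
        if function == "CAPTAIN":
            # Structure 4 is only entered in this error case.
            return "captain already assigned: %s %s %s" % employee(conn, number)
    # Structure 5: all columns of PILOT_ASSIGNMENT are input for the insert.
    conn.execute("INSERT INTO PILOT_ASSIGNMENT VALUES (?, ?, ?, ?, ?, ?)",
                 tuple(flight) + (employee_number, "CAPTAIN"))
    # Step 6 reuses Structure 4 with the employee number of the new captain.
    return "assigned as captain: %s %s %s" % employee(conn, employee_number)

conn = setup()
flight = ("CA100", "FRA", "JFK", "2002-02-01")
results = [assign_captain(conn, flight, n) for n in (3, 1, 1, 2)]
```

Each access uses its own search arguments, which is why the steps cannot be folded into one continuous structure: only the step from FLIGHT to LEG reuses the values found in the previous table unmodified.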

Example 2 - Path and Table Summaries (1 of 3)


Structure 1 - Path Summary

#  Source Table  Source Columns                    Target Table  Target Columns
1  -             -                                 FLIGHT        1: Flight_Number, 2: From,
                                                                 3: To, 4: Flight_Locator
2  FLIGHT        1: Flight_Number, 2: From, 3: To  LEG           1: Flight_Number, 2: From,
                                                                 3: To

Structure 1 - Table Summary

Table   Columns
FLIGHT  Flight_Number, From, To, Flight_Locator
LEG     Flight_Number, From, To, Type_Code, Model_Number

Structure 2 - Path Summary

#  Source Table  Source Columns  Target Table  Target Columns
1  -             -               PILOT_FOR_AM  1: Type_Code, 2: Model_Number,
                                               3: Employee_Number

Structure 2 - Table Summary

Table         Columns
PILOT_FOR_AM  Type_Code, Model_Number, Employee_Number

Figure 10-12. Example 2 - Path and Table Summaries (1 of 3) CF182.0

Notes:
This visual illustrates the path and table summaries for logical data structures Structure 1
and Structure 2 for the second sample business process.

Example 2 - Path and Table Summaries (2 of 3)
Structure 3 - Path Summary

#  Source Table  Source Columns  Target Table      Target Columns
1  -             -               PILOT_ASSIGNMENT  1: Flight_Number, 2: From, 3: To,
                                                   4: Flight_Locator

Structure 3 - Table Summary

Table             Columns
PILOT_ASSIGNMENT  Flight_Number, From, To, Flight_Locator, Employee_Number,
                  Pilot_Function

Structure 4 - Path Summary

#  Source Table  Source Columns  Target Table  Target Columns
1  -             -               EMPLOYEE      Employee_Number

Structure 4 - Table Summary

Table     Columns
EMPLOYEE  Employee_Number, Last_Name, First_Name

Figure 10-13. Example 2 - Path and Table Summaries (2 of 3) CF182.0

Notes:
This visual illustrates the path and table summaries for logical data structures Structure 3
and Structure 4 for the second sample business process.

Example 2 - Path and Table Summaries (3 of 3)

Structure 5 - Path Summary

#  Source Table  Source Columns  Target Table      Target Columns
1  -             -               PILOT_ASSIGNMENT  1: Flight_Number, 2: From, 3: To,
                                                   4: Flight_Locator, 5: Employee_Number,
                                                   6: Pilot_Function

Structure 5 - Table Summary

Table             Columns
PILOT_ASSIGNMENT  Flight_Number, From, To, Flight_Locator, Employee_Number,
                  Pilot_Function

Figure 10-14. Example 2 - Path and Table Summaries (3 of 3) CF182.0

Notes:
This visual illustrates path summary and table summary for logical data structure Structure
5 for the second sample business process.

Characteristics of Views
• Represent subsets of the data in the tables of the application domain
  - May comprise rows and columns of multiple tables
  - Selected columns can be ordered in any way desired
  - Selected columns can be renamed
• When defining a view, a description of the data represented by the view is stored
• A view receives a name which can be used in SQL statements where table names
  can be used
  - During execution, SQL statement replaced by SQL statement only containing
    actual column and table names which is then executed
  - Views comprising multiple tables cannot be used in INSERT, UPDATE, or
    DELETE statements
• When the data described by the view is displayed, it is presented in form of a table
  - All data comes from the base tables and not from a table corresponding
    to the view
  - Data is always up-to-date

Figure 10-15. Characteristics of Views CF182.0

Notes:
Views are database objects representing subsets of columns and rows of one or more
tables. Thus, by means of views, you can represent the views logical data structures have
of the data in the tables of the application domain.
For example, you could define a view joining tables MAINTENANCE_RECORD,
AIRCRAFT, and EMPLOYEE and selecting a subset of their rows and columns:
• The rows of tables MAINTENANCE_RECORD and AIRCRAFT having the same aircraft
number should be combined.
• The resulting rows should be joined with the rows of table EMPLOYEE having the same
employee number.
• The view should contain the following columns of the three tables:
From table MAINTENANCE_RECORD:
Maintenance_Number, Date_Maintenance, Type_Maintenance,
Aircraft_Number, and Employee_Number

From table AIRCRAFT:
Date_Manufactured, Date_in_Service, Type_Code, and Model_Number
From table EMPLOYEE:
Last_Name, First_Name, and Middle_Initial
Column Aircraft_Number of table AIRCRAFT is not needed since tables
MAINTENANCE_RECORD and AIRCRAFT are joined on equal aircraft numbers.
Similarly, column Employee_Number of table EMPLOYEE is not needed because tables
MAINTENANCE_RECORD and EMPLOYEE are joined on equal employee numbers.
• You can add criteria for selecting specific rows. For example, you could request that only
the rows for a specific maintenance number be part of the view.
The view described above represents a subset of the logical data structure for the first
sample business process discussed in conjunction with logical data structures.
Basically, a view can select the rows and columns you can select by means of (a subset of)
the SELECT SQL statement. The columns and rows for a view can come from a single
table or from multiple tables. In the view definition, you can order the desired columns in
any way you want. The order of the columns in the view definition determines the order in
which the columns are made available when using a "SELECT *" to display the data for the
view. You can also rename the columns in the view definition.
When defining a view to the system, you specify the appropriate SELECT statement. The
SELECT statement is not executed. Rather, it is stored as description of the data belonging
to the view.
Like all database objects, views receive names to allow them to be referenced. Their names
can be used in SQL statements where table names can be used. To access the data of a
view, you must specify the name of the view in the appropriate SQL statement. For
example, if you want to retrieve the data represented by a view, you must use the name of
the view (in place of a table name) in a SELECT statement.
When an SQL statement containing a view name is executed, it is replaced, by means of
the view definition, by a different SQL statement only containing actual column and table
names. The derived SQL statement is executed in place of the original SQL statement. The
replacement concept is the reason why views comprising multiple tables cannot be used in
INSERT, DELETE, or UPDATE SQL statements: The resulting SQL statement would not be
valid. For the same reason, there are also other restrictions for views.
When displaying the data for a view by naming the view in a SELECT statement, the data is
presented in form of a table. Since the derived SQL statement only contains references to
the base tables, i.e., the real tables used by the view, all data comes directly from the base
tables. As a consequence, the data is always up-to-date.
Because the displayed data is presented in form of a table, views are also referred to as
virtual tables. They are not real tables.
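
The view described above could be defined as in the following sketch. It is an illustration only: it assumes Python's sqlite3 module, reduces the tables to the columns named in the example, and uses the invented view name MAINTENANCE_OVERVIEW and invented sample rows. The rows are inserted after the view has been defined to show that the view stores a description and not data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AIRCRAFT (Aircraft_Number TEXT PRIMARY KEY, Date_Manufactured TEXT,
                       Date_in_Service TEXT, Type_Code TEXT, Model_Number TEXT);
CREATE TABLE EMPLOYEE (Employee_Number INTEGER PRIMARY KEY, Last_Name TEXT,
                       First_Name TEXT, Middle_Initial TEXT);
CREATE TABLE MAINTENANCE_RECORD (Maintenance_Number INTEGER PRIMARY KEY,
                                 Date_Maintenance TEXT, Type_Maintenance TEXT,
                                 Aircraft_Number TEXT, Employee_Number INTEGER);

-- The SELECT is stored as the description of the view; it is not executed here.
CREATE VIEW MAINTENANCE_OVERVIEW AS
    SELECT mr.Maintenance_Number, mr.Date_Maintenance, mr.Type_Maintenance,
           mr.Aircraft_Number, mr.Employee_Number,
           a.Date_Manufactured, a.Date_in_Service, a.Type_Code, a.Model_Number,
           e.Last_Name, e.First_Name, e.Middle_Initial
    FROM MAINTENANCE_RECORD AS mr
    JOIN AIRCRAFT AS a ON a.Aircraft_Number = mr.Aircraft_Number
    JOIN EMPLOYEE AS e ON e.Employee_Number = mr.Employee_Number;
""")

def overview(conn, maintenance_number):
    """Use the view name where a table name would be used."""
    return conn.execute(
        "SELECT Maintenance_Number, Last_Name, Model_Number"
        " FROM MAINTENANCE_OVERVIEW WHERE Maintenance_Number = ?",
        (maintenance_number,)).fetchone()

empty = overview(conn, 100)  # no base-table rows yet

conn.execute("INSERT INTO AIRCRAFT VALUES ('A1', '1995-01-10', '1995-06-01', 'T1', 'M1')")
conn.execute("INSERT INTO EMPLOYEE VALUES (7, 'Smith', 'Ann', 'B')")
conn.execute("INSERT INTO MAINTENANCE_RECORD VALUES (100, '2001-03-01', 'OVERHAUL', 'A1', 7)")
first = overview(conn, 100)

# A change to a base table is visible through the view immediately.
conn.execute("UPDATE EMPLOYEE SET Last_Name = 'Jones' WHERE Employee_Number = 7")
second = overview(conn, 100)
```

The WHERE clause on Maintenance_Number plays the role of the additional row-selection criterion mentioned in the example; it could equally be placed in the view definition itself.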

Usage of Views

• Views allow you to limit the data an end user can see or change
  - Data security
• End users/Business processes only see data they are interested in
  - Ease of use
• Explicitly name columns in views
  - Resilience against changes in base tables
• Do not allow end users or business processes to access base tables
  - Freedom to change and extend base tables
• Complementary to logical data structures
  - One or more views for each logical data structure
  - A view may serve multiple data structures

Figure 10-16. Usage of Views CF182.0

Notes:
Views are an important tool for achieving data security for your tables since they limit the
data end users or programs can see or change. By not allowing direct access to the actual
tables, you can limit the access of people to the data of the views you authorized them for.
Another positive aspect of views is that end users and business processes only see the
data they are interested in. Thus, the data presented to end users are more readily
understandable and the programs for the business processes need not provide variables
for data they do not need. Consequently, views ease the work of end users and application
programmers.
Explicitly naming the columns in the view definition makes your business processes more
resilient against database changes. If the sequence of the columns in the database changes
due to the redefinition of a table, end users and programs using the view will not realize the
changes and are not impacted. If the actual names of columns change, you can change the
view definition in such a way that end users and programs using the view do not realize the
name changes.

By explicitly naming the columns in the view, you also ensure that end users or programs
do not realize the addition of new columns they are not interested in.
From a design perspective, you should not allow end users or programs to directly access
the base tables, i.e., the actual tables. As a consequence, you have more freedom to
change and extend the tables as long as you ensure that the external appearance of the
views remains unchanged. Furthermore, if all columns are explicitly named in the view
definition, end users and programs selecting all columns via "SELECT *" are not impacted
if new columns are added to the table that are not contained in the view definition.
As we have illustrated by means of the example in the notes for the previous visual, views
complement the logical data structures. For a logical data structure, you may have multiple
views. Conversely, a single view may serve multiple logical data structures.
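
The resilience argument can be tried out with a small experiment. The sketch below assumes Python's sqlite3 module; the view name EMPLOYEE_NAME and the added column Salary are invented for illustration. It compares the columns delivered by "SELECT *" before and after the base table is extended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Employee_Number INTEGER PRIMARY KEY,
                       Last_Name TEXT, First_Name TEXT);
INSERT INTO EMPLOYEE VALUES (1, 'Smith', 'Ann');

-- All columns are named explicitly in the view definition.
CREATE VIEW EMPLOYEE_NAME AS
    SELECT Employee_Number, Last_Name, First_Name FROM EMPLOYEE;
""")

def columns(conn, name):
    """Column names delivered by SELECT * for a table or view."""
    cursor = conn.execute("SELECT * FROM " + name)
    return [description[0] for description in cursor.description]

view_before = columns(conn, "EMPLOYEE_NAME")

# Extend the base table with a column the users of the view are not interested in.
conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Salary INTEGER")

view_after = columns(conn, "EMPLOYEE_NAME")
table_after = columns(conn, "EMPLOYEE")
```

A program that selects from the base table sees the new column, whereas a program that selects from the view receives the same columns as before.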

Checkpoint

Exercise — Unit Checkpoint


1. Name the two major inputs for the development of the logical data
structures.
_____________________________________________________
_____________________________________________________
_____________________________________________________

2. What are the two main purposes of logical data structures?


_____________________________________________________
_____________________________________________________
_____________________________________________________

3. A logical data structure reflects the data flow between the tables of
the application domain for a business process or a part of it. (T/F)

4. Since the logical data structures are intended for the application
programmers, the database designer is not involved in their
development. (T/F)

5. Which of the following choices are correct? If problems are detected during the
development of the logical data structures, the database designer should ...
a. Patch the tables and not worry about the earlier design steps.
b. Verify all steps of the design process.
c. Restart the design process with the establishment of the tables.
d. Restart the design process with the establishment of the tuple types.

6. Name the components of a logical data structure.


_____________________________________________________
_____________________________________________________

7. What is the purpose of the structure diagram?


_____________________________________________________
_____________________________________________________
_____________________________________________________

8. What does the path summary specify?


_____________________________________________________
_____________________________________________________
_____________________________________________________

9. What does the table summary specify?


_____________________________________________________
_____________________________________________________
_____________________________________________________

10. All interconnections in a structure diagram are
primary-key/foreign-key interrelationships. (T/F)

11. Views are only descriptions of data. They are not real tables. (T/F)

12. Name four advantages of views.


_____________________________________________________
_____________________________________________________
_____________________________________________________

Unit Summary

The logical data structures for a business process identify:
- The tables and columns for the data elements of the business process
- How the tables for the business process are interconnected
The components of logical data structures are the structure diagram,
the path summary, and the table summary.
The structure diagram is a graphical representation of the
interconnections between the tables.
The path summary describes the interconnections between the tables:
- From table to table via columns
The table summary describes which columns are needed for the
different uses of the tables.
Views allow you to describe subsets of the tables of the application domain.
Views provide data security, ease of use, resilience against database
changes, and freedom to change tables.
For a logical data structure, you can have one or more views.
The same view may be used by multiple logical data structures.

Figure 10-17. Unit Summary CF182.0

Notes:
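The view properties summarized in this unit can be tried out with a small sketch. It assumes SQLite, accessed through Python's sqlite3 module, as a stand-in for the target database system; the table PILOT, the view PILOT_CONTACT, and all column names and values are hypothetical and chosen only for illustration.

```python
import sqlite3

def build_demo():
    """Create a hypothetical PILOT table and a view exposing a subset of it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pilot (
            employee_no  INTEGER PRIMARY KEY,
            name         TEXT NOT NULL,
            phone        TEXT,
            last_checkup TEXT   -- sensitive column, not exposed by the view
        );
        -- The view is only a stored description of data: it names a subset
        -- of the columns and holds no rows of its own.
        CREATE VIEW pilot_contact AS
            SELECT employee_no, name, phone FROM pilot;
    """)
    conn.execute(
        "INSERT INTO pilot VALUES (1, 'A. Example', '555-0100', '2001-11-30')"
    )
    return conn

conn = build_demo()

# Columns visible through the view (data security, ease of use).
view_columns = [d[0] for d in conn.execute("SELECT * FROM pilot_contact").description]
rows_before = conn.execute("SELECT name, phone FROM pilot_contact").fetchall()

# A change to the base table is immediately visible through the view.
conn.execute("UPDATE pilot SET phone = '555-0199' WHERE employee_no = 1")
rows_after = conn.execute("SELECT name, phone FROM pilot_contact").fetchall()
```

Because programs using PILOT_CONTACT never name the hidden column, that column could later be moved or renamed without affecting them, provided the view definition is adjusted accordingly.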


Appendix A. Sample Problem Statement

Overview
Come Aboard (CAB) is an airline servicing a set of airports with its
aircraft. As employees, it has pilots flying the aircraft, mechanics
maintaining and servicing the aircraft, and other personnel for various
service functions.
CAB wants to administer flight planning, pilot assignment, and aircraft
maintenance activities by means of a database management system.

Business Object Types


CAB wants to store information about the following business object
types in its database:
Aircraft Models
For its flying activities, CAB uses aircraft of different types or, more
precisely, models such as Boeing 737, Model 500, or Airbus A320,
Model 200. For the aircraft models it owns or has on order, CAB wants
to maintain information in its database such as their category (e.g.,
JET or TURBOPROP), length, height, wing span, or number of
engines.
The aircraft models can be uniquely identified by their type code (e.g.,
B737) together with their model number (e.g., 500).
Aircraft
CAB owns multiple aircraft of the various aircraft models. For the
aircraft it owns, CAB wants to maintain information such as the date
when the aircraft was acquired, the engines mounted on the aircraft, or
the seats of the aircraft.
Each aircraft has a unique serial number. This serial number is unique
across aircraft models.
Airports
CAB services a set of airports with its aircraft. For these airports, as
well as for airports CAB plans to service in the near future, CAB wants
to keep information in its database such as the airport code, the
location of the airport, the address of CAB's city ticketing office, or the
address of CAB's airport office.
The airport codes uniquely identify the various airports.


Pilots
CAB wants to store information (e.g., name, address, phone number,
or date of previous medical check-up) for its pilots. Like every employee,
pilots have a unique employee serial number.
Mechanics
CAB wants to store information (e.g., name, address, phone number,
or area of expertise) for its mechanics. Like every employee, mechanics
have a unique employee serial number.
Itineraries
Itineraries are ordered collections of consecutive nonstop connections
between airports which are called legs. This means that the ending
airport for the previous leg is always the starting airport for the next
leg.
Itineraries have unique flight numbers (e.g., YY1842). All legs of an
itinerary are operated under the flight number of the itinerary. CAB
wants to maintain information about the itineraries such as the seating
classes offered, the weekdays on which the itinerary is operated
(starting days), and the planned departure and arrival times for the
legs.
Flights
A flight is a scheduled or executed nonstop trip between two airports.
Flights are always related to the legs of itineraries. The information
kept about flights includes, for example, the estimated departure and
arrival times (which might be different from the planned departure and
arrival times for the appropriate leg because of delays) and the actual
departure and arrival times.
The individual flights can be identified by means of a sequence
number, referred to as flight locator, which is unique per itinerary and
leg. Thus, to identify a particular flight, you need to know the flight
number for the itinerary (e.g., YY1842), the airports for the legs (e.g.,
FRA - JFK), and the flight locator (e.g., 453) for the flight.
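The identification rule for flights can be sketched as a composite key. The following is a minimal illustration, assuming SQLite through Python's sqlite3 module; the table name FLIGHT, the column names, and the helper add_flight are hypothetical and are not the table design developed in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flight (
        flight_no      TEXT    NOT NULL,  -- flight number of the itinerary, e.g. YY1842
        from_airport   TEXT    NOT NULL,  -- starting airport of the leg, e.g. FRA
        to_airport     TEXT    NOT NULL,  -- ending airport of the leg, e.g. JFK
        flight_locator INTEGER NOT NULL,  -- sequence number, unique per itinerary and leg
        PRIMARY KEY (flight_no, from_airport, to_airport, flight_locator)
    )
""")

def add_flight(flight_no, from_airport, to_airport, locator):
    """Insert a flight; return False if the composite key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO flight VALUES (?, ?, ?, ?)",
                (flight_no, from_airport, to_airport, locator),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The same flight locator may occur for different legs or itineraries; only the full combination of values must be unique.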
Maintenance Records
As the aircraft are maintained, maintenance records are established
for them. The information gathered as part of the maintenance records
includes, for example, the type of the maintenance performed and the
date of the maintenance.
Each maintenance record has a unique sequence number referred to
as maintenance number.


Business Relationship Types


The following types of relationships exist between the business object
types which CAB wants to implement in its database:
Aircraft Models - Aircraft
For an aircraft model, CAB may have any number of aircraft. In
particular, it is possible that there are no aircraft (yet) for an aircraft
model. Conversely, an aircraft belongs to one and only one aircraft
model.
Aircraft Models - Airports
Before an aircraft can be used for flights to and from an airport, CAB
must acquire start and landing rights for the appropriate aircraft model
for this airport. An aircraft model servicing multiple airports must have
start and landing rights for all these airports. For an airport serviced by
different aircraft models, start and landing rights must be obtained for
all aircraft models servicing the airport.
It is possible that CAB does not have any start and landing rights for
an aircraft model. For example, this may happen if the airports
serviced by this aircraft model are no longer serviced by CAB and,
thus, dropped.
It is also possible that CAB does not have any start and landing rights
for an airport in its database.
Pilots - Aircraft Models
CAB wants to record which pilots can fly the various aircraft models.
Pilots may be able to fly multiple aircraft models. Conversely, an
aircraft model may be flown by different pilots.
It is possible that, temporarily, a pilot cannot fly any of the aircraft
models. It is also possible for an aircraft model that CAB does not
have a pilot that can fly the aircraft of this model. For example, this
may be the case for a newly ordered aircraft model for which CAB has
not yet hired a pilot.
Airports - Itineraries
An itinerary consists of one or more legs. The legs are nonstop
connections between two airports, the starting and the ending airports
for the leg. Airports can be the starting or ending points for legs of
multiple itineraries.
If an airport is no longer needed by CAB and is deleted, all itineraries
for which the airport had been a stopover should be deleted as well.


Itineraries - Flights
For each leg of an itinerary, there may be multiple flights. These can
be scheduled flights or completed flights. Completed flights are kept
for a certain period of time.
A flight always applies to one leg of one itinerary.
Aircraft Models - Legs
Aircraft models are assigned to the legs of an itinerary to define the
kind of aircraft for the flights for the legs. At all times, a leg must have
one, and only one, aircraft model assigned to it. The assignment is
made when the leg is established, but may be changed.
An aircraft model may be assigned to multiple legs. It need not be
assigned to any legs.
Aircraft - Flights
Aircraft are assigned to flights. Flights represent nonstop connections.
Therefore, only one aircraft is assigned to a flight. An aircraft can be
assigned to multiple flights. The aircraft assignment is not necessarily
made at the point in time when the flight is scheduled.
It is possible that, at a given point in time, an aircraft has not been
assigned to any flight.
Pilots - Flights
To each flight, one pilot is assigned as (flight) captain and another pilot
as copilot. This assignment is not necessarily made at the point in time
when the flight is scheduled, but it must be made at least three weeks
before the flight is performed.
A pilot can function as captain or copilot for multiple flights. It is
possible that, at a given point in time, a pilot does not have any flight
assignments.
Mechanics - Aircraft Models
Mechanics are trained to repair the aircraft of a specific aircraft model.
A mechanic can be trained for multiple aircraft models. For an aircraft
model, multiple mechanics may have the required training.
It is possible that, temporarily, a mechanic does not have the training
for any of the aircraft models. Conversely, it is possible that, for an
aircraft model, CAB does not have a trained mechanic.
Mechanics - Aircraft
CAB wants to record which mechanics are scheduled for the next
maintenance service of an aircraft. A mechanic may perform the
maintenance service for multiple aircraft. Conversely, multiple
mechanics may be assigned to a single aircraft.
It is possible that the next maintenance service has not yet been
scheduled for an aircraft. At a given moment, it is also possible that a
mechanic has not been assigned to any aircraft.
Mechanics - Maintenance Records
For every maintenance performed, a maintenance record is
established by a mechanic. For each maintenance record, one and
only one mechanic is responsible.
If a mechanic leaves the company, his/her maintenance records are
assigned to another mechanic.
Aircraft - Maintenance Records
As an aircraft is serviced, a maintenance record for the aircraft is
established. A maintenance record applies to one and only one
aircraft. For an aircraft, there may be multiple maintenance records.
The maintenance records for an aircraft contain the serial number for
the aircraft. All maintenance records for an aircraft must be kept for the
time the aircraft is owned by CAB and for two years thereafter. This
implies that the maintenance records must still be kept after the
remaining information for the aircraft has been deleted.
Maintenance Records - Maintenance Records
As the consequence of a maintenance activity for an aircraft, other
maintenance activities may be triggered for that aircraft. These
triggered activities have their own maintenance records. CAB wants to
record the relationships between maintenance records.
A maintenance record can have any number of (maintenance)
subrecords. Conversely, a subrecord always belongs to one and only
one maintenance record referred to as owning maintenance record.
Maintenance subrecords do not have special characteristics. They are
normal maintenance records and contain the same type of information
as their owning maintenance records.
If a maintenance record is deleted, all its subrecords are deleted as
well.
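The owning relationship between maintenance records, including the deletion rule, can be sketched as a self-referencing foreign key. This is a minimal illustration, assuming SQLite through Python's sqlite3 module; the table and column names, the sample rows, and the helper delete_record are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE maintenance_record (
        maintenance_no  INTEGER PRIMARY KEY,
        aircraft_serial TEXT NOT NULL,
        -- NULL for a top-level record; otherwise the owning maintenance record.
        owning_no       INTEGER
            REFERENCES maintenance_record (maintenance_no) ON DELETE CASCADE
    )
""")
conn.executemany(
    "INSERT INTO maintenance_record VALUES (?, ?, ?)",
    [
        (1, "SN-1", None),  # top-level record
        (2, "SN-1", 1),     # subrecord of 1
        (3, "SN-1", 2),     # subrecord of 2
        (4, "SN-1", None),  # unrelated top-level record
    ],
)

def delete_record(maintenance_no):
    """Delete a record and return the maintenance numbers that remain."""
    with conn:
        conn.execute(
            "DELETE FROM maintenance_record WHERE maintenance_no = ?",
            (maintenance_no,),
        )
    rows = conn.execute(
        "SELECT maintenance_no FROM maintenance_record ORDER BY maintenance_no"
    )
    return [row[0] for row in rows]
```

Deleting record 1 in this sketch also removes records 2 and 3, because each subrecord is deleted together with its owning record.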

Business Constraints
The following constraints exist for the business object types and
business relationship types that CAB wants to maintain in its
database:


Number of Engines on Aircraft


An aircraft cannot have more engines mounted than the aircraft model
allows.
To be enforced when an engine is added to an aircraft.
The request to add an engine to an aircraft must be rejected if it
violates the constraint.
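One way to enforce this constraint at the point in time named above is a trigger on the insertion of engines. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; all table, column, and trigger names, the sample rows, and the helper mount_engine are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aircraft_model (
        type_code TEXT, model_no TEXT, number_of_engines INTEGER NOT NULL,
        PRIMARY KEY (type_code, model_no)
    );
    CREATE TABLE aircraft (
        serial_no TEXT PRIMARY KEY,
        type_code TEXT NOT NULL, model_no TEXT NOT NULL
    );
    CREATE TABLE engine (
        engine_no TEXT PRIMARY KEY, aircraft_serial TEXT NOT NULL
    );
    -- Enforced when an engine is added: reject the request if the aircraft
    -- already carries as many engines as its aircraft model allows.
    CREATE TRIGGER engine_limit BEFORE INSERT ON engine
    BEGIN
        SELECT RAISE(ABORT, 'aircraft model does not allow more engines')
        WHERE (SELECT COUNT(*) FROM engine
               WHERE aircraft_serial = NEW.aircraft_serial)
           >= (SELECT m.number_of_engines
               FROM aircraft a
               JOIN aircraft_model m
                 ON m.type_code = a.type_code AND m.model_no = a.model_no
               WHERE a.serial_no = NEW.aircraft_serial);
    END;
    INSERT INTO aircraft_model VALUES ('B737', '500', 2);  -- sample data
    INSERT INTO aircraft VALUES ('SN-1', 'B737', '500');   -- sample data
""")

def mount_engine(engine_no, aircraft_serial):
    """Try to add an engine; return False if the request is rejected."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO engine VALUES (?, ?)", (engine_no, aircraft_serial)
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

Whether such a rule is implemented by a trigger, by application logic, or by other means is a design decision; the sketch only shows that the rule, its enforcement time, and its violation action can all be expressed together.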
Aircraft for Flight Must Belong to Aircraft Model for Leg
The aircraft assigned to a flight must belong to the aircraft model for
the leg for the flight.
To be enforced when an aircraft is assigned to a flight or the aircraft
assignment is changed.
The aircraft assignment must be rejected if it violates the constraint.
Also to be verified if the aircraft model for a leg is changed.
In this case, previous aircraft assignments for flights for the leg must
be canceled and appropriate notifications must be given.
Captain and Copilot Must Be Different
A pilot cannot be captain and copilot for the same flight.
To be enforced when a pilot is assigned to a flight or the pilot
assignment is changed.
The pilot assignment must be rejected if it violates the constraint.
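A constraint that involves only values of a single row, such as this one, can be sketched as a check constraint. The following minimal illustration assumes SQLite through Python's sqlite3 module; the table FLIGHT_CREW, its columns, the pilot numbers, and the helper assign_pilots are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flight_crew (
        flight_id  INTEGER PRIMARY KEY,
        captain_no INTEGER,  -- NULL until the assignment is made
        copilot_no INTEGER,  -- NULL until the assignment is made
        -- A comparison involving NULL does not violate the check, so a flight
        -- without pilot assignments is still accepted.
        CHECK (captain_no <> copilot_no)
    )
""")
conn.execute("INSERT INTO flight_crew (flight_id) VALUES (453)")

def assign_pilots(flight_id, captain_no, copilot_no):
    """Assign captain and copilot; return False if the assignment is rejected."""
    try:
        with conn:
            conn.execute(
                "UPDATE flight_crew SET captain_no = ?, copilot_no = ? "
                "WHERE flight_id = ?",
                (captain_no, copilot_no, flight_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the check is evaluated on every insert and update of the row, it covers both the initial assignment and later changes of the assignment.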
Pilots for Flight Must Have License for Aircraft Model for Leg
A pilot for a flight must have the license to fly the aircraft model for the
leg for the flight.
To be checked when a pilot is assigned to a flight or when a previous
pilot assignment is changed.
The pilot assignment is to be rejected if the pilot does not qualify for
the flight.
Also to be verified if the aircraft model for a leg of an itinerary is
changed.
In this case, previous pilot assignments for flights for the leg must be
canceled and appropriate notifications must be given.
Only Trained Mechanics for Aircraft Maintenance
A mechanic can only service an aircraft if he/she has been trained for
the appropriate aircraft model.
To be checked when a mechanic is assigned to the next maintenance
service for an aircraft.
The assignment is to be rejected if the mechanic has not been trained
for the appropriate aircraft model.
Employees Cannot Be Pilots and Mechanics at the Same Time
An employee cannot be a pilot and a mechanic at the same time.
To be checked when a new pilot is added. Also to be checked when a
new mechanic is added.
Only Aircraft Models With Start and Landing Rights for Legs
An aircraft model can only be assigned to a leg of an itinerary if it has
start and landing rights for the airports of the leg.
To be checked when an aircraft model is assigned to a leg or when the
aircraft model assignment is changed.
The aircraft model assignment must be rejected if it violates the
constraint.


Appendix B. Checkpoint Solutions


Unit 1 - Relational Concepts

1.
The relational data model describes the conceptual representation
of the data objects of relational databases and gives guidelines for
their implementation.

2.
c

3.
False

4.
Fields are the columns for a particular row of a table. They are the
actual receptacles for the data stored into a table.

5.
True

6.
True

7.
a, b, d

8.
The main reasons are:
- Identical rows cannot be modified or deleted individually.
- To ensure that the database design is open-ended: Future
application changes may require the retrieval, update, and
deletion of particular rows.

9.
False

10.
True

11.
c


Unit 2 - Views and Results During Database Design

1.
The problem statement for an application domain is a document
describing the types of business objects for the application domain,
the relationships between them, and the business constraints for
both of them.

2.
c

3.
c, a, b

4.
False

5.
True

6.
b, a, a, c, b, b, c, b

7.
An entity-relationship model visualizes the business object types of
the application domain, the relationships between them, and the
business constraints for both of them.

8.
The data inventory is a description of the data elements, i.e., the
elementary data, of the application domain.

9.
a, b


10.
Logical data structures apply to processes or parts of them. They
describe:
- The subset of the tables (of the database for the application
domain) used by the process or the pertinent part of the
process.
- How the process or the part of the process must logically
navigate through the tables in order to accomplish its function.

11.
False


Unit 3 - Problem Statement

1.
a, b, c, e, g

2.
The main sections of a problem statement are:
- An overview of the application domain.
- A description of the business object types.
- A description of the business relationship types.
- A description of the business constraints.

3.
The overview section should:
- Describe what the application domain does.
- Identify the areas of the application domain to be implemented
in the target database.

4.
b, c

5.
True

6.
A business relationship type represents a category of business
relationships, with the same meaning and characteristics, between
the objects of one or more business object types.

7.
For each business relationship type, the problem statement should:
- Contain a textual description of the business relationship type.
- Identify the business object types linked by the business
relationship type.
- Specify how many relationships of the same type an object can
have.
- Describe whether the business relationship type requires an object
of a business object type to have at least one relationship of
that type.
- Specify whether the objects having a relationship with an object
must be deleted when that object is deleted.

8.
True

9.
Cascading business relationship type.

10.
A business constraint represents a restriction for the objects of
business object types, for the relationships of business relationship
types, or for a mixture thereof.

11.
For each business constraint, the problem statement should:
- Contain a textual description of the restriction that must be
adhered to.
- Identify the business object types or business relationship types
to which the restriction applies.
- Specify when the constraint is to be applied.
- Describe the action to be performed if the constraint is violated.

12.
True


Unit 4 - Entity-Relationship Model

1.
The three major components of entity-relationship models are:
- Entity types
- Relationship types
- Constraints

2.
False

3.
An entity type is a conceptual unit representing a class of objects
with the same meaning and characteristics about which information
is to be stored and maintained.
An entity instance is an actual object belonging to an entity type.

4.
The entity key uniquely identifies the instances belonging to
an entity type.
The minimum principle requires that all attributes of the entity key
are necessary for the unique identification of the instances of the
entity type. If an attribute is omitted, the remaining attributes no
longer uniquely identify the instances of the entity type.
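The minimum principle can be illustrated with a small check over sample instances. This is only a sketch: a key is a property of the entity type, that is, of all instances that may ever exist, so a test over sample data can refute a key candidate but never prove it. The function names and the sample aircraft models are hypothetical.

```python
def is_unique(instances, attributes):
    """True if no two instances agree on all of the given attributes."""
    seen = set()
    for instance in instances:
        value = tuple(instance[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_minimal_key(instances, key):
    """True if key is unique and omitting any one attribute destroys uniqueness."""
    if not is_unique(instances, key):
        return False
    # Testing the omission of single attributes is sufficient: if a smaller
    # subset were unique, every larger subset containing it would be as well.
    return all(
        not is_unique(instances, [a for a in key if a != omitted])
        for omitted in key
    )

# Hypothetical sample instances of entity type AIRCRAFT MODEL.
models = [
    {"type_code": "B737", "model_no": "500", "category": "JET"},
    {"type_code": "B737", "model_no": "400", "category": "JET"},
    {"type_code": "A320", "model_no": "200", "category": "JET"},
    {"type_code": "B737", "model_no": "200", "category": "JET"},
]
```

For these instances, the combination of type code and model number satisfies the minimum principle, whereas adding the category yields a combination that is still unique but no longer minimal.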

5.
True

6.
A relationship type is a conceptual association between:
- The entity instances, one each, of two not necessarily different
entity types.
- The relationship instances, one each, of two not necessarily
different relationship types.
- The entity instances and relationship instances, one of each, of
an entity type and a relationship type.

7.
True

8.
True

9.
False

10.
b, a, a, b, d, c

11.
The cardinalities for relationship type PASSENGER_has_SEAT
are the following:
Cardinality for source: 0..1 or 1
Cardinality for target: 0..m or m

12.
A 1:1 relationship type is a relationship type with cardinalities ..1 at
both ends of the relationship type.
A 1:m relationship type is a relationship type with cardinality ..1 at
one end and cardinality ..m at the other end of the relationship
type.
An m:m relationship type is a relationship type with cardinalities ..m
at both ends of the relationship type.
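The classification above depends only on the maximum of each cardinality. A minimal sketch, assuming the cardinalities are written as text in the forms used in this notebook (for example 0..1, 1..1, 0..m, 1..m, or the short forms 1 and m); the function names are hypothetical:

```python
def upper_bound(cardinality):
    """Return the maximum ('1' or 'm') of a cardinality such as '0..1' or 'm'."""
    upper = cardinality.split("..")[-1].strip()
    if upper not in ("1", "m"):
        raise ValueError(f"unsupported cardinality: {cardinality!r}")
    return upper

def classify(source, target):
    """Classify a relationship type as '1:1', '1:m', or 'm:m'."""
    # Sorting places '1' before 'm', so a mixed pair is always reported as 1:m.
    bounds = sorted((upper_bound(source), upper_bound(target)))
    return ":".join(bounds)
```

The minimum cardinalities (0 or 1) do not influence the classification; they only state whether a connection is required.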

13.
a. For relationship type r1, any number of instances of entity type
B can be connected to an instance of entity type A.
b. For relationship type r1, zero instances of entity type B need be
connected to an instance of entity type A.
c. For relationship type r1, at most one instance of entity type A
can be connected to an instance of entity type B.
d. For relationship type r1, zero instances of entity type A need be
connected to an instance of entity type B.
e. For relationship type r2, at most one instance of entity type A
can be connected to an instance of entity type C.
f. For relationship type r2, one instance of entity type A must be
connected to every instance of entity type C.

14.
Since relationship type r1 has a source cardinality of 1, entity
instance B3 cannot be connected to multiple instances of entity
type A.
Since relationship type r2 has a source cardinality of 1..1, entity
instance C1 of entity type C must be connected to one and only
one instance of entity type A.

15.
The defining attributes and the relationship keys for relationship
types r1 and r2 are:
Defining attributes for r1: Key of A and key of B
Relationship key for r1: Key of A (target cardinality of 1)
Defining attributes for r2: Key of r1 and key of C, i.e., key of A and
key of C
Relationship key for r2: Key of r1 and key of C, i.e., key of A and
key of C, since r2 is an m:m relationship type

16.
False


17.
To be a dependent entity type, the entity type must fulfill the
following requirements:
- A part of its key or its entire key must be equal to the key of
another entity type or of a relationship type (referred to as
parent entity type or relationship type, respectively).
- There must exist a relationship type between the parent entity
type or relationship type and the dependent entity type so that:
• Each instance of the dependent entity type is, at all times,
connected to one and only one parent instance.
• The dependent and parent instances interconnected are
those with matching key values: The value of the
appropriate key portion of the dependent entity instance
must be equal to the key value of the parent instance.

18.
Owning relationship type r1 cannot have the instance (A1, A2.B1)
because the value of the appropriate key portion for the entity
instance of B is different from the key value for the instance of A.

19.
By means of dependent entity types.

20.
False

21.
True


22.
Deletion of C2
- Deletion of (C2, A3) for r1
- Deletion of A3 (controlling property)
- Deletion of (A3, B2) for r2
- Deletion of B2 (controlling property)
- Deletion of (C2, D3) for r3
- Deletion of ((A1, B1), (C2, D3)) for r4
Remaining Instances:
Object   Instances
A        A1, A2
B        B1
C        C1, C3
D        D1, D2, D3
r1       (C1, A2)
r2       (A1, B1), (A2, B1)
r3       (C1, D1), (C1, D2)
r4       ((A1, B1), (C1, D2))

23.
True

24.
The components of a class structure are:
Supertype
Subtypes
Is-bundle

25.
The is-bundle is the set of _is_ relationship types connecting the
supertype to its subtypes.

26.
b, c, a, d


27.
The instances of entity types and relationship types can be
restricted by means of constraints.

28.
The three components of constraints are:
The constraining objects
The constrained objects
The rule specifying how the constraining objects restrict the
instances of the constrained objects.

29.
The format of a constraint in the entity-relationship model is:
{ identifier [ : rule ] }


Unit 5 - Data and Process Inventories

1.
a, b, e, f

2.
A data inventory should contain:
- A description of the abstract data types for the application
domain.
- A description of the data elements and data groups for the
application domain.

3.
From the application-domain perspective, a data element is an
indivisible piece of data.
A data group consists of one or more related data elements and/or
data groups and, thus, generally is not an indivisible piece of data.

4.
Data elements can be associated with standard data types or
abstract data types. Abstract data types are an extension of
standard data types. They can be tailored to the application
domain. They describe the values that the data elements
associated with them can assume and the operations that can be
performed with them.

5.
For an abstract data type, you should provide:
- Its signature, i.e., its name and parameters.
- The values that can be assumed.
- The operations that can be performed.

6.
a, b, d, e


7.
By associating data elements and data groups with the entity types
using them as attributes, you can verify the completeness of the
entity-relationship model for your application domain. If you cannot
find an entity type for a data element or data group not belonging to
a data group, the entity-relationship model is incomplete.

8.
The usual methods for establishing a data inventory are:
- Surveying the departments of expertise.
- Screening existing data and programs.
- Coupling the data and process inventories.

9.
Some of the problems in surveying the departments of expertise
are:
- Communication problems:
• The application domain expert may not be able to extract
the proper information from the members of the
departments of expertise.
• The members of the departments of expertise may not be
able to communicate their thoughts and ideas.
• Due to workload pressure, the members of the departments
of expertise may be reluctant to talk with the application
domain expert about database related topics.
- In discussions, it is easy to forget something.
- You may obtain data elements and data groups not actually
needed.
- It is a one-time effort. Later changes are not reflected in the
data inventory.

10.
The principle behind coupling the data and process inventories is
the following:
- When a business process is described or updated in the
process inventory, the data elements and data groups it uses
are identified or changed accordingly.
- As a data element or data group for a business process is
identified or changed, it is registered or changed in the data
inventory. For a new data element or data group, or a new role,
it is verified that the entity-relationship model contains the
corresponding entity type. The entity type is named as part of
the description of the data element or data group.
Consequently, the data inventory contains all data elements
and data groups for the documented business processes and
no superfluous data elements or data groups.

11.
b, d, e

12.
The description of a business process should contain the following
items:
Title
Purpose
Input
Textual description
Formal description
Output
Data read
Data written
Others (such as window formats or listing formats)

13.
Data read for a business process are the data elements or data
groups read internally during the execution of the business
process.
For each data element or data group read, its name in the data
inventory and all purposes it is read for should be described.

14.
For each step of the business process, you determine the entity
types and relationship types of the entity-relationship model
needed to access the data elements and data groups for the step.
The entity types are the receptacles for the appropriate data. The
relationship types are the paths for navigating from a piece of
information to another logically related piece of information
needed.
When verifying the entity-relationship model for a business
process, you perform a walk through the entity-relationship model
and determine the view needed for the business process.

15.
Process decomposition is an iterative, step-by-step decomposition
of the application domain into groups of functionally related
business processes. Each iteration decomposes the groups for the
previous iteration into functionally related subsets until the groups
cannot be broken down any further. The result is a process tree.
The purpose of process decomposition is to obtain the complete
set of business processes for the application domain.

Unit 6 - Tuple Types

1.
True

2.
False

3.
The cardinality for an attribute determines how many values the
attribute must assume at least and can assume at most in the
scope it is used.
If the attribute is used as a direct component of the tuple type, the
cardinality specifies how many values the attribute must assume at
least and can assume at most for each tuple.
If the attribute is used as a component of a composite attribute, the
cardinality specifies how many values the attribute must assume at
least and can assume at most for each value of the composite
attribute.
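
The cardinality rule above can be sketched as a simple count check. This
is a minimal Python illustration and not part of the course material. The
function name satisfies_cardinality and the use of None for an unbounded
maximum (the * of the tuple type documentation) are assumptions made for
the example.

```python
def satisfies_cardinality(values, minimum, maximum=None):
    """Return True if the number of values lies within the cardinality.

    maximum=None stands for an unbounded maximum cardinality (*).
    """
    count = len(values)
    if count < minimum:
        return False
    return maximum is None or count <= maximum


# Hypothetical attribute "Phone Number" with cardinality 1..3 per tuple.
one_phone = satisfies_cardinality(["555-0100"], 1, 3)   # within bounds
no_phone = satisfies_cardinality([], 1, 3)              # below the minimum
```

The same check applies per tuple for a direct component and per value of
the composite attribute for a component of a composite attribute.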

4.
c, e

5.
False

6.
The tuple type for an entity type is established by compiling the
data elements and data groups of the data inventory associated
with the attributes of the entity type.

7.
Tuple types must not be established for:
- Owning relationship types.
- m:m relationship types being the source of a relationship type
with a minimum target cardinality of 1.
- m:m relationship types being the target of a relationship type
with a minimum source cardinality of 1.

8.
The components of a composite attribute are indented.

9.
In the tuple type documentation, the role of a data element or data
group for an attribute can be identified by means of the AS clause:
name of data element/group AS role name

10.
d

11.
MAINTENANCE RECORD_belongs_to_MAINTENANCE RECORD
Maintenance Number, PK
Maintenance Number AS Owner

12.
a, d, f

13.
The Normal Forms describe states or quality levels for the tuple
types. The higher the Normal Form of a tuple type, the more stable
the tuple type is, the fewer data inconsistencies are possible, and
the less redundant information it contains.

14.
The resulting tuple types no longer contain repeating groups, i.e.,
all attributes can assume at most one value.

15.
The attributes for repeating groups have a maximum cardinality
higher than 1. This includes a maximum cardinality of * meaning
that the appropriate attribute can assume any number of values
within its scope.

16.
Generally, in the entity-relationship model, you need:
- A new dependent entity type.
- A new owning relationship type interconnecting the new
dependent entity type and the entity type/relationship type for
the original tuple type.
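
The removal of a repeating group can be sketched as follows. This is a
minimal Python illustration under stated assumptions: tuples are modeled
as dictionaries, and the names remove_repeating_group, emp_no, and skill
are hypothetical. The dependent tuples carry the key of the original
tuple, which mirrors the new dependent entity type and its owning
relationship type.

```python
def remove_repeating_group(tuples, key, repeating):
    """Split each tuple into a parent tuple without the repeating
    attribute and one dependent tuple per value of that attribute."""
    parents, dependents = [], []
    for t in tuples:
        parents.append({k: v for k, v in t.items() if k != repeating})
        for value in t[repeating]:
            dependents.append({key: t[key], repeating: value})
    return parents, dependents


# Hypothetical tuple type: an employee with any number of skills.
employees = [
    {"emp_no": 1, "name": "Ada", "skill": ["SQL", "COBOL"]},
    {"emp_no": 2, "name": "Bob", "skill": []},
]
parents, dependents = remove_repeating_group(employees, "emp_no", "skill")
```

After the split, every attribute of both resulting tuple types assumes at
most one value.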

17.
False

18.
True

19.
Generally, in the entity-relationship model, you need:
- A new entity type.
- A new relationship type interconnecting the new entity type and
the entity type/relationship type for the original tuple type.

20.
If the data groups on which the attributes for a tuple type are based
have been established properly, they contain all attributes (and
only those) that must be moved together to a new tuple type during
normalization.

21.
True

Unit 7 - From Tuple Types to Tables

1.
Tuple types are translated into tables as follows:
- Each tuple type becomes a table.
- Each elementary attribute becomes a column.
- Each elementary attribute of the tuple type's primary key
becomes a column of the table's primary key.
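
The three translation rules above can be sketched as a small generator
for a CREATE TABLE statement. This is a minimal illustration using
Python and SQLite, not the course's target system. The function name
create_table_sql and the tuple type flight_leg with its attributes are
hypothetical.

```python
import sqlite3


def create_table_sql(name, attributes, primary_key):
    """Build a CREATE TABLE statement: one column per elementary
    attribute, primary key columns taken from the tuple type's key."""
    columns = ", ".join(f"{attr} {sql_type}" for attr, sql_type in attributes)
    key = ", ".join(primary_key)
    return f"CREATE TABLE {name} ({columns}, PRIMARY KEY ({key}))"


# Hypothetical tuple type with a two-attribute primary key.
sql = create_table_sql(
    "flight_leg",
    [("flight_no", "INTEGER"), ("leg_no", "INTEGER"), ("distance", "INTEGER")],
    ["flight_no", "leg_no"],
)
conn = sqlite3.connect(":memory:")
conn.execute(sql)
# PRAGMA table_info lists one row per column; field 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(flight_leg)")]
```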

2.
Tuple types whose primary key values always correspond can be
merged.

3.
A tuple type whose primary key values are always a subset of the
primary key values of another tuple type can be imbedded in the
other tuple type if the following condition is met: For each
potentially imbedded tuple, at least one of its nonkey attributes has
a value.

4.
True

5.
For T1 through Tn to be a perfect decomposition of T, the following
condition must be satisfied as well:
At all times, each primary key value of T must occur in one and
only one of the tuple types T1 through Tn.
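
The condition stated above can be checked mechanically. The following is
a minimal Python sketch that tests only this one condition. The function
name is_perfect_decomposition and the sample key values are assumptions
made for the example.

```python
def is_perfect_decomposition(super_keys, part_keys):
    """super_keys: primary key values of T.
    part_keys: one collection of primary key values for each of T1..Tn.
    True if every key value of T occurs in exactly one of the parts."""
    return all(
        sum(key in part for part in part_keys) == 1
        for key in super_keys
    )


# Hypothetical key values for T and for two candidate decompositions.
t_keys = {1, 2, 3}
disjoint_and_complete = is_perfect_decomposition(t_keys, [{1, 2}, {3}])
overlapping = is_perfect_decomposition(t_keys, [{1, 2}, {2, 3}])
```

A key value that occurs in two parts, or in none, makes the
decomposition partial rather than perfect.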

6.
The following are some reasons for not combining tuple types:
- The tuple types have nothing to do with each other.
- The tuple types are only processed together by business
processes that are not performance-critical.
- Other tuple types are referentially dependent on the tuple type
being eliminated.
- Limitations of the target database management system do not
allow you to combine the tuple types.

7.
Some typical limitations for relational database management
systems are:
- The rows must fit entirely into a single page of a chosen size.
This limits the row size.
- The maximum number of rows per page is limited.
- The maximum number of columns per table is limited.
- The maximum size of a table is limited.

8.
True

9.
True

10.
False

11.
True

12.
False

13.
f, a, c, b, a, d, e, a, f, c, g, f, b, e

14.
False

15.
False

16.
System default values are system-provided, predefined, default
values for the various data types. They are independent of
columns.
User default values are default values you define for specific
columns. As user default for a column, any value can be chosen
that is compatible with the data type for the column.

17.
You can provide your own default value for a column by specifying
the value in the WITH DEFAULT clause for the column.
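
A user default value can be tried out with a small example. This is a
sketch using Python and SQLite, where the clause is spelled DEFAULT
rather than WITH DEFAULT. The table booking and its columns are
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite spells the clause DEFAULT; WITH DEFAULT is the form named above.
conn.execute(
    "CREATE TABLE booking ("
    " booking_no INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL DEFAULT 'OPEN')"   # user default value
)
# No value is supplied for status, so the user default is stored.
conn.execute("INSERT INTO booking (booking_no) VALUES (1)")
status = conn.execute("SELECT status FROM booking").fetchone()[0]
```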

18.
True

19.
False

20.
External user defined functions are based on programs written by
you. Sourced user defined functions are based on existing built-in
functions or user defined functions.
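
The idea of a function based on a program written by you can be
illustrated with SQLite, which lets a host-language routine be
registered under an SQL name. This is only an analogy to the external
user defined functions described above, not the registration mechanism
of the course's target system. The function miles_to_km is hypothetical.

```python
import sqlite3


def miles_to_km(miles):
    """Program logic behind the hypothetical SQL function miles_to_km."""
    return miles * 1.609344


conn = sqlite3.connect(":memory:")
# Register the Python routine under an SQL name with one argument.
conn.create_function("miles_to_km", 1, miles_to_km)
km = conn.execute("SELECT miles_to_km(100)").fetchone()[0]
```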

21.
True

22.
b, c, a

23.
Check constraints allow you to restrict the values of columns
beyond the values permitted by the data types of the columns.
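
A check constraint can be sketched as follows, using Python and SQLite.
The data type TEXT would accept any string; the check restricts the
column to three codes. The table seat, the column seat_class, and the
helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seat ("
    " seat_no INTEGER PRIMARY KEY,"
    " seat_class TEXT CHECK (seat_class IN ('F', 'B', 'E')))"
)


def try_insert(seat_no, seat_class):
    """Return True if the row is accepted, False if the check rejects it."""
    try:
        conn.execute("INSERT INTO seat VALUES (?, ?)", (seat_no, seat_class))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(1, "B")   # value permitted by the check
refused = try_insert(2, "X")    # valid TEXT, but outside the check
```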

24.
False

25.
A trigger is a set of actions to be performed when a specific event
occurs.

26.
False

27.
True

28.
A trigger can be activated before the changes for the row or SQL
statement are applied or after they have been applied.
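
The two activation times can be sketched with Python and SQLite. Note
that SQLite supports row triggers only, so the statement granularity
mentioned above is not shown. The tables account and audit and both
trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (acct_no INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE audit (acct_no INTEGER, new_balance INTEGER);

-- Before trigger: runs before the change is applied and can reject it.
CREATE TRIGGER no_overdraft BEFORE UPDATE ON account
FOR EACH ROW WHEN NEW.balance < 0
BEGIN
    SELECT RAISE(ABORT, 'balance must not be negative');
END;

-- After trigger: runs after the change has been applied.
CREATE TRIGGER log_balance AFTER UPDATE ON account
FOR EACH ROW
BEGIN
    INSERT INTO audit VALUES (NEW.acct_no, NEW.balance);
END;

INSERT INTO account VALUES (1, 100);
""")

conn.execute("UPDATE account SET balance = 50 WHERE acct_no = 1")
try:
    conn.execute("UPDATE account SET balance = -10 WHERE acct_no = 1")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True   # the before trigger stopped the update
audit_rows = conn.execute("SELECT * FROM audit").fetchall()
balance = conn.execute("SELECT balance FROM account").fetchone()[0]
```

Only the accepted update reaches the after trigger, so the audit table
holds a single row.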

29.
a, e, f

30.
a, b, c, d, e

31.
True

32.
True

Unit 8 - Integrity Rules

1.
The four basic types of integrity to be maintained for a database
are:
- Referential integrity
- Domain integrity
- Redundancy integrity
- Constraint integrity

2.
A foreign key is an ordered set of columns whose values are, at all
times, a subset of the values of a parent key of the same or another
table.
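
The subset condition in this definition can be sketched as a check over
key values. This is a minimal Python illustration. Composite keys are
modeled as tuples, and treating None (NULL) foreign key values as exempt
from the check is an assumption of the example. The function name
is_valid_foreign_key is hypothetical.

```python
def is_valid_foreign_key(foreign_key_values, parent_key_values):
    """True if every non-NULL foreign key value occurs as a parent key value."""
    parents = set(parent_key_values)
    return all(v in parents for v in foreign_key_values if v is not None)


# Hypothetical two-column keys (ordered sets of columns modeled as tuples).
parent_keys = [(1, "A"), (2, "B")]
consistent = is_valid_foreign_key([(1, "A"), None], parent_keys)
orphaned = is_valid_foreign_key([(3, "C")], parent_keys)
```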

3.
True

4.
e, b, d, a, c

5.
NO ACTION checks for orphans after the deletion of the rows of
the parent table and rejects the request if orphans are detected.
RESTRICT checks for dependent rows before the deletion of the rows
of the parent table and rejects the request if dependent rows are found.
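
The timing difference between the two delete rules can be made visible
in SQLite by declaring the constraint as deferred: RESTRICT is then
still enforced at once, while the NO ACTION check is postponed until the
end of the transaction. This is a sketch of SQLite's behavior and only
an analogy to the rules described above. The tables parent and child and
the function delete_parent_first are hypothetical.

```python
import sqlite3


def delete_parent_first(rule):
    """Delete the parent row, then its dependent row, in one transaction.
    Returns 'deleted' or 'rejected' for the given delete rule."""
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES parent(id)"
        f" ON DELETE {rule} DEFERRABLE INITIALLY DEFERRED)"
    )
    conn.execute("INSERT INTO parent VALUES (1)")
    conn.execute("INSERT INTO child VALUES (10, 1)")
    conn.execute("BEGIN")
    try:
        conn.execute("DELETE FROM parent WHERE id = 1")
        conn.execute("DELETE FROM child WHERE parent_id = 1")
        conn.execute("COMMIT")   # the postponed check runs here
        return "deleted"
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return "rejected"


with_restrict = delete_parent_first("RESTRICT")
with_no_action = delete_parent_first("NO ACTION")
```

With RESTRICT the first DELETE fails because a dependent row still
exists. With NO ACTION the orphan has been removed by the time the check
takes place.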

6.
True

7.
a, b

8.
The deletion of a parent row fails if:
- Another referential constraint with delete rule NO ACTION or
RESTRICT prevents the deletion of the parent row.
- Another referential constraint with delete rule NO ACTION or
RESTRICT for which the dependent table is the parent table
prevents the deletion of a dependent row.

9.
You need an after trigger for the table for the relationship type. The
trigger must be activated for each deletion of a row for the
relationship type and must delete the row for the appropriate
source instance.

10.
Table T is delete-connected to table T1 if the deletion of a row of T1
requires that rows of T are accessed.

11.
False

12.
True

13.
True

14.
For referential cycles, the following restrictions exist:
- For a cycle of two or more tables, at least two delete rules must
be different from CASCADE.
- For a self-referencing constraint, the delete rule must be NO
ACTION or CASCADE.

15.
False

16.
The purpose of a referential structure is to provide an overview of
the referential constraints for the tables of an application domain or
a subset thereof.

17.
True

18.
A double-headed arrow in a referential structure indicates that a
parent key value may occur more than once as foreign key value in
the dependent table.

19.
Domain integrity requires that the values of the columns for the
tables are correct. This means that:
- The values belong to the values supported by the abstract data
types for the data elements for the columns.
- The values adhere to domain restrictions for the data elements
for the columns.
- The values observe length restrictions for the data elements for
the columns.

20.
The three major causes for the redundancy of data are:
- Violations of the Second Normal Form or Third Normal Form
- Multiple copies of columns or tables
- Derivable data

21.
False

22.
You can ensure the correctness of derivable data by:
- Not storing them and deriving them each time they are needed.
- Triggers reevaluating and storing the derivable data each time
data affecting the derivable data are inserted, updated, or
deleted.
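
The trigger approach for derivable data can be sketched with Python and
SQLite. A stored order total is reevaluated whenever an item is
inserted, updated, or deleted. The tables orders and order_item, the
trigger names, and the helper total are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_no INTEGER PRIMARY KEY,
                     total INTEGER NOT NULL DEFAULT 0);
CREATE TABLE order_item (order_no INTEGER, amount INTEGER);

CREATE TRIGGER item_ins AFTER INSERT ON order_item
BEGIN
    UPDATE orders SET total = (SELECT COALESCE(SUM(amount), 0) FROM order_item
                               WHERE order_item.order_no = orders.order_no)
    WHERE order_no = NEW.order_no;
END;

CREATE TRIGGER item_upd AFTER UPDATE ON order_item
BEGIN
    UPDATE orders SET total = (SELECT COALESCE(SUM(amount), 0) FROM order_item
                               WHERE order_item.order_no = orders.order_no)
    WHERE order_no IN (OLD.order_no, NEW.order_no);
END;

CREATE TRIGGER item_del AFTER DELETE ON order_item
BEGIN
    UPDATE orders SET total = (SELECT COALESCE(SUM(amount), 0) FROM order_item
                               WHERE order_item.order_no = orders.order_no)
    WHERE order_no = OLD.order_no;
END;
""")


def total():
    """Read the stored (derived) total for order 1."""
    return conn.execute(
        "SELECT total FROM orders WHERE order_no = 1").fetchone()[0]


conn.execute("INSERT INTO orders (order_no) VALUES (1)")
conn.execute("INSERT INTO order_item VALUES (1, 10)")
conn.execute("INSERT INTO order_item VALUES (1, 20)")
total_after_inserts = total()
```

The alternative named first, not storing the total at all, corresponds
to running the SUM query each time the value is needed.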

23.
For constraint integrity, all business constraints of the application
domain must be observed.

24.
The main ingredients for achieving constraint integrity are triggers
and user defined functions. Sometimes, unique indexes or
referential constraints can be used.

Unit 9 - Indexes

1.
The main purpose of an index is to improve performance when
locating a row would otherwise require scanning the rows of the
table.
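
The effect can be observed in SQLite by asking for the access path
before and after an index is created. This is a sketch using Python.
The exact wording of the reported plan varies between SQLite versions,
so only its first word is relied on. The table passenger, the index
passenger_name, and the helper plan are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passenger (pass_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO passenger (name) VALUES (?)",
    [(f"name{i}",) for i in range(1000)],
)


def plan(sql):
    """Return the access path SQLite reports for the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return rows[0][-1]   # the last field holds the textual description


query = "SELECT * FROM passenger WHERE name = 'name500'"
before = plan(query)   # without an index: a scan of the table
conn.execute("CREATE INDEX passenger_name ON passenger (name)")
after = plan(query)    # with the index: a search via passenger_name
```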

2.
True

3.
True

4.
An index is a dense index if each key value has an index entry in
the lowest index level.

5.
False

6.
At most one.

7.
c, a, b

8.
Plain unique indexes can be used for:
- The primary key of a table.
- The foreign key resulting from merging the tuple type for a 1:1
relationship type.

9.
Unique-where-not-NULL indexes can be used for the foreign key
resulting from imbedding the tuple type for a 1:1 relationship type.
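
The behavior of a unique-where-not-NULL index can be sketched with
Python and SQLite, whose ordinary unique indexes happen to treat NULL
values as distinct from one another. Other systems may require a
dedicated index option for this. The table employee, the nullable
foreign key car_no, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, car_no INTEGER)")
conn.execute("CREATE UNIQUE INDEX employee_car ON employee (car_no)")

conn.execute("INSERT INTO employee VALUES (1, NULL)")
conn.execute("INSERT INTO employee VALUES (2, NULL)")   # several NULLs are accepted
conn.execute("INSERT INTO employee VALUES (3, 77)")
try:
    conn.execute("INSERT INTO employee VALUES (4, 77)")  # duplicate non-NULL value
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Employees without a car share the NULL value, while each car can be
assigned to at most one employee, which is what the imbedded 1:1
relationship type requires.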

10.
If you have a clustering index for a table, the database
management system attempts to store the rows of the table in such
a way that the physical sequence of the data pages agrees with the
logical order implied by the index.

11.
c

12.
From a database design perspective, you should establish an
index for:
- Each primary key.
- Each foreign key.

Unit 10 - Logical Data Structures

1.
The two major inputs for the development of the logical data
structures are:
- The tables for the application domain.
- The referential structure for the application domain.

2.
The main purposes of logical data structures are to identify:
- The columns (and the tables containing the columns)
corresponding to the data elements used by the business
processes.
- How the business processes can navigate, with the data found,
from one table to the next.

3.
True

4.
False

5.
b

6.
The components of a logical data structure are:
- The structure diagram.
- The path summary.
- The table summary.

7.
The structure diagram for a logical data structure illustrates the
paths interconnecting the tables of the logical data structure.

8.
For each path of the structure diagram, the path summary specifies
the source table, the target table, and the interconnected columns.

9.
For each use of a table of the logical data structure, the table
summary specifies the columns needed.

10.
False

11.
True

12.
Views provide data security, ease of use, resilience against
database changes, and freedom to change the table definitions.
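
Two of these benefits, data security and resilience against database
changes, can be sketched with Python and SQLite. The table employee and
the view employee_public are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employee VALUES (1, 'Ada', 5000);

-- The view exposes only the columns its users are meant to see.
CREATE VIEW employee_public AS SELECT emp_no, name FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public")
columns = [d[0] for d in cursor.description]   # salary is not visible

# A change to the table definition does not disturb users of the view.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
rows = conn.execute("SELECT * FROM employee_public").fetchall()
```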

