Sie sind auf Seite 1von 98

Module 2: Entity-Relationship Model

Introduction
Benefits of Data Modeling
Types of Models
Phases of Database Modeling
The Entity-Relationship (ER) Model
Extended Entity-Relationship(EER) Model
Generalization
Specialization
Aggregation

Introduction
Building a database system is a complex process
that normally requires analyzing the user's
requirements followed by a design process and
finally concluding with an implementation. During
analysis and design, the database designer
normally needs to build a model of the proposed
database system. Database modeling is essential to
manage the database development process.
Models have not been invented recently. Humans
have used models for a long time for a variety of
purposes and perhaps that is why modeling is
interpreted differently by different people. Some
examples of model building:

Architects build models of buildings to show


what a building or a house would look like.

Landscape architects build models of


landscaping to show what a landscape or garden
will look like.
Aeronautical engineers build models of planes
to show what a plane would look like. Scaled
models may also be used in testing to evaluate a
plane's performance.
Computer architects build models of new
computers or new circuit boards that are
implemented in hardware.
Traffic engineers build models of traffic for a
city or part of a city to show why there are
traffic problems or how they could be solved.

A model is an abstract and often incomplete


representation of the design or definition of a
complex product, process or situation

Benefits of Data Modeling


Focusing on essentials
Ease of communication and understanding

Product or process improvement


Exploring alternatives

Data Models
A collection of tools for describing
Data
Data relationships
Data semantics
Data constraints
Entity-Relationship data model (mainly for database
design)
Relational model
Object-based data models (Object-oriented and
Object-relational)

Semistructured data model (XML)


Other older models:
Network model
Hierarchical model

Entity-Relationship Model
Models an enterprise as a collection of
entities and relationships
Entity: a thing or object in the enterprise
that is distinguishable from other objects
Described by a set of attributes

Relationship: an association among several


entities
e.g. each employee is an entity described by
empno, empname, designation etc.

Entity set set of all entities of the same


type
Relationship set set of all relationships of
same type
Mapping cardinality number of entities
to which another entities can be
associated via relationship set

The overall logical design of database


can be expressed graphically by an E-R
diagram, which has following
components:
Divided Rectangles - entity set
Diamonds relationship among entity sets
Lines link attributes to entity sets and
entity sets to relationship sets

Each component of E-R diagram is labeled with entity or


relationship that it represents
Example of schema in the entity-relationship model

Relational Model
It uses collection of tables to represent
data as well as relationship among those
data.
Each table has multiple columns, each
column has unique name.

Other Models
Network Model
Data are represented by collection of
records, and relationships among those
data are represented by links, which
are viewed as pointers.
Records are organized as a collection
of arbitrary graph

Hierarchical Model
Data are represented by collection of
records, and relationships among those
data are represented by links, which are
viewed as pointers.
Records are organized as a collection of
trees rather than arbitrary graph

Entity-Relationship Model
ER modeling: A graphical technique for
understanding and organizing the data independent
of the actual database implementation OR
The ER model describes data as entities, attributes
and relationships
Basic Notations of ER model are
Entity
Attributes
Relationship

Entity
Entity An entity is a thing or object in the
real world that is distinguishable from all
other objects OR
Any thing from the real world that have an
independent existence and about which
we intend to collect data
An entity may be an object with
physical existence (person, book, car,
house) or with a conceptual existence
(Company, job, or university course)

Entity
Each Entity has attributes, the particular
properties that describe it
Entity type - Set of entities of the same type
that share the same attributes. Each Entity
type is described by its name and attributes
Entity Set Collection of all entities of a
particular entity type in DB at any point in
time
Extension of the entity Set - The individual
entities that constitute a set
Entity sets do not need to be disjoint

Key Attributes of an Entity Set


Uniqueness Constraint - An important
constraint on the entities of an entity type
Key attribute - An attribute whose values
are distinct for each individual entities in the
entity set. Its values can be used to identify
each entity uniquely
e. g. the account number of an account, an
employee id of an employee

composite key attribute - If several


attributes form a key i.e. combination of
attribute values must be distinct for each
entity. A composite key must be minimal

Attributes
Each entity has attributesthe particular
properties that describe it

An entity is represented by a set of attributes


Attributes are descriptive properties possessed
by each member of an entity set
Each entity may have its own value for each
attribute
The set of possible values for each attribute of a
particular entity is called the domain of the
attribute
The domain of attribute marital status is just the four
values of set of alphabetic characters single, married,
divorced, widowed
The domain of the attribute month is set of twelve
values ranging from January to December

Figure: Two entities, EMPLOYEE e1, and COMPANY c1, and their attributes

Domain of attributes
Each Simple attribute of an entity is
associated with a value set or domain of
values, which specifies the set of values that
may be assigned to that attribute for each
individual entity
If range of ages allowed for employees is between
16 & 70 then value set for age attribute is set of
integer numbers between 16 & 70
Value set for name attribute is set of strings of
alphabetic characters separated by blank spaces

Value sets are not displayed in ERD. They are


specified using the basic data types available
in DBMS

Types of attributes
Simple Vs composite attribute
Simple (Atomic) - cannot be divided into
smaller subparts (simpler) components
age of an employee

Composite - can be divided into smaller


subparts (simpler) components, which
represents more basic attributes with
independent meaning
joining date of the employee. Can be divided
into day, month and year

If composite attribute is referred only as a


whole then no need to divide it into
component attributes, the whole composite
attribute can be referred as simple attribute

Composite Attributes

Composite attributes can form a hierarchy


The value of a composite attribute is the concatenation of the values
of its components i.e. simple attributes
Composite attributes are useful to model situations in which a user
sometimes refers to the composite attribute as a unit but at other
times refers specifically to its components
If the composite attribute is referenced only as a whole, there is no
need to subdivide it into component attributes

Single Vs Multi-valued Attributes


Single valued - Can take only a single
value for each particular entity
age of a person. There can be only one value
for this

Multi-valued - Can take set of values for


the same entity
skill set of an employee, colors for a car,
degrees for a person, dependents of an
employee, nominees of an a/c holder
A multi-valued attribute may have lower and
upper bounds to limit the number of values
allowed for each individual entity

Stored Vs Derived attribute


Stored - Attribute values that need to be
stored permanently
name of an employee, title of a book

Derived In some cases, two (or more)


attribute values are related. Attribute
values that can be calculated (derived)
from value of other attribute value
years of service of employee can be calculated
from date of joining and current date
Some attribute values can be derived from
related Entities
an attribute Number_of_employees of a
DEPARTMENT entity can be derived by counting
the number of employees related to (working
for) that department.

Null attribute
Null - In some cases, a particular entity may not have an
applicable value for an attribute

Apartment_number attribute of an address applies only to


addresses that are in apartment buildings and not to other
types of residences, such as single-family homes (Bunglow)
College_degrees attribute applies only to people with college
degrees
The meaning of the above type of NULL is not applicable

NULL can also be used if we do not know the value of an


attribute for a particular entity
if we do not know the home phone number of John Smith
The meaning of the above type of NULL is not known
The unknown category of NULL can be further classified into two
cases.
The first case arises when it is known that the attribute value
exists but is missingfor instance, if the Height attribute of a
person is listed as NULL.
The second case arises when it is not known whether the
attribute value existsfor example, if the Home_phone
attribute of a person is NULL.

Complex Attributes
Composite and multi-valued attributes can be nested
arbitrarily.
We can represent arbitrary nesting by grouping components
of a composite attribute between parentheses () and
separating the components with commas, and by displaying
multi-valued attributes between braces { }. Such attributes
are called complex attributes
if a person can have more than one residence and each residence
can have a single address and multiple phones, an attribute
Address_phone for a person can be specified as

A complex attribute: Address_phone.

Relationship
A relationship is an association among several entity
sets

Example:
Hayes

depositor

customer entity set relationship set

A-102

account entity set

A relationship set is a mathematical relation among


n 2 entities, each taken from entity sets
{(e1, e2, en) | e1 E1, e2 E2, , en En}
where (e1, e2, , en) is a relationship
Example:
(Hayes, A-102) depositor

Relationship Set borrower

An attribute can also be property of a relationship set,


called as descriptive attribute.
For instance, the depositor relationship set between
entity sets customer and account may have the
attribute access-date

Degree of a Relationship Set


Refers to number of entity sets that participate in
a relationship set.
One
Unary
Two
Binary
Three
Ternary
Relationship sets that involve two entity sets are
binary (or degree two).
Relationship sets may involve more than two
entity sets.
E.g. Suppose employees of a bank may have
jobs (responsibilities) at multiple branches,
with different jobs at different branches. Then
there is a ternary relationship set between
entity sets employee, job and branch
Relationships between more than two entity sets
are rare. Most relationships are binary.

Mapping Constraints
E-R model may define certain
constraints to which the contents of
the database must confirm. Two
most important constraints are
Existence dependencies
Mapping Cardinalities

Existence Dependencies
If the existence of entity set x
depends on the existence of entity
set y, then x is said to be existence
dependent on y. If y is deleted, so
is x. Entity set y is said to be
dominant entity set and x is said to
be subordinate entity set

Mapping Cardinalities
Express the number of entities to which
another entities can be associated via a
relationship set.
Most useful in describing binary relationship
sets.
For a binary relationship set the mapping
cardinality must be one of the following
types:

One to one (1:1)


One to many (1:N)
Many to one (N:1)
Many to many (M:N)

Mapping Cardinalities

Some elements in A and B may not be mapped to any elements in


the other set

Mapping Cardinalities

Some elements in A and B may not be mapped to any elements in


the other set

Participation of an Entity Set in a


Relationship Set

Total participation - every entity in


the entity set participates in at least
one relationship in the relationship
set
Partial participation - some
entities may not participate in any
relationship in the relationship set

Participation of an Entity Set in a


Relationship Set

Entitys Role
Each entity set that participates in a relationship set
plays a particular role in a relationship
The role name signifies the role that a participating entity
from the entity set plays in each relationship instance,
and helps to explain what the relationship means
in the WORKS_FOR relationship set, EMPLOYEE plays
the role of employee or worker and DEPARTMENT
plays the role of department or employer.
Role names are not necessary in relationship types
where all the participating entity set are distinct, since
each participating entity set name can be used as the
role name.
In some cases the same entity set participates more
than once in a relationship set in different roles. In such
cases the role name becomes essential for distinguishing
the meaning of the role that each participating entity
plays

Entity sets of a relationship need


not be distinct
The labels Supervisor and
worker are called roles; they
specify how employee entities
interact via the Supervision
relationship set.

Roles are indicated in E-R


diagrams by labeling the lines
that
connect
diamonds
to
rectangles.
Role labels are optional, and are
used to clarify semantics of the
relationship

Weak Entity Set


Strong Entity set - Entity set that has its
own key attribute
Employee, student, customer, department

Weak Entity set - Entity sets that do not


have key attributes of their own

Dependent of an Employee, nominees of an a/c


holder

Identifying / Owner Entity set Entities belonging to a weak entity set are
identified by being related to specific
entities from another entity set in
combination with one of their attribute
values. The other entity set is identifying
(owner / parent / dominant) entity set

Weak Entity Set


Identifying Relationship The relationship
that relates a weak entity set to the owner
Weak Entity set always has a total
participation constraint i.e. existence
dependency with respect to its identifying
relationship
Partial Key / Discriminator Set of
attributes that can uniquely identify weak
entities that are related to the same owner
entity
In ERD identifying relationship are
represented by double line diamond. Partial
key attribute is underlined with a dashed or
dotted line

E-R Diagram
Overall logical structure of a database expressed graphically

Major Components of ERD consists of


Rectangles divided into two parts represent entity
sets. The first part contains the name of the entity set.
The second part contains the names of all the
attributes of the entity set. Attributes that are part of
the primary key are underlined.
Diamonds represent relationship sets.
Undivided rectangles represent the attributes of a
relationship set.
Lines link entity sets to relationship sets.
Dashed lines link attributes of a relationship set to the
relationship set.
Double lines indicate total participation of an entity in
a relationship set.
Double diamonds represent identifying relationship
sets linked to weak entity sets.

Design Issues
Use of entity sets vs. attributes
Choice mainly depends on the structure of the
enterprise being modeled, and on the semantics
associated with the attribute in question
Use of entity sets vs. relationship sets
Possible guideline is to designate a relationship
set to describe an action that occurs between
entities
Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary
(n-ary, for n > 2) relationship set by a number of
distinct binary relationship sets, a n-ary
relationship set shows more clearly that several
entities participate in a single relationship.
Placement of relationship attributes

An Example Database Application


The company is organized into departments. Each department
has a unique name, a unique number, and a particular
employee who manages the department. We keep track of the
start date when that employee began managing the
department. A department may have several locations
A department controls number of projects, each of which has a
unique name, a unique number, and a single location
We store each employees name, number, address, salary,
gender, and DOB. An employee is assigned to one department
but may work on several projects, which are not necessarily
controlled by the same department. We keep track of the
current number of hours per week that an employee works on
each project. We also keep track of the direct supervisor of
each employee (who is another employee).
We want to keep track of the dependents of each employee for
insurance purposes. We keep each dependents first name,
gender, DOB and relationship to the employee

A banking Scenario
Banks have customers. Customers are identified by name,
customer id, phone number and address. Accounts can be
held by more than one customer and a customers can have
more than one accounts. Accounts are identified by an
account number, account type (savings, current) and a
balance. Customers can avail loans. Loans are identified by
loan id, loan type (car, home, personal) and an amount.
Banks are identified by a name, code and the address of the
main office. Banks are organized into branches. Branches
are identified by a branch number, branch name and an
address. Accounts and loans are related to the banks
branches.
Draw an ER diagram for a database to represent this application

University Database
The university is organized into departments. Each department is identified by a
unique name (dept_name), is located in particular building, and has a budget.
Each department has a list of courses it offers. Each course has associated with it a
course_id, title, dept_name, and credits, and may also have associated
prerequisites.
Instructors are identified by their unique ID. Each instructor has name, associated
department (dept_name), and salary.
Students are identified by their unique ID. Each student has a name, an associated
major department (dept_name), and tot_cred (total credit hours the student
earned thus far).

The university maintains a list of classrooms, specifying the name of the building,
room_number, and room capacity.
The university maintains a list of all classes (sections) taught. Each section is
identified by a course_id, sec_id, year, and semester, and has associated with it
a semester, year, building, room_number, and time_slot_id (the time slot when
the class meets).
The department has a list of teaching assignments specifying, for each instructor,
the sections the instructor is teaching.
The university maintains a list of all student course registrations, specifying, for
each student, the courses and the associated sections that the student has
taken (registered for).

Enhanced / Extended ER Modeling (EER)


In late 1970s designers of the DB
applications
designed
more
accurate
database schemas that reflect the data
properties and constraint more precisely for
newer applications of database such as GIS
(Geographic
Information
Systems),
CAD/CAM. These databases have more
complex requirements than traditional
applications. This led to the development of
additional semantic data modeling concepts
that were incorporated into conceptual data
models such as ER model. EER model
includes all the modeling concepts of the
ER model

Sub-classes, Super-classes And


Inheritance
Sub-Classes In many cases an entity
type has various sub-groupings of its
entities that are meaningful and need to
be represented explicitly because of their
significance to the DB applications
Entities of Employee entity type may be
grouped into Secretary, Engineer, Manager,
Salaried_employee, Hourly_Employee

Each of the subgroup is sub-class of the


Employee entity type. The Employee
entity is called Super-class for each of
these sub-classes
An entity in sub-class is same as entity in
the super-class but in a distinct specific
role

Sub-classes, Super-classes And


Inheritance
The relationship between a super-class and
any of its sub-classes is a superclass/sub-class or class/sub-class
relationship. A specialization is depicted by
a hollow arrow-head pointing from the
specialized (subtype) entity to the other
(supertype) entity. It is often called an ISA
or ISAN relationship
Type Inheritance A concept associated
with sub-class. An entity that is member of a
sub-class inherits all the attributes of the
entity that is member of super-class
Sub-class entity also inherits all the
relationships in which the super-class
participates

Specialization
A process of defining a set of sub-classes of an
entity type (set); this entity type is called superclass of the specialization. The sub-class is
defined on the basis of some distinguishing
characteristic of the super-class.
Specialization is Top-down process

Recent EER Notation

Generalization
A reverse process of abstraction in which the
differences among several entity types are
suppressed. Common features of sub-class
entity types are identified and generalized them
into a single super-class. It is a Bottom-up
process
The term generalization refer to the process of
defining a generalized entity type from the given
entity types

Specialization and Generalization


Can have multiple specializations of an
entity set based on different features
permanent-employee vs. temporaryemployee, in addition to secretary, manager,
engineer

Each particular employee would be


a member of one of permanent-employee or
temporary-employee
and also a member of one of secretary,
manager, engineer

Design Constraints on a
Specialization/Generalization
Constraint on which entities can be members of a given
subclass (lower-level) entity type
Condition-defined An entity will become member of
which subclass is determined by placing a condition on the
value of some attribute of the superclass.
E.g. membership in the SECRETARY subclass by the
condition (Job_type = Secretary)
User-defined - When we do not have a condition for
determining membership in a subclass, the subclass is
called user-defined. Membership in such a subclass is
determined by the database users when they apply the
operation to add an entity to the subclass; hence,
membership is specified individually for each entity by the
user, not by any condition that may be evaluated
automatically.

Design Constraints on a
Specialization/Generalization
Constraint on whether or not entities may belong
to more than one subclass (lower-level) entity
type within a single generalization
Disjoint - an entity in super class can belong to
only one subclass (lower-level) entity type
Noted in E-R diagram by writing disjoint next to
the ISA triangle

Overlapping - an entity in super class can belong


to more than one subclass (lower-level) entity type

Design Constraints on a
Specialization/Generalization
Completeness constraint -- specifies whether or
not an entity in the super class (higher-level)
entity set must belong to at least one of the
subclass (lower-level) entity sets within a
generalization
Total : an entity in super class must belong to one
of the subclass (lower-level) entity type
Partial: an entity in super class need not belong
to one of the subclass (lower-level) entity type

Aggregation
Consider the ternary relationship
works-on
Suppose we want to record managers for
tasks performed by an employee at a
branch

Aggregation
Eliminate this redundancy via
aggregation
Treat relationship as an abstract entity
Allows relationships between
relationships
Abstraction of relationship into new
entity

E-R Diagram With Aggregation

Keys
A super key of an entity set is a set of one
or more attributes whose values uniquely
determine each entity.
A candidate key of an entity set is a
minimal super key
Customer-id is candidate key of customer
account-number is candidate key of account

Although several candidate keys may


exist, one of the candidate keys is selected
to be the primary key.

Keys
A super key of an entity set is a set of one
or more attributes whose values uniquely
determine each entity.
A candidate key of an entity set is a
minimal super key
Customer-id is candidate key of customer
account-number is candidate key of account

Although several candidate keys may


exist, one of the candidate keys is selected
to be the primary key.

Composite primary key - A primary


key which is a combination of more
than one attribute is called a
composite primary key
Overlapping candidate keys - Two
candidate keys overlap if they
involve any attribute in common.
e.g., in an student table, S#, C#
and Emailid, C# are two overlapping
candidate keys. (they have C# in
common)

Non-Key Attributes
The attributes other than the Candidate Key
attributes are called Non-Key attributes.
OR
The attributes which do not participate in
any of the Candidate keys
Foreign key
A foreign key is a copy of a primary key
that has been exported from one relation
into another to represent the existence of
a relationship between them. A foreign
key is a copy of the whole of its parent
primary key i.e. if the primary key is
composite, then so is the foreign key

Weak entity set represented by double


rectangles.
underline the discriminator of a weak
entity set with a dashed line.
payment-number discriminator of the
payment entity set
Primary key for payment (loan-number,
payment-number)

Note: the primary key of the strong


entity set is not explicitly stored with
the weak entity set, since it is implicit
in the identifying relationship.
If loan-number were explicitly stored,
payment could be made a strong entity,
but then the relationship between
payment and loan would be duplicated by
an implicit relationship defined by the
attribute loan-number common to
payment and loan

Reduction of an E-R Schema to


Tables
Each strong entity set becomes a table
Each single-valued attribute becomes a
column
Derived attributes are ignored
Composite attributes are represented
by components
Multi-valued attributes are represented
by a separate table
The key attribute of the entity set
becomes the primary key of the table

Entity example
Here address is a composite attribute
Years of service is a derived attribute (can be
calculated from date of joining and current
date)
Skill set is a multi-valued attribute
The relational Schema
Employee (E#, Name, Door_No, Street, City, Pincode, Joining_Date)

Emp_Skillset( E#, Skillset)


Each value of the multi-valued attribute maps to a separate row of the table

Converting weak entity types


Weak entity types are converted into
a table of their own, with the primary
key of the strong entity acting as a
foreign key in the table
This foreign key along with the key
of the weak entity (discriminator)
form the composite primary key of
this table

The Relational Schema


Employee (E# ,.)
Dependant (Employee, Dependant_ID, Name, Address)

Converting relationships
The way relationships are represented depends
on the cardinality and the degree of the
relationship and participation
The possible cardinalities :

1:1, 1:M, N:M


The degrees :
Unary
Binary
Ternary
The participation :
Total
Partial

Binary 1:1

The primary key of the partial participant will


become the foreign key of the total participant

Employee( E#, Name,)


Department (Dept#, Name,Head)

Binary 1 : 1

Case 2: Uniform participation types


The primary key of either of the participants can
become a foreign key in the other
Employee (E#,name)
Chair( item#, model, location, used_by)
(or)
Employee ( E#, Name.Sits_on)
Chair (item#,.)

Binary 1:N

The primary key of the relation on the 1 side of


the relationship becomes a foreign key in the
relation on the N side
Teacher (ID, Name, Telephone, ...)

Subject (Code, Name, ..., Teacher)

Binary 1 : N

A new table is created to represent the


relationship
New table contains two foreign keys - one from
each of the participants in the relationship
The primary key of the new table is the
combination of the two foreign keys
Student (Sid#,Title) Course(C#,CName,)

Enrolls (Sid#, C#)

Binary M : N

Unary 1 : 1
Consider employees who are
also a couple
The primary key field itself will
become foreign key in the same
table
Employee( E#, Name,... , Spouse)

Unary 1 : 1

Employee Table

EmpCode

PK

EmpName
DateofJoining
Spouse

FK

Unary 1:N
The primary key field itself will
become foreign key in the same
table
Same as unary 1:1

Employee( E#, Name,,Manager)

Unary 1 : N

Employee Table

EmpCode

PK

EmpName
DateofJoining
Manager

FK

Unary M:N
There will
be two
resulting
tables. One to
represent the
entity and
another to
represent the
M:N
relationship as
follows

Employee( E#, Name,)

Guaranty( Guarantor, beneficiary)

Unary M : N

Employee Table

Employee Table
Guarantor PK/FK

EmpCode

Beneficiary PK /FK

PK

EmpName
DateofJoining

Spouse

FK

Ternary relationship

Ternary relationship
Represented by a new table
The new table contains three foreign keys
one from each of the participating Entities

The primary key of the new table is the


combination of all three foreign keys
Prescription (Doctor#, Patient #, Medicine_Name)

Descriptive attribute of relationship


becomes a separate column of new table
Prescription (Doctor#, Patient #, Medicine_Name,
doses, duration, next_visit)

Representing Specialization as
Tables
Method 1
create a table for the higher level entity
create a table for each lower level entity set,
include primary key of higher level entity set
and local attributes
table

table attributes

person
customer
employee

name, street, city


name, credit-rating
name, salary

Drawback - getting information about customer


and employee requires accessing two tables

Method 2
create a table for each entity set with all local
and inherited attributes
table
customer
employee

table attributes
name, street, city, credit-rating
name, street, city, salary

If specialization is total, no need to create table


for generalized entity (person)
Drawback - street and city may be stored
redundantly for persons who are both customers
and employees

Representing Aggregation as Tables

Relational Schema Diagram


Car Insurance Company

Das könnte Ihnen auch gefallen