EERD - Part 2

Conceptual Data Models
The Enhanced Entity-Relationship Model

Enhanced Entity-Relationship Model
Since the 1980s there has been a steady emergence of new database
applications with more demanding requirements.
The basic concepts of ER modeling are not sufficient to represent the
requirements of these newer, more complex applications.
The response has been the development of additional ‘semantic’ modeling concepts.
These semantic concepts are incorporated into the original ER model; the result
is called the Enhanced Entity-Relationship (EER) model, abbreviated here as EERM.
Examples of additional concepts of the EERM are:
specialization / generalization;
aggregation;
composition.
EERM - Specialization / Generalization
Superclass
An entity type that includes one or more distinct subgroupings of
its occurrences
Subclass
A distinct subgrouping of occurrences of an entity type
Superclass/subclass relationship is one-to-one (1:1), also called an
isa (‘is-a’) relationship
Superclass may contain overlapping or distinct subclasses
Not all members of a superclass need be a member of a subclass
Attribute Inheritance
An entity in a subclass represents the same ‘real world’ object as in the
superclass, and may possess subclass-specific attributes as well
as those associated with the superclass.
Specialization
Process of maximizing differences between members of an
entity type by identifying their distinguishing characteristics.
Generalization
Process of minimizing differences between entities by
identifying their common characteristics.
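Attribute inheritance can be sketched with Python classes. This is only an analogy, not part of the EER notation: the class names Staff, Manager and SalesPersonnel and all attribute names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Staff:
    """Superclass: attributes shared by every member of staff (hypothetical)."""
    staff_no: str
    name: str
    salary: float


@dataclass
class Manager(Staff):
    """Subclass: 'Manager isa Staff', with one subclass-specific attribute."""
    bonus: float = 0.0


@dataclass
class SalesPersonnel(Staff):
    """A second subclass with a different subclass-specific attribute."""
    sales_area: str = ""


# A Manager occurrence is the same 'real world' object as a Staff occurrence:
# it carries the inherited attributes plus its own.
m = Manager("SG5", "Susan", 24000.0, bonus=2350.0)
plain = Staff("SG37", "Ann", 12000.0)  # a member of no subclass
```

Here `m` has `staff_no`, `name` and `salary` through inheritance and `bonus` of its own, while `plain` shows that not every member of the superclass needs to belong to a subclass.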
EERM - Specialization / Generalization (examples)
AllStaff table holding details of all Staff
Specialization / Generalization of Staff Entity with subclasses for
Job Roles
Specialization / Generalization of Staff Entity with subclasses for
Job Roles and Contracts of Employment
Specialization / Generalization of Staff Entity with a shared
subclass and a subclass with its own subclass
EERM - Specialization / Generalization (Constraints)
Constraints on Specialization / Generalization:
Participation constraints
determine whether every member of the superclass must be a
member of at least one subclass
a specialization/generalization may be mandatory or optional
Disjoint constraints
describe the relationship between members of the subclasses and
indicate whether a member of the superclass can be a member of
one, or more than one, subclass
a specialization/generalization may be disjoint or non-disjoint
EERM - Specialization / Generalization (Constraints)
Based on the above constraints, a specialization / generalization
relationship can fall into one of the following four types:

                            Participation constraint
                            mandatory                   optional
Disjoint     disjoint       mandatory & disjoint        optional & disjoint
constraint   non-disjoint   mandatory & non-disjoint    optional & non-disjoint
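The four combinations can be illustrated by classifying a sample population. The function below is a minimal sketch with hypothetical names; it inspects one set of occurrences only, whereas the constraints themselves are declared on the model and must hold for every valid population.

```python
def classify(superclass, subclasses):
    """Return (participation, disjointness) observed in one population.

    superclass -- set of member identifiers of the superclass
    subclasses -- list of sets, each a subset of the superclass
    """
    covered = set().union(*subclasses)
    # Mandatory: every superclass member appears in at least one subclass.
    participation = "mandatory" if covered == superclass else "optional"
    # Disjoint: no member is counted in more than one subclass.
    total_memberships = sum(len(s) for s in subclasses)
    disjointness = "disjoint" if total_memberships == len(covered) else "non-disjoint"
    return participation, disjointness


staff = {"S1", "S2", "S3"}
case_a = classify(staff, [{"S1"}, {"S2"}])                # S3 is in no subclass
case_b = classify(staff, [{"S1", "S2"}, {"S2", "S3"}])    # S2 is in both subclasses
```

With this data `case_a` is optional and disjoint, and `case_b` is mandatory and non-disjoint.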
EERM - Specialization / Generalization (Constraints - Examples)
Staff Superclass with Supervisor and Manager Subclasses
Owner Superclass with PrivateOwner and BusinessOwner Subclasses
Person Superclass with Staff, PrivateOwner, and Client Subclasses
An EERD of Branch View with Specialization/Generalization


EERM - Aggregation
Aggregation
represents a ‘has-a’ or ‘is-part-of’ relationship between entity
types, where one represents the ‘whole’ and the other ‘the part’
EERM - Composition
Composition
a specific form of aggregation that represents an association
between entities, where there is a strong ownership and
coincidental lifetime between the ‘whole’ and the ‘part’
Design Principles
Faithfulness
A design should be faithful to the specifications of the applications, i.e. it should
reflect reality.
Avoid redundancy
Be careful to say everything once, otherwise you may end up producing a confusing
and inconsistent design.
Simplicity
Avoid introducing more elements into your design than is absolutely necessary.
Choose the correct relationships
Adding every possible relationship is not always a good idea. It can lead to redundancy,
wasted storage and complex updates, and it may still fail to represent faithfully the users’
perception of relationships (connection traps).
To overcome the problem, check the validity of any assumptions you make and also
the queries that will be asked.
Design Principles
Picking the Right Kind of Element
Sometimes options exist regarding the type of design elements used to
represent reality. In general an attribute is simpler to implement than either an
entity or a relationship. However, making everything an attribute is not wise
either.
In general the following rule can be applied:
Let E be an entity type
whose attributes collectively identify the entity, i.e. if E has more than one attribute
then no attribute depends on the other attributes, and
that is involved only in one-to-many relationships, with E always on the one side
of the relationship, and
that is not involved in any relationship more than once.
Then E can be removed and its attributes become (suitably
renamed, if necessary) attributes of each entity it is related to. If E participates
in a multi-way relationship then its attributes should be made attributes of the
multi-way relationship instead.
The Relational Data Model
Relational Model - Instances of Branch and Staff Relations
Relational Model - Examples of Attribute Domains
Relational Model (Terminology)
Relation
(conceptually) a table with columns and rows
Attribute
a named column of a relation
Domain
the set of allowable values for one or more attributes
Tuple
a row of a relation
Degree
the number of attributes in a relation
Cardinality
the number of tuples in a relation
Relational Database
a collection of normalized relations with distinct relation names
Alternative Terminology for Relational Model
Data Redundancy
A major aim of relational database design is to group attributes into
relations to minimize data redundancy and reduce the file storage space
required by base relations.
Data Redundancy
The StaffBranch relation has redundant data: details of a branch are
repeated for every member of staff.
In contrast, branch information appears only once for each
branch in the Branch relation and only branchNo is repeated in the Staff
relation, to represent where each member of staff works.
Update Anomalies
Relations that contain redundant information may potentially
suffer from update anomalies.
Types of update anomalies include:
Insertion,
Deletion,
Modification.
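Two of these anomalies can be reproduced on a small StaffBranch-style relation. The sketch below uses made-up rows, and the attribute name bAddress is an assumption for illustration; only the relation name and branchNo come from the slides.

```python
# A redundant StaffBranch relation: branch details repeat per member of staff.
staff_branch = [
    {"staffNo": "SL21", "name": "John",  "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SL41", "name": "Julie", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SG37", "name": "Ann",   "branchNo": "B003", "bAddress": "163 Main St"},
]

# Modification anomaly: the branch address is changed in one row only,
# so the relation now holds two different addresses for branch B005.
staff_branch[0]["bAddress"] = "1 New St"
addresses_b005 = {r["bAddress"] for r in staff_branch if r["branchNo"] == "B005"}

# Deletion anomaly: removing the last member of staff at B003
# also removes everything that was known about branch B003.
remaining = [r for r in staff_branch if r["staffNo"] != "SG37"]
branches_known = {r["branchNo"] for r in remaining}
```

The insertion anomaly is the mirror image: a new branch with no staff yet cannot be recorded in this relation without inventing or nulling the staff attributes.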
Database Relations
Relation schema
Named relation defined by a set of attribute and domain name
pairs
Relational database schema
Set of relation schemas, each with a distinct name
Properties of Relations
Relation name is distinct from all other relation names in relational schema.
Each cell of relation contains exactly one atomic (single) value (1st Normal
Form / 1NF Assumption)
Each attribute has a distinct name.
Values of an attribute are all from the same domain.
Each tuple is distinct; there are no duplicate tuples (Why?).
Order of attributes has no significance (Why?).
Order of tuples has no significance, theoretically (Why?).

Relational Keys
Superkey
An attribute, or a set of attributes, that uniquely identifies a tuple
within a relation.
Candidate Key
A superkey K of a relation R such that no proper subset of K is a superkey
within the relation. It has two properties:
In each tuple of R, values of K uniquely identify that tuple
(uniqueness).
No proper subset of K has the uniqueness property (irreducibility).
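Uniqueness and irreducibility can be checked mechanically against a sample of tuples. The helper names below are hypothetical, and the check is only a sketch: a key is a property of the relation schema, so a sample can rule a key out but can never prove that one holds for all valid data.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if the values of attrs are unique across the given rows."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Superkeys of this sample that contain no smaller superkey."""
    keys = []
    # Smaller attribute sets are tried first, so any superkey that contains
    # an already found key is reducible and is skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            reducible = any(set(k) <= set(attrs) for k in keys)
            if not reducible and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


rows = [
    {"staffNo": "SL21", "name": "John", "branchNo": "B005"},
    {"staffNo": "SG37", "name": "Ann",  "branchNo": "B003"},
    {"staffNo": "SG14", "name": "John", "branchNo": "B003"},
]
keys = candidate_keys(rows, ["staffNo", "name", "branchNo"])
```

For this sample the result contains `("staffNo",)` and also `("name", "branchNo")`. The second one appears only because the sample is small; two staff with the same name at the same branch would refute it, which is why keys are decided from the meaning of the data rather than from an instance.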
Relational Keys
Primary Key
Candidate key selected to identify tuples uniquely within relation.
Alternate Keys
Candidate keys that are not selected to be primary key.
Foreign Key
Attribute, or set of attributes, within one relation that matches the candidate key of
some (possibly the same) relation.
Relational Integrity
Null
Represents value for an attribute that is currently unknown or not
applicable for tuple
Deals with incomplete or exceptional data
Represents the absence of a value and is not the same as zero
or spaces, which are values.
Relational Integrity
Entity Integrity
In a base relation, no attribute of a primary key can be null
Referential Integrity
If foreign key exists in a relation, either foreign key value must
match a candidate key value of some tuple in its home relation or
foreign key value must be wholly null
Enterprise Constraints
Additional rules specified by users or database administrators
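The two integrity rules can be expressed as simple checks over tuples. This is a minimal sketch with hypothetical function names; Python's None stands in for null, and the sample rows are made up.

```python
def entity_integrity_violations(rows, primary_key):
    """Rows in which some attribute of the primary key is null."""
    return [r for r in rows if any(r[a] is None for a in primary_key)]


def referential_integrity_violations(child_rows, fk, parent_rows, key):
    """Child rows whose foreign key is neither wholly null nor matched."""
    parent_keys = {tuple(p[a] for a in key) for p in parent_rows}
    bad = []
    for r in child_rows:
        value = tuple(r[a] for a in fk)
        if all(v is None for v in value):
            continue  # a wholly null foreign key is permitted
        if value not in parent_keys:
            bad.append(r)
    return bad


branch = [{"branchNo": "B003"}, {"branchNo": "B005"}]
staff = [
    {"staffNo": "SG37", "branchNo": "B003"},   # matches a Branch tuple
    {"staffNo": "SL21", "branchNo": None},     # wholly null: allowed
    {"staffNo": "SX99", "branchNo": "B999"},   # dangling reference
    {"staffNo": None,   "branchNo": "B005"},   # null in the primary key
]
bad_pk = entity_integrity_violations(staff, ["staffNo"])
bad_fk = referential_integrity_violations(staff, ["branchNo"], branch, ["branchNo"])
```

Enterprise constraints would be further checks of the same shape, written for rules specific to the organization.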
Mathematical definition of relation
Consider two sets, D1 and D2, where D1 = {2, 4} & D2 = {1, 3, 5}.
Cartesian product, D1 × D2, is the set of all ordered pairs, where the first
element is member of D1 and second element is member of D2.
D1 × D2 = {(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)}
An alternative way of representing the Cartesian product D1 × D2 is to
describe its elements, i.e. all combinations of elements of D1 and D2, such
that in each such combination (or pair) the first component is an element of
D1 whereas the second is an element of D2:
D1 × D2 = {(x, y) | x ∈ D1, y ∈ D2}

Any subset of the Cartesian product is a relation;
e.g. let R = {(2, 1), (4, 1)} ⊂ D1 × D2; then R is a relation.
We may specify which pairs are in the relation using some condition for
selection; e.g.
second element is 1:
R = {(x, y) | x ∈ D1, y ∈ D2, and y = 1}
first element is always twice the second:
S = {(x, y) | x ∈ D1, y ∈ D2, and x = 2y}
With D1 = {2, 4} and D2 = {1, 3, 5} this gives S = {(2, 1)}.
Consider three sets D1, D2, D3 with Cartesian product D1 × D2 × D3; e.g.
D1 = {1, 3} D2 = {2, 4} D3 = {5, 6}
D1 × D2 × D3 =
{(1,2,5), (1,2,6), (1,4,5), (1,4,6), (3,2,5), (3,2,6), (3,4,5), (3,4,6)}
Any subset of these ordered triples is a relation.

The Cartesian product of n sets (D1, D2, . . ., Dn) is a set of tuples:
D1 × D2 × . . . × Dn = { (d1, d2, . . . , dn) | d1 ∈ D1, d2 ∈ D2, . . . , dn ∈ Dn}
Usually we write $\times_{i=1}^{n} D_i$ instead of D1 × D2 × . . . × Dn.
Any set of n-tuples from this Cartesian product is a relation on the
n sets.
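The definitions above can be checked directly in Python, where `itertools.product` computes a Cartesian product and a set comprehension plays the role of the selection condition. The sets are the ones used in the examples; the variable names are chosen for this sketch.

```python
from itertools import product

D1 = {2, 4}
D2 = {1, 3, 5}

# D1 x D2: every ordered pair (x, y) with x in D1 and y in D2.
cartesian = set(product(D1, D2))

# Relations are subsets of the product, picked out by a condition.
R = {(x, y) for (x, y) in cartesian if y == 1}        # second element is 1
S = {(x, y) for (x, y) in cartesian if x == 2 * y}    # first is twice the second

# The three-set example: 2 * 2 * 2 = 8 ordered triples.
triples = set(product({1, 3}, {2, 4}, {5, 6}))
```

Both `R` and `S` are subsets of `cartesian`, which is all that is required for them to be relations on D1 and D2.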
Views
Base Relation
Named relation corresponding to an entity in conceptual schema,
whose tuples are physically stored in database.
View
Dynamic result of one or more relational operations operating on
base relations to produce another relation.
Views
A view is a virtual relation that does not necessarily exist in the
database but is produced upon request, at the time of request.
Contents of a view are defined as a query on one or more base
relations.
Views are dynamic, meaning that changes made to base
relations that affect view attributes are immediately reflected in
the view.
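The dynamic nature of a view can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: the Staff columns, the view name StaffB003 and the rows are invented for illustration, and SQL is used only as one concrete way to define a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (
    staffNo  TEXT PRIMARY KEY,
    name     TEXT,
    salary   REAL,
    branchNo TEXT
);
INSERT INTO Staff VALUES ('SL21', 'John', 30000, 'B005');
INSERT INTO Staff VALUES ('SG37', 'Ann',  12000, 'B003');

-- The view is stored as a query, not as data; it also hides salary.
CREATE VIEW StaffB003 AS
    SELECT staffNo, name FROM Staff WHERE branchNo = 'B003';
""")

before = conn.execute("SELECT staffNo FROM StaffB003 ORDER BY staffNo").fetchall()

# A change to the base relation is visible through the view straight away.
conn.execute("INSERT INTO Staff VALUES ('SG14', 'David', 18000, 'B003')")
after = conn.execute("SELECT staffNo FROM StaffB003 ORDER BY staffNo").fetchall()
```

The view exposes only staffNo and name for one branch, so a user restricted to it never sees salaries or staff at other branches.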
Purpose of Views
Provides a powerful and flexible security mechanism by hiding
parts of the database from certain users.
Permits users to access data in a customized way, so that the same
data can be seen by different users in different ways, at the same
time.
Can simplify complex operations on base relations.

Updating Views
All updates to a base relation should be immediately reflected in all views that
reference that base relation.
If a view is updated, the underlying base relation should reflect the change.
There are restrictions on the types of modifications that can be made through
views:
Updates are allowed if the view is defined by a query on a single base relation
and contains a candidate key of that base relation.
Updates are not allowed through views involving multiple base relations.
Updates are not allowed through views involving aggregation or grouping operations.
Classes of views are defined as:
theoretically not updateable
theoretically updateable
partially updateable
Relational Languages
Relational algebra and relational calculus are formal languages
associated with the relational model.
Informally,
relational algebra is a (high-level) procedural language (how data
is to be manipulated), whereas
relational calculus is a non-procedural language (what data is
required).
Relational Algebra
Relational algebra operations work on one or more relations to
define another relation without changing the original relations.
Both operands and results are relations, so output from one
operation can become input to another operation. This property is
called closure.
Allows expressions to be nested, just as in arithmetic.

Relational Algebra
There are 5 basic operations in relational algebra:
Selection,
Projection,
Cartesian product,
Union and
Set Difference.
These perform most of the data retrieval operations needed.
There are also Join, Intersection, and Division operations, which can
be expressed in terms of the 5 basic operations.
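The five basic operations, and two of the derived ones, can be sketched over a toy representation in which a relation is a pair (heading, set of tuples). The representation, the function names and the sample data are assumptions of this sketch; the Cartesian product here assumes the two headings share no attribute names.

```python
def select(rel, pred):
    heading, rows = rel
    return heading, {r for r in rows if pred(dict(zip(heading, r)))}

def project(rel, attrs):
    heading, rows = rel
    idx = [heading.index(a) for a in attrs]
    return tuple(attrs), {tuple(r[i] for i in idx) for r in rows}

def cartesian(rel1, rel2):
    (h1, r1), (h2, r2) = rel1, rel2
    return h1 + h2, {a + b for a in r1 for b in r2}

def union(rel1, rel2):
    return rel1[0], rel1[1] | rel2[1]

def difference(rel1, rel2):
    return rel1[0], rel1[1] - rel2[1]

# Derived operations, written only in terms of the basic ones.
def intersection(rel1, rel2):
    return difference(rel1, difference(rel1, rel2))

def join(rel1, rel2, pred):
    return select(cartesian(rel1, rel2), pred)

staff = (("staffNo", "name", "worksAt"),
         {("SL21", "John", "B005"), ("SG37", "Ann", "B003")})
branch = (("branchNo", "city"),
          {("B005", "London"), ("B003", "Glasgow")})

# Closure: each result is a relation, so the calls nest like arithmetic.
in_glasgow = project(
    select(join(staff, branch, lambda t: t["worksAt"] == t["branchNo"]),
           lambda t: t["city"] == "Glasgow"),
    ["name"],
)
```

Because every function takes relations and returns a relation, the nested expression for `in_glasgow` is well formed at each level, and none of the calls modifies `staff` or `branch`.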
Relational Algebra Operations
Relational Calculus
A relational calculus query specifies what is to be retrieved rather
than how to retrieve it.
No description of how to evaluate a query.
Interested in finding tuples for which a predicate is true. Based on
use of tuple variables.
In first-order logic (or predicate calculus), a predicate is a truth-valued
function with arguments.
Tuple Relational Calculus
When we substitute values for the arguments, the function yields an expression,
called a proposition, which can be either true or false.
A tuple variable is a variable that ‘ranges over’ a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
Specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
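A Python set comprehension reads almost like the expression {S | P(S)}. In the sketch below the Staff tuples, the attribute names and the predicate P (salary above 15000) are all hypothetical; only the form of the query comes from the slides.

```python
from collections import namedtuple

StaffTuple = namedtuple("StaffTuple", ["staffNo", "name", "salary"])

# The relation the tuple variable S ranges over, i.e. Staff(S).
Staff = {
    StaffTuple("SL21", "John", 30000),
    StaffTuple("SG37", "Ann", 12000),
    StaffTuple("SG14", "David", 18000),
}


def P(S):
    """Predicate: a truth-valued function of the tuple variable S."""
    return S.salary > 15000


# {S | Staff(S) and P(S)}: says what is wanted, not how to search for it.
result = {S for S in Staff if P(S)}
```

Substituting a particular tuple for S turns P(S) into a proposition that is either true or false; the result is the set of tuples for which it is true.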
Relational Calculus
If a predicate contains a variable (e.g. ‘x is a member of staff’),
there must be a range for x.
When we substitute some values of this range for x, the proposition
may be true; for other values, it may be false.
When applied to databases, relational calculus has two forms: tuple
and domain.
Relational Completeness
Relational algebra and relational calculus are equivalent to one
another, i.e. for every relational algebra expression we can write a
relational calculus expression that produces the same result (and vice
versa).
A language is relationally complete if it can produce any relation
that can be derived using relational calculus.
Other Languages
Transform-oriented languages are non-procedural languages that use
relations to transform input data into required outputs (e.g. SQL).
Graphical languages provide the user with a picture of the structure of the relation.
The user fills in an example of what is wanted and the system returns the required data in
that format (e.g. QBE).
4GLs can create a complete customized application using a limited set of
commands in a user-friendly, often menu-driven environment.
Some systems accept a form of natural language, sometimes called a 5GL,
although this development is still at an early stage.
The road from the EE-R model to the Relational Model
Handling strong entity types
For each strong entity type E in an EER schema
create a relation R (preferably with the same name)
Include all the simple attributes of E
Include ONLY the simple component attributes of a composite
attribute
Choose one of the key attributes of E as the primary key for R (if
the chosen key for E is composite, then the set of simple attributes
that form it will together form the primary key of R).
Handling weak entity types
For each weak entity type, say W, in the EER schema, with owner
entity type E,
create a relation R
include all the simple attributes (or simple components of
composite attributes) of W
include also, as foreign key attributes of R, the primary key
attribute(s) of the relation(s) that correspond to the owner entity
type(s)
choose as the primary key of R the combination of the primary
key(s) of the owner(s) and the partial key of the weak entity type W.
Note: The identifying relationship of W does not need to be mapped to a
relation of its own, because the attributes of such a relationship are always
a subset of the attributes of the relation for the weak entity type itself,
and thus it does not provide any additional information.
Handling relationship types
For every relationship R in an EERD, identify the relations
that correspond to the entity types participating in R, and
create a relation that:
includes the primary key of each relation corresponding to an
entity type involved in the relationship R.
includes all the simple attributes of R, if any.
includes ONLY the simple component attributes of any composite
attributes of R.
If an entity type is involved several times in a relationship, its
attributes must be renamed in the corresponding relation.
The primary key of the relation created for R is a combination of the primary
key(s) of all the relations corresponding to entity types involved in
the relationship R.
Handling binary 1:1 relationship types
If the relationship R is a binary 1:1 relationship then
the schema of the relation corresponding to either of the two entity
types participating in R can be amended to include the attributes of R
and, as a foreign key, the primary key of the relation corresponding to
the other entity type, so that no separate relation for R is needed.
It is preferable to choose the relation that corresponds to an entity
type with mandatory participation in R.
Handling binary 1 : M relationship types
If the relationship R is a binary 1:* relationship then the schema of
the relation corresponding to the *-side entity type of the relationship
can be amended to include:
all the simple attributes of the relationship R
and also,
as a foreign key, the primary key attributes of the relation
corresponding to the other entity type of the relationship.
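As an illustration of the 1:* rule, the sketch below maps a hypothetical "Branch has Staff" relationship by placing Branch's primary key in Staff as a foreign key. It uses Python's sqlite3 module; the table and column names are assumptions, and foreign key enforcement has to be switched on explicitly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Branch (
    branchNo TEXT PRIMARY KEY,
    city     TEXT
);
-- Staff is on the *-side, so it receives Branch's primary key.
CREATE TABLE Staff (
    staffNo  TEXT PRIMARY KEY,
    name     TEXT,
    branchNo TEXT REFERENCES Branch(branchNo)
);
INSERT INTO Branch VALUES ('B003', 'Glasgow');
INSERT INTO Staff  VALUES ('SG37', 'Ann', 'B003');
""")

# A member of staff pointing at a branch that does not exist is rejected.
try:
    conn.execute("INSERT INTO Staff VALUES ('SX99', 'Eve', 'B999')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

staff_count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
```

No separate relation is created for the relationship itself; the foreign key column in Staff carries it.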
Handling Multi-valued Attributes
For each multi-valued attribute A,
create a new relation R that includes an attribute corresponding to
A plus the primary key attribute K (as a foreign key in R) of the
relation that represents the entity type or relationship type that has
A as an attribute.
The primary key of R is the combination of A and K.
If the multi-valued attribute is composite then we include its simple
components.
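The rule can be pictured as flattening a nested value into its own relation. In this sketch the Branch entity, its multi-valued telNo attribute and the sample values are hypothetical; relations are modelled as Python sets of tuples.

```python
# Entity occurrences with a multi-valued attribute telNo (made-up data).
branches = [
    {"branchNo": "B003", "city": "Glasgow", "telNo": ["0141-339-2178", "0141-339-4439"]},
    {"branchNo": "B005", "city": "London",  "telNo": ["0171-886-1212"]},
]

# Relation for the entity itself: simple attributes only, key K = branchNo.
branch = {(b["branchNo"], b["city"]) for b in branches}

# New relation for A = telNo: one tuple per (K, A) combination.
# Its primary key is the pair (branchNo, telNo); branchNo is also a
# foreign key referencing the branch relation.
branch_tel = {(b["branchNo"], tel) for b in branches for tel in b["telNo"]}
```

Each cell of both relations now holds a single atomic value, which is what the first normal form assumption requires.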
Handling specialization relationship types
Convert each specialization with m subclasses {S1, S2, …, Sm}
and (generalized) superclass C, where the attributes of C are
{key, a1, a2, …, an}, into relation schemas using one of the
following options:
Option 1: Create a relation L for C with attributes attr(L) = {key, a1, a2, …,
an}, where key is the primary key. Create a relation Li for
each subclass Si, with attributes attr(Li) = {key} ∪ attr(Si), where
key is again the primary key.
Option 2: Create a relation Li for each subclass Si, with attributes
attr(Li) = {key, a1, a2, …, an} ∪ attr(Si),
where key is again the primary key. This option only suits a specialization
with mandatory participation, because an entity that belongs to no subclass
would have no relation to be stored in.
Handling specialization relationship types (cont.)
Option 3: Create a single relation L with attributes
attr(L) = {key, a1, a2, …, an} ∪ attr(S1) ∪ attr(S2) ∪ … ∪ attr(Sm) ∪ {t},
where key is again the primary key. This option is for a specialization whose
subclasses are disjoint, and t is a type attribute that indicates the subclass to
which each tuple belongs, if any. This option can generate many null values,
because each tuple is null in the attributes of every subclass it does not belong to.
Option 4: Create a single relation L with attributes
attr(L) = {key, a1, a2, …, an} ∪ attr(S1) ∪ attr(S2) ∪ … ∪ attr(Sm) ∪ {t1, t2, …, tm},
where key is again the primary key. This option is for a specialization whose
subclasses are overlapping, and each ti is a Boolean attribute that indicates
whether a tuple belongs to subclass Si.
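The four options can be compared by generating the relation schemas each one yields. The function below is a sketch with hypothetical names; it numbers the options 1 to 4 in the order they are listed above, and the Staff, Manager and Secretary example is invented for illustration.

```python
def map_specialization(key, c_attrs, subclasses, option, superclass="C"):
    """Return {relation name: attribute list} for one mapping option.

    key        -- name of the superclass key attribute
    c_attrs    -- the remaining superclass attributes a1..an
    subclasses -- dict: subclass name -> its subclass-specific attributes
    option     -- 1 to 4, in the order the options are listed in the text
    """
    sup = [key] + list(c_attrs)
    specific = [a for attrs in subclasses.values() for a in attrs]
    if option == 1:    # superclass relation plus one relation per subclass
        schemas = {superclass: sup}
        schemas.update({s: [key] + list(a) for s, a in subclasses.items()})
    elif option == 2:  # subclass relations only, each repeating C's attributes
        schemas = {s: sup + list(a) for s, a in subclasses.items()}
    elif option == 3:  # single relation with one type attribute t
        schemas = {superclass: sup + specific + ["t"]}
    elif option == 4:  # single relation with one Boolean flag per subclass
        schemas = {superclass: sup + specific + [f"t_{s}" for s in subclasses]}
    else:
        raise ValueError("option must be 1, 2, 3 or 4")
    return schemas


subs = {"Manager": ["bonus"], "Secretary": ["typingSpeed"]}
opt1 = map_specialization("staffNo", ["name", "salary"], subs, 1, "Staff")
opt2 = map_specialization("staffNo", ["name", "salary"], subs, 2, "Staff")
opt3 = map_specialization("staffNo", ["name", "salary"], subs, 3, "Staff")
opt4 = map_specialization("staffNo", ["name", "salary"], subs, 4, "Staff")
```

Options 1 and 2 produce several narrow relations that must be joined to reassemble an entity, while options 3 and 4 produce one wide relation in which subclass attributes are null for tuples outside that subclass.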
