Sie sind auf Seite 1von 6

CIS 211C iLearning text notes associated with text slides.

Database Design
Section 6
6.1.4 A composite UID is a set of columns that you define which makes the data in
your table unique. When you define more than one column as your primary key on a
table, it is called a composite primary key. composite primary keys are very
important when designing a data model with integrity.
6.1.5 There are three types of primary keys. A simple UID is a primary key that
consists of a column that uniquely identifies an entity, for instances a person's
Social Security number or a vehicle identification number (VIN). An artificial key is a
column created for an entity to serve solely as the primary key and is visible to
users. Examples of this would include student number, customer number or invoice
number.
6.1.10 In an entity, there may be more than one attribute that could identify the row
of data uniquely. However, in those instances, only one can be selected as the
Primary UID and the others would be Secondary UIDs. In this instance where
several attributes could be used to identify a record uniquely, we call these
candidate UIDs, because each could be a candidate for becoming a Primary UID in
and of themselves.
6.2.5 The last topic for this weeks discussion is normalization. Although well look at
three levels of normalization, and we call them first normal form, second normal
form, and third normal form, this week well look at the first normal form. The
purposes of normalization are: to ensure that data is stored in only one location in
the database; to increase data integrity; and, to identify missing tables, columns,
and constraints in the original model. First normal form states that all attributes
must be single-valued or that there be no multivalued attributes. In order for a data
model to be in first normal form, this rule has to be followed. To validate first normal
form, we check that each attribute has a single value for each occurrence of the
entity. This means that no attribute should have repeating values. Sometimes when
a database is first modeled, the first normal form is automatic. But, sometimes
there are attributes that violate first normal form. We need to decide what to do
with those. In this example, the classroom attribute of SCHOOL BUILDING has
multiple values. Therefore, to put the entity in first normal form, a separate entity
CLASSROOM is created that has a one-to-many relationship with SCHOOL BUILDING
and has its own attributes. Lets examine additional examples.
6.2.6
The examples on this slide are not in first normal form. In Example 1 there is a
CLIENT entity that has event as an attribute. With this model the CLIENT could

request more than one event, therefore violating first normal form. In example 2,
the SHOPPING MALL entity has attributes labelled name and floor for the shops.
There is usually more than one shop in a SHOPPING MALL demonstrating that this
entity has repeating groups of name and floor. Finally, in example 3, the player
attribute of the TEAM entity is a repeating group since a TEAM should have more
than one player.
6.2.7 To correct the ER diagrams in the previous slide, we can see that in example 1,
an EVENT entity is created and the attribute event is removed from the CLIENT
entity. Additionally, a one-to-many relationship is created between the two entities.
Likewise, in example 2, the SHOPPING MALL entity now has a one-to-many
relationship with the new STORE entity, which has the attributes name and floor.
Finally, example 3 is now in first normal form because a new entity PLAYER was
created with its own attributes and the player attribute was removed from the TEAM
entity. A one-to-many relationship was created to show that a TEAM may have one
or more PLAYERs.
6.3.5
In the process of efficiently storing data, and eliminating redundancy, tables in a
database are designed and created to be in one of five possible normal forms. Each
normal form contains and enforces the rule of the previous form, and, in turn,
applies some stricter rules on the design of tables. A set of tables in a database are
initially said to be in zero normal form. Tables in first normal form have a primary
key, no single column has multiple values and the columns that do not make up the
primary key (non-key attributes) depend on the primary key. Now, second normal
form is when the tables meet the criteria of first normal form AND, if the primary
key is a composite of attributes (columns), the non-key attributes (columns) must
depend on the whole key. Note that any table with a primary key that is composed
of a single attribute (column) is automatically in second normal form.
6.3.6 Lets review what we know about unique identifiers. First of all, every instance
of an entity must be identified by a unique value. In an ERD, we use the pound or
hash symbol to commonly identify the attribute that is the unique identifier. All
unique identifiers must be unique and not null. In most cases a number is typically
assigned as a unique identifier. In some cases unique identifiers are composites,
this means that they are composed of more than one attribute, or by a combination
of attributes in relationships, or even by relationships themselves. In this example, a
barred unique id is used to show that the unique ID for the ACCOUNT entity is the
combination of the ACCOUNT number and BANK number. Because no part of the UID
may ever be null, the bar is placed on the solid line near the ACCOUNT entity.
6.3.7 The ACCOUNT side of this ERD shows a second normal form violation. In this
case we have a relationship between ACCOUNT and BANK. Each ACCOUNT has a
number, balance and date opened as attributes. And, the attribute location is

included with the ACCOUNT. However, the location is actually an attribute of BANK.
In this case the attribute location is not dependent on the ACCOUNT number
because the location is only relevant to the BANK. The location really belongs to the
BANK id and is dependent upon what BANK the account is held in. By moving this
attribute over to the BANK entity, we fix this second normal form violation.
6.3.9
Examine the intersection entity that is created between EVENT and SONG. Because
each event can have more than one SONG, and each SONG can be played at more
than one EVENT, this is a many-to-many relationship. The relationship was resolved
by adding an intersection entity labelled PLAY LIST ITEM. We can identify each PLAY
LIST ITEM uniquely by knowing at what EVENT it will be played and what SONG
thats being played. Therefore, we put a small bar at the many side of this
relationship to show that the unique identifier for PLAY LIST ITEM is actually the
EVENT id and the SONG id combined together. ()Because no part of the unique
identifier may ever be null, the bars are placed on the solid lines. Both of these
relationships are mandatory at the many side. You will never find an UID bar on a
dashed line. Notice the second normal form violation, and how it is corrected, with
the event date attribute being moved into the EVENT entity, and the duration
attribute being moved into the SONG entity.
6.4.4 Lets look at third normal form. Third normal form states that a non-UID
attribute cannot be dependent on another non-UID attribute. Lets examine what
that means by looking at a violation. Look at the entity CD. CD has the attribute
labelled id marked as the unique identifier. CD also has the attributes title, producer
and year. Additional attributes provide information on the store from which the CD
was purchased, the store name and address. The third normal form violation occurs
with the attribute store name and store address because these attributes are not
dependent upon the CDs id attribute. Rather, they are dependent upon the store,
and therefore are better modeled as attributes of a new entity STORE.
6.4.5 Although the CD entity is in 1NF and 2NF, it violates 3NF. To correct this
violation, we place store name and store address as attributes under a new STORE
entity. We can give STORE its own UID attribute of number and we can build a
relationship between CD and STORE. The many-to-one relationship would then read
as each CD may be in one and only one STORE, and each STORE may have one or
more CDs.() We need to say one more thing about normalization before we move
on. The three normal forms are progressive. What this means is you cannot have a
database in second normal form unless it is already in first normal form. And, the
database can only be in third normal form if it is already in first and second normal
form. Some database designs include a fourth and a fifth normal form, but our
training stops at third normal form.
Section 7

7.1.5 An arc is a diagramming method used to model two or more mutually


exclusive relationships from the same entity. The arc is drawn between the two
relationships connecting both of them and demonstrating mutual exclusivity. The
arc indicates that any instance of that entity can have only one valid relationship of
the relationship in the arc at, any one time. The arc can be drawn between two or
more relationships. An arc is sometimes known as an exclusive arc. In programming
we call this x-or or exclusive or type of situation. In this example, the BIILLBOARD
can feature only one of the following items at a time: a MOVIE, a PRODUCT
ADVERTISEMENT, or a PUBLIC ANNOUNCEMENT. To denote this on the ER model, we
use the arc and circles at the junctions to indicate mutual exclusivity.
7.1.8 Lets examine this example of an arc relationship. This example models a
mailing LIST for an email account. The mailing LIST may contain individuals, and it
may also contain groups. Each LIST item is either referring to an individual USER or
a LIST. It may not refer to both. Notice that the arc is drawn between two
relationships. The relationship in between the arc, the relationship that says each
LIST item must be contained in one and only one LIST, does not participate in the
arc. This is an independent relationship. The two relationships that participate in the
arc are shown by the presence of a circle intersecting the relationships. This page is
primarily used to demonstrate how to draw the arc relationship.
7.1.9 This example shows the relationship between a MEMBERSHIP and either a
COMPANY or a CUSTOMER. Lets read the sentences that describe the relationship.
In this case, watch the way the relationship is read. Each MEMBERSHIP may be
either held by one and only one COMPANY, or held by one and only one CUSTOMER.
The relationship is normal from COMPANY to MEMBERSHIP. Each COMPANY may be
the holder of one or more MEMBERSHIPs and also from CUSTOMER. Each CUSTOMER
may be the holder of one or more MEMBERSHIPs. A MEMBERSHIP may not be held
by both a COMPANY and a CUSTOMER. This is the mutual exclusivity that is modeled
by the arc. Arcs may be drawn for optional relationships and for mandatory
relationships. But, the two relationships participating in the arc must be of the same
cardinality.
7.1.11 Arcs are similar to supertypes and subtypes and can sometimes be modeled
as such. The difference is that we use supertypes and subtypes when we want to
represent classifications or types of things, that have some things in common and
some things different. We use arcs to represent mutually exclusive relationships
between entities.
7.2.5 It is not uncommon that an entity has a barred relationship to another entity
that has a barred relationship to a third entity and so on. This is called a cascading
composite UID. Look carefully at this example. The highest level of this hierarchy is
BUILDING. The BUILDING has an id, name, and address. A FLOOR is represented by
a floor number in a particular building, but since every BUILDING has a first floor,
the FLOOR number is not enough to be a unique identifier. It must be associated

with a particular BUILDING. A SUITE is then identified by its number, the FLOOR
number containing the SUITE and the BUILDING id. In sequence, the ROOM is
identified by its id, the SUITE number, the FLOOR number, and the BUILDING id. All
four of those attributes are part of the unique identifier for the room.
7.2.6 Next we will review two methods for modeling a hierarchical relationship. The
first method is through a strict hierarchy, which shows levels of organization. The
second method is called a recursive relationship, which shows a single entity related
to itself. This circular relationship is sometimes known in data modeling circles as a
pigs ear. The example on the left shows a typical corporate hierarchy. The
company president or chief executive officer is at the top of the hierarchy. With
directors in the next level and so forth until we find the workers that are at the
bottom of the hierarchy. This hierarchy is modeled through an ER diagram showing
different levels with many-to-one relationships. More commonly you will see this
model inverted with the most volatile entity on the top, in this case ENTRY-LEVEL
EMPLOYEE. And the least volatile entity, the entity with the fewest instances at the
bottom, in this case PRESIDENT. Notice that this is a chain of many-to-one
relationships.
7.2.7 A recursive relationship is often used to model employee relationships.
Examine this recursive relationship with PARTNER. Notice PARTNERs have their own
attributes: id, first name, and last name. The subtypes are all PARTNERs. In this case
each PARTNER (as EVENT PLANNER or DJ) may be managed by one and only one
MANAGER and going the other way around, each PARTNER (with the subtype
MANAGER) may be the manager of one or more PARTNERs (such as EVENT PLANNER
or DJ). Recursive relationships must be optional on both sides. Notice that the line is
dashed from one end to the other. You might be asked why. Well, lets think what
would happen to the first and last employees.() If the MANAGER of the event must
be supervised by someone, that doesnt make sense. Or, if a DJ at the bottom of the
seniority level must supervise somewhere else, that cant be true also. The
recursive relationship is optional at both ends because someone has to be at the
top of the relationship. And someone else has to be at the bottom. In other words,
all PARTNERs will have a manager_id except the MANAGER. So, if the entity is
PARTNER and the relationship is supervised by or supervises then the MANAGER
will not have a MANAGER. Also, all PARTNERs will not necessarily be MANAGERs,
especially those at the bottom of the hierarchy.
7.2.8 This example shows a recursive relationship that is a many-to-many type of
relationship. This relationship is common when modeling COMPONENTs. For
example, in the automotive industry we could think of a fan as being a COMPONENT.
A fan has parts such as, blades, bolts, and pulleys. So the fan would be made up of
those other parts. But, a fan is also a COMPONENT or part of an engine. Because
this is a many-to-many relationship, we need to add an intersection entity called
ASSEMBLY RULE to successfully resolve this many-to-many relationship.

7.3.4 Modeling data over time requires special consideration. When you have
attributes that change over time you have to question whether there is a need to
keep historical data. Sometimes dates may be modeled as attributes and other
times dates must be modeled as separate entities with a relationship to the entity in
question. Well examine some examples of this. Some considerations must be made
when time and dates are being modeled. If dates are stored as attributes, then
every update means a loss of historical information. If you choose to model time as
separate entities, your model can become more complex. Its often difficult to join
tables together when time conditions are modeled. But, an advantage is that users
can work in advance to model a database that can be implemented at a future time.
7.3.5 This example discusses the weekly salary attribute for the EMPLOYEE entity,
which would be overwritten when updated. However, with the new guideline that
historical salary data be maintained, another entity for the historical salary must be
created to hold that information.
7.3.6 To model the historical salary data, the SALARY HISTORY entity is created, and
a relationship is made between EMPLOYEE and SALARY HISTORY. Notice that the
salary start date is mandatory, and it is in fact part of the unique id, and the salary
end date is optional. This allows the database to contain SALARY HISTORYs that are
not yet completed. The unique identifier for SALARY HISTORY is the barred
relationship to EMPLOYEE; that is, the UID is the EMPLOYEE id and the SALARY
HISTORY salary start date since the date and employee id are tied together and may
change over time as a new salary is entered.
7.3.7
This example details the way to model historical lending. In this case a JEWELRY
PIECE is named and has a code as a unique identifier. It also has attributes of who is
renting the piece at any given moment, but not who has rented it, the historical
data. Also, a JEWELRY PIECE may have one or more MOVIE STARs who rent the piece
depending upon what time is being examined. Therefore, the model must be revised
to a many-to-many relationship.
7.3.8 Since we know that many-to-many relationships can be resolved by an
intersection entity, we create the RENTAL HISTORY entity. Each RENTAL HISTORY
may be for one and only one JEWELRY PIECE. But, what is the unique identifier for
this entity? It has a start date, the start date is a unique identifier in combination
with the JEWELRY PIECE code. The start date alone is not enough to be a unique
identifier because various JEWELRY PIECEs may have the same start date for a
specific event. Therefore, the RENTAL HISTORY entity is uniquely identified by both
the start date and the JEWELRY PIECE code. Remember that some models may not
be all inclusive on the business rules, and will require programming to enforce the
relationships.

Das könnte Ihnen auch gefallen