Sie sind auf Seite 1von 22

03

Databases

IT3010 Data management FOR business analytics


REMINDER:
NO CLASS NEXT WEEK
3

QUICK RECAP
Organisations store data, but where?

Storing data for different purposes in different files is highly problematic. It


makes managing the consistency, constraints and access to the data very difficult.
It also creates redundant copies of the data (which can also lead to
inconsistency). These affect the quality of the data we use for decision making.

Sales Department Accounting Department

Accounts Accounts
Quotation Order Generation
Receivables Payable
Program Program
Program Program

Order Fulfillment Invoice Management


System System

Customer Inventory Orders Customer Inventory


Master File File Master Pricing
File File File
4

QUICK RECAP
So we use databases.

Databases avoid all the data problems associated with file storage. They provide
a (usually) single source/repository for organisational data. Modern enterprise
systems (ES) are built on a central database, and that is where data resides.

Sales Department Accounting Department

Accounts Accounts
Quotation Order Generation
… Receivables Payable …
Applications Applications
Applications Applications

Enterprise System
(Central database)

https://hbr.org/1998/07/putting-the-enterprise-into-the-enterprise-system
5

QUICK RECAP
About databases

A database is a collection of logically related information. ‘Logically related’ means


the data represents the relationships between real world entities (eg departments in
a business process). This data usually stored in tables (in a relational database).

Today, we talk
more about
Enterprise System
tables. (Central database)
- - - - -
- - - - -
- - - - - - - -
- - - - -
- - - - - - - - - -
- - -
- - - - - - - -
- - - - - - - - - - - -
- - - - - - - - - - - -
- - - -
6

QUICK RECAP
DBMS - the software that manages databases

A database management system is a software system that manages databases. It


decides how best data should be stored physically (as bits on a disk), and allows us to
look at the data, or subsets of the data, easily. It controls access to the data, making sure
people and applications cannot corrupt, or make inconsistent, any data. At the same
time, it gives access to the data to multiple concurrent users and applications.

USERS AND
APPLICATIONS

VIEW

LOGICAL

DBMS

PHYSICAL
7

READINGS

Davenport, T. (1998) “Putting the Enterprise into


the Enterprise System”, hbr.org, July-August.
How organisations moved from isolated data to
centralised databases, and how those
databases capture and change processes,
structure and strategy.

Database System Concepts


Chapter 1: Introduction

Chapter 2: Introduction to the Relational Model
8

WHAT ARE WE DOING TODAY?


COMPLEXITY

DESIGNING DATABASES
THE RELATIONAL data MODEL Creating a good structure for our data
to be stored.
How is data structured? How do we decide how the
data should be stored? What conditions do we need to
follow, if we wanted to create our own database?

TIME
THE RELATIONAL MODEL
10

THE RELATIONAL (DATA) MODEL


Most databases today rely on it

A data model is a high level conceptual paradigm that describes how data
should be structured, so that it reflects how the data elements relate to
one another.
Organisational data represents real life, and so the data model often captures the
flow of data in business workflows and processes.

There are several common data models, but the relational model is the
predominant one as far as databases are concerned. According to the
relational model, data should be captured as records, in two-dimensional
tables that represent both the data and the data’s relationships. These
tables are known as relations. The rows correspond to individual records,
the columns correspond to their attributes.
11

THE RELATIONAL (DATA) MODEL


Terminology
COLUMNS or
RELATION or TABLE
ATTRIBUTES or
NAME FIELDS

STUDENT
Stu No First Name Last Name Age CAP Programme Faculty

12 John Blue 23 4 BZA SOC

13 Lee White 44 4.95 ISC SOC ROWS or


RECORDS or
14 Judith Green 31 4.1 BIZACC BIZ TUPLES

15 Robert Red 24 3.6 DEN DEN

Each table will contain only a particular type of record, dictated


by the table’s fields, which are fixed in number.
12

RULES OF THE RELATIONAL MODEL


These seven rules must be preserved

Relations must have the following properties:


The relation has a name that is distinct from all other relation names
in the relational schema
Each cell of the relation contains exactly one atomic (single) value
Each attribute has a distinct name
The values of an attribute are all from the same domain
Each tuple is distinct; there are no duplicate tuples
The order of attributes has no significance
The order of tuples has no significance
LET’S TRY TO DESIGN A TABLE
14

DESIGNING A TABLE

We have a group of suppliers under contract. Each supplier has a


name, a rating on their reliability and a headquarter city location.
These suppliers ship us parts. Each type of part has a name, a colour,
a weight, and a city location where it is currently stored.
In a shipment, one supplier ships one part in a quantity that is a
multiple of 100.
What could we store this data?
15

EACH TUPLE MUST BE UNIQUE


How do we ensure this?

Each tuple should be uniquely identifiable by a set of one or more attributes. This set is
called a key.
Super key: A set of attributes that uniquely identifies a tuple within a relation. 

Candidate key: A super key such that no proper subset is a super key within the relation. 

Composite key: A candidate key with more than one attribute

Primary key: A candidate key selected to uniquely identify tuples

Alternate key: A candidate key not selected as primary key

Stu No First Name Last Name Age CAP Programme Faculty

12 John Blue 23 4 BZA SOC

13 Lee White 44 4.95 ISC SOC

14 Judith Green 31 4.1 BIZACC BIZ

15 Robert Red 24 3.6 DEN DEN


16

HOW DO YOU DO THIS IN SQL?


17

ANOTHER EXAMPLE
Let’s capture an old paper invoice into digital data
18

NORMALISATION AND ER MODELLING


Figuring out what goes into a relation, optimised

Normalisation, a bottom up approach


Aims to remove redundancy of data, therefore reducing storage space
required and speeding up searches

Occurs progressively using a set of rules (normal forms)

Entity-relationship modelling, a top-down approach


Identifies objects/entities in a system, then maps their relationships

Both lead to a design that can be made into a physical database


Then the tables can be created in the database and populated with data
19

NORMALISATION: 1NF

First Normal Form


The domain of an attribute must include only atomic values, and the value of
any attribute in a tuple must be a single value from the domain of that attribute.
ie. disallows multi-valued attributes
What to do if you’re not in 1NF
If a max number of values for an attribute is known, replace the attribute with
atomic attributes (eg. Qty1, Qty2 since only 2 items allowed). If not, remove the
violating attribute and place it in a separate relation, and propagate the primary
key

Every relation should at least be in 1NF.


20

NORMALISATION: 2NF

Second Normal Form


If a relation exists where a primary key contains multiple attributes,
all non-key attributes that are dependent on part of the primary key
should be separated.

What to do if you’re not in 2NF


Decompose the relation, and create a new relation with the partial
key and its dependent attributes. Keep a relation with the primary
key and attributed dependent on it.
21

NORMALISATION: 3NF

Third Normal Form


If a relation exists where a non-key attribute can be determined by
another non-key attribute (or set of non-key attributes), it is not in
3NF.

What to do if you’re not in 3NF


Decompose the relation, and separate the non-key dependency in
another relation
Also remove any attributes that can be computed
22

TRY AGAIN

We have a group of suppliers under contract. Each supplier has a


name, though that’s not necessarily unique, a rating on their
reliability and a headquarter city location. These suppliers ship us
parts. Each type of part has a name, a colour, a weight, and a city
location where it is currently stored. In a shipment, one supplier
ships one part in a quantity that is a multiple of 100. Design the
shipment database.

Das könnte Ihnen auch gefallen