Beruflich Dokumente
Kultur Dokumente
3 4
5 6
1
File Processing Systems (FPS) Disadvantages of FPS
PLI Cust-No Redundancy of data: Identical data are distributed
User 1
Application
OS Customer
Cust-Name
Address
over various files – a major source of problems
Program 1
Customer list Master Credit-Code Waste of storage space: When the same field is stored
File Description
in several files, the required storage space is needlessly
COBOL
Cust-No high high storage cost
User 2 Cust-Name
Application
OS Invoice Address Multiple updates: One field may be updated in one file
Program 2 Part-No
Monthly invoice Master
Qty-Ordered but not in others inconsistency and lack of data
File Price integrity and hence potential conflicting reports
PASCAL Part-No Multiple programming languages: Dealing with several
User 3 Part-Descr
Application
OS Inventory Vendor-No
programming languages which are often not user
Parts list
Program 3 Master Qty-In-Stock friendly high system maintenance cost
File Qty-On-Order 7 8
Data
Data Items :
Cust-No
Integrity and reliability
Qty-Orderd
User 3 Manipulation Cust-Name Price
Language Address Part-descr Data abstraction and independence
Credit-Code Vendor-No
Parts list Description Qty-In-Stock
Part-No Qty-On-Order9 10
2
A quick test! Data Modeling and
Which one of the following is the main source of Database Design
the problems in file processing systems, addressed
by databases? An Overview
A. Waste of storage space.
B. Update anomalies, which result in lack of data integrity.
C. Data redundancy.
D. Data inconsistency.
13 14
17 18
3
Three Views of Data Three Views / Levels of Data
Employee name SIN
External view Internal (physical) level
Employee address Annual salary
A block of consecutive bytes actually holding the data
User 1 Employee name: string
User 2 Conceptual (logical) level
SIN: dec, key
type emp = record
Conceptual view Employee address: string SIN : integer;
Employee health card No: string, unique
name : string;
Annual salary: float
address : string;
salary : real;
healthCard : string;
Database Administrator (DBA) Employee name: string length 25 offset 0
SIN: 9 dec offset 25 unique end
Internal view Employee health card No: string length 10 offset 34 unique External (logical) level
Employee address: string length 51 offset 44 View 1 : (emp.name, emp.address)
Annual salary: 9,2 dec offset 95 19 View 2 : (emp.SIN, emp.salary) 20
4
Architecture of a DBMS
DBMS Implementation There are 3 types of inputs Queries
to DBMS: Schema Transactions
Queries Modifications
Overview Transactions, i.e., Query
data Modifications Processor
Transaction
Schema Creations/ Manager
Modifications Storage
Manager
Data
25
Metadata 26
5
Users of a Database System
Specialized user
Naive user Database Programming
Query
processor Telecom system
Telecom system
DBMS and
Overview
Compiled its data manager DDL compiler
user interface
OS or own Telecom system
file manager
Compiled
application OS disk
program manager
DBA
6
Example SQL Query Example SQL Query
Relation schema: Relation schema:
Course (courseNumber, name, noOfCredits) Movie (title, year, length, filmType)
Query: Query:
Find all the courses stored in the database Find the titles of all movies stored in the database
Query in SQL: Query in SQL:
SELECT ∗ SELECT title
FROM Course; FROM Movie;
39 40
7
Example SQL Query
Products and Joins
Relation schema:
Movie (title, year, length, filmType) SQL has a simple way to “couple” relations in one query
Query: How? By “listing” the relevant relation(s) in the FROM clause
Find the titles of color movies that are either made after 1970 or
are less than 90 minutes long All the relations in the FROM clause are coupled through
Query in SQL: Cartesian product (shown as × in algebra notation)
SELECT title
FROM Movie
WHERE (year > 1970 OR length < 90) AND filmType = ’color’;
Note the precedence rules, when parentheses are absent:
AND takes precedence over OR, and
NOT takes precedence over AND and OR 43 44
45 46
Example Example
Instance R: Instance S: Instance of Student: Instance of Course:
ID firstName lastName GPA Address courseNumber name noOfCredits
A B B C D
111 Joe Smith 4.0 45 Pine av. Comp352 Data structures 3
1 2 2 5 6
222 Sue Brown 3.1 71 Main st. Comp353 Databases 4
3 4 4 7 8 333 Ann Johns 3.7 39 Bay st.
9 10 11
SELECT ∗ FROM Student, Course;
R x S: A R.B S.B C D ID firstName lastName GPA Address courseNumber name noOfCredits
1 2 2 5 6 111 Joe Smith 4.0 45 Pine av. Comp352 Data structures 3
1 2 4 7 8 111 Joe Smith 4.0 45 Pine av. Comp353 Databases 4
1 2 9 10 11 222 Sue Brown 3.1 71 Main st. Comp352 Data structures 3
3 4 2 5 6 222 Sue Brown 3.1 71 Main st. Comp353 Databases 4
8
Example Example
Instance of Student: Instance of Course:
ID firstName lastName GPA Address courseNumber name noOfCredits
Relation schemas:
111 Joe Smith 4.0 45 Pine av. Comp352 Data structures 3 Student (ID, firstName, lastName, address, GPA)
222 Sue Brown 3.1 71 Main st. Comp353 Databases 4 Ugrad (ID, major)
333 Ann Johns 3.7 39 Bay st.
Query:
Find “all” information about every undergraduate student
SELECT ID, courseNumber ID courseNumber
333 Comp352
333 Comp353
49 50
Example Example
Instance of Student: Instance of Ugrad: Instance of Student: Instance of Ugrad:
ID firstName lastName GPA Address ID major ID firstName lastName GPA Address ID major
111 Joe Smith 4.0 45 Pine av. 111 CS 111 Joe Smith 4.0 45 Pine av. 111 CS
222 Sue Brown 3.1 71 Main st. 333 EE 222 Sue Brown 3.1 71 Main st. 333 EE
333 Ann Johns 3.7 39 Bay st. 333 Ann Johns 3.7 39 Bay st.
9
Joining Relations Joining Relations
Relation schemas: Relation schemas:
Movie (title, year, length, filmType)
Movie (title, year, length, filmType)
Owns (title, year, studioName)
Owns (title, year, studioName) StarsIn (title, year, starName)
Query: Query:
Find the title and length of every movie produced by Find the title and length of Disney movies with JR as an actress.
Disney studio.
Query in SQL: Query in SQL:
SELECT Movie.title, length SELECT Movie.title, Movie.length
FROM Movie, Owns, StarsIn
FROM Movie, Owns
WHERE Movie.title = Owns.title AND Movie.year = Owns.year
WHERE Movie.title = Owns.title AND AND Movie.title = StarsIn.title AND Movie.year = StarsIn.year
Movie.year = Owns.year AND studioName = ’Disney’; AND studioName = ’Disney’ AND starName = ‘JR’;
55
10
Example Example
Relation schema: Relation schema:
Exec (name, address, cert#, netWorth) Exec (name, address, cert#, netWorth)
Query: Query:
How many movie executives are there in the Exec relation? How many different names are there in the Exec relation?
Query in SQL: Query in SQL:
SELECT COUNT(*) SELECT COUNT (DISTINCT name)
FROM Exec; FROM Exec;
The use of * as a parameter is unique to COUNT; In query processing time, the system first eliminates the duplicates
Its use for other aggregation operations makes no sense. from the column name, and then counts the number of present
values
61 62
11
A rule about null values! Another test!
The answer: Consider again the same instance of R(A,B) containing the tuples:
A B (null,1), (2,1), (null, null), (3,2), (2,3), and (1,null). Which of the
Select A, Sum(B) null 1 following tuples will be in the result of the query below?
From R 2 1
Group By A; null null Select A, Sum(B)
3 2 From R
√ (null, null) 2 3 Where B<>2
Group By A;
(2,4) 1 null
(1,null)
A. (null, 0)
(null,1)
B. (1,null)
C. (2,3)
D. (2,4)
67 68
(null, 0) null 1
(1,null) 2 1
(2,3) 2 3
√ (2,4)
69 70
12
Order By Database Modifications
The SQL statements/queries we looked at so far return an unordered SQL & Database Modifications?
relation/bag. What if we want the result displayed in a certain order?
We now look at SQL statements that do not return tuples,
Movie (title, year, length, filmType, studioName, producerC#)
but rather change the state (content) of the database
SELECT Exec.name, SUM(Movie.length) There are three types of such statements/transactions:
FROM Exec, Movie Insert tuples into a relation
WHERE producerC# = cert# Delete certain tuples from a relation
GROUP BY Exec.name Update values of certain attributes of certain existing tuples
HAVING MIN(Movie.year) < 1930 These types of operations that modify the database content are
ORDER BY Exec.name ASC; referred to as transactions
In general:
ORDER BY A ASC, B DESC, C ASC; 73 74
Insertion Insertion
The insertion statement consists of: Relation schema:
The keyword INSERT INTO StarsIn (title, year, starName)
The name of a relation R
Update the database:
A parenthesized list of attributes of the relation R Add “Sydney Greenstreet” to the list of stars of The Maltese Falcon
The keyword VALUES
In SQL:
A tuple expression, that is, a parenthesized list of concrete values,
one for each attribute in the attribute list INSERT INTO StarsIn (title,year, starName)
The form of an insert statement: VALUES(’The Maltese Falcon’, 1942, ’Sydney Greenstreet’);
INSERT INTO R(A1, …, An) VALUES (v1,…, vn) ; Another formulation of this query:
• This command inserts the tuple (v1,...,vn) to table R, where vi is INSERT INTO StarsIn
the value of attribute Ai , for i = 1,…,n 75 VALUES(’The Maltese Falcon’, 1942, ’Sydney Greenstreet’); 76
Insertion Insertion
The previous insertion statement was “simple” in that it Database schema:
added one tuple only into a relation Studio(name, address, presC#)
Instead of using explicit values for a tuple in insertion, we can Movie(title, year, length, filmType, studioName, producerC#)
request a set of tuples to be inserted. For this we define, in a Update the database:
subquery, the set of tuples from an existing relation Add to Studio, all studio names mentioned in the Movie relation
This subquery replaces the keyword VALUES and the tuple
expression in the INSERT statement Note: If the list of attributes in an “insert” statement does not
include all the attributes of the relation, the tuple created will
have the default value for each missing attribute
Since there is no way to determine an address or a presC#
for a studio tuple, NULL will be used for these attributes.
77 78
13
Insertion Deletion
Database schema: A delete statement consists of :
Studio(name, address, presC#) The keyword DELETE FROM
Movie(title, year, length, filmType, studioName, producerC#) The name of a relation R
Update the database: The keyword WHERE
Add to Studio, all studio names mentioned in the Movie relation A condition
In SQL: The syntax of the delete statement:
INSERT INTO Studio(name) DELETE FROM R WHERE <condition>;
SELECT DISTINCT studioName The effect of executing this statement is that “every tuple” in
FROM Movie relation R satisfying the condition will be deleted from R
WHERE studioName NOT IN (SELECT name Note: unlike the INSERT, we MAY need a WHERE clause here
FROM Studio);
79 80
Deletion Deletion
Relation schema: Relation schema:
StarsIn(title, year, starName) Exec(name, address, cert#, netWorth)
Update: Update:
Delete the tuple that says: Delete every movie executive whose net worth is < $10,000,000
Sydney Greenstreet was a star in The Maltese Falcon In SQL:
In SQL: DELETE FROM Exec
DELETE FROM StarIn WHERE netWorth < 10,000,000;
WHERE title = ’The Maltese Falcon’ AND
Anything wrong here?!
starName = ’Sydney Greenstreet’;
81 82
Deletion Update
Relation schema: Update statement consists of:
Studio(name, address, presC#) The keyword UPDATE
Movie(title, year, length, filmType, studioName, producerC#) The name of a relation R
Update: The keyword SET
Delete from Studio, those studios not mentioned in Movie A list of formulas, each of which will assign a value to an
(i.e., we don’t want to have non-productive studios!!) attribute of R
In SQL: The keyword WHERE
DELETE FROM Studio A condition
WHERE name NOT IN (SELECT StudioName The syntax of the update statement:
FROM Movie);
UPDATE R SET <new-value assignments>
> WHERE <condition>
>;
83 84
14
Update Defining Database Schema
Database schema: SQL includes two types of statements:
Studio(name, address, presC#)
DML
Exec(name, address, cert#, netWorth)
Update: DDL
Modify table Exec by attaching the title ‘Pres. ’ in front of the name So far we looked at the DML part to specify or
of every movie executive who is also the president of some studio modify the relation/database instances.
In SQL:
UPDATE Exec The DDL part allows us to define or modify the
SET name = ’Pres. ’ || name ← this line performs the update relation/database schemas.
WHERE cert# IN (SELECT presC#
FROM Studio); 85 86
15
Altering Relation Schemas Attribute Properties
Adding Columns We can assert that the value of an attribute A to be:
Add an attribute to an existing relation R: NOT NULL
ALTER TABLE R ADD <column declaration> >; • Then every tuple must have a “real” value (not null) for this
Example: Add attribute phone to table Star attribute
ALTER TABLE Star ADD phone CHAR(16); DEFAULT value
Removing Columns • Null is the default value for every attribute
Remove an attribute from a relation R using DROP: • However, we can consider/define any value we wish as the
ALTER TABLE R DROP COLUMN <column_name> >; default for a column, when we create a table.
Example: Remove column phone from Star
ALTER TABLE Star DROP COLUMN phone;
Note: Can’t drop a column, if it is the only column 91 92
16