
Chapter 3: Relational Model

Structure of Relational Databases


Relational Algebra
Tuple Relational Calculus
Domain Relational Calculus
Extended Relational-Algebra-Operations
Modification of the Database
Views

Database System Concepts 3.1 Silberschatz, Korth and Sudarshan


Why Study the Relational Model?
Most widely used model.
Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
Legacy systems in older models
E.g., IBM's IMS
Recent competitor: object-oriented model
ObjectStore, Versant, Ontos
A synthesis emerging: object-relational model
Informix Universal Server, UniSQL, O2, Oracle, DB2



Basic Structure
Formally, given sets $D_1, D_2, \dots, D_n$, a relation r is a subset of
$D_1 \times D_2 \times \dots \times D_n$
Thus a relation is a set of n-tuples $(a_1, a_2, \dots, a_n)$ where each $a_i \in D_i$
Example: if
customer-name = {Jones, Smith, Curry, Lindsay}
customer-street = {Main, North, Park}
customer-city = {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield)}
is a relation over customer-name $\times$ customer-street $\times$ customer-city
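The subset definition can be checked mechanically. The following is a minimal sketch in Python, assuming each domain is modelled as a set of strings; the variable names `all_triples` and `is_relation` are illustrative and not part of the slides.

```python
from itertools import product

# The three domains from the example.
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The full Cartesian product: every possible (name, street, city) triple.
all_triples = set(product(customer_name, customer_street, customer_city))

# The relation r from the example: a set of 3-tuples.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

# r is a relation over the three domains exactly when it is a subset
# of their Cartesian product.
is_relation = r <= all_triples
```

The product contains 4 x 3 x 3 = 36 triples, of which r picks out four.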



Attribute Types
Each attribute of a relation has a name
The set of allowed values for each attribute is called the domain of the
attribute
Attribute values are (normally) required to be atomic, that is, indivisible
E.g. multivalued attribute values are not atomic
E.g. composite attribute values are not atomic
The special value null is a member of every domain
The null value causes complications in the definition of many operations
We shall ignore the effect of null values in our main presentation and consider
their effect later



Relation Schema
$A_1, A_2, \dots, A_n$ are attributes
$R = (A_1, A_2, \dots, A_n)$ is a relation schema
E.g. Customer-schema =
(customer-name, customer-street, customer-city)
r(R) is a relation on the relation schema R
E.g. customer (Customer-schema)



Relation Instance
The current values (relation instance) of a relation are
specified by a table
An element t of r is a tuple, represented by a row in a table

customer-name   customer-street   customer-city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield

Relation customer: the columns are the attributes, the rows are the tuples.



Relations are Unordered
Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
E.g. account relation with unordered tuples



Example Instance of Students Relation

sid name login age gpa


53666 Jones jones@cs 18 3.4
53688 Smith smith@eecs 18 3.2
53650 Smith smith@math 19 3.8

Cardinality = 3, degree = 5, all rows distinct



Database
A database consists of multiple relations
Information about an enterprise is broken up into parts, with each
relation storing one part of the information

E.g.: account : stores information about accounts


depositor : stores information about which customer
owns which account
customer : stores information about customers
Storing all information as a single relation such as
bank(account-number, balance, customer-name, ..)
results in
repetition of information (e.g. two customers own an account)
the need for null values (e.g. represent a customer without an
account)
Normalization theory (Chapter 7) deals with how to design
relational schemas



E-R Diagram for the Banking Enterprise



The customer Relation



The depositor Relation



The Account Relation



Keys
Let $K \subseteq R$
K is a superkey of R if values for K are sufficient to identify a unique tuple
of each possible relation r(R)
By "possible r" we mean a relation r that could exist in the enterprise we are
modeling.
Example: {customer-name, customer-street} and
{customer-name}
are both superkeys of Customer, if no two customers can possibly have the
same name.
K is a candidate key if K is minimal
Example: {customer-name} is a candidate key for Customer.
Since it is a superkey, and no subset of it is a superkey.
Assuming no two customers can possibly have the same name.
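The two definitions can be sketched in Python as follows. This is an illustration under a stated assumption: a key is a property of every possible relation r(R), so testing a single instance can only refute a key, never prove one. The function names `is_superkey` and `is_candidate_key` are hypothetical, and the rows are the customer tuples from the Relation Instance slide.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """A superkey is minimal when none of its proper subsets is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

customer = [
    {"customer-name": "Jones", "customer-street": "Main", "customer-city": "Harrison"},
    {"customer-name": "Smith", "customer-street": "North", "customer-city": "Rye"},
    {"customer-name": "Curry", "customer-street": "North", "customer-city": "Rye"},
    {"customer-name": "Lindsay", "customer-street": "Park", "customer-city": "Pittsfield"},
]

both = is_superkey(customer, ("customer-name", "customer-street"))
name_only = is_candidate_key(customer, ("customer-name",))
not_minimal = is_candidate_key(customer, ("customer-name", "customer-street"))
```

On this instance `{customer-name, customer-street}` passes the superkey test but is not minimal, because `{customer-name}` alone already identifies each tuple.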



Determining Keys from E-R Sets
Strong entity set.
The primary key of the entity set becomes the primary key of the relation.
Weak entity set.
The primary key of the relation consists of the union of the primary key of the
strong entity set and the discriminator of the weak entity set.
Relationship set.
The union of the primary keys of the related entity sets becomes a superkey of
the relation.
For binary many-to-one relationship sets
The primary key of the many entity set becomes the relation's primary
key.
For one-to-one relationship sets, the relation's primary key can be that of
either entity set.
For many-to-many relationship sets
The union of the primary keys becomes the relation's primary key




Codd, 1970
DB2, ORACLE, SYBASE, Informix



Domain
Relation scheme (intension): $R(A_1, A_2, \dots, A_n)$
Attribute: $A_i$
Relation instance (extension): $r(t_1, t_2, \dots, t_m)$
Tuple: $t = (v_1, v_2, \dots, v_n)$
Degree or arity
Cardinality



Super-key
Candidate key
Primary key




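Q, R, S: relation schemes
q, r, s: relation instances
t, u, v: tuples
$t[A_i]$: the value of attribute $A_i$ in tuple t
t[SSN, GPA]: the values of attributes SSN and GPA in tuple t
$R.A_i$: attribute $A_i$ of relation R, e.g. STUDENT.NAME
Null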



Null
Primary key
Foreign key (FK) constraint: referential integrity constraint




1. 0 < salary < 100,000
2. QTY_SUPPLIED QTY_ORDERED
3. NAME -> AGE
4. SQL Integrity Constraints





Domain Constraints
Integrity constraints guard against accidental damage to the
database, by ensuring that authorized changes to the database do
not result in a loss of data consistency.
Domain constraints are the most elementary form of integrity
constraint.
They test values inserted in the database, and test queries to
ensure that the comparisons make sense.
New domains can be created from existing data types
E.g. create domain Dollars numeric(12, 2)
create domain Pounds numeric(12,2)
We cannot assign or compare a value of type Dollars to a value of
type Pounds.
However, we can convert type as below
(cast r.A as Pounds)
(Should also multiply by the dollar-to-pound conversion-rate)



Domain Constraints (Cont.)
The check clause in SQL-92 permits domains to be restricted:
Use check clause to ensure that an hourly-wage domain allows only
values greater than or equal to a specified value.
create domain hourly-wage numeric(5,2)
constraint value-test check(value >= 4.00)
The domain has a constraint that ensures that the hourly-wage is
at least 4.00
The clause constraint value-test is optional; useful to indicate which
constraint an update violated.
Can have complex conditions in domain check
create domain AccountType char(10)
constraint account-type-test
check (value in ('Checking', 'Saving'))
check (branch-name in (select branch-name from branch))
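The effect of a check clause can be tried out with Python's built-in sqlite3 module. This is only a sketch under assumptions: SQLite has no create domain statement, so the same condition is attached to a column instead, and the table `employee` and its columns are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The named constraint plays the role of "constraint value-test" in the
# hourly-wage domain; here it is a column constraint, not a domain constraint.
conn.execute(
    """
    CREATE TABLE employee (
        name        TEXT PRIMARY KEY,
        hourly_wage NUMERIC
            CONSTRAINT value_test CHECK (hourly_wage >= 4.00)
    )
    """
)

conn.execute("INSERT INTO employee VALUES ('Jones', 5.25)")

# A wage below 4.00 violates the check and the insert is refused.
try:
    conn.execute("INSERT INTO employee VALUES ('Smith', 3.50)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

names = [row[0] for row in conn.execute("SELECT name FROM employee")]
```

Naming the constraint is optional here as well; the name is mainly useful when reporting which constraint an update violated.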



Referential Integrity
Ensures that a value that appears in one relation for a given set of
attributes also appears for a certain set of attributes in another
relation.
Example: If Perryridge is a branch name appearing in one of the
tuples in the account relation, then there exists a tuple in the branch
relation for branch Perryridge.
Formal Definition
Let r1(R1) and r2(R2) be relations with primary keys K1 and K2
respectively.
The subset $\alpha$ of $R_2$ is a foreign key referencing $K_1$ in relation $r_1$, if for
every $t_2$ in $r_2$ there must be a tuple $t_1$ in $r_1$ such that $t_1[K_1] = t_2[\alpha]$.
Referential integrity constraint also called subset dependency since it
can be written as
$\Pi_{\alpha}(r_2) \subseteq \Pi_{K_1}(r_1)$



Referential Integrity in the E-R Model
Consider relationship set R between entity sets E1 and E2. The
relational schema for R includes the primary keys K1 of E1 and
K2 of E2.
Then K1 and K2 form foreign keys on the relational schemas for
E1 and E2 respectively.
Diagram: E1 - R - E2

Weak entity sets are also a source of referential integrity constraints.
For the relation schema for a weak entity set must include the
primary key attributes of the entity set on which it depends



Checking Referential Integrity on Database Modification
The following tests must be made in order to preserve the
following referential integrity constraint:
$\Pi_{\alpha}(r_2) \subseteq \Pi_K(r_1)$
Insert. If a tuple $t_2$ is inserted into $r_2$, the system must ensure
that there is a tuple $t_1$ in $r_1$ such that $t_1[K] = t_2[\alpha]$. That is
$t_2[\alpha] \in \Pi_K(r_1)$
Delete. If a tuple $t_1$ is deleted from $r_1$, the system must
compute the set of tuples in $r_2$ that reference $t_1$:
$\sigma_{\alpha = t_1[K]}(r_2)$
If this set is not empty
either the delete command is rejected as an error, or
the tuples that reference t1 must themselves be deleted
(cascading deletions are possible).



Database Modification (Cont.)
Update. There are two cases:
If a tuple $t_2$ is updated in relation $r_2$ and the update modifies values for
foreign key $\alpha$, then a test similar to the insert case is made:
Let $t_2'$ denote the new value of tuple $t_2$. The system must ensure
that
$t_2'[\alpha] \in \Pi_K(r_1)$
If a tuple $t_1$ is updated in $r_1$, and the update modifies values for the
primary key (K), then a test similar to the delete case is made:
1. The system must compute
$\sigma_{\alpha = t_1[K]}(r_2)$
using the old value of $t_1$ (the value before the update is applied).
2. If this set is not empty
   a. the update may be rejected as an error, or
   b. the update may be cascaded to the tuples in the set, or
   c. the tuples in the set may be deleted.
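The insert and delete tests (which the two update cases reuse) can be written out directly. The sketch below assumes relations are lists of dictionaries and uses the branch and account relations of the banking example with a few made-up rows; `can_insert` and `referencing_tuples` are hypothetical helper names.

```python
def can_insert(t2, alpha, r1, K):
    """Insert test: t2[alpha] must appear in the projection of r1 on K."""
    referenced = {tuple(t1[a] for a in K) for t1 in r1}
    return tuple(t2[a] for a in alpha) in referenced

def referencing_tuples(t1, K, r2, alpha):
    """Delete test: the tuples of r2 whose alpha value equals t1[K]."""
    key = tuple(t1[a] for a in K)
    return [t2 for t2 in r2 if tuple(t2[a] for a in alpha) == key]

# r1 = branch with primary key K; r2 = account with foreign key alpha.
branch = [{"branch-name": "Perryridge"}, {"branch-name": "Brighton"}]
account = [{"account-number": "A-102", "branch-name": "Perryridge"}]
K = ("branch-name",)
alpha = ("branch-name",)

ok = can_insert({"account-number": "A-217", "branch-name": "Brighton"}, alpha, branch, K)
bad = can_insert({"account-number": "A-999", "branch-name": "Downtown"}, alpha, branch, K)

# Deleting Perryridge would leave A-102 dangling; deleting Brighton would not.
blocked_by = referencing_tuples(branch[0], K, account, alpha)
free = referencing_tuples(branch[1], K, account, alpha)
```

When `referencing_tuples` returns a non-empty list, the system either rejects the modification or cascades it to the listed tuples.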



Referential Integrity in SQL
Primary and candidate keys and foreign keys can be specified as part of
the SQL create table statement:
The primary key clause lists attributes that comprise the primary key.
The unique key clause lists attributes that comprise a candidate key.
The foreign key clause lists the attributes that comprise the foreign key and
the name of the relation referenced by the foreign key.
By default, a foreign key references the primary key attributes of the
referenced table
foreign key (account-number) references account
Short form for specifying a single column as foreign key
account-number char (10) references account
Reference columns in the referenced table can be explicitly specified
but must be declared as primary/candidate keys
foreign key (account-number) references account(account-number)



Referential Integrity in SQL Example

create table customer
(customer-name char(20),
customer-street char(30),
customer-city char(30),
primary key (customer-name))
create table branch
(branch-name char(15),
branch-city char(30),
assets integer,
primary key (branch-name))



Referential Integrity in SQL Example (Cont.)

create table account
(account-number char(10),
branch-name char(15),
balance integer,
primary key (account-number),
foreign key (branch-name) references branch)
create table depositor
(customer-name char(20),
account-number char(10),
primary key (customer-name, account-number),
foreign key (account-number) references account,
foreign key (customer-name) references customer)



Cascading Actions in SQL
create table account
...
foreign key(branch-name) references branch
on delete cascade
on update cascade
...)
Due to the on delete cascade clause, if a delete of a tuple in
branch results in a referential-integrity constraint violation, the
delete cascades to the account relation, deleting the tuples that
refer to the branch that was deleted.
Cascading updates are similar.
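Both cascading actions can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: column names use underscores because hyphens are not valid in unquoted SQL identifiers, the rows are made up, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.executescript(
    """
    CREATE TABLE branch (
        branch_name TEXT PRIMARY KEY
    );
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        branch_name    TEXT,
        FOREIGN KEY (branch_name) REFERENCES branch
            ON DELETE CASCADE
            ON UPDATE CASCADE
    );
    INSERT INTO branch VALUES ('Perryridge'), ('Brighton');
    INSERT INTO account VALUES ('A-102', 'Perryridge'), ('A-217', 'Brighton');
    """
)

# Deleting a branch removes the accounts that refer to it.
conn.execute("DELETE FROM branch WHERE branch_name = 'Perryridge'")

# Changing a branch's primary key rewrites the referencing accounts.
conn.execute(
    "UPDATE branch SET branch_name = 'Brighton-East' WHERE branch_name = 'Brighton'"
)

accounts = conn.execute(
    "SELECT account_number, branch_name FROM account"
).fetchall()
```

After both statements only account A-217 remains, and its branch name follows the renamed branch.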



Cascading Actions in SQL (Cont.)
If there is a chain of foreign-key dependencies across multiple
relations, with on delete cascade specified for each dependency,
a deletion or update at one end of the chain can propagate across
the entire chain.
If a cascading update or delete causes a constraint violation that
cannot be handled by a further cascading operation, the system
aborts the transaction.
As a result, all the changes caused by the transaction and its
cascading actions are undone.
Referential integrity is only checked at the end of a transaction
Intermediate steps are allowed to violate referential integrity provided
later steps remove the violation
Otherwise it would be impossible to create some database states, e.g.
insert two tuples whose foreign keys point to each other
E.g. spouse attribute of relation
marriedperson(name, address, spouse)



Referential Integrity in SQL (Cont.)
Alternative to cascading:
on delete set null
on delete set default
Null values in foreign key attributes complicate SQL referential
integrity semantics, and are best prevented using not null
if any attribute of a foreign key is null, the tuple is defined to satisfy
the foreign key constraint!



Schema Diagram for the Banking Enterprise
(another style to denote foreign keys)

Possible relational database state
corresponding to the COMPANY scheme



SQL -
CREATE TABLE EMPLOYEE
( FNAME VARCHAR(15) NOT NULL,
MINIT CHAR,
LNAME VARCHAR(15) NOT NULL,
SSN CHAR(9) NOT NULL,
BDATE DATE,
ADDRESS VARCHAR(30),
SEX CHAR,
SALARY DECIMAL(10,2),
SUPERSSN CHAR(9),
DNO INT NOT NULL,
PRIMARY KEY (SSN),
FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN),
FOREIGN KEY (DNO) REFERENCES DEPARTMENT (DNUMBER));
CREATE TABLE DEPARTMENT
( DNAME VARCHAR(15) NOT NULL,
DNUMBER INT NOT NULL,
MGRSSN CHAR(9) NOT NULL,
MGRSTARTDATE DATE,
PRIMARY KEY (DNUMBER),
UNIQUE (DNAME),
FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE (SSN));
CREATE TABLE DEPT_LOCATIONS
( DNUMBER INT NOT NULL,
DLOCATION VARCHAR(15) NOT NULL,
PRIMARY KEY (DNUMBER, DLOCATION),
FOREIGN KEY (DNUMBER) REFERENCES DEPARTMENT (DNUMBER));



SQL -
CREATE TABLE PROJECT
( PNAME VARCHAR(15) NOT NULL,
PNUMBER INT NOT NULL,
PLOCATION VARCHAR(15),
DNUM INT NOT NULL,
PRIMARY KEY (PNUMBER),
UNIQUE (PNAME),
FOREIGN KEY (DNUM) REFERENCES DEPARTMENT (DNUMBER) );
CREATE TABLE WORKS_ON
( ESSN CHAR(9) NOT NULL,
PNO INT NOT NULL,
HOURS DECIMAL(3, 1) NOT NULL,
PRIMARY KEY (ESSN, PNO),
FOREIGN KEY (ESSN) REFERENCES EMPLOYEE (SSN),
FOREIGN KEY (PNO) REFERENCES PROJECT (PNUMBER) );
CREATE TABLE DEPENDENT
( ESSN CHAR(9) NOT NULL,
DEPENDENT_NAME VARCHAR(15) NOT NULL,
SEX CHAR,
BDATE DATE,
RELATIONSHIP VARCHAR(8),
PRIMARY KEY (ESSN, DEPENDENT_NAME),
FOREIGN KEY (ESSN) REFERENCES EMPLOYEE (SSN) );



Query Languages
Language in which user requests information from the database.
Categories of languages
procedural
non-procedural
Pure languages:
Relational Algebra
Tuple Relational Calculus
Domain Relational Calculus
Pure languages form underlying basis of query languages that
people use.



Relational Algebra
Procedural language
Six basic operators
select
project
union
set difference
Cartesian product
rename
The operators take one or two relations as inputs and give a
new relation as a result.




$\sigma_B$ select
$\Pi_{A, B, C}$ project
$A \times B$ Cartesian product
$\cup$ union
$\cap$ intersection
$-$ difference
$\bowtie_B$ join
$\div$ division



Select Operation Example

Relation r:
A  B  C   D
α  α  1   7
α  β  5   7
β  β  12  3
β  β  23  10

$\sigma_{A = B \wedge D > 5}(r)$:
A  B  C   D
α  α  1   7
β  β  23  10



Select Operation

Notation: $\sigma_p(r)$
p is called the selection predicate
Defined as:
$\sigma_p(r) = \{t \mid t \in r \text{ and } p(t)\}$
Where p is a formula in propositional calculus consisting
of terms connected by: $\wedge$ (and), $\vee$ (or), $\neg$ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: $=, \neq, >, \geq, <, \leq$
Example of selection:
$\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{account})$
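The definition reads almost directly as a filter. A minimal sketch in Python, assuming a relation is a list of dictionaries keyed by attribute name; the function name `select` and the string values standing in for the A and B columns are illustrative assumptions.

```python
def select(r, p):
    """Return the tuples t of r for which the predicate p(t) holds."""
    return [t for t in r if p(t)]

# A four-tuple relation over (A, B, C, D).
r = [
    {"A": "alpha", "B": "alpha", "C": 1, "D": 7},
    {"A": "alpha", "B": "beta", "C": 5, "D": 7},
    {"A": "beta", "B": "beta", "C": 12, "D": 3},
    {"A": "beta", "B": "beta", "C": 23, "D": 10},
]

# Selection with predicate A = B and D > 5: two terms connected by "and".
result = select(r, lambda t: t["A"] == t["B"] and t["D"] > 5)
```

Only the tuples with C = 1 and C = 23 satisfy both terms; the tuple with C = 12 has A = B but fails D > 5.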



Project Operation Example

Relation r:
A  B   C
α  10  1
α  20  1
β  30  1
β  40  2

$\Pi_{A, C}(r)$, before and after duplicate elimination:
A  C        A  C
α  1        α  1
α  1   =    β  1
β  1        β  2
β  2



Project Operation
Notation:
$\Pi_{A_1, A_2, \dots, A_k}(r)$
where $A_1, A_2, \dots, A_k$ are attribute names and r is a relation name.
The result is defined as the relation of k columns obtained by
erasing the columns that are not listed
Duplicate rows removed from result, since relations are sets
E.g. To eliminate the branch-name attribute of account
$\Pi_{\text{account-number}, \text{balance}}(\text{account})$
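Projection with duplicate elimination can be sketched in Python as below. The function name `project` is hypothetical, relations are assumed to be lists of dictionaries, and the rows are those of the account relation used in the aggregate example of this chapter.

```python
def project(r, attrs):
    """Keep only the listed attributes, then drop duplicate rows."""
    seen = set()
    result = []
    for t in r:
        row = tuple(t[a] for a in attrs)
        if row not in seen:  # relations are sets, so a repeated row is dropped
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

account = [
    {"branch-name": "Perryridge", "account-number": "A-102", "balance": 400},
    {"branch-name": "Perryridge", "account-number": "A-201", "balance": 900},
    {"branch-name": "Brighton", "account-number": "A-217", "balance": 750},
    {"branch-name": "Brighton", "account-number": "A-215", "balance": 750},
    {"branch-name": "Redwood", "account-number": "A-222", "balance": 700},
]

# Eliminating branch-name keeps all five rows, since account-number is unique.
without_branch = project(account, ("account-number", "balance"))

# Projecting on branch-name alone collapses the five rows to three.
branches = project(account, ("branch-name",))
```

The second call shows why the result can be smaller than the input: erasing columns may make previously distinct tuples identical.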



Union Operation Example

Relations r, s:
r:          s:
A  B        A  B
α  1        α  2
α  2        β  3
β  1

$r \cup s$:
A  B
α  1
α  2
β  1
β  3



Union Operation
Notation: $r \cup s$
Defined as:
$r \cup s = \{t \mid t \in r \text{ or } t \in s\}$

For $r \cup s$ to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (e.g., 2nd column
of r deals with the same type of values as does the 2nd
column of s)
E.g. to find all customers with either an account or a loan
$\Pi_{\text{customer-name}}(\text{depositor}) \cup \Pi_{\text{customer-name}}(\text{borrower})$



Set Difference Operation Example

Relations r, s:
r:          s:
A  B        A  B
α  1        α  2
α  2        β  3
β  1

$r - s$:
A  B
α  1
β  1



Set Difference Operation
Notation $r - s$
Defined as:
$r - s = \{t \mid t \in r \text{ and } t \notin s\}$
Set differences must be taken between compatible relations.
r and s must have the same arity
attribute domains of r and s must be compatible



Set-Intersection Operation
Notation: $r \cap s$
Defined as:
$r \cap s = \{t \mid t \in r \text{ and } t \in s\}$
Assume:
r, s have the same arity
attributes of r and s are compatible
Note: $r \cap s = r - (r - s)$
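Because relations are sets of tuples, Python's built-in set type models union, difference and intersection directly, and the identity in the note can be checked on a small instance. The two relations below are an assumed example over attributes (A, B); the variable names are illustrative.

```python
# Two compatible relations: same arity, same attribute domains.
r = {("alpha", 1), ("alpha", 2), ("beta", 1)}
s = {("alpha", 2), ("beta", 3)}

union = r | s          # tuples in r or in s
difference = r - s     # tuples in r but not in s
intersection = r & s   # tuples in both r and s

# Intersection is not a basic operator: it can be rewritten with difference only.
identity_holds = intersection == r - (r - s)
```

This is why intersection does not appear among the six basic operators: it adds convenience but no expressive power.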



Set-Intersection Operation - Example
Relations r, s:
r:          s:
A  B        A  B
α  1        α  2
α  2        β  3
β  1

$r \cap s$:
A  B
α  2



Possible relational database state
corresponding to the COMPANY scheme



Cartesian-Product Operation-Example

Relations r, s:
r:          s:
A  B        C  D   E
α  1        α  10  a
β  2        β  10  a
            β  20  b
            γ  10  b

$r \times s$:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b



Cartesian-Product Operation
Notation $r \times s$
Defined as:
$r \times s = \{t\,q \mid t \in r \text{ and } q \in s\}$
Assume that attributes of r(R) and s(S) are disjoint. (That is, $R \cap S = \emptyset$.)
If attributes of r(R) and s(S) are not disjoint, then renaming must be
used.



Rename Operation
Allows us to name, and therefore to refer to, the results of
relational-algebra expressions.
Allows us to refer to a relation by more than one name.
Example:
$\rho_X(E)$
returns the expression E under the name X
If a relational-algebra expression E has arity n, then
$\rho_{X(A_1, A_2, \dots, A_n)}(E)$
returns the result of expression E under the name X, and with the
attributes renamed to $A_1, A_2, \dots, A_n$.



Composition of Operations
Can build expressions using multiple operations
Example: $\sigma_{A = C}(r \times s)$
$r \times s$:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b
$\sigma_{A = C}(r \times s)$:
A  B  C  D   E
α  1  α  10  a
β  2  β  10  a
β  2  β  20  b
Join or Theta Join
Selection over a cartesian product

$R \bowtie_B S = \sigma_B(R \times S)$

Meaning:
For every row r of R
output all rows s of S
which satisfy condition B.



Natural-Join Operation
Notation: $r \bowtie s$
Let r and s be relations on schemas R and S respectively.
Then, $r \bowtie s$ is a relation on schema $R \cup S$ obtained as follows:
Consider each pair of tuples $t_r$ from r and $t_s$ from s.
If $t_r$ and $t_s$ have the same value on each of the attributes in $R \cap S$, add
a tuple t to the result, where
t has the same value as $t_r$ on r
t has the same value as $t_s$ on s
Example:
R = (A, B, C, D)
S = (E, B, D)
Result schema = (A, B, C, D, E)
$r \bowtie s$ is defined as:
$\Pi_{r.A, r.B, r.C, r.D, s.E}(\sigma_{r.B = s.B \wedge r.D = s.D}(r \times s))$
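The pairwise procedure above can be sketched in Python. Relations are assumed to be lists of dictionaries keyed by attribute name, the function name `natural_join` is hypothetical, and the attribute values are made up for the schemas R = (A, B, C, D) and S = (E, B, D).

```python
def natural_join(r, s):
    """Pair tuples that agree on all shared attributes; keep each attribute once."""
    result = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()       # the attributes shared by R and S
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})      # merged tuple over the union schema
    return result

r = [
    {"A": "a1", "B": 1, "C": "c1", "D": "a"},
    {"A": "a2", "B": 2, "C": "c2", "D": "a"},
    {"A": "a3", "B": 2, "C": "c3", "D": "b"},
]
s = [
    {"B": 1, "D": "a", "E": "e1"},
    {"B": 1, "D": "a", "E": "e2"},
    {"B": 2, "D": "b", "E": "e3"},
    {"B": 3, "D": "b", "E": "e4"},
]

joined = natural_join(r, s)
```

The tuple with A = a2 finds no partner, because no tuple of s has both B = 2 and D = a. If the two schemas share no attribute, the condition is vacuously true and the function degenerates to the Cartesian product.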



Natural Join Operation Example

Relations r, s:
r:                s:
A  B  C  D        B  D  E
α  1  α  a        1  a  α
β  2  γ  a        3  a  β
γ  4  β  b        1  a  γ
α  1  γ  a        2  b  δ
δ  2  β  b        3  b  ε

$r \bowtie s$:
A  B  C  D  E
α  1  α  a  α
α  1  α  a  γ
α  1  γ  a  α
α  1  γ  a  γ
δ  2  β  b  δ



Natural-Join Operation my definition
Notation: $r \bowtie s$, also $r * s$
Let r and s be relations on schemas R and S respectively.
Then, $r \bowtie s$ is a relation on schema $R \cup S$ obtained as follows:
r and s are joined by some Equi-join
The redundant (duplicate) attributes are removed
Example:
R = (A, B, C, D)
S = (E, B, D)
The equi-join may be on B only
Examples Dept natural join Emp on Emp-id
Dept natural join Emp on Mgr-id
Importance: natural joins along foreign key express Relationship!

To avoid confusion: write the predicate B explicitly!



Possible relational database state
corresponding to the COMPANY scheme

Illustrating the join operation

Division Operation

$r \div s$
Suited to queries that include the phrase "for all".
Let r and s be relations on schemas R and S respectively
where
$R = (A_1, \dots, A_m, B_1, \dots, B_n)$
$S = (B_1, \dots, B_n)$
The result of $r \div s$ is a relation on schema
$R - S = (A_1, \dots, A_m)$

$r \div s = \{t \mid t \in \Pi_{R-S}(r) \wedge \forall u \in s\,(tu \in r)\}$



Division Operation Example

Relations r, s:
r:          s:
A  B        B
α  1        1
α  2        2
α  3
β  1
γ  1
δ  1
δ  3
δ  4
ε  6
ε  1
β  2

$r \div s$:
A
α
β



Another Division Example

Relations r, s:
r:                  s:
A  B  C  D  E       D  E
α  a  α  a  1       a  1
α  a  γ  a  1       b  1
α  a  γ  b  1
β  a  γ  a  1
β  a  γ  b  3
γ  a  γ  a  1
γ  a  γ  b  1
γ  a  β  b  1

$r \div s$:
A  B  C
α  a  γ
γ  a  γ



Division Operation (Cont.)

Property
Let $q = r \div s$
Then q is the largest relation satisfying $q \times s \subseteq r$
Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let $S \subseteq R$

$r \div s = \Pi_{R-S}(r) - \Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S,S}(r))$

To see why
$\Pi_{R-S,S}(r)$ simply reorders attributes of r
$T = \Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S,S}(r))$ gives those tuples t in
$\Pi_{R-S}(r)$ such that for some tuple $u \in s$, $tu \notin r$.
Therefore $\Pi_{R-S}(r) - T$ is what we need!
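The derivation can be followed step by step in Python. This sketch assumes the simplest case, where r is a set of (a, b) pairs and s is a set of b values, so that R - S is the single attribute a; the function name `divide` and the sample data are illustrative.

```python
from itertools import product

def divide(r, s):
    """Division built from projection, Cartesian product and difference only."""
    candidates = {a for (a, _) in r}           # projection of r on R - S
    required = set(product(candidates, s))     # every pair that would have to be in r
    missing = {a for (a, _) in required - r}   # T: candidates lacking some pair
    return candidates - missing

r = {
    ("alpha", 1), ("alpha", 2), ("alpha", 3),
    ("beta", 1), ("beta", 2),
    ("gamma", 1),
    ("delta", 1), ("delta", 3), ("delta", 4),
    ("epsilon", 1), ("epsilon", 6),
}
s = {1, 2}

quotient = divide(r, s)

# Cross-check against the "for all" definition of division.
by_definition = {a for (a, _) in r if all((a, b) in r for b in s)}
```

Only alpha and beta are paired with every value in s, so both computations yield the same two-element result.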



Illustrating the division operation
(a) Dividing SSN_PNOS by SMITH_PNOS.
(b) $T \leftarrow R \div S$



Banking Example

branch (branch-name, branch-city, assets)

customer (customer-name, customer-street, customer-city)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

borrower (customer-name, loan-number)



Example Queries

Find all loans of over $1200
$\sigma_{\text{amount} > 1200}(\text{loan})$
Find the loan number for each loan of an amount greater than
$1200
$\Pi_{\text{loan-number}}(\sigma_{\text{amount} > 1200}(\text{loan}))$



Example Queries

Find the names of all customers who have a loan, an account, or both, from
the bank

$\Pi_{\text{customer-name}}(\text{borrower}) \cup \Pi_{\text{customer-name}}(\text{depositor})$

Find the names of all customers who have a loan and an account at
the bank.

$\Pi_{\text{customer-name}}(\text{borrower}) \cap \Pi_{\text{customer-name}}(\text{depositor})$



Example Queries
Find the names of all customers who have a loan at the Perryridge
branch.

$\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan})))$

Find the names of all customers who have a loan at the Perryridge
branch but do not have an account at any branch of the bank.

$\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan}))) - \Pi_{\text{customer-name}}(\text{depositor})$



Example Queries
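Find the names of all customers who have a loan at the Perryridge
branch.
Query 1
$\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan})))$

Query 2
$\Pi_{\text{customer-name}}(\sigma_{\text{loan.loan-number} = \text{borrower.loan-number}}((\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{loan})) \times \text{borrower}))$

Which one is more efficient?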



Example Queries
Find the largest account balance
Rename account relation as d
The query is:

$\Pi_{\text{balance}}(\text{account}) - \Pi_{\text{account.balance}}(\sigma_{\text{account.balance} < d.\text{balance}}(\text{account} \times \rho_d(\text{account})))$

The second term is the set of all balances that are smaller than the balance
of some other account
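The same idea can be traced with Python sets. This is a sketch that assumes only the balance column matters, so the account relation is reduced to its set of balances (the values are those of the account relation used in the aggregate example); the variable names are illustrative.

```python
# Projection of account on balance; duplicate balances collapse in a set.
balances = {400, 900, 750, 700}

# Pair every balance with every balance (account x d) and keep those that
# are smaller than some other balance.
not_largest = {b for b in balances for d in balances if b < d}

# Subtracting them leaves only the balance that nothing exceeds.
largest = balances - not_largest
```

The query needs no aggregate function: the maximum is whatever remains after removing every value that loses at least one comparison.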



Assignment Operation
The assignment operation ($\leftarrow$) provides a convenient way to express
complex queries.
Write query as a sequential program consisting of
a series of assignments
followed by an expression whose value is displayed as a result of the
query.
Assignment must always be made to a temporary relation variable.
Example: Write $r \div s$ as
$temp1 \leftarrow \Pi_{R-S}(r)$
$temp2 \leftarrow \Pi_{R-S}((temp1 \times s) - \Pi_{R-S,S}(r))$
$result = temp1 - temp2$
The result to the right of the $\leftarrow$ is assigned to the relation variable on the left of
the $\leftarrow$.
May use variable in subsequent expressions.



Example Queries
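Find all customers who have an account from at least the
"Downtown" and the "Uptown" branches.
Query 1

$\Pi_{CN}(\sigma_{BN = \text{"Downtown"}}(\text{depositor} \bowtie \text{account})) \cap \Pi_{CN}(\sigma_{BN = \text{"Uptown"}}(\text{depositor} \bowtie \text{account}))$

where CN denotes customer-name and BN denotes branch-name.

Query 2 using division

$\Pi_{\text{customer-name}, \text{branch-name}}(\text{depositor} \bowtie \text{account}) \div \rho_{temp(\text{branch-name})}(\{(\text{"Downtown"}), (\text{"Uptown"})\})$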



Example Queries
Find all customers who have an account at all branches located
in Brooklyn city.

$\Pi_{\text{customer-name}, \text{branch-name}}(\text{depositor} \bowtie \text{account}) \div \Pi_{\text{branch-name}}(\sigma_{\text{branch-city} = \text{"Brooklyn"}}(\text{branch}))$

Note the right project before the division!



Extended Relational-Algebra-Operations

Generalized Projection
Outer Join
Aggregate Functions



Generalized Projection
Extends the projection operation by allowing arithmetic functions
to be used in the projection list.

$\Pi_{F_1, F_2, \dots, F_n}(E)$

E is any relational-algebra expression
Each of $F_1, F_2, \dots, F_n$ is an arithmetic expression involving
constants and attributes in the schema of E.
Given relation credit-info(customer-name, limit, credit-balance),
find how much more each person can spend:

$\Pi_{\text{customer-name}, \text{limit} - \text{credit-balance}}(\text{credit-info})$



Aggregate Functions and Operations
Aggregation function takes a collection of values and returns a
single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate operation in relational algebra

${}_{G_1, G_2, \dots, G_n}\,\mathcal{G}_{F_1(A_1), F_2(A_2), \dots, F_n(A_n)}(E)$

E is any relational-algebra expression
$G_1, G_2, \dots, G_n$ is a list of attributes on which to group (can be empty)
Each Fi is an aggregate function
Each Ai is an attribute name



Aggregate Operation Example
Relation r:
A  B  C
α  α  7
α  β  7
β  β  3
β  β  10

$\mathcal{G}_{sum(C)}(r)$:
sum-C
27



Aggregate Operation Example

Relation account grouped by branch-name:

branch-name account-number balance


Perryridge A-102 400
Perryridge A-201 900
Brighton A-217 750
Brighton A-215 750
Redwood A-222 700

${}_{\text{branch-name}}\,\mathcal{G}_{sum(\text{balance})}(\text{account})$


branch-name balance
Perryridge 1300
Brighton 1500
Redwood 700



Aggregate Functions (Cont.)
Result of aggregation does not have a name
Can use rename operation to give it a name
For convenience, we permit renaming as part of aggregate
operation

${}_{\text{branch-name}}\,\mathcal{G}_{sum(\text{balance}) \text{ as sum-balance}}(\text{account})$

Note: branch-name is the Group-by attribute


sum is the function
balance is the attribute on which the
function operates
account is the relation expression
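The grouped sum can be sketched in a few lines of Python. Relations are assumed to be lists of dictionaries, the function name `group_sum` is hypothetical, and the rows are the account tuples from the aggregate example.

```python
from collections import defaultdict

def group_sum(rows, group_attr, value_attr):
    """Group rows on group_attr and sum value_attr within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_attr]] += row[value_attr]
    return dict(totals)

account = [
    {"branch-name": "Perryridge", "account-number": "A-102", "balance": 400},
    {"branch-name": "Perryridge", "account-number": "A-201", "balance": 900},
    {"branch-name": "Brighton", "account-number": "A-217", "balance": 750},
    {"branch-name": "Brighton", "account-number": "A-215", "balance": 750},
    {"branch-name": "Redwood", "account-number": "A-222", "balance": 700},
]

# branch-name is the group-by attribute, sum the function, balance its argument.
sum_balance = group_sum(account, "branch-name", "balance")
```

Note that the two Brighton balances of 750 are both counted: aggregation works on the multiset of values within a group, not on the set.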

Outer Join
An extension of the join operation that avoids loss of information.
Computes the join and then adds tuples from one relation that do
not match tuples in the other relation to the result of the join.
Uses null values:
null signifies that the value is unknown or does not exist
All comparisons involving null are (roughly speaking) false by
definition.
Will study precise meaning of comparisons with nulls later



Outer Join Example

Relation loan

loan-number branch-name amount


L-170 Downtown 3000
L-230 Redwood 4000
L-260 Perryridge 1700

Relation borrower
customer-name loan-number
Jones L-170
Smith L-230
Hayes L-155



Outer Join Example

Inner Join

loan $\bowtie$ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith

Left Outer Join


loan ⟕ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null



Outer Join Example
Right Outer Join
loan ⟖ borrower

loan-number branch-name amount customer-name


L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-155 null null Hayes

Full Outer Join


loan ⟗ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
L-155 null null Hayes
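The four joins of this example can be reproduced in Python by computing the inner join first and then padding the unmatched tuples. In this sketch tuples are plain Python tuples, `None` stands in for null, and the variable names are illustrative.

```python
loan = [
    ("L-170", "Downtown", 3000),
    ("L-230", "Redwood", 4000),
    ("L-260", "Perryridge", 1700),
]
borrower = [("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")]

# Inner join on loan-number: (loan-number, branch-name, amount, customer-name).
inner = [
    (number, branch, amount, customer)
    for (number, branch, amount) in loan
    for (customer, borrowed) in borrower
    if number == borrowed
]
matched = {row[0] for row in inner}

# Tuples without a partner are padded with nulls.
left_only = [(n, b, a, None) for (n, b, a) in loan if n not in matched]
right_only = [(n, None, None, c) for (c, n) in borrower if n not in matched]

left_outer = inner + left_only
right_outer = inner + right_only
full_outer = inner + left_only + right_only
```

Loan L-260 survives only in the left and full outer joins, and borrower Hayes only in the right and full outer joins.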



The left outer join operation



A two level recursive query



Null Values
It is possible for tuples to have a null value, denoted by null, for
some of their attributes
null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null.
Aggregate functions simply ignore null values
This is an arbitrary decision; they could have returned null as the result instead.
We follow the semantics of SQL in its handling of null values
For duplicate elimination and grouping, null is treated like any
other value, and two nulls are assumed to be the same
Alternative: assume each null is different from every other null
Both are arbitrary decisions, so we simply follow SQL
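
The three rules on this slide can be illustrated with a short sketch, assuming Python's None plays the role of null. The function names are hypothetical, and returning None for an aggregate over no non-null values is an assumption of this sketch, since the slide does not cover that case.

```python
def add(a, b):
    """Arithmetic with null: the result is null if either operand is null."""
    return None if a is None or b is None else a + b

def sum_ignoring_nulls(values):
    """Aggregate that skips nulls.

    Assumption: with no non-null input the result is None.
    """
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None

def distinct(values):
    """Duplicate elimination that treats two nulls as the same value."""
    seen, out = set(), []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out
```

For example, add(3, None) yields None, sum_ignoring_nulls([100, None, 250]) yields 350, and distinct([1, None, 1, None]) keeps a single null.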



Null Values
Comparisons with null values return the special truth value unknown
If false were used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
(for a null A, not (A < 5) would be true while A >= 5 would be false)
Three-valued logic using the truth value unknown:
OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
NOT: (not unknown) = unknown
In SQL, "P is unknown" evaluates to true if predicate P evaluates to
unknown
The result of a select predicate is treated as false if it evaluates to
unknown
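
The truth tables above can be checked with a minimal sketch of three-valued logic. It assumes None represents both null (for values) and unknown (for truth values); the names lt, ge, not3, and3, or3, and select are illustrative only.

```python
def lt(a, b):
    """a < b: unknown (None) if either side is null."""
    return None if a is None or b is None else a < b

def ge(a, b):
    """a >= b: unknown (None) if either side is null."""
    return None if a is None or b is None else a >= b

def not3(p):
    return None if p is None else not p

def and3(p, q):
    if p is False or q is False:
        return False          # false dominates and
    if p is None or q is None:
        return None
    return True

def or3(p, q):
    if p is True or q is True:
        return True           # true dominates or
    if p is None or q is None:
        return None
    return False

def select(rel, pred):
    """Selection keeps a tuple only when the predicate is definitely true."""
    return [t for t in rel if pred(t) is True]
```

With these definitions, not3(lt(None, 5)) and ge(None, 5) are both unknown, so the equivalence between not (A < 5) and A >= 5 is preserved, and select drops the tuples whose predicate is unknown.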



Modification of the Database
The content of the database may be modified using the following
operations:
Deletion
Insertion
Updating
All these operations are expressed using the assignment
operator.



Deletion
A delete request is expressed similarly to a query, except instead
of displaying tuples to the user, the selected tuples are removed
from the database.
Can delete only whole tuples; cannot delete values on only
particular attributes
A deletion is expressed in relational algebra by:
$r \leftarrow r - E$
where r is a relation and E is a relational algebra query.



Deletion Examples

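Delete all account records in the Perryridge branch.

$\text{account} \leftarrow \text{account} - \sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{account})$

Delete all loan records with amount in the range of 0 to 50

$\text{loan} \leftarrow \text{loan} - \sigma_{\text{amount} \geq 0 \text{ and } \text{amount} \leq 50}(\text{loan})$

Delete all accounts at branches located in Needham.

$r_1 \leftarrow \sigma_{\text{branch-city} = \text{"Needham"}}(\text{account} \bowtie \text{branch})$
$r_2 \leftarrow \Pi_{\text{branch-name}, \text{account-number}, \text{balance}}(r_1)$
$r_3 \leftarrow \Pi_{\text{customer-name}, \text{account-number}}(r_2 \bowtie \text{depositor})$
$\text{account} \leftarrow \text{account} - r_2$
$\text{depositor} \leftarrow \text{depositor} - r_3$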
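
As a rough illustration of deletion as assignment of a set difference, the first example can be sketched with Python sets. The account tuples below are hypothetical sample data in the order (branch-name, account-number, balance), and select is an illustrative helper.

```python
# Hypothetical sample data: (branch-name, account-number, balance)
account = {
    ("Perryridge", "A-102", 400),
    ("Downtown", "A-101", 500),
    ("Perryridge", "A-201", 900),
}

def select(rel, pred):
    """Selection: the subset of tuples satisfying pred."""
    return {t for t in rel if pred(t)}

# account <- account - sigma_{branch-name = "Perryridge"}(account)
account = account - select(account, lambda t: t[0] == "Perryridge")
```

After the assignment only the Downtown tuple remains; whole tuples are removed, never individual attribute values.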



Insertion
To insert data into a relation, we either:
specify a tuple to be inserted
write a query whose result is a set of tuples to be inserted
In relational algebra, an insertion is expressed by:
$r \leftarrow r \cup E$
where r is a relation and E is a relational algebra expression.
The insertion of a single tuple is expressed by letting E be a
constant relation containing one tuple.



Insertion Examples
Insert information in the database specifying that Smith has
$1200 in account A-973 at the Perryridge branch.

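$\text{account} \leftarrow \text{account} \cup \{(\text{"Perryridge"}, \text{A-973}, 1200)\}$
$\text{depositor} \leftarrow \text{depositor} \cup \{(\text{"Smith"}, \text{A-973})\}$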

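Provide as a gift for all loan customers in the Perryridge
branch, a $200 savings account. Let the loan number serve
as the account number for the new savings account.

$r_1 \leftarrow (\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{borrower} \bowtie \text{loan}))$
$\text{account} \leftarrow \text{account} \cup \Pi_{\text{branch-name}, \text{loan-number}, 200}(r_1)$
$\text{depositor} \leftarrow \text{depositor} \cup \Pi_{\text{customer-name}, \text{loan-number}}(r_1)$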
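
Both insertion examples can be sketched as set unions. The starting contents of account, depositor, loan, and borrower below are hypothetical sample data; only the inserted tuples (Perryridge, A-973, 1200) and (Smith, A-973) come from the slide.

```python
# Hypothetical starting data
account = {("Downtown", "A-101", 500)}            # (branch-name, account-number, balance)
depositor = {("Johnson", "A-101")}                # (customer-name, account-number)
loan = {("L-15", "Perryridge", 1500),             # (loan-number, branch-name, amount)
        ("L-17", "Downtown", 1000)}
borrower = {("Hayes", "L-15"), ("Jones", "L-17")}  # (customer-name, loan-number)

# Single-tuple insertion: E is a constant relation containing one tuple.
account = account | {("Perryridge", "A-973", 1200)}
depositor = depositor | {("Smith", "A-973")}

# r1 <- sigma_{branch-name = "Perryridge"}(borrower ⋈ loan)
r1 = {
    (customer, loan_no, branch, amount)
    for (customer, loan_no) in borrower
    for (loan_no2, branch, amount) in loan
    if loan_no == loan_no2 and branch == "Perryridge"
}

# The loan number serves as the account number of the new $200 account.
account = account | {(branch, loan_no, 200) for (_, loan_no, branch, _) in r1}
depositor = depositor | {(customer, loan_no) for (customer, loan_no, _, _) in r1}
```

Because relations are sets, inserting a tuple that is already present leaves the relation unchanged.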



Updating
A mechanism to change a value in a tuple without changing all
values in the tuple
Use the generalized projection operator to do this task
$r \leftarrow \Pi_{F_1, F_2, \ldots, F_l}(r)$
Each $F_i$ is either
the $i$th attribute of r, if the $i$th attribute is not updated, or,
if the attribute is to be updated, an expression involving only
constants and the attributes of r, which gives the new value for the
attribute



Update Examples
Make interest payments by increasing all balances by 5 percent.

$\text{account} \leftarrow \Pi_{AN, BN, BAL * 1.05}(\text{account})$

where AN, BN and BAL stand for account-number, branch-name
and balance, respectively.

Pay all accounts with balances over $10,000 6 percent interest
and pay all others 5 percent

$\text{account} \leftarrow \Pi_{AN, BN, BAL * 1.06}(\sigma_{BAL > 10000}(\text{account})) \cup \Pi_{AN, BN, BAL * 1.05}(\sigma_{BAL \leq 10000}(\text{account}))$
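
The second update can be sketched as a union of two generalized projections over disjoint selections. The account tuples are hypothetical sample data in the order (AN, BN, BAL), and Decimal is used here only to keep the arithmetic exact.

```python
from decimal import Decimal

# Hypothetical sample data: (account-number, branch-name, balance)
account = {
    ("A-101", "Downtown", Decimal("500")),
    ("A-215", "Mianus", Decimal("12000")),
    ("A-305", "Round Hill", Decimal("10000")),
}

# Both selections read the ORIGINAL relation before anything is reassigned.
high = {t for t in account if t[2] > 10000}     # sigma_{BAL > 10000}(account)
low = {t for t in account if t[2] <= 10000}     # sigma_{BAL <= 10000}(account)

# account <- Pi_{AN, BN, BAL*1.06}(high) ∪ Pi_{AN, BN, BAL*1.05}(low)
account = (
    {(an, bn, bal * Decimal("1.06")) for (an, bn, bal) in high}
    | {(an, bn, bal * Decimal("1.05")) for (an, bn, bal) in low}
)
```

Evaluating both selections against the original relation matters: applying the 5 percent update first and then selecting on the new balances could push an account over 10000 and give it interest twice.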



Summary operations of the relational algebra

Operation      Purpose                                          Notation

SELECT         Selects all tuples that satisfy the selection    $\sigma_{\langle\text{selection condition}\rangle}(R)$
               condition from a relation R.

PROJECT        Produces a new relation with only some           $\Pi_{\langle\text{attribute list}\rangle}(R)$
               of the attributes of R, and removes
               duplicate tuples.

THETA JOIN     Produces all combinations of tuples from         $R_1 \bowtie_{\langle\text{join condition}\rangle} R_2$
               R1 and R2 that satisfy the join condition.

EQUIJOIN       Produces all the combinations of tuples          $R_1 \bowtie_{\langle\text{join condition}\rangle} R_2$, or
               from R1 and R2 that satisfy a join               $R_1 \bowtie_{(\langle\text{join attributes 1}\rangle),(\langle\text{join attributes 2}\rangle)} R_2$
               condition with only equality
               comparisons.

NATURAL JOIN   Same as equijoin except that the join            $R_1 *_{\langle\text{join condition}\rangle} R_2$, or
               attributes of R2 are not included in the         $R_1 *_{(\langle\text{join attributes 1}\rangle),(\langle\text{join attributes 2}\rangle)} R_2$,
               resulting relation.                              or $R_1 * R_2$



Summary operations of the relational algebra cont.

Operation          Purpose                                           Notation

UNION              Produces a relation that includes all the         $R_1 \cup R_2$
                   tuples in R1 or R2 or both R1 and R2;
                   R1 and R2 must be union compatible.

INTERSECTION       Produces a relation that includes all the         $R_1 \cap R_2$
                   tuples in both R1 and R2;
                   R1 and R2 must be union compatible.

DIFFERENCE         Produces a relation that includes all the         $R_1 - R_2$
                   tuples in R1 that are not in R2.

CARTESIAN PRODUCT  Produces a relation that has the attributes of    $R_1 \times R_2$
                   R1 and R2 and includes as tuples all possible
                   combinations of tuples from R1 and R2.

DIVISION           Produces a relation R(X) that includes all        $R_1(Z) \div R_2(Y)$
                   tuples t[X] in R1(Z) that appear in R1 in
                   combination with every tuple from R2(Y),
                   where Z = X ∪ Y.
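
Division is the least intuitive operation in this summary, so a small sketch may help. It assumes the simplest case where R1 has schema (X, Y) with single attributes, stored as a set of pairs, and R2 is a set of Y values; the relation names takes and required are hypothetical.

```python
def divide(r1, r2):
    """R1(X, Y) ÷ R2(Y): the x values paired in r1 with every y in r2."""
    xs = {x for (x, _) in r1}
    return {x for x in xs if all((x, y) in r1 for y in r2)}

# Hypothetical data: which student takes which course, and the required courses.
takes = {("Ann", "DB"), ("Ann", "OS"), ("Bob", "DB")}
required = {"DB", "OS"}

students_with_all_required = divide(takes, required)
```

Here only Ann appears in combination with every tuple of required, so she is the only member of the result.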
Tuple Relational Calculus
A nonprocedural query language, where each query is of the form
$\{t \mid P(t)\}$
It is the set of all tuples t such that predicate P is true for t
t is a tuple variable, t[A] denotes the value of tuple t on attribute A
$t \in r$ denotes that tuple t is in relation r
P is a formula similar to that of the predicate calculus



Predicate Calculus Formula
1. Set of attributes and constants
2. Set of comparison operators (e.g., $<$, $\leq$, $=$, $\neq$, $>$, $\geq$)
3. Set of connectives: and ($\wedge$), or ($\vee$), not ($\neg$)
4. Implication ($\Rightarrow$): $x \Rightarrow y$ means that if x is true, then y is true
$x \Rightarrow y \equiv \neg x \vee y$
5. Set of quantifiers:
$\exists\, t \in r\ (Q(t))$: there exists a tuple t in relation r
such that predicate Q(t) is true
$\forall\, t \in r\ (Q(t))$: Q is true for all tuples t in relation r
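
The implication rule and the two quantifiers map directly onto Boolean operations over a finite relation. The sketch below assumes a relation is any finite Python iterable of tuples; the names implies, exists, and for_all are illustrative.

```python
from itertools import product

def implies(x, y):
    """x => y, defined as (not x) or y."""
    return (not x) or y

# Full truth table of the implication.
truth_table = {(x, y): implies(x, y) for x, y in product([False, True], repeat=2)}

def exists(r, q):
    """Existential quantifier: some tuple t in r satisfies q."""
    return any(q(t) for t in r)

def for_all(r, q):
    """Universal quantifier: every tuple t in r satisfies q."""
    return all(q(t) for t in r)
```

Note that for_all over an empty relation is true, which is the usual convention for universal quantification and is also what all() returns in Python.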



A Valid TRC Query (my definition)
$\{t_1.A, t_2.B, \ldots, t_n.Z \mid P(t_1, t_2, \ldots, t_n, t_{n+1}, \ldots, t_m)\}$

$t_1, t_2, \ldots, t_n$ are the tuple variables whose attribute values
$t_1.A, t_2.B, \ldots, t_n.Z$ define the output.
Each must be defined over a single relation and must remain free in P, i.e.,
not bound by a quantifier.
$t_{n+1}, \ldots, t_m$ are tuple variables that must be defined over relations and must
be bound by a quantifier.
Semantics: run the free variables over all tuples of their corresponding relations
and, for each combination, check whether P is true; if it is, output the defined
output values.
A variable is defined over a relation either as $t \in R$ or as R(t); both syntaxes are
acceptable and will be used.
The value of a variable on an attribute may be written as t.A or t[A]; both syntaxes are acceptable.
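
The semantics described above (iterate the free variables, test P, emit the output values) can be sketched as a comprehension. The relations are the loan and borrower examples from the outer-join slides; the query itself, including the 3500 threshold, is a hypothetical illustration.

```python
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]

# { t.customer-name | t ∈ borrower ∧
#     ∃ s ∈ loan (s.loan-number = t.loan-number ∧ s.amount > 3500) }
result = {
    t["customer-name"]                 # output defined by the free variable t
    for t in borrower                  # t runs over its relation
    if any(                            # s is bound by the existential quantifier
        s["loan-number"] == t["loan-number"] and s["amount"] > 3500
        for s in loan
    )
}
```

The free variable t becomes the outer loop and the bound variable s becomes an any() test, so the result contains only Smith, whose loan L-230 has amount 4000.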
