Relational Model

Overview
• Introduced by Dr. E. F. Codd in 1970
• Model based on mathematical foundations
• Earlier models were intrinsically tied to internal representations
• The developer had to be aware of navigational principles
• Need for a model divorced from physical organization
• Requires independence between the physical and logical models

The Model
• In October 1985 Dr. E. F. Codd published a two-part paper
• It introduced rules for the relational model
• The rules determine whether a product is fully relational or not

Two papers
• "Is Your DBMS Really Relational?" - Oct 14, 1985
• "Does Your DBMS Run by the Rules?" - Oct 21, 1985

The implications were
• satisfying the rules was a technically feasible proposition
• there are practical benefits if a system does satisfy the rules

A DBMS is said to be fully relational if it supports Codd's 12 rules and the 9 structural, 3 integrity and 18 manipulative features.

Basic Concepts
• A relation corresponds to a table and resembles a file
• Rows of a relation are called tuples and resemble records
• Columns of a relation are called attributes and resemble fields
• Entities and relationships are both represented as relations
• A relation consists of tuples of the same kind
• The structure of a relation is defined by its scheme (definition)
• An instance of a scheme is called a relation instance

Properties of a Relation
• A relation is a set of tuples
• No two tuples in a relation are identical
• Tuples in a relation have no order among themselves
• Attribute values are atomic
• Attribute values map onto a domain

Notion of Keys
Superkey       - unique identifier for a tuple
Candidate key  - minimal superkey
Primary key    - designated candidate key

A Relational model consists of
Structural part    - relations, domains, etc.
Integrity part     - entity, referential, domain/user defined
Manipulative part  - operators & extensions

Structural Features
Relations: A mathematical entity that holds data in the relational model. A set of n-tuples, perceived as a two-dimensional table in which the intersection of a row and a column is an atomic value.
Base tables: A named and autonomous relation, one that actually holds data.
Query tables: Relations that result from the execution of queries; they are not named and do not have a persistent existence.
View tables: A named, virtual and derived relation that is defined in terms of other named relations.
Snapshot tables: A named, derived and real relation, represented in terms of other relations and also by its own materialized data.
Attributes: Correspond to columns of the table; all values of an attribute are of the same type and atomic in nature.
Domains: All possible values from which attribute values are chosen.
Primary key: A set of attributes whose value is a minimal unique identifier for a row of the table. A designated candidate key.
Foreign keys: A set of attributes whose value is the primary key value of another table.

Integrity Features
Entity Integrity: No component of the primary key of a base relation is allowed to be NULL.
Referential Integrity: The database must not contain any unmatched foreign key values.
Domain defined Integrity: Attribute values should be within the domain that the attribute is mapped onto.

Manipulative Features
• Restrict
• Project
• Cross product
• Union, Intersection, Difference
• Join
• Divide
• Extensions

Twelve Rules
Information Rule: All information is represented in one and only one way, i.e. as values in column positions within rows of tables.
Guaranteed Access Rule: Every individual scalar value in the database must be logically addressable by specifying table name, column name and primary key value.
Systematic treatment of NULL: Support for a representation of missing and inapplicable information that is systematic and distinct from all regular values.
Active Online Catalog: Support for an online, relational catalog accessible to authorised users by means of the regular query language.
Comprehensive data sublanguage Rule: Supports one relational language that has linear syntax, that can be used interactively and within application programs, and that supports data definition, manipulation, security, integrity and transaction management operations.
View updating Rule: All views that are theoretically updatable must be updatable by the system.

Twelve Rules (continued)
High level Insert, Update, Delete: Support for set-at-a-time Insert, Update and Delete operations.
Physical Data Independence: Changes at the Internal level do not affect the Conceptual level.
Logical Data Independence: Changes at the Conceptual level do not affect the External level.
Integrity Independence: Integrity constraints are specified separately from application programs and stored in the catalog; it is possible to change integrity constraints without affecting existing code.
Distribution Independence: Existing applications should operate successfully when a distributed version of the DBMS is first introduced and when existing data is redistributed around the system.
Non subversion Rule: If the system provides a low-level, record-at-a-time interface, that interface cannot be used to subvert the system by bypassing relational security or integrity constraints.

SQL
• special purpose language for accessing & manipulating data
• different from application programming languages like C, Cobol
• uses a combination of relational algebra & calculus constructs

History
1970    - Dr. Codd proposed the Relational model
1971-79 - SEQUEL implemented in System R
1980    - SEQUEL became SQL
1986    - SQL-86 (ANSI standard)
1989    - Follow on to SQL-86 - SQL-89
1992    - SQL-2
1995    - SQL-3

SQL components
Data Definition Language (DDL)
• specifies the database schema
• creates, modifies, deletes database objects (tables, views, indexes)
Data Manipulation Language (DML)
• manipulates data using Insert, Modify, Delete operations
• accesses data from the database for queries
Data Control Language (DCL)
• grants and revokes authorisation for database access
• audits database use
• provides transaction management

Data Definition
Base tables
• consist of a row of column headings
• and zero or more rows of data values
• each data row contains one scalar value for each column
• all values in a column are of the same data type
• row ordering is irrelevant
• order is imposed when rows are retrieved
• columns are considered to be ordered from left to right
• column ordering has a significance
• rows and columns do have a physical ordering in the stored version
• physical row and column ordering is transparent to the user
• a base table is autonomous, it exists in its own right

Table Creation
    Create Table S (
        S#      Char(5)   Not Null,
        SNAME   Char(20)  Not Null With Default,
        STATUS  Smallint  Not Null With Default,
        CITY    Char(15)  Not Null With Default,
        Primary Key (S#) );

    Create Table SCOPY Like S ;

Table Modification
    Alter Table S Add DISCOUNT SmallInt ;

Table Deletion
    Drop Table S ;

Index
• indexes are created and dropped using SQL
• data manipulation statements do not refer to indexes at all
• the decision to use or not to use an index is made by DB2

Index Creation
    Create [Unique] Index X on T (P, Q Desc, R) ;
Index Deletion
    Drop Index X ;

Notes on Data definition
• data definition statements can be executed at any time
• possible to create a few tables and start using them
• new columns can be added subsequently
• possible to experiment with the effects of indexes
• permits one not to get everything right the first time

Data Manipulation
Select Statement reference
    Select [All/Distinct] <scalar-expr>
    From <table-names>
    Where <condition-expr>
    Group By <columns>
    Having <condition-expr>
    Order By <columns>

Sample table definitions
1. Supplier   S  (S#, SNAME, CITY, STATUS)
2. Part       P  (P#, PNAME, COLOR, WEIGHT)
3. Supp-Part  SP (S#, P#)

Simple retrieval
Get part names of all parts
    Select PNAME From P
Retrieval with duplicate elimination
Get part numbers for all parts supplied
    Select Distinct P# From SP
Retrieval of computed values
    Select P#, 'Height', HEIGHT*250 From P
Retrieval of full details
    Select * From S

Qualified Retrieval
Get supplier numbers for suppliers in Bombay with status above 20
    Select S# From S Where CITY = 'Bombay' and STATUS > 20
Retrieval using ordering
Get supplier numbers and status for suppliers in Bombay in descending order of status
    Select S#, STATUS From S Where CITY = 'Bombay' Order By STATUS desc
Order by 3rd column
    Select P#, 'Height', HEIGHT*250 From P Order By 3, P#

Retrieval using Range of values
Get parts whose weight is in the range 16..19, both limit values inclusive
Normal way
    Select P#, PNAME From P Where WEIGHT >= 16 and WEIGHT <= 19
Using BETWEEN
    Select P#, PNAME From P Where WEIGHT Between 16 and 19
Similarly
    Where WEIGHT Not Between 16 and 19

Retrieval using IN
Get parts whose weight is any one of the following values: 12, 16, 17
Normal way
    Select P#, PNAME From P Where WEIGHT = 12 or WEIGHT = 16 or WEIGHT = 17
Using IN
    Select P#, PNAME From P Where WEIGHT IN (12,16,17)
Similarly
    Where WEIGHT NOT IN (12,16,17)

Retrieval using NULL
Since NULL stands for missing or inapplicable information, normal comparisons (=, <>, <, >) do not work on it; the IS NULL predicate is used instead.
Get supplier numbers for those suppliers for whom STATUS is inapplicable
    Select S# From S Where STATUS Is Null
Similarly
    Where STATUS Is Not Null

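The difference between an ordinary comparison and the IS NULL predicate can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than DB2, the column SNO stands in for S# (an assumption, since '#' is not valid in an unquoted SQLite identifier), and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical supplier table; SNO stands in for the S# column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNO TEXT PRIMARY KEY, STATUS INTEGER)")
conn.executemany("INSERT INTO S VALUES (?, ?)",
                 [("S1", 20), ("S2", None), ("S3", 30)])

# An ordinary comparison with NULL evaluates to unknown, so no row qualifies.
with_equals = conn.execute("SELECT SNO FROM S WHERE STATUS = NULL").fetchall()

# IS NULL is the predicate that actually tests for a missing value.
with_is_null = conn.execute("SELECT SNO FROM S WHERE STATUS IS NULL").fetchall()

print(with_equals)
print(with_is_null)
```

The first query comes back empty even though one supplier has no status; only the second query finds that supplier.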
Cartesian product
Each row of one table is joined with every row of the other table
    Select S.*, P.* From S, P
Equi Join
If the join condition consists of an equality operator, the join is known as an Equi Join
    Select S.*, P.* From S, P Where S.CITY = P.CITY
Theta Join
If rows from a cartesian product are eliminated by a restrict operation on the basis of any condition, the join is known as a Theta Join
    Select P.*, SP.* From P, SP Where P.P# = SP.P# And P.RATE < SP.RATE

Natural Join
From a cartesian product, choose the common fields and compare their values for equality; finally one of the common fields is eliminated from the projection
    Select S.*, SP.* From S, SP Where S.S# = SP.S#
Natural join on 3 tables
    Select S.*, SP.*, P.* From S, SP, P Where S.S# = SP.S# And P.P# = SP.P#
Join table with Self
Get employees with their manager names
    Select First.E#, First.ENAME, First.M#, Second.ENAME
    From E First, E Second
    Where First.M# = Second.E#

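A minimal sketch of how a join condition cuts down a cartesian product, again assuming Python's sqlite3 module, the stand-in column names SNO and PNO for S# and P#, and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO TEXT PRIMARY KEY, SNAME TEXT);
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO S  VALUES ('S1', 'Smith'), ('S2', 'Jones');
    INSERT INTO SP VALUES ('S1', 'P1'), ('S1', 'P2'), ('S2', 'P2');
""")

# Cartesian product: every row of S paired with every row of SP (2 x 3 rows).
product = conn.execute("SELECT S.*, SP.* FROM S, SP").fetchall()

# Equi join: keep only the pairs whose supplier numbers match.
equi = conn.execute(
    "SELECT S.SNO, S.SNAME, SP.PNO FROM S, SP "
    "WHERE S.SNO = SP.SNO ORDER BY S.SNO, SP.PNO"
).fetchall()

print(len(product))
print(equi)
```

Listing S.SNO once in the select list, instead of S.* and SP.*, is what turns the equi join into the natural join form described above.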
Simple Subquery
Suppose the suppliers supplying part P2 are S1, S2, S3, S4; then the query to get the names of suppliers supplying part P2 could be as follows.
    Select SNAME From S Where S# In ('S1','S2','S3','S4')
However, we can get the S# of suppliers supplying part P2 from the database by the following query
    Select S# From SP Where P# = 'P2'
The IN clause of SQL requires a list of values, and a query with one attribute is also a list of values, hence it can be substituted in the IN clause
    Select SNAME From S Where S# In (
        Select S# From SP Where P# = 'P2')
This is known as a subquery or nested query

Query with multiple nesting
Get supplier names for suppliers supplying at least one red part
    Select SNAME From S Where S# In (List of suppliers supplying red parts)

    Select SNAME From S Where S# In (
        Select S# From SP Where P# In (List of red parts))

    Select SNAME From S Where S# In (
        Select S# From SP Where P# In (
            Select P# From P Where COLOR = 'RED'))

Subquery & Outer query referring to same table
Get supplier numbers for suppliers who supply at least one part supplied by supplier S2
    Select Distinct S# From SP Where P# In (List of parts supplied by supplier S2)

    Select Distinct S# From SP Where P# In (
        Select P# From SP Where S# = 'S2')

Subquery with scalar comparisons
Get supplier numbers for suppliers located in the same city as supplier S1
    Select S# From S Where CITY = (City of supplier S1)

    Select S# From S Where CITY = (
        Select CITY From S Where S# = 'S1')

Query using EXISTS
EXISTS is always associated with a subquery. EXISTS tests whether the subquery returns zero rows or a non-zero number of rows. EXISTS with a subquery is used as a condition in the Where clause of the outer query. If the subquery returns one or more rows the EXISTS condition is TRUE, otherwise it is FALSE. Similarly, NOT EXISTS is the negation of EXISTS.

Get supplier names for suppliers who supply part P2
    Select SNAME From S Where Exists (List of shipments where S# is the same as that of the outer query and P# is 'P2')

    Select SNAME From S Where Exists (
        Select * From SP Where S# = S.S# And P# = 'P2')

Example with NOT EXISTS
Get supplier names for suppliers who do not supply part P2
    Select SNAME From S Where Not Exists (
        Select * From SP Where S# = S.S# And P# = 'P2')

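The two EXISTS queries above can be run side by side in a small sketch. It assumes Python's sqlite3 module, uses SNO and PNO as stand-ins for S# and P#, and works on invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO TEXT PRIMARY KEY, SNAME TEXT);
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO S  VALUES ('S1', 'Smith'), ('S2', 'Jones'), ('S3', 'Blake');
    INSERT INTO SP VALUES ('S1', 'P2'), ('S2', 'P1'), ('S2', 'P2');
""")

# The subquery is evaluated per outer row; S.SNO refers to the outer query.
supplies_p2 = conn.execute(
    "SELECT SNAME FROM S WHERE EXISTS "
    "(SELECT * FROM SP WHERE SP.SNO = S.SNO AND SP.PNO = 'P2') "
    "ORDER BY SNAME"
).fetchall()

# NOT EXISTS keeps exactly the outer rows for which the subquery is empty.
does_not_supply_p2 = conn.execute(
    "SELECT SNAME FROM S WHERE NOT EXISTS "
    "(SELECT * FROM SP WHERE SP.SNO = S.SNO AND SP.PNO = 'P2') "
    "ORDER BY SNAME"
).fetchall()

print(supplies_p2)
print(does_not_supply_p2)
```

Every supplier lands in exactly one of the two results, which is what "NOT EXISTS is the negation of EXISTS" means in practice.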
Quantified comparisons
Get part names for parts whose height is greater than that of every blue part
List of heights of blue parts
    Select HEIGHT From P Where COLOR = 'Blue'
Part names for required parts
    Select PNAME From P Where HEIGHT > ALL (List of heights of blue parts)

    Select PNAME From P Where HEIGHT > ALL (
        Select HEIGHT From P Where COLOR = 'Blue')
Similarly
    Where HEIGHT > ANY ( Subquery )

Aggregate Functions
Aggregate functions operate on a collection of scalar values of one column of a table to produce a single scalar value as their result.
Some aggregate functions are as follows:
COUNT - number of rows
SUM   - sum of values
AVG   - average of values
MAX   - largest/maximum value
MIN   - smallest/minimum value

Examples
Get total number of suppliers
    Select Count(*) From S
Get total number of suppliers supplying parts
    Select Count(Distinct S#) From SP
Get number of shipments for part P2
    Select Count(*) From SP Where P# = 'P2'
Get total quantity of part P2 supplied
    Select Sum(QTY) From SP Where P# = 'P2'
Get average quantity of part P3 supplied
    Select Sum(QTY)/Count(*) From SP Where P# = 'P3'
Or
    Select Avg(QTY) From SP Where P# = 'P3'

Aggregate functions in subquery
Get supplier numbers of suppliers with status less than the maximum status
    Select S# From S Where STATUS < (Maximum status)

    Select S# From S Where STATUS < (
        Select Max(STATUS) From S)

Get supplier number, status and city for all suppliers whose status is greater than or equal to the average status of their city
    Select S#, STATUS, CITY From S Where STATUS >= (Average of their city)

    Select S#, STATUS, CITY From S SX Where STATUS >= (
        Select Avg(STATUS) From S SY Where SY.CITY = SX.CITY )

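The correlated query above recomputes the average once per outer row, using that row's city. A minimal sketch, assuming Python's sqlite3 module, the stand-in column SNO for S#, and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (SNO TEXT PRIMARY KEY, STATUS INTEGER, CITY TEXT);
    INSERT INTO S VALUES
        ('S1', 20, 'London'), ('S2', 10, 'Paris'), ('S3', 30, 'Paris'),
        ('S4', 20, 'London'), ('S5', 30, 'Athens');
""")

# SX is the outer row; SY ranges over all suppliers in the same city as SX.
at_or_above_city_avg = conn.execute("""
    SELECT SX.SNO, SX.STATUS, SX.CITY
    FROM S SX
    WHERE SX.STATUS >= (SELECT AVG(SY.STATUS) FROM S SY
                        WHERE SY.CITY = SX.CITY)
    ORDER BY SX.SNO
""").fetchall()

print(at_or_above_city_avg)
```

With this data the Paris average is 20, so S2 (status 10) is the only supplier left out.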
Group By, Having
Group By statement
• rearranges rows into groups
• on the basis of the Group By attributes
• such that each group has the same value for the Group By attributes
• expressions in Select should be single valued for the group
• such as aggregate functions or Group By attributes

Example
A part is supplied by more than one supplier at different rates. This information is maintained in the SP table; each row contains the part supplied, the supplier and the specific rate.
Get the average rate for each part supplied.
Note: For each part there would be a group of rows; the average rate for that part is the average of all values of the RATE attribute in that group.
    Select P#, Avg(RATE) From SP Group By P#

Use of Where and Group By
The Where clause restricts some rows from participating in the groups specified by the Group By clause.
Get part number, total and maximum quantity for parts, excluding those supplied by supplier S1
    Select P#, Sum(QTY), Max(QTY) From SP Where S# <> 'S1' Group By P#
Get part number and average rate supplied by suppliers, excluding those of type B
    Select P#, Avg(RATE) From SP Where TYPE <> 'B' Group By P#

Having
The Having clause restricts the groups that result from the Group By clause; the restriction is based on a condition that usually includes an aggregate function.
Get part numbers for all parts that are supplied by more than one supplier
    Select P# From SP Group By P#
The result of this query is all parts that are supplied. We need to choose from this list only those that are supplied by more than one supplier; this information is available as an aggregate function, i.e. Count(*)
    Select P# From SP Group By P# Having Count(*) > 1

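The two steps described above (group first, then filter the groups) can be sketched as follows, assuming Python's sqlite3 module, SNO and PNO as stand-ins for S# and P#, and invented shipment rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO SP VALUES
        ('S1', 'P1'), ('S1', 'P2'), ('S2', 'P2'), ('S2', 'P3'), ('S3', 'P3');
""")

# Step 1: one group per part, with the number of rows in each group.
suppliers_per_part = conn.execute(
    "SELECT PNO, COUNT(*) FROM SP GROUP BY PNO ORDER BY PNO"
).fetchall()

# Step 2: HAVING discards whole groups, here those with a single supplier.
multi_supplier_parts = conn.execute(
    "SELECT PNO FROM SP GROUP BY PNO HAVING COUNT(*) > 1 ORDER BY PNO"
).fetchall()

print(suppliers_per_part)
print(multi_supplier_parts)
```

Counting rows equals counting suppliers here only because (SNO, PNO) is the primary key of SP, so a supplier cannot appear twice for the same part.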
Union
The Union operator performs a union of tuples from two tables. The two tables are said to be union-compatible if the number of columns is the same and the column types are compatible.
Get part numbers of parts that either weigh more than 16 pounds or are supplied by supplier S2, or both.
    Parts that weigh more than 16 pounds UNION Parts supplied by supplier S2

    Select P# From P Where WEIGHT > 16
    Union
    Select P# From SP Where S# = 'S2'
Note: Union eliminates redundant duplicates; Union All retains duplicates.

The output of the union can be ordered by the ORDER BY clause. The ORDER BY clause cannot appear in the individual SQL statements that participate in the Union.
    Select P#, 'Weight' From P Where WEIGHT > 16
    Union All
    Select P#, 'Supplied' From SP Where S# = 'S2'
    Order By 2, 1

Intersection
The Intersection operator performs an intersection of tuples from two tables; the output is the tuples common to both tables.
Difference
The Difference operator performs a difference of tuples from two tables; the output is the tuples from one table that are not in the other.

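The three set operators can be compared on the same pair of queries. This is a sketch under assumptions: it uses Python's sqlite3 module, the stand-in column names SNO and PNO, invented sample rows, and the standard SQL keywords INTERSECT and EXCEPT, which the text above does not name explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P  (PNO TEXT PRIMARY KEY, WEIGHT INTEGER);
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO P  VALUES ('P1', 12), ('P2', 17), ('P3', 19);
    INSERT INTO SP VALUES ('S2', 'P1'), ('S2', 'P2'), ('S1', 'P3');
""")

heavy = "SELECT PNO FROM P WHERE WEIGHT > 16"    # P2, P3
by_s2 = "SELECT PNO FROM SP WHERE SNO = 'S2'"    # P1, P2

def run(operator):
    # ORDER BY applies to the combined result, not to either operand.
    sql = f"{heavy} {operator} {by_s2} ORDER BY 1"
    return [row[0] for row in conn.execute(sql)]

union = run("UNION")          # duplicates removed
union_all = run("UNION ALL")  # duplicates kept
intersect = run("INTERSECT")  # parts in both results
difference = run("EXCEPT")    # heavy parts not supplied by S2

print(union, union_all, intersect, difference)
```

P2 satisfies both conditions, so it appears once under UNION, twice under UNION ALL, alone under INTERSECT, and not at all under EXCEPT.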
SQL Exercises
Schema
SAILORS (sid, sname, rating)
RES     (sid, bid, date)
BOATS   (bid, bname, color)

1. Find names of sailors who have reserved boat #2
2. Find names of sailors who have reserved a red boat
3. Find names of sailors with rating > 5
4. Find names of sailors who have reserved boats on 1/1/95
5. Find color of boats reserved by Ravi
6. Find names of sailors who have reserved at least one boat
7. Find names of sailors who have reserved all boats
8. Find names of sailors who have reserved red or green boats
9. Find names of sailors who have reserved red and green boats
10. Find names of sailors who have reserved boats reserved by Ravi

SQL Exercises
Schema
CUSTOMER (cno, cname, city, status, category)
PRODUCT  (pno, pname, type)
SALES    (cno, pno, date, qty, rate)

1. A sales report giving sales quantity for products of type 'P01'; the report should have customer and product names and should be sorted on product type, customer name
2. A sales report giving total sales quantity for each product; the report should have the product name
3. A sales report giving total sales value for each customer; the report should have the customer name
4. A sales report giving citywise total sales value; the report should contain only type 'P04' products and should consider only customers whose status is 'SMP' or 'STP'
5. A sales report giving maximum & minimum sales quantity for each product; the report should contain the product name

Embedded SQL
Introduction
Any SQL statement that is used as an online query can also be used in an application program as an embedded statement. This is known as the dual-mode principle.
Embedded SQL statements are prefixed with EXEC SQL for distinction.
Executable SQL statements appear wherever an executable host language statement can appear, e.g. the Procedure division in a Cobol program.
SQL statements include references to host variables; such references are prefixed with a colon ':' to distinguish them from column names.
Host variables must be declared in a 'DECLARE' section. The declaration of a host variable must physically precede its use in an SQL statement.
e.g.
    BEGIN DECLARE SECTION
    ...<host variable declaration>
    END DECLARE SECTION

Introduction (continued)
In SQL statements, host variables can appear wherever a literal is permitted. Host variables can also be used to receive output from an SQL statement; they can thus be used as both source and target for data in SQL statements.
Any table used in the program can optionally be declared by means of an EXEC SQL DECLARE TABLE statement; this makes the program self-documenting.
After any SQL statement is executed, feedback information reflecting the outcome of the execution is returned to the program in two special host variables. They are
• SQLCODE - a 31-bit signed integer
• SQLSTATE - a character string of length 5
In principle, every SQL statement should be followed by a test on either SQLCODE or SQLSTATE.
A zero value in SQLCODE indicates successful completion, a positive value means the statement executed but some exceptional condition occurred, and a negative value means the statement did not execute successfully.
SQLSTATE is subdivided into a two-character class code and a three-character subclass code.

Introduction (continued)
A program can contain
    EXEC SQL INCLUDE SQLCA
This causes the precompiler to insert the declaration of the SQL communication area, which contains the declarations of SQLCODE and SQLSTATE along with other feedback variables.
Host variables must be compatible in data type with the SQL data type of the columns that they are compared with, assigned to or assigned from.
SELECT statements usually retrieve multiple rows, and host languages are not equipped to handle more than one row at a time. It is therefore necessary to provide some kind of bridge between the set-at-a-time operations of SQL statements and the row-at-a-time operations that the host language can handle. Cursors provide such a bridge.
A cursor is a new kind of object relevant to embedded SQL; interactive SQL has no need for it. It consists of a kind of pointer that is used to run through a set of rows, thus providing addressability to the rows retrieved by an SQL statement, one row at a time.

SQL examples - Not involving Cursors
Singleton Select
    EXEC SQL Select STATUS, CITY
             Into :RANK, :CITY
             From S Where S# = :GIVENS# ;

    EXEC SQL Select STATUS, CITY
             Into :RANK Indicator :RANKIND, :CITY
             From S Where S# = :GIVENS# ;
    If RANKIND = -1 Then ...<RANK has NULL value> End-If

INSERT
    EXEC SQL Insert Into P (P#, PNAME, WEIGHT)
             Values (:PNO, :PNAME, :PWT) ;

    COLORIND = -1
    CITYIND = -1
    EXEC SQL Insert Into P (P#, PNAME, COLOR, CITY)
             Values (:PNO, :PNAME, :PCOLOR Indicator :COLORIND, :PCITY Indicator :CITYIND) ;

UPDATE
    EXEC SQL Update S Set STATUS = STATUS + :RAISE Where CITY = 'LONDON' ;

    RANKIND = -1
    EXEC SQL Update S Set STATUS = :RANK Indicator :RANKIND Where CITY = 'LONDON' ;

    EXEC SQL Update S Set STATUS = NULL Where CITY = 'LONDON' ;

DELETE
    EXEC SQL Delete From SP
             Where :CITY = ( Select CITY From S Where S.S# = SP.S#) ;

SQL examples - Involving Cursors
The DECLARE X CURSOR ... statement defines a cursor called X and associates it with a query specified by a SELECT statement, which is part of the cursor declaration. The query is not executed at this point.
The SELECT statement is effectively executed when the cursor is opened in the procedural part of the program, using the current values of the host variables.
The FETCH ... INTO statement is used to retrieve rows from the result table, one row at a time. The INTO clause specifies a list of host variables that match the SELECT clause in the declaration of the cursor.
    EXEC SQL Declare X Cursor For
             Select S#, SNAME From S Where CITY = :Y ;
    EXEC SQL Open X ;
    For all rows accessible via X
        EXEC SQL Fetch X Into :S#, :SNAME ;
    EXEC SQL Close X ;
Since there are multiple rows in the result table, the Fetch is placed in a loop. A Fetch results in an SQLCODE of +100 if no more rows exist in the result table; this condition is used to terminate the loop. The cursor is finally closed by the CLOSE statement.

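The open / fetch-in-a-loop / close pattern can be imitated outside embedded SQL. The sketch below is only an analogy, not embedded SQL: it assumes Python's sqlite3 module, where a query parameter plays the role of the host variable :Y and a fetch that returns None plays the role of SQLCODE +100. The column SNO stands in for S#, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (SNO TEXT PRIMARY KEY, SNAME TEXT, CITY TEXT);
    INSERT INTO S VALUES
        ('S1', 'Smith', 'London'), ('S2', 'Jones', 'Paris'),
        ('S4', 'Clark', 'London');
""")

city = "London"                 # plays the role of host variable :Y
cur = conn.cursor()

# Comparable to OPEN X: the query runs with the current value of city.
cur.execute("SELECT SNO, SNAME FROM S WHERE CITY = ? ORDER BY SNO", (city,))

rows_seen = []
while True:
    row = cur.fetchone()        # comparable to FETCH X INTO :S#, :SNAME
    if row is None:             # comparable to SQLCODE = +100 (no more rows)
        break
    rows_seen.append(row)

cur.close()                     # comparable to CLOSE X
print(rows_seen)
```

As with the embedded cursor, changing the value of city after the execute call does not change the rows being fetched; the query would have to be executed again.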
Cursor Declaration
    EXEC SQL Declare <cursor name> Cursor For <union expression>
             Order By <columns>
             For [Fetch Only / Update Of Columns]
             Optimize For <n> Rows
Notes:
1. A union expression is a SELECT expression or a union of SELECT expressions
2. DECLARE cursor is a declarative and not an executable statement
3. ORDER BY cannot be specified if UPDATE or DELETE CURRENT needs to be invoked
4. ORDER BY orders the rows to be retrieved by FETCH statements
5. Specifying FOR FETCH ONLY performs better, and is the default
6. OPTIMIZE may be specified purely for performance reasons; it causes the optimizer to choose a more efficient access path.

Executable Statements
EXEC SQL OPEN <cursor name>
• opens or activates the specified cursor
• a set of rows is identified and becomes active for the cursor
• the cursor also identifies a position within that set of rows
• the active set of rows is considered to have an order
EXEC SQL FETCH <cursor name> INTO :<host-var> ...
• the identified cursor must be open
• advances the cursor to the next position
• assigns values from that row to the host variables
• if there is no row then SQLCODE is +100
• fetch next is the only cursor movement operation
EXEC SQL CLOSE <cursor name>
• deactivates the specified cursor, which is currently open
• a closed cursor can be opened again
• when opened again, the active set of rows may be different
• values in host variables can be different
• changes to host variables while the cursor is open have no effect on the active set

Executable Statements
Update & Delete
    EXEC SQL Update <table name>
             Set <column name> = <expr>, <column name> = <expr> ...
             Where Current Of <cursor name>
    EXEC SQL Delete From <table name>
             Where Current Of <cursor name>
Update and Delete using a cursor are not permissible if the cursor declaration involves a Union or an ORDER BY clause, or if the union expression involves a non-updatable view.
For Update, the cursor declaration should have a FOR UPDATE clause identifying the columns that appear as targets of the SET clause.

Relational Integrity
Relational Integrity Rules
Need for Integrity rules
• Any database consists of some configuration of data values
• That configuration is supposed to reflect a real world situation
• Some configurations of values do not make sense
• These do not represent any possible state of the real world
Rules as Database definition
• The database definition needs to be extended to include some rules
• Rules inform the DBMS of certain constraints in the real world
• Rules can also prevent such impossible configurations of values
• Such rules are known as Integrity rules
Since base tables are supposed to reflect reality, all Integrity rules apply to base tables.
Integrity rules are either general or specific. General Integrity rules are those that concern primary keys and foreign keys. Specific Integrity rules are those concerning domain & user defined integrity.

Notion of Primary key
Primary key
• is a unique identifier for the tuples of a relation
• can be composite
• is a designated candidate key
• no component can be eliminated without destroying uniqueness
Notes
• Every relation has a primary key, base relations in particular
• The reason for choosing a particular primary key is outside the scope of the model
• It is important for the primary key to be really significant
• There need not be an index on the primary key
• A primary key is a pre-requisite to foreign key support
• Provides the tuple level addressing mechanism in a relational system

Entity Integrity
Entity Integrity Rule
No component of the primary key of a base relation is allowed to be NULL.
Justification
Base relations correspond to the real world, and entities in the real world must be distinguishable. Hence their representatives in the database must also be distinguishable.
A Null value for a primary key implies that the entity in the database has no full or partial identity. The primary key is supposed to perform a unique identification function.
A Null value implies 'value is unknown'. A primary key with Null implies that a tuple represents an entity whose identity we do not know. This means that we do not know how many entities exist in the database or the real world.
An entity without identity does not exist.
Rule
In a relational model, we never record information about something we cannot identify.

Foreign keys
In a table, a given value of a foreign key attribute should be permitted to appear in the database only if the same value also appears as a primary key value of some other table in the database.
A foreign key value represents a reference to the tuple containing the matching primary key value. The relation containing the foreign key is called the referencing relation. The relation containing the matching primary key is called the referenced or target relation.
The problem of ensuring that the database does not include any invalid foreign key values is known as the Referential Integrity problem.
A foreign key value is either wholly NULL or wholly NON NULL. There should exist a base relation with a primary key such that each NON NULL foreign key value has a corresponding primary key value.

Notes
1. A foreign key and the matching primary key should be defined on the same underlying domain.
2. Foreign keys need not be components of primary keys. Any attribute can be a foreign key.
3. A given relation can be a referencing as well as a referenced relation.
4. A relation might include a foreign key whose values are required to match the primary key values of the same relation. Such relations are known as self-referencing relations.
5. Foreign keys, unlike primary keys, can have NULLs.
6. The foreign key to primary key relationship is said to be the glue that holds the database together.

Referential Integrity Rule
The database must not contain any unmatched foreign key values.
An unmatched foreign key value is a NON NULL foreign key value for which there does not exist a matching primary key value in the relevant target relation.
Note
1. Referential integrity requires foreign keys to match primary keys
2. Foreign key and referential integrity are defined in terms of each other
3. Support for referential integrity and support for foreign keys mean the same.

Foreign key rules
The referential integrity rule is framed purely in terms of database states. Any state of the database that does not satisfy the rule is by definition incorrect. These incorrect states can be avoided as follows
- the system could reject any operation that results in an illegal state
- the system accepts the operation and performs an additional compensating operation to guarantee that the overall result is a legal state

Some key issues
Can a foreign key accept NULLs?
The answer to this question depends not on the database designer but on the policies that are in effect in the real world.
What happens on an attempt to delete the target record of a foreign key reference?
Restricted - the target record is not deleted if there are any referencing records
Cascades   - all referencing records are deleted
Nullifies  - foreign keys of all referencing records are set to NULL
What happens on an attempt to update the primary key of the target of a foreign key reference?
Same as with Delete: Restricted, Cascades, Nullifies

Primary key definition in DB2
    Create Table SP (
        S#   Char(5)  Not Null,
        P#   Char(6)  Not Null,
        QTY  Integer,
        Primary Key (S#, P#) );

Foreign key definition in DB2
    Create Table SP (
        S#   Char(5)  Not Null,
        P#   Char(6)  Not Null,
        QTY  Integer,
        Primary Key (S#, P#),
        Foreign Key SKF (S#) References S On Delete Cascade,
        Foreign Key PKF (P#) References P On Delete Restrict );

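The effect of the two delete rules, and the rejection of an unmatched foreign key, can be observed in a small sketch. It assumes SQLite through Python's sqlite3 module rather than DB2, so the syntax differs slightly from the definitions above: foreign key enforcement has to be switched on with a PRAGMA, SNO and PNO stand in for S# and P#, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE S (SNO TEXT PRIMARY KEY);
    CREATE TABLE P (PNO TEXT PRIMARY KEY);
    CREATE TABLE SP (
        SNO TEXT NOT NULL REFERENCES S(SNO) ON DELETE CASCADE,
        PNO TEXT NOT NULL REFERENCES P(PNO) ON DELETE RESTRICT,
        QTY INTEGER,
        PRIMARY KEY (SNO, PNO)
    );
    INSERT INTO S VALUES ('S1'), ('S2');
    INSERT INTO P VALUES ('P1');
    INSERT INTO SP VALUES ('S1', 'P1', 300), ('S2', 'P1', 200);
""")

# An unmatched foreign key value (supplier S9 does not exist) is rejected.
try:
    conn.execute("INSERT INTO SP VALUES ('S9', 'P1', 100)")
    unmatched_rejected = False
except sqlite3.IntegrityError:
    unmatched_rejected = True

# Restrict: P1 is still referenced from SP, so the delete is refused.
try:
    conn.execute("DELETE FROM P WHERE PNO = 'P1'")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

# Cascade: deleting supplier S1 also deletes its referencing rows in SP.
conn.execute("DELETE FROM S WHERE SNO = 'S1'")
remaining = conn.execute("SELECT SNO, PNO FROM SP").fetchall()

print(unmatched_rejected, restrict_blocked, remaining)
```

The first two cases show the system rejecting an operation that would lead to an illegal state; the third shows it accepting the operation and performing a compensating one.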
Views
Introduction
A view is a named virtual table that is derived from a base table. It does not exist in its own right, but appears to the user as if it did. Views do not have their own physically separate, distinguishable stored data. Instead, their definition in terms of other tables is stored in the catalog.
An example:
    Create View GOOD_SUPPLIERS As
        Select S#, STATUS, CITY From S Where STATUS > 15 ;
GOOD_SUPPLIERS is in effect a window into the real table S. Further, this window is dynamic in nature: changes to S are automatically and instantaneously visible through that window. Likewise, changes to GOOD_SUPPLIERS are automatically and instantaneously applied to the real table S.
Users can operate on GOOD_SUPPLIERS as if it were a real table. The system handles the operation by converting it into an equivalent operation on the underlying base table.

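The "dynamic window" behaviour can be seen in a short sketch, assuming Python's sqlite3 module, SNO as a stand-in for S#, and invented rows. Only the base-table-to-view direction is shown, because views in SQLite are read-only, unlike the updatable view described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (SNO TEXT PRIMARY KEY, STATUS INTEGER, CITY TEXT);
    INSERT INTO S VALUES ('S1', 20, 'London'), ('S2', 10, 'Paris');
    CREATE VIEW GOOD_SUPPLIERS AS
        SELECT SNO, STATUS, CITY FROM S WHERE STATUS > 15;
""")

query = "SELECT SNO FROM GOOD_SUPPLIERS ORDER BY SNO"

# Only S1 currently satisfies STATUS > 15.
before = conn.execute(query).fetchall()

# Change the base table; the view stores no data of its own,
# so the next query through the view sees the new state.
conn.execute("UPDATE S SET STATUS = 30 WHERE SNO = 'S2'")
after = conn.execute(query).fetchall()

print(before, after)
```

Nothing was done to the view between the two queries; only its definition is stored, and it is evaluated against S each time.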
View Creation Examples
    Create View REDPARTS (P#, PNAME, WT, CITY) As
        Select P#, PNAME, WEIGHT, CITY From P Where COLOR = 'RED' ;

    Create View PQ (P#, TOTQTY) As
        Select P#, Sum(QTY) From SP Group By P# ;

    Create View SUPPLIER_PARTS (S#, P#, SNAME, PNAME) As
        Select SP.S#, SP.P#, S.SNAME, P.PNAME From S, SP, P
        Where S.S# = SP.S# And P.P# = SP.P# ;

    Create View LONDON_REDPARTS As
        Select P#, WT From REDPARTS Where CITY = 'LONDON' ;

    Create View GOOD_SUPPLIERS As
        Select S#, STATUS, CITY From S Where STATUS > 15
        With Check Option ;

Types of Views
Column subset Views

    Create View SCITY As
      Select S#, CITY From S ;

    Create View STATUS_CITY As
      Select STATUS, CITY From S ;

These views are vertical (column) subsets of a base table, hence they are called column-subset views. View SCITY includes the primary key of the base table, whereas view STATUS_CITY does not. For a given record in the view STATUS_CITY, it is impossible to identify the corresponding record in the base table, because the view does not include the primary key of the base table. Views that include the primary key of their base tables are known as key-preserving views. Column-subset views are theoretically updatable if they preserve the primary key of the base table.
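The difference between a key-preserving view and one that drops the key can be made concrete with a short sketch. It runs on Python's sqlite3 module, not DB2; S_NO is an assumed stand-in for S#, the rows are invented, and base_rows_for is a hypothetical helper written only for this illustration.

```python
import sqlite3

# Assumption: S_NO replaces S#; two suppliers deliberately share STATUS and CITY.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE S (S_NO TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT)"
)
conn.executemany(
    "INSERT INTO S VALUES (?, ?, ?, ?)",
    [("S1", "Smith", 20, "LONDON"),
     ("S4", "Clark", 20, "LONDON"),
     ("S2", "Jones", 10, "PARIS")],
)
conn.execute("CREATE VIEW SCITY AS SELECT S_NO, CITY FROM S")
conn.execute("CREATE VIEW STATUS_CITY AS SELECT STATUS, CITY FROM S")


def base_rows_for(columns, view_row):
    """Count the base-table rows that could have produced one view row."""
    where = " AND ".join(f"{col} = ?" for col in columns)
    return conn.execute(f"SELECT COUNT(*) FROM S WHERE {where}", view_row).fetchone()[0]


scity_rows = conn.execute("SELECT S_NO, CITY FROM SCITY").fetchall()
status_city_rows = conn.execute("SELECT STATUS, CITY FROM STATUS_CITY").fetchall()

# Every SCITY row maps back to exactly one base row, because the key is kept.
scity_matches = [base_rows_for(("S_NO", "CITY"), row) for row in scity_rows]
# Some STATUS_CITY rows map back to several base rows, so they are ambiguous.
status_city_matches = [base_rows_for(("STATUS", "CITY"), row) for row in status_city_rows]
print(scity_matches, status_city_matches)
```

An update through STATUS_CITY to the row (20, LONDON) could not say whether it means S1, S4, or both, which is why the key matters for updatability.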
Types of Views
Row subset Views

    Create View LONDON_SUPPLIERS As
      Select S#, SNAME, STATUS, CITY
      From S
      Where CITY = 'LONDON' ;

This view is a horizontal (row) subset of the base table, hence it is called a row-subset view. The view LONDON_SUPPLIERS includes the primary key of the base table, hence it is a key-preserving view. Row-subset views are theoretically updatable if they preserve the primary key of the base table.
Types of Views
Join Views

    Create View SUPPLIER_PART As
      Select SP.S#, SP.P#, S.SNAME
      From S, SP
      Where S.S# = SP.S# ;

View SUPPLIER_PART is constructed from a join of two tables; such views are known as colocated views. Colocated views suffer from all kinds of problems from the standpoint of updatability.

Statistical Summary

    Create View PQ (P#, TOTQTY) As
      Select P#, Sum(QTY)
      From SP
      Group By P# ;

The view PQ is constructed such that each row of the view is the result of an aggregate function applied to a set of rows in the base table. Such a view cannot be updated, because it would be impossible to know how to distribute an update across the rows of the base table.
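A small sketch shows why a statistical summary view has no obvious update semantics. It uses Python's sqlite3 module as a stand-in for DB2; S_NO and P_NO are assumed replacements for S# and P#, and the shipment rows are invented.

```python
import sqlite3

# Assumption: S_NO / P_NO replace S# / P#; the quantities are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SP (S_NO TEXT, P_NO TEXT, QTY INTEGER)")
conn.executemany(
    "INSERT INTO SP VALUES (?, ?, ?)",
    [("S1", "P1", 300), ("S2", "P1", 300), ("S1", "P2", 200)],
)
conn.execute(
    "CREATE VIEW PQ (P_NO, TOTQTY) AS SELECT P_NO, SUM(QTY) FROM SP GROUP BY P_NO"
)

# Each view row summarises a group of base rows.
pq = conn.execute("SELECT P_NO, TOTQTY FROM PQ ORDER BY P_NO").fetchall()

# The single view row for P1 is derived from this many base rows.
contributing = conn.execute("SELECT COUNT(*) FROM SP WHERE P_NO = 'P1'").fetchone()[0]
print(pq, contributing)
```

Changing TOTQTY for P1 from 600 to 700 could be satisfied by adding 100 to either shipment, 50 to each, or any other split, so the system has no single correct translation to the base table.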
View Updatability
Updatable views are those on which Insert, Delete and Update operations can occur. Not all views are updatable. Some views are theoretically updatable but are not updatable in SQL systems. In general, join views cannot be updated; however, there are also views that are not joins and still cannot be updated.

Check Option

    Create View GOOD_SUPPLIERS As
      Select S#, STATUS, CITY
      From S
      Where STATUS > 15 ;

The view is a row-column subset, key preserving and updatable. Could a supplier with STATUS = 10 be inserted through this view? The CHECK option is designed to deal with such situations. When the view is defined With Check Option, Insert and Update operations are checked to ensure that every inserted or updated row satisfies the view-defining condition. If the Check option is not specified, all data is accepted, but some newly inserted or updated rows may immediately disappear from the view.
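The effect of the Check option can be approximated in a runnable sketch. This is an emulation under stated assumptions, not the DB2 mechanism: SQLite (used here through Python's sqlite3 module) has no With Check Option and its views are read-only, so an INSTEAD OF trigger with the hypothetical name GOOD_SUPPLIERS_INSERT plays the role of the check. S_NO again stands in for S#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S (S_NO TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT);
CREATE VIEW GOOD_SUPPLIERS AS SELECT S_NO, STATUS, CITY FROM S WHERE STATUS > 15;

-- Emulation only: reject rows that would not be visible through the view,
-- otherwise pass the insert on to the base table.
CREATE TRIGGER GOOD_SUPPLIERS_INSERT INSTEAD OF INSERT ON GOOD_SUPPLIERS
BEGIN
    SELECT RAISE(ABORT, 'row violates view condition STATUS > 15')
    WHERE NEW.STATUS IS NULL OR NEW.STATUS <= 15;
    INSERT INTO S (S_NO, STATUS, CITY) VALUES (NEW.S_NO, NEW.STATUS, NEW.CITY);
END;
""")

# A row that satisfies the view condition is accepted.
conn.execute("INSERT INTO GOOD_SUPPLIERS (S_NO, STATUS, CITY) VALUES ('S6', 30, 'OSLO')")

# A row with STATUS = 10 is rejected instead of silently vanishing from the view.
try:
    conn.execute("INSERT INTO GOOD_SUPPLIERS (S_NO, STATUS, CITY) VALUES ('S7', 10, 'ROME')")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

stored = [row[0] for row in conn.execute("SELECT S_NO FROM S ORDER BY S_NO")]
print(rejected, stored)
```

Without the check, the second row would be stored in S but would not appear in GOOD_SUPPLIERS, which is the disappearing-row case described above.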
Logical Data Independence
Since application programs are not dependent on the physical structure of the stored database, DB2 provides physical data independence. If application programs are also independent of the logical structure of the database, the system is said to provide logical data independence. The logical structure of a database can change in two ways: growth and restructuring.

Growth
• The database grows to incorporate new kinds of information
• A table is expanded to include new fields
• The database is expanded to include a new table
• Growth does not affect application programs in DB2

Restructuring
• The database is restructured so that the overall information content remains the same
• The placement of information within the database changes
• Restructuring is undesirable, but sometimes unavoidable
Logical Data Independence: An Example
A base table S(S#, SNAME, STATUS, CITY) is split into two tables, SX(S#, SNAME, CITY) and SY(S#, STATUS). Application programs that referred to base table S would need to be changed, since they would now have to refer to SX and SY for any database operation. However, the old table S can be reconstructed as a join of SX and SY, so a view can take the place of the old base table S after it is split. Application programs that referred to base table S now refer to the view S, hence they need not undergo change.

    Create View S (S#, SNAME, STATUS, CITY) As
      Select SX.S#, SX.SNAME, SY.STATUS, SX.CITY
      From SX, SY
      Where SX.S# = SY.S# ;

Having created the view S, Select operations continue to work as before; Update operations, however, do not. Although such a view is theoretically updatable, DB2 does not allow updates on a view that is defined as a join. Thus application programs performing update operations are not immune to this type of change.
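The restructuring example can be replayed in a sketch to confirm that an unchanged query keeps working once the view S replaces the table S. It uses Python's sqlite3 module in place of DB2; S_NO is an assumed stand-in for S#, and the rows and the sample query are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- After restructuring: the old table S no longer exists, only SX and SY.
CREATE TABLE SX (S_NO TEXT PRIMARY KEY, SNAME TEXT, CITY TEXT);
CREATE TABLE SY (S_NO TEXT PRIMARY KEY, STATUS INTEGER);
INSERT INTO SX VALUES ('S1', 'Smith', 'LONDON'), ('S2', 'Jones', 'PARIS');
INSERT INTO SY VALUES ('S1', 20), ('S2', 10);

-- The view takes over the name and shape of the old base table.
CREATE VIEW S (S_NO, SNAME, STATUS, CITY) AS
    SELECT SX.S_NO, SX.SNAME, SY.STATUS, SX.CITY
    FROM SX, SY
    WHERE SX.S_NO = SY.S_NO;
""")

# A query written against the original table S, left untouched.
old_query = "SELECT SNAME, STATUS FROM S WHERE CITY = 'LONDON'"
result = conn.execute(old_query).fetchall()
print(result)
```

The retrieval program is insulated from the split; a program that issued Update statements against S would still have to be rewritten against SX and SY.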
Advantages of Views
• Provide a certain amount of logical data independence
• Allow the same data to be seen by different users in different ways
• Simplify the user's perception
• Allow users to focus on the data that is of concern and ignore the rest
• Provide automatic security for hidden data
SQL Access Guidelines
Never Use SELECT *
• Never ask DB2 for anything more than is required
• A query should access only those columns that are needed
• With SELECT *, changes to the table structure may force changes to the program

Singleton SELECT versus Cursor
• A singleton SELECT outperforms a cursor
• When a row needs to be updated after it is retrieved, a cursor is preferred
• The FOR UPDATE clause of the cursor ensures integrity
• DB2 places an X lock, prohibiting concurrent updates

Use FOR FETCH ONLY
• Enables DB2 to use block fetch
• Increases efficiency

Avoid using DISTINCT
• DISTINCT eliminates duplicates
• It invokes a sort to eliminate them
• Code it only when duplicate elimination is mandatory

Limit the SELECTed data
• A SELECT should return the minimal set of required rows
• Do not code generic queries without a WHERE clause
• It is more efficient to use a WHERE clause to restrict retrieval
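The point about SELECT * and table changes can be seen in a short sketch. It runs on Python's sqlite3 module rather than DB2; S_NO is an assumed stand-in for S#, and the added PHONE column is a hypothetical example of table growth.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE S (S_NO TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT)"
)
conn.execute("INSERT INTO S VALUES ('S1', 'Smith', 20, 'LONDON')")

star_before = conn.execute("SELECT * FROM S").fetchone()
named_before = conn.execute("SELECT S_NO, CITY FROM S").fetchone()

# The table grows by one column (hypothetical PHONE column).
conn.execute("ALTER TABLE S ADD COLUMN PHONE TEXT")

star_after = conn.execute("SELECT * FROM S").fetchone()
named_after = conn.execute("SELECT S_NO, CITY FROM S").fetchone()

# SELECT * now returns a wider row; the explicit column list is unaffected.
print(len(star_before), len(star_after), named_before == named_after)
```

A program that fetches SELECT * into a fixed list of host variables would have to change after the ALTER, while the program with the explicit column list would not.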
Code Predicates on Indexed columns
• Requests are satisfied more efficiently using an existing index
• An index is not efficient when most rows in a table are to be accessed

Multi Column Indexes
• Used when the high-level (leading) column is specified in the WHERE clause

Several Indexes instead of a multicolumn Index
• Multiple indexes can be more efficient than a single multicolumn index
• They provide better overall performance across all queries
• At the expense of individual queries

Use ORDER BY when sequence is important
• DB2 doesn't guarantee the order of the rows returned
• The access path may change from one execution to the next
• ORDER BY is mandatory when sequence is important

Limit columns in ORDER BY clause
• For an ORDER BY clause, DB2 may invoke a sort
• The more columns in the ORDER BY, the less efficient the sort
• Specify only those columns that are essential

Use Equivalent data types
• Essential when comparing column values to host variables
• Eliminates the need for data conversion
• An index is not used if the data types are incompatible
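The rule about the high-level column of a multicolumn index can be observed on a small scale. The sketch below uses SQLite through Python's sqlite3 module, so it shows the SQLite planner, not the DB2 optimizer; the behaviour is only analogous. S_NO stands in for S#, the index name IX_CITY_STATUS is hypothetical, and the exact wording of the plan text varies between SQLite versions, so the code only checks whether the index name appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE S (S_NO TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT)"
)
# Multicolumn index: CITY is the high-level (leading) column, STATUS the second.
conn.execute("CREATE INDEX IX_CITY_STATUS ON S (CITY, STATUS)")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


# Predicate on the leading column: the planner can search the index.
leading = plan("SELECT SNAME FROM S WHERE CITY = 'LONDON'")
# Predicate only on the second column: the index is not usable for the lookup.
trailing = plan("SELECT SNAME FROM S WHERE STATUS = 20")

uses_index_leading = any("IX_CITY_STATUS" in step for step in leading)
uses_index_trailing = any("IX_CITY_STATUS" in step for step in trailing)
print(leading, trailing)
```

On DB2 the equivalent check would be made with its own EXPLAIN facility; the sketch only illustrates why the leading column has to appear in the WHERE clause.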
Use BETWEEN instead of <= and >=
• BETWEEN is more efficient than a combination of <= and >=
• The optimizer selects a more efficient path

Use IN instead of LIKE
• For a known list of data occurrences, use IN
• IN with a specific list is more efficient than LIKE

Formulate LIKE predicates with care
• Avoid % or _ at the beginning of the comparison string
• Avoid using LIKE with a host variable

Avoid using NOT (except with EXISTS)
• Recode queries to avoid the use of NOT
• Do so by taking advantage of knowledge of the data being accessed

Code the most restrictive predicate first
• Place the predicate that eliminates the greatest number of rows first

Use Predicates wisely
• Reduce the number of predicates
• Know your data in order to reduce predicates

Specify the number of rows to be returned
• Code the cursor statement with OPTIMIZE FOR n ROWS
• DB2 selects the optimal path for that number of rows
• This does not prevent the program from fetching more rows
Complex SQL Guidelines
UNION versus UNION ALL
• UNION invokes a sort; UNION ALL does not
• Use UNION ALL when the retrieved data does not contain duplicates

Use NOT EXISTS instead of NOT IN
• NOT EXISTS only verifies non-existence
• NOT IN must have the complete set of rows materialized
• Use NOT EXISTS for a subquery using negation logic

Use a constant for existence checking
• If a subquery tests existence, specify a constant in its select-list
• The select-list of such a subquery is unimportant; it will not return data

Predicate transitive closure rules
• The DB2 optimizer uses the rule of transitivity, which states: if a = b and b = c, then a = c
• It is efficient to code a redundant predicate to exploit transitivity. Instead of

    where a.col1 = b.col1
      and a.col1 = :hostvar

  code

    where a.col1 = b.col1
      and a.col1 = :hostvar
      and b.col1 = :hostvar

Minimize the number of tables in a join
• Do not join more than 5 tables
• Eliminate unnecessary tables from join statements

Denormalize to reduce joining
• To minimize the need for joins, consider denormalizing
• This implies redundancy, dual updating and extra usage of space
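The semantic side of two of these guidelines can be checked in a short sketch: UNION removes duplicates while UNION ALL keeps them, and a NOT EXISTS subquery can select a constant because its select-list is never returned. The sketch uses Python's sqlite3 module, not DB2, and says nothing about DB2 performance; S_NO and P_NO are assumed stand-ins for S# and P#, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S (S_NO TEXT PRIMARY KEY, CITY TEXT);
CREATE TABLE P (P_NO TEXT PRIMARY KEY, CITY TEXT);
CREATE TABLE SP (S_NO TEXT, P_NO TEXT, QTY INTEGER);
INSERT INTO S VALUES ('S1', 'LONDON'), ('S2', 'PARIS'), ('S3', 'ATHENS');
INSERT INTO P VALUES ('P1', 'LONDON'), ('P2', 'PARIS');
INSERT INTO SP VALUES ('S1', 'P1', 300), ('S2', 'P2', 200);
""")

# UNION eliminates duplicate cities; UNION ALL returns every row from both sides.
union_rows = conn.execute("SELECT CITY FROM S UNION SELECT CITY FROM P").fetchall()
union_all_rows = conn.execute("SELECT CITY FROM S UNION ALL SELECT CITY FROM P").fetchall()

# Existence test with a constant in the subquery's select-list.
no_shipments = [
    row[0]
    for row in conn.execute(
        "SELECT S_NO FROM S "
        "WHERE NOT EXISTS (SELECT 1 FROM SP WHERE SP.S_NO = S.S_NO) "
        "ORDER BY S_NO"
    )
]
print(len(union_rows), len(union_all_rows), no_shipments)
```

When the two inputs are already known to be disjoint, UNION ALL returns the same rows as UNION, so the duplicate-elimination step is pure overhead.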
Reduce the number of rows to be joined
• The number of rows participating in a join determines response time
• To reduce join response time, reduce the number of rows
• Code predicates to minimize the number of rows

Join using SQL instead of program logic
• It is more efficient to join using SQL than in application code
• The optimizer has a vast array of tools to optimize performance
• Application code would not consider equivalent optimizations

Use Joins instead of Subqueries
• A join is more efficient than a correlated subquery or a subquery using IN

Join on clustered columns
• Use clustered columns in join criteria when possible
• This reduces the need for intermediate sorts
• It might require clustering the parent table by primary key
• and the child table by foreign key

Join on Indexed columns
• Joins are efficient when tables are joined on indexed columns
• Consider creating indexes for join predicates

Avoid Cartesian products
• Never use a join without a predicate
• A join without a predicate results in a Cartesian product
• A lot of resources are spent on such a join
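The cost of a missing join predicate is easy to see by counting rows. The sketch below uses Python's sqlite3 module in place of DB2; S_NO and P_NO are assumed stand-ins for S# and P#, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S (S_NO TEXT PRIMARY KEY, SNAME TEXT);
CREATE TABLE SP (S_NO TEXT, P_NO TEXT, QTY INTEGER);
INSERT INTO S VALUES ('S1', 'Smith'), ('S2', 'Jones'), ('S3', 'Blake');
INSERT INTO SP VALUES ('S1', 'P1', 300), ('S1', 'P2', 200),
                      ('S2', 'P1', 300), ('S3', 'P2', 200);
""")

# No join predicate: every supplier row is paired with every shipment row.
cartesian = conn.execute("SELECT COUNT(*) FROM S, SP").fetchone()[0]

# With the join predicate: one result row per shipment.
joined = conn.execute(
    "SELECT COUNT(*) FROM S, SP WHERE S.S_NO = SP.S_NO"
).fetchone()[0]
print(cartesian, joined)
```

With 3 suppliers and 4 shipments the difference is small, but the product grows multiplicatively with table size, which is where the wasted resources come from.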
Provide adequate search criteria
• Provide additional search criteria in the WHERE clause
• These should be in addition to the join criteria
• This gives DB2 an opportunity to rank the tables to be joined
• It allows queries to perform adequately

Limit columns to be used in GROUP BY
• Specify only the minimum columns to be grouped
• DB2 needs to sort the retrieved data
• The more columns, the more expensive the sort
Increase the possibility of Stage 1 processing
A predicate that is satisfied by Stage 1 processing is evaluated by the Data Manager portion of DB2 rather than by the Relational Data System. The Data Manager component is closer to the data than the Relational Data System. A Stage 1 predicate is evaluated at an earlier stage of data retrieval, thereby avoiding the overhead of passing data from one component to the other. The following list shows predicates satisfied at Stage 1:

    Colname    operator value
    Colname    IS NULL
    Colname    BETWEEN val1 AND val2
    Colname    IN (list)
    Colname    LIKE pattern
    Colname    LIKE hostvariable
    a.Colname  operator b.Colname

The last item in the list refers to two columns in different tables. Predicates formulated with AND, OR, or NOT are not Stage 1.
Increase the possibility of Index processing
A query that can use an index has more access path options, so it has the potential to be more efficient than a query that cannot use an index. The DB2 optimizer can use an index or indexes in a variety of ways to speed the retrieval of data from DB2 tables. The following list shows predicates that can be satisfied by using an index:

    Colname    operator value
    Colname    IS NULL
    Colname    BETWEEN val1 AND val2
    Colname    IN (list)
    Colname    LIKE pattern
    Colname    LIKE hostvariable
    a.Colname  operator b.Colname

The last item in the list refers to two columns in different tables. Predicates formulated with AND, OR, or NOT are not indexable.
