
Relational Database Management System

surendersingh@rediffmail.com
Surender Singh
Sr. Programmer
surendersingh@titsbhiwani.ac.in

Relational Database Management System

DATA

DATABASE

DBMS/RDBMS

Information

File Processing System


Application Programs (written in C, Pascal, etc.) -> File System (data structures, file handling) -> Database (information stored in files)

File System

Database

Disadvantages of FPS

Data Redundancy and Inconsistency


Difficulty in accessing data
Data isolation
Integrity Problems
Atomicity Problems
Concurrent-access anomalies
Security Problems

Data Redundancy and Inconsistency

Customer Information
Name  Address
ABC   Bhiwani
DEF   Delhi

Saving Account
AccNo  Name  Address
1002   ABC   Bhiwani
1005   DEF   Jaipur
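
A small sketch of why the duplicated address is a problem: when the same customer is stored in two files, nothing forces the copies to agree. The record layout and the helper `find_inconsistencies` are illustrative assumptions, not part of the slides.

```python
# Hypothetical in-memory copies of the two files from the slide.
customer_info = [
    {"name": "ABC", "address": "Bhiwani"},
    {"name": "DEF", "address": "Delhi"},
]
saving_account = [
    {"acc_no": 1002, "name": "ABC", "address": "Bhiwani"},
    {"acc_no": 1005, "name": "DEF", "address": "Jaipur"},
]


def find_inconsistencies(customers, accounts):
    """Return (name, address in customer file, address in account file) mismatches."""
    address_of = {c["name"]: c["address"] for c in customers}
    return [
        (a["name"], address_of[a["name"]], a["address"])
        for a in accounts
        if a["name"] in address_of and address_of[a["name"]] != a["address"]
    ]


mismatches = find_inconsistencies(customer_info, saving_account)
print(mismatches)
```

In a file processing system a check like this has to be written by hand for every pair of files that share data.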

Difficulty in accessing data

Manager -> Requirement -> Application Programs (written in C, Pascal, etc.) -> File System (data structures, file handling) -> Database (information storage in file format)

Data Isolation and Integrity Problems
Program in C

#include <stdio.h>
main()
{
-----
--------
}

Program in COBOL

01 Reserve-rec.
03 saving
05 accno PIC A(2)

New Document

Atomicity Problems

Bank
Data Transmission

Concurrent-access anomalies

USER USER

Security Problems

Employee
Information

Database: the Peace of Mind

Requirements of a DBMS

A mechanism for specification of data and its dependencies (integrity constraints) in an integrated fashion.
Prevention of redundancy and inconsistency.
Provision of adequate security and access rights.
Mechanism for concurrency control.
Mechanism for recovery from failure.

Additionally, any DBMS must provide:

Schemes for specification of processing rules or application programs.
Efficient techniques for storage and retrieval of data from secondary storage (disk).

A DBMS has two major components, namely:

1. The database itself, which has two aspects:
   The structure of the database, called the database schema.
   The instance, which is a state of the database with the actual data loaded.
2. A set of software tools/programs which access, update and process the database, called the query and update mechanism.

DBMS -> File Manager -> Secondary Storage

View of DATA

View Level (External Level)

View 1 View 2 View n

Logical Level Conceptual View

Physical Level Internal View

Data Independence
The ability to modify a schema definition in one level without affecting a
schema definition in the next higher level is called data independence.

Physical data independence
Logical data independence

Create table emp
(empno number(10),
--------------
);

Data Models
A Data Model is a mechanism for describing the data, their interrelationships
and the constraints.

Object-based Conceptual models.


Entity-Relationship model

Record-based models.
Relational Model
Network Model
Hierarchical Model

Physical data models.

The E-R Model

Entities: An entity is a distinct, clearly identifiable object of the database, e.g., a book.
Attribute: Each entity is characterized by a set of attributes, e.g., Acc_No.
Entity Set: The set of all entities having attributes of the same type.
Relationships: A relationship is a mapping between entity sets.

E-R diagram: BOOK (Acc_No, Title, Author, YearofPub) -- Borrowed_By (Acc_No, Card_No, DOI) -- USERS (Card_No, Name, Address)

The Relational Model
The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple attributes (columns), and all of its tuples (rows) have the same structure.

Attribute

Book Table/Relation

AccNo Title Author YearofPub

Tuple

Network Model
Data in the network model are represented by collections of records, and relationships among data are represented by links, which can be viewed as pointers.

User
Card_No Name Address Link

Pointer Next

Book
Acc_No Author ----- Link

Hierarchical Model
This is a special kind of network model in which the relationships form a tree-like structure.

Hospital

Wards Units

Patient Doctors Nurses Cardiology Skin

Physical Data Models

Physical data models are used to describe data at the lowest level.
In contrast to logical data models, there are few physical data models in use. Two of the widely known ones are the unifying model and the frame-memory model.

Database Languages

Data-Definition

Create Table Test
(
Title Varchar2(20),
--------
);

Data-Manipulation

Update
Insert
Delete
Query

Data-Control

GRANT Connect, Resource TO xUser

Database Management System Structure
Users: naive users (tellers, agents, etc.), application programmers, sophisticated users, database administrators.
Interfaces: application interfaces, application programs, query, database scheme.
Query processor: embedded DML precompiler, DML compiler, DDL interpreter, application programs object code, query evaluation engine.
Storage manager: transaction manager, buffer manager, file manager.
Disk storage: indices, statistical data, data files, data dictionary.

Oracle Storage System Structure

Database Administrator

Roles of DBA

Schema Definition
Storage structure and access-method definition
Schema and Physical-organization modification
Granting of authorization for data access
Integrity-constraint specification

Terms
Simple and Composite Attributes
Single-valued and Multivalued Attributes
Null Attributes
Derived Attributes
Existence Dependencies
Weak Entity Set and Strong Entity Set

Weak Entity Set

Attributes

Keys

Keys

Candidate Key Secondary Key Foreign Key

Primary Key Alternate Key

Composite Key

Candidate Keys

Roll_No is the primary key; the remaining candidate keys are alternate keys.

Roll_No  Name    Branch       City
01       Deepak  Computers    Bhiwani
02       Mukesh  Electronics  Rohtak
03       Teena   Mechanical   Bhiwani
04       Deepti  Chemical     Rohtak
05       Monika  Civil        Delhi

Primary Key and Secondary Key (Roll_No is the primary key)

Roll_No  Name    Branch       City
01       Deepak  Computers    Bhiwani
02       Mukesh  Electronics  Rohtak
03       Teena   Computers    Bhiwani
04       Deepak  Electronics  Rohtak
05       Monika  Computers    Delhi

Composite Primary Key

Name    Branch       City
Deepak  Computers    Bhiwani
Mukesh  Electronics  Rohtak
Teena   Computers    Bhiwani
Deepak  Electronics  Rohtak
Monika  Computers    Delhi

Part
P#  P_Name  Colour  Quantity
P1  Nut     Red     200
P2  Bolt    Green   250
P3  Screw   Blue    300

Supplier
S#  S_Name  City     Quantity
S1  John    Delhi    200
S2  Smith   Kolkata  250
S3  James   Delhi    300
S4  David   Chennai  400
S5  John    Chennai  300

SP
P#  S#  Quantity
P1  S1  200
P2  S1  300
P3  S1  400
P1  S2  250
P2  S3  250
P3  S4  200
P2  S4  300
P3  S5  400

Mapping Cardinalities

Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set.

For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:

One to One
One to Many
Many to One
Many to Many

More on E-R Diagrams
Multiple relationships between the same entity sets: Company -- Owns -- Vehicle and Company -- Leased -- Vehicle.

Circular relationship: Staff -- Reports to -- Staff, with the roles Manager and Subordinate.

Ternary E-R Diagram

Instructors -- Teaches -- Students, Courses (one relationship set among three entity sets)

Constraints: Book (N) -- Borrowed_By -- (1) User

E-R Diagram Components
Entity Sets

Attributes

Relationship Sets

Connectors/Constraints

Multivalued Attributes

Derived Attributes

Total Participation of an entity in a relationship set

Existence Dependencies

Generalization and Specialization
The abstraction mechanisms

Employee (Emp_No, Name, Date_of_hire)
  IS_A: Full_time Employee, Part_time Employee
    Full_time Employee IS_A: Faculty, Staff
    Part_time Employee IS_A: Teaching, Casual

Other attribute labels in the figure: Type, Salary, Degree, Interest, Stipend, Hour_Rate.
Generalization reads the hierarchy bottom-up; specialization reads it top-down.

Aggregation
The process of compiling information on an object.

Figure labels: Teacher, Teaches, Course, Uses, Book.
In the aggregated diagram, the relationship Teacher -- Teaches -- Course is grouped as Teacher-Teaches, which is related to Book through Uses.

Represent ER model using tables

Query Languages

A query language is a language in which a user requests information from a database.

These are typically higher-level than programming languages.
They may be one of:

Procedural, where the user instructs the system to perform a sequence of operations on the database. This will compute the desired information.

Nonprocedural, where the user specifies the information desired without giving a procedure for obtaining the information.

A complete query language also contains facilities to insert and delete tuples as well as to modify parts of existing tuples.

The Relational Algebra
The relational algebra is a procedural query language.

The Borrow and Branch relations

Fundamental Operations

select (unary)
project (unary)
rename (unary)
cartesian product (binary)
union (binary)
set-difference (binary)

Several other operations, defined in terms of the fundamental operations:

set-intersection
natural join
division
assignment

Operations produce a new relation as a result.

Formal Definition of Relational Algebra

The Select Operation

The Project Operation

The Cartesian Product Operation

Output of Cartesian Product

Relation A: 1, 2, 3
Relation B: X, Y

A X B
A  B
1  X
1  Y
2  X
2  Y
3  X
3  Y
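
The same product can be sketched in a few lines of Python. The names `relation_a`, `relation_b` and `a_times_b` are illustrative; `itertools.product` pairs every element of the first relation with every element of the second.

```python
from itertools import product

relation_a = [1, 2, 3]
relation_b = ["X", "Y"]

# Every tuple of A is paired with every tuple of B: |A x B| = |A| * |B|.
a_times_b = list(product(relation_a, relation_b))

for a, b in a_times_b:
    print(a, b)
```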

The Rename Operation

The Union Operation

The Set Difference Operation

Additional Operations

The Set Intersection Operation

The Natural Join Operation

The Division Operation

Example of Division Operation

Relation R
A  B
P  A
Q  A
P  B
Q  T
M  A
Q  B

Relation S
B
A
B

R ÷ S
A
P
Q
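
A minimal sketch of division, assuming R has the two attributes (A, B) and S the single attribute (B) as in the example. The function name `divide` is hypothetical. An A value belongs to the result only if R pairs it with every B value in S.

```python
def divide(r, s):
    """R(A, B) divided by S(B): A values that R pairs with every B value of S."""
    s_values = set(s)
    a_values = {a for a, _ in r}
    return {
        a
        for a in a_values
        if s_values <= {b for x, b in r if x == a}
    }


R = {("P", "A"), ("Q", "A"), ("P", "B"), ("Q", "T"), ("M", "A"), ("Q", "B")}
S = {"A", "B"}

result = divide(R, S)
print(sorted(result))
```

M is excluded because R contains (M, A) but not (M, B).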

The Assignment Operation

Relational Calculus

Relational calculus is a nonprocedural query language.

Tuple Relational Calculus
Uses tuple variables, which take an entire tuple as their value
Domain Relational Calculus
Uses domain variables, which take values from an attribute

Tuple Relational Calculus

Example Queries

Some More Examples

Domain Relational Calculus

SQL

Integrity Constraints

Integrity and consistency are of primary concern in any database design.

At any instant a database must be correct according to a set of rules.
Rules are checked during any database operation:

Insertion
Deletion
Update
Recovery from Failure
Concurrent Operations

Types of Constraints

Domain Constraints
Referential Integrity Constraint
Functional Dependencies

Domain Constraints
Includes

Type
Width
Null or Not Null
Checks/Conditions

Specified at the time of designing
Checked at the time of insertion, deletion or modification

e.g.
Bname char(20)
Amount number(7,2)
DOL date check (DOL >= '29/09/2004')
City char(10) not null
TotalAmt = amount + interest
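
The sketch below shows domain constraints being enforced at insertion time. It uses Python's built-in `sqlite3` module rather than the Oracle-style syntax of the slide, and the table name `loan`, the ISO date format and the helper `try_insert` are assumptions made for illustration. SQLite does not enforce a declared column width, so the width is expressed as a CHECK.

```python
import sqlite3

# Hypothetical table "loan"; SQLite stands in for the Oracle-style DDL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE loan (
        bname  TEXT CHECK (length(bname) <= 20),   -- width
        amount NUMERIC,                            -- type
        dol    TEXT CHECK (dol >= '2004-09-29'),   -- check/condition
        city   TEXT NOT NULL                       -- null or not null
    )
    """
)


def try_insert(row):
    """Return 'accepted' or 'rejected' depending on the domain constraints."""
    try:
        conn.execute("INSERT INTO loan VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


print(try_insert(("Main", 1500.50, "2004-10-01", "Bhiwani")))  # satisfies all rules
print(try_insert(("Main", 1500.50, "2004-01-15", "Bhiwani")))  # date too early
print(try_insert(("Main", 1500.50, "2004-10-01", None)))       # city is null
```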

Referential Integrity
Foreign Key
Referential integrity states that all values of the foreign key of one relation must be present in another relation where the same attribute is declared as the primary key.

Checks during Database Modification

Insert
Delete
Update
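
As an illustration of the three checks, the following sketch uses Python's `sqlite3` module with two hypothetical tables, `dept` and `emp`. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; the helper `attempt` is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute(
    "CREATE TABLE emp ("
    " empno INTEGER PRIMARY KEY,"
    " name TEXT,"
    " deptno INTEGER REFERENCES dept(deptno))"
)
conn.execute("INSERT INTO dept VALUES (10, 'Accounts')")
conn.execute("INSERT INTO emp VALUES (1, 'Deepak', 10)")


def attempt(sql):
    """Run one statement and report whether referential integrity allowed it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Insert: the referenced department does not exist.
insert_result = attempt("INSERT INTO emp VALUES (2, 'Mukesh', 99)")
# Delete: department 10 is still referenced by an employee.
delete_result = attempt("DELETE FROM dept WHERE deptno = 10")
# Update: the new foreign key value has no matching primary key.
update_result = attempt("UPDATE emp SET deptno = 77 WHERE empno = 1")

print(insert_result, delete_result, update_result)
```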

Assertions and Triggers
An assertion is a general predicate, expressed in relational algebra or calculus or any language like SQL, which must always hold in a database.

Assert salary-constraint on emp
salary >= 1000

A trigger is a statement or a block of statements that is executed automatically by the system when an event (i.e., insertion, update or deletion) takes place on a table.

Define trigger insert_record
on delete of emp e
(insert into emp_history
values e.empno, e.name, e.deptno)
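
The trigger above is written in a generic notation. A runnable approximation, assuming SQLite's trigger syntax via Python's `sqlite3` module and hypothetical sample rows, could look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, name TEXT, deptno INTEGER);
    CREATE TABLE emp_history (empno INTEGER, name TEXT, deptno INTEGER);

    -- Fires automatically for every row deleted from emp.
    CREATE TRIGGER insert_record AFTER DELETE ON emp
    BEGIN
        INSERT INTO emp_history VALUES (OLD.empno, OLD.name, OLD.deptno);
    END;

    INSERT INTO emp VALUES (1, 'Deepak', 10);
    INSERT INTO emp VALUES (2, 'Teena', 20);
    """
)

conn.execute("DELETE FROM emp WHERE empno = 1")
history = conn.execute("SELECT empno, name, deptno FROM emp_history").fetchall()
print(history)
```

The application only issues the DELETE; the copy into `emp_history` is performed by the database system.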

Functional Dependencies
Functional dependencies provide a formal mechanism to express constraints between attributes.

They are a means of identifying how the values of certain attributes are determined by the values of other attributes.

A functional dependency (FD) generalizes the concept of a key.

Book (acc_no, yr_pub, title)

acc_no is the primary key.

Formal representation of constraints:

acc_no → yr_pub
acc_no → title

Formal Notation of FD
In general, if there are two attributes A and B and the FD

A → B

holds, then there can be no two tuples which have the same value of attribute A and different values of attribute B.

If α and β are two sets of attributes, then the FD α → β holds on a relation r(R) if

1. α ⊆ R and β ⊆ R, i.e., α and β are subsets of R
2. for all tuples t1 and t2 in r,
if t1[α] = t2[α] then
t1[β] = t2[β]
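
Condition 2 can be checked mechanically on a relation instance. In the sketch below the function `fd_holds` and the sample Book rows are hypothetical; note that an instance can only refute an FD, since an FD is a statement about all legal instances.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on the lhs attributes but differ on the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical instance of Book (acc_no, yr_pub, title).
book = [
    {"acc_no": 1, "yr_pub": 1999, "title": "DBMS"},
    {"acc_no": 2, "yr_pub": 1999, "title": "Networks"},
    {"acc_no": 3, "yr_pub": 2004, "title": "DBMS"},
]

print(fd_holds(book, ["acc_no"], ["yr_pub"]))  # acc_no determines yr_pub here
print(fd_holds(book, ["yr_pub"], ["title"]))   # 1999 maps to two different titles
```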

Closure of a Set of Functional Dependencies

Armstrong's Axioms

Closure of a Set of F+

Closure of Attribute Sets

Canonical Cover

To minimize the number of functional dependencies that need to be tested in case of an update, we may restrict F to a canonical cover Fc.

A canonical cover for F is a set of dependencies Fc such that F logically implies all dependencies in Fc, and Fc logically implies all dependencies in F.

A canonical cover Fc of a set of FDs F is a minimal cover of F in the sense that there is no proper subset of Fc which also covers F.

Example of Canonical Cover
Consider a relation r (X, Y, Z) with the FDs F:

1. X → YZ
2. Y → Z
3. X → Y
4. XY → Z

Here (4) is redundant because (1) states that X → Y and X → Z hold; thus (4) can be derived from (1). Also (3) is redundant because (1) contains (3). Deleting these two we get

1. X → YZ
2. Y → Z

which is a cover of F. Here again, since X → Y and Y → Z hold, X → Z holds by transitivity, so Z is redundant on the right-hand side of X → YZ. Deleting it we get the FDs

X → Y
Y → Z

which is a canonical cover of F.
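
The claim that the two-FD set covers F can be verified with attribute closures. This is a minimal sketch: the helpers `closure` and `implies` are illustrative names, and each FD is written as a pair of strings whose characters are the attribute names.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """fds logically implies lhs -> rhs exactly when rhs lies in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


F = [("X", "YZ"), ("Y", "Z"), ("X", "Y"), ("XY", "Z")]
Fc = [("X", "Y"), ("Y", "Z")]

# Fc covers F and F covers Fc, so the two sets are equivalent.
equivalent = all(implies(Fc, l, r) for l, r in F) and all(
    implies(F, l, r) for l, r in Fc
)
print(sorted(closure("X", Fc)), equivalent)
```

Dropping either FD from Fc breaks the equivalence, which is what makes it minimal.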
Relational Database Design

Database Decomposition 1

Representation of Information

Database Decomposition 2

Database Decomposition 3

Database Decomposition 4

Lossless-join Decomposition

Example of lossy decomposition
S_by
s_name  s_addr  Item  Price
A1      B1      C1    D1
A1      B1      C2    D1
A2      B2      C1    D2
A2      B2      C3    D3
A3      B1      C2    D2

p1 (projection of S_by on S_name, Item)
S_name  Item
A1      C1
A1      C2
A2      C1
A2      C3
A3      C2

p2 (projection of S_by on S_addr, Item, Price)
S_addr  Item  Price
B1      C1    D1
B1      C2    D1
B2      C1    D2
B2      C3    D3
B1      C2    D2

Natural join of p1 and p2
S_name  S_addr  Item  Price
A1      B1      C1    D1
A1      B2      C1    D2
A1      B1      C2    D1
A1      B1      C2    D2
A2      B1      C1    D1
A2      B2      C1    D2
A2      B2      C3    D3
A3      B1      C2    D1
A3      B1      C2    D2

The join contains four tuples that are not in S_by (A1 B2 C1 D2, A1 B1 C2 D2, A2 B1 C1 D1 and A3 B1 C2 D1), so the decomposition is lossy.
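
The lossy join can be reproduced directly. The sketch below assumes S_by holds the five tuples of the example and projects it onto (s_name, item) and (s_addr, item, price); the variable names are illustrative.

```python
# S_by (s_name, s_addr, item, price) as a set of tuples.
s_by = {
    ("A1", "B1", "C1", "D1"),
    ("A1", "B1", "C2", "D1"),
    ("A2", "B2", "C1", "D2"),
    ("A2", "B2", "C3", "D3"),
    ("A3", "B1", "C2", "D2"),
}

# Projections p1 (s_name, item) and p2 (s_addr, item, price).
p1 = {(name, item) for name, addr, item, price in s_by}
p2 = {(addr, item, price) for name, addr, item, price in s_by}

# Natural join on the only common attribute, item.
joined = {
    (name, addr, item, price)
    for name, item in p1
    for addr, item2, price in p2
    if item == item2
}

spurious = joined - s_by
print(len(s_by), len(joined), sorted(spurious))
```

The join never loses original tuples; the information loss shows up as extra tuples that were not in S_by.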
Dependency Preservation

Normalization

Normalization is a process of removing redundancy using functional dependencies.

To reduce redundancy it is necessary to decompose a relation into a number of smaller relations.

There are several normal forms:

- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)

First Normal Form (1NF)

This normal form says that all attributes are simple.

An attribute is said to be simple if it does not contain any subparts.

An attribute which contains subparts is called a complex attribute.

Name: F_name, L_name
C_addr: City, State, Zip

Second Normal Form (2NF)
A relation is said to be in 2NF if it is in 1NF and all non-prime attributes are fully functionally dependent on the candidate key.

Consider a relation Saving_deposit having the following structure:

Saving_deposit (name, addr, acc_no, amt)

With the following FDs:

name → addr
name, acc_no → amt

Here [name, acc_no] is the candidate key, and addr and amt are the non-prime attributes. Among the non-prime attributes, amt depends on [name, acc_no] whereas addr depends on name only.

Note that due to the FD name → addr, every tuple with the same name will contain the same address, causing redundancy.

This redundancy arises because a non-prime attribute like addr is dependent on an attribute (name) which is only part of the candidate key and is not a candidate key itself.

Solution
We can remove this redundancy by splitting the original relation into the following two relations:

Sav_sch1 (name, addr)
Sav_sch2 (name, acc_no, amt)

Both relations are now in 2NF.

In the first relation name is the primary key and the only non-prime attribute is addr, which is dependent on name.

In the second relation the only non-prime attribute, amt, depends on both name and acc_no. Note that this decomposition is also lossless-join and dependency preserving.

As a further example, consider

Courses (Course_no, title, loc, time)

with the FDs

Course_no → title
Course_no, time → loc

Third Normal Form (3NF)
A relation is said to be in 3NF if it is in 2NF and the non-prime attributes are not dependent on each other.

Consider the relation

s_by (s_name, item, price, gift_item)

With FDs

s_name, item → price
price → gift_item

Here all non-prime attributes are fully functionally dependent on the candidate key, but the non-prime attribute gift_item is also functionally dependent on the non-prime attribute price. This creates redundancy because for every price value there is a fixed gift item.

We shall have to impose the additional restriction that no non-prime attribute can be functionally dependent on another non-prime attribute.

Solution
We decompose the relation
s_by (s_name, item, price, gift_item)
into
s_by_1 (s_name, item, price)
s_by_2 (price, gift_item)

Now we have a lossless-join and dependency-preserving decomposition.

An alternative yet equivalent definition of 3NF is:

For every FD α → β on R, at least one of the following conditions holds:

α → β is a trivial dependency (β ⊆ α)
α → R (α is a super key)
every attribute in β − α is a prime attribute (part of some candidate key)
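
The standard textbook test for 3NF checks each FD against three conditions: it is trivial, its left side is a super key, or every right-side attribute not on the left is prime. The sketch below applies that test to s_by and to its decomposition. All function names are hypothetical, and the brute-force key search is only suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            is_superkey = closure(combo, fds) == set(schema)
            if is_superkey and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys


def violations_3nf(schema, fds):
    """Return the FDs that satisfy none of the three 3NF conditions."""
    prime = {a for key in candidate_keys(schema, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) == set(schema)
        rhs_is_prime = (set(rhs) - set(lhs)) <= prime
        if not (trivial or superkey or rhs_is_prime):
            bad.append((lhs, rhs))
    return bad


fd1 = (("s_name", "item"), ("price",))
fd2 = (("price",), ("gift_item",))

print(violations_3nf({"s_name", "item", "price", "gift_item"}, [fd1, fd2]))
print(violations_3nf({"s_name", "item", "price"}, [fd1]))
print(violations_3nf({"price", "gift_item"}, [fd2]))
```

Only the original relation reports a violation, caused by price → gift_item.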

Boyce-Codd Normal Form (BCNF)

More on BCNF

Comparison of BCNF and 3NF

Comparison of BCNF and 3NF - 2

Normalization using Multivalued Dependencies

Multivalued Dependencies -2

Rules

More Rules

Fourth Normal Form (4NF)

Example

Normalization using Join Dependencies
Let R be a relation schema and R1, R2, ..., Rn be a decomposition of R. The join dependency *(R1, R2, ..., Rn) is used to restrict the set of legal relations to those for which R1, R2, ..., Rn is a lossless-join decomposition of R.

Formally, if R = R1 ∪ R2 ∪ ... ∪ Rn, we say that a relation r(R) satisfies the join dependency *(R1, R2, ..., Rn) if r = Π_R1(r) ⋈ Π_R2(r) ⋈ ... ⋈ Π_Rn(r).

Fifth Normal Form (5NF)
Project-Join Normal Form
Project-join normal form (PJNF) is defined in a manner similar to BCNF and 4NF, except that join dependencies are used.

A relation schema R is in PJNF with respect to a set D of functional, multivalued, and join dependencies if, for all join dependencies in D+ of the form *(R1, R2, ..., Rn), where each Ri ⊆ R and R = R1 ∪ R2 ∪ ... ∪ Rn, at least one of the following holds:

*(R1, R2, ..., Rn) is a trivial join dependency.
Every Ri is a superkey for R.

It follows that every schema in PJNF is also in 4NF.

Thus, in general, we may not be able to find a dependency-preserving decomposition into PJNF for a given schema.

Storage and File Structure
Hierarchy of Storage

Description

Description - 2

File Organization

Fixed Length Record -1

Fixed Length Record -2

Variable-length Records

Fixed-length representation

Organization of Records in files

Concurrency Control and Recovery
Transactions
Concurrent execution of user programs is essential for good DBMS performance.
Because disk accesses are frequent and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.
A user's program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned about what data is read from or written to the database.
A transaction is the DBMS's abstract view of a user program: a sequence of reads and writes.

A transaction is a unit of program execution that accesses and possibly updates various data items.

A collection of operations that forms a single logical unit of work is called a transaction.
A database system must ensure proper execution of transactions despite failures.

To ensure integrity of the data, the database system must maintain the following properties of transactions: atomicity, consistency, isolation and durability (the ACID properties).

States of Transactions

Partially Committed

Active

Aborted
Failed

Concurrency in a DBMS
Users submit transactions, and can think of each transaction as executing by itself.
Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions.
Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.
The DBMS will enforce some integrity constraints (ICs), depending on the ICs declared in CREATE TABLE statements.
Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).

Issues: Effect of interleaving transactions, and crashes.

Example

Consider two transactions (Xacts):

T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END

Intuitively, the first transaction is transferring $100 from B's account to A's account. The second is crediting both accounts with a 6% interest payment.
There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.

Example (Contd.)

Consider a possible interleaving (schedule):

T1: A=A+100,            B=B-100
T2:           A=1.06*A,           B=1.06*B

This is OK. But what about:

T1: A=A+100,                      B=B-100
T2:           A=1.06*A, B=1.06*B

The DBMS's view of the second schedule:

T1: R(A), W(A),                          R(B), W(B)
T2:             R(A), W(A), R(B), W(B)
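
To see why the second schedule is a problem, the sketch below runs both serial orders and both interleavings on hypothetical starting balances of 1000 each. It assumes the first interleaving alternates T1 and T2 step by step, and that in the second one T2 performs both of its updates between T1's two updates. The step functions and `run` are illustrative names.

```python
def t1_credit_a(db):
    db["A"] += 100


def t1_debit_b(db):
    db["B"] -= 100


def t2_interest_a(db):
    db["A"] *= 1.06


def t2_interest_b(db):
    db["B"] *= 1.06


def run(steps, a=1000.0, b=1000.0):
    """Apply the steps in order and return the final (A, B), rounded to cents."""
    db = {"A": a, "B": b}
    for step in steps:
        step(db)
    return (round(db["A"], 2), round(db["B"], 2))


serial_t1_t2 = run([t1_credit_a, t1_debit_b, t2_interest_a, t2_interest_b])
serial_t2_t1 = run([t2_interest_a, t2_interest_b, t1_credit_a, t1_debit_b])
first_schedule = run([t1_credit_a, t2_interest_a, t1_debit_b, t2_interest_b])
second_schedule = run([t1_credit_a, t2_interest_a, t2_interest_b, t1_debit_b])

print(serial_t1_t2, serial_t2_t1)
print(first_schedule, second_schedule)
```

The first schedule gives the same balances as running T1 and then T2. The second gives balances that match neither serial order: A is updated as if T1 ran first, and B as if T2 ran first.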
Example (Contd.)

The DBMS must not allow schedules like this!

T1: R(A), W(A),                          R(B), W(B)
T2:             R(A), W(A), R(B), W(B)

Dependency graph: T1 -> T2 (on A), T2 -> T1 (on B)

Dependency graph: One node per Xact; edge from Ti to Tj if Tj reads or writes an object last written by Ti.
The cycle in the graph reveals the problem. The output of T1 depends on T2, and vice-versa.

Scheduling Transactions

Equivalent schedules: For any database state, the effect (on the set of objects in the database) of
executing the first schedule is identical to the effect of executing the second schedule.
Serializable schedule: A schedule that is equivalent to some serial execution of the transactions.
If the dependency graph of a schedule is acyclic, the schedule is called conflict serializable. Such a
schedule is equivalent to a serial schedule.
This is the condition that is typically enforced in a DBMS (although it is not necessary for
serializability).
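
The acyclicity test can be sketched as follows. A schedule is assumed to be a list of (transaction, operation, object) triples, and an edge Ti -> Tj is added whenever an action of Ti precedes a conflicting action of Tj (same object, different transactions, at least one write). This is the usual textbook conflict rule; the function names are hypothetical.

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for every pair of conflicting actions, in schedule order."""
    edges = set()
    for i, (ti, op_i, obj_i) in enumerate(schedule):
        for tj, op_j, obj_j in schedule[i + 1:]:
            if ti != tj and obj_i == obj_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


def conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


ok_schedule = [
    ("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
    ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B"),
]
bad_schedule = [
    ("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
    ("T2", "R", "B"), ("T2", "W", "B"), ("T1", "R", "B"), ("T1", "W", "B"),
]

print(conflict_serializable(ok_schedule), conflict_serializable(bad_schedule))
```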

Detection of Serializability
One of the techniques of concurrency control is to detect whether a schedule is valid or not prior to execution.

The task of understanding a schedule is simplified by considering only the sequence of read and write operations in a transaction.

T1 T2

Read(X)
Read(X)
Write(X)
Write(X)
Read(Y)
Write(Y)
Read(Y)
Write(Y)

Read-Write sequence of a non-serializable schedule

Serializable Concurrency
T1 T2

Read(X)
Write(X)
Read(X)
Write(X)
Read(Y)
Write(Y)
Read(Y)
Write(Y)

A serializable concurrent schedule

Generalize the idea of conflict. Consider the four possibilities which can arise between two consecutive instructions T1 and T2 in a schedule (T1 and T2 belong to two different transactions):

1. T1 : Read(X) followed by T2 : Write(X)
2. T1 : Read(X) followed by T2 : Read(X)
3. T1 : Write(X) followed by T2 : Read(X)
4. T1 : Write(X) followed by T2 : Write(X)

T1 and T2 are said to be in conflict if they cannot be swapped without fear of loss of consistency.

Of the above four cases, all pairs except case 2 are in conflict.

Deadlock Condition

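T1:
UPDATE account
SET balance = balance * 0.1
WHERE acc_no = 'FC821'

UPDATE account
SET age = 30
WHERE acc_no = 'FC523'

T2:
UPDATE account
SET balance = balance * 0.1
WHERE acc_no = 'FC523'

UPDATE account
SET age = 38
WHERE acc_no = 'FC821'

If each transaction keeps the lock from its first update while requesting the lock for its second, T1 waits for FC523 (held by T2) and T2 waits for FC821 (held by T1), so neither can proceed.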

Lock-Based Techniques
In this technique the system does not participate in the detection of inconsistency, nor does it take any corrective action.

The DBMS, however, provides the user with a set of operations which, when used properly, can ensure that concurrent execution will not violate consistency.

In this technique, functions are provided to lock and unlock data items by transactions.

In the simplest case a data item X can be locked by a transaction T1 in two modes:

Shared Mode: If T1 locks X in shared mode, then before T1 unlocks X no other transaction T2 can write into X. But a transaction T2 can read the value of X even if T1 has locked X in shared mode.

Exclusive Mode: If T1 locks X in exclusive mode, then before T1 unlocks X no other transaction T2 can read or write into X.
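
The two modes amount to a compatibility rule: shared locks can coexist, while an exclusive lock excludes everything else. A minimal sketch follows; the class `LockTable` and its methods are hypothetical, and a real lock manager would also queue waiting requests instead of just refusing them.

```python
# (mode already held, mode requested) -> can both be held at once?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


class LockTable:
    def __init__(self):
        self.held = {}  # data item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock only if it is compatible with locks held by other transactions."""
        holders = self.held.setdefault(item, [])
        others = [held_mode for owner, held_mode in holders if owner != txn]
        if all(COMPATIBLE[(held_mode, mode)] for held_mode in others):
            holders.append((txn, mode))
            return True
        return False

    def release(self, txn, item):
        """Drop every lock that txn holds on item."""
        self.held[item] = [(o, m) for o, m in self.held.get(item, []) if o != txn]


table = LockTable()
print(table.request("T1", "X", "S"))  # shared lock granted
print(table.request("T2", "X", "S"))  # a second reader is allowed
print(table.request("T3", "X", "X"))  # a writer must wait for both readers
```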

Example
T1 T2

Lock-X(P)
Read (P,p)
p=p-1
Write(P,p)
Unlock(P)
Lock-S(Q)
Read(Q,q)
unlock(Q)
Lock-S(P)
Read(P,p)
unlock(P)
display(p)
display(p)
Lock-X(Q)
Read(Q,q)
q=q+1
Write(Q,q)
Unlock(Q)

Two-Phase locking
Phase I Acquiring Phase : During this phase a transaction may lock a data item but not
unlock any data item.

Phase II Releasing Phase : During this phase a transaction may unlock data items locked
earlier but no new locks may be acquired.

In two-phase locking, phase I must always precede phase II. This ensures that all schedules are automatically conflict serializable.

Enforcing (Conflict) Serializability

Two-phase Locking (2PL) Protocol:


Each Xact must obtain a S (shared) lock on object before reading, and an X (exclusive) lock on object
before writing.
Once an Xact releases any lock, it cannot obtain new locks.

If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object.
2PL allows only conflict-serializable schedules.
Potential problem of deadlocks: we could have a cycle of Xacts, T1, T2, ... , Tn, with each Ti waiting for its
predecessor to release some lock that it needs.
Dealt with by killing one of them and releasing its locks.

Atomicity of Transactions
A transaction might commit after completing all its actions, or it could abort (or be aborted by the DBMS)
after executing some actions.
A very important property guaranteed by the DBMS for all transactions is that they are atomic. That is, a
user can think of a Xact as always executing all its actions in one step, or not executing any actions at all.
DBMS logs all actions so that it can undo the actions of aborted transactions.

This ensures that if each Xact preserves consistency, every serializable schedule preserves consistency.

Aborting a Transaction
If a transaction Ti is aborted, all its actions have to be undone. Not only that, if Tj reads an object last
written by Ti, Tj must be aborted as well!
Most systems avoid such cascading aborts by releasing a transaction's locks only at commit time.
If Ti writes an object, Tj can read this only after Ti commits.

In order to undo the actions of an aborted transaction, the DBMS maintains a log in which every write is
recorded. This mechanism is also used to recover from system crashes: all active Xacts at the time of the
crash are aborted when the system comes back up.

The Log

The following actions are recorded in the log:


Ti writes an object: the old value and the new value.

Log record must go to disk before the changed page!

Ti commits/aborts: a log record indicating this action.

Log records are chained together by Xact id, so it's easy to undo a specific Xact.
Log is often duplexed and archived on stable storage.
All log related activities (and in fact, all activities such as lock/unlock, dealing with deadlocks etc.) are
handled transparently by the DBMS.

The Log - 2
e.g. X = 1000, Y = 2000

T:
Read (X, xi)
xi ← xi − 500
Write (X, xi)
Read (Y, yi)
yi ← yi + 500
Write (Y, yi)

Log file:
<T starts>
<T, X, 1000, 500>
<T, Y, 2000, 2500>
<T commits>

Each update record has the form <transaction name, data item name, old value, new value>.

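Because each update record carries both the old and the new value, the log is enough to redo a committed transaction or undo an uncommitted one. The sketch below assumes log records are Python tuples and that the crash left X on disk but not Y. The function `recover` is a simplified illustration, not a full recovery algorithm.

```python
log = [
    ("start", "T"),
    ("write", "T", "X", 1000, 500),   # (kind, transaction, item, old value, new value)
    ("write", "T", "Y", 2000, 2500),
    ("commit", "T"),
]


def recover(db, log):
    """Redo writes of committed transactions, then undo writes of the others."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:                      # redo pass, forward through the log
        if rec[0] == "write" and rec[1] in committed:
            db[rec[2]] = rec[4]
    for rec in reversed(log):            # undo pass, backward through the log
        if rec[0] == "write" and rec[1] not in committed:
            db[rec[2]] = rec[3]
    return db


# Crash after the commit record was logged, but Y had not reached the disk.
after_commit = recover({"X": 500, "Y": 2000}, log)
# Same crash, but before the commit record was written.
before_commit = recover({"X": 500, "Y": 2000}, log[:-1])

print(after_commit, before_commit)
```

With the commit record present, T's updates are completed; without it, the database returns to the values it had before T started.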
Checkpoints
At the time of recovery the entire log needs to be searched to know which transactions need to be redone and which transactions need to be undone. The problems with this approach are:

1. It will take a considerable amount of time.
2. Most of the transactions that need to be redone have already written their updates to the database, so redoing them is unnecessary.

To solve this problem, checkpoints are used. A checkpoint indicates that the data before this point has already been written to the database. Before writing a checkpoint, the following sequence of actions should take place:

- Output all log records currently residing in main memory to stable storage.
- Output all modified buffer blocks to secondary storage.
- Output a log record <checkpoint>.

Recovering From a Crash
There are 3 phases in the ARIES recovery algorithm:
Analysis: Scan the log forward (from the most recent checkpoint) to identify all Xacts that were active,
and all dirty pages in the buffer pool at the time of the crash.
Redo: Redoes all updates to dirty pages in the buffer pool, as needed, to ensure that all logged
updates are in fact carried out and written to disk.
Undo: The writes of all Xacts that were active at the crash are undone (by restoring the before value
of the update, which is in the log record for the update), working backwards in the log. (Some care
must be taken to handle the case of a crash occurring during the recovery process!)

Data can be lost due to the failure of nonvolatile storage such as the disk. The scheme available to protect the data from disk failure is to periodically dump the entire contents of the database to backup (or even stable) storage such as magnetic tape. When a failure occurs, the most recent dump is used to restore the database to a previous consistent state. Then the log is used to redo all the transactions that have committed since the last dump occurred. The following steps are performed to take a dump:

Output all log records currently residing in the main memory onto stable store.
Output all buffer blocks onto the disk.
Copy the contents of the database to stable store.
Output a log record <dump>.

Summary

Concurrency control and recovery are among the most important functions provided by a DBMS.
Users need not worry about concurrency.
System automatically inserts lock/unlock requests and schedules actions of different Xacts in such a
way as to ensure that the resulting execution is equivalent to executing the Xacts one after the other in
some order.
Write-ahead logging (WAL) is used to undo the actions of aborted transactions and to restore the
system to a consistent state after a crash.
Consistent state: Only the effects of committed Xacts are seen.

Query Processing/Optimization

Rules

Optimization using algebraic manipulation

Any algebraic manipulation approach to query optimization uses a set of rules, which may be enumerated as follows.

Perform selection as early as possible, in order to reduce the number of tuples to be processed subsequently.
Projections of projections should be combined, if possible, in order to avoid repeated scanning of tuples.
Projection over indexed attributes should be done earlier, and that over non-indexed attributes should be done later.
Intermediate relations produced in separate processing sequences must be shared as and when possible.
If possible, attributes which are controlling a join operation should be sorted earlier.
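
The first rule can be illustrated with two small in-memory relations. The rows of `borrow` and `branch`, the threshold of 1200 and the helper `join` are all hypothetical. Both plans return the same answer, but selecting first feeds fewer tuples into the join.

```python
# Hypothetical relations: borrow (bname, loan_no, cname, amount), branch (bname, assets, bcity).
borrow = [
    ("Downtown", 17, "Jones", 1000),
    ("Redwood", 23, "Smith", 2000),
    ("Perryridge", 15, "Hayes", 1500),
    ("Downtown", 14, "Jackson", 1500),
]
branch = [
    ("Downtown", 9000000, "Brooklyn"),
    ("Redwood", 2100000, "Palo Alto"),
    ("Perryridge", 1700000, "Horseneck"),
]


def join(left, right):
    """Natural join on the first column (bname) with a nested loop."""
    return [l + r[1:] for l in left for r in right if l[0] == r[0]]


# Plan 1: join everything, then select amount > 1200.
full_join = join(borrow, branch)
select_late = [row for row in full_join if row[3] > 1200]

# Plan 2: select first, then join only the surviving tuples.
selected = [row for row in borrow if row[3] > 1200]
select_early = join(selected, branch)

print(len(full_join), len(selected), select_late == select_early)
```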

Example

Example contd.

Projection Operation

Natural Join Operation

Natural Join Operation - 2