
ICS 3212: INTERNET AND DISTRIBUTED DATABASES

CONCURRENCY CONTROL

Concurrency Control and Recovery


Distributed databases encounter a number of concurrency control and recovery problems that are not present in centralized databases. Some of them are listed below.

Dealing with multiple copies of data items
Failure of individual sites
Communication link failure
Distributed commit
Distributed deadlock


Concurrency Control and Recovery


Details

Dealing with multiple copies of data items:
The concurrency control mechanism must maintain global consistency across all copies. Likewise, the recovery mechanism must recover all copies and maintain consistency after recovery.

Failure of individual sites:
Database availability must not be affected by the failure of one or two sites, and the recovery scheme must bring failed sites up to date before they are made available for use.


Concurrency Control and Recovery


Details (contd.)

Communication link failure:
This failure may create a network partition, which can affect database availability even though all database sites may be running.

Distributed commit:
A transaction may be fragmented into subtransactions executed at a number of sites. This requires a two-phase or three-phase commit approach for transaction commit.

Distributed deadlock:
Since transactions are processed at multiple sites, two or more sites may get involved in a deadlock. This must be resolved in a distributed manner.

What is Concurrency Control

The process of managing the simultaneous execution of transactions in a shared database, so as to ensure the serializability of transactions.

Why we need Concurrency Control

Simultaneous execution of transactions over a shared database can create several data integrity and consistency problems:
Lost updates
Uncommitted data
Inconsistent retrievals

When we need Concurrency Control

Concurrent access to data is desirable when:
1. The amount of data is sufficiently great that at any given time only a fraction of it can be held in primary memory, and the rest must be swapped in from secondary memory as needed.
2. Even if the entire database can be held in primary memory, there may be multiple processes accessing it concurrently.

Concurrency Control Techniques

Optimistic concurrency control
Pessimistic concurrency control
Locking

Pessimistic Concurrency Control


Pessimistic concurrency control assumes that conflicts will happen.
Pessimistic concurrency control techniques detect conflicts as soon as they occur and resolve them using blocking.

Optimistic Concurrency Control

Optimistic concurrency control assumes that conflicts between transactions are rare.
It does not require locking.
Transactions execute without restrictions.
Conflicts are checked for just before commit.

Locking
Locking is pessimistic because it assumes that conflicts will
happen.
The concept of locking data items is one of the main techniques
used for controlling the concurrent execution of transactions.
A lock is a variable associated with a data item in the database.
Generally there is a lock for each data item in the database.
A lock describes the status of the data item with respect to
possible operations that can be applied to that item. It is used for
synchronising the access by concurrent transactions to the database
items.
A transaction locks an object before using it.
When an object is already locked by another transaction, the requesting transaction must wait.

Lock-Based Protocols

A lock is a mechanism to control concurrent access to a data item.
Data items can be locked in two modes:
1. Exclusive (X) mode: the data item can be both read and written. An X-lock is requested using the lock-X instruction.
2. Shared (S) mode: the data item can only be read. An S-lock is requested using the lock-S instruction.
Lock requests are made to the concurrency-control manager. A transaction can proceed only after its request is granted.

Lock-Based Protocols (Cont.)


Lock-compatibility matrix (S is compatible with S; X is compatible with nothing):

        S      X
  S     true   false
  X     false  false

A transaction may be granted a lock on an item if the requested lock is compatible with the locks already held on the item by other transactions.
Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on the item.
If a lock cannot be granted, the requesting transaction is made to wait until all incompatible locks held by other transactions have been released. The lock is then granted.
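As a concrete illustration (not part of the original slides), here is a minimal Python sketch of the lock-compatibility test; the mode names 'S' and 'X' and the function name can_grant are assumptions made for this example.

# Illustrative sketch: S/X lock-compatibility test.
# 'S' is compatible only with 'S'; 'X' is compatible with nothing.
COMPATIBLE = {
    ('S', 'S'): True,
    ('S', 'X'): False,
    ('X', 'S'): False,
    ('X', 'X'): False,
}

def can_grant(requested_mode, held_modes):
    # Grant the request only if it is compatible with every lock already held
    # by other transactions on the same item.
    return all(COMPATIBLE[(requested_mode, held)] for held in held_modes)

# Several shared locks can coexist, but an exclusive request must wait.
assert can_grant('S', ['S', 'S'])
assert not can_grant('X', ['S'])
assert can_grant('X', [])          # no locks held: any request can be granted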

Lock-Based Protocols (Cont.)


Example of a transaction performing locking:

T2: lock-S(A);

read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
Locking as above is not sufficient to guarantee serializability: if A and B are updated between the read of A and the read of B, the displayed sum will be wrong.
A locking protocol is a set of rules followed by all transactions while requesting and releasing locks.
Locking protocols restrict the set of possible schedules.

Pitfalls of Lock-Based Protocols


Lock management overhead.
Deadlock detection and resolution.
Concurrency is significantly lowered when frequently accessed (congested) nodes are locked.
To allow a transaction to abort itself when mistakes occur, locks can't be released until the end of the transaction, so concurrency is significantly lowered.
(Most important) Conflicts are rare, so we might get better performance by not locking and instead checking for conflicts at commit time.

Pitfalls of Lock-Based Protocols (Cont.)


The potential for deadlock exists in most locking protocols. Deadlocks are a necessary evil.
Starvation is also possible if the concurrency-control manager is badly designed. For example:
A transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item.
The same transaction is repeatedly rolled back due to deadlocks.
The concurrency-control manager can be designed to prevent starvation.

Pitfalls of Lock-Based Protocols


Consider a partial schedule in which T3 holds an exclusive lock on B and T4 holds a lock on A.
Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A.
Such a situation is called a deadlock.
To handle a deadlock, one of T3 or T4 must be rolled back and its locks released.
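A hedged Python sketch (not from the slides) of how such a deadlock can be detected by looking for a cycle in a waits-for graph; the dictionary representation and the function name has_cycle are assumptions of this example.

def has_cycle(waits_for):
    # waits_for: edge T -> U means "T waits for a lock held by U".
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in waits_for.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(t) for t in list(waits_for) if t not in visited)

# T4 waits for T3 (lock on B) and T3 waits for T4 (lock on A): a deadlock.
assert has_cycle({'T3': ['T4'], 'T4': ['T3']})
# One of the two transactions must be chosen as a victim and rolled back.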

Two-Phase Locking (2PL)


Two-Phase Locking Protocol

Each transaction must obtain an S (shared) lock on an object before reading it, and an X (exclusive) lock on an object before writing it.
A transaction cannot request additional locks once it has released any lock.
If a transaction holds an X lock on an object, no other transaction can get a lock (S or X) on that object.

The Two-Phase Locking Protocol


The protocol ensures conflict-serializable schedules.
Phase 1: Growing phase
  the transaction may obtain locks
  the transaction may not release locks
Phase 2: Shrinking phase
  the transaction may release locks
  the transaction may not obtain locks
The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e. the point where a transaction acquired its final lock).
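The growing/shrinking discipline can be enforced mechanically. Below is a minimal sketch, assuming a simple in-memory model in which each transaction tracks whether it has passed its lock point; the class and exception names are illustrative, not a real library API.

# Illustrative per-transaction 2PL guard: once any lock has been released,
# no further lock may be acquired (the growing phase ends at the lock point).
class TwoPhaseViolation(Exception):
    pass

class Transaction2PL:
    def __init__(self, tid):
        self.tid = tid
        self.locks = set()       # (item, mode) pairs currently held
        self.shrinking = False   # becomes True after the first release

    def lock(self, item, mode):
        if self.shrinking:
            raise TwoPhaseViolation(
                f"{self.tid}: cannot acquire {mode} lock on {item} in the shrinking phase")
        self.locks.add((item, mode))

    def unlock(self, item, mode):
        self.shrinking = True    # the lock point has been passed
        self.locks.discard((item, mode))

# Usage: lock(A), lock(B), unlock(A) is legal; a later lock(C) raises TwoPhaseViolation.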

The Two-Phase Locking Protocol (Cont.)


Two-phase locking does not ensure freedom from deadlocks.
Cascading rollback is possible under two-phase locking. To avoid this, follow a modified protocol called strict two-phase locking: a transaction must hold all its exclusive locks until it commits or aborts.
Rigorous two-phase locking is even stricter: all locks are held until commit or abort. Under this protocol transactions can be serialized in the order in which they commit.

The Two-Phase Locking Protocol (Cont.)


There can be conflict-serializable schedules that cannot be obtained if two-phase locking is used.
However, in the absence of extra information (e.g., ordering of access to data), two-phase locking is needed for conflict serializability in the following sense:
Given a transaction Ti that does not follow two-phase locking, we can find a transaction Tj that uses two-phase locking, and a schedule for Ti and Tj that is not conflict serializable.

Lock Conversions
Two-phase locking with lock conversions:
First phase:
  can acquire a lock-S on an item
  can acquire a lock-X on an item
  can convert a lock-S to a lock-X (upgrade)
Second phase:
  can release a lock-S
  can release a lock-X
  can convert a lock-X to a lock-S (downgrade)
This protocol assures serializability, but it still relies on the programmer to insert the various locking instructions.
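A small hedged sketch of the conversion rules in Python; tracking a single item's lock mode is enough to show that upgrades belong to the growing phase and downgrades to the shrinking phase (the class name LockHolder is an assumption of this example).

# Illustrative conversion rules for one data item under 2PL with conversions.
class LockHolder:
    def __init__(self):
        self.mode = None         # None, 'S' or 'X'
        self.shrinking = False   # True once the transaction has released or downgraded

    def upgrade(self):
        # S -> X counts as acquiring a lock, so it is legal only in the growing phase.
        if self.shrinking or self.mode != 'S':
            raise RuntimeError("upgrade allowed only on an S lock during the growing phase")
        self.mode = 'X'

    def downgrade(self):
        # X -> S gives up exclusivity, so it marks entry into the shrinking phase.
        if self.mode != 'X':
            raise RuntimeError("downgrade allowed only on an X lock")
        self.shrinking = True
        self.mode = 'S'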

Automatic Acquisition of Locks


A transaction Ti issues the standard read/write instruction, without explicit locking calls.
The operation read(D) is processed as:

  if Ti has a lock on D
    then
      read(D)
    else begin
      if necessary, wait until no other
        transaction has a lock-X on D;
      grant Ti a lock-S on D;
      read(D)
    end
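A hedged Python sketch of automatic lock acquisition for both read(D) and write(D). The single-threaded model below (a locks dictionary per data item, with "WAIT" return values standing in for blocking) is an assumption of this example, and the write(D) handling shown — obtain a lock-X, upgrading an existing lock-S if one is held — is the usual counterpart of the read(D) rule rather than something spelled out on the slide.

# Illustrative single-threaded model: locks[D] maps a data item to {txn: mode}.
# A returned "WAIT" means the requesting transaction would block here.
def request_read(txn, D, locks):
    held = locks.setdefault(D, {})
    if txn in held:                                     # Ti already has a lock on D
        return "READ"
    if any(m == 'X' for t, m in held.items() if t != txn):
        return "WAIT"                                   # another txn holds lock-X on D
    held[txn] = 'S'                                     # grant Ti a lock-S on D
    return "READ"

def request_write(txn, D, locks):
    held = locks.setdefault(D, {})
    if held.get(txn) == 'X':                            # Ti already holds lock-X on D
        return "WRITE"
    if any(t != txn for t in held):
        return "WAIT"                                   # some other txn holds a lock on D
    held[txn] = 'X'                                     # grant (or upgrade to) lock-X on D
    return "WRITE"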

Lock Table
In the usual lock-table diagram, black rectangles indicate granted locks and white ones indicate waiting requests.
The lock table also records the type of lock granted or requested.
A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks.
Unlock requests result in the request being deleted, and later requests are checked to see whether they can now be granted.
If a transaction aborts, all waiting or granted requests of the transaction are deleted; the lock manager may keep a list of locks held by each transaction to implement this efficiently.
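A minimal sketch of such a lock table in Python, assuming one FIFO queue of requests per data item and S/X modes only; the class and method names are illustrative.

# Illustrative lock table: table[item] is a FIFO list of requests [txn, mode, granted].
class LockTable:
    def __init__(self):
        self.table = {}

    def request(self, item, txn, mode):
        queue = self.table.setdefault(item, [])
        # A new request goes to the end of the queue and is granted only if it
        # is compatible with all earlier requests (S-S is the only compatible pair).
        granted = all(mode == 'S' and m == 'S' for _, m, _ in queue)
        queue.append([txn, mode, granted])
        return granted

    def release(self, item, txn):
        queue = [r for r in self.table.get(item, []) if r[0] != txn]
        # Later requests are re-checked to see whether they can now be granted.
        for i, req in enumerate(queue):
            if not req[2]:
                req[2] = all(req[1] == 'S' and r[1] == 'S' for r in queue[:i])
        self.table[item] = queue

    def abort(self, txn):
        # All waiting or granted requests of the aborted transaction are deleted.
        for item in list(self.table):
            self.release(item, txn)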

Graph-Based Protocols
Graph-based protocols are an alternative to two-phase locking.
Impose a partial ordering on the set D = {d1, d2, ..., dh} of all data items.
If di → dj, then any transaction accessing both di and dj must access di before accessing dj.
This implies that the set D may now be viewed as a directed acyclic graph, called a database graph.
The tree protocol is a simple kind of graph protocol.

Tree Protocol

Only exclusive locks are allowed.
The first lock by Ti may be on any data item. Subsequently, a data item Q can be locked by Ti only if the parent of Q is currently locked by Ti.
Data items may be unlocked at any time.
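A brief Python sketch of the tree-protocol locking rule, under the assumption that the database graph is given as a parent_of mapping; the function name may_lock is illustrative.

# Illustrative tree-protocol check: after its first lock, a transaction may lock Q
# only if it currently holds a lock on Q's parent in the database tree.
def may_lock(held, item, parent_of):
    if not held:                               # the first lock may be on any data item
        return True
    return parent_of.get(item) in held

# Example tree: A is the root with children B and C; D is a child of B.
parent_of = {'B': 'A', 'C': 'A', 'D': 'B'}
held = set()
assert may_lock(held, 'B', parent_of)          # first lock: allowed anywhere
held.add('B')
assert may_lock(held, 'D', parent_of)          # parent B is locked: allowed
assert not may_lock(held, 'C', parent_of)      # parent A is not locked: refused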

What is Timestamping?
The scheduler assigns each transaction T a unique number, its timestamp TS(T).
Timestamps must be issued in ascending order, at the time when a transaction first notifies the scheduler that it is beginning.

Timestamp TS(T)
Two methods of generating timestamps:
Use the value of the system clock as the timestamp.
Use a logical counter that is incremented after a new timestamp has been assigned.
Irrespective of the method used, the scheduler maintains a table of currently active transactions and their timestamps.
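As a small illustration (assumed, not from the slides), the logical-counter method together with the table of active transactions can be sketched in Python as follows.

# Illustrative scheduler fragment: a logical counter issues ascending timestamps
# and a table records the currently active transactions and their TS(T).
import itertools

class Scheduler:
    def __init__(self):
        self._counter = itertools.count(1)
        self.active = {}                 # transaction id -> timestamp TS(T)

    def begin(self, txn_id):
        self.active[txn_id] = next(self._counter)
        return self.active[txn_id]

    def finish(self, txn_id):
        self.active.pop(txn_id, None)    # remove from the active-transaction table

sched = Scheduler()
assert sched.begin('T1') < sched.begin('T2')   # later transactions get larger timestamps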

Timestamps for Database Element X and the Commit Bit

RT(X): the read time of X, which is the highest timestamp of any transaction that has read X.
WT(X): the write time of X, which is the highest timestamp of any transaction that has written X.
C(X): the commit bit for X, which is true if and only if the most recent transaction to write X has already committed.

Rules for Timestamp-Based Scheduling

1. The scheduler receives a read request rT(X):
a) If TS(T) ≥ WT(X), the read is physically realizable.
   1. If C(X) is true, grant the request; if TS(T) > RT(X), set RT(X) := TS(T), otherwise do not change RT(X).
   2. If C(X) is false, delay T until C(X) becomes true or the transaction that wrote X aborts.
b) If TS(T) < WT(X), the read is physically unrealizable. Roll back T.

Rules for Timestamp-Based Scheduling (Cont.)

2. The scheduler receives a write request wT(X):
a) If TS(T) ≥ RT(X) and TS(T) ≥ WT(X), the write is physically realizable and must be performed.
   1. Write the new value for X,
   2. set WT(X) := TS(T), and
   3. set C(X) := false.
b) If TS(T) ≥ RT(X) but TS(T) < WT(X), the write is physically realizable, but there is already a later value in X.
   a. If C(X) is true, then the previous writer of X has committed, and the write by T is simply ignored.
   b. If C(X) is false, we must delay T.
c) If TS(T) < RT(X), then the write is physically unrealizable, and T must be rolled back.

Rules for Timestamp-Based Scheduling (Cont.)

3. The scheduler receives a request to commit T. It must find all the database elements X written by T and set C(X) := true. If any transactions are waiting for X to be committed, these transactions are allowed to proceed.
4. The scheduler receives a request to abort T or decides to roll back T. Any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write.
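A hedged Python sketch pulling these rules together; each element carries RT(X), WT(X) and C(X), and the return values 'GRANT', 'DELAY', 'IGNORE' and 'ROLLBACK' stand in for the scheduler's actions (these names, and the Element class, are assumptions of this example).

# Illustrative timestamp-ordering rules for one database element X.
class Element:
    def __init__(self):
        self.RT, self.WT, self.C = 0, 0, True   # read time, write time, commit bit

def on_read(ts, x):
    if ts >= x.WT:                    # rule 1a: the read is physically realizable
        if not x.C:
            return 'DELAY'            # wait until the writer of X commits or aborts
        x.RT = max(x.RT, ts)
        return 'GRANT'
    return 'ROLLBACK'                 # rule 1b: TS(T) < WT(X), value already overwritten

def on_write(ts, x):
    if ts >= x.RT and ts >= x.WT:     # rule 2a: perform the write
        x.WT, x.C = ts, False
        return 'GRANT'
    if ts >= x.RT:                    # rule 2b: TS(T) < WT(X), a later value already exists
        return 'IGNORE' if x.C else 'DELAY'
    return 'ROLLBACK'                 # rule 2c: TS(T) < RT(X), physically unrealizable

def on_commit(txn_writes):
    for x in txn_writes:              # rule 3: set the commit bit of every element T wrote
        x.C = True                    # (delayed transactions may now proceed)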

Timestamps and Locking


Generally, timestamping performs better than locking in situations where:
Most transactions are read-only.
It is rare that concurrent transactions will try to read and write the same element.
In high-conflict situations, locking performs better than timestamping.

Timestamp-Based Protocols
Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj).
The protocol manages concurrent execution such that the timestamps determine the serializability order.
In order to assure such behavior, the protocol maintains for each data item Q two timestamp values:
W-timestamp(Q) is the largest timestamp of any transaction that executed write(Q) successfully.
R-timestamp(Q) is the largest timestamp of any transaction that executed read(Q) successfully.

Timestamp-Based Protocols (Cont.)


The timestamp-ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
Suppose a transaction Ti issues a read(Q):
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).

Timestamp-Based Protocols (Cont.)


Suppose that transaction Ti issues write(Q):
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the system assumed that that value would never be produced. Hence, the write operation is rejected, and Ti is rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and Ti is rolled back.
3. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).

Correctness of Timestamp-Ordering Protocol


The timestamp-ordering protocol guarantees serializability, since all the arcs in the precedence graph are of the form:

  transaction with smaller timestamp → transaction with larger timestamp

Thus, there will be no cycles in the precedence graph.
The timestamp protocol ensures freedom from deadlock, as no transaction ever waits.
But the schedule may not be cascade-free, and may not even be recoverable.

Validation Test for Transaction Tj


For a transaction that has updates, validation consists of determining whether the current transaction leaves the database in a consistent state, with serializability maintained. If not, the transaction is backed up and restarted.
If for all Ti with TS(Ti) < TS(Tj) one of the following conditions holds:
  finish(Ti) < start(Tj)
  start(Tj) < finish(Ti) < validation(Tj) and the set of data items written by Ti does not intersect with the set of data items read by Tj
then validation succeeds and Tj can be committed. Otherwise, validation fails and Tj is aborted.
Justification: either the first condition is satisfied and there is no overlapped execution, or the second condition is satisfied and
1. the writes of Tj do not affect reads of Ti, since they occur after Ti has finished its reads;
2. the writes of Ti do not affect reads of Tj, since Tj does not read any item written by Ti.
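A compact Python sketch of this validation test; the Txn record with start, validation and finish times plus read and write sets is an assumed representation for the purpose of this example.

# Illustrative validation of Tj against every earlier transaction Ti with TS(Ti) < TS(Tj).
from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int
    validation: int
    finish: int
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validate(tj, earlier):
    for ti in earlier:
        if ti.finish < tj.start:
            continue                                # no overlapped execution at all
        if ti.finish < tj.validation and not (ti.write_set & tj.read_set):
            continue                                # Tj read nothing that Ti wrote
        return False                                # validation fails: abort and restart Tj
    return True                                     # validation succeeds: Tj can commit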

Schedule Produced by Validation


Example of a schedule produced using validation:

  T14              T15
  read(B)
                   read(B)
                   B := B - 50
                   read(A)
                   A := A + 50
  read(A)
  (validate)
  display(A+B)
                   (validate)
                   write(B)
                   write(A)

Multiple Granularity
Allow data items to be of various sizes and define a hierarchy of data granularities, where the smaller granularities are nested within larger ones.
This can be represented graphically as a tree (but don't confuse it with the tree-locking protocol).
When a transaction locks a node in the tree explicitly, it implicitly locks all the node's descendants in the same mode.
Granularity of locking (the level in the tree where locking is done):
  fine granularity (lower in the tree): high concurrency, high locking overhead
  coarse granularity (higher in the tree): low locking overhead, low concurrency

Example of Granularity Hierarchy

The highest level in the example hierarchy is the entire database.
The levels below it are of type area, file and record, in that order.

Intention Lock Modes


In addition to the S and X lock modes, there are three additional lock modes with multiple granularity:
intention-shared (IS): indicates explicit locking at a lower level of the tree, but only with shared locks.
intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or shared locks.
shared and intention-exclusive (SIX): the subtree rooted at that node is locked explicitly in shared mode, and explicit locking is being done at a lower level with exclusive-mode locks.
Intention locks allow a higher-level node to be locked in S or X mode without having to check all descendant nodes.
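The compatibility relation between these modes can be written down as a matrix. A brief sketch in Python follows; the table itself is the standard multiple-granularity compatibility matrix, while the function name compatible is an assumption of this example.

# Multiple-granularity lock-compatibility matrix (True = the two modes are compatible).
COMPAT = {
    'IS':  {'IS': True,  'IX': True,  'S': True,  'SIX': True,  'X': False},
    'IX':  {'IS': True,  'IX': True,  'S': False, 'SIX': False, 'X': False},
    'S':   {'IS': True,  'IX': False, 'S': True,  'SIX': False, 'X': False},
    'SIX': {'IS': True,  'IX': False, 'S': False, 'SIX': False, 'X': False},
    'X':   {'IS': False, 'IX': False, 'S': False, 'SIX': False, 'X': False},
}

def compatible(requested, held):
    return COMPAT[requested][held]

# Two transactions may both hold IX on the database node and then take X locks
# on different records below it; an S lock on a node is blocked by IX on that node.
assert compatible('IX', 'IX')
assert not compatible('S', 'IX')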

Thank You
