
Concurrency Control

In a multiprogramming environment where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. Concurrent access is quite easy if all users are just reading data, since there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.

Problems of concurrency control


Several problems can occur when concurrent transactions are executed in an uncontrolled manner.
Following are some issues you are likely to face if concurrency is not controlled:
1. Lost updates
2. Dirty read
3. Inconsistent retrievals (unrepeatable read)

1. Lost update problem


o When two transactions that access the same database items interleave their operations in a way that makes the value of some database item incorrect, the lost update problem occurs.
o If two transactions T1 and T2 read a record and then update it, the effect of the first update will be overwritten by the second update.

Example:

Time  Transaction-X  Transaction-Y
t2    READ(A)        -
t3    -              READ(A)
t4    WRITE(A)       -
t5    -              WRITE(A)

Here,
o At time t2, Transaction-X reads A's value.
o At time t3, Transaction-Y reads A's value.
o At time t4, Transaction-X writes A's value on the basis of the value seen at time t2.
o At time t5, Transaction-Y writes A's value on the basis of the value seen at time t3.
o So at time t5, the update of Transaction-X is lost, because Transaction-Y overwrites it without looking at its current value.
o Such a problem is known as the Lost Update Problem, as the update made by one transaction is lost here.
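
To make the interleaving concrete, here is a minimal Python sketch of the same schedule (the initial value 100 and the +10/-20 updates are illustrative assumptions, not part of the example above):

# Minimal sketch of the lost update problem: both "transactions" read A,
# compute a new value locally, and write back; the later write clobbers
# the earlier one.

db = {"A": 100}            # assumed initial value of A

x_local = db["A"]          # t2: Transaction-X reads A (sees 100)
y_local = db["A"]          # t3: Transaction-Y reads A (also sees 100)
db["A"] = x_local + 10     # t4: Transaction-X writes A based on t2 (110)
db["A"] = y_local - 20     # t5: Transaction-Y writes A based on t3 (80)

print(db["A"])             # 80: the +10 from Transaction-X is lost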

2. Dirty Read
o A dirty read occurs when one transaction updates an item of the database and then fails for some reason, while the updated item is accessed by another transaction before it is changed back to its original value.
o A transaction T1 updates a record which is read by T2. If T1 aborts, then T2 now holds values which have never formed part of the stable database.



Example:

Time  Transaction-X  Transaction-Y
t2    -              WRITE(A)
t3    READ(A)        -
t4    -              ROLLBACK

Here,
o At time t2, Transaction-Y writes A's value.
o At time t3, Transaction-X reads A's value.
o At time t4, Transaction-Y rolls back, so A's value is changed back to what it was before the write at t2.
o Transaction-X is now holding a value which has never become part of the stable database.
o Such a problem is known as the Dirty Read Problem, as one transaction reads a dirty value which has not been committed.
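
The same schedule can be sketched in Python (the values and the before-image bookkeeping are illustrative assumptions): Transaction-X reads an uncommitted value that Transaction-Y's rollback then erases:

# Minimal sketch of the dirty read problem.

db = {"A": 100}            # assumed committed value
undo = {}                  # before-images kept for rollback

undo["A"] = db["A"]        # t2: Transaction-Y writes A (uncommitted)
db["A"] = 999

x_sees = db["A"]           # t3: Transaction-X reads the dirty value 999

db["A"] = undo.pop("A")    # t4: Transaction-Y rolls back, restoring 100

print(x_sees, db["A"])     # 999 100: X read a value that never committed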

3. Inconsistent Retrievals Problem


o When a transaction calculates some summary function over a set of data items while other transactions are updating those items, the Inconsistent Retrievals Problem occurs.
o A transaction T1 reads a record and then does some other processing, during which a transaction T2 updates the record. When T1 reads the record again, the new value will be inconsistent with the previous value.

Example:
Suppose two transactions operate on three accounts.

Here,

o Transaction-X is computing the sum of all balances while Transaction-Y is transferring an amount of 50 from Account-1 to Account-3.
o Transaction-X produces the result 550, which is incorrect. If we wrote this result to the database, the database would be left in an inconsistent state, because the actual sum is 600.
o Here, Transaction-X has seen an inconsistent state of the database.
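
A minimal Python sketch of this anomaly, assuming balances of 100, 200 and 300 (so the true sum is 600): the summing transaction runs between Transaction-Y's debit and credit steps and obtains 550:

# Minimal sketch of the inconsistent retrievals problem.

accounts = {1: 100, 2: 200, 3: 300}   # assumed balances, sum = 600

# Transaction-Y transfers 50 from Account-1 to Account-3 in two steps.
accounts[1] -= 50                      # debit applied first

# Transaction-X sums all balances between the two steps.
total = accounts[1] + accounts[2] + accounts[3]   # 50 + 200 + 300 = 550

accounts[3] += 50                      # credit applied after the sum

print(total, sum(accounts.values()))   # 550 600: the retrieval is inconsistent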



Concurrency Control Protocols
We have concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into three categories:
1. Lock based protocols
2. Timestamp based protocols
3. Validation based protocols

Lock-based Protocols
A lock is a mechanism to control concurrent access to a data item. Lock requests are made to the concurrency-control manager, and a transaction can proceed only after its request is granted. Data items can be locked in two modes:

1. Shared lock:
o It is also known as a Read-only lock. Under a shared lock, the data item can only be read by the transaction.
o It can be shared between transactions, because while a transaction holds a shared lock it cannot update the data item.
o Shared-lock is requested using lock-S instruction.

For example, consider a case where two transactions are reading the account balance of a person. The database lets them both read by placing a shared lock on the item. However, if another transaction wants to update that account's balance, the shared lock prevents it until the reading process is over.

2. Exclusive lock:
o Under an exclusive lock, the data item can be both read and written by the transaction.
o The lock is exclusive: while it is held, no other transaction can read or modify the same data item.
o A transaction may unlock the data item after finishing its 'write' operation so that another transaction can acquire a lock on that data item for its own operations.
o An exclusive lock is requested using the lock-X instruction.

For example, suppose a transaction needs to update the account balance of a person. The database allows this transaction to proceed by placing an X lock on the item. Therefore, when a second transaction wants to read or write, the exclusive lock prevents that operation.

Lock-compatibility matrix

        S      X
S       yes    no
X       no     no

o A transaction may be granted a lock on an item if the requested lock is compatible with the locks already held on the item by other transactions.
o Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on the item.
o If a lock cannot be granted, the requesting transaction is made to wait until all incompatible locks held by other transactions have been released. The lock is then granted.
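
As a minimal sketch, the matrix can be implemented as a lookup that is consulted on every lock request (the table and function names here are illustrative, not a real lock manager API):

# Shared/exclusive lock-compatibility check.

COMPATIBLE = {
    ("S", "S"): True,    # any number of shared locks may coexist
    ("S", "X"): False,   # an exclusive request conflicts with shared holders
    ("X", "S"): False,   # a shared request conflicts with an exclusive holder
    ("X", "X"): False,   # two exclusive locks can never coexist
}

def can_grant(requested, held_modes):
    # Grant only if the requested mode is compatible with every held lock.
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("S", ["S", "S"]))   # True: readers share
print(can_grant("X", ["S"]))        # False: the writer must wait for readers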



Example of a transaction performing locking:
Ti: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
A locking protocol is a set of rules followed by all transactions while requesting and releasing locks. Locking protocols restrict the set of possible schedules. The locking protocol must ensure serializability.

There are four types of lock protocols available:


1. Simplistic lock protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a 'write'
operation is performed. Transactions may unlock the data item after completing the ‘write’ operation.

2. Pre-claiming Lock Protocol


Pre-claiming protocols evaluate a transaction's operations and create a list of the data items on which it needs locks. Before initiating execution, the transaction requests the system for all the locks it will need. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If any of the locks is not granted, the transaction rolls back and waits until all the locks are granted.

Starvation
Starvation is the situation in which a transaction has to wait for an indefinite period to acquire a lock.
Following are the reasons for starvation:
 When the waiting scheme for locked items is not properly managed
 In the case of a resource leak
 When the same transaction is repeatedly selected as a victim
Deadlock
Deadlock refers to the situation where two or more processes are each waiting for another to release a resource, or where several processes are waiting for resources in a circular chain.



3. Two-phase locking (2PL)
The Two-Phase Locking protocol is also known as the 2PL protocol. In this locking protocol, a transaction may not acquire any new lock after it has released one of its locks. The protocol divides the execution of a transaction into three parts:
 In the first part, when the transaction begins to execute, it acquires the locks it needs.
 In the second part, the transaction holds all the locks it has obtained. As soon as the transaction releases its first lock, the third part starts.
 In the third part, the transaction cannot demand any new locks. It only releases the acquired locks.

There are two phases of 2PL:

Growing phase:
o In the growing phase, new locks on data items may be acquired by the transaction, but none can be released.
Shrinking phase:
o In the shrinking phase, existing locks held by the transaction may be released, but no new locks can be acquired.

The 2PL protocol assures serializability, but does not ensure freedom from deadlocks. Cascading rollback is possible under two-phase locking.
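
A minimal sketch of the two-phase rule for a single transaction (the class and method names are illustrative assumptions): once the transaction releases any lock, any further acquisition is a protocol violation:

# Minimal sketch of the 2PL rule: no lock may be acquired after an unlock.

class TwoPhaseTransaction:
    def __init__(self):
        self.held = set()
        self.shrinking = False      # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:          # acquiring after a release violates 2PL
            raise RuntimeError("2PL violation: lock after unlock")
        self.held.add(item)

    def unlock(self, item):
        self.held.discard(item)
        self.shrinking = True       # the growing phase is over for good

t = TwoPhaseTransaction()
t.lock("A"); t.lock("B")            # growing phase
t.unlock("A")                       # shrinking phase begins
try:
    t.lock("C")
except RuntimeError as e:
    print(e)                        # 2PL violation: lock after unlock

Under Strict-2PL, described next, the unlock step would simply be disallowed until the commit point, where all held locks are released together.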

4. Strict Two-phase locking (Strict-2PL)


The first phase of Strict-2PL is the same as in 2PL. After acquiring all its locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock immediately after using it: it holds all the locks until the commit point and releases them all at once. The Strict-2PL protocol therefore has no gradual shrinking phase of lock release.

In Strict-2PL, cascading rollback is not possible.



Time-Stamp Protocols
Concurrency control can be implemented in different ways. One way to implement it is by using locks. Now, let's discuss the Timestamp Ordering Protocol.
o The timestamp-based algorithm uses a timestamp to serialize the execution of concurrent transactions.
o This protocol ensures that every pair of conflicting read and write operations is executed in timestamp order. The protocol uses the system time or a logical counter as the timestamp.
o The older transaction is always given priority in this method. The system time is used to determine the timestamp of a transaction. This is one of the most commonly used concurrency protocols.
o Lock-based protocols manage the order between conflicting transactions at the time they execute. Timestamp-based protocols resolve conflicts as soon as an operation is issued.

Let's assume there are two transactions, T1 and T2. Suppose transaction T1 entered the system at time 007 and transaction T2 entered at time 009. T1 has the higher priority, so it executes first, as it entered the system first.
The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write' operation on each data item.

The Basic Timestamp Ordering protocol works as follows:

1. Whenever a transaction Ti issues a Read(X) operation:

o If W_TS(X) > TS(Ti), then the operation is rejected and Ti is rolled back.
o If W_TS(X) <= TS(Ti), then the operation is executed, and R_TS(X) is updated to the larger of R_TS(X) and TS(Ti).

2. Whenever a transaction Ti issues a Write(X) operation:

o If TS(Ti) < R_TS(X), then the operation is rejected and Ti is rolled back.
o If TS(Ti) < W_TS(X), then the operation is rejected and Ti is rolled back. Otherwise, the operation is executed and W_TS(X) is set to TS(Ti).

Where,

TS(Ti) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the read timestamp of data item X.

W_TS(X) denotes the write timestamp of data item X.
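
These rules can be sketched in Python as follows (the Item and Abort names and the timestamp bookkeeping are illustrative assumptions):

# Minimal sketch of basic timestamp ordering for a single data item X.

class Abort(Exception):
    """Raised when the issuing transaction must be rolled back."""

class Item:
    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0    # R_TS(X): largest timestamp that has read X
        self.w_ts = 0    # W_TS(X): timestamp of the last write to X

def read(item, ts):
    if item.w_ts > ts:               # a younger transaction already wrote X
        raise Abort("Read rejected: W_TS(X) > TS(Ti)")
    item.r_ts = max(item.r_ts, ts)   # update the read timestamp
    return item.value

def write(item, ts, value):
    if ts < item.r_ts or ts < item.w_ts:
        raise Abort("Write rejected: TS(Ti) < R_TS(X) or TS(Ti) < W_TS(X)")
    item.value, item.w_ts = value, ts

x = Item(0)
write(x, ts=5, value=10)   # executed, W_TS(X) = 5
read(x, ts=7)              # executed, R_TS(X) = 7
# write(x, ts=6, value=9)  would raise Abort, since TS(Ti)=6 < R_TS(X)=7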

Advantages:
 Schedules are serializable, just as with 2PL protocols
 Transactions never wait for locks, which eliminates the possibility of deadlocks!

Disadvantages:
 Starvation is possible if the same transaction is repeatedly restarted and aborted



Thomas Write Rule
The Thomas write rule is a modification of the Basic Timestamp Ordering algorithm: it ignores obsolete writes instead of rejecting them. It guarantees view serializability rather than conflict serializability.

When a transaction T issues a Write(X) operation, the rules are as follows:

o If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and the operation is rejected.
o If TS(T) < W_TS(X), then the write is obsolete: the write_item(X) operation is not executed, but processing continues and T is not rolled back.
o If neither condition 1 nor condition 2 holds, then the WRITE operation of transaction T is executed and W_TS(X) is set to TS(T).
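
Continuing the sketch from the basic protocol above (same assumed Item and Abort definitions), only the write operation changes: an obsolete write is skipped instead of causing a rollback:

# Thomas write rule: identical to the basic TO write except the second check.

def write_thomas(item, ts, value):
    if ts < item.r_ts:
        raise Abort("TS(T) < R_TS(X): abort and roll back T")
    if ts < item.w_ts:
        return                       # obsolete write: ignore it, do not abort
    item.value, item.w_ts = value, ts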

If we use the Thomas write rule, then some serializable schedules can be permitted that are not conflict serializable, as illustrated by the schedule in the figure below:

Figure: A Serializable Schedule that is not Conflict Serializable

In the above figure, T1's read precedes T2's write of the same data item. This schedule is not conflict serializable.

The Thomas write rule recognizes that T2's write is never seen by any transaction. If we delete that write operation from transaction T2, a conflict serializable schedule is obtained, as shown in the figure below.

Figure: A Conflict Serializable Schedule

Validation based Protocols


Validation based protocols are also known as optimistic concurrency control techniques. In a validation based protocol, a transaction is executed in the following three phases:

1. Read phase: In this phase, transaction T reads the values of the various data items it needs and stores them in temporary local variables. It performs all its write operations on these temporary variables, without updating the actual database.
2. Validation phase: In this phase, the temporary values are validated against the actual data to check whether writing them would violate serializability.
3. Write phase: If the transaction passes validation, the temporary results are written to the database; otherwise, the transaction is rolled back.



Each transaction is associated with the following three timestamps:

a) Start(Ti): the time when Ti started its execution.
b) Validation(Ti): the time when Ti finished its read phase and started its validation phase.
c) Finish(Ti): the time when Ti finished its write phase.

The protocol uses the timestamp of the validation phase as the transaction's timestamp for serialization, because validation is the phase that actually determines whether the transaction will commit or roll back.

Hence TS(T) = Validation(T).

Serializability is determined during the validation process; it cannot be decided in advance. While transactions execute freely, this approach allows a greater degree of concurrency and, when conflicts are rare, results in a smaller number of rollbacks.
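
A common form of the validation test can be sketched as follows (this particular check, comparing the Finish/Start/Validation timestamps and the read and write sets, is a standard formulation assumed here for illustration):

# Minimal sketch of the validation test: transaction tj is validated
# against every earlier transaction ti with TS(ti) < TS(tj).

from types import SimpleNamespace as TX

def validate(tj, earlier):
    for ti in earlier:
        if ti.finish < tj.start:
            continue        # ti finished before tj started: no conflict
        if ti.finish < tj.validation and not (ti.write_set & tj.read_set):
            continue        # ti wrote nothing that tj read: still safe
        return False        # otherwise tj fails validation and rolls back
    return True

t1 = TX(start=1, validation=2, finish=3, read_set={"A"}, write_set={"B"})
t2 = TX(start=2, validation=5, finish=6, read_set={"B"}, write_set={"C"})
print(validate(t2, [t1]))   # False: T1 wrote B, which T2 also read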

Multiple Granularity
The concurrency control schemes discussed so far have used each individual data item as the unit on which synchronization is performed. A drawback of this approach is that if a transaction Ti needs to access the entire database and a locking protocol is used, then Ti must lock every item in the database, which is inefficient. It would be simpler if Ti could use a single lock to lock the entire database.

But consider the flaw in this second proposal: if another transaction needs to access only a few data items, locking the entire database is unnecessary, and moreover it costs us concurrency, which was our primary goal in the first place. To strike a bargain between efficiency and concurrency, we use granularity.

Granularity: the size of the data item that is allowed to be locked.

o It can be defined as hierarchically breaking up the database into blocks which can be locked.
o The multiple granularity protocol enhances concurrency and reduces lock overhead.
o It keeps track of what to lock and how to lock.
o It makes it easy to decide whether to lock or unlock a data item. Such a hierarchy can be graphically represented as a tree.

For example: Consider a tree which has four levels of nodes.

o The first or highest level represents the entire database.
o The second level represents nodes of type area. The database consists of exactly these areas.
o Each area has children nodes which are known as files. No file can be present in more than one area.
o Finally, each file has children nodes known as records. A file consists of exactly those records that are its children. No record can be present in more than one file.
o Hence, the levels of the tree, starting from the top, are as follows:
1. Database
2. Area
3. File
4. Record



In this example, the highest level represents the entire database; the levels below it are area, file, and record.

There are three additional lock modes with multiple granularity:

a) Intention-shared (IS): indicates explicit locking at a lower level of the tree, but only with shared locks.
b) Intention-exclusive (IX): indicates explicit locking at a lower level, with exclusive or shared locks.
c) Shared & intention-exclusive (SIX): the node itself is locked in shared mode, and some lower-level node is locked in exclusive mode by the same transaction.

Compatibility Matrix with Intention Lock Modes: The table below describes the compatibility matrix for these lock modes:

        IS    IX    S     SIX   X
IS      yes   yes   yes   yes   no
IX      yes   yes   no    no    no
S       yes   no    yes   no    no
SIX     yes   no    no    no    no
X       no    no    no    no    no

The protocol uses these intention lock modes to ensure serializability. A transaction T1 that attempts to lock a node must follow these rules:

o Transaction T1 must observe the lock-compatibility matrix above.
o Transaction T1 must lock the root of the tree first, and it can lock it in any mode.
o Transaction T1 can lock a node in S or IS mode only if it currently has the parent of that node locked in either IX or IS mode.
o Transaction T1 can lock a node in X, SIX, or IX mode only if it currently has the parent of that node locked in either IX or SIX mode.
o Transaction T1 can lock a node only if it has not previously unlocked any node (that is, T1 is two-phase).
o Transaction T1 can unlock a node only if it currently holds no locks on any of that node's children.



Observe that in multiple-granularity, the locks are acquired in top-down order, and locks must be
released in bottom-up order.

o If transaction T1 reads record Ra9 in file Fa, then T1 needs to lock the database, area A1 and file Fa in IS mode. Finally, it needs to lock Ra9 in S mode.
o If transaction T2 modifies record Ra9 in file Fa, then it can do so after locking the database, area A1 and file Fa in IX mode. Finally, it needs to lock Ra9 in X mode.
o If transaction T3 reads all the records in file Fa, then T3 needs to lock the database and area A1 in IS mode. At last, it needs to lock Fa in S mode.
o If transaction T4 reads the entire database, then T4 needs to lock the database in S mode.
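
A minimal sketch of the top-down acquisition for the first two cases (the path representation and function name are illustrative assumptions):

# Lock a record by locking the path from the root downward:
# intention mode on every ancestor, S or X on the target itself.

def lock_path(path, leaf_mode):
    intent = "IS" if leaf_mode == "S" else "IX"
    *ancestors, leaf = path
    return [(node, intent) for node in ancestors] + [(leaf, leaf_mode)]

# T1 reads record Ra9: IS down the tree, then S on the record.
print(lock_path(["DB", "A1", "Fa", "Ra9"], "S"))
# [('DB', 'IS'), ('A1', 'IS'), ('Fa', 'IS'), ('Ra9', 'S')]

# T2 modifies record Ra9: IX down the tree, then X on the record.
print(lock_path(["DB", "A1", "Fa", "Ra9"], "X"))
# [('DB', 'IX'), ('A1', 'IX'), ('Fa', 'IX'), ('Ra9', 'X')]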

Multiversion Concurrency Control Technique


o This concurrency control technique keeps the old values of a data item when the item is updated. Such techniques are known as multiversion concurrency control, because several versions (values) of each item are maintained.
o When a transaction requires access to an item, an appropriate version is chosen to maintain the
serializability of the currently executing schedule, if possible. The idea is that some read
operations that would be rejected in other techniques can still be accepted by reading an older
version of the item to maintain serializability. When a transaction writes an item, it writes a new
version and the old version of the item is retained.
o Some multiversion concurrency control algorithms use the concept of view serializability rather
than conflict serializability.
o An obvious drawback of multiversion techniques is that more storage is needed to maintain
multiple versions of the database items. However, older versions may have to be maintained
anyway—for example, for recovery purposes. In addition, some database applications require
older versions to be kept to maintain a history of the evolution of data item values. The
extreme case is a temporal database, which keeps track of all changes and the times at which
they occurred. In such cases, there is no additional storage penalty for multiversion techniques,
since older versions are already maintained.

Multiversion Technique Based on Timestamp Ordering


In this method, several versions X1, X2, ..., Xk of each data item X are maintained. For each version Xi, the value of the version and the following two timestamps are kept:
1. read_TS(Xi): the read timestamp of Xi is the largest of all the timestamps of transactions that have successfully read version Xi.
2. write_TS(Xi): the write timestamp of Xi is the timestamp of the transaction that wrote the value of version Xi.

Whenever a transaction T is allowed to execute a write_item(X) operation, a new version Xk+1 of item X is created, with both write_TS(Xk+1) and read_TS(Xk+1) set to TS(T). Correspondingly, when a transaction T is allowed to read the value of version Xi, the value of read_TS(Xi) is set to the larger of the current read_TS(Xi) and TS(T).



To ensure serializability, the following two rules are used:

1. If transaction T issues a write_item(X) operation, and version Xi of X has the highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T), and read_TS(Xi) > TS(T), then abort and roll back transaction T; otherwise, create a new version Xj of X with read_TS(Xj) = write_TS(Xj) = TS(T).
2. If transaction T issues a read_item(X) operation, find the version Xi of X that has the highest write_TS(Xi) of all versions of X that is also less than or equal to TS(T); then return the value of Xi to transaction T, and set the value of read_TS(Xi) to the larger of TS(T) and the current read_TS(Xi).

As we can see in case 2, a read_item(X) is always successful, since it finds the appropriate version Xi to read based on the write_TS of the various existing versions of X.
In case 1, however, transaction T may be aborted and rolled back. This happens if T attempts to write a version of X that should have been read by another transaction T' whose timestamp is read_TS(Xi); but T' has already read version Xi, which was written by the transaction with timestamp equal to write_TS(Xi). If this conflict occurs, T is rolled back; otherwise, a new version of X, written by transaction T, is created. Notice that if T is rolled back, cascading rollback may occur. Hence, to ensure recoverability, a transaction T should not be allowed to commit until all the transactions that have written some version that T has read have committed.
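
These two rules can be sketched as follows (the version-list representation and the names are illustrative assumptions):

# Minimal sketch of multiversion timestamp ordering for one item X.

class Abort(Exception):
    pass

class Version:
    def __init__(self, value, ts):
        self.value = value
        self.read_ts = ts     # largest TS that has read this version
        self.write_ts = ts    # TS of the transaction that wrote it

versions = [Version(value=0, ts=0)]   # X starts with one committed version

def pick(ts):
    # The version with the highest write_TS that is <= TS(T).
    return max((v for v in versions if v.write_ts <= ts),
               key=lambda v: v.write_ts)

def read_item(ts):
    v = pick(ts)
    v.read_ts = max(v.read_ts, ts)    # rule 2: reads always succeed
    return v.value

def write_item(ts, value):
    v = pick(ts)
    if v.read_ts > ts:                # rule 1: a younger transaction read Xi
        raise Abort("write_item rejected: read_TS(Xi) > TS(T)")
    versions.append(Version(value, ts))   # otherwise create a new version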

Multiversion Two-Phase Locking Using Certify Locks

o In this multiple-mode locking scheme, there are three locking modes for an item: read, write,
and certify, instead of just the two modes (read, write). Hence, the state of LOCK(X) for an
item X can be one of read-locked, write-locked, certify-locked, or unlocked.
o In the standard locking scheme, once a transaction obtains a write lock on an item, no other transaction can access that item. The idea behind multiversion 2PL is to allow other transactions T' to read an item X while a single transaction T holds a write lock on X. This is accomplished by maintaining two versions for each item X; one version must always have been written by some committed transaction.
o The second version X' is created when a transaction T acquires a write lock on the item. Other transactions can continue to read the committed version of X while T holds the write lock.
o Transaction T can write the value of X' as needed, without affecting the value of the committed version X. However, once T is ready to commit, it must obtain a certify lock on all items on which it currently holds write locks before it can commit.
o The certify lock is not compatible with read locks, so the transaction may have to delay its commit until all its write-locked items are released by any reading transactions, in order to obtain the certify locks.
o Once the certify locks, which are exclusive locks, are acquired, the committed version X of the data item is set to the value of version X', version X' is discarded, and the certify locks are then released.
o In this multiversion 2PL scheme, reads can proceed concurrently with a single write operation, an arrangement not permitted under the standard 2PL schemes.
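
The read/write/certify compatibility described above can be sketched as a lookup table (the representation is assumed for illustration):

# Compatibility of read/write/certify locks in multiversion 2PL.
# A write coexists with reads (readers use the committed version X);
# a certify lock conflicts with everything, including reads.

COMPATIBLE = {
    ("read",    "read"):    True,
    ("read",    "write"):   True,    # the writer works on the new version X'
    ("read",    "certify"): False,   # commit must wait for the readers
    ("write",   "read"):    True,
    ("write",   "write"):   False,   # only one uncommitted version at a time
    ("write",   "certify"): False,
    ("certify", "read"):    False,
    ("certify", "write"):   False,
    ("certify", "certify"): False,
}

def can_grant(requested, held):
    return all(COMPATIBLE[(h, requested)] for h in held)

print(can_grant("write", ["read"]))     # True: readers do not block the writer
print(can_grant("certify", ["read"]))   # False: the commit waits for readers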



Recovery with Concurrent Transactions in DBMS

Log based recovery and shadow paging work well when there is a single transaction, such as updating an address. But what happens when multiple transactions occur concurrently? The same method of logging can be followed, but because the transactions are concurrent, the order and time of each transaction make a great difference: failing to maintain the order of transactions will lead to wrong data during recovery. Also, transactions may have many steps, and maintaining a log entry for each step increases the size of the log file, so it becomes an overhead to maintain the log file alongside these transactions. In addition, performing redo operations is also an overhead, because it executes already executed transactions again, which is not always necessary. So our goal should be a small log file with easy recovery of data in case of failure. To handle this situation, Checkpoints are introduced during the transaction.

A checkpoint acts like a bookmark. During the execution of transactions, such checkpoints are marked and the transactions are executed. The log files are created as usual for the steps of the transactions. When a checkpoint is reached, the committed changes up to that point are written to the database, and all the logs up to that point are removed from the log file. The log file is then updated with the new steps of transactions until the next checkpoint, and so on. Care should be taken in creating checkpoints: if a checkpoint is created before a transaction is fully complete and its data is written to the database, it will defeat the purpose of the log file and the checkpoint. Checkpoints serve their purpose when they are created after each transaction completes, or whenever the database is in a consistent state.

Suppose there are 4 concurrent transactions, T1, T2, T3 and T4. A checkpoint is added in the middle of T1, and there is a failure while executing T4. Let us see how the recovery system recovers the database from this failure.

 It starts reading the log files from the end towards the start, so that it can reverse the transactions, i.e., it reads the log files from transaction T4 back to T1.

 The recovery system maintains an undo list and a redo list. The log entries in the undo list will be used to undo transactions, whereas the entries in the redo list will be re-executed. A transaction is put into the redo list if its log contains both <Tn, Start> and <Tn, Commit>, or <Tn, Commit> alone. That is, all transactions that are fully complete are placed into the redo list to be re-executed after the recovery. In the above example, transactions T2 and T3 will have both <Tn, Start> and <Tn, Commit> in the log file. Transaction T1 will have only <Tn, Commit> in the log file, because it committed after the checkpoint was crossed: all of its logs up to and including <Tn, Start> were already written to the database and removed from the log file at the checkpoint. Hence T1, T2 and T3 are put into the redo list.

 Transactions whose logs contain only <Tn, Start> are put into the undo list, because they are not complete and can lead to an inconsistent state of the DB. In the above example, T4 is put into the undo list, since this transaction is not yet complete and failed midway.

This is how a DBMS recovers data in the case of a concurrent transaction failure.
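
A minimal sketch of the redo/undo classification pass (the log record representation is an assumption matching the <Tn, Start>/<Tn, Commit> entries above):

# Classify transactions into redo and undo lists from the log that
# survives after the last checkpoint.

def classify(log):
    started, committed = set(), set()
    for txn, event in log:               # event is "start" or "commit"
        (started if event == "start" else committed).add(txn)
    redo = committed                     # committed: safe to re-execute
    undo = started - committed           # started but never committed: undo
    return redo, undo

# T1's start was truncated at the checkpoint, so only its commit remains;
# T2 and T3 completed fully; T4 was still running at the failure.
log = [("T2", "start"), ("T3", "start"), ("T1", "commit"),
       ("T2", "commit"), ("T3", "commit"), ("T4", "start")]
redo, undo = classify(log)
print(sorted(redo), sorted(undo))        # ['T1', 'T2', 'T3'] ['T4']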

