Beruflich Dokumente
Kultur Dokumente
n
n
n
n
n
n
o
Introduction
Background
Distributed DBMS Architecture
Distributed Database Design
Semantic Data Control
Distributed Query Processing
Distributed Transaction Management
Transaction Concepts and Models
Distributed Concurrency Control
Distributed Reliability
o
o
o
o
Distributed DBMS
Page 10-12. 1
Transaction
A transaction is a collection of actions that make
consistent transformations of system states while
preserving system consistency.
concurrency transparency
failure transparency
Database in a
consistent
state
Begin
Transaction
Distributed DBMS
Database may be
temporarily in an
inconsistent state
during execution
Execution of
Transaction
1998 M. Tamer zsu & Patrick Valduriez
Database in a
consistent
state
End
Transaction
Page 10-12. 2
Transaction Example
A Simple SQL Query
Transaction BUDGET_UPDATE
begin
EXEC SQL UPDATE PROJ
SET
BUDGET = BUDGET1.1
Distributed DBMS
Page 10-12. 3
Example Database
Consider an airline reservation example with the
relations:
FLIGHT(FNO, DATE, SRC, DEST, STSOLD, CAP)
CUST(CNAME, ADDR, BAL)
FC(FNO, DATE, CNAME,SPECIAL)
Distributed DBMS
Page 10-12. 4
end . {Reservation}
Distributed DBMS
Page 10-12. 5
Termination of Transactions
Begin_transaction Reservation
begin
temp1,temp2
FLIGHT
FNO = flight_no AND DATE = date;
else
EXEC SQL
EXEC SQL
UPDATEFLIGHT
SET
STSOLD = STSOLD + 1
WHERE FNO = flight_no AND DATE = date;
INSERT
INTO
FC(FNO, DATE, CNAME, SPECIAL);
VALUES(flight_no, date, customer_name, null);
Commit
output(reservation completed)
endif
end . {Reservation}
Distributed DBMS
Page 10-12. 6
Example Transaction
Reads & Writes
Begin_transaction Reservation
begin
input(flight_no, date, customer_name);
temp Read(flight_no(date).stsold);
if temp = flight(date).cap then
begin
output(no free seats);
Abort
end
else begin
Write(flight(date).stsold, temp + 1);
Write(flight(date).cname, customer_name);
Write(flight(date).special, null);
Commit;
output(reservation completed)
end
end. {Reservation}
Distributed DBMS
Page 10-12. 7
Characterization
this transaction
Distributed DBMS
Page 10-12. 8
Formalization
Let
Oij(x) be some operation Oj of transaction Ti operating on
OSi = j Oij
Ni {abort,commit}
Page 10-12. 9
Example
Consider a transaction T:
Read(x)
Read(y)
x x + y
Write(x)
Commit
Then
= {R(x), R(y), W(x), C}
< = {(R(x), W(x)), (R(y), W(x)), (W(x), C), (R(x), C), (R(y), C)}
Distributed DBMS
Page 10-12. 10
DAG Representation
Assume
< = {(R(x),W(x)), (R(y),W(x)), (R(x), C), (R(y), C), (W(x), C)}
R(x)
W(x)
R(y)
Distributed DBMS
Page 10-12. 11
Properties of Transactions
ATOMICITY
all or nothing
CONSISTENCY
no violation of integrity constraints
ISOLATION
concurrent changes invisible serializable
DURABILITY
committed updates persist
Distributed DBMS
Page 10-12. 12
Atomicity
n
n
Distributed DBMS
Page 10-12. 13
Consistency
Internal consistency
A transaction which executes alone against a
Distributed DBMS
Page 10-12. 14
Consistency Degrees
n
Degree 0
Transaction T does not overwrite dirty data of other
transactions
Dirty data refers to data values that have been
updated by a transaction prior to its commitment
n
Degree 1
T does not overwrite dirty data of other transactions
T does not commit any writes before EOT
Distributed DBMS
Page 10-12. 15
Degree 2
T does not overwrite dirty data of other transactions
T does not commit any writes before EOT
T does not read dirty data from other transactions
Degree 3
T does not overwrite dirty data of other transactions
T does not commit any writes before EOT
T does not read dirty data from other transactions
Other transactions do not dirty any data read by T
before T completes.
Distributed DBMS
Page 10-12. 16
Isolation
n
Serializability
If several transactions are executed concurrently,
Incomplete results
An incomplete transaction cannot reveal its
Distributed DBMS
Page 10-12. 17
Isolation Example
n
Distributed DBMS
Read(x)
x x+1
Write(x)
Commit
Read(x)
x x+1
Write(x)
Commit
T1:
T1:
T2:
T1:
T2:
T2:
T1:
T2:
Read(x)
x x+1
Read(x)
Write(x)
x x+1
Write(x)
Commit
Commit
Page 10-12. 18
Phantom
T1 searches the database according to a predicate
Distributed DBMS
Page 10-12. 19
Read Uncommitted
For transactions operating at this level, all three
Read Committed
Fuzzy reads and phantoms are possible, but dirty
Repeatable Read
Only phantoms possible.
Anomaly Serializable
None of the phenomena are possible.
Distributed DBMS
Page 10-12. 20
Durability
Distributed DBMS
Page 10-12. 21
Characterization of Transactions
Based on
Application areas
u
u
u
Timing
u
two-step
restricted
action model
Structure
u
u
u
Distributed DBMS
Page 10-12. 22
Transaction Structure
n
Flat transaction
Consists of a sequence of primitive operations embraced
end.
n
Nested transaction
The operations of a transaction may themselves be
transactions.
Begin_transaction Reservation
Begin_transaction Airline
end. {Airline}
Begin_transaction Hotel
end. {Hotel}
end. {Reservation}
Distributed DBMS
Page 10-12. 23
Nested Transactions
n
Open nesting
Distributed DBMS
Page 10-12. 24
Workflows
n
n
System-oriented workflows
u
u
Transactional workflows
u
Distributed DBMS
Page 10-12. 25
Workflow Example
T3
T1
T2
Customer
Database
Customer
Database
Distributed DBMS
T5
T4
Customer
Database
Page 10-12. 26
Transactions Provide
Distributed DBMS
Page 10-12. 27
algorithms
Reliability protocols
Atomicity & Durability
Local recovery protocols
Global commit protocols
Distributed DBMS
Page 10-12. 28
data
Distributed DBMS
Page 10-12. 29
Architecture Revisited
Begin_transaction,
Read, Write,
Commit, Abort
Results
Distributed
Execution Monitor
With other
TMs
Transaction Manager
(TM)
Scheduling/
Descheduling
Requests
Scheduler
(SC)
With other
SCs
To data
processor
Distributed DBMS
Page 10-12. 30
User
Application
Begin_Transaction,
Read, Write, Abort, EOT
Results &
User Notifications
Transaction
Manager
(TM)
Read, Write,
Abort, EOT
Results
Scheduler
(SC)
Scheduled
Operations
Results
Recovery
Manager
(RM)
Distributed DBMS
Page 10-12. 31
Begin_transaction,
Read, Write, EOT,
Abort
TM
Distributed
Transaction Execution
Model
TM
Replica Control
Protocol
Read, Write,
EOT, Abort
Distributed DBMS
SC
SC
Distributed
Concurrency Control
Protocol
RM
RM
Local
Recovery
Protocol
Page 10-12. 32
Concurrency Control
n
Inconsistent retrievals
u
Distributed DBMS
Page 10-12. 33
T2: Write(x)
Write(y)
Read(z)
Commit
T3: Read(x)
Read(y)
Read(z)
Commit
H1={W2(x),R1(x), R3(x),W1(x),C1,W2(y),R3(y),R2(z),C2,R3(z),C3}
Distributed DBMS
Page 10-12. 34
Formalization of Schedule
A complete schedule SC(T) over a set of
transactions T={T1, , Tn} is a partial order
SC(T)={T, < T} where
T = i i , for i = 1, 2, , n
< T i < i , for i = 1, 2, , n
For any two conflicting operations Oij, Okl T,
either Oij < T Okl or Okl < T Oij
Distributed DBMS
Page 10-12. 35
T2: Write(x)
Write(y)
Read(z)
Commit
T3: Read(x)
Read(y)
Read(z)
Commit
Distributed DBMS
R1(x)
W2(x)
R3(x)
W1(x)
W2(y)
R3(y)
C1
R2(z)
R3(z)
C2
C3
Page 10-12. 36
Schedule Definition
A schedule is a prefix of a complete schedule
such that only some of the operations and only
some of the ordering relationships are included.
T1: Read(x)
Write(x)
Commit
R11(x)
W2(x)
T2:Write(x)
T3:
Write(y)
Read(z)
Commit
R3(x)
R1(x)
W1(x)
W2(y)
R3(y)
C1
R2(z)
R3(z)
C2
C3
Distributed DBMS
Read(x)
Read(y)
Read(z)
Commit
W2(x)
R3(x)
W2(y)
R3(y)
R2(z)
R3(z)
Page 10-12. 37
Serial History
n
n
n
T2: Write(x)
Write(y)
Read(z)
Commit
T3: Read(x)
Read(y)
Read(z)
Commit
Hs={W2(x),W2(y),R2(z),C2,R1(x),W1(x),C1,R3(x),R3(y),R3(z),C3}
Distributed DBMS
Page 10-12. 38
Serializable History
n
Distributed DBMS
Page 10-12. 39
Serializable History
T1: Read(x)
Write(x)
Commit
T2: Write(x)
Write(y)
Read(z)
Commit
T3:Read(x)
Read(y)
Read(z)
Commit
Distributed DBMS
Page 10-12. 40
Page 10-12. 41
Global Non-serializability
T1: Read(x)
x x+5
Write(x)
Commit
T2: Read(x)
x x15
Write(x)
Commit
Distributed DBMS
Page 10-12. 42
Pessimistic
Two-Phase Locking-based (2PL)
u
u
u
Basic TO
Multiversion TO
Conservative TO
Hybrid
n
Optimistic
Locking-based
Timestamp ordering-based
Distributed DBMS
Page 10-12. 43
Locking-Based Algorithms
n
n
n
Distributed DBMS
Page 10-12. 44
Release lock
Phase 1
Phase 2
BEGIN
Distributed DBMS
END
Page 10-12. 45
Strict 2PL
Hold locks until the end.
Obtain lock
Release lock
BEGIN
END
period of
data item
use
Distributed DBMS
Transaction
duration
Page 10-12. 46
Centralized 2PL
n
n
Coordinating TM
Lock
Central Site LM
Requ
est
ranted
Lock G
tion
Opera
End of
Opera
tion
Releas
e Lock
Distributed DBMS
Page 10-12. 47
Distributed 2PL
Distributed DBMS
Page 10-12. 48
Participating LMs
Participating DPs
Requ
est
Oper
ation
ration
End of Ope
Relea
se Lo
Distributed DBMS
cks
Page 10-12. 49
Timestamp Ordering
Transaction (Ti) is assigned a globally unique
timestamp ts(Ti).
Transaction manager attaches the timestamp to all
operations issued by the transaction.
Each data item is assigned a write timestamp (wts) and
a read timestamp (rts):
rts(x) = largest timestamp of any read on x
wts(x) = largest timestamp of any read on x
for Wi(x)
Page 10-12. 50
Conservative Timestamp
Ordering
n
Page 10-12. 51
Distributed DBMS
Page 10-12. 52
Validate
Read
Compute
Write
Validate
Write
Optimistic execution
Read
Distributed DBMS
Compute
Page 10-12. 53
Distributed DBMS
Page 10-12. 54
Tk
W
R
Tij
Distributed DBMS
Page 10-12. 55
items written by Tk
Tk
W
Tij
Distributed DBMS
Page 10-12. 56
Tk
Tij
Distributed DBMS
V
R
W
V
Page 10-12. 57
Deadlock
n
n
n
Ti
Distributed DBMS
Tj
Page 10-12. 58
T4
T2
T3
Global WFG
T1
T4
T2
T3
Distributed DBMS
Page 10-12. 59
Deadlock Management
n
Ignore
Let the application programmer deal with it, or
Prevention
Guaranteeing that deadlocks can never occur in
Avoidance
Detecting potential deadlocks in advance and
Distributed DBMS
Page 10-12. 60
Deadlock Prevention
n
Evaluation:
Reduced concurrency due to preallocation
Evaluating whether an allocation is safe leads to added
overhead.
Difficult to determine (partial order)
+ No transaction rollback or restart is involved.
Distributed DBMS
Page 10-12. 61
Deadlock Avoidance
n
Distributed DBMS
Page 10-12. 62
Deadlock Avoidance
Wait-Die & Wound-Wait Algorithms
WAIT-DIE Rule: If Ti requests a lock on a data item
which is already locked by Tj, then Ti is permitted to
wait iff ts(Ti)<ts(Tj). If ts(Ti)>ts(Tj), then Ti is aborted
and restarted with the same timestamp.
if ts(Ti)<ts(Tj) then Ti waits else Ti dies
non-preemptive: Ti never preempts Tj
prefers younger transactions
Distributed DBMS
Page 10-12. 63
Deadlock Detection
Distributed DBMS
Page 10-12. 64
communication cost
Distributed DBMS
Page 10-12. 65
DD11
Site 1
DD21
Distributed DBMS
DD14
Site 2 Site 3
DD22
DD23
Site 4
DD24
Page 10-12. 66
looks for a cycle that does not involve the external edge. If it
exists, there is a local deadlock which can be handled locally.
looks for a cycle involving the external edge. If it exists, it
indicates a potential global deadlock. Pass on the information
to the next site.
Distributed DBMS
Page 10-12. 67
Reliability
Problem:
How to maintain
atomicity
durability
properties of transactions
Distributed DBMS
Page 10-12. 68
Fundamental Definitions
n
Reliability
A measure of success with which a system conforms
Availability
The fraction of the time that a system meets its
specification.
given time t.
Distributed DBMS
Page 10-12. 69
Component 2
Stimuli
Responses
Component 3
External state
Internal state
Distributed DBMS
Page 10-12. 70
Fundamental Definitions
n
Failure
The deviation of a system from the behavior that is
Erroneous state
The internal state of a system such that there exist
Error
The part of the state which is incorrect.
Fault
An error in the internal states of the components of
Distributed DBMS
Page 10-12. 71
Faults to Failures
causes
Fault
Distributed DBMS
results in
Error
Failure
Page 10-12. 72
Types of Faults
n
Hard faults
Permanent
Resulting failures are called hard failures
Soft faults
Transient or intermittent
Account for more than 90% of all failures
Resulting failures are called soft failures
Distributed DBMS
Page 10-12. 73
Fault Classification
Permanent
fault
Permanent
error
Incorrect
design
Unstable or
marginal
components
Unstable
environment
Intermittent
error
System
Failure
Transient
error
Operator
mistake
Distributed DBMS
Page 10-12. 74
Failures
MTBF
MTTD
MTTR
Time
Fault
Error
occurs caused
Detection
of error
Repair
Fault Error
occurs caused
Distributed DBMS
Page 10-12. 75
e-m(t)[m(t)]k
k!
z(x)dx
t
Distributed DBMS
Page 10-12. 76
Fault-Tolerance Measures
Reliability
The mean number of failures in time [0, t] can be
computed as
Rsys(t) = Ri(t)
i =1
Distributed DBMS
Page 10-12. 77
Fault-Tolerance Measures
Availability
A(t) = Pr{system is operational at time t}
Assume
u
A = lim A(t) =
t
Distributed DBMS
Page 10-12. 78
Fault-Tolerance Measures
MTBF
Mean time between failures
MTBF = 0 R(t)dt
MTTR
Mean time to repair
Availability
MTBF
MTBF + MTTR
Distributed DBMS
Page 10-12. 79
Sources of Failure
SLAC Data (1985)
Software
13%
Operations
57%
Hardware
13%
Environment
17%
Page 10-12. 80
Sources of Failure
Japanese Data (1986)
Operations
10% Environment
11%
Application
SW
25%
Vendor
42%
Comm.
Lines
12%
Page 10-12. 81
Sources of Failure
5ESS Switch (1987)
Operations
18%
Software
44%
Unknown
6%
Hardware
32%
Page 10-12. 82
Sources of Failures
Tandem Data (1985)
Maintenance
25%
Operations
17%
Environment
14%
Hardware
18%
Software
26%
Page 10-12. 83
Types of Failures
n
Transaction failures
Transaction aborts (unilaterally or due to deadlock)
Avg. 3% of transactions abort abnormally
Media failures
Failure of secondary storage devices such that the
Communication failures
Lost/undeliverable messages
Network partitioning
Distributed DBMS
Page 10-12. 84
Volatile storage
(RAM).
Stable storage
Resilient to failures and loses its contents only in the
Local Recovery
Manager
Main memory
Fetch,
Flush
Read
Stable
database
Write
Distributed DBMS
Write
Database Buffer
Manager
Read
Database
buffers
(Volatile
database)
Page 10-12. 85
Update Strategies
In-place update
Each update causes a change in one or more data
Out-of-place update
Each update causes the new value(s) of data item(s)
Distributed DBMS
Page 10-12. 86
Old
stable database
state
Update
Operation
New
stable database
state
Database
Log
Distributed DBMS
Page 10-12. 87
Logging
The log contains information used by the
recovery process to restore the consistency of a
system. This information may include
transaction identifier
type of operation (action)
items accessed by the transaction to perform the
action
Distributed DBMS
Page 10-12. 88
Why Logging?
Upon recovery:
all of T1's effects should be reflected in the database
system
crash
Begin
T1
Begin
End
T2
0
Distributed DBMS
time
t
1998 M. Tamer zsu & Patrick Valduriez
Page 10-12. 89
REDO Protocol
Old
stable database
state
REDO
New
stable database
state
Database
Log
n
n
Distributed DBMS
Page 10-12. 90
UNDO Protocol
New
stable database
state
UNDO
Old
stable database
state
Database
Log
n
n
Distributed DBMS
Page 10-12. 91
updated)
Distributed DBMS
Page 10-12. 92
Notice:
If a system crashes before a transaction is committed,
WAL protocol :
Before a stable database is updated, the undo portion of
Distributed DBMS
Page 10-12. 93
Logging Interface
Secondary
storage
Main memory
Stable
log
Local Recovery
Manager
Stable
database
Fetch,
Flush
Database Buffer
Manager
Distributed DBMS
Read
Write
ad
Log
buffers
Re
e
rit
W
Read
Write
Database
buffers
(Volatile
database)
Page 10-12. 94
Out-of-Place Update
Recovery Information
n
Shadowing
When an update occurs, don't change the old page, but
Differential files
For each file F maintain
u
Page 10-12. 95
Execution of Commands
Commands to consider:
begin_transaction
read
write
commit
abort
recover
Distributed DBMS
Independent of execution
strategy for LRM
Page 10-12. 96
Execution Strategies
n
Dependent upon
Can the buffer manager decide to write some of
fix/no-fix decision
flush/no-flush decision
Distributed DBMS
Page 10-12. 97
No-Fix/No-Flush
n
Abort
Buffer manager may have written some of the updated
Commit
LRM writes an end_of_transaction record into the log.
Recover
For those transactions that have both a
Page 10-12. 98
No-Fix/Flush
n
Abort
Buffer manager may have written some of the
Commit
LRM issues a flush command to the buffer
log.
n
Recover
No need to perform redo
Perform global undo
Distributed DBMS
Page 10-12. 99
Fix/No-Flush
n
Abort
None of the updated pages have been written
Commit
LRM writes an end_of_transaction record into
the log.
Recover
Perform partial redo
No need to perform global undo
Distributed DBMS
Fix/Flush
n
Abort
None of the updated pages have been written into stable
database
Recover
No need to do anything
Distributed DBMS
Checkpoints
n
Distributed DBMS
Media Failures
Full Architecture
Secondary
storage
Main memory
Stable
log
Local Recovery
Manager
Stable
database
Fetch,
Flush
Database Buffer
Manager
Read
Write
Write
Archive
database
Distributed DBMS
ad
Log
buffers
Re
e
rit
W
Read
Write
Database
buffers
(Volatile
database)
Write
Archive
log
Commit protocols
How to execute commit command for distributed
transactions.
Termination protocols
If a failure occurs, how can the remaining operational
Recovery protocols
When a failure occurs, how do the sites where the failure
Distributed DBMS
Centralized 2PC
P
C
P
ready?
yes/no
Phase 1
Distributed DBMS
commit/abort? commited/aborted
Phase 2
Participant
INITIAL
INITIAL
AR E
PREP
write
begin_commit
in log
write abort
in log
BOR
TE-A
No
Ready to
Commit?
Yes
VO
VOTE-COMMIT
WAIT
Yes
Any No?
write ready
in log
GLOBAL-ABORT
write abort
in log
VO
write commit
in log
Abort
ACK
COMMIT
ABORT
ACK
Type of
msg
Commit
write abort
in log
write commit
in log
write
end_of_transaction
in log
Distributed DBMS
READY
M MIT
TE-CO
No
ABORT
COMMIT
Linear 2PC
Phase 1
Prepare
1
VC/VA
2
GC/GA
VC/VA
3
GC/GA
VC/VA
4
GC/GA
VC/VA
5
GC/GA
N
GC/GA
Phase 2
VC: Vote-Commit, VA: Vote-Abort, GC: Global-commit, GA: Global-abort
Distributed DBMS
Distributed 2PC
Coordinator
Participants
Participants
global-commit/
global-abort
vote-abort/ decision made
vote-commit independently
prepare
Phase 1
Distributed DBMS
INITIAL
Commit command
Prepare
Prepare
Vote-commit
Prepare
Vote-abort
WAIT
Vote-abort
Global-abort
READY
Vote-commit (all)
Global-commit
ABORT
COMMIT
Global-abort
Ack
ABORT
Coordinator
Distributed DBMS
Global-commit
Ack
COMMIT
Participants
Page 10-12. 110
Timeout in INITIAL
Who cares
INITIAL
Timeout in WAIT
Commit command
Prepare
WAIT
Vote-abort
Global-abort
Vote-commit
Global-commit
ABORT
Distributed DBMS
COMMIT
INITIAL
Timeout in INITIAL
Coordinator must have
Prepare
Vote-commit
Prepare
Vote-abort
Timeout in READY
Stay blocked
READY
Global-abort
Ack
ABORT
Distributed DBMS
Global-commit
Ack
COMMIT
Failure in INITIAL
INITIAL
Failure in WAIT
Restart the commit process upon
Commit command
Prepare
recovery
WAIT
been received
Vote-commit
Global-commit
Vote-abort
Global-abort
involved
Distributed DBMS
ABORT
COMMIT
Failure in INITIAL
INITIAL
Failure in READY
The coordinator has been informed
Prepare
Vote-abort
Prepare
Vote-commit
READY
Global-abort
Ack
Global-commit
Ack
Distributed DBMS
COMMIT
command
Distributed DBMS
ABORT state
state
coordinator will handle it by timeout in COMMIT or
ABORT state
Distributed DBMS
Blocking
Ready implies that the participant waits for the
coordinator
If coordinator fails, site is blocked until recovery
Blocking reduces availability
n
n
Distributed DBMS
Three-Phase Commit
n
n
3PC is non-blocking.
A commit protocols is non-blocking iff
it is synchronous within one state
transition, and
its state transition diagram contains
u
u
n
n
Distributed DBMS
INITIAL
Commit command
Prepare
Prepare
Vote-commit
Prepare
Vote-abort
WAIT
Vote-abort
Global-abort
READY
Vote-commit
Prepare-to-commit
PRECOMMIT
ABORT
Global-abort
Ack
ABORT
Ready-to-commit
Global commit
Prepared-to-commit
Ready-to-commit
PRECOMMIT
Global commit
Ack
COMMIT
Distributed DBMS
Participants
INITIAL
COMMIT
Communication Structure
P
P
C
ready?
yes/no
Phase 1
Distributed DBMS
pre-commit/
pre-abort?
yes/no
Phase 2
commit/abort
ack
Phase 3
Site Failures
3PC Termination
Coordinator
INITIAL
Timeout in INITIAL
Who cares
Commit command
Prepare
Timeout in WAIT
Unilaterally abort
WAIT
Vote-abort
Global-abort
Vote-commit
Prepare-to-commit
Timeout in PRECOMMIT
PRECOMMIT
ABORT
Ready-to-commit
Global commit
COMMIT
Distributed DBMS
Site Failures
3PC Termination
Coordinator
INITIAL
Commit command
Prepare
WAIT
Vote-abort
Global-abort
ABORT
Timeout in ABORT or
COMMIT
Vote-commit
Prepare-to-commit
PRECOMMIT
transaction as completed
participants are either in
PRECOMMIT or READY
state and can follow their
termination protocols
Ready-to-commit
Global commit
COMMIT
Distributed DBMS
Site Failures
3PC Termination
Participants
INITIAL
Timeout in INITIAL
Coordinator must have
Prepare
Vote-commit
Prepare
Vote-abort
READY
Global-abort
Ack
ABORT
Timeout in READY
Voted to commit, but does
Prepared-to-commit
Ready-to-commit
PRECOMMIT
Global commit
Ack
COMMIT
Distributed DBMS
Unilaterally abort
Timeout in PRECOMMIT
Handle it the same as
Distributed DBMS
Coordinator
INITIAL
Failure in INITIAL
start commit process upon
recovery
Commit command
Prepare
Failure in WAIT
the participants may have
WAIT
Vote-abort
Global-abort
Vote-commit
Prepare-to-commit
PRECOMMIT
ABORT
Ready-to-commit
Global commit
Failure in PRECOMMIT
ask around for the fate of the
COMMIT
Distributed DBMS
transaction
INITIAL
Commit command
Prepare
WAIT
Vote-abort
Global-abort
ABORT
Failure in COMMIT or
ABORT
Vote-commit
Prepare-to-commit
PRECOMMIT
Ready-to-commit
Global commit
COMMIT
Distributed DBMS
Participants
INITIAL
Failure in INITIAL
unilaterally abort upon
recovery
Prepare
Vote-commit
Prepare
Vote-abort
Failure in READY
the coordinator has been
READY
Global-abort
Ack
Prepared-to-commit
Ready-to-commit n
Failure in PRECOMMIT
ask around to determine how
ABORT
PRECOMMIT
Global commit
Ack
Failure in COMMIT or
ABORT
COMMIT
Distributed DBMS
no need to do anything
Network Partitioning
n
Simple partitioning
Only two partitions
Multiple partitioning
More than two partitions
Distributed DBMS
blocked
improve availability
n
voting-based (quorum)
u
Distributed DBMS
a commit quorum Vc
abort quorum Va
Distributed DBMS
State Transitions in
Quorum Protocols
Coordinator
INITIAL
Commit command
Prepare
Prepare
Vote-abort
WAIT
Vote-abort
Prepare-to-abort
PREABORT
Vote-commit
Prepare-to-commit
Prepared-to-abortt
Ready-to-abort
PREABORT
PRECOMMIT
Ready-to-commit
Global commit
Distributed DBMS
Prepare
Vote-commit
READY
Ready-to-abort
Global-abort
ABORT
Participants
INITIAL
Prepare-to-commit
Ready-to-commit
PRECOMMIT
Global-abort
Ack
COMMIT
ABORT
Global commit
Ack
COMMIT
Distributed DBMS
Vr + Vw > V
Vw > V/2
Open Problems
n
Replication protocols
experimental validation
replication of computation and communication
Transaction models
changing requirements
u
u
u
u
relaxed semantics
u
Distributed DBMS
flat
Distributed DBMS
closed
nesting
open
mixed
nesting
Transaction
structure