Master 2007
Outline
Introduction
Distributed Mutual Exclusion
Election Algorithms
Group Communication
Consensus and Related Problems
Main Assumptions
Each pair of processes is connected by reliable channels
Processes are independent of each other
Network: does not disconnect
[Figure: processes 1, …, n sharing a resource]
Distributed Mutual Exclusion (2)
Critical section
Distributed Mutual Exclusion (3)
Distributed mutual exclusion: no shared variables,
only message passing
Properties:
Safety: At most one process may execute in the critical
section at a time
Liveness: Requests to enter and exit the critical section
eventually succeed
No deadlock and no starvation
Mutual Exclusion Algorithms
Basic hypotheses:
System: asynchronous
Processes: don’t fail
Message transmission: reliable
[Figure: processes P1 to P4 exchanging a token. Steps shown: 1) Request token, 2) Release token. One process holds the token while another is waiting.]
Ring-Based Algorithm (1)
A group of unordered
processes in a network
[Figure: processes P1, P2, P3, P4, …, Pn attached to an Ethernet]
Ring-Based Algorithm (2)
[Figure: processes P1, P2, P3, P4, …, Pn arranged in a logical ring. A process calls Enter(), runs its critical section, then calls Exit().]
The token travels around the ring
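The token circulation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `token_ring`, the choice of process 0 as the initial token holder, and the `max_hops` guard are hypothetical and not taken from the slides.

```python
def token_ring(n, wants, max_hops=100):
    """Simulate a token moving around a ring of n processes.

    wants: indices of the processes that want the critical section.
    Returns the order in which processes entered it.
    """
    pending = set(wants)
    entered = []
    holder = 0  # assumption: process 0 holds the token initially
    hops = 0
    while pending and hops < max_hops:
        if holder in pending:
            # Enter(): only the token holder may run its critical section
            entered.append(holder)
            # Exit(): the process is done and gives up the token
            pending.remove(holder)
        holder = (holder + 1) % n  # pass the token to the ring neighbor
        hops += 1
    return entered


order = token_ring(5, {3, 1, 4})
print(order)
```

Entry order follows ring position, not request time, which is why the ring-based algorithm does not satisfy ordering.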
Mutual Exclusion using
Multicast and Logical Clocks (1)
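[Figure: P1 and P2 request entry to the critical section simultaneously. Their requests carry the timestamps 19 and 23; P3 is not competing. The deferred request is kept in a waiting queue until the process in the critical section calls Exit().]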
Mutual Exclusion using
Multicast and Logical Clocks (2)
Main steps of the algorithm:
Initialization
State := RELEASED;
Mutual Exclusion using
Multicast and Logical Clocks (3)
Main steps of the algorithm (cont’d):
On receipt of a request <Ti, pi> at pj (i ≠ j)
If (state = HELD) OR
(state = WANTED AND (T, pj) < (Ti, pi))
Then queue request from pi without replying;
Else reply immediately to pi;
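The receipt rule above can be sketched as follows. The class name `Process`, its fields, and the timestamps in the example are hypothetical; only the decision rule itself comes from the slide. Pairs (T, p) are compared lexicographically, which Python tuples do natively.

```python
RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.state = RELEASED
        self.request_ts = None  # T: timestamp of this process's own request
        self.queue = []         # requests deferred until Exit()

    def on_request(self, ts, sender):
        """Handle <ts, sender>. Return True if the reply is sent at once."""
        own = (self.request_ts, self.pid)
        if self.state == HELD or (self.state == WANTED and own < (ts, sender)):
            self.queue.append((ts, sender))  # queue without replying
            return False
        return True                          # reply immediately


# Two processes compete with (hypothetical) timestamps 19 and 23.
p1, p2 = Process(1), Process(2)
p1.state, p1.request_ts = WANTED, 23
p2.state, p2.request_ts = WANTED, 19
print(p2.on_request(23, 1))  # p2's request is older, so p1 is queued
print(p1.on_request(19, 2))  # p1 yields and replies to p2
```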
Maekawa’s Voting Algorithm (1)
Candidate process: must collect sufficient votes to
enter the critical section
Each process pi maintains a voting set Vi (i = 1, …, N),
where Vi ⊆ {p1, …, pN}
Sets Vi: chosen such that ∀ i, j:
pi ∈ Vi
Vi ∩ Vj ≠ ∅ (at least one common member of any two voting sets)
|Vi| = k (fairness: all voting sets have the same size)
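One simple way to obtain sets with these three properties is to place the processes in a square grid and let each voting set be the row plus the column of its owner. This grid construction is an assumption chosen for illustration (it gives sets of size 2√N - 1, not Maekawa's optimal sets); the names `grid_voting_sets` and `check` are hypothetical.

```python
from itertools import combinations
from math import isqrt


def grid_voting_sets(n):
    """Build voting sets for processes 0..n-1 laid out in a square grid."""
    side = isqrt(n)
    if side * side != n:
        raise ValueError("this sketch assumes N is a perfect square")
    sets = []
    for i in range(n):
        row, col = divmod(i, side)
        row_members = {row * side + c for c in range(side)}
        col_members = {r * side + col for r in range(side)}
        sets.append(row_members | col_members)
    return sets


def check(sets):
    """Verify the three conditions listed on the slide."""
    contains_owner = all(i in v for i, v in enumerate(sets))
    pairwise_overlap = all(a & b for a, b in combinations(sets, 2))
    same_size = len({len(v) for v in sets}) == 1
    return contains_owner and pairwise_overlap and same_size


sets = grid_voting_sets(9)
print(sorted(sets[4]), check(sets))
```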
Maekawa’s Voting Algorithm (3)
Main steps of the algorithm (cont’d):
On receipt of a request from pi at pj (i ≠ j)
Maekawa’s Voting Algorithm (4)
Main steps of the algorithm (cont’d):
M. E. Algorithms Comparison
Algorithm         Messages per Enter()/Exit()   Messages before Enter()   Problems
Virtual ring      1 to ∞                        0 to N-1                  Crash of a process; token lost; ordering not satisfied
Logical clocks    2(N-1)                        2(N-1)                    Crash of a process
Maekawa’s alg.    3√N                           2√N                       Crash of a voting process
Outline
Introduction
Distributed Mutual Exclusion
Election Algorithms
Group Communication
Consensus and Related Problems
Election Algorithms (1)
Objective: Elect one process pi from a group of
processes p1…pN, even if multiple elections have been
started simultaneously
Utility: Elect a primary manager, a master process, a
coordinator or a central server
Each process pi maintains the identity of the elected process
in the variable Electedi (NIL if it isn’t defined yet)
Properties to satisfy: ∀ pi,
Safety: Electedi = NIL or Electedi = P, where P is a non-crashed
process with the largest identifier
Liveness: pi participates and sets Electedi ≠ NIL, or
crashes
Election Algorithms (2)
Bully Algorithm
Ring-Based Election Algorithm (1)
[Figure: ring-based election among processes with identifiers 5, 16, 9 and 25. Process 5 starts the election.]
Ring-Based Election Algorithm (2)
Initialization
Participanti := FALSE;
Electedi := NIL
Pi starts an election
Participanti := TRUE;
Send the message <election, pi> to its neighbor
Participanti := FALSE;
If pi ≠ pj
Then Send the message <elected, pj> to its neighbor
Ring-Based Election Algorithm (3)
Receipt of the election’s message <election, pi> at pj
If pi > pj
Then Send the message <election, pi> to its neighbor
Participantj := TRUE;
Else If pi < pj AND Participantj = FALSE
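A single election on the ring can be simulated as below. The slide text stops after the second condition, so the remaining cases (substituting the own identifier, discarding a smaller identifier when already participating, and winning when the own identifier returns) follow the standard ring-based election and are an assumption here, as are the function name `ring_election` and the ring order used in the example.

```python
def ring_election(ids, starter):
    """ids: process identifiers in ring order; starter: index that starts.

    Returns (elected, hops): the Elected variable of every process and
    the number of messages that travelled between neighbors.
    """
    n = len(ids)
    participant = [False] * n
    elected = [None] * n
    participant[starter] = True
    msg = ("election", ids[starter])
    pos = (starter + 1) % n
    hops = 0
    while True:
        hops += 1
        kind, pid = msg
        me = ids[pos]
        if kind == "election":
            if pid > me:
                participant[pos] = True       # forward the larger identifier
            elif pid < me and not participant[pos]:
                participant[pos] = True
                msg = ("election", me)        # substitute own identifier
            elif pid < me:
                break                         # already participating: discard
            else:                             # own identifier came back: winner
                participant[pos] = False
                elected[pos] = me
                msg = ("elected", me)
        else:
            if pid == me:
                break                         # announcement went all the way round
            participant[pos] = False
            elected[pos] = pid
        pos = (pos + 1) % n
    return elected, hops


elected, hops = ring_election([5, 16, 9, 25], starter=0)
print(elected, hops)
```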
DelayTrans.: message transmission delay
DelayProc.: message processing delay
Timeout: T = 2 × DelayTrans. + DelayProc.
Bully Algorithm (2)
Hypotheses (cont’d):
Each process knows which processes have higher
identifiers, and it can communicate with all such
processes
Three types of messages:
Election: starts an election
OK: sent in response to an election message
Coordinator: announces the new coordinator
Election started by a process when it notices, through
timeouts, that the coordinator has failed
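The outcome of an election with these three message types can be sketched as follows. This is a sequential simplification, not the full protocol: it assumes crashed processes never answer, that every answer arrives within the timeout, and that one answering process at a time takes over the election. The names `timeout` and `bully_election` and the delay values are hypothetical.

```python
def timeout(delay_trans, delay_proc):
    """Upper bound on the wait for an answer: the request travels to the
    peer, is processed there, and the answer travels back."""
    return 2 * delay_trans + delay_proc


def bully_election(alive, starter):
    """alive: identifiers of the non-crashed processes.
    starter: the process that noticed the coordinator failure.
    Returns the identifier announced in the Coordinator message."""
    candidate = starter
    while True:
        # Election message goes to every process with a higher identifier;
        # only the live ones answer OK before the timeout expires.
        answered_ok = [p for p in alive if p > candidate]
        if not answered_ok:
            return candidate           # nobody answered: announce itself
        candidate = min(answered_ok)   # an answering process takes over


# Processes 1..8, coordinator 8 has crashed, process 5 notices it.
print(timeout(10, 5))
print(bully_election({1, 2, 3, 4, 5, 6, 7}, starter=5))
```

The result is always the largest identifier among the live processes, which matches the safety property stated for election algorithms.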
Bully Algorithm (3)
[Figure: processes 1 to 8. The coordinator has failed and process 5 detects it first. Legend: Election, OK, New Coordinator, Coordinator failed.]
Bully Algorithm (4)
Initialization
Electedi := NIL
Elected := pj;
Election Algorithms Comparison
Outline
Introduction
Distributed Mutual Exclusion
Election Algorithms
Group Communication
Consensus and Related Problems
Group Communication (1)
Objective: each of a group of processes must
receive copies of the messages sent to the group
Group communication requires:
Coordination
Agreement: on the set of messages that is
received and on the delivery ordering
Group Communication (2)
System: contains a collection of processes, which
can communicate reliably over one-to-one channels
Processes: members of groups, may fail only by
crashing
Groups:
Group Communication (4)
Basic Multicast
Reliable Multicast
Ordered Multicast
Basic Multicast
Objective: Guarantee that a correct process will eventually
deliver the message as long as the multicaster does not crash
To B_multicast(g, m)
For each process p ∈ g, send(p, m);
(Use threads to perform the send operations simultaneously)
On receive(m) at p
B_deliver(m) to p
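A minimal in-process sketch of basic multicast with one thread per send operation is shown below. The reliable one-to-one channels are modelled with `queue.Queue` inboxes, which is an assumption made for illustration; the process names are hypothetical.

```python
import queue
import threading


def b_multicast(group, m):
    """group: dict mapping a process name to its inbox (a queue.Queue).
    Each send(p, m) runs in its own thread so the sends overlap."""
    threads = [
        threading.Thread(target=inbox.put, args=(m,))
        for inbox in group.values()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def b_deliver(inbox):
    """On receive(m) at p: hand the message to the application."""
    return inbox.get(timeout=1)


group = {name: queue.Queue() for name in ("p1", "p2", "p3")}
b_multicast(group, "hello")
delivered = {name: b_deliver(inbox) for name, inbox in group.items()}
print(delivered)
```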
Reliable Multicast (2)
Implementation using B-multicast:
Initialization
msgReceived := {};
R-multicast(g, m) by p
B-multicast(g, m); // p ∈ g
B-deliver(m) by q with g = group(m)
If (m ∉ msgReceived)
Then msgReceived := msgReceived ∪ {m};
If (q ≠ p) Then B-multicast(g, m);
R-deliver(m);
Note: correct algorithm, but inefficient (each message is sent |g| times to each process)
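The cost of this scheme can be checked with a small single-threaded simulation in which a send is a direct method call. The class name `Process`, the shared `network` dictionary and the `sends` counter are hypothetical helpers introduced for the example.

```python
class Process:
    def __init__(self, name, network):
        self.name = name
        self.network = network      # shared dict: name -> Process (the group g)
        self.msg_received = set()   # msgReceived := {}
        self.delivered = []         # messages handed over by R-deliver
        self.sends = 0              # number of point-to-point sends issued

    def b_multicast(self, m, origin):
        for p in list(self.network.values()):
            self.sends += 1
            p.b_deliver(m, origin)

    def r_multicast(self, m):
        self.b_multicast(m, origin=self.name)

    def b_deliver(self, m, origin):
        if m not in self.msg_received:
            self.msg_received.add(m)
            if origin != self.name:
                self.b_multicast(m, origin)  # relay before delivering
            self.delivered.append(m)         # R-deliver(m)


network = {}
for name in ("p1", "p2", "p3"):
    network[name] = Process(name, network)

network["p1"].r_multicast("m")
total_sends = sum(p.sends for p in network.values())
print(total_sends, [p.delivered for p in network.values()])
```

With three processes the simulation counts nine sends for one message, that is |g| copies per process, while every process still delivers the message exactly once.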
Ordered Multicast
Ordering categories:
FIFO Ordering
Total Ordering
Causal Ordering
FIFO Ordering (1)
If a correct process issues multicast(g, m1) and then
multicast(g, m2), then every correct process that
delivers m2 will deliver m1 before m2
[Figure: timeline showing the multicast messages m1, m2 and m3]
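FIFO ordering is commonly obtained with per-sender sequence numbers and a hold-back queue. The sketch below assumes that technique; the slides shown here do not give the implementation, and the class name `FifoReceiver` and the numbering from 1 are hypothetical.

```python
class FifoReceiver:
    def __init__(self):
        self.next_seq = {}   # sender -> next sequence number expected
        self.held = {}       # (sender, seq) -> message waiting in hold-back
        self.delivered = []

    def receive(self, sender, seq, m):
        """Called when a message arrives; delivers in per-sender order."""
        self.held[(sender, seq)] = m
        expected = self.next_seq.get(sender, 1)
        while (sender, expected) in self.held:
            self.delivered.append(self.held.pop((sender, expected)))
            expected += 1
        self.next_seq[sender] = expected


r = FifoReceiver()
r.receive("p1", 2, "m2")   # arrives early, held back
r.receive("p1", 1, "m1")   # releases m1 and then m2
print(r.delivered)
```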
Total Ordering (2)
Implementation: Assign totally ordered identifiers to
multicast messages
Each process makes the same ordering decision
based upon these identifiers
Methods for assigning identifiers to messages:
Sequencer process
Processes collectively agree on the assignment of
sequence numbers to messages in a distributed
fashion
Total Ordering (3)
Sequencer process: Maintains a group-specific
sequence number Sg
Initialization
Sg := 0;
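The sequencer approach can be sketched as below. Only the counter Sg and its initial value come from the slide; the rest (order messages carrying an identifier and a number, a local counter `r_g` at each member, and the class names) is an assumption based on the usual sequencer scheme.

```python
class Sequencer:
    def __init__(self):
        self.s_g = 0                 # Sg := 0

    def order(self, msg_id):
        """Assign the next group-wide sequence number to a message."""
        number = self.s_g
        self.s_g += 1
        return msg_id, number


class Member:
    def __init__(self):
        self.r_g = 0                 # next sequence number to deliver
        self.messages = {}           # msg_id -> message body (hold-back)
        self.orders = {}             # sequence number -> msg_id
        self.delivered = []

    def on_message(self, msg_id, m):
        self.messages[msg_id] = m
        self._try_deliver()

    def on_order(self, msg_id, number):
        self.orders[number] = msg_id
        self._try_deliver()

    def _try_deliver(self):
        while self.r_g in self.orders and self.orders[self.r_g] in self.messages:
            self.delivered.append(self.messages[self.orders[self.r_g]])
            self.r_g += 1


seq = Sequencer()
order_a, order_b = seq.order("a"), seq.order("b")
m1, m2 = Member(), Member()
# The two members see the traffic in different arrival orders.
m1.on_message("a", "A"); m1.on_message("b", "B")
m1.on_order(*order_a); m1.on_order(*order_b)
m2.on_message("b", "B"); m2.on_order(*order_b)
m2.on_message("a", "A"); m2.on_order(*order_a)
print(m1.delivered, m2.delivered)
```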
Total Ordering (6)
[Figure: distributed agreement on a sequence number among processes P1 to P5]
Message transmission: the sender multicasts <m, Ident.> to the group
Proposition of a sequence number: each process pi computes P_g^{pi} = MAX(A_g^{pi}, P_g^{pi}) + 1 and replies with <Ident., P_g^{pi}>
Assigning a sequence number to the message: SN = MAX_{i=1,..,5}(P_g^{pi}), sent to the group as <Ident., SN>
Each process pi then sets A_g^{pi} = SN
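The proposal and assignment steps in the figure can be sketched as follows, leaving out message transport and the hold-back queue. The names `Member` and `assign_sequence_number` are hypothetical; `a_g` and `p_g` stand for the per-process values Ag and Pg.

```python
class Member:
    def __init__(self):
        self.a_g = 0   # Ag: largest agreed sequence number seen so far
        self.p_g = 0   # Pg: largest sequence number proposed so far

    def propose(self):
        """Pg = MAX(Ag, Pg) + 1, returned to the sender of the message."""
        self.p_g = max(self.a_g, self.p_g) + 1
        return self.p_g

    def agree(self, sn):
        """Ag = SN. max() keeps Ag from decreasing if agreements for
        different messages arrive out of order (an added safeguard)."""
        self.a_g = max(self.a_g, sn)


def assign_sequence_number(members):
    """One round for one message: collect proposals, pick the maximum."""
    proposals = [p.propose() for p in members]
    sn = max(proposals)
    for p in members:
        p.agree(sn)
    return sn


members = [Member() for _ in range(5)]
members[2].p_g = 4          # one process has already proposed up to 4
first = assign_sequence_number(members)
second = assign_sequence_number(members)
print(first, second)
```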
Causal Ordering (1)
If multicast(g, m1) → multicast(g, m3), then any correct
process that delivers m3 will deliver m1 before m3
[Figure: example timeline with messages m1, m2 and m3]
Initialization
V_i^g[j] := 0 (j = 1, …, N);
Causal Ordering (3)
CO-multicast(g, m)
V_i^g[i] := V_i^g[i] + 1;
B-multicast(g, <m, V_i^g>);
B-deliver(<m, V_j^g>) of pj, with g = group(m)
Place <m, V_j^g> in a hold-back queue;
Wait until (V_j^g[j] = V_i^g[j] + 1) AND (V_j^g[k] ≤ V_i^g[k]) (k ≠ j);
CO-deliver(m);
V_i^g[j] := V_i^g[j] + 1;
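The hold-back rule can be exercised with a small simulation. The class name `CausalProcess`, the packet layout `(m, vector, sender index)` and the three-process scenario are hypothetical; the delivery condition is the one given above. For brevity the sender does not record its own message in `delivered`.

```python
class CausalProcess:
    def __init__(self, index, n):
        self.i = index
        self.v = [0] * n        # vector timestamp, all entries start at 0
        self.hold_back = []
        self.delivered = []

    def co_multicast(self, m):
        """Increment the own entry and return the packet to B-multicast."""
        self.v[self.i] += 1
        return (m, list(self.v), self.i)

    def b_deliver(self, packet):
        """Queue the packet, then deliver everything that has become ready."""
        self.hold_back.append(packet)
        progress = True
        while progress:
            progress = False
            for p in list(self.hold_back):
                m, vj, j = p
                next_from_j = vj[j] == self.v[j] + 1
                seen_rest = all(vj[k] <= self.v[k]
                                for k in range(len(vj)) if k != j)
                if next_from_j and seen_rest:
                    self.hold_back.remove(p)
                    self.delivered.append(m)   # CO-deliver(m)
                    self.v[j] += 1
                    progress = True


p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
pkt1 = p0.co_multicast("m1")
p1.b_deliver(pkt1)
pkt3 = p1.co_multicast("m3")   # m3 causally follows m1
p2.b_deliver(pkt3)             # arrives first at p2: held back
p2.b_deliver(pkt1)             # releases m1, then m3
print(p2.delivered)
```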
Outline
Introduction
Distributed Mutual Exclusion
Election Algorithms
Group Communication
Consensus and Related Problems
Consensus introduction
Consensus problem
Agree on a value after one or more of the processes has proposed what that value should be
[Figure: a consensus algorithm run by three processes. P1 and P2 propose proceed (V1 := proceed, V2 := proceed); P3 proposes abort (V3 := abort) and crashes.]
Consensus (3)
Properties to satisfy:
Termination: Eventually each correct process
sets its decision variable
Agreement: the decision value of all correct
processes is the same:
Pi and Pj are correct ⇒ di = dj (i,j=1, …, N)
Integrity: If the correct processes all proposed
the same value, then any correct process in the
decided state has chosen that value
Consensus (4)
Consensus in a synchronous system:
Use of basic multicast
At most f processes may crash
f+1 rounds are necessary
Delay of one round is bounded by a timeout
Values_i^r: set of proposed values known to process pi at the beginning of round r
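Why f+1 rounds are needed can be seen in a small round-based simulation. The sketch assumes that each process multicasts in every round the values it has not sent before and finally decides on the minimum of its set; that decision rule, the function name `consensus` and the crash schedule format are assumptions, not taken from the slide.

```python
def consensus(proposals, f, crashes=None):
    """proposals: dict pid -> proposed value.
    crashes: dict pid -> (round, recipients reached before crashing).
    Returns the decision of every process that did not crash."""
    crashes = crashes or {}
    values = {p: {v} for p, v in proposals.items()}  # set known to each process
    sent = {p: set() for p in proposals}
    alive = set(proposals)
    for r in range(1, f + 2):                        # rounds 1 .. f+1
        inbox = {p: set() for p in alive}
        for p in sorted(alive):
            new = values[p] - sent[p]                # values not multicast yet
            sent[p] |= new
            if p in crashes and crashes[p][0] == r:
                targets = crashes[p][1]              # crash in mid-multicast
            else:
                targets = alive
            for q in targets:
                if q in inbox:
                    inbox[q] |= new
        alive -= {p for p, (cr, _) in crashes.items() if cr == r}
        for p in alive:
            values[p] |= inbox[p]
    return {p: min(values[p]) for p in alive}


proposals = {1: 3, 2: 5, 3: 1}
crash = {3: (1, {1})}       # process 3 reaches only process 1, then crashes
print(consensus(proposals, f=1, crashes=crash))
print(consensus(proposals, f=0, crashes=crash))
```

With f = 1 and two rounds both survivors decide the same value; with a single round process 1 knows a value that process 2 never receives, and the decisions differ.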
Consensus (5)
Interactive consistency problem: variant of the consensus
problem
Objective: correct processes must agree on a vector of values,
one for each process
Properties to satisfy:
Termination: Eventually each correct process sets its
decision variable
Agreement: the decision vector of all correct processes is
the same
Integrity: If Pi is correct, then all correct processes decide
on Vi as the ith component of their vector
Consensus (6)
Byzantine generals problem: variant of the consensus
problem
Objective: a distinguished process supplies a value that the
others must agree upon
Properties to satisfy:
Termination: Eventually each correct process sets its
decision variable
Agreement: the decision value of all correct processes is
the same:
Pi and Pj are correct ⇒ di = dj (i,j=1, …, N)
Integrity: If the commander is correct, then all correct
processes decide on the value that the commander
proposed
Consensus (7)
Byzantine agreement in a synchronous system:
Example : a system composed of three processes (must
[Figure: two scenarios, each with a Commander sending the values 0 and 1 to the other processes]
Consensus (8)
For m faulty processes, n ≥ 3m+1, where n denotes
the total number of processes
Interactive Consistency Algorithm: ICA(m), m>0, m denotes
the maximal number of processes that may fail simultaneously
Sender: all nodes must agree upon its value
Receivers: all other processes
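The bound n ≥ 3m+1 can be turned into a quick feasibility check. The helper names `tolerates` and `max_faulty` are hypothetical.

```python
def tolerates(n, m):
    """True if n processes are enough to mask m faulty ones (n >= 3m + 1)."""
    return n >= 3 * m + 1


def max_faulty(n):
    """Largest m that satisfies n >= 3m + 1."""
    return (n - 1) // 3


print(tolerates(3, 1), tolerates(4, 1), max_faulty(7))
```

Three processes cannot mask a single faulty one, while four can.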