cse.csusb.edu/tongyu/courses/cs660/notes/dmex.php
Lecture Notes
Dr. Tong Lai Yu, March 2010
0. Review and Overview
1. An Introduction to Distributed Systems
2. Deadlocks
3. Distributed Systems Architecture
4. Processes
5. Communication
6. Distributed OS Theories
7. Distributed Mutual Exclusions
8. Agreement Protocols
9. Distributed Scheduling
10. Distributed Resource Management
11. Recovery and Fault Tolerance
12. Security and Protection
Distributed Mutual Exclusion
Life consists not in holding good cards but in playing those you hold well.
Josh Billings
1. Introduction
A Centralized Algorithm
One process is elected as the coordinator.
Whenever a process wants to access a shared resource, it sends a request to the coordinator to ask for permission.
The coordinator may queue requests.
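The centralized scheme can be sketched in a few lines. This is a minimal single-machine illustration, not a networked implementation: the class name `Coordinator` and its methods `request` and `release` are hypothetical names chosen for this sketch, and message passing is replaced by direct method calls.

```python
from collections import deque


class Coordinator:
    """Grants one shared resource to one process at a time (illustrative sketch)."""

    def __init__(self):
        self.holder = None       # process currently holding the resource
        self.waiting = deque()   # queued requests, first come first served

    def request(self, pid):
        """Return True if permission is granted now, False if the request is queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release the resource and return the next holder (None if nobody waits)."""
        if pid != self.holder:
            raise ValueError("only the current holder may release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

In this sketch a queued process simply learns that it has become the holder when `release` returns its id; in a real system the coordinator would send it a grant message at that point.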
Decentralized
non-token-based
token-based
Requirements of Mutual Exclusion Algorithms
Only one request accesses the CS (critical section) at a time (primary goal)
Freedom from deadlocks
Freedom from starvation
Fairness
Fault tolerance
Performance of a mutual exclusion algorithm
System throughput S (rate at which the system executes requests for the CS)
$$S = \frac{1}{S_d + E}$$
$S_d$ = synchronization delay
$E$ = average execution time
low-load and high-load performance
best-case and worst-case performance; if performance fluctuates statistically, take the average
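As a worked instance of the throughput formula, assume (purely for illustration; these numbers are not from the notes) a synchronization delay of 2 ms and an average execution time of 8 ms:

```latex
S = \frac{1}{S_d + E}
  = \frac{1}{0.002\,\mathrm{s} + 0.008\,\mathrm{s}}
  = 100 \text{ CS requests per second}
```

Halving the synchronization delay to 1 ms would raise S only to about 111 requests per second, because the execution time E dominates the denominator in this example.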
2. Election Algorithms
Principle
An algorithm requires that some process acts as a coordinator. The question is how to select this special process dynamically.
Note
In many systems the coordinator is chosen by hand (e.g. file servers). This leads to centralized solutions => single point of failure.
After a network partition, the leaderless partition must elect a leader.
Election by bullying
Principle
Each process has an associated priority (weight). The process with the highest priority should always be elected as the coordinator.
Issue
How do we find the heaviest process?
Any process can just start an election by sending an election message to all other processes with higher numbers.
If a process Pheavy receives an election message from a lighter process Plight, it sends a take-over message to Plight. Plight is out of the race.
If a process doesn't get a take-over message back, it wins, and sends a victory message to all other processes.
(a) Process 4 holds an election.
(b) Processes 5 and 6 respond, telling 4 to stop.
(c) Now 5 and 6 each hold an election.
(d) Process 6 tells 5 to stop.
(e) Process 6 wins and tells everyone.
Issue
Suppose a crashed node comes back online:
It sends a new election message to higher-numbered processes.
This repeats until only one process is left standing.
That process announces victory by sending a message saying that it is the coordinator (if it is not already the coordinator).
The existing (lower-numbered) coordinator yields.
Hence the term 'bully'.
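The outcome of a bully election can be sketched as follows. This is a simplified, sequential illustration under stated assumptions: the function name `bully_election` is hypothetical, process ids double as priorities, crashes are modelled by a fixed set `alive`, and instead of simulating every concurrent election it follows one chain of take-overs, which is enough to find the winner.

```python
def bully_election(initiator, alive):
    """Return the coordinator elected when `initiator` starts an election.

    `alive` is the set of process ids that are currently up; a higher id
    means a higher priority (weight).
    """
    if initiator not in alive:
        raise ValueError("initiator must be alive")
    candidate = initiator
    while True:
        # The candidate sends ELECTION to every higher-numbered process.
        heavier = [p for p in alive if p > candidate]
        if not heavier:
            # No take-over message comes back: the candidate wins.
            return candidate
        # Each live heavier process answers with a take-over message and
        # starts its own election; we follow the lightest of them.
        candidate = min(heavier)
```

With processes 0 to 6 alive and process 4 starting the election, the sketch returns 6, matching the five-step figure described above.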
Election in a ring
Principle
Process priority is obtained by organizing processes into a (logical) ring. The process with the highest priority should be elected as coordinator.
Any process can start an election by sending an election message to its successor. If a successor is down, the message is passed on to the next successor.
If a message is passed on, the sender adds itself to the list. When it gets back to the initiator, everyone had a chance to make its presence known.
The initiator sends a coordinator message around the ring containing a list of all living processes. The one with the highest priority is elected as coordinator.
2 and 5 start election messages independently.
Eventually, both messages will go all the way around.
Both messages continue to circulate.
2 and 5 will convert ELECTION messages to COORDINATOR messages.
All processes recognize the highest-numbered process as the new coordinator.
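One circulation of the election message can be sketched like this. The function name `ring_election` and its parameters are hypothetical; the ring is a plain list in ring order, crashed processes are modelled by their absence from `alive`, and the process id is used as the priority.

```python
def ring_election(ring, initiator, alive):
    """Circulate an ELECTION message once around a logical ring.

    `ring` lists process ids in ring order and `alive` is the set of live ids.
    Returns (coordinator, members), where `members` is the list collected
    by the message in the order the processes added themselves.
    """
    if initiator not in alive:
        raise ValueError("initiator must be alive")
    n = len(ring)
    members = [initiator]
    pos = (ring.index(initiator) + 1) % n
    while ring[pos] != initiator:
        if ring[pos] in alive:          # a dead successor is skipped
            members.append(ring[pos])
        pos = (pos + 1) % n
    # Back at the initiator: the highest-numbered live process is chosen.
    return max(members), members
```

Running the sketch with initiator 2 and again with initiator 5 on the same ring yields different member lists but the same coordinator, which is consistent with the figure: both COORDINATOR messages name the same highest-numbered process.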
Question
Does it matter if two processes initiate an election?
Question
What happens if a process crashes during the election?
Superpeer election
Issue
How can we select superpeers such that:
Normal nodes have low-latency access to superpeers
Superpeers are evenly distributed across the overlay network
There is a predefined fraction of superpeers
Each superpeer should not need to serve more than a fixed number of normal nodes
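DHTs
Reserve a fixed part of the ID space for superpeers. Example: if S superpeers are needed for a system that uses m-bit identifiers, simply reserve the $k = \lceil \log_2 S \rceil$ leftmost bits for superpeers. With N nodes, we'll have, on average, $2^{k-m} N$ superpeers.
Routing to superpeer
Send a message for key p to the node responsible for
p AND 11...1100...00
(the mask has the k leftmost bits set to 1 and the remaining m - k bits set to 0)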
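The bit arithmetic behind the DHT scheme can be sketched as follows. The function names are hypothetical helpers for this illustration; identifiers are modelled as plain Python integers of m bits.

```python
def bits_for_superpeers(s):
    """k = ceil(log2(s)), computed with integer arithmetic (s >= 1)."""
    if s < 1:
        raise ValueError("need at least one superpeer")
    return (s - 1).bit_length()


def superpeer_mask(m, k):
    """m-bit mask whose k leftmost bits are 1: 11...1100...00."""
    if not 0 <= k <= m:
        raise ValueError("k must lie between 0 and m")
    return ((1 << k) - 1) << (m - k)


def superpeer_key(p, m, k):
    """Key of the superpeer responsible for key p: p AND 11...1100...00."""
    return p & superpeer_mask(m, k)


def expected_superpeers(n, m, k):
    """Average number of superpeers among n nodes: 2^(k-m) * n."""
    return n * 2 ** (k - m)
```

For example, with m = 8 and k = 3 the mask is 0b11100000, so key 0b10110101 is routed to the node responsible for 0b10100000.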
Node Positioning Approach
N tokens are spread across N randomly chosen nodes.
No node can hold more than 1 token.
Each token represents a repelling force.
If the force on a token-holder node exceeds a threshold, the token moves away.
Eventually, the tokens will spread evenly across the network.
Election in Wireless Environment
A node waits for its neighbours' replies before replying to its parent.
Node a is the source. (a) is the initial network.
(b)-(e) Tree-building phase.
(f) Reporting of the best node to the source.
3. Non-token-based algorithms
Lamport's Algorithm
Si: site i; there are N sites
each site maintains a request set
Ri = {S1, S2, ..., SN}
request_queue_i: contains mutual exclusion requests ordered by their timestamps,
using the total order relation => (based on Lamport's clock)
ts_i: timestamp of site i
Assume
messages are received in the same order as they are sent
eventually every message is received
1. To request entering the CS, process Pi sends a REQUEST(ts_i, i) message to every process (including itself) and puts the request on request_queue_i.
2. When process Pj receives REQUEST(ts_i, i), it places it on its request_queue_j and sends a timestamped REPLY (acknowledgement) to Pi.
3. Process Pi enters the CS when the following 2 conditions are satisfied:
Pi's request is at the head of request_queue_i
Pi has received a REPLY message from every other process timestamped later than ts_i
4. When exiting the CS, process Pi removes its request from the head of its request queue and sends a timestamped RELEASE to every other process.
5. When Pj receives a RELEASE from Pi, it removes Pi's request from its request queue.
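The five steps above can be sketched as a small single-process simulation. Everything here is an illustrative assumption rather than part of the notes: the class `Site`, the helpers `broadcast_request` and `broadcast_release`, zero-based site ids, and above all the fact that messages are delivered immediately by direct method calls (which trivially satisfies the FIFO and eventual-delivery assumptions).

```python
import heapq


class Site:
    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.clock = 0            # Lamport logical clock
        self.queue = []           # heap of (timestamp, pid) requests
        self.reply_ts = {}        # timestamp of the latest REPLY from each site
        self.my_request = None

    def tick(self, received_ts=0):
        """Advance the clock past any timestamp just received."""
        self.clock = max(self.clock, received_ts) + 1
        return self.clock

    def request(self):                      # step 1
        self.my_request = (self.tick(), self.pid)
        heapq.heappush(self.queue, self.my_request)
        return self.my_request

    def on_request(self, ts, sender):       # step 2: queue it, send a REPLY
        heapq.heappush(self.queue, (ts, sender))
        return self.tick(ts)                # timestamp carried by the REPLY

    def on_reply(self, ts, sender):
        self.tick(ts)
        self.reply_ts[sender] = ts

    def can_enter(self):                    # step 3: both entry conditions
        if self.my_request is None or self.queue[0] != self.my_request:
            return False
        ts = self.my_request[0]
        others = (p for p in range(self.n) if p != self.pid)
        return all(self.reply_ts.get(p, -1) > ts for p in others)

    def release(self):                      # step 4
        heapq.heappop(self.queue)           # own request is at the head
        self.my_request = None
        return self.tick()                  # timestamp carried by the RELEASE

    def on_release(self, ts, sender):       # step 5
        self.tick(ts)
        self.queue = [r for r in self.queue if r[1] != sender]
        heapq.heapify(self.queue)


def broadcast_request(sites, src):
    """Send REQUEST from sites[src] to all others and deliver their REPLYs."""
    ts, pid = sites[src].request()
    for other in sites:
        if other.pid != pid:
            sites[src].on_reply(other.on_request(ts, pid), other.pid)


def broadcast_release(sites, src):
    """Send RELEASE from sites[src] to all other sites."""
    ts = sites[src].release()
    for other in sites:
        if other.pid != src:
            other.on_release(ts, src)
```

Because requests are stored as (timestamp, pid) tuples, Python's tuple comparison gives the total order used by the algorithm: ties on the timestamp are broken by the site id.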
Performance
for each CS invocation
(N - 1) REQUEST
(N - 1) REPLY
(N - 1) RELEASE
total 3(N - 1) messages
synchronization delay Sd = average message delay
Ricart and Agrawala optimized Lamport's algorithm by merging the RELEASE and REPLY messages, which reduces the cost to 2(N - 1) messages per CS invocation.
Example:
(a) Two processes want to access a shared resource at the same time.
(b) Process 0 has the lowest timestamp, so it wins.
(c) When process 0 is done, it sends an OK also, so 2 can now go ahead.
Maekawa's Voting Algorithm
Voting Algorithms:
Lamport's algorithm requires a process to get permission from all other processes, which is overkill.
A different approach is to let processes compete for votes. If a process has received more votes than any other process, it can enter the CS. If it does not have enough votes, it waits until the process in the CS is done and releases its votes.
Quorums have the property that any two groups have a nonempty intersection.
Simple majorities are quorums: any 2 sets whose sizes are simple majorities must have at least one element in common.
12 nodes, so majority is 7
Grid quorum: arrange nodes in a logical grid (square). A quorum is all of a row and all of a column. Quorum size is $2\sqrt{N} - 1$.
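The grid construction is easy to check mechanically. The sketch below is an illustration with a hypothetical function name `grid_quorums`; it assumes N is a perfect square and numbers the nodes 0 to N - 1 row by row.

```python
import math


def grid_quorums(n):
    """Map each node to its grid quorum: its whole row plus its whole column."""
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("n must be a perfect square")
    quorums = {}
    for node in range(n):
        r, c = divmod(node, side)
        row = {r * side + j for j in range(side)}
        col = {i * side + c for i in range(side)}
        quorums[node] = row | col
    return quorums
```

Any two quorums intersect because the row of one node always crosses the column of the other; for N = 16 each quorum has 2 * 4 - 1 = 7 members, compared with 9 for a simple majority.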
Principles:
To get access to a CS, not all processes have to agree
It suffices to split the set of processes up into subsets ("voting sets") that overlap
It suffices that there is consensus within every subset
When a process wishes to enter the CS, it sends a vote request to every member of its voting district.
When the process receives replies from all the members of the district, it can enter the CS.
When a process receives a vote request, it responds with a "YES" vote if it has not already cast its vote.
When a process exits the CS, it informs the voting district, which can then vote for other candidates.
Deadlock is possible.
Request sets
$N = \{1, 2, \ldots, N\}$
$R_i \cap R_j \neq \emptyset$ for all $i, j \in N$
A site can send a REPLY (LOCKED) message only if it has not been LOCKED (i.e. has not cast its vote).
Properties:
1. $R_i \cap R_j \neq \emptyset$
2. $S_i \in R_i$
3. $|R_i| = K$ for all $i \in N$
4. any site $S_i$ is in K of the $R_i$'s
Maekawa found that:
$N = K(K - 1) + 1$
or $K = |R_i| \approx \sqrt{N}$
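The square-root approximation follows from solving Maekawa's relation N = K(K - 1) + 1 for K, as this short derivation shows (the N = 13 check is included only as an illustration):

```latex
N = K(K - 1) + 1
\;\Longrightarrow\; K^2 - K + (1 - N) = 0
\;\Longrightarrow\; K = \frac{1 + \sqrt{4N - 3}}{2} \approx \sqrt{N}
\qquad
\text{e.g. } N = 13:\; K = \frac{1 + \sqrt{49}}{2} = 4
```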
Message exchange:
FAILED (F): Sj cannot grant permission to Sk because Sj has granted permission to a site with higher request priority.
INQUIRE (I): Sj wants to find out if Sk has successfully locked all sites (the outstanding grant to Sk has a lower priority than the new request).
YIELD (Y): Sj yields to Sk (Sj has received a FAILED message from some other site, or Sj has sent a YIELD to some other site but has not received a new grant).
(A request's priority is determined by its sequence number (timestamp): the smaller the sequence number, the higher the priority. If the sequence numbers are the same, the request with the smaller site number has higher priority.)
Algorithm:
1. A site Si requests access to the CS by sending REQUEST(i) messages to all the sites in its request set Ri.
2. When a site Sj receives the REQUEST(i) message, it sends a REPLY(j) message to Si provided it hasn't sent any REPLY to any site since the last RELEASE. Otherwise, it queues up the REQUEST.
3. Site Si can access the CS only after it has received a REPLY from all sites in Ri.
Deadlock Handling:
1. When a REQUEST(i) from Si blocks at site Sj because Sj has currently granted permission to site Sk, Sj sends a FAILED(j) message to Si if Si has lower priority. Otherwise Sj sends an INQUIRE(j) message to Sk.
2. In response to an INQUIRE(j) from Sj, site Sk sends YIELD(k) to Sj, provided Sk has received a FAILED message or has sent a YIELD to another site but has not received a new REPLY from it.
3. In response to a YIELD(k) message from Sk, site Sj assumes it has been released by Sk, places the request of Sk at the appropriate location in the request queue, and sends a REPLY(j) to the top request's site in the queue.
Example
13 nodes, 13 = 4(4 - 1) + 1, thus K = 4
R1 = {1, 2, 3, 4}
R2 = {2, 5, 8, 11}
R3 = {3, 6, 8, 13}
R4 = {4, 6, 10, 11}
R5 = {1, 5, 6, 7}
R6 = {2, 6, 9, 12}
R7 = {2, 7, 10, 13}
R8 = {1, 8, 9, 10}
R9 = {3, 7, 9, 11}
R10 = {3, 5, 10, 12}
R11 = {1, 11, 12, 13}
R12 = {4, 7, 8, 12}
R13 = {4, 5, 9, 13}
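The thirteen request sets can be checked against Maekawa's four properties with a short script. The sets are copied from the example; the function name `check_maekawa` and the result keys are hypothetical names introduced for this sketch.

```python
from itertools import combinations

# Request sets of the 13-site example (K = 4).
R = {
    1: {1, 2, 3, 4},    2: {2, 5, 8, 11},    3: {3, 6, 8, 13},
    4: {4, 6, 10, 11},  5: {1, 5, 6, 7},     6: {2, 6, 9, 12},
    7: {2, 7, 10, 13},  8: {1, 8, 9, 10},    9: {3, 7, 9, 11},
    10: {3, 5, 10, 12}, 11: {1, 11, 12, 13}, 12: {4, 7, 8, 12},
    13: {4, 5, 9, 13},
}


def check_maekawa(R, K):
    """Evaluate the four request-set properties; returns one boolean per property."""
    return {
        # 1. every pair of request sets overlaps
        "pairwise_overlap": all(R[i] & R[j] for i, j in combinations(R, 2)),
        # 2. every site belongs to its own request set
        "self_membership": all(i in R[i] for i in R),
        # 3. every request set has exactly K members
        "equal_size": all(len(R[i]) == K for i in R),
        # 4. every site appears in exactly K request sets
        "equal_load": all(sum(s in R[i] for i in R) == K for s in R),
    }
```

Calling `check_maekawa(R, 4)` reports all four properties as satisfied for these sets.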
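Suppose sites 11, 8, and 7 want to enter the CS; they all send requests with sequence number 1 (7 has the highest priority, 8 the next, 11 the lowest). Below, R denotes a REQUEST message.
1. Site 11 wants to enter the CS; its requests have arrived at 12 and 13, and R to 1 is on the way.
2. 7 wants to enter the CS; R has arrived at 2 and 10, but R to 13 is on its way.
3. 8 also wants to enter the CS; it sends R to 1, 9, and 10, but fails to lock 10 because 10 has been locked by 7, which has higher priority.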
4. R from 11 finally arrives at 1, and R from 7 arrives at 13.
11, 7, and 8 are circularly locked:
8 receives F and cannot enter the CS
11 receives F and cannot enter the CS
7 cannot enter the CS because it has not received all REPLY (LOCKED) messages
8. 13 is locked by 11 (which has lower priority than 7) and receives a request from 7, so it sends an INQUIRE to 11 to ask it to yield.
9. When 11 receives the INQUIRE, it knows that it cannot enter the CS; therefore it sends a YIELD to 13.
10. Then 13 can send L (LOCKED) to 7, which enters the CS.
11. When 7 finishes, it sends RELEASE.
12. Then 8 locks all its members, ..., and sends RELEASE.
13. Then 11 enters.
4. Token-based algorithms
Principles
one token, shared among all sites
a site can enter its CS iff it holds the token
The major difference among the algorithms is the way the token is searched for
use sequence numbers instead of timestamps
o used to distinguish requests from the same site
o kept independently for each site
o use the sequence number to distinguish between old and current requests
The proof of mutual exclusion is trivial
The proof of other issues (deadlock and starvation) may be less so
(a) An unordered group of processes on a network.
(b) A logical ring connected in software.
a) Suzuki-Kasami's Broadcast Algorithm
TOKEN: a special PRIVILEGE message
a node that owns the TOKEN can enter the CS
initially node 1 has the TOKEN
a node holding the TOKEN can execute the CS repeatedly if no request from other nodes arrives
if a node wants the TOKEN, it broadcasts a REQUEST message to all other nodes
node:
REQUEST(j, n)
node j is requesting its n-th CS invocation
n = 1, 2, 3, ..., sequence #
node i receives REQUEST from j
update RNi[j] = max(RNi[j], n)
RNi[j] = largest seq # received so far from node j
TOKEN:
TOKEN(Q, LN) (suppose at node i)
Q: queue of requesting nodes
LN: array of size N such that
LN[j] = the seq # of the request of node j granted most recently
When node i finishes executing the CS, it does the following:
1. set LN[i] = RNi[i] to indicate that the current request of node i has been granted (executed)
2. every node k such that
RNi[k] > LN[k]
(i.e. node k is requesting) is appended to Q if it is not already there
When these updates are complete, if Q is not empty, the front node is deleted from Q and the TOKEN is sent there
Requests are served FCFS (first come, first served)
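The bookkeeping with RN, LN and Q can be sketched as below. This is an illustration under several assumptions that are not in the notes: the class and method names are hypothetical, nodes are numbered from 0, messages are delivered by direct method calls, and an idle token holder hands the token over as soon as an outstanding request arrives (an inference from the way the token leaves an idle node in the notes' example).

```python
from collections import deque


class Token:
    """The PRIVILEGE message TOKEN(Q, LN)."""

    def __init__(self, n):
        self.Q = deque()       # queue of requesting nodes
        self.LN = [0] * n      # LN[j]: seq # of node j's most recently granted request


class Node:
    def __init__(self, i, n, token=None):
        self.i = i
        self.RN = [0] * n      # RN[j]: largest seq # received so far from node j
        self.token = token
        self.in_cs = False

    def request(self):
        """Start a new CS request; returns REQUEST(i, n) to broadcast."""
        self.RN[self.i] += 1
        return (self.i, self.RN[self.i])

    def on_request(self, j, n):
        """Handle REQUEST(j, n); returns (j, token) if an idle token moves to j."""
        self.RN[j] = max(self.RN[j], n)
        idle_holder = self.token is not None and not self.in_cs
        if idle_holder and self.RN[j] > self.token.LN[j]:
            token, self.token = self.token, None
            return (j, token)
        return None

    def on_token(self, token):
        """Receive the token and enter the CS."""
        self.token = token
        self.in_cs = True

    def exit_cs(self):
        """Leave the CS; returns (k, token) if the token is passed on to node k."""
        self.in_cs = False
        t = self.token
        t.LN[self.i] = self.RN[self.i]           # own request has been granted
        for k, rn in enumerate(self.RN):
            if rn > t.LN[k] and k not in t.Q:    # node k has an outstanding request
                t.Q.append(k)
        if t.Q:
            k = t.Q.popleft()                    # FCFS: serve the front of the queue
            self.token = None
            return (k, t)
        return None
```

The sketch does not model a token holder that wants to re-enter the CS itself; in that case no REQUEST is needed and the node can enter directly.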
Example:
There are three processes, p1, p2, and p3.
p1 and p3 seek mutually exclusive access to a shared resource.
Initially: the token is at p2 and the token's state is LN = [0, 0, 0] and Q empty
p1's state is: n1 (seq #) = 0, RN1 = [0, 0, 0]
p2's state is: n2 = 0, RN2 = [0, 0, 0]
p3's state is: n3 = 0, RN3 = [0, 0, 0]
p1 sends REQUEST(1, 1) to p2 and p3. p1: n1 = 1, RN1 = [1, 0, 0]
p3 sends REQUEST(3, 1) to p1 and p2. p3: n3 = 1, RN3 = [0, 0, 1]
p2 receives REQUEST(1, 1) from p1. p2: n2 = 0, RN2 = [1, 0, 0], holding token
p2 sends the token to p1
p1 receives REQUEST(3, 1) from p3. p1: n1 = 1, RN1 = [1, 0, 1]
p3 receives REQUEST(1, 1) from p1. p3: n3 = 1, RN3 = [1, 0, 1]
p1 receives the token from p2
p1 enters the critical section
p1 exits the critical section and sets the token's state to LN = [1, 0, 0] and Q = (3)
p1 sends the token to p3. p1: n1 = 1, RN1 = [1, 0, 1]; the token's state is LN = [1, 0, 0] and Q empty
p3 receives the token from p1. p3: n3 = 1, RN3 = [1, 0, 1], holding token
p3 enters the critical section
p3 exits the critical section and sets the token's state to LN = [1, 0, 1] and Q empty
Performance:
It requires at most N message exchanges per CS execution ((N - 1) REQUEST messages + 1 TOKEN message),
or 0 messages if the TOKEN is already at the requesting site
synchronization delay is 0 or T (T = average message delay)
deadlock free (because of the TOKEN requirement)
no starvation (i.e. a requesting site enters the CS in finite time)
Comparison of Lamport and Suzuki-Kasami Algorithms
The essential difference is in who keeps the queue. In one case every site keeps its own local copy of the queue. In the other case, the queue is passed around within the token.