CH 4 Synchronization Models of Memory Consistency

CS2354 Advanced Computer Architecture
Unit III Synchronization & Memory consistency Models
Synchronization
Why do we need synchronization?
On a multiprocessor architecture, we need to know when it is safe for different processes to use shared data.
When is synchronization important?
Whenever different processors use shared data, we must have some mechanism to coordinate their accesses.
Synchronization
Synchronization mechanisms are typically built from user-level software routines that rely on hardware-supplied synchronization instructions.
For small multiprocessors, the key hardware capability is an uninterruptible instruction (or instruction sequence) that fetches and updates a memory location.
Because the read and the write happen as one indivisible step, this is referred to as an atomic operation.
In large-scale multiprocessors, synchronization can become a bottleneck.
Synchronization
Several techniques have been proposed to reduce contention and latency in synchronization.
We first examine the hardware primitives used to implement synchronization, and then construct synchronization routines from them.
Synchronization
Hardware primitives
The basic requirement for implementing synchronization in a multiprocessor is a set of hardware primitives that can atomically read and modify a memory location (in one step).
One such primitive atomically interchanges a value in a register with a value in memory; it is referred to as atomic exchange.
The key property of these primitives is that they read and update a memory value atomically, i.e. as one indivisible step rather than a sequence of separate steps.
Synchronization
In a simple lock:
Value 0 means the lock is free.
Value 1 means the lock is unavailable.
For example, suppose two processors try to do the exchange simultaneously. The race is broken because exactly one of the processors performs the exchange first and gets 0 back; the second processor gets 1 when it does its exchange.
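The behaviour described above can be sketched with C++ `std::atomic`, whose `exchange` member is an atomic exchange. This is only an illustration under assumptions: the helper names `try_acquire` and `release` are hypothetical and do not come from the slides, and the 0/1 encoding follows the slide.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not from the slides): try once to take the lock.
// Encoding follows the slide: 0 = free, 1 = unavailable.
bool try_acquire(std::atomic<int>& lock) {
    // exchange() atomically writes 1 and returns the previous value.
    int previous = lock.exchange(1);
    return previous == 0;  // 0 means we were first and now hold the lock
}

// Hypothetical helper: free the lock by writing 0 back.
void release(std::atomic<int>& lock) {
    lock.store(0);
}
```

Calling `try_acquire` twice on a free lock succeeds the first time and fails the second time, which mirrors the two racing processors: whichever exchange is ordered first sees 0, the other sees 1.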
Synchronization
Atomic exchange: interchange a value in a register for a value in memory.
    0: synchronization variable is free
    1: synchronization variable is locked and unavailable
    Set the register to 1 and swap.
    The new value in the register determines success in getting the lock:
        0 if you succeeded in setting the lock (you were first)
        1 if another processor had already claimed access
    The key is that the exchange operation is indivisible.
In addition to atomic exchange, older multiprocessors offered other atomic primitives:
Test-and-set: tests a value and sets it if the value passes the test.
Fetch-and-increment: returns the value of a memory location and atomically increments it.
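As a rough illustration of the last two primitives, the sketch below uses the C++ standard library: `std::atomic_flag::test_and_set` and `std::atomic<int>::fetch_add`. Note one assumption: `test_and_set` in C++ sets the flag unconditionally and returns its previous state, which is the common form of the primitive rather than a general "test, then set". The function names `test_and_set` and `draw_tickets` are made up for this example.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Test-and-set in its common form: set the flag and report whether it
// was already set before this call.
bool test_and_set(std::atomic_flag& flag) {
    return flag.test_and_set();
}

// Hypothetical demo of fetch-and-increment: several threads draw tickets
// from one shared counter. Returns all drawn tickets in sorted order.
std::vector<int> draw_tickets(int num_threads, int per_thread) {
    std::atomic<int> next_ticket{0};
    std::vector<std::vector<int>> drawn(num_threads);  // one slot per thread
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&next_ticket, &drawn, t, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // fetch_add returns the old value and increments atomically.
                drawn[t].push_back(next_ticket.fetch_add(1));
            }
        });
    }
    for (auto& w : workers) w.join();

    std::vector<int> all;
    for (const auto& d : drawn) all.insert(all.end(), d.begin(), d.end());
    std::sort(all.begin(), all.end());
    return all;
}
```

Because each increment is indivisible, no ticket is handed out twice and none is skipped, however the threads interleave.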
Synchronization
To acquire a lock, a processor exchanges a 1 held in a register with the memory location that holds the lock.
The exchange returns 1 if some other processor had already claimed access (the lock is held and unavailable); otherwise it returns 0, and the lock now belongs to this processor.
Uninterruptible Instruction to Fetch and Update Memory

It is hard to have a read and a write in one instruction, so use two instead: load linked (or load locked) + store conditional.
Load linked returns the initial value.
Store conditional returns 1 if it succeeds (no other store to the same memory location since the preceding load linked) and 0 otherwise.

Example doing atomic swap with LL & SC:
    try:  mov   R3,R4       ; mov exchange value
          ll    R2,0(R1)    ; load linked
          sc    R3,0(R1)    ; store conditional
          beqz  R3,try      ; branch store fails (R3 = 0)
          mov   R4,R2       ; put load value in R4
At the end of this sequence, the contents of R4 and the memory location specified by R1 have been atomically exchanged.
Any time another processor intervenes and modifies the value in memory between the LL and SC instructions, the SC returns 0 in R3, causing the code sequence to try again.

Another Example doing fetch & increment with LL & SC:
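    try:  ll    R2,0(R1)    ; load linked
          addi  R3,R2,#1    ; increment (OK if reg-reg)
          sc    R3,0(R1)    ; store conditional
          beqz  R3,try      ; branch store fails (R3 = 0)

The incremented value is kept in R3 so that the fetched value in R2 is not overwritten by the SC result.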
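C++ has no direct load-linked/store-conditional, but the same "read, compute, conditionally store, retry on failure" shape can be sketched with `compare_exchange_weak`. This is an analogy under assumptions, not the slides' method: compare-and-swap succeeds when the location still holds the expected value, whereas SC fails on any intervening store. The function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Fetch-and-increment written as a retry loop.
int fetch_and_increment(std::atomic<int>& location) {
    int observed = location.load();  // plays the role of LL
    // Plays the role of SC plus the branch: the store happens only if
    // `location` still holds `observed`. On failure, `observed` is
    // refreshed with the current value and the loop tries again.
    while (!location.compare_exchange_weak(observed, observed + 1)) {
    }
    return observed;  // value before the increment
}

// Atomic swap written with the same retry pattern.
int atomic_swap(std::atomic<int>& location, int new_value) {
    int observed = location.load();
    while (!location.compare_exchange_weak(observed, new_value)) {
    }
    return observed;  // value that was replaced
}
```

The weak form is used because it is allowed to fail spuriously, which is why it must sit in a loop, much like the `beqz ...,try` branch in the assembly versions.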
Implementing locks using coherence (reverse of atomic exchange)

Spin locks: the processor continuously tries to acquire the lock, spinning around a loop until it gets it.

            li    R2,#1
    lockit: exch  R2,0(R1)    ; atomic exchange
            bnez  R2,lockit   ; already locked?

Problem: the exchange includes a write, which invalidates all other copies; this generates considerable bus traffic.
Solution: start by simply repeatedly reading the variable; when it changes, then try the exchange (test and test&set):

    try:    li    R2,#1
    lockit: lw    R3,0(R1)    ; load var
            bnez  R3,lockit   ; not 0: not free, so spin
            exch  R2,0(R1)    ; atomic exchange
            bnez  R2,try      ; already locked?
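The test and test&set idea above can be sketched as a small C++ spin lock. This is a minimal illustration under assumptions: the class name `TtasLock`, the helper `count_with_lock`, and the choice of memory orders are not from the slides.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical test and test&set spin lock (0 = free, 1 = held).
class TtasLock {
public:
    void lock() {
        for (;;) {
            // Test: spin on a plain read, which can be satisfied from the
            // local cached copy and causes no invalidations.
            while (state_.load(std::memory_order_relaxed) != 0) {
            }
            // Test&set: only now attempt the exchange, which writes.
            if (state_.exchange(1, std::memory_order_acquire) == 0) {
                return;  // exchange returned 0: we got the lock
            }
        }
    }
    void unlock() {
        state_.store(0, std::memory_order_release);  // free the lock
    }

private:
    std::atomic<int> state_{0};
};

// Hypothetical demo: threads increment a plain counter under the lock.
long count_with_lock(int num_threads, int increments_per_thread) {
    TtasLock lock;
    long counter = 0;  // ordinary variable, protected only by the lock
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

If the lock works, the final count equals the number of threads times the increments per thread; dropping the lock around `++counter` would make that result unreliable.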
Lock&Unlock: Test&Set
    /* Test&Set */
    ==============
            loadi R2, #1
    lockit: exch  R2, location   /* atomic operation */
            bnez  R2, lockit     /* test */

    unlock: store location, #0   /* free the lock (write 0) */
Lock&Unlock: Test and Test&Set

    /* Test and Test&Set */
    =======================
    lockit: load  R2, location   /* read lock variable */
            bnez  R2, lockit     /* check value */
            loadi R2, #1
            exch  R2, location   /* atomic operation */
            bnez  R2, lockit     /* if lock is not acquired, repeat */

    unlock: store location, #0   /* free the lock (write 0) */
Lock&Unlock: Load-Linked and Store-Conditional

    /* Load-linked and Store-Conditional */
    =======================================
    lockit: ll    R2, location   /* load-linked read */
            bnez  R2, lockit     /* if busy, try again */
            loadi R2, #1
            sc    location, R2   /* conditional store */
            beqz  R2, lockit     /* if sc unsuccessful, try again */

    unlock: store location, #0   /* store 0 */
Another MP Issue: Memory Consistency Models
What is consistency? When must a processor see a consistent view of memory? For example, consider:

    P1:  A = 0;                P2:  B = 0;
         .....                      .....
         A = 1;                     B = 1;
    L1:  if (B == 0) ...       L2:  if (A == 0) ...

It seems impossible for both if statements L1 and L2 to be true. Is it?
The processes are assumed to be running on different processors, and locations A and B are originally cached by both processors with initial value 0.
If writes always take immediate effect and are immediately seen by other processors, it is impossible for both conditions (L1 and L2) to be true.
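What if the write invalidate is delayed and the processor continues?
Reaching an if statement means that either A or B must already have been assigned the value 1.
But suppose the invalidate is delayed. Then it is possible that neither P1 nor P2 has seen the invalidation for B and A (respectively) before it reads the value, so both conditions can evaluate to true.
Whether this behavior is allowed, and under what conditions, is decided by the memory consistency model.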
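The two-processor example can be tried out with C++ atomics. The sketch below is an illustration under assumptions (the function name `count_both_true` and the use of threads to stand in for processors are not from the slides). It uses `std::memory_order_seq_cst`, for which the C++ memory model guarantees a single total order of these operations, so both conditions can never be true in the same round.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the P1/P2 example `rounds` times and counts the rounds in which
// BOTH if-conditions (L1 and L2) were true.
int count_both_true(int rounds) {
    int both_true = 0;
    for (int r = 0; r < rounds; ++r) {
        std::atomic<int> A{0};
        std::atomic<int> B{0};
        bool l1_true = false;
        bool l2_true = false;

        std::thread p1([&] {
            A.store(1, std::memory_order_seq_cst);               // A = 1
            l1_true = (B.load(std::memory_order_seq_cst) == 0);  // L1
        });
        std::thread p2([&] {
            B.store(1, std::memory_order_seq_cst);               // B = 1
            l2_true = (A.load(std::memory_order_seq_cst) == 0);  // L2
        });
        p1.join();
        p2.join();

        if (l1_true && l2_true) {
            ++both_true;
        }
    }
    return both_true;
}
```

If the four operations used `std::memory_order_relaxed` instead, the language would no longer forbid the outcome where both conditions are true, which corresponds to the delayed-invalidate scenario described above.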

Rules for these cases: three consistency models
1. Strict consistency model
2. Sequential consistency model
3. Relaxed consistency model

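Strict Consistency Model
    P1: write(variable) at time t1
    P2: read(variable) at time t2
Any read of a variable returns the value of the most recent write to it: if t2 is later than t1, P2's read returns the value P1 wrote.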
Sequential consistency model: the result of any execution is the same as if the accesses of each processor were kept in order and the accesses among different processors were interleaved.
In the example above, this means the assignments take effect before the if statements are evaluated.
Implementing SC: delay all memory accesses until all invalidates are done.
Relaxed consistency model: allows reads and writes to complete out of order, but uses synchronization operations to enforce ordering.
Relaxed Consistency Models: The Basics
Key idea: allow reads and writes to complete out of order, but use synchronization operations to enforce ordering, so that a synchronized program behaves as if the processor were sequentially consistent.
Relaxed Consistency Models: The Basics
By relaxing these orderings, we may obtain performance advantages.
The model also specifies the range of legal compiler optimizations on shared data.
Unless synchronization points are clearly defined and programs are synchronized, the compiler cannot interchange a read and a write of two shared data items, because that might affect the semantics of the program.
Relaxed Consistency Models: The Basics

3 major sets of relaxed orderings:
1. W→R ordering (all writes completed before the next read). Relaxing only this ordering retains the ordering among writes, so many programs that operate under sequential consistency operate under this model without additional synchronization. This model is called processor consistency.
2. W→W ordering (all writes completed before the next write).
Relaxed Consistency Models: The Basics

3 major sets of relaxed orderings (cont.):
3. R→W and R→R orderings: a variety of models, depending on the ordering restrictions and on how synchronization operations enforce ordering.
There are many complexities in relaxed consistency models, such as defining precisely what it means for a write to complete and deciding when a processor can see values that it has itself written.
Write Consistency
For now, assume:
1. A write does not complete (and allow the next write to occur) until all processors have seen the effect of that write.
2. The processor does not change the order of any write with respect to any other memory access.
Consequently, if a processor writes location A followed by location B, any processor that sees the new value of B must also see the new value of A.
These restrictions allow the processor to reorder reads, but force it to finish writes in program order.
Memory Consistency Model

Relaxed schemes allow faster execution than sequential consistency.
This is not an issue for most programs, because they are synchronized.
A program is synchronized if all accesses to shared data are ordered by synchronization operations:

    write(x)
    ...
    release(s)   {unlock}
    ...
    acquire(s)   {lock}
    ...
    read(x)

Because most programs are synchronized, several relaxed models for memory consistency exist; they are characterized by their attitude towards RAR, WAR, RAW, and WAW orderings to different addresses.
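The write / release / acquire / read pattern above maps fairly directly onto C++ release and acquire operations. The following is a minimal sketch under assumptions: the synchronization variable `s` is modelled as an atomic flag rather than a real lock, and the function name `synchronized_read` is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One thread writes x and then releases s; the other acquires s and then
// reads x. Returns the value the reader saw.
int synchronized_read() {
    int x = 0;                   // ordinary shared data
    std::atomic<bool> s{false};  // synchronization variable
    int seen = 0;

    std::thread writer([&] {
        x = 42;                                    // write(x)
        s.store(true, std::memory_order_release);  // release(s)
    });
    std::thread reader([&] {
        while (!s.load(std::memory_order_acquire)) {  // acquire(s)
        }
        seen = x;                                  // read(x)
    });

    writer.join();
    reader.join();
    return seen;
}
```

Because the read of `x` is ordered after the write by the release/acquire pair, the reader always sees 42 even though `x` itself is a plain variable; without that pair the program would contain a data race.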
