Published by Herbert Caron. Modified about 1 year ago.

1
Local-Spin Algorithms
Multiprocessor synchronization algorithms (20225241)
Lecturer: Danny Hendler
This presentation is based on the book “Synchronization Algorithms and Concurrent Programming” by G. Taubenfeld and on the survey “Shared-memory mutual exclusion: major research trends since 1986” by J. Anderson, Y-J. Kim and T. Herman.

2
The CC and DSM models
[Figure: the cache-coherent (CC) and distributed shared-memory (DSM) models. The figure is taken from the survey “Shared-memory mutual exclusion: major research trends since 1986” by J. Anderson, Y-J. Kim and T. Herman.]

3
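Remote and local memory accesses
In a DSM system: an access of v by p is local if v is stored in p’s own memory module, and remote otherwise (the original slide shows this in a figure labeled “local” and “remote”).
In a cache-coherent system: an access of v by p is remote if it is p’s first access of v, or if v has been written by another process since p’s last access of it.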
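The two definitions above can be turned into a small RMR counter. This is a minimal sketch, not part of the original slides: the function name `count_rmrs`, the trace format of `(process, op, variable)` tuples, and the `home` mapping for DSM are all assumptions introduced for illustration.

```python
def count_rmrs(trace, model, home=None):
    """Count remote memory references (RMRs) per process.

    trace: list of (process, op, variable), op is "r" or "w".
    model: "CC" or "DSM".
    home:  DSM only; maps each variable to the process whose memory
           module stores it (hypothetical helper structure).
    """
    rmrs = {}
    valid = set()  # CC only: (process, variable) pairs with an up-to-date copy
    for p, op, v in trace:
        rmrs.setdefault(p, 0)
        if model == "DSM":
            # DSM: remote exactly when the variable lives in another module.
            remote = home[v] != p
        else:
            # CC: remote on first access, or if another process wrote v
            # since p's last access (p's copy is then no longer valid).
            remote = (p, v) not in valid
            valid.add((p, v))
            if op == "w":
                # A write invalidates every other process's copy of v.
                valid = {(q, u) for (q, u) in valid if u != v or q == p}
        if remote:
            rmrs[p] += 1
    return rmrs


# Process 0 spins on "flag" three times, process 1 writes it, process 0 rereads.
spin_trace = [(0, "r", "flag")] * 3 + [(1, "w", "flag"), (0, "r", "flag")]
```

With this trace, the same spin loop costs process 0 a bounded number of RMRs under CC, while under DSM every read is remote unless `flag` is stored in process 0's own module.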

4
Local-spin algorithms
In a local-spin algorithm, all busy waiting (‘await’) is done by read-only loops of local accesses that do not cause interconnect traffic.
The same algorithm may be local-spin on one architecture (DSM/CC) and non-local-spin on the other!
For local-spin algorithms, our complexity metric is the worst-case number of Remote Memory References (RMRs).

5
Peterson’s 2-process algorithm
Program for process 0
  b[0] := true
  turn := 0
  await (b[1] = false or turn = 1)
  CS
  b[0] := false
Program for process 1
  b[1] := true
  turn := 1
  await (b[0] = false or turn = 0)
  CS
  b[1] := false
Is this algorithm local-spin on a DSM machine? No
Is this algorithm local-spin on a CC machine? Yes

6
Peterson’s 2-process algorithm
Program for process 0
  b[0] := true
  turn := 0
  await (b[1] = false or turn = 1)
  CS
  b[0] := false
Program for process 1
  b[1] := true
  turn := 1
  await (b[0] = false or turn = 0)
  CS
  b[1] := false
What is the RMR complexity on a DSM machine? Unbounded
What is the RMR complexity on a CC machine? Constant
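Peterson's algorithm can be tried out with two threads. This is a minimal sketch, assuming CPython, whose global interpreter lock makes the list reads and writes below behave as atomic, sequentially ordered registers; on real hardware the algorithm would also need memory fences. The function name `run_peterson` and the shared counter are hypothetical and serve only to check mutual exclusion.

```python
import threading
import time


def run_peterson(iterations=200):
    b = [False, False]   # b[i]: process i wants to enter
    turn = [0]           # shared tie-breaker, boxed in a list
    counter = [0]        # deliberately non-atomic counter updated in the CS

    def process(i):
        other = 1 - i
        for _ in range(iterations):
            b[i] = True
            turn[0] = i
            # await (b[other] = false or turn = other)
            while b[other] and turn[0] == i:
                time.sleep(0)          # yield so the other thread can run
            # Critical section: read, yield, write back.
            tmp = counter[0]
            time.sleep(0)
            counter[0] = tmp + 1
            b[i] = False               # exit code

    threads = [threading.Thread(target=process, args=(i,)) for i in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

If mutual exclusion holds, no increment is lost and the function returns `2 * iterations`. Both threads spin on the same `turn` variable, which is why the spin cannot be local for both processes on a DSM machine.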

7
Kessel’s single-writer algorithm
Program for process 0
  b[0] := true
  local[0] := turn[1]
  turn[0] := local[0]
  await (b[1] = false or local[0] ≠ turn[1])
  CS
  b[0] := false
Program for process 1
  b[1] := true
  local[1] := 1 - turn[0]
  turn[1] := local[1]
  await (b[0] = false or local[1] = turn[0])
  CS
  b[1] := false
Can Kessel’s algorithm be made local-spin on a DSM machine? Yes, if:
  b[1], turn[1] are located at p0’s memory module
  b[0], turn[0] are located at p1’s memory module
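The single-writer structure is easier to see in running code. This is a minimal sketch, assuming CPython's global interpreter lock supplies atomic, ordered reads and writes of list elements; `run_kessel` and the shared counter are hypothetical names used only to check mutual exclusion.

```python
import threading
import time


def run_kessel(iterations=200):
    b = [False, False]
    turn = [0, 0]        # turn[i] is written only by process i
    counter = [0]        # deliberately non-atomic counter updated in the CS

    def critical_section():
        tmp = counter[0]
        time.sleep(0)    # yield inside the CS to expose any race
        counter[0] = tmp + 1

    def process0():
        for _ in range(iterations):
            b[0] = True
            local = turn[1]
            turn[0] = local
            # await (b[1] = false or local[0] != turn[1])
            while b[1] and local == turn[1]:
                time.sleep(0)
            critical_section()
            b[0] = False

    def process1():
        for _ in range(iterations):
            b[1] = True
            local = 1 - turn[0]
            turn[1] = local
            # await (b[0] = false or local[1] = turn[0])
            while b[0] and local != turn[0]:
                time.sleep(0)
            critical_section()
            b[1] = False

    threads = [threading.Thread(target=process0),
               threading.Thread(target=process1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

Process 0 spins only on `b[1]` and `turn[1]`, and process 1 only on `b[0]` and `turn[0]`, so each pair can be stored in the spinning process's own memory module.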

8
Local Spinning Mutual Exclusion Using Strong Primitives

9
Anderson’s queue-based algorithm
Shared:
  integer ticket – an RMW object, initially 0
  bit valid[0..n-1], initially valid[0]=1 and valid[i]=0, for i ∈ {1,..,n-1}
Local: integer myTicket
Program for process i
  myTicket := fetch-and-inc-modulo-n(ticket)   ; take a ticket
  await valid[myTicket] = 1                    ; wait for your turn
  CS
  valid[myTicket] := 0                         ; dequeue
  valid[(myTicket+1) mod n] := 1               ; signal successor
[Figure: the valid array, indexed 0..n-1, and the ticket counter.]
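The ticket-and-array scheme can be sketched in a few lines. This is a minimal sketch, assuming CPython; the class `AndersonLock`, the helper `run_anderson`, and the internal `threading.Lock` are hypothetical. The internal lock only stands in for the hardware atomicity of fetch-and-inc-modulo-n and takes no part in the queueing.

```python
import threading
import time


class AndersonLock:
    def __init__(self, n):
        self.n = n
        self.valid = [1] + [0] * (n - 1)   # valid[0]=1, the rest 0
        self._ticket = 0
        self._rmw = threading.Lock()       # emulates an atomic RMW object

    def _fetch_and_inc_mod_n(self):
        with self._rmw:
            t = self._ticket
            self._ticket = (t + 1) % self.n
            return t

    def acquire(self):
        my_ticket = self._fetch_and_inc_mod_n()   # take a ticket
        while self.valid[my_ticket] != 1:         # wait for your turn
            time.sleep(0)
        return my_ticket

    def release(self, my_ticket):
        self.valid[my_ticket] = 0                 # dequeue
        self.valid[(my_ticket + 1) % self.n] = 1  # signal successor


def run_anderson(n=4, iterations=50):
    lock, counter = AndersonLock(n), [0]

    def worker():
        for _ in range(iterations):
            t = lock.acquire()
            tmp = counter[0]       # non-atomic increment inside the CS
            time.sleep(0)
            counter[0] = tmp + 1
            lock.release(t)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return counter[0]
```

Each waiting thread spins on its own `valid` slot, but which slot it gets depends on the ticket it draws, so the slot cannot be placed in the thread's memory module in advance.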

10
Anderson’s queue-based algorithm (cont’d)
Initial configuration: ticket = 0, valid = 1 0 0 0 0
After entry section of p3: ticket = 1, valid = 1 0 0 0 0, myTicket of p3 = 0
After p1 performs entry section: ticket = 2, valid = 1 0 0 0 0, myTicket of p3 = 0, myTicket of p1 = 1
After p3 exits: ticket = 2, valid = 0 1 0 0 0, myTicket of p1 = 1

11
Anderson’s queue-based algorithm (cont’d)
What is the RMR complexity on a DSM machine? Unbounded
What is the RMR complexity on a CC machine? Constant
Program for process i
  myTicket := fetch-and-inc-modulo-n(ticket)   ; take a ticket
  await valid[myTicket] = 1                    ; wait for your turn
  CS
  valid[myTicket] := 0                         ; dequeue
  valid[(myTicket+1) mod n] := 1               ; signal successor

12
Graunke and Thakkar’s algorithm
Uses the more common swap primitive:
swap(w, new)
  do atomically
    prev := w
    w := new
    return prev

13
Graunke and Thakkar’s algorithm (cont’d)
Shared:
  bit slots[0..n-1], initially slots[i]=1, for i ∈ {0,..,n-1}
  structure {bit value, bit *slot} tail, initially {0, &slots[0]}
Local:
  structure {bit value, bit *slot} myRecord, prev
  bit temp
[Figure: initial state, with tail holding value 0 and pointing to slots[0], and slots[0..n-1] all set to 1.]

14
Graunke and Thakkar’s algorithm (cont’d)
Shared:
  bit slots[0..n-1], initially slots[i]=1, for i ∈ {0,..,n-1}
  structure {bit value, bit *slot} tail, initially {0, &slots[0]}
Local:
  structure {bit value, bit *slot} myRecord, prev
  bit temp
Program for process i
  myRecord.value := slots[i]            ; prepare to thread yourself to queue
  myRecord.slot := &slots[i]
  prev := swap(&tail, myRecord)         ; prev now points to predecessor
  await (*prev.slot ≠ prev.value)       ; local spin until predecessor’s value changes
  CS
  temp := 1 - slots[i]
  slots[i] := temp                      ; signal successor
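The record-swapping idea can be sketched as follows. This is a minimal sketch, assuming CPython; `GraunkeThakkarLock` and `run_graunke_thakkar` are hypothetical names, a slot index stands in for the pointer `&slots[i]`, and an internal `threading.Lock` only emulates the atomicity of the swap primitive.

```python
import threading
import time


class GraunkeThakkarLock:
    def __init__(self, n):
        self.slots = [1] * n
        self.tail = (0, 0)                 # (value, slot index), i.e. {0, &slots[0]}
        self._atomic = threading.Lock()    # emulates an atomic swap

    def _swap_tail(self, new):
        with self._atomic:
            prev = self.tail
            self.tail = new
            return prev

    def acquire(self, i):
        my_record = (self.slots[i], i)     # prepare to thread yourself to queue
        prev_value, prev_slot = self._swap_tail(my_record)
        # Spin until the predecessor flips its slot away from the recorded value.
        while self.slots[prev_slot] == prev_value:
            time.sleep(0)

    def release(self, i):
        self.slots[i] = 1 - self.slots[i]  # signal successor


def run_graunke_thakkar(n=4, iterations=50):
    lock, counter = GraunkeThakkarLock(n), [0]

    def worker(i):
        for _ in range(iterations):
            lock.acquire(i)
            tmp = counter[0]       # non-atomic increment inside the CS
            time.sleep(0)
            counter[0] = tmp + 1
            lock.release(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return counter[0]
```

A waiting thread spins on its predecessor's slot rather than its own, so the spin location changes from one acquisition to the next.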

15
Graunke and Thakkar’s algorithm (cont’d)

16
Graunke and Thakkar’s algorithm (cont’d)
What is the RMR complexity on a DSM machine? Unbounded
What is the RMR complexity on a CC machine? Constant
Program for process i
  myRecord.value := slots[i]            ; prepare to thread yourself to queue
  myRecord.slot := &slots[i]
  prev := swap(&tail, myRecord)         ; prev now points to predecessor
  await (*prev.slot ≠ prev.value)       ; local spin until predecessor’s value changes
  CS
  temp := 1 - slots[i]
  slots[i] := temp                      ; signal successor

17
The MCS queue-based algorithm
Type:
  Qnode: structure {bit locked, Qnode *next}
Shared:
  Qnode nodes[0..n-1]
  Qnode *tail, initially nil
Local:
  Qnode *myNode, initially &nodes[i]
  Qnode *prev, *successor
Has constant RMR complexity under both the DSM and CC models
Uses swap and CAS

18
The MCS queue-based algorithm (cont’d)
Program for process i
  myNode->next := nil                                   ; prepare to be last in queue
  prev := swap(&tail, myNode)                           ; thread yourself: tail now points to myNode, prev holds the old tail
  if (prev ≠ nil)                                       ; I need to wait for a predecessor
    myNode->locked := true                              ; prepare to wait
    prev->next := myNode                                ; let my predecessor know it has to unlock me
    await (myNode->locked = false)
  CS
  if (myNode->next = nil)                               ; if not sure there is a successor
    if (compare-and-swap(&tail, myNode, nil) = false)   ; if there is a successor
      await (myNode->next ≠ nil)                        ; spin until successor lets me know its identity
      successor := myNode->next                         ; get a pointer to my successor
      successor->locked := false                        ; unlock my successor
  else                                                  ; for sure, I have a successor
    successor := myNode->next                           ; get a pointer to my successor
    successor->locked := false                          ; unlock my successor
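The MCS entry and exit code can be sketched as a small lock class. This is a minimal sketch, assuming CPython; `QNode`, `MCSLock` and `run_mcs` are hypothetical names, and an internal `threading.Lock` only emulates the atomicity of swap and compare-and-swap on `tail`.

```python
import threading
import time


class QNode:
    def __init__(self):
        self.locked = False
        self.next = None


class MCSLock:
    def __init__(self):
        self.tail = None
        self._atomic = threading.Lock()    # emulates atomic swap / CAS

    def _swap_tail(self, new):
        with self._atomic:
            prev, self.tail = self.tail, new
            return prev

    def _cas_tail(self, expected, new):
        with self._atomic:
            if self.tail is expected:
                self.tail = new
                return True
            return False

    def acquire(self, node):
        node.next = None                   # prepare to be last in queue
        prev = self._swap_tail(node)       # tail now points to my node
        if prev is not None:               # there is a predecessor
            node.locked = True             # prepare to wait
            prev.next = node               # tell predecessor whom to unlock
            while node.locked:             # spin on my own node only
                time.sleep(0)

    def release(self, node):
        if node.next is None:              # not sure there is a successor
            if self._cas_tail(node, None): # queue really was empty
                return
            while node.next is None:       # successor is still linking itself
                time.sleep(0)
        node.next.locked = False           # unlock my successor


def run_mcs(n=4, iterations=50):
    lock, counter = MCSLock(), [0]

    def worker():
        node = QNode()                     # plays the role of nodes[i]
        for _ in range(iterations):
            lock.acquire(node)
            tmp = counter[0]               # non-atomic increment inside the CS
            time.sleep(0)
            counter[0] = tmp + 1
            lock.release(node)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return counter[0]
```

Each thread spins only on fields of its own node, which is what allows the node to be placed in that thread's memory module on a DSM machine.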

19
The MCS queue-based algorithm (cont’d)

20
Local Spinning Mutual Exclusion Using Reads and Writes

21
A local-spin tournament-tree algorithm (Anderson, Yang, 1993)
O(log n) RMR complexity for both DSM and CC systems
This is “suspected” to be optimal!
Uses O(n log n) registers
Each node is identified by (level, number)
[Figure: binary tournament tree for processes 0..7. Level 0 has nodes 0..3, level 1 has nodes 0..1, and level 2 has the root, node 0.]

22
A local-spin tournament-tree algorithm (cont’d)
Shared:
  - For each node v, there are 3 registers:
      name[level, 2node], name[level, 2node+1], initially -1
      turn[level, node]
  - For each level l and process i, a spin flag: flag[level, i]
Local: level, node, id

23
A local-spin tournament-tree algorithm (cont’d)
Program for process i
  id := i
  for level = 0 to log n - 1 do                       ; from leaf to root
    node := ⌊id/2⌋                                    ; the current node
    name[level, 2node+(id mod 2)] := i                ; identify yourself
    turn[level, node] := i                            ; update the tie-breaker
    flag[level, i] := 0                               ; initialize the locally-accessible spin flag
    if (even(id))
      rival := name[level, id+1]
    else
      rival := name[level, id-1]
    if ((rival ≠ -1) and (turn[level, node] = i))     ; if not sure I should precede rival
      if (flag[level, rival] = 0)
        flag[level, rival] := 1                       ; release the rival from waiting
      await flag[level, i] ≠ 0                        ; await until sure the rival updated the tie-breaker
      if (turn[level, node] = i)                      ; if I lost
        await flag[level, i] = 2                      ; wait till rival notifies me it’s my turn
    id := node                                        ; move to the next level
  CS
  for level = log n - 1 downto 0 do                   ; begin exit code
    id := ⌊i/2^level⌋, node := ⌊id/2⌋                 ; set node and id
    name[level, 2node+(id mod 2)] := -1               ; erase name
    rival := turn[level, node]                        ; find who rival is (if there is one)
    if rival ≠ i                                      ; if there is a rival
      flag[level, rival] := 2                         ; notify rival
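The tournament-tree program can be sketched using only reads and writes of list elements. This is a minimal sketch, assuming CPython (whose global interpreter lock makes those reads and writes behave as atomic, ordered registers) and assuming n is a power of two; `TournamentLock` and `run_tournament` are hypothetical names, and the tie-breaker stores the process number i, as the comparisons in the program require.

```python
import threading
import time


class TournamentLock:
    def __init__(self, n):
        assert n >= 2 and (n & (n - 1)) == 0, "sketch assumes n is a power of two"
        self.levels = n.bit_length() - 1               # log2(n)
        self.name = [[-1] * (n >> lv) for lv in range(self.levels)]
        self.turn = [[-1] * (n >> (lv + 1)) for lv in range(self.levels)]
        self.flag = [[0] * n for _ in range(self.levels)]

    def acquire(self, i):
        for lv in range(self.levels):                  # from leaf to root
            ident = i >> lv
            node = ident >> 1
            self.name[lv][ident] = i                   # identify yourself
            self.turn[lv][node] = i                    # update the tie-breaker
            self.flag[lv][i] = 0                       # reset my spin flag
            rival = self.name[lv][ident ^ 1]           # sibling entry of this node
            if rival != -1 and self.turn[lv][node] == i:
                if self.flag[lv][rival] == 0:
                    self.flag[lv][rival] = 1           # release rival from waiting
                while self.flag[lv][i] == 0:           # until rival set the tie-breaker
                    time.sleep(0)
                if self.turn[lv][node] == i:           # I lost
                    while self.flag[lv][i] != 2:       # wait to be notified
                        time.sleep(0)

    def release(self, i):
        for lv in reversed(range(self.levels)):        # exit code, root to leaf
            ident = i >> lv
            node = ident >> 1
            self.name[lv][ident] = -1                  # erase name
            rival = self.turn[lv][node]
            if rival != i:                             # there is a rival
                self.flag[lv][rival] = 2               # notify rival


def run_tournament(n=4, iterations=50):
    lock, counter = TournamentLock(n), [0]

    def worker(i):
        for _ in range(iterations):
            lock.acquire(i)
            tmp = counter[0]                           # non-atomic increment in the CS
            time.sleep(0)
            counter[0] = tmp + 1
            lock.release(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return counter[0]
```

Process i waits only on `flag[level][i]`, one flag per level, so each acquisition performs a constant number of remote accesses at each of the log n levels.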
