Published by Elle Allen. Modified over 3 years ago.

1
Wait-Free Linked-Lists
Shahar Timnat, Anastasia Braginsky, Alex Kogan, Erez Petrank
Technion, Israel
Presented by Shahar Timnat

2
Our Contribution
- A fast, wait-free linked list
- The first wait-free list fast enough to be used in practice

3
Agenda
- What is a wait-free linked list?
- Related work and existing tools
- Wait-free linked-list design
- Performance

4
Concurrent Data Structures
- Allow several threads to read or modify the data structure simultaneously
- Increasingly in demand because of highly parallel systems

5
Progress Guarantees
- Obstruction-free: a thread running exclusively will make progress
- Lock-free: at least one of the running threads will make progress
- Wait-free: every thread that gets the CPU will make progress
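
The gap between lock-free and wait-free can be illustrated with a CAS retry loop. The sketch below is a minimal illustration, not code from the talk: `AtomicInt` is a hypothetical class that simulates compare-and-swap with a Python lock, so it only models the retry pattern. A failed CAS means some other thread's CAS succeeded (system-wide progress), yet nothing bounds how often one particular thread may have to retry.

```python
import threading


class AtomicInt:
    """Simulated atomic integer; the lock stands in for a hardware CAS."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


def lock_free_increment(counter):
    """Retry until our CAS wins; return how many attempts were needed."""
    attempts = 0
    while True:
        attempts += 1
        old = counter.get()
        # Fails only if another thread changed the counter in between.
        if counter.compare_and_set(old, old + 1):
            return attempts


def run(num_threads=4, per_thread=500):
    """Increment a shared counter from several threads; return the total."""
    counter = AtomicInt()

    def worker():
        for _ in range(per_thread):
            lock_free_increment(counter)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.get()
```

The number of attempts returned by `lock_free_increment` has no fixed upper bound under contention, which is exactly what a wait-free algorithm must rule out.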

6
Wait-Free Algorithms
- Provide the strongest progress guarantee
- Always desirable, particularly in real-time systems
- Relatively rare
- Hard to design
- Typically slower

7
The Linked List Interface
- Following the traditional choice: a sorted, list-based set of integers
- insert(int x);
- delete(int x);
- contains(int x);
[Diagram: a sorted list holding 4, 6, 9]
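
For reference, the interface can be sketched sequentially before any concurrency is added. This is an illustrative single-threaded version, assuming head and tail sentinel nodes with infinite keys; the class and method names other than `insert`, `delete` and `contains` are hypothetical.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


class SortedListSet:
    """Sequential sorted list-based set of integers (no thread safety)."""

    def __init__(self):
        # Sentinels remove the empty-list and end-of-list special cases.
        self.tail = Node(float("inf"))
        self.head = Node(float("-inf"), self.tail)

    def _search(self, x):
        """Return (pred, curr) with pred.key < x <= curr.key."""
        pred, curr = self.head, self.head.next
        while curr.key < x:
            pred, curr = curr, curr.next
        return pred, curr

    def insert(self, x):
        pred, curr = self._search(x)
        if curr.key == x:
            return False          # key already present
        pred.next = Node(x, curr)
        return True

    def delete(self, x):
        pred, curr = self._search(x)
        if curr.key != x:
            return False          # key not present
        pred.next = curr.next
        return True

    def contains(self, x):
        return self._search(x)[1].key == x

    def keys(self):
        out, curr = [], self.head.next
        while curr is not self.tail:
            out.append(curr.key)
            curr = curr.next
        return out
```

Each operation returns a boolean, which is why the later slides care about which thread gets to report success.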

8
Prior Wait-Free Lists
- Only universal constructions
- Non-scalable (by nature?)
- Achieve good complexity, but poor performance
- The state-of-the-art construction (Chuong, Ellen, Ramachandran) significantly underperforms our construction

9
Our wait-free versus a universal construction

12
Linked-Lists with Progress Guarantee
- No practical wait-free linked lists are available
- Lock-free linked lists exist
- Most notably: Harris's linked list

13
Existing Lock-Free List (by Harris)
- Deletion in two steps
  - Logical: mark the next field using a CAS
  - Physical: remove the node

14
Existing Lock-Free List (by Harris)
- Use the least significant bit in each next field as a mark bit
- The mark bit signals that a node is logically deleted
- A node's next field cannot be changed (the CAS will fail) once the node is logically deleted
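
The two-step deletion can be sketched as follows. Python cannot steal the low bit of a pointer, so this sketch assumes a hypothetical `MarkableRef` that stores a (reference, mark) pair and updates both with one simulated CAS, similar in spirit to a markable atomic reference. It is an illustration under those assumptions, not Harris's code.

```python
import threading


class MarkableRef:
    """(reference, mark) pair changed atomically by one simulated CAS."""

    def __init__(self, ref=None, mark=False):
        self._pair = (ref, mark)
        self._lock = threading.Lock()   # stands in for a hardware CAS

    def get(self):
        return self._pair

    def compare_and_set(self, exp_ref, exp_mark, new_ref, new_mark):
        with self._lock:
            ref, mark = self._pair
            if ref is exp_ref and mark == exp_mark:
                self._pair = (new_ref, new_mark)
                return True
            return False


class Node:
    def __init__(self, key, succ=None):
        self.key = key
        self.next = MarkableRef(succ)


def logical_delete(node):
    """Step 1: set the mark on node.next; the successor is unchanged."""
    succ, marked = node.next.get()
    return (not marked) and node.next.compare_and_set(succ, False, succ, True)


def physical_delete(pred, node):
    """Step 2: swing pred.next past the marked node."""
    succ, _ = node.next.get()
    return pred.next.compare_and_set(node, False, succ, False)


def demo():
    n9 = Node(9)
    n6 = Node(6, n9)
    n4 = Node(4, n6)
    n7 = Node(7, n9)
    marked_now = logical_delete(n6)
    # Trying to link 7 after the marked node 6 fails: the CAS expects mark=False.
    insert_after_marked = n6.next.compare_and_set(n9, False, n7, False)
    unlinked = physical_delete(n4, n6)
    return marked_now, insert_after_marked, unlinked, n4.next.get()[0].key
```

Because every update of a next field expects the mark to be clear, marking a node freezes its next field, which is the property the last bullet states.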

15
Help Mechanism
- A common technique to achieve wait-freedom
- Each thread declares the operation it desires in a designated state array
- Many threads may attempt to execute it
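
A minimal sketch of the state array idea follows. All names (`OpDesc`, `StateArray`, `help_all`) are hypothetical, the CAS is simulated with a lock, and the helped operation is deliberately a side-effect-free lookup against a fixed snapshot so that duplicate help is harmless. Helping operations that modify the list is the hard part that the rest of the talk addresses.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class OpDesc:
    kind: str                 # here only "contains"
    key: int
    status: str = "pending"   # "pending", "success" or "failure"


class StateArray:
    """One slot per thread; slots change through a simulated CAS."""

    def __init__(self, num_threads):
        self._slots = [None] * num_threads
        self._lock = threading.Lock()

    def get(self, tid):
        with self._lock:
            return self._slots[tid]

    def announce(self, tid, desc):
        with self._lock:
            self._slots[tid] = desc

    def cas(self, tid, expected, new):
        with self._lock:
            if self._slots[tid] is expected:
                self._slots[tid] = new
                return True
            return False


def help_all(state, num_threads, snapshot):
    """Any thread may call this: finish every pending declared operation."""
    for tid in range(num_threads):
        desc = state.get(tid)
        if desc is not None and desc.status == "pending":
            outcome = "success" if desc.key in snapshot else "failure"
            # Only the first helper's CAS replaces the pending descriptor.
            state.cas(tid, desc, OpDesc(desc.kind, desc.key, outcome))
```

A stalled thread's operation is completed by whoever scans the array next, and the CAS on the slot ensures that a result is installed only once.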

16
Help Mechanism - Difficulties
- Multiple threads should be able to work concurrently on the same operation
- Many potential races
- Difficult to design
- Usually slower

17
Help Mechanism: Usually Slower
- Much more synchronization is needed
- At times many threads are attempting to help the same operation
- (Can we use help without suffering slower execution?)

18
Means of Using Help
- Serialize everything (non-scalable)
- Eager help
- Delayed help
- Fast-Path-Slow-Path

19
Complication: Deletion Owning
- T1 and T2 both attempt delete(6)
- T1 and T2 both declare their operation in the state array
- T3 sees T1's declaration and tries to help it, while T4 helps T2
[Diagram: a sorted list holding 4, 6, 9]

23
Complication: Deletion Owning
- If both helpers T3 and T4 go to sleep after the mark was done, which thread (T1 or T2) should return true and which false?

24
Means of Using Help
- Serialize everything (non-scalable)
- Eager help
- Delayed help
- Fast-Path-Slow-Path
With serialized help, this could not happen!

25
Solution: Use a Success Bit
- Each node holds an extra success bit (initially 0)
- Potential owners compete to CAS it to 1 (no help in this part)
- Note that the node is deleted before it is decided which thread owns its deletion

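The competition for the success bit can be sketched as below. This is an illustration, not the authors' code: `AtomicBool` simulates CAS with a lock, and `claim_deletion` and `race` are hypothetical names. The node is assumed to have been marked already, possibly by a helper.

```python
import threading


class AtomicBool:
    """Simulated atomic flag; the lock stands in for a hardware CAS."""

    def __init__(self):
        self._value = False
        self._lock = threading.Lock()

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, key):
        self.key = key
        self.success = AtomicBool()   # the extra success bit, initially 0


def claim_deletion(node):
    """Called by each potential owner itself (no helping in this step)."""
    return node.success.compare_and_set(False, True)


def race(num_threads=8):
    """Let several would-be owners compete for one already-marked node."""
    node = Node(6)
    results = [None] * num_threads

    def worker(i):
        results[i] = claim_deletion(node)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Exactly one caller sees `True` and returns true from its delete; every other caller returns false, regardless of which helper performed the marking.
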
26
Means of Using Help
- Serialize everything (non-scalable)
- Eager help
- Delayed help
- Fast-Path-Slow-Path
- Opportunistic help

27
Helping an Insert Operation
- Steps: Search, Direct, Insert, Report
- State record while the operation is pending: Status: Pending, Operation: Insert, New node: 7
- Insert: a CAS links the new node 7 into the list 4, 6, 9
- Report: a CAS replaces the state record with one whose status is Success

33
Helping an Insert Operation
helpInsert (state s) {
  (pred, succ) = search(s.key);         /* Search */
  if (found key)
    CAS(state[tid], s, failure);        /* Report Failure */
  else {
    s.node.next = succ;                 /* Direct */
    if (CAS(pred.next, succ, s.node))   /* Insert */
      CAS(state[tid], s, success);      /* Report */
  }
}
Good enough?

35
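Incorrect Result Returned
Consider two threads helping insert(7) on the list 4, 6, 9:
T1 { found (6,9); node.next = &9; inserts new node; CAS(state[tid], s, success) }
T2 { found (6,7); CAS(state[tid], s, failure) }
If T2 searches after T1 has linked node 7 but before T1 reports, T2 finds the key and reports failure, although the operation itself inserted that node.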
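
The interleaving can be replayed deterministically. The sketch below is a single-threaded simulation with hypothetical names (`OpState`, `replay`); the list is modelled as a plain Python list of nodes, which is enough to show the reporting race. Passing `check_identity=True` applies the `foundNode != s.node` test that the talk introduces as the first fix.

```python
class Node:
    def __init__(self, key):
        self.key = key


class OpState:
    """The owner's state-array entry; status changes only through cas()."""

    def __init__(self, node):
        self.node = node
        self.status = "pending"

    def cas(self, expected, new):
        # Single-threaded replay, so a plain compare-then-write suffices.
        if self.status == expected:
            self.status = new
            return True
        return False


def find(nodes, key):
    return next((n for n in nodes if n.key == key), None)


def replay(check_identity):
    nodes = [Node(4), Node(6), Node(9)]
    s = OpState(Node(7))

    # T1: search finds the window (6, 9); key 7 is absent, so T1 links s.node.
    assert find(nodes, 7) is None
    nodes.insert(2, s.node)
    # T1 is preempted before it reports.

    # T2: search now finds key 7 in the list.
    found = find(nodes, 7)
    if found is not None and not (check_identity and found is s.node):
        s.cas("pending", "failure")     # premature failure report

    # T1 resumes and tries to report success.
    s.cas("pending", "success")
    return s.status, [n.key for n in nodes]
```

Without the identity check the replay ends with status `failure` while 7 is in the list; with the check, T2 does not report and T1's success report goes through.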

41
Fix:
helpInsert (state s) {
  (pred, succ) = search(s.key);
  if (found key && foundNode != s.node)
    CAS(state[tid], s, failure);
  else {
    s.node.next = succ;
    if (CAS(pred.next, succ, s.node))
      CAS(state[tid], s, success);
  }
}
Good enough?

42
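Incorrect Result Returned 2
T1 { found (6,9); node.next = &9; inserts new node; CAS(->success) }
T2 { found (6,7); CAS(->failure) }
T3 { Delete(7); Insert(7) }
After T1 links the node, T3 deletes it and inserts a different node with key 7. T2 then finds key 7 in a node other than s.node and reports failure, although the operation did insert its node.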

50
Fix:
helpInsert (state s) {
  (pred, succ) = search(s.key);
  if (found key && foundNode != s.node && !s.node.marked)
    CAS(state[tid], s, failure);
  else {
    s.node.next = succ;
    if (CAS(pred.next, succ, s.node))
      CAS(state[tid], s, success);
  }
}
Good enough?

51
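Ill-timed Direct
Consider two threads helping insert(7) on the list 4, 6, 9:
T1 { found (6,9); node.next = &9 }
T2 { found (6,9); node.next = &9; inserts the new node; CAS(->success); ... Insert(8) (after 7) }
If T1's write to node.next happens only after 8 has been inserted after 7, node 7 points to 9 again and 8 is cut out of the list.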

58
Fix:
helpInsert (state s) {
  node_next = s.node.next;
  (pred, succ) = search(s.key);          /* Remove marked */
  if (found key && foundNode != s.node && !s.node.marked)
    CAS(state[tid], s, failure);
  else {
    /* Direct with a CAS instead of the plain write s.node.next = succ */
    if (!CAS(s.node.next, node_next, succ))
      restart;
    if (CAS(pred.next, succ, s.node))
      CAS(state[tid], s, success);
  }
}
Good enough?
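
The effect of directing with a CAS rather than a plain write can be replayed deterministically. The sketch below is a single-threaded simulation with hypothetical helper names (`cas_next`, `replay`); it follows the ill-timed-direct scenario, in which a helper resumes with a stale window after 8 has been inserted behind 7.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def cas_next(node, expected, new):
    """Single-threaded stand-in for a CAS on node.next."""
    if node.next is expected:
        node.next = new
        return True
    return False


def keys(head):
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out


def replay(use_cas):
    n9 = Node(9)
    n6 = Node(6, n9)
    n4 = Node(4, n6)
    n7 = Node(7)                      # s.node, not yet in the list

    # T1: reads node_next, searches, finds the window (6, 9), then stalls.
    t1_node_next, t1_succ = n7.next, n9

    # T2: same window; directs and inserts 7. Later 8 is inserted after 7.
    cas_next(n7, None, n9)
    cas_next(n6, n9, n7)
    n8 = Node(8, n9)
    cas_next(n7, n9, n8)

    # T1 resumes with its stale view of node 7.
    if use_cas:
        redirected = cas_next(n7, t1_node_next, t1_succ)   # next has changed
    else:
        n7.next = t1_succ                                   # plain write
        redirected = True
    return redirected, keys(n4)
```

With the plain write the list ends as 4, 6, 7, 9 and node 8 is lost; with the CAS the stale redirect fails, the list stays 4, 6, 7, 8, 9, and the helper restarts.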

59
More Races Exist
- Additional races were handled in both the delete and insert operations
- We constructed a formal proof for the correctness of the algorithm

60
Main Invariant
Each modification of a node's next field falls into one of four categories:
- Marking (change the mark bit to true)
- Snipping (removing a marked node)
- Redirection (of an infant node)
- Insertion (a non-infant node to an infant node)
Proof by induction and by following the code lines

61
Fast-Path-Slow-Path (Kogan and Petrank, PPoPP 2012)
- Each thread:
  - Tries to complete the operation without help
  - Asks for help only if it failed due to contention
- (Almost) as fast as the lock-free algorithm
- Gives the stronger wait-free guarantee
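
The control flow of the technique can be sketched as a small wrapper. This shows only the skeleton: the threshold `MAX_FAILURES` and the callables `try_lock_free` and `run_wait_free` are hypothetical placeholders, and the helped slow path itself is not implemented here.

```python
MAX_FAILURES = 3   # hypothetical contention threshold


def fast_path_slow_path(try_lock_free, run_wait_free,
                        max_failures=MAX_FAILURES):
    """Bounded lock-free attempts first, then the helped slow path.

    try_lock_free() returns (done, result); done is False when the
    attempt lost a race. run_wait_free() stands for the wait-free,
    help-based version of the same operation.
    """
    for _ in range(max_failures):
        done, result = try_lock_free()
        if done:
            return "fast", result
    return "slow", run_wait_free()


def make_attempt(failures_before_success):
    """Test double: an attempt that fails a fixed number of times."""
    remaining = [failures_before_success]

    def attempt():
        if remaining[0] > 0:
            remaining[0] -= 1
            return False, None
        return True, "inserted"

    return attempt
```

Because the number of fast-path attempts is bounded and the slow path is wait-free, the combined operation is wait-free, while uncontended operations pay only the lock-free cost.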

62
Fast-Path-Slow-Path
- Previously implemented for a queue
- Requires the wait-free algorithm and the lock-free one to work concurrently
- Our algorithm was carefully chosen to allow a fast-path-slow-path execution

63
Proof Structure
Basic invariants:
- A node's key never changes
- A marked node is never unmarked
- A marked node's next field never changes
- (And more...)

64
Proving the Main Invariant
- Examine each code line that modifies a node
- Use induction on the execution steps

65
Performance
- We measured our algorithm against Harris's lock-free algorithm
- We measured our algorithm using:
  - Immediate help
  - Deferred help
  - FPSP

66
Performance
We report the results of a micro-benchmark:
- 1024 possible keys, 512 keys in the list on average
- 60% contains, 20% insert, 20% delete
Measured on:
- Intel Xeon (8 concurrent threads)
- Sun UltraSPARC (32 concurrent threads)
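
A workload with this operation mix can be generated as sketched below. The function names, the seed handling and the use of uniformly random keys are assumptions for illustration; the talk does not describe its harness. With equal insert and delete rates over uniform keys, the set settles at about half of the key range, which is consistent with the stated average of 512.

```python
import random

KEY_RANGE = 1024
MIX = (("contains", 0.6), ("insert", 0.2), ("delete", 0.2))


def make_workload(num_ops, seed=0):
    """Return a list of (operation, key) pairs following MIX."""
    rng = random.Random(seed)
    kinds = [kind for kind, _ in MIX]
    weights = [weight for _, weight in MIX]
    ops = rng.choices(kinds, weights=weights, k=num_ops)
    return [(op, rng.randrange(KEY_RANGE)) for op in ops]


def replay_on_set(workload, initial=()):
    """Apply the workload to a plain Python set; return the final contents."""
    keys = set(initial)
    for op, key in workload:
        if op == "insert":
            keys.add(key)
        elif op == "delete":
            keys.discard(key)
    return keys
```

In a real benchmark each thread would draw its own stream of operations and apply it to the shared list; the plain set here only checks the steady-state size.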

67
Performance

68
When employing the FPSP technique together with our algorithm:
- 0-2% difference on Intel(R) Xeon(R)
- 9-11% difference on UltraSPARC

69
Conclusions
- We designed the first practical wait-free linked list
- Performance measurements show that our algorithm works almost as fast as the lock-free list while giving a stronger progress guarantee
- A formal correctness proof is available

70
Questions?
