Presentation is loading. Please wait.

Presentation is loading. Please wait.

How Good can a Transactional Memory be? R. Guerraoui, EPFL.

Similar presentations


Presentation on theme: "How Good can a Transactional Memory be? R. Guerraoui, EPFL."— Presentation transcript:

1 How Good can a Transactional Memory be? R. Guerraoui, EPFL

2

3

4

5 How Good can a Transactional Memory be?

6 From the New York Times San Francisco, May 7, 2004 Intel announces a drastic change in its business strategy: « Multicore is THE way to boost performance »

7 Multicores are almost everywhere Dual-core commonplace in laptops Quad-core in desktops Dual quad-core in servers All major chip manufacturers produce multicore CPUs SUN Niagara (8 cores, 64 concurrent threads) Intel Xeon (6 cores) AMD Opteron (4 cores) …

8 The free ride is over

9 Every one will need to fork threads

10 Forking threads is easy Handling their conflicts is hard

11 Traditional scaling 1x 2x 4x Time: Moore’s Law Speedup User code Traditional CPU

12 Speedup 1x 2x 4x User code Multicore CPU Time: Moore’s Law Ideal multicore scaling

13 Real multicore scaling Speedup 1x 1.4x 2.2x User code Multicore CPU Time: Moore’s Law Parallelization & synchronization require great care!

14 Coarse grained locks => slow Fine grained locks => errors

15 Double-ended queue Enqueue Dequeue

16 A synchronization abstraction that is: Simple to use Efficient to implement Wanted

17 Transactions

18 accessing object 1; accessing object 2; Back to the undergraduate level atomic { }

19 Historical perspective Eswaran et al (CACM’76) Database Papadimitriou (JACM’79) Theory Liskov/Sheifler (TOPLAS’82) Language Knight (ICFP’86) Architecture Herlihy/Moss (ISCA’93) Hardware Shavit/Touitou (PODC’95) Software

20 Simple example (consistency invariant) 0 < x < y

21 T: x := x+1 ; y:= y+1 Simple example (transaction)

22 Two-phase locking To write O, T requires a write-lock on O; To read O, T requires a read-lock on O; Before committing, T releases all its locks T waits if some T’ acquired a lock on O T waits if some T’ acquired a write-lock on O

23 Why two phases? To write O, T requires a write-lock on O; T waits if some T’ acquired a lock on O To read O, T requires a read-lock on O; T waits if some T’ acquired a write-lock on O T releases the lock on O when done with O

24 Why two phases? T1 T2 read(0)write(1) O1 O2 read(0)write(1) O2O1

25 Two-phase locking - better dead than wait - To write O, T requires a write-lock on O; To read O, T requires a read-lock on O; Before committing, T releases all its locks T aborts if some T’ acquired a lock on O A transaction that aborts restarts again T aborts if some T’ acquired a lock on O

26 Two-phase locking - better kill than wait - To write O, T requires a write-lock on O; To read O, T requires a read-lock on O; Before committing, T releases all its lock T aborts T’ if T’ acquired a lock on O T aborts T’ if T’ acquired a write-lock on O A transaction that aborts restarts again

27 Two-phase locking - invisible reads - To write O, T requires a write-lock on O; T aborts T’ if some T’ acquired a write-lock on O To read O, T checks if all objects read remain valid - else T aborts Before committing, T checks if all objects read remain valid and releases all its locks

28 “It is better for Intel to get involved in this [Transactional Memory] now so when we get to the point of having …tons… of cores we will have the answers” Justin Rattner, Intel Chief Technology Officer

29 “…we need to explore new techniques like transactional memory that will allow us to get the full benefit of all those transistors and map that into higher and higher performance.” Bill Gates, President of a transparently operated private foundation

30 “…manual synchronization is intractable…transactions are the only plausible solution….” Tim Sweeney, Epic Games

31 Speedup 1x 2x 4x User code Multicore CPU Time: Moore’s Law Multicore scaling

32 Micro-benchmarks … Linked-lists; red-black trees, etc. Consider specific loads: typically read- only transactions

33 Challenging TMs STMBench7 (Eurosys’07) Large data structure: challenge memory overhead Long operations: kills non-linear algorithms Complex access patterns: stretches flexibility

34 STMBench7:(Transact’08) Transaction Terminator All TMs collapsed because of internal bugs or memory usage (except X) Certain urban legends about performance revealed inaccurate; e.g., OF is inherently slower than lock-based TMs

35 Real-world scaling Speedup 1x 1.4x 2.2x User code Multicore CPU Time: Moore’s Law Parallelization & synchronization require great care!

36 Software Transactional Memory: Why is it only a Research Toy? C. Cascaval, C. Blundell, M. Michael, H. Cain, P. Wu, S. Chiras, S. Chatterjee

37 37 Why STM can be more than a Research Toy? A. Dragojević, P. Felber, V. Gramoli, R. Guerraoui

38

39 How Good can a Transactional Memory be?

40 How much commit can a TM perform?

41 A closer look at TM What really prevents commitment? (Safety) How much commitment is practically possible? (Progress)

42 To write an object O, a transaction acquires O and aborts “the” transaction that owns O To read an object, a transaction T takes a snapshot to see if the system hasn’t changed since T’s last reads; else T is aborted Two-phase locking - invisible reads -

43 Killer write (ownership) Careful read (validation) Two-phase locking - invisible reads -

44 More efficient algorithm Apologizing versus asking permission Killer write Lazy read: validity check at commit time

45 A history is atomic if its restriction to committed transactions is serializable The old safety (Pap79)

46 A history H of committed transactions is serializable if there is a history S(H) that is (1) equivalent to H (2) sequential (3) legal

47 Back to the example Invariant: 0 < x < y Initially: x := 1; y := 2

48 Division by zero T1: x := x+1 ; y:= y+1 T2: z := 1 / (y - x)

49 T1: x := 3; y:= 6 Infinite loop T2: a := y; b:= x; repeat b:= b + 1 until a = b

50 We need to restrict ALL transactions (as with critical sections) The old safety property restricts only committed transactions

51 A history H is opaque if for every transaction T in H, at least one history in committed(T,H) is serializable Candidate property: Opacity (PPoPP’08)

52 Careful read (validation) Killer write (ownership) Two-phase locking - invisible reads -

53 Visible vs Invisible Read (SXM; RSTM) Write is mega killer: to write an object, a transaction aborts any live one which has read or written the object Visible but not so careful read: when a transaction reads an object, it says so

54 Theorem Reads are eithervisible or careful NB. Assuming minimal progress and single versions

55 Intuition of the proof T1 T2 read() write() commit I1,I2,..,Im O1,O2,..,On read() Ik

56 The theorem does not hold for classical atomicity i.e., the theorem does not hold for database transactions

57 A closer look at TM What really prevents commitment? (Opacity)

58 How to prove opacity? Model checking

59 Check that the conflict graph is acyclic Number of nodes is unbounded (STM) NP-Complete problem Model checking opacity?

60 Hint: reduce the verification space Uniform system All transactions are treated equally Transaction and variable names do not matter Atomic system Read, write and commit operations are atomic

61 TM verification theorem (PLDI’08) A TM either violates opacity with 2 transactions and 3 variables or satisfies it with any number of variables and transactions

62 Examples It takes 15mn to check the correctness of TL2 and DSTM Reverse two lines in TL2: bug found in 10mn

63 A closer look at TM What really prevents commitment? Opacity How much commitment is possible (given opacity)?

64 Ideal progress No correct transaction aborts

65 T1 T2 read() write() commit O1 write() O2 Aborting is a fatality read() O2 abort

66 Eventual progress Every correct transaction eventually commits NB. We allow the possibility for a transaction to abort a finite number of times as long as it eventually commits

67 Eventual progress T1 T2 read() write() commit O1 write() O2 read() O2 abort

68 Eventual progress is impossible NB. This impossibility is fundamentally different from FLP: It holds no matter what underlying object is used

69 A closer look at TM What really prevents commitment? Opacity How much commitment is possible (given opacity)?

70 Conditional progress - obstruction-freedom - A correct transaction that does not encounter contention eventually commits

71 DSTM To write O, T requires a write-lock on O (use C&S); T aborts T’ if some T’ acquired a write-lock on O (use C&S) To read O, T checks if all objects read remain valid - else abort (use C&S) Before committing, T releases all its locks (use C&S)

72 DSTM uses C&S C&S is the strongest synchronization primitive Is OF-TM possible with less than C&S? e.g., R/W objects

73 The consensus number of OF-TM is 2 (SPAA’08) OF-TM cannot be implemented with R/W objects only OF-TM does not need C&S

74 A closer look at TM What really prevents commitment? Opacity How much commitment is practically possible? conditional More?

75 Permissiveness (DISC’08) A TM is permissive if it never aborts when it should not

76 More precisely Let P be any safety property and H any P-safe history prefix of a deterministic TM We say that a TM is permissive w.r.t P if Whenever satisfies P can be generated by the TM

77 Permissive TM But…. Very expensive (maintain global conflict graphs)

78 Let P be any safety property and H any history H generated by a TM The TM is probabilistic permissive with respect to P if Whenever satisfies P: can be generated by TM with a positive probability Probabilistic permissiveness

79 AVSTM: a probabilistically permissive TM w.r.t opacity Idea: at commit, a transaction guesses a serialization time in the future AVSTM

80 Non-blocking Good performance under high-contention Can be combined with an OF-TM AVSTM

81 A closer look at TM What really prevents commitment? Opacity How much commitment is practically possible? conditional; permissiveness How fast?

82 Off-line scheduler (PODC’95) Compare the TM protocol with an off- line scheduler that knows: The starting time of transactions Which objects are accessed (i.e., conflicts)

83 Greedy contention manager State Priority (based on start time) Waiting flag (set while waiting) Wait if other has Higher priority AND not waiting Abort other if Lower priority OR waiting

84 Example of competitive ratio Let s be the number of objects accessed by all transactions Compare time to commit all transactions Greedy is O(s)-competitive with the off-line scheduler GHP’05 O(s 2 ) AEST’06 O(s)

85 Joint work with A. Dragojevic T. Henzinger M. Herlihy M. Kapalka V. Sing See Transactions@epfl: lpdwww.epfl.ch

86 Transactions are conquering the parallel programming world They look simple and familiar and might make the programmer happy They are in fact very tricky and should make US happy The one slide to remember


Download ppt "How Good can a Transactional Memory be? R. Guerraoui, EPFL."

Similar presentations


Ads by Google