Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin.

1 Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin

2 Detecting data races in production

3 Overhead: FastTrack [Flanagan & Freund ’09] 80x → 8x

4 Overhead: FastTrack [Flanagan & Freund ’09] = c_reads&writes + c_sync × n, where n = number of threads

5 Overhead: FastTrack [Flanagan & Freund ’09] = c_reads&writes + c_sync × n. The c_reads&writes term is the problem today; the c_sync × n term is the problem in future.

6 Overhead: FastTrack [Flanagan & Freund ’09] = c_reads&writes + c_sync × n. Pacer = (c_reads&writes + c_sync × n) × r + c_non-sampling × (1 – r), where r = sampling rate

7 Overhead: FastTrack [Flanagan & Freund ’09] = c_reads&writes + c_sync × n. Pacer = (c_reads&writes + c_sync × n) × r + c_non-sampling × (1 – r). The first Pacer term covers sampling periods; the second covers non-sampling periods.

8 Overhead: FastTrack [Flanagan & Freund ’09] = c_reads&writes + c_sync × n. Pacer = (c_reads&writes + c_sync × n) × r + c_non-sampling × (1 – r). Probability (detecting any race): FastTrack 1, Pacer r
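
The cost model on slides 4-8 can be tried out with a little arithmetic. The sketch below is illustrative only: the class name, method names, and every constant are hypothetical placeholders, not measurements from the talk.

```java
// Illustrative cost model only; all constants are made-up placeholders.
final class OverheadModel {
    // FastTrack-style overhead: a constant cost for reads and writes plus a
    // synchronization cost that grows with the number of threads n.
    static double fastTrack(double cReadsWrites, double cSync, int n) {
        return cReadsWrites + cSync * n;
    }

    // Pacer-style overhead: the full analysis cost is paid only for the
    // fraction r of execution in sampling periods; the remaining (1 - r)
    // pays the smaller non-sampling cost.
    static double pacer(double cReadsWrites, double cSync, int n,
                        double cNonSampling, double r) {
        return fastTrack(cReadsWrites, cSync, n) * r + cNonSampling * (1.0 - r);
    }

    public static void main(String[] args) {
        double cRw = 6.0, cSync = 0.25, cNon = 0.3; // hypothetical values
        int n = 8;                                  // hypothetical thread count
        for (double r : new double[] {0.0, 0.01, 0.03, 1.0}) {
            System.out.printf("r=%.2f  FastTrack=%.2f  Pacer=%.2f%n",
                    r, fastTrack(cRw, cSync, n), pacer(cRw, cSync, n, cNon, r));
        }
    }
}
```

At r = 1 the Pacer expression reduces to the FastTrack one, and at r = 0 only the non-sampling cost remains, which is the scaling the slides describe.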

9 Detect race  first access sampled

10 Thread A, Thread B. Timeline: sampling period, non-sampling period, sampling period, non-sampling period

11 Thread A, Thread B. Accesses shown: write x, read x, read y, write y

12 Insight #1: Stop tracking variable after non-sampled access

13 Thread A write x unlock m Thread B

14 Thread A write x unlock m Thread B lock m

15 Thread A write x unlock m Thread B lock m write x

16 Thread A write x unlock m read x Thread B lock m write x

17 Thread A write x unlock m read x Thread B lock m write x Race!

19-21 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Vector clocks [A, B]: Thread A = [5, 2], Thread B = [3, 4]

22 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [5, 2], B = [3, 4]

23 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [5, 2], B = [3, 4]. Last write to x: 5@A

24 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [5, 2], B = [3, 4], lock m = [5, 2]. Last write to x: 5@A

25 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [3, 4], lock m = [5, 2]. Last write to x: 5@A. Increment clock

26 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A. Join clocks

27 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A. Happens before?

28-29 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A

30 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A. No work performed

31 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A. Race uncaught

32 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 4@B

33 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 4@B. Happens before? Race!
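
The clock values in this walk-through can be replayed in a few lines of Java. This is a minimal two-thread sketch under stated assumptions, not FastTrack's or Pacer's implementation: the class and method names are hypothetical, clocks are plain `int[]` arrays indexed [A, B], and a write stamp `c@t` is stored as two ints.

```java
import java.util.Arrays;

// Two-thread sketch of the slide example; names and layout are illustrative.
final class VectorClockTrace {
    static final int A = 0, B = 1; // index of each thread in a clock [A, B]

    // Join = element-wise maximum; costs O(n) for n threads.
    static int[] join(int[] x, int[] y) {
        int[] out = new int[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = Math.max(x[i], y[i]);
        }
        return out;
    }

    // An access stamped "time@tid" happens before the current point of a
    // thread iff that thread's clock has caught up to `time` for `tid`.
    static boolean happensBefore(int time, int tid, int[] clock) {
        return time <= clock[tid];
    }

    public static void main(String[] args) {
        int[] a = {5, 2}; // Thread A's clock
        int[] b = {3, 4}; // Thread B's clock

        int wTime = a[A], wTid = A;   // A: write x  -> last write 5@A
        int[] m = a.clone();          // A: unlock m -> m = [5, 2]
        a[A]++;                       //                A = [6, 2]
        b = join(b, m);               // B: lock m   -> B = [5, 4]

        boolean writeOrdered = happensBefore(wTime, wTid, b); // B: write x
        wTime = b[B];                                         // last write 4@B
        wTid = B;
        boolean readOrdered = happensBefore(wTime, wTid, a);  // A: read x

        System.out.println("A=" + Arrays.toString(a) + " B=" + Arrays.toString(b));
        System.out.println("5@A before B's write: " + writeOrdered);
        System.out.println("4@B before A's read:  " + readOrdered
                + (readOrdered ? "" : " (race)"));
    }
}
```

The first check succeeds because B's clock holds 5 for A after the join; the second fails because A's clock still holds 2 for B, which is the race the slides flag.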

34 Insight #2: We only care whether “A happens before B” if A is sampled

35 Thread A, Thread B. Do these events happen before other events? We don’t care!

36 Thread A, Thread B. Sampling periods: increment clocks. Non-sampling periods: don’t increment clocks. Do these events happen before other events? We don’t care!

37 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks [A, B]: A = [5, 2], B = [3, 4]

38 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 4], [5, 4], [5, 2]. No clock increment

39 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 4], [5, 4], [5, 2], [5, 2]

40 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 4], [5, 4], [5, 2], [5, 2]. Unnecessary join

41 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 4], [5, 4], [5, 2], [5, 2]. O(n) → O(1)
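
One way to picture how an unnecessary join could be skipped in O(1) is to tag each clock with a version number and remember which versions a thread has already joined. The sketch below is a simplified, hypothetical scheme written for illustration; the talk does not spell out its mechanism at this level, and every class, field, and method name here is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical scheme for skipping redundant joins.
final class VersionedJoinSketch {
    // A vector clock plus its owner and a version bumped on every change.
    static final class Clock {
        final int owner;
        int version;
        final int[] time;

        Clock(int owner, int version, int[] time) {
            this.owner = owner;
            this.version = version;
            this.time = time;
        }

        Clock snapshot() {
            return new Clock(owner, version, time.clone());
        }
    }

    static final class ThreadState {
        final Clock clock;
        // Newest version of each owner's clock already joined into ours.
        final Map<Integer, Integer> joined = new HashMap<>();
        int fullJoins = 0; // number of O(n) joins actually performed

        ThreadState(int id, int version, int[] time) {
            this.clock = new Clock(id, version, time);
        }
    }

    // Release: the lock keeps a snapshot of the releasing thread's clock.
    // Only sampling periods advance the thread's own logical time.
    static Clock release(ThreadState t, boolean sampling) {
        Clock snap = t.clock.snapshot();
        if (sampling) {
            t.clock.time[t.clock.owner]++;
            t.clock.version++;
        }
        return snap;
    }

    // Acquire: O(1) when this snapshot, or a newer one from the same owner,
    // was joined before; otherwise fall back to the O(n) join.
    static void acquire(ThreadState t, Clock lockClock) {
        int seen = t.joined.getOrDefault(lockClock.owner, -1);
        if (seen >= lockClock.version) {
            return;
        }
        boolean changed = false;
        for (int i = 0; i < lockClock.time.length; i++) {
            if (lockClock.time[i] > t.clock.time[i]) {
                t.clock.time[i] = lockClock.time[i];
                changed = true;
            }
        }
        if (changed) {
            t.clock.version++;
        }
        t.joined.put(lockClock.owner, lockClock.version);
        t.fullJoins++;
    }

    public static void main(String[] args) {
        ThreadState a = new ThreadState(0, 6, new int[] {5, 2});
        ThreadState b = new ThreadState(1, 1, new int[] {3, 4});

        Clock m1 = release(a, false); // non-sampling: A stays [5, 2]
        acquire(b, m1);               // first sight of this version: full join
        Clock m2 = release(a, false); // same clock, same version
        acquire(b, m2);               // already joined: skipped

        System.out.println("B = [" + b.clock.time[0] + ", " + b.clock.time[1]
                + "], full joins = " + b.fullJoins);
    }
}
```

Because A's clock does not change between the two unlocks in a non-sampling period, the second acquire finds the same version and returns without touching the vector, so only one full join is counted.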

42 http://jikesrvm.org/Research+Archive

43 1

44 Qualitative improvement in time & space

45 Probability (detecting any race) = r ?

47 LiteRace [Marino et al. ’09]: cold-region hypothesis [Chilimbi & Hauswirth ’04]; full analysis at synchronization operations

48 Accuracy, time, space  sampling rate Detect race  first access sampled

49 Accuracy, time, space  sampling rate Detect race  first access sampled Qualitative improvement

50 Accuracy, time, space  sampling rate Detect race  first access sampled Qualitative improvement Help developers fix difficult-to-reproduce bugs

51 Accuracy, time, space  sampling rate Detect race  first access sampled Qualitative improvement Help developers fix difficult-to-reproduce bugs Thank you

53 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 4], [5, 4] tagged v6. Vector clock versions

54 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 4], [5, 2]; version tag v6

55 Thread A: unlock m1 … unlock m2. Thread B: lock m1 … lock m2. Clocks shown: [5, 2], [3, 4], [5, 2], [5, 2], [5, 4]; version tags v6, v6. Join unnecessary

60 Qualitative improvement

61
- Core 2 Quad (4 cores)
- Multithreaded benchmarks (DaCapo & SPECjbb2000)
- Evaluating sampling-based race detection
- Need 100s of trials to evaluate
- Some races are rare
- Evaluate only frequent races

62
- Two accesses to same variable (one is a write)
- One access doesn’t happen before the other
- Program order
- Synchronization order
  - Acquire-release
  - Wait-notify
  - Fork-join
  - Volatile read-write

63 Thread A: write x, unlock m. Thread B: no operations yet. Definition bullets as on slide 62.

64 Thread A: write x, unlock m. Thread B: lock m, write x. Definition bullets as on slide 62.

65 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Definition bullets as on slide 62.

66 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Race! Definition bullets as on slide 62.

67 Races indicate
- Atomicity violations
- Order violations

68 Races indicate
- Atomicity violations
- Order violations
Races lead to
- Sequential consistency violations
- No races ⇒ sequential consistency (Java/C++)
- Races ⇒ writes observed out of order

69 As slide 68, plus: Most races potentially harmful [Flanagan & Freund ’10]

70
    class ProducerConsumer {
      boolean ready;
      int x;
      produce() { x = … ; ready = true; }
      consume() { while (!ready) { } … = x; }
    }

71-73 Same class as slide 70, with thread labels T1 and T2 above produce() and consume()

74 Same class as slide 70, with thread labels T1 and T2. Can read old value

75
    class ProducerConsumer {
      boolean ready;
      int x;
      produce() { x = … ; ready = true; }
      consume() { … = x; while (!ready) { } }
    }
Thread labels T1 and T2. Legal reordering by compiler or hardware

76 Same class as slide 70, with thread labels T1 and T2

77
    class ProducerConsumer {
      volatile boolean ready;
      int x;
      produce() { x = … ; ready = true; }
      consume() { while (!ready) { } … = x; }
    }
Thread labels T1 and T2. Happens-before edge
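
A runnable variant of the slide's class shows the volatile flag in action. This is a sketch under assumptions: the slide elides the produced value and the use of `x`, so the `value` parameter, the `runOnce` helper, and the class name below are hypothetical additions for demonstration.

```java
// Runnable variant of the slide's class; names beyond the slide are illustrative.
final class ProducerConsumerDemo {
    private volatile boolean ready; // volatile write/read pair orders the accesses to x
    private int x;

    void produce(int value) {
        x = value;
        ready = true; // volatile write: publishes the earlier write to x
    }

    int consume() {
        while (!ready) { } // volatile read: spin until the flag is published
        return x;
    }

    // Hypothetical helper: run one producer (T1) and one consumer (T2).
    static int runOnce(int value) {
        ProducerConsumerDemo pc = new ProducerConsumerDemo();
        int[] seen = new int[1];
        Thread t2 = new Thread(() -> seen[0] = pc.consume());
        Thread t1 = new Thread(() -> pc.produce(value));
        t2.start();
        t1.start();
        try {
            t1.join();
            t2.join(); // join makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce(42)); // prints 42
    }
}
```

Without `volatile`, the Java memory model would allow the consumer to spin forever or to return a stale `x`; with it, the write to `x` is ordered before the read that follows the loop.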

78
    class LibraryBook {
      Set borrowers;
    }

79
    class LibraryBook {
      Set borrowers;
      addBorrower(Person p) {
        if (borrowers == null) {
          borrowers = new HashSet();
        }
        borrowers.add(p);
      }
    }

80
    class LibraryBook {
      Set borrowers;
      addBorrower(Person p) {
        synchronized (this) {
          if (borrowers == null) {
            borrowers = new HashSet();
          }
        }
        borrowers.add(p);
      }
    }

81-82
    class LibraryBook {
      Set borrowers;
      addBorrower(Person p) {
        if (borrowers == null) {
          synchronized (this) {
            if (borrowers == null) {
              borrowers = new HashSet();
            }
          }
        }
        borrowers.add(p);
      }
    }

83
    // left column
    addBorrower(Person p) {
      if (borrowers == null) {
        synchronized (this) {
          if (borrowers == null) {
            borrowers = new HashSet();
          }
        }
      }
      ...
      borrowers.add(p);
    }
    // right column
    addBorrower(Person p) {
      if (borrowers == null) { ... }
      borrowers.add(p);
    }

84 As slide 83, with the allocation in the left column expanded to: HashSet obj = alloc HashSet; obj.<init>(); borrowers = obj;

85-86 As slide 84, with the last two statements reordered: HashSet obj = alloc HashSet; borrowers = obj; obj.<init>();
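
The slides stop at the broken double-checked locking pattern. One common remedy in Java, shown here as a hedged sketch rather than as the authors' recommendation, is to make the field `volatile` and keep mutation of the set under the lock. `String` stands in for the slide's `Person` type, and the class name, `borrowerCount`, and `addConcurrently` are hypothetical helpers added for the demonstration.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of one possible fix; String stands in for the slide's Person type.
final class LibraryBookSketch {
    // volatile: the reference is published only after construction completes
    private volatile Set<String> borrowers;

    void addBorrower(String person) {
        Set<String> set = borrowers;      // single volatile read on the fast path
        if (set == null) {
            synchronized (this) {
                set = borrowers;
                if (set == null) {
                    set = new HashSet<>();
                    borrowers = set;      // volatile write after the constructor
                }
            }
        }
        synchronized (this) {
            set.add(person);              // HashSet itself is not thread-safe
        }
    }

    synchronized int borrowerCount() {
        Set<String> set = borrowers;
        return set == null ? 0 : set.size();
    }

    // Hypothetical driver: several threads add distinct borrowers at once.
    static int addConcurrently(int threads, int perThread) {
        LibraryBookSketch book = new LibraryBookSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    book.addBorrower("person-" + id + "-" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return book.borrowerCount();
    }

    public static void main(String[] args) {
        System.out.println(addConcurrently(4, 1000)); // prints 4000
    }
}
```

The volatile write prevents the reordering shown on slides 85-86 from being observed by another thread, and the second synchronized block addresses the separate race on `borrowers.add(p)` that the slide code leaves outside the lock.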

88 33% base overhead ~50% overhead

89
               | Program alone | FastTrack       | Pacer
Detection rate | 0             | occurrence rate | occurrence rate × r
Running time   | t             | t(c1 + c2 n)    | t[(c1 + c2 n)r + c3]
- Evaluate only frequent races
- Evaluate scaling with r
- Don’t evaluate scaling with n

91 50 million people

92
- Energy Management System
- Alarm and Event Processing Routine (1 MLOC)
http://www.securityfocus.com/news/8412

93
- Energy Management System
- Alarm and Event Processing Routine (1 MLOC)
- Post-mortem analysis: 8 weeks
“This fault was so deeply embedded, it took them weeks of poring through millions of lines of code and data to find it.” –Ralph DiNicola, FirstEnergy
http://www.securityfocus.com/news/8412

94
- Race condition
- Two threads writing to data structure simultaneously
- Usually occurs without error
- Small window for causing data corruption
http://www.securityfocus.com/news/8412

95
- Tracks happens-before: sound & precise
- 80X slowdown
- Each analysis step: O(n) time (n = # of threads)

96
- Tracks happens-before: sound & precise
- 80X slowdown
- Each analysis step: O(n) time (n = # of threads)
- FastTrack [Flanagan & Freund ’09]
  - Reads & writes (97%): O(1) time
  - Synchronization (3%): O(n) time
  - 8X slowdown

97 As slide 96, with annotations: reads & writes are the problem today; synchronization is the problem in future

98 As slide 96

99 Thread A, Thread B. Vector clocks [A, B]: Thread A = [5, 2], Thread B = [3, 4]

100 Thread A, Thread B. Vector clocks [A, B]: Thread A = [5, 2], Thread B = [3, 4]. Labels: Thread A’s logical time (5), Thread B’s logical time (4)

101 Thread A, Thread B. Vector clocks [A, B]: Thread A = [5, 2], Thread B = [3, 4]. Labels: last logical time “received” from B (2, in A’s clock), last logical time “received” from A (3, in B’s clock)

102 Thread A: unlock m. Thread B: lock m. Clocks [A, B]: A = [5, 2] → [6, 2], B = [3, 4]. Increment clock

103 Thread A: unlock m. Thread B: lock m. Clocks [A, B]: A = [6, 2], B = [3, 4] → [5, 4], lock m = [5, 2]. Join clocks

104 Thread A: unlock m. Thread B: lock m. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. n = # of threads; O(n) time

105 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [5, 2], B = [3, 4]

106-107 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [5, 2], B = [3, 4]. Last write to x: 5@A

108-109 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [3, 4], lock m = [5, 2]. Last write to x: 5@A

110 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A

111 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Last write to x: 5@A. Happens before?

112 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Write stamps shown: 5@A, 4@B

113 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Write stamps shown: 5@A, 4@B. Happens before?

114 Thread A: write x, unlock m, read x. Thread B: lock m, write x. Clocks [A, B]: A = [6, 2], B = [5, 4], lock m = [5, 2]. Write stamps shown: 5@A, 4@B. Happens before? Race!

115 FastTrack [Flanagan & Freund ’09] vs. Pacer. Detection rate: occurrence rate (FastTrack), occurrence rate × r (Pacer); r = sampling rate

116 FastTrack [Flanagan & Freund ’09] vs. Pacer. Detection rate: occurrence rate (FastTrack), occurrence rate × r (Pacer). Running time: t(c1 + c2 n) (FastTrack); n = no. of threads

117 FastTrack [Flanagan & Freund ’09] vs. Pacer. Detection rate: occurrence rate (FastTrack), occurrence rate × r (Pacer). Running time: t(c1 + c2 n) (FastTrack); c1 = reads & writes, c2 n = synchronization

118 FastTrack [Flanagan & Freund ’09] vs. Pacer. Detection rate: occurrence rate (FastTrack), occurrence rate × r (Pacer). Running time: t(c1 + c2 n) (FastTrack); reads & writes are the problem today, synchronization is the problem in future

119 FastTrack [Flanagan & Freund ’09] vs. Pacer. Detection rate: occurrence rate (FastTrack), occurrence rate × r (Pacer). Running time: t(c1 + c2 n) (FastTrack), t[(c1 + c2 n)r + c3] (Pacer); (c1 + c2 n)r = overhead in sampling periods

120 FastTrack [Flanagan & Freund ’09] vs. Pacer. Detection rate: occurrence rate (FastTrack), occurrence rate × r (Pacer). Running time: t(c1 + c2 n) (FastTrack), t[(c1 + c2 n)r + c3] (Pacer); (c1 + c2 n)r = overhead in sampling periods, c3 = overhead in non-sampling periods (small)

123 Data race occurs extremely rarely; data race occurs periodically. Phases shown: pre-deployment, deployed

124 “We test exhaustively … we had in excess of three million online operational hours [342 years] in which nothing had ever exercised that bug.” –Mike Unum, manager of commercial solutions, GE Energy http://www.securityfocus.com/news/8412

125 Data race  buggy execution
