FastTrack: Efficient and Precise Dynamic Race Detection [FlFr09] Cormac Flanagan and Stephen N. Freund GNU OS Lab. 23-Jun-16 Ok-kyoon Ha.


1 FastTrack: Efficient and Precise Dynamic Race Detection [FlFr09] Cormac Flanagan and Stephen N. Freund GNU OS Lab. 23-Jun-16 Ok-kyoon Ha

2 Contents [FlFr09] PLDI’09
• Introduction
• Background
• The FastTrack Algorithm
• Implementation
• Evaluation
• Conclusions

3 Introduction
• Motivation
  • vector clocks (VCs) are expensive
    • each VC requires O(n) storage space, and each VC operation requires O(n) time
  • FastTrack is motivated in part by the performance limitations of vector clocks
• Limitations of existing detectors
  • imprecise race detectors and static race detectors can report false alarms
  • precise race detectors never produce false alarms, but they are limited by the performance overhead of VCs
• The full generality of vector clocks is not actually necessary in most cases
  • the vast majority of data in multithreaded programs is either thread-local, lock-protected, or read-shared
  • constant-time fast paths can handle these common cases without any loss of precision or correctness in the general case

4
• FastTrack Overview
  • uses epochs
    • an epoch is a pair of a clock and a thread identifier
  • for write accesses: records information only about the very last write to x
    • all writes to x are totally ordered by the happens-before relation
  • for read accesses: records only the epoch of the last read of x
    • read operations on thread-local and lock-protected data are totally ordered
  • reduces the overhead of almost all monitored operations
    • analysis time: from O(n) to O(1), where n is the number of threads in the program
    • space: from O(n) to O(1), for thread-local and lock-protected data only

5 Background
• Multithreaded Program Traces
  • a thread t performs operations from the following set
    • rd(t, x) and wr(t, x): read a value from x and write a value to x
    • acq(t, m) and rel(t, m): acquire and release a lock m
    • fork(t, u): fork a new thread u
    • join(t, u): block until thread u terminates
  • happens-before relation <α
    • the smallest transitively closed relation over the operations in a trace α
    • a <α b holds when a occurs before b in α and one of the following applies: program order, locking, or fork-join
  • race condition
    • two conflicting operations in a trace are not related by the happens-before relation
    • that is, the trace has two concurrent conflicting accesses
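The definitions on this slide can be checked directly on short traces. The following is a minimal sketch, not the detector itself: it builds the happens-before relation by brute force and reports conflicting accesses that it leaves unordered. The tuple encoding of operations and the function names `happens_before` and `races` are assumptions made for illustration.

```python
from itertools import combinations

# Assumed encoding: a trace is a list of (kind, thread, target) tuples,
# e.g. ("wr", 0, "x") for wr(0, x) or ("fork", 0, 1) for fork(0, 1).

def happens_before(trace):
    """Return the set of index pairs (i, j) such that trace[i] <α trace[j]."""
    hb = set()
    for i, j in combinations(range(len(trace)), 2):  # i occurs before j
        (ka, ta, xa), (kb, tb, xb) = trace[i], trace[j]
        program_order = ta == tb
        locking = (ka in ("acq", "rel") and kb in ("acq", "rel") and xa == xb)
        # One operation is fork(t, u) or join(t, u); the other is by thread u.
        fork_join = ((ka in ("fork", "join") and xa == tb) or
                     (kb in ("fork", "join") and xb == ta))
        if program_order or locking or fork_join:
            hb.add((i, j))
    # Naive transitive closure; acceptable only for small example traces.
    changed = True
    while changed:
        changed = False
        for (i, j) in list(hb):
            for (k, l) in list(hb):
                if j == k and (i, l) not in hb:
                    hb.add((i, l))
                    changed = True
    return hb

def races(trace):
    """Return index pairs of conflicting accesses not ordered by happens-before."""
    hb = happens_before(trace)
    found = []
    for i, j in combinations(range(len(trace)), 2):
        (ka, _, xa), (kb, _, xb) = trace[i], trace[j]
        both_accesses = ka in ("rd", "wr") and kb in ("rd", "wr")
        conflicting = both_accesses and xa == xb and "wr" in (ka, kb)
        if conflicting and (i, j) not in hb:
            found.append((i, j))
    return found

if __name__ == "__main__":
    print(races([("wr", 0, "x"), ("wr", 1, "x")]))  # → [(0, 1)]
```

This costs far more than any practical detector; it only serves as a reference for what "concurrent conflicting accesses" means.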

6
• Review: the DJIT+ Algorithm
  • based on vector clocks
    • maintains a vector clock for each thread and an additional vector clock for each lock m
    • to identify conflicting accesses, keeps two vector clocks per variable: one for reads and one for writes
[Figure: DJIT+ state C0, C1, Lm, Wx over the trace wr(0, x), rel(0, m), acq(1, m), wr(1, x)]
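To make the O(n) cost concrete, here is a minimal sketch of vector-clock bookkeeping in the style described on this slide. The class name `DjitState`, its method names, and the initial clock values are assumptions for illustration; the real DJIT+ algorithm has further optimizations that are not shown.

```python
def vc_leq(a, b):
    """Pointwise comparison of two vector clocks; O(n) in the thread count."""
    return all(x <= y for x, y in zip(a, b))

def vc_join(a, b):
    """Pointwise maximum of two vector clocks; also O(n)."""
    return [max(x, y) for x, y in zip(a, b)]

class DjitState:
    """Hypothetical vector-clock state: C per thread, L per lock, R and W per variable."""

    def __init__(self, n_threads):
        self.n = n_threads
        # Assumed initial state: each thread's own entry starts at 1.
        self.C = [[1 if i == t else 0 for i in range(n_threads)]
                  for t in range(n_threads)]
        self.L, self.R, self.W = {}, {}, {}

    def _zero(self):
        return [0] * self.n

    def release(self, t, m):
        self.L[m] = list(self.C[t])   # lock remembers the releasing thread's time
        self.C[t][t] += 1

    def acquire(self, t, m):
        self.C[t] = vc_join(self.C[t], self.L.get(m, self._zero()))

    def read(self, t, x):
        """Return True if the read is race-free with respect to earlier writes."""
        ok = vc_leq(self.W.get(x, self._zero()), self.C[t])
        r = self.R.setdefault(x, self._zero())
        r[t] = self.C[t][t]
        return ok

    def write(self, t, x):
        """Return True if the write is race-free with respect to earlier accesses."""
        ok = (vc_leq(self.W.get(x, self._zero()), self.C[t]) and
              vc_leq(self.R.get(x, self._zero()), self.C[t]))
        w = self.W.setdefault(x, self._zero())
        w[t] = self.C[t][t]
        return ok
```

Replaying the trace from the figure (`write(0, "x")`, `release(0, "m")`, `acquire(1, "m")`, `write(1, "x")`) reports no race, whereas dropping the lock operations makes the second write fail the comparison. Every access check here walks a full vector, which is the cost FastTrack aims to avoid.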

7 The FastTrack Algorithm
• Empirical data gathered from running race detection
  • a full VC is not necessary for almost all read and write operations
  • a lightweight representation of the happens-before relation can be used instead
  • only a small fraction of operations need full vector clock operations
• How to catch each type of race condition?
  • each race condition is one of the following
    • a read-write race: a read concurrent with a later write to the same variable, e.g. rd(0, x) then wr(1, x)
    • a write-read race: a write concurrent with a later read, e.g. wr(0, x) then rd(1, x)
    • a write-write race: two concurrent writes, e.g. wr(0, x) then wr(1, x)

8
• Detecting write-write races
  • all writes to x are totally ordered (as long as no races have been detected)
  • an epoch c@t is a pair of a clock c and a thread t
  • epochs reduce the space and analysis overhead of write-write checks to O(1)
[Figure: state C0, C1, Lm, Wx over the trace wr(0, x), rel(0, m), acq(1, m), wr(1, x); Wx values shown: ⊥e, 4@0, 8@1]
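The O(1) claim comes from comparing a single epoch against one entry of a vector clock. A minimal sketch follows; the names `Epoch`, `BOTTOM`, `epoch_leq`, and `check_write` are hypothetical, and the clock values in the usage lines are assumed so that they line up with the 4@0 and 8@1 epochs in the figure.

```python
from typing import NamedTuple

class Epoch(NamedTuple):
    """An epoch c@t: clock c of thread t."""
    clock: int
    tid: int

BOTTOM = Epoch(0, 0)  # stands in for the minimal epoch ⊥e

def epoch_leq(e, vc):
    """c@t happens before vector clock V iff c <= V[t]: one lookup, O(1)."""
    return e.clock <= vc[e.tid]

def check_write(w_x, c_t, t):
    """Check a write by thread t against the last-write epoch w_x.

    Returns (race_free, new_w_x). c_t is thread t's current vector clock.
    """
    race_free = epoch_leq(w_x, c_t)
    return race_free, Epoch(c_t[t], t)

if __name__ == "__main__":
    ok, w = check_write(BOTTOM, [4, 0], 0)   # wr(0, x) with assumed C0 = <4, 0>
    print(ok, w)
    ok, w = check_write(w, [4, 8], 1)        # wr(1, x) after acq(1, m), assumed C1 = <4, 8>
    print(ok, w)
```

If thread 1 had not acquired the lock, its clock would still have 0 in thread 0's entry, and the same comparison would flag a write-write race.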

9
• Detecting write-read races
  • uses the epoch Wx and the current vector clock Ct of the reading thread
  • checks that the read happens after the last write
  • the comparison Wx ≤ Ct needs only O(1) time
[Figure: state C0, C1, Lm, Wx over the trace wr(0, x), rel(0, m), acq(1, m), wr(1, x), rd(0, x); Wx values shown: ⊥e, 4@0, 8@1]

10
• Detecting read-write races
  • read-write races are more difficult to detect
    • a write could potentially conflict with the last read performed by any other thread
    • in general, this requires recording an entire VC that holds the last read of x by each thread t
  • common situations where an epoch suffices (reads are totally ordered in practice)
    • thread-local data: only one thread accesses the variable, and hence these accesses are totally ordered (program order)
    • lock-protected data: a protecting lock is held on each access to the variable, and hence all accesses are totally ordered (program order or synchronization order)
  • reads are typically unordered only when data is read-shared
  • uses an adaptive representation for tracking the read history
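One way to picture the adaptive representation is a read history that stays a single epoch while reads are ordered and switches to a vector clock only when two reads turn out to be concurrent. The sketch below assumes an epoch is a `(clock, tid)` tuple and a vector clock is a list; the function names are hypothetical.

```python
def record_read(r_x, c_t, t):
    """Return the updated read history of x after a read by thread t.

    r_x is either an epoch (clock, tid) or a vector clock (list).
    c_t is thread t's current vector clock.
    """
    if isinstance(r_x, list):
        # Already read-shared: update only this thread's entry.
        updated = list(r_x)
        updated[t] = c_t[t]
        return updated
    clock, tid = r_x
    if clock <= c_t[tid]:
        # The previous read happens before this one: an epoch still suffices.
        return (c_t[t], t)
    # The two reads are concurrent: switch to a vector clock holding both.
    vc = [0] * len(c_t)
    vc[tid] = clock
    vc[t] = c_t[t]
    return vc

def write_races_with_reads(r_x, c_t):
    """Return True if a write by a thread with clock c_t races with earlier reads."""
    if isinstance(r_x, list):
        return not all(r <= c for r, c in zip(r_x, c_t))  # O(n) comparison
    clock, tid = r_x
    return clock > c_t[tid]                               # O(1) comparison

if __name__ == "__main__":
    r = record_read((0, 0), [7, 1], 1)   # first read, by thread 1
    r = record_read(r, [8, 0], 0)        # concurrent read by thread 0
    print(r)                             # → [8, 1]
```

Thread-local and lock-protected variables never leave the epoch branch, so both recording a read and checking a later write stay constant-time for them.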

11
• Analysis Details
  • an online algorithm that maintains an analysis state σ
    • σ = (Ct, Lm, Rx, Wx)
    • Rx: either the epoch of the last read of x (when all other reads are ordered before it) or a vector clock that is the join of all reads of x
  • reads: 82.3% of all operations
    • O(n) time when a read makes x read-shared (Rx switches from an epoch to a vector clock): 0.1% of reads
    • O(1) time for all other reads
  • writes: 14.5% of all operations
    • O(n) time for a write to read-shared data (Rx is a vector clock): 0.1% of writes
    • O(1) time for all other writes
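Putting the pieces together, the analysis state and its read and write handling might look roughly like the sketch below. This is an illustration under assumptions, not the authors' implementation: the class and method names are hypothetical, epochs are `(clock, tid)` tuples, `(0, 0)` stands in for ⊥e, and each thread's own clock entry is assumed to start at 1.

```python
class FastTrackSketch:
    """Hypothetical analysis state: C (threads), L (locks), R and W (variables)."""
    BOTTOM = (0, 0)  # stands in for the minimal epoch ⊥e

    def __init__(self, n_threads):
        self.n = n_threads
        self.C = [[1 if i == t else 0 for i in range(n_threads)]
                  for t in range(n_threads)]
        self.L, self.R, self.W = {}, {}, {}
        self.races = []

    def _epoch(self, t):
        return (self.C[t][t], t)

    def _leq(self, epoch, t):
        clock, tid = epoch
        return clock <= self.C[t][tid]          # O(1) epoch-vs-VC comparison

    def _join_into(self, t, other):
        self.C[t] = [max(a, b) for a, b in zip(self.C[t], other)]

    def read(self, t, x):
        r = self.R.get(x, self.BOTTOM)
        if r == self._epoch(t):                 # same epoch: nothing to do
            return
        if not self._leq(self.W.get(x, self.BOTTOM), t):
            self.races.append(("write-read", x, t))
        if isinstance(r, list):                 # already shared: O(1) entry update
            r[t] = self.C[t][t]
        elif self._leq(r, t):                   # ordered reads: keep an epoch
            self.R[x] = self._epoch(t)
        else:                                   # becomes shared: O(n) allocation
            vc = [0] * self.n
            vc[r[1]] = r[0]
            vc[t] = self.C[t][t]
            self.R[x] = vc

    def write(self, t, x):
        if self.W.get(x) == self._epoch(t):     # same epoch: nothing to do
            return
        if not self._leq(self.W.get(x, self.BOTTOM), t):
            self.races.append(("write-write", x, t))
        r = self.R.get(x, self.BOTTOM)
        if isinstance(r, list):                 # shared reads: O(n) comparison
            if not all(a <= b for a, b in zip(r, self.C[t])):
                self.races.append(("read-write", x, t))
            self.R[x] = self.BOTTOM             # read history collapses again
        elif not self._leq(r, t):
            self.races.append(("read-write", x, t))
        self.W[x] = self._epoch(t)

    def acquire(self, t, m):
        self._join_into(t, self.L.get(m, [0] * self.n))

    def release(self, t, m):
        self.L[m] = list(self.C[t])
        self.C[t][t] += 1

    def fork(self, t, u):
        self._join_into(u, self.C[t])
        self.C[t][t] += 1

    def join(self, t, u):
        self._join_into(t, self.C[u])
        self.C[u][u] += 1
```

Only two branches touch a whole vector: the read that first makes a variable shared, and a write that follows shared reads. Every other read and write is a handful of integer comparisons, which is consistent with the O(1) and O(n) split listed on the slide.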

12
• An Example of FT
[Figure: FastTrack state C0, C1, Wx, Rx over a trace containing wr(0, x), fork(0, 1), rd(0, x), rd(1, x), join(0, 1); epoch values shown include ⊥e, 7@0, 8@0, and 1@1]

13 Implementation
• FT Instrumentation State and Code
  • represents an epoch c@t as a 32-bit integer
    • the top 8 bits store the thread identifier t
    • the bottom 24 bits store the clock c
  • associates a ThreadState object with each thread
    • it contains a unique thread identifier tid and a vector clock C
    • the instrumentation code uses t.C[t.tid], thread t's own entry in its vector clock
• Granularity
  • supports two levels of granularity for analyzing memory locations
    • fine-grain analysis (the default) and coarse-grain analysis
    • coarse-grain analysis reduces the memory footprint
    • but it may produce false alarms if two fields of an object are protected by different locks
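The bit layout on this slide can be illustrated with a few shifts and masks. This is a sketch of the packing scheme only; the helper names are hypothetical, and the vector clock in `epoch_leq` is assumed to be a plain list of clock values.

```python
TID_BITS = 8      # top 8 bits: thread identifier
CLOCK_BITS = 24   # bottom 24 bits: clock
CLOCK_MASK = (1 << CLOCK_BITS) - 1
MAX_TID = (1 << TID_BITS) - 1

def make_epoch(tid, clock):
    """Pack c@t into a single 32-bit integer."""
    if not 0 <= tid <= MAX_TID:
        raise ValueError("thread identifier does not fit in 8 bits")
    if not 0 <= clock <= CLOCK_MASK:
        raise ValueError("clock does not fit in 24 bits")
    return (tid << CLOCK_BITS) | clock

def epoch_tid(epoch):
    """Extract the thread identifier from a packed epoch."""
    return epoch >> CLOCK_BITS

def epoch_clock(epoch):
    """Extract the clock from a packed epoch."""
    return epoch & CLOCK_MASK

def epoch_leq(epoch, vc):
    """c@t <= V iff c <= V[t], using only one entry of the vector clock."""
    return epoch_clock(epoch) <= vc[epoch_tid(epoch)]

if __name__ == "__main__":
    e = make_epoch(tid=1, clock=8)       # the epoch 8@1
    print(epoch_tid(e), epoch_clock(e))  # → 1 8
```

The layout implies limits of 256 thread identifiers and 2^24 clock values per thread, which is why the sketch validates both fields before packing.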

14
• Extensions
  • supports additional synchronization primitives
    • wait and notify, volatile variables, and barriers
  • wait and notify
    • a wait operation on lock m is modeled as a release of m followed by a later acquire of m
    • so it does not need additional analysis rules
    • a notify operation can be ignored
  • volatile variables
    • a write of a volatile variable vx is guaranteed to happen before every subsequent read of vx
    • extends the L component to map volatile variables to the VC of the last write
    • volatile reads and writes update the state in much the same way as lock acquire and release, respectively
  • barriers
    • considers a release operation barrier_rel(T) for a barrier
    • the first post-barrier step of each thread happens after all pre-barrier steps
    • and is unordered with respect to the next steps taken by the other threads
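The volatile and barrier handling described above can be sketched as plain vector-clock updates. The function names, the list-of-lists layout for C, and the exact update rules are assumptions chosen to match the slide's description (volatile write like a release, volatile read like an acquire, barrier release joining the clocks of the released threads).

```python
def vc_join(a, b):
    """Pointwise maximum of two vector clocks."""
    return [max(x, y) for x, y in zip(a, b)]

def volatile_write(C, L, t, vx):
    """Like a lock release: publish thread t's time on vx, then advance t."""
    zero = [0] * len(C[t])
    L[vx] = vc_join(C[t], L.get(vx, zero))
    C[t] = list(C[t])
    C[t][t] += 1

def volatile_read(C, L, t, vx):
    """Like a lock acquire: thread t learns everything published on vx."""
    zero = [0] * len(C[t])
    C[t] = vc_join(C[t], L.get(vx, zero))

def barrier_release(C, threads):
    """Release the threads in `threads` from a barrier.

    Each released thread's next step happens after all pre-barrier steps,
    and each thread then advances its own entry, so the post-barrier steps
    of different threads remain unordered with respect to one another.
    """
    merged = [0] * len(C[0])
    for u in threads:
        merged = vc_join(merged, C[u])
    for t in threads:
        C[t] = list(merged)
        C[t][t] += 1

if __name__ == "__main__":
    C = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # assumed initial clocks for 3 threads
    L = {}
    volatile_write(C, L, 0, "v")
    volatile_read(C, L, 1, "v")
    barrier_release(C, [0, 1])
    print(C)
```

After the barrier in the usage lines, the clocks of threads 0 and 1 each contain the other's pre-barrier time but differ in their own entries, so neither is less than or equal to the other.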

15 Evaluation
• Precision and Performance
  • compares the precision and performance of 7 dynamic analyses
    • Empty, FastTrack, Eraser, DJIT+, MultiRace, Goldilocks, and BasicVC
  • all tools were implemented on top of RoadRunner
• Benchmark Configuration
  • performed experiments on 16 benchmarks
  • reports at most one race for each field of each class and each array access
• Summary of Results
  • FT outperforms the other tools
    • provides almost a 10x speedup over BasicVC and a 2.3x speedup even over the DJIT+ algorithm
    • provides a substantial increase in precision over Eraser with no loss in performance

16 Conclusion
• FastTrack is a new precise race detection algorithm
  • uses an adaptive, lightweight representation of the happens-before relation that reduces both space and time overheads
  • despite its efficiency, it is a comparatively simple algorithm that is straightforward to implement
  • contains optimized constant-time fast paths that handle upwards of 96% of the operations in the benchmarks
  • provides a 2.3x performance improvement over the DJIT+ algorithm and incurs less than half the memory overhead of DJIT+

