1 RELAY: Static Race Detection on Millions of Lines of Code. Jan Voung, Ranjit Jhala, and Sorin Lerner, UC San Diego

2 Definition of data race
  - A data race is an event between 2 threads, where there are
    - unordered accesses to the same memory location
    - and at least one of the accesses is a write
  thread 1: … temp = g; …
  thread 2: g = V; …

3 Bugs from races
  - Example bug:
    - incorrect resource bounds
  thread 1: … temp = size; …
  thread 2: …; free(); size = size - n; …

4 RELAY against data races
  - RELAY finds data races
  - RELAY is a static tool:
    - analyzes the program before it runs
  - RELAY is scalable:
    - analyzed the Linux kernel (4.5 million LOC)
    - in 5 hours on 32 CPUs
  - Found 53 races in a sample of 149 warnings

5 Outline
  - Introduction
  - Computing locksets & recording accesses
    - Relative locksets
    - Guarded accesses
  - Experiments with Linux
    - Categorizing warnings: false positives
    - Filters targeting categories

6 Checking with Locksets
  - Locks are a mechanism for mutual exclusion
    - Only one thread holds a particular lock at a time.
  - No race if the same lock must have been acquired for each of two different shared accesses
  thread 1: lock(l); temp = g; unlock(l); …
  thread 2: read(g); lock(l); g = 0; …
  (diagram: the locks held at each access, and the locks the two threads hold in common)
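The lockset rule on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not RELAY's implementation: the `Access` record and the `may_race` function are hypothetical names, and locks and memory locations are modeled as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    thread: str            # thread performing the access
    location: str          # shared memory location, e.g. "g"
    is_write: bool
    locks_held: frozenset  # locks that must be held at the access


def may_race(a: Access, b: Access) -> bool:
    """Apply the lockset rule to one pair of accesses."""
    if a.thread == b.thread or a.location != b.location:
        return False
    if not (a.is_write or b.is_write):
        return False  # two reads never race
    # No race if at least one lock is held in common at both accesses.
    return not (a.locks_held & b.locks_held)


# Accesses to g taken from the slide.
t1_read = Access("thread 1", "g", False, frozenset({"l"}))   # temp = g, under l
t2_read = Access("thread 2", "g", False, frozenset())        # read(g), no lock
t2_write = Access("thread 2", "g", True, frozenset({"l"}))   # g = 0, under l

print(may_race(t1_read, t2_write))  # False: both accesses hold l
print(may_race(t1_read, t2_read))   # False: neither access is a write
```

If thread 2 wrote `g` without first acquiring `l`, the two locksets would have no lock in common and the same call would report a possible race.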

7 A more realistic (but simple) example
  work (void *d) {
    o = d->priv;
    lock(o->lock);
    read_stats(o);
  }
  read_stats(x) {
    if (x->f1++ < 10) {      // read/write with lock: no race
      unlock(x->lock);
      x->stats =...;         // write without lock: race
    } else
      unlock(x->lock);
  }

8 Key components of RELAY
  - GOAL: scalability
  - KEY: modularity
  (code for work and read_stats as on slide 7; diagram: work ( ) { … } and read_stats ( ) {…} drawn as boxes labeled work and read_stats)

9 Key components of RELAY
  1) Relative locksets: locks acquired/released in a function; the caller handles locks held before the call
     - L+: locks that MUST have been acquired
     - L-: locks that MAY have been released
  2) Guarded accesses: pair accesses with relative locksets to catch races
  3) Summaries
  4) Symbolic execution: what is the “same” memory location? Normalize to globals and formals.
  work (void *d) {
    o = d->priv;
    lock(o->lock);           // L+ = {d->priv->lock}, L- = {}
    read_stats(o);
  }
  read_stats(x) {            // L+ = {}, L- = {}
    if (x->f1++ < 10) {
      unlock(x->lock);       // L+ = {}, L- = {x->lock}
      x->stats =...;
    } else
      unlock(x->lock);       // L+ = {}, L- = {x->lock}
  }
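One way to read the L+ / L- definitions is as a pair of sets updated at each lock operation and merged where control-flow paths join. The sketch below follows that reading. The names `RelLockset`, `lock`, `unlock`, and `join` are hypothetical, and the update rules are an assumption chosen to reproduce the annotations on the slide, not a statement of RELAY's exact rules.

```python
from typing import FrozenSet, NamedTuple


class RelLockset(NamedTuple):
    plus: FrozenSet[str]   # L+: locks that MUST have been acquired since entry
    minus: FrozenSet[str]  # L-: locks that MAY have been released since entry


EMPTY = RelLockset(frozenset(), frozenset())  # state at function entry


def lock(ls: RelLockset, l: str) -> RelLockset:
    return RelLockset(ls.plus | {l}, ls.minus - {l})


def unlock(ls: RelLockset, l: str) -> RelLockset:
    return RelLockset(ls.plus - {l}, ls.minus | {l})


def join(a: RelLockset, b: RelLockset) -> RelLockset:
    # "Must" facts survive only if true on both paths; "may" facts accumulate.
    return RelLockset(a.plus & b.plus, a.minus | b.minus)


# read_stats(x): both branches release x->lock, then the paths merge.
then_branch = unlock(EMPTY, "x->lock")
else_branch = unlock(EMPTY, "x->lock")
read_stats_exit = join(then_branch, else_branch)

# work(d): state just before the call to read_stats.
work_before_call = lock(EMPTY, "d->priv->lock")

print(read_stats_exit)
print(work_before_call)
```

Because the sets are relative to function entry, `read_stats` can be analyzed without knowing which locks its callers hold.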

10 How RELAY runs
  1) Assume symbolic execution ran.
  2) Compute relative locksets
  3) Compute guarded accesses
  read_stats(x) {            // L+ = {}, L- = {}
    if (x->f1++ < 10) {
      unlock(x->lock);       // L+ = {}, L- = {x->lock}
      x->stats =...;
    } else
      unlock(x->lock);
  }
  summary of read_stats:
    relative lockset: L+ = {}, L- = {x->lock}
    guarded accesses:
      x->stats: L+ = {}, L- = {x->lock}
      x->f1:    L+ = {}, L- = {}
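Steps 2 and 3 can be sketched for `read_stats` by walking each path through the function and recording every access together with the relative lockset in force at that point. The statement encoding and the `guarded_accesses` helper are hypothetical, and enumerating paths explicitly is a simplification for this small example rather than a claim about how RELAY traverses code.

```python
def guarded_accesses(paths):
    """Collect (location, is_write, L+, L-) for every access on every path."""
    recorded = set()
    for path in paths:
        plus, minus = set(), set()  # relative lockset at function entry
        for stmt in path:
            kind, target = stmt[0], stmt[1]
            if kind == "lock":
                plus.add(target)
                minus.discard(target)
            elif kind == "unlock":
                plus.discard(target)
                minus.add(target)
            elif kind == "access":
                recorded.add((target, stmt[2], frozenset(plus), frozenset(minus)))
    return recorded


# The two paths through read_stats(x). x->f1++ both reads and writes x->f1,
# so it is modeled here as a write.
then_path = [
    ("access", "x->f1", True),
    ("unlock", "x->lock"),
    ("access", "x->stats", True),
]
else_path = [
    ("access", "x->f1", True),
    ("unlock", "x->lock"),
]

for access in sorted(guarded_accesses([then_path, else_path]), key=lambda a: a[0]):
    print(access)
```

The result contains one entry for `x->f1` with empty L+ and L-, and one entry for `x->stats` with `x->lock` in L-, matching the guarded accesses listed in the summary.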

11 How RELAY runs
  summary of read_stats(x):
    relative lockset: L+ = {}, L- = {x->lock}
    guarded accesses:
      x->stats: L+ = {}, L- = {x->lock}
      x->f1:    L+ = {}, L- = {}

12 Applying summaries
  work (void *d) {           // L+ = {}, L- = {}
    o = d->priv;
    lock(o->lock);           // BEFORE the call: L+ = {d->priv->lock}, L- = {}
    read_stats(o);           // AFTER the call:  L+ = {}, L- = {d->priv->lock}
  }
  DIFFERENCE: summary of read_stats(x):
    relative lockset: L+ = {}, L- = {x->lock}
    guarded accesses:
      x->stats: L+ = {}, L- = {x->lock}
      x->f1:    L+ = {}, L- = {}
  summary of work:
    relative lockset: L+ = {}, L- = {d->priv->lock}

13 Applying summaries
  work (void *d) {           // L+ = {}, L- = {}
    o = d->priv;
    lock(o->lock);           // BEFORE the call: L+ = {d->priv->lock}, L- = {}
    read_stats(o);           // AFTER the call:  L+ = {}, L- = {d->priv->lock}
  }
  DIFFERENCE: summary of read_stats(x):
    relative lockset: L+ = {}, L- = {x->lock}
    guarded accesses:
      x->stats: L+ = {}, L- = {x->lock}
      x->f1:    L+ = {}, L- = {}
  summary of work:
    relative lockset: L+ = {}, L- = {d->priv->lock}
    guarded accesses:
      d->priv->stats: L+ = {}, L- = {d->priv->lock}
      d->priv->f1:    L+ = {d->priv->lock}, L- = {}
      d->priv:        L+ = {}, L- = {}
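The BEFORE / DIFFERENCE / AFTER step can be sketched as two operations: rewrite the callee's names from its formal `x` to the caller's actual `d->priv`, then compose the caller's lockset with the callee's relative lockset. The helpers `rebase`, `instantiate`, and `compose` are hypothetical, and the composition rule is an assumption chosen because it reproduces the numbers on the slide.

```python
def rebase(name, formal, actual):
    """Rewrite an lvalue rooted at the callee's formal in the caller's terms."""
    if name == formal:
        return actual
    if name.startswith(formal + "->"):
        return actual + name[len(formal):]
    return name


def instantiate(lockset, formal, actual):
    plus, minus = lockset
    return (frozenset(rebase(l, formal, actual) for l in plus),
            frozenset(rebase(l, formal, actual) for l in minus))


def compose(before, diff):
    """Apply a relative lockset (diff) on top of the caller's state (before)."""
    b_plus, b_minus = before
    d_plus, d_minus = diff
    return (b_plus - d_minus) | d_plus, (b_minus - d_plus) | d_minus


# Summary of read_stats(x), copied from the slide.
callee_lockset = (frozenset(), frozenset({"x->lock"}))
callee_accesses = {
    "x->stats": (frozenset(), frozenset({"x->lock"})),
    "x->f1": (frozenset(), frozenset()),
}

# State in work(d) just before the call read_stats(o), where o = d->priv.
before = (frozenset({"d->priv->lock"}), frozenset())

after = compose(before, instantiate(callee_lockset, "x", "d->priv"))
caller_accesses = {
    rebase(loc, "x", "d->priv"): compose(before, instantiate(ls, "x", "d->priv"))
    for loc, ls in callee_accesses.items()
}

print(after)
print(caller_accesses)
```

The access to `d->priv->f1` inherits the caller's lock, while `d->priv->stats` does not, because the callee may already have released that lock by the time it writes `stats`.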

14 Checking for Races
  work (void *d) {           // L+ = {}, L- = {}
    o = d->priv;
    lock(o->lock);           // L+ = {d->priv->lock}, L- = {}
    read_stats(o);           // L+ = {}, L- = {d->priv->lock}
  }
  summary of work (void *d), shown once per thread and compared row by row:
    d->priv:        L+ = {}, L- = {}                  reads only => no race
    d->priv->f1:    L+ = {d->priv->lock}, L- = {}     common lock => no race
    d->priv->stats: L+ = {}, L- = {d->priv->lock}     no common lock => race
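The row-by-row comparison can be sketched as a pairwise check over the guarded accesses of two thread summaries. The `find_races` function and the tuple layout are hypothetical. The sketch also assumes that both threads run `work` on the same object, so equal names denote the same memory location. A real analysis has to establish that separately.

```python
from itertools import product


def find_races(summary_a, summary_b):
    """Each summary is a list of (location, is_write, L+) guarded accesses."""
    races = set()
    for (loc_a, w_a, plus_a), (loc_b, w_b, plus_b) in product(summary_a, summary_b):
        if loc_a != loc_b:
            continue  # different locations cannot race
        if not (w_a or w_b):
            continue  # reads only => no race
        if plus_a & plus_b:
            continue  # common lock => no race
        races.add(loc_a)  # no common lock => race
    return races


# Guarded accesses in the summary of work(d), with L+ copied from the slide.
work_summary = [
    ("d->priv", False, frozenset()),
    ("d->priv->f1", True, frozenset({"d->priv->lock"})),
    ("d->priv->stats", True, frozenset()),
]

print(sorted(find_races(work_summary, work_summary)))  # ['d->priv->stats']
```

Only L+ takes part in the check: a lock counts as protecting an access only if it must be held there.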

15 Modular Unsoundness
  - Pointer-arithmetic corner cases
  - Accesses in assembly code
  - Function pointers
  - Not enforcing must-alias for lockset intersection
  - Filters
  Revisit each and improve

16 Outline
  - Introduction
  - Computing locksets & recording accesses
    - Relative locksets
    - Guarded accesses
  - Experiments with Linux
    - Categorizing warnings: false positives
    - Filters targeting categories

17 Linux experiments
  - 5000+ warnings
  - Sample 90 and categorize
  - Design and apply filters to zoom in on races

18 Categories of false positives
  - Initialization: thread allocates object and initializes it before sharing
  - Aliasing: mixed up different data structures
  - Unsharing: objects removed from shared structures
  - Recursive locks: “Big kernel lock”
  - Non-lock synchronization: spawn, wait, signal, etc.
  - Conditional locking: locking correlated with return value, conditionals, etc.

19 Example filter: Thread “ownership”
  - To reduce initialization false positives:
    - remove accesses originating from the thread that allocated the object.
  thread 1: x = malloc(); init(x); share(x); update(x)
  thread 2: x = get(); update(x)
  thread 3: x = get(); update(x)
  (diagram marks the filtered accesses)
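The ownership filter can be sketched as a pass over recorded accesses that drops any access made by the thread that allocated the object. This is a simplified model with hypothetical names (`ownership_filter`, `allocated_by`). It assumes the allocating thread of each object is already known, which the slide does not explain how to determine.

```python
def ownership_filter(accesses, allocated_by):
    """Drop accesses made by the thread that allocated the accessed object.

    accesses: list of (thread, object, operation) tuples
    allocated_by: dict mapping each object to its allocating thread
    """
    return [
        (thread, obj, op)
        for thread, obj, op in accesses
        if allocated_by.get(obj) != thread
    ]


# The scenario from the slide: thread 1 allocates x; threads 2 and 3 get it later.
allocated_by = {"x": "thread 1"}
accesses = [
    ("thread 1", "x", "init"),
    ("thread 1", "x", "update"),
    ("thread 2", "x", "update"),
    ("thread 3", "x", "update"),
]

remaining = ownership_filter(accesses, allocated_by)
print(remaining)
```

In this model the filter also drops thread 1's `update(x)` after `share(x)`, so a real race involving the allocating thread would be hidden. That is the kind of precision trade-off a filter like this makes.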

20 Before filters: 11% data races

21 After filters: 80% data races

22 The absolute numbers
  (chart; category labels: initialization; non-aliasing, unsharing; recursive locks; non-lock sync.)

23 Related Work
  - Dynamic techniques
    - Locksets and extensions [Savage et al. 97, Choi et al. 02, Yu et al. 05, Elmas et al. 07]
    - Atomicity [Flanagan et al. 04, Wang et al. 06]
    - Benign vs. harmful [Narayanasamy et al. 07]
  - Static techniques for Java
    - Type systems [Flanagan et al. 99, Boyapati et al. 02]
    - Aliasing, must-not aliasing [Naik et al. 06, 07]
  - Static techniques for C
    - Scalability, ranking [Engler et al. 03]
    - Aliasing and sharing [Pratikakis et al. 06, Kahlon et al. 07]

24 Summary
  - Relative locksets: modular summary-based analysis
  - Can analyze 46K functions of the Linux kernel
    - modular => parallelizable
    - on a grid of 32 CPUs: approx. 5 hours
  - Modular unsoundness
    - finds 53 races (or 25 after all filters)
    - future work: better analyses, better filters
    - whether races are benign or not is another question!
