CS294-32: Dynamic Data Race Detection Koushik Sen UC Berkeley

Race Conditions class Ref { int i; void inc() { int t = i + 1; i = t; } } Courtesy Cormac Flanagan

Race Conditions class Ref { int i; void inc() { int t = i + 1; i = t; } } Ref x = new Ref(0); parallel { x.inc(); // two calls happen x.inc(); // in parallel } assert x.i == 2; A race condition occurs if two threads access a shared variable at the same time without synchronization and at least one of those accesses is a write

Race Conditions class Ref { int i; void inc() { int t = i + 1; i = t; } } Ref x = new Ref(0); parallel { x.inc(); // two calls happen x.inc(); // in parallel } assert x.i == 2; [diagram: interleaved RD(i) and WR(i) accesses by threads t1 and t2]

Lock-Based Synchronization class Ref { int i; // guarded by this void inc() { synchronized (this) { int t = i + 1; i = t; } } } Ref x = new Ref(0); parallel { x.inc(); // two calls happen x.inc(); // in parallel } assert x.i == 2; Field guarded by a lock Lock acquired before accessing field Ensures race freedom

Dynamic Race Detection Happens Before [Dinning and Schonberg 1991] Lockset: –Eraser [Savage et al. 1997] –Precise Lockset [Choi et al. 2002] Hybrid [O'Callahan and Choi 2003]

Dynamic Race Detection Advantages –Precise knowledge of the execution No false positives [Happens-Before] Unless you try to predict data races [Lockset] Disadvantages –Produce false negatives because they only consider a subset of possible program executions.

What are we going to analyze? A trace representing an actual execution of a program A trace is a sequence of events: –MEM(m,a,t): thread t accessed memory location m, where the access a ∈ {RD,WR} m can be o.f, C.f, a[i] –ACQ(l,t): thread t acquires lock l Ignore re-acquires of locks. l can be o. –REL(l,t): thread t releases lock l –SND(g,t): thread t sends message g –RCV(g,t): thread t receives message g If t1 calls t2.start(), then generate SND(g,t1) and RCV(g,t2) If t1 calls t2.join(), then generate SND(g,t2) and RCV(g,t1)
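
To make this trace format concrete, here is a minimal Java sketch of the event records described above (the class and field names are illustrative, not taken from any particular tool):

// Minimal sketch of the trace events described above; names are illustrative.
enum AccessType { RD, WR }

abstract class Event {
    final String thread;                     // t: the thread performing the event
    Event(String thread) { this.thread = thread; }
}

class MemEvent extends Event {               // MEM(m,a,t)
    final String location;                   // m: e.g. "o.f", "C.f", "a[i]"
    final AccessType access;                 // a ∈ {RD, WR}
    MemEvent(String m, AccessType a, String t) { super(t); location = m; access = a; }
}

class AcqEvent extends Event {               // ACQ(l,t)
    final String lock;
    AcqEvent(String l, String t) { super(t); lock = l; }
}

class RelEvent extends Event {               // REL(l,t)
    final String lock;
    RelEvent(String l, String t) { super(t); lock = l; }
}

class SndEvent extends Event {               // SND(g,t), e.g. generated when t1 calls t2.start()
    final String message;
    SndEvent(String g, String t) { super(t); message = g; }
}

class RcvEvent extends Event {               // RCV(g,t), e.g. generated when t2 begins execution
    final String message;
    RcvEvent(String g, String t) { super(t); message = g; }
}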

How to generate a trace? class Ref { int i; // guarded by this void inc() { synchronized (this) { int t = i + 1; i = t; } } } Ref x = new Ref(0); parallel { x.inc(); // two calls happen x.inc(); // in parallel } assert x.i == 2; class Ref { int i; // guarded by this void inc() { print("ACQ("+id(this)+","+thisThread+")"); synchronized (this) { print("MEM("+id(this)+".i,RD,"+thisThread+")"); int t = i + 1; print("MEM("+id(this)+".i,WR,"+thisThread+")"); i = t; } print("REL("+id(this)+","+thisThread+")"); } } Ref x = new Ref(0); parallel { x.inc(); // two calls happen x.inc(); // in parallel } assert x.i == 2; Instrument a Program

Sample Trace class Ref { int i; // guarded by this void inc() { print("ACQ("+id(this)+","+thisThread+")"); synchronized (this) { print("MEM("+id(this)+".i,RD,"+thisThread+")"); print("MEM("+id(this)+".i,WR,"+thisThread+")"); i = i + 1; } print("REL("+id(this)+","+thisThread+")"); } } Ref x = new Ref(0); parallel { x.inc(); // two calls happen x.inc(); // in parallel } assert x.i == 2; ACQ(4365,t1); MEM(4365.i,RD,t1) MEM(4365.i,WR,t1) REL(4365,t1); ACQ(4365,t2); MEM(4365.i,RD,t2) MEM(4365.i,WR,t2) REL(4365,t2); Sample Trace

Compute Locks Held by a Thread ACQ(4365,t1); MEM(4365.i,RD,t1) MEM(4365.i,WR,t1) REL(4365,t1); ACQ(4365,t2); MEM(4365.i,RD,t2) MEM(4365.i,WR,t2) REL(4365,t2); Sample Trace L(t1)={}, L(t2)={} L(t1)={4365}, L(t2)={} L(t1)={}, L(t2)={} L(t1)={}, L(t2)={4365} L(t1)={}, L(t2)={} Locks Held L(t) = locks held by thread t. How do we compute L(t)?
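
A minimal sketch of this bookkeeping, assuming the event classes sketched above: while scanning the trace, add the lock to L(t) on ACQ(l,t) and remove it on REL(l,t).

import java.util.*;

// Sketch: maintain L(t), the set of locks held by each thread, while scanning a trace.
class LocksHeld {
    private final Map<String, Set<String>> held = new HashMap<>();

    Set<String> locksOf(String thread) {
        return held.computeIfAbsent(thread, t -> new HashSet<>());
    }

    void onEvent(Event e) {
        if (e instanceof AcqEvent) {
            AcqEvent acq = (AcqEvent) e;
            locksOf(acq.thread).add(acq.lock);      // ACQ(l,t): add l to L(t)
        } else if (e instanceof RelEvent) {
            RelEvent rel = (RelEvent) e;
            locksOf(rel.thread).remove(rel.lock);   // REL(l,t): remove l from L(t)
        }
        // MEM, SND, and RCV events do not change L(t).
    }
}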

Let us now analyze a trace Instrument Program Run Program => A Trace File Analyze Trace File

Happens-before relation [Dinning and Schonberg 1991] Idea: Infer a happens-before relation ≺ between events in a trace We say e1 ≺ e2 –if e1 and e2 are events from the same thread and e1 appears before e2 in the trace –if e1 = SND(g,t) and e2 = RCV(g,t’) –if there is an e’ such that e1 ≺ e’ and e’ ≺ e2 REL(l,t) and a subsequent ACQ(l,t’) generate SND(g,t) and RCV(g,t’) We say e1 and e2 are in a race if –e1 and e2 are not related by ≺ –e1 and e2 are from different threads –e1 and e2 access the same memory location and one of the accesses is a write
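
One way to realize this definition, sketched below under the assumption that the whole trace fits in memory: record program-order edges, SND→RCV edges, and REL→subsequent-ACQ edges, then answer e1 ≺ e2 by graph reachability. (Practical detectors use vector clocks instead; see the slides later in the deck.)

import java.util.*;

// Sketch: happens-before as reachability over program-order, SND->RCV, and REL->ACQ edges.
class HappensBefore {
    private final Map<Integer, List<Integer>> succ = new HashMap<>();

    void addEdge(int e1, int e2) {                  // direct edge e1 ≺ e2
        succ.computeIfAbsent(e1, k -> new ArrayList<>()).add(e2);
    }

    // Build edges from a trace; an event's id is its index in the trace.
    void build(List<Event> trace) {
        Map<String, Integer> lastOfThread = new HashMap<>();
        Map<String, Integer> pendingSnd = new HashMap<>();
        Map<String, Integer> lastRel = new HashMap<>();
        for (int i = 0; i < trace.size(); i++) {
            Event e = trace.get(i);
            Integer prev = lastOfThread.put(e.thread, i);
            if (prev != null) addEdge(prev, i);     // program order within a thread
            if (e instanceof SndEvent) pendingSnd.put(((SndEvent) e).message, i);
            if (e instanceof RcvEvent) {
                Integer snd = pendingSnd.get(((RcvEvent) e).message);
                if (snd != null) addEdge(snd, i);   // SND(g,t) ≺ RCV(g,t')
            }
            if (e instanceof RelEvent) lastRel.put(((RelEvent) e).lock, i);
            if (e instanceof AcqEvent) {
                Integer rel = lastRel.get(((AcqEvent) e).lock);
                if (rel != null) addEdge(rel, i);   // REL(l,t) ≺ subsequent ACQ(l,t')
            }
        }
    }

    boolean before(int e1, int e2) {                // transitive closure via search
        if (e1 == e2) return false;                 // ≺ is strict
        Deque<Integer> work = new ArrayDeque<>();
        Set<Integer> seen = new HashSet<>();
        work.push(e1);
        while (!work.isEmpty()) {
            for (int next : succ.getOrDefault(work.pop(), Collections.emptyList())) {
                if (next == e2) return true;
                if (seen.add(next)) work.push(next);
            }
        }
        return false;
    }
}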

Happens-before: example 1 ACQ(mutex) v := v + 1 REL(mutex) Thread 1 ACQ(mutex) v := v + 1 REL(mutex) Thread 2 x := x + 1 In this execution, any two accesses of shared variables are related by happens-before

Happens-before: example 2 ACQ(mutex) v := v + 1 REL(mutex) Thread 1 ACQ(mutex) v := v + 1 REL(mutex) Thread 2 x := x + 1 Therefore, only this second execution reveals the existing data race!

Eraser Lockset Savage, Burrows, Nelson, Sobalvarro, Anderson Assume a database D storing tuples (m,L) where: –m is a memory location –L is a set of locks that protect m Initially D contains a tuple (m,U) for each memory location m, where U is the universal set of locks

How does it work? For an event MEM(m,a,t), generate the tuple (m,L(t)) Let (m,L’) be the tuple present in D –Report a race over memory location m if L(t) ∩ L’ = empty set Replace (m,L’) by (m, L(t) ∩ L’) in D
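
A minimal Java sketch of this update rule, reusing the LocksHeld tracker above. The universal initial lockset is modeled here by treating the first access to a location specially (an assumption of this sketch, not of the original tool):

import java.util.*;

// Sketch of the basic Eraser refinement: C(m) := C(m) ∩ L(t) on every access to m.
class EraserLockset {
    private final Map<String, Set<String>> candidates = new HashMap<>(); // m -> C(m)
    private final LocksHeld locksHeld = new LocksHeld();

    void onEvent(Event e) {
        locksHeld.onEvent(e);
        if (!(e instanceof MemEvent)) return;
        MemEvent mem = (MemEvent) e;
        Set<String> lt = locksHeld.locksOf(mem.thread);
        Set<String> cm = candidates.get(mem.location);
        if (cm == null) {
            // First access: C(m) is the universal set, so C(m) ∩ L(t) = L(t).
            candidates.put(mem.location, new HashSet<>(lt));
            return;
        }
        cm.retainAll(lt);                            // C(m) := C(m) ∩ L(t)
        if (cm.isEmpty()) {
            System.out.println("Warning: possible race on " + mem.location);
        }
    }
}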

Eraser: Example 1 ACQ(mutex) v := v + 1 REL(mutex) ACQ(mutex) v := v + 1 REL(mutex) Thread 1 Thread 2 L(t2) = {mutex} (v,{mutex}) ∈ D L(t2) = {mutex} (v,{mutex}) ∈ D L(t1)={mutex} (v,{mutex}) ∈ D L(t1)={mutex} (v,{mutex}) ∈ D

Eraser: Example 2 ACQ(mutex1) v := v + 1 REL(mutex1) ACQ(mutex2) v := v + 1 REL(mutex2) Thread 1 Thread 2 L(t2) = {mutex2} (v,{}) ∈ D L(t2) = {mutex2} (v,{}) ∈ D L(t1) = {mutex1} (v,{mutex1}) ∈ D L(t1) = {mutex1} (v,{mutex1}) ∈ D Warning!!

Lockset [state diagram with states Shared-exclusive (track lockset) and Shared-read/write (track lockset; race condition!), connected by a transition labeled "any thread r/w"]

Extending Lockset (Thread Local Data) [state diagram with states Thread Local, Shared-exclusive (track lockset), and Shared-read/write (track lockset; race condition!), and transitions labeled "first thread r/w", "second thread r/w", and "any thread r/w"]

Extending Lockset (Read Shared Data) [state diagram with states Thread Local, Read Shared, Shared-exclusive (track lockset), and Shared-read/write (track lockset; race condition!), and transitions labeled "first thread r/w", "second thread read", "second thread write", "any thread read", "any thread write", and "any thread r/w"]
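
These two refinements can be folded into a small per-location state machine. The sketch below loosely follows the states described in the Eraser paper (thread-local, read-shared, shared-modified); it tells the caller when to start refining the lockset and when an empty lockset should actually be reported:

// Sketch of the per-location state refinement for thread-local and read-shared data,
// loosely following the state machine of the Eraser paper.
enum LocState { THREAD_LOCAL, READ_SHARED, SHARED_MODIFIED }

class LocInfo {
    private LocState state = LocState.THREAD_LOCAL;
    private String owner;                            // first thread to touch the location

    // Update the state for this access and return it. The caller refines the lockset in
    // READ_SHARED and SHARED_MODIFIED, and reports empty locksets only in SHARED_MODIFIED.
    LocState onAccess(String thread, AccessType a) {
        switch (state) {
            case THREAD_LOCAL:
                if (owner == null) {
                    owner = thread;                  // first thread r/w
                } else if (!owner.equals(thread)) {  // second thread r/w
                    state = (a == AccessType.RD) ? LocState.READ_SHARED
                                                 : LocState.SHARED_MODIFIED;
                }
                break;
            case READ_SHARED:
                if (a == AccessType.WR) state = LocState.SHARED_MODIFIED; // any thread write
                break;
            default:
                break;                               // SHARED_MODIFIED: stay
        }
        return state;
    }
}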

Eraser: Problem v T1 (L1,L2) T2 (L2,L3) T3 (L1,L3) ⇒ false alarm ACQ(L1,L2) v := v + 1 REL(L2,L1) ACQ(L2,L3) v := v + 1 REL(L3,L2) ACQ(L3,L1) v := v + 1 REL(L1,L3) Thread 1 Thread 2 Thread 3

Precise Lockset Choi, Lee, Loginov, O'Callahan, Sarkar, Sridharan Assume a database D storing tuples (m,t,L,a) where: –m is a memory location –t is a thread accessing m –L is a set of locks held by t while accessing m –a is the type of access (read or write) Initially D is empty

How does it work? For an event MEM(m,a,t), generate the tuple (m,a,L(t),t) If there is a tuple (m’,a’,L’,t’) in D such that –m = m’ –(a = WR) ∨ (a’ = WR) –L(t) ∩ L’ = empty set –t ≠ t’ –then report a race over memory location m Add (m,a,L(t),t) to D
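
A direct transcription of this check into Java, reusing the event classes and LocksHeld tracker sketched earlier (all names are illustrative):

import java.util.*;

// Sketch of Precise Lockset: keep every access tuple and compare each new access
// against all earlier tuples on the same location.
class PreciseLockset {
    static class Entry {
        final AccessType access; final Set<String> locks; final String thread;
        Entry(AccessType a, Set<String> l, String t) { access = a; locks = l; thread = t; }
    }

    private final Map<String, List<Entry>> db = new HashMap<>();   // m -> tuples on m
    private final LocksHeld locksHeld = new LocksHeld();

    void onEvent(Event e) {
        locksHeld.onEvent(e);
        if (!(e instanceof MemEvent)) return;
        MemEvent mem = (MemEvent) e;
        Set<String> lt = new HashSet<>(locksHeld.locksOf(mem.thread));  // snapshot of L(t)
        List<Entry> entries = db.computeIfAbsent(mem.location, m -> new ArrayList<>());
        for (Entry old : entries) {
            boolean oneIsWrite = mem.access == AccessType.WR || old.access == AccessType.WR;
            boolean differentThreads = !old.thread.equals(mem.thread);
            boolean disjointLocks = Collections.disjoint(lt, old.locks);
            if (oneIsWrite && differentThreads && disjointLocks) {
                System.out.println("Race on " + mem.location + " between "
                        + old.thread + " and " + mem.thread);
            }
        }
        entries.add(new Entry(mem.access, lt, mem.thread));
    }
}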

Optimizations Stop adding tuples on m once a race on m is detected Do not add (m,a,L,t) to D if (m,a,L’,t) is already in D and L’ ⊆ L Many more …

Precise Lockset ACQ(mutex) v := v + 1 REL(mutex) Thread 1 ACQ(mutex) v := v + 1 REL(mutex) Thread 2 x := x + 1 D (x,RD,{},t1) (x,WR,{},t1) (v,RD,{mutex},t1) (v,WR,{mutex},t1) (v,RD,{mutex},t2) (v,WR,{mutex},t2) (x,RD,{},t2) (x,WR,{},t2) Conflict detected!

Precise Lockset ACQ(m1,m2) v := v + 1 REL(m2,m1) ACQ(m2,m3) v := v + 1 REL(m3,m2) ACQ(m3,m1) v := v + 1 REL(m1,m3) Thread 1 Thread 2 Thread 3 D (v,RD,{m1,m2},t1) (v,WR,{m1,m2},t1) (v,RD,{m2,m3},t2) (v,WR,{m2,m3},t2) (v,RD,{m1,m3},t3) (v,WR,{m1,m3},t3) No conflicts detected!

Precise Lockset: Not so Precise t2.start() Thread 1 Thread 2 x := x + 1 Precise Lockset gives a warning, but there is no warning with Happens-Before

Hybrid Dynamic Data Race Detection Relax Happens-Before –No happens-before relation between REL(l,t) and a subsequent ACQ(l,t’) Maintain Precise Lockset along with the Relaxed Happens-Before

Hybrid O'Callahan and Choi Assume a database D storing tuples (m,t,L,a,e) where: –m is a memory location –t is a thread accessing m –L is a set of locks held by t while accessing m –a is the type of access (read or write) –e is the event associated with the access Initially D is empty

How does it work? For an event e = MEM(m,a,t), generate the tuple (m,a,L(t),t,e) If there is a tuple (m’,a’,L’,t’,e’) in D such that –m = m’ –(a = WR) ∨ (a’ = WR) –L(t) ∩ L’ = empty set –t ≠ t’ –e and e’ are not related by the happens-before relation, i.e., ¬(e ≺ e’) ∧ ¬(e’ ≺ e) –then report a race over memory location m Add (m,a,L(t),t,e) to D
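
Relative to the Precise Lockset sketch above, the hybrid check adds exactly one condition: the two accesses must also be unordered by the relaxed happens-before relation. A sketch of just that combined test, assuming each stored tuple also records its event id and assuming a HappensBefore structure built from program-order and SND→RCV edges only (a variant of the earlier sketch with the REL→ACQ edges omitted):

// Sketch of the hybrid race condition test: the Precise Lockset conditions plus
// "e and e' are unordered by the relaxed happens-before relation".
class HybridCheck {
    private final HappensBefore relaxedHb;   // built without the REL->ACQ edges

    HybridCheck(HappensBefore relaxedHb) { this.relaxedHb = relaxedHb; }

    boolean isRace(AccessType a, java.util.Set<String> locks, String thread, int event,
                   AccessType a2, java.util.Set<String> locks2, String thread2, int event2) {
        boolean oneIsWrite = (a == AccessType.WR) || (a2 == AccessType.WR);
        boolean differentThreads = !thread.equals(thread2);
        boolean disjointLocks = java.util.Collections.disjoint(locks, locks2);
        boolean unordered = !relaxedHb.before(event, event2)
                         && !relaxedHb.before(event2, event);
        return oneIsWrite && differentThreads && disjointLocks && unordered;
    }
}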

Hybrid Dynamic Data Race Detection t2.start() Thread 1 Thread 2 x := x + 1 D (x,RD,{},t1,e1) (x,WR,{},t1,e2) (x,RD,{},t2,e3) (x,WR,{},t2,e4) Precise Lockset detects a data race, but e2 ≺ e3. Therefore, the hybrid technique gives no warning

Distributed Computation A set of processes: {p1, p2, …, pn} No shared memory –Communication through messages Each process executes a sequence of events –send(m): sends a message with content m –receive(m): receives a message with content m –Internal event: changes the local state of a process The i-th event from process pj is denoted e_i^j [space-time diagram: processes p1, p2, p3 exchanging messages m1–m4 over physical time, with events e_i^j marked on each process line]

Distributed Computation as a Partial Order A distributed computation defines a partial order → on its events: e → e’ if –e and e’ are events from the same process and e executes before e’, or –e is the send of a message and e’ is the receive of the same message, or –there is an e’’ such that e → e’’ and e’’ → e’ [space-time diagram: processes p1, p2, p3 exchanging messages m1–m4 over physical time]

Distributed Computation as a Partial Order Problem: An external process or observer wants to infer the partial order of the computation for debugging –No global clock –At each event a process can send a message to the observer to inform it about the event –Message delay is unbounded [space-time diagram: processes p1, p2, p3 reporting their events to an Observer]

Can we infer the partial order? From the observation: can we associate a suitable value with every event such that –V(e) < V(e’) iff e → e’ We need the notion of a (logical) clock [diagram: the events of the computation as seen by the observer]

Lamport’s Logical Time All processes use a counter (clock) with an initial value of zero The counter is incremented by one and assigned to each event as its timestamp A send (message) event carries its timestamp For a receive (message) event, the counter is updated by –max(receiver-counter, message-timestamp) + 1 Send the counter value along with an event to the observer
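
A per-process Lamport clock is only a few lines; this sketch assumes each message carries the sender's timestamp:

// Sketch of a per-process Lamport clock.
class LamportClock {
    private long counter = 0;

    long onLocalEvent() {            // internal event: increment and use as the timestamp
        return ++counter;
    }

    long onSend() {                  // send: timestamp the event and attach it to the message
        return ++counter;
    }

    long onReceive(long messageTimestamp) {
        counter = Math.max(counter, messageTimestamp) + 1;   // max(receiver, message) + 1
        return counter;
    }
}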

Example [space-time diagram: processes p1, p2, p3 exchanging messages m1–m4 over physical time]

Example Problem with Lamport’s logical clock: –e → e’ ⇒ C(e) < C(e’) –C(e) < C(e’) ⇒ e → e’ ✗ [space-time diagram: processes p1, p2, p3 exchanging messages m1–m4 over physical time]

Vector Clock Vector Clock: Process → Nat, V: P → N Associate a vector clock Vp with every process p Update the vector clock at every event as follows: –Internal event at p: Vp(p) := Vp(p) + 1 –Send message from p: Vp(p) := Vp(p) + 1; send Vp with the message –Receive message m at p: Vp(i) := max(Vp(i), Vm(i)) for all i ∈ P, where Vm is the vector clock sent with the message m; then Vp(p) := Vp(p) + 1
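
The same idea with one counter per process; this sketch indexes processes by an integer id and also includes the componentwise comparison used on the "Comparing Vector Clocks" slide below:

import java.util.Arrays;

// Sketch of a vector clock for n processes, indexed 0..n-1.
class VectorClock {
    private final long[] v;
    private final int self;                  // index of the owning process

    VectorClock(int n, int self) { this.v = new long[n]; this.self = self; }

    void onInternalOrSendEvent() { v[self]++; }          // internal event or send

    long[] snapshot() { return v.clone(); }              // value attached to an outgoing message

    void onReceive(long[] messageClock) {                // componentwise max, then tick own entry
        for (int i = 0; i < v.length; i++) v[i] = Math.max(v[i], messageClock[i]);
        v[self]++;
    }

    // V ≤ V' iff every component is ≤; V < V' iff V ≤ V' and V ≠ V'.
    static boolean leq(long[] a, long[] b) {
        for (int i = 0; i < a.length; i++) if (a[i] > b[i]) return false;
        return true;
    }

    static boolean lt(long[] a, long[] b) { return leq(a, b) && !Arrays.equals(a, b); }
}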

Example V = (a,b,c) means V(p1)=a, V(p2)=b, and V(p3)=c [space-time diagram: processes p1, p2, p3 exchanging messages m1–m4, with events annotated with vector clocks (1,0,0), (2,0,0), (3,0,0), (0,1,0), (0,0,1), (0,1,2), (2,1,3), (2,1,4), (2,2,4), (2,3,4)]

Intuitive Meaning of a Vector Clock If Vp = (a,b,c) after some event, then –p is affected by the a-th event from p1 –p is affected by the b-th event from p2 –p is affected by the c-th event from p3

Comparing Vector Clocks V ≤ V’ iff for all p ∈ P, V(p) ≤ V’(p) V = V’ iff for all p ∈ P, V(p) = V’(p) V < V’ iff V ≤ V’ and V ≠ V’ Theorem: Ve < Ve’ iff e → e’ Send an event along with its vector clock to the observer

Definition of Data Race Traditional Definition (Netzer and Miller 1992) 46 x=1 if (x==1) …

Definition of Data Race 47 x=1 if (x==1) … send(m) receive(m) X Traditional Definition (Netzer and Miller 1992)

Operational Definition of Data Race We say that the executions of two statements are in a race if they could be executed by different threads temporally next to each other, both access the same memory location, and at least one of the accesses is a write 48 x=1 if (x==1) … send(m) receive(m) X x=1 if (x==1) … Temporally next to each other

Race Directed Random Testing: RACEFUZZER RaceFuzzer: Race-directed random testing STEP1: Use an existing technique to find a set of pairs of state transitions that could potentially race –We use hybrid dynamic race detection –Static race detection can also be used –Transitions are approximated using program statements 49

Race Directed Random Testing: RACEFUZZER RaceFuzzer: Race-directed random testing STEP1: Use an existing technique to find a set of pairs of state transitions that could potentially race –We use hybrid dynamic race detection –Static race detection can also be used –Transitions are approximated using program statements STEP2: Bias a random scheduler so that two transitions under race can be executed temporally next to each other 50
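
The core of STEP2 can be pictured as a scheduler loop that repeatedly picks a random enabled thread, but postpones a thread that is about to execute one of the racing statements until another thread reaches the other racing statement on the same location. The sketch below is only an illustration of that policy, not RaceFuzzer's actual implementation; the hooks nextStatement, executeOneStep, finished, and postponedThreadRacingWith are assumed to be provided by the instrumented program.

import java.util.*;

// Illustrative sketch of a race-directed random scheduler (STEP2). Not the actual
// RaceFuzzer code; the abstract hooks below are assumptions of this sketch.
abstract class RaceDirectedScheduler {
    private final Random random = new Random();
    private final Set<String> racingStatements;          // e.g. {"s5", "s6"} from STEP1
    private final Set<Thread> postponed = new HashSet<>();

    RaceDirectedScheduler(Set<String> racingStatements) {
        this.racingStatements = racingStatements;
    }

    void run(List<Thread> threads) {
        List<Thread> enabled = new ArrayList<>(threads);
        while (!enabled.isEmpty()) {
            enabled.removeIf(this::finished);
            if (enabled.isEmpty()) break;
            Thread t = enabled.get(random.nextInt(enabled.size()));
            // Skip a postponed thread as long as some other thread can still run.
            if (postponed.contains(t) && postponed.size() < enabled.size()) continue;
            if (racingStatements.contains(nextStatement(t))) {
                Thread other = postponedThreadRacingWith(t);
                if (other != null) {
                    executeOneStep(other);               // racing statements run
                    executeOneStep(t);                   // temporally next to each other
                    postponed.remove(other);
                    continue;
                }
                if (postponed.size() + 1 < enabled.size()) {
                    postponed.add(t);                    // postpone t; never postpone everyone
                    continue;
                }
            }
            postponed.remove(t);                         // forced to run: release if postponed
            executeOneStep(t);
        }
    }

    // --- hooks into the instrumented program (assumed, not part of the slides) ---
    protected abstract String nextStatement(Thread t);            // label t will execute next
    protected abstract void executeOneStep(Thread t);             // let t execute one statement
    protected abstract boolean finished(Thread t);
    protected abstract Thread postponedThreadRacingWith(Thread t); // postponed thread at the other
                                                                   // racing statement on the same
                                                                   // location, or null
}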

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 51 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Run ERASER: Statement pair (s5,s6) is in a race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 52 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Run ERASER: Statement pair (s5,s6) is in a race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 53 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); (s5,s6) in race Goal: Create a trace exhibiting the race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 54 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Example Trace: s1: g1(); s2: g2(); s3: g3(); s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; s6: if (o1.f==1) s7: ERROR; s4: g4(); s5: o2.f = 1; Racing Statements Temporally Adjacent (s5,s6) in race Goal: Create a trace exhibiting the race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 55 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 56 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 57 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 58 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 59 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 60 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s6: if (o1.f==1) (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 61 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s6: if (o1.f==1) (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 62 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s6: if (o1.f==1) (s5,s6) in race Postponed = { }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 63 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); (s5,s6) in race Postponed = { } s6: if (o1.f==1) Do not postpone if there is a deadlock

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 64 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); (s5,s6) in race

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 65 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 66 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 67 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 68 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 69 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 70 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 71 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o2.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 72 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o2.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1) }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 73 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o2.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1) } Race?

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 74 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o2.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1) } Race? NO o1.f ≠ o2.f

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 75 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); (s5,s6) in race Postponed = { s6: if (o1.f==1), } s5: o2.f = 1;

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 76 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); (s5,s6) in race Postponed = { s6: if (o1.f==1), s5: o2.f = 1; }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 77 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); (s5,s6) in race Postponed = { s6: if (o1.f==1), s5: o2.f = 1; }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 78 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1), s5: o2.f = 1; }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 79 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1), s5: o2.f = 1; }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 80 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; (s5,s6) in race Postponed = { s6: if (o1.f==1), s5: o2.f = 1; } Race? YES o1.f = o1.f

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 81 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); (s5,s6) in race Postponed = { s5: o2.f = 1; } s6: if (o1.f==1) s5: o1.f = 1;

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 82 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; s6: if (o1.f==1) (s5,s6) in race Postponed = { s5: o2.f = 1; }

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 83 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; s6: if (o1.f==1) (s5,s6) in race Postponed = { s5: o2.f = 1; } Racing Statements Temporally Adjacent

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 84 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; s6: if (o1.f==1) s7: ERROR; (s5,s6) in race Postponed = { s5: o2.f = 1; } Racing Statements Temporally Adjacent

RACEFUZZER using an example Thread1 foo(o1); sync foo(C x) { s1: g1() s2: g2(); s3: g3(); s4: g4(); s5: x.f = 1; } 85 Thread2 bar(o1); bar(C y) { s6: if (y.f==1) s7: ERROR; } Thread3 foo(o2); Execution: s1: g1(); s2: g2(); s3: g3(); s4: g4(); s5: o1.f = 1; s6: if (o1.f==1) s7: ERROR; s5: o2.f = 1; (s5,s6) in race Postponed = { } Racing Statements Temporally Adjacent

Another Example Thread1{ 1:lock(L); 2:f1(); 3:f2(); 4:f3(); 5:f4(); 6:f5(); 7:unlock(L); 8:if (x==0) 9:ERROR; } Thread2{ 10:x = 1; 11:lock(L); 12:f6(); 13:unlock(L); } 86

Another Example Thread1{ 1:lock(L); 2:f1(); 3:f2(); 4:f3(); 5:f4(); 6:f5(); 7:unlock(L); 8:if (x==0) 9:ERROR; } Thread2{ 10:x = 1; 11:lock(L); 12:f6(); 13:unlock(L); } Race Racing Pair: (8,10) 87

Another Example Thread1{ 1:lock(L); 2:f1(); 3:f2(); 4:f3(); 5:f4(); 6:f5(); 7:unlock(L); 8:if (x==0) 9:ERROR; } Thread2{ 10:x = 1; 11:lock(L); 12:f6(); 13:unlock(L); } Racing Pair: (8,10) Postponed Set = {Thread2} 88

Another Example Thread1{ 1:lock(L); 2:f1(); 3:f2(); 4:f3(); 5:f4(); 6:f5(); 7:unlock(L); 8:if (x==0) 9:ERROR; } Thread2{ 10:x = 1; 11:lock(L); 12:f6(); 13:unlock(L); } 89

Another Example Thread1{ 1:lock(L); 2:f1(); 3:f2(); 4:f3(); 5:f4(); 6:f5(); 7:unlock(L); 8:if (x==0) 9:ERROR; } Thread2{ 10:x = 1; 11:lock(L); 12:f6(); 13:unlock(L); } 90

Another Example Thread1{ 1:lock(L); 2:f1(); 3:f2(); 4:f3(); 5:f4(); 6:f5(); 7:unlock(L); 8:if (x==0) 9:ERROR; } Thread2{ 10:x = 1; 11:lock(L); 12:f6(); 13:unlock(L); } Hit error with 0.5 probability 91

92 Implementation RaceFuzzer: Part of the CalFuzzer tool suite Instrument using the SOOT compiler framework Instrumentation is used to "hijack" the scheduler –Implement a custom scheduler –Run one thread at a time –Use semaphores to control threads Deadlock detector –Because we cannot instrument native method calls lock(L1); X=1; unlock(L1); lock(L2); Y=2; unlock(L2);

93 Implementation RaceFuzzer: Part of the CalFuzzer tool suite Instrument using the SOOT compiler framework Instrumentation is used to "hijack" the scheduler –Implement a custom scheduler –Run one thread at a time –Use semaphores to control threads Deadlock detector –Because we cannot instrument native method calls ins_lock(L1); lock(L1); ins_write(&X); X=1; unlock(L1); ins_unlock(L1); ins_lock(L2); lock(L2); Y=2; unlock(L2); ins_unlock(L2); Custom Scheduler

Experimental Results 94

RACEFUZZER: Useful Features Distinguishes real races from false alarms Inexpensive replay of a concurrent execution exhibiting a real race Separates some harmful races from benign races No false warnings Very efficient –We instrument at most two memory access statements and all synchronization statements Embarrassingly parallel 95

RACEFUZZER: Limitations Not complete: can miss a real race –Can only detect races that happen on the given test suite under some schedule May not be able to separate all real races from false warnings –Being random in nature May not be able to separate harmful races from benign races –If a harmful race does not cause a program crash Each test run is sequential 96

Summary Claim: testing (a.k.a. verification in industry) is the most practical way to find software bugs –We need to make software testing systematic and rigorous Random testing works amazingly well in practice –Randomizing a scheduler is more effective than randomizing inputs –We need to make random testing smarter and closer to verification Bias random testing – Prioritize random testing Find interesting preemption points in the programs Randomly preempt threads at these interesting points 97

java.lang.StringBuffer /**... used by the compiler to implement the binary string concatenation operator... String buffers are safe for use by multiple threads. The methods are synchronized so that all the operations on any particular instance behave as if they occur in some serial order that is consistent with the order of the method calls made by each of the individual threads involved. */ /*# atomic */ public class StringBuffer {... }

public class StringBuffer { private int count; public synchronized int length() { return count; } public synchronized void getChars(...) {... } /*# atomic */ public synchronized void append(StringBuffer sb){ int len = sb.length();... sb.getChars(...,len,...);... } java.lang.StringBuffer sb.length() acquires lock on sb, gets length, and releases lock use of stale len may yield StringIndexOutOfBoundsException inside getChars(...) other threads can change sb

public class StringBuffer { private int count; public synchronized int length() { return count; } public synchronized void getChars(...) {... } /*# atomic */ public synchronized void append(StringBuffer sb){ int len = sb.length();... sb.getChars(...,len,...);... } java.lang.StringBuffer StringBuffer.append is not atomic: Start: at StringBuffer.append(StringBuffer.java:701) at Thread1.run(Example.java:17) Commit: Lock Release at StringBuffer.length(StringBuffer.java:221) at StringBuffer.append(StringBuffer.java:702) at Thread1.run(Example.java:17) Error: Lock Acquire at StringBuffer.getChars(StringBuffer.java:245) at StringBuffer.append(StringBuffer.java:710) at Thread1.run(Example.java:17)

101 Related work Stoller et al. and Edelstein et al. [ConTest] –Inserts yield() and sleep() randomly in Java code Parallel randomized depth-first search by Dwyer et al. –Modifies search strategy in Java Pathfinder by Visser et al. Iterative context bounding (Musuvathi and Qadeer) –Systematic testing with bounded context switches Satish Narayanasamy, Zhenghao Wang, Jordan Tigani, Andrew Edwards and Brad Calder “Automatically Classifying Benign and Harmful Data Races Using Replay Analysis”, PLDI 07