Dataflow Analysis for Concurrent Programs using Datarace Detection Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner LBA Reading Group Michelle Goodstein.

Presentation transcript:

Dataflow Analysis for Concurrent Programs using Datarace Detection Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner LBA Reading Group Michelle Goodstein 6/5/08

Outline Motivation Overview of Radar Radar Formalization Radar Optimizations Radar(Relay) Evaluation & Results Conclusions

Motivation Want to apply dataflow analysis to concurrent programs without: Requiring annotations Escape analysis (loss of precision) Custom concurrency analysis Model checking (combinatorial explosion)

Introducing Radar Scheme for concurrent dataflow analysis Starts with sequential dataflow analysis Race detection creates concurrent analysis Can use already-created race detectors We’ll see it applied to Relay

Outline Motivation Overview of Radar Radar Formalization Radar Optimizations Radar(Relay) Evaluation & Results Conclusions

Assumptions For each procedure, either Have access to code Have access to a sound summary Shared memory is sequentially consistent

Radar’s Key Insights Adjustability of sequential analysis: Concurrent dataflow facts are a subset of sequential dataflow facts “Missing facts”: Facts that can be killed by other threads Suppose we have a fact about lvalue l: “At line y, l is not null” It is enough to know whether another thread can write to l concurrently: “At line z, another thread can write to l”

Radar’s Key Insights Pseudo-Races: Identify “missing facts”, remove them from the sequential analysis Solution: insert a pseudo-read for location l Ask a race detector: “is there a race at this point for l?” Yes → Another thread can write. Remove fact. No → No other thread can write. Retain fact. Producer/Consumer examples follow Non-null dataflow analysis Sequential analysis on left Facts “killed” by concurrency crossed out in red

First example: non-null facts Producer–Consumer Pseudo-read for px->data at line PA Consumer thread can execute line C5 → Race! px->data is crossed out at line PA

Second example: non-null facts Modified producer/consumer Still race-free, other than perf_ctr Now, producer acquires/releases lock twice

Second example: non-null facts Insert pseudo-read at P5 on px->data Races with C5 write to cx->data Kills px->data at P5 and where it propagates At P8, not necessarily true that px->data is non-null Null pointer dereference! Note: no data races (except on perf_ctr) We can detect this!

Outline Motivation Overview of Radar Radar Formalization Radar Optimizations Radar(Relay) Evaluation & Results Conclusions

Sequential Dataflow Analysis Representation: nodes in CFG Flow function F(n,d,p): facts true after point p n: node, d: incoming dataflow fact, p: program point lvals(f): lvalues fact f depends on ThreadKill(p,l): computes whether race can occur on l at program point p $F_{adj}(n,d,p) = \{\, f \mid f \in F(n,d,p),\ \forall l \in lvals(f):\ \neg ThreadKill(p,l) \,\}$
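The adjusted flow function can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the fact encoding `("nonnull", lvalue)`, the toy `seq_flow`, and the hard-coded `toy_thread_kill` are all assumptions made for the example; only the shape of F_adj follows the slide.

```python
from typing import Callable, Set, Tuple

# Assumed encoding: a fact is ("nonnull", lvalue).
Fact = Tuple[str, str]


def lvals(fact: Fact) -> Set[str]:
    """Lvalues the fact depends on (here, only the one it names)."""
    return {fact[1]}


def adjust(flow: Callable, thread_kill: Callable) -> Callable:
    """Wrap a sequential flow function F into F_adj.

    A fact survives only if no lvalue it depends on can be written
    by another thread at program point p.
    """
    def f_adj(node, d, p):
        return {f for f in flow(node, d, p)
                if not any(thread_kill(p, l) for l in lvals(f))}
    return f_adj


def seq_flow(node, d, p):
    """Toy sequential non-null flow function (hypothetical)."""
    kind, lv = node
    if kind == "assign_nonnull":
        return set(d) | {("nonnull", lv)}
    return set(d)


def toy_thread_kill(p, l):
    """Stand-in for a race detector: a race on px->data at point PA."""
    return p == "PA" and l == "px->data"


f_adj = adjust(seq_flow, toy_thread_kill)
result = f_adj(("assign_nonnull", "px->data"), {("nonnull", "px")}, "PA")
print(result)
```

At point PA the fact about px->data is dropped while the fact about px is kept; at any other point in this toy setup both facts survive.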

Is Radar Sound? Suppose there is an oracle function O Given a program point p and a location l Returns whether a race is possible Suppose Radar is given a race detector R Radar is sound if O(p,l) implies R(p,l) If there is a race, Radar will detect it Can also return false positives
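The soundness condition, O(p,l) implies R(p,l), can be checked by brute force on a small finite domain. The sketch below is only illustrative: a real oracle is not computable in general, and the points, lvalues, and detectors are made up for the example.

```python
from itertools import product


def is_sound(oracle, detector, points, lvalues):
    """True if every race the oracle reports is also reported by the detector."""
    return all(detector(p, l)
               for p, l in product(points, lvalues)
               if oracle(p, l))


points = ["P1", "P2"]
lvalues = ["x", "y"]


def oracle(p, l):
    # Hypothetical ground truth: the only real race is on x at P1.
    return (p, l) == ("P1", "x")


def conservative(p, l):
    # Reports x as racy everywhere: sound, with a false positive at P2.
    return l == "x"


def optimistic(p, l):
    # Never reports a race: misses the real one, so unsound.
    return False


sound_conservative = is_sound(oracle, conservative, points, lvalues)
sound_optimistic = is_sound(oracle, optimistic, points, lvalues)
print(sound_conservative, sound_optimistic)
```

The conservative detector passes despite its false positive, which matches the slide: soundness permits false positives but not missed races.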

Outline Motivation Overview of Radar Radar Formalization Radar Optimizations Radar(Relay) Evaluation & Results Conclusions

Radar Optimizations Reduce the number of calls to ThreadKill Handle function calls

Reduce ThreadKill calls Race detector for cross product of program points and lvalues is expensive Many program points have similar behavior For each lvalue in a region: Racy for entire region Not racy for entire region Compute once for entire region Region Map: points → “regions”
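One way to picture the region optimization is a cache keyed by (region, lvalue) instead of (point, lvalue). The class below is a hypothetical sketch: `region_map` and `race_in_region` are assumed callables standing in for the region map and the underlying race detector.

```python
class RegionCachedThreadKill:
    """Answers ThreadKill(p, l) with one detector query per (region, lvalue)."""

    def __init__(self, region_map, race_in_region):
        self.region_map = region_map          # program point -> region
        self.race_in_region = race_in_region  # (region, lvalue) -> bool
        self.cache = {}
        self.detector_calls = 0

    def __call__(self, p, l):
        key = (self.region_map(p), l)
        if key not in self.cache:
            self.detector_calls += 1
            self.cache[key] = self.race_in_region(*key)
        return self.cache[key]


# Toy setup: four program points fall into two regions.
regions = {"p1": "R1", "p2": "R1", "p3": "R2", "p4": "R2"}
thread_kill = RegionCachedThreadKill(
    region_map=regions.get,
    race_in_region=lambda region, l: region == "R2" and l == "x",
)

answers = [thread_kill(p, "x") for p in ["p1", "p2", "p3", "p4"]]
print(answers, thread_kill.detector_calls)
```

Four queries cost only two detector calls here, because all points in a region share one answer per lvalue.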

Incorporating Function Calls To handle function calls: Introduce a new kind of region: Interprocedural Summary Region (SumReg) At a particular call site, approximately summarizes the possible regions that execution can pass through To maintain soundness Suppose there is a transitively reachable path from a callsite cs to a racy region Summary region must report that cs is racy

Radar’s Requirements Race Detection Engine: Region × Lvalue → raciness Region Map: Points → Race-equivalent Regions Summary Region Map: Callsites → Summary Regions

Outline Motivation Overview of Radar Radar Formalization Radar Optimizations Radar(Relay) Evaluation & Results Conclusions

Relay Static race detection tool Lockset-based Works bottom up Scales to the Linux kernel

Relay Uses relative lockset analysis: (L+, L-) L+: locks definitely acquired since function entry point L-: locks possibly released since function entry point Relative lockset for exit point of function is stored as summary of function’s behavior Approximates effect of function call on locks currently held
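A relative lockset can be modelled as a pair of sets updated at each lock operation. The update rules below are an assumption chosen to match the definitions of L+ and L- on the slide (acquire adds to L+ and removes from L-; release does the reverse); they are not taken from Relay's source, and the names are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class RelLockset(NamedTuple):
    plus: FrozenSet[str]   # L+: locks definitely acquired since function entry
    minus: FrozenSet[str]  # L-: locks possibly released since function entry


# At function entry nothing has been acquired or released yet.
ENTRY = RelLockset(frozenset(), frozenset())


def acquire(ls: RelLockset, lock: str) -> RelLockset:
    """Assumed rule: the lock is now definitely held."""
    return RelLockset(ls.plus | {lock}, ls.minus - {lock})


def release(ls: RelLockset, lock: str) -> RelLockset:
    """Assumed rule: the lock is no longer held and may have been released."""
    return RelLockset(ls.plus - {lock}, ls.minus | {lock})


# A function body that locks m, works, then unlocks m.
inside = acquire(ENTRY, "m")
at_exit = release(inside, "m")
print(inside, at_exit)
```

Under these rules the exit lockset of such a function is (∅, {m}), which is the kind of summary a caller would use to approximate the call's effect on the locks it holds.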

Radar(Relay) Race Detection Engine: Relay Region Map: Maps program point → (g, (L+, L-)) g: function name (L+, L-): relative lockset summary for function g Summary Region Map: Function g being called at the call site cs in function h Computes AllUnlocks(cs) = ∪ L- in g Suppose Region is (h, (L+, L-)) Returns (h, (L+ − AllUnlocks(cs), L- ∪ AllUnlocks(cs)))
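The summary region computation can be sketched directly from the formula on this slide. The data layout is an assumption: the callee g is represented by the list of relative locksets at its program points, and regions are plain tuples.

```python
def all_unlocks(callee_locksets):
    """Union of L- over all relative locksets recorded in the callee."""
    result = frozenset()
    for _plus, minus in callee_locksets:
        result |= minus
    return result


def summary_region(caller_region, callee_locksets):
    """Return (h, (L+ - AllUnlocks(cs), L- | AllUnlocks(cs)))."""
    h, (plus, minus) = caller_region
    unlocked = all_unlocks(callee_locksets)
    return (h, (frozenset(plus) - unlocked, frozenset(minus) | unlocked))


# Hypothetical call site in h: the caller holds a and b; the callee g may
# release a at some point and acquires c at another.
caller = ("h", ({"a", "b"}, set()))
callee_g = [(set(), {"a"}), ({"c"}, set())]

sum_reg = summary_region(caller, callee_g)
print(sum_reg)
```

Any lock the callee might release anywhere is treated as not held for the whole call, which errs toward reporting races and so keeps the summary conservative.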

Pseudoreads Suppose at some program point p fact f holds RegionMap(p): region (g, (L+, L-)) For all lvalues l ∈ lvals(f): Pretend to read l at p with relative lockset (L+, L-) For any other lvalue m which might be aliased… Intersection of positive locksets is empty → report race
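The lockset check for a pseudo-read can be sketched as follows. This is a simplified model with assumed names: an access is a tuple (lvalue, positive lockset, is_write), `may_alias` stands in for the alias analysis, and the lock name `buf_lock` is invented for the example.

```python
def races(access_a, access_b, may_alias):
    """Report a race if the accesses may alias, at least one is a write,
    and their positive locksets share no lock."""
    lv_a, locks_a, write_a = access_a
    lv_b, locks_b, write_b = access_b
    return ((write_a or write_b)
            and may_alias(lv_a, lv_b)
            and not (set(locks_a) & set(locks_b)))


def may_alias(a, b):
    # Toy alias oracle: px and cx may point to the same buffer.
    return {a, b} <= {"px->data", "cx->data"}


# Pseudo-read of px->data by the producer, holding no lock.
pseudo_read = ("px->data", set(), False)
# Consumer write to cx->data while holding a hypothetical buf_lock.
consumer_write = ("cx->data", {"buf_lock"}, True)
# The same pseudo-read performed while holding buf_lock.
guarded_read = ("px->data", {"buf_lock"}, False)

unguarded_races = races(pseudo_read, consumer_write, may_alias)
guarded_races = races(guarded_read, consumer_write, may_alias)
print(unguarded_races, guarded_races)
```

The unguarded pseudo-read races with the consumer's write, so facts about px->data would be killed at that point; the guarded one shares a lock with the write and the fact is retained.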

Relay with Radar: Implementation First Pass: Run Relay Computes relative lockset associated with each function Second Pass: Sequential Analysis Pretend no races exist Collect all the possible queries about races Third Pass: Run Relay, Adding Pseudo-reads Insert a pseudo-access wherever a race query exists Fourth Pass: Adjusted Sequential Analysis At each pseudo-access for l, query race detector If race could occur, kill facts depending on l

Outline Motivation Overview of Radar Radar(Relay) Radar Formalization Radar Optimizations Evaluation & Results Conclusions

Evaluation Focus on non-null dataflow analysis Used 4 black boxes to answer race queries Steensgaard’s pointer analysis If a value is reachable from a global → true Radar alias Region map always returns empty lockset Answers the question of whether any two values alias Radar Optimistic Always return false Unsound, and overly precise

Results

Terminology Blob nodes: Many lvalues on the heap are merged into one node by alias analysis Can lead to false positives when checking null dereferences Other work shows it is hard to account for heap structures Next figure excludes “blob nodes” for pointer dereferences Non-blob dereferences: Apache: 52% SSL: 76% Linux: 71%

Results

Consider gap between Seq and Steensgaard Check how much is bridged by Radar With and without locks

Outline Motivation Overview of Radar Radar(Relay) Radar Formalization Radar Optimizations Evaluation & Results Conclusions

Radar is Scalable Not tied to particular concurrency models Tunable to desired precision Radar(Relay) Good precision relative to sequential, Steensgaard Future Work More types of analysis Race detection for other concurrency constructs