ESP [Das et al PLDI 2002]

Presentation transcript:

ESP [Das et al PLDI 2002]
Interface usage rules in documentation
–Order of operations, data access
–Resource management
–Incomplete, wordy, not checked
Violated rules ⇒ crashes
–Failed runtime checks
–Unreliable software

ESP [Das et al PLDI 2002]
[Figure: ESP takes a C program and a set of rules as input and reports Safe or Not Safe]

ESP [Das et al PLDI 2002]
ESP is a program analysis that keeps track of object state at each program point
–e.g.: is file handle open or closed?
Challenge: scale to large programs
–One of the scalability issues: merge nodes
–Always analyze both sides of merge node ⇒ exponential (or non-terminating) program analyses
ESP has a heuristic for handling merges that
–avoids exponential blow-up and runs fast in practice
–maintains enough precision to verify programs

Prop Sim example: stdio usage in gcc

    void main () {
      if (dump) fil = fopen(dumpFile, "w");
      if (p) x = 0; else x = 1;
      if (dump) fclose(fil);
    }

Prop Sim example: stdio usage in gcc (file operations abstracted to Open and Close events)

    void main () {
      if (dump) Open;
      if (p) x = 0; else x = 1;
      if (dump) Close;
    }

Prop Sim example: stdio usage in gcc
[Figure: property state machine with states $uninit, Opened and $error; transitions labeled Open, Print, Open, Close, Print/Close and *; shown next to the abstracted main() with Open and Close events]
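The property state machine on this slide can be sketched as a small transition table. This is only an illustration: the assignment of edge labels to state pairs is an assumed reading of the flattened figure (Open moves $uninit to Opened, Print stays in Opened, Close returns to $uninit, everything else leads to $error), and the names `TRANSITIONS`, `step` and `run` are hypothetical.

```python
# States of the stdio property (names taken from the slide).
UNINIT, OPENED, ERROR = "$uninit", "Opened", "$error"

# Assumed reading of the figure: which event moves which state where.
TRANSITIONS = {
    (UNINIT, "Open"): OPENED,
    (OPENED, "Print"): OPENED,
    (OPENED, "Close"): UNINIT,
    (UNINIT, "Print"): ERROR,
    (UNINIT, "Close"): ERROR,
    (OPENED, "Open"): ERROR,
}

def step(state, event):
    """Advance the property state by one event; $error absorbs every event (*)."""
    if state == ERROR:
        return ERROR
    return TRANSITIONS[(state, event)]

def run(events, state=UNINIT):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state

print(run(["Open", "Print", "Close"]))  # $uninit
print(run(["Print"]))                   # $error
```

A correct usage sequence ends back in $uninit, while printing before opening ends in $error.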

Example: no path-sensitivity
[Figure: control-flow graph of main: entry; branch on dump (T: Open); branch on p (T: x = 0, F: x = 1); branch on dump (T: Close); exit]

Example: no path-sensitivity
[Figure: the control-flow graph of main annotated with sets of property states: {$uninit,Opened}, {$error,$uninit,Opened}, {$uninit}, {$uninit,Opened}]
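The loss of precision in the path-insensitive case can be reproduced with a few lines. The sketch below hand-unrolls the three conditionals of main(), propagates a set of property states through both arms of each branch, and unions the results at every merge while ignoring branch conditions. The transition table is an assumed reading of the slide's state machine, and all function names are hypothetical.

```python
UNINIT, OPENED, ERROR = "$uninit", "Opened", "$error"

# Assumed transitions; any pair not listed is taken to lead to $error.
FSM = {(UNINIT, "Open"): OPENED, (OPENED, "Print"): OPENED, (OPENED, "Close"): UNINIT}

def step(state, event):
    return FSM.get((state, event), ERROR)

def transfer(states, event):
    """Apply one event to every property state in the set."""
    return {step(s, event) for s in states}

def analyze_path_insensitive():
    at_entry = {UNINIT}
    # if (dump) Open;  both arms are explored and unioned at the merge
    after_first_if = transfer(at_entry, "Open") | at_entry
    # if (p) x = 0; else x = 1;  no effect on the file-handle property
    after_second_if = set(after_first_if)
    # if (dump) Close;  the correlation with the first branch is lost
    at_exit = transfer(after_second_if, "Close") | after_second_if
    return after_first_if, at_exit

after_first_if, at_exit = analyze_path_insensitive()
print(sorted(after_first_if))  # ['$uninit', 'Opened']
print(sorted(at_exit))         # ['$error', '$uninit', 'Opened']
```

The $error state at exit is a false error: it comes from the infeasible path that skips Open but executes Close.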

Example: full path-sensitivity
[Figure: control-flow graph of main: entry; branch on dump (T: Open); branch on p (T: x = 0, F: x = 1); branch on dump (T: Close); exit]

Example: full path-sensitivity
[Figure: the control-flow graph of main annotated with symbolic states, one per path: [Opened|dump=T], [$uninit|dump=T], [Opened|dump=T,p=T], [Opened|dump=T,p=T,x=0], [$uninit|dump=T,p=T,x=0], [$uninit]]

Example: ESP technique
[Figure: control-flow graph of main: entry; branch on dump (T: Open); branch on p (T: x = 0, F: x = 1); branch on dump (T: Close); exit]

Example: ESP technique
[Figure: the control-flow graph of main annotated with symbolic states: [Opened|dump=T], [$uninit], [$uninit|dump=T], [$uninit|dump=F], [Opened|dump=T,p=T,x=0], [Opened|dump=T,p=F,x=1], [Opened|dump=T], [$uninit|dump=F], [$uninit|dump=T], [$uninit|dump=F], [$uninit]]
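The merge heuristic illustrated on this slide can be sketched as follows: at a merge, symbolic states that share the same property state are joined (keeping only the variable bindings they agree on), while states with different property states stay separate. This is a minimal sketch under stated assumptions, not the ESP implementation: the structured `PROGRAM` encoding, the transition table, and the names `esp_merge`, `keep_all` and `run` are all hypothetical.

```python
UNINIT, OPENED, ERROR = "$uninit", "Opened", "$error"
# Assumed transitions; any pair not listed is taken to lead to $error.
FSM = {(UNINIT, "Open"): OPENED, (OPENED, "Print"): OPENED, (OPENED, "Close"): UNINIT}

def step(state, event):
    return FSM.get((state, event), ERROR)

# Hypothetical structured encoding of the slide's main().
PROGRAM = [
    ("if", "dump", [("event", "Open")], []),
    ("if", "p", [("assign", "x", 0)], [("assign", "x", 1)]),
    ("if", "dump", [("event", "Close")], []),
]

def esp_merge(facts):
    """Join symbolic states with equal property state; keep agreed bindings only."""
    merged = {}
    for state, env in facts:
        merged[state] = merged[state] & env if state in merged else env
    return set(merged.items())

def keep_all(facts):
    """Full path-sensitivity: never join anything at a merge."""
    return set(facts)

def run(stmts, facts, merge):
    """facts is a set of (property state, frozenset of (variable, value)) pairs."""
    for stmt in stmts:
        if stmt[0] == "event":
            facts = {(step(s, stmt[1]), env) for s, env in facts}
        elif stmt[0] == "assign":
            _, var, val = stmt
            facts = {(s, frozenset({b for b in env if b[0] != var} | {(var, val)}))
                     for s, env in facts}
        else:  # ("if", condition variable, then-statements, else-statements)
            _, var, then_stmts, else_stmts = stmt
            then_in, else_in = set(), set()
            for s, env in facts:
                known = dict(env).get(var)
                if known is not False:  # true arm is feasible
                    then_in.add((s, env | {(var, True)}))
                if known is not True:   # false arm is feasible
                    else_in.add((s, env | {(var, False)}))
            facts = merge(run(then_stmts, then_in, merge) | run(else_stmts, else_in, merge))
    return facts

start = {(UNINIT, frozenset())}
print(run(PROGRAM, start, esp_merge))      # {('$uninit', frozenset())}
print(len(run(PROGRAM, start, keep_all)))  # 4
```

With `esp_merge` the two Opened states after the branch on p collapse into [Opened|dump=T], so the correlation with dump survives and no $error state is reached; with `keep_all` the number of states doubles at every uncorrelated branch.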

Case study: stdio usage in gcc
cc1 from gcc version (Spec95)
Does cc1 always print to opened files?
cc1 is a complex program:
–140K non-blank, non-comment lines of C
–2149 functions, 66 files, 1086 globals
–Call graph includes one 450-function SCC

Experimental results
Precision
–Verification succeeds for every file handle
–No transitions to $error; no false errors
Scalability
–Average per handle: 72.9 seconds, 49.7 MB
–Single 1GHz PIII laptop with 512 MB RAM
Proved that:
–Each of the 646 calls to fprintf in the source code prints to a valid, open file

ESP follow-up
ESP has since been run on large real-world applications
ESP/X: local intra-procedural version
PSE: post-mortem analysis
–run ESP backwards to figure out what caused a crash

Recap and conclusion

Course overview
Dataflow analysis and variations
–iterative dataflow analysis
–program representations
–interprocedural
–flow-insensitive
–path-sensitive
Cross-cutting issues
–Correctness
–Ordering transformations and analyses
Applications
–Pointer analysis
–Optimizing OO languages
–Program reliability
–Rhodium


Flow-sensitive intraproc dataflow analysis
Iterative dataflow analysis
–flow functions, lattice-theoretic formulation
Termination
–monotonic flow functions + finite height lattice
Meet over all paths vs. meet over all feasible paths vs. dataflow analysis
–For distributive problems, MOP = dataflow analysis
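As a reminder of what the iterative scheme looks like in practice, here is a minimal worklist sketch for live variables, a backward analysis over the powerset-of-variables lattice. The CFG encoding (`cfg`, `use`, `defs`) and the four-node example are assumptions made for illustration. Termination follows the argument on the slide: the flow functions are monotonic and the lattice has finite height.

```python
def live_variables(cfg, use, defs):
    """cfg maps each node to its successors; returns (live_in, live_out)."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    live_in = {n: set() for n in cfg}
    live_out = {n: set() for n in cfg}
    worklist = list(cfg)
    while worklist:
        n = worklist.pop()
        # Merge: a variable is live out of n if it is live into any successor.
        live_out[n] = set().union(*(live_in[s] for s in cfg[n]))
        # Flow function for a node: in = use ∪ (out − def).
        new_in = use[n] | (live_out[n] - defs[n])
        if new_in != live_in[n]:
            live_in[n] = new_in
            worklist.extend(preds[n])  # predecessors may need recomputation
    return live_in, live_out

# Hypothetical example with a loop between nodes 2 and 3:
#   1: x := 5    2: y := x + 1    3: x := y * 2 (branches to 2 or 4)    4: return x
cfg = {1: [2], 2: [3], 3: [2, 4], 4: []}
use = {1: set(), 2: {"x"}, 3: {"y"}, 4: {"x"}}
defs = {1: {"x"}, 2: {"y"}, 3: {"x"}, 4: set()}
live_in, live_out = live_variables(cfg, use, defs)
print(live_in)  # {1: set(), 2: {'x'}, 3: {'y'}, 4: {'x'}}
```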


Program representations
Simple
–AST
–CFG
More advanced
–Dataflow Graph
–Control Dependence Graph
–Program Dependence Graph
–SSA


Interprocedural analysis
Context insensitive
–caller summaries and callee summaries
Context-sensitive
–call-strings as context (k-CFA, “call-strings”)
–dataflow info as context
   bottom-up, complete summaries
   top-down, partial summaries (partial transfer functions)


Flow-insensitive analysis
Keep only one piece of information for the entire program/procedure
Loses precision, but reduces space consumption


Path-sensitive analysis
Enhance dataflow to try to keep paths separate
Two kinds of path-sensitive analysis:
–aim towards MOP
–aim towards removing infeasible paths (branch correlations)



Pointer analysis
Started with simple naïve intraproc analysis with allocation site summaries
To scale to large programs:
–make naïve pointer analysis flow insensitive (Andersen)
–make each node have only one outgoing edge, which makes it near linear time (Steensgaard)
–add one level of flow to regain some precision (One-level flow)
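To make the flow-insensitive step concrete, here is a small inclusion-based (Andersen-style) sketch that treats the program as an unordered bag of pointer statements and iterates to a fixed point. It uses a naive loop rather than the constraint-graph algorithms used in practice, and the statement encoding (`addr`, `copy`, `load`, `store`) and the function name `andersen` are assumptions made for illustration.

```python
from collections import defaultdict

def andersen(stmts):
    """Flow-insensitive may-points-to sets, computed by naive fixed-point iteration.

    Statement forms (hypothetical encoding):
      ("addr",  p, x)  for  p = &x
      ("copy",  p, q)  for  p = q
      ("load",  p, q)  for  p = *q
      ("store", p, q)  for  *p = q
    """
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":
                targets, new = [lhs], {rhs}
            elif kind == "copy":
                targets, new = [lhs], pts[rhs]
            elif kind == "load":
                targets = [lhs]
                new = set().union(*(pts[t] for t in pts[rhs]))
            else:  # store: update everything lhs may point to
                targets, new = list(pts[lhs]), pts[rhs]
            for t in targets:
                if not new <= pts[t]:
                    pts[t] |= new
                    changed = True
    return dict(pts)

# p = &x; q = &y; p = q; r = &p; s = *r
result = andersen([
    ("addr", "p", "x"),
    ("addr", "q", "y"),
    ("copy", "p", "q"),
    ("addr", "r", "p"),
    ("load", "s", "r"),
])
print(sorted(result["p"]))  # ['x', 'y']
print(sorted(result["s"]))  # ['x', 'y']
```

Because statement order is ignored, p is reported as possibly pointing to x even after `p = q`; a flow-sensitive analysis could report only y at that point.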



Program analysis and program reliability
Property simulation
–path sensitive analysis in polynomial time
–uses clever heuristic for merges
–algorithm behind ESP
Predicate abstraction and iterative refinement
–given set of predicates, compute predicates that hold at each program point
–iteratively refine set of predicates
–core of BLAST and SLAM


Looking forward (discussion)
What are the current hot topics in compilers and program analysis?
Compilers and program analysis 20 years from now?

Looking forward: Concurrency
Hardware trends are making exploiting concurrency more and more important
Language features and compiler technology to express and exploit concurrency
Current examples:
–race detection
–primitives for concurrency and efficient implementations (e.g.: atomic primitive)

Looking forward: Scalability
Scale to large programs while retaining precision
Current examples:
–Use scalable constraint solvers such as SAT (SATURN)
–Use compact representations such as BDDs

Looking forward: Verification
Tradeoffs between:
–automation
–scalability
–precision
–domain-specificity
Current examples:
–ESP, BLAST, SLAM, Rhodium

Looking forward: Extensibility
Removing barrier to entry to the compiler
New models of using compilers for
–domain-specific checkers
–domain-specific optimizations
Current examples:
–Rhodium, Collider