Static Analysis: Data-Flow Analysis I

Presentation transcript:

Static Analysis: Data-Flow Analysis I ( AMP’11: Advanced Models & Programs, 2011 ) Claus Brabrand IT University of Copenhagen ( brabrand@itu.dk )

Agenda (4/4, ’11)
  - Introduction: Data-flow Analysis: Undecidability, Reduction, and Approximation
  - Data-flow Analysis: Quick tour & running example
  - Control-Flow Graphs: Control-flow, data-flow, and confluence
  - "Science-Fiction Math": Lattice theory, monotonicity, and fixed-points
  - Putting it all together…: Example revisited

Breaks? #Breaks, |Breaks|, ...? I’ll aim for: more (but shorter) breaks

Notes on Static Analysis
  - "Lecture Notes on Static Analysis" by Michael I. Schwartzbach (BRICS, Uni. Aarhus)
  - Chapters 1, 2, 4, 5, 6 (until p. 19), excluding "pointers"
  - Claims to be "not overly formal", but the math involved can be quite challenging (at times)…

Static Analysis
  - Purpose (of static analysis): gather information (on the running behavior of a program) at its "program points"
  - Usage (of static analysis): basis for subsequent error detection and optimization
  [Figure: program points → Static Analysis → information]

Analyses for Optimization
  Example analyses:
  - "Constant Propagation Analysis": precompute constants (e.g., replace '5*x+z' by '42')
  - "Live Variables Analysis": dead-code elimination (e.g., get rid of unused variable 'z')
  - "Available Expressions Analysis": avoid recomputing already computed expressions (cache results)
  - …

Analyses for Finding Errors
  Example analyses:
  - "Symbol Checking": catch (dynamic) symbol errors
  - "Type Checking": catch (dynamic) type errors
  - "Initialized Variable Analysis": catch uninitialized variables
  - …

Quizzzzz: Optimization?
  If you want a fast C program, should you use LOOP 1 (the "array version") or LOOP 2 (the "optimized pointer version", optimized by the programmer)?
  LOOP 1:
      for (i = 0; i < N; i++) {
          a[i] = a[i] * 2000;
          a[i] = a[i] / 10000;
      }
  …or… LOOP 2:
      b = a;
      for (i = 0; i < N; i++) {
          *b = *b * 2000;
          *b = *b / 10000;
          b++;
      }

Answer:
  Results (of running the programs):
  - Compilers use highly sophisticated static analyses for optimization!
  - Recommendation: focus on writing clear code (for people and for compilers to understand)
  You'll learn how to make similar analyses [next two weeks]

Conceptual Motivation Undecidability Reduction principle Approximation

Rice's Theorem (1953)
  "Any interesting problem about the runtime behavior of a program* is undecidable" -- Rice's Theorem [paraphrased] (1953)
  *) written in a Turing-complete language
  Examples:
  - does program 'P' always halt?
  - is the value of integer variable 'x' always positive?
  - does variable 'x' always have the same value?
  - which variables can pointer 'p' point to?
  - does expression 'E' always evaluate to true?
  - what are the possible outputs of program 'P'?
  - …

Undecidability (self-referentiality)
  Consider "The Book-of-all-Books": this book contains the titles of all books that do not have a self-reference (i.e., don't contain their own title inside).
  Candidates: "The Bible", "War and Peace", "Programming Languages, An Interp.-Based Approach", ...
  Finitely many books; i.e., we can sit down & figure out whether to include each one or not...
  Q: What about "The Book-of-all-Books"; should it be included or not?
  "Self-referential paradox" (many guises): e.g., "This sentence is false"

Termination Undecidable!
  Assume termination is decidable (in Java*); i.e., ∃ some program, halts: Program → bool
      bool halts(Program p) { ... }
  Now consider:
      -- P0.java --
      bool halts(Program p) { ... }
      Program p0 = read_program("P0.java");
      if (halts(p0)) loop(); else halt();
  Q: Does P0 loop or terminate...? :)
  If halts(p0) answers true, then P0 loops; if it answers false, then P0 halts. Either way the answer is wrong.
  Hence: halts cannot exist! i.e., "Termination is undecidable"
  *) for Turing-complete languages
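
The contradiction on this slide can be checked case by case. The following is a minimal Java sketch, not part of the original slides: `claimedAnswer` stands for whatever a hypothetical `halts(p0)` would return, and the class and method names are illustrative assumptions.

```java
// Sketch of the self-reference argument. No real decider is involved:
// `claimedAnswer` models the answer a hypothetical halts(p0) would give.
final class DiagonalArgument {

    // P0 is: if (halts(p0)) loop(); else halt();
    // So P0 really halts exactly when the decider claims that it does not.
    static boolean p0ActuallyHalts(boolean claimedAnswer) {
        return !claimedAnswer;
    }

    // True if the decider's claim about P0 agrees with P0's real behavior.
    static boolean claimIsCorrect(boolean claimedAnswer) {
        return claimedAnswer == p0ActuallyHalts(claimedAnswer);
    }

    public static void main(String[] args) {
        for (boolean claim : new boolean[] {true, false}) {
            System.out.println("halts(p0) = " + claim
                + ", P0 really halts = " + p0ActuallyHalts(claim)
                + ", claim correct = " + claimIsCorrect(claim));
        }
    }
}
```

Both possible answers turn out to be wrong about P0, which is why no total, always-correct `halts` can exist.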

Rice's Theorem (1953)
  "Any interesting problem about the runtime behavior of a program* is undecidable" -- Rice's Theorem [paraphrased] (1953)
  *) written in a Turing-complete language
  Examples (each shown undecidable by reduction):
  - does program 'P' always halt?
  - is the value of integer variable 'x' always positive?
  - does variable 'x' always have the same value?
  - which variables can pointer 'p' point to?
  - does expression 'E' always evaluate to true?
  - what are the possible outputs of program 'P'?
  - …

solve always-pos ⇒ solve halts
  Assume 'x-is-always-pos(P)' is decidable.
  Given P, here's how we could solve 'halts(P)':
  1. Construct the (clever) reduction program R:
         -- R.java --
         int x = 1;
         P   /* insert program P here :-) */
         x = -1;
  2. Run the "supposedly decidable" analysis: res = x-is-always-positive(R)
  3. Deduce from the result: if (res) then P loops! else P halts :-)
  THUS: 'x-is-always-pos(P)' must be undecidable!
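
The construction of R is purely syntactic, which a small Java sketch can make concrete. This is an illustration under assumptions: programs are treated as source text, P is assumed not to mention the variable `x`, and the names `Reduction`, `buildR`, and `pHalts` are hypothetical.

```java
// Sketch of the reduction from 'halts' to 'x-is-always-positive'.
final class Reduction {

    // Wraps the source text of P into the reduction program R.
    // Assumption: P itself never reads or writes the variable x.
    static String buildR(String pSource) {
        return "int x = 1;\n" + pSource + "\nx = -1;\n";
    }

    // Interprets the answer of the (hypothetical) analysis run on R.
    // x becomes -1 only if execution gets past P, i.e. only if P halts.
    static boolean pHalts(boolean xIsAlwaysPositiveInR) {
        return !xIsAlwaysPositiveInR;
    }

    public static void main(String[] args) {
        System.out.print(buildR("while (true) { }"));
    }
}
```

If the analysis on R existed, `pHalts` would decide termination of P, which is the contradiction the slide relies on.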

Reduction Principle
  Reduction principle (in short):
      φ(P) undecidable  ∧  [solve ψ(P) ⇒ solve φ(P)]
      ⟹  ψ(P) undecidable
  Example:
      'halts(P)' undecidable  ∧  [solve 'x-is-always-pos(P)' ⇒ solve 'halts(P)']
      ⟹  'x-is-always-pos(P)' undecidable
  Exercise: Carry out reduction + whole explanation for: "which are the possible outputs of program P?"

Answer
  Assume 'which-var-q-points-to(P)' is decidable.
  Given P, here's how to (cleverly) decide halts(P):
  1. Construct the (clever) reduction program R:
         -- R.java --
         ptr q = 0xffff;
         P   /* insert program P */
         q = null;
  2. Run 'which-var-q-points-to(R)' = res
  3. If (null ∈ res) P halts! else P loops! :-)
  THUS: 'which-var-q-points-to(P)' must be undecidable!

Undecidability
  Undecidability means that no one can decide this line (for all programs)!
  [Figure: the set of all programs, divided by a line into "loops" and "halts"]
  However(!)…

“Side-Stepping Undecidability”
  However, just because it's undecidable doesn't mean there aren't (good) approximations! Indeed, the whole area of static analysis works on "side-stepping undecidability".
  Compilers use safe approximations (computed via "static analyses") such that:
  [Figure: two diagrams of programs split into "error" and "ok"; one approximation answers "Okay!" or "Dunno?", the other answers "Dunno?" or "Error!"]

“Side-Stepping Undecidability”
  However, just because it's undecidable doesn't mean there aren't (good) approximations! Indeed, the whole area of static analysis works on "side-stepping undecidability".
  Unsafe approximation: for testing it may be okay to "abandon" safety and use unsafe approximations.
  [Figure: programs split into "loops" / "halts" and "error" / "ok", with an "unsafe approximation" marked]
  Here are some programs for you to (manually) consider!

“Slack”
  Undecidability means: "there'll always be a slack"
  [Figure: programs split into "loops" and "halts", with analysis answers "Okay!" and "Dunno?"]
  However, still useful (possible interpretations of "Dunno?"):
  - Treat as error (i.e., reject program): "Sorry, program not accepted!"
  - Treat as warning (i.e., warn programmer): "Here are some potential problems: …"

Soundness & Completeness
  Soundness: Analysis reports no errors ⇒ Really are no errors
  Completeness: Analysis reports an error ⇒ Really is an error
  [Figure: "error" / "ok" regions for a sound analysis and for a complete analysis]
  …or alternative (equivalent) formulation, via "contraposition" (P ⇒ Q ≡ ¬Q ⇒ ¬P):
  Soundness: Really are error(s) ⇒ Analysis reports error(s)
  Completeness: Really no error(s) ⇒ Analysis reports no error(s)
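
The two definitions can be phrased as checks over a finite set of observations. The Java sketch below is illustrative only: the `Outcome` type and its fields are assumptions, and in practice the "really has an error" column is exactly what cannot be computed in general.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: soundness and completeness as properties of (report, reality) pairs.
final class SoundnessCheck {

    // One program, as seen by the analysis and by reality.
    static final class Outcome {
        final boolean analysisReportsError;
        final boolean reallyHasError;

        Outcome(boolean analysisReportsError, boolean reallyHasError) {
            this.analysisReportsError = analysisReportsError;
            this.reallyHasError = reallyHasError;
        }
    }

    // Sound: analysis reports no errors => really are no errors.
    static boolean isSound(List<Outcome> outcomes) {
        for (Outcome o : outcomes) {
            if (!o.analysisReportsError && o.reallyHasError) return false;
        }
        return true;
    }

    // Complete: analysis reports an error => really is an error.
    static boolean isComplete(List<Outcome> outcomes) {
        for (Outcome o : outcomes) {
            if (o.analysisReportsError && !o.reallyHasError) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // A false alarm: still sound, but not complete.
        List<Outcome> falseAlarm = Arrays.asList(new Outcome(true, false));
        System.out.println(isSound(falseAlarm) + " " + isComplete(falseAlarm));
    }
}
```

A false alarm breaks completeness but not soundness; a missed error does the opposite.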

Example: Type Checking
  Will this program have a type error (when run)?
      void f() {
          var b;
          if (<EXP>) { b = 42; } else { b = true; }
          /* some code */
          if (b) ...;   // error if b is '42'
      }
  Undecidable (because of reduction): Type error ⇔ <EXP> evaluates to true
  i.e., what <EXP> evaluates to (when run) cannot be decided in all cases

Example: Type Checking
  Hence, languages use static requirements:
  - All variables must be declared
  - And have only one type (throughout the program)
  This is (very) easy to check (i.e., "type-checking"):
      void f() {
          bool b;   // instead of "var b;"
          /* some code */
          if (<EXP>) { b = 42; } else { b = true; }
          /* some more code */
      }
  Static compiler error (at 'b = 42'): regardless of what <EXP> evaluates to when run


5’ Crash Course on Data-Flow Analysis ( AMP’11: Advanced Models & Programs, 2011 ) Claus Brabrand IT University of Copenhagen ( brabrand@itu.dk )

Data-Flow Analysis
  IDEA: "Simulate runtime execution at compile-time using abstract values"
  We (only) need 3 things:
  - A control-flow graph
  - A lattice
  - Transfer functions
  Example: "(integer) constant propagation"

Control-flow graph
  We (only) need 3 things: a control-flow graph, a lattice, transfer functions
  Given program:
      int x = 1;
      int y = 3;
      if (...) { x = x+2; } else { x <-> y; }
      print(x,y);
  Control-flow graph:
      int x = 1;  →  int y = 3;  →  ...
      ...  (true)   →  x = x+2;  →  print(x,y);
      ...  (false)  →  x <-> y;  →  print(x,y);

A Lattice
  We (only) need 3 things: a control-flow graph, a lattice, transfer functions
  Lattice L of abstract values of interest and their relationships (i.e., ordering "⊑"):
      ⊤                          "top" ~ "could be anything"
      ·· -3 -2 -1 0 1 2 3 ··
      ⊥                          "bottom" ~ "we haven't analyzed yet"
  Induces a least-upper-bound operator "⊔" for combining information
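
A flat lattice like the one on this slide can be sketched in a few lines of Java. This is an illustrative encoding, not taken from the slides: the class name `ConstLattice` and the methods `of`, `leq`, and `join` are assumptions.

```java
import java.util.Objects;

// Sketch of the flat constant lattice: BOTTOM below every integer, TOP above.
final class ConstLattice {
    enum Kind { BOTTOM, CONST, TOP }

    static final ConstLattice BOTTOM = new ConstLattice(Kind.BOTTOM, 0);
    static final ConstLattice TOP = new ConstLattice(Kind.TOP, 0);

    final Kind kind;
    final int value; // only meaningful when kind == CONST

    private ConstLattice(Kind kind, int value) {
        this.kind = kind;
        this.value = value;
    }

    static ConstLattice of(int n) {
        return new ConstLattice(Kind.CONST, n);
    }

    // The ordering: bottom is below everything, everything is below top,
    // and two constants are related only if they are equal.
    boolean leq(ConstLattice other) {
        return kind == Kind.BOTTOM || other.kind == Kind.TOP || equals(other);
    }

    // Least upper bound: combines information arriving from two paths.
    ConstLattice join(ConstLattice other) {
        if (kind == Kind.BOTTOM) return other;
        if (other.kind == Kind.BOTTOM) return this;
        return equals(other) ? this : TOP;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ConstLattice)) return false;
        ConstLattice that = (ConstLattice) o;
        return kind == that.kind && (kind != Kind.CONST || value == that.value);
    }

    @Override public int hashCode() {
        return Objects.hash(kind, kind == Kind.CONST ? value : 0);
    }

    @Override public String toString() {
        return kind == Kind.CONST ? Integer.toString(value) : kind.name();
    }
}
```

Joining two different constants loses precision and yields top, which matches the "could be anything" reading.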

Data-Flow Analysis
  We (only) need 3 things: a control-flow graph, a lattice, transfer functions
  Environments [x, y] ∈ ENV_L assign a lattice value to each variable.
  Transfer functions:
      int x = 1;    λE. E[x ↦ 1]
      int y = 3;    λE. E[y ↦ 3]
      x = x+2;      λE. E[x ↦ E(x) ⊕ 2]
      x <-> y;      λE. E[x ↦ E(y), y ↦ E(x)]
  Analysis of the example:
      after int x = 1;                  x ↦ 1
      after int y = 3;                  [1, 3]
      after the branch (both edges)     [1, 3]
      after x = x+2;                    [3, 3]
      after x <-> y;                    [3, 1]
      at print(x,y); (confluence)       [3, ⊤]
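
The walk through the example can be replayed in code. The Java sketch below is a hand-unrolled illustration for this one program, not a general solver. It makes assumptions not in the slides: `null` encodes top, the entry environment is taken to be all top, and all names are hypothetical.

```java
// Sketch: constant propagation on the running example, one step per CFG node.
// Abstract values: an Integer is that constant, null encodes top.
final class ConstPropExample {
    static final int X = 0, Y = 1;

    // Least upper bound of two abstract values (bottom is not needed here).
    static Integer join(Integer a, Integer b) {
        return (a != null && a.equals(b)) ? a : null;
    }

    static Integer[] joinEnv(Integer[] e1, Integer[] e2) {
        return new Integer[] { join(e1[X], e2[X]), join(e1[Y], e2[Y]) };
    }

    // Transfer function for "v = c": E[v -> c]
    static Integer[] assignConst(Integer[] env, int index, int c) {
        Integer[] out = env.clone();
        out[index] = c;
        return out;
    }

    // Transfer function for "x = x+2": E[x -> E(x) + 2], top stays top.
    static Integer[] addTwoToX(Integer[] env) {
        Integer[] out = env.clone();
        out[X] = (env[X] == null) ? null : Integer.valueOf(env[X] + 2);
        return out;
    }

    // Transfer function for "x <-> y": E[x -> E(y), y -> E(x)]
    static Integer[] swap(Integer[] env) {
        return new Integer[] { env[Y], env[X] };
    }

    // Returns the environment at print(x,y), the confluence point.
    static Integer[] analyze() {
        Integer[] env = { null, null };          // assumed entry state
        env = assignConst(env, X, 1);            // int x = 1;
        env = assignConst(env, Y, 3);            // int y = 3;
        Integer[] thenEnv = addTwoToX(env);      // true edge:  x = x+2;
        Integer[] elseEnv = swap(env);           // false edge: x <-> y;
        return joinEnv(thenEnv, elseEnv);        // combine both paths
    }
}
```

At the confluence point x is the constant 3 on both paths, while y is 3 on one path and 1 on the other, so y is mapped to top.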


Control Structures
  Control structures: statements (or expressions) that affect the "flow of control"
  if-else:
    [syntax] if ( Exp ) Stm1 else Stm2
    [semantics] The expression must be of type boolean; if it evaluates to true, Statement-1 is executed, otherwise Statement-2 is executed.
    [CFG] Exp branches (true) to Stm1 and (false) to Stm2; both paths meet at a confluence point.
  if:
    [syntax] if ( Exp ) Stm
    [semantics] The expression must be of type boolean; if it evaluates to true, the given statement is executed, otherwise not.
    [CFG] Exp branches (true) to Stm; the false edge skips Stm; both paths meet at a confluence point.

Control Structures (cont’d)
  while:
    [syntax] while ( Exp ) Stm
    [semantics] The expression must be of type boolean; if it evaluates to false, the given statement is skipped, otherwise it is executed and afterwards the expression is evaluated again. If it is still true, the statement is executed again. This is continued until the expression evaluates to false.
    [CFG] Exp branches (true) to Stm, which flows back to Exp; the confluence point is at Exp; the false edge leaves the loop.
  for:
    [syntax] for (Exp1 ; Exp2 ; Exp3) Stm
    [semantics] Equivalent to: { Exp1; while ( Exp2 ) { Stm Exp3; } }
    [CFG] Exp1 flows to Exp2; Exp2 branches (true) to Stm, then Exp3, then back to Exp2 (confluence); the false edge leaves the loop.
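
The stated equivalence between `for` and `while` can be tried out directly. The Java sketch below is an illustration with hypothetical names. One caveat worth keeping in mind for Java: the rewrite does not carry over unchanged when Stm contains `continue`, since `continue` in a `for` loop still runs Exp3.

```java
// Sketch: a for loop and its while-based rewriting compute the same result.
final class LoopEquivalence {

    static int sumWithFor(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    static int sumWithWhile(int n) {
        int sum = 0;
        {
            int i = 1;          // Exp1
            while (i <= n) {    // Exp2
                sum += i;       // Stm
                i++;            // Exp3
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithFor(10) == sumWithWhile(10));
    }
}
```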

Control-flow graph
  Given program:
      int x = 1;
      int y = 3;
      if (a>b) { x = x+2; } else { x <-> y; }
      print(x,y);
  Control-flow graph:
      int x = 1;  →  int y = 3;  →  a>b
      a>b  (true)   →  x = x+2;  →  print(x,y);
      a>b  (false)  →  x <-> y;  →  print(x,y);
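
One possible in-memory representation of this graph is sketched below in Java. It is an assumption-laden illustration: the `Node` type, the `edge` helper, and the rule "more than one predecessor means confluence" are choices made for this example, not something the slides prescribe.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: the running example's control-flow graph as nodes and edges.
final class CfgExample {

    static final class Node {
        final String label;
        final List<Node> successors = new ArrayList<>();
        final List<Node> predecessors = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    static void edge(Node from, Node to) {
        from.successors.add(to);
        to.predecessors.add(from);
    }

    // A confluence point is a node where several control-flow paths meet.
    static boolean isConfluence(Node n) {
        return n.predecessors.size() > 1;
    }

    // Builds the graph and returns its nodes in program order.
    static List<Node> build() {
        Node declX = new Node("int x = 1;");
        Node declY = new Node("int y = 3;");
        Node cond = new Node("a>b");
        Node thenStm = new Node("x = x+2;");
        Node elseStm = new Node("x <-> y;");
        Node print = new Node("print(x,y);");

        edge(declX, declY);
        edge(declY, cond);
        edge(cond, thenStm);   // true edge
        edge(cond, elseStm);   // false edge
        edge(thenStm, print);
        edge(elseStm, print);

        List<Node> nodes = new ArrayList<>();
        nodes.add(declX);
        nodes.add(declY);
        nodes.add(cond);
        nodes.add(thenStm);
        nodes.add(elseStm);
        nodes.add(print);
        return nodes;
    }
}
```

Only `print(x,y);` has two predecessors, so it is the single confluence point, which is where a data-flow analysis has to combine information.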

Exercise: Draw a Control-Flow Graph for:
      public static void main ( String[] args ) {
          int mi, ma;
          if (args.length == 0)
              System.out.println("No numbers");
          else {
              mi = ma = Integer.parseInt(args[0]);
              for (int i=1; i < args.length; i++) {
                  int obs = Integer.parseInt(args[i]);
                  if (obs > ma) ma = obs;
                  else if (mi < obs) mi = obs;
              }
              System.out.println("min=" + mi + "," + "max=" + ma);
          }
      }

Control-Flow Graph
  CFG:
      int mi, ma;  →  args.length == 0
      args.length == 0  (true)   →  System.out.println("No numbers");
      args.length == 0  (false)  →  mi = ma = Integer.parseInt(args[0]);  →  int i=1;  →  i < args.length
      i < args.length  (true)    →  int obs = Integer.parseInt(args[i]);  →  obs > ma
      i < args.length  (false)   →  System.out.println("min=" + mi + "," + "max=" + ma);
      obs > ma  (true)   →  ma = obs;  →  i++;
      obs > ma  (false)  →  mi < obs
      mi < obs  (true)   →  mi = obs;  →  i++;
      mi < obs  (false)  →  i++;
      i++;  →  i < args.length

Control Structures (cont’d2)
  do-while:  do Stm while ( Exp );
  "?:"; "conditional expression":  Exp1 ? Exp2 : Exp3
  "||"; "lazy disjunction" (aka "short-cut ∨"):  Exp1 || Exp2
  "&&"; "lazy conjunction" (aka "short-cut ∧"):  Exp1 && Exp2
  switch:  switch ( Exp ) { Swb* }
      Swb:  case Exp : Stm* break;
            default : Stm* break;
  Exercise: draw the control-flow graphs for these constructs.
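
The lazy operators are control structures because the right operand is only evaluated on one branch. The Java sketch below counts operand evaluations to make that visible; the class and method names are hypothetical and chosen for this illustration.

```java
// Sketch: "||" and "&&" skip their right operand depending on the left one.
final class LazyOperators {
    private static int evaluations;

    // Evaluates an operand and records that it was evaluated.
    private static boolean operand(boolean value) {
        evaluations++;
        return value;
    }

    // Number of operands evaluated for: left || right
    static int evaluationsForOr(boolean left, boolean right) {
        evaluations = 0;
        boolean result = operand(left) || operand(right);
        return evaluations;
    }

    // Number of operands evaluated for: left && right
    static int evaluationsForAnd(boolean left, boolean right) {
        evaluations = 0;
        boolean result = operand(left) && operand(right);
        return evaluations;
    }

    public static void main(String[] args) {
        System.out.println(evaluationsForOr(true, false));
        System.out.println(evaluationsForAnd(true, false));
    }
}
```

In a control-flow graph this shows up as a branch on Exp1 with Exp2 on only one of the two outgoing edges.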

Control Structures (cont’d3)
  try-catch-finally (exceptions):  try Stm1 catch ( Exp ) Stm2 finally Stm3
  return / break / continue:  return ;  return Exp ;  break ;  continue ;
  "method invocation":  f(x)
  e.g., "recursive method invocation":  f(x)
  "virtual dispatching":  f(x)

Control Structures (cont’d4)
  "function pointers":  (*f)(x)
  e.g., "higher-order functions":  λf.λx.(f x)
  "dynamic evaluation":  eval(some-string-which-has-been-dynamically-computed)
  Some constructions (and thus languages) require a separate control-flow analysis for determining control-flow in order to do data-flow analysis
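
Why such constructs complicate the picture can be seen in a small Java sketch: the target of the call inside `apply` depends on which function value flows into `f`, so the control flow depends on the data flow. The names `IndirectCall`, `choose`, and `apply` are hypothetical.

```java
import java.util.function.IntUnaryOperator;

// Sketch: with function values, the callee is not visible at the call site.
final class IndirectCall {

    // Which lambda is returned depends on a run-time value.
    static IntUnaryOperator choose(boolean flag) {
        if (flag) {
            return n -> n + 1;
        }
        return n -> n * 2;
    }

    // The call below may reach either lambda body; a control-flow analysis
    // would first have to approximate the set of functions f can denote.
    static int apply(IntUnaryOperator f, int x) {
        return f.applyAsInt(x);
    }

    public static void main(String[] args) {
        System.out.println(apply(choose(args.length > 0), 5));
    }
}
```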
