Abstract Interpretation (Cousot, Cousot 1977) also known as Data-Flow Analysis.

Slides:



Advertisements
Similar presentations
Continuing Abstract Interpretation We have seen: 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe.
Advertisements

CSCI 115 Chapter 6 Order Relations and Structures.
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Exercise 1 Generics and Assignments. Language with Generics and Lots of Type Annotations Simple language with this syntax types:T ::= Int | Bool | T =>
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
CS412/413 Introduction to Compilers Radu Rugina Lecture 16: Efficient Translation to Low IR 25 Feb 02.
Chapter 7 Relations : the second time around
Lecture 02 – Structural Operational Semantics (SOS) Eran Yahav 1.
Foundations of Data-Flow Analysis. Basic Questions Under what circumstances is the iterative algorithm used in the data-flow analysis correct? How precise.
CSE 231 : Advanced Compilers Building Program Analyzers.
Worklist algorithm Initialize all d i to the empty set Store all nodes onto a worklist while worklist is not empty: –remove node n from worklist –apply.
Program analysis Mooly Sagiv html://
CS 536 Spring Global Optimizations Lecture 23.
1 Iterative Program Analysis Part I Mooly Sagiv Tel Aviv University Textbook: Principles of Program.
Programming Language Semantics Denotational Semantics Chapter 5 Part II.
Data Flow Analysis Compiler Design Nov. 3, 2005.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
Program analysis Mooly Sagiv html://
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
Abstract Interpretation Part I Mooly Sagiv Textbook: Chapter 4.
Administrative stuff Office hours: After class on Tuesday.
1 CS 201 Compiler Construction Lecture 6 Code Optimizations: Constant Propagation & Folding.
Data Flow Analysis Compiler Design Nov. 8, 2005.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs, Data-flow Analysis Data-flow Frameworks --- today’s.
San Diego October 4-7, 2006 Over 1,000 women in computing Events for undergraduates considering careers and graduate school Events for graduate students.
Recap: Reaching defns algorithm From last time: reaching defns worklist algo We want to avoid using structure of the domain outside of the flow functions.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Overview of program analysis Mooly Sagiv html://
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Claus Brabrand, ITU, Denmark DATA-FLOW ANALYSISMar 25, 2009 Static Analysis: Data-Flow Analysis II Claus Brabrand IT University of Copenhagen (
Claus Brabrand, UFPE, Brazil Aug 09, 2010DATA-FLOW ANALYSIS Claus Brabrand ((( ))) Associate Professor, Ph.D. ((( Programming, Logic, and.
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Partially Ordered Sets (POSets)
Lecture 9 Illustrations Lattices. Fixpoints Abstract Interpretation.
Solving fixpoint equations
Compiler Construction Lecture 16 Data-Flow Analysis.
The Integers. The Division Algorithms A high-school question: Compute 58/17. We can write 58 as 58 = 3 (17) + 7 This forms illustrates the answer: “3.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 9: Abstract Interpretation I Roman Manevich Ben-Gurion University.
Control Structures II Repetition (Loops). Why Is Repetition Needed? How can you solve the following problem: What is the sum of all the numbers from 1.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
CS 267: Automated Verification Lecture 3: Fixpoints and Temporal Properties Instructor: Tevfik Bultan.
Unit II Discrete Structures Relations and Functions SE (Comp.Engg.)
1 Section 4.3 Order Relations A binary relation is an partial order if it transitive and antisymmetric. If R is a partial order over the set S, we also.
Principle of Programming Lanugages 3: Compilation of statements Statements in C Assertion Hoare logic Department of Information Science and Engineering.
Strings Basic data type in computational biology A string is an ordered succession of characters or symbols from a finite set called an alphabet Sequence.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Compiler Principles Fall Compiler Principles Lecture 11: Loop Optimizations Roman Manevich Ben-Gurion University.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
CS 598 Scripting Languages Design and Implementation 9. Constant propagation and Type Inference.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
Semilattices presented by Niko Simonson, CSS 548, Autumn 2012 Semilattice City, © 2009 Nora Shader.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
1 Iterative Program Analysis Part II Mathematical Background Mooly Sagiv Tel Aviv University
Chaotic Iterations Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Iterative Dataflow Problems Taken largely from notes of Alex Aiken (UC Berkeley) and Martin Rinard (MIT) Dataflow information used in optimization Several.
Abstractions Eric Feron. Outline Principles of abstraction Motivating example Abstracting variables Abstracting functions Abstracting operators Recommended.
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University.
Lub and glb Given a poset (S, · ), and two elements a 2 S and b 2 S, then the: –least upper bound (lub) is an element c such that a · c, b · c, and 8 d.
DFA foundations Simone Campanoni
Simone Campanoni DFA foundations Simone Campanoni
Data Flow Analysis Compiler Design
Lecture 20: Dataflow Analysis Frameworks 11 Mar 02
Background material.
교환 학생 프로그램 내년 1월 중순부터 6월 초 현재 학부 2,3 학년?
Background material.
Presentation transcript:

Abstract Interpretation (Cousot, Cousot 1977) also known as Data-Flow Analysis

Goal of Data-Flow Analysis Automatically compute information about the program Use it to report errors to user (like type errors) Use it to optimize the program Works on control-flow graphs: (like flow-charts) x = 1 while (x < 10) { x = x + 2 }

Constant Propagation int a, b, step, i; boolean c; a = 0; b = a + 10; step = -1; if (step > 0) { i = a; } else { i = b; } c = true; while (c) { print(i); i = i + step; // can emit decrement if (step > 0) { c = (i < b); } else { c = (i > a); // can emit better instruction here } // insert here (a = a + step), redo analysis }

Control-Flow Graph: (V,E) Set of nodes, V Set of edges, which have statements on them (v 1,st,v 2 ) in E means there is edge from v 1 to v 2 labeled with statement st. x = 1 while (x < 10) { x = x + 2 } V = {v 0,v 1,v 2,v 3 } E = {(v 0,x=1,v 1 ), (v 1,[x<10],v 2 ), (v 2,x=x+2,v 1 ), (v 1,[!(x<10)],v 3 )}

Interpretation and Abstract Interpratation Control-Flow graph is similar to AST We can – interpret control flow graph – generate machine code from it (e.g. LLVM, gcc) – abstractly interpret it: do not push values, but approximately compute supersets of possible values (e.g. intervals, types, etc.)

Compute Range of x at Each Point

What we see today 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe abstractly sets of program states (facts) 3.Transfer functions that describe how to update facts (started) Next time: 4.Iterative analysis algorithm 5.Convergence

Generating Control-Flow Graphs Start with graph that has one entry and one exit node and label is entire program Recursively decompose the program to have more edges with simpler labels When labels cannot be decomposed further, we are done

Flattening Expressions

If-Then-Else Better translation uses the "branch" instruction approach: have two destinations

While Better translation uses the "branch" instruction

Example 1: Convert to CFG while (i < 10) { println(j); i = i + 1; j = j +2*i + 1; }

Example 1 Result while (i < 10) { println(j); i = i + 1; j = j +2*i + 1; }

Example 2: Convert to CFG int i = n; while (i > 1) { println(i); if (i % 2 == 0) { i = i / 2; } else { i = 3*i + 1; }

Example 2 Result int i = n; while (i > 1) { println(i); if (i % 2 == 0) { i = i / 2; } else { i = 3*i + 1; }

Analysis Domains

Abstract Intepretation Generalizes Type Inference Type Inference computes types type rules – can be used to compute types of expression from subtypes types fixed for a variable Abstract Interpretation computes facts from a domain – types – intervals – formulas – set of initialized variables – set of live variables transfer functions – compute facts for one program point from facts at previous program points facts change as the values of vars change (flow-sensitivity)

scalac computes types. Try in REPL: class C class D extends C class E extends C val p = false val d = new D() val e = new E() val z = if (p) d else e val u = if (p) (d,e) else (d,d) val v = if (p) (d,e) else (e,d) val f1 = if (p) ((d1:D) => 5) else ((e1:E) => 5) val f2 = if (p) ((d1:D) => d) else ((e1:E) => e)

Finds "Best Type" for Expression class C class D extends C class E extends C val p = false val d = new D()// d:D val e = new E()// e:E val z = if (p) d else e// z:C val u = if (p) (d,e) else (d,d)// u:(D,C) val v = if (p) (d,e) else (e,d)// v:(C,C) val f1 = if (p) ((d1:D) => 5) else ((e1:E) => 5)// f1: ((D with E) => Int) val f2 = if (p) ((d1:D) => d) else ((e1:E) => e)// f2: ((D with E) => C)

Subtyping Relation in this Example product of two graphs:...

Subtyping Relation in this Example

Least Upper Bound (lub, join) A,B,C are all upper bounds on both D and E (they are above each of then in the picture, they are supertypes of D and supertypes of E). Among these upper bounds, C is the least one (the most specific one). We therefore say C is the least upper bound, In any partial order , if S is a set of elements (e.g. S={D,E}) then: U is upper bound on S iff x  U for every x in S. U 0 is the least upper bound (lub) of S, written U 0 = S, or U 0 =lub(S) iff: U 0 is upper bound and if U is any upper bound on S, then U 0  U

Greatest Lower Bound (glb, meet) In any partial order , if S is a set of elements (e.g. S={D,E}) then: L is lower bound on S iff L  x for every x in S. L 0 is the greatest upper bound (glb) of S, written L 0 = S, or L 0 =glb(S), iff: m 0 is upper bound and if m is any upper bound on S, then m 0  m In the presence of traits or interfaces, there are multiple types that are subtypes of both D and E. The type (D with E) is the largest of them.

Computing lub and glb for tuple and function types

Lattice Partial order: binary relation  (subset of some D 2 ) which is – reflexive: x  x – anti-symmetric: x  y /\ y  x -> x=y – transitive: x  y /\ y  z -> x  z Lattice is a partial order in which every two-element set has lub and glb Lemma: if (D,  ) is lattice and D is finite, then lub and glb exist for every finite set

Idea of Why Lemma Holds lub(x 1,lub(x 2,...,lub(x n-1,x n ))) is lub({x 1,...x n }) glb(x 1,glb(x 2,...,glb(x n-1,x n ))) is glb({x 1,...x n }) lub of all elements in D is maximum of D – by definition, glb({}) is the maximum of D glb of all elements in D is minimum of D – by definition, lub({}) is the minimum of D

Graphs and Partial Orders If the domain is finite, then partial order can be represented by directed graphs – if x  y then draw edge from x to y For partial order, no need to draw x  z if x  y and y  z. So we only draw non-transitive edges Also, because always x  x, we do not draw those self loops Note that the resulting graph is acyclic: if we had a cycle, the elements must to be equal

Defining Abstract Interpretation Abstract Domain D (elements are data-flow facts), describing which information to compute, e.g. – inferred types for each variable: x:C, y:D – interval for each variable x:[a,b], y:[a’,b’] Transfer Functions, [[st]] for each statement st, how this statement affects the facts – Example:

Domain of Intervals [a,b] where a,b  {-M,-127,0,127,M-1}

Find Transfer Function: Plus If and we execute x= x+y then Suppose we have only two integer variables: x,y So we can let a’= a+c b’ = b+d c’=c d’ = d

Find Transfer Function: Minus If and we execute y= x-y then Suppose we have only two integer variables: x,y So we can let a’= a b’ = b c’= a - d d’ = b - c