Control Flow Analysis (Chapter 7) Mooly Sagiv (with Contributions by Hanne Riis Nielson)



Outline
- What is Control Flow Analysis?
- Structure of an optimizing compiler
- A motivating example
- Constructing basic blocks
- Depth-first search
- Finding dominators
- Reducibility
- Interval and structural analysis
- Conclusions

Control Flow Analysis
Input: a sequence of IR instructions
Output:
- A partition of the IR into basic blocks
- A control flow graph
- The loop structure

Compiler Structure
[Figure: String of characters → Scanner → tokens → Parser → AST → Semantic analyzer → IR → Code Generator → Object code. All phases use the symbol table and access routines and the OS interface.]

Optimizing Compiler Structure
[Figure: String of characters → Front-End → IR → Control Flow Analysis → CFG → Data Flow Analysis → CFG + information → Program Transformations → instruction selection → Object code]

An Example: Reaching Definitions
- A definition is an assignment to a variable
- A definition d reaches a program point if there exists an execution path to this point along which the value assigned at d is still live, i.e. the variable is not reassigned on the way

Running Example

    unsigned int fib(unsigned int m)
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if (m <= 1) {
            return m;
        } else {
            for (i = 2; i <= m; i++) {
                f2 = f0 + f1; f0 = f1; f1 = f2;
            }
            return f2;
        }
    }

     1:     receive m(val)
     2:     f0 ← 0
     3:     f1 ← 1
     4:     if m <= 1 goto L3
     5:     i ← 2
     6: L1: if i <= m goto L2
     7:     return f2
     8: L2: f2 ← f0 + f1
     9:     f0 ← f1
    10:     f1 ← f2
    11:     i ← i + 1
    12:     goto L1
    13: L3: return m

Reaching-definition sets annotated on the slide, in slide order:
{1}; {1, 2}; {1, 2, 3}; {1, 2, 3, 5}; {1, 2, 3, 5, 8, 9, 10, 11}; {1, 3, 5, 8, 9, 10, 11}; {1, 5, 8, 9, 10, 11}; {1, 8, 9, 10, 11}

[Figure: control flow graph of the running example (entry, instructions 1-13, exit), annotated with reaching-definition sets in slide order: ∅; ∅; {2, 3}; {2, 3, 5, 8, 9, 10, 11}; {2, 3}; {2, 3, 5, 8, 9, 10, 11}]

Approaches for Data Flow Analysis
- Iterative
  - Compute natural loops and iterate on the CFG
- Interval based
  - Reduce the CFG to a single node
  - Inductively define the data flow solution
- Structural
  - Identify control flow structures in the CFG
  - Inductively define the data flow solution
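The iterative approach can be sketched on the running example as a round-robin fixed-point computation of reaching definitions. This is only an illustration, not the deck's own code: the block names (`b1_4`, `b5`, ...), the function name, and the `gen`/`kill` dictionaries are assumptions, with definitions numbered by their MIR instruction as on the slides.

```python
def reaching_definitions(succ, gen, kill):
    """Iterate IN/OUT sets over the CFG until nothing changes."""
    preds = {n: set() for n in succ}
    for m, targets in succ.items():
        for s in targets:
            preds[s].add(m)
    rd_in = {n: set() for n in succ}
    rd_out = {n: set(gen[n]) for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            # IN(n) is the union of OUT(p) over all predecessors p
            rd_in[n] = set().union(*(rd_out[p] for p in preds[n]))
            new_out = gen[n] | (rd_in[n] - kill[n])
            if new_out != rd_out[n]:
                rd_out[n] = new_out
                changed = True
    return rd_in, rd_out


# Hypothetical block names: b1_4 holds instructions 1-4, b8_12 holds 8-12, etc.
succ = {
    "entry": ["b1_4"], "b1_4": ["b5", "b13"], "b5": ["b6"],
    "b6": ["b7", "b8_12"], "b7": ["exit"], "b8_12": ["b6"],
    "b13": ["exit"], "exit": [],
}
gen = {n: set() for n in succ}
kill = {n: set() for n in succ}
gen["b1_4"], kill["b1_4"] = {1, 2, 3}, {9, 10}
gen["b5"], kill["b5"] = {5}, {11}
gen["b8_12"], kill["b8_12"] = {8, 9, 10, 11}, {2, 3, 5}

rd_in, rd_out = reaching_definitions(succ, gen, kill)
print(sorted(rd_in["b6"]))
```

The set computed at the loop test agrees with the set {1, 2, 3, 5, 8, 9, 10, 11} shown for the running example.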

[Figure: control flow graph of the running example (entry, instructions 1-13, exit), annotated with definition sets in slide order: ∅; ∅; {2, 3}; {2, 3, 5}; {2, 3, 5, 8, 9, 10, 11}; {2, 3, 5}; {2, 3}; {2, 3, 5, 8, 9, 10, 11}; {8, 9, 10, 11}]

[Figure sequence: interval-based reduction of the running example's CFG between entry and exit. Each region is labelled with a pair of definition sets (consistent with killed and generated definitions), and regions are merged step by step:
- Initial regions: {9, 10}, {1, 2, 3}; {11}, {5}; {2, 3, 5}, {8, 9, 10, 11}
- Loop reduced: {9, 10}, {1, 2, 3}; {11}, {5}; ∅, {8, 9, 10, 11}
- Loop merged with its initialization: {9, 10}, {1, 2, 3}; ∅, {8, 9, 10, 11, 5}
- Single region: ∅, {1, 2, 3, 8, 9, 10, 11, 5}]

Finding Basic Blocks
- A basic block is a maximal sequence of straight-line IR instructions
  - no fork or join inside the block
- A leader is an IR instruction that is
  - the entry of a routine
  - the target of a branch
  - the instruction immediately following a branch

Constructing basic blocks
Input: a sequence of MIR instructions
Output: a list of basic blocks where each MIR instruction occurs in exactly one block
Method:
- Determine the leaders of the basic blocks:
  - the first instruction in the procedure is a leader
  - any instruction that is the target of a jump is a leader
  - any instruction after a branch is a leader
- For each leader, its basic block consists of
  - the leader, and
  - all instructions up to but not including the next leader or the end of the program
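The leader rules above can be sketched as follows. The textual MIR encoding (with `<-` standing for the assignment arrow), the regular expressions, and the function names are assumptions made for this illustration. Only `goto` is treated as a branch here, which is enough for the running example.

```python
import re

# Running example as plain strings; list index i holds instruction i + 1.
MIR = [
    "receive m(val)",
    "f0 <- 0",
    "f1 <- 1",
    "if m <= 1 goto L3",
    "i <- 2",
    "L1: if i <= m goto L2",
    "return f2",
    "L2: f2 <- f0 + f1",
    "f0 <- f1",
    "f1 <- f2",
    "i <- i + 1",
    "goto L1",
    "L3: return m",
]


def find_leaders(code):
    """Return the sorted 0-based indices of all leaders."""
    labels = {}
    for idx, ins in enumerate(code):
        m = re.match(r"(\w+):", ins)
        if m:
            labels[m.group(1)] = idx
    leaders = {0}  # the first instruction is a leader
    for idx, ins in enumerate(code):
        m = re.search(r"\bgoto (\w+)", ins)
        if m:
            leaders.add(labels[m.group(1)])  # target of a jump
            if idx + 1 < len(code):
                leaders.add(idx + 1)  # instruction after a branch
    return sorted(leaders)


def basic_blocks(code):
    """Return blocks as lists of 1-based instruction numbers."""
    leaders = find_leaders(code)
    ends = leaders[1:] + [len(code)]
    return [list(range(s + 1, e + 1)) for s, e in zip(leaders, ends)]


blocks = basic_blocks(MIR)
print(blocks)
```

On the running example this yields six blocks, with leaders at instructions 1, 5, 6, 7, 8 and 13.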

Running Example

    unsigned int fib(unsigned int m)
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if (m <= 1) {
            return m;
        } else {
            for (i = 2; i <= m; i++) {
                f2 = f0 + f1; f0 = f1; f1 = f2;
            }
            return f2;
        }
    }

     1:     receive m(val)
     2:     f0 ← 0
     3:     f1 ← 1
     4:     if m <= 1 goto L3
     5:     i ← 2
     6: L1: if i <= m goto L2
     7:     return f2
     8: L2: f2 ← f0 + f1
     9:     f0 ← f1
    10:     f1 ← f2
    11:     i ← i + 1
    12:     goto L1
    13: L3: return m


Running Example (basic blocks)
[Figure: the same fib source and MIR listing, partitioned into six basic blocks labelled B1-B6. By the leader rules, the leaders are instructions 1, 5, 6, 7, 8 and 13, so the blocks cover instructions 1-4, 5, 6, 7, 8-12 and 13.]

Constructing the Control Flow Graph (CFG)
- A special entry block r without predecessors
- A special exit block without successors
- There is an edge m → n if one of the following holds:
  - m = entry and the first instruction in n begins the procedure
  - n = exit and the last instruction in m is a return or the last instruction in the procedure
  - there is a branch from the last instruction in m to the first instruction in n
  - the first instruction in n immediately follows the last instruction in m, and that instruction is not an unconditional branch or a return
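A minimal sketch of these edge rules, assuming the basic blocks are already available as lists of instruction strings (same hypothetical textual MIR encoding, with `<-` for the assignment arrow). Blocks are identified by their list index, so the names do not correspond to the B1-B6 labels on the slides.

```python
import re


def build_cfg(blocks):
    """Return the set of CFG edges over block indices, 'entry' and 'exit'."""
    label_to_block = {}
    for b, ins in enumerate(blocks):
        m = re.match(r"(\w+):", ins[0])  # a label can only start a block
        if m:
            label_to_block[m.group(1)] = b
    edges = {("entry", 0)}
    for b, ins in enumerate(blocks):
        body = re.sub(r"^\w+:\s*", "", ins[-1])  # last instruction, label stripped
        jump = re.search(r"\bgoto (\w+)", body)
        if jump:
            edges.add((b, label_to_block[jump.group(1)]))  # branch edge
        unconditional = body.startswith("goto")
        if body.startswith("return") or (b == len(blocks) - 1 and not unconditional):
            edges.add((b, "exit"))
        elif not unconditional:
            edges.add((b, b + 1))  # fall-through edge
    return edges


blocks = [
    ["receive m(val)", "f0 <- 0", "f1 <- 1", "if m <= 1 goto L3"],
    ["i <- 2"],
    ["L1: if i <= m goto L2"],
    ["return f2"],
    ["L2: f2 <- f0 + f1", "f0 <- f1", "f1 <- f2", "i <- i + 1", "goto L1"],
    ["L3: return m"],
]
edges = build_cfg(blocks)
print(sorted(edges, key=str))
```

The edge from the block ending in `goto L1` back to the block holding `L1` is the loop's back edge.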

Running Example
[Figure: the MIR listing (instructions 1-13) grouped into basic blocks B1-B6, followed by the resulting control flow graph with the added entry and exit nodes]

How to treat call instructions?
Possible choices:
- A call is an atomic instruction
- A call ends a basic block
- Replace the call by the procedure body (inline)
- A call is a “goto” into the procedure
- A call is handled in a special way

Potential Difficulties
- Gotos outside procedure boundaries
- Exit/trap calls
- Exception handling
- Computed gotos
- setjmp(), longjmp() calls

Approaches for Data Flow Analysis
- Iterative
  - Compute natural loops and iterate on the CFG
- Interval based
  - Reduce the CFG to a single node
  - Inductively define the data flow solution
- Structural
  - Identify control flow structures in the CFG

Identifying Natural Loops
- A basic block m dominates a basic block n if every path from entry to n includes m
- The domination relation is reflexive, transitive, and anti-symmetric, and can be represented as a tree
- An edge m → n is a back edge if and only if n dominates m
- The natural loop of a back edge m → n contains n together with all blocks from which m can be reached without passing through n

[Figure: control flow graph of the running example (instructions 1-13) with nodes entry, B0-B7 and exit]

Reducible Flow Graphs
- All the loops are natural
- Can be “reduced” into a single node via a sequence of special transformations
  - Example: T1, T2 transformations
- Every loop has a single entry
- Result from “well structured” programs
- Most programs are compiled into reducible flow graphs

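T1/T2 Transformations
[Figure: T1 removes a self-loop edge n → n. T2 merges a node that has a single predecessor into that predecessor.]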
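Reducibility can be tested by applying T1 and T2 until neither applies and checking whether a single node remains. The sketch below is a straightforward, unoptimized version. The graph encoding (a dict from node to successor set) and the function name are assumptions, and every node is assumed to be reachable from the entry.

```python
def is_reducible(succ, entry):
    """Apply T1/T2 repeatedly; the graph is reducible iff one node is left."""
    succ = {n: set(s) for n, s in succ.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        # T1: delete self-loops n -> n
        for n in succ:
            if n in succ[n]:
                succ[n].discard(n)
                changed = True
        # T2: merge a node with exactly one predecessor into that predecessor
        for n in list(succ):
            if n == entry:
                continue
            preds = [p for p in succ if n in succ[p]]
            if len(preds) == 1:
                p = preds[0]
                succ[p].discard(n)
                succ[p] |= succ.pop(n)
                changed = True
                break  # restart so T1 runs before the next T2
    return len(succ) == 1


# A simple natural loop between nodes 1 and 2: reducible.
loop_graph = {0: {1}, 1: {2, 3}, 2: {1}, 3: set()}
# A cycle between 1 and 2 that can be entered at either node: irreducible.
two_entry_cycle = {0: {1, 2}, 1: {2}, 2: {1}}

print(is_reducible(loop_graph, 0), is_reducible(two_entry_cycle, 0))
```

In the second graph neither transformation applies, because both nodes of the cycle have two predecessors; this is the situation node splitting is meant to repair.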

Bad Example
[Figure: an irreducible flow graph over the blocks B1-B5]

Node Splitting
[Figure: the flow graph over B1-B5 before node splitting, and after splitting, where a copy B3a of B3 has been added]

Why can’t we construct loops from source?
- Language dependent
- Non uniform
- Source-to-source transformations
- Most programming languages support “wild” GOTOs

Depth-first spanning tree
Input: a flow graph G = (N, E, r)
Output: a depth-first spanning tree (N, T)
Method:
    T := Ø;
    for each node n in N do mark n unvisited;
    call DFS(r)
Using:
    procedure DFS(n) is
        mark n visited;
        for each n → s in E do
            if s is not visited then
                add the edge n → s to T;
                call DFS(s)

Better DFS Implementations
- Explicit stack instead of recursion
- Pointer reversal

Pre-ordering
Input: a flow graph G = (N, E, r)
Output: a depth-first spanning tree (N, T) and an ordering Pre of N
Method:
    T := Ø;
    for each node n in N do mark n unvisited;
    i := 1;
    call DFS(r)
Using:
    procedure DFS(n) is
        mark n visited;
        Pre(n) := i; i := i + 1;
        for each n → s in E do
            if s is not visited then
                add the edge n → s to T;
                call DFS(s);
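The pre-ordering DFS can also be written with an explicit stack instead of recursion. The following is a sketch under assumed names: `succ` maps each node to an ordered list of successors, and the small graph is made up for illustration.

```python
def dfs_preorder(succ, root):
    """Return (pre, tree): preorder numbers and spanning-tree edges."""
    pre = {root: 1}
    tree = []
    # Each stack entry pairs a node with an iterator over its unexplored edges.
    stack = [(root, iter(succ[root]))]
    while stack:
        node, remaining = stack[-1]
        for s in remaining:
            if s not in pre:  # s is unvisited
                pre[s] = len(pre) + 1
                tree.append((node, s))
                stack.append((s, iter(succ[s])))
                break
        else:
            stack.pop()  # all edges of node explored
    return pre, tree


succ = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["a"]}
pre, tree = dfs_preorder(succ, "r")
print(pre, tree)
```

Keeping the iterator on the stack preserves the visiting order of the recursive formulation, so the same spanning tree and numbering result.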

Computing dominators
Input: a flow graph G = (N, E, r)
Output: for each node n, a set DOM(n) of dominators
Method:
    DOM(r) := { r };
    for each n in N \ { r } do DOM(n) := N;
    while changes in some DOM(n) do
        for each n in N \ { r } do
            DOM(n) := { n } ∪ ⋂ { DOM(p) | p → n is in E }
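A direct transcription of this iterative method, as a sketch. The node names for the running example's CFG (`b1_4` for the block holding instructions 1-4, and so on) are hypothetical and do not follow the B-labels on the slides.

```python
def dominators(nodes, edges, root):
    """Compute DOM(n) for every node by iterating to a fixed point."""
    preds = {n: set() for n in nodes}
    for m, n in edges:
        preds[n].add(m)
    dom = {n: set(nodes) for n in nodes}  # DOM(n) := N
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == root:
                continue
            pred_doms = [dom[p] for p in preds[n]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


nodes = ["entry", "b1_4", "b5", "b6", "b7", "b8_12", "b13", "exit"]
edges = [
    ("entry", "b1_4"), ("b1_4", "b5"), ("b1_4", "b13"), ("b5", "b6"),
    ("b6", "b7"), ("b6", "b8_12"), ("b8_12", "b6"),
    ("b7", "exit"), ("b13", "exit"),
]
dom = dominators(nodes, edges, "entry")
print(sorted(dom["b8_12"]))
```

Since `b6` is in DOM(`b8_12`), the edge `b8_12` → `b6` is a back edge.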

[Figure: control flow graph of the running example (instructions 1-13) with nodes entry, B0-B7 and exit, used to illustrate the dominator computation]

Other Algorithms for Finding Dominators
- Lengauer & Tarjan: O(e log n) algorithm
- Harel: linear time algorithm
- Thorup: linear time algorithm
- Alstrup & Lauridsen: incremental algorithm

Computing natural loops
Input: a flow graph G = (N, E, r) and a back edge m → n
Output: a set, loop, of the nodes in the natural loop of m → n
Method:
    stack := empty;
    loop := {n};
    call add(m);
    while stack is not empty do
        pop d from the stack;
        for each p with p → d in E do call add(p)
Using:
    procedure add(p) is
        if p is not in loop then
            loop := loop ∪ {p};
            push p on the stack
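The same worklist method in Python, as a sketch. The edge list reuses hypothetical block names for the running example, where `b6` holds the loop test and `b8_12` the loop body.

```python
def natural_loop(edges, m, n):
    """Return the natural loop of the back edge m -> n."""
    preds = {}
    for a, b in edges:
        preds.setdefault(b, set()).add(a)
    loop = {n}
    stack = []

    def add(p):
        if p not in loop:
            loop.add(p)
            stack.append(p)

    add(m)
    while stack:
        d = stack.pop()
        # Walk backwards; the header n is already in loop, so the walk stops there.
        for p in preds.get(d, ()):
            add(p)
    return loop


edges = [
    ("entry", "b1_4"), ("b1_4", "b5"), ("b1_4", "b13"), ("b5", "b6"),
    ("b6", "b7"), ("b6", "b8_12"), ("b8_12", "b6"),
    ("b7", "exit"), ("b13", "exit"),
]
loop = natural_loop(edges, "b8_12", "b6")
print(sorted(loop))
```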

Issues
- Natural loops with different headers are either disjoint or nested within each other
- But what about loops that share a header?

Two Loops with the Same Header

First example:

        i = 1;
    B1: if (i >= 100) goto B4
        else if ((i % 10) == 0) goto B3
        else
    B2:     ....
            i++; goto B1
    B3: ....
        i++; goto B1
    B4: ...

Second example:

    B1: if (i < j) goto B2
        else if (i > j) goto B3
        else goto B4
    B2: ....
        i++; goto B1
    B3: ....
        i++; goto B1
    B4: ...

Strongly connected components
Input: a flow graph G = (N, E, r)
Output: a set of strongly connected components
Method:
    for all n in N do mark n unvisited;
    i := 1;
    stack := empty;
    while there exists an unvisited node n do call SCC(n)
Using:
    procedure SCC(n) is ...

procedure SCC(n) is
    mark n visited;
    Pre(n) := i; Low(n) := i;    (Low: lowest number for a node in the SCC)
    i := i + 1;
    push n on the stack;
    for each n → s in E do
        if s is not visited then
            call SCC(s);
            Low(n) := min(Low(n), Low(s))
        else if Pre(s) < Pre(n) and s is on the stack then    (back or cross edge)
            Low(n) := min(Low(n), Pre(s));
    if Low(n) = Pre(n) then    (n is the root of an SCC)
        SCC := Ø;
        repeat
            pop d off the stack;
            SCC := SCC ∪ {d}
        until d = n;
        return SCC
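A recursive Python sketch of this procedure. It collects every component in a result list rather than returning one at a time; the names and the example graph (the running example's CFG with hypothetical block names) are assumptions. For very deep graphs Python's recursion limit would have to be raised or the recursion replaced by an explicit stack.

```python
def strongly_connected_components(succ):
    """Return a list of SCCs (as sets) of the graph given by succ."""
    pre, low = {}, {}
    stack, on_stack = [], set()
    result = []

    def scc(n):
        pre[n] = low[n] = len(pre) + 1
        stack.append(n)
        on_stack.add(n)
        for s in succ.get(n, ()):
            if s not in pre:  # tree edge
                scc(s)
                low[n] = min(low[n], low[s])
            elif pre[s] < pre[n] and s in on_stack:  # back or cross edge
                low[n] = min(low[n], pre[s])
        if low[n] == pre[n]:  # n is the root of an SCC
            component = set()
            while True:
                d = stack.pop()
                on_stack.discard(d)
                component.add(d)
                if d == n:
                    break
            result.append(component)

    for n in succ:
        if n not in pre:
            scc(n)
    return result


succ = {
    "entry": ["b1_4"], "b1_4": ["b5", "b13"], "b5": ["b6"],
    "b6": ["b7", "b8_12"], "b7": ["exit"], "b8_12": ["b6"],
    "b13": ["exit"], "exit": [],
}
components = strongly_connected_components(succ)
print(components)
```

The only non-trivial component consists of the loop test and the loop body; every other block forms a component by itself.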

Structural Analysis
- Identify “common” structures in the control flow graph (even irreducible ones)
- Reduce the CFG into “simple regions”
- Shift some dataflow analysis from compile-time to compiler-generation-time
- Can be efficiently implemented via DFS

Block Schema
[Figure: a sequence of blocks B1, B2, ..., Bn reduced to a single block]

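Conditionals
[Figure: three conditional schemata: a two-block form (B1, B2), a three-block form (B1, B2, B3), and a multiway form (B0 branching to B1, B2, ..., Bn)]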

Loops
[Figure: three loop schemata: two forms over blocks B1 and B2, and one form over blocks B1, B2 and B3]