# Control Flow Analysis (Chapter 7) Mooly Sagiv (with Contributions by Hanne Riis Nielson)


Outline

- What is Control Flow Analysis?
- Structure of an optimizing compiler
- A motivating example
- Constructing basic blocks
- Depth-first search
- Finding dominators
- Reducibility
- Interval and Structural Analysis
- Conclusions

Control Flow Analysis

- Input: a sequence of IR instructions
- Output:
  - a partition of the IR into basic blocks
  - a control flow graph
  - the loop structure

Compiler Structure

[Diagram: String of characters → Scanner → tokens → Parser → AST → Semantic analyzer → IR → Code Generator → Object code. The symbol table with its access routines and the OS interface sit alongside the phases.]

Optimizing Compiler Structure

[Diagram: String of characters → Front-End → IR → Control Flow Analysis → CFG → Data Flow Analysis → CFG + information → Program Transformations → instruction selection → Object code.]

An Example: Reaching Definitions

- A definition is an assignment to a variable.
- A definition d reaches a program point (or block) if there exists an execution path to that point along which the value assigned at d is still active, i.e. the variable is not reassigned on the way.

Running Example

    unsigned int fib(unsigned int m)
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if (m <= 1) {
            return m;
        } else {
            for (i = 2; i <= m; i++) {
                f2 = f0 + f1;
                f0 = f1;
                f1 = f2;
            }
            return f2;
        }
    }

The corresponding MIR:

     1:      receive m(val)
     2:      f0 ← 0
     3:      f1 ← 1
     4:      if m <= 1 goto L3
     5:      i ← 2
     6:  L1: if i <= m goto L2
     7:      return f2
     8:  L2: f2 ← f0 + f1
     9:      f0 ← f1
    10:      f1 ← f2
    11:      i ← i + 1
    12:      goto L1
    13:  L3: return m

Reaching-definition sets annotated beside the MIR on the slide, in order: {1}; {1, 2}; {1, 2, 3}; {1, 2, 3, 5}; {1, 2, 3, 5, 8, 9, 10, 11}; {1, 3, 5, 8, 9, 10, 11}; {1, 5, 8, 9, 10, 11}; {1, 8, 9, 10, 11}.
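The reaching-definition sets for the running example can be reproduced with a small iterative solver. The following is a minimal sketch, not the presentation's own code: the names `SUCC`, `DEFS`, and `reaching_definitions` are hypothetical, the flow graph is written at instruction granularity by hand, and `receive m(val)` is treated as a definition of `m`.

```python
from typing import Dict, List, Set

# Successors of each numbered MIR instruction (assumption: transcribed by hand
# from the running example; 7 and 13 are returns and have no successors).
SUCC: Dict[int, List[int]] = {
    1: [2], 2: [3], 3: [4], 4: [5, 13], 5: [6], 6: [7, 8], 7: [],
    8: [9], 9: [10], 10: [11], 11: [12], 12: [6], 13: [],
}

# Variable assigned by each defining instruction.
DEFS: Dict[int, str] = {
    1: "m", 2: "f0", 3: "f1", 5: "i", 8: "f2", 9: "f0", 10: "f1", 11: "i",
}


def reaching_definitions(succ: Dict[int, List[int]],
                         defs: Dict[int, str]) -> Dict[int, Set[int]]:
    """Return, for each instruction, the definitions that reach its exit."""
    preds: Dict[int, List[int]] = {n: [] for n in succ}
    for n, targets in succ.items():
        for s in targets:
            preds[s].append(n)

    out: Dict[int, Set[int]] = {n: set() for n in succ}
    changed = True
    while changed:                      # iterate until nothing changes
        changed = False
        for n in sorted(succ):
            reach_in = set().union(*(out[p] for p in preds[n]))
            if n in defs:
                # A definition kills every definition of the same variable.
                killed = {d for d, var in defs.items() if var == defs[n]}
                new_out = (reach_in - killed) | {n}
            else:
                new_out = reach_in
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out


if __name__ == "__main__":
    result = reaching_definitions(SUCC, DEFS)
    for n in sorted(result):
        print(n, sorted(result[n]))
```

Under these assumptions, the sets computed after instructions 1, 2, 3, 5, 9, 10 and 11 agree with the sets listed on the slide.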

[Figure: control flow graph of the running example, from entry to exit, annotated with reaching-definition sets: {2, 3}; {2, 3, 5, 8, 9, 10, 11}; {2, 3}; {2, 3, 5, 8, 9, 10, 11}.]

Approaches for Data Flow Analysis

- Iterative
  - Compute natural loops and iterate on the CFG
- Interval based
  - Reduce the CFG to a single node
  - Inductively define the data flow solution
- Structural
  - Identify control flow structures in the CFG
  - Inductively define the data flow solution

[Figure sequence: the control flow graph of the running example, from entry to exit, reduced step by step to a single node. Each region carries a pair of definition sets; the first component of some pairs was lost in extraction and is shown as "…". Recoverable annotations, in slide order:

- Sets on the unreduced graph: {2, 3}; {2, 3, 5}; {2, 3, 5, 8, 9, 10, 11}
- ({9, 10}, {1, 2, 3}); ({11}, {5}); ({2, 3, 5}, {8, 9, 10, 11})
- ({9, 10}, {1, 2, 3}); ({11}, {5}); (…, {8, 9, 10, 11})
- ({9, 10}, {1, 2, 3}); (…, {8, 9, 10, 11, 5})
- (…, {1, 2, 3, 8, 9, 10, 11, 5})]

Finding Basic Blocks

- A basic block is a maximal sequence of straight-line IR instructions (no fork or join inside the block)
- A leader is an IR instruction that is
  - the entry of a routine,
  - a target of a branch, or
  - the instruction immediately following a branch

Constructing basic blocks

- Input: a sequence of MIR instructions
- Output: a list of basic blocks where each MIR instruction occurs in exactly one block
- Method: determine the leaders of the basic blocks:
  - the first instruction in the procedure is a leader
  - any instruction that is the target of a jump is a leader
  - any instruction after a branch is a leader
- For each leader, its basic block consists of
  - the leader and
  - all instructions up to but not including the next leader or the end of the program
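The leader rule above can be sketched in a few lines of Python. This is an illustrative sketch only: the `Instr` record, its fields, and the function names are hypothetical, and returns are treated like branches in that the next instruction starts a new block.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    text: str                      # printable form of the instruction
    label: Optional[str] = None    # label attached to this instruction, if any
    target: Optional[str] = None   # label this instruction may jump to, if any
    ends_block: bool = False       # True for branches, gotos and returns


def find_leaders(code: List[Instr]) -> List[int]:
    """Return the sorted 0-based indices of all leader instructions."""
    if not code:
        return []
    label_pos = {ins.label: i for i, ins in enumerate(code) if ins.label}
    leaders = {0}                                  # first instruction
    for i, ins in enumerate(code):
        if ins.target is not None:
            leaders.add(label_pos[ins.target])     # target of a jump
        if ins.ends_block and i + 1 < len(code):
            leaders.add(i + 1)                     # instruction after a branch
    return sorted(leaders)


def basic_blocks(code: List[Instr]) -> List[List[int]]:
    """Each block runs from a leader up to, not including, the next leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


# The running example; index k here is MIR instruction number k + 1.
FIB = [
    Instr("receive m(val)"),
    Instr("f0 <- 0"),
    Instr("f1 <- 1"),
    Instr("if m <= 1 goto L3", target="L3", ends_block=True),
    Instr("i <- 2"),
    Instr("if i <= m goto L2", label="L1", target="L2", ends_block=True),
    Instr("return f2", ends_block=True),
    Instr("f2 <- f0 + f1", label="L2"),
    Instr("f0 <- f1"),
    Instr("f1 <- f2"),
    Instr("i <- i + 1"),
    Instr("goto L1", target="L1", ends_block=True),
    Instr("return m", label="L3", ends_block=True),
]

if __name__ == "__main__":
    for block in basic_blocks(FIB):
        print([k + 1 for k in block])   # print 1-based instruction numbers
```

On the running example this yields six blocks, covering instructions 1 to 4, 5, 6, 7, 8 to 12, and 13.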

Running Example

    unsigned int fib(unsigned int m)
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if (m <= 1) {
            return m;
        } else {
            for (i = 2; i <= m; i++) {
                f2 = f0 + f1;
                f0 = f1;
                f1 = f2;
            }
            return f2;
        }
    }

The corresponding MIR:

     1:      receive m(val)
     2:      f0 ← 0
     3:      f1 ← 1
     4:      if m <= 1 goto L3
     5:      i ← 2
     6:  L1: if i <= m goto L2
     7:      return f2
     8:  L2: f2 ← f0 + f1
     9:      f0 ← f1
    10:      f1 ← f2
    11:      i ← i + 1
    12:      goto L1
    13:  L3: return m

The slide marks six basic blocks on the MIR: B1, B2, B3, B4, B5, B6.

Constructing Control Flow Graph (CFG)

- A special entry block r without predecessors
- A special exit block without successors
- There is an edge m → n if one of the following holds:
  - m = entry and the first instruction in n begins the procedure
  - n = exit and the last instruction in m is a return or the last instruction in the procedure
  - there is a branch from the last instruction in m to the first instruction in n
  - the first instruction in n immediately follows the last instruction in m, and that last instruction is not an unconditional branch or a return
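The edge rules can be sketched as follows. This is a hedged illustration rather than the presentation's implementation: the block descriptor tuple, the kind names `"fall"`, `"cond"`, `"goto"` and `"return"`, and the naming of blocks as B1..Bn in textual order are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical block descriptor: (label of the block's first instruction or
# None, kind of its last instruction, branch target label or None).
# Kinds: "fall" (ordinary instruction), "cond" (conditional branch),
# "goto" (unconditional branch), "return".
Block = Tuple[Optional[str], str, Optional[str]]


def build_cfg(blocks: List[Block]) -> Dict[str, List[str]]:
    """Return successor lists for entry, exit and the blocks B1..Bn."""
    names = [f"B{k + 1}" for k in range(len(blocks))]
    by_label = {lab: names[k] for k, (lab, _, _) in enumerate(blocks) if lab}
    succ: Dict[str, List[str]] = {n: [] for n in ["entry", *names, "exit"]}

    def add_edge(src: str, dst: str) -> None:
        if dst not in succ[src]:          # avoid duplicate edges
            succ[src].append(dst)

    if names:
        add_edge("entry", names[0])       # entry -> first block
    for k, (_, kind, target) in enumerate(blocks):
        here = names[k]
        following = names[k + 1] if k + 1 < len(names) else "exit"
        if kind in ("cond", "goto"):
            add_edge(here, by_label[target])   # explicit branch edge
        if kind == "return":
            add_edge(here, "exit")             # return -> exit
        elif kind in ("cond", "fall"):
            add_edge(here, following)          # fall-through edge
    return succ


# The six blocks of the running example, in textual order.
FIB_BLOCKS: List[Block] = [
    (None, "cond", "L3"),     # instructions 1-4
    (None, "fall", None),     # instruction 5
    ("L1", "cond", "L2"),     # instruction 6
    (None, "return", None),   # instruction 7
    ("L2", "goto", "L1"),     # instructions 8-12
    ("L3", "return", None),   # instruction 13
]

if __name__ == "__main__":
    for node, targets in build_cfg(FIB_BLOCKS).items():
        print(node, "->", targets)
```

With this numbering, the only edge that goes backwards in textual order is B5 → B3, which closes the `for` loop.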

[Figure: the MIR of the running example partitioned into basic blocks B1 to B6, followed by the resulting control flow graph with entry and exit nodes.]

How to treat call instructions?

- A call is an atomic instruction
- A call ends a basic block
- Replace the call by the procedure body (inline)
- A call is a “goto” into the procedure
- A call is handled in a special way

Potential Difficulties

- Gotos outside procedure boundaries
- Exit/trap calls
- Exception handling
- Computed gotos
- setjmp() and longjmp() calls

Approaches for Data Flow Analysis

- Iterative
  - Compute natural loops and iterate on the CFG
- Interval based
  - Reduce the CFG to a single node
  - Inductively define the data flow solution
- Structural
  - Identify control flow structures in the CFG

Identifying Natural Loops

- A basic block m dominates a basic block n if every path from entry to n includes m
- The domination relation is reflexive, transitive, and anti-symmetric, so it can be represented as a tree
- A back edge is an edge m → n such that n dominates m
- The natural loop of a back edge m → n contains n and the blocks on the paths from n to m, i.e. every block that can reach m without passing through n

[Figure: control flow graph of the running example with nodes entry, B0, B1, B2, B3, B4, B5, B6, B7 and exit.]

Reducible Flow Graphs

- All the loops are natural
- Can be “reduced” into a single node via a sequence of special transformations, for example the T1 and T2 transformations
- Every loop has a single entry
- Result from “well structured” programs
- Most programs compile into reducible flow graphs

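T1/T2 Transformations

- T1: remove a self-loop, i.e. an edge n → n
- T2: merge a node that has a unique predecessor into that predecessor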
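A reducibility test based on T1 and T2 can be sketched as below. It assumes the standard definitions (T1 deletes a self-loop, T2 merges a node with a unique predecessor into that predecessor) and that every node is reachable from the entry. The function name and graph encoding are hypothetical, and the quadratic predecessor search is chosen for brevity, not efficiency.

```python
from typing import Dict, Hashable, Iterable, Set


def is_reducible(succ: Dict[Hashable, Iterable[Hashable]],
                 entry: Hashable) -> bool:
    """Apply T1 and T2 until neither applies; reducible iff one node remains."""
    nodes = set(succ) | {s for ss in succ.values() for s in ss} | {entry}
    out: Dict[Hashable, Set[Hashable]] = {
        n: set(succ.get(n, ())) for n in nodes
    }
    changed = True
    while changed:
        changed = False
        # T1: remove self-loops.
        for n in out:
            if n in out[n]:
                out[n].discard(n)
                changed = True
        # T2: merge one node with a unique predecessor into that predecessor.
        for n in list(out):
            if n == entry:
                continue
            preds = [p for p in out if n in out[p]]
            if len(preds) == 1:
                p = preds[0]
                out[p].discard(n)
                out[p] |= out.pop(n)    # p inherits the successors of n
                changed = True
                break                   # predecessor lists changed; rescan
    return len(out) == 1


# CFG of the running example (blocks numbered B1..B6 in textual order).
FIB_CFG = {
    "entry": ["B1"], "B1": ["B2", "B6"], "B2": ["B3"], "B3": ["B4", "B5"],
    "B4": ["exit"], "B5": ["B3"], "B6": ["exit"], "exit": [],
}

# A loop with two entries (a and b can both be entered from outside).
TWO_ENTRY_LOOP = {"entry": ["a", "b"], "a": ["b"], "b": ["a"]}

if __name__ == "__main__":
    print(is_reducible(FIB_CFG, "entry"))
    print(is_reducible(TWO_ENTRY_LOOP, "entry"))
```

The second graph is the usual example of irreducibility: neither `a` nor `b` has a unique predecessor, so T2 never applies and the graph cannot collapse to a single node.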

Node Splitting

[Figure: a flow graph over B1, B2, B3, B4, B5 before splitting, and the same graph after splitting, where node B3 has a copy B3a.]

Why can’t we construct loops from source?

- Language dependent
- Non-uniform
- Source-to-source transformations
- Most programming languages support “wild” GOTOs

Depth-first spanning tree

- Input: a flow graph G = (N, E, r)
- Output: a depth-first spanning tree (N, T)

Method:

    T := Ø;
    for each node n in N do mark n unvisited;
    call DFS(r)

Using:

    procedure DFS(n) is
        mark n visited;
        for each n → s in E do
            if s is not visited then
                add the edge n → s to T;
                call DFS(s)

Better DFS Implementations

- Explicit stack instead of recursion
- Pointer reversal

Pre-ordering

- Input: a flow graph G = (N, E, r)
- Output: a depth-first spanning tree (N, T) and an ordering Pre of N

Method:

    T := Ø;
    for each node n in N do mark n unvisited;
    i := 1;
    call DFS(r)

Using:

    procedure DFS(n) is
        mark n visited;
        Pre(n) := i; i := i + 1;
        for each n → s in E do
            if s is not visited then
                add the edge n → s to T;
                call DFS(s);
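The pre-ordering procedure can be written with an explicit stack instead of recursion, as suggested under "Better DFS Implementations". The following is a sketch under assumptions: the function name `dfs_preorder` and the dictionary-of-successor-lists encoding are hypothetical, and blocks are named B1..B6 in textual order.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def dfs_preorder(succ: Dict[Hashable, Iterable[Hashable]], root: Hashable
                 ) -> Tuple[List[Tuple[Hashable, Hashable]], Dict[Hashable, int]]:
    """Return the tree edges T and the pre-order numbering Pre (from 1)."""
    pre: Dict[Hashable, int] = {root: 1}   # a node is visited once numbered
    tree: List[Tuple[Hashable, Hashable]] = []
    counter = 1
    # Each stack entry pairs a node with an iterator over its successors, so
    # the traversal resumes where it left off, exactly like the recursion.
    stack = [(root, iter(succ.get(root, ())))]
    while stack:
        node, successors = stack[-1]
        for s in successors:
            if s not in pre:
                counter += 1
                pre[s] = counter
                tree.append((node, s))     # n -> s becomes a tree edge
                stack.append((s, iter(succ.get(s, ()))))
                break
        else:
            stack.pop()                    # all successors handled
    return tree, pre


# CFG of the running example (blocks numbered B1..B6 in textual order).
FIB_CFG = {
    "entry": ["B1"], "B1": ["B2", "B6"], "B2": ["B3"], "B3": ["B4", "B5"],
    "B4": ["exit"], "B5": ["B3"], "B6": ["exit"], "exit": [],
}

if __name__ == "__main__":
    edges, numbering = dfs_preorder(FIB_CFG, "entry")
    print(numbering)
    print(edges)
```

Edges that are not in T, such as B5 → B3 here, are the ones the traversal found leading to an already visited node.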

Computing dominators

- Input: a flow graph G = (N, E, r)
- Output: for each node n, a set DOM(n) of dominators

Method:

    DOM(r) := { r };
    for each n in N \ { r } do DOM(n) := N;
    while changes in some DOM(n) do
        for each n in N \ { r } do
            DOM(n) := { n } ∪ ⋂ { DOM(p) | p → n is in E }
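The iterative dominator computation translates almost directly into Python sets. This is a minimal sketch assuming every node is reachable from the root. The name `dominators` and the graph encoding are hypothetical, and no attempt is made to pick a good iteration order.

```python
from typing import Dict, Hashable, Iterable, Set


def dominators(succ: Dict[Hashable, Iterable[Hashable]],
               root: Hashable) -> Dict[Hashable, Set[Hashable]]:
    """Return DOM(n) for every node n, by iterating to a fixed point."""
    nodes = set(succ) | {s for ss in succ.values() for s in ss} | {root}
    preds: Dict[Hashable, Set[Hashable]] = {n: set() for n in nodes}
    for n, targets in succ.items():
        for s in targets:
            preds[s].add(n)

    dom = {n: set(nodes) for n in nodes}   # DOM(n) := N
    dom[root] = {root}                     # DOM(r) := { r }
    changed = True
    while changed:
        changed = False
        for n in nodes - {root}:
            if not preds[n]:
                continue                   # unreachable node: left as N
            # DOM(n) := { n } union the intersection over all predecessors.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# CFG of the running example (blocks numbered B1..B6 in textual order).
FIB_CFG = {
    "entry": ["B1"], "B1": ["B2", "B6"], "B2": ["B3"], "B3": ["B4", "B5"],
    "B4": ["exit"], "B5": ["B3"], "B6": ["exit"], "exit": [],
}

if __name__ == "__main__":
    for node, doms in dominators(FIB_CFG, "entry").items():
        print(node, sorted(doms))
```

In this graph B3 dominates B5, so the edge B5 → B3 is a back edge in the sense defined under "Identifying Natural Loops".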


Other Algorithms for Finding Dominators

- Lengauer & Tarjan: O(e log n) algorithm
- Harel: linear time algorithm
- Thorup: linear time algorithm
- Alstrup & Lauridsen: incremental algorithm

Computing natural loops

- Input: a flow graph G = (N, E, r) and a back edge m → n
- Output: a set, loop, of the nodes in the natural loop of m → n

Method:

    stack := empty;
    loop := {n};
    call add(m);
    while stack is not empty do
        pop d from the stack;
        for each p with p → d in E do call add(p)

Using:

    procedure add(p) is
        if p is not in loop then
            loop := loop ∪ {p};
            push p on the stack
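The natural-loop procedure is a backward traversal from m that stops at the header n. Below is a small sketch. The function name `natural_loop` and the graph encoding are hypothetical, and the caller is assumed to pass a genuine back edge (n dominates m).

```python
from typing import Dict, Hashable, Iterable, List, Set


def natural_loop(succ: Dict[Hashable, Iterable[Hashable]],
                 m: Hashable, n: Hashable) -> Set[Hashable]:
    """Return the nodes of the natural loop of the back edge m -> n."""
    preds: Dict[Hashable, List[Hashable]] = {}
    for src, targets in succ.items():
        for dst in targets:
            preds.setdefault(dst, []).append(src)

    loop: Set[Hashable] = {n}      # the header is in the loop from the start
    stack: List[Hashable] = []

    def add(p: Hashable) -> None:
        if p not in loop:
            loop.add(p)
            stack.append(p)

    add(m)
    while stack:
        d = stack.pop()
        for p in preds.get(d, ()):
            add(p)                 # walk backwards; n is never expanded
    return loop


# CFG of the running example (blocks numbered B1..B6 in textual order).
FIB_CFG = {
    "entry": ["B1"], "B1": ["B2", "B6"], "B2": ["B3"], "B3": ["B4", "B5"],
    "B4": ["exit"], "B5": ["B3"], "B6": ["exit"], "exit": [],
}

if __name__ == "__main__":
    print(sorted(natural_loop(FIB_CFG, "B5", "B3")))
```

Because the header is placed in `loop` before the traversal starts, its predecessors outside the loop are never visited, which is what keeps the result confined to the loop body.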

Issues

- Natural loops with different headers are either disjoint or nested within each other
- But what about loops that share a header?

Two Loops with the same header

First example:

    i = 1;
    B1: if (i >= 100) goto B4;
        else if ((i % 10) == 0) goto B3;
        else
    B2: ....
        i++;
        goto B1;
    B3: ....
        i++;
        goto B1;
    B4: ...

Second example:

    B1: if (i < j) goto B2;
        else if (i > j) goto B3;
        else goto B4;
    B2: ....
        i++;
        goto B1;
    B3: ....
        i++;
        goto B1;
    B4: ...

Strongly connected components

- Input: a flow graph G = (N, E, r)
- Output: a set of strongly connected components

Method:

    for all n in N do mark n unvisited;
    i := 1;
    stack := empty;
    while there exists an unvisited node n do
        call SCC(n)

Using:

    procedure SCC(n) is
        mark n visited;
        Pre(n) := i; Low(n) := i;      (Low: lowest number for a node in the SCC)
        i := i + 1;
        push n on the stack;
        for each n → s in E do
            if s is not visited then
                call SCC(s);
                Low(n) := min(Low(n), Low(s))
            else if Pre(s) < Pre(n) and s is on the stack then    (back or cross edge)
                Low(n) := min(Low(n), Pre(s));
        if Low(n) = Pre(n) then        (n is the root of an SCC)
            SCC := Ø;
            repeat
                pop d off the stack;
                SCC := SCC ∪ {d}
            until d = n;
            return SCC
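The SCC procedure above follows Tarjan's scheme, and a direct Python transcription might look like the sketch below. The function name `tarjan_scc` and the graph encoding are hypothetical. The recursion mirrors the pseudocode, so very deep graphs could hit Python's recursion limit.

```python
from typing import Dict, Hashable, Iterable, List, Set


def tarjan_scc(succ: Dict[Hashable, Iterable[Hashable]]) -> List[Set[Hashable]]:
    """Return the strongly connected components of the graph."""
    # Collect every node once, keeping a stable order.
    nodes = list(dict.fromkeys(
        list(succ) + [s for ss in succ.values() for s in ss]))
    pre: Dict[Hashable, int] = {}      # Pre(n); also serves as "visited"
    low: Dict[Hashable, int] = {}      # Low(n)
    stack: List[Hashable] = []
    on_stack: Set[Hashable] = set()
    components: List[Set[Hashable]] = []
    counter = 0

    def visit(n: Hashable) -> None:
        nonlocal counter
        counter += 1
        pre[n] = low[n] = counter
        stack.append(n)
        on_stack.add(n)
        for s in succ.get(n, ()):
            if s not in pre:
                visit(s)
                low[n] = min(low[n], low[s])
            elif pre[s] < pre[n] and s in on_stack:   # back or cross edge
                low[n] = min(low[n], pre[s])
        if low[n] == pre[n]:                          # n is the root of an SCC
            component: Set[Hashable] = set()
            while True:
                d = stack.pop()
                on_stack.discard(d)
                component.add(d)
                if d == n:
                    break
            components.append(component)

    for n in nodes:
        if n not in pre:
            visit(n)
    return components


# CFG of the running example (blocks numbered B1..B6 in textual order).
FIB_CFG = {
    "entry": ["B1"], "B1": ["B2", "B6"], "B2": ["B3"], "B3": ["B4", "B5"],
    "B4": ["exit"], "B5": ["B3"], "B6": ["exit"], "exit": [],
}

if __name__ == "__main__":
    for component in tarjan_scc(FIB_CFG):
        print(sorted(component))
```

On the running example the only non-trivial component is the loop {B3, B5}, and every other node forms a component of its own.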

Structural Analysis

- Identify “common” structures in the control flow graph (even in irreducible graphs)
- Reduce the CFG into “simple regions”
- Shift some dataflow analysis from compile time to compiler-generation time
- Can be efficiently implemented via DFS

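[Figures: region schemas used by structural analysis.

- Block Schema: a sequence B1, B2, ..., Bn
- Conditionals: three diagrams, over B1, B2; over B1, B2, B3; and over B0, B1, B2, ..., Bn
- Loops: three diagrams, over B1, B2; over B1, B2; and over B1, B2, B3]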
