# A Polynomial-Time Algorithm for Global Value Numbering (SAS 2004), Sumit Gulwani and George C. Necula

1 Global Value Numbering

Goal: Discover equivalent expressions in procedures.

Applications:
- Compiler optimizations: copy propagation, constant propagation, common subexpression elimination, induction variable elimination, etc.
- Program verification: discover loop invariants, verify program assertions.
- Discover equivalent computations across programs: plagiarism detection tools, translation validation.

2 Global Value Numbering

Program (a flowchart containing the conditional `b == 3` with True and False branches): x := b × a; y := a × 3; c := a × b; if (b == 3); z := a × b;

The equivalence problem is undecidable. Simplifying assumptions:
- Operators are uninterpreted (will not discover x = c).
- Conditionals are non-deterministic (will not discover y = c).

Will discover z = c.

3 Non-trivial Example

A non-deterministic branch (*) selects one of two blocks:
- x := a; y := a; z := F(a);
- x := b; y := b; z := F(b);

After the branches join: assert(x = y); assert(z = F(y));

4 Existing Algorithms

Algorithms that work on the SSA form of the program:
- Alpern, Wegman, Zadeck (AWZ) algorithm, POPL 1988: polynomial, incomplete.
- Rüthing, Knoop, Steffen (RKS) algorithm, SAS 1999: polynomial, incomplete, improvement on AWZ.

Dataflow analysis or abstract interpretation based:
- Kildall's algorithm, POPL 1973: exponential, complete.
- Our algorithm, POPL 2004: polynomial, complete, randomized.
- Our algorithm, this paper: polynomial, complete.

5 Why are SSA-based algorithms incomplete?

The example program (non-deterministic branch * between x := a; y := a; z := F(a); and x := b; y := b; z := F(b); followed by assert(x = y); assert(z = F(y));) in SSA form: $x = \phi(a,b)$, $y = \phi(a,b)$, $z = \phi(F(a),F(b))$, and $F(y) = F(\phi(a,b))$.

AWZ algorithm: $\phi$ functions are uninterpreted.
- Fails to discover the second assertion.

RKS algorithm: uses rewrite rules for normalization.
- Does not discover all assertions in slightly more involved examples.
- Rewrite rules are not applied exhaustively (exponentially many applications otherwise).
- Rules are pessimistic in handling loops.

6 Abstract Interpretation based algorithm

- Assignment node (x := e, input $G'$): $G = SP(G', x := e)$
- Conditional node (non-deterministic *, input $G'$): $G_1 = G'$, $G_2 = G'$
- Join node (inputs $G_1'$ and $G_2'$): $G = Join(G_1', G_2')$

7 Outline Strong equivalence DAG (SED) The join operation: Idea #1 Pruning an SED: Idea #2 The strongest postcondition operation Fixed point computation

8 Representing Equivalences a := 1; b := 2; x := F(1,2); { a,1 } { b,2 } { x, F(1,2) }

9 Representing Equivalences a := 1; b := 2; x := F(1,2); { a,1 } { b,2 } { x, F(1,2), F(a,2), F(1,b), F(a,b) } Such an explicit representation can be exponential.
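The blow-up of the explicit representation can be made concrete with a small sketch. The helper names `explicit_class` and `nested_size` are illustrative and not from the talk, and the chain `x_{i+1} := F(x_i, x_i)` used by `nested_size` is a hypothetical program chosen only to show how quickly the class of a single variable grows.

```python
from itertools import product


def explicit_class(arg_classes, fn="F"):
    """All terms fn(e1, ..., en) with each e_i drawn from its equivalence class."""
    return {f"{fn}({','.join(args)})" for args in product(*arg_classes)}


def nested_size(depth):
    """Size of the explicit class of x_depth, assuming x_0 has two equal
    names (for example {a, 1}) and x_{i+1} := F(x_i, x_i)."""
    size = 2
    for _ in range(depth):
        # every pair of argument terms, plus the new variable itself
        size = size * size + 1
    return size


# Slide example: a = 1 and b = 2, so F(1,2) has four equivalent spellings.
terms = explicit_class([["a", "1"], ["b", "2"]])
print(sorted(terms))
print([nested_size(d) for d in range(4)])
```

Each extra level of nesting multiplies the choices for both arguments, which is why the talk moves to a shared DAG representation instead of listing terms.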

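10 Strong Equivalence DAG (SED)

A data structure for representing equivalences.

Nodes $n = \langle V, t \rangle$, where $V$ is a set of variables and the type $t$ is one of $c$, $\bot$, $F(n_1, n_2)$.

Terms(n): set of equivalent expressions
- $Terms(\langle V, \bot \rangle) = V$
- $Terms(\langle V, c \rangle) = V \cup \{ c \}$
- $Terms(\langle V, F(n_1, n_2) \rangle) = V \cup \{ F(e_1, e_2) \mid e_1 \in Terms(n_1),\ e_2 \in Terms(n_2) \}$

$\forall$ variables $x$, $\exists$ at most one node $\langle V, t \rangle$ s.t. $x \in V$, called Node(x).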

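11 SED: Example

[Figure: SED with nodes $n_1$ (labeled a, type 2), $n_2$ (labeled b, type $\bot$), $n_3$ (labeled c, d, type F), and $n_4$ (labeled e, type F).]

This SED represents the following partition:
- $Terms(n_1) = \{ a, 2 \}$
- $Terms(n_2) = \{ b \}$
- $Terms(n_3) = \{ c, d, F(a,b), F(2,b) \}$
- $Terms(n_4) = \{ e, F(c,b), F(d,b), F(F(a,b),b), F(F(2,b),b) \}$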
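The SED definition and the example partition can be checked with a short sketch. The `Node` class and the `terms` function are illustrative names, not code from the paper, and the edges of $n_3$ and $n_4$ are inferred from the listed Terms sets ($n_3$ points to $n_1$ and $n_2$; $n_4$ points to $n_3$ and $n_2$).

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    labels: frozenset                                # V: variables labelling the node
    const: Optional[str] = None                      # set when the type is a constant c
    kids: Optional[Tuple["Node", "Node"]] = None     # set when the type is F(n1, n2)
    # const and kids both None means the type is bottom


def terms(n):
    """Expand Terms(n) explicitly; terminates because an SED is acyclic."""
    out = set(n.labels)
    if n.const is not None:
        out.add(n.const)
    if n.kids is not None:
        left, right = n.kids
        out |= {f"F({l},{r})" for l, r in product(terms(left), terms(right))}
    return out


# The four nodes of the example SED (edges inferred as described above).
n1 = Node(frozenset({"a"}), const="2")
n2 = Node(frozenset({"b"}))
n3 = Node(frozenset({"c", "d"}), kids=(n1, n2))
n4 = Node(frozenset({"e"}), kids=(n3, n2))

print(sorted(terms(n4)))
```

Four small nodes stand for five terms in `terms(n4)`; the explicit expansion is only used here to confirm the partition, and an analysis would work on the nodes directly.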

12 Outline Strong equivalence DAG (SED) The join operation: Idea #1 Pruning an SED: Idea #2 The strongest postcondition operation Fixed point computation

13 The Join Operation

$G = Join(G_1, G_2)$

$G$ is obtained by product construction of $G_1$ and $G_2$: if $n = \langle V_1, t_1 \rangle \in G_1$ and $m = \langle V_2, t_2 \rangle \in G_2$, then $[n, m] = \langle V_1 \cap V_2, t_1 \sqcup t_2 \rangle \in G$.

Definition of $t_1 \sqcup t_2$:
- $c \sqcup c = c$
- $F(l_1, r_1) \sqcup F(l_2, r_2) = F([l_1, l_2], [r_1, r_2])$
- $t_1 \sqcup t_2 = \bot$, otherwise

Proof of correctness: $Terms([n, m]) = Terms(n) \cap Terms(m)$ (thus product construction = partition intersection).
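A minimal sketch of the product construction follows. The dictionary encoding of an SED (node id mapped to a pair of variable set and type tuple) and the function name `join` are assumptions made for illustration. As a simplification, only pairs reachable from a variable present in both inputs are built. The inputs are the two branch SEDs from the worked example on slide 24.

```python
def join(g1, g2):
    """g1, g2: dict node_id -> (frozenset_of_vars, type), where type is
    ("bot",), ("const", c) or ("F", left_id, right_id).
    Returns the product SED, keyed by (node_id_in_g1, node_id_in_g2)."""
    out = {}

    def pair(n, m):
        key = (n, m)
        if key in out:
            return key
        (v1, t1), (v2, t2) = g1[n], g2[m]
        if t1[0] == "const" and t1 == t2:
            t = t1                                   # c join c = c
        elif t1[0] == "F" and t2[0] == "F":
            # F(l1, r1) join F(l2, r2) = F([l1, l2], [r1, r2])
            t = ("F", pair(t1[1], t2[1]), pair(t1[2], t2[2]))
        else:
            t = ("bot",)                             # otherwise bottom
        out[key] = (v1 & v2, t)                      # variable sets intersect
        return key

    node_of_1 = {x: n for n, (vs, _) in g1.items() for x in vs}
    node_of_2 = {x: m for m, (vs, _) in g2.items() for x in vs}
    for x in node_of_1.keys() & node_of_2.keys():
        pair(node_of_1[x], node_of_2[x])
    return out


# Branch SEDs: x = y = 1, z = F(1,1) versus x = y = 2, z = F(2,2).
g1 = {"n_xy": (frozenset({"x", "y"}), ("const", "1")),
      "n_z": (frozenset({"z"}), ("F", "n_xy", "n_xy"))}
g2 = {"n_xy": (frozenset({"x", "y"}), ("const", "2")),
      "n_z": (frozenset({"z"}), ("F", "n_xy", "n_xy"))}
g = join(g1, g2)
```

In the result, the node for x and y has type bottom because the constants differ, while z keeps an F type whose children are that joined node, so z = F(x,y) survives the join.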

14 Example: The Join Operation

[Figure: two SEDs $G_1$ and $G_2$ over the variables $y_1, \ldots, y_7$, and their join $G = Join(G_1, G_2)$.]

15 Outline Strong equivalence DAG (SED) The join operation: Idea #1 Pruning an SED: Idea #2 The strongest postcondition operation Fixed point computation

16 Motivation: The Prune Operation

- If $G = Join(G_1, G_2)$, then $Size(G)$ can be $Size(G_1) \times Size(G_2)$.
- There are programs where the size of SEDs after $n$ joins is exponential in $n$.
- Discovering equivalences among all expressions vs. discovering equivalences among program expressions.
- For the latter, it is sufficient to discover equivalences among all terms of size at most $t$ at each program point (where $t$ = #variables × size of program).
- Thus, SEDs can be pruned to have a small size.

17 The Prune Operation

Prune(G, k): For each node, check if $x \in V$ is equal to some $F$-term of size less than $k$. If not, then delete all the nodes that are reachable from only

18 Example: The Prune Operation

[Figure: an SED $G$ over the variables $y_1, \ldots, y_7$ and the smaller SED Prune(G, 2).]

19 Outline Strong equivalence DAG (SED) The join operation: Idea #1 Pruning an SED: Idea #2 The strongest postcondition operation Fixed point computation

20 The Strongest Postcondition Operation

$G = SP(G', x := e)$

To obtain $G$ from $G'$, do:
- Delete label $x$ from Node(x) in $G'$.
- Let $n = \langle V, t \rangle$ be the node in $G'$ s.t. $e \in Terms(n)$ (add such a node to $G'$ if it does not already exist).
- Add $x$ to $V$.
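The three steps can be sketched as follows. The dictionary encoding of an SED, the expression encoding (`("var", x)`, `("const", c)`, `("F", e1, e2)`), and the names `sp`, `add`, and `node_for` are assumptions for illustration. The sketch resolves `e` against the input SED before moving the label `x`, so that an assignment whose right-hand side mentions `x` still refers to the old value. The input is $G_3$ from the worked example on slide 24.

```python
def sp(g, x, e):
    """g: dict node_id -> (frozenset_of_vars, type), where type is
    ("bot",), ("const", c) or ("F", left_id, right_id).
    Returns a new SED for the assignment x := e; g itself is not modified."""
    g = dict(g)

    def add(label_vars, typ):
        nid = f"new{len(g)}"      # assumes input ids never start with "new"
        g[nid] = (frozenset(label_vars), typ)
        return nid

    def node_for(expr):
        """Find the node n with expr in Terms(n), adding one if needed."""
        if expr[0] == "var":
            for nid, (vs, _) in g.items():
                if expr[1] in vs:
                    return nid
            return add({expr[1]}, ("bot",))
        if expr[0] == "const":
            typ = ("const", expr[1])
        else:
            typ = ("F", node_for(expr[1]), node_for(expr[2]))
        for nid, (_, t) in g.items():
            if t == typ:
                return nid
        return add(set(), typ)

    target = node_for(e)                      # node n with e in Terms(n)
    for nid, (vs, t) in list(g.items()):      # delete label x from Node(x)
        if x in vs:
            g[nid] = (vs - {x}, t)
    vs, t = g[target]
    g[target] = (vs | {x}, t)                 # add x to V
    return g


# G3 from the example: x and y share a bottom node, z = F(x, y).
g3 = {"n_xy": (frozenset({"x", "y"}), ("bot",)),
      "n_z": (frozenset({"z"}), ("F", "n_xy", "n_xy"))}
g4 = sp(g3, "u", ("F", ("var", "x"), ("var", "y")))
```

Because F(x,y) already has a node in `g3`, no node is added and `u` simply joins `z`, which is what lets the final assertion u = z be verified.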

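21 Example: The Strongest Postcondition Operation

[Figure: SED $G'$ with a node labeled z, u (type F) and a node labeled x (type $\bot$); in $G = SP(G', u := F(z,x))$ the nodes are labeled z (type F), x (type $\bot$), and u (type F).]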

22 Outline Strong equivalence DAG (SED) The join operation: Idea #1 Pruning an SED: Idea #2 The strongest postcondition operation Fixed point computation

23 Fixed Point Computation and Complexity

The lattice of sets of equivalences (among uninterpreted function terms) has height at most $k$.

Complexity:
- Dominated by the cost of join operations.
- Number of join operations: $O(j \times k)$
- Each join operation: $O(k^2 \times N)$ (this requires doing pruning while computing the join).
- Total cost: $O(k^3 \times N \times j)$

Here $k$ is the number of variables, $N$ is the size of the program, and $j$ is the number of join points in the program.
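The total is the product of the two bounds above. One plausible reading of the join count, stated here as an assumption, is that each of the $j$ join points is revisited at most $k$ times because the lattice has height at most $k$:

$$
\underbrace{O(j \times k)}_{\text{number of joins}} \cdot \underbrace{O(k^2 \times N)}_{\text{cost per join}} = O(k^3 \times N \times j)
$$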

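24 Example

Program (program points $L_1$ to $L_4$): a non-deterministic branch runs either x := 1; y := 1; z := F(1,1); or x := 2; y := 2; z := F(2,2); and after the join the program continues with u := F(x,y); Assert(u = z);

SEDs:
- $G_1$: $\langle \{z\}, F \rangle$, $\langle \{x, y\}, 1 \rangle$
- $G_2$: $\langle \{z\}, F \rangle$, $\langle \{x, y\}, 2 \rangle$
- $G_3 = Join(G_1, G_2)$: $\langle \{z\}, F \rangle$, $\langle \{x, y\}, \bot \rangle$
- $G_4 = Assignment(G_3, u := F(x,y))$: $\langle \{u, z\}, F \rangle$, $\langle \{x, y\}, \bot \rangle$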

25 Conclusion

- Idea #1: Join of 2 SEDs = product construction.
- Idea #2: Prune SEDs (discovering equivalences among program expressions does not require computing equivalences involving large terms).

Future work:
- Inter-procedural value numbering.
- Abstract interpretation for the combined theory of linear arithmetic and uninterpreted functions.
