# Optimization (Chuen-Liang Chen, Department of Computer Science and Information Engineering, National Taiwan University)


© Chuen-Liang Chen, NTUCS&IE

OPTIMIZATION

Chuen-Liang Chen, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, TAIWAN

## Introduction

- local optimization
  - within a basic block
  - may accompany code generation
  - e.g., peephole optimization
- global optimization
  - spans more than one basic block
  - e.g., loop optimization
- data flow analysis (a technique)

## Peephole optimization (1/2)

- modify a particular pattern within a small window (peephole; 2-3 instructions)
- may be applied to intermediate or target code
- constant folding (evaluate constant expressions in advance)
  - `( +, Lit1, Lit2, Result )` ⇒ `( :=, Lit1+Lit2, Result )`
  - `( :=, Lit1, Result1 ), ( +, Lit2, Result1, Result2 )` ⇒ `( :=, Lit1, Result1 ), ( :=, Lit1+Lit2, Result2 )`
- strength reduction (replace slow operations with faster equivalents)
  - `( *, Operand, 2, Result )` ⇒ `( ShiftLeft, Operand, 1, Result )`
  - `( *, Operand, 4, Result )` ⇒ `( ShiftLeft, Operand, 2, Result )`
- null sequences (delete useless operations)
  - `( +, Operand, 0, Result )` ⇒ `( :=, Operand, Result )`
  - `( *, Operand, 1, Result )` ⇒ `( :=, Operand, Result )`
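The rewrite rules on this slide can be sketched as a one-instruction peephole pass over quadruples. This is an illustration under stated assumptions, not the author's implementation: the function name `peephole`, the tuple encoding (integers are literals, strings are operand names), and the extension of constant folding to `*` are all hypothetical choices. The two-instruction folding pattern is not covered.

```python
def peephole(quads):
    """Apply single-instruction peephole rules to a list of quadruples."""
    out = []
    for q in quads:
        op = q[0]
        if op in ("+", "*") and len(q) == 4:
            a, b, res = q[1], q[2], q[3]
            if isinstance(a, int) and isinstance(b, int):
                # Constant folding: both operands are literals.
                q = (":=", a + b if op == "+" else a * b, res)
            elif op == "+" and b == 0:
                # Null sequence: adding zero is a plain copy.
                q = (":=", a, res)
            elif op == "*" and b == 1:
                # Null sequence: multiplying by one is a plain copy.
                q = (":=", a, res)
            elif op == "*" and isinstance(b, int) and b > 1 and (b & (b - 1)) == 0:
                # Strength reduction: multiply by 2**n becomes a left shift by n.
                q = ("ShiftLeft", a, b.bit_length() - 1, res)
        out.append(q)
    return out


if __name__ == "__main__":
    for quad in peephole([("+", 2, 3, "R"), ("*", "X", 4, "R"), ("+", "X", 0, "R")]):
        print(quad)
```

Testing the literal case first means a quadruple such as `( *, 2, 4, R )` is folded rather than turned into a shift.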

## Peephole optimization (2/2)

- combine operations (replace several operations with one equivalent)
  - `Load A, Rj; Load A+1, Rj+1` ⇒ `DoubleLoad A, Rj`
  - `BranchZero L1, R1; Branch L2; L1:` ⇒ `BranchNotZero L2, R1`
  - `Subtract #1, R1; BranchZero L1, R1` ⇒ `SubtractOneBranch L1, R1`
- algebraic laws (use algebraic laws to simplify or reorder instructions)
  - `( +, Lit, Operand, Result )` ⇒ `( +, Operand, Lit, Result )`
  - `( -, 0, Operand, Result )` ⇒ `( Negate, Operand, Result )`
- special case instructions (use instructions designed for special operand cases)
  - `Subtract #1, R1` ⇒ `Decrement R1`
  - `Add #1, R1` ⇒ `Increment R1`
  - `Load #0, R1; Store A, R1` ⇒ `Clear A`
- address mode operations (use address modes to simplify code)
  - `Load A, R1; Add 0(R1), R2` ⇒ `Add @A, R2`
  - `Subtract #2, R1; Clear 0(R1)` ⇒ `Clear -(R1)`
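A two-instruction window is enough to sketch some of the target-code rules on this slide. The function name `peephole_target` and the tuple encoding of instructions are assumptions made for illustration; only the `SubtractOneBranch`, `Decrement`, and `Increment` rules are covered.

```python
def peephole_target(code):
    """Rewrite target instructions using a window of at most two instructions."""
    out = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (nxt is not None and cur[0] == "Subtract" and cur[1] == "#1"
                and nxt[0] == "BranchZero" and nxt[2] == cur[2]):
            # Combine operations: decrement and branch on the same register.
            out.append(("SubtractOneBranch", nxt[1], cur[2]))
            i += 2
        elif cur[0] == "Subtract" and cur[1] == "#1":
            # Special case instruction.
            out.append(("Decrement", cur[2]))
            i += 1
        elif cur[0] == "Add" and cur[1] == "#1":
            # Special case instruction.
            out.append(("Increment", cur[2]))
            i += 1
        else:
            out.append(cur)
            i += 1
    return out


if __name__ == "__main__":
    program = [("Subtract", "#1", "R1"), ("BranchZero", "L1", "R1"), ("Add", "#1", "R2")]
    for instruction in peephole_target(program):
        print(instruction)
```

The two-instruction pattern is tested before the single-instruction ones; otherwise `Subtract #1, R1` would be turned into `Decrement R1` and the combined instruction would never match.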

## Loop optimization (1/6)

Motivated by the 90/10 rule: roughly 90% of execution time is spent in roughly 10% of the code.

Example:

    for I in 1..100 loop
      for J in 1..100 loop
        for K in 1..100 loop
          A(I)(J)(K) := ( I * J ) * K;
        end loop;
      end loop;
    end loop;

After loop invariant expression factorization:

    for I in 1..100 loop
      for J in 1..100 loop
        T1 := Adr( A(I)(J) );
        T2 := I * J;
        for K in 1..100 loop
          T1(K) := T2 * K;
        end loop;
      end loop;
    end loop;

After loop invariant expression factorization, applied once more:

    for I in 1..100 loop
      T3 := Adr( A(I) );
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := I * J;
        for K in 1..100 loop
          T1(K) := T2 * K;
        end loop;
      end loop;
    end loop;

## Loop optimization (2/6)

Before:

    for I in 1..100 loop
      T3 := Adr( A(I) );
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := I * J;
        for K in 1..100 loop
          T1(K) := T2 * K;
        end loop;
      end loop;
    end loop;

After induction variable elimination:

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;                -- Initial value of I*J
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := T4;             -- T4 holds I*J
        T5 := T2;             -- Initial value of T2*K
        for K in 1..100 loop
          T1(K) := T5;        -- T5 holds T2*K = I*J*K
          T5 := T5 + T2;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;
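The induction variable elimination on this slide replaces both multiplications with running sums. A small sketch can check that the two forms compute the same values. The function names and the use of a dictionary in place of the array `A` are assumptions for illustration, and the loop bound is a parameter `n` rather than the fixed 100 of the slide.

```python
def original(n):
    """Direct form: A(I)(J)(K) := (I * J) * K."""
    a = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                a[i, j, k] = (i * j) * k
    return a


def strength_reduced(n):
    """Same loop nest after induction variable elimination (T2, T4, T5 as on the slide)."""
    a = {}
    for i in range(1, n + 1):
        t4 = i                  # initial value of I*J
        for j in range(1, n + 1):
            t2 = t4             # T4 holds I*J
            t5 = t2             # initial value of T2*K
            for k in range(1, n + 1):
                a[i, j, k] = t5  # T5 holds T2*K = I*J*K
                t5 += t2
            t4 += i
    return a


if __name__ == "__main__":
    print(original(10) == strength_reduced(10))
```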

## Loop optimization (3/6)

Before:

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T2 := T4;
        T5 := T2;
        for K in 1..100 loop
          T1(K) := T5;
          T5 := T5 + T2;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

After copy propagation (T2 is replaced by T4):

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;                -- Initial value of I*J
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T5 := T4;             -- Initial value of T4*K
        for K in 1..100 loop
          T1(K) := T5;        -- T5 holds T4*K = I*J*K
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

## Loop optimization (4/6)

Before:

    for I in 1..100 loop
      T3 := Adr( A(I) );
      T4 := I;
      for J in 1..100 loop
        T1 := Adr( T3(J) );
        T5 := T4;
        for K in 1..100 loop
          T1(K) := T5;
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

After subscripting code expansion (A0 is the address of the first element of A; ↑ denotes a store through an address):

    for I in 1..100 loop
      T3 := A0 + ( 10000 * I ) - 10000;
      T4 := I;                            -- Initial value of I*J
      for J in 1..100 loop
        T1 := T3 + ( 100 * J ) - 100;
        T5 := T4;                         -- Initial value of T4*K
        for K in 1..100 loop
          (T1+K-1)↑ := T5;                -- T5 holds T4*K = I*J*K
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

## Loop optimization (5/6)

Before:

    for I in 1..100 loop
      T3 := A0 + ( 10000 * I ) - 10000;
      T4 := I;
      for J in 1..100 loop
        T1 := T3 + ( 100 * J ) - 100;
        T5 := T4;
        for K in 1..100 loop
          (T1+K-1)↑ := T5;
          T5 := T5 + T4;
        end loop;
        T4 := T4 + I;
      end loop;
    end loop;

After induction variable elimination:

    T6 := A0;                   -- Initial value of Adr(A(I))
    for I in 1..100 loop
      T3 := T6;
      T4 := I;                  -- Initial value of I*J
      T7 := T3;                 -- Initial value of Adr(A(I)(J))
      for J in 1..100 loop
        T1 := T7;
        T5 := T4;               -- Initial value of T4*K
        T8 := T1;               -- Initial value of Adr(A(I)(J)(K))
        for K in 1..100 loop
          T8↑ := T5;            -- T5 holds T4*K = I*J*K
          T5 := T5 + T4;
          T8 := T8 + 1;
        end loop;
        T4 := T4 + I;
        T7 := T7 + 100;
      end loop;
      T6 := T6 + 10000;
    end loop;

## Loop optimization (6/6)

Before:

    T6 := A0;
    for I in 1..100 loop
      T3 := T6;
      T4 := I;
      T7 := T3;
      for J in 1..100 loop
        T1 := T7;
        T5 := T4;
        T8 := T1;
        for K in 1..100 loop
          T8↑ := T5;
          T5 := T5 + T4;
          T8 := T8 + 1;
        end loop;
        T4 := T4 + I;
        T7 := T7 + 100;
      end loop;
      T6 := T6 + 10000;
    end loop;

After copy propagation (T3 and T1 are eliminated):

    T6 := A0;                   -- Initial value of Adr(A(I))
    for I in 1..100 loop
      T4 := I;                  -- Initial value of I*J
      T7 := T6;                 -- Initial value of Adr(A(I)(J))
      for J in 1..100 loop
        T5 := T4;               -- Initial value of T4*K
        T8 := T7;               -- Initial value of Adr(A(I)(J)(K))
        for K in 1..100 loop
          T8↑ := T5;            -- T5 holds T4*K = I*J*K
          T5 := T5 + T4;
          T8 := T8 + 1;
        end loop;
        T4 := T4 + I;
        T7 := T7 + 100;
      end loop;
      T6 := T6 + 10000;
    end loop;
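The fully optimized loop nest uses only additions and a store through a running address. The sketch below mimics it on a flat list and checks the result against the direct formula. It rests on several assumptions that are not in the slides: memory is modelled as a Python list, the base address `A0` is taken to be 0, elements are stored in row-major order with one cell each, and the bound is a parameter `n`, so the strides 100 and 10000 become `n` and `n * n`.

```python
def fill_flat(n):
    """Fill an n*n*n row-major array with I*J*K using only additions."""
    mem = [0] * (n * n * n)
    t6 = 0                      # A0, assumed to be address 0
    for i in range(1, n + 1):
        t4 = i                  # initial value of I*J
        t7 = t6                 # initial value of Adr(A(I)(J))
        for j in range(1, n + 1):
            t5 = t4             # initial value of T4*K
            t8 = t7             # initial value of Adr(A(I)(J)(K))
            for k in range(1, n + 1):
                mem[t8] = t5    # store through T8; T5 holds I*J*K
                t5 += t4
                t8 += 1
            t4 += i
            t7 += n             # stride of one row (100 on the slide)
        t6 += n * n             # stride of one plane (10000 on the slide)
    return mem


if __name__ == "__main__":
    n = 5
    mem = fill_flat(n)
    ok = all(mem[(i - 1) * n * n + (j - 1) * n + (k - 1)] == i * j * k
             for i in range(1, n + 1)
             for j in range(1, n + 1)
             for k in range(1, n + 1))
    print(ok)
```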

## Global data flow analysis (1/2)

- gathers information about the global structure of the program, not only about a single basic block
- data flow graph
  - node -- basic block

Example:

    Read ( Limit ) ;
    for I in 1 .. Limit loop
      Read ( J ) ;
      if I = 1 then
        Sum := J ;
      else
        Sum := Sum + J ;
      end if ;
    end loop ;
    Write ( Sum ) ;

[Figure: data flow graph of the example, with basic blocks numbered 0 to 6 and node contents Read(Limit); I := 1; I > Limit; Read(J); I = 1; Sum := J; Sum := Sum + J; I := I + 1; Write(Sum)]

## Global data flow analysis (2/2)

- classification of data flow analyses
  - any-path vs. all-path
  - forward-flow vs. backward-flow
  - the choice depends on the type of information required
- data flow equations
  - each basic block has 4 sets, IN, OUT, KILLED, and GEN, whose relationships are specified by data flow equations
  - the equations for all basic blocks need to be satisfied simultaneously
  - the solution may not be unique
- solution
  - iterative method
  - structure method

## Any-path forward-flow analysis

- example -- uninitialized variable (used but undefined)
  - IN -- uninitialized just before this basic block
  - OUT -- uninitialized at the end of this basic block (the block itself included)
  - KILLED -- defined
  - GEN -- out of scope
- data flow equations, where $P(b)$ is the set of predecessors of $b$
  - $\mathrm{IN}(b) = \bigcup_{i \in P(b)} \mathrm{OUT}(i)$
  - $\mathrm{OUT}(b) = \mathrm{GEN}(b) \cup (\mathrm{IN}(b) - \mathrm{KILLED}(b))$
  - $\mathrm{IN}(\mathit{first}) = \text{universal set}$
- the initial condition, i.e., IN(first), is decided case by case

[Figure: basic block b with its predecessors p and successors s]
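A minimal sketch of the iterative method for this any-path forward-flow analysis, applied to the Read(Limit) example program. The block names (`entry`, `test`, `body`, `then`, `else`, `incr`, `exit`), the partition of the program into those blocks, and the function name `solve_forward_any` are assumptions made for illustration; the slides number the blocks 0 to 6 instead. All sets start empty so that the iteration converges to the smallest solution.

```python
def solve_forward_any(blocks, preds, gen, killed, first, in_first):
    """Iterate the any-path forward equations until nothing changes."""
    IN = {b: set() for b in blocks}
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # IN(b) = union of OUT(i) over all predecessors i; IN(first) is given.
            new_in = set(in_first) if b == first else set()
            for p in preds[b]:
                new_in |= OUT[p]
            # OUT(b) = GEN(b) union (IN(b) - KILLED(b))
            new_out = gen[b] | (new_in - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


# Hypothetical block names for the example program.
blocks = ["entry", "test", "body", "then", "else", "incr", "exit"]
preds = {
    "entry": [], "test": ["entry", "incr"], "body": ["test"],
    "then": ["body"], "else": ["body"], "incr": ["then", "else"], "exit": ["test"],
}
# KILLED = variables defined in the block; GEN (out of scope) is empty here.
killed = {
    "entry": {"Limit", "I"}, "test": set(), "body": {"J"},
    "then": {"Sum"}, "else": {"Sum"}, "incr": {"I"}, "exit": set(),
}
gen = {b: set() for b in blocks}
universe = {"Limit", "I", "J", "Sum"}

IN, OUT = solve_forward_any(blocks, preds, gen, killed, "entry", universe)

if __name__ == "__main__":
    for b in blocks:
        print(b, sorted(IN[b]), sorted(OUT[b]))
```

In this sketch `Sum` is in `IN["exit"]`, which reflects the path that skips the loop body when `Limit` is less than 1, and also in `IN["else"]`, because the analysis does not know that the first iteration always takes the `then` branch.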

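## Any-path backward-flow analysis

- example -- live variable
  - OUT -- will be used just after this basic block
  - IN -- will be used from the start of this basic block (the block itself included)
  - KILLED -- defined
  - GEN -- used
- data flow equations, where $S(b)$ is the set of successors of $b$
  - $\mathrm{OUT}(b) = \bigcup_{i \in S(b)} \mathrm{IN}(i)$
  - $\mathrm{IN}(b) = \mathrm{GEN}(b) \cup (\mathrm{OUT}(b) - \mathrm{KILLED}(b))$
  - $\mathrm{OUT}(\mathit{last}) = \emptyset$

[Figure: basic block b with its predecessors p and successors s]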
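The live variable equations can be solved with the same kind of iteration, run against the successor edges. This is a sketch on the Read(Limit) example program; the block names, the block partition, and the function name `solve_backward_any` are illustrative assumptions. GEN is taken to be the variables used in a block before any definition in that block.

```python
def solve_backward_any(blocks, succs, gen, killed):
    """Iterate the any-path backward equations until nothing changes."""
    IN = {b: set() for b in blocks}
    OUT = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):
            # OUT(b) = union of IN(i) over all successors i (empty for the last block).
            new_out = set()
            for s in succs[b]:
                new_out |= IN[s]
            # IN(b) = GEN(b) union (OUT(b) - KILLED(b))
            new_in = gen[b] | (new_out - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


# Hypothetical block names for the example program.
blocks = ["entry", "test", "body", "then", "else", "incr", "exit"]
succs = {
    "entry": ["test"], "test": ["body", "exit"], "body": ["then", "else"],
    "then": ["incr"], "else": ["incr"], "incr": ["test"], "exit": [],
}
gen = {
    "entry": set(), "test": {"I", "Limit"}, "body": {"I"},
    "then": {"J"}, "else": {"Sum", "J"}, "incr": {"I"}, "exit": {"Sum"},
}
killed = {
    "entry": {"Limit", "I"}, "test": set(), "body": {"J"},
    "then": {"Sum"}, "else": {"Sum"}, "incr": {"I"}, "exit": set(),
}

IN, OUT = solve_backward_any(blocks, succs, gen, killed)

if __name__ == "__main__":
    for b in blocks:
        print(b, sorted(IN[b]), sorted(OUT[b]))
```

Visiting the blocks in reverse order is only a speed heuristic for backward problems; any order reaches the same fixed point.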

## All-path forward-flow analysis

- example -- available expression (to check for redundant computation)
  - IN -- already computed just before this basic block
  - OUT -- already computed at the end of this basic block (the block itself included)
  - KILLED -- one of the operands is redefined
  - GEN -- computed subexpression
- data flow equations, where $P(b)$ is the set of predecessors of $b$
  - $\mathrm{IN}(b) = \bigcap_{i \in P(b)} \mathrm{OUT}(i)$
  - $\mathrm{OUT}(b) = \mathrm{GEN}(b) \cup (\mathrm{IN}(b) - \mathrm{KILLED}(b))$
  - $\mathrm{IN}(\mathit{first}) = \emptyset$

[Figure: basic block b with its predecessors p and successors s]
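For an all-path problem the iteration starts from the universal set and shrinks, so that it converges to the largest solution. The sketch below uses a small hypothetical diamond-shaped graph that is not from the slides: `b0` computes `a+b` and `c*d`, `b1` recomputes `a+b`, `b2` assigns to `a` and therefore kills `a+b`, and `b3` is the join. The function name `solve_forward_all` is likewise an assumption.

```python
def solve_forward_all(blocks, preds, gen, killed, first, universe):
    """Iterate the all-path forward equations, starting from the universal set."""
    IN = {b: set(universe) for b in blocks}
    OUT = {b: set(universe) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == first:
                new_in = set()              # IN(first) is the empty set
            else:
                # IN(b) = intersection of OUT(i) over all predecessors i
                new_in = set(universe)
                for p in preds[b]:
                    new_in &= OUT[p]
            # OUT(b) = GEN(b) union (IN(b) - KILLED(b))
            new_out = gen[b] | (new_in - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


# Hypothetical diamond-shaped flow graph.
blocks = ["b0", "b1", "b2", "b3"]
preds = {"b0": [], "b1": ["b0"], "b2": ["b0"], "b3": ["b1", "b2"]}
gen = {"b0": {"a+b", "c*d"}, "b1": {"a+b"}, "b2": set(), "b3": {"a+b"}}
killed = {"b0": set(), "b1": set(), "b2": {"a+b"}, "b3": set()}
universe = {"a+b", "c*d"}

IN, OUT = solve_forward_all(blocks, preds, gen, killed, "b0", universe)

if __name__ == "__main__":
    for b in blocks:
        print(b, sorted(IN[b]), sorted(OUT[b]))
```

Only `c*d` is available on entry to `b3`, because `a+b` is killed on the path through `b2`; recomputing `a+b` in `b3` is therefore not redundant.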

## All-path backward-flow analysis

- example -- very busy expression (worth keeping in a register)
  - OUT -- will be used in all cases just after this basic block
  - IN -- will be used in all cases from the start of this basic block (the block itself included)
  - KILLED -- defined
  - GEN -- used
- data flow equations, where $S(b)$ is the set of successors of $b$
  - $\mathrm{OUT}(b) = \bigcap_{i \in S(b)} \mathrm{IN}(i)$
  - $\mathrm{IN}(b) = \mathrm{GEN}(b) \cup (\mathrm{OUT}(b) - \mathrm{KILLED}(b))$
  - $\mathrm{OUT}(\mathit{last}) = \emptyset$

[Figure: basic block b with its predecessors p and successors s]
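A sketch of the iterative solution for very busy expressions, again on a hypothetical diamond-shaped graph that is not from the slides: `b0` branches to `b1` and `b2`, both of which use `a+b`, while only `b1` uses `c*d`; `b3` is the exit. The function name `solve_backward_all` is an assumption.

```python
def solve_backward_all(blocks, succs, gen, killed, last, universe):
    """Iterate the all-path backward equations, starting from the universal set."""
    IN = {b: set(universe) for b in blocks}
    OUT = {b: set(universe) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):
            if b == last:
                new_out = set()             # OUT(last) is the empty set
            else:
                # OUT(b) = intersection of IN(i) over all successors i
                new_out = set(universe)
                for s in succs[b]:
                    new_out &= IN[s]
            # IN(b) = GEN(b) union (OUT(b) - KILLED(b))
            new_in = gen[b] | (new_out - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


# Hypothetical diamond-shaped flow graph.
blocks = ["b0", "b1", "b2", "b3"]
succs = {"b0": ["b1", "b2"], "b1": ["b3"], "b2": ["b3"], "b3": []}
gen = {"b0": set(), "b1": {"a+b", "c*d"}, "b2": {"a+b"}, "b3": set()}
killed = {b: set() for b in blocks}
universe = {"a+b", "c*d"}

IN, OUT = solve_backward_all(blocks, succs, gen, killed, "b3", universe)

if __name__ == "__main__":
    for b in blocks:
        print(b, sorted(IN[b]), sorted(OUT[b]))
```

`a+b` is very busy at the end of `b0` because every path out of `b0` uses it; `c*d` is not, because the path through `b2` never uses it.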

## Structure method of data flow solution (1/4)

The equations are written for forward analysis; for backward analysis, interchange I and O (I ↔ O).

Sequence (S1 followed by S2):

$$
\begin{aligned}
I &= I_1 \\
O &= (I_2 - K_2) \cup G_2 \\
  &= (((I_1 - K_1) \cup G_1) - K_2) \cup G_2 \\
  &= (I - (K_1 \cup K_2)) \cup (G_1 - K_2) \cup G_2 \\
K &= K_1 \cup K_2 \\
G &= (G_1 - K_2) \cup G_2
\end{aligned}
$$

Alternative (S1 and S2 on parallel branches), any path:

$$
\begin{aligned}
I &= I_1 = I_2 \\
O &= O_1 \cup O_2 \\
  &= ((I_1 - K_1) \cup G_1) \cup ((I_2 - K_2) \cup G_2) \\
  &= (I - (K_1 \cap K_2)) \cup (G_1 \cup G_2) \\
K &= K_1 \cap K_2 \\
G &= G_1 \cup G_2
\end{aligned}
$$
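The two composition rules on this slide can be checked mechanically. The sketch below represents a structure by its (K, G) pair and verifies, for every choice of subsets of a small universe, that the composed pair gives the same output set as applying the parts one by one. All function names are assumptions made for illustration.

```python
from itertools import combinations


def transfer(i, k, g):
    """O = (I - K) union G for a structure with sets K and G."""
    return (i - k) | g


def seq(s1, s2):
    """(K, G) of S1 followed by S2."""
    (k1, g1), (k2, g2) = s1, s2
    return (k1 | k2, (g1 - k2) | g2)


def branch_any(s1, s2):
    """(K, G) of S1 and S2 on parallel branches, any-path analysis."""
    (k1, g1), (k2, g2) = s1, s2
    return (k1 & k2, g1 | g2)


def subsets(universe):
    """All subsets of the universe as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def verify(universe):
    """Check both composition rules against step-by-step evaluation."""
    sets = subsets(universe)
    for i in sets:
        for k1 in sets:
            for g1 in sets:
                for k2 in sets:
                    for g2 in sets:
                        s1, s2 = (k1, g1), (k2, g2)
                        step = transfer(transfer(i, k1, g1), k2, g2)
                        if step != transfer(i, *seq(s1, s2)):
                            return False
                        either = transfer(i, k1, g1) | transfer(i, k2, g2)
                        if either != transfer(i, *branch_any(s1, s2)):
                            return False
    return True


if __name__ == "__main__":
    print(verify({"x", "y"}))
```

Because each identity holds element by element, a two-element universe already exercises every combination of memberships that matters.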

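## Structure method of data flow solution (2/4)

Alternative (S1 and S2 on parallel branches), all path:

$$
\begin{aligned}
I &= I_1 = I_2 \\
O &= O_1 \cap O_2 \\
  &= ((I_1 - K_1) \cup G_1) \cap ((I_2 - K_2) \cup G_2) \\
  &= \cdots \\
  &= (I - (K_1 \cup K_2)) \cup (G_1 \cap G_2) \\
K &= K_1 \cup K_2 \\
G &= G_1 \cap G_2
\end{aligned}
$$

Loop (built from S1 and S2):

- any path: $K = K_1$, $G = (G_2 - K_1) \cup G_1$
- all path: $K = K_1 \cup K_2$, $G = G_1$

[Figure: flow graphs of the alternative and loop structures built from S1 and S2]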

## Structure method of data flow solution (3/4)

Example -- uninitialized variable

[Figure: data flow graph of the Read(Limit) example, with basic blocks numbered 0 to 6 and node contents Read(Limit); I := 1; I > Limit; Read(J); I = 1; Sum := J; Sum := Sum + J; I := I + 1; Write(Sum)]

## Structure method of data flow solution (4/4)

## Applications of data flow analyses
