Published by Chrystal James. Modified over 9 years ago.
1
Optimizing Compilers for Modern Architectures
Preliminary Transformations
Chapter 4 of Allen and Kennedy
2
Overview
Why do we need this?
- Requirements of dependence testing
  - Stride 1
  - Normalized loop
  - Linear subscripts
  - Subscripts composed of functions of loop induction variables
- Higher dependence test accuracy
- Easier implementation of dependence tests
3
An Example
    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        KI = KI + INC
        U(KI) = U(KI) + W(J)
      ENDDO
      S(I) = U(KI)
    ENDDO
Programmer-optimized code
- Confusing to smart compilers
4
An Example
    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        ! Deleted: KI = KI + INC
        U(KI + J*INC) = U(KI + J*INC) + W(J)
      ENDDO
      KI = KI + 100 * INC
      S(I) = U(KI)
    ENDDO
Applying induction-variable substitution
- Replace references to the auxiliary induction variable (AIV) with functions of the loop index
5
An Example
    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        U(KI + (I-1)*100*INC + J*INC) = U(KI + (I-1)*100*INC + J*INC) + W(J)
      ENDDO
      ! Deleted: KI = KI + 100 * INC
      S(I) = U(KI + I * (100*INC))
    ENDDO
    KI = KI + 100 * 100 * INC
Second application of IVS
- Remove all references to KI from the loop nest
6
An Example
    INC = 2
    ! Deleted: KI = 0
    DO I = 1, 100
      DO J = 1, 100
        U(I*200 + J*2 - 200) = U(I*200 + J*2 - 200) + W(J)
      ENDDO
      S(I) = U(I*200)
    ENDDO
    KI = 20000
Applying constant propagation
- Substitute the constants
7
An Example
    DO I = 1, 100
      DO J = 1, 100
        U(I*200 + J*2 - 200) = U(I*200 + J*2 - 200) + W(J)
      ENDDO
      S(I) = U(I*200)
    ENDDO
Applying dead code elimination
- Removes all unused code
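As a sanity check on the chain of transformations in this example, the following Python sketch runs the first and the last version of the loop nest and compares the resulting arrays. It assumes U and S start out as zeros and picks arbitrary values for W; neither assumption comes from the slides.

```python
def original(w):
    """Hand-optimized loop nest from the first example slide (1-based)."""
    u = [0] * 20001          # U(1..20000); index 0 is unused
    s = [0] * 101            # S(1..100)
    inc, ki = 2, 0
    for i in range(1, 101):
        for j in range(1, 101):
            ki += inc
            u[ki] += w[j]
        s[i] = u[ki]
    return u, s


def transformed(w):
    """Loop nest after IVS, constant propagation and dead code elimination."""
    u = [0] * 20001
    s = [0] * 101
    for i in range(1, 101):
        for j in range(1, 101):
            u[i * 200 + j * 2 - 200] += w[j]
        s[i] = u[i * 200]
    return u, s


# Arbitrary test data for W(1..100); index 0 is unused.
w = [0] + [3 * j + 1 for j in range(1, 101)]
same = original(w) == transformed(w)
```

Here `same` is `True`: both versions touch the same elements of U in the same order, but only the second one has subscripts that are explicit linear functions of I and J.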
8
Information Requirements
Transformations need knowledge of
- Loop stride
- Loop-invariant quantities
- Constant-valued assignments
- Usage of variables
9
Loop Normalization
- Rewrites each loop to have lower bound 1 and stride 1
- Makes dependence testing as simple as possible
- Serves as an information-gathering phase
10
Loop Normalization
Algorithm
    procedure normalizeLoop(L0);
      i = a unique compiler-generated loop induction variable (LIV)
      S1: replace the loop header for L0
            DO I = L, U, S
          with the adjusted loop header
            DO i = 1, (U - L + S) / S;
      S2: replace each reference to I within the loop by
            i * S - S + L;
      S3: insert a finalization assignment
            I = i * S - S + L;
          immediately after the end of the loop;
    end normalizeLoop;
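The index arithmetic in steps S1 to S3 can be checked with a small Python sketch. It is restricted to positive strides, and the function names are made up for illustration; a negative trip count is clamped to zero to mimic a zero-trip DO loop.

```python
def normalize(lo, hi, step):
    """Return (trip_count, index_of) for DO I = lo, hi, step with step > 0."""
    if step <= 0:
        raise ValueError("this sketch only handles positive strides")
    trip = max((hi - lo + step) // step, 0)   # upper bound of the new loop

    def index_of(i):
        # S2: value the original index I has in normalized iteration i.
        return i * step - step + lo

    return trip, index_of


def original_indices(lo, hi, step):
    """Values taken by I in DO I = lo, hi, step."""
    return list(range(lo, hi + 1, step))


def normalized_indices(lo, hi, step):
    """Values of i*S - S + L for i = 1 .. (U - L + S) / S."""
    trip, index_of = normalize(lo, hi, step)
    return [index_of(i) for i in range(1, trip + 1)]


def final_value(lo, hi, step):
    """S3: value assigned to I after the loop, where i has reached trip + 1."""
    trip, index_of = normalize(lo, hi, step)
    return index_of(trip + 1)
```

For example, `normalized_indices(3, 10, 4)` visits 3 and 7, the same values as the original loop, and `final_value(3, 10, 4)` is 11.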
11
Loop Normalization
Caveat
- Un-normalized:
    DO I = 1, M
      DO J = I, N
        A(J, I) = A(J, I - 1) + 5
      ENDDO
    ENDDO
  Has a direction vector of (<,=)
- Normalized:
    DO I = 1, M
      DO J = 1, N - I + 1
        A(J + I - 1, I) = A(J + I - 1, I - 1) + 5
      ENDDO
    ENDDO
  Has a direction vector of (<,>)
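The effect of normalization on the direction vector can be confirmed by brute force. The sketch below enumerates the iterations of both loop nests for small, arbitrarily chosen bounds (M = N = 6) and records the direction vector of every write that is later read. The helper name `direction_vectors` is hypothetical.

```python
def direction_vectors(iterations, write_of, read_of):
    """Collect direction vectors of flow dependences by brute force.

    iterations: iteration vectors in execution order
    write_of / read_of: map an iteration vector to the array element touched
    """
    def sign(a, b):
        return "<" if a < b else ("=" if a == b else ">")

    found = set()
    for src_pos, src in enumerate(iterations):
        for snk in iterations[src_pos + 1:]:
            if write_of(src) == read_of(snk):
                found.add(tuple(sign(a, b) for a, b in zip(src, snk)))
    return found


M, N = 6, 6

# Un-normalized: DO I = 1, M / DO J = I, N / A(J, I) = A(J, I-1) + 5
plain_iters = [(i, j) for i in range(1, M + 1) for j in range(i, N + 1)]
plain = direction_vectors(plain_iters,
                          write_of=lambda v: (v[1], v[0]),
                          read_of=lambda v: (v[1], v[0] - 1))

# Normalized: DO I = 1, M / DO J = 1, N-I+1 / A(J+I-1, I) = A(J+I-1, I-1) + 5
norm_iters = [(i, j) for i in range(1, M + 1) for j in range(1, N - i + 2)]
norm = direction_vectors(norm_iters,
                         write_of=lambda v: (v[1] + v[0] - 1, v[0]),
                         read_of=lambda v: (v[1] + v[0] - 1, v[0] - 1))
```

With these bounds, `plain` contains only `("<", "=")` and `norm` contains only `("<", ">")`. The underlying memory accesses are identical; only the meaning of the inner index has changed.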
12
Loop Normalization
Caveat
- Consider interchanging the loops
  - (<,=) becomes (=,<): OK
  - (<,>) becomes (>,<): Problem
  - Handled by another transformation
- What if the step size is symbolic?
  - Prohibits dependence testing
  - Workaround: use step size 1
  - Less precise, but allows dependence testing
13
Definition-use Graph
- Traditionally called definition-use chains
- Provides a map of variable usage
- Heavily used by the transformations
14
Definition-use Graph
- The definition-use graph contains an edge from each definition point in the program to every possible use of the variable at run time
- uses(b): the set of all variables used within block b that have no prior definitions within the block
- defsout(b): the set of all definitions within block b that are not killed within the block
- killed(b): the set of all definitions that define variables killed by other definitions within block b
15
Definition-use Graph
- Computing reaches for one block b may immediately change all other reaches, including that of b itself, since reaches(b) is an input into other reaches equations
- Achieving correct solutions requires solving all the individual equations simultaneously
  - There is a workaround for this
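One common way around solving all the equations at once is to iterate them until nothing changes. The sketch below assumes the usual reaching-definitions equation, reaches(b) = union over predecessors p of defsout(p) ∪ (reaches(p) − killed(p)), which the slide text does not spell out. Whether this is the workaround the slides have in mind is also an assumption. The three-block flow graph is invented for illustration.

```python
def solve_reaches(preds, defsout, killed):
    """Iterate the reaches equations to a fixed point.

    preds:   block -> list of predecessor blocks
    defsout: block -> definitions that survive to the end of the block
    killed:  block -> definitions killed inside the block
    """
    reaches = {b: set() for b in preds}
    changed = True
    while changed:
        changed = False
        for b in preds:
            new = set()
            for p in preds[b]:
                new |= defsout[p] | (reaches[p] - killed[p])
            if new != reaches[b]:
                reaches[b] = new
                changed = True
    return reaches


# Hypothetical flow graph: B1 -> B2, B2 -> B2 (loop), B2 -> B3.
# d1 is "x = 1" in B1; d2 is "x = x + 1" in B2; B3 only uses x.
preds = {"B1": [], "B2": ["B1", "B2"], "B3": ["B2"]}
defsout = {"B1": {"d1"}, "B2": {"d2"}, "B3": set()}
killed = {"B1": {"d2"}, "B2": {"d1"}, "B3": set()}
result = solve_reaches(preds, defsout, killed)
```

In this example both d1 and d2 reach the top of B2, while only d2 reaches B3. The sets only grow from one pass to the next, so the loop terminates.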
16
Definition-use Graph
17
Definition-use Graph
18
Dead Code Elimination
- Removes all dead code
- What is dead code?
  - Code whose results are never used in any "useful statements"
- What are useful statements?
  - Are they simply output statements?
  - Output statements, input statements, control flow statements, and the statements they require
- Makes code cleaner
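The idea of starting from the useful statements and keeping only what they require can be sketched as a mark phase followed by a sweep. This is a deliberately simplified illustration for straight-line code, not the algorithm from the slides: the reaching definition of a use is taken to be the closest earlier definition of the same variable, and the statement encoding is invented here.

```python
def eliminate_dead_code(stmts):
    """stmts: list of (text, defined_var_or_None, used_vars, is_useful).

    Straight-line code only: the definition reaching a use is taken to be
    the closest earlier statement that defines the same variable.
    """
    live = set()
    worklist = [k for k, s in enumerate(stmts) if s[3]]
    while worklist:
        k = worklist.pop()
        if k in live:
            continue
        live.add(k)
        for var in stmts[k][2]:
            # Walk backwards to the definition that reaches this use.
            for d in range(k - 1, -1, -1):
                if stmts[d][1] == var:
                    worklist.append(d)
                    break
    return [s[0] for k, s in enumerate(stmts) if k in live]


program = [
    ("INC = 2",     "INC", [],      False),
    ("KI = 0",      "KI",  [],      False),
    ("X = INC * 3", "X",   ["INC"], False),
    ("PRINT *, X",  None,  ["X"],   True),   # output statement: useful
    ("KI = 20000",  "KI",  [],      False),
]
kept = eliminate_dead_code(program)
```

Only the PRINT statement and the two assignments it depends on survive; both assignments to KI are dropped because nothing useful reads them.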
19
Dead Code Elimination
20
Constant Propagation
- Replace all variables that have constant values at run time with those constant values
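A minimal sketch of the idea for straight-line code follows. It tracks which variables currently hold a known constant and folds expressions as it goes. The expression encoding (integers, variable names, and `(op, left, right)` tuples) is invented for this illustration, and the sketch ignores control flow.

```python
def propagate_constants(stmts):
    """stmts: list of (target, expr); expr is an int, a variable name,
    or a tuple (op, left, right) with op in {"+", "*"}."""
    known = {}                      # variable -> constant value

    def fold(expr):
        if isinstance(expr, int):
            return expr
        if isinstance(expr, str):
            return known.get(expr, expr)
        op, left, right = expr
        left, right = fold(left), fold(right)
        if isinstance(left, int) and isinstance(right, int):
            return left + right if op == "+" else left * right
        return (op, left, right)

    result = []
    for target, expr in stmts:
        value = fold(expr)
        if isinstance(value, int):
            known[target] = value
        else:
            known.pop(target, None)  # target is no longer a known constant
        result.append((target, value))
    return result


program = [
    ("INC", 2),
    ("KI", 0),
    ("STEP", ("*", 100, "INC")),                 # 100 * INC
    ("SUB", ("+", "KI", ("*", "I", "STEP"))),    # KI + I * STEP, I unknown
]
folded = propagate_constants(program)
```

STEP folds to 200, and the last expression becomes `0 + I * 200`, with only the unknown loop index I left symbolic.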
21
Constant Propagation
22
Constant Propagation
23
Static Single-Assignment
- Reduces the number of definition-use edges
- Improves the performance of the algorithms that traverse those edges
24
Static Single-Assignment
Example
25
Forward Expression Substitution
Example
    DO I = 1, 100
      K = I + 2
      A(K) = A(K) + 5
    ENDDO
becomes
    DO I = 1, 100
      A(I+2) = A(I+2) + 5
    ENDDO
26
Forward Expression Substitution
- Needs definition-use edges and control flow analysis
- Needs to guarantee that the definition is always executed on a loop iteration before the statement into which it is substituted
- Data structure to find out whether a statement S is in loop L
  - Test whether the level-K loop containing S is equal to L
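The requirement that the definition executes before the use on every iteration can be illustrated with a small Python experiment. Three loop bodies are simulated: the definition before the use, the use before the definition, and the substituted form. The entry value of K and the array size are arbitrary choices made for this sketch.

```python
def run(body, n=10):
    """Execute `body` for I = 1..n on a fresh array and return the array."""
    a = [0] * (n + 4)
    k = 1                       # value of K on entry (arbitrary)
    for i in range(1, n + 1):
        k = body(a, i, k)
    return a


def def_first(a, i, k):         # K = I + 2 ; A(K) = A(K) + 5
    k = i + 2
    a[k] += 5
    return k


def use_first(a, i, k):         # A(K) = A(K) + 5 ; K = I + 2
    a[k] += 5
    return i + 2


def substituted(a, i, k):       # A(I+2) = A(I+2) + 5
    a[i + 2] += 5
    return k


valid = run(def_first) == run(substituted)
invalid = run(use_first) == run(substituted)
```

Here `valid` is `True` and `invalid` is `False`: when the use comes first, A(K) sees the value of K from the previous iteration, so replacing K by I + 2 changes the result.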
27
Forward Expression Substitution
28
Forward Expression Substitution
29
Forward Expression Substitution
30
Forward Expression Substitution
31
Induction Variable Substitution
Definition: an auxiliary induction variable in a DO loop headed by
    DO I = LB, UB, S
is any variable that can be correctly expressed as
    cexpr * I + iexpr_L
at every location L where it is used in the loop, where cexpr and iexpr_L are expressions that do not vary in the loop, although different locations in the loop may require substitution of different values of iexpr_L
32
Induction Variable Substitution
Example:
    DO I = 1, N
      A(I) = B(K) + 1
      K = K + 4
      …
      D(K) = D(K) + A(I)
    ENDDO
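In this example K fits the definition with cexpr = 4: before the update it equals K0 + 4*(I-1), and after the update it equals K0 + 4*I, where K0 is the value of K on loop entry. The Python sketch below checks that claim numerically. It assumes the elided statements do not modify K, A, B, or D, and the test data is arbitrary.

```python
def with_aiv(n, k, b, d):
    """Loop as written, with K updated inside the body."""
    a = [0] * (n + 1)
    d = list(d)
    for i in range(1, n + 1):
        a[i] = b[k] + 1
        k = k + 4
        d[k] = d[k] + a[i]
    return a, d, k


def substituted(n, k0, b, d):
    """Same loop with K replaced by linear functions of I."""
    a = [0] * (n + 1)
    d = list(d)
    for i in range(1, n + 1):
        a[i] = b[k0 + 4 * (i - 1)] + 1
        d[k0 + 4 * i] = d[k0 + 4 * i] + a[i]
    return a, d, k0 + 4 * n     # final value of K after the loop


n, k0 = 5, 1
size = k0 + 4 * n + 1
b = [7 * x % 11 for x in range(size)]
d = [x * x for x in range(size)]
same = with_aiv(n, k0, b, d) == substituted(n, k0, b, d)
```

The two uses of K need different iexpr values (K0 - 4 and K0), which is the situation the definition allows for.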
33
Induction Variable Recognition
34
Induction Variable Substitution
Induction Variable Recognition
35
Induction Variable Substitution
36
Induction Variable Substitution
37
Induction Variable Substitution
More complex example
    DO I = 1, N, 2
      K = K + 1
      A(K) = A(K) + 1
      K = K + 1
      A(K) = A(K) + 1
    ENDDO
An alternative strategy is to recognize region invariance
    DO I = 1, N, 2
      A(K+1) = A(K+1) + 1
      K = K+1 + 1
      A(K) = A(K) + 1
    ENDDO
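A quick Python check that the two loop bodies above are equivalent. The third function is an extra step not shown on the slide: once K is updated only once per iteration, it can be replaced by a closed form in a normalized iteration counter t. The array size and starting value of K are arbitrary.

```python
def two_updates(n, k, a):
    """Original body: K is incremented twice per iteration."""
    a = list(a)
    for _ in range(1, n + 1, 2):
        k = k + 1
        a[k] = a[k] + 1
        k = k + 1
        a[k] = a[k] + 1
    return a, k


def one_update(n, k, a):
    """After forward substitution: a single update of K per iteration."""
    a = list(a)
    for _ in range(1, n + 1, 2):
        a[k + 1] = a[k + 1] + 1
        k = k + 1 + 1
        a[k] = a[k] + 1
    return a, k


def closed_form(n, k0, a):
    """Assumed follow-up: subscripts written in terms of t = 1 .. trips."""
    a = list(a)
    trips = len(range(1, n + 1, 2))
    for t in range(1, trips + 1):
        a[k0 + 2 * t - 1] += 1
        a[k0 + 2 * t] += 1
    return a, k0 + 2 * trips


n, k0 = 9, 0
a0 = [0] * 11
results = [f(n, k0, a0) for f in (two_updates, one_update, closed_form)]
```

All three entries of `results` are identical: elements 1 through 10 are incremented once and K ends at 10.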
38
Induction Variable Substitution
Driver
39
IVSub without loop normalization
    DO I = L, U, S
      K = K + N
      … = A(K)
    ENDDO
becomes
    DO I = L, U, S
      … = A(K + (I - L + S) / S * N)
    ENDDO
    K = K + (U - L + S) / S * N
40
IVSub without loop normalization
Problem:
- Inefficient code
- Nonlinear subscript
41
IVSub with Loop Normalization
    I = 1
    DO i = 1, (U - L + S) / S, 1
      K = K + N
      … = A(K)
      I = I + 1
    ENDDO
42
IVSub with Loop Normalization
    I = 1
    DO i = 1, (U - L + S) / S, 1
      … = A(K + i * N)
    ENDDO
    K = K + (U - L + S) / S * N
    I = I + (U - L + S) / S
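The three forms of the subscript on A can be compared numerically: the running K of the original loop, the substituted form that contains a division, and the normalized form that is linear in i. The Python sketch below assumes a positive stride and U >= L; the function names and test values are illustrative only.

```python
def running_k(lo, hi, step, k, n):
    """Original loop: K = K + N on every iteration, then A(K) is read."""
    seen = []
    for _ in range(lo, hi + 1, step):
        k = k + n
        seen.append(k)
    return seen, k


def substituted_unnormalized(lo, hi, step, k, n):
    """IVSub on DO I = L, U, S: the subscript divides by the stride."""
    seen = [k + (i - lo + step) // step * n for i in range(lo, hi + 1, step)]
    return seen, k + (hi - lo + step) // step * n


def substituted_normalized(lo, hi, step, k, n):
    """IVSub after normalization: subscript K + i*N, linear in i."""
    trip = (hi - lo + step) // step
    seen = [k + i * n for i in range(1, trip + 1)]
    return seen, k + trip * n


args = (3, 20, 4, 10, 7)        # L, U, S, initial K, N
forms = [f(*args) for f in (running_k,
                            substituted_unnormalized,
                            substituted_normalized)]
```

All three agree on the subscripts and on the final value of K. Only the normalized form has a subscript with a fixed coefficient on the loop index and no division, which is the form a dependence tester can handle directly.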
43
Summary
- Transformations to put more subscripts into standard form
  - Loop normalization
  - Constant propagation
  - Induction-variable substitution
- Do loop normalization before induction-variable substitution
- Leave optimizations to compilers