CR18: Advanced Compilers L02: Dependence Analysis Tomofumi Yuki 1.


1 CR18: Advanced Compilers
L02: Dependence Analysis
Tomofumi Yuki

2 Today’s Agenda
Legality of Loop Transformations
  Dependences
  Legality of loop parallelization
  Legality of loop permutation
Dependence Tests
  How to find dependences?
  Conservative tests
  Exact methods
Polyhedral Representations

3 Loop Parallelism
“Simple” transformation
  for (i=0; i<N; i++) S;
  forall (i=0; i<N; i++) S;
Not so simple to reason about
  Legality
  Performance impacts
More complicated cases
  Transform the loops to expose parallelism

4 Legality of Transformations
First Rule of Compilers
  preserve original semantics
Many complications
  loops, parameters, array accesses
  branches, pointers, random numbers
  regular subset

5 Preserving Semantics
Preserving the order of operations
  one “easy” way to ensure preservation
  dependence is a partial order
Exceptions?

6 Dependences
Express relations between statements
  flow (true) dependence, RAW:  a = ...   then   ... = a
  anti-dependence, WAR:         ... = a   then   a = ...
  output dependence, WAW:       a = ...   then   a = ...
  input dependence, RAR:        ... = a   then   ... = a

7 Flow vs Anti Dependence
Why is flow the “true” dependence?
  Flow is value-based
  Anti is memory-based
  for i
    a[i] = ...
    ... = a[i]
  for i
    ... = a[i]
    a[i] = ...
  for i
    ... = a[i]
    b[i] = ...

8 Dependence Abstractions
Distance Vector
  distance between write and read
  [i,j] + c
  e.g., [0,1]
Direction Vector
  direction of the instance that uses the value
  one of <, >, ≤, ≥, =, *
  e.g., [0,<]
  less precise, but sometimes sufficient

9 Direction Vector Example 1
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] = A[i][0] + B[i][j];
distance vector: [0,1], [0,2], [0,3]
direction vector: [0,<]

10 Direction Vector Example 2
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
distance vector: [1,-1]
direction vector: [<,>]

11 So what do these vectors do?
Parallelism is clear
  same for direction vectors
Loop-carried dependence
  loop at depth d carries a dependence if at least one of the distance/direction vectors has a non-zero entry at d
  [0,0,1] [0,1,0] [1,1,0]
  [0,0, 1] [0,1, 1] [0,1,-1]
  [1, 0,0] [1, 1,0] [1,-1,0]
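The three vector sets on this slide can be checked mechanically. The sketch below is only an illustration: the function names are made up, and it assumes the usual refinement that a dependence is carried by the outermost loop where its distance vector is non-zero, which is slightly more permissive than the "non-zero entry at d" wording above (an inner loop counts as parallel as long as the outer loops stay sequential).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Depth (0-based) of the loop that carries the dependence described by
 * distance vector v of length n: the position of its first non-zero entry.
 * Returns -1 for the all-zero vector (loop-independent dependence). */
int carrying_depth(const int *v, size_t n) {
    assert(v != NULL);
    for (size_t k = 0; k < n; k++)
        if (v[k] != 0) return (int)k;
    return -1;
}

/* Loop d may run in parallel if no distance vector is carried at depth d.
 * vecs is a row-major array holding count vectors of length n each. */
bool loop_is_parallel(const int *vecs, size_t count, size_t n, size_t d) {
    assert(d < n);
    for (size_t r = 0; r < count; r++)
        if (carrying_depth(vecs + r * n, n) == (int)d) return false;
    return true;
}
```

Under this assumption, the first set leaves no loop parallel, the second leaves only the outermost loop parallel, and the third leaves the two inner loops parallel.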

12 Loop Carried Dependence
Is any of the loops parallel?
What are the distance vectors?
  for i
    for j
      A[j] = foo(A[j], A[j+1])

13 Legality of Loop Permutation
Another application of distance vectors
Which ones can you permute?
  [1,1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i-1][j-1] + B[i][j];
  [0,1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i][j-1] + B[i][j];
  [1,-1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i-1][j+1] + B[i][j];

14 Legality of Loop Permutation
Another application of distance vectors
Which ones can you permute?
  [1,1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i-1][j-1] + B[i][j];

15 Legality of Loop Permutation
Another application of distance vectors
Which ones can you permute?
  [0,1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i][j-1] + B[i][j];
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i+1][j-1] + B[i][j];

16 Legality of Loop Permutation
Another application of distance vectors
Which ones can you permute?
  [1,-1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i-1][j+1] + B[i][j];
Fully permutable: [≤,...,≤]
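The permutation question on the last few slides can be phrased as a small check: permute the entries of every distance vector the same way as the loops, and require that each result is still lexicographically non-negative. The following is a minimal sketch under that assumption; the function names and the 8-loop limit are illustrative choices, not part of the lecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if v is lexicographically non-negative: its first non-zero entry,
 * if there is one, is positive. */
bool lex_nonneg(const int *v, size_t n) {
    for (size_t k = 0; k < n; k++) {
        if (v[k] > 0) return true;
        if (v[k] < 0) return false;
    }
    return true;
}

/* perm[k] names the original loop that ends up at depth k.
 * The permutation is accepted if every permuted distance vector is still
 * lexicographically non-negative. Handles up to 8 nested loops. */
bool permutation_is_legal(const int *vecs, size_t count, size_t n,
                          const size_t *perm) {
    int tmp[8];
    assert(n <= 8);
    for (size_t r = 0; r < count; r++) {
        for (size_t k = 0; k < n; k++)
            tmp[k] = vecs[r * n + perm[k]];
        if (!lex_nonneg(tmp, n)) return false;
    }
    return true;
}
```

With the interchange perm = {1, 0}, the vectors [1,1] and [0,1] pass, while [1,-1] becomes [-1,1] and is rejected, which matches the "fully permutable" condition that every entry be non-negative.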

17 Legality of Loop Reversal
Is this transformation legal?
  [1,-1]
    for (i=1; i<N; i++)
      for (j=1; j<M; j++)
        A[i][j] = A[i-1][j+1] + B[i][j];
  [?,?]
    for (i=1; i<N; i++)
      for (j=M-1; j>0; j--)
        A[i][j] = A[i-1][j+1] + B[i][j];

18 Today’s Agenda
Legality of Loop Transformations
  Dependences
  Legality of loop parallelization
  Legality of loop permutation
Dependence Tests
  How to find dependences?
  Conservative tests
  Exact methods

19 How to Find the Vectors
Easy case
Not too easy
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i] = A[i] + B[i][j];
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i] = A[2*i-j+3] + B[i][j];

20 How to Find the Vectors
Really difficult
  for (i=1; i<N; i++)
    for (j=0; j<M; j++) {
      A[i*i+j*j-i*j] = A[i] + B[i][j];
      A[i*j*j-i*j*3] = A[i] + B[i][j];
    }
No general solution
  polynomial case is undecidable
  can work for linear accesses
  wide range of precision even for the linear case

21 Dependence: Affine Case
Given two accesses f(i,j) and g(x,y)
  the two accesses are in conflict if:
    same location: f(i,j) = g(x,y)
    one of them is a write
Let f and g be affine
  a_0 + a_1*i + a_2*j = b_0 + b_1*x + b_2*y
The last write to a conflicting location is the producer

22 It is just solving a linear system
Theoretically it is not that “hard”
Two Directions
  Polyhedral: use PIP and get exact solution
  Others: less expensive solutions work in practice

23 Exact Method: Polyhedral Model
Array Dataflow Analysis [Feautrier 1991]
  Given read and write statement instances r,w
  Find w as a function of r such that
    r and w are in conflict
    w happens-before r
    w is the most recent write
  when everything is affine
Main Engine
  Parametric Integer Linear Programming

24 Exact Dependence Analysis
Who produced the value read at A[j]?
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      S: A[i] = A[j] + B[i][j];
  0≤i,i’<N
  0≤j,j’<M
  i=j’
  (i’,j’)<<(i,j)
  obj: max i’*X+j’
  S = if i>j and j>0 : S ;
      if i=j and i>0 : S ;
      if j>i or i=j=0: A[j];
Powerful but expensive
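For small N and M, the question "who produced the value read at A[j]?" can also be answered by brute force, which gives a reference to compare a symbolic answer against. This is only a sketch with made-up names (Producer, last_writer); it enumerates all instances instead of solving a parametric integer program, and it assumes the read in S at (i,j) happens before that instance's own write.

```c
#include <assert.h>
#include <stdbool.h>

/* A statement instance of S at (i,j). valid is false when no earlier write
 * exists, i.e. the value read is the original input A[j]. */
typedef struct { bool valid; int i, j; } Producer;

/* Brute-force search for the last write to A[j] that executes before the
 * read performed by S at (i,j). The instance at (wi,wj) writes A[wi], so it
 * conflicts when wi == j. Scanning in execution (lexicographic) order and
 * overwriting p keeps the most recent match. */
Producer last_writer(int i, int j, int N, int M) {
    Producer p = { false, 0, 0 };
    assert(0 <= i && i < N && 0 <= j && j < M);
    for (int wi = 0; wi < N; wi++)
        for (int wj = 0; wj < M; wj++) {
            bool before = (wi < i) || (wi == i && wj < j);
            if (before && wi == j) { p.valid = true; p.i = wi; p.j = wj; }
        }
    return p;
}
```

For example, with N = M = 4 the read at (2,1) is fed by the instance (1,3), the read at (2,2) by (2,1), and the read at (1,3) has no producer inside the loop nest.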

25 ADA Example 1
What is the PIP problem?
  for (i = 0; i<=N; i++)
    for (j = i; j<=M; j++)
      A[j] = foo(A[j], A[j+1])

26 ADA Example 2
What is the PIP problem?
  for (i = 0; i<=N; i++)
    B[j] = foo(...);
  for (j = i; j<=M; j++)
    A[j] = bar(B[j], B[j-1]);

27 Digression: Multiple Statements
Within a domain, the order of execution is given by lex. order
What do you do when you have multiple statements?

28 2d+1 Notation
A convention to encode statement ordering
  Known by many different names
  the original ADA paper simply said to “use the textual order”
For a d-dimensional loop nest, use d+1 constant dimensions
  for i
    for j
      S1 ;
    for j
      S2 ;
      S3 ;
  dom(S1) = {0,i,0,j,0|...}
  dom(S2) = {0,i,1,j,0|...}
  dom(S3) = {0,i,1,j,1|...}
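Once every statement instance carries a 2d+1 vector, execution order reduces to one lexicographic comparison. A minimal sketch follows; lex_compare is an illustrative name, not something from the lecture or from a library.

```c
#include <assert.h>
#include <stddef.h>

/* Lexicographic comparison of two timestamp vectors of length n.
 * Returns -1 if a executes before b, 1 if a executes after b,
 * and 0 if the two vectors are identical. */
int lex_compare(const int *a, const int *b, size_t n) {
    assert(a != NULL && b != NULL);
    for (size_t k = 0; k < n; k++) {
        if (a[k] < b[k]) return -1;
        if (a[k] > b[k]) return 1;
    }
    return 0;
}
```

With the domains above, S1 at (i=2, j=5) is {0,2,0,5,0} and S2 at (i=2, j=0) is {0,2,1,0,0}; the constant in the third position decides that every S1 instance of an iteration i precedes every S2 instance of the same i.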

29 ADA Example 2
What is the PIP problem?
  for (i = 0; i<=N; i++)
    B[j] = foo(...);
  for (j = i; j<=M; j++)
    A[j] = bar(B[j], B[j-1]);

30 ADA Example 3
What is the PIP problem?
  for (t=0; t<=T; t++) {
    for (i=0; i<=N; i++)
      A[i] = foo(B[j]);
    for (j=0; j<=M; j++)
      B[j] = foo(A[i]);
  }

31 The Omega Test
Another Variant of ADA
  William Pugh (1991)
  based on Fourier-Motzkin for integers
  Presburger Arithmetic
Two slightly different branches
  one in US, the other in France
  we mostly talk about the French stuff, but similar evolution took place with Omega

32 So what is wrong?
Can’t we just use this powerful method all the time?

33 Dependence Tests
Same setting (conflicting memory accesses)
  f(i,j) = g(x,y)
Let f and g be affine
  a_0 + a_1*i + a_2*j = b_0 + b_1*x + b_2*y
  linear Diophantine equation
  an integer solution exists if and only if gcd(a_1,a_2,b_1,b_2) divides (b_0 - a_0)

34 GCD Test
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[3*i] = A[6*i-3*j+2] + B[i][j];
  3i = 6x-3y+2  =>  3i-6x+3y = 2
  does gcd(3,6,3) divide 2 ?
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[2*i] = A[4*i-2*j+2] + B[i][j];
  2i = 4x-2y+2  =>  2i-4x+2y = 2
  does gcd(2,4,2) divide 2 ?
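The two questions on this slide can be run through a few lines of code. The sketch below assumes the standard form of the GCD test (a linear Diophantine equation has an integer solution exactly when the gcd of its coefficients divides the constant term) and ignores loop bounds entirely; the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Greatest common divisor of |a| and |b|; gcd(0, 0) is 0. */
static int gcd(int a, int b) {
    a = abs(a);
    b = abs(b);
    while (b != 0) { int t = a % b; a = b; b = t; }
    return a;
}

/* GCD test for the equation sum_k coeff[k] * x_k = rhs over the integers.
 * Returns false when independence is proven (no integer solution at all),
 * true when a dependence is still possible. Loop bounds are not consulted. */
bool gcd_test_may_depend(const int *coeff, size_t n, int rhs) {
    int g = 0;
    assert(coeff != NULL);
    for (size_t k = 0; k < n; k++) g = gcd(g, coeff[k]);
    if (g == 0) return rhs == 0;   /* every coefficient is zero */
    return rhs % g == 0;
}
```

For 3i-6x+3y = 2 the gcd is 3, which does not divide 2, so the accesses are independent. For 2i-4x+2y = 2 the gcd is 2, so the test cannot rule out a dependence.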

35 GCD vs ADA
ADA is clearly much more precise (exact)
What can ADA say for the following?
  for (i=1; i<N; i++)
    for (j=0; j<i*i; j++)
      A[i] = foo(A[i])...

36 Why is GCD Test Inexact?
When does the GCD test give a false positive?
  for (i=0; i<N; i++)
    for (j=N; j<M; j++)
      A[i] = A[j] + B[i][j];
What happens when GCD=1?
  GCD test: i = j
  a trivial solution exists
Main problem
  the space is completely unconstrained

37 Exact vs Exact
Array Dataflow Analysis
  “exact” dependence analysis
GCD Test
  inexact dependence test
Exact Dependence Tests
  no false positives/negatives
  do not necessarily give the producer

38 Banerjee Test [Banerjee 1976]
Making it slightly better
There may be a dependence if
  min(f(i,j)-g(x,y)) ≤ 0, and
  0 ≤ max(f(i,j)-g(x,y))
  for (i=0; i<N; i++)
    for (j=N; j<M; j++) {
      A[i] = A[j] + B[i][j];
    }
  min(i-j) = 0-(M-1) = 1-M
  max(i-j) = N-1-N = -1
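The min/max computation on this slide is plain interval arithmetic over the loop bounds. The following sketch assumes rectangular bounds where every variable ranges independently, which is the simple setting of the example; Range, affine_range and banerjee_may_depend are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Inclusive integer interval lo <= x <= hi. */
typedef struct { long lo, hi; } Range;

/* Range of c0 + sum_k coeff[k] * x_k when each x_k varies independently
 * over bounds[k]. A negative coefficient swaps which end gives the minimum. */
Range affine_range(const long *coeff, const Range *bounds, size_t n, long c0) {
    Range r = { c0, c0 };
    for (size_t k = 0; k < n; k++) {
        assert(bounds[k].lo <= bounds[k].hi);
        long a = coeff[k] * bounds[k].lo;
        long b = coeff[k] * bounds[k].hi;
        r.lo += (a < b) ? a : b;
        r.hi += (a < b) ? b : a;
    }
    return r;
}

/* Bounds check on h = f - g: a dependence is possible only if 0 lies in
 * the interval [min h, max h]. */
bool banerjee_may_depend(const long *coeff, const Range *bounds, size_t n,
                         long c0) {
    Range r = affine_range(coeff, bounds, n, c0);
    return r.lo <= 0 && 0 <= r.hi;
}
```

Taking N = 10 and M = 20 for the loop above, h = i - j has coefficients {1, -1} over 0 ≤ i ≤ 9 and 10 ≤ j ≤ 19, which gives the interval [-19, -1], i.e. [1-M, -1]. Zero is outside, so independence is proven where the GCD test could not.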

39 Banerjee Test
Intuition
  interval of 2 functions

40 Banerjee Test
Exact or Inexact?
Weakness?

41 What happens with 2D arrays?
  for (i=0; i<N; i++)
    A[i][j] = A[i+1][j+2];
How to formulate?
  given read: A[i][j] and write: A[x+1][y+2]
  for (i=0; i<N; i++)
    A[i][i] = A[i+1][i+2];
How to formulate?
  given read: A[i][i] and write: A[x+1][x+2]

42 Dimension-by-Dimension
Simple extension
  also called subscript-by-subscript
Given
  A[f_1(i_vec), f_2(i_vec), ..., f_n(i_vec)]
  B[g_1(j_vec), g_2(j_vec), ..., g_n(j_vec)]
Check feasibility of each equation separately:
  f_1 = g_1, f_2 = g_2, ..., f_n = g_n
  independence is proven as soon as one of them is infeasible

43 Limitations of Dim-by-Dim
Is there parallelism in this loop nest?
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] =... A[2*j][i] =...
“coupled” subscript
We need to check for feasibility of:
  f_1 = g_1 ∧ f_2 = g_2 ∧ ... ∧ f_n = g_n

44 Lambda Test [Li et al. 1989]
Multi-dimensional Banerjee
Given
  A[f_1(i_vec), f_2(i_vec), ..., f_n(i_vec)]
  B[g_1(j_vec), g_2(j_vec), ..., g_n(j_vec)]
Check

45 How to get Direction Vectors
Pick a direction vector and then test it!
  only test vectors relevant to legality
  testing lex. negative vectors can return true, but makes no sense
What makes sense for the following?
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] =... A[2*j][i] =...

46 Lambda Test
Let’s try [=,<]
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] =... A[2*j][i] =...

47 Lambda Test
Let’s try [=,<]
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] =... A[2*j][i] =...

48 Lambda Test
Let’s try [=,<]
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] =... A[2*j][i] =...
(Figure: axes i,i’ and j,j’ with the lines ψ1 and ψ2)

49 Lambda Test
Let’s try [=,<]
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] =... A[2*j][i] =...

50 Delta Test [Goff et al. 1991]
Further extensions for multiple indices
Pragmatic approach
  key observation: real programs are not that complicated when it comes to array accesses
1st step: classify array access (pairs)
  ZIV (Zero Index Variable) pair
  SIV (Single Index Variable) pair
  MIV (Multiple Index Variables) pair

51 Delta Test Classifications
ZIV
  e.g., A[N], A[10],...
  loop invariant
SIV
  e.g., A[i], A[j], A[i+2],...
  only one loop iterator
MIV
  e.g., A[i+j], A[2*i-j], A[i*j]...
  when two or more iterators are involved
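For affine subscripts, the classification amounts to counting which loop iterators appear in a subscript pair. The sketch below is an assumption-laden illustration rather than the algorithm from the paper: it represents each subscript by its iterator coefficients only, so non-affine forms such as A[i*j] are outside its scope, and the names are made up.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ZIV, SIV, MIV } SubscriptClass;

/* Classifies a pair of affine subscripts over n loop iterators.
 * f[k] and g[k] are the coefficients of iterator k in the two subscripts;
 * constant terms play no role. An iterator counts as used if it appears
 * in either subscript of the pair. */
SubscriptClass classify_pair(const int *f, const int *g, size_t n) {
    size_t used = 0;
    assert(f != NULL && g != NULL);
    for (size_t k = 0; k < n; k++)
        if (f[k] != 0 || g[k] != 0) used++;
    if (used == 0) return ZIV;
    return used == 1 ? SIV : MIV;
}
```

With iterators (i, j), the pair A[10] / A[N] has all-zero coefficients and is ZIV, A[i] / A[i+2] is SIV, and A[i+j] / A[2*i-j] is MIV. Note that A[i] paired with A[j] also counts as MIV here, since two iterators are involved.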

52 Array Access Patterns
What do they look like in “real-life”?
  1D, 2D, 3D+ arrays
  coupled, separate
  ZIV, SIV, MIV

53 Delta Test Algorithm
1. Classify accesses
2. Solve the easy cases
  if separable ZIV/SIV proves independence, done
3. Solve the harder cases
  BUT, some information from Step 2 is used
  constraint intersection/propagation

54 Constraint Intersection
It is sometimes easy to show that multiple constraints cannot be satisfied at the same time
If you have coupled SIV accesses
  e.g., A[i,i] = A[i+1, i+2]
  By analyzing each dim separately, you get i’ = i+1 and j’ = i+2
  But you also know that the valid space is i’=j’, i’=j’=i+c
  Intersecting everything gives the empty set

55 Constraint Propagation
Like intersection, SIV gives partial information
  e.g., A[i,i+j] = A[i+1, i+j]
  i’=i+1 is derived from the 1st dim
  you then substitute this into the 2nd dim
    the write A[i’,i’+j’] becomes A[i+1, i+1+j’]
  Reformulating the 2nd dim gives i+1+j’ = i+j
    which yields j’=j-1

56 Putting it All Together
Delta-Test aims to take advantage of various properties of how the code is written
  a collection of many small tricks
It is probably closer to what is in actual compilers than the polyhedral model

57 transition

58 Back to Array Dataflow Result
Another view: PRDG
  Polyhedral Reduced Dependence Graph
  reduced vs extended (recall L01)
Node: Statement domain
Edge: Dependence (domain + function)
  for (i=0; i<N; i++) {
    for (j=0; j<P; j++)
      S0: A[j] = B[j]+B[j+1];
    for (j=0; j<Q; j++)
      S1: B[j] = A[j];
  }
Nodes
  S0: 0≤i<N, 0≤j<P
  S1: 0≤i<N, 0≤j<Q

59 Polyhedral Objects
We will usually use ISL syntax
Set
  [params] -> { [indices] : constraints }
  [N,M]->{ [i,j] : 0<=i<N and 0<=j<M }
Relation
  [params] -> { [in] -> [out] : constraints }
  [N,M]->{ [i,j] -> [x,y] : x=i+1 }
Function is a special case of relation
  I often use ( → )

60 Additional Conventions
You can name each tuple
  Following are NOT equivalent
    [N,M] -> { S0[i,j] : 0<=i<N and 0<=j<M }
    [N,M] -> { S1[i,j] : 0<=i<N and 0<=j<M }
Index names DO NOT matter
  Following are equivalent
    [N,M] -> { [i,j] : 0<=i<N and 0<=j<M }
    [N,M] -> { [x,y] : 0<=x<N and 0<=y<M }
Names of parameters DO matter

61 Set vs Relations
They are not really different
  [N]->{ [i,j] -> [x,y] : i=x and j=y }
  [N]->{ [i,j,x,y] : i=x and j=y }
Mostly for convenience when representing program information
  Ex1. Dependence: S0[i,j] -> S1[i’,j’]
  Ex2. Array access: S0[i,j] -> A[i]

62 Matrix Representation
Polyhedral obj. are often encoded as matrices
  Ax + b ≥ 0
    A: linear part (matrix)
    x: indices (symbolic vector)
    b: constant (constant vector)
  Px + Ax + b to explicitly separate params
  Simply Ax + b for functions
Algebraic properties of A are often used

63 Matrix Form Example
  { [i,j] : 0≤i<10 and 0≤j<i }
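The matrix for this example did not survive in the transcript. One possible encoding in the Ax + b ≥ 0 form of the previous slide is sketched below, rewriting each strict inequality over the integers (i < 10 as -i + 9 ≥ 0, j < i as i - j - 1 ≥ 0). The row order and the names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ROWS 4
#define DIMS 2

/* Row r encodes the constraint A[r][0]*i + A[r][1]*j + b[r] >= 0:
 *   i >= 0,  -i + 9 >= 0,  j >= 0,  i - j - 1 >= 0                   */
static const int A[ROWS][DIMS] = { { 1, 0 }, { -1, 0 }, { 0, 1 }, { 1, -1 } };
static const int b[ROWS] = { 0, 9, 0, -1 };

/* True if the integer point (i, j) satisfies every row of A.x + b >= 0. */
bool in_set(int i, int j) {
    const int x[DIMS] = { i, j };
    assert(sizeof b / sizeof b[0] == ROWS);
    for (size_t r = 0; r < ROWS; r++) {
        int v = b[r];
        for (size_t c = 0; c < DIMS; c++) v += A[r][c] * x[c];
        if (v < 0) return false;
    }
    return true;
}
```

For instance, (1,0) and (9,8) are in the set, while (0,0), (5,5) and (10,0) are not.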

64 Integer Set Library
Tool for manipulating sets and relations
  mostly by Sven Verdoolaege
Kind of does everything now
  manipulating set/relation
  scheduling
  code generation
  PIP
  counting integer points
  ...

65 ISL Demo
Online interface
  http://www2.cs.kuleuven.be/cgi-bin/dtai/barvinok.cgi

66 PRDG Example (Dataflow Edge)
  for (i=0; i<N; i++) {
    for (j=0; j<P; j++)
      S0: A[j] = foo(A[j]);
    for (j=0; j<Q; j++)
      S1: A[j] = bar(A[j]);
  }
Edges
  S0[i,j]→S0[i+1,j] : j≥Q
  S0[i,j]→S1[i,j] : j<Q
  S1[i,j]→S0[i+1,j] : j<P
  S1[i,j]→S1[i+1,j] : j≥P

67 PRDG Example (Dep. Polyhedra)
  for (i=0; i<N; i++) {
    for (j=0; j<P; j++)
      S0: A[j] = foo(A[j]);
    for (j=0; j<Q; j++)
      S1: A[j] = bar(A[j]);
  }
Nodes
  S0: 0≤i<N, 0≤j<P
  S1: 0≤i<N, 0≤j<Q
Edges
  i’=i+1, j=j’ : j≥Q
  i’=i, j=j’ : j<Q
  i’=i+1, j=j’ : j<P
  i’=i+1, j=j’ : j’≥P

68 Uniform vs Affine Dependence
Uniform dependences
  constant offset: [i,j] → [i,j] + c
  can be described with distance vectors
Affine dependences
  any affine function: [i,j] → A.[i j] + b
  uniform when A = I
When do we need affine dependences?

69 PRDG + Expressions = ???
PRDG is an abstraction of dependences
  what each statement does is lost
You may want the expressions in your analysis
  typically when semantic properties are useful
Polyhedral Equational Model

70 Alpha Language
Equational Language
  or PRDG + Expressions
  or Systems of Affine Recurrence Equations
  or dynamic single assignment code
Basic structure
  declaration of the domain of equations
  affine equations that define the computation performed at each iteration point

71 Alpha Example
  for (i=1; i<=N; i++) {
    S0: A[i,i] = foo();
    for (j=i+1; j<=M; j++)
      S1: A[i,j] = A[i,j-1] * A[i,i];
  }
  S0 : [N,M] -> { [i] : 1<=i<=N }
  S1 : [N,M] -> { [i,j] : 1<=i<=N and i<j<=M }
  S0[i] = foo();
  S1[i,j] = case
    { : j=1} : A [i,j-1] * S0[i];
    { : j>1} : S1[i,j-1] * S0[i];
  esac;

72 Role of Alpha in this Course
Polyhedral Equational Model is not popular
  within the already niche Polyhedral Model
  I know A LOT about it because my advisor is the main guy working on it
It is a good IR to look at both dependences and expressions
  it is also suited for teaching some of the aspects
I sometimes use it to replace PRDG
  but keep in mind that it is a different view of it

73 Next Time
Transforming polyhedral representations
Tiling

