1 CS 201 Compiler Construction Array Dependence Analysis & Loop Parallelization.



2 Goal: Identify loops whose iterations can be executed in parallel on different processors of a shared-memory multiprocessor system.

Matrix Multiplication

    for I = 1 to n do              -- parallel
      for J = 1 to n do            -- parallel
        for K = 1 to n do          -- not parallel
          C[I,J] = C[I,J] + A[I,K]*B[K,J]
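One way to see why the I and J loops are marked parallel is to run the (I, J) iterations in an arbitrary order and check that the result does not change. The sketch below does this with plain Python lists; the helper name `matmul_in_order`, the matrix size, and the random test data are assumptions made for illustration, and 0-based indices replace the slide's 1-based ones.

```python
import random

def matmul_in_order(A, B, ij_order):
    """Multiply square matrices, visiting the (I, J) pairs in the given order."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i, j in ij_order:
        # The K loop stays sequential: every K iteration updates the same C[i][j].
        for k in range(n):
            C[i][j] = C[i][j] + A[i][k] * B[k][j]
    return C

n = 4
rng = random.Random(0)
A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
B = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

sequential = [(i, j) for i in range(n) for j in range(n)]
shuffled = sequential[:]
rng.shuffle(shuffled)  # any order of (I, J) iterations, as a parallel schedule might produce

C_seq = matmul_in_order(A, B, sequential)
C_shuf = matmul_in_order(A, B, shuffled)
print(C_seq == C_shuf)
```

Each (I, J) iteration touches only its own C[I,J], so reordering is harmless. All K iterations for a given (I, J) read and write the same C[I,J], which is why the slide marks the K loop as not parallel.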

3 Data Dependences

Flow Dependence (S1 δf S2): S2 reads the value that S1 writes.
    S1: X = …
    S2: … = X

Anti Dependence (S1 δa S2): S2 overwrites the value that S1 reads.
    S1: … = X
    S2: X = …

Output Dependence (S1 δo S2): S1 and S2 both write X.
    S1: X = …
    S2: X = …

In each dependence graph, an edge runs from S1 to S2 and is labeled δf, δa, or δo.

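4 Example: Data Dependences

    do I = 1, 40
      S1: A(I+1) = …
      S2: … = A(I-1)
    enddo
Flow dependence, S1 δf S2 (graph edge S1 → S2): the element S1 writes in iteration I is read by S2 in iteration I+2.

    do I = 1, 40
      S1: A(I-1) = …
      S2: … = A(I+1)
    enddo
Anti dependence, S2 δa S1 (graph edge S2 → S1): the element S2 reads in iteration I is overwritten by S1 in iteration I+2.

    do I = 1, 40
      S1: A(I+1) = …
      S2: A(I-1) = …
    enddo
Output dependence, S1 δo S2 (graph edge S1 → S2): the element S1 writes in iteration I is written again by S2 in iteration I+2.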
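The three classifications in this example can be checked by brute force. The following is a minimal sketch, not a practical dependence test: it lists every access to A in execution order and reports each pair of accesses to the same element where at least one is a write. The function name `classify` and the `(statement, kind, offset)` encoding are assumptions introduced for illustration.

```python
def classify(accesses, lo=1, hi=40):
    """accesses: list of (stmt, kind, offset) in textual order; kind is 'W' or 'R'.
    Each access touches A(I + offset) in iteration I."""
    # Execution-ordered trace of (time, statement, kind, element index).
    trace = []
    for i in range(lo, hi + 1):
        for pos, (stmt, kind, off) in enumerate(accesses):
            trace.append(((i, pos), stmt, kind, i + off))

    names = {('W', 'R'): 'flow', ('R', 'W'): 'anti', ('W', 'W'): 'output'}
    found = set()
    for t1, s1, k1, e1 in trace:
        for t2, s2, k2, e2 in trace:
            # Earlier access, same element, and at least one of the two is a write.
            if t1 < t2 and e1 == e2 and (k1, k2) in names:
                found.add((s1, names[(k1, k2)], s2))
    return found

loop1 = classify([('S1', 'W', +1), ('S2', 'R', -1)])
loop2 = classify([('S1', 'W', -1), ('S2', 'R', +1)])
loop3 = classify([('S1', 'W', +1), ('S2', 'W', -1)])
print(loop1, loop2, loop3)
```

Each result is a set of `(source, type, sink)` triples, so the second loop yields `('S2', 'anti', 'S1')`: the direction of the edge follows execution order, not textual order.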

5 Sets of Dependences

    do I = 1, 100
      S: A(I) = B(I+2) + 1
      T: B(I) = A(I-1) - 1
    enddo

Unrolled iterations:
    S(1): A(1) = B(3) + 1
    T(1): B(1) = A(0) - 1
    S(2): A(2) = B(4) + 1
    T(2): B(2) = A(1) - 1
    S(3): A(3) = B(5) + 1
    T(3): B(3) = A(2) - 1
    …
    S(100): A(100) = B(102) + 1
    T(100): B(100) = A(99) - 1

S δf T, due to A().
Set of iteration pairs associated with this dependence: {(i,j): j=i+1, 1<=i<=99}
Dependence distance: j-i = 1, constant in this case.
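The set of iteration pairs for the flow dependence on A() can be reproduced by direct enumeration. This is a small sketch under the slide's loop bounds; the variable names are illustrative only.

```python
LO, HI = 1, 100

pairs = []
for i in range(LO, HI + 1):        # iteration in which S writes A(i)
    for j in range(LO, HI + 1):    # iteration in which T reads A(j-1)
        same_element = (i == j - 1)
        s_runs_first = (i <= j)    # S precedes T within an iteration
        if same_element and s_runs_first:
            pairs.append((i, j))

distances = {j - i for i, j in pairs}
print(len(pairs), pairs[0], pairs[-1], distances)
```

The enumeration covers i from 1 to 99 and always gives j - i = 1, which matches the constant dependence distance stated on the slide.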

6 Nested Loops

    level 1: do I1 = 1, 100
    level 2:   do I2 = 1, 50
                 S: A(I1,I2) = A(I1,I2-1) + B(I1,I2)
               enddo
             enddo

The value computed by S in iteration (i1,i2) is the value used in iteration (j1,j2) when both refer to the same element: A(i1,i2) = A(j1,j2-1) iff i1=j1 and i2=j2-1.

S is flow dependent on itself at level 2 (the inner loop): for a fixed value of I1, the dependence exists between different iterations of the inner loop.

Iteration pairs: {((i1,i2),(j1,j2)): j1=i1, j2=i2+1, 1<=i1<=100, 1<=i2<=49}

7 Nested Loops Contd..

Iteration pairs: {((i1,i2),(j1,j2)): j1=i1, j2=i2+1, 1<=i1<=100, 1<=i2<=49}
Dependence distance vector: (j1-i1, j2-i2) = (0,1)
There is no dependence at level 1.
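The distance vector and the level of the dependence can be computed mechanically for the loop nest S: A(I1,I2) = A(I1,I2-1) + B(I1,I2). The sketch below records which iteration writes each element and then, for every read, subtracts the writer's iteration from the reader's. The helper `level` and the dictionary `writer_of` are hypothetical names, and "level" is taken here to mean the 1-based position of the first non-zero entry of the distance vector.

```python
from itertools import product

def level(distance):
    """1-based position of the first non-zero entry, or None if all entries are zero."""
    for depth, d in enumerate(distance, start=1):
        if d != 0:
            return depth
    return None

# Iteration (i1, i2) writes element A(i1, i2).
writer_of = {}
for i1, i2 in product(range(1, 101), range(1, 51)):
    writer_of[(i1, i2)] = (i1, i2)

# Iteration (j1, j2) reads element A(j1, j2 - 1).
distances = set()
for j1, j2 in product(range(1, 101), range(1, 51)):
    src = writer_of.get((j1, j2 - 1))
    if src is not None and src < (j1, j2):   # writer runs before the reader
        distances.add((j1 - src[0], j2 - src[1]))

print(distances, [level(d) for d in distances])
```

The first component of the only distance vector is zero, which is why the dependence is carried by the inner loop and the outer loop has none.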

8 Computing Dependences - Formulation

    level 1: do I1 = 1, 8
    level 2:   do I2 = max(I1-3,1), min(I1,5)
                 A(I1+1,I2+1) = A(I1,I2) + B(I1,I2)
               enddo
             enddo

Can we find iterations (i1,i2) & (j1,j2) such that
    i1+1 = j1;  i2+1 = j2
and
    1 <= i1 <= 8            1 <= j1 <= 8
    i1-3 <= i2 <= i1        j1-3 <= j2 <= j1
    i1-3 <= i2 <= 5         j1-3 <= j2 <= 5
    1 <= i2 <= i1           1 <= j2 <= j1
    1 <= i2 <= 5            1 <= j2 <= 5
and i1, i2, j1, j2 are integers?
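For a loop nest this small, the question can be answered by exhaustive search. The sketch below enumerates the iteration space given by the bounds max(I1-3,1) <= I2 <= min(I1,5) and keeps every iteration (i1,i2) whose partner (i1+1,i2+1) is also a valid iteration. The helper name `in_space` is an assumption for illustration; a real compiler would use a dependence test rather than enumeration.

```python
def in_space(i1, i2):
    """True if (i1, i2) is an iteration of the loop nest on the slide."""
    return 1 <= i1 <= 8 and max(i1 - 3, 1) <= i2 <= min(i1, 5)

space = [(i1, i2) for i1 in range(1, 9) for i2 in range(1, 6) if in_space(i1, i2)]

# A(i1+1, i2+1) written in (i1, i2) is read as A(j1, j2) in (j1, j2) = (i1+1, i2+1).
solutions = [((i1, i2), (i1 + 1, i2 + 1))
             for (i1, i2) in space
             if in_space(i1 + 1, i2 + 1)]

print(len(space), len(solutions), solutions[0])
```

A non-empty `solutions` list means integer solutions exist, so the loop nest carries a flow dependence with distance vector (1,1).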

9 Computing Dependences Contd..

Dependence testing is an integer programming problem ⇒ NP-complete.

Solutions trade off the speed of the solver against its accuracy.

False positives: imprecise tests may report dependences that do not actually exist. The conservative choice is to accept false positives but never miss a dependence.

Dependence tests: extreme value test; GCD test; Generalized GCD test; Lambda test; Delta test; Power test; Omega test; etc.
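As an illustration of a fast but imprecise test, the GCD test named in the list above rests on a standard number-theory fact: a linear equation a1*x1 + ... + an*xn = c with integer coefficients has an integer solution iff gcd(a1, ..., an) divides c. The sketch below is a minimal version that ignores loop bounds entirely, so a `True` answer may be a false positive. The function name `gcd_test` and the two sample equations are assumptions, not taken from the slides.

```python
from functools import reduce
from math import gcd

def gcd_test(coeffs, c):
    """Return True if a1*x1 + ... + an*xn = c may have an integer solution.

    Loop bounds are ignored, so True only means "cannot rule out a dependence".
    """
    g = reduce(gcd, (abs(a) for a in coeffs))
    if g == 0:               # all coefficients are zero
        return c == 0
    return c % g == 0

print(gcd_test([2, 4], 3))   # gcd 2 does not divide 3: no integer solution
print(gcd_test([2, 3], 4))   # gcd 1 divides 4: a solution may exist
```

A `False` result proves independence, which is the safe direction: the test never misses a dependence, it can only report one that the bounds would rule out.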

10 Extreme Value Test

An approximate test: it guarantees that a solution exists, but the solution may be a real solution rather than an integer solution.

Let f: R^n → R such that f is bounded on a set S contained in R^n.
Let b be a lower bound of f in S and B be an upper bound of f in S:
    b ≤ f(ā) ≤ B for any ā ∈ S
For a real number c, the equation f(ā) = c has real solutions in S iff b ≤ c ≤ B. This holds when b and B are the minimum and maximum of f on S and f is continuous on a connected S, as with a linear f over a region defined by loop bounds.

11 Extreme Value Test Contd..

Example: f: R^2 → R; f(x,y) = 2x + 3y
S = {(x,y): 0<=x<=1 & 0<=y<=1} contained in R^2
lower bound, b=0; upper bound, B=5

1. Is there a real solution to the equation 2x + 3y = 4? Yes. (0.5,1) is a solution; there are many others.
2. Is there an integer solution in S? No. f(0,0)=0; f(0,1)=3; f(1,0)=2; f(1,1)=5. For none of the integer points in S is f(x,y)=4.

If there are no real solutions, there are no integer solutions.
If there are real solutions, then integer solutions may or may not exist.
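For a linear function over a box, the extreme values follow directly from the signs of the coefficients: a positive coefficient takes its variable's lower bound for the minimum and upper bound for the maximum, and a negative coefficient does the opposite. The sketch below applies this to f(x,y) = 2x + 3y on the unit square and contrasts it with an exhaustive integer search. All function names are hypothetical and chosen for illustration.

```python
from itertools import product

def extreme_values(coeffs, bounds):
    """Min and max of sum(a*x) over a box; bounds is a list of (low, high) pairs."""
    lo = sum(a * (l if a >= 0 else h) for a, (l, h) in zip(coeffs, bounds))
    hi = sum(a * (h if a >= 0 else l) for a, (l, h) in zip(coeffs, bounds))
    return lo, hi

def may_have_real_solution(coeffs, bounds, c):
    """Extreme value test: True if c lies between the min and max of f."""
    lo, hi = extreme_values(coeffs, bounds)
    return lo <= c <= hi

def has_integer_solution(coeffs, bounds, c):
    """Exhaustive check over the integer points of the box (small boxes only)."""
    ranges = [range(l, h + 1) for l, h in bounds]
    return any(sum(a * x for a, x in zip(coeffs, pt)) == c
               for pt in product(*ranges))

coeffs, bounds = [2, 3], [(0, 1), (0, 1)]
print(extreme_values(coeffs, bounds))             # (0, 5)
print(may_have_real_solution(coeffs, bounds, 4))  # True
print(has_integer_solution(coeffs, bounds, 4))    # False
```

The last two lines disagree for c = 4, which is exactly the false positive described above: a real solution exists but no integer one does.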

12 Extreme Value Test Contd..

Example:
    DO I = 1, 10
      DO J = 1, 10
        A[10*I+J-5] = … A[10*I+J-10] …
      ENDDO
    ENDDO

    10*I1+J1-5 = 10*I2+J2-10
    ⇒ 10*I1-10*I2+J1-J2 = -5

f: R^4 → R; f(I1,I2,J1,J2) = 10*I1-10*I2+J1-J2
1<=I1,I2,J1,J2<=10; lower bound, b=-99; upper bound, B=+99
Since -99 <= -5 <= +99, a real solution exists, so the test reports a dependence.
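Because the extreme value test only establishes that a real solution exists, it is worth confirming for this example that integer solutions exist as well. The sketch below enumerates all (I1, I2, J1, J2) in the 10x10 iteration space and keeps those where the written subscript 10*I1+J1-5 equals the read subscript 10*I2+J2-10. This brute-force check is an illustration only and is not part of the test itself.

```python
N = 10

# Tuples are ordered (I1, I2, J1, J2), matching f(I1, I2, J1, J2) on the slide.
solutions = [
    (i1, i2, j1, j2)
    for i1 in range(1, N + 1)
    for i2 in range(1, N + 1)
    for j1 in range(1, N + 1)
    for j2 in range(1, N + 1)
    if 10 * i1 + j1 - 5 == 10 * i2 + j2 - 10
]

# Differences (I2 - I1, J2 - J1) between the two iterations of each solution.
differences = {(i2 - i1, j2 - j1) for (i1, i2, j1, j2) in solutions}

print(len(solutions) > 0, sorted(differences))
```

For instance, I1 = I2 = 1 with J1 = 1 and J2 = 6 makes both subscripts equal to 6, so here the dependence reported by the test is real and not a false positive.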

13 Extreme Value Test Contd..

14 Extreme Value Test Contd..

15 Extreme Value Test Contd..

16 Extreme Value Test Contd..

17 Nested Loops – Multidimensional Arrays

18 Nested Loops Contd..

