Download presentation
Presentation is loading. Please wait.
Published byDerick Lucas Blake Modified over 5 years ago
1
A Unified Framework for Schedule and Storage Optimization
William Thies, Frédéric Vivien*, Jeffrey Sheldon, and Saman Amarasinghe MIT Laboratory for Computer Science * ICPS/LSIIT, Université Louis Pasteur
2
Motivating Example for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
3
Motivating Example t = 1 (i, j) = (1, 1) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
4
Motivating Example t = 2 (i, j) = (1, 2) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
5
Motivating Example t = 3 (i, j) = (1, 3) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
6
Motivating Example t = 4 (i, j) = (1, 4) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
7
Motivating Example t = 5 (i, j) = (1, 5) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
8
Motivating Example t = 6 (i, j) = (2, 1) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
9
Motivating Example t = 7 (i, j) = (2, 2) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
10
Motivating Example t = 8 (i, j) = (2, 3) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
11
Motivating Example t = 9 (i, j) = (2, 4) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
12
Motivating Example t = 10 (i, j) = (2, 5) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
13
Motivating Example t = 25 (i, j) = (5, 5) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j
14
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion i j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
15
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 1
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 1 (i, j) = (1, 1) init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
16
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 2
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 2 (i, j) = {(1, 2), (2, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
17
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 3
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 3 (i, j) = {(1, 3), (2, 2), (3, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
18
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 4
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 4 (i, j) = {(1, 4), (2, 3), (3, 2), (4, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
19
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 5
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 5 (i, j) = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
20
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 6
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 6 (i, j) = {(2, 5), (3, 4), (4, 3), (5, 2)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
21
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 7
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 7 (i, j) = {(3, 5), (4, 4), (5, 3)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
22
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 8
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 8 (i, j) = {(4, 5), (5, 4)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
23
Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 9
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 9 (i, j) = (5, 5) init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j
24
Parallelism/Storage Tradeoff
Increasing storage can enable parallelism But storage can be expensive Phase ordering problem Optimizing for storage restricts parallelism Maximizing parallelism restricts storage options Too complex to consider all combinations Need efficient framework to integrate schedule and storage optimization Cache RAM Disk “efficient framework” means linear programming problem
25
Outline Abstract problem Simplifications Concrete problem
Solution Method Conclusions
26
Abstract Problem Given DAG of dependent operations
Must execute producers before consumers Must store a value until all consumers execute Two parameters control execution: 1. A scheduling function Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m
27
Abstract Problem We can ask three questions:
Two parameters control execution: 1. A scheduling function Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m
28
Abstract Problem We can ask three questions:
1. Given , what is the smallest m? Two parameters control execution: 1. A scheduling function Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m
29
Abstract Problem We can ask three questions:
1. Given , what is the smallest m? 2. Given m, what is the “best” ? Two parameters control execution: 1. A scheduling function Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m
30
Abstract Problem We can ask three questions:
1. Given , what is the smallest m? 2. Given m, what is the “best” ? 3. What is the smallest m that is valid for all legal ? Two parameters control execution: 1. A scheduling function Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m
31
Outline Abstract problem Simplifications Concrete problem
Solution Method Conclusions
32
Simplifying the Schedule
Real programs aren’t DAG’s Dependence graph is parameterized by loops Too many nodes to schedule Size could even be unknown (symbolic constants) Use classical solution: affine schedules Each statement has a scheduling function Each is an affine function of the enclosing loop counters and symbolic constants To simplify talk, ignore symbolic constants: (i) = B • i
33
Simplifying the Storage Mapping
Programs use arrays, not associative maps If size decreases, need to specify which elements are mapped to the same location
34
Simplifying the Storage Mapping
Programs use arrays, not associative maps If size decreases, need to specify which elements are mapped to the same location
35
Simplifying the Storage Mapping
Programs use arrays, not associative maps If size decreases, need to specify which elements are mapped to the same location ?
36
Simplifying the Storage Mapping
Occupancy Vectors (Strout et al.) Simplifying the Storage Mapping Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
37
Simplifying the Storage Mapping
Occupancy Vectors (Strout et al.) Simplifying the Storage Mapping Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
38
Simplifying the Storage Mapping
Occupancy Vectors (Strout et al.) Simplifying the Storage Mapping Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
39
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
40
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
41
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
42
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
43
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
44
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
45
Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i
46
Occupancy Vectors (Strout et al.) Simplifying the Store
For a given schedule, v is valid if semantics are unchanged using transformed array Shorter vectors require less storage v = (1, 1) j i
47
Outline Abstract problem Simplifications Concrete problem
Solution Method Conclusions
48
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? i j
49
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? i j
50
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
51
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
52
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
53
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
54
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
55
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
56
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
57
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
58
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
59
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
60
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
61
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
62
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j
63
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
64
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
65
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
66
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
67
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
68
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
69
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j
70
Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? ??? i j
71
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? i j
72
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
73
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
74
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
75
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
76
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
77
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
78
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2 i + j (inclusive) (i, j) = i (exclusive) i j
79
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
80
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
81
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
82
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
83
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
84
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
85
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
86
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
87
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
88
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
89
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
90
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
91
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
92
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
93
Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2 i + j i j
94
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? i j
95
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
96
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
97
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
98
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
99
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
100
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
101
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
102
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
103
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal i j
104
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal v = (2, 1) i j
105
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal v = (2, 1) i j
106
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal v = (2, 1) i j
107
Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal v = (2, 1) Def: v is an affine occupancy vector (AOV) i j
108
Outline Abstract problem Simplifications Concrete problem
Solution Method Conclusions
109
Schedule Constraint (i) (h(i)) + 1
Schedule Constraints Dependence analysis yields: iteration i depends on iteration h(i) h is an affine function Consumer must execute after producer i2 be sure to make it clear that this is classical result that audience should be familiar with Schedule Constraint (i) (h(i)) + 1 i h(i) i1
110
Storage Constraints i2 i h(i) i1
111
Storage Constraints dynamic single assignment
v dynamic single assignment for i = 1 to n for j = 1 to n A[i][j] = … B[i][j] = … v i2 i h(i) i1
112
Storage Constraint (i) (h(i) + v)
Storage Constraints i2 h(i).+.v Consumer: i Producer: h(i) Overwriting producer: h(i) + v Consumer: i Producer: h(i) Consumer must execute before producer is overwritten v i h(i) i1 Storage Constraint (i) (h(i) + v)
113
The Constraints A given (, v) combination is valid if
For all dependences h, For all iterations i in the program: (i) (h(i)) + 1 schedule constraint (i) (h(i) + v) storage constraint Given , want to find v satisfying constraints Might look simple, but it is not Too many i’s and n’s to enumerate! Need to reduce the number of constraints
114
The Constraints A given (, v) combination is valid if
For all dependences h, For all iterations i in the program: (i) (h(i)) + 1 schedule constraint (i) (h(i) + v) storage constraint Given , want to find v satisfying constraints Might look simple, but it is not Too many i’s to enumerate! Need to reduce the number of constraints “so we’ve expressed what makes a given combination valid, but we’d like to find v satisfying constraints…”
115
The Vertex Method (1-D) An affine function is non-negative within an interval [x1, x2] iff it is non-negative at x1 and x2 x1 x2
116
The Vertex Method (1-D) An affine function is non-negative over an unbounded interval [x1, ) iff it is non-negative at x1 and is non-decreasing along the interval x1
117
The Vertex Method The same result holds in higher dimensions
An affine function is nonnegative over a bounded polyhedron D iff it is nonnegative at vertices of D mention that a polyhedron is just a region bounded by planes; everything we encounter here is a polyhedron
118
Applying the Method (Quinton87)
Recall the storage constraints For all iterations i in the program: (i) (h(i) + v) Re-arrange: 0 (h(i) + v) - (i) The right hand side is: 1. An affine function of i 2. Nonnegative over the domain D of iterations We can apply the vertex method
119
Applying the Method Replace i with the vertices w of its domain:
(h(i) + v) - (i) iD, (h(i) + v) - (i) 0 (h(w1) + v) - (w1) 0 (h(w2) + v) - (w2) 0 (h(w3) + v) - (w3) 0 (h(w4) + v) - (w4) 0 be sure to say where w1 through w4 come from w3 w4 i2 i1 w1 w2 iteration space
120
The Reduced Constraints
Apply same method to schedule constraints Now a given (, v) combination is valid if For all dependences h, For all vertices w of the iteration domain: (w) (h(w)) schedule constraint (w) (h(w) + v) storage constraint These are linear constraints and v are variables; h and w are constants Given , constraints are linear in v (& vice-versa)
121
Answering the Questions
(w) (h(w)) schedule constraint (w) (h(w) + v) storage constraint (w(z)) (h(w(z))) schedule constraint (h(w(z)) + v) (w(z)) storage constraint 1. Given , we can “minimize” |v| - Linear programming problem 2. Given v, we can find a “good” - Feautrier, 1992 3. To find an AOV... still too many constraints! - For all satisfying the schedule constraints: v must satisfy the storage constraints
122
Finding an AOV Apply the vertex method again!
(w) (h(w)) schedule constraint (w) (h(w) + v) storage constraint (w(z)) (h(w(z))) schedule constraint (h(w(z)) + v) (w(z)) storage constraint Apply the vertex method again! Schedule constraints define domain of valid Storage constraints can be written as a non-negative affine function of components of : Expand (i) = B • i B • w B • (h(w) + v) Simplify (h(w) + v – w) • B 0
123
Finding an AOV Our constraints are now as follows:
For all dependences h, For all vertices w of the iteration domain, For all vertices t of the space of valid schedules: t • w t • (h(w) + v) AOV constraint Can find “shortest” AOV with linear program Finite number of constraints h, w, and t are known constants
124
The Big Picture Input program dependence analysis Affine Dependences
Schedule & Storage Constraints vertex method linear program Constraints without i Given , find v Given v, find vertex method linear program Constraints without Find an AOV, valid for all
125
Details in Paper Symbolic constants
Inter-statement dependences across loops Farkas’ Lemma for improved efficiency
126
Related Work Universal Occupancy Vector (Strout et al.)
Valid for all schedules, not just affine ones Stencil of dependences in single loop nest Storage for ALPHA programs (Quilleré, Rajopadhye, Wilde) Polyhedral model, with occupancy vector analog Assume schedule is given PAF compiler (Cohen, Lefebvre, Feautrier) Minimal expansion scheduling contraction Storage mapping A[i mod x][j mod y] strout only answered 3rd question, with branch and bound algorithm the cohen/lefebvre/feautrier team is at versailles university in france
127
Future Work Allow affine left hand side references
A[2j][n-i] = … Consider multi-dimensional time schedules Collapse multiple dimensions of storage
128
Conclusions Unified framework for determining:
1. A good storage mapping for a given schedule 2. A good schedule for a given storage mapping 3. A good storage mapping for all valid schedules Take away: representations and techniques Occupancy vectors Affine schedules Vertex method
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.