A Unified Framework for Schedule and Storage Optimization

A Unified Framework for Schedule and Storage Optimization
William Thies, Frédéric Vivien*, Jeffrey Sheldon, and Saman Amarasinghe MIT Laboratory for Computer Science * ICPS/LSIIT, Université Louis Pasteur

Motivating Example for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

Motivating Example t = 1 (i, j) = (1, 1) j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j

Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion i j
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

Motivating Example t = 25 (i, j) = (5, 5) j Array Expansion t = 1
for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 1 (i, j) = (1, 1) init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 2 (i, j) = {(1, 2), (2, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 3 (i, j) = {(1, 3), (2, 2), (3, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 4 (i, j) = {(1, 4), (2, 3), (3, 2), (4, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 5 (i, j) = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 6 (i, j) = {(2, 5), (3, 4), (4, 3), (5, 2)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 7 (i, j) = {(3, 5), (4, 4), (5, 3)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 8 (i, j) = {(4, 5), (5, 4)} init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

for i = 1 to n for j = 1 to n A[j] = f(A[j], A[j-2]) j Array Expansion t = 9 (i, j) = (5, 5) init A[0][j] for i = 1 to n for j = 1 to n A[i][j] = f(A[i-1][j], A[i][j-2]) i j

Parallelism/Storage Tradeoff
Increasing storage can enable parallelism But storage can be expensive Phase ordering problem Optimizing for storage restricts parallelism Maximizing parallelism restricts storage options Too complex to consider all combinations Need efficient framework to integrate schedule and storage optimization Cache RAM Disk “efficient framework” means linear programming problem

Outline Abstract problem Simplifications Concrete problem
Solution Method Conclusions

Abstract Problem Given DAG of dependent operations
Must execute producers before consumers Must store a value until all consumers execute Two parameters control execution: 1. A scheduling function  Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m

Abstract Problem We can ask three questions:
Two parameters control execution: 1. A scheduling function  Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m

1. Given , what is the smallest m? Two parameters control execution: 1. A scheduling function  Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m

1. Given , what is the smallest m? 2. Given m, what is the “best” ? Two parameters control execution: 1. A scheduling function  Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m

1. Given , what is the smallest m? 2. Given m, what is the “best” ? 3. What is the smallest m that is valid for all legal ? Two parameters control execution: 1. A scheduling function  Maps each operation to execution time Parallelism is implicit 2. A fully associative store of size m

Simplifying the Schedule
Real programs aren’t DAG’s Dependence graph is parameterized by loops Too many nodes to schedule Size could even be unknown (symbolic constants) Use classical solution: affine schedules Each statement has a scheduling function  Each  is an affine function of the enclosing loop counters and symbolic constants To simplify talk, ignore symbolic constants: (i) = B • i

Simplifying the Storage Mapping
Programs use arrays, not associative maps If size decreases, need to specify which elements are mapped to the same location

Programs use arrays, not associative maps If size decreases, need to specify which elements are mapped to the same location ?

Occupancy Vectors (Strout et al.) Simplifying the Storage Mapping Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i

Occupancy Vectors (Strout et al.) Simplifying the Store
Specifies unit of overwriting within an array Locations collapsed if separated by a multiple of v v = (1, 1) j i

Occupancy Vectors (Strout et al.) Simplifying the Store
For a given schedule, v is valid if semantics are unchanged using transformed array Shorter vectors require less storage v = (1, 1) j i

Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? i j

Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Solution: v = (1, 1) i j

Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? i j

Answering Question #1 Given (i, j) = i + j, what is the shortest valid occupancy vector v? Why not v = (0, 1)? ??? i j

Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? i j

Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? (i, j) is between: (i, j) = 2  i + j (inclusive) (i, j) = i (exclusive) i j

Answering Question #2 Given v = (0, 1), what is the range of valid schedules ? Lets try (i, j) = 2  i + j i j

Answering Question #3 What is the shortest v that is valid for all legal affine schedules? i j

Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal  i j

Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal  v = (2, 1) i j

Answering Question #3 What is the shortest v that is valid for all legal affine schedules? Range of legal  v = (2, 1) Def: v is an affine occupancy vector (AOV) i j

Schedule Constraint (i)  (h(i)) + 1
Schedule Constraints Dependence analysis yields: iteration i depends on iteration h(i) h is an affine function Consumer must execute after producer i2 be sure to make it clear that this is classical result that audience should be familiar with Schedule Constraint (i)  (h(i)) + 1 i h(i) i1

Storage Constraints i2 i h(i) i1

Storage Constraints dynamic single assignment
v dynamic single assignment for i = 1 to n for j = 1 to n A[i][j] = … B[i][j] = … v i2 i h(i) i1

Storage Constraint (i)  (h(i) + v)
Storage Constraints i2 h(i).+.v Consumer: i Producer: h(i) Overwriting producer: h(i) + v Consumer: i Producer: h(i) Consumer must execute before producer is overwritten v i h(i) i1 Storage Constraint (i)  (h(i) + v)

The Constraints A given (, v) combination is valid if
For all dependences h, For all iterations i in the program: (i)  (h(i)) + 1 schedule constraint (i)  (h(i) + v) storage constraint Given , want to find v satisfying constraints Might look simple, but it is not Too many i’s and n’s to enumerate! Need to reduce the number of constraints

The Constraints A given (, v) combination is valid if
For all dependences h, For all iterations i in the program: (i)  (h(i)) + 1 schedule constraint (i)  (h(i) + v) storage constraint Given , want to find v satisfying constraints Might look simple, but it is not Too many i’s to enumerate! Need to reduce the number of constraints “so we’ve expressed what makes a given combination valid, but we’d like to find v satisfying constraints…”

The Vertex Method (1-D) An affine function is non-negative within an interval [x1, x2] iff it is non-negative at x1 and x2 x1 x2

The Vertex Method (1-D) An affine function is non-negative over an unbounded interval [x1, ) iff it is non-negative at x1 and is non-decreasing along the interval x1

The Vertex Method The same result holds in higher dimensions
An affine function is nonnegative over a bounded polyhedron D iff it is nonnegative at vertices of D mention that a polyhedron is just a region bounded by planes; everything we encounter here is a polyhedron

Applying the Method (Quinton87)
Recall the storage constraints For all iterations i in the program: (i)  (h(i) + v) Re-arrange: 0  (h(i) + v) - (i) The right hand side is: 1. An affine function of i 2. Nonnegative over the domain D of iterations We can apply the vertex method

Applying the Method Replace i with the vertices w of its domain:
(h(i) + v) - (i) iD, (h(i) + v) - (i)  0 (h(w1) + v) - (w1)  0 (h(w2) + v) - (w2)  0 (h(w3) + v) - (w3)  0 (h(w4) + v) - (w4)  0 be sure to say where w1 through w4 come from w3 w4 i2 i1 w1 w2 iteration space

The Reduced Constraints
Apply same method to schedule constraints Now a given (, v) combination is valid if For all dependences h, For all vertices w of the iteration domain: (w)  (h(w)) schedule constraint (w)  (h(w) + v) storage constraint These are linear constraints  and v are variables; h and w are constants Given , constraints are linear in v (& vice-versa)

Answering the Questions
(w)  (h(w)) schedule constraint (w)  (h(w) + v) storage constraint (w(z))  (h(w(z))) schedule constraint (h(w(z)) + v)  (w(z)) storage constraint 1. Given , we can “minimize” |v| - Linear programming problem 2. Given v, we can find a “good”  - Feautrier, 1992 3. To find an AOV... still too many constraints! - For all  satisfying the schedule constraints: v must satisfy the storage constraints

Finding an AOV Apply the vertex method again!
(w)  (h(w)) schedule constraint (w)  (h(w) + v) storage constraint (w(z))  (h(w(z))) schedule constraint (h(w(z)) + v)  (w(z)) storage constraint Apply the vertex method again! Schedule constraints define domain of valid  Storage constraints can be written as a non-negative affine function of components of : Expand (i) = B • i B • w  B • (h(w) + v) Simplify (h(w) + v – w) • B  0

Finding an AOV Our constraints are now as follows:
For all dependences h, For all vertices w of the iteration domain, For all vertices t of the space of valid schedules: t • w  t • (h(w) + v) AOV constraint Can find “shortest” AOV with linear program Finite number of constraints h, w, and t are known constants

The Big Picture Input program dependence analysis Affine Dependences
Schedule & Storage Constraints vertex method linear program Constraints without i Given , find v Given v, find  vertex method linear program Constraints without  Find an AOV, valid for all 

Details in Paper Symbolic constants
Inter-statement dependences across loops Farkas’ Lemma for improved efficiency

Related Work Universal Occupancy Vector (Strout et al.)
Valid for all schedules, not just affine ones Stencil of dependences in single loop nest Storage for ALPHA programs (Quilleré, Rajopadhye, Wilde) Polyhedral model, with occupancy vector analog Assume schedule is given PAF compiler (Cohen, Lefebvre, Feautrier) Minimal expansion  scheduling  contraction Storage mapping A[i mod x][j mod y] strout only answered 3rd question, with branch and bound algorithm the cohen/lefebvre/feautrier team is at versailles university in france

Future Work Allow affine left hand side references
A[2j][n-i] = … Consider multi-dimensional time schedules Collapse multiple dimensions of storage

Conclusions Unified framework for determining:
1. A good storage mapping for a given schedule 2. A good schedule for a given storage mapping 3. A good storage mapping for all valid schedules Take away: representations and techniques Occupancy vectors Affine schedules Vertex method

A Unified Framework for Schedule and Storage Optimization

Similar presentations

Presentation on theme: "A Unified Framework for Schedule and Storage Optimization"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

A Unified Framework for Schedule and Storage Optimization

Similar presentations

Presentation on theme: "A Unified Framework for Schedule and Storage Optimization"— Presentation transcript:

Similar presentations

About project

Feedback