Published by Laureen Harmon. Modified over 9 years ago.
1
A Framework for Reasoning About Inherent Parallelism in Modern Object-Oriented Languages
Presented by A. Craik (5-Jan-12)
Research supported by funding from Microsoft Research and the Queensland State Government
2
Introduction
[Diagram labels: Procedural Algorithm; Sequential Implementation; Dependency Analysis; Sequential Implementation w/ Injected Parallelism; Semantic Analysis; Parallel Algorithm; Explicitly Parallel Implementation]
3
Introduction
Inherent Parallelism:
    a = 1; b = 2; c = a + b;
    for (int i=0; i<max; ++i) a[i] = a[i] + 1;
Three steps for finding & exploiting:
1. Find the inherent parallelism in the program
2. Decide which inherent parallelism is worth exploiting
3. Choose an implementation technology to expose the selected parallelism
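As a minimal sketch of why the loop above is inherently parallel: each iteration reads and writes only its own a[i], so the iterations can run in any order and still produce the sequential result. The class and method names below are hypothetical, and Java parallel streams are used only as one possible implementation technology; the slides do not prescribe one.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class InherentParallelismSketch {
    // Sequential form of the slide's loop: a[i] = a[i] + 1.
    static int[] incrementSequential(int[] input) {
        int[] a = Arrays.copyOf(input, input.length);
        for (int i = 0; i < a.length; ++i) {
            a[i] = a[i] + 1;
        }
        return a;
    }

    // Parallel form: iteration i touches only a[i], so no two iterations
    // share a location and the order of execution does not matter.
    static int[] incrementParallel(int[] input) {
        int[] a = Arrays.copyOf(input, input.length);
        IntStream.range(0, a.length).parallel().forEach(i -> a[i] = a[i] + 1);
        return a;
    }
}
```

In the straight-line example, a = 1 and b = 2 are likewise independent of each other, while c = a + b must wait for both.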
4
Introduction
Dependencies impose ordering constraints
Sequential consistency required
Two forms
– Control – which statements will run
– Data – reads & writes of shared state
Control well studied and easier to handle inter-procedurally
– Example, Java checked exceptions
5
Data Dependencies
Flow Dependence (Read-After-Write): int a = 1; a = 2; int b = a + 1;
Output Dependence (Write-After-Write): int a = 1; a = 4; a = 5;
Anti-Dependence (Write-After-Read): int a = 1; int b = a + 1; a = 2;
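The three kinds of data dependence can be sketched as set intersections over what an earlier and a later statement read and write, using the standard definitions (flow: read after write; anti: write after read; output: write after write). The Stmt summary class and all names below are hypothetical, introduced only for illustration.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

class DataDependenceSketch {
    enum Dependence { FLOW, ANTI, OUTPUT }

    // Hypothetical summary of one statement: the variables it reads and writes.
    static final class Stmt {
        final Set<String> reads;
        final Set<String> writes;

        Stmt(Set<String> reads, Set<String> writes) {
            this.reads = reads;
            this.writes = writes;
        }
    }

    // Dependences between two statements, where `first` precedes `second` in program order.
    static EnumSet<Dependence> classify(Stmt first, Stmt second) {
        EnumSet<Dependence> found = EnumSet.noneOf(Dependence.class);
        if (!Collections.disjoint(first.writes, second.reads)) {
            found.add(Dependence.FLOW);   // second reads what first wrote
        }
        if (!Collections.disjoint(first.reads, second.writes)) {
            found.add(Dependence.ANTI);   // second overwrites what first read
        }
        if (!Collections.disjoint(first.writes, second.writes)) {
            found.add(Dependence.OUTPUT); // both write the same variable
        }
        return found;
    }
}
```

For the pair `int b = a + 1; a = 2;` the first statement reads a and writes b, the second writes a, so the only dependence reported is ANTI.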
6
Traditional Approach
for (int i=0; i < 3; ++i) {
    for (int j=0; j < i+1; ++j) {
        a[i,j] = b[i,j] + c[i,j];
        b[i,j] = a[i,j+1];
    }
}
Pair-wise analysis of statements and expressions
Can a, b or c refer to the same array?
7
Traditional Approach
for (int i=0; i < 3; ++i) {
    for (int j=0; j < i+1; ++j) {
        a[i,j] = b[i,j] + c[i,j];
        b[i,j] = a.readsIandJInc(i,j);
    }
}
What does a.readsIandJInc(i,j) do?
Examine ALL possible implementations!
8
Side-Effects
class Holder { public static int value; }
class Array {
    // Reads only: no side effects
    public int readsIandJInc(int i, int j) { return this[i,j+1]; }
}
9
Side-Effects
class Holder { public static int value; }
class Array {
    public int readsIandJInc(int i, int j) {
        this[0,0] = i + j;  // side effect: writes the array's own state
        return this[i,j];
    }
}
10
Side-Effects
class Holder { public static int value; }
class Array {
    public int readsIandJInc(int i, int j) {
        Holder.value++;  // side effect: writes static state outside the array
        return this[i,j];
    }
}
11
Limitations of Current Techniques
[Diagram: Traditional Approach vs. My Approach; labels: Kernels, Less precise, Inter-procedural]
Traditional:
– Focused on analyzing complex tight loops
– Poor abstraction and composition
– Too complex for programmers to use without tool support
12
The Idea
Goal:
– Simplify inter-procedural dependency analysis
Idea:
– Ensure safety
– Make reasoning modular and composable
13
The Idea
Specify effects on method signature:
    public int getReads() reads <…> writes <…>
What goes in the angle brackets?
– Abstract effect description
– Composable descriptions
– Verifiable
14
The Idea
15
Object-Orientation
Encapsulation; representation hierarchy
[Diagram: Person with fields name (String), dateOfBirth (Date), employer (Company)]
16
The Idea
17
Safe Parallelism
Can 2 arbitrary pieces of code execute in parallel safely?
Type rules specify computation of effect sets
Look for overlaps in the read & write effect sets to find possible data dependencies
    Block 1 { ... } reads <…> writes <…>
    Block 2 { ... } reads <…> writes <…>
18
Dependencies using Effect Sets
A dependency exists where two triangles of representation overlap
Triangles can only be nested, so the overlap test becomes a check for a parent-child relationship; disjointness means no dependency
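A minimal sketch of the overlap test, assuming contexts are modeled as nodes of a tree with a pointer to their owner. Because representation triangles can only nest, two contexts overlap exactly when one is an ancestor of (or equal to) the other. The Context class and the method names are hypothetical, not taken from the author's system.

```java
class ContextTreeSketch {
    // Hypothetical ownership context: a node in a tree rooted at "world".
    static final class Context {
        final String name;
        final Context owner; // null only for the root context

        Context(String name, Context owner) {
            this.name = name;
            this.owner = owner;
        }

        // True if this context is `other` itself or one of its ancestors.
        boolean contains(Context other) {
            for (Context c = other; c != null; c = c.owner) {
                if (c == this) {
                    return true;
                }
            }
            return false;
        }
    }

    // Triangles overlap only when one is nested inside the other.
    static boolean overlap(Context a, Context b) {
        return a.contains(b) || b.contains(a);
    }

    // Disjoint contexts cannot give rise to a data dependency.
    static boolean disjoint(Context a, Context b) {
        return !overlap(a, b);
    }
}
```

With a world context owning a person and a company, and the person owning its name, overlap(person, name) holds while name and company are disjoint.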
19
Types of Parallelism
Task Parallelism
– Run 2+ separate operations at the same time
Loop Parallelism
– Execute loop iterations in parallel
Pipeline Parallelism
– Stage loop body execution so that iteration execution overlaps safely
20
Task Parallelism
class Demo {
    void op1() reads <…> writes <…> {…}
    void op2() reads <…> writes <…> {…}
}
Can we execute calls to op1 and op2 in parallel?
Determine the overlap in the effect sets; no overlap means no data dependencies
Realization using one-way calls or futures
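A hedged sketch of the task-parallel decision, assuming each operation comes with a declared effect summary. For simplicity the effects here are flat sets of context names and overlap is plain set intersection; in the ownership-based system overlap would also account for nested contexts. The Effects class, independent, and runBoth are hypothetical names, and CompletableFuture stands in for the futures the slide mentions.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

class TaskParallelismSketch {
    // Hypothetical effect summary: names of the contexts an operation reads and writes.
    static final class Effects {
        final Set<String> reads;
        final Set<String> writes;

        Effects(Set<String> reads, Set<String> writes) {
            this.reads = reads;
            this.writes = writes;
        }
    }

    // No data dependence if neither operation writes what the other reads or writes.
    static boolean independent(Effects a, Effects b) {
        return Collections.disjoint(a.writes, b.writes)
            && Collections.disjoint(a.writes, b.reads)
            && Collections.disjoint(a.reads, b.writes);
    }

    // Runs op1 as a future alongside op2 when the effects allow it; otherwise in order.
    static void runBoth(Runnable op1, Effects e1, Runnable op2, Effects e2) {
        if (independent(e1, e2)) {
            CompletableFuture<Void> pending = CompletableFuture.runAsync(op1);
            op2.run();
            pending.join(); // wait for op1 before continuing
        } else {
            op1.run();
            op2.run();
        }
    }
}
```

Two operations that both only read the same context are treated as independent, since shared reads impose no ordering.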
21
Loop Parallelism Conditions
Data parallel loops are a major source of parallelism in imperative programs
Start with a simple data parallel loop in the form of a foreach loop:
    foreach (T element in collection) element.operation();
22
Foreach Loop Conditions
Condition 1: The areas holding the representations of the objects returned by the enumerator are all disjoint from one another
23
Foreach Loop Conditions
Condition 2: The operation only mutates the representation of its “own” element and does not read the state owned by any of the other elements
24
Foreach Loop Conditions
Condition 3: There are no control dependencies which would prevent loop parallelization
25
Arbitrary Loop Bodies
So far we have looked at
    foreach (T element in collection) element.operation();
Question: How do we generalize this to an arbitrary loop body?
    foreach (T element in collection) {
        //sequence of statements
        //including local var defs
        //and a read of a context r
    }
26
Loop Body Rewriting
Loop becomes:
    foreach (T elem in collection) elem.loopBody(this);
Where loopBody is:
    class T {
        void loopBody(Foo me) {
            //same sequence of statements
            //replace all elem by this
            //and all this by me
        }
    }
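The rewriting can be illustrated with a small concrete instance. The loop body below, the scale field standing in for the context r that the body reads, and the class names Foo and Item are all assumptions made for the example; only the shape of the transformation follows the slide.

```java
import java.util.List;

class LoopBodyRewriteSketch {
    // Hypothetical enclosing class (the slide's Foo); `scale` plays the role of context r.
    static final class Foo {
        final int scale;

        Foo(int scale) {
            this.scale = scale;
        }

        // Original form: the body mentions both `elem` and `this`.
        void original(List<Item> collection) {
            for (Item elem : collection) {
                int delta = this.scale * 2; // local variable definition
                elem.value = elem.value + delta;
            }
        }

        // Rewritten form: the body is a single call on the element.
        void rewritten(List<Item> collection) {
            for (Item elem : collection) {
                elem.loopBody(this);
            }
        }
    }

    // Hypothetical element class (the slide's T).
    static final class Item {
        int value;

        Item(int value) {
            this.value = value;
        }

        // Same statements, with `elem` replaced by `this` and `this` replaced by `me`.
        void loopBody(Foo me) {
            int delta = me.scale * 2;
            this.value = this.value + delta;
        }
    }
}
```

After the rewrite the loop has the element.operation() shape, so the three foreach conditions can be checked against the effects of loopBody.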
27
Object-Orientation
Encapsulation; representation hierarchy
[Diagram: Person with fields name (String), dateOfBirth (Date), employer (Company)]
28
Ownership Types
Designed to enforce encapsulation
Adapted to validate encapsulation
Type parameters to capture memory referencing permissions
    class Person[o,c] {
        private String|this| Name;
        private Date|this| DateOfBirth;
        private Company|c| Employer;
        …
    }
29
Ownerships & Effects
class Company[o] {
    public string name;
    …
}
class Person[o,c] {
    private Company|c| Employer;
    public string employerName() reads <…> writes <> {
        return Employer.name;
    }
    …
}
30
Contexts and Dependencies
Analyze & apply sufficient conditions
The relations between all pairs of contexts need to be known
Need some basis for believing that the relationships between contexts hold
31
Reasons for a Runtime System
Statically know some relationships
– The owner of an object is a parent of the object’s this context
– The world context is a parent of all contexts
Relationship may only be known dynamically
Optionally track at runtime to allow runtime conditions
32
Conditional Parallelism
Source loop:
    for (T e in collection) { e.operation(arguments); }
disjoint(r,c) always true:
    parallel for (T e in collection) { e.operation(arguments); }
disjoint(r,c) always false:
    serial for (T e in collection) { e.operation(arguments); }
disjoint(r,c) unknown:
    if (disjoint(r,c)) { parallel version } else { sequential version }
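The three cases can be sketched as a single dispatch routine. This is an illustration only: a compiler would emit just the version it needs, whereas the sketch chooses at run time. The StaticAnswer enum, the forEach helper, and the use of a BooleanSupplier for the runtime disjointness test are hypothetical names and choices, not part of the system described in the slides.

```java
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

class ConditionalParallelismSketch {
    // Hypothetical three-valued result of the compile-time disjointness analysis.
    enum StaticAnswer { ALWAYS_TRUE, ALWAYS_FALSE, UNKNOWN }

    static <T> void forEach(List<T> collection,
                            Consumer<T> operation,
                            StaticAnswer disjointStatically,
                            BooleanSupplier disjointAtRuntime) {
        boolean parallel;
        switch (disjointStatically) {
            case ALWAYS_TRUE:
                parallel = true;   // parallel version, no runtime check needed
                break;
            case ALWAYS_FALSE:
                parallel = false;  // serial version, no runtime check needed
                break;
            default:
                parallel = disjointAtRuntime.getAsBoolean(); // decide dynamically
                break;
        }
        if (parallel) {
            collection.parallelStream().forEach(operation);
        } else {
            collection.forEach(operation);
        }
    }
}
```

The runtime test is consulted only in the UNKNOWN case, which mirrors the slide's if (disjoint(r,c)) guard around the two loop versions.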
33
Reasons for a Runtime System
We do not know the relationships between all contexts at compile time.
They may vary from one object or method invocation to another
Reasons:
– Separate Compilation
– Dynamic Linking
– Complex Data Flows
34
Reasons for a Runtime System
Type system provides support for specifying context relationships which the programmer asserts must be true
    void oper1[r]() reads <…> writes <…> where r # c {
        …
        foreach (T|c| elem in collection) {…}
        …
    }
35
Runtime System Implementation
Naïve implementation – each object keeps a pointer to its owner
36
[Diagram: structure of the soundness proofs; node labels listed below]
AFJO Soundness: Subject Reduction; Progress
Effect Soundness; Effect Completeness
Owner Invariance; Contexts form a Tree; Cast Safety
Context Disjointness Implies Effect Disjointness; Disjoint effects imply no data dependencies
Update Dependency Preservation; Sufficient for Parallelization; Sequential Consistency
Task Parallelism Sufficient Conditions; Data Parallelism Sufficient Conditions; Pipeline Parallelism Sufficient Conditions
Disjointness Test Correct; Static Context Relations; Well Formed Heap; Context Parameters do not survive
37
Implementation – Zal
Added my system to C# 3.5
Extended GPC# compiler
Added infrastructure to support arbitrary type parameters
Implemented runtime ownership tracking system (~1,000 lines)

Metric   Total    GPC#     Extensions   Extensions (% Total)
SLOC-P   39,444   27,888   12,156       30.8%
SLOC-L   22,201   14,957   7,244        32.7%
38
Implementation – Zal
[Diagram: Zal source → Zal Compiler → C# source → Microsoft C# Compiler → CIL Program w/ Ownership Tracking; together with the Runtime Ownership Libraries this gives the Executing Program with Automatic Parallelization]
39
Implementation – Zal
[Diagram: compiler pipeline passing Tokens and then an AST between stages. Legend: C# compilation step; Zal compilation step; I/O]
Scanner (generated by GPLex): Scanner.scan() reads a stream of characters and processes them into tokens
Parser (generated by Coco/R): Parser.parse() converts the stream of tokens into an Abstract Syntax Tree
Type Checker: TypeCheck() resolves all TypeRefs to TypeDefs & checks type correctness
Effect Computation: computeEffects() and LocalEffects() compute heap & stack effects for AST nodes
Parallelization: Parallelize() checks sufficient conditions for parallelism and implements them
Ownership Implementation: BuildOwnershipImplementation() implements Zal features in C# by modifying the AST
Code Generation: Output() / Emit generates the C# or CIL implementation of the AST
40
Validation
Have applied my system to a number of realistic applications
Overall annotation requires modification to 20% of the source
Ownership tracking overhead:
– Execution time: 10% to 20%
– Memory usage: 15% to 30%
Implementation not fully optimized
41
Validation – Speedup
[Figure: speedup chart, not preserved in this transcript]
42
Validation – Speedup
[Figure: speedup chart, not preserved in this transcript]
43
Related Work – Prog. Langs.
Focus on providing tools to express parallelism
No support for validating correctness of parallelization
Assumed programmer knowledge of parallel programming constructs
Examples: Fortress, Chapel, X10
44
Related Work – Ownership
Have proposed effect systems, but only suggested application to parallelism
Data race and deadlock detection for locking – very different reasoning
Deterministic Parallel Java (late 2009) – modified ownerships
– Focused on kernels
– Lost composition & abstraction to do so
45
Contributions
Abstract and composable system for reasoning about effects based on Ownership Types
Effect and reasoning systems applied to a real language and real program examples
Real parallelism detected and exploited automatically
46
Contributions
Developed and proved sufficient conditions for a number of different forms of parallelism
Runtime system to support static reasoning
47
Publications
A. Craik and W. Kelly. Using Ownership to Reason About Inherent Parallelism in Imperative Object-Oriented Programs. International Conference on Compiler Construction. ed. R. Gupta, LNCS 6011, pp. 145-164, Springer-Verlag Berlin Heidelberg, 2010.
W. Reid, W. Kelly, and A. Craik. Reasoning about Parallelism in Modern Object-Oriented Languages. Australasian Computer Science Conference. 2008.
+3 technical reports on various versions of the reasoning system in e-prints
48
Conclusion
System for reasoning about data dependencies and parallelism
Abstract & composable
Usable by both programmers & automated tools
Question of when & how to exploit still open
Demonstrated with a prototype that this automated reasoning is possible
49
Q & A
50
Ownership & The Stack
Ownerships traditionally for encapsulation
Stack not considered by these works
Stack & stack referencing models vary from language to language
I consider a restricted stack model:
– Stack and heap are disjoint
– Stack locations can be differentiated by name
51
Ownership & The Stack
Stack model fits Java, C#, and VB.NET
Dereferencing to read the heap causes an ownership effect
Stack location names are unique and cannot be aliased without dereferencing