A Framework for Reasoning About Inherent Parallelism in Modern Object-Oriented Languages Presented by A. Craik (5-Jan-12) Research supported by funding.

1 A Framework for Reasoning About Inherent Parallelism in Modern Object-Oriented Languages Presented by A. Craik (5-Jan-12) Research supported by funding from Microsoft Research and the Queensland State Government

2 Introduction [diagram labels: Procedural Algorithm; Sequential Implementation; Sequential Implementation w/ Injected Parallelism; Parallel Algorithm; Explicitly Parallel Implementation; Semantic Analysis; Dependency Analysis]

3 Introduction Inherent Parallelism, for example: a = 1; b = 2; c = a + b; and: for (int i=0; i<max; ++i) a[i] = a[i] + 1; Three steps for finding & exploiting: 1. Find the inherent parallelism in the program 2. Decide which inherent parallelism is worth exploiting 3. Choose an implementation technology to expose the selected parallelism
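
The loop on this slide can be sketched in Python to show why it is inherently parallel. This is only an illustration: the function names and the use of a thread pool are assumptions, not part of the talk, and CPython threads demonstrate independence of iterations rather than speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def increment_sequential(a):
    """Sequential form of: for (i = 0; i < max; ++i) a[i] = a[i] + 1."""
    for i in range(len(a)):
        a[i] = a[i] + 1
    return a


def increment_parallel(a, workers=4):
    """Same loop with the iterations handed to a thread pool.

    Each iteration reads and writes only a[i], so no iteration depends
    on another and any execution order gives the same result.
    """
    def body(i):
        a[i] = a[i] + 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Drain the iterator so any exception raised in body() surfaces here.
        list(pool.map(body, range(len(a))))
    return a
```

Both functions produce the same list for the same input, which is the property the dependency analysis has to establish before the parallel form may be used.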

4 Dependencies impose ordering constraints Sequential consistency required Two forms – Control – which statements will run – Data – reads & writes of shared state Control well studied and easier to handle inter-procedurally – Example, Java checked exceptions Introduction 4 4

5 Data Dependencies Flow Dependence ( Read-After-Write ) int a = 1; a = 2; int b = a + 1; Output Dependence ( Write-After-Write ) int a = 1; a = 4; a = 5; Anti-Dependence ( Write-After-Read ) int a = 1; int b = a + 1; a = 2;
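
The three kinds of data dependence can be sketched as a check over the read and write sets of two statements, the earlier one first. The function name `classify` and the set-of-variable-names representation are assumptions made for this illustration.

```python
def classify(first_reads, first_writes, second_reads, second_writes):
    """Return the kinds of data dependence from an earlier statement to a later one.

    Each argument is a set of variable names read or written by the statement.
    """
    kinds = set()
    if first_writes & second_reads:
        kinds.add("flow")    # read-after-write
    if first_reads & second_writes:
        kinds.add("anti")    # write-after-read
    if first_writes & second_writes:
        kinds.add("output")  # write-after-write
    return kinds
```

For example, `a = 2; b = a + 1;` gives `classify(set(), {"a"}, {"a"}, {"b"})`, which reports only a flow dependence; an empty result means the two statements may be reordered or run in parallel.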

6 Traditional Approach for (int i=0; i < 3; ++i) { for (int j=0; j < i+1; ++j) { a[i,j] = b[i,j] + c[i,j]; b[i,j] = a[i,j+1]; } } Pair-wise analysis of statements and expressions Can a, b or c refer to the same array?

7 Traditional Approach for (int i=0; i < 3; ++i) { for (int j=0; j < i+1; ++j) { a[i,j] = b[i,j] + c[i,j]; b[i,j] = a.readsIandJInc(i,j); } } What does a.readsIandJInc(i,j) do? Examine ALL possible implementations!

8 Side-Effects class Holder { public static int value; } class Array { public int readsIandJInc(int i, int j) { return this[i,j+1]; } }

9 Side-Effects class Holder { public static int value; } class Array { public int readsIandJInc(int i, int j) { this[0,0] = i + j; return this[i,j]; } }

10 Side-Effects class Holder { public static int value; } class Array { public int readsIandJInc(int i, int j) { Holder.value++; return this[i,j]; } }

11 Limitations of Current Techniques Traditional approach: focused on analyzing complex tight loops (kernels); poor abstraction and composition; too complex for programmers to use without tool support. My approach: less precise, but inter-procedural.

12 Goal: – Simplify inter-procedural dependency analysis Idea: – Ensure safety – Make reasoning modular and composable The Idea 12

13 The Idea Specify effects on method signature: public int getReads() reads <…> writes <…> What goes in the angle brackets? – Abstract effect description – Composable descriptions – Verifiable

14 The Idea 14

15 Object-Orientation Encapsulation ⇒ representation hierarchy Person: name (String), dateOfBirth (Date), employer (Company)

16 The Idea 16

17 Safe Parallelism Can 2 arbitrary pieces of code execute in parallel safely? Type rules specify computation of effect sets Look for overlaps in the read & write effect sets to find possible data dependencies. Block 1 {... } reads <…> writes <…> Block 2 {... } reads <…> writes <…>
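
The overlap check between two blocks can be sketched as follows. This is a simplified model, not the talk's type rules: effect sets are treated as flat sets of context names (nesting of contexts is ignored here), and `Effects` and `may_conflict` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Effects:
    """Read and write effect sets of a block, as sets of context names."""
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def may_conflict(e1, e2):
    """True if the two blocks may have a data dependence.

    A conflict needs at least one write: either block writes something
    the other reads or writes. Shared reads alone are harmless.
    """
    return bool(e1.writes & (e2.reads | e2.writes)) or bool(e2.writes & e1.reads)
```

Two blocks whose effects do not conflict under this test have no data dependence between them, so they are candidates for parallel execution.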

18 Dependencies using Effect Sets Dependency exists where two triangles of representation overlap Triangles can only be nested: Becomes a check for a parent-child relationship; disjointness ⇒ no dependency.
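
Because the triangles can only nest, the overlap test reduces to an ancestor check in the context tree. A minimal sketch, assuming the tree is given as a child-to-parent map; the context names in `PARENT` are made up for illustration.

```python
# Hypothetical context tree: each context maps to its parent, the root to None.
PARENT = {
    "world": None,
    "company": "world",
    "person": "world",
    "name": "person",
    "dob": "person",
}


def ancestors(ctx, parent):
    """Yield ctx itself and every context above it, up to the root."""
    while ctx is not None:
        yield ctx
        ctx = parent[ctx]


def overlaps(c1, c2, parent):
    """Two contexts overlap only if one is the other or encloses the other."""
    return c1 in ancestors(c2, parent) or c2 in ancestors(c1, parent)
```

Here `overlaps("person", "name", PARENT)` holds because `name` is nested inside `person`, while `name` and `dob` are siblings and therefore disjoint.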

19 Task Parallelism – Run 2+ separate ops. at same time Loop Parallelism – Execute loop iterations in parallel Pipeline Parallelism – Stage loop body execution so that iteration execution overlaps safely Types of Parallelism 19

20 Task Parallelism class Demo { void op1() reads <…> writes <…> {…} void op2() reads <…> writes <…> {…} } Can we execute calls to op1 and op2 in parallel? Determine the overlap in the effect sets; no overlap ⇒ no data deps. Realization using one-way calls or futures
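
Realization with futures might look like the sketch below. The fields `left` and `right` and the helper `run_both` are assumptions for the example; the point is only that `op1` and `op2` touch disjoint state, so their effect sets do not overlap.

```python
from concurrent.futures import ThreadPoolExecutor


class Demo:
    def __init__(self):
        self.left = []   # state touched only by op1
        self.right = []  # state touched only by op2

    def op1(self):
        self.left.append("op1")
        return len(self.left)

    def op2(self):
        self.right.append("op2")
        return len(self.right)


def run_both(demo):
    """Start op1 and op2 as futures, then wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(demo.op1)
        f2 = pool.submit(demo.op2)
        return f1.result(), f2.result()
```

If the two operations shared written state, the same code would still run but its result could depend on scheduling, which is exactly what the effect check is meant to rule out.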

21 Data parallel loops major source of parallelism in imperative programs Start with simple data parallel loop in the form of a foreach loop: foreach (T element in collection) element.operation(); Loop Parallelism Conditions 21

22 Condition 1: Areas holding the representations of the objects returned by the enumerator are all disjoint from one another Foreach Loop Conditions 22

23 Condition 2: The operation only mutates the representation of its “own” element and does not read the state owned by any of the other elements Foreach Loop Conditions 23

24 Condition 3: There are no control dependencies which would prevent loop parallelization Foreach Loop Conditions 24

25 So far we have looked at foreach(T element in collection) element.operation(); Question: How do we generalize this to an arbitrary loop body? foreach(T element in collection) { //sequence of statements //including local var defs //and a read of a context r } Arbitrary Loop Bodies 25

26 Loop Body Rewriting Loop becomes: foreach (T elem in collection) elem.loopBody(this); Where loopBody is: class T { void loopBody(Foo me) { //same sequence of statements //replace all elem by this //and all this by me } }
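
The rewriting can be illustrated with a small Python analogue, where `self` plays the role of `this`. The classes `Bank` and `Account` and the interest calculation are invented for this sketch; only the renaming pattern comes from the slide.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def loop_body(self, me):
        # Same statements as the original loop body, with the loop
        # variable replaced by self and the enclosing object by me.
        interest = me.rate * self.balance  # local variable stays local
        self.balance += interest


class Bank:
    def __init__(self, rate, accounts):
        self.rate = rate
        self.accounts = accounts

    def add_interest_original(self):
        for elem in self.accounts:
            interest = self.rate * elem.balance
            elem.balance += interest

    def add_interest_rewritten(self):
        # Arbitrary body reduced to the simple form elem.operation(args).
        for elem in self.accounts:
            elem.loop_body(self)
```

After the rewrite the loop has the simple single-call shape, so the foreach conditions can be checked against the effects of `loop_body` alone.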

27 Object-Orientation Encapsulation ⇒ representation hierarchy Person: name (String), dateOfBirth (Date), employer (Company)

28 Designed to enforce encapsulation Adapted to validate encapsulation Type parameters to capture memory referencing permissions class Person [o,c] { private String|this| Name; private Date|this| DateOfBirth; private Company|c| Employer; … } Ownership Types 28

29 Ownerships & Effects class Company[o] { public string name; … } class Person[o,c] { private Company|c| Employer; public string employerName() reads <…> writes <> { return Employer.name; } … }

30 Analyze & apply sufficient conditions All pairs of context relations need to be known Need some basis to believe the relationships between contexts to hold Contexts and Dependencies 30

31 Statically know some relationships – The owner of an object is a parent of the object’s this context – The world context is a parent of all contexts Relationship may only be known dynamically Optionally track at runtime to allow runtime conditions Reasons for a Runtime System 31

32 Conditional Parallelism
Source loop: for(T e in collection){ e.operation(arguments); }
disjoint(r,c) always true: parallel for(T e in collection){ e.operation(arguments); }
disjoint(r,c) always false: serial for(T e in collection){ e.operation(arguments); }
disjoint(r,c) unknown: if (disjoint(r,c)) { parallel version } else { sequential version }
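
The unknown case, where the choice is deferred to run time, can be sketched like this. The function `conditional_foreach` and its returned label are assumptions for illustration; the `disjoint` argument stands in for the result of the runtime test disjoint(r,c).

```python
from concurrent.futures import ThreadPoolExecutor


def conditional_foreach(collection, operation, disjoint):
    """Apply operation to every element, in parallel only if disjoint is True.

    Returns a label naming the version that ran, purely for demonstration.
    """
    if disjoint:
        with ThreadPoolExecutor() as pool:
            # Drain the iterator so all iterations finish before returning.
            list(pool.map(operation, collection))
        return "parallel"
    for e in collection:
        operation(e)
    return "sequential"
```

When the relationship is known statically, the compiler can emit just one of the two branches and skip the test entirely.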

33 We do not know the relationships between all contexts at compile time. May vary from one object or method invocation to another Reasons: – Separate Compilation – Dynamic Linking – Complex Data Flows Reasons for a Runtime System 33

34 Reasons for a Runtime System The type system supports specifying context relationships that the programmer asserts must be true: void oper1[r]() reads <…> writes <…> where r # c { … foreach(T|c| elem in collection){…} … }

35 Naïve implementation – each object keeps a pointer to its owner Runtime System Implementation 35
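
A minimal sketch of the naive scheme, assuming each object stores a reference to its owner and the root ("world") has none. The class `Owned` and both function names are hypothetical; the talk does not show the actual runtime library.

```python
class Owned:
    """An object that remembers its owner; None marks the root context."""

    def __init__(self, owner=None):
        self.owner = owner


def is_ancestor_or_same(a, b):
    """True if a is b or lies somewhere on b's chain of owners."""
    node = b
    while node is not None:
        if node is a:
            return True
        node = node.owner
    return False


def disjoint(a, b):
    """Contexts are disjoint when neither one encloses the other."""
    return not (is_ancestor_or_same(a, b) or is_ancestor_or_same(b, a))
```

Each test walks at most the depth of the ownership tree, so the cost of a runtime check grows with nesting depth in this naive form.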

36 AFJO Soundness Subject Reduction Progress Effect Soundness Owner Invariance Effect Completeness Contexts form a Tree Cast Safety Context Disjointness Implies Effect Disjointness Disjoint effects imply no data dependencies Update Dependency Preservation Sufficient for Parallelization Sequential Consistency Task Parallelism Sufficient Conditions Data Parallelism Sufficient Conditions Pipeline Parallelism Sufficient Conditions Disjointness Test Correct Static Context Relations Well Formed Heap Context Parameters do not survive 36

37 Implementation – Zal Added my system to C# 3.5 Extended GPC# compiler Added infrastructure to support arbitrary type parameters Implemented runtime ownership tracking system (~1,000 lines)
Metric    Total     GPC#      Extensions   Extensions (% Total)
SLOC-P    39,444    27,888    12,156       30.8%
SLOC-L    22,201    14,957    7,244        32.7%

38 Implementation – Zal [diagram: Zal source → Zal Compiler → C# source → Microsoft C# Compiler → CIL Program w/ Ownership Tracking; with Runtime Ownership Libraries → Executing Program with Automatic Parallelization]

39 Implementation – Zal [compiler diagram; I/O between steps: Tokens, AST; legend: C# compilation step, Zal compilation step, I/O]
– Scanner.scan() (scanner generated by GPLex): Reads a stream of characters and processes them into tokens
– Parser.parse() (parser generated by Coco/R): Converts stream of tokens into an Abstract Syntax Tree
– TypeCheck() (Type Checker): Resolves all TypeRefs to TypeDefs & checks type correctness
– Output() / Emit (Code Generation): Generates C# or CIL implementation of AST
– computeEffects() / LocalEffects() (Effect Computation): Computes heap & stack effects for AST nodes
– Parallelize() (Parallelization): Checks sufficient conditions for parallelism and implements them
– BuildOwnershipImplementation() (Ownership Implementation): Implements Zal features in C# by modifying AST

40 Have applied my system to a number of realistic applications Overall annotation requires modification to 20% of the source Ownership tracking overhead: – Execution time: 10% to 20% – Memory usage: 15% to 30% Implementation not fully optimized Validation 40

41 Validation – Speedup 41

42 Validation – Speedup 42

43 Focus on providing tools to express parallelism No support for validating correctness of parallelization Assumed programmer knowledge of parallel programming constructs Examples: Fortress, Chapel, X10 Related Work – Prog. Langs. 43

44 Related Work – Ownership Have proposed effect systems, but only suggested application to parallelism Data race and deadlock detection for locking – very different reasoning Deterministic Parallel Java (late 2009) – modified ownerships – Focused on kernels – Lost composition & abstraction to do so

45 Abstract and composable system for reasoning about effects based on Ownership Types. Effect and reasoning systems applied to a real language and real program examples Real parallelism detected and exploited automatically Contributions 45

46 Developed and proved sufficient conditions for a number of different forms of parallelism Runtime system to support static reasoning. Contributions 46

47 Publications
A. Craik and W. Kelly. Using Ownership to Reason About Inherent Parallelism in Imperative Object-Oriented Programs. International Conference on Compiler Construction, ed. R. Gupta, LNCS 6011, pp. 145-164, Springer-Verlag Berlin Heidelberg, 2010.
W. Reid, W. Kelly, and A. Craik. Reasoning about Parallelism in Modern Object-Oriented Languages. Australasian Computer Science Conference, 2008.
Plus 3 technical reports on various versions of the reasoning system in e-prints.

48 System for reasoning about data dependencies and parallelism Abstract & composable Usable by both programmers & automated tools Question of when & how to exploit still open Demonstration this automated reasoning is possible w/ prototype Conclusion 48

49 Q & A 49

50 Ownerships traditionally for encapsulation Stack not considered by these works Stack & stack referencing models vary from language to language I consider a restricted stack model: – Stack and heap are disjoint – Stack locations can be differentiated by name Ownership & The Stack 50

51 Ownership & The Stack Stack model fits Java, C#, and VB.NET Dereferencing to read the heap causes an ownership effect Stack location names are unique and cannot be aliased without dereferencing

