Design-Driven Compilation


1 Design-Driven Compilation
Radu Rugina and Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

2 Overview
Computation
Goal: Parallelization
Analysis Problems: Points-to Analysis, Region Analysis
Two Potential Solutions: Fully Automatic, Design Driven
Evaluation

3 Example - Divide and Conquer Sort
7 4 6 1 3 5 8 2

4 Example - Divide and Conquer Sort
7 4 6 1 3 5 8 2
Divide:  7 4 | 6 1 | 3 5 | 8 2

5 Example - Divide and Conquer Sort
7 4 6 1 3 5 8 2
Divide:  7 4 | 6 1 | 3 5 | 8 2
Conquer: 4 7 | 1 6 | 3 5 | 2 8

6 Example - Divide and Conquer Sort
7 4 6 1 3 5 8 2
Divide:  7 4 | 6 1 | 3 5 | 8 2
Conquer: 4 7 | 1 6 | 3 5 | 2 8
Combine: 1 4 6 7 | 2 3 5 8

7 Example - Divide and Conquer Sort
7 4 6 1 3 5 8 2
Divide:  7 4 | 6 1 | 3 5 | 8 2
Conquer: 4 7 | 1 6 | 3 5 | 2 8
Combine: 1 4 6 7 | 2 3 5 8
Result:  1 2 3 4 5 6 7 8

8 Divide and Conquer Algorithms
Lots of Generated Concurrency
  Solve Subproblems in Parallel

9 Divide and Conquer Algorithms
Lots of Recursively Generated Concurrency
  Recursively Solve Subproblems in Parallel

10 Divide and Conquer Algorithms
Lots of Recursively Generated Concurrency
  Recursively Solve Subproblems in Parallel
  Combine Results in Parallel

11 “Sort n Items in d, Using t as Temporary Storage”
void sort(int *d, int *t, int n) {
  if (n > CUTOFF) {
    sort(d,t,n/4);
    sort(d+n/4,t+n/4,n/4);
    sort(d+2*(n/4),t+2*(n/4),n/4);
    sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
    merge(d,d+n/4,d+n/2,t);
    merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    merge(t,t+n/2,t+n,d);
  } else insertionSort(d,d+n);
}

12 “Recursively Sort Four Quarters of d”
Divide array into subarrays and recursively sort subarrays
  sort(d,t,n/4);
  sort(d+n/4,t+n/4,n/4);
  sort(d+2*(n/4),t+2*(n/4),n/4);
  sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));

13 “Recursively Sort Four Quarters of d”
Subproblems Identified Using Pointers Into Middle of Array
  d: 7 4 | 6 1 | 3 5 | 8 2
  Quarters start at d, d+n/4, d+n/2, d+3*(n/4)

14 “Recursively Sort Four Quarters of d”
  d: 7 4 | 6 1 | 3 5 | 8 2
  Quarters start at d, d+n/4, d+n/2, d+3*(n/4)

15 “Recursively Sort Four Quarters of d”
Sorted Results Written Back Into Input Array
  d: 4 7 | 1 6 | 3 5 | 2 8
  Quarters start at d, d+n/4, d+n/2, d+3*(n/4)

16 “Merge Sorted Quarters of d Into Halves of t”
  merge(d,d+n/4,d+n/2,t);
  merge(d+n/2,d+3*(n/4),d+n,t+n/2);
  d: 4 7 | 1 6 | 3 5 | 2 8
  t: 1 4 6 7 | 2 3 5 8
  Halves start at t, t+n/2

17 “Merge Sorted Halves of t Back Into d”
  merge(t,t+n/2,t+n,d);
  d: 1 2 3 4 5 6 7 8
  t: 1 4 6 7 | 2 3 5 8
  Halves start at t, t+n/2

18 “Use a Simple Sort for Small Problem Sizes”
  insertionSort(d,d+n);
  d: 7 4 6 1 3 5 8 2
  Range runs from d to d+n

19 “Use a Simple Sort for Small Problem Sizes”
  insertionSort(d,d+n);
  d: 7 4 1 6 3 5 8 2
  Range runs from d to d+n

20 Parallel Sort
void sort(int *d, int *t, int n) {
  if (n > CUTOFF) {
    spawn sort(d,t,n/4);
    spawn sort(d+n/4,t+n/4,n/4);
    spawn sort(d+2*(n/4),t+2*(n/4),n/4);
    spawn sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
    sync;
    spawn merge(d,d+n/4,d+n/2,t);
    spawn merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    sync;
    merge(t,t+n/2,t+n,d);
  } else insertionSort(d,d+n);
}
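For readers who want to run the algorithm, here is a sequential C sketch of the sort from the slides. It is not the authors' code: the CUTOFF value is an assumption, the half boundary is written as 2*(n/4) so that the sort and merge ranges agree for every n, and the insertion sort is restructured so it never forms a pointer before the start of the array. Comments mark where the parallel version places spawn and sync.

```c
#include <assert.h>

#define CUTOFF 4 /* assumed value; the slides do not state one */

/* Sort the half-open range [l, h) in place. */
static void insertionSort(int *l, int *h) {
    if (h - l < 2) return;
    for (int *p = l + 1; p < h; p++) {
        int k = *p;
        int *q = p;
        while (q > l && k < *(q - 1)) { /* shift larger elements right */
            *q = *(q - 1);
            q--;
        }
        *q = k;
    }
}

/* Merge sorted runs [l1, m) and [m, h2) into the buffer starting at d. */
static void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2) {
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Sort n items in d, using t (also n items) as temporary storage. */
static void sort(int *d, int *t, int n) {
    if (n > CUTOFF) {
        int q = n / 4;                       /* size of the first three quarters */
        sort(d,         t,         q);       /* spawn in the parallel version */
        sort(d + q,     t + q,     q);       /* spawn */
        sort(d + 2 * q, t + 2 * q, q);       /* spawn */
        sort(d + 3 * q, t + 3 * q, n - 3 * q); /* spawn, then sync */
        merge(d,         d + q,     d + 2 * q, t);         /* spawn */
        merge(d + 2 * q, d + 3 * q, d + n,     t + 2 * q); /* spawn, then sync */
        merge(t, t + 2 * q, t + n, d);
    } else {
        insertionSort(d, d + n);
    }
}
```

Calling sort(d, t, 8) on the example array 7 4 6 1 3 5 8 2 with an 8-element scratch buffer t leaves d sorted.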

21 What Do You Need To Know To Exploit This Form of Parallelism?
Points-to Information (data blocks that pointers point to)
Region Information (accessed regions within data blocks)

22 Information Needed To Exploit Parallelism
d and t point to different memory blocks
Calls to sort access disjoint parts of d and t
Together, calls access [d,d+n-1] and [t,t+n-1]
  sort(d,t,n/4);
  sort(d+n/4,t+n/4,n/4);
  sort(d+n/2,t+n/2,n/4);
  sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
[Figure: for each call, the part of [d,d+n-1] and [t,t+n-1] it accesses]

23 Information Needed To Exploit Parallelism
d and t point to different memory blocks
First two calls to merge access disjoint parts of d and t
Together, calls access [d,d+n-1] and [t,t+n-1]
  merge(d,d+n/4,d+n/2,t);
  merge(d+n/2,d+3*(n/4),d+n,t+n/2);
  merge(t,t+n/2,t+n,d);
[Figure: for each call, the part of [d,d+n-1] and [t,t+n-1] it accesses]

24 Information Needed To Exploit Parallelism
Calls to insertionSort access [d,d+n-1]
  insertionSort(d,d+n);
[Figure: accessed region [d,d+n-1]]

25 What Do You Need To Know To Exploit This Form of Parallelism?
Points-to Information (d and t point to different data blocks)
Symbolic Region Information (accessed regions within d and t blocks)
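The disjointness facts on these slides can be spot-checked for concrete sizes. The sketch below is an illustration, not part of the presented analysis, which reasons symbolically. It models each sort call's region as an inclusive offset interval within d, assumes the quarter boundaries are multiples of n/4 with the last call taking the remainder, and checks that the four intervals are pairwise disjoint and together cover [0, n-1]. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive interval of offsets into a block; empty when lo > hi. */
typedef struct { int lo, hi; } Region;

static bool disjoint(Region a, Region b) {
    return a.hi < b.lo || b.hi < a.lo;
}

/* Regions of d (and equally of t) accessed by the four recursive sort calls. */
static void sort_call_regions(int n, Region r[4]) {
    int q = n / 4;
    r[0] = (Region){0,     q - 1};
    r[1] = (Region){q,     2 * q - 1};
    r[2] = (Region){2 * q, 3 * q - 1};
    r[3] = (Region){3 * q, n - 1};
}

/* True if the four calls touch pairwise disjoint regions covering [0, n-1]. */
static bool sort_calls_independent(int n) {
    Region r[4];
    sort_call_regions(n, r);
    for (int i = 0; i < 4; i++)
        for (int j = i + 1; j < 4; j++)
            if (!disjoint(r[i], r[j])) return false;
    if (r[0].lo != 0 || r[3].hi != n - 1) return false;
    for (int i = 0; i < 3; i++)
        if (r[i].hi + 1 != r[i + 1].lo) return false; /* no gaps between quarters */
    return true;
}
```

A check like this only samples particular values of n; the point of the region analysis is to establish the same property for all n at once.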

26 How Hard Is It To Figure These Things Out?

27 How Hard Is It To Figure These Things Out?
Challenging

28 How Hard Is It To Figure These Things Out?
void insertionSort(int *l, int *h) {
  int *p, *q, k;
  for (p = l+1; p < h; p++) {
    for (k = *p, q = p-1; l <= q && k < *q; q--)
      *(q+1) = *q;
    *(q+1) = k;
  }
}
Not immediately obvious that insertionSort(l,h) accesses [l,h-1]

29 How Hard Is It To Figure These Things Out?
void merge(int *l1, int *m, int *h2, int *d) {
  int *h1 = m;
  int *l2 = m;
  while ((l1 < h1) && (l2 < h2))
    if (*l1 < *l2) *d++ = *l1++;
    else *d++ = *l2++;
  while (l1 < h1) *d++ = *l1++;
  while (l2 < h2) *d++ = *l2++;
}
Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
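One way to build confidence in the claimed access regions is to instrument the procedure and record the lowest and highest address it reads and writes. The sketch below does this for merge. It is a dynamic check on one input, under the assumption that source and destination are separate arrays; it is not the static analysis the talk describes. Span, touch and merge_traced are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Lowest and highest address touched so far; `used` is 0 until the first access. */
typedef struct { const int *lo, *hi; int used; } Span;

static void touch(Span *s, const int *p) {
    if (!s->used) { s->lo = s->hi = p; s->used = 1; return; }
    if (p < s->lo) s->lo = p;
    if (p > s->hi) s->hi = p;
}

/* Same logic as merge on the slide, with every read and write recorded. */
static void merge_traced(int *l1, int *m, int *h2, int *d, Span *rd, Span *wr) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2) {
        touch(rd, l1); touch(rd, l2); touch(wr, d);
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) { touch(rd, l1); touch(wr, d); *d++ = *l1++; }
    while (l2 < h2) { touch(rd, l2); touch(wr, d); *d++ = *l2++; }
}
```

For merge_traced(l, m, h, d, &rd, &wr) the recorded spans should stay inside [l,h-1] for reads and [d,d+(h-l)-1] for writes, which matches the regions stated on the slide.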

30 Issues
Heavy Use of Pointers
  Pointers into Middle of Arrays
  Pointer Arithmetic
  Pointer Comparison
Multiple Procedures
  sort(int *d, int *t, int n)
  insertionSort(int *l, int *h)
  merge(int *l, int *m, int *h, int *t)
Recursion

31 Fully Automatic Solution
Whole-program pointer analysis
  Context-sensitive, flow-sensitive
  Rugina and Rinard, PLDI 1999
Whole-program region analysis
  Symbolic constraint systems
  Solve by reducing to linear programs
  Rugina and Rinard, PLDI 2000

32 Key Complication
Need for sophisticated interprocedural analyses
Pointer analysis
  Propagate analysis results through call graph
  Fixed-point algorithm for recursive programs
Region analysis
  Formulation avoids fixed-point algorithms
  Single constraint system for each strongly connected component
Need to have whole program in analyzable form

33 Bigger Picture
Points-to and region information is (implicitly) part of the interface of each procedure
Programmer understands procedure interfaces
Programmer knows
  Points-to relationships on entry
  Effect of procedure on points-to relationships
  Regions of memory blocks that procedure accesses

34 Idea
Enhance procedure interface to make points-to and region information explicit
Points-to language
  Points-to graphs at entry and exit
  Effect on points-to relationships
Region language
  Symbolic specification of accessed regions
Programmer provides information
Analysis verifies that it is correct

35 Points-to Language
f(p, q, n) {
  context {
    entry: p->_a, q->_b;
    exit:  p->_a, _a->_c, q->_b, _b->_d;
  }
  context {
    entry: p->_a, q->_a;
    exit:  ... q->_a;
  }

36 Points-to Language
Contexts for f(p,q,n)
[Figure: entry and exit points-to graphs over p and q, one pair per context]

37 Verifying Points-to Information
One (flow-sensitive) analysis per context
  f(p,q,n) { . }
[Figure: contexts for f(p,q,n), entry and exit points-to graphs over p and q]

38 Verifying Points-to Information
Start with entry points-to graph

39 Verifying Points-to Information
Analyze procedure

40 Verifying Points-to Information
Analyze procedure

41 Verifying Points-to Information
Check result against exit points-to graph

42 Verifying Points-to Information
Similarly for other context

43 Verifying Points-to Information
Start with entry points-to graph

44 Verifying Points-to Information
Analyze procedure

45 Verifying Points-to Information
Check result against exit points-to graph
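The three verification steps (start from the declared entry graph, analyze the body, check against the declared exit graph) can be sketched in C as follows. This is a toy model under stated assumptions, not the authors' implementation: points-to graphs are boolean adjacency matrices over six nodes, the "analysis" is a hand-written transfer function for a hypothetical body that stores the address of block _c through p and the address of block _d through q, and the check accepts a result when every computed edge appears in the declared exit graph.

```c
#include <assert.h>
#include <stdbool.h>

/* Nodes of the example: pointers p, q and abstract blocks _a, _b, _c, _d. */
enum { P, Q, A, B, C, D, NODES };

typedef struct { bool edge[NODES][NODES]; } PtGraph;
typedef struct { PtGraph entry_graph, exit_graph; } PtContext;

static PtGraph empty_graph(void) {
    PtGraph g;
    for (int i = 0; i < NODES; i++)
        for (int j = 0; j < NODES; j++) g.edge[i][j] = false;
    return g;
}

static void add_edge(PtGraph *g, int from, int to) { g->edge[from][to] = true; }

/* True if every edge of `computed` also appears in `declared`. */
static bool contained_in(const PtGraph *computed, const PtGraph *declared) {
    for (int i = 0; i < NODES; i++)
        for (int j = 0; j < NODES; j++)
            if (computed->edge[i][j] && !declared->edge[i][j]) return false;
    return true;
}

/* Hypothetical body: whatever p points to now points to _c,
   whatever q points to now points to _d (weak updates only). */
static void example_body(const PtGraph *in, PtGraph *out) {
    *out = *in;
    for (int x = 0; x < NODES; x++) {
        if (in->edge[P][x]) add_edge(out, x, C);
        if (in->edge[Q][x]) add_edge(out, x, D);
    }
}

/* One analysis per context: entry graph in, result checked against exit graph. */
static bool verify_context(const PtContext *c,
                           void (*analyze)(const PtGraph *, PtGraph *)) {
    PtGraph result;
    analyze(&c->entry_graph, &result);
    return contained_in(&result, &c->exit_graph);
}
```

With the first context from slide 35 (entry p->_a, q->_b; exit p->_a, _a->_c, q->_b, _b->_d) verify_context accepts this body. A declared exit that omits an edge the body can create is rejected.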

46 Analysis of Call Statements
g(r,n) { . f(r,s,n); }

47 Analysis of Call Statements
Analysis produces points-to graph before call
  g(r,n) { . f(r,s,n); }
[Figure: points-to graph over r and s before the call]

48 Analysis of Call Statements
Retrieve declared contexts from callee
[Figure: contexts for f(p,q,n), entry and exit graphs, next to the caller's graph over r and s]

49 Analysis of Call Statements
Find context with matching entry graph

50 Analysis of Call Statements
Find context with matching entry graph

51 Analysis of Call Statements
Apply corresponding exit points-to graph
[Figure: caller's graph over r and s updated from the matching context's exit graph]

52 Analysis of Call Statements
Continue analysis after call
  g(r,n) { . f(r,s,n); }

53 Analysis of Call Statements
Result
  Points-to declarations separate analysis of multiple procedures
  Transformed global, whole-program analysis into local analysis that operates on each procedure independently
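The step "find context with matching entry graph" can be illustrated with a deliberately simplified model. The sketch assumes each pointer parameter points to exactly one block, describes a context's entry graph by the placeholder each parameter points to, and treats distinct placeholders as distinct blocks. That last rule is an assumption of this sketch, not something the slides state. EntryPattern, pattern_matches and find_context are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum { NPARAMS = 2, MAX_PLACEHOLDERS = 8 };

/* target[i] is the placeholder (0 for _a, 1 for _b, ...) that parameter i points to. */
typedef struct { int target[NPARAMS]; } EntryPattern;

/* actual[i] is the non-negative id of the block the i-th actual argument points to.
   The pattern matches if placeholders can be bound one-to-one to those blocks. */
static bool pattern_matches(const EntryPattern *pat, const int actual[NPARAMS]) {
    int bound[MAX_PLACEHOLDERS];
    for (int i = 0; i < MAX_PLACEHOLDERS; i++) bound[i] = -1;
    for (int i = 0; i < NPARAMS; i++) {
        int ph = pat->target[i];
        if (bound[ph] == -1) {
            /* a block may not be bound to two different placeholders */
            for (int j = 0; j < MAX_PLACEHOLDERS; j++)
                if (bound[j] == actual[i]) return false;
            bound[ph] = actual[i];
        } else if (bound[ph] != actual[i]) {
            return false;
        }
    }
    return true;
}

/* Index of the first declared context whose entry pattern matches, or -1. */
static int find_context(const EntryPattern *ctx, int nctx, const int actual[NPARAMS]) {
    for (int c = 0; c < nctx; c++)
        if (pattern_matches(&ctx[c], actual)) return c;
    return -1;
}
```

With the two contexts of f(p,q,n), a call where r and s point to different blocks selects the first context, and a call where they point to the same block selects the second. If no context matches, a verifier built this way would report the call rather than guess.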

54 Region Language
h(p,n) {
  reads [p,p+n-1];
  writes [p,p+n-1];
}

55 Region Language
h(p,n) {
  reads [p,p+n-1];
  writes [p,p+n-1];
}
[Figure: declared read region, starting at p]

56 Verifying Region Information
Two region containment requirements
  Direct Accesses: Locations directly accessed by procedure must be contained in declared regions
  Callees: Regions accessed by callees must be contained in declared regions of caller

57 Verifying Region Information
h(p,n) {
  if (n < k)
    for (i=0;i<n;i++) p[i] = p[i]-1;
  else {
    h(p,n/2);
    h(p+n/2,n-n/2);
  }
}

58 Verifying Region Information
Extract directly accessed regions
[Figure: directly accessed read and write regions, drawn on the range p to p+n-1]

59 Verifying Region Information
Check inclusion within declared regions
[Figure: declared regions for h(p,n) against directly accessed regions]

60 Verifying Region Information
Check inclusion for accesses of callees

61 Verifying Region Information
Start with call to h(p,n/2)

62 Verifying Region Information
Extract and translate regions for h(p,n/2)
[Figure: translated read and write regions from h(p,n/2), drawn on the range p to p+n-1]

63 Verifying Region Information
Check inclusion in declared regions of caller
[Figure: declared regions for h(p,n) against translated regions from h(p,n/2)]

64 Verifying Region Information
Similarly for call h(p+n/2,n-n/2)
[Figure: translated read and write regions from h(p+n/2,n-n/2), drawn on the range p to p+n-1]

65 Verifying Region Information
Check inclusion in declared regions of caller
[Figure: declared regions for h(p,n) against translated regions from h(p+n/2,n-n/2)]

66 Verifying Region Information
Result
  Region declarations separate analysis of multiple procedures
  Transformed global, whole-program analysis into local analysis that operates on each procedure independently
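The callee check for h can be worked through numerically. Substituting the actual arguments into the declared region [p,p+n-1] gives [p,p+n/2-1] for h(p,n/2) and [p+n/2,p+n-1] for h(p+n/2,n-n/2), and both must lie inside the caller's [p,p+n-1]. The sketch below tests that for concrete values of p and n. It is a spot check with hypothetical names, and p is modeled as an integer offset; the analysis described in the talk establishes the inclusion symbolically.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive interval of offsets; empty when lo > hi. */
typedef struct { long lo, hi; } Interval;

/* Declared read and write region of h(p,n): [p, p+n-1]. */
static Interval declared(long p, long n) {
    return (Interval){p, p + n - 1};
}

/* True if `inner` is empty or lies entirely within `outer`. */
static bool contains(Interval outer, Interval inner) {
    if (inner.lo > inner.hi) return true;
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/* Translate the callee declarations to the two call sites in h and
   check them against the caller's own declaration. */
static bool callees_within_declared(long p, long n) {
    Interval caller = declared(p, n);
    Interval first  = declared(p, n / 2);             /* h(p, n/2) */
    Interval second = declared(p + n / 2, n - n / 2); /* h(p+n/2, n-n/2) */
    return contains(caller, first) && contains(caller, second);
}
```

The direct accesses need the same treatment: the loop touches p[0] through p[n-1], which is exactly the declared [p,p+n-1].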

67 Experience

68 Experience
Implemented points-to and region languages
  Integrated with points-to and region analyses
Obtained Divide and Conquer Benchmarks
  Sorting Programs: Quicksort (QS), Mergesort (MS)
  Dense Matrix Computations: Matrix multiply (MM), LU decomposition (LU)
  Scientific Computation: Heat (H)
  Written in C
We added points-to and region information

69 Results
With points-to and region information, could parallelize all benchmarks
Points-to information speeds up points-to analysis significantly (up to factor of two)
Region information has no significant effect on how fast region analysis runs

70 Programming Overhead
Proportion of C Code, Region Declarations, and Points-to Declarations
[Figure: bar chart, proportion (0.00 to 1.00) of C code, region declarations, and points-to declarations for QS, MS, MM, LU, H]

71 Evaluation
How difficult is it to provide declarations?
  Not that difficult.
  Have to write comparatively little code
  Must know information anyway
How much benefit does compiler obtain?
  Substantial benefit.
  Simpler analysis software (no complex interprocedural analysis)
  More scalable, precise analysis

72 Evaluation
Software Engineering Benefits of Points-to and Region Declarations
  Analysis reflects programmer's intention
  Enhanced code reliability
  Enhanced interface information
  Analyze incomplete programs
    Programs that use libraries
    Programs under development

73 Evaluation
Drawbacks of Points-to and Region Declarations
  Have to learn new language
  Have to integrate into development process
  Legacy software issues (programmer may not know points-to and region information)

74 Related Work
Extended Type Systems
  FX/87 [GJLS87]
  Dependent Types [XF99]
Issue: where to put extended type information?
  Integrated with rest of program
  Separated from rest of program
Program Verification
  ESC [DLNS98]
  PVS [ORRSS96]

75 Related Work
Pointer Analysis
  Landi, Ryder, Zhang – PLDI93
  Emami, Ghiya, Hendren – PLDI94
  Wilson, Lam – PLDI96
  Rugina, Rinard – PLDI99
  Rountev, Ryder – CC01
Region Analysis
  Triolet, Irigoin, Feautrier – PLDI86
  Havlak, Kennedy – IEEE TPDS91
  Rugina, Rinard – PLDI00
Pointer Specifications
  Hendren, Hummel, Nicolau – PLDI92
  Guyer, Lin – LCPC00

76 Conclusion
Basic idea:
  Programmer provides
    Points-to information
    Region information
  Analysis
    Verifies correctness
    Uses information to enable further analyses and transformations
Lots of benefits to compiler and programmer

