Program Analysis and Design Conformance Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.


1 Program Analysis and Design Conformance Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

2 Research Overview: Program Analysis. Commutativity Analysis for C++ Programs [PLDI96]. Memory Disambiguation for Multithreaded C Programs: Pointer Analysis [PLDI99], Region Analysis [PPoPP99, PLDI00]. Pointer and Escape Analysis for Multithreaded Java Programs [OOPSLA99, PLDI01, PPoPP01].

3 Research Overview: Transformations. Automatic Parallelization: Object-Oriented Programs with Linked Data Structures [PLDI96], Divide and Conquer Programs [PPoPP99, PLDI00]. Synchronization Optimizations: Lock Coarsening [POPL97, PLDI98], Synchronization Elimination [OOPSLA99], Optimistic Synchronization Primitives [PPoPP97]. Memory Management Optimizations: Stack Allocation [OOPSLA99, PLDI01], Per-Thread Heap Allocation.

4 Research Overview: Verifications of Safety Properties. Data Race Freedom [PLDI00]. Array Bounds Checks [PLDI00]. Correctness of Region-Based Allocation [PPoPP01]. Credible Compilation [RTRV99]: Correctness of Dataflow Analysis Results, Correctness of Standard Compiler Optimizations.

5 Talk Overview. Memory Disambiguation. Goal: Verify Data Race Freedom for Multithreaded Divide and Conquer Programs. Analyses: Pointer Analysis, Accessed Region Analysis. Experience integrating information from the developer into the memory disambiguation analysis. Role Verification. Design Conformance.

6 Basic Memory Disambiguation Problem. *p = v; (write v into the memory location that p points to). What memory locations may *p = v access? Without any analysis: *p = v may access any location.

7 Basic Memory Disambiguation Problem. *p = v; (write v into the memory location that p points to). What memory locations may *p = v access? With analysis: *p = v may access this location; *p = v does not access these memory locations!

8 Static Memory Disambiguation Analyze the program to characterize the memory locations that statements in the program read and write Fundamental problem in program analysis with many applications

9 Application: Verify Data Race Freedom *p = v1; *q = v2; *q = v2 *p = v1 || *q = v2 *p = v1 Program Does This NOT This

10 Example - Divide and Conquer Sort. Input: 4 7 6 1 5 3 8 2

11 Divide: 4 7 | 6 1 | 5 3 | 8 2

12 Example - Divide and Conquer Sort. Divide: 4 7 | 6 1 | 5 3 | 8 2. Conquer: 4 7 | 1 6 | 3 5 | 2 8

13 Example - Divide and Conquer Sort. Divide: 4 7 | 6 1 | 5 3 | 8 2. Conquer: 4 7 | 1 6 | 3 5 | 2 8. Combine: 1 4 6 7 | 2 3 5 8

14 Example - Divide and Conquer Sort. Divide: 4 7 | 6 1 | 5 3 | 8 2. Conquer: 4 7 | 1 6 | 3 5 | 2 8. Combine: 1 4 6 7 | 2 3 5 8, then 1 2 3 4 5 6 7 8

15 Divide and Conquer Algorithms Lots of Generated Concurrency Solve Subproblems in Parallel

16 Divide and Conquer Algorithms Lots of Recursively Generated Concurrency Recursively Solve Subproblems in Parallel

17 Divide and Conquer Algorithms Lots of Recursively Generated Concurrency Recursively Solve Subproblems in Parallel Combine Results in Parallel

18 “Sort n Items in d, Using t as Temporary Storage”
void sort(int *d, int *t, int n) {
  if (n > CUTOFF) {
    spawn sort(d, t, n/4);
    spawn sort(d+n/4, t+n/4, n/4);
    spawn sort(d+2*(n/4), t+2*(n/4), n/4);
    spawn sort(d+3*(n/4), t+3*(n/4), n-3*(n/4));
    sync;
    spawn merge(d, d+n/4, d+n/2, t);
    spawn merge(d+n/2, d+3*(n/4), d+n, t+n/2);
    sync;
    merge(t, t+n/2, t+n, d);
  } else {
    insertionSort(d, d+n);
  }
}

19 “Sort n Items in d, Using t as Temporary Storage” (sort code as on slide 18). Divide array into subarrays and recursively sort subarrays in parallel: spawn sort(d,t,n/4); spawn sort(d+n/4,t+n/4,n/4); spawn sort(d+2*(n/4),t+2*(n/4),n/4); spawn sort(d+3*(n/4),t+3*(n/4),n-3*(n/4)); sync;

20 “Sort n Items in d, Using t as Temporary Storage” (sort code as on slide 18). Subproblems Identified Using Pointers Into Middle of Array: d, d+n/4, d+n/2 and d+3*(n/4) point into the array 4 7 6 1 5 3 8 2.

21 “Sort n Items in d, Using t as Temporary Storage” (sort code as on slide 18). Sorted Results Written Back Into Input Array: the quarters starting at d, d+n/4, d+n/2 and d+3*(n/4) now hold 4 7 | 1 6 | 3 5 | 2 8.

22 “Merge Sorted Quarters of d Into Halves of t” (sort code as on slide 18): spawn merge(d,d+n/4,d+n/2,t); spawn merge(d+n/2,d+3*(n/4),d+n,t+n/2); sync; Array at d: 4 7 | 1 6 | 3 5 | 2 8. Halves at t and t+n/2: 1 4 6 7 | 2 3 5 8.

23 “Merge Sorted Halves of t Back Into d” (sort code as on slide 18): merge(t,t+n/2,t+n,d); Halves at t and t+n/2: 1 4 6 7 | 2 3 5 8. Array at d: 1 2 3 4 5 6 7 8.

24 “Use a Simple Sort for Small Problem Sizes” (sort code as on slide 18): insertionSort(d,d+n); d and d+n delimit a small subarray of 4 7 6 1 5 3 8 2.

25 “Use a Simple Sort for Small Problem Sizes” (sort code as on slide 18): insertionSort(d,d+n); Array after the subarray between d and d+n is sorted: 4 7 1 6 5 3 8 2.

26 What Do You Need To Know To Verify Data Race Freedom? Points-to Information (data blocks that pointers point into) Region Information (accessed regions within data blocks)

27 Information Needed To Verify Race Freedom. Calls: sort(d,t,n/4); sort(d+n/4,t+n/4,n/4); sort(d+n/2,t+n/2,n/4); sort(d+3*(n/4),t+3*(n/4),n-3*(n/4)); d and t point to different memory blocks. Calls to sort access disjoint parts of d and t. Together, calls access [d,d+n-1] and [t,t+n-1].

28 Information Needed To Verify Race Freedom. Calls: merge(d,d+n/4,d+n/2,t); merge(d+n/2,d+3*(n/4),d+n,t+n/2); merge(t,t+n/2,t+n,d); d and t point to different memory blocks. First two calls to merge access disjoint parts of d and t. Together, calls access [d,d+n-1] and [t,t+n-1].

29 Information Needed To Verify Race Freedom. Call: insertionSort(d,d+n); Calls to insertionSort access [d,d+n-1].

30 What Do You Need To Know To Verify Data Race Freedom? Points-to Information (d and t point to different data blocks) Symbolic Region Information (accessed regions within d and t blocks)

31 How Hard Is It To Figure These Things Out?

32 Challenging How Hard Is It For the Program Analysis To Figure These Things Out?

33 Not immediately obvious that insertionSort(l,h) accesses [l,h-1]
void insertionSort(int *l, int *h) {
  int *p, *q, k;
  for (p = l+1; p < h; p++) {
    for (k = *p, q = p-1; l <= q && k < *q; q--)
      *(q+1) = *q;
    *(q+1) = k;
  }
}

34 How Hard Is It For the Program Analysis To Figure These Things Out? Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
void merge(int *l1, int *m, int *h2, int *d) {
  int *h1 = m;
  int *l2 = m;
  while ((l1 < h1) && (l2 < h2))
    if (*l1 < *l2) *d++ = *l1++;
    else *d++ = *l2++;
  while (l1 < h1) *d++ = *l1++;
  while (l2 < h2) *d++ = *l2++;
}
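The three routines on the preceding slides can be compiled as ordinary sequential C by erasing the Cilk keywords. The following is a minimal sketch under stated assumptions, not the author's benchmark code: spawn and sync are defined as empty macros (so the calls run one after another), CUTOFF is set to 4 because the slides do not give its value, insertionSort is rewritten so that q never moves below l, and 2*(n/4) is used wherever the slides write n/2 so that the quarter boundaries agree when n is not a multiple of 4. The helper check_sort is hypothetical and exists only to exercise the code.

```c
#include <assert.h>

/* Assumption: erasing the Cilk keywords gives a serial version of the program. */
#define spawn
#define sync

/* Assumption: the slides do not give CUTOFF. It must be at least 3 so that
   every recursive call receives a strictly smaller, non-empty subarray. */
#define CUTOFF 4

/* Sorts [l, h) in place; accesses only [l, h-1]. */
static void insertionSort(int *l, int *h) {
    for (int *p = l + 1; p < h; p++) {
        int k = *p;
        int *q = p;
        while (q > l && k < *(q - 1)) {
            *q = *(q - 1);
            q--;
        }
        *q = k;
    }
}

/* Merges sorted [l1, m) and [m, h2) into d; reads [l1, h2-1],
   writes [d, d+(h2-l1)-1]. */
static void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m;
    int *l2 = m;
    while (l1 < h1 && l2 < h2) {
        if (*l1 < *l2) *d++ = *l1++;
        else *d++ = *l2++;
    }
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Sorts n items in d, using t as temporary storage. */
static void sort(int *d, int *t, int n) {
    assert(n >= 1);
    if (n > CUTOFF) {
        int q = n / 4;                      /* quarter size */
        spawn sort(d, t, q);
        spawn sort(d + q, t + q, q);
        spawn sort(d + 2 * q, t + 2 * q, q);
        spawn sort(d + 3 * q, t + 3 * q, n - 3 * q);
        sync;
        spawn merge(d, d + q, d + 2 * q, t);
        spawn merge(d + 2 * q, d + 3 * q, d + n, t + 2 * q);
        sync;
        merge(t, t + 2 * q, t + n, d);
    } else {
        insertionSort(d, d + n);
    }
}

/* Hypothetical test helper: fills an array with scrambled values, sorts it,
   and checks that the result is ordered and has the same sum. */
static int check_sort(int n) {
    int d[64], t[64];
    long before = 0, after = 0;
    assert(n >= 1 && n <= 64);
    for (int i = 0; i < n; i++) {
        d[i] = (i * 37 + 11) % 101;
        before += d[i];
    }
    sort(d, t, n);
    for (int i = 0; i < n; i++) {
        after += d[i];
        if (i > 0 && d[i - 1] > d[i]) return 0;
    }
    return before == after;
}
```

Calling check_sort(8) follows the four-quarter picture from the example slides; a size such as 11 exercises the uneven last quarter.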

35 Issues. Heavy Use of Pointers: Pointers into Middle of Arrays, Pointer Arithmetic, Pointer Comparison. Multiple Procedures: sort(int *d, int *t, int n), insertionSort(int *l, int *h), merge(int *l, int *m, int *h, int *t). Recursion. Multithreading.

36 Pointer Analysis. For each program point, computes where each pointer may point, e.g. “p → x before statement *p = 1”. Complications: 1. Statically unbounded number of locations: recursive data structures (lists, trees), dynamically allocated arrays. 2. Multiple possible executions of the program may create different dynamic data structures.

37 Memory Abstraction. Physical Memory (Stack, Heap) is abstracted into Abstract Memory: one allocation block for each variable declaration, and one allocation block for each memory allocation site. [Diagram labels: p, i, head, r, q, v, j]

39 Pointer Analysis Summary. Key Challenge for Multithreaded Programs: Analyzing interactions between threads. Solution: Interference Edges. Record edges generated by each thread. Captures effect of parallel threads on points-to information of other threads.

40 What Pointer Analysis Gives Us. Disambiguation of Memory Accesses Via Pointers. Pointer-based loads and stores: use pointer analysis results to derive the allocation block that each pointer-based load or store statement accesses. MOD-REF or READ-WRITE SETS Analysis. All loads and stores. Procedures: use the memory access information for loads and stores to compute the allocation blocks that each procedure accesses.

41 Is This Information Enough?

42 NO Necessary but not Sufficient Parallel Tasks Access (Disjoint) Regions of Same Allocated Block of Memory

43 Structure of Analysis. Pointer Analysis: Disambiguate Memory at the Granularity of Allocation Blocks. Bounds Analysis: Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure. Region Analysis: Symbolic Regions Accessed By Execution of Each Procedure. Data Race Freedom: Check that Parallel Threads Are Independent.

44 Running Example – Array Increment
void f(char *p, int n) {
  if (n > CUTOFF) {
    spawn f(p, n/2);      /* increment first half */
    spawn f(p+n/2, n/2);  /* increment second half */
    sync;
  } else {
    /* base case: increment small array */
    int i = 0;
    while (i < n) { *(p+i) += 1; i++; }
  }
}

45 Intra-procedural Bounds Analysis. [Pipeline: Pointer Analysis, Bounds Analysis, Region Analysis, Data Race Detection] Bounds Analysis: Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure.

46 Intraprocedural Bounds Analysis. GOAL: For each pointer and array index variable at each program point, derive lower and upper bounds. E.g. “$0 \le i \le n-1$ at statement *(p+i) += 1”. Bounds are symbolic expressions: variables represent initial values of parameters of enclosing procedure; bounds are combinations of variables; example expression for f(p,n): p+(n/2)-1.

47 Intraprocedural Bounds Analysis. What are upper and lower bounds for i at each program point in base case? int i = 0; while (i < n) { *(p+i) += 1; i++; }

48 Bounds Analysis, Step 1: Build control flow graph. Blocks: [i = 0], then [i < n], then [*(p+i) += 1; i = i+1], which loops back to [i < n].

49 Bounds Analysis, Step 2: Set up bounds at beginning of basic blocks. Before i = 0: $l_1 \le i \le u_1$. Before i < n: $l_2 \le i \le u_2$. Before *(p+i) += 1: $l_3 \le i \le u_3$.

50 Bounds Analysis, Step 3: Compute transfer functions. After i = 0: $0 \le i \le 0$. After *(p+i) += 1: $l_3 \le i \le u_3$. After i = i+1: $l_3+1 \le i \le u_3+1$.

51 Bounds Analysis, Step 3: Compute transfer functions. After the test i < n: $l_2 \le i \le n-1$ on the true branch, $n \le i \le u_2$ on the false branch.

52 Bounds Analysis, Step 4: Key Step: set up constraints for bounds. Build Region Constraints: $[0, 0] \subseteq [l_2, u_2]$, $[l_3+1, u_3+1] \subseteq [l_2, u_2]$, $[l_2, n-1] \subseteq [l_3, u_3]$.

55 Bounds Analysis, Step 4: Key Step: set up constraints for bounds. Same region constraints; the bounds at procedure entry are $-\infty \le i \le +\infty$.

57 Bounds Analysis, Step 4: Key Step: set up constraints for bounds. The region constraints give the Inequality Constraints: $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$ (lower bounds) and $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$ (upper bounds).

58 Bounds Analysis, Step 5: Generate symbolic expressions for bounds. Goal: express bounds in terms of parameters. $l_2 = c_1 p + c_2 n + c_3$, $l_3 = c_4 p + c_5 n + c_6$, $u_2 = c_7 p + c_8 n + c_9$, $u_3 = c_{10} p + c_{11} n + c_{12}$.

59 Bounds Analysis, Step 5: Generate symbolic expressions for bounds. The same four expressions, shown next to the inequality constraints $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$, $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$.

60 Bounds Analysis, Step 6: Substitute expressions into constraints. Lower bounds: $c_1 p + c_2 n + c_3 \le 0$; $c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6 + 1$; $c_4 p + c_5 n + c_6 \le c_1 p + c_2 n + c_3$. Upper bounds: $0 \le c_7 p + c_8 n + c_9$; $c_{10} p + c_{11} n + c_{12} + 1 \le c_7 p + c_8 n + c_9$; $n - 1 \le c_{10} p + c_{11} n + c_{12}$.

61 Bounds Analysis, Step 7: Reduce symbolic inequalities to linear inequalities. $c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6$ if $c_1 \le c_4$, $c_2 \le c_5$, and $c_3 \le c_6$.
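The reduction in Step 7 can be checked on small cases. The sketch below assumes that p and n only take non-negative values, which is what makes the coefficient-wise comparison sound; the slide does not state this condition, so treat it as an assumption. The type LinExpr and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* A symbolic bound of the form cp*p + cn*n + c0. */
typedef struct { long cp, cn, c0; } LinExpr;

static long eval(LinExpr e, long p, long n) {
    return e.cp * p + e.cn * n + e.c0;
}

/* Step 7: compare coefficient by coefficient instead of comparing
   the symbolic expressions themselves. */
static bool coeffwise_le(LinExpr a, LinExpr b) {
    return a.cp <= b.cp && a.cn <= b.cn && a.c0 <= b.c0;
}

/* Brute-force check that a <= b for every 0 <= p, n <= limit. */
static bool le_on_grid(LinExpr a, LinExpr b, long limit) {
    for (long p = 0; p <= limit; p++)
        for (long n = 0; n <= limit; n++)
            if (eval(a, p, n) > eval(b, p, n)) return false;
    return true;
}

/* Hypothetical helpers building the expressions used in the checks. */
static LinExpr n_minus_1(void) { LinExpr e = {0, 1, -1}; return e; }
static LinExpr just_n(void)    { LinExpr e = {0, 1, 0};  return e; }
static LinExpr just_p(void)    { LinExpr e = {1, 0, 0};  return e; }
```

For example, n-1 and n pass the coefficient-wise test and the grid check agrees, while p and n fail the coefficient-wise test and the grid check finds a counterexample at p = 1, n = 0.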

62 Bounds Analysis, Step 8: Apply reduction and generate a linear program. Lower bounds: $c_1 \le 0$, $c_2 \le 0$, $c_3 \le 0$; $c_1 \le c_4$, $c_2 \le c_5$, $c_3 \le c_6+1$; $c_4 \le c_1$, $c_5 \le c_2$, $c_6 \le c_3$. Upper bounds: $0 \le c_7$, $0 \le c_8$, $0 \le c_9$; $c_{10} \le c_7$, $c_{11} \le c_8$, $c_{12}+1 \le c_9$; $0 \le c_{10}$, $1 \le c_{11}$, $-1 \le c_{12}$ (from $n-1 \le u_3$).

63 Bounds Analysis, Step 8: Apply reduction and generate a linear program. Same constraints on the lower bounds and upper bounds. Objective Function: max: $(c_1 + \dots + c_6) - (c_7 + \dots + c_{12})$.

64 Bounds Analysis, Step 9: Solve linear program to extract bounds. Solution: $c_1=0$, $c_2=0$, $c_3=0$; $c_4=0$, $c_5=0$, $c_6=0$; $c_7=0$, $c_8=1$, $c_9=0$; $c_{10}=0$, $c_{11}=1$, $c_{12}=-1$.

65 Bounds Analysis, Step 9: Solve linear program to extract bounds. Solution as on the previous slide. Symbolic Bounds: $l_2 = 0$, $l_3 = 0$, $u_2 = n$, $u_3 = n-1$.
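The stated solution can be checked against the linear program directly. The sketch below is a brute-force check and not an LP solver: it encodes the constraints obtained by applying the Step 7 reduction to the six inequalities $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$, $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$, where the last one reduces to $0 \le c_{10}$, $1 \le c_{11}$, $-1 \le c_{12}$. The function names and the search box are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* c[1..12] are the coefficients from the slides; c[0] is unused so that
   the indices match. */
static bool feasible(const int c[13]) {
    return
        /* l2 <= 0      */ c[1] <= 0 && c[2] <= 0 && c[3] <= 0 &&
        /* l2 <= l3 + 1 */ c[1] <= c[4] && c[2] <= c[5] && c[3] <= c[6] + 1 &&
        /* l3 <= l2     */ c[4] <= c[1] && c[5] <= c[2] && c[6] <= c[3] &&
        /* 0 <= u2      */ 0 <= c[7] && 0 <= c[8] && 0 <= c[9] &&
        /* u3 + 1 <= u2 */ c[10] <= c[7] && c[11] <= c[8] && c[12] + 1 <= c[9] &&
        /* n - 1 <= u3  */ 0 <= c[10] && 1 <= c[11] && -1 <= c[12];
}

/* Objective: (c1 + ... + c6) - (c7 + ... + c12). */
static int objective(const int c[13]) {
    int lower = 0, upper = 0;
    for (int k = 1; k <= 6; k++) lower += c[k];
    for (int k = 7; k <= 12; k++) upper += c[k];
    return lower - upper;
}

/* The solution given on the slides. */
static const int SOLUTION[13] = {0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, -1};

/* Enumerates every integer point with lo <= c[k] <= hi and returns the best
   objective value among the feasible ones (INT_MIN if none is feasible). */
static int best_objective_in_box(int lo, int hi) {
    int c[13];
    int best = INT_MIN;
    c[0] = 0;
    for (int k = 1; k <= 12; k++) c[k] = lo;
    for (;;) {
        if (feasible(c) && objective(c) > best) best = objective(c);
        int k = 1;
        while (k <= 12 && c[k] == hi) c[k++] = lo;   /* odometer carry */
        if (k > 12) break;
        c[k]++;
    }
    return best;
}
```

Under these constraints the slide's solution is feasible with objective value -1, and no integer point with coefficients in [-1, 1] does better.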

66 Bounds Analysis, Step 10: Substitute bounds at each program point. Symbolic Bounds: $l_2 = 0$, $l_3 = 0$, $u_2 = n$, $u_3 = n-1$. Before i = 0: $-\infty \le i \le +\infty$. After i = 0: $0 \le i \le 0$. Before i < n: $0 \le i \le n$. After the test: $0 \le i \le n-1$ on the true branch, $n \le i \le n$ on the false branch. Before and after *(p+i) += 1: $0 \le i \le n-1$. After i = i+1: $1 \le i \le n$.

67 Compute access regions at each load or store. With $0 \le i \le n-1$ at *(p+i) += 1, the Access Region is [p, p+n-1].
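The derived bounds can be observed at run time by instrumenting the base-case loop. This is a dynamic check written for illustration, whereas the analysis in the slides establishes the bounds statically; the function names are hypothetical and the sketch assumes n >= 0.

```c
#include <assert.h>

/* Runs the base-case loop of f and asserts the bounds from Step 10 at each
   program point. Returns the number of stores performed. Assumes n >= 0. */
static int increment_checked(char *p, int n) {
    int stores = 0;
    int i = 0;
    assert(i == 0);                       /* after i = 0: 0 <= i <= 0 */
    for (;;) {
        assert(0 <= i && i <= n);         /* before the test: 0 <= i <= n */
        if (!(i < n)) break;
        assert(0 <= i && i <= n - 1);     /* true branch: 0 <= i <= n-1 */
        *(p + i) += 1;                    /* store stays inside [p, p+n-1] */
        stores++;
        i = i + 1;
        assert(1 <= i && i <= n);         /* after i = i+1: 1 <= i <= n */
    }
    assert(i == n);                       /* false branch: n <= i <= n */
    return stores;
}

/* Hypothetical helper: surrounds a 6-byte region with guard bytes and checks
   that only the region [p, p+n-1] was written. */
static int guards_intact_after_increment(void) {
    char buf[8] = {0};
    int stores = increment_checked(buf + 1, 6);
    int inside_ok = 1;
    for (int k = 1; k <= 6; k++) inside_ok = inside_ok && buf[k] == 1;
    return stores == 6 && inside_ok && buf[0] == 0 && buf[7] == 0;
}
```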

68 Interprocedural Region Analysis. [Pipeline: Pointer Analysis, Bounds Analysis, Region Analysis, Data Race Detection] Region Analysis: Symbolic Regions Accessed By Execution of Each Procedure.

69 Interprocedural Region Analysis. GOAL: Compute accessed regions of memory for each procedure. E.g. “f(p,n) accesses [p, p+n-1]”. Same Approach: Set up target bounds of accessed regions. Build a constraint system to compute these bounds. Constraint System: Accessed regions for a procedure must include: 1. Regions accessed by statements in the procedure 2. Regions accessed by invoked procedures.

70 Region Analysis in Example (code for f as on slide 44). The base-case store *(p+i) += 1 accesses [p, p+n-1].

71 Region Analysis in Example (code for f as on slide 44). f(p,n) accesses [l(p,n), u(p,n)]. The base-case store accesses [p, p+n-1].

72 Region Analysis in Example (code for f as on slide 44). f(p,n) accesses [l(p,n), u(p,n)]. spawn f(p, n/2) accesses [l(p,n/2), u(p,n/2)]. spawn f(p+n/2, n/2) accesses [l(p+n/2,n/2), u(p+n/2,n/2)]. The base-case store accesses [p, p+n-1].

73 Derive Constraint System. Region constraints: $[l(p,n/2), u(p,n/2)] \subseteq [l(p,n), u(p,n)]$; $[l(p+n/2,n/2), u(p+n/2,n/2)] \subseteq [l(p,n), u(p,n)]$; $[p, p+n-1] \subseteq [l(p,n), u(p,n)]$. Reduce to inequalities between lower/upper bounds. Further reduce to a linear program and solve: $l(p,n) = p$, $u(p,n) = p+n-1$. Access region for f(p,n): [p, p+n-1].
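The solution l(p,n) = p, u(p,n) = p+n-1 can be checked against the three region constraints for concrete values. In this sketch the pointer p is modeled as an integer offset, n/2 uses C integer division as in the running example, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive region [lo, hi] of offsets; empty when hi < lo. */
typedef struct { long lo, hi; } Region;

/* Candidate solution: f(p, n) accesses [p, p+n-1]. */
static Region f_region(long p, long n) {
    Region r = { p, p + n - 1 };
    return r;
}

/* True when inner is contained in outer (bound-by-bound comparison). */
static bool contains(Region outer, Region inner) {
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/* The three region constraints for f at one (p, n). */
static bool constraints_hold(long p, long n) {
    Region whole = f_region(p, n);
    Region base = { p, p + n - 1 };            /* base-case loop */
    return contains(whole, f_region(p, n / 2))             /* first call  */
        && contains(whole, f_region(p + n / 2, n / 2))     /* second call */
        && contains(whole, base);
}

/* Checks the constraints for all 0 <= p <= max_p and 0 <= n <= max_n. */
static bool constraints_hold_up_to(long max_p, long max_n) {
    for (long p = 0; p <= max_p; p++)
        for (long n = 0; n <= max_n; n++)
            if (!constraints_hold(p, n)) return false;
    return true;
}
```

A region that is too small, such as [0, 3] for a call that touches [4, 7], fails the containment test, which is how an incorrect candidate would be rejected.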

74 Data Race Freedom. [Pipeline: Pointer Analysis, Bounds Analysis, Region Analysis, Data Race Freedom] Data Race Freedom: Check that Parallel Threads Are Independent.

75 Data Race Freedom. Dependence testing of two statements: Do accessed regions intersect? Based on comparing upper and lower bounds of accessed regions. Absence of data races: Check that all the statements that execute in parallel are independent.

76 Data Race Freedom (code for f as on slide 44). f(p,n) accesses [p, p+n-1].

77 Data Race Freedom (code for f as on slide 44). f(p,n) accesses [p, p+n-1]. spawn f(p, n/2) accesses [p, p+n/2-1]. spawn f(p+n/2, n/2) accesses [p+n/2, p+n-1].

78 Data Race Freedom (code for f as on slide 44). The regions of the two spawned calls do not intersect: No data races!
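The dependence test amounts to an interval-intersection check on the accessed regions. The sketch below applies it to every pair of parallel calls in the recursion tree of f for a concrete n; it is a concrete-value illustration, whereas the analysis in the slides compares symbolic bounds. Pointers are modeled as integer offsets, CUTOFF is an assumed value, and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define CUTOFF 4   /* assumption: the slides do not give its value */

/* Inclusive region [lo, hi] of offsets; empty when hi < lo. */
typedef struct { long lo, hi; } Region;

/* f(p, n) accesses [p, p+n-1]. */
static Region f_region(long p, long n) {
    Region r = { p, p + n - 1 };
    return r;
}

/* Dependence test: compare upper and lower bounds of the two regions. */
static bool intersects(Region a, Region b) {
    if (a.hi < a.lo || b.hi < b.lo) return false;   /* an empty region */
    return a.lo <= b.hi && b.lo <= a.hi;
}

/* Walks the recursion tree of f and checks that the two calls spawned
   together never access intersecting regions. */
static bool race_free(long p, long n) {
    if (n > CUTOFF) {
        Region first = f_region(p, n / 2);
        Region second = f_region(p + n / 2, n / 2);
        if (intersects(first, second)) return false;
        return race_free(p, n / 2) && race_free(p + n / 2, n / 2);
    }
    return true;   /* base case runs sequentially */
}
```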

79 Fundamental Property of the Analysis: No Fixed Point Computations. The analysis does not use fixed-point computations: the problem is reduced to a linear program, and the solution to the linear program directly gives the symbolic lower and upper bounds. Fixed-point approaches: termination is not guaranteed, because the analysis domain of symbolic expressions has infinite ascending chains. They use imprecise techniques to ensure termination: artificially truncate the number of iterations, or use imprecise widening operators.
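The termination problem can be seen on the running loop. The sketch below iterates interval equations for i at the loop header of "i = 0; while (i < n) i++;" for a concrete n, without widening. It uses a plain interval domain as a stand-in for the symbolic-expression domain, and the function names are hypothetical. The number of rounds grows with n, so for a symbolic n there is no bound on the length of the chain.

```c
#include <assert.h>

typedef struct { long lo, hi; } Interval;

/* Least interval containing both arguments. */
static Interval join(Interval a, Interval b) {
    Interval r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Iterates to a fixed point for a concrete n >= 1 and returns the number of
   rounds needed; the fixed point at the loop header is stored in *out. */
static int iterate_to_fixed_point(long n, Interval *out) {
    Interval init = { 0, 0 };            /* after i = 0 */
    Interval header = init;
    int rounds = 0;
    for (;;) {
        rounds++;
        /* body entry: header restricted by the test i < n */
        Interval body = { header.lo, header.hi < n - 1 ? header.hi : n - 1 };
        /* back edge: after i = i + 1 */
        Interval back = { body.lo + 1, body.hi + 1 };
        Interval next = join(init, back);
        if (next.lo == header.lo && next.hi == header.hi) break;
        header = next;
    }
    *out = header;
    return rounds;
}

/* Hypothetical convenience wrappers. */
static int rounds_for(long n) {
    Interval r;
    return iterate_to_fixed_point(n, &r);
}

static long fixed_point_hi(long n) {
    Interval r;
    iterate_to_fixed_point(n, &r);
    return r.hi;
}
```

The result at the header is [0, n], which matches the bounds from the linear program, but it takes n + 1 rounds to reach.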

80 Experience. Set of benchmark programs. Two versions of each benchmark: sequential version written in C, multithreaded version written in Cilk. Experiments: 1. Data Race Freedom for the multithreaded versions. 2. Array Bounds Violation Detection for both sequential and multithreaded versions. 3. Automatic Parallelization for the sequential version.

81 Data Races and Array Bounds Violations. Columns: Application; Data races (multithreaded); Array Bounds Violations (multithreaded); Array Bounds Violations (sequential). Results by application: QuickSort NO; MergeSort NO; BlockMul NO; NoTempMul NO; LU NO; Knapsack YES / NO; Heat NO.

82 Parallel Performance. [Figure panels: Quicksort, Mergesort, Heat, BlockMul, NoTempMul, LU]

83 Summary. Sophisticated Memory Disambiguation Analysis: Points-to Information, Accessed Region Information. Automatic. Interprocedural. Handles Multithreaded Programs. Other Uses Besides Data Race Freedom: Bitwidth Analysis, Array-Bounds Check Elimination, Buffer Overrun Detection.

84 Bigger Picture. Analysis has a very specific goal. Developer understands and cares about results. Points-to and region information is (implicitly) part of the interface of each procedure. Developer understands interfaces. Developer has expectations about analysis results. Analysis can identify serious programming errors. Developer expectations are implicit.

85 Idea. Enhance procedure interface to make points-to and region information explicit. Points-to language: points-to graphs at entry and exit, effect on points-to relationships. Region language: symbolic specification of accessed regions. Developer provides information. Analysis verifies that it is correct, and that correctness implies data race freedom.

86 Points-to Language
f(p, q, n) {
  context {
    entry: p->_a, q->_b;
    exit: p->_a, _a->_c, q->_b, _b->_d;
  }
  context {
    entry: p->_a, q->_a;
    exit: p->_a, _a->_c, q->_a;
  }
}

87 Points-to Language (declaration of f(p, q, n) as on slide 86). Contexts for f(p,q,n): [Diagram: entry and exit points-to graphs over p and q, one pair per context]
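To make the two contexts concrete, here is one C procedure whose behavior would match both of them. The slides give only the points-to declaration of f, so the body below is a hypothetical example, with static arrays block_c and block_d standing in for the blocks named _c and _d. In the first context p and q point to different blocks, which end up pointing to _c and _d. In the second context p and q point to the same block, which ends up pointing to _c only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: static arrays stand in for the allocation blocks _c and _d. */
static int block_c[4];
static int block_d[4];

/* Hypothetical body for f. The store through q comes first, so when p and q
   alias, the store through p overwrites it and only _a->_c remains. */
static void f(int **p, int **q, int n) {
    (void)n;          /* n is unused in this illustration */
    *q = block_d;
    *p = block_c;
}

/* First context: entry p->_a, q->_b; exit _a->_c, _b->_d. */
static int first_context_holds(void) {
    int *a = NULL, *b = NULL;
    f(&a, &b, 4);
    return a == block_c && b == block_d;
}

/* Second context: entry p->_a, q->_a; exit _a->_c. */
static int second_context_holds(void) {
    int *a = NULL;
    f(&a, &a, 4);
    return a == block_c;
}
```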

88 Verifying Points-to Information: One (flow sensitive) analysis per context. [Diagram: f(p,q,n) { ... } beside the entry and exit points-to graphs of the contexts for f(p,q,n)]

89 Verifying Points-to Information: Start with entry points-to graph.

90 Verifying Points-to Information: Analyze procedure.

92 Verifying Points-to Information: Check result against exit points-to graph.

93 Verifying Points-to Information: Similarly for other context.

94 Verifying Points-to Information: Start with entry points-to graph.

95 Verifying Points-to Information: Analyze procedure.

96 Verifying Points-to Information: Check result against exit points-to graph.

97 Analysis of Call Statements. g(r,n) { ... f(r,s,n); ... }

98 Analysis of Call Statements: Analysis produces points-to graph (over r and s) before call.

99 Analysis of Call Statements: Retrieve declared contexts from callee. [Diagram: entry and exit graphs of the contexts for f(p,q,n)]

100 Analysis of Call Statements: Find context with matching entry graph.

102 Analysis of Call Statements: Apply corresponding exit points-to graph.

103 Analysis of Call Statements: Continue analysis after call.

104 Analysis of Call Statements. Result: Points-to declarations separate analysis of multiple procedures. Transformed global, whole-program analysis into local analysis that operates on each procedure independently.

105 Experience. Implemented points-to and region languages. Integrated with points-to and region analyses. Divide and Conquer Benchmarks: Sorting Programs: Quicksort (QS), Mergesort (MS). Dense Matrix Computations: Matrix multiply (MM), LU decomposition (LU). Scientific Computation: Heat (H). We added points-to and region information.

106 Programming Overhead. [Figure: for each of QS, MS, MM, LU, H, the proportion (0.00 to 1.00) of C Code, Region Declarations, and Points-to Declarations]

107 Evaluation. How difficult is it to provide declarations? Not that difficult: have to write comparatively little code; must know information anyway. How much benefit does analysis obtain? Substantial benefit: simpler analysis software (no complex interprocedural analysis); more scalable, precise analysis.

108 Evaluation. Software Engineering Benefits of Points-to and Region Declarations. Improved communication between developer and analysis. Analysis reflects developer’s expectations. Enhanced code reliability. Enhanced interface information. Analyze incomplete programs: programs that use libraries, programs under development.

109 Evaluation. Drawbacks of Points-to and Region Declarations. Have to learn new language. Have to integrate into development process. Legacy software issues (programmer may not know points-to and region information).

110 Steps to Design Conformance. Verify that Program Correctly Implements Key Design Properties as Expressed by Developer or Designer. Steps: Role Verification; Design Conformance for Object Models (joint with Daniel Jackson, MIT LCS). Context: Air Traffic Control Software, CTAS (Center/TRACON Automation System). Participants: MIT LCS (Daniel Jackson, Martin Rinard), MIT Aero-Astro Department (R. John Hansman), NASA Ames Research Center (Michelle Eshow), Kansas State University CS Dept. (David Schmidt).

111 Role Verification. Objects play different roles during their lifetime in computation: Parked Aircraft, Taxiing Aircraft, Cleared for Takeoff Aircraft, In Flight Aircraft. Roles reflect constraints on activities of object. System actions must respect role constraints: Parked Aircraft can’t take off. Action violations indicate system confusion. Goals: Obtain role information from developer. Check that program uses roles correctly.
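The role constraint "Parked Aircraft can't take off" can be illustrated with a small run-time check. Role verification as described in the talk is a static analysis, so the sketch below only shows the kind of constraint being checked, not the analysis itself. The transition table is an assumption: the slides name the four roles and state only the one constraint.

```c
#include <assert.h>
#include <stdbool.h>

/* The four roles named on the slide. */
typedef enum { PARKED, TAXIING, CLEARED_FOR_TAKEOFF, IN_FLIGHT } Role;

typedef struct {
    Role role;
} Aircraft;

/* Assumption: a simple progression of roles. Only "a parked aircraft cannot
   take off" comes from the slides; the rest is illustrative. */
static bool may_transition(Role from, Role to) {
    switch (from) {
    case PARKED:              return to == TAXIING;
    case TAXIING:             return to == CLEARED_FOR_TAKEOFF || to == PARKED;
    case CLEARED_FOR_TAKEOFF: return to == IN_FLIGHT;
    case IN_FLIGHT:           return to == TAXIING;   /* after landing */
    }
    return false;
}

/* A system action that must respect the role constraints. Returns false and
   leaves the aircraft unchanged when the action violates them. */
static bool take_off(Aircraft *a) {
    if (!may_transition(a->role, IN_FLIGHT)) return false;
    a->role = IN_FLIGHT;
    return true;
}

/* Hypothetical helper: attempts a takeoff from the given starting role. */
static bool can_take_off_from(Role start) {
    Aircraft a = { start };
    return take_off(&a);
}
```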

112 Role Classification. Two General Kinds of Classification: Content-based (predicate on object fields determines role); Relative (points-to relationships determine role). Role Classification is Application Dependent. [Diagram: Class Aircraft with Roles Flying Aircraft, Parked Aircraft, Taxiing Aircraft, Cleared Aircraft]

113 Standard View of Object. Fields: Flight Plan, Trajectory, Flight Name, Runway, Gate. Outgoing References: List of Meter Fixes, Sequence Of Points, String, Runway Object, Gate Object. Incoming References.

114 Relative Role Classification. Points-to relationships define roles. Specify sources of incoming edges: field of an object playing a given role; global or local variable. Specify target of outgoing edges. Specify available fields in each role.

115 Example Roles: Parked Aircraft. [Diagram labels: Aircraft; fields Flight Plan, Trajectory, Flight Name, Runway, Gate; Gate Object]

116 Example Roles: Cleared for Takeoff Aircraft. [Diagram labels: Aircraft; fields Flight Plan, Trajectory, Flight Name, Runway, Gate; Runway Object, List of Meter Fixes, String]

117 Role Verification Analysis Obtains Role Definitions Method Information Roles of parameters and globals on entry Role changes that method performs Role of return value Intraprocedural Analysis Simulates potential executions of method Precise abstraction of heap Use role information for invoked methods Verify correctness of role information

118 Benefits of Roles Software Engineering Benefits Safety checks that take application semantics into account Enhanced implementation transparency Transformations Enabled By Precise Referencing Behavior Safe real-time memory management Parallelization and race detection for Programs with linked data structures Optimized Atomic Transactions

119 Key Issue: Obtaining Role Information Range of Developer and Designer Involvement Some Involvement Reasonable and Necessary: Roles Reflect Application-Specific Properties Primary Focus: Role Definitions Determine analysis distinctions Relevance of extracted information Secondary Focus: Method Specifications Developer specifies roles of parameters Analysis extracts role changes

120 Design Conformance Software Development Activities Requirements Design Implementation Design is Partial Focus on Important Aspects Omit Many Low-Level Details Design and Implementation are Disconnected No guarantee that code conforms to design

121 Goal of Design Conformance Establish and mechanically check conformance Use specific design formalism (object models) Boxes (objects) and Arrows (relations between objects) Aircraft Flying Aircraft Parked Aircraft Taxiing Aircraft Cleared Aircraft Meter Fix Flight Plan ++

122 Key Issue Establishing correspondence between object model and implementation Object models usually at a higher level of abstraction Many relations in object model realized as group of objects and references Object model may entirely omit some objects or references Enables designer to focus on important aspects But complicates path to conformance analysis

123 Aircraft Flying Aircraft Parked Aircraft Taxiing Aircraft Cleared Aircraft Meter Fix Flight Plan ++ Gate Object Aircraft Flight Plan Trajectory Flight Name Runway Gate Trajectory Gate Runway Object Aircraft Flight Plan Runway Flight Name List of Meter Fixes String Aircraft Meter Fix Flight Plan + Abstract Object Model Concrete Object Model Intermediate Object Model Roles

124 Concretization Specifications Maps Between Object Models Enables Designer/Developer to Establish Correspondence Between Object Models Specify how Object Model is Realized in Code Foundation for design conformance analysis Guides implementation of object model Implementation patterns for object models

125 Design Conformance Benefits Higher Confidence in Software Promote clean implementation of design Guarantee important design properties Design becomes useful throughout entire development cycle Updated as implementation changes Reliable source of information Enables more precise, relevant analysis

126 Related Work. Pointer Analysis: Landi, Ryder, Zhang [PLDI93]; Emami, Ghiya, Hendren [PLDI94]; Wilson, Lam [PLDI96]; Rugina, Rinard [PLDI99]; Rountev, Ryder [CC01]; Salcianu, Rinard [PPoPP01]. Region Analysis: Triolet, Irigoin, Feautrier [PLDI86]; Havlak, Kennedy [IEEE TPDS91]; Rugina, Rinard [PLDI00]. Pointer Specifications: Hendren, Hummel, Nicolau [PLDI92]; Guyer, Lin [LCPC00].

127 Related Work. Shape Analysis [CWZ90, GH96, FL97, SRW99, MS01]. Extended Type Systems: FX/87 [GJLS87], Dependent Types [XF99]. Program Verification: ESC [DLNS98], PVS [ORRSS96]. Implementations of Object Models [HBR00].

128 Conclusion Developer and Designer Interact with Analysis Benefits More precise, relevant analysis Verify key safety and design properties Enhance utility of design Enable powerful transformations Key Issue: Determining appropriate abstractions to leverage Access regions, roles, object models Abstractions Share Several Features Identify important properties of data Relate properties of data to behavior of computation

