# Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv.


Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv

Shape Analysis
- Static program analysis
- Determines information about dynamically allocated storage
  - A pointer variable is not NULL
  - Two data structures are disjoint
- The algorithm is conservative

Applications of Shape Analysis
- Cleanness
  - Dor, Rodeh, Sagiv [SAS2000]
- Parallelization
  - Assmann, Weinhardt [PMMPC93]
  - Hendren, Nicolau [TPDS90]
  - Larus, Hilfinger [PLDI88]

Current State
- Good intraprocedural analyses: Sagiv, Reps, Wilhelm [TOPLAS 1998]
  - Analyze the body of list manipulation procedures: reverse, insert, delete
- Expensive, imprecise interprocedural analyses of recursive procedures

Main Results
- Interprocedural shape analysis algorithm for programs manipulating linked lists
  - Handles recursive procedures
- Prototype implementation
  - Successfully analyzed several list manipulating procedures: insert, delete, reverse, reverse_append
  - Properties verified:
    - An acyclic list remains acyclic
    - No memory leaks
    - No NULL dereference
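To make the first verified property concrete, here is a minimal sketch of what "acyclic" means for a single list, written as a runtime check using Floyd's cycle detection. This is only an illustration of the property: the analysis in the talk establishes it statically for all executions, and the function name `is_acyclic` is an assumption, not something from the slides.

```c
#include <assert.h>
#include <stddef.h>

typedef struct List { int data; struct List *n; } *L;

/* Returns 1 if following n-fields from head reaches NULL, 0 if it loops.
   Runtime illustration only; the talk's analysis proves this statically. */
int is_acyclic(const struct List *head) {
    const struct List *slow = head, *fast = head;
    assert(sizeof(L) == sizeof(struct List *)); /* L is just a pointer type */
    while (fast != NULL && fast->n != NULL) {
        slow = slow->n;        /* advances one cell per step  */
        fast = fast->n->n;     /* advances two cells per step */
        if (slow == fast)
            return 0;          /* the pointers met: there is a cycle */
    }
    return 1;                  /* fast reached the end of the list */
}
```

A static analysis cannot run such a loop over every possible heap, which is why the talk works with bounded abstract descriptors instead.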

Running Example

    typedef struct List {
        int data;
        struct List *n;
    } *L;

    L create(int s) {
        L t = NULL;
        if (s <= 0) return NULL;
        t = (L) malloc(sizeof(*t));
        t->data = s;
    l2: t->n = create(s-1);
        return t;
    }

    void main() {
        L r = NULL;
        int k;
        …
    l1: r = create(k);
    }

Selected Memory States (sequence of figures; each slide repeats the code of main or create from the running example)
- In main, before the call at l1: the stack holds only main's record (exit, k=3, r = NULL).
- Deepest point of the recursion: records exit (k=3, r = NULL), l1 (s=3, t), l2 (s=2, t), l2 (s=1, t), l2 (s=0, t = NULL); heap cells 3, 2, 1 and NULL.
- Record l2 (s=0) removed: records exit (k=3, r = NULL), l1 (s=3, t), l2 (s=2, t), l2 (s=1, t); heap cells 3, 2, 1 and NULL.
- Record l2 (s=1) removed: records exit (k=3, r = NULL), l1 (s=3, t), l2 (s=2, t); heap cells 3, 2, 1 and NULL.
- Record l2 (s=2) removed: records exit (k=3, r = NULL), l1 (s=3, t); heap shows the list 3, 2, 1, NULL.
- Back in main: record exit (k=3, r); r refers to the list 3, 2, 1, NULL.

Where is the Challenge?
- Dynamic allocation
  - Unbounded number of objects
- Recursion
  - Unbounded number of activation records
- Properties of:
  - Invisible instances of local variables
  - Dynamically allocated objects

(Figure: the stack at the deepest point of the recursion, with records exit (k=3, r = NULL), l1 (s=3, t), l2 (s=2, t), l2 (s=1, t), l2 (s=0, t = NULL) and heap cells 3, 2, 1 and NULL.)

Our Approach
- Reduce the interprocedural shape analysis problem to an intraprocedural problem: a program with procedures becomes a program without procedures
- Represent the activation record stack as a linked list
  - Control information
  - Invisible instances of local variables
- Explicit manipulation of the stack
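The idea of replacing procedure calls with explicit stack manipulation can be sketched on the running example. The code below is an illustration under assumptions, not the transformation used in the talk: the names `Frame`, `CallSite`, `push` and `create_explicit` are hypothetical, and only the records of create are modeled (main's record is left out).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct List { int data; struct List *n; } *L;

/* Hypothetical call-site labels: l1 is the call in main, l2 the recursive call. */
typedef enum { CS_L1, CS_L2 } CallSite;

/* One activation record of create, stored as a node of a linked list. */
typedef struct Frame {
    CallSite cs;       /* control information: where to return to */
    int s;             /* parameter s of this invocation          */
    L t;               /* this invocation's instance of local t   */
    struct Frame *pr;  /* record of the invoking procedure        */
} Frame;

static Frame *push(Frame *top, CallSite cs, int s) {
    Frame *f = malloc(sizeof *f);
    if (f == NULL) abort();
    f->cs = cs; f->s = s; f->t = NULL; f->pr = top;
    return f;
}

/* Procedure-free version of create(k): loops replace call and return. */
L create_explicit(int k) {
    Frame *top = push(NULL, CS_L1, k);
    L ret = NULL;                      /* value returned by the innermost call */
    assert(top != NULL);
    while (top->s > 0) {               /* "call" phase: body up to label l2 */
        top->t = malloc(sizeof *top->t);
        if (top->t == NULL) abort();
        top->t->data = top->s;
        top->t->n = NULL;
        top = push(top, CS_L2, top->s - 1);
    }
    for (;;) {                         /* "return" phase: pop one record at a time */
        Frame *done = top;
        CallSite cs = done->cs;
        top = done->pr;
        free(done);
        if (cs == CS_L1) break;        /* return site l1: back in main */
        top->t->n = ret;               /* return site l2: t->n = <result> */
        ret = top->t;                  /* ... followed by return t */
    }
    return ret;
}
```

Only the top record's `t` is visible at any moment; the `t` fields of the records reachable through `pr` are the invisible instances that the later slides are concerned with. The stored call site is what selects the return site in the second loop.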

Our Algorithm
- Abstract interpretation
  - Concrete semantics: concrete representation of memory states; effect of program statements
  - Abstract semantics: abstract representation of memory states; transfer functions
- Finds an abstract representation of the memory states at every program point

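Concrete Memory Descriptors (figure): stack nodes labeled cs exit, cs l1 and cs l2 (one of them marked top), connected by pr edges and with t edges into the heap, shown next to the memory state with records exit (k=3, r = NULL), l1 (s=3, t), l2 (s=2, t), l2 (s=1, t), l2 (s=0, t = NULL) and heap cells 3, 2, 1 and NULL.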

Concrete Memory Descriptors
- Relationships between memory elements:
  - Value of local variables: t, r
  - n-successor: n
  - Invoked by: pr
- Properties of memory elements:
  - “Type”: stack, heap
  - “Visibility”: top
  - “Call-site”: exit, cs l1, cs l2

(Figure: the descriptor from the previous slide, with nodes cs exit, cs l1, cs l2 and top, and pr and t edges.)

Bounding the Representation
- Concrete Memory Descriptors represent memory states
  - Every object is represented uniquely
- Abstract Memory Descriptors
  - Conservatively represent Concrete Memory Descriptors
  - A bounded representation

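3-Valued Properties: a property is True, False, or Don’t Know (written 1/2). (Figure: a node for which top holds and a node with top=1/2, each with a t edge.)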
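The three truth values can be sketched as a small C type. The connectives below are the standard Kleene ones, which is an assumption about what the framework uses since the slide only names the values; the encoding (twice the numeric value, so that 1/2 becomes 1) and all identifiers are hypothetical.

```c
#include <assert.h>

/* Truth values stored as twice their numeric value: 0, 1/2, 1. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Kleene conjunction: the smaller of the two values. */
Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }

/* Kleene disjunction: the larger of the two values. */
Kleene k_or(Kleene a, Kleene b) { return a > b ? a : b; }

/* Negation swaps True and False and leaves Don't Know unchanged. */
Kleene k_not(Kleene a) {
    assert(a == K_FALSE || a == K_HALF || a == K_TRUE);
    return (Kleene)(K_TRUE - a);
}

/* Join in the information order: agreeing values are kept,
   disagreeing values collapse to Don't Know. */
Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }
```

`k_join` is the operation that matters when several memory elements are merged into one: a property that holds for some of them and not for others becomes 1/2.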

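Abstraction (figure): a concrete descriptor with stack nodes cs exit, cs l1, cs l2 and cs l2, top (pr and t edges) shown next to its abstract descriptor with nodes cs exit, cs l1, cs l2 and cs l2, top.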

Bounding the Representation
- Summarize nodes according to their unary properties
- Join values of relationships
- Convert a Concrete Memory Descriptor of arbitrary size into an Abstract Memory Descriptor of bounded size
- Does the Abstract Memory Descriptor contain enough information?
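The two summarization steps can be sketched for the stack part of a descriptor. This is a simplified illustration under assumptions: it tracks a single binary relation (pr), omits heap nodes and the t edges, uses a fixed node limit, and every identifier (`Concrete`, `Abstract`, `abstract_descriptor`, `example_stack`, the `P_*` bits) is hypothetical.

```c
#include <assert.h>

#define MAX_NODES 8

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Hypothetical unary properties of stack nodes, one bit each. */
enum { P_EXIT = 1, P_L1 = 2, P_L2 = 4, P_TOP = 8 };

typedef struct {
    int n;                              /* number of nodes            */
    unsigned unary[MAX_NODES];          /* unary properties per node  */
    int pr[MAX_NODES][MAX_NODES];       /* pr[i][j] = 1: i invoked by j */
} Concrete;

typedef struct {
    int n;
    unsigned unary[MAX_NODES];
    int summary[MAX_NODES];             /* 1: stands for several nodes */
    Kleene pr[MAX_NODES][MAX_NODES];
} Abstract;

void abstract_descriptor(const Concrete *c, Abstract *a) {
    int cls[MAX_NODES];                      /* abstract node of each concrete node */
    int seen[MAX_NODES][MAX_NODES] = {{0}};
    assert(c->n >= 0 && c->n <= MAX_NODES);
    a->n = 0;
    /* Step 1: nodes with equal unary properties share one abstract node. */
    for (int i = 0; i < c->n; i++) {
        int j = 0;
        while (j < a->n && a->unary[j] != c->unary[i]) j++;
        if (j == a->n) {
            a->unary[j] = c->unary[i];
            a->summary[j] = 0;
            a->n++;
        } else {
            a->summary[j] = 1;
        }
        cls[i] = j;
    }
    /* Step 2: join the relation values of all merged pairs. */
    for (int i = 0; i < c->n; i++) {
        for (int j = 0; j < c->n; j++) {
            Kleene v = c->pr[i][j] ? K_TRUE : K_FALSE;
            Kleene *slot = &a->pr[cls[i]][cls[j]];
            if (!seen[cls[i]][cls[j]]) {
                *slot = v;
                seen[cls[i]][cls[j]] = 1;
            } else if (*slot != v) {
                *slot = K_HALF;
            }
        }
    }
}

/* Stack of the running example at the deepest point, for k = 3. */
void example_stack(Concrete *c) {
    *c = (Concrete){0};
    c->n = 5;
    c->unary[0] = P_EXIT;
    c->unary[1] = P_L1;
    c->unary[2] = P_L2;
    c->unary[3] = P_L2;
    c->unary[4] = P_L2 | P_TOP;
    for (int i = 1; i < c->n; i++) c->pr[i][i - 1] = 1;
}
```

On this example the two non-top l2 records collapse into one summary node, so the result has four nodes regardless of how deep the recursion goes, and the pr edges touching the summary node become 1/2.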

Problem (figure): two memory descriptors over the stack nodes exit, cs l1, cs l2 and cs l2, top, with pr and t edges.

Observing Properties of Invisible Variables
- Explicitly track universal properties of invisible variables
  - Different invisible instances of t cannot point to the same heap cell
- Instrumentation properties
  - Track derived properties of memory elements

Some Instrumentation Properties
- Pointed to by an invisible instance of t
- Pointed to by more than one invisible instance of t
- t is not NULL
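The meaning of the first two properties can be sketched on a concrete state. The code assumes a simplified activation record that holds only t and the pr link, and it treats every record below the top one as invisible; the names `Frame`, `invisible_refs`, `pointed_by_invisible_t` and `shared_by_invisible_t` are hypothetical. In the analysis itself such properties are tracked on the descriptors rather than recomputed by a loop like this one.

```c
#include <assert.h>
#include <stddef.h>

typedef struct List { int data; struct List *n; } *L;

/* Simplified activation record: one instance of t and the caller link. */
typedef struct Frame { L t; struct Frame *pr; } Frame;

/* Counts the invisible instances of t (those in records below the
   top one) that point to the given heap cell. */
int invisible_refs(const Frame *top, const struct List *cell) {
    int count = 0;
    assert(top != NULL);
    for (const Frame *f = top->pr; f != NULL; f = f->pr)
        if (f->t == cell)
            count++;
    return count;
}

/* "Pointed to by an invisible instance of t". */
int pointed_by_invisible_t(const Frame *top, const struct List *cell) {
    return invisible_refs(top, cell) >= 1;
}

/* "Pointed to by more than one invisible instance of t". */
int shared_by_invisible_t(const Frame *top, const struct List *cell) {
    return invisible_refs(top, cell) > 1;
}
```

For the running example the second property is false for every heap cell, which is exactly the fact that plain summarization of the t edges loses.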

Memory Descriptors with Instrumentation (figure): a concrete and an abstract descriptor over the stack nodes exit, cs l1, cs l2 and cs l2, top, with pr and t edges, annotated with instrumentation properties.

Problem - solved (figure): the descriptors from the Problem slide over the stack nodes exit, cs l1, cs l2 and cs l2, top, with pr and t edges, now with instrumentation.

Why Does It Work
- Shape analysis handles linked lists quite precisely (Sagiv, Reps, Wilhelm [TOPLAS98])
- Utilize the (intraprocedural) 3-valued logic framework of Sagiv, Reps and Wilhelm [POPL99] to analyze the resulting intraprocedural problem

Prototype Implementation
- Implemented in TVLA [Lev-Ami, Sagiv SAS 2000]
- Analyzed some recursive list manipulating programs
- Verified cleanness properties:
  - No memory leaks
  - No NULL dereferences

Prototype Implementation

| Procedure | Time (sec) | Number of (3VL) Structures |
| --- | --- | --- |
| create | 7.31 | 219 |
| delAll | 12.74 | 139 |
| insert | 34.61 | 344 |
| delete | 38.29 | 423 |
| search | 8.07 | 303 |
| append | 40.64 | 326 |
| reverse | 47.56 | 414 |
| reverse_append | 95.35 | 797 |
| reverse_append_r | 1204.13 | 2285 |
| Running example | 16.50 | 208 |

Conclusion
- Need to know more than the potential values of invisible variables
- Tracking properties of invisible variables helps to overcome the (necessary) imprecision introduced by summarizing their values
- Instrumentation
  - Generic: sharing by different instances of a local variable
  - List specific

Conclusion
- Storing the call-site improves the propagation of information to return-sites
- Shows how the intraprocedural framework of Sagiv, Reps and Wilhelm can be used for interprocedural analyses
- Analysis of a complex data structure

Limitations
- Small programs
- No mutual recursion (Implementation)
- Predefined instrumentation library
  - Easy to use, no need for user intervention
  - Might not be good for all programs

Further Work
- Scaling the algorithm
  - Distinguishing between “relevant” context and “irrelevant” context
  - Analysis of programs manipulating Abstract Data Types

The End
Interprocedural Shape Analysis for Recursive Programs, Noam Rinetzky and Mooly Sagiv, Compiler Construction 2001
www.cs.tau.ac.il/~maon
