Data Modeling for Program Analysis Scott McPeak OSQ Retreat.

1 Data Modeling for Program Analysis Scott McPeak OSQ Retreat

2 A Program Verifier Verification assures that a program meets some specification, e.g. "no segfaults" –Full correctness vs. partial specs This is undecidable in general, hence annotations [Diagram: Program, Specification, Annotations; annotations contribute useful facts and new obligations]

3 Verifier Architecture [Diagram: program, annotations, and specification (hardcoded) feed verification condition generation (semantics); the resulting predicates (collectively imply program meets spec) go to a theorem prover, which answers "proved" or "not proved"]

4 Verification Benefits Potential for reducing costs of testing and debugging is enormous –Memory safety –Concurrency safety –Adherence to domain-specific protocols Annotation appeal: capture "why" info Could prove absence of certain security violations

5 Run Time is Too Late Doesn't reduce testing cost Run-time cost may be significant –Cumulative across different analyses Recovery after run-time failure? Delay between introduction of a bug and the discovery of its effect

6 Will Anyone Annotate? Of course, if cost/benefit ratio is right Benefits can be high (previous slide) Abstraction is key to controlling cost –Can re-use "why" knowledge; libraries, etc. –Common tasks must be easy (e.g. array of non-null elements) –Module-wide defaults under user control

7 Development Model [Diagram: code → compile → verifier → testing. Feedback paths: type error → fix; failed proof → diagnosis assistant → explanation → fix; wrong behavior → debugging...]

8 Data Modeling Program analyzer must abstract application data (otherwise it's just executing!) Model: family of mathematical objects, and axioms which relate them Enormous design space, little guidance Direct impact on success of analysis

9 Example: Strings
Initial model: two function symbols
–size(addr): # of allocated bytes
–strlen(addr): least index of a 0 byte
strcpy(d, s)
–pre: size(d) > strlen(s)
–post: strlen(d) = strlen(s)
strcat(d, s)
–pre: size(d) - strlen(d) > strlen(s)
–post: strlen(d) = pre(strlen(d) + strlen(s))
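A minimal Python sketch of this two-symbol model, offered as an illustration and not as the verifier's actual encoding. The `Buf` class and the `*_model` function names are hypothetical. The preconditions are written on the assumption that the destination must have room for the copied characters plus the terminating 0 byte, i.e. size(d) > strlen(s) for strcpy.

```python
from dataclasses import dataclass


@dataclass
class Buf:
    """Abstract view of a C buffer: only the two modeled quantities (hypothetical class)."""
    size: int    # size(addr): number of allocated bytes
    strlen: int  # strlen(addr): least index of a 0 byte


def strcpy_model(d: Buf, s: Buf) -> None:
    # pre (assumed): room for strlen(s) characters plus the terminator
    if not d.size > s.strlen:
        raise ValueError("strcpy: need size(d) > strlen(s)")
    # post: strlen(d) = strlen(s)
    d.strlen = s.strlen


def strcat_model(d: Buf, s: Buf) -> None:
    # pre (assumed): the unused tail of d holds strlen(s) characters plus the terminator
    if not d.size - d.strlen > s.strlen:
        raise ValueError("strcat: need size(d) - strlen(d) > strlen(s)")
    # post: strlen(d) = pre(strlen(d) + strlen(s))
    d.strlen = d.strlen + s.strlen
```

In this sketch, copying a 3-character string into an 8-byte buffer and appending it once succeeds; a second append is rejected because only 2 bytes remain beyond the 6 characters already in use.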

10 String as a Set
Add the predicate contains(addr, ch) → {T,F}
strcpy(d, s)
–post: ∀ch. contains(s, ch) ⇔ contains(d, ch)
strchr(s, ch) → r
–post: (contains(s, ch) ⇒ ∃i. r = s+i) && (¬contains(s, ch) ⇒ r = NULL)
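One way to picture the set model is as an abstraction function from concrete buffers to sets of characters. The sketch below assumes Python `bytes` values stand in for C buffers; the helper names `chars` and `strcpy_post` are hypothetical.

```python
def chars(buf: bytes) -> frozenset:
    """Set abstraction: the byte values that occur before the first 0 byte."""
    return frozenset(buf[:buf.index(0)])


def contains(buf: bytes, ch: int) -> bool:
    # contains(addr, ch): membership in the abstract set
    return ch in chars(buf)


def strcpy_post(d: bytes, s: bytes) -> bool:
    # post: for every ch, contains(s, ch) iff contains(d, ch)
    return chars(d) == chars(s)
```

Note that this abstraction forgets order and multiplicity: the buffers for "ab" and "ba" satisfy `strcpy_post` with respect to each other, which is exactly the information a sequence model would add back.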

11 String as a Sequence
Add another symbol "[]": addr[i] → ch
strcpy(d, s)
–post: ∀i. d[i] = s[i]
strchr(s, ch) → r
–post: ((∃i. s[i]=ch) ⇒ *r=ch) && (¬(∃i. s[i]=ch) ⇒ r=NULL)
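The sequence-model postconditions can be checked against concrete buffers. This is a sketch under stated assumptions: Python `bytes` stand in for C buffers, a result `r` is represented by its offset from `s` (with `None` for NULL), and the quantified index is read as ranging over 0..strlen(s), terminator included. All function names are hypothetical.

```python
from typing import Optional


def strchr_model(s: bytes, ch: int) -> Optional[int]:
    """Offset of the first occurrence of ch, or None for NULL."""
    end = s.index(0)
    for i in range(end + 1):  # C's strchr treats the terminator as part of the string
        if s[i] == ch:
            return i
    return None


def strchr_post(s: bytes, ch: int, r: Optional[int]) -> bool:
    end = s.index(0)
    exists = any(s[i] == ch for i in range(end + 1))
    if exists:
        return r is not None and s[r] == ch  # (exists i. s[i]=ch) implies *r=ch
    return r is None                         # otherwise r=NULL


def strcpy_post(d: bytes, s: bytes) -> bool:
    # for all i in 0..strlen(s): d[i] = s[i]
    end = s.index(0)
    return len(d) > end and all(d[i] == s[i] for i in range(end + 1))
```

For the buffer `b"hello\x00"`, searching for `l` yields offset 2 and searching for `z` yields `None`; both results satisfy `strchr_post`.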

12 Example: Integers "int" is easy to model, right? Well... –Mathematical integers –Finite partition: { 1 } –32-bit 2's complement with wraparound
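The difference between the first and last of these models shows up at the boundaries. The sketch below contrasts addition over mathematical integers with addition in a 32-bit two's complement model with wraparound; the function names are hypothetical, and Python's unbounded `int` plays the role of the mathematical integers.

```python
INT_MAX = 2**31 - 1
INT_MIN = -2**31


def wrap32(n: int) -> int:
    """Map a mathematical integer to its 32-bit two's complement representative."""
    return (n + 2**31) % 2**32 - 2**31


def add_math(a: int, b: int) -> int:
    # Mathematical-integer model: no overflow exists
    return a + b


def add_wrap32(a: int, b: int) -> int:
    # Wraparound model: results are reduced modulo 2**32
    return wrap32(a + b)
```

Under the mathematical model `INT_MAX + 1` is simply a larger integer, while under the wraparound model it is `INT_MIN`. An analysis that picks the first model can prove `x + 1 > x`; one that picks the second cannot.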

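13 Example: Memory
[Diagram: hierarchical memory. mem maps top-level object addresses (&x, malloc(..)) to objects; a struct maps field offsets (g) to fields; an array maps int indexes (3, 8) to elements. Labeled objects: "a", "a.g"]
"x" = sel(mem_0, addr_x)
"a.g[3]" = sel(sel(sel(mem_0, addr_a), g), 3)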
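A rough sketch of the hierarchical memory, using nested Python dictionaries as the maps at each level. The key strings and the stored values (7, 42, 99) are made up for illustration; only `sel` and the shape of the two expressions come from the slide.

```python
def sel(m, key):
    """Select the component of map m stored at key."""
    return m[key]


# Top level: object addresses. Struct level: field offsets. Array level: int indexes.
mem0 = {
    "addr_x": 7,                       # a scalar object (illustrative value)
    "addr_a": {"g": {3: 42, 8: 99}},   # a struct whose field g is an array
}

x = sel(mem0, "addr_x")                          # "x"
a_g_3 = sel(sel(sel(mem0, "addr_a"), "g"), 3)    # "a.g[3]"
```

Each level of nesting corresponds to one `sel`, so the depth of the expression mirrors the depth of the access in the source program.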

14 Pointers
Pointers are access paths
–"&(a.g[3])" = sub(sub(sub(whole, a), g), 3)
Rules to read via pointers
–selPtr(obj, whole) = obj
–selPtr(obj, sub(rest, index)) = v if sel(selPtr(obj, rest), index) = v
Can also write, do pointer arithmetic, deeper indexing, e.g. "&(p->x)"
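The read rules amount to a recursive walk along the access path. In this sketch a path is a Python tuple, `WHOLE` is the empty tuple, and nested dictionaries stand in for memory; `ptr_add` is a hypothetical helper showing one way pointer arithmetic could be expressed in such a model, not something the slide defines.

```python
WHOLE = ()  # the path that designates the entire object


def sub(rest, index):
    """Extend access path `rest` by one more component."""
    return rest + (index,)


def sel(obj, index):
    return obj[index]


def sel_ptr(obj, path):
    # selPtr(obj, whole) = obj
    if path == WHOLE:
        return obj
    # selPtr(obj, sub(rest, index)) = sel(selPtr(obj, rest), index)
    rest, index = path[:-1], path[-1]
    return sel(sel_ptr(obj, rest), index)


def ptr_add(path, k):
    """Hypothetical: move an array-element pointer by k elements."""
    return sub(path[:-1], path[-1] + k)
```

With `mem = {"a": {"g": {3: 42, 4: 43}}}`, the pointer `sub(sub(sub(WHOLE, "a"), "g"), 3)` reads 42, and `ptr_add` of that pointer by 1 reads 43.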

15 Data Structure Invariants
Classic approach: universal quantifier
–∀a. type(a)=Foo ⇒ a->x = a->y + 1
Field admission predicate
–Bar *p; admission: p!=NULL;
Object state field: "ok" vs. "not ok"
–Change a field → state:="not ok"
–Manually certify "ok", precondition=invariant
–∀a. type(a)=Foo ⇒ a->state="ok"
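The object state field idea can be mimicked at run time to see how the pieces fit. This is only an analogy, since a verifier would track the state statically; the method names `set_field` and `certify` and the function `use` are hypothetical, and the invariant is the x = y + 1 example from the slide.

```python
class Foo:
    def __init__(self, y: int):
        self.y = y
        self.x = y + 1       # invariant: x = y + 1
        self.state = "ok"

    def set_field(self, name: str, value: int) -> None:
        # Changing any field drops the object to "not ok"
        setattr(self, name, value)
        self.state = "not ok"

    def certify(self) -> None:
        # Manual certification: the invariant is the precondition for "ok"
        if self.x != self.y + 1:
            raise ValueError("invariant x = y + 1 does not hold")
        self.state = "ok"


def use(a: Foo) -> int:
    # A client only has to require state = "ok" to rely on the invariant
    if a.state != "ok":
        raise ValueError("object is not certified")
    return a.x - a.y
```

Updating `y` alone leaves the object uncertifiable until `x` is updated to match; after both updates and a call to `certify`, clients may rely on the invariant again.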

16 Example: Change Sets Globals: list of changed / list of unchanged –Not ideal.. name sets of globals? Hierarchical mem: changed object is easy –new = update(old, obj_addr, some_value) But changed field (of many objects) is hard Possible alternative: staged & weakened invariants; state what is still true, rather than naming what has changed
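The asymmetry between a changed object and a changed field can be seen in a small sketch with nested dictionaries as the hierarchical memory. `update` follows the slide's `new = update(old, obj_addr, some_value)`; the example data and the helper `update_field_everywhere` are hypothetical.

```python
def sel(m, key):
    return m[key]


def update(m, key, value):
    """Functional update: a new map equal to m except at key."""
    new = dict(m)
    new[key] = value
    return new


old = {"p": {"x": 1, "y": 2}, "q": {"x": 3, "y": 4}}

# Changed object: a single top-level update names exactly what changed
new = update(old, "p", {"x": 0, "y": 0})


def update_field_everywhere(m, field, value):
    # Changed field: one nested update per object, so the change set
    # cannot be named by a single top-level address
    out = m
    for addr in m:
        out = update(out, addr, update(sel(m, addr), field, value))
    return out


zeroed = update_field_everywhere(old, "x", 0)
```

After the single-object update, every other object is shared unchanged between `old` and `new`. After the field-wide update, every object differs from its predecessor, even though only one field per object was touched.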

17 Conclusions Try to capture invariants implicitly, via representation choices Be explicit about related entities: inDegree(n)=d vs. inDegree1(n, referrer) Let user select among possible models, even to choose not to model certain fields Try to think like a programmer

