# Notes on Cyclone Extended Static Checking (Greg Morrisett, Harvard University)


2 Static Extended Checking: SEX-C Similar approach to ESC-M3/Java: Calculate a 1st-order predicate describing the machine state at each program point. Generate verification conditions (VCs) corresponding to run-time checks. Feed the VCs to a theorem prover. Only insert a check (and issue a warning) if the prover can't show the VC is true. Key goal: it needs to scale well (like type-checking) so it can be used on every edit-compile-debug cycle.

3 Example: strcpy strcpy(char ?d, char ?s) { while (*s != 0) { *d = *s; s++; d++; } *d = 0; } Run-time checks are inserted to ensure that s and d are not NULL and in bounds. 6 words passed in instead of 2.

4 Better strcpy(char ?d, char ?s) { unsigned i, n = numelts(s); assert(n < numelts(d)); for (i=0; i < n && s[i] != 0; i++) d[i] = s[i]; d[i] = 0; } This ought to have no run-time checks beyond the assert.

5 Even Better: strncpy(char *d, char *s, uint n) @assert(n < numelts(d) && n <= numelts(s)) { unsigned i; for (i=0; i < n && s[i] != 0; i++) d[i] = s[i]; d[i] = 0; } No fat pointers or dynamic checks. But caller must statically satisfy the pre-condition.

6 In Practice: strncpy(char *d, char *s, uint n) @checks(n < numelts(d) && n <= numelts(s)) { unsigned i; for (i=0; i < n && s[i] != 0; i++) d[i] = s[i]; d[i] = 0; } If caller can establish pre-condition, no check. Otherwise, an implicit check is inserted. Clearly, checks are a limited class of assertions.

7 Results so far… For the 165 files (~78 Kloc) that make up the standard libraries and compiler (CLibs: stdio, string, …; CycLib: list, array, splay, dict, set, bignum, …; Compiler: lex, parse, typing, analyze, xlate to C, …), we eliminated 96% of the static checks: null: 33,121 out of 34,437 (96%); bounds: 13,402 out of 14,022 (95%). Bootstrap takes 225s compared to 221s with all checks turned off (2% slower) on this laptop. From an optimization standpoint, that seems pretty good.

8 Scaling

9 Not all Rosy: We don't do as well on array-intensive code. For instance, on the AES reference implementation we eliminate only 75% of the checks (377 out of 504), run 2% slower than with all checks turned off, and 24% slower than the original C code (most of the overhead is fat pointers). The primary culprit: we are very conservative about arithmetic, i.e., x[2*i+1] will throw us off every time.

10 Challenges I assumed I could use off-the-shelf technology, but ran into a few problems: scalable VC generation is a previously solved problem (see the ESC folks), though it was entertaining to rediscover the solutions; for usable theorem provers, we rolled our own for now (that's not the real focus).

11 Verification-Condition Generation We started with textbook strongest post-conditions: SP[x := e] A = A[α/x] ∧ x = e[α/x] (α fresh); SP[S1; S2] A = SP[S2](SP[S1] A); SP[if (e) S1 else S2] A = SP[S1](A ∧ e ≠ 0) ∨ SP[S2](A ∧ e = 0).
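
The textbook rules can be sketched executably. The encoding below (nested tuples for formulas, a counter for fresh α's) is purely illustrative and is not the Cyclone implementation:

```python
import itertools

fresh = (f"a{i}" for i in itertools.count())  # supply of fresh logic variables

def subst(assn, old, new):
    """Substitute `new` for the variable `old` throughout a tuple-encoded formula."""
    if assn == old:
        return new
    if isinstance(assn, tuple):
        return tuple(subst(a, old, new) for a in assn)
    return assn

def sp(stmt, A):
    """Strongest postcondition of a statement with respect to assertion A."""
    op = stmt[0]
    if op == "assign":                      # SP[x := e] A = A[a/x] /\ x = e[a/x]
        _, x, e = stmt
        a = next(fresh)
        return ("and", subst(A, x, a), ("eq", x, subst(e, x, a)))
    if op == "seq":                         # SP[S1; S2] A = SP[S2](SP[S1] A)
        _, s1, s2 = stmt
        return sp(s2, sp(s1, A))
    if op == "if":                          # branch on e != 0 / e == 0, then disjoin
        _, e, s1, s2 = stmt
        return ("or", sp(s1, ("and", A, ("ne", e, 0))),
                      sp(s2, ("and", A, ("eq", e, 0))))
    raise ValueError(op)

# SP[x := x + 1] (x == 0) yields (a0 == 0) /\ x == a0 + 1
post = sp(("assign", "x", ("+", "x", 1)), ("eq", "x", 0))
```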

12 Why SP instead of WP? SP[if (c) skip else fail] A = A ∧ c. When A ⇒ c, we can eliminate the check; either way, the post-condition is still A ∧ c. WP[if (c) skip else fail] A = (c ⇒ A) ∧ c. For WP, this gets propagated backwards, making it difficult to determine which part of the pre-condition corresponds to a particular check.

13 1st Problem with Textbook SP SP[x := e] A = A[α/x] ∧ x = e[α/x]. What if e has effects? In particular, what if e is itself an assignment? Solution: use a monadic interpretation: SP : Exp → Assn → Term × Assn

14 For Example: SP[x] A = (x, A). SP[e1 + e2] A = let (t1, A1) = SP[e1] A; (t2, A2) = SP[e2] A1 in (t1 + t2, A2). SP[x := e] A = let (t, A1) = SP[e] A in (t[α/x], A1[α/x] ∧ x == t[α/x]).

15 Or as in Haskell SP[x] = return x. SP[e1 + e2] = do { t1 ← SP[e1]; t2 ← SP[e2]; return (t1 + t2) }. SP[x := e] = do { t ← SP[e]; replace [α/x]; and (x == t[α/x]); return t[α/x] }.
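
The Term × Assn threading of slides 14–15 can be transcribed directly. Here is an illustrative Python version in which an assignment may occur inside an expression; the names and tuple encoding are mine, not the compiler's:

```python
import itertools

counter = itertools.count()        # fresh-name supply for saved old values

def subst(a, old, new):
    """Substitute `new` for variable `old` in a tuple-encoded term/formula."""
    if a == old:
        return new
    if isinstance(a, tuple):
        return tuple(subst(x, old, new) for x in a)
    return a

def sp_exp(e, A):
    """SP : Exp -> Assn -> Term * Assn, threading effects left to right."""
    if not isinstance(e, tuple):
        return e, A                              # variables and constants
    op = e[0]
    if op == "+":                                # evaluate e1 then e2
        t1, A1 = sp_exp(e[1], A)
        t2, A2 = sp_exp(e[2], A1)
        return ("+", t1, t2), A2
    if op == "assign":                           # x := rhs, inside an expression
        x, rhs = e[1], e[2]
        t, A1 = sp_exp(rhs, A)
        a = f"v{next(counter)}"                  # fresh name for the old x
        t_new = subst(t, x, a)
        return t_new, ("and", subst(A1, x, a), ("eq", x, t_new))
    raise ValueError(op)

# (x := x + 1) under precondition x == 0
t, post = sp_exp(("assign", "x", ("+", "x", 1)), ("eq", "x", 0))
```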

16 One Issue Of course, this over-sequentializes the code. C has very liberal order-of-evaluation rules, which are hopelessly unusable for any sound analysis. So we force the evaluation to be left-to-right and match our sequentialization.

17 Next Problem: Diamonds SP[if (e1) S11 else S12; if (e2) S21 else S22; … if (en) Sn1 else Sn2] A. The textbook approach explodes paths into a tree: SP[if (e) S1 else S2] A = SP[S1](A ∧ e ≠ 0) ∨ SP[S2](A ∧ e = 0). This simply doesn't scale; e.g., one procedure had an assertion with ~1.5B nodes. WP has the same problem (see Flanagan & Leino).

18 Hmmm… a lot like naïve CPS SP[if (e1) S11 else S12; if (e2) S21 else S22] A = SP[S21]((SP[S11](A ∧ e1 ≠ 0) ∨ SP[S12](A ∧ e1 = 0)) ∧ e2 ≠ 0) ∨ SP[S22]((SP[S11](A ∧ e1 ≠ 0) ∨ SP[S12](A ∧ e1 = 0)) ∧ e2 = 0). We duplicate the result of the 1st conditional, which duplicates the original assertion.

19 Aha! We need a "let": SP[if (e) S1 else S2] A = let X = A in (e ≠ 0 ∧ SP[S1]X) ∨ (e = 0 ∧ SP[S2]X). Alternatively, make sure we physically share A. Oops: SP[x := e] X = X[α/x] ∧ x = e[α/x]. This would require adding explicit substitutions to the assertion language to avoid breaking the sharing.

20 Handling Updates (Necula) Factor out a local environment: A = {x = e1 ∧ y = e2 ∧ …} ∧ B, where neither B nor the ei contain program variables (i.e., x, y, …). Only the environment needs to change on update: SP[x := 3] ({x = e1 ∧ y = e2 ∧ …} ∧ B) = {x = 3 ∧ y = e2 ∧ …} ∧ B. So most of the assertion (B) remains unchanged and can be shared.
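
A tiny sketch of why the factored form helps, assuming the environment is a dict from program variables to pure terms and B is an opaque, potentially huge formula (my encoding, not Necula's):

```python
def sp_assign(env, B, x, term):
    """SP[x := term]: rebuild only the small environment; B is untouched."""
    env2 = dict(env)       # one binding per program variable: cheap to copy
    env2[x] = term
    return env2, B         # B is returned by reference, never copied

env, B = {"x": "e1", "y": "e2"}, ("big", "formula", "B")
env2, B2 = sp_assign(env, B, "x", 3)
```

Only the per-variable map is rebuilt; the bulk of the assertion stays physically shared across program points.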

21 So Now: SP : Exp → (Env × Assn) → (Term × Env × Assn). SP[x] (E, A) = (E(x), E, A). SP[e1 + e2] (E, A) = let (t1, E1, A1) = SP[e1] (E, A); (t2, E2, A2) = SP[e2] (E1, A1) in (t1 + t2, E2, A2). SP[x := e] (E, A) = let (t, E1, A1) = SP[e] (E, A) in (t, E1[x := t], A1).

22 Or as in Haskell: SP[x] = lookup x. SP[e1 + e2] = do { t1 ← SP[e1]; t2 ← SP[e2]; return (t1 + t2) }. SP[x := e] = do { t ← SP[e]; set x t; return t }.

23 Note: Monadic encapsulation is crucial from a software-engineering point of view: we actually have multiple outgoing flow edges due to exceptions, return, etc. (see Tan & Appel, VMCAI'06), so the monad actually accumulates (Term × Env × Assn) values for each edge. But it still looks as pretty as the previous slide (modulo the fact that it's written in Cyclone).

24 Diamond Problem Revisited: SP[if (e) S1 else S2] ({x = e1 ∧ y = e2 ∧ …} ∧ B) = (SP[S1] ({x = e1 ∧ y = e2 ∧ …} ∧ B ∧ e ≠ 0)) ∨ (SP[S2] ({x = e1 ∧ y = e2 ∧ …} ∧ B ∧ e = 0)) = ({x = t1 ∧ y = t2 ∧ …} ∧ B1) ∨ ({x = u1 ∧ y = u2 ∧ …} ∧ B2) = {x = φx ∧ y = φy ∧ …} ∧ ((φx = t1 ∧ φy = t2 ∧ … ∧ B1) ∨ (φx = u1 ∧ φy = u2 ∧ … ∧ B2)).

25 How does the environment help? SP[if (a) x := 3 else x := y; if (b) x := 5 else skip] starting from {x = e1 ∧ y = e2} ∧ B ends in {x = v ∧ y = e2} ∧ B ∧ ((a ≠ 0 ∧ t = 3) ∨ (a = 0 ∧ t = e2)) ∧ ((b ≠ 0 ∧ v = 5) ∨ (b = 0 ∧ v = t)).

26 Tah-Dah! I've rediscovered SSA. The monadic translation sequentializes and names intermediate results; we only need to add fresh variables when two paths compute different values for a variable, so the added equations for conditionals correspond to φ-nodes. Like SSA, it's worst-case O(n²) but in practice O(n). Best part: all of the VCs for a given procedure share the same assertion DAG.
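
The φ-node construction can be sketched as a merge of the two branch environments, adding a fresh variable only where the branches disagree; the `phi…` names and tuple encoding are illustrative:

```python
import itertools

fresh = (f"phi{i}" for i in itertools.count())   # fresh phi-variable supply

def merge(env1, B1, env2, B2):
    """Join two branch results; introduce a phi variable only where they differ."""
    env, eqs1, eqs2 = {}, [], []
    for x in env1:
        if env1[x] == env2[x]:
            env[x] = env1[x]                  # same value on both paths: share it
        else:
            p = next(fresh)                   # phi-node for x
            env[x] = p
            eqs1.append(("eq", p, env1[x]))   # constraint on the then-path
            eqs2.append(("eq", p, env2[x]))   # constraint on the else-path
    B = ("or", ("and", B1, *eqs1), ("and", B2, *eqs2))
    return env, B

# x differs across the branches (t1 vs u1); y is the same on both
env, B = merge({"x": "t1", "y": "e2"}, "B1", {"x": "u1", "y": "e2"}, "B2")
```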

27 Space Scaling

28 So far so good: Of course, I've glossed over the hard bits: loops, memory, and procedures. Let's talk about loops first…

29 Widening: Given A ∨ B, calculate some C such that A ⇒ C and B ⇒ C and |C| < |A|, |B|. Then we can compute a fixed point for loop invariants iteratively: start with the pre-condition P; process the loop test & body to get P'; see if P' ⇒ P. If so, we're done; if not, widen P ∨ P' and iterate. (Glossing over variable-scope issues.)
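
The iteration above can be sketched with assertions as sets of primitive facts (more facts = stronger, so P' ⇒ P is "P' keeps every fact of P" and widening is intersection). The loop body below is a stand-in, not real analysis:

```python
def fix_invariant(pre, step, widen):
    """Iterate until the facts after the body imply the candidate invariant."""
    P = pre
    while True:
        P2 = step(P)           # facts holding after one trip through test + body
        if P <= P2:            # P' => P: every fact of P survives, so P is invariant
            return P
        P = widen(P, P2)       # weaken P and try again

widen = lambda a, b: a & b                       # keep only facts common to both
step = lambda P: (P - {"x == 0"}) | {"x >= 0"}   # body kills x == 0, keeps x >= 0
inv = fix_invariant({"x == 0", "x >= 0", "n > 0"}, step, widen)
```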

30 Our Widening: Conceptually, to widen A ∨ B, calculate the DNF and factor out syntactically common primitive relations. In practice, we do a bit of closure first: e.g., normalize terms & relations; e.g., x == e expands to x ≤ e ∧ x ≥ e. This captures any primitive relation that was found on every path.

31 Widening Algorithm (Take 1): assn = Prim of reln*term*term | True | False | And of assn*assn | Or of assn*assn. widen (Prim(…)) = expand(Prim(…)); widen (True) = {}; widen (And(a1,a2)) = widen(a1) ∪ widen(a2); widen (Or(a1,a2)) = widen(a1) ∩ widen(a2); ...

32 Widening for DAG: We can't afford to traverse the tree, so memoize: widen A = case lookup A of SOME s => s | NONE => let s = widen' A in insert(A,s); s end. widen' (x as Prim(…)) = {x}; widen' (True) = {}; widen' (And(a1,a2)) = widen(a1) ∪ widen(a2); widen' (Or(a1,a2)) = widen(a1) ∩ widen(a2)
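
The SML pseudocode above, transcribed to Python with memoization keyed on node identity, so a shared DAG node is widened exactly once (the tuple encoding is mine):

```python
memo = {}                               # id(node) -> widened fact set

def widen(a):
    """Widen an assertion DAG to the set of primitive facts it guarantees."""
    key = id(a)                         # DAG nodes are physically shared
    if key in memo:
        return memo[key]
    kind = a[0]
    if kind == "prim":
        s = frozenset([a])              # a primitive relation is itself a fact
    elif kind == "true":
        s = frozenset()                 # true guarantees nothing
    elif kind == "and":
        s = widen(a[1]) | widen(a[2])   # both conjuncts hold: union
    else:                               # "or"
        s = widen(a[1]) & widen(a[2])   # only facts on every path survive
    memo[key] = s
    return s

shared = ("prim", "x", "<", "n")        # one physically shared subassertion
a = ("or", ("and", shared, ("prim", "0", "<=", "x")), shared)
facts = widen(a)
```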

33 Hash Consing (ala Shao's FLINT) To make lookups fast, we hash-cons all terms and assertions, i.e., value numbering: a constant-time syntactic [in]equality test. Other information is cached in the hash table: the widened version of an assertion, the negation of an assertion, free variables.
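
A minimal hash-consing sketch: one table interns every node, so structurally equal nodes are the same object, equality is a pointer test, and per-node caches (widened form, negation, free variables) can key on identity. This is illustrative, not FLINT's actual code:

```python
table = {}

def hc(*node):
    """Intern a node: structurally equal nodes share one representative."""
    return table.setdefault(node, node)

# building the "same" assertion twice yields the identical object
a = hc("and", hc("prim", "x", "!=", "0"), hc("true"))
b = hc("and", hc("prim", "x", "!=", "0"), hc("true"))
```

After interning, `a is b` replaces recursive structural comparison.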

34 Note on Explicit Substitution Originally, we used explicit substitutions: widen S (Subst(S',a)) = widen (S ∘ S') a; widen S (x as Prim(…)) = {S(x)}; widen S (And(a1,a2)) = widen S a1 ∪ widen S a2; ... We had to memoize w.r.t. both S and A, but we rarely encountered the same S and A, so memoizing didn't help; ergo, back to tree traversal. Of course, you get more precision if you do the substitution (but it costs too much).

35 Back to Loops: The invariants we generate aren't great; the worst case is that we get "true". We do catch loop-invariant variables: e.g., if x starts off at i and is only incremented, we learn x ≥ i. But: it covers simple for-loops well; it's fast (only a couple of iterations); and the user can override with an explicit invariant. (Note: only 2 loops in the string library are annotated this way, but we plan to do more.)

36 Memory As in ESC, use a functional array. Terms: t ::= … | upd(tm, ta, tv) | sel(tm, ta), with the environment tracking mem: SP[*e] = do { a ← SP[e]; m ← lookup mem; return sel(m,a) }. McCarthy axioms: sel(upd(m,a,v), a) == v; sel(upd(m,a,v), b) == sel(m,b) when a ≠ b.

37 The realities of C bite again… Consider: pt x = new Point{1,2}; int *p = &x->y; *p = 42; *x; Does sel(upd(upd(m, x, {1,2}), x + offsetof(pt,y), 42), x) = {1,2}??

38 Explode Aggregates? update(m, x, {1,2}) = upd(upd(m, x + offsetof(pt,x), 1), x + offsetof(pt,y), 2). This turns out to be too expensive in practice, because you must model memory down to the byte level.

39 Refined Treatment of Memory Memory maps roots to aggregate values. Aggregates: {t1, …, tn} | set(a,t,v) | get(a,t). Roots: malloc(n,t), where n is a program point and t is a term used to distinguish different dynamic values allocated at the same point. Pointer expressions are mapped to paths: path ::= root | path·t

40 Selects and Updates: sel and upd operate on roots only: sel(upd(m,r,v), r) = v; sel(upd(m,r,v), r') = sel(m,r') when r ≠ r'. Compound select and update operate on paths: select(m, r) = sel(m,r); select(m, a·t) = get(select(m,a), t); update(m, r, v) = upd(m,r,v); update(m, a·t, v) = update(m, a, set(select(m,a), t, v)).
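
A concrete model of the path equations, assuming memory is a dict from roots to tuple aggregates and a path is a root plus a sequence of offsets (my encoding, not the compiler's):

```python
def select(m, path):
    """select(m, root . t1 . t2 ...) = nested get on the root's aggregate."""
    root, offs = path
    v = m[root]
    for t in offs:
        v = v[t]                        # get(aggregate, offset)
    return v

def update(m, path, val):
    """update rewrites one root, rebuilding the aggregate along the path."""
    root, offs = path
    def set_at(agg, offs, val):
        if not offs:
            return val
        t, rest = offs[0], offs[1:]     # set(aggregate, offset, new sub-value)
        return agg[:t] + (set_at(agg[t], rest, val),) + agg[t + 1:]
    m2 = dict(m)                        # functional update: old memory survives
    m2[root] = set_at(m[root], offs, val)
    return m2

# *x = {1,2}; p = &x->y; *p = 42;   (field y is offset 1)
m = {"x": (1, 2)}
m2 = update(m, ("x", (1,)), 42)
```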

41 For Example: *x = {1,2}; int *p = &x->y; *p = 42; update(upd(m, x, {1,2}), x·off(pt,y), 42) = upd(upd(m, x, {1,2}), x, set({1,2}, off(pt,y), 42)) = upd(upd(m, x, {1,2}), x, {1,42}) = upd(m, x, {1,42}).

42 Reasoning about memory: To reduce select(update(m,p1,v), p2) to select(m,p2), we need to know that p1 and p2 are disjoint paths. In particular, if one is a prefix of the other, we cannot reduce (without simplifying paths). Often, we can show their roots are distinct; many times, we can show they are updates to distinct offsets of the same path prefix. Otherwise, we give up.

43 Procedures: Originally, the analysis was intra-procedural only: programmers could specify pre/post-conditions. Recently, it was extended to inter-procedural: calculate SPs and propagate them to callers; if one gets too large, widen it. Then go back and strengthen the pre-conditions of (non-escaping) callees by taking the "disjunction" of all call sites' assertions.

44 Summary of VC-Generation Started with textbook strongest post-conditions. Effects: rewrote as a monadic translation. Diamonds: factored variables into an environment to preserve sharing (SSA). Loops: simple but effective widening for calculating invariants. Memory: an array-based approach, with care to avoid blowing up aggregates. Extended to inter-procedural summaries.

45 Proving: The original plan was to use off-the-shelf technology, e.g., Simplify, SAT solvers, etc. But I found that they either didn't have the decision procedures I needed or were way too slow to use on every compile; so, like an idiot, I decided to roll my own…

46 2 Prover(s): Simple Prover: Given a VC A ⇒ C, widen A to a set of primitive relations, calculate the DNF of C, and check that each disjunct is a subset of A. (C is quite small, so no blowup here.) This catches a lot: all but about 2% of the checks we eliminate! Examples: void f(int @x) { …*x… }; if (x != NULL) …*x…; for (i=0; i < numelts(A); i++) …A[i]…
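
The simple prover reduces to a set check. One implied disjunct of C already gives A ⇒ C, so this sketch (facts as strings, an encoding of mine) accepts if any disjunct's facts all occur in the widened A:

```python
def prove(widened_A, C_dnf):
    """Sound but incomplete: A => C if some disjunct of C is a subset of A's facts."""
    return any(set(disjunct) <= widened_A for disjunct in C_dnf)

# after "if (x != NULL)", the widened facts include the guard itself
A = {"x != NULL", "0 <= i"}
null_ok = prove(A, [["x != NULL"]])          # null check can be dropped
bounds_ok = prove(A, [["i < numelts(A)"]])   # bounds fact absent: check stays
```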

47 2nd Prover: Given A ⇒ C, try to show A ∧ ¬C inconsistent. Conceptually: explore the DNF tree (i.e., program paths); the real exponential blowup is here, so we have a programmer-controlled throttle on the number of paths we'll explore (default 33). Accumulate a set of primitive facts; at the leaves, run simple decision procedures to look for inconsistencies and prune the path.

48 Problem: Arithmetic To eliminate an array-bounds check on an expression x[i], we can try to prove a predicate similar to this: A ⇒ 0 ≤ i < numelts(x), where A describes the state of the machine at that program point.

49 Do we need checks here? char *malloc(unsigned n) @ensures(n == numelts(result)); void foo(unsigned x) { char *p = malloc(x+1); for (int i = 0; i <= x; i++) p[i] = 'a'; } Is 0 ≤ i < numelts(p)?

50 You bet! Consider foo(-1): void foo(unsigned x) { char *p = malloc(x+1); for (int i = 0; i <= x; i++) p[i] = 'a'; } We get i ≤ x from the loop guard, but this is an unsigned comparison. That is, we are comparing i against 0xffffffff, which always succeeds.
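
The failure mode is easy to reproduce by simulating 32-bit unsigned arithmetic (this is an illustrative model, not the checker):

```python
M32 = 0xFFFFFFFF            # 32-bit mask

def u32(v):
    """Reinterpret a Python integer as a 32-bit unsigned value."""
    return v & M32

x = u32(-1)                 # foo(-1): the unsigned parameter becomes 0xffffffff
n = u32(x + 1)              # malloc(x+1) wraps around to a 0-byte allocation
guard = u32(0) <= x         # the loop guard i <= x at i == 0, compared unsigned
```

The guard holds even though numelts(p) == 0, so p[0] is written out of bounds: the check 0 ≤ i < numelts(p) is genuinely needed.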

51 Integer Overflow This example is based on a vulnerability in the GNU mail utilities (i.e., IMAP servers): http://archives.neohapsis.com/archives/fulldisclosure/2005-05/0580.html There are other situations where wrap-around gets you into trouble, so we wanted to take machine arithmetic seriously. Unfortunately, I haven't yet found a prover that I can effectively use. (If you know of any, please tell me!)

52 Our (Dumb) Arithmetic Solver Determines [un]satisfiability of a conjunction of difference constraints (similar to the approach used by Touchstone & ABCD). Constraints: x − y ≤S c and x − y ≤U c (signed and unsigned). Care is needed when generating constraints: e.g., x + c <= y + k cannot (in general) be simplified to x − y ≤ k − c. The algorithm tries to find cycles in the graphs: x − x1 ≤U c1, x1 − x2 ≤U c2, …, xn − x ≤U cn, where c1 + c2 + … + cn < 0; that is, x − x < 0. Again, care is needed to avoid internal overflow.
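
The cycle search is essentially negative-cycle detection in a constraint graph. A Bellman-Ford-style sketch, ignoring the signed/unsigned split and the overflow care the slide mentions:

```python
def unsat(constraints):
    """Constraints are (x, y, c) meaning x - y <= c, i.e., an edge y -> x
    of weight c. The conjunction is unsatisfiable iff a negative cycle exists."""
    nodes = {v for x, y, c in constraints for v in (x, y)}
    dist = {v: 0 for v in nodes}
    for i in range(len(nodes)):
        for x, y, c in constraints:
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                if i == len(nodes) - 1:
                    return True           # still relaxing after |V| rounds: cycle
    return False

# x - y <= -1 and y - x <= 0 sum to x - x <= -1: inconsistent
inconsistent = unsat([("x", "y", -1), ("y", "x", 0)])
consistent = unsat([("x", "y", 1), ("y", "x", 0)])
```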

53 Future? We need provers as libraries/services. Can we agree upon a logic (typed, untyped)? The theories must include useful domains (e.g., Z mod). Can we agree upon an API? Sharing must be preserved; we need incremental support and control over search; we need counter-example support; do we need witnesses? We can now generate some useful benchmarks. Multiple metrics: precision vs. time*space.

54 Currently: Memory? The functional-array encoding of memory doesn't work well. Can we adapt separation logic? Will it actually help? Can we integrate refinements into the types? Work with A. Nanevski & L. Birkedal is a start. Loops? Can we divorce VC-generation from theorem proving altogether (e.g., by compiling to a language with inductive predicates)?

55 False Positives: We still have 2,000 checks left. I suspect that most are not needed. How do we draw the eye to the ones that are? Strengthen pre-conditions artificially (e.g., assume no aliasing, no overflow, etc.); if we still can't prove a check, then it should be moved up to a "higher-rank" warning.

56 Lots of Borrowed Ideas ESC (M3 & Java), Touchstone, Special-J, CCured, SPLint (LCLint), FLINT, ABCD