
1 Constraint-Based Analysis, CS 6340

2 Code Example
    void f(state *x, state *y) {
      result = spin_trylock(&x->lock);
      spin_lock(&y->lock);
      …
      if (!result)
        spin_unlock(&x->lock);
      spin_unlock(&y->lock);
    }
  Analysis features exercised by this example:
    – Path sensitivity: result, (!result)
    – Pointers & heap: (&x->lock), (&y->lock)
    – Interprocedural, flow sensitivity: spin_trylock, spin_lock, spin_unlock
  [Figure: lock state machine with states Locked, Unlocked, and Error; transitions labeled lock and unlock]

3 Saturn
  What?
    – SAT-based approach to static bug detection
  How?
    – SAT-based approach
    – Program constructs → Boolean constraints
    – Inference → SAT solving
  Why SAT?
    – Lots of reasons, but for now:
    – Program states are naturally expressed as bits
    – The theory for bits is SAT
    – Efficient solvers are widely available

4 Intuition
  Analyzing in one direction is problematic
    – Forwards or backwards
    – Consider null dereference analysis:
        No null pointer assignments: forwards is best
        No dereferences: backwards is best
  Constraints
    – Give a global picture of the program
    – Allow a more efficient order of solution

5 Straight-line Code
    void f(int x, int y) {
      int z = x & y;
      assert(z == x);
    }
  Bit-level encoding:
    x: [x31 … x0]
    y: [y31 … y0]
    z = x & y: [x31 ∧ y31 … x0 ∧ y0]   (bitwise AND)
    R: the constraint for (z == x)

6 Straight-line Code
    void f(int x, int y) {
      int z = x & y;
      assert(z == x);
    }
  R: the constraint for (z == x)
  Query: Is-Satisfiable(¬R)
  Answer: Yes, with x = [00…1], y = [00…0]
  The negated assertion is satisfiable. Therefore, the assertion may fail.
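The query above can be illustrated with a minimal sketch in which a brute-force search stands in for the SAT solver. The 4-bit width and the names WIDTH, assertion_holds, and find_counterexample are assumptions made for illustration; they do not come from the slides, and a real tool would hand the bit-level formula to a solver rather than enumerate inputs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit width (assumption); the slide uses 32-bit ints. */
enum { WIDTH = 4 };

/* The asserted property R from the slide: with z = x & y, z == x. */
bool assertion_holds(uint32_t x, uint32_t y) {
    uint32_t z = x & y;
    return z == x;
}

/* Brute-force stand-in for Is-Satisfiable(not R): search all WIDTH-bit
 * inputs for one that violates the assertion. Returns true and stores
 * the witness if one exists, false otherwise. */
bool find_counterexample(uint32_t *x_out, uint32_t *y_out) {
    for (uint32_t x = 0; x < (1u << WIDTH); x++) {
        for (uint32_t y = 0; y < (1u << WIDTH); y++) {
            if (!assertion_holds(x, y)) {
                *x_out = x;
                *y_out = y;
                return true;
            }
        }
    }
    return false;
}
```

With this enumeration order the first witness found is x = 1, y = 0, which matches the assignment x = [00…1], y = [00…0] reported on the slide.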

7 Control Flow – Preparation
  Approach
    – Assumes a loop-free program
    – Unroll loops, drop back edges
  May miss errors that are deeply buried
    – Bug finding, not verification
    – Many errors surface in a few iterations
  Advantages
    – Simplicity, reduces false positives

8 Control Flow – Example
    if (c)
      x = a;
    else
      x = b;
    res = x;
  Guards and values along each path:
    after x = a:  G = c,      x: [a31 … a0]
    after x = b:  G = ¬c,     x: [b31 … b0]
    at res = x:   G = c ∨ ¬c, x: [v31 … v0], where vi = (c ∧ ai) ∨ (¬c ∧ bi)
  Merges
    – preserve path sensitivity
    – select bits based on the values of incoming guards
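The merge formula vi = (c ∧ ai) ∨ (¬c ∧ bi) can be sketched on concrete values as follows. In a SAT encoding c, ai, and bi would be Boolean formulas rather than known bits; the function name merge_bits is hypothetical and is used here only to show that the formula selects one incoming value per guard.

```c
#include <stdint.h>

/* Merge two 32-bit values under a one-bit guard c, bit by bit:
 * v_i = (c AND a_i) OR (NOT c AND b_i). */
uint32_t merge_bits(int c, uint32_t a, uint32_t b) {
    /* Replicate the guard across all 32 bit positions. */
    uint32_t mask = c ? UINT32_MAX : 0u;
    return (mask & a) | (~mask & b);
}
```

When c is true every bit comes from a, and when c is false every bit comes from b, so the merged value agrees with res on both branches of the if statement.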

9 Pointers – Overview
  May point to different locations…
    – Thus, use points-to sets
        p: { l1, …, ln }
  … but path sensitive
    – Use guards on points-to relationships
        p: { (g1, l1), …, (gn, ln) }

10 Pointers – Example
    p = &x;          G = true, p: { (true, x) }
    if (c)
      p = &y;        G = c,    p: { (true, y) }
    res = *p;        G = true, p: { (c, y); (¬c, x) }
  The dereference res = *p is modeled as:
    if (c) res = y;
    else if (¬c) res = x;

11 Pointers – Recap
  Guarded Location Sets
    { (g1, l1), …, (gn, ln) }
  Guards
    – Condition under which a points-to relationship holds
    – Collected from statement guards
  Pointer Dereference
    – Conditional assignments
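As a rough sketch of a guarded location set and of dereference as conditional assignments, the code below replays the pointer example (p = &x; if (c) p = &y; res = *p;) with concrete guards. The type guarded_loc and the functions guarded_deref and deref_example are hypothetical names; in an actual SAT encoding the guards would be Boolean formulas, not evaluated booleans.

```c
#include <stdbool.h>
#include <stddef.h>

/* One entry (g, l) of a guarded location set: p points to *loc when
 * guard is true. Guards are concrete booleans in this sketch. */
struct guarded_loc {
    bool guard;
    const int *loc;
};

/* Dereference as conditional assignments: res takes the value of the
 * first location whose guard holds. Returns false if no guard holds. */
bool guarded_deref(const struct guarded_loc *set, size_t n, int *res) {
    for (size_t i = 0; i < n; i++) {
        if (set[i].guard) {
            *res = *set[i].loc;
            return true;
        }
    }
    return false;
}

/* The pointer example: p: { (c, y); (not c, x) }, then res = *p. */
int deref_example(bool c, int x, int y) {
    struct guarded_loc p[] = { { c, &y }, { !c, &x } };
    int res = 0;
    guarded_deref(p, 2, &res);
    return res;
}
```

Because the two guards c and ¬c are mutually exclusive and cover all cases, exactly one conditional assignment fires on any path.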

12 Not Covered
  Other constructs
    – Structs, …
  Modeling of the environment
  Optimizations
    – several to reduce the size of formulas
    – some form of program slicing is important

13 What can we do with Saturn?
    int f(lock_t *l) {
      lock(l);
      …
      unlock(l);
    }
  lock(l) is modeled as:
    if (l->state == Unlocked) l->state = Locked;
    else l->state = Error;
  unlock(l) is modeled as:
    if (l->state == Locked) l->state = Unlocked;
    else l->state = Error;
  [Figure: lock state machine with states Locked, Unlocked, and Error; transitions labeled lock and unlock]

14 General FSM Checking
  Encode the FSM in the program
    – State → Integer
    – Transition → Conditional assignments
  Check code behavior
    – SAT queries
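The encoding described here can be sketched directly in C: the state becomes an integer-valued enum and each transition becomes a conditional assignment, following the lock and unlock models shown on the previous slide. The names fsm_lock, fsm_unlock, and run_f are hypothetical, and running the code on each initial state stands in for the SAT queries a real checker would issue.

```c
/* State -> integer: the three states of the lock FSM. */
enum lock_state { Unlocked, Locked, Error };

typedef struct {
    enum lock_state state;
} lock_t;

/* Transition -> conditional assignment: lock is legal only when Unlocked. */
void fsm_lock(lock_t *l) {
    if (l->state == Unlocked)
        l->state = Locked;
    else
        l->state = Error;
}

/* unlock is legal only when Locked. */
void fsm_unlock(lock_t *l) {
    if (l->state == Locked)
        l->state = Unlocked;
    else
        l->state = Error;
}

/* The body of f from the slides (lock, then unlock), run from a given
 * initial state; returns the final state. */
enum lock_state run_f(enum lock_state initial) {
    lock_t l = { initial };
    fsm_lock(&l);
    fsm_unlock(&l);
    return l.state;
}
```

Calling run_f on each initial state yields Unlocked → Unlocked and Locked → Error, the behavior a checker would need to establish for f.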

15 How are we doing so far?
  Precision: good
  Scalability: limited
    – SAT limit is 1M clauses
    – About 10 functions
  Solution:
    – Divide and conquer
    – Function summaries

16 Function Summaries (1st try)
  Function behavior can be summarized with a set of state transitions
    int f(lock_t *l) {
      lock(l);
      …
      unlock(l);
      return 0;
    }
  Summary:
    *l: Unlocked → Unlocked
        Locked → Error

17 A Difficulty
    int f(lock_t *l) {
      lock(l);
      …
      if (err) return -1;
      …
      unlock(l);
      return 0;
    }
  Problem
    – two possible output states
    – distinguished by return value (retval == 0)…
  Summary
    1. (retval == 0)
       *l: Unlocked → Unlocked
           Locked → Error
    2. ¬(retval == 0)
       *l: Unlocked → Locked
           Locked → Error

18 FSM Function Summaries
  Summary representation (simplified):
    { Pin, Pout, R }
  User gives:
    – Pin: predicates on initial state
    – Pout: predicates on final state
    – Express interprocedural path sensitivity
  Saturn computes:
    – R: guarded state transitions
    – Used to simulate function behavior at the call site

19 Lock Summary (2nd try)
    int f(lock_t *l) {
      lock(l);
      …
      if (err) return -1;
      …
      unlock(l);
      return 0;
    }
  Output predicate:
    – Pout = { (retval == 0) }
  Summary (R):
    1. (retval == 0)
       *l: Unlocked → Unlocked
           Locked → Error
    2. ¬(retval == 0)
       *l: Unlocked → Locked
           Locked → Error
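One way to picture how such a summary is used at a call site is a lookup table indexed by the output predicate and the incoming state. This is only a sketch: the names summary_f and apply_summary are hypothetical, the predicate is evaluated on a concrete return value rather than kept symbolic, and the Error → Error entries are an assumption (the slide lists no transition out of Error, but the lock FSM treats it as absorbing).

```c
#include <stdbool.h>

enum lock_state { Unlocked, Locked, Error };

/* Summary R for f with P_out = { (retval == 0) }.
 * Rows: value of the predicate (retval == 0).
 * Columns: state of *l on entry (Unlocked, Locked, Error).
 * The Error column is an assumption: Error is treated as absorbing. */
static const enum lock_state summary_f[2][3] = {
    /* not (retval == 0) */ { Locked,   Error, Error },
    /*     (retval == 0) */ { Unlocked, Error, Error },
};

/* Simulate a call to f at a call site: given the state of *l before
 * the call and the observed return value, return the state after. */
enum lock_state apply_summary(enum lock_state in, int retval) {
    bool ok = (retval == 0);
    return summary_f[ok][in];
}
```

A caller that checks the return value can then be analyzed path-sensitively: on the retval == 0 path the lock is known to be released, while on the other path it is still held.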

20 Lock checker for Linux
  Parameters:
    – States: { Locked, Unlocked, Error }
    – Pin = {}
    – Pout = { (retval == 0) }
  Experiment:
    – Linux Kernel 2.6.5: 4.8 MLOC
    – ~40 lock/unlock/trylock primitives
    – 20 hours to analyze on a 3.0GHz Pentium IV with 1GB memory

21 Double Locking/Unlocking
    static void sscape_coproc_close(…) {
      spin_lock_irqsave(&devc->lock, flags);
      if (…)
        sscape_write(devc, DMAA_REG, 0x20);
      …
    }
    static void sscape_write(struct … *devc, …) {
      spin_lock_irqsave(&devc->lock, flags);
      …
    }

22 Ambiguous Return State
    int i2o_claim_device(…) {
      down(&i2o_configuration_lock);
      if (d->owner) {
        up(&i2o_configuration_lock);
        return -EBUSY;
      }
      if (…) {
        return -EBUSY;
      }
      …
    }

23 Bugs
  Type              Bugs   False Pos.   % Bugs
  Double Locking     134           99      57%
  Ambiguous State     45           22      67%
  Total              179          121      60%
  Previous work: MC (31), CQual (18), <20% bugs

24 Function Summary Database
  63,000 functions in Linux
    – More than 23,000 are lock related
    – 17,000 with locking constraints on entry
    – Around 9,000 affect more than one lock
    – 193 lock wrappers
    – 375 unlock wrappers
    – 36 with return value/lock state correlation
  Available on the web...

25 Another Checker
  Memory leaks
    – Common, especially in error handling code
    – Hard to find
    – Problematic in long-running applications
  Current techniques
    – Escape analysis
    – Ownership types
    – Region-based analysis…

26 Simple Leak
    char *f() {
      char *p;
      p = (char*)malloc(…);
      …
      if (err) return NULL;
      …
      return p;
    }

27 Scenario 1 – Malloc Wrappers
    char *f() {
      char *p;
      p = (char*)strdup(…);
      …
      if (err) return NULL;
      …
      return p;
    }

28 Scenario 2 – External References
    char *f(struct *s) {
      char *p;
      p = (char*)malloc(…);
      s->name = p;
      if (err) return NULL;
      …
      return p;
    }

29 Scenario 3 – Function Calls
    char *f(struct state *s) {
      char *p;
      p = (char*)malloc(…);
      g(s, p);
      if (err) return NULL;
      …
      return p;
    }
    void g(s, p) {
      s->name = p;
    }

30 Scenario 4 – Data Dependency
    void f(int len) {
      char fastbuf[10], *p;
      if (len < 10)
        p = fastbuf;
      else
        p = (char *)malloc(len);
      …
      if (p != fastbuf)
        free(p);
    }

31 Requirements
  Track points-to relationships precisely
  Infer escaping functions
    – ones that create external references to objects passed in via parameters
  Infer allocation functions

32 Analysis Part I – Points-to Rule
  PointsTo(p, l)
    – the condition under which p points to l
  Given the guarded location set of p, { (g0, l0), …, (gn-1, ln-1) }:
    PointsTo(p, l) = gi      if li = l
                     false   otherwise

33 Analysis Part II – EscapeVia
  EscapeVia(l, p, X)
    – the condition under which location l escapes via pointer p, excluding references in set X
  Access Roots
    – Every object in the function body is accessed through one of the following "roots":
        Parameters (p1 … pn)
        The Return Value (ret_val)
        Global Variables
        Local Variables
        Heap Allocated Objects

34 Analysis Part II – EscapeVia
  Never escape through local variables
    RootOf(p) ∈ Locals ∪ X  ⟹  EscapeVia(l, p, X) = false
  Always escape through global variables
    RootOf(p) ∈ Globals  ⟹  EscapeVia(l, p, X) = PointsTo(p, l)

35 Analysis Part II – EscapeVia
  Escaping through parameters/return
    RootOf(p) ∈ (Params ∪ { ret_val }) – X  ⟹  EscapeVia(l, p, X) = PointsTo(p, l)
  Escaping via another allocated location
    RootOf(p) ∈ NewLocs – X  ⟹  EscapeVia(l, p, X) = PointsTo(p, l) ∧ Escaped(p, X ∪ {RootOf(l)})

36 Analysis Part III – Escape/Leak
  Escape Condition
    Escaped(l, X) = ⋁_p EscapeVia(l, p, X)
  Leak Condition
    Leaked(l, X) = ¬Escaped(l, X)
  Leak Checker
    For all new locations l, there is a leak if Satisfiable(Leaked(l, {}))
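The rules above can be sketched for the Simple Leak example on slide 26 under heavy simplifying assumptions: each pointer carries a single guarded points-to entry, the exclusion set X is empty, the NewLocs case is omitted, and trying both values of err stands in for the SAT query. All type and function names below are hypothetical and chosen for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

enum root_kind { ROOT_LOCAL, ROOT_GLOBAL, ROOT_PARAM, ROOT_RETVAL };

/* A pointer p with one guarded points-to entry (g, l). */
struct pointer {
    enum root_kind root; /* RootOf(p) */
    bool guard;          /* g, already evaluated on one path */
    int loc;             /* l, a location id */
};

/* PointsTo(p, l): the guard if p's entry names l, false otherwise. */
bool points_to(const struct pointer *p, int l) {
    return p->loc == l ? p->guard : false;
}

/* EscapeVia(l, p, {}): false for locals, PointsTo(p, l) for globals,
 * parameters, and the return value. The NewLocs case is omitted. */
bool escape_via(const struct pointer *p, int l) {
    if (p->root == ROOT_LOCAL)
        return false;
    return points_to(p, l);
}

/* Leaked(l) = not (OR over all p of EscapeVia(l, p)). */
bool leaked(const struct pointer *ps, size_t n, int l) {
    for (size_t i = 0; i < n; i++) {
        if (escape_via(&ps[i], l))
            return false;
    }
    return true;
}

/* Slide 26: p = malloc(...); if (err) return NULL; return p; */
bool simple_leak_on_path(bool err) {
    const int l = 0; /* id of the malloc'ed location */
    struct pointer ps[] = {
        { ROOT_LOCAL,  true, l }, /* local p always points to l */
        { ROOT_RETVAL, !err, l }, /* ret_val points to l unless err */
    };
    return leaked(ps, 2, l);
}

/* Stand-in for Satisfiable(Leaked(l, {})): try both values of err. */
bool simple_leak_satisfiable(void) {
    return simple_leak_on_path(false) || simple_leak_on_path(true);
}
```

In this sketch the leak condition reduces to err: the allocation escapes through the return value only when err is false, so the query is satisfiable and a leak is reported on the error path.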

37 Results
             LOC (K)   # Alloc Func.   # Bugs    FP (%)
  Samba          404              80       83     8.79%
  OpenSSL        296             101      117     0.85%
  BinUtils       909              91  136(66)     3.55%
  OpenSSH         36              19   29(10)     0%
  Total        1,646             291      365     3.69%

38 Why SAT? (Revisited …)
  Moore's Law
  Uniform modeling of constructs as bits
  Constraints
    – Local specification
    – Global solution
  Incremental SAT solving
    – makes multiple queries efficient

39 Why SAT? (Cont.)
  Path sensitivity is important
    – To find bugs
    – To reduce false positives
    – Much easier to model precisely with SAT
  Compositionality is important
    – Function summaries are critical for scalability
    – Easy to construct with SAT queries

