How to find lots of bugs by checking program belief systems Dawson Engler David Chen, Seth Hallem, Ben Chelf, Andy Chou Stanford University Presentedy.

Slides:

Advertisements

Similar presentations

Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.

Advertisements

Semantics Static semantics Dynamic semantics attribute grammars

R4 Dynamically loading processes. Overview R4 is closely related to R3, much of what you have written for R3 applies to R4 In R3, we executed procedures.

CS 450 Module R4. R4 Overview Due on March 11 th along with R3. R4 is a small yet critical part of the MPX system. In this module, you will add the functionality.

Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.

Module R2 CS450. Next Week R1 is due next Friday ▫Bring manuals in a binder - make sure to have a cover page with group number, module, and date. You.

SPLINT STATIC CHECKING TOOL Sripriya Subramanian 10/29/2002.

Composition CMSC 202. Code Reuse Effective software development relies on reusing existing code. Code reuse must be more than just copying code and changing.

CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 Bugs as deviant behavior – Inferring errors in systems code D. Engler, D. Chen, S. Hallem, B. Chelf,

Using Programmer-Written Compiler Extensions to Catch Security Holes Authors: Ken Ashcraft and Dawson Engler Presented by : Hong Chen CS590F 2/7/2007.

1 Pointers A pointer variable holds an address We may add or subtract an integer to get a different address. Adding an integer k to a pointer p with base.

Aliases in a bug finding tool Benjamin Chelf Seth Hallem June 5 th, 2002.

How to find lots of bugs by checking program belief systems Dawson Engler David Chen, Seth Hallem, Ben Chelf, Andy Chou Stanford University This talk I.

Checking System Rules Using System-Specific, Programmer- Written Compiler Extensions Dawson Engler, Benjamin Chelf, Andy Chow, Seth Hallem Computer Systems.

Finding bugs with system-specific static analysis Dawson Engler Ken Ashcraft, Ben Chelf, Andy Chou, Seth Hallem, Yichen Xie, Junfeng Yang Stanford University.

CS 61C L03 C Arrays (1) A Carle, Summer 2005 © UCB inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #3: C Pointers & Arrays

MULTIVIE W Checking System Rules Using System-Specific, Program-Written Compiler Extensions Paper: Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem.

“A System and Language for Building System-Specific, Static Analyses” CMSC 631 – Fall 2003 Seth Hallem, Benjamin Chelf, Yichen Xie, and Dawson Engler (presented.

/* iComment: Bugs or Bad Comments? */

How to find lots of bugs in real code with system-specific static analysis Dawson Engler Stanford University Ben Chelf, Andy Chou, Seth Hallem Coverity.

Unit Testing & Defensive Programming. F-22 Raptor Fighter.

CS 11 C track: lecture 5 Last week: pointers This week: Pointer arithmetic Arrays and pointers Dynamic memory allocation The stack and the heap.

A Simple Method for Extracting Models from Protocol Code David Lie, Andy Chou, Dawson Engler and David Dill Computer Systems Laboratory Stanford University.

EE4E. C++ Programming Lecture 1 From C to C++. Contents Introduction Introduction Variables Variables Pointers and references Pointers and references.

University of Maryland Bug Driven Bug Finding Chadd Williams.

15-740/ Oct. 17, 2012 Stefan Muller.  Problem: Software is buggy!  More specific problem: Want to make sure software doesn’t have bad property.

Testing Michael Ernst CSE 140 University of Washington.

Testing CSE 140 University of Washington 1. Testing Programming to analyze data is powerful It’s useless if the results are not correct Correctness is.

Inferring and checking system rules by static analysis William R Wright.

How to find lots of bugs with system-specific static analysis Dawson Engler Andy Chou, Ben Chelf, Seth Hallem, Ken Ashcraft Stanford University This talk.

Implications of Inheritance COMP204, Bernhard Pfahringer.

Richard Mancusi - CSCI 297 Static Analysis and Modeling Tools which allows further checking of software systems.

Chapter 25 Formal Methods Formal methods Specify program using math Develop program using math Prove program matches specification using.

The Daikon system for dynamic detection of likely invariants MIT Computer Science and Artificial Intelligence Lab. 16 January 2007 Presented by Chervet.

Static Code Checking: Security and Concurrency Ben Watson The George Washington University CS 297 Security and Programming Languages June 9, 2005.

COMP 111 Threads and concurrency Sept 28, Tufts University Computer Science2 Who is this guy? I am not Prof. Couch Obvious? Sam Guyer New assistant.

Data TypestMyn1 Data Types The type of a variable is not set by the programmer; rather, it is decided at runtime by PHP depending on the context in which.

Reasoning about programs March CSE 403, Winter 2011, Brun.

1 Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions Dawson Engler Benjamin Chelf Andy Chou Seth Hallem Stanford University.

1 Splint: A Static Memory Leakage tool Presented By: Krishna Balasubramanian.

An Undergraduate Course on Software Bug Detection Tools and Techniques Eric Larson Seattle University March 3, 2006.

Dataflow Analysis for Concurrent Programs using Datarace Detection Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner LBA Reading Group Michelle Goodstein.

Bugs (part 1) CPS210 Spring Papers  Bugs as Deviant Behavior: A General Approach to Inferring Errors in System Code  Dawson Engler  Eraser: A.

Consensus-based Mining of API Preconditions in Big Code Hoan NguyenRobert DyerTien N. NguyenHridesh Rajan.

CSCI1600: Embedded and Real Time Software Lecture 28: Verification I Steven Reiss, Fall 2015.

1 Lecture07: Memory Model 5/2/2012 Slides modified from Yin Lou, Cornell CS2022: Introduction to C.

Testing CSE 160 University of Washington 1. Testing Programming to analyze data is powerful It’s useless (or worse!) if the results are not correct Correctness.

Searching CSE 103 Lecture 20 Wednesday, October 16, 2002 prepared by Doug Hogan.

/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.

/ PSWLAB Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions by D. Engler, B. Chelf, A. Chou, S. Hallem published.

Chapter 1 The Phases of Software Development. Software Development Phases ● Specification of the task ● Design of a solution ● Implementation of solution.

Checking System Rules Using System-Specific Programmer Written Compiler Extensions Dawson Engler, Benjamin Chelf, Andy Chou and Seth Hallem Presented by:

Announcements Assignment 2 Out Today Quiz today - so I need to shut up at 4:25 1.

2. Meta-Compilation: Bug Detection with Compiler Extensions Based on

CSE 374 Programming Concepts & Tools Hal Perkins Fall 2015 Lecture 17 – Specifications, error checking & assert.

C11, Implications of Inheritance

Chapter 6 CS 3370 – C++ Functions.

Logger, Assert and Invariants

CSE 374 Programming Concepts & Tools

Testing UW CSE 160 Winter 2017.

Testing UW CSE 160 Spring 2018.

Testing UW CSE 160 Winter 2016.

Ben Chelf, Andy Chou, Seth Hallem

Effective and Efficient memory Protection Using Dynamic Tainting

CSE 303 Concepts and Tools for Software Development

CSE 451: Operating Systems Autumn 2003 Lecture 7 Synchronization

CSE 451: Operating Systems Autumn 2005 Lecture 7 Synchronization

CSE 451: Operating Systems Winter 2003 Lecture 7 Synchronization

CSE 153 Design of Operating Systems Winter 19

Presentation transcript:

How to find lots of bugs by checking program belief systems Dawson Engler David Chen, Seth Hallem, Ben Chelf, Andy Chou Stanford University Presentedy by Baohua Wu This talk I s about how to find lots of error s in real coe. The way we are going to do this is a bit unusual. Rather than undertand what rules the systemmust follow or swhat state it is in, both of which are hard. Instead we are going to do omething much easier; we will infer what rules it believes it must obey and what state it cbelieve it is in and cross check thee believes for contradictions. The great thing this buys us is that now we can find lots of errors in coe we do no t unederstand.

u Systems have many ad hoc correctness rules –“acquire lock l before modifying x”, “cli() must be paired with sti(),” “don’t block with interrupts disabled” –One error = crashed machine u If we know rules, can check with extended compiler –Rules map to simple source constructs –Use compiler extensions to express them –Nice: scales, precise, statically find 1000s of errors Context: finding OS bugs w/ compilers Reduced to using grep on millions of line of code, or documentation, hoping you can find all cases lock_kernel(); if (!de->count) { printk("free!\n"); return; } unlock_kernel(); Linux fs/proc/ inode.c GNU C compiler Lock checker “missing unlock!”

u Problem: what are the rules?!?! – s of rules in s of subsystems. –To check, must answer: Must a() follow b()? Can foo() fail? Does bar(p) free p? Does lock l protect x? –Manually finding rules is hard. So don’t. Instead infer what code believes, cross check for contradiction u Intuition: how to find errors without knowing truth? –Contradiction. To find lies: cross-examine. Any contradiction is an error. –Deviance. To infer correct behavior: if 1 person does X, might be right or a coincidence. If 1000s do X and 1 does Y, probably an error. –Crucial: we know contradiction is an error without knowing the correct belief! Goal: find as many serious bugs as possible Reduced to using grep on millions of line of code, or documentation, hoping you can find all cases

u MUST beliefs: –Inferred from acts that imply beliefs code *must* have. –Check using internal consistency: infer beliefs at different locations, then cross-check for contradiction u MAY beliefs: could be coincidental –Inferred from acts that imply beliefs code *may* have –Check as MUST beliefs; rank errors by belief confidence. Cross-checking program belief systems x = *p / z; // MUST belief: p not null // MUST: z != 0 unlock(l); // MUST: l acquired x++; // MUST: x not protected by l // MAY: A() and B() // must be paired Specification = checkable redundancy. Can cross check code against itself for same effect. Others: that x was not already equal to value. B(); // MUST: B() need not // be preceded by A() A(); … B(); A(); … B(); A(); … B(); A(); … B();

Two techniques u Internal Consistency –Must beliefs u Statistical Analysis –May beliefs

Trivial consistency: NULL pointers u *p implies MUST belief: –p is not null u A check (p == NULL) implies two MUST beliefs: –POST: p is null on true path, not null on false path –PRE: p was unknown before check u Cross-check these for three different error types. u Check-then-use (79 errors, 26 false pos) /* 2.4.1: drivers/isdn/svmb1/capidrv.c */ if(!card) printk(KERN_ERR, “capidrv-%d: …”, card->contrnr…) Show because it is one of the simplest possible checkers, and because it finds hundreds of errors.

Null pointer fun u Use-then-check: 102 bugs, 4 false u Contradiction/redundant checks (24 bugs, 10 false) /* 2.4.7: drivers/char/mxser.c */ struct mxser_struct *info = tty->driver_data; unsigned flags; if(!tty || !info->xmit_buf) return 0; /* 2.4.7/drivers/video/tdfxfb.c */ fb_info.regbase_virt = ioremap_nocache(...); if(!fb_info.regbase_virt) return -ENXIO; fb_info.bufbase_virt = ioremap_nocache(...); /* [META: meant fb_info.bufbase_virt!] */ if(!fb_info.regbase_virt) { iounmap(fb_info.regbase_virt); Can look for redundancy in general: deadcode elim is an error finder. Can look for: writes never read, lock acquired that protects nothing,

Redundancy checking u Assume: code supposed to be useful –Useless actions = conceptual confusion. Like type systems, high level bugs map to low-level redundancies u Identity operations: “x = x”, “1 * y”, “x & x”, “x | x” u Assignments that are never read: /* ac8/net/appletalk/aarp.c */ da.s_node = sa.s_node; da.s_net = da.s_net; for(entry=priv->lec_arp_tables[i];entry != NULL; entry=next){ next = entry->next; if (…) lec_arp_remove(priv->lec_arp_tables, entry); lec_arp_unlock(priv); return 0; } Can look for redundancy in general: deadcode elim is an error finder. Can look for: writes never read, lock acquired that protects nothing. Redundant transition means we’re missing something with analysis. Critical sections that have no shared state, contradictory booleans – in general look at deadcode elim and CSE as error signalers

Internal Consistency: finding security holes u Applications are bad: –Rule: “do not dereference user pointer ” –One violation = security hole –Detect with static analysis if we knew which were “bad” –Big Problem: which are the user pointers??? u Sol’n: forall pointers, cross-check two OS beliefs –“*p” implies safe kernel pointer –“copyin(p)/copyout(p)” implies dangerous user pointer –Error: pointer p has both beliefs. –Implemented as a two pass global checker u Result: 24 security bugs in Linux, 18 in OpenBSD –(about 1 bug to 1 false positive) First pass: mark all pointers treated as user pointers. Second pass: make sure they are never dereferenced.

u Still alive in linux 2.4.4: –Tainting marks “rt” as a tainted pointer, checking warns that rt is passed to a routine that dereferences it –2 other examples in same routine… /* drivers/net/appletalk/ipddp.c:ipddp_ioctl */ case SIOCADDIPDDPRT: return ipddp_create(rt); case SIOCDELIPDDPRT: return ipddp_delete(rt); case SIOFCINDIPDDPRT: if(copy_to_user(rt, ipddp_find_route(rt), sizeof(struct ipddp_route))) return –EFAULT; An example Marked as tainted because passed as the first argument to copy_to_user, which is used to access potentientially bad user pointers. Does global analysis to detect that the pointer will be dereferenced by ippd_…

u Common: multiple implementations of same interface. –Beliefs of one implementation can be checked against those of the others! u User pointer (3 errors): If one implementation taints its argument, all others must –How to tell? Routines assigned to same function pointer –More general: infer execution context, arg preconditions… –Interesting q: what spec properties can be inferred? Cross checking beliefs related abstractly –Parameter features: Can a param be null? What are legal values of integer parameter Return code: What are allowable error code to return & when? Execution context: Are interrupts off or on when code runs? When it exits? Does it run concurrently? –Parameter features: Can a param be null? What are legal values of integer parameter Return code: What are allowable error code to return & when? Execution context: Are interrupts off or on when code runs? When it exits? Does it run concurrently? foo_write(void *p, void *arg,…){ copy_from_user(p, arg, 4); disable(); … do something … enable(); return 0; } bar_write(void *p, void *arg,…){ *p = *(int *)arg; … do something … disable(); return 0; } If one does it right, we can cross check all: if one dev gets it right we are in great shape.

Handling MAY beliefs u MUST beliefs: only need a single contradiction u MAY beliefs: need many examples to separate fact from coincidence u Conceptually: –Assume MAY beliefs are MUST beliefs –Record every successful check with a “check” message –Every unsuccessful check with an “error” message –Use the test statistic to rank errors based on ratio of checks (n) to errors (err) –Intuition: the most likely errors are those where n is large, and err is small. z(n, err) = ((n-err)/n-p0)/sqrt(p0*(1-p0)/n)

Statistical: Deriving deallocation routines u Use-after free errors are horrible. –Problem: lots of undocumented sub-system free functions –Soln: derive behaviorally: pointer “p” not used after call “foo(p)” implies MAY belief that “foo” is a free function u Conceptually: Assume all functions free all arguments –(in reality: filter functions that have suggestive names) –Emit a “check” message at every call site. –Emit an “error” message at every use –Rank errors using z test statistic: z(checks, errors) –E.g., foo.z(3, 3) < bar.z(3, 1) so rank bar’s error first –Results: 23 free errors, 11 false positives foo(p); *p = x; foo(p); *p = x; foo(p); *p = x; bar(p); p = 0; bar(p); p = 0; bar(p); *p = x; Can cross-correlate: free is on error path, has dealloc in name, etc, bump up ranking. Foo has 3 errors, and 3 checks. Bar, 3 checks, one error. Essentially every passed check implies belief held, every error = not held

A bad free error /* drivers/block/cciss.c:cciss_ioctl */ if (iocommand.Direction == XFER_WRITE){ if (copy_to_user(...)) { cmd_free(NULL, c); if (buff != NULL) kfree(buff); return( -EFAULT); } } if (iocommand.Direction == XFER_READ) { if (copy_to_user(...)) { cmd_free(NULL, c); kfree(buff); } } cmd_free(NULL, c); if (buff != NULL) kfree(buff);

u Traditional: –Use global analysis to track which routines return NULL –Problem: false positives when pre-conditions hold, difficult to tell statically (“return p->next”?) u Instead: see how often programmer checks. –Rank errors based on number of checks to non-checks. u Algorithm: Assume *all* functions can return NULL –If pointer checked before use, emit “check” message –If pointer used before check, emit “error” –Sort errors based on ratio of checks to errors u Result: 152 bugs, 16 false. Statistical: deriving routines that can fail P = foo(…); *p = x; p = bar(…); If(!p) return; *p = x; p = bar(…); If(!p) return; *p = x; p = bar(…); If(!p) return; *p = x; p = bar(…); *p = x; Can also use consistency: if a routine calls a routine that fails, then it to can fail. Similarly, if a routine checks foo for failure, but calls bar, which does not, is a type error. (In a sense can use witnesses: take good code and see what it does, reapply to unknown code)

The worst bug u Starts with weird way of checking failure: u So why are we looking for “seg_alloc”? /* ipc/shm.c:750:newseg: */ if (!(shp = seg_alloc(...)) return -ENOMEM; id = shm_addid(shp); /* : ipc/shm.c:1745:map_zero_setup */ if (IS_ERR(shp = seg_alloc(...))) return PTR_ERR(shp); static inline long IS_ERR(const void *ptr) { return (unsigned long)ptr > (unsigned long)-1000L; } int ipc_addid(…* new…) {... new->cuid = new->uid =…; new->gid = new->cgid = … ids->entries[id].p = new;

Deriving “A() must be followed by B()” u “a(); … b();” implies MAY belief that a() follows b() –Programmer may believe a-b paired, or might be a coincidence. u Algorithm: –Assume every a-b is a valid pair (reality: prefilter functions that seem to be plausibly paired) –Emit “check” for each path that has a() then b() –Emit “error” for each path that has a() and no b() –Rank errors for each pair using the test statistic »z(foo.check, foo.error) = z(2, 1) u Results: 23 errors, 11 false positives. foo(p, …) bar(p, …); “check foo-bar” x(); y(); “check x-y” foo(p, …); … “error:foo, no bar!”

Checking derived lock functions u Evilest: u And the award for best effort: /* 2.4.1: drivers/sound/trident.c: trident_release: lock_kernel(); card = state->card; dmabuf = &state->dmabuf; VALIDATE_STATE(state); /* 2.4.0:drivers/sound/cmpci.c:cm_midi_release: */ lock_kernel(); if (file->f_mode & FMODE_WRITE) { add_wait_queue(&s->midi.owait, &wait);... if (file->f_flags & O_NONBLOCK) { remove_wait_queue(&s->midi.owait, &wait); set_current_state(TASK_RUNNING); return –EBUSY; … unlock_kernel();

u Key ideas: –Check code beliefs: find errors without knowing truth. –Beliefs code MUST have: Contradictions = errors –Beliefs code MAY have: check as MUST beliefs and rank errors by belief confidence u Secondary ideas: –Check for errors by flagging redundancy. –Analyze client code to infer abstract features rather than just implementation. –Spec = checkable redundancy. Can use code for same. Summary: Belief Analysis

Example free checker sm free_checker { state decl any_pointer v; decl any_pointer x; start: { kfree(v); }  v.freed ; v.freed: { v == x } | { v != x }  { /* suppress fp */ } | { v }  { err(“Use after free!”); ; } start v. freed error use(v) kfree(v) Simple. Have had freshman write these and post bugs to linux groups. Three parts: start state. Pattern; raw c coede or wildcards., match does a transition, callouts. Scales with sophistication of analysis. System will kill variable, track when assigned to others. Simple. Have had freshman write these and post bugs to linux groups. Three parts: start state. Pattern; raw c coede or wildcards., match does a transition, callouts. Scales with sophistication of analysis. System will kill variable, track when assigned to others.

Example inferring free checker sm free_checker { state decl any_pointer v; decl any_pointer x; decl any_fn_call call; decl any_args args; start: { call(v) }  { char *n = mc_identifier(call); if(strstr(n, “free”) || strstr(n, “dealloc”) || … ) { mc_v_set_state(v, freed); mc_v_set_data(v, n); note(“NOTE: %s”, n); } }; v.freed: { v == x } | { v != x }  { /* suppress fp */ } | { v }  { err(“Use after free %s!”, mc_v_get_data(v)); ; Simple. Have had freshman write these and post bugs to linux groups. Three parts: start state. Pattern, match does a transition, callouts. Scales with sophistication of analysis. Simple. Have had freshman write these and post bugs to linux groups. Three parts: start state. Pattern, match does a transition, callouts. Scales with sophistication of analysis.

u Two Techniques: –internal consistency –statistical analysis u Found hundreds of bugs automatically in real system code: –Linux –OpenBSD Conclusion Reduced to using grep on millions of line of code, or documentation, hoping you can find all cases