Static Code Checking: Security and Concurrency Ben Watson The George Washington University CS 297 Security and Programming Languages June 9, 2005.

1 Static Code Checking: Security and Concurrency Ben Watson The George Washington University CS 297 Security and Programming Languages June 9, 2005

2 The Video

3 The Problem
How to discover errors in code without running it
- Code can run for weeks or months without displaying the error
- Many errors are caused by pieces of code that are very difficult to test
  - Device drivers: manufacturers aren't always good at this, and one OS company can't possibly test all the tens of thousands of devices out there (the Windows 98 crash was caused by a bad scanner driver)
  - Concurrent code: debugging complicated concurrency problems is a nightmare x n

4 The Scope
Lines of Code (estimated)
    Windows 3.1            3,000,000
    Windows NT 3.51        4,000,000
    Windows 95            15,000,000
    RedHat Linux 7.1      30,000,000
    Windows 2000          35,000,000
    Windows XP            40,000,000
    Debian Linux 2.2      56,000,000
    Debian Linux 3.1     213,000,000

5 The Real Problem
- We're only human
  - No person, no group of people can possibly manually debug anything as complicated as an OS and its related pieces
  - Good tools are not enough
    - Can't rely on thorough annotations of the entire code base
    - Can't rely on manual directions: the more automated the better

6 The Solutions
- MC: security checking system
- RacerX: race condition and deadlock detection
- General rule inference from source code

7 MECA: Statically Checking Security Properties
- Checks low-level properties (pointer safety, etc.)
- Relies on annotations that propagate through the analysis
- Goals
  - Expressiveness
  - Low manual overhead: programmers only have to type in relatively few annotations
  - Few false positives

8 How MC Works
- Uses a modified GCC compiler
- Parses source along with the abstract syntax tree (AST) generated by the compiler
- AST is used to build a control-flow graph (CFG)
- Annotation propagator uses the CFG to propagate annotations through the entire graph
- Checkers are run on the completed graph
- Results are ranked and filtered

9 An example
- Rule: the OS kernel may not dereference a user pointer directly (there are "paranoid" functions to access the data pointed to by a user pointer)
  - Such pointers are referred to as "tainted" pointers
- Annotate:
  - Tainted variables, parameters, and fields
  - Functions that produce tainted values

10 Source annotations
    struct myStruct {
        /*@ tainted */ int *p;
    };
    /*@ tainted */ int *foo(/*@ tainted */ int *p);
    void memcpy(/*@ !tainted */ void *dst, /*@ !tainted */ void *src, unsigned nbytes);

11 Source annotations
    // Binding:
    /*@ set_length($ret, sz) */
    void *malloc(unsigned sz);
    // Global: all sys_* calls are tainted
    /*@ global $param ${!strncmp(current_fn, "sys_", 4)} ==> tainted */

12 Propagation
    void bar(/*@ tainted */ void *p);
    struct S { char *buf; };
    // Before analysis
    void foo(char **p, struct S *s) {
        char *r;
        struct S *ss;
        r = *p;
        bar(r);        // taints r and *p
        ss = s;
        bar(ss->buf);  // taints ss->buf and s->buf
    }
    // At the end of analysis:
    void foo(/*@ tainted (*p) */ char **p, /*@ tainted (s->buf) */ struct S *s);

13 MECA results
On average, one manual annotation led to 682 checks
Linux 2.5.63 bugs:
    Type               Warnings   Fixed
    Arbitrary write       11        11
    Arbitrary read         8         8
    Fault at will         19        17
    Always fail            6         3
    Total                 44        39
False positives: 8

14 RacerX
- Static detection of race conditions and deadlocks
- Designed to find errors in large, multi-threaded systems
- Sorts errors by severity (the hard part)
- They checked Linux, FreeBSD, and a mystery OS that has only 500,000 lines of code

15 Deadlock
- Thread 1 has locked resource A
- Thread 2 has locked resource B
- Thread 1 needs resource B to complete
- Thread 2 needs resource A to complete
- Neither can proceed: these threads are deadlocked

16 Race condition
- Multiple threads access the same memory
- If memory is unprotected:
  - Two threads can simultaneously write to the same memory (bad)
  - One thread can read while another writes (bad)
  - Two threads can simultaneously read from the same memory (probably ok)
- It's a race because the final value is non-deterministically chosen by who gets there first.

17 Avoiding the Problem
- If data is never accessed by more than one thread, you don't have to worry about concurrency
- If program logic ensures that only one thread accesses data, you don't need to worry about locking the data
- If you're writing a shared component, you almost always have to worry about concurrency

18 Algorithm
- "Lockset" algorithm detects both types of problems
- Lockset - a pair of:
  - Lock()/Unlock()
  - InterruptDisable()/InterruptEnable()
  - Etc.

19 Algorithm
- Top-down analysis of the control-flow graph
- Add/remove locks as needed
- Check for race/deadlock on each statement
- Cache results to ease exponential graph size

20 Deadlock Check
- Basically, finds whether there are cycles in the lockset dependencies
  - If lock a is obtained, then lock b, we have: a → b
  - Following this line of reasoning, we can discover cases that look like this: a → b → c → a

21 Deadlock Check
- Deciding how important a cycle is, is non-trivial.
- Basically, rank the first of each pair higher:
  - Global locks vs. local locks
  - Small depth difference vs. big depth difference
  - Fewer threads vs. more threads

22 Race Checking
- This is even harder than deadlock detection
- Must answer:
  - Is the lockset valid? (If not, you will have LOTS of false positives)
  - Can the unprotected memory be accessed by more than one thread?
  - Does the access need to be protected?
    - Two reads do not a wrong make
  - Must annotate API functions that require locks

23 Race Checking
- Deciding if code is multithreaded:
  - Inferred from "programmer belief": if a piece of code contains concurrency-related statements, the code is probably multi-threaded
  - Annotations: designate API functions as requiring locks

24 Race Checking
- Does memory need to be protected?
  - If it's never written to, no.
  - If it's only written on initialization, no.
  - On a certain code path, if there is a high number of variables that are potentially written to concurrently, probably.
  - Anything that can't be written atomically, yes. (Although this is pretty much anything, especially if you have more than 1 CPU.)
  - If a variable is statistically likely to be protected by locking code ("Programmer Belief")

25 RacerX: Results
                       Confirmed   Unconfirmed   Benign   False
    Deadlock
      System X             2            3           -        7
      Linux 2.5.62         4            8           -        6
      FreeBSD              2            3           -        6
    Race
      System X             7            4          13       14
      Linux 2.5.62         3            2           2        6

26 Pop Quiz – Question 1
If you have read the 3rd paper, you may not answer this question. Find the bug:
    if (card == NULL) {
        printk(KERN_ERR "capidrv-%d: … %d!\n", card->contrnr, id);
    }

27 Pop Quiz – Answer 1
    if (card == NULL) {
        printk(KERN_ERR "capidrv-%d: … %d!\n", card->contrnr, id);
    }
Bug: card is dereferenced (card->contrnr) inside the branch where it is known to be NULL.

28 Pop Quiz – Question 2
If you have read the 3rd paper, you may not answer this question. Find the bug:
    struct mxser_struct *info = tty->driver_data;
    unsigned long flags;
    if (!tty || !info->xmit_buf)
        return 0;

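29 Pop Quiz – Answer 2
    struct mxser_struct *info = tty->driver_data;
    unsigned long flags;
    if (!tty || !info->xmit_buf)
        return 0;
Bug: tty is dereferenced (tty->driver_data) before it is checked against NULL, so the check comes too late.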

30 General Methodology
- Take advantage of programmer beliefs
- Statistics are our friend
- If something is usually done a certain way, then instances that violate that should be examined
- Check internal consistency
  - Discover rules that are built into the code
  - Minimal to no annotation

31 Conclusion
- The methods presented tonight provide some of the best ways to find errors:
  - Millions of lines of code can be checked with at most hundreds of lines of annotations
- The bugs these methods find are fairly specific in nature (they revolve around well-structured code constructs)

32 References
- Junfeng Yang, Ted Kremenek, Yichen Xie, and Dawson Engler. MECA: an Extensible, Expressive System and Language for Statically Checking Security Properties. ACM CCS, 2003.
- Dawson Engler and Ken Ashcraft. RacerX: Effective, Static Detection of Race Conditions and Deadlocks. SOSP 2003.
- Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code. SOSP 2001.
- Source Lines of Code, http://www.answers.com/topic/source-lines-of-code
- Concurrency – Part 2: Avoiding the Problem, http://blogs.msdn.com/larryosterman/archive/2005/02/15/373460.aspx

