Advanced Computer Architecture Lab University of Michigan 1 Efficient Dynamic Detection of Input-Related Security Faults Eric Larson Dissertation Defense.


1 Advanced Computer Architecture Lab, University of Michigan
Efficient Dynamic Detection of Input-Related Security Faults
Eric Larson
Dissertation Defense
University of Michigan
April 29, 2004

2 Security Faults
Keeping computer data and accesses secure is a tough problem
Software errors cost companies millions of dollars
Different types of errors can lead to exploits:
– Protocol errors
– Configuration errors
– Implementation errors (most common)
Even with a well-designed security protocol, a program can be compromised if it contains bugs!

3 Input-Related Software Faults
Common implementation error is to improperly bound input data
– checks are not present in many cases
– when checks are present, they can be wrong
– especially important for network data
Common security exploit: buffer overflow
– array references
– string library functions in C
Widespread problem:
– 2/3 of CERT security advisories in 2003 were due to buffer overflows
– buffer overflow bugs have recently been found in Windows and Linux

4 Example Buffer Overflow Attack
Attacking the program involves two steps:
1. Write malicious code onto the stack.
2. Redirect control to execute the malicious data.
[Figure: stack diagram showing the frames for foo and bar, the injected bad code, and the remainder of the stack]

5 Overwriting the Return Address
void bar() {
    char buffer[100];
    gets(buffer);
    printf("String is %s", buffer);
}
[Figure: stack layout, top to bottom: return address, temporary value 1, temporary value 2, buf[99], buf[98], ..., buf[0]. Stack grows to lower addresses; data grows to higher addresses.]

6 Overwriting the Return Address
void bar() {
    char buffer[100];
    gets(buffer);
    printf("String is %s", buffer);
}
[Figure: the same stack layout with 0xbadc0de written above buf[99], buf[98], ..., buf[0]. Stack grows to lower addresses; data grows to higher addresses.]
The location of the return address is not always known, so overwrite everything!
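The overflow in bar() is possible because gets has no way to learn the size of buffer. A minimal sketch of a bounded read using the standard fgets follows; the helper name read_line_bounded is hypothetical and does not come from the talk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the talk): reads at most cap - 1 characters
   from `in` into `buf` and always null-terminates the result.
   Returns 0 on success, -1 on EOF, read error, or a zero-sized buffer. */
int read_line_bounded(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL);
    if (cap == 0 || fgets(buf, (int)cap, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return 0;
}
```

With char buffer[100], a call such as read_line_bounded(stdin, buffer, sizeof buffer) can never write past buffer[99], so the return address stays intact however long the input line is.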

7 Outline of Talk
Background and Related Work (Ch. 2)
Detecting Input-Related Software Faults (Ch. 3)
MUSE: Instrumentation Infrastructure (Ch. 4)
Implementation and Results (Ch. 5)
Reducing Performance Overhead (Ch. 6)
Conclusions (Ch. 7)

8 When Should I Look for Software Bugs?
Compile-time (static) bug detection
+ no dependence on input
+ can prove that a dangerous operation is safe in some cases
– often computationally infeasible (too many states or paths)
– scope is limited: either high false alarm rate or low bug finding rate
– hard to analyze heap data
Run-time (dynamic) bug detection
+ can analyze all variables (including those on the heap)
+ execution is on a real path → fewer false alarms
– error may not manifest as an error in the output
– depends on program input
– impacts performance of program
Our approach is dynamic, addressing its deficiencies by borrowing ideas from static bug detection

9 Contributions of this Thesis
Dynamically Detecting Input-Related Software Faults
– Relaxes dependence on input
MUSE: Instrumentation Infrastructure
– Developed for rapid prototyping of bug detection tools for this and future research
Removing Unnecessary Instrumentation
– Reduces performance overhead
Improved Shadow State Management
– Tighter integration with the compiler, improves performance

10 Selected Related Work
Jones & Kelly: dynamic approach to catching memory access errors, tracks all valid objects in memory using a table
Tainted Perl: prevents unsafe actions from unvalidated input
STOBO: uses allocation sizes rather than string sizes
CCured: type system used to catch memory access errors, instrumentation is added when static analysis fails
BOON: derives and solves a system of integer range constraints statically to find buffer overruns
CSSV: model checking system to find buffer overflows in C, keeps track of potential string lengths and null termination
MetaCompilation: checks for uses of unbounded input, does not verify if the checks are correct

11 Detection of Input-Related Software Faults
Program instrumentation tracks data derived from input
– possible range of integer variables
– maximum size and termination of strings
Dangerous operations are checked over entire range of possible values
Found 17 bugs in 9 programs, including 2 known high security faults in OpenSSH
Relaxes constraint that the user provides an input that exposes the bug

12 Detecting Array Buffer Overflows
Interval constraint variables are introduced when external inputs are read
– Holds the lower and upper bounds for each input value
– Initial values encompass the entire range
– Control points narrow the bounds
– Arithmetic operations adjust the bounds
Potentially dangerous operations are checked:
– Array indexing
– Controlling a loop or memory allocation size
– Arithmetic operations (overflow)

13 Code Sequence:
Code Sequence                            Range of x                  Value of x
int x; int array[5];
x = get_input_int();                     -MAX_INT ≤ x ≤ +MAX_INT     2
if (x < 0 || x > 4) fatal("bounds");     0 ≤ x ≤ 4                   2
x++;                                     1 ≤ x ≤ 5                   3
y = array[x];                            1 ≤ x ≤ 5                   3
ERROR! When x = 5, array reference is out of bounds!
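The range tracking in this example can be sketched in a few lines of C. This is an illustration under stated assumptions, not the checker's actual code: the type interval and every function name below are hypothetical, and the sketch uses INT_MIN as the initial lower bound where the slide writes -MAX_INT.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical shadow state for one input-derived integer.
   long long keeps bound arithmetic from wrapping in this sketch. */
typedef struct { long long lo, hi; } interval;

/* A freshly read input may hold any int value. */
interval interval_from_input(void)
{
    return (interval){ INT_MIN, INT_MAX };
}

/* Passing the guard `if (x < lo || x > hi) fatal(...)` narrows the bounds. */
interval interval_narrow(interval r, long long lo, long long hi)
{
    if (r.lo < lo) r.lo = lo;
    if (r.hi > hi) r.hi = hi;
    assert(r.lo <= r.hi);   /* a path that really executed leaves a non-empty range */
    return r;
}

/* x += k shifts both bounds by k. */
interval interval_add(interval r, long long k)
{
    return (interval){ r.lo + k, r.hi + k };
}

/* An index is safe only if every value in the range lies in [0, len). */
bool index_always_safe(interval r, long long len)
{
    return r.lo >= 0 && r.hi < len;
}
```

Replaying the slide: after the guard the range is [0, 4], after x++ it is [1, 5], and index_always_safe reports false for an array of 5 elements even though the value in this run (3) is in bounds.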

14 Detecting Dangerous String Operations
Strings are shadowed by:
– max_str_size: largest possible size of the string
– known_null: set if string is known to contain a null character
Checking string operations:
– source string will fit into the destination
– source strings are guaranteed to be null terminated
Operations involving a string length can narrow the maximum string size
– our size counts the null character, the strlen function does not
Integers that store string lengths are shadowed by:
– base address of corresponding string
– difference between its value and actual string length
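A minimal sketch of how the two string shadow fields could drive a strcpy check. The field names max_str_size and known_null come from the slide; the struct, the enum, and both function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Shadow record for one string (field names from the slide, layout assumed). */
typedef struct {
    size_t max_str_size;  /* largest possible size, counting the null character */
    bool   known_null;    /* true if the string is known to contain a null */
} str_shadow;

typedef enum { COPY_OK, COPY_SRC_UNTERMINATED, COPY_MAY_OVERFLOW } copy_check;

/* Hypothetical check for strcpy(dst, src), given dst's capacity in bytes. */
copy_check check_strcpy(size_t dst_capacity, str_shadow src)
{
    assert(dst_capacity > 0);
    if (!src.known_null)
        return COPY_SRC_UNTERMINATED;
    if (src.max_str_size > dst_capacity)
        return COPY_MAY_OVERFLOW;
    return COPY_OK;
}

/* Passing `if (strlen(s) > limit) return ...;` means strlen(s) <= limit,
   so the size including the null character is at most limit + 1. */
str_shadow narrow_by_strlen(str_shadow s, size_t limit)
{
    if (s.max_str_size > limit + 1)
        s.max_str_size = limit + 1;
    return s;
}
```

The off-by-one between strlen and max_str_size is the point of the sketch: a guard of strlen(s) > 16 leaves a maximum size of 17, which does not fit a 16-byte destination.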

15-19 String Fault Detection Example
Code Segment:
char *bad_copy(char *src) {
    char tmp[16];
    char *dst = (char*)malloc(16);
    if (strlen(src) > 16) return NULL;
    strncpy(tmp, src, 16);
    strcpy(dst, tmp);
    return dst;
}
Shadow state as each statement executes:
Statement                              Str.   max_str_size   known_null
(on entry)                             src    MAX_INT        TRUE
char tmp[16];                          tmp    16             FALSE
char *dst = (char*)malloc(16);         dst    16             FALSE
if (strlen(src) > 16) return NULL;     src    17             TRUE
strncpy(tmp, src, 16);                 tmp    16             FALSE
strcpy(dst, tmp);                      ERROR! tmp may not be null terminated during strcpy

20 String Fault Detection Example
Code Segment:
char *bad_copy(char *src) {
    char *dst = (char*)malloc(16);
    if (strlen(src) > 16) return NULL;
    strcpy(dst, src);
    return dst;
}
Shadow state as each statement executes:
Statement                              Str.   max_str_size   known_null
(on entry)                             src    MAX_INT        TRUE
char *dst = (char*)malloc(16);         dst    16             FALSE
if (strlen(src) > 16) return NULL;     src    17             TRUE
strcpy(dst, src);                      ERROR! src may not fit into dst during strcpy
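One way to repair the second bad_copy is sketched below. The talk does not show a fix, so the function name checked_copy, the DST_SIZE constant, and the choice to test the length before allocating are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { DST_SIZE = 16 };   /* same destination size as the talk's example */

/* Hypothetical corrected variant of bad_copy.
   Returns a heap copy of src, or NULL if src does not fit or malloc fails. */
char *checked_copy(const char *src)
{
    assert(src != NULL);
    if (strlen(src) >= DST_SIZE)   /* a length of 16 would need 17 bytes */
        return NULL;
    char *dst = malloc(DST_SIZE);
    if (dst == NULL)
        return NULL;
    strcpy(dst, src);              /* safe: strlen(src) + 1 <= DST_SIZE */
    return dst;
}
```

Changing the guard from > to >= brings the maximum size of src down to 16, so the copy fits. Checking before calling malloc also avoids leaking dst on the early return, a separate issue in the original code.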

21 MUSE: Implementation Infrastructure
Developed for rapid prototyping of bug detection tools for this and future research
General-purpose instrumentation tool
– can also be used to create profilers, coverage tools, and debugging aids
Implemented in GCC at the abstract syntax tree (AST) level
Simplification phase breaks up complex C statements
– removes C side effects and other nuances
– allows matching in the middle of a complex expression
Specification consists of pattern-function pairs
– patterns match against statements, expressions, and special events
– on a match, call is made to corresponding external function

22 Testing Process
[Figure: flow diagram. Source Code and Instrumentation specification → Compile (GCC w/MUSE) → Instrumented Executable → Run test suite → Error reports → Debug and fix errors]

23 Input Checker Implementation
Shadow state stores checker bookkeeping info:
– integers: bounds and string length information
– arrays: maximum string size, null flag, and actual size
Stored in hash tables (shadow state table)
– hash tables are indexed by address
– separate hash tables for integers and arrays
Pointers use the array hash table
Debug tracing mode can help find source of error
[Figure: for the declaration int x; the address &x indexes the Shadow State Table, whose entry holds the shadow state for x: lb: 0, ub: 5]
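A minimal sketch of an address-indexed shadow state table for integers. The talk only says the tables are hash tables indexed by address; the fixed size, the linear probing, and all names below are assumptions, and entry removal is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TABLE_SLOTS = 64 };   /* power of two; the sketch never fills it */

/* One entry: the shadowed variable's address plus its bounds. */
typedef struct { const void *addr; long lb, ub; } int_shadow;

static int_shadow table[TABLE_SLOTS];

static size_t slot_for(const void *addr)
{
    return (size_t)(((uintptr_t)addr >> 2) & (TABLE_SLOTS - 1));
}

/* Find the entry for addr, or the empty slot where it would go. */
static int_shadow *probe(const void *addr)
{
    size_t i = slot_for(addr);
    for (size_t n = 0; n < TABLE_SLOTS; n++, i = (i + 1) & (TABLE_SLOTS - 1))
        if (table[i].addr == addr || table[i].addr == NULL)
            return &table[i];
    return NULL;   /* table full */
}

/* Create or update the shadow state for the integer at addr. */
void shadow_set(const void *addr, long lb, long ub)
{
    int_shadow *e = probe(addr);
    assert(e != NULL);
    e->addr = addr;
    e->lb = lb;
    e->ub = ub;
}

/* Returns NULL when the variable has no shadow state. */
const int_shadow *shadow_get(const void *addr)
{
    int_shadow *e = probe(addr);
    return (e != NULL && e->addr == addr) ? e : NULL;
}
```

For the slide's figure, shadow_set(&x, 0, 5) records lb: 0 and ub: 5 for x, and every later check on x pays for a hash and probe in shadow_get(&x).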

24 Results: Bugs Found
Program     Description                      Defects Found   Add'l False Alarms
anagram     anagram generator                2               0
ft          fast Fourier transform           2               0
ks          graph partitioning               3               0
yacr2       channel router                   2               1
betaftpd    file transfer protocol daemon    2               1
gaim        instant messaging client         1               1
ghttpd      web server                       3               2
openssh     secure shell client / server     2               0
thttpd      web server                       0               1
TOTAL                                        17              6

25 Results: Comparison to Static Approaches
Program:       anagram   ft   ks   yacr2   betaftpd   gaim   ghttpd   openssh   thttpd
My approach:   2         2    3    2       2          1      3        2         0
BOON: 0 0 0 0 0 core dump 0 0
MetaCompilation: Could not get access to their bug detection system.

26 Initial Performance Results

27 Eliminating Unnecessary Instrumentation
Many variables do not need shadow state:
– Variables that never hold input data
– Variables that do not produce results used in dangerous operations
Use static analysis to only apply instrumentation to variables that need shadow state
– At least 83% of instrumentation sites are useless!
Algorithm is similar to that of constant propagation in a compiler
Implemented in Dflow, a whole program dataflow analysis tool we created

28 Example: Removing Unneeded Instrumentation
int a, b, c, d, x[5];
a = get_input_int();
b = get_input_int();
c = 2;
d = b;
x[a] = 3;
x[c] = 6;
printf("%d\n", d);

29-31 Example: Removing Unneeded Instrumentation
int a, b, c, d, x[5];
create_array_state(x);
a = get_input_int();
create_int_bound_state(&a);
b = get_input_int();
create_int_bound_state(&b);
c = 2;
remove_int_state(&c);
d = b;
copy_int_state(&d, &b);
check_array_ref(x, &a);
x[a] = 3;
check_array_ref(x, &c);
x[c] = 6;
printf("%d\n", d);
Unnecessary! c never holds input data
Unnecessary! input value in b never used in dangerous operation
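A sketch of what the example might look like once the unneeded calls are gone. The instrumentation function names come from the slide, but their bodies here are stubs that only count calls, get_input_int returns a fixed value, and dropping the check on x[c] is an assumption based on c never holding input data.

```c
#include <assert.h>
#include <stdio.h>

/* Stubs standing in for the checker runtime; each one only counts the call. */
static int instrumentation_calls = 0;
static void create_array_state(int *arr)        { (void)arr; instrumentation_calls++; }
static void create_int_bound_state(int *v)      { (void)v; instrumentation_calls++; }
static void check_array_ref(int *arr, int *idx) { (void)arr; (void)idx; instrumentation_calls++; }
static int  get_input_int(void)                 { return 1; }   /* stub input */

/* Runs the reduced example and returns the number of instrumentation calls made. */
int run_reduced(void)
{
    int a, b, c, d, x[5];
    create_array_state(x);
    a = get_input_int();
    create_int_bound_state(&a);   /* kept: a is input and indexes x */
    b = get_input_int();          /* no shadow state: b never reaches a dangerous operation */
    c = 2;                        /* no shadow state: c never holds input data */
    d = b;                        /* nothing to copy once b is unshadowed */
    check_array_ref(x, &a);
    x[a] = 3;
    assert(c >= 0 && c < 5);      /* constant index, in bounds by inspection */
    x[c] = 6;
    printf("%d\n", d);
    return instrumentation_calls;
}
```

The fully instrumented version on the slide makes seven runtime calls; this reduced version makes three.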

32 Results: Removing Unneeded Instrumentation

33 Results: Removing Unneeded Instrumentation

34 Approaches to Shadow State Management
Shadow state table (Example: Jones & Kelly):
– Slow to maintain and access
– Does not modify the variables within the program
Fat variables (Example: Safe C):
– Fast to access, shadow state is contained within the variable
– Variables no longer fit within a register
– All variables of a particular type must be instrumented
– Must account for functions that were not compiled using fat variables

35 Referencing Local Shadow State by Name
Compiler creates separate variable to store shadowed state for local variables
– Quick to access, lookup to table not necessary
– Original variable is not modified in any form
– Only created for local variables that need shadowed state
Still need shadow state table for:
– heap variables
– aliased local variables (used in the "address-of (&)" operator)

36 Results: Shadow State by Name (Performance)

37 Results: Shadow State by Name (Integer Shadow State Table Accesses)

38 Overall Performance Results

39 Conclusion
Our dynamic approach detects input-related faults, reducing the dependence on the precise input
Shadows variables derived from input with additional state:
– Integers: upper and lower bounds
– Strings: maximum string size and known null flag
Found 17 bugs in 9 programs
– 2 known high security faults in OpenSSH
Improved performance by 58%
– removing unneeded instrumentation sites
– improved shadow state management

40 Future Work
Reduce the dependence on the control path
Improve performance overhead by eliminating redundant instrumentation
Add symbolic analysis support
Address these common scenarios:
– pointer walking (manual string handling)
– multiple string concatenation into a single buffer
Add static bug detection work to prove operations safe
Combine MUSE and Dflow into a single standalone tool
Explore other correctness properties

41 Questions and Answers

42 Inserting Malicious Code
The injected code is typically very simple: often a lone system call that invokes a shell
Do not know the precise address ahead of time
– Keep on guessing until you get it right
– Precede code with a sequence of nops to reduce the number of guesses
– Disassembling the code can help
Malicious code need not reside on the stack (Example: environment variable)
Also possible to exploit a buffer overflow on the heap

43 Software Verification
Verification determines if a program is functionally correct
Complete program verification only possible for trivial programs
Instead, programs are shown to satisfy properties
– that are simple
– that have well-known behavior
Verification schemes are gauged by:
– soundness: every possible error is found
– completeness: every reported error is a true error

44 Typical Static Bug Detection Scheme
[Figure: flow diagram from Program to Model through stages labeled Parse, Translate, Abstract, and Optimize; the Model and a Correctness Specification feed a Check stage]
Figure notes:
– Remove parts of code not relevant to property
– Can be done using model checker, theorem prover, constraint solver, or interpreter

45 Dynamic Bug Detection Systems
Bug prevention schemes:
– used "in the field", needs to be fast
– add safety checks around dangerous operations
– bugs are still present
Bug detection schemes:
– designed to be used during testing
– finding bugs is more important than speed
– high performance overhead
– typically use shadow state to find bugs that do not manifest in an output error

46 Example Static Bug Detection Systems
SLAM: Uses predicate abstraction to create a Boolean program that is used to verify Windows device drivers.
PREfix: Traverses the call graph bottom-up using summary models for analyzed functions.
ARCHER: Uses static analysis and a constraint solver to find errors in the Linux kernel.
Splint: Uses annotation to analyze programs for security vulnerabilities.
SPIN: Designed for verifying distributed system protocols. The protocol must be manually written using PROMELA.

47 Tainted Data Analysis Algorithm
// Initialization
Tainted = ∅
InputFunctionCalls = stmts that call input-producing functions
foreach stmt s
    if (s ∈ InputFunctionCalls) then
        Tainted = Tainted ∪ Defs(s)
// Iterate until Tainted set is stable
do {
    LastTainted = Tainted
    foreach stmt s
        if (∃ d ∈ Uses(s) s.t. d ∈ Tainted) then
            Tainted = Tainted ∪ Defs(s)
} while (LastTainted ≠ Tainted)
// At end, Tainted contains definitions derived from input
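The fixed-point iteration above can be sketched in C. This is a simplification under stated assumptions: it tracks variables rather than individual definitions, represents sets as bitmasks over at most 32 variables, and uses a hypothetical stmt record that is not part of Dflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement model: the variables a statement defines and reads,
   each as a bitmask, plus whether it calls an input-producing function. */
typedef struct {
    unsigned defs;
    unsigned uses;
    bool     calls_input;
} stmt;

/* Returns the bitmask of variables derived from input. */
unsigned tainted_set(const stmt *stmts, size_t n)
{
    assert(stmts != NULL || n == 0);

    unsigned tainted = 0;
    for (size_t i = 0; i < n; i++)          /* initialization */
        if (stmts[i].calls_input)
            tainted |= stmts[i].defs;

    unsigned last;
    do {                                    /* iterate until the set is stable */
        last = tainted;
        for (size_t i = 0; i < n; i++)
            if (stmts[i].uses & tainted)
                tainted |= stmts[i].defs;
    } while (last != tainted);

    return tainted;
}
```

Applied to the talk's a, b, c, d, x example, the result contains a, b, d, and x but not c, matching the observation that c never holds input data. Deciding that b and d need no shadow state takes the second criterion (no use in a dangerous operation), which this sketch does not cover.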

