Runtime Monitoring of C Programs for Security and Correctness Suan Hsi Yong University of Wisconsin – Madison Ph.D. Committee: Susan Horwitz (advisor), Thomas Reps, Charles Fischer, Somesh Jha, James Smith
Errors in Software
- Software errors are undesirable
  - may produce incorrect results
  - may crash the system
  - may corrupt data
  - may be vulnerable to attack
- Software errors are difficult to detect
  - may be infrequently exercised
  - may not noticeably alter observable output
Memory and Type Safety
- Memory safety: each dereference can only access its "intended target"
  - spatial access errors (e.g., out-of-bounds array access)
  - temporal access errors (e.g., stale pointer dereference)
- Type safety: operations can only be applied to values of certain types
Memory and Type Safety
- Useful for improving the quality of software
- Tradeoff between efficiency and flexibility
  - if too general, incurs a high runtime overhead to enforce
  - if too restrictive, limits the expressiveness and utility of the language
- The C language mandates but does not enforce memory and type safety
  - the programmer's responsibility, which is error prone
Approaches for Finding Errors
- Static approaches: imprecise, not scalable
- Dynamic approaches: incomplete coverage, high runtime overhead
- This thesis: a dynamic approach that uses static analysis to reduce overhead
  - for testing/debugging, and for use in deployed software
This Thesis Explores...
- Runtime checking of memory and type safety in C programs
- Three manifestations
  - Memory-Safety Enforcer (MSE): detects invalid dereferences
  - Sensitive Location Checker (SLC): detects invalid writes to security-sensitive locations
  - Runtime Type Checker (RTC): detects bugs manifested as type errors
Common Features
- Tagged memory
  - each byte of memory is tagged at runtime with information used to detect errors
- Source-level instrumentation
  - the approach is portable and compatible with uninstrumented libraries
- Static analysis
  - identifies and eliminates unnecessary instrumentation
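Architecture
[Figure: a C source file is processed by Static Analysis, which passes classifications to the Instrumenter; the Instrumenter produces an instrumented C source file, which the C Compiler combines with the runtime system/libraries into an instrumented executable]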
Outline
- Introduction
- Memory-Safety Enforcer (MSE)
- Sensitive-Location Checker (SLC)
- Runtime Type Checker (RTC)
- Related Work
- Conclusion
Memory Safety
- p = &a: p's valid target is a
- *(p+i) is valid only if it accesses a
- Spatial access error: out of bounds
  - e.g., if i is negative or is too large
- Temporal access error: stale pointer
  - e.g., if a had been freed prior to *(p+i)
Memory Safety Enforcer (MSE)
- Debugging tool
  - an invalid access indicates a programming error
  - e.g., buffer overrun, stale pointer dereference
- Security tool
  - most attacks require an invalid write to succeed (e.g., stack smashing attacks)
  - halt execution when a violation is detected
Control Transfer Attacks
- Idea: overwrite a sensitive location with the address of malicious code
- Sensitive locations include
  - return address (stack smashing)
  - global offset table
  - function pointers
  - longjmp buffer
  - exec call argument
  - others...
Stack Smashing
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
[Figure: stack layout showing p, buf, and the return address]
Detecting Invalid Access
- Fat pointer
  - record information about what each pointer should point to
  - Safe-C, CCured, Cyclone
- Tagged memory (our approach)
  - record information about which locations may be valid targets of some pointer dereference
  - also used by Purify
Fat Pointers
- associate information with the pointer: address and size of the referent
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
[Figure: p carries the base address buf and the size 12; stack layout showing buf and the return address]
Tagged Memory
- associates information with the target rather than the pointer
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
[Figure: stack layout showing p, buf, and the return address; shaded bytes are valid targets of an unsafe dereference]
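A minimal sketch of tagged memory over a small simulated address space. The arena, the one-tag-byte-per-byte encoding, and the names `tag_set` and `checked_write` are assumptions made for illustration; the slides do not describe the tool's actual tag layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 64

char          arena[ARENA_SIZE];      /* simulated program memory */
unsigned char valid_tag[ARENA_SIZE];  /* 1 = valid target of an unsafe dereference */

/* Mark (v = 1) or clear (v = 0) the tags of len bytes starting at off.
   Called on allocation and deallocation of a tracked location. */
void tag_set(size_t off, size_t len, unsigned char v) {
    assert(off <= ARENA_SIZE && len <= ARENA_SIZE - off);
    memset(valid_tag + off, v, len);
}

/* Instrumented write: succeeds only if the target byte is tagged valid.
   Returns 1 on success, 0 on a detected violation. */
int checked_write(size_t off, char value) {
    if (off >= ARENA_SIZE || !valid_tag[off]) return 0;
    arena[off] = value;
    return 1;
}
```

If bytes 0..11 model `buf` and bytes 12..15 model the return address, a write at offset 12 is rejected. Clearing the tags on deallocation makes a later write through a stale pointer fail as well. A write that lands in some other tagged location still succeeds, which is the coverage limit that the fat-pointer comparison points out.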
Fat Pointer vs. Tagged Memory
- Fat pointers
  - guaranteed to catch all spatial errors
  - difficult to catch temporal errors efficiently (e.g., CCured uses garbage collection)
- Tagged memory
  - can detect both spatial and temporal errors efficiently
  - guaranteed only to catch invalid accesses to non-user memory
  - but coverage can be improved with static analysis
Improving MSE
- Which dereferences to check?
  - if static analysis can guarantee that *p is always valid, then *p need not be instrumented
  - classify dereferences into checked/unchecked
- Which locations to tag valid at runtime?
  - if x can only be accessed directly or via an unchecked dereference, then x need not be tagged valid
  - classify locations into tracked/untracked
Checked Derefs/Tracked Locs
- naively: all dereferences are checked; all user-defined locations are tracked
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
[Figure: stack layout showing p, buf, and the return address]
Checked Derefs/Tracked Locs
    FN_PTR fp = &foo;
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
    (*fp)();
- Static analysis: identify fewer checked derefs and tracked locations
[Figure: stack layout showing p, buf, and fp]
Checked Dereferences
- Writes only vs. read/write
  - write-only checking catches most attacks and significantly reduces overhead
- Flow-insensitive analysis: *p is checked if
  - p is assigned a non-pointer value, or
  - p is the result of pointer arithmetic (a[i] is treated as *(a+i)), or
  - p may point to a stack or freed heap location
Example: Checked Dereferences
    FN_PTR fp = &foo;
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
    (*fp)();
- dereferences
  - *p: checked
  - *fp: unchecked
Tracked Locations
- locations that may be validly accessed via a checked dereference
- fewer tracked locations means better performance and coverage
  - less overhead to mark and clear the valid tag
  - increased likelihood of catching an invalid access
- identify with points-to analysis [Das'00]
  - for each checked dereference *p, all locations in p's points-to set are tracked
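The two classification rules can be sketched together: the flow-insensitive test for checked dereferences, followed by the rule that every location in a checked dereference's points-to set is tracked. The `deref_info` record, the bitmask encoding of locations, and the function names are hypothetical, and the points-to sets are supplied by hand rather than computed by a real points-to analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Abstract locations of the running example, one bit each. */
enum { LOC_BUF = 1 << 0, LOC_FOO = 1 << 1, LOC_FP = 1 << 2, LOC_P = 1 << 3 };

/* Facts about the pointer p behind a dereference *p (hypothetical record). */
typedef struct {
    const char *name;
    int assigned_non_pointer;         /* p is assigned a non-pointer value */
    int from_pointer_arith;           /* p is the result of pointer arithmetic */
    int may_point_to_stack_or_freed;  /* p may point to a stack or freed heap location */
    unsigned points_to;               /* bitmask: p's points-to set */
} deref_info;

/* Flow-insensitive rule: *p is checked if any of the three conditions holds. */
int is_checked(const deref_info *d) {
    return d->assigned_non_pointer || d->from_pointer_arith ||
           d->may_point_to_stack_or_freed;
}

/* Tracked locations: union of the points-to sets of all checked dereferences. */
unsigned tracked_locations(const deref_info *derefs, size_t n) {
    unsigned tracked = 0;
    assert(derefs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (is_checked(&derefs[i])) tracked |= derefs[i].points_to;
    return tracked;
}
```

For the running example, `*p` is checked because `p` is incremented and points to the stack buffer, while `*fp` is unchecked, so `buf` is the only tracked location.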
Example: Tracked Locations
    FN_PTR fp = &foo;
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
    (*fp)();
- dereferences: *p unsafe, *fp safe
- points-to graph: p points to buf; fp points to foo
[Figure: stack layout showing p, buf, and fp, each colored as tracked or untracked]
Summary of MSE Coverage
Attack target: MSE unoptimized / MSE + static analysis
- Return address: yes
- Function pointer: no / likely
- exec argument: maybe
Evaluation: Runtime Overhead
Flow-Sensitive Analyses
- Redundant checks analysis
      *(p+i) = ...;
      *(p+i) = ...;   // don't instrument
- Pointer range analysis
  - track the range of possible values for each pointer
  - e.g., at *p = ...; p points into a: char[10] with range [3,7]
  - if *p is definitely in bounds, don't instrument
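The decision made by pointer range analysis can be sketched as an interval test. The `offset_range` type and both function names are assumptions; how the thesis analysis computes and represents ranges is not shown on the slide.

```c
#include <assert.h>

/* Hypothetical analysis fact: p points into one array, at an element
   offset somewhere in the closed interval [lo, hi]. */
typedef struct { long lo, hi; } offset_range;

/* Effect of p = p + k on the tracked range. */
offset_range range_shift(offset_range r, long k) {
    offset_range s = { r.lo + k, r.hi + k };
    return s;
}

/* A dereference *p needs a runtime check unless every possible offset
   lies inside the array, i.e. 0 <= lo and hi < n_elems. */
int deref_needs_check(offset_range r, long n_elems) {
    assert(r.lo <= r.hi);
    return r.lo < 0 || r.hi >= n_elems;
}
```

With the slide's facts (a `char[10]` array and the range [3,7]) the dereference is provably in bounds and needs no instrumentation; after `p = p + 3` the range becomes [6,10] and the check has to stay.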
Flow-Sensitive Analyses
[Figure: results chart; lower is better]
Flow-Sensitive Analyses
Analysis Time Slowdown
MSE Static Analysis Summary
- Unoptimized MSE
  - high runtime overhead
  - only catches invalid accesses to non-user memory
- Flow-insensitive (extended points-to analysis)
  - low runtime overhead, scalable analysis
- Flow-sensitive analyses
  - 20% improvement, but the analysis is not scalable
- Write-only checking is faster than read-write checking
Comparison with Other Tools (runtime overhead)
Summary of MSE
- Tool for detecting invalid pointer dereferences that
  - has low runtime overhead
  - does not report false positives
  - is portable, and does not require programmer changes to source code
  - protects against a wide range of vulnerabilities, including stack smashing and erroneous free
Outline Introduction Memory-Safety Enforcer (MSE) Sensitive-Location Checker (SLC) Runtime Type Checker (RTC) Related Work Conclusion
Two Approaches to Security
- MSE: try to detect all invalid accesses
  - including invalid accesses that are not vulnerable to attack
- SLC: detect only invalid accesses to sensitive locations
  - return address, function pointers, longjmp buffers, exec call arguments
  - related work: StackGuard protects only the return address on the activation record
SLC vs MSE: classification
- MSE: identify unsafe dereferences, then compute tracked locations
[Figure: points-to edges from dereferences *p, *q, *r to locations w, x, y, z]
SLC vs MSE: classification
- SLC: identify sensitive locations, then compute unchecked dereferences
[Figure: points-to edges from dereferences *p, *q, *r to locations w, x, y, z]
Example
    char safe_buf[8];
    char vuln_buf[8];
    strcpy(vuln_buf, "ls");
    gets(safe_buf);
    system(vuln_buf);
[Figure: stack layout showing safe_buf (not sensitive), vuln_buf (sensitive), and the return address; dereferences are labeled checked or unchecked]
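A minimal sketch of SLC-style checking for this example, over a simulated memory region. It assumes that the write performed by `gets(safe_buf)` is the checked one and that the legitimate `strcpy` into `vuln_buf` is the unchecked one. The names `mark_sensitive`, `checked_copy`, and `unchecked_copy`, and the one-byte-per-byte tag array, are hypothetical. The real tool would halt on a violation; this sketch stops copying instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 32

char          mem[MEM_SIZE];        /* simulated stack frame */
unsigned char sensitive[MEM_SIZE];  /* 1 = byte belongs to a sensitive location */

void mark_sensitive(size_t off, size_t len) {
    assert(off <= MEM_SIZE && len <= MEM_SIZE - off);
    memset(sensitive + off, 1, len);
}

/* Checked write sequence: used for dereferences that should never target
   a sensitive location. Stops at the first sensitive or out-of-range byte
   and returns the number of bytes actually written. */
size_t checked_copy(size_t dst, const char *src, size_t n) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (dst + i >= MEM_SIZE || sensitive[dst + i]) break;
        mem[dst + i] = src[i];
    }
    return i;
}

/* Unchecked write: used for dereferences that may legitimately target a
   sensitive location, so no tag test is performed. */
void unchecked_copy(size_t dst, const char *src, size_t n) {
    assert(dst <= MEM_SIZE && n <= MEM_SIZE - dst);
    memcpy(mem + dst, src, n);
}
```

Placing `safe_buf` at offsets 0..7 and `vuln_buf` at offsets 8..15, an over-long input written through the checked path stops after 8 bytes, and the command string stored in `vuln_buf` is left intact.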
SLC vs MSE: instrumentation
- SLC must set/clear the tag of sensitive locations, while MSE must set/clear the tag of tracked locations
  - in general, there are far fewer sensitive locations than tracked locations, so SLC is faster
- SLC must set/clear the tag of the return address on the activation record
  - this may slow down SLC compared to MSE
Runtime Overhead: SLC vs MSE Average: SLC=37.7%, MSE=54.1%
SLC: The Bad News
- In some of the benchmarks (ijpeg, li, perl, gap), over 90% of the dereferences were not checked
  - i.e., they may point to a sensitive location
  - due to imprecision of the points-to analysis
  - could be improved with a better points-to analysis
- Good news: static analysis can tell whether SLC will be effective for a given program
SLC vs MSE
- Memory Safety Enforcer (MSE)
  - detects invalid accesses that may not be vulnerable to attack
  - may prevent new, as-yet-undiscovered methods of attack
- Sensitive Location Checker (SLC)
  - targets specific locations known (a priori) to be vulnerable to attack
  - better runtime overhead because of limited scope
Outline Introduction Memory-Safety Enforcer (MSE) Sensitive-Location Checker (SLC) Runtime Type Checker (RTC) Related Work Conclusion
Runtime Type Checking
- Idea is to detect runtime type violations
  - a value of one type is used in the context of an incompatible type
- Scalar types only (structs and arrays are broken down into components)
- Debugging tool, for use during development/testing
  - higher overhead acceptable (~20x)
- Related tools: Purify, Insure++, Valgrind
Error Example 1: Union
    union U { int u1; int *u2; } u;
    int *p;
    u.u1 = 20;   // write int into u.u1
    p = u.u2;    // copy int from u.u2 -- suspicious!
    *p = 0;      // bad pointer deref -- error!
Example 2: Bad Pointer Access
    int *intArray = (int *) malloc(15 * sizeof(int));
    int **ptrArray = (int **) malloc(15 * sizeof(int *));
[Figure: user memory containing intArray and ptrArray]
Example 2: Bad Pointer Access
    int *i, sumEven = 0;
    for (i = intArray; ...; i += 2)
        sumEven += *i;
[Figure: two layouts of user memory with the pointer i. ORIGINAL: intArray and ptrArray. PURIFY: intArray and ptrArray with padding]
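Example 3: Custom Allocator
    char *myMalloc(size_t size) {
        static char *myMemory;   /* block obtained from malloc on first use */
        static size_t current;   /* offset of the next free byte in myMemory */
        ...
        if (first_time) { myMemory = (char *) malloc(BLKSIZE); }
        current += size;
        return &myMemory[current - size];   /* start of the new chunk */
    }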
Example 3: Custom Allocator
    int *intArray = (int *) myMalloc(10 * sizeof(int));
    int **ptrArray = (int **) myMalloc(10 * sizeof(int *));
[Figure: ORIGINAL and PURIFY views of user memory with the pointer i; intArray and ptrArray both lie inside the single block myMemory]
Example 4: Simulated Inheritance
    struct Base { int a1; int a2; } base;
    struct Sub  { int b1; int b2; char b3; } sub;
    ...
    f(&base);
    f(&sub);

    void f(struct Base *s) {
        s->a1 = ...
        s->a2 = ...
    }
Example 4: Simulated Inheritance
    struct Base { int a1; int a2; } base;
    struct Sub  { int b1; float f1; int b2; char b3; } sub;
    ...
    f(&base);
    f(&sub);

    void f(struct Base *s) {
        s->a1 = ...
        s->a2 = ...
    }
Runtime Type-Checking
- Track the type of values in memory
  - stored in a mirror of memory: 4 bits for each byte
  - tags: unalloc, uninit, integral, real, pointer, zero
- Warning when the type of the value assigned does not match the expected type
- Error when a bad runtime type is used
Runtime Type Checking Tool
    int k;
    float f;
    int *p;
    f = 2.3;
    p = (int *)&f;
    k = *p;
    print_int( k );
Walkthrough of the (memory) and (mirror) columns, one step per instrumented operation:
- initially: the mirror entries for k, f, and p are all unalloc
- instrument a declaration (int k; float f; int *p;): each mirror entry becomes uninit
- instrument an assignment (f = 2.3;): memory of f holds 2.3, mirror of f becomes float
- instrument an assignment (p = (int *)&f;): memory of p holds &f, mirror of p becomes pointer
- instrument a use of a pointer (p in *p): mirror of p is pointer, OK
- instrument a pointer dereference (*p): the target f is allocated, OK
- instrument an assignment (k = *p;): memory of k holds 2.3, mirror of k becomes float, warning!
- instrument a use of an int (print_int( k );): mirror of k is float, error!
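The walkthrough above can be replayed against a small model of the mirror. The sketch packs one 4-bit tag per byte of a simulated memory, as the slides describe. The helper names, the memory size, and the rule "warn when a copied tag differs from the static type" are simplifying assumptions, and the special handling of the zero tag is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_UNALLOC, T_UNINIT, T_INTEGRAL, T_REAL, T_POINTER, T_ZERO } rtc_tag;

#define MEM_BYTES 32
unsigned char mirror[MEM_BYTES / 2];  /* two 4-bit tags per mirror byte; all T_UNALLOC at start */

rtc_tag mirror_get(size_t addr) {
    assert(addr < MEM_BYTES);
    unsigned char cell = mirror[addr / 2];
    return (rtc_tag)((addr % 2) ? (cell >> 4) : (cell & 0x0F));
}

void mirror_set(size_t addr, size_t len, rtc_tag t) {
    for (size_t a = addr; a < addr + len; a++) {
        assert(a < MEM_BYTES);
        unsigned char *cell = &mirror[a / 2];
        if (a % 2) *cell = (unsigned char)((*cell & 0x0F) | ((unsigned)t << 4));
        else       *cell = (unsigned char)((*cell & 0xF0) | (unsigned)t);
    }
}

/* Instrumented assignment dst = src: copies the tags along with the value.
   Returns 1 if a warning is due (a copied tag differs from the static type). */
int mirror_assign(size_t dst, size_t src, size_t len, rtc_tag static_type) {
    int warn = 0;
    for (size_t i = 0; i < len; i++) {
        rtc_tag t = mirror_get(src + i);
        if (t != static_type) warn = 1;
        mirror_set(dst + i, 1, t);
    }
    return warn;
}

/* Instrumented use: returns 1 if an error is due (a tag differs from the required type). */
int mirror_use_error(size_t addr, size_t len, rtc_tag required) {
    for (size_t i = 0; i < len; i++)
        if (mirror_get(addr + i) != required) return 1;
    return 0;
}
```

Placing k at bytes 0..3, f at bytes 4..7, and p at bytes 8..15, the assignment `k = *p` copies the real tag into k and yields a warning, and the later use of k as an int yields an error.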
Runtime Type Checking
- Found errors in SPEC 95, SPEC 2000, and Solaris utilities
- But overhead is high
  - 4x-170x slowdown, average 43.5x
- Use static analysis
  - flow insensitive: type-flow analysis
  - flow sensitive: redundant checks, may-be-uninitialized
Type Flow Analysis
Classify lvalue expressions into:
- safe: runtime type always equals static type
  - no instrumentation needed
- type-unsafe: always in bounds, but runtime type may not equal static type
  - must be instrumented to check the runtime type
- mem-unsafe: may violate memory safety
  - must be fully instrumented (including a check that the dereference does not access unalloc memory)
Type Flow Analysis
Classify locations into:
- untracked: runtime type always equals static type
  - no instrumentation needed
- tracked: runtime type always equals static type, but may be accessed by an unsafe dereference
  - initialize mirror to the static type, but the type doesn't change
- unsafe: may cause a type-safety error at runtime
  - initialize mirror to uninit; the type may change at runtime
Type-Flow Analysis
1. Points-to Analysis
2. Build Assignment Graph
3. Compute Possible Types
4. Classify Expressions and Locations
Type-Flow Analysis: Points-to Analysis
- Compute the points-to set for each pointer
  - e.g., pt(p) = {f}
Type-Flow Analysis: Build Assignment Graph
- Nodes
  - a variable x maps to node x; a dereference *p maps to node *p
  - 5.0 maps to VALUE_float; y+z maps to VALUE_int
  - p+i maps to VALUE_ptr; &i maps to VALUE_valid-ptr
  - value nodes: VALUE_int, VALUE_float, VALUE_ptr, VALUE_valid-ptr, VALUE_init
- Edges (one per assignment, from right-hand side to left-hand side)
  - x = y; gives x <- y
  - *p = 5.0; gives *p <- VALUE_float
  - x = y + z; gives x <- VALUE_int
Type-Flow Analysis: Compute Possible Types
- Type lattice
  - top = "uninitialized"
  - elements: valid-ptr, init, pointer, int, char, float, ...
  - bottom = "multiply-typed"
- T(n) is the possible type computed for node n
  - a value node has its own type, e.g., T(VALUE_float) = float
  - for an edge x = y, T(y) flows into T(x)
  - if pt(p) = {k}: T(k) flows into T(*p), and for an assignment *p = y, T(y) flows into T(k)
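The propagation step can be sketched as a fixed-point iteration over assignment edges. This assumes a flat lattice in which any two distinct types meet at "multiply-typed"; the ordering among valid-ptr, pointer, and init on the slide's lattice is not modeled. `poss_type`, `pt_meet`, `edge`, and `propagate` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Flat lattice (assumption): PT_TOP = "uninitialized", PT_BOTTOM = "multiply-typed". */
typedef enum { PT_TOP, PT_INT, PT_CHAR, PT_FLOAT, PT_VALID_PTR, PT_BOTTOM } poss_type;

/* Meet of two lattice elements. */
poss_type pt_meet(poss_type a, poss_type b) {
    if (a == PT_TOP) return b;
    if (b == PT_TOP) return a;
    if (a == b) return a;
    return PT_BOTTOM;
}

/* Assignment-graph edge "dst = src", given as node indices. */
typedef struct { size_t dst, src; } edge;

/* Propagate types along edges until nothing changes. Terminates because
   each node can only move down the lattice, which has finite height. */
void propagate(poss_type *t, size_t n_nodes, const edge *edges, size_t n_edges) {
    int changed = 1;
    while (changed) {
        changed = 0;
        for (size_t i = 0; i < n_edges; i++) {
            assert(edges[i].dst < n_nodes && edges[i].src < n_nodes);
            poss_type m = pt_meet(t[edges[i].dst], t[edges[i].src]);
            if (m != t[edges[i].dst]) { t[edges[i].dst] = m; changed = 1; }
        }
    }
}
```

For the program `f = 2.3; p = (int *)&f; k = *p;` with pt(p) = {f}, the edges are f = VALUE_float, p = VALUE_valid-ptr, *p = f, and k = *p. Propagation gives float for f, *p, and k, and valid-ptr for p.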
Type-Flow Analysis: Classify Expressions and Locations
- Rule 1: poss-type(x) ≠ static-type(x) => x is type-unsafe
      int k; float f = 2.3; int *p = &f; k = *p;
  - poss-type(k) = float, static-type(k) = int => k is type-unsafe
  - poss-type(*p) = float, static-type(*p) = int => *p is type-unsafe
- Rule 2: poss-type(p) ≠ valid-ptr => *p is mem-unsafe
      int *p = &k; p = p + 1; *p = 5;
  - *p is mem-unsafe
- Rule 3: *p is unsafe and pt(p) = {x} => x is tracked
  - in the first example, *p is unsafe and p may point to f, so f is tracked
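The three rules can be written as two small classification functions. This is a sketch under assumptions: a pointer's static type is encoded as `PT_VALID_PTR` for simplicity, the memory-safety rule is tested before the type rule, and all enum and function names are hypothetical.

```c
#include <assert.h>

typedef enum { PT_TOP, PT_INT, PT_FLOAT, PT_VALID_PTR, PT_BOTTOM } poss_type;
typedef enum { EXPR_SAFE, EXPR_TYPE_UNSAFE, EXPR_MEM_UNSAFE } expr_class;
typedef enum { LOC_UNTRACKED, LOC_TRACKED, LOC_UNSAFE } loc_class;

/* Classify an lvalue expression.
   poss / stat: possible type and static type of the expression itself.
   is_deref:    nonzero if the expression has the form *p.
   ptr_poss:    poss-type(p); ignored when is_deref is zero. */
expr_class classify_expr(poss_type poss, poss_type stat, int is_deref, poss_type ptr_poss) {
    if (is_deref && ptr_poss != PT_VALID_PTR) return EXPR_MEM_UNSAFE;  /* rule 2 */
    if (poss != stat) return EXPR_TYPE_UNSAFE;                         /* rule 1 */
    return EXPR_SAFE;
}

/* Classify a location.
   target_of_unsafe_deref: nonzero if some unsafe *p has the location in pt(p). */
loc_class classify_loc(poss_type poss, poss_type stat, int target_of_unsafe_deref) {
    assert(target_of_unsafe_deref == 0 || target_of_unsafe_deref == 1);
    if (poss != stat) return LOC_UNSAFE;                                /* rule 1 */
    return target_of_unsafe_deref ? LOC_TRACKED : LOC_UNTRACKED;       /* rule 3 */
}
```

On the running example this gives: `*p` is type-unsafe, `f` is tracked, `p` needs no tracking, and `k` (possible type float, static type int) is unsafe. The `p = p + 1` example makes `*p` mem-unsafe because the possible type of `p` is no longer valid-ptr.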
Type-Flow Analysis: Example
    int k;
    float f;
    int *p;
    f = 2.3;
    p = (int *)&f;
    k = *p;
    print_int( k );
- classification: f tracked; p safe; k type-unsafe; *p type-unsafe
[Figure: instrumentation of this code without type-flow analysis, then using type-flow analysis]
RTC Runtime Overhead
Flow-sensitive Refinements
- May-be-uninitialized analysis
  - needed to detect uses of uninitialized data
  - analysis time and runtime overhead are both high
  - not worthwhile: better to use unoptimized
- Redundant checks analysis
  - for each use of x, if an error is reported, the tag of x is corrected to prevent cascading errors
      y = x + 1;
      z = x + 2;   // check of x is redundant
RTC Runtime Overhead
Analysis Time
Outline Introduction Memory-Safety Enforcer (MSE) Sensitive-Location Checker (SLC) Runtime Type Checker (RTC) Related Work Conclusion
Related Work
- Run-time error detection
  - Purify, Insure++, Valgrind
  - Safe C, CCured, Cyclone
  - safe languages (Java), dynamically-typed languages (Lisp)
- Reducing instrumentation
  - Java (remove array-bounds checks)
  - Lisp and Scheme (remove tag-handling operations, e.g., Henglein)
  - CCured (remove pointer checks: Necula et al.)
Conclusion
- Finding errors and vulnerabilities is difficult
- Three related runtime monitoring approaches
  - Sensitive location checker: efficient, security-oriented
  - Memory-safety enforcer: efficient, security or debugging
  - Runtime type-checker: slow, for debugging
- Effective in detecting errors
- Static analysis to improve runtime overhead
  - flow-insensitive: scalable, significant improvements
  - flow-sensitive: modest improvements, not scalable
Future Work
- Better static analysis: fewer unsafe and tracked locations
  - escape analysis / scope analysis
  - better points-to analysis
- Different mechanism for tagging
  - e.g., hash lookup
- Apply ideas to other languages and environments
  - e.g., absorb overhead on a multiprocessor
Example: Allocation and Checking Writes
    FN_PTR fp = &foo;
    char buf[12];
    char *p = &buf[0];
    do {
        *p = getchar();
    } while (*p++ != '\0');
    (*fp)();
- pointers: p unsafe, fp safe
[Figure: animation over the stack layout of p, buf, and fp, first tagging locations as tracked or untracked at allocation, then checking each write]
CCured WILD Pointers
    int i;
    int *p;
    int *q1, *q2;
    q1 = i;
    q1 = (int)&p;
    q2 = &i;
- q1 WILD => q2 WILD
Type-Flow Analysis: Example
    int k;
    float f;
    int *p;
    f = 2.3;
    p = (int *)&f;
    k = *p;
    print_int( k );
1. Points-to analysis: pt(p) = {f}
2. Assignment graph: f = VALUE_float; p = VALUE_valid-ptr; k = *p
3. Runtime types: T(f) = float; T(p) = valid-ptr; T(*p) = T(f) = float; T(k) = float
4. Type-safety levels: f (float) tracked; p (valid-ptr) safe; k (float) type-unsafe; *p (float) type-unsafe
May-be-uninitialized Analysis
May-be-uninitialized Analysis
May-be-uninitialized Analysis