Presentation is loading. Please wait.

Presentation is loading. Please wait.

Program Analysis for Security John Mitchell CS 155 Spring 2012.

Similar presentations


Presentation on theme: "Program Analysis for Security John Mitchell CS 155 Spring 2012."— Presentation transcript:

1 Program Analysis for Security John Mitchell CS 155 Spring 2012

2 Software bugs are serious problems Thanks: Isil and Thomas Dillig

3 Several different approaches Instrument code for testing – Heap memory: Purify – Perl tainting (information flow) – Java race condition checking Black-box testing – Fuzzing and penetration testing – Black-box web application security analysis Static code analysis – FindBugs, Fortify, Coverity, MS tools, … 3

4 Entry 1 23 4 Software Exit Behaviors Entry 1 2 4 Exit 124124134 124134 123124134 124123134 123123134 124124134... 124134 Manual testing only examines small subset of behaviors 4

5 Static analysis goals Bug finding – Identify code that the programmer wishes to modify or improve Correctness – Verify the absence of certain classes of errors Note: some fundamental limitations…

6 Outline General discussion of tools – Fundamental limitations – Basic method based on abstract states More details on one specific method – Property checkers from Engler et al., Coverity – Sample security-related results Quick tour of another tool – JavaScript confinement verifier – Proves isolation; not “just” bug finding Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, …

7 Program Analyzers Code ReportTypeLine 1mem leak324 2buffer oflow4,353,245 3sql injection23,212 4stack oflow86,923 5dang ptr8,491 ……… 10,502info leak10,921 Program Analyzer Spec potentially reports many warnings may emit false alarms analyze large code bases false alarm

8 Soundness, Completeness PropertyDefinition SoundnessIf the program contains an error, the analysis will report a warning. “Sound for reporting correctness” CompletenessIf the analysis reports an error, the program will contain an error. “Complete for reporting correctness”

9 CompleteIncomplete Sound Unsound Reports all errors Reports no false alarms Reports all errors May report false alarms UndecidableDecidable May not report all errors May report false alarms Decidable May not report all errors Reports no false alarms

10 Software... Behaviors Sound Over-approximation of Behaviors False Alarm Reported Error approximation is too coarse… …yields too many false alarms Modules

11 entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno Does this program ever crash?

12 entry X  0 Is Y = 0 ? X  X + 1 X  X - 1 Is Y = 0 ? Is X < 0 ? exit crash yes no yes no yesno infeasible path! … program will never crash Does this program ever crash?

13 entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno X = 0 X = 1 X = 2 X = 3 non-termination! … therefore, need to approximate Try analyzing without approximating…

14 X  X + 1 f d in d out d out = f(d in ) X = 0 X = 1 dataflow elements transfer function dataflow equation

15 X  X + 1 f1 d in1 d out1 = f 1 (d in1 ) Is Y = 0 ? f2 d out2 d out1 d in2 d out1 = d in2 d out2 = f 2 (d in2 ) X = 0 X = 1

16 d out1 = f 1 (d in1 ) d join = d out1 ⊔ d out2 d out2 = f 2 (d in2 ) f1f1 f2f2 f3f3 d out1 d in1 d in2 d out2 d join d in3 d out3 d join = d in3 d out3 = f 3 (d in3 ) least upper bound operator Example: union of possible values What is the space of dataflow elements,  ? What is the least upper bound operator, ⊔ ?

17 entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno X = 0 X = posX = T X = negX = 0X = T Try analyzing with “signs” approximation… terminates... … but reports false alarm … therefore, need more precision lost precision X = T

18 X = posX = 0X = neg X =  X  negX  pos true Y = 0 Y  0 false X = T X = posX = 0X = neg X =  signs lattice Boolean formula lattice refined signs lattice

19 entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno X = 0trueX = 0Y=0X = posY=0X = neg Y0Y0 X = posY=0 X = neg Y0Y0 X = posY=0 X = posY=0X = neg Y0Y0 X = 0 Y0Y0 Try analyzing with “path-sensitive signs” approximation… terminates... … no false alarm … soundly proved never crashes no precision loss refinement

20 Outline General discussion of tools – Fundamental limitations – Basic method based on abstract states More details on one specific method – Property checkers from Engler et al., Coverity – Sample security-related results Quick tour of another tool – JavaScript confinement verifier – Proves isolation; not “just” bug finding Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, …

21 Bugs to Detect Some examples Crash Causing Defects Null pointer dereference Use after free Double free Array indexing errors Mismatched array new/delete Potential stack overrun Potential heap overrun Return pointers to local variables Logically inconsistent code Uninitialized variables Invalid use of negative values Passing large parameters by value Underallocations of dynamic data Memory leaks File handle leaks Network resource leaks Unused values Unhandled return codes Use of invalid iterators Slide credit: Andy Chou 21

22 Example: Check for missing optional args Prototype for open() syscall: Typical mistake: Result: file has random permissions Check: Look for oflags == O_CREAT without mode argument int open(const char *path, int oflag, /* mode_t mode */...); fd = open(“file”, O_CREAT); 22

23 Example: Chroot protocol checker Goal: confine process to a “jail” on the filesystem −chroot() changes filesystem root for a process Problem −chroot() itself does not change current working directory chroot() chdir(“/”) open(“../file”,…) 23 Error if open before chdir

24 TOCTOU Race condition between time of check and use Not applicable to all programs check(“foo”) use(“foo”) 24

25 Tainting checkers 25

26 Example Code #include void say_hello(char * name, int size) { printf("Enter your name: "); fgets(name, size, stdin); printf("Hello %s.\n", name); } int main(int argc, char *argv[]) { if (argc != 2) { printf("Error, must provide an input buffer size.\n"); exit(-1); } int size = atoi(argv[1]); char * name = (char*)malloc(size); if (name) { say_hello(name, size); free(name); } else { printf("Failed to allocate %d bytes.\n", size); } 26

27 atoi main exitfreemalloc printffgets say_hello Callgraph 27

28 atoi main exitfreemalloc printffgets say_hello Reverse Topological Sort 1 2 3 456 7 8 Idea: analyze function before you analyze caller 28

29 atoi main exitfreemalloc printffgets say_hello Apply Library Models 1 2 3 456 7 8 Tool has built-in summaries of library function behavior 29

30 atoi main exitfreemalloc printffgets say_hello Bottom Up Analysis 1 2 3 456 7 8 Analyze function using known properties of functions it calls 30

31 atoi main exitfreemalloc printffgets say_hello Bottom Up Analysis 1 2 3 456 7 8 Analyze function using known properties of functions it calls 31

32 atoi main exitfreemalloc printffgets say_hello Bottom Up Analysis 1 2 3 456 7 8 Finish analysis by analyzing all functions in the program 32

33 Finding Local Bugs #define SIZE 8 void set_a_b(char * a, char * b) { char * buf[SIZE]; if (a) { b = new char[5]; } else { if (a && b) { buf[SIZE] = a; return; } else { delete [] b; } *b = ‘x’; } *a = *b; } 33

34 char * buf[8]; if (a) b = new char [5];if (a && b) buf[8] = a;delete [] b; *b = ‘x’; END *a = *b; a!a a && b !(a && b) Control Flow Graph Represent logical structure of code in graph form 34

35 char * buf[8]; if (a) b = new char [5];if (a && b) buf[8] = a;delete [] b; *b = ‘x’; END *a = *b; a!a a && b !(a && b) Path Traversal Conceptually: Analyze each path through control graph separately Actually Perform some checking computation once per node; combine paths at merge nodes Conceptually Actually 35

36 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun See how three checkers are run for this path Defined by a state diagram, with state transitions and error states Checker Assign initial state to each program var State at program point depends on state at previous point, program actions Emit error if error state reached Run Checker 36

37 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” 37

38 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” 38

39 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” Already knew a was null 39

40 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” “b is deleted” 40

41 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” “b is deleted” “b dereferenced!” 41

42 char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” “b is deleted” “b dereferenced!” No more errors reported for b 42

43 False Positives What is a bug? Something the user will fix. Many sources of false positives −False paths −Idioms −Execution environment assumptions −Killpaths −Conditional compilation −“third party code” −Analysis imprecision −… 43

44 char * buf[8]; if (a) b = new char [5];if (a && b) buf[8] = a;delete [] b; *b = ‘x’; END *a = *b; a!a a && b !(a && b) A False Path 44

45 char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning Integer Range Disequality Branch 45

46 char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning “a in [0,0]” “a == 0 is true” Integer Range Disequality Branch 46

47 char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning “a in [0,0]” “a == 0 is true” “a != 0” Integer Range Disequality Branch 47

48 char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning “a in [0,0]” “a == 0 is true” “a != 0” Impossible Integer Range Disequality Branch 48

49 Environment Assumptions Should the return value of malloc() be checked? int *p = malloc(sizeof(int)); *p = 42; OS Kernel: Crash machine. File server: Pause filesystem. Spreadsheet: Lose unsaved changes. Game: Annoy user. Library: ? Medical device: malloc?! Web application: 200ms downtime IP Phone: Annoy user. 49

50 Statistical Analysis Assume the code is usually right int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; 3/4 deref 1/4 deref 50

51 Application to Security Bugs Stanford research project −Ken Ashcraft and Dawson Engler, Using Programmer-Written Compiler Extensions to Catch Security Holes, IEEE Security and Privacy 2002 −Used modified compiler to find over 100 security holes in Linux and BSD −http://www.stanford.edu/~engler/ Benefit −Capture recommended practices, known to experts, in tool available to all 51

52 Sanitize integers before use Linux: 125 errors, 24 false; BSD: 12 errors, 4 false array[v] while(i < v) … v.clean Use(v) v.tainted Syscall param Network packet copyin(&v, p, len) any<= v <= any memcpy(p, q, v) copyin(p,q,v) copyout(p,q,v) ERROR Warn when unchecked integers from untrusted sources reach trusting sinks

53 Example security holes /* 2.4.9/drivers/isdn/act2000/capi.c:actcapi_dispatch */ isdn_ctrl cmd;... while ((skb = skb_dequeue(&card->rcvq))) { msg = skb->data;... memcpy(cmd.parm.setup.phone, msg->msg.connect_ind.addr.num, msg->msg.connect_ind.addr.len - 1); Remote exploit, no checks 53

54 Example security holes /* 2.4.5/drivers/char/drm/i810_dma.c */ if(copy_from_user(&d, arg, sizeof(arg))) return –EFAULT; if(d.idx > dma->buf_count) return –EINVAL; buf = dma->buflist[d.idx]; Copy_from_user(buf_priv->virtual, d.address, d.used); Missed lower-bound check: 54

55 User-pointer inference Problem: which are the user pointers? −Hard to determine by dataflow analysis −Easy to tell if kernel believes pointer is from user! Belief inference −“*p” implies safe kernel pointer −“copyin(p)/copyout(p)” implies dangerous user ptr −Error: pointer p has both beliefs. Implementation: 2 pass checker inter-procedural: compute all tainted pointers local pass to check that they are not dereferenced 55

56 Results for BSD and Linux All bugs released to implementers; most serious fixed Gain control of system 18 15 3 3 Corrupt memory 43 17 2 2 Read arbitrary memory 19 14 7 7 Denial of service 17 5 0 0 Minor 28 1 0 0 Total 125 52 12 12 Linux BSD Violation Bug Fixed Bug Fixed 56

57 Outline General discussion of tools – Fundamental limitations – Basic method based on abstract states More details on one specific method – Property checkers from Engler et al., Coverity – Sample security-related results Quick tour of another tool – JavaScript confinement verifier – Proves isolation; not “just” bug finding Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, …

58 Two Problems Sandboxing Untrusted JavaScript58 Sandbox Design: ensure that sandboxed code obtains access to ANY protected resource ONLY via the API API Confinement: verify that sandboxed code cannot use the API to obtain direct access to a security-critical resource Evolution of Standardized JavaScript ECMA-262 3 rd edition (ES3) - wa s around when I started this work in 2008 ECMA-262 5 th edition (ES5) - released in Dec 2009 – has a strict mode (ES5-strict)

59 Sandboxing Untrusted JavaScript59 Solution to the Sandboxing and API Confinement problems for ES5-strict Solution to the Sandboxing and API Confinement problems for ES5-strict Plan: 1.Subset SES of ES5-strict 2.Sandboxing technique for untrusted SES code 3.Confinement analysis technique for SES APIs 4.Application: Yahoo! ADSafe

60 The language ES5-strict ES5-strict = ES3 with the following restrictions Sandboxing Untrusted JavaScript60 Restriction (relative to ES3)Rationale No delete on variable names No prototypes for scope objects No with No this coercion Safe built-in functions No.caller,.callee on arguments object No.caller,.arguments on function objects No arguments and formal parameters aliasing Lexical scoping No ambient access to Global object Closure-based encapsulation

61 Scope-bounded eval Sandboxing Untrusted JavaScript61 Example: eval(“function(){return x}”, x) Explicitly list free variables of s Run-time restriction: Free(Parse(s)) ⊆ {x 1,…, x n } Allows an upper bound on side-effects of executing s eval (s, x 1,…, x n ) Implemented by de-sugaring to (eval noFree (“function(x 1,…, x n ){“+ s +“}”))(x 1,…, x n )

62 Next few slides Sandboxing Untrusted JavaScript62 API Confinement: Verify that sandboxed code cannot use the API to obtain direct access to a security-critical resource 1.Subset SES of ES5-strict 2.Sandboxing technique for untrusted SES code 3.Confinement analysis technique for SES APIs 4.Application: Yahoo! ADSafe

63 API Confinement is Complex JavaScript API Confinement63 Resources, DOM f1f1 r1r1 r4r4 r3r3 r2r2 Untrusted JS Invoke r2r2 Return r 2 Access r 2 r3r3 r4r4 Side-effect r 4 u1u1 Repeat Precision-Efficiency tradeoff API Critical, e.g., window.location

64 Key Properties of API Implementations Code is part of the trusted computing base Small in size, relative to the application Written in a disciplined manner Developers have an incentive in keeping the code simple Sandboxing Untrusted JavaScript64 Insights : Conservative and scalable static analysis techniques can do well Can soundly establish API Confinement Can warn developers away from using complex coding patterns

65 Challenges & Techniques Hurdles: Forall quantification on untrusted code Analysis of eval( s, x 1,…, x n ) in general Sandboxing Untrusted JavaScript65 Techniques: Flow-Insensitive and Context-Insensitive Points-to analysis Abstract eval(s, x 1,…, x n ) by the set of all statements that can be written using free variables {x 1,…, x n } Confine(t, critical) : For all untrusted terms s in SES light,

66 Verifying Confine(t, critical) Sandboxing Untrusted JavaScript66 Trusted code t eval with free vars ”test”,“api” Environment (Built-ins) + + Datalog Solver (least fixed point) Inference Rules (SES semantics) Stack( “test”, l) ∧ Critical(l) ? NOT CONFINED CONFINED true false Abstraction Our decision procedure and implementation

67 Express Analysis in Datalog (Whaley et. al.) Program t l 1 :var y = {}; l 2 :var x = y; l 3 :x.f = y; Sandboxing Untrusted JavaScript67 Facts(t) Stack(y, l 1 ) Assign(x, y) Store(x, “f”, y) abstract Abstract programs as Datalog facts Abstract the semantics of SES as Datalog inference rules Stack(x, l) :- Assign(x, y), Stack(y, l) Heap(l, f, m) :- Store(x, f, y), Stack(x, l), Stack(y, m) Execution of program t is abstracted by the least-fixed-point of Facts(t) under the inference rules

68 Complete set of Predicates Sandboxing Untrusted JavaScript68 Abstracting termsAbstracting Heaps & Stacks Assign(x, y)Throw(l, x)Heap(l, x, m)Stack(x, l) Load(x, y, f)Catch(l, x)Prototype(l, m)FuncType(l) Store(x, f, y)TP(l, x)ObjType(l)ArrayType(l) Formal(l, i, x)FormalRet(l, x)NotBuiltin(l)Critical(l) Actual(x, i, z, y, l)Instance(l, x) Global(x)Annotation(x, y) Sufficient to model implicit type conversions, reflection, exceptions

69 Next few slides Sandboxing Untrusted JavaScript69 1.Subset SES of ES5-strict 2.Sandboxing technique for untrusted SES code 3.Confinement analysis technique for SES APIs 4.Application: Yahoo! ADSafe Implementation: We implemented the decision procedure in the form of an automated tool ENCAP built on top of Datalog engine: bddbddb available online at: http://code.google.com/p/es-lab/Encap

70 Application: Yahoo! ADSafe JavaScript API Confinement70 Analysis (1 st attempt) annotated the API implementation (2000 LOC) desugared it to SES and ran ENCAP obtained NOT CONFINED  culprits: ADSAFE.go and ADSAFE.lib Sandbox: JSLint Filter API: ADSAFE object ADSafe: Mechanism for safely embedding ads This was an actual exploit Analysis (2 nd attempt) fixed this bug ran ENCAP again obtained CONFINED Result: ADSafe DOM API safely confines DOM objects under the SES threat model, assuming the annotations hold

71 Summary General discussion of tools −Fundamental limitations −Basic method based on abstract states More details on one specific method −Property checkers from Engler et al., Coverity −Sample security-related results Quick tour of JavaScript tool −JavaScript confinement verifier −Proves isolation; not “just” bug finding Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, A. Taly, …


Download ppt "Program Analysis for Security John Mitchell CS 155 Spring 2012."

Similar presentations


Ads by Google