Program Analysis for Security Original slides created by Prof. John Mitchell.

Slides:



Advertisements
Similar presentations
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
Advertisements

Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
SPLINT STATIC CHECKING TOOL Sripriya Subramanian 10/29/2002.
Program Analysis for Security
Program Analysis for Security John Mitchell CS 155 Spring 2012.
Using Programmer-Written Compiler Extensions to Catch Security Holes Authors: Ken Ashcraft and Dawson Engler Presented by : Hong Chen CS590F 2/7/2007.
Program analysis Mooly Sagiv html://
Program analysis Mooly Sagiv html://
Program Analysis for Security Suhabe Bugrara Stanford University.
Verifying the Safety of User Pointer Dereferences Suhabe Bugrara Stanford University Joint work with Alex Aiken.
Overview of program analysis Mooly Sagiv html://
1 Pointers, Dynamic Data, and Reference Types Review on Pointers Reference Variables Dynamic Memory Allocation –The new operator –The delete operator –Dynamic.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
1 Procedural Concept The main program coordinates calls to procedures and hands over appropriate data as parameters.
Security Exploiting Overflows. Introduction r See the following link for more info: operating-systems-and-applications-in-
University of Calgary – CPSC 441. C PROGRAM  Collection of functions  One function “main()” is called by the operating system as the starting function.
Software Engineering Prof. Dr. Bertrand Meyer March 2007 – June 2007 Chair of Software Engineering Static program checking and verification Slides: Based.
Computer Security and Penetration Testing
Computer Science and Software Engineering University of Wisconsin - Platteville 2. Pointer Yan Shi CS/SE2630 Lecture Notes.
Static Program Analyses of DSP Software Systems Ramakrishnan Venkitaraman and Gopal Gupta.
1 C++ Classes and Data Structures Jeffrey S. Childs Chapter 4 Pointers and Dynamic Arrays Jeffrey S. Childs Clarion University of PA © 2008, Prentice Hall.
Dynamic Memory Allocation. Domain A subset of the total domain name space. A domain represents a level of the hierarchy in the Domain Name Space, and.
Overflow Examples 01/13/2012. ACKNOWLEDGEMENTS These slides where compiled from the Malware and Software Vulnerabilities class taught by Dr Cliff Zou.
Topic 3: C Basics CSE 30: Computer Organization and Systems Programming Winter 2011 Prof. Ryan Kastner Dept. of Computer Science and Engineering University.
Program Analysis for Security John Mitchell CS 155 Spring 2014.
Computer Organization and Design Pointers, Arrays and Strings in C Montek Singh Sep 18, 2015 Lab 5 supplement.
Buffer overflow and stack smashing attacks Principles of application software security.
Introduction to Software Analysis CS Why Take This Course? Learn methods to improve software quality – reliability, security, performance, etc.
Slides by Kent Seamons and Tim van der Horst Last Updated: Nov 11, 2011.
1 Test Coverage Coverage can be based on: –source code –object code –model –control flow graph –(extended) finite state machines –data flow graph –requirements.
Announcements You will receive your scores back for Assignment 2 this week. You will have an opportunity to correct your code and resubmit it for partial.
CSE 333 – SECTION 2 Memory Management. Questions, Comments, Concerns Do you have any? Exercises going ok? Lectures make sense? Homework 1 – START EARLY!
C Tutorial - Pointers CS 537 – Introduction to Operating Systems.
DYNAMIC MEMORY ALLOCATION. Disadvantages of ARRAYS MEMORY ALLOCATION OF ARRAY IS STATIC: Less resource utilization. For example: If the maximum elements.
Software Testing and QA Theory and Practice (Chapter 5: Data Flow Testing) © Naik & Tripathy 1 Software Testing and Quality Assurance Theory and Practice.
Optimistic Hybrid Analysis
Dynamic Allocation in C
Content Coverity Static Analysis Use cases of Coverity Examples
Secure Programming Dr. X
Object Lifetime and Pointers
Sabrina Wilkes-Morris CSCE 548 Student Presentation
Program Analysis for Security
CMSC 341 Lecture 2 – Dynamic Memory and Pointers (Review)
Data Types In Text: Chapter 6.
Stack and Heap Memory Stack resident variables include:
Computer Organization and Design Pointers, Arrays and Strings in C
Control Flow Testing Handouts
Secure Programming Dr. X
A bit of C programming Lecture 3 Uli Raich.
Handouts Software Testing and Quality Assurance Theory and Practice Chapter 4 Control Flow Testing
Graph Coverage for Specifications CS 4501 / 6501 Software Testing
Secure Software Development: Theory and Practice
Outline of the Chapter Basic Idea Outline of Control Flow Testing
Program Analysis for Security
Checking Memory Management
Dynamic Memory Allocation
High Coverage Detection of Input-Related Security Faults
CS 465 Buffer Overflow Slides by Kent Seamons and Tim van der Horst
Chapter 15 Pointers, Dynamic Data, and Reference Types
Verifying the Safety of User Pointer Dereferences
Automated Tools for System and Application Security
Processes in Unix, Linux, and Windows
Pointers, Dynamic Data, and Reference Types
Memory Allocation CS 217.
Outline Defining and using Pointers Operations on pointers
Introduction to Static Analyzer
CS5123 Software Validation and Quality Assurance
Foundations and Definitions
Pointers, Dynamic Data, and Reference Types
Software Testing and QA Theory and Practice (Chapter 5: Data Flow Testing) © Naik & Tripathy 1 Software Testing and Quality Assurance Theory and Practice.
Presentation transcript:

Program Analysis for Security Original slides created by Prof. John Mitchell

[PopPhoto.com Feb 10] Facebook missed a single security check…

App stores

How can you tell whether software you – Develop – Buy is safe to install and run?

Two options Static analysis – Inspect code or run automated method to find errors or gain confidence about their absence Dynamic analysis – Run code, possibly under instrumented conditions, to see if there are likely problems

Program Analyzers Code ReportTypeLine 1mem leak324 2buffer oflow4,353,245 3sql injection23,212 4stack oflow86,923 5dang ptr8,491 ……… 10,502info leak10,921 Program Analyzer Spec

Entry Software Exit Behaviors Entry Exit Manual testing only examines small subset of behaviors 7

Static vs Dynamic Analysis Static – Can consider all possible inputs – Find bugs and vulnerabilities – Can prove absence of bugs, in some cases Dynamic – Need to choose sample test input – Can find bugs vulnerabilities – Cannot prove their absence

Cost of Fixing a Defect Credit: Andy Chou, Coverity

Cost of security or data privacy vulnerability?

Dynamic analysis Instrument code for testing – Heap memory: Purify – Perl tainting (information flow) – Java race condition checking Black-box testing – Fuzzing and penetration testing – Black-box web application security analysis 11

Static Analysis Long research history Decade of commercial products – FindBugs, Fortify, Coverity, MS tools, …

Static Analysis: Outline General discussion of static analysis tools – Goals and limitations – Approach based on abstract states More about one specific approach – Property checkers from Engler et al., Coverity – Sample security checkers results Static analysis for of Android apps Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, J. Franklin, A. Aiken, …

Static analysis goals Bug finding – Identify code that the programmer wishes to modify or improve Correctness – Verify the absence of certain classes of errors

Soundness, Completeness PropertyDefinition Soundness “Sound for reporting correctness” Analysis says no bugs  No bugs or equivalently There is a bug  Analysis finds a bug Completeness “Complete for reporting correctness” No bugs  Analysis says no bugs Recall: A  B is equivalent to (  B)  (  A)

CompleteIncomplete Sound Unsound Reports all errors Reports no false alarms Reports all errors May report false alarms UndecidableDecidable May not report all errors May report false alarms Decidable May not report all errors Reports no false alarms

Sound Program Analyzer Code ReportTypeLine 1mem leak324 2buffer oflow4,353,245 3sql injection23,212 4stack oflow86,923 5dang ptr8,491 ……… 10,502info leak10,921 Program Analyzer Spec Sound: may report many warnings May emit false alarms Analyze large code bases false alarm

Software... Behaviors Sound Over-approximation of Behaviors False Alarm Reported Error approximation is too coarse… …yields too many false alarms Modules

Outline General discussion of tools – Goals and limitations – Approach based on abstract states More about one specific approach – Property checkers from Engler et al., Coverity – Sample security-related results Static analysis for Android malware – … Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, J. Franklin, A. Aiken, …

entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno Does this program ever crash?

entry X  0 Is Y = 0 ? X  X + 1 X  X - 1 Is Y = 0 ? Is X < 0 ? exit crash yes no yes no yesno infeasible path! … program will never crash Does this program ever crash?

entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno X = 0 X = 1 X = 2 X = 3 non-termination! … therefore, need to approximate Try analyzing without approximating…

X  X + 1 f d in d out d out = f(d in ) X = 0 X = 1 dataflow elements transfer function dataflow equation

X  X + 1 f1 d in1 d out1 = f 1 (d in1 ) Is Y = 0 ? f2 d out2 d out1 d in2 d out1 = d in2 d out2 = f 2 (d in2 ) X = 0 X = 1

d out1 = f 1 (d in1 ) d join = d out1 ⊔ d out2 d out2 = f 2 (d in2 ) f1f1 f2f2 f3f3 d out1 d in1 d in2 d out2 d join d in3 d out3 d join = d in3 d out3 = f 3 (d in3 ) least upper bound operator Example: union of possible values What is the space of dataflow elements,  ? What is the least upper bound operator, ⊔ ?

entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno X = 0 X = posX = T X = negX = 0X = T Try analyzing with “signs” approximation… terminates... … but reports false alarm … therefore, need more precision lost precision X = T

X = posX = 0X = neg X =  X  negX  pos true Y = 0 Y  0 false X = T X = posX = 0X = neg X =  signs lattice Boolean formula lattice refined signs lattice

entry X  0 Is Y = 0 ? X  X + 1X  X - 1 Is Y = 0 ? Is X < 0 ?exit crash yes no yes no yesno X = 0trueX = 0Y=0X = posY=0X = neg Y0Y0 X = posY=0 X = neg Y0Y0 X = posY=0 X = posY=0X = neg Y0Y0 X = 0 Y0Y0 Try analyzing with “path-sensitive signs” approximation… terminates... … no false alarm … soundly proved never crashes no precision loss refinement

Outline General discussion of tools – Goals and limitations – Approach based on abstract states More about one specific approach – Property checkers from Engler et al., Coverity – Sample security-related results Static analysis for Android malware – … Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, J. Franklin, A. Aiken, …

Unsound Program Analyzer Code ReportTypeLine 1mem leak324 2buffer oflow4,353,245 3sql injection23,212 4stack oflow86,923 5dang ptr8,491 ……… Program Analyzer Spec may emit false alarms analyze large code bases false alarm Not sound: may miss some bugs

Demo Coverity video: Observations – Code analysis integrated into development workflow – Program context important: analysis involves sequence of function calls, surrounding statements – This is a sales video: no discussion of false alarms

Bugs to Detect Some examples Crash Causing Defects Null pointer dereference Use after free Double free Array indexing errors Mismatched array new/delete Potential stack overrun Potential heap overrun Return pointers to local variables Logically inconsistent code Uninitialized variables Invalid use of negative values Passing large parameters by value Underallocations of dynamic data Memory leaks File handle leaks Network resource leaks Unused values Unhandled return codes Use of invalid iterators Slide credit: Andy Chou 33

Example: Check for missing optional args Prototype for open() syscall: Typical mistake: Result: file has random permissions Check: Look for oflags == O_CREAT without mode argument int open(const char *path, int oflag, /* mode_t mode */...); fd = open(“file”, O_CREAT); 34

Example: Chroot protocol checker Goal: confine process to a “jail” on the filesystem −chroot() changes filesystem root for a process Problem −chroot() itself does not change current working directory chroot() chdir(“/”) open(“../file”,…) 35 Error if open before chdir

TOCTOU Race condition between time of check and use Not applicable to all programs check(“foo”) use(“foo”) 36

Tainting checkers 37

Example code with function def, calls #include void say_hello(char * name, int size) { printf("Enter your name: "); fgets(name, size, stdin); printf("Hello %s.\n", name); } int main(int argc, char *argv[]) { if (argc != 2) { printf("Error, must provide an input buffer size.\n"); exit(-1); } int size = atoi(argv[1]); char * name = (char*)malloc(size); if (name) { say_hello(name, size); free(name); } else { printf("Failed to allocate %d bytes.\n", size); } 38

atoi main exitfreemalloc printffgets say_hello Callgraph 39

atoi main exitfreemalloc printffgets say_hello Reverse Topological Sort Idea: analyze function before you analyze caller 40

atoi main exitfreemalloc printffgets say_hello Apply Library Models Tool has built-in summaries of library function behavior 41

atoi main exitfreemalloc printffgets say_hello Bottom Up Analysis Analyze function using known properties of functions it calls 42

atoi main exitfreemalloc printffgets say_hello Bottom Up Analysis Analyze function using known properties of functions it calls 43

atoi main exitfreemalloc printffgets say_hello Bottom Up Analysis Finish analysis by analyzing all functions in the program 44

Finding Local Bugs #define SIZE 8 void set_a_b(char * a, char * b) { char * buf[SIZE]; if (a) { b = new char[5]; } else { if (a && b) { buf[SIZE] = a; return; } else { delete [] b; } *b = ‘x’; } *a = *b; } 45

char * buf[8]; if (a) b = new char [5];if (a && b) buf[8] = a;delete [] b; *b = ‘x’; END *a = *b; a!a a && b !(a && b) Control Flow Graph Represent logical structure of code in graph form 46

char * buf[8]; if (a) b = new char [5];if (a && b) buf[8] = a;delete [] b; *b = ‘x’; END *a = *b; a!a a && b !(a && b) Path Traversal Conceptually: Analyze each path through control graph separately Actually Perform some checking computation once per node; combine paths at merge nodes Conceptually Actually 47

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun See how three checkers are run for this path Defined by a state diagram, with state transitions and error states Checker Assign initial state to each program var State at program point depends on state at previous point, program actions Emit error if error state reached Run Checker 48

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” 49

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” 50

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” Already knew a was null 51

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” “b is deleted” 52

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” “b is deleted” “b dereferenced!” 53

char * buf[8]; if (a) if (a && b) delete [] b; *b = ‘x’; END *a = *b; !a !(a && b) Apply Checking Null pointersUse after freeArray overrun “buf is 8 bytes” “a is null” “b is deleted” “b dereferenced!” No more errors reported for b 54

False Positives What is a bug? Something the user will fix. Many sources of false positives −False paths −Idioms −Execution environment assumptions −Killpaths −Conditional compilation −“third party code” −Analysis imprecision −… 55

char * buf[8]; if (a) b = new char [5];if (a && b) buf[8] = a;delete [] b; *b = ‘x’; END *a = *b; a!a a && b !(a && b) A False Path 56

char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning Integer Range Disequality Branch 57

char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning “a in [0,0]” “a == 0 is true” Integer Range Disequality Branch 58

char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning “a in [0,0]” “a == 0 is true” “a != 0” Integer Range Disequality Branch 59

char * buf[8]; if (a) if (a && b) buf[8] = a; END !a a && b False Path Pruning “a in [0,0]” “a == 0 is true” “a != 0” Impossible Integer Range Disequality Branch 60

Environment Assumptions Should the return value of malloc() be checked? int *p = malloc(sizeof(int)); *p = 42; OS Kernel: Crash machine. File server: Pause filesystem. Spreadsheet: Lose unsaved changes. Game: Annoy user. Library: ? Medical device: malloc?! Web application: 200ms downtime IP Phone: Annoy user. 61

Statistical Analysis Assume the code is usually right int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; int *p = malloc(sizeof(int)); *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; int *p = malloc(sizeof(int)); if(p) *p = 42; 3/4 deref 1/4 deref 62

Example security holes /* 2.4.9/drivers/isdn/act2000/capi.c:actcapi_dispatch */ isdn_ctrl cmd;... while ((skb = skb_dequeue(&card->rcvq))) { msg = skb->data;... memcpy(cmd.parm.setup.phone, msg->msg.connect_ind.addr.num, msg->msg.connect_ind.addr.len - 1); Remote exploit, no checks 63

Example security holes /* 2.4.5/drivers/char/drm/i810_dma.c */ if(copy_from_user(&d, arg, sizeof(arg))) return –EFAULT; if(d.idx > dma->buf_count) return –EINVAL; buf = dma->buflist[d.idx]; Copy_from_user(buf_priv->virtual, d.address, d.used); Missed lower-bound check: 64

Summary Static vs dynamic analyzers General properties of static analyzers – Fundamental limitations – Basic method based on abstract states More details on one specific method – Property checkers from Engler et al., Coverity – Sample security-related results Static analysis for Android malware – STAMP method, sample studies Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, J. Franklin, A. Aiken, …