Alleviating False Alarm Problem of Static Buffer Overflow Analysis Youil Kim 2008-12-12 1.

Slides:



Advertisements
Similar presentations
Demand-driven inference of loop invariants in a theorem prover
Advertisements

Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Jeremy Morse, Mikhail Ramalho, Lucas Cordeiro, Denis Nicole, Bernd Fischer ESBMC 1.22 (Competition Contribution)
Finding bugs: Analysis Techniques & Tools Symbolic Execution & Constraint Solving CS161 Computer Security Cho, Chia Yuan.
Delta Debugging and Model Checkers for fault localization
Masahiro Fujita Yoshihisa Kojima University of Tokyo May 2, 2008
Greta YorshEran YahavMartin Vechev IBM Research. { ……………… …… …………………. ……………………. ………………………… } P1() Challenge: Correct and Efficient Synchronization { ……………………………
Greta YorshEran YahavMartin Vechev IBM Research. { ……………… …… …………………. ……………………. ………………………… } T1() Challenge: Correct and Efficient Synchronization { ……………………………
Satisfiability Modulo Theories (An introduction)
A Program Transformation For Faster Goal-Directed Search Akash Lal, Shaz Qadeer Microsoft Research.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
Making Choices in C if/else statement logical operators break and continue statements switch statement the conditional operator.
1 Symbolic Execution for Model Checking and Testing Corina Păsăreanu (Kestrel) Joint work with Sarfraz Khurshid (MIT) and Willem Visser (RIACS)
1/20 Generalized Symbolic Execution for Model Checking and Testing Charngki PSWLAB Generalized Symbolic Execution for Model Checking and Testing.
ISBN Chapter 3 Describing Syntax and Semantics.
1 Refining Buffer Overflow Detection via Demand-Driven Path-Sensitive Analysis Wei Le and Mary Lou Soffa University of Virginia sotesty.cs.virginia.edu.
Scalable Error Detection using Boolean Satisfiability 1 Yichen Xie and Alex Aiken Stanford University.
Program analysis Mooly Sagiv html://
1 Predicate Abstraction of ANSI-C Programs using SAT Edmund Clarke Daniel Kroening Natalia Sharygina Karen Yorav (modified by Zaher Andraus for presentation.
Program analysis Mooly Sagiv html://
Validating High-Level Synthesis Sudipta Kundu, Sorin Lerner, Rajesh Gupta Department of Computer Science and Engineering, University of California, San.
Predicate Abstraction for Software and Hardware Verification Himanshu Jain Model checking seminar April 22, 2005.
Finding the Weakest Characterization of Erroneous Inputs Dzintars Avots and Benjamin Livshits.
Overview of program analysis Mooly Sagiv html://
Describing Syntax and Semantics
Automatic Generation of Parallel OpenGL Programs Robert Hero CMPS 203 December 2, 2004.
1 Loop-Extended Symbolic Execution on Binary Programs Pongsin Poosankam ‡* Prateek Saxena * Stephen McCamant * Dawn Song * ‡ Carnegie Mellon University.
Formal Verification of SpecC Programs using Predicate Abstraction Himanshu Jain Daniel Kroening Edmund Clarke Carnegie Mellon University.
Testing Static Analysis Tools using Exploitable Buffer Overflows from Open Source Code Zitser, Lippmann & Leek Presented by: José Troche.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
Statically Detecting Likely Buffer Overflow Vulnerabilities David Larochelle David Evans University of Virginia Department of Computer Science Supported.
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
CIS Computer Programming Logic
Introduction Overview Static analysis Memory analysis Kernel integrity checking Implementation and evaluation Limitations and future work Conclusions.
Computer Science Detecting Memory Access Errors via Illegal Write Monitoring Ongoing Research by Emre Can Sezer.
Aditya V. Nori, Sriram K. Rajamani Microsoft Research India.
Inferring Specifications to Detect Errors in Code Mana Taghdiri Presented by: Robert Seater MIT Computer Science & AI Lab.
ISBN Chapter 3 Describing Semantics -Attribute Grammars -Dynamic Semantics.
CS 363 Comparative Programming Languages Semantics.
Major objective of this course is: Design and analysis of modern algorithms Different variants Accuracy Efficiency Comparing efficiencies Motivation thinking.
Predicate Abstraction of ANSI-C Programs Using SAT By Edmund Clarke, Daniel Kroening, Natalia Sharygina, Karen Yorav Presented by Yunho Kim Provable Software.
Chapter 3 Part II Describing Syntax and Semantics.
Learning Symbolic Interfaces of Software Components Zvonimir Rakamarić.
4061 Session 26 (4/19). Today Network security Sockets: building a server.
Programming in C++ Dale/Weems/Headington Chapter 1 Overview of Programming and Problem Solving.
Localization and Register Sharing for Predicate Abstraction Himanshu Jain Franjo Ivančić Aarti Gupta Malay Ganai.
School of Computer Science & Information Technology G6DICP - Lecture 4 Variables, data types & decision making.
Symbolic and Concolic Execution of Programs Information Security, CS 526 Omar Chowdhury 10/7/2015Information Security, CS 5261.
1 Control Flow Analysis Topic today Representation and Analysis Paper (Sections 1, 2) For next class: Read Representation and Analysis Paper (Section 3)
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Introduction to Software Analysis CS Why Take This Course? Learn methods to improve software quality – reliability, security, performance, etc.
© 2006 Carnegie Mellon University Introduction to CBMC: Part 1 Software Engineering Institute Carnegie Mellon University Pittsburgh, PA Arie Gurfinkel,
/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.
© 2006 Carnegie Mellon University Introduction to CBMC: Part 1 Software Engineering Institute Carnegie Mellon University Pittsburgh, PA Arie Gurfinkel,
Finding bugs with a constraint solver daniel jackson. mandana vaziri mit laboratory for computer science issta 2000.
Review A program is… a set of instructions that tell a computer what to do. Programs can also be called… software. Hardware refers to… the physical components.
Working with Java.
SS 2017 Software Verification Bounded Model Checking, Outlook
Program Analysis via Satisfiability Modulo Path Programs
Airac int *c = (int *)malloc(sizeof(int)*10); C 프로그램의 메모리접근 오류 자동 검출
Solving Linear Arithmetic with SAT-based MC
Chapter 3: Understanding C# Language Fundamentals
Moonzoo Kim CS Dept. KAIST
High Coverage Detection of Input-Related Security Faults
Control Structures (Structured Programming) for controlling the procedural aspects of programming CS1110 – Kaminski.
Wei Le and Mary Lou Soffa University of Virginia
Control Structures (Structured Programming) for controlling the procedural aspects of programming CS1110 – Kaminski.
Predicate Abstraction
SOFTWARE ENGINEERING INSTITUTE
Presentation transcript:

Alleviating False Alarm Problem of Static Buffer Overflow Analysis Youil Kim

Background Morris worm (1998) –Infected over 6,000 major Unix machines. –Buffer overflows in UNIX fingerd Code-Red virus (2001) –Infected over 359,000 computers in 14 hours. –Buffer overflows in Microsoft IIS Buffer overflows account for 1/3 of the severe remotely exploitable vulnerabilities

Our Goal A Scalable and Precise Static Buffer Overflow Analyzer for Large C Programs – 두 마리 토끼를 어떻게 잡을까 ?

Our Approach Precise Analysis Buffer Overflow Alarms C Code Reduced Alarms Scalable Analysis

Our Approach Less precise analysis, first. –Unification-based points-to analysis –Interval analysis Precise analysis on small areas around potential alarms –Symbolic execution using an SMT solver

Current Development Status Scalable static buffer overflow analyzer –Analyzes five GNU tools in five minutes. Precise analysis using an SMT solver. –Reduces 72% of false alarms of the buffer overflow test cases from three open source applications (bind, sendmail, wu-ftpd)

참고 : SMT Solver Satisfiability Modulo Theories –SMT generalizes boolean satisfiability by adding equality reasoning, arithmetic, and other useful theories. Z3 from Microsoft –Used in several program analysis, verification, test case generation projects. 7

참고 : SMT Formula (x > 0) and ((x + y = 6)) and ((x + y = 2) or (x + 2y – z > 4)) X0 and (X1 or X2) and (X3 or X4) SAT Formula SMT Formula 8

Raccoon, Our Base Analyzer

Raccoon Overview Analysis 1: Points-to Analysis Analysis 2: Interval Analysis Analysis 3: Buffer Analysis False Alarm Filter 10 Buffer Overflow Alarms C Code Reduced Alarms Buffer Analysis Example: p = {offset = [0, 1], length = [3, 3], size = [5, 5]} 0 p

Raccoon’s Performance SoftwareSLOC# CIL LinesTime# Alarms# Writes% Alarms tar-1.139,27919, bison ,85429, ,31642 sed-4.083,3447, gzip-1.2.4a5,80911, grep ,23414, Total36,52083, ,4142,68553 Experiments on 2.33 GHZ quad-core XEON with 8 GB RAM Statically proved 47% of writes are safe

참고 : Airac5’s Performance Software# LinesTime# Alarms# Accesses% Alarms tar ,25824, ,6303 bison ,90720, ,1641 sed-4.086,05351, gzip-1.2.4a7,32718, grep ,29733, Total68,842148, ,2412 Experiments on Pentium GHZ with 8 GB RAM Fewer alarms, but takes 600 times longer to analyze

Filtering False Alarms of Buffer Overflow Analysis Using an SMT Solver

False Alarm Filter Analysis 1: Points-to Analysis Analysis 2: Interval Analysis Analysis 3: Buffer Analysis False Alarm Filter 14 Buffer Overflow Alarms C Code Reduced Alarms

A Concrete Example $ raccoon2 rmt.cil.c Buffer Overflow Detection: rmt.c:138: *(string + counter) Size : [64, 64] Offset: [0, 64] rmt.c:311: *(p) Size : [64, 64] Offset: [-oo, 62] Total 0 array write(s). Total 0 array alarm(s). Total 4 pointer write(s). Total 2 pointer alarm(s) bytes p

A Concrete Example 302 count = lseek (tape, count, whence); 303 if (count < 0) 304 goto ioerror; p = count_string + sizeof count_string; 309 *--p = '\0'; 310 do 311 *--p = '0' + (int) (count % 10); 312 while ((count /= 10) != 0); rmt.c in GNU tar p = {offset=[63,63], size=[64,64]}, count = [0, +oo] p = {offset=[-oo,63], size=[64,64]}, count = [0, +oo]

The Filtering Algorithm 17 Extract a Program Snippet Build an Initial Context SMT Translation Symbolic Execution It’s a false alarm. I don’t know. unsatisfiablesatisfiable or unknown loop execution Choose an Alarm Statement No alarm Exit

Extract a Program Snippet Extract a backward program slice with respect to the target alarm statement –Up to the safe point –Within the procedure boundary –Note: We use Raccoon results to imitate function calls. 18

A Program Snippet count = lseek (tape, count, whence); 303 if (count < 0) 304 goto ioerror; p = count_string + sizeof count_string; 309 *--p = '\0'; 310 do 311 *--p = '0' + (int) (count % 10); 312 while ((count /= 10) != 0); 19 Backward program slicing to the safely accessible point. p = {offset=[63,63], size=[64,64]}, count = [0, +oo]

A Program Snippet in SSA Form while (1) { p_1 = p_0 - 1; (*p_1) = (char)(48 + (int)(count_0 % 10L)); count_1 = count_0 / 10L; if (! (count_1 != 0L)) { break; } 20 Internal format is in static single assignment (SSA) form. p = {offset=[63,63], size=[64,64]}, count = [0, +oo]

An Initial Context ;; Initial Context Formula (assert (p.offset_0 = 63)) (assert (p.size_0 = 64)) (assert (count_0 >= 0)) (assert (count_0 < )) Need initial context for live-in variables of the program snippet Constructed from the results of interval analysis of Raccoon

Symbolic Execution of Loops let isFalseAlarm(ctxt, i) = if isSat(ctxt && l-path[i] && l-alarm[i]) then DONTKNOW else if isSat(ctxt && r-path[i] && r-alarm[i]) then DONTKNOW else if isSat(ctxt && r-path[i]) then isFalseAlarm(ctxt && r-path[i], i + 1) else YES 22 l-path r-path l-pathr-path 1 st iteration...

Run It $ time./rmt-ocaml The loop is unrolled 18 times. It is a false alarm. real 0m0.034s user 0m0.032s sys 0m0.002s $ 23

Implementation Raccoon2 is implemented in OCaml. Yices OCaml API: C API + SWIG –Yices provides (incomplete) C API. –SWIG connects the C API with OCaml code. Currently, manual translation. 24

Experimental Results Bad Code # Writes# Alarms# False Alarms# RemovedRaccoon TimeSMT Time BIND s54.95s BIND s245.30s BIND s5.52s BIND s0.06s SM s- SM s SM sX SM s∞ SM s- SM sX SM s0.88s + FTP s0.02s FTP s- FTP s- 25

Research Plan

Research Plan: Filtering Algorithm Automatize the false alarm filtering –Implement missing parts –Experiment with large GNU software Optimize the false alarm filtering algorithm –Exploit yices_inconsistent() 27

Research Plan: Raccoon Improve Raccoon analyzer –For example, structure field sensitivity Alarm grouping via abstract state refinements –Extend the basic idea to pointer accesses and C string library calls –Visualize the relationship between alarms 28

Alarm Grouping: A Motivating Example for (counter = 0; counter < SPARSES_IN_OLDGNU_HEADER; counter++) 202 { 203 /* Compare to 0, or use !(int)..., for Pyramid's dumb compiler. */ 204 if (current_header->oldgnu_header.sp[counter].numbytes == 0) 205 break; sparsearray[counter].offset = 208 OFF_FROM_OCT(current_header->oldgnu_header.sp[counter].offset); 209 sparsearray[counter].numbytes = 210 SIZE_FROM_OCT(current_header->oldgnu_header.sp[counter].numbytes); 211 } compare.c in GNU tar

Thank You

참고 : SMT Formulae (1/2) ;; Loop Body Formula – 1st (p.offset_1 = p.offset_0 – 1) and (count_1 = count_0 / 10) 31 ;; Leaving Path Context Formula – 1st (count_1 = 0) and false ;; Remaining Path Condition Formula – 1st (true) ;; Remaining Path Alarm Condition Formula – 1st (p.offset_1 <= 62) and ((p.offset_1 < 0) or (p.size_0 <= p.offset_1))

참고 : SMT Formulae (2/2) 32 ;; Remaining Path Condition Formula – 2nd (not (count_2 = 0)) ;; Remaining Path Alarm Condition Formula – 2nd (p.offset_2 <= 62) and ((p.offset_2 < 0) or (p.size_0 <= p.offset_2)) ;; Loop Body Formula – 2nd (p.offset_2 = p.offset_1 – 1) and (count_2 = count_1 / 10) ;; Leaving Path Context Formula – 2nd (count_2 = 0) and false

참고 : A Sequence of Loops 517 p = buf; 519 strcpy(temp, "HEADER JUNK:"); 523 while (*temp != '\0') 524 *p++ = *temp++; 534 comp_size = dn_comp(exp_dn, comp_dn, 200,...); 539 for(i=0; i<comp_size; i++) 540 *p++ = *comp_dn++; 544 PUTSHORT(30, p); /* type = T_NXT = 30 */ 545 p += 2; nxt-ok.c from BIND-1 buffer overflow models 33

참고 : Related Work Forward-backward analysis [ASTREE] Counter-Example Guided Abstraction Refinements [SLAM, BLAST] Statistical alarm ranking [Coverity, Airac] 34