School of Electrical Engineering and Computer Science, University of Central Florida. Anomaly-Based Bug Prediction, Isolation, and Validation: An Automated Approach for Software Debugging.



School of Electrical Engineering and Computer Science, University of Central Florida. Anomaly-Based Bug Prediction, Isolation, and Validation: An Automated Approach for Software Debugging. Martin Dimitrov, Huiyang Zhou.

Background: Terminology

Defect: an erroneous piece of code (a bug).
Infection: the defect is triggered, so the program state differs from what the programmer intended.
Failure: an observable error (crash, hang, wrong results) in program behavior.

University of Central Florida. Terminology is based on the book "Why Programs Fail" by Andreas Zeller.

Background: From Defects to Failures

[Figure: program execution over variable and input values moves from a sane state, through erroneous code, to an infected state, until an observer sees the failure. Figure from the book "Why Programs Fail" by A. Zeller.]
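The defect-to-failure chain can be made concrete with a small illustrative example (hypothetical code, not from the paper): an off-by-one defect infects the program state, and the infection only later surfaces as an observable failure.

```python
def average(values):
    total = 0
    count = 0
    # Defect: the loop bound is off by one, so the last element is skipped.
    for i in range(len(values) - 1):
        total += values[i]
        count += 1
    # Infection: total and count describe only a prefix of the input;
    # the program state differs from what the programmer intended.
    # Failure: for a 1-element input, count == 0 and the division crashes.
    return total / count
```

Note that for `average([2, 4, 6])` the infection produces a plausible-looking wrong result rather than a crash, which is exactly why failures can be observed far from the defect.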

Motivation

The typical process of software debugging involves:
1) Examine the point of program failure and reason backwards about the possible causes.
2) Create a hypothesis of what the root cause could be.
3) Modify the program to verify the hypothesis.
4) If the failure is still there, the search resumes.

Software debugging is tedious and time consuming! In this work we propose an approach to automate the debugging effort and pinpoint the root cause of the failure.

Presentation Outline

– Motivation
– Proposed approach
– Detecting anomalies (step 1)
– Isolating relevant anomalies (step 2)
– Validating anomalies (step 3)
– Experimental methodology
– Experimental results
– Conclusions

Proposed Approach

[Animation: the dynamic instruction stream (mov, cmp, jge, ...) advances over time, with the current instruction highlighted, until the failure point is reached.]

A program failure is observed: a crash, a hang, incorrect results, etc. The automated debugging process then starts. The output of our approach is a ranked list of instructions, the possible root causes of the failure.
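The overall flow can be summarized as a minimal sketch, assuming stand-in callables for the three steps (the names detect_anomalies, reaches_failure, and rerun_with_nop are hypothetical, not from the paper):

```python
# Hypothetical top-level driver for the three-step approach.
OUTCOME_RANK = {"success": 0, "no_crash": 1, "unknown": 2, "failure": 3}

def debug_failure(trace, detect_anomalies, reaches_failure, rerun_with_nop):
    # Step 1: every violated invariant is a root-cause hypothesis.
    anomalies = detect_anomalies(trace)
    # Step 2: keep only anomalies whose forward slice reaches the failure.
    relevant = [a for a in anomalies if reaches_failure(trace, a)]
    # Step 3: nullify each anomalous instruction, re-run, rank by outcome.
    outcomes = [(OUTCOME_RANK[rerun_with_nop(a)], a) for a in relevant]
    return [a for _, a in sorted(outcomes)]
```

The returned list puts the most likely root cause first, matching the ranked output described on the slide.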

Proposed Approach

Step 1: Detect anomalies in program execution.

[Animation: anomalies are flagged in the dynamic instruction stream as execution advances toward the failure point.]

Each anomaly constitutes a hypothesis for the root cause of the program failure.

Proposed Approach

Step 2: Isolate the relevant anomalies.

Create dynamic forward slices from the anomalies to the failure point. Discard anomalies that do not lead to the failure point.
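The slicing step can be sketched as follows, assuming a trace of (instruction id, defined values, used values) tuples; this representation is a simplification of the actual binary-level slices:

```python
# Sketch of step 2: a dynamic forward slice over data dependencies.
def forward_slice(trace, start_index):
    """Indices of instructions transitively data-dependent on trace[start_index]."""
    tainted = set(trace[start_index][1])  # values defined by the anomaly
    in_slice = {start_index}
    for i in range(start_index + 1, len(trace)):
        instr_id, defs, uses = trace[i]
        if tainted & set(uses):           # reads an infected value...
            in_slice.add(i)
            tainted |= set(defs)          # ...so its results are infected too
        elif tainted & set(defs):
            tainted -= set(defs)          # overwritten with a clean value
    return in_slice

def is_relevant(trace, anomaly_index, failure_index):
    # An anomaly is kept only if its slice reaches the failure point.
    return failure_index in forward_slice(trace, anomaly_index)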

Proposed Approach

Step 3: Validate the isolated anomalies.

Automatically "fix" each anomaly and observe whether the program still fails. If the failure disappears, we have high confidence that the root cause has been pinpointed.

Detecting Program Anomalies (Step 1)

When infected by a software bug, the program is likely to misbehave:
– Out-of-bounds addresses and values
– Unusual control paths
– Page faults
– Redundant computations, etc.

Anomaly detection: infer program specifications from passing runs and turn them into "soft" assertions.
– Learn program invariants during passing runs (e.g., variable i is always between 0 and 100).
– Flag violated invariants during the failing run (e.g., report an anomaly if variable i is 101).
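A minimal sketch of such invariant learning, assuming simple min/max range invariants per program point (the real detectors work at the instruction level and use bounded hardware structures):

```python
# Sketch of "soft assertions": learn value-range invariants per program
# point from passing runs, then flag violations in the failing run.
class RangeInvariants:
    def __init__(self):
        self.ranges = {}          # program point -> (min, max) seen so far

    def train(self, point, value):
        lo, hi = self.ranges.get(point, (value, value))
        self.ranges[point] = (min(lo, value), max(hi, value))

    def check(self, point, value):
        """Return True if (point, value) violates the learned invariant."""
        if point not in self.ranges:
            return True           # never observed in a passing run
        lo, hi = self.ranges[point]
        return not (lo <= value <= hi)
```

Training on passing runs where i stays in [0, 100] makes i == 101 in the failing run an anomaly, as in the slide's example.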

Detecting Program Anomalies

We use several anomaly detectors to monitor a large spectrum of program invariants and catch more bugs.

DIDUCE [S. Hangal et al., ICSE 2002]
– Instructions tend to produce values/addresses within a certain range (e.g., 0 <= i <= 100). Detect violations of these invariants.

AccMon [P. Zhou et al., MICRO]
– Only a few static instructions access a given memory location (load/store-set locality). Signal an anomaly if a memory access does not belong to the load/store set.

LoopCount
– Detect an abnormal number of loop iterations.

Detecting Program Anomalies

    void more_arrays ()
    {
      ...
      a_count += STORE_INCR;
      /* Copy the old arrays. */
      for (indx = 1; indx < old_count; indx++)
        arrays[indx] = old_ary[indx];
      /* Initialize the new elements. */
      for (; indx < v_count; indx++)   /* defect */
        arrays[indx] = NULL;           /* infection */
      /* Free the old elements. */
      if (old_count != 0) {
        free (old_ary);                /* crash */
      }
    }

[Figure: heap memory layout (0x80bfe6c through 0x80bfe80) showing alternating data and size words, with arrows marking the regions implied by a_count and v_count.]

Detecting Program Anomalies

On the defective store in more_arrays (arrays[indx] = NULL), the detectors report:
– DIDUCE: the address of the store instruction is out of its normal range.
– AccMon: the store instruction is not in the store set of this memory location.
– LoopCount: the loop iterates more times than usual.

The same detectors also fire elsewhere as false positives. Totals for this run: DIDUCE 24 anomalies, AccMon 68 anomalies, LoopCount 36 anomalies.
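The LoopCount detector can be sketched similarly, assuming trip counts are recorded per static loop during passing runs (the hardware version reuses the loop-branch predictor):

```python
# Sketch of the LoopCount detector: learn the maximum trip count of each
# loop across passing runs and flag loops that iterate more than usual.
class LoopCount:
    def __init__(self):
        self.max_trips = {}       # loop id -> max iterations seen in training

    def train(self, loop, trips):
        self.max_trips[loop] = max(self.max_trips.get(loop, 0), trips)

    def check(self, loop, trips):
        """True if the loop runs more iterations than in any passing run."""
        return trips > self.max_trips.get(loop, 0)
```

For the bc bug, the initialization loop bounded by v_count instead of a_count runs past its usual trip count and is flagged.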

Detecting Program Anomalies: Architectural Support

DIDUCE and AccMon capture invariants using limited-size cache structures, as proposed in previous work. LoopCount utilizes the existing loop-branch predictor to detect anomalies.

Advantages and disadvantages of hardware support:
+ Performance efficiency
+ Portability
+ Efficient ways to change or invalidate dynamic instructions
– Limited hardware resources may become a concern

Isolating Relevant Anomalies (Step 2)

Anomaly detectors alone are NOT effective for debugging:
– They may signal too many anomalies / false positives.
– There is a tradeoff between bug coverage and the number of false positives.

Our solution:
– Allow aggressive anomaly detection for maximum bug coverage.
– Automatically isolate only the relevant anomalies by constructing dynamic forward slices from each anomaly to the failure point.

Isolating Relevant Anomalies: Architectural Support

Add token(s) to each register and memory word. Detected anomalies set a token associated with the destination memory word or register. Tokens propagate based on data dependencies. When the program fails, examine the point of failure for tokens.
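Token propagation can be sketched in software, assuming a trace of (destination, source operands, anomaly flag) entries; a clean overwrite clears the token, mirroring how a re-defined register stops carrying taint:

```python
# Sketch of token propagation: a detected anomaly sets a token on its
# destination; tokens then flow along data dependencies, and the failure
# point is checked for a token.
def failure_is_token_tainted(trace, failure_dest):
    token = set()                         # registers/addresses carrying a token
    for dest, sources, is_anomaly in trace:
        if is_anomaly or (set(sources) & token):
            token.add(dest)               # anomaly output or tainted input
        else:
            token.discard(dest)           # clean overwrite clears the token
    return failure_dest in token
```

If the value consumed at the failure point carries a token, some earlier anomaly lies on a dependence chain to the failure.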

Isolating Relevant Anomalies: Architectural Support

[Figure: the more_arrays example alongside a memory/token table (addresses 0x80bfdf8 through 0x80bfe88). The anomalous stores set tokens on the words they write; the tokens reach the failing instruction, mov %ebx,0xc(%edx), at the failure point.]

Isolating Relevant Anomalies: Architectural Support

Problem: this requires many tokens for each memory location/register.

Solution:
– We leverage tagged architectures for information-flow tracking.
– We use only one token (1 bit), shared by all anomalies.
– We leverage delta debugging [A. Zeller, FSE 1999] to isolate the relevant anomalies automatically.

Number of initial anomalies: DIDUCE 24, AccMon 68, LoopCount 36.
Number of isolated anomalies: DIDUCE 3, AccMon 2, LoopCount 4.
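A simplified ddmin-style minimization over the anomaly set can look as follows; still_fails is a stand-in for re-running the program with only a subset of anomalies allowed to set the shared token (a sketch of the idea, not the exact algorithm used):

```python
# Simplified delta debugging over the list of anomalies: find a small
# subset that still explains (reaches) the failure.
def ddmin(anomalies, still_fails, granularity=2):
    while len(anomalies) >= 2:
        size = len(anomalies) // granularity or 1
        chunks = [anomalies[i:i + size] for i in range(0, len(anomalies), size)]
        reduced = False
        for i, chunk in enumerate(chunks):
            rest = [a for j, c in enumerate(chunks) if j != i for a in c]
            if still_fails(chunk):            # one chunk alone suffices
                anomalies, granularity = chunk, 2
                reduced = True
                break
            if rest and still_fails(rest):    # its complement suffices
                anomalies, granularity = rest, max(granularity - 1, 2)
                reduced = True
                break
        if not reduced:
            if granularity >= len(anomalies):
                break                         # cannot split any further
            granularity = min(granularity * 2, len(anomalies))
    return anomalies
```

Each probe re-runs the program, so the shared 1-bit token stays cheap while delta debugging recovers which anomalies actually matter.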

Delta-Debugging

[Figure illustrating the delta-debugging process.]

Validating Isolated Anomalies (Step 3)

Validate the remaining anomalies by applying a "fix" and observing whether the program failure disappears. Our "fix" is to nullify the anomalous instruction (turn it into a no-op). If the program succeeds, we have high confidence that we have found the root cause (or at least broken the infection chain).
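A toy illustration of the nullifying "fix", assuming a program modeled as a list of operations on a state dictionary (hypothetical code mirroring the bc scenario, where a corrupted size field later causes the crash):

```python
# Re-execute a small "program" with one operation replaced by a no-op and
# observe whether the failure disappears.
def run(program, nullify_index=None):
    state = {"arrays": [None] * 4, "size": 4}
    try:
        for i, op in enumerate(program):
            if i == nullify_index:
                continue                  # the "fix": turn the op into a no-op
            op(state)
        return "success"
    except IndexError:
        return "failure"

def corrupt_size(state):                  # anomalous op: infects the size field
    state["size"] = 10

def use_size(state):                      # later use of the corrupted size fails
    state["arrays"][state["size"] - 1] = 7

program = [corrupt_size, use_size]
```

Nullifying the anomalous operation (index 0) makes the failure disappear, which is exactly the evidence the validation step looks for.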

Validating Isolated Anomalies

[Figure: the more_arrays example re-run with the anomalous store nullified; the memory/token table (0x80bfe6c through 0x80bfe80) shows the data words zeroed while the size words remain intact.]

The "size" information is not corrupted and the program terminates successfully.

Validating Isolated Anomalies

Four possible outcomes of the validation step:
– Success: the program produces the expected output (e.g., "Hello World!").
– No crash: the program does not crash, but produces unexpected results (e.g., "Hello Martin").
– Unknown: the program fails in an unexpected manner (e.g., a hang).
– Failure: the program fails the same way as before (e.g., "Segmentation Fault").

Rank the isolated anomalies based on the outcome: success (highest), no crash, unknown, failure (lowest). In our running example the root cause is ranked #1.
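The outcome classification and ranking can be sketched as follows (status strings such as "ok" and "segfault" are illustrative, not from the paper):

```python
# Classify each re-execution outcome into the four categories above and
# rank the anomalies from best to worst outcome.
RANK = ["success", "no_crash", "unknown", "failure"]   # best to worst

def classify(outcome, expected_output, original_failure):
    status, output = outcome
    if status == "ok":
        return "success" if output == expected_output else "no_crash"
    return "failure" if status == original_failure else "unknown"

def rank_anomalies(outcomes, expected_output, original_failure):
    """outcomes: {anomaly: (status, output)} -> list of anomalies, best first."""
    return sorted(outcomes,
                  key=lambda a: RANK.index(
                      classify(outcomes[a], expected_output, original_failure)))
```

An anomaly whose nullification restores the expected output ends up at rank #1, matching the running example.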

Experimental Methodology

Implemented a working debugging tool using binary instrumentation (PIN). Evaluated applications from BugBench [S. Lu et al., Bugs 2005] and the gcc compiler.

– bc (storage.c:176): incorrect bounds checking causes a heap buffer overflow.
– gzip (gzip.c:1009): buffer overflow due to misuse of the library call strcpy.
– ncompress, 2 defects, 1,922 lines of code (compress42.c:886 and compress42.c:1740): buffer overflow due to misuse of the library call strcpy; incorrect bounds checking causes a stack buffer underflow.
– polymorph (polymorph.c:200): incorrect bounds checking causes a stack buffer overflow.
– man-1.5h1, 4,675 lines of code (man.c:998): incorrect loop exit condition causes a stack buffer overflow.
– gcc (combine.c:4013): an incorrect call to apply_distributive_law causes a loop in the RTL tree.

Experimental Results

[Table: for each application (bc, gzip, ncompress strcpy defect, ncompress stack underflow, polymorph, man-1.5h1, gcc), the number of initial anomalies, isolated anomalies, and validated anomalies for which the application succeeds, broken down by detector (D = DIDUCE, A = AccMon, L = LoopCount), together with the rank of the actual defect.]

Case Study: GCC

[Slides 24 to 26: the gcc case study, the defect in combine.c, and its fix.]

Experimental Results Compared to Failure-Inducing Chops

[Table: for each application (bc, gzip, ncompress strcpy defect, ncompress stack underflow, polymorph, man-1.5h1, gcc), the size of the failure-inducing chop versus the number of instructions reported by the proposed approach; for man-1.5h1 the chop is n/a while the proposed approach reports 1.]

Limitations

– No failure, no bug detection:
  – Un-triggered bugs.
  – Bugs that are triggered, but the output is correct.
– The approach targets bugs in sequential programs.

Conclusions

We present a novel automated approach to pinpoint the root causes of software failures:
– Detect anomalies during program execution.
– Isolate only the relevant anomalies.
– Validate the isolated anomalies by nullifying their execution results.

Our experimental results demonstrate that we accurately pinpoint the defect, even for large programs such as gcc.

Questions

The tool is available for download at: