1 Test Selection for Result Inspection via Mining Predicate Rules Wujie Zheng

2 Outline
Introduction
- Test selection
- Mining specification as test oracles
Mining Specification from Passing Tests
Our Approach: Mining Specification from Unverified Tests
- Potential benefits
- Difficulties of applying previous approaches directly
- Mining predicate rules offline
Experiments
Future Work

3 Test Selection
Software testing
- Software testing is the most commonly used and effective technique for improving the quality of software programs.
Testing involves three main steps
- Generating a set of test inputs: there have been various practical approaches to automatic test-input generation
- Executing those inputs on the program under test: does not require intensive human labor
- Checking whether the test executions reveal faults: remains a largely manual task
Test selection helps to reduce the cost of test-result inspection by selecting a small subset of tests that are likely to reveal faults.

4 Mining specification as test oracles
A specification can describe
- The temporal properties of programs, such as fopen() should be followed by fclose()
- The algebraic properties of programs, such as get(S, arg).state = S (a state-preserving function)
- The operational properties of programs, such as idx >= 0 for a method get(array, idx) (a precondition)
Such a specification can be mined from execution traces and then be used as a test oracle for finding failing tests.
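
A minimal sketch (not from the slides) of how a mined operational property can act as a test oracle: the precondition idx >= 0 from the example above is turned into a runtime check. The get function and the assertion message are illustrative only.

```python
# Minimal sketch (not from the slides): a mined operational property, the
# precondition idx >= 0 for get(array, idx), turned into a runtime check
# that can serve as a test oracle. Function and message are illustrative.
def get(array, idx):
    assert idx >= 0, "mined precondition violated: idx >= 0"
    return array[idx]

print(get([10, 20, 30], 1))      # 20: satisfies the mined precondition
# get([10, 20, 30], -1)          # would trip the assertion and flag the test
```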

5 Mining Specification from Passing Tests
Dynamic invariant detection: Daikon [Ernst01]
- Given a set of passing tests, mine the implicit rules over the variables dynamically
The templates
- Invariants over any single variable: x = a, …
- Invariants over two numeric variables: y = ax + b, …
- Invariants over three numeric variables: z = ax + by + c, …
- …
Candidate rules are simply checked against the traces, and any candidate that is violated is discarded immediately.
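
Below is a minimal sketch, not Daikon itself, of the falsification idea described above: candidate invariants instantiated from a template are discarded as soon as any passing trace violates them. Only the simplest template (x = a) is shown; the variable names and traces are made up for illustration.

```python
# Minimal sketch (assumption: a simplified, Daikon-like falsification loop;
# real Daikon instantiates many more templates). Candidates are discarded
# as soon as a passing trace violates them.
def mine_constant_invariants(traces):
    """traces: list of dicts mapping variable name -> observed value."""
    candidates = dict(traces[0])          # start with x = <first observed value>
    for trace in traces[1:]:
        for var, val in list(candidates.items()):
            if trace.get(var) != val:     # violated once -> discard immediately
                del candidates[var]
    return candidates                     # surviving invariants of the form x = a

passing_traces = [{"x": 0, "size": 4}, {"x": 0, "size": 7}]
print(mine_constant_invariants(passing_traces))   # {'x': 0}
```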

6 Mining Specification from Passing Tests
Test selection based on dynamic invariant detection (Jov [Xie03])
- Start from some passing tests and mine the operational model using Daikon
- Generate a new test using Jtest (a commercial Java unit testing tool) and mine a new operational model
- If the new operational model is different from the original one, add the new test to the test suite
- Repeat the process (a rough sketch of this loop follows)
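
The sketch below makes the Jov-style loop concrete. The mine_model and generate_test helpers are trivial stand-ins for Daikon and Jtest, chosen only to make the code runnable; they are not the real tools' APIs.

```python
import random

# Rough sketch of the Jov-style loop described above. mine_model() and
# generate_test() are trivial stand-ins for Daikon and Jtest.
def mine_model(tests):
    # Stand-in "operational model": the set of input signs observed so far.
    return frozenset("neg" if t < 0 else "nonneg" for t in tests)

def generate_test():
    return random.randint(-5, 5)          # stand-in for Jtest's test generation

def select_tests(passing_tests, rounds=10):
    selected, model = list(passing_tests), mine_model(passing_tests)
    for _ in range(rounds):
        t = generate_test()
        new_model = mine_model(selected + [t])
        if new_model != model:            # the operational model changed
            selected.append(t)            # keep the test for result inspection
            model = new_model
    return selected

print(select_tests([1, 2, 3]))
```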

7 Mining Specification from Passing Tests
Test selection based on dynamic invariant detection (Eclat [Pacheco05])
- Similar to the work of Jov
- Further distinguishes illegal inputs from fault-revealing inputs
A violation of the operational model does not necessarily imply a fault, since the previously seen behavior may be incomplete.
They classify violations into three types (a small classifier sketch follows)
- Normal operation
  - No entry or exit violations
  - Some entry violations, no exit violations
- Fault-revealing
  - No entry violations, some exit violations
- Illegal input
  - Some entry and some exit violations
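
A small sketch of the violation classification above, assuming we already know how many entry (precondition) and exit (postcondition) violations a test produced; the function and labels are illustrative, not Eclat's actual interface.

```python
# Minimal sketch of Eclat-style violation classification, given the counts of
# entry (precondition) and exit (postcondition) violations observed for a test.
def classify(entry_violations, exit_violations):
    if exit_violations == 0:
        return "normal operation"    # no exit violations, with or without entry violations
    if entry_violations == 0:
        return "fault-revealing"     # clean entry, but the exit model was violated
    return "illegal input"           # a bad input propagated to a bad exit state

print(classify(0, 0))   # normal operation
print(classify(0, 2))   # fault-revealing
print(classify(1, 1))   # illegal input
```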

8 Mining Specification from Passing Tests
Test selection based on dynamic invariant detection (DIDUCE [Hangal02])
- Based on another type of invariant, focusing on variable values
- Mine invariants from long-running program executions
- Issue a warning whenever an invariant is violated
- Also relax an invariant after it is violated
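
The sketch below illustrates the warn-then-relax behavior described above with a simple min/max range invariant. DIDUCE's actual invariant representation is different (it tracks which bits of a value have changed), so this is only a simplification of the idea.

```python
# Simplified sketch of a DIDUCE-style "relaxing" invariant: warn on each value
# outside the range seen so far, then widen the range to include it.
class RangeInvariant:
    def __init__(self, first_value):
        self.lo = self.hi = first_value

    def observe(self, value):
        if value < self.lo or value > self.hi:
            print(f"warning: {value} outside observed range [{self.lo}, {self.hi}]")
            self.lo = min(self.lo, value)   # relax the invariant after the violation
            self.hi = max(self.hi, value)

inv = RangeInvariant(5)
for v in [6, 4, 5, 42]:   # 6, 4, and 42 each trigger a warning and widen the range
    inv.observe(v)
```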

9 Mining Specification from Passing Tests
The drawback
- We may have only a small set of passing tests, or even no passing tests, so the observed behavior is incomplete
- A specification mined from a small set of tests can be noisy
- Many violations of such a specification are false positives

10 Our Approach: Mining Specification from Unverified Tests
We propose to mine common operational models, which need not hold in all observed traces, from a (potentially large) set of unverified tests by mining predicate rules.
The mined common operational models can then be used to select suspicious tests.

11 Our Approach: Mining Specification from Unverified Tests
Potential benefits
- Training a behavior model on a large set of unverified tests may capture the typical software behaviors without bias, hence reducing the noise
- It is relatively easy to collect execution data for a large set of tests without verifying them

12 Our Approach: Mining Specification from Unverified Tests
Difficulties of applying previous approaches directly
- A common behavior model is not always true over the whole set of tests, so candidate models that are violated cannot simply be discarded
- Alternatively, we could generate and collect all the potential models at runtime and evaluate them after running all the tests. However, such an approach can incur high runtime overhead if Daikon-like operational models, which are numerous, are used.

13 Our Approach: Mining Specification from Unverified Tests
Mining predicate rules offline
- Collect values of simple predicates at runtime
- Generate and evaluate predicate rules as potential operational models after running all the tests
  - A predicate rule is an implication relationship between predicates
  - Even when a rule's confidence is not equal to 1, the higher the rule's confidence, the more suspicious its violations are as indicators of a failure (a confidence-computation sketch follows)
- We then select tests that violate the mined predicate rules for result inspection
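
A sketch of the rule-confidence computation, assuming the standard association-rule definition: conf(X => y) = (number of tests in which both X and y hold) / (number of tests in which X holds). The predicate names and toy data are made up.

```python
# Sketch of computing a predicate rule's confidence with the standard
# association-rule definition; predicate names and data are illustrative.
def confidence(tests, x, y):
    with_x = [t for t in tests if t[x]]
    if not with_x:
        return 0.0
    return sum(1 for t in with_x if t[y]) / len(with_x)

# Each test is summarized as: predicate name -> whether it was ever observed true.
tests = [
    {"ret_negative": True,  "branch_true": True},
    {"ret_negative": True,  "branch_true": True},
    {"ret_negative": True,  "branch_true": False},   # the lone violating test
    {"ret_negative": False, "branch_true": False},
]
print(confidence(tests, "ret_negative", "branch_true"))   # 0.666...: a common rule
```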

14 Our Approach: Mining Specification from Unverified Tests
(Example slide; the original table of tests and predicate values is not recoverable from the transcript.) The example illustrates that a failure is not likely to be predicted by the violation of a single predicate. A mined predicate rule corresponds to a precondition that the program should satisfy in passing tests; it is similar to, and weaker than, the real operational model, so its violation should also lead to a violation of the real operational model and indicate a failure, such as Test 5 in the example.

15 Our Approach: Mining Specification from Unverified Tests
Mining predicate rules
- We use cbi-tools [Liblit05] to instrument the programs
  - Branches: at each conditional (branch), two predicates are tracked, indicating whether the true or false branch was ever taken
  - Returns: at each scalar-returning function call site, three predicates are tracked, indicating whether the returned value was ever negative, zero, or positive
- For each predicate y, we mine the rules X => y and X => !y, where X is a conjunction of other predicates
  - To reduce complexity, our current implementation mines only the rules where X is a single predicate
  - We plan to use advanced data mining techniques such as association rule mining to mine more general rules in future work
- There may be a large number of predicate rules
  - For each predicate y, we select only the most confident rule X => y and the most confident rule X => !y (see the sketch after this slide)
  - Alternatively, we can set a threshold min_conf and select all rules with confidence higher than min_conf
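
The sketch below mines single-antecedent rules as described above, keeping, for each consequent y, only the most confident rules X => y and X => !y. The per-test predicate summaries are assumed to come from the instrumentation; the data here is a toy example.

```python
# Sketch of mining single-antecedent predicate rules; each test is summarized
# as a dict: predicate name -> whether it was ever observed true.
from itertools import product

def mine_rules(tests, min_support=1):
    preds = sorted(tests[0].keys())
    best = {}                                      # (y, want) -> (confidence, antecedent)
    for x, y in product(preds, preds):
        if x == y:
            continue
        with_x = [t for t in tests if t[x]]
        if len(with_x) < min_support:
            continue
        for want in (True, False):                 # rules x => y and x => !y
            conf = sum(1 for t in with_x if t[y] == want) / len(with_x)
            if conf > best.get((y, want), (0.0, None))[0]:
                best[(y, want)] = (conf, x)
    return {(x, y, want): conf for (y, want), (conf, x) in best.items()}

tests = [
    {"a": True,  "b": True},
    {"a": True,  "b": True},
    {"a": True,  "b": False},     # violates the otherwise-common rule a => b
    {"a": False, "b": False},
]
for (x, y, want), conf in mine_rules(tests).items():
    print(f"{x} => {'' if want else '!'}{y}  (confidence {conf:.2f})")
```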

16 Our Approach: Mining Specification from Unverified Tests
Test selection
- We select a small subset of the tests such that every selected predicate rule is violated at least once (a sketch of this greedy procedure follows)
  - Initially, the set of selected tests is empty
  - We sort the selected predicate rules in descending order of confidence
  - From top to bottom, if a rule is not violated by any of the previously selected tests, we select the first test that violates the rule
  - In this greedy way, all the selected rules are violated by the selected tests
  - We also rank the selected tests in the order of selection
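
A sketch of the greedy selection procedure above. The rule list, the confidences, and the violates relation are toy stand-ins for the rules mined earlier.

```python
# Sketch of the greedy test-selection step: cover every mined rule with the
# first test that violates it, processing rules by descending confidence.
def greedy_select(tests, rules, violates):
    rules = sorted(rules, key=lambda r: r[1], reverse=True)   # most confident first
    selected = []                                             # also the ranking order
    for rule_id, _conf in rules:
        if any(violates(t, rule_id) for t in selected):
            continue                                          # rule already covered
        for t in tests:
            if violates(t, rule_id):
                selected.append(t)                            # first violating test
                break
    return selected

# Toy example: test -> set of rules it violates.
violations = {"t1": {"r1"}, "t2": {"r2", "r3"}, "t3": {"r3"}}
rules = [("r1", 0.9), ("r2", 0.8), ("r3", 0.7)]
print(greedy_select(list(violations), rules, lambda t, r: r in violations[t]))
# -> ['t1', 't2']  (t2 already covers r3, so t3 is not selected)
```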

17 Experiments
Preliminary results
- Subject 1: the Siemens suite
  - 130 faulty versions of 7 programs that range in size from 170 to 540 lines
  - On average, only 1.53% (45/2945) of the original tests need to be checked
  - Despite the small size of the selected subset, the selected tests still reveal 74.6% (97/130) of the faults, while a random sampling technique reveals only 45.4% (59/130) of the faults

18 Experiments
Preliminary results
- Subject 2: the grep program
  - A Unix utility that searches a file for a pattern; it includes 13,358 lines of C code
  - There are 3 buggy versions that fail 3, 4, and 132 times on the 470 tests, respectively
  - Our approach selects 82, 86, and 89 tests for these versions, which reveal all 3 faults; in addition, for each version, at least one failing test is ranked in the top 20
  - We also randomly select 20 tests for each version; across 5 rounds of random selection, the selected tests never reveal the fault in either of the first two versions but always reveal the fault in the third version

19 Future work
- Combine our approach with automatic test-generation tools
- Study the characteristics of the mined common operational models and compare them with invariants mined by Daikon
- Explore mining more general rules containing several predicates

20 References
[Pacheco05] C. Pacheco and M. D. Ernst. Eclat: Automatic generation and classification of test inputs. In ECOOP, pages 504–527, 2005.
[Xie03] T. Xie and D. Notkin. Tool-assisted unit test selection based on operational violations. In ASE, pages 40–48, 2003.
[Hangal02] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. In ICSE, pages 291–301, 2002.

21 Thank you!