Learning based Software Testing --Marriage between “learning” and “testing” Dan Hao Peking University 2015.5.28.

Presentation transcript:

Learning based Software Testing --Marriage between “learning” and “testing” Dan Hao Peking University

About Me
Associate Professor, Peking University
Education Background
– Peking University, Ph.D.
– Harbin Institute of Technology, B.S.
Research Interest: Software Testing
Homepage: sei.pku.edu.cn/~haod

Software contains bugs

Software Testing
Software Everywhere

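Simplified Software-Testing Process
[Diagram: a test case consists of a test input and an expected output (the test oracle). The test input is executed on the SUT to produce the actual output. The actual output is compared with the expected output, and the comparison ends in either "revealed faults" or "no revealed faults".]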
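The process above can be sketched in a few lines of Java. This is only an illustration: the class name `TestingProcessSketch`, the method `passes`, and the toy absolute-value SUT are assumptions made for this sketch and do not come from the slides.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical illustration of the simplified testing process; all names are assumptions.
class TestingProcessSketch {

    // Execute the SUT on the test input, then compare the actual output
    // with the expected output (the oracle). Returns true when no fault is revealed.
    static boolean passes(IntUnaryOperator sut, int testInput, int expectedOutput) {
        int actualOutput = sut.applyAsInt(testInput);
        return actualOutput == expectedOutput;
    }

    public static void main(String[] args) {
        IntUnaryOperator correctAbs = x -> x < 0 ? -x : x;
        IntUnaryOperator buggyAbs = x -> x; // seeded fault: the sign is ignored

        System.out.println(passes(correctAbs, -3, 3)); // true: no revealed fault
        System.out.println(passes(buggyAbs, -3, 3));   // false: revealed fault
    }
}
```

The sketch makes the dependence on the oracle explicit: without `expectedOutput` there is nothing to compare against, which is the test oracle problem discussed later in the talk.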

Important Problems in Software Testing
– Test Oracle Generation
– Test Input Generation
– Test Repair
– Test Selection/Reduction/Prioritization
– ……

Important Problems in Software Testing (continued)
How to solve these problems?
Program Analysis, Machine Learning, Search Algorithms, …

Learning: An Outsider's Perspective
Machine Learning
– learn rules from historical data
– apply rules to new data
Search-based Algorithms
– find the optimum quickly in the solution space
……
Together: Learning Algorithms

Marriage between Learning & Testing
Learning Algorithms

Marriage between Learning & Testing
Learning Algorithms:
– search-based test generation
– search-based test prioritization/reduction/selection
– automated program repair
– machine-learning-based bug prediction
– ……

Our Work in the Marriage between Learning & Testing
Learning Algorithms:
– Test oracle generation [ASE14]
– Obsolete test identification [ECOOP12]
– Test effectiveness measurement [ISSTA16]
– ……

In: Test Oracle Generation [ASE14]
Metamorphic relation inference + PSO
What is the test oracle problem?
Test oracles are widely recognized as a difficult problem!
[ASE14] Jie Zhang, Junjie Chen, Dan Hao, Yingfei Xiong, Bing Xie, Lu Zhang, Hong Mei, Search-based Inference of Polynomial Metamorphic Relations, ASE 2014.

Metamorphic Relation: A Specific Type of Oracle
MR: a particular change to the input "changes" the output
$R_I(I_1, I_2) \Rightarrow R_O(O_1, O_2)$
[Diagram: input $I_1$ yields output $O_1$; input $I_2$ yields output $O_2$.]

Statistics on Metamorphic Relations
[Figure: statistics on metamorphic relations; labels shown are 60%, 50%, 1-MR, 2-MR.]
$R_I(I_1, I_2) \Rightarrow R_O(O_1, O_2)$

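1-MR and 2-MR
$R_I(I_1, I_2) \Rightarrow R_O(O_1, O_2)$
$R_I$: $I_2 = \alpha I_1 + \beta$
$R_O$: $c_1 O_1 + c_2 O_2 + c_3 = 0$
With $O_1 = P(I_1)$ and $O_2 = P(\alpha I_1 + \beta)$:
1-MR: $c_1 P(I_1) + c_2 P(\alpha I_1 + \beta) + c_3 = 0$
2-MR: $c_1 P^2(I_1) + c_2 P^2(\alpha I_1 + \beta) + c_3 P(I_1) P(\alpha I_1 + \beta) + c_4 P(I_1) + c_5 P(\alpha I_1 + \beta) + c_6 = 0$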
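To make the 1-MR form concrete, here is a minimal Java sketch that checks a candidate relation on random inputs. The class `LinearMrCheck`, the method `holds`, the input range, the tolerance, and the choice of `Math.sin` as the program P are all assumptions for illustration. With $\alpha = -1$, $\beta = 0$, $c_1 = c_2 = 1$, $c_3 = 0$ the relation reads $\sin(x) + \sin(-x) = 0$.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: checks a candidate 1-MR
//   R_I: I2 = alpha * I1 + beta
//   R_O: c1 * P(I1) + c2 * P(I2) + c3 = 0
// on random inputs. Names, input range and tolerance are assumptions.
class LinearMrCheck {

    static boolean holds(DoubleUnaryOperator p, double alpha, double beta,
                         double c1, double c2, double c3,
                         int samples, long seed) {
        Random rnd = new Random(seed);
        for (int i = 0; i < samples; i++) {
            double i1 = rnd.nextDouble() * 20.0 - 10.0; // source input in [-10, 10)
            double i2 = alpha * i1 + beta;              // follow-up input
            double residual = c1 * p.applyAsDouble(i1) + c2 * p.applyAsDouble(i2) + c3;
            if (Math.abs(residual) > 1e-9) {
                return false;                           // relation violated on this input
            }
        }
        return true;                                    // no violation observed
    }

    public static void main(String[] args) {
        // sin(x) + sin(-x) = 0 is expected to hold.
        System.out.println(holds(Math::sin, -1.0, 0.0, 1.0, 1.0, 0.0, 100, 42L));
        // cos(x) + cos(-x) = 2 cos(x), so the same candidate is rejected for cos.
        System.out.println(holds(Math::cos, -1.0, 0.0, 1.0, 1.0, 0.0, 100, 42L));
    }
}
```

A relation that survives such a check is only likely to be valid, since it was tested on a finite sample rather than proved.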

PSO → MR Inference
PSO (Particle Swarm Optimization) simulates the foraging behavior of birds.
PSO algorithm:
– Each candidate solution is called a particle
– Each particle has a velocity and a location, which keep changing
– A fitness function evaluates how close a particle's location is to an optimal location
Why PSO:
– Very effective for searching in continuous space
– Can lead a particle to escape local optima
– Simple, with few parameters to adjust
1-MR: $c_1 P(I_1) + c_2 P(\alpha I_1 + \beta) + c_3 = 0$
2-MR: $c_1 P^2(I_1) + c_2 P^2(\alpha I_1 + \beta) + c_3 P(I_1) P(\alpha I_1 + \beta) + c_4 P(I_1) + c_5 P(\alpha I_1 + \beta) + c_6 = 0$

PSO based MR Inference
a. In the beginning, the N particles are assigned locations L (initial values of the parameters).
b. The N particles keep updating their velocities and locations.
c. When the termination threshold is reached, the search ends.
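Steps (a) to (c) can be sketched as a minimal, generic PSO in Java. This is not the implementation from [ASE14]: the class `PsoSketch`, the inertia and acceleration constants (commonly used defaults), the initial range, the fixed iteration budget used as the termination criterion, and the toy quadratic fitness function are all assumptions. For MR inference, a particle's location would hold the parameters ($\alpha$, $\beta$, $c_1$, …) and the fitness would measure how far the relation is from holding on sampled inputs.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Minimal PSO sketch (minimization). Constants are common defaults and an
// assumption of this sketch, not the settings used in the ASE14 work.
class PsoSketch {

    static double[] minimize(ToDoubleFunction<double[]> fitness, int dim,
                             int particles, int iterations, long seed) {
        final double w = 0.729, c1 = 1.49445, c2 = 1.49445; // inertia, cognitive, social
        Random rnd = new Random(seed);
        double[][] loc = new double[particles][dim];
        double[][] vel = new double[particles][dim];        // velocities start at zero
        double[][] pBest = new double[particles][];
        double[] pBestFit = new double[particles];
        double[] gBest = null;
        double gBestFit = Double.POSITIVE_INFINITY;

        // (a) assign initial locations, here uniformly in [-5, 5)
        for (int i = 0; i < particles; i++) {
            for (int d = 0; d < dim; d++) {
                loc[i][d] = rnd.nextDouble() * 10.0 - 5.0;
            }
            pBest[i] = loc[i].clone();
            pBestFit[i] = fitness.applyAsDouble(loc[i]);
            if (pBestFit[i] < gBestFit) {
                gBestFit = pBestFit[i];
                gBest = loc[i].clone();
            }
        }

        // (b) keep updating velocities and locations
        // (c) stop when the iteration budget is exhausted
        for (int it = 0; it < iterations; it++) {
            for (int i = 0; i < particles; i++) {
                for (int d = 0; d < dim; d++) {
                    vel[i][d] = w * vel[i][d]
                            + c1 * rnd.nextDouble() * (pBest[i][d] - loc[i][d])
                            + c2 * rnd.nextDouble() * (gBest[d] - loc[i][d]);
                    loc[i][d] += vel[i][d];
                }
                double fit = fitness.applyAsDouble(loc[i]);
                if (fit < pBestFit[i]) {          // update the particle's own best
                    pBestFit[i] = fit;
                    pBest[i] = loc[i].clone();
                }
                if (fit < gBestFit) {             // update the swarm's best
                    gBestFit = fit;
                    gBest = loc[i].clone();
                }
            }
        }
        return gBest;
    }

    // Toy fitness with its minimum at (1, -2); a stand-in for an MR-based fitness.
    static double shiftedSphere(double[] x) {
        return (x[0] - 1.0) * (x[0] - 1.0) + (x[1] + 2.0) * (x[1] + 2.0);
    }

    public static void main(String[] args) {
        double[] best = minimize(PsoSketch::shiftedSphere, 2, 30, 200, 7L);
        System.out.println(Arrays.toString(best)); // a point close to [1.0, -2.0]
    }
}
```

Passing the fitness as a function keeps the search loop independent of the problem, so the same loop could be reused with a different fitness definition.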

MR Filtering
1-MR: $c_1 P(I_1) + c_2 P(\alpha I_1 + \beta) + c_3 = 0$
2-MR: $c_1 P^2(I_1) + c_2 P^2(\alpha I_1 + \beta) + c_3 P(I_1) P(\alpha I_1 + \beta) + c_4 P(I_1) + c_5 P(\alpha I_1 + \beta) + c_6 = 0$

Evaluation Results (1/2)
Applied to 189 scientific functions from the JDK, Apache, Matlab, and GSL

In: Obsolete Test Identification [ECOOP12]
Obsolete Tests

    public class Testcases {
        Account a;

        protected void setUp() {
            a = new Account(100.0, "user1");
        }

        protected void tearDown() {
        }

        public void test1() {
            a.transfer(50.0, "user2");
            a.withdraw(40.0);
            assertEquals(9.5, a.getBalance());
        }

        public void test2() {
            a.withdraw(40.0);
            assertEquals(56, a.getBalance()); //should be
        }
    }

[Figure labels: P, T, P']

[ECOOP12] Dan Hao, Tian Lan, Hongyu Zhang, Chao Guo, Lu Zhang, Is This a Bug or an Obsolete Test? ECOOP 2012.

Problem Description
Obsolete test identification: given a failing execution, is it caused by a bug in the source code or by an obsolete test case?
Importance
– Without knowing the cause of a failure, how can one decide whether to repair the test [Daniel:ASE09, Daniel:ISSTA10] or to debug the source code [Jones:ASE06, Liblit:PLDI03, Weimer:ICSE09, Kim:ICSE13]?
– Krishna Ratakonda (IBM): "Determining this reason for failure is the critical first step before any corrective action can be taken…"

Best-First Decision Tree Algorithm → Obsolete Test Identification
Binary classification problem (T vs. P')
Example: the Testcases class shown earlier, with its failing assertions in test1() and test2()
[Figure labels: T, P']

Learning based Obsolete Test Identification
– failure-inducing test collection
– feature collection
– classifier building

Features
Complexity Features
– Maximum depth of the call graph
– Number of methods in the call graph
Change Features
– File change
Testing Features
– Type of failure
– Count of plausible nodes in the graph
– Existence of highly fault-prone node
– Product innocence

Evaluation Results
Scenarios: within the same version, between versions, across projects
Effective when applied within the same version or between versions

In: Test Effectiveness Measurement [ISSTA16]
Mutation Testing:
– Whether a mutant is killed by a test suite
– Mutation score

Program:

    begin
      int x, y;
      input(x, y);
      if (x < y)
        output(x + y);
      else
        output(x * y);
    end

Mutant:

    begin
      int x, y;
      input(x, y);
      if (x <= y)
        output(x + y);
      else
        output(x * y);
    end

[ISSTA16] Jie Zhang, Yiling Lou, Lingming Zhang, Dan Hao, Lu Zhang, Hong Mei, Predictive Mutation Testing, ISSTA 2016.
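The program and mutant above can be turned into a small runnable Java sketch that shows what "killed" and "mutation score" mean. The class `MutationSketch`, its method names, the test inputs, and the second mutant `minusMutant` are hypothetical additions for illustration; only the `x < y` versus `x <= y` pair comes from the slide.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;

// Hypothetical sketch of traditional (execution-based) mutation testing.
class MutationSketch {

    // Original program from the slide.
    static int program(int x, int y) { return x < y ? x + y : x * y; }

    // Mutant from the slide: "<" replaced by "<=".
    static int mutant(int x, int y) { return x <= y ? x + y : x * y; }

    // Second mutant, an assumption of this sketch: "+" replaced by "-".
    static int minusMutant(int x, int y) { return x < y ? x - y : x * y; }

    static final List<IntBinaryOperator> MUTANTS =
            List.of(MutationSketch::mutant, MutationSketch::minusMutant);

    // A mutant is killed if some test input makes its output differ from the original's.
    static boolean killed(IntBinaryOperator m, int[][] tests) {
        for (int[] t : tests) {
            if (program(t[0], t[1]) != m.applyAsInt(t[0], t[1])) {
                return true;
            }
        }
        return false;
    }

    // Mutation score: fraction of mutants killed by the test suite.
    static double mutationScore(int[][] tests) {
        int killedCount = 0;
        for (IntBinaryOperator m : MUTANTS) {
            if (killed(m, tests)) {
                killedCount++;
            }
        }
        return (double) killedCount / MUTANTS.size();
    }

    public static void main(String[] args) {
        int[][] weak = {{1, 2}, {5, 3}};           // never exercises x == y
        int[][] strong = {{1, 2}, {5, 3}, {3, 3}}; // (3, 3): original 9, "<=" mutant 6
        System.out.println(mutationScore(weak));   // 0.5
        System.out.println(mutationScore(strong)); // 1.0
    }
}
```

Every mutant has to be executed against the test suite here, which is the cost that motivates predicting the outcome instead of running it.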

Challenge in Mutation Testing
Costly
– E.g., a 512-LOC program has … mutants
– Mutant generation
– Mutant execution
Literature
– Do Fewer
– Do Faster
Don't run at all

Predictive Mutation Testing
Mutation testing results
– Whether a mutant is killed by a test suite
– Mutation score: percentage of killed mutants
Prediction: whether a mutant is killed or survives

Random Forest → Mutation Testing
Binary classification problem (killed vs. survived)
Example: the program with if(x<y) and its mutant with if(x<=y)
Does a test kill a mutant?
PIE Theory: Propagation, Infection, Execution

Features

Evaluation Results
In the cross-version scenario, the precision is mostly about 90%.

Evaluation Results (continued)
– Prediction results have high precision
– Predictions are made quickly

Commonality Analysis
Testing problem → learning algorithm
– Test Oracle Generation → PSO
– Obsolete Test Identification → Decision Tree
– Test Effectiveness Measurement → Random Forest
Sampling

Learning-based Software Testing
Learning Algorithms
Problems:
– test generation
– test-execution optimization
– defect prediction
– bug fixing
– ……
Algorithms:
– genetic algorithms
– PSO
– hill climbing
– random forest
– ……

Challenges in Learning-based Software Testing
Testing Perspective:
– Transform a testing problem into a typical learning problem
Learning Perspective:
– Design of the fitness function
– Influence of imbalanced data
– Choice of algorithms and parameter values
– ……

Summary

Thanks!