Bug Isolation in the Presence of Multiple Errors Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan UC Berkeley and Stanford University.

Presentation transcript:

In Our Last Episode…
Generic instrumentation schemes
  – Wild guesses about interesting behavior
Bernoulli sampling transformation
  – Amortization across acyclic regions
Data mining techniques
  – Deterministic: process of elimination
  – Non-deterministic: logistic regression

Schemes → Sites → Predicates
Several instrumentation schemes available
  – Function returns, pairwise comparisons, branches, …
Scheme induces finite set of instrumentation sites
Site determines finite set of observable predicates
Predicates completely partition each site
  – Bump exactly one counter per observation
  – Infer additional predicates (e.g. ≤, ≠, ≥) offline
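As a rough illustration of "exactly one counter per observation", the sketch below models a hypothetical scalar-pair site that compares two values. The three base predicates (<, ==, >) partition every observation, so one counter is bumped each time, and the ≤, ≠, ≥ counts are derived afterwards by addition. The function names, the site label, and the counter layout are assumptions made for illustration, not the project's actual instrumentation.

```python
from collections import Counter

def observe_pair(counters, site, a, b):
    """Record one observation at a scalar-pair site: bump exactly one counter."""
    if a < b:
        counters[(site, "<")] += 1
    elif a == b:
        counters[(site, "==")] += 1
    else:
        counters[(site, ">")] += 1

def infer_predicates(counters, site):
    """Derive the <=, !=, >= counts offline from the three base counters."""
    lt = counters[(site, "<")]
    eq = counters[(site, "==")]
    gt = counters[(site, ">")]
    return {"<": lt, "==": eq, ">": gt,
            "<=": lt + eq, "!=": lt + gt, ">=": eq + gt}

# Four hypothetical observations at one site.
counters = Counter()
for a, b in [(1, 2), (2, 2), (3, 2), (5, 2)]:
    observe_pair(counters, "site_1", a, b)
derived = infer_predicates(counters, "site_1")
```

Because the base predicates are mutually exclusive, the derived counts cost nothing at run time.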

What Does This Give Us?
Absolutely certain of what we do see
Uncertain of what we don’t see
Given enough runs, samples ≈ reality
  – Common events seen most often
  – Rare events seen at proportionate rate
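The "samples ≈ reality" claim can be checked with a small simulation. This is only a sketch under stated assumptions: the event stream, the 1/100 sampling rate, and the function name are hypothetical, and a plain per-event coin flip is used rather than any amortized sampling scheme.

```python
import random

def sample_events(events, rate, rng):
    """Keep each event independently with probability `rate` (one Bernoulli trial each)."""
    return [e for e in events if rng.random() < rate]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical event stream: 10% "rare" observations, 90% "common".
events = ["rare" if i % 10 == 0 else "common" for i in range(200_000)]
sampled = sample_events(events, rate=0.01, rng=rng)

true_rare = events.count("rare") / len(events)     # proportion in reality
seen_rare = sampled.count("rare") / len(sampled)   # proportion in the sample
```

With about 2,000 samples drawn from 200,000 events, the sampled proportion of rare events should land close to the true 10%, which is the sense in which rare events are seen at a proportionate rate.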

Regularized Logistic Regression
S-shaped cousin to linear regression
Predict success/failure as function of counters
Penalty factor forces most coefficients to zero
  – Large coefficient ⇒ highly predictive of failure
[Figure: S-shaped curve; x axis: count; y axis: failure = 1, success = 0]

Buffer Overrun in bc
    void more_arrays ()
    {
      …
      /* Copy the old arrays. */
      for (indx = 1; indx < old_count; indx++)
        arrays[indx] = old_ary[indx];

      /* Initialize the new elements. */
      for (; indx < v_count; indx++)
        arrays[indx] = NULL;
      …
    }
#1: indx > scale
#2: indx > use_math
#3: indx > opterr
#4: indx > next_func
#5: indx > i_base

Limitations of Logistic Regression
Linearly-weighted combination of features
  – What does this mean?
Many correlated features
  – Weight may be spread in unpredictable ways
Suited to explaining a single mode of failure
  – Do you really believe you have just one bug?

Multiple-Bug Isolation
Consider predicates one at a time
  – Include inferred predicates (e.g. ≤, ≠, ≥)
How likely is failure when predicate P is true?
  – (technically, when P is observed to be true)
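One plausible way to turn "how likely is failure when P is observed to be true" into a number is the fraction of runs observing P true that went on to fail. The sketch below assumes that definition for Bad(P); the run format and the function name are hypothetical.

```python
def bad(runs, predicate):
    """Fraction of runs observing `predicate` true that failed (assumed definition)."""
    failed = sum(1 for true_preds, did_fail in runs
                 if predicate in true_preds and did_fail)
    total = sum(1 for true_preds, _ in runs if predicate in true_preds)
    return failed / total if total else None  # None: never observed true

# Hypothetical feedback reports: (predicates observed true, did the run fail?)
runs = [
    ({"indx > scale"}, True),
    ({"indx > scale"}, True),
    ({"indx > scale"}, False),
    (set(), False),
]
score = bad(runs, "indx > scale")  # 2 failing runs out of 3 that observed it
```

Runs that never observed the predicate true do not enter the ratio at all, which matches the "observed to be true" caveat.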

Are We Done? Not Exactly!
    f = …;
    if (f == NULL) {
      x = 0;
      *f;
    }
Predicate (x == 0) is an innocent bystander
  – Program is already doomed
Bad(f == NULL) = 1.0
Bad(x == 0) = 1.0

Three-Valued Logic
Identify unlucky sites on the doomed path
Captures risk of failure from reaching site at all, regardless of predicate truth/falsehood

Getting to the Heart of the Matter
Looking for increase in failure odds
Correspondence to likelihood ratio testing
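To see how an "increase in failure odds" separates a cause from a bystander, the sketch below works through the f == NULL example. It assumes two definitions: the context of P is the failure rate among runs that reached P's site at all, and Increase(P) is Bad(P) minus that context. The run records, site names, and function names are hypothetical.

```python
def failure_rate(runs, keep):
    """Fraction of failing runs among those for which keep(run) is true."""
    selected = [r for r in runs if keep(r)]
    if not selected:
        return None
    return sum(r["failed"] for r in selected) / len(selected)

def increase(runs, site, predicate):
    """Assumed definition: Bad(P) minus the failure rate of merely reaching P's site."""
    bad = failure_rate(runs, lambda r: predicate in r["true"])
    context = failure_rate(runs, lambda r: site in r["reached"])
    if bad is None or context is None:
        return None
    return bad - context

# Hypothetical runs: 3 crash because f is NULL, 7 succeed.
# The x site is only reached on the doomed path.
crash = {"reached": {"f_site", "x_site"},
         "true": {"f == NULL", "x == 0"}, "failed": True}
ok = {"reached": {"f_site"}, "true": set(), "failed": False}
runs = [crash] * 3 + [ok] * 7

inc_f = increase(runs, "f_site", "f == NULL")  # 1.0 - 0.3
inc_x = increase(runs, "x_site", "x == 0")     # 1.0 - 1.0
```

Both predicates have Bad = 1.0, but only f == NULL raises the failure rate above what reaching its site already implies, so the bystander x == 0 scores zero.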

Multiple-Bug Filtering & Ranking
1. Discard predicates having Increase(P) ≤ 0
  – Dead predicates
  – Invariant predicates
  – Bystander predicates
  – Others
2. Sort remaining predicates by Bad(P)
  – Likely causes with determinacy metrics
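The two steps above can be sketched as a filter followed by a sort. The scores here are hypothetical, precomputed values, and the dictionary layout and function name are assumptions for illustration.

```python
def filter_and_rank(scores):
    """Drop predicates with Increase <= 0, then sort the rest by Bad, highest first."""
    survivors = [(name, s) for name, s in scores.items() if s["increase"] > 0]
    ranked = sorted(survivors, key=lambda item: item[1]["bad"], reverse=True)
    return [name for name, _ in ranked]

# Hypothetical scores for four predicates.
scores = {
    "f == NULL":   {"bad": 1.0, "increase": 0.7},
    "x == 0":      {"bad": 1.0, "increase": 0.0},  # bystander: filtered out
    "n > limit":   {"bad": 0.6, "increase": 0.2},
    "always true": {"bad": 0.3, "increase": 0.0},  # invariant: filtered out
}
ranking = filter_and_rank(scores)
```

Filtering first matters: without it the bystander x == 0 would tie for the top of the list on Bad alone.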

Case Study: Moss
Reintroduce nine historic Moss bugs
  – Including wrong-output bugs
Instrument with everything we’ve got
  – Branches, returns, scalar pairs, the works
Generate 32,000 randomized runs

Effectiveness of Filtering
Eliminates 99% of branch predicates
  – 4170 → 51
Eliminates 99.5% of return predicates
  – 2964 → 16
Eliminates 96% of scalar pair predicates
  – 195,864 → 8242
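The percentages follow directly from the before and after counts; a quick check (the helper name is hypothetical) shows that the slide's figures are rounded, with the branch figure closer to 98.8% than 99%.

```python
def percent_eliminated(before, after):
    """Share of predicates removed by filtering, as a percentage."""
    return 100.0 * (1 - after / before)

# (before, after) predicate counts taken from the slide.
counts = {
    "branches": (4170, 51),
    "returns": (2964, 16),
    "scalar pairs": (195_864, 8242),
}
eliminated = {scheme: round(percent_eliminated(before, after), 1)
              for scheme, (before, after) in counts.items()}
```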

Effectiveness of Ranking
Five bugs: captured by branches, returns
  – Lists are short, easy to examine by hand
  – “Smoking guns” rise to the top
  – Stop early if Bad() dips down
Two bugs: buried in scalar pairs results
  – List is still too large to be useful
Two bugs: never cause a failure
  – No failure, no problem!

Summary: Putting it All Together
Wild guesses + fair random sampling
Seek behaviors that co-vary with outcome
  – Statistical modeling, confidence tests, more…
Future work
  – More selective instrumentation schemes
  – Non-uniform sampling
  – Improved statistical models
  – Use of program structure in analysis

Join the Cause!
The Cooperative Bug Isolation Project

Linear Regression
Match a line to the data points
Outcome can be anywhere along y axis
But our outcomes are always 0/1

Logistic Regression
Prediction asymptotically approaches 0 and 1
  – 0: predict success
  – 1: predict failure
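The asymptotic behavior comes from the logistic (sigmoid) function applied to a weighted sum of counters. A minimal sketch follows; `predict_failure` and its parameters are hypothetical names, not part of any existing tool.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_failure(weights, bias, counters):
    """Predicted failure probability for one run's counter values."""
    z = bias + sum(w * c for w, c in zip(weights, counters))
    return sigmoid(z)

# A run with one counter at value 3, weight 2.0, bias -1.0 gives z = 5.
p = predict_failure([2.0], -1.0, [3])
```

The output stays strictly between 0 and 1 for any finite input, approaching 1 (predict failure) for large positive sums and 0 (predict success) for large negative ones.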

Training the Model
Maximize the log-likelihood (LL) using stochastic gradient ascent
Problem: model is wildly under-constrained
  – Far more counters than runs
  – Will get perfectly predictive model just using noise
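A minimal sketch of stochastic gradient ascent on the log-likelihood, using a deliberately contrived data set: three runs, five counters, and no real signal. The counter values, learning rate, and epoch count are all assumptions chosen for illustration. With more counters than runs, the unregularized model still separates failures from successes perfectly.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, x):
    return sum(w * c for w, c in zip(weights, x))

def log_likelihood(weights, runs, labels):
    """Sum of log-probabilities the model assigns to the observed outcomes."""
    ll = 0.0
    for x, y in zip(runs, labels):
        p = sigmoid(score(weights, x))
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll

def train_sga(runs, labels, rate=0.1, epochs=200):
    """Stochastic gradient ascent: one weight update per run."""
    weights = [0.0] * len(runs[0])
    for _ in range(epochs):
        for x, y in zip(runs, labels):
            p = sigmoid(score(weights, x))
            for j, c in enumerate(x):
                weights[j] += rate * (y - p) * c
    return weights

# Hypothetical noise counters: 3 runs, 5 counters, labels 1 = failure.
runs = [[1, 0, 0, 2, 0], [0, 3, 0, 0, 0], [0, 0, 1, 0, 1]]
labels = [1, 0, 1]

initial_ll = log_likelihood([0.0] * 5, runs, labels)
weights = train_sga(runs, labels)
final_ll = log_likelihood(weights, runs, labels)
predictions = [sigmoid(score(weights, x)) > 0.5 for x in runs]
```

Every training run is classified correctly even though the counters were invented, which is the under-constrained behavior the slide warns about.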

Regularized Logistic Regression
Add penalty factor for nonzero terms
Force most coefficients to zero
Retain only features that “pay their way” by significantly improving prediction accuracy
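One common way to realize a penalty that forces coefficients to exactly zero is an L1 penalty applied through soft-thresholding. The sketch below assumes that choice and uses full-batch proximal gradient steps rather than stochastic updates, so it is a stand-in technique, not necessarily the one used in the talk. The data set, penalty, step size, and function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def train_l1(runs, labels, penalty=0.1, rate=0.5, steps=500):
    """Proximal gradient ascent on the mean log-likelihood with an L1 penalty."""
    n, d = len(runs), len(runs[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(steps):
        errors = [y - sigmoid(bias + sum(w * c for w, c in zip(weights, x)))
                  for x, y in zip(runs, labels)]
        grad_b = sum(errors) / n
        grads = [sum(e * x[j] for e, x in zip(errors, runs)) / n
                 for j in range(d)]
        bias += rate * grad_b  # the bias term is left unpenalized
        weights = [soft_threshold(w + rate * g, rate * penalty)
                   for w, g in zip(weights, grads)]
    return weights, bias

# Counter 0 fires only in failing runs; counter 1 is unrelated noise.
runs = [[1, 1], [1, 1], [1, 0], [1, 0],
        [0, 1], [0, 1], [0, 0], [0, 0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
weights, bias = train_l1(runs, labels)
```

On this data the noise counter never earns enough gradient to overcome the penalty, so its coefficient stays at exactly zero while the predictive counter keeps a positive weight.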