Bug Isolation via Remote Sampling


Lemonade from Lemons
Bugs manifest themselves everywhere in deployed systems. Each manifestation gives us a chance to inspect the bug and hence to resolve it. Deployment yields more test cases than the test suite does. But you need a feedback mechanism, and feedback is expensive: most feedback requires manual inspection.

Feedback
[Client Performance] Instrumentation, callbacks, assertion checks, logging
[Data Transmission] Network latency, bandwidth

Make Feedback Cheaper
[Client Performance] Sampling, distributed sampling
[Data Transmission] Omit data, e.g. contextual information such as the order of execution

Contributions
A framework for sampling execution information in a distributed manner.
Examples of data analysis on different kinds of gathered data:
– Bug detection using distributed sampling
– Detection of deterministic bugs
– Detection of non-deterministic bugs

Sampling
Biased coin tosses. [Bernoulli process]
Walk down the instrumented path if you get a head; walk the fast path if you get a tail.
We can predict when the next head occurs, because the number of tails between heads follows a geometric distribution. [Geometric distribution]
Maintain a countdown to the next head and compare it against a threshold.
– If the countdown is less than the threshold, a head is due and you walk down the instrumented path.

[Figure: threshold checks on the countdown (> 4?, > 3?)]
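The countdown idea on this slide can be sketched in Python as follows. This is a minimal illustration, not the presenters' implementation: the names `next_countdown`, `Sampler`, and `run_region` are hypothetical, the threshold is assumed to be the number of instrumentation sites in a region, and the countdown is defined so that a value of k means "the k-th site from now is sampled" (so the fast path is taken when the countdown exceeds the threshold).

```python
import math
import random


def next_countdown(density, rng):
    """Draw the gap to the next sampled site from a geometric distribution.

    density is the probability of a head (0 < density < 1). The result k >= 1
    means: the k-th instrumentation site from now is the one that gets sampled.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - density)) + 1


class Sampler:
    """Hypothetical sampler that chooses between a fast and an instrumented path."""

    def __init__(self, density, seed=0):
        self.density = density
        self.rng = random.Random(seed)
        self.countdown = next_countdown(density, self.rng)

    def run_region(self, sites):
        """Run one code region; return the sites that were actually sampled."""
        threshold = len(sites)  # assumed: one unit per instrumentation site
        if self.countdown > threshold:
            # Fast path: no head can fall inside this region, so skip all checks.
            self.countdown -= threshold
            return []
        # Instrumented path: count down site by site and sample on zero.
        sampled = []
        for site in sites:
            self.countdown -= 1
            if self.countdown == 0:
                sampled.append(site)
                self.countdown = next_countdown(self.density, self.rng)
        return sampled


if __name__ == "__main__":
    sampler = Sampler(density=0.01, seed=42)
    hits = sum(len(sampler.run_region(["a", "b", "c", "d"])) for _ in range(10000))
    print("sampled", hits, "of", 4 * 10000, "site executions")
```

The point of the design is that the random draw happens once per sample rather than once per site, so most regions pay only for a single comparison and a subtraction.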

Transmission Losses
Why?
– Client-side storage
– Network bandwidth
– Central storage needs might not scale
Store only predicate observations
– Predicate vectors/arrays (the i-th value shows the number of times predicate i was true/false)
Completely ignores context; this has been left for future work.
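A predicate vector of this kind might look like the following Python sketch. The class name `PredicateCounters` and the choice to keep separate true and false counts per predicate are assumptions made for illustration; the slide only says that the i-th entry records how often predicate i was true/false.

```python
class PredicateCounters:
    """Hypothetical fixed-size report: one pair of counters per predicate."""

    def __init__(self, num_predicates):
        self.true_counts = [0] * num_predicates
        self.false_counts = [0] * num_predicates

    def observe(self, index, outcome):
        """Record one sampled observation of predicate `index`."""
        if outcome:
            self.true_counts[index] += 1
        else:
            self.false_counts[index] += 1

    def report(self):
        """Return the compact data that would be sent back for this run."""
        return tuple(self.true_counts), tuple(self.false_counts)


if __name__ == "__main__":
    first = PredicateCounters(2)
    second = PredicateCounters(2)
    observations = [(0, True), (1, False), (0, True)]
    for index, outcome in observations:
        first.observe(index, outcome)
    for index, outcome in reversed(observations):
        second.observe(index, outcome)
    # Same counts regardless of the order in which observations arrived.
    print(first.report() == second.report())
```

The report size depends only on the number of predicates, not on how long the program ran, which is what keeps storage and bandwidth bounded. The price is visible in the usage lines: two runs that observed the same predicates in a different order produce identical reports, so ordering context is lost.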

Experiments (sample applications)
– Sharing assertion costs
– Debugging deterministic bugs
– Debugging non-deterministic bugs

Distributed Assertion Checks
Spread out the instrumentation.
– Multiple executables, each containing a subset of the instrumentation
Sampling density 1/ ,258 runs for 90% confidence of observing an event.
MS Word produces that many runs every 19 minutes.
Food for thought: how many minutes would a web browser like Chrome or Firefox take?

Predicate-Based Bug Isolation
Observe the behavior of predicates.
Apply elimination strategies to discard predicates that are unlikely to be the cause of a failure.
Eliminated predicates no longer need to be checked. Hence full (unsampled) instrumentation also performs fairly well.
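As a rough illustration of elimination, the Python sketch below keeps only predicates that were observed true in at least one failing run and never observed true in a successful run. These two rules are assumed here as plausible examples of elimination strategies; the slide does not list the strategies, and the function name `surviving_predicates` and the data layout are hypothetical.

```python
def surviving_predicates(runs):
    """Return indices of predicates that remain candidate failure causes.

    runs is a list of (counts, failed) pairs, where counts[i] is the number of
    times predicate i was observed true in that run and failed is a bool.
    """
    num_predicates = len(runs[0][0])
    survivors = []
    for i in range(num_predicates):
        true_in_failure = any(failed and counts[i] > 0 for counts, failed in runs)
        true_in_success = any(not failed and counts[i] > 0 for counts, failed in runs)
        # Rule 1 (assumed): drop predicates never true in any failing run.
        # Rule 2 (assumed): drop predicates true in some successful run.
        if true_in_failure and not true_in_success:
            survivors.append(i)
    return survivors


if __name__ == "__main__":
    runs = [
        ([1, 0, 2], True),
        ([0, 0, 3], False),
        ([2, 0, 0], True),
    ]
    print(surviving_predicates(runs))
```

Rules like these suit deterministic bugs, where a predicate being true always leads to failure; with sampled data, a predicate can also go unobserved simply because it was not sampled, so more runs are needed before elimination is trustworthy.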

Statistically Finding Non-Deterministic Bugs
These bugs do not always cause a failure when a predicate is true.
A machine learning technique computes the probability that the program will fail given that a subset of ‘interesting’ predicates is true.
If the probability is more than 0.5, the program is predicted to fail, and those specific predicates and their associated data values are examined.
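The slide does not name the machine learning technique. As a stand-in, the sketch below uses plain (unregularized) logistic regression trained by gradient descent, which is one common way to estimate a failure probability from binary predicate observations. The function names, the toy data, and the hyperparameters are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def failure_probability(weights, bias, predicates):
    """Estimated probability that a run fails, given 0/1 predicate observations."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, predicates)))


def train(features, labels, steps=2000, rate=0.5):
    """Fit logistic regression by batch gradient descent on the log loss."""
    num_features = len(features[0])
    weights = [0.0] * num_features
    bias = 0.0
    count = len(features)
    for _ in range(steps):
        grad_w = [0.0] * num_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = failure_probability(weights, bias, x) - y
            grad_b += error
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
        bias -= rate * grad_b / count
        weights = [w - rate * g / count for w, g in zip(weights, grad_w)]
    return weights, bias


if __name__ == "__main__":
    # Toy data: each row says whether predicate 0 and predicate 1 were observed true.
    features = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
    labels = [1, 1, 0, 0, 1, 0]  # 1 = run failed
    weights, bias = train(features, labels)
    p = failure_probability(weights, bias, [1, 0])
    print("predicted to fail" if p > 0.5 else "predicted to succeed")
```

After training, predicates with large positive weights are the ones most associated with failure, which is where a developer would start looking.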

Discussion
Even if statistics is really effective at telling us where the fault is, can it explain the fault? How can we make it effective at telling us why the fault is there?