Finding Similar Failures Using Callstack Similarity


1 Finding Similar Failures Using Callstack Similarity
Kevin Bartz, Harvard University
Jack W. Stokes, John C. Platt, Ryan Kivet, David Grant, Silviu Calinoiu and Gretchen Loihle, Microsoft Corporation, Redmond, WA
Presented by: Sandeep Kumar Dhankar, Dept. of Computer & Information Sciences, University of Delaware

2 OUTLINE Introduction Approach Similarity Classifier Model Results
Conclusion

3 INTRODUCTION REMEMBER THIS?

4 WHAT NEXT? Problem 1 (assigned), Problem 2 (assigned), Problem 3 (assigned)

5 Problems with this approach
Similar problems are encountered
Duplication of effort
Wrong prioritization of tasks

6 Solution?
Try to find and group similar problems
Treat the callstack as a string and apply string-matching techniques

7 CallStack

8 Edit distance
The number of insertions, deletions, and modifications (substitutions) required to convert one string into another.
E.g. "tried" and "tired": two modifications convert "tried" into "tired" (r to i, then i to r), so the edit distance is 2.
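The definition above can be sketched with the standard dynamic-programming (Levenshtein) recurrence. This is a minimal illustration with unit cost for every edit, not the tuned callstack distance used in the paper; the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, unit cost per edit."""
    # prev[j] is the distance between the already-processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,              # delete x
                curr[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),   # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("tried", "tired"))  # 2
```

Because the function only compares elements for equality, it works equally well on a list of callstack frames as on a string of characters.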

9 Data used
1 million failure reports sent to the Windows Error Reporting system, collected over a 90-day period
Type of failure: crash, hang, deadlock
Name of the causing process
Exception code
Offending callstack

10 Training set

11 Model Parameters
Features: defined over a pair of failures

12 Features cont…

13 Model parameters Dependence on Callstack edit distance

14 Callstack edit distance penalty parameters

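15 Model
$$P(\mathrm{Sim} \mid \beta, \gamma, X) = g^{-1}\big(\alpha + \beta_{ET}\,\mathbf{1}\{ET_1 = ET_2\} + \beta_{PN}\,\mathbf{1}\{PN_1 = PN_2\} + \beta_{EC}\,\mathbf{1}\{EC_1 = EC_2\} + \beta_{CS}\,\mathrm{EditDistance}(CS_1, CS_2; \gamma)\big)$$
where $g^{-1}(x) = e^{x} / (1 + e^{x})$ is the inverse logit function.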
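As a rough sketch of how this model turns a pair of failures into a probability, the code below evaluates the linear term and passes it through the standard inverse logit, e^x / (1 + e^x). It assumes that ET, PN and EC stand for the failure type, process name and exception code listed on the data slide. The dictionary keys, the function names and the coefficient values are all hypothetical, not the fitted values from the paper.

```python
import math


def inverse_logit(x):
    """Standard inverse logit e^x / (1 + e^x), computed in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def similarity_probability(f1, f2, coef, cs_distance):
    """P(Sim) for two failures, given a precomputed callstack edit distance."""
    x = (
        coef["alpha"]
        + coef["ET"] * (f1["event_type"] == f2["event_type"])      # indicator 1{ET1 = ET2}
        + coef["PN"] * (f1["process_name"] == f2["process_name"])  # indicator 1{PN1 = PN2}
        + coef["EC"] * (f1["exception_code"] == f2["exception_code"])  # indicator 1{EC1 = EC2}
        + coef["CS"] * cs_distance
    )
    return inverse_logit(x)


# Hypothetical coefficients, chosen only to exercise the function.
coef = {"alpha": 0.0, "ET": 1.0, "PN": 1.0, "EC": 1.0, "CS": -1.0}
a = {"event_type": "crash", "process_name": "app.exe", "exception_code": "0xC0000005"}
b = {"event_type": "crash", "process_name": "app.exe", "exception_code": "0xC0000005"}
print(similarity_probability(a, b, coef, cs_distance=0.0))
```

With a negative callstack coefficient, as in this made-up example, a larger edit distance lowers the predicted probability of similarity.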

16 Model variations
Full model
Reduced model: $\gamma_{InsSame} = \gamma_{InsNew}$ and $\gamma_{DelSame} = \gamma_{DelLast}$
Further reduced model: $\gamma_{SubMod} = \gamma_{SubFunc} = \gamma_{SubOffset}$
Baseline model with untuned edit distance
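To make the role of the penalty parameters concrete, here is a hedged sketch of a weighted edit distance over callstack frames. It corresponds structurally to the reduced model, with a single insertion cost and a single deletion cost. The meaning assigned to the substitution parameters is an assumption made for illustration: a frame is taken to be a (module, function, offset) tuple, and the substitution cost is chosen by the first component that differs. The parameter values are invented, not the estimated ones.

```python
def frame_sub_cost(f1, f2, gamma):
    """Substitution cost between two (module, function, offset) frames (assumed semantics)."""
    if f1[0] != f2[0]:
        return gamma["SubMod"]     # modules differ
    if f1[1] != f2[1]:
        return gamma["SubFunc"]    # same module, different function
    if f1[2] != f2[2]:
        return gamma["SubOffset"]  # same function, different offset
    return 0.0


def callstack_edit_distance(cs1, cs2, gamma):
    """Weighted edit distance between two callstacks (lists of frames)."""
    prev = [j * gamma["Ins"] for j in range(len(cs2) + 1)]
    for i, f1 in enumerate(cs1, start=1):
        curr = [i * gamma["Del"]]
        for j, f2 in enumerate(cs2, start=1):
            curr.append(min(
                prev[j] + gamma["Del"],                     # delete f1
                curr[j - 1] + gamma["Ins"],                 # insert f2
                prev[j - 1] + frame_sub_cost(f1, f2, gamma),  # substitute f1 with f2
            ))
        prev = curr
    return prev[-1]


# Hypothetical penalties, for illustration only.
gamma = {"Ins": 1.0, "Del": 1.0, "SubMod": 1.0, "SubFunc": 0.5, "SubOffset": 0.1}
cs1 = [("a.dll", "f", 0x10), ("b.dll", "g", 0x20)]
cs2 = [("a.dll", "f", 0x10), ("b.dll", "g", 0x24)]
print(callstack_edit_distance(cs1, cs2, gamma))
```

Setting all three substitution penalties to the same value would give the further reduced variant, and setting every penalty to 1 would give an untuned distance comparable to the baseline.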

17 Computation
The edit distance computation dominates the running time
Consider only those failures whose first three callstack frames match
This returns under 3000 such failures
The model is applied to them
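One possible way to implement this pre-filter is an index keyed on the first three frames, so that the expensive edit distance is computed only within a bucket. The sketch below assumes each callstack is a list with the top-of-stack frame first. The function names and the in-memory dictionary are illustrative choices, not the system described in the paper.

```python
from collections import defaultdict


def build_prefix_index(failures, depth=3):
    """Group failure ids by the tuple of their first `depth` callstack frames."""
    index = defaultdict(list)
    for failure_id, callstack in failures.items():
        index[tuple(callstack[:depth])].append(failure_id)
    return index


def candidates(index, callstack, depth=3):
    """Return ids of failures whose first `depth` frames match the given callstack."""
    return index.get(tuple(callstack[:depth]), [])


# Hypothetical failure reports: id -> callstack (top frame first).
failures = {
    1: ["a!f", "b!g", "c!h", "d!main"],
    2: ["a!f", "b!g", "c!h", "e!start"],
    3: ["x!p", "b!g", "c!h", "d!main"],
}
index = build_prefix_index(failures)
print(candidates(index, ["a!f", "b!g", "c!h", "z!other"]))  # [1, 2]
```

Only the returned candidates would then be scored with the full model.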

18 Estimated coefficients for the model

19 Results

20 Result cont…
Full model: 90% precision in identifying similar failures on recall
Baseline model: 50% precision on recall
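For reference, precision and recall over pairs of failures can be computed as in the sketch below. The pairs shown are made up purely to exercise the function and are not the paper's data.

```python
def precision_recall(predicted, actual):
    """Precision and recall of predicted similar pairs against the true similar pairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical pairs of failure ids.
predicted_pairs = {(1, 2), (1, 3), (2, 3), (4, 5)}
actual_pairs = {(1, 2), (1, 3), (6, 7)}
precision, recall = precision_recall(predicted_pairs, actual_pairs)
print(precision, recall)
```

Precision is the fraction of predicted pairs that are truly similar, and recall is the fraction of truly similar pairs that were found.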

21 Conclusion about paper
The model shows a good ability to recover similar failures
Total computation times for the edit distance comparisons over the data set are not given exactly
The initial failure classification for the training data is based on developer tags, which is not a standard basis
Only the first three frames are checked for a match in the fast global search

22 QUESTIONS?

23 THANKS

