CS5103 Software Engineering Lecture 17 Debugging.


1 CS5103 Software Engineering Lecture 17 Debugging

2 Today’s class
• Delta Debugging
  • Motivation
  • Algorithm
  • In practice
• Statistical Debugging
  • Tarantula
• Dynamic Slicing

3 Debugging
• Something we do when testing finds a bug
• Basic process
  • Reproduce the bug
  • Locate the fault
  • Fix it
• Bug localization: basic idea
  • Suspicious score(s) = (number of failing tests that cover s) / (number of all tests that cover s)
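The suspicious score above can be computed directly from per-test coverage. The following is a minimal Python sketch, not part of the lecture; the `coverage` and `outcomes` dictionaries and the function name `suspiciousness` are hypothetical formats chosen for illustration.

```python
from collections import defaultdict


def suspiciousness(coverage, outcomes):
    """Score each statement as (failing tests covering it) / (all tests covering it).

    coverage: dict mapping a test name to the set of statement ids it executes
              (hypothetical format, assumed for this sketch)
    outcomes: dict mapping a test name to True (passed) or False (failed)
    """
    covered_by_all = defaultdict(int)
    covered_by_failing = defaultdict(int)
    for test, statements in coverage.items():
        for s in statements:
            covered_by_all[s] += 1
            if not outcomes[test]:
                covered_by_failing[s] += 1
    return {s: covered_by_failing[s] / covered_by_all[s] for s in covered_by_all}


# Made-up data: three tests over three statements.
coverage = {"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {1, 2}}
outcomes = {"t1": False, "t2": True, "t3": False}
scores = suspiciousness(coverage, outcomes)
```

In this made-up data, statement 2 is executed only by failing tests, so it receives the highest score and would be inspected first.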

4 Debugging
• Sometimes the input is too complex…
• Quite common in the real world (compiler, office suite, browser, database, OS, …)
• Locate the relevant part of the input

5 Consider Mozilla Firefox
• Takes HTML pages as input
• A large number of bugs are related to loading certain HTML pages
  • Corner cases in HTML syntax
  • Incompatibility between browsers
  • Corner cases in JavaScript, CSS, …
  • Error handling for incorrect HTML, JavaScript, CSS, …

6 How do we go from this
[Slide shows a long HTML fragment: OPTION lists for operating systems (All, Windows 3.1 through SunOS, other), priorities (P1 to P5) and severities (blocker, critical, major, normal, minor, trivial, enhancement); the markup is truncated in the transcript]

7 To this…

8 Motivation
• Turning bug reports with real web pages into minimized test cases
• The minimized test case should still be able to reveal the bug
• Benefits of simplification
  • Easy to communicate
  • Removes duplicates
  • Easier debugging
  • Involves less potentially buggy code
  • Shorter execution time

9 Delta Debugging
• The problem definition
  • A program exhibits an error for an input
  • The input is a set of elements
  • E.g., a sequence of API calls, a text file, a serialized object, …
• Problem:
  • Find a smaller subset of the input that still causes the failure

10 A generic algorithm
• How do people handle this problem?
• Binary search
  • Cut the input into halves
  • Try to reproduce the bug
  • Iterate

11 Delta Debugging Version 1
• The set of elements in the bug-revealing input is I
• Assumptions
  • Each subset of I is a valid input: each subset of I -> success / fail
  • A single input element E causes the failure
  • E causes the failure in every case, combined with any other elements (monotonicity)

12 Solution is simple
• Go with the binary search process
• Throw away half of the input elements if the remaining input elements still cause the failure

13 Solution is simple
• Go with the binary search process
• Throw away half of the input elements if the remaining input elements still cause the failure
• A single element: we are done!
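Under the Version 1 assumptions, the binary search on the preceding slides can be sketched in a few lines of Python. This is an illustration rather than the lecture's code; `dd_single` and the oracle `fails` (returns True when a subset reproduces the failure) are hypothetical names.

```python
def dd_single(elements, fails):
    """Binary search for the single failure-inducing element.

    Assumes the Version 1 conditions: every subset is a valid input,
    exactly one element causes the failure, and it does so regardless of
    what it is combined with. `fails` is a hypothetical test oracle.
    """
    assert fails(elements), "the full input must reproduce the failure"
    while len(elements) > 1:
        mid = len(elements) // 2
        first, second = elements[:mid], elements[mid:]
        # Keep whichever half still fails and throw the other half away.
        elements = first if fails(first) else second
    return elements


# Made-up scenario: eight input elements, element 6 alone triggers the bug.
culprit = dd_single(list(range(1, 9)), lambda subset: 6 in subset)
```

Each iteration runs one test and halves the input, so the search needs about log2(n) tests for n elements.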

14 Example

15 Delta Debugging Version 1
• This is just binary search: easy to automate
• The assumptions do not always hold
• Let’s look at the assumptions:
  • (I1 ∪ I2) = fail -> (I1 = success and I2 = fail) or (I1 = fail and I2 = success)
• It is interesting to see what happens when this is not the case

16 Case I: multiple failing branches
• What happens if I1 = fail and I2 = fail?
• A subset of I1 fails and a subset of I2 also fails
• We can simply continue to search both I1 and I2
• And we find two failure-causing elements
• They may or may not be due to the same bug

17 Case II: Interference
• What happens if I1 = success and I2 = success?
• This means that a subset of I1 and a subset of I2 cause the failure only when they are combined
• This is called interference

18 Handling Interference
• The cute trick
• Consider I1 = success and I2 = success, but I1 ∪ I2 = fail
• An element D1 in I1 and an element D2 in I2 cause the failure together
• We do binary search in I2 while keeping all of I1
  • Split I2 into P1 and P2, try I1 ∪ P1 and I1 ∪ P2
  • Continue until you find D2, such that I1 ∪ D2 causes the failure
• Then we do binary search in I1 while keeping D2, until we find D1
• Return D1 ∪ D2

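19 Example I: Handle interference
Consider 8 input elements, of which 3 and 7 cause the failure when they are applied together.
Configuration        Result
1 2 3 4 . . . .      success
. . . . 5 6 7 8      success   (Interference!)
1 2 3 4 5 6 . .      success
1 2 3 4 . . 7 8      fail
1 2 3 4 . . 7 .      fail
1 2 . . . . 7 .      success
. . 3 4 . . 7 .      fail
. . 3 . . . 7 .      fail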

20 Example II: Handle multiple interference
Consider 8 input elements, of which 3, 5 and 7 cause the failure when they are applied together.
Configuration        Result
1 2 3 4 . . . .      success
. . . . 5 6 7 8      success   (Interference!)
1 2 3 4 5 6 . .      success
1 2 3 4 . . 7 8      success   (Second interference! What to do? Go on with I1 ∪ P1!)
1 2 3 4 5 6 7 .      fail
1 2 3 4 5 . 7 .      fail
1 2 . . 5 . 7 .      success
. . 3 4 5 . 7 .      fail
. . 3 . 5 . 7 .      fail

21 Delta Debugging Version 2
• The set of elements in the bug-revealing input is I
• New assumptions
  • Each subset of I is a valid input
  • A subset of input elements E causes the failure
  • E causes the failure in every case (combined with any other elements)

22 Delta Debugging Version 2
• Algorithm
  • Split I into I1 and I2
  • Case I: I1 = fail and I2 = success: try I1
  • Case I: I1 = success and I2 = fail: try I2
  • Case I: I1 = fail and I2 = fail: try both I1 and I2
  • Case II: I1 = success and I2 = success: handle interference for I1 and I2
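The case analysis above can be sketched as a short recursive Python function. This is an illustrative sketch under the Version 2 assumptions, not the lecture's own code: `dd`, `fails` and `context` are hypothetical names, and the interference case searches each half with the whole other half kept as context, a slight variation on the slides, which search I1 with only D2 kept.

```python
def dd(candidates, fails, context=None):
    """Return the elements of `candidates` that are needed for the failure.

    Precondition (assumed): candidates + context fails, context alone does not.
    `fails` is a hypothetical oracle: True if the given input reproduces the bug.
    """
    context = context or []
    if len(candidates) == 1:
        return list(candidates)
    mid = len(candidates) // 2
    first, second = candidates[:mid], candidates[mid:]
    first_fails = fails(first + context)
    second_fails = fails(second + context)
    if first_fails and second_fails:
        # Case I, both halves fail: search both of them.
        return dd(first, fails, context) + dd(second, fails, context)
    if first_fails:
        return dd(first, fails, context)
    if second_fails:
        return dd(second, fails, context)
    # Case II, interference: search each half while keeping the other half.
    return dd(first, fails, second + context) + dd(second, fails, first + context)


# The two slide examples: eight elements, failure needs {3, 7} or {3, 5, 7}.
example_one = dd(list(range(1, 9)), lambda s: {3, 7}.issubset(s))
example_two = dd(list(range(1, 9)), lambda s: {3, 5, 7}.issubset(s))
```

In the second example the recursion hits interference twice, once at the top level and once inside the half 5 to 8, which mirrors the "second interference" step on the slide.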

23 Real example: GNU Compiler
• This input program (bug.c) causes GCC 2.95.2 to crash when all optimizations are enabled
• Minimize it to debug GCC
• Consider each character as an element

24 Real example: GNU Compiler
• Our delta debugging process
  • Create the appropriate subset of bug.c
  • Feed it to GCC
  • Continue according to whether GCC crashes

25 GCC compiler example
• The minimized code:
  t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
• The test case is 1-minimal
  • No single character can be removed
  • Even every space is removed
  • The function name has been shortened from mult to a single t
• GCC is executed 700+ times
• The input is reduced to 10% of the initial input
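The 1-minimal property mentioned above can be checked mechanically: the input must fail, and removing any single element must make the failure disappear. A minimal Python sketch follows; `is_one_minimal` and the oracle `fails` are hypothetical names, and the oracle here is a made-up stand-in rather than a real compiler run.

```python
def is_one_minimal(elements, fails):
    """True if `elements` fails but dropping any one element makes it pass.

    Works for any sliceable sequence (a list of elements or a string of
    characters). `fails` is a hypothetical test oracle.
    """
    if not fails(elements):
        return False
    return all(
        not fails(elements[:i] + elements[i + 1:])
        for i in range(len(elements))
    )


# Made-up oracle: the failure needs elements 3 and 7 together.
def needs_3_and_7(subset):
    return {3, 7}.issubset(subset)


minimal = is_one_minimal([3, 7], needs_3_and_7)
not_minimal = is_one_minimal([3, 4, 7], needs_3_and_7)
```

Note that 1-minimal is a local property: it rules out removing one element at a time, not removing several elements at once.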

26 Another example: GDB
• GDB is the debugger from GNU
• It was updated from 4.16 to 4.17
• Version 4.17 is no longer compatible with DDD (a GUI for GNU software development tools)
• 178,000 lines of code changed from 4.16
• How do we know which code change(s) cause the failure?

27 Results
• After a lot of work (by machine)
• 178 KLOC of changes grouped into 8700 groups (commits)
• Use delta debugging
• Worked it out in 470 tests
• It took 48 hours
• Doing this by hand would be a nightmare!

28 Importance of input elements
• It is important to have a good definition of input elements
  • So that subsets of the input elements are valid inputs
  • So that the size of the input is small
• Consider the examples
  • GCC example: we use characters as elements, which is simple but not so good. If the bug happens after the parser, the bug is not monotonic because of syntax errors
  • GDB example: we group lines of code into groups to reduce the input size to 5% of the original size. 2 days are acceptable, but what about 40 days?

29 Limitations of Delta debugging
• Relies on the assumptions
  • Monotonicity does not always hold
• Relies on good input elements; always providing valid inputs improves efficiency
• Requires automatic test oracles
  • Good for regression testing
  • Not good for development-time testing

30 Statistical Debugging
• Delta Debugging
  • Narrow down the input to be considered
• Statistical Debugging
  • Narrow down the code to be considered

31 Statistical Debugging
• Basic idea
  • Consider a number of test cases, some of which pass and some of which fail
  • If a statement is covered mostly by failed test cases, it is highly likely to be the buggy part of the code

32 Tarantula
• A classical tool for statistical debugging
• Uses the following formulas
  • Color = red + pass / (fail + pass) * (green)
  • Brightness = max(pass, fail)
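The two formulas above can be turned into a small Python sketch. The assumptions here are not stated on the slide: `pass` and `fail` are read as the fractions of passing and failing tests that execute the statement, colors are expressed as hue angles with red at 0 and green at 120 degrees, and the function name `tarantula` is hypothetical.

```python
RED = 0.0      # assumed hue for "most suspicious"
GREEN = 120.0  # assumed hue for "least suspicious"


def tarantula(passed_cov, failed_cov, total_passed, total_failed):
    """Return (hue, brightness) for one statement, or None if never executed.

    passed_cov / failed_cov: number of passing / failing tests that execute
    the statement; total_passed / total_failed: sizes of the test suite parts.
    """
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    if pass_ratio + fail_ratio == 0:
        return None  # no test covers the statement, so it gets no color
    hue = RED + pass_ratio / (pass_ratio + fail_ratio) * (GREEN - RED)
    brightness = max(pass_ratio, fail_ratio)
    return hue, brightness


# Made-up suite with 4 passing and 2 failing tests.
only_failing = tarantula(0, 2, 4, 2)   # executed by failing tests only
only_passing = tarantula(4, 0, 4, 2)   # executed by passing tests only
mixed = tarantula(2, 1, 4, 2)          # executed by half of each group
```

A statement executed only by failing tests lands on pure red, one executed only by passing tests on pure green, and the brightness shows how much of the test suite supports that verdict.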

33 Tarantula: Illustration

34 Context-based statistical debugging
• Do not just consider a single statement
• Runtime Control Flow Graph
  • Also consider connections
  • Outcomes of branches
  • Connections on a runtime CFG

35 Runtime Control Flow Graph
void replaceFirst (sx, sy) {
  for (int i=0;i<len;i++) {
    if (arr[i]==sx){
      arr[i] = sz;
      //should break;
    }
    if (arr[i]==sy){
      arr[i] = sz;
      //should break;
    }
  }
}
[Figure: runtime control flow graphs, labeled pass and fail]

36 Limitations
• Questions:
  • If a statement is covered only by passed test cases, can it be the root cause of the bug found?
  • If a statement is covered only by failed test cases, must it be the root cause of the bug found?

37 Example
void f(int a, int b){
  if (a > 0){ //error: should be >=
    do something;
  }
  if (b < 0){
    do something
  }
}
Test Cases: 3, 2 2, 1, 0, -1 2, 0

38 Dynamic Slicing
• Another way to narrow down the code to be considered in debugging

39 Data Dependencies
• A data dependence goes from a use of a variable to the definition of that variable
• Example:
s1: x = 3;
s2: if(y > 5){
s3:   y = y + x; //data depend on x in s1
s4: }

40 Control Dependencies
• A control dependence goes from the basic blocks of a branch to the predicate that decides whether they execute
• Example:
s1: x = 3;
s2: if(y > 5){
s3:   y = y + x; //control depend on y in s2
s4: }

41 Dynamic Slicing
• Describes dependencies among code elements
• If a variable has an incorrect value, the bug should be in its backward dynamic slice
• Similar to the runtime control flow graph
  • A mapping from the static slice to the executed code

42 Algorithm
• A dependence edge is introduced from a load to a store if, at least once during execution, the value stored by the store is actually read by the load (mark the dependence edge)
• No static analysis is needed.

43 Algorithm II Example
1: b=0
2: a=2
3: 1 <= i <= N
4: if ((i++)%2==1)
5: a=a+1
6: b=a*2
7: z=a+b
8: print(z)
For input N=1, the trace is: 1, 2, 3, 4, 5, 7, 8 (each statement instance executed once)
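A backward dynamic slice over such a trace can be computed by recording, for each executed statement, which earlier trace entry last defined the variables it reads and which predicate it is control dependent on. The Python sketch below is an illustration, not the algorithm from the lecture: the trace encoding (statement number, defined variables, used variables, index of the controlling trace entry) and the def/use sets given for the example are assumptions.

```python
def backward_dynamic_slice(trace, criterion_index):
    """Return the statement numbers that the trace entry at criterion_index
    depends on, directly or transitively, including itself.

    trace: list of (stmt, defs, uses, control_parent_index) tuples
           (hypothetical encoding assumed for this sketch)
    """
    last_def = {}      # variable name -> index of the entry that last wrote it
    dependencies = []  # per trace entry: indices of entries it depends on
    for index, (stmt, defs, uses, control_parent) in enumerate(trace):
        deps = {last_def[v] for v in uses if v in last_def}  # data dependences
        if control_parent is not None:
            deps.add(control_parent)                         # control dependence
        dependencies.append(deps)
        for v in defs:
            last_def[v] = index
    # Follow dependence edges backwards from the slicing criterion.
    seen, worklist = set(), [criterion_index]
    while worklist:
        index = worklist.pop()
        if index not in seen:
            seen.add(index)
            worklist.extend(dependencies[index])
    return sorted({trace[i][0] for i in seen})


# Trace for N=1 from the example above; def/use sets are an assumed reading.
trace = [
    (1, {"b"}, set(), None),
    (2, {"a"}, set(), None),
    (3, {"i"}, set(), None),
    (4, {"i"}, {"i"}, 2),     # the branch runs under the loop entry at index 2
    (5, {"a"}, {"a"}, 3),     # executed because the branch at index 3 was true
    (7, {"z"}, {"a", "b"}, None),
    (8, set(), {"z"}, None),
]
slice_of_print = backward_dynamic_slice(trace, 6)
slice_of_branch = backward_dynamic_slice(trace, 3)
```

For this short trace the slice of `print(z)` contains every executed statement, while the slice of the branch contains only statements 3 and 4; statement 6 never appears because it was not executed.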

44 Efficiency: Summary
For an execution of 130M instructions:
• Space requirement: reduced from 1.5GB to 94MB (I further reduced the size by a factor of 5 by designing a generic compression technique [MICRO’05])
• Time requirement: reduced from >10 minutes to <30 seconds
http://jslice.sourceforge.net/

45 Summary of debugging
• Debugging is a follow-up step of testing
• Bug localization and bug fixing are tasks that depend heavily on human intelligence
• Tools can help us narrow the scope to consider
  • Bug localization: reduce the code to be considered
  • Delta debugging: reduce the inputs to be considered

