08/05/2015, Dr Andy Brooks
MSc Software Maintenance (MS Viðhald hugbúnaðar), Lectures 15 & 16
Programmers Use Slices When Debugging
Case Study (Icelandic: dæmisaga)
Reference: Mark Weiser, "Programmers Use Slices When Debugging", Communications of the ACM, Volume 25, Number 7, pp. 446-452, 1982.
The basic debugging method
Reading 1 million lines of code, from beginning to end, to locate and remove a bug is not efficient.
– At 100 LOC/day, that takes 10,000 days.
– At 1,000 LOC/day, that takes 1,000 days.
The basic debugging method is to begin at the statement where the error appears and then reason backwards about the previous sequence of statements.
Reasoning backwards
Reasoning backwards to determine all the influences on a variable usually reveals that many statements in the program have no influence on it. Sometimes the reasoning leads all the way back to the hardware or the translation software.
Program Slicing (Icelandic: að sneiða, "to slice")
“The process of stripping a program of statements without influence on a given variable at a given statement is called program slicing.”
“An elementary slicing criterion of a program P is a tuple <i, V> where i denotes a specific statement in P and V is a subset of variables in P.”
A program and a program slice
   1  BEGIN
   2  READ(X,Y)
   3  TOTAL:=0.0
   4  SUM:=0.0
   5  IF X<=1
   6    THEN SUM:=Y
   7    ELSE BEGIN
   8      READ(Z)
   9      TOTAL:=X*Y
  10    END
  11  WRITE(TOTAL,SUM)
  12  END.
Slice on Z at statement 12:
  BEGIN
  READ(X,Y)
  IF X<=1
    THEN
    ELSE READ(Z)
  END.
TOTAL, SUM and Y have no influence on Z.
A program and a program slice
   1  BEGIN
   2  READ(X,Y)
   3  TOTAL:=0.0
   4  SUM:=0.0
   5  IF X<=1
   6    THEN SUM:=Y
   7    ELSE BEGIN
   8      READ(Z)
   9      TOTAL:=X*Y
  10    END
  11  WRITE(TOTAL,SUM)
  12  END.
Slice on X at statement 9:
  BEGIN
  READ(X,Y)
  END.
A program and a program slice
   1  BEGIN
   2  READ(X,Y)
   3  TOTAL:=0.0
   4  SUM:=0.0
   5  IF X<=1
   6    THEN SUM:=Y
   7    ELSE BEGIN
   8      READ(Z)
   9      TOTAL:=X*Y
  10    END
  11  WRITE(TOTAL,SUM)
  12  END.
Slice on TOTAL at statement 12:
  BEGIN
  READ(X,Y)
  TOTAL:=0.0
  IF X<=1
    THEN
    ELSE TOTAL:=X*Y
  END.
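The three slices above can be reproduced mechanically. The following is a minimal Python sketch, not Weiser's own algorithm as published: it assumes each statement is hand-annotated with the variables it defines and uses and with its controlling IF, computes reaching definitions over a hand-written control-flow graph, and then follows data and control dependences backwards from the criterion <i, V>. The names PROGRAM, EDGES, reaching_definitions and slice_on are illustrative, and the structural lines (BEGIN, ELSE, END) are left out of the statement table.

```python
from collections import defaultdict

# Statement number -> (text, variables defined, variables used, controlling IF).
# Numbers follow the slide; the annotations are written by hand (an assumption).
PROGRAM = {
    2: ("READ(X,Y)", {"X", "Y"}, set(), None),
    3: ("TOTAL:=0.0", {"TOTAL"}, set(), None),
    4: ("SUM:=0.0", {"SUM"}, set(), None),
    5: ("IF X<=1", set(), {"X"}, None),
    6: ("SUM:=Y", {"SUM"}, {"Y"}, 5),
    8: ("READ(Z)", {"Z"}, set(), 5),
    9: ("TOTAL:=X*Y", {"TOTAL"}, {"X", "Y"}, 5),
    11: ("WRITE(TOTAL,SUM)", set(), {"TOTAL", "SUM"}, None),
    12: ("END.", set(), set(), None),
}
# Control-flow edges between statements.
EDGES = [(2, 3), (3, 4), (4, 5), (5, 6), (5, 8), (8, 9),
         (6, 11), (9, 11), (11, 12)]


def reaching_definitions():
    """Return, per statement, the (variable, defining statement) pairs
    that may reach the point just before that statement executes."""
    preds = defaultdict(list)
    for src, dst in EDGES:
        preds[dst].append(src)
    reach_in = {n: set() for n in PROGRAM}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for n in sorted(PROGRAM):
            new_in = set()
            for p in preds[n]:
                defs_p = PROGRAM[p][1]
                # out[p] = definitions made by p, plus incoming ones p does not kill
                new_in |= {(v, p) for v in defs_p}
                new_in |= {(v, s) for v, s in reach_in[p] if v not in defs_p}
            if new_in != reach_in[n]:
                reach_in[n] = new_in
                changed = True
    return reach_in


def slice_on(stmt, variables):
    """Statements that may influence `variables` just before `stmt`."""
    reach_in = reaching_definitions()
    worklist = [s for v, s in reach_in[stmt] if v in variables]
    in_slice = set()
    while worklist:
        s = worklist.pop()
        if s in in_slice:
            continue
        in_slice.add(s)
        _, _, uses, control = PROGRAM[s]
        # Data dependences: definitions of the variables this statement uses.
        worklist += [d for v, d in reach_in[s] if v in uses]
        # Control dependence: the IF that decides whether s executes.
        if control is not None:
            worklist.append(control)
    return sorted(in_slice)


for criterion in [(12, {"Z"}), (9, {"X"}), (12, {"TOTAL"})]:
    print(criterion, [PROGRAM[s][0] for s in slice_on(*criterion)])
```

Under these assumptions the sketch selects statements 2, 5 and 8 for Z at statement 12, statement 2 alone for X at statement 9, and statements 2, 3, 5 and 9 for TOTAL at statement 12, which matches the slices shown on the slides.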
Experimental Hypothesis H1 (Icelandic: tilgáta, "hypothesis")
“... debugging programmers, working backwards from the variable and statement of a bug's appearance, use that variable and statement as a slicing criterion to construct mentally the corresponding program slice.”
Experimental Hypothesis H2
“... programmers look at code only in contiguous pieces.”
“Slices are generally not contiguous pieces, but contain statements scattered throughout the code.”
contiguous (Icelandic: aðlægur)
[Diagram: a column of program lines in which the slice statements (xxxxxx) alternate with statements outside the slice (----------).]
Method
1. Programmers debug three programs.
2. The programmers' memory of various code fragments is then tested, particularly their memory of the program slice relevant to the bug.
“If the programmers did slice, then their memories for the relevant slices should be at least as good as their memories of contiguous code, and somewhat better than their memories of other non-contiguous code.”
It is important to recognise that programmers were not observed working with the programs. Their actions and the program statements they considered were not recorded.
Testing programmers' memory is an indirect measurement.
– And you may not be measuring what you think you are measuring...
Materials
Three programs written in Algol-W
Program sizes from 75 to 150 lines of code
Program TALLY
– an IBM scientific subroutine
– poorly structured, with non-mnemonic variable names
Program PAYROLL
– written for the experiment
– computes salaries and deductions
– well structured, with mnemonic variable names
Program EVADE
– written for the experiment
– simulation of random aircraft turns
– well structured, with mnemonic variable names
Program bugs
Program    Original code                  Bug
EVADE      LEFTOT:=LEFTOT-HORT*THRUST     LEFTOT:=HORT*THRUST
TALLY      SCNT:=SCNT+1.0                 SCNT:=SCNT-1.0
PAYROLL    EXEMPTHOURS:=OVERTIMEPAY:=0    OVERTIMEPAY:=0.0
The bugs were chosen so that the entire experiment could be completed in less than an hour.
Five types of program fragment were shown to the programmers:
1. Relevant slice
2. Relevant contiguous
   – overlapped the relevant slice
3. Irrelevant contiguous
   – did not overlap the relevant contiguous fragment
   – did not overlap the relevant slice
   – program TALLY had no irrelevant contiguous fragment
4. Irrelevant slice
5. Jumble
   – every 3rd or 4th statement
Fragment overlap: relevant slice & relevant contiguous
Program    Overlap
EVADE      0.75
TALLY      0.58
PAYROLL    0.33
Overlap is the fraction of statements shared by two fragments.
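The slide does not say which total the shared statements are divided by. As a sketch only, the function below assumes the denominator is the number of distinct statements in either fragment (shared statements over the union); the function name overlap and the statement numbers in the example are made up for illustration and are not taken from the experiment.

```python
def overlap(fragment_a, fragment_b):
    """Fraction of statements shared by two fragments.

    Assumption: shared statements divided by all distinct statements in
    either fragment. The slide does not state the denominator.
    """
    a, b = set(fragment_a), set(fragment_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical statement numbers, for illustration only.
slice_fragment = [2, 3, 5, 9]
contiguous_fragment = [3, 4, 5, 6]
print(round(overlap(slice_fragment, contiguous_fragment), 2))
```

If the overlap was instead measured relative to the size of one fragment, only the denominator changes.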
Syntactic changes
Syntactic changes were made to the code fragments to prevent recognition by a particular detail:
– Variables and constants in the fragments were renamed as single letters followed by a unique number.
– Indenting was adjusted from the original program to a form internally consistent with each fragment.
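The renaming step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure Weiser used: the letter V for variables and C for constants, the keyword list, and the function name anonymise are all hypothetical choices.

```python
import re

# Words left untouched; this short keyword list is an assumption.
KEYWORDS = {"BEGIN", "END", "IF", "THEN", "ELSE", "READ", "WRITE"}
# An identifier, or a numeric constant such as 1 or 0.0.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+(?:\.\d+)?")


def anonymise(fragment):
    """Rename each variable and constant to a letter plus a unique number."""
    names = {}

    def rename(match):
        token = match.group(0)
        if token.upper() in KEYWORDS:
            return token
        if token not in names:
            prefix = "C" if token[0].isdigit() else "V"
            names[token] = f"{prefix}{len(names) + 1}"
        return names[token]

    return TOKEN.sub(rename, fragment)


print(anonymise("IF X<=1 THEN SUM:=Y ELSE TOTAL:=X*Y"))
# prints: IF V1<=C2 THEN V3:=V4 ELSE V5:=V1*V4
```

A repeated name always maps to the same replacement, so the data flow within a fragment stays readable even though the mnemonic names are gone.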
Participants (Icelandic: þátttakendur)
Experienced Algol-W programmers
Graduate student teaching assistants
– all from the University of Michigan in Ann Arbor
26 volunteers
– 4 participated in pilot studies
– 1 did not follow instructions in the experiment
– 21 final participants
Pilot studies are conducted:
– To check that experimental materials are in order.
   Instructions are clear.
– To check that experimental processes are sound.
   There is sufficient time to complete tasks.
   Participants behave in the way expected.
Weiser reports that pilot studies were conducted but does not report what actions were taken as a result. Any actions taken should be briefly reported.
Procedure
Participants were given all three programs to debug in random order.
Participants were then asked to rate 14 program fragments for how sure they were that the fragment had been used in one of the three programs.
– Remember, program TALLY had no irrelevant contiguous fragment (3*5-1 = 14).
Code fragments were given in random order, each on a separate page with its rating scale.
Participants were told not to look back either at the programs or at previously rated code fragments.
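The count of 14 fragments and the random presentation order can be sketched as follows. The function names and the optional seed parameter are illustrative assumptions; the slides do not say how the random order was actually produced.

```python
import random

PROGRAMS = ["TALLY", "PAYROLL", "EVADE"]
FRAGMENT_TYPES = [
    "relevant slice",
    "relevant contiguous",
    "irrelevant contiguous",
    "irrelevant slice",
    "jumble",
]


def build_fragment_list():
    """All program/fragment-type pairs except TALLY's missing fragment."""
    return [
        (program, kind)
        for program in PROGRAMS
        for kind in FRAGMENT_TYPES
        if not (program == "TALLY" and kind == "irrelevant contiguous")
    ]


def presentation_order(seed=None):
    """Shuffle the fragments for one participant (seed is optional)."""
    rng = random.Random(seed)
    fragments = build_fragment_list()
    rng.shuffle(fragments)
    return fragments


print(len(build_fragment_list()))  # 3*5 - 1 = 14
```

Shuffling per participant spreads any order effects (for example, fatigue on later pages) across fragment types instead of concentrating them on one.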
Part of the relevant slice for PAYROLL
[Figure: code listing, not reproduced in this transcript.]
Fragment shown to participants, with its recognition rating scale
[Figure: code fragment and rating scale, not reproduced in this transcript.]
Results
All 21 participants found the bugs in TALLY and EVADE but only 17 found the bug in PAYROLL.
Table IV: Debugging times (minutes)
Program    Mean    Standard deviation
TALLY      13.0    6.9
EVADE       8.0    6.0
PAYROLL     9.2    3.1
Results
A two-way analysis of variance using Friedman's test indicated an overall difference in the ratings of the different fragments.
– The two factors were fragment type and program type.
Results
Figure 3: recognition by fragment type. [Chart not reproduced; values annotated on the slide: 28%, 24%, 54%.]
Why is recognition so high?
Significant differences
Wilcoxon matched-pairs signed-ranks test
– The difference between relevant slices and irrelevant slices is significant at the 0.03 level.
– The difference between relevant slices and jumbles is very significant at the 0.005 level.
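For readers unfamiliar with the test, here is a minimal pure-Python sketch of the matched-pairs signed-ranks statistic. It computes only the rank sums, not the significance level, and the ratings in the example are invented for illustration; they are not Weiser's data.

```python
def wilcoxon_signed_rank(xs, ys):
    """Return (W+, W-): rank sums of the positive and negative differences.

    Pairs with a zero difference are dropped and tied absolute
    differences share their average rank.
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical ratings from six participants (NOT data from the paper).
relevant_slice = [4, 3, 4, 2, 4, 3]
jumble = [2, 3, 1, 3, 2, 1]
print(wilcoxon_signed_rank(relevant_slice, jumble))  # (14.0, 1.0)
```

The smaller of the two rank sums is then compared against a critical value for the number of non-zero pairs. The test suits this design because each participant rated both fragment types, so the ratings come in matched pairs.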
Results
Irrelevant contiguous was recognised because the programs were small and the irrelevant contiguous fragments were close to the output statements which wrote the incorrect variable values.
– Participants would likely have examined code around these output statements.
Results
Figure 4: recognition by fragment type and program type. [Chart not reproduced in this transcript.]
Results: Figure 4
TALLY shows the greatest recognition of the relevant slice fragment. Because TALLY was poorly structured (many GOTOs), perhaps more programmers adopted a slicing strategy to debug it.
Results: Table V
To conclude the experiment, participants were asked about the typicalness of the programs and the bugs. Table V shows that the mean ratings were at least 2.4 on a 1 to 4 scale.
– 4 meant “very typical”
– 1 meant “not at all typical”
Weiser reasonably concluded that no program was especially atypical.
Examples of slices: Figure 6
Slices that are large in relation to the program (e.g. 563/662 statements) are less useful to the program maintainer.
Implications
Tools that automatically generate program slices can help maintainers debug faulty code.
Novice programmers should be taught the concept of slicing.
Today, researchers study many different kinds of slicing techniques. Dynamic slicing makes use of knowledge about the input, and this can greatly reduce the size of slices.
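To see why knowing the input shrinks a slice, consider the 12-statement example program from the earlier slides. The sketch below is a simplified illustration of dynamic slicing, not any particular published algorithm: it follows one execution, records for each variable the statements its current value actually depended on, and reports that set for TOTAL. The function names and the hand-written trace are assumptions, and the approach as written only handles loop-free code.

```python
def run_example(x):
    """Execution trace of the example program for a given value of X.

    Each entry is (statement, variables defined, variables used,
    controlling IF). The entries are written by hand for illustration.
    """
    trace = [
        (2, {"X", "Y"}, set(), None),   # READ(X,Y)
        (3, {"TOTAL"}, set(), None),    # TOTAL:=0.0
        (4, {"SUM"}, set(), None),      # SUM:=0.0
        (5, set(), {"X"}, None),        # IF X<=1
    ]
    if x <= 1:
        trace.append((6, {"SUM"}, {"Y"}, 5))             # SUM:=Y
    else:
        trace.append((8, {"Z"}, set(), 5))               # READ(Z)
        trace.append((9, {"TOTAL"}, {"X", "Y"}, 5))      # TOTAL:=X*Y
    trace.append((11, set(), {"TOTAL", "SUM"}, None))    # WRITE(TOTAL,SUM)
    return trace


def dynamic_slice(trace, variable):
    """Statements the final value of `variable` depended on in this run."""
    value_deps = {}  # variable -> statements behind its current value
    stmt_deps = {}   # executed statement -> statements it depended on
    for stmt, defs, uses, control in trace:
        deps = {stmt}
        for v in uses:
            deps |= value_deps.get(v, set())
        if control is not None:
            deps |= stmt_deps[control]
        stmt_deps[stmt] = deps
        for v in defs:
            value_deps[v] = deps
    return sorted(value_deps.get(variable, set()))


print(dynamic_slice(run_example(0), "TOTAL"))  # [3]
print(dynamic_slice(run_example(5), "TOTAL"))  # [2, 5, 9]
```

The static slice on TOTAL at statement 12 contains statements 2, 3, 5 and 9. For an input with X<=1 the dynamic slice in this sketch is statement 3 alone, because the ELSE branch never ran.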
Slicing or not?
“Because the relevant slice fragment overlapped the relevant sequential fragment in each program, this experiment gives no absolute assurance that relevant slices were not recognised only because of that overlap.”
Table II indicates that recognition ratings between relevant slice fragments and relevant sequential fragments are poorly correlated. This suggests that participants could have been recognising relevant slice fragments because they had indeed been slicing, but...
In experimental work it is better to measure directly than indirectly.
Nowadays, it is possible to build and use tools that record all user actions and so help establish whether program slicing occurred.
Even in Weiser's day, he could have recorded participants speaking their thoughts and actions aloud and then analysed the recordings to help establish whether program slicing had occurred.
At the very least, Weiser should have asked his participants at the end of the experiment what actions they performed to debug the programs.
Because the programs were so small, it is quite possible that relevant slice recognition occurred because (some or all) participants had simply read all the code involved.
It would be interesting to know what the recognition rates would have been if the fragments shown to participants had not been syntactically altered.
You never really know what is going on inside someone's head.