MSc Software Maintenance, Lectures 27 & 28: Debugging with the Whyline tool
Dr Andy Brooks, 10/06/2015

Case Study
Reference: Debugging Reinvented: Asking and Answering Why and Why Not Questions about Program Behavior, Andrew J. Ko and Brad A. Myers, ICSE'08, pp , ©ACM

1. Introduction
When program behaviour is incorrect, software engineers must think of questions to ask about the code. Often they simply guess.
– “Is this double increment caused by a typo somewhere, a ‘2’ perhaps instead of a ‘1’?”
Studies have reported that initial guesses are wrong almost 90% of the time.
– The double increment was actually caused by faulty program logic which resulted in the incrementing method being called twice.
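The double-increment case can be made concrete with a small sketch. The `Counter` class and the `on_click` and `on_release` handlers are hypothetical and not taken from the case study; they only illustrate how a plausible first guess (a literal `2` typed instead of `1`) can be wrong while the real cause is that the incrementing method is reached twice.

```python
class Counter:
    """Hypothetical counter; every call to increment() is logged."""

    def __init__(self):
        self.value = 0
        self.calls = []  # names of the callers that reached increment()

    def increment(self, caller):
        self.calls.append(caller)
        self.value += 1  # the literal is correct: no '2' typed for '1'


def on_release(counter):
    counter.increment("on_release")


def on_click(counter):
    counter.increment("on_click")
    on_release(counter)  # faulty logic: the release handler also runs here


counter = Counter()
on_click(counter)  # a single user action
print(counter.value, counter.calls)
```

Inspecting the literal in `increment` would not explain the behaviour; the recorded calls show that one user action reached the method from two places.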

1. Introduction
Breakpoint debuggers require the software engineer to choose a line of code.
– to examine program state at a particular time
Slicing tools also require the software engineer to choose a seed variable or statement.
– to display all the code that has an influence
If the wrong variable or wrong line of code is chosen, then tool output can be irrelevant to solving the problem.
– garbage in, garbage out
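The dependence of a slice on its seed can be illustrated with a toy backward slicer. This is a minimal sketch, assuming a hypothetical straight-line program in which every variable is assigned exactly once; real slicing tools handle control flow, aliasing, and much more.

```python
# Hypothetical five-statement program: line -> (variable defined, variables used).
PROGRAM = {
    1: ("width", set()),       # width = read_input()
    2: ("colour", set()),      # colour = read_input()
    3: ("area", {"width"}),    # area = width * width
    4: ("label", {"colour"}),  # label = "pen: " + colour
    5: ("report", {"area"}),   # report = str(area)
}


def backward_slice(program, seed_line):
    """Return the lines that can influence the statement at seed_line."""
    in_slice = {seed_line}
    worklist = [seed_line]
    while worklist:
        line = worklist.pop()
        _, used = program[line]
        for earlier in range(1, line):
            defined, _ = program[earlier]
            if defined in used and earlier not in in_slice:
                in_slice.add(earlier)
                worklist.append(earlier)
    return sorted(in_slice)


print(backward_slice(PROGRAM, 5))  # seed: the statement computing report
print(backward_slice(PROGRAM, 4))  # seed: the statement computing label
```

If the faulty output is `report` but the engineer seeds the slice at line 4, the result contains none of the statements that matter, which is the garbage-in, garbage-out problem in miniature.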

1. Introduction
Whyline
A new kind of program understanding and debugging tool.
Whyline allows the user to choose a "why did" or "why didn't" question about program output.
Whyline then generates an answer to the question using various program analyses.
– static and dynamic slicing, precise call graphs, “new algorithms”
– chains of events as explanations
Whyline works with Java programs that use standard Java I/O and that do not run “too long”.

2. An Example
Simple painting application
The user demonstrates the program behaviour they want to inquire about (1).
When the program halts, Whyline loads the trace.
Using a time controller (2), the user finds the point in time they want to ask about.
The user clicks on something of interest and questions pop up about it (3). The user selects a question.
Whyline determines the responsible execution sequence and the user can select from a list of pop-up questions (4).
Whyline determines the instantiation event (5) and the corresponding source code is shown (6).
The call stack and locals at the time of the selected event are also shown (7).

Figure 1. ©ACM (slide annotations: green not blue slider used interactive debugging)

2. An Example
Using Whyline, time spent debugging was halved because:
– people do not have to guess search terms or understand the resulting matches
– people do not have to set breakpoints
Using Whyline, people “simply pointed to something that they knew was relevant and wrong, and let the Whyline determine the related evidence”.
Watch the Whyline videos.

3.1 Recording an Execution Trace
The Whyline takes a postmortem approach to debugging by capturing a trace.
A trace stores Java source files, instrumented class files, sequences of events in each thread, and other types of metadata.
Each thread has a separate trace file for its events.
Currently 55 types of events are defined in the Whyline.
Events include values after their header to help developers interpret program state.
– for an assignment event, the value assigned is included
– for a method invocation event, values passed as arguments are included
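One way to picture an event record is a header followed by the values that belong to it. The sketch below is an assumption for illustration only: the field names, the `describe` helper, and the two event kinds shown are hypothetical and do not reproduce the Whyline's actual trace format or its 55 event types.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TraceEvent:
    event_id: int       # position in the ordering of the whole execution
    thread: str         # each thread's events go to its own trace file
    kind: str           # e.g. "assignment" or "invocation"
    target: str         # hypothetical: the field or method concerned
    values: Tuple = ()  # values recorded after the header


def describe(event):
    """Render an event the way a developer might want to read it."""
    if event.kind == "assignment":
        return f"{event.target} = {event.values[0]!r}"
    if event.kind == "invocation":
        args = ", ".join(repr(v) for v in event.values)
        return f"{event.target}({args})"
    return f"{event.kind} {event.target}"


assignment = TraceEvent(7, "main", "assignment", "width", (251,))
invocation = TraceEvent(8, "main", "invocation", "drawRect", (10, 20))
print(describe(assignment))
print(describe(invocation))
```

Keeping the assigned value or the argument values next to the event header is what lets a later analysis report program state without re-running the program.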

3.1 Recording an Execution Trace
Unexecuted classes referenced by a dynamically loaded class are also saved as part of the trace to help answer "why didn't" questions.
– This is not applied recursively as this would “likely include all known classes”.

3.2 Loading a trace
All source files and class files are loaded.
– used for almost every aspect of question derivation and answering
Whyline constructs lists of output instructions, which are used as the basis for generating questions.
Whyline generates a call graph from the invocations found in the class files.
Then events are loaded in order of their event IDs.
– Whyline has a “complete ordering of the events in the execution.”
“To improve the performance of question derivation and answering, the Whyline constructs lists of invocations, assignments to fields, and other types of events.”
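The loading step can be sketched as a merge of per-thread event sequences by event ID, followed by building per-kind lists. This is a minimal sketch under stated assumptions: events are hypothetical `(event_id, kind, target)` tuples, each thread's list is already sorted, and event IDs are unique across threads.

```python
import heapq
from collections import defaultdict

# Hypothetical per-thread traces; each list is sorted by event ID.
THREAD_TRACES = {
    "main": [(1, "invocation", "paint"), (4, "assignment", "colour")],
    "awt": [(2, "invocation", "drawRect"), (3, "assignment", "width")],
}


def load_events(thread_traces):
    """Merge per-thread events into one ordering and index them by kind."""
    ordered = list(heapq.merge(*thread_traces.values()))
    by_kind = defaultdict(list)
    for event in ordered:
        by_kind[event[1]].append(event)
    return ordered, by_kind


ordered, by_kind = load_events(THREAD_TRACES)
print([event_id for event_id, _, _ in ordered])
print(by_kind["assignment"])
```

The per-kind lists play the role of the lookup lists mentioned above: a question about a field only needs to scan assignments, not every event.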

3.3 Creating an I/O History
From the low-level event information recorded in traces, Whyline constructs a user interface for navigating the output history.
A user can move backwards and forwards in time.
The selected input time T determines what events are visible on the screen.
(snapshots from QuickTime video)
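Selecting what is visible at time T can be sketched as a prefix lookup over the ordered output events. The sketch assumes, purely for illustration, that T is expressed as an event ID and that every output event up to T remains visible; the event names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical output events, sorted by event ID.
OUTPUT_EVENTS = [(3, "drawLine"), (8, "drawRect"), (15, "setColor"), (21, "drawLine")]


def visible_at(output_events, t):
    """Return the output events that occurred at or before time t."""
    ids = [event_id for event_id, _ in output_events]
    return output_events[:bisect_right(ids, t)]


print(visible_at(OUTPUT_EVENTS, 10))
```

Moving the time controller backwards or forwards then amounts to calling `visible_at` with a smaller or larger T.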

3.3 Creating an I/O History (tracking dependencies)
Whyline finds fields and invocations that could have influenced output.
– “For example, the color of a rectangle might be affected by some field in an object, or by the return value of a call to some method.”
If an output instruction directly invokes output rather than simply influencing it (e.g. drawing a rectangle rather than setting the rectangle's colour), Whyline marks all the potential indirect callers as output invoking.
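Marking indirect callers as output invoking can be sketched as reverse reachability over a call graph. The graph and method names below are hypothetical, and the breadth-first search is a stand-in chosen for illustration; the case study does not specify the algorithm the Whyline uses.

```python
from collections import deque

# Hypothetical call graph: caller -> callees.
CALL_GRAPH = {
    "main": ["handleMouse", "loadSettings"],
    "handleMouse": ["paintStroke"],
    "paintStroke": ["Graphics.drawLine"],
    "loadSettings": ["readFile"],
    "readFile": [],
    "Graphics.drawLine": [],
}


def output_invoking(call_graph, output_methods):
    """Return the output methods plus every direct or indirect caller."""
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    marked = set(output_methods)
    queue = deque(output_methods)
    while queue:
        method = queue.popleft()
        for caller in callers.get(method, ()):
            if caller not in marked:
                marked.add(caller)
                queue.append(caller)
    return marked


print(sorted(output_invoking(CALL_GRAPH, ["Graphics.drawLine"])))
```

Here `loadSettings` and `readFile` stay unmarked because no path from them reaches the drawing call.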

Deriving questions (see Figure 3)
(1) Why did property = value? (refers to value passed to output call)

Deriving questions (see Figure 3)
(6) Why didn't an instance of class C appear? (refers to instantiations of C)
"Why didn't" questions “support questions about output that has no representative output to click on”.
Whyline has a "why didn't" question for each familiar class that has output-invoking methods (not output-influencing), inherited or declared.
– “A class is familiar if user owned code either defines or references the specific class.”
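The familiarity rule can be sketched as a set computation. All class names below are hypothetical, and the set of classes with output-invoking methods is simply given as input rather than derived from a call graph.

```python
# Hypothetical user-owned classes mapped to the classes they reference.
USER_CLASSES = {
    "PaintWindow": {"JSlider", "PencilPaint"},
    "PencilPaint": {"Graphics"},
}

# Hypothetical: classes that have output-invoking methods.
OUTPUT_INVOKING_CLASSES = {"PencilPaint", "Graphics", "JSlider", "JTable"}


def familiar_classes(user_classes):
    """A class is familiar if user-owned code defines or references it."""
    familiar = set(user_classes)
    for referenced in user_classes.values():
        familiar |= referenced
    return familiar


def why_didnt_questions(user_classes, output_invoking_classes):
    candidates = familiar_classes(user_classes) & output_invoking_classes
    return [f"Why didn't an instance of {name} appear?" for name in sorted(candidates)]


for question in why_didnt_questions(USER_CLASSES, OUTPUT_INVOKING_CLASSES):
    print(question)
```

`JTable` produces no question because user code never mentions it, and `PaintWindow` produces none because it has no output-invoking methods in this toy setup.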

Deriving questions (see Figure 3)
(4) Why did object get created? (refers to instantiation of object)
(5) Why didn't method execute after time T? (refers to potential invocation instructions)

Deriving questions (see Figure 3)
(2) Why did field = value? (refers to assignment before T)
(3) Why didn't field's value change after time T? (refers to potential assignment instructions)
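The six question forms can be read as templates filled in with names taken from the code. The sketch below is illustrative only: the `TEMPLATES` table and `derive_questions` function are hypothetical and say nothing about how the Whyline actually decides which questions apply.

```python
# Question templates keyed by the kind of entity the user clicked on.
TEMPLATES = {
    "property": ["Why did {name} = {value}?"],
    "field": [
        "Why did {name} = {value}?",
        "Why didn't {name}'s value change after time T?",
    ],
    "object": ["Why did {name} get created?"],
    "method": ["Why didn't {name} execute after time T?"],
    "class": ["Why didn't an instance of {name} appear?"],
}


def derive_questions(kind, name, value=None):
    """Fill in the templates for one entity."""
    return [template.format(name=name, value=value) for template in TEMPLATES[kind]]


print(derive_questions("field", "width", 251))
print(derive_questions("field", "wd", 251))
```

Because the question text is built directly from identifiers, a cryptic field name such as `wd` yields an equally cryptic question.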

5. EVALUATION
5.1 Performance Feasibility
Performance tests were run on a 2 GHz Intel Core Duo MacBook Pro with 2 GB of RAM.
– standard OS X JVM, given a 1 GB heap
The Unix time command was used to measure time to a tenth of a second.
The case study article text says performance tests were run five times and the results averaged.
– Table 1 says tests were run 10 times and results averaged.
Execution times were measured for normal operation, profiling (using the profiler YourKit), and tracing using the Whyline.

Table 1 ©ACM
LOC calculated omitting whitespace lines.
Whyline's tracing is slower than profiling “because it instruments more code”.
Whyline's tracing time should improve once Whyline has been optimised.

Table 1 ©ACM
Compressed trace sizes compare favourably with those reported in dynamic slicing work.
Loading time is an issue.
The single biggest limiting factor is memory.
The larger traces resulted in garbage collection and virtual memory use.
– improvements in Whyline's memory management are needed

5. EVALUATION
Does Whyline scale?
A minute of user interaction with ArgoUML was tested.
– 35,597 I/O events
The output history is navigable at interactive speeds.
Clicking on an event produced a menu of questions at interactive speeds.

5. EVALUATION
5.2 Question Coverage
Does the Whyline provide questions that a user actually wants to ask?
Nine bug reports for the applications listed in Table 1 were chosen at random.
All but one bug report had a possible corresponding Whyline question.
– one bug report was a feature request
This evaluation did not test actual Whyline usage.
– Would the user actually locate the question and would Whyline's answer make any sense? “In future work, we will assess this issue in greater detail”

5. EVALUATION
5.3 User Study
A pilot evaluation was conducted with 9 participants having a variety of backgrounds:
– psychology, design, computer science, linguistics, food science, engineering
One participant had never seen a line of code. Another had programmed for more than 10 years.
The evaluation task was to resolve the slider bug using the Whyline.
Task performance was compared with 18 self-described Java experts who used Eclipse 2.1 to resolve the slider bug in a previous study [10].

5. EVALUATION
5.3 User Study
Participants received a short tutorial (1-2 minutes) on how to use the Whyline.
The blue slider's incorrect behaviour was demonstrated to participants.
– Participants were asked to find the cause of this incorrect behaviour.
Participants were allowed to ask about the user interface but not about the task or code.
The experimenter offered clarification if a user expressed confusion about the user interface.

5. EVALUATION
5.3 User Study

Times (minutes)   Minimum   Maximum   Median
Whyline           1         12        4
control group     3         38        10

Whyline participants were more than twice as fast as the Java experts (the control group).
– statistically significant difference, p < 0.05 (Wilcoxon rank-sum test)
The pilot evaluation has limited external validity.
– single task
– small sample size (n=9)

5. EVALUATION
5.3 User Study
Novices in the pilot evaluation tended to outperform the experts in the pilot evaluation.
– Often they asked aloud “Why is the line blue?” and used Whyline directly to have the question answered.
Experts in the pilot evaluation asked the same question, but they first speculated about the reason rather than using the Whyline directly.
– e.g. “Why didn't this slider's event get handled?”
One expert didn't expect that Whyline could make the connection between the slider and the color.

7. Limitations (of Whyline)
The Whyline tracing approach is practical only for executions lasting a few minutes.
Some bugs can only be reproduced without interference from instrumentation.
Loading traces feels “heavier” in comparison to breakpoint debugger use, which has virtually no setup time.
Cryptic names used for methods and fields will result in cryptic Whyline questions.
– ‘Why did wd = 251?’ rather than ‘Why did width = 251?’
Whyline helps find code related to a behaviour but does not explain how to change that behaviour.

8. Discussion
Whyline has no special knowledge about user interface toolkits or other APIs.
A user thinking “Why didn't this window change?” must choose a question like “Why didn't this JFrame's repaint() method get called?”
“It might be helpful if one could write plug-ins for the Whyline to add special knowledge and heuristics for certain APIs, to improve the specificity of questions and answers.”

8. Discussion
Modern applications can run across multiple platforms and can be written using multiple languages. How can traces be captured in such an environment?
Does Whyline need to provide support for people collaborating on bug fixing?

Critical commentary from Andy
Whyline technology could revolutionise approaches to debugging.
The evaluation, however, was focussed on one defect in a small, stand-alone application.
– The result regarding time saved is not generalisable.
Would maintainers prefer a DORA approach to identify relevant code to reason about rather than explore the question set posed by Whyline?
Much more evaluation work needs to be done.