How good are your testers? An assessment of testing ability Liang Huang, Chris Thomson and Mike Holcombe Department of Computer Science, University of.

Presentation transcript:

How good are your testers? An assessment of testing ability Liang Huang, Chris Thomson and Mike Holcombe Department of Computer Science, University of Sheffield

Background The Experiment Preliminary results Conclusion and Future research

Background Test First programming
How Test Last and Test First work, respectively:
Test Last: Story -> Implementation -> Write tests -> Run test cases -> All pass? (No: Rework; Yes: Next story)
Test First: Story -> Write tests -> Implementation -> Run test cases -> All pass? (No: Rework; Yes: Next story)
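
As a rough illustration of the Test First order (tests, then implementation, then run), the sketch below uses Python's standard unittest module. The story ("sum the positive numbers in a list") and the names sum_positive and all_pass are hypothetical and are not taken from the presentation.

```python
import unittest


# Step 1 (Test First): the tests for a hypothetical story are written
# before any implementation exists.
class SumPositiveTests(unittest.TestCase):
    def test_ignores_negative_numbers(self):
        self.assertEqual(sum_positive([3, -1, 4]), 7)

    def test_empty_list_gives_zero(self):
        self.assertEqual(sum_positive([]), 0)


# Step 2: the implementation is written only after the tests.
def sum_positive(numbers):
    """Hypothetical story implementation: add up the positive entries."""
    return sum(n for n in numbers if n > 0)


# Step 3: run the test cases and check the "All pass?" decision.
def all_pass():
    """Return True if every test case passed, False if rework is needed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SumPositiveTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Under Test Last, the same three pieces would exist, but sum_positive would be written before SumPositiveTests.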

Background Previous Studies (TF vs. TL)
TF programmers obtained higher productivity:
1) Kaufmann et al [Kaufmann 2003]
2) Janzen et al [Janzen 2005]
TF programmers failed to obtain higher productivity:
1) Müller et al [Müller 2002]
2) Williams et al [Williams 2003] and Maximilien et al [Maximilien 2003]
3) George et al [George 2003, 2004]
4) Macias et al [Macias 2004]
5) Erdogmus [Erdogmus 2005]

Background Previous Studies (TF vs. TL)
TF programmers obtained higher external quality:
1) Williams et al [Williams 2003] and Maximilien et al [Maximilien 2003]
2) George et al [George 2003, 2004]
3) Edwards [Edwards 2003]
TF programmers failed to obtain higher external quality:
1) Müller et al [Müller 2001]
2) Pancur et al [Pancur 2003]
3) Macias et al [Macias 2004]
4) Erdogmus [Erdogmus 2005]

Background Our Initial Study
Results (pertaining to effectiveness):
1) TF teams spent a larger percentage of their time on testing.
2) TF teams obtained higher productivity, although the difference was not statistically significant.
3) The minimum external quality achieved improved as the percentage of time spent on testing increased.
4) There was a linear correlation between the effort spent on testing and the effort spent on coding.

Background Motivation
The differences in effectiveness between TF and TL programmers are possibly due to covariates other than the treatments (testing/programming strategies):
1) TF is not easy to learn [Crispin 2006].
2) Subjects are not skilled at programming with TF.
3) Testing has an impact on code quality and productivity [Basili 1986, Stephens 2003].
It is therefore imperative to analyze the tests written by subjects and to assess the subjects’ ability to test, in order to distinguish good testers from bad ones.

The Experiment Context: Sheffield Software Engineering Observatory
Semi-industrial setting: medium-sized projects, longer development time, real external clients.
2 groups of subjects:
1) 2nd and 3rd year computer science undergraduates.
2) 4th year MEng and MSc students.

The Experiment Questionnaire A
Subjects were given
1) A short piece of Java code, and
2) 29 potential tests
and asked to select tests for
1) Category partition testing (22 of the 29 tests were necessary for the partition), and
2) Branch coverage (the coverage obtained and the number of redundant choices were calculated for each response).
Testing ability was measured by
1) For category partitioning: (the number of correct choices made) - (the number of redundant choices)
2) For branch coverage: the coverage obtained and the number of redundant choices
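
The two measures can be sketched in a few lines of Python. This is only one possible reading of the slide, under stated assumptions: a "correct" choice is taken to be a selected test that belongs to the necessary set, a "redundant" choice in category partitioning is a selected test outside that set, and a "redundant" choice for branch coverage is a selected test that covers no branch beyond those already covered by earlier selections. The function names, test identifiers and branch labels are hypothetical.

```python
def category_partition_score(selected, necessary):
    """Score = (correct choices) - (redundant choices).

    Assumption: a choice is correct if it is in the necessary set and
    redundant otherwise.
    """
    selected = set(selected)
    necessary = set(necessary)
    correct = len(selected & necessary)
    redundant = len(selected - necessary)
    return correct - redundant


def branch_coverage_result(selected, branches_hit, all_branches):
    """Return (coverage fraction, number of redundant choices).

    selected     -- test identifiers in the order the subject chose them
    branches_hit -- dict mapping each test to the set of branches it executes
    all_branches -- set of every branch in the code under test

    Assumption: a test is redundant if it adds no new branch to those
    covered by the tests selected before it.
    """
    covered = set()
    redundant = 0
    for test in selected:
        new_branches = branches_hit[test] - covered
        if new_branches:
            covered |= new_branches
        else:
            redundant += 1
    return len(covered) / len(all_branches), redundant
```

For example, with necessary tests {1, 2, 3} and selections [1, 2, 5], the category partition score is 2 - 1 = 1. Note that the redundancy count for branch coverage, as sketched here, depends on the order of selection; an order-independent definition would be equally consistent with the slide.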

The Experiment Procedure
1) Team and group allocation
2) Intensive training in TF
3) Software development, including group meetings, management meetings, and client meetings
4) Questionnaire distribution (before the Easter vacation)

Preliminary results
Undergraduates achieved lower marks in category partitioning and made more redundant choices when selecting tests for branch coverage; however, these differences were NOT statistically significant. Postgraduates did no better than undergraduates when selecting tests for branch coverage.

Preliminary results
With responses categorized as “Excellent” (70% and above), “Fair” (between 50% and 70%) and “Poor” (50% and below), postgraduates were more likely to be Excellent (38% versus 21% for undergraduates) and much less likely to be Poor (13% versus 43% for undergraduates).
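
The banding can be written as a small function. This sketch assumes that each response has already been converted to a percentage mark, and it follows the slide's wording at the boundaries: exactly 70% counts as Excellent and exactly 50% counts as Poor. The function name is hypothetical.

```python
def categorize(score_percent):
    """Map a percentage mark to the band used on the slide."""
    if score_percent >= 70:
        return "Excellent"   # 70% and above
    if score_percent <= 50:
        return "Poor"        # 50% and below
    return "Fair"            # strictly between 50% and 70%
```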

Limitation
1) Student subjects
2) Small sample size
3) Low response rate
4) The questionnaire measured the ability to select tests, not to write them
5) Code-based questionnaire only

Conclusion and Future research Conclusion
The category partition method requires some analysis of the specification, and TF requires programmers to write tests before the code. Programmers with a higher level of expertise did better at category partitioning but failed to do better at selecting tests for branch coverage, which suggests that TF requires a higher level of expertise.

Conclusion and Future research Future Research
Questionnaires
1) which are NOT code based, and/or
2) which focus on testing at different levels
are to be distributed to a larger group of subjects with different backgrounds.
Questionnaire B (proposed)
Subjects would be given
1) A short text specification, and
2) A number of potential tests
Testing ability would be measured by
1) The number of correct choices made
2) The number of redundant choices

Thanks for listening