How good are your testers? An assessment of testing ability
Liang Huang, Chris Thomson and Mike Holcombe
Department of Computer Science, University of Sheffield

1) Background
2) The Experiment
3) Preliminary results
4) Conclusion and Future research

Background: Test First programming

[Figure: How Test Last and Test First work respectively]
Test Last: Story, Implementation, Write Tests, Run test cases, All pass? (No: Rework; Yes: Next Story)
Test First: Story, Write Tests, Implementation, Run test cases, All pass? (No: Rework; Yes: Next Story)
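The two flows in the figure differ only in whether tests are written before or after the implementation; the run/rework loop is the same. A minimal Python sketch of that reading follows. The names `TEST_LAST`, `TEST_FIRST`, `develop_story` and the `run_tests` callback are hypothetical and chosen for illustration only; they do not come from the study.

```python
from typing import Callable, List

# Step orders read off the flowchart; the identifiers are illustrative.
TEST_LAST = ["implementation", "write tests"]
TEST_FIRST = ["write tests", "implementation"]


def develop_story(
    order: List[str],
    run_tests: Callable[[int], bool],
    max_rework: int = 10,
) -> List[str]:
    """Return the sequence of steps taken to finish one story.

    `run_tests(attempt)` is a hypothetical callback reporting whether all
    test cases pass on the given attempt (0 is the first run).
    """
    log = ["story"] + list(order)
    for attempt in range(max_rework + 1):
        log.append("run test cases")
        if run_tests(attempt):
            # "All pass? Yes" branch: move on to the next story.
            log.append("next story")
            return log
        # "All pass? No" branch: rework, then run the tests again.
        log.append("rework")
    raise RuntimeError("tests still failing after max_rework attempts")
```

For example, `develop_story(TEST_FIRST, lambda attempt: attempt >= 1)` models a story whose tests fail once and pass after one round of rework.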

Background: Previous Studies (TF vs. TL)

TF programmers obtained higher productivity:
1) Kaufmann et al. [Kaufmann 2003]
2) Janzen et al. [Janzen 2005]

TF programmers failed to obtain higher productivity:
1) Müller et al. [Müller 2002]
2) Williams et al. [Williams 2003] and Maximilien et al. [Maximilien 2003]
3) George et al. [George 2003, 2004]
4) Macias et al. [Macias 2004]
5) Erdogmus [Erdogmus 2005]

Background: Previous Studies (TF vs. TL)

TF programmers obtained higher external quality:
1) Williams et al. [Williams 2003] and Maximilien et al. [Maximilien 2003]
2) George et al. [George 2003, 2004]
3) Edwards [Edwards 2003]

TF programmers failed to obtain higher external quality:
1) Müller et al. [Müller 2001]
2) Pancur et al. [Pancur 2003]
3) Macias et al. [Macias 2004]
4) Erdogmus [Erdogmus 2005]

Background: Our Initial Study

Results (pertaining to effectiveness):
1) TF teams spent a higher percentage of their time on testing.
2) TF teams obtained higher productivity, although the difference was not statistically significant.
3) The minimum external quality achieved improved as the percentage of time spent on testing increased.
4) There was a linear correlation between effort spent on testing and effort spent on coding.
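A linear correlation of the kind mentioned in result 4 is commonly quantified with Pearson's correlation coefficient. The slides do not say which statistic was used, so the sketch below is an assumption about the general technique rather than the authors' analysis, and the function name `pearson_r` is illustrative.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

Given per-team testing effort and coding effort as two lists, a value close to 1 would indicate that teams spending more effort on testing also spent proportionally more on coding.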

Background: Motivation

The differences in effectiveness between TF and TL programmers are possibly due to covariates other than the treatments (testing/programming strategies):
1) TF is not easy to learn [Crispin 2006].
2) Subjects are not skilled at programming with TF.
3) Testing has an impact on code quality and productivity [Basili 1986, Stephens 2003].

It is therefore imperative to analyze the tests written by subjects and to assess the subjects’ ability to test, in order to distinguish good testers from bad ones.

The Experiment: Context

Sheffield Software Engineering Observatory
Semi-industrial setting:
1) Medium-sized projects
2) Longer development time
3) Real external clients

Two groups of subjects:
1) 2nd and 3rd year computer science undergraduates
2) 4th year MEng and MSc students

The Experiment: Questionnaire A

Subjects were given
1) a short piece of Java code, and
2) 29 potential tests,
and asked to select tests for
1) category partition testing (22 out of 29 were necessary for the partition), and
2) giving branch coverage (the coverage and redundant choices were calculated for each of the responses).

The testing ability was measured by
1) for category partitioning: (the number of correct choices made) - (the number of redundant choices);
2) for branch coverage: the branch coverage obtained and the number of redundant choices made.
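The two measures can be sketched in Python as follows. The slides do not define "redundant" precisely, so this sketch assumes that, for category partitioning, a redundant choice is a selected test outside the necessary set, and that, for branch coverage, a redundant choice is a selected test that covers no branch beyond those already covered by earlier selections. All names (`partition_score`, `branch_coverage`, `branches_hit`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def partition_score(selected: Set[int], necessary: Set[int]) -> int:
    """(number of correct choices) minus (number of redundant choices)."""
    correct = len(selected & necessary)
    # Assumption: any selected test not in the necessary set is redundant.
    redundant = len(selected - necessary)
    return correct - redundant


def branch_coverage(
    selected: List[int],
    branches_hit: Dict[int, Set[str]],
    all_branches: Set[str],
) -> Tuple[float, int]:
    """Return (fraction of branches covered, number of redundant choices).

    `branches_hit` maps a test id to the branches that test exercises.
    Assumption: a test is redundant if it adds no new branch given the
    tests selected before it, so the count depends on selection order.
    """
    if not all_branches:
        raise ValueError("all_branches must not be empty")
    covered: Set[str] = set()
    redundant = 0
    for test in selected:
        new = (branches_hit.get(test, set()) & all_branches) - covered
        if new:
            covered |= new
        else:
            redundant += 1
    return len(covered) / len(all_branches), redundant
```

Under these assumptions, a respondent who picks three necessary tests and one unnecessary test scores 3 - 1 = 2 on the category partition measure.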

The Experiment: Procedure

1) Team and group allocation
2) Intensive training in TF
3) Software development, including group meetings, management meetings, and client meetings
4) Questionnaire distribution (before the Easter vacation)

Preliminary results

Undergraduates achieved lower marks in category partitioning and made more redundant choices when giving branch coverage; however, the differences were NOT statistically significant. Postgraduates did no better than undergraduates when giving branch coverage.

Preliminary results

With responses categorized as “Excellent” (70% and above), “Fair” (50%-70%) and “Poor” (50% and below), the postgraduates had a higher probability of being Excellent (38% versus 21% for undergraduates) and a much lower probability of being Poor (13% versus 43% for undergraduates).
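The categorization can be sketched in Python as below. The bands on the slide touch at 50% and 70%, so this sketch assumes that exactly 70% counts as Excellent and exactly 50% counts as Poor, following the "and above" and "and below" wording. The function names are illustrative, and the input is assumed to be a mark already expressed as a percentage.

```python
from collections import Counter
from typing import Dict, Iterable

CATEGORIES = ("Excellent", "Fair", "Poor")


def categorize(mark_percent: float) -> str:
    """Map a percentage mark to one of the three bands."""
    if mark_percent >= 70:
        return "Excellent"  # 70% and above
    if mark_percent > 50:
        return "Fair"       # strictly between 50% and 70% (assumption)
    return "Poor"           # 50% and below


def category_shares(marks: Iterable[float]) -> Dict[str, float]:
    """Fraction of responses that fall in each band."""
    counts = Counter(categorize(m) for m in marks)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to categorize")
    return {name: counts[name] / total for name in CATEGORIES}
```

Applying `category_shares` separately to the undergraduate and postgraduate marks would yield per-group proportions of the kind compared above.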

Limitation

1) Student subjects
2) Small sample size
3) Low response rate
4) The ability to select tests was measured, not the ability to write tests
5) Code-based questionnaire only

Conclusion and Future research: Conclusion

Programmers with a higher level of expertise did better at category partitioning, but failed to do better at giving branch coverage. Since the category partition method requires some analysis of the specification, and TF requires programmers to write tests before code, this suggests that TF requires a higher level of expertise.

Conclusion and Future research: Future Research

Questionnaires
1) which are NOT code based, and/or
2) which focus on testing at different levels
are to be distributed to a larger group of subjects with different backgrounds.

Questionnaire B (proposed)
Subjects would be given
1) a short piece of text specification, and
2) a number of potential tests.
The testing ability would be measured by
1) the number of correct choices made, and
2) the number of redundant choices.

Thanks for listening
