Empirical Usability Testing in a Component-Based Environment: Improving Test Efficiency with Component-Specific Usability Measures Willem-Paul Brinkman.


Empirical Usability Testing in a Component-Based Environment: Improving Test Efficiency with Component-Specific Usability Measures Willem-Paul Brinkman Brunel University, London Reinder Haakma Philips Research Laboratories Eindhoven Don Bouwhuis Eindhoven University of Technology

Topics  Research Motivation  Testing Method  Experimental Evaluation of the Testing Method  Conclusions

Research Motivation Studying the usability of a system

Research Motivation
External Comparison: relating differences in usability to differences between the systems.
Internal Comparison: trying to link usability problems to parts of the system.

Component-Based Software Engineering
- Multiple-versions testing paradigm (external comparison)
- Single-version testing paradigm (internal comparison)
[Diagram: component life cycle – create, manage, support, re-use]

Research Motivation
PROBLEM
1. Only empirical analysis of the overall system (task time, keystrokes, questionnaires, etc.) – not powerful
2. Usability tests, heuristic evaluations, and cognitive walkthroughs where experts identify problems – unreliable
SOLUTION
Component-specific usability measures: more powerful and reliable

Testing Method – Procedure
- Normal procedures of a usability test
- A user task that requires interaction with the components under investigation
- Users must complete the task successfully

Component-specific usability measures: perceived ease-of-use, perceived satisfaction, objective performance.
A component-specific questionnaire helps users to remember their interaction experience with a particular component.

Component-specific usability measures – perceived ease-of-use
Perceived Usefulness and Ease-of-Use questionnaire (Davis, 1989), 6 questions rated from Unlikely to Likely, e.g.
- Learning to operate [name] would be easy for me.
- I would find it easy to get [name] to do what I want it to do.
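Scores on such a scale are typically summarised per component by averaging the item responses. A minimal sketch, assuming a 7-point coding of the Unlikely–Likely anchors; the function name and the response data are illustrative, not from the study:

```python
# Illustrative scoring of a component-specific questionnaire scale.
# Assumption: items are coded 1 (Unlikely) .. 7 (Likely).
from statistics import mean

def ease_of_use_score(responses):
    """Average of 7-point item responses for one component."""
    if not all(1 <= r <= 7 for r in responses):
        raise ValueError("responses must be on a 1-7 scale")
    return mean(responses)

# One participant's six item responses for a [Keypad] component (made up):
keypad_items = [6, 5, 7, 6, 5, 6]
print(ease_of_use_score(keypad_items))  # 5.833...
```

Each participant then contributes one score per component, which can feed the analyses of variance reported later in the talk.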

Component-specific usability measures – perceived satisfaction
Post-Study System Usability Questionnaire (Lewis, 1995), rated from Strongly disagree to Strongly agree, e.g.
- The interface of [name] was pleasant.
- I like using the interface of [name].

Component-specific usability measures – objective performance
The number of messages a component receives, directly from the user or indirectly via lower-level components, reflects the effort users put into the interaction. Each message is a cycle of the component's control loop.

Architectural Element: Interaction Component
The elementary unit of an interactive system on which behaviour-based evaluation is possible: a unit within an application that can be represented as a finite state machine, and that receives signals from the user directly or indirectly via other components. Users must be able to perceive or infer the state of the interaction component.
[Diagram: examples of suitable agent models – Interactor, CNUCE model, MVC, PAC]
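As an illustration of this definition (not the authors' implementation), an interaction component can be sketched as a finite state machine that counts the messages it receives — which is exactly the component-specific objective performance measure. The class, states, and messages below are invented:

```python
# Hypothetical sketch: an interaction component as a finite state machine
# whose received-message count serves as the objective usability measure.
class InteractionComponent:
    def __init__(self, name, transitions, start):
        self.name = name
        self.transitions = transitions  # (state, message) -> next state
        self.state = start
        self.messages_received = 0      # cycles of this control loop

    def receive(self, message):
        """Process one message, received directly from the user or
        indirectly via a lower-level component."""
        self.messages_received += 1
        self.state = self.transitions.get((self.state, message), self.state)

# A toy two-state Function Selector: 'menu' <-> 'submenu'
fs = InteractionComponent(
    "Function Selector",
    {("menu", "down"): "submenu", ("submenu", "up"): "menu"},
    start="menu",
)
for msg in ["down", "down", "up"]:
    fs.receive(msg)
print(fs.messages_received)  # 3
```

More messages for the same task mean more control-loop cycles, i.e. more user effort attributable to this component.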

Interaction Layers
[Diagram: a calculator example, "15+23=". The user interacts with the Calculator through layers: an Editor (control loop over the equation) and, above it, an Add Processor (control loop over the results).]

Control Loop
[Diagram: a control loop between User and System – reference value, user message, evaluation, feedback.]

Lower-Level Control Loop
[Diagram: the lower-level control loop between User and Calculator]

Higher-Level Control Loop
[Diagram: the higher-level control loop between User and Calculator]

Experimental Evaluation of the Testing Method
80 users, 8 mobile telephones. Three components were manipulated according to Cognitive Complexity Theory (Kieras & Polson, 1985):
1. Function Selector
2. Keypad
3. Short Text Messages

Architecture
[Diagram: mobile telephone architecture – a Send Text Message component on top of a Function Selector component on top of a Keypad component]

Evaluation Study – Function Selector
Versions: broad/shallow vs. narrow/deep

Evaluation Study – Keypad
Versions: Repeated-Key method ("L") vs. Modified-Model-Position method ("J")

Evaluation Study – Send Text Message
Versions: simple vs. complex

Statistical Tests
Measures: number of keystrokes, task time.
x̄ = sample mean (estimator of µ)
s = estimate of the standard deviation (σ)
s_x̄ = estimate of the standard error of the mean, s_x̄² = s²/n
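A minimal sketch of these estimators using only the Python standard library; the task-time data are made up for illustration:

```python
# Sample mean, sample standard deviation, and standard error of the mean,
# matching the slide's definitions (s_x̄² = s²/n). Data are invented.
from math import sqrt
from statistics import mean, stdev

task_times = [41.0, 38.5, 45.2, 40.1, 43.7, 39.9]  # seconds, n = 6

x_bar = mean(task_times)         # x̄, estimator of µ
s = stdev(task_times)            # s, sample (n-1) standard deviation
s_x = s / sqrt(len(task_times))  # s_x̄, estimated standard error of the mean

print(x_bar, s, s_x)
```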

Statistical Tests
p-value: the probability of making a Type I (α) error, i.e. wrongly rejecting the hypothesis that the underlying distributions are the same.
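This definition can be made concrete with a permutation test, which directly simulates "the underlying distribution is the same" by relabelling observations; the data, group sizes, and function name below are illustrative, not the study's analysis:

```python
# Two-sided permutation p-value for a difference in means: under the null
# hypothesis that both versions share one distribution, every relabelling
# of the pooled observations is equally likely.
import random
from statistics import mean

def permutation_p_value(a, b, n_perm=10_000, seed=0):
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return extreme / n_perm

version_a = [12, 15, 11, 14, 13, 16]  # e.g. keystrokes, version A (made up)
version_b = [18, 21, 17, 19, 22, 20]  # keystrokes, version B (made up)
print(permutation_p_value(version_a, version_b))  # small p -> reject
```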


Results – Function Selector
Results of two multivariate analyses and related univariate analyses of variance, with the version of the Function Selector as independent between-subjects variable.

Results – Keypad
Results of multivariate and related univariate analyses of variance, with the version of the Keypad as independent between-subjects variable.

Results – Send Text Message
Results of two multivariate analyses and related univariate analyses of variance, with the version of the Send Text Message (STM) component as independent between-subjects variable.
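As a sketch of the univariate side of these analyses, the one-way F statistic for two versions can be computed by hand; the data are invented, and in practice a statistics package (e.g. `scipy.stats.f_oneway`) would give the same F plus its p-value:

```python
# One-way ANOVA F statistic:
# F = between-group mean square / within-group mean square.
from statistics import mean

def one_way_f(*groups):
    all_obs = [x for g in groups for x in g]
    grand = mean(all_obs)
    k, n = len(groups), len(all_obs)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

simple = [30, 28, 35, 31]    # e.g. task time, simple STM version (made up)
complex_ = [44, 40, 47, 45]  # task time, complex STM version (made up)
print(round(one_way_f(simple, complex_), 2))  # 39.0
```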

Power of Component-Specific Measures
Statistical power: 1 − β
Type II, or β, error: failing to reject the hypothesis when it is false.
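Power can be estimated by simulation: draw many samples with a true difference present, test at α = 0.05, and count rejections. Everything below (the normal model, the known-σ z-test with critical value 1.96, the effect size and n) is an illustrative assumption, not from the study:

```python
# Monte-Carlo estimate of statistical power (1 - β) for a two-sample
# two-sided z-test at α = 0.05, assuming normal data with known σ.
import random
from math import sqrt

def simulated_power(effect, sd, n, runs=2000, seed=1):
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        a = [rng.gauss(0.0, sd) for _ in range(n)]
        b = [rng.gauss(effect, sd) for _ in range(n)]
        diff = sum(b) / n - sum(a) / n
        se = sd * sqrt(2 / n)        # standard error of the mean difference
        if abs(diff / se) > 1.96:    # two-sided critical value at α = 0.05
            rejections += 1
    return rejections / runs

# Same true effect, less noise -> higher power:
print(simulated_power(effect=1.0, sd=2.0, n=20))
print(simulated_power(effect=1.0, sd=1.0, n=20))
```

The second call has higher power because the measure is less noisy, which mirrors the slide's argument: component-specific measures strip out variance caused by the rest of the system.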

Power of Component-Specific Measures
x̄ = sample mean (estimator of µ); s = estimate of the standard deviation (σ); s_x̄ = estimate of the standard error of the mean, s_x̄² = s²/n

Power of Component-Specific Measures
Statistical power: 1 − β
Component-specific measures are less affected by usability problems users may or may not encounter with other parts of the system.

Results – Power Analysis
Average probability that a measure finds a significant (α = 0.05) effect for the usability difference between the two versions of the Function Selector, STM, or Keypad components.

Conclusions
Component-specific measures can be used to test the difference in usability between versions of an interaction component:
1. Objective performance measure: number of messages received directly, or indirectly via lower-level components
2. Subjective usability measures: ease-of-use and satisfaction questionnaires
Component-specific measures are potentially more powerful than overall usability measures.

Questions / Discussion Thanks for your attention

Layered Protocol Theory (Taylor, 1988)
[Diagram: component-based interactive systems]

Reflection – Limitations
1. Different lower-level versions → different effort involved when sending a message
2. The usability of a component can affect the interaction users have with other components → is an overall measure more powerful?
3. Can instrumentation code be inserted?

Reflection – Other Evaluation Methods
1. Unit testing → lacks the context of a real task
2. Sequential data analysis → lacks a direct link with higher layers
3. Non-event-based usability evaluation → lacks a direct link with the component

Reflection – Exploitation of the Testing Method
1. Creation process → reduces the need to deal with a component each time it is deployed
2. Re-use process → still needs a final usability test

Testing Method
Aim: to evaluate the difference in usability between two or more versions of a component.