Empirical Evaluation
Data Collection: Techniques, methods, tricks; objective data

IRB Clarification
- All research done outside the class (i.e., with non-class members as participants) requires IRB approval or exemption.
- Call the IRB (Alice Basler) and describe the research; she will determine over the phone whether the research is "exempt" or whether a formal review application is required.
- Take the NIH online course before calling the IRB (i.e., this weekend).
- Note: IRB review time can range from 5 minutes (exempt) to 6+ weeks (full Board).

Evaluation Is Detective Work
Goal: gather evidence that can help you determine whether your hypotheses are correct or not. Evidence (data) should be:
- Relevant
- Diagnostic
- Credible
- Corroborated

Data as Evidence: Relevant and Diagnostic
- Relevant: appropriate for addressing the hypotheses. E.g., does measuring "number of errors" provide insight into how effectively your new air traffic control system supports the users' tasks?
- Diagnostic: the data unambiguously provide evidence one way or the other. E.g., does asking for users' preferences clearly tell you whether the system performs better? (Maybe.)

Data as Evidence: Credible and Corroborated
- Credible: are the data trustworthy? Gather data carefully, and gather enough data.
- Corroborated: does more than one source of evidence support the hypotheses? E.g., both accuracy and user opinions indicate that the new system is better than the previous system. But what if completion time is slower?

General Recommendations
- Include both objective and subjective data, e.g., completion time and preference.
- Use multiple measures within a type, e.g., reaction time and accuracy.
- Use quantitative measures where possible, e.g., a preference score on a scale of 1-7.
- Note: gather only the data required, and do so with the minimum interruption, hassle, and time.

Types of Data to Collect
- Demographics: information about the participant, used for grouping or for correlation with other measures. E.g., handedness, age, first/best language, SAT score. Gather these only if relevant; they do not have to be self-reported, since you can use tests (e.g., the Edinburgh Handedness Inventory).
- Quantitative data: what you measure. E.g., reaction time, number of yawns.
- Qualitative data: descriptions and observations that are not quantified. E.g., different ways of holding the mouse, approaches to solving a problem, trouble understanding the instructions. (A sketch of a record combining all three kinds appears below.)
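To make the three kinds of data concrete, here is a minimal sketch in Python of a per-participant record; the field names (completion times, error counts, etc.) are illustrative assumptions, not prescribed by the slides:

```python
from dataclasses import dataclass, field

@dataclass
class ParticipantRecord:
    # Demographics: gathered only if relevant to the hypotheses
    participant_id: str
    age: int
    handedness: str  # e.g., classified via the Edinburgh Handedness Inventory

    # Quantitative data: what you measure, keyed by task
    completion_times_s: dict = field(default_factory=dict)
    error_counts: dict = field(default_factory=dict)

    # Qualitative data: observations that are not quantified
    observations: list = field(default_factory=list)

p = ParticipantRecord(participant_id="P01", age=24, handedness="right")
p.completion_times_s["task1"] = 42.7   # seconds
p.error_counts["task1"] = 2
p.observations.append("Held the mouse with both hands during task 1")
print(p)
```

Keeping the three kinds of data in separate fields makes it easy to correlate demographics with the quantitative measures later while leaving the qualitative notes free-form.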

Planning for Data Collection
- What data to gather? Depends on the task and on any benchmarks.
- How to gather the data? Interpretive, natural, empirical, or predictive methods?
- What criteria are important? Success on the task? Score? Satisfaction?
- What resources are available? Participants, a prototype, evaluators, facilities, team knowledge (programming, statistics, etc.).

Collecting Data
Capturing the session:
- Observation and note-taking
- Audio and video recording
- Instrumented user interface / software logs
- Think-aloud protocol (can be very helpful)
- Critical incident logging, both positive and negative (see the sketch below)
Post-session activities:
- Structured interviews and debriefing: "What did you like best/least?"; "How would you change..?"
- Questionnaires, comments, and rating scales
- Post-hoc video coding/rating by the experimenter
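As an illustration of critical incident logging, here is a minimal sketch in Python that an observer might use to time-stamp positive and negative incidents during a session; the class and method names are hypothetical:

```python
import time

class IncidentLog:
    """Time-stamped log of positive (+) and negative (-) critical incidents."""

    def __init__(self):
        self.start = time.time()
        self.incidents = []  # (seconds into session, polarity, note)

    def log(self, polarity: str, note: str) -> None:
        assert polarity in ("+", "-"), "polarity must be '+' or '-'"
        self.incidents.append((time.time() - self.start, polarity, note))

log = IncidentLog()
log.log("+", "Found the search box without prompting")
log.log("-", "Clicked Save three times; no feedback that it worked")
for t, polarity, note in log.incidents:
    print(f"{t:7.1f}s  {polarity}  {note}")
```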

Observing Users
- Not as easy as you might think.
- One of the best ways to gather feedback about your interface.
- Watch, listen, and learn as a person interacts with your system.

Observation
Direct (in the same room):
- Can be intrusive; users are aware of your presence.
- You only see the session one time.
- May use a one-way mirror to reduce intrusion.
- Cheap, and quicker to set up and to analyze.
Indirect (video recording):
- Reduces intrusion, but doesn't eliminate it.
- Cameras focused on screen, face, and keyboard.
- Gives an archival record, but you can spend a lot of time reviewing it.

Location
Observations may be:
- In the lab, perhaps a specially built usability lab: easier to control, and you can have the user complete a set of tasks.
- In the field: you watch users' everyday actions; more realistic, but harder to control other factors.

Challenge
In simple observation, you observe actions but don't know what's going on in users' heads. Evaluators therefore often use some form of verbal protocol, in which users describe their thoughts while working.

Verbal Protocol
One technique: think-aloud. The user verbally describes what he or she is thinking while performing the tasks:
- What they believe is happening
- Why they take an action
- What they are trying to do

Think-Aloud
- Very widely used, useful technique.
- Allows you to understand the user's thought processes better.
- Potential problems: it can be awkward for the participant, and thinking aloud can modify the way the user performs the task.

Teams
Another technique: co-discovery learning (constructive interaction).
- Join pairs of participants to work together, using think-aloud.
- Perhaps have one person be a semi-expert (the coach) and the other a novice.
- More natural (like conversation), so it removes some of the awkwardness of individual think-aloud.

Alternative
What if thinking aloud during the session would be too disruptive? You can use a post-event protocol: the user performs the session, then watches the video and describes what he or she was thinking. Drawbacks: recall is sometimes difficult, and the delay opens the door to interpretation.

Historical Record
In observing users, how do you capture events in the session for later analysis?

Capturing a Session: Paper and Pencil
- Can be slow.
- May miss things.
- Is definitely cheap and easy.
[The slide shows a sample note-taking grid: columns for Task 1, Task 2, Task 3, ..., rows of times (10:00, 10:03, 10:08, 10:22), with start/end marks recorded per task.]

Capturing a Session: Recording (Audio and/or Video)
- Good for talk-aloud.
- Hard to tie to the interface; multiple cameras probably needed.
- Good, rich record of the session.
- Can be intrusive, and can be painful to transcribe and analyze.

Capturing a Session: Software Logging
- Modify the software to log user actions; it can give time-stamped keypress or mouse events.
- Two problems: the raw events are too low-level (you want higher-level events), and the massive amount of data requires analysis tools. A sketch of both the logging and a higher-level roll-up follows below.
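Here is a minimal sketch in Python of this kind of instrumentation: raw, time-stamped events are appended by UI callbacks, and a post-processing pass collapses them into higher-level events. The event names and the click/typing heuristics are illustrative assumptions, not from the slides:

```python
import time

LOG = []  # (timestamp, event_type, detail)

def log_event(event_type: str, detail: str) -> None:
    """Called from the UI toolkit's keypress/mouse callbacks."""
    LOG.append((time.time(), event_type, detail))

# Low-level events as they might arrive from the interface:
log_event("mouse_down", "button=left x=210 y=88")
log_event("mouse_up", "button=left x=210 y=88")
log_event("key", "F")
log_event("key", "i")
log_event("key", "l")
log_event("key", "e")

def to_higher_level(log):
    """Collapse raw events into higher-level ones (clicks, typed strings)."""
    events, i = [], 0
    while i < len(log):
        t, kind, detail = log[i]
        if kind == "mouse_down" and i + 1 < len(log) and log[i + 1][1] == "mouse_up":
            # A down/up pair becomes a single "click" event
            events.append((t, "click", detail))
            i += 2
        elif kind == "key":
            # A run of keypresses becomes a single "typed" event
            text, j = "", i
            while j < len(log) and log[j][1] == "key":
                text += log[j][2]
                j += 1
            events.append((t, "typed", text))
            i = j
        else:
            events.append((t, kind, detail))
            i += 1
    return events

for t, kind, detail in to_higher_level(LOG):
    print(f"{t:.3f}  {kind}: {detail}")
```

The roll-up pass addresses the first problem (abstraction level); in practice you would also need analysis tooling on top of it for the second problem (data volume).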

Issues
What if the user gets stuck on a task? You can ask:
- "What are you trying to do..?"
- "What made you think..?"
- "How would you like to perform..?"
- "What would make this easier to accomplish..?"
You can also offer hints. The answers can provide design ideas.