What’s Better? Moderated Lab Testing or Unmoderated Remote Testing? Susan Fowler, FAST Consulting, 718 720-1169


What’s Better? Moderated Lab Testing or Unmoderated Remote Testing? Susan Fowler FAST Consulting 

2 What’s in this talk
• Definitions & differences between moderated and unmoderated tests
• What goes into a remote unmoderated test script?
• What goes into the remote-study report?
• Comparisons between moderated and unmoderated tests

3 Definition of terms
• Moderated: In-lab studies and studies using online conferencing software with a moderator. Synchronous.
• Unmoderated: Web-based studies using online tools and no moderator. Asynchronous.
  – Keynote Systems
  – SurveyMonkey
  – UserZoom
  – Zoomerang
  – WebSort.net

4 Differences
• Rewards:
  – Moderated—$50 cash, gifts.
  – Unmoderated—$10 online gift certificates, coupons or credits, raffles.
• Finding participants:
  – Moderated—use a marketing/recruiting company or a corporate mailing list.
  – Unmoderated—send invitations to a corporate list, intercept people online, or use a pre-qualified panel.

5 Differences
• Qualifying participants:
  – Moderated—ask them; have them fill in a questionnaire at the start.
  – Unmoderated—ask them in a screener section and knock out anyone who doesn’t fit (age, geography, disease, etc.).

6 Differences
• Test scripts:
  – Moderated—the moderator has tasks he or she wants the participant to do, and the moderator and notetakers track the questions and difficulties themselves.
  – Unmoderated—the script contains both the tasks and the questions that the moderator wants to address.

7 Differences
• What you can test:
  – Moderated—anything that you can bring into the lab.
  – Unmoderated—only web-based software or web sites.

8 How Keynote Systems’ tool works
1. Client formulates research strategy & objectives
2. A large, targeted sample of prescreened panelists is recruited
3. Panelists access the web test from their natural home or office environment
4. Panelists perform tasks & answer questions with the browser tool
5. The tool captures panelists’ real-life behavior, goals, thoughts & attitudes
6. Analyst delivers actionable insights and recommendations

9 Creating an unmoderated test script
• Screener: “Do you meet the criteria for this test?”
• For each task: “Were you able to…?”
  – Ask “scorecard” questions: satisfaction, ease of use, organization.
  – Ask “What did you like?” and “What did you not like?”
  – Provide a list of frustrations with an open-ended “other” option at the end.
• Wrap-up:
  – Overall scorecard, “Would you return?”, “Would you recommend?”, address for the gift.
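A minimal sketch of that script structure, written as a plain Python dictionary purely for illustration: the section names (screener, tasks, wrap_up), the question types, and the example wording are assumptions for this sketch, not any vendor's actual format.

# Illustrative structure of an unmoderated test script: screener, per-task
# questions, and wrap-up. Field names and scales are hypothetical.
test_script = {
    "screener": [
        {"question": "Do you meet the criteria for this test?", "knock_out_if": "No"},
    ],
    "tasks": [
        {
            "instructions": "Find X on the site.",
            "questions": [
                {"text": "Were you able to complete this task?", "type": "yes/no"},
                {"text": "How satisfied were you with this task?", "type": "likert-5"},
                {"text": "What did you like? What did you not like?", "type": "open-ended"},
                {"text": "Which of these frustrated you?", "type": "multi-select",
                 "choices": ["The site was slow", "Too few search results",
                             "Page too cluttered", "Other (please explain)"]},
            ],
        },
    ],
    "wrap_up": [
        {"text": "Overall, how easy to use was the site?", "type": "likert-5"},
        {"text": "Would you return? Would you recommend it?", "type": "yes/no"},
        {"text": "Address to send your gift to:", "type": "open-ended"},
    ],
}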

10 What a test looks like: Screener
The first few slides ask demographic questions. They can be used to eliminate participants from the test.

11 What a test looks like: Task

12 For your first task, we would like your feedback on the tugpegasus.org home page. Without clicking anywhere, please spend as much time as you would in real life learning about what tugpegasus.org offers from the content on the home page. When you have a good understanding of what tugpegasus.org offers, please press 'Answer.'

13 What a test looks like: Task
You can create single-select questions as well as Likert scales.

14 What a test looks like: Task
You can tie probing questions to earlier answers. The questions can be set up to respond to the earlier answer, negative or positive.
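A small illustration of that branching, written as ordinary Python rather than any tool's configuration; the function name, the 1-to-5 scale, and the follow-up wording are hypothetical.

def follow_up_question(satisfaction_rating: int) -> str:
    """Pick a probing question based on an earlier 1-5 satisfaction rating."""
    if satisfaction_rating <= 2:      # negative answer: probe the problem
        return "What in particular made this task frustrating?"
    if satisfaction_rating >= 4:      # positive answer: probe what worked
        return "What in particular made this task easy?"
    return "What would have made this task easier?"

print(follow_up_question(2))  # -> "What in particular made this task frustrating?"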

15 What a test looks like: Task
You can have multi-select questions that turn off multiple selection if the participant picks a “none of the above” choice.

16 What a test looks like: Task
You can make participants pick three (or any number) of characteristics. You can also randomize the choices, as well as the order of the tasks and the questions.
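A rough sketch of those two mechanics (require exactly N picks, randomize order), assuming answers arrive as simple Python lists; the helper names and the sample tasks are made up for the example.

import random

def validate_pick(selected, required=3):
    """Accept the answer only if exactly `required` choices were picked."""
    return len(selected) == required

def randomized_order(items, participant_seed):
    """Shuffle tasks, questions, or choices; a per-participant seed keeps each run reproducible."""
    rng = random.Random(participant_seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled

tasks = ["Learn what the home page offers", "Find the contact page", "Sign up for the newsletter"]
print(randomized_order(tasks, participant_seed=42))
print(validate_pick(["Fast", "Trustworthy", "Well organized"]))  # True: exactly three picked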

17 What a test looks like: Wrap-up
The last set of questions in a study is scorecard-type questions: Did the participant think the site was easy to use, was she satisfied by the site, was it well organized?
Usability = credibility.

18 What a test looks like: Wrap-up
A participant might be forced to return to the site for business reasons, but if he’s willing to recommend it, then he’s probably happy with the site.

19 What a test looks like: Wrap-up
Answers to these exit questions often contain gems. Don’t overlook the opportunity to ask for last-minute thoughts.

20 Reports: Analyzing unmoderated results
• Quantitative data: Satisfaction, ease-of-use, and organization scorecards, plus other Likert results, are analyzed for statistical significance and correlations.
• Qualitative data: Lots and lots of responses to open-ended questions.
• Clickstream data: Where did the participants actually go? First clicks, standard paths, fall-off points.
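As an illustration of the quantitative step, here is a short Python/SciPy sketch with made-up 1-to-5 scorecard ratings; the specific tests (a two-sample t-test and a Pearson correlation) are common choices for this kind of data, not necessarily the ones used in the studies described in this talk.

from scipy import stats

# Made-up 1-5 satisfaction ratings for two sites (e.g., yours vs. a competitor).
site_a = [4, 5, 3, 4, 4, 5, 2, 4, 5, 4]
site_b = [3, 2, 4, 3, 2, 3, 3, 2, 4, 3]

# Is the difference in mean satisfaction statistically significant?
t_stat, p_value = stats.ttest_ind(site_a, site_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Do ease-of-use ratings move together with willingness to recommend?
ease      = [4, 5, 3, 4, 4, 5, 2, 4, 5, 4]
recommend = [4, 5, 2, 4, 3, 5, 2, 4, 5, 3]
r, p = stats.pearsonr(ease, recommend)
print(f"r = {r:.2f}, p = {p:.3f}")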

21 How do moderated and unmoderated results compare?
• Statistical validity
• Shock value of participants’ comments
• Quality of the data
• Quantity of the data
• Missing information
• Cost
• Time
• Subjects
• Environment
• Geography

22 Comparisons: Statistical validity
• What’s the real difference between samples of 10 (moderated) and 100 (unmoderated)?
  – “The smaller number is good to pick up the main issues, but you need the larger sample to really validate whether the smaller sample is representative.
  – “I’ve noticed the numbers swinging around as we picked up more participants, at the level between 50 and 100 participants. At 100 or 200 participants, the data were completely different.” —Ania Rodriguez, ex-IBM, now Keynote director

23 Comparisons: Statistical validity •It’s just math…
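One way to make “it’s just math” concrete: the 95% margin of error on a task-success percentage shrinks roughly with the square root of the sample size. The numbers below are illustrative, not from the talk.

import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation 95% margin of error for a proportion p measured on n participants."""
    return z * math.sqrt(p * (1 - p) / n)

p_success = 0.70   # suppose 70% of participants complete a task
for n in (10, 50, 100, 200):
    print(f"n = {n:>3}: 70% success, roughly +/- {margin_of_error(p_success, n) * 100:.0f} points")
# n = 10 gives roughly +/- 28 points; n = 100 roughly +/- 9 points. Small samples find
# the main issues; larger samples are what stop the percentages "swinging around".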

24 Key Customer Experience Metrics
[Chart: Q85–88, “Overall, how would you rate your experience on the Club Med site?” Metrics compared at 90% CI, n=50 per site: ease of use, overall organization, level of frustration, perception of page load times.]
Club Med trailed Beaches on nearly all key metrics (especially page load times).
“Site was slow site kept losing my information and had to be retyped.” – Club Med
“I could not get an ocean view room because the pop up window took too long to wait for.” – Club Med

25 Comparisons: Statistical validity
• What’s the real difference between samples of 10 (moderated) and 100 (unmoderated)?
  – “In general, quantitative shows you where issues are happening. For why, you need qualitative.”
  – But “to convince the executive staff, you need quantitative data.
  – “We also needed the quantitative scale to see how people were interacting with eBay Express. It was a new interaction paradigm [faceted search]—we needed click-through information, how deep did people go, how many facets did people use?” —Michael Morgan, eBay usability group manager; uses UserZoom & Keynote

26 Comparisons: Statistical validity
• “How many users are enough?”
  – There is no magical number.
  – Katz & Rohrer in UX (vol. 4, issue 4, 2005):
    • Is the goal to assess quality? For benchmarking and comparisons, high numbers are good.
    • Or is it to address problems and reduce risk before the product is released? To improve the product, small, ongoing tests are better.
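For the “small, ongoing tests” side, here is a rough back-of-the-envelope model (an assumption of this sketch, not a claim from the talk or from Katz & Rohrer): if each participant independently hits a given problem with probability p, the chance that at least one of n participants exposes it is 1 - (1 - p)^n.

def chance_problem_is_seen(p, n):
    """Probability that at least one of n participants hits a problem affecting a share p of users."""
    return 1 - (1 - p) ** n

for n in (5, 10, 20):
    print(f"n = {n:>2}: {chance_problem_is_seen(0.30, n):.0%} chance of surfacing a problem that affects 30% of users")
# A handful of participants is usually enough to surface frequent problems;
# the large samples discussed above matter for benchmarking scores, not for finding issues.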

27 Comparisons: Shock value
• Are typed comments as useful as audio or video in proving that there’s a problem?
• Ania:
  – “Observing during the session is better than audio or video. While the test is happening, the CEOs can ask questions. They’re more engaged.”
  – That being said, “You can create a powerful stop-action video using Camtasia and the clickstreams.”

28 Comparisons: Shock value
• Are typed comments as useful as audio or video in proving that there’s a problem?
• Michael:
  – “The typed comments are very useful—top of mind. However, they’re not as engaging as video.” So, in his reports, he combines qualitative Morae clips with the quantitative UserZoom data.
  – “We also had click mapping—heat maps and first clicks,” and that was very useful. “On the first task, looking for laptops, we found that people were going to two different places.”

29 Comments are backed by heatmaps

30 Comparisons: Quality of the data
• Online and in the lab, what are the temptations to be less than honest?
  – In the lab, some participants want to please the moderator.
  – Online, some participants want to steal your money.

31 Comparisons: Quality of the data
• How do you prompt participants to explain why they’re stuck if you can’t see them getting stuck?
  – In the task debriefing, include a general set of explanations from which people can choose. For example, “The site was slow,” “Too few search results,” “Page too cluttered.”

32 Comparisons: Quality of the data
• How do you prompt participants to explain why they’re stuck if you can’t see them get stuck?
  – Let people stop doing a task, but ask them why they quit.

33 Comparisons: Quantity of data
• What is too much data? What are the trade-offs between depth and breadth?
  – “I’ve never found that there was too much data. I might not put everything in the report, but I can drill in 2 or 3 months later if the client or CEO asks for more information about something.”
  – With more data, “I can also do better segments” (for example, checking a subset like all women 50 and older vs. all men 50 and older). —Ania Rodriguez

34 Comparisons: Quantity of data
• What is too much data? What are the trade-offs between depth and breadth?
  – “You have to figure out upfront how much you want to know. Make sure you get all the data you need for your stakeholders.
  – “You won’t necessarily present all the data to all the audiences. Not all audiences get the same presentation.” The nitty-gritty goes into an appendix.
  – “You also don’t want to exhaust the users by asking for too much information.” —Michael Morgan

35 Comparisons: Missing data
• What do you lose if you can’t watch someone interacting with the site?
  – Some of the language they use to describe what they see. “eBay talk is ‘Sell your item’ and ‘Buy it now.’ People don’t talk that way. They say, ‘purchase an item immediately.’” —Michael Morgan
  – Reality check. “The only way to get good data is to test with 6 live users first. We find the main issues and frustrations, and then we validate them by running the test with 100 to 200 people.” —Ania Rodriguez
  – Body language, tone of voice, and differences because of demographics…

36 Comparisons: Missing data

37 Comparisons: Missing data

38 Comparisons: Relative expense
• What are the relative costs of moderated vs. unmoderated tests?
  – What’s your experience?

39 Comparisons: Time
• Which type of test takes longer to set up and analyze: moderated or unmoderated?
  – What’s your experience?

40 Comparisons: Subjects
• Is it easier or harder to get qualified subjects for unmoderated testing?
  – Keynote and UserZoom offer pre-qualified panels.
  – If you want to pick up people who use your site, an invitation on the site is perfect.
  – If you do permission marketing and have an e-mail list of customers or prospects already, you can use that.
• How do you know if the subjects are actually qualified?
  – Ask them to answer screening questions. Hope they don’t lie. Don’t let them retry (by setting a cookie).
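A minimal sketch of the “set a cookie so they can’t retry” idea, using Flask only as a stand-in web framework; the route, the cookie name, and the qualification rule are hypothetical, and the talk does not specify any particular tool or stack.

from flask import Flask, request, make_response

app = Flask(__name__)

@app.route("/screener", methods=["POST"])
def screener():
    # If this browser has already been through the screener, don't allow a retake.
    if request.cookies.get("screener_done"):
        return "You have already taken the screener for this study.", 403

    qualified = request.form.get("age_group") == "50+"   # hypothetical criterion
    resp = make_response("Thanks! On to the tasks." if qualified
                         else "Sorry, you don't qualify for this study.")
    resp.set_cookie("screener_done", "1", max_age=60 * 60 * 24 * 30)  # remember for ~30 days
    return resp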

41 Comparisons: Environment
• In unmoderated testing, participants use their own computers in their own environments. However, firewalls and job rules may make it difficult to get business users as subjects.
• Also, is taking people out of their home or office environments ever helpful—for example, by eliminating interruptions and distractions?

42 Comparisons: Geography
• Remote unmoderated testing makes it relatively easy to test in many different locations, countries, and time zones.
• However, moderated testing in different locations may help the design team understand the local situation better.

43 References
Farnsworth, Carol. (Feb. 2007) “Using Quantitative/Qualitative Customer Research to Improve Web Site Effectiveness.”
Fogg, B. J., Cathy Soohoo, David R. Danielson, Leslie Marable, Julianne Stanford, and Ellen R. Tauber. (June 2003) “Focusing on user-to-product relationships: How do users evaluate the credibility of Web sites? A study with over 2,500 participants.” Proceedings of the 2003 Conference on Designing for User Experiences (DUX ’03).
Fogg, B. J., Jonathan Marshall, Othman Laraki, Alex Osipovich, Chris Varma, Nicholas Fang, Jyoti Paul, Akshay Rangnekar, John Shon, Preeti Swani, and Marissa Treinen. (March 2001) “What makes Web sites credible? A report on a large quantitative study.” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’01).
Katz, Michael A., and Christian Rohrer. (2005) “What to report: Deciding whether an issue is valid.” User Experience 4(4).
Tullis, T. S., Fleischman, S., McNulty, M., Cianchette, C., and Bergel, M. (2002) “An Empirical Comparison of Lab and Remote Usability Testing of Web Sites.” Usability Professionals Association Conference, July 2002, Orlando, FL. http://members.aol.com/TomTullis/prof.htm
University of British Columbia Visual Cognition Lab. (Undated) Demos.

44 Commercial tools
• Keynote Systems (online usability testing)
  – Demo: “Try it now” on search_tools/webeffective.html
• UserZoom (online usability testing)
• WebSort.net (online card sorting tool)
• SurveyMonkey.com (online survey tool—basic level is free)
• Zoomerang.com (online survey tool)

45 Statistics
Darrell Huff, How to Lie With Statistics, W. W. Norton & Company (September 1993).
Julian L. Simon, Resampling: The New Statistics, 2nd ed., October 1997.
Michael Starbird, What Are the Chances? Probability Made Clear & Meaning from Data, The Teaching Company.

46 Questions? Contact us anytime!
Susan Fowler has been an analyst for Keynote Systems, Inc., which offers remote unmoderated user-experience testing. She is currently a consultant at FAST Consulting and an editorial board member of User Experience magazine. With Victor Stanwick, she is co-author of the Web Application Design Handbook (Morgan Kaufmann Publishers).