Blackmon, Kitajima, & Polson, CHI2005: Tool for Accurately Predicting Website Navigation Problems, Non-Problems, Problem Severity, and Effectiveness of Repairs


Blackmon, Kitajima, & Polson, CHI2005 1/26 Tool for Accurately Predicting Website Navigation Problems, Non-Problems, Problem Severity, and Effectiveness of Repairs Marilyn Hughes Blackmon, U. of Colorado Muneo Kitajima, AIST, Japan Peter Polson, U. of Colorado

Part One. Work supported by an NSF Grant to M. H. Blackmon.

Problem that spurred research and development of the tool
- Focus on users building comprehensive knowledge of a topic: they browse complex websites (rather than using a search engine), use pure forward search, and learn by exploration
- Automatically predict what is worth repairing: this needs an accurate measure of problem severity and a predicted success rate for repairs
- Web designers using the tool must be able to do what unaided designers cannot: predict the behavior of users different from themselves, objectively representing user diversity in background knowledge

Solution: Incrementally extend the Cognitive Walkthrough for the Web (CWW)
- CHI2002 paper tailored the Cognitive Walkthrough (CW) for web navigation: showed that CWW identifies usability problems that interfere with web navigation, substituting objective measures of similarity, familiarity, and elaboration of heading/link texts using Latent Semantic Analysis (LSA)
- CHI2003 paper showed significantly better performance on CWW-repaired webpages than on the original, unrepaired pages

Percent task failure correlated 0.93 with observed clicks (each task n≥38)

Research problem, reformulated: What determines mean clicks?
- Identify and repair factors that increase mean clicks and raise the risk of task failure
- Hypothetical determinants, based on prior results and the theory underlying CWW research:
  - Unfamiliar correct link, i.e., insufficient background knowledge to comprehend the link
  - Competing headings and their high-scent links
  - Competing links under the correct heading
  - Weak-scent correct link under the correct heading

First step: Collect enough data for multiple regression analysis
- Reused 64 tasks from the CHI2003 paper and ran additional experiments to get data on 100 new tasks, creating a 164-task dataset
- Developed automatable rules for CWW problem identification
- Built a multiple regression model for the 164-task dataset and found 3 independent variables explaining 57% of the variance
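The regression step described above can be sketched with ordinary least squares. The data and coefficients below are synthetic stand-ins, not the paper's dataset or fitted values; only the three predictor names come from the slides.

```python
import numpy as np

# Hypothetical sketch of the regression step: each task is coded by three
# predictors (unfamiliar correct link, count of competing links nested under
# competing headings, weak-scent correct link) plus observed mean clicks.
rng = np.random.default_rng(0)
n_tasks = 164
unfamiliar = rng.integers(0, 2, n_tasks)   # 0/1 indicator
competing = rng.integers(0, 8, n_tasks)    # count of competing links
weak_scent = rng.integers(0, 2, n_tasks)   # 0/1 indicator

# Synthetic outcome (illustrative coefficients, NOT the paper's fitted values)
clicks = (1.5 + 1.2 * unfamiliar + 0.75 * competing + 1.0 * weak_scent
          + rng.normal(0, 0.5, n_tasks))

# Ordinary least squares fit of mean clicks on the three predictors
X = np.column_stack([np.ones(n_tasks), unfamiliar, competing, weak_scent])
beta, *_ = np.linalg.lstsq(X, clicks, rcond=None)

pred = X @ beta
r2 = 1 - np.sum((clicks - pred) ** 2) / np.sum((clicks - clicks.mean()) ** 2)
print(beta.round(2), round(r2, 2))
```

With real data the same fit would recover the paper's coefficients and its reported variance explained.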

Multiple regression translates into a formula to predict problem severity
- The analysis yielded a formula for predicting mean total clicks on links, summing:
  - predicted clicks for a non-problem task (the baseline)
  - a penalty if the correct link is unfamiliar
  - 0.754 times the number of competing links nested under any competing heading
  - a penalty if the correct link has weak scent
  - zero clicks for competing links under the correct heading
- A prediction of ≥2.5 mean clicks distinguishes a problem from a non-problem

Example task: Find an article about the Hmong. Path: list of 9 categories > Social Science > Anthropology > scroll the A-Z list to find Hmong.


CWW-identified problems in the “Find Hmong” task: competing headings

Predicted mean clicks for the “Find Hmong” task on the original, unrepaired webpage, summing:
- predicted clicks for a non-problem task
- plus a penalty because the correct link is unfamiliar
- plus a penalty because the correct link has weak scent
- plus 0.754 × 5 (the number of competing links nested under any competing heading)
- total: predicted mean total clicks
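A minimal sketch of how such a severity formula combines penalties. Only the 0.754 per-competing-link coefficient and the 2.5/5.0 click thresholds come from the slides; the baseline and the other penalty values are placeholders, not the paper's fitted coefficients.

```python
# Placeholder coefficients (NOT the paper's fitted values) except 0.754
BASELINE = 1.5             # placeholder: predicted clicks for a non-problem task
UNFAMILIAR_PENALTY = 1.2   # placeholder
WEAK_SCENT_PENALTY = 1.0   # placeholder
PER_COMPETING_LINK = 0.754 # per-competing-link coefficient from the slide

def predicted_clicks(unfamiliar: bool, weak_scent: bool, competing_links: int) -> float:
    """Predicted mean total clicks for one task."""
    total = BASELINE
    if unfamiliar:
        total += UNFAMILIAR_PENALTY
    if weak_scent:
        total += WEAK_SCENT_PENALTY
    total += PER_COMPETING_LINK * competing_links
    return total

def severity(clicks: float) -> str:
    # Slide thresholds: >=2.5 clicks marks a problem, >=5.0 a serious problem
    if clicks >= 5.0:
        return "serious problem"
    if clicks >= 2.5:
        return "problem"
    return "non-problem"

# "Find Hmong": unfamiliar, weak scent, 5 competing links under competing headings
c = predicted_clicks(True, True, 5)
print(round(c, 2), severity(c))
```

Under these placeholder values, “Find Hmong” lands well past the 5.0-click threshold, matching the slide's conclusion that it is worth repairing.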

CWW-guided repairs of navigation usability problems detected by CWW
- IF competing heading(s), unfamiliar correct link, or weak-scent correct link: create alternate high-scent paths to the target webpage via all correct and competing headings
- IF unfamiliar correct link: substitute or elaborate the link text with familiar, higher-frequency words

Repair benefits for “Find Hmong,” a problem definitely worth repairing

All 164 tasks: Predicted vs. observed mean total clicks

Psychological validity measures for the 164-task dataset
- For the 46 tasks predicted to have serious problems (i.e., predicted clicks ≥ 5.0): 100% hit rate, 0% false alarms; 93% success rate for repairs (statistically significant difference, repaired vs. unrepaired)
- For all 75 tasks predicted to be problems: 92% hit rate, 8% false alarms; 83% success rate for repairs, significantly different repaired vs. unrepaired, p<.0001
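The hit-rate and false-alarm figures above can be reproduced on toy data. The `signal_detection` helper and the task tuples below are hypothetical; the rates are computed over predicted problems, which is how the slide's percentages appear to be defined (they sum to 100%).

```python
# Sketch of the validity measures: compare predicted problem status against
# observed status, reporting the fraction of predicted problems that were
# real (hits) vs. not (false alarms).
def signal_detection(pairs):
    hits = sum(1 for pred, obs in pairs if pred and obs)
    false_alarms = sum(1 for pred, obs in pairs if pred and not obs)
    predicted_problems = hits + false_alarms
    return hits / predicted_problems, false_alarms / predicted_problems

# Toy data shaped like the slide's 75-predicted-problem case:
# (predicted_problem, observed_problem) per task
tasks = [(True, True)] * 23 + [(True, False)] * 2 + [(False, False)] * 10
hit_rate, fa_rate = signal_detection(tasks)
print(f"{hit_rate:.0%} hits, {fa_rate:.0%} false alarms")
```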

Cross-validation study: Does the model replicate on a new dataset?
- Ran another large experiment to test whether the multiple regression formula replicated with a new set of tasks
- 2 groups; each group did 32 new tasks, 64 tasks total
- Used the prediction formula to identify problems vs. non-problems
- All tasks have just one correct link

Multiple regression analysis produced full cross-validation
- Multiple regression on the 64-task dataset gave the same 3 determinants found for the original 164-task dataset, with similar coefficients
- Hit rate for predicted problems = 90%, false alarms = 10%
- Correct rejections for predicted non-problems = 69%, misses 31%; but 2/3 of the misses had observed clicks ≤3.5, and the other 1/3 were >3.5 but <5.0

Predicted vs. observed clicks for the 64 tasks in the cross-validation experiment

Part Two

Theory matters: CWW is a theory-based usability evaluation method
- CoLiDeS cognitive model (Kitajima, Blackmon, & Polson, 2000, 2005)
- Construction-Integration cognitive architecture (Kintsch, 1998), a comprehensive model of human cognitive processes
- Latent Semantic Analysis (LSA)

The key idea
- The core process underlying Web navigation is skilled reading comprehension
- Comprehension processes build mental representations of goals and webpage objects (subregions, hyperlinks, images, and other targets for action)
- Action planning compares the goal with potential targets for action and selects the target with the highest activation level
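The action-planning step above can be sketched as an argmax over activation scores. The overlap-based scoring function and the example links below are toy stand-ins for the model's LSA-based semantic similarity.

```python
# Minimal sketch of CoLiDeS-style action planning: score each on-page target
# against the user's goal and select the one with the highest activation.
def select_target(goal_terms: set, targets: dict) -> str:
    def activation(link_terms: set) -> float:
        # Toy Jaccard overlap standing in for goal-link semantic similarity
        return len(goal_terms & link_terms) / max(len(goal_terms | link_terms), 1)
    return max(targets, key=lambda name: activation(targets[name]))

goal = {"article", "hmong", "people"}
links = {
    "Anthropology": {"anthropology", "people", "culture", "article"},
    "Chemistry": {"chemistry", "elements"},
    "History": {"history", "events", "article"},
}
print(select_target(goal, links))  # -> Anthropology
```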

Consensus: Web navigation is equivalent to following a scent trail
- Scent or residue (Furnas, 1997)
- SNIF-ACT, based on Information Foraging (Pirolli & Card, 1999)
- Bloodhound Project: Web User Flow by Information Scent (WUFIS) => InfoScent Simulator (Chi et al., 2001, 2003)
- CWW activation level

CoLiDeS activation level: Scent is MORE than just similarity
- Does the user have adequate background knowledge to comprehend the headings and links? Select the semantic space that best matches the user group; warning bell for low word frequency; warning bell for a low term vector
- Before computing similarity, simulate human elaboration of link texts during comprehension, using LSA near neighbors to find terms simultaneously familiar and similar in meaning
- Compute goal-heading and goal-link similarity with LSA cosines, defining weak scent as a cosine <0.10 and moderate scent as a cosine ≥0.30
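The cosine thresholds on this slide can be illustrated directly. The term-count vectors below are toy stand-ins for LSA document vectors; only the 0.10 and 0.30 cutoffs come from the slide, and the "marginal" label for the gap between them is an assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def scent_label(cos):
    # Slide cutoffs: cosine < 0.10 is weak scent, cosine >= 0.30 is moderate
    if cos < 0.10:
        return "weak scent"
    if cos >= 0.30:
        return "moderate scent"
    return "marginal scent"  # assumed label for the in-between range

goal_vec = [3, 1, 0, 2]   # toy term counts for the elaborated goal
link_vec = [2, 0, 1, 2]   # toy term counts for one link text
c = cosine(goal_vec, link_vec)
print(round(c, 3), scent_label(c))
```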

Conclusions: Extending CWW succeeded for research and development of the tool
- We CAN now predict the severity of navigation usability problems and the success rate for repairing them, so we invest time in repairing only what is worth repairing: tasks predicted to take ≥5.0 clicks
- Web designers using the tool CAN do what unaided designers cannot: predict the behavior of users different from themselves, objectively representing user diversity in education level, culture, language, and field of expertise (background knowledge)

Conclusions, continued
- Scales up to large websites
- Reliable (LSA measures vs. human judgments)
- Psychologically valid (228-task dataset; large n gives a stable mean for each task), based on a cognitive model
- Theory matters: it drives experimental design and underlies the tool's high accuracy and psychological validity
- Practitioners and researchers can now put the tool to use with trust


Non-problem task “Find Fern” approaches the asymptote of pure forward search
- One-click minimum path for both problems AND non-problems
- 1.1 mean total clicks on links
- 90% pure forward search (minimum-path solution)
- 97% of first clicks were on a link under the correct heading
- 100% success rate: everyone finished the task in 1 or 2 clicks
- 9 seconds mean solution time