
1 Evaluation Shneiderman and Plaisant Chapter 4

2 Introduction
Iterative design
–Current “best practice”
–Specialization of Boehm’s spiral model
–Cost increases in radial dimension
Last time, design
–“Radically transformational”
–So, count on multiple passes
–Requirements: know task and user
–Guidelines
This time, evaluation
–How, after all, to know if usable
[Diagram: Design → Implement → Evaluate cycle]

3 Overview
Introduction
–Evaluation plans, acceptance testing, and life cycle
Expert reviews
Usability testing and techniques
–Goal is to “engineer” a good interface, constrained by time and cost
Survey instruments
Acceptance tests
Evaluation during active use
“Controlled psychologically oriented experiments”
–Elements of science, as applied to interface evaluation

4 Introduction
Usability of interface design – a key component … from the 2nd week!
Evaluation required to know how/if “usable”
–By whatever means … reviews, surveys, etc.
Again, what makes sense (is appropriate) for a programmer/expert is not right for the general user population
–… an early point
–“Know thy user” – and know how “thy user” performs with the system and how the system performs with “thy user”
–And the way to know is by evaluation
In Shneiderman-ese:
–“Designers can become so entranced with their creations that they may fail to evaluate them adequately.”
–“Experienced designers have attained the wisdom and humility to know that extensive testing is a necessity.”
–“If feedback is the ‘breakfast of champions’, then testing is the ‘dinner of gods’.”

5 Evaluation Plan – Metrics of Usability: “It’s Fundamental”
“Evaluation plan” should be part of system development … and life cycle … in larger projects
–Also, part of “acceptance tests”
Objective, measurable goals for hardware and software performance
–For system performance: response time, functionality, reliability, …
–For usability and user experience: values on specific metrics
–Also part of “maintenance” after deployment
As noted, metrics of usability include (see the sketch below):
–Time to learn specific tasks
–Speed of task performance
–Rate of errors
–Retention of commands or task sequences over time
–Frequency of help/assistance requests
–Subjective user satisfaction
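
A minimal sketch (not from the text) of how a team might record these metrics as measurable goals, so an evaluation plan and later acceptance tests can check observed values against targets. The class, field names, and sample numbers are all illustrative assumptions.

```python
# Hypothetical record of one measurable usability goal from an
# evaluation plan; for simplicity this sketch assumes lower-is-better,
# which holds for all of the listed metrics except satisfaction.
from dataclasses import dataclass

@dataclass
class UsabilityGoal:
    metric: str      # e.g., "time to learn task X"
    target: float    # goal value agreed in the evaluation plan
    measured: float  # value observed during testing
    unit: str = ""

    def met(self) -> bool:
        return self.measured <= self.target

goals = [
    UsabilityGoal("time to learn order-entry task", 30.0, 27.5, "min"),
    UsabilityGoal("error rate per task", 0.05, 0.08),
]
for g in goals:
    print(f"{g.metric}: {'met' if g.met() else 'NOT met'}")
```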

6 Evaluation Plan – Depends on Project
Evaluation plan varies, depending on project
–Range of costs might be from 10%–20% of a project down to 5%
–Range of evaluation plans might be from years down to a few days of testing
Will have different kinds of elements depending on:
–Stage of design (early, middle, late): key screens, prototype, final system
–Novelty of project: well defined vs. exploratory
–Number of expected users
–Criticality of the interface, e.g., life-critical medical system vs. museum exhibit support
–Costs of product and finances allocated for testing
–Time available
–Experience of design and evaluation team

7 “Step-by-Step Usability Guide” – web site example (from Shneiderman)

8 But, Testing is not a Panacea, so …
Nonetheless, testing can’t eliminate all problems
–In essence, plan for remaining problems/challenges – as part of the evaluation plan!
Cost of eliminating error (or enhancing performance) does not increase linearly
–I.e., the obvious things are easy; the hard ones require more resources to refine
–Design/cost decision made about what amount of cost to allocate
Still, some things are extremely hard to test
–E.g., user performance under stress

9 Expert Reviews, 1
Expert reviews range from “just asking for feedback” to structured techniques, e.g., heuristic review, guidelines review
–… of course, you have to have an expert, and large organizations do
–Expert needs to be familiar with domain and design goals
One-half day to one week of effort
–But a lengthy training period may sometimes be required to explain task domain or operational procedures
Even informal demos to colleagues or customers can provide some useful feedback
–More formal expert reviews have proven to be effective

10 Expert Reviews, 2
Can be scheduled at several points in development process
–When experts are available
–When design team is ready for feedback
Different experts tend to find different problems in an interface
–3–5 expert reviewers can be highly productive, as can complementary usability testing
Caveats:
–Experts may not have an adequate understanding of task domain or user communities
–Conflicting advice
–Even experienced expert reviewers have great difficulty knowing how typical users, especially first-time users, will really behave

11 Expert Review Techniques
Heuristic evaluation
–General review for adherence of interface to principles of successful design, e.g., Nielsen’s
–E.g., “error messages should be informative”, “feedback provided”
–Adherence to some theory or model, e.g., object-action model
Guidelines review
–Check for conformance with guidelines
–Given complexity of guidelines, can be significant effort
Consistency inspection
–E.g., of interface terminology, fonts, color schemes, input/output formats
Formal usability inspection/review – as part of SE process
–Structured forum for critiquing (if not courtroom style …)
–Might be occasion to request exception for guideline deviation
Bird’s-eye view of interface
–E.g., full set of printed screens on a wall
–Inconsistencies, organization, etc. more evident
Cognitive walkthrough …

12 Heuristic Evaluation – Recall Nielsen’s Heuristics
Meet expectations
–1. Match the real world
–2. Consistency & standards
–3. Help & documentation
User is boss
–4. User control & freedom
–5. Visibility of system status
–6. Flexibility & efficiency
Errors
–7. Error prevention
–8. Recognition, not recall
–9. Error reporting, diagnosis, and recovery
Keep it simple
–10. Aesthetic & minimalist design

13 Heuristic Evaluation (cf. Nielsen, useit.com article)
A small number of experts/evaluators either use or observe use of the system and provide a list of problems, based on heuristics
–A type of “discount usability testing”
–Recall the principles of Nielsen, Shneiderman, Tognazzini, and others
Some evaluators find some problems, others find others
–Nielsen recommends 3–5 evaluators
Steps:
–Inspect UI thoroughly
–Compare UI against heuristics
–List usability problems
–Explain and justify each problem with heuristics

14 How To Do Heuristic Evaluation – Details
Justify every problem with a heuristic
–“Too many choices on the home page – Aesthetic & minimalist design”
–Can’t just say “I don’t like the colors”, but can justify
List every problem
–Even if an interface element has multiple problems
Go through the interface at least twice
–Once to get the feel of the system
–Again to focus on particular interface elements
Don’t limit yourself to a single heuristic set (“8 Golden Rules”, Nielsen’s, etc.)
–Others: affordances, visibility, perceptual elements, color principles
–But a particular heuristic set, e.g., Nielsen’s, is easier to compare against

15 Example.

16 Shopping cart icon not balanced with its background whitespace (Aesthetic & minimalist design)
Good: user is greeted by name (Visibility of system status)
Red is used both for help messages and for error messages (Consistency, Match the real world)
“There is a problem with your order”, but no explanation or suggestions for resolution (Error reporting)
ExtPrice and UnitPrice are strange labels (Match the real world)
Remove Hardware button inconsistent with Remove checkbox (Consistency)

17 Example
“Click here” is unnecessary (Aesthetic & minimalist design)
No “Continue shopping” button (User control & freedom)
Recalculate is very close to Clear Cart (Error prevention)
“Check Out” button doesn’t look like other buttons (Consistency, both internal & external)
Uses “Cart Title” and “Cart Name” for the same concept (Consistency)
Must recall and type in cart title to load (Recognition not recall, Error prevention, Flexibility & efficiency)

18 Heuristic Evaluation is Not User Testing
Evaluators are not the user either
–Maybe closer to being a typical user than the coder/developer is, though
Analogy: code inspection vs. testing
Heuristic evaluation finds problems that user testing often misses
–E.g., inconsistent fonts
But user testing is the “gold standard” for usability

19 Hints for Better Heuristic Evaluation
Use multiple evaluators
–Different evaluators find different problems
–The more the better, but with diminishing returns
–Nielsen recommends 3–5 evaluators
Alternate heuristic evaluation with user testing
–Each method finds different problems
–Heuristic evaluation is cheaper
Use an “observer” with the evaluator
–Adds cost, but cheap enough anyway
–Takes notes
–Provides domain guidance, where needed
–It’s OK for the observer to help the evaluator, as long as the problem has already been noted; this wouldn’t be OK in a user test

20 Writing Good Heuristic Evaluations (fyi)
Heuristic evaluations must communicate well to developers and managers
Include positive comments as well as criticisms
–“Good: Toolbar icons are simple, with good contrast and few colors (minimalist design)”
Be tactful
–Not: “the menu organization is a complete mess”
–Better: “menus are not organized by function”
Be specific
–Not: “text is unreadable”
–Better: “text is too small, and has poor contrast (black text on dark green background)”

21 Suggested Report Format (fyi)
What to include:
–Problem
–Heuristic
–Description
–Severity
–Recommendation (if any)
–Screenshot (if helpful)
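
A hypothetical sketch of one finding written to this format, reusing a problem from the earlier example slides; the field names and file name are illustrative, not a standard schema.

```python
# One heuristic-evaluation finding in the suggested report format.
# Severity uses the 1-4 scale defined on a later slide.
finding = {
    "problem":        "Recalculate button sits right next to Clear Cart",
    "heuristic":      "Error prevention",
    "description":    "A slip while recalculating can destroy the cart contents.",
    "severity":       3,  # major: needs fixing, high priority
    "recommendation": "Separate the buttons; confirm before clearing the cart.",
    "screenshot":     "cart_page.png",  # hypothetical file
}
```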

22 Formal Evaluation
Formal evaluation is typically a much larger effort
–Again, consider a large-scale SE project
Will look at some elements of formal evaluation
–Training, evaluation, severity ratings, debriefing

23 Formal Evaluation Process
1. Training
–Meeting for design team & evaluators
–Introduce application
–Explain user population, domain, scenarios
2. Evaluation
–Evaluators work separately
–Generate written report, or oral comments recorded by an observer
–Focus on generating problems, not on ranking their severity yet
–1–2 hours per evaluator
3. Severity Rating
–Evaluators prioritize all problems found (not just their own)
–Take the mean of the evaluators’ ratings
4. Debriefing
–Evaluators & design team discuss results, brainstorm solutions

24 Severity Ratings
Contributing factors
–Frequency: how common?
–Impact: how hard to overcome?
–Persistence: how often must it be overcome?
Severity scale (used in the sketch below)
–1. Cosmetic: need not be fixed
–2. Minor: needs fixing but low priority
–3. Major: needs fixing and high priority
–4. Catastrophic: imperative to fix
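
A small sketch of the severity-rating step from the process slide: every evaluator rates every problem found (not just their own) on the 1–4 scale above, and the mean drives prioritization. The example problems echo earlier slides; the ratings themselves are made-up illustration data.

```python
# Mean severity per problem across evaluators, highest priority first.
from statistics import mean

ratings = {  # problem -> one 1-4 rating per evaluator (illustrative)
    "red used for both help and error messages": [2, 3, 2],
    "no Continue Shopping button":               [3, 3, 4],
    "shopping cart icon unbalanced":             [1, 1, 2],
}
for problem, scores in sorted(ratings.items(),
                              key=lambda kv: mean(kv[1]), reverse=True):
    print(f"{mean(scores):.1f}  {problem}")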

25 Evaluating Prototypes
Heuristic evaluation works on:
–Sketches
–Paper prototypes
–Unstable prototypes
“Missing-element” problems are harder to find on sketches
–Because you’re not actually using the interface, you aren’t blocked by a feature’s absence
–Look harder for them

26 Cognitive Walkthrough
Expert “walks through” the design, as a user would, to carry out specific tasks
–Identifies potential problems using psychological principles
–E.g., “user has to remember action too long to successfully recall” – the principle that short-term memory is limited (more later)
–Usually performed by an expert in cognitive psychology
–Evaluates design on how well it supports the user in learning the task, etc.
Analysis focuses on goals and knowledge:
–Does the interface design lead the user to generate the correct goals?
–E.g., at a low level, having an arrow in a list box helps the user form the goal to click and select among alternatives
For each task, the walkthrough considers:
–What impact will interaction have on the user?
–What cognitive processes are required?
–What learning problems may occur?

27 Kinds of User Tests
Formative evaluation
–Find problems for next iteration of design
–Evaluates prototype or implementation, in lab, on chosen tasks
–Qualitative observations (usability problems)
Field study
–Find problems in context
–Evaluates working implementation, in real context, on real tasks
–Mostly qualitative observations
Controlled experiment
–Tests a hypothesis, e.g., interface X is faster than interface Y
–Evaluates working implementation, in controlled lab environment, on chosen tasks
–Mostly quantitative observations (time, error rate, satisfaction)

28 Usability Testing and Laboratories, 1
Usability testing and laboratories since early 1980s
–Sped up projects and cut costs – which led to acceptance and implementation
Usability testing is a unique practice
–Roots of techniques in experimental psychology
–Again, interface design is like engineering, which draws on science but is a practice
–Not testing hypotheses about theories; rather, the goal is to refine interfaces rapidly
–A “variable or few at a time” approach is not appropriate: too slow, too costly
“User interface architect”
–Works with usability laboratory
–Carries out elements of evaluation plan
“Pilot test”

29 Usability Testing and Laboratories, 2
Small usability lab – two areas:
–One for the participants to perform the tasks
–Another, separated by a half-silvered mirror, for the testers and observers
Participants should be chosen to represent the intended user communities
–Consider … background in computing, experience with the task, motivation, education, ability with the natural language used in the interface

30 Usability Testing – Techniques, 1
Thinking-aloud protocols
–Surprisingly straightforward and effective technique
–Users simply say what they are doing
–User observed performing task; user asked to describe what he is doing and why, what he thinks is happening, etc.
–Well-studied methodology
–Advantages: simplicity (requires little expertise); can provide useful insight; can show how the system is actually used
–Disadvantages: subjective; selective; the act of describing may alter task performance

31 Usability Testing – Techniques, 2
Videotaping
–Useful for later review and for showing designers or managers the problems users encounter
–Sessions can be “coded” by observers for data reduction
Paper mockups
–Sketches, storyboards – actually used often!
–Different skills (and costs) for programming and sketching
“Discount usability testing”
–Shneiderman’s (and others’) term for “quick and dirty”
–“Rapid usability testing”: rapid, perhaps low-fidelity, prototype; “global” task performance
Competitive usability testing
–Compares new design with existing or others
–Essentially, “incremental” testing of changes with the existing design as baseline

32 Usability Testing – Techniques, 3
Universal usability testing
–Considers diversity of hardware platforms and users
–E.g., ambient light levels, network speed, age groups, color-blindness
Field test and portable labs
–Puts logging software, usability task, video equipment, etc. where they will be used
–Cost benefits and validity
Remote usability testing
–Web-based application natural; e-feedback in general
Can-you-break-this tests
–… like it says …

33 Usability Testing – Limitations
Emphasizes first-time users
–After all, people are brought into a laboratory; in fact, large labs often solicit them via newspaper
–Short sessions show only the first part of the learning curve …
Only possible to see part of complete system functionality
Testing should be performed in the environment in which the system is to be used
–Office, home, outside … not in a laboratory
–Lab testing misses context of use
Testing should be performed for a long duration
–And such long-duration testing should be part of the plan, but often is not

34 Ethics of User Testing
Users are human beings
–Human subjects have been seriously abused in the past
–Research involving user testing is now subject to close scrutiny
–Institutional Review Board (IRB) must approve user studies
Pressures on a user
–Performance anxiety
–Feels like an intelligence test
–Comparing self with other subjects
–Feeling stupid in front of observers
–Competing with other subjects
Informed consent statement:
“I have freely volunteered to participate in this experiment. I have been informed in advance what my task(s) will be and what procedures will be followed. I have been given the opportunity to ask questions, and have had my questions answered to my satisfaction. I am aware that I have the right to withdraw consent and to discontinue participation at any time, without prejudice to my future treatment. My signature below may be taken as affirmation of all the above statements; it was given prior to my participation in this study.”

35 Treat the User With Respect
Time
–Don’t waste it
Comfort
–Make the user comfortable
Informed consent
–Inform the user as fully as possible
Privacy
–Preserve the user’s privacy
Control
–The user can stop at any time

36 Before a Test
Time
–Pilot-test all materials and tasks
Comfort (psychological and physical)
–“We’re testing the system; we’re not testing you”
–“Any difficulties you encounter are the system’s fault. We need your help to find these problems.”
Privacy
–“Your test results will be completely confidential”
Information
–Brief about purpose of study
–Inform about audio taping, videotaping, other observers
–Answer any questions beforehand (unless biasing)
Control
–“You can stop at any time.”

37 During the Test
Time
–Eliminate unnecessary tasks
Comfort
–Calm, relaxed atmosphere
–Take breaks in a long session
–Never act disappointed
–Give tasks one at a time
–First task should be easy, for an early success experience
Privacy
–E.g., user’s boss shouldn’t be watching
Information
–Answer questions (where they won’t bias)
Control
–User can give up a task and go on to the next
–User can quit entirely

38 After the Test
Comfort
–Say what they’ve helped you do
Information
–Answer questions that you had to defer to avoid biasing the experiment
Privacy
–Don’t publish user-identifying information
–Don’t show video or audio without user permission

39 Formative Evaluation
Find some users
–Should be representative of the target user class(es), based on user analysis
Give each user some tasks
–Should be representative of important tasks, based on task analysis
–Watch the user do the tasks
Roles in formative evaluation
–User
–Facilitator
–Observers

40 User’s Role
E.g., user should think aloud:
–What they think is happening
–What they’re trying to do
–Why they took an action
Problems
–Feels odd
–Thinking aloud may alter behavior
–Disrupts concentration
Another approach: pairs of users
–Two users working together are more likely to converse naturally
–Also called co-discovery or constructive interaction

41 Facilitator’s Role
Does the briefing
Provides the tasks
Coaches the user to think aloud by asking questions
–“What are you thinking?”
–“Why did you try that?”
Controls the session and prevents interruptions by observers

42 Observer’s Role
Be quiet
–Don’t help, don’t explain, don’t point out mistakes
Take notes
–Watch for critical incidents: events that strongly affect task performance or satisfaction
–Usually negative: errors, repeated attempts, curses
–May be positive: “Cool”, “Oh, now I see”

43 Recording Observations
Pen & paper notes
–Prepared forms can help
Audio recording
–For think-aloud
Video recording
–Usability labs often set up with two cameras: one for the user’s face, one for the screen
–User may be self-conscious
–Good for closed-circuit viewing by observers in another room
–Generates too much data
–Retrospective testing: go back through the video with the user, discussing critical incidents
Screen capture & event logging
–Cheap and unobtrusive (see the sketch below)
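
A minimal sketch of the “cheap and unobtrusive” event-logging idea: timestamp each user action so critical incidents can be located later and replayed against screen capture. The event names, fields, and log format are illustrative assumptions, not part of any particular tool.

```python
# Append one JSON line per user event, with seconds since session start.
import json
import time

class EventLog:
    def __init__(self, path):
        self.f = open(path, "a")
        self.t0 = time.monotonic()

    def record(self, event, **detail):
        entry = {"t": round(time.monotonic() - self.t0, 3),
                 "event": event, **detail}
        self.f.write(json.dumps(entry) + "\n")

log = EventLog("session.log")
log.record("click", target="Recalculate")
log.record("error", message="There is a problem with your order")
```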

44 How Many Users? (supplementary)
Landauer-Nielsen model
–Every tested user finds a fraction L of the usability problems (typical L = 31%)
–If user tests are independent, then n users find a fraction 1 − (1 − L)^n
–So 5 users will find ~85% of the problems
Which is better:
–Using 15 users to find 99% of problems with one design iteration, or
–Using 5 users to find 85% of problems with each of three design iterations?
For multiple user classes, get 3–5 users from each class
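
The Landauer-Nielsen estimate written out as code, reproducing the slide’s numbers (the L = 8% case anticipates the next slide):

```python
# Fraction of usability problems found by n independent users,
# each of whom finds a fraction L on their own.
def fraction_found(L: float, n: int) -> float:
    return 1 - (1 - L) ** n

print(fraction_found(0.31, 5))   # ~0.84  ("5 users find ~85%")
print(fraction_found(0.31, 15))  # ~0.996 ("15 users find ~99%")
print(fraction_found(0.08, 5))   # ~0.34  (Spool & Schroeder's L, next slide)
```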

45 Flaws in the Landauer-Nielsen Model (supplementary)
L may be much smaller than 31%
–Spool & Schroeder’s study of a CD-purchasing web site found L = 8%, so 5 users would find only ~34% of the problems
L may vary from problem to problem
–Different problems have different probabilities of being found, caused by: individual differences, interface diversity, task complexity
Lesson: you can’t predict with confidence how many users may be needed

46 Usability Testing – Other Techniques: Eye Tracking
Physiological methods
Eye tracking
–Head- or desk-mounted equipment tracks position of the eye
–Eye movement reflects the amount of cognitive processing a display requires
Measurements include:
–Fixations: eye maintains stable position; number and duration indicate level of difficulty with the display
–Saccades: rapid eye movement from one point of interest to another
–Scan paths: moving straight to a target with a short fixation at the target is optimal
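
To make fixation vs. saccade concrete, here is a rough sketch of dispersion-threshold fixation detection (in the spirit of the standard I-DT approach of Salvucci & Goldberg, which the slides do not name): gaze samples that stay inside a small spatial window for a minimum duration count as one fixation; the jumps between windows are saccades. The thresholds and coordinate units are illustrative assumptions.

```python
def dispersion(win):
    # Spread of a window of gaze points: (max x - min x) + (max y - min y).
    xs = [p[0] for p in win]
    ys = [p[1] for p in win]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def fixations(points, max_disp=30, min_pts=5):
    """points: list of (x, y) gaze samples; returns (start, end) index pairs."""
    out, i = [], 0
    while i + min_pts <= len(points):
        j = i + min_pts
        if dispersion(points[i:j]) <= max_disp:
            # Grow the window while the points stay tightly clustered.
            while j < len(points) and dispersion(points[i:j + 1]) <= max_disp:
                j += 1
            out.append((i, j - 1))   # one fixation
            i = j                    # the jump to the next window is a saccade
        else:
            i += 1
    return out
```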

47 Usability Testing – “Heat Maps”, 1
Eye movement (gaze) data mapped to false color
[Figure from “Eyetracking Web Usability”, Nielsen, 2009]

48 Usability Testing – “Heat Maps”, 2
Eye movement (gaze) data mapped to false color
[Figure from “Search Engine Optimization for Dummies”, Koneka]

49 Usability Testing – Other Techniques: Physiological Measurements
Emotional response is linked to physical changes, which may help determine a user’s reaction to an interface
Measurements include:
–Heart activity, including blood pressure, volume, and pulse
–Activity of sweat glands: galvanic skin response (GSR)
–Electrical activity in muscle: electromyogram (EMG)
–Electrical activity in brain: electroencephalogram (EEG)
Difficulty in interpreting these physiological responses – more research needed

50 Survey Instruments and Questionnaires (briefly)
Familiar, inexpensive, and generally acceptable companion to usability tests and expert reviews
–Advantages: quick, reaches a large user group, can be analyzed rigorously
–Disadvantages: less flexible, less probing
–Long, detailed example in the text
Keys to successful surveys:
–Clear goals in advance – what information is required
–Development of focused items that help attain the goals
Styles of question
–General, open-ended, scalar, multiple-choice, ranked (a scalar-item summary sketch follows)
Users could be asked for their subjective impressions about specific aspects of the interface, such as representations of:
–Task domain objects and actions
–Syntax of inputs and design of displays
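
A small sketch of how scalar (e.g., 1–7 Likert) survey items might be summarized for a report; the item wordings and scores are illustrative, echoing the subjective-satisfaction metric from earlier slides.

```python
# Per-item mean, median, and sample size for scalar survey responses.
from statistics import mean, median

responses = {  # item -> one 1-7 score per respondent (illustrative)
    "the display layout was clear":    [6, 5, 7, 4, 6],
    "error messages were informative": [2, 3, 2, 4, 3],
}
for item, scores in responses.items():
    print(f"{item}: mean={mean(scores):.1f} "
          f"median={median(scores)} n={len(scores)}")
```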

51 Surveys and Questionnaires, 2 (briefly)
Other goals would be to ascertain:
–Users’ background (age, gender, origins, education, income)
–Experience with computers (specific applications or software packages, length of time, depth of knowledge)
–Job responsibilities (decision-making influence, managerial roles, motivation)
–Personality style (introvert vs. extrovert, risk-taking vs. risk-averse, early vs. late adopter, systematic vs. opportunistic)
–Reasons for not using an interface (inadequate services, too complex, too slow)
–Familiarity with features (printing, macros, shortcuts, tutorials)
–Their feeling state after using an interface (confused vs. clear, frustrated vs. in-control, bored vs. excited)
Online surveys avoid the cost of printing and the extra effort needed for distribution and collection of paper forms
–Many people prefer to answer a brief survey displayed on a screen instead of filling in and returning a printed form, although there is a potential bias in the sample

52 Acceptance Test (briefly)
As noted at the outset:
–For large implementation projects, the customer or manager usually sets objective and measurable goals for hardware and software performance
–If the completed product fails to meet these acceptance criteria, the system must be reworked until success is demonstrated
–These criteria are among the project’s “deliverables”
Again, measurable criteria for the user interface can be established and might include:
–Time to learn specific functions
–Speed of task performance
–Rate of errors by users
–Human retention of commands over time
–Subjective user satisfaction
In a large system, there may be 8 or 10 such tests to carry out on different components of the interface and with different user communities
Once acceptance testing has been successful, there may be a period of field testing before national or international distribution

53 Evaluation During Active Use (briefly)
Recall, the evaluation plan should include evaluation throughout the software’s life cycle
–Successful active use requires constant attention from dedicated managers, user-services personnel, and maintenance staff
–“Perfection is not attainable, but percentage improvements are possible”
Idea of “gradual interface dissemination” useful for minimal disruption
–Continue to fix problems and refine design (including user interface)
–Taken further by alpha and beta testing
Many techniques available:
–Interviews and focus group discussions
–Continuous user-performance data logging
–Online suggestion box or e-mail trouble reporting
–Discussion groups and newsgroups
Interviews and focus group discussions
–Interviews with individual users can be productive because the interviewer can pursue specific issues of concern
–Group discussions are valuable to ascertain the universality of comments

54 Evaluation During Active Use, 2 (briefly)
Continuous user-performance data logging (see the sketch below)
–The software architecture should make it easy for system managers to collect data about patterns of system usage, speed of user performance, rate of errors, and frequency of requests for online assistance
–A major benefit is guidance to system maintainers in optimizing performance and reducing costs for all participants
Online or telephone consultants
–Many users feel reassured if they know there is human assistance available
–On some network systems, the consultants can monitor the user’s computer and see the same displays that the user sees
Online suggestion box or e-mail trouble reporting
–Electronic mail to the maintainers or designers
–For some users, writing a letter may seem to require too much effort
Discussion groups and newsgroups
–Permit postings of open messages and questions
–Some are independent, e.g., on America Online and Yahoo!
–Topic lists, sometimes moderators
–Social systems
–Comments and suggestions should be encouraged
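
A sketch of the data-logging idea above: aggregate per-session records into the usage measures the slide names (speed of performance, error rate, help requests). The record fields and numbers are assumptions for illustration.

```python
# Aggregate logged sessions into simple per-task usage statistics.
sessions = [
    {"tasks": 12, "seconds": 340, "errors": 2, "help_requests": 1},
    {"tasks": 9,  "seconds": 410, "errors": 5, "help_requests": 3},
]
tasks = sum(s["tasks"] for s in sessions)
print("mean task time:",
      sum(s["seconds"] for s in sessions) / tasks, "s")
print("error rate per task:",
      sum(s["errors"] for s in sessions) / tasks)
print("help requests per task:",
      sum(s["help_requests"] for s in sessions) / tasks)
```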

55 Controlled Psychologically-oriented Experiments – Context
Recall the idea that goals for the engineering practice of interface design and implementation differ from goals for the science of psychology (or, for that matter, a science of HCI)
–Goal of interface design is to design and implement (“good”) interfaces rapidly – in a pragmatic, cost-effective manner
–Hence, different techniques are appropriate
Following from this, Shneiderman suggests that:
–As scientific and engineering progress is often stimulated by improved techniques for precise measurement,
–Rapid progress in the design of interfaces will be stimulated as researchers and practitioners evolve suitable human-performance measures and techniques
–For example: appliances have energy-efficiency ratings; interfaces might have measures such as learning time for tasks and user-satisfaction ratings
A second principle Shneiderman suggests is to “adapt” elements of “the scientific method” to HCI, or interface design
–Bears looking at, but understand that he is speaking of both “science” and “empirical investigation”
–In fact, this is how you, as students educated in science and engineering, should think!

56 Controlled Psychologically-oriented Experiments – “Empirical Investigation for Interface Design” (supplemental)
“The scientific method” (Shneiderman), or empirical investigation, as applied to HCI and interface design:
–Deal with a practical problem and consider the theoretical framework
To help the user learn how to navigate through the information presented, he/she should be shown which items to select as objects and where committing to some action will take him/her
–State a lucid and testable hypothesis
“By changing the (color, font) of the item, it will be more easily selected, as shown by the time to perform the task decreasing by 2 seconds”
–Identify a small number of independent variables that are to be manipulated
Those things to change (manipulate), e.g., color, font
–Carefully choose the dependent variables that will be measured
Those things to measure, e.g., time to complete task
–Judiciously select subjects and carefully or randomly assign subjects to groups
As noted below – one of several “biasing factors”
–Control for biasing factors (non-representative sample of subjects or selection of tasks, inconsistent testing procedures)
So that any change in the value of the dependent variable is not attributable to anything except the difference in the independent variable
–Apply statistical methods to data analysis (see the sketch below)
So you know what to expect by chance, measurement error, etc.
–Resolve the practical problem, refine the theory, and give advice to future researchers
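
A sketch of the statistical-analysis step for the hypothesis above: compare task-completion times (the dependent variable) under the two color/font treatments (the independent variable) with a two-sample t-test. SciPy is assumed available, and the timing data is made up for illustration.

```python
# Two-sample t-test: is the treatment's speedup likely due to chance?
from scipy import stats

control   = [14.2, 15.1, 13.8, 16.0, 14.9]  # seconds, original design
treatment = [12.1, 12.8, 11.9, 13.4, 12.5]  # seconds, changed color/font

t, p = stats.ttest_ind(control, treatment)
print(f"t = {t:.2f}, p = {p:.4f}")  # small p: unlikely to be chance alone
```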

63 End
Materials from:
–Shneiderman publisher site: http://wps.aw.com/aw_shneider_dtui_4&5/
–Scott Klemmer’s Intro. to HCI Design course: http://hci.stanford.edu/courses/cs147/
–MIT OpenCourseWare, Robert Miller’s User Interface Design and Implementation: http://ocw.mit.edu/OcwWeb/Electrical-Engineering-and-Computer-Science/6-831Fall-2004/CourseHome/index.htm

