Chapter 9: Evaluation Techniques

Evaluation Techniques
Evaluation objective
– tests the usability, efficiency and functionality of the system
– occurs in the laboratory, in the field and/or in collaboration with users
– evaluates both design and implementation
– should be considered at all stages in the design life cycle

Goals of Evaluation
– assess extent of system functionality
– assess effect of interface on user
– identify specific problems

Evaluating Designs
– Cognitive Walkthrough
– Heuristic Evaluation
– Review-based evaluation

Cognitive Walkthrough
Proposed by Polson et al. A walkthrough requires a detailed review of a sequence of actions. The cognitive walkthrough approach to evaluation originates in the code walkthrough, in which the stages of a piece of code are stepped through to check that it follows best practice.

Objective of Cognitive Walkthrough
Walkthroughs are done to
– confirm how easily the system helps users learn to use the software and achieve their tasks with it efficiently. Experience shows that users prefer to learn through personal exploration rather than through training alone.
– discover possible problems the user may encounter in using the software, by systematically going through the provided steps

Cognitive Walkthrough (ctd)
For each task the walkthrough considers
– what impact will the interaction have on the user?
– what cognitive (knowledge-based) processes are required?
– what learning problems may occur?

To do a walkthrough you need four things:
1. A specification or prototype of the system. It doesn’t have to be complete, but it should be fairly detailed. Details such as the location and wording of a menu can make a big difference.
2. A description of the task the user is to perform on the system. This should be a representative task that most users will want to do (what is to be done).

To do a walkthrough you need four things (ctd):
3. A complete, written list of the actions needed to complete the task with the proposed system (how to do the "what").
4. An indication of who the users are and what kind of experience and knowledge the evaluators can assume about them (e.g. administrators, end users).

Critiquing
Given this information, the evaluators step through the action sequence (identified in item 3 above) to critique the system and tell a believable story about its usability. To do this, the evaluators try to answer the following four questions for each step in the action sequence.

The four walkthrough questions
1. Is the effect of the action the same as the user’s goal at that point? Each user action will have a specific effect within the system. Is this effect the same as what the user is trying to achieve at this point? For example, if the effect of the action is to save a document, is ‘saving a document’ what the user wants to do? (An icon should do what it is meant to do.)
2. Will users see that the action is available? Will users see the button or menu item, for example, that is used to produce the action? This is not asking whether they will recognize that the button is the one they want, only whether the buttons needed to perform the task are easy to find.

The four walkthrough questions (ctd)
3. Once users have found the correct action, will they know it is the one they need? This complements the previous question. It is one thing for a button or menu item to be visible, but will users recognize that it is the one they are looking for to complete their task? Where the previous question was about the visibility of the action, this one is about whether its meaning and effect are clear (e.g. will the user recognize the save button?).
4. After the action is taken, will users understand the feedback they get? Assuming that the user did manage to perform the correct action, will they know that they have done so? Will the feedback given be sufficient confirmation of what has actually happened? Is the feedback appropriate (e.g. "messages uploaded successfully")?
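One way to keep a walkthrough organised is to record, for each action in the sequence, a yes/no answer to each of the four questions and then list the failures. The following is only a minimal sketch of such a record form; the names `StepRecord` and `walkthrough_report` and the sample actions are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass

# The four walkthrough questions, in the order given on the slides.
QUESTIONS = (
    "Is the effect of the action the same as the user's goal at that point?",
    "Will users see that the action is available?",
    "Once users have found the correct action, will they know it is the one they need?",
    "After the action is taken, will users understand the feedback they get?",
)


@dataclass
class StepRecord:
    """Evaluator's answers for one action (hypothetical record format)."""
    action: str
    answers: tuple  # four booleans, one per question; True means "yes"
    notes: str = ""

    def problems(self):
        # Every question answered "no" marks a potential usability problem.
        return [q for q, ok in zip(QUESTIONS, self.answers) if not ok]


def walkthrough_report(steps):
    """Return (action, failed question) pairs for the whole action sequence."""
    return [(step.action, q) for step in steps for q in step.problems()]


# Hypothetical action sequence for a "save the document" task.
steps = [
    StepRecord("Open the File menu", (True, True, True, True)),
    StepRecord("Choose Save", (True, False, True, True), notes="item hidden in submenu"),
]
report = walkthrough_report(steps)
for action, question in report:
    print(f"{action}: {question}")
```

Each entry in the report is the starting point for the "believable story" about why users would or would not succeed at that step.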

Example of cognitive walkthrough
Go to page 322 and go through the steps one by one.

TO BE CONTINUED

Heuristic Evaluation
Proposed by Nielsen and Molich. To help the evaluators discover usability problems, a set of 10 heuristics is provided.
– usability criteria (heuristics), a set of rules, are identified
– the design is examined by experts to see whether any of the rules are violated
Example heuristics
– system behaviour is predictable (save means save)
– system behaviour is consistent (save should not delete)
– feedback is provided ("document saved successfully")

Heuristic Evaluation: severity ratings
Each evaluator assesses the system and notes violations of any of Nielsen’s ten heuristics that would indicate a potential usability problem. The evaluator also assesses the severity of each usability problem, based on four factors:
– how common is the problem?
– how easy is it for the user to overcome?
– will it be a one-off problem or a persistent one?
– how seriously will the problem be perceived?
These can be combined into an overall severity rating on a scale of 0–4 (Nielsen):
0 = I don’t agree that this is a usability problem at all
1 = Cosmetic problem only: need not be fixed unless extra time is available on project
2 = Minor usability problem: fixing this should be given low priority
3 = Major usability problem: important to fix, so should be given high priority
4 = Usability catastrophe: imperative to fix this before product can be released

Nielsen’s ten heuristics are:
1. Visibility of system status. Always keep users informed about what is going on, through appropriate feedback within reasonable time. For example, if a system operation will take some time, give an indication of how long and how much is complete.
2. Match between system and the real world. The system should speak the user’s language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in natural and logical order.
3. User control and freedom. Users often choose system functions by mistake and need a clearly marked ‘emergency exit’ to leave the unwanted state without having to go through an extended dialog. Support undo and redo.
4. Consistency and standards. Users should not have to wonder whether words, situations or actions mean the same thing in different contexts. Follow platform conventions and accepted standards. (Save should mean save.)

Nielsen’s ten heuristics are (ctd):
5. Error prevention. Make it difficult to make errors. Even better than good error messages is a careful design that prevents a problem from occurring in the first place. (Prevention is better than cure.)
6. Recognition rather than recall. Make objects, actions and options visible. The user should not have to remember information from one part of the dialog to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
7. Flexibility and efficiency of use. Allow users to tailor frequent actions. Accelerators, unseen by the novice user, may often speed up the interaction for the expert user to such an extent that the system can cater to both inexperienced and experienced users (e.g. autocorrect in Word, typing in Google).
8. Aesthetic and minimalist design. Dialogs should not contain information that is irrelevant or rarely needed. Every extra unit of information in a dialog competes with the relevant units of information and diminishes their relative visibility.

Nielsen’s ten heuristics are (ctd):
9. Help users recognize, diagnose and recover from errors. Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
10. Help and documentation. Few systems can be used with no instructions, so it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user’s task, list concrete steps to be carried out, and not be too large.
Once each evaluator has completed their separate assessment, all of the problems are collected and the mean severity ratings are calculated. The design team then determines which problems are the most important and will receive attention first.
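The collection step can be sketched in a few lines: gather each evaluator's 0–4 rating per problem, take the mean, and sort. This is an illustrative sketch only; the function name `rank_problems`, the problem descriptions and the ratings are made up for the example.

```python
from statistics import mean


def rank_problems(ratings):
    """ratings maps a problem description to the list of severity scores
    (0-4, one per evaluator). Returns (problem, mean severity) pairs,
    most severe first."""
    for problem, scores in ratings.items():
        if not scores or any(s not in range(5) for s in scores):
            raise ValueError(f"severity ratings for {problem!r} must be integers 0-4")
    means = {problem: mean(scores) for problem, scores in ratings.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ratings from three evaluators.
ratings = {
    "Inconsistent button labels": [2, 2, 1],
    "No undo after delete": [4, 3, 4],
    "Low-contrast footer text": [1, 0, 1],
}
ranked = rank_problems(ratings)
for problem, severity in ranked:
    print(f"{severity:.2f}  {problem}")
```

The mean is only a starting point for the design team's discussion; a problem that one evaluator rated 4 may deserve attention even if its mean is low.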

Review-based evaluation
Assignment to be submitted for Thursday’s Lab

Evaluating through user Participation

User studies
The evaluation methods so far concentrated on evaluating the design rather than on testing with users. We now concentrate on approaches that involve users:
– experimental methods
– observational methods
– query techniques
– physiological monitoring
Distinct styles of user evaluation
– laboratory
– work environment or field

Laboratory studies
Advantages:
– specialist equipment available (e.g. internet, MacBook)
– uninterrupted environment
Disadvantages:
– lack of context
– difficult to observe several users cooperating
Appropriate
– if system location is dangerous or impractical
– for constrained single-user systems
– to allow controlled manipulation of use

Field Studies
Advantages:
– natural environment
– context retained (though observation may alter it)
– longitudinal studies possible
Disadvantages:
– distractions
– noise
Appropriate
– where context is crucial
– for longitudinal studies

Evaluating Implementations
Requires an artefact: simulation, prototype or full implementation

Experimental evaluation
– controlled evaluation of specific aspects of interactive behaviour
– evaluator chooses hypothesis to be tested
– a number of experimental conditions are considered which differ only in the value of some controlled variable

Experimental factors
– Subjects: who (representative, sufficient sample)
– Variables: things to modify and measure
– Hypothesis: what you’d like to show
– Experimental design: how you are going to do it

Variables
independent variable (IV)
– characteristic changed to produce different conditions, e.g. interface style, number of menu items
For example, an experiment that tests whether search speed improves as the number of menu items decreases may consider menus with five, seven and ten items. Here the independent variable, number of menu items, has three levels.
More complex experiments may have more than one independent variable. In the above experiment, we may suspect that the speed of the user’s response depends not only on the number of menu items but also on the choice of commands used on the menu. In this case there are two independent variables. If there were two sets of command names (that is, two levels), we would require six experimental conditions to investigate all the possibilities (three levels of menu size × two levels of command names).
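The count of six conditions is just the Cartesian product of the levels of the two independent variables, which can be enumerated directly. In this small sketch the command-set labels "set A" and "set B" are placeholders, since the slides do not name the two sets.

```python
from itertools import product

menu_sizes = [5, 7, 10]            # three levels of the first IV
command_sets = ["set A", "set B"]  # two levels of the second IV (placeholder labels)

# Every combination of one level from each IV is one experimental condition.
conditions = list(product(menu_sizes, command_sets))

for size, commands in conditions:
    print(f"{size} menu items, command names from {commands}")
print(len(conditions), "conditions in total")
```

Adding a third independent variable multiplies the number of conditions again, which is one reason experiments usually keep the number of IVs small.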

Variables (ctd)
dependent variable (DV)
– characteristic measured in the experiment, e.g. time taken, number of errors
– in the example above, the DV is the speed of search
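As an illustration of how such dependent variables might be extracted in practice, the sketch below derives time taken and number of errors from a logged trial. The log format (a list of `(timestamp_in_seconds, event_name)` pairs) and the event names `task_start`, `task_end` and `error` are assumptions made for this example, not something the slides specify.

```python
def measure_trial(events):
    """Compute two dependent variables from one trial's event log.

    events: list of (timestamp_in_seconds, event_name) pairs
    (hypothetical format; event names are assumptions).
    """
    start = next(t for t, name in events if name == "task_start")
    end = next(t for t, name in events if name == "task_end")
    # Count only errors that happened while the task was in progress.
    errors = sum(1 for t, name in events if name == "error" and start <= t <= end)
    return {"time_taken": end - start, "errors": errors}


# Hypothetical log for a single participant and condition.
log = [
    (0.0, "task_start"),
    (2.4, "click"),
    (3.1, "error"),
    (5.0, "click"),
    (7.5, "task_end"),
]
result = measure_trial(log)
print(result)
```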

Hypothesis
prediction of the outcome of an experiment
– framed in terms of IV and DV, e.g. “error rate will increase as font size decreases”
– the aim is to show that the prediction is correct, and this is done by disproving the null hypothesis
null hypothesis
– states no difference between conditions, e.g. null hyp. = “no change with font size”
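The slides do not name a statistical test, so the following is only one possible way to check a null hypothesis of "no difference between conditions": an exact permutation test on the difference in group means. The function name `permutation_p_value` and the error counts are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means.

    Under the null hypothesis the condition labels are interchangeable,
    so we try every way of relabelling the pooled scores and count how
    often the difference is at least as large as the observed one.
    """
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    indices = range(len(pooled))
    total = 0
    extreme = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical error counts per participant (between-groups data).
errors_large_font = [1, 2, 1, 0, 2]
errors_small_font = [4, 5, 3, 6, 4]

p = permutation_p_value(errors_large_font, errors_small_font)
print(f"p = {p:.4f}")
```

With these made-up numbers only 2 of the 252 possible relabellings are as extreme as the observed split, so p is about 0.008 and the null hypothesis "no change with font size" would be rejected at the usual 0.05 level. Exhaustive enumeration is only practical for small groups.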

Experimental design: phases
Phase 1
– choose the hypothesis (what is it that you are trying to demonstrate?)
– identify the dependent and independent variables
– consider participants: how many are available, and are they representative of the user group?
Phase 2
– determine the experimental method: between-subjects or within-subjects

Experimental design
within groups design
– each subject performs the experiment under each condition
– transfer of learning possible
– less costly and less likely to suffer from user variation
between groups design
– each subject performs under only one condition
– no transfer of learning
– more users required
– variation can bias results
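The difference between the two designs shows up clearly in how participants are assigned to conditions. The sketch below uses hypothetical participant IDs and condition names. Rotating the order of conditions in the within-groups case is an addition not stated on the slide; it is one common way to spread any transfer of learning evenly across conditions.

```python
import random
from collections import Counter


def between_groups(participants, conditions, seed=0):
    """Each participant is randomly assigned to exactly one condition."""
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: [conditions[i % len(conditions)]] for i, p in enumerate(shuffled)}


def within_groups(participants, conditions):
    """Each participant performs under every condition.

    The starting condition is rotated per participant (an assumption of
    this sketch) so that no condition always comes first.
    """
    k = len(conditions)
    return {
        p: [conditions[(i + j) % k] for j in range(k)]
        for i, p in enumerate(participants)
    }


participants = ["P1", "P2", "P3", "P4", "P5", "P6"]  # hypothetical IDs
conditions = ["menu-5", "menu-10"]                    # hypothetical conditions

between = between_groups(participants, conditions)
within = within_groups(participants, conditions)

print(Counter(c for plan in between.values() for c in plan))
print(within["P1"], within["P2"])
```

With six participants, the between-groups plan yields three data points per condition, while the within-groups plan yields six, which is why the within-groups design needs fewer users.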

TO BE CONTINUED

Observational Methods
– Think Aloud
– Cooperative evaluation
– Protocol analysis
– Automated analysis
– Post-task walkthroughs

Think Aloud
– user observed performing task
– user asked to describe what they are doing and why, what they think is happening, etc.
Advantages
– simplicity: requires little expertise
– can provide useful insight
– can show how system is actually used
Disadvantages (class discussion)
– subjective
– selective
– act of describing may alter task performance

Cooperative evaluation
A "let me see if I can find my way around" approach, much like a test drive.
– variation on think aloud
– user collaborates in evaluation
– both user and evaluator can ask each other questions throughout
Additional advantages
– less constrained and easier to use
– user is encouraged to criticize system
– clarification possible

Protocol analysis
– paper and pencil: cheap, limited to writing speed
– audio: good for think aloud, difficult to match with other protocols
– video: accurate and realistic, needs special equipment, obtrusive
– computer logging: automatic and unobtrusive, large amounts of data difficult to analyze
– user notebooks: coarse and subjective, useful insights, good for longitudinal studies
Mixed use in practice. Audio/video transcription is difficult and requires skill. Some automatic support tools are available.

Query Techniques
– Interviews
– Questionnaires

Interviews
– analyst questions user on a one-to-one basis
– usually based on prepared questions
– informal, subjective and relatively cheap
Advantages (students to discuss)
– can be varied to suit context
– issues can be explored more fully
– can elicit user views and identify unanticipated problems
Disadvantages
– very subjective
– time consuming

Questionnaires
Set of fixed questions given to users
Advantages
– quick and reaches large user group
– can be analyzed more rigorously
Disadvantages
– less flexible
– less probing

Questionnaires (ctd)
Need careful design
– what information is required?
– how are answers to be analyzed?
Styles of question
– general
– open-ended
– scalar
– multi-choice
– ranked
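Deciding how answers will be analyzed is easiest for scalar questions, whose responses can be tallied directly. The sketch below summarises one such question; the 1–5 scale, the function name `summarise_scalar` and the responses are assumptions made for illustration. The median is used instead of the mean because scale points are ordered but not necessarily evenly spaced.

```python
from collections import Counter
from statistics import median


def summarise_scalar(responses, scale=(1, 5)):
    """Summarise answers to one scalar question (assumed 1-5 scale)."""
    low, high = scale
    if any(not low <= r <= high for r in responses):
        raise ValueError("response outside the scale")
    counts = Counter(responses)
    return {
        "n": len(responses),
        "median": median(responses),
        # Report every scale point, including those nobody chose.
        "counts": {value: counts.get(value, 0) for value in range(low, high + 1)},
    }


# Hypothetical answers to "The system was easy to use" (1 = disagree, 5 = agree).
responses = [4, 5, 3, 4, 2, 4, 5]
summary = summarise_scalar(responses)
print(summary)
```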

Physiological methods
– Eye tracking
– Physiological measurement

Eye tracking
– head- or desk-mounted equipment tracks the position of the eye
– eye movement reflects the amount of cognitive processing a display requires
Measurements include
– fixations: the eye maintains a stable position; number and duration indicate level of difficulty with the display
– saccades: rapid eye movement from one point of interest to another
– scan paths: moving straight to a target with a short fixation at the target is optimal
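Once an eye tracker has reduced raw gaze samples to fixations, the measurements above can be summarised with simple arithmetic. The sketch below assumes each fixation is already available as an `(x, y, duration_ms)` tuple in screen pixels. That format, the function name `summarise_fixations` and the sample data are hypothetical, and real eye-tracking software exports its own formats.

```python
import math


def summarise_fixations(fixations):
    """fixations: list of (x, y, duration_ms) tuples in viewing order
    (hypothetical format). Returns fixation count, mean duration and
    scan path length."""
    durations = [d for _, _, d in fixations]
    points = [(x, y) for x, y, _ in fixations]
    # Scan path length: sum of the straight-line jumps (saccades)
    # between consecutive fixations.
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return {
        "fixation_count": len(fixations),
        "mean_duration_ms": sum(durations) / len(durations),
        "scan_path_px": path_length,
    }


# Hypothetical fixations recorded while a user searched for a menu item.
fixations = [(100, 100, 200), (400, 100, 300), (400, 500, 250)]
stats = summarise_fixations(fixations)
print(stats)
```

Many long fixations and a long scan path suggest the user had difficulty locating the target; a short path with a brief final fixation is closer to the optimal case described above.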

Physiological measurements
– emotional response is linked to physical changes
– these may help determine a user’s reaction to an interface
Measurements include
– heart activity, including blood pressure, volume and pulse
– activity of sweat glands: Galvanic Skin Response (GSR)
– electrical activity in muscle: electromyogram (EMG)
– electrical activity in brain: electroencephalogram (EEG)
There is some difficulty in interpreting these physiological responses; more research is needed.

Choosing an Evaluation Method
– when in process: design vs. implementation
– style of evaluation: laboratory vs. field
– how objective: subjective vs. objective
– type of measures: qualitative vs. quantitative
– level of information: high level vs. low level
– level of interference: obtrusive vs. unobtrusive
– resources available: time, subjects, equipment, expertise
