Agenda Part 2 comments – Average score: 87 Part 3: due in 2 weeks Data analysis
Project part 3 Please read the comments on your evaluation plans Finish your plan – Finalize questions, tasks – Prepare scripts or tutorials, etc. Find participants – Friends, neighbors, co-workers Perform the evaluations – Clearly inform your users what you are doing and why. – If you are audio or video recording, I prefer you use a consent form. – Pilot at least once – know how long it's going to take.
Part 3 write up State exactly what you did (task list, how many, questionnaires etc.) Summarize data collected Summarize usability conclusions based on your data Discuss implications for the prototype based on those conclusions
Quantitative and qualitative Quantitative data – expressed as numbers Qualitative data – difficult to measure sensibly as numbers, e.g. count number of words to measure dissatisfaction Quantitative analysis – numerical methods to ascertain size, magnitude, amount Qualitative analysis – expresses the nature of elements and is represented as themes, patterns, stories Be careful how you manipulate data and numbers!
Descriptive Statistics For all variables, get a feel for results: – Total scores, times, ratings, etc. – Minimum, maximum – Mean, median, ranges, etc. e.g. “Twenty participants completed both sessions (10 males, 10 females; mean age 22.4, range 18-37 years).” e.g. “The median time to complete the task in the mouse-input group was 34.5 s (min=19.2, max=305 s).”
Simple quantitative analysis Averages – Mean: add up values and divide by number of data points – Median: middle value of data when ranked – Mode: figure that appears most often in the data Percentages versus numbers Graphical representations give overview of data
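As a minimal sketch of these averages, the snippet below uses Python's standard `statistics` module on a hypothetical list of task completion times. The values are invented for illustration and include one slow outlier, which shows how far the mean can drift from the median.

```python
import statistics

# Hypothetical task completion times in seconds (illustrative values only)
times = [19.2, 25.0, 31.0, 34.5, 34.5, 40.1, 305.0]

mean_time = statistics.mean(times)      # add up values, divide by count
median_time = statistics.median(times)  # middle value when ranked
mode_time = statistics.mode(times)      # value that appears most often

# The median and mode are both 34.5 here; the single 305 s outlier
# pulls the mean up to about 69.9.
print(f"min={min(times)} max={max(times)} range={max(times) - min(times):.1f}")
print(f"mean={mean_time:.1f} median={median_time} mode={mode_time}")
```

With skewed data like this, reporting the median alongside the minimum and maximum usually describes the typical participant better than the mean alone.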
Subgroup Stats Look at descriptive stats (means, medians, ranges, etc.) for any subgroups – e.g. “The mean error rate for the mouse-input group was 3.4%. The mean error rate for the keyboard group was 5.6%.” – e.g. “The median completion times (in seconds) for the three groups were: novices: 4.4, moderate users: 4.6, and experts: 2.6.”
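One way to get per-subgroup statistics is to bucket the observations by group and summarize each bucket. The sketch below assumes each observation is a `(group, seconds)` pair; the individual times are hypothetical, chosen so the medians match the example figures above.

```python
from collections import defaultdict
from statistics import median

# (group, completion time in seconds) -- hypothetical observations
observations = [
    ("novice", 4.1), ("novice", 4.4), ("novice", 5.0),
    ("moderate", 4.2), ("moderate", 4.6), ("moderate", 4.9),
    ("expert", 2.2), ("expert", 2.6), ("expert", 3.1),
]

# Collect the times for each subgroup
by_group = defaultdict(list)
for group, seconds in observations:
    by_group[group].append(seconds)

# Summarize each subgroup separately
group_medians = {group: median(values) for group, values in by_group.items()}
for group, value in group_medians.items():
    print(f"{group}: median {value} s (n={len(by_group[group])})")
```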
Other Presentation Methods – Box plot (labels: low, high, mean, middle 50%) – Scatter plot [Figures omitted; axis labels: Time in secs., Age]
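The "middle 50%" that a box plot draws is the span between the first and third quartiles. As a rough sketch, the numbers behind such a plot can be computed with `statistics.quantiles`; the data values are hypothetical, and the `"inclusive"` method is just one of several common quartile conventions.

```python
import statistics

# Hypothetical completion times in seconds, already sorted for readability
times = [2, 4, 4, 5, 7, 9, 10, 12, 20]

# Three cut points that split the data into four equal parts
quartiles = statistics.quantiles(times, n=4, method="inclusive")
q1, q2, q3 = quartiles

print(f"low={min(times)} high={max(times)}")
print(f"middle 50% runs from {q1} to {q3}, median {q2}")
```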
Visualizing log data Interaction profiles of players in online game Log of web page activity
Simple qualitative analysis Recurring patterns or themes – Emergent from data Categorizing data – Categorization scheme may be emergent or pre-specified Looking for critical incidents – Helps to focus in on key events
Presenting the findings Only make claims that your data can support The best way to present your findings depends on the audience, the purpose, and the data gathering and analysis undertaken Graphical representations may be appropriate for presentation Other techniques are: – Using stories, e.g. to create scenarios based on the data – Summarizing the findings
Interviews Raw data: – Audio or video recordings, interviewer notes Initial processing – Transcribe audio, or expand upon notes Qualitative processing – Group answers to same question (small # of questions and people) – Label interesting phrases or words – Put labels on post-its or in software and group labels Quantitative processing – Gather quantitative responses such as age, etc. – Categorize and count responses (5 liked, 3 disliked, etc.) Presentation – Summarize responses, tell stories and patterns – Use descriptive quotes
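The "categorize and count" step can be sketched with a `Counter` once each answer has been given a label. The labels below are hypothetical and stand in for whatever categories emerge from your own interviews.

```python
from collections import Counter

# One label per interview answer (hypothetical coding of eight responses)
coded_answers = [
    "liked", "liked", "disliked", "liked",
    "disliked", "liked", "disliked", "liked",
]

counts = Counter(coded_answers)
for label, count in counts.most_common():
    print(f"{count} {label}")
```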
Questionnaire Raw data: – Tables of questions and numbers or text answers Quantitative processing – Calculate descriptive stats (means, percentages, etc.) for each question – Can break into subgroups or use statistics to look for relationships between items (does age correlate to stronger preferences?) Qualitative processing – Group answers to same question Presentation – Present tables & charts of means, percentages, etc. – Explain overall meaning of all the responses
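To look for a relationship between two questionnaire items, such as age and strength of preference, one common choice is the Pearson correlation coefficient. The sketch below computes it by hand; the function name `pearson_r` and the five data points are assumptions made for illustration, and a sample this small would not support a real conclusion.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical responses: participant age and a 1-5 preference rating
ages = [18, 22, 25, 31, 37]
ratings = [2, 3, 3, 4, 5]

r = pearson_r(ages, ratings)
print(f"r = {r:.2f}")  # values near +1 or -1 suggest a strong linear relationship
```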
Observation Raw data: – Audio or video recording, log files, notes Initial processing: – Transcribe audio, expand notes or take more based on video, synchronize logs with recordings Quantitative processing – Record metrics such as errors, times, clicks, etc. – Produce descriptive stats and charts of those metrics Qualitative processing – Note places where problems occurred, interesting behaviors, common behaviors Presentation – Descriptions of common or interesting problems – Videos demonstrating issues, or descriptive quotes – Charts describing quantitative data
Sample Think-aloud categorization
1. Interface problems
   1. Verbalizations show evidence of dissatisfaction about an aspect of the interface.
   2. Verbalizations show evidence of confusion/uncertainty about an aspect of the interface.
   3. Verbalizations show evidence of confusion/surprise at the outcome of an action.
   4. Verbalizations show evidence that they are having problems achieving a goal.
   5. Verbalizations show evidence that the user has made an error.
   6. The participant is unable to recover from error without external help from the experimenter.
   7. The participant makes a suggestion for redesign of the interface.
See pg 380 for a more complete example
Experimental Results How does one know if an experiment’s results mean anything or confirm any beliefs? Example: 40 people participated, 28 preferred interface 1, 12 preferred interface 2 What do you conclude?
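One way to approach the 28 versus 12 example is an exact binomial test: if participants had no real preference, each would pick either interface with probability 0.5, so we can ask how often a split at least this lopsided would arise by chance. This is a sketch of one reasonable test, not necessarily the one intended here; the function name is hypothetical.

```python
from math import comb

def two_sided_binomial_p(successes, n):
    """Exact two-sided p-value for a no-preference null (p = 0.5)."""
    # Under p = 0.5 the distribution is symmetric, so double the tail
    # probability of a split at least as lopsided as the one observed.
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = two_sided_binomial_p(28, 40)
print(f"p = {p_value:.4f}")
```

For 28 of 40 the p-value comes out below 0.05, so a split this uneven would be unusual if there were truly no preference.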
Goal of analysis Get >95% confidence in significance of result – that is, null hypothesis rejected, e.g. $H_0$: $\text{Time}_{\text{color}} = \text{Time}_{\text{b/w}}$ – OR, there is an influence – OR, only a 1 in 20 chance of seeing a difference this large due to random chance alone
Means Not Always Perfect
Experiment 1 – Group 1: 1, 10, 10 (mean 7) – Group 2: 3, 6, 21 (mean 10)
Experiment 2 – Group 1: 6, 7, 8 (mean 7) – Group 2: 8, 11, 11 (mean 10)
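The two experiments share the same group means but differ sharply in spread. A quick check with the sample standard deviation makes this visible; the dictionary layout below is just one convenient way to hold the slide's numbers.

```python
from statistics import mean, stdev

# Data taken from the two experiments on this slide
experiments = {
    "Experiment 1": {"Group 1": [1, 10, 10], "Group 2": [3, 6, 21]},
    "Experiment 2": {"Group 1": [6, 7, 8], "Group 2": [8, 11, 11]},
}

summary = {}
for name, groups in experiments.items():
    for group, values in groups.items():
        # Same means in both experiments, very different spread
        summary[(name, group)] = (mean(values), stdev(values))
        m, s = summary[(name, group)]
        print(f"{name}, {group}: mean={m} sd={s:.2f}")
```

The standard deviations in Experiment 1 are larger than the 3-point gap between the means, while those in Experiment 2 are smaller, which is why the same difference in means is more convincing in the second case.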
Inferential Stats and the Data Are these really different? What would that mean?
Hypothesis Testing Tests to determine differences – t-test to compare two means – ANOVA (Analysis of Variance) to compare several means – Need to determine “statistical significance” p-value: – The probability of getting a result at least as extreme as the one you saw, simply by chance, if the null hypothesis were true “Significance level” (“alpha” level): – The threshold the p-value is compared against, often set at 0.05: when there is no real difference, 5% of the time you’ll still get a result this extreme, just by chance
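As a rough sketch of what a t-test computes, the snippet below works out the pooled-variance (Student's) t statistic by hand for the two small experiments from the "Means Not Always Perfect" slide. The function name `pooled_t` is hypothetical, and in practice a statistics package would also report the p-value; here the statistic is simply compared with the tabulated two-tailed critical value for 4 degrees of freedom at alpha = 0.05.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    # Pool the two sample variances, weighted by their degrees of freedom
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(b) - mean(a)) / std_err

t1 = pooled_t([1, 10, 10], [3, 6, 21])  # Experiment 1: noisy data
t2 = pooled_t([6, 7, 8], [8, 11, 11])   # Experiment 2: tight data

T_CRITICAL = 2.776  # two-tailed critical value from a t table, df = 4, alpha = 0.05
for name, t in (("Experiment 1", t1), ("Experiment 2", t2)):
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{name}: t(4) = {t:.2f} -> {verdict} at alpha = 0.05")
```

The same 3-point difference in means gives a much larger t in Experiment 2 because the data are less spread out, although with only three participants per group it still falls just short of the critical value.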
Errors Errors in analysis do occur Main Types: – Type I/False positive - You conclude there is a difference, when in fact there isn’t – Type II/False negative - You conclude there is no difference when there is And then there’s the True Negative…
Drawing Conclusions Make your conclusions based on the descriptive stats, but back them up with inferential stats – e.g., “The expert group performed faster than the novice group, t(34) = 4.6, p < .01.” Translate the stats into words that regular people can understand – e.g., “Thus, those who have computer experience will be able to perform better, right from the beginning…”
Tools to support data analysis Spreadsheet – simple to use, basic graphs – Can even do basic statistical analysis Statistical packages, e.g. SPSS Qualitative data analysis tools – Categorization and theme-based analysis, e.g. N6 – Quantitative analysis of text-based data
Analysis and Presentation for Part 3 List of problems from HE with severity ratings List of problems found in CW Basic quantitative analysis from your observation Basic qualitative analysis from your observation – Places where problems occur, general story of what and how people did, etc. Basic quantitative and qualitative analysis from the questionnaire or interview – Tables of responses, averages, etc. as appropriate
Interpreting your results Go through each usability criterion – do results demonstrate support for meeting this criterion or not? How do they? Discuss any other problems with aspects of the design that your results demonstrate. Discuss how you would modify the design based on these results.