User Study Evaluation Human-Computer Interaction.


1 User Study Evaluation Human-Computer Interaction

2 Hypothesis: a statement of prediction that describes what you expect will happen in your study. Alternative hypothesis (H1): your prediction, i.e. a claim of difference in the population, e.g. "Participants will commit more errors with interface A than with interface B." Null hypothesis (H0): no difference or no effect, e.g. "Participants will commit the same number of errors with interface A and interface B, or more errors with interface B than with interface A."

3 Hypothesis: one- or two-tailed? Alternative hypothesis. One-tailed (directional): participants will commit more errors with interface A than with interface B. Two-tailed (non-directional): there will be a significant difference in the number of errors participants commit with interface A and with interface B, but I don't know whether there will be more or fewer. You can't prove the alternative hypothesis; you can only reject the null hypothesis. If the data support your prediction, reject the null hypothesis. Not rejecting the null hypothesis ≠ accepting it.

4 Metrics What you are measuring Some types of metrics Objective – facts of an event Time to complete task (continuous) Errors (discrete, i.e. distinct and separate, can be counted) Subjective – a person’s opinion Satisfaction

5 Metrics Types of metrics Objective – facts of an event Subjective – a person’s opinion *Both* are important How to measure Instrumentation – record data within your system Questionnaires / Surveys Scales Free-response Let’s discuss appropriateness of each Let’s look at a very popular survey, the System Usability Scale (SUS)
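The slide names SUS but does not show how it is scored. As a minimal sketch, assuming the standard 10-item questionnaire with 1 to 5 ratings: odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a 0 to 100 score. The function name `sus_score` and the sample ratings are hypothetical.

```python
def sus_score(ratings):
    """Score one participant's SUS questionnaire (standard 10-item form assumed)."""
    if len(ratings) != 10:
        raise ValueError("SUS expects exactly 10 ratings")
    total = 0
    for position, rating in enumerate(ratings, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if position % 2 == 1:
            total += rating - 1   # odd items are positively worded
        else:
            total += 5 - rating   # even items are negatively worded
    return total * 2.5            # rescale the 0-40 sum to 0-100


# Hypothetical participant who rates the system favorably.
example = sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2])
print(example)
```

Because the even-numbered items are negatively worded, SUS is also an instance of the item-reversal problem that comes up during data preparation.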

6 Analysis Most of what we do involves: normally distributed results, independent testing, and a homogeneous population. Recall that we test the hypothesis by trying to reject the null hypothesis.

7 Analysis 3 main steps for analysis Data Preparation: Cleaning and organizing the data for analysis Checking the data for accuracy Transforming data (e.g. reverse coding survey data) Descriptive Statistics: Describing the data Provide simple summaries about the sample and the measures Simply describing what is, what the data shows Inferential Statistics: Testing Hypotheses and Models Try to infer from the sample data what the population thinks Make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance

8 Data preparation Checking data for accuracy Are the responses legible/readable? Are all important questions answered? Are the responses complete? Is all relevant contextual information included (e.g., date, time, place, researcher)?

9 Data preparation Data transformations. Missing values: depending on the program, you may need to designate specific values to represent missing values, e.g. -99. Scale totals: add or average across individual items. Item reversals: on a Likert scale, the ratings for some items need to be reversed. 1 (strongly disagree) – 5 (strongly agree): “I generally feel good about myself.” “Sometimes I feel like I'm not worth much as a person.” What does a 5 mean in each case?
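The three transformations on this slide can be sketched together. This is an illustration under stated assumptions, not a prescribed procedure: missing answers are coded as -99 as in the slide, the scale runs from 1 to 5, and the names `reverse_item` and `scale_mean` as well as the sample ratings are hypothetical.

```python
MISSING = -99  # sentinel for a skipped question, as in the slide's example


def reverse_item(rating, low=1, high=5):
    """Reverse-code a rating so that 5 becomes 1, 4 becomes 2, and so on."""
    return (low + high) - rating


def scale_mean(ratings, reversed_positions=()):
    """Average the answered items, reversing the negatively worded ones."""
    cleaned = []
    for position, rating in enumerate(ratings):
        if rating == MISSING:
            continue  # leave missing answers out of the average
        if position in reversed_positions:
            rating = reverse_item(rating)
        cleaned.append(rating)
    return sum(cleaned) / len(cleaned) if cleaned else None


# Hypothetical participant: item 1 is negatively worded, item 2 was skipped.
ratings = [5, 1, MISSING, 4]
score = scale_mean(ratings, reversed_positions={1})
print(score)
```

After reversal, a 5 means the same thing on every item, here high self-esteem, so the items can be added or averaged.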

10 Descriptive statistics Simple summaries of the sample and measures, i.e. the data. Describing what is, or what the data shows. Central tendency: an estimate of the “center” of a distribution of values. Mean: average across a set of values; 15 + 15 + 18 + 25 + 33 = 106, so µ = 106/5 = 21.2. Median: the score found in the middle of a set of values; for 15, 15, 18, 25, 33 the median is 18. Mode: the most frequently occurring value; for 15, 15, 18, 25, 33 the mode is 15. Describe the data with a number and a graph.
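The three measures of central tendency can be checked on the slide's five values with Python's standard `statistics` module. The variable names are illustrative.

```python
import statistics

scores = [15, 15, 18, 25, 33]  # the values used on the slide

mean_score = statistics.mean(scores)      # sum of the values divided by their count
median_score = statistics.median(scores)  # middle value of the sorted list
mode_score = statistics.mode(scores)      # most frequently occurring value

print(mean_score, median_score, mode_score)
```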

11 Inferential statistics Try to reach conclusions that go beyond the immediate data – draw inferences e.g. want to compare the average performance of 2 groups to see if there’s a difference t-test: statistical test used to determine whether two observed means are statistically different

12 t-test What does it mean to say that the averages for two groups are statistically different?

13 t-test Variability is the noise that may make it harder to see the group difference. Variance: a measure of variability around the mean. Standard deviation: the square root of the variance.
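As a small sketch of these two definitions, the sample variance and standard deviation can be computed for the five values from the descriptive statistics slide. The sample form (dividing by n - 1) is assumed here, since the slide does not say which form it intends.

```python
import math
import statistics

scores = [15, 15, 18, 25, 33]

# Sample variance: squared deviations from the mean, summed, divided by n - 1.
sample_variance = statistics.variance(scores)

# Standard deviation: the square root of the variance.
sample_sd = statistics.stdev(scores)

print(sample_variance, sample_sd)
```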

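14 t-test (rule of thumb) For large samples, |t| > 1.96 is significant at the .05 alpha level (two-tailed): values more than 1.96 standard deviations from the mean make up the outer 5% of a normal distribution.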
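The 1.96 threshold can be recovered from the standard normal distribution: it is the point that leaves 2.5% in each tail, and so 5% in both tails together. This sketch uses `statistics.NormalDist` from the standard library (Python 3.8 or later). The normal curve is only a large-sample approximation to the t distribution, so small samples need a larger cutoff.

```python
from statistics import NormalDist

alpha = 0.05

# Two-tailed test: put alpha / 2 in each tail of the standard normal curve.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

print(round(critical_z, 2))
```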

15 t-test Once computed, look up the t-value to see whether the ratio is large enough to say that the difference between the groups is not likely to have been a chance finding. To test the significance, you need to set a risk level (called the alpha level). The accepted standard is an alpha level of .05: if there were truly no difference, 5 times out of 100 you would still find a statistically significant difference between the means (i.e., by "chance"). You also need the degrees of freedom (df). For the independent-samples t-test, df = the number of persons in both groups minus 2. Given the alpha level, the df, and the t-value, look up the critical value in a t table to determine whether the t-value is large enough to be significant. If it is, conclude that the means of the 2 groups are different (even given the variability) and reject the null hypothesis.
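The procedure on this slide can be sketched in a few lines. The slide does not give the formula, so this assumes the pooled-variance (Student's) independent-samples t-test, which matches df = n1 + n2 - 2. The error counts are made-up illustrative data, and the critical value 2.306 is the standard two-tailed table entry for alpha = .05 and df = 8.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    # Standard error of the difference between the two means.
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical error counts for five participants per interface.
errors_a = [12, 15, 11, 14, 13]
errors_b = [9, 10, 8, 11, 12]

t_value, df = independent_t(errors_a, errors_b)
CRITICAL_T = 2.306  # two-tailed, alpha = .05, df = 8, from a t table
reject_null = abs(t_value) > CRITICAL_T
print(t_value, df, reject_null)
```

With these numbers the group means differ by 3 errors and the standard error is 1, so t is about 3.0. That exceeds 2.306, so the null hypothesis is rejected.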

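16 α and p values. α value: the probability of making a Type I error (rejecting the null hypothesis when it is really true) that you are willing to accept. p value: the probability of finding an effect at least as large as the one observed if the null hypothesis were true, i.e. by chance alone. The lower the p value, the higher the statistical significance (the stronger the evidence against the null hypothesis).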

17 Relationship between α and p values Once the alpha level has been set, a statistic (like t) is computed. Each statistic has an associated probability value called a p-value: the probability of a statistic at least as extreme as the observed one occurring due to chance, given the sampling distribution under the null hypothesis. Alpha sets the standard for how extreme the data must be before we can reject the null hypothesis. The p-value indicates how extreme the data are. Compare the p-value with alpha to determine whether the observed data are statistically significantly different from the null hypothesis: reject the null hypothesis when the p-value is at or below alpha.
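The comparison between the p-value and alpha can be sketched as follows. To stay within the standard library, this uses the normal approximation to the sampling distribution, which is reasonable only for large samples. It is an assumption of this sketch, not the exact t-distribution calculation, and the function names are hypothetical.

```python
from statistics import NormalDist


def two_tailed_p(statistic):
    """Approximate two-tailed p-value, treating the statistic as a z-score."""
    return 2 * (1 - NormalDist().cdf(abs(statistic)))


def is_significant(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is at or below alpha."""
    return p_value <= alpha


p_large = two_tailed_p(3.0)  # far from zero: small p-value
p_small = two_tailed_p(1.0)  # close to zero: large p-value
print(p_large, is_significant(p_large))
print(p_small, is_significant(p_small))
```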

18 Kinds of t-tests The formula is slightly different for each. Single-sample: tests whether a sample mean is significantly different from a pre-existing value (e.g. norms). Paired-samples: tests the difference between the means of 2 linked samples, e.g. means obtained in 2 conditions by a single group of participants. Independent-samples: tests the difference between the means of 2 independent groups. Which test fits your situation?
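To show how the paired-samples version differs, here is a minimal sketch using the usual formulation: take each participant's difference between the two conditions, then divide the mean difference by its standard error, with df = n - 1. The slide gives no formula, so this form is an assumption, and the task times are made-up illustrative data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(condition_1, condition_2):
    """Return (t, df) for a paired-samples t-test on two linked lists."""
    if len(condition_1) != len(condition_2):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(condition_1, condition_2)]
    n = len(differences)
    # Standard error of the mean difference.
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical task times (seconds) for the same five participants.
times_interface_a = [10, 12, 9, 11, 13]
times_interface_b = [8, 11, 7, 10, 11]

t_paired, df_paired = paired_t(times_interface_a, times_interface_b)
print(t_paired, df_paired)
```

Because each participant serves as their own control, between-person variability drops out of the differences, which is why the paired test uses df = n - 1 rather than n1 + n2 - 2.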

19 t and alpha values

20 Independent samples t-test Example: social presence questionnaire “I perceived I was in the presence of a patient in the room with me.” http://www.vassarstats.net/tu.html

21 Correlations Correlations – relationship between two variables Pearson’s product-moment correlation coefficient – r http://bdaugherty.tripod.com/KeySkills/lineGraphs.html

22 Correlations Pearson’s product-moment correlation coefficient – r http://www.socscistatistics.com/tests/pearson/Default2.aspx http://en.wikipedia.org/wiki/Correlation_and_dependence
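Pearson's r can be computed directly from its definition: the sum of the products of the paired deviations, divided by the square root of the product of the two sums of squared deviations. This is a sketch with made-up data, and the function name `pearson_r` is illustrative.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired values."""
    if len(xs) != len(ys):
        raise ValueError("both variables need the same number of observations")
    mean_x, mean_y = mean(xs), mean(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    covariation = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    return covariation / sqrt(sum(dx ** 2 for dx in dev_x) *
                              sum(dy ** 2 for dy in dev_y))


# Hypothetical paired measurements from five participants.
practice_hours = [1, 2, 3, 4, 5]
satisfaction = [2, 4, 5, 4, 5]

r = pearson_r(practice_hours, satisfaction)
print(round(r, 4))
```

r ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship).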

