Conducting a User Study Human-Computer Interaction.


Overview
- Why run a study?
  - Evaluate if a statement is true
  - Ex. The heavier a person is, the higher their blood pressure
- Many ways to do this:
  - Look at data from a doctor’s office
    - What are the pros and cons?
  - Get a group of people to get weighed and measure their BP
    - What are the pros and cons?
- Ideal solution: have everyone in the world get weighed and have their BP measured
- Participants are a sample of the population
  - You should immediately question this!
  - Restrict population

Population Design
- Identify the statement to be evaluated
  - Ex. The heavier a person is, the higher their blood pressure
- Create a hypothesis
  - Ex. Weight is directly proportional to blood pressure
- Identify Independent and Dependent Variables
  - Independent Variable – the variable that is manipulated or selected by the experimenter (weight)
  - Dependent Variable – the variable that is measured and expected to change with the independent variable (blood pressure)
- Design Study
  - Invite 100 people
  - Weigh them and take their BP
  - Graph
  - See if there is a trend
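The "graph and see if there is a trend" step can be made concrete with a correlation coefficient and a fitted line. The sketch below is only an illustration: the weights and blood pressure readings are made-up values, and the function names `pearson_r` and `least_squares_line` are hypothetical helpers, not part of the original slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: +1 is a perfect rising trend, 0 is no linear trend."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the best-fit straight line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Hypothetical measurements (assumed for illustration only).
weights_kg = [60, 70, 80, 90, 100]
systolic_bp = [110, 118, 121, 130, 136]

print("r =", round(pearson_r(weights_kg, systolic_bp), 3))
print("slope, intercept =", least_squares_line(weights_kg, systolic_bp))
```

A value of r near +1 suggests a rising trend in this sample, but it says nothing yet about whether the trend holds for the population.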

Two Group Design
- Identify the statement to be evaluated
  - Ex. Shorter people are smarter than taller people
- Create a hypothesis
  - Ex. IQ of people shorter than 5’9” > IQ of people 5’9” or taller
- Design Study
  - Two groups called conditions
  - How many people?
  - What’s your design?
  - What are the independent and dependent variables?
- Confounding factors – factors that affect outcomes, but are not part of what the study is testing

Design
- External validity – do your results mean anything?
  - Results should be similar to other similar studies
  - Use accepted questionnaires, methods
- Power – how much meaning do your results have? (How likely is the study to detect an effect that really exists?)
  - The more people, the more you can say that the participants are a sample of the population
- Generalization – how much do your results apply to the true state of things

Design
- People who use a mouse and keyboard will be faster to fill out a form than people who use the keyboard alone.
- Let’s create a study design
- Two types:
  - Between Subjects
  - Across Subjects
- Everyone do this now for your study
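The two design types can be sketched in code. This is a minimal illustration under stated assumptions: it reads "across subjects" as what is more commonly called a within-subjects design (each participant tries every condition), and the participant IDs, condition names, and both function names are hypothetical.

```python
import random

def assign_between_subjects(participants, conditions, seed=None):
    """Between subjects: each participant is randomly placed in exactly one condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random order guards against sign-up-order confounds
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(participant)
    return groups

def counterbalanced_orders(participants, conditions):
    """Within subjects: everyone does every condition; the starting condition rotates."""
    orders = {}
    for index, participant in enumerate(participants):
        shift = index % len(conditions)
        orders[participant] = conditions[shift:] + conditions[:shift]
    return orders

# Hypothetical participant IDs and condition names.
participants = [f"P{i:02d}" for i in range(1, 11)]
conditions = ["mouse+keyboard", "keyboard only"]

print(assign_between_subjects(participants, conditions, seed=1))
print(counterbalanced_orders(participants, conditions))
```

Rotating the order in the second function is one simple way to keep practice or fatigue effects from favoring one condition.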

Procedure
- Formally have all participants sign up for a time slot (if individual testing is needed)
- Informed Consent (let’s look at one)
- Execute study
- Questionnaires/Debriefing (let’s look at one)

Hypothesis Proving
- Hypothesis: People who use a mouse and keyboard will be faster to fill out a form than people who use the keyboard alone.
- US Court system: Innocent until proven guilty
- NULL Hypothesis: Assume people who use a mouse and keyboard will fill out a form in the same amount of time as people who use the keyboard alone
  - Your job to prove differently!
- Alternate Hypothesis 1: People who use a mouse and keyboard will fill out a form either faster or slower than people who use the keyboard alone.
- Alternate Hypothesis 2: People who use a mouse and keyboard will fill out a form faster than people who use the keyboard alone.
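The three hypotheses above can be written compactly in symbols. The notation is an assumption introduced here, not taken from the slides: \mu_{MK} stands for the mean form completion time with mouse and keyboard, and \mu_{K} for the mean time with keyboard alone.

```latex
\begin{align*}
H_0 &: \mu_{MK} = \mu_{K}    && \text{(null: same completion time)} \\
H_1 &: \mu_{MK} \neq \mu_{K} && \text{(alternate 1: faster or slower, two-tailed)} \\
H_2 &: \mu_{MK} < \mu_{K}    && \text{(alternate 2: faster, one-tailed)}
\end{align*}
```

Which alternate hypothesis is chosen determines whether a two-tailed or a one-tailed test is appropriate later in the analysis.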

Analysis
- Most of what we do involves:
  - Normally Distributed Results
  - Independent Testing
  - Homogeneous Population

Raw Data
- What does the mean (average) tell us? Is that enough?

Variances
- standard deviation – measure of dispersion (square root of the sum of squared deviations from the mean, divided by N)
- Table (numeric values not preserved in this transcript): Mean, S.D., Min, and Max in seconds for the Small Pattern and the Large Pattern, with one row per condition: Real Space (n=41), Purely Virtual (n=13), Hybrid (n=13), Vis Faith Hybrid (n=14)
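To see why the mean alone is not enough, the standard deviation definition above can be computed directly. This is a minimal sketch that divides by N as the slide states; the timing values are invented for illustration and `population_sd` is a hypothetical helper name.

```python
import math

def mean(values):
    return sum(values) / len(values)

def population_sd(values):
    """Square root of (sum of squared deviations from the mean, divided by N)."""
    m = mean(values)
    sum_of_squares = sum((v - m) ** 2 for v in values)
    return math.sqrt(sum_of_squares / len(values))

# Two hypothetical sets of task times (seconds) with the same mean.
spread_out = [10.0, 30.0, 50.0]
consistent = [29.0, 30.0, 31.0]

for label, times in (("spread out", spread_out), ("consistent", consistent)):
    print(label, "mean =", mean(times), "S.D. =", round(population_sd(times), 2))
```

Both samples share a mean of 30 seconds, yet their dispersion differs greatly, which is exactly what the S.D. column of a results table reports. Many statistics packages divide by N - 1 instead when estimating from a sample.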

Hypothesis
- We assumed the means are “equal”
- But are they? Or is the difference due to chance?
- Table (numeric values not preserved in this transcript): Mean, S.D., Min, and Max in seconds for the Small Pattern and the Large Pattern, with one row per condition: Real Space (n=41), Purely Virtual (n=13), Hybrid (n=13), Vis Faith Hybrid (n=14)

T – test
- T – test – statistical test used to determine whether two observed means are statistically different

T – test
- (rule of thumb) Good values of |t| > 1.96, which corresponds to roughly p < 0.05 (two-tailed) for large samples
- Look at what contributes to t
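One way to look at what contributes to t is to compute it by hand. The sketch below uses the unequal-variance (Welch) form of the statistic, which is an assumption about which variant is wanted; the sample data and the function name `welch_t` are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """t = difference of means / standard error, without assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    standard_error = math.sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    return (mean(sample_a) - mean(sample_b)) / standard_error

# Hypothetical form completion times in seconds.
mouse_and_keyboard = [41.0, 38.5, 44.2, 39.9, 42.3, 40.1]
keyboard_only = [47.8, 45.1, 50.3, 44.0, 49.5, 46.2]

print("t =", round(welch_t(mouse_and_keyboard, keyboard_only), 2))
```

The formula shows the three contributors: a larger difference between means raises |t|, while larger variances or smaller samples lower it.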

F statistic, p values
- F statistic – assesses the extent to which the means of the experimental conditions differ more than would be expected by chance
  - t is related to F statistic
- Look up a table, get the p value. Compare to α
- α value – probability of making a Type I error (rejecting the null hypothesis when it is really true)
- p value – probability of observing a pattern of data at least this extreme if the null hypothesis were true, calculated on the basis of the sampling distribution of the statistic (informally, how easily chance alone could produce the result)
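The "get the p value, compare to α" step can be sketched without a printed table. Note the swapped-in technique: this uses the large-sample normal approximation rather than the exact t distribution a table would give, so it is only reasonable for big samples. The function names and the default α = 0.05 are assumptions for illustration.

```python
import math

def two_tailed_p_normal(t):
    """Two-tailed p value for a test statistic, using the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))

def is_significant(t, alpha=0.05):
    """Reject the null hypothesis when p falls below the chosen alpha."""
    return two_tailed_p_normal(t) < alpha

for t in (1.0, 1.96, 2.5):
    print("t =", t, "p =", round(two_tailed_p_normal(t), 4),
          "significant:", is_significant(t))
```

With small groups, such as 13 or 14 participants per condition, the exact t distribution gives somewhat larger p values than this approximation, so a table or a statistics package is the safer choice.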

Table (numeric values not preserved in this transcript): t – test with unequal variance and p – value, for the Small Pattern and the Large Pattern. Rows and the significance markers that survive:
- PVE – RSE vs. VFHE – RSE: ** ***
- PVE – RSE vs. HE – RSE: ** *
- VFHE – RSE vs. HE – RSE

Significance
- What does it mean to be significant?
  - You have some confidence it was not due to chance.
  - But there is a difference between statistical significance and meaningful significance
- Always know:
  - samples (n)
  - p value
  - variance/standard deviation
  - means

IRB
- Let’s look at a completed one
- You MUST turn one in by October 28th to the TA!
- Must have it OKed before running the study