1 G89.2228 Lect 2a G89.2228 Lecture 2a Thinking about variability Samples and variability Null hypothesis testing
2 G89.2228 Lect 2a Thinking about variability Variation is a critical feature of behavioral science –Between-person differences –Within-person differences (over trials or observation times) Very often we want to "explain" variation –Theories predict which scores are high and which are low –Data exploration finds patterns Sometimes variation is a nuisance –Measurement error –Pre-existing differences that lead to sampling variation Sometimes variability is stochastic but of interest –Genetic variation –Multilevel (children, schools, communities)
3 G89.2228 Lect 2a Example: CESD variability (228 cases) Histogram of data
4 G89.2228 Lect 2a Samples and variability Variation in samples reflects variation in population, but not necessarily exactly. Samples can be used to estimate population variation, but the estimates themselves are variable over different samples. We can study sample means and variances, as well as the expected variability of the mean and variance estimates. –Example of mean sampling variation from exercise
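The point that estimates vary from sample to sample can be checked with a small simulation. The sketch below is only an illustration: the population (10,000 normally distributed scores with mean 15 and SD 9), the sample size of 25, and the helper name sample_means are all assumptions made for the example, not values from the lecture or the class exercise.

```python
import random
import statistics

def sample_means(population, n, n_samples, seed=0):
    """Draw n_samples random samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(n_samples)]

# Hypothetical population, invented for illustration only.
pop_rng = random.Random(1)
population = [pop_rng.gauss(15, 9) for _ in range(10_000)]

means = sample_means(population, n=25, n_samples=2_000)
pop_sd = statistics.pstdev(population)

# The spread of the sample means should be close to sigma / sqrt(n).
print("SD of the sample means:", round(statistics.stdev(means), 2))
print("sigma / sqrt(n):       ", round(pop_sd / 25 ** 0.5, 2))
```

The two printed numbers should be close to each other, which is the sense in which the variability of the mean estimate is itself predictable.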
5 G89.2228 Lect 2a Taking Sampling Variability into Account Suppose I randomly assign eight AIDS patients to one of two groups. –One group of four is given brief training in positive thinking –The other group is the control The next week I determine that the treatment group has more T4 cells than the control group –How do we know that the difference is not simply due to chance sampling variation?
6 G89.2228 Lect 2a Null hypothesis testing: Logic Suppose we believe two populations have different means –Call this belief H1 Suppose we observe two sample means are not the same. Is this sufficient evidence for H1? A skeptic might claim that the sample means are expected to vary due to chance sampling fluctuations, even if the two populations have the same mean –Call the skeptic’s belief H0 (the null hypothesis)
7 G89.2228 Lect 2a Null hypothesis testing: Logic, continued Evaluate the size of the observed sample mean difference assuming H0 –If the difference is not surprising under H0, then the skeptic's concern stands. –If the size of the difference would be highly unusual under H0 by sampling fluctuations alone, then we dismiss the skeptic's concern.
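For the eight-patient example, one way to make "surprising under H0" concrete is an exact randomization test: if the treatment does nothing, every split of the eight patients into two groups of four was equally likely, so the observed difference can be compared against all 70 possible splits. This is a sketch of that idea, not necessarily the test used in the course. The T4 counts and the function name randomization_p_value are invented for illustration.

```python
from itertools import combinations
from statistics import fmean

def randomization_p_value(treatment, control):
    """One-sided exact p-value: the share of equally likely group assignments whose
    mean difference (treatment - control) is at least as large as the observed one."""
    pooled = list(treatment) + list(control)
    observed = fmean(treatment) - fmean(control)
    indices = range(len(pooled))
    count = 0
    total = 0
    for chosen in combinations(indices, len(treatment)):
        chosen_set = set(chosen)
        t = [pooled[i] for i in chosen]
        c = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so that ties with the observed difference are counted.
        if fmean(t) - fmean(c) >= observed - 1e-12:
            count += 1
    return count / total

# Hypothetical T4 counts, invented purely for illustration.
treatment = [520, 480, 610, 450]
control = [430, 400, 500, 390]
p = randomization_p_value(treatment, control)
print(round(p, 3))
```

With these made-up numbers, 4 of the 70 possible assignments give a difference at least as large as the observed one, so the one-sided p-value is about 0.057.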
8 G89.2228 Lect 2a Null hypothesis testing: More formal version To evaluate whether the data are surprising under H0, we evaluate formal probability statements. Let D represent some function of the data (e.g., a function of the mean difference). –Choose a cutoff k so that Pr(D>k|H0) is small; the values D>k form the rejection region of D –Reject H0 if the calculated D falls in the rejection region. If the calculated D does not fall in the rejection region, then we fail to reject H0. –A result that rejects H0 is said to be “statistically significant” –A result that fails to reject H0 is said not to be statistically significant. Formal probability statements are needed.
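As a minimal sketch of the decision rule, suppose D is a z statistic that is standard normal under H0 and the test is one-sided with Pr(D>k|H0) set to 0.05. Both choices are assumptions for this example; the slide does not fix a distribution or a probability level. The names critical_value and decide are hypothetical helpers.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Return k such that Pr(D > k | H0) = alpha when D is standard normal under H0."""
    return NormalDist().inv_cdf(1 - alpha)

def decide(d, alpha=0.05):
    """Apply the rule: reject H0 only if the calculated D falls in the region D > k."""
    k = critical_value(alpha)
    return "reject H0" if d > k else "fail to reject H0"

k = critical_value(0.05)
print(round(k, 3))   # 1.645
print(decide(2.10))  # reject H0
print(decide(1.20))  # fail to reject H0
```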
9 G89.2228 Lect 2a Null hypothesis testing: Decision analysis Decision rule ignores Type II error »Only Pr(D>k|H0) is fixed »In practice, Pr(D>k|H1) is often small »Pr(D>k|H1) is statistical power Statistical power analysis can help design studies that keep both Type I and Type II error small
NHST decision by state of the null hypothesis:
                Reject H0       Not reject H0
H0 true         Type I error    Correct
H0 false        Correct         Type II error
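To see how Pr(D>k|H1) depends on study design, the sketch below computes power for a one-sided two-group z test. It assumes a known common standard deviation, equal group sizes, and an effect expressed in standard deviation units. These assumptions and the function name power_one_sided_z are illustrative and do not come from the lecture.

```python
from statistics import NormalDist

def power_one_sided_z(effect_size, n_per_group, alpha=0.05):
    """Pr(D > k | H1) for a one-sided z test comparing two group means.

    effect_size is the true mean difference divided by the common SD.
    Under H1, D is normal with mean effect_size * sqrt(n_per_group / 2) and SD 1.
    """
    z = NormalDist()
    k = z.inv_cdf(1 - alpha)                      # cutoff fixed under H0
    shift = effect_size * (n_per_group / 2) ** 0.5
    return 1 - z.cdf(k - shift)

# Power grows with the number of cases per group for a fixed effect size.
for n in (4, 16, 64):
    print(n, round(power_one_sided_z(0.5, n), 2))
```

With an effect size of zero the function returns alpha, which matches the Type I error cell of the table; with only four cases per group, as in the eight-patient example, power for a moderate effect is far below what most designs aim for.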
10 G89.2228 Lect 2a Null hypothesis testing: Critique Statistical power in psychology is usually low Statistical significance is confused with scientific significance Neither H1 nor H0 is likely to be literally correct The conditional probability Pr[DATA|H0] is often misinterpreted as Pr[H0|DATA] NHST leads to categorical rather than quantitative results Rules lead to biased reporting –Investigators strive for “significance” –Small effects are less often reported
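The gap between Pr[DATA|H0] and Pr[H0|DATA] can be illustrated with Bayes' rule. In the sketch below, "DATA" is simplified to "the test came out significant", and the prior probability of H0, the alpha level, and the power are all hypothetical inputs chosen for the example. The function name prob_null_given_significant is likewise an assumption.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """Pr(H0 | significant result) by Bayes' rule.

    prior_null: assumed prior probability that H0 is true
    alpha:      Pr(significant | H0)
    power:      Pr(significant | H1)
    """
    false_positive = alpha * prior_null
    true_positive = power * (1 - prior_null)
    return false_positive / (false_positive + true_positive)

# Hypothetical inputs: H0 and H1 equally likely beforehand, alpha = .05, power = .30.
posterior = prob_null_given_significant(prior_null=0.5, alpha=0.05, power=0.30)
print(round(posterior, 3))  # 0.143
```

Under these assumed inputs, the chance that H0 is true after a significant result is about 0.14, not 0.05, and it moves further from alpha as power drops or as H0 becomes more plausible beforehand.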
11 G89.2228 Lect 2a Alternatives to Null Hypothesis Testing Estimation of effect size –Just how big is the observed difference? Interval estimation –What range of values is likely to contain the population effect size? Meta-analysis –What does the distribution of estimated effects look like over replications of the study?
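The first two alternatives can be sketched in a few lines. The example computes a standardized effect size (Cohen's d with a pooled SD) and an interval for the mean difference. The interval uses a large-sample normal approximation, which is an assumption made to keep the example within the standard library; with small groups a t quantile would normally replace the normal quantile. The scores are made up, and the function names are hypothetical.

```python
from statistics import NormalDist, fmean, variance

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (fmean(a) - fmean(b)) / pooled_var ** 0.5

def mean_difference_ci(a, b, level=0.95):
    """Large-sample (normal approximation) interval for mean(a) - mean(b)."""
    diff = fmean(a) - fmean(b)
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se

# Hypothetical scores, invented for illustration.
group_a = [14, 18, 11, 20, 16, 15, 19, 13]
group_b = [12, 15, 10, 16, 13, 11, 17, 12]

print("effect size d:", round(cohens_d(group_a, group_b), 2))
low, high = mean_difference_ci(group_a, group_b)
print("95% interval for the difference:", round(low, 2), "to", round(high, 2))
```

Reporting the estimate and its interval answers "how big" and "how uncertain" directly, rather than reducing the result to a reject or fail-to-reject decision.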