Inferences about Population Means

Presentation transcript:

The t-test: Inferences about Population Means

Questions What is the main use of the t-test? How is the distribution of t related to the unit normal? When would we use a t-test instead of a z-test? Why might we prefer one to the other? What are the chief varieties or forms of the t-test? What is the standard error of the difference between means? What are the factors that influence its size?

More Questions Identify the appropriate version of t to use for a given design. Compute and interpret t-tests appropriately. Given that construct a rejection region. Draw a picture to illustrate.

Background The t-test is used to test hypotheses about means when the population variance is unknown (the usual case). Closely related to z, the unit normal. Developed by Gosset for the quality control of beer. Comes in 3 varieties: single sample, independent samples, and dependent samples.

What kind of t is it? Single sample t – we have only 1 group; want to test against a hypothetical mean. Independent samples t – we have 2 means, 2 groups; no relation between groups, e.g., each person is randomly assigned to exactly one group. Dependent t – we have two means. Either the same people are in both groups, or the people are related, e.g., husband-wife, left hand-right hand, hospital patient and visitor.

Single-sample z test For large samples (N>100) we can use z to test hypotheses about means.
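The slide's own worked numbers did not survive in this transcript, so the following is only the general form of the large-sample test in standard notation. The symbols $\bar{y}$ (sample mean), $\mu_0$ (hypothesized mean), $s$ (sample SD) and $N$ (sample size) are assumed here, not taken from the slide.

```latex
\[
  \text{est. } \sigma_M = \frac{s}{\sqrt{N}},
  \qquad
  z = \frac{\bar{y} - \mu_0}{\text{est. } \sigma_M},
  \qquad
  \text{reject } H_0 \text{ at } \alpha = .05 \text{ (two-tailed) if } |z| > 1.96 .
\]
```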

The t Distribution We use t when the population variance is unknown (the usual case) and sample size is small (N<100, the usual case). If you use a stat package for testing hypotheses about means, you will use t. The t distribution is a short, fat relative of the normal. The shape of t depends on its df. As N becomes infinitely large, t becomes normal.

Degrees of Freedom For the t distribution, degrees of freedom are always a simple function of the sample size, e.g., (N-1). One way of explaining df is that if we know the total or mean, and all but one score, the last score is not free to vary, so only (N-1) scores are free. The last score is fixed by the others. 4+3+2+X = 10. X=1.

Single-sample t-test With a small sample size, we compute the same numbers as we did for z, but we compare them to the t distribution instead of the z distribution. The computed t of 1 is smaller than the critical value of 2.064 (cf. z=1.96), so the result is n.s. The confidence interval is about 9 to 13 and contains the hypothesized mean of 10, so again n.s.
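The arithmetic behind this slide can be sketched in a few lines of Python. The inputs below (sample mean 11, SD 5, N = 25, hypothesized mean 10) are assumptions inferred from the numbers that survive on the slide (t = 1, critical value 2.064, interval of about 9 to 13); the slide's original values are not shown in the transcript.

```python
import math

# Two-tailed .05 critical value quoted on the slide (corresponds to df = 24).
T_CRIT_DF24 = 2.064


def one_sample_t(mean, sd, n, mu0):
    """Return (t, df, standard error) for a single-sample t-test."""
    se = sd / math.sqrt(n)      # estimated standard error of the mean
    t = (mean - mu0) / se       # distance from mu0 in standard-error units
    return t, n - 1, se


def confidence_interval(mean, se, t_crit):
    """Return the (lower, upper) bounds of mean +/- t_crit * se."""
    half_width = t_crit * se
    return mean - half_width, mean + half_width


# Assumed illustrative inputs, chosen to match the slide's surviving results.
t_stat, df, se = one_sample_t(mean=11.0, sd=5.0, n=25, mu0=10.0)
lower, upper = confidence_interval(11.0, se, T_CRIT_DF24)

print(f"t({df}) = {t_stat:.2f}, critical value = {T_CRIT_DF24}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
print("significant" if abs(t_stat) > T_CRIT_DF24 else "n.s.")
```

The test and the interval agree, as they must: the interval contains 10 exactly when |t| is below the critical value.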

R code for t distribution
qt(c(.025, .975), df=7)
[1] -2.364624 2.364624
qt(c(.025, .975), df=1000)
[1] -1.962339 1.962339
pt(2, df=7)
[1] 0.9571903
1-pt(2, df=7)
[1] 0.04280966
For a 2-tailed test, the cumulative probability p returned by pt would need to exceed .975 (or [1-p] be less than .025) for significance, so t = 2 with df = 7 is not significant.

Review How are the distributions of z and t related? Given that construct a rejection region. Draw a picture to illustrate.

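Difference Between Means (1) Most studies have at least 2 groups (e.g., M vs. F, Exp vs. Control). If we want to know the difference in population means, the best guess is the difference in sample means. Unbiased: $E(\bar{y}_1 - \bar{y}_2) = \mu_1 - \mu_2$. Variance of the difference (independent samples): $\sigma^2_{diff} = \sigma_1^2/N_1 + \sigma_2^2/N_2$. Standard error: $\sigma_{diff} = \sqrt{\sigma_1^2/N_1 + \sigma_2^2/N_2}$.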

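Difference Between Means (2) We can estimate the standard error of the difference between means from the sample variances: $est.\ \sigma_{diff} = \sqrt{s_1^2/N_1 + s_2^2/N_2}$. For large samples, we can use z: $z = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{est.\ \sigma_{diff}}$.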

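Independent Samples t (1) Looks just like z: $t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{est.\ \sigma_{diff}}$, with df=N1-1+N2-1=N1+N2-2. If SDs and Ns are equal (each group has SD s and size N), the estimate is $est.\ \sigma_{diff} = \sqrt{2s^2/N}$. We assume that the variance in both groups is the same. There are varieties of the t-test that relax this assumption (e.g., Welch's t-test).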

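Independent Samples t Pooled Standard Error of the Difference (computed): $est.\ \sigma_{diff} = \sqrt{\frac{(N_1-1)s_1^2 + (N_2-1)s_2^2}{N_1+N_2-2}\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}$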

Independent Samples t (2) tcrit = t(.05,10)=2.23
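The slide's data are not in the transcript, so the sketch below uses made-up scores (six per group, giving df = 10) purely to show the pooled-variance computation and the comparison against the critical value of 2.23 quoted above. The function name and data are hypothetical.

```python
import math
import statistics

# Critical value quoted on the slide: t(.05, 10), two-tailed.
T_CRIT_DF10 = 2.23


def pooled_t(group1, group2):
    """Independent-samples t with a pooled variance estimate.

    Returns (t, df, standard error of the difference).
    """
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)   # N-1 denominator
    var2 = statistics.variance(group2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.fmean(group1) - statistics.fmean(group2)) / se_diff
    return t, df, se_diff


# Hypothetical scores, not the slide's data.
experimental = [12, 14, 11, 15, 13, 13]
control = [10, 9, 12, 11, 8, 10]

t_stat, df, se_diff = pooled_t(experimental, control)
print(f"t({df}) = {t_stat:.3f}, SE of difference = {se_diff:.3f}")
print("significant" if abs(t_stat) > T_CRIT_DF10 else "n.s.")
```

With these scores the group means are 13 and 10 and both sample variances are 2, so the pooled variance is also 2 and t is about 3.67, which exceeds 2.23.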

Review What is the standard error of the difference between means? What are the factors that influence its size? Describe a design (what IV? What DV?) where it makes sense to use the independent samples t test.

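Dependent t (1) Observations come in pairs: brother and sister, or repeated measures on the same person. The dependence is handled by computing a difference score for each pair, Di=yi1-yi2, and testing the mean difference as in a single-sample t. df=N(pairs)-1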

Dependent t (2) Brother Sister 5 7 8 3 Diff 2 1
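The table on this slide is too garbled in the transcript to recover, so the sketch below uses hypothetical brother and sister scores only to illustrate the method: take the difference within each pair, then run a single-sample t on the differences against zero.

```python
import math
import statistics


def dependent_t(first, second):
    """Dependent-samples t computed from paired difference scores.

    Returns (t, df, list of differences).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n_pairs = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n_pairs)
    t = statistics.fmean(diffs) / se
    return t, n_pairs - 1, diffs


# Hypothetical paired scores, not the slide's data.
brother = [5, 7, 8, 6, 9]
sister = [3, 6, 6, 5, 5]

t_stat, df, diffs = dependent_t(brother, sister)
print(f"differences = {diffs}")
print(f"t({df}) = {t_stat:.3f}")
```

Here df is the number of pairs minus one (4), for which the two-tailed .05 critical value is 2.776.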

Assumptions The t-test is based on assumptions of normality and homogeneity of variance. You can test for both of these (make sure you know how). As long as the samples in each group are large and nearly equal in size, the t-test is robust; that is, it still performs well even when the assumptions are not met.

Review Describe a design where it makes sense to use a single-sample t. Describe a design where it makes sense to use a dependent samples t.

Strength of Association (1) Scientific purpose is to predict or explain variation. There are statistical indexes of how well our IV accounts for variance in the DV. These are measures of how strongly or closely associated our IVs and DVs are (omega-squared and R-squared are most commonly used). Variance accounted for:

Strength of Association (2) How much of the variance in Y is associated with the IV? Compare the 1st (left-most) curve with the curve in the middle and the one on the right. In each case, how much of the variance in Y is associated with the IV, group membership? More in the second comparison. As the mean difference gets bigger, so does the variance accounted for.

Association & Significance Power increases with association (effect size) and sample size. A significance test statistic is the product of an effect size and a function of sample size, in both the single-sample and independent-samples cases (note the relation to the z score). Increasing sample size does not increase effect size (strength of association). It shrinks the standard error, so |t| is larger and power is greater.
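The slide's formulas are missing from the transcript. In standard notation the "effect size times sample size" relation can be written as below; this is a sketch, and the symbols ($d$ for the standardized mean difference, $s_p$ for the pooled SD, $n$ for the per-group size) are assumptions rather than the slide's own.

```latex
\[
  \text{Single sample:}\quad
  d = \frac{\bar{y} - \mu_0}{s},
  \qquad
  t = \frac{\bar{y} - \mu_0}{s/\sqrt{N}} = d\,\sqrt{N}
\]
\[
  \text{Independent samples (equal } n \text{ per group):}\quad
  d = \frac{\bar{y}_1 - \bar{y}_2}{s_p},
  \qquad
  t = \frac{\bar{y}_1 - \bar{y}_2}{s_p\sqrt{2/n}} = d\,\sqrt{\frac{n}{2}}
\]
```

In both cases $d$ does not depend on sample size, while $t$ grows with the square root of it.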

Estimating Power (1) If the null is false, the statistic is no longer distributed as t, but rather as noncentral t. This makes power computation difficult.

Noncentrality Howell introduces the noncentrality parameter delta to use for estimating power. For the one-sample t, $\delta = d\sqrt{N}$. Recall the relations between t and d on the earlier slide.

Estimating Power (2) Suppose (Howell, p. 231) that we have 25 people, a sample mean of 105, and a hypothesized mean and SD of 100 and 15, respectively. Then Howell presents an appendix where delta is related to power. For power = .8, alpha = .05, delta must be 2.80. To solve for N, we compute:
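The missing steps can be sketched with a normal approximation. This is an assumption about how a delta-to-power table behaves, not Howell's appendix itself, and the function names are hypothetical. The approximation ignores the negligible far tail and gives a slightly smaller N than an exact noncentral-t calculation such as the SAS run in this deck, which reports 73.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def one_sample_delta(mean, mu0, sd, n):
    """Noncentrality parameter for the one-sample case: delta = d * sqrt(N)."""
    d = (mean - mu0) / sd
    return d * math.sqrt(n)


def approx_power(delta, alpha=0.05):
    """Approximate two-tailed power from delta (normal approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(delta - z_crit)


def one_sample_n(d, power=0.8, alpha=0.05):
    """Smallest N whose delta reaches the requested approximate power."""
    delta_needed = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    return math.ceil((delta_needed / d) ** 2)


# Values from the slide: N = 25, sample mean 105, hypothesized mean 100, SD 15.
delta = one_sample_delta(mean=105, mu0=100, sd=15, n=25)
power = approx_power(delta)
n_needed = one_sample_n(d=(105 - 100) / 15)

print(f"delta = {delta:.2f}, approximate power = {power:.2f}")
print(f"N needed for power .8: {n_needed}")
```

The delta required for power .8 comes out as about 2.80 under this approximation, matching the value quoted on the slide. The resulting N depends on how d is rounded before squaring.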

Estimating Power (3) Dependent t can be cast as a single-sample t using difference scores. For the independent-samples t, Howell's method gives n per group, so double it for the total N. Suppose d = .5 (medium effect) and n = 25 per group, so that $\delta = d\sqrt{n/2} = 1.77$. From Howell's appendix, a delta of 1.77 with alpha = .05 results in power of .43. For a power of .8, we need delta = 2.80, which requires 63 per group.
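The same normal-approximation sketch extends to two independent groups. As before, this is an assumed stand-in for Howell's appendix lookup, with hypothetical function names. It reproduces the slide's delta of 1.77 and the 63 per group, and gives power close to the tabled .43.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def two_sample_delta(d, n_per_group):
    """Noncentrality parameter for two equal groups: delta = d * sqrt(n / 2)."""
    return d * math.sqrt(n_per_group / 2)


def approx_power(delta, alpha=0.05):
    """Approximate two-tailed power from delta (normal approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(delta - z_crit)


def two_sample_n_per_group(d, power=0.8, alpha=0.05):
    """Smallest per-group n reaching the requested approximate power."""
    delta_needed = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (delta_needed / d) ** 2)


# Values from the slide: d = .5, n = 25 per group.
delta = two_sample_delta(d=0.5, n_per_group=25)
power = approx_power(delta)
n_per_group = two_sample_n_per_group(d=0.5)

print(f"delta = {delta:.2f}, approximate power = {power:.2f}")
print(f"n per group for power .8: {n_per_group} (total {2 * n_per_group})")
```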

SAS Proc Power – single sample example

proc power;
  onesamplemeans test=t
    nullmean = 100
    mean = 105
    stddev = 15
    power = .8
    ntotal = . ;
run;

The POWER Procedure
One-sample t Test for Mean

Fixed Scenario Elements
  Distribution         Normal
  Method               Exact
  Null Mean            100
  Mean                 105
  Standard Deviation   15
  Nominal Power        0.8
  Number of Sides      2
  Alpha                0.05

Computed N Total
  Actual Power   N Total
  0.802          73

2 sample t Power: Calculate sample size

proc power;
  twosamplemeans
    meandiff = .5
    stddev = 1
    power = 0.8
    ntotal = .;
run;

Two-sample t Test for Mean Difference

Fixed Scenario Elements
  Distribution         Normal
  Method               Exact
  Mean Difference      0.5
  Standard Deviation   1
  Nominal Power        0.8
  Number of Sides      2
  Null Difference      0
  Alpha                0.05
  Group 1 Weight       1
  Group 2 Weight       1

Computed N Total
  Actual Power   N Total
  0.801          128

2 sample t Power

proc power;
  twosamplemeans
    meandiff = 5     /* assumed difference */
    stddev = 10      /* assumed SD */
    sides = 1        /* 1 tail */
    ntotal = 50      /* 25 per group */
    power = .;       /* tell me! */
run;

The POWER Procedure
Two-Sample t Test for Mean Difference

Fixed Scenario Elements
  Distribution         Normal
  Method               Exact
  Number of Sides      1
  Mean Difference      5
  Standard Deviation   10
  Total Sample Size    50
  Null Difference      0
  Alpha                0.05
  Group 1 Weight       1
  Group 2 Weight       1

Computed Power
  Power
  0.539

Typical Power in Psych Average effect size is about d=.40. Consider power for effect sizes between .3 and .6. What kind of sample size do we need for power of .8?

proc power;
  twosamplemeans
    meandiff = .3 to .6 by .1
    stddev = 1
    power = .8
    ntotal = .;
  plot x = power min = .5 max = .95;
run;

Two-sample t Test
Computed N Total
  Index   Mean Diff   Actual Power   N Total
  1       0.3         0.801          352
  2       0.4         0.804          200
  3       0.5         0.801          128
  4       0.6         0.804          90

Typical studies are underpowered.

Power Curves Why a whopper of an IV is helpful.

Review About how many people total will you need for power of .8, alpha is .05 (two tails), and an effect size of .3? You can only afford 40 people per group, and based on the literature, you estimate the group means to be 50 and 60 with a standard deviation within groups of 20. What is your power estimate?