When σ is unknown
The sample standard deviation s provides an estimate of the population standard deviation σ. Larger samples give more reliable estimates of σ.
[Figure: a population distribution with the sampling distributions of x̄ for a small sample and a large sample.]

The t distributions
We take one random sample of size n from a Normal population N(μ, σ):
- When σ is known, the sampling distribution of x̄ is Normal N(μ, σ/√n), and the statistic z = (x̄ − μ)/(σ/√n) follows the standard Normal N(0, 1).
- When σ is estimated from the sample standard deviation s, the statistic t = (x̄ − μ)/(s/√n) follows the t distribution with n − 1 degrees of freedom.

When n is large, s is a good estimate of σ and the t(df = n − 1) distribution is close to the standard Normal distribution.
[Figure: t distributions with df = 1, 4, 20, and 100 plotted against the standard Normal curve.]
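This convergence can be checked numerically. Below is a minimal Python sketch (standard library only) that evaluates the t density, written out from its textbook formula, at 0 for the df values in the figure and compares it with the standard Normal density. The function names are mine, not part of the slides.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Standard Normal N(0, 1) density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# The t density at 0 climbs toward the Normal density at 0 as df grows.
for df in (1, 4, 20, 100):
    print("df =", df, "->", round(t_pdf(0, df), 4))
print("Normal ->", round(normal_pdf(0), 4))
```

By df = 100 the two densities are nearly indistinguishable, which is the point of the figure.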

Standard deviation versus standard error
For a sample of size n, the sample standard deviation s is:
s = √[ Σ(xᵢ − x̄)² / (n − 1) ]
n − 1 is the "degrees of freedom." The value s/√n is called the standard error of the sample mean, SEM. Scientists often present their sample results as mean ± SEM.
Example: A medical study examined the effect of a new medication on seated systolic blood pressure. The results for 25 patients are presented as mean ± SEM, with SEM = 8.9. What is the standard deviation s of the sample data?
SEM = s/√n, so s = SEM × √n = 8.9 × √25 = 44.5
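The SEM ↔ s conversion used in the example is easy to script. A small Python sketch, with helper names of my own choosing:

```python
import math

def sem(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def s_from_sem(sem_value, n):
    """Recover the sample standard deviation from a reported SEM."""
    return sem_value * math.sqrt(n)

# Blood-pressure example from the slide: SEM = 8.9, n = 25.
print(s_from_sem(8.9, 25))  # 44.5
```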

Table C
When σ is known, we use the Normal distribution and z. When σ is unknown, we use a t distribution with n − 1 degrees of freedom (df). Table C shows the z-values and t-values corresponding to landmark P-values/confidence levels.

The one-sample t test
As before, a test of hypotheses requires a few steps:
1. Stating the null hypothesis (H0)
2. Deciding on a one-sided or two-sided alternative (Ha)
3. Choosing a significance level α
4. Calculating t and its degrees of freedom
5. Finding the area under the curve with Table C or software
6. Stating the P-value and concluding

We draw a random sample of size n from an N(μ, σ) population. When σ is estimated from s, the test statistic
t = (x̄ − μ0)/(s/√n)
has a t distribution with df = n − 1. The resulting t test is robust to deviations from Normality as long as the sample size is large enough.
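Step 4 of the test, computing t and its degrees of freedom, can be sketched in Python. The numbers in the example call are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic (xbar - mu0) / (s / sqrt(n)) and its df."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical numbers for illustration: x̄ = 128, s = 15, n = 36,
# testing H0: μ = 120.
t, df = one_sample_t(128, 120, 15, 36)
print(t, df)  # 3.2 35
```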

One-sided (one-tailed) or two-sided (two-tailed): the P-value is the probability, if H0 were true, of randomly drawing a sample like the one obtained, or more extreme, in the direction of Ha.

Using Table C: For Ha: μ > μ0 with n = 10 and t = 2.70, look in the df = 9 row. The value 2.70 falls between the critical values for one-sided tail probabilities 0.02 and 0.01, so 0.02 > P-value > 0.01.
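The table lookup can be mimicked in Python. The df = 9 critical values below are transcribed from a typical printing of Table C (treat them as an assumption and check your own table); the bracketing function is my own helper:

```python
# One-sided tail probabilities and critical values, df = 9 row of Table C
# (transcribed from a common printing; verify against your table).
ROW_DF9 = [(0.05, 1.833), (0.025, 2.262), (0.02, 2.398), (0.01, 2.821), (0.005, 3.250)]

def bracket_p(t, row):
    """Bracket the one-sided P-value between adjacent table entries."""
    lo, hi = 0.0, 1.0
    for p, crit in row:
        if t > crit:
            hi = min(hi, p)   # t exceeds this critical value, so P < p
        else:
            lo = max(lo, p)   # t is below this critical value, so P > p
    return lo, hi

print(bracket_p(2.70, ROW_DF9))  # (0.01, 0.02)
```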

Study: Was the intervention effective in helping obese children lose weight?
Participants: 53 obese children ages 9 to 12 with a BMI above the 95th percentile for age and gender.
Intervention: family counseling sessions on the stoplight diet (green/yellow/red approach to eating food), with 8 weekly sessions and 3 follow-up sessions.
Assessment: weight change at 15 weeks of intervention.
We test H0: μ = 0 versus Ha: μ < 0 (one-sided test).
[Summary output for Weightchange: N, Mean, SE Mean (s/√n), StDev (s).]

MINITAB: Test of mu = 0 vs < 0
[Output for Weightchange: N, Mean, StDev, SE Mean, T, P.]
df = 53 − 1 = 52 ≈ 50; from Table C, one-sided P > 0.0005 (software gives P ≈ 0.001), highly significant. There is a significant weight loss, on average, following the intervention.

Confidence intervals
A level C confidence interval is a range of values (an interval) that contains the true population parameter with confidence C. We have a set of data from a Normal population with both μ and σ unknown. We use x̄ to estimate μ, and s to estimate σ, using a t distribution (df = n − 1).
- C is the area between −t* and t*.
- We find t* in the df = n − 1 line of Table C.
- The margin of error m is m = t* × s/√n, and the interval is x̄ ± m.
[Figure: t density curve with central area C between −t* and t*.]

Data on the blood cholesterol levels (mg/dl) of 24 lab rats give a sample mean of 85 and a standard deviation of 12. We want a 95% confidence interval for the mean blood cholesterol of all lab rats. Assume Normality of the data.
df = 24 − 1 = 23, so t* = 2.069, and m = 2.069 × 12/√24 ≈ 5.1, giving 85 ± 5.1.
We are 95% confident that the true mean blood cholesterol of all lab rats is between 79.9 and 90.1 mg/dl.
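The same interval can be reproduced in Python (standard library only). The value t* = 2.069 for df = 23 and 95% confidence is quoted from Table C:

```python
import math

def t_confidence_interval(xbar, s, n, t_star):
    """Level-C CI for the mean: xbar ± t* · s/√n."""
    m = t_star * s / math.sqrt(n)
    return xbar - m, xbar + m

# Cholesterol example: x̄ = 85, s = 12, n = 24, and t* = 2.069
# (the df = 23, 95% entry of Table C).
lo, hi = t_confidence_interval(85, 12, 24, 2.069)
print(round(lo, 1), round(hi, 1))  # 79.9 90.1
```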

Matched pairs t procedures
Sometimes we want to compare treatments or conditions at the individual level. The data sets produced this way are not independent: the individuals in one sample are related to those in the other sample.
- Pre-test and post-test studies look at data collected on the same sample elements before and after some experiment is performed.
- Twin studies often try to sort out the influence of genetic factors by comparing a variable between sets of twins.
- Using people matched for age, sex, and education in social studies allows us to cancel out the effect of these potential lurking variables.

Study: Was the intervention effective in helping obese children lose weight?
Participants: 53 obese children ages 9 to 12 with a BMI above the 95th percentile for age and gender.
Intervention: family counseling sessions on the stoplight diet (green/yellow/red approach to eating food), with 8 weekly sessions and 3 follow-up sessions.
Assessment: weight change at 15 weeks of intervention.
This is a pre-/post design. The weight-change values are the difference in body weight before and after intervention for each participant.
[Summary output for Weightchange: N, Mean, SE Mean, StDev.]

Does lack of caffeine increase depression? Randomly selected caffeine-dependent individuals were deprived of all caffeine-rich foods and assigned to receive daily pills. At one time the pills contained caffeine; at another time they were a placebo. Depression was assessed quantitatively (higher scores represent greater depression). (Assume Normality.)
This is a matched pairs design with 2 data points for each subject. We compute a new variable "Difference" = Placebo minus Caffeine.

With 11 "difference" points, df = n − 1 = 10. We find: x̄diff = 7.36; sdiff = 6.92; so SEMdiff = sdiff/√n = 6.92/√11 ≈ 2.09.
We test H0: μdiff = 0 versus Ha: μdiff > 0:
t = x̄diff/SEMdiff = 7.36/2.09 ≈ 3.53
For df = 10, Table C gives 3.169 < 3.53 < 3.581, so 0.0025 < P-value < 0.005. Caffeine deprivation causes a significant increase in depression (P < 0.005, n = 11).
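The arithmetic of the matched pairs test reduces to a one-sample t on the differences, which is easy to verify in Python; paired_t is a helper name of my own:

```python
import math

def paired_t(mean_diff, s_diff, n):
    """Matched-pairs t statistic from summary statistics on the differences."""
    t = mean_diff / (s_diff / math.sqrt(n))
    return t, n - 1

# Caffeine-deprivation example: x̄diff = 7.36, sdiff = 6.92, n = 11.
t, df = paired_t(7.36, 6.92, 11)
print(round(t, 2), df)  # 3.53 10
```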

Robustness
The t procedures are exactly correct when the population is exactly Normal. This is rare. The t procedures are robust to small deviations from Normality, but:
- The sample must be a random sample from the population.
- Outliers and skewness strongly influence the mean and therefore the t procedures. Their impact diminishes as the sample size gets larger because of the central limit theorem.
As a guideline:
- When n < 15, the data must be close to Normal and without outliers.
- When 15 < n < 40, mild skewness is acceptable, but not outliers.
- When n > 40, the t statistic will be valid even with strong skewness.
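The sample-size guideline can be encoded as a rough rule of thumb. This is only a toy restatement of the slide's heuristic, not a statistical test, and the function name and boolean inputs are my own:

```python
def t_ok(n, outliers=False, skewness="none"):
    """Rough heuristic from the slide: are t procedures reliable here?
    skewness is one of 'none', 'mild', 'strong'."""
    if n < 15:
        return skewness == "none" and not outliers
    if n <= 40:
        return skewness in ("none", "mild") and not outliers
    return True  # large samples: valid even with strong skewness

print(t_ok(10, skewness="mild"))   # False
print(t_ok(20, skewness="mild"))   # True
print(t_ok(50, skewness="strong")) # True
```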

Does oligofructose consumption stimulate calcium absorption? Healthy adolescent males took a pill for nine days and had their calcium absorption tested on the ninth day. The experiment was repeated three weeks later. Subjects received either an oligofructose pill first or a control sucrose pill first; the order was randomized and the experiment was double-blind. Fractional calcium absorption was measured (in percent of intake) for 11 subjects.
Can we use a t inference procedure for this study? Discuss the assumptions.

Red wine, in moderation Does drinking red wine in moderation increase blood polyphenol levels, thus maybe protecting against heart attacks? Nine randomly selected healthy men were assigned to drink half a bottle of red wine daily for two weeks. The percent change in their blood polyphenol levels was assessed: x ̅ = 5.5; s = 2.517; df = n − 1 = 8 Can we use a t inference procedure for this study? Discuss the assumptions.
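If the conditions are judged acceptable, the one-sample t statistic for testing H0: μ = 0 against Ha: μ > 0 follows directly from the summary statistics on the slide. A quick Python check:

```python
import math

# Summary statistics from the red-wine example: x̄ = 5.5, s = 2.517, n = 9.
xbar, s, n = 5.5, 2.517, 9
t = xbar / (s / math.sqrt(n))   # t statistic for H0: μ = 0
print(round(t, 2), "with df =", n - 1)  # 6.56 with df = 8
```

A t value this large with df = 8 lies far beyond every entry in the Table C row, so the one-sided P-value is very small.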