Inference for a Population Mean μ: Estimation and Hypothesis Testing



Confidence Intervals and Hypothesis Tests for a Population Mean μ; t distributions. Topics: t distributions; confidence intervals for a population mean μ; sample size required to estimate μ; hypothesis tests for a population mean μ.

The Importance of the Central Limit Theorem. When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is N(μ, σ/√n): a Normal model centered at the population mean μ with standard deviation σ/√n.

Time (in minutes) from the start of the game to the first goal scored for 281 regular season NHL hockey games from a recent season: mean μ = 13 minutes, median 10 minutes. [Figure, left: histogram of the 281 times. Figure, right: histogram of the means of 500 samples, each of size n = 30, randomly selected from the population at the left.]
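The behaviour in the two histograms can be imitated with a short simulation. This is only a sketch: the hockey data are not available here, so it assumes a right-skewed exponential population with mean 13 as a stand-in, and the function name `simulate_sample_means` is made up for illustration.

```python
import random
import statistics

def simulate_sample_means(population_mean=13.0, n=30, num_samples=500, seed=1):
    """Draw num_samples samples of size n from a right-skewed (exponential)
    population and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # Assumed stand-in population: exponential with mean population_mean.
        sample = [rng.expovariate(1.0 / population_mean) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means()
# For an exponential population sigma equals the mean, so the Normal model
# predicts sample means centred near 13 with spread near 13 / sqrt(30).
print(round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
```

Even though the assumed population is strongly skewed, the 500 sample means should cluster roughly symmetrically around 13 with a standard deviation close to 13/√30 ≈ 2.4.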

Since the sampling model for x̄ is the normal model, when we standardize x̄ we get the standard normal z: z = (x̄ − μ)/(σ/√n).

If μ is unknown, we probably don’t know σ either. The sample standard deviation s provides an estimate of the population standard deviation σ. For a sample of size n, the sample standard deviation s is: s = √( Σ(xᵢ − x̄)² / (n − 1) ), where n − 1 is the “degrees of freedom.” The value s/√n is called the standard error of x̄, denoted SE(x̄).
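As a small illustration of these two quantities, the sketch below computes s and SE(x̄) with Python's standard library. The data are the eight unpopped-kernel percentages from the microwave popcorn example later in the slides; the helper name `standard_error` is an illustrative choice.

```python
import math
import statistics

def standard_error(data):
    """Return (s, SE): the sample standard deviation (n - 1 divisor)
    and the standard error s / sqrt(n)."""
    n = len(data)
    s = statistics.stdev(data)  # statistics.stdev divides by n - 1
    return s, s / math.sqrt(n)

# Percentages of unpopped kernels from the microwave popcorn example.
unpopped = [7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2]
s, se = standard_error(unpopped)
print(round(s, 3), round(se, 3))
```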

Standardize using s for σ. Substitute s (the sample standard deviation) for σ, so that (x̄ − μ)/(σ/√n) becomes (x̄ − μ)/(s/√n). It is not quite correct to label the expression on the right “z”: not knowing σ means using z is no longer correct.

t-distributions. Suppose that a Simple Random Sample of size n is drawn from a population whose distribution can be approximated by a N(µ, σ) model. When σ is known, the sampling model for the mean x̄ is N(µ, σ/√n), so z = (x̄ − µ)/(σ/√n) is approximately Z ~ N(0, 1). When σ is estimated with the sample standard deviation s, the statistic t = (x̄ − µ)/(s/√n) follows a t distribution with n − 1 degrees of freedom; t is the 1-sample t statistic.

Confidence Interval Estimates. Confidence interval for μ: x̄ ± t* · s/√n, where: t* = critical value from the t-distribution with n − 1 degrees of freedom; x̄ = sample mean; s = sample standard deviation; n = sample size. For very small samples (n < 15), the data should follow a Normal model very closely. For moderate sample sizes (n between 15 and 40), t methods will work well as long as the data are unimodal and reasonably symmetric. For sample sizes larger than 40, t methods are safe to use unless the data are extremely skewed. If outliers are present, analyses can be performed twice, with the outliers and without.
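The interval x̄ ± t* · s/√n can be sketched as a small function. The summary statistics below are hypothetical (they do not come from the slides), and the critical value is passed in by hand as it would be read from a t-table; 1.812 is the standard table entry for 90% confidence with 10 degrees of freedom.

```python
import math

def t_confidence_interval(xbar, s, n, t_star):
    """Return (lower, upper) for xbar +/- t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical summary statistics (not from the slides): n = 11, so df = 10.
# t_star = 1.812 is the t-table value for 90% confidence with 10 df.
lower, upper = t_confidence_interval(xbar=50.0, s=4.0, n=11, t_star=1.812)
print(round(lower, 2), round(upper, 2))
```

Passing t* explicitly mirrors the table-lookup workflow used in the slides; software would normally compute the critical value directly.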

t distributions: very similar to z ~ N(0, 1). Sometimes called Student’s t distribution (Gosset, a brewery employee). Properties: i) symmetric around 0 (like z); ii) degrees of freedom.

[Figure: Student’s t Distribution (Figure 11.3, Page 372). Density curves for the standard normal Z and for t distributions with 1 and 7 degrees of freedom.]

t-Table 90% confidence interval; df = n-1 = 10

Student’s t Distribution with 10 degrees of freedom (t10): P(t > 1.812) = .05 and P(t < −1.812) = .05.

Comparing t and z Critical Values (n = 30). [Table: z and t critical values at four confidence levels.]
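A comparison of this kind can be regenerated numerically. The sketch below uses only the standard library: z* comes from `statistics.NormalDist`, and t* is found by integrating the t density with Simpson's rule and bisecting on the resulting CDF. The four confidence levels (90%, 95%, 98%, 99%) are an assumption, and the helper names `t_pdf`, `t_cdf` and `t_critical` are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), by Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(conf_level, df):
    """Two-sided critical value t*, found by bisection on the CDF."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t*
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 30
for level in (0.90, 0.95, 0.98, 0.99):  # assumed confidence levels
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    t_star = t_critical(level, n - 1)
    print(f"{level:.0%}: z* = {z_star:.3f}, t* = {t_star:.3f}")
```

At every level t* exceeds z*, which is why t intervals are somewhat wider than z intervals for the same data.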

Hot Dog Fat Content. The NCSU cafeteria manager wants a 95% confidence interval to estimate the fat content of the brand of hot dogs served in the campus cafeterias. Degrees of freedom = 35; for 95%, t* = 2.030. We are 95% confident that the interval ( , ) contains the true mean fat content of the hot dogs.

During a flu outbreak, many people visit emergency rooms. Before being treated, they often spend time in crowded waiting rooms where other patients may be exposed. A study was performed investigating a drive-through model where flu patients are evaluated while they remain in their cars. In the study, 38 people were each given a scenario for a flu case that was selected at random from the set of all flu cases actually seen in the emergency room. The scenarios provided the “patient” with a medical history and a description of symptoms that would allow the patient to respond to questions from the examining physician. The patients were processed using a drive-through procedure that was implemented in the parking structure of Stanford University Hospital. The time to process each case from admission to discharge was recorded. Researchers were interested in estimating the mean processing time for flu patients using the drive-through model. Use 95% confidence to estimate this mean.

Degrees of freedom = 37; for 95%, t* = 2.026. We are 95% confident that the interval (25.484, ) contains the true mean processing time for emergency room flu cases using the drive-through model.

Determining Sample Size to Estimate μ

Required Sample Size To Estimate a Population Mean μ. If you desire a C% confidence interval for a population mean μ with an accuracy specified by you, how large does the sample size need to be? We will denote the accuracy by ME, which stands for Margin of Error.

Example: Sample Size to Estimate a Population Mean μ. Suppose we want to estimate the unknown mean height μ of male students at NC State with a confidence interval. We want to be 95% confident that our estimate is within .5 inch of μ. How large does our sample size need to be?

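Confidence Interval for μ: x̄ ± t* · s/√n, where t* has n − 1 degrees of freedom. The margin of error is ME = t* · s/√n; solving for n gives n = (t* · s/ME)².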

Good news: we have an equation. Bad news: 1. We need to know s. 2. We don’t know n, so we don’t know the degrees of freedom (n − 1) needed to find t*.

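A Way Around this Problem: Use the Standard Normal. Replace t* with the standard normal critical value z*, which does not depend on n: ME = z* · σ/√n, so n = (z* · σ/ME)², rounded up to the next whole number.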

Estimating σ: 2 Approaches. 1. Previously collected data or prior knowledge of the population. 2. If the population is normal or near-normal, then σ can be conservatively estimated by σ ≈ range/6, since 99.7% of observations fall within 3σ of the mean.

Example: sample size to estimate the mean height µ of NCSU undergraduate male students. We want to be 95% confident that we are within .5 inch of µ, so ME = .5 and z* = 1.96. Suppose previous data indicate that σ is about 2 inches. Then n = [(1.96)(2)/(.5)]² = 61.47, which we round up: we should sample 62 male students.

Example: Sample Size to Estimate a Population Mean μ (Textbooks). Suppose the financial aid office wants to estimate the mean NCSU semester textbook cost μ within ME = $25 with 98% confidence. How many students should be sampled? Previous data show σ is about $85.
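The textbook question can be worked with the z-based sample size formula n = (z* · σ/ME)². The sketch below is one way to do it with the standard library; the function name `required_sample_size` is illustrative, and it uses the unrounded z* from `statistics.NormalDist` rather than a rounded table value.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin_of_error, conf_level):
    """Smallest n with z* * sigma / sqrt(n) <= margin_of_error."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Always round up: rounding down would give a margin larger than ME.
    return math.ceil((z_star * sigma / margin_of_error) ** 2)

# Textbook-cost example: sigma is about $85, ME = $25, 98% confidence.
n_textbooks = required_sample_size(sigma=85, margin_of_error=25, conf_level=0.98)
print(n_textbooks)
```

Under these inputs the formula calls for 63 students. The same function applied to the height example (σ = 2, ME = .5, 95%) gives 62, in agreement with the hand calculation.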

Example: Sample Size to Estimate a Population Mean μ (NFL footballs). The manufacturer of NFL footballs uses a machine to inflate new footballs. The mean inflation pressure is 13.0 psi, but random factors cause the final inflation pressure of individual footballs to vary from 12.8 psi to 13.2 psi. After throwing several interceptions in a game, Tom Brady complains that the balls are not properly inflated. The manufacturer wishes to estimate the mean inflation pressure to within .025 psi with a 99% confidence interval. How many footballs should be sampled?

Example: Sample Size to Estimate a Population Mean μ. The manufacturer wishes to estimate the mean inflation pressure to within .025 psi with a 99% confidence interval. How many footballs should be sampled? 99% confidence means z* = 2.58; ME = .025; σ = ? Inflation pressures range from 12.8 to 13.2 psi, so range = 13.2 − 12.8 = .4 and σ ≈ range/6 = .4/6 ≈ .067.
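The remaining step is to substitute these values into the z-based sample size formula. A worked version, assuming the range/6 estimate of σ and z* = 2.58 as stated on the slide:

```latex
\[
\sigma \approx \frac{\text{range}}{6} = \frac{13.2 - 12.8}{6} \approx 0.0667,
\qquad
n = \left(\frac{z^{*}\,\sigma}{ME}\right)^{2}
  = \left(\frac{(2.58)(0.0667)}{0.025}\right)^{2}
  \approx (6.88)^{2} \approx 47.4 .
\]
```

Rounding up, the manufacturer should sample 48 footballs.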

Testing Hypotheses about Means

Sweetness in cola soft drinks. Cola manufacturers want to test how much the sweetness of cola drinks is affected by storage. The sweetness loss due to storage was evaluated by 10 professional tasters by comparing the sweetness before and after storage (a positive value indicates a loss of sweetness). [Table: sweetness loss recorded by each of the 10 tasters; two of the values are negative.] We want to test if storage results in a loss of sweetness, thus: H0: μ = 0 versus HA: μ > 0, where μ is the mean sweetness loss due to storage. We also do not know the population parameter σ, the standard deviation of the sweetness loss.

The one-sample t-test. As in any hypothesis test, a hypothesis test for μ requires a few steps: 1. State the null and alternative hypotheses (H0 versus HA), deciding on a one-sided or two-sided test. 2. Calculate the test statistic t and determine its degrees of freedom. 3. Find the area under the t distribution with the t-table or technology. 4. State the P-value (or find bounds on the P-value) and interpret the result.

The one-sample t-test; hypotheses. Step 1: State the null and alternative hypotheses (H0 versus HA), deciding on a one-sided or two-sided test. H0: μ = μ0 versus HA: μ > μ0 (1-tail test). H0: μ = μ0 versus HA: μ < μ0 (1-tail test). H0: μ = μ0 versus HA: μ ≠ μ0 (2-tail test).

The one-sample t-test; test statistic. We perform a hypothesis test with null hypothesis H0: μ = μ0 using the test statistic t = (x̄ − μ0)/SE(x̄), where the standard error of x̄ is SE(x̄) = s/√n. When the null hypothesis is true, the test statistic follows a t distribution with n − 1 degrees of freedom. We use that model to obtain a P-value.

P-Values: Weighing the Evidence in the Data Against H0. The P-value is the probability, calculated assuming the null hypothesis H0 is true, of observing a value of the test statistic more extreme than the value we actually observed. The calculation of the P-value depends on whether the hypothesis test is 1-tailed (that is, the alternative hypothesis is HA: μ < μ0 or HA: μ > μ0) or 2-tailed (that is, the alternative hypothesis is HA: μ ≠ μ0).

P-Values. Assume the value of the test statistic t is t0. If HA: μ > μ0, then P-value = P(t > t0). If HA: μ < μ0, then P-value = P(t < t0). If HA: μ ≠ μ0, then P-value = 2P(t > |t0|).
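These three cases translate directly into code. The following is a minimal sketch using only the standard library: the t CDF is approximated by Simpson's rule, and the names `t_pdf`, `t_cdf` and `p_value` are illustrative. The usage line takes t0 = 2.70 with 9 degrees of freedom from the cola example in these slides.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), by Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def p_value(t0, df, alternative):
    """P-value of a one-sample t-test for the given alternative hypothesis."""
    if alternative == "greater":    # H_A: mu > mu0
        return 1 - t_cdf(t0, df)
    if alternative == "less":       # H_A: mu < mu0
        return t_cdf(t0, df)
    if alternative == "two-sided":  # H_A: mu != mu0
        return 2 * (1 - t_cdf(abs(t0), df))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Cola example: t0 = 2.70 with df = 9 and H_A: mu > 0.
print(round(p_value(2.70, 9, "greater"), 4))
```

The one-sided result falls between 0.01 and 0.02, matching the bounds read from the t-table, and the two-sided P-value is exactly twice the one-sided one.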

Interpreting P-Values. The P-value is the probability, calculated assuming the null hypothesis H0 is true, of observing a value of the test statistic more extreme than the value we actually observed. When the P-value is LOW, the null hypothesis must GO. How small does the P-value need to be to reject H0? Usual convention: the P-value should be less than .05 to reject H0. If the P-value > .05, then the conclusion is “do not reject H0”.

Sweetening colas (continued). Is there evidence that storage results in sweetness loss in colas? H0: μ = 0 versus HA: μ > 0 (one-sided test). Summary of the 10 tasters’ sweetness losses: average 1.02; standard deviation s ≈ 1.196; degrees of freedom n − 1 = 9; test statistic t = 2.70. From the one-tail columns of the t-table row for df = 9: 2.398 < t = 2.70 < 2.821; thus 0.01 < P-value < 0.02. Since P-value < .05, we reject H0. There is a significant loss of sweetness, on average, following storage.

The P-Value Weighs the Evidence in the Data against H0. The P-value is about the data, not the hypotheses, so: 1. The P-value is NOT the probability that the null hypothesis H0 is false. 2. The P-value is NOT the probability that the null hypothesis H0 is true. 3. The P-value is NOT the probability that the hypothesis test is erroneous.

P-Values and Jury Trials. H0: defendant innocent; HA: defendant guilty (beyond a reasonable doubt = low P-value). Possible verdicts: if there is insufficient evidence to convict the defendant (if the P-value is not low), the jury does NOT accept the null hypothesis and declare that the defendant is “innocent”. When the P-value is not low, juries can only fail to reject the null hypothesis and declare the defendant “not guilty.” In the same way, if the data are not particularly unlikely under the assumption that the null hypothesis is true, then our conclusion is “fail to reject H0”, not “accept H0”.

New York City Hotel Room Costs. The NYC Visitors Bureau claims that the average cost of a hotel room is $168 per night. A random sample of 25 hotels resulted in ȳ = $ and s = $. H0: μ = 168 versus HA: μ ≠ 168.

New York City Hotel Room Costs. H0: μ = 168 versus HA: μ ≠ 168; n = 25; df = 24. P-value = .158 (two-tailed, from the t distribution with 24 df). Do not reject H0: there is not sufficient evidence that the true mean cost is different from $168.

Microwave Popcorn. A popcorn maker wants a combination of microwave time and power that delivers high-quality popped corn with less than 10% unpopped kernels, on average. After testing, the research department determines that power 9 at 4 minutes is optimum. The company president tests 8 bags in his office microwave and finds the following percentages of unpopped kernels: 7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2. Do the data provide evidence that the mean percentage of unpopped kernels is less than 10%? H0: μ = 10 versus HA: μ < 10, where μ is the true unknown mean percentage of unpopped kernels.

Microwave Popcorn. H0: μ = 10 versus HA: μ < 10; n = 8; df = 7. Exact P-value = .02 (the area to the left of the test statistic under the t distribution with 7 df). Reject H0: there is sufficient evidence that the true mean percentage of unpopped kernels is less than 10%.
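The popcorn test statistic can be recomputed from the eight listed percentages. The sketch below does so with the standard library and then makes the decision by comparing t with a table critical value instead of computing the exact P-value; 1.895 is the standard t-table entry for a one-tail area of .05 with 7 degrees of freedom, and the function name `one_sample_t` is illustrative.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t, df) for the one-sample t statistic (xbar - mu0) / (s / sqrt(n))."""
    n = len(data)
    xbar = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)  # standard error of xbar
    return (xbar - mu0) / se, n - 1

unpopped = [7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2]  # percentages from the slide
t_stat, df = one_sample_t(unpopped, mu0=10)

# Lower-tail test at the .05 level: the t-table value for a one-tail area
# of .05 with 7 df is 1.895, so H0 is rejected when t < -1.895.
T_CRIT_05_DF7 = 1.895
reject = t_stat < -T_CRIT_05_DF7
print(round(t_stat, 2), df, reject)
```

The statistic comes out near −2.51, which lies beyond −1.895, so the critical-value approach reaches the same "reject H0" conclusion as the P-value of .02.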