Inference for a Single Population Proportion (p).

Sample Proportion
$\hat{p} = x/n$, where n = sample size and x = the observed number of “successes” in the sample.
$\hat{p}$ = estimate of the population proportion p = proportion of the sample having a specified attribute.

Sampling Distribution of the Sample Proportion
The histograms show the estimated sampling distribution of the sample proportion based on 1000 samples drawn at each given sample size (n). For larger samples, the sampling distribution of the sample proportion is approximately normal.
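
The simulation behind histograms like these can be sketched in a few lines. This is only an illustration: the true proportion p = .20, the sample sizes, the seed, and the function name `simulate_sample_proportions` are all assumptions, not taken from the slides.

```python
import random
from statistics import mean, pstdev

def simulate_sample_proportions(p, n, reps=1000, seed=0):
    """Draw `reps` samples of size n and return each sample's proportion of successes."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Centre and spread of the simulated sampling distribution for a few sample sizes.
for n in (10, 40, 200):
    p_hats = simulate_sample_proportions(p=0.20, n=n)
    print(n, round(mean(p_hats), 3), round(pstdev(p_hats), 3))
```

As n grows, the simulated proportions should cluster more tightly around p, and a histogram of them should look more bell-shaped.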

Sampling Distribution of the Sample Proportion
Sampling distribution of $\hat{p}$:
1. Provided n is “sufficiently large”, the sampling distribution is approximately normal, generally when np and n(1 − p) are both large enough.*
2. The mean of the sampling distribution is p = true population proportion.
3. The standard deviation of the sampling distribution, or the standard error of the sample proportion, is given by $SE(\hat{p}) = \sqrt{p(1-p)/n}$.
* A cutoff of 5 is also used.

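Implications for Inference (n “large”)
CI for Population Proportion (p): $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$
Test Statistic ($H_0: p = p_0$): $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
z-values from standard normal: 90% = 1.645, 95% = 1.96, 99% = 2.576
Effect Size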
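
The standard normal z-values for the usual confidence levels can be checked directly. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(conf):
    """Two-sided critical value z such that P(-z < Z < z) = conf for standard normal Z."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%}: z = {critical_z(conf):.3f}")
```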

Hypothesis Testing for a Population Proportion (p)
Null Hypothesis $H_0: p = p_0$
Alternative Hypothesis and p-value:
$H_a: p > p_0$ (upper-tailed): P-value = $P(Z \ge z)$
$H_a: p < p_0$ (lower-tailed): P-value = $P(Z \le z)$

Hypothesis Testing for a Population Proportion (p)
Null Hypothesis $H_0: p = p_0$
Alternative Hypothesis and p-value:
$H_a: p \ne p_0$ (two-tailed): P-value = $2P(Z \ge |z|)$
For two-tailed tests in general it is preferable to simply construct a confidence interval for the parameter of interest and see if the hypothesized value under the null hypothesis is contained in the CI. If it is, we fail to reject $H_0$; if it is not, we reject $H_0$.
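
The three alternatives differ only in which tail area of the standard normal is reported. The sketch below shows the large-sample test with all three options. The function name, the `alternative` labels, and the demo numbers (60 successes in 100 trials against $p_0$ = .50) are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, alternative="two-sided"):
    """Large-sample z-test of H0: p = p0; returns (z, p_value)."""
    p_hat = x / n
    # The standard error is computed under the null hypothesis.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":      # upper-tailed
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # lower-tailed
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # two-tailed
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value

for alt in ("greater", "less", "two-sided"):
    z, p_value = proportion_z_test(60, 100, 0.50, alternative=alt)
    print(alt, round(z, 2), round(p_value, 4))
```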

Example: Treatment of Kidney Cancer Historically, one in five kidney cancer patients survive 5 years past diagnosis, i.e. 20%. An oncologist using an experimental therapy treats n = 40 kidney cancer patients and 16 of them survive at least 5 years. Is there evidence that patients receiving the experimental therapy have a higher 5-year survival rate?

Step 1: Formulate Hypotheses
$H_0: p = .20$ versus $H_a: p > .20$, where p = the proportion of kidney cancer patients receiving the experimental therapy that survive at least 5 years.
Step 2: Determine test criteria
Choose α (may want to consider a smaller value?)
Use large sample test? Definitely questionable, as we have $np_0 = (40)(.20) = 8 > 5$ and $n(1-p_0) = (40)(.80) = 32 > 5$.

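Step 3: Collect data and compute test statistic
$\hat{p} = 16/40 = .40$
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \dfrac{.40 - .20}{\sqrt{(.20)(.80)/40}} = 3.16$
The observed 5-yr. survival rate for kidney cancer patients undergoing the experimental therapy is 3.16 standard errors above the historical rate!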

Step 4: Compute p-value
z = 3.16, P-value = $P(Z \ge 3.16)$ = .0008
Step 5: Make Decision and Interpret
We have very strong evidence to suggest the 5-year survival rate for kidney cancer patients undergoing the experimental therapy is greater than the current 5-yr. survival rate of 20% (p = .0008). It is highly unlikely we would obtain a 40% 5-yr. survival rate in our sample if in fact the 5-yr. survival rate for the population of patients treated with the experimental therapy was truly 20%.
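
The arithmetic in Steps 3 and 4 can be reproduced as follows. This sketch uses only the numbers given in the example (n = 40, 16 survivors, historical rate .20); the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 40, 16, 0.20

p_hat = x / n                       # observed 5-yr. survival rate
se_null = sqrt(p0 * (1 - p0) / n)   # standard error under H0
z = (p_hat - p0) / se_null          # standard errors above the historical rate
p_value = 1 - NormalDist().cdf(z)   # upper-tailed area

print(round(z, 2), round(p_value, 4))
```

The result should agree with the slide's z = 3.16 and P-value = .0008 up to rounding.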

Step 6: Quantify significant results Confidence Interval Effect Size
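
The formulas on this slide did not survive in the transcript, so the following is only a sketch under stated assumptions. It uses a 95% confidence level, the standard large-sample interval $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$, and treats the effect size as the simple difference $\hat{p} - p_0$. The slide may have used a different level or a different effect size measure.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 40, 16, 0.20
conf = 0.95  # assumed confidence level, not stated on the slide

p_hat = x / n
z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci_low, ci_high = p_hat - margin, p_hat + margin

difference = p_hat - p0  # assumed effect size: observed rate minus historical rate

print(round(ci_low, 3), round(ci_high, 3), round(difference, 2))
```

Under these assumptions the interval lies entirely above .20, consistent with rejecting $H_0$ in the upper-tailed test.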

Power Calculation
Baseline proportion is the proportion under the null hypothesis (.20 or 20% here).
Difference to detect is the absolute difference between p under the alternative and p under the null (i.e., .40 − .20 = .20).
Power is calculated once the sample size and difference information are entered (here, Power = .935).
For power, use software.

Power Curve for n = 40 and $p_0$ = .20
For a difference of .20, Power = .935, as seen on the previous slide.
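
For a rough idea of what the software computes, power can be approximated with the normal distribution. This is a sketch only. It assumes an upper-tailed test at α = .05 and a plain normal approximation, neither of which is stated on the slides, so it need not reproduce the .935 reported by the software. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power_upper_tailed(p0, p1, n, alpha=0.05):
    """Normal-approximation power of the upper-tailed z-test of H0: p = p0
    when the true proportion is p1."""
    std_normal = NormalDist()
    se_null = sqrt(p0 * (1 - p0) / n)  # spread of p_hat if H0 is true
    se_alt = sqrt(p1 * (1 - p1) / n)   # spread of p_hat if p = p1
    # H0 is rejected when p_hat exceeds this cutoff.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * se_null
    return 1 - std_normal.cdf((cutoff - p1) / se_alt)

# A crude power curve: power at several sample sizes for a difference of .20.
for n in (20, 40, 60, 80):
    print(n, round(approx_power_upper_tailed(p0=0.20, p1=0.40, n=n), 3))
```

Exact (binomial) power calculations or a different α will give somewhat different values, which is one reason the slides recommend software.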

Sample Size and CI’s for p
Suppose we wish to estimate p using a 95% CI and have a margin of error of 3%. What sample size do we need to use?
Recall the CI for p is given by: $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$
MARGIN OF ERROR (E): $E = z\sqrt{\hat{p}(1-\hat{p})/n}$

Sample Size and CI’s for p
Here for a 95% CI we want E = .03 or 3%: $1.96\sqrt{\hat{p}(1-\hat{p})/n} = .03$
After some wonderful algebraic manipulation: $n = \dfrac{(1.96)^2\,\hat{p}(1-\hat{p})}{(.03)^2}$
Oh, oh! We don’t know p-hat!!
1. “Guesstimate”
2. Use p-hat from a pilot or prior study.
3. The largest n we would ever need comes when p-hat = .50.

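Sample Size and CI’s for p
1. Informed approach: $n = \dfrac{z^2\,\hat{p}(1-\hat{p})}{E^2}$
2. Conservative approach (i.e. worst case scenario): $n = \dfrac{z^2(.50)(.50)}{E^2} = \dfrac{z^2}{4E^2}$
Standard normal values: 90% = 1.645, 95% = 1.96, 99% = 2.576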

Sample Size and CI’s for p
Original Question: Suppose we wish to estimate p using a 95% CI and have a margin of error of 3%. What sample size do we need to use?
Assume that we are estimating the 5 yr. survival rate for a new kidney cancer therapy, and we know historically that this survival rate is around 20%.
Using informed approach: $n = \dfrac{(1.96)^2(.20)(.80)}{(.03)^2} \approx 682.95$, so use n = 683.

Sample Size and CI’s for p
Original Question: Suppose we wish to estimate p using a 95% CI and have a margin of error of 3%. What sample size do we need to use?
Assume that we are estimating the 5 yr. survival rate for a new kidney cancer therapy, and we know historically that this survival rate is around 20%.
Using conservative approach: $n = \dfrac{(1.96)^2(.50)(.50)}{(.03)^2} \approx 1067.1$, so use n = 1068.
This is why media polls usually report a sampling error of ± 3% and that the poll was based on a sample of about n = 1000 individuals.
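
Both approaches come down to one formula, $n = z^2\,\hat{p}(1-\hat{p})/E^2$, rounded up to the next whole subject. The sketch below is a small helper for it; the function name and its defaults are our own choices. With `p_guess` left at .50 it gives the conservative answer.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_margin(margin, conf=0.95, p_guess=0.50):
    """Smallest n whose large-sample CI for p has margin of error <= `margin`,
    assuming the sample proportion turns out near `p_guess`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

informed = sample_size_for_margin(0.03, p_guess=0.20)  # prior estimate of 20%
conservative = sample_size_for_margin(0.03)            # worst case, p-hat = .50
print(informed, conservative)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.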

Small Sample Inference for p: Binomial Exact Test When the sample size is “small” the sampling distribution cannot be approximated by a standard normal distribution. However, regardless of sample size the EXACT sampling distribution of the number of “successes” in n independent trials ALWAYS has a Binomial Distribution. Thus if we knew more about the binomial distribution we could use it to find p-values when conducting a hypothesis test and also when constructing confidence intervals for the pop. proportion p.

Binomial Probability Distribution
A binomial random variable X is defined to be the number of “successes” in n independent trials where P(“success”) = p is constant.
Notation: X ~ BIN(n, p)
In the definition above, notice that the following conditions need to be satisfied for a binomial experiment:
1. There is a fixed number n of trials carried out.
2. The outcome of a given trial is either a “success” or a “failure”.
3. The probability of success (p) remains constant from trial to trial.
4. The trials are independent: the outcome of a trial is not affected by the outcome of any other trial.

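Binomial Distribution
If X ~ BIN(n, p), then $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$ for x = 0, 1, ..., n, where $\binom{n}{x} = \dfrac{n!}{x!\,(n-x)!}$.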

Binomial Distribution
If X ~ BIN(n, p), then $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$.
E.g. when n = 3 and p = .50 there are 8 possible equally likely outcomes (e.g. flipping a coin):
Outcome: SSS  SSF  SFS  FSS  SFF  FSF  FFS  FFF
X:        3    2    2    2    1    1    1    0
P(X=3) = 1/8, P(X=2) = 3/8, P(X=1) = 3/8, P(X=0) = 1/8
Now let’s use the binomial probability formula instead…

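Binomial Distribution
If X ~ BIN(n, p), then $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$.
E.g. when n = 3, p = .50 find P(X = 2):
$P(X = 2) = \binom{3}{2}(.50)^2(.50)^1 = 3(.125) = .375 = 3/8$
The three outcomes counted by $\binom{3}{2}$ are SSF, SFS, FSS.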
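
The binomial probability formula is easy to check by machine. This is a minimal sketch using the standard-library `math.comb` for the binomial coefficient; the function name `binom_pmf` is our own.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# The coin-flipping case: n = 3 trials, p = .50.
for x in range(4):
    print(x, binom_pmf(x, 3, 0.50))
```

The four probabilities should match the counts from listing the 8 outcomes (1/8, 3/8, 3/8, 1/8) and sum to 1.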

Example: Treatment of Kidney Cancer In our example we had n = 40 patients and if we assume the experimental therapy is no better than current treatments then probability of 5-year survival is p =.20. Thus the number of patients in our study surviving at least 5 years has a binomial distribution, i.e. X ~ BIN(40,.20).

Example: Treatment of Kidney Cancer
X ~ BIN(40, .20), find the probability that exactly 16 patients survive at least 5 years.
$P(X = 16) = \binom{40}{16}(.20)^{16}(.80)^{24}$
This requires some calculator gymnastics and some scratchwork!
Also, keep in mind that for a p-value we need to find the probability of having 16 or more patients surviving at least 5 yrs. Remember, the p-value is the probability of evidence as extreme or more extreme than what we observed, assuming $H_0$ is true.

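Example: Treatment of Kidney Cancer
So we actually need to find:
p-value = P(X ≥ 16) = P(X = 16) + P(X = 17) + … + P(X = 40)
$= \binom{40}{16}(.20)^{16}(.80)^{24} + \binom{40}{17}(.20)^{17}(.80)^{23} + \dots + \binom{40}{40}(.20)^{40}(.80)^{0}$
= EXACT p-value
YIPES!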

Example: Treatment of Kidney Cancer
X ~ BIN(40, .20), find the probability that 16 or more patients survive at least 5 years. USE COMPUTER!
Binomial Exact Test p-value calculator in JMP. Enter:
n = sample size
x = observed # of “successes”
$p_0$ = proportion under $H_0$
p-values are computed automatically for all three possible alternatives.

Example: Treatment of Kidney Cancer
X ~ BIN(40, .20), find the probability that 16 or more patients survive at least 5 years. USE COMPUTER!
Binomial Exact Test p-value calculator in JMP: Exact p-value = .0029
Contrasting this EXACT p-value (p = .0029) with the one calculated earlier using the normal approximation (p = .0008), we see a fairly substantial difference!
MORAL: USE EXACT WHEN n IS SMALL!!!
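
The exact upper-tailed p-value is just a sum of binomial probabilities, which takes only a few lines of code. This sketch does not reproduce JMP's calculator; it is the same sum written with the Python standard library, and the function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_upper_p_value(x, n, p0):
    """P(X >= x) when X ~ BIN(n, p0): the exact p-value for Ha: p > p0."""
    return sum(binom_pmf(k, n, p0) for k in range(x, n + 1))

p_exact = exact_upper_p_value(x=16, n=40, p0=0.20)
print(round(p_exact, 4))
```

The lower-tailed version sums from 0 to x instead. Two-tailed exact p-values have more than one definition in common use, so they are left out of this sketch.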

Exact CI for p using the binomial distribution
Find LCL and UCL for p by finding probabilities that meet the following requirements:
$P(X \ge x \mid p = LCL) = \alpha/2$ and $P(X \le x \mid p = UCL) = \alpha/2$
Use computer to find these probabilities, e.g. for 95% confidence $\alpha/2 = .025$.

Exact CI for p using the binomial distribution
Find a 95% CI for p for the kidney cancer study.
For the lower confidence limit we find LCL = .248 or .249.

Exact CI for p using the binomial distribution
Find a 95% CI for p for the kidney cancer study.
For the upper confidence limit we find UCL = .566 or .567.
Therefore, based on an EXACT 95% confidence interval, we estimate that the success rate of the experimental therapy is between 24.8% and 56.7%.
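
The search for LCL and UCL can be automated with simple bisection. This sketch assumes the usual exact (Clopper-Pearson style) definition, in which each tail probability, including the observed x, equals α/2. The function names and the tolerance are our own choices, and the slides' software may search differently.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(x, n, p):
    """P(X >= x)."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

def lower_tail(x, n, p):
    """P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def bisect_root(f, lo, hi, tol=1e-8):
    """Root of a monotone function f on [lo, hi], where f(lo) and f(hi) differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, conf=0.95):
    """Exact binomial confidence limits (LCL, UCL) for p."""
    half_alpha = (1 - conf) / 2
    lcl = 0.0 if x == 0 else bisect_root(lambda p: upper_tail(x, n, p) - half_alpha, 0.0, 1.0)
    ucl = 1.0 if x == n else bisect_root(lambda p: lower_tail(x, n, p) - half_alpha, 0.0, 1.0)
    return lcl, ucl

lcl, ucl = exact_ci(x=16, n=40)
print(round(lcl, 3), round(ucl, 3))
```

For the kidney cancer data the limits should fall in the ranges the slides report (.248 to .249 and .566 to .567).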

Summary of Inference for a Single Population Proportion (p)
When n is “large”, use large sample methods based on the sampling distribution being approximately normal. (Easy)
When n is “small”, use exact methods based on the binomial distribution, which require specific software or tables. (Hard)
Exact methods can always be used!
In general, precise estimates of a population proportion require a large sample size, e.g. media polls, which typically use n = 1,000.