Confidence Intervals for Population Means

R Programming: Confidence Intervals for Population Means

Confidence Intervals

These notes will guide you through estimating a single population mean from a sample. Throughout these notes, the formula will be presented, applied manually, and applied via R.

Any confidence interval can be estimated using the following general form:

Sample estimate ± (z-score for the confidence level × standard error)

A confidence interval around a single population mean is developed using:

$$\bar{x} \pm z \cdot \frac{s}{\sqrt{n}}$$

Where:
$\bar{x}$ = sample mean
$z$ = the appropriate two-sided z-score, based upon the desired confidence level
$s$ = sample standard deviation
$n$ = number of elements in the sample

Typical z-scores used in CI estimation: 90% confidence = 1.645; 95% confidence = 1.96; 98% confidence = 2.33; 99% confidence = 2.575.
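The z-scores these notes rely on (1.645 for 90%, 1.96 for 95%, 2.33 for 98%, 2.575 for 99%) can be reproduced in R with `qnorm()`. This is a minimal sketch; the variable names are illustrative and not part of the original slides. R returns 2.576 for 99%, so the 2.575 used in the notes is a slightly rounded-down table value.

```r
# Two-sided critical values from the standard normal distribution.
# For a confidence level C, the z-score is the (1 - (1 - C) / 2) quantile.
conf_levels <- c(0.90, 0.95, 0.98, 0.99)
z_scores <- qnorm(1 - (1 - conf_levels) / 2)

# Show the confidence levels next to their z-scores.
print(data.frame(conf_level = conf_levels, z = round(z_scores, 3)))
```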

For example, let's say that we took a poll of 100 college students and determined that they spent an average of $225 on books in a semester, with a standard deviation of $50. Report the 95% confidence interval for the mean expenditure on books for ALL college students.

In this example:

$\bar{x} = 225$, $z = 1.96$, $s = 50$, $n = 100$

So, the 95% interval would be:

$$225 \pm 1.96 \cdot \frac{50}{\sqrt{100}} = 225 \pm 9.8$$

In English, this becomes: “We are 95% confident that the mean expenditure on books for college students is $225 plus or minus $9.80; that is, we are 95% confident that the mean is as little as $215.20 or as much as $234.80.”
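The book-expenditure interval can be checked in R directly from the summary statistics. This is a small sketch of the manual calculation, not a built-in routine; the variable names are chosen here for illustration.

```r
# Summary statistics from the book-expenditure example.
x_bar <- 225   # sample mean (dollars)
s     <- 50    # sample standard deviation (dollars)
n     <- 100   # sample size
z     <- 1.96  # z-score for 95% confidence

# Margin of error and the two ends of the interval.
margin <- z * s / sqrt(n)
ci <- c(lower = x_bar - margin, upper = x_bar + margin)
print(ci)
```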

One general note regarding confidence intervals: the results tell us NOTHING about the probability of an individual observation. A 95% interval SHOULD NOT be interpreted as “Joe has a 95% probability of spending between $215.20 and $234.80”. The interval is an estimate of the mean of the population, not of an individual observation.

A second example: 175 students from Penn State were asked “In a typical day, about how much time do you spend watching TV?” The sample average was 2.09 hours with a standard deviation of 1.644 hours. Report the 99% confidence interval for the average time spent watching TV for all college students.

The answer is: “We are 99% confident that the average time college students spend watching TV is between 1.77 and 2.41 hours.”
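To see where 1.77 and 2.41 come from, the same formula can be wrapped in a small function. `mean_ci` is a hypothetical helper written for these notes, not a function from base R; it uses the 99% z-score of 2.575 from the table of typical z-scores.

```r
# Hypothetical helper: z-based confidence interval for a mean,
# computed from summary statistics rather than raw data.
mean_ci <- function(x_bar, s, n, z) {
  margin <- z * s / sqrt(n)
  c(lower = x_bar - margin, upper = x_bar + margin)
}

# TV-watching example: n = 175, mean = 2.09 hours, sd = 1.644 hours.
tv_ci <- mean_ci(x_bar = 2.09, s = 1.644, n = 175, z = 2.575)
print(round(tv_ci, 2))
```

The margin of error here is about 0.32 hours, which gives the interval quoted in the answer.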

R Programming: Confidence Intervals for the Population Mean of Paired Differences

These notes will guide you through estimating confidence intervals for the population mean of paired samples. Throughout these notes, the formula will be presented, applied manually, and applied via R.

A few notes about paired differences (which are different from two independent sample differences): The same (or VERY similar) people or objects are measured before and after treatment. Typically, we are interested only in the calculated differences between the before and after measurements, not in the actual values of the original data that were collected.

As we saw previously, any CI can be estimated using the approach of:

Sample estimate ± (z-score for the confidence level × standard error)

A confidence interval around the population mean of paired differences is:

$$\bar{x}_d \pm z \cdot \frac{s}{\sqrt{n}}$$

Where:
$\bar{x}_d$ = sample mean of the paired differences (equal to the difference of the two means)
$z$ = the appropriate two-sided z-score, based upon the desired confidence level
$s$ = sample standard deviation of the differences
$n$ = number of elements (pairs) in the sample

Typical z-scores used in CI estimation:

90% confidence = 1.645
95% confidence = 1.96
98% confidence = 2.33
99% confidence = 2.575

[Figure: standard normal curve with 90% of the area under the curve between z = -1.645 and z = 1.645]

For example, let's say that a particular firm tracks its sales every week over the course of a year and averages 150 units a week. After the firm hires an advertising company, the average goes up to 165 units a week the next year. The standard deviation of the weekly differences between the two years is 10.25. What is the 90% confidence interval?

In this example:

$\bar{x}_d = 15$, $z = 1.645$, $s = 10.25$, $n = 52$

Week   Year 1   Year 2   Difference
1      100      125      25
2      160      180      20
3      110      150      40
4…     110…     120…     10…
Overall average difference: 15
Standard deviation of the differences: 10.25

So, the 90% interval would be:

$$15 \pm 1.645 \cdot \frac{10.25}{\sqrt{52}} = 15 \pm 2.3382$$

In English, this becomes: “We are 90% confident that the mean difference in weekly sales between year 1 and year 2 is 15 plus or minus 2.3382; that is, we are 90% confident that the change is as little as 12.6618 sales per week and as great as 17.3382 sales per week.”

Let's now put this into a more realistic context. Say that each sale generates $100 in profit. If you are generating as little as 12.6618 incremental sales per week (the lower end of the interval), that would equate to $65,841.36 (12.6618 × 100 × 52) in incremental profit over one year. This might represent the maximum that you would be willing to pay the advertising agency.
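The weekly-sales interval and the profit figure built on its lower end can be reproduced in R from the summary statistics. This is a sketch under the numbers given in the example; the full 52-week data set is not shown in the slides, so the code starts from the mean and standard deviation of the differences rather than from raw data. Variable names are illustrative.

```r
# Summary statistics for the paired (year 2 minus year 1) weekly differences.
d_bar <- 15      # mean weekly difference in units sold
s_d   <- 10.25   # standard deviation of the weekly differences
n     <- 52      # number of paired weeks
z     <- 1.645   # z-score for 90% confidence

margin <- z * s_d / sqrt(n)
lower  <- d_bar - margin
upper  <- d_bar + margin
print(round(c(margin = margin, lower = lower, upper = upper), 4))

# Incremental annual profit at the lower end of the interval,
# using the rounded lower bound (12.6618) as the notes do.
profit_per_sale <- 100
weeks_per_year  <- 52
min_profit <- round(lower, 4) * profit_per_sale * weeks_per_year
print(min_profit)
```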

A second example: A pharmaceutical company is testing a new cholesterol-reducing drug, Choless. They recruit 100 representative men for the trial of Choless. The average cholesterol level before the trial was 242. The average cholesterol level after the trial was 216. The average difference was 26, with a standard deviation of the differences of 32. Determine the 99% confidence interval.

In this example:

$\bar{x}_d = 26$, $z = 2.575$, $s = 32$, $n = 100$

So, the 99% interval of the paired differences would be:

$$26 \pm 2.575 \cdot \frac{32}{\sqrt{100}} = 26 \pm 8.24$$

In English, this becomes: “We are 99% confident that the mean difference in cholesterol levels before Choless and after Choless is 26 points, plus or minus 8.24 points; that is, we are 99% confident that the change is as little as 17.76 points or as great as 34.24 points.”
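The Choless numbers can be checked the same way. The sketch below also confirms that the mean of the paired differences equals the difference of the before and after means, which is why only the differences are needed. Variable names are illustrative.

```r
# Before/after means from the Choless trial.
mean_before <- 242
mean_after  <- 216

# The mean of the paired differences equals the difference of the means.
d_bar <- mean_before - mean_after   # 26

s_d <- 32      # standard deviation of the paired differences
n   <- 100     # number of men in the trial
z   <- 2.575   # z-score for 99% confidence

margin <- z * s_d / sqrt(n)
ci <- c(lower = d_bar - margin, upper = d_bar + margin)
print(ci)
```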

R Programming: Confidence Intervals for the Difference Between Two Independent Sample Means

These notes will guide you through estimating parameter (mean) confidence intervals for two independent samples. Throughout these notes, the formula will be presented, applied manually, and applied via R.

As we saw previously, any CI can be estimated using the approach of:

Sample estimate ± (z-score for the confidence level × standard error)

A confidence interval around the difference between two independent sample means can be calculated as:

$$(\bar{x}_1 - \bar{x}_2) \pm z \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Where:
$\bar{x}_i$ = sample mean (two independent samples)
$z$ = the appropriate two-sided z-score, based upon the desired confidence level
$s_i$ = sample standard deviation (two independent samples)
$n_i$ = number of elements in each sample

A few notes about independent sample differences: The two samples must be statistically independent of each other. You need to know whether the variances (or standard deviations) are approximately equal. The formula from the previous slide assumes that they are not equal. A second formula and a discussion of the differences will be provided in a later slide.

Volunteers who had developed a cold within the previous 24 hours were randomly assigned to two groups: one took zinc lozenges and one took a placebo every 2-3 hours until their symptoms had subsided.

Zinc group: 35 people; mean duration of symptoms 4.5 days; standard deviation 1.6 days
Placebo group: 33 people; mean duration of symptoms 8.1 days; standard deviation 1.8 days

Calculate the 95% confidence interval for the difference between the two groups.

In this example:

$\bar{x}_1 = 4.5$, $\bar{x}_2 = 8.1$, $z = 1.96$, $s_1 = 1.6$, $s_2 = 1.8$, $n_1 = 35$, $n_2 = 33$

So, the 95% interval would be:

$$(4.5 - 8.1) \pm 1.96 \cdot \sqrt{\frac{1.6^2}{35} + \frac{1.8^2}{33}} = -3.6 \pm .8112$$

In English, this becomes: “We are 95% confident that the zinc group experienced 3.6 fewer days of symptoms than did the placebo group, plus or minus .8112 days; that is, the zinc group experienced as many as 4.4112 fewer days of symptoms than the placebo group, or as few as 2.7888 fewer days.”
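The unpooled interval for the zinc example can be sketched in R from the group summaries. The variable names are illustrative. Evaluated without intermediate rounding, the margin comes out at about 0.8113, which agrees with the .8112 in the notes to three decimal places.

```r
# Group summaries from the zinc lozenge example.
x1 <- 4.5; s1 <- 1.6; n1 <- 35   # zinc group
x2 <- 8.1; s2 <- 1.8; n2 <- 33   # placebo group
z  <- 1.96                       # z-score for 95% confidence

# Unpooled standard error: each group keeps its own variance.
se_unpooled     <- sqrt(s1^2 / n1 + s2^2 / n2)
margin_unpooled <- z * se_unpooled

diff_means  <- x1 - x2
ci_unpooled <- c(lower = diff_means - margin_unpooled,
                 upper = diff_means + margin_unpooled)
print(round(c(difference = diff_means, margin = margin_unpooled, ci_unpooled), 4))
```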

In this example, we generated the confidence interval using the unpooled approach. There is a second option: the pooled approach. In theory, we use the pooled approach when the standard deviations are approximately the same between the two groups. In practice, this is uncommon.

The rule of thumb goes something like this:

If the larger sample standard deviation is from the group with the larger sample size, the pooled procedure will generate a larger (more conservative) interval.

If the smaller sample standard deviation is from the group with the larger sample size, the pooled version may produce a misleadingly narrow interval.

Practitioners tend to use the unpooled procedure unless the sample standard deviations are VERY close. Let's redo the previous example with the pooled procedure and discuss the difference.

The pooled interval is:

$$(\bar{x}_1 - \bar{x}_2) \pm z \cdot \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

Where:

$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

So, the 95% interval would be:

$$(4.5 - 8.1) \pm 1.96 \cdot 1.699 \cdot \sqrt{\frac{1}{35} + \frac{1}{33}} = -3.6 \pm .8080$$

Using the pooled version, we did generate a slightly smaller margin of error (.8080 versus .8112). This occurred because the larger group (group 1) had the smaller standard deviation. As mentioned in the previous slide, when in doubt, use the unpooled approach.
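The pooled and unpooled margins can be put side by side in R. This is a sketch from the same zinc and placebo summaries, with illustrative variable names. Without intermediate rounding the pooled standard deviation is about 1.6999 and the pooled margin about 0.8084; the .8080 in the notes comes from rounding the pooled standard deviation to 1.699 first. Either way, the pooled margin is slightly smaller than the unpooled one.

```r
# Group summaries from the zinc lozenge example.
s1 <- 1.6; n1 <- 35   # zinc group
s2 <- 1.8; n2 <- 33   # placebo group
z  <- 1.96            # z-score for 95% confidence

# Pooled standard deviation: a weighted average of the two variances.
s_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))

# Margins of error under each approach.
margin_pooled   <- z * sqrt(s_pooled^2 * (1 / n1 + 1 / n2))
margin_unpooled <- z * sqrt(s1^2 / n1 + s2^2 / n2)

print(round(c(s_pooled = s_pooled,
              pooled   = margin_pooled,
              unpooled = margin_unpooled), 4))
```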

Let's now generate two-sample confidence intervals using R…

R Programming: Confidence Intervals for One Sample Proportion

These notes will guide you through estimating proportion confidence intervals. In each case, the formula will be presented, applied manually, and applied via R.

The interval for any CI estimate can be expressed as:

Sample estimate ± (z-score for the confidence level × standard error)

In the case of a single population proportion, the expression is:

$$p \pm z \cdot \sqrt{\frac{p(1-p)}{n}}$$

Where:
$p$ = the proportion of units in the sample
$z$ = the number of standard deviations associated with the required confidence level
$n$ = the number of observations in the sample

The Gallup Organization, founded in 1935 by George Gallup, is one of the most respected polling organizations in the world. Their website, www.gallup.com, is a great place to find confidence intervals of proportions. A common survey for Gallup is the presidential approval rating.

Before we dissect a Gallup survey, let's take a look at the methodology statement from their website:

Survey Methods

Results are based on telephone interviews with 997 national adults, aged 18 and older, conducted May 19, 2009. For results based on the total sample of national adults, one can say with 95% confidence that the maximum margin of sampling error is ±3 percentage points.

Interviews are conducted with respondents on land-line telephones (for respondents with a land-line telephone) and cellular phones (for respondents who are cell-phone only).

In addition to sampling error, question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of public opinion polls. Polls conducted entirely in one day, such as this one, are subject to additional error or bias not found in polls conducted over several days.

From this statement, Gallup asked 997 adults if they approve of the job the president is doing. From this representative sample, 64% said “Yes”. How did they get a margin of error of 3%?

From this example, we have $p = .64$, $z = 1.96$, $n = 997$:

$$.64 \pm 1.96 \cdot \sqrt{\frac{.64(.36)}{997}} = .64 \pm .0298$$

So, what would happen to the interval if we increased the confidence to 99%? Everything would stay the same except for the z-score. With $p = .64$, $z = 2.575$, $n = 997$:

$$.64 \pm 2.575 \cdot \sqrt{\frac{.64(.36)}{997}} = .64 \pm .0391$$

From the Gallup data (approval = 64%, n = 997), here are the effects of changing the confidence level:

Confidence Level   Z-score   Margin of Error   Low End of Approval Rating   High End of Approval Rating
90%                1.645     .0250             61.50%                       66.50%
95%                1.96      .0298             61%                          67%
99%                2.575     .0391             60%                          68%
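The effect of the confidence level on the Gallup interval can be tabulated in R in one step, since only the z-score changes between rows. This is a sketch with illustrative names (`gallup` is simply a data frame built here). The low and high ends are shown to four decimals, so they appear slightly less rounded than the whole percentages quoted for 95% and 99%.

```r
# Gallup approval example: sample proportion and sample size.
p <- 0.64
n <- 997

# Confidence levels and the z-scores the notes use for them.
conf_level <- c(0.90, 0.95, 0.99)
z          <- c(1.645, 1.96, 2.575)

# The standard error is the same for every row; only z changes.
se <- sqrt(p * (1 - p) / n)

gallup <- data.frame(
  conf_level = conf_level,
  z          = z,
  margin     = round(z * se, 4),
  low        = round(p - z * se, 4),
  high       = round(p + z * se, 4)
)
print(gallup)
```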

R Programming: Confidence Intervals for Difference of Two Sample Proportions

These notes will guide you through estimating the confidence interval for the difference in proportions for two independent samples. Throughout these notes, the formula will be presented, applied manually, and applied via R.

As we saw previously, the interval for any CI estimate can be expressed as:

Sample estimate ± (z-score for the confidence level × standard error)

In the case of a CI for the difference between two proportions, the expression is:

$$(p_1 - p_2) \pm z \cdot \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$

Where:
$p_i$ = the proportion of units in sample $i$ (1 or 2)
$z$ = the number of standard deviations associated with the required confidence level
$n_i$ = the number of observations in sample $i$ (1 or 2)

A few notes about independent sample differences: The sample proportions come from independent, randomly selected samples from the two populations. The frequency of each proportional quantity is at least 10 (for example, the number of “yes” responses and the number of “no” responses in each sample).

For example, let's say that we took a poll of students and asked “would you date someone with a great personality who you were not attracted to?” By gender, 61.07% of 131 women said “yes”, while 42.62% of 61 men said “yes”. What is the 95% confidence interval for the difference?

So, from this example, we have $p_1 = .6107$, $p_2 = .4262$, $z = 1.96$, $n_1 = 131$, $n_2 = 61$:

$$(.6107 - .4262) \pm 1.96 \cdot \sqrt{\frac{.6107(.3893)}{131} + \frac{.4262(.5738)}{61}} = .1845 \pm .1496$$

In English, this becomes: “We are 95% confident that the proportion of women who would date someone who they thought was a great person even if they did not find them attractive is 18.45 percentage points higher than the proportion of men; this difference could be as great as 33.41 percentage points or as little as 3.49 percentage points.”
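The two-proportion calculation is easy to get wrong by hand, so it is worth evaluating in R. The sketch below applies the formula directly to the proportions given in the example; variable names are illustrative. Evaluated this way, .6107 − .4262 comes to .1845 and the margin of error to about .1496.

```r
# Sample proportions and sizes from the dating-poll example.
p1 <- 0.6107; n1 <- 131   # women
p2 <- 0.4262; n2 <- 61    # men
z  <- 1.96                # z-score for 95% confidence

# Difference in proportions and its standard error.
diff_p <- p1 - p2
se     <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
margin <- z * se

ci <- c(lower = diff_p - margin, upper = diff_p + margin)
print(round(c(difference = diff_p, margin = margin, ci), 4))
```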

Let's now generate a single proportion confidence interval using R…
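One way this might look, as a hedged sketch rather than the original demonstration: base R's `prop.test()` returns a confidence interval for a single proportion. Two assumptions are made here. First, the count of 638 "yes" responses is not given in the notes; it is back-calculated from 64% of 997. Second, `prop.test()` does not use the formula from these notes: it inverts a score test (the Wilson interval), so its answer is close to, but not identical to, the manual result.

```r
# Assumed count: 64% of 997 respondents is roughly 638 "yes" answers.
x <- 638
n <- 997

# Built-in interval (score-based), without continuity correction.
result    <- prop.test(x, n, conf.level = 0.95, correct = FALSE)
wilson_ci <- as.numeric(result$conf.int)

# Manual interval using the formula from these notes, for comparison.
p_hat       <- x / n
wald_margin <- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
wald_ci     <- c(p_hat - wald_margin, p_hat + wald_margin)

print(round(rbind(prop_test = wilson_ci, manual = wald_ci), 4))
```

With a sample this large the two intervals agree to within a fraction of a percentage point; the gap grows for small samples or proportions near 0 or 1.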