The standard error of the sample mean and confidence intervals


The standard error of the sample mean and confidence intervals. How far is the average sample mean from the population mean? In what interval around mu can we expect to find 95% or 99% of sample means?

An introduction to random samples. When we speak about samples in statistics, we are talking about random samples. Random samples are samples that are obtained in line with very specific rules. If those rules are followed, the sample will be representative of the population from which it is drawn. One way that it will be representative of the population is that the sample mean will be close to the population mean. Specifically, on the average, sample means are closer to mu than are individual scores.

Random samples: Some principles. In a random sample, each and every score must have an equal chance of being chosen each time you add a score to the sample. Thus, the same score can be selected more than once, simply by chance. (This is called sampling with replacement.) The number of scores in a sample is called “n.” Sample statistics based on random samples provide least squares, unbiased estimates of their population parameters.

The variance and the standard deviation are the basis for the rest of this chapter. In Chapter 1 you learned to compute the average squared distance of individual scores from mu. We called it the variance. Taking a square root, you got the standard deviation. Now we are going to ask a slightly different question and transform the variance and standard deviation in another way.

As you add scores to a random sample, each randomly selected score tends to correct the sample mean back toward mu. If we have several samples, as we add scores the sample means get closer to each other and closer to mu. The larger the samples, the closer they will be to mu, on the average.

Let’s see how that happens. The population is 1320 students taking a test. Mu is 72.00, sigma = 12. Let’s randomly sample one student at a time and see what happens. We’ll create a random sample with 8 students’ scores in the sample.

[Figure: frequency distribution of test scores. Horizontal axis: score (36, 48, 60, 72, 84, 96, 108), also marked in standard deviations from the mean (-3 to +3).] Sample scores: 102 72 66 76 78 69 63. Means: 87 80 79 76.4 76.7 75.6 74.0.
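The way each new score pulls the sample mean back toward mu can be sketched with a running mean. This is a minimal illustration, not part of the original slides. It uses only the first four sample scores listed above (102, 72, 66, 76), because the transcript lists fewer scores than the eight the slide announces. The variable names are illustrative.

```python
from itertools import accumulate

MU = 72.0                      # population mean given on the slide
scores = [102, 72, 66, 76]     # first four sample scores as listed in the transcript

# Running mean after each added score: cumulative total / number of scores so far
running_means = [
    total / count
    for count, total in enumerate(accumulate(scores), start=1)
]

for count, mean in enumerate(running_means, start=1):
    print(f"n={count}: mean={mean:.1f}, distance from mu={abs(mean - MU):.1f}")
```

After the first score, the running means are 87, 80 and 79, which agree with the first three means listed on the slide.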

How much closer to mu does the sample mean get when you increase n, the size of the sample? (1) The average squared distance of individual scores is called the variance. You learned to compute it in Chapter 1. The symbol for the mean of a sample is the letter X with a bar over it. We will write that as X-bar.

How much closer to mu does the sample mean get when you increase n, the size of the sample? (2) The average squared distance of sample means from mu is the average squared distance of individual scores from mu divided by n, the size of the sample. Let’s put that in a formula: $\sigma^2_{\bar{X}} = \sigma^2 / n$

Let’s take that one step further. As you know, the square root of the variance is called the standard deviation. It is the average unsquared distance of individual scores from mu. The average unsquared distance of sample means from mu is the square root of $\sigma^2_{\bar{X}}$, which is $\sigma_{\bar{X}}$ (sigmaX-bar). sigmaX-bar is called the standard error of the sample mean or, more briefly, the standard error of the mean. Let’s look at the formulae: $\sigma^2_{\bar{X}} = \sigma^2 / n$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

The standard error of the mean. Let’s translate the formula into English, just to be sure you understand it. Here is the formula again: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$. In English: The standard error of the sample mean equals the ordinary standard deviation divided by the square root of the sample size. Another way to say that: The average unsquared distance of the means of random samples from mu equals the average unsquared distance of individual scores from the population mean divided by the square root of the sample size.
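For readers who want to see where the division by n comes from, here is a short derivation sketch. It is not in the original slides. It assumes the n scores in a sample, written here as $X_1, \dots, X_n$ (notation introduced only for this sketch), are drawn independently with replacement, each with variance sigma squared.

```latex
\begin{aligned}
\sigma^2_{\bar{X}}
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\,\sigma^2}{n^2}
   = \frac{\sigma^2}{n}, \\
\sigma_{\bar{X}}
  &= \sqrt{\frac{\sigma^2}{n}}
   = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

The second equality is where independence is used: the variance of a sum of independent scores is the sum of their variances.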

The standard error of the mean is the standard deviation of the sample means around mu. We could compute the average unsquared distance of sample means from mu by (1) subtracting mu from each sample mean, (2) squaring the differences, (3) getting a sum of squares, (4) dividing by the number of sample means, and (5) taking the square root. We would need to do that for all possible samples of a particular size from a population. That’s a lot of calculations. (A real lot.)

Example: Start with a tiny population, N=5. The scores in this population form a perfectly rectangular distribution. Mu = 5.00. Sigma = 2.83. We are going to list all the possible samples of size 2 (n=2). First see the population, then the list of samples.

If we did compute a standard deviation of sample means from mu, it should give the same result as the formula. Let’s see if it does. We can only do all the computations if we have a very small population and an even tinier sample. Let’s use an example with N=5 and n, the size of each sample = 2.

Computing sigma: $SS = (1-5)^2 + (3-5)^2 + (5-5)^2 + (7-5)^2 + (9-5)^2 = 40$; $\sigma^2 = SS/N = 40/5 = 8.00$; $\sigma = \sqrt{8.00} = 2.83$

The standard error = the standard deviation divided by the square root of n, the sample size. In the example you just saw, sigma = 2.83. Divide that by the square root of n (1.414) and you get the standard error of the mean (2.00). The formula works. And it works every time.
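The check described above can be sketched in a few lines. This is an illustrative sketch rather than the slides' own computation. It assumes the population is the five scores 1, 3, 5, 7, 9 that appear in the sum of squares, and that "all possible samples" means all ordered samples drawn with replacement, which gives 25 samples of size 2. The variable names are illustrative.

```python
from itertools import product
from math import sqrt

population = [1, 3, 5, 7, 9]   # assumed from the SS computation in the transcript
n = 2                          # size of each sample

mu = sum(population) / len(population)
sigma = sqrt(sum((x - mu) ** 2 for x in population) / len(population))

# Every ordered sample of size n drawn with replacement, reduced to its mean
sample_means = [sum(sample) / n for sample in product(population, repeat=n)]

# Steps 1-5: subtract mu, square, sum, divide by the number of means, take the root
se_direct = sqrt(sum((m - mu) ** 2 for m in sample_means) / len(sample_means))

# Shortcut formula: sigma divided by the square root of n
se_formula = sigma / sqrt(n)

print(len(sample_means), round(sigma, 2), round(se_direct, 2), round(se_formula, 2))
```

Both routes give a standard error of 2.00 for this population, matching the value on the slide.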

Let’s see what sigmaX-bar can tell us. We know that the mean of SAT/GRE scores = 500 and sigma = 100. So 68.26% of individuals will score between 400 and 600, and 95.44% will score between 300 and 700. But if we take random samples of SAT scores, with 4 people in each sample, the standard error of the mean is sigma divided by the square root of the sample size = 100/2 = 50. 68.26% of the sample means (n=4) will be within 1.00 standard error of the mean from mu, and 95.44% will be within 2.00 standard errors of the mean from mu. So, 68.26% of the sample means (n=4) will be between 450 and 550, and 95.44% will fall between 400 and 600. NOTE THAT SAMPLE MEANS FALL CLOSER TO MU, ON THE AVERAGE, THAN DO INDIVIDUAL SCORES.

Let’s make the samples larger. If we take random samples of SAT scores, with 400 people in each sample, the standard error of the mean is sigma divided by the square root of 400 = 100/20 = 5.00. 68.26% of the sample means will be within 1.00 standard error of the mean from mu, and 95.44% will be within 2.00 standard errors of the mean from mu. So, 68.26% of the sample means (n=400) will be between 495 and 505, and 95.44% will fall between 490 and 510. If we take random samples of SAT scores, with 2500 people in each sample, the standard error of the mean is sigma divided by the square root of 2500 = 100/50 = 2.00. 68.26% of the sample means (n=2500) will be between 498 and 502, and 95.44% will fall between 496 and 504.
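The SAT arithmetic for the three sample sizes can be reproduced with a small helper. This is a sketch for illustration only. The function names `standard_error` and `band` are made up here and do not come from the slides.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

def band(mu, sigma, n, k):
    """Interval reaching k standard errors below and above mu."""
    se = standard_error(sigma, n)
    return (mu - k * se, mu + k * se)

MU, SIGMA = 500, 100   # SAT/GRE values used in the transcript

for n in (4, 400, 2500):
    print(
        f"n={n}: SE={standard_error(SIGMA, n):.2f}, "
        f"1 SE band={band(MU, SIGMA, n, 1)}, 2 SE band={band(MU, SIGMA, n, 2)}"
    )
```

Multiplying the sample size by 100 (from 4 to 400) divides the standard error by only 10, because n enters through its square root.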

What happens as n increases? The sample means get closer to each other and to mu. Their average squared distance from mu equals the variance divided by the size of the sample. Therefore, their average unsquared distance from mu equals the standard deviation divided by the square root of the size of the sample. The sample means fall into a more and more perfect normal curve. These facts are called “The Central Limit Theorem” and can be proven mathematically.
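A small simulation can make the shrinking spread visible. The sketch below is not from the slides. It reuses the tiny population 1, 3, 5, 7, 9 as an assumed example, draws many random samples with replacement, and compares the observed standard deviation of the sample means with sigma divided by the square root of n. It checks only the spread, not the shape of the curve. The seed and trial count are arbitrary choices.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

population = [1, 3, 5, 7, 9]      # assumed illustrative population
mu = fmean(population)
sigma = pstdev(population)

rng = random.Random(0)            # fixed seed so the run is repeatable

def simulated_standard_error(n, trials=10_000):
    """Standard deviation around mu of the means of `trials` random samples of size n."""
    means = [fmean(rng.choices(population, k=n)) for _ in range(trials)]
    return pstdev(means, mu)

for n in (2, 8, 32):
    observed = simulated_standard_error(n)
    predicted = sigma / sqrt(n)
    print(f"n={n}: simulated SE={observed:.3f}, sigma/sqrt(n)={predicted:.3f}")
```

The simulated values should land close to the formula's prediction, and both shrink as n grows.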

CONFIDENCE INTERVALS

We want to define two intervals around mu: One interval into which 95% of the sample means will fall. Another interval into which 99% of the sample means will fall.

95% of sample means will fall in a symmetrical interval around mu that goes from 1.960 standard errors below mu to 1.960 standard errors above mu. A way to write that fact in statistical language is: CI.95: $\mu \pm 1.960\,\sigma_{\bar{X}}$ or CI.95: $\mu - 1.960\,\sigma_{\bar{X}} < \bar{X} < \mu + 1.960\,\sigma_{\bar{X}}$
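The multiplier 1.960 is the point on the standard normal curve that leaves 2.5% in each tail. As a hedged illustration, not part of the slides, it can be recovered with Python's standard library. The helper name `two_sided_z` is made up for this sketch.

```python
from statistics import NormalDist

def two_sided_z(coverage):
    """z value such that `coverage` of a normal curve lies within +/- z of its mean."""
    tail = (1 - coverage) / 2
    return NormalDist().inv_cdf(1 - tail)

print(round(two_sided_z(0.95), 3))   # 1.96
print(round(two_sided_z(0.99), 3))   # 2.576
```

The same call with a coverage of 0.99 gives the multiplier for the interval that holds 99% of the sample means.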

As I said, 95% of sample means will fall in a symmetrical interval around mu that goes from 1.960 standard errors below mu to 1.960 standard errors above mu. Take samples of SAT/GRE scores (n=400). The standard error of the mean is sigma divided by the square root of n = $100/\sqrt{400}$ = 100/20.00 = 5.00. 1.960 standard errors of the mean with such samples = 1.960 (5.00) = 9.80. So 95% of the sample means can be expected to fall in the interval 500 ± 9.80: 500 - 9.80 = 490.20 and 500 + 9.80 = 509.80. CI.95: $\mu \pm 1.960\,\sigma_{\bar{X}}$ = 500 ± 9.80 or CI.95: 490.20 < X-bar < 509.80

99% of sample means will fall within 2.576 standard errors from mu. Take the same samples of SAT/GRE scores (n=400). The standard error of the mean is sigma divided by the square root of n = 100/20.00 = 5.00. 2.576 standard errors of the mean with such samples = 2.576 (5.00) = 12.88. So 99% of the sample means can be expected to fall in the interval 500 ± 12.88: 500 - 12.88 = 487.12 and 500 + 12.88 = 512.88. CI.99: $\mu \pm 2.576\,\sigma_{\bar{X}}$ = 500 ± 12.88 or CI.99: 487.12 < the sample mean < 512.88

Let’s do another one. What are the 95% and 99% confidence intervals for samples of 25 randomly selected IQ scores? First compute the standard error for samples of 25 IQ scores. IQ: mu = 100, sigma = 15. The standard error for samples of size 25 is 15 divided by the square root of 25 = 15/5.00 = 3.00.

IQ scores – CI.95 and CI.99 for samples n=25. The standard error of the mean is sigma divided by the square root of n = 15/5.00 = 3.00. 1.960 standard errors of the mean with such samples = 1.960 (3.00) = 5.88 points. So, 95% of the sample means (n=25) can be expected to fall in the interval 100 ± 5.88: 100 - 5.88 = 94.12 and 100 + 5.88 = 105.88. CI.95: $\mu \pm 1.960\,\sigma_{\bar{X}}$ = 100 ± 5.88 or CI.95: 94.12 < X-bar < 105.88. 99% of the sample means (n=25) can be expected to fall in the interval 100 ± (2.576)(3.00) = 100 ± 7.73. CI.99: 100 ± 7.73 or CI.99: 92.27 < X-bar < 107.73
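The SAT and IQ intervals worked out above follow one recipe, which can be sketched as a small function. This is an illustrative sketch, not from the slides. The name `sample_mean_interval` and the constants `Z_95` and `Z_99` are made up here. The z multipliers are the ones quoted in the slides.

```python
from math import sqrt

Z_95 = 1.960   # multiplier for the interval holding 95% of sample means
Z_99 = 2.576   # multiplier for the interval holding 99% of sample means

def sample_mean_interval(mu, sigma, n, z):
    """Interval around mu into which the stated share of sample means should fall."""
    margin = z * sigma / sqrt(n)          # z standard errors
    return (round(mu - margin, 2), round(mu + margin, 2))

print("SAT n=400, 95%:", sample_mean_interval(500, 100, 400, Z_95))
print("SAT n=400, 99%:", sample_mean_interval(500, 100, 400, Z_99))
print("IQ  n=25,  95%:", sample_mean_interval(100, 15, 25, Z_95))
print("IQ  n=25,  99%:", sample_mean_interval(100, 15, 25, Z_99))
```

Results are rounded to two decimals to match the precision used on the slides.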

Here is another example. This time we start with an even smaller population (N=4) and take all possible samples of size 3. There are 64 of them. Let’s see that again the means form an approximately normal curve around mu and the standard error equals sigma divided by the square root of the sample size (3).

Standard error of the mean - 2. The standard deviation of the individual scores was 3.35. Sample size was 3. 3.35 divided by the square root of 3 = 1.94. Computing the standard error directly from the sample means shows the standard error = 1.94.