Introduction to Inference Sampling Distributions.

1 Introduction to Inference Sampling Distributions

2 Inference with a Single Observation
Each observation $X_i$ in a random sample is a representative of the unobserved variables in the population. How different would this observation be if we took a different random sample?
[Diagram labels: Population (Parameter: $\mu$), Sampling, Observation $X_i$, Inference, ?]

3 Inference with Sample Mean
The sample mean $\bar{x}$ is our estimate of the population mean $\mu$. How much would the sample mean change if we took a different sample? The key to this question is the sampling distribution of $\bar{x}$.
[Diagram labels: Population (Parameter: $\mu$), Sampling, Sample (Statistic: $\bar{x}$), Inference, Estimation, ?]

4 Sampling Distribution of a Sample Statistic
Sampling distribution of a sample statistic: the distribution of values for a sample statistic obtained from repeated samples, all of the same size and all drawn from the same population.
Example: Consider the set {1, 2, 3, 4}.
1) Make a list of all samples of size 2 that can be drawn from this set (sample with replacement).
2) Construct the sampling distribution of the sample mean for samples of size 2.
3) Construct the sampling distribution of the minimum for samples of size 2.

5 Table of All Possible Samples
This table lists all possible samples of size 2, the mean for each sample, and the probability of each sample occurring (all equally likely). Number of possible samples (with replacement) = $N^n = 4^2 = 16$.

6 Sampling Distribution
Summarize the information in the previous table to obtain the sampling distributions of the sample mean and the sample minimum.
[Table: Sampling Distribution of the Sample Mean. Histogram: Sampling Distribution of the Sample Mean.]
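The table and histogram from slides 5 and 6 did not survive in this transcript, so the following is a minimal sketch that rebuilds them by brute force. It assumes ordered samples drawn with replacement, as the example states; the helper name `sampling_distribution` is ours, not the presenter's.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2

# All ordered samples of size n drawn with replacement: N**n = 4**2 = 16.
samples = list(product(population, repeat=n))

def sampling_distribution(statistic):
    """Map each value of the statistic to its probability over all samples."""
    counts = Counter(statistic(s) for s in samples)
    # Every sample is equally likely, so probability = count / number of samples.
    return {value: Fraction(count, len(samples))
            for value, count in sorted(counts.items())}

mean_dist = sampling_distribution(lambda s: Fraction(sum(s), len(s)))
min_dist = sampling_distribution(min)

for value, prob in mean_dist.items():
    print(f"mean = {float(value):.1f}  P = {prob}")
for value, prob in min_dist.items():
    print(f"min  = {value}    P = {prob}")
```

The sample mean is most likely to equal 2.5 (probability 1/4), while the minimum is most likely to equal 1 (probability 7/16), so the two statistics have visibly different sampling distributions even though they come from the same 16 samples.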

7 Sampling Distribution of Sample Mean
The distribution of values taken by the statistic in all possible samples of size n from the same population. Model assumption: our observations $x_i$ are sampled from a population with mean $\mu$ and variance $\sigma^2$.
[Diagram: a population with unknown parameter $\mu$ yields Sample 1 through Sample 8, each of size n and each with its own $\bar{x}$. What is the distribution of these values?]

8 Mean of Sample Mean
First, we examine the center of the sampling distribution of the sample mean. The center of the sampling distribution of the sample mean is the unknown population mean: mean($\bar{X}$) = $\mu$. Over repeated samples, the sample mean will, on average, be equal to the population mean, but there are no guarantees for any one sample!

9 Variance of Sample Mean
Next, we examine the spread of the sampling distribution of the sample mean. The variance of the sampling distribution of the sample mean is variance($\bar{X}$) = $\sigma^2/n$. As the sample size increases, the variance of the sample mean decreases! Averaging over many observations is more accurate than just looking at one or two observations.
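The slide states the variance formula without deriving it. A short derivation, assuming the observations $X_1, \ldots, X_n$ are independent and share the variance $\sigma^2$ (independence is our added assumption here; it is what lets the variance of the sum split into a sum of variances):

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\,\sigma^{2}}{n^{2}}
  = \frac{\sigma^{2}}{n}
```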

10 Comparing the sampling distribution of the sample mean when n = 1 (parent population) vs. n = 10

11 Law of Large Numbers
Remember the Law of Large Numbers: if one draws independent observations from a population with mean $\mu$, then as the number of observations increases, the sample mean $\bar{x}$ gets closer and closer to the population mean $\mu$. This is easier to see now, since we know that mean($\bar{x}$) = $\mu$ and variance($\bar{x}$) = $\sigma^2/n \to 0$ as n gets large.

12 Example
Population: seasonal home-run totals for 7032 baseball players from 1901 to 1996. Take different samples from this population and compare the sample mean we get each time. In real life, we can't do this because we don't usually have the entire population!

Sample size                      Mean   Variance
100 samples of size n = 1        3.69   46.8
100 samples of size n = 10       4.43   (missing)
100 samples of size n = 100      4.42   0.43
100 samples of size n = 1000     4.42   0.06
Population parameter: $\mu$ = 4.42
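The home-run data set itself is not part of this transcript, so the experiment can only be imitated. The sketch below assumes a hypothetical right-skewed population of 7032 values (generated from an exponential distribution, not the real totals) and samples with replacement; the function name `summarize_sample_means` is illustrative.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical right-skewed stand-in for the home-run population;
# these are NOT the real 1901-1996 totals.
population = [int(random.expovariate(1 / 4.4)) for _ in range(7032)]
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)

def summarize_sample_means(n, repeats=100):
    """Draw `repeats` samples of size n with replacement and return the
    mean and the variance of the resulting sample means."""
    means = [statistics.fmean(random.choices(population, k=n))
             for _ in range(repeats)]
    return statistics.fmean(means), statistics.variance(means)

print(f"population: mean = {mu:.2f}, variance = {sigma2:.2f}")
for n in (1, 10, 100, 1000):
    avg, var = summarize_sample_means(n)
    print(f"n = {n:4d}: mean of sample means = {avg:.2f}, "
          f"variance = {var:.3f}, sigma^2/n = {sigma2 / n:.3f}")
```

With only 100 repetitions per row the simulated variances are noisy, but they should track $\sigma^2/n$ and shrink by roughly a factor of 10 from one row to the next, which is the pattern the slide's table shows.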

13 Distribution of Sample Mean
We now know the center and spread of the sampling distribution for the sample mean. What about the shape of the distribution? If our data $x_1, x_2, \ldots, x_n$ follow a Normal distribution, then the sample mean $\bar{x}$ will also follow a Normal distribution!

14 Example
Mortality in US cities (deaths/100,000 people). This variable seems to approximately follow a Normal distribution, so the sample mean will also approximately follow a Normal distribution, irrespective of the sample size drawn.

15 Central Limit Theorem
What if the original data doesn't follow a Normal distribution?
[Histogram: HR/season for a sample of baseball players.]
If the sample is large enough, it doesn't matter!

16 Central Limit Theorem
If the sample size is large enough (a common rule of thumb is n ≥ 30), then the sample mean $\bar{x}$ has an approximately Normal distribution. This is true no matter what the shape of the distribution of the original data, provided the population variance is finite.
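One way to see the theorem at work is to check a known property of the Normal distribution: about 68% of values fall within one standard deviation of the mean. The sketch below assumes an Exponential(1) population (strongly skewed, with mean 1 and standard deviation 1), which is our choice for illustration and not an example from the slides.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU = SIGMA = 1.0  # an Exponential(1) population has mean 1 and sd 1

def share_within_one_se(n, repeats=20_000):
    """Fraction of simulated sample means within one standard error of MU."""
    se = SIGMA / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        xbar = statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        hits += abs(xbar - MU) < se
    return hits / repeats

# Benchmark: P(-1 < Z < 1) for a standard Normal variable.
normal_share = statistics.NormalDist().cdf(1) - statistics.NormalDist().cdf(-1)
print(f"Normal benchmark: {normal_share:.3f}")
for n in (1, 30):
    print(f"n = {n:2d}: {share_within_one_se(n):.3f}")
```

For n = 1 the share is well above the Normal benchmark because the parent population is skewed; for n = 30 it should land close to the benchmark, which is what "approximately Normal" means in practice.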

17 Example: Home Runs per Season
Take many different samples from the seasonal HR totals for a population of 7032 players and calculate the sample mean for each sample.
[Histograms of the sample means for n = 1, n = 10, and n = 100.]

18 Important Definition & Theorem
Sampling Distribution of Sample Means: If all possible random samples, each of size n, are taken from any population with a mean $\mu$ and a standard deviation $\sigma$, the sampling distribution of sample means will:
1. have a mean $\mu_{\bar{x}}$ equal to $\mu$
2. have a standard deviation $\sigma_{\bar{x}}$ equal to $\sigma/\sqrt{n}$
Further, if the sampled population has a normal distribution, then the sampling distribution of $\bar{x}$ will also be normal for samples of all sizes.
Central Limit Theorem: The sampling distribution of sample means will more closely resemble the normal distribution as the sample size increases.

19 Summary
The mean of the sampling distribution of $\bar{x}$ is equal to the mean of the original population: $\mu_{\bar{x}} = \mu$.
The standard deviation of the sampling distribution of $\bar{x}$ (also called the standard error of the mean) is equal to the standard deviation of the original population divided by the square root of the sample size: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
Notes:
–The distribution of $\bar{x}$ becomes more compact as n increases. (Why?)
–The variance of $\bar{x}$: $\sigma_{\bar{x}}^2 = \sigma^2/n$.
The distribution of $\bar{x}$ is (exactly) normal when the original population is normal.
The CLT says: the distribution of $\bar{x}$ is approximately normal regardless of the shape of the original distribution, when the sample size is large enough!

20 Standard Error of the Mean
Standard error of the mean: the standard deviation of the sampling distribution of sample means, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
Notes:
–The n in the formula for the standard error of the mean is the size of the sample.
–The proof of the Central Limit Theorem is beyond the scope of this course.
–The following example illustrates the results of the Central Limit Theorem.

21 Graphical Illustration of the Central Limit Theorem
[Figure panels: Original Population; Distribution of $\bar{x}$: n = 2; Distribution of $\bar{x}$: n = 10; Distribution of $\bar{x}$: n = 30.]

22 7.3 ~ Applications of the Central Limit Theorem
When the sampling distribution of the sample mean is (exactly) normally distributed, or approximately normally distributed (by the CLT), we can answer probability questions using the standard normal distribution, via the standard score $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$.

23 Example 2
Example: Consider a normal population with $\mu = 50$ and $\sigma = 15$. Suppose a sample of size 9 is selected at random. Find:
1) $P(45 \le \bar{x} \le 60)$
2) $P(\bar{x} \le 47.5)$
Solutions: Since the original population is normal, the distribution of the sample mean is also (exactly) normal, with $\mu_{\bar{x}} = \mu = 50$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 15/\sqrt{9} = 15/3 = 5$.

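24 Example 2
With $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$:
$P(45 \le \bar{x} \le 60) = P\left(\frac{45 - 50}{5} \le z \le \frac{60 - 50}{5}\right) = P(-1.00 \le z \le 2.00) = 0.3413 + 0.4772 = 0.8185$
[Figure: normal curve for $\bar{x}$ centered at 50 with 45 and 60 marked, corresponding to z = -1.00 and z = 2.00.]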

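25 Example 2
With $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$:
$P(\bar{x} \le 47.5) = P\left(\frac{\bar{x} - 50}{5} \le \frac{47.5 - 50}{5}\right) = P(z \le -0.50) = 0.5000 - 0.1915 = 0.3085$
[Figure: normal curve for $\bar{x}$ centered at 50 with 47.5 marked, corresponding to z = -0.50.]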
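The two probabilities in Example 2 can be cross-checked without a z table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are ours.

```python
import math
from statistics import NormalDist

mu, sigma, n = 50, 15, 9          # values from Example 2
se = sigma / math.sqrt(n)         # standard error of the mean: 15 / 3 = 5
xbar_dist = NormalDist(mu, se)    # sampling distribution of the sample mean

p_between = xbar_dist.cdf(60) - xbar_dist.cdf(45)  # P(45 <= xbar <= 60)
p_below = xbar_dist.cdf(47.5)                      # P(xbar <= 47.5)

print(f"P(45 <= xbar <= 60) = {p_between:.4f}")
print(f"P(xbar <= 47.5)     = {p_below:.4f}")
```

The second result matches the table value 0.3085. The first comes out as 0.8186 rather than 0.8185, because the slide adds two table entries that were each already rounded to four decimal places.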

