Chapter 9: Sampling Distributions “It has been proved beyond a shadow of a doubt that smoking is one of the leading causes of statistics.” Fletcher Knebel
9.1 Sampling Distributions (pp. 456-469) Samples are examined in order to reach a reasonable conclusion about the population from which the sample is chosen. To glean meaningful information, one must be statistically literate, which means being aware of what the sample results do and do not tell us. A statistic calculated from a sample may suffer from bias or high variability, in which case it is not a good estimate of the population parameter.
Vocabulary Review. Parameter: a number that describes a population. Statistic: a number computed from a sample. Sampling distribution of a statistic: the distribution of values of the statistic taken from all possible samples of a specific size. Unbiased: a statistic is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
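Rules to Review: Variance formula for a POPULATION: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$. Variance formula for a SAMPLE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$.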
The chart shows that: The mean of the distribution of sample means is the mean (μ) of the population. This illustrates that a sample mean is an unbiased estimator of the population mean: the distribution of sample means “centers” around the mean of the population. The mean of the distribution of sample variances (s²) is equal to the variance (σ²) of the population. This illustrates that a sample variance (s²) is an unbiased estimator of the population variance: the distribution of sample variances “centers” around the variance of the population.
Take Note: A sample standard deviation is NOT an unbiased estimator of the population standard deviation. In the above example, the mean of the sample standard deviations is 0.628539, while the standard deviation of the population is 0.81649658. The distribution of sample standard deviations does not center around the standard deviation of the population.
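The claims about unbiasedness can be checked by brute force. The sketch below is illustrative only: the chart's population is not reproduced in this transcript, so it assumes a population of three equally spaced values, {1, 2, 3}, and all ordered samples of size 2 drawn with replacement. Under that assumption it reproduces the two figures quoted in the note.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

# Assumed population (not given in the transcript): three equally spaced values.
population = [1.0, 2.0, 3.0]

# All 3 x 3 = 9 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Average each statistic over every possible sample.
mean_of_sample_means = mean(mean(s) for s in samples)
mean_of_sample_variances = mean(variance(s) for s in samples)  # divisor n - 1
mean_of_sample_sds = mean(stdev(s) for s in samples)           # divisor n - 1

print("mean of sample means    :", mean_of_sample_means, "vs mu      =", mean(population))
print("mean of sample variances:", mean_of_sample_variances, "vs sigma^2 =", pvariance(population))
print("mean of sample std devs :", mean_of_sample_sds, "vs sigma   =", pstdev(population))
```

The first two pairs agree, while the average sample standard deviation falls below σ, which is the bias the note describes.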
9.2 Sample Proportions (pp. 472-477) The normal distribution curve is often extremely useful in analyzing sample proportions. This section provides insights into the circumstances that allow for use of normal distribution properties.
Consider an SRS of 1000 people from a large population. X represents the number in this sample who are Republicans. There are 1001 possible values for X: 0, 1, …, 1000. P-hat represents the proportion of Republicans in the sample. There are 1001 possible values of p-hat: 0/1000, 1/1000, …, 1000/1000. We could choose many SRSs and calculate a p-hat for each. We would expect the distribution of p-hat to be approximately normal.
If we choose an SRS of size n from a large population with population proportion p having some characteristic of interest, and if p-hat is the proportion of the sample having that characteristic, then: The sampling distribution of p-hat is approximately normal. The mean of the sampling distribution is p (the population parameter). The standard deviation of the sampling distribution is $\sqrt{p(1-p)/n}$.
It is reasonable to use the previous statements when: Rule of Thumb #1: The population is at least 10 times as large as the sample. Rule of Thumb #2: np is at least 10 and n(1-p) is at least 10.
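As a minimal sketch, the standard deviation of p-hat, √(p(1−p)/n), and the two rules of thumb can be written as small helper functions. The function names and the numbers in the usage lines (a population of 50,000, n = 500, p = 0.3) are hypothetical and chosen only for illustration.

```python
from math import sqrt


def p_hat_sd(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return sqrt(p * (1 - p) / n)


def rules_of_thumb(population_size, n, p):
    """Return (rule_1_ok, rule_2_ok) for the two conditions on the slide."""
    rule_1_ok = population_size >= 10 * n            # population at least 10x the sample
    rule_2_ok = n * p >= 10 and n * (1 - p) >= 10    # enough expected successes and failures
    return rule_1_ok, rule_2_ok


# Hypothetical illustration values, not taken from the slides.
sd_example = p_hat_sd(0.3, 500)
rules_example = rules_of_thumb(50_000, 500, 0.3)
print(sd_example, rules_example)
```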
Suppose it is known that 60% of the registered voters in a district of over 20,000 people are Republicans. IF YOU CHOOSE AN SRS OF 1000 REGISTERED VOTERS: What is the probability that the proportion of Republicans in the sample is between 58% and 62%? What is the probability that the sample will contain more than 550 Republicans? Are both rules of thumb satisfied?
Convert p-hat = 0.55 to its z-score: z = (0.55 − 0.60)/√(0.60 × 0.40/1000) ≈ −3.23. Interpret: is this a rare occurrence? The probability of a sample proportion this low is about 0.000628, which is approximately 1/1592. So if we had 1600 random samples of size 1000, how many of them would we “expect” to have 550 or fewer Republicans?
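The voter example can be worked through with the normal approximation. This is a sketch using Python's standard-library `statistics.NormalDist` (Python 3.8+), with p = 0.60 and n = 1000 taken from the problem statement; the variable names are illustrative. The tail probability it computes may differ slightly in the last digits from the value quoted on the slide, depending on how z is rounded.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.60, 1000
sd = sqrt(p * (1 - p) / n)            # standard deviation of p-hat

# Normal approximation to the sampling distribution of p-hat.
p_hat = NormalDist(mu=p, sigma=sd)

# P(0.58 < p-hat < 0.62)
prob_between = p_hat.cdf(0.62) - p_hat.cdf(0.58)

# z-score for 550 Republicans out of 1000, i.e. p-hat = 0.55
z_550 = (0.55 - p) / sd

# P(p-hat <= 0.55) and its complement, P(more than 550 Republicans)
prob_at_most_550 = p_hat.cdf(0.55)
prob_more_than_550 = 1 - prob_at_most_550

# Expected number of samples, out of 1600, with 550 or fewer Republicans
expected_in_1600 = 1600 * prob_at_most_550

print(f"P(0.58 < p-hat < 0.62) = {prob_between:.4f}")
print(f"z for p-hat = 0.55     = {z_550:.4f}")
print(f"P(p-hat <= 0.55)       = {prob_at_most_550:.6f}")
print(f"P(p-hat > 0.55)        = {prob_more_than_550:.6f}")
print(f"expected out of 1600   = {expected_in_1600:.2f}")
```

Both rules of thumb hold here: the population of over 20,000 is at least 10 × 1000, and np = 600 and n(1−p) = 400 are both at least 10.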
9.3 Sample Means (pp. 481-494) This section contains one of the most important of all statistical theorems, the Central Limit Theorem. It also emphasizes that, by convention, the Greek letters μ and σ denote the population mean and standard deviation, while x-bar and s denote the mean and standard deviation of a sample.
The Central Limit Theorem. Consider an SRS of size n from any population with mean μ and standard deviation σ. When n is large, the sampling distribution of x-bar has the following properties: It is approximately normal. The mean of the distribution is μ. The standard deviation of the distribution is $\sigma/\sqrt{n}$.
Consider the population of three values. Now consider all possible samples of size 2 drawn WITH REPLACEMENT. There would be 3 × 3 = 9 such samples.
The mean of the sample means is equal to 4, which is equal to μ. This illustrates the second part of the Central Limit Theorem. The standard deviation of the sample means is equal to 1.154700538. This illustrates the third part of the Central Limit Theorem.
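The enumeration can be reproduced in a few lines. The population values are not included in this transcript, so the sketch below assumes {2, 4, 6}, one three-value population with mean 4 that yields the figures quoted above. It lists all 3 × 3 = 9 ordered samples of size 2 (each value may be drawn twice) and compares the spread of the sample means with σ/√n.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

# Assumed population (not given in the transcript); chosen to match the quoted results.
population = [2.0, 4.0, 6.0]
n = 2

# All 9 ordered samples of size 2, allowing the same value to be drawn twice.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mean_of_means = mean(sample_means)   # compare with mu
sd_of_means = pstdev(sample_means)   # compare with sigma / sqrt(n)

print("mean of sample means:", mean_of_means, "vs mu =", mean(population))
print("sd of sample means  :", sd_of_means, "vs sigma/sqrt(n) =", pstdev(population) / sqrt(n))
```

The sample means are treated as a complete population of 9 values, so `pstdev` (divisor N) is the appropriate function rather than `stdev`.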