Terminology A statistic is a number calculated from a sample of data. For each different sample, the value of the statistic is a uniquely determined number. A parameter is a number calculated from the data in a population. Since a population does not change, the value of the parameter is the same, no matter how many times it is calculated.
Notation for Samples and Populations Statistics: x̄ = sample mean, s² = sample variance, s = sample standard deviation. Parameters: μ = population mean, σ² = population variance, σ = population standard deviation.
Thinking About the Sample Mean Suppose I take a sample of book costs for 100 NKU students. I then find the mean, and x̄ = $340. Suppose I then take a second sample of 100 NKU students, different from the first, and find their book costs. If I calculated the sample mean, how do you think it would compare to $340? Would it be the same? Would it be “similar”?
Thinking About the Sample Mean Now suppose I performed the experiment a large number of times, with each step involving: (1) sample 100 NKU students, (2) record their textbook costs, (3) calculate the mean for the sample of 100 students. Each sample produces a different x̄. What happens if I make a histogram of all the different x̄ values?
The Sampling Distribution of the Sample Mean When you take a sample, compute a statistic, repeat the process a large number of times, and then make a histogram of the statistics you observed, you are examining the sampling distribution of the statistic. Under special conditions, some of these distributions (histograms) will begin to resemble a normal distribution.
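The repeated-sampling experiment described above can be tried on a computer. The following is only a minimal sketch: the population of book costs is made up (a right-skewed set of 20,000 values averaging roughly $340), and the function name `simulate_sample_means`, the seeds, and the bin width are illustrative choices, not part of the original example.

```python
import random
import statistics

def simulate_sample_means(population, sample_size, repetitions, seed=0):
    """Draw `repetitions` random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, sample_size))
            for _ in range(repetitions)]

# Hypothetical population (an assumption for illustration only):
# 20,000 right-skewed "book costs" with a mean near $340.
pop_rng = random.Random(1)
population = [pop_rng.gammavariate(4.0, 85.0) for _ in range(20_000)]

# Repeat the "sample 100 students, compute the mean" step 2,000 times.
means = simulate_sample_means(population, sample_size=100, repetitions=2_000)

# Crude text histogram of the sample means, in $10-wide bins.
bins = {}
for m in means:
    left = int(m // 10) * 10
    bins[left] = bins.get(left, 0) + 1
for left in sorted(bins):
    print(f"{left:>4}-{left + 10:<4} {'#' * (bins[left] // 20)}")

print("population mean:     ", round(statistics.fmean(population), 2))
print("mean of sample means:", round(statistics.fmean(means), 2))
```

Even though the made-up population is skewed, the histogram of sample means should come out roughly bell-shaped and centered near the population mean.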
The Sampling Distribution of x̄ The properties of the sampling distribution of x̄ are: 1. The mean, denoted μ_x̄, equals the mean of the population from which the sample is drawn: μ_x̄ = μ. 2. The standard deviation of the sampling distribution, denoted σ_x̄, depends on how spread out the original population is, along with the number of points in the sample. This is also called the standard error of the mean.
The Sampling Distribution of x̄ 3. The sampling distribution of the sample mean is approximately normal for a large sample size n. If there are 30 or more data points in a sample (n ≥ 30), we will consider this large enough for the sampling distribution of x̄ to be approximated by the normal distribution. If the sample size is less than 30, we must know that the population from which the sample is taken is itself normally distributed.
The Sampling Distribution of x̄ The properties of the sampling distribution of x̄ are: 1. μ_x̄ = μ (always true). 2. σ_x̄ = σ/√n (always true; keep 6 decimal digits). Also called the standard error of the mean. 3. If n is at least 30, the sampling distribution of x̄ is approximately normal. Note: If the original population is normally distributed, then the sampling distribution of the mean is also normal, regardless of the sample size n.
The Sampling Distribution of x̄ When I ask you to describe any sampling distribution, I am looking for 3 things: 1. Where is it centered (the mean)? 2. How variable is it (the standard error)? Note: Keep 6 digits after the decimal point. 3. What is the shape? More specifically, can you claim it has an approximately normal shape?
The Central Limit Theorem As the sample size increases (n gets larger), the sampling distribution of x̄ will look more like a normal distribution, regardless of the shape of the original population. Notice that the actual theorem does not give a value for n; it only says the approximation improves as n increases. The “n = 30 or more” guideline is a rule of thumb which we will use for our class. In some cases, n must be much more than 30. In other cases, an n much smaller than 30 is enough.
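One way to watch the Central Limit Theorem at work is to measure how lopsided the sampling distribution is for several sample sizes. The sketch below is an illustration under assumptions of my own: the population is exponential (strongly right-skewed), and the helper names `skewness` and `skewness_of_sample_means` are hypothetical. A skewness near 0 indicates a symmetric shape, as a normal distribution has.

```python
import random
import statistics

def skewness(values):
    """Skewness of a list of numbers: about 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

def skewness_of_sample_means(n, repetitions=4_000, seed=0):
    """Simulate `repetitions` sample means of size n and return their skewness."""
    rng = random.Random(seed)
    means = [
        # Assumed population: exponential with mean 1 (heavily right-skewed).
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return skewness(means)

results = {n: skewness_of_sample_means(n) for n in (2, 10, 30, 100)}
for n, skew in results.items():
    print(f"n = {n:>3}: skewness of sample means = {skew:.2f}")
```

The skewness should shrink toward 0 as n grows, but it does not drop to 0 exactly at n = 30, which is why 30 is only a rule of thumb.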
Example The number of customers using the drive-thru at a local fast food restaurant during the lunch hour has a mean of 51 customers and a standard deviation of 4.5 customers. 1. Can you find the probability that on one randomly selected day there are more than 55 customers who use the restaurant’s drive-thru for lunch? 2. Fully describe the sampling distribution of the sample mean for samples of 36 lunch-hour counts of customers using this drive-thru.
Solution to Part 1 You may want to find P(X > 55) by changing 55 into a z-score and using Table 1. There is only one problem. Does the exercise say the distribution of customer demand is normal? Without knowing this information, we cannot assume it is. However, the sampling distribution of the mean might be normally distributed.
Solution to Part 2 We need to address the three points: 1. Where is it centered? μ_x̄ = μ = 51 customers. 2. Variability: σ_x̄ = σ/√n = 4.5/√36 = 0.75 customers. Note: This is exactly 0.75 or 0.750000. 3. Normal? Since n = 36 is at least 30, we conclude the sampling distribution of the sample mean is approximately normally distributed.
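The three-part description can also be written as a small function. This is a sketch only; the name `describe_sampling_distribution` and the `population_is_normal` flag are hypothetical, and the n ≥ 30 cutoff is the class rule of thumb rather than part of the theorem.

```python
import math

def describe_sampling_distribution(mu, sigma, n, population_is_normal=False):
    """Return (center, standard error, can we claim approximate normality?)."""
    center = mu                                      # mean of x-bar equals mu
    standard_error = sigma / math.sqrt(n)            # sigma / sqrt(n)
    approx_normal = population_is_normal or n >= 30  # class rule of thumb
    return center, round(standard_error, 6), approx_normal

# Drive-thru example: mu = 51, sigma = 4.5, samples of n = 36 lunch hours.
center, standard_error, approx_normal = describe_sampling_distribution(51, 4.5, 36)
print(center, f"{standard_error:.6f}", approx_normal)  # prints: 51 0.750000 True
```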
Finding a Probability Since the distribution is approximately normal and we know the mean and standard error, we can create z-scores using the formula below: z = (x̄ − μ_x̄)/σ_x̄. Then we can use Table 1 as we did in chapter 4 to find probabilities.
Example Continued What is the probability that a random sample of 36 one-hour lunch periods will have a mean volume of more than 52.8 customers using the drive-thru at this restaurant? Would this be unusual? We are interested in the probability that the value of the sample mean is more than 52.8 customers, that is, P(x̄ > 52.8). The z-score is z = (52.8 − 51)/0.75 = 2.40, so P(x̄ > 52.8) = P(z > 2.40) = 1 − 0.9918 = 0.0082.
Example Continued The probability that a random sample of 36 one-hour lunch periods will have a mean volume of more than 52.8 customers using the drive-thru at this restaurant is 0.0082. Since 0.0082 is less than 0.05, this would be considered unusual.
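The 0.0082 figure can be checked without Table 1. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the table; the variable names are illustrative, and the 0.05 cutoff for "unusual" is the one used in this example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 51, 4.5, 36          # values from the drive-thru example
standard_error = sigma / sqrt(n)    # 4.5 / 6 = 0.75

# z-score for a sample mean of 52.8 customers.
z = (52.8 - mu) / standard_error

# Upper-tail probability under the standard normal curve.
p = 1 - NormalDist().cdf(z)
unusual = p < 0.05

print(f"z = {z:.2f}, P(x-bar > 52.8) = {p:.4f}, unusual: {unusual}")
# prints: z = 2.40, P(x-bar > 52.8) = 0.0082, unusual: True
```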