SECTION 10.2 Estimating a Population Mean
What's the difference between what we did in Section 10.1 and what we are beginning in Section 10.2? In reality, the standard deviation σ of the population is unknown, so the procedures from the last section cannot be used as they stand. However, understanding the logic of those procedures will continue to be of use. To be more realistic, we estimate σ from the collected data using the sample standard deviation s.
Conditions for Inference about a Population Mean 1. Random--The data are an SRS from the population or come from a randomized experiment. 2. Normal--Observations from the population have a normal distribution with an unknown mean (μ) and unknown standard deviation (σ), or the sample is large enough to ensure that the sampling distribution of the sample mean is approximately normal. 3. Independent--Individual observations are assumed to be independent when calculating a confidence interval. When we are sampling without replacement from a finite population, it is sufficient to verify that the population is at least 10 times the sample size.
CAUTION Be sure to check that the conditions for constructing a confidence interval for the population mean are satisfied before you perform any calculations.
ROBUSTNESS ROBUST: A procedure is robust if its confidence level does not change very much when certain assumptions are violated. Fortunately for us, the t-procedures are robust in certain situations. Therefore...
This is when we use the t-procedures: It's more important for the data to be an SRS from a population than for the population to have a normal distribution. If n is less than 15, the data must look close to normal to use t-procedures. If n is at least 15, the t-procedures can be used except if there are outliers or strong skewness. If n ≥ 30, t-procedures can be used even in the presence of strong skewness, but outliers must still be examined. Essentially, as long as there are no significant departures from Normality (especially outliers), the t procedures still work quite well.
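The sample-size guidelines on this slide can be sketched as a small decision function. This is only an illustration: the function name, its parameters, and the returned phrases are hypothetical, and the large-sample cutoff is an assumed default of 30 that can be changed.

```python
def t_procedure_advice(n, looks_normal, strong_skew, has_outliers, large_n=30):
    """Return a short recommendation based on the sample-size guidelines.

    large_n is an assumed cutoff for a "large" sample.
    """
    if n < 15:
        # Small samples: the data themselves must look close to normal.
        if looks_normal and not has_outliers:
            return "use t procedures"
        return "do not use t procedures"
    if n < large_n:
        # Moderate samples: fine unless outliers or strong skewness appear.
        if has_outliers or strong_skew:
            return "do not use t procedures"
        return "use t procedures"
    # Large samples: strong skewness is tolerated, outliers still need a look.
    if has_outliers:
        return "examine outliers first"
    return "use t procedures"


print(t_procedure_advice(10, looks_normal=True, strong_skew=False, has_outliers=False))
print(t_procedure_advice(40, looks_normal=False, strong_skew=True, has_outliers=False))
```

The judgments about skewness and outliers still come from looking at a boxplot or Normal probability plot; the function only encodes what to do once those judgments are made.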
Standard Error In this setting, the sample mean x̄ has a sampling distribution that is normal with a mean equal to the population's mean μ. Since we do not know σ, we will replace the standard deviation formula σ/√n with this formula: s/√n. This is called the standard error of the sample mean.
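As a quick illustration of the standard error, the sketch below computes s/√n with Python's standard library. The data values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, not from the slides

n = len(sample)
x_bar = statistics.mean(sample)      # sample mean
s = statistics.stdev(sample)         # sample standard deviation (divides by n - 1)
standard_error = s / math.sqrt(n)    # estimates sigma / sqrt(n)

print(f"n = {n}, mean = {x_bar}, s = {s:.4f}, SE = {standard_error:.4f}")
```

Note that `statistics.stdev` divides by n - 1, which matches the s used on these slides; `statistics.pstdev` would divide by n instead.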
Degrees of Freedom Commonly listed as df. Equal to n-1. When a t-distribution has k degrees of freedom, we will write this as t(k). When the actual df does not appear in Table C, use the greatest df available that is less than your desired df. –This guarantees a wider confidence interval than needed to justify a given confidence level.
Density Curves for t Distributions Bell-shaped and symmetric. Greater spread than a normal curve. As the degrees of freedom (or sample size) increase, the t density curves appear more like a normal curve.
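The table-lookup rule can be sketched as follows. The list of df rows is an assumption (it is typical of printed t tables, but check it against your own Table C), and the function name is hypothetical.

```python
# Assumed df rows of a printed t table; adjust to match your own Table C.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def conservative_df(df, table_df=TABLE_DF):
    """Return the greatest tabled df that does not exceed the desired df."""
    candidates = [d for d in table_df if d <= df]
    if not candidates:
        raise ValueError("df must be at least the smallest tabled value")
    return max(candidates)


print(conservative_df(9))   # df = 9 is in the table, so it is used as is
print(conservative_df(45))  # df = 45 is not tabled, so drop down to 40
```

Dropping down to a smaller df gives a larger t*, which is why the resulting interval is at least as wide as needed.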
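To see these properties numerically, the sketch below evaluates the t density and the standard normal density at the center and in the tail. It uses only the standard library; the function names are hypothetical.

```python
import math


def t_density(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def normal_density(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Compare the height at the center (x = 0) and in the tail (x = 3).
for df in (2, 9, 30, 1000):
    print(f"t({df}): center {t_density(0, df):.4f}, tail {t_density(3, df):.4f}")
print(f"normal: center {normal_density(0):.4f}, tail {normal_density(3):.4f}")
```

For small df the t curve is lower at the center and higher in the tails than the normal curve, which is the "greater spread" described above; the gap shrinks as df grows.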
Confidence Intervals x̄ ± t* · s/√n –t* is the upper (1-C)/2 critical value for the t(n-1) distribution. –We find t* using the table or our calculator: t*=invT(area to left of t*, df). –We interpret these the same way we did in the last chapter. –This interval is exactly correct when the population distribution is Normal and is approximately correct for large n in other cases.
INFERENCE TOOLBOX (p 631) Steps for constructing a CONFIDENCE INTERVAL: DO YOU REMEMBER WHAT THE STEPS ARE??? 1. PARAMETER: Identify the population of interest and the parameter you want to draw a conclusion about. 2. CONDITIONS: Choose the appropriate inference procedure. VERIFY conditions (Random, Normal, Independent) before using it. 3. CALCULATIONS: If the conditions are met, carry out the inference procedure. 4. INTERPRETATION: Interpret your results in the context of the problem. CONCLUSION, CONNECTION, CONTEXT (meaning that our conclusion about the parameter connects to our work in part 3 and includes appropriate context).
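For readers without the calculator, here is a rough standard-library sketch of what invT does and how the interval is assembled. It is an illustration under stated assumptions, not the calculator's actual algorithm: the CDF is approximated with Simpson's rule, t* is found by bisection, and all function names are hypothetical.

```python
import math


def t_density(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_density(0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_star(conf_level, df):
    """Critical value t* with area (1 - C)/2 to its right, found by bisection."""
    target = 1 - (1 - conf_level) / 2  # area to the left of t*
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:      # widen the bracket until it contains t*
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(x_bar, s, n, conf_level):
    """One-sample t interval: x_bar +/- t* * s / sqrt(n), with df = n - 1."""
    margin = t_star(conf_level, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


print(round(t_star(0.90, 9), 3))
print(t_interval(50, 10, 25, 0.95))  # hypothetical summary statistics
```

In practice a statistics package or the calculator's own invT is the better tool; the point here is only that t* depends on both the confidence level and df = n - 1.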
Example: GOT MILK? A milk processor monitors the number of bacteria per milliliter in raw milk received for processing. A random sample of 10 one-milliliter specimens from milk supplied by one producer gives the following data: 5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870. Construct a 90% confidence interval. --We want to estimate μ = the mean number of bacteria per milliliter in all of the milk from this supplier. --Since we don't know σ, we should construct a one-sample t interval for μ. –We must be confident that the data are an SRS from the producer's milk. We must learn how the sample was chosen to see if it can be regarded as an SRS (we are only told that it is a random sample). –A boxplot and a Normal probability plot of the data show no outliers and no strong skewness. This gives us little reason to doubt the Normality of the population from which this sample was drawn. In practice, we would probably rely on the fact that past measurements of this type have been roughly Normal. –Since these measurements came from a random sample of specimens, they should be independent (assuming that there were many, at least 100, one-milliliter specimens available at the milk processing facility).
Example: GOT MILK? Cont. --Entering these data into a calculator gives x̄ = 4950 and s = 268.45. With df = 10-1 = 9, the critical value for 90% confidence is t* = 1.833. So a 90% confidence interval for the mean bacteria count per milliliter in this producer's milk is 4950 ± 1.833 · 268.45/√10 = 4950 ± 155.6 = (4794.4, 5105.6). --We can say that we are 90% confident that the actual mean number of bacteria per milliliter of milk from this supplier is between 4794.4 and 5105.6 because we used a method that yields intervals such that 90% of all these intervals will capture the true mean.
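The calculation for the milk data can be checked with a few lines of Python. This sketch uses the ten bacteria counts given in the example and takes t* = 1.833 from a t table (90% confidence, df = 9) rather than computing it; the variable names are illustrative.

```python
import math
import statistics

# Bacteria counts per milliliter from the ten specimens in the example.
counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]

n = len(counts)
df = n - 1
x_bar = statistics.mean(counts)
s = statistics.stdev(counts)        # sample standard deviation
t_star = 1.833                      # table value for 90% confidence, df = 9

margin = t_star * s / math.sqrt(n)  # t* times the standard error
lower, upper = x_bar - margin, x_bar + margin

print(f"x_bar = {x_bar}, s = {s:.2f}, df = {df}")
print(f"90% CI: {x_bar} ± {margin:.1f} = ({lower:.1f}, {upper:.1f})")
```

Writing down x̄, s, df, and t* alongside the final interval is the same record of key numbers you would show when using the calculator's T Interval.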
Paired t Procedures Recall, matched pairs studies are a form of block design in which just two treatments are being compared. Also, experiments are rarely done on randomly selected subjects. Random selection allows us to generalize results to a larger population, but random assignment of treatments to subjects allows us to compare treatments. Be careful to distinguish a matched pairs setting from a two-sample setting. The real key is independence. TREAT THE DIFFERENCES from a matched pairs study as a single sample.
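A minimal sketch of "treat the differences as a single sample" follows. The before and after measurements are hypothetical, and t* = 2.776 is the table value for 95% confidence with df = 4; with a sample this small, the differences would also need to look roughly Normal.

```python
import math
import statistics

# Hypothetical matched pairs: the same five subjects measured twice.
before = [12, 15, 11, 14, 13]
after = [14, 18, 12, 17, 14]

# Reduce each pair to one number, then proceed as with any single sample.
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)
t_star = 2.776                        # table value for 95% confidence, df = 4

margin = t_star * s_d / math.sqrt(n)
print(f"differences = {differences}")
print(f"95% CI for the mean difference: {d_bar} ± {margin:.3f}")
```

The degrees of freedom come from the number of pairs (here 5 - 1 = 4), not from the total number of measurements.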
TECHNOLOGY As always, you will be allowed unrestricted use of your calculator on quizzes and tests (as well as the actual AP Exam). For this reason, ALWAYS be certain to write down the values of key numbers that are being used (means, standard deviations, degrees of freedom, significance levels, etc.) along with results of the calculator procedures in order to receive full credit. The calculator information is available in your book on pages. We are now using the T Interval instead of the Z Interval. Plug in exactly what you are asked for.