Presentation on theme: "Estimating a Population Mean"— Presentation transcript:
1 SECTION 10.2: Estimating a Population Mean
2 What’s the difference between what we did in Section 10.1 and what we are beginning in Section 10.2? In reality, the standard deviation σ of the population is unknown, so the procedures from the last section cannot be used as they stand. However, the logic behind those procedures will continue to be of use. To be more realistic, σ is estimated from the collected data using the sample standard deviation s.
3 Conditions for Inference about a Population Mean
Random--The data are an SRS from the population or come from a randomized experiment.
Normal--Observations from the population have a normal distribution with an unknown mean (μ) and unknown standard deviation (σ), or the sample is large enough to ensure that the sampling distribution of the sample mean is approximately normal.
Independent--Independence is assumed for the individual observations when calculating a confidence interval. When we are sampling without replacement from a finite population, it is sufficient to verify that the population is at least 10 times the sample size.
4 CAUTION: Be sure to check that the conditions for constructing a confidence interval for the population mean are satisfied before you perform any calculations.
5 ROBUSTNESS
ROBUST: A procedure is robust if its confidence levels do not change very much when certain assumptions are violated. Fortunately for us, the t-procedures are robust in certain situations. Therefore . . .
6 This is when we use the t-procedures:
It’s more important for the data to be an SRS from the population than for the population to have a normal distribution.
If n is less than 15, the data must appear close to normal to use t-procedures.
If n is at least 15, the t-procedures can be used except if there are outliers or strong skewness.
If n≥30, t-procedures can be used even in the presence of strong skewness, but outliers must still be examined.
Essentially, as long as there are no significant departures from Normality (especially outliers), the t-procedures still work quite well.
7 Standard Error
In this setting, the sample mean x̄ has a sampling distribution that is normal with a mean equal to the population’s mean μ. Since we do not know σ, we will replace the standard deviation formula σ/√n with this formula: s/√n. This is called the standard error of the sample mean.
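A minimal sketch of the standard error calculation, s/√n, using only Python's standard library. The function name `standard_error` and the sample values are hypothetical and are not taken from the slides.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)

# Hypothetical data, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error(sample), 4))
```

Note that `statistics.stdev` divides by n - 1, which is the s that the t-procedures call for; `statistics.pstdev` would divide by n instead.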
8 Degrees of Freedom
Commonly listed as df. Equal to n-1. When a t-distribution has k degrees of freedom, we will write this as t(k). When the actual df does not appear in Table C, use the greatest df available that is less than your desired df. This guarantees a wider confidence interval than needed to justify a given confidence level.
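The table-lookup rule above can be sketched as follows. The list of df rows in `TABLE_DF` is an assumption about a typical t table (1 through 30, then 40, 50, 60, 80, 100, 1000); check it against the rows actually printed in your Table C. The function name `conservative_df` is hypothetical.

```python
from bisect import bisect_right

# Assumed df rows of a typical t table; your Table C may differ.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]

def conservative_df(df, table=TABLE_DF):
    """Return the greatest tabled df that does not exceed the desired df."""
    if df < table[0]:
        raise ValueError("df must be at least the smallest tabled value")
    # bisect_right counts how many tabled values are <= df.
    return table[bisect_right(table, df) - 1]

print(conservative_df(45))  # desired df = 45 is not a row, so drop down to 40
```

Dropping down to a smaller df gives a larger t*, which is why the resulting interval is never too narrow.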
9 Density Curves for t Distributions
Bell-shaped and symmetric. Greater spread than a normal curve. As the degrees of freedom (or sample size) increase, the t density curves appear more like a normal curve.
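To see these properties numerically, here is a small sketch that evaluates the t density and the standard normal density at the center and in the tail. It uses the standard formula for the t density; the function names are hypothetical.

```python
import math

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_density(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Lower peak and heavier tails for small df; both approach the normal curve.
for df in (2, 9, 30, 100):
    print(df, round(t_density(0, df), 4), round(t_density(3, df), 4))
print("normal", round(normal_density(0), 4), round(normal_density(3), 4))
```

The height at 0 rises toward the normal value and the height at 3 falls toward it as df grows, which is the "greater spread" shrinking away.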
10 Confidence Intervals
x̄ ± t* · s/√n
t* is the upper (1-C)/2 critical value for the t(n-1) distribution. We find t* using the table or our calculator: t* = invT(area to left of t*, df). We interpret these intervals the same way we did in the last chapter. This interval is exactly correct when the population distribution is Normal and is approximately correct for large n in other cases.
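For readers without a calculator at hand, the following is a rough sketch of what invT does and how the interval x̄ ± t* · s/√n is assembled. It is not the calculator's algorithm: `inv_t` here is a hypothetical stand-in that integrates the t density with Simpson's rule and then searches by bisection, and `t_interval` is likewise an illustrative name. The search range of -100 to 100 is an assumption that is adequate for ordinary confidence levels but not for extreme ones with very small df.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 by symmetry plus Simpson's rule on the span from 0 to x."""
    if x == 0:
        return 0.5
    h = x / steps  # negative when x < 0, so the integral is subtracted
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def inv_t(area_left, df):
    """Stand-in for invT(area to left, df), found by bisection on t_cdf."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(x_bar, s, n, conf_level):
    """One-sample t interval: x_bar +/- t* * s / sqrt(n), with df = n - 1."""
    t_star = inv_t(1 - (1 - conf_level) / 2, n - 1)
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# For C = 0.90 the area to the left of t* is 0.95; with df = 9 this is about 1.833.
print(round(inv_t(0.95, 9), 3))
```

The area passed to `inv_t` is 1 - (1-C)/2 because the upper tail beyond t* must have area (1-C)/2.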
11 INFERENCE TOOLBOX (p 631)
DO YOU REMEMBER WHAT THE STEPS ARE???
Steps for constructing a CONFIDENCE INTERVAL:
1--PARAMETER--Identify the population of interest and the parameter you want to draw a conclusion about.
2--CONDITIONS--Choose the appropriate inference procedure. VERIFY conditions (Random, Normal, Independent) before using it.
3--CALCULATIONS--If the conditions are met, carry out the inference procedure.
4--INTERPRETATION--Interpret your results in the context of the problem. CONCLUSION, CONNECTION, CONTEXT (meaning that our conclusion about the parameter connects to our work in part 3 and includes appropriate context).
12 Example: GOT MILK?
A milk processor monitors the number of bacteria per milliliter in raw milk received for processing. A random sample of 10 one-milliliter specimens from milk supplied by one producer gives the following data: 5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870. Construct a 90% confidence interval.
--We want to estimate μ = the mean number of bacteria per milliliter in all of the milk from this supplier.
--Since we don’t know σ, we should construct a one-sample t interval for μ.
Random: We must be confident that the data are an SRS from the producer’s milk. We must learn how the sample was chosen to see if it can be regarded as an SRS (we are only told that it is a “random sample”).
Normal: A boxplot and a Normal probability plot of the data show no outliers and no strong skewness. This gives us little reason to doubt the Normality of the population from which this sample was drawn. In practice, we would probably rely on the fact that past measurements of this type have been roughly Normal.
Independent: Since these measurements came from a random sample of specimens, they should be independent (assuming that there were many, at least 100, one-milliliter specimens available at the milk processing facility).
13 Example: GOT MILK? Cont.
--Entering these data into a calculator gives x̄ = 4950 and s = 268.45, with df = 10-1 = 9. So a 90% confidence interval for the mean bacteria count per milliliter in this producer’s milk is x̄ ± t* · s/√n = 4950 ± 1.833(268.45/√10) = 4950 ± 155.6 = (4794.4, 5105.6).
--We can say that we are 90% confident that the actual mean number of bacteria per milliliter of milk from this supplier is between 4794.4 and 5105.6, because we used a method that yields intervals such that 90% of all these intervals will capture the true mean.
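The calculator work for this example can be checked with a short script. This is only a sketch: the critical value t* = 1.833 is typed in from a t table (df = 9, 90% confidence) rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Bacteria counts per milliliter from the example.
counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]

n = len(counts)
x_bar = statistics.mean(counts)
s = statistics.stdev(counts)  # sample standard deviation
t_star = 1.833                # table value for df = 9 and 90% confidence

margin = t_star * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar:.0f}, s = {s:.2f}, df = {n - 1}")
print(f"90% CI: {lower:.1f} to {upper:.1f}")
```

As the TECHNOLOGY slide advises, the script prints the key numbers (mean, standard deviation, degrees of freedom) alongside the interval itself.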
14 Paired t Procedures
Recall, matched pairs studies are a form of block design in which just two treatments are being compared. Also, experiments are rarely done on randomly selected subjects. Random selection allows us to generalize results to a larger population, but random assignment of treatments to subjects allows us to compare treatments. Be careful to distinguish a matched pairs setting from a two-sample setting. The real key is independence: the two measurements within a pair are not independent of each other, whereas two separate samples are. TREAT THE DIFFERENCES from a matched pairs study as a single sample.
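A minimal sketch of "treat the differences as a single sample". The before and after scores below are made up for illustration and do not come from the slides; t* = 2.365 is the table value for df = 7 and 95% confidence.

```python
import math
import statistics

# Hypothetical matched pairs: the same 8 subjects measured twice.
before = [12, 15, 11, 18, 14, 16, 13, 17]
after = [14, 15, 14, 21, 15, 19, 13, 20]

# Step 1: reduce each pair to one number, the difference.
diffs = [a - b for a, b in zip(after, before)]

# Step 2: run the one-sample t interval on the differences.
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)
t_star = 2.365  # table value for df = n - 1 = 7 and 95% confidence

margin = t_star * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(f"mean difference = {d_bar}, df = {n - 1}")
print(f"95% CI for the mean difference: {lower:.2f} to {upper:.2f}")
```

The degrees of freedom come from the number of pairs, not the total number of measurements, because the differences are the sample.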
15 TECHNOLOGY
As always, you will be allowed unrestricted use of your calculator on quizzes and tests (as well as the actual AP Exam). For this reason, ALWAYS be certain to write down the values of key numbers that are being used (means, standard deviations, degrees of freedom, significance levels, etc.) along with the results of the calculator procedures in order to receive full credit. The calculator information is available in your book. We are now using the T Interval instead of the Z Interval. Plug in exactly what you are asked for.