HOW MUCH SLEEP DID YOU GET LAST NIGHT?
1. Less than 6
2. 6
3. 7
4. 8
5. 9
6. More than 9
UPCOMING IN CLASS
Homework #11 due Monday
Start working on the final step of your data project
Exam #2 is Thursday, April 26th.
POSSIBLE TESTS
One-proportion z-test
Two-proportion z-test
One-sample t-test for mean
Two-sample t-test for differences of means
EXAMPLES OF ONE-PROPORTION TESTS
Everyone (100%) believes in ghosts
More than 10% of the population believes in ghosts
Less than 2% of the population has been to jail
90% of the population wears contacts
EXAMPLES OF TWO-PROPORTION TESTS
Women believe in ghosts more than men
Blacks believe in ghosts more than whites
People who have been to jail believe in ghosts more than people who haven’t been to jail
Women smoke more than men
Women use Facebook in the bathroom more than men
EXAMPLES OF ONE-SAMPLE T-TESTS
All Priuses have fuel economy greater than 50 mpg
Ford Focuses get 5 mpg on average
The average starting salary for ISU graduates is greater than $100,000
The average cholesterol level for a person with diabetes is 240.
EXAMPLES OF TWO-SAMPLE T-TESTS
The MPG for the Prius is greater than the MPG for the Ford Focus
Male ISU graduates have a greater starting salary than female ISU graduates
The cholesterol levels are the same for people with and without diabetes.
ETHANOL PLANT AND VOC LEVELS
Near Peoria, IL, a struggling ethanol plant was recently fined for toxic wastewater pollution.
In the Midwest, the EPA reviews the VOC levels for 25 ethanol plants. The mean VOC for these plants is 300 tons per year, with a SD of 250.
Most of the plants have bypassed a stringent EPA permitting process because they claimed to have levels of VOC emission lower than 100 tons a year.
MORE CAUTIONS ABOUT INTERPRETING CONFIDENCE INTERVALS
Remember that interpretation of your confidence interval is key.
For the ethanol data (n = 25, mean 300, SD 250, t* = 2.064 with 24 df), the 95% interval is 300 ± 2.064 × 250/√25 = 300 ± 103.2, or 196.8 to 403.2 tons per year.
What NOT to say:
“95% of all the ethanol plants pollute between 196.8 and 403.2 tons per year.” The confidence interval is about the mean, not the individual values.
“We are 95% confident that a randomly selected ethanol plant will pollute between 196.8 and 403.2 tons per year.” Again, the confidence interval is about the mean, not the individual values.
MORE CAUTIONS ABOUT INTERPRETING CONFIDENCE INTERVALS (CONT.)
What NOT to say:
“The mean pollution level is 300 tons per year 95% of the time.” The true mean does not vary; it’s the confidence interval that would be different had we gotten a different sample.
“95% of all samples will have pollution levels between 196.8 and 403.2 tons per year.” The interval we calculate does not set a standard for every other interval; it is no more (or less) likely to be correct than any other interval.
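A minimal sketch of the interval arithmetic for the ethanol data (n = 25, mean 300, SD 250), shown at two confidence levels so the effect of the critical value is visible. The helper name `t_interval` is made up for illustration, and the two t* values are assumed to be read from a t-table for 24 degrees of freedom rather than computed.

```python
import math

def t_interval(mean, sd, n, t_star):
    """One-sample t-interval for a mean: mean +/- t* x s/sqrt(n)."""
    se = sd / math.sqrt(n)      # standard error of the sample mean
    margin = t_star * se        # margin of error
    return mean - margin, mean + margin

# Ethanol plant summary statistics from the slide.
N_PLANTS, MEAN_VOC, SD_VOC = 25, 300.0, 250.0

# Assumed t-table lookups for df = 24 (two-sided).
T_STAR_95 = 2.064
T_STAR_90 = 1.711

ci_95 = t_interval(MEAN_VOC, SD_VOC, N_PLANTS, T_STAR_95)
ci_90 = t_interval(MEAN_VOC, SD_VOC, N_PLANTS, T_STAR_90)

print(f"95% CI for the mean: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print(f"90% CI for the mean: ({ci_90[0]:.2f}, {ci_90[1]:.2f})")
```

Either interval is a statement about the mean VOC level of all such plants, not about any single plant.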
FROM PROPORTIONS TO MEANS
Now that we know how to create confidence intervals and test hypotheses about proportions, it’d be nice to be able to do the same for means.
Just as we did before, we will base both our confidence interval and our hypothesis test on the sampling distribution model.
NOTATION FOR THE THEORY
The Central Limit Theorem told us that the sampling distribution model for means is Normal with mean μ and standard deviation $SD(\bar{y}) = \sigma/\sqrt{n}$.
All we need is a random sample of quantitative data.
And the true population standard deviation, σ. Well, that’s a problem…
APPLYING THE THEORY TO OUR SAMPLE
Proportions have a link between the proportion value and the standard deviation of the sample proportion.
This is not the case with means: knowing the sample mean tells us nothing about $SD(\bar{y})$.
We’ll do the best we can: estimate the population parameter σ with the sample statistic s.
Our resulting standard error is $SE(\bar{y}) = s/\sqrt{n}$.
CAN’T USE NORMAL MODEL, MUST USE T-DISTRIBUTION
We now have extra variation in our standard error from s, the sample standard deviation.
And the shape of the sampling model changes: the model is no longer Normal.
So, what is the sampling model?
THE T-DISTRIBUTION AND OUR SAMPLE
A practical sampling distribution model for means
When the conditions are met, the standardized sample mean $t = (\bar{y} - \mu)/SE(\bar{y})$ follows a Student’s t-model with n – 1 degrees of freedom.
We estimate the standard error with $SE(\bar{y}) = s/\sqrt{n}$.
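One way to see why the Normal model fails is a small simulation. The sketch below is an illustration under assumed settings (samples of size 5 from a standard Normal population, 20,000 repetitions, a fixed seed). It standardizes each sample mean with s instead of σ and counts how often the result lands beyond ±1.96, which a Normal model would put at about 5%.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def t_statistic(sample, mu):
    """Standardize the sample mean using s (not sigma) in the standard error."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu) / se

SAMPLE_SIZE = 5      # small n, so df = 4
TRIALS = 20000

beyond = 0
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    if abs(t_statistic(sample, 0.0)) > 1.96:
        beyond += 1

share_beyond = beyond / TRIALS
print(f"Share of |t| > 1.96 with n = {SAMPLE_SIZE}: {share_beyond:.3f}")
print("A Normal model would predict about 0.050")
```

The share comes out well above 5%, which is the extra variation from estimating σ with s.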
THE SAMPLE STANDARD DEVIATION
The standard deviation, s, is just the square root of the variance, $s = \sqrt{\sum (y - \bar{y})^2 / (n - 1)}$, and is measured in the same units as the original data.
T-DISTRIBUTION
Student’s t-models are unimodal, symmetric, and bell shaped, just like the Normal.
But t-models with only a few degrees of freedom have much fatter tails than the Normal.
T-DISTRIBUTION
As the degrees of freedom increase, the t-models look more and more like the Normal.
In fact, the t-model with infinite degrees of freedom is exactly Normal.
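The fatter tails can be checked numerically. This sketch compares the height of the t density with the height of the standard Normal density at one tail point (x = 3, an arbitrary choice for illustration) for several degrees of freedom. The function names are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard Normal model."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

TAIL_POINT = 3.0
DEGREES_OF_FREEDOM = (2, 5, 30, 1000)

# Ratio > 1 means the t-model puts more density in the tail than the Normal.
ratios = [t_pdf(TAIL_POINT, df) / normal_pdf(TAIL_POINT)
          for df in DEGREES_OF_FREEDOM]

for df, ratio in zip(DEGREES_OF_FREEDOM, ratios):
    print(f"df = {df:4d}: t density / Normal density at x = 3 is {ratio:.2f}")
```

The ratio shrinks toward 1 as the degrees of freedom grow, which is the convergence to the Normal described above.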
GOSSET’S T
William S. Gosset, an employee of the Guinness Brewery in Dublin, Ireland, worked long and hard to find out what the sampling model was.
The sampling model that Gosset found has been known as Student’s t.
The Student’s t-models form a whole family of related distributions that depend on a parameter known as degrees of freedom.
We often denote degrees of freedom as df, and the model as $t_{df}$.
FINDING T-VALUES BY HAND
The Student’s t-model is different for each value of degrees of freedom.
Because of this, Statistics books usually have one table of t-model critical values for selected confidence levels.
USING THE T-TABLES, FIND THE FOLLOWING CRITICAL T-VALUES
What is the critical value of t for a 90% confidence interval with df = 18?
What is the critical value of t for a 99% confidence interval with df = 78?
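Table lookups like these can be cross-checked in software. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names (`t_pdf`, `t_cdf`, `t_critical`) are illustrative, and the numerical settings are assumptions chosen for simplicity, not a reference implementation.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """t* whose central area under the t-model equals `confidence`."""
    target = 0.5 + confidence / 2       # cumulative area to the left of t*
    lo, hi = 0.0, 100.0
    for _ in range(60):                 # bisection on the increasing CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_90_df18 = t_critical(0.90, 18)
t_99_df78 = t_critical(0.99, 78)
print(f"t* for 90% confidence, df = 18: {t_90_df18:.3f}")
print(f"t* for 99% confidence, df = 78: {t_99_df78:.3f}")
```

Printed tables often skip df = 78, in which case the usual by-hand practice is to read the nearest smaller df listed.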
FIND P-VALUES
Online program: http://www.tutor-homework.com/statistics_tables/statistics_tables.html
TI-83/TI-84: http://www.keymath.com/documents/sia2/CalculatorNotes_Ch09_SIA2.pdf
USING YOUR SOFTWARE, FIND THE FOLLOWING P-VALUES
What is the p-value for t ≥ 2.61 with 4 degrees of freedom?
What is the p-value for |t| > 1.81 with 21 degrees of freedom?
What is the p-value for |t| < 1.53 with 21 degrees of freedom?
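For readers without the online program or a TI calculator, the three tail areas can be sketched with the Python standard library alone. The helpers `t_pdf` and `t_cdf` are illustrative names, and the Simpson's-rule integration is an assumed numerical shortcut rather than what any particular package does.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# One-sided upper tail: P(t >= 2.61) with 4 df.
p_upper = 1 - t_cdf(2.61, 4)

# Two-sided tails: P(|t| > 1.81) with 21 df.
p_two_sided = 2 * (1 - t_cdf(1.81, 21))

# Central area: P(|t| < 1.53) with 21 df.
p_central = t_cdf(1.53, 21) - t_cdf(-1.53, 21)

print(f"P(t >= 2.61), df = 4:    {p_upper:.4f}")
print(f"P(|t| > 1.81), df = 21:  {p_two_sided:.4f}")
print(f"P(|t| < 1.53), df = 21:  {p_central:.4f}")
```

The three cases differ only in which region of the t-model is counted: one tail, both tails, or the middle.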
A CONFIDENCE INTERVAL FOR MEANS? (CONT.)
One-sample t-interval for the mean
When the conditions are met, we are ready to find the confidence interval for the population mean, μ.
The confidence interval is $\bar{y} \pm t^*_{n-1} \times SE(\bar{y})$, where the standard error of the mean is $SE(\bar{y}) = s/\sqrt{n}$.
The critical value $t^*_{n-1}$ depends on the particular confidence level, C, that you specify and on the number of degrees of freedom, n – 1, which we get from the sample size.
HW 11 – PROBLEM 6
A nutrition lab retests the sodium content of hot dogs. This time they use a sample of 75 ‘reduced sodium’ frankfurters instead of 40.
The NEW sample produces a mean of 319 mg with a SD of 31.
The OLD sample produced a mean of 310 mg with a SD of 36.
SHOULD THE LARGER SAMPLE PRODUCE A MORE ACCURATE PREDICTION OF REDUCED SODIUM?
1. More accurate
2. Less accurate
WHAT IS THE SE FOR THE NEW SAMPLE?
1. 31/sqrt(75)
2. 36/sqrt(75)
3. 31/sqrt(40)
4. 36/sqrt(40)
FIND THE 95% CI FOR THE NEW SAMPLE
1. 319 ± 1.960 × 3.58
2. 319 ± 1.992 × 3.58
3. 319 ± 1.665 × 3.58
4. 319 ± 2.021 × 3.58
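The pieces of this problem can be worked as a short sketch. It computes the standard error for both the old and new samples, then a 95% interval for the new one. The critical value 1.992 is assumed to be the t-table value for df = 74 (two-sided 95%); the variable names are illustrative.

```python
import math

def standard_error(sd, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return sd / math.sqrt(n)

# Summary statistics from the homework problem.
se_old = standard_error(36.0, 40)    # old sample: n = 40, SD = 36
se_new = standard_error(31.0, 75)    # new sample: n = 75, SD = 31

# Assumed table lookup: t* for df = 75 - 1 = 74, two-sided 95%.
T_STAR_DF74 = 1.992

new_mean = 319.0
margin = T_STAR_DF74 * se_new
lower, upper = new_mean - margin, new_mean + margin

print(f"SE, old sample (n = 40): {se_old:.2f} mg")
print(f"SE, new sample (n = 75): {se_new:.2f} mg")
print(f"95% CI, new sample: ({lower:.2f}, {upper:.2f}) mg")
```

The larger sample has the smaller standard error, so its interval is narrower at the same confidence level.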
INTERPRET
1. 95% of all “reduced sodium” hot dogs will have a sodium content that falls within the interval
2. We are 95% confident the interval contains the true mean of sodium content in this type of “reduced sodium” hot dog
3. The interval contains the true mean sodium content in this type of “reduced sodium” hot dog 95% of the time
4. 95% of the sodium content in this type of “reduced sodium” hot dog will be contained in the interval.
FOOD LABELING REGULATIONS REQUIRE THAT FOOD IDENTIFIED AS “REDUCED SODIUM” MUST HAVE AT LEAST 30% LESS SODIUM THAN ITS REGULAR COUNTERPART.
Let’s say we find that the regular hot dog has 465 mg of sodium.
SHOULD THIS HOT DOG BE LABELED “REDUCED” BASED ON THE SAMPLE?
1. Yes, because the 95% CI is less than the maximum allowable sodium for a ‘reduced sodium’ frank
2. No, because the 95% CI extends above the maximum allowable sodium for a ‘reduced sodium’ frank
3. No, because the 95% CI is less than the maximum allowable sodium for a ‘reduced sodium’ frank
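The labeling question comes down to comparing the interval with a threshold. As a hedged sketch: "at least 30% less" than 465 mg gives a maximum allowable mean, and the 95% interval for the new sample (again assuming t* = 1.992 from a table for df = 74) either sits entirely below that maximum or does not.

```python
import math

REGULAR_SODIUM_MG = 465.0
REQUIRED_REDUCTION = 0.30

# Maximum allowable sodium for a "reduced sodium" frank: 70% of regular.
max_allowed = (1 - REQUIRED_REDUCTION) * REGULAR_SODIUM_MG

# New sample from the homework problem; t* is an assumed table lookup (df = 74).
n, mean, sd = 75, 319.0, 31.0
T_STAR_DF74 = 1.992
margin = T_STAR_DF74 * sd / math.sqrt(n)
lower, upper = mean - margin, mean + margin

# The label is supported only if every plausible mean is under the limit.
entirely_below = upper < max_allowed

print(f"Maximum allowable sodium: {max_allowed:.1f} mg")
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f}) mg")
print(f"Interval entirely below the maximum: {entirely_below}")
```

The comparison is made against the upper end of the interval because the claim must hold for every plausible value of the true mean.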
A TEST FOR THE MEAN
One-sample t-test for the mean
The assumptions and conditions for the one-sample t-test for the mean are the same as for the one-sample t-interval.
We test the hypothesis $H_0: \mu = \mu_0$ using the statistic $t_{n-1} = (\bar{y} - \mu_0)/SE(\bar{y})$.
The standard error of the sample mean is $SE(\bar{y}) = s/\sqrt{n}$.
When the conditions are met and the null hypothesis is true, this statistic follows a Student’s t-model with n – 1 df. We use that model to obtain a P-value.
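As an illustration of the test statistic, the sketch below applies it to the ethanol data (n = 25, mean 300, SD 250). The choice of null value, 100 tons per year with a one-sided alternative that the mean is higher, is an assumption made here to echo the plants' claim; it is not a test set up in the slides. The critical value 1.711 is an assumed table lookup for df = 24 at a one-sided 0.05 level.

```python
import math

def t_statistic(sample_mean, mu_0, sd, n):
    """One-sample t statistic: (ybar - mu_0) / (s / sqrt(n))."""
    se = sd / math.sqrt(n)
    return (sample_mean - mu_0) / se

# Ethanol plant summary statistics from the slides.
n, sample_mean, sd = 25, 300.0, 250.0

# Hypothetical setup: H0: mu = 100 versus HA: mu > 100.
MU_0 = 100.0
T_CRIT_ONE_SIDED_05 = 1.711   # assumed table value, df = 24

t_stat = t_statistic(sample_mean, MU_0, sd, n)
reject_h0 = t_stat > T_CRIT_ONE_SIDED_05

print(f"t = {t_stat:.2f} with {n - 1} df")
print(f"Reject H0 at the 0.05 level (one-sided): {reject_h0}")
```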
INTERVALS AND TESTS (CONT.)
More precisely, a level C confidence interval contains all of the possible null hypothesis values that would not be rejected by a two-sided hypothesis test at alpha level 1 – C.
So a 95% confidence interval matches a 0.05 level test for these data.
Confidence intervals are naturally two-sided, so they match exactly with two-sided hypothesis tests.
When the hypothesis is one-sided, the corresponding alpha level is (1 – C)/2.
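The match between intervals and two-sided tests can be checked directly. This sketch reuses the ethanol data (n = 25, mean 300, SD 250) with an assumed table value t* = 2.064 for df = 24, and sweeps a range of candidate null values chosen arbitrarily for illustration. For each one it asks whether the two-sided 0.05 test rejects and whether the value falls outside the 95% interval.

```python
import math

# Ethanol plant summary statistics from the slides.
n, mean, sd = 25, 300.0, 250.0
se = sd / math.sqrt(n)

T_STAR_95 = 2.064                      # assumed table value, df = 24
lower = mean - T_STAR_95 * se
upper = mean + T_STAR_95 * se

def rejects(mu_0):
    """Two-sided test at alpha = 0.05: reject when |t| exceeds t*."""
    return abs((mean - mu_0) / se) > T_STAR_95

def outside_interval(mu_0):
    """True when mu_0 is not covered by the 95% confidence interval."""
    return mu_0 < lower or mu_0 > upper

candidates = range(150, 460, 10)       # hypothetical null values to try
agree = all(rejects(mu_0) == outside_interval(mu_0) for mu_0 in candidates)

print(f"95% CI: ({lower:.1f}, {upper:.1f})")
print(f"Test and interval agree for every candidate: {agree}")
```

The agreement is algebraic: |ȳ − μ₀| > t* × SE is the same condition as μ₀ lying outside ȳ ± t* × SE.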
SAMPLE SIZE
To find the sample size needed for a particular confidence level with a particular margin of error (ME), solve this equation for n: $ME = t^*_{n-1} \times s/\sqrt{n}$
The problem with using the equation above is that we don’t know most of the values. We can overcome this:
We can use s from a small pilot study.
We can use z* in place of the necessary t value.
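Solving for n with z* in place of t* gives n = (z* × s / ME)², rounded up. The sketch below is a rough illustration of that calculation: it treats the ethanol SD of 250 as if it were a pilot-study estimate and uses a hypothetical target margin of error of 50 tons per year. Both choices are assumptions for this example only.

```python
import math
from statistics import NormalDist

def required_sample_size(confidence, pilot_sd, margin_of_error):
    """Smallest n with z* x s / sqrt(n) <= ME, using z* in place of t*."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z_star * pilot_sd / margin_of_error) ** 2)

PILOT_SD = 250.0          # assumption: ethanol SD treated as a pilot estimate
TARGET_ME = 50.0          # hypothetical margin of error, tons per year

n_needed = required_sample_size(0.95, PILOT_SD, TARGET_ME)
print(f"Sample size for ME = {TARGET_ME:.0f} at 95% confidence: {n_needed}")
```

Because z* is slightly smaller than the matching t*, the result is a starting point; rechecking with t* for n – 1 df may push n up a little.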
WOULD A 99% CI BE WIDER OR NARROWER THAN A 98% CI?
1. Wider
2. Narrower
3. Would remain the same
WHAT ARE THE (DIS)ADVANTAGES OF THE 98% CI?
1. The 98% CI has less chance of containing the true mean than the 99% CI, but the 99% CI is more precise (narrower) than the 98% CI.
2. The 99% CI has less chance of containing the true mean than the 98% CI, but the 98% CI is more precise (narrower) than the 99% CI.
3. The 98% CI has less chance of containing the true mean than the 99% CI, but the 98% CI is more precise (narrower) than the 99% CI.
SUPPOSE WE DECREASE OUR SAMPLE SIZE FROM N = 55 TO N = 25. HOW WOULD WE EXPECT OUR 98% CI TO CHANGE?
1. Wider
2. Narrower
3. The same
HOW LARGE A SAMPLE WOULD YOU NEED TO ESTIMATE THE MEAN BODY TEMPERATURE TO WITHIN 0.1 DEGREES WITH 99% CONFIDENCE?
1. 0-100
2. 101-200
3. 201-300
4. 301-400
5. 400+