Estimation: How Large is the Effect? Chapter 2

Chapter Overview So far, we can only say things like ◦ “We have strong evidence that the long-run probability Buzz pushes the correct button is larger than 0.5.” ◦ “We do not have strong evidence kids have a preference between candy and a toy when trick-or-treating.” We want a method that says ◦ “I believe 68 to 75% of all elections can be correctly predicted by the competent face method.”

Chapter Overview Estimation tells how large the effect is, through an interval of values. ◦ “We can be 95% confident that taking aspirin bi-daily reduces the rate of heart attacks by somewhere between 30% and 50%.” ◦ “If the election were held today, would you vote for Barack Obama or Mitt Romney?” 51% responded Obama (margin of error is ± 3 percentage points).

Confidence Intervals These interval estimates of a population parameter are called confidence intervals. We will find confidence intervals three ways. ◦ Through a series of tests of significance to see which proportions are plausible values for the parameter. ◦ Using the standard deviation of the simulated null distribution to help us determine the width of the interval. ◦ Through traditional theory-based methods.

Statistical Inference – Confidence Intervals Section 2.1

Can Dogs Sniff Out Cancer? Example 2.1 Marine sniffing samples

Can Dogs Sniff Out Cancer? Marine, a dog originally trained for water rescues, was tested to see if she could detect whether a patient had colorectal cancer by smelling a sample of their breath. She first smells a bag from a patient with colorectal cancer. Then she smells 5 other samples: 4 from normal patients and 1 from a person with colorectal cancer. She is trained to sit next to the bag that matches the scent of the initial bag (the “cancer scent”) by being rewarded with a tennis ball.

Can Dogs Sniff Out Cancer?

Our sample proportion lies more than 10 standard deviations above the mean, and hence our p-value is essentially 0.

Can Dogs Sniff Out Cancer?

Developing a range of plausible values If I get a small p-value (like I did with 0.70) I will conclude that the value under the null is not plausible. This is when we conclude the alternative hypothesis. If I get a large p-value (like I did with 0.80) I will conclude the value under the null is plausible. This is when I can’t conclude the alternative.

Developing a range of plausible values

Can Dogs Sniff Out Cancer? We should have found that values between 0.79 and 0.96 are plausible values for Marine’s probability of picking the correct specimen. We can do more tests and find a more precise interval to be 0.787 to 0.966.

Probability under null    0.785   0.786   0.787   0.788   …   0.965   0.966   0.967   0.968
p-value                   0.085   0.093   0.110   0.144   …   0.109   0.102   0.094   0.088
Plausible?                No      No      Yes     Yes     …   Yes     Yes     No      No
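The table lookup can be sketched in a few lines of Python. This is only an illustration: the p-values are the ones listed in the table (the elided middle rows are left out, and are assumed to be plausible), and the names `table` and `plausible_interval` are hypothetical, not from the slides.

```python
# Two-sided p-values from the table, keyed by the probability under the null.
# Rows elided in the table ("…") are left out here as well.
table = {
    0.785: 0.085, 0.786: 0.093, 0.787: 0.110, 0.788: 0.144,
    0.965: 0.109, 0.966: 0.102, 0.967: 0.094, 0.968: 0.088,
}

def plausible_interval(p_values, significance_level):
    """Return (low, high) over the null values whose p-value exceeds the level."""
    plausible = [null for null, p in p_values.items() if p > significance_level]
    return min(plausible), max(plausible)

print(plausible_interval(table, 0.10))
```

With a 10% significance level, the smallest and largest plausible values in the table are 0.787 and 0.966.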

Can Dogs Sniff Out Cancer? (0.787, 0.966) is called a confidence interval. Since we used 10% as our significance level, this is a 90% confidence interval (100% − 10% = 90%). 90% is the confidence level of the interval of plausible values.

Can Dogs Sniff Out Cancer? We would say we are 90% confident that Marine’s probability of correctly picking the bag with breath from the cancer patient from among 5 bags is between 0.787 and 0.966. This is a more precise statement than our initial significance test, which concluded Marine’s probability was more than 0.20.

Significance Level and Confidence Level Typically we use 0.05 for our significance level. There is nothing magical about 0.05. We could set up our test to make it harder to conclude the alternative (smaller significance level) or easier (larger significance level). If we increase the confidence level from 90% to 95%, what will happen to the width of the confidence interval?

Can Dogs Sniff Out Cancer? Since the confidence level gives an indication of how sure we are that we captured the actual value of the parameter in our interval, to be more sure our interval should be wider. How would we obtain a wider interval of plausible values to represent a 95% confidence level? ◦ Use a 5% significance level in the tests.

Can Dogs Sniff Out Cancer? Values that correspond to 2-sided p-values larger than 0.05 should now be in our interval. Using the table we developed, what is the 95% confidence interval for Marine’s long-run probability?

Exploration 2.1: Kissing Right As you work through this exploration …

Exploration 2.1 Your first test will be one-sided, but after that everything is a two-sided test. The sample proportion stays constant. The parameter under the null will change since we are testing to see if different parameters are plausible or not. For small p-values we can rule the parameter out. For large p-values, the parameter is plausible.

2SD and Theory-Based Methods for Confidence Intervals Section 2.2

Introduction In Section 2.1 we found confidence intervals by doing repeated tests of significance (changing the value in the null hypothesis) to find a range of values that were plausible for the population parameter (long run relative frequency or probability). This is a very tedious way to construct a confidence interval. Today, we will look at two other ways to construct confidence intervals. ◦ Two Standard Deviations (2SD). ◦ Theory-based.

Example 2.2: Halloween Treats (continued) In Example 1.5 we looked at a study where researchers investigated whether children show a preference for toys or candy. Test households in five Connecticut neighborhoods offered children two plates: ◦ One with candy ◦ One with small, inexpensive toys

Halloween Treats We tested the hypotheses: ◦ Null: The proportion of trick-or-treaters who choose candy is 0.5 (π = 0.5). ◦ Alternative: The proportion of trick-or-treaters who choose candy is not 0.5 (π ≠ 0.5). Our sample proportion was: ◦ 148 out of 283 (52.3%) chose candy

When we ran this test, we got a very large p-value so we could not conclude that π was not 0.5. This means 0.5 is a plausible value for π.

Halloween Treats What are some other plausible values for π? Through repeated tests we come up with:

Prob. under null            0.463   0.464   0.465   …   0.581   0.582   0.583
2-sided p-value             0.048   0.049   0.057   …   0.053   0.047   0.046
Plausible value (@0.05)?    No      No      Yes     …   Yes     No      No

Halloween Treats Thus we found a 95% confidence interval of 0.465 to 0.581. This means we are 95% confident that the probability a child will choose candy while trick-or-treating is between 0.465 and 0.581. Does it make sense that 0.5 is in the interval? Why?
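The repeated-tests approach can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the applet's method: it uses an exact binomial calculation rather than simulation, and it takes the two-sided p-value to be twice the smaller tail (one of several conventions). The function names are hypothetical. Because of these choices, the endpoints may differ from 0.465 and 0.581 in the third decimal place.

```python
from math import comb

def binom_pmf(k, n, pi):
    """Probability of exactly k successes in n trials with success probability pi."""
    return comb(n, k) * pi**k * (1 - pi)**(n - k)

def two_sided_p_value(successes, n, pi_null):
    """Exact binomial p-value: twice the smaller tail, capped at 1 (assumed convention)."""
    lower = sum(binom_pmf(k, n, pi_null) for k in range(0, successes + 1))
    upper = sum(binom_pmf(k, n, pi_null) for k in range(successes, n + 1))
    return min(1.0, 2 * min(lower, upper))

def plausible_values(successes, n, significance_level=0.05):
    """Test every null value on a 0.001 grid and keep those that are not rejected."""
    grid = [i / 1000 for i in range(1, 1000)]
    return [pi for pi in grid
            if two_sided_p_value(successes, n, pi) > significance_level]

values = plausible_values(148, 283)   # 148 of 283 children chose candy
print(min(values), max(values))       # endpoints of the 95% interval
```

The sample proportion stays fixed at 148/283 throughout; only the value under the null changes from test to test.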

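Halloween Treats Remember that a two-sided p-value of less than 0.05 corresponds roughly to a standardized statistic of 2 or larger (or -2 or smaller). This is because, for most symmetric, bell-shaped distributions, about 95% of the values in the distribution fall within 2 SD of the mean. So the values within 2 SD of the observed sample proportion form an approximate 95% confidence interval: sample proportion ± 2 × SD.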
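A minimal Python sketch of the 2SD method for the Halloween data is given here. It assumes the SD is taken from a null distribution simulated with π = 0.5, which is what the applet workflow suggests; the number of repetitions and the seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

n = 283                 # trick-or-treaters in the sample
p_hat = 148 / n         # observed proportion choosing candy
pi_null = 0.5           # null value used to simulate the null distribution

# Simulate many samples under the null and record each sample proportion.
simulated = [
    sum(random.random() < pi_null for _ in range(n)) / n
    for _ in range(2000)
]
sd_null = statistics.stdev(simulated)

# 2SD interval: observed proportion plus or minus two null-distribution SDs.
low, high = p_hat - 2 * sd_null, p_hat + 2 * sd_null
print(round(sd_null, 3), round(low, 3), round(high, 3))
```

The resulting interval should land close to the one found through repeated tests, at a fraction of the effort.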

Halloween Treats

The 2SD method only gives us a 95% confidence interval. If we want a different level of confidence, we can use the range of plausible values (hard) or theory-based methods (easy). In the theory-based applet, you can input any level of confidence and the applet will calculate the confidence interval for you. This is valid provided there are at least 10 successes and 10 failures in your sample (validity conditions).

Applets Let’s check out this example using applets and doing the 2SD method and theory-based method. Remember 52.3% of 283 trick-or-treaters chose candy.

Predicting Elections from Faces (continued) Exploration 2.2

Factors that affect the width of a confidence interval Section 2.3

Factors Affecting Confidence Interval Widths

Level of Confidence Let’s take another look at the St. George Hospital heart transplant data. In the sample of 361 patients, 71 of them died within 30 days of their heart transplant, and 290 survived. Each of these counts of “successes” and “failures” is greater than 10, so we can use our theory-based applet to play around with confidence intervals.

Level of Confidence Since the standard deviation is predictable, we can use the Theory-Based Inference Applet to easily find a confidence interval for the 30-day mortality rate for heart transplant patients at St. George’s. (Notice that the confidence level can be adjusted.)

Level of Confidence What happens to the width of the confidence interval as we change the level of confidence? As the level of confidence increases, the width of our confidence interval increases (and hence the margin of error increases). We are more confident of capturing our parameter in a wider range of values.

Level of Confidence           80%              90%              95%              99%
Confidence Interval           (0.170, 0.224)   (0.163, 0.231)   (0.156, 0.238)   (0.143, 0.251)
Estimate ± margin of error    0.197 ± 0.027    0.197 ± 0.034    0.197 ± 0.041    0.197 ± 0.054
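The effect of the confidence level can be checked with a short Python sketch. It assumes the theory-based interval is the usual normal-approximation interval for one proportion; the applet's exact computation is not shown in the slides, so small rounding differences in the third decimal place are possible. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def theory_based_interval(successes, n, confidence):
    """Normal-approximation interval: p_hat +/- multiplier * sqrt(p_hat(1-p_hat)/n)."""
    failures = n - successes
    if successes < 10 or failures < 10:
        raise ValueError("validity conditions not met: need at least 10 successes and 10 failures")
    p_hat = successes / n
    # Multiplier from the standard normal distribution (about 1.96 for 95%).
    multiplier = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = multiplier * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# St. George's data: 71 deaths among 361 heart transplant patients.
for level in (0.80, 0.90, 0.95, 0.99):
    low, high = theory_based_interval(71, 361, level)
    print(f"{level:.0%}: ({low:.3f}, {high:.3f})")
```

Each step up in confidence uses a larger multiplier, so the interval widens around the same midpoint.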

Sample Size We know as sample size increases, the variability (and thus standard deviation) in our null distribution decreases.

[Figure: simulated null distributions for n = 90 (SD = 0.054), n = 361 (SD = 0.026), and n = 1444 (SD = 0.013)]

Sample size            90               361              1444
SD of null distr.      0.053            0.027            0.013
Margin of error        2 × SD = 0.106   2 × SD = 0.054   2 × SD = 0.026
Confidence interval    (0.091, 0.303)   (0.143, 0.251)   (0.171, 0.223)

Sample Size (With everything else staying the same) increasing the sample size will make a confidence interval narrower. Notice: The observed sample proportion is the midpoint (that won’t change). The margin of error is a multiple of the standard deviation, so as the standard deviation decreases, so will the margin of error.
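The relationship between sample size and margin of error can be illustrated with the theoretical SD of a sample proportion. In this sketch the null value of 0.5 is an assumption (the slides do not say which null was simulated), so the printed values may differ slightly from the simulated SDs reported on the slide.

```python
from math import sqrt

def null_sd(pi_null, n):
    """Theoretical SD of a sample proportion when the true probability is pi_null."""
    return sqrt(pi_null * (1 - pi_null) / n)

pi_null = 0.5  # assumed value; the slides do not state which null was simulated
for n in (90, 361, 1444):
    sd = null_sd(pi_null, n)
    print(f"n = {n}: SD = {sd:.3f}, margin of error (2 x SD) = {2 * sd:.3f}")
```

Because the SD shrinks with the square root of n, roughly quadrupling the sample size (90 to 361 to 1444) roughly halves the margin of error each time.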

Formula for Theory-Based Confidence Interval
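The formula on this slide did not survive in the transcript. The standard theory-based interval for a single proportion, which is presumably what was shown, has the form below. Here \hat{p} is the observed sample proportion, n is the sample size, and the multiplier depends on the confidence level (about 2 for 95% confidence).

```latex
\hat{p} \;\pm\; \text{multiplier} \times \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}}
```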

Summary

What does 95% confidence mean? If we repeatedly sampled from a population and constructed 95% confidence intervals, about 95% of our intervals should contain the population parameter. Notice the interval is the random event here. The population parameter is a fixed number; we just don’t know what it is. Simulating Confidence Intervals Applet.
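This interpretation can be checked by simulation, in the spirit of the applet. The sketch below is an illustration only: the true parameter (0.5), the sample size (100), the number of repetitions, and the seed are all hypothetical choices, and the intervals use the normal-approximation form with a multiplier of 1.96.

```python
import random
from math import sqrt

random.seed(2)  # fixed seed so the sketch is repeatable

true_pi = 0.5      # hypothetical population parameter (unknown in practice)
n = 100            # hypothetical sample size
repetitions = 1000

captured = 0
for _ in range(repetitions):
    # Draw one sample and build its 95% interval.
    successes = sum(random.random() < true_pi for _ in range(n))
    p_hat = successes / n
    margin = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
    # The parameter is fixed; the interval is what varies from sample to sample.
    if p_hat - margin <= true_pi <= p_hat + margin:
        captured += 1

coverage = captured / repetitions
print(f"{coverage:.1%} of the intervals captured the parameter")
```

The capture rate should come out near 95%, though not exactly, since the method is approximate and the simulation is finite.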

Type I error Think back to the St. George’s example that looked at deaths of heart transplant patients. We concluded that their death rate was higher than the national average. Suppose this resulted in ceasing operations at the hospital. Also suppose that in reality their rate was really the same as the national average. What we have done is to reject a true null hypothesis. This is called a type I error and is sometimes referred to as a false alarm.

Type II error Now suppose we obtained a large p-value, so we didn’t get significant results in the St. George’s example. Hence, we could not conclude that their death rate was higher than the national average, and heart operations continued at the hospital. Also suppose that in reality their rate was in fact higher than the national average. What we have done is to not reject a false null hypothesis. This is called a type II error and is sometimes referred to as a missed opportunity.

Type I and Type II Errors

                                  What is true (unbeknownst to us)
What we decide (based on data)    Null hypothesis is true       Null hypothesis is false
Reject null hypothesis            Type I Error (false alarm)    Correct Decision
Do not reject null hypothesis     Correct Decision              Type II Error (missed opportunity)

Type I and Type II errors

                                     Null is true                      Null is false
                                     (Defendant is innocent)           (Defendant is guilty)
Reject null                          Type I Error                      Correct Decision
(Jury finds defendant guilty)        Innocent person goes to prison    Guilty person goes to prison
Not reject null                      Correct Decision                  Type II Error
(Jury finds defendant not guilty)    Innocent person is set free       Guilty person is set free

The probabilities of Type I and Type II errors The significance level is the criterion used to reject the null hypothesis. We have been using 0.05 as our significance level. The probability of a type I error is the significance level. (Suppose the significance level is 0.05. If the null is true we would reject it 5% of the time and thus make a type I error 5% of the time.)
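The claim that the type I error rate matches the significance level can be illustrated by simulation. In this sketch the null value (0.5), sample size (100), repetitions, and seed are hypothetical, and the p-value is a two-sided normal-approximation p-value rather than a simulated one. Because counts are discrete, the rejection rate will only be approximately 5%.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(3)  # fixed seed so the sketch is repeatable

pi_null = 0.5              # hypothetical null value, which here is also the truth
n = 100                    # hypothetical sample size
significance_level = 0.05
repetitions = 2000

rejections = 0
for _ in range(repetitions):
    # Sample from a population where the null hypothesis is true.
    p_hat = sum(random.random() < pi_null for _ in range(n)) / n
    z = (p_hat - pi_null) / sqrt(pi_null * (1 - pi_null) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    if p_value < significance_level:
        rejections += 1                            # a type I error

type_one_rate = rejections / repetitions
print(f"Rejected a true null in {type_one_rate:.1%} of samples")
```

Lowering `significance_level` would make these false alarms rarer, at the cost of making it harder to detect a real effect.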

Type I and Type II Errors Think back to the dog sniffing cancer study: ◦ Describe what a type I error would be. ◦ Describe what a type II error would be.

Competitive advantage to Uniform Color (continued) Exploration 2.3A
