Introduction to Inference

Tests of Significance

Definitions
A test of significance is a method for using sample data to make a decision about a population characteristic.
The null hypothesis, written H0, is the starting value for the decision (e.g. H0: μ = 1000).
The alternative hypothesis, written Ha, states the belief or claim we are testing for statistical significance (e.g. Ha: μ < 1000).

Examples
Chrysler Concord: H0: μ = 8, Ha: μ > 8
K-mart: H0: μ = 1000, Ha: μ < 1000

Phrasing our decision
In the justice system, what are our null and alternative hypotheses?
H0: defendant is innocent
Ha: defendant is guilty
What does the jury state if the defendant wins? Not guilty.
Why?

Phrasing our decision
H0: defendant is innocent
Ha: defendant is guilty
If we have the evidence: We reject the belief that the defendant is innocent because we have the evidence to believe the defendant is guilty.
If we don’t have the evidence: We fail to reject the belief that the defendant is innocent because we do not have the evidence to believe the defendant is guilty.

Chrysler Concord
H0: μ = 8
Ha: μ > 8
p-value = .0134
We reject H0. Since the p-value is so small, there is enough evidence to believe the mean Concord time is greater than 8 seconds.

K-mart light bulb
H0: μ = 1000
Ha: μ < 1000
p-value = .1078
We fail to reject H0. Since the p-value is not very small, there is not enough evidence to believe the mean lifetime is less than 1000 hours.

Remember: Inference procedure overview
State the procedure
Define any variables
Establish the conditions (assumptions)
Use the appropriate formula
Draw conclusions

Test of Significance Example
A package delivery service claims it takes an average of 24 hours to send a package from New York to San Francisco. An independent consumer agency is doing a study to test the truth of the claim. Several complaints have led the agency to suspect that the delivery time is longer than 24 hours. Assume that the delivery times are normally distributed with standard deviation (assume σ for now) of 2 hours. A random sample of 25 packages has been taken.

Example 1 test of significance
μ = true mean delivery time
Ho: μ = 24
Ha: μ > 24
Given a random sample
Given a normal distribution
Safe to infer a population of at least 250 packages

Example 1 (look, don’t copy)
24.85

Example 1 test of significance
μ = true mean delivery time
Ho: μ = 24
Ha: μ > 24
Given a random sample
Given a normal distribution
Safe to infer a population of at least 250 packages.
Let α = .05

Example 1 test of significance
μ = true mean delivery time
Ho: μ = 24
Ha: μ > 24
Given a random sample
Given a normal distribution
Safe to infer a population of at least 250 packages.
We reject Ho. Since p-value < α, there is enough evidence to believe the delivery time is longer than 24 hours.
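The arithmetic behind this conclusion can be sketched in Python. The problem statement gives the hypothesized mean (24 hours), the standard deviation (2 hours), and the sample size (25), but not the sample mean. This sketch assumes the value 24.85 shown earlier is the sample mean, and the function name `one_sample_z_test` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Upper-tailed one-sample z-test (Ha: mu > mu0) with known sigma.

    Returns the z statistic and the p-value.
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Upper-tail area under the standard normal curve.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# From the example: mu0 = 24 hours, sigma = 2 hours, n = 25 packages.
# ASSUMPTION: the sample mean is taken to be 24.85 hours.
z, p_value = one_sample_z_test(sample_mean=24.85, mu0=24, sigma=2, n=25)
alpha = 0.05

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "fail to reject Ho")
```

Under that assumed sample mean the p-value falls below .05, which agrees with the decision to reject Ho.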

Wording of conclusion revisit
If I believe the statistic is just too extreme and unusual (p-value < α), I will reject the null hypothesis.
If I believe the statistic is just normal chance variation (p-value > α), I will fail to reject the null hypothesis.
We [reject / fail to reject] Ho, since the [p-value < α, there is / p-value > α, there is not] enough evidence to believe… (Ha in context…)
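The conclusion template can be written as a small helper, shown here as a sketch rather than a required format. The function name `conclusion` is hypothetical, and the level .05 used for the Chrysler and K-mart p-values is an assumption, since those examples do not state one.

```python
def conclusion(p_value, alpha, claim):
    """Return the conclusion sentence for a test of significance.

    `claim` is the alternative hypothesis written in context.
    A p-value exactly equal to alpha is treated as fail to reject.
    """
    if p_value < alpha:
        return ("We reject Ho. Since p-value < alpha, there is enough "
                f"evidence to believe {claim}.")
    return ("We fail to reject Ho. Since p-value > alpha, there is not "
            f"enough evidence to believe {claim}.")


# p-values from the Chrysler Concord and K-mart examples.
# ASSUMPTION: alpha = 0.05 for both.
print(conclusion(0.0134, 0.05, "the mean Concord time is greater than 8 seconds"))
print(conclusion(0.1078, 0.05, "the mean lifetime is less than 1000 hours"))
```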

Example 3 test of significance
μ = true mean distance
Ho: μ = 340
Ha: μ > 340
Given random sample.
Given normally distributed.
Safe to infer a population of at least 100 missiles.
We fail to reject Ho. Since p-value > α, there is not enough evidence to believe the mean distance traveled is more than 340 miles.

Familiar transition
What happened on day 2 of confidence intervals involving mean and standard deviation? We switched from using z-scores to using the t-distribution.
What changes occur in the write-up?

Example 3 test of significance
μ = true mean distance
Ho: μ = 340
Ha: μ > 340
Given random sample.
Given normally distributed.
Safe to infer a population of at least 100 missiles.
We fail to reject Ho. Since p-value > α, there is not enough evidence to believe the mean distance traveled is more than 340 miles.

Example 3 t-test
μ = true mean distance
Ho: μ = 340
Ha: μ > 340
Given random sample.
Given normally distributed.
Safe to infer a population of at least 100 missiles.
We fail to reject Ho. Since p-value > α, there is not enough evidence to believe the mean distance traveled is more than 340 miles.
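The t-test replaces the known population standard deviation with the sample standard deviation, so the test statistic has to be computed from the data. The example's data are not reproduced here, so the sketch below uses ten hypothetical distances invented purely for illustration, with an assumed level of .05. The function name `t_statistic` is also made up.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    # stdev() is the sample standard deviation (divides by n - 1).
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error, n - 1


# HYPOTHETICAL distances in miles, invented for illustration only.
distances = [338, 345, 341, 336, 349, 343, 339, 344, 337, 346]
t, df = t_statistic(distances, mu0=340)

# Upper-tail critical value for df = 9, read from a t-chart.
# ASSUMPTION: alpha = .05.
t_critical = 1.833

print(f"t = {t:.3f}, df = {df}")
print("reject Ho" if t > t_critical else "fail to reject Ho")
```

Comparing t with the critical value from the t-chart is equivalent to comparing the p-value with the chosen level for this upper-tailed test.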


t-chart

1 proportion z-test
p = true proportion pure short
Ho: p = .25
Ha: p ≠ .25
Given a random sample.
np = 1064(.25) = 266 > 10
n(1 – p) = 1064(1 – .25) = 798 > 10
Sample size is large enough to use normality.
Safe to infer a population of at least 10,640 plants.
We fail to reject Ho. Since p-value > α, there is not enough evidence to believe the proportion of pure short is different from 25%.
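A sketch of the one-proportion calculation follows. The sample size (1064) and the hypothesized proportion (.25) come from the example, but the observed number of pure short plants is not given, so the count of 277 below is a hypothetical value chosen only to make the code runnable. The function name `one_proportion_z_test` is likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided one-proportion z-test (Ha: p != p0).

    Returns whether the large-sample conditions hold, the z statistic,
    and the two-sided p-value.
    """
    # Large-sample condition: expected successes and failures both above 10.
    conditions_met = n * p0 > 10 and n * (1 - p0) > 10
    p_hat = successes / n
    # The standard error uses p0, the proportion assumed under Ho.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided: double the tail area beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return conditions_met, z, p_value


# n = 1064 and p0 = .25 come from the example.
# HYPOTHETICAL: 277 pure short plants is an assumed count.
conditions_met, z, p_value = one_proportion_z_test(successes=277, n=1064, p0=0.25)
print(f"conditions met: {conditions_met}, z = {z:.3f}, p-value = {p_value:.3f}")
```

Because the alternative is two-sided, a sample proportion far below .25 would count as evidence against Ho just as much as one far above it.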

Choosing a level of significance
How plausible is H0? If H0 represents a long-held belief, strong evidence (small α) might be needed to dissolve the belief.
What are the consequences of rejecting H0? The choice of α will be heavily influenced by the consequences of rejecting or failing to reject.

Errors in the justice system
                       Actual truth
Jury decision     Guilty              Not guilty
Guilty            Correct decision    Type I error
Not guilty        Type II error       Correct decision

“No innocent man is jailed” justice system
                       Actual truth
Jury decision     Guilty                    Not guilty
Guilty            -                         Type I error (smaller)
Not guilty        Type II error (larger)    -

“No guilty man goes free” justice system
                       Actual truth
Jury decision     Guilty                     Not guilty
Guilty            -                          Type I error (larger)
Not guilty        Type II error (smaller)    -

Errors in the justice system
                                        Actual truth
Jury decision                     Guilty (Ha true)    Not guilty (H0 true)
Guilty (reject H0)                Correct decision    Type I error
Not guilty (fail to reject H0)    Type II error       Correct decision

Type I and Type II errors
If we believe Ha when in fact H0 is true, this is a type I error.
If we believe H0 when in fact Ha is true, this is a type II error.
Type I error: if we reject H0 and it’s a mistake.
Type II error: if we fail to reject H0 and it’s a mistake.

Type I and Type II example
A distributor of handheld calculators receives very large shipments of calculators from a manufacturer. It is too costly and time consuming to inspect all incoming calculators, so when each shipment arrives, a sample is selected for inspection. Information from the sample is then used to test Ho: p = .02 versus Ha: p < .02, where p is the true proportion of defective calculators in the shipment. If the null hypothesis is rejected, the distributor accepts the shipment of calculators. If the null hypothesis cannot be rejected, the entire shipment of calculators is returned to the manufacturer due to inferior quality. (A shipment is defined to be of inferior quality if it contains 2% or more defectives.)

Type I error: We think the proportion of defective calculators is less than 2%, but it’s actually 2% (or more). Consequence: Accept shipment that has too many defective calculators so potential loss in revenue.

Type II error: We think the proportion of defective calculators is 2%, but it’s actually less than 2%. Consequence: Return shipment thinking there are too many defective calculators, but the shipment is ok.

Distributor wants to avoid Type I error. Choose α = .01
Calculator manufacturer wants to avoid Type II error. Choose α = .10

Concept of Power Definition?
Power is the capability of accomplishing something… The power of a test of significance is…

Power Example
In a power generating plant, pressure in a certain line is supposed to maintain an average of 100 psi over any 4-hour period. If the average pressure exceeds 103 psi for a 4-hour period, serious complications can evolve. During a given 4-hour period, thirty random measurements are to be taken. The standard deviation for these measurements is 4 psi (graph of data is reasonably normal). Test Ho: μ = 100 psi versus the alternative “new” hypothesis μ = 103 psi. Test at the alpha level of ___. Calculate the probability of a type II error and the power of this test. In context of the problem, explain what the power means.

Type I error and α
α is the probability that we think the mean pressure is above 100 psi, but actually the mean pressure is 100 psi (or less).
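A short simulation can make this interpretation concrete. It repeatedly draws samples from a population where Ho really is true and counts how often an upper-tailed z-test rejects anyway. The settings μ = 100, standard deviation 4, and n = 30 come from the pressure example. The level .05, the number of trials, the seed, and the function name `type_i_error_rate` are assumptions made for this sketch.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Estimate how often an upper-tailed z-test rejects Ho when Ho is true."""
    rng = random.Random(seed)
    # Reject when z exceeds the upper-alpha critical value.
    z_critical = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where Ho really is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z > z_critical:
            rejections += 1  # every rejection here is a Type I error
    return rejections / trials


# ASSUMPTION: alpha = 0.05; the example's alpha level is not given.
rate = type_i_error_rate(mu0=100, sigma=4, n=30, alpha=0.05, trials=20000)
print(f"estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen level, since the level is exactly the long-run proportion of true null hypotheses that get rejected.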

Type II error and β
β is the probability that we think the mean pressure is 100 psi, but actually the mean pressure is greater than 100 psi.

Power?

Power
There is a probability that this test of significance will correctly detect if the pressure is above 100 psi.

Concept of Power
The power of a test of significance is the probability that the null hypothesis will be correctly rejected.
Because the true value of μ is unknown, we cannot know what the power is for μ, but we are able to examine “what if” scenarios to provide important information.
Power = 1 – β
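For the pressure example, the “what if” scenario is a true mean of 103 psi. The sketch below works out the probability of a Type II error and the power for that scenario with an upper-tailed z-test. The level .05 is an assumption because the example's alpha level is not given, and the function name `power_upper_tailed_z` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Return (beta, power) for an upper-tailed z-test at a 'what if' mean."""
    standard_error = sigma / sqrt(n)
    # Smallest sample mean that leads to rejecting Ho.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # beta: chance the sample mean falls below the cutoff when mu = mu_true.
    beta = NormalDist(mu_true, standard_error).cdf(cutoff)
    return beta, 1 - beta


# mu0 = 100, mu_true = 103, sigma = 4, n = 30 come from the power example.
# ASSUMPTION: alpha = 0.05.
beta, power = power_upper_tailed_z(mu0=100, mu_true=103, sigma=4, n=30, alpha=0.05)
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

In context, the power is the probability that the thirty measurements lead us to conclude the mean pressure is above 100 psi when it really is 103 psi.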

Effects on the Power of a Test
The larger the difference between the hypothesized value and the true value of the population characteristic, the higher the power.
The larger the significance level, α, the higher the power of the test.
The larger the sample size, the higher the power of the test.
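These three effects can be checked numerically by changing one setting at a time. The sketch below uses an upper-tailed z-test. The baseline settings are hypothetical values loosely based on the pressure example (a true mean of 101 psi and a level of .05 are assumptions), and the function name `power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power(mu0, mu_true, sigma, n, alpha):
    """Power of an upper-tailed z-test when the true mean is mu_true."""
    standard_error = sigma / sqrt(n)
    # Smallest sample mean that leads to rejecting Ho.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Chance the sample mean lands above the cutoff when mu = mu_true.
    return 1 - NormalDist(mu_true, standard_error).cdf(cutoff)


# HYPOTHETICAL baseline settings, chosen only for illustration.
base = dict(mu0=100, mu_true=101, sigma=4, n=30, alpha=0.05)

baseline = power(**base)
bigger_difference = power(**{**base, "mu_true": 102})  # true mean further from mu0
bigger_alpha = power(**{**base, "alpha": 0.10})        # larger significance level
bigger_sample = power(**{**base, "n": 60})             # larger sample size

for label, value in [("baseline", baseline),
                     ("bigger difference", bigger_difference),
                     ("bigger alpha", bigger_alpha),
                     ("bigger sample", bigger_sample)]:
    print(f"{label}: power = {value:.3f}")
```

Each change raises the power relative to the baseline, though a larger significance level buys that power at the cost of a higher Type I error probability.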
