 # STATISTICAL INFERENCE PART V

## Presentation on theme: "STATISTICAL INFERENCE PART V"— Presentation transcript:

STATISTICAL INFERENCE PART V
HYPOTHESIS TESTING

Tests are based on the following principle:
TESTS OF HYPOTHESIS H0 is False H0 is True Reject H0 Do not reject H0 Type I error P(Type I error) =  Correct Decision Type II error P(Type II error) =  1- 1- Tests are based on the following principle: Fix , minimize . ()=Power function of the test for all . = P(Reject H0)=P((x1,x2,…,xn)C)

POWER OF THE TEST AND P-VALUE
= P(Reject H0|H0 is not true) p-value = Observed significance level = Probability of obtaining a test statistics at least as extreme as the one that you observed by chance, OR, the smallest level of significance at which the null hypothesis can be rejected OR the maximum value of  that you are willing to tolerate.

CALCULATION OF P-VALUE
Determine the value of the test statistics, For One-Tailed Test: p-value= P(z > z0) if HA: >0 p-value= P(z < z0) if HA: <0 For Two-Tailed Test p=p-value = 2.P(z>z0) p=p-value = 2.P(z<-z0) p-value z0 p-value z0 p/2 p/2 -z0 z0

DECISION RULE BY USING P-VALUES
REJECT H0 IF p-value <  DO NOT REJECT H0 IF p-value   p-value

Example Do the contents of bottles of catsup have a net weight below an advertised threshold of 16 ounces? To test this 25 bottles of catsup were selected. They gave a net sample mean weight of It is known that the standard deviation is We want to test this at significance levels 1% and 5%.

CALCULATIONS The z-score is:
The p-value is the probability of getting a score worse than this (relative to the alternative hypothesis) i.e., Compare the p-value to the significance level. Since it is bigger than both 1% and 5%, we do not reject the null hypothesis.

P-value for this one-tailed Test
The p-value for this test is Thus, do not reject H0 at 1% and 5% significance level. We do not have enough evidence to say that the contents of bottles of catsup have a net weight of less than 16 ounces. 0.1056 0.10 0.05 -1.25

Test of Hypothesis for the Population Mean ( unknown)
For samples of size n drawn from a Normal Population, the test statistic: has a Student t-distribution with n1 degrees of freedom

EXAMPLE 5 measurements of the tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4, 14.3 and 14.6 mg per cigarette. Show the difference between the mean of this sample and the average tar content claimed by the manufacturer, =14.0, is significant at =0.05.

SOLUTION H0:  = 14.0 HA:   14.0 Decision Rule: Reject Ho if t<-t/2 or t> t/2.

CONCLUSION Reject H0 at  = 0.05. Difference is significant. 0.025
0.95 5.66 -2.766 2.766 Reject H0 Reject H0

P-value of This Test p-value = 2.P(t > 5.66) = 2(0.0024)=0.0048
Since p-value = <  = 0.05, reject H0. Minitab Output T-Test of the Mean Test of mu = vs mu not = Variable N Mean StDev SE Mean T P-Value C

CONCLUSION USING THE CONFIDENCE INTERVALS
MINITAB OUTPUT: Confidence Intervals Variable N Mean StDev SE Mean % C.I. C ( , ) Since 14 is not in the interval, reject H0.

EXAMPLE Problem: At a certain production facility that assembles computer keyboards, the assembly time is known (from experience) to follow a normal distribution with mean (m) of 130 seconds and standard deviation (s) of 15 seconds. The production supervisor suspects that the average time to assemble the keyboards does not quite follow the specified value. To examine this problem, he measures the times for 100 assemblies and found that the sample mean assembly time ( ) is seconds. Can the supervisor conclude at the 5% level of significance that the mean assembly time of 130 seconds is incorrect?

We want to prove that the time required to do the assembly is different from what experience dictates: The sample mean is The standard deviation is The standardized test statistic value is:

1- Two-Tail Hypothesis: H0: m130 HA: m 130  z
Do not Reject H0 z=test statistic values (-z/2z<z/2 Reject H0 (z<-z/2 Type I Error Probability -z (z>z/2

.90 .9 Test Statistic:  2 z  -z  Z
Z .90 Rejection Region

CONCLUSION Since –2.13<-1.96, it falls in the rejection region.
Hence, we reject the null hypothesis that the time required to do the assembly is 130 seconds. The evidence suggests that the task now takes either more or less than 130 seconds.

DECISION RULE Reject Ho if z < -1.96 or z > 1.96.
In terms of , reject H0 if 06

P-VALUE In our example, the p-value is
So, since < 0.05, we reject the null.

Calculating the Probability of Type II Error
Ho: m = 130 HA: m130 Suppose we would like to compute the probability of not rejecting H0 given that the null hypothesis is false (for instance m=135 instead of 130), i.e. b=P(not rejecting Ho|Ho is false). Assuming m=135 this statement becomes:

TESTING HYPOTHESIS ABOUT POPULATION PROPORTION, p
ASSUMPTIONS: The experiment is binomial. The sample size is large enough. x: The number of success The sample proportion is approximately for large n (np  5 and nq  5 ).

HYPOTHESIS TEST FOR p H0: p = p0 HA: p  p0
Two-sided Test Test Statistic Rejecting Area H0: p = p0 HA: p  p0 Reject Ho if z < -z/2 or z > z/2. /2 z/2 - z/2 Do not reject H0 Reject H0 1- Reject H0

HYPOTHESIS TEST FOR p One-sided Tests Test Statistic Rejecting Area
1. H0: p= p0 HA: p > p0 Reject Ho if z > z. 2. H0: p = p0 HA: p < p0 Reject Ho if z < - z. 1- z Do not reject H0 Reject H0 1- -z Reject H0 Do not reject H0

EXAMPLE Mom’s Home Cokin’ claims that 70% of the customers are able to dine for less than \$5. Mom wishes to test this claim at the 92% level of confidence. A random sample of 110 patrons revealed that 66 paid less than \$5 for lunch. Ho: p = 0.70 HA: p  0.70

ANSWER x = 66, n = 110 and p = 0.70  = 0.08, z/2 = z0.04 = 1.75
Test Statistic:

CONCLUSION DECISION RULE: Reject H0 if z < -1.75 or z > 1.75.
CONCLUSION: Reject H0 at  = Mom’s claim is not true. -1.75 1.75 -2.289 /2

P-VALUE p-value = 2. P(z < -2.289) =2(0.011) = 0.022
The smallest value of  to reject H0 is Since p-value = <  = 0.08, reject H0. 0.011 -2.289

CONFIDENCE INTERVAL APPROACH
Find the 92% CI for p. 92% CI for p: 0.52  p  0.68 Since is not in the above interval, reject H0. Mom has overestimated the percentage of customers that pay less than 5\$ for a meal. What happens with wider confidence intervals? Exercise: Calculate the 95% and 99% CIs for p.

SAMPLING DISTRIBUTION OF s2
The statistic is chi-squared distributed with n-1 d.f. when the population random variable is normally distributed with variance 2.

CHI-SQUARE DISTRIBUTION
A 1 - A 2 2 f(2)

Inference about the Population Variance (s2)
Test statistic which is chi-squared distributed with n - 1 degrees of freedom LCL = (n - 1) s2 2 1 - /2 /2 UCL = Confidence interval estimator:

Testing the Population Variance (s2) EXAMPLE
Proctor and Gamble told its customers that the variance in the weights of its bottles of Pepto-Bismol is less than 1.2 ounces squared. As a marketing representative for P&G, you select 25 bottles and find a variance of 1.7. At the 10% level of significance, is P&G maintaining its pledge of product consistency? H0: 2 = 1.2 HA: 2 < 1.2

ANSWER n=25, s2=1.7, =0.10, Test Statistics:
Decision Rule: Reject H0 if Conclusion: Because 2=34 > , do not reject H0. We don’t have enough evidence that suggests the variability in product weights less than 1.2 ounces squared.

EXAMPLE A random sample of 22 observations from a normal population possessed a variance equal to Find 90% CI for 2. 90% CI for 2: