The p-value approach to Hypothesis Testing
In hypothesis testing we need:
- A test statistic
- A Critical region and an Acceptance region for the test statistic
The Critical region is set up under the sampling distribution of the test statistic, so that the area under that distribution above the Critical region is α (0.05 or 0.01). The Critical region may be one tailed or two tailed.
[Figure: the two-tailed Critical region. Area α/2 in each tail (Reject H0); the central region is the Acceptance region (Accept H0).]
The test is carried out by:
1. Computing the value of the test statistic.
2. Making the decision: reject H0 if the value is in the Critical region, and accept H0 if the value is in the Acceptance region.
The value of the test statistic may be in the Acceptance region but close to being in the Critical region, or it may be in the Critical region but close to being in the Acceptance region. To measure this we compute the p-value.
Definition: Once the test statistic has been computed from the data, the p-value is defined to be:
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
Here "more extreme" means giving stronger evidence for rejecting H0.
Example: Suppose we are using the z-test for the mean μ of a normal population and α = 0.05. Thus the critical region is to reject H0 if z < -1.960 or z > 1.960.
Suppose the observed value is z = 2.3. Then we reject H0, and
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
= P[z > 2.3] + P[z < -2.3] = 0.0107 + 0.0107 = 0.0214
[Figure: standard normal density; the p-value is the total area in the two tails beyond -2.3 and 2.3.]
If the observed value is z = 1.2, then we accept H0, and
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
= P[z > 1.2] + P[z < -1.2] = 0.1151 + 0.1151 = 0.2302
There is a 23.02% chance that the test statistic is as or more extreme than 1.2. This is fairly high, hence 1.2 is not very extreme.
[Figure: standard normal density; the p-value is the total area in the two tails beyond -1.2 and 1.2.]
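The two worked examples above (z = 2.3 and z = 1.2) can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `two_tailed_p_value` is a hypothetical helper, not something defined in these slides.

```python
from statistics import NormalDist


def two_tailed_p_value(z_observed: float) -> float:
    """Return P[|Z| >= |z_observed|] for a standard normal Z."""
    # Area in the upper tail beyond |z|; by symmetry the lower tail is equal.
    upper_tail = 1.0 - NormalDist().cdf(abs(z_observed))
    return 2.0 * upper_tail


alpha = 0.05
for z in (2.3, 1.2):
    p = two_tailed_p_value(z)
    decision = "reject H0" if p < alpha else "accept H0"
    print(f"z = {z}: p-value = {p:.4f} -> {decision}")
```

For z = 1.2 the computed value can differ from 0.2302 in the fourth decimal place, because the slide adds two table entries that were each already rounded to four decimals.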
Properties of the p-value
- If the p-value is small (< 0.05 or < 0.01), H0 should be rejected.
- The p-value measures the plausibility of H0: the smaller the p-value, the less compatible the observed data are with H0.
- If the test is two tailed the p-value should be two tailed.
- If the test is one tailed the p-value should be one tailed.
- It is customary to report p-values when reporting the results. This gives the reader some idea of the strength of the evidence for rejecting H0.
Summary
- A common way to report statistical tests is to compute the p-value.
- If the p-value is small (< 0.05 or < 0.01) then H0 is rejected.
- If the p-value is extremely small this gives a strong indication that HA is true.
- If the p-value is marginally above the threshold 0.05 then we cannot reject H0, but there would be a suspicion that H0 is false.
Testing and Estimation of Variances
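Let x1, x2, x3, …, xn denote a sample from a Normal distribution with mean μ and standard deviation σ (variance σ²).
The point estimator of the variance σ² is:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
The point estimator of the standard deviation σ is:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$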
The sampling distribution of s²: the χ² distribution
The χ² distribution
Let z1, z2, z3, …, zn denote a sample from the Standard Normal distribution. Let
$$U = z_1^2 + z_2^2 + \cdots + z_n^2 = \sum_{i=1}^{n} z_i^2$$
Then the distribution of U is called the Chi-square (χ²) distribution with n degrees of freedom.
[Figure: χ² density curves for ν = 1, ν = 2 and ν = 4 degrees of freedom.]
Comments
- Usually statistics that are “sums of squares” of observations have a distribution that is related to the χ² distribution.
- The degrees of freedom are the number of “independent” terms in the sum of squares.
Let x1, x2, x3, …, xn denote a sample from a Normal distribution with mean μ and standard deviation σ (variance σ²). Let
$$U = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)s^2}{\sigma^2}$$
Then U has a χ² distribution with ν = n – 1 degrees of freedom.
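A small simulation can make this result more concrete. The sketch below repeatedly draws Normal samples, computes (n – 1)s²/σ² for each, and compares the simulated mean and variance with those of a χ² variable with ν degrees of freedom (mean ν, variance 2ν). The sample size, μ, σ, number of replications and seed are arbitrary illustrative choices, and the function names are hypothetical.

```python
import random
from statistics import mean, variance


def scaled_sum_of_squares(sample, sigma):
    """U = sum((x_i - xbar)^2) / sigma^2 = (n - 1) * s^2 / sigma^2."""
    n = len(sample)
    # statistics.variance uses the n - 1 divisor, i.e. it returns s^2.
    return (n - 1) * variance(sample) / sigma ** 2


def simulate_u(n, mu, sigma, replications, seed=0):
    """Draw `replications` Normal samples of size n and return their U values."""
    rng = random.Random(seed)
    return [
        scaled_sum_of_squares([rng.gauss(mu, sigma) for _ in range(n)], sigma)
        for _ in range(replications)
    ]


n = 5  # illustrative sample size, so nu = n - 1 = 4
u_values = simulate_u(n=n, mu=10.0, sigma=2.0, replications=20000)

# A chi-square variable with nu degrees of freedom has mean nu and variance 2 * nu.
print("simulated mean of U:    ", round(mean(u_values), 3), "(theory:", n - 1, ")")
print("simulated variance of U:", round(variance(u_values), 3), "(theory:", 2 * (n - 1), ")")
```

The simulated values only approximate the theoretical ones; the agreement improves as the number of replications grows.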
Critical Points of the χ² distribution
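Confidence intervals for σ² and σ.
Let $\chi^2_{\alpha}$ denote the critical point of the χ² distribution with n – 1 degrees of freedom that has area α above it. It is true that
$$P\left[\chi^2_{1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}\right] = 1 - \alpha$$
from which we can show
$$P\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right] = 1 - \alpha$$
and
$$P\left[\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\right] = 1 - \alpha$$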
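Hence (1 – α)100% confidence limits for σ² are:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \quad\text{to}\quad \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$
and (1 – α)100% confidence limits for σ are:
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \quad\text{to}\quad \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$$
where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are the critical points of the χ² distribution with n – 1 degrees of freedom having areas α/2 and 1 – α/2 above them.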
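The confidence limits above translate directly into code. The sketch below assumes the two χ² critical points are looked up in a table and passed in, since the Python standard library has no χ² quantile function. The sample variance s² = 4.0 and sample size n = 6 are hypothetical values chosen for illustration (they are not the data of the example that follows); 12.833 and 0.831 are the tabulated χ² points for 5 degrees of freedom with areas 0.025 and 0.975 above them.

```python
import math


def variance_confidence_limits(sample_variance, n, chi2_upper, chi2_lower):
    """(1 - alpha)100% confidence limits for sigma^2.

    chi2_upper: critical point with area alpha/2 above it (n - 1 df).
    chi2_lower: critical point with area 1 - alpha/2 above it (n - 1 df).
    """
    df = n - 1
    lower = df * sample_variance / chi2_upper
    upper = df * sample_variance / chi2_lower
    return lower, upper


# Hypothetical inputs for illustration only.
n = 6
s2 = 4.0
lower_var, upper_var = variance_confidence_limits(s2, n, chi2_upper=12.833, chi2_lower=0.831)

# Limits for sigma are the square roots of the limits for sigma^2.
lower_sd, upper_sd = math.sqrt(lower_var), math.sqrt(upper_var)

print(f"95% limits for sigma^2: {lower_var:.3f} to {upper_var:.3f}")
print(f"95% limits for sigma:   {lower_sd:.3f} to {upper_sd:.3f}")
```

Note that the interval is not symmetric about s², because the χ² distribution is skewed.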
Example: A study was interested in determining whether administration of a drug reduces cancerous tumor size. For this purpose n + m = 9 test animals are implanted with a cancerous tumor. n = 3 are selected at random and administered the drug. The remaining m = 6 are left untreated. Final tumor sizes are measured at the end of the test period.
Suppose the data has been collected and:
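Recall that (1 – α)100% confidence limits for σ² are:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \quad\text{to}\quad \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$
and (1 – α)100% confidence limits for σ are:
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \quad\text{to}\quad \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$$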
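The drug treated group (n = 3, so ν = n – 1 = 2 degrees of freedom; χ²(0.025) = 7.378 and χ²(0.975) = 0.0506)
95% confidence limits for σ² are:
$$\frac{2s^2}{7.378} \quad\text{to}\quad \frac{2s^2}{0.0506}$$
The control group (m = 6, so ν = m – 1 = 5 degrees of freedom; χ²(0.025) = 12.833 and χ²(0.975) = 0.831)
95% confidence limits for σ² are:
$$\frac{5s^2}{12.833} \quad\text{to}\quad \frac{5s^2}{0.831}$$
In each case s² is the sample variance of that group.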
Testing for the equality of variances: the F test
Situation:
- Let x1, x2, x3, …, xn denote a sample from a Normal distribution with mean μx and standard deviation σx.
- Let y1, y2, y3, …, ym denote a second independent sample from a Normal distribution with mean μy and standard deviation σy.
- We want to test for the equality of the two variances.
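i.e.:
Test $H_0: \sigma_x^2 = \sigma_y^2$ against $H_A: \sigma_x^2 \ne \sigma_y^2$ (two sided alternative)
or
Test $H_0: \sigma_x^2 \le \sigma_y^2$ against $H_A: \sigma_x^2 > \sigma_y^2$ (one sided alternative)
or
Test $H_0: \sigma_x^2 \ge \sigma_y^2$ against $H_A: \sigma_x^2 < \sigma_y^2$ (one sided alternative)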
The test statistic (F)
$$F = \frac{s_x^2}{s_y^2}$$
where $s_x^2$ and $s_y^2$ are the sample variances of the two samples.
The sampling distribution of the test statistic
If the Null Hypothesis (H0) is true then the sampling distribution of F is called the F-distribution with ν1 = n – 1 degrees of freedom in the numerator and ν2 = m – 1 degrees of freedom in the denominator.
[Figure: density of the F distribution with ν1 = n – 1 degrees of freedom in the numerator and ν2 = m – 1 degrees of freedom in the denominator; the area α lies above the critical point Fα(ν1, ν2).]
Note: If $F = s_x^2 / s_y^2$ has the F-distribution with ν1 = n – 1 degrees of freedom in the numerator and ν2 = m – 1 degrees of freedom in the denominator, then $1/F = s_y^2 / s_x^2$ has the F-distribution with ν1 = m – 1 degrees of freedom in the numerator and ν2 = n – 1 degrees of freedom in the denominator.
Critical region for the test (two sided alternative):
Reject H0 if
$$F = \frac{s_x^2}{s_y^2} \ge F_{\alpha/2}(n-1,\, m-1)$$
or
$$\frac{1}{F} = \frac{s_y^2}{s_x^2} \ge F_{\alpha/2}(m-1,\, n-1)$$
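Critical region for the test (one tailed, one sided alternative):
For $H_A: \sigma_x^2 > \sigma_y^2$, reject H0 if
$$F = \frac{s_x^2}{s_y^2} \ge F_{\alpha}(n-1,\, m-1)$$
For $H_A: \sigma_x^2 < \sigma_y^2$, the roles of the two samples are interchanged: reject H0 if $s_y^2 / s_x^2 \ge F_{\alpha}(m-1,\, n-1)$.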
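The two sided version of this test can be sketched in Python as follows. The critical points are assumed to come from an F table and are passed in as arguments; 8.43 and 39.30 are the tabulated upper 0.025 points for (2, 5) and (5, 2) degrees of freedom. The two samples are hypothetical numbers invented for illustration, not the data of any example in these slides, and `variance_ratio_test` is a hypothetical helper name.

```python
from statistics import variance


def variance_ratio_test(x, y, f_crit_xy, f_crit_yx):
    """Two sided F test of H0: sigma_x^2 = sigma_y^2.

    f_crit_xy: upper alpha/2 point of F with (n - 1, m - 1) degrees of freedom.
    f_crit_yx: upper alpha/2 point of F with (m - 1, n - 1) degrees of freedom.
    Returns the ratio F = s_x^2 / s_y^2 and the reject/accept decision.
    """
    f_ratio = variance(x) / variance(y)  # statistics.variance uses the n - 1 divisor
    reject = f_ratio >= f_crit_xy or 1.0 / f_ratio >= f_crit_yx
    return f_ratio, reject


# Hypothetical samples (n = 3 and m = 6), for illustration only.
x = [1.2, 1.9, 1.5]
y = [2.1, 3.4, 2.8, 1.7, 3.9, 2.5]

f_ratio, reject = variance_ratio_test(x, y, f_crit_xy=8.43, f_crit_yx=39.30)
print(f"F = {f_ratio:.3f}, 1/F = {1 / f_ratio:.3f}")
print("reject H0" if reject else "accept H0")
```

Checking both F and 1/F against their own critical points avoids the need for lower-tail F tables, which is the design the slides use.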
Example: A study was interested in determining whether administration of a drug reduces cancerous tumor size. For this purpose n + m = 9 test animals are implanted with a cancerous tumor. n = 3 are selected at random and administered the drug. The remaining m = 6 are left untreated. Final tumor sizes are measured at the end of the test period.
Suppose the data has been collected and:
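We want to test $H_0: \sigma_x^2 = \sigma_y^2$ against $H_A: \sigma_x^2 \ne \sigma_y^2$.
(H0 is assumed for the t-test for comparing the means.)
Using α = 0.05, with n – 1 = 2 and m – 1 = 5 degrees of freedom, we will reject H0 if
$$\frac{s_x^2}{s_y^2} \ge F_{0.025}(2,\, 5) = 8.43$$
or
$$\frac{s_y^2}{s_x^2} \ge F_{0.025}(5,\, 2) = 39.30$$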
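Test statistic: $F = s_x^2 / s_y^2$ and $1/F = s_y^2 / s_x^2$.
Neither ratio exceeds its critical value, therefore we accept $H_0: \sigma_x^2 = \sigma_y^2$.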
Comparing k Population Means – One-way Analysis of Variance (ANOVA)
The F test for comparing k means
Situation
- We have k normal populations.
- Let μi and σ denote the mean and standard deviation of population i, i = 1, 2, 3, …, k.
- Note: we assume that the standard deviation for each population is the same: σ1 = σ2 = … = σk = σ
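We want to test
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
against
$$H_A: \mu_i \ne \mu_j \text{ for at least one pair } i, j$$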
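The data
- Assume we have collected data from each of the k populations.
- Let $x_{i1}, x_{i2}, x_{i3}, \ldots, x_{i n_i}$ denote the $n_i$ observations from population i, i = 1, 2, 3, …, k.
- Let
$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad s_i = \sqrt{\frac{\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{n_i - 1}}$$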
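The pooled estimate of standard deviation and variance:
$$s_{Pooled} = \sqrt{\frac{\sum_{i=1}^{k} (n_i - 1)s_i^2}{N - k}}, \qquad s_{Pooled}^2 = \frac{\sum_{i=1}^{k} (n_i - 1)s_i^2}{N - k}$$
where $s_i$ is the sample standard deviation of sample i and $N = n_1 + n_2 + \cdots + n_k$ is the total number of observations.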
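Consider the statistic comparing the sample means
$$MS_{Between} = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2}{k - 1}$$
where
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{k} n_i \bar{x}_i$$
is the overall mean of all $N = n_1 + \cdots + n_k$ observations and $\bar{x}_i$ is the mean of sample i.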
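To test $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ against $H_A$: at least one pair of means differ, use the test statistic
$$F = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 / (k - 1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 / (N - k)} = \frac{MS_{Between}}{s_{Pooled}^2}$$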
Computing Formulae
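Now, writing $T_i = \sum_{j} x_{ij}$ for the total of sample i and $G = \sum_{i} T_i$ for the grand total,
$$\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{G^2}{N}$$
and
$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \sum_{i=1}^{k} \frac{T_i^2}{n_i}$$
Thus F can be computed from totals and sums of squares alone.
To compute F, compute:
1) $T_i = \sum_{j} x_{ij}$, the total for sample i
2) $G = \sum_{i} T_i$, the grand total
3) $N = \sum_{i} n_i$, the total sample size
4) $\sum_{i}\sum_{j} x_{ij}^2$
5) $\sum_{i} T_i^2 / n_i$
Then:
1) $SS_{Between} = \sum_{i} \frac{T_i^2}{n_i} - \frac{G^2}{N}$
2) $SS_{Within} = \sum_{i}\sum_{j} x_{ij}^2 - \sum_{i} \frac{T_i^2}{n_i}$
3) $F = \dfrac{SS_{Between} / (k - 1)}{SS_{Within} / (N - k)}$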
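The computation of F from group totals can be sketched in Python as below. It assumes the usual computing formulae for one-way ANOVA: SSBetween = Σ Tᵢ²/nᵢ – G²/N and SSWithin = ΣΣ xᵢⱼ² – Σ Tᵢ²/nᵢ, where Tᵢ is the total of sample i, G the grand total and N the total sample size. These symbol names, the function name `one_way_anova` and the three small groups of data are all assumptions made for illustration.

```python
def one_way_anova(groups):
    """One-way ANOVA F statistic computed from totals and sums of squares."""
    k = len(groups)
    sizes = [len(g) for g in groups]                      # n_i
    n_total = sum(sizes)                                  # N
    totals = [sum(g) for g in groups]                     # T_i
    grand_total = sum(totals)                             # G
    sum_sq_obs = sum(x * x for g in groups for x in g)    # sum of all x_ij^2
    sum_t2_over_n = sum(t * t / n for t, n in zip(totals, sizes))

    ss_between = sum_t2_over_n - grand_total ** 2 / n_total
    ss_within = sum_sq_obs - sum_t2_over_n
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)                 # the pooled variance
    return {
        "df_between": k - 1,
        "df_within": n_total - k,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


# Hypothetical data: three groups of three observations each.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name}: {value}")
```

The resulting F ratio would then be compared with the critical point Fα for k – 1 and N – k degrees of freedom taken from an F table.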
The sampling distribution of F
The sampling distribution of the statistic F when H0 is true is called the F distribution. The F distribution arises when you form the ratio of two independent χ² random variables, each divided by its degrees of freedom.
i.e. if U1 and U2 are two independent χ² random variables with degrees of freedom ν1 and ν2, then the distribution of
$$F = \frac{U_1 / \nu_1}{U_2 / \nu_2}$$
is called the F-distribution with ν1 degrees of freedom in the numerator and ν2 degrees of freedom in the denominator.
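Recall: To test $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ against $H_A$: at least one pair of means differ, use the test statistic
$$F = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 / (k - 1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 / (N - k)}$$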
We reject H0 if
$$F \ge F_{\alpha}$$
where Fα is the critical point under the F distribution with ν1 = k – 1 degrees of freedom in the numerator and ν2 = N – k degrees of freedom in the denominator.
Example: In the following example we are comparing weight gains resulting from the following six diets:
Diet 1 - High protein, Beef
Diet 2 - High protein, Cereal
Diet 3 - High protein, Pork
Diet 4 - Low protein, Beef
Diet 5 - Low protein, Cereal
Diet 6 - Low protein, Pork
Thus, since the computed F exceeds the critical value 2.386 (F > 2.386), we reject H0.
The ANOVA Table: a convenient method for displaying the calculations for the F-test
ANOVA Table

Source     d.f.     Sum of Squares   Mean Square   F-ratio
Between    k - 1    SSBetween        MSBetween     MSB / MSW
Within     N - k    SSWithin         MSWithin
Total      N - 1    SSTotal
Diet Example
Equivalence of the F-test and the t-test when k = 2
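The t-test for comparing two means uses
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{Pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
The F-test with k = 2 uses
$$F = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2}{s_{Pooled}^2}, \qquad \bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$$
Since $\bar{x}_1 - \bar{x} = \frac{n_2(\bar{x}_1 - \bar{x}_2)}{n_1 + n_2}$ and $\bar{x}_2 - \bar{x} = -\frac{n_1(\bar{x}_1 - \bar{x}_2)}{n_1 + n_2}$, the numerator is
$$n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 = \frac{n_1 n_2}{n_1 + n_2}(\bar{x}_1 - \bar{x}_2)^2 = \frac{(\bar{x}_1 - \bar{x}_2)^2}{\frac{1}{n_1} + \frac{1}{n_2}}$$
Hence
$$F = \frac{(\bar{x}_1 - \bar{x}_2)^2}{s_{Pooled}^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = t^2$$
so the F-test and the two tailed t-test lead to the same decision.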
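The equivalence named in this section's heading (F = t² when k = 2) can be checked numerically. The sketch below computes the pooled two-sample t statistic and the one-way ANOVA F statistic for two groups and compares F with t². The two samples are hypothetical numbers chosen for illustration, and both function names are hypothetical helpers.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(x, y):
    """Two-sample t statistic using the pooled variance estimate."""
    n, m = len(x), len(y)
    pooled_var = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n + 1 / m))


def anova_f_two_groups(x, y):
    """One-way ANOVA F statistic for k = 2 groups."""
    n, m = len(x), len(y)
    overall_mean = (sum(x) + sum(y)) / (n + m)
    ss_between = n * (mean(x) - overall_mean) ** 2 + m * (mean(y) - overall_mean) ** 2
    ss_within = (n - 1) * variance(x) + (m - 1) * variance(y)
    # k - 1 = 1 degree of freedom between, N - k = n + m - 2 within.
    return (ss_between / 1) / (ss_within / (n + m - 2))


# Hypothetical samples, for illustration only.
x = [12.1, 14.3, 13.8, 15.0]
y = [10.2, 11.9, 9.7, 12.4, 10.8]

t_stat = pooled_t_statistic(x, y)
f_stat = anova_f_two_groups(x, y)
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

The two printed values t² and F should agree up to floating-point rounding for any pair of samples, not only for these.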
The χ² test for independence
Situation
- We have two categorical variables R and C.
- The number of categories of R is r.
- The number of categories of C is c.
- We observe n subjects from the population and count xij = the number of subjects for which R = i and C = j.
- R = rows, C = columns
Example: Both Systolic Blood Pressure (C) and Serum Cholesterol (R) were measured for a sample of n = 1237 subjects.
The categories for Blood Pressure are: <126, 127-146, 147-166, 167+
The categories for Cholesterol are: <200, 200-219, 220-259, 260+
Table: two-way frequency
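The χ² test for independence
Define
$$E_{ij} = \frac{R_i C_j}{n}$$
= Expected frequency in the (i,j)th cell in the case of independence, where $R_i = \sum_{j} x_{ij}$ is the total for row i and $C_j = \sum_{i} x_{ij}$ is the total for column j.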
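Justification for $E_{ij} = R_i C_j / n$ in the case of independence
Let $\pi_{ij} = P[R = i, C = j] = P[R = i]\,P[C = j] = \rho_i \gamma_j$ in the case of independence.
Estimating $\rho_i$ by $R_i / n$ and $\gamma_j$ by $C_j / n$ (the row and column totals divided by n) gives
$$E_{ij} = n\,\hat{\pi}_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i C_j}{n}$$
= Expected frequency in the (i,j)th cell in the case of independence.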
To test H0: R and C are independent against HA: R and C are not independent, use the test statistic
$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(x_{ij} - E_{ij})^2}{E_{ij}}$$
where
Eij = Expected frequency in the (i,j)th cell in the case of independence
xij = observed frequency in the (i,j)th cell
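Sampling distribution of the test statistic when H0 is true: the χ² distribution with ν = (r – 1)(c – 1) degrees of freedom.
Critical and Acceptance Region
Reject H0 if: $\chi^2 \ge \chi^2_{\alpha}$
Accept H0 if: $\chi^2 < \chi^2_{\alpha}$
where $\chi^2_{\alpha}$ is the critical point of the χ² distribution with ν = (r – 1)(c – 1) degrees of freedom having area α above it.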
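Standardized residuals:
$$r_{ij} = \frac{x_{ij} - E_{ij}}{\sqrt{E_{ij}}}$$
Test statistic:
$$\chi^2 = \sum_{i}\sum_{j} r_{ij}^2$$
Degrees of freedom: ν = (r – 1)(c – 1) = (4 – 1)(4 – 1) = 9
Reject H0 using α = 0.05.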
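The whole procedure (expected frequencies, standardized residuals, test statistic and degrees of freedom) can be sketched in Python as follows. The 2 × 2 table of counts is hypothetical and is not the blood pressure and cholesterol data; 3.841 is the tabulated χ² point for 1 degree of freedom with area 0.05 above it. The function name `chi_square_independence` is a hypothetical helper.

```python
import math


def chi_square_independence(table):
    """Chi-square test quantities for a two-way frequency table (list of rows)."""
    row_totals = [sum(row) for row in table]           # R_i
    col_totals = [sum(col) for col in zip(*table)]     # C_j
    n = sum(row_totals)

    # E_ij = R_i * C_j / n, the expected frequency under independence.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # r_ij = (x_ij - E_ij) / sqrt(E_ij), the standardized residuals.
    residuals = [
        [(x - e) / math.sqrt(e) for x, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]

    statistic = sum(res * res for row in residuals for res in row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, residuals, statistic, df


# Hypothetical 2 x 2 table of observed counts, for illustration only.
observed = [[30, 10],
            [20, 40]]
expected, residuals, statistic, df = chi_square_independence(observed)

critical_value = 3.841  # table value: chi-square, 1 df, alpha = 0.05
print("expected frequencies:", expected)
print(f"chi-square = {statistic:.3f} with {df} degree(s) of freedom")
print("reject H0" if statistic >= critical_value else "accept H0")
```

Large positive or negative standardized residuals point to the cells that contribute most to the departure from independence.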