Lecture 8: Tests for Comparison of Proportions
Xiaojinyu@seu.edu.cn
Review
Basic logic of hypothesis testing
t test for comparison of means:
  t test for a one-sample design
  t test for a paired design
  t test for a completely randomized design
Example 7.1
Sample: 25 female patients were randomly selected and their blood hemoglobin (Hb) was measured. The sample mean of Hb is 150 g/L and the standard deviation is 16.5 g/L.
Question: Do these data provide sufficient evidence that the mean Hb of all female patients who suffer from the same disease differs from the mean Hb of normal females (132 g/L)?
Solution to Example 7.1
1. Hypotheses:
   H0: μ = 132, the mean Hb of female patients is equal to that of normal females;
   H1: μ ≠ 132, the mean Hb of female patients is different from that of normal females;
   α = 0.05
2. Test statistic: t = (x̄ − μ0) / (s/√n) = (150 − 132) / (16.5/√25) = 5.45, ν = 25 − 1 = 24
3. t0.025,24 = 2.064 and t > t0.025,24, so P < 0.05. Reject H0 at the level α = 0.05; the difference is statistically significant.
Conclusion: The mean Hb of the patients is different from that of normal females.
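The arithmetic behind the one-sample t statistic can be checked with a short sketch. It assumes the usual form t = (x̄ − μ0)/(s/√n); the function name `one_sample_t` is hypothetical, and the critical value 2.064 is simply the one quoted on the slide.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return (t, degrees of freedom) for a one-sample t test."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - mu0) / standard_error
    return t, n - 1

# Summary statistics from Example 7.1
t_stat, df = one_sample_t(sample_mean=150.0, sample_sd=16.5, n=25, mu0=132.0)

# Two-sided 5% critical value t(0.025, 24), as quoted on the slide
T_CRITICAL = 2.064
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```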
t test for Paired or Matched Data
Example: two instruments (A and B) measure noise (dB) at the same site and time; d = A − B
No.   A    B    d
1     87   86    1
2     65   66   -1
3     74   77   -3
4     95
5     60
6     55   53    2
7     63   62    1
8     88   85    3
9     61   59    2
10    54
Solution to Example
(1) H0: μd = 0, the results from the two instruments are the same.
    H1: μd ≠ 0, the results from the two instruments are different.
    α = 0.05
(2) Calculate t, with ν = 10 − 1 = 9 (t0.025,9 = 2.262).
(3) P > 0.05 (0.2 < P < 0.4), so there is no evidence to reject H0 at the 0.05 level; the results of the two instruments are not significantly different.
Example 7.4 (comparison of LPO between two groups)
INFERENCES ON TWO POPULATIONS USING DATA FROM INDEPENDENT SAMPLES
To compare LPO between obese people and normal-weight people, two groups of 30 subjects each were studied.
Obesity group: n1 = 30; normal group: n2 = 30
Is the observed difference a difference in nature, or only sampling error?
Solution to Example 7.4
(1) H0: μ1 = μ2, the LPO of obese people is the same as that of normal people;
    H1: μ1 ≠ μ2, the LPO of obese people is different from that of normal people;
    α = 0.05.
(2) Calculate t and ν (ν = n1 + n2 − 2).
Solution to Example 7.4 (cont.)
(2) ν = 30 + 30 − 2 = 58 (t0.025,58 = 2.002)
(3) t > t0.025,58 = 2.002, so P < 0.05 (P < 0.0001). Reject H0 at the 0.05 level; the LPO of the obesity group is different from that of the normal group.
Example 8.1: Results from a cancer clinical trial of two drugs
Drug      Effective   Not effective   Total   Sample rate (%)
Drug A    41          4               45      91.1
Drug B    24          11              35      68.6
Total     65          15              80
Are the two population proportions equal or not?
How are the categorical variables distributed in the two populations?
Objective
To infer whether the two population proportions are equal (H0: π1 = π2) by comparing the two sample proportions p1 and p2.
Z-test for Binomial Distribution Data
The method of normal approximation
An alternative to the chi-squared test
Assumptions about n and p
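The slide names the z-test without giving its formula. The sketch below assumes the common pooled form, z = (p1 − p2) / √(pc(1 − pc)(1/n1 + 1/n2)), applied to the Example 8.1 counts; the function name `two_proportion_z` is hypothetical. For a 2×2 table, z² equals the uncorrected chi-square statistic, which is why the two tests are alternatives to each other.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z test for proportions (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # combined rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided P value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Example 8.1: 41 of 45 effective on drug A, 24 of 35 on drug B
z, p_value = two_proportion_z(41, 45, 24, 35)
print(f"z = {z:.3f}, z^2 = {z * z:.3f}, two-sided P = {p_value:.4f}")
```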
Four-fold Table (actual frequencies a, b, c, d)
Drug      Effective   Not effective   Total
Drug A    41 (a)      4 (b)           45
Drug B    24 (c)      11 (d)          35
Total     65          15              80
Outline of the χ² Test
Basic logic
Completely randomized design
One-sample design
Paired design
Statistician behind the Chi-squared Test
Karl Pearson (1857-1936), British
In October 1901 he founded Biometrika together with Weldon and Galton.
The First Thought of Significance Test
The earliest clearly thought-out use of hypothesis testing probably belongs to Karl Pearson.
From: Encyclopedia of Biostatistics, Peter Armitage and Theodore Colton
Basic Logic of the Test
Is a die fair or not?
Throw the die to test it.
Basic logic: results of throwing the die
Face:                        1  2  3  4  5  6
Theoretical frequency (T):   10 for each face
Actual frequency (A):        12  13  15  9  ...
Difference:                  -2  -3  -5  ...
Basic Logic
The test investigates the degree of agreement between the theoretical (expected) frequencies and the actual (observed) frequencies.
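The die example can be sketched as a goodness-of-fit calculation, assuming the Pearson statistic Σ(A − T)²/T. The six counts below are HYPOTHETICAL (the full set of observed counts is not available in the text); they are chosen only so that 60 throws give T = 10 per face. The critical value 11.07 is the standard χ² value for α = 0.05 with 6 − 1 = 5 degrees of freedom.

```python
def goodness_of_fit_chi2(observed, expected):
    """Pearson statistic: sum over categories of (A - T)^2 / T."""
    return sum((a - t) ** 2 / t for a, t in zip(observed, expected))

# HYPOTHETICAL counts for 60 throws of a die (illustration only)
observed = [12, 13, 15, 9, 6, 5]
n_throws = sum(observed)
expected = [n_throws / 6] * 6               # a fair die: T = 10 per face

chi2 = goodness_of_fit_chi2(observed, expected)

CHI2_CRITICAL_DF5 = 11.07                   # chi-square(0.05, df = 5)
fair_die_rejected = chi2 >= CHI2_CRITICAL_DF5

print(f"chi2 = {chi2:.2f}, reject 'die is fair': {fair_die_rejected}")
```

The larger the disagreement between A and T, the larger the statistic; only a value beyond the critical value counts as evidence against fairness.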
χ² Distribution
Basic Logic of the χ² Test
χ² = Σ (A − T)² / T, where A is the actual frequency and T is the theoretical frequency
χ² Test for Completely Randomized Design
2×2 contingency table: comparison between two sample proportions or percentages
R×C contingency table: comparison among more than two sample proportions or percentages
Steps of the Chi-square Test
1. State the hypotheses and the significance level
2. Compute the test statistic and the degrees of freedom
3. Compare with the critical value, or obtain the exact P value
4. Make the statistical decision and draw the conclusion
Test Hypothesis & Significance Level
H0: πA = πB; H1: πA ≠ πB; α = 0.05
H0 states that the two population probabilities are equal with respect to the effect of the drugs, i.e., there is no relationship between drug and effect.
Basically, the chi-square test of independence tests whether the columns are contingent on the rows in the table.
If H0 Is True
We take the combined (pooled) rate Pc as the estimate of the common population rate; this is the theoretical effective rate: Pc = (41 + 24) / (45 + 35) = 65/80 = 81.25%.
Theoretical Effective Rate
Drug           Positive    Negative    Row total
A              A11 = 41    A12 = 4     n1
B              A21 = 24    A22 = 11    n2
Column total   m1          m2          n
If H0 is true, we expect to find n1×Pc effective patients and n1×(1 − Pc) ineffective patients in group A, and likewise n2×Pc and n2×(1 − Pc) in group B.
Theoretical Frequency
Drug           Positive                  Negative                 Row total
A              A11 = 41 (T11 = 36.56)    A12 = 4 (T12 = 8.44)     n1
B              A21 = 24 (T21 = 28.44)    A22 = 11 (T22 = 6.56)    n2
Column total   m1                        m2                       n
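The theoretical frequencies can be reproduced with a small sketch, assuming the usual rule T(i, j) = (row total × column total) / n, which is equivalent to multiplying each group size by the pooled rate. The function name `expected_frequencies` is hypothetical.

```python
def expected_frequencies(table):
    """Theoretical frequencies under H0: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Actual frequencies from Example 8.1 (rows: drug A, drug B)
observed = [[41, 4],
            [24, 11]]

expected = expected_frequencies(observed)
for label, row in zip(("Drug A", "Drug B"), expected):
    print(label, [round(t, 2) for t in row])
```

Rounded to two decimals, these should agree with the T values shown in the table.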
Test Statistic: A (T)
Drug A    41 (36.56)    4 (8.44)
Drug B    24 (28.44)    11 (6.56)
Calculation of Test Statistic
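A minimal sketch of the calculation, assuming the Pearson form χ² = Σ(A − T)²/T. It evaluates the statistic twice: once with the two-decimal T values shown on the slides, and once with unrounded T values, to show how much the rounding matters. The function name `pearson_chi2` is hypothetical.

```python
def pearson_chi2(observed, expected):
    """Sum of (A - T)^2 / T over all cells of the table."""
    return sum(
        (a - t) ** 2 / t
        for obs_row, exp_row in zip(observed, expected)
        for a, t in zip(obs_row, exp_row)
    )

observed = [[41, 4], [24, 11]]

# T values as printed on the slides (rounded to two decimals)
expected_rounded = [[36.56, 8.44], [28.44, 6.56]]

# Unrounded T values: row total * column total / n
row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)
expected_exact = [[r * c / n for c in col_totals] for r in row_totals]

chi2_rounded = pearson_chi2(observed, expected_rounded)
chi2_exact = pearson_chi2(observed, expected_exact)
print(f"with rounded T: {chi2_rounded:.4f}; with exact T: {chi2_exact:.4f}")
```

The value with rounded T is the one used in the lecture's decision step; the small gap to the unrounded value does not change the conclusion.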
Distribution of Test Statistic
If H0 is true, the test statistic is approximately distributed as χ² with (r − 1)(c − 1) degrees of freedom.
df = (r − 1)(c − 1) = (2 − 1)(2 − 1) = 1
χ² Critical Values
χ² Distribution
The chi-square distribution results when independent variables with standard normal distributions are squared and summed.
It is a positively skewed distribution.
χ² values will never be negative; the minimum is 0.
A χ² value close to 0 indicates that the variables are independent of one another.
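The "squared and summed" definition can be illustrated by simulation. This is only a sketch: the seed, the number of draws, and the function name `simulate_chi2` are arbitrary choices. With df = 1, roughly 5% of the simulated values should fall at or above 3.84, and none can be negative.

```python
import random

def simulate_chi2(df, n_draws, seed=0):
    """Draw chi-square values as sums of df squared standard normal variables."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]

draws = simulate_chi2(df=1, n_draws=100_000)

mean_value = sum(draws) / len(draws)                      # theory: equals df
tail_fraction = sum(d >= 3.84 for d in draws) / len(draws)

print(f"min = {min(draws):.4f}, mean = {mean_value:.3f}, "
      f"fraction >= 3.84: {tail_fraction:.4f}")
```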
Decision Rule
We set the probability of committing a type I error at 0.05. With 1 degree of freedom the critical value is 3.84, so if the computed χ² value is equal to or greater than 3.84, we reject H0.
Statistical Decision and Conclusion
Since 6.5732 > 3.84, we reject H0. We conclude that the two populations are not homogeneous with respect to the effect of the drug: the effects of drug A and drug B are not equivalent.
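The decision step can also be sketched with a P value instead of a table lookup. The sketch assumes the standard shortcut for a four-fold table, χ² = n(ad − bc)² / ((a+b)(c+d)(a+c)(b+d)), and uses the fact that a χ² variable with 1 degree of freedom is a squared standard normal, so P = erfc(√(χ²/2)). The function names are hypothetical. Because it works with unrounded theoretical frequencies, its statistic is slightly below the 6.5732 obtained from rounded T values.

```python
import math

def chi2_fourfold(a, b, c, d):
    """Uncorrected chi-square for a 2x2 table via the shortcut formula."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

chi2 = chi2_fourfold(41, 4, 24, 11)      # Example 8.1
p_value = p_value_df1(chi2)

CRITICAL_VALUE = 3.84                    # chi-square(0.05, df = 1)
reject_h0 = chi2 >= CRITICAL_VALUE

print(f"chi2 = {chi2:.4f}, P = {p_value:.4f}, reject H0: {reject_h0}")
```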
Basis of the Chi-square Test of Independence
Construct the bivariate table as it would look if there were no relationship.
Compare the real table with the hypothetical one.
Measure how different they are.
If the differences are large, we conclude that there is a significant relationship; if not, we conclude that it is just chance that our numbers vary.
Basic Assumption on Sample Size
The sample size must be sufficiently large to ensure that the approximation is valid.
Each expected frequency T should be at least 5: n ≥ 40 and T ≥ 5.
Because the χ² distribution is continuous while the observed counts are discrete, the critical values are only approximately correct for determining the P-value.
Correction for Continuity
Example of Correction
Table 8-2 Results from a trial of treatments for some cancer patients
χ² = 0.595
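The correction formula itself is not preserved in the text. The sketch below assumes it is the usual Yates form, χ² = Σ(|A − T| − 0.5)²/T. The counts of Table 8-2 are not available here, so the Example 8.1 counts are used purely to show the mechanics (all T ≥ 5 in that table, so the correction would not actually be required there). The function name `chi2_2x2` is hypothetical.

```python
def chi2_2x2(a, b, c, d, continuity_correction=False):
    """Chi-square for a 2x2 table, optionally with Yates' continuity correction."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    shift = 0.5 if continuity_correction else 0.0
    total = 0.0
    for i in range(2):
        for j in range(2):
            t = row_totals[i] * col_totals[j] / n
            # Shrink each |A - T| by 0.5 (never below 0) when correcting
            deviation = max(abs(observed[i][j] - t) - shift, 0.0)
            total += deviation ** 2 / t
    return total

uncorrected = chi2_2x2(41, 4, 24, 11)
corrected = chi2_2x2(41, 4, 24, 11, continuity_correction=True)
print(f"uncorrected = {uncorrected:.4f}, corrected = {corrected:.4f}")
```

The corrected statistic is always the smaller of the two, which makes the test more conservative.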
Fisher Exact Test
When n < 40, or when any T is less than 1, the correction for continuity is not enough; Fisher's exact test should be used.
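A minimal sketch of Fisher's exact test for a four-fold table, assuming the hypergeometric model with all margins fixed and one common two-sided convention (sum the probabilities of all tables that are no more likely than the observed one). The small table used here is HYPOTHETICAL, and the function name `fisher_exact_two_sided` is not from the lecture.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    n1, n2 = a + b, c + d                # row totals
    m1 = a + c                           # first column total
    n = n1 + n2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(n1, x) * comb(n2, m1 - x) / comb(n, m1)

    p_observed = prob(a)
    lowest, highest = max(0, m1 - n2), min(n1, m1)
    # Small relative tolerance guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# HYPOTHETICAL small table (n = 8 < 40), for illustration only
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided exact P = {p_value:.4f}")
```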
χ² Test for a Paired Design
A paired design with a two-category variable
A paired design with a variable of more than two categories
A Paired Design
Table 8-3 Results from two culture media used to detect some bacteria
B        A: +      A: −      Total
+        22 (a)    18 (b)    40
−        2 (c)     14 (d)    16
Total    24        32        56
The two media were evaluated on the same sample of 56 specimens; the table gives the counts of positive and negative results.
McNemar Test
Q. McNemar (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12, 153-157.
Origin in psychology: two correlated dichotomous responses were to be compared.
Objective of the Test
Are the positive rates of the two media the same?
H0: πA = πB, the positive rates of the two culture media are the same.
How do we calculate the positive rates?
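Marginal Proportions
Positive rate of A, positive rate of B
B        A: +      A: −      Total
+        22 (a)    18 (b)    40
−        2 (c)     14 (d)    16
Total    24        32        56
The two positive rates are the marginal proportions (a + b)/n = 40/56 = 71.4% and (a + c)/n = 24/56 = 42.9%. Their difference is (b − c)/n, so what the test needs are the counts in the off-diagonal cells, b and c.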
Solution to Example 8.3
χ² = (|b − c| − 1)² / (b + c) = (|18 − 2| − 1)² / (18 + 2) = 11.25, ν = 1
The critical value is χ²0.05,1 = 3.84. Since 11.25 > 3.84, the P-value associated with the test statistic of 11.25 is less than 0.05. We reject H0: the positive rates of the two culture media are not the same.
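The McNemar calculation can be sketched as follows, assuming the continuity-corrected form (|b − c| − 1)²/(b + c), which reproduces the value 11.25 for b = 18 and c = 2, alongside the uncorrected form (b − c)²/(b + c) for comparison. The function name `mcnemar_chi2` is hypothetical; the P value uses the 1-degree-of-freedom identity P = erfc(√(χ²/2)).

```python
import math

def mcnemar_chi2(b, c, continuity_correction=True):
    """McNemar statistic from the two discordant (off-diagonal) counts."""
    difference = abs(b - c)
    if continuity_correction:
        difference = max(difference - 1, 0)
    return difference ** 2 / (b + c)

# Off-diagonal cells of Table 8-3
b, c = 18, 2

chi2_corrected = mcnemar_chi2(b, c)
chi2_uncorrected = mcnemar_chi2(b, c, continuity_correction=False)

# Upper-tail probability for chi-square with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2_corrected / 2))
reject_h0 = chi2_corrected >= 3.84

print(f"corrected = {chi2_corrected:.2f}, uncorrected = {chi2_uncorrected:.2f}, "
      f"P = {p_value:.4f}, reject H0: {reject_h0}")
```

Only the discordant pairs enter the statistic; the 22 + 14 concordant specimens carry no information about which medium is more often positive.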
Requirements for the χ² Test
The data must be a random sample from the population.
Data must be raw frequencies (actual counts).
Observations must be arranged independently.
Categories of each independent variable (I.V.) must be mutually exclusive and exhaustive.
The sample size must be sufficiently large (at least 40).
Important Limitation of the χ² Test
The χ² test is sensitive to sample size: doubling the sample size doubles χ² if the distribution stays the same.
Remember the distinction between statistical significance and substantive significance.
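The sensitivity to sample size is easy to verify numerically. The sketch below doubles every cell of the Example 8.1 table, which keeps both sample rates unchanged, and compares the two uncorrected statistics. The shortcut formula and the function name `chi2_fourfold` are assumptions of this sketch, not part of the lecture.

```python
def chi2_fourfold(a, b, c, d):
    """Uncorrected chi-square for a 2x2 table via the shortcut formula."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

original = chi2_fourfold(41, 4, 24, 11)          # n = 80
doubled = chi2_fourfold(82, 8, 48, 22)           # same proportions, n = 160
ratio = doubled / original

print(f"n = 80: {original:.4f}; n = 160: {doubled:.4f}; ratio = {ratio:.2f}")
```

The difference in rates (91.1% versus 68.6%) is identical in both tables; only the amount of evidence changes, which is why statistical significance should not be read as the size of an effect.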
At the end of this session you will be able to:
Prepare a contingency table
Recognize which study designs are suitable for the chi-square test, and follow its basic procedure
Understand the assumptions and limitations of the chi-square test
Thank you for your attention!