# Comparison of proportions, Part I (Instructor: 李奕慧)

## Presentation transcript:

1 Comparison of proportions, Part I. Instructor: 李奕慧 (yihwei@mail.tcu.edu.tw)

2 Lecture overview: cross tabulations (2 x 2 tables, R x C tables); chi-square test for independence; chi-square test for trend.

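3 Cross tabulation and chi-square test for independence: exploring the association between two categorical variables.

Example: Is wearing a helmet associated with a motorcyclist's probability of head injury in a crash?

| Head injury | Helmet: yes | Helmet: no | Total |
|---|---|---|---|
| Yes | 12 | 62 | 74 |
| No | 88 | 38 | 126 |
| Total | 100 | 100 | 200 |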

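4 Data type (the SPSS "Measure" setting): Scale = continuous variable; Nominal/Ordinal = categorical variable. Values: define the categories of a categorical variable, e.g. 0 = "no", 1 = "yes". Data input: Helmet.sav dataset.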

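5 Observed frequencies: $O_{11}$, $O_{12}$, $O_{21}$, $O_{22}$

| Head injury | Helmet: yes | Helmet: no | Total |
|---|---|---|---|
| Yes | $O_{11} = 12$ | $O_{21} = 62$ | 74 |
| No | $O_{12} = 88$ | $O_{22} = 38$ | 126 |
| Total | 100 | 100 | 200 |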

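6 Under $H_0$, the expected frequencies are $E_{11}$, $E_{12}$, $E_{21}$, $E_{22}$:

$$E_{11} = 200 \times P(\text{helmet}) \times P(\text{head injury}) = 200 \times \frac{100}{200} \times \frac{74}{200} = 37$$

$$E_{12} = 200 \times P(\text{helmet}) \times P(\text{no head injury}) = 200 \times \frac{100}{200} \times \frac{126}{200} = 63 = 100 - 37$$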

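7 Expected frequencies: $E_{11}$, $E_{12}$, $E_{21}$, $E_{22}$

| Head injury | Helmet: yes | Helmet: no | Total |
|---|---|---|---|
| Yes | $E_{11} = 37$ | $E_{21} = 37$ | 74 |
| No | $E_{12} = 63$ | $E_{22} = 63$ | 126 |
| Total | 100 | 100 | 200 |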
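The expected frequencies on this slide come from multiplying each row total by each column total and dividing by the grand total. A minimal Python sketch of that arithmetic follows; the function name `expected_counts` and the list-of-lists layout are illustrative choices, not part of the lecture.

```python
# Observed counts from the helmet example.
# Rows: head injury (yes, no); columns: helmet (yes, no).
observed = [
    [12, 62],
    [88, 38],
]


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print(row)
```

The same function works for any R x C table, since it only uses the marginal totals.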

8 Chi-square test ($\chi^2$-test) for independence

9

10 Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
|---|---|---|---|---|---|
| Pearson Chi-Square | 53.625 (a) | 1 | .000 | | |
| Continuity Correction (b) | 51.502 | 1 | .000 | | |
| Likelihood Ratio | 57.384 | 1 | .000 | | |
| Fisher's Exact Test | | | | .000 | .000 |
| Linear-by-Linear Association | 53.357 | 1 | .000 | | |
| N of Valid Cases | 200 | | | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 37.00.
b. Computed only for a 2x2 table.

For large samples, use the Pearson chi-square test. For small samples, use Fisher's exact test.
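The Pearson and continuity-corrected statistics in the SPSS output can be checked by hand. The sketch below is one way to do it with the Python standard library only; `chi_square_2x2` is a hypothetical helper name, and the p-value uses the fact that for df = 1 the chi-square upper tail equals `erfc(sqrt(x / 2))`.

```python
import math


def chi_square_2x2(table, yates=False):
    """Return (statistic, p_value) for a 2 x 2 table of counts (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - e)
            if yates:
                # Yates's continuity correction shrinks |O - E| by 0.5.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / e
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: head injury (yes, no); columns: helmet (yes, no).
helmet = [[12, 62], [88, 38]]
pearson, pearson_p = chi_square_2x2(helmet)
corrected, corrected_p = chi_square_2x2(helmet, yates=True)
print(round(pearson, 3), round(corrected, 3))
```

With the helmet data this gives 53.625 and 51.502, matching the "Pearson Chi-Square" and "Continuity Correction" rows; both p-values are far below 0.001.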

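11 injury * helmet Crosstabulation (Count)

| injury | helmet: no | helmet: yes | Total |
|---|---|---|---|
| no | 38 | 88 | 126 |
| yes | 62 | 12 | 74 |
| Total | 100 | 100 | 200 |

Row variable: Injury. Column variable: Helmet. Mathematical convention: Row * Column.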

12 SPSS Menu Analyze > Descriptive Statistics > Crosstabs

13 Another way to input data: Data > Weight Cases. Helmet2.sav dataset

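14 When the sample size is small, or when any of $E_{11}$, $E_{12}$, $E_{21}$, $E_{22}$ is less than 5, the chi-square test is not accurate enough; use Yates's continuity-corrected test or Fisher's exact test instead.

| Head injury | Helmet: yes | Helmet: no | Total |
|---|---|---|---|
| Yes | $O_{11} = 1$ | $O_{21} = 6$ | 7 |
| No | $O_{12} = 13$ | $O_{22} = 10$ | 23 |
| Total | 14 | 16 | 30 |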

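15 Summary results: $\chi^2$-test = 3.846, P-value = 0.050. Yates's corrected $\chi^2$-test = 2.337, P-value = 0.126. Fisher's exact test, P-value = 0.086.

| Head injury | Helmet: yes | Helmet: no | Total |
|---|---|---|---|
| Yes | 1 | 6 | 7 |
| No | 13 | 10 | 23 |
| Total | 14 | 16 | 30 |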

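16 injury * helmet Crosstabulation

| injury | | helmet: no | helmet: yes | Total |
|---|---|---|---|---|
| no | Count | 10 | 13 | 23 |
| | % within helmet | 62.5% | 92.9% | 76.7% |
| yes | Count | 6 | 1 | 7 |
| | % within helmet | 37.5% | 7.1% | 23.3% |
| Total | Count | 16 | 14 | 30 |
| | % within helmet | 100.0% | 100.0% | 100.0% |

Table example in a research article:

| | Helmet (n=14) | No Helmet (n=16) | P-value |
|---|---|---|---|
| Head injury | 1 (7) | 6 (38) | 0.086 |

Values are number of study subjects (percentage). P-value is derived from Fisher's exact test.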

17 Chi-Square Tests (Helmet3.sav)

| | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
|---|---|---|---|---|---|
| Pearson Chi-Square | 3.846 (a) | 1 | .050 | | |
| Continuity Correction (b) | 2.337 | 1 | .126 | | |
| Likelihood Ratio | 4.221 | 1 | .040 | | |
| Fisher's Exact Test | | | | .086 | .061 |
| Linear-by-Linear Association | 3.718 | 1 | .054 | | |
| N of Valid Cases | 30 | | | | |

a. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 3.27.
b. Computed only for a 2x2 table.
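Fisher's exact test can be reproduced from the hypergeometric distribution of one cell given fixed margins. The sketch below is an illustration rather than SPSS's own routine. It assumes the common two-sided definition (sum the probabilities of all tables no more likely than the observed one), and `fisher_exact_2x2` is a hypothetical helper name.

```python
from math import comb


def fisher_exact_2x2(table):
    """Return (two_sided_p, lower_tail_p) for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k,
        # with all row and column totals held fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Two-sided: add up every table at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    two_sided = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    # One-sided (lower tail): top-left cell as small as observed or smaller.
    lower_tail = sum(prob(k) for k in range(k_min, a + 1))
    return two_sided, lower_tail


# Rows: head injury (yes, no); columns: helmet (yes, no).
small = [[1, 6], [13, 10]]
two_sided_p, one_sided_p = fisher_exact_2x2(small)
print(round(two_sided_p, 3), round(one_sided_p, 3))
```

For the 30-subject table this gives 0.086 (two-sided) and 0.061 (one-sided), the values in the "Fisher's Exact Test" row. The lower tail is the relevant one-sided direction here because helmet wearers had fewer head injuries than expected under independence.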

18 R x C tables: a row variable with r levels and a column variable with c levels. Test for independence between the row and column variables using the chi-square test with df = (r-1) x (c-1).

19 Test $H_0$: the proportions of each type of death certificate are identical in the two hospitals (there is no association between hospital type and death certificate status).

Death certificate status by hospital:

| Hospital | Confirmed accurate | Inaccurate, no change | Incorrect, recoding | Total |
|---|---|---|---|---|
| A (row %) | 157 (68.6%) | 18 (7.9%) | 54 (23.6%) | 229 |
| B (row %) | 268 (77.5%) | 44 (12.7%) | 34 (9.8%) | 346 |
| Total | 425 | 62 | 88 | 575 |

20 Expected counts under $H_0$ (independence): $E_{rc} = (n_r \times n_c)/575$

| Hospital | Confirmed accurate | Inaccurate, no change | Incorrect, recoding | Total |
|---|---|---|---|---|
| A | 169.3 | 24.7 | 35.0 | 229 |
| B | 255.7 | 37.3 | 53.0 | 346 |
| Total | 425 | 62 | 88 | 575 |

21 Chi-square test for no association (independence) between hospital type and death certificate status: df = (2-1)(3-1) = 2, P < 0.001. Reject $H_0$ and conclude that there is an association between hospital type and death certificate status. The data suggest that Hospital A has a larger proportion of death certificates that were incorrect and required recoding than Hospital B.
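The same calculation extends to any R x C table. Below is a minimal sketch using the hospital counts from the slides; `chi_square_rxc` is an illustrative name. The p-value line relies on a closed form that holds only for df = 2, where the chi-square upper tail is `exp(-x / 2)`; other degrees of freedom would need a statistics library.

```python
import math


def chi_square_rxc(table):
    """Return (statistic, df) for the Pearson chi-square test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Rows: hospital A, B; columns: the three death certificate statuses.
hospital = [
    [157, 18, 54],
    [268, 44, 34],
]
hospital_stat, hospital_df = chi_square_rxc(hospital)

# Closed-form upper tail, valid for df = 2 only.
hospital_p = math.exp(-hospital_stat / 2)
print(round(hospital_stat, 3), hospital_df, hospital_p < 0.001)
```

This gives a statistic of about 21.523 on 2 degrees of freedom, in line with the P < 0.001 conclusion above.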

22 Hosiptal * Accuracy Crosstabulation (Hospital.sav)

| Hosiptal | | accurate | minor inaccuracy | major inaccuracy | Total |
|---|---|---|---|---|---|
| A | Count | 157 | 18 | 54 | 229 |
| | % within Hosiptal | 68.6% | 7.9% | 23.6% | 100.0% |
| B | Count | 268 | 44 | 34 | 346 |
| | % within Hosiptal | 77.5% | 12.7% | 9.8% | 100.0% |
| Total | Count | 425 | 62 | 88 | 575 |
| | % within Hosiptal | 73.9% | 10.8% | 15.3% | 100.0% |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) | Point Probability |
|---|---|---|---|---|---|---|
| Pearson Chi-Square | 21.523 (a) | 2 | .000 | | | |
| Likelihood Ratio | 21.189 | 2 | .000 | | | |
| Fisher's Exact Test | 21.015 | | | .000 | | |
| Linear-by-Linear Association | 12.864 (b) | 1 | .000 | | | |
| N of Valid Cases | 575 | | | | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 24.69.
b. The standardized statistic is -3.587.

23 Chi-square test for trend. If one or both variables are ordinal, the chi-square test for trend is appropriate. It is also known as the Mantel-Haenszel test for trend. In R x 2 tables, $H_0: p_1 = p_2 = \dots = p_c$ versus $H_a: p_1 < p_2 < \dots < p_c$ (an increasing trend) or $p_1 > p_2 > \dots > p_c$ (a decreasing trend).

24 Examples for chi-square test for trend. $H_0: P_{65-69} = P_{70-74} = P_{75+}$. $H_a: P_{65-69} > P_{70-74} > P_{75+}$ or $P_{65-69} < P_{70-74} < P_{75+}$.

25 agegp * counseled Crosstabulation (Smoking.sav)

| agegp | | counseled: no | counseled: yes | Total |
|---|---|---|---|---|
| 65-69 | Count | 3745 | 3018 | 6763 |
| | % within agegp | 55.4% | 44.6% | 100.0% |
| 70-74 | Count | 3013 | 2181 | 5194 |
| | % within agegp | 58.0% | 42.0% | 100.0% |
| >=75 | Count | 3111 | 1675 | 4786 |
| | % within agegp | 65.0% | 35.0% | 100.0% |
| Total | Count | 9869 | 6874 | 16743 |
| | % within agegp | 58.9% | 41.1% | 100.0% |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 110.058 (a) | 2 | .000 |
| Likelihood Ratio | 111.078 | 2 | .000 |
| Linear-by-Linear Association | 103.086 | 1 | .000 |
| N of Valid Cases | 16743 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 1964.94.
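The "Linear-by-Linear Association" row is the trend statistic. One standard way to compute it is $M^2 = (N - 1) r^2$ with df = 1, where $r$ is the Pearson correlation between row scores and column scores. The sketch below assumes equally spaced scores (1, 2, 3 for the age groups; 0, 1 for counseled no/yes); the scores actually used in Smoking.sav are not shown on the slides, and `linear_by_linear` is an illustrative name.

```python
import math


def linear_by_linear(table, row_scores, col_scores):
    """Return (statistic, r, p_value) for the linear-by-linear trend test."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    mean_u = sum(u * t for u, t in zip(row_scores, row_totals)) / n
    mean_v = sum(v * t for v, t in zip(col_scores, col_totals)) / n
    s_uu = sum(t * (u - mean_u) ** 2 for u, t in zip(row_scores, row_totals))
    s_vv = sum(t * (v - mean_v) ** 2 for v, t in zip(col_scores, col_totals))
    s_uv = sum(
        count * (u - mean_u) * (v - mean_v)
        for u, row in zip(row_scores, table)
        for v, count in zip(col_scores, row)
    )
    r = s_uv / math.sqrt(s_uu * s_vv)
    statistic = (n - 1) * r ** 2
    # Upper tail of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, r, p_value


# Rows: age groups 65-69, 70-74, >=75; columns: counseled (no, yes).
counseled = [
    [3745, 3018],
    [3013, 2181],
    [3111, 1675],
]
trend_stat, trend_r, trend_p = linear_by_linear(counseled, [1, 2, 3], [0, 1])
print(round(trend_stat, 2), trend_r < 0)
```

With these scores the statistic comes out at about 103.09, agreeing with the SPSS value of 103.086. The negative correlation indicates a decreasing trend: the proportion counseled falls as age group rises.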

26 Am J Public Health. 2004 Oct;94(10):1768-74 Exercise

27 Categorical data: depression (yes or no). $H_0$: the proportion of subjects with depression in women equals the proportion in men, $H_0: p_f = p_m$. $H_a$: the proportion of subjects with depression in women differs from the proportion in men, $H_a: p_f \neq p_m$.

28 sex * depression Crosstabulation (Depression.sav)

| sex | | depression: no | depression: yes | Total |
|---|---|---|---|---|
| male | Count | 4823 | 349 | 5172 |
| | % within sex | 93.3% | 6.7% | 100.0% |
| female | Count | 7461 | 716 | 8177 |
| | % within sex | 91.2% | 8.8% | 100.0% |
| Total | Count | 12284 | 1065 | 13349 |
| | % within sex | 92.0% | 8.0% | 100.0% |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
|---|---|---|---|---|---|
| Pearson Chi-Square | 17.406 (a) | 1 | .000 | | |
| Continuity Correction (b) | 17.134 | 1 | .000 | | |
| Likelihood Ratio | 17.759 | 1 | .000 | | |
| Fisher's Exact Test | | | | .000 | .000 |
| Linear-by-Linear Association | 17.405 | 1 | .000 | | |
| N of Valid Cases | 13349 | | | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 412.63.
b. Computed only for a 2x2 table.

29 Exercise: comparison among age groups. $H_0$: the proportions of subjects with depression are the same among the three age groups, $H_0: p_{75-79} = p_{80-84} = p_{85+}$. $H_a$: the proportions of subjects with depression differ among the three age groups (test for independence). $H_a$: depression prevalence increases with age group (test for trend).

30 Practice! Practice! Practice! Thank you!