Categorical Data Analysis


1 If we live with a deep sense of gratitude, our life will be greatly embellished.

2 Categorical Data Analysis
Chapter 10: Tests for Matched Pairs

3 Meta Analysis
Also known as stratified analysis
Section 6.3.2: Cochran-Mantel-Haenszel test; test for conditional independence
Situation: When another variable (the stratum Z) may “pollute” the effect of a categorical explanatory variable X on a categorical response Y
Goal: Study the effect of X on Y while controlling for the stratification variable Z, without assuming a model

4 Example: Respiratory Improvement (SAS textbook, P. 46)
Center  Treatment  Yes  No  Total
1       Test        29  16     45
1       Placebo     14  31     45
1       Total       43  47     90
2       Test        37   8     45
2       Placebo     24  21     45
2       Total       61  29     90

5 SAS Output
Summary Statistics for trtmnt by response
Controlling for center
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic  Alternative Hypothesis    DF  Value  Prob
-----------------------------------------------------
           Nonzero Correlation                  <.0001
           Row Mean Scores Differ               <.0001
           General Association                  <.0001
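To make the test concrete, here is a minimal sketch of the Cochran-Mantel-Haenszel statistic for a set of 2x2 tables, applied to the respiratory improvement counts. The function names are hypothetical, no continuity correction is applied, and the Center 2 Test/No count is taken as 8, the value that makes the eight cells add up to the reported Total Sample Size of 180; treat that cell as an assumption.

```python
from math import erfc, sqrt


def cmh_statistic(strata):
    """CMH chi-square (no continuity correction) for a list of 2x2 tables.

    Each stratum is ((n11, n12), (n21, n22)).
    """
    diff_sum = 0.0  # sum over strata of n11 minus its expected value
    var_sum = 0.0   # sum over strata of the hypergeometric variance of n11
    for (n11, n12), (n21, n22) in strata:
        n = n11 + n12 + n21 + n22
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        diff_sum += n11 - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    return diff_sum ** 2 / var_sum


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return erfc(sqrt(x / 2.0))


# Rows: Test, Placebo. Columns: Yes, No. One table per center.
centers = [
    ((29, 16), (14, 31)),  # Center 1
    ((37, 8), (24, 21)),   # Center 2 (the 8 is an assumption, see lead-in)
]

q_cmh = cmh_statistic(centers)
p_value = chi2_sf_df1(q_cmh)
print(f"CMH = {q_cmh:.4f}, df = 1, p = {p_value:.2e}")
```

With only two rows and two columns per stratum, the statistic has 1 degree of freedom, and a p-value below .0001 is consistent with the Prob column above.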

6 What to Do if Dependent
(Section 6.3.5) When X and Y are NOT conditionally independent given Z, we would like to test for homogeneous association
(Section 6.3.6) If X, Y, Z have homogeneous association, we would like to estimate the common conditional odds ratio for X, Y given Z

7 SAS Output
Estimates of the Common Relative Risk (Row1/Row2)
Type of Study    Method             Value    % Confidence Limits
-----------------------------------------------------------------
Case-Control     Mantel-Haenszel
(Odds Ratio)     Logit
Cohort           Mantel-Haenszel
(Col1 Risk)      Logit
Cohort           Mantel-Haenszel
(Col2 Risk)      Logit

Breslow-Day Test for Homogeneity of the Odds Ratios
---------------------------------------------------
Chi-Square
DF
Pr > ChiSq

Total Sample Size = 180
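As a rough illustration of the common conditional odds ratio, the sketch below computes the per-center odds ratios and the Mantel-Haenszel pooled estimate for the respiratory improvement counts. The function names are hypothetical, and the Center 2 Test/No count of 8 is an assumption chosen so that the cells add up to the Total Sample Size of 180. The Breslow-Day test itself is not implemented here.

```python
def odds_ratio(table):
    """Sample odds ratio n11*n22 / (n12*n21) of one 2x2 table."""
    (n11, n12), (n21, n22) = table
    return (n11 * n22) / (n12 * n21)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel estimate of the common odds ratio across 2x2 strata."""
    numerator = 0.0
    denominator = 0.0
    for (n11, n12), (n21, n22) in strata:
        n = n11 + n12 + n21 + n22
        numerator += n11 * n22 / n
        denominator += n12 * n21 / n
    return numerator / denominator


# Rows: Test, Placebo. Columns: Yes, No. One table per center.
centers = [
    ((29, 16), (14, 31)),  # Center 1
    ((37, 8), (24, 21)),   # Center 2 (the 8 is an assumption, see lead-in)
]

stratum_ors = [odds_ratio(t) for t in centers]
or_mh = mantel_haenszel_or(centers)
print("per-center odds ratios:", [round(v, 3) for v in stratum_ors])
print("Mantel-Haenszel common odds ratio:", round(or_mh, 4))
```

The two per-center odds ratios are close to each other, which is the situation in which a single common odds ratio is a sensible summary.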

8 Matched-pair Data
Comparing categorical responses for two “paired” samples
When either
each sample has the same subjects (that is, subjects are measured twice)
or
a natural pairing exists between each subject in one sample and a subject from the other sample (e.g., twins)

9 Example: Rating for Prime Minister
                 Second Survey
First Survey     Approve   Disapprove
Approve              794          150
Disapprove            86          570

10 Marginal Homogeneity
The probabilities of “success” for both samples are identical (for a 2x2 table, this is the same as “symmetry” across the main diagonal)
E.g., the probability of approval is the same at the first and second surveys

11 Estimating Differences of Proportions
Sample estimate: p+1 - p1+ = (n21 - n12)/n
Standard error of p+1 - p1+ (based on the multinomial distribution of the data): SE = sqrt[(n12 + n21) - (n12 - n21)^2/n] / n
Asymptotic (1-a) confidence interval: (p+1 - p1+) +/- z_{a/2} * SE
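A small sketch of this calculation for the Prime Minister ratings follows. It assumes the usual large-sample standard error for a difference of dependent proportions, SE = sqrt[(n12 + n21) - (n12 - n21)^2/n] / n; the function name and argument order are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def paired_diff_ci(n11, n12, n21, n22, alpha=0.05):
    """Estimate p+1 - p1+ with its standard error and a Wald interval."""
    n = n11 + n12 + n21 + n22
    diff = (n21 - n12) / n  # equals (n11 + n21)/n - (n11 + n12)/n
    se = sqrt((n12 + n21) - (n12 - n21) ** 2 / n) / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return diff, se, (diff - z * se, diff + z * se)


# Rows: first survey (approve, disapprove); columns: second survey.
diff, se, (lower, upper) = paired_diff_ci(794, 150, 86, 570)
print(f"difference = {diff:.4f}, SE = {se:.5f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f})")
```

Only the off-diagonal counts n12 and n21 enter the estimate and its standard error apart from the total n, which is why the interval is narrower than one computed as if the two surveys were independent samples.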

12 McNemar Test (for 2x2 Tables only)
See SAS textbook Sec 3.7 (p. 40)
Ho: marginal homogeneity
Ha: no marginal homogeneity
A special case of the C-M-H test; an approximate test (when n*=n12+n21>10)
Exact test (when n*=n12+n21<10)
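Both versions of the test can be sketched in a few lines. The approximate version refers (n12 - n21)^2/(n12 + n21) to a chi-square distribution with 1 df, and the exact version is taken here to be a two-sided binomial test on n12 out of n* with success probability 1/2. The function names are hypothetical, and the counts are the off-diagonal cells of the Prime Minister example.

```python
from math import comb, erfc, sqrt


def mcnemar_asymptotic(n12, n21):
    """Chi-square statistic (1 df) and its large-sample p-value."""
    stat = (n12 - n21) ** 2 / (n12 + n21)
    p_value = erfc(sqrt(stat / 2.0))  # upper tail of chi-square with 1 df
    return stat, p_value


def mcnemar_exact(n12, n21):
    """Two-sided exact p-value from Binomial(n12 + n21, 1/2)."""
    n_star = n12 + n21
    k = min(n12, n21)
    lower_tail = sum(comb(n_star, i) for i in range(k + 1)) / 2 ** n_star
    return min(1.0, 2 * lower_tail)


stat, p_asym = mcnemar_asymptotic(150, 86)
p_exact = mcnemar_exact(150, 86)
print(f"S = {stat:.4f}, asymptotic p = {p_asym:.2e}, exact p = {p_exact:.2e}")
```

Only the discordant pairs carry information about marginal homogeneity; the 794 and 570 concordant pairs do not appear in either calculation.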

13 Level of Agreement: Kappa Coefficient
The larger the Kappa coefficient, the stronger the agreement
The Kappa coefficient is the difference between the observed agreement and the agreement expected under independence, divided by the maximum possible difference
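The verbal definition above corresponds to kappa = (Po - Pe)/(1 - Pe), where Po is the observed proportion on the main diagonal and Pe is the proportion expected there under independence. Below is a minimal sketch for a square table of counts, using the Prime Minister ratings; the function name is hypothetical and no standard error is computed.

```python
def cohen_kappa(table):
    """Simple kappa coefficient for a square table of counts."""
    size = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    # Observed agreement: proportion of counts on the main diagonal.
    observed = sum(table[i][i] for i in range(size)) / n
    # Agreement expected if row and column classifications were independent.
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / n ** 2
    return (observed - expected) / (1 - expected)


# Rows: first survey (approve, disapprove); columns: second survey.
ratings = [
    [794, 150],
    [86, 570],
]
kappa = cohen_kappa(ratings)
print(f"kappa = {kappa:.4f}")
```

The denominator 1 - Pe is the maximum possible excess of agreement over chance, so kappa equals 1 only when all counts fall on the main diagonal.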

14 SAS Output
McNemar's Test
Statistic (S)        17.3559
DF                   1
Asymptotic Pr > S    <.0001
Exact Pr >= S        E-05

Simple Kappa Coefficient
Kappa
ASE
95% Lower Conf Limit
95% Upper Conf Limit

Sample Size = 1600
Level of agreement

15 Chi-square Test for Square Tables
Consider an IxI table
Marginal homogeneity: pi_{i+} = pi_{+i} for all i
Symmetry: pi_{ij} = pi_{ji} for all pairs of cells
Symmetry => marginal homogeneity; the converse (<=) does not hold in general when I > 2
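The one-way implication can be checked on small tables. The sketch below uses two made-up 3x3 tables of counts (hypothetical data, not from the slides): one symmetric, and one whose row and column totals match even though it is not symmetric.

```python
def is_symmetric(table):
    """True if n_ij equals n_ji for every pair of cells."""
    size = len(table)
    return all(
        table[i][j] == table[j][i]
        for i in range(size)
        for j in range(i + 1, size)
    )


def has_marginal_homogeneity(table):
    """True if every row total equals the matching column total."""
    size = len(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    return row_totals == col_totals


# Hypothetical symmetric table: marginal homogeneity follows automatically.
symmetric_table = [
    [10, 4, 6],
    [4, 10, 2],
    [6, 2, 10],
]

# Hypothetical counterexample: equal margins (15, 15, 15) without symmetry.
cyclic_table = [
    [10, 5, 0],
    [0, 10, 5],
    [5, 0, 10],
]

print(is_symmetric(symmetric_table), has_marginal_homogeneity(symmetric_table))
print(is_symmetric(cyclic_table), has_marginal_homogeneity(cyclic_table))
```

In a 2x2 table the two conditions coincide, because matching margins force n12 = n21.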

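16 Chi-square Test for Square Tables
Ho: symmetry vs. Ha: no symmetry
Fitted values: mu_ij = (nij + nji)/2
Standardized Pearson residuals: rij = (nij - nji)/sqrt(nij + nji)
Pearson Chi-square test statistic: X^2 = sum over i<j of rij^2 = sum over i<j of (nij - nji)^2/(nij + nji)
X^2 follows approximately a Chi-square distribution with df = I(I-1)/2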

17 Example: Coffee Purchase
                 2nd purchase
1st purchase     High point  Taster’s  Sanka  Nescafe  Brim
High point               93        17     44        7    10
Taster’s                  9        46     11        0     9
Sanka                    17        11    155        9    12
Nescafe                   6         4      9       15     2
Brim                     10         4     12        2    27

18 Example: Coffee Purchase
X^2 = 20.4 and df is 5(5-1)/2 = 10 => lack of fit (reject Ho: symmetry)
Which pairs of cells cause the lack of fit? Examine their standardized Pearson residuals
The pair (1,3) and (3,1) contributes the most; the other pairs contribute far less (each rij^2 is 4 or less)
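The figures on this slide can be reproduced with a short sketch of the symmetry test, using X^2 = sum over i<j of (nij - nji)^2/(nij + nji) and rij = (nij - nji)/sqrt(nij + nji). The 5x5 table below is a reading of the coffee purchase data in which values repeated in the table were restored by hand, so the cell counts should be treated as an assumption; the function name is hypothetical.

```python
from math import sqrt


def symmetry_test(table):
    """Pearson X^2 for symmetry, its df, and the residual of each pair."""
    size = len(table)
    x2 = 0.0
    residuals = {}
    for i in range(size):
        for j in range(i + 1, size):
            total = table[i][j] + table[j][i]
            if total == 0:
                continue  # an empty pair contributes nothing to X^2
            r = (table[i][j] - table[j][i]) / sqrt(total)
            residuals[(i + 1, j + 1)] = r  # 1-based labels as on the slide
            x2 += r * r
    df = size * (size - 1) // 2
    return x2, df, residuals


# Rows: 1st purchase; columns: 2nd purchase.
# Order: High Point, Taster's, Sanka, Nescafe, Brim (assumed reconstruction).
coffee = [
    [93, 17, 44, 7, 10],
    [9, 46, 11, 0, 9],
    [17, 11, 155, 9, 12],
    [6, 4, 9, 15, 2],
    [10, 4, 12, 2, 27],
]

x2, df, residuals = symmetry_test(coffee)
worst_pair = max(residuals, key=lambda pair: abs(residuals[pair]))
print(f"X^2 = {x2:.1f}, df = {df}")
print("largest residual:", worst_pair, round(residuals[worst_pair], 2))
```

The dictionary of residuals makes it easy to list every pair, which is the follow-up step the slide recommends after rejecting symmetry.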

