
Published by Nathen Goltry. Modified over 4 years ago.

1
**LOGLINEAR MODELS FOR INDEPENDENCE AND INTERACTION IN THREE-WAY TABLES**

BY ENI SUMARMININGSIH, SSI, MM

2
**Table Structure For Three Dimensions**

When all variables are categorical, a multidimensional contingency table displays the data. We illustrate ideas using the three-variable case. Denote the variables by X, Y, and Z. We display the distribution of X-Y cell counts at different levels of Z using cross sections of the three-way contingency table, called partial tables.

3
The two-way contingency table obtained by combining the partial tables is called the X-Y marginal table. This table ignores Z.

4
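**Death Penalty Example**

| Defendant’s race | Victim’s race | Death penalty: Yes | Death penalty: No | Percentage Yes |
|---|---|---|---|---|
| White | White | 19 | 132 | 12.6 |
| White | Black | 0 | 9 | 0.0 |
| Black | White | 11 | 52 | 17.5 |
| Black | Black | 6 | 97 | 5.8 |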

5
**Marginal table**

| Defendant’s race | Death penalty: Yes | Death penalty: No | Total |
|---|---|---|---|
| White | 19 | 141 | 160 |
| Black | 17 | 149 | 166 |
| Total | 36 | 290 | 326 |

6
**Partial and Marginal Odds Ratios**

A partial odds ratio describes the association between two variables when the third variable is controlled. A marginal odds ratio describes the association when the third variable is ignored, i.e., when we sum the counts over the levels of the third variable to obtain a marginal two-way table.

7
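Here P denotes the death penalty verdict, D the defendant’s race, and V the victim’s race. Each partial odds ratio is computed at a fixed level of the remaining variable.

| Association | P-D | P-V | D-V |
|---|---|---|---|
| Marginal | 1.18 | 2.71 | 25.99 |
| Partial, level 1 | 0.67 | 2.80 | 22.04 |
| Partial, level 2 | 0.79 | 3.29 | 25.90 |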
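The marginal and partial odds ratios can be recomputed from the death penalty counts. The sketch below is illustrative only: the array layout and the helper names `odds_ratio` and `association` are assumptions, not part of the slides. The tabulated values appear to have been obtained after adding 0.5 to every cell, which avoids a zero odds ratio from the empty cell (white defendant, black victim, death penalty); that adjustment is inferred from the numbers rather than stated on the slides.

```python
import numpy as np

# Counts indexed as n[defendant, victim, penalty];
# index 0 = white (or "yes" for penalty), index 1 = black (or "no").
n = np.array([[[19, 132], [0, 9]],
              [[11, 52], [6, 97]]], dtype=float)

AXES = {"D": 0, "V": 1, "P": 2}

def odds_ratio(table, shift=0.5):
    """Odds ratio of a 2x2 table after adding `shift` to each cell (assumed adjustment)."""
    t = table + shift
    return t[0, 0] * t[1, 1] / (t[0, 1] * t[1, 0])

def association(pair):
    """Return (marginal OR, [partial OR at level 1, level 2]) for a pair such as 'PD'."""
    third = ({"D", "V", "P"} - set(pair)).pop()
    ax = AXES[third]
    marginal = odds_ratio(n.sum(axis=ax))            # collapse over the third variable
    partial = [odds_ratio(np.take(n, k, axis=ax))    # hold the third variable fixed
               for k in range(2)]
    return marginal, partial

for pair in ("PD", "PV", "DV"):
    marginal, partial = association(pair)
    print(pair, round(marginal, 2), [round(p, 2) for p in partial])
```

Setting `shift=0` gives the unadjusted sample odds ratios instead, at the cost of a zero or undefined value wherever the empty cell enters.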

8
**Types of Independence**

A three-way $I \times J \times K$ cross-classification of response variables X, Y, and Z has several potential types of independence. We assume a multinomial distribution with cell probabilities $\{\pi_{ijk}\}$, where $\sum_i \sum_j \sum_k \pi_{ijk} = 1$. The models also apply to Poisson sampling with means $\{\mu_{ijk}\}$. The three variables are mutually independent when

$$\pi_{ijk} = \pi_{i++}\,\pi_{+j+}\,\pi_{++k} \quad \text{for all } i, j, k. \tag{8.5}$$

9
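Variable Y is jointly independent of X and Z when

$$\pi_{ijk} = \pi_{i+k}\,\pi_{+j+} \quad \text{for all } i, j, k.$$

Similarly, X could be jointly independent of Y and Z, or Z could be jointly independent of X and Y. Mutual independence (8.5) implies joint independence of any one variable from the others.

X and Y are conditionally independent, given Z, when independence holds for each partial table within which Z is fixed. That is, if $\pi_{ij|k} = P(X=i, Y=j \mid Z=k)$, then

$$\pi_{ij|k} = \pi_{i+|k}\,\pi_{+j|k} \quad \text{for all } i, j, k.$$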

11
**Homogeneous Association and Three-Factor Interaction**

12
**Marginal vs Conditional Independence**

Partial association can be quite different from marginal association. For further illustration, we now see that conditional independence of X and Y, given Z, does not imply marginal independence of X and Y. The joint probabilities in Table 5.5 show a hypothetical relationship among three variables for new graduates of a university.

13
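**Table 5.5 Joint Probability**

| Major | Gender | Income: Low | Income: High |
|---|---|---|---|
| Liberal Arts | Female | 0.18 | 0.12 |
| Liberal Arts | Male | 0.12 | 0.08 |
| Science or Engineering | Female | 0.02 | 0.08 |
| Science or Engineering | Male | 0.08 | 0.32 |
| Total | Female | 0.20 | 0.20 |
| Total | Male | 0.20 | 0.40 |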

14
The association between Y = income at first job (high, low) and X = gender (female, male) at the two levels of Z = major discipline (liberal arts, science or engineering) is described by the odds ratios

$$\theta_{\text{lib}} = \frac{0.18 \times 0.08}{0.12 \times 0.12} = 1.0, \qquad \theta_{\text{sci}} = \frac{0.02 \times 0.32}{0.08 \times 0.08} = 1.0.$$

Income and gender are conditionally independent, given major.

15
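**Marginal Probability of Y and X**

| Gender | Income: Low | Income: High |
|---|---|---|
| Female | 0.18 + 0.02 = 0.20 | 0.12 + 0.08 = 0.20 |
| Male | 0.12 + 0.08 = 0.20 | 0.08 + 0.32 = 0.40 |
| Total | 0.40 | 0.60 |

The odds ratio for the (income, gender) marginal table is

$$\theta = \frac{0.20 \times 0.40}{0.20 \times 0.20} = 2.$$

The variables are not independent when we ignore major.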
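The contrast between the conditional and marginal odds ratios can be checked numerically. This is a minimal sketch using the Table 5.5 joint probabilities; the array layout and the helper name `odds_ratio` are illustrative choices, not part of the slides.

```python
import numpy as np

# Joint probabilities pi[major, gender, income]
# major: 0 = liberal arts, 1 = science or engineering
# gender: 0 = female, 1 = male; income: 0 = low, 1 = high
pi = np.array([[[0.18, 0.12], [0.12, 0.08]],
               [[0.02, 0.08], [0.08, 0.32]]])

def odds_ratio(t):
    """Odds ratio of a 2x2 table."""
    return t[0, 0] * t[1, 1] / (t[0, 1] * t[1, 0])

theta_lib = odds_ratio(pi[0])                # gender-income association within liberal arts
theta_sci = odds_ratio(pi[1])                # within science or engineering
theta_marginal = odds_ratio(pi.sum(axis=0))  # after summing over major

print(theta_lib, theta_sci, theta_marginal)
```

Both conditional odds ratios come out at 1 (up to floating-point rounding), while the marginal odds ratio is 2, matching the hand calculation.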

16
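Suppose Y is jointly independent of X and Z, so

$$\pi_{ijk} = \pi_{i+k}\,\pi_{+j+}.$$

Then

$$\pi_{ij|k} = \frac{\pi_{ijk}}{\pi_{++k}} = \frac{\pi_{i+k}\,\pi_{+j+}}{\pi_{++k}},$$

and summing both sides over i we obtain

$$\pi_{+j|k} = \frac{\pi_{++k}\,\pi_{+j+}}{\pi_{++k}} = \pi_{+j+}.$$

Therefore

$$\pi_{ij|k} = \frac{\pi_{i+k}}{\pi_{++k}}\,\pi_{+j+} = \pi_{i+|k}\,\pi_{+j|k}.$$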

17
So X and Y are also conditionally independent. In summary, mutual independence of the variables implies that Y is jointly independent of X and Z, which itself implies that X and Y are conditionally independent.

Suppose Y is jointly independent of X and Z, that is, $\pi_{ijk} = \pi_{i+k}\,\pi_{+j+}$. Summing over k on both sides, we obtain

$$\pi_{ij+} = \pi_{i++}\,\pi_{+j+}.$$

Thus, X and Y also exhibit marginal independence.
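The two implications can also be checked on a concrete table. The sketch below builds a jointly independent table from hypothetical ingredients (the numbers in `pi_xz` and `pi_y` are made up for illustration and do not come from the slides) and then tests marginal and conditional independence of X and Y.

```python
import numpy as np

# Hypothetical ingredients: a joint X-Z distribution and a Y marginal.
pi_xz = np.array([[0.10, 0.30],
                  [0.25, 0.35]])      # pi_{i+k}, sums to 1
pi_y = np.array([0.2, 0.5, 0.3])      # pi_{+j+}, sums to 1

# Joint independence of Y from (X, Z): pi_ijk = pi_{i+k} * pi_{+j+}; axes are X, Y, Z.
pi = pi_xz[:, None, :] * pi_y[None, :, None]

# Marginal independence: pi_{ij+} = pi_{i++} * pi_{+j+}
pi_xy = pi.sum(axis=2)
marginal_ok = np.allclose(pi_xy, np.outer(pi_xy.sum(axis=1), pi_xy.sum(axis=0)))

# Conditional independence: pi_{ij|k} = pi_{i+|k} * pi_{+j|k}
cond = pi / pi.sum(axis=(0, 1), keepdims=True)   # pi_{ij|k}
cond_x = cond.sum(axis=1, keepdims=True)         # pi_{i+|k}
cond_y = cond.sum(axis=0, keepdims=True)         # pi_{+j|k}
conditional_ok = np.allclose(cond, cond_x * cond_y)

print(marginal_ok, conditional_ok)
```

Any other valid choice of `pi_xz` and `pi_y` should give the same two `True` results, since the derivation does not depend on the particular values.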

18
So, joint independence of Y from X and Z (or of X from Y and Z) implies that X and Y are both marginally and conditionally independent. Since mutual independence of X, Y, and Z implies that Y is jointly independent of X and Z, mutual independence also implies that X and Y are both marginally and conditionally independent.

However, when we know only that X and Y are conditionally independent,

$$\pi_{ijk} = \frac{\pi_{i+k}\,\pi_{+jk}}{\pi_{++k}}.$$

Summing over k on both sides, we obtain

$$\pi_{ij+} = \sum_k \frac{\pi_{i+k}\,\pi_{+jk}}{\pi_{++k}}.$$

19
All three terms in the summation involve k, so this does not simplify to $\pi_{i++}\,\pi_{+j+}$, the condition for marginal independence.

20
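A model that permits all three pairs to be conditionally dependent is

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \tag{8.11}$$

This model is called the loglinear model of homogeneous association, or of no three-factor interaction.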

21
**Loglinear Models for Three Dimensions**

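**Hierarchical Loglinear Models**

Let $\{\mu_{ijk}\}$ denote expected frequencies. Suppose all $\mu_{ijk} > 0$ and let $\eta_{ijk} = \log \mu_{ijk}$. A dot in a subscript denotes the average with respect to that index; for instance, $\eta_{\cdot jk} = \sum_i \eta_{ijk} / I$. We set

$$\begin{aligned}
\lambda &= \eta_{\cdot\cdot\cdot}, \\
\lambda_i^X &= \eta_{i\cdot\cdot} - \eta_{\cdot\cdot\cdot}, \qquad
\lambda_j^Y = \eta_{\cdot j\cdot} - \eta_{\cdot\cdot\cdot}, \qquad
\lambda_k^Z = \eta_{\cdot\cdot k} - \eta_{\cdot\cdot\cdot}, \\
\lambda_{ij}^{XY} &= \eta_{ij\cdot} - \eta_{i\cdot\cdot} - \eta_{\cdot j\cdot} + \eta_{\cdot\cdot\cdot},
\end{aligned}$$

22

$$\begin{aligned}
\lambda_{ik}^{XZ} &= \eta_{i\cdot k} - \eta_{i\cdot\cdot} - \eta_{\cdot\cdot k} + \eta_{\cdot\cdot\cdot}, \\
\lambda_{jk}^{YZ} &= \eta_{\cdot jk} - \eta_{\cdot j\cdot} - \eta_{\cdot\cdot k} + \eta_{\cdot\cdot\cdot}, \\
\lambda_{ijk}^{XYZ} &= \eta_{ijk} - \eta_{ij\cdot} - \eta_{i\cdot k} - \eta_{\cdot jk} + \eta_{i\cdot\cdot} + \eta_{\cdot j\cdot} + \eta_{\cdot\cdot k} - \eta_{\cdot\cdot\cdot}.
\end{aligned}$$

The sum of parameters over any index equals zero. That is,

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_k \lambda_k^Z = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = \cdots = \sum_k \lambda_{ijk}^{XYZ} = 0.$$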
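The averaging definitions above translate directly into array operations. The following sketch computes every term from a table of hypothetical positive expected frequencies (the values in `mu` are invented for illustration), then checks the sum-to-zero constraints and that the terms add back up to the log expected frequencies. The names `eta`, `avg`, and `lam_*` are assumptions of this example.

```python
import numpy as np

# Hypothetical positive expected frequencies for a 2 x 3 x 2 table (axes X, Y, Z).
mu = np.array([[[10., 20.], [15., 5.], [30., 12.]],
               [[8., 25.], [40., 9.], [6., 18.]]])
eta = np.log(mu)

def avg(a, axes):
    """Average over `axes`, keeping them as length-1 dimensions so results broadcast."""
    return a.mean(axis=axes, keepdims=True)

lam0 = avg(eta, (0, 1, 2))                   # overall mean of the log frequencies
lam_x = avg(eta, (1, 2)) - lam0
lam_y = avg(eta, (0, 2)) - lam0
lam_z = avg(eta, (0, 1)) - lam0
lam_xy = avg(eta, (2,)) - avg(eta, (1, 2)) - avg(eta, (0, 2)) + lam0
lam_xz = avg(eta, (1,)) - avg(eta, (1, 2)) - avg(eta, (0, 1)) + lam0
lam_yz = avg(eta, (0,)) - avg(eta, (0, 2)) - avg(eta, (0, 1)) + lam0
lam_xyz = (eta - avg(eta, (2,)) - avg(eta, (1,)) - avg(eta, (0,))
           + avg(eta, (1, 2)) + avg(eta, (0, 2)) + avg(eta, (0, 1)) - lam0)

# Every term sums to zero over each of its indices.
zero_sums = all(
    np.allclose(term.sum(axis=ax), 0.0)
    for term, axes in [(lam_x, (0,)), (lam_y, (1,)), (lam_z, (2,)),
                       (lam_xy, (0, 1)), (lam_xz, (0, 2)), (lam_yz, (1, 2)),
                       (lam_xyz, (0, 1, 2))]
    for ax in axes
)

# The terms reproduce the log expected frequencies exactly (the saturated model).
rebuilt = lam0 + lam_x + lam_y + lam_z + lam_xy + lam_xz + lam_yz + lam_xyz
print(zero_sums, np.allclose(rebuilt, eta))
```

Because the decomposition is an identity, it holds for any positive table; a reduced model corresponds to forcing some of these terms to zero.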

23
The general loglinear model for a three-way table is

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} + \lambda_{ijk}^{XYZ}.$$

This model has as many parameters as observations and describes all possible positive $\mu_{ijk}$. Setting certain parameters equal to zero in this model yields the models introduced previously. Table 8.2 lists some of these models. For ease of reference, Table 8.2 assigns to each model a symbol that lists the highest-order term(s) for each variable.

24
**Interpreting Model Parameters**

Interpretations of loglinear model parameters use their highest-order terms. For instance, interpretations for model (8.11) use the two-factor terms to describe conditional odds ratios. At a fixed level k of Z, the conditional association between X and Y uses (I − 1)(J − 1) odds ratios, such as the local odds ratios

$$\theta_{ij(k)} = \frac{\pi_{ijk}\,\pi_{i+1,j+1,k}}{\pi_{i,j+1,k}\,\pi_{i+1,j,k}}, \qquad 1 \le i \le I-1,\ 1 \le j \le J-1.$$

Similarly, (I − 1)(K − 1) odds ratios $\{\theta_{i(j)k}\}$ describe XZ conditional association, and (J − 1)(K − 1) odds ratios $\{\theta_{(i)jk}\}$ describe YZ conditional association.

25
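Loglinear models have characterizations using constraints on conditional odds ratios. For instance, conditional independence of X and Y is equivalent to $\theta_{ij(k)} = 1$ for $i = 1, \ldots, I-1$, $j = 1, \ldots, J-1$, $k = 1, \ldots, K$. Substituting (8.11) for model (XY, XZ, YZ) into $\log \theta_{ij(k)}$ yields

$$\log \theta_{ij(k)} = \lambda_{ij}^{XY} + \lambda_{i+1,j+1}^{XY} - \lambda_{i,j+1}^{XY} - \lambda_{i+1,j}^{XY}. \tag{8.14}$$

The right-hand side does not depend on k, so any model not having the three-factor interaction term has a homogeneous association for each pair of variables.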

26
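For 2×2×2 tables, each pair of variables has a single conditional odds ratio at each level of the third variable. For the XY pair under model (XY, XZ, YZ),

$$\log \theta_{11(1)} = \log \theta_{11(2)} = \lambda_{11}^{XY} + \lambda_{22}^{XY} - \lambda_{12}^{XY} - \lambda_{21}^{XY}.$$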
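A small numerical check makes the homogeneous association property concrete. The sketch below builds expected frequencies for a 2×2×2 table from hypothetical parameter values (all numbers are invented for illustration) with no three-factor term, then compares the XY log odds ratio at each level of Z with the contrast of the two-factor XY terms.

```python
import numpy as np

# Hypothetical parameters for a 2x2x2 table; each set sums to zero over every index.
lam = 1.0
lam_x = np.array([0.3, -0.3])
lam_y = np.array([-0.2, 0.2])
lam_z = np.array([0.5, -0.5])
lam_xy = np.array([[0.4, -0.4], [-0.4, 0.4]])
lam_xz = np.array([[0.1, -0.1], [-0.1, 0.1]])
lam_yz = np.array([[-0.6, 0.6], [0.6, -0.6]])

# Model (XY, XZ, YZ): no three-factor interaction term. Axes are X, Y, Z.
log_mu = (lam
          + lam_x[:, None, None] + lam_y[None, :, None] + lam_z[None, None, :]
          + lam_xy[:, :, None] + lam_xz[:, None, :] + lam_yz[None, :, :])
mu = np.exp(log_mu)

def log_odds_ratio_xy(mu, k):
    """Log of the XY odds ratio in the partial table at level k of Z."""
    t = mu[:, :, k]
    return np.log(t[0, 0] * t[1, 1] / (t[0, 1] * t[1, 0]))

contrast = lam_xy[0, 0] + lam_xy[1, 1] - lam_xy[0, 1] - lam_xy[1, 0]
partial_log_ors = [log_odds_ratio_xy(mu, k) for k in range(2)]
print(contrast, partial_log_ors)
```

The one-factor, XZ, and YZ terms cancel inside the odds ratio, which is why only the XY contrast survives and the value is the same at both levels of Z.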

27
**Alcohol, Cigarette, and Marijuana Use Example**

Table 8.3 refers to a 1992 survey by the Wright State University School of Medicine and the United Health Services in Dayton, Ohio. The survey asked 2276 students in their final year of high school in a nonurban area near Dayton, Ohio, whether they had ever used alcohol, cigarettes, or marijuana. Denote the variables in this 2×2×2 table by A for alcohol use, C for cigarette use, and M for marijuana use.

29
Table 8.5 illustrates model association patterns by presenting estimated conditional and marginal odds ratios. For example, the entry 1.0 for the AC conditional association for the model (AM, CM) of AC conditional independence is the common value of the AC fitted odds ratios at the two levels of M.

30
The entry 2.7 for the AC marginal association for this model is the odds ratio for the marginal AC fitted table. Table 8.5 shows that estimated conditional odds ratios equal 1.0 for each pairwise term not appearing in a model, such as the AC association in model (AM, CM). For that model, the estimated marginal AC odds ratio differs from 1.0, since conditional independence does not imply marginal independence.

Model (AC, AM, CM) permits all pairwise associations but maintains homogeneous odds ratios between two variables at each level of the third. The AC fitted conditional odds ratios for this model equal 7.8. One can calculate this odds ratio using the model’s fitted values at either level of M, or from (8.14) using

$$\exp\left(\hat\lambda_{11}^{AC} + \hat\lambda_{22}^{AC} - \hat\lambda_{12}^{AC} - \hat\lambda_{21}^{AC}\right).$$
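The (AM, CM) entries can be reproduced from the closed-form fitted values of a conditional independence model, $\hat\mu_{acm} = n_{a+m}\,n_{+cm}/n_{++m}$. The sketch below is a hedged illustration: the slide containing Table 8.3 is not part of this transcript, so the counts in `n` are taken from the published survey table (they sum to the 2276 students mentioned above) and should be treated as an assumption, as should the array layout and helper name.

```python
import numpy as np

# Assumed Table 8.3 counts, n[alcohol, cigarette, marijuana], index 0 = yes, 1 = no.
n = np.array([[[911., 538.], [44., 456.]],
              [[3., 43.], [2., 279.]]])

# Model (AM, CM): A and C conditionally independent given M.
n_am = n.sum(axis=1, keepdims=True)       # n_{a+m}
n_cm = n.sum(axis=0, keepdims=True)       # n_{+cm}
n_m = n.sum(axis=(0, 1), keepdims=True)   # n_{++m}
mu = n_am * n_cm / n_m                    # closed-form fitted values

def odds_ratio(t):
    """Odds ratio of a 2x2 table."""
    return t[0, 0] * t[1, 1] / (t[0, 1] * t[1, 0])

ac_conditional = [odds_ratio(mu[:, :, m]) for m in range(2)]   # at M = yes, M = no
ac_marginal = odds_ratio(mu.sum(axis=2))                       # fitted table collapsed over M

print(ac_conditional, ac_marginal)
```

The conditional odds ratios equal 1 at both levels of M by construction, while the marginal value lands close to the 2.7 quoted from Table 8.5.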

31
**INFERENCE FOR LOGLINEAR MODELS**

**Chi-Squared Goodness-of-Fit Tests**

As usual, $X^2$ and $G^2$ test whether a model holds by comparing cell fitted values to observed counts:

$$X^2 = \sum_i \sum_j \sum_k \frac{(n_{ijk} - \hat\mu_{ijk})^2}{\hat\mu_{ijk}}, \qquad G^2 = 2 \sum_i \sum_j \sum_k n_{ijk} \log \frac{n_{ijk}}{\hat\mu_{ijk}},$$

where $n_{ijk}$ is the observed frequency and $\hat\mu_{ijk}$ is the fitted (estimated expected) frequency. Here df equals the number of cell counts minus the number of model parameters. For the student survey (Table 8.3), Table 8.6 shows results of testing fit for several loglinear models.

32
**Models that lack any association term fit poorly**

The model (AC, AM, CM) that has all pairwise associations fits well (P = 0.54). It is also suggested by other criteria, such as minimizing AIC = −2(maximized log likelihood − number of parameters in model) or, equivalently, minimizing $G^2 - 2(\mathrm{df})$.
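The fit of model (AC, AM, CM) has no closed form, but it can be sketched with iterative proportional fitting (IPF), which rescales the fitted table to match each observed two-way margin in turn. IPF is a standard technique named here for illustration; the slides do not say which algorithm was used. The counts in `n` are the assumed Table 8.3 values (not reproduced in this transcript), and the function name, tolerance, and iteration cap are arbitrary choices of this example.

```python
import math
import numpy as np

# Assumed Table 8.3 counts, n[alcohol, cigarette, marijuana], index 0 = yes, 1 = no.
n = np.array([[[911., 538.], [44., 456.]],
              [[3., 43.], [2., 279.]]])

def fit_all_pairs(n, tol=1e-8, max_iter=500):
    """IPF for (AC, AM, CM): repeatedly match the AC, AM, and CM margins."""
    mu = np.ones_like(n)
    for _ in range(max_iter):
        previous = mu.copy()
        for axis in (2, 1, 0):   # summing over M, C, A gives the AC, AM, CM margins
            mu = mu * n.sum(axis=axis, keepdims=True) / mu.sum(axis=axis, keepdims=True)
        if np.max(np.abs(mu - previous)) < tol:
            break
    return mu

mu = fit_all_pairs(n)

X2 = np.sum((n - mu) ** 2 / mu)            # Pearson statistic
G2 = 2 * np.sum(n * np.log(n / mu))        # likelihood-ratio statistic
df = n.size - 7                            # 8 cells minus 7 parameters in (AC, AM, CM)
p_value = math.erfc(math.sqrt(G2 / 2))     # chi-squared upper tail; this formula holds for df = 1 only

def odds_ratio(t):
    """Odds ratio of a 2x2 table."""
    return t[0, 0] * t[1, 1] / (t[0, 1] * t[1, 0])

ac_conditional = [odds_ratio(mu[:, :, m]) for m in range(2)]
print(round(X2, 2), round(G2, 2), df, round(p_value, 2), ac_conditional)
```

The AC fitted odds ratio is the same at both levels of M, as homogeneous association requires. For models with other degrees of freedom, the P-value would need a general chi-squared tail function rather than the `erfc` shortcut.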
