# LOGLINEAR MODELS FOR INDEPENDENCE AND INTERACTION IN THREE-WAY TABLES

BY ENI SUMARMININGSIH, SSI, MM

Table Structure For Three Dimensions
When all variables are categorical, a multidimensional contingency table displays the data. We illustrate the ideas using the three-variable case. Denote the variables by X, Y, and Z. We display the distribution of the X-Y cell counts at different levels of Z using cross sections of the three-way contingency table, called partial tables.

The two-way contingency table obtained by combining the partial tables is called the X-Y marginal table; this table ignores Z.

Death Penalty Example

| Defendant's Race | Victim's Race | Death Penalty: Yes | Death Penalty: No | Percentage Yes |
|---|---|---|---|---|
| White | White | 19 | 132 | 12.6 |
| White | Black | 0 | 9 | 0.0 |
| Black | White | 11 | 52 | 17.5 |
| Black | Black | 6 | 97 | 5.8 |

Marginal table

| Defendant's Race | Death Penalty: Yes | Death Penalty: No | Total |
|---|---|---|---|
| White | 19 | 141 | 160 |
| Black | 17 | 149 | 166 |
| Total | 36 | 290 | 326 |

Partial and Marginal Odds Ratios
A partial odds ratio describes the association when the third variable is controlled. A marginal odds ratio describes the association when the third variable is ignored (i.e., when we sum the counts over the levels of the third variable to obtain a marginal two-way table).
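As a minimal sketch, the definitions above can be applied to the death penalty counts shown earlier. The marginal odds ratio comes from the marginal table (victim's race ignored); a partial odds ratio comes from one partial table (here, white victims). The helper function and variable names are illustrative, not from the original slides.

```python
# Sketch: marginal vs. partial odds ratios for the death penalty data.
# Counts are taken from the partial and marginal tables above.

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Marginal table (victim's race ignored):
# rows = defendant white/black, columns = death penalty yes/no.
marginal = odds_ratio(19, 141, 17, 149)
print(round(marginal, 2))  # 1.18, as reported for the marginal association

# Partial table for white victims:
# white defendants 19 yes / 132 no, black defendants 11 yes / 52 no.
partial_white = odds_ratio(19, 132, 11, 52)
print(round(partial_white, 2))
```

Note that for black victims the "white defendant, death penalty yes" cell is 0, so that partial table's sample odds ratio is 0; reported partial values are often based on model fitted counts instead.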

| Association | P-D | P-V | D-V |
|---|---|---|---|
| Marginal | 1.18 | 2.71 | 25.99 |
| Partial, Level 1 | 0.67 | 2.80 | 22.04 |
| Partial, Level 2 | 0.79 | 3.29 | 25.90 |

Types of Independence

A three-way I × J × K cross-classification of response variables X, Y, and Z has several potential types of independence. We assume a multinomial distribution with cell probabilities $\{\pi_{ijk}\}$, where $\sum_i \sum_j \sum_k \pi_{ijk} = 1$. The models also apply to Poisson sampling with means $\{\mu_{ijk}\}$. The three variables are mutually independent when

$$\pi_{ijk} = \pi_{i++}\,\pi_{+j+}\,\pi_{++k} \quad \text{for all } i, j, k \tag{8.5}$$

Similarly, X could be jointly independent of Y and Z, or Z could be jointly independent of X and Y. Mutual independence (8.5) implies joint independence of any one variable from the others. X and Y are conditionally independent given Z when independence holds in each partial table within which Z is fixed. That is, if $\pi_{ij|k} = P(X = i, Y = j \mid Z = k)$, then

$$\pi_{ij|k} = \pi_{i+|k}\,\pi_{+j|k} \quad \text{for all } i, j, k$$

Homogeneous Association and Three-Factor Interaction

Marginal vs Conditional Independence
Partial association can be quite different from marginal association. For further illustration, we now see that conditional independence of X and Y given Z does not imply marginal independence of X and Y. The joint probabilities in Table 5.5 show a hypothetical relationship among three variables for new graduates of a university.

Table 5.5 Joint Probability

| Major | Gender | Income: Low | Income: High |
|---|---|---|---|
| Liberal Arts | Female | 0.18 | 0.12 |
| Liberal Arts | Male | 0.12 | 0.08 |
| Science or Engineering | Female | 0.02 | 0.08 |
| Science or Engineering | Male | 0.08 | 0.32 |
| Total | Female | 0.20 | 0.20 |
| Total | Male | 0.20 | 0.40 |

The association between Y = income at first job (high, low) and X = gender (female, male) at the two levels of Z = major discipline (liberal arts, science or engineering) is described by the odds ratios

$$\theta_{\text{lib}} = \frac{0.18 \times 0.08}{0.12 \times 0.12} = 1.0, \qquad \theta_{\text{sci}} = \frac{0.02 \times 0.32}{0.08 \times 0.08} = 1.0$$

Income and gender are conditionally independent, given major.

Marginal Probability of Y and X
| Gender | Income: Low | Income: High |
|---|---|---|
| Female | 0.18 + 0.02 = 0.20 | 0.12 + 0.08 = 0.20 |
| Male | 0.12 + 0.08 = 0.20 | 0.08 + 0.32 = 0.40 |
| Total | 0.40 | 0.60 |

The odds ratio for the (income, gender) marginal table is

$$\theta = \frac{0.20 \times 0.40}{0.20 \times 0.20} = 2.0$$

The variables are not independent when we ignore major.
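The Table 5.5 example can be checked directly in a few lines: the two partial tables each have odds ratio 1 (conditional independence), yet their cell-by-cell sum, the marginal table, has odds ratio 2. A minimal sketch, with the cell probabilities taken from Table 5.5:

```python
# Conditional independence within each major, but marginal dependence
# when major is ignored. Probabilities are from Table 5.5.

def odds_ratio(table):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Rows: female, male; columns: low income, high income.
liberal_arts = [[0.18, 0.12], [0.12, 0.08]]
science_eng = [[0.02, 0.08], [0.08, 0.32]]

theta_lib = odds_ratio(liberal_arts)  # ~1.0: independence given major
theta_sci = odds_ratio(science_eng)   # ~1.0: independence given major

# Marginal table: sum the two partial tables cell by cell.
marginal = [[liberal_arts[r][c] + science_eng[r][c] for c in range(2)]
            for r in range(2)]
theta_marg = odds_ratio(marginal)     # ~2.0: dependence when major ignored
```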

Suppose Y is jointly independent of X and Z, so

$$\pi_{ijk} = \pi_{i+k}\,\pi_{+j+}$$

Then

$$\pi_{ij|k} = \frac{\pi_{ijk}}{\pi_{++k}} = \frac{\pi_{i+k}\,\pi_{+j+}}{\pi_{++k}}$$

Summing both sides over i, we obtain

$$\pi_{+j|k} = \frac{\pi_{++k}\,\pi_{+j+}}{\pi_{++k}} = \pi_{+j+}$$

Therefore

$$\pi_{ij|k} = \frac{\pi_{i+k}}{\pi_{++k}}\,\pi_{+j+} = \pi_{i+|k}\,\pi_{+j|k}$$

So X and Y are also conditionally independent.

In summary, mutual independence of the variables implies that Y is jointly independent of X and Z, which itself implies that X and Y are conditionally independent. Suppose Y is jointly independent of X and Z, that is, $\pi_{ijk} = \pi_{i+k}\,\pi_{+j+}$. Summing over k on both sides, we obtain

$$\pi_{ij+} = \pi_{i++}\,\pi_{+j+}$$

Thus, X and Y also exhibit marginal independence.

So, joint independence of Y from X and Z (or of X from Y and Z) implies that X and Y are both marginally and conditionally independent. Since mutual independence of X, Y, and Z implies that Y is jointly independent of X and Z, mutual independence also implies that X and Y are both marginally and conditionally independent. However, when we know only that X and Y are conditionally independent,

$$\pi_{ijk} = \frac{\pi_{i+k}\,\pi_{+jk}}{\pi_{++k}}$$

Summing over k on both sides, we obtain

$$\pi_{ij+} = \sum_k \frac{\pi_{i+k}\,\pi_{+jk}}{\pi_{++k}}$$

All three terms in the summation involve k, so this does not simplify to $\pi_{i++}\,\pi_{+j+}$, the condition for marginal independence.

A model that permits all three pairs to be conditionally dependent is

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}$$

This is called the loglinear model of homogeneous association, or of no three-factor interaction.

Loglinear Models for Three Dimensions
Hierarchical Loglinear Models

Let $\{\mu_{ijk}\}$ denote expected frequencies. Suppose all $\mu_{ijk} > 0$, and let $\eta_{ijk} = \log \mu_{ijk}$. A dot in a subscript denotes the average with respect to that index; for instance, $\eta_{\cdot jk} = \sum_i \eta_{ijk} / I$. We set

$$\lambda = \eta_{\cdot\cdot\cdot}, \qquad \lambda_i^X = \eta_{i\cdot\cdot} - \eta_{\cdot\cdot\cdot}, \qquad \lambda_j^Y = \eta_{\cdot j\cdot} - \eta_{\cdot\cdot\cdot}, \qquad \lambda_k^Z = \eta_{\cdot\cdot k} - \eta_{\cdot\cdot\cdot}$$

$$\lambda_{ij}^{XY} = \eta_{ij\cdot} - \eta_{i\cdot\cdot} - \eta_{\cdot j\cdot} + \eta_{\cdot\cdot\cdot}$$

$$\lambda_{ik}^{XZ} = \eta_{i\cdot k} - \eta_{i\cdot\cdot} - \eta_{\cdot\cdot k} + \eta_{\cdot\cdot\cdot}, \qquad \lambda_{jk}^{YZ} = \eta_{\cdot jk} - \eta_{\cdot j\cdot} - \eta_{\cdot\cdot k} + \eta_{\cdot\cdot\cdot}$$

$$\lambda_{ijk}^{XYZ} = \eta_{ijk} - \eta_{ij\cdot} - \eta_{i\cdot k} - \eta_{\cdot jk} + \eta_{i\cdot\cdot} + \eta_{\cdot j\cdot} + \eta_{\cdot\cdot k} - \eta_{\cdot\cdot\cdot}$$

The sum of the parameters over any index equals zero. That is,

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_k \lambda_k^Z = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = \cdots = \sum_k \lambda_{ijk}^{XYZ} = 0$$
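The centering definitions above can be computed mechanically for any table of positive expected frequencies. A minimal sketch for a 2×2×2 table with arbitrary illustrative `mu` values: take logs, average over the dotted indices, subtract lower-order terms, and confirm that the parameters satisfy the zero-sum constraints and rebuild $\eta_{ijk}$ exactly.

```python
# Sketch: compute the loglinear lambda parameters for a 2x2x2 table by
# centering eta_ijk = log(mu_ijk), per the definitions above.
import math
from itertools import product

I = J = K = 2
mu = [[[5.0, 10.0], [20.0, 8.0]], [[12.0, 6.0], [9.0, 30.0]]]  # arbitrary
eta = [[[math.log(mu[i][j][k]) for k in range(K)]
        for j in range(J)] for i in range(I)]

def mean(vals):
    vals = list(vals)
    return sum(vals) / len(vals)

lam = mean(eta[i][j][k] for i, j, k in product(range(I), range(J), range(K)))
lam_X = [mean(eta[i][j][k] for j, k in product(range(J), range(K))) - lam
         for i in range(I)]
lam_Y = [mean(eta[i][j][k] for i, k in product(range(I), range(K))) - lam
         for j in range(J)]
lam_Z = [mean(eta[i][j][k] for i, j in product(range(I), range(J))) - lam
         for k in range(K)]
lam_XY = [[mean(eta[i][j][k] for k in range(K)) - lam - lam_X[i] - lam_Y[j]
           for j in range(J)] for i in range(I)]
lam_XZ = [[mean(eta[i][j][k] for j in range(J)) - lam - lam_X[i] - lam_Z[k]
           for k in range(K)] for i in range(I)]
lam_YZ = [[mean(eta[i][j][k] for i in range(I)) - lam - lam_Y[j] - lam_Z[k]
           for k in range(K)] for j in range(J)]

# The three-factor term is the residual after all lower-order terms.
lam_XYZ = [[[eta[i][j][k] - lam - lam_X[i] - lam_Y[j] - lam_Z[k]
             - lam_XY[i][j] - lam_XZ[i][k] - lam_YZ[j][k]
             for k in range(K)] for j in range(J)] for i in range(I)]

# Zero-sum constraints hold, and the parameters rebuild eta exactly.
assert abs(sum(lam_X)) < 1e-12 and abs(sum(lam_Y)) < 1e-12
for i, j, k in product(range(I), range(J), range(K)):
    rebuilt = (lam + lam_X[i] + lam_Y[j] + lam_Z[k]
               + lam_XY[i][j] + lam_XZ[i][k] + lam_YZ[j][k]
               + lam_XYZ[i][j][k])
    assert abs(rebuilt - eta[i][j][k]) < 1e-12
```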

The general loglinear model for a three-way table is

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} + \lambda_{ijk}^{XYZ}$$

This model has as many parameters as observations and describes all possible positive $\{\mu_{ijk}\}$. Setting certain parameters equal to zero yields the models introduced previously. Table 8.2 lists some of these models. For ease of reference, Table 8.2 assigns each model a symbol that lists the highest-order term(s) for each variable.

Interpreting Model Parameters
Interpretations of loglinear model parameters use their highest-order terms. For instance, interpretations for model (8.11) use the two-factor terms to describe conditional odds ratios. At a fixed level k of Z, the conditional association between X and Y uses (I − 1)(J − 1) odds ratios, such as the local odds ratios $\{\theta_{ij(k)}\}$. Similarly, (I − 1)(K − 1) odds ratios $\{\theta_{i(j)k}\}$ describe the X-Z conditional association, and (J − 1)(K − 1) odds ratios $\{\theta_{(i)jk}\}$ describe the Y-Z conditional association.

Loglinear models have characterizations in terms of constraints on conditional odds ratios. For instance, conditional independence of X and Y is equivalent to $\theta_{ij(k)} = 1$ for $i = 1, \ldots, I-1$, $j = 1, \ldots, J-1$, $k = 1, \ldots, K$. Substituting (8.11) for model (XY, XZ, YZ) into $\log \theta_{ij(k)}$ yields

$$\log \theta_{ij(k)} = \lambda_{ij}^{XY} + \lambda_{i+1,\,j+1}^{XY} - \lambda_{i,\,j+1}^{XY} - \lambda_{i+1,\,j}^{XY}$$

which does not depend on k. Any model without the three-factor interaction term has a homogeneous association for each pair of variables.

For 2×2×2 tables

Alcohol, Cigarette, and Marijuana Use Example
Table 8.3 refers to a 1992 survey by the Wright State University School of Medicine and the United Health Services in Dayton, Ohio. The survey asked 2276 students in their final year of high school in a nonurban area near Dayton whether they had ever used alcohol, cigarettes, or marijuana. Denote the variables in this 2×2×2 table by A for alcohol use, C for cigarette use, and M for marijuana use.

Table 8.5 illustrates model association patterns by presenting estimated conditional and marginal odds ratios. For example, the entry 1.0 for the AC conditional association under model (AM, CM) of AC conditional independence is the common value of the AC fitted odds ratios at the two levels of M.

The entry 2.7 for the AC marginal association under this model is the odds ratio for the marginal AC fitted table. Table 8.5 shows that estimated conditional odds ratios equal 1.0 for each pairwise term not appearing in a model, such as the AC association in model (AM, CM). For that model, the estimated marginal AC odds ratio differs from 1.0, since conditional independence does not imply marginal independence. Model (AC, AM, CM) permits all pairwise associations but maintains homogeneous odds ratios between each pair of variables at each level of the third. The AC fitted conditional odds ratios for this model equal 7.8. One can calculate this odds ratio using the model's fitted values at either level of M, or from (8.14).

INFERENCE FOR LOGLINEAR MODELS
Chi-Squared Goodness-of-Fit Tests

As usual, $X^2$ and $G^2$ test whether a model holds by comparing cell fitted values to observed counts:

$$X^2 = \sum_i \sum_j \sum_k \frac{(n_{ijk} - \hat{\mu}_{ijk})^2}{\hat{\mu}_{ijk}}, \qquad G^2 = 2 \sum_i \sum_j \sum_k n_{ijk} \log \frac{n_{ijk}}{\hat{\mu}_{ijk}}$$

where $n_{ijk}$ is the observed frequency and $\hat{\mu}_{ijk}$ the fitted (expected) frequency. Here df equals the number of cell counts minus the number of model parameters. For the student survey (Table 8.3), Table 8.6 shows results of testing the fit of several loglinear models.

Models that lack any association term fit poorly. The model (AC, AM, CM), which has all pairwise associations, fits well (P = 0.54). It is also preferred by other criteria, such as minimizing

$$\text{AIC} = -2(\text{maximized log likelihood} - \text{number of parameters in model})$$

or, equivalently, minimizing $G^2 - 2(\text{df})$.
