Challenges posed by Structural Equation Models
Thomas Richardson, Department of Statistics, University of Washington
Joint work with Mathias Drton (UC Berkeley) and Peter Spirtes (CMU)
Overview
- Challenges for Likelihood Inference
- Problems in Model Selection and Interpretation
- Partial Solution
  - sub-class of path diagrams: ancestral graphs
Problems for Likelihood Inference
- Likelihood may be multimodal
  - e.g. the likelihood of the bivariate Gaussian Seemingly Unrelated Regression (SUR) model may have up to 3 local maxima.
[Figure: path diagram of the SUR model on X1, X2, Y1, Y2]
- A consistent starting value does not guarantee that iterative procedures will find the MLE.
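A minimal sketch of how one might probe for several local maxima in practice. It assumes the SUR model has the form Y1 = b1*X1 + e1, Y2 = b2*X2 + e2 with jointly Gaussian errors, unrestricted error covariance and no intercepts; the slide's diagram did not survive, so this parameterization is an assumption. The data, the function names and the starting values are illustrative only, and the coordinate-wise ascent is a generic iterative procedure, not necessarily the one used by the authors.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def profile_loglik(b1, b2, x1, x2, y1, y2):
    """Gaussian log-likelihood of (b1, b2) with the 2x2 error covariance maximized out."""
    n = len(y1)
    r1 = [y - b1 * x for x, y in zip(x1, y1)]
    r2 = [y - b2 * x for x, y in zip(x2, y2)]
    s11, s22, s12 = dot(r1, r1) / n, dot(r2, r2) / n, dot(r1, r2) / n
    det = s11 * s22 - s12 * s12
    return -0.5 * n * (2 * math.log(2 * math.pi) + math.log(det) + 2)

def conditional_update(x, y, r_other):
    """Exact maximizer in one slope with the other equation's residuals held fixed.

    The determinant of the residual covariance is quadratic in this slope,
    so its minimizer is available in closed form.
    """
    b, c = dot(x, y), dot(x, x)
    d, e, f = dot(r_other, r_other), dot(y, r_other), dot(x, r_other)
    return (b * d - e * f) / (c * d - f * f)

def coordinate_ascent(b1, b2, x1, x2, y1, y2, sweeps=100):
    """Alternate exact updates of b1 and b2; the likelihood never decreases."""
    for _ in range(sweeps):
        r2 = [y - b2 * x for x, y in zip(x2, y2)]
        b1 = conditional_update(x1, y1, r2)
        r1 = [y - b1 * x for x, y in zip(x1, y1)]
        b2 = conditional_update(x2, y2, r1)
    return b1, b2

# Illustrative data only (small n, arbitrary seed).
rng = random.Random(0)
n = 10
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y1 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [rng.gauss(0, 1) for _ in range(n)]

# Arbitrary starting values; distinct end points would indicate distinct local maxima.
starts = [(-5.0, -5.0), (0.0, 0.0), (5.0, 5.0), (-5.0, 5.0)]
for s1, s2 in starts:
    b1, b2 = coordinate_ascent(s1, s2, x1, x2, y1, y2)
    print((s1, s2), "->", (round(b1, 4), round(b2, 4)),
          round(profile_loglik(b1, b2, x1, x2, y1, y2), 4))
```

Comparing the end points reached from different starts is only a heuristic check. Nothing in the sketch guarantees that a given data set exhibits more than one local maximum, or that all of them are found.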
Problems for Likelihood Inference
- Discrete latent variable models are not curved exponential families
[Figure: latent class model with ternary latent class variable C and binary observed variables X1, X2, X3, X4]
  - 15 parameters in saturated model
  - 14 model parameters
  - BUT model has 2 d.f. (Goodman)
- Usual asymptotics may not apply
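The two parameter counts on this slide can be reproduced by simple counting. The sketch below assumes the standard latent class parameterization (class probabilities plus, within each class, independent marginals for the observed variables); the function names are illustrative.

```python
def saturated_params(levels):
    """Free parameters of an unrestricted joint distribution: (number of cells) - 1."""
    cells = 1
    for k in levels:
        cells *= k
    return cells - 1

def latent_class_params(levels, n_classes):
    """(n_classes - 1) class probabilities, plus (k - 1) per observed variable per class."""
    return (n_classes - 1) + n_classes * sum(k - 1 for k in levels)

observed_levels = [2, 2, 2, 2]   # four binary observed variables
print(saturated_params(observed_levels))         # 15
print(latent_class_params(observed_levels, 3))   # 14
```

Naive counting would therefore suggest 15 - 14 = 1 degree of freedom, whereas the slide reports 2 d.f. (Goodman). The discrepancy illustrates why parameter counting alone is not a reliable guide for these models.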
Problems for Likelihood Inference
- Likelihood may be highly multimodal in the asymptotic limit
  - After accounting for label switching/aliasing
[Figure: latent class model with latent class variable C and observed variables X1, X2, X3, X4]
- Why report one mode?
- d.f. may vary as a function of model parameters
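Label switching itself is easy to demonstrate: relabelling the latent classes changes the parameter vector but not the distribution of the observed variables, so every mode comes with relabelled copies. The sketch below uses a three-class model with four binary indicators; the numerical parameter values are made up for illustration.

```python
from itertools import permutations, product

def observed_distribution(class_probs, cond_probs):
    """P(x) = sum_c P(C = c) * prod_j P(X_j = x_j | C = c).

    cond_probs[c][j] is P(X_j = 1 | C = c).
    """
    dist = {}
    for x in product((0, 1), repeat=len(cond_probs[0])):
        total = 0.0
        for pc, theta in zip(class_probs, cond_probs):
            term = pc
            for xj, t in zip(x, theta):
                term *= t if xj == 1 else 1.0 - t
            total += term
        dist[x] = total
    return dist

# Illustrative parameter values (not taken from the slides).
class_probs = [0.5, 0.3, 0.2]
cond_probs = [
    [0.9, 0.8, 0.7, 0.6],
    [0.2, 0.3, 0.4, 0.5],
    [0.5, 0.1, 0.9, 0.3],
]

reference = observed_distribution(class_probs, cond_probs)
for perm in permutations(range(3)):
    relabelled = observed_distribution(
        [class_probs[c] for c in perm],
        [cond_probs[c] for c in perm],
    )
    same = all(abs(relabelled[x] - reference[x]) < 1e-12 for x in reference)
    print(perm, same)
```

The slide's point is stronger than this: the multimodality it refers to remains after these relabelled copies have been accounted for.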
Problems for Model Selection
- SEM models with latent variables are not curved exponential families
  - Standard χ² asymptotics do not necessarily apply, e.g. for LRTs
  - Model selection criteria such as BIC are not asymptotically consistent
  - The effective degrees of freedom may vary depending on the values of the model parameters
Problems for Model Selection
- Many models may be equivalent:
[Figure: four path diagrams on X1, X2, Y1, Y2]
Problems for Model Selection
- Models with different numbers of latents may be equivalent:
  - e.g. unrestricted error covariance within blocks
[Figure: two path diagrams on X1, ..., Xp and Y1, ..., Yq]
Wegelin & Richardson (2001)
Two scenarios
- A single SEM model is proposed and fitted. The results are reported.
- The researcher fits a sequence of models, making modifications to an original specification.
  - Model equivalence implies:
    - Final model depends on initial model chosen
    - Sequence of changes is often ad hoc
    - Equivalent models may lead to very different substantive conclusions
  - Often many equivalence classes of models give reasonable fit. Why report just one?
Partial Solution
- Embed each latent variable model in a ‘larger’ model without latent variables, characterized by conditional independence restrictions.
- We ignore non-independence constraints and inequality constraints.
[Figure: sets of distributions; the latent variable model lies inside the model imposing only independence constraints on observed variables]
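Toy Example: The Generating Graph
- Begin with a graph, and associated set of independences
[Figure: generating graph G on vertices a, b, t, c, d]
Independences: a ⊥ c, b ⊥ d, a ⊥ d, a ⊥ d | c, a ⊥ d | b, a ⊥ c | d, b ⊥ d | a, a ⊥ t, d ⊥ t, b ⊥ c | t, + others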
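Toy Example: Marginalization
- Suppose now that some variables are unobserved
- Find the independence relations involving only the observed variables
[Figure: graph G on vertices a, b, t, c, d, with t hidden]
‘Unobserved’ independencies (shown in red on the slide): a ⊥ t, d ⊥ t, b ⊥ c | t, + others
Independencies among observed variables: a ⊥ c, b ⊥ d, a ⊥ d, a ⊥ d | c, a ⊥ d | b, a ⊥ c | d, b ⊥ d | a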
Toy Example: ‘Graphical Marginalization’
- Now construct a graph that represents the conditional independence relations among the observed variables.
- Bi-directed edges are required.
[Figure: graph G on vertices a, b, t, c, d, and graph G* on vertices a, b, c, d]
G* represents all and only the distributions in which these independencies hold: a ⊥ c, b ⊥ d, a ⊥ d, a ⊥ d | c, a ⊥ d | b, a ⊥ c | d, b ⊥ d | a
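The marginalization step of the toy example can be sketched by enumerating d-separation relations among the observed variables. The edge structure of G did not survive in this transcript, so the sketch assumes G is the DAG a -> b <- t -> c <- d with t hidden, which is consistent with the independences listed on the slides. The function names are illustrative, and d-separation is checked with the standard moralization criterion.

```python
from itertools import combinations

# Assumed generating DAG, stored as child -> set of parents:
# a -> b <- t -> c <- d   (t is the hidden variable)
PARENTS = {"a": set(), "d": set(), "t": set(), "b": {"a", "t"}, "c": {"t", "d"}}

def ancestors(nodes, parents):
    """All vertices in `nodes` together with their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        v = stack.pop()
        for p in parents[v]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(x, y, z, parents):
    """x and y are d-separated given z iff they are disconnected, after removing z,
    in the moral graph of the subgraph induced by the ancestors of {x, y} and z."""
    keep = ancestors({x, y} | set(z), parents)
    nbrs = {v: set() for v in keep}
    for child in keep:
        ps = parents[child]          # parents of a kept vertex are kept as well
        for p in ps:                 # drop edge directions
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(ps, 2):   # marry parents of a common child
            nbrs[p].add(q)
            nbrs[q].add(p)
    blocked = set(z)
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        for w in nbrs[v]:
            if w not in seen and w not in blocked:
                seen.add(w)
                stack.append(w)
    return y not in seen

def observed_independences(observed, parents):
    """All triples (x, y, z) with x independent of y given z, using observed variables only."""
    found = []
    for x, y in combinations(observed, 2):
        rest = [v for v in observed if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                if d_separated(x, y, z, parents):
                    found.append((x, y, z))
    return found

for x, y, z in observed_independences(["a", "b", "c", "d"], PARENTS):
    print(x, "_||_", y, ("| " + ",".join(z)) if z else "")
```

Under the assumed graph this enumeration yields seven relations, the same seven listed for the observed variables in the toy example. Constructing G* then amounts to finding a graph on a, b, c, d alone that encodes exactly that list.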
Equivalence re-visited
- Restrict model class to path diagrams including only observed variables, characterized by conditional independence
  - Ancestral Graph Markov models
- For such models we can:
  - Determine the entire class of equivalent models
  - Identify which features they have in common
- Models are curved exponential: usual asymptotics do apply
Equivalence Classes
[Figure, built up over several slides: an ancestral graph on A, B, C, D encoding the independences A ⊥ C, B ⊥ D, A ⊥ D, A ⊥ D | C, A ⊥ D | B, A ⊥ C | D, B ⊥ D | A; further equivalent ancestral graphs on A, B, C, D; for each ancestral graph, its Markov equivalence class of graphs with latent variables (+ infinitely many others); and the Partial Ancestral Graph that represents the equivalence class of ancestral graphs]
Measurement models
- If we have pure measurement models with several indicators per latent:
  - May apply similar search methods among the latent variables (Spirtes et al. 2001; Silva et al. 2003)
Other Related Work
- Iterative ML estimation methods exist
  - Guaranteed convergence
    - Multimodality is still possible
  - Implemented in R package ggm (Drton & Marchetti, 2003)
- Current work:
  - Extension to discrete data
    - Parameterization and ML fitting for binary bi-directed graphs already exist
  - Implementing search procedures in R
References
- Richardson, T., Spirtes, P. (2002) Ancestral graph Markov models. Ann. Stat., 30.
- Richardson, T. (2003) Markov properties for acyclic directed mixed graphs. Scand. J. Statist., 30(1).
- Drton, M., Richardson, T. (2003) A new algorithm for maximum likelihood estimation in Gaussian graphical models for marginal independence. UAI 03.
- Drton, M., Richardson, T. (2003) Iterative conditional fitting in Gaussian ancestral graph models. UAI.
- Drton, M., Richardson, T. (2004) Multimodality of the likelihood in the bivariate seemingly unrelated regressions model. Biometrika, 91(2).
- Marchetti, G., Drton, M. (2003) ggm package.