GEE and Generalized Linear Mixed Models

1 GEE and Generalized Linear Mixed Models
Tom Greene

2 Outline
• Subject specific and population average inference in generalized linear models
• Review of classical generalized linear models with independent observations
• Generalized Estimating Equations
• Contrasts of GLMMs with GEEs
• GEE example

3 Classes of Generalized Linear Models
• Linear Models (linear regression, ANOVA, ANCOVA): E(Y) = Xβ. Responses independent.
• Linear Mixed Models: E(Y|b) = Xβ + Zb. Responses correlated; correlation modeled in part by “random effects”.
• Generalized Linear Models (logistic regression, Poisson regression, etc.): g(E(Y)) = Xβ. Responses independent.
• Generalized Linear Mixed Models (GLMM): g(E(Y|b)) = Xβ + Zb. Responses correlated; correlation modeled in part by “random effects”.
• Generalized Estimating Equations Approach (GEE): g(E(Y)) = Xβ. Responses correlated.

4 Classes of Generalized Linear Models for Correlated Data
• Linear Mixed Models: E(Y|b) = Xβ + Zb. Responses correlated; correlation modeled in part by “random effects”.
• Generalized Estimating Equations Approach (GEE): g(E(Y)) = Xβ. Responses correlated. Population Average Inference.
• Generalized Linear Mixed Models (GLMM): g(E(Y|b)) = Xβ + Zb. Responses correlated; correlation modeled in part by “random effects”. Subject Specific Inference.

5 Classes of Generalized Linear Models for Correlated Data
Population Average Inference: Generalized Estimating Equations Approach (GEE), g(E(Y)) = Xβ, responses correlated
• Analysis describes differences in the mean of Y across the entire population
• Analysis informative from a population perspective; most relevant from the perspective of policy makers and of providers desiring to optimize outcomes across the entire population
Subject Specific Inference: Generalized Linear Mixed Models (GLMM), g(E(Y|b)) = Xβ + Zb, responses correlated
• Analysis describes differences in the mean of Y conditional on the patient’s specific random effect b
• Most relevant from an individual patient’s perspective
• Often b represents a dimension of frailty; hence, Xβ tells about the relationship of Y to X among patients with the same frailty

6 Extreme Example
• Subject specific effect of X on Pr(Death): OR = 20 per 1 unit increase in X
• Population average effect of X on Pr(Death): OR = 2.7 per 1 unit increase in X
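A gap of this kind between the two odds ratios can be reproduced numerically. The sketch below is only an illustration under assumptions that are not on the slide: a logistic model with a normally distributed random intercept, an intercept of -1.5, and a random-intercept standard deviation of 5. It averages the subject specific probabilities over the random effect and then forms the population average odds ratio.

```python
import math

def expit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def marginal_prob(eta, sigma, n=2001, width=8.0):
    """E[expit(eta + b)] for b ~ N(0, sigma^2), by the trapezoid rule."""
    if sigma == 0:
        return expit(eta)
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        b = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        total += weight * expit(eta + b) * density
    return total * h

def marginal_odds_ratio(beta0, beta1, sigma):
    """Population average OR for X = 1 vs X = 0 when the subject specific OR is exp(beta1)."""
    p0 = marginal_prob(beta0, sigma)
    p1 = marginal_prob(beta0 + beta1, sigma)
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

# Assumed values, chosen only for illustration (not taken from the slides).
subject_specific_or = 20.0
population_or = marginal_odds_ratio(-1.5, math.log(subject_specific_or), sigma=5.0)
print(subject_specific_or, population_or)
```

With no between-subject variation (sigma = 0) the two odds ratios coincide; the larger the assumed variation in frailty, the more the population average odds ratio is pulled toward 1.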

8 Example: Toenail Data
Toenail Dermatophyte Onychomycosis (TDO): common toenail infection, difficult to treat, affecting more than 2% of the population.
• Design: randomized, double-blind, parallel group, multicenter study for the comparison of two new compounds (A and B) for oral treatment
• 2 × 189 patients randomized, 36 centers
• 48 weeks of total follow-up (12 months)
• 12 weeks of treatment (3 months)
• Measurements at months 0, 1, 2, 3, 6, 9, 12
• Research question: severity relative to treatment of TDO?

11 Review of Generalized Linear Models (Independent Responses)
• Independent responses Y_i, i = 1, 2, …, N
• Y_i has a distribution from the exponential family, with density f(y; θ, ø)
• Mean model: μ_i = E(Y_i | X_i1, X_i2, …, X_ip), with g(μ_i) = β_0 + β_1 X_i1 + β_2 X_i2 + … + β_p X_ip
• Variance function: Var(Y_i) = ø V(μ_i), where V(μ_i) is a known function determined by the assumed distribution of Y within the exponential family
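The formula for the density was not preserved in the transcript. For reference, one standard way of writing an exponential family density with a dispersion parameter is shown below; the functions a, b and c are the usual textbook notation and are an assumption here, not recovered from the slide. Writing φ for the slide's ø and taking a(φ) = φ gives the variance function relationship Var(Y_i) = φV(μ_i) stated above.

```latex
f(y;\theta,\phi) = \exp\!\left\{ \frac{y\,\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\},
\qquad
E(Y) = \mu = b'(\theta),
\qquad
\operatorname{Var}(Y) = a(\phi)\, b''(\theta) = \phi\, V(\mu) \quad \text{when } a(\phi) = \phi .
```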

12 Review of Generalized Linear Models (Independent Responses)

13 Review of Generalized Linear Models (Independent Responses)

14 Review of Generalized Linear Models (Independent Responses)
• Independent responses Y_i, i = 1, 2, …, N
• Y_i has a distribution from the exponential family, with density f(y; θ, ø)
• Mean model: μ_i = E(Y_i | X_i1, X_i2, …, X_ip), with g(μ_i) = β_0 + β_1 X_i1 + β_2 X_i2 + … + β_p X_ip
• Variance function: Var(Y_i) = ø V(μ_i), where v_i = V(μ_i) is a known function determined by the assumed distribution of Y within the exponential family
The mean model is the only part we have to get right for valid large-sample inference!!!

15 Extension to GEE for Longitudinal Data
GEE: Generalized Estimating Equations (Liang & Zeger, 1986; Zeger & Liang, 1986)
• Method is semi-parametric: estimating equations are derived without full specification of the joint distribution of a subject’s observations
• Instead, the method requires specification of:
  - The mean model for the marginal distributions of the y_ij
  - The variance function of y_ij given μ_ij
  - The “working” correlation matrix for the vector of repeated observations from each subject
• Relies on the independence across subjects (or clusters) to consistently estimate the variance of the regression coefficients

16 GEE Method Outline
1. Relate the marginal response μ_ij = E(y_ij) to a linear combination of the covariates: g(μ_ij) = X_ij^T β
• y_ij is the response for subject i at time j, j = 1, 2, …, J
• X_ij is a p × 1 vector of covariates
• β is a p × 1 vector of regression coefficients
• g(·) is the link function
2. Describe the variance of y_ij as a function of the mean: V(y_ij) = v(μ_ij) ø
• ø is a possibly unknown scale parameter
• v(·) is a known variance function

17 Link and Variance Functions
• Normally-distributed response: g(μ_ij) = μ_ij (“identity link”); v(μ_ij) = 1; V(y_ij) = ø
• Binary response (Bernoulli): g(μ_ij) = log[μ_ij/(1 − μ_ij)] (“logit link”); v(μ_ij) = μ_ij(1 − μ_ij); ø = 1
• Poisson response: g(μ_ij) = log(μ_ij) (“log link”); v(μ_ij) = μ_ij; ø = 1
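The three link and variance pairs above can be written down directly as functions. This is a minimal sketch; the dictionary layout and the names FAMILIES and response_variance are illustrative choices, not part of the slides or of any library.

```python
import math

def logit(mu):
    """Logit link: log odds of a probability mu in (0, 1)."""
    return math.log(mu / (1.0 - mu))

# Link g(mu) and variance function v(mu) for each response type on the slide.
FAMILIES = {
    "normal":  {"link": lambda mu: mu, "variance": lambda mu: 1.0},
    "binary":  {"link": logit,         "variance": lambda mu: mu * (1.0 - mu)},
    "poisson": {"link": math.log,      "variance": lambda mu: mu},
}

def response_variance(family, mu, scale=1.0):
    """V(y) = v(mu) * scale, where scale plays the role of the slide's ø."""
    return scale * FAMILIES[family]["variance"](mu)

print(FAMILIES["binary"]["link"](0.5), response_variance("binary", 0.5))
```

For the binary and Poisson cases the scale is fixed at 1, so only the normal case needs a value other than the default.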

18 GEE Method Outline
3. Choose the form of an n × n “working” correlation matrix R_i for each Y_i, the vector of n repeated measures on subject i

19 Working Correlation Structures

20 Working Correlation Structures

21 Working Correlation Structures: AR(1)

22 Working Correlation Structures
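The matrices shown on these slides did not survive in the transcript. As a stand-in, the sketch below builds three commonly used working correlation structures under their standard definitions: independence, exchangeable (the structure requested by type=exch in the SAS examples later in the deck) and AR(1). The function names are illustrative. An unstructured matrix, which estimates each of the n(n − 1)/2 pairwise correlations separately, is not shown.

```python
def independence(n):
    """Identity matrix: repeated measures treated as uncorrelated."""
    return [[1.0 if j == k else 0.0 for k in range(n)] for j in range(n)]

def exchangeable(n, alpha):
    """Every pair of distinct measurements shares the same correlation alpha."""
    return [[1.0 if j == k else alpha for k in range(n)] for j in range(n)]

def ar1(n, alpha):
    """Correlation decays as alpha ** |j - k| with separation in time."""
    return [[alpha ** abs(j - k) for k in range(n)] for j in range(n)]

for row in ar1(3, 0.5):
    print(row)
```

Exchangeable and AR(1) each need only one correlation parameter, which is part of their appeal when the number of subjects is modest.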

23 GEE Estimation
• Define A_i = n × n diagonal matrix with V(μ_ij) as the jth diagonal element
• Define R_i(α) = n × n “working” correlation matrix (of the n repeated measures)
• Working variance–covariance matrix for Y_i: V_i(α) = ø A_i^{1/2} R_i(α) A_i^{1/2}
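The estimating equations themselves are not preserved in the transcript. In the standard Liang and Zeger formulation they take the form below, where D_i denotes the matrix of derivatives of the mean vector with respect to β; that symbol and the layout are assumptions here, not recovered from the slides. The second expression is the usual robust ("sandwich") covariance estimate, which is what "robust variance estimates" refers to in the GEE vs. GLMM comparisons.

```latex
U(\beta) = \sum_{i=1}^{N} D_i^{T}\, V_i(\alpha)^{-1}\, \bigl( Y_i - \mu_i(\beta) \bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{T}},

\widehat{\operatorname{Cov}}(\hat\beta)
  = \Bigl( \sum_{i=1}^{N} D_i^{T} V_i^{-1} D_i \Bigr)^{-1}
    \Bigl( \sum_{i=1}^{N} D_i^{T} V_i^{-1} (Y_i - \hat\mu_i)(Y_i - \hat\mu_i)^{T} V_i^{-1} D_i \Bigr)
    \Bigl( \sum_{i=1}^{N} D_i^{T} V_i^{-1} D_i \Bigr)^{-1}.
```

The middle term uses the observed residual cross-products from each subject, which is why this estimate remains valid in large samples when the working covariance V_i is misspecified.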

29 GEE vs. GLMM
1) Target of Inference:
• GEE: Population Average
• GLMM: Subject Specific
Note: Recent work addresses how to perform population average inference under GLMMs

30 GEE vs. GLMM
2) Outputs:
• GEE: coefficients relating Y to X
• GLMM:
  - Coefficients relating Y to X conditional on b
  - Estimates of subject specific random effects
  - Variance of subject specific random effects

31 GEE vs. GLMM
3) Robustness:
• GEE (with robust variance estimates): inference valid in large samples even if the distribution of Y and/or the variance of Y are incorrectly specified
• GLMM (with model-based estimates): valid inference generally requires correct specification of the distribution of Y and of the variance of Y
Notes:
• Recent proposals for robust variance estimates under GLMM
• Inference for Linear Mixed Models remains valid if Y is not normal, for large N
• Caveat to GEE robustness: GEE can be biased if time dependent covariates are used, unless an independence working correlation matrix is used

32 GEE vs. GLMM
4) Efficiency (power and width of confidence intervals):
• GEE:
  - Usually fairly efficient if the variance function is correctly specified
  - Between subject comparisons are nearly efficient if an independence covariance structure is used for balanced data
• GLMM: maximum likelihood estimates are asymptotically efficient as long as the model is correctly specified

33 GEE vs. GLMM
5) Missing Data:
• “Classical” GEE (with robust variance estimates):
  - Valid inference if data are Missing Completely At Random (MCAR), even if the variance model is wrong
  - If the variance model is correct, the estimate of β is still consistent if data are MAR but not MCAR (but standard errors are not correct)
• GLMM (with model-based estimates): valid inference if data are Missing At Random (MAR)
Note: Various strategies exist for valid GEE inference if data are MAR

34 Missing data
Three general approaches to dealing with missing data under GEE which assume MAR but not MCAR:
1. Inverse probability weighting (Robins, Rotnitzky and Zhao, JASA, 1995)
2. Multiple imputation
3. Inverse probability weighting with augmentation, or doubly robust estimation
Each method can incorporate covariate information not included in the GEE model itself. This can make the MAR assumption much more plausible.
Methods 2 and 3 can be considerably more efficient than standard inverse probability weighting.
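The idea behind method 1 can be shown on a toy example. The sketch below is not a GEE fit: it estimates a simple mean, and it assumes the probability of being observed is known for each subject, whereas in practice it would be estimated from a dropout model. The data and names are hypothetical. Each observed response is weighted by the inverse of its probability of being observed, so subjects who resemble those likely to be missing count for more.

```python
def complete_case_mean(values, observed):
    """Mean of the observed responses only, ignoring the missingness mechanism."""
    seen = [y for y, r in zip(values, observed) if r]
    return sum(seen) / len(seen)

def ipw_mean(values, observed, obs_prob):
    """Weighted mean with weight 1 / P(observed) for each observed response."""
    num = 0.0
    den = 0.0
    for y, r, p in zip(values, observed, obs_prob):
        if r:
            num += y / p
            den += 1.0 / p
    return num / den

# Hypothetical data: subjects with y = 1 are observed with probability 0.5,
# subjects with y = 0 are always observed. The full-data mean is 0.5.
values = [1, 1, 1, 1, 0, 0, 0, 0]
observed = [True, True, False, False, True, True, True, True]
obs_prob = [0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0]

print(complete_case_mean(values, observed), ipw_mean(values, observed, obs_prob))
```

In this constructed example the complete-case mean is pulled toward the group that is always observed, while the weighted mean recovers the full-data value.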

36 GEE vs. GLMM
6) Small to Moderate Samples:
• GEE (with robust variance estimates):
  - Estimated standard errors are unstable and biased downwards
  - Inefficient estimating equation for estimating variance; effectively uses a fully unstructured variance model
  - “Sample size” means the number of independent units
  - Various corrections have been proposed (available in PROC GLIMMIX)
• GLMM (with model-based estimates): large-sample approximations are often invoked, but performance is usually better than GEE with small to moderate N if the model is correctly specified

37 More Toenail Data
• Multicenter trial comparing active vs. control oral treatments for toenail infection
• Repeated measurements of binary outcome: 0 = none or mild separation, 1 = severe separation
• 1908 observations in 294 patients, mostly over 1 year

38 **** Standard GENMOD GEE program using Robust SEs *****;
**** Binary outcome leads to default logistic link function ****;
proc genmod descending;
  class id;
  model outcome = treatment month treatment*month / dist=bin;
  repeated subject=id / type=exch covb corrw;
  estimate 'Control Slope' month 1 / exp;
  estimate 'Treatment Slope' month 1 treatment*month 1 / exp;
run;
Output: Working Correlation Matrix, Row1 to Row7 by Col1 to Col7 (values not preserved in the transcript)

39 Analysis Of GEE Parameter Estimates
Output from the same PROC GENMOD program as on slide 38: Analysis Of GEE Parameter Estimates, Empirical Standard Error Estimates
Columns: Parameter | Estimate | Standard Error | 95% Confidence Limits | Z | Pr > |Z|
Rows: Intercept, treatment, month, treatment*month
Surviving value: Pr > |Z| < .0001 for month (other values not preserved in the transcript)

40 **** Standard GENMOD GEE program using Robust SEs *****;
Output from the same PROC GENMOD program as on slide 38: Contrast Estimate Results
Annotation on the slide: “Can ignore in this case”
Columns: Label | Mean Estimate | Mean Confidence Limits | L'Beta Estimate | Standard Error | Alpha | L'Beta Confidence Limits | Chi-Square | Pr > ChiSq
Rows: Control Slope, Exp(Control Slope), Treatment Slope, Exp(Treatment Slope)
Surviving values: Pr > ChiSq < .0001 for Control Slope and for Treatment Slope (other values not preserved in the transcript)
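The two ESTIMATE statements form linear combinations L'β of the fitted coefficients and, with the exp option, report exp(L'β) as an odds ratio per month. The sketch below mirrors that arithmetic. The coefficient values are hypothetical placeholders, since the fitted values are not preserved in the transcript, and the interpretation assumes treatment is coded 0 for control and 1 for active.

```python
import math

def contrast_estimate(beta, weights):
    """Return (L'beta, exp(L'beta)) for a dict of coefficients and contrast weights."""
    l_beta = sum(w * beta[name] for name, w in weights.items())
    return l_beta, math.exp(l_beta)

# Hypothetical coefficients on the logit scale (NOT the fitted values from the slides).
beta = {"Intercept": -0.5, "treatment": 0.0, "month": -0.2, "treatment*month": -0.1}

# 'Control Slope': month 1
control = contrast_estimate(beta, {"month": 1})
# 'Treatment Slope': month 1 treatment*month 1
treated = contrast_estimate(beta, {"month": 1, "treatment*month": 1})

print("Control slope and OR per month:", control)
print("Treatment slope and OR per month:", treated)
```

Under the assumed coding, the month coefficient alone is the slope in the control arm, and adding the interaction coefficient gives the slope in the active arm.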

41 Solutions for Fixed Effects
**** GLIMMIX GLMM Estimating Subject Specific Effects ****;
**** Binary outcome leading to default logistic link function ****;
proc glimmix method=RSPL data=toenail;
  class id;
  model outcome (event="1") = treatment month treatment*month / s dist=binary;
  random int / subject=id;
  estimate 'Control Slope' month 1 / or;
  estimate 'Treatment Slope' month 1 treatment*month 1 / or cl;
run;
Output: Solutions for Fixed Effects
Columns: Effect | Estimate | Standard Error | DF | t Value | Pr > |t|
Rows: Intercept, treatment, month, treatment*month
Surviving value: Pr > |t| < .0001 for month (other values not preserved in the transcript)

42 *** Small Sample;
data small;
  set toenail;
  if id <= 20;
run;
** Standard GENMOD GEE with Robust SEs: 17 Patients Only ***;
** Binary outcome leading to default logistic link function **;
proc genmod descending;
  class id;
  model outcome = treatment month treatment*month / dist=bin;
  repeated subject=id / type=exch covb corrw;
run;
Output columns: Parameter | Estimate | Standard Error | 95% Confidence Limits | Z | Pr > |Z|
Rows: Intercept, treatment, month, treatment*month (values not preserved in the transcript)

43 Solutions for Fixed Effects
**** GLIMMIX GEE program using Robust SEs;
**** Binary outcome leads to default logistic link function;
**** Restricted to 17 patients;
**** Small N Adjustment of Morel, Bokossa, and Neerchal (2003);
proc glimmix method=RSPL empirical=mbn data=small;
  class id;
  model outcome (event="1") = treatment month treatment*month / s dist=binary ddfm=kenwardroger;
  random _residual_ / subject=id type=cs;
run;
Output: Solutions for Fixed Effects
Columns: Effect | Estimate | Standard Error | DF | t Value | Pr > |t|
Rows: Intercept, treatment, month, treatment*month (values not preserved in the transcript)

44 THAT’s ALL

