PROC GLIMMIX: AN OVERVIEW


By William E. Jackman

* A new SAS/STAT product: experimental in SAS 9.1, production in SAS 9.2.
* Succeeds the %GLIMMIX macro.
* Combines and extends statistical features found in other SAS procedures.
* Part of a succession of SAS procedures that have extended the General Linear Model (GLM).

Regression Analysis Basics
* Y = B0 + B1 X1 + B2 X2 + ... + Bp Xp + e
* y = Xβ + ε (matrix notation)
* ε ~ N(0, σ² I_n)
* Estimation is by ordinary least squares (OLS).
* This is the essence of the General Linear Model (GLM).
* The Y's and the X's go by several names; the X's, for example, are also called covariates.
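
As a brief worked illustration of the OLS step, and assuming that X has full column rank (an assumption not stated on the slide), minimizing the residual sum of squares leads to the normal equations and a closed-form estimate:

```latex
\begin{aligned}
\hat{\beta} &= \arg\min_{\beta}\,(y - X\beta)^{\top}(y - X\beta) \\
X^{\top}X\,\hat{\beta} &= X^{\top}y \\
\hat{\beta} &= (X^{\top}X)^{-1}X^{\top}y
\end{aligned}
```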

The GLM underlies PROC REG and PROC GLM
* Both procedures use OLS to fit the GLM to data with a continuous response variable.
* Both make the same assumptions about the residuals.
* PROC REG has advantages for continuous effects (regressors).
* PROC GLM has advantages for discrete effects (classification variables).

Indicator (dummy) variables and interactions
* PROC REG: must be created in a DATA step
* PROC GLM: use the CLASS and MODEL statements
Which procedure to use?
* Interested primarily in the effect of continuous variables (covariates)? PROC REG has the advantage.
* Interested primarily in the effect of grouping variables? PROC GLM has the advantage.
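
The contrast between the two procedures can be sketched as follows. This is only an illustration: the data set trial, its variables (group, x, y), and the data values are made up for the example and do not come from the presentation.

```sas
/* Hypothetical toy data: response y, covariate x, three-level factor group. */
/* The values are invented purely to make the example runnable.              */
data trial;
  input group $ x y;
  /* Indicator variables built by hand; group C is the reference level */
  g_a = (group = 'A');
  g_b = (group = 'B');
  datalines;
A 1.0 2.1
A 2.0 2.9
B 1.0 3.2
B 2.0 4.1
C 1.0 4.0
C 2.0 5.2
;
run;

/* PROC REG: the indicators must already exist in the data set */
proc reg data=trial;
  model y = x g_a g_b;
run;
quit;

/* PROC GLM: the CLASS statement builds the indicators internally */
proc glm data=trial;
  class group;
  model y = x group / solution;
run;
quit;
```

Because PROC GLM sets the last level of a CLASS variable (here C) to zero, the group estimates printed by the SOLUTION option should correspond to the g_a and g_b coefficients from PROC REG.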

The generalized linear model (GzLM)
* Extends (or generalizes) the GLM.
* Presented in 1972; expanded in 1989.
* Handles non-normal data from the exponential family.
* Linearity is achieved through the link function.
* Implemented, for example, in PROC GENMOD.
* PROC GENMOD can also handle correlated residuals.

General form of the GENMOD procedure

    PROC GENMOD options ;
      CLASS variables ;
      MODEL response = effects / DIST= LINK= options ;
      REPEATED SUBJECT=subject-effect / options ;
    RUN ;

Example of the GENMOD procedure for Poisson regression

    proc genmod data=skin ;
      class city age ;
      /* Poisson rate model: the offset puts the counts on a per-population scale */
      model cases = city age / offset=log_pop dist=poisson link=log ;
    run ;

where log_pop = log of the population

The generalized linear model (GzLM)
* Canonical link functions are the most common.
* They are obtained from the probability density function.
* They are the default in PROC GENMOD.
* For the Poisson distribution the default link function is the log of the mean of the response variable: log(μ) = Xβ
* Inverse link function: μ = exp(η), where η = Xβ
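
To show where the canonical link comes from, the Poisson probability function can be written in exponential-family form. The symbol θ for the natural parameter is introduced here only for illustration:

```latex
\begin{aligned}
f(y;\mu) &= \frac{\mu^{y}e^{-\mu}}{y!}
          = \exp\{\, y\log\mu - \mu - \log y! \,\} \\
\theta &= \log\mu \quad\text{(natural parameter, so the canonical link is } g(\mu)=\log\mu\text{)} \\
\eta &= X\beta = \log\mu \;\;\Longrightarrow\;\; \mu = e^{\eta}
\end{aligned}
```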

Logistic Regression: A special case of the generalized linear model (GzLM)
* The response variable comes from the binomial distribution.
* The binomial is part of the exponential family, so the GzLM applies.
* The link function is the logit: logit(pi) = ln(pi / (1 - pi)), where pi is the probability of the event.
* Can be done with PROC GENMOD.
* Input from David Schlotzhauer of SAS Institute.
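
A minimal sketch of a fixed-effects logistic regression, fitted two ways. It assumes a data set named example with a binary response y coded 0/1 and a treatment factor trt; these names are borrowed from the GLIMMIX example later in the deck, and the coding of y is an assumption.

```sas
/* Logistic regression as a GzLM in PROC GENMOD.               */
/* DESCENDING makes the procedure model Pr(y = 1), not Pr(y = 0). */
proc genmod data=example descending;
  class trt;
  model y = trt / dist=binomial link=logit;
run;

/* The same model in PROC LOGISTIC.                              */
/* PARAM=GLM requests the same dummy coding that GENMOD uses.    */
proc logistic data=example;
  class trt / param=glm;
  model y(event='1') = trt;
run;
```

With matching parameterizations, the two procedures should report the same maximum likelihood estimates for the trt effect.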

FURTHER EXTENSIONS OF THE GLM
* The GLM and the GzLM cannot handle random effects.
* Fixed effects: interest lies only in the levels specified.
* Random effects: inference extends to other levels.
* PROC GENMOD and PROC LOGISTIC cannot handle random effects.

PROC MIXED: An extension of the GLM
* Can handle random effects and correlated errors.
* Fixed-effects-only model: y = Xβ + ε
* Mixed model: y = Xβ + Zγ + ε

* Mixed models distinguish between G-side random effects and R-side random effects.
* G-side random effects correspond to covariates (regressors) in the model that are random.
* R-side random effects correspond to the residuals in the model.

Example of PROC MIXED syntax

    proc mixed ;
      class id time gender ;
      model z = gender age gender*age ;
      random intercept / subject=id ;            /* G-side effects go here */
      repeated time / subject=id type=ar(1) ;    /* R-side effects go here */
    run ;

PROC MIXED: a linear mixed model (LMM)
PROC MIXED
* allows for random intercepts for each subject.
* models the correlation in the repeated measures within each subject.
* has a rich variety of covariance structures for dealing with correlated residuals.
Unlike GzLMs, LMMs require a normally distributed response variable.

PROC GLIMMIX - PUTTING IT ALL TOGETHER
* Fits a Generalized Linear Mixed Model (GzLMM).
* Combines and extends features of GzLMs and LMMs.
* Enables modeling of random effects and correlated errors for non-normal data.

The Generalized Linear Mixed Model (GzLMM)
* The linear predictor can contain random effects: η = Xβ + Zγ
* The random effects are normally distributed.
* The conditional mean, μ|γ, relates to the linear predictor through a link function: g(μ|γ) = η
* The conditional distribution (given γ) of the data belongs to the exponential family of distributions.
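
One concrete instance of these four statements is a logistic model with a random intercept per cluster. The indices (i for cluster, j for observation), the single covariate x, and the variance symbol are illustrative assumptions, not notation from the presentation:

```latex
\begin{aligned}
y_{ij} \mid \gamma_i &\sim \text{Bernoulli}(\mu_{ij}) \\
\operatorname{logit}(\mu_{ij}) &= \eta_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma_i \\
\gamma_i &\sim N(0, \sigma_\gamma^2) \\
\mu_{ij} &= \frac{1}{1 + e^{-\eta_{ij}}}
\end{aligned}
```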

Other new features of PROC GLIMMIX include:
* low-rank smoothing based on mixed models
* new features for LS-means comparisons and display
* SAS programming statements allowed within the procedure
* fitting models to multivariate data with different distributions or links

General form of the GLIMMIX procedure:

    PROC GLIMMIX options ;
      programming statements ;
      CLASS variables ;
      MODEL response = fixed-effects / DIST= LINK= options ;
      RANDOM random-effects / options ;
      RANDOM _RESIDUAL_ / options ;
    RUN ;

* Like other mixed models, PROC GLIMMIX distinguishes between G-side random effects and R-side random effects.
* G-side random effects correspond to covariates in the model that are random.
* R-side random effects correspond to the residuals in the model.

Example of a GzLMM using PROC GLIMMIX for Logistic Regression with Random Effects

    proc glimmix data=example ;
      class trt clinic ;
      model y = trt / dist=binomial link=logit ;
      random clinic trt*clinic ;
      /* equivalent form: random intercept trt / subject=clinic ; */
    run ;
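
The commented-out RANDOM statement in this example points at a second way to write the same model. A sketch of that variant, using the same data set and variable names as the slide, might look like this:

```sas
/* Same random effects as "random clinic trt*clinic", written with SUBJECT=. */
/* intercept / subject=clinic  gives the clinic effect;                      */
/* trt / subject=clinic        gives the trt*clinic effect.                  */
proc glimmix data=example;
  class trt clinic;
  model y = trt / dist=binomial link=logit;
  random intercept trt / subject=clinic;
run;
```

The SUBJECT= form tells the procedure that the data fall into independent blocks by clinic, which lets it process the model by subjects.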

* This example cannot be handled by PROC LOGISTIC since clinic is a random effect.
* For logistic regression with fixed effects only, PROC GLIMMIX or PROC LOGISTIC can be used. Which should you use?
* More input from David Schlotzhauer of the SAS Institute.

Parameter Estimation Methods in PROC GLIMMIX
* The GLIMMIX procedure has two basic modes of parameter estimation: GLM-mode and GLMM-mode.
* In GLM-mode, the data are never correlated and there can be no G-side random effects.
* In GLMM-mode, there might be random effects and/or correlated data.

Parameter Estimation for generalized linear models
* Normal distribution: restricted maximum likelihood
* All other known distributions: maximum likelihood
* Unknown distributions: quasi-likelihood

Parameter Estimation for generalized linear models with overdispersion
* Parameters are estimated using maximum likelihood.
* An overdispersion parameter can be estimated from the Pearson statistic.
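
A sketch of how an overdispersion parameter might be requested, reusing the data set and variable names from the earlier PROC GENMOD Poisson example (skin, cases, city, age, log_pop). Carrying that example over to PROC GLIMMIX is an assumption made here for illustration:

```sas
/* Poisson regression with a multiplicative overdispersion parameter.   */
/* There are no G-side random effects, so the model stays in GLM-mode.  */
proc glimmix data=skin;
  class city age;
  model cases = city age / dist=poisson link=log offset=log_pop;
  random _residual_;   /* adds the overdispersion (scale) parameter */
run;
```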

Parameter Estimation for generalized linear mixed models
* Pseudo-likelihood

Using PROC GLIMMIX for Linear Mixed Models
In this example, the response variable is normally distributed.

    proc glimmix data=grass ;
      class method variety ;
      model yield = method / dist=normal ;
      random variety method*variety ;
    run ;

PROC GLIMMIX uses residual (restricted) maximum likelihood, as does PROC MIXED.

* PROC GLIMMIX can do much of what PROC LOGISTIC, PROC MIXED, PROC REG, and PROC GLM can do.
* It could be viewed as a "super PROC".
* Input from Jill Tao of the SAS Institute.

PROC GLIMMIX versus PROC MIXED
* Closely related, but with important differences.
* PROC GLIMMIX is not PROC MIXED with a LINK= and a DIST= option.
* PROC GLIMMIX models non-normal data. PROC MIXED does not.
* PROC GLIMMIX allows programming statements. PROC MIXED does not.
* PROC GLIMMIX uses the RANDOM statement to model R-side random effects. PROC MIXED uses the REPEATED statement to model R-side random effects.
* PROC GLIMMIX does not support the Kronecker and heterogeneous covariance structures that PROC MIXED supports.
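
To make the RANDOM versus REPEATED difference concrete, here is a sketch of how the earlier PROC MIXED example might be carried over to PROC GLIMMIX. The data set name long is hypothetical (the original example names no data set); the variables are those of the PROC MIXED slide.

```sas
/* PROC MIXED version (from the earlier slide):             */
/*   random intercept / subject=id;                         */
/*   repeated time / subject=id type=ar(1);                 */
proc glimmix data=long;   /* "long" is a hypothetical data set name */
  class id time gender;
  model z = gender age gender*age;
  random intercept / subject=id;                  /* G-side */
  random time / subject=id type=ar(1) residual;   /* R-side: replaces REPEATED */
run;
```

The RESIDUAL option marks the second RANDOM statement as R-side, so it plays the role that the REPEATED statement plays in PROC MIXED.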

PROC GLIMMIX versus PROC GENMOD
PROC GLIMMIX
* fits unit-specific models with the G-side random effects.
* fits population-average models without the G-side effects. (Without the G-side effects, there is no way to condition the response and make the estimates unit-specific.)
* provides sandwich estimators of the covariance of the fixed effects through the EMPIRICAL option when the model is processed by subjects.
* computes the parameter estimates by a pseudo-likelihood method.
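
A population-average fit with sandwich standard errors could be sketched as below. The data set and variables are taken from the clinic example earlier in the deck; the compound-symmetry structure (type=cs) is an arbitrary choice made for illustration, not something the presentation specifies.

```sas
/* Marginal (population-average) logistic model: R-side effects only.   */
/* EMPIRICAL requests sandwich estimators; SUBJECT= lets the procedure  */
/* process the model by subjects, which the option requires.            */
proc glimmix data=example empirical;
  class trt clinic;
  model y = trt / dist=binomial link=logit;
  random _residual_ / subject=clinic type=cs;
run;
```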

PROC GLIMMIX versus PROC GENMOD
PROC GENMOD
* cannot accommodate random effects.
* fits only population-average models.
* computes the parameter estimates by a moment-based method.

Applications Using the GLIMMIX Procedure (from "Statistical Analysis with the GLIMMIX Procedure")
* Poisson Regression with Random Effects
* An Example of Beta Regression
* Repeated Measures Data with Discrete Response
* Introduction to Radial Smoothing
Applications are explained in detail in the SAS course.

Fitting Models To Multivariate Data In Which Observations Do Not All Have The Same Distribution Or Link
EXAMPLE: JOINT MODELS FOR BINARY AND POISSON DATA (from a paper by Oliver Schabenberger of the SAS Institute)

    data joint ;
      length dist $7 ;
      input d $ patient age OKstatus response @@ ;
      /* d flags which distribution each record follows */
      if d = 'B' then dist = 'Binary' ;
      else dist = 'Poisson' ;
      datalines ;
    B 1 78 1 0 P 1 78 1 9 B 2 60 1 0 P 2 60 1 4
    B 3 68 1 1 P 3 68 1 7 B 4 62 0 1 P 4 62 0 35
    B 5 76 0 0 P 5 76 0 9 B 6 76 1 1 P 6 76 1 7
    ;

(Only 3 data lines are shown.)

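    proc glimmix data=joint ;
      class patient dist ;
      /* dist=byobs(dist) takes the distribution of each observation from the variable dist */
      model response(event='1') = dist dist*age dist*OKstatus
            / noint s dist=byobs(dist) ;
      random int / subject=patient ;   /* G-side: shared random intercept per patient */
    run ;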

* The previous slide showed modeling correlations through G-side random effects.
* It could also be done through R-side random effects.
* This is presented in the SAS course "Statistical Analysis with the GLIMMIX Procedure", which expands upon this example.
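
For orientation only, an R-side variant of the joint model might be sketched as follows. This is not the course's code: the unstructured covariance (type=un) is an assumption made for the sketch, and the course may use a different structure or additional options.

```sas
/* Joint binary/Poisson model with the within-patient correlation      */
/* modeled on the R-side instead of through a G-side random intercept. */
proc glimmix data=joint;
  class patient dist;
  model response(event='1') = dist dist*age dist*OKstatus
        / noint s dist=byobs(dist);
  random _residual_ / subject=patient type=un;   /* assumed structure */
run;
```

Each patient contributes one binary and one count record, so the R-side covariance matrix here is 2 x 2 per patient.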