1 STA 617 – Chp12 Generalized Linear Mixed Models SAS for Model (12.3) with Matched Pairs from Table 12.1.

3 REPLICATE Statement
The REPLICATE statement provides a way to accommodate models in which different subjects have identical data. PROC NLMIXED assumes that the value of the REPLICATE variable indicates the number of subjects having data identical to those for the current value of the SUBJECT= variable (specified in the RANDOM statement). Only the last observation of the REPLICATE variable for each subject is used, and the REPLICATE variable must have only positive integer values.
Note that the REPLICATE mechanism is different from using a FREQ statement in other statistical modeling procedures, such as PROC GLM, GENMOD, GLIMMIX, and LOGISTIC. A FREQ variable identifies grouped values for observations, essentially multiplying the log likelihood or sum of squares contribution of a single observation. A REPLICATE variable multiplies the contribution of a subject, which comprises one or more observations.
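As a minimal sketch of how REPLICATE is typically used, the following fits a random-intercept logistic model to matched pairs stored as one block of rows per distinct response pattern. The data set name `pairs`, the variable names, the starting values, and the counts are all hypothetical placeholders; they are not the Table 12.1 data.

```sas
/* Hypothetical layout: one "case" per response pattern, two rows per case */
/* (occasion 0 and 1), with the pattern count repeated on each row.        */
data pairs;
  input case occasion response count;
  datalines;
1 0 1 40
1 1 1 40
2 0 1 10
2 1 0 10
3 0 0  5
3 1 1  5
4 0 0 45
4 1 0 45
;
run;

proc nlmixed data=pairs qpoints=50;
  parms alpha=0 beta=0 sigma=1;        /* starting values (assumed)          */
  eta = alpha + beta*occasion + u;     /* subject-specific linear predictor  */
  p   = exp(eta) / (1 + exp(eta));     /* inverse logit                      */
  model response ~ binary(p);
  random u ~ normal(0, sigma*sigma) subject=case;
  replicate count;                     /* each case stands for count subjects */
run;
```

Because REPLICATE weights a whole subject, the two rows of a case are counted together; a FREQ-style weight would instead multiply each row separately.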

4 12.3 EXAMPLES OF RANDOM EFFECTS MODELS FOR BINARY DATA
Random effects models:
• Small-Area Estimation of Binomial Proportions
• Modeling Repeated Binary Responses
• Modeling Heterogeneity among Multicenter Clinical Trials
• Alternative Formulations of Random Effects Models
• Capture–Recapture Modeling to Predict Population Size

5 Small-Area Estimation of Binomial Proportions
• Small-area estimation refers to estimation of parameters for a large number of geographical areas when each has relatively few observations.
• For instance, one might want county-specific estimates of characteristics such as the unemployment rate or the proportion of families having health insurance coverage.
• With a national or statewide survey, some counties may have few observations. Then the sample proportions in those counties may poorly estimate the true countywide proportions.
• Random effects models that treat each county as a cluster can provide improved estimates.
• In assuming that the true proportions vary according to some distribution, the fitting process "borrows from the whole": it uses data from all the counties to estimate the proportion in any given one.

6 Example
• A simulated sample of size 2000 mimics a poll taken before the 1996 U.S. presidential election.
• For $T_i$ observations in state $i$ ($i = 1, \ldots, 51$, where $i = 51$ is DC), $y_i$ is bin($T_i$, $\pi_i$), where $\pi_i$ is the actual proportion of votes in state $i$ for Bill Clinton in the 1996 election, conditional on voting for Clinton or the Republican candidate, Bob Dole.
• Here, $T_i$ is proportional to the state's population size, subject to $\sum_i T_i = 2000$.
• Table 12.2 shows $T_i$, $\pi_i$, and $p_i = y_i / T_i$.

8 Fixed-effects model
Problem: some states have few observations, so the sample proportions in those states may poorly estimate the true statewide proportions.
General notation:
• Let $\pi_i$ denote the true proportion in area $i$, $i = 1, \ldots, n$. These areas may be all the ones of interest, or only a sample.
• Let $\{y_i\}$ denote independent bin($T_i$, $\pi_i$) variates; that is, $y_i = \sum_t y_{it}$, where $\{y_{it},\ t = 1, \ldots, T_i\}$ are independent with $P(Y_{it} = 1) = \pi_i$ and $P(Y_{it} = 0) = 1 - \pi_i$.
• The sample proportions $p_i = y_i / T_i$ are the ML estimates of $\pi_i$ for the fixed-effects model.

9 Problem of the fixed-effects model
• For small $\{T_i\}$, the $\{p_i\}$ have large standard errors.
• Thus, the $\{p_i\}$ may display much more variability than the $\{\pi_i\}$, especially when the $\pi_i$ are similar.
• It is helpful to shrink the $\{p_i\}$ toward their overall mean.
→ Random effects
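The size of the problem can be made concrete with the usual binomial standard error. As a rough illustration (the value $\pi_i = 0.5$ and the two sample sizes are chosen only for the arithmetic and are not taken from Table 12.2):

```latex
\[
\operatorname{SE}(p_i) = \sqrt{\frac{\pi_i (1 - \pi_i)}{T_i}}, \qquad
T_i = 4:\ \sqrt{\tfrac{0.25}{4}} = 0.25, \qquad
T_i = 100:\ \sqrt{\tfrac{0.25}{100}} = 0.05 .
\]
```

A state with only a handful of observations can therefore produce a sample proportion far from its true value purely by chance.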

10 Random effects model
• Model (12.9): $\mathrm{logit}(\pi_i) = \alpha + u_i$, where the $\{u_i\}$ are independent $N(0, \sigma^2)$.
• If $\hat\sigma = 0$, then all the random effects estimates of the $\pi_i$ equal the overall sample proportion. When the true $\pi_i$ are all equal, this is a much better estimator of that common value than the sample proportion from a single sample.
• Generally, the random effects model estimators shrink the separate sample proportions toward the overall sample proportion. The amount of shrinkage decreases as $\hat\sigma$ increases.

11
• The predicted random effect is the estimated mean of the distribution of $u_i$, given the data.
• This prediction depends on all the data, not just data from area $i$.
• A benefit is potential reduction in the mean-squared error of the estimates around the true values.
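In symbols, and only as a sketch of the statement above: assume model (12.9) is the random-intercept model $\mathrm{logit}(\pi_i) = \alpha + u_i$ with $u_i \sim N(0, \sigma^2)$, write $f(y_i \mid u)$ for the binomial probability of $y_i$ given $u_i = u$, and $\phi(u; 0, \sigma^2)$ for the normal density. The notation $f$ and $\phi$ is introduced here for illustration.

```latex
\[
E(u_i \mid y_i) =
\frac{\int u \, f(y_i \mid u)\, \phi(u; 0, \sigma^2)\, du}
     {\int f(y_i \mid u)\, \phi(u; 0, \sigma^2)\, du},
\qquad
f(y_i \mid u) = \binom{T_i}{y_i}
\frac{e^{(\alpha + u)\, y_i}}{\left(1 + e^{\alpha + u}\right)^{T_i}} .
\]
```

The data from the other areas enter through $\hat\alpha$ and $\hat\sigma$, which are estimated from every area and then substituted into this expression.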

12 SAS GLMM Analyses of Election Data in Table 12.2
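The SAS code for this slide did not survive in the transcript. A minimal sketch of how such a fit could be set up in PROC NLMIXED is shown here, assuming model (12.9) is $\mathrm{logit}(\pi_i) = \alpha + u_i$. The data set `vote1` and the variables `y` and `n` follow the simulation step shown on a later slide; the variable `state`, the output data set `pred`, and the starting values are assumptions.

```sas
/* Sketch only: assumes vote1 holds one row per state with            */
/*   state = state identifier (hypothetical variable name)            */
/*   y     = simulated number of Clinton votes                        */
/*   n     = number of observations T_i in the state                  */
proc nlmixed data=vote1;
  parms alpha=0 sigma=1;               /* starting values (assumed)    */
  eta = alpha + u;                     /* logit(pi_i) = alpha + u_i    */
  p   = exp(eta) / (1 + exp(eta));     /* inverse logit                */
  model y ~ binomial(n, p);
  random u ~ normal(0, sigma*sigma) subject=state;
  predict p out=pred;                  /* predicted pi_i for each state */
run;
```

The PREDICT statement writes the state-level predicted proportions to `pred`, which can then be compared with the sample proportions $p_i = y_i / T_i$.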

13 For the ML fit of model (12.9):
• From the predicted random effect values obtained using PROC NLMIXED in SAS, considerable shrinkage of these estimates occurs from the sample proportions toward the overall proportion supporting Clinton [exp(0.1633)/(1 + exp(0.1633)) ≈ 0.541]; the predicted values vary up to 0.696.
• The sample proportions vary up to 1.0.
• Sample proportions based on fewer observations, such as DC, tended to shrink more.
• Although the estimates incorporating random effects are relatively homogeneous, they tend to be closer than the sample proportions to the true values.

14 How to simulate the data?
/* new simulation */
data vote1;
  set vote;
  call streaminit(617);   /* arbitrary seed (an assumption) so the draw is reproducible */
  /* simulate the count for each state from its true probability truep and size n */
  y = rand("BINOMIAL", truep, n);
run;

15 Modeling Repeated Binary Responses (incorporating covariates)
The subjects indicated whether they supported legalizing abortion in each of three situations. The items (1 = yes, 2 = no) are:
(1) if the family has a very low income and cannot afford any more children;
(2) when the woman is not married and does not want to marry the man;
(3) when the woman wants the abortion for any reason.

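16
• Let $y_{it}$ denote the response for subject $i$ on item $t$, with $y_{it} = 1$ representing support.
• Consider the model $\mathrm{logit}\,P(Y_{it} = 1 \mid u_i) = \alpha + \beta_t + \gamma x_i + u_i$, where $x_i = 1$ for females and 0 for males, and where the $u_i$ are independent $N(0, \sigma^2)$. The third item is the baseline ($\beta_3 = 0$), as in the SAS parameterization on the next slide.
• The gender effect $\gamma$ is assumed the same for each item, and the $\{\beta_t\}$ refer to the items.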

17
data new;
  input sex poor single any count;
  datalines;
;
/* reshape to one row per case and item; q1 and q2 are item indicators, item 3 is the baseline */
data new;
  set new;
  sex = sex - 1;   /* shift the sex coding down by 1 */
  case = _n_;      /* one case per response pattern */
  q1 = 1; q2 = 0; resp = poor;   output;
  q1 = 0; q2 = 1; resp = single; output;
  q1 = 0; q2 = 0; resp = any;    output;
  drop poor single any;
run;
proc nlmixed data=new qpoints = 50;
  parms alpha=0 beta1=.8 beta2=.3 gamma=0 sigma=8.6;
  eta = alpha + beta1*q1 + beta2*q2 + gamma*sex + u;
  p = exp(eta)/(1 + exp(eta));   /* inverse logit */
  model resp ~ binary(p);
  random u ~ normal(0, sigma*sigma) subject = case;
  replicate count;               /* each case stands for count subjects */
  estimate 'diff1-2' beta1-beta2;
run;

18 GEE
/* expand each response pattern into count individual subjects */
data new2;
  set new;
  do i = 1 to count;
    id = compress(case||"|"||i);
    output;
  end;
run;
/* split by item, then line the three responses up side by side */
data q1 q2 q3;
  set new2;
  if q1=1 and q2=0 then output q1;
  else if q1=0 and q2=1 then output q2;
  else output q3;
run;
data qq;
  merge q1(rename=(resp=qq1)) q2(rename=(resp=qq2)) q3(rename=(resp=qq3));
run;
/* pairwise correlations among the three item responses */
proc corr;
  var qq1 qq2 qq3;
run;
proc GENMOD desc data=new2;
  class id;
  model resp = q1 q2 sex / link=logit dist=bin covb MAXITER=500;
  repeated subject = id / type=exch;   /* exchangeable working correlation */
  estimate 'diff12' q1 1 q2 -1;
run;

19
• For a given subject of either gender, for instance, the estimated odds of supporting legalized abortion for item 1 equal exp(0.83) = 2.3 times the estimated odds for item 3.
• For each item, the estimated probability of supporting legalized abortion is similar for females and males with similar random effect values ($\hat\gamma = 0.01$).
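The factor 2.3 follows from the linear predictor used in the NLMIXED code, $\alpha + \beta_1 q_1 + \beta_2 q_2 + \gamma\,\text{sex} + u$, in which item 3 is the baseline. A short worked version, assuming $\hat\beta_1 = 0.83$ is the item 1 estimate quoted above:

```latex
\[
\operatorname{logit} P(Y_{i1} = 1 \mid u_i) - \operatorname{logit} P(Y_{i3} = 1 \mid u_i)
= (\alpha + \beta_1 + \gamma x_i + u_i) - (\alpha + \gamma x_i + u_i)
= \beta_1,
\qquad
e^{\hat\beta_1} = e^{0.83} \approx 2.29 .
\]
```

The random effect $u_i$ and the gender term cancel, which is why the interpretation holds for a given subject of either gender.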

20
• For these data, subjects are highly heterogeneous ($\hat\sigma = 8.6$). Thus, strong associations exist among responses on the three items.
• This is reflected by 1595 of the 1850 subjects making the same response on all three items, that is, response patterns (0, 0, 0) and (1, 1, 1).
• It implies tremendous variability in between-subject odds ratios.
• From (12.7), for different subjects of a given gender, the middle 50% of odds ratios comparing items 1 and 3 are estimated to vary between about $\exp(0.83 - 0.6745\sqrt{2} \times 8.6)$ and $\exp(0.83 + 0.6745\sqrt{2} \times 8.6)$.
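Assuming (12.7) is the usual result for the random intercept model, the range can be worked out as follows. For two subjects $i$ and $j$ of the same gender, the log odds ratio comparing subject $i$ on item 1 with subject $j$ on item 3 is $\beta_1 + (u_i - u_j)$, and $u_i - u_j \sim N(0, 2\sigma^2)$. The values 0.83 and 8.6 are the estimates quoted on the slides; 0.6745 is the 75th percentile of the standard normal distribution.

```latex
\[
\hat\beta_1 \pm z_{0.75}\,\sqrt{2}\,\hat\sigma
= 0.83 \pm 0.6745 \times 1.4142 \times 8.6
= 0.83 \pm 8.203,
\]
\[
e^{0.83 - 8.203} = e^{-7.373} \approx 6.3 \times 10^{-4},
\qquad
e^{0.83 + 8.203} = e^{9.033} \approx 8.4 \times 10^{3} .
\]
```

Under these assumptions the middle half of the between-subject odds ratios spans about seven orders of magnitude, which is what "tremendous variability" refers to.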

21
• An extended model allows interaction between gender and item. It does not fit better.
• GEE estimates for the exchangeable working correlation structure

22
• The GEE model describes six marginal probabilities (three for each gender) using four parameters.
• These population-averaged $\hat\beta$ are much smaller than the subject-specific $\hat\beta$ from the GLMM.
• This reflects the very large GLMM heterogeneity ($\hat\sigma = 8.6$) and the corresponding strong correlations among the three responses.
• For instance, the GEE analysis estimates a common correlation of 0.82 between pairs of responses.
• Although the GLMM $\hat\beta$ are about five to six times the marginal model $\hat\beta$, so are the standard errors. The two approaches provide similar substantive interpretations and conclusions.
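The factor of five to six is roughly what a standard approximation for logistic models with a normal random intercept predicts. As a hedged sketch (the approximation is rough, especially for large $\sigma$, and the constant $c$ belongs to that approximation rather than to the slides):

```latex
\[
\beta^{\text{marginal}} \approx \frac{\beta^{\text{GLMM}}}{\sqrt{1 + c^2 \sigma^2}},
\qquad
c = \frac{16\sqrt{3}}{15\pi} \approx 0.588,
\qquad
\sqrt{1 + 0.346 \times 8.6^2} = \sqrt{26.59} \approx 5.2 .
\]
```

Since estimates and standard errors are scaled by about the same factor, test statistics and substantive conclusions stay similar across the two approaches.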

23 Longitudinal Mental Depression Study Revisited
• The response $y_t$ for measurement $t$ on mental depression equals 1 for normal and 0 for abnormal.
Predictors:
• severity of initial diagnosis $s$ (1 = severe, 0 = mild)
• drug treatment $d$ (1 = new, 0 = standard)
• time of measurement $t$

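24
• Marginal model (Chapter 11): $\mathrm{logit}\,P(Y_t = 1) = \alpha + \beta_1 s + \beta_2 d + \beta_3 t + \beta_4\, d\, t$
• Random effects model: $\mathrm{logit}\,P(Y_{it} = 1 \mid u_i) = \alpha + \beta_1 s + \beta_2 d + \beta_3 t + \beta_4\, d\, t + u_i$, with $u_i \sim N(0, \sigma^2)$, matching the linear predictor in the SAS code on the next slide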

25 SAS
proc nlmixed qpoints=100;
  parms alpha=-.03 beta1=-1.3 beta2=-.06 beta3=.48 beta4=1.02 sigma=.066;
  /* subject-specific linear predictor with random intercept u */
  eta = alpha + beta1*diagnose + beta2*treat + beta3*time + beta4*treat*time + u;
  p = exp(eta)/(1 + exp(eta));   /* inverse logit */
  model outcome ~ binary(p);
  random u ~ normal(0, sigma*sigma) subject = case;
run;
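For comparison, the marginal (GEE) counterpart of this fit could be set up along the following lines. This is a sketch rather than the course's own code: the data set name `depress` is hypothetical, the variable names are those used in the NLMIXED step above, and the exchangeable working correlation is an assumption carried over from the earlier GEE example.

```sas
/* Sketch only: depress is a hypothetical data set name; the variables   */
/* outcome, diagnose, treat, time and case are those used by NLMIXED.    */
proc genmod data=depress descending;   /* model P(outcome = 1)           */
  class case;
  model outcome = diagnose treat time treat*time / dist=bin link=logit;
  repeated subject=case / type=exch;   /* exchangeable working correlation */
run;
```

The coefficients from this step are population-averaged, whereas those from PROC NLMIXED are subject-specific.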

26
• GEE and GLMM estimates are similar because $\hat\sigma = 0.07$ is very small.
• With little heterogeneity among subjects, the population-averaged effects are roughly equal to the subject-specific effects.
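A quick calculation makes "roughly equal" concrete. Assuming the usual approximate attenuation factor $\sqrt{1 + c^2\sigma^2}$ with $c^2 \approx 0.346$ for a logistic model with a normal random intercept (an approximation, not something stated in the slides):

```latex
\[
\sqrt{1 + 0.346 \times 0.07^2} = \sqrt{1.0017} \approx 1.0008 .
\]
```

Under that approximation the subject-specific and population-averaged coefficients differ by less than 0.1%.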