Armando Teixeira-Pinto AcademyHealth, Orlando ‘07 Analysis of Non-commensurate Outcomes.


A. Teixeira-Pinto, AcademyHealth, Orlando 2007

Agenda
- Introduction
- Example: HRQOL after intensive care
- Common approach to multiple outcomes
- The latent variable model
- HRQOL results
- Discussion and summary

The city of PORTO (image slide)


Introduction
- Multiple outcomes are often collected in health studies: longitudinal data, repeated measurements, multiple informants, multidimensional outcomes (health-related quality of life), multiple surrogates for an outcome of interest.
- For outcomes measured on the same scale, several multivariate methods are implemented in commercial software: generalized linear mixed models, GEE, GLM, MANOVA…
- Often the outcomes are non-commensurate (of mixed type), for example a binary and a continuous outcome.

Introduction
- These outcomes are often correlated.
- Common approach: analyze each outcome separately (univariate framework), ignoring the correlation.
- A multivariate approach will:
  - use the additional information contained in the correlation between outcomes
  - permit better control over Type I error rates
  - answer intrinsically multivariate questions
  - be helpful in some situations of missing data

Motivating example: quality of life after intensive care
- Objective: evaluate the health-related quality of life (HRQOL) of patients 6 months after ICU discharge.
- Study the association with:
  - Age
  - Previous health state
    - Non-chronic disease
    - Chronic disease with no disability
    - Chronic disease with disability
  - Apache II score (severity score at ICU admission)

Instrument EQ-5D: measuring HRQOL
- EQ-5D is a standardized instrument for use as a measure of health outcome.
- Applicable to a wide range of health conditions and treatments, it provides a simple descriptive profile and a single index value for health status based on 5 health-related dimensions.
- It includes a question about the patient's perception of his/her HRQOL.


Instrument EQ-5D
- We'll consider two outcomes:
  - EQ-5D index: summarizes the 5 dimensions of the EQ-5D (continuous outcome)
  - D-VAS (visual analogue scale): VAS dichotomized at 50 (binary outcome)
- And three covariates: age, previous health state, Apache II

Common approach
- Data for the HRQOL after ICU stay:
  - 4 years of data collection
  - One intensive care unit from a tertiary hospital in Portugal
  - 485 patients participated in the study
  - The EQ-5D index was available for all the patients
  - Only 366 patients answered the question associated with the D-VAS
- Common approach:
  - Linear model for the EQ-5D index
  - Logistic or probit regression for D-VAS

Multiple outcomes (diagram)
- EQ-5D index (n=485): age, previous health state, Apache II
- D-VAS (n=366): age, previous health state, Apache II



Why should we use a multivariate method?
- Missing values of D-VAS are associated with lower HRQOL.
- In a separate model for D-VAS, the data are missing not at random (MNAR) and the regression estimates might be biased.
- Because the two outcomes are correlated, in a joint model we can "borrow" information from the EQ-5D index and reduce the bias of the estimates associated with D-VAS.
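The bias argument above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data-generating values, the cutoff of 40, and the missingness probabilities are hypothetical and are not taken from the ICU study. It shows that when the binary outcome is more often missing for subjects with a low continuous outcome, the complete-case proportion drifts away from the proportion in the full sample.

```python
import random

def simulate(n=20000, seed=1, sigma_u=1.0, sigma_c=15.0):
    """Simulate correlated (y_b, y_c) and drop y_b more often when y_c is low.

    All numeric settings here are hypothetical illustration values.
    """
    rng = random.Random(seed)
    all_yb, observed_yb = [], []
    for _ in range(n):
        u = rng.gauss(0.0, sigma_u)                          # shared latent variable
        y_c = 50.0 + sigma_c * u + rng.gauss(0.0, sigma_c)   # continuous outcome
        y_b = 1 if u + rng.gauss(0.0, 1.0) > 0 else 0        # binary outcome (probit-type)
        all_yb.append(y_b)
        # Hypothetical missingness rule: y_b is missing more often when y_c is low.
        p_missing = 0.6 if y_c < 40.0 else 0.1
        if rng.random() > p_missing:
            observed_yb.append(y_b)
    full = sum(all_yb) / len(all_yb)
    complete_case = sum(observed_yb) / len(observed_yb)
    return full, complete_case

full_mean, complete_case_mean = simulate()
print(f"P(y_b = 1), all subjects:   {full_mean:.3f}")
print(f"P(y_b = 1), complete cases: {complete_case_mean:.3f}")
```

Because the continuous outcome is observed for everyone and carries information about the latent variable, a joint model can use it to correct part of this distortion, which is the "borrowing" idea on the slide.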

Multiple outcomes
- If the outcomes are of the same type, we could assume a multivariate distribution for the outcomes.
- For example, two continuous outcomes.

Binary and continuous outcomes
- For mixed types of outcomes there is no obvious multivariate distribution.
- Strategy: avoid direct specification of the joint distribution.
- Latent variable model for $y_b$, $y_c$: introduce a latent variable, $u$, and assume that conditional on $u$ the outcomes are independent:

$$f(y_b, y_c) = \int f(y_b, y_c, u)\,du = \int f(y_b, y_c \mid u)\, f(u)\,du = \int f(y_b \mid u)\, f(y_c \mid u)\, f(u)\,du$$

Binary and continuous outcomes
- Latent variable model: $\int f(y_b \mid u)\, f(y_c \mid u)\, f(u)\,du$
- We can specify separate equations for the outcomes conditional on $u$.
- The latent variable models the correlation between the outcomes.
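The integral over the latent variable can be evaluated numerically. The following is a minimal sketch, not the author's implementation: it assumes the parameterization used in the SAS code later in the deck (a probit model for the binary outcome with u added to the linear predictor, and a normal model for the continuous outcome with mean shifted by sigma_c*u and residual standard deviation sigma_c). The function names, the grid size, and the example values are hypothetical.

```python
import math

def norm_pdf(x, mean=0.0, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def joint_density(y_b, y_c, eta_b, eta_c, sigma_c, sigma_u, n_grid=2001):
    """f(y_b, y_c) = integral of f(y_b|u) f(y_c|u) f(u) du, by the trapezoid rule.

    eta_b and eta_c are the linear predictors (x'beta) of the two outcomes.
    """
    lo, hi = -8.0 * sigma_u, 8.0 * sigma_u
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        u = lo + i * step
        p1 = norm_cdf(eta_b + u)                              # P(y_b = 1 | u)
        f_b = p1 if y_b == 1 else 1.0 - p1                    # f(y_b | u)
        f_c = norm_pdf(y_c, eta_c + sigma_c * u, sigma_c)     # f(y_c | u)
        weight = 0.5 if i in (0, n_grid - 1) else 1.0         # trapezoid end points
        total += weight * f_b * f_c * norm_pdf(u, 0.0, sigma_u)
    return total * step

# Hypothetical example values for one subject.
print(joint_density(1, 60.0, 0.3, 55.0, 15.0, 1.0))
```

A quick sanity check on such a function is that summing over the two values of y_b should recover the marginal normal density of y_c, whose variance under these assumptions is sigma_c^2 * (1 + sigma_u^2).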

Latent model
- Mathematically speaking, the coefficients on $u$ in the equations for $y_b$ and $y_c$ are scale factors "adjusting" the latent variable to the different scales of the outcomes.

Latent model
- However, this model has parameters that are non-identifiable and we have to fix some of them.
- It can be shown that the correct way to fix some of the parameters is:
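The slide's own equations did not survive in this transcript. As a hedged reconstruction, the constrained model can be read off the SAS code later in the deck (the lines defining `part1`, `part2`, and `loglik`); the notation below ($x$ for the covariate vector, $\sigma_c$ for the SAS parameter `sigma2`, $\sigma_u$ for `sigmau`) is an assumption and may differ from the original slide:

```latex
\begin{aligned}
\Pr(y_b = 1 \mid u) &= \Phi\!\left(x^\top \beta_b + u\right), \\
y_c \mid u &\sim N\!\left(x^\top \beta_c + \sigma_c\, u,\ \sigma_c^2\right), \\
u &\sim N\!\left(0, \sigma_u^2\right).
\end{aligned}
```

In this reading, the scale factor on $u$ is fixed to 1 in the probit equation and to $\sigma_c$ in the continuous equation, leaving $\sigma_u$ and $\sigma_c$ as the free variance parameters.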

Latent model
- The interpretation of the $\beta_b$'s, referring to the effect of the covariates on the outcome $y_b$, is conditional on $u$, i.e., it refers to $y_b \mid u$.
- The "marginal" effect can be obtained by rescaling; the SAS code for this model computes it as $\beta_b / \sqrt{1 + \sigma_u^2}$.
- IMPORTANT NOTE: The models are for $y_b \mid u$ and $y_c \mid u$. I omit the conditioning from the equations for simplicity.
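The rescaling by sqrt(1 + sigma_u^2) that appears in the ESTIMATE statements of the SAS code can be checked numerically. The sketch below assumes a probit model with a normal latent variable added to the linear predictor, as in that code; the function names and the example values are hypothetical. It compares the probability averaged over u with the closed-form probit evaluated at the rescaled linear predictor.

```python
import math

def norm_pdf(x, sd=1.0):
    z = x / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def marginal_prob_numeric(eta, sigma_u, n_grid=4001):
    """Average Phi(eta + u) over u ~ N(0, sigma_u^2) with the trapezoid rule."""
    lo, hi = -8.0 * sigma_u, 8.0 * sigma_u
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        u = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * norm_cdf(eta + u) * norm_pdf(u, sigma_u)
    return total * step

def marginal_prob_closed_form(eta, sigma_u):
    """Probit with every coefficient divided by sqrt(1 + sigma_u^2)."""
    return norm_cdf(eta / math.sqrt(1.0 + sigma_u ** 2))

# Hypothetical linear-predictor values and latent standard deviation.
for eta in (-1.0, 0.0, 0.8):
    print(eta, marginal_prob_numeric(eta, 1.5), marginal_prob_closed_form(eta, 1.5))
```

The two columns should agree to numerical precision, which is why dividing each conditional probit coefficient by sqrt(1 + sigma_u^2) gives coefficients with a population-averaged interpretation.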

Latent model
- The same is true for the $\beta_c$'s, but because of the linear link the interpretation is the same for $y_c \mid u$ and $y_c$.
- A nice feature of this model is that it can be easily implemented in commercial statistical software.
- With SAS, use PROC NLMIXED.

SAS code to fit the Latent Model

    * SAS code to maximize the likelihood resulting from the latent variable model for the HRQOL example;
    proc nlmixed data=Icu.Euroqolreduced technique=newrap;
      * initial values;
      parms a1=-0.9 b1=.02 c1=-1 d1=0
            a2=104 b2=-.2 c2=-9 d2=-4
            sigmau=1 sigma2=15;
      bounds sigma2>0, sigmau>0;
      * likelihood: part1 is the probit linear predictor, part2 the residual of the continuous outcome;
      part1 = a1 + b1*age + c1*apache + d1*pstate + u;
      part2 = eq5d - (a2 + b2*age + c2*apache + d2*pstate) - u*sigma2;
      * subjects with missing dvas contribute only the continuous part;
      if missing(dvas) then
        loglik = -log(sigma2) - .5*1/(sigma2**2)*(part2)**2;
      else
        loglik = dvas*log(probnorm(part1)) + (1-dvas)*log(probnorm(-part1))
                 - log(sigma2) - .5*1/(sigma2**2)*(part2)**2;
      * model statement: the response named here can be any variable with complete observations, not only eq5d;
      model eq5d ~ general(loglik);
      random u ~ normal(0, sigmau**2) subject=idnumb;
      * computes the marginalized parameters for the probit model;
      estimate 'intercept' a1/sqrt(1+sigmau**2);
      estimate 'age_marg' b1/sqrt(1+sigmau**2);
      estimate 'apache_marg' c1/sqrt(1+sigmau**2);
      estimate 'pstate_marg' d1/sqrt(1+sigmau**2);
    run;

Results of the HRQOL study ("…" marks values missing from this transcript)

                         Univariate                  Latent model
                         Coefficient    P-value      Coefficient    P-value
    EQ-5D index (n=485)
      Age                … (0.06)       <…           … (0.06)       <0.01
      Previous state     … (1.53)       <…           … (1.53)       <0.01
      Apache II          ~0 (0.15)      ~1           ~0 (0.16)      ~1
    D-VAS (n=366)
      Age                … (0.005)      …            … (0.005)      0.03
      Previous state     … (0.11)       <…           … (0.11)       <0.01
      Apache II          … (0.011)      …            … (0.010)      <0.01


Results of the HRQOL study
- The analysis suggests that the severity of the episode leading to the ICU admission is associated with the patient's perception of his/her HRQOL but not with the EQ-5D index.
- This effect would not be noticed with univariate analysis.
- Taking into account the correlation between the two outcomes (crude correlation = 0.42) helped to reduce the bias of the effect estimates.

Other approaches
- Other strategies presented in the literature:
  - Factorization method: $f(y_b, y_c) = f(y_b)\, f(y_c \mid y_b)$ or $f(y_b, y_c) = f(y_c)\, f(y_b \mid y_c)$
  - Extension of weighted GEEs to non-commensurate outcomes
- Other strategies for the missing data can also be used, e.g., multiple imputation.

Extension to more than two outcomes
- For k outcomes:

"Take home" message
- Complete cases + same covariates for all the outcomes:
  - Univariate approach ≈ multivariate approach
- Complete cases + different covariates for the outcomes:
  - Univariate approach less efficient (larger std. errors)
  - Multivariate approach more efficient (smaller std. errors)
- Missing data on the outcomes:
  - Univariate approach may lead to biased estimates
  - Multivariate approach may reduce the bias

Thank you for your attention!
Slides available at: