
1 Multilevel Data in Outcomes Research
- Types of multilevel data common in outcomes research
- Random versus fixed effects
- Statistical model choices
- “Shrinkage estimates” versus fixed effects
- Example of CA State CABG data

2 What are multilevel data?
- Gathering individual observations into larger groups does not create clustered data
  - Individual observations from a simple random sample are never multilevel
- Multiple levels are a result of sampling/design
  - Usually from stages/levels in obtaining the individual units of observation
  - Repeated measures are a type of multilevel data

3 Other Names for Multilevel Data
- Hierarchical models
- Clustered data (but different from cluster analysis)
- Components of variance models
- Contextual models
- Micro and macro level data

4 Multilevel Data in Outcomes Research
- Two levels:
  - Hospitals and patients
  - Physicians and patients
- Three levels:
  - Hospitals, physicians, and patients
  - Physicians, patients, and repeated measures
- Four levels:
  - National Health Interview Survey

5 National Health Interview Survey
- Highest level: Select Primary Sampling Units (PSUs: MSAs, counties, groups of counties)
- Next level: Stratify PSUs by Census blocks and select Secondary Sampling Units (SSUs: clusters of households)
- Next level: Select households within SSUs
- Lowest level: Interview individuals in the households (all members in some households, a sample in others)

6 Characteristics of Multilevel Data
- Measurements within a level are correlated (e.g., measures on the same person are more alike than measurements across persons)
- Variables can be measured at each level
- Standard statistical models and tests that assume independent observations are incorrect
- The variance of the outcome can be attributed to each level

7 Two Parts of Multilevel Data Variance
- Outcome = patient satisfaction score
- Variance in the patient score divides into two parts:
  (1) the variance between physicians = $\sigma^2_B$
  (2) the variance within the physicians = $\sigma^2_W$
- So the total variance = $\sigma^2_B + \sigma^2_W$
[Figure: two-level diagram. Level 2: physicians (MD1: mean=81, MD2: mean=58, MD3: mean=74). Level 1: patients, with individual scores 79, 85, 77, 55, 61, 68, 74, 75, 81.]
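The split of total variance into a between-physician and a within-physician part can be checked numerically. The sketch below is only illustrative: the scores are hypothetical (chosen so the physician means match the 81, 58, and 74 on the slide, but not taken from it), and it uses simple descriptive (population) variances rather than the estimates a multilevel model would produce.

```python
from statistics import fmean

# Hypothetical satisfaction scores grouped by physician (not the slide's data).
scores = {
    "MD1": [79, 85, 79],  # mean 81
    "MD2": [55, 61, 58],  # mean 58
    "MD3": [68, 74, 80],  # mean 74
}

def variance_parts(groups):
    """Return (between, within, total) descriptive variances for grouped data."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    grand_mean = fmean(all_values)
    # Between: spread of physician means around the grand mean,
    # weighted by the number of patients per physician.
    between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    ) / n_total
    # Within: spread of each patient around his or her own physician's mean.
    within = sum(
        (x - fmean(values)) ** 2
        for values in groups.values()
        for x in values
    ) / n_total
    # Total: spread of every patient around the grand mean.
    total = sum((x - grand_mean) ** 2 for x in all_values) / n_total
    return between, within, total

between, within, total = variance_parts(scores)
print(f"between={between:.2f} within={within:.2f} total={total:.2f}")
```

With these definitions the between and within parts add up to the total exactly; model-based estimates of the two components will generally differ somewhat from these descriptive values.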

8 Intraclass Correlation Coefficient (ICC)
- The intraclass correlation coefficient (ICC) is a measure of the correlation among the individual observations within the clusters
- It is calculated as the ratio of the between-cluster variance to the total variance: $\sigma^2_B / (\sigma^2_B + \sigma^2_W)$

9 Intraclass Correlation Coefficient (ICC)
- Take the extreme case where each MD’s patients all have the same score = no variance within the physicians.
- So, ICC = $\sigma^2_B / (\sigma^2_B + \sigma^2_W) = \sigma^2_B / (\sigma^2_B + 0) = 1$ = perfect correlation within the clusters.
[Figure: two-level diagram in which every patient's score equals the physician's mean (MD1: mean=81, MD2: mean=58, MD3: mean=74).]

10 Intraclass Correlation Coefficient (ICC)
- A different case where each MD’s patients have very different scores = most of the variance is within the physicians (i.e., between patients, not physicians).
- ICC is close to 0.
[Figure: two-level diagram. Physicians: MD1: mean=71, MD2: mean=68, MD3: mean=74. Patient scores: 64, 61, 71, 58, 78, 54, 94, 84, 81.]
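The two extreme cases can be reproduced with a small calculation. This is a minimal sketch using a descriptive ICC (between-group variance over total variance, both computed directly from the data); the function name `icc` and both data sets are hypothetical, and a fitted random effects model would estimate the variance components differently, especially with few patients per physician.

```python
from statistics import fmean

def icc(groups):
    """Descriptive ICC: between-group variance divided by total variance."""
    values = [x for group in groups for x in group]
    n_total = len(values)
    grand_mean = fmean(values)
    between = sum(
        len(group) * (fmean(group) - grand_mean) ** 2 for group in groups
    ) / n_total
    within = sum(
        (x - fmean(group)) ** 2 for group in groups for x in group
    ) / n_total
    return between / (between + within)

# Case 1: every patient of a physician has the same score (no within variance).
identical_within = [[81, 81, 81], [58, 58, 58], [74, 74, 74]]

# Case 2: physician means are similar, but patients differ widely.
spread_within = [[51, 71, 91], [48, 68, 88], [54, 74, 94]]

print(f"identical scores within MD: ICC = {icc(identical_within):.3f}")
print(f"widely spread scores within MD: ICC = {icc(spread_within):.3f}")
```

The first data set gives an ICC of 1 because the within-physician variance is zero; the second gives a value near 0 because almost all of the variation is between patients of the same physician.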

11 Implications of ICC for Analysis
- When the ICC is close to 0, most of the variation is explained by patient-level measures
- Less difference between results from ordinary regression and multilevel models
- May be less important to use a statistical model that allows variables for physician characteristics

12 Implications of ICC for Analysis
- When the ICC is close to 1, most of the variation is explained by physician-level measures
- Using a statistical model that removes physician effects leaves little variation to explain
- Important to use a statistical model that allows variables for physician characteristics

13 Methods of Analyzing Multilevel Data
1. Regression model ignoring higher level variables
2. Regression model with an indicator variable for each level 2 unit (minus one)
3. Conditional regression model
4. Regression model with generalized estimating equations (GEE model)
5. Random or mixed effects regression model
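To see why the choice among these methods matters, the sketch below contrasts method 1 (ordinary regression ignoring the physician) with method 2 (an indicator variable for each physician) on a single predictor. The data and variable names are hypothetical. For the slope on the predictor, including a full set of indicator variables is algebraically equivalent to subtracting each group's mean from `x` and `y` before fitting, so the code uses that shortcut instead of building the indicator columns.

```python
from statistics import fmean

def slope(xs, ys):
    """Ordinary least squares slope of y on a single predictor x."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical (x, y) pairs for the patients of three physicians.
data = {
    "MD_A": ([1, 2, 3], [10, 9, 8]),
    "MD_B": ([4, 5, 6], [14, 13, 12]),
    "MD_C": ([7, 8, 9], [18, 17, 16]),
}

# Method 1: pool all patients and ignore which physician they belong to.
all_x = [x for xs, _ in data.values() for x in xs]
all_y = [y for _, ys in data.values() for y in ys]
pooled_slope = slope(all_x, all_y)

# Method 2: remove each physician's mean from x and y, then fit.
# This yields the same slope as a model with physician indicator variables.
centered_x, centered_y = [], []
for xs, ys in data.values():
    mean_x, mean_y = fmean(xs), fmean(ys)
    centered_x += [x - mean_x for x in xs]
    centered_y += [y - mean_y for y in ys]
fixed_effects_slope = slope(centered_x, centered_y)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"fixed effects slope: {fixed_effects_slope:.2f}")
```

In this constructed example the pooled slope is positive while the within-physician slope is negative, because differences between physicians dominate the pooled fit. Real data will rarely be this stark.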

14 Choice of Analysis Model: Three Main Considerations
- What is the research question?
- How many observations are there at each level of the data?
- How important is controlling unmeasured confounding at the higher level?

15 Fixed versus Random Effects
- Effects are random when the units are a sample of a larger population
  - They vary because they were sampled; another sample would give different data
- Effects are fixed if they represent all possible members of a population
  - e.g., male/female; treatment groups; all the regions of the U.S.

16 Fixed versus Random Effects
- Effects are treated as fixed or random depending on the research question
- Random effects: generalize from the sample to a larger population
- Random effects: reduce variation due to small sample size by fitting a distribution
- Fixed effects: control for unmeasured confounding at the higher level

17 Methods of Analyzing Multilevel Data
- Fixed Effects Models:
  - Regression model with an indicator variable for each level 2 unit (minus one)
  - Conditional regression model
- Random Effects Models:
  - Regression model with generalized estimating equations (GEE)
  - Random or mixed effects regression model

18 What are “shrinkage estimates”?
- Also called Bayesian or empirical Bayes estimates (Iezzoni text) or best linear unbiased prediction (BLUP) estimates (SAS)
- Can only be obtained from a random effects (not GEE) regression model
- Variance of the higher level variable is modeled as if from a specified distribution (usually normal, but others are possible)

19 A Simple Random Effects Model
- A simple random effects model is: $y_{ij} = \mu + \alpha_j + e_{ij}$, where $\mu$ = overall mean, $\alpha_j$ = difference for MD $j$, and $e_{ij}$ = individual error
- The model says there is random variation from the mean score at the level of MDs plus variation at the level of patients
- Bayesian estimates are the individual $\alpha_j$'s obtained from the overall distribution
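For a model of this form, with an overall mean, a normally distributed MD-level difference, and an individual error, the shrinkage estimate of each MD's mean has a standard closed form when the variance components are treated as known. The formula below is a textbook result added for illustration, not taken from the slides. The notation is an assumption: $\mu$ is the overall mean, $\bar{y}_j$ the raw mean score for MD $j$, $n_j$ that MD's number of patients, $\sigma^2_B$ the variance of the MD-level differences, and $\sigma^2_W$ the variance of the individual errors.

```latex
\hat{\mu}_j = \mu + \lambda_j \,(\bar{y}_j - \mu),
\qquad
\lambda_j = \frac{\sigma^2_B}{\sigma^2_B + \sigma^2_W / n_j}
```

The weight $\lambda_j$ lies between 0 and 1. It approaches 1 as $n_j$ grows, so the estimate stays near the raw mean, and it moves toward 0 for MDs with few patients, so the estimate is pulled toward the overall mean. In practice the variance components are themselves estimated from the data, which is one reason results can differ across software.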

20 Example of Shrinkage Estimates
- In a Patient Outcomes Research Team study of patient satisfaction with MD treatment for diabetes, raw mean patient scores by MD ranged from 53.4 to 87.1
- The random effects shrinkage estimates of the mean patient scores by MD ranged from 60.4 to 78.6
  - Random effects shrinkage estimates are closer to the overall mean

21 Controversy in Outcomes Research
- Report cards rank hospitals or physicians
- The data used have at least two levels (hospitals or physicians and their patients)
- Controversy is over the choice of statistical model for evaluating variation at the hospital or physician level

22 Methods of Analyzing Hospital (or MD) Mortality Variance
- Ignore hospital, run ordinary regression, then predict the average for each hospital
- Remove the hospital effect with indicator variables for hospitals (fixed effects model), then predict the average for each hospital
- Run random effects regression and obtain the Bayesian/shrinkage estimates for each hospital

23 Shrinkage estimates and CA State CABG Data
- The unadjusted estimate for each hospital is modeled as a draw from a normal distribution
- More weight is given to hospitals with more CABG patients
  - Hospitals with smaller numbers move closer to the mean when a normal distribution is fitted
- Estimates are somewhat software dependent
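The effect of hospital volume on shrinkage can be illustrated with a small calculation. This is a sketch under strong assumptions: it uses the standard shrinkage weight for a linear random-intercept model with known variance components, `var_between / (var_between + var_within / n)`, and all numbers are hypothetical values on an arbitrary continuous outcome scale. It is not the method or the data of the CA State CABG analysis, where a mortality outcome would more typically call for a non-linear (e.g., logistic) random effects model.

```python
def shrinkage_estimate(raw_mean, n, overall_mean, var_between, var_within):
    """Shrink one hospital's raw mean toward the overall mean.

    Assumes a linear random-intercept model with known variance components.
    """
    # Weight on the hospital's own data: closer to 1 when n is large.
    weight = var_between / (var_between + var_within / n)
    return overall_mean + weight * (raw_mean - overall_mean)

# Hypothetical values, chosen only for illustration.
OVERALL_MEAN = 3.0
VAR_BETWEEN = 1.0   # variance between hospitals
VAR_WITHIN = 50.0   # variance between patients within a hospital

# Two hospitals with the same raw mean but very different volumes.
small_hospital = shrinkage_estimate(6.0, 20, OVERALL_MEAN, VAR_BETWEEN, VAR_WITHIN)
large_hospital = shrinkage_estimate(6.0, 500, OVERALL_MEAN, VAR_BETWEEN, VAR_WITHIN)

print(f"small hospital (n=20):  {small_hospital:.2f}")
print(f"large hospital (n=500): {large_hospital:.2f}")
```

Both hospitals start from the same raw mean, but the low-volume hospital's estimate is pulled much closer to the overall mean, which is the behavior the slide describes.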

24 Shrinkage Estimates: Software
- Obtaining shrinkage estimates involves some software choices
  - Not all software provides them
  - STATA by itself doesn’t provide them
  - Different likelihood methods of fitting models
- STATA add-on GLLAMM (free download)
- SAS
  - For linear outcome, PROC MIXED
  - For non-linear, PROC NLMIXED and GLIMMIX
- Some other software for multilevel data

