by Kim Chantala C. M. Suchindran Dan Blanchette

1 Adjusting for Unequal Selection Probability in Multilevel Models: A Comparison of Software Packages
by Kim Chantala, C. M. Suchindran, and Dan Blanchette

Abstract: Most surveys collect data using complex sampling plans involving selection of both clusters and individuals with unequal probability of selection. Research in methods of using multilevel modeling (MLM) procedures to analyze such data is relatively new. Often sampling weights based on selection probabilities of individuals are used to estimate population-based models. However, sampling weights used for estimating multilevel models need to be constructed differently than weights used for single-level (population-average) models. This paper compares the capabilities of several MLM software programs that can be used for analyzing data collected with a complex sampling plan. We illustrate how the weights for multilevel models can be constructed from population-average weights. Finally, we use data from the National Longitudinal Study of Adolescent Health to compare the results from several of these packages.

2 Overview
Compare capabilities of multilevel modeling software packages for analyzing data collected with a complex sampling plan
Describe characteristics of survey data that can influence estimates
Construct sampling weights for estimating multilevel models
Contrast results from estimating a two-level model with different software packages

Research in methods of using multilevel modeling (MLM) procedures to analyze complex survey data is relatively new. Recent studies have shown how sensitive estimates from MLM analyses are to the sample design. Many of these studies have focused on the use of sampling weights to adjust estimates for the effects of the sample design. The purpose of our research is to investigate using different MLM software packages to analyze a survey collected with a complex sampling plan. We will compare several popular software packages and contrast the results from these packages. We will describe some of the features of survey data that can influence estimates and illustrate different methods of constructing sampling weights for a two-level analysis. Our example will use data from the National Longitudinal Study of Adolescent Health, a survey collected with a complex sampling plan.

3 Comparison of Software Packages: General Information
Table columns: SEM Analysis; MLM Analysis; Adjust for Clustering; Adjust for Stratification
Packages compared: MPLUS 3.1, LISREL 8.7, GLLAMM (Stata 8), MLWIN 1.1, HLM 6.0, MIXED (SAS 8.2), NLMIXED (SAS 8.2)

Several of the commercially available software packages allow analysts to use sampling weights when estimating Structural Equation Models (SEM) and Multilevel Models (MLM). The SEM software packages include MPLUS (version 1.04 and later), LISREL (version 8.7), and the user-written Stata program GLLAMM (version , SRH 4 Jan 2005). Except for MIXED and NLMIXED, all of these packages have been designed to analyze data collected with a complex sampling plan.

4 Comparison of Software Packages: Implementation of Sampling Weights
Table columns: Allow MLM Sampling Weights; Method for Scaling MLM Sampling Weights; Responsibility for Scaling MLM Sampling Weights
MPLUS 3.1: Asparouhov (2004); User
LISREL 8.7: Pfeffermann (1998)
GLLAMM (Stata 8)
MLWIN 1.1: User or MLWIN default
HLM 6.0: Normalize; HLM default
MIXED (SAS 8.2): Unknown
NLMIXED (SAS 8.2): Grilli, L. (2004)

In addition to allowing sampling weights for estimating single-level models, some of these software packages also allow users to specify sampling weights designed for estimating multilevel models. Because these weights need to be constructed differently than sampling weights used for single-level models, users should make sure the weights are scaled properly for the particular software package being used for MLM analysis.

The software packages designed specifically for fitting multilevel models, MLWIN (version ) and HLM (version 6.0), allow users to specify sampling weights at each level of sampling. MPLUS has developed a way of combining the level 1 and level 2 weight information to produce one weight for estimating 2-level models.

While PROC MIXED (SAS version 8.2 and later) does allow users to specify a single weight, the weight is NOT expected to be a sampling weight, but a weight designed to be inversely proportional to the variability of the observations. Hence, users must be quite cautious in using PROC MIXED when analyzing data collected with a complex sampling plan. SAS also provides a separate procedure, PROC NLMIXED, for estimating nonlinear multilevel models. Although there is no weight statement available with PROC NLMIXED, special weighting procedures have been implemented through a SAS macro to adjust for the sampling design (Grilli and Pratesi, 2004).

5 Comparison of Software Packages: MLM Analyses with Sampling Weights
Table columns (outcome types): Normal; Binary; Poisson; Multinomial Categorical; Ordered Categorical
Packages compared: MPLUS 3.1, LISREL 8.7, GLLAMM (Stata 8), MLWIN 1.1, HLM 6.0, MIXED (SAS 8.2), NLMIXED (SAS 8.2)

The above table lists the types of MLM analyses available from these packages that allow users to use weights. The vendors of MPLUS, MLWIN, LISREL, and HLM report that the most recent versions of their software packages all produce comparable results when estimating models from complex survey data.

6 Survey Data Characteristics: Design of Add Health
80 high schools selected with probability proportional to size from a list of 26,666 schools sorted by: enrollment size, region of country, school type, location, percent white
52 high schools did not include a 7th or 8th grade
Feeder school selected with probability proportional to the percentage of each high school's entering class that came from the feeder school
52 feeder schools, 80 high schools
18,924 students selected from 132 schools for the Wave I In-Home Interview: all students from 16 schools; disabled sample; genetic samples (twins, full siblings, half siblings, unrelated in same household); ethnic samples (high-SES Black, Cuban, Puerto Rican, Chinese); core sample

Add Health is a longitudinal study of adolescents listed on grade 7-12 enrollment rosters for the academic year. The sampling plan caused the Add Health participants to differ from the target population on many characteristics. A sample of 80 high schools and 52 middle schools was chosen with unequal probability of selection. First, a list of 26,666 U.S. high schools was sorted on size (<125, , , >776 students), school type (public, private, parochial), region (Northeast, Midwest, South, West), location (urban, suburban, rural), and percent white (0, 1 to 66, 67 to 93, 94 to 100), then divided into groups for sampling. Eighty high schools were selected systematically from this list with probability proportional to enrollment size. High schools that did not include 7th or 8th grades supplied names of middle schools that contributed students to the incoming class. For each of these high schools, a single feeder school was selected with probability proportional to the percentage of the high school's entering class that came from the feeder school. A total of 52 feeder (junior high and middle) schools were selected.

The Wave I In-Home survey selected students from the enrollment rosters of the 132 schools with unequal probability of selection. Several special over-sampled groups were also recruited for the Wave I interview. These include the core sample (roughly equal-sized samples), purposively selected schools (all students selected), non-genetic supplements (Black adolescents whose parents were college graduates, adolescents whose race was Cuban, Puerto Rican, or Chinese), the disabled sample, and the genetic supplement (biologically related adolescents, non-related adolescents living together).

All of the characteristics used to select both Add Health schools and adolescents, as well as characteristics that influenced non-response, have been used to compute the final sampling weights. For each of the four panels of data, Add Health provides sampling weights that are designed for estimating single-level models. Sampling weights for the schools selected are also available. Thus final sampling weights are available for each level sampled and are distributed with the data. Additional information about the Add Health data can be found at
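Systematic selection with probability proportional to size, as described for the high schools above, can be sketched as follows. This is a minimal illustration only: the enrollment figures are made up, the function names are hypothetical, and the sketch does not reproduce the sorting and grouping steps of the actual Add Health design.

```python
import random

def systematic_pps(sizes, n, rng):
    """Pick n units systematically with probability proportional to size.

    sizes: size measure (e.g. enrollment) for each unit, in list order.
    Returns the indices of the selected units. A unit whose size exceeds
    the sampling interval can be selected more than once.
    """
    total = sum(sizes)
    step = total / n                   # sampling interval
    start = rng.random() * step        # random start in [0, step)
    targets = [start + k * step for k in range(n)]
    selected, cumulative, t = [], 0.0, 0
    for i, size in enumerate(sizes):
        cumulative += size
        # every target that falls inside this unit's slice selects the unit
        while t < n and targets[t] < cumulative:
            selected.append(i)
            t += 1
    return selected

def inclusion_probability(size, sizes, n):
    """Expected selection probability of a unit (capped at 1)."""
    return min(1.0, n * size / sum(sizes))

# Hypothetical enrollments for five schools; select two of them.
enrollments = [100, 400, 250, 50, 200]
picks = systematic_pps(enrollments, 2, random.Random(0))
prob_second = inclusion_probability(400, enrollments, 2)  # 2 * 400 / 1000
```

The inverse of the inclusion probability is the usual starting point for a school-level sampling weight, which is why larger schools end up with smaller weights.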

7 Constructing Multilevel Weights
Weight components needed to construct sampling weights for a two-level analysis using the Add Health data:

Level   Unit Interviewed                     Weight Component*    Meaning of Weight Component
1       Adolescent i enrolled in School j    $fsu\_wt_{i|j}$      Number of adolescents enrolled in school j with the same characteristics as adolescent i.
2       School j                             $psu\_wt_{j}$        Number of schools in the U.S. with the same characteristics as school j.

* Stata programs for constructing sampling weights for estimating two-level models can be downloaded from our website ( after August 1, These programs have implemented methods from Pfeffermann (1998) and Asparouhov (2004).

Because no one way of scaling the weights for multilevel modeling has been widely accepted, the method of scaling the weights can be different for different MLM software packages. In the next few slides we describe some of the methods most commonly recommended by developers of the MLM software packages for estimating two-level models. The weight components needed to construct sampling weights for multilevel analysis using the Add Health data differ somewhat from the final weights that are supplied with the data. However, these weight components can be easily computed from the available final sampling weights distributed for single-level analysis. For a two-level analysis, the needed weight components are defined in the table above.
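The slide does not spell out how the within-school component is obtained from the distributed weights. A common construction, assumed here rather than taken from the presentation, treats the final single-level weight as the product of the school weight and the within-school weight, so the component is recovered by division. The name `final_wt` is hypothetical; `psu_wt` and `fsu_wt` follow the table.

```python
def within_school_weight(final_wt, psu_wt):
    """Within-school weight component fsu_wt for one adolescent.

    Assumption (not stated on the slide): final_wt = psu_wt * fsu_wt,
    where final_wt is the adolescent's final single-level sampling weight
    and psu_wt is the final weight of the adolescent's school.
    """
    if psu_wt <= 0:
        raise ValueError("psu_wt must be positive")
    return final_wt / psu_wt

# Hypothetical values: the adolescent represents 1200 adolescents
# nationally and the school represents 300 schools.
fsu_wt = within_school_weight(1200.0, 300.0)
```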

8 Some MLM Software Packages Require Special Weights
Some MLM software packages require special weights* constructed for each level:

PWIGLS weighted scaling method 2 is recommended for informative sampling methods (Pfeffermann, 1998) and would be an appropriate choice for the Add Health data. The level 2 sampling weight for each PSU is computed by summing the within-PSU sampling weights for the units i sampled in PSU j and then dividing by the number sampled within PSU j. The level 1 sampling weight for each unit i sampled within PSU j is computed by dividing the within-PSU sampling weight for unit i by the level 2 sampling weight. This is the weight we have used for the MLWIN, GLLAMM, and LISREL analyses.

*Method of weight construction from Pfeffermann (1998)
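The level 1 scaling described above amounts to dividing each within-PSU weight by the mean within-PSU weight of its PSU. A minimal sketch under that reading follows; the function name and the example data are hypothetical, and Pfeffermann (1998) remains the authority on the method itself.

```python
from collections import defaultdict

def scale_level1_weights(records):
    """Scale within-PSU weights by the mean within-PSU weight of each PSU.

    records: list of (psu_id, within_psu_weight) pairs.
    Returns the scaled level 1 weights in the same order as records.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for psu, weight in records:
        totals[psu] += weight
        counts[psu] += 1
    # per-PSU divisor: sum of within-PSU weights / number sampled in the PSU
    mean_weight = {psu: totals[psu] / counts[psu] for psu in totals}
    return [weight / mean_weight[psu] for psu, weight in records]

# Hypothetical sample: two students in school A, three in school B.
sample = [("A", 2.0), ("A", 6.0), ("B", 5.0), ("B", 5.0), ("B", 20.0)]
scaled = scale_level1_weights(sample)
```

A useful check on any implementation is that the scaled weights within a PSU sum to the number of units sampled in that PSU (2 for school A and 3 for school B here).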

9 Other MLM Software Packages Require One Weight
Other MLM software packages require one weight* that combines the weights from each level in a particular way:

The method of constructing multilevel sampling weights for estimating 2-level models in Mplus is given in Asparouhov, T. (2004). Weighting for unequal probability of selection in multilevel modeling. Mplus Web Notes: No. 8. Mplus uses sampling weight components from both levels to compute just one weight used in its analysis. This is the sampling weight we used in the MPLUS and MIXED analyses.

*Method of weight construction from Asparouhov (2004)

10 Illustrative Example
Research question: How is the effect of hours watching TV on BMI of students in a school influenced by the availability of a school recreation center?
Data from the National Longitudinal Study of Adolescent Health (Add Health)
Contrast the results from MPLUS, MIXED, LISREL, MLWIN, and GLLAMM
Weights for MPLUS and MIXED will be constructed with the Asparouhov (2004) method; weights for LISREL, MLWIN, and GLLAMM will be constructed with the Pfeffermann (1998) method.

Our example will look at how the availability of a recreation center at the school can influence the percentile body mass index of students attending the school. Data for the examples used to compare models estimated from the MLM software packages come from the School Administrator Survey and the Wave I In-Home Survey of Add Health. Sampling weights were constructed by the methods outlined in the previous slides.

11 Data in example

Level        Variable    Meaning
School       RC_S        School has on-site recreation facility (0=No, 1=Yes)
Individual   BMIPCT      Percentile BMI for age and sex of adolescent
Individual   HR_WATCH    Hours watched TV, played video or computer games during past week

Information on the availability of an on-site school recreation center was provided by each school. Each adolescent answered height and weight questions used to compute sex- and age-adjusted percentile body mass index. They also reported hours watching TV or playing video or computer games during the past week. Our examples will fit an MLM with a level for the school and a level for the student.

12 Two-level Model
Student-level model (Within or Level 1):
$\mathrm{BMIPCT}_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{HR\_WATCH}_{ij}) + e_{ij}$
where $E(e_{ij}) = 0$, $\mathrm{Var}(e_{ij}) = \sigma^2$

School-level model (Between or Level 2):
$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{RC\_S})_j + \delta_{0j}$
$\beta_{1j} = \gamma_{10} + \gamma_{11}(\mathrm{RC\_S})_j + \delta_{1j}$
where $E(\delta_{0j}) = E(\delta_{1j}) = 0$, $\mathrm{Var}(\delta_{0j}) = \sigma^2_{\delta 0}$, $\mathrm{Var}(\delta_{1j}) = \sigma^2_{\delta 1}$, $\mathrm{Cov}(\delta_{0j}, \delta_{1j}) = \sigma_{\delta 0,\delta 1}$

The student-level model hypothesizes that percentile body mass index (BMIPCT) for adolescent i attending school j can be expressed as a linear function of the number of hours spent watching TV or using computers (HR_WATCH). The coefficients $\beta_{0j}$ and $\beta_{1j}$ in the student-level model are unknown constants that determine the influence of each school on the BMIPCT of the student body (all the students enrolled in the school considered as a group). If the model is correct, then $\beta_{0j}$ will represent the average BMIPCT for students at school j who do not watch TV or use computers during the week (HR_WATCH=0) and $\beta_{1j}$ represents the rate at which the percentile BMI will change as HR_WATCH increases. We expect $\beta_{1j}$ will be positive, indicating that percentile BMI increases as hours of TV watching or computer use increase. Centering the HR_WATCH variable would change the interpretation of the intercept, but because HR_WATCH=0 is itself a meaningful value, there was no need to center HR_WATCH to give $\beta_{0j}$ a meaningful interpretation.

We specify both a random intercept and a random slope in the school-level model. The random intercept allows the average BMIPCT of students who do not watch TV or use computers (HR_WATCH=0) to differ across schools. The random slope allows the change in average BMIPCT for each unit increase in HR_WATCH to vary across schools. The addition of a binary variable to indicate presence of a school recreation center (RC_S) allows both the slopes and intercepts to be changed by the presence or absence of a recreation center located within the school.
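Substituting the school-level equations into the student-level equation gives a single combined equation, which makes the role of each fixed effect easier to see. The Greek letters are an assumed notation: $\gamma$ for the school-level fixed effects, $\delta$ for the school-level random effects, and $e$ for the student-level residual, with subscripts as in the two-level model above.

```latex
\begin{aligned}
\mathrm{BMIPCT}_{ij}
  &= \gamma_{00} + \gamma_{01}\,\mathrm{RC\_S}_{j}
   + \gamma_{10}\,\mathrm{HR\_WATCH}_{ij}
   + \gamma_{11}\,\mathrm{RC\_S}_{j}\,\mathrm{HR\_WATCH}_{ij} \\
  &\quad + \delta_{0j} + \delta_{1j}\,\mathrm{HR\_WATCH}_{ij} + e_{ij}
\end{aligned}
```

In this form $\gamma_{01}$ is the shift in the intercept for schools with a recreation center, and $\gamma_{11}$ is a cross-level interaction: the change in the HR_WATCH slope for those schools.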

13 Effect of Sampling Weights on Estimates
Range of Parameter Estimates

Parameter                          Using Weights   Ignoring Weights   Ratio
Fixed effects
$\gamma_{00}$                      2.72            0.05               54.5
$\gamma_{01}$                      3.08            0.08               38.5
$\gamma_{10}$                      0.019           0.001              19.0
$\gamma_{11}$                      0.055           0.003              18.3
Random effects
$\sigma^2_{\delta 0}$              12.41           0.53               23.4
$\sigma^2_{\delta 1}$              0.008           0.0005             16.0
$\sigma_{\delta 0,\delta 1}$       0.234           0.025              9.36
$\sigma^2$                         25.03           0.62               40.3

When sampling weights were omitted from analyses, all software packages gave nearly the same results.

First we contrast the range of parameter estimates computed with sampling weights versus the range computed if sampling weights are ignored. Although the packages produce nearly the same estimates in the absence of weighting, the estimates become much more variable when the multilevel sampling weights are used in the calculation. Across the parameters in the table, the range of the estimates is roughly 9 to 55 times larger when the sampling weights are used.

14 Analysis Results from Different Packages
Weight: MPML Method A for MPLUS 3.1 and MIXED 8.2; Weight: PWIGLS Method 2 for LISREL 8.7, MLWIN 1.1, and GLLAMM. Cells show Estimate (S.E.).

Parameter                        MPLUS 3.1        MIXED 8.2        LISREL 8.7       MLWIN 1.1        GLLAMM
Fixed effects
$\gamma_{00}$                    60.19 (0.65)     59.09 (0.79)     57.83 (0.72)     58.52 (0.58)     57.47 (0.77)
$\gamma_{01}$                    -4.49 (0.87)     -2.74 (1.10)           (1.06)     -1.41 (0.95)     -1.51 (1.18)
$\gamma_{10}$                    0.033 (0.016)    0.038 (0.020)    0.045 (0.018)    0.052 (0.013)    0.049 (0.021)
$\gamma_{11}$                    0.12 (0.021)     0.11 (0.027)     0.099 (0.025)    0.065 (0.022)    0.101 (0.029)
Random effects
$\sigma^2_{\delta 0}$            16.27 (4.04)     24.84 (5.04)     14.13 (3.18)     12.43 (3.05)     17.11 (4.74)
$\sigma^2_{\delta 1}$            0.002 (0.002)    0.009 (0.003)    0.002 (0.001)    0.001 (0.001)    0.007 (0.003)
$\sigma_{\delta 0,\delta 1}$           (0.067)          (0.097)          (0.047)          (0.040)    -0.12 (0.08)
$\sigma^2$                             (10.12)          (8.19)           (8.72)           (8.38)           (11.94)

This table shows the estimates from each package. Because the estimates cover such a wide range of magnitudes, it is difficult to assess how well the packages do at estimating the same value for each parameter. However, we can rescale each set of estimates for a parameter in a way that allows all estimates to be compared on a common measure. For each parameter, we chose that standard measure to be the average of the standard errors from the different packages. For example, using the estimates from the weighted analysis, the distance between tick marks is (average of 0.65, 0.79, 0.72, 0.58, 0.77) for $\gamma_{00}$, for $\gamma_{10}$, and for the covariance term.
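The rescaling described above can be sketched in a few lines: the tick unit is the mean of the standard errors reported by the packages, and the spread of the estimates is expressed in those units. The function names are hypothetical; the numbers are the weighted intercept estimates and standard errors reported for the five packages on this slide.

```python
from statistics import mean

def tick_unit(std_errors):
    """Average of the standard errors reported by the packages."""
    return mean(std_errors)

def spread_in_ticks(estimates, std_errors):
    """Range of the estimates expressed in units of the average S.E."""
    return (max(estimates) - min(estimates)) / tick_unit(std_errors)

# Intercept fixed effect: MPLUS, MIXED, LISREL, MLWIN, GLLAMM (weighted).
g00_estimates = [60.19, 59.09, 57.83, 58.52, 57.47]
g00_std_errors = [0.65, 0.79, 0.72, 0.58, 0.77]

g00_tick = tick_unit(g00_std_errors)
g00_spread = spread_in_ticks(g00_estimates, g00_std_errors)
```

Expressing each parameter's spread in its own standard-error units is what lets parameters of very different magnitude share one plot.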

15 Parameter Estimate Profile for Analysis Using Sampling Weights
[Figure: parameter estimate profile, with one axis per parameter; axis labels include $\sigma^2_{\delta 0}$, $\sigma^2$, $\sigma_{\delta 0,\delta 1}$, $\gamma_{10}$, $\sigma^2_{\delta 1}$, and $\gamma_{11}$]

We have graphed the estimates for each parameter on axes where the distance between tick marks on each axis is the mean of the standard errors estimated by all packages for a given parameter. For example, the range of the estimates for the covariance term is 0.234 in the weighted analysis and the average of the standard errors is 0.067, so the distance between the minimum and maximum covariance estimates in the graph is about 3.5 tick marks (average of standard errors). This method of scaling each axis allows us to visually assess how well the packages did at estimating the same value.

Summary of Random Effects: PROC MIXED estimates the value of all of the random effects to be more extreme than the value estimated by any of the other packages. Random effects estimated by MLWIN, MPLUS, and LISREL are all within the average of the standard errors for each parameter estimate for the random effects. GLLAMM also produces estimates for three of the random effects that differ from the LISREL and MPLUS estimates by no more than the average of the standard errors, but it estimates the value of $\sigma^2_{\delta 1}$ to be nearly as extreme as the value estimated by PROC MIXED.

Summary of Fixed Effects: The estimates of $\gamma_{00}$ and $\gamma_{01}$ from the school-level model for the intercept are quite variable, with MPLUS estimating the most extreme values. MLWIN estimates the most extreme value for one of the fixed effects ($\gamma_{11}$ from the school-level model for the slope). Except for this MLWIN estimate, all other estimates for $\gamma_{10}$ and $\gamma_{11}$ from the school-level model for the slope are within the average of the standard errors.

16 Predictions from Analysis Using Sampling Weights
Solid lines (RC_S=1): schools with recreation centers Dashed lines (RC_S=0): schools without recreation centers Differences in Predictions: Slopes estimated by different packages have similar values. Intercepts estimated by different packages differ, with more variability in estimates for schools without recreation centers than schools with recreation centers.
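Each prediction line can be reproduced from the four fixed effects of the two-level model. The sketch below uses the MPLUS fixed-effect estimates from the results table as an example; the function and argument names are hypothetical, and the random effects are set to zero, so the lines describe an average school.

```python
def predicted_bmipct(hr_watch, rc_s, g00, g01, g10, g11):
    """Fixed-effects prediction of percentile BMI.

    rc_s is 1 for schools with a recreation center and 0 otherwise;
    g00, g01 define the intercept and g10, g11 define the slope.
    """
    intercept = g00 + g01 * rc_s
    slope = g10 + g11 * rc_s
    return intercept + slope * hr_watch

# MPLUS estimates from the weighted analysis (results table, slide 14).
mplus = {"g00": 60.19, "g01": -4.49, "g10": 0.033, "g11": 0.12}

# Predictions at 10 hours per week, without and with a recreation center.
without_center = predicted_bmipct(10, 0, **mplus)
with_center = predicted_bmipct(10, 1, **mplus)
```

Repeating the calculation with each package's estimates over a range of HR_WATCH values yields one dashed and one solid line per package.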

17 Conclusion
Use of sampling weights to adjust for non-response and the design characteristics of complex survey data has recently been incorporated in software used for estimating multilevel models. This provides analysts with a simple method for obtaining unbiased estimates from complex survey data.
When sampling weights are used, results from these packages can vary. If weights are ignored, these packages produce nearly the same results.
Simulation studies need to be conducted to determine why these packages produce different results when sampling weights are used.
Models with non-normal outcomes need to be examined.

18 References
Asparouhov, T. (2004). Weighting for Unequal Probability of Selection in Multilevel Modeling. Mplus Web Notes No. 8, available from
Grilli, L., and Pratesi, M. (2004). Weighted Estimation in Multilevel Ordinal and Binary Models in the Presence of Informative Sampling Designs. Survey Methodology, June 2004, Volume 30, pp
Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H., and Rasbash, J. (1998). Weighting for Unequal Selection Probabilities in Multilevel Models. JRSS, Series B, 60,

