Multilevel Models 1 Sociology 229A Copyright © 2008 by Evan Schofer Do not copy or distribute without permission.


Multilevel Data
- Often we wish to examine data that are “clustered” or “multilevel” in structure
  – Classic example: Educational research
    - Students are nested within classes
    - Classes are nested within schools
    - Schools are nested within districts or US states
- We often refer to these as “levels”
  – Ex: If the study is individual/class/school:
    - Level 1 = individual level
    - Level 2 = classroom
    - Level 3 = school
  – Note: Some stats books/packages label the levels differently!

Multilevel Data
- Students nested in class, school, and state
- Variables at each level may affect student outcomes
[Diagram: classes nested within schools, and schools nested within states (California, Oregon)]

Multilevel Data
- Simpler example: 2-level data, which can be shown as:
[Diagram: Level 2 = Class 1, Class 2, Class 3; Level 1 = students S1, S2, S3 nested within each class]

Multilevel Data
- We are often interested in effects of variables at multiple levels
  – Ex: Predicting student test scores
    - Individual level: grades, SES, gender, race, etc.
    - Class level: teacher qualifications, class size, track
    - School: private vs. public, resources
    - State: education policies (funding, tests), budget
  – And, it is useful to assess the relative importance of each level in predicting outcomes
    - Should educational reforms target classrooms? Schools? Individual students?
    - Which is most likely to have big consequences?

Multilevel Data
- Repeated measurement is also “multilevel” or “clustered”
- Measurement over time (T1, T2, T3, …) is nested within persons (or firms or countries)
  – Level 1 = the measurement (at various points in time)
  – Level 2 = the individual
[Diagram: Persons 1 through 4, each with measurements T1 through T5 nested within the person]

Multilevel Data
- Examples of multilevel/clustered data:
  – Individuals from the same family (Ex: Religiosity)
  – People in the same country, in a cross-national survey (Ex: Civic participation)
  – Firms from within the same industry (Ex: Firm performance)
  – Individuals measured repeatedly (Ex: Depression)
  – Workers within departments, firms, & industries (Ex: Worker efficiency)
- Can you think of others?

Example: Pro-environmental values
- Source: World Values Survey (27 countries)
- Let’s simply try OLS regression:
  . reg supportenv age male dmar demp educ incomerel ses
[Stata output: OLS regression of supportenv on age, male, dmar, demp, educ, incomerel, ses, and _cons; F(7, 27799); numeric estimates not recoverable]

Aggregation
- If you want to focus on higher-level hypotheses (e.g., schools, not children), you can aggregate
  – Make “school” the unit of analysis
  – Run an OLS regression analysis of school-level variables
  – Individual-level variables (e.g., student achievement) can be included as school averages (aggregates)
    - Ex: Model average school test score as a function of school resources and average student SES
- Problem: This approach destroys individual-level data
  – Also: Loss of statistical power (Tabachnick & Fidell 2007)
  – Also: You can’t draw individual-level interpretations: the ecological fallacy

Example: Pro-environmental values
- Aggregation: Analyze country means (N=27)
  . reg supportenv age male dmar demp educ incomerel ses
[Stata output: OLS regression on country means; F(7, 19) = 0.91; predictors age, male, dmar, demp, educ, incomerel, ses, and _cons; other numeric estimates not recoverable]
- Note the loss of statistical power: few variables are significant when N is only 27

Ecological Fallacy
- Issue: Data aggregation limits your ability to draw conclusions about level-1 units
- The “ecological fallacy”
  – Robinson, W.S. (1950). "Ecological Correlations and the Behavior of Individuals". American Sociological Review 15: 351–357
- Among US states, immigration rate correlates positively with average literacy
  – Does this mean that immigrants tend to be more literate than US citizens?
  – NO: You can’t assume an individual-level correlation!
    - The correlation at the individual level is actually negative
    - But: immigrants settled in states with high levels of literacy, yielding a positive correlation in the aggregate statistics
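The sign reversal described above can be reproduced with a few made-up numbers. The Python sketch below uses hypothetical counts for three states (these are not Robinson's data, and the names `states` and `pearson` are illustrative only). It computes the correlation between immigrant share and literacy rate across states, then the correlation between immigrant status and literacy across individuals.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# HYPOTHETICAL counts per state (not Robinson's data):
# (literate natives, illiterate natives, literate immigrants, illiterate immigrants)
states = {
    "A": (60, 0, 32, 8),   # many immigrants, very literate natives
    "B": (72, 8, 14, 6),
    "C": (78, 17, 3, 2),   # few immigrants, less literate natives
}

immigrant, literate = [], []              # one entry per person
share_immigrant, share_literate = [], []  # one entry per state
for lit_nat, illit_nat, lit_imm, illit_imm in states.values():
    people = ([(0, 1)] * lit_nat + [(0, 0)] * illit_nat
              + [(1, 1)] * lit_imm + [(1, 0)] * illit_imm)
    imm = [p[0] for p in people]
    lit = [p[1] for p in people]
    immigrant += imm
    literate += lit
    share_immigrant.append(mean(imm))
    share_literate.append(mean(lit))

state_r = pearson(share_immigrant, share_literate)  # aggregate level
indiv_r = pearson(immigrant, literate)              # individual level
print(f"state-level r = {state_r:+.2f}, individual-level r = {indiv_r:+.2f}")
```

In these invented counts the state-level correlation is positive while the individual-level correlation is negative, because the states with the most immigrants also have the most literate natives.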

OLS Approaches
- Another option: Just use OLS regression
  – Allows you to focus on lower-level units, with no need for aggregation
  – Ex: Just analyze individuals as the unit of analysis, ignoring clustering among schools
  – Include independent variables measured at the individual level and at other levels
- Problems:
  1. Violates OLS assumptions (see below)
  2. OLS is too limited; it can’t take advantage of the richness of multilevel data
    - Ex: Complex variation in intercepts and slopes across groups

Multilevel Data: Problems
- Issue: Multilevel data often result in violations of OLS regression assumptions
  – OLS requires an independent random sample…
  – Students from the same class (or school) are not independent… and may have correlated error
- If you don’t control for sources of correlated error, models tend to underestimate standard errors
  – This leads to false rejection of H0
  – Too many asterisks in the table (Type I error)
- This is a serious issue, as we always want to err in the direction of conservatism… false findings = bad!

Multilevel Data: Problems
- Why might nested data have correlated error?
  – Example: Student performance on a test
    - Students in a given classroom may share & experience common (unobserved) characteristics
    - Ex: Maybe the classroom is too dark, causing all students to perform poorly on tests
  – If all those students score poorly, they fall below the regression line… and have negative error
  – But OLS regression requires that error be “random”
  – Within-class error should be random, not consistently negative
- Other sources of within-class (or school) error
  – An especially good teacher; poor school funding
  – Other ideas?
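The dark-classroom story can be illustrated with a small Python sketch. The data are hypothetical: test scores are built as 50 + 2 × hours studied, plus a classroom-wide shock, plus a little noise. The class names and shock sizes are assumptions made up for this illustration. A pooled OLS line that ignores classrooms is fitted, and the residuals are then grouped by classroom.

```python
from statistics import mean

# HYPOTHETICAL data: score = 50 + 2*hours + class shock + small noise.
# The class shocks stand in for unobserved classroom conditions.
class_shock = {"dark_room": -6.0, "typical": 0.0, "great_teacher": 6.0}
hours = [1, 2, 3, 4]
noise = [0.5, -0.5, 0.5, -0.5]

data = [(cls, h, 50 + 2 * h + shock + e)
        for cls, shock in class_shock.items()
        for h, e in zip(hours, noise)]

# Pooled OLS of score on hours, ignoring classrooms.
xs = [h for _, h, _ in data]
ys = [s for _, _, s in data]
mx, my = mean(xs), mean(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

# Residuals grouped by classroom.
residuals = {cls: [] for cls in class_shock}
for cls, h, score in data:
    residuals[cls].append(score - (intercept + slope * h))

mean_resid = {cls: mean(r) for cls, r in residuals.items()}
for cls, m in mean_resid.items():
    print(f"{cls}: mean residual = {m:+.2f}")
```

Every student in the hypothetical dark classroom falls below the fitted line and every student with the strong teacher falls above it, so the errors within a class are not independent draws.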

Multilevel Data: Problems
- Sources of correlated error within groups
  – Ex: Cross-national study of homelessness
    - People in welfare states have a common unobserved characteristic: access to generous benefits
  – Ex: Study of worker efficiency in workgroups
    - Group members may influence each other (peer pressure), leading to group commonalities

Multilevel Data: Problems
- When is multilevel data NOT a problem?
  – Answer: If you can successfully control for potential sources of correlated error
    - Add controls to the OLS model for classroom, school, and state characteristics that would be sources of correlated error in each group
    - Ex: Teacher quality, class size, budget, etc.
- But: We often can’t identify or measure all relevant sources of correlated error
  – Thus, we need to abandon simple OLS regression and try other approaches

Example: Pro-environmental values
- Source: World Values Survey (~26 countries)
  . reg supportenv age male dmar demp educ incomerel ses
[Stata output: OLS regression of supportenv on age, male, dmar, demp, educ, incomerel, ses, and _cons; F(7, 27799); numeric estimates not recoverable]

Robust Standard Errors
- Strategy #1: Improve our estimates of the standard errors
  – Option 1: Robust Standard Errors
    - reg y x1 x2 x3, robust
    - The Huber / White / “Sandwich” estimator
- An alternative method of computing standard errors that is robust to a variety of assumption violations
  – Provides accurate estimates in the presence of heteroskedasticity
  – Also robust to model misspecification
  – Note: Freedman’s criticism: What good are accurate SEs if coefficients are biased due to poor specification?
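To make the “sandwich” idea concrete, here is a minimal Python sketch for a one-predictor regression on hypothetical data whose scatter grows with x. It compares the classical standard error of the slope with a heteroskedasticity-robust one. The n/(n − k) scaling used is the HC1 variant, which is assumed here to correspond to what Stata’s robust option reports; the variable names are illustrative.

```python
from math import sqrt

# HYPOTHETICAL data in which the scatter around the line grows with x.
x = [1, 2, 3, 4, 5, 6]
noise = [0.1, -0.1, 0.5, -0.5, 2.0, -2.0]
y = [2 * xi + e for xi, e in zip(x, noise)]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
dx = [xi - mx for xi in x]
sxx = sum(d * d for d in dx)
slope = sum(d * (yi - my) for d, yi in zip(dx, y)) / sxx
intercept = my - slope * mx
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Classical SE: assumes one common error variance for every case.
s2 = sum(r * r for r in resid) / (n - 2)
se_classical = sqrt(s2 / sxx)

# Sandwich SE: each case contributes its own squared residual.
# n / (n - 2) is the small-sample (HC1) scaling.
meat = sum((d * r) ** 2 for d, r in zip(dx, resid))
se_robust = sqrt(n / (n - 2) * meat / sxx ** 2)

print(f"slope = {slope:.3f}")
print(f"classical SE = {se_classical:.3f}, robust SE = {se_robust:.3f}")
```

The coefficient is identical under both approaches; only the standard error changes.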

Example: Pro-environmental values
- Robust Standard Errors
  . reg supportenv age male dmar demp educ incomerel ses, robust
[Stata output: linear regression with robust standard errors; F(7, 27799); predictors age, male, dmar, demp, educ, incomerel, ses, and _cons; numeric estimates not recoverable]
- Standard errors shift a tiny bit… fairly similar to OLS in this case

Robust Cluster Standard Errors
- Option 2: Robust cluster standard errors
  – A modification of robust SEs to address clustering
    - reg y x1 x2 x3, cluster(groupid)
  – Note: Cluster implies robust (vs. regular SEs)
- It is easy to adapt robust standard errors to address clustering in data
- Result: SE estimates typically increase, which is appropriate because non-independent cases aren’t providing as much information as would a sample of independent cases
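The reason cluster standard errors typically grow can be sketched in Python with hypothetical data: 4 clusters of 3 cases, where every case in a cluster shares the same error shock and x also differs across clusters. The cluster estimator sums (x − mean) × residual within each cluster before squaring, so shared error is not averaged away. The finite-sample factor G/(G − 1) × (N − 1)/(N − k) is the one assumed here to match Stata’s cluster option; all names and numbers are illustrative.

```python
from math import sqrt

# HYPOTHETICAL data: 4 clusters of 3 cases; y = 1 + 0.5*x + cluster shock.
clusters = {
    "g1": ([1, 2, 3], 3.0),
    "g2": ([4, 5, 6], -3.0),
    "g3": ([7, 8, 9], -3.0),
    "g4": ([10, 11, 12], 3.0),
}
rows = [(g, x, 1 + 0.5 * x + shock)
        for g, (xs, shock) in clusters.items() for x in xs]

n, k, G = len(rows), 2, len(clusters)
mx = sum(x for _, x, _ in rows) / n
my = sum(y for _, _, y in rows) / n
sxx = sum((x - mx) ** 2 for _, x, _ in rows)
slope = sum((x - mx) * (y - my) for _, x, y in rows) / sxx
intercept = my - slope * mx

# Classical SE treats all n residuals as independent.
resid = [(g, x, y - (intercept + slope * x)) for g, x, y in rows]
se_classical = sqrt(sum(r * r for _, _, r in resid) / (n - k) / sxx)

# Cluster SE: add up (x - mean) * residual within each cluster first,
# then square the cluster totals.
score = {g: 0.0 for g in clusters}
for g, x, r in resid:
    score[g] += (x - mx) * r
meat = sum(s * s for s in score.values())
correction = G / (G - 1) * (n - 1) / (n - k)  # finite-sample scaling
se_cluster = sqrt(correction * meat / sxx ** 2)

print(f"slope = {slope:.3f}")
print(f"classical SE = {se_classical:.3f}, cluster SE = {se_cluster:.3f}")
```

With these invented numbers the cluster standard error is noticeably larger than the classical one, reflecting that 12 cases in 4 clusters carry less information than 12 independent cases.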

Example: Pro-environmental values
- Robust Cluster Standard Errors
  . reg supportenv age male dmar demp educ incomerel ses, cluster(country)
[Stata output: linear regression with robust standard errors; F(7, 25); number of clusters (country) = 26; predictors age, male, dmar, demp, educ, incomerel, ses, and _cons; numeric estimates not recoverable]
- Cluster standard errors really change the picture. Several variables lose statistical significance.

Dummy Variables
- Another solution to correlated error within groups/clusters: Add dummy variables
  – Include a dummy variable for each Level-2 group, to explicitly model variance in means
  – A simple version of a “fixed effects” model (see below)
- Ex: Student achievement; data from 3 classes
  – Level 1: students; Level 2: classroom
  – Create dummy variables for each class
    - Include all but one dummy variable in the model
    - Or include all dummies and suppress the intercept

Dummy Variables
- What is the consequence of adding group dummy variables?
  – A separate intercept is estimated for each group
  – Correlated error is absorbed into the intercept
    - Groups won’t systematically fall above or below the regression line
- In fact, all “between group” variation (not just error) is absorbed into the intercept
  – Thus, other variables are really just looking at within-group effects
  – This can be good or bad, depending on your goals

Dummy Variables
- Note: You can create a set of dummy variables in Stata as follows:
  – xi i.classid
    - Creates dummy variables for each unique value of the variable “classid”
    - The variables are named _Iclassid_1, _Iclassid_2, etc. (by default, xi leaves out the dummy for the first category)
- These dummies can be added to the analysis by specifying the variable list: _Iclassid*
  – Ex: reg y x1 x2 x3 _Iclassid*, nocons
  – “nocons” removes the constant, allowing you to use a full set of dummies. Alternatively, you could drop one dummy and keep the constant.

Example: Pro-environmental values
- Dummy variable model
  . reg supportenv age male dmar demp educ incomerel ses _Icountry*
[Stata output: OLS regression with country dummies; F(32, 27774); predictors age, male, dmar, demp, educ, incomerel, ses, the dummies _Icountry_32, _Icountry_50, _Icountry_70, … (dummies omitted) …, _Icountr~891, and _cons; numeric estimates not recoverable]

Dummy Variables
- Benefits of the dummy variable approach
  – It is simple: just estimate a different intercept for each group
  – Sometimes the dummy interpretations can be of interest
- Weaknesses
  – Cumbersome if you have many groups
  – Uses up lots of degrees of freedom (not parsimonious)
  – Makes it hard to look at other kinds of group dummies
    - Non-varying group variables are collinear with the dummies
  – Can be problematic if your main interest is to study effects of variables across groups
    - Dummies purge that variation… the focus is on within-group variation
    - If you don’t have much within-group variation, there isn’t much left to analyze

Dummy Variables
- Note: Dummy variables are a simple example of a “fixed effects” model (FEM)
  – The effect of each group is modeled as a “fixed effect” rather than a random variable
  – Also can be thought of as the “within-group” estimator
    - Looks purely at variation within groups
- Stata can do a Fixed Effects Model without the effort of using all the dummy variables
  – Simply request the “fixed effects” estimator in xtreg

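Fixed Effects Model (FEM)
- Fixed effects model, for i cases within j groups: $Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij}$
- Therefore $\alpha_j$ is a separate intercept for each group
- It is equivalent to looking solely at within-group variation: $Y_{ij} - \bar{Y}_j = \beta (X_{ij} - \bar{X}_j) + (\varepsilon_{ij} - \bar{\varepsilon}_j)$
  – $\bar{X}_j$ is the mean of X for group j, etc.
- The model is “within group” because all variables are centered around the mean of each group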

Fixed Effects Model (FEM)
  . xtreg supportenv age male dmar demp educ incomerel ses, i(country) fe
[Stata output: fixed-effects (within) regression; group variable (i): country; number of groups = 26; obs per group: min = 511, max = 2154; F(7, 27774); predictors age, male, dmar, demp, educ, incomerel, ses, and _cons; sigma_u, sigma_e, and rho (fraction of variance due to u_i) reported; F test that all u_i = 0: F(25, 27774); numeric estimates not recoverable]
- Identical to dummy variable model!
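The equivalence between the dummy variable model and the within-group estimator can be checked by hand on a small example. The Python sketch below uses hypothetical data for three groups whose intercepts fall as x rises. It fits (1) pooled OLS, (2) a regression with a full set of group dummies and no constant, solved through the normal equations, and (3) a regression on group-mean-centered data. The helper `solve` and all data values are made up for this illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# HYPOTHETICAL data: (group, x, y); group intercepts fall as x rises.
data = [("A", 1, 11.2), ("A", 2, 11.9), ("A", 3, 13.1),
        ("B", 4, 9.1), ("B", 5, 9.8), ("B", 6, 11.0),
        ("C", 7, 6.9), ("C", 8, 8.2), ("C", 9, 8.8)]
groups = sorted({g for g, _, _ in data})

# (1) Pooled OLS slope, ignoring groups.
n = len(data)
mx = sum(x for _, x, _ in data) / n
my = sum(y for _, _, y in data) / n
pooled_slope = (sum((x - mx) * (y - my) for _, x, y in data)
                / sum((x - mx) ** 2 for _, x, _ in data))

# (2) Dummy variable model: columns are x plus one dummy per group.
X = [[float(x)] + [1.0 if g == h else 0.0 for h in groups] for g, x, _ in data]
yv = [y for _, _, y in data]
k = len(X[0])
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(X, yv)) for i in range(k)]
dummy_slope = solve(XtX, Xty)[0]

# (3) Within estimator: center x and y on their group means.
xbar = {h: sum(x for g, x, _ in data if g == h) / sum(g == h for g, _, _ in data)
        for h in groups}
ybar = {h: sum(y for g, _, y in data if g == h) / sum(g == h for g, _, _ in data)
        for h in groups}
within_slope = (sum((x - xbar[g]) * (y - ybar[g]) for g, x, y in data)
                / sum((x - xbar[g]) ** 2 for g, x, _ in data))

print(f"pooled = {pooled_slope:.3f}, dummies = {dummy_slope:.3f}, "
      f"within = {within_slope:.3f}")
```

In this invented example the pooled slope is negative because between-group differences dominate, while the dummy variable and within-group slopes agree with each other and are positive.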