# Random Assignment Experiments

Random Assignment Experiments
When medical researchers want to examine the effect of a new drug, they use an experimental design called a random assignment experiment. In such experiments, two groups are chosen randomly:

1. Treatment group: receives the treatment (a specific medicine, say)
2. Control group: receives a harmless, ineffective placebo

The resulting equation is:

$$\text{OUTCOME}_i = \beta_0 + \beta_1 \text{TREATMENT}_i + \varepsilon_i \tag{16.1}$$

where:

- $\text{OUTCOME}_i$ = a measure of the desired outcome in the $i$th individual
- $\text{TREATMENT}_i$ = a dummy variable equal to 1 for individuals in the treatment group and 0 for individuals in the control group

© 2011 Pearson Addison-Wesley. All rights reserved.
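As a rough illustration of Equation 16.1, the sketch below simulates a random assignment experiment and estimates β1 by OLS. The sample size, the true coefficients, and the noise level are all assumptions made up for this example; none of them come from the slides. It also checks a standard property: with a single dummy regressor, the OLS slope equals the difference between the treatment-group and control-group means.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # hypothetical beta_1, chosen only for illustration

# Randomly assign 200 hypothetical subjects: exactly half get TREATMENT = 1.
treatment = [1] * 100 + [0] * 100
random.shuffle(treatment)

# OUTCOME_i = beta_0 + beta_1 * TREATMENT_i + eps_i, with an assumed beta_0 = 10.
outcome = [10.0 + TRUE_EFFECT * t + random.gauss(0, 1) for t in treatment]

# OLS slope and intercept for a single regressor.
t_bar, y_bar = mean(treatment), mean(outcome)
sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(treatment, outcome))
sxx = sum((t - t_bar) ** 2 for t in treatment)
beta1_hat = sxy / sxx
beta0_hat = y_bar - beta1_hat * t_bar

# With a dummy regressor, the slope equals the difference in group means.
mean_treated = mean(y for t, y in zip(treatment, outcome) if t == 1)
mean_control = mean(y for t, y in zip(treatment, outcome) if t == 0)
diff_in_means = mean_treated - mean_control

print(f"beta1_hat     = {beta1_hat:.4f}")
print(f"diff in means = {diff_in_means:.4f}")
```

The two printed numbers agree up to floating-point error, which is why β1 in Equation 16.1 can be read as the average treatment effect.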

Random Assignment Experiments (cont.)
But random assignment can’t always control for all other possible factors, though sometimes we may be able to identify some of these factors and add them to our equation.

Let’s say that the treatment is job training:

- Suppose that random assignment, by chance, results in one group having more males and being slightly older than the other group.
- If gender and age matter in determining earnings, then we can control for the different composition of the two groups by including gender and age in our regression equation:

$$\text{OUTCOME}_i = \beta_0 + \beta_1 \text{TREATMENT}_i + \beta_2 X_{1i} + \beta_3 X_{2i} + \varepsilon_i \tag{16.2}$$

where:

- $X_1$ = dummy variable for the individual’s gender
- $X_2$ = the individual’s age
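To see why the controls in Equation 16.2 can matter, here is a minimal sketch with invented, noise-free data in which the treatment group happens to be older and more male. The `ols` helper, the eight observations, and the "true" coefficients (20, 3, 4, 0.5) are all assumptions for illustration only.

```python
def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical data: earnings = 20 + 3*TREATMENT + 4*male + 0.5*age (no noise).
treatment = [1, 1, 1, 1, 0, 0, 0, 0]
male = [1, 1, 1, 0, 1, 0, 0, 0]
age = [40, 45, 38, 50, 30, 35, 42, 28]
earnings = [20 + 3 * t + 4 * m + 0.5 * a
            for t, m, a in zip(treatment, male, age)]

# Equation 16.1: no controls, so the group imbalance leaks into beta_1.
naive = ols([[1, t] for t in treatment], earnings)

# Equation 16.2: gender and age are held constant.
controlled = ols([[1, t, m, a] for t, m, a in zip(treatment, male, age)],
                 earnings)

print(f"beta1 without controls: {naive[1]:.3f}")
print(f"beta1 with controls:    {controlled[1]:.3f}")
```

In this made-up sample the uncontrolled estimate overstates the assumed training effect of 3 because it also picks up the age and gender differences between the groups.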

Random Assignment Experiments (cont.)
Unfortunately, random assignment experiments are not common in economics because they are subject to problems that typically do not plague medical experiments, e.g.:

1. Non-Random Samples: Most subjects in economic experiments are volunteers, and samples of volunteers often aren’t random and therefore may not be representative of the overall population. As a result, our conclusions may not apply to everyone.
2. Unobservable Heterogeneity: In Equation 16.2, we added observable factors to the equation to avoid omitted variable bias, but not all omitted factors in economics are observable. This “unobservable omitted variable” problem is called unobserved heterogeneity.

Random Assignment Experiments (cont.)
3. The Hawthorne Effect: Human subjects typically know that they’re being studied, and they usually know whether they’re in the treatment group or the control group. The fact that human subjects know that they’re being observed can sometimes change their behavior, and this change in behavior could clearly change the results of the experiment.
4. Impossible Experiments: It’s often impossible (or unethical) to run a random assignment experiment in economics. Think about how difficult it would be to use a random assignment experiment to study the impact of marriage on earnings!

Natural Experiments
Natural experiments (or quasi-experiments) are similar to random assignment experiments, except that observations fall into treatment and control groups “naturally” (because of an exogenous event) instead of being randomly assigned by the researcher. “Exogenous event” means that the natural event must not be under the control of either of the two groups.

Natural Experiments (cont.)
The appropriate regression equation for such a natural experiment is:

$$\Delta\text{OUTCOME}_i = \beta_0 + \beta_1 \text{TREATMENT}_i + \beta_2 X_{1i} + \beta_3 X_{2i} + \varepsilon_i \tag{16.3}$$

where:

- $\Delta\text{OUTCOME}_i$ is defined as the outcome after the treatment minus the outcome before the treatment for the $i$th observation
- $\beta_1$ is called the difference-in-differences estimator, and it measures the difference between the change in the treatment group and the change in the control group, holding constant $X_1$ and $X_2$

Figure 16.1 illustrates an example of a natural experiment.

Figure 16.1 Treatment and Control Groups for Los Angeles
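The difference-in-differences idea behind Equation 16.3 can be sketched with a few invented before/after observations. The data below are hypothetical and the control variables X1 and X2 are left out to keep the example short; under those assumptions, β1 is simply the mean change in the treatment group minus the mean change in the control group.

```python
from statistics import mean

# Hypothetical observations, invented for illustration:
# (treatment dummy, outcome before, outcome after)
observations = [
    (1, 10.0, 15.0),
    (1, 12.0, 16.0),
    (1, 11.0, 17.0),
    (0, 10.0, 12.0),
    (0, 13.0, 14.0),
    (0, 9.0, 12.0),
]

treatment = [t for t, _, _ in observations]
# Delta OUTCOME_i = outcome after the treatment minus outcome before it.
delta = [after - before for _, before, after in observations]

# OLS of delta on the treatment dummy (Equation 16.3 without X1 and X2).
t_bar, d_bar = mean(treatment), mean(delta)
beta1_hat = (sum((t - t_bar) * (d - d_bar) for t, d in zip(treatment, delta))
             / sum((t - t_bar) ** 2 for t in treatment))
beta0_hat = d_bar - beta1_hat * t_bar

# The same number computed directly as a difference of mean changes.
change_treated = mean(d for t, d in zip(treatment, delta) if t == 1)
change_control = mean(d for t, d in zip(treatment, delta) if t == 0)

print(f"beta0_hat (mean change in control group) = {beta0_hat:.2f}")
print(f"beta1_hat (difference-in-differences)    = {beta1_hat:.2f}")
print(f"direct difference of mean changes        = "
      f"{change_treated - change_control:.2f}")
```

Here the treatment group improves by 5 on average and the control group by 2, so the estimate attributes 3 to the treatment and the remaining 2 to whatever affected both groups.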

What Are Panel Data?
Panel (or longitudinal) data combine time-series and cross-sectional data such that observations on the same variables from the same cross-sectional sample are followed over two or more different time periods.

Why use panel data? There are at least three reasons. Using panel data:

1. certainly will increase sample sizes!
2. can help provide insights into analytical questions that can’t be answered by using time-series or cross-sectional data alone, e.g., determining whether the same people are unemployed year after year or whether different individuals are unemployed in different years
3. often allows researchers to avoid omitted variable problems that otherwise would cause bias in cross-sectional studies

What Are Panel Data? (cont.)
There are four different kinds of variables that we encounter when we use panel data:

1. Variables that can differ between individuals but don’t change over time: e.g., gender, ethnicity, and race
2. Variables that change over time but are the same for all individuals in a given time period: e.g., the retail price index and the national unemployment rate
3. Variables that vary both over time and between individuals: e.g., income and marital status
4. Trend variables that vary in predictable ways: e.g., an individual’s age
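One way to make the first three categories concrete is to check, in a small panel, whether a variable changes within an individual over time and whether it differs across individuals within a period. The panel below, the column names, and the `classify` helper are all hypothetical. A trend variable such as age would be reported as varying in both dimensions, so this sketch does not detect the fourth category.

```python
# Hypothetical panel: one row per (individual, year); all values are invented.
panel = [
    {"id": 1, "year": 2009, "gender": "F", "unemp_rate": 9.3, "income": 40.0},
    {"id": 1, "year": 2010, "gender": "F", "unemp_rate": 9.6, "income": 42.0},
    {"id": 2, "year": 2009, "gender": "M", "unemp_rate": 9.3, "income": 35.0},
    {"id": 2, "year": 2010, "gender": "M", "unemp_rate": 9.6, "income": 35.5},
]


def classify(rows, name):
    """Label a panel variable by the dimensions along which it varies."""
    by_id, by_year = {}, {}
    for r in rows:
        by_id.setdefault(r["id"], set()).add(r[name])
        by_year.setdefault(r["year"], set()).add(r[name])
    # More than one value for the same individual means it changes over time.
    over_time = any(len(values) > 1 for values in by_id.values())
    # More than one value in the same period means it differs between individuals.
    between = any(len(values) > 1 for values in by_year.values())
    if between and not over_time:
        return "differs between individuals, constant over time"
    if over_time and not between:
        return "changes over time, same for all individuals"
    if over_time and between:
        return "varies over time and between individuals"
    return "constant"


for column in ("gender", "unemp_rate", "income"):
    print(f"{column}: {classify(panel, column)}")
```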

The Fixed Effects Model
There are several alternative panel data estimation procedures. Most researchers use the fixed effects model, which allows each cross-sectional unit to have a different intercept:

$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D2_i + \cdots + \beta_N DN_i + v_{it} \tag{16.4}$$

where:

- $D2_i$ = intercept dummy equal to 1 for the second cross-sectional entity and 0 otherwise
- $DN_i$ = intercept dummy equal to 1 for the $N$th cross-sectional entity and 0 otherwise

Note that $Y$, $X$, and $v$ have two subscripts!
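A minimal sketch of Equation 16.4 follows, estimated with explicit intercept dummies for a made-up panel of three entities and three periods. The data are noise-free and generated from assumed intercepts (1, 4, 7) and an assumed slope of 2, and the `ols` helper is a small illustrative solver rather than a library routine.

```python
def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical panel: (entity, X_it) for three entities over three periods.
data = [(1, 1.0), (1, 2.0), (1, 3.0),
        (2, 2.0), (2, 3.0), (2, 4.0),
        (3, 3.0), (3, 4.0), (3, 5.0)]
intercepts = {1: 1.0, 2: 4.0, 3: 7.0}  # assumed entity-specific intercepts

# Y_it = (entity intercept) + 2 * X_it, with no noise.
y = [intercepts[e] + 2.0 * x for e, x in data]

# Regressors in the order of Equation 16.4: constant, X, D2, D3.
# Entity 1 is the omitted category, so beta_0 is its intercept.
rows = [[1.0, x, float(e == 2), float(e == 3)] for e, x in data]
beta0, beta1, beta2, beta3 = ols(rows, y)

print(f"beta0 (intercept of entity 1)    = {beta0:.3f}")
print(f"beta1 (slope on X)               = {beta1:.3f}")
print(f"beta2 (entity 2 minus entity 1)  = {beta2:.3f}")
print(f"beta3 (entity 3 minus entity 1)  = {beta3:.3f}")
```

Each dummy coefficient measures how far that entity's intercept sits from the omitted entity's intercept, which is why only N - 1 dummies appear alongside β0.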

The Fixed Effects Model (cont.)
One major advantage of the fixed effects model is that it avoids bias due to omitted variables that don’t change over time (e.g., race or gender). Such time-invariant omitted variables often are referred to as unobserved heterogeneity or a fixed effect.

To understand how this works, consider what Equation 16.4 would look like with only two years’ worth of data:

$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D2_i + v_{it} \tag{16.5}$$

Let’s decompose the error term, $v_{it}$, into two components, a classical error term ($\varepsilon_{it}$) and the unobserved impact of the time-invariant omitted variables ($a_i$):

$$v_{it} = \varepsilon_{it} + a_i \tag{16.6}$$

The Fixed Effects Model (cont.)
If we substitute Equation 16.6 into Equation 16.5, we get:

$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D2_i + \varepsilon_{it} + a_i \tag{16.7}$$

Next, average Equation 16.7 over time for each observation $i$, thus producing:

$$\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_i + \beta_2 D2_i + \bar{\varepsilon}_i + a_i \tag{16.8}$$

where the bar over a variable indicates the mean of that variable across time. Note that $a_i$, $\beta_2 D2_i$, and $\beta_0$ don’t have bars over them because they’re constant over time.

The Fixed Effects Model (cont.)
If we now subtract Equation 16.8 from Equation 16.7, we get:

$$Y_{it} - \bar{Y}_i = \beta_1 (X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i) \tag{16.9}$$

Note that $a_i$, $\beta_2 D2_i$, and $\beta_0$ are subtracted out because they’re in both equations. We’ve therefore shown that estimating panel data with the fixed effects model does indeed drop the $a_i$ out of the equation. Hence, the fixed effects model will not experience bias due to time-invariant omitted variables!

Example: The death penalty and the murder rate. Figures 16.2 and 16.3 illustrate the importance of the fixed effects model: the unlikely (positive) result from the cross-section model is reversed by the fixed effects model!

Figure 16.3 In a Panel Data Model, the Murder Rate Decreases with Executions

The Random Effects Model
Recall that the fixed effects model is based on the assumption that each cross-sectional unit has its own intercept. The random effects model instead is based on the assumption that the intercept for each cross-sectional unit is drawn from a distribution (that is centered around a mean intercept). Thus each intercept is a random draw from an “intercept distribution” and therefore is independent of the error term for any particular observation. Hence the term random effects model.

The Random Effects Model (cont.)
Advantages of the random effects model:

1. More degrees of freedom than a fixed effects model. This is because rather than estimating an intercept for virtually every cross-sectional unit, all we need to do is to estimate the parameters that describe the distribution of the intercepts.
2. Can now also estimate the coefficients of time-invariant explanatory variables (like race or gender).

Disadvantages of the random effects model:

1. Most importantly, the random effects estimator requires us to assume that $a_i$ is uncorrelated with the independent variables, the $X$s, if we’re going to avoid omitted variable bias. This may be an overly strong assumption in many cases.

Choosing Between Fixed and Random Effects
One key is the nature of the relationship between $a_i$ and the $X$s:

- If they’re likely to be correlated, then it makes sense to use the fixed effects model.
- If not, then it makes sense to use the random effects model.

We can also use the Hausman test to examine whether there is correlation between $a_i$ and $X$. Essentially, this procedure tests to see whether the regression coefficients under the fixed effects and random effects models are statistically different from each other:

- If they are different, then the fixed effects model is preferred.
- If they are not different, then the random effects model is preferred (or estimates of both the fixed effects and random effects models are provided).
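The slides describe the Hausman test only in words. As a hedged sketch, the function below uses the standard single-coefficient form of the statistic, H = (b_FE - b_RE)² / (Var(b_FE) - Var(b_RE)), compared against a chi-square distribution with one degree of freedom. The function name and the input numbers are hypothetical, and the multi-coefficient version (which needs a matrix inverse) is not shown.

```python
import math


def hausman_one_coef(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p-value for a single slope coefficient.

    b_fe, b_re     : fixed effects and random effects estimates
    var_fe, var_re : their estimated sampling variances
    """
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # In finite samples the variance difference can fail to be positive,
        # in which case this simple form of the statistic is not usable.
        raise ValueError("var_fe must exceed var_re for this form of the test")
    h = (b_fe - b_re) ** 2 / diff_var
    # Upper-tail probability of a chi-square(1) variable:
    # P(chi2_1 > h) = P(|Z| > sqrt(h)) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value


# Hypothetical estimates, invented for illustration only.
h_stat, p_val = hausman_one_coef(b_fe=0.5, var_fe=0.02, b_re=0.3, var_re=0.01)
print(f"H = {h_stat:.3f}, p-value = {p_val:.4f}")

# A small p-value suggests the two sets of estimates differ,
# which points toward the fixed effects model.
prefer = "fixed effects" if p_val < 0.05 else "random effects"
print(f"At the 5% level this sketch would favor: {prefer}")
```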