Published by Macy Corbet
Modified about 1 year ago
Chapter 16 Experimental and Panel Data Copyright © 2011 Pearson Addison-Wesley. All rights reserved. Slides by Niels-Hugo Blunch Washington and Lee University
Random Assignment Experiments
When medical researchers want to examine the effect of a new drug, they use an experimental design called a random assignment experiment
In such experiments, two groups are chosen randomly:
1. Treatment group: receives the treatment (a specific medicine, say)
2. Control group: receives a harmless, ineffective placebo
The resulting equation is:
$\text{OUTCOME}_i = \beta_0 + \beta_1 \text{TREATMENT}_i + \varepsilon_i$ (16.1)
where:
$\text{OUTCOME}_i$ = a measure of the desired outcome in the ith individual
$\text{TREATMENT}_i$ = a dummy variable equal to 1 for individuals in the treatment group and 0 for individuals in the control group
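To make Equation 16.1 concrete, the following is a minimal sketch on simulated data. The sample size, the true effect of 2.0, and the helper name `ols_simple` are illustrative assumptions, not taken from the slides. It shows that with a single treatment dummy, the OLS estimate of β1 is the difference between the treatment-group and control-group mean outcomes, and the estimate of β0 is the control-group mean.

```python
import random

def ols_simple(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(0)
n = 200  # hypothetical number of subjects

# Random assignment: each subject lands in the treatment group with probability 0.5.
treatment = [1 if random.random() < 0.5 else 0 for _ in range(n)]

# Simulated outcomes with an assumed true treatment effect of 2.0.
outcome = [10.0 + 2.0 * t + random.gauss(0, 1) for t in treatment]

beta0_hat, beta1_hat = ols_simple(treatment, outcome)

treated = [y for t, y in zip(treatment, outcome) if t == 1]
control = [y for t, y in zip(treatment, outcome) if t == 0]
mean_control = sum(control) / len(control)
diff_in_means = sum(treated) / len(treated) - mean_control

print(f"beta1_hat = {beta1_hat:.4f}, difference in means = {diff_in_means:.4f}")
```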
Random Assignment Experiments (cont.)
But random assignment can’t always control for all possible other factors, though sometimes we may be able to identify some of these factors and add them to our equation
Let’s say that the treatment is job training:
– Suppose that random assignment, by chance, results in one group having more males and being slightly older than the other group
– If gender and age matter in determining earnings, then we can control for the different composition of the two groups by including gender and age in our regression equation:
$\text{OUTCOME}_i = \beta_0 + \beta_1 \text{TREATMENT}_i + \beta_2 X_{1i} + \beta_3 X_{2i} + \varepsilon_i$ (16.2)
where:
$X_1$ = dummy variable for the individual’s gender
$X_2$ = the individual’s age
Random Assignment Experiments (cont.)
Unfortunately, random assignment experiments are not common in economics because they are subject to problems that typically do not plague medical experiments, for example:
1. Non-Random Samples: Most subjects in economic experiments are volunteers, and samples of volunteers often aren’t random and therefore may not be representative of the overall population
As a result, our conclusions may not apply to everyone
2. Unobservable Heterogeneity: In Equation 16.2, we added observable factors to the equation to avoid omitted variable bias, but not all omitted factors in economics are observable
This “unobservable omitted variable” problem is called unobserved heterogeneity
Random Assignment Experiments (cont.)
3. The Hawthorne Effect: Human subjects typically know that they’re being studied, and they usually know whether they’re in the treatment group or the control group
The fact that human subjects know that they’re being observed sometimes can change their behavior, and this change in behavior could clearly change the results of the experiment
4. Impossible Experiments: It’s often impossible (or unethical) to run a random assignment experiment in economics
Think about how difficult it would be to use a random assignment experiment to study the impact of marriage on earnings!
Natural Experiments
Natural experiments (or quasi-experiments) are similar to random assignment experiments, except:
– Observations fall into treatment and control groups “naturally” (because of an exogenous event) instead of being randomly assigned by the researcher
– “Exogenous event” means that the natural event must not be under the control of either of the two groups
Natural Experiments (cont.)
The appropriate regression equation for such a natural experiment is:
$\Delta\text{OUTCOME}_i = \beta_0 + \beta_1 \text{TREATMENT}_i + \beta_2 X_{1i} + \beta_3 X_{2i} + \varepsilon_i$ (16.3)
where:
$\Delta\text{OUTCOME}_i$ is defined as the outcome after the treatment minus the outcome before the treatment for the ith observation
$\beta_1$ is called the difference-in-differences estimator, and it measures the difference between the change in the treatment group and the change in the control group, holding constant $X_1$ and $X_2$
Figure 16.1 illustrates an example of a natural experiment
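As a small worked illustration of the difference-in-differences idea, the sketch below uses made-up before/after outcomes for three treated and three control observations and omits the control variables X1 and X2 for simplicity. All numbers are hypothetical. It computes the estimator two ways: as the difference between the two groups' mean changes, and as the OLS slope from regressing ΔOUTCOME on the treatment dummy.

```python
# Hypothetical before/after outcomes (not from the slides).
before = {"treated": [10.0, 12.0, 11.0], "control": [9.0, 10.0, 11.0]}
after = {"treated": [15.0, 16.0, 17.0], "control": [11.0, 12.0, 13.0]}

def mean(values):
    return sum(values) / len(values)

# Delta OUTCOME_i: outcome after minus outcome before, per observation.
delta = {g: [a - b for a, b in zip(after[g], before[g])] for g in before}

# Way 1: change in the treatment group minus change in the control group.
change_treated = mean(delta["treated"])
change_control = mean(delta["control"])
did = change_treated - change_control

# Way 2: OLS slope of Delta OUTCOME on the TREATMENT dummy (no other regressors).
treatment = [1] * len(delta["treated"]) + [0] * len(delta["control"])
d_outcome = delta["treated"] + delta["control"]
mx, my = mean(treatment), mean(d_outcome)
sxy = sum((x - mx) * (y - my) for x, y in zip(treatment, d_outcome))
sxx = sum((x - mx) ** 2 for x in treatment)
beta1_hat = sxy / sxx

print(did, beta1_hat)
```

In this toy data the treated group improves by 5 on average and the control group by 2, so both calculations give 3.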
Figure 16.1 Treatment and Control Groups for Los Angeles
What Are Panel Data?
Panel (or longitudinal) data combine time-series and cross-sectional data such that observations on the same variables from the same cross-sectional sample are followed over two or more different time periods
Why use panel data? There are at least three reasons. Using panel data:
1. certainly will increase sample sizes!
2. can help provide insights into analytical questions that can’t be answered by using time-series or cross-sectional data alone: for example, it allows determining whether the same people are unemployed year after year or whether different individuals are unemployed in different years
3. often allows researchers to avoid omitted variable problems that otherwise would cause bias in cross-sectional studies
What Are Panel Data? (cont.)
There are four different kinds of variables that we encounter when we use panel data:
1. Variables that can differ between individuals but don’t change over time: e.g., gender, ethnicity, and race
2. Variables that change over time but are the same for all individuals in a given time period: e.g., the retail price index and the national unemployment rate
3. Variables that vary both over time and between individuals: e.g., income and marital status
4. Trend variables that vary in predictable ways: e.g., an individual’s age
The Fixed Effects Model
There are several alternative panel data estimation procedures
Most researchers use the fixed effects model, which allows each cross-sectional unit to have a different intercept:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D2_i + \cdots + \beta_N DN_i + v_{it}$ (16.4)
where:
$D2$ = intercept dummy equal to 1 for the second cross-sectional entity and 0 otherwise
$DN$ = intercept dummy equal to 1 for the Nth cross-sectional entity and 0 otherwise
Note that Y, X, and v have two subscripts!
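The intercept dummies D2 through DN in Equation 16.4 can be built mechanically from an entity identifier. The sketch below is one way to do it. The function name `intercept_dummies` and the toy identifiers are hypothetical, and the first entity is treated as the omitted baseline, which matches the equation starting at D2.

```python
def intercept_dummies(entity_ids):
    """Return {entity: 0/1 column} for every entity except the first (the baseline)."""
    entities = sorted(set(entity_ids))
    non_baseline = entities[1:]  # the first entity is absorbed by beta_0
    return {
        e: [1 if eid == e else 0 for eid in entity_ids]
        for e in non_baseline
    }

# Toy panel: three entities observed in two periods each (hypothetical data).
entity_ids = [1, 1, 2, 2, 3, 3]
dummies = intercept_dummies(entity_ids)

for entity, column in dummies.items():
    print(f"D{entity}: {column}")
```

Each dummy varies only across entities, not over time, which is why it carries a single subscript i in Equation 16.4.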
The Fixed Effects Model (cont.)
One major advantage of the fixed effects model is that it avoids bias due to omitted variables that don’t change over time
– e.g., race or gender
– Such time-invariant omitted variables often are referred to as unobserved heterogeneity or a fixed effect
To understand how this works, consider what Equation 16.4 would look like with only two years’ worth of data:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D2_i + v_{it}$ (16.5)
Let’s decompose the error term, $v_{it}$, into two components, a classical error term ($\varepsilon_{it}$) and the unobserved impact of the time-invariant omitted variables ($a_i$):
$v_{it} = \varepsilon_{it} + a_i$ (16.6)
The Fixed Effects Model (cont.)
If we substitute Equation 16.6 into Equation 16.5, we get:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D2_i + \varepsilon_{it} + a_i$ (16.7)
Next, average Equation 16.7 over time for each observation i, thus producing:
$\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_i + \beta_2 D2_i + \bar{\varepsilon}_i + a_i$ (16.8)
where the bar over a variable indicates the mean of that variable across time
Note that $a_i$, $\beta_2 D2_i$, and $\beta_0$ don’t have bars over them because they’re constant over time
The Fixed Effects Model (cont.)
If we now subtract Equation 16.8 from Equation 16.7, we get:
$Y_{it} - \bar{Y}_i = \beta_1 (X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$
Note that $a_i$, $\beta_2 D2_i$, and $\beta_0$ are subtracted out because they’re in both equations
We’ve therefore shown that estimating panel data with the fixed effects model does indeed drop the $a_i$ out of the equation
Hence, the fixed effects model will not experience bias due to time-invariant omitted variables!
Example: The death penalty and the murder rate:
– Figures 16.2 and 16.3 illustrate the importance of the fixed effects model: the unlikely (positive) result from the cross-section model is reversed by the fixed effects model!
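The demeaning argument above, and the kind of sign reversal described in the example, can be reproduced on a tiny constructed panel. The sketch below is illustrative only. The data are noiseless and made up (three units, three periods, a true slope of -2, and a unit effect a_i that rises with X), and none of it comes from the death penalty data. It compares the slope from a regression that ignores a_i with the slope from the demeaned (within) regression.

```python
def slope(x, y):
    """OLS slope of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

N, T = 3, 3                # hypothetical: 3 cross-sectional units, 3 periods
beta0, beta1 = 5.0, -2.0   # assumed true coefficients

panel = []                 # rows of (unit, x, y)
for i in range(N):
    a_i = 10.0 * i         # time-invariant omitted effect, correlated with X
    for t in range(T):
        x = 3.0 * i + t
        y = beta0 + beta1 * x + a_i
        panel.append((i, x, y))

# Regression that ignores a_i: biased because a_i is correlated with X.
pooled_slope = slope([x for _, x, _ in panel], [y for _, _, y in panel])

# Within transformation: subtract each unit's time mean from X and Y.
x_dm, y_dm = [], []
for i in range(N):
    xs = [x for j, x, _ in panel if j == i]
    ys = [y for j, _, y in panel if j == i]
    mx, my = sum(xs) / T, sum(ys) / T
    x_dm += [x - mx for x in xs]
    y_dm += [y - my for y in ys]

within_slope = slope(x_dm, y_dm)

print(f"ignoring a_i: {pooled_slope:+.1f}, after demeaning: {within_slope:+.1f}")
```

With these numbers the regression that ignores a_i yields a slope of +1, while the demeaned regression recovers the assumed -2, because a_i has been subtracted out.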
Figure 16.2 In a Single-Year Cross-Sectional Model, the Murder Rate Appears to Increase with Executions
Figure 16.3 In a Panel Data Model, the Murder Rate Decreases with Executions
The Random Effects Model
Recall that the fixed effects model is based on the assumption that each cross-sectional unit has its own intercept
The random effects model instead is based on the assumption that the intercept for each cross-sectional unit is drawn from a distribution (that is centered around a mean intercept)
Thus each intercept is a random draw from an “intercept distribution” and therefore is independent of the error term for any particular observation
– Hence the term random effects model
The Random Effects Model (cont.)
Advantages of the random effects model:
1. More degrees of freedom than a fixed effects model
This is because rather than estimating an intercept for virtually every cross-sectional unit, all we need to do is to estimate the parameters that describe the distribution of the intercepts
2. Can now also estimate time-invariant explanatory variables (like race or gender)
Disadvantages of the random effects model:
1. Most importantly, the random effects estimator requires us to assume that $a_i$ is uncorrelated with the independent variables, the Xs, if we’re going to avoid omitted variable bias
This may be an overly strong assumption in many cases
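The slides do not show how the random effects estimator is computed. One common textbook formulation, stated here as an assumption rather than as the slides' method, is a "quasi-demeaned" regression for a balanced panel: a fraction θ of each unit's time mean is subtracted from Y and X, with θ depending on the variance of ε, the variance of a_i, and the number of periods T. The sketch below only computes that weight. The function name and the numbers are hypothetical.

```python
import math

def quasi_demeaning_weight(sigma2_eps, sigma2_a, T):
    """theta = 1 - sqrt(sigma2_eps / (sigma2_eps + T * sigma2_a)) for a balanced panel.

    sigma2_eps: variance of the classical error term
    sigma2_a:   variance of the unit-specific intercepts a_i
    T:          number of time periods per unit
    """
    return 1.0 - math.sqrt(sigma2_eps / (sigma2_eps + T * sigma2_a))

# No variation in intercepts: nothing is subtracted (theta = 0).
theta_none = quasi_demeaning_weight(sigma2_eps=1.0, sigma2_a=0.0, T=5)

# Equal variances with T = 3: half of each unit mean is subtracted.
theta_half = quasi_demeaning_weight(sigma2_eps=1.0, sigma2_a=1.0, T=3)

# Intercept variation dominates: theta approaches 1 (full demeaning).
theta_near_one = quasi_demeaning_weight(sigma2_eps=1.0, sigma2_a=1e6, T=5)

print(theta_none, theta_half, theta_near_one)
```

Under this formulation, θ near 1 makes the random effects estimate behave like the fixed effects (fully demeaned) estimate, while θ = 0 leaves the data untransformed.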
Choosing Between Fixed and Random Effects
One key is the nature of the relationship between $a_i$ and the Xs:
– If they’re likely to be correlated, then it makes sense to use the fixed effects model
– If not, then it makes sense to use the random effects model
Can also use the Hausman test to examine whether there is correlation between $a_i$ and X
Essentially, this procedure tests to see whether the regression coefficients under the fixed effects and random effects models are statistically different from each other
– If they are different, then the fixed effects model is preferred
– If they are not different, then the random effects model is preferred (or estimates of both the fixed effects and random effects models are provided)
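For a single slope coefficient, the comparison described above reduces to a one-line statistic. The sketch below assumes the standard single-coefficient form of the Hausman statistic, H = (b_FE - b_RE)^2 / (Var(b_FE) - Var(b_RE)), compared against a chi-square distribution with one degree of freedom. The coefficient estimates and variances passed in are hypothetical numbers, not results from Table 16.1.

```python
import math

def hausman_one_coefficient(b_fe, var_fe, b_re, var_re):
    """Return (H, p_value) for a one-coefficient Hausman comparison."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # In finite samples the variance difference can be non-positive,
        # in which case this simple form of the statistic is not defined.
        raise ValueError("var_fe must exceed var_re for this simple form")
    h = (b_fe - b_re) ** 2 / diff_var
    # Upper-tail probability of a chi-square(1) variable: P(Z^2 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value

# Hypothetical estimates: the two models disagree about sign and size.
h, p = hausman_one_coefficient(b_fe=-0.5, var_fe=0.04, b_re=0.1, var_re=0.01)

print(f"H = {h:.2f}, p = {p:.4f}")
if p < 0.05:
    print("Coefficients differ: fixed effects preferred")
else:
    print("No significant difference: random effects preferred")
```

With these made-up inputs H is about 12, well beyond the 5 percent critical value of 3.84 for one degree of freedom, so the sketch would favor the fixed effects model.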
Table 16.1a
Table 16.1b
Table 16.1c
Table 16.1d
Table 16.1e
Key Terms from Chapter 16
Treatment group
Control group
Differences estimator
Difference in differences
Unobserved heterogeneity
The Hawthorne effect
Panel data
The fixed effects model
The random effects model
Hausman test