Multiple Regression Analysis with Qualitative Information

Slides:



Advertisements
Similar presentations
The Multiple Regression Model.
Advertisements

Lecture 28 Categorical variables: –Review of slides from lecture 27 (reprint of lecture 27 categorical variables slides with typos corrected) –Practice.
6-1 Introduction To Empirical Models 6-1 Introduction To Empirical Models.
Multiple Regression Fenster Today we start on the last part of the course: multivariate analysis. Up to now we have been concerned with testing the significance.
Econ 140 Lecture 151 Multiple Regression Applications Lecture 15.
Chapter 13 Additional Topics in Regression Analysis
Chapter 10 Simple Regression.
Economics 20 - Prof. Anderson
7 Dummy Variables Thus far, we have only considered variables with a QUANTITATIVE MEANING -ie: dollars, population, utility, etc. In this chapter we will.
Econ 140 Lecture 171 Multiple Regression Applications II &III Lecture 17.
Treatment Effects: What works for Whom? Spyros Konstantopoulos Michigan State University.
So far, we have considered regression models with dummy variables of independent variables. In this lecture, we will study regression models whose dependent.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics 11 th Edition.
Chapter 11 Multiple Regression.
1Prof. Dr. Rainer Stachuletz Multiple Regression Analysis y =  0 +  1 x 1 +  2 x  k x k + u 5. Dummy Variables.
DUMMY VARIABLES BY HARUNA ISSAHAKU Haruna Issahaku.
Copyright © 2014, 2011 Pearson Education, Inc. 1 Chapter 25 Categorical Explanatory Variables.
Multiple Regression. In the previous section, we examined simple regression, which has just one independent variable on the right side of the equation.
1 Research Method Lecture 6 (Ch7) Multiple regression with qualitative variables ©
1 Dummy Variables. 2 Topics for This Chapter 1. Intercept Dummy Variables 2. Slope Dummy Variables 3. Different Intercepts & Slopes 4. Testing Qualitative.
1 1 Slide Multiple Regression n Multiple Regression Model n Least Squares Method n Multiple Coefficient of Determination n Model Assumptions n Testing.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 15 Multiple Regression n Multiple Regression Model n Least Squares Method n Multiple.
Statistics and Quantitative Analysis U4320 Segment 12: Extension of Multiple Regression Analysis Prof. Sharyn O’Halloran.
1 1 Slide © 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
Statistics and Econometrics for Business II Fall 2014 Instructor: Maksym Obrizan Lecture notes III # 2. Advanced topics in OLS regression # 3. Working.
Chap 14-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 14 Additional Topics in Regression Analysis Statistics for Business.
Chapter 4 Linear Regression 1. Introduction Managerial decisions are often based on the relationship between two or more variables. For example, after.
Multiple Regression Petter Mostad Review: Simple linear regression We define a model where are independent (normally distributed) with equal.
Chapter 13 Multiple Regression
7.4 DV’s and Groups Often it is desirous to know if two different groups follow the same or different regression functions -One way to test this is to.
Overview of Regression Analysis. Conditional Mean We all know what a mean or average is. E.g. The mean annual earnings for year old working males.
11 Chapter 5 The Research Process – Hypothesis Development – (Stage 4 in Research Process) © 2009 John Wiley & Sons Ltd.
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.. Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics 10 th Edition.
Introduction to Multiple Regression Lecture 11. The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & 2 or more.
1/25 Introduction to Econometrics. 2/25 Econometrics Econometrics – „economic measurement“ „May be defined as the quantitative analysis of actual economic.
Lecturer: Ing. Martina Hanová, PhD. Business Modeling.
PO 141: INTRODUCTION TO PUBLIC POLICY Summer I (2015) Claire Leavitt Boston University.
Multiple Regression Analysis: Inference
Lecture 6 Feb. 2, 2015 ANNOUNCEMENT: Lab session will go from 4:20-5:20 based on the poll. (The majority indicated that it would not be a problem to chance,
Heteroscedasticity Chapter 8
ECONOMETRICS EC331 Prof. Burak Saltoglu
Chapter 14 Introduction to Multiple Regression
Ch. 2: The Simple Regression Model
Multiple Regression Analysis: Further Issues
Pooling Cross Sections across Time: Simple Panel Data Methods
Multiple Regression Analysis: OLS Asymptotics
Multiple Regression Analysis and Model Building
APPROACHES TO QUANTITATIVE DATA ANALYSIS
More on Specification and Data Issues
Multiple Regression Analysis with Qualitative Information
John Loucks St. Edward’s University . SLIDES . BY.
More on Specification and Data Issues
Ch. 2: The Simple Regression Model
Multiple Regression Analysis with Qualitative Information
LEARNING OUTCOMES After studying this chapter, you should be able to
Ch. 13. Pooled Cross Sections Across Time: Simple Panel Data.
Heteroskedasticity.
Chapter 7: The Normality Assumption and Inference with OLS
Seminar in Economics Econ. 470
Chapter 8: DUMMY VARIABLE (D.V.) REGRESSION MODELS
The Simple Regression Model
SIMPLE LINEAR REGRESSION
The Simple Regression Model
Multiple Regression Analysis with Qualitative Information
More on Specification and Data Issues
Chapter 9 Dummy Variables Undergraduated Econometrics Page 1
Financial Econometrics Fin. 505
Ch. 13. Pooled Cross Sections Across Time: Simple Panel Data.
Introduction to Regression
Presentation transcript:

Multiple Regression Analysis with Qualitative Information Chapter 7 Wooldridge: Introductory Econometrics: A Modern Approach, 5e

Multiple Regression Analysis: Qualitative Information Examples: gender, race, industry, region, rating grade, … A way to incorporate qualitative information is to use dummy variables They may appear as the dependent or as independent variables A single dummy independent variable = the wage gain/loss if the person is a woman rather than a man (holding other things fixed) Dummy variable: =1 if the person is a woman =0 if the person is man

Multiple Regression Analysis: Qualitative Information Graphical Illustration Alternative interpretation of coefficient: i.e. the difference in mean wage between men and women with the same level of education. Intercept shift

Multiple Regression Analysis: Qualitative Information Dummy variable trap This model cannot be estimated (perfect collinearity) When using dummy variables, one category always has to be omitted: The base category are men The base category are women Alternatively, one could omit the intercept: Disadvantages: 1) More difficult to test for diffe-rences between the parameters 2) R-squared formula only valid if regression contains intercept

Multiple Regression Analysis: Qualitative Information Estimated wage equation with intercept shift Does that mean that women are discriminated against? Not necessarily. Being female may be correlated with other produc-tivity characteristics that have not been controlled for. Holding education, experience, and tenure fixed, women earn 1.81$ less per hour than men

Multiple Regression Analysis: Qualitative Information Comparing means of subpopulations described by dummies Discussion It can easily be tested whether difference in means is significant The wage difference between men and women is larger if no other things are controlled for; i.e. part of the difference is due to differ-ences in education, experience and tenure between men and women Not holding other factors constant, women earn 2.51$ per hour less than men, i.e. the difference between the mean wage of men and that of women is 2.51$.

Multiple Regression Analysis: Qualitative Information Further example: Effects of training grants on hours of training This is an example of program evaluation Treatment group (= grant receivers) vs. control group (= no grant) Is the effect of treatment on the outcome of interest causal? We cannot enter hrsemp in logarithmic form because hrsemp is zero for 29 of the 105 firms used in the regression. Hours training per employee Dummy indicating whether firm received training grant The coefficient on log(employ) means that, if a firm is 10% larger, it trains its workers about .61 hour less It might be that the firms receiving grants would have, on average, trained their workers more even in the absence of a grant. Nothing in this analysis tells us whether we have estimated a causal effect;

Multiple Regression Analysis: Qualitative Information Using dummy explanatory variables in equations for log(y) Dummy indicating whether house is of colonial style As the dummy for colonial style changes from 0 to 1, the house price increases by 5.4 percentage points

Multiple Regression Analysis: Qualitative Information Using dummy variables for multiple categories 1) Define membership in each category by a dummy variable 2) Leave out one category (which becomes the base category) Holding other things fixed, married women earn 19.8% less than single men (= the base category)

Multiple Regression Analysis: Qualitative Information Incorporating ordinal information using dummy variables Example: City credit ratings and municipal bond interest rates Municipal bond rate Credit rating from 0-4 (0=worst, 4=best) This specification would probably not be appropriate as the credit rating only contains ordinal information. A better way to incorporate this information is to define dummies: Dummies indicating whether the particular rating applies, e.g. CR1=1 if CR=1 and CR1=0 otherwise. All effects are measured in comparison to the worst rating (= base category).

Multiple Regression Analysis: Qualitative Information Interactions involving dummy variables Allowing for different slopes Interesting hypotheses Interaction term = intercept men = slope men = intercept women = slope women The return to education is the same for men and women The whole wage equation is the same for men and women

Multiple Regression Analysis: Qualitative Information Graphical illustration Interacting both the intercept and the slope with the female dummy enables one to model completely independent wage equations for men and women

Multiple Regression Analysis: Qualitative Information Estimated wage equation with interaction term + To do this, we would replace female *educ with female*(educ- 12.5) and rerun the regression; this only changes the coefficient on female and its standard error. Does this mean that there is no significant evidence of lower pay for women at the same levels of educ, exper, and tenure? No: this is only the effect for educ = 0. To answer the question one has to recenter the interaction term, e.g. around educ = 12.5 (= average education). No evidence against hypothesis that the return to education is the same for men and women

Multiple Regression Analysis: Qualitative Information Testing for differences in regression functions across groups Unrestricted model (contains full set of interactions) Restricted model (same regression for both groups) College grade point average Standardized aptitude test score High school rank percentile Total hours spent in college courses

Multiple Regression Analysis: Qualitative Information Null hypothesis Estimation of the unrestricted model All interaction effects are zero, i.e. the same regression coefficients apply to men and women Tested individually, the hypothesis that the interaction effects are zero cannot be rejected

Multiple Regression Analysis: Qualitative Information Joint test with F-statistic Alternative way to compute F-statistic in the given case Run separate regressions for men and for women; the unrestricted SSR is given by the sum of the SSR of these two regressions Run regression for the restricted model and store SSR If the test is computed in this way it is called the Chow-Test Important: Test assumes a constant error variance accross groups Null hypothesis is rejected

Multiple Regression Analysis: Qualitative Information A Binary dependent variable: the linear probability model Linear regression when the dependent variable is binary If the dependent variable only takes on the values 1 and 0 Linear probability model (LPM) In the linear probability model, the coefficients describe the effect of the explanatory variables on the probability that y=1

Multiple Regression Analysis: Qualitative Information Example: Labor force participation of married women =1 if in labor force, =0 otherwise Non-wife income (in thousand dollars per year) If the number of kids under six years increases by one, the pro- probability that the woman works falls by 26.2% Does not look significant (but see below)

Multiple Regression Analysis: Qualitative Information Example: Female labor participation of married women (cont.) Graph for nwifeinc=50, exper=5, age=30, kindslt6=1, kidsge6=0 The maximum level of education in the sample is educ=17. For the gi-ven case, this leads to a predicted probability to be in the labor force of about 50%. Negative predicted probability but no problem because no woman in the sample has educ < 5.

Multiple Regression Analysis: Qualitative Information Disadvantages of the linear probability model Predicted probabilities may be larger than one or smaller than zero Marginal probability effects sometimes logically impossible The linear probability model is necessarily heteroskedastic Heterosceasticity consistent standard errors need to be computed Advantanges of the linear probability model Easy estimation and interpretation Estimated effects and predictions often reasonably good in practice Variance of Ber-noulli variable

Multiple Regression Analysis: Qualitative Information More on policy analysis and program evaluation Example: Effect of job training grants on worker productivity Percentage of defective items =1 if firm received training grant, =0 otherwise No apparent effect of grant on productivity Treatment group: grant reveivers, Control group: firms that received no grant Grants were given on a first-come, first-served basis. This is not the same as giving them out randomly. It might be the case that firms with less productive workers saw an opportunity to improve productivity and applied first.

Multiple Regression Analysis: Qualitative Information Self-selection into treatment as a source for endogeneity In the given and in related examples, the treatment status is probably related to other characteristics that also influence the outcome The reason is that subjects self-select themselves into treatment depending on their individual characteristics and prospects Experimental evaluation In experiments, assignment to treatment is random In this case, causal effects can be inferred using a simple regression The dummy indicating whether or not there was treatment is unrelated to other factors affecting the outcome.

Multiple Regression Analysis: Qualitative Information Further example of an endogenuous dummy regressor Are nonwhite customers discriminated against? It is important to control for other characteristics that may be important for loan approval (e.g. profession, unemployment) Omitting important characteristics that are correlated with the non-white dummy will produce spurious evidence for discriminiation Dummy indicating whether loan was approved Race dummy Credit rating