1 Javier Aparicio, División de Estudios Políticos, CIDE. Spring 2009.
Multiple Linear Regression: yᵢ = β₀ + β₁x₁ᵢ + β₂x₂ᵢ + … + βₖxₖᵢ + uᵢ
Ch 6. Model Specification

2 Redefining Variables
Changing the scale of the y variable will lead to a corresponding change in the scale of the coefficients and standard errors, so there is no change in significance or interpretation.
Changing the scale of one x variable will lead to a change in the scale of that coefficient and standard error, so again there is no change in significance or interpretation.
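As a quick check, here is a minimal sketch (simulated data and statsmodels; the scale factor of 1000 is an arbitrary illustration) verifying that rescaling y changes coefficients and standard errors by the same factor while leaving t statistics unchanged:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)
X = sm.add_constant(x)

fit_y = sm.OLS(y, X).fit()           # y in original units
fit_ky = sm.OLS(1000 * y, X).fit()   # y rescaled (e.g. thousands -> units)

print(fit_y.tvalues, fit_ky.tvalues)   # t statistics are identical
print(fit_ky.params / fit_y.params)    # every coefficient scaled by 1000
```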

3 Beta Coefficients
Occasionally you'll see reference to a "standardized coefficient" or "beta coefficient," which has a specific meaning.
The idea is to replace y and each x variable with a standardized version, i.e. subtract the mean and divide by the standard deviation.
The coefficient then measures the standard deviation change in y for a one standard deviation change in x.
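A minimal sketch of computing beta coefficients by hand, standardizing with numpy before running OLS (the data are simulated for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x1 = rng.normal(5, 2, size=300)
x2 = rng.normal(0, 10, size=300)
y = 3 + 0.5 * x1 - 0.1 * x2 + rng.normal(size=300)

def standardize(v):
    return (v - v.mean()) / v.std()

Z = sm.add_constant(np.column_stack([standardize(x1), standardize(x2)]))
beta_fit = sm.OLS(standardize(y), Z).fit()
print(beta_fit.params[1:])  # sd change in y per one-sd change in each x
```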

4 Functional Form
OLS can be used for relationships that are not strictly linear in x and y by using nonlinear functions of x and y; the model will still be linear in the parameters.
We can take the natural log of x, y, or both.
We can use quadratic forms of x.
We can use interactions of x variables.

5 Interpretation of Log Models
If the model is ln(y) = β₀ + β₁ln(x) + u, then β₁ is the elasticity of y with respect to x.
If the model is ln(y) = β₀ + β₁x + u, then 100·β₁ is approximately the % change in y given a 1-unit change in x.
If the model is y = β₀ + β₁ln(x) + u, then β₁/100 is approximately the change in y for a 1% change in x.
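A small simulated example of the log-log case: fitting ln(y) on ln(x) recovers the elasticity directly as the slope (the true elasticity of 1.3 here is an arbitrary choice):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=500)
y = np.exp(0.5 + 1.3 * np.log(x) + rng.normal(scale=0.1, size=500))

fit = sm.OLS(np.log(y), sm.add_constant(np.log(x))).fit()
print(fit.params[1])  # ~1.3: a 1% increase in x raises y by about 1.3%
```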

6 Why use log models?
Log models are invariant to the scale of the variables, since they measure percent changes.
They give a direct estimate of elasticity.
For models with y > 0, the conditional distribution is often heteroskedastic or skewed, while that of ln(y) is much less so.
The distribution of ln(y) is also narrower, limiting the effect of outliers.

7 Some Rules of Thumb
What types of variables are often used in log form?
- Dollar amounts that must be positive
- Very large variables, such as population
What types of variables are often used in level form?
- Variables measured in years
- Variables that are a proportion or percent

8 Quadratic Models
For a model of the form y = β₀ + β₁x + β₂x² + u, we can't interpret β₁ alone as measuring the change in y with respect to x; we need to take β₂ into account as well, since
∂y/∂x = β₁ + 2β₂x
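A sketch (simulated data; statsmodels) of estimating a quadratic model, evaluating the implied marginal effect at the sample mean of x, and computing the turning point x* = −β₁/(2β₂):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=400)
y = 1 + 2.0 * x - 0.15 * x**2 + rng.normal(size=400)

X = sm.add_constant(np.column_stack([x, x**2]))
fit = sm.OLS(y, X).fit()
b0, b1, b2 = fit.params

print("marginal effect at mean x:", b1 + 2 * b2 * x.mean())
print("turning point x* = -b1/(2*b2):", -b1 / (2 * b2))  # ~6.7 here
```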

9 More on Quadratic Models
Suppose that the coefficient on x is positive and the coefficient on x² is negative.
Then y is increasing in x at first, but will eventually turn around and be decreasing in x; the turning point is at x* = −β₁/(2β₂).

10 More on Quadratic Models
Suppose that the coefficient on x is negative and the coefficient on x² is positive.
Then y is decreasing in x at first, but will eventually turn around and be increasing in x.

11 Interaction Terms
For a model of the form y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + u, we can't interpret β₁ alone as measuring the change in y with respect to x₁; we need to take β₃ into account as well, since
∂y/∂x₁ = β₁ + β₃x₂
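A sketch of evaluating the marginal effect of x₁ at the sample mean of x₂ (simulated data; the true coefficients are arbitrary):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x1 = rng.normal(size=400)
x2 = rng.normal(size=400)
y = 1 + 0.8 * x1 + 0.3 * x2 + 0.5 * x1 * x2 + rng.normal(size=400)

X = sm.add_constant(np.column_stack([x1, x2, x1 * x2]))
fit = sm.OLS(y, X).fit()
b0, b1, b2, b3 = fit.params

# the effect of x1 depends on the level of x2; evaluate at the mean of x2
print("dE[y]/dx1 at mean x2:", b1 + b3 * x2.mean())
```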

12 Adjusted R-Squared
Recall that R² will always increase (or at least never decrease) as more variables are added to the model.
The adjusted R² takes into account the number of variables in the model, and may decrease.

13 Adjusted R-Squared (cont)
It's easy to compute by hand: the adjusted R² is just 1 − (1 − R²)(n − 1)/(n − k − 1), but most packages will give you both R² and adj-R².
You can compare the fit of 2 models (with the same y) by comparing the adj-R².
You cannot use the adj-R² to compare models with different y's (e.g. y vs. ln(y)).
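A quick sketch verifying the formula against statsmodels' built-in value (simulated data with n = 100 and k = 3 regressors):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, k = 100, 3
X = sm.add_constant(rng.normal(size=(n, k)))
y = X @ np.array([1.0, 0.5, -0.2, 0.0]) + rng.normal(size=n)

fit = sm.OLS(y, X).fit()
adj_by_hand = 1 - (1 - fit.rsquared) * (n - 1) / (n - k - 1)
print(adj_by_hand, fit.rsquared_adj)  # the two should match
```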

14 Goodness of Fit
It is important not to fixate too much on adj-R² and lose sight of theory and common sense.
If theory clearly predicts a variable belongs, generally leave it in.
Don't include a variable that prohibits a sensible interpretation of the variable of interest; remember the ceteris paribus interpretation of multiple regression.

15 Standard Errors for Predictions
Suppose we want to use our estimates to obtain a specific prediction.
First, suppose that we want an estimate of θ₀ = E(y|x₁ = c₁, …, xₖ = cₖ) = β₀ + β₁c₁ + … + βₖcₖ.
This is easy to obtain by substituting the x's in our estimated model with the c's, but what about a standard error?
It is really just a test of a linear combination of the β's.

16 Predictions (cont)
We can rewrite this as β₀ = θ₀ − β₁c₁ − … − βₖcₖ.
Substituting in, we obtain y = θ₀ + β₁(x₁ − c₁) + … + βₖ(xₖ − cₖ) + u.
So, if you regress yᵢ on (xᵢⱼ − cⱼ), the intercept will give the predicted value and its standard error.
Note that the standard error will be smallest when the c's equal the means of the x's.
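A sketch of this recentering trick (simulated data; statsmodels): shift each regressor by its c value and read the prediction and its standard error off the intercept.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = 1 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=200)

c1, c2 = 0.5, -1.0  # the point at which we want E(y | x1=c1, x2=c2)
X = sm.add_constant(np.column_stack([x1 - c1, x2 - c2]))
fit = sm.OLS(y, X).fit()

print("prediction (intercept = theta_0):", fit.params[0])
print("standard error of the prediction:", fit.bse[0])
```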

17 Predictions (cont)
This standard error for the expected value is not the same as a standard error for a new outcome on y.
We also need to take into account the variance of the unobserved error. Let the prediction error for a new observation (x⁰, y⁰) be
ê⁰ = y⁰ − ŷ⁰ = β₀ + β₁x₁⁰ + … + βₖxₖ⁰ + u⁰ − ŷ⁰,
so that Var(ê⁰) = Var(ŷ⁰) + σ².

18 Prediction interval
Usually the estimate of σ² is much larger than the variance of the prediction, thus
se(ê⁰) = [se(ŷ⁰)² + σ̂²]^(1/2),
and a 95% prediction interval is ŷ⁰ ± t.025 · se(ê⁰).
This prediction interval will be a lot wider than the simple confidence interval for the expected value of the prediction.
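Continuing the recentering sketch above (it reuses `fit` and the imports from that block), a 95% prediction interval for a new y at (c₁, c₂) adds σ̂² under the square root:

```python
from scipy import stats

se_pred = np.sqrt(fit.bse[0]**2 + fit.mse_resid)  # mse_resid is sigma-hat squared
t_crit = stats.t.ppf(0.975, df=fit.df_resid)
y_hat = fit.params[0]
print("95% prediction interval:",
      (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred))
```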

19 Residual Analysis
Information can be obtained from looking at the residuals (i.e. observed minus predicted values).
Example: regress the price of cars on characteristics; big negative residuals indicate a good deal.
Example: regress average earnings of students from a school on student characteristics; big positive residuals indicate the greatest value-added.
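A minimal sketch of the car-price example (the dataset, column names, and coefficients here are all hypothetical, simulated for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
cars = pd.DataFrame({"mpg": rng.uniform(15, 40, 50),
                     "weight": rng.uniform(2000, 4500, 50)})
cars["price"] = 30000 - 300 * cars["mpg"] + 2 * cars["weight"] \
                + rng.normal(0, 1500, 50)

fit = smf.ols("price ~ mpg + weight", data=cars).fit()
cars["residual"] = fit.resid
print(cars.nsmallest(5, "residual"))  # most negative residuals: priced below prediction
```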

20 Predicting y in a log model
Simple exponentiation of the predicted ln(y) will underestimate the expected value of y.
Instead we need to scale it up by an estimate of the expected value of exp(u); if u is normal with variance σ², then E(exp(u)) = exp(σ²/2), so ŷ = exp(σ̂²/2)·exp(predicted ln(y)).

21 Predicting y in a log model
If u is not normal, E(exp(u)) must be estimated using an auxiliary regression.
Create the exponentiation of the predicted ln(y), and regress y on it with no intercept.
The coefficient on this variable is the estimate of E(exp(u)) that can be used to scale up the exponentiation of the predicted ln(y) to obtain the predicted y.
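A sketch of this auxiliary-regression procedure on simulated data (a non-normal, mean-zero error is used so the correction matters):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.uniform(1, 5, size=500)
u = rng.exponential(0.5, size=500) - 0.5         # non-normal, mean-zero error
y = np.exp(1.0 + 0.7 * x + u)

X = sm.add_constant(x)
log_fit = sm.OLS(np.log(y), X).fit()
m = np.exp(log_fit.fittedvalues)                 # exponentiated predicted ln(y)

alpha = sm.OLS(y, m).fit().params[0]             # regression through the origin
y_hat = alpha * m                                # scaled-up predictions of y
print("estimate of E(exp(u)):", alpha)
```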

22 Comparing log and level models
A by-product of the previous procedure is a method to compare a model in logs with one in levels.
Take the fitted values from the auxiliary regression and find the sample correlation between these and y.
Compare the R² from the levels regression with this correlation squared.
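Continuing the sketch above (reusing `y_hat`, `y`, and `X`), the comparison might look like this; the competing levels model here is simply y on x, an assumption for illustration:

```python
corr = np.corrcoef(y_hat, y)[0, 1]
level_fit = sm.OLS(y, X).fit()                   # the competing levels model
print("log model fit (correlation squared):", corr**2)
print("levels model R-squared:", level_fit.rsquared)
```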