Session 3 – Linear Regression Amine Ouazad, Asst. Prof. of Economics



Outline of the course Introduction: Identification Introduction: Inference Linear Regression Identification Issues in Linear Regressions Inference Issues in Linear Regressions

This session Introduction: Linear Regression What is the effect of X on Y? Hands-on problems: What is the effect of the death of the CEO (X) on firm performance (Y)? (Morten Bennedsen) What is the effect of child safety seats (X) on the probability of death (Y)? (Steve Levitt)

This session: Linear Regression Notations. Assumptions. The OLS estimator. Implementation in Stata. The OLS estimator is CAN: Consistent and Asymptotically Normal. The OLS estimator is BLUE*: Best Linear Unbiased Estimator. Essential statistics: t-stat, R squared, adjusted R squared, F stat, confidence intervals. Tricky questions. *Conditions apply

Session 3 – Linear Regression 1. Notations

Notations The effect of X on Y. What is X? K covariates (including the constant), N observations: X is an N×K matrix. What is Y? N observations: Y is an N-vector.

Notations Relationship between y and the xs: y = f(x1, x2, x3, x4, …, xK) + e. f: a function of K variables. e: the unobservables (a scalar).

Session 3 – Linear Regression 2. Assumptions

Assumptions A1: Linearity A2: Full Rank A3: Exogeneity of the covariates A4: Homoskedasticity and nonautocorrelation A5: Exogenously generated covariates. A6: Normality of the residuals

Assumption A1: Linearity y = f(x1, x2, x3, …, xK) + e becomes y = x1 b1 + x2 b2 + … + xK bK + e. In ‘plain English’: the effect of xk is constant, and the effect of xk does not depend on the value of xk′. Not satisfied if: squares/higher powers of x matter, or interaction terms matter.

Notations Data generating process. Scalar notation: yi = xi1 b1 + xi2 b2 + … + xiK bK + ei. Matrix version: y = Xb + e, where y and e are N-vectors, X is the N×K matrix of covariates, and b is the K-vector of coefficients.

Assumption A2: Full Rank We assume that X′X is invertible, i.e. that X has full column rank. Notes: A2 may be satisfied in the data generating process but not in the observed sample. Classic failures (the dummy variable trap): a constant together with a full set of month-of-the-year dummies, year dummies, country dummies, or gender dummies.
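To see the dummy-variable failure of A2 concretely, here is a minimal NumPy sketch (illustrative, not from the slides): with a constant plus a full set of twelve month dummies, the columns of X are linearly dependent, so X′X is singular; dropping one dummy as the baseline restores full rank.

```python
import numpy as np

n = 120
month = np.arange(n) % 12                    # month of each observation

# The 12 one-hot month dummies sum to the constant column
D = np.eye(12)[month]                        # N x 12 dummy matrix
X_bad  = np.column_stack([np.ones(n), D])         # 13 columns, rank deficient
X_good = np.column_stack([np.ones(n), D[:, 1:]])  # drop a baseline month

print(np.linalg.matrix_rank(X_bad))   # 12 < 13: X'X is singular, A2 fails
print(np.linalg.matrix_rank(X_good))  # 12 = 12: full column rank, A2 holds
```

Stata's regress resolves this automatically by dropping one of the offending columns; doing the rank check yourself makes the failure visible.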

Assumption A3: Exogeneity i.e. mean independence of the residual and the covariates. E(e|x1,…,xK) = 0. This is a property of the data generating process. Link with selection bias in Session 1?

Dealing with Endogeneity You’re assuming that there is no unobserved variable that is both correlated with the Xs and has an effect on Y. If a variable is correlated with X but has no effect on Y, it’s OK. If it has an effect on Y but is uncorrelated with X, it’s also OK. Example of a problem: health and hospital stays. What covariate should you add? Conclusion: be creative! Think about unobservables!

Assumption A4: Homoskedasticity and Non-Autocorrelation Var(ei|x1,…,xK) = σ2, and Corr(ei, ej|X) = 0 for i ≠ j. Visible on a scatterplot? Link with the t-tests of session 2? Examples of violations: correlated/random effects.

Assumption A5 Exogenously generated covariates Instead of requiring the mean independence of the residual and the covariates, we might require their independence. (Recall X and e independent if f(X,e)=f(X)f(e)) Sometimes we will think of X as fixed rather than exogenously generated.

Assumption A6: Normality of the Residuals The asymptotic properties of OLS (to be discussed below) do not depend on the normality of the residuals: semi-parametric approach. But for results with a fixed number of observations, we need the normality of the residuals for the OLS to have nice properties (to be defined below).

Session 3 – Linear Regression 3. The Ordinary Least Squares estimator

The OLS Estimator Formula: b = (X′X)−1X′y. Two interpretations: Minimization of the sum of squared residuals (Gauss’s interpretation). The coefficient b that makes the residuals orthogonal to the observed X, i.e. X′(y − Xb) = 0, the sample analogue of A3.

OLS estimator Exercise: Find the OLS estimator in the case where both y and x are scalars (i.e. not vectors). Learn the formula by heart (if correct!).
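A possible solution to the exercise, checked numerically (a NumPy sketch on simulated data; the true coefficients 2 and 3 are made up): in the scalar case with a constant, the slope is the sample covariance of x and y divided by the sample variance of x, and the intercept follows from the means.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(size=200)

# Scalar OLS: slope = sample covariance / sample variance, intercept from means
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The same numbers from the matrix formula (X'X)^{-1} X'y
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose([b0, b1], beta))  # True: both routes agree
```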

Implementation in Stata The regress command: regress y x1 x2 x3 x4 x5 … What does Stata do? It drops variables that are perfectly collinear (to make sure A2 is satisfied). Always check the number of observations! Options will be seen in the following sessions. Dummies (e.g. for years) can be included using the xi: prefix with i.year. Again, A2 must be satisfied.

First things first: Desc. Stats For each variable used in the analysis: mean and standard deviation for the sample and the subsamples. Other possible outputs: min, max, median (only if you care). Source of the dataset. Why? Show the reader the variables are “well behaved”: no outlier driving the regression, consistent with intuition. The number of observations should be constant across regressions (next slide).

Reading a table … from the Levitt paper (2006 wp)

Other important advice As a best practice, always start by regressing y on x with no controls except the most essential ones. No effect? Then maybe you should think twice about going further. Then add controls one by one, or group by group. Explain why the coefficient of interest changes from one column to the next (see next session).

Stata tricks Output the estimation results using estout or outreg. These commands display stars for coefficients’ significance, output the essential statistics (F, R2, t-test), stack the columns of regression output for regressions with different sets of covariates, and support LaTeX and text (Microsoft Word) formats.

Session 3 – Linear Regression 4. Large sample properties of the OLS estimator

The OLS estimator is CAN Consistent, Asymptotically Normal. Proof: Use the ‘true’ relationship between y and X to show that b = β + (X′X/N)−1(X′e/N). Use the Slutsky theorem and A3 to show consistency: plim b = β. Use the CLT and A3 to show asymptotic normality: √N (b − β) →d N(0, V), with V = σ2 [plim (X′X/N)]−1.

OLS is CAN: numerical simulation Typical design of a study: Recruit X% of a population (for instance a random sample of students at INSEAD). Collect the data. Perform the regression and get the OLS estimator. If you perform these steps independently a large number of times (thought experiment), then you will get a normal distribution of parameters.
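The thought experiment above can be run on a computer. A minimal Monte Carlo sketch in NumPy (the design, a univariate model without intercept with true β = 1.5, is made up for simplicity): repeat the sampling-plus-estimation steps many times and look at the distribution of the estimates.

```python
import numpy as np

rng = np.random.default_rng(2)
beta_true, n, reps = 1.5, 200, 2000

draws = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)                 # recruit a fresh sample
    y = beta_true * x + rng.normal(size=n) # collect the data
    draws[r] = (x @ y) / (x @ x)           # OLS in the scalar no-intercept case

# Across repeated samples the estimator is centred on the true beta,
# with variance close to sigma^2 / (n * Var(x)) = 1/200
print(draws.mean())   # ~1.5
print(draws.var())    # ~0.005
```

A histogram of draws would look approximately Gaussian, centred on the true β with variance shrinking at rate 1/N, which is exactly the CAN property.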

Important assumptions A1, A2, A3 are needed to solve the identification problem: with them, the estimator is consistent. A4 affects the variance-covariance matrix. Violations of A3? Next session (identification issues). Violations of A4? Session on inference issues.

Session 3 – Linear Regression 5. Finite sample properties of the OLS estimator

The OLS Estimator is BLUE Best … i.e. has minimum variance. Linear … i.e. is a linear function of Y. Unbiased … i.e. E[b|X] = β. Estimator … i.e. it is just a function of the observations. Proof: the Gauss–Markov theorem.

OLS is BLUE Steps of the proof: OLS is a LUE because of A1 and A3. OLS is Best: for any other linear unbiased estimator b0 = Cy, unbiasedness requires CX = Id. Write D = C − (X′X)−1X′, so that Cy = b + Dy with b the OLS estimator. Show that Var(b0|X) = Var(b|X) + σ2 DD′. The result follows because σ2 DD′ is positive semi-definite.

Finite sample distribution The OLS estimator is normally distributed for a fixed N, as long as one assumes the normality of the residuals (A6). What is “large” N? Small: e.g. Acemoglu, Johnson and Robinson. Large: e.g. Bennedsen and Perez Gonzalez. Statistical question: the rate of convergence of the asymptotic approximation.

This is small N

Other examples Large N Compustat (1,000s + observations) Execucomp Scanner data Small N Cross-country regressions (< 100 points)

Session 3 – Linear Regression 6. Statistics for reading the output of OLS estimation

Statistics R squared: what share of the variance of the outcome variable is explained by the covariates? t-test: is the coefficient on the variable of interest significant? Confidence intervals: what interval includes the true coefficient with probability 95%? F statistic: is the model better than random noise?

Reading Stata Output

R Squared Measures the share of the variance of Y (the dependent variable) explained by the model Xb, hence R2 = var(Xb)/var(Y). Note that if you regress Y on itself, the R2 is 100%. The R2 is not a good indicator of the quality of a model.

Tricky Question Should I choose the model with the highest R squared? No: adding a variable mechanically raises the R squared, and a model with endogenous variables (thus neither interpretable nor causal) can have a high R squared.

Adjusted R-Square Corrects for the number of variables in the regression: Adj R2 = 1 − [(N−1)/(N−K)] (1−R2). Proposition: when adding a variable to a regression model, the adjusted R-square increases if and only if the square of the t-statistic on that variable is greater than 1. The threshold of 1 is arbitrary (why 1?), but the adjusted R-square is still informative.
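The proposition can be verified numerically. A NumPy sketch on simulated data (the design is made up; x2 is given a near-zero true coefficient so its t-statistic is small): fit the model with and without the extra regressor and compare.

```python
import numpy as np

def fit(X, y):
    """OLS with standard errors, R2 and adjusted R2 (homoskedastic formulas)."""
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)                         # sigma^2 hat
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj = 1 - (n - 1) / (n - k) * (1 - r2)
    return beta, se, adj

rng = np.random.default_rng(3)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 0.5 * x1 + 0.02 * x2 + rng.normal(size=n)        # x2 nearly irrelevant

X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([X1, x2])
_, _, adj1 = fit(X1, y)
beta, se, adj2 = fit(X2, y)
t2 = (beta[2] / se[2]) ** 2

print((adj2 > adj1) == (t2 > 1))  # True: the two conditions coincide
```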

t-test and p-value p-value: significance level for the coefficient. Significance at the 5% level: p-value lower than 0.05; the corresponding critical value for t is 1.96 (when N is large, t is approximately normal). More generally, significance at level α: p-value lower than α. Important significance levels: 10%, 5%, 1%, depending on the size of the dataset. The statistic is t = bk / √(σ2 Skk), where Skk is the k-th diagonal element of (X′X)−1; it follows a Student t(N−K) distribution. The t-test is valid asymptotically under A1–A4, and at finite distance with A6. Small sample t-tests: see Wooldridge’s NBER lectures, “Recent advances in Econometrics.”

F Statistic Is the model as a whole significant? Hypothesis H0: all coefficients are equal to zero, except the constant. Alternative hypothesis: at least one coefficient is nonzero. Under the null hypothesis, F = [R2/(K−1)] / [(1−R2)/(N−K)] follows an F(K−1, N−K) distribution.
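Computing the statistic by hand on simulated data (a NumPy sketch; the coefficients and sample size are made up) shows how F follows from R2, K and N:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 120, 3                                   # constant + 2 slopes
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.8, -0.5]) + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# F statistic for H0: both slope coefficients are zero
F = (r2 / (k - 1)) / ((1 - r2) / (n - k))
print(F)  # far above the 5% critical value of F(2, 117), roughly 3.07
```

This is the same number Stata reports at the top of the regress output.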

Session 3 – Linear Regression 7. Tricky Questions

Tricky Questions Can I drop a non significant variable? What if two variables are very strongly correlated (but not perfectly correlated)? How do I deal (simply) with missing/miscoded data? How do I identify influential observations?

Tricky Questions Can I drop a non significant variable? A variable may be non significant yet strongly correlated with the other covariates. Dropping it may unduly increase the apparent significance of the coefficient of interest (recently seen in an OECD working paper). Conclusion: controls stay.

Tricky Questions What if two variables are very strongly correlated (but not perfectly)? One coefficient tends to be very significant and positive… while the coefficient of the other variable is very significant and negative! Beware of multicollinearity.
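A NumPy sketch of the phenomenon (the near-duplicate regressor and the coefficients are made up): the individual coefficients become extremely noisy, while their sum remains well estimated.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)       # almost a copy of x1
y = 1 + x1 + x2 + rng.normal(size=n)      # each true coefficient is 1

X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
s2 = resid @ resid / (n - 3)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

# Individual coefficients are wildly unstable, but their sum is pinned down
print(se[1], se[2])          # huge standard errors on x1 and x2
print(beta[1] + beta[2])     # ~2, close to the sum of the true coefficients
```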

Tricky Questions How do I deal (simply) with missing data? For missing covariates, create dummies flagging the missing values instead of dropping the observations from the regression. If it is the dependent variable, focus on the subset of non-missing dependents, and argue in the paper that it is missing at random (if possible). For more advanced material, see the session on the Heckman selection model.
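A minimal sketch of the missing-covariate dummy trick in NumPy (the 20% missingness rate and the coefficients are made up): recode the missing values to a constant and add an indicator, so no observation is dropped.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)

x_obs = x.copy()
x_obs[rng.random(n) < 0.2] = np.nan        # ~20% of the covariate missing

miss = np.isnan(x_obs).astype(float)       # missing-value dummy
x_fill = np.where(miss == 1, 0.0, x_obs)   # recode missing to a constant

# Regressors: constant, recoded covariate, missing dummy -> no row is dropped
X = np.column_stack([np.ones(n), x_fill, miss])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(X.shape[0], beta[1])   # all 200 observations used; slope ~2
```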

How do I identify influential points? Re-run the regression on the dataset excluding the point in question and compare the coefficients. Influential observations can also be spotted on a scatterplot of the dependent variable against the prediction Xb.
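The leave-one-out check can be sketched in a few lines of NumPy (the planted outlier is made up): drop each observation in turn and record how much the slope moves.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
x = rng.normal(size=n)
y = 1 + x + rng.normal(size=n)
x[0], y[0] = 10.0, -20.0                  # plant one influential outlier

X = np.column_stack([np.ones(n), x])
full = np.linalg.solve(X.T @ X, X.T @ y)

# Re-fit dropping each observation; large coefficient shifts flag influence
shifts = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    b = np.linalg.solve(X[keep].T @ X[keep], X[keep].T @ y[keep])
    shifts[i] = abs(b[1] - full[1])

print(np.argmax(shifts))  # 0: dropping the planted outlier moves the slope most
```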

Tricky Questions Can I drop the constant in the model? No. Can I include an interaction term (or a square) without the simple terms?

Session 3 – Linear Regression Next sessions … looking forward

Next session What if some of my covariates are measured with error? Income, degrees, performance, network. What if some variable is not included (because you forgot it or don’t have it) and still has an impact on y? “Omitted variable bias”.

Important points from this session REMEMBER A1 to A6 by heart. Which assumptions are crucial for the asymptotics? Which assumptions are crucial for the finite sample validity of the OLS estimator? START REGRESSING IN STATA TODAY! regress and outreg2