Econometrics I Professor William Greene Stern School of Business


Econometrics I Professor William Greene Stern School of Business Department of Economics

Econometrics I Part 10 – Interval Estimation and Prediction

Interval Estimation
b = point estimator of β. We acknowledge the sampling variability: b = β + sampling variability induced by ε, with estimated sampling variance s²(X′X)⁻¹.

A point estimate is only the best single guess. Form an interval, or range of plausible values. Plausible ⇒ likely values with an acceptable degree of probability. To assign probabilities, we require a distribution for the variation of the estimator; hence the role of the normality assumption for ε.

Robust Inference: Confidence Interval for βk
bk = the point estimate; Std.Err[bk] = sqrt{estimated asymptotic variance of bk} = vk. The variance matrix may be any robust or otherwise appropriate covariance matrix.
bk ~ Asy.N[βk, vk²], so (bk − βk)/vk ~ Asy.N[0,1].
Consider a range of plausible values of βk given the point estimate bk: bk ± sampling error, measured in standard error units as |(bk − βk)/vk| < z*. A larger z* ⇒ greater probability ("confidence"). Given normality, e.g., z* = 1.96 ⇒ 95% and z* = 1.645 ⇒ 90%. The plausible range for βk is then bk ± z*·vk.

Critical Values for the Confidence Interval
Assuming normality of ε: bk ~ N[βk, vk²] for the true βk, with vk² = [s²(X′X)⁻¹]kk, so (bk − βk)/vk ~ N[0,1]. With vk estimated, (bk − βk)/Est.(vk) ~ t[n−K]: use critical values from the t[n−K] distribution instead of the standard normal. The two are essentially the same once n > 100. For inference based on asymptotic results or any robust covariance matrix estimator, use N[0,1].

Confidence Interval
Critical t[.975, 29] = 2.045
Confidence interval based on t: 1.27365 ± 2.045 × .1501
Confidence interval based on normal: 1.27365 ± 1.960 × .1501
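
A quick numerical check of the two intervals above (a minimal Python sketch; the point estimate, standard error, and degrees of freedom are the values from the slide, and scipy is assumed to be available):

from scipy import stats

b, se, df = 1.27365, 0.1501, 29

t_star = stats.t.ppf(0.975, df)    # 2.045, critical value from t[n-K]
z_star = stats.norm.ppf(0.975)     # 1.960, critical value from N[0,1]

print(b - t_star * se, b + t_star * se)   # t-based 95% interval
print(b - z_star * se, b + z_star * se)   # normal-based 95% interval

With only 29 degrees of freedom the t interval is visibly wider; with n > 100 the two would be nearly identical.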

Specification and Functional Form: Interaction Effects

Interaction Effect
----------------------------------------------------------------------
Ordinary least squares regression
LHS = LOGY   Mean                = -1.15746
             Standard deviation  =   .49149
             Number of observs.  = 27322
Model size   Parameters          = 4
             Degrees of freedom  = 27318
Residuals    Sum of squares      = 6540.45988
             Standard error of e =   .48931
Fit          R-squared           =   .00896
             Adjusted R-squared  =   .00885
Model test   F[3, 27318] (prob)  = 82.4 (.0000)
--------+-------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+-------------------------------------------------------------
Constant|  -1.22592***     .01605      -76.376    .0000
     AGE|    .00227***     .00036        6.240    .0000    43.5272
  FEMALE|    .21239***     .02363        8.987    .0000     .47881
 AGE_FEM|   -.00620***     .00052      -11.819    .0000    21.2960
----------------------------------------------------------------------
Do women earn more than men (in this sample)? The +.21239 coefficient on FEMALE would suggest so. But the female "difference" is +.21239 − .00620 × Age. At the average Age, the effect is .21239 − .00620(43.5272) = −.05748.
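
The sign reversal is easy to verify. A minimal sketch using the reported coefficients (taken from the output above, not re-estimated):

b_female, b_age_fem, mean_age = 0.21239, -0.00620, 43.5272

def female_effect(age):
    # Partial effect of FEMALE on LOGY at a given Age: b_F + b_AF * Age
    return b_female + b_age_fem * age

print(female_effect(mean_age))   # -0.05748: negative at the average Age
print(female_effect(25.0))       # +0.05739: positive at younger Ages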

Bootstrap Confidence Interval For a Coefficient

Bootstrap CI for Least Squares

Bootstrap CI for Least Absolute Deviations

Bootstrapped Confidence Intervals
Estimate Norm(β) = (β1² + β2² + β3² + β4²)^(1/2)
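
A minimal sketch of how such a bootstrap interval can be computed: resample rows with replacement, re-estimate by least squares, evaluate Norm(b) each time, and take percentiles. X and y stand in for the estimation sample; the number of replications B and the percentile method are choices of this sketch, not part of the slide.

import numpy as np

rng = np.random.default_rng(0)

def bootstrap_norm_ci(X, y, B=999, alpha=0.05):
    n = len(y)
    draws = np.empty(B)
    for r in range(B):
        idx = rng.integers(0, n, n)                    # paired bootstrap resample
        b_r = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        draws[r] = np.linalg.norm(b_r)                 # Norm(b) = sqrt(sum of b_k^2)
    return np.quantile(draws, [alpha / 2, 1 - alpha / 2])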

Coefficient on MALE dummy variable in quantile regressions

Forecasting
Objective: forecasting.
Distinction: ex post vs. ex ante forecasting. Ex post: the RHS data are observed. Ex ante: the RHS data must themselves be forecasted.
Prediction vs. model validation: within-sample prediction vs. prediction in a "hold out sample."

Prediction Intervals
Given x0, predict y0. Two cases: estimate E[y|x0] = β′x0, or predict the outcome y0 = β′x0 + ε0. The obvious predictor is b′x0 + an estimate of ε0; we forecast ε0 as 0 but allow for its variance.
When we predict y0 with b′x0, what is the forecast error? Est.y0 − y0 = b′x0 − β′x0 − ε0, so the variance of the forecast error is x0′Var[b − β]x0 + σ². How do we estimate this to form a confidence interval? Two cases: if x0 is a vector of constants, the variance of the estimation part is just x0′Var[b]x0, and the confidence interval is formed as usual. If x0 itself had to be estimated, then we are using a random variable, and the variance of the product is hard to obtain (ouch!). One possibility: use bootstrapping.

Forecast Variance
The variance of the forecast error is σ² + x0′Var[b]x0 = σ² + σ²[x0′(X′X)⁻¹x0]. If the model contains a constant term, this can be written in terms of squares and cross products of deviations from means: σ²[1 + 1/n + (x0 − x̄)′[Σi (xi − x̄)(xi − x̄)′]⁻¹(x0 − x̄)], where the xs exclude the constant. Interpretation: the forecast variance is smallest in the middle of our "experience" and increases as we move outside it.
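
A minimal sketch of the computation, assuming X and y are the estimation sample and x0 is a fixed vector of regressor values (including the constant):

import numpy as np

def prediction_interval(X, y, x0, z=1.96):
    n, K = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    e = y - X @ b
    s2 = (e @ e) / (n - K)                   # estimate of sigma^2
    v = s2 * (1.0 + x0 @ XtX_inv @ x0)       # s^2 + s^2 [x0'(X'X)^-1 x0]
    half = z * np.sqrt(v)
    return x0 @ b - half, x0 @ b + half

The quadratic form x0′(X′X)⁻¹x0 is what grows as x0 moves away from the sample means, producing the widening interval described above.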

Butterfly Effect

Internet Buzz Data

A Prediction Interval
The usual 95% interval around the prediction has two variance components: one due to ε and one due to estimating α and β with a and b.

Slightly Simpler Formula for Prediction

Prediction from Internet Buzz Regression

Prediction Interval for Buzz = .8

Semi- and Nonparametric Estimation

Application: Stochastic Frontier Model
Production function regression: logY = β′x + v − u, where u > 0 is "inefficiency" and v is normally distributed. The composed disturbance ε = N[0, σv²] − |N[0, σu²]| has a "skew normal" density. Save for the constant term, the model is consistently estimated by OLS. If the theory is right, the OLS residuals will be skewed to the left rather than symmetric, as they would be if the disturbances were normally distributed.
Application: the Spanish dairy data used in Assignment 2. yit = log of milk production; x1 = log cows, x2 = log land, x3 = log feed, x4 = log labor.
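
The left-skew prediction is easy to see by simulation. A minimal sketch of the composed error (the variance values here are illustrative, not estimates from the dairy data):

import numpy as np

rng = np.random.default_rng(0)
n, sv, su = 10_000, 0.2, 0.4

e = rng.normal(0.0, sv, n) - np.abs(rng.normal(0.0, su, n))   # v - |u|
e = e - e.mean()                       # centered, as OLS residuals would be

skew = np.mean(e**3) / np.std(e)**3
print(skew)                            # negative: skewed to the left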

Regression Results

Distribution of OLS Residuals

A Nonparametric Regression
y = µ(x) + ε. Smoothing methods approximate µ(x) at specific points x*. For a particular x*, µ̂(x*) = Σi wi(x*|x)yi. E.g., for OLS in one variable, µ̂(x*) = a + bx*, which has weights wi = 1/n + (x* − x̄)(xi − x̄)/Σj(xj − x̄)². We look for a weighting scheme that picks up local differences in the relationship; OLS assumes a single fixed slope, b.

Nearest Neighbor Approach
Define a neighborhood of x*: points near x* get high weight, points far away get a small or zero weight. The bandwidth h defines the neighborhood, e.g., Silverman's rule h = .9·min[s, IQR/1.349]/n^.2. The neighborhood is x* ± h/2. LOWESS weighting function (tricube): Ti = [1 − |(xi − x*)/h|³]³. The weight is wi = 1[|(xi − x*)/h| < .5] × Ti.
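
A minimal sketch of these pieces (a full LOWESS fits a weighted regression line within each neighborhood; here a neighborhood-weighted average stands in for the local fit):

import numpy as np

def silverman_h(x):
    s = np.std(x, ddof=1)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    return 0.9 * min(s, iqr / 1.349) / len(x) ** 0.2

def tricube_weights(x, x_star, h):
    u = np.abs(x - x_star) / h
    T = (1.0 - u**3) ** 3                     # tricube taper
    return np.where(u < 0.5, T, 0.0)          # zero weight outside x* +/- h/2

def local_fit(x, y, x_star):
    w = tricube_weights(x, x_star, silverman_h(x))
    return np.sum(w * y) / np.sum(w)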

LOWESS Regression

OLS vs. LOWESS

Smooth Function: Kernel Regression
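
A minimal Nadaraya-Watson sketch of a kernel regression of the kind named above: the fitted value at x* is a kernel-weighted average of the yi. The Gaussian kernel here is an assumption of this sketch; other kernels work the same way.

import numpy as np

def kernel_regression(x, y, x_star, h):
    k = np.exp(-0.5 * ((x - x_star) / h) ** 2)   # Gaussian kernel weights
    return np.sum(k * y) / np.sum(k)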

Kernel Regression vs. LOWESS (Lwage vs. Educ)

Locally Linear Regression

OLS vs. LOWESS