Econ 140 Lecture 12: Prediction and Fit

Econ 140 Lecture 12, Slide 1: Prediction and Fit

Econ 140 Lecture 12, Slide 2: Today's plan
Prediction using the regression equation
– Plugging values of X into the equation to get predictions of Y
Coefficient of determination (R²)
Lessons on the predictive ability of an equation

Econ 140 Lecture 12, Slide 3: Prediction
For our regression equation we can plug values of X into Ŷ = a + bX to get predictions of Y.
Keep in mind that the coefficients a and b are estimates, each bounded by a confidence interval.
If X₀ = 17, the prediction is Ŷ₀ = a + bX₀ evaluated at 17. Since Y is measured in natural logs, our estimate of Y in levels is exp(Ŷ₀).
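A minimal sketch of this calculation in Python, using hypothetical coefficient estimates a and b (the slide's actual estimates are not shown in this transcript):

```python
import numpy as np

# Hypothetical OLS estimates; the actual values from the lecture are not shown here.
a, b = 1.20, 0.05      # intercept and slope from a regression of ln(Y) on X

X0 = 17
log_y_hat = a + b * X0           # prediction on the natural-log scale
y_hat = np.exp(log_y_hat)        # convert back to levels, since Y is in logs

print(f"predicted ln(Y) at X0 = {X0}: {log_y_hat:.3f}")
print(f"predicted Y in levels: {y_hat:.3f}")
```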

Econ 140 Lecture 12, Slide 4: Prediction (2)
This is different from the confidence interval around the regression line (see oops.pdf from Lecture 1). Now we are dealing with the sampling distribution around the predicted value Ŷ₀.
Its mean is E(Y|X₀) = a + bX₀, where X₀ is the chosen value for the prediction; here, X₀ = 17.

Econ 140 Lecture 12, Slide 5: Prediction (3)
The variance for the prediction is:
Var(Ŷ₀) = σ̂² [ 1/n + (X₀ − X̄)² / Σ(Xᵢ − X̄)² ]
– The right-hand part of the term takes into account how far X₀ is from the mean of X, divided by the variation of X as a whole.
– The further X₀ is from the mean of X, the higher the variability of the prediction, given the variation in X.

Econ 140 Lecture 12, Slide 6: Prediction (4)
The standard error of the predicted value of Y for a given value of X is the square root of this variance term.
We know σ̂², n, X̄ and Σ(Xᵢ − X̄)² from the earlier regression output, so we can plug them into the variance formula. The standard error is the square root of the result, roughly 0.03.
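A short sketch of this computation, using simulated data in place of the lecture's sample (which comes from L9.xls and is not reproduced here):

```python
import numpy as np

# Illustrative data; the lecture's actual sample (L9.xls) is not reproduced here.
rng = np.random.default_rng(0)
X = rng.uniform(5, 30, size=50)
y = 1.2 + 0.05 * X + rng.normal(scale=0.2, size=50)   # y plays the role of ln(Y)

# Bivariate OLS by hand.
b = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
a = y.mean() - b * X.mean()
resid = y - (a + b * X)
n = len(X)
sigma2_hat = np.sum(resid ** 2) / (n - 2)     # estimated error variance

# Standard error of the predicted conditional mean at X0.
X0 = 17
var_pred = sigma2_hat * (1 / n + (X0 - X.mean()) ** 2 / np.sum((X - X.mean()) ** 2))
se_pred = np.sqrt(var_pred)
print(f"prediction at X0: {a + b * X0:.3f}, standard error: {se_pred:.4f}")
```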

Econ 140 Lecture 12, Slide 7: Confidence interval around Y₀
The confidence interval around the predicted values of Y looks like:
[Figure: the regression line plotted in Y-versus-X space with a confidence band around the predicted values of Y; the prediction at X₀ = 17 is marked on the X axis.]

Econ 140 Lecture 12, Slide 8: Confidence interval around Y₀ (2)
We can see that the further the value of X₀ is from the mean of X, the larger the standard error of the predicted value of Y.
A confidence interval estimate of the true value of Y₀ is
Ŷ₀ ± t(α/2, n−2) · SE(Ŷ₀).

Econ 140 Lecture 12, Slide 9: Confidence interval around Y₀ (3)
The 95% confidence interval for Y₀ is Ŷ₀ ± t(0.025, n−2) · SE(Ŷ₀).
Evaluating this at X₀ = 17 gives the range within which the expected value of Y lies, given that X is 17.
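A minimal sketch of building the 95% interval. The 0.03 standard error is the figure quoted on the earlier slide; the predicted value and sample size here are hypothetical stand-ins:

```python
from scipy import stats

y_hat0 = 2.05     # predicted ln(Y) at X0 = 17 (hypothetical value)
se_pred = 0.03    # standard error of the prediction, from the previous slide
n = 50            # sample size (hypothetical)

t_crit = stats.t.ppf(0.975, df=n - 2)                 # two-sided 95% critical value
lower, upper = y_hat0 - t_crit * se_pred, y_hat0 + t_crit * se_pred
print(f"95% CI for E(Y | X0 = 17): [{lower:.3f}, {upper:.3f}]")
```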

Econ 140 Lecture 12, Slide 10: Returning to Palm Beach County
Let's return to what we looked at in Lecture 1: the number of registered voters versus the number of votes cast for each political party in Florida.
For the number of Reform votes cast versus the number of registered Reform voters, the regression line seems to fit the data well except for one outlier: Palm Beach County.
L12.xls is on the web.
– Does the number of votes cast fall within a 95 percent confidence interval for the prediction?

Econ 140 Lecture 12, Slide 11: Returning to Palm Beach County (2)
The prediction of the number of votes cast in Palm Beach County is approximately 875. This is far below the number of votes actually cast there.
The 95 percent confidence interval estimate for Palm Beach County is 806 < E(Y|X=337) < 943, still a long way from the actual count.
Could we really expect so many votes for Buchanan given the statistical evidence?

Econ 140 Lecture 12, Slide 12: Coefficient of determination R²
R² will be part of the output from most statistical software (see the Stata output or the LINEST output).
We can write R² as the share of the variation in Y explained by the model:
R² = Model Sum of Squares / Total Sum of Squares = 1 − Residual Sum of Squares / Total Sum of Squares.

Econ 140 Lecture 12, Slide 13: Coefficient of determination R² (2)
R² has two properties:
0 ≤ R² ≤ 1.
– If the model has explained all of the variation, R² will be 1; if the model has explained none of the variation, R² will be zero.
R² can never be negative.

Econ 140 Lecture 12, Slide 14: Coefficient of determination R² (3)
From L9.xls we know the model and total sums of squares. From these we can calculate R² = 6.181 / (total sum of squares) = 0.38.
We will see that there is a relationship between R² and the coefficient of correlation, which measures the correlation between X and Y.
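A sketch of this decomposition in Python, using simulated data rather than the L9.xls figures (model SS = 6.181, R² = 0.38) quoted on the slide:

```python
import numpy as np

# Illustrative data; the lecture's numbers come from L9.xls and are not reproduced here.
rng = np.random.default_rng(1)
X = rng.uniform(5, 30, size=50)
y = 1.2 + 0.05 * X + rng.normal(scale=0.3, size=50)

b = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
a = y.mean() - b * X.mean()
y_hat = a + b * X

model_ss = np.sum((y_hat - y.mean()) ** 2)   # explained (model) sum of squares
resid_ss = np.sum((y - y_hat) ** 2)          # residual sum of squares
total_ss = np.sum((y - y.mean()) ** 2)       # total sum of squares

r_squared = model_ss / total_ss
print(f"R^2 = {r_squared:.3f}")
print(f"check: 1 - RSS/TSS = {1 - resid_ss / total_ss:.3f}")
```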

Econ 140 Lecture 12, Slide 15: Coefficient of determination R² (4)
To get the coefficient of correlation, we can write the model in deviations from means as y = bx + e.
Squaring and summing both sides of the equation gives Σy² = b²Σx² + 2bΣxe + Σe².
There should be no correlation between the independent variable and the errors, so Σxe = 0 and we are left with Σy² = b²Σx² + Σe².

Econ 140 Lecture 12, Slide 16: Coefficient of determination R² (5)
Now we're left with Σy² = b²Σx² + Σe², where b²Σx² is the Model Sum of Squares and Σe² is the Residual Sum of Squares.
We have R² = b²Σx² / Σy² and b = Σxy / Σx². We can substitute for b² to get R² = (Σxy)² / (Σx² Σy²).

Econ 140 Lecture 12, Slide 17: Coefficient of determination R² (6)
The coefficient of correlation R is:
R = Σxy / √(Σx² Σy²)
So from this we can see that there is a relationship between the coefficient of determination and the coefficient of correlation: in the bivariate model, R² is the square of the correlation between X and Y.
– We can see both how correlated X and Y are and how good a job the model is doing.
– A higher R² does not necessarily mean a stronger causal relationship; your reasoning about causality should come from economic theory.
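A quick numerical check of this identity on simulated data (illustrative only, not the lecture's dataset): in a bivariate regression, R² equals the squared correlation coefficient between X and Y.

```python
import numpy as np

# Illustrative data only.
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=200)
Y = 3.0 + 0.7 * X + rng.normal(scale=2.0, size=200)

x = X - X.mean()                  # deviations from means
y = Y - Y.mean()

r = np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))   # correlation coefficient

b = np.sum(x * y) / np.sum(x ** 2)
resid = y - b * x
r_squared = 1 - np.sum(resid ** 2) / np.sum(y ** 2)            # R^2 of the regression

print(f"r^2 = {r ** 2:.6f}, R^2 = {r_squared:.6f}")            # the two should match
```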

Econ 140 Lecture 12, Slide 18: Coefficient of determination R² (7)
If we write R² = b²Σx² / Σy², we know that what matters to R² as a predictive indicator, once we know b, is Σx², the variation in X.
To improve the predictive ability of a regression equation, the observations on X should not be clustered around the mean of X but should range over as many values as possible.

Econ 140 Lecture 12, Slide 19: Lessons on the regression equation
Lesson One: the predictive ability of any regression equation declines as X₀ moves away from the mean of X.
Lesson Two: for strong predictive ability, you need wide variation in X.
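A small simulation illustrating both lessons, under assumed data (not from the lecture): the standard error of the predicted conditional mean grows as X₀ moves away from X̄, and shrinks when the X observations are more spread out.

```python
import numpy as np

rng = np.random.default_rng(3)

def prediction_se(X, X0, sigma=1.0):
    """Standard error of the predicted conditional mean at X0 (true sigma taken as known)."""
    n = len(X)
    return sigma * np.sqrt(1 / n + (X0 - X.mean()) ** 2 / np.sum((X - X.mean()) ** 2))

X_narrow = rng.uniform(9, 11, size=100)    # X clustered near its mean (about 10)
X_wide = rng.uniform(0, 20, size=100)      # X spread over many values (same mean)

for X0 in (10, 15, 20):
    print(f"X0 = {X0}: SE with clustered X = {prediction_se(X_narrow, X0):.3f}, "
          f"SE with spread-out X = {prediction_se(X_wide, X0):.3f}")
```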

Econ 140 Lecture 12, Slide 20: Next time
Next time we'll be moving from a bivariate world to a multivariate world.
– We will look at how the multivariate equation relates to the bivariate equation.
– We will use the LINEST function to obtain estimates in the same way that we have for the bivariate case.