Presentation on theme: "Ch.6 Simple Linear Regression: Continued"— Presentation transcript:
1 Ch.6 Simple Linear Regression: Continued. To complete the analysis of the simple linear regression model, in this chapter we will consider: how to measure the variation in $y_t$ that is explained by the model; how to report the results of a regression analysis; how changes in the units of measurement affect the estimates; and some alternative functional forms that may be used to represent possible relationships between $y_t$ and $x_t$.
2 The Coefficient of Determination ($R^2$). Two major reasons for analyzing the model $y_t = \beta_1 + \beta_2 x_t + e_t$ are: (1) to explain how the dependent variable ($y_t$) changes as the independent variable ($x_t$) changes; (2) to predict $y_0$ given $x_0$. We want the independent variable ($x_t$) to explain as much of the variation in the dependent variable ($y_t$) as possible. We introduced the independent variable in the hope that its variation will explain the variation in $y_t$. A measure of goodness of fit measures how much of the variation in the dependent variable ($y_t$) has been explained by variation in the independent variable ($x_t$).
3 Separate $y_t$ into its explainable and unexplainable components: $y_t = E(y_t) + e_t$, where $E(y_t) = \beta_1 + \beta_2 x_t$ is explainable. The error term $e_t$ is unexplainable. Using our estimates $b_1$ and $b_2$ for $\beta_1$ and $\beta_2$, we get estimates of $E(y_t)$, namely $\hat{y}_t = b_1 + b_2 x_t$, and our residuals $\hat{e}_t = y_t - \hat{y}_t$ give us estimates of the error terms. A residual is defined as the difference between the actual and the predicted values of y.
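4 The total variation in $y_t$ is measured as the sum of the squared deviations from the mean, $\sum_{t=1}^{T} (y_t - \bar{y})^2$, also known as SST (Total Sum of Squares). A single deviation of $y_t$ from its mean can be split into two parts: $y_t - \bar{y} = (\hat{y}_t - \bar{y}) + \hat{e}_t$. The sum of squared deviations from the mean is: $\sum (y_t - \bar{y})^2 = \sum (\hat{y}_t - \bar{y})^2 + \sum \hat{e}_t^2 + 2 \sum (\hat{y}_t - \bar{y})\hat{e}_t$. The last (cross-product) term is zero.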
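The claim that the cross-product term is zero follows from the least squares normal equations. A short derivation, writing $\hat{y}_t = b_1 + b_2 x_t$ for the fitted values and $\hat{e}_t = y_t - \hat{y}_t$ for the residuals (this notation is assumed here), and assuming the model includes an intercept:

```latex
\begin{align*}
\sum_{t=1}^{T} (\hat{y}_t - \bar{y})\,\hat{e}_t
  &= \sum_{t=1}^{T} (b_1 + b_2 x_t - \bar{y})\,\hat{e}_t \\
  &= (b_1 - \bar{y}) \sum_{t=1}^{T} \hat{e}_t \;+\; b_2 \sum_{t=1}^{T} x_t \hat{e}_t \\
  &= 0,
\end{align*}
% because the least squares normal equations give
% \sum_t \hat{e}_t = 0 and \sum_t x_t \hat{e}_t = 0.
```

Both sums vanish only when $b_1$ and $b_2$ are the least squares estimates, so the decomposition does not hold for an arbitrary line.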
5 Graphically, a single y deviation from the mean can be split into the two parts. [Figure: at a given $x_t$, the total variation is split into an explained part and an unexplained part.]
6 SST = SSR + SSE. Analysis of Variance (ANOVA), where: SST: Total Sum of Squares with T-1 degrees of freedom. It measures the total variation in the actual $y_t$ values about their mean. SSR: Regression Sum of Squares with 1 degree of freedom. It measures the variation in the predicted values of $y_t$ about their mean. It is the part of the total variation that is explained by the model. SSE: Error Sum of Squares with T-2 degrees of freedom. It measures the variation in the actual $y_t$ values about the predicted $y_t$ values. It is the part of the total variation that is left unexplained.
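The decomposition can be checked numerically. The following is a minimal sketch in plain Python; the data and the helper names `ols_fit` and `sums_of_squares` are made up for illustration and do not come from the slides.

```python
def ols_fit(x, y):
    """Least squares estimates (b1, b2) for y = b1 + b2*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the fitted simple regression."""
    b1, b2 = ols_fit(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b1 + b2 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)              # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    return sst, ssr, sse


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

sst, ssr, sse = sums_of_squares(x, y)
T = len(y)
print("SST =", round(sst, 6), "with", T - 1, "degrees of freedom")
print("SSR =", round(ssr, 6), "with", 1, "degree of freedom")
print("SSE =", round(sse, 6), "with", T - 2, "degrees of freedom")
```

For this data SST is 6, which splits into SSR = 3.6 and SSE = 2.4; the degrees of freedom add up the same way (4 = 1 + 3).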
8 Coefficient of Determination: $R^2$. $R^2 = SSR/SST$ is the proportion of the total variation (SST) that is explained by the model. We can also think of it as one minus the proportion of the total variation that is unexplained (left in the residuals): $R^2 = 1 - SSE/SST$. $0 \le R^2 \le 1$. The closer $R^2$ is to 1.0, the better the fit of the model and the greater the predictive ability of the model over the sample. If $R^2 = 1$ the model has explained everything. All the data points lie on the regression line (very unlikely). There are no residuals. If $R^2 = 0$ the model has explained nothing.
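The two boundary cases can be illustrated with a small sketch. The function name `r_squared` and both data sets are assumptions made for illustration; $R^2$ is computed here as one minus the unexplained share of the total variation.

```python
def r_squared(x, y):
    """R^2 of the least squares line, as 1 - SSE/SST."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b1 = y_bar - b2 * x_bar
    sse = sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Every point lies exactly on y = 1 + 2x, so there are no residuals.
r2_perfect = r_squared(x, [3.0, 5.0, 7.0, 9.0, 11.0])

# y varies, but the fitted slope is zero, so x explains nothing.
r2_none = r_squared(x, [2.0, 1.0, 3.0, 1.0, 2.0])

print(r2_perfect, r2_none)
```

Note that `r_squared` divides by SST, so it is undefined when every y value is identical (there is no variation to explain).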
9 Graph A: $R^2$ appears to be 1.0. All data points lie on a line. Graph B: $R^2$ appears to be 0. The best line through these points appears to have a slope of zero.
10 Graph C: $R^2$ appears to be close to 1.0. Graph D: $R^2$ appears to be greater than 0 but less than $R^2$ in Graph C.
11 In the food expenditure example, $R^2 = 0.317$: "31.7% of the total variation in food expenditures has been explained by variation in household income." More Examples:
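12 Correlation Analysis. The correlation coefficient between x and y is: $\rho = \frac{\mathrm{cov}(x, y)}{\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}}$. The sample correlation between x and y is: $r = \frac{\sum (x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum (x_t - \bar{x})^2 \sum (y_t - \bar{y})^2}}$. It is always true that $-1 \le r \le 1$. It measures the strength of a linear relationship between x and y.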
13 Correlation and $R^2$. It can be shown that the square of the sample correlation coefficient for x and y is equal to $R^2$. $R^2$ can also be computed as the square of the sample correlation coefficient for the $y$ values and the $\hat{y}$ values. It can also be shown that
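Both ways of getting $R^2$ from a correlation coefficient can be verified on a small data set. This is a sketch with made-up data; `sample_corr` is a hypothetical helper implementing the usual sample correlation formula.

```python
import math


def sample_corr(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Least squares fit and predicted values.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b1 = y_bar - b2 * x_bar
y_hat = [b1 + b2 * xi for xi in x]

# R^2 from the sums of squares.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sst = sum((yi - y_bar) ** 2 for yi in y)
r2 = ssr / sst

r_xy = sample_corr(x, y)         # correlation between x and y
r_y_yhat = sample_corr(y, y_hat)  # correlation between y and predicted y

print(r2, r_xy ** 2, r_y_yhat ** 2)
```

All three printed values agree (0.6 for this data, up to floating-point rounding). The sign of `r_xy` matches the sign of the slope, which squaring discards.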
14 Reporting Regression Results. (s.e.) (22.139) (0.0305), $R^2 = 0.317$. The numbers in parentheses are the standard errors of the coefficient estimates. These can be used to construct the necessary t-statistics to ascertain the significance of the estimates. Sometimes, authors will report the t-statistic instead of the standard error. This would be the t-statistic for $H_0: \beta_k = 0$: (t-stat) (1.841) (4.201), $R^2 = 0.317$.
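Because the t-statistic for $H_0: \beta_k = 0$ is the estimate divided by its standard error, either reporting style can be converted into the other. The sketch below uses only the standard errors and t-statistics quoted on the slide. It assumes the first number in each pair belongs to the intercept and the second to the slope, and the implied estimates are approximate because the reported figures are rounded.

```python
def t_stat(estimate, std_error, hypothesized=0.0):
    """t-statistic for H0: beta = hypothesized."""
    return (estimate - hypothesized) / std_error


# Reported on the slide (assumed order: intercept, then slope).
se_b1, se_b2 = 22.139, 0.0305
t_b1, t_b2 = 1.841, 4.201

# Under H0: beta = 0, t = b / se, so b = t * se (up to rounding).
b1_implied = t_b1 * se_b1
b2_implied = t_b2 * se_b2

print("implied b1 ~", round(b1_implied, 3))
print("implied b2 ~", round(b2_implied, 4))

# Going the other way recovers the reported t-statistics.
print(round(t_stat(b1_implied, se_b1), 3), round(t_stat(b2_implied, se_b2), 3))
```

Deciding whether each estimate is significant still requires a critical value from the t-distribution with T-2 degrees of freedom, which depends on the sample size.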
15 Units of Measurement. $b_1$ is measured in "y units"; $b_2$ is measured in "y units over x units". Example 3.15 from Chapter 3 Exercises: y = number of sodas sold, x = temperature in degrees (°F). If $x_0 = 0°$ then the model predicts $\hat{y}_0 = b_1$. So $b_1$ is measured in y units (# of sodas). $b_2 = 6$, where 6 is in (# of sodas / degrees). If x increases by 10 degrees, predicted y increases by 60 sodas.
16 Let newx = x/100. There is no change to $b_1$; $b_2$ is multiplied by 100.
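The effect of rescaling x can be seen by fitting the same data twice. This is a sketch with made-up temperature and soda figures (not the data from the exercise); `ols_fit` is a hypothetical helper.

```python
def ols_fit(x, y):
    """Least squares estimates (b1, b2) for y = b1 + b2*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b1 = y_bar - b2 * x_bar
    return b1, b2


# Hypothetical data: temperature in degrees F and sodas sold.
temperature = [60.0, 70.0, 80.0, 90.0, 100.0]
sodas = [120.0, 185.0, 240.0, 295.0, 365.0]

b1, b2 = ols_fit(temperature, sodas)

# Rescale the regressor: newx = x / 100.
new_x = [xi / 100.0 for xi in temperature]
b1_scaled, b2_scaled = ols_fit(new_x, sodas)

print("original:", b1, b2)
print("rescaled:", b1_scaled, b2_scaled)
```

The intercept is the same in both fits, while the slope on `new_x` is 100 times the original slope: a one-unit change in `new_x` is a 100-degree change in temperature. The fitted values and $R^2$ are unaffected.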
17 Functional Forms. A linear model is one that is linear in the parameters with an additive error term: $y = \beta_1 + \beta_2 x + e$. The coefficient $\beta_2$ measures the effect of a one unit change in x on y. As the model is written above, this effect is assumed to be constant: $dy/dx = \beta_2$. However, we want to have the ability to model relationships among economic variables where the effect of x on y is not constant. Example: our food expenditure example assumes that the increase in food spending from an additional dollar of income is the same whether the family has a high or low income. We can capture these effects using logs, powers and reciprocals yet still maintain a model that is linear in the parameters with an additive error term.
18 The Natural Logarithm. We will use the derivative property often. Let y be the log of x: y = ln(x), dy/dx = 1/x, or dy = dx/x. This means that the absolute change in the log of x is approximately equal to the relative change in the level of x. Let x = 50: ln(x) = 3.912. Let x = 52: ln(x) = 3.951. The change is d ln(x) = 3.951 - 3.912 = 0.039. The absolute change in ln(x) is 0.039, which can be interpreted as a relative change in x (x increases from 50 to 52, which in relative terms is 2/50 = 4%; the log difference approximates this as 3.9%).
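The numbers in this example are easy to reproduce, and doing so shows that the log difference is an approximation to the relative change, one that is good for small changes and weaker for large ones. A minimal sketch (the second, larger change from 50 to 75 is an added illustration, not from the slide):

```python
import math

x_old, x_new = 50.0, 52.0

log_change = math.log(x_new) - math.log(x_old)   # absolute change in ln(x)
relative_change = (x_new - x_old) / x_old        # exact relative change in x

print(round(math.log(x_old), 3), round(math.log(x_new), 3))
print(round(log_change, 4), relative_change)

# The approximation deteriorates as the change in x gets larger.
x_big = 75.0
log_change_big = math.log(x_big) - math.log(x_old)
relative_change_big = (x_big - x_old) / x_old
print(round(log_change_big, 4), relative_change_big)
```

For the move from 50 to 52 the log difference is about 0.039 against an exact relative change of 0.04. For the move from 50 to 75 the log difference is about 0.41 against an exact relative change of 0.5.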
19 Using Logs. What does $\beta_2$ measure? What does $\beta_2$ measure?