Residuals and Diagnosing the Quality of a Model

1 Regression Models
Residuals and Diagnosing the Quality of a Model

2 Visualizing Regression Models

3 Criteria of quality
Residuals (or what we don’t explain) should be “noise.”
Independent variables measure different phenomena.
We haven’t left out something important.

4 Diagnosing the Quality of a Regression Model Using the Residuals
Regression models assume that the errors of prediction are: homoscedastic, not autocorrelated, normally distributed, and not correlated with the independent variables.
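The assumptions listed above can be probed with simple numerical checks on the residuals. The sketch below is one possible illustration, not the method used in the presentation: it uses the Durbin-Watson statistic for autocorrelation (values near 2 suggest no first-order autocorrelation), a crude ratio of mean squared residuals between the high-X and low-X halves of the data for homoscedasticity, and the Pearson correlation between residuals and the independent variable. The function names and the data are hypothetical.

```python
from math import sqrt

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def variance_ratio(x, residuals):
    """Mean squared residual in the high-x half divided by the low-x half.

    A ratio far from 1 hints at heteroscedasticity (a rough check only).
    """
    ordered = [e for _, e in sorted(zip(x, residuals))]
    half = len(ordered) // 2
    low, high = ordered[:half], ordered[-half:]
    mean_sq = lambda es: sum(e * e for e in es) / len(es)
    return mean_sq(high) / mean_sq(low)

# Hypothetical residuals, for illustration only.
x = [10, 20, 30, 40, 50, 60]
res = [1.5, -2.0, 0.5, 2.5, -3.0, 0.5]
print("Durbin-Watson:", durbin_watson(res))
print("Variance ratio (high x / low x):", variance_ratio(x, res))
print("Correlation of residuals with x:", pearson_r(x, res))
```

These are screening numbers rather than formal tests; a residual plot against X remains the most direct check.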

5 Regression Models assume…
The independent variables measure different phenomena; that is, the independent variables are not themselves correlated. If they are, we have a problem of “collinearity” or “multicollinearity.”
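One common way to quantify collinearity is the tolerance of a predictor: 1 minus the R-squared from regressing that predictor on the other predictors. With only two predictors this reduces to 1 minus their squared correlation. The sketch below assumes that two-predictor case; the function names and data are hypothetical.

```python
from math import sqrt

def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def tolerance_two_predictors(x1, x2):
    """Tolerance when the model has exactly two predictors: 1 - r^2."""
    r = pearson_r(x1, x2)
    return 1.0 - r * r

# Hypothetical predictors, for illustration only.
size = [1200, 1500, 1800, 2100, 2600]
families = [1, 2, 2, 3, 3]
print("Tolerance:", tolerance_two_predictors(size, families))
```

A tolerance near 1 means the two predictors carry largely separate information; a tolerance near 0 means they are nearly redundant.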

6 Collinearity

7 An Omitted Variable?

8 Models
A Model: A statement of the relationship between a phenomenon to be explained and the factors, or variables, which explain it.
Steps in the Process of Quantitative Analysis:
Specification of the model
Estimation of the model
Evaluation of the model

9 Thus far…
We’ve discussed the specification of a model, the estimation of a model, and how to read and interpret the statistics we’ve produced: coefficients, t tests, F tests, R Square.
Now we need to evaluate the model for problems and further elaboration.

10 We need to evaluate
The variation in the predicted values and the difference between the Yi and the predicted Y. That difference is called a “residual.” We can analyze the residuals to see how good the equation is, and whether there are problems with the model that need correction or improvement.
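As a minimal sketch of the definition above, the code below fits a bivariate least-squares line and computes each residual as the observed Yi minus the predicted Y. The data values and function names are hypothetical and chosen only for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

def residuals(x, y, a, b):
    """Observed y minus predicted y for every case."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data, for illustration only.
x = [10, 20, 30, 40, 50, 60]
y = [48, 52, 51, 57, 56, 60]
a, b = fit_line(x, y)
res = residuals(x, y, a, b)
for xi, yi, ei in zip(x, y, res):
    print(f"x={xi}  observed={yi}  predicted={yi - ei:.2f}  residual={ei:.2f}")
```

For a least-squares fit with an intercept, the residuals sum to zero and are uncorrelated with X by construction, so diagnostics look at their pattern rather than their mean.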

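11 More statistics…
Standard Error of the Estimate: The square root of the average squared error of prediction is used as a measure of the accuracy of prediction.
For the population: $\sigma_{est} = \sqrt{\sum (Y_i - \hat{Y}_i)^2 / N}$
For the sample (bivariate case): $s_{est} = \sqrt{\sum (Y_i - \hat{Y}_i)^2 / (N - 2)}$
Here $\hat{Y}_i$ is the predicted Y for case i.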
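The two versions of the standard error of the estimate can be sketched as follows. This assumes the usual textbook definitions: the population version divides the sum of squared residuals by N, and the sample version divides by N minus the number of estimated coefficients (N - 2 in the bivariate case). The function names and data are hypothetical.

```python
from math import sqrt

def see_population(y, y_hat):
    """Square root of the average squared error of prediction (divide by N)."""
    n = len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    return sqrt(sse / n)

def see_sample(y, y_hat, n_predictors=1):
    """Sample version: divide by N - n_predictors - 1 (N - 2 when bivariate)."""
    n = len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    return sqrt(sse / (n - n_predictors - 1))

# Hypothetical observed and predicted values, for illustration only.
observed = [48, 52, 51, 57, 56, 60]
predicted = [48.4, 50.7, 52.9, 55.1, 57.3, 59.6]
print("Population SEE:", see_population(observed, predicted))
print("Sample SEE:", see_sample(observed, predicted))
```

The sample version is always the larger of the two, and the gap shrinks as N grows.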

12 Standard Error of the Estimate
Used to calculate a confidence interval around the predicted Y. As a rule of thumb, multiply the SEE by 2, then add it to and subtract it from each predicted Y to get an approximate range for the prediction at a 95% confidence level. At the mean of the independent variable, the standard error of the mean prediction = SEE/(square root of n).
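The rule of thumb above can be sketched in a few lines. The numbers are borrowed from different slides purely for illustration (a predicted value of 48.8 from the hypothetical example, an SEE of 19.61 and 467 cases from the housing output), so the result is not a real interval for any one model. The function names are hypothetical.

```python
from math import sqrt

def rough_interval(predicted, see, multiplier=2.0):
    """Rule-of-thumb range: predicted value plus or minus multiplier * SEE."""
    return predicted - multiplier * see, predicted + multiplier * see

def se_mean_prediction_at_xbar(see, n):
    """Standard error of the mean prediction at the mean of X: SEE / sqrt(n)."""
    return see / sqrt(n)

# Illustrative values only; they come from different examples in the deck.
lower, upper = rough_interval(predicted=48.8, see=19.61)
print(f"Approximate 95% range: {lower:.2f} to {upper:.2f}")
print("SE of mean prediction at mean X:", se_mean_prediction_at_xbar(19.61, 467))
```

The range for an individual prediction is much wider than the one for the mean prediction, because the first reflects the scatter of single cases and the second only the uncertainty in the fitted line.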

13 Hypothetical Example
[Figure: scatterplot of Y against X with a fitted line. At the highlighted point the predicted value is 48.8 and the residual is 6.2.]

14 Example from last week…
Newval = a + b1(Newsize) + b2(Families) + b3(Eastside) + b4(South)
Dep Var: NEWVAL
N:
Multiple R:
Squared multiple R: 0.56
Adjusted squared multiple R:
Standard error of estimate: 19.61
Columns: Effect, Coefficient, Std Error, Std Coef, Tolerance, t, P(2 Tail)
Rows: CONSTANT, NEWSIZE, FAMILIES, EASTSIDE, SOUTH

15 To understand the principles, let’s simplify…
We return to the bivariate case: House value is a function of the size of the building.
Regression models assume that the errors of prediction are homoscedastic, not autocorrelated, normally distributed, and not correlated with the independent variables. That is, the error term should be noise.
Now we ask:
1. How accurate is our prediction?
2. What are the characteristics of the residuals, or the error term?

16 Model of Housing Values and Building Size
Dep Var: NEWVAL
N:
Multiple R:
Squared multiple R: 0.517
Adjusted squared multiple R:
Standard error of estimate:
Columns: Effect, Coefficient, Std Error, Std Coef, Tolerance, t, P(2 Tail)
Rows: CONSTANT, NEWSIZE
Analysis of Variance
Columns: Source, Sum-of-Squares, df, Mean-Square, F-ratio, P
Rows: Regression, Residual

17 Scatterplot of Newsize and Newval

18 Scatterplot, cont.

19 95% Confidence Intervals for Mean Predictions of Y (left) and Individual Predictions of Y (right)

20 Hypothetical Example
[Figure: scatterplot of Y against X with a fitted line. At the highlighted point the predicted value is 48.8 and the residual is 6.2.]

21 Analysis of Residuals
Columns: ESTIMATE, NEWVAL, RESIDUAL
N of cases: 467, 467, 467
Rows: Minimum, Maximum, Range, Sum, Median, Mean, 95% CI Upper, 95% CI Lower, Std. Error, Standard Dev, Variance, C.V. E+14, Skewness (G1), SE Skewness, Kurtosis (G2), SE Kurtosis
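The skewness and kurtosis entries in a residual summary like this one are a quick check on whether the residuals look roughly normal (both near 0 for a normal distribution, when kurtosis is reported as excess kurtosis). The sketch below uses plain moment-based formulas; statistical packages often apply small-sample adjustments to G1 and G2, so their numbers may differ slightly. The function name and data are hypothetical.

```python
def shape_statistics(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Hypothetical residuals, for illustration only.
res = [-3.0, -1.5, -0.5, 0.0, 0.5, 1.0, 3.5]
skew, kurt = shape_statistics(res)
print("Skewness:", skew)
print("Excess kurtosis:", kurt)
```

A clearly positive skewness would mean a few large positive residuals, that is, cases the model underpredicts by a wide margin.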

22 Visualizing Regression Models

23 Collinearity

24 An Omitted Variable?

