TODAY we will Review what we have learned so far about Regression Develop the ability to use Residual Analysis to assess if a model (LSRL) is appropriate.

1 Today we will
• Review what we have learned so far about regression
• Develop the ability to use residual analysis to assess whether a model (the least-squares regression line, LSRL) is appropriate for predictions
• Understand how the standard error (Se) is used in regression analysis

2 Review
• How to describe a scatterplot
• Correlation coefficient (r)
• Math vs. stats: equation of a line vs. LSRL
• Interpret slope and y-intercept
• What is a residual (or error)?

3 Review: how to describe a scatterplot
• Trend: positive or negative
  Positive association: as the x variable increases, the y variable also increases
  Negative association: as the x variable increases, the y variable decreases
• Form: linear or nonlinear
• Strength: weak, moderate, or strong
• Correlation coefficient (r): -1 ≤ r ≤ 1
  r close to 1 or -1: strong linear association
  r close to 0: weak or no linear association
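As a rough illustration of how r behaves, the following Python sketch computes the correlation coefficient directly from its definition. The data values and the function name `correlation` are made up for this example and do not come from the slides.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]      # y tends to increase with x
y_down = [9, 7, 6, 4, 1]    # y tends to decrease with x

r_up = correlation(x, y_up)      # positive, moderate to strong
r_down = correlation(x, y_down)  # negative, close to -1
print(round(r_up, 3), round(r_down, 3))
```

The sign of r gives the trend, and how close it is to 1 or -1 gives the strength of the linear association.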

4 Review: math vs. stats, equation of a line vs. LSRL
• Math line: y = mx + b
• Stats line (LSRL): ŷ = a + bx, where ŷ is the predicted value of y, a is the y-intercept, and b is the slope

5 Review: interpret slope and y-intercept
• Slope: for every one-unit increase in x, the predicted y increases (or decreases) on average by the value of the slope.
• Y-intercept: when the variable x = 0, the predicted value of the variable y is "a".
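A minimal sketch of how the slope and y-intercept of the LSRL can be computed, assuming the usual least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The function name `fit_lsrl` and the data are hypothetical.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # y-intercept
    return a, b

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_lsrl(x, y)
print(f"y_hat = {a:.1f} + {b:.1f} x")
```

For this made-up data the slope is 0.6, so each additional unit of x is associated with an average increase of 0.6 in the predicted y, and the predicted y at x = 0 is 2.2.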

6 Review: what is a residual (or error)?
Residual (error) = observed y value - predicted y value
In the figure, the residual is the vertical gap between the observed y and the predicted y.
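The definition above can be sketched in a few lines of Python. The data are hypothetical, and the helper `fit_lsrl` is an illustrative name, not something from the slides.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_lsrl(x, y)
predicted = [a + b * xi for xi in x]
# residual = observed y - predicted y
residuals = [yi - pi for yi, pi in zip(y, predicted)]

for xi, yi, pi, ei in zip(x, y, predicted, residuals):
    print(f"x={xi}  observed={yi}  predicted={pi:.1f}  residual={ei:+.1f}")
```

A positive residual means the point lies above the line and a negative residual means it lies below. For a least-squares line with an intercept, the residuals sum to zero.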

7 Use Residual Analysis to assess if the model (LSRL) is appropriate for making predictions

8 Correlation, linearity, and outliers
• Use the linear correlation to interpret the data only when the relationship is linear.
• An outlier can strongly influence the correlation.
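To see how much a single outlier can move r, here is a small sketch with invented data: eight nearly collinear points, then the same points plus one hypothetical outlier.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.1]

r_without = correlation(x, y)
# Add one point far below the pattern of the others.
r_with = correlation(x + [9], y + [1.0])

print(f"r without outlier: {r_without:.3f}")
print(f"r with outlier:    {r_with:.3f}")
```

With this made-up data, one added point is enough to pull a correlation that is nearly 1 down to a weak one.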

9 Fitting a model for prediction (fitting the LSRL for prediction)
Messages:
• All models are wrong, but some are useful.
• A model is not the reality.
• Deterministic vs. stochastic models: a stochastic model allows random variation.
• Residual analysis addresses directly the problem of signal and noise.

10 Signal and Noise

13 Types of residual plots
Different plots can highlight different departures or problems in the prediction model:
1) Residuals vs. fitted values
2) Histogram of residuals
3) P-P plot
4) Residuals vs. order
Note: these plots are from software output (Minitab).

15 Residuals vs. fitted values plot
Three common defects may be revealed by plotting residuals vs. fitted values:
1) Outliers
2) Progressive change in the variance: a band of uniform width indicates constant variance; a funnel shape indicates unequal variance, so consider a transformation
3) Inadequacy of the model: curvature suggests the wrong model; a linear trend going up suggests a wrong calculation
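The first defect, outliers, can also be screened numerically. The sketch below flags points whose residual is larger in size than twice the residual standard error. The "2 × Se" cutoff is a common rule of thumb used here as an assumption, not a rule stated in the slides, and the data and function names are hypothetical.

```python
import math

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def flag_large_residuals(xs, ys, multiplier=2.0):
    """Indices of points with |residual| > multiplier * Se (assumed cutoff)."""
    a, b = fit_lsrl(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    se = math.sqrt(sse / (len(xs) - 2))  # residual standard error
    return [i for i, e in enumerate(residuals) if abs(e) > multiplier * se]

# Hypothetical data: roughly y = 2x + 1, with one point (x = 6) far off.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 20.0, 15.1, 16.9, 19.2, 20.8]

flagged = flag_large_residuals(x, y)
print("flagged indices:", flagged)
```

A flagged point is only a candidate outlier. It still needs to be looked at on the residual plot before deciding what to do with it.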

16 Residual vs. Fitted

17 Let's look at an example to see what a "well-behaved" residual plot looks like.

18 Scatterplot
Some researchers (Urbano-Marquez et al., 1989) were interested in determining whether or not alcohol consumption was linearly related to muscle strength. The researchers measured the total lifetime consumption of alcohol (x) on a random sample of n = 50 alcoholic men. They also measured the strength (y) of the deltoid muscle in each person's nondominant arm. A fitted line plot of the resulting data (alcoholarm.txt) looks like:

19 Scatterplot and residual plot (residuals vs. fitted values)

20 Let's look at an example to see what a "not so well-behaved" residual plot looks like.

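21 What do you notice in this scatterplot?
[Figure: scatterplot and residual plot with an outlier marked; axis labels: Predicted or Fitted, Foot length]

22 [Figure: residual plot; axis label: Predicted or Fitted]

23 [Figure: residual plot with the outlier removed; axis label: Predicted or Fitted]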

24 Let's look at an example to see what a "not well-behaved" residual plot looks like.

26 Heteroscedasticity
• When the requirement of a constant variance is violated, we have a condition of heteroscedasticity.
• Diagnose heteroscedasticity by plotting the residuals against the predicted y (ŷ).
[Figure: residuals plotted against ŷ; the spread increases with ŷ]
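One rough numeric companion to the plot is to compare the spread of the residuals for small fitted values with the spread for large fitted values. This split-in-half comparison is an informal check chosen for illustration, not a method from the slides, and the data are constructed so that the noise grows with x.

```python
import statistics

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data: y = 2x plus noise whose size grows with x.
x = list(range(1, 13))
y = [2 * xi + 0.1 * xi * (-1) ** xi for xi in x]

a, b = fit_lsrl(x, y)
fitted = [a + b * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Order the residuals by fitted value and split them into two halves.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
spread_low = statistics.pstdev([e for _, e in pairs[:half]])
spread_high = statistics.pstdev([e for _, e in pairs[half:]])

print(f"spread for small fitted values: {spread_low:.2f}")
print(f"spread for large fitted values: {spread_high:.2f}")
```

A clearly larger spread in one half matches the funnel shape seen on the residual plot. Similar spreads are consistent with a band of uniform width.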

29 Signal and Noise

30 Residual plots: residuals vs. fitted values
Homoscedasticity vs. heteroscedasticity
A residual plot is a scatterplot of the standardized residuals against the fitted values.

31 Let's look at an example to see what a "not well-behaved" residual plot looks like.

32 How does a non-linear regression function show up on a residual vs. fits plot? The answer: The residuals depart from 0 in some systematic manner, such as being positive for small x values, negative for medium x values, and positive again for large x values. Any systematic (non-random) pattern is sufficient to suggest that the regression function is not linear.
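The positive, negative, positive pattern described above can be reproduced with a small sketch: fit a straight line to data that are actually quadratic and look at the signs of the residuals. The data (y = x²) and the helper name `fit_lsrl` are chosen purely for illustration.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical curved data: y = x^2 for x = 1..9.
x = list(range(1, 10))
y = [xi ** 2 for xi in x]

a, b = fit_lsrl(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# One sign per observation, in order of increasing x.
signs = "".join("+" if e > 0 else "-" for e in residuals)
print(signs)  # ++-----++
```

The residuals are positive at both ends and negative in the middle, which is a systematic pattern rather than random scatter around 0.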

37 2) The random errors are normally distributed and centered at zero
Histograms and P-P plots check the normality assumption.
• A histogram shows whether the residuals are centered at zero and bell shaped.
• Q-Q plots are better for judging the normal shape, because histogram bins can be manipulated, which can make the normal shape difficult to see in some cases.
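A minimal sketch of the points behind a normal Q-Q plot, assuming the plotting positions (i - 0.5) / n, which is one common convention and may differ from what a given software package uses. The residual values are hypothetical.

```python
import math
from statistics import NormalDist

def normal_qq_points(residuals):
    """Pair each sorted residual with a standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical residuals, invented for illustration only.
residuals = [-1.2, -0.7, -0.4, -0.1, 0.0, 0.2, 0.3, 0.6, 0.9, 0.4]

points = normal_qq_points(residuals)
# How close the Q-Q points are to a straight line.
qq_r = correlation([q for q, _ in points], [e for _, e in points])
print(f"correlation of Q-Q points: {qq_r:.3f}")
```

If the residuals are roughly normal, the points fall close to a straight line, so this correlation is near 1. Strong curvature in the points suggests skewness or heavy tails.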

38 Histograms of residuals
What to look for?
• Centered at zero
• Bell shaped
• No outliers
How strict?
What does it mean when the histogram is skewed?

40 r, R-squared, Se, and four-in-one residual plots

42 Look at this graph: are the residuals normal?

45 Here's the corresponding normal probability plot of the residuals:

48 Residuals vs. order plot
A residuals vs. order plot is a way of detecting a particular form of non-independence of the error terms, namely serial correlation. If the data are obtained in a time (or space) sequence, a residuals vs. order plot helps to see if there is any correlation between the error terms that are near each other in the sequence.
The plot is only appropriate if you know the order in which the data were collected! Highlight this, underline this, circle this, ..., er, on second thought, don't do that if you are reading it on a computer screen. Do whatever it takes to remember it though; it is a very common mistake made by people new to regression analysis.
So, what is this residuals vs. order plot all about? As its name suggests, it is a scatterplot with residuals on the y axis and the order in which the data were collected on the x axis. Here's an example of a well-behaved residuals vs. order plot:

49 Residuals vs. order
The residuals bounce randomly around the residual = 0 line, as we would hope. In general, residuals exhibiting normal random noise around the residual = 0 line suggest that there is no serial correlation.

50 Residuals vs. order
A residuals vs. order plot that exhibits a (positive) trend, as the following plot does, suggests that some of the variation in the response is due to time.
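The difference between residuals that bounce randomly and residuals that trend with order can also be summarized with the lag-1 autocorrelation of the residuals taken in collection order. This statistic is a supplementary check introduced here for illustration, not something the slides use, and both residual sequences are invented.

```python
def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation of residuals listed in collection order."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    # Products of each centered residual with the one just before it.
    numerator = sum(centered[t] * centered[t - 1] for t in range(1, n))
    denominator = sum(c ** 2 for c in centered)
    return numerator / denominator

# Hypothetical residuals in the order the data were collected.
random_like = [0.4, 0.3, -0.5, -0.2, 0.1, 0.5, -0.3, -0.4, 0.1]
trending = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

r_random = lag1_autocorrelation(random_like)
r_trend = lag1_autocorrelation(trending)
print(f"random-looking residuals: {r_random:.2f}")
print(f"trending residuals:       {r_trend:.2f}")
```

A value near 0 is consistent with no serial correlation, while a clearly positive value means neighbouring residuals tend to have the same sign, as they do when the plot shows a trend. As with the plot itself, this only makes sense if the collection order is known.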

51 R-squared (R²) and residual standard error (Se)
Residual analysis is more important than a high R².
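A short sketch of why a high R² alone is not enough, assuming the usual definitions R² = 1 - SSE/SST and Se = sqrt(SSE / (n - 2)). A straight line is fitted to hypothetical data that are actually quadratic (y = x²): R² comes out high, yet the residuals show a clear curved pattern.

```python
import math

def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical curved data: y = x^2 for x = 1..9.
x = list(range(1, 10))
y = [xi ** 2 for xi in x]
n = len(x)

a, b = fit_lsrl(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

mean_y = sum(y) / n
sse = sum(e ** 2 for e in residuals)        # unexplained variation
sst = sum((yi - mean_y) ** 2 for yi in y)   # total variation
r_squared = 1 - sse / sst
se = math.sqrt(sse / (n - 2))               # residual standard error

print(f"R-squared = {r_squared:.3f}")
print(f"Se        = {se:.3f}")
print("residual signs:", "".join("+" if e > 0 else "-" for e in residuals))
```

Here R² is above 0.95, but the residual signs run positive, negative, positive, so the linear model is still the wrong form for these data. Se measures the typical size of a residual in the units of y.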

52 Residual Activity https://www.causeweb.org/repository/StarLibrary/activities/miller2001/

