
1 Simple Linear Regression (OLS)

2 Types of Correlation: positive correlation, negative correlation, no correlation

3 Simple linear regression describes the linear relationship between an independent variable, plotted on the x-axis, and a dependent variable, plotted on the y-axis (independent variable X, dependent variable Y).

4 [Figure: scatter plot of Y against X]

5 [Figure: scatter plot of Y against X]

6 [Figure: scatter plot of Y against X with a fitted regression line]

7 [Figure: scatter plot of Y against X with residuals ε shown as vertical distances from the fitted line]

8 Fitting data to a linear model: Y = a + b X + ε, where a is the intercept, b the slope, and ε the residuals.

9 How do we fit data to a linear model? The Ordinary Least Squares (OLS) method.

10 Least Squares Regression. Model line: Ŷ = a + b X. Residual: ε = Y − Ŷ. Sum of squares of residuals: SSE = ∑(Y − Ŷ)² = ∑ε². We must find the values of a and b that minimise SSE.

11 Y = a + b X, with slope b = ∑(X − X̄)(Y − Ȳ) / ∑(X − X̄)² and intercept a = Ȳ − b X̄
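A minimal Python sketch of these two formulas (the data values here are invented for illustration):

```python
import numpy as np

# Invented example data.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Slope: b = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2)
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)

# Intercept: a = Ybar - b * Xbar
a = Y.mean() - b * X.mean()

print(f"Y = {a:.3f} + {b:.3f} X")
```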

12 Regression Coefficients

13 Required Statistics

14 Descriptive Statistics

15 Regression Statistics

16 [Venn diagram: circle Y = total variance to be explained by predictors (SST)]

17 [Venn diagram: circles Y and X₁; overlap = variance explained by X₁ (SSR), remainder of Y = variance NOT explained by X₁ (SSE)]

18 Regression Statistics

19 Coefficient of Determination (R²): used to judge the adequacy of the regression model.

20 R² = SSR / SST = 1 − SSE / SST, with 0 ≤ R² ≤ 1.

21 Regression Statistics Correlation measures the strength of the linear association between two variables.

22 Regression Statistics. Standard error of the regression model: s = √MSE = √(SSE / (n − 2)).

23 ANOVA to test significance of regression

Source      df    SS    MS        F          P-value
Regression  1     SSR   SSR / df  MSR / MSE  P(F)
Residual    n-2   SSE   SSE / df
Total       n-1   SST

If P(F) < α then we know that we get significantly better prediction of Y from the regression model than by just predicting the mean of Y.
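A sketch of how this table is computed for simple regression, reusing the invented data from the earlier example (scipy supplies the F distribution):

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(Y)

# Fit by OLS (slide 11 formulas).
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()
Y_hat = a + b * X

SST = np.sum((Y - Y.mean()) ** 2)  # total sum of squares
SSE = np.sum((Y - Y_hat) ** 2)     # residual (unexplained)
SSR = SST - SSE                    # explained by the regression

MSR = SSR / 1          # regression df = 1
MSE = SSE / (n - 2)    # residual df = n - 2
F = MSR / MSE
p = stats.f.sf(F, 1, n - 2)  # P(F): upper-tail probability

print(f"F = {F:.2f}, p = {p:.4g}")
```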

24 Hypothesis Tests for Regression Coefficients: H₀: β = 0 vs. H₁: β ≠ 0; test statistic t = b / SE(b) with n − 2 degrees of freedom.

25 Hypothesis Tests for Regression Coefficients

26 Hypothesis Tests on Regression Coefficients

27 Hypothesis Test for the Correlation Coefficient. H₀: ρ = 0; test statistic t₀ = r √(n − 2) / √(1 − r²). We would reject the null hypothesis if |t₀| > t(α/2, n − 2).
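A minimal sketch of this test (the α = 0.05 level and the data are assumptions for illustration):

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(Y)

r = np.corrcoef(X, Y)[0, 1]                    # sample correlation
t0 = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)    # test statistic
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 2)      # two-sided critical value

print(f"|t0| = {abs(t0):.2f}, t_crit = {t_crit:.2f}")
# Reject H0: rho = 0 when |t0| > t_crit.
```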

28 Diagnostic Tests for Regressions: expected distribution of residuals for a linear model with normally distributed residuals (errors).

29 Diagnostic Tests for Regressions: residuals for a non-linear fit.

30 Diagnostic Tests for Regressions: residuals for a quadratic function or polynomial.

31 Diagnostic Tests for Regressions: residuals are not homogeneous (increasing in variance).

32 Regression – important points 1. Ensure that the range of values sampled for the predictor variable is large enough to capture the full range of responses of the response variable.

33 [Figure: two scatter plots of Y against X contrasting a narrow vs. an adequate sampled range of X]

34 Regression – important points 2. Ensure that the distribution of predictor values is approximately uniform within the sampled range.

35 [Figure: two scatter plots of Y against X contrasting an uneven vs. an approximately uniform distribution of predictor values]

36 Assumptions of Regression 1. The linear model correctly describes the functional relationship between X and Y.

37 Assumptions of Regression 1. The linear model correctly describes the functional relationship between X and Y. [Figure: Y plotted against X]

38 Assumptions of Regression 2. The X variable is measured without error. [Figure: Y plotted against X]

39 Assumptions of Regression 3. For any given value of X, the sampled Y values are independent. 4. Residuals (errors) are normally distributed. 5. Variances are constant along the regression line.

40 Multiple Linear Regression (MLR)

41 The linear model with a single predictor variable X can easily be extended to two or more predictor variables.

42 [Venn diagram: circles Y, X₁, and X₂; unique variance explained by X₁, unique variance explained by X₂, common variance explained by X₁ and X₂, and variance of Y NOT explained by X₁ and X₂]

43 [Venn diagram: circles Y, X₁, and X₂ overlapping substantially, illustrating a "good" model]

44 Partial Regression Coefficients (slopes): the regression coefficient of X after controlling for (holding constant) the influence of all other predictors on both X and Y. Model: Y = a + b₁X₁ + b₂X₂ + … + bₖXₖ + ε, where a is the intercept and ε the residuals.

45 The matrix algebra of Ordinary Least Squares. Intercept and slopes: β = (XᵀX)⁻¹ XᵀY. Predicted values: Ŷ = Xβ. Residuals: ε = Y − Ŷ.
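The same recipe in NumPy, a sketch with two invented predictors (np.linalg.solve is used instead of an explicit inverse, which is the numerically safer choice):

```python
import numpy as np

# Invented data: two predictors.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y = np.array([3.0, 4.1, 7.9, 8.2, 11.1])

# Design matrix: a column of ones for the intercept, then the predictors.
X = np.column_stack([np.ones_like(X1), X1, X2])

beta = np.linalg.solve(X.T @ X, X.T @ Y)  # intercept and slopes: (X'X)^-1 X'Y
Y_hat = X @ beta                          # predicted values
resid = Y - Y_hat                         # residuals

print(beta)
```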

46 Regression Statistics How good is our model?

47 Regression Statistics. Coefficient of Determination (R²): used to judge the adequacy of the regression model.

48 Regression Statistics. Adjusted R² is not biased (unlike R², it does not automatically increase as predictors are added): R²adj = 1 − (1 − R²)(n − 1) / (n − k − 1), where n = sample size and k = number of independent variables.
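The formula as a small helper (the example values are invented):

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Penalise R^2 for the number of predictors k (n = sample size)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(0.95, n=30, k=3))  # ~0.944
```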

49 Regression Statistics. Standard error of the regression model: s = √MSE = √(SSE / (n − k − 1)).

50 ANOVA to test significance of regression. H₀: all slopes are zero; H₁: at least one slope is non-zero.

Source      df      SS    MS        F          P-value
Regression  k       SSR   SSR / df  MSR / MSE  P(F)
Residual    n-k-1   SSE   SSE / df
Total       n-1     SST

If P(F) < α then we know that we get significantly better prediction of Y from the regression model than by just predicting the mean of Y.

51 Hypothesis Tests for Regression Coefficients

52 Hypothesis Tests for Regression Coefficients

53 Diagnostic Tests for Regressions: expected distribution of residuals for a linear model with normally distributed residuals (errors).

54 Standardized Residuals: dᵢ = εᵢ / √MSE; if the model is adequate, these are approximately N(0, 1).
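A sketch of this calculation, reusing the simple-regression fit from the earlier examples (dividing by √MSE is the basic definition; leverage-adjusted studentized residuals are a common refinement not shown here):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(Y)

b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()
resid = Y - (a + b * X)

MSE = np.sum(resid**2) / (n - 2)
standardized = resid / np.sqrt(MSE)  # ~N(0, 1) if the model is adequate
print(standardized)
```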

55 Model Selection: avoiding predictors (Xs) that do not contribute significantly to model prediction.

56 Model Selection
- Forward selection: the ‘best’ predictor variables are entered, one by one.
- Backward elimination: the ‘worst’ predictor variables are eliminated, one by one.

57 Forward Selection

58 Backward Elimination
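A minimal sketch of backward elimination on p-values, using statsmodels; the α = 0.05 threshold and the synthetic data are assumptions for illustration, not part of the original slides:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic data: only X1 and X2 actually influence y.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 4)), columns=["X1", "X2", "X3", "X4"])
y = 2.0 + 1.5 * X["X1"] - 0.8 * X["X2"] + rng.normal(size=100)

alpha = 0.05                   # assumed significance threshold
predictors = list(X.columns)
while predictors:
    fit = sm.OLS(y, sm.add_constant(X[predictors])).fit()
    pvals = fit.pvalues.drop("const")  # slope p-values only
    worst = pvals.idxmax()             # least significant predictor
    if pvals[worst] <= alpha:
        break                          # everything left is significant
    predictors.remove(worst)           # eliminate the 'worst' and refit

print("Selected predictors:", predictors)
```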

59 Model Selection: The General Case. Compare the full model to a reduced model that drops r predictors using the partial F statistic F₀ = [(SSE_reduced − SSE_full) / r] / MSE_full. Reject H₀ (the dropped predictors contribute nothing) if F₀ > F(α, r, n − k − 1).

60 Multicollinearity
- The degree of correlation between the Xs.
- A high degree of multicollinearity produces unacceptable uncertainty (large variance) in regression coefficient estimates (i.e., large sampling variation).
- Estimates of the slopes are imprecise, and even the signs of the coefficients may be misleading.
- t-tests may fail to reveal significant factors.

61 Scatter Plot

62 Multicollinearity
- If the F-test for significance of regression is significant, but tests on the individual regression coefficients are not, multicollinearity may be present.
- Variance Inflation Factors (VIFs) are very useful measures of multicollinearity. If any VIF exceeds 5, multicollinearity is a problem.
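A sketch of a VIF check using statsmodels' variance_inflation_factor; the synthetic data (with X4 built to be nearly collinear with X1) is an assumption for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["X1", "X2", "X3"])
X["X4"] = X["X1"] + 0.1 * rng.normal(size=100)  # almost a copy of X1

exog = add_constant(X)  # VIFs are computed on the full design matrix
vifs = {col: variance_inflation_factor(exog.values, i)
        for i, col in enumerate(exog.columns) if col != "const"}
print(vifs)  # X1 and X4 should show VIFs well above 5
```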

63 Thank You!

