# Quantitative Business Analysis for Decision Making: Simple Linear Regression


## 403.7 Lecture Outline

- Scatter Plots
- Correlation Analysis
- Simple Linear Regression Model
- Estimation and Significance Testing
- Coefficient of Determination
- Confidence and Prediction Intervals
- Analysis of Residuals

## Regression Analysis

Regression analysis is used for modeling the mean of a “response” variable $Y$ as a function of “predictor” variables $X_1, X_2, \ldots, X_k$. When $k = 1$, it is called simple regression analysis.

## Random Sample

$Y$: response variable; $X$: predictor variable.

For each unit in a random sample of size $n$, the pair $(X, Y)$ is observed, resulting in a random sample: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.

## Scatter Plot

A scatter plot is a graphical display of the sample $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ as $n$ points in two dimensions. It suggests whether there is a relationship between $X$ and $Y$.

## A Scatter Plot Showing Linear Trend

[Figure: scatter plot showing a linear trend]

## A Scatter Plot Showing No Linear Trend

[Figure: scatter plot showing no linear trend]

## Modeling Linear Trend

- A perfect linear relationship between $Y$ and $X$ exists if $y = \beta_0 + \beta_1 x$. The coefficient $\beta_1$ is the slope: it quantifies the amount of change in $y$ corresponding to a one-unit change in $x$.
- There are no perfect linear relationships in the practical world.

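## Simple Linear Regression Model

Model: $Y = \beta_0 + \beta_1 x + \varepsilon$

- $\beta_0 + \beta_1 x$ is a linear function (nonrandom).
- $\varepsilon$ is the random error. It is assumed to be normally distributed with mean 0 and standard deviation $\sigma$. So $Y$ is normally distributed with mean $\beta_0 + \beta_1 x$ and standard deviation $\sigma$.
- $\beta_0$, $\beta_1$, and $\sigma$ are the parameters of the model.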
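To make the model concrete, here is a minimal simulation sketch: each response is a point on a straight line plus a normally distributed error. The function name `simulate_sample` and the parameter values are hypothetical, chosen only for illustration.

```python
import random

def simulate_sample(xs, beta0, beta1, sigma, seed=0):
    """Draw one y per x from Y = beta0 + beta1*x + error, error ~ Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [(x, beta0 + beta1 * x + rng.gauss(0.0, sigma)) for x in xs]

# Hypothetical parameter values, not taken from the lecture.
sample = simulate_sample(xs=[1, 2, 3, 4, 5], beta0=0.0, beta1=2.0, sigma=0.2)
for x, y in sample:
    print(x, round(y, 3))
```

Setting `sigma=0.0` removes the random error, so every point falls exactly on the line, which is the "perfect linear relationship" case.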

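## Estimation

Simple linear regression analysis estimates the mean of $Y$ (the linear trend) by $\hat{y} = a + bx$, where the least-squares estimates are

$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \quad \text{and} \quad a = \bar{y} - b\,\bar{x}.$$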
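As a minimal sketch of the estimation step, the standard least-squares formulas for the intercept and slope can be computed directly. The helper name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return least-squares (a, b) for the fitted line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # estimated slope
    a = y_bar - b * x_bar    # estimated intercept
    return a, b

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(xs, ys)
print(round(a, 3), round(b, 3))  # prints 0.05 1.99
```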

## Standard Deviation

The standard deviation ($s$) of the sample of $n$ points in the scatter plot around the estimated regression line is:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}$$
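A short sketch of this calculation, assuming the usual divisor of $n - 2$ for simple linear regression. The function name `residual_std` and the data are illustrative; the values of `a` and `b` are the least-squares estimates for this made-up data set.

```python
import math

def residual_std(xs, ys, a, b):
    """Standard deviation of the points around the line y_hat = a + b*x."""
    n = len(xs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # sum of squared residuals
    return math.sqrt(sse / (n - 2))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = 0.05, 1.99  # least-squares estimates for this data
s = residual_std(xs, ys, a, b)
print(round(s, 4))
```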

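## Testing the Slope of Linear Trend

For testing $H_0: \beta_1 = 0$ against $H_a: \beta_1 \neq 0$, compute the t-statistic and its p-value:

$$t = \frac{b}{s_b}, \qquad s_b = \frac{s}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$

The p-value is obtained from the $t$ distribution with $n - 2$ degrees of freedom.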
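The following sketch computes the t-statistic for the slope under the usual null hypothesis that the true slope is zero. The function name `slope_t_statistic` and the data are illustrative assumptions. The standard library has no $t$ distribution, so the p-value is left to a t-table or a statistics package.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (t, df): the slope estimate divided by its standard error, and n - 2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))   # standard deviation around the line
    se_b = s / math.sqrt(sxx)      # standard error of the slope
    return b / se_b, n - 2

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
t_stat, df = slope_t_statistic(xs, ys)
print(round(t_stat, 2), df)
```

A large absolute t-statistic relative to the $t$ distribution with `df` degrees of freedom gives a small p-value, which is evidence of a linear trend.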

## Coefficient of Determination: $R^2$

- $R^2$ quantifies how well the estimated model fits: it is the proportion of the variation in $Y$ explained by the estimated linear trend.
- As a rule of thumb, $R^2 > 85\%$ indicates a significant model.
- $R^2 < 85\%$: the model is perceived as inadequate.
- A low $R^2$ suggests a need for additional predictors for modeling the mean of $Y$.
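As an illustration, $R^2$ can be computed as one minus the ratio of the residual sum of squares to the total sum of squares, then compared with the 85% rule of thumb above. The function name `r_squared` and the data are made up for this sketch.

```python
def r_squared(xs, ys):
    """Proportion of the variation in y explained by the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation
    return 1 - sse / sst

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(xs, ys)
print(round(r2, 4), "meets 85% rule of thumb" if r2 > 0.85 else "below 85% rule of thumb")
```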

## Correlation Coefficient: $r$

In simple linear regression, the correlation coefficient $r$ is the square root of $R^2$, taken with the sign of the slope $b$. It is a number between $-1$ and $1$.

- The closer $r$ is to $-1$ or $1$, the stronger the linear trend.
- Its sign is positive for an increasing trend (slope $b$ is positive).
- Its sign is negative for a decreasing trend (slope $b$ is negative).
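The sign behaviour can be checked with a small sketch that computes $r$ directly from the sample using the standard Pearson formula. The function name `correlation` and the data are illustrative. Reversing the responses turns the increasing trend into a decreasing one and flips the sign of $r$.

```python
import math

def correlation(xs, ys):
    """Sample (Pearson) correlation coefficient between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r_up = correlation(xs, ys)          # increasing trend
r_down = correlation(xs, ys[::-1])  # same values in decreasing order
print(round(r_up, 4), round(r_down, 4), round(r_up ** 2, 4))
```

Squaring either value gives the same $R^2$, which is why the sign has to be taken from the slope.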

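## Confidence and Prediction Intervals

To estimate the mean of $Y$ by a confidence interval, or to predict the response $Y$, corresponding to the predictor value $x = x_0$:

1. Compute: $\hat{y}_0 = a + b x_0$
2. Compute: $\hat{y}_0 \pm t_{\alpha/2,\,n-2} \cdot \mathrm{SE}(\hat{y}_0)$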

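## What Is $\mathrm{SE}(\hat{y}_0)$?

That is, the standard error of $\hat{y}_0 = a + b x_0$, the fitted value at $x = x_0$.

For estimating the mean of $Y$ at $x_0$:

$$\mathrm{SE}(\hat{y}_0) = s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$

For predicting $Y$ at $x_0$:

$$\mathrm{SE}(\hat{y}_0) = s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$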
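The two intervals can be sketched as follows, using the standard textbook standard errors for simple linear regression. The function name `intervals_at`, the data, and the choice of $x_0$ are assumptions made for illustration. The critical value `t_crit` is passed in by the caller because the standard library has no $t$ distribution; 3.182 is the tabulated 97.5th percentile for 3 degrees of freedom, giving 95% intervals.

```python
import math

def intervals_at(xs, ys, x0, t_crit):
    """Return (y0_hat, confidence_interval, prediction_interval) at x = x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    y0 = a + b * x0                           # step 1: point estimate
    core = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = s * math.sqrt(core)             # for estimating the mean of Y
    se_pred = s * math.sqrt(1 + core)         # for predicting a single Y
    ci = (y0 - t_crit * se_mean, y0 + t_crit * se_mean)
    pi = (y0 - t_crit * se_pred, y0 + t_crit * se_pred)
    return y0, ci, pi

# Made-up data for illustration only; n - 2 = 3 degrees of freedom.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
y0, ci, pi = intervals_at(xs, ys, x0=3.5, t_crit=3.182)
print(round(y0, 3), [round(v, 3) for v in ci], [round(v, 3) for v in pi])
```

The prediction interval is always wider than the confidence interval at the same $x_0$, because it also accounts for the random error of an individual response.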

## Analysis of Residuals

Residuals are defined as $e_i = y_i - \hat{y}_i$, for $i = 1, \ldots, n$.

- Residual analysis is used to check the normality and homogeneity-of-variance assumptions on the random errors.
- A histogram or box plot of the residuals helps to ascertain whether the errors are normally distributed.
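As a small sketch, the residuals and the five numbers a box plot would display can be computed without any plotting library. The helper name `residuals` and the data are illustrative; `a` and `b` are the least-squares estimates for this made-up data set.

```python
import statistics

def residuals(xs, ys, a, b):
    """Observed y minus fitted value a + b*x, for each point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
e = residuals(xs, ys, a=0.05, b=1.99)  # least-squares estimates for this data

# Five-number summary: the values a box plot of the residuals would show.
q1, median, q3 = statistics.quantiles(e, n=4)
print([round(v, 2) for v in e])
print(round(min(e), 2), round(q1, 2), round(median, 2), round(q3, 2), round(max(e), 2))
```

A roughly symmetric summary centred near zero is consistent with normally distributed errors, although five points are far too few to judge in practice.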

## Analysis of Residuals (cont'd)

A plot of the residuals against the observed predictor values $x_i$ helps to ascertain the homogeneity assumption.

- Random appearance: the homogeneity-of-variance assumption is valid.
- Non-random appearance: the homogeneity assumption is not valid, and the variance depends on the predictor values.
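The visual check above can be complemented by a rough numeric one. The sketch below compares the spread of the residuals for small and large predictor values. This is an informal illustration, not a formal test and not a method from the lecture; the function name `spread_ratio` and both residual sets are made up.

```python
import statistics

def spread_ratio(xs, residuals):
    """Std dev of residuals in the upper half of x divided by that in the lower half."""
    pairs = sorted(zip(xs, residuals))       # order the residuals by predictor value
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[-half:]]
    return statistics.stdev(high) / statistics.stdev(low)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
# Hypothetical residuals with similar spread everywhere.
even = [0.2, -0.2, 0.1, -0.1, 0.2, -0.1, -0.2, 0.1]
# Hypothetical residuals whose spread grows with x (a fan shape).
fan = [0.1, -0.1, 0.2, -0.2, 0.6, -0.7, 1.0, -0.9]

print(round(spread_ratio(xs, even), 2))  # close to 1: spread looks homogeneous
print(round(spread_ratio(xs, fan), 2))   # well above 1: variance depends on x
```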