where:
y = dependent variable (value depends on x)
a = y-intercept (value of y when x = 0)
b = slope (rate of change: delta y divided by delta x)
x = independent variable
Assumptions: Linearity, Independence of Error, Homoscedasticity, Normality
Linearity: The most fundamental assumption is that the model fits the situation [i.e., the Y variable is linearly related to the X variable].
Independence of Error: The error (residual) is independent for each value of X. [Residual = observed - predicted]
Homoscedasticity: The variation around the line of regression is constant for all values of X.
Normality: The values of Y are normally distributed at each value of X.
Diagnostic Checking
Linearity: Examine a scatter plot of residuals versus fitted values [Y hat] for evidence of nonlinearity.
Independence: Plot residuals in time order and look for patterns.
Homoscedasticity: Examine scatter plots of residuals versus fitted values [Y hat] and residuals versus time order, and look for changing scatter.
Normality: Examine a histogram of residuals and look for departures from the normal curve.
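The diagnostic checks above all start from the same two series: fitted values and residuals. The following is a minimal sketch of how those series might be prepared before plotting. The data are made up for illustration (they are not from the case study), and the intercept and slope are assumed to have been estimated already for that data.

```python
# Made-up illustrative data, listed in time order (not from the case study).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # assumed: intercept and slope already estimated for this data

fitted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, fitted)]  # observed - predicted

# Linearity / homoscedasticity: the pairs to plot as residuals versus fitted.
for y_hat, e in zip(fitted, residuals):
    print(f"fitted={y_hat:.2f}  residual={e:+.2f}")

# Independence: the same residuals against their time order.
for t, e in enumerate(residuals, start=1):
    print(f"t={t}  residual={e:+.2f}")

# Normality: a crude text histogram, binning residuals to the nearest 0.5.
bin_width = 0.5
counts = {}
for e in residuals:
    center = round(e / bin_width) * bin_width
    counts[center] = counts.get(center, 0) + 1
for center in sorted(counts):
    print(f"{center:+.1f} | {'*' * counts[center]}")
```

With only five points the histogram says very little; in practice the same series would be passed to a plotting tool and judged by eye.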
Goal: Develop a statistical model that can predict the values of a dependent (response) variable based upon the values of the independent (explanatory) variable(s).
Simple Regression: A statistical model that utilizes one quantitative independent variable “X” to predict the quantitative dependent variable “Y.”
Mini-Case: Since a new housing complex is being developed in Carmichael, management is under pressure to open a new pie restaurant. Assuming that population and annual sales are related, a study was conducted to predict expected sales.
Types of Models: No relationship between X and Y; Positive linear relationship; Negative linear relationship
Method of Least Squares: Find the straight line that best fits the data. This is the straight line for which the differences between the actual values (Y) and the values predicted from the fitted line of regression (Y-hat) are as small as possible, measured by the sum of the squared differences.
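As a minimal sketch of the least-squares idea, the slope and intercept of the best-fitting line can be computed from the deviations of X and Y about their means. The function name `fit_line` and the data below are made up for illustration and are not from the case study.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    # Intercept: the fitted line passes through the point (x_bar, y_bar).
    a = y_bar - b * x_bar
    return a, b


# Made-up illustrative data (not from the case study).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(f"y_hat = {a:.2f} + {b:.2f} * x")  # prints: y_hat = 2.20 + 0.60 * x
```

Any other choice of intercept and slope would give a larger sum of squared differences between Y and Y-hat for this data.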
Measures of Variation: Explained, Unexplained, Total
Explained Variation: Sum of Squares due to Regression [SSR], $\sum (\hat{Y} - \bar{Y})^2$
Unexplained Variation: Sum of Squares Error [SSE], $\sum (Y_{\text{obs}} - \hat{Y})^2$
Total Variation: Sum of Squares Total [SST], $\sum (Y_{\text{obs}} - \bar{Y})^2$
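The three measures of variation can be checked numerically. The sketch below uses made-up data (not from the case study), fits the least-squares line, and computes each sum of squares. For a least-squares line with an intercept, the total variation splits exactly into the explained and unexplained parts, SST = SSR + SSE.

```python
# Made-up illustrative data (not from the case study).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar
fitted = [a + b * x for x in xs]

ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)           # explained
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, fitted))   # unexplained
sst = sum((y - y_bar) ** 2 for y in ys)                       # total

print(f"SSR={ssr:.2f}  SSE={sse:.2f}  SST={sst:.2f}")
# prints: SSR=3.60  SSE=2.40  SST=6.00
```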
H0: There is no linear relationship between the dependent variable and the explanatory variable.
Hypotheses: $H_0: \beta = 0$, $H_1: \beta \neq 0$ (where $\beta$ is the population slope), or H0: No relationship exists, H1: A relationship exists.
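One common way to test these hypotheses is a t test on the estimated slope, which the slides do not spell out; the sketch below is an assumption about how that test might be carried out. The data are made up (not from the case study), and the critical value is taken from a standard t table for a two-tailed test at the 0.05 level with n - 2 = 3 degrees of freedom.

```python
import math

# Made-up illustrative data (not from the case study).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Unexplained variation and the standard error of the slope.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
df = n - 2                          # degrees of freedom in simple regression
se_b = math.sqrt((sse / df) / s_xx)

t_stat = b / se_b                   # test statistic under H0: slope = 0
t_crit = 3.182                      # t table: two-tailed, 0.05 level, 3 df
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
# prints: t = 2.121, reject H0: False
```

For this small made-up sample the slope is positive, but the evidence is not strong enough to reject H0 at the 0.05 level.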