2 Purpose of Regression Analysis Regression analysis procedures have as their primary purpose the development of an equation that can be used for predicting values on some Dependent Variable, Y, given Independent Variables, X, for all members of a population.
3 Purpose of Linear Relationship One of the most important functions of science is the description of natural phenomena in terms of ‘functional relationships’ between variables. When the value of a variable Y is found to depend on the value of another variable X, so that for every value of X there is a corresponding value of Y, then Y is said to be a ‘function’ of X.
4 Example of Linear Relationship If one is given a temperature value on the Centigrade scale (represented by X), then the corresponding value on the Fahrenheit scale (represented by Y) can be calculated by the formula: Y = 32 + (9/5)X. If the Centigrade temperature is 10, the Fahrenheit temperature is calculated to be: Y = 32 + (9/5)(10) = 32 + 18 = 50. Similarly, if the Centigrade temperature is 20, the Fahrenheit temperature must be: Y = 32 + (9/5)(20) = 32 + 36 = 68. We can plot this relationship on the usual rectangular system of coordinates.
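The temperature example can be checked with a few lines of code. This is only a sketch: it uses the standard conversion Y = 32 + (9/5)X, and the function name `celsius_to_fahrenheit` is an illustrative choice, not something from the slides.

```python
def celsius_to_fahrenheit(x):
    """Return Y = 32 + (9/5) * X, with X in degrees Centigrade."""
    return 32 + 9 * x / 5


# The two worked values from the slide: X = 10 and X = 20.
for x in (10, 20):
    print(x, celsius_to_fahrenheit(x))
```

Because every X maps to exactly one Y and the points fall on a straight line, this is a perfect linear relationship with intercept 32 and slope 9/5.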
5 Linear Equation Any equation of the following form will generate a straight line: Y = a + bX, where Y is the dependent variable, a is the intercept, b is the slope of the line, and X is the independent variable. A straight line is defined by two terms: slope and intercept. The slope (b) reflects the angle and direction of the regression line. The intercept (a) is the point at which the regression line intersects the Y axis.
6 Regression and Prediction As a university admissions officer, what GPA would you predict for a student who earns a score of 650 on SAT-V? If the relationship between X and Y is not perfect, you should attach error to your prediction. Correlation and Regression: determining the line of best fit, or regression line, using the least squares criterion.
7 Selection of Regression Line Residual, or error of prediction = (Y - Y’), which can be positive or negative. The regression line, Y’ = a + bX, is chosen so that the sum of the squared prediction errors for all cases, ∑(Y - Y’)², is as small as possible.
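The least squares criterion can be illustrated by comparing the sum of squared residuals for several candidate lines. The sketch below uses a small made-up data set (hypothetical, not from the slides), and `sum_squared_error` is an illustrative helper name.

```python
def sum_squared_error(a, b, xs, ys):
    """Sum of squared residuals, sum((Y - Y')**2), for the line Y' = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 6]

# For these four points the least squares line has intercept 0.5 and slope 1.4.
best = sum_squared_error(0.5, 1.4, xs, ys)

# Two other plausible-looking lines, for comparison.
other_1 = sum_squared_error(0.0, 1.5, xs, ys)
other_2 = sum_squared_error(1.0, 1.25, xs, ys)

print(best, other_1, other_2)
```

Any line other than the least squares line gives a larger sum of squared residuals for the same data, which is what "as small as possible" means in the criterion.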
9 Calculation of Regression Line Calculate the deviation of each Y from the average Y.
10 Calculation of Regression Line Continued Calculate the deviation of each X from the average X.
11 Calculation of Regression Line Continued Calculate the product of the deviations of X and Y.
13 Calculation of Regression Line Continued The slope is obtained from the standard deviation of Y, the standard deviation of X, and the correlation of X and Y: b = r (s_Y / s_X).
15 Calculation of Regression Line Continued a = 1.42, b = 0.0021, so Y’ = 1.42 + 0.0021X
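The calculation steps (deviations from the averages, their products, then slope and intercept) can be sketched as follows. The twelve (SAT, GPA) pairs are an assumption: they are read from the deck's table of predicted values and residuals as best they can be recovered from the flattened text. The function name `fit_line` is illustrative.

```python
from math import sqrt

# Assumed data: (SAT, GPA) pairs recovered from the slide table.
sat = [400, 350, 500, 400, 450, 550, 550, 600, 650, 650, 700, 750]
gpa = [1.60, 2.00, 2.20, 2.80, 2.80, 2.60, 3.20, 2.00, 2.40, 3.40, 2.80, 3.00]


def fit_line(xs, ys):
    """Return intercept a, slope b, and correlation r for Y' = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]          # deviation from average X
    dev_y = [y - mean_y for y in ys]          # deviation from average Y
    sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))  # products of deviations
    sum_xx = sum(dx * dx for dx in dev_x)
    sum_yy = sum(dy * dy for dy in dev_y)
    b = sum_xy / sum_xx
    a = mean_y - b * mean_x
    r = sum_xy / sqrt(sum_xx * sum_yy)
    return a, b, r


a, b, r = fit_line(sat, gpa)

# Same slope via standard deviations and correlation: b = r * (s_Y / s_X).
n = len(sat)
s_x = sqrt(sum((x - sum(sat) / n) ** 2 for x in sat) / (n - 1))
s_y = sqrt(sum((y - sum(gpa) / n) ** 2 for y in gpa) / (n - 1))
b_from_sd = r * s_y / s_x

print(round(a, 2), round(b, 4), round(r, 2))
```

With these pairs the unrounded slope is about 0.0021 and the intercept about 1.43. The slide's a = 1.42 appears to come from computing the intercept with the slope already rounded to 0.0021.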
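16 Calculation of Predicted Values and Residuals

| Y (GPA) | X (SAT) | Y’ | Y - Y’ |
|---|---|---|---|
| 1.60 | 400.00 | 2.26 | -0.66 |
| 2.00 | 350.00 | 2.16 | -0.16 |
| 2.20 | 500.00 | 2.47 | -0.27 |
| 2.80 | 400.00 | 2.26 | 0.54 |
| 2.80 | 450.00 | 2.37 | 0.43 |
| 2.60 | 550.00 | 2.58 | 0.02 |
| 3.20 | 550.00 | 2.58 | 0.62 |
| 2.00 | 600.00 | 2.68 | -0.68 |
| 2.40 | 650.00 | 2.78 | -0.38 |
| 3.40 | 650.00 | 2.78 | 0.62 |
| 2.80 | 700.00 | 2.89 | -0.09 |
| 3.00 | 750.00 | 2.99 | 0.01 |

Sum: Y = 30.80, Y - Y’ = 0.00. Average: Y = 2.57, X = 545.83. Y’ = 1.42 + 0.0021X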
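Predicted values and residuals follow directly from the fitted equation. The sketch below assumes the coefficients a = 1.42 and b = 0.0021 given in the deck and uses a handful of (GPA, SAT) rows from the table; `predict_gpa` and `residual` are illustrative names.

```python
A = 1.42     # intercept from the slides
B = 0.0021   # slope from the slides


def predict_gpa(sat):
    """Predicted GPA, Y' = a + b*X, for a given SAT score X."""
    return A + B * sat


def residual(gpa, sat):
    """Error of prediction, Y - Y'."""
    return gpa - predict_gpa(sat)


# A few (GPA, SAT) rows from the table.
cases = [(1.60, 400), (2.00, 350), (2.20, 500), (3.00, 750)]
for gpa, sat in cases:
    print(f"{sat:4d}  {gpa:.2f}  {predict_gpa(sat):.2f}  {residual(gpa, sat):+.2f}")
```

The same function answers the admissions question posed earlier: `predict_gpa(650)` gives roughly 2.78, the value the table shows for an SAT score of 650.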
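18 Plot of Data [Figure: scatterplot of the data with the fitted regression line.] The intercept is the point where the regression line crosses the Y axis. The slope shows the change in Y associated with a change of one unit in X. The regression line shows the predicted values. The difference between the predicted and observed values is the residual.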
19 Calculation of Regression Line Using Standard Deviations Predicted weight = 811 + 9 × Gestation days
20 Relationship between Weight & Gestation Days Regression equation: Y’ = 811 + 9X, where 811 is the intercept. Predicted weight = 811 + 9 × Gestation days
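21 Predicting Weight from Gestation Days

| If a baby’s gestation is… | Add intercept | Plus coefficient TIMES gestation | Predicted weight |
|---|---|---|---|
| 250 | 811 | + 9 × (250) | 3061 |
| 260 | 811 | + 9 × (260) | 3151 |
| 270 | 811 | + 9 × (270) | 3241 |
| 280 | 811 | + 9 × (280) | 3331 |
| 300 | 811 | + 9 × (300) | 3511 |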
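The prediction table can be reproduced with a one-line function. This sketch assumes the intercept 811 and coefficient 9 that the table's arithmetic implies; the name `predicted_weight` is illustrative, and the units of weight are left as in the slides.

```python
INTERCEPT = 811   # intercept used in the table
COEFFICIENT = 9   # change in predicted weight per additional gestation day


def predicted_weight(gestation_days):
    """Predicted weight = intercept + coefficient * gestation days."""
    return INTERCEPT + COEFFICIENT * gestation_days


for days in (250, 260, 270, 280, 300):
    print(days, predicted_weight(days))
```

Each additional day of gestation raises the predicted weight by the coefficient, which is why the rows for 250 through 280 days step up by 90.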
22 Sources of Variation The sum of squares of the dependent variable is partitioned into two components: one due to regression (explained) and one due to residual (unexplained). Note the similarities between ANOVA and regression.
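The partition can be verified numerically: for a least squares line, the total sum of squares equals the regression part plus the residual part. The sketch below uses a small hypothetical data set (not from the slides), and `partition_sum_of_squares` is an illustrative name.

```python
def partition_sum_of_squares(xs, ys):
    """Return (SS total, SS regression, SS residual) for the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)       # explained
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained
    return ss_total, ss_regression, ss_residual


# Hypothetical toy data, used only to show that the two parts add up.
ss_total, ss_regression, ss_residual = partition_sum_of_squares(
    [1, 2, 3, 4], [2, 3, 5, 6]
)
print(ss_total, ss_regression, ss_residual)
```

This mirrors the between-groups and within-groups split in ANOVA, with regression playing the role of the explained source.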
24 Testing Statistical Significance of Variance Explained

| Source of variation | SS | df |
|---|---|---|
| Regression | 0.80 | 1 |
| Residual | 2.40 | 10 |
| Total | 3.20 | 11 |
25 Testing Statistical Significance of Variance Explained Continued

| Source of variation | SS | df | MS |
|---|---|---|---|
| Regression | 0.80 | 1 | 0.80 |
| Residual | 2.40 | 10 | 0.24 |
| Total | 3.20 | 11 | |
26 Testing Statistical Significance Continued

| Source of variation | SS | df | MS | F | Fα |
|---|---|---|---|---|---|
| Regression | 0.80 | 1 | 0.80 | 3.33 | 4.90 |
| Residual | 2.40 | 10 | 0.24 | | |
| Total | 3.20 | 11 | | | |

Testing the proportion of variance due to regression: H0: R² = 0; Ha: R² ≠ 0. Since F < Fα, fail to reject H0.
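The F ratio in the table follows from the sums of squares and degrees of freedom. The sketch below uses the values shown on the slides (SS of 0.80 and 2.40, df of 1 and 10, critical value 4.90); the function name `f_test` is illustrative.

```python
def f_test(ss_regression, df_regression, ss_residual, df_residual, f_critical):
    """Return (MS regression, MS residual, F, reject H0?) for the ANOVA table."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    f_ratio = ms_regression / ms_residual
    return ms_regression, ms_residual, f_ratio, f_ratio > f_critical


# Values as given in the slide table; 4.90 is the critical value shown there.
ms_reg, ms_res, f_ratio, reject_h0 = f_test(0.80, 1, 2.40, 10, 4.90)
print(round(ms_reg, 2), round(ms_res, 2), round(f_ratio, 2), reject_h0)
```

Because the computed F is below the critical value, the decision is to fail to reject H0, matching the conclusion on the slide.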
27 Testing Statistical Significance of Regression Coefficient B. Testing the regression coefficient: H0: β = 0; Ha: β ≠ 0. Since p > α, fail to reject H0.
29 Interpretation of Standard Error of Estimate The average amount of error in predicting GPA scores is 0.49. The smaller the standard error of estimate, the more accurate the predictions are likely to be.
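The 0.49 figure is consistent with the usual formula for the standard error of estimate in simple regression, the square root of the residual sum of squares divided by n - 2. The sketch below assumes the residual SS of 2.40 and the 12 cases implied by the ANOVA table (total df = 11); `standard_error_of_estimate` is an illustrative name.

```python
from math import sqrt


def standard_error_of_estimate(ss_residual, n):
    """sqrt(SS residual / (n - 2)) for a simple regression with n cases."""
    return sqrt(ss_residual / (n - 2))


# SS residual = 2.40 and n = 12 are taken from the ANOVA table in the deck.
see = standard_error_of_estimate(2.40, 12)
print(round(see, 2))
```

Note that this is the square root of the residual mean square (0.24) from the same table.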
33 Assumptions Continued X and Y are normally distributed. The relationship between X and Y is linear and not curved. The variation of Y at particular values of X is not proportional to X. There is negligible error in measurement of X.
34 The Use of Simple Regression Answering research questions and testing hypotheses; making predictions about some outcome or dependent variable; assessing an instrument’s reliability; assessing an instrument’s validity.
35 How to conduct Regression Analysis Take Home Lesson