# Topics: Regression Simple Linear Regression: one dependent variable and one independent variable Multiple Regression: one dependent variable and two or.

Topics: Regression

- Simple Linear Regression: one dependent variable and one independent variable
- Multiple Regression: one dependent variable and two or more independent variables

Correlation

A correlation describes a relationship between two variables. Correlation tries to answer the following questions:

- What is the relationship between variable X and variable Y?
- How are the scores on one measure associated with scores on another measure?
- To what extent do the high scores on one variable go with the high scores on the second variable?

Simple Linear Regression

Understanding relationships between variables:

- Prediction
- Explanation

Design Requirements and Assumptions

- Two continuous variables
- Variables are linearly related
- Random sampling
- Independence
- Bivariate normality
- N >= 30

How Used in Making Predictions

The Regression Coefficient? What Slope? What Altitude?

Fitting the Regression Line: The Best Fit (Least Squares)

$Y' = a + b_y X$

The predicted value of Y (Y') for a value of X is computed by:

- Multiplying a score (X) by the regression coefficient ($b_y$)
- Adding the regression constant (a) to this product

The prediction of Y from X is based on the linear relationship of X and Y, so that errors of prediction are minimized.

Least Squares Fit: Visual

[Figure: the regression line, placed where the average squared distance of the points from the line is minimized]

Minimizing Prediction Error: What that Means (For Math Types)
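In the notation of the regression line $Y' = a + b_y X$, the least-squares criterion is usually stated as below. The index $i$ and the sample size $N$ are notation assumed here for illustration, not taken from the slides.

$$
\min_{a,\, b_y} \sum_{i=1}^{N} \left( Y_i - Y'_i \right)^2
= \min_{a,\, b_y} \sum_{i=1}^{N} \left( Y_i - a - b_y X_i \right)^2
$$

Each term is the squared error of prediction for one case, so the best-fitting line is the one with the smallest total squared error.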

The Regression Coefficient: Close Your Eyes if You Don't Want the Derivation

$b_y = r_{xy}\,(s_y / s_x)$

- $b_y$ = regression coefficient
- $r_{xy}$ = correlation between X and Y
- $s_y$ = standard deviation of Y
- $s_x$ = standard deviation of X

Compute $b_y$: divide the standard deviation of Y ($s_y$) by the standard deviation of X ($s_x$), then multiply by the Pearson correlation ($r_{xy}$) between X and Y.

The Constant (a): More Math

Regression constant (a): the altitude of the regression line; the value where the regression line intercepts the Y axis at X = 0 (the Y intercept).

$a = \bar{Y} - b_y \bar{X}$

- $a$ = the regression constant
- $\bar{Y}$ = mean of Y
- $b_y$ = regression coefficient
- $\bar{X}$ = mean of X

Compute a: multiply $\bar{X}$ (mean of X) by the regression coefficient ($b_y$), then subtract that product from $\bar{Y}$ (mean of Y).
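The two formulas for the regression coefficient and the regression constant can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own computation: the scores in `xs` and `ys` are invented, the function names are hypothetical, and sample standard deviations (N - 1 in the denominator) are assumed for $s_x$ and $s_y$.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def regression_coefficient(xs, ys):
    """b_y = r_xy * (s_y / s_x)."""
    return pearson_r(xs, ys) * stdev(ys) / stdev(xs)


def regression_constant(xs, ys):
    """a = mean(Y) - b_y * mean(X)."""
    return mean(ys) - regression_coefficient(xs, ys) * mean(xs)


# Hypothetical scores, invented for illustration (not the GPA data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b_y = regression_coefficient(xs, ys)
a = regression_constant(xs, ys)
print(f"b_y = {b_y:.2f}, a = {a:.2f}")  # b_y = 0.60, a = 2.20
```

Because the line passes through the point (mean of X, mean of Y), the constant follows directly once the coefficient is known.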

Plotting Regression Line

We need to compute two predicted scores:

- For X (undergrad GPA) = 2.75: $Y' = a + b_y X = 2.93 + .24(2.75) = 3.59$
- For X (undergrad GPA) = 3.60: $Y' = a + b_y X = 2.93 + .24(3.60) = 3.79$

Draw the regression line through the scatter plot using these two points.
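The two predicted scores can be checked with a short sketch. The constant 2.93 and coefficient .24 come from the slide; the function name `predict_y` is hypothetical.

```python
A = 2.93    # regression constant (a) from the slide
B_Y = 0.24  # regression coefficient (b_y) from the slide


def predict_y(x):
    """Predicted score Y' = a + b_y * X for an undergraduate GPA x."""
    return A + B_Y * x


for x in (2.75, 3.60):
    print(f"X = {x:.2f} -> Y' = {predict_y(x):.2f}")
# X = 2.75 -> Y' = 3.59
# X = 3.60 -> Y' = 3.79
```

Any two distinct X values would define the same line; these two are the ones used on the slide.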

Plotting the Regression Line: Visual

Errors of Prediction

Standard Error of Estimate

- The magnitude of the error made in estimating Y from X: a measure of dispersion around the regression line
- The average error of prediction
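The slides give no formula for the standard error of estimate. A common textbook definition, assumed here, is the square root of the sum of squared errors of prediction divided by N - 2 (some texts divide by N instead). The data, the fitted values of `a` and `b_y`, and the function name are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys, a, b_y):
    """sqrt(sum((Y - Y')^2) / (N - 2)), assuming the N - 2 convention."""
    residuals = [y - (a + b_y * x) for x, y in zip(xs, ys)]
    ss_residual = sum(e ** 2 for e in residuals)
    return sqrt(ss_residual / (len(xs) - 2))


# Hypothetical scores, invented for illustration (not the GPA data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b_y = 2.2, 0.6  # least-squares line for these hypothetical scores

s_est = standard_error_of_estimate(xs, ys, a, b_y)
print(f"{s_est:.3f}")  # 0.894
```

The result is in the units of Y, so it can be read as a typical size of the error made when predicting Y from X.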

The Standard Error of Estimate: A Visual Representation

[Figure: plot of Graduate GPA and Undergraduate GPA, with both axes running from 3.00 to 4.00]

Standard Error of Estimate: Another Visual Representation

Is the prediction worth pursuing?

- Standard error
- Amount of variance explained by X
- Testing the regression coefficient (b) for significance

Explaining Variance: How much?

[Figure: the variance of Y shown as total variance, predicted variance, and unpredicted variance]

Assessing Prediction Accuracy: Explaining Variance

- Total variance = predicted variance + residual (unexplained) variance
- Coefficient of determination ($r^2$): proportion of total variance in Y that has been predicted by variable X ($r^2 = s_{y'}^2 / s_y^2$)
  - Our example: $r = .56$, so $r^2 = .3136$
- Coefficient of non-determination ($1 - r^2$): proportion of total variance in Y that is not predicted by X
  - Our example: $1 - r^2 = 1 - .31 = .69$
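The ratio of predicted variance to total variance can be sketched as follows. The data and the fitted line are hypothetical; the last lines repeat the slide's own arithmetic for r = .56. The function name is an assumption for illustration.

```python
from statistics import pvariance


def coefficient_of_determination(ys, predicted):
    """r^2 as the variance of the predicted scores over the variance of Y.

    This ratio equals r^2 only when `predicted` comes from the
    least-squares line for the same data.
    """
    return pvariance(predicted) / pvariance(ys)


# Hypothetical scores, invented for illustration (not the GPA data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b_y = 2.2, 0.6  # least-squares line for these hypothetical scores
predicted = [a + b_y * x for x in xs]

r_squared = coefficient_of_determination(ys, predicted)
print(f"explained: {r_squared:.2f}, unexplained: {1 - r_squared:.2f}")
# explained: 0.60, unexplained: 0.40

# The slide's example: r = .56
r = 0.56
print(f"{r ** 2:.4f} {1 - r ** 2:.4f}")  # 0.3136 0.6864
```

The explained and unexplained proportions always sum to 1, which is the total-variance identity in the first bullet.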

Proportion of Explained (Predicted) and Unexplained (Residual) Variance

[Figure: X and Y with $r_{xy} = .56$; $r^2 = .31$ (31%) explained variance, $(1 - r^2) = .69$ (69%) unexplained variance]

t-Test for Individual Regression Coefficients ($b_y$)

- $H_0: \beta = 0$ (where $\beta$ is the population regression coefficient)
- $H_1: \beta \neq 0$

Compute a t statistic: $t = (b - \beta)/s_b = b/s_b$, that is, how many standard errors b lies from the population parameter hypothesized under the null hypothesis ($\beta = 0$).

t-Test of b: Our Example

- $t = .24/.12 = 2.00$
- Set alpha at .05 (two-tailed)
- Figure out df (N - 2): 8
- t critical (.05/2, 8) = 2.306
- Decision: t observed (2.00) < t critical (2.306), so do not reject the null hypothesis
- Conclusion: we cannot conclude that the slope is significantly different from 0 in the population.
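The decision rule in this example can be written out as a short sketch. The values of b, its standard error, and the critical value 2.306 are taken from the slide; the function name is hypothetical, and the critical value is hard-coded rather than looked up from a t distribution.

```python
def t_statistic(b, s_b, beta_null=0.0):
    """t = (b - beta) / s_b, with beta = 0 under the null hypothesis."""
    return (b - beta_null) / s_b


b = 0.24            # regression coefficient from the slide
s_b = 0.12          # standard error of b from the slide
t_critical = 2.306  # two-tailed, alpha = .05, df = 8 (value from the slide)

t_observed = t_statistic(b, s_b)
reject_null = abs(t_observed) > t_critical
print(f"t = {t_observed:.2f}, reject H0: {reject_null}")
# t = 2.00, reject H0: False
```

Taking the absolute value of t makes the comparison two-tailed, matching the alternative hypothesis that the slope differs from 0 in either direction.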

Our Conclusion: Do not reject the null hypothesis

Warnings

- Simple regression assumes a straight-line relationship
- Outliers can control regression results
- Random samples are assumed for making proper generalizations
- Regression is correlational and does not show that X causes Y
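The warning about outliers can be illustrated with a small sketch. The scores and the single extreme case are invented for illustration, and the function name is hypothetical; the slope is computed as the sum of cross-products over the sum of squares of X, which is algebraically the same as $r_{xy}(s_y/s_x)$.

```python
from statistics import mean


def slope(xs, ys):
    """Least-squares slope: sum of cross-products over sum of squares of X."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical scores, invented for illustration (not the GPA data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope_without = slope(xs, ys)
# Add one extreme, hypothetical case at (X = 10, Y = 0).
slope_with = slope(xs + [10], ys + [0])

print(f"{slope_without:.2f} {slope_with:.2f}")  # 0.60 -0.34
```

In this made-up data, a single extreme case is enough to reverse the sign of the slope.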
