
1 Regression 1. Michael J. Kalsher. PSYCHOMETRICS MGMT 6971 / PSYC 4310 Advanced Experimental Methods and Statistics. © 2014, Michael Kalsher

2 Introduction to Regression
If two variables covary, we should be able to predict the value of one variable from the other. Correlation only tells us how much two variables covary. In regression, we construct an equation that uses one or more variables (the IV(s), or predictor variable(s)) to predict another variable (the DV, or outcome variable).
- Predicting from one IV = Simple Regression
- Predicting from multiple IVs = Multiple Regression

3 Simple Regression: The Model
The general equation: Outcome_i = (model) + error_i
In regression the model is linear: we summarize a data set with a straight line. The regression line is determined through the method of least squares.

4 The Regression Line
Model = "the things that define the line we fit to the data." Any straight line can be defined by:
- the slope of the line (b1)
- the point at which the line crosses the ordinate, termed the intercept of the line (b0)
The general equation, Outcome_i = (model) + error_i, becomes Y_i = (b0 + b1*X_i) + ε_i.
b1 and b0 are termed regression coefficients. b1 tells us what the model looks like (its shape); b0 tells us where the model sits in geometric space. ε_i is the residual term and represents the difference between participant i's predicted and obtained scores.

5 Method of Least Squares: Finding the Line of Best Fit
The method of least squares selects the line (the regression line) that has the lowest sum of squared differences between the observed data and the line, and therefore best represents the observed data. Once we determine the slope (b1) and intercept (b0) of the line, we can insert different values of our predictor variable into the model to estimate the value of the outcome variable.
[Figure: scatterplot of individual data points with the fitted regression line; labels: slope b1 = dy/dx, intercept (constant) b0, residuals (errors in prediction); the sum of the residuals is 0.]
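A minimal sketch of how the least-squares line could be computed by hand, in Python (illustrative only; the numbers below are made up and are not from Record1.sav):

```python
import numpy as np

# Hypothetical data: advertising budget (x) and record sales (y), both in thousands.
x = np.array([10.0, 50.0, 120.0, 300.0, 450.0])
y = np.array([150.0, 160.0, 180.0, 210.0, 250.0])

# Least-squares estimates: b1 = sum of cross-products / sum of squares of x;
# b0 is chosen so the line passes through the point of means (x-bar, y-bar).
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)
print("slope:", b1, "intercept:", b0)
print("sum of residuals:", residuals.sum())  # ~0, as the slide notes
```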

6 Assessing Goodness of Fit
Even the best-fitting line can be a lousy fit to the data, so we need to assess the goodness of fit of the model against our best baseline estimate: the mean. Let's consider an example (Record1.sav):
- A music mogul wants to know how many records her company will sell if she spends £100,000 on advertising.
- In the absence of a model of the relationship between advertising and sales, the best guess would be the mean number of record sales (say 200,000), regardless of the amount of advertising.
- So, as a basic strategy for predicting the outcome, we could use the mean, because on average it is a good guess.

7 Assessing Goodness of Fit
- SS_T (total sum of squares): represents the total amount of difference present when the most basic model is applied to the data. SS_T uses the differences between the observed data and the mean value of Y.
- SS_R (residual sum of squares): represents the degree of inaccuracy when the best model is fitted to the data. SS_R uses the differences between the observed data and the regression line.
- SS_M (model sum of squares): shows the reduction in inaccuracy resulting from fitting the regression model to the data. SS_M uses the differences between the mean value of Y and the regression line. A large SS_M implies the regression model predicts the outcome variable better than the mean.
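A short, illustrative sketch of the three sums of squares, again with made-up numbers (np.polyfit is just a convenient stand-in for the least-squares fit shown earlier):

```python
import numpy as np

x = np.array([10.0, 50.0, 120.0, 300.0, 450.0])   # hypothetical predictor values
y = np.array([150.0, 160.0, 180.0, 210.0, 250.0]) # hypothetical outcome values

b1, b0 = np.polyfit(x, y, 1)   # least-squares slope and intercept
y_hat = b0 + b1 * x            # values predicted by the regression line

ss_t = np.sum((y - y.mean()) ** 2)      # SS_T: observed data vs. mean of Y
ss_r = np.sum((y - y_hat) ** 2)         # SS_R: observed data vs. regression line
ss_m = np.sum((y_hat - y.mean()) ** 2)  # SS_M: regression line vs. mean of Y

print(np.isclose(ss_t, ss_m + ss_r))    # True: SS_T = SS_M + SS_R
```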

8 Assessing Goodness of Fit
A large SS_M implies the regression model is much better than using the mean to predict the outcome variable. How big is big? This is assessed in two ways: (1) via R², and (2) via the F-test, which assesses the ratio of systematic to unsystematic variance.
R² = SS_M / SS_T, which represents the amount of variance in the outcome explained by the model relative to how much variance there was to explain.
F = (SS_M / df_M) / (SS_R / df_R) = MS_M / MS_R
where df_M (for SS_M) = the number of variables in the model, and df_R (for SS_R) = the number of observations minus the number of parameters being estimated.
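Continuing the sketch above (reusing ss_t, ss_r, ss_m, and y from it), R² and the F-ratio fall out directly; this only illustrates the formulas and is not SPSS output:

```python
r_squared = ss_m / ss_t                  # variance explained relative to variance available

df_m = 1                                 # df for SS_M: one variable in the model
df_r = len(y) - 2                        # df for SS_R: n observations minus 2 parameters (b0, b1)
f_ratio = (ss_m / df_m) / (ss_r / df_r)  # F = MS_M / MS_R

print("R^2:", r_squared, "F:", f_ratio)
```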

9 Simple Regression Using SPSS: Predicting Record Sales (Y) from Advertising Budget (X)
Data file: Record1.sav. What is the overall relationship between record sales and advertising budget?
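For anyone who wants to reproduce this analysis outside the SPSS menus, a rough equivalent in Python with statsmodels might look like the sketch below; the column names sales and adverts (and exporting the .sav file to CSV) are assumptions, not something shown on the slides:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumed: Record1.sav exported to CSV with columns "adverts" and "sales".
records = pd.read_csv("Record1.csv")

model = smf.ols("sales ~ adverts", data=records).fit()
print(model.summary())   # reports R-squared, the ANOVA F-test, and b0/b1 with their SEs
```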

14 Interpreting a Simple Regression: Overall Fit of the Model
[SPSS model-summary/ANOVA output showing SS_M, SS_R, SS_T, MS_M, and MS_R.]
The significant F-test allows us to conclude that the regression model results in significantly better prediction of record sales than the mean value of record sales. Advertising expenditure accounts for 33.5% of the variation in record sales.

15 [SPSS output for the model: df = 1, 198; F = 99.587.]

16 Interpreting a Simple Regression: Model Parameters
The ANOVA tells us whether the overall model results in a significantly good prediction of the outcome variable, not about the individual contribution of variables in the model.
- b0, the Y intercept: b0 = 134.14. This tells us that when no money is spent on ads, the model predicts that 134,140 records will be sold.
- b1, the slope: the amount of change in the outcome associated with a unit change in the predictor. b1 = 0.096, so we can predict 96 extra record sales for every £1,000 spent on advertising.
Regression coefficients should be significantly different from 0 and large relative to their standard errors.

17 Interpreting a Simple Regression: Model Parameters
Unstandardized regression weights: Y_pred = b0 + b1*X. The intercept and slope are in the original units of X and Y, and so aren't directly comparable across variables.
Standardized regression weights: Z_y(pred) = β * Z_x. Standardized regression weights tell us the number of standard deviations by which the outcome will change as a result of a one standard deviation change in the predictor.
Richards (1982). Standardized versus unstandardized regression weights. Applied Psychological Measurement, 6, 201-212.
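A small illustrative sketch (made-up numbers) of how the two kinds of weights relate; with a single predictor, the standardized weight β equals Pearson's r between X and Y:

```python
import numpy as np

x = np.array([10.0, 50.0, 120.0, 300.0, 450.0])   # hypothetical predictor
y = np.array([150.0, 160.0, 180.0, 210.0, 250.0]) # hypothetical outcome

b1, b0 = np.polyfit(x, y, 1)                  # unstandardized slope and intercept
beta = b1 * x.std(ddof=1) / y.std(ddof=1)     # standardized weight: b1 * (SD_x / SD_y)
r = np.corrcoef(x, y)[0, 1]                   # Pearson correlation

print(np.isclose(beta, r))                    # True for simple regression
```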

18 Interpreting a Simple Regression: Using the Model
Since we've demonstrated that the model significantly improves our ability to predict the outcome variable (record sales), we can plug in different values of the predictor variable(s):
record sales_i = b0 + b1 * advertising budget_i = 134.14 + (0.096 × advertising budget_i)
What could the record executive expect if she spent £500,000 on advertising? How about £1,000,000?
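Working those two questions through the fitted equation (remember both budget and sales are measured in thousands), a quick sketch:

```python
b0, b1 = 134.14, 0.096   # coefficients reported on the previous slides

def predicted_sales(ad_budget_thousands):
    """Predicted record sales, in thousands of records."""
    return b0 + b1 * ad_budget_thousands

print(predicted_sales(500))    # 182.14 -> roughly 182,140 records for £500,000
print(predicted_sales(1000))   # 230.14 -> roughly 230,140 records for £1,000,000
```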

19 Simple Regression: Supermodel.sav
A fashion student interested in the factors that predict the salaries of catwalk models collects data from 231 models. For each model, she records their salary per day on days they work (salary), their age (age), and the number of years they have worked as a model (years), and then has a panel of experts from modeling agencies rate the attractiveness of each model as a percentage, with 100% being perfectly attractive (beauty). Use simple regression to assess the relationship between each of the potential predictor variables (i.e., age, years, beauty) and a model's salary.
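One way to run the three requested simple regressions in a loop, sketched with statsmodels in Python; the variable names salary, age, years, and beauty come from the exercise description, but the file name and the CSV export are assumptions:

```python
import pandas as pd
import statsmodels.formula.api as smf

models_data = pd.read_csv("Supermodel.csv")   # assumed: Supermodel.sav exported to CSV

for predictor in ["age", "years", "beauty"]:
    fit = smf.ols(f"salary ~ {predictor}", data=models_data).fit()
    print(predictor, "R^2 =", round(fit.rsquared, 3), "p(F) =", round(fit.f_pvalue, 4))
```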

20-25 [SPSS output for the Supermodel.sav exercise; the only transcript text is the predictor labels: Attractiveness, Age, and Years (slides 21-24).]

26 Simple Regression: Record2.sav
Let's return to our record mogul example (Record1.sav):
- In Record1.sav, we had data for record sales (the outcome variable) and advertising budget (the predictor variable).
- This allowed the mogul to predict how many records her company would sell if she spent £100,000 (or any other amount) on advertising.
- We found that the predictor did a better job than the simplest model, the mean.
- Let's first look to see whether additional predictors are useful in predicting record sales using simple regression (assessing each potential variable individually).

