Statistics for the Social Sciences Psychology 340 Spring 2005 Prediction cont.



Statistics for the Social Sciences
Outline (for week)
Simple bivariate regression, least-squares fit line
–The general linear model
–Residual plots
–Using SPSS
Multiple regression
–Comparing models (Δr²)
–Using SPSS

Statistics for the Social Sciences
From last time: the bivariate regression model
Y = intercept + slope(X) + error

Statistics for the Social Sciences
From last time
The sum of the residuals should always equal 0.
–The least-squares regression line splits the data in half.
Additionally, the residuals should be randomly distributed.
–There should be no pattern to the residuals.
–If there is a pattern, it may suggest that there is more than a simple linear relationship between the two variables.
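Since SPSS is a point-and-click tool, here is a small Python/numpy sketch (with made-up x and y values) confirming numerically that the least-squares residuals sum to zero:

```python
import numpy as np

# Made-up illustrative data (any roughly linear x, y pair works here)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, 1)   # least-squares fit line
residuals = y - (intercept + slope * x)  # error = actual Y - predicted Y

# Sums to zero (up to floating-point error) for any least-squares line
print(residuals.sum())
```

This zero-sum property holds for any least-squares line that includes an intercept, regardless of the data.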

Statistics for the Social Sciences
Seeing patterns in the error: residual plots
–Useful tools for examining the relationship even further.
–These are basically scatterplots of the residuals (often transformed into z-scores) against the explanatory (X) variable (or sometimes against the response variable).

Statistics for the Social Sciences
Seeing patterns in the error
–The scatter plot shows a nice linear relationship.
–The residual plot shows that the residuals fall randomly above and below the line. Critically, there doesn't seem to be a discernible pattern to the residuals.

Statistics for the Social Sciences
Seeing patterns in the error
–The scatter plot also shows a nice linear relationship.
–The residual plot shows that the residuals get larger as X increases. This suggests that the variability around the line is not constant across values of X. This is referred to as a violation of homogeneity of variance.
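A numeric sketch of this violation (simulated data, not from the slides): when the noise scale is made to grow with X, the residual spread for large X clearly exceeds that for small X.

```python
import numpy as np

rng = np.random.default_rng(5)  # fixed seed for a reproducible sketch
n = 400
x = rng.uniform(1, 10, n)
# Simulated response whose noise grows with x (heteroscedastic by design)
y = 2 + 3 * x + rng.normal(scale=0.5 * x, size=n)

slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)

# Split residuals at the midpoint of x; the upper half is far more spread out
low, high = resid[x < 5.5], resid[x >= 5.5]
print(low.std(), high.std())
```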

Statistics for the Social Sciences
Seeing patterns in the error
–The scatter plot shows what may be a linear relationship.
–The residual plot suggests that a non-linear relationship may be more appropriate (note how a curved pattern appears in the residual plot).

Statistics for the Social Sciences
Using SPSS: regression
–Variables (explanatory and response) are entered into columns.
–Each row is a unit of analysis (e.g., a person).

Statistics for the Social Sciences
Regression in SPSS
Analyze → Regression → Linear

Statistics for the Social Sciences
Regression in SPSS
Enter:
–Predicted (criterion) variable into the Dependent Variable field
–Predictor variable into the Independent Variable field

Statistics for the Social Sciences
Regression in SPSS (output)
–The variables in the model
–r and r² (we'll get back to these numbers in a few weeks)
–Unstandardized coefficients: the slope (labeled with the independent variable's name) and the intercept (labeled "Constant")

Statistics for the Social Sciences
Regression in SPSS (output)
–Standardized coefficient: β (labeled with the independent variable's name)
–Recall that r = standardized β in bivariate regression

Statistics for the Social Sciences
Multiple Regression
Typically researchers are interested in predicting with more than one explanatory variable.
In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.

Statistics for the Social Sciences
Multiple Regression
Bivariate regression prediction model:
Y = intercept + slope(X) + error

Statistics for the Social Sciences
Multiple Regression
Bivariate regression prediction model:
Y = intercept + slope(X) + error
Multiple regression prediction model:
Y = intercept + slope₁(X₁) + slope₂(X₂) + … + error
–The intercept and slope terms make up the "fit"; the error term is the "residual".
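As a sketch (simulated data; variable names are illustrative, not from the slides), the multiple regression prediction model can be fit with ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)  # first explanatory variable (simulated)
x2 = rng.normal(size=n)  # second explanatory variable (simulated)
# True model: Y = 1.0 + 2.0*X1 + 0.5*X2 + error
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)  # [intercept, slope1, slope2]

fit = X @ b         # the "fit" part of the model
residual = y - fit  # the "residual" part
print(b)
```

The estimated intercept and slopes land close to the true values used to generate the data, and the residuals again sum to zero.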

Statistics for the Social Sciences
Multiple Regression
Multiple regression prediction models: Y's variability is split among
–the first explanatory variable,
–the second explanatory variable,
–the third explanatory variable,
–the fourth explanatory variable,
–and whatever variability is left over.

Statistics for the Social Sciences
Multiple Regression
For example, predict test performance based on:
–Study time
–Test time
–What you eat for breakfast
–Hours of sleep
plus whatever variability is left over.

Statistics for the Social Sciences
Multiple Regression
For example, predict test performance based on: study time, test time, what you eat for breakfast, and hours of sleep.
Typically your analysis consists of testing multiple regression models against one another to see which "fits" best (comparing the r²s of the models).

Statistics for the Social Sciences
Multiple Regression
Model #1: response variable = total variability in test performance; predictor = total study time (r = .6).
–Some covariance between the two variables.
–R² for the model = .36; 64% of the variance is unexplained.
–If we know the total study time, we can predict 36% of the variance in test performance.

Statistics for the Social Sciences
Multiple Regression
Model #2: add test time (r = .1) to the model, alongside total study time (r = .6).
–Little covariance between test performance and test time.
–R² for the model = .49; 51% of the variance is unexplained.
–We can explain more of the variance in test performance.

Statistics for the Social Sciences
Multiple Regression
Model #3: add breakfast food (r = .0) to the model (with total study time, r = .6, and test time, r = .1).
–No covariance between test performance and breakfast food.
–R² for the model stays at .49; 51% of the variance is unexplained.
–Breakfast is not related, so we can NOT explain more of the variance in test performance.

Statistics for the Social Sciences
Multiple Regression
Model #4: add hours of sleep (r = .45) to the model (with total study time, r = .6, test time, r = .1, and breakfast, r = .0).
–Some covariance between test performance and hours of sleep.
–R² for the model = .60; 40% of the variance is unexplained.
–We can explain more of the variance. But notice what happens with the overlap (covariation between explanatory variables): you can't just add r's or r²'s.
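The overlap point can be checked numerically. In this simulated sketch (the study/sleep variables and their correlation are made up for illustration), the full model's R² exceeds either predictor's R² alone but falls short of their sum, because the predictors share variance:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
study = rng.normal(size=n)
sleep = 0.6 * study + 0.8 * rng.normal(size=n)  # deliberately correlated with study
score = study + sleep + rng.normal(size=n)      # simulated test performance

def r_squared(X, y):
    """R-squared from an ordinary least-squares fit (intercept included)."""
    X = np.column_stack([np.ones(len(y)), X])
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_study = r_squared(study, score)
r2_sleep = r_squared(sleep, score)
r2_both = r_squared(np.column_stack([study, sleep]), score)
# Because the predictors overlap, r2_both < r2_study + r2_sleep
print(r2_study, r2_sleep, r2_both)
```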

Statistics for the Social Sciences
Multiple Regression in SPSS
Setup as before: variables (explanatory and response) are entered into columns.
There are a couple of different ways to use SPSS to compare different models.

Statistics for the Social Sciences
Regression in SPSS
Analyze → Regression → Linear

Statistics for the Social Sciences
Multiple Regression in SPSS
Method 1: enter all the explanatory variables together.
Enter:
–All of the predictor variables into the Independent Variable field
–The predicted (criterion) variable into the Dependent Variable field

Statistics for the Social Sciences
Multiple Regression in SPSS (output)
–The variables in the model
–r for the entire model; r² for the entire model
–Unstandardized coefficients: one for each variable (labeled with the variable's name)

Statistics for the Social Sciences
Multiple Regression in SPSS (output)
–The variables in the model
–r for the entire model; r² for the entire model
–Standardized coefficients: one for each variable (labeled with the variable's name)

Statistics for the Social Sciences
Multiple Regression
–Which β to use, standardized or unstandardized?
–Unstandardized β's are easier to use if you want to predict a raw score based on raw scores (no z-scores needed).
–Standardized β's are nice for directly comparing which variable is most "important" in the equation.
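The two kinds of coefficients are related by a simple rescaling. A sketch with simulated data (the predictor names are hypothetical): the standardized β equals the unstandardized b times sd(x)/sd(y), which is why fitting on z-scores gives the same answer.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
hours = rng.normal(10, 3, n)  # hypothetical predictor: study hours
sleep = rng.normal(7, 1, n)   # hypothetical predictor: hours of sleep
score = 50 + 2 * hours + 5 * sleep + rng.normal(0, 5, n)

# Unstandardized: fit on raw scores
X = np.column_stack([np.ones(n), hours, sleep])
b = np.linalg.lstsq(X, score, rcond=None)[0]  # [intercept, b_hours, b_sleep]

# Standardized: z-score everything, then refit (the intercept becomes 0)
z = lambda v: (v - v.mean()) / v.std()
Xz = np.column_stack([np.ones(n), z(hours), z(sleep)])
beta = np.linalg.lstsq(Xz, z(score), rcond=None)[0]

# Identity: beta_j = b_j * sd(x_j) / sd(y)
print(beta[1], b[1] * hours.std() / score.std())
```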

Statistics for the Social Sciences
Multiple Regression in SPSS
Method 2: enter the first model, then add another variable for the second model, etc.
Enter:
–The predicted (criterion) variable into the Dependent Variable field
–The first predictor variable into the Independent Variable field
–Click the Next button

Statistics for the Social Sciences
Multiple Regression in SPSS
Method 2, continued. Enter:
–The second predictor variable into the Independent Variable field
–Click Statistics

Statistics for the Social Sciences Multiple Regression in SPSS –Click the ‘R squared change’ box

Statistics for the Social Sciences
Multiple Regression in SPSS (output)
The output shows the results of two models:
–The variables in the first model (math SAT)
–The variables in the second model (math and verbal SAT)

Statistics for the Social Sciences
Multiple Regression in SPSS (output)
Model 1 (math SAT):
–The variables in the first model
–r² for the first model
–Coefficients for var1 (labeled with the variable's name)

Statistics for the Social Sciences
Multiple Regression in SPSS (output)
Model 2 (math and verbal SAT):
–The variables in the second model
–r² for the second model
–Coefficients for var1 and var2 (labeled with the variable names)

Statistics for the Social Sciences
Multiple Regression in SPSS (output)
The output shows the results of two models (Model 1: math SAT; Model 2: math and verbal SAT).
Change statistics: is the change in r² from Model 1 to Model 2 statistically significant?
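SPSS reports this as an F test on the R² change. A sketch of the statistic it computes, on simulated SAT-style data (the data are made up, and the p-value, which requires an F distribution, is omitted here):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
math_sat = rng.normal(500, 100, n)  # simulated scores
verbal_sat = rng.normal(500, 100, n)
gpa = 1.0 + 0.003 * math_sat + 0.002 * verbal_sat + rng.normal(0, 0.4, n)

def r_squared(cols, y):
    X = np.column_stack([np.ones(len(y))] + cols)
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_1 = r_squared([math_sat], gpa)              # Model 1: math SAT only
r2_2 = r_squared([math_sat, verbal_sat], gpa)  # Model 2: math + verbal

q, k2 = 1, 2  # predictors added; total predictors in Model 2
F_change = ((r2_2 - r2_1) / q) / ((1 - r2_2) / (n - k2 - 1))
print(r2_1, r2_2, F_change)
```

Note that r² can never decrease when a predictor is added; the F test asks whether the increase is larger than chance alone would produce.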

Statistics for the Social Sciences
Cautions in Multiple Regression
We can use as many predictors as we wish, but we should be careful not to use more predictors than is warranted.
–Simpler models are more likely to generalize to other samples.
–If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample, and thus they are not valid.
–You should probably have at least 10 participants per predictor variable (and probably should aim for about 30).
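The 100%-of-the-variance caution is easy to demonstrate: with as many fitted parameters as cases, even pure-noise predictors fit perfectly (a simulated sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10
y = rng.normal(size=n)           # response: pure noise
X = rng.normal(size=(n, n - 1))  # 9 random predictors for 10 cases

# Intercept + 9 slopes = 10 parameters for 10 data points: a perfect fit
X1 = np.column_stack([np.ones(n), X])
yhat = X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
r2_full = 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(r2_full)  # essentially 1.0, despite the predictors being meaningless
```

The perfect R² here says nothing about prediction: on a fresh sample these noise predictors would explain nothing, which is exactly why simpler models generalize better.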