Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 7: Interactions in Regression

Bivariate Regression Review
Predicts values of Y as a linear function of X: Ŷ = a + bX.
The intercept, a, is the predicted value of Y when X = 0.
The slope, b, is the change in predicted Y associated with a one-unit change in X.
The regression line is the line that minimizes the sum of squared errors of prediction.
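
As a concrete illustration, here is a minimal Python sketch of the bivariate fit using made-up data; the variable names and values are ours, not the lecture's.

```python
import numpy as np

# Hypothetical data, purely to illustrate the computations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope = cov(X, Y) / var(X)
a = y.mean() - b * x.mean()                          # line passes through (mean X, mean Y)
y_hat = a + b * x                                    # predicted values of Y
```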

Partitioning Variance
The total variance of Y is partitioned into the portion explained by X and error.
R² is the proportion that is explained.
The standard error of the estimate is the average deviation of observations around the prediction line.
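
Continuing the sketch above (y and y_hat as defined there), these quantities can be computed directly:

```python
ss_total = np.sum((y - y.mean()) ** 2)   # total variability in Y
ss_error = np.sum((y - y_hat) ** 2)      # variability left unexplained
r_squared = 1 - ss_error / ss_total      # proportion of variance explained by X
see = np.sqrt(ss_error / (len(y) - 2))   # standard error of estimate (df = N - 2)
```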

Hypothesis Testing in Regression
The null hypothesis is simply that the slope equals zero.
This is equivalent to testing ρ = 0 for the correlation: if the correlation is significant, so must the slope be.
The significance of the slope is tested using a t distribution.
The logic is the same as in all hypothesis testing: we compare the magnitude of the slope (b) to its standard error (i.e., the variability of slopes drawn from a population where the null is true).

Hypothesis Testing in Regression
The t value is the slope divided by its standard error: t = b / s_b, where s_b = s_{Y·X} / (s_X · √(N − 1)).
Note that the standard error of b increases as the standard error of the estimate increases.
We then use a t distribution (similar in shape to the normal distribution) to determine how likely it would be to find a slope as large as ours if the null were true.
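
In code, still using the bivariate sketch above (x, b, and see as defined there), the test looks like this:

```python
import numpy as np
from scipy import stats

n = len(x)
s_b = see / (np.std(x, ddof=1) * np.sqrt(n - 1))   # standard error of the slope
t = b / s_b                                        # test statistic for H0: slope = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)               # two-tailed p-value
```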

Multiple Regression
Allows analysis of more than one independent variable.
Explains variance in Y as a function of a linear composite of IVs: Ŷ = a + b₁X₁ + b₂X₂ + … + bₖXₖ.
Each IV has a regression coefficient that provides an estimate of its independent effect.
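
A standalone sketch of estimating such a composite by least squares, on simulated data (all names and values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=50)   # simulated DV

X = np.column_stack([np.ones(50), x1, x2])     # intercept column plus the IVs
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares estimates [a, b1, b2]
```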

Example
Let's examine applicant attractiveness as a function of GREV, letters of recommendation, and personal statements.
Letters and statements are rated on 7-point scales; Y is on a 10-point scale.
The predicted evaluation for someone with a great statement (7), OK letters (5), and a solid GREV (700) is found by substituting those values into the estimated equation.
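
The fitted coefficients from the lecture did not survive the transcript, so the sketch below uses invented values purely to show the mechanics of the substitution:

```python
# Hypothetical coefficients -- NOT the estimates from the lecture's data.
a, b_stmt, b_letters, b_grev = -1.0, 0.4, 0.3, 0.005

y_hat = a + b_stmt * 7 + b_letters * 5 + b_grev * 700
print(y_hat)   # 6.8 on the 10-point evaluation scale
```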

Standardized Regressions
The use of standardized coefficients (β) allows easier comparison of the magnitude of effects.
The coefficients refer to changes in the predicted z-score of Y as a function of the z-scores of the Xs.
What is the relation of β to r here? In multiple regression, β only equals r if all IVs are uncorrelated.
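
One way to see this in code: standardize everything and re-fit (x1, x2, and y as simulated in the multiple-regression sketch above). No intercept column is needed because z-scores have mean zero.

```python
import numpy as np

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

# Regressing z-scored Y on z-scored predictors yields the standardized betas.
Z = np.column_stack([zscore(x1), zscore(x2)])
betas, *_ = np.linalg.lstsq(Z, zscore(y), rcond=None)
```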

Testing Hypotheses for Individual Predictors
Hypothesis testing here is quite similar to that for the single IV in bivariate regression.
Note, however, that in multiple regression the standard error of each coefficient is sensitive to the overlap (i.e., correlation) among the predictors: as the intercorrelation increases, so does the standard error.

Refining a Model
When building a model, one way to determine whether adding new variables improves fit is to test whether they produce a significant change in R².
If so, the added variable explains a significant amount of previously unexplained variability.
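
The standard test for this increment is an F ratio; a small helper, as a sketch:

```python
from scipy import stats

def r2_change_test(r2_full, r2_reduced, n, p_full, p_reduced):
    """F test for the increment in R^2 from adding predictors.

    n: sample size; p_full / p_reduced: number of predictors in each model.
    """
    df1 = p_full - p_reduced
    df2 = n - p_full - 1
    F = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return F, stats.f.sf(F, df1, df2)   # F statistic and its p-value
```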

Analyzing Interactions in Multiple Regression
Many times, we are interested not only in the direct effects of single independent variables on a dependent variable, but also in how one variable may affect the influence of another; that is, in how the influence of one independent variable changes as a function of a second independent variable.
In regression, we represent interactions with cross-product terms. The unique effect of a cross-product term (i.e., its effect after being partialled for the main effects) represents the interaction effect.
To achieve this, the single independent variables that make up the cross-product must also be in the regression equation.

An Example
Let's say we have a 2 (Gender) × 2 (Self-Esteem: High/Low) study on aggression. Aggression is defined as the level of shock given to a confederate in the experimental task.
Gender is scored 0 for males, 1 for females. Self-esteem is scored 0 for low, 1 for high (based on a median split of scores). To create the interaction term, we simply multiply the gender and self-esteem scores:

Group         Gender   Self-Esteem   Interaction
Male/LoSE     0        0             0
Male/HiSE     0        1             0
Female/LoSE   1        0             0
Female/HiSE   1        1             1
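
In code, the coding scheme in the table amounts to an elementwise product:

```python
import numpy as np

gender = np.array([0, 0, 1, 1])        # 0 = male, 1 = female
self_esteem = np.array([0, 1, 0, 1])   # 0 = low, 1 = high
interaction = gender * self_esteem     # equals 1 only for the female/high-SE group
```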

Our "main-effects" regression equation is:

Ŷ = b₀ + b₁(gender) + b₂(self-esteem)

To examine the interaction, we add the cross-product term:

Ŷ = b₀ + b₁(gender) + b₂(self-esteem) + b₃(gender × self-esteem)

If b₃ is significant (or, equivalently, if the change in R² is significant), the interaction is significant.
Note: when an interaction is present, it becomes tenuous to interpret the main effects from the first equation, and the main-effect parameters from the second equation are not easily interpretable.
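
A sketch of fitting the full model and testing b₃, on simulated data (the generating coefficients are invented, not the study's):

```python
import numpy as np
import statsmodels.api as sm

# Simulated shock-level data; coefficients here are for illustration only.
rng = np.random.default_rng(1)
gender = np.repeat([0, 1], 50)    # 0 = male, 1 = female
se = np.tile([0, 1], 50)          # 0 = low, 1 = high self-esteem
aggression = 7 - 2 * gender - 3 * se + 1.5 * gender * se + rng.normal(size=100)

X_full = sm.add_constant(np.column_stack([gender, se, gender * se]))
full = sm.OLS(aggression, X_full).fit()
print(full.params[-1], full.pvalues[-1])   # b3 and its p-value
```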

Fitting the first (main-effects) equation to our sample data yields the following predicted values (means) for the groups:

LSE men = 6.95, HSE men = 3.78, LSE women = 4.96, HSE women = 1.79

Working backward from these predictions, the estimated main-effects equation must be Ŷ = 6.95 − 1.99(gender) − 3.17(self-esteem). We then add the interaction term and re-estimate the full equation.
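
A quick check that the reconstructed coefficients reproduce all four predicted means:

```python
# Predicted group means from the reconstructed main-effects equation.
b0, b_gender, b_se = 6.95, -1.99, -3.17

for label, g, s in [("LSE men", 0, 0), ("HSE men", 0, 1),
                    ("LSE women", 1, 0), ("HSE women", 1, 1)]:
    print(label, round(b0 + b_gender * g + b_se * s, 2))
# LSE men 6.95, HSE men 3.78, LSE women 4.96, HSE women 1.79
```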

To depict the interaction, we plot separate regression lines for one IV at each level of the other IV.
From the full equation, the simple slope of self-esteem is b₂ for men (gender = 0) and b₂ + b₃ for women (gender = 1).
It is clear that self-esteem has a stronger association with aggression for men than for women.

The same can be done with a continuous IV (or IVs). In this case, the cross-product term will not simply consist of 1s and 0s, but it functions in the same manner.
To depict the interaction, you select three different levels of the continuous IV (usually −1 SD, the mean, and +1 SD).
For example, if we use self-esteem scores rather than dichotomizing them, the regression equation takes the same form, with the continuous self-esteem score and its cross-product with gender in place of the dummy-coded terms.
You could then show how the effect of gender differs as a function of self-esteem (though in the present case it might make more sense to draw two lines for gender and show how the effect of self-esteem differs). A sketch of this follows.
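
Here, all coefficients and descriptives are hypothetical; the point is the mechanics of evaluating the simple slope at chosen moderator levels:

```python
# Simple slope of gender at -1 SD, the mean, and +1 SD of continuous self-esteem.
b1, b3 = -2.0, 0.4        # main effect of gender; gender x SE cross-product
se_mean, se_sd = 4.0, 1.2  # hypothetical descriptives for self-esteem

for label, se_val in [("-1 SD", se_mean - se_sd), ("mean", se_mean),
                      ("+1 SD", se_mean + se_sd)]:
    print(label, b1 + b3 * se_val)   # effect of gender at this self-esteem level
```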

An Example
Returning to the solar radiation data, we know that increasing sun exposure is associated with decreased breast cancer.
What about the role of toxins in the environment? Might it affect this relation?

The interaction is depicted with two regression lines: the top line is plotted by substituting +1 SD for pollution, the bottom line by substituting −1 SD.
You can see that the benefits of sun exposure decline with increasing exposure to toxins.