Regression

Why Regression? Everything we’ve done in this class has been regression: when you have categorical IVs and continuous DVs, the ANOVA framework is simply an easier way to run and interpret it.

Why Regression? Regression is a type of analysis where you want to predict some scores using other information. Regression is more flexible than ANOVA because your independent variables do not have to be categorical/group variables (but they can be).

Definitions Predictors – these are your independent variables that you want to use to predict the dependent variable. You can think of them as IVs or Xs. You can use categorical, Likert, or continuous variables as predictors.

Definitions Criterion – this variable is your dependent variable or the information you are trying to predict/understand. You can think of this variable as DV or Y.

Y-hat = a + bX, where a is the intercept (constant) and b is the slope; in standardized (z-score) form, the same line is written z-hat_Y = β·z_X.

Definitions Essentially, the point of regression is to create an equation that best explains the relationship between the Xs and Y. Equation: Y-hat = a + b1X1 + b2X2 + …
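As a concrete illustration, here is a minimal R sketch with made-up data (exam scores predicted from study hours and test anxiety; all names are hypothetical) showing how lm() estimates a and the b coefficients:

```r
# Hypothetical data: predict exam scores from study hours and test anxiety
set.seed(42)
hours   <- runif(100, 0, 20)    # predictor 1 (X1)
anxiety <- rnorm(100, 50, 10)   # predictor 2 (X2)
exam    <- 40 + 2.5 * hours - 0.3 * anxiety + rnorm(100, 0, 5)  # criterion (Y)

model <- lm(exam ~ hours + anxiety)  # Y-hat = a + b1*X1 + b2*X2
coef(model)  # (Intercept) = a, hours = b1, anxiety = b2
```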

Definitions a = Constant (intercept): the predicted value of Y when all Xs are zero. This value is added to the regression equation to help with prediction, and it is not often reported in research.

Definitions b (little b) = Coefficient: the unstandardized slope for your regression equation. Interpretation: for every one-point increase in X, Y increases by b points (controlling for the other predictors). This value depends on the scale of the variable you are using to predict.

Definitions β (beta) = Coefficient: the standardized slope for your regression equation. With one X/predictor, beta is equal to Pearson’s r. Since beta is standardized, you can use it to compare across predictors to see which IV best explains your DV. Interpretation: for every one-SD increase in X, Y increases by β SDs.
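One way to obtain betas in R is to standardize the variables before fitting. A sketch, continuing the hypothetical exam data from the first example:

```r
# Standardize Y and the Xs; the slopes of this model are the betas
model_std <- lm(scale(exam) ~ scale(hours) + scale(anxiety))
coef(model_std)  # intercept ~ 0; each slope = SD change in Y per SD change in X
```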

Definitions Y-hat = the predicted value based on the equation. If your regression equation is good, Y-hat will be close to each participant’s actual Y (DV) value.

Definitions Effect sizes for regression include several types of correlations: r, R, R², sr, and pr.

Definitions r is the correlation between one X and one Y. R is the correlation between your predicted values and people’s actual DV/Y values – the correlation between Y-hat and Y – which makes it the correlation between multiple Xs and Y. R² = the amount of variance in the DV scores that your IVs/predictors account for; this number is the effect size for regression equations.

Definitions sr = semi-partial correlation: the variance from only that IV over the total variance. It tells you how much of the overall variance that variable uniquely accounts for. sr² is the unique contribution of that variable to R² – the increase in the proportion of explained variance in Y when that variable is added to the equation.

Definitions pr = partial correlation: the variance from only that IV over the variance not accounted for by the other predictors. It tells you how much variance your variable accounts for when you only look at the variance left to explain – the proportion of variance in Y not explained by the other predictors.

Definitions (from the Venn diagram of variance in Y, where A = variance unexplained by either IV, B = variance unique to IV1, C = variance shared by IV1 and IV2, and D = variance unique to IV2): R² = (B + C + D) / (A + B + C + D); r²(IV1) = (B + C) / (A + B + C + D); sr²(IV1) = B / (A + B + C + D); pr²(IV1) = B / (A + B).
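These quantities can be computed in base R by comparing nested models (a sketch continuing the hypothetical exam data; sr² for a predictor is the change in R² when it is added last, and pr² rescales that by the variance the other predictors leave unexplained):

```r
full    <- lm(exam ~ hours + anxiety)  # model with the IV of interest
reduced <- lm(exam ~ anxiety)          # model without 'hours'

R2_full    <- summary(full)$r.squared
R2_reduced <- summary(reduced)$r.squared

sr2_hours <- R2_full - R2_reduced          # B / (A + B + C + D): unique variance
pr2_hours <- sr2_hours / (1 - R2_reduced)  # B / (A + B)
```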

Types Simple Linear Regression – regression with only one predictor variable (IV). It’s called simple because there’s only ONE thing predicting. In this case, beta = r.

Types Multiple Linear Regression – regression with more than one predictor variable (IVs). You can use a mix of variables – continuous, categorical, Likert, etc. You can use MLR to figure out which IVs are the most important.

Types of MLR Standard/Simultaneous Linear Regression – all variables are entered into the equation at the same time. Each variable is assessed as if it were the last variable entered; this “controls” for the other IVs, matching the interpretation of b above. The test for each predictor evaluates whether sr > 0. If you have two highly correlated IVs, the one with the bigger sr gets all the shared variance; the other IV gets very little variance associated with it and looks unimportant (suppression). A sketch of simultaneous entry in R follows.
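In R, simultaneous entry is a single lm() call, and base R’s drop1() tests each predictor as if it were entered last (a sketch on the hypothetical exam data):

```r
sim <- lm(exam ~ hours + anxiety)  # both IVs entered at the same time
summary(sim)            # each t test "controls" for the other IV
drop1(sim, test = "F")  # tests each predictor as if it were entered last
```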

Types of MLR Hierarchical / Sequential Linear Regression – predictor variables are entered as sets or steps, and variance gets assigned at each step to the variables entered first. IVs enter the regression equation in an order specified by the researcher. What order? Assigned by theoretical importance, or you can control for nuisance variables in the first step. The first IV is basically tested against r or sr (since there’s nothing else in the equation, it gets all the variance); the next IVs are tested against pr (they only get the left-over variance). A sketch in R follows.
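A minimal hierarchical sketch in base R, assuming a hypothetical nuisance variable (age) controlled in step 1:

```r
age <- sample(18:40, 100, replace = TRUE)  # hypothetical nuisance variable

step1 <- lm(exam ~ age)                    # step 1: nuisance variable only
step2 <- lm(exam ~ age + hours + anxiety)  # step 2: add the IVs of interest

anova(step1, step2)  # F test of the improvement from step 1 to step 2
summary(step2)$r.squared - summary(step1)$r.squared  # change in R^2
```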

Types of MLR More about hierarchical regression: you can use SETS of IVs instead of individual predictors. Say you have a group of IVs that are very highly correlated, but you don’t know how to combine them and don’t want to eliminate any of them: enter each step as a SET and interpret the step as a whole rather than each individual predictor.

Types of MLR Statistical / Stepwise Linear Regression – predictor variables are entered in steps based on some statistical cutoff. Entry into the equation is based solely on the statistical relationship and has nothing to do with theory or your experiment. Types: Forward – the biggest IV is added first, then each remaining IV is added as long as it accounts for enough variance. Backward – all IVs are entered at first, then each one is removed if it doesn’t account for enough variance. Stepwise – a mix of the two (variables are added, but may later be deleted if they are no longer important). A sketch using R’s step() function follows.
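Base R’s step() function selects by AIC rather than a classic p-value cutoff, but it illustrates all three flavors (a sketch on the hypothetical exam data):

```r
null_model <- lm(exam ~ 1)                      # empty model
full_model <- lm(exam ~ age + hours + anxiety)  # all candidate IVs

# Forward: start empty, add IVs while they improve the model
step(null_model, scope = formula(full_model), direction = "forward")

# Backward: start full, remove IVs that don't pull their weight
step(full_model, direction = "backward")

# Stepwise: adds IVs, but can later delete ones that stop mattering
step(null_model, scope = formula(full_model), direction = "both")
```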

Interpretation and Purpose of MLR How good is the equation? Can we predict people’s scores better than chance? Use the overall model ANOVA (F) statistics, with R² as the effect size. Which IVs are the most important? Which contribute the most to prediction? Use the coefficient statistics (t values) with pr² as the effect size. This procedure looks suspiciously like ANOVA …

Types of MLR Special types of MLR also include mediation and moderation.

Mediation Mediation – testing whether the relationship between X and Y changes with the inclusion of another variable M. c path: simple regression analysis with X predicting Y. a path: simple regression analysis with X predicting M. b path: simple regression analysis with M predicting Y. c′ path: multiple regression analysis with X and M predicting Y.

Mediation (path diagram: the c path is X → Y; the a and b paths run X → M → Y; c′ is the X → Y path with M in the model)
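In R, the four paths are simply four lm() calls. A sketch with hypothetical variables x, m, and y:

```r
# Hypothetical mediation data
set.seed(1)
x <- rnorm(100)
m <- 0.5 * x + rnorm(100)            # M depends on X
y <- 0.4 * m + 0.2 * x + rnorm(100)  # Y depends on M and X

c_path <- lm(y ~ x)      # c: X predicting Y (total effect)
a_path <- lm(m ~ x)      # a: X predicting M
b_path <- lm(y ~ m)      # b: M predicting Y
cprime <- lm(y ~ x + m)  # c': X and M predicting Y; the x slope is c'
```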

Moderation Moderation – analyzing an interaction between two X variables in the prediction of Y. Similar to an interaction effect in ANOVA.
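In R, a moderation model is just a regression with an interaction term; centering the predictors first is common practice. A sketch with hypothetical x1, x2, and y:

```r
set.seed(2)
x1 <- rnorm(100)
x2 <- rnorm(100)
y  <- 0.5 * x1 + 0.3 * x2 + 0.4 * x1 * x2 + rnorm(100)

x1c <- x1 - mean(x1)  # centering makes the lower-order terms interpretable
x2c <- x2 - mean(x2)

mod <- lm(y ~ x1c * x2c)  # shorthand for x1c + x2c + x1c:x2c
summary(mod)              # the x1c:x2c row is the moderation (interaction) test
```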

Data Screening Accuracy and missing data → same. Outliers → new rules. Additivity → only look at the IVs’ correlations. Normality, linearity, homogeneity, homoscedasticity → same, BUT you will run these analyses on your REAL regression, not the fake one. We don’t need a fake one, we have a real regression set up!
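In R, the standard screening plots for a fitted regression come from plot() on the lm object itself (a sketch, continuing the hypothetical exam model):

```r
final <- lm(exam ~ age + hours + anxiety)

par(mfrow = c(2, 2))  # show all four diagnostic plots at once
plot(final)           # residuals vs. fitted, normal Q-Q, scale-location,
                      # residuals vs. leverage
par(mfrow = c(1, 1))
```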

Outliers in Regression Leverage – how much influence over the slope a person has (how much their score will change the b values). Cut-off rule of thumb: (2K + 2)/N, where K is the number of IVs/predictors and N is the number of people.

Outliers in Regression Discrepancy – how far away a data point is from the other data points (no influence on the slope). Cook’s distance – influence – a combination of both leverage and discrepancy. Cut-off rule of thumb: 4/(N − K − 1), where K is the number of IVs/predictors and N is the number of people. A combined sketch in R follows.
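Both cutoffs are easy to apply in base R with hatvalues() and cooks.distance() (a sketch on the hypothetical exam model):

```r
final <- lm(exam ~ age + hours + anxiety)
K <- length(coef(final)) - 1  # number of predictors
N <- nobs(final)              # number of people

which(hatvalues(final) > (2 * K + 2) / N)       # leverage cutoff
which(cooks.distance(final) > 4 / (N - K - 1))  # Cook's distance cutoff
```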

Categorical IVs in Regression When you use categorical IVs in a regression analysis, you must make sure the variable is factored in R; otherwise R will interpret that variable as continuous, which doesn’t make a whole lot of sense. Whatever group is coded as the first level of your factored variable becomes the comparison group: you will automatically get every other group compared to that one. What if you want all pairwise comparisons? Recode (relevel) and run again, as in the sketch below. We will work an example of these variables to help clarify.
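A minimal sketch of factoring and releveling, assuming a hypothetical three-group condition variable and the exam outcome from earlier:

```r
condition <- sample(c("control", "drug", "placebo"), 100, replace = TRUE)
condition <- factor(condition)  # "control" is the first level (alphabetical)

summary(lm(exam ~ condition))   # each other group compared to control

condition <- relevel(condition, ref = "drug")  # change the comparison group
summary(lm(exam ~ condition))   # re-run for the remaining pairwise contrast
```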