Bivariate & Multivariate Regression


Bivariate & Multivariate Regression
– correlation vs. prediction research
– prediction and relationship strength
– interpreting regression formulas
– process of a prediction study
– Multivariate research & multiple regression
– Advantages of multiple regression
– Interpreting multiple regression weights
– Inspecting & describing multiple regression models

Correlation Studies and Prediction Studies

Correlation research (95%)
– purpose is to identify the direction and strength of the linear relationship between two quantitative variables
– usually driven by theoretical, hypothesis-testing interests

Prediction research (5%)
– purpose is to take advantage of linear relationships between quantitative variables to create (linear) models that predict values of hard-to-obtain variables from values of available variables
– the predicted values are then used to make decisions about people (admissions, treatment availability, etc.)

However, fully understanding the important things about correlation models requires a good understanding of the regression model upon which prediction is based...

Linear regression for prediction...
– “prediction” is using “what you know” to estimate “what you wish you knew”
– “what you know” is called the predictor variable
– “what you wish you knew” is called the criterion variable
– the criterion must be estimated/predicted because it “hasn’t happened yet” and you don’t want to wait

For each pair, which is the criterion and which is the predictor?
– score on the Final Exam & score on Exam 1
– undergraduate GPA & graduate GPA
– length of treatment & recidivism
– performance & practice

Let’s take a look at the relationship between the strength of the linear relationship and the accuracy of linear prediction.

[Scatterplot: Predictor (X) on the horizontal axis, Criterion on the vertical axis; for a given value of X, draw up to the regression line, then over to the predicted value Y’]

When the linear relationship is very strong, there is a narrow range of Y values for any X value, and so the Y’ “guess” will be close.

[Scatterplot: the same Predictor (X) / Criterion axes, with a wide scatter around the regression line and Y’ marked]

However, when the linear relationship is very weak, there is a wide range of Y values for any X value, and so the Y’ “guess” will be less accurate, on average. There is still some utility to the linear regression, because larger values of X still “tend to” go with larger values of Y. So the linear regression might supply useful information, even if it isn’t very precise -- depending upon what counts as “useful.”

[Scatterplot: Predictor (X) / Criterion axes with a flat regression line; every X leads to the same Y’]

However, when there is no linear relationship, not only is there a very wide range of Y values for any X value, but every X value leads to the SAME Y value estimate (the mean of Y).

Some key ideas:
– everyone with a given “X” value will have the same predicted “Y” value
– if there is no (statistically significant & reliable) linear relationship, then there is no basis for linear prediction
– the stronger the linear relationship, the more accurate the linear prediction will be (on average)

Predictors, Predicted Criterion, Criterion and Residuals

Here are two formulas that contain “all you need to know”:
y’ = bx + a
residual = y - y’

y -- the criterion -- the variable you want to use to make decisions, but “can’t get” for each participant (time, cost, ethics)
x -- the predictor -- a variable related to the criterion that you will use to make an estimate of the criterion value for each participant
y’ -- the predicted criterion value -- the “best guess” of each participant’s y value, based on their x value -- the part of the criterion that is related to (predicted from) the predictor
residual -- the difference between the criterion and the predicted criterion value -- the part of the criterion not related to the predictor -- the stronger the correlation, the smaller the residual (on average)
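The two formulas above can be sketched numerically. This is an illustrative sketch (the simulated data, seed, and variable names are invented for the example): it fits y’ = bx + a by least squares, computes residual = y - y’, and shows that a stronger linear relationship leaves smaller residuals on average.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)  # the predictor

def mean_abs_residual(y, x):
    # fit y' = b*x + a by least squares, then residual = y - y'
    b, a = np.polyfit(x, y, 1)
    y_pred = b * x + a
    return np.mean(np.abs(y - y_pred))

# two criteria: one strongly, one weakly related to x
y_strong = 2 * x + rng.normal(scale=0.5, size=500)
y_weak = 2 * x + rng.normal(scale=5.0, size=500)

res_strong = mean_abs_residual(y_strong, x)
res_weak = mean_abs_residual(y_weak, x)
# the stronger the correlation, the smaller the residuals (on average)
```
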

Simple regression

y’ = bx + a (raw score form)

For a quantitative predictor:
– a = the expected value of y if x = 0
– b = the direction and extent of the expected change in the criterion for a 1-unit increase in the predictor

For a binary x with 0-1 coding:
– a = the mean of y for the group with code value = 0
– b = the y mean difference between the two coded groups
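The binary-coding interpretation can be checked directly. A minimal sketch (the group sizes, means, and seed are invented for illustration): with 0-1 coding, least squares recovers the code=0 group mean as a and the group mean difference as b.

```python
import numpy as np

rng = np.random.default_rng(6)
group = np.repeat([0, 1], 100)                # binary predictor, 0-1 coded
y = np.where(group == 0, 20, 27) + rng.normal(scale=3, size=200)

b, a = np.polyfit(group, y, 1)                # fit y' = b*x + a

# a recovers the mean of the code=0 group; b recovers the group mean difference
mean0 = y[group == 0].mean()
mean1 = y[group == 1].mean()
```
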

Let’s practice -- quantitative predictor...

#1  depression’ = (2.5 * stress) + 23
– apply the formula -- a patient has a stress score of 10: dep’ = 48
– interpret “b” -- for each 1-unit increase in stress, depression is expected to increase by 2.5
– interpret “a” -- if a person has a stress score of “0”, their expected depression score is 23

#2  job errors’ = (-6 * interview score) + 95
– apply the formula -- an applicant has an interview score of 10: expected number of job errors = 35
– interpret “b” -- for each 1-unit increase in interview score, errors are expected to decrease by 6
– interpret “a” -- if a person has an interview score of “0”, their expected number of job errors is 95
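As a quick check of the arithmetic above, the two practice formulas can be applied in a couple of lines:

```python
# Practice #1: depression' = (2.5 * stress) + 23, with stress = 10
dep_pred = (2.5 * 10) + 23

# Practice #2: job errors' = (-6 * interview score) + 95, with interview score = 10
err_pred = (-6 * 10) + 95
```
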

Let’s practice -- binary predictor...

#1  depression’ = (7.5 * tx group)    code: Tx=1, Cx=0
– interpret “b” -- the Tx group has a mean depression score 7.5 higher than the Cx group
– interpret “a” -- the mean of the Cx group (code=0) is ______, so the mean of the Tx group is ______

#2  job errors’ = (-2.0 * job) + 38    code: mgr=1, sales=0
– the mean # of job errors for the sales group is 38
– the mean # of job errors for the management group is 36
– if we measured another group of salespersons, the expected # of job errors would be 38

Conducting a Prediction Study

This is a 2-step process.

Step 1 -- use the “Modeling Sample,” which has values for both the predictor and the criterion. Determine whether there is a significant linear relationship between the predictor and the criterion. If there is an appreciable and significant correlation, build the regression model (find the values of b and a).

Step 2 -- use the “Application Sample,” which has values for only the predictor. Apply the regression model, obtaining a y’ value for each member of the sample.
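The two-step process can be sketched in code. The samples here are simulated (the sizes, coefficients, and scores are invented for the example): Step 1 fits the model on a Modeling Sample that has both variables; Step 2 applies it to an Application Sample that has only predictor values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1: Modeling Sample -- has both predictor and criterion values
x_model = rng.normal(50, 10, size=200)
y_model = 0.8 * x_model + 12 + rng.normal(scale=4, size=200)

r = np.corrcoef(x_model, y_model)[0, 1]   # check the correlation is appreciable
b, a = np.polyfit(x_model, y_model, 1)    # build the regression model

# Step 2: Application Sample -- has values for only the predictor
x_app = np.array([42.0, 55.0, 61.0])
y_app_pred = b * x_app + a                # a y' for each member of the sample
```
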

Advantages of Multiple Regression

Practical issues...
– better prediction from multiple predictors

Theoretical issues...
– even when we know in our hearts that the design will not support causal interpretation of the results, we have thoughts and theories about the causal relationships between the predictors and the criterion -- and these thoughts are about multi-causal relationships
– we can examine the “unique relationships” between the individual predictors within a model and the criterion

[Venn diagram representing r, b and R²: the criterion y overlaps each of the predictors x1, x2 and x3; the overlap between y and each predictor corresponds to r y,x1, r y,x2 and r y,x3]

[Venn diagram: the portions of y overlapped by the predictors x1, x2 and x3, taken together]

Remember: R² is the total variance shared between the model (all of the predictors) and the criterion -- in the diagram, the sum of all the areas of y overlapped by any predictor.

Raw score regression: y’ = b1x1 + b2x2 + b3x3 + a

Each b represents the unique and independent contribution of that predictor to the model.
– for a quantitative predictor, b tells the expected direction and amount of change in the criterion for a 1-unit change in that predictor, while holding the values of all the other predictors constant
– for a binary predictor (with unit coding -- 0,1 or 1,2, etc.), b tells the direction and amount of the group mean difference on the criterion variable, while holding the values of all the other predictors constant

a is the expected value of the criterion if all predictors have a value of 0.
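A raw-score multiple regression model can be estimated with ordinary least squares. This sketch uses simulated data with known weights (all names and values are invented for illustration), so the recovered b’s and a can be compared with the truth.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
stress = rng.normal(10, 3, size=n)       # quantitative predictor
support = rng.normal(5, 2, size=n)       # quantitative predictor
tx = rng.integers(0, 2, size=n)          # binary predictor, 0-1 coded

# criterion generated with known weights: 2.0, -1.5, -3.0 and intercept 35
y = 2.0 * stress - 1.5 * support - 3.0 * tx + 35 + rng.normal(scale=1, size=n)

# design matrix with a column of 1s for the intercept a
X = np.column_stack([stress, support, tx, np.ones(n)])
b1, b2, b3, a = np.linalg.lstsq(X, y, rcond=None)[0]
```
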

[Venn diagram: for each predictor, the region shared with y but with no other predictor corresponds to b x1 & β x1, b x2 & β x2, b x3 & β x3]

Remember that the b of each predictor represents the part of that predictor shared with the criterion that is not shared with any other predictor -- the unique contribution of that predictor to the model.

Let’s practice -- Tx (0 = control, 1 = treatment)

depression’ = (2.0 * stress) - (1.5 * support) - (3.0 * Tx) + 35

– apply the formula -- a patient has a stress score of 10, a support score of 4, and was in the treatment group: dep’ = 20 - 6 - 3 + 35 = 46
– interpret “b” for stress -- for each 1-unit increase in stress, depression is expected to increase by 2.0, when holding all other variables constant
– interpret “b” for support -- for each 1-unit increase in support, depression is expected to decrease by 1.5, when holding all other variables constant
– interpret “b” for Tx -- those in the Tx group are expected to have a mean depression score 3.0 lower than the control group, when holding all other variables constant
– interpret “a” -- if a person has a score of “0” on all predictors, their depression is expected to be 35

Standard score regression: Zy’ = β1Zx1 + β2Zx2 + β3Zx3

β carries the same information as b, but is scaled so that the weights are more comparable -- allowing us to think about the “relative importance” of the different predictors to the model.

The most common reason to refer to standardized weights is that you (or the reader) are unfamiliar with the scale of the criterion. A second reason is to promote comparability of the relative contributions of the various predictors (but see the important caveat discussed below!!!).
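The relation between b and β can be shown directly for a single predictor: β rescales b by the ratio of the predictor’s and criterion’s standard deviations, and with one predictor β equals r exactly. (The data and seed here are invented for the example.)

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(100, 15, size=400)           # predictor on a wide raw-score scale
y = 0.3 * x + rng.normal(scale=2, size=400)

b, a = np.polyfit(x, y, 1)                  # raw-score weight
beta = b * x.std(ddof=1) / y.std(ddof=1)    # rescale b into standard-score units

# for a single predictor, beta equals the correlation r
r = np.corrcoef(x, y)[0, 1]
```
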

Inspecting & describing the results of a multiple regression formula

1. Does the model work? F-test (ANOVA) of H0: R² = 0 (R = 0)

2. How well does the model work? R² is an “effect size estimate,” telling the proportion of variance of the criterion variable that is accounted for by the model

3. Which variables contribute to the model? t-test of H0: b = 0 for each variable

4. Which variables “contribute most” to the model? Compare the β weights of the variables (never the b weights). This must be done carefully -- only trust large differences (e.g., at least ______ different)
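The first two checks can be computed by hand on simulated data. This is a sketch (the variable names, sizes, and simulated coefficients are invented): it computes R² and the overall F-test of H0: R² = 0; the per-predictor t-tests follow the same pattern from the coefficient standard errors.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k = 200, 3
X = rng.normal(size=(n, k))
y = X @ np.array([1.5, 0.0, -2.0]) + rng.normal(size=n)

Xd = np.column_stack([X, np.ones(n)])        # add an intercept column
coefs = np.linalg.lstsq(Xd, y, rcond=None)[0]
y_hat = Xd @ coefs
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)

# 2. effect size: proportion of criterion variance accounted for by the model
r2 = 1 - ss_res / ss_tot

# 1. does the model work?  F-test of H0: R-squared = 0
F = (r2 / k) / ((1 - r2) / (n - k - 1))
p = stats.f.sf(F, k, n - k - 1)
```
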

Important Stuff !!!

There are two different reasons that a predictor might not be contributing to a multiple regression model...
– the variable isn’t correlated with the criterion
– the variable is correlated with the criterion, but is collinear with one or more other predictors, and so has no independent contribution to the multiple regression model

[Venn diagram: y overlapping x1, x2 and x3]
– x1 has a substantial r with the criterion and has a substantial b
– x2 has a substantial r with the criterion but has a small b, because it is collinear with x1
– x3 has neither a substantial r nor a substantial b
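The second reason (collinearity) can be demonstrated numerically. In this sketch (simulated data; all values invented for illustration), x2 correlates substantially with y only because it is collinear with x1, so its multiple regression weight comes out near zero.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # x2 is collinear with x1
y = x1 + rng.normal(scale=0.5, size=n)     # only x1 truly drives the criterion

r_x2 = np.corrcoef(x2, y)[0, 1]            # substantial r with the criterion

X = np.column_stack([x1, x2, np.ones(n)])
b1, b2, a = np.linalg.lstsq(X, y, rcond=None)[0]
# b2 is near zero: x2's overlap with y is already claimed by x1
```
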