
Introduction to Regression Analysis

Two Purposes

Explanation: explain (or account for) the variance in a variable (e.g., explain why children’s test scores vary). We’ll cover this later.

Prediction: construct an equation to predict scores on some variable, e.g., an equation that can be used in selecting individuals.

Prediction

Use a set of scores collected from a sample to make predictions about individuals in the population (not in the sample). Use the scores to construct a mathematical (typically linear) equation that allows us to predict performance. Two types of scores are collected: usually a measure on one criterion (outcome, dependent) variable, and scores on one or more predictor (independent) variables.

The equations

The equation for one individual’s criterion score:
$Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki} + e_i$

The prediction equation for that individual’s score:
$\hat{Y}_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki}$

The difference between the two (called a residual):
$e_i = Y_i - \hat{Y}_i$

The function

The linear function has the form:
$\hat{Y} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
where the βs are weights (regression weights) selected such that the sum of squared errors is minimized (the least squares criterion).
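As a concrete illustration, the least-squares weights can be computed numerically. Below is a minimal sketch using NumPy; the data values are made up for the example:

```python
import numpy as np

# Hypothetical data for the sketch: five cases, two predictors.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([2.0, 3.0, 5.0, 6.0, 8.0])

# Prepend a column of ones so the intercept (beta_0) is estimated too.
X1 = np.column_stack([np.ones(len(y)), X])

# Least-squares weights: chosen to minimize the sum of squared errors.
betas, _, _, _ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ betas   # predicted scores
e = y - y_hat        # residuals
print(betas, (e ** 2).sum())
```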

Multiple Correlation

Minimizing the sum of squared errors makes the correlation between the actual criterion scores and the predicted scores as large as possible. This correlation is called a multiple correlation. It is the correlation between the criterion variable and a linear composite of the predictor variables.
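Continuing the sketch above, R can be seen concretely as the Pearson correlation between the actual and predicted criterion scores:

```python
# The multiple correlation R: correlation between y and y_hat.
R = np.corrcoef(y, y_hat)[0, 1]
print(R, R ** 2)   # R and the coefficient of determination R^2
```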

Coefficient of Determination

The square of the multiple correlation, $R^2$, is called the coefficient of determination. It gives the proportion of shared variance between the criterion variable and the weighted linear composite. Hence, the larger $R^2$ is, the better the prediction equation.

Basic regression equation

$\hat{Y} = a + b_{yx} X$
where a is the intercept and $b_{yx}$ is the slope of the regression of Y on X.

Computing the constants in the regression equation

$b_{yx} = \frac{\sum xy}{\sum x^2}$ and $a = \bar{Y} - b_{yx}\bar{X}$

where $x = X - \bar{X}$ and $y = Y - \bar{Y}$ are deviation scores.
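A minimal sketch of these two formulas in NumPy, using made-up data:

```python
import numpy as np

# Hypothetical simple-regression data for the sketch.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

x = X - X.mean()   # deviation scores
y = Y - Y.mean()

b = (x * y).sum() / (x ** 2).sum()   # slope: b_yx = sum(xy) / sum(x^2)
a = Y.mean() - b * X.mean()          # intercept: a = Ybar - b * Xbar
print(a, b)
```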

A closer look at the regression equation

Partitioning the Sum of Squares (SS_y)

SS_y is given by
$SS_y = \sum (Y - \bar{Y})^2$
Now, consider the following identity:
$Y = \hat{Y} + (Y - \hat{Y})$
Subtracting $\bar{Y}$ from each side gives
$Y - \bar{Y} = (\hat{Y} - \bar{Y}) + (Y - \hat{Y})$
Squaring and summing gives
$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2 + 2\sum (\hat{Y} - \bar{Y})(Y - \hat{Y})$

Simplifying the previous equation

Because the cross-product term reduces to zero (shown later), this simplifies to
$SS_y = SS_{reg} + SS_{res}$
where $SS_{reg} = \sum(\hat{Y} - \bar{Y})^2$ is the sum of squares due to regression and $SS_{res} = \sum(Y - \hat{Y})^2$ is the residual sum of squares. Dividing through by the total sum of squares gives
$1 = \frac{SS_{reg}}{SS_y} + \frac{SS_{res}}{SS_y}$, or $\frac{SS_{reg}}{SS_y} = 1 - \frac{SS_{res}}{SS_y}$
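Continuing the sketch above, the partition can be verified numerically:

```python
# Verify that SS_y = SS_reg + SS_res for the fitted line.
Y_hat = a + b * X
SS_y = (y ** 2).sum()                     # total sum of squares
SS_reg = ((Y_hat - Y.mean()) ** 2).sum()  # sum of squares due to regression
SS_res = ((Y - Y_hat) ** 2).sum()         # residual sum of squares
print(SS_y, SS_reg + SS_res)              # the two should agree
```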

Example

A small data set of paired scores on the criterion, Y, and the predictor, X. (The data values appeared in a table on the original slide.)

Calculation of squares and cross-products

From the example data, compute the deviation squares and cross-products ($x^2$, $y^2$, $xy$) and then the sums of squares and cross-products ($\sum x^2$, $\sum y^2$, $\sum xy$). (The worked table appeared on the original slide.)

Calculation of the coefficients

The slope, $b_{yx} = \sum xy / \sum x^2$; the intercept, $a = \bar{Y} - b_{yx}\bar{X}$; and the regression line, $\hat{Y} = a + b_{yx}X$.

Calculation of SS_reg

From an earlier equation,
$SS_{reg} = \sum (\hat{Y} - \bar{Y})^2$

Some additional equations for SS_reg

$SS_{reg} = b_{yx}\sum xy = b_{yx}^2 \sum x^2 = \frac{(\sum xy)^2}{\sum x^2}$
Hence, all three forms yield the same value.

SS_reg computed from a correlation

The formula for the Pearson correlation is
$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$
therefore
$SS_{reg} = r^2 \sum y^2 = r^2 SS_y$
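Continuing the sketch, the two routes to SS_reg give the same number:

```python
# SS_reg obtained from the correlation, compared with the direct value.
r = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())
print(r ** 2 * SS_y, SS_reg)   # both forms give the same value
```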

A Closer Look at the Equations in Regression Analysis

The Variance

$s_x^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$

The standard deviation

$s_x = \sqrt{s_x^2}$

The covariance

$s_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$

The Pearson product moment correlation

$r_{xy} = \frac{s_{xy}}{s_x s_y}$
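A short sketch that mirrors the four definitions above, with made-up scores:

```python
import numpy as np

# Hypothetical paired scores for the sketch.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
N = len(X)

var_x = ((X - X.mean()) ** 2).sum() / (N - 1)               # variance
sd_x = np.sqrt(var_x)                                       # standard deviation
sd_y = np.sqrt(((Y - Y.mean()) ** 2).sum() / (N - 1))
cov_xy = ((X - X.mean()) * (Y - Y.mean())).sum() / (N - 1)  # covariance
r_xy = cov_xy / (sd_x * sd_y)                               # Pearson r
print(r_xy)
```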

The normal equations (for the regression of y on x)

$\sum Y = aN + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
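Continuing the sketch, the normal equations can be solved directly as a 2×2 linear system:

```python
# Solve the two normal equations for a and b.
A = np.array([[N, X.sum()],
              [X.sum(), (X ** 2).sum()]])
rhs = np.array([Y.sum(), (X * Y).sum()])
a, b = np.linalg.solve(A, rhs)
print(a, b)
```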

The structural model (for an observation on individual i)

$Y_i = \alpha + \beta X_i + \varepsilon_i$

The regression equation

$\hat{Y}_i = a + bX_i$
where a and b are the sample estimates of $\alpha$ and $\beta$.

Partitioning a deviation score, y

$y = Y - \bar{Y} = (\hat{Y} - \bar{Y}) + (Y - \hat{Y})$

The score, Y, is partitioned

Hence, Y is partitioned into a deviation of a predicted score from the mean of the scores PLUS a deviation of the actual score from the predicted score. Our next step is to square the deviation and sum over all the scores.

Partitioning the sum of squared deviations (sum of squares, SS_y)

$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2 + 2\sum (\hat{Y} - \bar{Y})(Y - \hat{Y})$

What happened to the term $2\sum (\hat{Y} - \bar{Y})(Y - \hat{Y})$? Showing that it reduces to zero requires some complicated algebra, recalling that $\hat{Y} = a + bX$ and that $a = \bar{Y} - b\bar{X}$.

Calculation of proportions of sums of squares due to regression and due to error (or residual)

$\frac{SS_{reg}}{SS_y} = r^2$ and $\frac{SS_{res}}{SS_y} = 1 - r^2$

Alternative formulas for computing the sum of squares due to regression

$SS_{reg} = b\sum xy = b^2 \sum x^2 = \frac{(\sum xy)^2}{\sum x^2} = r^2 SS_y$

Test of the regression coefficient, $b_{yx}$ (i.e., test the null hypothesis that $\beta_{yx} = 0$)

First compute the variance of estimate:
$s_{y.x}^2 = \frac{SS_{res}}{N - 2}$

Test of the regression coefficient, $b_{yx}$ (continued)

Then obtain the standard error of estimate:
$s_{y.x} = \sqrt{\frac{SS_{res}}{N - 2}}$
Then compute the standard error of the regression coefficient, $s_b$:
$s_b = \frac{s_{y.x}}{\sqrt{\sum x^2}}$

The test of significance of the regression coefficient ($b_{yx}$)

The significance of the regression coefficient is tested using a t test with (N − k − 1) degrees of freedom:
$t = \frac{b_{yx}}{s_b}$
With a single predictor, k = 1, so df = N − 2.
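Continuing the sketch, the three steps above (variance of estimate, standard error of the coefficient, t statistic) in code:

```python
# t test of the regression coefficient b (df = N - 2 for one predictor).
Y_hat = a + b * X
SS_res = ((Y - Y_hat) ** 2).sum()
s2_est = SS_res / (N - 2)                            # variance of estimate
s_b = np.sqrt(s2_est / ((X - X.mean()) ** 2).sum())  # SE of the slope
t = b / s_b
print(t)
```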

Computing regression using correlations

The correlation in the population is given by
$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
The population correlation coefficient, $\rho_{xy}$, is estimated by the sample correlation coefficient,
$r_{xy} = \frac{s_{xy}}{s_x s_y}$

Sums of squares, regression (SS_reg)

Recalling that $R^2$ gives the proportion of variance of Y accounted for (or explained) by X, we can obtain
$SS_{reg} = R^2 SS_y$
or, in other words, SS_reg is that portion of SS_y predicted or explained by the regression of Y on X.

Standard error of estimate

From SS_res we can compute the variance of estimate and standard error of estimate as
$s_{y.x}^2 = \frac{SS_{res}}{N - 2}$ and $s_{y.x} = \sqrt{\frac{SS_{res}}{N - 2}}$
(Note: alternative formulas were given earlier.)

Testing the Significance of r

The significance of a correlation coefficient, r, is tested using a t test with N − 2 degrees of freedom:
$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}}$
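A minimal sketch of this t test, with hypothetical values for r and N:

```python
import numpy as np
from scipy import stats

r, N = 0.45, 30                               # made-up sample values
t = r * np.sqrt(N - 2) / np.sqrt(1 - r ** 2)  # t statistic
p = 2 * stats.t.sf(abs(t), df=N - 2)          # two-tailed p value
print(t, p)
```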

Testing the difference between two correlations

To test the difference between two Pearson correlation coefficients, use the “Comparing two correlation coefficients” calculator on my web site. (The usual approach applies Fisher’s r-to-z transformation; see the sketch below.)
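For readers without access to that calculator, here is a sketch of the standard Fisher r-to-z approach for two independent correlations; the input values are made up:

```python
import numpy as np
from scipy import stats

def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent Pearson correlations via Fisher's
    r-to-z transformation."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)        # Fisher z transforms
    se = np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = 2 * stats.norm.sf(abs(z))                  # two-tailed p value
    return z, p

print(fisher_z_test(0.60, 50, 0.35, 45))  # hypothetical correlations and Ns
```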

Testing the difference between two regression coefficients

This, also, is a t test:
$t = \frac{b_1 - b_2}{s_{b_1 - b_2}}$
where the standard error of a regression coefficient, $s_b$, was given earlier.

Point-biserial and Phi correlation

These are both Pearson product-moment correlations. The point-biserial correlation is used when one variable is a scaled variable and the other represents a true dichotomy. For instance, the correlation between performance on an item (the dichotomous variable) and the total score on a test (the scaled variable).

Point-biserial and Phi correlation The Phi correlation is used when both variables represent a true dichotomy. For instance, the correlation between two test items.
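Both points can be checked with SciPy; a sketch with made-up item and score data:

```python
import numpy as np
from scipy import stats

# Hypothetical item responses (0/1) and total test scores.
item = np.array([0, 1, 1, 0, 1, 1, 0, 1])
total = np.array([12, 18, 20, 11, 17, 22, 13, 19])

# Point-biserial: a Pearson r where one variable is a true dichotomy.
r_pb, _ = stats.pointbiserialr(item, total)
r_pearson, _ = stats.pearsonr(item, total)
print(r_pb, r_pearson)  # identical values

# Phi: a Pearson r between two dichotomous (0/1) variables.
item2 = np.array([0, 1, 0, 0, 1, 1, 0, 1])
print(stats.pearsonr(item, item2)[0])
```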

Biserial and Tetrachoric correlation These are non-Pearson correlations. Both are rarely used anymore. The biserial correlation is used when one variable is truly a scaled variable and the other represents an artificial dichotomy. The Tetrachoric correlation is used when both variables represent an artificial dichotomy.

Spearman’s Rho Coefficient and Kendall’s Tau Coefficient

Spearman’s rho is used to compute the correlation between two ordinal (or ranked) variables; it is the Pearson correlation between the two sets of ranks. Kendall’s tau is an alternative rank-based coefficient, computed from the numbers of concordant and discordant pairs of ranks.
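Both coefficients are available in SciPy; a sketch with made-up ranks:

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6])   # hypothetical ranks
y = np.array([2, 1, 4, 3, 6, 5])

rho, p_rho = stats.spearmanr(x, y)   # Spearman's rho
tau, p_tau = stats.kendalltau(x, y)  # Kendall's tau
print(rho, tau)
```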