©2006 Thomson/South-Western 1 Chapter 13 – Correlation and Simple Regression Slides prepared by Jeff Heyl Lincoln University ©2006 Thomson/South-Western.

Presentation transcript:

©2006 Thomson/South-Western 1 Chapter 13 – Correlation and Simple Regression Slides prepared by Jeff Heyl Lincoln University Concise Managerial Statistics KVANLI PAVUR KEELING

©2006 Thomson/South-Western 2 Bivariate Data [Figure: two scatter plots, (a) and (b), of square footage in hundreds (Y) against income in thousands (X)]

©2006 Thomson/South-Western 3 Coefficient of Correlation The coefficient of correlation, r, measures the strength of the linear relationship between two variables.
$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \frac{\sum xy - (\sum x)(\sum y)/n}{\sqrt{\sum x^2 - (\sum x)^2/n}\,\sqrt{\sum y^2 - (\sum y)^2/n}}$

©2006 Thomson/South-Western 4 Coefficient of Correlation Properties
1. r ranges from -1.0 to 1.0
2. The larger |r| is, the stronger the linear relationship
3. The sign of r tells you whether the relationship between X and Y is positive (direct) or negative (inverse)
4. r = 1 or r = -1 implies that a perfect linear pattern exists between the two variables; that is, they are perfectly correlated

©2006 Thomson/South-Western 5 Sum of Squares
$SS_X$ = sum of squares for X $= \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$
$SS_Y$ = sum of squares for Y $= \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$
$SCP_{XY}$ = sum of cross products for XY $= \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$

©2006 Thomson/South-Western 6 Sum of Squares
$SS_X$ = sum of squares for X $= \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$
$SS_Y$ = sum of squares for Y $= \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$
$SCP_{XY}$ = sum of cross products for XY $= \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$
$r = \frac{SCP_{XY}}{\sqrt{SS_X}\,\sqrt{SS_Y}}$
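The shortcut formulas for the sums of squares and r translate directly into code. The following is a minimal Python sketch; the function names and the five-point data set are assumptions made for illustration and are not the textbook's real estate data.

```python
import math

def sums_of_squares(x, y):
    """Return (SS_X, SS_Y, SCP_XY) using the shortcut formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_x = sum(v * v for v in x) - sum_x ** 2 / n
    ss_y = sum(v * v for v in y) - sum_y ** 2 / n
    scp_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    return ss_x, ss_y, scp_xy

def correlation(x, y):
    """Coefficient of correlation r = SCP_XY / sqrt(SS_X * SS_Y)."""
    ss_x, ss_y, scp_xy = sums_of_squares(x, y)
    return scp_xy / math.sqrt(ss_x * ss_y)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ss_x, ss_y, scp_xy = sums_of_squares(x, y)
r = correlation(x, y)
print(ss_x, ss_y, scp_xy, round(r, 4))  # 10.0 6.0 6.0 0.7746
```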

©2006 Thomson/South-Western 7 Scatter Diagram and Correlation Coefficient Figure 13.2

©2006 Thomson/South-Western 8 Vertical Distances [Figure 13.3: vertical distances $d_1, d_2, \ldots, d_{10}$ from the sample points to Line L; axes: income (X), square footage (Y)]

©2006 Thomson/South-Western 9 Least Squares Line The least squares line is the line through the data that minimizes the sum of the squared vertical distances between the observations and the line
$\sum d^2 = d_1^2 + d_2^2 + \cdots + d_n^2$
$b_1 = \frac{SCP_{XY}}{SS_X}$
$b_0 = \bar{y} - b_1 \bar{x}$

©2006 Thomson/South-Western 10 Least Squares Line [Figure 13.6: the line $\hat{Y} = b_0 + b_1 X$ with vertical distances $d_1$ and $d_2$; $\hat{Y}$ is marked for X = 50; each distance is $Y - \hat{Y}$; axes: income (X), square footage (Y)]

©2006 Thomson/South-Western 11 Sum of Squares of Error
$SSE = \sum d^2 = \sum (y - \hat{y})^2$
$SSE = SS_Y - \frac{(SCP_{XY})^2}{SS_X}$
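As a rough check on the least squares formulas, the sketch below computes $b_1$, $b_0$, and SSE two ways: by the shortcut $SS_Y - (SCP_{XY})^2/SS_X$ and by summing the squared residuals directly. The function name and data are hypothetical, chosen only for illustration.

```python
def least_squares_line(x, y):
    """Return (b0, b1, SSE) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x                        # slope
    b0 = y_bar - b1 * x_bar                   # intercept
    sse_shortcut = ss_y - scp_xy ** 2 / ss_x  # shortcut form of SSE
    return b0, b1, sse_shortcut

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, sse_shortcut = least_squares_line(x, y)

# SSE from its definition: the sum of squared vertical distances d
sse_direct = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
print(round(b0, 4), round(b1, 4), round(sse_shortcut, 4), round(sse_direct, 4))
# 2.2 0.6 2.4 2.4
```

Both routes to SSE should agree up to floating-point rounding, which is a convenient sanity check on hand calculations.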

©2006 Thomson/South-Western 12 Least Squares Line for Real Estate Data [Figure 13.5: least squares line $\hat{Y}$ for the real estate data, with labels at X = 50 and Y = 20; axes: income (X), square footage (Y)]

©2006 Thomson/South-Western 13 Assumptions for the Simple Regression Model The model is $Y = \beta_0 + \beta_1 X + e$
1. The mean of each error component is zero
2. Each error component (random variable) follows an approximate normal distribution
3. The variance of the error component is the same for each value of X
4. The errors are independent of each other

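©2006 Thomson/South-Western 14 Assumption 1 for the Simple Regression Model [Figure 13.6: the line $Y = \beta_0 + \beta_1 X$ passes through the means $\mu_{Y|35}$ and $\mu_{Y|50}$; each observation is $Y = \beta_0 + \beta_1 X + e$, with the errors e centered at 0; axes: income (X), square footage (Y)]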

©2006 Thomson/South-Western 15 Violation of Assumption 3 [Figure 13.7: errors e about the line $Y = \beta_0 + \beta_1 X$; axes: income (X), square footage (Y)]

©2006 Thomson/South-Western 16 Assumptions 1, 2, 3 for the Simple Regression Model [Figure 13.8: errors e at each value of X; axes: income (X), square footage (Y)]

©2006 Thomson/South-Western 17 Estimating the Error Variance, $\sigma_e^2$
$s^2 = \hat{\sigma}_e^2 = \text{estimate of } \sigma_e^2 = \frac{SSE}{n - 2}$
where $SSE = \sum (y - \hat{y})^2 = SS_Y - \frac{(SCP_{XY})^2}{SS_X}$
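A short sketch of the error variance estimate, assuming the same hypothetical five-point data used for illustration elsewhere; `error_variance` is an illustrative name, not something from the slides. The divisor is n - 2 because two parameters, $b_0$ and $b_1$, are estimated from the sample.

```python
import math

def error_variance(x, y):
    """Return (s_squared, s), where s^2 = SSE / (n - 2)."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations for n - 2 degrees of freedom")
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sse = ss_y - scp_xy ** 2 / ss_x
    s_squared = sse / (n - 2)
    return s_squared, math.sqrt(s_squared)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s_squared, s = error_variance(x, y)
print(round(s_squared, 4), round(s, 4))  # 0.8 0.8944
```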

©2006 Thomson/South-Western 18 Three Possible Populations [Figure 13.9: (a) $\beta_1 = 0$, (b) $\beta_1 > 0$, (c) $\beta_1 < 0$]

©2006 Thomson/South-Western 19 Hypothesis Test on the Slope of the Regression Line Two-Tailed Test
$H_0: \beta_1 = 0$ (X provides no information)
$H_a: \beta_1 \neq 0$ (X does provide information)
Test statistic: $t = \frac{b_1 - \beta_1}{s / \sqrt{SS_X}} = \frac{b_1 - \beta_1}{s_{b_1}}$
Reject $H_0$ if $|t| > t_{\alpha/2,\,n-2}$

©2006 Thomson/South-Western 20 Hypothesis Test on the Slope of the Regression Line One-Tailed Tests
Test statistic: $t = \frac{b_1}{s_{b_1}}$
$H_0: \beta_1 \le 0$, $H_a: \beta_1 > 0$: reject $H_0$ if $t > t_{\alpha,\,n-2}$
$H_0: \beta_1 \ge 0$, $H_a: \beta_1 < 0$: reject $H_0$ if $t < -t_{\alpha,\,n-2}$
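The slope test statistic $t = b_1 / s_{b_1}$, with $s_{b_1} = s/\sqrt{SS_X}$, can be sketched as follows. The data and function name are hypothetical. The Python standard library has no t distribution, so the two-tailed critical value $t_{.025,3} = 3.182$ is taken from a standard t table and hard-coded as an assumption.

```python
import math

def slope_t_statistic(x, y):
    """Return (t, df) for testing H0: beta_1 = 0, where t = b1 / s_b1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    sse = ss_y - scp_xy ** 2 / ss_x
    s = math.sqrt(sse / (n - 2))   # estimated standard deviation of the errors
    s_b1 = s / math.sqrt(ss_x)     # standard error of the slope
    return b1 / s_b1, n - 2

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t_stat, df = slope_t_statistic(x, y)

T_CRIT = 3.182  # assumed table value of t_{.025, 3} for a two-tailed test at alpha = .05
reject_h0 = abs(t_stat) > T_CRIT
print(round(t_stat, 4), df, reject_h0)  # 2.1213 3 False
```

With only five points the test has little power, so failing to reject $H_0$ here says more about the sample size than about the relationship.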

©2006 Thomson/South-Western 21 t Curve with 8 df [Figure: t curve with a rejection region in each tail]

©2006 Thomson/South-Western 22 Real Estate Example Figure 13.11

©2006 Thomson/South-Western 23 Real Estate Example Figure 13.12

©2006 Thomson/South-Western 24 Real Estate Example Figure 13.13

©2006 Thomson/South-Western 25 Real Estate Example Figure 13.14

©2006 Thomson/South-Western 26 Scatter Diagram [Figure 13.15: scatter diagram with the least squares line $\hat{Y}$; axes: age (X), liquid assets as % of annual income (Y)]

©2006 Thomson/South-Western 27 Scatter Diagram Figure
$SS_X$ = …, $\bar{x}$ = …, $SS_Y = 348.92$, $\bar{y}$ = …, $SCP_{XY}$ = …
$r = \frac{SCP_{XY}}{\sqrt{SS_X}\,\sqrt{SS_Y}} = .672$

©2006 Thomson/South-Western 28 Confidence Interval for $\beta_1$ The $(1 - \alpha) \cdot 100\%$ confidence interval for $\beta_1$ is
$b_1 - t_{\alpha/2,\,n-2}\, s_{b_1}$ to $b_1 + t_{\alpha/2,\,n-2}\, s_{b_1}$
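A minimal sketch of the interval $b_1 \pm t_{\alpha/2,\,n-2}\, s_{b_1}$. The critical value is passed in by the caller because it comes from a t table; the value 3.182 used below is the table entry for $t_{.025,3}$. The data and function name are illustrative assumptions.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) limits b1 -/+ t_crit * s_b1 for the slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    s = math.sqrt((ss_y - scp_xy ** 2 / ss_x) / (n - 2))
    s_b1 = s / math.sqrt(ss_x)
    half_width = t_crit * s_b1
    return b1 - half_width, b1 + half_width

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
lower, upper = slope_confidence_interval(x, y, t_crit=3.182)  # 95% interval, 3 df
print(round(lower, 3), round(upper, 3))  # -0.3 1.5
```

The interval contains 0, which matches a two-tailed test at the same level failing to reject $H_0: \beta_1 = 0$ for this sample.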

©2006 Thomson/South-Western 29 Curvilinear Relationship Figure 13.16

©2006 Thomson/South-Western 30 Measuring the Strength of the Model
$r = \frac{SCP_{XY}}{\sqrt{SS_X}\,\sqrt{SS_Y}}$
$H_0: \rho = 0$ (no linear relationship exists between X and Y)
$H_a: \rho \neq 0$ (a linear relationship does exist)
$t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}}$
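The test statistic for $\rho$ needs only r and n. The sketch below uses hypothetical data and an illustrative function name. In simple linear regression this t is algebraically the same as the slope statistic $b_1 / s_{b_1}$, so the two tests reach the same conclusion.

```python
import math

def correlation_t_statistic(x, y):
    """Return (r, t) with t = r / sqrt((1 - r^2) / (n - 2))."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    r = scp_xy / math.sqrt(ss_x * ss_y)
    t = r / math.sqrt((1 - r ** 2) / (n - 2))
    return r, t

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, t = correlation_t_statistic(x, y)
print(round(r, 4), round(t, 4))  # 0.7746 2.1213
```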

©2006 Thomson/South-Western 31 Danger of Assuming Causality A high statistical correlation does not imply causality. There are many situations when variables are highly correlated because a factor not being studied affects the variables being studied.

©2006 Thomson/South-Western 32 Coefficient of Determination
$SSE = SS_Y - \frac{(SCP_{XY})^2}{SS_X}$
$r^2 = \frac{(SCP_{XY})^2}{SS_X\, SS_Y}$
$r^2$ = coefficient of determination $= 1 - \frac{SSE}{SS_Y}$ = proportion of the variation in the dependent variable explained by the simple linear regression model (multiply by 100 for a percentage)

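©2006 Thomson/South-Western 33 Total Variation, $SS_Y$ [Figure: a sample point $(x, y)$, the point $(x, \hat{y})$ on the line $\hat{Y} = b_0 + b_1 X$, and the mean $\bar{y}$, showing the deviations $y - \hat{y}$ and $\hat{y} - \bar{y}$]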

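©2006 Thomson/South-Western 34 Total Variation, $SS_Y$ [Figure: a sample point $(x, y)$, the point $(x, \hat{y})$ on the line $\hat{Y} = b_0 + b_1 X$, and the mean $\bar{y}$, showing the deviations $y - \hat{y}$ and $\hat{y} - \bar{y}$]
$SS_Y = SSR + SSE$
$SSR = \frac{(SCP_{XY})^2}{SS_X}$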
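The decomposition $SS_Y = SSR + SSE$ and the coefficient of determination can be checked numerically. In this sketch SSR and SSE are computed from their definitions (explained and unexplained deviations) rather than from the shortcut formulas, so the identity is verified and not assumed. The data and names are hypothetical.

```python
def variation_decomposition(x, y):
    """Return (SS_Y, SSR, SSE, r_squared) computed from the fitted line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * a for a in x]
    ss_y = sum((b - y_bar) ** 2 for b in y)             # total variation
    ssr = sum((f - y_bar) ** 2 for f in y_hat)          # explained variation
    sse = sum((b - f) ** 2 for b, f in zip(y, y_hat))   # unexplained variation
    return ss_y, ssr, sse, 1 - sse / ss_y

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ss_y, ssr, sse, r_squared = variation_decomposition(x, y)
print(round(ss_y, 4), round(ssr, 4), round(sse, 4), round(r_squared, 4))
# 6.0 3.6 2.4 0.6
```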

©2006 Thomson/South-Western 35 Estimation and Prediction Using the Simple Linear Model The least squares line can be used to estimate average values or predict individual values

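©2006 Thomson/South-Western 36 Confidence Interval for $\mu_{Y|x_0}$ The $(1 - \alpha) \cdot 100\%$ confidence interval for $\mu_{Y|x_0}$ is
$\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$ to $\hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$
where $s_{\hat{Y}} = s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$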

©2006 Thomson/South-Western 37 Confidence and Prediction Intervals Figure 13.18

©2006 Thomson/South-Western 38 Confidence and Prediction Intervals Figure 13.19

©2006 Thomson/South-Western 39 Confidence and Prediction Intervals Figure 13.20

©2006 Thomson/South-Western 40 95% Confidence Intervals [Figure: least squares line $\hat{Y}$ with upper and lower 95% confidence limits; the X axis runs from 20 to 70]

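©2006 Thomson/South-Western 41 Prediction Interval for $Y_{x_0}$
$\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$ to $\hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$
where $s_Y^2 = s^2 \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X} \right]$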
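The confidence interval for the mean of Y at $x_0$ and the prediction interval for an individual Y at $x_0$ differ only by the extra 1 under the square root. The sketch below computes both side by side; the data, the function name, and the hard-coded table value $t_{.025,3} = 3.182$ are assumptions for illustration.

```python
import math

def intervals_at(x, y, x0, t_crit):
    """Return (y_hat, confidence_interval, prediction_interval) at x0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    s = math.sqrt((ss_y - scp_xy ** 2 / ss_x) / (n - 2))
    y_hat = b0 + b1 * x0
    h0 = 1 / n + (x0 - x_bar) ** 2 / ss_x
    ci_half = t_crit * s * math.sqrt(h0)       # interval for the mean of Y at x0
    pi_half = t_crit * s * math.sqrt(1 + h0)   # interval for an individual Y at x0
    ci = (y_hat - ci_half, y_hat + ci_half)
    pi = (y_hat - pi_half, y_hat + pi_half)
    return y_hat, ci, pi

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat, ci, pi = intervals_at(x, y, x0=4, t_crit=3.182)  # 95% intervals, 3 df
print(round(y_hat, 3), [round(v, 3) for v in ci], [round(v, 3) for v in pi])
```

The prediction interval is always wider than the confidence interval at the same $x_0$, and both widen as $x_0$ moves away from $\bar{x}$.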

©2006 Thomson/South-Western 42 95% Confidence Intervals [Figure: prediction interval limits and confidence interval limits about the least squares line; the X axis runs from 20 to 70]

©2006 Thomson/South-Western 43 Checking Model Assumptions
1. The errors are normally distributed with a mean of zero
2. The variance of the errors remains constant. For example, you should not observe larger errors associated with larger values of X.
3. The errors are independent

©2006 Thomson/South-Western 44 Examination of Residuals [Figure 13.23: residual plots, panels (a) and (b), with the residuals $Y - \hat{Y}$ on the vertical axis]

©2006 Thomson/South-Western 45 Examination of Residuals [Figure: residuals $Y - \hat{Y}$ plotted against time, 1992 through 2001]

©2006 Thomson/South-Western 46 Autocorrelation and the Durbin-Watson Statistic
$DW = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$
DW ranges from 0 to 4
The ideal value is 2
As DW decreases from 2, positive autocorrelation increases
As DW increases from 2, negative autocorrelation increases
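The Durbin-Watson formula is simple to compute from a list of residuals in time order. The two residual sequences below are made up for illustration: one drifts slowly (positive autocorrelation, DW below 2) and one alternates in sign (negative autocorrelation, DW above 2).

```python
def durbin_watson(residuals):
    """DW = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Hypothetical residuals in time order, for illustration only
drifting = [1, 2, 1, -1, -2, -1]   # neighbouring residuals tend to share a sign
alternating = [1, -1, 1, -1]       # neighbouring residuals flip sign

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(round(dw_drifting, 4), dw_alternating)  # 0.6667 3.0
```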

©2006 Thomson/South-Western 47 Autocorrelation and the Durbin-Watson Statistic Figure 13.25

©2006 Thomson/South-Western 48 Checking for Outliers Figure 13.26

©2006 Thomson/South-Western 49 Identifying Outlying Values Outlying sample values of X can be found by calculating the sample leverage
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_X}$, where $SS_X = \sum x^2 - \frac{(\sum x)^2}{n}$
An observation is considered an outlier if its leverage is greater than 4/n or 6/n
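A sketch of the leverage calculation with the 4/n cutoff. The function names and the data set, in which the last X value sits far from the rest, are hypothetical. The cutoff multiplier is a parameter so that 6/n can be used instead.

```python
def leverages(x):
    """Return h_i = 1/n + (x_i - x_bar)^2 / SS_X for every observation."""
    n = len(x)
    x_bar = sum(x) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    return [1 / n + (a - x_bar) ** 2 / ss_x for a in x]

def outlying_x_indices(x, multiplier=4):
    """Indices (0-based) whose leverage exceeds multiplier / n."""
    cutoff = multiplier / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]

# Hypothetical X values, for illustration only; the last one is far from the others
x = [1, 2, 3, 4, 20]
h = leverages(x)
flagged = outlying_x_indices(x)
print([round(v, 3) for v in h], flagged)  # [0.3, 0.264, 0.236, 0.216, 0.984] [4]
```

In simple regression the leverages always sum to 2, which gives a quick check on the arithmetic.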

©2006 Thomson/South-Western 50 Identifying Outlying Values
The standard deviation of the predicted Y value is $s_{\hat{Y}} = s \sqrt{h_i}$
The confidence interval is $\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{h_i}$ to $\hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{h_i}$
The prediction interval is $\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{1 + h_i}$ to $\hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{1 + h_i}$

©2006 Thomson/South-Western 51 Real Estate Example Figure 13.27(a)

©2006 Thomson/South-Western 52 Real Estate Example Figure 13.27(b)

©2006 Thomson/South-Western 53 Identifying Outlying Values Unusually large or small values of the dependent variable (Y) can generally be detected using the sample standardized residuals
Estimated standard deviation of the ith residual $= s \sqrt{1 - h_i}$
Standardized residual $= \frac{Y_i - \hat{Y}_i}{s \sqrt{1 - h_i}}$
An observation is thought to have an outlying value of Y if its standardized residual is > 2 or < -2

©2006 Thomson/South-Western 54 Identifying Influential Observations Cook’s distance measure
$D_i = (\text{standardized residual})^2 \cdot \frac{1}{2} \cdot \frac{h_i}{1 - h_i} = \frac{(Y_i - \hat{Y}_i)^2}{2 s^2} \cdot \frac{h_i}{(1 - h_i)^2}$
You may conclude the ith observation is influential if the corresponding $D_i$ measure > .8
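Standardized residuals and Cook's distance both build on the leverages $h_i$, the residuals, and s. The sketch below computes all three per observation, using the first form of the $D_i$ formula. The data and function name are hypothetical, and the thresholds (|standardized residual| > 2 and $D_i$ > .8) are the ones stated on the slides.

```python
import math

def regression_diagnostics(x, y):
    """Return per-observation lists (leverage, standardized residual, Cook's D)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    resid = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))
    h = [1 / n + (a - x_bar) ** 2 / ss_x for a in x]
    std_resid = [e / (s * math.sqrt(1 - hi)) for e, hi in zip(resid, h)]
    cooks_d = [0.5 * r * r * hi / (1 - hi) for r, hi in zip(std_resid, h)]
    return h, std_resid, cooks_d

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
h, std_resid, cooks_d = regression_diagnostics(x, y)

outlying_y = [i for i, r in enumerate(std_resid) if abs(r) > 2]
influential = [i for i, d in enumerate(cooks_d) if d > 0.8]
print([round(d, 4) for d in cooks_d], outlying_y, influential)
# [1.5, 0.1378, 0.1953, 0.1378, 0.0938] [] [0]
```

In this small sample the first observation is flagged as influential without being an outlier in Y: its moderate residual is amplified by its high leverage.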

©2006 Thomson/South-Western 55 Leverages, Standardized Residuals, and Cook’s Distance Measures Figure 13.28

©2006 Thomson/South-Western 56 Summary of Figures and (Table 13.1)
Point   Outlying in X Value ($h_i$ > .4)   Outlying in Y Value (|stand. res.| > 2)   Influential Observation ($D_i$ > .8)
A       No                                 Yes                                       No
B       No                                 No                                        No
C       Yes                                Yes                                       Yes

©2006 Thomson/South-Western 57 Engine Capacity and MPG Figure 13.29

©2006 Thomson/South-Western 58 Engine Capacity and MPG Figure 13.30

©2006 Thomson/South-Western 59 Engine Capacity and MPG Figure 13.31

©2006 Thomson/South-Western 60 Engine Capacity and MPG Figure 13.32

©2006 Thomson/South-Western 61 Engine Capacity and MPG Figure 13.33