# Correlation and Simple Regression Introduction to Business Statistics, 5e Kvanli/Guynes/Pavur (c)2000 South-Western College Publishing.

Bivariate Data (Figure 14.1): X = family income, Y = square footage of their home.

Coefficient of Correlation: The sample coefficient of correlation, r, measures the strength of the linear relationship that exists within a sample of n bivariate observations.

Coefficient of Correlation Value
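The formula on this slide did not survive in the transcript. A minimal sketch, assuming the usual definition $r = SCP_{XY} / \sqrt{SS_X \, SS_Y}$, where $SCP_{XY} = \sum (x - \bar{x})(y - \bar{y})$, $SS_X = \sum (x - \bar{x})^2$ and $SS_Y = \sum (y - \bar{y})^2$. The function name and the data are hypothetical and not taken from the textbook.

```python
import math

def correlation(x, y):
    """Sample coefficient of correlation r = SCP_XY / sqrt(SS_X * SS_Y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross products and sums of squares about the means.
    scp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return scp_xy / math.sqrt(ss_x * ss_y)

# Hypothetical sample of n = 5 bivariate observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))  # 0.7746
```

Here $SCP_{XY} = 6$, $SS_X = 10$ and $SS_Y = 6$, so $r = 6/\sqrt{60} \approx 0.7746$.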

Coefficient of Correlation Properties: r ranges from -1.0 to 1.0. The larger |r| is, the stronger the linear relationship. An r near zero indicates little or no linear relationship; when r = 0, X and Y are uncorrelated. r = 1 or r = -1 implies that a perfect linear pattern exists between the two variables.

Coefficient of Correlation Properties: The sign of r tells you whether the relationship between X and Y is positive (direct) or negative (inverse). The value of r tells you very little about the slope of the line, except for its sign: if r is positive the slope of the line is positive, and if r is negative the slope is negative.
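A small sketch of why the size of r says little about the slope. The three data sets are hypothetical and perfectly linear, with very different slopes; the helper `correlation` is an illustrative name, not something from the textbook.

```python
import math

def correlation(x, y):
    """Sample coefficient of correlation r = SCP_XY / sqrt(SS_X * SS_Y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    scp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return scp_xy / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
steep = [10 * xi for xi in x]     # slope 10
shallow = [0.5 * xi for xi in x]  # slope 0.5
falling = [-3 * xi for xi in x]   # slope -3

r_steep = correlation(x, steep)
r_shallow = correlation(x, shallow)
r_falling = correlation(x, falling)
print(r_steep, r_shallow, r_falling)
```

Both rising lines give r = 1 even though their slopes differ by a factor of 20, and the falling line gives r = -1. Only the sign of r carries over to the slope.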

Various Values of r (Figure 14.2)

Covariance: The sample covariance between two variables, cov(X,Y), is a measure of the joint variation of the two variables X and Y and is defined to be $\text{cov}(X,Y) = \frac{1}{n-1} \sum (x - \bar{x})(y - \bar{y})$.
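A minimal sketch of the sample covariance with the usual n - 1 divisor, and of how it relates to r: dividing the covariance by the product of the two sample standard deviations gives the coefficient of correlation. The data and the function name are hypothetical.

```python
import statistics

def sample_covariance(x, y):
    """Sample covariance: sum of cross products about the means, over n - 1."""
    n = len(x)
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical sample, not from the textbook.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
# Rescale by the sample standard deviations to obtain r.
r = cov_xy / (statistics.stdev(x) * statistics.stdev(y))
print(cov_xy)       # 1.5
print(round(r, 4))  # 0.7746
```

Unlike r, the covariance depends on the units of X and Y, which is why it is rescaled before being used as a measure of strength.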

Least Squares Line: The least squares line is the line through the data that minimizes the sum of the squared vertical differences between the observations and the line: $\sum d^2 = d_1^2 + d_2^2 + d_3^2 + \dots + d_n^2$.

Least Squares Line (Figure 14.6)

Sum of Squares of Error: $SSE = \sum d^2 = \sum (y - \hat{y})^2$
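The least squares line and its SSE can be sketched as follows, assuming the standard estimates $b_1 = SCP_{XY} / SS_X$ and $b_0 = \bar{y} - b_1 \bar{x}$. The data and function names are hypothetical.

```python
def least_squares(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    scp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sse(x, y, b0, b1):
    """Sum of squared vertical distances between the points and a line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical sample, not from the textbook.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
best = sse(x, y, b0, b1)
# A line with a slightly different slope fits worse.
worse = sse(x, y, b0, b1 + 0.1)
print(round(b0, 2), round(b1, 2))        # 2.2 0.6
print(round(best, 2), round(worse, 2))   # 2.4 2.95
```

Any other choice of intercept and slope gives an SSE at least as large as the least squares value, which is what "least squares" means.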

Assumptions for the Simple Regression Model: The mean of each error component is zero. Each error component (random variable) follows an approximate normal distribution. The variance of the error component is the same for each value of X. The errors are independent of each other.
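These four assumptions are often condensed into a single model statement. The notation below is the conventional one and is an assumption here, since the slide does not show it: $\beta_0$ and $\beta_1$ are the population intercept and slope, $e$ is the error component, and $\sigma_e^2$ is its common variance.

$$
Y = \beta_0 + \beta_1 X + e, \qquad e \sim N(0, \sigma_e^2) \ \text{(approximately), independently for each observation}
$$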

Simple Linear Regression Model Assumptions (Figure 14.10)

Hypothesis Test on the Slope of the Regression Line H o :  1 = 0 (X provides no information) H a :  1  0 (X does provide information) Figure 14.11 Introduction to Business Statistics, 5e Kvanli/Guynes/Pavur (c)2000 South-Western College Publishing

Hypothesis Test on the Slope of the Regression Line H o :  1 = 0 H a :  1  0 Reject H o if |t| > t , n-2 Introduction to Business Statistics, 5e Kvanli/Guynes/Pavur (c)2000 South-Western College Publishing

(1-  ) 100% Confidence Interval for  1 Introduction to Business Statistics, 5e Kvanli/Guynes/Pavur (c)2000 South-Western College Publishing

Measuring the Strength of the Model: $b_1 = \frac{SCP_{XY}}{SS_X}$

Hypothesis Test to Determine the Significance of the Model: $H_0: \beta_1 = 0$ (no linear relationship exists) versus $H_a: \beta_1 \neq 0$ (a linear relationship exists). Reject $H_0$ if $|t| > t_{\alpha/2,\, n-2}$.

Coefficient of Determination
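The slide's formula did not survive. A minimal sketch, assuming the usual definition $r^2 = 1 - SSE / SS_Y$, the proportion of the total variation in Y that the least squares line explains. The data and function name are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination: 1 - SSE / SS_Y for the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    scp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)  # total variation in Y
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    # Unexplained variation: squared distances from the fitted line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / ss_y

# Hypothetical sample, not from the textbook.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2 = r_squared(x, y)
print(round(r2, 4))  # 0.6
```

In simple regression this value equals the square of the coefficient of correlation; here $r \approx 0.7746$ and $r^2 = 0.6$, so the line accounts for 60% of the variation in Y.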

Total Variation (Figure 14.18)

(1-  ) 100% Confidence Interval for  y|x Introduction to Business Statistics, 5e Kvanli/Guynes/Pavur (c)2000 South-Western College Publishing

Prediction Interval for $Y_{x_0}$
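Neither interval formula survives in the transcript. The sketch below assumes the standard forms: $\hat{y} \pm t_{\alpha/2,\, n-2} \, s \sqrt{1/n + (x_0 - \bar{x})^2 / SS_X}$ for the mean of Y at $x_0$, and the same expression with an extra 1 under the square root for an individual Y at $x_0$. The data are hypothetical and the t value is taken from a table.

```python
import math

def intervals(x, y, x0, t_crit):
    """Return y_hat, the CI for the mean of Y at x0, and the PI for a single Y at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    scp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    s = math.sqrt((ss_y - scp_xy ** 2 / ss_x) / (n - 2))
    y_hat = b0 + b1 * x0
    # Term that grows as x0 moves away from the mean of the observed x values.
    dist_term = 1 / n + (x0 - x_bar) ** 2 / ss_x
    ci_half = t_crit * s * math.sqrt(dist_term)      # mean of Y at x0
    pi_half = t_crit * s * math.sqrt(1 + dist_term)  # individual Y at x0
    ci = (y_hat - ci_half, y_hat + ci_half)
    pi = (y_hat - pi_half, y_hat + pi_half)
    return y_hat, ci, pi

# Hypothetical sample, not from the textbook.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

T_CRIT = 3.182  # t table value for alpha/2 = 0.025 with n - 2 = 3 df
y_hat, ci, pi = intervals(x, y, x0=4, t_crit=T_CRIT)
print(round(y_hat, 2))
print(tuple(round(v, 2) for v in ci))
print(tuple(round(v, 2) for v in pi))
```

The prediction interval is always wider than the confidence interval at the same $x_0$, because it must cover the scatter of an individual observation around the mean as well as the uncertainty in the estimated line.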

Checking Model Assumptions (Figure 14.22)

Checking Model Assumptions (Figure 14.23)
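The figures themselves are not part of the transcript. A common way to check the assumptions, assumed here to be in the spirit of these slides, is to examine the residuals $y - \hat{y}$ from the fitted line. The sketch below only computes them for a hypothetical sample; plotting them against x or against $\hat{y}$ is left to the reader.

```python
def residuals(x, y):
    """Residuals y - y_hat from the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    scp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical sample, not from the textbook.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

res = residuals(x, y)
print([round(e, 2) for e in res])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

Residuals from a least squares line with an intercept always sum to zero, so the useful information is in their pattern: a funnel shape suggests unequal variance, a curve suggests a nonlinear relationship, and runs of same-signed residuals in time order suggest dependent errors.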