Published by Naomi Woodby. Modified over 4 years ago.

1
**Correlation and Regression Analysis – An Application**

Systems Engineering Program

Department of Engineering Management, Information and Systems

EMIS 7370/5370 STAT 5340: Probability and Statistics for Scientists and Engineers

Dr. Jerrell T. Stracener, SAE Fellow

Leadership in Engineering

2
Montgomery, Peck, and Vining (2001) present data on the performance of the 28 National Football League teams in a single season. It is suspected that the number of games won (y) is related to the number of yards gained rushing by an opponent (x). The data are shown in the following table:

3
**Yards Rushing by Opponent (x)**

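| Team | Games Won (y) | Yards Rushing by Opponent (x) |
|---|---|---|
| Washington | 10 | 2205 |
| Minnesota | 11 | 2096 |
| New England | 11 | 1847 |
| Oakland | 13 | 1903 |
| Pittsburgh | 10 | 1457 |
| Baltimore | 11 | 1848 |
| Los Angeles | 10 | 1564 |
| Dallas | 11 | 1821 |
| Atlanta | 4 | 2577 |
| Buffalo | 2 | 2476 |
| Chicago | 7 | 1984 |
| Cincinnati | 10 | 1917 |
| Cleveland | 9 | 1761 |
| Denver | 9 | 1709 |
| Detroit | 6 | 1901 |
| Green Bay | 5 | 2288 |
| Houston | 5 | 2072 |
| Kansas City | 5 | 2861 |
| Miami | 6 | 2411 |
| New Orleans | 4 | 2289 |
| New York Giants | 3 | 2203 |
| New York Jets | 3 | 2592 |
| Philadelphia | 4 | 2053 |
| St. Louis | 10 | 1979 |
| San Diego | 6 | 2048 |
| San Francisco | 8 | 1786 |
| Seattle | 2 | 2876 |
| Tampa Bay | 0 | 2560 |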

4
**Correlation Analysis**

Statistical analysis used to obtain a quantitative measure of the strength of the relationship between a dependent variable and one or more independent variables.

5
Scatter Plot

6
**Sample correlation coefficient**

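$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2 \, \sum_{i=1}^{n}(y_i-\bar{y})^2}}$$

Notes: $-1 \le r \le 1$; $R = r^2 \times 100\%$ is the coefficient of determination.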

7
$R = r^2 \times 100\% = 54.47\%$ (that is, $r^2 = 0.5447$)
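The coefficient of determination quoted above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the (x, y) pairs are transcribed from the data table and the Excel worksheet slide, and the names `DATA` and `sample_correlation` are illustrative, not part of the original slides.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]


def sample_correlation(pairs):
    """Sample correlation coefficient r for a list of (x, y) pairs."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_yy = sum((y - y_bar) ** 2 for _, y in pairs)
    return s_xy / math.sqrt(s_xx * s_yy)


r = sample_correlation(DATA)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The sign of r is negative here: teams that allow more rushing yards tend to win fewer games.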

8
**Correlation**

To test for no linear association between x and y, calculate

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

where r is the sample correlation coefficient and n is the sample size.

9
**Correlation**

Conclude no linear association if

$$-t_{\alpha/2,\,n-2} < t < t_{\alpha/2,\,n-2}$$

and then treat $y_1, y_2, \ldots, y_n$ as a random sample.

10
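**Correlation**

Take $\alpha = 0.05$ and check the t-table: with $n - 2 = 26$ degrees of freedom we get $t_{\alpha/2,\,n-2} = t_{0.025,\,26} = 2.056$.

Since $t < -t_{0.025,\,26} = -2.056$, we conclude that there is a linear association between x and y and proceed with regression analysis.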
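The test can be reproduced numerically. The sketch below assumes a two-sided test at α = 0.05 and hard-codes the critical value 2.056 for 26 degrees of freedom as read from a standard t-table; `DATA`, `T_CRIT` and `correlation_t_statistic` are illustrative names, not from the slides.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]

# Assumption: two-sided test, alpha = 0.05, so t_{0.025, 26} from a t-table.
T_CRIT = 2.056


def correlation_t_statistic(pairs):
    """Return (r, t) with t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_yy = sum((y - y_bar) ** 2 for _, y in pairs)
    r = s_xy / math.sqrt(s_xx * s_yy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return r, t


r, t_stat = correlation_t_statistic(DATA)
# Reject "no linear association" when t falls outside (-T_CRIT, T_CRIT).
has_linear_association = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, linear association: {has_linear_association}")
```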

11
**Linear Regression Model**

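Simple linear regression model

$$Y = \beta_0 + \beta_1 x + \varepsilon$$

where Y is the response (or dependent) variable, $\beta_0$ and $\beta_1$ are the unknown parameters, $\varepsilon \sim N(0, \sigma)$, and the data are $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.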

12
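**Least squares estimates of $\beta_0$ and $\beta_1$**

$$b_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad b_0 = \bar{y} - b_1\bar{x}$$

13
Estimate of $\beta_1$

14
Estimate of $\beta_0$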
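As a check on the two estimates, the computational form of the least squares formulas can be evaluated directly from the column sums. This is a sketch in plain Python; `DATA` and `least_squares` are illustrative names, and the pairs are transcribed from the data table and the Excel worksheet slide.

```python
# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]


def least_squares(pairs):
    """Return (b0, b1) for the line y = b0 + b1 * x fitted by least squares."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)
    sum_x2 = sum(x * x for x, _ in pairs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(DATA)
print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
```

The column sums used inside `least_squares` agree with the SUM row of the Excel worksheet (59084, 195 and 386127 for X, Y and XY).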

15
**Least squares regression equation**

Point estimate of the linear model is

$$\hat{y} = b_0 + b_1 x$$

which is the least squares regression equation.

16
**Regression Fitted Line Plot**

17
Point estimate of $\sigma^2$

$$s^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}$$
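The variance estimate can be computed from the residuals of the fitted line. The sketch below assumes the usual unbiased estimator with n − 2 degrees of freedom; `fit_line` and `residual_variance` are illustrative helper names, not from the slides.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]


def fit_line(pairs):
    """Least squares intercept and slope."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def residual_variance(pairs):
    """s^2 = SSE / (n - 2), the sum of squared residuals over n - 2."""
    b0, b1 = fit_line(pairs)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)
    return sse / (len(pairs) - 2)


s_squared = residual_variance(DATA)
s = math.sqrt(s_squared)
print(f"s^2 = {s_squared:.4f}, s = {s:.4f}")
```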

18
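**Interval Estimates for y intercept ($\beta_0$)**

$(1 - \alpha)100\%$ confidence interval for $\beta_0$ is

$$b_0 \pm t_{\alpha/2,\,n-2}\, s_{b_0}$$

where $s_{b_0} = s\sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}$ and $S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2$.

19
**Interval Estimates for y intercept ($\beta_0$)**

Take $\alpha = 0.05$; then the 95% confidence interval for $\beta_0$ is $b_0 \pm t_{0.025,\,26}\, s_{b_0}$.

20
**Interval Estimates for y intercept ($\beta_0$)**

Substituting into this expression gives the lower and upper bounds, $b_{0l}$ and $b_{0u}$, for $\beta_0$.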
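The intercept interval can be evaluated as follows. This is a sketch under stated assumptions: the standard error formula is the textbook one for simple linear regression, the critical value 2.056 is hard-coded from a t-table for 26 degrees of freedom, and the helper names (`fit`, `intercept_interval`) are illustrative.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]

T_CRIT = 2.056  # assumption: t_{0.025, 26} from a standard t-table


def fit(pairs):
    """Return n, x_bar, S_xx, b0, b1 and s for a simple linear regression."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)
    return n, x_bar, s_xx, b0, b1, math.sqrt(sse / (n - 2))


def intercept_interval(pairs, t_crit):
    """Confidence interval b0 +/- t * s_b0 for the intercept."""
    n, x_bar, s_xx, b0, _, s = fit(pairs)
    s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)
    return b0 - t_crit * s_b0, b0 + t_crit * s_b0


b0_l, b0_u = intercept_interval(DATA, T_CRIT)
print(f"95% CI for beta_0: ({b0_l:.3f}, {b0_u:.3f})")
```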

21
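**Interval Estimates for slope ($\beta_1$)**

$(1 - \alpha)100\%$ confidence interval for $\beta_1$ is

$$b_1 \pm t_{\alpha/2,\,n-2}\, s_{b_1}$$

where $s_{b_1} = \dfrac{s}{\sqrt{S_{xx}}}$ and $S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2$.

22
**Interval Estimates for slope ($\beta_1$)**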
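A matching sketch for the slope interval, again assuming the textbook standard error $s/\sqrt{S_{xx}}$ and a hard-coded t-table value of 2.056 for 26 degrees of freedom; `fit` and `slope_interval` are illustrative names.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]

T_CRIT = 2.056  # assumption: t_{0.025, 26} from a standard t-table


def fit(pairs):
    """Return n, x_bar, S_xx, b0, b1 and s for a simple linear regression."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)
    return n, x_bar, s_xx, b0, b1, math.sqrt(sse / (n - 2))


def slope_interval(pairs, t_crit):
    """Confidence interval b1 +/- t * s_b1 for the slope."""
    _, _, s_xx, _, b1, s = fit(pairs)
    s_b1 = s / math.sqrt(s_xx)
    return b1 - t_crit * s_b1, b1 + t_crit * s_b1


b1_l, b1_u = slope_interval(DATA, T_CRIT)
print(f"95% CI for beta_1: ({b1_l:.6f}, {b1_u:.6f})")
```

Both bounds are negative, so the interval excludes zero, which is consistent with the conclusion of the correlation test.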

23
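**Confidence interval for conditional mean of Y, given x=2205**

Given x equal to 2205, we can calculate the confidence interval for the conditional mean of Y:

$$\hat{y} \pm t_{\alpha/2,\,n-2}\, s\sqrt{\frac{1}{n} + \frac{(x-\bar{x})^2}{S_{xx}}}$$

24
**Confidence interval for conditional mean of Y, given x=2205**

Point estimate $\hat{y} = b_0 + b_1 \cdot 2205$ and interval bounds $\mu_l$ and $\mu_u$.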
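The interval for the conditional mean at x = 2205 can be sketched in Python. The formula used is the textbook one for simple linear regression, the critical value 2.056 is an assumed t-table lookup for 26 degrees of freedom, and `fit` and `mean_confidence_interval` are illustrative names.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]

T_CRIT = 2.056  # assumption: t_{0.025, 26} from a standard t-table


def fit(pairs):
    """Return n, x_bar, S_xx, b0, b1 and s for a simple linear regression."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)
    return n, x_bar, s_xx, b0, b1, math.sqrt(sse / (n - 2))


def mean_confidence_interval(pairs, x0, t_crit):
    """Point estimate and confidence bounds for the mean of Y at x = x0."""
    n, x_bar, s_xx, b0, b1, s = fit(pairs)
    y_hat = b0 + b1 * x0
    half_width = t_crit * s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)
    return y_hat, y_hat - half_width, y_hat + half_width


y_hat, mu_l, mu_u = mean_confidence_interval(DATA, 2205, T_CRIT)
print(f"y_hat(2205) = {y_hat:.3f}, 95% CI: ({mu_l:.3f}, {mu_u:.3f})")
```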

26
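**Prediction interval for a single future value of Y, given x**

$$\hat{y} \pm t_{\alpha/2,\,n-2}\, s\sqrt{1 + \frac{1}{n} + \frac{(x-\bar{x})^2}{S_{xx}}}$$

where $\hat{y} = b_0 + b_1 x$ and $S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2$.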

27
**Prediction interval for a single future value of Y, given x=2000**

28
**Prediction interval for a single future value of Y, given x=2000**

Point estimate $\hat{y} = b_0 + b_1 \cdot 2000$ and interval bounds $y_l$ and $y_u$.
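A sketch of the prediction interval at x = 2000 follows. It differs from the conditional-mean interval only by the extra 1 under the square root, which accounts for the variability of a single new observation. The critical value 2.056 is an assumed t-table lookup for 26 degrees of freedom, and `fit` and `prediction_interval` are illustrative names.

```python
import math

# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]

T_CRIT = 2.056  # assumption: t_{0.025, 26} from a standard t-table


def fit(pairs):
    """Return n, x_bar, S_xx, b0, b1 and s for a simple linear regression."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)
    return n, x_bar, s_xx, b0, b1, math.sqrt(sse / (n - 2))


def prediction_interval(pairs, x0, t_crit):
    """Point estimate and prediction bounds for one future Y at x = x0."""
    n, x_bar, s_xx, b0, b1, s = fit(pairs)
    y_hat = b0 + b1 * x0
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
    return y_hat, y_hat - half_width, y_hat + half_width


y_hat, y_l, y_u = prediction_interval(DATA, 2000, T_CRIT)
print(f"y_hat(2000) = {y_hat:.3f}, 95% PI: ({y_l:.3f}, {y_u:.3f})")
```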

30
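**Excel Calculation**

| | X | Y | XY | Y^2 |
|---|---|---|---|---|
| | 2205 | 10 | 22050 | 100 |
| | 2096 | 11 | 23056 | 121 |
| | 1847 | 11 | 20317 | 121 |
| | 1903 | 13 | 24739 | 169 |
| | 1457 | 10 | 14570 | 100 |
| | 1848 | 11 | 20328 | 121 |
| | 1564 | 10 | 15640 | 100 |
| | 1821 | 11 | 20031 | 121 |
| | 2577 | 4 | 10308 | 16 |
| | 2476 | 2 | 4952 | 4 |
| | 1984 | 7 | 13888 | 49 |
| | 1917 | 10 | 19170 | 100 |
| | 1761 | 9 | 15849 | 81 |
| | 1709 | 9 | 15381 | 81 |
| | 1901 | 6 | 11406 | 36 |
| | 2288 | 5 | 11440 | 25 |
| | 2072 | 5 | 10360 | 25 |
| | 2861 | 5 | 14305 | 25 |
| | 2411 | 6 | 14466 | 36 |
| | 2289 | 4 | 9156 | 16 |
| | 2203 | 3 | 6609 | 9 |
| | 2592 | 3 | 7776 | 9 |
| | 2053 | 4 | 8212 | 16 |
| | 1979 | 10 | 19790 | 100 |
| | 2048 | 6 | 12288 | 36 |
| | 1786 | 8 | 14288 | 64 |
| | 2876 | 2 | 5752 | 4 |
| | 2560 | 0 | 0 | 0 |
| SUM | 59084 | 195 | 386127 | 1685 |

Remaining worksheet columns: X^2, Ŷ, (Y − Ŷ)^2, (x − x̄)^2. The only surviving entry in these columns is (x − x̄)^2 = 298272 for X = 1564.

Summary cells: x-bar, r, b0, b1, S^2, S, Sb0, b0l, b0u, Sb1, Sb1l, Sb1u, Y(2205), mu-l, mu-u, Y(2000), y-l, y-u. An unlabeled value 9155 appears next to x-bar.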

31
**Regression Statistics**

Excel Regression Analysis Output

SUMMARY OUTPUT

| Regression Statistics | |
|---|---|
| Multiple R | |
| R Square | |
| Adjusted R Square | |
| Standard Error | |
| Observations | 28 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | | | | 7.381E-06 |
| Residual | 26 | | | | |
| Total | 27 | | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | | | | 1.46E-08 | | | | |
| X Variable 1 | | | | 7.38E-06 | | | | |

RESIDUAL OUTPUT

| Observation | Predicted Y | Residuals |
|---|---|---|
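The ANOVA quantities in the Excel output can be recomputed from the data. The sketch below assumes the standard decomposition SST = SSR + SSE with 1 and n − 2 degrees of freedom; `DATA` and `anova` are illustrative names. Significance F is not computed because it needs the F distribution, which the Python standard library does not provide.

```python
# (x, y) pairs: x = yards rushing by opponent, y = games won
DATA = [
    (2205, 10), (2096, 11), (1847, 11), (1903, 13), (1457, 10), (1848, 11), (1564, 10),
    (1821, 11), (2577, 4), (2476, 2), (1984, 7), (1917, 10), (1761, 9), (1709, 9),
    (1901, 6), (2288, 5), (2072, 5), (2861, 5), (2411, 6), (2289, 4), (2203, 3),
    (2592, 3), (2053, 4), (1979, 10), (2048, 6), (1786, 8), (2876, 2), (2560, 0),
]


def anova(pairs):
    """Return (SSR, SSE, SST, F) for a simple linear regression."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ssr = sum((b0 + b1 * x - y_bar) ** 2 for x, _ in pairs)  # regression SS, df = 1
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs)    # residual SS, df = n - 2
    sst = sum((y - y_bar) ** 2 for _, y in pairs)            # total SS, df = n - 1
    f_stat = (ssr / 1) / (sse / (n - 2))
    return ssr, sse, sst, f_stat


ssr, sse, sst, f_stat = anova(DATA)
print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, SST = {sst:.3f}, F = {f_stat:.2f}")
```

In simple linear regression F equals the square of the slope t statistic, which is why Significance F and the P-value for X Variable 1 coincide in the Excel output.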
