
1 Chapter 3 Multiple Linear Regression

2 3.1 Multiple Regression Models Suppose that the yield in pounds of conversion in a chemical process depends on temperature and the catalyst concentration. A multiple regression model that might describe this relationship is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, where $x_1$ is temperature and $x_2$ is catalyst concentration. This is a multiple linear regression model in two variables.

3 3.1 Multiple Regression Models Figure 3.1 (a) The regression plane for the model $E(y) = 50 + 10x_1 + 7x_2$. (b) The contour plot.

4 3.1 Multiple Regression Models In general, the multiple linear regression model with k regressors is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$.

5 3.1 Multiple Regression Models

6 Linear regression models may also contain interaction effects: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$. If we let $x_3 = x_1 x_2$ and $\beta_3 = \beta_{12}$, then the model can be written in the form $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$, a standard multiple linear regression model in three regressors.
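As a quick illustration, here is a minimal R sketch (simulated data with made-up coefficients, not from the text) showing that an interaction model is still linear in the coefficients and can be fit directly with lm():

# Simulated illustration only; the numbers are made up.
set.seed(1)
x1 <- runif(30, 70, 100)    # e.g., temperature
x2 <- runif(30, 1, 5)       # e.g., catalyst concentration
y  <- 50 + 10 * x1 + 7 * x2 + 2 * x1 * x2 + rnorm(30, sd = 5)
# I(x1 * x2) creates the product regressor x3 = x1 * x2 explicitly
fit_int <- lm(y ~ x1 + x2 + I(x1 * x2))
coef(fit_int)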

7 3.1 Multiple Regression Models

8

9 3.2 Estimation of the Model Parameters Least Squares Estimation of the Regression Coefficients. Notation: n – number of observations available; k – number of regressor variables; p = k + 1 – number of regression coefficients; y – response or dependent variable; $x_{ij}$ – ith observation on the jth regressor $x_j$.

Least Squares Estimation of Regression Coefficients

Least Squares Estimation of the Regression Coefficients The sample regression model can be written as $y_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \varepsilon_i$, for $i = 1, 2, \ldots, n$.

Least Squares Estimation of the Regression Coefficients The least squares function is $S(\beta_0, \beta_1, \ldots, \beta_k) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \bigl( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \bigr)^2$. The function S must be minimized with respect to the coefficients.

Least Squares Estimation of the Regression Coefficients The least squares estimates of the coefficients must satisfy $\partial S / \partial \beta_0 = 0$ and $\partial S / \partial \beta_j = 0$, $j = 1, 2, \ldots, k$, evaluated at $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$.

Least Squares Estimation of the Regression Coefficients Simplifying, we obtain the least squares normal equations: p = k + 1 equations in the p unknown coefficients, the first of which is $n\hat\beta_0 + \hat\beta_1 \sum_i x_{i1} + \cdots + \hat\beta_k \sum_i x_{ik} = \sum_i y_i$, with one analogous equation for each regressor. The ordinary least squares estimators are the solutions to the normal equations.

Least Squares Estimation of the Regression Coefficients Matrix notation is more convenient for finding the estimates. Let $\mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon$, where $\mathbf{y}$ is an n × 1 vector of observations, $\mathbf{X}$ is an n × p model matrix whose first column is all ones, $\boldsymbol\beta$ is a p × 1 vector of regression coefficients, and $\boldsymbol\varepsilon$ is an n × 1 vector of random errors.

Least Squares Estimation of the Regression Coefficients We wish to find the vector $\hat{\boldsymbol\beta}$ that minimizes $S(\boldsymbol\beta) = \sum_{i=1}^{n} \varepsilon_i^2 = \boldsymbol\varepsilon'\boldsymbol\varepsilon = (\mathbf{y} - \mathbf{X}\boldsymbol\beta)'(\mathbf{y} - \mathbf{X}\boldsymbol\beta)$.

Least Squares Estimation of the Regression Coefficients Setting the derivative of S to zero gives $\mathbf{X}'\mathbf{X}\hat{\boldsymbol\beta} = \mathbf{X}'\mathbf{y}$. These are the least-squares normal equations. The solution is $\hat{\boldsymbol\beta} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$.

Least Squares Estimation of the Regression Coefficients

Least Squares Estimation of the Regression Coefficients The n residuals can be written in matrix form as $\mathbf{e} = \mathbf{y} - \hat{\mathbf{y}} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}$. There will be some situations where an alternative form will prove useful: $\mathbf{e} = (\mathbf{I} - \mathbf{H})\mathbf{y}$, where $\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ is called the hat matrix.
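A minimal R sketch of these matrix formulas on simulated data (the variable names and values are illustrative, not from the text):

set.seed(1)
n  <- 25
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n)
X  <- cbind(1, x1, x2)                     # n x p model matrix, p = k + 1
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # solves the normal equations X'X b = X'y
H <- X %*% solve(t(X) %*% X) %*% t(X)      # hat matrix H = X (X'X)^{-1} X'
e <- (diag(n) - H) %*% y                   # residuals e = (I - H) y
# Agrees with R's built-in least-squares fit:
all.equal(as.vector(beta_hat), unname(coef(lm(y ~ x1 + x2))))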

20 Example 3-1. The Delivery Time Data The model of interest is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$.

21 Example 3-1. The Delivery Time Data Figure 3.4 Scatterplot matrix for the delivery time data from Example 3.1. R code for the figure is in "Chapter_3_multi_reg.txt".
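A sketch of how such a scatterplot matrix can be drawn with pairs(). The column names and the rows below are a made-up stand-in, not the actual delivery time dataset; the real data would come from the course file:

# Stand-in rows only, NOT the actual delivery time data
delivery <- data.frame(
  time     = c(16.7, 11.5, 12.0, 14.9, 13.8, 18.1, 8.0, 17.8),
  cases    = c(7, 3, 3, 4, 6, 7, 2, 7),
  distance = c(560, 220, 340, 80, 150, 330, 110, 210))
pairs(delivery, main = "Scatterplot matrix, delivery time data")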

22 Example 3-1 The Delivery Time Data Figure 3.5 Three-dimensional scatterplot of the delivery time data from Example 3.1.

23 Example 3-1 The Delivery Time Data

24 Example 3-1 The Delivery Time Data

25 Example 3-1 The Delivery Time Data

26

27 R Output
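Output like this could be produced by a call such as the following (column names assumed, continuing with the delivery stand-in defined in the earlier sketch):

fit <- lm(time ~ cases + distance, data = delivery)
summary(fit)   # coefficient estimates, t tests, sigma-hat, R^2, F statistic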

Properties of Least-Squares Estimators Statistical properties: the least-squares estimator $\hat{\boldsymbol\beta}$ is unbiased, and its variances and covariances are given by $\mathrm{Cov}(\hat{\boldsymbol\beta}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1} = \sigma^2\mathbf{C}$, a p × p matrix. The diagonal entries $\sigma^2 C_{jj}$ are the variances of the $\hat\beta_j$, and the remaining entries $\sigma^2 C_{ij}$ are the covariances between pairs of regression coefficients.

Estimation of σ² The residual sum of squares can be shown to be $SS_{Res} = \mathbf{y}'\mathbf{y} - \hat{\boldsymbol\beta}'\mathbf{X}'\mathbf{y}$. The residual mean square for the model with p parameters is $MS_{Res} = SS_{Res}/(n - p)$, and the estimator is $\hat\sigma^2 = MS_{Res}$.

Estimation of σ² Recall that the estimator of σ² is model dependent; that is, change the form of the model and the estimate of σ² will invariably change. Note that the variance estimate is a function of the residuals, the "unexplained noise" about the fitted regression line.

31 Which model is better? Let's calculate the estimated error variance for two different models. Model 1: both regressors (cases and distance). Model 2: only the regressor "cases". We would usually prefer the model with the smaller residual mean square (estimated error variance); a sketch of the comparison follows.
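Continuing with the delivery stand-in from the earlier sketch; sigma() returns the residual standard error, so squaring it gives MS_Res:

m1 <- lm(time ~ cases + distance, data = delivery)  # Model 1: both regressors
m2 <- lm(time ~ cases, data = delivery)             # Model 2: cases only
c(model1 = sigma(m1)^2, model2 = sigma(m2)^2)       # compare residual mean squares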

32 Example 3.2 Delivery Time Data

33 Example 3.2 Delivery Time Data

Inadequacy of Scatter Diagrams in Multiple Regression Scatter diagrams of the regressor variable(s) against the response may be of little value in multiple regression. These plots can actually be misleading: if there is an interdependency between two or more regressor variables, the true relationship between $x_i$ and y may be masked.

35 Illustration of the Inadequacy of Scatter Diagrams in Multiple Regression

36 A scatterplot is useful if there is only one (or a few) dominant regressor(s), or if the regressors operate nearly independently. Scatterplots can be misleading when several important regressors are themselves related. (We will discuss analytical methods for sorting out the relationships between regressors in a later chapter.)

Hypothesis Testing in Multiple Linear Regression Once we have estimated the parameters in the model, we face two immediate questions: 1. What is the overall adequacy of the model? 2. Which specific regressors seem important?

Hypothesis Testing in Multiple Linear Regression Next we will consider: the test for significance of regression (sometimes called the global test of model adequacy), and tests on individual regression coefficients (or groups of coefficients).

Test for Significance of Regression The test for significance is a test to determine whether there is a linear relationship between the response and any of the regressor variables. The hypotheses are $H_0\!: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ versus $H_1\!: \beta_j \neq 0$ for at least one j.

Test for Significance of Regression As in Chapter 2, the total sum of squares can be partitioned into two parts: $SS_T = SS_R + SS_{Res}$. This leads to an ANOVA procedure with the test statistic $F_0 = \dfrac{SS_R/k}{SS_{Res}/(n-k-1)} = \dfrac{MS_R}{MS_{Res}}$.

Test for Significance of Regression The standard ANOVA is conducted with: Regression, $SS_R$ on k degrees of freedom, $MS_R = SS_R/k$; Residual, $SS_{Res}$ on n − k − 1 degrees of freedom, $MS_{Res} = SS_{Res}/(n-k-1)$; Total, $SS_T$ on n − 1 degrees of freedom. The test statistic is $F_0 = MS_R/MS_{Res}$; a computational sketch follows.
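The sketch below computes the F statistic by hand (continuing with fit from the earlier delivery sketch), so the pieces of the ANOVA table can be seen explicitly:

ss_res <- sum(resid(fit)^2)                             # SS_Res
ss_t   <- sum((delivery$time - mean(delivery$time))^2)  # SS_T
ss_r   <- ss_t - ss_res                                 # SS_R
k  <- 2
n  <- nrow(delivery)
F0 <- (ss_r / k) / (ss_res / (n - k - 1))
pf(F0, k, n - k - 1, lower.tail = FALSE)                # p-value, matches summary(fit)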

Test for Significance of Regression ANOVA Table: Reject $H_0$ if $F_0 > F_{\alpha, k, n-k-1}$. (Since p = k + 1, the regression and residual degrees of freedom may equivalently be written as p − 1 and n − p.)

Test for Significance of Regression R²: R² is calculated exactly as in simple linear regression, $R^2 = 1 - SS_{Res}/SS_T$; it can be inflated simply by adding more terms to the model (even insignificant terms). Adjusted R²: $R^2_{adj} = 1 - \dfrac{SS_{Res}/(n-p)}{SS_T/(n-1)}$ penalizes you for adding terms to the model that are not significant.
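Continuing the same sketch, both quantities follow directly from the sums of squares:

p      <- k + 1
R2     <- 1 - ss_res / ss_t
R2_adj <- 1 - (ss_res / (n - p)) / (ss_t / (n - 1))
c(R2 = R2, R2_adj = R2_adj)   # same values reported by summary(fit)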

44 Example 3.3 Delivery Time Data

45 Example 3.3 Delivery Time Data To test $H_0\!: \beta_1 = \beta_2 = 0$, we calculate the F statistic:

46 Example 3.3 Delivery Time Data To judge the overall significance of the regression, look at the p-value of the F test, R², and adjusted R² (the values are shown in the R output).

47 Adding a variable will always increase R². Our goal is to add only those regressors that genuinely reduce the residual variability; we do not want to over-fit by adding unnecessary variables. (We will study variable selection procedures in later chapters.)

Tests on Individual Regression Coefficients Hypothesis test on any single regression coefficient: $H_0\!: \beta_j = 0$ versus $H_1\!: \beta_j \neq 0$. Test statistic: $t_0 = \hat\beta_j / \sqrt{\hat\sigma^2 C_{jj}} = \hat\beta_j / se(\hat\beta_j)$. Reject $H_0$ if $|t_0| > t_{\alpha/2, n-p}$. This is a partial or marginal test!
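A sketch of the marginal t test done by hand (continuing with fit; it reproduces the t value and p-value printed by summary(fit)):

b    <- coef(fit)["cases"]
se_b <- sqrt(diag(vcov(fit))["cases"])   # sqrt of sigma-hat^2 * C_jj
t0   <- b / se_b
2 * pt(abs(t0), df = df.residual(fit), lower.tail = FALSE)  # two-sided p-value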

49 The Extra Sum of Squares method can also be used to test hypotheses on individual model parameters or groups of parameters. Full model: $\mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon$, with the coefficients partitioned as $\boldsymbol\beta' = [\boldsymbol\beta_1', \boldsymbol\beta_2']$, where $\boldsymbol\beta_2$ contains the r coefficients we wish to test ($H_0\!: \boldsymbol\beta_2 = \mathbf{0}$).

50

51
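In R, the extra-sum-of-squares (partial F) test can be sketched by comparing nested fits with anova(), continuing the delivery sketch:

reduced <- lm(time ~ cases, data = delivery)            # beta_2 (distance) set to 0
full    <- lm(time ~ cases + distance, data = delivery)
anova(reduced, full)   # F test based on SS_R(beta_2 | beta_1, beta_0)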

Special Case of Orthogonal Columns in X If the columns of $\mathbf{X}_1$ are orthogonal to the columns of $\mathbf{X}_2$, then $SS_R(\boldsymbol\beta_2 | \boldsymbol\beta_1) = SS_R(\boldsymbol\beta_2)$; that is, the sum of squares due to $\boldsymbol\beta_2$ is free of any dependence on the regressors in $\mathbf{X}_1$.

53 Example Consider a dataset with four regressor variables and a single response. Fit the equation with all four regressors, obtaining estimated coefficients for x1, x2, x3, and x4. Looking at the t-tests, suppose that x3 is insignificant, so it is removed. What is the equation now? Generally, it is not the original fitted equation with the x3 term simply deleted.

54 Example The model must be refit with the insignificant regressor left out, and the remaining coefficient estimates will generally change. The refitting must be done because the coefficient estimate for an individual regressor depends on all of the regressors $x_j$ in the model.

55 Example However, if the columns are orthogonal to each other, then there is no need to refit. Can you think of some situations where we would have orthogonal columns?
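A tiny simulated demo of this orthogonality fact, using a ±1-coded 2² factorial design (all values made up): with orthogonal columns, dropping one regressor leaves the other estimates unchanged.

set.seed(2)
x1 <- rep(c(-1, 1), times = 4)        # +/-1 coded factor, orthogonal to x2
x2 <- rep(c(-1, -1, 1, 1), times = 2)
y  <- 10 + 3 * x1 - 2 * x2 + rnorm(8)
coef(lm(y ~ x1 + x2))   # full model
coef(lm(y ~ x1))        # x2 dropped: the x1 estimate is unchanged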

Confidence Intervals on the Regression Coefficients A 100(1 − α) percent confidence interval for the regression coefficient $\beta_j$ is $\hat\beta_j - t_{\alpha/2, n-p}\sqrt{\hat\sigma^2 C_{jj}} \le \beta_j \le \hat\beta_j + t_{\alpha/2, n-p}\sqrt{\hat\sigma^2 C_{jj}}$, or, equivalently, $\hat\beta_j \pm t_{\alpha/2, n-p}\, se(\hat\beta_j)$.
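Sketch (continuing with fit): confint() applies exactly this t-based formula.

confint(fit, level = 0.95)
# Equivalent by hand for one coefficient:
coef(fit)["cases"] +
  c(-1, 1) * qt(0.975, df.residual(fit)) * sqrt(diag(vcov(fit))["cases"])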

57

Confidence Interval Estimation of the Mean Response A 100(1 − α) percent CI on the mean response at the point $x_{01}, x_{02}, \ldots, x_{0k}$ is $\hat y_0 \pm t_{\alpha/2, n-p}\sqrt{\hat\sigma^2\, \mathbf{x}_0'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_0}$. See Example 3-9 on page 95 and the discussion that follows.
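Sketch of a CI on the mean response at an assumed new point (continuing with fit; the values in x0 are made up):

x0 <- data.frame(cases = 8, distance = 275)   # hypothetical point
predict(fit, newdata = x0, interval = "confidence", level = 0.95)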

59

60

61

Simultaneous Confidence Intervals on Regression Coefficients It can be shown that $\dfrac{(\hat{\boldsymbol\beta} - \boldsymbol\beta)'\mathbf{X}'\mathbf{X}(\hat{\boldsymbol\beta} - \boldsymbol\beta)}{p\, MS_{Res}} \sim F_{p, n-p}$. From this result, the joint 100(1 − α) percent confidence region for all parameters in $\boldsymbol\beta$ is $\dfrac{(\hat{\boldsymbol\beta} - \boldsymbol\beta)'\mathbf{X}'\mathbf{X}(\hat{\boldsymbol\beta} - \boldsymbol\beta)}{p\, MS_{Res}} \le F_{\alpha, p, n-p}$.
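A sketch of checking whether a candidate coefficient vector falls inside this joint region (continuing with fit; the candidate values are made up):

beta_c <- c(2.0, 1.5, 0.01)   # hypothetical candidate (b0, b1, b2)
d      <- coef(fit) - beta_c
p      <- length(coef(fit))
stat   <- drop(t(d) %*% crossprod(model.matrix(fit)) %*% d) / (p * sigma(fit)^2)
stat <= qf(0.95, p, df.residual(fit))   # TRUE: inside the 95% joint region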

Prediction of New Observations A 100(1 − α) percent prediction interval for a future observation at $\mathbf{x}_0$ is $\hat y_0 \pm t_{\alpha/2, n-p}\sqrt{\hat\sigma^2\,(1 + \mathbf{x}_0'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_0)}$.
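Sketch (same assumed point x0 as above): the prediction interval is wider than the CI on the mean response because it includes the extra σ² for a new observation.

predict(fit, newdata = x0, interval = "prediction", level = 0.95)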

64

Hidden Extrapolation in Multiple Regression In prediction, exercise care about potentially extrapolating beyond the region containing the original observations. Figure 3.10 An example of extrapolation in multiple regression.

Hidden Extrapolation in Multiple Regression We will define the smallest convex set containing all of the original n data points $(x_{i1}, x_{i2}, \ldots, x_{ik})$, $i = 1, 2, \ldots, n$, as the regressor variable hull (RVH). If a point $x_{01}, x_{02}, \ldots, x_{0k}$ lies inside or on the boundary of the RVH, then prediction or estimation involves interpolation, while if this point lies outside the RVH, extrapolation is required.

Hidden Extrapolation in Multiple Regression Diagonal elements of the hat matrix $\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ can aid in determining whether hidden extrapolation exists. The set of points $\mathbf{x}$ (not necessarily data points used to fit the model) that satisfy $\mathbf{x}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x} \le h_{max}$ is an ellipsoid enclosing all points inside the RVH, where $h_{max}$ is the largest diagonal element of $\mathbf{H}$.

Hidden Extrapolation in Multiple Regression Let $\mathbf{x}_0$ be a point at which prediction or estimation is of interest, and let $h_{00} = \mathbf{x}_0'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_0$. If $h_{00} > h_{max}$, then the point is a point of extrapolation.
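A sketch of this check (continuing with fit; the candidate point is made up):

X     <- model.matrix(fit)
XtXi  <- solve(t(X) %*% X)
h_max <- max(hatvalues(fit))        # largest diagonal element of H
x0v   <- c(1, 8, 275)               # (intercept, cases, distance), assumed point
h00   <- drop(t(x0v) %*% XtXi %*% x0v)
h00 > h_max                         # TRUE flags hidden extrapolation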

69 Example 3.13 Consider prediction or estimation at:

70 Figure 3.10 Scatterplot of cases and distance for the delivery time data.