Chapter 11 Multiple Regression.


Multiple Regression Model
Multiple regression enables us to determine the simultaneous effect of several independent variables on a dependent variable using the least squares principle.

Multiple Regression Objectives
Multiple regression provides two important results:
1. A linear equation that predicts the dependent variable, Y, as a function of K independent variables, $x_{ji}$, $j = 1, \ldots, K$.
2. The marginal change in the dependent variable, Y, that is related to a change in the independent variables, measured by the partial coefficients, the $b_j$'s. In multiple regression these partial coefficients depend on which other variables are included in the model. The coefficient $b_j$ indicates the change in Y given a unit change in $x_j$ while controlling for the simultaneous effect of the other independent variables.
(In some problems both results are equally important. However, usually one will predominate.)

Multiple Regression Model (Example 11.1)

Multiple Regression Model
POPULATION MULTIPLE REGRESSION MODEL
The population multiple regression model defines the relationship between a dependent or endogenous variable, Y, and a set of independent or exogenous variables, $x_j$, $j = 1, \ldots, K$. The $x_{ji}$'s are assumed to be fixed numbers and Y is a random variable, defined for each observation $i$, where $i = 1, \ldots, n$ and $n$ is the number of observations. The model is defined as
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
where the $\beta_j$'s are constant coefficients and the $\varepsilon_i$'s are random variables with mean 0 and variance $\sigma^2$.

Standard Multiple Regression Assumptions
The population multiple regression model is
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
and we assume that n sets of observations are available. The following standard assumptions are made for the model.
1. The x's are fixed numbers, or they are realizations of random variables $X_{ji}$ that are independent of the error terms $\varepsilon_i$. In the latter case, inference is carried out conditionally on the observed values of the $x_{ji}$'s.
2. The error terms are random variables with mean 0 and the same variance, $\sigma^2$. The latter property is called homoscedasticity or uniform variance.

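Standard Multiple Regression Assumptions (continued)
3. The random error terms $\varepsilon_i$ are not correlated with one another, so that
$$E[\varepsilon_i \varepsilon_j] = 0 \quad \text{for all } i \ne j$$
4. It is not possible to find a set of numbers $c_0, c_1, \ldots, c_K$, not all zero, such that
$$c_0 + c_1 x_{1i} + c_2 x_{2i} + \cdots + c_K x_{Ki} = 0$$
This is the property of no linear relation for the $X_j$'s.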

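Least Squares Estimation and the Sample Multiple Regression
We begin with a sample of n observations denoted as $(x_{1i}, x_{2i}, \ldots, x_{Ki}, y_i)$, $i = 1, \ldots, n$, measured for a process whose population multiple regression model is
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
The least-squares estimates of the coefficients $\beta_0, \beta_1, \ldots, \beta_K$ are the values $b_0, b_1, \ldots, b_K$ for which the sum of the squared deviations
$$SSE = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_{1i} - \cdots - b_K x_{Ki})^2$$
is a minimum. The resulting equation
$$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_K x_{Ki}$$
is the sample multiple regression of Y on $X_1, X_2, \ldots, X_K$.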
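
The minimization can be sketched in a few lines of plain Python. This is only an illustration, not the method the slides use: it solves the normal equations $(X'X)b = X'y$, which is one standard way to obtain the least squares values, and the function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # Violates the "no linear relation" assumption on the x's.
            raise ValueError("predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(xs, y):
    """Return [b0, b1, ..., bK] minimizing the sum of squared deviations.

    xs is a list of n rows, each holding the K predictor values.
    """
    design = [[1.0] + list(row) for row in xs]  # leading 1 carries b0
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coefficients = fit_least_squares(xs, y)  # close to [1.0, 2.0, 3.0]
```

Because the sample data contain no error term, the fit recovers the generating coefficients up to rounding. In practice a statistics package would be used instead of a hand-written solver.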

Multiple Regression Analysis for Profit Margin Analysis (Using Example 11.1)
The regression equation is:
Profit Margin (Y) = 1.56 + 0.382 Revenue (X1) - 0.00025 Office Space (X2)
so that $b_0 = 1.56$, $b_1 = 0.382$, and $b_2 = -0.00025$.

Sum of Squares Decomposition and the Coefficient of Determination
Given the multiple regression model fitted by least squares
$$y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_K x_{Ki} + e_i = \hat{y}_i + e_i$$
where the $b_j$'s are the least squares estimates of the coefficients of the population regression model and the $e_i$'s are the residuals from the estimated regression model, the model variability can be partitioned into the components
$$SST = SSR + SSE$$
where the Total Sum of Squares is
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

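Sum of Squares Decomposition and the Coefficient of Determination (continued)
Error Sum of Squares:
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$$
Regression Sum of Squares:
$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
This decomposition can be interpreted as
Total sample variability = Explained variability + Unexplained variability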

Sum of Squares Decomposition and the Coefficient of Determination (continued)
The coefficient of determination, $R^2$, of the fitted regression is defined as the proportion of the total sample variability explained by the regression and is
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
and it follows that
$$0 \le R^2 \le 1$$
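
As a small numerical check of the decomposition, the sketch below computes SST, SSR, SSE, and $R^2$ from observed and fitted values. The four observations are hypothetical, and the fitted values are the least squares fit for the design noted in the comment; the identity SST = SSR + SSE relies on the fit being a least squares fit with an intercept.

```python
def sum_of_squares_decomposition(y, y_hat):
    """Return (SST, SSR, SSE, R2) for observed y and least squares fitted y_hat."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variability
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained by regression
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual variability
    return sst, ssr, sse, ssr / sst


# Hypothetical sample with x1 = (-1, -1, 1, 1) and x2 = (-1, 1, -1, 1).
# For these y values least squares gives y_hat = 4 + 2*x1 + 1.5*x2.
y = [1.0, 3.0, 4.0, 8.0]
y_hat = [0.5, 3.5, 4.5, 7.5]

sst, ssr, sse, r2 = sum_of_squares_decomposition(y, y_hat)
print(sst, ssr, sse, round(r2, 4))  # prints: 26.0 25.0 1.0 0.9615
```

Here 25 of the 26 units of total variability are explained by the regression, so $R^2 = 25/26$.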

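Estimation of Error Variance
Given the population regression model
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
and the standard regression assumptions, let $\sigma^2$ denote the common variance of the error term $\varepsilon_i$. Then an unbiased estimate of that variance is
$$s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - K - 1} = \frac{SSE}{n - K - 1}$$
The square root of this variance estimate, $s_e$, is also called the standard error of the estimate.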

Multiple Regression Analysis for Profit Margin Analysis (Using Example 11.1)
The regression equation is:
Profit Margin (Y) = 1.56 + 0.382 Revenue (X1) - 0.00025 Office Space (X2)
[Regression output for Example 11.1, with $R^2$, $s_e$, SSR, SSE, $b_0$, $b_1$, and $b_2$ labeled.]

Adjusted Coefficient of Determination
The adjusted coefficient of determination, $\bar{R}^2$, is defined as
$$\bar{R}^2 = 1 - \frac{SSE / (n - K - 1)}{SST / (n - 1)}$$
We use this measure to correct for the fact that non-relevant independent variables will result in some small reduction in the error sum of squares. Thus the adjusted $\bar{R}^2$ provides a better comparison between multiple regression models with different numbers of independent variables.
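
The effect of the correction can be seen with a short sketch. The sums of squares below are hypothetical: model B adds one non-relevant variable to model A and lowers SSE only slightly, so $R^2$ rises while the adjusted value falls.

```python
def error_variance(sse, n, k):
    """Unbiased estimate of the error variance: SSE / (n - K - 1)."""
    return sse / (n - k - 1)


def r_squared(sse, sst):
    """Coefficient of determination: 1 - SSE / SST."""
    return 1.0 - sse / sst


def adjusted_r_squared(sse, sst, n, k):
    """Adjusted coefficient of determination, penalizing extra variables."""
    return 1.0 - error_variance(sse, n, k) / (sst / (n - 1))


# Hypothetical values: n = 30 observations, SST = 100.
n, sst = 30, 100.0
sse_a, k_a = 40.0, 2  # model A: two independent variables
sse_b, k_b = 39.5, 3  # model B: adds one non-relevant variable

r2_a, r2_b = r_squared(sse_a, sst), r_squared(sse_b, sst)
adj_a = adjusted_r_squared(sse_a, sst, n, k_a)
adj_b = adjusted_r_squared(sse_b, sst, n, k_b)
# R2 increases from A to B, but the adjusted value decreases.
```

On these numbers the adjusted measure favors the smaller model A, which is the comparison the slide describes.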

Coefficient of Multiple Correlation
The coefficient of multiple correlation is the correlation between the predicted value and the observed value of the dependent variable,
$$R = r(\hat{y}, y) = \sqrt{R^2}$$
and is equal to the square root of the multiple coefficient of determination. We use R as another measure of the strength of the linear relationship between the dependent variable and the independent variables. Thus it is comparable to the correlation between Y and X in simple regression.

Basis for Inference About the Population Regression Parameters
Let the population regression model be
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
Let $b_0, b_1, \ldots, b_K$ be the least squares estimates of the population parameters and $s_{b_0}, s_{b_1}, \ldots, s_{b_K}$ be the estimated standard deviations of the least squares estimators. Then if the standard regression assumptions hold and if the error terms $\varepsilon_i$ are normally distributed, the random variables corresponding to
$$t_{b_j} = \frac{b_j - \beta_j}{s_{b_j}}, \quad j = 1, 2, \ldots, K$$
are distributed as Student's t with (n - K - 1) degrees of freedom.

Confidence Intervals for Partial Regression Coefficients
If the regression errors $\varepsilon_i$ are normally distributed and the standard regression assumptions hold, the $100(1 - \alpha)\%$ confidence intervals for the partial regression coefficients $\beta_j$ are given by
$$b_j - t_{n-K-1,\alpha/2}\, s_{b_j} < \beta_j < b_j + t_{n-K-1,\alpha/2}\, s_{b_j}$$
where $t_{n-K-1,\alpha/2}$ is the number for which
$$P(t_{n-K-1} > t_{n-K-1,\alpha/2}) = \frac{\alpha}{2}$$
and the random variable $t_{n-K-1}$ follows a Student's t distribution with (n - K - 1) degrees of freedom.

Multiple Regression Analysis for Profit Margin Analysis (Using Example 11.1)
The regression equation is:
Profit Margin (Y) = 1.56 + 0.382 Revenue (X1) - 0.00025 Office Space (X2)
[Regression output for Example 11.1, with $b_1$, $b_2$, $t_{b_1}$, and $t_{b_2}$ labeled.]

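Tests of Hypotheses for the Partial Regression Coefficients
If the regression errors $\varepsilon_i$ are normally distributed and the standard least squares assumptions hold, the following tests have significance level $\alpha$:
1. To test either null hypothesis
$$H_0: \beta_j = \beta^* \quad \text{or} \quad H_0: \beta_j \le \beta^*$$
against the alternative
$$H_1: \beta_j > \beta^*$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{b_j - \beta^*}{s_{b_j}} > t_{n-K-1,\alpha}$$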

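Tests of Hypotheses for the Partial Regression Coefficients (continued)
2. To test either null hypothesis
$$H_0: \beta_j = \beta^* \quad \text{or} \quad H_0: \beta_j \ge \beta^*$$
against the alternative
$$H_1: \beta_j < \beta^*$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{b_j - \beta^*}{s_{b_j}} < -t_{n-K-1,\alpha}$$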

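Tests of Hypotheses for the Partial Regression Coefficients (continued)
3. To test the null hypothesis
$$H_0: \beta_j = \beta^*$$
against the two-sided alternative
$$H_1: \beta_j \ne \beta^*$$
the decision rule is
$$\text{Reject } H_0 \text{ if } \frac{b_j - \beta^*}{s_{b_j}} > t_{n-K-1,\alpha/2} \quad \text{or} \quad \frac{b_j - \beta^*}{s_{b_j}} < -t_{n-K-1,\alpha/2}$$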
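
The t statistic, the confidence interval, and the three decision rules reduce to a few arithmetic steps once the critical value has been looked up. The sketch below assumes the critical value is taken from a Student's t table and passed in by hand; the coefficient estimate and its standard error are hypothetical.

```python
def t_statistic(b_j, s_bj, beta_star=0.0):
    """Student's t statistic for H0: beta_j = beta_star."""
    return (b_j - beta_star) / s_bj


def confidence_interval(b_j, s_bj, t_crit):
    """Interval b_j +/- t_crit * s_bj, with t_crit = t(n-K-1, alpha/2)."""
    half_width = t_crit * s_bj
    return b_j - half_width, b_j + half_width


def reject_upper(t_stat, t_crit):
    """Rule 1, H1: beta_j > beta_star; t_crit = t(n-K-1, alpha)."""
    return t_stat > t_crit


def reject_lower(t_stat, t_crit):
    """Rule 2, H1: beta_j < beta_star; t_crit = t(n-K-1, alpha)."""
    return t_stat < -t_crit


def reject_two_sided(t_stat, t_crit):
    """Rule 3, H1: beta_j != beta_star; t_crit = t(n-K-1, alpha/2)."""
    return abs(t_stat) > t_crit


# Hypothetical fit: n = 25, K = 2, so n - K - 1 = 22 degrees of freedom.
# From a t table, t(22, 0.025) = 2.074.
b_j, s_bj, t_crit = 2.0, 0.5, 2.074

t_stat = t_statistic(b_j, s_bj)                     # (2.0 - 0) / 0.5 = 4.0
lower, upper = confidence_interval(b_j, s_bj, t_crit)
significant = reject_two_sided(t_stat, t_crit)      # True, since 4.0 > 2.074
```

The 95% interval here runs from about 0.963 to 3.037 and excludes 0, which agrees with rejecting the two-sided null hypothesis at the 5% level.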

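Test on All the Parameters of a Regression Model
Consider the multiple regression model
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
To test the null hypothesis
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_K = 0$$
against the alternative hypothesis
$$H_1: \text{at least one } \beta_j \ne 0$$
at a significance level $\alpha$ we can use the decision rule
$$\text{Reject } H_0 \text{ if } F_{K,n-K-1} = \frac{SSR / K}{s_e^2} > F_{K,n-K-1,\alpha}$$
where $F_{K,n-K-1,\alpha}$ is the critical value of F from Table 7 in the appendix for which
$$P(F_{K,n-K-1} > F_{K,n-K-1,\alpha}) = \alpha$$
The computed $F_{K,n-K-1}$ follows an F distribution with numerator degrees of freedom K and denominator degrees of freedom (n - K - 1).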

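Test on a Subset of the Regression Parameters
Consider the multiple regression model
$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_K x_{Ki} + \alpha_1 z_{1i} + \cdots + \alpha_r z_{ri} + \varepsilon_i$$
To test the null hypothesis
$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$$
that a subset of regression parameters are simultaneously equal to 0 against the alternative hypothesis
$$H_1: \text{at least one } \alpha_j \ne 0 \quad (j = 1, \ldots, r)$$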

Test on a Subset of the Regression Parameters (continued)
We compare the error sum of squares for the complete model with the error sum of squares for the restricted model. First run a regression for the complete model that includes all the independent variables and obtain SSE. Next run a restricted regression that excludes the Z variables whose coefficients are the $\alpha$'s; the number of variables excluded is r. From this regression obtain the restricted error sum of squares SSE(r). Then compute the F statistic and apply the decision rule for a significance level $\alpha$:
$$\text{Reject } H_0 \text{ if } F = \frac{(SSE(r) - SSE) / r}{s_e^2} > F_{r,n-K-r-1,\alpha}$$
where $s_e^2$ is the error variance estimate from the complete model.
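
Both F statistics can be computed from the sums of squares alone. The sketch below is illustrative: the function names and all numeric values are hypothetical, and `k_complete` denotes the total number of independent variables in the complete model (K + r in the notation of the slide). It also treats the test on all parameters as the special case in which the restricted model keeps only the intercept, so that SSE(r) = SST.

```python
def partial_f_statistic(sse_restricted, sse_complete, r, n, k_complete):
    """F statistic for dropping r variables from the complete model."""
    # Error variance estimate from the complete model.
    s2_e = sse_complete / (n - k_complete - 1)
    return ((sse_restricted - sse_complete) / r) / s2_e


def overall_f_statistic(sst, sse, n, k):
    """F statistic for H0: all K slope coefficients equal 0.

    The restricted model has only an intercept, so its error sum of
    squares is SST and all K variables are excluded.
    """
    return partial_f_statistic(sst, sse, k, n, k)


# Hypothetical values: n = 30 and a complete model with 4 variables.
n, k_complete = 30, 4
sse_complete = 50.0
sse_restricted = 60.0  # after excluding r = 2 variables
sst = 150.0

f_subset = partial_f_statistic(sse_restricted, sse_complete, 2, n, k_complete)
f_all = overall_f_statistic(sst, sse_complete, n, k_complete)

# Critical value read from an F table: F(2, 25, 0.05) is about 3.39.
reject_subset = f_subset > 3.39
```

With these numbers $s_e^2 = 50/25 = 2$, the subset statistic is $(10/2)/2 = 2.5$, which does not exceed 3.39, and the statistic for all parameters is $(100/4)/2 = 12.5$.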

Predictions from the Multiple Regression Models
Suppose that the population regression model
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i, \quad i = 1, \ldots, n$$
holds and that the standard regression assumptions are valid. Let $b_0, b_1, \ldots, b_K$ be the least squares estimates of the model coefficients $\beta_j$, $j = 0, 1, \ldots, K$, based on the data points $(x_{1i}, x_{2i}, \ldots, x_{Ki}, y_i)$, $i = 1, 2, \ldots, n$. Then given a new observation $x_{1,n+1}, x_{2,n+1}, \ldots, x_{K,n+1}$, the best linear unbiased forecast of $Y_{n+1}$ is
$$\hat{y}_{n+1} = b_0 + b_1 x_{1,n+1} + b_2 x_{2,n+1} + \cdots + b_K x_{K,n+1}$$
It is very risky to obtain forecasts that are based on X values outside the range of the data used to estimate the model coefficients, because we do not have data evidence to support the linear model at those points.
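
A forecast is just the fitted equation evaluated at the new x values. The sketch below uses the coefficients of the Example 11.1 equation (1.56, 0.382, -0.00025); the predictor values and the sample ranges are hypothetical, since the slides do not list the data, and the range check is only a coarse per-variable guard against extrapolation.

```python
def forecast(coefficients, x_new):
    """Return b0 + b1*x1 + ... + bK*xK for one new observation."""
    b0, *slopes = coefficients
    if len(slopes) != len(x_new):
        raise ValueError("need one x value per slope coefficient")
    return b0 + sum(b * x for b, x in zip(slopes, x_new))


def is_extrapolation(x_new, sample_rows):
    """True if any new x value lies outside the range observed for that variable."""
    for j, value in enumerate(x_new):
        column = [row[j] for row in sample_rows]
        if value < min(column) or value > max(column):
            return True
    return False


# Coefficients from the fitted equation for Example 11.1.
profit_margin_model = [1.56, 0.382, -0.00025]

# Hypothetical predictor values, chosen only to make the arithmetic easy.
sample_rows = [(3.0, 7000.0), (4.5, 9000.0), (5.0, 6500.0)]
x_new = (4.0, 8000.0)

y_hat = forecast(profit_margin_model, x_new)            # 1.56 + 1.528 - 2.0
risky = is_extrapolation(x_new, sample_rows)            # False: inside both ranges
far_out = is_extrapolation((9.0, 8000.0), sample_rows)  # True: x1 above its range
```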

Quadratic Model Transformations
The quadratic function
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \varepsilon$$
can be transformed into a linear multiple regression model by defining new variables
$$z_1 = x_1, \quad z_2 = x_1^2$$
and then specifying the model as
$$y_i = \beta_0 + \beta_1 z_{1i} + \beta_2 z_{2i} + \varepsilon_i$$
which is linear in the transformed variables. Transformed quadratic variables can be combined with other variables in a multiple regression model. Thus we could fit a multiple quadratic regression using transformed variables.

Exponential Model Transformations
Coefficients for exponential models of the form
$$Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} \varepsilon$$
can be estimated by first taking the logarithm of both sides to obtain an equation that is linear in the logarithms of the variables:
$$\log(Y) = \log(\beta_0) + \beta_1 \log(X_1) + \beta_2 \log(X_2) + \log(\varepsilon)$$
Using this form we can regress the logarithm of Y on the logarithms of the two X variables and obtain estimates for the coefficients $\beta_1$ and $\beta_2$ directly from the regression analysis. Note that this estimation procedure requires that the random errors are multiplicative in the original exponential model. Thus the error term, $\varepsilon$, is expressed as a percentage increase or decrease instead of the addition or subtraction of a random error as we have seen for linear regression models.
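
The log transformation can be checked numerically. The sketch below uses hypothetical coefficient values and an error-free version of the exponential model, and confirms that the transformed data satisfy the linear equation in the logarithms exactly (up to rounding). It only verifies the algebra; estimating the coefficients would still require running the regression on the transformed variables.

```python
import math

# Hypothetical coefficients for Y = beta0 * X1**beta1 * X2**beta2.
beta0, beta1, beta2 = 2.0, 0.5, 1.5


def exponential_model(x1, x2):
    """Error-free exponential model used to generate illustrative data."""
    return beta0 * x1 ** beta1 * x2 ** beta2


def log_transform(x1, x2, y):
    """Transform one observation to (log x1, log x2, log y)."""
    return math.log(x1), math.log(x2), math.log(y)


points = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.5)]
gaps = []
for x1, x2 in points:
    log_x1, log_x2, log_y = log_transform(x1, x2, exponential_model(x1, x2))
    # Right-hand side of the linear form in the logarithms.
    linear_form = math.log(beta0) + beta1 * log_x1 + beta2 * log_x2
    gaps.append(abs(log_y - linear_form))

max_gap = max(gaps)  # zero apart from floating-point rounding
```

Note that all X and Y values must be positive for the logarithms to exist.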

Dummy Variable Regression Analysis
The relationship between Y and $X_1$ can shift in response to a changed condition. The shift effect can be estimated by using a dummy variable which has values of 0 (condition not present) and 1 (condition present). All of the observations from one set of data have dummy variable $X_2 = 1$, and the observations for the other set of data have $X_2 = 0$. In these cases the relationship between Y and $X_1$ is specified by the regression model
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$$

Dummy Variable Regression Analysis (continued)
The functions for each set of points are
$$\hat{y} = b_0 + b_1 x_1 \quad (x_2 = 0)$$
and
$$\hat{y} = (b_0 + b_2) + b_1 x_1 \quad (x_2 = 1)$$
In the first function the constant is $b_0$, while in the second the constant is $b_0 + b_2$. Dummy variables are also called indicator variables.

Dummy Variable Regression for Differences in Slope
To determine if there are significant differences in the slope between two discrete conditions we need to expand our regression model to a more complex form
$$\hat{y} = b_0 + b_2 x_2 + (b_1 + b_3 x_2) x_1$$
Now we see that the slope coefficient of $x_1$ contains two components, $b_1$ and $b_3 x_2$. When $x_2$ equals 0, the slope estimate is the usual $b_1$. However, when $x_2$ equals 1, the slope is equal to the algebraic sum $b_1 + b_3$. To estimate the model we need to multiply the variables to create a new set of transformed variables in which the model is linear. Therefore the model actually used for the estimation is
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$$

Dummy Variable Regression for Differences in Slope (continued)
The resulting regression model is now linear with three variables. The new variable $x_1 x_2$ is often called an interaction variable. Note that when the dummy variable $x_2 = 0$ this variable has a value of 0, but when $x_2 = 1$ this variable has the value of $x_1$. The coefficient $b_3$ is an estimate of the difference in the coefficient of $x_1$ when $x_2 = 1$ compared to when $x_2 = 0$. Thus the t statistic for $b_3$ can be used to test the hypothesis
$$H_0: \beta_3 = 0 \quad \text{against} \quad H_1: \beta_3 \ne 0$$
If we reject the null hypothesis we conclude that there is a difference in the slope coefficient for the two subgroups. In many cases we will be interested in both the difference in the constant and the difference in the slope and will test both of the hypotheses presented in this section.
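
The role of the dummy and interaction terms can be illustrated with a short sketch. The coefficient values and data are hypothetical; the code builds the interaction column $x_1 x_2$ and reads off the intercept and slope that the fitted equation implies for each group.

```python
def add_interaction(x1_values, x2_values):
    """Return rows (x1, x2, x1*x2) ready for a three-variable regression."""
    return [(x1, x2, x1 * x2) for x1, x2 in zip(x1_values, x2_values)]


def group_lines(b0, b1, b2, b3):
    """Intercept and slope of y on x1 for each value of the dummy x2."""
    return {
        0: (b0, b1),            # x2 = 0: y_hat = b0 + b1*x1
        1: (b0 + b2, b1 + b3),  # x2 = 1: y_hat = (b0 + b2) + (b1 + b3)*x1
    }


def predict(b0, b1, b2, b3, x1, x2):
    """Evaluate y_hat = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


# Hypothetical data: the interaction column is 0 whenever the dummy is 0.
rows = add_interaction([2.0, 4.0, 6.0], [0, 1, 1])
# rows == [(2.0, 0, 0.0), (4.0, 1, 4.0), (6.0, 1, 6.0)]

# Hypothetical estimates.
b0, b1, b2, b3 = 10.0, 2.0, 3.0, -0.5
lines = group_lines(b0, b1, b2, b3)
# lines[0] == (10.0, 2.0) and lines[1] == (13.0, 1.5)
```

Setting $b_3 = 0$ in this sketch gives two parallel lines that differ only in the constant, which is the shift-only dummy variable model.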

Key Words
Adjusted Coefficient of Determination
Basis for Inference About the Population Regression Parameters
Coefficient of Multiple Determination
Confidence Intervals for Partial Regression Coefficients
Dummy Variable Regression Analysis
Dummy Variable Regression for Differences in Slope
Estimation of Error Variance
Least Squares Estimation and the Sample Multiple Regression
Prediction from Multiple Regression Models
Quadratic Model Transformations

Key Words (continued)
Regression Objectives
Standard Error of the Estimate
Standard Multiple Regression Assumptions
Sum of Squares Decomposition and the Coefficient of Determination
Test on a Subset of the Regression Parameters
Test on All the Parameters of a Regression Model
Tests of Hypotheses for the Partial Regression Coefficients
The Population Multiple Regression Model