Regression Analysis: once a linear relationship is defined, the independent variable can be used to forecast the dependent variable.

Presentation transcript:

Regression Analysis
Once a linear relationship is defined, the independent variable can be used to forecast the dependent variable: $\hat{Y} = b_0 + b_X X$.
$b_0$ is called the Y intercept. It represents the value of Y when X = 0. But be cautious: this interpretation may be incorrect and the value difficult to estimate, because many times our data do not include X = 0. Think of this value as representing the influences of the many other independent variables that are not included in the equation.
$b_X$ is called the slope. It represents the amount of change in Y when X increases by one unit.

Regression Analysis
Regression line: the line that best fits a collection of X-Y data points. The regression line minimizes the sum of the squared vertical distances from the points to the line.
Regression equation: the method of least squares finds $b_0$ and $b_X$.
Other models: stepwise, forward stepwise, and backward stepwise.
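
As a rough illustration of the method of least squares, the sketch below computes $b_0$ and $b_X$ from the closed-form formulas. The advertising and sales numbers are made up for the example, and `fit_line` is a hypothetical helper name, not something taken from the slides.

```python
def fit_line(x, y):
    """Return (b_0, b_x) for the least-squares line Y-hat = b_0 + b_x * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b_x = s_xy / s_xx
    # The least-squares line passes through the point (x_bar, y_bar).
    b_0 = y_bar - b_x * x_bar
    return b_0, b_x


# Made-up data for illustration only.
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

b_0, b_x = fit_line(advertising, sales)
print(f"Y-hat = {b_0:.2f} + {b_x:.2f} X")
```

For these numbers the fitted line is Y-hat = 2.2 + 0.6 X.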

Regression Assumptions
Y values are normally distributed about the regression line.
Variance remains constant as X values increase and decrease. A violation is called heteroscedasticity.
Error terms (residuals) are independent of one another and random (no autocorrelation).
A linear relationship exists between X and Y. Nonlinear techniques are discussed later.

Excel’s Regression Tool
Tools, Data Analysis, Regression. Hint: include labels in the input ranges to help with the interpretation! You can also include plots (not shown here).

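Comparison of a Forecasted Value to the Actual Value and the Average
Total deviation = explained deviation + unexplained deviation. For a single point: $(Y - \bar{Y}) = (\hat{Y} - \bar{Y}) + (Y - \hat{Y})$.
[Figure: Sales (Y) plotted against Advertising (X), marking the total, explained, and unexplained parts of the distance between an actual Y, the forecast $\hat{Y}$, and the average.]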

Data Analysis
$R^2$, or the coefficient of determination, equals the proportion of the variance in the dependent variable Y that is explained through the relationship with the independent variable X.
Explained Variance = Total Variance - Unexplained Variance. We state this as a proportion, where n is the number of observations and k is the number of parameters being estimated:
Unadjusted: $R^2 = 1 - \frac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2}$
Adjusted: $\bar{R}^2 = 1 - \frac{\sum (Y - \hat{Y})^2 / (n - k)}{\sum (Y - \bar{Y})^2 / (n - 1)}$
Adjusted $R^2$ is adjusted for complexity by the degrees of freedom. Unadjusted $R^2$ never decreases as more variables are added to the equation, because each added variable can only reduce the sum of squared errors. Using an unadjusted $R^2$ may lead you to believe that additional variables are useful when they are not.

More on $R^2$
If $R^2 = 1$, there is a perfect linear relationship. All the variance in Y is explained by X, and all of the data points are on the regression line.
If $R^2 = 0$, there is no linear relationship between X and Y. (If this is the case, we should not have run a linear model, and we should have realized this from a correlation coefficient and a graph BEFORE running the model!)
There are several ways to calculate $R^2$. From the ANOVA table: SSR/SST (this is an UNADJUSTED $R^2$). Adjusted $R^2$ from the ANOVA table = 1 - MSE/(SST/(n-1)).
The square root of $R^2$ gives the magnitude of R, the correlation coefficient. R takes the sign of the slope, which identifies positive and negative relationships.
$R^2$ is useful for making model comparisons.
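
The two ANOVA-based calculations above can be sketched as follows. The data are made up for illustration, `r_squared` is a hypothetical helper name, and the fitted values assume the least-squares line Y-hat = 2.2 + 0.6 X that these particular points produce.

```python
def r_squared(y, y_hat, k=2):
    """Return (unadjusted, adjusted) R^2; k = number of estimated parameters."""
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    # For a least-squares fit with an intercept this equals SSR/SST.
    unadjusted = 1 - sse / sst
    # 1 - MSE / (SST / (n - 1)), with MSE = SSE / (n - k).
    adjusted = 1 - (sse / (n - k)) / (sst / (n - 1))
    return unadjusted, adjusted


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

r2, r2_adj = r_squared(y, y_hat)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {r2_adj:.3f}")
```

With only five points the adjustment is large: the adjusted value is noticeably smaller than the unadjusted one.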

Data Analysis
$S_{yx}$, or the standard error of the estimate, is a measure of goodness of fit. It measures the actual values (Y) against the regression line ($\hat{Y}$). A lower $S_{yx}$ is a better fit.
$S_{yx} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - k}}$
k refers to the number of population parameters being estimated. In this case, we have 2: $b_0$ and $b_X$.
The standard error can also be calculated by taking the square root of the MSE in the ANOVA table!
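
A minimal sketch of the standard error calculation, using made-up data and a hypothetical helper name (`standard_error_of_estimate`). The fitted values assume the least-squares line Y-hat = 2.2 + 0.6 X for these points.

```python
import math


def standard_error_of_estimate(y, y_hat, k=2):
    """S_yx = sqrt(SSE / (n - k)), i.e. the square root of the MSE."""
    n = len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(sse / (n - k))


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

s_yx = standard_error_of_estimate(y, y_hat)
print(f"S_yx = {s_yx:.4f}")
```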

Residuals
Excel will provide the residuals in the output. This table also includes another column that I added: the squared residuals, which are used to determine the standard error of the estimate ($S_{yx}$).

Confidence Intervals
Prior to relating Y to X, confidence intervals about the future values are based on the standard error of Y. In the regression equation, however, the standard error of forecast ($S_f$) gives tighter confidence intervals and greater accuracy.
The confidence interval for Y uses the standard error of Y; the confidence interval for $\hat{Y}$ uses $S_f$.
Use $t_{\alpha/2}$ for small sample sizes!
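
For reference, a common textbook form of the standard error of forecast for simple linear regression is sketched below. This is an assumption about the intended formula; $X_p$ (the X value being forecast) and $\bar{X}$ (the mean of the observed X values) are symbols introduced here for illustration.

```latex
S_f = S_{yx} \sqrt{1 + \frac{1}{n} + \frac{(X_p - \bar{X})^2}{\sum (X - \bar{X})^2}}
\qquad
\hat{Y} \pm t_{\alpha/2} \, S_f
```

In this form $S_f$ grows as $X_p$ moves away from $\bar{X}$, so the interval is narrowest near the center of the data.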

Making Predictions
Identifying a forecasted point from the regression equation does not give us an idea of the accuracy of the prediction. We use the prediction interval to determine accuracy. For example, a prediction of 8.44 appears to be precise, but not if the 95% confidence level allows the forecast to be anywhere between 1.75 and 15.15!
Be careful about making a prediction based on a prediction. For example, if the X values range between 5 and 15, you should be cautious about using an X value of 20: it is outside the range of the data and possibly outside of the linear relationship.
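
The sketch below shows one way a prediction interval might be computed. It assumes the common textbook standard error of forecast, $S_f = S_{yx}\sqrt{1 + 1/n + (X_p - \bar{X})^2 / \sum (X - \bar{X})^2}$, which is not spelled out in the slides. The data, the function name `prediction_interval`, and the t value (read from a t-table for 3 degrees of freedom) are all illustrative assumptions.

```python
import math


def prediction_interval(x, y, x_new, t_crit):
    """Return (forecast, low, high, extrapolating) for a new X value."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b_x = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b_0 = y_bar - b_x * x_bar
    sse = sum((yi - (b_0 + b_x * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))  # two parameters estimated
    # Assumed textbook form of the standard error of forecast.
    s_f = s_yx * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / s_xx)
    forecast = b_0 + b_x * x_new
    # Flag X values outside the observed range (extrapolation).
    extrapolating = not (min(x) <= x_new <= max(x))
    return forecast, forecast - t_crit * s_f, forecast + t_crit * s_f, extrapolating


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# 95% two-sided t value for n - 2 = 3 degrees of freedom, from a t-table.
forecast, low, high, extrapolating = prediction_interval(x, y, x_new=4, t_crit=3.182)
print(f"forecast = {forecast:.2f}, interval = ({low:.2f}, {high:.2f})")
print("outside the data range" if extrapolating else "inside the data range")
```

Even inside the data range, the interval here is wide relative to the forecast, which is the point of the caution above.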

Is the Independent Variable Significant?
$H_0$: $B = 0$. The regression coefficient is not significantly different from zero.
$H_A$: $B \neq 0$. The regression coefficient is significantly different from zero.
B is the true slope of the regression line.

Is the Independent Variable Significant?
The standard error of the estimate is $S_{yx}$; the standard error of the regression coefficient is $S_b$.
We will use Excel’s p-value for the independent variable to determine significance. If the p-value is less than .05, we reject the null hypothesis and conclude that the independent variable is related to the dependent variable.
However, it is important to understand how the formulas are developed, which is why the formulas and definitions are provided.
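
To make the test concrete, here is a small sketch of the t statistic for the slope. It assumes the usual simple-regression formula $S_b = S_{yx} / \sqrt{\sum (X - \bar{X})^2}$ and $t = b_X / S_b$, which the slides do not show. The data and the name `slope_t_statistic` are made up, and because the Python standard library has no t distribution, the sketch compares against a t-table value instead of computing a p-value as Excel does.

```python
import math


def slope_t_statistic(x, y):
    """Return (b_x, s_b, t) for testing H0: true slope B = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b_x = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b_0 = y_bar - b_x * x_bar
    sse = sum((yi - (b_0 + b_x * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))   # standard error of the estimate
    s_b = s_yx / math.sqrt(s_xx)      # standard error of the slope
    return b_x, s_b, b_x / s_b


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b_x, s_b, t_stat = slope_t_statistic(x, y)
t_crit = 3.182  # 95% two-sided, 3 degrees of freedom, from a t-table
print(f"b_x = {b_x:.2f}, S_b = {s_b:.4f}, t = {t_stat:.3f}")
print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")
```

With so few points the slope is not significant at the 5% level, even though the fitted line looks reasonable.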

Analyzing It All at Once
What happens if you have a large sample size, a small $R^2$ (such as .10), and you have determined that the independent variable is significant? What happens with a small sample, a large $R^2$, and an independent variable that is NOT significant?
To test the model, we use the F statistic from the ANOVA table.

ANOVA Analysis
Source       df     SS     MS                 F
Regression   k-1    SSR    MSR = SSR/(k-1)    MSR/MSE
Error        n-k    SSE    MSE = SSE/(n-k)
Total        n-1    SST

F-Test
$H_0$: The model is NOT valid, and there is NOT a statistical relationship between the dependent and independent variables.
$H_A$: The model is valid. There is a statistical relationship between the dependent and independent variables.
If F from the ANOVA is greater than the F from the F-table, reject $H_0$: the model is valid. We can also look at the p-value. If the p-value is less than our set $\alpha$ level, we can REJECT $H_0$.
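
The ANOVA quantities and the F comparison can be sketched as below. The data, the function name `anova`, and the critical value (read from an F-table for 1 and 3 degrees of freedom at the .05 level) are illustrative assumptions, and the fitted values assume the least-squares line Y-hat = 2.2 + 0.6 X for these points.

```python
def anova(y, y_hat, k=2):
    """Return the ANOVA quantities for a fitted regression as a dict."""
    n = len(y)
    y_bar = sum(y) / n
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # explained
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total
    msr = ssr / (k - 1)
    mse = sse / (n - k)
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "MSR": msr, "MSE": mse, "F": msr / mse}


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * xi for xi in x]

table = anova(y, y_hat)
f_crit = 10.13  # F-table value for (1, 3) degrees of freedom at alpha = .05
print(f"F = {table['F']:.2f}")
print("reject H0" if table["F"] > f_crit else "fail to reject H0")
```

In simple regression the F statistic is the square of the slope's t statistic, so the two tests agree.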

Durbin-Watson Statistic
Minitab will provide a DW statistic. This detects first-order autocorrelation, that is, between the values at time t and time t-1. It is computed from the residuals.
The value of DW varies between 0 and 4. A value of 2 indicates no autocorrelation, a value near 0 indicates positive autocorrelation, and a value near 4 indicates negative autocorrelation.
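
For readers without Minitab, the statistic is simple to compute by hand. The sketch below uses the standard definition (sum of squared differences of successive residuals divided by the sum of squared residuals); the residual series are made up, and `durbin_watson` is a hypothetical helper name.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), residuals in time order."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residual series for illustration only.
mixed = [-0.8, 0.6, 1.0, -0.6, -0.2]     # no clear pattern
alternating = [1, -1, 1, -1]             # sign flips every period
runs = [1, 1, 1, -1, -1, -1]             # long runs of the same sign

print(f"mixed:       {durbin_watson(mixed):.2f}")        # close to 2
print(f"alternating: {durbin_watson(alternating):.2f}")  # toward 4
print(f"runs:        {durbin_watson(runs):.2f}")         # toward 0
```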

Data Transformations
Curvilinear relationships: fit the data with a curved line.
Transform the X variable (independent) so that the resulting relationship with Y is linear. Log of X, square root of X, X squared, and the reciprocal of X (1/X) are common. The hope is that one of these transformations will result in a linear relationship.
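
As a small illustration of the idea, the sketch below compares the correlation of Y with X and with the log of X. The data are made up so that Y is exactly linear in the log of X, the base-2 log is chosen only to keep the numbers readable (any base works), and `correlation` is a hypothetical helper name.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up curvilinear data: Y rises quickly at first, then levels off.
x = [1, 2, 4, 8, 16]
y = [1, 3, 5, 7, 9]

r_raw = correlation(x, y)
log_x = [math.log2(xi) for xi in x]   # transform the independent variable
r_log = correlation(log_x, y)

print(f"r with X:      {r_raw:.3f}")
print(f"r with log(X): {r_log:.3f}")
```

The transformed variable gives the stronger linear relationship, so the regression would be run on log(X) rather than X.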

Ok, 18 pages of notes, so where do we start?
1. Determine the dependent and independent variables.
2. Develop scatter plots and determine whether linear or nonlinear relationships exist. Calculate a correlation coefficient.
3. Transform nonlinear data.
4. Run an autocorrelation and interpret the results. It will be helpful to see if any patterns exist.
5. Compute the regression equation. Interpret.
6. Understand the difference between the standard error of estimate, the standard error of forecast (regression), and the standard error of the regression coefficient.
7. Evaluate and interpret the adjusted $R^2$.
8. Test the independent variables for significance.
9. Evaluate the ANOVA and test the model for significance (F and DW).
10. Plot the error terms.
11. Calculate a prediction and prediction interval.
12. State final conclusions about the model (if running different models, compare using MSE, MAD, MAPE, MPE).