Bivariate and Multiple Regression. Excerpt from Ch. 8 of: “Statistics for Marketing and Consumer Research”, M. Mazzocchi, SAGE, 2008.

Presentation transcript:

BIVARIATE AND MULTIPLE REGRESSION
Excerpt from Ch. 8 of: “Statistics for Marketing and Consumer Research”, M. Mazzocchi, SAGE.
Laboratory lessons, Marketing course, L. Baldi, Università degli Studi di Milano

BIVARIATE LINEAR REGRESSION
$y_i = \alpha + \beta x_i + \varepsilon_i$
where $y_i$ is the dependent variable, $\alpha$ the intercept, $\beta$ the regression coefficient, $x_i$ the explanatory variable and $\varepsilon_i$ the (random) error term.
Causality (from x to y) is assumed.
The error term embodies anything that is not accounted for by the linear relationship.
The unknown parameters ($\alpha$ and $\beta$) need to be estimated (usually on sample data). We refer to the sample parameter estimates as a and b.

TO STUDY IN DETAIL: LEAST SQUARES ESTIMATION OF THE UNKNOWN PARAMETERS
For given values of the parameters a and b, the error (residual) term for each observation is $e_i = y_i - a - b x_i$.
The least squares parameter estimates are the values of a and b that minimize the sum of squared errors: $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2$.
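As a minimal sketch of the least squares idea, the closed-form solution for the bivariate case can be computed directly. The function name `fit_line` and the five price/consumption observations are made up for illustration; they are not taken from the course data.

```python
def fit_line(x, y):
    """Return the least squares estimates (a, b) for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of x around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope estimate
    a = mean_y - b * mean_x  # intercept estimate
    return a, b

# Hypothetical data: price (x) and consumption (y)
price = [1.0, 2.0, 3.0, 4.0, 5.0]
consumption = [9.0, 7.0, 6.0, 3.0, 2.0]

a, b = fit_line(price, consumption)
residuals = [yi - a - b * xi for xi, yi in zip(price, consumption)]
sse = sum(e ** 2 for e in residuals)  # the quantity least squares minimizes
print(a, b, sse)
```

With an intercept in the model, the residuals sum to zero, which is a quick sanity check on any fit.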

TO STUDY IN DETAIL: ASSUMPTIONS ON THE ERROR TERM
1. The error term has a zero mean
2. The variance of the error term does not vary across cases (homoskedasticity)
3. The error term for each case is independent of the error term for other cases
4. The error term is also independent of the values of the explanatory (independent) variable
5. The error term is normally distributed

PREDICTION
Once a and b have been estimated, it is possible to predict the value of the dependent variable for any given value of the explanatory variable: $\hat{y} = a + b x$.
Example: if the price x changes, what happens to consumption y?
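A small sketch of the price example, assuming hypothetical estimates a = 10.8 and b = -1.8 (illustrative values, not estimates from the course data):

```python
def predict(a, b, x):
    """Predicted value of the dependent variable for a given x."""
    return a + b * x

# Hypothetical parameter estimates
a, b = 10.8, -1.8

y_now = predict(a, b, 2.0)    # predicted consumption at price 2.0
y_after = predict(a, b, 2.5)  # predicted consumption at price 2.5
change = y_after - y_now      # equals b times the price change
print(y_now, y_after, change)
```

The predicted change depends only on b and on the size of the price change, not on the starting price.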

MODEL EVALUATION
An evaluation of the model performance can be based on the residuals ($e_i = y_i - \hat{y}_i$), which show how well the model predictions fit the original data (goodness-of-fit).
Since the parameters a and b are estimated on the sample, just like a mean, they are accompanied by standard errors, which measure the precision of these estimates and depend on the sample size.
Knowledge of the standard errors makes hypothesis testing possible.

HYPOTHESIS TESTING ON REGRESSION COEFFICIENTS
t-test on each of the individual coefficients
  Null hypothesis: the corresponding population coefficient is zero.
  The p-value allows one to decide whether or not to reject the null hypothesis that the coefficient is zero (usually the null hypothesis is rejected when p < 0.05).
F-test (multiple independent variables, as discussed later)
  It is run jointly on all coefficients of the regression model.
  Null hypothesis: all coefficients are zero.
  The F-test in linear regression corresponds to the ANOVA test.
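The two test statistics can be sketched for the bivariate case as follows. The data are made up, the function name `slope_t_and_f` is hypothetical, and the snippet stops at the statistics themselves: turning them into p-values requires the t and F distributions, which statistical packages such as SPSS provide.

```python
import math

def slope_t_and_f(x, y):
    """Return (t statistic for the slope, F statistic) for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - mean_y) ** 2 for yi in y)                  # total
    ssr = sst - sse                                            # explained
    s2 = sse / (n - 2)           # residual variance, n - 2 degrees of freedom
    se_b = math.sqrt(s2 / sxx)   # standard error of the slope
    t_stat = b / se_b            # H0: population slope is zero
    f_stat = (ssr / 1) / s2      # H0: all slope coefficients are zero
    return t_stat, f_stat

# Hypothetical data
price = [1.0, 2.0, 3.0, 4.0, 5.0]
consumption = [9.0, 7.0, 6.0, 3.0, 2.0]

t_stat, f_stat = slope_t_and_f(price, consumption)
# Two-sided 5% critical value of the t distribution with 3 degrees of freedom
T_CRITICAL_3DF = 3.182
reject_null = abs(t_stat) > T_CRITICAL_3DF
print(t_stat, f_stat, reject_null)
```

With a single explanatory variable the F statistic equals the square of the slope's t statistic, so the two tests lead to the same decision.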

R² COEFFICIENT OF DETERMINATION
Definition: a statistical measure of the ‘goodness of fit’ in a regression equation. It gives the proportion of the total variance of the forecasted variable that is explained by the fitted regression equation, i.e. by the independent explanatory variables.
The natural candidate for measuring how well the model fits the data is the coefficient of determination, which varies between zero (when the model does not explain any of the variability of the dependent variable) and 1 (when the model fits the data perfectly).
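A minimal sketch of the calculation, written as one minus the ratio of unexplained to total variation. The function name `r_squared`, the observed values and the fitted values are all illustrative assumptions.

```python
def r_squared(y, y_hat):
    """Share of the total variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained
    sst = sum((yi - mean_y) ** 2 for yi in y)              # total
    return 1 - sse / sst

# Hypothetical observed values and fitted values from a regression line
observed = [9.0, 7.0, 6.0, 3.0, 2.0]
fitted = [9.0, 7.2, 5.4, 3.6, 1.8]

r2 = r_squared(observed, fitted)
# A perfect fit gives exactly 1
r2_perfect = r_squared(observed, observed)
print(r2, r2_perfect)
```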

BIVARIATE REGRESSION IN SPSS

REGRESSION OUTPUT
Only 5% of total variation is explained by the model (correlation is 0.23).
The F-test rejects the hypothesis that all coefficients are zero.
Both parameters are statistically different from zero according to the t-test.
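The two figures quoted for this output are linked: in a bivariate regression the coefficient of determination equals the squared correlation between x and y. A quick check, using only the correlation reported on the slide:

```python
# Correlation reported in the regression output
r = 0.23

# In a bivariate regression, R squared is the square of the correlation
r2 = r ** 2
share_explained_percent = round(100 * r2)
print(r2, share_explained_percent)
```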

MULTIPLE REGRESSION
The principle is identical to bivariate regression, but there are more explanatory variables.
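Written out for k explanatory variables, the model can be sketched as below. The symbols $\beta_1, \dots, \beta_k$ and $x_{1i}, \dots, x_{ki}$ are assumed here by analogy with the bivariate case, with $\alpha$ the intercept and $\varepsilon_i$ the error term:

```latex
y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i
```

Each $\beta_j$ measures the change in $y$ for a one-unit change in $x_j$, holding the other explanatory variables constant.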

ADDITIONAL ISSUES: Collinearity (or multicollinearity) problem
The independent variables should also be independent of each other. Otherwise we run into a double-counting problem and it becomes very difficult to separate their individual effects.
Consequences:
Inefficient estimates
Apparently good model but poor forecasts
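One common way to screen for collinearity, not mentioned on the slide and offered here only as an illustrative sketch, is to look at the correlation between regressors and at the variance inflation factor (VIF). With exactly two regressors the VIF of each is 1 / (1 - r²), where r is their correlation. The data and function names below are made up.

```python
import math

def pearson_r(u, v):
    """Pearson correlation between two equally long lists."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def vif_two_regressors(x1, x2):
    """VIF shared by both regressors when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical regressors
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 5.0, 4.0, 5.0]

r12 = pearson_r(x1, x2)
vif = vif_two_regressors(x1, x2)
print(r12, vif)
```

A VIF of 1 means the regressors are uncorrelated; the larger the value, the more the variance of the coefficient estimates is inflated.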

GOODNESS-OF-FIT
The coefficient of determination R² never decreases when additional regressors are included, even if they add little explanatory power.
Thus, a proper indicator is the adjusted R², which accounts for the number of explanatory variables (k) in relation to the number of observations (n).
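A minimal sketch of the adjustment, using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The function name and the input values are hypothetical and unrelated to the SPSS output in these slides.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical case: same R squared, different numbers of regressors
adj_small_model = adjusted_r2(0.5, n=21, k=2)
adj_large_model = adjusted_r2(0.5, n=21, k=10)
print(adj_small_model, adj_large_model)
```

For the same R², the model with more regressors receives the lower adjusted value, which is the penalty the slide describes.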

MULTIPLE REGRESSION IN SPSS
Analyze / Regression / Linear
Simply select more than one explanatory variable.

OUTPUT
The model accounts for 19.3% of variability in the dependent variable.
After adjusting for the number of regressors, the R² is (value not shown).
The null hypothesis that all regression coefficients are zero is strongly rejected.

OUTPUT
Only these parameters (price, “we like..” and household size) emerge as significantly different from 0.

COEFFICIENT INTERPRETATION – INTERCEPT
The constant represents the amount spent when all other variables are zero.
Its estimate is negative, but the hypothesis that the constant is zero is not rejected.
A household with zero members and no income is unlikely to consume chicken.
However, estimates for the intercept are often unsatisfactory, because frequently there are no data points with values of the independent variables close or equal to zero.

COEFFICIENT INTERPRETATION
The significant coefficients tell us that, holding the other variables constant:
Each additional household member means an increase in consumption of 277 grams.
A £1 increase in price leads to a decrease in consumption of 109 grams.
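As a worked illustration of how the two coefficients combine, the sketch below uses the figures quoted above (277 grams per household member, -109 grams per £1). The scenario of two extra members and a £0.50 price rise is hypothetical.

```python
# Coefficients quoted on the slide, in grams
GRAMS_PER_HOUSEHOLD_MEMBER = 277.0
GRAMS_PER_POUND_OF_PRICE = -109.0

def predicted_change(extra_members, price_change_pounds):
    """Predicted change in consumption (grams), other variables held constant."""
    return (GRAMS_PER_HOUSEHOLD_MEMBER * extra_members
            + GRAMS_PER_POUND_OF_PRICE * price_change_pounds)

# Hypothetical scenario: two more household members and a 0.50 pound price rise
change_grams = predicted_change(extra_members=2, price_change_pounds=0.5)
print(change_grams)
```

Because the model is linear, the effects simply add up: 2 × 277 grams from the larger household minus 0.5 × 109 grams from the higher price.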