Lecture 3-3: Summarizing relationships among variables


Topics covered in this lecture note
We will cover several topics about ordinary least squares (OLS) estimation.
1. Testing the statistical significance of an estimated coefficient using the t-statistic (i.e., testing whether advertisement spending has any effect on revenue).
2. Ordinary least squares estimation when there is more than one explanatory variable.
3. An introduction to panel data (repeated observations over time).

1. Testing the statistical significance of the estimated coefficient: Example
The graph shows the relationship between advertisement spending and revenue, along with the estimated linear equation. According to the estimated slope coefficient, every 1,000 yen spent on advertisement increases revenue by 13.4 thousand yen.

Testing the statistical significance of the estimated coefficient: Example, contd
However, the graph also seems to indicate that there is not much of a relationship between advertisement spending and revenue. When we estimate a linear equation, we typically want to know whether advertisement has any effect on revenue. To answer such a question, estimating β0 and β1 alone is not enough; we need more information.

Testing the statistical significance of the estimated coefficient: Example, contd
The following slides describe the procedure for answering this question: “Would the advertisement have any impact on the revenue?”

Testing the statistical significance of the estimated coefficient: Example, contd
- To test whether advertisement spending has any impact on revenue, we need to test whether the slope coefficient is “significantly” different from zero.
1. If the slope coefficient is significantly different from zero, we may conclude that advertisement spending has some effect on revenue.
2. If the slope coefficient is not significantly different from zero, we have found no evidence that advertisement spending affects revenue.
- What, then, is the criterion for deciding whether the slope coefficient is “significantly” different from zero? See the next slide.

Testing the statistical significance of the estimated coefficient: Example, contd
- To decide whether the slope coefficient is significantly different from zero, we use the “t-statistic”.
- The OLS estimation procedure produces much more than β0 and β1; it also reports the t-statistic. Next, we obtain some of this extra information from OLS estimation using Excel.

Testing the statistical significance of the estimated coefficient: Example, contd
- Open the data set “OLS Exercise 2-Advertisement and Revenue”. This is the data set used to produce the graph in the previous slides. Now use “Data Analysis” to estimate the following model:
(Revenue) = β0 + β1 (Advertisement Spending)

Testing the statistical significance of the estimated coefficient: Example, contd
The table shows the result of the OLS regression.
1. It reports the intercept coefficient (β0) and the slope coefficient (β1).
2. It also reports some extra information, such as the standard error and the t-statistic (“t Stat” in the table). These are the pieces of information needed to test whether the slope coefficient is significantly different from zero. See the next slides.
[Table: Excel regression output with columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%; rows Intercept and Advertisement Spending]

Testing the statistical significance of the estimated coefficient: Example -Standard Error-
Since the data contain a lot of noise (unexpected rises and falls in revenue, etc.), the effect of advertisement on revenue (β1) is estimated with some error. The standard error shows the expected size of the error in each estimated coefficient.

Testing the statistical significance of the estimated coefficient: Example -Standard Error, contd-
- For example, the standard error for the slope coefficient is about 60.3. This means that the estimate of the slope coefficient (β1) is off by about ±60.3 on average.
- Thus, the smaller the standard error for β1, the more precise the estimate of the impact of advertisement.

Testing the statistical significance of the estimated coefficient: Example -t statistic-
The t-statistic is obtained by dividing the coefficient by its standard error:
(t-statistic) = (coefficient) / (standard error)
For example, the t-statistic for the slope coefficient is the estimated slope divided by the standard error of the slope.
Our confidence that advertisement spending has some impact on revenue increases as the absolute value of the t-statistic increases (this happens when the standard error decreases or the coefficient moves farther from zero).
We use the t-statistic to test whether the slope coefficient is significantly different from zero.
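The slides obtain these numbers from Excel's "Data Analysis" tool. As a rough illustration of what is being computed, here is a minimal sketch in Python of simple OLS with the slope's standard error and t-statistic. The function name `simple_ols` and the data values are hypothetical and are not the course data set.

```python
import math

def simple_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares.

    Returns the estimates plus the slope's standard error and t-statistic.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    se_b1 = math.sqrt(s2 / sxx)
    # t-statistic = coefficient divided by its standard error.
    t_b1 = b1 / se_b1
    return {"b0": b0, "b1": b1, "se_b1": se_b1, "t_b1": t_b1}

# Hypothetical data for illustration only (not the course data set).
advertisement = [1, 2, 3, 4, 5, 6]
revenue = [12, 9, 15, 11, 16, 14]
result = simple_ols(advertisement, revenue)
print(result)
```

The returned `t_b1` corresponds to the "t Stat" column of the Excel output for the slope row.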

The procedure to test the statistical significance of the estimated coefficient
The following is the procedure to test whether a coefficient is significantly different from zero.
1. Obtain the t-statistic.
2. Check whether the absolute value of the t-statistic is greater than or equal to 2 (that is, t-stat ≤ ‒2 or t-stat ≥ +2).
3. If the absolute value of the t-statistic is greater than or equal to 2, the coefficient is statistically significantly different from zero.
4. If the absolute value of the t-statistic is smaller than 2, the coefficient is not statistically significantly different from zero.
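The rule of thumb above can be sketched as a small helper. The function name `is_significant` and the input values are hypothetical, chosen only to show one significant and one non-significant case.

```python
def is_significant(coefficient, standard_error, threshold=2.0):
    """Apply the rule of thumb: significant when |t| >= threshold."""
    t_stat = coefficient / standard_error
    return abs(t_stat) >= threshold, t_stat

# Hypothetical coefficient / standard error pairs.
significant_a, t_a = is_significant(5.0, 2.0)    # t = 2.5
significant_b, t_b = is_significant(-3.0, 2.0)   # t = -1.5
print(significant_a, t_a)
print(significant_b, t_b)
```

Using the absolute value means that a large negative t-statistic counts as significant in the same way as a large positive one.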

A note on the test of statistical significance of the estimated coefficient 1
- When the coefficient is statistically significantly different from zero, we simply say “the coefficient is statistically significant”.
1. If the coefficient is statistically significant, we conclude that advertisement spending has some impact on revenue.
2. If the coefficient is not statistically significant, we conclude that we have found no evidence that advertisement spending affects revenue.

A note on the test of statistical significance of the estimated coefficient 2 (Optional)
The criterion value for the t-statistic that we used for testing statistical significance was 2. More precisely, this criterion value depends on the number of observations and the number of parameters to be estimated. This topic will be discussed in more detail later in the class. Roughly speaking, when you use the criterion value of 2, you are testing the statistical significance of the slope coefficient at the 5% significance level.
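In standard textbook notation (the symbols n, k and α are assumptions introduced here, not taken from the slides), with n observations and k estimated parameters, the more precise rule compares the t-statistic with a critical value from the t distribution with n − k degrees of freedom:

```latex
t = \frac{\hat{\beta}_1}{\operatorname{SE}(\hat{\beta}_1)},
\qquad
\text{reject } H_0\colon \beta_1 = 0 \ \text{ if } \ |t| > t_{\alpha/2,\; n-k}.
```

For α = 0.05 and a large number of observations, the critical value approaches 1.96, which is why 2 serves as a convenient rule of thumb.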

Exercise
- Exercise 1: Open the data “Statistical Significance Exercise”. Use the Product A data to estimate the effect of promotion on revenue by estimating the following model. Pay particular attention to the statistical significance of the slope coefficient.
(Revenue) = β0 + β1 (Number of promotion)
- Exercise 2: Use the data “Statistical Significance Exercise”. Use the Product C data to estimate the same model.

Exercise 1 Answer
The estimated effect of promotion on revenue is the slope coefficient reported in the table, together with its t-statistic. Since the t-statistic is greater than 2, we conclude that the effect of promotion on revenue is statistically significant. Given the statistical significance of the coefficient, the estimated slope coefficient indicates the amount (in yen) by which revenue is likely to increase if we increase the number of promotions by one.
[Table: Product A regression output with columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%; rows Intercept and Number of promotions]

Exercise 2 Answer
The estimated effect of promotion on revenue is the slope coefficient reported in the table, together with its t-statistic. Since the absolute value of the t-statistic is smaller than 2, we conclude that the slope coefficient is not statistically significant. In other words, we did not find evidence that promotion has any impact on the revenue from product C.
[Table: Product C regression output with columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%; rows Intercept and Number of promotions]

2. OLS with multiple explanatory variables: Introduction
- So far, we have considered a model with only one explanatory variable.
Y = β0 + β1 X
- Often, we have more than one explanatory variable. For example, in addition to promotion, the company may increase the number of salespersons. If we have data on the number of salespersons, we can incorporate that variable as well.

OLS with multiple regressors -Example: Returns on Education-
- Suppose you are considering pursuing more education (going to graduate school, etc.). You may then want to know whether it is worth the effort.

OLS with multiple regressors -Example: Returns on Education-
- To investigate by how much extra education increases your future salary, we can use OLS regression.
- Open the data “Returns on education”. The data were collected for 935 persons and contain three variables: weekly wage in dollars, number of years of education, and number of years of work experience.
- As an exercise, find the mean, variance and standard deviation for the three variables.
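The summary statistics in the exercise can be computed in Excel; an equivalent minimal sketch using Python's standard `statistics` module follows. The wage values are hypothetical, and the sample (n − 1) versions of variance and standard deviation are assumed.

```python
import statistics

# Hypothetical weekly wages in dollars (not the course data set).
weekly_wage = [400, 500, 600, 700, 800]

wage_mean = statistics.mean(weekly_wage)
# statistics.variance and statistics.stdev use the sample (n - 1) formulas.
wage_variance = statistics.variance(weekly_wage)
wage_sd = statistics.stdev(weekly_wage)

print(wage_mean, wage_variance, wage_sd)
```

The same three calls would be repeated for the education and experience columns.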

OLS with multiple regressors -Example: Returns on Education-
- To investigate the effect of education on wage, we may estimate the OLS regression:
(wage) = β0 + β1 (education)
- However, wage is affected not only by education but also by the number of years of work experience. Therefore, it seems better to incorporate “work experience” in the model.
- The simplest way to incorporate experience in the model is the following:
(wage) = β0 + β1 (education) + β2 (experience)
- Notice that this OLS equation has two explanatory variables on the right-hand side.

OLS with multiple regressors -Example: Returns on Education-
- Excel estimates the coefficients β0, β1 and β2 automatically.
(wage) = β0 + β1 (education) + β2 (experience)
- The estimated β1 is the effect of education on wage, holding experience constant. This is the big advantage of OLS with multiple explanatory variables. In the data, education and experience vary at the same time, so it is difficult to see the effect of education separately from the effect of experience just by looking at the data. By incorporating both variables, we can separate the effect of experience from the effect of education.
- Exercise: Estimate the model above using Excel.
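Excel handles the estimation internally. As an illustration of the idea, the sketch below estimates a model with two explanatory variables by solving the normal equations (X'X)b = X'y in plain Python. The helper names and the data are hypothetical; the data are noise-free and generated from known coefficients so that the estimates can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def multiple_ols(columns, y):
    """Estimate y = b0 + b1*x1 + b2*x2 + ... via the normal equations."""
    # Each row starts with 1.0 for the intercept.
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical noise-free data: wage = 100 + 50*education + 10*experience.
education = [12, 16, 12, 18, 14, 16]
experience = [10, 5, 2, 8, 12, 1]
wage = [100 + 50 * e + 10 * x for e, x in zip(education, experience)]

b0, b1, b2 = multiple_ols([education, experience], wage)
print(b0, b1, b2)
```

Because education and experience both vary across the rows, the two effects can be separated, which is the point made in the slide.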

OLS with multiple regressors -Example: Returns on Education-
Estimated β0 = -272.5, β1 = 76.2 and β2 = 17.6.
Also notice that the t-statistic for β1 is 12.1, which is bigger than 2. Therefore, the estimated β1 is statistically significant, and education does have an impact on wage.
Given the statistical significance of β1, we can say that, holding experience constant, one more year of education would increase the weekly wage by $76.2. This also means that if you go to graduate school for 2 years, your annual salary would increase by $76.2 × (52 weeks) × (2 years) = $7,924.8.
[Table: Excel regression output with columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%; rows Intercept, education (in years), and work experience (in years)]

Exercise 2
- Open the data “Returns on education 2”.
- This is the same data set as “Returns on education 1”, except that it has more variables: it also contains the age and the IQ test score of each person.
- Exercise: Add IQ to the model. Does this change the results?

OLS with multiple variables: Application -Making a model more flexible-
- When you specify a model for OLS estimation, the first criterion is simplicity.
(Revenue) = β0 + β1 (Promotion)
- Such a simple equation gives a clear idea of the effect of promotion on revenue.
- However, simplicity comes at a cost: a simple model is often not flexible.

OLS with multiple variables: Application -Making a model more flexible-
- The model implicitly assumes that the effect on revenue of increasing the number of promotions by one is constant. That is, the model assumes that the effect of increasing the number of promotions from 10 to 11 is the same as the effect of increasing it from 40 to 41.
- However, it is reasonable to think that the effect of promotion would diminish, following the law of diminishing marginal returns.
- See the next example.

-Making a model more flexible: An example-
- Open the data set “Making a model more flexible”. The data show the relationship between the number of promotions and revenue for product D.
- Plot the relationship between the number of promotions and revenue, then describe the relationship.

-Making a model more flexible: An example-
The relationship seems to be a curve, not a straight line. The effectiveness of promotion seems to diminish as the number of promotions increases.
How do we incorporate the “diminishing effectiveness” of promotion in the model?

-Making a model more flexible: An example-
- To incorporate the “diminishing effectiveness” in the model, we need to specify a model that can “curve”.
- A simple way to achieve this is to estimate the following model:
(Revenue) = β0 + β1 (Number of promotion) + β2 (Number of promotion)^2
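To see how the squared term lets the model curve, the sketch below compares the predicted gain from one more promotion at a low and a high number of promotions. The coefficient values and function names are hypothetical, chosen only so that β2 is negative; they are not the estimates from the course data.

```python
def predicted_revenue(promotions, b0, b1, b2):
    """Quadratic model: revenue = b0 + b1*promotions + b2*promotions^2."""
    return b0 + b1 * promotions + b2 * promotions ** 2

def gain_from_one_more(promotions, b0, b1, b2):
    """Predicted change in revenue when promotions rise by one."""
    return (predicted_revenue(promotions + 1, b0, b1, b2)
            - predicted_revenue(promotions, b0, b1, b2))

# Hypothetical coefficients with a negative squared term.
b0, b1, b2 = 100.0, 20.0, -0.2

gain_low = gain_from_one_more(10, b0, b1, b2)    # from 10 to 11 promotions
gain_high = gain_from_one_more(40, b0, b1, b2)   # from 40 to 41 promotions
print(gain_low, gain_high)
```

Algebraically the gain equals β1 + β2(2x + 1), so with a negative β2 it shrinks as the number of promotions x grows, unlike in the straight-line model where it is always β1.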

-Making a model more flexible: Exercise-
- Use the data “Making a model more flexible” and estimate the following model:
(Revenue) = β0 + β1 (Number of promotion) + β2 (Number of promotion)^2

Exercise: Answer
In the estimated equation, the coefficient on (Number of promotion)^2 is negative:
(Revenue) = β0 + β1 (Number of promotion) ‒ |β2| (Number of promotion)^2
Note that both β1 and β2 are statistically significant.
[Table: Excel regression output with columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%; rows Intercept, Number of promotions, and (Number of promotion)^2]

More exercises
- Exercise 1: Using the estimated equation, compute the “predicted” revenue for each observation.
- Exercise 2: Plot the predicted revenue against the number of promotions. Also plot the actual revenue against the number of promotions on the same graph. See how well the model predicts the outcome.

More exercises
- Exercise 3: Using the estimated results, compute the expected increases in revenue when you increase the number of promotions from 10 to 11, and 25 to

OLS with multiple variables: Application 2 -Dummy Variables-
- Often, our data contain qualitative variables. For example, if you have data about your clients, you may know for each client whether the person is male or female. Such a variable (gender) is not quantitative but qualitative.

OLS with multiple variables: Application 2 -Dummy Variables-
- However, such a qualitative variable is also important in analyzing data. For example, you may want to answer the following question: “Which gender consumes more?”

- To incorporate such a qualitative variable into the OLS equation, we first convert the qualitative information into a quantitative variable called a “dummy variable”.
- A dummy variable is a variable that takes the value 1 if a particular criterion is satisfied, and 0 otherwise.
- If you would like to incorporate gender information in your model, create the following dummy variable:
Female = 1 if the client is female
       = 0 if the client is male
Then you can estimate
(Consumer spending) = β0 + β1 (Number of promotion) + β2 (Female)
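Creating the dummy variable is a simple recoding step. A minimal sketch in Python follows; the function name `make_dummy` and the client records are hypothetical.

```python
def make_dummy(values, category):
    """Return 1 where the value equals the given category, and 0 otherwise."""
    return [1 if value == category else 0 for value in values]

# Hypothetical client records.
gender = ["female", "male", "female", "male", "male"]
female = make_dummy(gender, "female")
print(female)
```

The resulting 0/1 column can then be used as an ordinary explanatory variable alongside the number of promotions.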

OLS with multiple variables: Application 2 -Dummy Variables-
- A dummy variable is very versatile. Suppose you would like to know whether there are wage differentials between races (for example, between white and black workers). You can then use a dummy variable that takes the value 1 if the person is black, and 0 otherwise.
- A dummy variable can be created for many other purposes. The use of dummy variables is one of the most important techniques in regression analysis.

Dummy variable exercise
- Open the data “Dummy variable Exercise”. This data set contains four dummy variables.
Black = 1 if the person is black, = 0 otherwise
Married = 1 if the person is married, = 0 otherwise
South = 1 if the person lives in the South of the USA, = 0 otherwise
Urban = 1 if the person lives in an urban area, = 0 otherwise

Dummy variable exercise
- Exercise 1: Estimate the following model:
(Wage) = β0 + β1 (Education) + β2 (Experience) + β3 (Age) + β4 (IQ) + β5 (Black)
Then interpret the results.

Dummy variable exercise: Answer
The coefficient on the dummy variable for a black person is negative. The t-statistic is -3.19; its absolute value is greater than 2. Therefore, the coefficient is statistically significant.
The results indicate that, holding education, experience, age, and IQ constant, the weekly wage is lower for a black person by the dollar amount of the coefficient. There seems to be a large wage gap between white and black workers.
[Table: Excel regression output with columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%; rows Intercept, education, experience, age, IQ, and black]

Dummy variable: More exercises
- Use the data “Dummy Variable Exercise”. Specify your own model, estimate it, and interpret the results.