11 - 1 © 1997 Prentice-Hall, Inc. Learning Objectives: Describe the linear regression model; State the regression modeling steps; Explain least squares.


1 © 1997 Prentice-Hall, Inc. Learning Objectives
- Describe the linear regression model
- State the regression modeling steps
- Explain least squares
- Compute regression coefficients
- Describe residual analysis
- Predict the response variable
- Understand correlational analysis

2 © 1997 Prentice-Hall, Inc. Probabilistic Models
- Hypothesize 2 components
  - Deterministic
  - Random error
- Example: Sales volume is 10 times advertising spending plus random error
  - \(Y = 10X + \varepsilon\)
  - Random error may be due to factors other than advertising

3 © 1997 Prentice-Hall, Inc. Types of Probabilistic Models

4 © 1997 Prentice-Hall, Inc. Regression Models
- Answer ‘What is the relationship between the variables?’
- Equation used
  - 1 numerical dependent (response) variable
    - What is to be predicted
  - 1 or more numerical or categorical independent (explanatory) variables
- Used mainly for prediction

5 © 1997 Prentice-Hall, Inc. Regression Modeling Steps
- Define problem or question
- Specify model
- Collect data
- Do descriptive data analysis
- Estimate unknown parameters
- Evaluate model
- Use model for prediction

6 © 1997 Prentice-Hall, Inc. Problem Definition
- Most critical step
  - Don’t want right answer to wrong question
- What are the model objectives?
- Who will use the model?
- What will be the benefits?
- Are resources available (data etc.)?
- How will the results be implemented?

7 © 1997 Prentice-Hall, Inc. Specifying the Model
- Define variables
  - Conceptual (e.g., advertising, price)
  - Empirical (e.g., list price, regular price)
  - Measurement (e.g., $, units)
- Hypothesize nature of relationship
  - Expected effects (i.e., coefficients’ signs)
  - Functional form (linear or non-linear)
  - Interactions

8 © 1997 Prentice-Hall, Inc. Model Specification Is Based on Theory
- Economic & business theory
- Mathematical theory
- Previous research
- ‘Common sense’

9 © 1997 Prentice-Hall, Inc. Types of Regression Models

10 © 1997 Prentice-Hall, Inc. Linear Equations High School Teacher © T/Maker Co.

11 © 1997 Prentice-Hall, Inc. Linear Regression Model
- Relationship between variables is a linear function: \(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\)
  - \(Y_i\): dependent (response) variable
  - \(X_i\): independent (explanatory) variable
  - \(\beta_1\): population slope
  - \(\beta_0\): population Y-intercept
  - \(\varepsilon_i\): random error

12 © 1997 Prentice-Hall, Inc. Sample Linear Regression Model (figure labels: unsampled observation; observed value; \(e_i\) = random error)

13 © 1997 Prentice-Hall, Inc. Scatter Diagram
- Plot of all \((X_i, Y_i)\) pairs
- Suggests how well model will fit

14 © 1997 Prentice-Hall, Inc. Thinking Challenge
How would you draw a line through the points? How do you determine which line ‘fits best’? (Alone, Group, Class)

15 © 1997 Prentice-Hall, Inc. Least Squares
- ‘Best fit’ means the difference between actual Y values & predicted Y values is a minimum
  - But positive differences off-set negative
- LS minimizes the sum of the squared differences (or errors)
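
As a brief sketch of the criterion on this slide: writing \(\hat{Y}_i\) for the predicted value and \(b_0\), \(b_1\) for the sample intercept and slope (notation assumed here to match the later slides), least squares picks the line that minimizes the sum of squared differences.

```latex
\min_{b_0,\, b_1} \; \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2
  \;=\; \min_{b_0,\, b_1} \; \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_i \right)^2
```

Squaring each difference is what keeps positive and negative differences from cancelling each other out.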

16 © 1997 Prentice-Hall, Inc. Least Squares Graphically

17 © 1997 Prentice-Hall, Inc. Coefficient Equations (equation labels: sample regression equation; sample slope; sample Y-intercept. Callouts: number of \((X_i, Y_i)\) pairs; average the \(X_i\)’s, then square)
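
A standard computational form that is consistent with the labels and callouts on this slide (the number of pairs \(n\), and the squared average \(\bar{X}^2\)) is sketched below. It is offered as an assumption about the notation, not as a reproduction of the slide's own equations.

```latex
\hat{Y}_i = b_0 + b_1 X_i
\qquad
b_1 = \frac{\sum_{i=1}^{n} X_i Y_i - n \bar{X} \bar{Y}}
           {\sum_{i=1}^{n} X_i^2 - n \bar{X}^2}
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```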

18 © 1997 Prentice-Hall, Inc. Computation Table

19 © 1997 Prentice-Hall, Inc. Interpretation of Coefficients
- Slope (\(b_1\))
  - Estimated Y changes by \(b_1\) for each 1 unit increase in X
    - Example: If \(b_1 = 2\), then Sales (Y) is expected to increase by 2 for each 1 unit increase in Advertising (X)
- Y-Intercept (\(b_0\))
  - Average value of Y when X = 0
    - Example: If \(b_0 = 4\), then average Sales (Y) is expected to be 4 when Advertising (X) is 0

20 © 1997 Prentice-Hall, Inc. Parameter Estimation Example
You’re a marketing analyst for Hasbro Toys. You gather the following data: Ad $, Sales (Units). What is the relationship between sales & advertising?

21 © 1997 Prentice-Hall, Inc. Scatter Diagram: Sales vs. Advertising (axis labels: Sales; Advertising)

22 © 1997 Prentice-Hall, Inc. Parameter Estimation Solution Table

23 © 1997 Prentice-Hall, Inc. Coefficient Interpretation Solution
- Slope (\(b_1\))
  - Sales Volume (Y) is expected to increase by .7 units for each $1 increase in Advertising (X)
- Y-Intercept (\(b_0\))
  - Average value of Sales Volume (Y) is -.10 units when Advertising (X) is 0
    - Difficult to explain to Marketing Manager
    - Expect some sales without advertising
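
The slope and intercept quoted above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `fit_line` and the five data points are hypothetical (the data table is not part of this transcript), and the points were chosen only because they yield the same estimates of .7 and -.10.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line Y-hat = b0 + b1 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Computational formulas for the sample slope and Y-intercept.
    b1 = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data, NOT taken from the slides: advertising ($) and sales (units).
ad = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]

b0, b1 = fit_line(ad, sales)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")  # prints "b0 = -0.10, b1 = 0.70"
```

With these illustrative numbers, each extra dollar of advertising is associated with .7 more units sold, and the fitted line crosses the Y axis slightly below zero.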

24 © 1997 Prentice-Hall, Inc. Parameter Estimation Excel Output (callouts: \(b_0\); \(b_P\); \(b_1\))

25 © 1997 Prentice-Hall, Inc. Evaluating the Model
- How well does the model describe the relationship between the variables?
- Closeness of ‘best fit’
  - Closer the points to the line the better
- Assumptions met
- Significance of parameter estimates

26 © 1997 Prentice-Hall, Inc. Evaluating Model Steps
- Examine variation measures
- Do residual analysis
- Test coefficients for significance

27 © 1997 Prentice-Hall, Inc. Random Error Variation
- Variation of actual Y from predicted Y
- Measured by standard error of estimate
  - Sample standard deviation of e
  - Denoted \(S_{YX}\)
- Affects several factors
  - Parameter significance
  - Prediction accuracy

28 © 1997 Prentice-Hall, Inc. Standard Error of Estimate
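
For reference, the usual definition of the standard error of estimate in simple linear regression is sketched below. The symbols \(\hat{Y}_i\) (predicted value) and \(n\) (number of pairs) are assumed notation; the \(n - 2\) reflects the two estimated coefficients \(b_0\) and \(b_1\).

```latex
S_{YX} = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}{n - 2}}
```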

29 © 1997 Prentice-Hall, Inc. Measures of Variation in Regression
- Total sum of squares (SST)
  - Measures variation of observed \(Y_i\) around the mean \(\bar{Y}\)
- Explained variation (SSR)
  - Variation due to relationship between X & Y
- Unexplained variation (SSE)
  - Variation due to other factors

30 © 1997 Prentice-Hall, Inc. Variation Measures
- Total sum of squares: \((Y_i - \bar{Y})^2\)
- Unexplained sum of squares: \((Y_i - \hat{Y}_i)^2\)
- Explained sum of squares: \((\hat{Y}_i - \bar{Y})^2\)
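
The three variation measures can be computed directly from a fitted line, as in the sketch below. The function name `variation_measures` and the data are hypothetical and used only for illustration; the ratio SSR / SST at the end is the standard definition of the coefficient of determination.

```python
def variation_measures(xs, ys):
    """Fit a least squares line and return (SST, SSR, SSE)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                # total
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained
    return sst, ssr, sse


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

sst, ssr, sse = variation_measures(xs, ys)
r_squared = ssr / sst  # proportion of variation explained
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}, r^2 = {r_squared:.4f}")
```

For a least squares fit with an intercept, SST equals SSR plus SSE, which is a quick sanity check on the arithmetic.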

31 © 1997 Prentice-Hall, Inc. Coefficient of Determination
- Proportion of variation ‘explained’ by relationship between X & Y
- \(0 \le r^2 \le 1\)

32 © 1997 Prentice-Hall, Inc. \(r^2\) Examples (panels: \(r^2 = 1\); \(r^2 = .8\); \(r^2 = 0\))

33 © 1997 Prentice-Hall, Inc. Adjusted Coefficient of Determination
- Proportion of variation ‘explained’ by relationship between X & Y
- Reflects
  - Sample size
  - Number of independent variables
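
One common way to write the adjustment is sketched below, where \(n\) is the sample size and \(P\) is the number of independent variables. The formula is the standard textbook form and is given here as an assumption about what the slide displayed.

```latex
r^2_{adj} = 1 - \left( 1 - r^2 \right) \frac{n - 1}{n - P - 1}
```

Adding an independent variable raises \(P\), so \(r^2_{adj}\) increases only if the gain in \(r^2\) outweighs the lost degree of freedom.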

34 © 1997 Prentice-Hall, Inc. Coef. of Determination Excel Output (callouts: \(r^2\); \(r^2\) adjusted for number of explanatory variables & sample size; \(S_{YX}\))

35 © 1997 Prentice-Hall, Inc. Residual Analysis
- Graphical analysis of residuals
  - Plot residuals vs. \(X_i\) values
    - Residuals are also called errors
- Difference between actual \(Y_i\) & predicted \(Y_i\)
- Purposes
  - Examine functional form (linear vs. non-linear model)
  - Evaluate violations of assumptions

36 © 1997 Prentice-Hall, Inc. Linear Regression Assumptions
- Normality
  - Y values are normally distributed for each X
  - Probability distribution of error is normal
- Homoscedasticity (constant variance)
- Independence of errors
- Linearity

37 © 1997 Prentice-Hall, Inc. Residual Plot for Functional Form (panels: Add \(X^2\) Term; Correct Specification)

38 © 1997 Prentice-Hall, Inc. Residual Plot for Homoscedasticity (panels: Heteroscedasticity; Correct Specification. Notes: Fan-shaped. Standardized residuals used typically.)

39 © 1997 Prentice-Hall, Inc. Residual Plot for Independence (panels: Not Independent; Correct Specification. Note: Plots reflect sequence data were collected.)

40 © 1997 Prentice-Hall, Inc. Residual Analysis Excel Output

41 © 1997 Prentice-Hall, Inc. Residual Plot Excel Output

42 © 1997 Prentice-Hall, Inc. Test of Slope Coefficient
- Tests if there is a linear relationship between X & Y
  - Involves population slope \(\beta_1\)
- Hypotheses
  - \(H_0: \beta_1 = 0\) (No linear relationship)
  - \(H_1: \beta_1 \ne 0\) (Linear relationship)
- Theoretical basis is sampling distribution of slopes

43 © 1997 Prentice-Hall, Inc. Test of Slope Parameter Solution
- \(H_0: \beta_1 = 0\)
- \(H_1: \beta_1 \ne 0\)
- \(\alpha = .05\)
- df = 3
- Critical Value(s):
- Test Statistic:
- Decision: Reject at \(\alpha = .05\)
- Conclusion: There is evidence of a relationship

44 © 1997 Prentice-Hall, Inc. Test Statistic Solution
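
The sketch below shows one way the slope test statistic can be computed: \(t = b_1 / S_{b_1}\), where \(S_{b_1} = S_{YX} / \sqrt{\sum (X_i - \bar{X})^2}\). The function name, the data, and the hard-coded critical value (the two-tailed t table entry for \(\alpha = .05\) with 3 degrees of freedom) are illustrative assumptions, not values recovered from the slide.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (t, df) for testing H0: beta_1 = 0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(sse / (n - 2))   # standard error of estimate
    s_b1 = s_yx / math.sqrt(sxx)      # standard error of the slope
    return b1 / s_b1, n - 2


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

t, df = slope_t_statistic(xs, ys)
T_CRITICAL = 3.182  # t table value for alpha = .05 (two-tailed), df = 3
reject = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, reject H0: {reject}")
```

With these numbers the statistic exceeds the critical value, so the null hypothesis of no linear relationship would be rejected at the .05 level.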

45 © 1997 Prentice-Hall, Inc. Test of Slope Parameter Excel Output (callouts: \(t = b_P / S_{b_P}\); \(S_{b_P}\); \(b_P\); P-Value)

46 © 1997 Prentice-Hall, Inc. Prediction With Regression Models
- Types of predictions
  - Point estimates
  - Interval estimates
- What is predicted
  - Population mean response (\(\mu_{YX}\)) for given X
    - Point on population regression line
  - Individual response (\(Y_i\)) for given X

47 © 1997 Prentice-Hall, Inc. What Is Predicted

48 © 1997 Prentice-Hall, Inc. Factors Affecting Interval Width
- Level of confidence (\(1 - \alpha\))
  - Width increases as confidence increases
- Data dispersion (\(S_{YX}\))
  - Width increases as variation increases
- Sample size
  - Width decreases as sample size increases
- Distance of given X from mean \(\bar{X}\)
  - Width increases as distance increases
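
Each factor listed above appears as a term in the standard interval formulas for simple linear regression, sketched here under assumed notation (\(t_{n-2}\) is the critical t value for the chosen confidence level, \(X\) is the given value of the independent variable). The first interval is for the population mean response, the second for an individual response.

```latex
\hat{Y} \pm t_{n-2} \, S_{YX}
  \sqrt{\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
\qquad
\hat{Y} \pm t_{n-2} \, S_{YX}
  \sqrt{1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
```

A higher confidence level enlarges \(t_{n-2}\), more dispersion enlarges \(S_{YX}\), a larger sample shrinks \(1/n\), and a given X farther from \(\bar{X}\) enlarges the last term. The extra 1 under the second root is why the interval for an individual response is always wider.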

49 © 1997 Prentice-Hall, Inc. Regression Cautions
- Violated assumptions
- Relevancy of historical data
- Level of significance
- Extrapolation
- Cause & effect

50 © 1997 Prentice-Hall, Inc. Extrapolation

51 © 1997 Prentice-Hall, Inc. Cause & Effect (plot labels: Liquor Consumption; # Teachers)

52 © 1997 Prentice-Hall, Inc. Types of Probabilistic Models

53 © 1997 Prentice-Hall, Inc. Correlation Models
- Answer ‘How strong is the linear relationship between 2 variables?’
- Coefficient of correlation used
  - Population correlation coefficient denoted \(\rho\) (rho)
  - Values range from -1 to +1
  - Measures degree of association
- Used mainly for understanding

54 © 1997 Prentice-Hall, Inc. Sample Coefficient of Correlation
- Pearson Product-Moment Coefficient of Correlation:
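
A minimal sketch of the Pearson product-moment coefficient, using its standard definition \(r = S_{XY} / \sqrt{S_{XX} S_{YY}}\) in terms of sums of cross-products and squares of deviations from the means. The function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the sample Pearson product-moment correlation coefficient."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

r = pearson_r(xs, ys)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

In simple linear regression, squaring r gives the coefficient of determination, and r has the same sign as the sample slope.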

55 © 1997 Prentice-Hall, Inc. Correlation & Regression Line (panels: r = 1; r = -1; r = .89; r = 0)

56 © 1997 Prentice-Hall, Inc. Test of Correlation Coefficient
- Shows if there is a linear relationship between 2 numerical variables
  - Same conclusion as testing population slope \(\beta_1\)
- Hypotheses
  - \(H_0: \rho = 0\) (No correlation)
  - \(H_1: \rho \ne 0\) (Correlation)

57 © 1997 Prentice-Hall, Inc. Conclusion
- Described the linear regression model
- Stated the regression modeling steps
- Explained least squares
- Computed regression coefficients
- Described residual analysis
- Predicted the response variable

58 © 1997 Prentice-Hall, Inc. Learning Objectives
- Explain the linear multiple regression model
- Interpret linear multiple regression computer output
- Explain multicollinearity

59 © 1997 Prentice-Hall, Inc. Multiple Regression Models

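60 © 1997 Prentice-Hall, Inc. Linear Multiple Regression Model
- Relationship between 1 dependent & 2 or more independent variables is a linear function: \(Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_P X_{Pi} + \varepsilon_i\)
  - \(Y_i\): dependent (response) variable
  - \(X_{1i}, \dots, X_{Pi}\): independent (explanatory) variables
  - \(\beta_1, \dots, \beta_P\): population slopes
  - \(\beta_0\): population Y-intercept
  - \(\varepsilon_i\): random error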

61 © 1997 Prentice-Hall, Inc. Population Multiple Regression Model Bivariate model

62 © 1997 Prentice-Hall, Inc. Sample Multiple Regression Model Bivariate model

63 © 1997 Prentice-Hall, Inc. Regression Modeling Steps
- Define problem or question
- Specify model
- Collect data
- Do descriptive data analysis
- Estimate unknown parameters
- Evaluate model
- Use model for prediction

64 © 1997 Prentice-Hall, Inc. Parameter Estimation Linear Multiple Regression Model

65 © 1997 Prentice-Hall, Inc. Multiple Linear Regression Equations Too complicated by hand! Ouch!
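
The hand computation is tedious, but a short program handles it. The sketch below solves the normal equations \((X'X)\,b = X'Y\) with plain Gaussian elimination. This is one standard approach and not necessarily what the slides' software does; the function names and the data (generated exactly from Y = 1 + 2 X1 + 3 X2) are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_multiple(rows, ys):
    """Return [b0, b1, ..., bP] by solving the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: each row is (X1, X2); Y is built with no random error.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coeffs = fit_multiple(rows, ys)
print([round(b, 6) for b in coeffs])
```

Because the illustrative Y values contain no random error, the fit should recover coefficients very close to 1, 2, and 3.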

66 © 1997 Prentice-Hall, Inc. Interpretation of Estimated Coefficients
- Slope (\(b_P\))
  - Estimated Y changes by \(b_P\) for each 1 unit increase in \(X_P\) holding all other variables constant
    - Example: If \(b_1 = 2\), then Sales (Y) is expected to increase by 2 for each 1 unit increase in Advertising (\(X_1\)) given the Number of Sales Rep’s (\(X_2\))
- Y-Intercept (\(b_0\))
  - Average value of Y when \(X_P = 0\)

67 © 1997 Prentice-Hall, Inc. Parameter Estimation Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) & newspaper circulation (000) on the number of ad responses (00). You’ve collected the following data: Resp, Size, Circ

68 © 1997 Prentice-Hall, Inc. Parameter Estimation Excel Output (callouts: \(b_P\); \(b_0\); \(b_1\); \(b_2\))

69 © 1997 Prentice-Hall, Inc. Interpretation of Coefficients Solution
- Slope (\(b_1\))
  - # Responses to Ad is expected to increase by .2049 (20.49) for each 1 sq. in. increase in Ad Size holding Circulation constant
- Slope (\(b_2\))
  - # Responses to Ad is expected to increase by .2805 (28.05) for each 1 unit (1,000) increase in Circulation holding Ad Size constant

70 © 1997 Prentice-Hall, Inc. Evaluating the Model

71 © 1997 Prentice-Hall, Inc. Regression Modeling Steps
- Define problem or question
- Specify model
- Collect data
- Do descriptive data analysis
- Estimate unknown parameters
- Evaluate model
- Use model for prediction

72 © 1997 Prentice-Hall, Inc. Evaluating Multiple Regression Model Steps
- Examine variation measures
- Do residual analysis
- Test parameter significance
  - Overall model
  - Portions of model
  - Individual coefficients
- Test for multicollinearity

73 © 1997 Prentice-Hall, Inc. Coef. of Determination Excel Output (callouts: \(S_{YX}\); \(r^2_{Y.12}\); \(r^2_{adj}\) means 95.61% of variation in Y is due to Ad Size & Circulation)

74 © 1997 Prentice-Hall, Inc. Coefficient of Partial Determination
- Proportion of variation in Y ‘explained’ by variable \(X_P\) holding all others constant
- Must estimate separate models
- Denoted \(r^2_{Y1.2}\) in two X variables case
  - Coefficient of partial determination of \(X_1\) with Y holding \(X_2\) constant
- Useful in selecting X variables

75 © 1997 Prentice-Hall, Inc. \(r^2_{Y1.2}\) Excel Output (ANOVA table columns: df, SS; rows: Regression, Residual, Total)

76 © 1997 Prentice-Hall, Inc. Testing Parameters

77 © 1997 Prentice-Hall, Inc. Evaluating Multiple Regression Model Steps
- Examine variation measures
- Do residual analysis
- Test parameter significance
  - Overall model
  - Portions of model
  - Individual coefficients
- Test for multicollinearity
New! Expanded!

78 © 1997 Prentice-Hall, Inc. Testing Overall Significance
- Shows if there is a linear relationship between all X variables together & Y
- Uses F test statistic
- Hypotheses
  - \(H_0: \beta_1 = \beta_2 = \dots = \beta_P = 0\)
    - No linear relationship
  - \(H_1\): At least one coefficient is not 0
    - At least one X variable affects Y

79 © 1997 Prentice-Hall, Inc. Overall Significance Excel Output (callouts: n - 1; P-value; P; n - P - 1; MSR / MSE)
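
The callouts on this slide correspond to the pieces of the overall F statistic. A sketch in the usual notation, where \(P\) is the number of X variables and \(n\) is the sample size (SSR and SSE as defined for the variation measures earlier):

```latex
F = \frac{MSR}{MSE}
  = \frac{SSR / P}{SSE / (n - P - 1)}
```

The statistic has \(P\) numerator and \(n - P - 1\) denominator degrees of freedom, which together make up the total of \(n - 1\).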

80 © 1997 Prentice-Hall, Inc. Testing Model Portions
- Examines the contribution of a set of X variables to the relationship with Y
- Null hypothesis:
  - Variables in set do not significantly improve the model when all other variables are included
- Must estimate separate models
- Used in selecting X variables

81 © 1997 Prentice-Hall, Inc. Testing Model Portions Test Statistic (labels: From ANOVA section of regression for; Test \(H_0: \beta_1 = 0\) in a 2 variable model)

82 © 1997 Prentice-Hall, Inc. Multicollinearity

83 © 1997 Prentice-Hall, Inc. Evaluating Multiple Regression Model Steps
- Examine variation measures
- Do residual analysis
- Test parameter significance
  - Overall model
  - Portions of model
  - Individual coefficients
- Test for multicollinearity
New! Expanded!

84 © 1997 Prentice-Hall, Inc. Multicollinearity
- High correlation between X variables
- Coefficients measure combined effect
- Leads to unstable coefficients depending on X variables in model
- Always exists; matter of degree
- Example: Using both Sales & Profit as explanatory variables in same model

85 © 1997 Prentice-Hall, Inc. Detecting Multicollinearity
- Examine correlation matrix
  - Correlations between pairs of X variables are more than with Y variable
- Examine variance inflation factor (VIF)
  - If \(VIF_j > 5\), multicollinearity exists
- Few remedies
  - Obtain new sample data
  - Eliminate one correlated X variable
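
The VIF check can be sketched for the two-variable case. The standard definition is \(VIF_j = 1 / (1 - R_j^2)\), where \(R_j^2\) comes from regressing \(X_j\) on the other X variables; with only two X variables, \(R_j^2\) is simply the squared correlation between them. The function names and both data sets below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the sample correlation coefficient between two lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_variables(x1, x2):
    """VIF for either X in a two-variable model: 1 / (1 - r12^2).

    Raises ZeroDivisionError if x1 and x2 are perfectly correlated.
    """
    r12 = pearson_r(x1, x2)
    return 1.0 / (1.0 - r12 ** 2)


# Hypothetical data: uncorrelated X variables give a VIF of 1.
vif_low = vif_two_variables([1, 2, 3, 4], [1, -1, -1, 1])

# Hypothetical data: nearly collinear X variables give a very large VIF.
vif_high = vif_two_variables([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1])

print(f"VIF (uncorrelated) = {vif_low:.2f}")
print(f"VIF (nearly collinear) = {vif_high:.2f}, exceeds 5: {vif_high > 5}")
```

With more than two X variables, each \(R_j^2\) would instead come from a separate multiple regression of \(X_j\) on the remaining X variables.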

86 © 1997 Prentice-Hall, Inc. Correlation Matrix Excel Output (callouts: \(r_{Y2}\); \(r_{12}\); \(r_{Y1}\))

87 © 1997 Prentice-Hall, Inc. VIF Excel Output (Regress \(X_1\) on \(X_2\))

88 © 1997 Prentice-Hall, Inc. This Class...
Please take a moment to answer the following questions in writing:
- What was the most important thing you learned in class today?
- What do you still have questions about?
- How can today’s class be improved?

