Published by Kasey Oldford. Modified about 1 year ago.

1
© 1997 Prentice-Hall, Inc.
Learning Objectives
- Describe the linear regression model
- State the regression modeling steps
- Explain least squares
- Compute regression coefficients
- Describe residual analysis
- Predict the response variable
- Understand correlational analysis

2
Probabilistic Models
- Hypothesize 2 components
  - Deterministic
  - Random error
- Example: Sales volume is 10 times advertising spending plus random error
  - Y = 10X + ε
  - Random error may be due to factors other than advertising

3
Types of Probabilistic Models

4
Regression Models
- Answer ‘What is the relationship between the variables?’
- Equation used
  - 1 numerical dependent (response) variable
    - What is to be predicted
  - 1 or more numerical or categorical independent (explanatory) variables
- Used mainly for prediction

5
Regression Modeling Steps
- Define problem or question
- Specify model
- Collect data
- Do descriptive data analysis
- Estimate unknown parameters
- Evaluate model
- Use model for prediction

6
Problem Definition
- Most critical step
  - Don’t want right answer to wrong question
- What are the model objectives?
- Who will use the model?
- What will be the benefits?
- Are resources available (data etc.)?
- How will the results be implemented?

7
Specifying the Model
- Define variables
  - Conceptual (e.g., advertising, price)
  - Empirical (e.g., list price, regular price)
  - Measurement (e.g., $, units)
- Hypothesize nature of relationship
  - Expected effects (i.e., coefficients’ signs)
  - Functional form (linear or non-linear)
  - Interactions

8
Model Specification Is Based on Theory
- Economic & business theory
- Mathematical theory
- Previous research
- ‘Common sense’

9
Types of Regression Models

10
Linear Equations
[Clip art: High School Teacher, © T/Maker Co.]

11
Linear Regression Model
- Relationship between variables is a linear function
  Y_i = β_0 + β_1 X_i + ε_i
  - Y_i: dependent (response) variable
  - X_i: independent (explanatory) variable
  - β_1: population slope
  - β_0: population Y-intercept
  - ε_i: random error

12
Sample Linear Regression Model
[Figure not preserved; labels: unsampled observation, observed value, e_i = random error]

13
Scatter Diagram
- Plot of all (X_i, Y_i) pairs
- Suggests how well model will fit

14
Thinking Challenge
How would you draw a line through the points? How do you determine which line ‘fits best’?
Alone / Group / Class

15
Least Squares
- ‘Best fit’ means the differences between actual Y values & predicted Y values are a minimum
  - But positive differences offset negative ones
- LS minimizes the sum of the squared differences (or errors)

16
Least Squares Graphically

17
Coefficient Equations
[Equations not preserved; labels: sample regression equation, sample slope, sample Y-intercept; callouts: # of (X_i, Y_i) pairs; average the X_i’s, then square]
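
The equations on this slide were images and did not survive extraction. As a hedged reconstruction, these are the standard least-squares formulas in the computational form that fits the surviving callouts (n is the number of pairs, and the mean of X is squared in the denominator). The exact layout the slide used is an assumption.

```latex
\hat{Y}_i = b_0 + b_1 X_i
\qquad
b_1 = \frac{\sum_{i=1}^{n} X_i Y_i - n\,\bar{X}\,\bar{Y}}
           {\sum_{i=1}^{n} X_i^{2} - n\,\bar{X}^{2}}
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```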

18
Computation Table

19
Interpretation of Coefficients
- Slope (b_1)
  - Estimated Y changes by b_1 for each 1 unit increase in X
    - Example: If b_1 = 2, then Sales (Y) is expected to increase by 2 for each 1 unit increase in Advertising (X)
- Y-Intercept (b_0)
  - Average value of Y when X = 0
    - Example: If b_0 = 4, then average Sales (Y) is expected to be 4 when Advertising (X) is 0

20
Parameter Estimation Example
You’re a marketing analyst for Hasbro Toys. You gather the following data:
[Table not preserved; columns: Ad $, Sales (Units)]
What is the relationship between sales & advertising?

21
Scatter Diagram: Sales vs. Advertising
[Figure not preserved; axes: Sales, Advertising]

22
Parameter Estimation Solution Table

23
Coefficient Interpretation Solution
- Slope (b_1)
  - Sales Volume (Y) is expected to increase by .7 units for each $1 increase in Advertising (X)
- Y-Intercept (b_0)
  - Average value of Sales Volume (Y) is -.10 units when Advertising (X) is 0
    - Difficult to explain to Marketing Manager
    - Expect some sales without advertising
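
A minimal sketch of how a slope of .7 and an intercept of -.10 can arise from least squares. The data table for this example was not preserved, so the five points below are hypothetical; they were chosen only because they reproduce the two coefficients quoted on the slide. The function name `least_squares` is also just illustrative.

```python
# Hypothetical data: the slide's own table was not preserved. These five
# points are an assumption, picked because they give b1 = 0.7 and b0 = -0.1.
ad_dollars = [1, 2, 3, 4, 5]
sales_units = [1, 1, 2, 2, 4]


def least_squares(x, y):
    """Return (b0, b1) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Computational form: b1 = (sum XY - n*Xbar*Ybar) / (sum X^2 - n*Xbar^2)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    b1 = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(ad_dollars, sales_units)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

The negative intercept illustrates the slide's caution: the fitted line is only an approximation near X = 0, where no data were collected.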

24
Parameter Estimation Excel Output
[Screenshot not preserved; callouts: b_0, b_P, b_1]

25
Evaluating the Model
- How well does the model describe the relationship between the variables?
- Closeness of ‘best fit’
  - The closer the points to the line, the better
- Assumptions met
- Significance of parameter estimates

26
Evaluating Model Steps
- Examine variation measures
- Do residual analysis
- Test coefficients for significance

27
Random Error Variation
- Variation of actual Y from predicted Y
- Measured by standard error of estimate
  - Sample standard deviation of e
  - Denoted S_YX
- Affects several factors
  - Parameter significance
  - Prediction accuracy

28
Standard Error of Estimate
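
The formula on this slide was not preserved. For simple linear regression the standard error of estimate is usually written as below, with n - 2 degrees of freedom because two coefficients are estimated; whether the slide showed exactly this form is an assumption.

```latex
S_{YX} = \sqrt{\frac{SSE}{n-2}}
       = \sqrt{\frac{\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^{2}}{n-2}}
```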

29
Measures of Variation in Regression
- Total sum of squares (SST)
  - Measures variation of observed Y_i around the mean Ȳ
- Explained variation (SSR)
  - Variation due to relationship between X & Y
- Unexplained variation (SSE)
  - Variation due to other factors

30
Variation Measures
[Figure not preserved; it marked one observation Y_i against its fitted value Ŷ_i and the mean Ȳ]
- Total sum of squares: Σ(Y_i - Ȳ)^2
- Unexplained sum of squares: Σ(Y_i - Ŷ_i)^2
- Explained sum of squares: Σ(Ŷ_i - Ȳ)^2
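
A minimal sketch of the three sums of squares, assuming a hypothetical five-point data set (the deck's own data table was not preserved). It also derives r^2 = SSR / SST and the standard error of estimate S_YX = sqrt(SSE / (n - 2)), and shows that SST = SSR + SSE for a least-squares line.

```python
import math

# Hypothetical data (an assumption; the slide deck's table was not preserved).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares coefficients for the simple linear model.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation

r_squared = ssr / sst
s_yx = math.sqrt(sse / (n - 2))  # standard error of estimate

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"r^2 = {r_squared:.4f}, S_YX = {s_yx:.4f}")
```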

31
Coefficient of Determination
- Proportion of variation ‘explained’ by relationship between X & Y
  r^2 = explained variation / total variation = SSR / SST
- 0 ≤ r^2 ≤ 1

32
r^2 Examples
[Figure not preserved; panels: r^2 = 1, r^2 = .8, r^2 = 0]

33
Adjusted Coefficient of Determination
- Proportion of variation ‘explained’ by relationship between X & Y
- Reflects
  - Sample size
  - Number of independent variables
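
The slide's formula was not preserved. The usual definition, given here as a hedged reconstruction, penalizes r^2 for the number of explanatory variables P relative to the sample size n:

```latex
r^{2}_{adj} = 1 - \left(1 - r^{2}\right)\frac{n-1}{n-P-1}
```

With a fixed r^2, adding explanatory variables (larger P) lowers the adjusted value, while a larger sample (larger n) brings it closer to r^2.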

34
Coef. of Determination Excel Output
[Screenshot not preserved; callouts: r^2; r^2 adjusted for number of explanatory variables & sample size; S_YX]

35
Residual Analysis
- Graphical analysis of residuals
  - Plot residuals vs. X_i values
    - Residuals are also called errors
    - Difference between actual Y_i & predicted Y_i
- Purposes
  - Examine functional form (linear vs. non-linear model)
  - Evaluate violations of assumptions

36
Linear Regression Assumptions
- Normality
  - Y values are normally distributed for each X
  - Probability distribution of error is normal
- Homoscedasticity (constant variance)
- Independence of errors
- Linearity

37
Residual Plot for Functional Form
[Figure not preserved; panels: Add X^2 Term, Correct Specification]

38
Residual Plot for Homoscedasticity
[Figure not preserved; panels: Heteroscedasticity (fan-shaped), Correct Specification]
Standardized residuals are typically used.

39
Residual Plot for Independence
[Figure not preserved; panels: Not Independent, Correct Specification]
Plots reflect the sequence in which the data were collected.

40
Residual Analysis Excel Output

41
Residual Plot Excel Output

42
Test of Slope Coefficient
- Tests if there is a linear relationship between X & Y
  - Involves population slope β_1
- Hypotheses
  - H_0: β_1 = 0 (No linear relationship)
  - H_1: β_1 ≠ 0 (Linear relationship)
- Theoretical basis is sampling distribution of slopes

43
Test of Slope Parameter Solution
- H_0: β_1 = 0
- H_1: β_1 ≠ 0
- α = .05
- df = 3
- Critical Value(s): [not preserved]
- Test Statistic: [not preserved]
- Decision: Reject at α = .05
- Conclusion: There is evidence of a relationship
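
A hedged sketch of the slope test, t = b_1 / S_b1 with S_b1 = S_YX / sqrt(Σ(X_i - X̄)^2) and n - 2 degrees of freedom. The data are hypothetical (five points, so df = 3 as on the slide); the slide's own numbers for the critical value and test statistic were not preserved, and the constant `T_CRITICAL` below is simply the two-tailed t-table value for α = .05 and 3 degrees of freedom.

```python
import math

# Hypothetical data (an assumption; the slide's data table was not preserved).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Standard error of estimate, then standard error of the slope.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(sxx)

t_stat = b1 / s_b1

# Two-tailed critical value for alpha = .05 with n - 2 = 3 degrees of freedom,
# taken from a standard t table (hard-coded to keep the sketch dependency-free).
T_CRITICAL = 3.182
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```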

44
Test Statistic Solution

45
Test of Slope Parameter Excel Output
[Screenshot not preserved; callouts: b_P, S_bP, t = b_P / S_bP, P-value]

46
Prediction With Regression Models
- Types of predictions
  - Point estimates
  - Interval estimates
- What is predicted
  - Population mean response (μ_YX) for given X
    - Point on population regression line
  - Individual response (Y_i) for given X

47
What Is Predicted

48
Factors Affecting Interval Width
- Level of confidence (1 - α)
  - Width increases as confidence increases
- Data dispersion (S_YX)
  - Width increases as variation increases
- Sample size
  - Width decreases as sample size increases
- Distance of given X from mean X̄
  - Width increases as distance increases
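
Each of these four factors appears as a term in the standard interval formulas for simple linear regression, shown here as a hedged illustration (the deck's own formulas were not preserved, and the symbol X_0 for the given X value is an assumption):

```latex
% Confidence interval for the population mean response at X = X_0
\hat{Y} \pm t_{\alpha/2,\,n-2}\; S_{YX}
  \sqrt{\frac{1}{n} + \frac{(X_0-\bar{X})^{2}}{\sum_{i=1}^{n}(X_i-\bar{X})^{2}}}

% Prediction interval for an individual response at X = X_0
\hat{Y} \pm t_{\alpha/2,\,n-2}\; S_{YX}
  \sqrt{1 + \frac{1}{n} + \frac{(X_0-\bar{X})^{2}}{\sum_{i=1}^{n}(X_i-\bar{X})^{2}}}
```

The t multiplier carries the confidence level, S_YX the data dispersion, 1/n the sample size, and (X_0 - X̄)^2 the distance from the mean. The extra 1 under the square root makes the interval for an individual response wider than the interval for the mean response.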

49
Regression Cautions
- Violated assumptions
- Relevancy of historical data
- Level of significance
- Extrapolation
- Cause & effect

50
Extrapolation

51
Cause & Effect
[Figure not preserved; axes: Liquor Consumption, # Teachers]

52
Types of Probabilistic Models

53
Correlation Models
- Answer ‘How strong is the linear relationship between 2 variables?’
- Coefficient of correlation used
  - Population correlation coefficient denoted ρ (rho)
  - Values range from -1 to +1
  - Measures degree of association
- Used mainly for understanding

54
Sample Coefficient of Correlation
- Pearson Product-Moment Coefficient of Correlation: [equation not preserved]
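
A minimal sketch of the Pearson product-moment coefficient, r = Σ(X_i - X̄)(Y_i - Ȳ) / sqrt(Σ(X_i - X̄)^2 · Σ(Y_i - Ȳ)^2). The slide's own equation was an image and may have used a different but equivalent form. The function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data (an assumption; not taken from the slides).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

r = pearson_r(x, y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

In simple linear regression, the square of r equals the coefficient of determination, and r takes the sign of the slope b_1.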

55
Correlation & Regression Line
[Figure not preserved; panels: r = 1, r = -1, r = .89, r = 0]

56
Test of Correlation Coefficient
- Shows if there is a linear relationship between 2 numerical variables
  - Same conclusion as testing population slope β_1
- Hypotheses
  - H_0: ρ = 0 (No correlation)
  - H_1: ρ ≠ 0 (Correlation)
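
The deck does not show the test statistic for this test. The standard one, given here as a hedged addition, is a t statistic with n - 2 degrees of freedom:

```latex
t = \frac{r\,\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad df = n - 2
```

In simple linear regression this value is algebraically identical to t = b_1 / S_{b_1}, which is why the two tests reach the same conclusion.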

57
Conclusion
- Described the linear regression model
- Stated the regression modeling steps
- Explained least squares
- Computed regression coefficients
- Described residual analysis
- Predicted the response variable

58
Learning Objectives
- Explain the linear multiple regression model
- Interpret linear multiple regression computer output
- Explain multicollinearity

59
Multiple Regression Models

60
Linear Multiple Regression Model
- Relationship between 1 dependent & 2 or more independent variables is a linear function
  Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_P X_Pi + ε_i
  - Y_i: dependent (response) variable
  - X_1i, ..., X_Pi: independent (explanatory) variables
  - β_1, ..., β_P: population slopes
  - β_0: population Y-intercept
  - ε_i: random error

61
Population Multiple Regression Model
[Figure not preserved; label: bivariate model]

62
Sample Multiple Regression Model
[Figure not preserved; label: bivariate model]

63
Regression Modeling Steps
- Define problem or question
- Specify model
- Collect data
- Do descriptive data analysis
- Estimate unknown parameters
- Evaluate model
- Use model for prediction

64
Parameter Estimation: Linear Multiple Regression Model

65
Multiple Linear Regression Equations
Too complicated by hand! Ouch!
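
To show what the computer does in place of hand calculation, here is a minimal sketch that solves the normal equations (X'X) b = X'y with plain Gaussian elimination. This is one common textbook approach, not necessarily what Excel does internally. The helper names `solve_linear_system` and `multiple_regression` and the data are hypothetical; Y is constructed without error so that the known coefficients can be checked.

```python
def solve_linear_system(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - known) / m[r][r]
    return beta


def multiple_regression(xs, y):
    """Return [b0, b1, ..., bP] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical rows of (X1, X2). Y is built as exactly 1 + 2*X1 + 3*X2,
# so the fit should recover those three coefficients.
xs = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 4), (6, 8)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]

b0, b1, b2 = multiple_regression(xs, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

In practice a numerically safer routine (for example a QR-based least-squares solver) is preferable, because the normal equations become ill-conditioned when the X variables are highly correlated.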

66
Interpretation of Estimated Coefficients
- Slope (b_P)
  - Estimated Y changes by b_P for each 1 unit increase in X_P holding all other variables constant
    - Example: If b_1 = 2, then Sales (Y) is expected to increase by 2 for each 1 unit increase in Advertising (X_1) given the Number of Sales Reps (X_2)
- Y-Intercept (b_0)
  - Average value of Y when every X_P = 0

67
Parameter Estimation Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) & newspaper circulation (000) on the number of ad responses (00). You’ve collected the following data:
[Table not preserved; columns: Resp, Size, Circ]

68
Parameter Estimation Excel Output
[Screenshot not preserved; callouts: b_P, b_0, b_1, b_2]

69
Interpretation of Coefficients Solution
- Slope (b_1)
  - # Responses to Ad is expected to increase by .2049 (20.49 responses, since responses are recorded in hundreds) for each 1 sq. in. increase in Ad Size holding Circulation constant
- Slope (b_2)
  - # Responses to Ad is expected to increase by .2805 (28.05 responses) for each 1 unit (1,000) increase in Circulation holding Ad Size constant

70
Evaluating the Model

71
Regression Modeling Steps
- Define problem or question
- Specify model
- Collect data
- Do descriptive data analysis
- Estimate unknown parameters
- Evaluate model
- Use model for prediction

72
Evaluating Multiple Regression Model Steps
- Examine variation measures
- Do residual analysis
- Test parameter significance
  - Overall model
  - Portions of model
  - Individual coefficients
- Test for multicollinearity

73
Coef. of Determination Excel Output
[Screenshot not preserved; callouts: S_YX; r^2_Y.12; r^2_adj; note: ‘means 95.61% of variation in Y is due to Ad Size & Circulation’]

74
Coefficient of Partial Determination
- Proportion of variation in Y ‘explained’ by variable X_P holding all others constant
- Must estimate separate models
- Denoted r^2_Y1.2 in two X variables case
  - Coefficient of partial determination of X_1 with Y holding X_2 constant
- Useful in selecting X variables
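
The deck's formula for this quantity was not preserved. A commonly used textbook form for the two-variable case, offered as a hedged reconstruction, builds it from the regression sums of squares of two separately estimated models (one with both X variables, one with X_2 only):

```latex
SSR(X_1 \mid X_2) = SSR(X_1, X_2) - SSR(X_2)

r^{2}_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1, X_2) + SSR(X_1 \mid X_2)}
```

The denominator equals the variation left unexplained by X_2 alone, so the ratio is the share of that remaining variation which X_1 accounts for.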

75
r^2_Y1.2 Excel Output
[Screenshot not preserved; ANOVA table with columns df, SS and rows Regression, Residual, Total]

76
Testing Parameters

77
Evaluating Multiple Regression Model Steps
- Examine variation measures
- Do residual analysis
- Test parameter significance
  - Overall model
  - Portions of model
  - Individual coefficients
- Test for multicollinearity
(Callout: New! Expanded!)

78
Testing Overall Significance
- Shows if there is a linear relationship between all X variables together & Y
- Uses F test statistic
- Hypotheses
  - H_0: β_1 = β_2 = ... = β_P = 0
    - No linear relationship
  - H_1: At least one coefficient is not 0
    - At least one X variable affects Y

79
Overall Significance Excel Output
[Screenshot not preserved; callouts: df of P, n - P - 1 and n - 1; F = MSR / MSE; P-value]
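
The callouts above correspond to the standard ANOVA layout for a regression with P explanatory variables and n observations. As a hedged summary of how the F statistic is assembled from that table:

```latex
MSR = \frac{SSR}{P}, \qquad
MSE = \frac{SSE}{n-P-1}, \qquad
F = \frac{MSR}{MSE} \;\sim\; F_{P,\;n-P-1} \text{ under } H_0
```

The total degrees of freedom, n - 1, are the sum of the regression (P) and residual (n - P - 1) degrees of freedom.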

80
Testing Model Portions
- Examines the contribution of a set of X variables to the relationship with Y
- Null hypothesis:
  - Variables in set do not significantly improve the model when all other variables are included
- Must estimate separate models
- Used in selecting X variables

81
Testing Model Portions Test Statistic
- Test H_0: β_1 = 0 in a 2 variable model
[Equation not preserved; callout: ‘From ANOVA section of regression for’ (remainder not preserved)]
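
The slide's equation was not preserved. A standard partial F statistic for testing the contribution of X_1 in a two-variable model, given here as a hedged reconstruction, compares the regression sums of squares from the ANOVA sections of two fitted models (the full model and the model with X_2 only):

```latex
F = \frac{SSR(X_1, X_2) - SSR(X_2)}{MSE(X_1, X_2)},
\qquad df = 1 \text{ and } n - 3
```

For a single coefficient this F equals the square of the t statistic for b_1 in the full model.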

82
Multicollinearity

83
Evaluating Multiple Regression Model Steps
- Examine variation measures
- Do residual analysis
- Test parameter significance
  - Overall model
  - Portions of model
  - Individual coefficients
- Test for multicollinearity
(Callout: New! Expanded!)

84
Multicollinearity
- High correlation between X variables
- Coefficients measure combined effect
- Leads to unstable coefficients depending on X variables in model
- Always exists; matter of degree
- Example: Using both Sales & Profit as explanatory variables in same model

85
Detecting Multicollinearity
- Examine correlation matrix
  - Correlations between pairs of X variables are more than with Y variable
- Examine variance inflation factor (VIF)
  - If VIF_j > 5, multicollinearity exists
- Few remedies
  - Obtain new sample data
  - Eliminate one correlated X variable
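
A minimal sketch of the VIF check for the two-variable case. In general VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other X variables; with only two X variables that R^2 is just the squared correlation r_12^2. The Sales and Profit figures and the function names are hypothetical, chosen to echo the deck's example of two closely related explanatory variables.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two X variables."""
    r12 = pearson_r(x1, x2)
    return 1.0 / (1.0 - r12 ** 2)


# Hypothetical explanatory variables that move closely together.
sales = [10, 12, 15, 19, 24, 30]
profit = [2.0, 2.6, 2.9, 4.1, 4.6, 6.2]

vif = vif_two_predictors(sales, profit)
print(f"VIF = {vif:.1f}, exceeds 5: {vif > 5}")
```

With more than two X variables, each VIF_j needs its own auxiliary regression of X_j on all the remaining X variables.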

86
Correlation Matrix Excel Output
[Screenshot not preserved; callouts: r_Y2, r_12, r_Y1]

87
VIF Excel Output
[Screenshot not preserved; note: regress X_1 on X_2]

88
This Class...
Please take a moment to answer the following questions in writing:
- What was the most important thing you learned in class today?
- What do you still have questions about?
- How can today’s class be improved?
