Presentation is loading. Please wait.

Presentation is loading. Please wait.

Simple Linear Regression

Similar presentations


Presentation on theme: "Simple Linear Regression"— Presentation transcript:

1 Simple Linear Regression
Chapter 11 Simple Linear Regression

2 Learning Objectives 1. Describe the Linear Regression Model
2. State the Regression Modeling Steps Explain Ordinary Least Squares Understand and check model assumptions 4. Compute Regression Coefficients 5. Predict Response Variable 6. Interpret Computer Output As a result of this class, you will be able to...

3 Models 3

4 Models 1. Representation of Some Phenomenon
2. Mathematical Model Is a Mathematical Expression of Some Phenomenon 3. Often Describe Relationships between Variables 4. Types Deterministic Models Probabilistic Models .

5 Deterministic Models 1. Hypothesize Exact Relationships
2. Suitable When Prediction Error is Negligible 3. Example: Force Is Exactly Mass Times Acceleration F = m·a © T/Maker Co.

6 Probabilistic Models 1. Hypothesize 2 Components
Deterministic Random Error 2. Example: Sales Volume Is 10 Times Advertising Spending + Random Error Y = 10X +  Random Error May Be Due to Factors Other Than Advertising

7 Types of Probabilistic Models
7

8 Regression Models 13

9 Types of Probabilistic Models
7

10 Regression Models 1. Answer ‘What Is the Relationship Between the Variables?’ 2. Equation Used 1 Numerical Dependent (Response) Variable What Is to Be Predicted 1 or More Numerical or Categorical Independent (Explanatory) Variables 3. Used Mainly for Prediction & Estimation

11 Regression Modeling Steps
1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify Probability Distribution of Random Error Term Estimate Standard Deviation of Error 4. Evaluate Model 5. Use Model for Prediction & Estimation

12 Model Specification 13

13 Regression Modeling Steps
1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify Probability Distribution of Random Error Term Estimate Standard Deviation of Error 4. Evaluate Model 5. Use Model for Prediction & Estimation

14 Specifying the Model 1. Define Variables
2. Hypothesize Nature of Relationship Expected Effects (i.e., Coefficients’ Signs) Functional Form (Linear or Non-Linear) Interactions

15 Model Specification Is Based on Theory
1. Theory of Field (e.g., Sociology) 2. Mathematical Theory 3. Previous Research 4. ‘Common Sense’

16 Thinking Challenge: Which Is More Logical?
With positive linear relationship, sales increases infinitely. Discuss concept of ‘relevant range’. 17

17 Types of Regression Models
This teleology is based on the number of explanatory variables & nature of relationship between X & Y. 18

18 Types of Regression Models
This teleology is based on the number of explanatory variables & nature of relationship between X & Y. 19

19 Types of Regression Models
1 Explanatory Variable Models This teleology is based on the number of explanatory variables & nature of relationship between X & Y. Simple 20

20 Types of Regression Models
1 Explanatory 2+ Explanatory Variable Models Variables This teleology is based on the number of explanatory variables & nature of relationship between X & Y. Simple Multiple 21

21 Types of Regression Models
1 Explanatory 2+ Explanatory Variable Models Variables This teleology is based on the number of explanatory variables & nature of relationship between X & Y. Simple Multiple Linear 22

22 Types of Regression Models
1 Explanatory 2+ Explanatory Variable Models Variables This teleology is based on the number of explanatory variables & nature of relationship between X & Y. Simple Multiple Non- Linear Linear 23

23 Types of Regression Models
1 Explanatory 2+ Explanatory Variable Models Variables This teleology is based on the number of explanatory variables & nature of relationship between X & Y. Simple Multiple Non- Linear Linear Linear 24

24 Types of Regression Models
1 Explanatory 2+ Explanatory Variable Models Variables This teleology is based on the number of explanatory variables & nature of relationship between X & Y. Simple Multiple Non- Non- Linear Linear Linear Linear 24

25 Linear Regression Model
26

26 Types of Regression Models
This teleology is based on the number of explanatory variables & nature of relationship between X & Y. 27

27 Linear Equations High School Teacher © T/Maker Co. 28

28 Linear Regression Model
1. Relationship Between Variables Is a Linear Function Population Y-Intercept Population Slope Random Error Y X i 1 i i Dependent (Response) Variable (e.g., income) Independent (Explanatory) Variable (e.g., education)

29 Population & Sample Regression Models
30

30 Population & Sample Regression Models
 $  $  $  $  $ 31

31 Population & Sample Regression Models
Unknown Relationship  $  $  $  $  $ 32

32 Population & Sample Regression Models
Random Sample Unknown Relationship  $  $  $  $  $  $  $ 33

33 Population & Sample Regression Models
Random Sample Unknown Relationship  $  $  $  $  $  $  $ 34

34 Population Linear Regression Model
Observedvalue i = Random error Observed value 35

35 Sample Linear Regression Model
i = Random error ^ Unsampled observation Observed value 36

36 Estimating Parameters: Least Squares Method
40

37 Regression Modeling Steps
1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify Probability Distribution of Random Error Term Estimate Standard Deviation of Error 4. Evaluate Model 5. Use Model for Prediction & Estimation

38 Scattergram 1. Plot of All (Xi, Yi) Pairs
2. Suggests How Well Model Will Fit Y 60 40 20 X 20 40 60

39 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 42

40 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 43

41 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 44

42 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 45

43 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 46

44 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 47

45 Thinking Challenge How would you draw a line through the points? How do you determine which line ‘fits best’? 48

46 Least Squares 1. ‘Best Fit’ Means Difference Between Actual Y Values & Predicted Y Values Are a Minimum But Positive Differences Off-Set Negative 49

47 Least Squares 1. ‘Best Fit’ Means Difference Between Actual Y Values & Predicted Y Values Are a Minimum But Positive Differences Off-Set Negative 50

48 Least Squares 1. ‘Best Fit’ Means Difference Between Actual Y Values & Predicted Y Values Are a Minimum But Positive Differences Off-Set Negative 2. LS Minimizes the Sum of the Squared Differences (SSE) 51

49 Least Squares Graphically
52

50 Coefficient Equations
Prediction Equation Sample Slope Sample Y-intercept 53

51 Computation Table 54

52 Interpretation of Coefficients

53 Interpretation of Coefficients
^ 1. Slope (1) Estimated Y Changes by 1 for Each 1 Unit Increase in X If 1 = 2, then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising (X) ^ ^

54 Interpretation of Coefficients
^ 1. Slope (1) Estimated Y Changes by 1 for Each 1 Unit Increase in X If 1 = 2, then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising (X) 2. Y-Intercept (0) Average Value of Y When X = 0 If 0 = 4, then Average Sales (Y) Is Expected to Be 4 When Advertising (X) Is 0 ^ ^ ^ ^

55 Parameter Estimation Example
You’re a marketing analyst for Hasbro Toys. You gather the following data: Ad $ Sales (Units) What is the relationship between sales & advertising?

56 Scattergram Sales vs. Advertising
57

57 Guess The Parameters!

58 Scattergram Sales vs. Advertising
57

59 Parameter Estimation Solution Table
58

60 Parameter Estimation Solution
59

61 Coefficient Interpretation Solution

62 Coefficient Interpretation Solution
^ 1. Slope (1) Sales Volume (Y) Is Expected to Increase by .7 Units for Each $1 Increase in Advertising (X)

63 Coefficient Interpretation Solution
^ 1. Slope (1) Sales Volume (Y) Is Expected to Increase by .7 Units for Each $1 Increase in Advertising (X) 2. Y-Intercept (0) Average Value of Sales Volume (Y) Is Units When Advertising (X) Is 0 Difficult to Explain to Marketing Manager Expect Some Sales Without Advertising ^

64 Parameter Estimation Computer Output
Parameter Estimates Parameter Standard T for H0: Variable DF Estimate Error Param=0 Prob>|T| INTERCEP ADVERT ^ k ^ ^ 0 1

65 Derivation of Parameter Equations
Goal: Minimize squared error

66 Derivation of Parameter Equations

67 Parameter Estimation Thinking Challenge
You’re an economist for the county cooperative. You gather the following data: Fertilizer (lb.) Yield (lb.) What is the relationship between fertilizer & crop yield? © T/Maker Co. 62

68 Scattergram Crop Yield vs. Fertilizer*
Yield (lb.) Fertilizer (lb.) 65

69 Parameter Estimation Solution Table*
66

70 Parameter Estimation Solution*
67

71 Coefficient Interpretation Solution*

72 Coefficient Interpretation Solution*
^ 1. Slope (1) Crop Yield (Y) Is Expected to Increase by .65 lb. for Each 1 lb. Increase in Fertilizer (X)

73 Coefficient Interpretation Solution*
^ 1. Slope (1) Crop Yield (Y) Is Expected to Increase by .65 lb. for Each 1 lb. Increase in Fertilizer (X) 2. Y-Intercept (0) Average Crop Yield (Y) Is Expected to Be 0.8 lb. When No Fertilizer (X) Is Used ^

74 Probability Distribution of Random Error
69

75 Regression Modeling Steps
1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify Probability Distribution of Random Error Term Estimate Standard Deviation of Error 4. Evaluate Model 5. Use Model for Prediction & Estimation

76 Linear Regression Assumptions
1. Mean of Probability Distribution of Error Is 0 Probability Distribution of Error Has Constant Variance Exercise: Constant across what? 3. Probability Distribution of Error is Normal 4. Errors Are Independent

77 Error Probability Distribution
^ 91

78 Random Error Variation

79 Random Error Variation
1. Variation of Actual Y from Predicted Y

80 Random Error Variation
1. Variation of Actual Y from Predicted Y 2. Measured by Standard Error of Regression Model Sample Standard Deviation of , s ^

81 Random Error Variation
1. Variation of Actual Y from Predicted Y 2. Measured by Standard Error of Regression Model Sample Standard Deviation of , s 3. Affects Several Factors Parameter Significance Prediction Accuracy ^

82 Testing for Significance
Evaluating the Model Testing for Significance 99

83 Regression Modeling Steps
1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify Probability Distribution of Random Error Term Estimate Standard Deviation of Error 4. Evaluate Model 5. Use Model for Prediction & Estimation

84 Test of Slope Coefficient
1. Shows If There Is a Linear Relationship Between X & Y 2. Involves Population Slope 1 3. Hypotheses H0: 1 = 0 (No Linear Relationship) Ha: 1  0 (Linear Relationship) 4. Theoretical Basis Is Sampling Distribution of Slope

85 Sampling Distribution of Sample Slopes
102

86 Sampling Distribution of Sample Slopes
103

87 Sampling Distribution of Sample Slopes
All Possible Sample Slopes Sample 1: 2.5 Sample 2: 1.6 Sample 3: 1.8 Sample 4: : : Very large number of sample slopes 104

88 Sampling Distribution of Sample Slopes
All Possible Sample Slopes Sample 1: 2.5 Sample 2: 1.6 Sample 3: 1.8 Sample 4: : : Very large number of sample slopes Sampling Distribution S ^ 1 ^ 1 105

89 Slope Coefficient Test Statistic
106

90 Test of Slope Coefficient Example
You’re a marketing analyst for Hasbro Toys. You find b0 = -.1, b1 = .7 & s = Ad $ Sales (Units) Is the relationship significant at the .05 level?

91 Solution Table 108

92 Test of Slope Parameter Solution
H0: 1 = 0 Ha: 1  0   .05 df  = 3 Critical Value(s): Test Statistic: Decision: Conclusion: Reject at  = .05 There is evidence of a relationship 109

93 Test Statistic Solution

94 Test of Slope Parameter Computer Output
Parameter Estimates Parameter Standard T for H0: Variable DF Estimate Error Param=0 Prob>|T| INTERCEP ADVERT ‘Standard Error’ is the estimated standard deviation of the sampling distribution, sbP. ^ ^ S ^ t = k / S k ^ k k P-Value

95 Measures of Variation in Regression
1. Total Sum of Squares (SSyy) Measures Variation of Observed Yi Around the MeanY 2. Explained Variation (SSR) Variation Due to Relationship Between X & Y 3. Unexplained Variation (SSE) Variation Due to Other Factors

96 Variation Measures Yi Unexplained sum of squares (Yi -Yi)2 ^
Total sum of squares (Yi -Y)2 Explained sum of squares (Yi -Y)2 ^ 78

97 Coefficient of Determination
1. Proportion of Variation ‘Explained’ by Relationship Between X & Y 0  r2  1 79

98 Coefficient of Determination Examples
80

99 Coefficient of Determination Example
You’re a marketing analyst for Hasbro Toys. You find 0 = -0.1 & 1 = 0.7. Ad $ Sales (Units) Interpret a coefficient of determination of ^ ^ 83

100 r 2 Computer Output Root MSE R-square Dep Mean Adj R-sq C.V r2 r2 adjusted for number of explanatory variables & sample size S

101 Using the Model for Prediction & Estimation
112

102 Regression Modeling Steps
1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify Probability Distribution of Random Error Term Estimate Standard Deviation of Error 4. Evaluate Model 5. Use Model for Prediction & Estimation

103 Prediction With Regression Models
1. Types of Predictions Point Estimates Interval Estimates 2. What Is Predicted Population Mean Response E(Y) for Given X Point on Population Regression Line Individual Response (Yi) for Given X

104 What Is Predicted 115

105 Confidence Interval Estimate of Mean Y

106 Factors Affecting Interval Width
1. Level of Confidence (1 - ) Width Increases as Confidence Increases 2. Data Dispersion (s) Width Increases as Variation Increases 3. Sample Size Width Decreases as Sample Size Increases 4. Distance of Xp from MeanX Width Increases as Distance Increases

107 Why Distance from Mean? X Greater dispersion than X1
The closer to the mean, the less variability. This is due to the variability in estimated slope parameters. X 118

108 Confidence Interval Estimate Example
You’re a marketing analyst for Hasbro Toys. You find b0 = -.1, b1 = .7 & s = Ad $ Sales (Units) Estimate the mean sales when advertising is $4 at the .05 level.

109 Solution Table 120

110 Confidence Interval Estimate Solution
X to be predicted 121

111 Prediction Interval of Individual Response
Note the 1 under the radical in the standard error formula. The effect of the extra Syx is to increase the width of the interval. This will be seen in the interval bands. Note! 122

112 Why the Extra ‘S’? The error in predicting some future value of Y is the sum of 2 errors: 1. the error of estimating the mean Y, E(Y|X) 2. the random error that is a component of the value of Y to be predicted. Even if we knew the population regression line exactly, we would still make  error. 123

113 Interval Estimate Computer Output
Dep Var Pred Std Err Low95% Upp95% Low95% Upp95% Obs SALES Value Predict Mean Mean Predict Predict Predicted Y when X = 4 Confidence Interval Prediction Interval SY ^

114 Hyperbolic Interval Bands
Note: 1. As we move farther from the mean, the bands get wider. 2. The prediction interval bands are wider. Why? (extra Syx) 124

115 Correlation Models 129

116 Types of Probabilistic Models
130

117 Correlation Models 1. Answer ‘How Strong Is the Linear Relationship Between 2 Variables?’ 2. Coefficient of Correlation Used Population Correlation Coefficient Denoted  (Rho) Values Range from -1 to +1 Measures Degree of Association 3. Used Mainly for Understanding

118 Sample Coefficient of Correlation
1. Pearson Product Moment Coefficient of Correlation, r: 132

119 Coefficient of Correlation Values
-1.0 -.5 +.5 +1.0 134

120 Coefficient of Correlation Values
No Correlation -1.0 -.5 +.5 +1.0 135

121 Coefficient of Correlation Values
No Correlation -1.0 -.5 +.5 +1.0 Increasing degree of negative correlation 136

122 Coefficient of Correlation Values
Perfect Negative Correlation No Correlation -1.0 -.5 +.5 +1.0 137

123 Coefficient of Correlation Values
Perfect Negative Correlation No Correlation -1.0 -.5 +.5 +1.0 Increasing degree of positive correlation 138

124 Coefficient of Correlation Values
Perfect Negative Correlation Perfect Positive Correlation No Correlation -1.0 -.5 +.5 +1.0 139

125 Coefficient of Correlation Examples
141

126 Test of Coefficient of Correlation
1. Shows If There Is a Linear Relationship Between 2 Numerical Variables 2. Same Conclusion as Testing Population Slope 1 3. Hypotheses H0:  = 0 (No Correlation) Ha:   0 (Correlation)

127 Conclusion 1. Described the Linear Regression Model
2. Stated the Regression Modeling Steps 3. Explained Ordinary Least Squares 4. Computed Regression Coefficients 5. Predicted Response Variable 6. Interpreted Computer Output

128 Any blank slides that follow are blank intentionally.
End of Chapter Any blank slides that follow are blank intentionally.


Download ppt "Simple Linear Regression"

Similar presentations


Ads by Google