Bina Nusantara
Regression and Analysis of Variance
One-Way Analysis of Variance Model
Regression Approach to One-Way Classification
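The Multiple Regression Model
The relationship between one dependent variable and two or more independent variables is a linear function:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
- $Y_i$: dependent (response) variable
- $X_{1i}, \ldots, X_{ki}$: independent (explanatory) variables
- $\beta_0$: population Y-intercept
- $\beta_1, \ldots, \beta_k$: population slopes
- $\varepsilon_i$: random error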
Multiple Regression Model
[Figure: bivariate model]
Multiple Regression Equation
[Figure: bivariate model and the multiple regression equation]
Multiple Regression Equation
The coefficient estimates are too complicated to compute by hand.
Interpretation of Estimated Coefficients
Slope ($b_j$)
- The average value of Y is estimated to change by $b_j$ for each one-unit increase in $X_j$, holding all other variables constant (ceteris paribus)
- Example: if $b_1 = -2$, then fuel oil usage (Y) is expected to decrease by an estimated 2 gallons for each 1-degree increase in temperature ($X_1$), given the inches of insulation ($X_2$)
Y-intercept ($b_0$)
- The estimated average value of Y when all $X_j = 0$
Multiple Regression Model: Example
Develop a model for estimating the heating oil used by a single-family home in the month of January, based on average temperature (°F) and amount of insulation (inches).
Multiple Regression Equation: Example (Excel Output)
- For each 1-degree increase in temperature, the estimated average amount of heating oil used decreases by 5.437 gallons, holding insulation constant.
- For each 1-inch increase in insulation, the estimated average amount of heating oil used decreases by 20.012 gallons, holding temperature constant.
Multiple Regression in PHStat
PHStat | Regression | Multiple Regression …
Excel spreadsheet for the heating oil example
Venn Diagrams and Explanatory Power of Regression
[Venn diagram: two overlapping circles, Oil and Temp]
- Overlap: variation in Oil explained by Temp (equivalently, variation in Temp used in explaining variation in Oil)
- Oil outside the overlap: variation in Oil explained by the error term
- Temp outside the overlap: variation in Temp not used in explaining variation in Oil
Venn Diagrams and Explanatory Power of Regression (continued)
[Venn diagram: Oil and Temp]
Venn Diagrams and Explanatory Power of Regression
[Venn diagram: three overlapping circles, Oil, Temp and Insulation]
- Overlapping variation in both Temp and Insulation is used in explaining the variation in Oil but NOT in the estimation of $\beta_1$ nor $\beta_2$
- Oil outside both overlaps: variation in Oil NOT explained by Temp nor Insulation
Coefficient of Multiple Determination
Proportion of the total variation in Y explained by all X variables taken together:
$r^2 = \dfrac{SSR}{SST}$
- Never decreases when a new X variable is added to the model
- This is a disadvantage when comparing models
Venn Diagrams and Explanatory Power of Regression
[Venn diagram: Oil, Temp and Insulation]
Adjusted Coefficient of Multiple Determination
Proportion of the variation in Y explained by all the X variables, adjusted for the sample size $n$ and the number of X variables used $k$:
$r^2_{adj} = 1 - (1 - r^2)\,\dfrac{n-1}{n-k-1}$
- Penalizes excessive use of independent variables
- Smaller than $r^2$
- Useful in comparing among models
- Can decrease if an insignificant new X variable is added to the model
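The penalty can be sketched in a few lines of Python, assuming the usual textbook definition $r^2_{adj} = 1 - (1 - r^2)(n-1)/(n-k-1)$. The function name and the $r^2$ values are illustrative assumptions, not figures taken from the heating oil output.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r^2 for a model with k explanatory variables and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more than k + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Illustrative values only: a third variable raises r^2 very slightly ...
two_vars = adjusted_r2(0.9656, n=15, k=2)
three_vars = adjusted_r2(0.9657, n=15, k=3)

# ... but the adjusted value falls, because the small gain in r^2
# does not offset the degree of freedom that the extra variable costs.
print(round(two_vars, 4), round(three_vars, 4))
```

Comparing the two printed values shows why adjusted $r^2$, rather than $r^2$, is the one to use when models differ in the number of X variables.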
Example: Adjusted $r^2$ Can Decrease
Adjusted $r^2$ decreases when $k$ increases from 2 to 3: color is not useful in explaining the variation in oil consumption.
Using the Regression Equation to Make Predictions
Predict the amount of heating oil used for a home if the average temperature is 30°F and the insulation is 6 inches.
The predicted heating oil used is 278.97 gallons.
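A minimal sketch of this prediction in Python. The two slopes are the estimates quoted for the heating oil example. The intercept is an assumption: it is not shown in the text and has been back-solved from the stated prediction of 278.97 gallons. The function and constant names are hypothetical.

```python
# Slope estimates quoted for the heating oil example.
B_TEMP = -5.437         # gallons per 1 degree F of average temperature
B_INSULATION = -20.012  # gallons per inch of insulation

# ASSUMPTION: intercept back-solved from the stated prediction
# (278.97 gallons at 30 degrees F and 6 inches); not quoted in the slides.
B0 = 562.15


def predict_oil(temp_f: float, insulation_in: float) -> float:
    """Predicted January heating oil use (gallons) from the fitted equation."""
    return B0 + B_TEMP * temp_f + B_INSULATION * insulation_in


prediction = predict_oil(30, 6)
print(f"{prediction:.2f} gallons")
```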
Testing for Overall Significance (continued)
Test statistic:
$F = \dfrac{MSR}{MSE} = \dfrac{SSR/k}{SSE/(n-k-1)}$
- where F has $k$ numerator and $(n-k-1)$ denominator degrees of freedom
Test for Overall Significance, Excel Output: Example
[Excel ANOVA output, annotated: regression df = $k$ = 2, the number of explanatory variables; total df = $n - 1$; p-value]
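Test for Overall Significance: Example Solution
- $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
- $H_1$: at least one $\beta_j \neq 0$
- $\alpha = .05$; df = 2 and 12
- Critical value: $F = 3.89$
- Test statistic: $F = 168.47$ (Excel output)
- Decision: reject $H_0$ at $\alpha = 0.05$
- Conclusion: there is evidence that at least one independent variable affects Y
[Figure: F distribution from 0, with the upper-tail rejection region of area $\alpha = 0.05$ beginning at 3.89]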
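The decision rule can be sketched as follows, assuming the usual definition $F = MSR/MSE$ with $k$ and $n-k-1$ degrees of freedom. The function names are hypothetical, and the sums of squares passed to `overall_f` are made-up values used only to exercise the formula; only the F statistic and critical value come from the example.

```python
def overall_f(ssr: float, sse: float, n: int, k: int) -> float:
    """F = MSR / MSE, with k and n - k - 1 degrees of freedom."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse


def reject_h0(f_stat: float, f_critical: float) -> bool:
    """Upper-tail test: reject H0 when F exceeds the critical value."""
    return f_stat > f_critical


# Values from the heating oil example: F = 168.47, critical value 3.89.
decision = reject_h0(168.47, 3.89)
print(decision)

# Hypothetical sums of squares, only to show how F is assembled.
f_demo = overall_f(ssr=200.0, sse=20.0, n=15, k=2)
print(f_demo)
```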
Test for Significance: Individual Variables
- Shows whether Y depends linearly on a single $X_j$ individually, while holding the effects of the other X's fixed
- Uses the t test statistic
- Hypotheses:
  - $H_0$: $\beta_j = 0$ (no linear relationship)
  - $H_1$: $\beta_j \neq 0$ (linear relationship between $X_j$ and Y)
t Test Statistic, Excel Output: Example
[Excel coefficient output, annotated: t test statistic for $X_1$ (temperature); t test statistic for $X_2$ (insulation)]
t Test: Example Solution
Does temperature have a significant effect on monthly consumption of heating oil? Test at $\alpha = 0.05$.
- $H_0$: $\beta_1 = 0$
- $H_1$: $\beta_1 \neq 0$
- df = 12
- Critical values: $\pm 2.1788$ (area .025 in each tail)
- Test statistic: $t = -16.1699$
- Decision: reject $H_0$ at $\alpha = 0.05$
- Conclusion: there is evidence of a significant effect of temperature on oil consumption, holding constant the effect of insulation
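A minimal sketch of the two-tailed rule, assuming the usual statistic $t = b_j / S_{b_j}$. The standard error of about 0.3362 is an assumption: it is implied by the reported slope and t statistic rather than quoted in the text. Function names are hypothetical.

```python
def t_statistic(b_j: float, se_b_j: float) -> float:
    """t = b_j / S_bj for testing H0: beta_j = 0."""
    return b_j / se_b_j


def two_tailed_reject(t_stat: float, t_critical: float) -> bool:
    """Reject H0 when t falls in either tail beyond the critical value."""
    return abs(t_stat) > t_critical


b1 = -5.437     # estimated slope for temperature
se_b1 = 0.3362  # ASSUMPTION: implied standard error, not quoted in the slides

t1 = t_statistic(b1, se_b1)
print(round(t1, 2), two_tailed_reject(t1, 2.1788))
```

Because the statistic lies far beyond $-2.1788$, the conclusion does not hinge on the rounding of the assumed standard error.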
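Venn Diagrams and Estimation of Regression Model
[Venn diagram: Oil, Temp and Insulation]
- Variation shared with Oil by one explanatory variable alone: only this information is used in the estimation of that variable's slope
- Variation shared with Oil by both Temp and Insulation: this information is NOT used in the estimation of $\beta_1$ nor $\beta_2$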
Confidence Interval Estimate for the Slope
Provide the 95% confidence interval for the population slope $\beta_1$ (the effect of temperature on oil consumption).
$-6.169 \le \beta_1 \le -4.704$
We are 95% confident that the estimated average consumption of oil is reduced by between 4.70 and 6.17 gallons for each 1°F increase in temperature, holding insulation constant.
We can also perform the test for the significance of individual variables, $H_0$: $\beta_1 = 0$ vs. $H_1$: $\beta_1 \neq 0$, using this confidence interval: because the interval does not contain 0, $H_0$ is rejected at $\alpha = 0.05$.
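The link between the interval and the t test can be checked numerically. The sketch below assumes the interval was built as $b_1 \pm t_{n-k-1} S_{b_1}$ with the critical value 2.1788 (df = 12), and backs out the slope, standard error and t statistic it implies. Variable and function names are hypothetical.

```python
T_CRITICAL = 2.1788            # two-tailed, alpha = 0.05, df = 12
LOWER, UPPER = -6.169, -4.704  # reported 95% interval for beta_1


def interval_excludes_zero(lower: float, upper: float) -> bool:
    """Rejecting H0: beta_j = 0 at level alpha is equivalent to
    0 lying outside the (1 - alpha) confidence interval."""
    return not (lower <= 0.0 <= upper)


# Back out the pieces of b1 +/- t * S_b1 from the reported endpoints.
b1 = (LOWER + UPPER) / 2                    # midpoint = slope estimate
se_b1 = (UPPER - LOWER) / (2 * T_CRITICAL)  # implied standard error
t_stat = b1 / se_b1                         # implied t statistic

print(interval_excludes_zero(LOWER, UPPER))
print(round(b1, 3), round(se_b1, 4), round(t_stat, 2))
```

The implied slope and t statistic agree, up to rounding, with the reported $-5.437$ and $-16.1699$.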
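Contribution of a Single Independent Variable
Let $X_j$ be the independent variable of interest:
$SSR(X_j \mid \text{all others except } X_j) = SSR(\text{all}) - SSR(\text{all others except } X_j)$
- Measures the additional contribution of $X_j$ in explaining the total variation in Y with the inclusion of all the remaining independent variables
Contribution of a Single Independent Variable
$SSR(X_1 \mid X_2 \text{ and } X_3) = SSR(X_1, X_2 \text{ and } X_3) - SSR(X_2 \text{ and } X_3)$
- Measures the additional contribution of $X_1$ in explaining Y with the inclusion of $X_2$ and $X_3$
- Each term on the right comes from the ANOVA section of a regression: the first from the regression on $X_1$, $X_2$ and $X_3$, the second from the regression on $X_2$ and $X_3$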
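Coefficient of Partial Determination of $X_j$
$r^2_{Yj \cdot (\text{all others})} = \dfrac{SSR(X_j \mid \text{all others})}{SST - SSR(\text{all}) + SSR(X_j \mid \text{all others})}$
- Measures the proportion of variation in the dependent variable that is explained by $X_j$ while controlling for (holding constant) the other independent variables
Coefficient of Partial Determination for $X_j$ (continued)
Example: model with two independent variables
$r^2_{Y1 \cdot 2} = \dfrac{SSR(X_1 \mid X_2)}{SST - SSR(X_1, X_2) + SSR(X_1 \mid X_2)}$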
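A short sketch of both quantities for the two-variable case, assuming the standard definitions $SSR(X_1 \mid X_2) = SSR(X_1, X_2) - SSR(X_2)$ and $r^2_{Y1 \cdot 2} = SSR(X_1 \mid X_2) / (SST - SSR(X_1, X_2) + SSR(X_1 \mid X_2))$. The sums of squares are hypothetical, not the heating oil values, and the function names are illustrative.

```python
def contribution(ssr_full: float, ssr_without: float) -> float:
    """SSR(Xj | others) = SSR(all) - SSR(all except Xj)."""
    return ssr_full - ssr_without


def partial_r2(sst: float, ssr_full: float, ssr_without: float) -> float:
    """Share of the variation left unexplained by the other variables
    that Xj accounts for once it is added to the model."""
    extra = contribution(ssr_full, ssr_without)
    return extra / (sst - ssr_full + extra)


# Hypothetical sums of squares for a two-variable model.
sst, ssr_x1_x2, ssr_x2 = 100.0, 90.0, 40.0

extra_x1 = contribution(ssr_x1_x2, ssr_x2)
r2_y1_2 = partial_r2(sst, ssr_x1_x2, ssr_x2)
print(extra_x1, round(r2_y1_2, 4))
```

The denominator equals $SST - SSR(X_2)$, the variation that $X_2$ alone leaves unexplained, which is why the ratio is read as "explained by $X_1$ while holding $X_2$ constant".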
Venn Diagrams and Coefficient of Partial Determination
[Venn diagram: Oil, Temp and Insulation, with the coefficient of partial determination shown as a ratio of regions]
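Contribution of a Subset of Independent Variables
Let $X_s$ be the subset of independent variables of interest:
$SSR(X_s \mid \text{all others except } X_s) = SSR(\text{all}) - SSR(\text{all others except } X_s)$
- Measures the contribution of the subset $X_s$ in explaining SST with the inclusion of the remaining independent variables
Contribution of a Subset of Independent Variables: Example
Let $X_s$ be $X_1$ and $X_3$:
$SSR(X_1 \text{ and } X_3 \mid X_2) = SSR(X_1, X_2 \text{ and } X_3) - SSR(X_2)$
- Each term on the right comes from the ANOVA section of a regression: the first from the regression on $X_1$, $X_2$ and $X_3$, the second from the regression on $X_2$ alone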
Testing Portions of Model
Examines the contribution of a subset $X_s$ of explanatory variables to the relationship with Y.
- Null hypothesis: variables in the subset do not improve the model significantly when all other variables are included
- Alternative hypothesis: at least one variable in the subset is significant when all other variables are included
Testing Portions of Model (continued)
- One-tailed rejection region
- Requires comparison of two regressions:
  - one regression includes everything
  - another regression includes everything except the portion to be tested
Partial F Test for the Contribution of a Subset of X Variables
- Hypotheses:
  - $H_0$: variables $X_s$ do not significantly improve the model given all other variables included
  - $H_1$: variables $X_s$ significantly improve the model given all others included
- Test statistic:
  $F = \dfrac{SSR(X_s \mid \text{all others}) / m}{MSE(\text{all})}$
  - with df = $m$ and $(n-k-1)$
  - $m$ = number of variables in the subset $X_s$
Partial F Test for the Contribution of a Single $X_j$
- Hypotheses:
  - $H_0$: variable $X_j$ does not significantly improve the model given all others included
  - $H_1$: variable $X_j$ significantly improves the model given all others included
- Test statistic:
  $F = \dfrac{SSR(X_j \mid \text{all others})}{MSE(\text{all})}$
  - with df = 1 and $(n-k-1)$
  - $m = 1$ here
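The comparison of the two regressions can be sketched as below, assuming the statistic $F = [SSR(\text{all}) - SSR(\text{all except } X_s)]/m \div MSE(\text{all})$. The function name and all sums of squares are hypothetical; they are chosen so the arithmetic is easy to follow.

```python
def partial_f(ssr_full: float, ssr_reduced: float, sse_full: float,
              n: int, k: int, m: int) -> float:
    """Partial F statistic with m and n - k - 1 degrees of freedom.

    ssr_full    : SSR of the regression that includes every variable
    ssr_reduced : SSR of the regression without the m variables under test
    sse_full    : SSE of the full regression
    k           : number of X variables in the full regression
    """
    extra = ssr_full - ssr_reduced     # SSR(Xs | all others)
    mse_full = sse_full / (n - k - 1)  # MSE(all)
    return (extra / m) / mse_full


# Hypothetical values: full model with k = 2, testing one variable (m = 1).
f_partial = partial_f(ssr_full=90.0, ssr_reduced=40.0, sse_full=12.0,
                      n=15, k=2, m=1)
print(f_partial)
```

The result would then be compared with the upper-tail critical value of F with $m$ and $n-k-1$ degrees of freedom.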
Testing Portions of Model: Example
Test at the $\alpha = .05$ level to determine if the variable of average temperature significantly improves the model, given that insulation is included.
Testing Portions of Model: Example
- $H_0$: $X_1$ (temperature) does not improve the model with $X_2$ (insulation) included
- $H_1$: $X_1$ does improve the model
- $\alpha = .05$; df = 1 and 12
- Critical value = 4.75
[ANOVA outputs: regression for $X_1$ and $X_2$; regression for $X_2$]
- Conclusion: reject $H_0$; $X_1$ does improve the model
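For a single variable, the partial F statistic equals the square of that variable's t statistic, and the F critical value with 1 and 12 degrees of freedom equals the square of the two-tailed t critical value with 12 degrees of freedom. The sketch below uses that identity with the t values reported earlier in the example; it does not read the statistic from the ANOVA output, which is not reproduced in the text.

```python
T_TEMPERATURE = -16.1699  # reported t statistic for X1 (temperature)
T_CRITICAL = 2.1788       # two-tailed t critical value, df = 12
F_CRITICAL = 4.75         # reported F critical value, df = 1 and 12

# Identity for a single variable: partial F = t^2.
partial_f_temperature = T_TEMPERATURE ** 2
f_critical_from_t = T_CRITICAL ** 2

print(round(partial_f_temperature, 2), round(f_critical_from_t, 2))
print(partial_f_temperature > F_CRITICAL)
```

Both routes therefore lead to the same decision: the t test on $\beta_1$ and the partial F test for $X_1$ reject $H_0$ together.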