Lecture 23 (Tues., April 6): Interpretation of regression coefficients (handout); inference for multiple regression
Interpreting the Coefficients

β1 = the increase in the mean of Y associated with a one-unit (1 cm) increase in length, holding weight fixed.
β2 = the increase in the mean of Y associated with a one-unit (1 gram) increase in weight, holding length fixed.

The interpretation of a multiple regression coefficient depends on what other explanatory variables are in the model. See handout.
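The "holding fixed" point can be seen numerically: the coefficient on length changes when weight enters the model. A minimal sketch, using made-up numbers (not the course's fish data) and plain least squares via the normal equations:

```python
# Hypothetical illustration: fit Y = b0 + b1*length + b2*weight by least
# squares, and compare b1 with the slope from regressing Y on length alone.
# All data values below are invented for illustration.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[j] * row[m] for row in X) for m in range(k)] for j in range(k)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(k)]
    return solve(XtX, Xty)

# Made-up data: mercury rises with length; weight tracks length closely.
length  = [30, 35, 40, 45, 50, 55, 60]
weight  = [300, 450, 600, 800, 1000, 1250, 1500]
mercury = [0.5, 0.7, 0.9, 1.2, 1.4, 1.7, 1.9]

simple   = ols([[1, L] for L in length], mercury)                 # b0, b1
multiple = ols([[1, L, w] for L, w in zip(length, weight)], mercury)
print("length slope, ignoring weight:     ", round(simple[1], 4))
print("length slope, holding weight fixed:", round(multiple[1], 4))
```

Because length and weight are correlated, the two slopes differ: the simple regression slope absorbs part of weight's association with mercury.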
Inference for Multiple Regression

Types of inferences:
- Confidence intervals/hypothesis tests for regression coefficients
- Confidence intervals for the mean response; prediction intervals
- Overall usefulness of predictors (F-test, R-squared)
- Effect tests (we will cover these later, when we cover categorical explanatory variables)
Test of a Regression Coefficient

H0: β1 = 0 vs. Ha: β1 ≠ 0. Interpretation: "Is there evidence that length is a useful predictor of mercury concentration once weight has been taken into account (held fixed)?" or "Is length associated with mercury concentration once weight has been taken into account (held fixed)?"

t-test: t = b1 / SE(b1), where b1 is the estimated coefficient. Reject H0 for large |t|. JMP output gives the t-statistic and p-value under Parameter Estimates.

For the mercury data, the p-value for β1 (length) is less than 0.0001: strong evidence that length is a useful predictor once weight has been taken into account. The p-value for β2 (weight) is 0.2217: no evidence that weight is a useful predictor once length has been taken into account. (Interpretation: we could just use a simple linear regression of mercury concentration on length without losing much predictive accuracy.)
Confidence Interval for a Coefficient

A confidence interval for βj is a range of plausible values for βj. 95% CI: bj ± t* × SE(bj), where t* is the 0.975 quantile of the t distribution with n − p − 1 degrees of freedom.

Exact CI in JMP: under Parameter Estimates, right-click, select Columns, then Lower 95% and Upper 95%.

For the fish mercury data, the 95% CI for β1 (length) = (0.048, 0.095).
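The arithmetic behind the t-statistic and the CI is short enough to sketch. The estimate and standard error below are hypothetical (back-calculated so the interval roughly matches the slide's (0.048, 0.095)), and t* ≈ 2 is used in place of the exact t quantile:

```python
# Sketch: t-statistic and an approximate 95% CI from a coefficient
# estimate and its standard error, as read off a Parameter Estimates
# table. The numbers below are hypothetical, not actual JMP output.

def t_stat(estimate, se):
    """t = estimate / SE; reject beta_j = 0 when |t| is large."""
    return estimate / se

def approx_ci95(estimate, se, tcrit=2.0):
    """estimate +/- t* x SE; t* ~ 2 for moderate df (the exact value
    comes from the t distribution with n - p - 1 df)."""
    return (estimate - tcrit * se, estimate + tcrit * se)

b_length, se_length = 0.0715, 0.0118   # hypothetical estimate and SE
print("t =", round(t_stat(b_length, se_length), 2))
print("approx 95% CI:", approx_ci95(b_length, se_length))
```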
Confidence Interval for the Mean Response

What is the mean mercury concentration for the population of fish of length 48 cm and weight 1000 g? Point estimate: y-hat = b0 + b1(48) + b2(1000).

95% CI in JMP: create a row with length = 48 and weight = 1000 but no mercury concentration. Fit the model of mercury concentration on length and weight. Click the red triangle next to the response, select Columns, then Mean Confidence Interval. This creates columns with the lower and upper endpoints of the 95% CI for the mean response.
Prediction Interval

You are considering eating a fish of length 48 cm and weight 1000 grams. What would you estimate its mercury concentration to be? What range of values is likely to contain the mercury concentration of this particular fish?

Point estimate: the same y-hat = b0 + b1(48) + b2(1000) as for the mean response.

Prediction interval: to obtain a 95% prediction interval in JMP, follow the same instructions as for the confidence interval for the mean response, but after clicking the red triangle next to the response and selecting Columns, select Indiv Confid Interval.
Prediction Interval vs. Confidence Interval

For a fish of length 48 cm and weight 1000 grams:
- 95% CI for the mean response: (1.55, 2.02)
- 95% prediction interval: (0.62, 2.95)

Is a confidence interval for the mean response or a prediction interval more relevant for setting up guidelines about whether it is safe to eat fish of a certain length and weight?
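The prediction interval is wider because it must cover a single new fish, not just the estimated mean: its standard error adds the residual variance to the uncertainty in the fitted mean. A minimal sketch; the se_fit and sigma values are hypothetical, chosen so the output roughly matches the intervals above:

```python
# Sketch: CI for the mean response vs. prediction interval at the same
# point estimate. se_fit (SE of the fitted mean) and sigma (residual SD)
# are hypothetical values, and t* ~ 2 approximates the exact t quantile.
import math

def intervals(y_hat, se_fit, sigma, tcrit=2.0):
    ci = (y_hat - tcrit * se_fit, y_hat + tcrit * se_fit)
    se_pred = math.sqrt(se_fit ** 2 + sigma ** 2)   # adds residual variance
    pi = (y_hat - tcrit * se_pred, y_hat + tcrit * se_pred)
    return ci, pi

ci, pi = intervals(y_hat=1.785, se_fit=0.117, sigma=0.57)
print("CI for mean response:", ci)
print("prediction interval: ", pi)
```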
Overall Usefulness of Predictors

Are any of the predictors useful? Does the mean of Y change as any of the explanatory variables change?

H0: β1 = β2 = ... = βp = 0 vs. Ha: at least one of the βj's does not equal zero.

The test (called the overall F test) is carried out in the Analysis of Variance table. We reject H0 for large values of the F statistic; Prob > F is the p-value for this test.

For the fish mercury data, Prob > F is less than 0.0001: strong evidence that at least one of length/weight is a useful predictor of mercury concentration.
The R-Squared Statistic

The p-value from the overall F test tells us whether any of the predictors are useful, but it does not measure how useful they are. R-squared measures how good the predictions from the multiple regression model are compared with using the mean of Y, i.e., none of the predictors, to predict Y. It has the same interpretation as in simple linear regression: R-squared is the proportion of the variation in Y explained by the multiple regression model.

Total sum of squares: TSS = Σ(yi − ybar)². Residual sum of squares: RSS = Σ(yi − yhat_i)². R-squared = 1 − RSS/TSS.
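Both the R-squared statistic and the overall F statistic are built from the same two sums of squares. A small sketch on made-up responses and fitted values (the data are hypothetical, not the fish data), with p predictors and n observations:

```python
# Sketch: R^2 and the overall F statistic from sums of squares.
# y and y_hat below are invented; p is the number of predictors
# (excluding the intercept).

def r_squared_and_f(y, y_hat, p):
    n = len(y)
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total SS
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # residual SS
    r2 = 1 - rss / tss
    f = ((tss - rss) / p) / (rss / (n - p - 1))              # overall F
    return r2, f

y     = [0.5, 0.7, 0.9, 1.2, 1.4, 1.7, 1.9]
y_hat = [0.55, 0.68, 0.92, 1.15, 1.42, 1.66, 1.92]  # hypothetical fits
r2, f = r_squared_and_f(y, y_hat, p=2)
print("R^2 =", round(r2, 3), " F =", round(f, 1))
```

A large F (compared against the F distribution with p and n − p − 1 degrees of freedom) gives a small Prob > F; R-squared near 1 means the predictors explain most of the variation in Y.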
Applications of Multiple Regression

1. Prediction of Y given the explanatory variables. Example: prediction of mercury concentration given a fish's length and weight.
2. Estimating the causal effect of a variable on Y by holding confounding variables fixed. Example: the causal effect of irrigation on yield. Problem: multiple regression gives the causal effect only if we have not omitted any confounding variables from the regression.