
Prediction concerning the response Y. Where does this topic fit in? Model formulation Model estimation Model evaluation Model use.


Presentation transcript:

1 Prediction concerning the response Y

2 Where does this topic fit in? Model formulation Model estimation Model evaluation Model use

3 Translating two research questions into two reasonable statistical answers What is the mean weight, μ, of all American women, aged 18-24? –If we want to estimate μ, what would be a good estimate? What is the weight, y, of a randomly selected American woman, aged 18-24? –If we want to predict y, what would be a good prediction?

4 Could we do better by taking into account a person’s height?

5 One thing to estimate (μ_Y) and one thing to predict (y)

6 Two different research questions What is the mean response μ_Y when the predictor value is x_h? What value will a new observation Y_new be when the predictor value is x_h?

7 Example: Skin cancer mortality and latitude What is the expected (mean) mortality rate for all locations at 40° N latitude? What is the predicted mortality rate for 1 new randomly selected location at 40° N?

8 Example: Skin cancer mortality and latitude

9 The “point estimate” ŷ_h = b_0 + b_1 x_h is the best answer to each research question. That is, it is: the best guess of the mean response at x_h, and the best guess of a new observation at x_h. But, as always, to be confident in the answer to our research question, we should put an interval around our best guess.

10 It is dangerous to “extrapolate” beyond the scope of the model.

12 A confidence interval for the population mean response μ_Y … when the predictor value is x_h

13 Again, what are we estimating?

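14 (1-α)100% t-interval for mean response μ_Y
Formula in notation: $\hat{y}_h \pm t_{(1-\alpha/2,\; n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$
Formula in words: Sample estimate ± (t-multiplier × standard error)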
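The t-interval for the mean response can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Minitab computation itself: the function name `mean_response_ci` and the five-point data set are invented for the example, and the t-multiplier is passed in by the caller (about 3.182 for 95% confidence with n − 2 = 3 degrees of freedom, read from a t-table).

```python
import math

def mean_response_ci(x, y, x_h, t_mult):
    """Return (fit, se_fit, lower, upper) for the mean response at x_h.

    t_mult is the t-multiplier t(1 - alpha/2, n - 2), supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least-squares slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # MSE = SSE / (n - 2) estimates the error variance.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    # Sample estimate and its standard error at x_h.
    fit = b0 + b1 * x_h
    se_fit = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))
    return fit, se_fit, fit - t_mult * se_fit, fit + t_mult * se_fit

# Hypothetical data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit, se_fit, lower, upper = mean_response_ci(x, y, x_h=3, t_mult=3.182)
print(fit, se_fit, lower, upper)
```

Passing the multiplier in keeps the sketch free of third-party dependencies; in practice it would come from a t-table or a statistics library.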

15 Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
      1  150.08    2.75  (144.56, 155.61)  (111.23, 188.93)
Values of Predictors for New Observations
New Obs   Lat
      1  40.0

16 Factors affecting the length of the confidence interval for μ_Y
–As the confidence level decreases, …
–As MSE decreases, …
–As the sample size increases, …
–The more spread out the predictor values, …
–The closer x_h is to the sample mean, …
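How several of these factors act on the interval can be explored numerically through the standard error of the fitted value, sqrt(MSE × (1/n + (x_h − x̄)² / Σ(x_i − x̄)²)). The sketch below is an assumption-laden illustration: the helper `se_fit` and every number in it are hypothetical, and each factor is varied on its own while the others are held fixed (in real data, adding observations usually changes the spread of x as well). The confidence level is not varied here because it acts only through the t-multiplier.

```python
import math

def se_fit(mse, n, x_h, x_bar, sxx):
    """Standard error of the fitted value at x_h (simple linear regression).

    sxx is the sum of squared deviations of the predictor values from x_bar.
    """
    return math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))

# Hypothetical baseline values, chosen only for illustration.
base = se_fit(mse=4.0, n=20, x_h=7.0, x_bar=5.0, sxx=50.0)

smaller_mse = se_fit(mse=1.0, n=20, x_h=7.0, x_bar=5.0, sxx=50.0)
larger_n = se_fit(mse=4.0, n=80, x_h=7.0, x_bar=5.0, sxx=50.0)
more_spread = se_fit(mse=4.0, n=20, x_h=7.0, x_bar=5.0, sxx=200.0)
at_mean = se_fit(mse=4.0, n=20, x_h=5.0, x_bar=5.0, sxx=50.0)

for label, value in [
    ("baseline", base),
    ("smaller MSE", smaller_mse),
    ("larger n", larger_n),
    ("more spread in x", more_spread),
    ("x_h at the sample mean", at_mean),
]:
    print(f"{label:>24}: {value:.3f}")
```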

17 Does the estimate of μ_Y when x_h = 1 vary more here …?
Variable     N  StDev
yhat(x=1)    5  0.320

18 … or here?
Variable     N  StDev
yhat(x=1)    5  2.127

19 Does the estimate of μ_Y vary more when x_h = 1 or when x_h = 5.5?
Variable     N  StDev
yhat(x=1)    5  2.127
yhat(x=5.5)  5  0.512

20 Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit        95.0% CI         95.0% PI
      1  150.08    2.75  (144.6, 155.6)  (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)  (180.6, 263.07) X
X denotes a row with X values away from the center
Values of Predictors for New Observations
New Obs  Latitude
      1      40.0
      2      28.0
Mean of Lat = 39.533

21 When is it okay to use the formula for the confidence interval for μ_Y? When x_h is a value within the scope of the model. –x_h does not have to be one of the actual x values in the data set. When the “LINE” assumptions are met. –The formula works okay even if the error terms are only approximately normal. –If you have a large sample, the error terms can even deviate substantially from normality.

22 Prediction interval for a new response Y_new

23 Again, what are we predicting?

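24 (1-α)100% prediction interval for new response Y_new
Formula in notation: $\hat{y}_h \pm t_{(1-\alpha/2,\; n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$
Formula in words: Sample prediction ± (t-multiplier × standard error)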
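A prediction interval can be sketched from summary quantities alone. The function name `prediction_interval`, its argument names, and the numbers below are hypothetical, chosen only to show the arithmetic; the t-multiplier is again supplied by the caller rather than computed.

```python
import math

def prediction_interval(fit, mse, n, x_h, x_bar, sxx, t_mult):
    """Return (se_pred, lower, upper) for a new response at x_h.

    sxx is the sum of squared deviations of the predictor values from x_bar;
    t_mult is the t-multiplier t(1 - alpha/2, n - 2).
    """
    # The leading 1 accounts for the variation of the new observation itself.
    se_pred = math.sqrt(mse * (1 + 1 / n + (x_h - x_bar) ** 2 / sxx))
    return se_pred, fit - t_mult * se_pred, fit + t_mult * se_pred

# Hypothetical summary values: n = 5 points, x_bar = 3, sxx = 10, MSE = 0.8.
se_pred, lower, upper = prediction_interval(
    fit=4.0, mse=0.8, n=5, x_h=3.0, x_bar=3.0, sxx=10.0, t_mult=3.182
)
print(se_pred, lower, upper)
```

With these inputs the standard error for the mean response alone would be sqrt(0.8 × 0.2) = 0.4, so the prediction standard error (about 0.98) is noticeably larger.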

25 Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit          95.0% CI          95.0% PI
      1  150.08    2.75  (144.56, 155.61)  (111.23, 188.93)
Values of Predictors for New Observations
New Obs   Lat
      1  40.0

26 When is it okay to use the formula for the prediction interval for Y_new? When x_h is a value within the scope of the model. –x_h does not have to be one of the actual x values in the data set. When the “LINE” assumptions are met. –The formula for the prediction interval depends strongly on the assumption that the error terms are normally distributed.

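27 What’s the difference in the two formulas?
Confidence interval for μ_Y: $\hat{y}_h \pm t_{(1-\alpha/2,\; n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$
Prediction interval for Y_new: $\hat{y}_h \pm t_{(1-\alpha/2,\; n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$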

28 Prediction of Y_new if the mean μ_Y is known Suppose it were known that the mean skin cancer mortality at x_h = 40° N is 150 deaths per million (with variance 400). What is the predicted skin cancer mortality in Columbus, Ohio?
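If the mean and variance really were known, a range of plausible values for a single new location could be read straight off the distribution. The sketch below assumes, beyond what the slide states, that mortality at this latitude is normally distributed and that a 95% range is wanted; the variable names are illustrative.

```python
from statistics import NormalDist

# Values stated on the slide: known mean 150 and variance 400.
mean_mortality = 150.0
variance = 400.0
sd = variance ** 0.5  # standard deviation = 20

# Assumption: responses at this latitude follow a normal distribution.
dist = NormalDist(mu=mean_mortality, sigma=sd)
lower = dist.inv_cdf(0.025)
upper = dist.inv_cdf(0.975)

print(f"Best single prediction: {mean_mortality}")
print(f"95% range for a new location: ({lower:.1f}, {upper:.1f})")
```

The point prediction is simply the known mean; the width of the range comes entirely from the variation of individual locations around that mean.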

29 And then reality sets in The mean μ_Y is not known. –Estimate it with the predicted response ŷ_h. –The cost of using ŷ_h to estimate μ_Y is the variance of ŷ_h. The variance σ² is not known. –Estimate it with MSE.

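30 Variance of the prediction
$\sigma^2 + \mathrm{Var}(\hat{y}_h) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)$
which is estimated by:
$MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)$
The variation in the prediction of a new response depends on two components: 1. the variation due to estimating the mean μ_Y with ŷ_h 2. the variation in Y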
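As a rough check of this two-component decomposition, the numbers printed in the Minitab output earlier in the deck (Fit 150.08, SE Fit 2.75, 95% CI (144.56, 155.61), 95% PI (111.23, 188.93)) can be used to back out the implied t-multiplier and MSE. The results are approximate because the printed values are rounded, and the variable names are illustrative.

```python
# Values copied from the Minitab output for latitude 40.0.
fit = 150.08
se_fit = 2.75
ci = (144.56, 155.61)
pi = (111.23, 188.93)

# Half-width of the CI = t-multiplier x SE Fit, so the multiplier is implied.
t_mult = (ci[1] - ci[0]) / 2 / se_fit

# Half-width of the PI = t-multiplier x prediction standard error.
se_pred = (pi[1] - pi[0]) / 2 / t_mult

# Prediction variance = MSE + (SE Fit)^2, so MSE is what remains.
implied_mse = se_pred ** 2 - se_fit ** 2

print(f"implied t-multiplier: {t_mult:.3f}")
print(f"prediction standard error: {se_pred:.2f}")
print(f"implied MSE: {implied_mse:.1f}")
```

Nearly all of the prediction variance comes from the variation in Y itself; the contribution from estimating the mean (2.75² ≈ 7.6) is small by comparison.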

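31 What’s the effect of the difference in the two formulas?
Confidence interval for μ_Y: $\hat{y}_h \pm t_{(1-\alpha/2,\; n-2)} \times \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$
Prediction interval for Y_new: $\hat{y}_h \pm t_{(1-\alpha/2,\; n-2)} \times \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$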

32 What’s the effect of the difference in the two formulas? A (1-α)100% confidence interval for μ_Y at x_h will always be narrower than a (1-α)100% prediction interval for Y_new at x_h. The confidence interval’s standard error can approach 0 (for example, as the sample size grows), whereas the prediction interval’s standard error can never fall below √MSE.
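The contrast between the two standard errors can be seen numerically. This sketch assumes a hypothetical MSE of 400 held fixed, evaluates both standard errors at x_h equal to the sample mean (so the distance term vanishes), and lets the sample size grow.

```python
import math

mse = 400.0  # hypothetical value, held fixed for illustration
sample_sizes = [10, 100, 1000, 10000]

se_mean_values = []
se_pred_values = []
for n in sample_sizes:
    # At x_h = x_bar the (x_h - x_bar)^2 term is zero.
    se_mean = math.sqrt(mse * (1 / n))      # confidence interval for the mean
    se_pred = math.sqrt(mse * (1 + 1 / n))  # prediction interval for a new Y
    se_mean_values.append(se_mean)
    se_pred_values.append(se_pred)
    print(f"n = {n:>5}: SE(mean) = {se_mean:6.3f}, SE(pred) = {se_pred:6.3f}")

floor = math.sqrt(mse)  # the prediction standard error stays above this
print(f"sqrt(MSE) = {floor:.3f}")
```

The standard error for the mean shrinks toward 0 as n grows, while the prediction standard error levels off just above sqrt(MSE) = 20.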

33 Confidence intervals and prediction intervals for response in Minitab Stat >> Regression >> Regression … Specify response and predictor(s). Select Options… –In “Prediction intervals for new observations” box, specify either the X value or a column name containing multiple X values. –Specify confidence level (default is 95%). Click on OK. Results appear in session window.

34 Confidence intervals and prediction intervals for response in Minitab

35 Worksheet column C6 containing the X values for new observations: 40, 28

36 Example: Skin cancer mortality and latitude
Predicted Values for New Observations
New Obs     Fit  SE Fit        95.0% CI         95.0% PI
      1  150.08    2.75  (144.6, 155.6)  (111.2, 188.93)
      2  221.82    7.42  (206.9, 236.8)  (180.6, 263.07) X
X denotes a row with X values away from the center
Values of Predictors for New Observations
New Obs  Latitude
      1      40.0
      2      28.0
Mean of Lat = 39.533

37 A plot of the confidence interval and prediction interval in Minitab Stat >> Regression >> Fitted line plot … Specify predictor and response. Under Options … –Select Display confidence bands. –Select Display prediction bands. –Specify desired confidence level (95% default) Select OK.

38 A plot of the confidence interval and prediction interval in Minitab


