
1 1 732G21/732A35/732G28

2 Formal statement
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, \dots, n$$
• $Y_i$ is the $i$th response value
• $\beta_0$, $\beta_1$ are the model parameters, the regression parameters (intercept, slope)
• $X_i$ is the $i$th predictor value
• $\varepsilon_i$ are i.i.d. normally distributed random variables with expectation zero and variance $\sigma^2$

3 Inference about regression coefficients and response:
• Interval estimates and tests concerning coefficients
• Confidence interval for Y
• Prediction interval for Y
• ANOVA table

4 • After fitting the data, we may obtain a regression line with a very small estimated slope
• Is 0.00005 significantly different from zero, or is it just due to random variation (hence no linear dependence between Y and X)?
• How do we decide?
◦ Use hypothesis testing (later)
◦ Derive a confidence interval for $\beta_1$. If 0 does not fall within this interval, there is dependence
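As a minimal sketch of what "fitting the data" produces, the snippet below computes the least-squares intercept and slope. The data values and the helper name `fit_line` are hypothetical, chosen only so the arithmetic is easy to follow; they are not the Salary dataset used in the course.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the model Y = b0 + b1 * X + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")  # b0 = 2.200, b1 = 0.600
```

Whether a slope like this differs from zero by more than random variation is exactly the question the following slides address.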

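5 Properties of $b_1$
• The estimated slope $b_1$ is a random variable (look at the formula)
• Normally distributed (show)
• $E(b_1) = \beta_1$
• Variance: $\sigma^2(b_1) = \dfrac{\sigma^2}{\sum_i (X_i - \bar{X})^2}$
Further: the test statistic $\dfrac{b_1 - \beta_1}{s(b_1)}$ is distributed as $t(n-2)$, where $s^2(b_1) = \dfrac{MSE}{\sum_i (X_i - \bar{X})^2}$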
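The "show" items can be sketched as follows, assuming the standard least-squares formula for $b_1$ and the model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ with independent normal errors of variance $\sigma^2$. The weights $k_i$ are notation introduced here for the derivation.

```latex
\begin{align*}
b_1 &= \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}
     = \sum_i k_i Y_i,
  \qquad k_i = \frac{X_i - \bar{X}}{\sum_j (X_j - \bar{X})^2}, \\
\sum_i k_i &= 0, \qquad \sum_i k_i X_i = 1, \qquad
  \sum_i k_i^2 = \frac{1}{\sum_i (X_i - \bar{X})^2}, \\
E(b_1) &= \sum_i k_i (\beta_0 + \beta_1 X_i) = \beta_1, \\
\sigma^2(b_1) &= \sum_i k_i^2 \, \sigma^2 = \frac{\sigma^2}{\sum_i (X_i - \bar{X})^2}.
\end{align*}
```

Since $b_1$ is a linear combination of the independent normal variables $Y_i$, it is itself normally distributed.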

6 • See Table B.2 (p. 1317)
• Example: one-sided interval, 95%, 15 observations: $t(0.95;\, 13) = 1.771$
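When no table is at hand, the quantile can be approximated numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and inverts the distribution function by bisection. The function names, the step count and the search bracket are assumptions made for this illustration, not part of the course material.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), by Simpson's rule on [0, |t|] plus symmetry around zero."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value q with P(T <= q) = p, by bisection (assumes 0.5 <= p and q < 50)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 15 observations give n - 2 = 13 degrees of freedom.
print(round(t_quantile(0.95, 13), 3))  # 1.771
```

The result agrees with the table value quoted on the slide.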

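7 • Confidence interval for $\beta_1$ (show…): $b_1 \pm t(1-\alpha/2;\, n-2)\, s(b_1)$
• If the variance in the data is unknown, $\sigma^2$ is estimated by MSE: $s^2(b_1) = \dfrac{MSE}{\sum_i (X_i - \bar{X})^2}$
Example: compute a confidence interval for the slope, Salary dataset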
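A minimal sketch of the interval computation, assuming the usual form $b_1 \pm t(1-\alpha/2;\, n-2)\, s(b_1)$ with $s^2(b_1) = MSE / \sum_i (X_i - \bar{X})^2$. The data are hypothetical (not the Salary dataset), and the critical value is passed in by hand from a t table.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Return (b1, lower, upper) for the slope of a simple linear regression.

    t_crit is the table value t(1 - alpha/2; n - 2), supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # MSE estimates the unknown error variance with n - 2 degrees of freedom.
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    s_b1 = math.sqrt(mse / sxx)
    return b1, b1 - t_crit * s_b1, b1 + t_crit * s_b1

# Hypothetical data; t(0.975; 3) = 3.182 is read from a t table.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, lower, upper = slope_confidence_interval(x, y, t_crit=3.182)
print(f"b1 = {b1:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

For this tiny sample the interval runs from about -0.30 to 1.50 and contains 0, so the data do not establish a linear dependence.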


9 • Often we have a sample and we test at some significance level $\alpha$. How do we do it?
• Step 1: Find and compute an appropriate test function $T = T(\text{sample}, \lambda_0)$
• Step 2: Plot the test function's distribution and mark a critical area that depends on $\alpha$
• If $T$ is in the critical area, reject $H_0$ (accept $H_1$); otherwise do not reject $H_0$

10 • Test $H_0\colon \beta_1 = 0$ against $H_a\colon \beta_1 \neq 0$
• Step 1: Compute $t^* = b_1 / s(b_1)$
• Step 2: Plot the distribution, mark the points and the critical area
• Step 3: Locate $t^*$ and reject $H_0$ if it is in the critical area
Example: test the hypothesis for the Salary dataset
• Manually; also compute P-values
• By Minitab
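The steps above can be sketched in code for the two-sided test of a zero slope. This is an illustration under stated assumptions: the data are hypothetical, the function names are invented here, and the P-value is obtained by numerically integrating the t density with Simpson's rule rather than by any statistical package.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_star, df, steps=2000):
    """P(|T| > |t_star|) for T ~ t(df), via Simpson's rule on [0, |t_star|]."""
    a = abs(t_star)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = 2 * total * h / 3  # P(-a <= T <= a), using symmetry
    return 1 - central

def slope_t_test(x, y):
    """Return (t_star, p_value) for H0: beta1 = 0 against Ha: beta1 != 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    t_star = b1 / math.sqrt(mse / sxx)  # Step 1: the test statistic
    return t_star, two_sided_p_value(t_star, n - 2)

# Hypothetical data, not the Salary dataset.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
alpha = 0.05
t_star, p_value = slope_t_test(x, y)
print(f"t* = {t_star:.3f}, p = {p_value:.3f}, reject H0: {p_value < alpha}")
```

Rejecting when the P-value is below $\alpha$ is the same decision as checking whether $t^*$ falls in the critical area; here $t^*$ is about 2.12 and $H_0$ is not rejected at the 5% level.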

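11 Properties of $b_0$
• Sometimes we need to know whether $\beta_0 = 0$. Construct confidence intervals and test hypotheses in the same way, using the formulas below
• Normally distributed (show)
• $E(b_0) = \beta_0$
• Variance (show…): $\sigma^2(b_0) = \sigma^2 \left[ \dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2} \right]$
Further: the test statistic $\dfrac{b_0 - \beta_0}{s(b_0)}$ is distributed as $t(n-2)$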
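One way to carry out the "show" steps for the intercept, assuming the least-squares estimate $b_0 = \bar{Y} - b_1 \bar{X}$ and writing $b_1 = \sum_i k_i Y_i$ with weights $k_i = (X_i - \bar{X}) / \sum_j (X_j - \bar{X})^2$ (notation introduced here for the derivation):

```latex
\begin{align*}
E(b_0) &= E(\bar{Y}) - \bar{X}\, E(b_1)
        = (\beta_0 + \beta_1 \bar{X}) - \beta_1 \bar{X} = \beta_0, \\
\operatorname{Cov}(\bar{Y}, b_1) &= \sum_i \frac{1}{n}\, k_i\, \sigma^2 = 0
  \quad \text{because } \sum_i k_i = 0, \\
\sigma^2(b_0) &= \sigma^2(\bar{Y}) + \bar{X}^2 \sigma^2(b_1)
  = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2} \right].
\end{align*}
```

Normality follows because $b_0$ is again a linear combination of the independent normal $Y_i$.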

12 • If the distribution of the errors is not normal: a slight departure is OK; otherwise the results hold only asymptotically (for large samples)
• The spacing of the X values affects the variance (larger spacing gives smaller variance)
Example: test $\beta_0 = 0$ for the Salary data

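13 Properties of the estimate $\hat{Y}_h$ of $E(Y_h)$
• Estimate at $X = X_h$ ($X_h$ may be any value): $\hat{Y}_h = b_0 + b_1 X_h$
• Normally distributed (show)
• $E(\hat{Y}_h) = E(Y_h)$
• Variance: $\sigma^2(\hat{Y}_h) = \sigma^2 \left[ \dfrac{1}{n} + \dfrac{(X_h - \bar{X})^2}{\sum_i (X_i - \bar{X})^2} \right]$
Further: the test statistic $\dfrac{\hat{Y}_h - E(Y_h)}{s(\hat{Y}_h)}$ is distributed as $t(n-2)$
Confidence interval: $\hat{Y}_h \pm t(1-\alpha/2;\, n-2)\, s(\hat{Y}_h)$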

14 • Make a plot…
POINT ESTIMATE
CONFIDENCE INTERVAL: we estimate the position of the mean in the population with $X = X_h$
PREDICTION INTERVAL: we estimate the position of the individual observation in the population with $X = X_h$

15 • When the parameters are unknown, the mean $E(Y_h)$ may have more than one possible location
• New observation = mean + random error, so the prediction interval should be wider

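16 Prediction interval: $\hat{Y}_h \pm t(1-\alpha/2;\, n-2)\, s(\text{pred})$
Further: the test statistic $\dfrac{Y_{h(\text{new})} - \hat{Y}_h}{s(\text{pred})}$ is distributed as $t(n-2)$
• How do we estimate $s(\text{pred})$? The new observation is $b_0 + b_1 X_h + \varepsilon$, so the uncertainty of the fitted line and the variance of the new error add up. Hence $s^2(\text{pred}) = MSE + s^2(\hat{Y}_h)$
• Standard error (show): $s^2(\text{pred}) = MSE \left[ 1 + \dfrac{1}{n} + \dfrac{(X_h - \bar{X})^2}{\sum_i (X_i - \bar{X})^2} \right]$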
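To make the difference between the two intervals concrete, here is a sketch that computes both at one value $X_h$. It assumes the standard textbook formulas $s^2(\hat{Y}_h) = MSE\,[1/n + (X_h - \bar{X})^2 / \sum_i (X_i - \bar{X})^2]$ and $s^2(\text{pred}) = MSE + s^2(\hat{Y}_h)$. The data, the choice $X_h = 4$ and the function name are hypothetical.

```python
import math

def intervals_at(x, y, x_h, t_crit):
    """Return (y_hat, confidence_interval, prediction_interval) at X = x_h.

    t_crit is the table value t(1 - alpha/2; n - 2), supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

    y_hat = b0 + b1 * x_h
    # Uncertainty about the position of the mean response at x_h.
    s_mean = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))
    # A new observation adds its own random error on top of that.
    s_pred = math.sqrt(mse + s_mean ** 2)

    ci = (y_hat - t_crit * s_mean, y_hat + t_crit * s_mean)
    pi = (y_hat - t_crit * s_pred, y_hat + t_crit * s_pred)
    return y_hat, ci, pi

# Hypothetical data; t(0.975; 3) = 3.182 is read from a t table.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat, ci, pi = intervals_at(x, y, x_h=4, t_crit=3.182)
print(f"point estimate      : {y_hat:.3f}")
print(f"confidence interval : ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"prediction interval : ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval always contains the confidence interval, because $s(\text{pred})$ exceeds $s(\hat{Y}_h)$ by construction.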

17 Example
• Calculate confidence and prediction intervals for a 35-year-old person
• Compare with the output in Minitab

18 • Total sum of squares: $SSTO = \sum_i (Y_i - \bar{Y})^2$
• Error sum of squares: $SSE = \sum_i (Y_i - \hat{Y}_i)^2$
• Regression sum of squares: $SSR = \sum_i (\hat{Y}_i - \bar{Y})^2$

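19 • SSTO has $n-1$ degrees of freedom (the deviations $Y_i - \bar{Y}$ sum to zero)
• SSE has $n-2$ degrees of freedom (2 model parameters are estimated)
• SSR has 1 degree of freedom (the fitted values lie on the regression line, which gives 2 degrees of freedom; the deviations sum to zero, which removes 1)
$n-1 = (n-2) + 1$
$SSTO = SSE + SSR$
Important: MSxx = SSxx / degrees_of_freedom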

20 • ANOVA table

Source of variation   SS     df      MS
Regression            SSR    1       MSR = SSR / 1
Error                 SSE    n - 2   MSE = SSE / (n - 2)
Total                 SSTO   n - 1
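The entries of the table can be computed directly from the definitions of the three sums of squares. The sketch below does this for hypothetical data (not the Salary dataset) and computes SSR on its own, so that the decomposition SSTO = SSE + SSR can be checked rather than assumed. The function name `anova_table` is invented for this example.

```python
def anova_table(x, y):
    """Sums of squares, degrees of freedom and mean squares for a simple linear fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    ssto = sum((yi - y_bar) ** 2 for yi in y)                 # total
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # error
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)             # regression
    return {
        "Regression": (ssr, 1, ssr / 1),
        "Error": (sse, n - 2, sse / (n - 2)),
        "Total": (ssto, n - 1, None),
    }

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
table = anova_table(x, y)
for source, (ss, df, ms) in table.items():
    ms_text = "" if ms is None else f"{ms:.3f}"
    print(f"{source:<12}{ss:>8.3f}{df:>4}{ms_text:>8}")
```

For these data the rows are SSR = 3.6, SSE = 2.4 and SSTO = 6.0, and the degrees of freedom add up as 1 + 3 = 4.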

21 Expected mean squares
• $E(MSE) = \sigma^2$ does not depend on the slope, even when the slope is zero
• $E(MSR) = \sigma^2 + \beta_1^2 \sum_i (X_i - \bar{X})^2$, so $E(MSR) = E(MSE)$ when the slope is zero
• Hence: if MSR is much larger than MSE, the slope is not zero; if they are approximately the same, the slope can be zero

22 • Test statistic $F^* = MSR/MSE$; use $F(1, n-2)$ (see p. 1320)
Decision rules:
• If $F^* > F(1-\alpha;\, 1, n-2)$, conclude $H_a$
• If $F^* \le F(1-\alpha;\, 1, n-2)$, conclude $H_0$
Note: the F test and the two-sided t test about $\beta_1$ are equivalent
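The equivalence noted on this slide can be sketched in a few lines, assuming the test of $\beta_1 = 0$, the fitted values $\hat{Y}_i = b_0 + b_1 X_i$ with $\bar{Y} = b_0 + b_1 \bar{X}$, and $s^2(b_1) = MSE / \sum_i (X_i - \bar{X})^2$:

```latex
\begin{align*}
SSR &= \sum_i (\hat{Y}_i - \bar{Y})^2 = b_1^2 \sum_i (X_i - \bar{X})^2, \\
F^* &= \frac{MSR}{MSE} = \frac{b_1^2 \sum_i (X_i - \bar{X})^2}{MSE}
     = \frac{b_1^2}{s^2(b_1)} = (t^*)^2, \\
\bigl[\, t(1 - \alpha/2;\, n-2) \,\bigr]^2 &= F(1 - \alpha;\, 1,\, n-2).
\end{align*}
```

So $F^*$ exceeds its critical value exactly when $|t^*|$ exceeds its two-sided critical value, and both tests reach the same conclusion.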

23 • General approach
• Full model (linear): $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
• Reduced model (constant): $Y_i = \beta_0 + \varepsilon_i$

24 • It is known (why?..) that $SSE(F) \le SSE(R)$. A large difference suggests the models are different; a small difference means they can be the same
• Test statistic: $F^* = \dfrac{SSE(R) - SSE(F)}{df_R - df_F} \div \dfrac{SSE(F)}{df_F}$
• For the univariate linear model, this is equivalent to $F^* = MSR/MSE$
• Under $H_0$, $F^*$ follows the $F(df_R - df_F,\, df_F)$ distribution (plot critical area..)
• Test rule: if $F^* > F(1-\alpha;\, df_R - df_F,\, df_F)$, reject $H_0$
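A short sketch of the two claims on this slide, assuming the full model is the straight line and the reduced model is the constant mean. The inequality holds because the reduced model is the special case $\beta_1 = 0$ of the full model, so least squares over the larger family cannot fit worse. For the reduction to $MSR/MSE$:

```latex
\begin{align*}
\text{Reduced model: } & \hat{Y}_i = \bar{Y}, \quad
  SSE(R) = \sum_i (Y_i - \bar{Y})^2 = SSTO, \quad df_R = n - 1, \\
\text{Full model: } & \hat{Y}_i = b_0 + b_1 X_i, \quad
  SSE(F) = \sum_i (Y_i - \hat{Y}_i)^2 = SSE, \quad df_F = n - 2, \\
F^* &= \frac{SSTO - SSE}{(n - 1) - (n - 2)} \div \frac{SSE}{n - 2}
     = \frac{SSR}{1} \div \frac{SSE}{n - 2} = \frac{MSR}{MSE}.
\end{align*}
```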

25 Example: for the Salary dataset
• Compose the ANOVA table and compare with MINITAB
• Perform the F-test and compare with MINITAB

26 • Coefficient of determination: $R^2 = \dfrac{SSR}{SSTO} = 1 - \dfrac{SSE}{SSTO}$
• Coefficient of correlation: $r = \pm\sqrt{R^2}$, with the sign of the slope $b_1$
Limitations:
• A high R does not mean a good fit
• A low R does not mean that X and Y are not related
Example: for the Salary dataset, compute $R^2$ and compare with MINITAB
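The second limitation can be illustrated with a small constructed example: below, Y is an exact function of X, yet the linear fit gives $R^2 = 0$ because the relationship is not linear. Both datasets are hypothetical, and $R^2$ is computed as $1 - SSE/SSTO$.

```python
def r_squared(x, y):
    """Coefficient of determination R^2 = 1 - SSE / SSTO for a simple linear fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    ssto = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / ssto

# Roughly linear hypothetical data: a moderate R^2.
r2_linear = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Y = X^2 on a symmetric range: perfectly related, but not linearly.
x_sym = [-2, -1, 0, 1, 2]
r2_quadratic = r_squared(x_sym, [xi ** 2 for xi in x_sym])

print(f"roughly linear data : R^2 = {r2_linear:.3f}")
print(f"Y = X^2, symmetric X: R^2 = {r2_quadratic:.3f}")
```

The first value is 0.6 and the second is 0, even though in the second case X determines Y completely.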

27 • Chapter 2 up to page 78

