
4.3 Confidence Intervals
-Using our CLM assumptions, we can construct CONFIDENCE INTERVALS or CONFIDENCE INTERVAL ESTIMATES of the form: Bj hat ± t*·se(Bj hat)
-Given a significance level α (which is used to determine t*), we construct 100(1-α)% confidence intervals
-Across repeated random samples, 100(1-α)% of our confidence intervals contain the true value Bj
-we don't know whether any individual confidence interval contains the true value
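As a sketch of the construction above: a confidence interval only needs the coefficient estimate, its standard error, and a two-sided critical value. The estimate and standard error below are made-up numbers; t* = 2.704 is borrowed from the α = 0.01 example later in these slides.

```python
def conf_interval(beta_hat, se, t_star):
    # 100(1-alpha)% CI: beta_hat +/- t_star * se(beta_hat)
    return (beta_hat - t_star * se, beta_hat + t_star * se)

# hypothetical estimate Bj hat = 0.56 with se = 0.05;
# two-sided critical value t* = 2.704 (alpha = 0.01)
lo, hi = conf_interval(0.56, 0.05, 2.704)
print(round(lo, 4), round(hi, 4))  # 0.4248 0.6952
```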

4.3 Confidence Intervals
-Confidence intervals are similar to 2-tailed tests in that α/2 is in each tail when finding t*
-if our hypothesis test and confidence interval use the same α:
1) we cannot reject the null hypothesis (at the given significance level) that Bj = aj if aj is within the confidence interval
2) we can reject the null hypothesis (at the given significance level) that Bj = aj if aj is not within the confidence interval
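The duality above can be sketched directly: at matching α, the two-tailed test of H0: Bj = aj rejects exactly when aj lies outside the 100(1-α)% interval. The interval endpoints below are made-up numbers.

```python
def reject_via_ci(a_j, ci):
    # two-tailed test of H0: Bj = a_j at the CI's significance level:
    # reject exactly when a_j falls outside the interval
    lo, hi = ci
    return not (lo <= a_j <= hi)

ci = (0.42, 0.70)               # hypothetical 99% CI for Bj
print(reject_via_ci(0.0, ci))   # True: 0 lies outside, so reject H0: Bj = 0
print(reject_via_ci(0.5, ci))   # False: 0.5 lies inside, so cannot reject
```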

4.3 Confidence Example
-Going back to our Pepsi example, we now look at geekiness:
-From before, our 2-sided t* with α=0.01 was t*=2.704, therefore our 99% CI is:

4.3 Confidence Intervals
-Remember that a CI is only as good as the 6 CLM assumptions:
1) Omitted variables cause the estimates (Bj hats) to be unreliable - the CI is not valid
2) If heteroskedasticity is present, the standard error is not a valid estimate of the standard deviation - the CI is not valid
3) If normality fails, the CI MAY not be valid if our sample size is too small

4.4 Complicated Single Tests
-In this section we will see how to test a single hypothesis involving more than one Bj
-Take again our coolness regression:
-If we wonder whether geekiness has more impact on coolness than Pepsi consumption:

4.4 Complicated Single Tests
-This test is similar to our one-coefficient tests, but our standard error will be different
-We can rewrite our hypotheses for clarity: H0: B1 - B2 = 0 vs. H1: B1 - B2 > 0
-We can reject the null hypothesis if the estimated difference between B1 hat and B2 hat is positive enough

4.4 Complicated Single Tests
-Our new t statistic becomes: t = (B1 hat - B2 hat)/se(B1 hat - B2 hat)
-And our test continues as before:
1) Calculate t
2) Pick α and find t*
3) Reject if t > t*

4.4 Complicated Standard Errors
-The standard error in this test is more complicated than before
-If we simply subtracted standard errors, we could end up with a negative value - this is theoretically impossible
-se must always be positive since it estimates a standard deviation

4.4 Complicated Standard Errors
-Using the properties of variances, we know that: Var(B1 hat - B2 hat) = Var(B1 hat) + Var(B2 hat) - 2Cov(B1 hat, B2 hat)
-the variances are always added and the covariance is always subtracted
-transferring to standard deviations, this becomes: se(B1 hat - B2 hat) = [se(B1 hat)² + se(B2 hat)² - 2s12]^(1/2)
-Where s12 is an estimate of the covariance between the coefficient estimates
-s12 can either be calculated using matrix algebra or be supplied by econometrics programs
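Putting the pieces above together, a minimal sketch of the test statistic; all inputs are hypothetical numbers, and s12 would normally be read off the estimated covariance matrix from an econometrics program.

```python
from math import sqrt

def t_diff(b1, b2, se1, se2, s12):
    # t statistic for H0: B1 = B2, using
    # se(B1 hat - B2 hat) = sqrt(se1^2 + se2^2 - 2*s12)
    se_diff = sqrt(se1**2 + se2**2 - 2.0 * s12)
    return (b1 - b2) / se_diff

# hypothetical estimates, standard errors, and covariance
t = t_diff(0.30, 0.10, 0.08, 0.06, 0.001)
print(round(t, 3))  # 2.236
```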

4.4 Complicated Standard Errors
-To see how to find this standard error, take our typical regression: y = B0 + B1x1 + B2x2 + B3x3 + u
-and consider the related equation where θ = B1 - B2, or B1 = θ + B2: y = B0 + θx1 + B2(x1 + x2) + B3x3 + u
-where x1 and x2 could be related concepts (i.e.: sleep time and naps) and x3 could be relatively unrelated (i.e.: study time)
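The substitution in the slide above can be written out as a short derivation: substituting B1 = θ + B2 into the original regression and regrouping terms yields a new regression whose coefficient on x1 is exactly θ.

```latex
y = \beta_0 + (\theta + \beta_2)x_1 + \beta_2 x_2 + \beta_3 x_3 + u
  = \beta_0 + \theta x_1 + \beta_2 (x_1 + x_2) + \beta_3 x_3 + u
```

So regressing y on x1, (x1 + x2), and x3 gives θ hat = B1 hat - B2 hat directly, along with its standard error.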

4.4 Complicated Standard Errors
-By running this new regression, we can find the standard error for our hypothesis test - using an econometric program is easier
-Empirically:
1) B0 hat and se(B0 hat) are the same for both regressions
2) B2 hat and B3 hat are the same for both regressions
3) Only the coefficient on x1 changes - it now estimates θ, and its standard error is the one we need
-given this new standard error, CIs are created as normal

4.5 Testing Multiple Restrictions
-Thus far we have tested whether a SINGLE variable is significant, or how two different variables' impacts compare
-In this section we will test whether a SET of variables is significant, i.e. has a partial effect on the dependent variable
-Even though a group of variables may be individually insignificant, they may be significant as a group due to multicollinearity

4.5 Testing Multiple Restrictions
-Consider our general true model and an example measuring reading week utility (rwu):
-we want to test the hypothesis that B1 and B2 equal zero at the same time - that x1 and x2 simultaneously have no partial effect: H0: B1 = 0, B2 = 0
-in our example, we are testing that positive activities have no effect on r.w. utility

4.5 Testing Multiple Restrictions
-our null hypothesis has two EXCLUSION RESTRICTIONS
-this set of MULTIPLE RESTRICTIONS is tested using a MULTIPLE HYPOTHESIS TEST or JOINT HYPOTHESIS TEST
-the alternate hypothesis is unique: H1: H0 is not true (at least one of the restrictions fails)
-note that we CANNOT use individual t tests to test this multiple restriction; we need to test the restrictions jointly

4.5 Testing Multiple Restrictions
-to test joint significance, we need to use SSR and R-squared values obtained from two different regressions
-we know that SSR increases and R² decreases when variables are dropped from the model
-in order to conduct our test, we need to regress two models:
1) An UNRESTRICTED model with all of the variables
2) A RESTRICTED model that excludes the variables in the test

4.5 Testing Multiple Restrictions
-Given a hypothesis test with q restrictions, we have the following regressions:
(4.34) y = B0 + B1x1 + ... + Bkxk + u
(4.35) y = B0 + B1x1 + ... + B(k-q)x(k-q) + u
-Where 4.34 is the UNRESTRICTED MODEL giving us SSRur and 4.35 is the RESTRICTED MODEL giving us SSRr

4.5 Testing Multiple Restrictions
-These SSR values combine to give us our F STATISTIC or TEST F STATISTIC: F = [(SSRr - SSRur)/q] / [SSRur/(n-k-1)]
-Where q is the number of restrictions in the null hypothesis and q = numerator degrees of freedom
-n-k-1 = denominator degrees of freedom (the denominator is the unbiased estimator of σ²)
-since SSRr ≥ SSRur, F is always nonnegative
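The F statistic above is easy to compute once both regressions have been run; the SSR values, q, n, and k below are hypothetical numbers for illustration.

```python
def f_stat(ssr_r, ssr_ur, q, n, k):
    # F = [(SSRr - SSRur)/q] / [SSRur/(n-k-1)]
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

# hypothetical: SSRr = 198.3, SSRur = 183.2, q = 2, n = 575, k = 5
F = f_stat(198.3, 183.2, 2, 575, 5)
print(round(F, 2))
```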

4.5 Testing Multiple Restrictions
-One can think of our test F stat as measuring the relative increase in SSR when moving from the unrestricted model to the restricted one
-a large F indicates that the excluded variables have much explanatory power
-using H0 and our CLM assumptions, we know that F has an F distribution with (q, n-k-1) degrees of freedom: F ~ F(q, n-k-1)
-we obtain F* from F tables and reject H0 if F > F*

4.5 Multiple Example
-Given our previous example of reading week utility, a restricted and an unrestricted model give us:
-Which correspond to the hypotheses:

4.5 Multiple Example
-We use these SSRs to construct a test statistic:
-given α=0.05, F*(2, 569) = 3.00
-since F > F*, reject H0 at the 5% significance level; positive activities have an impact on reading week utility

4.5 Multiple Notes
-Once the degrees of freedom in F's denominator reach about 120, the F distribution is no longer sensitive to them - hence the infinity entry in the F table
-if H0 is rejected, the variables in question are JOINTLY (STATISTICALLY) SIGNIFICANT at the given alpha level
-if H0 is not rejected, the variables in question are JOINTLY INSIGNIFICANT at that alpha level
-due to multicollinearity, variables can be jointly significant even when their individual t tests fail to reject; conversely, an F test can fail to reject even when an individual t test rejects

4.5 F, t's secret identity?
-the F statistic can also be used to test the significance of a single variable - in this case, q = 1
-it can be shown that F = t² in this case - that is, the square of a t(n-k-1) variable has an F(1, n-k-1) distribution
-this only applies to two-sided tests - the t statistic is therefore more flexible since it allows for one-sided tests
-the t statistic is better suited for testing a single hypothesis

4.5 F tests and abuse
-we have already seen that individually insignificant variables may be jointly significant due to multicollinearity
-a significant variable can also prove to be jointly insignificant if grouped with enough insignificant variables
-an insignificant variable can also prove to be jointly significant if grouped with significant variables
-therefore t tests are much better than F tests at determining individual significance

4.5 R² and F
-While SSR can be large, R² is bounded, often making it an easier way to calculate F: F = [(R²ur - R²r)/q] / [(1 - R²ur)/(n-k-1)]
-Which is also called the R-SQUARED FORM OF THE F STATISTIC
-since R²ur ≥ R²r, F is still always nonnegative
-this form is NOT valid for testing all linear restrictions (as seen later)
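A sketch of the R-squared form; the R² values, q, n, and k below are hypothetical.

```python
def f_stat_r2(r2_ur, r2_r, q, n, k):
    # R-squared form: F = [(R2ur - R2r)/q] / [(1 - R2ur)/(n-k-1)]
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))

# hypothetical: R2ur = 0.48, R2r = 0.34, q = 1, n = 33, k = 2
F = f_stat_r2(0.48, 0.34, 1, 33, 2)
print(round(F, 2))  # 8.08
```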

4.5 F and p-values
-similar to t tests, F tests can produce p-values, defined as: p = P(F(q, n-k-1) > F | H0)
-the p-value is the "probability of observing a value of F at least as large as we did, given that the null hypothesis is true"
-a small p-value is therefore evidence against H0
-as before, reject H0 if p < α
-p-values can give us a more complete view of significance

4.5 Overall significance
-Often it is valid to test whether the model is significant overall
-the hypothesis that NONE of the explanatory variables have an effect on y is given as: H0: B1 = B2 = ... = Bk = 0
-as before with multiple restrictions, we compare against the restricted model: y = B0 + u

4.5 Overall significance
-Since our restricted model has no independent variables, its R² is zero and our F formula simplifies to: F = [R²/k] / [(1 - R²)/(n-k-1)]
-Which is only valid for this special test
-this test determines the OVERALL SIGNIFICANCE OF THE REGRESSION
-if this test fails to reject, we need to find other explanatory variables
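With R²r = 0, the R-squared form collapses to the special case above; a sketch with hypothetical values:

```python
def f_overall(r2, n, k):
    # overall-significance F: the restricted model has R2 = 0, so
    # F = (R2/k) / ((1 - R2)/(n-k-1))
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# hypothetical: R2 = 0.48, n = 33, k = 2
print(round(f_overall(0.48, 33, 2), 2))  # 13.85
```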

4.5 Testing General Linear Restrictions
-Sometimes economic theory (generally involving elasticities) requires us to test complicated joint restrictions, such as:
-Which expects our model:
-To be of the form:

4.5 Testing General Linear Restrictions
-We rewrite this expected model to obtain a restricted model:
-We then calculate the F statistic using the SSR formula
-note that since the dependent variable changes between the two models, the R-squared form of the F statistic is not valid in this case
-note that the number of restrictions (q) is simply equal to the number of equals signs in the null hypothesis

4.6 Reporting Regression Results
-When reporting single regressions, the proper reporting method is:
-where R², the estimated coefficients, and N MUST be reported (note also the ^'s and the i subscripts)
-either standard errors or t-values must also be reported (se is more robust for tests other than Bk = 0)
-SSR and the standard error of the regression can also be reported

4.6 Reporting Regression Results
-When multiple, related regressions are run (often to test for joint significance), the results can be expressed in table format, as seen on the next slide
-whether a simple or table reporting method is used, the meanings and scaling of all the included variables must always be explained in a proper project, i.e.:
price: average price, measured weekly, in American dollars
college: dummy variable; 0 if no college education, 1 if college education

4.6 Reporting Regression Results

Dependent variable: Midterm readiness

Ind. variables    (1)            (2)
Study Time        0.47 (0.12)    -
Intellect         1.89 (1.7)     2.36 (1.4)
Intercept         2.5 (0.03)     2.8 (0.02)
Observations      33             33
R²                0.48           0.34
