4.3 Confidence Intervals
-Using our CLM assumptions, we can construct CONFIDENCE INTERVALS or CONFIDENCE INTERVAL ESTIMATES of the form:
Bj hat ± t*·se(Bj hat)
-Given a significance level α (which is used to determine t*), we construct 100(1-α)% confidence intervals
-Given repeated random samples, 100(1-α)% of our confidence intervals contain the true value Bj
-we don't know whether an individual confidence interval contains the true value
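Written out (this just restates the interval above in LaTeX and records how t* is defined; nothing here is specific to the course example):

```latex
% 100(1-\alpha)% confidence interval for \beta_j under the CLM assumptions
\hat{\beta}_j \;\pm\; t^{*}\cdot \operatorname{se}(\hat{\beta}_j),
\qquad t^{*} = t_{1-\alpha/2,\; n-k-1}
% t* is the two-sided critical value: the (1-\alpha/2) quantile of the
% t distribution with n-k-1 degrees of freedom
```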

4.3 Confidence Intervals
-Confidence intervals are similar to two-tailed tests in that α/2 is in each tail when finding t*
-if our hypothesis test and confidence interval use the same α:
1) we cannot reject the null hypothesis (at the given significance level) that Bj = aj if aj is within the confidence interval
2) we can reject the null hypothesis (at the given significance level) that Bj = aj if aj is not within the confidence interval

4.3 Confidence Interval Example
-Going back to our Pepsi example, we now look at geekiness:
-From before, our two-sided t* with α=0.01 was t*=2.704, therefore our 99% CI is:
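A minimal sketch of this calculation in Python; the geekiness coefficient and its standard error below are placeholders, since the slide's regression output is not reproduced here, but df = 40 reproduces the slide's t* = 2.704:

```python
# Sketch of a 99% confidence interval for a single coefficient.
from scipy import stats

b_hat = 0.82      # hypothetical estimated coefficient on geekiness
se = 0.25         # hypothetical standard error
df = 40           # hypothetical n - k - 1 degrees of freedom
alpha = 0.01

t_star = stats.t.ppf(1 - alpha / 2, df)   # two-sided critical value, about 2.704
lower, upper = b_hat - t_star * se, b_hat + t_star * se
print(f"t* = {t_star:.3f}, 99% CI = ({lower:.3f}, {upper:.3f})")
```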

4.3 Confidence Intervals
-Remember that a CI is only as good as the 6 CLM assumptions:
1) Omitted variables cause the estimates (Bj hats) to be unreliable -CI is not valid
2) If heteroskedasticity is present, the standard error is not a valid estimate of the standard deviation -CI is not valid
3) If normality fails, the CI MAY not be valid if our sample size is too small

4.4 Complicated Single Tests
-In this section we will see how to test a single hypothesis involving more than one Bj
-Take again our coolness regression:
-If we wonder whether geekiness has more impact on coolness than Pepsi consumption:

4.4 Complicated Single Tests
-This test is similar to our one-coefficient tests, but our standard error will be different
-We can rewrite our hypotheses for clarity:
H0: B1 - B2 = 0    H1: B1 - B2 > 0
-We can reject the null hypothesis if the estimated difference between B1 hat and B2 hat is positive enough

4.4 Complicated Single Tests
-Our new t statistic becomes:
t = (B1 hat - B2 hat) / se(B1 hat - B2 hat)
-And our test continues as before:
1) Calculate t
2) Pick α and find t*
3) Reject H0 if t > t*

4.4 Complicated Standard Errors
-The standard error in this test is more complicated than before
-If we simply subtract standard errors, we may end up with a negative value -this is theoretically impossible
-se must always be positive since it estimates a standard deviation

4.4 Complicated Standard Errors
-Using the properties of variances, we know that:
Var(B1 hat - B2 hat) = Var(B1 hat) + Var(B2 hat) - 2·Cov(B1 hat, B2 hat)
-Where the variances are always added and the covariance always subtracted
-translating to standard errors, this becomes:
se(B1 hat - B2 hat) = sqrt( [se(B1 hat)]² + [se(B2 hat)]² - 2·s12 )
-Where s12 is an estimate of the covariance between the coefficient estimates
-s12 can either be calculated using matrix algebra or be supplied by econometrics programs

4.4 Complicated Standard Errors
-To see how to find this standard error, take our typical regression:
y = B0 + B1x1 + B2x2 + B3x3 + u
-and consider the related equation where θ = B1 - B2, or B1 = θ + B2:
y = B0 + θx1 + B2(x1 + x2) + B3x3 + u
-where x1 and x2 could be related concepts (ie: sleep time and naps) and x3 could be relatively unrelated (ie: study time)

4.4 Complicated Standard Errors
-By running this new regression, we can find the standard error for our hypothesis test -using an econometric program is easier (see the sketch below)
-Empirically:
1) B0 and se(B0) are the same for both regressions
2) B2 and B3 are the same for both regressions
3) Only the coefficient on x1 changes -it now estimates θ, and its standard error is se(B1 hat - B2 hat)
-given this new standard error, CIs are created as normal
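A sketch of both routes with statsmodels: pulling the covariance s12 from the estimated coefficient covariance matrix (previous slide) and running the θ reparameterization described here. The data and variable names (x1, x2, x3) are simulated placeholders, not the course example; the two routes should give identical standard errors.

```python
# Two ways to get se(b1_hat - b2_hat):
#   (1) from the estimated covariance matrix of the coefficients
#   (2) from the theta = B1 - B2 reparameterization
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1 + 0.8 * df["x1"] + 0.5 * df["x2"] + 0.3 * df["x3"] + rng.normal(size=n)

# Route 1: variances and covariance of the estimates
res = smf.ols("y ~ x1 + x2 + x3", data=df).fit()
V = res.cov_params()                          # estimated Var-Cov matrix of the b_hats
se_diff = np.sqrt(V.loc["x1", "x1"] + V.loc["x2", "x2"] - 2 * V.loc["x1", "x2"])

# Route 2: regress y on x1, (x1 + x2), x3; the coefficient on x1 is theta_hat
df["x1_plus_x2"] = df["x1"] + df["x2"]
res_theta = smf.ols("y ~ x1 + x1_plus_x2 + x3", data=df).fit()

print(f"route 1: se = {se_diff:.4f}")
print(f"route 2: theta_hat = {res_theta.params['x1']:.4f}, "
      f"se = {res_theta.bse['x1']:.4f}")
```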

4.5 Testing Multiple Restrictions
-Thus far we have tested whether a SINGLE variable is significant, or how two different variables' impacts compare
-In this section we will test whether a SET of variables is jointly significant; that is, whether the set has a partial effect on the dependent variable
-Even though a group of variables may be individually insignificant, they may be significant as a group due to multicollinearity

4.5 Testing Multiple Restrictions
-Consider our general true model and an example measuring reading week utility (rwu):
-we want to test the hypothesis that B1 and B2 equal zero at the same time; that is, that x1 and x2 have no partial effect simultaneously:
H0: B1 = 0, B2 = 0
-in our example, we are testing that positive activities have no effect on reading week utility

4.5 Testing Multiple Restrictions
-our null hypothesis has two EXCLUSION RESTRICTIONS
-this set of MULTIPLE RESTRICTIONS is tested using a MULTIPLE HYPOTHESIS TEST or JOINT HYPOTHESIS TEST
-the alternate hypothesis is unique:
H1: H0 is not true (at least one of B1, B2 is nonzero)
-note that we CANNOT use individual t tests to test this multiple restriction; we need to test the restrictions jointly

4.5 Testing Multiple Restrictions
-to test joint significance, we need to use SSR and R-squared values obtained from two different regressions
-we know that SSR increases and R2 decreases when variables are dropped from the model
-in order to conduct our test, we need to run two regressions:
1) An UNRESTRICTED model with all of the variables
2) A RESTRICTED model that excludes the variables in the test

4.5 Testing Multiple Restrictions
-Given a hypothesis test with q restrictions, we have the following regressions:
(4.34) y = B0 + B1x1 + ... + Bkxk + u
(4.35) y = B0 + B1x1 + ... + Bk-qxk-q + u
-Where 4.34 is the UNRESTRICTED MODEL, giving us SSRur, and 4.35 is the RESTRICTED MODEL, giving us SSRr

4.5 Testing Multiple Restrictions
-These SSR values combine to give us our F STATISTIC or TEST F STATISTIC:
F = [(SSRr - SSRur)/q] / [SSRur/(n-k-1)]
-Where q is the number of restrictions in the null hypothesis, and q = numerator degrees of freedom
-n-k-1 = denominator degrees of freedom (the denominator of F is the unbiased estimator of σ2)
-since SSRr ≥ SSRur, F is always non-negative

4.5 Testing Multiple Restrictions
-One can think of our test F stat as measuring the relative increase in SSR from moving from the unrestricted model to the restricted model
-a large F indicates that the excluded variables have much explanatory power
-using H0 and our CLM assumptions, we know that F has an F distribution with q and n-k-1 degrees of freedom: F ~ Fq, n-k-1
-we obtain F* from F tables and reject H0 if:
F > F*

4.5 Multiple Restrictions Example
-Given our previous example of reading week utility, a restricted and an unrestricted model give us:
-Which correspond to the hypotheses:

4.5 Multiple Restrictions Example
-We use these SSR values to construct a test statistic:
-given α=0.05, F*(2, 569) = 3.00
-since F > F*, we reject H0 at the 95% confidence level; positive activities have an impact on reading week utility
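A sketch of the whole procedure with statsmodels; the reading-week data and variable names are simulated placeholders (the slide's actual regressions and SSR values are not reproduced here), but the F calculation follows the formula above, with q = 2 and n - k - 1 = 569 so that F*(2, 569) is roughly 3.00 at α = 0.05:

```python
# Joint F test of two exclusion restrictions, computed from the restricted
# and unrestricted SSRs. Data and variable names are placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 575                                    # gives n - k - 1 = 569 with k = 5
df = pd.DataFrame(rng.normal(size=(n, 5)), columns=["x1", "x2", "x3", "x4", "x5"])
df["rwu"] = 1 + 0.4 * df["x1"] + 0.3 * df["x2"] + 0.5 * df["x3"] + rng.normal(size=n)

unrestricted = smf.ols("rwu ~ x1 + x2 + x3 + x4 + x5", data=df).fit()
restricted = smf.ols("rwu ~ x3 + x4 + x5", data=df).fit()   # drops x1 and x2

q = 2
df_denom = unrestricted.df_resid           # n - k - 1
F = ((restricted.ssr - unrestricted.ssr) / q) / (unrestricted.ssr / df_denom)
F_star = stats.f.ppf(0.95, q, df_denom)    # critical value, roughly 3.00 here
print(f"F = {F:.2f}, F* = {F_star:.2f}, reject H0: {F > F_star}")
```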

4.5 Multiple Restrictions Notes
-Once the degrees of freedom in F's denominator reach about 120, the F distribution is no longer sensitive to them -hence the infinity entry in the F table
-if H0 is rejected, the variables in question are JOINTLY (STATISTICALLY) SIGNIFICANT at the given alpha level
-if H0 is not rejected, the variables in question are JOINTLY INSIGNIFICANT at that alpha level
-an F test can sometimes fail to reject even when individual t tests reject (and vice versa); multicollinearity is one reason the two kinds of tests can disagree

4.5 F, t's secret identity?
-the F statistic can also be used to test the significance of a single variable -in this case, q = 1
-it can be shown that F = t2 in this case; the square of a tn-k-1 random variable has an F1, n-k-1 distribution
-this only applies to two-sided tests -therefore the t statistic is more flexible, since it allows for one-sided tests
-the t statistic is always best suited for testing a single hypothesis
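A quick numeric sketch of the F = t2 relationship on simulated placeholder data (the variable names are hypothetical):

```python
# Check that F = t^2 when testing a single coefficient two-sided (q = 1).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 1 + 0.5 * df["x1"] + 0.2 * df["x2"] + rng.normal(size=100)

res = smf.ols("y ~ x1 + x2", data=df).fit()
t = res.tvalues["x2"]                 # t statistic for H0: B2 = 0
print(f"t^2 = {t**2:.4f}")
print(res.f_test("x2 = 0"))           # the reported F equals t^2 (df_num = 1)
```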

4.5 F tests and abuse
-we have already seen that individually insignificant variables may be jointly significant due to multicollinearity
-a significant variable can also appear jointly insignificant if grouped with enough insignificant variables
-an insignificant variable can also appear jointly significant if grouped with significant variables
-therefore t tests are much better than F tests at determining individual significance

4.5 R2 and F
-While SSR can be large, R2 is bounded, often making it an easier way to calculate F:
F = [(R2ur - R2r)/q] / [(1 - R2ur)/(n-k-1)]
-Which is also called the R-SQUARED FORM OF THE F STATISTIC
-since R2ur ≥ R2r, F is still always non-negative
-this form is NOT valid for testing all linear restrictions (as seen later)

4.5 F and p-values
-similar to t tests, F tests can produce p-values, which are defined as:
p-value = P(Fq, n-k-1 > F)
-the p-value is the "probability of observing a value of F at least as large as we did, given that the null hypothesis is true"
-a small p-value is therefore evidence against H0
-as before, reject H0 if the p-value < α
-p-values can give us a more complete view of significance
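As a sketch, the p-value is the upper-tail area of the F(q, n-k-1) distribution beyond the calculated statistic; the F value below is a placeholder, while q and the denominator degrees of freedom reuse the reading-week example:

```python
# p-value of an F test: probability that an F(q, n-k-1) variable exceeds
# the calculated statistic. The F value below is hypothetical.
from scipy import stats

F = 4.50          # hypothetical calculated F statistic
q = 2             # numerator degrees of freedom (number of restrictions)
df_denom = 569    # n - k - 1

p_value = stats.f.sf(F, q, df_denom)   # upper-tail probability
print(f"p-value = {p_value:.4f}")      # reject H0 if p-value < alpha
```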

4.5 Overall significance
-Often it is useful to test whether the model is significant overall
-the hypothesis that NONE of the explanatory variables have an effect on y is given as:
H0: B1 = B2 = ... = Bk = 0
-as before with multiple restrictions, we compare against the restricted model:
y = B0 + u

4.5 Overall significance
-Since our restricted model has no independent variables, its R2 is zero and our F formula simplifies to:
F = (R2/k) / [(1 - R2)/(n-k-1)]
-Which is only valid for this special test
-this test determines the OVERALL SIGNIFICANCE OF THE REGRESSION
-if this test fails to reject H0, we need to find other explanatory variables
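Most packages report this overall F automatically with every regression; a minimal sketch with statsmodels on simulated placeholder data, comparing the automatic value with the R-squared formula above:

```python
# Overall significance: H0 that all slope coefficients are zero.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 120
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1 + 0.6 * df["x1"] + 0.4 * df["x2"] + rng.normal(size=n)

res = smf.ols("y ~ x1 + x2", data=df).fit()
k = 2
F_manual = (res.rsquared / k) / ((1 - res.rsquared) / (n - k - 1))
print(f"automatic F = {res.fvalue:.3f}, from R^2 formula = {F_manual:.3f}")
print(f"p-value = {res.f_pvalue:.4g}")
```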

4.5 Testing General Linear Restrictions
-Sometimes economic theory (often stated in terms of elasticities) requires us to test more complicated joint restrictions, such as:
-Which expects our model:
-To be of the form:

4.5 Testing General Linear Restrictions
-We rewrite this expected model to obtain a restricted model (a hypothetical illustration follows below):
-We then calculate the F statistic using the SSR formula
-note that since the dependent variable changes between the two models, the R2 form of the F statistic is not valid in this case
-note that the number of restrictions (q) is simply equal to the number of equality signs in the null hypothesis
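As an illustration (this particular restriction is hypothetical, not the slide's own example): testing a unit coefficient on x1 together with the exclusion of x2 gives q = 2 and a restricted model whose dependent variable changes:

```latex
% Hypothetical example of general linear restrictions (q = 2)
H_0:\ \beta_1 = 1,\ \beta_2 = 0
\quad\text{in}\quad
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u
% Imposing H_0 and moving the known term to the left-hand side gives the
% restricted model:
y - x_1 = \beta_0 + \beta_3 x_3 + u
% The dependent variable is now (y - x_1), so only the SSR form of F is valid.
```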

4.6 Reporting Regression Results
-When reporting single regressions, the proper reporting method is to write out the estimated equation, with standard errors in parentheses below each coefficient
-R2, the estimated coefficients, and n MUST be reported (note also the hats and the i subscripts)
-either standard errors or t-values must also be reported (standard errors are more useful for tests other than Bk = 0)
-SSR and the standard error of the regression can also be reported
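A sketch of the standard layout (the numbers and variables are hypothetical, for illustration only):

```latex
% Hypothetical reported regression, standard errors in parentheses
% (\underset requires amsmath)
\widehat{y}_i = \underset{(0.71)}{2.37} \;+\; \underset{(0.12)}{0.47}\,x_{1i}
              \;-\; \underset{(0.05)}{0.09}\,x_{2i}
\qquad n = 142,\quad R^2 = 0.38
```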

4.6 Reporting Regression Results
-When multiple, related regressions are run (often to test for joint significance), the results can be expressed in table format, as seen on the next slide
-whether the simple or the table reporting method is used, the meanings and scaling of all the included variables must always be explained in a proper project, ie:
price: average price, measured weekly, in American dollars
college: dummy variable; 0 if no college education, 1 if college education

4.6 Reporting Regression Results

Dependent variable: Midterm readiness

Ind. variables    (1)            (2)
Study Time        0.47 (0.12)    -
Intellect         1.89 (1.7)     2.36 (1.4)
Intercept         2.5 (0.03)     2.8 (0.02)
Observations
R2