Econ 488 Lecture 5 – Hypothesis Testing Cameron Kaplan.


Classical Assumptions
1. The regression is linear, correctly specified, and has an additive error term.
2. $E(\varepsilon_i) = 0$
3. The correlation between $X_{ki}$ and $\varepsilon_i$ is 0 for all k.
4. $\varepsilon_t$ is uncorrelated with $\varepsilon_{t+1}$ for all t.
5. $Var(\varepsilon_i) = \sigma^2$ (no heteroskedasticity)
6. No perfect multicollinearity
and sometimes:
7. $\varepsilon_i \sim N(0, \sigma^2)$

Sampling Distribution of $\hat{\beta}$
$\hat{\beta}$ is assumed to be normally distributed because the stochastic error is assumed to be normally distributed (assumption 7).
Usually, we take a sample of size N from a population to produce a single estimate of β, which we call $\hat{\beta}$.
But what if we took a different sample? We would get a different value of $\hat{\beta}$.

Sampling Distribution of β

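In OLS, $\hat{\beta}$ is unbiased, so $E(\hat{\beta}) = \beta$.
OLS estimators also have the smallest variance of any linear unbiased estimator at any sample size (efficiency).
Finally, OLS estimators are consistent: as N increases, the variance of $\hat{\beta}$ shrinks, and as $N \to \infty$, $\hat{\beta} \to \beta$.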
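The idea of repeated sampling can be illustrated with a small simulation. This is only a sketch under assumed values: the true coefficients (`beta0=1.0`, `beta1=2.0`), the error standard deviation, and the uniform regressor are all made up for illustration and do not come from the lecture. It draws many samples, estimates the slope in each, and compares the spread of the estimates at two sample sizes.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope estimate from an OLS fit of y on a single regressor x (with intercept)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def simulate_slopes(n, reps, beta0=1.0, beta1=2.0, sigma=1.0, seed=0):
    """Draw `reps` samples of size n and return the slope estimate from each one."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        # Normally distributed error term, as in assumption 7
        y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes

small = simulate_slopes(n=20, reps=2000)
large = simulate_slopes(n=200, reps=2000)

# The average estimate should sit near the assumed true slope of 2.0 (unbiasedness),
# and the variance should be smaller for the larger sample size (consistency).
print("N=20 :", statistics.fmean(small), statistics.variance(small))
print("N=200:", statistics.fmean(large), statistics.variance(large))
```

Each entry in `small` or `large` plays the role of one $\hat{\beta}$ from one hypothetical sample, so a histogram of either list would approximate the sampling distribution.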

Consistency

Hypothesis Testing
Most of the time we take only one sample, so we get only one estimate, $\hat{\beta}$. How do we know whether $\hat{\beta}$ is meaningful if we can observe only one value from its distribution?

Example
Suppose we are interested in whether school size has an effect on student performance. Specifically, do students at small schools do better? We estimate the following equation:
$\text{math10}_i = \beta_0 + \beta_1 \text{enroll}_i + \beta_2 \text{staff}_i + \beta_3 \text{totcomp}_i + \varepsilon_i$

Example
$\text{math10}_i = \beta_0 + \beta_1 \text{enroll}_i + \beta_2 \text{staff}_i + \beta_3 \text{totcomp}_i + \varepsilon_i$
Where:
math10 = % of students passing the 10th grade math portion of the Michigan Educational Assessment Program (MEAP) test
enroll = school size
staff = number of staff per 1,000 students (to control for how much attention students get)
totcomp = average annual teacher compensation (to control for teacher quality)

Hypothesis Testing
We need to develop a null and an alternative hypothesis before running the regression.
Null Hypothesis ($H_0$): Usually, you want to reject the null hypothesis. The most common null hypothesis is "there is no effect of X on Y", or $\beta_1 = 0$.
Alternative Hypothesis ($H_A$ or $H_1$): Usually, what you are trying to show.

Hypothesis Testing
In our example, we would pick:
$H_0: \beta_1 \ge 0$ ("There is no negative effect of school size on student performance")
$H_A: \beta_1 < 0$ ("There is a negative effect of school size on student performance")
Test this using meap93.gdt.

Example 2
Consider the wage equation
$\log(\text{wage}_i) = \beta_0 + \beta_1 \text{educ}_i + \beta_2 \text{exper}_i + \beta_3 \text{tenure}_i + \varepsilon_i$
The null hypothesis $H_0: \beta_2 = 0$ says that once education and tenure have been accounted for, the number of years in the workforce has no effect on hourly wage.
If $\beta_2 > 0$, prior work experience contributes to productivity, and therefore to wage.

Alternative Hypothesis
Usually, we want to reject the null hypothesis. The null hypothesis contains the values we don't expect; the alternative hypothesis contains the values we do expect.
One-sided alternatives: we expect a particular sign on a variable's coefficient based on our economic model, e.g. $H_A: \beta_k > 0$.

Hypothesis Testing
$\log(\text{wage}_i) = \beta_0 + \beta_1 \text{educ}_i + \beta_2 \text{exper}_i + \beta_3 \text{tenure}_i + \varepsilon_i$
In our example, we might set our hypotheses as
$H_0: \beta_2 \le 0$
$H_A: \beta_2 > 0$
We believe that the effect of experience on wages is positive, holding education and tenure fixed.

Hypothesis Testing
$\log(\text{wage}_i) = \beta_0 + \beta_1 \text{educ}_i + \beta_2 \text{exper}_i + \beta_3 \text{tenure}_i + \varepsilon_i$
What should the null and alternative hypotheses for the other coefficients be?
$H_0: \beta_1 \le 0$, $H_A: \beta_1 > 0$
$H_0: \beta_3 \le 0$, $H_A: \beta_3 > 0$

Two-sided alternatives
$Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki} + \varepsilon_i$
$H_0: \beta_1 = 0$
$H_A: \beta_1 \ne 0$
Under the alternative, $X_{1i}$ has an effect on the dependent variable, without specifying whether it is positive or negative.
You should use this if you don't know what sign $\beta_k$ has (not well defined by theory).
Or, sometimes a two-sided test is the better choice because it prevents us from forming our hypothesis after looking at the results.

Other Hypotheses
Although $H_0: \beta_k = 0$ is the most common null hypothesis, sometimes we want to test whether $\beta_k$ is equal to some other constant, usually 1 or -1.
Example: Suppose we want to look at the effect of college enrollment on crime.
$\log(\text{crime}_i) = \beta_0 + \beta_1 \log(\text{enroll}_i) + \varepsilon_i$
This is a constant elasticity model, where $\beta_1$ is the elasticity of crime with respect to enrollment.

Other Hypotheses
$\log(\text{crime}_i) = \beta_0 + \beta_1 \log(\text{enroll}_i) + \varepsilon_i$
We could test $H_0: \beta_1 = 0$ against $H_A: \beta_1 \ne 0$.
But it would be more interesting to test whether $\beta_1 = 1$.
If $\beta_1 > 1$, then a 1% increase in enrollment leads to a greater than 1% increase in crime, so crime is a bigger problem at large campuses.
Set up our hypotheses as follows:
$H_0: \beta_1 = 1$
$H_A: \beta_1 \ne 1$

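t-test
$Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki} + \varepsilon_i$
t-statistic:
$t_k = \dfrac{\hat{\beta}_k - \beta_{H_0}}{SE(\hat{\beta}_k)}$
$\hat{\beta}_k$ = the estimated regression coefficient of the kth variable
$\beta_{H_0}$ = the border value (usually zero) implied by the null hypothesis
$SE(\hat{\beta}_k)$ = the estimated standard error of the coefficient on the kth variable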

t-test
For example, suppose our hypotheses were:
$H_0: \beta_1 = 0$
$H_A: \beta_1 > 0$
Then suppose that we estimate $\hat{\beta}_1 = 6$ and $SE(\hat{\beta}_1) = 2$.
We would calculate t as
$t_1 = \dfrac{6 - 0}{2} = 3$
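As a quick check of the arithmetic, the t-statistic can be written as a one-line function. The function and argument names here are illustrative choices, not part of the lecture.

```python
def t_statistic(beta_hat, beta_null, std_err):
    """t = (estimate - border value under H0) / estimated standard error."""
    return (beta_hat - beta_null) / std_err

# Values from the lecture example: estimate 6, null value 0, standard error 2
t = t_statistic(beta_hat=6.0, beta_null=0.0, std_err=2.0)
print(t)  # 3.0
```

Passing `beta_null=1.0` instead would give the statistic for a null such as $H_0: \beta_1 = 1$.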

How does the t-test work?
[Figure: distribution of $\hat{\beta}_1$ if the null is true; horizontal axis $\beta_1$]
Suppose we found a value of $\hat{\beta}_1$ way out in the tail. Then it's not very likely that the null hypothesis is true.

t-test
How does this look for our example, where $\hat{\beta}_1 = 6$?

t-test
We want to know: if $H_0$ really is true (i.e. $\beta_1$ really is 0), how likely is it that we would have observed a value of $\hat{\beta}_1$ as large as 6? Not very. We can probably say that $H_0$ is not true. But we need a rule to decide.

Hypothesis Testing
How do we decide when to reject the null?
Choose a level of significance. Rule of thumb: 5% level of significance.
This means that we will reject $H_0$ if, were $H_0$ true, we would expect a value of $\hat{\beta}_1$ at least as extreme as 6 less than 5% of the time.
Instead of trying to figure out this probability using the sampling distribution of $\hat{\beta}_1$, we transform the estimate into a t-statistic, which follows the t-distribution.
The t-distribution is almost the same as the standard normal distribution; it has slightly fatter tails, and the difference shrinks as the degrees of freedom increase.

t-test
In our example, $t = (6 - 0)/2 = 3$.
Suppose our sample size was 23.
We need to compare our t-statistic to the critical t-value, which separates the acceptance region from the rejection region. Look at the inside cover of the book.
We want the t-value for $N - K - 1 = 20$ degrees of freedom. For a one-sided test at 5% significance, this is $t_c = 1.725$.
Decision rule: Reject $H_0$ if $|t_k| > t_c$ and $t_k$ has the sign implied by $H_A$; otherwise do not reject.
Here, $3 > 1.725$ and t is positive, so we reject the null in favor of the alternative, suggesting that $X_1$ is significant.
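The decision rule can be sketched as a small function. The name `reject_one_sided` and the `expected_sign` argument are hypothetical; the critical value 1.725 is the one-sided 5% value for 20 degrees of freedom quoted in the lecture.

```python
def reject_one_sided(t_stat, t_crit, expected_sign):
    """One-sided decision rule.

    expected_sign is +1 if H_A says the coefficient is positive,
    and -1 if H_A says it is negative.
    Reject H0 only if |t| exceeds the critical value AND t has the sign implied by H_A.
    """
    has_expected_sign = t_stat * expected_sign > 0
    return abs(t_stat) > t_crit and has_expected_sign

# Lecture example: t = 3, one-sided 5% critical value with 20 degrees of freedom
reject = reject_one_sided(t_stat=3.0, t_crit=1.725, expected_sign=+1)
print(reject)  # True
```

A large t-statistic with the wrong sign, for example `reject_one_sided(-3.0, 1.725, +1)`, does not lead to rejection under this rule.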

Choosing a Level of Significance
Rule of thumb: significance level = 5%.
If the significance level is too high, we risk what is called a Type I error, where we reject the null hypothesis when it is actually true. If it is too low, we risk a Type II error, where we fail to reject the null hypothesis when it is actually false.
If we reject $H_0$ at the 5% level, we say that the coefficient is "statistically significant at the 5% level".
Sometimes researchers use asterisks:
* means significant at 10%
** means significant at 5%
*** means significant at 1%
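The asterisk convention can be sketched as a lookup against critical values. This example assumes a two-sided test with 20 degrees of freedom; the values 1.725 and 2.086 appear in the lecture, while 2.845 (the two-sided 1% value) is taken from a standard t-table and is an addition here. The function name is hypothetical.

```python
# Two-sided critical t-values for 20 degrees of freedom, strictest level first:
# 1% -> 2.845, 5% -> 2.086, 10% -> 1.725
CRITICAL_VALUES_DF20 = [(2.845, "***"), (2.086, "**"), (1.725, "*")]

def significance_stars(t_stat, critical_values=CRITICAL_VALUES_DF20):
    """Return the asterisks for the strictest level at which |t| exceeds the critical value."""
    for t_crit, stars in critical_values:
        if abs(t_stat) > t_crit:
            return stars
    return ""  # not significant even at the 10% level

print(significance_stars(3.0))  # ***
print(significance_stars(2.5))  # **
print(significance_stars(1.0))  # prints an empty line
```

For a different number of degrees of freedom, a different list of critical values would need to be passed in.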

Confidence Intervals
Confidence interval: a range constructed so that it contains the true population value a specified percent of the time.
$CI = \hat{\beta}_k \pm t_c \cdot SE(\hat{\beta}_k)$
Using the two-sided critical t-value at a given significance level gives the (1 - significance level) confidence interval. So the 5% significance level corresponds to the 95% CI.

Confidence Intervals
For our example, the two-sided 5% critical t-value with 20 degrees of freedom is 2.086.
So the 95% CI = 6 ± 2 × 2.086 = 6 ± 4.172, or 1.828 to 10.172.
We could say that, with 95% confidence, the true value of β is between 1.828 and 10.172.
Notice that 0 is not in this range, so we can reject $H_0$.
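The interval arithmetic can be reproduced in a few lines. The function name is an illustrative choice; the inputs are the lecture's estimate (6), standard error (2), and two-sided 5% critical value for 20 degrees of freedom (2.086).

```python
def confidence_interval(beta_hat, std_err, t_crit):
    """Return (lower, upper) = estimate -/+ critical value * standard error."""
    margin = t_crit * std_err
    return beta_hat - margin, beta_hat + margin

low, high = confidence_interval(beta_hat=6.0, std_err=2.0, t_crit=2.086)
print(round(low, 3), round(high, 3))  # 1.828 10.172

# The null value 0 lies outside the interval, which matches rejecting H0
print(low <= 0 <= high)  # False
```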

P-value
An alternative to comparing the t-statistic with a critical value.
If the true population value was really 0, what is the probability that we would have observed a value as extreme as 6?
If p is small, reject the null.
This is calculated automatically by most econometrics software.
Reject the null if p is less than the significance level.
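Econometrics software reports the p-value directly, but as a rough sketch of what is being computed, the tail probability of the t-distribution can be approximated by numerical integration using only the standard library. The function names and the choice of Simpson's rule are assumptions made for illustration; in practice a statistics library would be used.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t_stat, df, steps=2000):
    """Approximate P(T >= t_stat) for t_stat >= 0.

    Integrates the density from 0 to t_stat with Simpson's rule (steps must be even)
    and subtracts the result from 0.5, the area to the right of zero.
    """
    h = t_stat / steps
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

# Lecture example: t = 3 with 20 degrees of freedom
p_one_sided = upper_tail_p(3.0, 20)
p_two_sided = 2 * p_one_sided
print(p_one_sided, p_two_sided)
print(p_one_sided < 0.05)  # True: reject H0 at the 5% level
```

As a sanity check, `upper_tail_p(1.725, 20)` should come out very close to 0.05, since 1.725 is the one-sided 5% critical value for 20 degrees of freedom.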

Example Student performance and school size using data.

F-test (Appendix Ch. 5)
What if you want to test a hypothesis that involves multiple coefficients?
For example, suppose we run this regression (data7-2.gdt):
$\text{wage}_i = \beta_0 + \beta_1 \text{educ}_i + \beta_2 \text{exper}_i + \beta_3 \text{clerical}_i + \beta_4 \text{maint}_i + \beta_5 \text{crafts}_i + \varepsilon_i$
clerical, maint, and crafts are job type "dummies".
We want to test whether job type matters, so we need to test whether $\beta_3$, $\beta_4$, and $\beta_5$ are "jointly significant".
$H_0: \beta_3 = \beta_4 = \beta_5 = 0$
$H_A$: The null hypothesis is not true.

F-test Steps
1. Run the full regression, get RSS
2. Run the constrained regression (without the job type variables), get $RSS_M$
$F = \dfrac{(RSS_M - RSS)/M}{RSS/(N - K - 1)}$
RSS = RSS from step 1
$RSS_M$ = RSS from step 2
M = number of excluded coefficients
N = number of observations
K = number of coefficients in the overall equation (not counting the intercept)

F-stat
Calculate the F-stat and compare it to the critical value of F (from the F-table).
Degrees of freedom, numerator = M
Degrees of freedom, denominator = N - K - 1
If $F > F_{crit}$, reject the null hypothesis.
The variables are jointly significant if you can reject the null.
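The F-statistic calculation can be sketched as follows. The RSS values and sample size below are made-up numbers chosen only to make the arithmetic easy to follow; they are not results from data7-2.gdt. Only M = 3 and K = 5 match the wage example.

```python
def f_statistic(rss_full, rss_constrained, m, n, k):
    """F = ((RSS_M - RSS) / M) / (RSS / (N - K - 1)).

    rss_full        : RSS from the full regression
    rss_constrained : RSS_M from the regression without the tested variables
    m               : number of excluded coefficients
    n               : number of observations
    k               : number of slope coefficients in the full equation
    """
    numerator = (rss_constrained - rss_full) / m
    denominator = rss_full / (n - k - 1)
    return numerator / denominator

# Hypothetical inputs: 3 excluded dummies, 5 slope coefficients, 46 observations
f = f_statistic(rss_full=100.0, rss_constrained=130.0, m=3, n=46, k=5)
print(f)  # 4.0, to be compared with the critical F for (3, 40) degrees of freedom
```

The constrained regression can never fit better than the full one, so `rss_constrained` is at least as large as `rss_full` and F is never negative.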

F-test in Gretl
Run the model.
Select Tests > Omit variables.
This gives the F-stat and the related p-value.