Lecture 8: Hypothesis Testing

Hypothesis Testing (Chapter 7.1–7.2, 7.4)
Distribution of Estimators (Chapter 5.1–5.2, Chapter 6.4)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved.

Agenda for Today
Hypothesis Testing (Chapter 7.1)
Distribution of Estimators (Chapter 5.2)
Estimating σ² (Chapter 5.1, Chapter 6.4)
t-tests (Chapter 7.2)
P-values (Chapter 7.2)
Power (Chapter 7.2)

What Sorts of Hypotheses to Test?
To test a hypothesis, we first need to specify our “null hypothesis” precisely, in terms of the parameters of our regression model. We refer to this “null hypothesis” as H0. We also need to specify our “alternative hypothesis,” Ha, in terms of our regression parameters.

What Sorts of Hypotheses to Test? (cont.)
Claim: The marginal propensity to consume is greater than 0.70. Conduct a one-sided test of the null hypothesis H0 : β1 > 0.70 against the alternative Ha : β1 ≤ 0.70.

Claim: The marginal propensity to consume equals the average propensity to consume. Conduct a two-sided test of H0 : β0 = 0 against the alternative Ha : β0 ≠ 0.

The CAPM model from finance says that a fund's expected excess return is proportional to the market's excess return. Regress the fund's excess return on the market's excess return, with intercept β0, for a particular mutual fund, using data over time. Test H0 : β0 > 0. If β0 > 0, the fund performs better than expected, said early analysts. If β0 < 0, the fund performs less well than expected.

Hypothesis Testing: Errors
In our CAPM example, we are testing H0 : β0 > 0 against the alternative Ha : β0 ≤ 0. We can make two kinds of mistakes. Type I Error: We reject the null hypothesis when the null hypothesis is “true.” Type II Error: We fail to reject the null hypothesis when the null hypothesis is “false.”

Hypothesis Testing: Errors (cont.)
Type I Error: Reject the null hypothesis when it is true. Type II Error: Fail to reject the null hypothesis when it is false. We need a rule for deciding when to reject a null hypothesis. To make a rule with a lower probability of Type I error, we have to have a higher probability of Type II error.
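
The trade-off between the two error types can be illustrated numerically. What follows is a minimal sketch, not taken from the slides. It assumes a left-tailed test whose statistic is standard normal under the null and normal with mean `shift` and unit variance under one particular alternative. The function name and the value `shift = -2.0` are hypothetical choices for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed distribution of the statistic under the null

def type_ii_probability(alpha, shift):
    """Probability of failing to reject in a left-tailed test of size alpha
    when the statistic is actually N(shift, 1) (an illustrative alternative)."""
    critical_value = STD_NORMAL.inv_cdf(alpha)  # reject when statistic < critical_value
    return 1.0 - NormalDist(mu=shift, sigma=1.0).cdf(critical_value)

# A stricter significance level (smaller alpha) raises the Type II error probability.
for alpha in (0.10, 0.05, 0.01):
    prob = type_ii_probability(alpha, shift=-2.0)
    print(f"alpha = {alpha:.2f}  P(Type II) = {prob:.3f}")
```

Holding the alternative fixed, moving the critical value further into the tail shrinks the rejection region, so the same data reject less often whether or not the null is true.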

Type I Error: Reject the null hypothesis when it is true. Type II Error: Fail to reject the null hypothesis when it is false. In practice, we build rules to have a low probability of a Type I error. Null hypotheses are “innocent until proven guilty beyond a reasonable doubt.”

Type I Error: Reject the null hypothesis when it is true. Type II Error: Fail to reject the null hypothesis when it is false. We do NOT ask whether the null hypothesis is more likely than the alternative hypothesis. We DO ask whether we can build a compelling case to reject the null hypothesis.

Hypothesis Testing
What constitutes a compelling case to reject the null hypothesis? If the null hypothesis were true, would we be extremely surprised to see the data that we see?

Hypothesis Testing: Errors
In our CAPM example, the null hypothesis is H0 : β0 > 0. What if we run our regression and find an estimate of β0 slightly below 0? Could a reasonable jury reject the null hypothesis if the estimate is “just a little lower” than 0?

Type I Error: Reject the null hypothesis when it is true. Type II Error: Fail to reject the null hypothesis when it is false. In our CAPM example, our null hypothesis is β0 > 0. Can we use our data to amass overwhelming evidence that this null hypothesis is false?

Note: if we “fail to reject” the null, it does NOT mean we can “accept” the null hypothesis. “Failing to reject” means only that the case against the null leaves “reasonable doubt.” The null hypothesis could still be fairly unlikely, just not overwhelmingly unlikely.

Hypothesis Testing: Strategy
Our Strategy: Look for a Contradiction. Assume the null hypothesis is true. Calculate the probability that we see the data, assuming the null hypothesis is true. Reject the null hypothesis if this probability is just too darn low.

How Should We Proceed?
Ask how our estimates of β0 and β1 are distributed if the null hypothesis is true. Determine a test statistic. Settle upon a critical region to reject the null hypothesis if the probability of seeing our data is too low.

How Should We Proceed? (cont.)
The key tool we need is the probability of seeing our data if the null hypothesis is true. We need to know the distribution of our estimators.

Distribution of a Linear Estimator (from Chapter 5.2)

Distribution of a Linear Estimator (cont.)
Perhaps the most common hypothesis test is H0 : β = 0 against Ha : β ≠ 0. This hypothesis tests whether a variable has any effect on Y. We will begin by calculating the variance of our estimator for the coefficient on X1.

Hypothesis Testing
Add to the Gauss–Markov Assumptions: the disturbances are normally distributed.

How Should We Proceed?
Ask how our estimates of β0 and β1 are distributed. Since the Yi are distributed normally, all linear estimators are, too.

What Is the Variance of β̂1? (cont.)
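
The claim that a linear estimator inherits a normal distribution, with a variance we can compute, can be checked by simulation. The sketch below is not from the slides. It assumes a one-regressor model with an intercept, fixed X values, and normal disturbances, and it uses the standard textbook result that the OLS slope then has standard deviation σ / sqrt(Σ(Xi − X̄)²). All numerical values (`beta0`, `beta1`, `sigma`, the X grid, the number of replications) are hypothetical.

```python
import random
import statistics

random.seed(0)

x = [float(i) for i in range(1, 21)]         # fixed regressor values (hypothetical)
x_bar = statistics.fmean(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)     # sum of squared deviations of X
beta0, beta1, sigma = 1.0, 0.5, 2.0          # hypothetical true parameters

def ols_slope(y):
    """Ordinary least squares slope for the fixed x values above."""
    y_bar = statistics.fmean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Draw many samples of Y with normal disturbances and re-estimate the slope each time.
slopes = []
for _ in range(5000):
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    slopes.append(ols_slope(y))

print("mean of slope estimates:", statistics.fmean(slopes))
print("simulated std. deviation:", statistics.stdev(slopes))
print("theoretical std. deviation:", sigma / sxx ** 0.5)
```

The simulated mean should sit close to the true slope and the simulated standard deviation close to the theoretical one, up to Monte Carlo noise.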

Distribution of a Linear Estimator
We have a formula for the distribution of our estimator. However, this formula is not in a very convenient form. We would really like a formula that gives a distribution for which we can look up the probabilities in a common table.

Test Statistics
A “test statistic” is a statistic:
Readily calculated from the data
Whose distribution is known (under the null hypothesis)
Using a test statistic, we can compute the probability of observing the data given the null hypothesis.

Test Statistics (cont.)

Estimating σ² (from Chapter 5.1, Chapter 6.4)

Estimating σ² (cont.)
We need to estimate the variance of the error terms, σ². Problem: we do not observe εi directly. Another problem: we do not know β0…βk, so we cannot calculate εi either.

Estimating σ² (cont.)
Once we have estimates of the error terms (the residuals), we can calculate an estimate of the variance of the error term. We need to make a “degrees of freedom” correction.
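
As an illustration of the degrees-of-freedom correction, the sketch below fits a one-regressor model (so k = 1) and divides the sum of squared residuals by n − k − 1 rather than by n. It is a minimal example, not the slides' own derivation; the function name and the data are hypothetical.

```python
def residual_variance(x, y):
    """Fit y = b0 + b1*x by least squares and estimate the error variance.

    Returns (corrected estimate, uncorrected estimate, residuals)."""
    n, k = len(x), 1                          # k = number of slope coefficients
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in residuals)      # sum of squared residuals
    return ssr / (n - k - 1), ssr / n, residuals

x = [1.0, 2.0, 3.0, 4.0, 5.0]                 # hypothetical data
y = [2.1, 3.9, 6.2, 7.8, 10.1]                # hypothetical data

s_squared, uncorrected, residuals = residual_variance(x, y)
print(f"corrected s^2 = {s_squared:.5f}, uncorrected = {uncorrected:.5f}")
```

The corrected estimate is always the larger of the two, because estimating the k + 1 coefficients makes the residuals fit the sample more tightly than the true errors would.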

Standard Error (from Chapter 5.2)
Remember, the standard deviation of the distribution of our estimator is called the “standard error.” The smaller the standard error, the more tightly your estimates will tend to cluster around the mean of the distribution.

Standard Error (from Chapter 5.2)
If your estimator is unbiased, a low standard error implies that your estimate is probably “close” to the true parameter value.

t-statistic (from Chapter 7.2)
Because we need to estimate the standard error, the t-statistic is NOT distributed as a Standard Normal. Instead, it is distributed according to the t-distribution. The t-distribution depends on the degrees of freedom, n − k − 1. For large n − k − 1, the t-distribution closely resembles the Standard Normal.

t-statistic (cont.)
Under the null hypothesis H0 : β1 = β1*, the statistic t = (β̂1 − β1*) / s.e.(β̂1) follows a t-distribution with n − k − 1 degrees of freedom.
In our earlier example, we could:
Replace β1* with 0.70.
Compare t to the “critical value” for which the t-distribution with n − 2 degrees of freedom has .05 of its probability mass lying to the left.
There is less than a 5% chance of observing a t-statistic this low under the null if t < “critical value.”
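
The calculation for the marginal-propensity-to-consume example can be sketched as follows. This is an illustrative example under stated assumptions, not the slides' own computation: the income and consumption figures are made up, the function name is hypothetical, and the critical value is the 5% left-tail value for 3 degrees of freedom as read from a standard t table.

```python
import math

def slope_t_statistic(x, y, hypothesized_slope):
    """Return (slope estimate, its standard error, t-statistic) for a
    one-regressor model with an intercept, so the degrees of freedom are n - 2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s_squared = sum(e ** 2 for e in residuals) / (n - 2)   # estimate of the error variance
    std_error = math.sqrt(s_squared / sxx)                 # estimated standard error of the slope
    t_stat = (slope - hypothesized_slope) / std_error
    return slope, std_error, t_stat

income = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical data
consumption = [0.9, 1.5, 2.3, 2.8, 3.6]     # hypothetical data

slope, std_error, t_stat = slope_t_statistic(income, consumption, hypothesized_slope=0.70)

# 5% left-tail critical value for n - 2 = 3 degrees of freedom (from a t table).
CRITICAL_VALUE = -2.353
reject_null = t_stat < CRITICAL_VALUE
print(f"slope = {slope:.3f}, s.e. = {std_error:.4f}, t = {t_stat:.3f}, reject H0: {reject_null}")
```

With these made-up numbers the estimated slope is below 0.70, but not by enough relative to its standard error to reject the null at the 5% level.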

Significance Level
We can now calculate the probability of observing the data IF the null hypothesis is true. We choose the maximum chance we are willing to risk that we accidentally commit a Type I Error (reject a null hypothesis when it is true). This chance is called the “significance level.”

Significance Level (cont.)
We choose the probability we are willing to accept of a Type I Error. This probability is the “Significance Level.” The significance level gives operational meaning to how compelling a case we need to build.

The significance level denotes the chance of committing a Type I Error. By historical convention, we usually reject a null hypothesis if we have less than a 5% chance of observing the data under the null hypothesis.

Critical Region
We know the distribution of our test statistic under the null hypothesis. We can calculate the values of the test statistic for which we would reject the null hypothesis (i.e., values that we would have less than a 5% chance of observing under the null hypothesis).

Critical Region (cont.)
We can calculate the values of the test statistic for which we would reject the null hypothesis. These values are called the “critical region.”

Critical Region
Regression packages routinely report estimated coefficients, their estimated standard errors, and the t-statistics associated with the null hypothesis that an individual coefficient is equal to zero. Some programs also report a “p-value” for each estimated coefficient. This reported p-value is the smallest significance level for a two-sided test at which one would reject the null that the coefficient is zero.

One-Sided, Two-Sided Tests
t-tests come in two flavors: 1-sided and 2-sided.
2-sided tests are much more common: H0 : β = β*, Ha : β ≠ β*.
1-sided tests look at only one side: H0 : β > β*, Ha : β ≤ β*.

One-Sided, Two-Sided Tests (cont.)
The procedure for both 1-sided and 2-sided tests is very similar. For either test, you construct the same t-statistic: t = (β̂ − β*) / s.e.(β̂).

One-Sided, Two-Sided Tests (cont.)
Once you have your t-statistic, you need to choose a “critical value.” The critical value is the boundary point for the critical region. You reject the null hypothesis if your t-statistic falls beyond the critical value in the direction of the alternative; for a 2-sided test, that means it is greater in magnitude than the critical value. The choice of critical value depends on the type of test you are running.

Critical Value for 1-Sided Test
For a 1-sided test, you need a critical value such that α of the distribution of the test statistic is greater than (or less than) the critical value. α is our significance level (for example, 5%).

Critical Value for 1-Sided Test (cont.)
In our CAPM example, we want to test H0 : β0 > 0 against Ha : β0 ≤ 0. We need a critical value t* such that α of the distribution of our test statistic is less than t*.

Critical Value of a 1-Sided Test (cont.)
For a 5% significance level and a large sample size, t* = -1.64. We reject the null hypothesis if t < -1.64.

Critical Value for a 2-Sided test
For a 2-sided test, we need to spread our critical region over both tails. We need a critical value t* such that:
α/2 of the distribution is to the right of t*
α/2 of the distribution is to the left of -t*
Summing both tails, α of the distribution is beyond either t* or -t*.

Critical Value for 2-Sided Test
For a large sample size, the critical value for a 2-sided test at the 5% level is 1.96. You reject the null hypothesis if |t| > 1.96.
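
The large-sample critical values quoted for the 1-sided and 2-sided tests can be reproduced from the Standard Normal distribution. The sketch below relies on the large-sample approximation described earlier (the t-distribution resembling the Standard Normal); it does not compute exact t critical values for small samples. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def large_sample_critical_values(alpha):
    """Return (left-tail 1-sided critical value, 2-sided critical value)
    under the Standard Normal approximation."""
    one_sided_left = STD_NORMAL.inv_cdf(alpha)          # alpha of the mass lies to the left
    two_sided = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)   # alpha/2 of the mass lies to the right
    return one_sided_left, two_sided

def reject_one_sided_left(t_stat, alpha=0.05):
    """Reject when the t-statistic falls in the left tail."""
    return t_stat < large_sample_critical_values(alpha)[0]

def reject_two_sided(t_stat, alpha=0.05):
    """Reject when the t-statistic falls in either tail."""
    return abs(t_stat) > large_sample_critical_values(alpha)[1]

one_sided, two_sided = large_sample_critical_values(0.05)
print(round(one_sided, 2), round(two_sided, 2))  # prints -1.64 1.96
```

A t-statistic of -1.8, for example, would lead to rejection in the left-tailed test but not in the 2-sided test, because the 2-sided test splits the 5% across both tails.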

P-values
The p-value is the smallest significance level for which you could reject the null hypothesis. The smaller the p-value, the stricter the significance level at which you can reject the null hypothesis.

P-values (cont.)
Many statistics packages automatically report the p-value for a two-sided test of the null hypothesis that a coefficient is 0. If p < 0.05, then you could reject the null that β = 0 at a significance level of 0.05. The coefficient “is significant at the 95% confidence level.”
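
A two-sided p-value can be sketched directly from its definition as the probability, under the null, of a t-statistic at least as extreme as the one observed. The example assumes a large sample, so the Standard Normal stands in for the t-distribution; the function name is hypothetical, and actual statistics packages typically use the exact t-distribution instead.

```python
from statistics import NormalDist

def two_sided_p_value(t_stat):
    """Large-sample two-sided p-value: the mass in both tails beyond |t|."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))

for t in (0.5, 1.96, 3.0):
    print(f"t = {t:.2f}  p = {two_sided_p_value(t):.4f}")
```

A t-statistic of 1.96 gives a p-value of about 0.05, which matches the 2-sided critical value at the 5% level.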

Statistical Significance
A coefficient is “statistically significant at the 95% confidence level” if we could reject the null that β = 0 at the 5% significance level. In economics, the word “significant” means “statistically significant” unless otherwise qualified.

Power
Type I Error: reject a null hypothesis when it is true.
Type II Error: fail to reject a null hypothesis when it is false.
We have devised a procedure based on choosing the probability of a Type I Error. What about Type II Errors?

Power (cont.)
The probability that our hypothesis test rejects a null hypothesis when it is false is called the Power of the test. (1 – Power) is the probability of a Type II Error.

Power (cont.)
If a test has a low probability of rejecting the null hypothesis when that hypothesis is false, we say that the test is “weak” or has “low power.” The higher the standard error of our estimator, the weaker the test. More efficient estimators allow for more powerful tests.

Power (cont.)
Power depends on the particular Ha you are considering. The closer Ha is to H0, the harder it is to reject the null hypothesis.
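
Both points about power, its dependence on the standard error and on how far the alternative lies from the null, can be seen in a short calculation. The sketch assumes a 2-sided test in a large sample with a known standard error, so the t-statistic is treated as normal with mean (true β − null β) / s.e. and unit variance. The function name and the parameter values are hypothetical.

```python
from statistics import NormalDist

def two_sided_power(true_beta, std_error, null_beta=0.0, alpha=0.05):
    """Probability of rejecting H0: beta = null_beta when beta equals true_beta,
    using the large-sample normal approximation for the t-statistic."""
    z = NormalDist()
    z_star = z.inv_cdf(1.0 - alpha / 2.0)         # 2-sided critical value
    delta = (true_beta - null_beta) / std_error   # mean of the t-statistic under true_beta
    # Rejection happens in either tail: t < -z_star or t > z_star.
    return z.cdf(-z_star - delta) + (1.0 - z.cdf(z_star - delta))

# Power rises as the true value moves away from the null ...
for true_beta in (0.0, 0.5, 1.0, 2.0):
    print(f"true beta = {true_beta:.1f}  power = {two_sided_power(true_beta, std_error=1.0):.3f}")

# ... and as the standard error falls.
print(f"smaller s.e.: power = {two_sided_power(1.0, std_error=0.5):.3f}")
```

When the true value equals the null value, the rejection probability equals the significance level, which is the Type I error rate rather than power in the usual sense.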

Figure 7.2 Distribution of for βs = -2, 0, and a Little Less Than 0

Figure 7.3 Power Curves for Two-Tailed Tests of H0 : βs = 0

Figure SA.12 The Distribution of the t-Statistic Given the Null Hypothesis is False and = + 5

Figure SA.13 The t-Statistic’s Power When the Sample Size Grows

Review
To test a null hypothesis, we:
Assume the null hypothesis is true;
Calculate a test statistic, assuming the null hypothesis is true;
Reject the null hypothesis if we would be very unlikely to observe the test statistic under the null hypothesis.

Six Steps to Hypothesis Testing
1. State the null and alternative hypotheses
2. Choose a test statistic (so far, we have learned the t-test)
3. Choose a significance level, the probability of a Type I Error (typically 5%)

Six Steps to Hypothesis Tests (cont.)
4. Find the critical region for the test (for a 2-sided t-test at the 5% level in large samples, the critical value is t* = 1.96)
5. Calculate the test statistic
6. Reject the null hypothesis if the test statistic falls within the critical region
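
The six steps can be strung together in one small function. This is a sketch under the large-sample assumption (Standard Normal critical values), taking an estimate and its standard error as given. The function name, the `alternative` labels, and the CAPM-style numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist

def six_step_test(estimate, std_error, null_value, alpha=0.05, alternative="two-sided"):
    """Run a large-sample t-test and return (t-statistic, critical value, reject?).

    Steps 1-3 (hypotheses, choice of the t-test, significance level) are
    expressed through the arguments null_value, alternative, and alpha."""
    z = NormalDist()

    # Step 4: find the critical region.
    if alternative == "two-sided":
        critical_value = z.inv_cdf(1.0 - alpha / 2.0)
    elif alternative == "less":          # alternative lies below the null value
        critical_value = z.inv_cdf(alpha)
    else:
        raise ValueError("alternative must be 'two-sided' or 'less'")

    # Step 5: calculate the test statistic.
    t_stat = (estimate - null_value) / std_error

    # Step 6: reject if the statistic falls within the critical region.
    if alternative == "two-sided":
        reject = abs(t_stat) > critical_value
    else:
        reject = t_stat < critical_value
    return t_stat, critical_value, reject

# Hypothetical CAPM-style intercept estimate of -0.5 with standard error 0.25.
t_stat, critical_value, reject = six_step_test(-0.5, 0.25, null_value=0.0, alternative="less")
print(f"t = {t_stat:.2f}, critical value = {critical_value:.2f}, reject H0: {reject}")
```

With a smaller estimate in magnitude, say -0.3 with the same standard error, the t-statistic of -1.2 would not reach the critical region and the null would not be rejected.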
