6. Hypothesis Testing and the Comparison of 2 or More Populations (ASW Chapter 9 + Chapter 10)
2 A) Introduction Estimating parameters of a population –> hypothesis testing on our model. Flowchart boxes: Model or Theory; Deductions/Analysis; Testable Hypothesis; Data; Empirical Result; Reject Theory or Accept Theory (For Now); Modify Theory. Flowchart labels: Ch. 1-8, 12, 13; Economic Theory; Ch. 9-11, 12, 13; Tests.
3 Confidence Intervals and Hypothesis Testing Confidence intervals –> range that μ falls into. NOW: is μ > 0, or > 1, etc. OR: is μ1 > μ2? Testing for specific values of μ. We have a confidence interval for Saskatchewan female wages by age. What could we test here? We will have a confidence interval for Bachelor's salaries in Saskatchewan. What could we test here? Others: gasoline prices? Stock market fluctuations?
4 B) Developing Null and Alternative Hypotheses Start with a testable hypothesis. Point of interest: do older women get paid more? Economic theory: is 0 < MPC < 1 and constant? Define its opposite: Older women's salaries are ≤ average. MPC is ≥ 1. One is the Null hypothesis, the other is the Alternative hypothesis. Use sample data to test the Null hypothesis. What if it is not that simple to have 2 opposites?
5 Which is the Null? General rule: the hypothesis with the = sign or the ≤ or ≥ sign is the Null. OR: the Null is something we assume is true unless contradicted by the sample.
6 1. Research hypotheses: Testing an exception to the general rule, so it goes in the alternative. E.g., testing if older women's salaries (μ) > average: H 0 : μ ≤ μ(average), H A : μ > μ(average). Results will tell us either: If testing shows H 0 cannot be rejected (accepted) –> implies that older women's salaries are not higher, but we cannot be sure. If testing shows H 0 can be rejected –> we can infer H A is true, μ > μ(average).
7 2. Testing the Validity of a Claim Assume claim is true until disproven. E.g.: manufacturer's claim of weight/container. H 0 : μ ≥ 100 grams. H A : μ < 100 grams. Results will tell us either: If testing shows H 0 cannot be rejected (accepted) –> manufacturer's claim not challenged. If testing shows H 0 can be rejected –> we can infer the manufacturer's claim is false.
8 3. Testing in Decision-Making Here, if either too high or too low, need to do something. E.g.: is class length 75 minutes? H 0 : μ = 75 minutes. H A : μ ≠ 75 minutes. If H 0 not rejected (accepted), no change in behaviour. If H 0 rejected –> change behaviour.
9 C) Type I and Type II Errors Sample data could have errors.
10 Type I and Type II Errors Population condition H 0 true: Accept H 0 –> Correct Decision; Reject H 0 –> Type I Error (False Positive). Population condition H 0 false: Accept H 0 –> Type II Error (False Negative); Reject H 0 –> Correct Decision.
11 False Positive or False Negative? "New claims over bungled shooting of Brazilian", by Mark Sellman, Times Online, and Daniel McGrory. "U.K. police defend shoot-to-kill after mistake … Blair said Menezes had emerged from an apartment block in south London that had been under surveillance in connection with Thursday's attacks, and refused police orders to halt. Menezes had also been wearing an unseasonably heavy coat, further raising police suspicions." MSNBC, July 24th.
12 Other Type I and Type II Errors Sampling songs. Health tests. Pregnancy tests. Jury decisions.
13 Level of Significance Hypothesis testing is really designed to control the chance of a Type I error. Probability of Type I error = α = the level of significance. Selecting α (= level of significance) –> selects the probability of a Type I error. What is the level of significance for jury trials? We do not control for Type II errors –> except by our language of stating "do not reject H 0".
14 Level of Significance contd α is picked by the researcher –> normally 5%? α = 5% –> when H 0 is true, a Type I error happens only 5% of the time.
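To see what "α = 5% –> Type I error 5% of the time" means in practice, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not part of the slides: a normal population with true mean 0 and known σ = 1, samples of size 30, an upper-tail z-test, and the hypothetical function name `type_i_error_rate`. Because the Null is true by construction, every rejection is a Type I error.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=20000, seed=1):
    """Share of samples that wrongly reject a TRUE null (H0: mu <= 0).

    Hypothetical setup: normal population with mu = 0 and known sigma = 1,
    upper-tail z-test on the sample mean.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # upper-tail critical value
    std_error = 1.0 / n ** 0.5                  # sigma / sqrt(n) with sigma = 1
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        if sample_mean / std_error > z_crit:    # H0 is true, so this is a Type I error
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f} (alpha = 0.05)")
```

The observed rate should land close to 0.05; raising or lowering `alpha` moves it accordingly.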
15 D) The probability value (p-value) approach 1. Develop null and alternative hypothesis. 2. Select level of significance α. 3. Collect data, calculate sample mean and test statistic. 4. Use test statistic to calculate p-value. 5. Compare: reject H 0 if p-value < α. The sample implies that the alternative (your research hypothesis) is true.
16 Hypothesis Testing: The Critical Value Approach 1. Develop null and alternative hypothesis. 2. Select level of significance α. 3. Collect data, calculate sample mean and test statistic. 4. Use α to determine critical value and rejection rule. 5. Compare: if |test statistic| > |critical value|, reject H 0. The sample implies that the alternative (your research hypothesis) is true.
17 Hypothesis Testing contd… This is essentially inverting our confidence interval. Is x̄ more than 2 standard deviations away from some benchmark?
18 E) Population Mean, σ Known, One-Tailed Test Same hypothesis –> p-value method and critical value method. Example: a new employment program initiative has been introduced to reduce time spent being unemployed. Goal: 12 weeks or less unemployed. Population standard deviation believed to be 3.2 weeks. Sample of 40 unemployed workers, average time unemployed weeks. Assuming a level of significance (α) of .05, is the program goal being met?
19 First (Common) Steps Hypothesis: H 0 : μ ≤ 12, H A : μ > 12. Clearly x̄ > 12. This casts doubt on our program goal (the null), and whether we should continue it. Key: is it enough more, given sample size and standard deviation, or is it just a (small) random fluctuation?
20 Computing the Test Statistic Under our assumptions: use the standard normal. Use sample mean to calculate test statistic: z = (x̄ – μ0) / (σ/√n) = (x̄ – 12) / (3.2/√40) = 2.47. Is this z big enough to reject the null hypothesis? Next, go our two routes: Calculate p-value OR z-critical.
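21 Calculating the p-value Given the z-value, what is the corresponding probability? [Figure: standard normal curve centred at 0, with the upper-tail area p = ?? beyond z = 2.47.] This is the probability that x̄ exceeds 12 by this much by chance.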
22 Calculating the p-value contd Find 2.47 on the Standard Normal Distribution tables.
23 Calculating the p-value contd .4932 is the probability of being between 0 and z = 2.47. p = 0.5 – .4932 = .0068.
24 Should We Reject the Hypothesis? This says that the probability of getting a sample mean this large when the true mean is 12 = .0068, or less than 1 percent. Our significance level was 5%, so we reject the null. The evidence that the program has failed to meet its goal is strong.
25 Rejection of the Null [Figure: standard normal curve; z = 2.47 lies beyond z.05, in the rejection region.] Sometimes we say: significant at the 0.68% level.
26 Critical Value Approach This is an alternative you often see in textbooks or articles. Find the value of z.05, and compare it to the test value of z (2.47). From the tables, z.05 = 1.645. Because 2.47 > 1.645, reject H 0.
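27 Rejection [Figure: standard normal curve; the test value z = 2.47 lies to the right of the critical value z.05 = 1.645, inside the rejection region.]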
28 Excel Let's do this example in Excel. Look at Appendix 9.2 in text, especially Figure 9.8.
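As an alternative to the spreadsheet, the same one-tailed test can be sketched in Python with only the standard library. This is a minimal sketch, not the course's method: the function name `upper_tail_z_test` is hypothetical, and since the transcript does not preserve the sample mean, the sketch starts from the slide's test statistic z = 2.47.

```python
from statistics import NormalDist


def upper_tail_z_test(z, alpha=0.05):
    """Upper-tail z-test: returns (p_value, critical_value, reject_h0)."""
    std_normal = NormalDist()                  # mean 0, standard deviation 1
    p_value = 1.0 - std_normal.cdf(z)          # area to the right of z
    z_crit = std_normal.inv_cdf(1.0 - alpha)   # z with area alpha to its right
    return p_value, z_crit, p_value < alpha


# z = 2.47 is the test statistic quoted on the slides.
p_value, z_crit, reject = upper_tail_z_test(2.47)
print(f"p-value = {p_value:.4f}, z_crit = {z_crit:.3f}, reject H0: {reject}")
```

Both routes give the same decision by construction: p-value < α exactly when z exceeds the critical value.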
29 F) Population Mean, σ Known, Two-Tailed Test Null: μ = μ0. Alternative: μ ≠ μ0. Must examine two areas of the distribution. Example: Price/earnings ratios for stocks. Theory: stable rate of P/E in market = 13. If P/E (market) < 13, you should invest in the stock market. If P/E (market) > 13, you should take your money out.
30 Estimate Steps Can we estimate if the population P/E is 13 or not? Common steps: Set hypothesis: H 0 : μ = 13, H A : μ ≠ 13. Select α = .05. Calculate standard error. Calculate z-value.
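31 Calculating Test Statistic We have a sample of n = 50 with sample mean x̄ = 12.1; the historical σ is taken as known. Test statistic: z = (x̄ – 13) / (σ/√50) = –2.09.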
32 p-value Approach Calculate the p-value. We will calculate for the lower tail then make an adjustment for the upper tail.
33 p-value, Two-Tailed Test [Figure: standard normal curve centred at 0 with both tails shaded, below z = –2.09 and above z = 2.09.] p(z < –2.09) = ?? p(z > 2.09) = ?? We can just calculate one value, and double it.
34 Calculating the p-value contd Find 2.09 on the Standard Normal Distribution tables.
35 p-value, Two-Tailed Test p(z > 2.09) = 0.5 – .4817 = .0183, and by symmetry p(z < –2.09) = .0183. Doubling the value, we find the p-value = .0366.
36 Should We Reject the Null Hypothesis? Yes! p-value = .0366 < α = .05. If the true price/earnings ratio were the stable rate of 13, there would be only a 3.66% chance of drawing a sample mean at least as far from 13 as 12.1 by random chance.
37 Critical Value Approach Reject the null hypothesis if: test z-value > critical value, or if test z-value < –(critical value). Two-tailed test: α = 0.05 –> need critical value for α/2 = 0.025. The tables tell us that this is 1.96. Since –2.09 < –1.96, reject H 0.
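The two-tailed version of the test can be sketched the same way. This is an illustrative sketch using Python's standard library; the function name `two_tailed_z_test` is hypothetical, and the input z = –2.09 is the test statistic from the P/E example.

```python
from statistics import NormalDist


def two_tailed_z_test(z, alpha=0.05):
    """Two-tailed z-test: returns (p_value, critical_value, reject_h0)."""
    std_normal = NormalDist()
    p_value = 2.0 * std_normal.cdf(-abs(z))          # one tail, doubled
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)   # alpha/2 in each tail
    return p_value, z_crit, abs(z) > z_crit


# z = -2.09 is the test statistic from the price/earnings example.
p_value, z_crit, reject = two_tailed_z_test(-2.09)
print(f"p-value = {p_value:.4f}, critical values = +/-{z_crit:.2f}, reject H0: {reject}")
```

Splitting α across both tails is what moves the critical value from 1.645 (one-tailed) to 1.96.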
38 G) Population Mean, σ Unknown σ unknown –> must estimate it with our sample too –> use t-distribution, n – 1 degrees of freedom.
39 One-tailed Test, p-value Approach Steps: 1. Set up hypothesis. 2. Decide on level of significance (α). 3. Collect data, calculate sample mean and test statistic. 4. Use test statistic & t-table/Excel to calculate p-value. 5. Compare: reject H 0 if p-value < α.
40 Example: Highway Patrol The RCMP periodically samples vehicle speeds at various locations on a particular roadway. The sample of vehicle speeds is used to test the hypothesis H 0 : μ ≤ 65. The locations where H 0 is rejected are deemed the best locations for radar traps.
41 Example: Highway Patrol Outside Lumsden: A sample of 64 vehicles –> average speed = 65.5 mph –> standard deviation = 4.2 mph. Use α = .05 to test the hypothesis.
42 Common to Both Approaches 1. Determine the hypotheses: H 0 : μ ≤ 65, H a : μ > 65. 2. Specify the level of significance: α = .05. 3. Compute the value of the test statistic: t = (x̄ – μ0) / (s/√n) = (65.5 – 65) / (4.2/√64) = 0.95.
43 4. Estimate the p-value From the t-Distribution Table (rows: Degrees of Freedom; columns: Area in Upper Tail): t = 0.95, df = 64 – 1 = 63. Must interpolate the value of t = 0.95 in the df = 63 row –> .10 < p-value < .20.
44 p-Value Approach 5. Determine whether to reject H 0. Because p-value > α = .05, we do NOT reject H 0. The sample does not give us enough evidence to conclude that the mean speed of vehicles outside Lumsden is greater than 65 mph.
45 Critical Value Approach 4. Determine the critical value and rejection rule. For α = .05 and d.f. = 64 – 1 = 63, t.05 = 1.669. Reject H 0 if t > 1.669. 5. Determine whether to reject H 0. Because 0.95 < 1.669, we do NOT reject H 0.
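The Lumsden example can be checked numerically without a t-table. The sketch below is one possible approach, not the one used in the slides: it integrates the Student's t density with Simpson's rule to get the upper-tail p-value, using only the standard library. The helper names `t_pdf` and `t_upper_tail` are hypothetical; the data (n = 64, x̄ = 65.5, s = 4.2, μ0 = 65) come from the example.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t), from Simpson's rule on [0, |t|] plus the symmetry of the density."""
    h = abs(t) / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3               # area beyond |t|
    return tail if t >= 0 else 1.0 - tail


# Data from the Lumsden example: H0: mu <= 65 against Ha: mu > 65.
x_bar, mu0, s, n = 65.5, 65.0, 4.2, 64
t_stat = (x_bar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)
reject = p_value < 0.05
print(f"t = {t_stat:.3f}, df = {n - 1}, p-value = {p_value:.3f}, reject H0: {reject}")
```

The computed p-value falls inside the .10 to .20 bracket read off the table, so the decision not to reject H 0 is the same.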
46 H) Introduction: Comparing Population Differences Do men get paid more than women? $46,452 for men vs. $35,122 for women (bachelors). Do more 100-level Economics courses help you in Econ 201? Has the crime rate risen? Are there more hurricanes recently?
47 Key point: the role of standard deviation. [Figure: two panels of paired distributions, labelled "Probably the same" and "Probably different".]
48 Comparing 2 Populations True population means: μ1 and μ2. Random sample of n1 –> x̄1. Versus random sample of n2 –> x̄2. Transform into problem: is μ1 – μ2 = 0? Assuming σ1 and σ2 known –> use z-test. If unknown –> estimate the σ's from the sample s's, and use t-test.
49 I) Confidence Intervals, 2 Means: σ's Unknown How important is an extra introductory course in determining your grade in Economics 201? Data: Natural experiment. 59 students. 43 had only one 10x course. 16 had two 10x courses. Final exam grades: One 10x: average = 61.69%, s1 = 22.65, n1 = 43. Two 10x: average = 75.11%, s2 = 12.80, n2 = 16.
50 Confidence Interval Estimation Point estimator: x̄1 – x̄2. Standard error of x̄1 – x̄2 is: √(s1²/n1 + s2²/n2).
51 Confidence Interval contd Confidence interval of difference in means: x̄1 – x̄2 ± Margin of Error. Typically use α = 0.05. Margin of error: t(α/2) × √(s1²/n1 + s2²/n2).
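52 Degrees of Freedom One UGLY formula: df = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 – 1) + (s2²/n2)²/(n2 – 1) ]. In this example: df = 47.36 –> round down to 47. 95% confidence interval –> t.025. For 47 degrees of freedom, read t.025 from the t-table.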
53 Confidence Interval contd.
54 Confidence Interval contd.
55 Confidence Interval We are 95% confident that students with only one 10x course scored between 3.775% and % lower than students with two 10x courses. Next step would be: why, how??
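The standard error and the "ugly" degrees-of-freedom formula for the Economics 201 example can be worked out in a few lines. This is a minimal sketch with a hypothetical function name, `welch_se_and_df`; it uses the sample figures from the example (s1 = 22.65, n1 = 43, s2 = 12.80, n2 = 16) and deliberately stops short of the interval itself, since the t-value has to be looked up in a table or spreadsheet.

```python
from math import floor, sqrt


def welch_se_and_df(s1, n1, s2, n2):
    """Standard error of (x1_bar - x2_bar) and its approximate degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2      # each sample's variance of the mean
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df


# One 10x course vs. two 10x courses, figures from the example.
se, df = welch_se_and_df(22.65, 43, 12.80, 16)
point_estimate = 61.69 - 75.11
print(f"point estimate = {point_estimate:.2f}, s.e. = {se:.3f}, "
      f"df = {df:.2f} -> round down to {floor(df)}")
```

The margin of error is then the tabulated t-value for the rounded-down df multiplied by `se`.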
56 J) Hypothesis Tests, 2 Means: σ's Unknown Two datasets –> is the mean value of one larger than the other? Is it larger by a specific amount? μ1 vs. μ2 –> μ1 – μ2 vs. D0. Often set D0 = 0 –> is μ1 = μ2?
57 Example: Female vs. Male Salaries Saskatchewan 2001 Census data: - only Bachelor's degrees - aged work full-time - not in school Men: x̄M = $46,452.48, sM = 36,260.1, nM = 557. Women: x̄W = $35,121.94, sW = 20,571.3, nW = 534. x̄M – x̄W = $11,330.54 –> our point estimate. Is this an artifact of the sample, or do men make significantly more than women?
58 Hypothesis, Significance Level, Test Statistic We will now ONLY use the p-value approach, and NOT the critical value approach. Research hypothesis: men get paid more: 1. H 0 : μM – μW ≤ 0, H A : μM – μW > 0. 2. Select α = 0.05. 3. Compute test t-statistic: t = (x̄M – x̄W – D0) / √(sM²/nM + sW²/nW), with D0 = 0.
59 4. a. Compute the Degrees of Freedom Can compute by hand, or get from Excel: df = 888.
60 4. b. Computing the p-value [t-table excerpt (rows: Degrees of Freedom): with 888 degrees of freedom the test statistic lies far beyond the tabulated values.] The p-value <<< 0.05.
61 5. Check the Hypothesis Since the p-value is <<< 0.05, we reject H 0. We conclude that we can accept the alternative hypothesis that men get paid more than women at a very high level of confidence (greater than 99%).
62 Excel t-Test: Two-Sample Assuming Unequal Variances. Output rows (columns: Male, Female): Mean; Variance; Observations; Hypothesized Mean Diff. = 0; df = 888; t Stat; P(T<=t) one-tail; t Critical one-tail; P(T<=t) two-tail (of order E-10); t Critical two-tail.
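The salary comparison can also be reproduced from the summary statistics alone. The sketch below is an illustration, not the spreadsheet's algorithm: `welch_t` is a hypothetical helper, and because the standard library has no t-distribution, the one-tail p-value is approximated with the standard normal, which is a labelled assumption that is reasonable only because df is close to 900.

```python
from math import sqrt
from statistics import NormalDist


def welch_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """t statistic and approximate df for H0: mu1 - mu2 <= d0 (unequal variances)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (mean1 - mean2 - d0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Men vs. women, Saskatchewan 2001 Census figures from the example.
t_stat, df = welch_t(46452.48, 36260.1, 557, 35121.94, 20571.3, 534)

# Assumption: with df this large the t distribution is nearly standard normal,
# so the normal upper tail is used as a rough stand-in for the one-tail p-value.
p_approx = NormalDist().cdf(-t_stat)
print(f"t = {t_stat:.2f}, df = {df:.1f} (about {round(df)}), one-tail p is roughly {p_approx:.1e}")
```

The df rounds to the 888 reported in the output table, and the approximate p-value is many orders of magnitude below 0.05.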
63 Summary Hypothesis tests on comparing two populations. Convert to a comparison of the difference to a standard. More complex standard deviation and degrees of freedom. Same methodology as other hypothesis tests.
64 K) Statistical vs. Practical Significance Our tests: statistical significance. Real world interest: practical significance. Men vs. women: the difference is statistically significant AND practically significant: $46,452.48 vs. $35,121.94. Saskatchewan, full-time, Bachelor's: women make only 75.6% of what men make, at the same average education level.
65 Source: Leader-Post, Oct. 31, 2008
66 L) Matched Samples Controlled experiment –> match individuals in each group. Matched samples –> each individual tries each method in turn. Variation between samples not a problem. Focus on difference data. Independent samples –> the norm in economics. Regression analysis.
67 M) Introduction to ANOVA What if we want to compare 3 or more sample means (treatment means)? Example: total income, Saskatchewan females employed full-time and full-year, by age, 2003 (Source: See Oct. 8th lectures). Table columns: Age group; Income in thousands of dollars (Mean, Standard deviation); Sample size. Overall weighted average = 38.2
68 ANOVA's Hypotheses Either: 4 different populations. Or: there is one true population mean and 4 sample variations.
69 N) Steps of ANOVA 1. Set up the Hypothesis Statements: H 0 : μ1 = μ2 = μ3 = μ4 = … = μk; H A : Not all population means are equal. 2. Collect your sample data: Means: x̄1, x̄2, x̄3, x̄4, …, x̄k. Variances: s1², s2², s3², s4², …, sk². Sample sizes: n1, n2, n3, n4, …, nk.
70 Steps of ANOVA Continued 4. Calculate the overall average: x̿ = (n1·x̄1 + n2·x̄2 + … + nk·x̄k) / nT, where nT = n1 + n2 + … + nk. 5. Create our two estimates of σ².
71 Step 5 a) Estimating σ² via SSTR Between-treatments estimate of σ², built from the sum of squares due to treatments (SSTR). This compares each x̄j to x̿, and constructs an estimate of σ² based on the assumption the Null Hypothesis is true: SSTR = Σ nj·(x̄j – x̿)², and the estimate is MSTR = SSTR/(k – 1).
72 Step 5 b) Estimating σ² via SSE Within-treatments estimate of σ², built from the sum of squares due to error (SSE). This takes the weighted average of the sample sj² as an estimate of σ² and is a good estimate regardless of whether the Null is true: SSE = Σ (nj – 1)·sj², and the estimate is MSE = SSE/(nT – k).
73 Step 6: Testing The Null If Null true, both estimates should be similar, and their ratio (between-treatments estimate / within-treatments estimate) ≈ 1. If ratio >>> 1 –> reject the Null, accept the Alternative that there are multiple population distributions.
74 Steps of ANOVA 1. Set up the Hypothesis Statement. (Null: all means are equal) 2. Collect the sample data. 3. Select level of significance –> α = 0.05. 4. Calculate the overall average. 5. a) Estimate σ² via sum of squares due to treatments (SSTR). b) Estimate σ² via sum of squares due to error (SSE). 6. If Null true, both estimates should be similar, and their ratio ≈ 1.
75 MSTR and MSE MSTR = (sum of squares due to treatments) / (numerator degrees of freedom) = (sum of squares due to treatments) / (no. of treatments – 1) = SSTR / (k – 1), with df1 = k – 1. MSE = (sum of squares due to error) / (denominator degrees of freedom) = (sum of squares due to error) / (total no. of obs. – no. of treatments) = SSE / (nT – k), with df2 = nT – k.
76 F-test F-statistic = MSTR / MSE, with k – 1 numerator degrees of freedom (df1) and nT – k denominator degrees of freedom (df2). If H 0 is true, MSTR ≈ MSE –> F-statistic ≈ 1. Reject H 0 if the p-value is < level of significance (α), i.e. the F-statistic is higher than the critical value from the table/Excel.
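The ANOVA steps can be sketched from summary statistics alone. The function `one_way_anova` and the three-group numbers below are hypothetical, chosen so the arithmetic is easy to follow by hand; they are not the Saskatchewan wage data, whose group means and sample sizes are not preserved in this transcript. The sketch stops at the F-statistic, which would then be compared with an F-table or a spreadsheet p-value.

```python
def one_way_anova(means, variances, sizes):
    """One-way ANOVA from per-group means, sample variances and sample sizes."""
    k = len(means)
    n_total = sum(sizes)
    # Overall (weighted) average of the group means.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    # Between-treatments and within-treatments sums of squares.
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    sse = sum((n - 1) * v for n, v in zip(sizes, variances))
    mstr = sstr / (k - 1)          # df1 = k - 1
    mse = sse / (n_total - k)      # df2 = nT - k
    return {"grand_mean": grand_mean, "MSTR": mstr, "MSE": mse,
            "F": mstr / mse, "df1": k - 1, "df2": n_total - k}


# Hypothetical data: 3 groups of 5 observations each.
result = one_way_anova(means=[10.0, 12.0, 14.0],
                       variances=[4.0, 4.0, 4.0],
                       sizes=[5, 5, 5])
print(result)
```

Here SSTR = 40 and SSE = 48, so MSTR = 20, MSE = 4 and F = 5 on (2, 12) degrees of freedom.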
77 F-Distribution [Figure: F-distribution curve starting at 0, with the F test-value marked on the horizontal axis.]
78 O) Saskatchewan Female Wages Example Example: total income, Saskatchewan females employed full-time and full-year, by age, 2003 (Source: See Oct. 8th lectures). Table columns: Age Group; Mean Income; Variance = (St. Dev.)²; Sample Size. Variances by group: (13.5)² = 182.25; (20.7)² = 428.49; (25.9)² = 670.81; (25.9)² = 670.81. Overall weighted average = 38.2
79 Calculating the MSTR, MSE
80 Calculating F-Stat and p-value F test-value = MSTR / MSE = 2.59
81 F-Table for df 1 = 3 and df 2 = 176 [F-table excerpt: rows are denominator degrees of freedom (df 2, MSE) with Area in Upper Tail; columns are numerator degrees of freedom (df 1, MSTR).] With 176 denominator degrees of freedom, F = 2.59 falls in the range of the table where the upper-tail area is above .05. Clearly the p-value > 0.05 –> accept the Null of one distribution.
82 Excel F-test formula =FDIST(F-value, df 1, df 2) –> yields the p-value (upper-tail area) for the F test-value.
83 P) Econometrics for Dummies… Instead of ANOVA, economists tend to use Regression analysis + dummy variables. Gives us the direction and size of the differences in mean values. But ANOVA is a useful first step.