
Two-Sample Hypothesis Testing

Suppose you want to know if two populations have the same mean or, equivalently, if the difference between the population means is zero. You have independent samples from the two populations. Their sizes are $n_1$ and $n_2$.

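When the population variances $\sigma_1^2$ and $\sigma_2^2$ are known, standardizing the difference in sample means gives a statistic with a standard normal distribution. We’ll use this formula to test whether the population means are equal:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$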

Example: Suppose from a large class, we sample 4 grades: 64, 66, 89, 77. From another large class, we sample 3 grades: 56, 71, 53. We assume that the class grades are normally distributed, and that the population variances for the two classes are both 96. Test at the 5% level whether the two population means are equal.

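The sample means are 74 and 60, so under $H_0: \mu_1 - \mu_2 = 0$ the test statistic is

$$Z = \frac{74 - 60}{\sqrt{\dfrac{96}{4} + \dfrac{96}{3}}} = \frac{14}{\sqrt{56}} = 1.87$$

As we’ve found before, the Z-values for a two-tailed 5% test are 1.96 and -1.96.

[Figure: standard normal curve with the acceptance region between -1.96 and 1.96 and a critical region in each tail]

Since our Z-statistic, 1.87, is in the acceptance region, we accept $H_0: \mu_1 - \mu_2 = 0$, concluding that the population means are equal.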
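The calculation in this example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `two_sample_z` is an illustrative choice, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sample_z(sample1, sample2, var1, var2, alpha=0.05):
    """Two-tailed Z-test of H0: mu1 - mu2 = 0 with known population variances."""
    n1, n2 = len(sample1), len(sample2)
    # Difference in sample means divided by its known standard deviation.
    z = (mean(sample1) - mean(sample2)) / sqrt(var1 / n1 + var2 / n2)
    # Upper critical value; the lower one is its negative.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


z, z_crit, reject = two_sample_z([64, 66, 89, 77], [56, 71, 53], 96, 96)
print(round(z, 2), round(z_crit, 2), reject)  # 1.87 1.96 False
```

The final value is `False` because 1.87 lies inside the acceptance region, which matches the conclusion of the example.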

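What do you do if you don’t know the population variances in this formula? Replace the population variances with the sample variances and the Z distribution with the t distribution:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The number of degrees of freedom is the integer part of this very messy formula:

$$\text{dof} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$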

Example: Consider the same example as the last one but without the information on the population variances. Again test at the 5% level.

We need to determine the sample means and sample variances. As before, the sample means are 74 and 60. So we subtract the sample mean from each of the grades. Then we square those differences and add them up. Then we divide that sum by n-1 to get the sample variance.

Class 1 (n1 = 4, sample mean 74)
X1     X1 - 74    (X1 - 74)^2
64     -10        100
66      -8         64
89      15        225
77       3          9
Sum                398
Sample variance: 398 / 3 = 132.67

Class 2 (n2 = 3, sample mean 60)
X2     X2 - 60    (X2 - 60)^2
56      -4         16
71      11        121
53      -7         49
Sum                186
Sample variance: 186 / 2 = 93

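What are the dof and critical t value? Since we have $s_1^2/n_1 = 132.67/4 = 33.17$ and $s_2^2/n_2 = 93/3 = 31$, our very messy dof formula yields

$$\text{dof} = \frac{(33.17 + 31)^2}{\dfrac{33.17^2}{3} + \dfrac{31^2}{2}} = 4.86$$

So the degrees of freedom is the integer part of 4.86, or 4. For a 5% two-tailed test and 4 dof, the critical t values are 2.776 and -2.776.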

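Next we need to compute our test statistic:

$$t = \frac{74 - 60}{\sqrt{\dfrac{132.67}{4} + \dfrac{93}{3}}} = \frac{14}{\sqrt{64.17}} = 1.748$$

Since our t-value, 1.748, is in the acceptance region, we accept $H_0: \mu_1 = \mu_2$.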
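The unknown-variance calculation above can be reproduced in a few lines. This is a sketch using only the standard library; the name `unequal_variance_t` is illustrative, and the critical value 2.776 is hard-coded from a t table (two-tailed 5% test, 4 dof) rather than computed.

```python
from math import floor, sqrt
from statistics import mean, variance


def unequal_variance_t(sample1, sample2):
    """t statistic and (non-integer) dof when the population variances are unknown."""
    n1, n2 = len(sample1), len(sample2)
    a = variance(sample1) / n1  # s1^2 / n1, using the n-1 sample variance
    b = variance(sample2) / n2  # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(a + b)
    dof = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, dof


t, dof = unequal_variance_t([64, 66, 89, 77], [56, 71, 53])
T_CRIT = 2.776  # assumed table value: two-tailed 5% test, 4 dof
print(round(t, 3), round(dof, 2), floor(dof), abs(t) > T_CRIT)  # 1.748 4.86 4 False
```

Taking `floor(dof)` follows the slides' rule of using the integer part of the dof formula.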

Sometimes we don’t know the population variances, but we believe that they are equal. So we need to compute an estimate of the common variance, which we do by pooling our information from the two samples. We denote the pooled sample variance by $s_p^2$. $s_p^2$ is a weighted average of the two sample variances, with more weight put on the sample variance that was based on the larger sample. If the two samples are the same size, $s_p^2$ is just the sum of the two sample variances, divided by two. In general,

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

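Let’s return for a moment to the statistic that we used to compare population means when the population variances were known. If the two variances are equal, $\sigma_1^2 = \sigma_2^2 = \sigma^2$, then

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{n_2}}}$$

Then we can factor out the $\sigma^2$ and replace the $\sigma^2$ by $s_p^2$ and the Z by t:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

The number of degrees of freedom is $n_1 + n_2 - 2$.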

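Let’s do the previous example again, but this time assume that the unknown population variances are believed to be equal. We had sample means 74 and 60, sample variances 132.67 and 93, and sample sizes 4 and 3. So

$$s_p^2 = \frac{3(132.67) + 2(93)}{4 + 3 - 2} = \frac{584}{5} = 116.8$$

$$t = \frac{74 - 60}{\sqrt{116.8\left(\dfrac{1}{4} + \dfrac{1}{3}\right)}} = \frac{14}{\sqrt{68.13}} = 1.70$$

The number of degrees of freedom is $n_1 + n_2 - 2 = 5$, and we are doing a 2-tailed test at the 5% level, so the critical t values are 2.571 and -2.571.

[Figure: $t_5$ curve with the acceptance region between the critical values and a critical region in each tail]

Since our t-statistic, 1.70, is in the acceptance region, we accept $H_0: \mu_1 = \mu_2$.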
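A sketch of the pooled-variance version of the test follows, again with only the standard library. The function name `pooled_t` is illustrative, and the critical value 2.571 is hard-coded from a t table (two-tailed 5% test, 5 dof).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Pooled variance, t statistic and dof, assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    dof = n1 + n2 - 2
    # Weighted average of the two sample variances, weights n1-1 and n2-1.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / dof
    t = (mean(sample1) - mean(sample2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, dof


sp2, t, dof = pooled_t([64, 66, 89, 77], [56, 71, 53])
T_CRIT = 2.571  # assumed table value: two-tailed 5% test, 5 dof
print(f"{sp2:.1f} {t:.2f} {dof} {abs(t) > T_CRIT}")  # 116.8 1.70 5 False
```

Compared with the unequal-variance version, the statistic is slightly smaller (1.70 versus 1.748) but the test gains one degree of freedom.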

In the previous three hypothesis tests, we tested whether 2 populations have the same mean when we had 2 independent samples. We can’t use those tests, however, if the 2 samples are not independent. For example, suppose you are looking at the weights of people before and after a fitness program. Since the weights are for the same group of people, the before and after weights are not independent of each other. In this type of situation, we can use a hypothesis test based on matched-pairs samples.

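The hypotheses are

$$H_0: \mu_D = 0 \qquad H_1: \mu_D \neq 0$$

where $\mu_D$ is the population mean of the paired differences D. The test statistic is

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$

with $n - 1$ degrees of freedom, where $\bar{D}$ and $s_D$ are the sample mean and sample standard deviation of the differences and n is the number of pairs.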

Example

[Table: columns person, Before, After, D = A-B for 5 people; the values are not recoverable from this transcript]

First we calculate the weight differences, D = A-B.

Then we add up the differences and determine the mean, $\bar{D}$.

Next we need to calculate the sample standard deviation for the weight differences. The sample standard deviation is

$$s_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}}$$

We subtract the mean difference from each of the D values.

We square the values in that column, and add up the squares.

Then we divide by n-1 = 4, and take the square root.

Next we assemble our statistic.

Since we had 5 people and 5 pairs of weights, n = 5, and the number of degrees of freedom is n-1 = 4. We’re doing a 2-tailed t-test at the 5% level, so the critical values are 2.776 and -2.776, and the critical region looks like this:

[Figure: $t_4$ curve with the acceptance region between the critical values and a critical region in each tail]

Since our t-statistic, -2.35, is in the acceptance region, we accept the null hypothesis that the program would cause no average weight change for the population as a whole.
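The matched-pairs steps can be sketched in code as follows. The weights below are hypothetical values made up for illustration; they are not the data from the slides, so the resulting statistic differs from -2.35. The function name `paired_t` is also an illustrative choice.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """t statistic and dof for H0: mean difference = 0, with D = After - Before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = mean(diffs)   # mean of the differences
    s_d = stdev(diffs)    # sample standard deviation (divides by n - 1)
    return d_bar / (s_d / sqrt(n)), n - 1


# Hypothetical before/after weights for 5 people (NOT the slide data).
before = [180, 165, 210, 195, 172]
after = [175, 166, 200, 190, 170]
t, dof = paired_t(before, after)
T_CRIT = 2.776  # assumed table value: two-tailed 5% test, 4 dof
print(round(t, 2), dof, abs(t) > T_CRIT)
```

Working on the differences reduces the problem to a one-sample t-test, which is why the pairing, and not the independence of the two columns, is what matters.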

Hypothesis tests on the difference between 2 population proportions, using independent samples

If you look at the statistics we have used in our hypothesis tests, you will notice that they have a common form:

$$\frac{\text{point estimate} - \text{hypothesized value}}{\text{standard deviation (or estimated standard deviation) of the point estimate}}$$

In our hypothesis tests on the difference between 2 population proportions, we are going to use that same form.

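The point estimate is the difference in sample proportions, $\hat{p}_1 - \hat{p}_2$. We still need to determine the standard deviation, or an estimate of the standard deviation, of our point estimate. Under the null hypothesis the two population proportions are equal, so we estimate the common proportion by pooling the two samples:

$$\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$

Assembling the pieces, we have

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$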

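Example: Suppose the proportions of Democrats in samples of 100 and 225 from 2 states are 33% and 20%. Test at the 5% level the hypothesis that the proportions of Democrats in the populations of the 2 states are equal.

The pooled proportion is $\hat{p} = (33 + 45)/325 = 0.24$, so

$$Z = \frac{0.33 - 0.20}{\sqrt{0.24(0.76)\left(\dfrac{1}{100} + \dfrac{1}{225}\right)}} = 2.53$$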

We’re doing a 2-tailed Z-test at the 5% level, so the critical values are 1.96 and -1.96, and the critical region looks like this:

[Figure: standard normal curve with the acceptance region between -1.96 and 1.96 and a critical region of area 0.025 in each tail]

Since our Z-statistic, 2.53, is in the critical region, we reject the null hypothesis and accept the alternative that the proportions of Democrats in the 2 states are different.
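The proportion test can be checked numerically. This sketch assumes the pooled estimate of the common proportion, which is the version that reproduces the Z-statistic of 2.53 quoted in the example; the function name `two_proportion_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2, alpha=0.05):
    """Two-tailed Z-test of H0: p1 = p2 using the pooled sample proportion."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return pooled, z, abs(z) > z_crit


pooled, z, reject = two_proportion_z(0.33, 100, 0.20, 225)
print(round(pooled, 2), round(z, 2), reject)  # 0.24 2.53 True
```

Here the last value is `True`: 2.53 exceeds 1.96, so the null hypothesis of equal proportions is rejected at the 5% level.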

Sometimes you want to test whether two populations have the same variance, using independent samples. If the populations are normally distributed, we can use the F-statistic to perform the test.

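The F-statistic is

$$F = \frac{s_1^2}{s_2^2}$$

where the groups are labeled so that the larger sample variance is in the numerator. This F-statistic has $n_1 - 1$ degrees of freedom for the numerator, and $n_2 - 1$ degrees of freedom for the denominator.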

The distribution of our F-statistic, with the tail for the critical region, looks like this:

[Figure: density f(F) with the acceptance region on the left and the critical region in the upper tail]

Two-sided versus one-sided tests for equality of variance

While you are always using the upper tail of the F-test on tests of equality of variance, the size of the critical region you sketch varies with whether you have a two-sided or a one-sided test. Let’s see why this is true.

While, for our samples, the sample variance from the first group was greater, our alternative hypothesis in a two-sided test indicates that we think that the population variance could have been larger or smaller for the first population:

$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2$$

Our sketch of the critical region is based on the situation in which the sample variance is greater for the first group, but we admit that, if we had information for the entire population, we might find that the situation is reversed. So there is an implicit second sketch of an F-statistic in which the sample variance of the second group is in the numerator. Thus, for each of the sketches, the sketch we draw and the implicit sketch, the area of the critical region is α/2, half of the test level α. So, for example, if you are doing a two-sided test at the 5% level, your sketch will show a tail area of 0.025.

What if we are performing a one-sided test? Now we are looking at a situation in which the sample variance is again larger for the first group. This time, however, we want to know if, in fact, the population variance is really larger for the first group. So we have the one-sided alternative:

$$H_0: \sigma_1^2 \leq \sigma_2^2 \qquad H_1: \sigma_1^2 > \sigma_2^2$$

Keep in mind that, as usual with one-sided tests, the null hypothesis is the devil’s advocate view. Here the devil’s advocate is saying: nah, the population variance for the first group isn’t really any larger than for the second group. For a one-sided test with level α, your critical region will have area α. For example, if you are performing a one-sided test at the 5% level, the critical region will have area 0.05.

Example: You are looking at test results for two groups of students. There are 25 students in the first group, for which you have calculated the sample variance to be 15. There are 30 students in the second group, for which you have calculated the sample variance to be 10. Test at the 10% level whether the population variances are the same.

The F-statistic is F = 15/10 = 1.5. There are 25-1 = 24 degrees of freedom in the numerator and 30-1 = 29 degrees of freedom in the denominator. This is a two-sided test at the 10% level, so the critical region has area 0.05, and the critical value of $F_{24,29}$ is approximately 1.90.

[Figure: density f(F) for $F_{24,29}$ with the acceptance region on the left and the critical region in the upper tail]

Because 1.5 is in the acceptance region, you cannot reject the null hypothesis and you conclude that the variances of the two populations are the same.
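The variance-ratio example can be sketched as below. The function name `variance_ratio_f` is illustrative, and the critical value 1.90 is an assumed, approximate F-table value for an upper-tail area of 0.05 with 24 and 29 degrees of freedom; check it against your own table before relying on it.

```python
def variance_ratio_f(var1, n1, var2, n2):
    """F statistic with the larger sample variance in the numerator, plus both dof."""
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


alpha = 0.10
tail_area = alpha / 2  # two-sided test: the sketched upper tail has area alpha / 2
f_stat, dof_num, dof_den = variance_ratio_f(15, 25, 10, 30)
F_CRIT = 1.90  # assumed approximate table value for F(24, 29), upper-tail area 0.05
print(f_stat, dof_num, dof_den, tail_area, f_stat > F_CRIT)  # 1.5 24 29 0.05 False
```

For a one-sided test at level α, the only change would be to use a tail area of α instead of α/2 when looking up the critical value.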

In the two sections we have just completed, we did 9 different types of hypothesis tests.

1. population mean - 1 sample - known population variance
2. population mean - 1 sample - unknown population variance
3. population proportion - 1 sample
4. difference in population means - 2 independent samples - known population variances
5. difference in population means - 2 independent samples - unknown population variances
6. difference in population means - 2 independent samples - unknown population variances that are believed to be equal
7. difference in population means - 2 dependent samples
8. difference in population proportions - 2 independent samples
9. difference in population variances - 2 independent samples

The statistics for these tests are compiled on a summary sheet which is available at my web site.