Copyright © 2009 Pearson Education, Inc. 10.2 LEARNING GOAL Interpret and carry out hypothesis tests for independence of variables with data organized.


Copyright © 2009 Pearson Education, Inc. 10.2 Hypothesis Testing with Two-Way Tables
LEARNING GOAL: Interpret and carry out hypothesis tests for independence of variables with data organized in two-way tables.

Identifying the Hypotheses with Two Variables
Suppose that administrators at a college are concerned that there may be bias in the way degrees are awarded to men and women in different departments. They therefore collect data on the number of degrees awarded to men and women in different departments. These data concern two variables: major and gender. To test whether there is bias in the awarding of degrees, the administrators ask the following question: Do the data suggest a relationship between the two variables?

Null and Alternative Hypotheses with Two Variables
The null hypothesis, $H_0$, states that the variables are independent (there is no relationship between them).
The alternative hypothesis, $H_a$, states that there is a relationship between the two variables.

Displaying the Data in Two-Way Tables
With the hypotheses identified, the next step in the hypothesis test is to examine the data set to see whether it supports rejecting or not rejecting the null hypothesis. We can display the data efficiently with a two-way table (also called a contingency table), so named because it displays two variables.

Note: One variable is displayed along the columns and the other along the rows. Here, there are only two rows because gender is recorded as either male or female. There are many columns for the majors, with just the first few shown here.

Two-Way Tables
A two-way table shows the relationship between two variables by listing one variable in the rows and the other variable in the columns. The entries in the table's cells are called frequencies (or counts).

Here, to simplify the calculations, let's focus on just two majors, biology and business. Does a person's gender influence whether he or she chooses to major in biology or business? Table 10.3 shows the biology and business data extracted from Table 10.2, along with row and column totals.

EXAMPLE 1 A Two-Way Table for a Survey
Table 10.4 shows the results of a pre-election survey on gun control. Use the table to answer the following questions.
a. Identify the two variables displayed in the table.
b. What percentage of Democrats favored stricter laws?
c. What percentage of all voters favored stricter laws?
d. What percentage of those who opposed stricter laws are Republicans?

EXAMPLE 1 A Two-Way Table for a Survey
Solution: Note that the total of the row totals and the total of the column totals are equal.
a. The rows show the variable survey response, which can be either "favor stricter laws," "oppose stricter laws," or "undecided." The columns show the variable party affiliation, which in this table can be either Democrat or Republican.
b. Of the 622 Democrats polled, 456 favored stricter laws. The percentage of Democrats favoring stricter laws is 456/622 ≈ 0.733, or 73.3%.

EXAMPLE 1 A Two-Way Table for a Survey
Solution: (cont.)
c. Of the 1,421 people polled, 788 favored stricter laws. The percentage of all respondents favoring stricter laws is 788/1,421 ≈ 0.555, or 55.5%.
d. Of the 569 people polled who opposed stricter laws, 446 are Republicans. Since 446/569 ≈ 0.784, about 78.4% of those opposed to stricter laws are Republicans.
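
Each percentage in this example is a cell count or total divided by the relevant total. The following is a minimal Python sketch of that arithmetic, using only the counts quoted in the solution; the helper name `percentage` and the variable names are illustrative assumptions, not part of the original slides.

```python
def percentage(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts quoted in the Example 1 solution.
democrats_favoring = percentage(456, 622)         # part b: Democrats who favor stricter laws
all_favoring = percentage(788, 1421)              # part c: all respondents who favor stricter laws
republicans_among_opposed = percentage(446, 569)  # part d: Republicans among those opposed

print(democrats_favoring, all_favoring, republicans_among_opposed)
```

Note that the denominator changes with the question: parts b and d condition on a column total or a row total, while part c uses the grand total.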

Carrying Out the Hypothesis Test
The basic idea of the hypothesis test is the same as always: to decide whether the data provide enough evidence to reject the null hypothesis. For the case of a test with a two-way table, the specific steps are as follows.
As always, we start by assuming that the null hypothesis is true, meaning there is no relationship between the two variables. In that case, we would expect the frequencies (the numbers in the individual cells) in the two-way table to be those that would occur by pure chance. Our first step, then, is to find a way to calculate the frequencies we would expect by chance.

Carrying Out the Hypothesis Test (cont.)
We next compare the frequencies expected by chance to the observed frequencies from the sample, which are the frequencies displayed in the table. We do this by calculating the chi-square statistic (pronounced "ky-square") for the sample data, which here plays a role similar to the role of the standard score z in the hypothesis tests we carried out in Chapter 9 or the role of the t test statistic in Section 10.1.

Carrying Out the Hypothesis Test (cont.)
Recall that for the hypothesis tests in Chapter 9, we made the decision about whether to reject or not reject the null hypothesis by comparing the computed value of the standard score for the sample data to critical values given in tables; similarly, in Section 10.1 we compared computed values of the t test statistic to values found in a table. Here, we do the same thing, except that rather than using critical values for the standard score or t, we use critical values for the chi-square statistic.

Carrying Out the Hypothesis Test: Finding the Frequencies Expected by Chance
As an example of the process, let's work through these steps with the data in Table 10.3. Our first step is to find the frequencies we would expect in Table 10.3 if there were no relationship between the variables, which are the frequencies expected by chance alone. Let's start by finding the frequency we would expect by chance for male business majors.

To do this, we first calculate the fraction of all students in the sample who received business degrees:
$$\frac{\text{total business degrees}}{\text{total degrees}} = \frac{197}{250}$$
As discussed in Chapter 6, we can interpret this result as a relative frequency probability. That is, if we select a student at random from the sample, the probability that he or she earned a business degree is 197/250. Using the notation for probability, we write
$$P(\text{business}) = \frac{197}{250}$$

Similarly, if we select a student at random from the sample, the probability that this student is a man is
$$P(\text{man}) = \frac{\text{total men}}{\text{total degrees}}$$
with both totals read from Table 10.3.
Recall from Section 6.5 that if two events A and B are independent (the outcome of one does not affect the probability of the other), then
$$P(A \text{ and } B) = P(A) \times P(B)$$
We can apply this rule to determine the probability that a student is both a man and a business major (assuming the null hypothesis that gender is independent of major):
$$P(\text{man and business}) = P(\text{man}) \times P(\text{business})$$

This probability is equivalent to the fraction of the total students whom we expect to be male business majors if there is no relationship between gender and major. We therefore multiply this probability by the total number of students in the sample (250) to find the number (or frequency) of male business majors that we expect by chance:
$$P(\text{man}) \times P(\text{business}) \times 250$$
We call this value the expected frequency for the number of male business majors.

Definition
The expected frequencies in a two-way table are the frequencies we would expect by chance if there were no relationship between the row and column variables.
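
The expected frequency for a cell is the product of the row probability, the column probability, and the sample size, which simplifies to (row total × column total) / grand total. Below is a minimal Python sketch of that calculation. The function name `expected_frequency` is an assumption for illustration, and the row total of 100 is a hypothetical value; only the business total (197) and the sample size (250) come from the text.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected cell count under independence: P(row) * P(column) * N."""
    p_row = row_total / grand_total
    p_column = column_total / grand_total
    return p_row * p_column * grand_total

# 197 business degrees out of 250 total are from the text;
# the row total of 100 is HYPOTHETICAL, chosen only to illustrate the formula.
hypothetical_row_total = 100
expected = expected_frequency(hypothetical_row_total, 197, 250)
print(expected)
```

Because one factor of the grand total cancels, the same value can be computed directly as `row_total * column_total / grand_total`.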

EXAMPLE 2 Expected Frequencies for Table 10.3
Find the frequencies expected by chance for female business majors in Table 10.3.
Solution:
$$P(\text{woman and business}) = P(\text{woman}) \times P(\text{business})$$
We now find the expected frequency by multiplying the cell probability by the total number of students (250):
$$P(\text{woman}) \times P(\text{business}) \times 250$$

Solution: (cont.)
The calculations for men biology majors and women biology majors follow the same pattern:
Expected frequency of men biology majors = $P(\text{man}) \times P(\text{biology}) \times 250$
Expected frequency of women biology majors = $P(\text{woman}) \times P(\text{biology}) \times 250$

Table 10.5 repeats the data from Table 10.3, but this time it also shows the expected frequency for each cell (in parentheses). To check that we did our work correctly, we confirm that the total of all four expected frequencies equals the total of 250 students in the sample. Notice also that the values in the "Total" row and "Total" column are the same for both the observed frequencies and the frequencies expected by chance. This should always be the case, providing another good check on your work.
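
The two checks described here (matching grand total, matching row and column totals) are easy to automate. The sketch below builds the full table of expected frequencies and asserts both checks. The observed counts are hypothetical, not the values of Table 10.3, and `expected_table` is an illustrative name.

```python
def expected_table(observed):
    """Expected frequencies under independence for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

# HYPOTHETICAL 2 x 2 table of observed counts (rows and columns are arbitrary categories).
observed = [[10, 40],
            [30, 20]]
expected = expected_table(observed)

# Check 1: the expected frequencies sum to the sample size.
assert abs(sum(map(sum, expected)) - sum(map(sum, observed))) < 1e-9
# Check 2: row totals and column totals agree between observed and expected.
for obs_row, exp_row in zip(observed, expected):
    assert abs(sum(obs_row) - sum(exp_row)) < 1e-9
for obs_col, exp_col in zip(zip(*observed), zip(*expected)):
    assert abs(sum(obs_col) - sum(exp_col)) < 1e-9

print(expected)
```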

Finding the Chi-Square Statistic
Step 1. For each cell in the two-way table, identify O as the observed frequency and E as the expected frequency if the null hypothesis is true (no relationship between the variables).
Step 2. Compute the value $(O - E)^2 / E$ for each cell.
Step 3. Sum the values from step 2 to get the chi-square statistic:
$$\chi^2 = \text{sum of all values } \frac{(O - E)^2}{E}$$
The larger the value of $\chi^2$, the greater the average difference between the observed and expected frequencies in the cells.
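
The three steps above can be sketched in a few lines of Python. The observed and expected values below are hypothetical and are listed one cell per entry; `chi_square_statistic` is an illustrative name, not a function from any library.

```python
def chi_square_statistic(observed_cells, expected_cells):
    """Sum (O - E)^2 / E over all cells (steps 2 and 3)."""
    contributions = [(o - e) ** 2 / e
                     for o, e in zip(observed_cells, expected_cells)]
    return sum(contributions)

# Step 1: pair each observed frequency O with its expected frequency E.
# These values are HYPOTHETICAL (a 2 x 2 table flattened cell by cell).
observed_cells = [10, 40, 30, 20]
expected_cells = [20.0, 30.0, 20.0, 30.0]

chi_square = chi_square_statistic(observed_cells, expected_cells)
print(chi_square)
```

Keeping the per-cell contributions in a list mirrors the layout of a worksheet with one row per cell, which makes it easy to see which cells drive the statistic.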

To do this calculation in an organized way, it's best to make a table such as Table 10.6, with a row for each of the cells in the original two-way table. As shown in the lower right cell, the result for the gender/major data is $\chi^2 = 0.350$.

Carrying Out the Hypothesis Test: Finding the Frequencies Expected by Chance, Computing the Chi-Square Statistic, Making the Decision
The value of $\chi^2$ gives us a way of testing the null hypothesis of no relationship between the variables. If $\chi^2$ is small, then the average difference between the observed and expected frequencies is small and we should not reject the null hypothesis. If $\chi^2$ is large, then the average difference between the observed and expected frequencies is large and we have reason to reject the null hypothesis of independence.

To quantify what we mean by "small" or "large," we compare the $\chi^2$ value found for the sample data to critical values:
If the calculated value of $\chi^2$ is less than the critical value, the differences between the observed and expected values are small and there is not enough evidence to reject the null hypothesis.
If the calculated value of $\chi^2$ is greater than or equal to the critical value, then there is enough evidence in the sample to reject the null hypothesis (at the given level of significance).

Table 10.7 gives the critical values of $\chi^2$ for two significance levels, 0.05 and 0.01. Notice that the critical values differ for different table sizes, so you must make sure you read the critical values for a data set from the appropriate table size row.

For the gender/major data we have been studying in Tables 10.3 (slide 7) and 10.5 (slide 22), there are two rows and two columns (do not count the "total" rows or columns), which means a table size of 2 × 2. Looking in the first row of Table 10.7 (previous slide), we see that the critical value of $\chi^2$ for significance at the 0.05 level is 3.841. The chi-square value that we found for the gender/major data is $\chi^2 = 0.350$; because this is less than the critical value of 3.841, we cannot reject the null hypothesis. Of course, failing to reject the null hypothesis does not prove that major and gender are independent. It simply means that we do not have enough evidence to justify rejecting the null hypothesis of independence.
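
The decision rule is a single comparison, sketched below in Python with the two numbers quoted for the gender/major data (chi-square of 0.350 and a 0.05-level critical value of 3.841 for a 2 × 2 table). The function name `decide` is an illustrative assumption.

```python
def decide(chi_square, critical_value):
    """Apply the decision rule: reject when chi-square >= critical value."""
    if chi_square >= critical_value:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"

# Values quoted in the text for the gender/major data at the 0.05 level.
decision = decide(0.350, 3.841)
print(decision)
```

The comparison uses "greater than or equal to" so that a statistic exactly at the critical value leads to rejection, matching the rule as stated.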

EXAMPLE 3 Vitamin C Test
A (hypothetical) study seeks to determine whether vitamin C has an effect in preventing colds. Among a sample of 220 people, 105 randomly selected people took a vitamin C pill daily for a period of 10 weeks and the remaining 115 people took a placebo daily for 10 weeks. At the end of 10 weeks, the number of people who got colds was recorded. Table 10.8 summarizes the results. Determine whether there is a relationship between taking vitamin C and getting colds.

Solution: We begin by stating the null and alternative hypotheses.
$H_0$ (null hypothesis): There is no relationship between taking vitamin C and getting colds; that is, vitamin C has no more effect on colds than the placebo.
$H_a$ (alternative hypothesis): There is a relationship between taking vitamin C and getting colds; that is, the numbers of colds in the two groups are not what we would expect if vitamin C and the placebo were equally effective (or equally ineffective).
As always, we assume that the null hypothesis is true and calculate the expected frequency for each cell in the table. Noting that the sample size is 220 and proceeding as in Example 2, we find the following expected frequencies (next slide):

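Solution: (cont.)
Vitamin C and cold: $P(\text{vitamin C}) \times P(\text{cold}) \times 220$
Vitamin C and no cold: $P(\text{vitamin C}) \times P(\text{no cold}) \times 220$
Placebo and cold: $P(\text{placebo}) \times P(\text{cold}) \times 220$
Placebo and no cold: $P(\text{placebo}) \times P(\text{no cold}) \times 220$
Here $P(\text{vitamin C}) = 105/220$ and $P(\text{placebo}) = 115/220$; the cold and no-cold totals are read from Table 10.8. Table 10.9 shows the two-way table with the expected frequencies in parentheses.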

Solution: (cont.)
We now compute the chi-square statistic for the sample data. The accompanying table shows how we organize the work; you should confirm all the calculations shown.

To make the decision about whether to reject the null hypothesis, we compare the value of $\chi^2$ computed for the sample data to the critical values from Table 10.7. We look in the row for a table size of 2 × 2, because the original data in Table 10.8 have two rows and two columns (not counting the "total" values), and read off the critical value of $\chi^2$ for significance at the 0.01 level. Because our sample value of $\chi^2$ is greater than this critical value, we reject the null hypothesis and conclude that there is a relationship between vitamin C and colds. That is, based on the data from this sample, there is reason to believe that vitamin C does have more effect on colds than a placebo.