Copyright © 2009 Cengage Learning 15.1 Chapter 16 Chi-Squared Tests.

Presentation transcript:


A Common Theme…

What to do?                                  Data type   Number of categories   Statistical technique
Describe a population                        Nominal     Two or more            χ² goodness-of-fit test
Compare two populations                      Nominal     Two or more            χ² test of a contingency table
Compare two or more populations              Nominal     --                     χ² test of a contingency table
Analyze relationship between two variables   Nominal     --                     χ² test of a contingency table

One data type… two techniques.

Two Techniques…
The first is a goodness-of-fit test applied to data produced by a multinomial experiment, a generalization of a binomial experiment; it is used to describe one population of data. The second uses data arranged in a contingency table to determine whether two classifications of a population of nominal data are statistically independent; this test can also be interpreted as a comparison of two or more populations. In both cases, we use the chi-squared ($\chi^2$) distribution.

The Multinomial Experiment…
Unlike a binomial experiment, which has only two possible outcomes (e.g., heads or tails), a multinomial experiment has the following properties:
- It consists of a fixed number, $n$, of trials.
- Each trial can have one of $k$ outcomes, called cells.
- Each probability $p_i$ remains constant.
- Our usual notion of probabilities holds, namely $p_1 + p_2 + \ldots + p_k = 1$.
- Each trial is independent of the other trials.
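The properties listed on this slide can be illustrated with a small simulation. This is only a sketch: the function name `simulate_multinomial`, the seed, and the cell probabilities are assumptions chosen for illustration and do not come from the slides.

```python
import random
from collections import Counter


def simulate_multinomial(n, probabilities, seed=None):
    """Run n independent trials; each lands in cell i with probability p_i."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    rng = random.Random(seed)
    cells = list(range(len(probabilities)))
    # Every trial uses the same (constant) probabilities and is independent.
    draws = rng.choices(cells, weights=probabilities, k=n)
    tally = Counter(draws)
    return [tally[cell] for cell in cells]


# Illustrative values only: k = 3 cells and n = 200 trials.
counts = simulate_multinomial(200, [0.45, 0.40, 0.15], seed=1)
print(counts, sum(counts))
```

The cell counts always add up to $n$, because every trial falls into exactly one of the $k$ cells.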

Chi-squared Goodness-of-Fit Test…
We test whether there is sufficient evidence to reject a specified set of values for the $p_i$. To illustrate, our null hypothesis is:
$H_0: p_1 = a_1,\ p_2 = a_2,\ \ldots,\ p_k = a_k$
where $a_1, a_2, \ldots, a_k$ are the values we want to test. Our research hypothesis is:
$H_1$: At least one $p_i$ is not equal to its specified value.

Example 15.1
Two companies, A and B, have recently conducted aggressive advertising campaigns to maintain and possibly increase their respective shares of the market for fabric softener. These two companies enjoy a dominant position in the market. Before the advertising campaigns began, the market share of company A was 45%, whereas company B had 40% of the market. Other competitors accounted for the remaining 15%.

To determine whether these market shares changed after the advertising campaigns, a marketing analyst solicited the preferences of a random sample of 200 customers of fabric softener. Of the 200 customers, 102 indicated a preference for company A's product, 82 preferred company B's fabric softener, and the remaining 16 preferred the products of one of the competitors. Can the analyst infer at the 5% significance level that customer preferences have changed from their levels before the advertising campaigns were launched?

Example 15.1…
We compare market shares before and after the advertising campaigns to see whether there is a difference (i.e., whether the advertising changed the market shares). We hypothesize values for the parameters equal to the market shares before the campaigns. That is,
$H_0: p_1 = .45,\ p_2 = .40,\ p_3 = .15$
The alternative hypothesis is a denial of the null. That is,
$H_1$: At least one $p_i$ is not equal to its specified value.

Example 15.1… Test Statistic
If the null hypothesis is true, we would expect the numbers of customers selecting brand A, brand B, and other brands to be 200 times the proportions specified under the null hypothesis. That is,
$e_1 = 200(.45) = 90$
$e_2 = 200(.40) = 80$
$e_3 = 200(.15) = 30$
In general, the expected frequency for each cell is given by
$e_i = np_i$
This expression is derived from the formula for the expected value of a binomial random variable, introduced in Section 7.4.
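As a minimal sketch of the expected-frequency calculation, the snippet below applies $e_i = np_i$ to the sample size and null-hypothesis shares of Example 15.1. The function name `expected_frequencies` is a hypothetical helper, not something defined in the slides.

```python
def expected_frequencies(n, probabilities):
    """Return e_i = n * p_i for each cell of a multinomial experiment."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    return [n * p for p in probabilities]


# Null-hypothesis market shares for company A, company B, and others.
null_shares = [0.45, 0.40, 0.15]
expected = expected_frequencies(200, null_shares)
print([round(e, 2) for e in expected])
```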

Example 15.1…
If the expected frequencies and the observed frequencies are quite different, we would conclude that the null hypothesis is false, and we would reject it. However, if the expected and observed frequencies are similar, we would not reject the null hypothesis. The test statistic measures the similarity of the expected and observed frequencies.

Chi-squared Goodness-of-Fit Test…
Our chi-squared goodness-of-fit test statistic is given by:
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
where $f_i$ is the observed frequency and $e_i$ is the expected frequency of cell $i$.
Note: this statistic is approximately chi-squared distributed with $k - 1$ degrees of freedom, provided the sample size is large. The rejection region is:
$$\chi^2 > \chi^2_{\alpha,\,k-1}$$

Example 15.1…
To calculate our test statistic, we lay out the data in a table for easier calculation by hand:

Company   Observed frequency   Expected frequency   Delta         Summation component
          $f_i$                $e_i$                $f_i - e_i$   $(f_i - e_i)^2 / e_i$
A         102                  90                   12            1.60
B         82                   80                   2             0.05
Others    16                   30                   -14           6.53
Total     200                  200                  0             8.18

Check that the observed and expected totals are equal (both are 200).
COMPUTE
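The hand calculation can be sketched in a few lines of Python, using the observed counts (102, 82, 16) and expected counts (90, 80, 30) given earlier in Example 15.1. The function name `chi_squared_statistic` is an assumption made for this illustration.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (f_i - e_i)^2 / e_i over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


observed = [102, 82, 16]   # company A, company B, others
expected = [90, 80, 30]    # 200 times the null-hypothesis shares

# The totals must agree before the statistic is meaningful.
totals_match = sum(observed) == sum(expected)

components = [(f - e) ** 2 / e for f, e in zip(observed, expected)]
statistic = chi_squared_statistic(observed, expected)
print([round(c, 2) for c in components], round(statistic, 2))
# [1.6, 0.05, 6.53] 8.18
```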

Example 15.1…
Our rejection region is:
$$\chi^2 > \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,2} = 5.99$$
Since our test statistic is 8.18, which is greater than the critical value of 5.99, we reject $H_0$ in favor of $H_1$; that is, "There is sufficient evidence to infer that the proportions have changed since the advertising campaigns were implemented."
INTERPRET

Example 15.1… p-value
[Figure: p-value of the chi-squared goodness-of-fit test]
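One way to obtain the p-value without statistical tables is sketched below. It relies on a closed form for the chi-squared upper tail that holds only when the degrees of freedom are even, which covers this example ($k - 1 = 2$). The helper name `chi2_sf_even_df` is hypothetical, and the slides themselves use Excel rather than this approach.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


statistic = 8.18   # test statistic reported in Example 15.1
df = 3 - 1         # k - 1 with k = 3 cells
p_value = chi2_sf_even_df(statistic, df)

# For df = 2 only, the 5% critical value solves exp(-c / 2) = 0.05.
critical = -2.0 * math.log(0.05)
print(round(p_value, 4), round(critical, 2))
```

The p-value comes out at roughly .017, below the 5% significance level, which agrees with the decision to reject $H_0$.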

Required Conditions…
To use this technique, the sample size must be large enough that the expected value for each cell is 5 or more (i.e., $np_i \ge 5$). If the expected frequency of a cell is less than 5, combine it with other cells to satisfy the condition.
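The check and the remedy can be sketched as follows. Both function names are hypothetical, and the second data set (n = 50 with four cells) is made up purely to show a violation; it does not appear in the slides.

```python
def cells_below_minimum(n, probabilities, minimum=5):
    """Indices of cells whose expected frequency n * p_i is below the minimum."""
    return [i for i, p in enumerate(probabilities) if n * p < minimum]


def combine_cells(probabilities, indices):
    """Merge the listed cells into one cell placed at the end of the list."""
    merge = set(indices)
    if not merge:
        return list(probabilities)
    kept = [p for i, p in enumerate(probabilities) if i not in merge]
    return kept + [sum(probabilities[i] for i in merge)]


# Example 15.1 satisfies the condition: expected counts are 90, 80 and 30.
ok_cells = cells_below_minimum(200, [0.45, 0.40, 0.15])

# Made-up case: n = 50 gives expected counts 22.5, 20, 4 and 3.5.
hypothetical = [0.45, 0.40, 0.08, 0.07]
small = cells_below_minimum(50, hypothetical)
merged = combine_cells(hypothetical, small)
print(ok_cells, small, [round(50 * p, 2) for p in merged])
```

After merging, the number of cells $k$ drops by one in this made-up case, so the degrees of freedom $k - 1$ drop as well.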

Identifying Factors…
Factors that identify the chi-squared goodness-of-fit test:
- Problem objective: describe a population
- Data type: nominal
- Number of categories: two or more
- Expected frequencies: $e_i = np_i$

Chi-squared Test of a Contingency Table
The chi-squared test of a contingency table is used to:
- determine whether there is enough evidence to infer that two nominal variables are related, and
- infer that differences exist among two or more populations of nominal variables.
To use this technique, we need to classify the data according to two different criteria.

Example 15.2
The MBA program was experiencing problems scheduling its courses. The demand for the program's optional courses and majors was quite variable from one year to the next. In desperation, the dean of the business school turned to a statistics professor for assistance. The statistics professor believed that the problem might be the variability in the academic backgrounds of the students and that the undergraduate degree affects the choice of major.

As a start, he took a random sample of last year's MBA students and recorded the undergraduate degree and the major selected in the graduate program. The undergraduate degrees were BA, BEng, BBA, and several others. There are three possible majors for the MBA students: accounting, finance, and marketing. Can the statistician conclude that the undergraduate degree affects the choice of major?

Example 15.2 (Xm15-02)
The data are stored in two columns. The first column consists of the integers 1, 2, 3, and 4, representing the undergraduate degree, where
1 = BA
2 = BEng
3 = BBA
4 = Other
The second column lists the MBA major, where
1 = Accounting
2 = Finance
3 = Marketing

Example 15.2
The problem objective is to determine whether two variables (undergraduate degree and MBA major) are related. Both variables are nominal. Thus, the technique to use is the chi-squared test of a contingency table. The alternative hypothesis specifies what we test. That is,
$H_1$: The two variables are dependent.
The null hypothesis is a denial of the alternative hypothesis.
$H_0$: The two variables are independent.
IDENTIFY

Test Statistic
The test statistic is the same as the one used to test proportions in the goodness-of-fit test. That is, the test statistic is
$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$
where the sum runs over the $k$ cells of the contingency table.
Note, however, that there is a major difference between the two applications. In this one, the null hypothesis does not specify the proportions $p_i$, from which we compute the expected values $e_i$ that we need to calculate the $\chi^2$ test statistic. That is, we cannot use $e_i = np_i$ because we don't know the $p_i$ (they are not specified by the null hypothesis). It is necessary to estimate the $p_i$ from the data.

Example 15.2
The first step is to count the number of students in each of the 12 combinations. The result is called a cross-classification table.

Example 15.2

                   MBA Major
Undergrad degree   Accounting   Finance   Marketing   Total
BA                 …            …         …           …
BEng               …            …         …           …
BBA                …            …         …           …
Other              …            …         …           …
Total              …            …         …           …

Example 15.2
If the null hypothesis is true (remember, we always start with this assumption) and the two nominal variables are independent, then, for example,
P(BA and Accounting) = [P(BA)][P(Accounting)]
Since we don't know the values of P(BA) or P(Accounting), we need to use the data to estimate the probabilities.

Test Statistic
There are 152 students, of whom 61 have chosen accounting as their MBA major. Thus, we estimate the probability of accounting as
P(Accounting) ≈ 61/152 = .401
Similarly, 60 of the 152 students have a BA, so
P(BA) ≈ 60/152 = .395

Example 15.2…
If the null hypothesis is true,
P(BA and Accounting) = (60/152)(61/152)
Now that we have the probability, we can calculate the expected value. That is,
E(BA and Accounting) = 152(60/152)(61/152) = (60)(61)/152 = 24.08
We can do the same for the other 11 cells.
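The same arithmetic can be written as a one-line rule: the expected count of a cell is its row total times its column total, divided by the sample size. The sketch below applies it to the BA and Accounting totals quoted on this slide; the function name `expected_cell` is a hypothetical helper.

```python
def expected_cell(row_total, column_total, n):
    """Expected count under independence: n * (row_total/n) * (column_total/n)."""
    return row_total * column_total / n


# 60 BA students, 61 accounting majors, 152 students in total.
e_ba_accounting = expected_cell(60, 61, 152)
print(round(e_ba_accounting, 2))  # 24.08
```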

Example 15.2
We can now compare observed with expected frequencies and calculate our test statistic.
[Table: observed and expected frequencies for each undergraduate degree (BA, BEng, BBA, Other) by MBA major (Accounting, Finance, Marketing)]
COMPUTE
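A minimal sketch of the full calculation is given below. Because the cell counts of Example 15.2 are not preserved in this transcript, the code runs on a made-up 2 x 2 table; the function name `contingency_chi_squared` and the data are assumptions for illustration only. The degrees of freedom use the standard rule (rows − 1)(columns − 1), which the slides do not state explicitly.

```python
def contingency_chi_squared(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count estimated from the marginal totals.
            expected = row_totals[i] * column_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Made-up counts, NOT the data of Example 15.2.
made_up = [[20, 30],
           [30, 20]]
statistic, df = contingency_chi_squared(made_up)
print(statistic, df)  # 4.0 1
```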

Example 15.2…
Using Excel: click Add-Ins, Data Analysis Plus, then either Contingency Table (if the table has already been prepared) or Contingency Table (Raw Data) (if the table has not been completed).
COMPUTE

Example 15.2…
The printout below was produced from file Xm15-02 using the Contingency Table (Raw Data) command.
[Figure: Excel printout of the contingency table test]
COMPUTE

Example 15.2…
The p-value is [value not preserved]. There is enough evidence to infer that the MBA major and the undergraduate degree are related. We can also interpret the results of this test in two other ways.
1. There is enough evidence to infer that there are differences in MBA major between the four undergraduate categories.
2. There is enough evidence to infer that there are differences in undergraduate degree between the majors.
INTERPRET

Required Condition – Rule of Five…
In a contingency table where one or more cells have expected values of less than 5, we need to combine rows or columns to satisfy the rule of five. Note: combining rows or columns changes the dimensions of the table, so the degrees of freedom must be changed as well.
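To make the note about degrees of freedom concrete, here is a short worked calculation. It assumes the standard rule that a table with $r$ rows and $c$ columns has $(r-1)(c-1)$ degrees of freedom, which the slides do not state explicitly; the decision to merge two rows is hypothetical and only illustrates the effect.

```latex
\begin{align*}
\nu &= (r - 1)(c - 1) \\
\text{4 degrees} \times \text{3 majors:}\quad \nu &= (4 - 1)(3 - 1) = 6 \\
\text{after merging two rows (3 rows, 3 columns):}\quad \nu &= (3 - 1)(3 - 1) = 4
\end{align*}
```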

Identifying Factors…
Factors that identify the chi-squared test of a contingency table:
- Problem objective: analyze the relationship between two variables, or compare two or more populations
- Data type: nominal

Table 15.1 Statistical Techniques for Nominal Data

Problem objective                                Categories    Statistical technique
Describe a population                            2             z-test of $p$ or chi-squared goodness-of-fit test
Describe a population                            More than 2   Chi-squared goodness-of-fit test
Compare two populations                          2             z-test of $p_1 - p_2$ or chi-squared test of a contingency table
Compare two populations                          More than 2   Chi-squared test of a contingency table
Compare more than two populations                2 or more     Chi-squared test of a contingency table
Analyze the relationship between two variables   2 or more     Chi-squared test of a contingency table