Chi-square test or χ² test

Presentation transcript:

Chi-square test or χ² test

Chi-square test
Used to test the counts of categorical data.
Three types:
1. Goodness of fit
2. Independence
3. Homogeneity

χ² distribution
[Figure: χ² density curves for df = 3, df = 5 and df = 10]

χ² distribution
Different df have different curves.
Skewed right.
Only positive values.
As df increases, the curve shifts toward the right and becomes more like a normal curve.
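The shape statements above can be checked numerically. The sketch below is not from the slides: it relies on the standard results that a χ² distribution with df degrees of freedom has mean df, variance 2·df and skewness √(8/df), and the function name `chi_square_shape` is only illustrative.

```python
import math

def chi_square_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution.

    Uses the standard closed forms: mean = df, variance = 2*df,
    skewness = sqrt(8/df). These are textbook facts, not slide content.
    """
    mean = df
    variance = 2 * df
    skewness = math.sqrt(8 / df)
    return mean, variance, skewness

# The three df values shown on the slide's figure.
summary = {df: chi_square_shape(df) for df in (3, 5, 10)}

for df, (mean, variance, skewness) in summary.items():
    print(f"df={df:2d}  mean={mean}  variance={variance}  skewness={skewness:.3f}")
```

The mean grows with df (the curve shifts right) while the skewness shrinks toward 0 (the curve looks more normal), which matches the description on the slide.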

χ² goodness of fit test
Want to see how well the observed counts "fit" what we expect the counts to be.

χ² goodness of fit test
Explain the parameters.
State the hypotheses:
Null hypothesis: H0: p1 = hypothesized proportion for category 1, p2 = hypothesized proportion for category 2, and so on; i.e., the actual population distribution is equal to the expected distribution.
Alternative hypothesis: Ha: H0 is not true; i.e., the actual population distribution is different from the expected distribution.
Conditions:
1. Observed cell counts are based on a random sample.
2. The sample size is large. The sample size is large enough for the chi-square test to be appropriate as long as every expected count is at least 5.

χ² goodness of fit test
Test statistic: χ² = Σ (observed count - expected count)² / expected count, summed over all categories
Degrees of freedom = number of categories - 1
Write the decision and conclusion.
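A minimal sketch of the test statistic and the large-sample condition, assuming the usual form χ² = Σ (observed - expected)² / expected. The function names and the three-category counts are hypothetical, chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def large_sample_ok(expected, minimum=5):
    """Check the condition that every expected count is at least `minimum`."""
    return all(e >= minimum for e in expected)

# Hypothetical data (not from the slides): 60 observations, 3 equally likely categories.
observed = [18, 22, 20]
expected = [20.0, 20.0, 20.0]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1   # number of categories - 1
condition_met = large_sample_ok(expected)

print(statistic, degrees_of_freedom, condition_met)
```

Checking the expected counts, not the observed ones, is what the large-sample condition calls for.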

χ² goodness of fit test: Example
Last year, at the 6 pm time slot, television channels 2, 11, 13 and 26 captured the entire audience with 30%, 25%, 20% and 25%, respectively. During the first week of the new season, 500 viewers are interviewed with the results below. Has the preference changed from last season?
Channel    2    11    13    26
Viewers  129   148   112   111
Parameters:
p1 = true proportion of channel 2 viewers
p2 = true proportion of channel 11 viewers
p3 = true proportion of channel 13 viewers
p4 = true proportion of channel 26 viewers

χ² goodness of fit test: Example
Hypotheses:
H0: p1 = 0.30, p2 = 0.25, p3 = 0.20, p4 = 0.25
Ha: At least one of the proportions is not as expected.

χ² goodness of fit test: Example
Conditions:
1) The sample should be random, which I will assume.
2) The sample size should be large.
Channel     2    11    13    26
Expected  150   125   100   125
Observed  129   148   112   111
Since all expected counts are greater than 5, the sample is large enough.
χ² = 10.18, df = 3, p-value = .0171, α = .05

χ² goodness of fit test: Example
Decision: Since the p-value < α, I reject the null hypothesis at the .05 level.
Conclusion: There is evidence to conclude that the viewing preference for the 6 pm news has changed.
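The channel example can be reproduced with the standard library alone. This is a sketch, not the method used on the slides (which appear to rely on a calculator). The helper `chi2_sf_df3` is a hypothetical name for the closed-form upper-tail probability of a χ² distribution with exactly 3 degrees of freedom.

```python
import math

# Data from the example: 500 viewers and last season's audience shares.
observed = [129, 148, 112, 111]
hypothesized = [0.30, 0.25, 0.20, 0.25]

n = sum(observed)
expected = [n * p for p in hypothesized]          # 150, 125, 100, 125
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

def chi2_sf_df3(x):
    """P(X > x) for a chi-square variable with df = 3 (closed form, df = 3 only)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(chi2)
alpha = 0.05
reject_null = p_value < alpha

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

The p-value falls below α = .05, in line with the decision to reject the null hypothesis.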

χ² test for independence
Used to see if two categorical variables are associated (dependent) or not associated (independent).

χ² test for independence
State the hypotheses:
Null hypothesis: H0: The two variables are independent (not associated).
Alternative hypothesis: Ha: The two variables are not independent (associated).
Conditions:
1) A random sample is taken from one large population.
2) The sample size is large: all expected cell counts are at least 5.
3) Each outcome can be classified into one of several categories on one variable and into one of several categories on a second variable.

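χ² test for independence
Test statistic: χ² = Σ (observed count - expected count)² / expected count, summed over all cells
Expected cell count = (row total)(column total) / (grand total)
df = (number of rows - 1)(number of columns - 1)
Write the decision and conclusion.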
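A short sketch of how expected cell counts and degrees of freedom follow from a two-way table, assuming the usual rule expected = (row total)(column total) / (grand total). The function names and the 2×2 table are hypothetical and serve only as an illustration.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def two_way_df(table):
    """Degrees of freedom: (rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Hypothetical observed counts (not from the slides).
table = [[10, 20],
         [30, 40]]

expected = expected_counts(table)
df = two_way_df(table)

print(expected, df)
```

The same two helpers apply unchanged to the test for homogeneity, since both tests share the expected-count rule and the df formula.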

χ² test for independence: Example
A beef distributor wishes to determine whether there is a relationship between geographic region and cut of meat preferred. Suppose that, in a random sample of 500 customers, 300 are from the North and 200 are from the South, and their preferences are as in the table. Is beef preference independent of geographic region?
Beef preference   North   South
Cut A               100      50
Cut B               150     125
Cut C                50      25
Hypotheses:
H0: Beef preference is independent of geographic region.
Ha: Beef preference is not independent of geographic region.

χ² test for independence: Example
Observed counts, with expected counts in parentheses:
Beef preference   North        South
Cut A             100 (90)     50 (60)
Cut B             150 (165)    125 (110)
Cut C             50 (45)      25 (30)
Remember: expected count = (row total)(column total) / (grand total)
Conditions:
1) The sample is random, which is stated in the problem.
2) The sample size should be large. All expected cell counts are at least 5, as shown in the table.
3) Each outcome can be classified by region and cut.

χ² test for independence: Example
Enter observed counts in Matrix A.
χ² = 7.58, df = 2, p-value = .0226, α = .05

Decision: Since the p-value < α, I reject the null hypothesis at the .05 level.
Conclusion: There is evidence to conclude that beef preference is not independent of geographic region.
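The beef example can be worked through in a few lines as a sketch. Two assumptions are made here: the Cut C counts (50 North, 25 South) are inferred from the stated region totals of 300 and 200, and the p-value uses the closed form exp(-x/2), which holds only for a χ² distribution with df = 2.

```python
import math

# Rows: Cut A, Cut B, Cut C. Columns: North, South.
# Cut C counts are inferred from the region totals (300 North, 200 South).
observed = [[100, 50],
            [150, 125],
            [50, 25]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 only, the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)
alpha = 0.05
reject_null = p_value < alpha

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

The computed expected counts agree with the parenthesized values on the slide (90, 60, 165, 110, 45, 30), which supports the inferred Cut C row.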

χ² test for homogeneity
Used to see if two or more populations are the same (homogeneous).
Are the proportions of the different outcomes in one population equal to those in another population?

χ² test for homogeneity
State the hypotheses:
Null hypothesis: H0: The true category proportions are the same for all the populations.
Alternative hypothesis: Ha: The true category proportions are not the same for all the populations.
Conditions:
1) Independent random samples of fixed sizes are taken from two or more large populations, OR two or more treatments are randomly assigned to two or more types of available subjects.
2) Each outcome falls into exactly one of several categories, with the categories being the same in all populations.
3) The sample size is large: all expected cell counts are at least 5.

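χ² test for homogeneity
Test statistic: χ² = Σ (observed count - expected count)² / expected count, summed over all cells
Expected cell count = (row total)(column total) / (grand total)
df = (number of rows - 1)(number of columns - 1)
Write the decision and conclusion.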

χ² test for homogeneity: Example
In July 1991 and again in April 2001, the Gallup Poll asked random samples of 1015 adults about their opinions on working parents. The table summarizes responses to the question, "Considering the needs of both parents and children, which of the following do you see as the ideal family in today's society?" Based on these results, do you think there was a change in people's attitudes during the 10 years between these polls? Use α = 0.02.
Response                                 1991   2001
Both work full time                       142    131
One works full time, other part time      274    244
One works, other works at home            152    173
One works, other stays home for kids      396    416
No opinion                                 51     51
Hypotheses:
H0: The proportion of adults who believe each type of family is "ideal" was not different in 1991 and 2001.
Ha: The proportion of adults who believe each type of family is "ideal" was different in 1991 and 2001.

χ² test for homogeneity: Example
Observed counts, with expected counts in parentheses:
Response                                 1991           2001
Both work full time                      142 (136.5)    131 (136.5)
One works full time, other part time     274 (259)      244 (259)
One works, other works at home           152 (162.5)    173 (162.5)
One works, other stays home for kids     396 (406)      416 (406)
No opinion                               51 (51)        51 (51)
Conditions:
1) The samples should be random, which is stated, and independent, which I will assume.
2) Each opinion falls into one type of "ideal family" category for both 1991 and 2001.
3) The sample size should be large. All expected cell counts are at least 5, as shown in the table.

χ² test for homogeneity: Example
χ² = 4.03, df = 4, p-value = .402, α = .02

Decision: Since the p-value > α, I fail to reject the null hypothesis at the .02 level.
Conclusion: There is not enough evidence to conclude that the proportion of adults who believed each type of family is "ideal" was different in 1991 and 2001.
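The Gallup example can be checked the same way. In this sketch the "No opinion" count of 51 in both years is inferred from each sample size of 1015, and `chi2_sf_even_df` is a hypothetical helper that uses the closed-form upper tail available when df is even.

```python
import math

# Rows: the five responses. Columns: 1991, 2001.
# "No opinion" = 51 in both years is inferred from each sample totalling 1015.
observed = [[142, 131],
            [274, 244],
            [152, 173],
            [396, 416],
            [51, 51]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

p_value = chi2_sf_even_df(chi2, df)
alpha = 0.02
reject_null = p_value < alpha

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Here the p-value is far above α = .02, consistent with failing to reject the null hypothesis.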