Chi-square test or χ² test. What if we are interested in seeing if my “crazy” dice are considered “fair”? What can I do?

Presentation transcript:

Chi-square test or χ² test

What if we are interested in seeing if my “crazy” dice are considered “fair”? What can I do?

Chi-square test
Used to test the counts of categorical data
Three types:
– Goodness of fit (univariate)
– Independence (bivariate)
– Homogeneity (univariate with two samples)

Chi-square distributions

Upper-tail Areas for Chi-square Distributions

χ² distribution
Different df have different curves
Skewed right
Cannot take on negative values
As df increases, curve shifts toward right & becomes more like a normal curve
Each curve has a mode at df - 2 (for df of at least 2) and a mean at df
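The mode and mean facts above can be checked numerically. The following is a minimal Python sketch, not part of the original slides: it evaluates the chi-square density on a grid of midpoints and approximates the mode and mean. The function names `chi2_pdf` and `approx_mode_and_mean`, the grid step, and the cutoff at 100 are assumptions chosen for illustration.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 0.0  # the distribution cannot take on negative values
    half = df / 2.0
    log_density = ((half - 1.0) * math.log(x) - x / 2.0
                   - half * math.log(2.0) - math.lgamma(half))
    return math.exp(log_density)

def approx_mode_and_mean(df, upper=100.0, step=0.01):
    """Approximate the mode and the mean on a grid of midpoints."""
    n = int(upper / step)
    xs = [(i + 0.5) * step for i in range(n)]
    mode = max(xs, key=lambda x: chi2_pdf(x, df))        # highest point of the curve
    mean = sum(x * chi2_pdf(x, df) * step for x in xs)   # midpoint-rule integral
    return mode, mean

# df values of 3 or more, where the "mode at df - 2" rule applies
for df in (3, 5, 10):
    mode, mean = approx_mode_and_mean(df)
    print(f"df={df}: mode is about {mode:.2f}, mean is about {mean:.2f}")
```

The grid approach is crude but needs no statistics library. The gap between the mode and the mean stays at 2 for every df.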

χ² assumptions
SRS – reasonably random sample
Counts – have counts of categorical data, and we expect each category to happen at least once
Sample size – to ensure that the sample size is large enough, we should expect at least five in each category. ***Be sure to list expected counts!!
Combine these together: All expected counts are at least 5.
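The sample-size condition is easy to check once the expected counts are listed. Here is a small Python sketch under stated assumptions: the helper names `expected_counts` and `all_expected_at_least` are hypothetical, and the fair six-sided die and the numbers of rolls are made up for illustration (they are not data from the slides).

```python
from fractions import Fraction

def expected_counts(sample_size, proportions):
    """Expected count for each category: sample size times hypothesized proportion."""
    return [sample_size * p for p in proportions]

def all_expected_at_least(expected, minimum=5):
    """True when every expected count meets the minimum."""
    return all(count >= minimum for count in expected)

# Hypothetical setting: a fair six-sided die, so each face has proportion 1/6.
# Fractions keep the arithmetic exact (30 * 1/6 is exactly 5).
fair_die = [Fraction(1, 6)] * 6

for rolls in (24, 30, 60):  # hypothetical numbers of rolls
    expected = expected_counts(rolls, fair_die)
    print(rolls, [float(e) for e in expected], all_expected_at_least(expected))
```

Under these assumptions a fair die needs at least 30 rolls before every expected count reaches 5.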

χ² formula
χ² = Σ (observed - expected)² / expected

χ² Goodness of fit test
Uses univariate data (one sample, one variable)
Want to see how well the observed counts “fit” what we expect the counts to be
Use the χ² cdf function on the calculator to find p-values
Based on df: df = number of categories - 1
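For readers without the calculator, the upper-tail area that χ² cdf(statistic, 10^99, df) reports can be sketched in Python with only the standard library. This is an illustrative stand-in, not the calculator's own routine: the names `chi2_upper_tail` and `degrees_of_freedom` are hypothetical, and the recurrence used here assumes df is a positive whole number.

```python
import math

def degrees_of_freedom(number_of_categories):
    """Goodness-of-fit df: number of categories minus one."""
    return number_of_categories - 1

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-square curve (whole-number df only)."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Starting values: df = 2 gives exp(-z); df = 1 gives erfc(sqrt(z)).
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    # Each step raises df by 2 and adds z^a * e^(-z) / Gamma(a + 1).
    while a < df / 2.0:
        tail += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return tail

df = degrees_of_freedom(6)          # e.g. six faces of a die
print(df, chi2_upper_tail(11.07, df))
```

The infinite upper bound is built in, so no 10^99 placeholder is needed.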

Let’s test our dice!

Hypotheses – written in words
H₀: proportions are equal
Hₐ: at least one proportion is not the same
Be sure to write in context!

Does your zodiac sign determine how successful you will be? Fortune magazine collected the zodiac signs of 256 heads of the largest 400 companies. Is there sufficient evidence to claim that successful people are more likely to be born under some signs than others?

Aries 23       Libra 18          Leo 20
Taurus 20      Scorpio 21        Virgo 19
Gemini 18      Sagittarius 19    Aquarius 24
Cancer 23      Capricorn 22      Pisces 29

How many would you expect in each sign if there were no difference between them? How many degrees of freedom?
I would expect CEOs to be equally born under all signs. So 256/12 = 21.333
Since there are 12 signs, df = 12 – 1 = 11

Assumptions:
Have a random sample of CEOs
All expected counts are greater than 5. (I expect CEOs to be born in each sign.)
H₀: The proportions of CEOs born under each sign are the same.
Hₐ: At least one of the proportions of CEOs born under each sign is different.

2.) Compute the residuals. (Observed - Expected)
3.) Square the residuals.
4.) Compute the components for each cell: (Observed - Expected)² / Expected value.
5.) Find the sum of the components (that’s the chi-square statistic).

Sign          Observed   Expected (256/12)   Residual   Residual²   Component
Aries         23         21.333               1.667      2.7778     0.1302
Taurus        20         21.333              -1.333      1.7778     0.0833
Gemini        18         21.333              -3.333     11.1111     0.5208
Cancer        23         21.333               1.667      2.7778     0.1302
Leo           20         21.333              -1.333      1.7778     0.0833
Virgo         19         21.333              -2.333      5.4444     0.2552
Libra         18         21.333              -3.333     11.1111     0.5208
Scorpio       21         21.333              -0.333      0.1111     0.0052
Sagittarius   19         21.333              -2.333      5.4444     0.2552
Capricorn     22         21.333               0.667      0.4444     0.0208
Aquarius      24         21.333               2.667      7.1111     0.3333
Pisces        29         21.333               7.667     58.7778     2.7552

Σ = 5.094

P-value = χ² cdf(5.094, 10^99, 11) = .9265
α = .05
Since p-value > α, I fail to reject H₀. There is not sufficient evidence to suggest that the CEOs are born under some signs more than under others.
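The whole zodiac calculation can be reproduced with a short Python sketch. The counts are the ones given in the slides. The helper `chi2_upper_tail` is a hypothetical stand-in for the calculator's χ² cdf with an infinite upper bound, and it assumes a whole-number df.

```python
import math

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-square curve (whole-number df only)."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        tail += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return tail

observed = {
    "Aries": 23, "Taurus": 20, "Gemini": 18, "Cancer": 23,
    "Leo": 20, "Virgo": 19, "Libra": 18, "Scorpio": 21,
    "Sagittarius": 19, "Capricorn": 22, "Aquarius": 24, "Pisces": 29,
}

total = sum(observed.values())        # number of CEOs in the sample
expected = total / len(observed)      # same expected count for every sign under H0

# Residual, squared residual, then component = squared residual / expected
components = {sign: (count - expected) ** 2 / expected
              for sign, count in observed.items()}
statistic = sum(components.values())  # the chi-square statistic
df = len(observed) - 1
p_value = chi2_upper_tail(statistic, df)

print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.4f}")
```

Pisces contributes more than half of the statistic, yet the total is still small for 11 degrees of freedom.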

Offspring of certain fruit flies may have yellow or ebony bodies and normal wings or short wings. Genetic theory predicts that these traits will appear in the ratio 9:3:3:1 (yellow & normal, yellow & short, ebony & normal, ebony & short). A researcher checks 100 such flies and finds the distribution of traits to be 59, 20, 11, and 10, respectively.
What are the expected counts? df? Are the results consistent with the theoretical distribution predicted by the genetic model? (see next page)
Expected counts: Y & N = 56.25, Y & S = 18.75, E & N = 18.75, E & S = 6.25
We expect 9/16 of the 100 flies to have yellow bodies and normal wings (Y & N).
Since there are 4 categories, df = 4 – 1 = 3

Assumptions:
Have a random sample of fruit flies
All expected counts are greater than 5.
Expected counts: Y & N = 56.25, Y & S = 18.75, E & N = 18.75, E & S = 6.25
H₀: The proportions of fruit flies are the same as the theoretical model.
Hₐ: At least one of the proportions of fruit flies is not the same as the theoretical model.
P-value = χ² cdf(5.671, 10^99, 3) = .129
α = .05
Since p-value > α, I fail to reject H₀. There is not sufficient evidence to suggest that the distribution of fruit flies is not the same as the theoretical model.
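When the model is a ratio rather than equal proportions, each expected count is the sample size times that category's share of the ratio. A minimal Python sketch of the fruit fly test follows. The function `upper_tail_df3` is a hypothetical helper that uses a closed form valid only for exactly 3 degrees of freedom, swapped in here for the calculator's χ² cdf.

```python
import math

labels = ["yellow & normal", "yellow & short", "ebony & normal", "ebony & short"]
ratio = [9, 3, 3, 1]              # genetic model 9:3:3:1
observed = [59, 20, 11, 10]       # counts from the 100 flies

total = sum(observed)
expected = [total * part / sum(ratio) for part in ratio]   # 9/16, 3/16, 3/16, 1/16 of 100

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def upper_tail_df3(x):
    """Upper-tail area of the chi-square distribution with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

p_value = upper_tail_df3(statistic)

for label, e in zip(labels, expected):
    print(f"{label}: expected {e}")
print(f"chi-square = {statistic:.3f}, p-value = {p_value:.3f}")
```

The smallest expected count is 6.25, so the condition that all expected counts are at least 5 is met, though only narrowly.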

A company says its premium mixture of nuts contains 10% Brazil nuts, 20% cashews, 20% almonds, 10% hazelnuts and 40% peanuts. You buy a large can and separate the nuts. Upon weighing them, you find there are 112 g of Brazil nuts, 183 g of cashews, 207 g of almonds, 71 g of hazelnuts, and 446 g of peanuts. You wonder whether your mix is significantly different from what the company advertises.
Why is the chi-square goodness-of-fit test NOT appropriate here? What might you do instead of weighing the nuts in order to use chi-square?
Because we do NOT have counts of the type of nuts.
We could count the number of each type of nut and then perform a χ² test.
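One way to see why weights cannot stand in for counts: plugging measurements into the chi-square formula gives a number that depends on the unit of measurement. The sketch below is an illustration added here, not an argument made in the slides. The function name `chi_square_formula` is hypothetical, and the result is deliberately not called a test statistic.

```python
claimed = {"Brazil nuts": 0.10, "cashews": 0.20, "almonds": 0.20,
           "hazelnuts": 0.10, "peanuts": 0.40}
grams = {"Brazil nuts": 112, "cashews": 183, "almonds": 207,
         "hazelnuts": 71, "peanuts": 446}

def chi_square_formula(amounts, proportions):
    """Apply sum of (observed - expected)^2 / expected to arbitrary amounts."""
    total = sum(amounts.values())
    result = 0.0
    for category, amount in amounts.items():
        expected = total * proportions[category]
        result += (amount - expected) ** 2 / expected
    return result

# Same can of nuts, expressed in kilograms instead of grams
kilograms = {nut: g / 1000 for nut, g in grams.items()}

from_grams = chi_square_formula(grams, claimed)
from_kilograms = chi_square_formula(kilograms, claimed)

print(f"using grams:     {from_grams:.3f}")
print(f"using kilograms: {from_kilograms:.5f}")
```

The two results differ by a factor of 1000 even though the nuts are the same, so neither can be compared against a chi-square distribution. Counts have no unit, which is why counting the nuts fixes the problem.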

Example: Does the color of a car influence the chance that it will be stolen? Of 830 cars reported stolen, 140 were white, 100 were blue, 270 were red, 230 were black, and 90 were other colors. It is known that 15% of all cars are white, 15% are blue, 35% are red, 30% are black, and 5% are other colors.

Category   Color   Observed   Expected
1          White   140        .15*830 = 124.5
2          Blue    100        .15*830 = 124.5
3          Red     270        .35*830 = 290.5
4          Black   230        .30*830 = 249
5          Other   90         .05*830 = 41.5

Category   Color   Observed   Expected
1          White   140        124.5
2          Blue    100        124.5
3          Red     270        290.5
4          Black   230        249
5          Other   90         41.5

Let π1, π2, ..., π5 denote the true proportions of stolen cars that fall into the 5 color categories.
H₀: π1 = .15, π2 = .15, π3 = .35, π4 = .30, π5 = .05
Hₐ: H₀ is not true.
α = .01
Test statistic: χ² = Σ (observed - expected)² / expected
Assumptions: The sample was a random sample of stolen cars. All expected counts are greater than 5, so the sample size is large enough to use the chi-square test.

Calculations: χ² = (140 - 124.5)²/124.5 + (100 - 124.5)²/124.5 + (270 - 290.5)²/290.5 + (230 - 249)²/249 + (90 - 41.5)²/41.5 ≈ 66.33
P-value: All expected counts exceed 5, so the P-value can be based on a chi-square distribution with 4 df. The computed value is larger than 18.46, so P-value < .001.
Because P-value < α, H₀ is rejected. There is convincing evidence that at least one of the color proportions for stolen cars differs from the corresponding proportion for all cars.
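The stolen-car calculation can be checked with a few lines of Python. This sketch uses the counts and percentages from the example and the table value 18.46 quoted above. The closed-form p-value is an assumption-laden shortcut: it is valid only for exactly 4 degrees of freedom and is not how the slides obtain the P-value.

```python
import math

colors = ["White", "Blue", "Red", "Black", "Other"]
observed = [140, 100, 270, 230, 90]
percents = [15, 15, 35, 30, 5]          # hypothesized share of all cars, in percent

n = sum(observed)                        # 830 stolen cars
expected = [n * pct / 100 for pct in percents]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(colors) - 1                     # 4
critical_value = 18.46                   # from the example: upper-tail area .001 at 4 df
alpha = 0.01

# Closed form for the upper-tail area when df is exactly 4
p_value = math.exp(-statistic / 2.0) * (1.0 + statistic / 2.0)

print(f"expected counts: {expected}")
print(f"chi-square = {statistic:.2f} (critical value {critical_value})")
print(f"reject H0: {p_value < alpha}")
```

Most of the statistic comes from the "Other" category, where 90 cars were observed against 41.5 expected.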