Published by Stephanie Lawrence. Modified over 8 years ago.
1
CHAPTER 11: INFERENCE FOR DISTRIBUTIONS OF CATEGORICAL DATA 11.1 CHI-SQUARE TESTS FOR GOODNESS OF FIT OUTCOME: I WILL STATE APPROPRIATE HYPOTHESES AND COMPUTE EXPECTED COUNTS, CALCULATE THE CHI-SQUARE STATISTIC, DEGREES OF FREEDOM, AND P-VALUE, ALL FOR A CHI-SQUARE TEST FOR GOODNESS OF FIT.
2
CANDY ACTIVITY Colors: Brown, Red, Yellow, Green, Orange, Blue
3
LET’S RECAP
4
CHI-SQUARE GOODNESS OF FIT TEST Performing one-sample z tests for each color wouldn’t tell us how likely it is to get a random sample of 60 candies with a color distribution that differs as much from the one claimed by the company as this bag does (taking all the colors into consideration at one time). For that, we need a new kind of significance test, called a chi-square goodness-of-fit test.
5
CHI SQUARE TEST STATISTIC
6
MAIN IDEA The idea of the chi-square goodness-of-fit test is this: we compare the observed counts from our sample with the counts that would be expected if H0 is true. The more the observed counts differ from the expected counts, the more evidence we have against the null hypothesis.
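The comparison of observed and expected counts can be sketched in a few lines of Python. This is a minimal illustration, not the class calculation: the observed counts below are hypothetical (the class bag's counts are not given here), and the claimed proportions are the company's values stated in the null hypothesis later in these slides. The function name `chi_square_statistic` is made up for this sketch.

```python
def chi_square_statistic(observed, expected):
    """Add up (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Claimed proportions in the order blue, orange, green, yellow, red, brown.
claimed = [0.24, 0.20, 0.16, 0.14, 0.13, 0.13]

# HYPOTHETICAL observed counts for one bag of 60 candies (same color order).
observed = [12, 10, 9, 8, 10, 11]
n = sum(observed)

# Expected counts if H0 is true: sample size times each claimed proportion.
expected = [n * p for p in claimed]

chi2 = chi_square_statistic(observed, expected)
print(f"chi-square statistic = {chi2:.2f}")
```

Categories whose observed count is far from the expected count contribute the largest terms, so a larger statistic means a worse fit to the claimed distribution.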
8
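OUR CHI-SQUARE STATISTIC χ² = sum of six terms of the form (Observed − Expected)² / Expected, one per color [numeric values missing from the transcript]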
9
IS THIS VALUE LARGE OR SMALL? LET’S FIND OUT!
10
CLASS DOTPLOT [Dotplot of the class's chi-square statistics; horizontal axis from 0 to 25]
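One way to see whether a statistic is large or small is to simulate many bags of 60 candies from the claimed distribution and look at the statistics they produce. The sketch below is an assumption about how such a simulation might be written, not the procedure used in class; the name `simulate_statistic`, the seed, and the number of repetitions are chosen for illustration only.

```python
import random

# Company's claimed color proportions (from the null hypothesis).
claimed = {
    "blue": 0.24, "orange": 0.20, "green": 0.16,
    "yellow": 0.14, "red": 0.13, "brown": 0.13,
}


def simulate_statistic(n, proportions, rng):
    """Draw n candies from the claimed distribution and return chi-square."""
    colors = list(proportions)
    weights = [proportions[c] for c in colors]
    draws = rng.choices(colors, weights=weights, k=n)
    chi2 = 0.0
    for color in colors:
        expected = n * proportions[color]
        observed = draws.count(color)
        chi2 += (observed - expected) ** 2 / expected
    return chi2


rng = random.Random(1)  # fixed seed so the run is repeatable
stats = [simulate_statistic(60, claimed, rng) for _ in range(1000)]

mean_stat = sum(stats) / len(stats)
share_over_10 = sum(s >= 10 for s in stats) / len(stats)
print(f"mean simulated statistic: {mean_stat:.2f}")
print(f"share of simulated statistics at or above 10: {share_over_10:.3f}")
```

A statistic that is rarely matched or exceeded by the simulated values would be unusual if the company's claim were true.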
12
DEGREES OF FREEDOM
13
P-VALUE
Chi-square critical values by tail probability p
df    .25    .20    .15    .10    .05    .025   .02    .01    .005   .0025  .001   .0005
4     5.39   5.99   6.74   7.78   9.49   11.14  11.67  13.28  14.86  16.42  18.47  20.00
5     6.63   7.29   8.12   9.24   11.07  12.83  13.39  15.09  16.75  18.39  20.51  22.11
6     7.84   8.56   9.45   10.64  12.59  14.45  15.03  16.81  18.55  20.25  22.46  24.10
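Reading the table amounts to finding the two critical values in the row for our degrees of freedom that bracket the statistic. The sketch below encodes the three table rows and performs that lookup. It assumes the usual goodness-of-fit rule that df equals the number of categories minus 1 (so 5 for six colors), and the statistic 10.0 is a hypothetical value, not the class result. The name `p_value_bracket` is made up for this example.

```python
# Tail probabilities (column headers) and critical values (rows) from the table.
TAIL_PROBS = [.25, .20, .15, .10, .05, .025, .02, .01, .005, .0025, .001, .0005]
CRITICAL_VALUES = {
    4: [5.39, 5.99, 6.74, 7.78, 9.49, 11.14, 11.67, 13.28, 14.86, 16.42, 18.47, 20.00],
    5: [6.63, 7.29, 8.12, 9.24, 11.07, 12.83, 13.39, 15.09, 16.75, 18.39, 20.51, 22.11],
    6: [7.84, 8.56, 9.45, 10.64, 12.59, 14.45, 15.03, 16.81, 18.55, 20.25, 22.46, 24.10],
}


def p_value_bracket(chi2, df):
    """Return (lower, upper) bounds on the P-value using the table row for df."""
    lower, upper = 0.0, 1.0
    for p, critical in zip(TAIL_PROBS, CRITICAL_VALUES[df]):
        if chi2 >= critical:
            upper = p   # statistic is at or beyond this critical value
        else:
            lower = p   # first critical value the statistic falls short of
            break
    return lower, upper


categories = 6
df = categories - 1          # assumption: df = number of categories - 1
low, high = p_value_bracket(10.0, df)   # 10.0 is a HYPOTHETICAL statistic
print(f"P-value is between {low} and {high}")
```

A statistic smaller than every entry in the row gives an upper bound of 1, and one larger than every entry gives a lower bound of 0, which mirrors what can and cannot be read off a printed table.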
14
CALCULATOR
15
DONE (BREATHE)
16
CHAPTER 11: INFERENCE FOR DISTRIBUTIONS OF CATEGORICAL DATA 11.1 CHI-SQUARE TESTS FOR GOODNESS OF FIT OUTCOME: I WILL PERFORM A CHI-SQUARE TEST AND CONDUCT A FOLLOW-UP ANALYSIS WHEN THE RESULTS OF A CHI-SQUARE TEST ARE SIGNIFICANT.
17
CARRYING OUT A CHI-SQUARE TEST State, Plan, Do, Conclude
18
STATE - NULL AND ALTERNATIVE HYPOTHESES Null Hypothesis: This should state a claim about the distribution of a single categorical variable in the population of interest.
H0: The company’s stated color distribution for M&M’S® Milk Chocolate Candies is correct.
H0: p_blue = 0.24, p_orange = 0.20, p_green = 0.16, p_yellow = 0.14, p_red = 0.13, p_brown = 0.13
19
STATE Alternative Hypothesis: This should state that the categorical variable does not have the specified distribution.
Ha: The company’s stated color distribution for M&M’S® Milk Chocolate Candies is not correct.
Ha: At least one of the p_i’s is incorrect.
20
PLAN – CONDITIONS
21
WE HAVE TO BE CAREFUL! The chi-square test statistic compares observed and expected counts. Don’t try to perform calculations with the observed and expected proportions in each category. When checking the Large Sample Size condition, be sure to examine the expected counts, not the observed counts.
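The warning above can be made concrete with a short check. The sketch converts the claimed proportions into expected counts for a sample of 60 candies and then tests the Large Sample Size condition on those expected counts. The threshold of 5 is the usual textbook rule (all expected counts at least 5) and is an assumption here, since the conditions slide itself is blank in this transcript.

```python
# Company's claimed proportions from the null hypothesis.
claimed = {
    "blue": 0.24, "orange": 0.20, "green": 0.16,
    "yellow": 0.14, "red": 0.13, "brown": 0.13,
}
n = 60  # number of candies in the sample

# Work with counts, not proportions: expected count = n * claimed proportion.
expected_counts = {color: n * p for color, p in claimed.items()}

# ASSUMED rule: every expected count should be at least 5.
MIN_EXPECTED = 5
large_sample_ok = all(count >= MIN_EXPECTED for count in expected_counts.values())

for color, count in expected_counts.items():
    print(f"{color:>6}: expected count = {count:.1f}")
print("Large Sample Size condition met:", large_sample_ok)
```

Note that the check never looks at the observed counts: a color that happens to appear fewer than 5 times in the bag does not violate the condition.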
22
DO Calculate the test statistic (last class). Calculate the P-value (last class).
23
CALCULATOR
24
EXAMPLE In his book Outliers, Malcolm Gladwell suggests that a hockey player’s birth month has a big influence on his chance to make it to the highest levels of the game. Specifically, since January 1 is the cut-off date for youth leagues in Canada (where many National Hockey League (NHL) players come from), players born in January will be competing against players up to 12 months younger. The older players tend to be bigger, stronger, and more coordinated and hence get more playing time, more coaching, and have a better chance of being successful. To see if birth date is related to success (judged by whether a player makes it into the NHL), a random sample of 80 National Hockey League players from a recent season was selected and their birthdays were recorded. Do these data provide convincing evidence that the birthdays of all NHL players are evenly distributed among the four quarters of the year?
Birthday             Jan–Mar   Apr–Jun   Jul–Sep   Oct–Dec
Number of Players    32        20        16        12
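The Do step for this example can be sketched as follows. Under H0 (birthdays evenly distributed), each quarter has an expected count of 80/4 = 20. The P-value uses a closed-form expression for the chi-square upper tail that holds only for df = 3; that shortcut is a choice made here to keep the sketch free of external libraries, and in class the calculator would do this step. The variable names are illustrative.

```python
import math

quarters = ["Jan-Mar", "Apr-Jun", "Jul-Sep", "Oct-Dec"]
observed = [32, 20, 16, 12]
n = sum(observed)                      # 80 players

# H0: birthdays are evenly distributed, so each quarter expects n / 4.
expected = [n / len(observed)] * len(observed)

# Contribution of each quarter: (observed - expected)^2 / expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)
df = len(observed) - 1                 # 4 categories - 1 = 3

# Upper-tail probability for a chi-square distribution with df = 3 ONLY:
# P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))

print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.4f}")
for quarter, part in zip(quarters, contributions):
    print(f"{quarter}: contribution = {part:.2f}")
```

All four expected counts are 20, so the Large Sample Size condition is met. Printing the individual contributions doubles as the follow-up analysis: the largest term shows which quarter departs most from an even split, which here is Jan-Mar with more players than expected.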
25
FIN.