15.1 Goodness-of-Fit Tests

Slides:



Advertisements
Similar presentations
Statistical Methods Lecture 25
Advertisements

Statistical Methods Lecture 26
Chapter 11 Other Chi-Squared Tests
Chapter 14 Comparing two groups Dr Richard Bußmann.
Chapter 13: Inference for Distributions of Categorical Data
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Categorical Variables Chapter 15.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 25, Slide 1 Chapter 25 Comparing Counts.
Copyright © 2012 Pearson Education. All rights reserved Copyright © 2012 Pearson Education. All rights reserved. Chapter 15 Inference for Counts:
 Involves testing a hypothesis.  There is no single parameter to estimate.  Considers all categories to give an overall idea of whether the observed.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 26 Comparing Counts.
Copyright © 2010 Pearson Education, Inc. Warm Up- Good Morning! If all the values of a data set are the same, all of the following must equal zero except.
For testing significance of patterns in qualitative data Test statistic is based on counts that represent the number of items that fall in each category.
Chi-Square Procedures Chi-Square Test for Goodness of Fit, Independence of Variables, and Homogeneity of Proportions.
Slide 26-1 Copyright © 2004 Pearson Education, Inc.
Copyright © 2010 Pearson Education, Inc. Slide
Comparing Counts.  A test of whether the distribution of counts in one categorical variable matches the distribution predicted by a model is called a.
Chapter 13 Inference for Counts: Chi-Square Tests © 2011 Pearson Education, Inc. 1 Business Statistics: A First Course.
Copyright © 2010 Pearson Education, Inc. Warm Up- Good Morning! If all the values of a data set are the same, all of the following must equal zero except.
Section 12.2: Tests for Homogeneity and Independence in a Two-Way Table.
Comparing Counts Chapter 26. Goodness-of-Fit A test of whether the distribution of counts in one categorical variable matches the distribution predicted.
Chi-Squared Test of Homogeneity Are different populations the same across some characteristic?
Statistics 26 Comparing Counts. Goodness-of-Fit A test of whether the distribution of counts in one categorical variable matches the distribution predicted.
Chapter 11: Categorical Data n Chi-square goodness of fit test allows us to examine a single distribution of a categorical variable in a population. n.
Class Seven Turn In: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 For Class Eight: Chapter 20: 18, 20, 24 Chapter 22: 34, 36 Read Chapters 23 &
Comparing Observed Distributions A test comparing the distribution of counts for two or more groups on the same categorical variable is called a chi-square.
Goodness-of-Fit A test of whether the distribution of counts in one categorical variable matches the distribution predicted by a model is called a goodness-of-fit.
Check your understanding: p. 684
CHAPTER 11 Inference for Distributions of Categorical Data
Comparing Counts Chi Square Tests Independence.
CHAPTER 26 Comparing Counts.
Warm Up Check your understanding on p You do NOT need to calculate ALL the expected values by hand but you need to do at least 2. You do NOT need.
CHAPTER 11 Inference for Distributions of Categorical Data
Chi-square test or c2 test
Vocabulary Statistical Inference – provides methods for drawing conclusions about a population parameter from sample data Expected Values– row total *
CHAPTER 11 CHI-SQUARE TESTS
Introduction The two-sample z procedures of Chapter 10 allow us to compare the proportions of successes in two populations or for two treatments. What.
Hypothesis Testing Review
Chapter 12 Tests with Qualitative Data
CHAPTER 11 Inference for Distributions of Categorical Data
Essential Statistics Two Categorical Variables: The Chi-Square Test
Chapter 25 Comparing Counts.
Chapter 11 Goodness-of-Fit and Contingency Tables
AP Stats Check In Where we’ve been… Chapter 7…Chapter 8…
AP Stats Check In Where we’ve been… Chapter 7…Chapter 8…
Chapter 11: Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Chapter 10 Analyzing the Association Between Categorical Variables
CHAPTER 11 Inference for Distributions of Categorical Data
Contingency Tables: Independence and Homogeneity
Inference for Relationships
Inference on Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Lesson 11 - R Chapter 11 Review:
Paired Samples and Blocks
Analyzing the Association Between Categorical Variables
CHAPTER 11 CHI-SQUARE TESTS
Chapter 26 Comparing Counts.
Chapter 13: Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Chapter 26 Comparing Counts Copyright © 2009 Pearson Education, Inc.
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Chapter 9 Hypothesis Testing: Single Population
Chapter 26 Comparing Counts.
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Presentation transcript:

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Given the following… 1) Counts of items in each of several categories 2) A model that predicts the distribution of the relative frequencies …this question naturally arises: “Does the actual distribution differ from the model because of random error, or do the differences mean that the model does not fit the data?” In other words, “How good is the fit?”

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Example: Stock Market “Up” Days Population of Stock Market Days Sample of 1000 “up” days “Up” days appear to be more common than expected on certain days, especially on Fridays. Null Hypothesis: The distribution of “up” days is no different from the population distribution. Test the hypothesis with a chi-square goodness-of-fit test. 2

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Assumptions and Condition Counted Data Condition – The data must be counts for the categories of a categorical variable. Independence Assumption Independence Assumption – The counts should be independent of each other. Think about whether this is reasonable. Randomization Condition – The counted individuals should be a random sample of the population. Guard against auto-correlated samples. 3

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Sample Size Assumption Sample Size Assumption -- There must be enough data so check the following condition. Expected Cell Frequency Condition – Expect at least 5 individuals per cell. 4

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Chi-Square Model To decide if the null model is plausible, look at the differences between the observed values and the values expected if the model were true. Note that “accumulates” the relative squared deviation of each cell from its expected value. So, gets “big” when i) the data set is large and/or ii) the model is a poor fit. 5

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests The Chi-Square Calculation 6

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Example : Credit Cards At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed? What type of test do you conduct? What are the expected values? Find the test statistic and p-value. State conclusions. 7

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Example : Credit Cards At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed? What type of test do you conduct? This is a goodness-of-fit test comparing a single sample to previous information (the null model). What are the expected values? Silver Gold Platinum Observed 110 55 35 Expected 120 60 20 8

15.1 Goodness-of-Fit Tests QTM1310/ Sharpe 15.1 Goodness-of-Fit Tests Example : Credit Cards At a major credit card bank, the percentages of people who historically apply for the Silver, Gold, and Platinum cards are 60%, 30%, and 10% respectively. In a recent sample of customers, 110 applied for Silver, 55 for Gold, and 35 for Platinum. Is there evidence to suggest the percentages have changed? Find the test statistic and p-value. Using df = 2, the p-value < 0.005 State conclusions. Reject the null hypothesis. There is sufficient evidence customers are not applying for cards in the traditional proportions. 9

15.2 Interpreting Chi-Square Values QTM1310/ Sharpe 15.2 Interpreting Chi-Square Values The Chi-Square Distribution The distribution is right-skewed and becomes broader with increasing degrees of freedom: The test is a one-sided test. 10

15.2 Interpreting Chi-Square Values QTM1310/ Sharpe 15.2 Interpreting Chi-Square Values The Chi-Square Calculation: Stock Market “Up” Days Using a chi-square table at a significance level of 0.05 and with 4 degrees of freedom: Do not reject the null hypothesis. (The fit is “good”.) 11

15.3 Examining the Residuals QTM1310/ Sharpe 15.3 Examining the Residuals When we reject a null hypothesis, we can examine the residuals in each cell to discover which values are extraordinary. Because we might compare residuals for cells with very different counts, we should examine standardized residuals: Note that standardized residuals from goodness-of-fit tests are actually z-scores (which we already know how to interpret and analyze). 12

15.3 Examining the Residuals QTM1310/ Sharpe 15.3 Examining the Residuals Standardized residuals for the trading days data: None of these values is remarkable. The largest, Friday, at 1.292, is not impressive when viewed as a z-score. The deviations are in the direction of a “weekend effect”, but they aren’t quite large enough for us to conclude they are real. 13

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Below are responses to the question, “How important is it to seek your utmost attractive appearance?” 14

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Convert the results to “column percentages”: Response patterns are beginning to become apparent. 15

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity The stacked barchart shows the patterns even more vividly: It seems that India stands out from the others. 16

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity But, are the differences real or just natural sampling variation? Our null hypothesis is that the relative frequency distributions are the same (homogeneous) for each country. Test the hypothesis with a chi-square test for homogeneity. 17

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Use the Row % column to determine the expected counts for each table column (each country): 18

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Assumptions and Conditions Counted Data Condition – Data must be counts Independence Assumption – Counts need to be independent from each other. Check for randomization Randomization Condition – Random sample needed Sample Size Assumption – There must be enough data so check the following condition. Expected Cell Frequency Condition – Expect at least 5 individuals per cell. 19

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Following the pattern of the goodness-of-fit test, compute the component for each cell: Then, sum the components: The degrees of freedom are (The for the appearance survey indicates that the differences between countries are not due to random chance.) 20

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Example: More Credit Cards A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings. She takes a random sample of 200 from each mailing and counts the number of applications for each type of card. What type of test do you conduct? What are the expected values? Find the test statistic and p-value. State conclusions. 21

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Example : More Credit Cards A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings. She takes a random sample of 200 from each mailing and counts the number of applications for each type of card. What type of test do you conduct? A chi-square test of homogeneity What are the expected values? 22

15.4 The Chi-Square Test for Homogeneity QTM1310/ Sharpe 15.4 The Chi-Square Test for Homogeneity Example : More Credit Cards A market researcher for the credit card bank wants to know if the distribution of applications by card is the same for the past 3 mailings. She takes a random sample of 200 from each mailing and counts the number of applications for each type of card. Find the test statistic. Given p-value > 0.10,state conclusions. Fail to reject the null. There is insufficient evidence to suggest that the distributions are different for the three mailings. 23

15.5 Comparing Two Proportions QTM1310/ Sharpe 15.5 Comparing Two Proportions Are women more likely to graduate high school than men, or are the differences due to random variation? Sample of 25,000 24-year-olds: Men: 84.9% diploma rate Women: 88.1% diploma rate Overall, of the sample had diplomas. Use this proportion to compute the expected values. 24

15.5 Comparing Two Proportions QTM1310/ Sharpe 15.5 Comparing Two Proportions Observed Counts: Expected Values: 25

15.5 Comparing Two Proportions QTM1310/ Sharpe 15.5 Comparing Two Proportions 26

15.5 Comparing Two Proportions QTM1310/ Sharpe 15.5 Comparing Two Proportions For high school graduation, a 95% confidence interval for the true difference between women’s and men’s rates is: Sample of 25,000 24-year-olds: We can be 95% confident that women’s rates of having a HS diploma by 2000 were 2.36% to 4.04% higher than men’s. 27

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence The table below shows the importance of personal appearance for several age groups. Are Age and Appearance independent, or is there a relationship? 28

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence A stacked barchart suggests a relationship: Test for independence using a chi-square test of independence. 29

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence The test is mechanically equivalent to the test for homogeneity, but with some differences in how we think about the data and the results: Homogeneity Test: one variable (Appearance) measured on two or more populations (countries). Independence Test: Two variables (Appearance and Age) measured on a single population. We ask the question “Are the variables independent?” rather than “Are the groups homogeneous?” This subtle distinction is important when drawing conclusions. 30

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence Assumptions and Conditions Counted Data Condition – Data must be counts Independence Assumption – Counts need to be independent from each other. Check for randomization Randomization Condition – Random sample needed Sample Size Assumption – There must be enough data so check the following condition. Expected Cell Frequency Condition – Expect at least 5 individuals per cell. 31

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence Example : Automobile Manufacturers Consumer Reports uses surveys to measure reliability in automobiles. Annually they release survey results about problems that consumers have had with vehicles in the past 12 months and the origin of manufacturer. Is consumer satisfaction related to country of origin? State the hypotheses. Find the test statistic. Given p-value = 0.231, state your conclusion. 32

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence Example : Automobile Manufacturers Consumer Reports uses surveys to measure reliability in automobiles. Annually they release survey results about problems that consumers have had with vehicles in the past 12 months and the origin of manufacturer. Is consumer satisfaction related to country of origin? State the hypotheses. Find the test statistic. Given p-value = 0.231, state your conclusion. There is not enough evidence to conclude there is an association between vehicle problems and origin of vehicle. 33

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence For the Appearance and Age example, we reject the null hypothesis that the variables are independent. So, it may be of interest to know how differently two age groups (teens and 30-something adults) select the “very important” category (Appearance response 6 or 7). Construct a confidence interval for the true difference in proportions… 34

15.6 Chi-Square Test of Independence QTM1310/ Sharpe 15.6 Chi-Square Test of Independence From the data table, the percentage responses for Appearance = 6 or 7 are as follows: Teens: 45.17% 30-39: 39.91% The 95% confidence interval is found below: 35

Don’t use chi-square methods unless you have counts. QTM1310/ Sharpe Don’t use chi-square methods unless you have counts. Beware large samples! With a sufficiently large sample size, a chi-square test can always reject the null hypothesis. Don’t say that one variable “depends” on the other just because they’re not independent. 36

QTM1310/ Sharpe What Have We Learned? Recognize when a chi-square test of goodness of fit, homogeneity, or independence is appropriate. For each test, find the expected cell frequencies. For each test, check the assumptions and corresponding conditions and know how to complete the test. • Counted data condition. • Independence assumption; randomization makes independence more plausible. • Sample size assumption with the expected cell frequency condition; expect at least 5 observations in each cell. 37

What Have We Learned? Interpret a chi-square test. QTM1310/ Sharpe What Have We Learned? Interpret a chi-square test. • Even though we might believe the model, we cannot prove that the data fit the model with a chi-square test because that would mean confirming the null hypothesis. Examine the standardized residuals to understand what cells were responsible for rejecting a null hypothesis. 38

What Have We Learned? Compare two proportions. QTM1310/ Sharpe What Have We Learned? Compare two proportions. State the null hypothesis for a test of independence and understand how that is different from the null hypothesis for a test of homogeneity. • Both are computed the same way. You may not find both offered by your technology. You can use either one as long as you interpret your result correctly. 39