Chapter 18: Inference for Counts
Copyright © 2014, 2011 Pearson Education, Inc.

Presentation transcript:

Chi-Squared Tests
Retailers can customize the online shopping experience by learning more about their customers. For example, Amazon wants to know whether income level affects what shoppers look for (camera or phone) when they visit electronics.
• Use a chi-squared test of independence to answer this question.

Chi-Squared Tests
Contingency Table: Purchase Category vs. Household Income (555 visitors to Amazon)

Chi-Squared Tests: Observations from Contingency Table
• Association is evident, suggesting that income and choice of product are dependent.
• Households with lower incomes seem more likely to purchase a phone; those with higher incomes, a camera.
• Are these differences in purchase rates the result of sampling variation?

Test of Independence: Chi-squared test of independence
Tests the independence of two categorical variables using counts in a contingency table.

Test of Independence: Hypotheses for the chi-squared test
$H_0$: Household Income and Purchase Category are independent.
$H_a$: Household Income and Purchase Category are not independent.
Or
$H_0: p_{25} = p_{50} = p_{75} = p_{100} = p_{100+}$
$H_a$: $p_{25}, p_{50}, p_{75}, p_{100}, p_{100+}$ are not all equal.

Test of Independence: Hypotheses for the chi-squared test
• The null hypothesis describes five segments of the population defined by household income.
• The null assumes that the conditional probabilities of purchase type given income level are equal across the five segments.
• The alternative hypothesis is vague; it does not indicate why the null is false.

Test of Independence: Calculating $\chi^2$
• Measures the distance between the observed contingency table and a hypothetical contingency table.
• The hypothetical contingency table obeys $H_0$ while being consistent with the observed marginal counts.

Test of Independence: Calculating $\chi^2$
The null hypothesis determines the expected cell counts in the hypothetical table.

Test of Independence: Calculating $\chi^2$
Accumulates the deviations between the observed and expected counts (in the hypothetical table) across all cells.
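The slide formulas did not survive in this transcript, so here is a minimal sketch of the standard calculation: each expected count is (row total × column total) / n, and the statistic sums (observed − expected)² / expected over all cells. The function names and the counts below are hypothetical; they are not the retail data from the slides.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts (NOT the retail data):
# rows = two purchase categories, columns = three income groups.
table = [[30, 40, 50],
         [50, 40, 30]]

print(expected_counts(table))  # every expected count is 40.0 for these margins
print(chi_squared(table))      # 10.0
```

The expected table keeps the observed row and column totals, which is what "consistent with observed marginal counts" means in the slides.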

Test of Independence: Calculating $\chi^2$
For retail data on purchase category and household income, the chi-squared statistic is

Test of Independence: Plots of the chi-squared test
Mosaic Plot for Retail Data

Test of Independence: Plots of the chi-squared test
Mosaic Plot for Independent Variables

Test of Independence: Conditions
• No lurking explanation for the association.
• Data are random samples from the indicated segments of the population.
• Categories defining the table are mutually exclusive.
• Expected cell counts are not too small.

Test of Independence: The chi-squared distribution
• Sampling distribution of the chi-squared statistic if the null hypothesis is true.
• Right-skewed.
• Assigns probabilities to positive values only.
• Identified by degrees of freedom (df).
• Approaches a normal distribution as the df increase.

Test of Independence: The chi-squared distribution

Test of Independence: Getting the p-value
df for the $\chi^2$ test of independence: $df = (r - 1)(c - 1)$
The df depend on the size of the contingency table:
r = number of rows
c = number of columns
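As a small illustration of the df rule, the sketch below computes the degrees of freedom from the table's dimensions alone. The function name is hypothetical. Note that the number of observations never enters the calculation.

```python
def independence_df(n_rows, n_cols):
    """Degrees of freedom for the chi-squared test of independence."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("need at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


# Five income segments by two purchase categories, as in the retail example.
print(independence_df(5, 2))  # 4
```

The result is the same whether the five segments are laid out as rows or as columns.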

Test of Independence: Getting the p-value – Retail Example
Observed $\chi^2$ = with 4 df
From the $\chi^2$ table, P($\chi^2$ > ) = 0.05; since > , we can reject $H_0$.
The p-value is therefore < 0.05; the exact p-value is

Test of Independence: Getting the p-value – Retail Example

Test of Independence: Summary: chi-squared test of independence

Test of Independence: Chi-squared test of independence – Checklist
• No obvious lurking variable.
• SRS condition.
• Contingency table condition.
• Sample size condition: expected cell frequencies of at least 10; expected cell frequencies of 5 are permitted with at least 4 df.
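The sample size condition in the checklist can be written as a short check. This is a sketch of the rule as worded on the slide; the function name and the example counts are hypothetical.

```python
def sample_size_ok(expected, df):
    """Checklist rule: every expected count >= 10, or >= 5 when df >= 4."""
    threshold = 5 if df >= 4 else 10
    return min(expected) >= threshold


# Hypothetical expected counts, passed as a flat list of cells.
print(sample_size_ok([12.0, 15.5, 10.0], df=2))  # True
print(sample_size_ok([6.2, 20.0, 31.4], df=2))   # False: below 10 with few df
print(sample_size_ok([6.2, 20.0, 31.4], df=8))   # True: relaxed rule applies
```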

Test of Independence: Connection to two-sample tests
• For a 2 × 2 table, the chi-squared test reduces to the two-sided version of the two-sample test of the difference between proportions.
• If the 95% confidence interval for $p_1 - p_2$ does not include zero, then the chi-squared test has a p-value less than 0.05 and $H_0$ is rejected.
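The connection can be checked numerically. In the sketch below, the squared two-sample z statistic (computed with the pooled standard error) matches the chi-squared statistic for the corresponding 2 × 2 table. The counts and function names are hypothetical, and no continuity correction is applied.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Two-sample z statistic for p1 - p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_squared_2x2(x1, n1, x2, n2):
    """Chi-squared statistic for the 2 x 2 table of successes and failures."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n
            total += (observed[i][j] - e) ** 2 / e
    return total


# Hypothetical data: 45 of 100 in group 1 and 30 of 100 in group 2.
z = two_proportion_z(45, 100, 30, 100)
x2 = chi_squared_2x2(45, 100, 30, 100)
print(z ** 2, x2)  # both are 4.8, up to floating-point rounding
```

The match is exact for the pooled-standard-error z test. A confidence interval for the difference usually uses an unpooled standard error, so its agreement with the chi-squared decision is close rather than exact.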

4M Example 18.1: RETAIL CREDIT
Motivation
Managers of a chain worry that some methods of recruiting customers for store credit, called channels, produce more problems than other channels. Is the channel used related to the status of the customer’s account a year later?

4M Example 18.1: RETAIL CREDIT
Method
Data collected for 630 accounts on the variables Channel and Status after 12 months.

4M Example 18.1: RETAIL CREDIT
Method – Check Conditions
• No obvious lurking variable. Difficult to check without knowing more about the channels.
• SRS condition reasonably met.
• Contingency table condition satisfied.
• Sample size condition must be checked after computing expected cell frequencies.

4M Example 18.1: RETAIL CREDIT
Mechanics – Mosaic Plot

4M Example 18.1: RETAIL CREDIT
Mechanics – Expected Counts
Sample size condition satisfied.
$\chi^2$ = with 4 df; p-value =
Cannot reject $H_0$ at α = 0.05.

4M Example 18.1: RETAIL CREDIT
Message
Observed rates of late payments and early closure are not statistically significantly different among credit accounts opened a year ago through in-store, mailing, and Web channels. Since the p-value is close to 0.05, it may be worthwhile to monitor accounts developed through mailings.

General Versus Specific Hypotheses
• The chi-squared test cannot match the power of a more specific test.
• A 95% confidence interval for the difference in proportions of late payments from accounts developed via the mailing channel versus the other two channels (combined into one) does not contain zero.

Tests of Goodness of Fit: Chi-squared test of goodness of fit
A test of the distribution of a single categorical variable.

Tests of Goodness of Fit: Testing for randomness
• Do shoppers purchase big-ticket items more often on some days of the week than on others?
• Are cars made on some days more likely to have defects than the cars made on other days?

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Motivation
Managers would like to have a systematic method to audit purchase amounts on invoices to uncover fraud.

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Method
Managers collected a sample of n = 135 invoices. Amounts ranged from $100 to $100,000, with an average of $42,000. Leading digits for the amounts should follow a distribution known as Benford’s law.

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Method
Probabilities based on Benford’s law

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Method
Counts of leading digits in sample of invoices

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Method – Check Conditions
All conditions are satisfied. The smallest expected count is 6.2. Because there are more than 4 degrees of freedom, the relaxed sample size condition is used.
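The table of Benford probabilities is not reproduced in this transcript. As a sketch, the standard statement of Benford's law gives the probability of leading digit d as log10(1 + 1/d); that formula is taken as an assumption here rather than from the slides. With n = 135 invoices it reproduces the smallest expected count quoted above. The function name is hypothetical.

```python
import math


def benford_probabilities():
    """Probability of each leading digit 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


n = 135  # sample size from the example
expected = {d: n * p for d, p in benford_probabilities().items()}

for d, e in expected.items():
    print(d, round(e, 1))

# The smallest expected count belongs to leading digit 9.
print(round(min(expected.values()), 1))  # 6.2
```

Nine digit categories give 9 − 1 = 8 degrees of freedom, which is why the relaxed threshold of 5 applies.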

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Mechanics
$\chi^2$ = 19.1 with 8 df. P-value =
Reject $H_0$.
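The p-value is missing from this slide in the transcript. One way to recover an upper-tail probability without statistical software is the closed form that holds when the df are even: P(X > x) = exp(−x/2) · Σ (x/2)^k / k! for k = 0, …, df/2 − 1. The sketch below uses that identity; the function name is hypothetical, and it is not necessarily how the authors obtained their value.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable with a positive, even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term, total = 1.0, 0.0
    for k in range(df // 2):
        total += term          # adds (x/2)^k / k!
        term *= half / (k + 1)
    return math.exp(-half) * total


p = chi2_upper_tail_even_df(19.1, 8)
print(p)
```

For the statistic on this slide the result falls below 0.05, in line with the decision to reject $H_0$.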

4M Example 18.2: DETECTING ACCOUNTING FRAUD
Message
The distribution of leading digits in these invoice amounts differs statistically significantly from the form predicted by Benford’s law. This confirms the suspicion that the digits are atypical and may indicate fraud.

Tests of Goodness of Fit: Testing the fit of a probability model
• How do we know whether the observed counts match a particular distribution?

4M Example 18.3: WEB HITS
Motivation
Managers of the Web site plan to use a Poisson model to summarize how often users click on ads. If it fits well, they will use this model to summarize concisely the volume of traffic headed to advertisers and to measure the effects of changes in the Web site on traffic patterns.

4M Example 18.3: WEB HITS
Method
Data collected on a sample of 685 users who visited the Web site during a recent weekday evening.

4M Example 18.3: WEB HITS
Method – Check Conditions
The SRS and contingency table conditions are satisfied. However, the last three categories must be combined in order to meet the sample size condition.
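The click counts and the fitted Poisson mean are not in this transcript, so the sketch below only illustrates the mechanics of combining categories. The rate `lam = 0.6` and the six-cell layout (0, 1, 2, 3, 4, and 5 or more clicks) are assumptions made for illustration; only n = 685 comes from the example. Function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_poisson_counts(n, lam, max_k):
    """Expected counts for 0, 1, ..., max_k - 1 and a final 'max_k or more' cell."""
    probs = [poisson_pmf(k, lam) for k in range(max_k)]
    probs.append(1 - sum(probs))
    return [n * p for p in probs]


def pool_tail(counts, keep):
    """Keep the first keep - 1 cells and merge all remaining cells into one."""
    return counts[:keep - 1] + [sum(counts[keep - 1:])]


n = 685      # sample size from the example
lam = 0.6    # hypothetical Poisson mean, NOT estimated from the real data

expected = expected_poisson_counts(n, lam, 5)  # cells: 0, 1, 2, 3, 4, 5+
pooled = pool_tail(expected, 4)                # cells: 0, 1, 2, 3+

print([round(e, 1) for e in expected])
print([round(e, 1) for e in pooled])
```

With these assumed values the last two cells fall below the sample size threshold until they are merged into a single "3 or more" cell. Four cells with one estimated parameter would leave 4 − 1 − 1 = 2 degrees of freedom, which is consistent with the 2 df reported in the Mechanics slide.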

4M Example 18.3: WEB HITS
Mechanics

4M Example 18.3: WEB HITS
Mechanics
$\chi^2$ = with 2 df. P-value =
Cannot reject $H_0$.

4M Example 18.3: WEB HITS
Message
The distribution of the number of ads clicked by users is consistent with a Poisson distribution. Managers of the Web site can use this model to summarize user behavior.

Best Practices
• Remember the importance of experiments.
• State your hypotheses before looking at the data.
• Plot the data.
• Think when you interpret a p-value.

Pitfalls
• Don’t confuse statistical significance with substantive significance.
• Don’t use a chi-squared test when the expected frequencies are too small.
• Don’t cherry-pick comparisons.
• Don’t use the number of observations to find the degrees of freedom of chi-squared.