Analysis of two-way tables - Inference for two-way tables IPS chapter 9.2 © 2006 W.H. Freeman and Company.


1 Analysis of two-way tables - Inference for two-way tables IPS chapter 9.2 © 2006 W.H. Freeman and Company

2 Objectives (IPS chapter 9.2) Inference for two-way tables: the “no association” hypothesis; expected counts in two-way tables; the chi-square test.

3 Hypothesis: no association Again, we want to know if the differences in sample proportions between groups are likely to have occurred by chance. We use the chi-square (χ²) test to assess the hypothesis of no relationship between the two categorical variables of a two-way table.

4 Expected counts in two-way tables Two-way tables summarize the data according to two categorical variables. We want to test the hypothesis that there is no relationship between these two categorical variables (H₀). To test this hypothesis, we compare the actual counts from the sample data with the counts expected under the hypothesis of no relationship. The expected count in any cell of a two-way table when H₀ is true is: expected count = (row total × column total) / n.

5 Expected counts in two-way tables If there is no difference between the row proportions for the various columns, then we can estimate the common proportion for any row as the total for that row divided by n. Multiplying that proportion by a column total gives the formula on the previous slide. The result is the number of observations we expect to see in the cell if there is no association between the row and column factors.
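The calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name expected_counts is made up, and the margins (row totals 25, 26, 23; column totals 26, 48; n = 74) are read off the cocaine example on slide 7.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under H0: row total * column total / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same n")
    return [[r * c / n for c in col_totals] for r in row_totals]

# Margins taken from slide 7: rows are desipramine, lithium, placebo;
# columns are relapse "no" and relapse "yes".
row_totals = [25, 26, 23]
col_totals = [26, 48]

expected = expected_counts(row_totals, col_totals)
for label, row in zip(["Desipramine", "Lithium", "Placebo"], expected):
    print(label, [round(e, 2) for e in row])
```

The first cell is 25 × 26 / 74 ≈ 8.78, the value worked out on slide 7.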

6 Cocaine addiction Cocaine produces short-term feelings of physical and mental well-being. To maintain that effect, the drug may have to be taken more frequently and at higher doses. After stopping use, users may feel tired, sleepy or depressed. The pleasurable high followed by unpleasant after-effects encourages repeated compulsive use, which can easily lead to dependency. Desipramine is an antidepressant that acts on the brain in a way that may mitigate depression. It was tested to see if it aided recovery from cocaine addiction. Treatment with desipramine was compared to a standard treatment (lithium) and a placebo.

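7 Cocaine addiction Expected relapse counts. Overall, 26 of the 74 subjects (35%) did not relapse, so, for example, the expected number of desipramine subjects with no relapse is 25*26/74 ≈ 8.78.

Treatment     Relapse: No   Relapse: Yes
Desipramine   8.78          16.22
Lithium       9.14          16.86
Placebo       8.08          14.92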

8 The chi-square test The chi-square statistic (X²) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts. The formula for the X² statistic is X² = Σ (observed count − expected count)² / expected count, where the sum runs over all cells of the table. Large values of X² are caused by large deviations from the expected counts and provide strong evidence against H₀. How large is “large”?
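A minimal Python sketch of the statistic follows. The observed counts are an ASSUMPTION: the transcript does not include the observed table, so the values below are illustrative counts chosen to agree with the margins on slide 7 (rows 25, 26, 23; columns 26, 48). The function name is also made up for this example.

```python
def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under H0: row total * column total / n.
            expected_count = row_totals[i] * col_totals[j] / n
            x2 += (obs - expected_count) ** 2 / expected_count
    return x2

# ASSUMED observed counts (not given in the transcript).
# Rows: desipramine, lithium, placebo. Columns: relapse no, relapse yes.
observed = [[15, 10], [7, 19], [4, 19]]

x2 = chi_square_statistic(observed)
print(round(x2, 2))
```

With these assumed counts the statistic comes out at roughly 10.7; each cell contributes in proportion to how far it sits from its expected count.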

9 For the chi-square test, H₀ states that there is no association between the row and column variables in a two-way table. The alternative is that these variables are related. If H₀ is true, the chi-square test statistic has approximately a χ² distribution with (r − 1)(c − 1) degrees of freedom. See Table F in the textbook. The P-value for the chi-square test is the area to the right of X² under the χ² distribution with df = (r − 1)(c − 1): P(χ² ≥ X²).
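The upper-tail area can be computed without a table. The sketch below is one possible standard-library approach, not the method used in the slides: it starts from the closed forms for 1 and 2 degrees of freedom and steps up with the incomplete-gamma recurrence. The function name and the example statistic of 10.73 (a hypothetical value for a 3×2 table) are assumptions.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), for integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))     # exact tail for df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

rows, cols = 3, 2
df = (rows - 1) * (cols - 1)
p_value = chi2_upper_tail(10.73, df)   # hypothetical statistic
print(df, round(p_value, 4))
```

In practice a statistics package would normally be used; the point here is only that the P-value is the area to the right of the observed statistic.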

10 Table F If we have 3 rows and 3 columns, then the degrees of freedom are (3 − 1)(3 − 1) = 4. To choose a large value that we are unlikely to reach by chance, we might select the 95th percentile, since there is only a 5% chance of getting a number that large if H₀ is true. From page T-20 we get the value 9.49. If we calculate X² from the sample and that value is greater than 9.49, then we are likely to conclude that we have strong evidence against H₀.
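The table lookup can be cross-checked numerically. The sketch below assumes the closed-form upper tail for 4 degrees of freedom, exp(−x/2)(1 + x/2), and finds the 95th percentile by bisection; the function names are illustrative and the closed form does not apply to other degrees of freedom.

```python
import math

def upper_tail_df4(x):
    """P(chi-square with 4 df >= x); closed form valid for df = 4 only."""
    return math.exp(-x / 2) * (1 + x / 2)

def critical_value_df4(alpha, lo=0.0, hi=100.0):
    """Bisection for the x whose upper-tail area is alpha.

    The tail area decreases as x grows, so a tail larger than alpha
    means the critical value lies further to the right.
    """
    for _ in range(100):
        mid = (lo + hi) / 2
        if upper_tail_df4(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

df = (3 - 1) * (3 - 1)
crit = critical_value_df4(0.05)
print(df, round(crit, 2))
```

The result agrees with the tabled value of 9.49 to two decimal places.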

11 When is it safe to use a χ² test? We can safely use the chi-square test when all individual expected counts are 1 or more (≥ 1) and no more than 20% of expected counts are less than 5 (< 5). For a 2x2 table, this implies that all four expected counts should be 5 or more.
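These two rules of thumb are easy to check mechanically. The helper below is a hypothetical sketch (the name chi_square_is_safe is not from the slides); it is applied to the expected counts from slide 7 and to a made-up 2x2 table.

```python
def chi_square_is_safe(expected):
    """Apply the two rules of thumb to a table of expected counts."""
    cells = [e for row in expected for e in row]
    all_at_least_one = all(e >= 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_at_least_one and share_below_five <= 0.20

# Expected counts from the cocaine example (slide 7).
cocaine_expected = [[8.78, 16.22], [9.14, 16.86], [8.08, 14.92]]
print(chi_square_is_safe(cocaine_expected))

# Hypothetical 2x2 table: one of four cells (25%) is below 5.
small_table = [[4.0, 16.0], [6.0, 24.0]]
print(chi_square_is_safe(small_table))
```

The second table fails because a single small cell in a 2x2 table already exceeds the 20% limit, which is why all four expected counts need to be 5 or more.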

12 Chi-square test vs. z-test for two proportions When comparing only two proportions, such as in a 2x2 table where the columns represent counts of “success” and “failure,” we can test H₀: p₁ = p₂ vs. Hₐ: p₁ ≠ p₂ equally well with a two-sided z-test or with a chi-square test with 1 degree of freedom, and get the same p-value. In fact, the two test statistics are related by the simple expression X² = z².
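The identity can be checked numerically. The sketch below uses hypothetical counts (30 successes out of 50 versus 18 out of 50) and assumes the z statistic is the usual one based on the pooled proportion; the function names are illustrative.

```python
import math

def pooled_z(successes1, n1, successes2, n2):
    """Two-sample z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = successes1 / n1, successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(successes1, n1, successes2, n2):
    """Chi-square statistic for the 2x2 table of successes and failures."""
    observed = [[successes1, n1 - successes1], [successes2, n2 - successes2]]
    n = n1 + n2
    col_totals = [successes1 + successes2, n - successes1 - successes2]
    x2 = 0.0
    for row, row_total in zip(observed, [n1, n2]):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            x2 += (obs - expected) ** 2 / expected
    return x2

# Hypothetical counts, not taken from the slides.
z = pooled_z(30, 50, 18, 50)
x2 = chi_square_2x2(30, 50, 18, 50)

# Two-sided normal p-value and chi-square (df = 1) upper-tail p-value.
p_z = math.erfc(abs(z) / math.sqrt(2))
p_x2 = math.erfc(math.sqrt(x2 / 2))
print(round(z ** 2, 4), round(x2, 4))
print(round(p_z, 4), round(p_x2, 4))
```

Both lines print matching pairs, which illustrates why the two tests lead to the same conclusion for a 2x2 table.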

13 Cocaine addiction The Minitab statistical software output for the cocaine study gives a p-value of 0.005, or half a percent. This is highly significant. We reject the null hypothesis of no association and conclude that there is a significant relationship between treatment (desipramine, lithium, placebo) and outcome (relapse or not).

14 Successful firms Franchise businesses are often given exclusive rights to a territory. This means that the outlet will not have to compete with other outlets of the same chain within that territory. How does the presence of an exclusive-territory clause in the contract relate to the survival of the business? For a random sample of 170 new franchises, two categorical variables were recorded for each firm: (1) whether the firm was successful or not (based on economic criteria) and (2) whether or not the firm had an exclusive-territory contract. This is a 2x2 table (two levels for success, yes/no; two levels for exclusive territory, yes/no), so df = (2 − 1)(2 − 1) = 1.

15 Successful firms How does the presence of an exclusive-territory clause in the contract relate to the survival of the business? To compare firms that have an exclusive territory with those that do not, we start by examining column percents (the conditional distribution). The difference in the percent of successes between the two types of firms is quite large. The chi-square test can tell us whether or not this difference can plausibly be attributed to chance. Specifically, we will test H₀: no relationship between exclusivity clause and success, against Hₐ: there is some relationship between the two variables.

16 Successful firms In the chi-square output from Minitab, the p-value is significant at α = 5% (p is about 1.5%), so we reject H₀: we have found a significant relationship between an exclusive territory and the success of a franchised firm.

