Session IV: Contingency Tables Tests of Association and Independence (Zar, Chapters 23, 24.10)


1 Session IV: Contingency Tables Tests of Association and Independence (Zar, Chapters 23, 24.10)

2 Chapter 23: Contingency Tables Another (and last!!) look at peas!

3 Remember the pea color & texture: Put the data into a matrix:

4 What was the hypothesis from the last session? Answer: 9 : 3 : 3: 1, or ? (1) Y is dominant! Test the hypothesis with:

5 (2) S is dominant! Test the hypothesis with: Then    for the answer.

6 (3) Color & Texture are independent. Problem: How to test this without assuming proportions in the hypothesis? Estimate them from the Marginal Totals.

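7 Given the marginal probability estimates, if the H 0 of independence is true, what are the probabilities in the table “cells”? (Under independence, each cell probability is the product of its row and column marginal probabilities.)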

8 Given the marginal probability estimates and table estimates, how do we get the expected number? Expected # = probability x total number
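
The rule "expected # = probability x total number" can be sketched in a few lines. This is a minimal illustration, assuming the table is stored as a list of rows; the function name and the counts are hypothetical and are not the pea data from the slides.

```python
def expected_counts(table):
    """Expected cell counts under independence, from the marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # P(row i) * P(column j) * n simplifies to R_i * C_j / n
    return [[r * c / n for c in col_totals] for r in row_totals]


observed = [[20, 30], [30, 20]]  # hypothetical counts, for illustration only
print(expected_counts(observed))  # prints [[25.0, 25.0], [25.0, 25.0]]
```

Each expected count is the product of the two marginal probability estimates multiplied by the grand total, which reduces to row total times column total divided by n.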

9

10 In General: 2 x 2 Tables: DF = 2 x 2 - 1 - 1 - 1 = 1 (4 cells, minus 1 each for the row, column, and total constraints)

11 In general:

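12 Where: Estimates of the “cells”: $\hat{f}_{ij} = R_i C_j / n$ (row total x column total / grand total). Degrees of Freedom: $rc - (r-1) - (c-1) - 1 = (r-1)(c-1)$ (cells, minus the constraints for rows, columns, and total)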
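
As a sketch of the general r x c case, the following computes the Pearson chi-square statistic and its degrees of freedom. It assumes the usual form, the sum of (observed - expected)^2 / expected over all cells, with (r-1)(c-1) degrees of freedom; the function name and counts are illustrative.

```python
def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical 2 x 2 counts: every cell is 5 away from its expected value of 25.
print(chi_square_independence([[20, 30], [30, 20]]))  # prints (4.0, 1)
```

The statistic is then compared with the chi-square critical value for the returned degrees of freedom.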

13 Special cases:

14 Example 23.1: Hair Color and Gender Therefore, reject H 0

15 Okay, we have now rejected the null hypothesis, H 0 : The rows and the columns are independent, and accepted the alternative hypothesis, H A : The rows and the columns are not independent. What now? Look for sub-hypotheses? Look at the individual chi-squares! [Table of individual cell chi-squares: male (M) and female (F) by black (Bk), brown (Br), blond (Bl), and red (R) hair]

16 It makes sense to combine hair color with similar gender ratios. Blonds vs. Non-Blonds? But are all the Non-Blonds the same? H 0 : Black=Brown=Red

17 Now H 0 : Blonds=Non-Blonds (1) Too Few Male Blonds (2) Too Many Female Blonds (3) Too Many Male NBl (4) Too Few Female NBl Hair Color independent of Gender

18 The 2 x 2 Table re-examined: (1) The problem of counts vs. estimates (2) The Yates Correction: The subtraction of 0.5 is “conservative”. Rounding would be better.

19 (3) The Computing formulae
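
The formulae on this slide did not survive in the transcript. As a hedged sketch, the standard shortcut formulas for a 2 x 2 table are assumed here: chi-square = n(f11 f22 - f12 f21)^2 / (R1 R2 C1 C2), and the Yates version with n/2 subtracted from the absolute cross-product difference. The function name and counts are illustrative.

```python
def chi_square_2x2(table, yates=False):
    """Shortcut chi-square for a 2 x 2 table, optionally Yates-corrected."""
    (f11, f12), (f21, f22) = table
    r1, r2 = f11 + f12, f21 + f22   # row totals
    c1, c2 = f11 + f21, f12 + f22   # column totals
    n = r1 + r2
    diff = abs(f11 * f22 - f12 * f21)
    if yates:
        # Assumption: clamp at zero so the correction never overshoots.
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / (r1 * r2 * c1 * c2)


table = [[20, 30], [30, 20]]  # hypothetical counts
print(chi_square_2x2(table))              # prints 4.0
print(chi_square_2x2(table, yates=True))  # prints 3.24
```

Subtracting n/2 from the cross-product difference is equivalent to subtracting 0.5 from each |observed - expected|, because |f11 f22 - f12 f21| = n |f - f_hat| in every cell of a 2 x 2 table.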

20 (4) The Cochran/Haber Correction: The Cochran/Haber (Haber, 1980) correction gives better results when routinely employed than does either the Yates-corrected or the uncorrected chi-square calculation. (a) Determine each of the four expected frequencies, denoting the smallest as $\hat{f}$. (b) Calculate the absolute difference between the smallest expected frequency and its corresponding observed frequency, $d = |f - \hat{f}|$. (c) If $f \le 2\hat{f}$, define D = the largest multiple of 0.5 that is < d. (d) If $f > 2\hat{f}$, define D = d - 0.5. (e) Calculate $\chi^2_H = n^3 D^2 / (R_1 R_2 C_1 C_2)$, where $R_1, R_2$ and $C_1, C_2$ are the row and column totals.
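
Steps (a) through (e) can be sketched as follows. The symbols lost from the slide are assumptions here: f_hat is the smallest expected frequency, f its observed count, the branch conditions are f <= 2 f_hat and f > 2 f_hat, and the final statistic is n^3 D^2 / (R1 R2 C1 C2). Exact fractions are used so that "largest multiple of 0.5 below d" is not disturbed by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def cochran_haber_chi_square(table):
    """Cochran/Haber-corrected chi-square for a 2 x 2 table (sketch)."""
    (f11, f12), (f21, f22) = table
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    # (a) expected frequency paired with its observed count, for all four cells
    cells = [
        (Fraction(r1 * c1, n), f11),
        (Fraction(r1 * c2, n), f12),
        (Fraction(r2 * c1, n), f21),
        (Fraction(r2 * c2, n), f22),
    ]
    f_hat, f = min(cells, key=lambda cell: cell[0])
    # (b) absolute difference for the cell with the smallest expectation
    d = abs(f - f_hat)
    if f <= 2 * f_hat:
        # (c) largest multiple of 0.5 strictly below d
        # (assumption: clamped at zero when d itself is zero)
        big_d = max(Fraction(ceil(2 * d) - 1, 2), Fraction(0))
    else:
        # (d) same adjustment as the Yates correction
        big_d = d - Fraction(1, 2)
    # (e) corrected statistic
    return float(n ** 3 * big_d ** 2 / (r1 * r2 * c1 * c2))


# Hypothetical counts: smallest expected is 4.8, d = 2.8, so D = 2.5
print(cochran_haber_chi_square([[2, 8], [10, 5]]))
```

With D = d the expression reduces to the uncorrected chi-square, and with D = d - 0.5 it reduces to the Yates-corrected value, so the correction always lies between the two.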

21 Use the Blond/Non-Blond Example: Yates: Cochran: Uncorrected:

22 (5) The Fisher – Exact Method 2 x 2 tables only Just like the binomial, we need this probability + all more extreme. Hypergeometric Distribution

23 To find those more extreme, look for the smallest expected frequency as in Cochran’s Method. Form successive tables as in the next example:

24 From 2 to 1 And 1 to 0 Prob= 0.02923 + 0.003498 + 0.0001499 = 0.03288. Reject H 0 : Independence or ?
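
The "this table plus all more extreme" sum can be sketched with the hypergeometric probability. This is a one-tailed illustration under stated assumptions: the margins are held fixed, the first cell is stepped away from its expected value until some cell reaches zero, and the function names and counts are hypothetical rather than the example on the slides.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2 x 2 table [[a, b], [c, d]]."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_one_tailed(a, b, c, d):
    """Probability of the observed table plus all more extreme tables."""
    expected_a = (a + b) * (a + c) / (a + b + c + d)
    step = -1 if a < expected_a else 1  # move away from the expectation
    total = 0.0
    while min(a, b, c, d) >= 0:
        total += table_probability(a, b, c, d)
        # Shift one observation while keeping every marginal total fixed.
        a, d = a + step, d + step
        b, c = b - step, c - step
    return total


# Hypothetical table: P(a=1) + P(a=0) = 16/70 + 1/70
print(fisher_one_tailed(1, 3, 3, 1))
```

Each pass of the loop corresponds to one of the successive tables formed by reducing the chosen cell by one, as in going "from 2 to 1" and "from 1 to 0".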

25 A heterogeneity chi-square analysis of 2 x 2 contingency tables. H 0 : The four samples are homogeneous. H A : The four samples are heterogeneous. Example 6.4a (Edition 2), or see Example 23.8a (Edition 4)

26 Accept H 0 : Experiments Homogeneous
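
A sketch of one common way to carry out a heterogeneity analysis, assumed here rather than taken from the slides: sum the uncorrected chi-squares of the individual 2 x 2 tables, subtract the chi-square of the pooled table, and refer the difference to chi-square with (number of tables - 1) degrees of freedom. Names and counts are hypothetical.

```python
def chi_square_2x2(table):
    """Uncorrected chi-square for a 2 x 2 table via the shortcut formula."""
    (f11, f12), (f21, f22) = table
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    return n * (f11 * f22 - f12 * f21) ** 2 / (r1 * r2 * c1 * c2)


def heterogeneity_chi_square(tables):
    """Return (heterogeneity chi-square, degrees of freedom)."""
    total = sum(chi_square_2x2(t) for t in tables)
    # Pool the samples by adding the tables cell by cell.
    pooled = [[sum(t[i][j] for t in tables) for j in range(2)]
              for i in range(2)]
    return total - chi_square_2x2(pooled), len(tables) - 1


# Two hypothetical samples that point in opposite directions
samples = [[[20, 30], [30, 20]], [[30, 20], [20, 30]]]
print(heterogeneity_chi_square(samples))  # prints (8.0, 1)
```

A small heterogeneity chi-square suggests the samples tell the same story and can reasonably be pooled; a large one, as in this contrived case, suggests they should not be.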

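27 THE LOG-LIKELIHOOD RATIO The log-likelihood ratio was introduced in Session 2. In tables, it is calculated: $G = 2 \sum_i \sum_j f_{ij} \ln (f_{ij} / \hat{f}_{ij})$, where $\hat{f}_{ij}$ is the expected frequency of cell (i, j). Since G is approximately distributed as chi-square, Table B.1 may be used with (r-1)(c-1) degrees of freedom.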
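
A minimal sketch of the G statistic for a two-way table, assuming the standard definition G = 2 * sum of f * ln(f / f_hat) with expected counts taken from the marginal totals. Cells with an observed count of zero contribute nothing. The function name and counts are illustrative.

```python
from math import log


def g_statistic(table):
    """Return (G, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # the limit of f * ln(f) as f -> 0 is 0
                expected = row_totals[i] * col_totals[j] / n
                g += observed * log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return 2 * g, df


print(g_statistic([[20, 30], [30, 20]]))  # G is about 4.027 with 1 DF
```

For this hypothetical table the Pearson chi-square is 4.0, so the two statistics are close, as expected when no cell is sparse.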

28 Log-likelihood Example: Hair Color

29 Three and Higher Dimensional Tables Ex: Problem III: Row: Remission Column: Age Tier:LI

30 Notation: Row i; column j; tier l. f ijl = # in cell (i, j, l). [Figure: 3-D table with cells f 111, f 121, f 211, f 221, f 112, f 122] Types of Independence, 3-D: Mutual
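
For mutual independence in a three-way table, a sketch under the usual assumptions: the expected count for cell (i, j, l) is R_i C_j T_l / n^2, and the degrees of freedom are rct - r - c - t + 2. The table is indexed as table[i][j][l]; the function name and counts are hypothetical.

```python
def mutual_independence_chi_square(table):
    """Return (chi-square, DF) for mutual independence in a 3-D table."""
    r, c, t = len(table), len(table[0]), len(table[0][0])
    row = [sum(table[i][j][l] for j in range(c) for l in range(t))
           for i in range(r)]
    col = [sum(table[i][j][l] for i in range(r) for l in range(t))
           for j in range(c)]
    tier = [sum(table[i][j][l] for i in range(r) for j in range(c))
            for l in range(t)]
    n = sum(row)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            for l in range(t):
                # product of three marginal probabilities, times n
                expected = row[i] * col[j] * tier[l] / n ** 2
                chi2 += (table[i][j][l] - expected) ** 2 / expected
    return chi2, r * c * t - r - c - t + 2


# Hypothetical 2 x 2 x 2 table; index order is [row][column][tier]
counts = [[[12, 8], [10, 10]], [[8, 12], [10, 10]]]
print(mutual_independence_chi_square(counts))
```

The degrees of freedom are the number of cells minus the independently estimated marginal proportions, minus one for the fixed grand total.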

31 Partial: R vs C & T: Column & Tier spread out as a single variable

32 C vs R & T T vs R & C
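
Partial independence of rows from columns and tiers can be sketched by spreading each column/tier combination out as one column and running the ordinary two-way test. This is an illustration under that assumption, with table[i][j][l] indexing and hypothetical names and counts; the other two partial tests follow by reordering the axes.

```python
def rows_vs_columns_and_tiers(table):
    """Return (chi-square, DF) for R vs. C & T in a 3-D table."""
    # Spread column and tier out as a single variable with c * t levels.
    flat = [[count for column in row for count in column] for row in table]
    row_totals = [sum(row) for row in flat]
    col_totals = [sum(col) for col in zip(*flat)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(flat):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(flat) - 1) * (len(flat[0]) - 1)
    return chi2, df


# Hypothetical 2 x 2 x 2 table; flattens to [[20, 30, 25, 25], [30, 20, 25, 25]]
counts = [[[20, 30], [25, 25]], [[30, 20], [25, 25]]]
print(rows_vs_columns_and_tiers(counts))  # prints (4.0, 3)
```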

33 Pairwise

34

35 Testing Proportions: A standard problem for which contingency tables and the independence null hypothesis are used is a test of proportions (more about this in Session 7). The problem in its simplest form is a 2 by 2 contingency table, similar to the following table, comparing two groups (control and treatment, say) for which there are two outcomes: positive/negative, alive/dead, remission/no remission

36 The control group percent positive = The treatment group percent positive = The hypothesis of independence with Fisher-Exact or the chi-square approximation is a test of this hypothesis

37 Power Calculations and Sample size Determination: Proportion positive control (p 1 ) Proportion positive treatment (p 2 ) Sample Size (n) Type I Error (α) = Pr(Rejecting H 0 | H 0 is true) Type II Error (β) = Pr(Rejecting H A | H A is true) Or 1 - β = Pr(Accepting H A | H A is true) = power Given any four of these (p 1, p 2, n, α, β), the fifth one is specified.

38 This is using the formula of Casagrande JT, Pike MC (An Improved Formula for Calculating Sample Sizes, for Comparing Two Binomial Distributions, Biometrics, 34, 483-486): This equation can be solved for the parameter that is not defined. With the t-distribution, iteration is required.
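
The equation on this slide is not reproduced in the transcript, so the sketch below does not implement the Casagrande and Pike formula. It substitutes the plain normal approximation for comparing two proportions, without a continuity correction, to show how fixing four of the five quantities determines the fifth (here n). A two-sided test and equal group sizes are assumed, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate n per group (plain normal approximation, uncorrected)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# 20% positive in control vs. 40% in treatment, alpha = 0.05, power = 0.80
print(sample_size_per_group(0.20, 0.40))  # prints 82
```

Because the uncorrected approximation omits the continuity correction, it tends to give somewhat smaller sample sizes than corrected formulas do.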

