Published by Morris Wilson. Modified over 9 years ago.
1
Session IV: Contingency Tables Tests of Association and Independence (Zar, Chapters 23, 24.10)
2
Chapter 23: Contingency Tables. Another (and last!!) look at peas!
3
Remember the pea color & texture: Put the data into a matrix:
4
What was the hypothesis from the last session? Answer: 9 : 3 : 3 : 1, or ? (1) Y is dominant! Test the hypothesis with:
5
(2) S is dominant! Test the hypothesis with: Then for the answer.
6
(3) Color & Texture are independent. Problem: How to test this without assuming proportions in the hypothesis? Estimate them from the marginal totals.
7
Given the marginal probability estimates, if the H 0 of independence is true, what are the probabilities in the table “cells”?
8
Given the marginal probability estimates and table estimates, how do we get the expected number? Expected # = probability x total number
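The two steps above (cell probability from the marginal estimates, then expected number = probability x total number) can be sketched in a few lines. This is a minimal illustration, not the slide's own calculation: the counts are hypothetical (the pea data are not reproduced in this transcript) and the function name `expected_counts` is an assumption made here.

```python
def expected_counts(table):
    """Expected cell counts under independence, built from the marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = []
    for r in row_totals:
        # cell probability = (row total / n) * (column total / n)
        # expected number  = cell probability * n
        expected.append([(r / n) * (c / n) * n for c in col_totals])
    return expected


# Hypothetical 2 x 2 counts, NOT the pea data from the slides
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)

for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts always add up to the same total as the observed counts, which is a quick check on the arithmetic.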
10
In general, for 2 x 2 tables: $DF = 2 \times 2 - 1 - 1 - 1 = 1$ (4 cells, less 1 each for the row, the column, and the total).
11
In general:
12
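Where: Estimates of the “cells”: $\hat{f}_{ij} = R_i C_j / n$, with $R_i$ and $C_j$ the row and column totals and $n$ the grand total. Degrees of Freedom: $rc - (r-1) - (c-1) - 1 = (r-1)(c-1)$ (cells, less rows, columns, and total).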
13
Special cases:
14
Example 23.1: Hair Color and Gender Therefore, reject H 0
15
Okay, we have now rejected the null hypothesis, H 0 : The rows and the columns are independent, and accepted the alternative hypothesis, H A : The rows and the columns are not independent. What now? Look for sub-hypotheses? Look at the individual chi-squares! Cells: MBk, MBr, MBl, MR, FBk, FBr, FBl, FR.
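A short sketch of "looking at the individual chi-squares": each cell's contribution (observed - expected)^2 / expected is computed separately so the large ones stand out. The counts below are assumed to be those of Zar's Example 23.1 (hair color by sex); the table itself is not reproduced in this transcript, so treat the numbers as illustrative.

```python
# Assumed counts for Zar's Example 23.1 (not shown in the slide text)
hair = ["black", "brown", "blond", "red"]
observed = {"male": [32, 43, 16, 9],
            "female": [55, 65, 64, 16]}

col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(col_totals)

# Per-cell contribution to the chi-square statistic
contributions = {}
for sex, counts in observed.items():
    row_total = sum(counts)
    contributions[sex] = [
        (f - row_total * c / n) ** 2 / (row_total * c / n)
        for f, c in zip(counts, col_totals)
    ]

chi_square = sum(sum(cells) for cells in contributions.values())
df = (len(observed) - 1) * (len(hair) - 1)

for sex, cells in contributions.items():
    for color, value in zip(hair, cells):
        print(f"{sex:6s} {color:5s} {value:.3f}")
print(f"chi-square = {chi_square:.3f} with {df} df")
```

With these counts the two blond cells carry most of the total, which is what motivates splitting the table into blond versus non-blond.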
16
It makes sense to combine hair colors with similar gender ratios. Blonds vs. Non-Blonds? But are all the Non-Blonds the same? H 0 : Black = Brown = Red
17
Now H 0 : Blonds=Non-Blonds (1) Too Few Male Blonds (2) Too Many Female Blonds (3) Too Many Male NBl (4) Too Few Female NBl Hair Color independent of Gender
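The subdivision into two sub-hypotheses can be sketched as two smaller chi-square tests: a 2 x 3 table for the non-blonds, then a 2 x 2 table for blond versus non-blond. The counts are again assumed to be those of Zar's Example 23.1 and the helper name `chi_square` is introduced here for illustration.

```python
def chi_square(table):
    """Pearson chi-square for an r x c table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for row, r in zip(table, row_totals):
        for f, c in zip(row, col_totals):
            e = r * c / n  # expected count under independence
            total += (f - e) ** 2 / e
    return total


# Assumed counts (rows: male, female)
non_blonds = [[32, 43, 9],      # columns: black, brown, red
              [55, 65, 16]]
blond_vs_rest = [[16, 84],      # columns: blond, non-blond
                 [64, 136]]

chi_non_blonds = chi_square(non_blonds)        # 2 df; 5% critical value 5.991
chi_blond_vs_rest = chi_square(blond_vs_rest)  # 1 df; 5% critical value 3.841

print(f"Black = Brown = Red: chi-square = {chi_non_blonds:.3f}")
print(f"Blond vs. non-blond: chi-square = {chi_blond_vs_rest:.3f}")
```

The first statistic is far below its critical value and the second is well above, matching the slide's conclusions that the non-blonds can be pooled and that blonds differ.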
18
The 2 x 2 Table re-examined: (1) The problem of counts vs. estimates. (2) The Yates Correction: the subtraction of 0.5 is “conservative”. Rounding would be better.
19
(3) The Computing formulae
20
The Cochran/Haber Correction
The Cochran/Haber correction (Haber, 1980) gives better results when routinely employed than does either the Yates-corrected or the uncorrected chi-square calculation.
(a) Determine each of the four expected frequencies, denoting the smallest as $\hat{f}$.
(b) Calculate the absolute difference between the smallest expected frequency and its corresponding observed frequency: $d = |f - \hat{f}|$.
(c) If $f \le 2\hat{f}$, define D = the largest multiple of 0.5 that is < d.
(d) If $f > 2\hat{f}$, define D = d - 0.5.
(e) Calculate $\chi^2_H = n^3 D^2 / (R_1 R_2 C_1 C_2)$, where $R_1, R_2, C_1, C_2$ are the row and column totals and $n$ is the grand total.
21
Use the Blond/Non-Blond Example: Yates: Cochran: Uncorrected:
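The three statistics named on this slide can be compared with a small sketch. It assumes the blond/non-blond counts of Zar's Example 23.1 (16, 84, 64, 136), the usual 2 x 2 computing formula n(ad - bc)^2 / (R1 R2 C1 C2), and the Cochran/Haber steps as they are given in Zar; the function name and argument layout are choices made here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Return (uncorrected, Yates, Cochran/Haber) chi-squares for [[a, b], [c, d]]."""
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    n = r1 + r2
    denom = r1 * r2 * c1 * c2
    cross = abs(a * d - b * c)

    uncorrected = n * cross ** 2 / denom
    # Yates: subtract n/2 from |ad - bc| (never letting it go below zero)
    yates = n * max(cross - n / 2, 0) ** 2 / denom

    # Cochran/Haber: work from the cell with the smallest expected frequency
    observed = [a, b, c, d]
    expected = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]
    k = min(range(4), key=lambda i: expected[i])
    f, f_hat = observed[k], expected[k]
    diff = abs(f - f_hat)
    if f <= 2 * f_hat:
        # largest multiple of 0.5 that is strictly less than diff
        big_d = (math.ceil(diff / 0.5) - 1) * 0.5
    else:
        big_d = diff - 0.5
    big_d = max(big_d, 0.0)
    haber = n ** 3 * big_d ** 2 / denom
    return uncorrected, yates, haber


# Assumed counts: male blond, male non-blond, female blond, female non-blond
uncorrected, yates, haber = chi_square_2x2(16, 84, 64, 136)
print(f"Uncorrected: {uncorrected:.3f}")
print(f"Yates:       {yates:.3f}")
print(f"Cochran:     {haber:.3f}")
```

The Cochran/Haber value falls between the other two, which illustrates the slide's remark that the Yates subtraction of 0.5 is conservative.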
22
(5) The Fisher Exact Method: 2 x 2 tables only. Just like the binomial, we need this probability plus all more extreme. Hypergeometric Distribution
23
To find those more extreme, look for the smallest expected frequency as in Cochran’s Method. Form successive tables as in the next example:
24
From 2 to 1 And 1 to 0 Prob= 0.02923 + 0.003498 + 0.0001499 = 0.03288. Reject H 0 : Independence or ?
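The "this probability plus all more extreme" sum can be sketched with the hypergeometric probability of each successive table. The table below is hypothetical (it is not the table behind the probabilities on the slide), and the function names are assumptions. Because a 2 x 2 table with fixed margins has one free cell, stepping cell `a` away from its expected value is equivalent to stepping the smallest-expected cell toward its limit.

```python
from math import comb


def table_probability(a, r1, r2, c1):
    """Hypergeometric probability of a 2 x 2 table with top-left cell a and fixed margins."""
    return comb(r1, a) * comb(r2, c1 - a) / comb(r1 + r2, c1)


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability: the observed table plus all more extreme ones."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    expected_a = r1 * c1 / n
    if a <= expected_a:
        # more extreme means a smaller a, down to the smallest value the margins allow
        values = range(max(0, c1 - r2), a + 1)
    else:
        # more extreme means a larger a, up to the largest value the margins allow
        values = range(a, min(r1, c1) + 1)
    return sum(table_probability(x, r1, r2, c1) for x in values)


# Hypothetical table [[1, 9], [11, 3]]: successive tables go from a = 1 to a = 0
p = fisher_one_tailed(1, 9, 11, 3)
print(f"one-tailed P = {p:.6f}")
```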
25
A heterogeneity chi-square analysis of 2 x 2 contingency tables. H 0 : The four samples are homogeneous. H A : The four samples are heterogeneous. Example 6.4a (Edition 2) or see Example 23.8a (Edition 4)
26
Accept H 0, Experiments Homogeneous
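A heterogeneity chi-square for several 2 x 2 tables can be sketched as the sum of the individual chi-squares minus the chi-square of the pooled table, with (number of tables - 1) degrees of freedom. The four tables below are hypothetical, not the data of the cited examples, and no continuity correction is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    n = r1 + r2
    return n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)


# Four hypothetical experiments, each given as (a, b, c, d)
tables = [(12, 8, 7, 13), (15, 10, 9, 16), (11, 9, 8, 12), (14, 6, 9, 11)]

total_chi = sum(chi_square_2x2(*t) for t in tables)   # df = 4 (1 per table)
pooled = [sum(cells) for cells in zip(*tables)]       # add the tables cell by cell
pooled_chi = chi_square_2x2(*pooled)                  # df = 1

heterogeneity_chi = total_chi - pooled_chi
heterogeneity_df = len(tables) - 1                    # 4 - 1 = 3

# 5% critical value of chi-square with 3 df is 7.815
print(f"heterogeneity chi-square = {heterogeneity_chi:.3f}, df = {heterogeneity_df}")
```

A small heterogeneity chi-square, as here, is the situation in which the samples are judged homogeneous and pooling them is justified.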
27
THE LOG-LIKELIHOOD RATIO The log-likelihood ratio was introduced in Session 2. In tables, it is calculated: $G = 2 \sum_i \sum_j f_{ij} \ln(f_{ij} / \hat{f}_{ij})$. Since G is approximately distributed as $\chi^2$, Table B.1 may be used with (r-1)(c-1) degrees of freedom.
28
Log-likelihood Example: Hair Color
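A minimal sketch of the log-likelihood ratio for a two-way table, using G = 2 * sum of f * ln(f / expected) with expected counts taken from the marginal totals. The hair color counts are assumed to be those of Zar's Example 23.1, since the slide's table is not reproduced in this transcript.

```python
import math


def g_statistic(table):
    """Log-likelihood ratio G for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for row, r in zip(table, row_totals):
        for f, c in zip(row, col_totals):
            if f > 0:  # a zero count contributes nothing to the sum
                g += f * math.log(f * n / (r * c))  # f * ln(f / expected)
    return 2 * g


# Assumed counts (rows: male, female; columns: black, brown, blond, red)
hair_table = [[32, 43, 16, 9],
              [55, 65, 64, 16]]
g = g_statistic(hair_table)
df = (len(hair_table) - 1) * (len(hair_table[0]) - 1)
print(f"G = {g:.3f} with {df} df")
```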
29
Three and Higher Dimensional Tables. Ex: Problem III: Row: Remission; Column: Age; Tier: LI
30
Notation: row i; column j; tier l; f ijl = # in row i, column j, tier l. [Figure: 2 x 2 x 2 table with cells f 111, f 121, f 211, f 221, f 112, f 122.] Types of Independence: 3-D: Mutual
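For mutual independence in a three-way table, the expected count for cell (i, j, l) is the product of the row, column, and tier totals divided by n squared, and the chi-square has rct - r - c - t + 2 degrees of freedom. The sketch below uses hypothetical counts indexed as `f[row][column][tier]`; the function name is an assumption.

```python
def mutual_independence_chi_square(f):
    """Chi-square and df for H0: rows, columns, and tiers are mutually independent."""
    r, c, t = len(f), len(f[0]), len(f[0][0])
    row_totals = [sum(f[i][j][l] for j in range(c) for l in range(t)) for i in range(r)]
    col_totals = [sum(f[i][j][l] for i in range(r) for l in range(t)) for j in range(c)]
    tier_totals = [sum(f[i][j][l] for i in range(r) for j in range(c)) for l in range(t)]
    n = sum(row_totals)

    chi = 0.0
    for i in range(r):
        for j in range(c):
            for l in range(t):
                # expected = n * (R_i / n) * (C_j / n) * (T_l / n)
                e = row_totals[i] * col_totals[j] * tier_totals[l] / n ** 2
                chi += (f[i][j][l] - e) ** 2 / e
    df = r * c * t - r - c - t + 2
    return chi, df


# Hypothetical 2 x 2 x 2 counts, indexed f[row][column][tier]
f = [[[10, 6], [8, 12]],
     [[5, 9], [14, 7]]]
chi, df = mutual_independence_chi_square(f)
print(f"chi-square = {chi:.3f} with {df} df")
```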
31
Partial: R vs. C & T: column and tier spread out as a single variable
32
C vs. R & T; T vs. R & C
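Partial independence of rows from columns and tiers can be sketched by doing what the slide describes: spread the column and tier combinations out as a single variable, then run an ordinary two-way chi-square with (r - 1)(ct - 1) degrees of freedom. The counts and the function name below are hypothetical; the other two partial tests follow by permuting the indices.

```python
def partial_independence_chi_square(f):
    """Chi-square and df for H0: rows are independent of the (column, tier) combinations."""
    # Spread each row's column-by-tier cells out into one flat list
    flat = [[count for column in row for count in column] for row in f]
    row_totals = [sum(row) for row in flat]
    combo_totals = [sum(col) for col in zip(*flat)]
    n = sum(row_totals)

    chi = 0.0
    for row, r in zip(flat, row_totals):
        for observed, ct in zip(row, combo_totals):
            e = r * ct / n  # expected count for this row and (column, tier) combination
            chi += (observed - e) ** 2 / e
    df = (len(flat) - 1) * (len(flat[0]) - 1)
    return chi, df


# Hypothetical 2 x 2 x 2 counts, indexed f[row][column][tier]
f = [[[10, 6], [8, 12]],
     [[5, 9], [14, 7]]]
chi, df = partial_independence_chi_square(f)
print(f"R vs. C & T: chi-square = {chi:.3f} with {df} df")
```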
33
Pairwise
35
Testing Proportions: A standard problem for which contingency tables and the independence null hypothesis are used is a test of proportions (more about this in Session 7). The problem in its simplest form is a 2 by 2 contingency table similar to the following table comparing two groups, control and treatment, say, for which there are two outcomes: positive/negative, alive/dead, remission/no remission.
36
The control group percent positive = The treatment group percent positive = The hypothesis of independence, tested with the Fisher exact test or the chi-square approximation, is a test of the hypothesis that these two percentages are equal.
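The link between the independence test and a test of two proportions can be checked numerically: the uncorrected 2 x 2 chi-square equals the square of the pooled two-proportion z statistic. The counts below are hypothetical, chosen only to illustrate the identity.

```python
import math

# Hypothetical counts: rows = control, treatment; columns = positive, negative
control_pos, control_neg = 15, 35
treat_pos, treat_neg = 27, 23

n1 = control_pos + control_neg
n2 = treat_pos + treat_neg
p1 = control_pos / n1            # control group proportion positive
p2 = treat_pos / n2              # treatment group proportion positive

# Two-proportion z statistic using the pooled proportion
pooled = (control_pos + treat_pos) / (n1 + n2)
z = (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Uncorrected chi-square for the same 2 x 2 table
n = n1 + n2
c1 = control_pos + treat_pos
c2 = control_neg + treat_neg
cross = control_pos * treat_neg - control_neg * treat_pos
chi_square = n * cross ** 2 / (n1 * n2 * c1 * c2)

print(f"p1 = {p1:.2f}, p2 = {p2:.2f}")
print(f"z^2 = {z ** 2:.4f}, chi-square = {chi_square:.4f}")
```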
37
Power Calculations and Sample Size Determination: Proportion positive, control (p 1); proportion positive, treatment (p 2); sample size (n); Type I error (α): Pr(Rejecting H 0 | H 0 is true); Type II error (β): Pr(Rejecting H A | H A is true), or 1 - Pr(Accepting H A | H A is true) = 1 - power. Given any four of these (p 1, p 2, n, α, β), the fifth one is specified.
38
This is using the formula of Casagrande JT, Pike MC, Smith PG (An Improved Approximate Formula for Calculating Sample Sizes for Comparing Two Binomial Distributions, Biometrics, 34, 483-486, 1978): This equation can be solved for the parameter that is not defined. With the t-distribution, iteration is required.
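Solving for the sample size can be sketched as follows. This uses the commonly cited normal-deviate form of the Casagrande, Pike and Smith approximation; the slide's own equation is not reproduced in this transcript and may differ in detail (for example, by using t in place of z). The function name, its defaults, and the two-sided option are assumptions made here.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n per group for comparing two proportions (assumed formula form)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = z.inv_cdf(power)  # power = 1 - beta

    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)

    # A is the uncorrected numerator; n_uncorrected would be A / delta^2
    a = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    # Continuity-corrected sample size
    n = a * (1 + sqrt(1 + 4 * delta / a)) ** 2 / (4 * delta ** 2)
    return ceil(n)


# Hypothetical planning values
n_small_gap = sample_size_per_group(0.30, 0.50)
n_large_gap = sample_size_per_group(0.30, 0.60)
n_more_power = sample_size_per_group(0.30, 0.50, power=0.90)
print(n_small_gap, n_large_gap, n_more_power)
```

Solving for one of the other four quantities instead (for example, power given n) means inverting the same relationship, which is usually done numerically.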