Session IV: Contingency Tables. Tests of Association and Independence (Zar, Chapters 23, 24.10)

Chapter 23: Contingency Tables. Another (and last!) look at peas!

Remember the pea color & texture: Put the data into a matrix:

What was the hypothesis from the last session? Answer: 9 : 3 : 3 : 1, or?

(1) Y is dominant! Test the hypothesis with:

(2) S is dominant! Test the hypothesis with: Then for the answer.

(3) Color & Texture are independent. Problem: How to test this without assuming proportions in the hypothesis? Estimate from the Marginal Totals.

Given the marginal probability estimates, if the $H_0$ of independence is true, what are the probabilities in the table "cells"? Under independence, each cell probability is the product of its row and column marginal probabilities.

Given the marginal probability estimates and table estimates, how do we get the expected number? Expected number = probability × total number
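The step from marginal totals to expected counts can be sketched in a few lines of Python. This is a minimal sketch: the function name `expected_counts` is hypothetical, and the counts are illustrative only (they are not the pea data from the slides).

```python
def expected_counts(table):
    """Expected cell counts under independence.

    Cell probability = (row total / n) * (column total / n),
    so expected count = probability * n = row total * column total / n.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Illustrative counts only (assumed, not taken from the slides).
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)
print(expected)
```

Note that the expected counts keep the same row and column totals as the observed table, which is why estimating them from the margins costs degrees of freedom.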

In general, for 2 x 2 tables: DF = 2 × 2 - 1 - 1 - 1 = 1 (4 cells, less one each for the row, column, and total constraints).

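In general, for a table with $r$ rows and $c$ columns:

$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(f_{ij} - \hat{f}_{ij})^2}{\hat{f}_{ij}}$$

Where:
Estimates of the "cells": $\hat{f}_{ij} = R_i C_j / n$, with $f_{ij}$ the observed count, $R_i$ the total of row $i$, $C_j$ the total of column $j$, and $n$ the grand total.
Degrees of Freedom: $rc - (r-1) - (c-1) - 1 = (r-1)(c-1)$ (cells, less rows, columns, and total).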
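As a hedged illustration of the general case, the sketch below computes the chi-square statistic and its degrees of freedom for any r x c table. The function name `chi_square_independence` and the counts are assumptions made for the example, not material from the slides.

```python
def chi_square_independence(table):
    """Pearson chi-square for independence in an r x c table.

    Returns (chi-square, degrees of freedom), with
    expected count = row total * column total / n
    and DF = (rows - 1) * (columns - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (f - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Illustrative counts only (assumed, not taken from the slides).
chi2, df = chi_square_independence([[30, 10],
                                    [20, 40]])
print(chi2, df)
```

The statistic is then compared with the chi-square critical value for `df` degrees of freedom.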

Special cases:

Example 23.1: Hair Color and Gender. Therefore, reject $H_0$.

Okay, we have now rejected the null hypothesis, $H_0$: the rows and the columns are independent, and accepted the alternative hypothesis, $H_A$: the rows and the columns are not independent. What now? Look for sub-hypotheses? Look at the individual chi-squares, one per cell (M = male, F = female; Bk = black, Br = brown, Bl = blond, R = red): MBk, MBr, MBl, MR, FBk, FBr, FBl, FR.

It makes sense to combine hair colors with similar gender ratios. Blonds vs. Non-Blonds? But are all the Non-Blonds the same? $H_0$: Black = Brown = Red

Now $H_0$: Blonds = Non-Blonds
(1) Too few male blonds
(2) Too many female blonds
(3) Too many male non-blonds
(4) Too few female non-blonds
Hair Color independent of Gender

The 2 x 2 Table re-examined:
(1) The problem of counts vs. estimates
(2) The Yates Correction. The subtraction of 0.5 is "conservative". Rounding would be better.

(3) The Computing formulae
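For reference, the usual 2 x 2 computing formula is $\chi^2 = n(f_{11}f_{22} - f_{12}f_{21})^2 / (R_1 R_2 C_1 C_2)$, and the Yates-corrected version subtracts $n/2$ from $|f_{11}f_{22} - f_{12}f_{21}|$ before squaring. The sketch below assumes these standard forms; the function name `chi_square_2x2` and the counts are illustrative assumptions.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square for a 2 x 2 table via the computing formula.

    Uncorrected: n * (f11*f22 - f12*f21)^2 / (R1*R2*C1*C2)
    Yates:       n * (|f11*f22 - f12*f21| - n/2)^2 / (R1*R2*C1*C2)
    """
    (f11, f12), (f21, f22) = table
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    diff = abs(f11 * f22 - f12 * f21)
    if yates:
        # Never let the correction push the difference below zero.
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / (r1 * r2 * c1 * c2)


# Illustrative counts only (assumed, not taken from the slides).
table = [[30, 10],
         [20, 40]]
uncorrected = chi_square_2x2(table)
corrected = chi_square_2x2(table, yates=True)
print(uncorrected, corrected)
```

The corrected value is never larger than the uncorrected one, which is the sense in which the correction is "conservative".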

(4) The Cochran/Haber Correction. The Cochran/Haber (Haber (1980)) correction gives better results when routinely employed than does either the Yates-corrected or non-corrected chi-square calculation.
(a) Determine each of the four expected frequencies, denoting the smallest as $\hat{f}$.
(b) Calculate the absolute difference between the smallest expected frequency and its corresponding observed frequency $f$: $d = |f - \hat{f}|$.
(c) If $f \le 2\hat{f}$, define $D$ = the largest multiple of 0.5 that is < $d$;
(d) If $f > 2\hat{f}$, define $D = d - 0.5$.
(e) Calculate $\chi^2_H = n^3 D^2 / (R_1 R_2 C_1 C_2)$, where $R_1, R_2$ are the row totals, $C_1, C_2$ the column totals, and $n$ the grand total.

Use the Blond/Non-Blond Example: Yates: Cochran: Uncorrected:
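Steps (a) through (e) of the Cochran/Haber procedure can be sketched as follows. This is a minimal sketch, assuming the final statistic is $n^3 D^2 / (R_1 R_2 C_1 C_2)$ and the case split is on whether the observed count exceeds twice the smallest expected frequency; the function name `haber_chi_square` and the counts are hypothetical, not the blond/non-blond data.

```python
import math


def haber_chi_square(table):
    """Cochran/Haber-corrected chi-square for a 2 x 2 table (sketch)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # (a) smallest expected frequency, paired with its observed count
    cells = [(row_totals[i] * col_totals[j] / n, table[i][j])
             for i in range(2) for j in range(2)]
    f_hat, f = min(cells, key=lambda cell: cell[0])

    # (b) absolute difference between observed and expected
    d = abs(f - f_hat)

    if f <= 2 * f_hat:
        # (c) largest multiple of 0.5 strictly less than d
        big_d = (math.ceil(2 * d) - 1) / 2
    else:
        # (d) same as the Yates correction
        big_d = d - 0.5
    big_d = max(big_d, 0.0)  # guard for d == 0

    # (e) corrected statistic
    denom = row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]
    return n ** 3 * big_d ** 2 / denom


# Illustrative counts only (assumed, not taken from the slides).
chi2_h = haber_chi_square([[3, 7],
                           [8, 2]])
print(chi2_h)
```

In this example the smallest expected frequency is 4.5 with observed count 7, so d = 2.5 and D = 2.0, whereas the Yates correction would use D = 2.0 as well only by coincidence of the 0.5 step; when d is not a multiple of 0.5 the two corrections differ.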

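(5) The Fisher Exact Method: 2 x 2 tables only. Just like the binomial, we need this probability plus all more extreme ones. Hypergeometric distribution:

$$P = \frac{R_1!\,R_2!\,C_1!\,C_2!}{n!\,f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$

where $f_{ij}$ are the four cell counts, $R_i$ and $C_j$ the row and column totals, and $n$ the grand total.

To find those more extreme, look for the smallest expected frequency as in Cochran's Method. Form successive tables as in the next example:

From 2 to 1, and 1 to 0: Prob = 0.02923 + 0.003498 + 0.0001499 = 0.03288. Reject $H_0$: Independence, or?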
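The "this table plus all more extreme tables" sum can be sketched with the hypergeometric probability written in terms of binomial coefficients. The function names and counts below are assumptions for illustration; the tail is one-sided, taken in the direction in which the observed cell departs from its expected value, with the margins held fixed.

```python
from math import comb


def table_probability(f11, r1, r2, c1):
    """Hypergeometric probability of a 2 x 2 table with fixed margins."""
    return comb(r1, f11) * comb(r2, c1 - f11) / comb(r1 + r2, c1)


def fisher_one_tailed(table):
    """Probability of the observed table plus all more extreme tables."""
    (f11, f12), (f21, f22) = table
    r1, r2, c1 = f11 + f12, f21 + f22, f11 + f21
    n = r1 + r2
    expected_f11 = r1 * c1 / n
    lowest, highest = max(0, c1 - r2), min(r1, c1)
    if f11 <= expected_f11:
        values = range(lowest, f11 + 1)    # step the cell down to its minimum
    else:
        values = range(f11, highest + 1)   # step the cell up to its maximum
    return sum(table_probability(x, r1, r2, c1) for x in values)


# Illustrative counts only (assumed, not taken from the slides).
p = fisher_one_tailed([[2, 8],
                       [7, 3]])
print(p)
```

Here the cell steps from 2 to 1 to 0, mirroring the successive tables in the example above, although the counts and the resulting probability are not those of the slide.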

A heterogeneity chi-square analysis of 2 x 2 contingency tables. $H_0$: The four samples are homogeneous. $H_A$: The four samples are heterogeneous. Example 6.4a (Edition 2), or see Example 23.8a (Edition 4).

Accept $H_0$: experiments homogeneous.
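A heterogeneity analysis of this kind is usually computed by summing the uncorrected chi-squares of the individual tables, subtracting the chi-square of the pooled table, and referring the difference to k - 1 degrees of freedom for k tables. The sketch below assumes that procedure; the function names and the three tables are hypothetical, not the data of the cited examples.

```python
def chi_square_2x2(table):
    """Uncorrected chi-square for one 2 x 2 table."""
    (f11, f12), (f21, f22) = table
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    return n * (f11 * f22 - f12 * f21) ** 2 / (r1 * r2 * c1 * c2)


def heterogeneity_chi_square(tables):
    """Return (total, pooled, heterogeneity, heterogeneity DF)."""
    total = sum(chi_square_2x2(t) for t in tables)       # DF = k
    pooled_table = [[sum(t[i][j] for t in tables) for j in range(2)]
                    for i in range(2)]
    pooled = chi_square_2x2(pooled_table)                # DF = 1
    return total, pooled, total - pooled, len(tables) - 1


# Illustrative counts only (assumed, not taken from the slides).
tables = [[[12, 8], [6, 14]],
          [[15, 5], [9, 11]],
          [[10, 10], [7, 13]]]
total, pooled, het, het_df = heterogeneity_chi_square(tables)
print(total, pooled, het, het_df)
```

A small heterogeneity chi-square relative to its critical value supports pooling the samples.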

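THE LOG-LIKELIHOOD RATIO. The log-likelihood ratio was introduced in Session 2. In tables, it is calculated:

$$G = 2\sum_{i}\sum_{j} f_{ij}\,\ln\!\left(\frac{f_{ij}}{\hat{f}_{ij}}\right)$$

where $f_{ij}$ is the observed count and $\hat{f}_{ij}$ the expected count in row $i$, column $j$. Since G is approximately distributed as $\chi^2$, Table B.1 may be used with (r-1)(c-1) degrees of freedom.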
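A minimal sketch of the G calculation for a table, assuming the form G = 2 Σ f ln(f / expected) with expected counts taken from the marginal totals. The function name `g_statistic` and the counts are illustrative assumptions, not the hair color data.

```python
import math


def g_statistic(table):
    """Log-likelihood ratio G for independence in an r x c table.

    Returns (G, degrees of freedom). Cells with an observed count of
    zero contribute nothing to the sum.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            if f > 0:
                e = row_totals[i] * col_totals[j] / n
                g += f * math.log(f / e)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return 2 * g, df


# Illustrative counts only (assumed, not taken from the slides).
g, g_df = g_statistic([[30, 10],
                       [20, 40]])
print(g, g_df)
```

G is usually close to, but not identical with, the Pearson chi-square for the same table.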

Log-likelihood Example: Hair Color

Three and Higher Dimensional Tables. Ex: Problem III: Row: Remission; Column: Age; Tier: LI

Notation: row i; column j; tier l. $f_{ijl}$ = number of observations in row i, column j, tier l.
(Figure: 3-D table with cells $f_{111}$, $f_{121}$, $f_{211}$, $f_{221}$, $f_{112}$, $f_{122}$.)

Types of Independence in 3-D:

Mutual

Partial: R vs. C & T (column and tier spread out as a single variable); C vs. R & T; T vs. R & C

Pairwise
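For the mutual-independence case, the expected count in each cell is commonly taken as $\hat{f}_{ijl} = R_i C_j T_l / n^2$, with $rct - r - c - t + 2$ degrees of freedom. The sketch below assumes that form; the function name and the counts are hypothetical and unrelated to Problem III.

```python
def mutual_independence_chi2(f):
    """Chi-square for mutual independence in a 3-D table f[i][j][l].

    Returns (chi-square, degrees of freedom).
    """
    r, c, t = len(f), len(f[0]), len(f[0][0])
    row = [sum(f[i][j][l] for j in range(c) for l in range(t)) for i in range(r)]
    col = [sum(f[i][j][l] for i in range(r) for l in range(t)) for j in range(c)]
    tier = [sum(f[i][j][l] for i in range(r) for j in range(c)) for l in range(t)]
    n = sum(row)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            for l in range(t):
                e = row[i] * col[j] * tier[l] / n ** 2
                chi2 += (f[i][j][l] - e) ** 2 / e
    df = r * c * t - r - c - t + 2
    return chi2, df


# Illustrative 2 x 2 x 2 counts only (assumed, not taken from the slides).
counts = [[[10, 6], [8, 12]],
          [[5, 9], [14, 7]]]
chi2_3d, df_3d = mutual_independence_chi2(counts)
print(chi2_3d, df_3d)
```

Partial independence (for example R vs. C & T) can be handled with the ordinary two-way calculation after spreading the column and tier categories out as one combined variable.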

Testing Proportions: A standard problem for which contingency tables and the independence null hypothesis are used is a test of proportions (more about this in Session 7). The problem in its simplest form is a 2 by 2 contingency table similar to the following table, comparing two groups (control and treatment, say) for which there are two outcomes: positive/negative, alive/dead, remission/no remission.

The control group percent positive = The treatment group percent positive = Testing the hypothesis of independence with the Fisher exact method or the chi-square approximation is a test of the hypothesis that these two proportions are equal.

Power Calculations and Sample Size Determination:
Proportion positive, control ($p_1$)
Proportion positive, treatment ($p_2$)
Sample size ($n$)
Type I error: $\alpha$ = Pr(Rejecting $H_0$ | $H_0$ is true)
Type II error: $\beta$ = Pr(Rejecting $H_A$ | $H_A$ is true)
Or $1 - \beta$ = Pr(Accepting $H_A$ | $H_A$ is true) = power
Given any four of these ($p_1$, $p_2$, $n$, $\alpha$, $\beta$), the fifth one is specified.

This uses the formula of Casagrande JT, Pike MC ("An Improved Formula for Calculating Sample Sizes for Comparing Two Binomial Distributions", Biometrics, 34, 483-486). This equation can be solved for the parameter that is not defined. With the t-distribution, iteration is required.
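As a hedged sketch of a sample size calculation of this kind, the code below assumes the continuity-corrected form commonly quoted from that paper, $n = A\,[1 + \sqrt{1 + 4\delta/A}]^2 / (4\delta^2)$ with $\delta = |p_1 - p_2|$ and $A = [z_\alpha\sqrt{2\bar{p}\bar{q}} + z_\beta\sqrt{p_1 q_1 + p_2 q_2}]^2$. It uses normal quantiles rather than the t-distribution (an assumption), so no iteration is needed; the function name and default values are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Approximate per-group sample size for comparing two proportions.

    Assumes equal group sizes and normal quantiles (not the t-distribution).
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)                       # power = 1 - beta
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    a = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    n = a * (1 + sqrt(1 + 4 * delta / a)) ** 2 / (4 * delta ** 2)
    return ceil(n)


# Illustrative proportions only (assumed, not taken from the slides).
n_per_group = sample_size_per_group(0.20, 0.40)
print(n_per_group)
```

Solving for a different parameter (for example, the power achieved with a given n) would mean inverting this relationship, numerically if necessary.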
