Published by Zoie Lipton. Modified over 3 years ago.

1
**Chi-square, Goodness of fit, and Contingency Tables**

2
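**What is the χ² distribution?**

Basically a distribution of squared differences: the sum of k squared independent standard normal deviates follows a χ² distribution with k degrees of freedom.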
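To see where "squared differences" leads, one can simulate it: summing k squared independent standard normal draws gives a χ² variable with k degrees of freedom. The sketch below uses only the standard library; the function name `sample_chi_square`, the seed, and the sample size are illustrative choices, not part of the slides.

```python
import random

def sample_chi_square(k, rng):
    """Draw one value: the sum of k squared standard normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)   # fixed seed (arbitrary) so runs are repeatable
k = 3                    # degrees of freedom, chosen for illustration
draws = [sample_chi_square(k, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(f"mean of simulated draws with k={k}: {sample_mean:.2f}")
```

The mean of a χ² distribution with k degrees of freedom is k, so the printed mean should land close to 3.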

3
**Useful for detecting categorical differences**

Calculate the χ² test statistic: χ² = Σ (observed − expected)² / expected, summed over all categories.

Degrees of freedom = number of categories − 1.

Look up the χ² value for that degree of freedom and the chosen alpha value. If the test statistic > table value, the result is significant.
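The recipe above can be sketched in a few lines of Python. The die-roll counts are made up for illustration, and the critical value 11.070 is the standard table entry for df = 5 at alpha = 0.05; `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 rolls of a die, tested against a fair die.
observed = [8, 9, 12, 11, 6, 14]
expected = [10] * 6

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1                 # number of categories - 1
CRITICAL_DF5_ALPHA05 = 11.070          # table value for df = 5, alpha = 0.05
significant = statistic > CRITICAL_DF5_ALPHA05
print(f"chi-square = {statistic:.2f}, df = {df}, significant: {significant}")
```

For other degrees of freedom or alpha values, the constant would need to be replaced by the matching table entry.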

4
**Two-sided and one-sided tests**

Two-sided test: find the column corresponding to α/2 in the table for upper critical values and reject the null hypothesis if the test statistic is greater than the tabled value. Use 1 − α/2 in the table for lower critical values and reject the null if the test statistic is less than the tabled value.

Upper one-sided test: find the column corresponding to α in the table for upper critical values. If the test statistic is greater, reject the null.

5
**Also useful for model fitting**

Assume you have fit a model to some data and have some residual errors left over. You want to check whether the residuals are normally distributed. Bin them in a histogram, calculate the proportion of residuals a normal distribution would put in each bin, and compare those expected counts to the observed counts.
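A minimal sketch of that residual check, assuming the residuals are already available as a list of numbers. Here they are simulated, and the four bins with edges at −1, 0, and +1 standard deviations are an arbitrary choice for illustration.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical residuals; in practice these come from the fitted model.
rng = random.Random(1)
residuals = [rng.gauss(0.0, 2.0) for _ in range(200)]

n = len(residuals)
mean = sum(residuals) / n
sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))

# Bin edges in standard-deviation units (an arbitrary choice of 4 bins).
edges = [-math.inf, -1.0, 0.0, 1.0, math.inf]
observed = [0] * (len(edges) - 1)
for r in residuals:
    z = (r - mean) / sd
    for i in range(len(observed)):
        if edges[i] <= z < edges[i + 1]:
            observed[i] += 1
            break

# Expected count per bin if the residuals were exactly normal.
expected = [n * (normal_cdf(edges[i + 1]) - normal_cdf(edges[i]))
            for i in range(len(observed))]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(observed, [round(e, 1) for e in expected], round(statistic, 2))
```

Because the mean and standard deviation are estimated from the same residuals, a common adjustment is to subtract one further degree of freedom per estimated parameter before looking up the table value.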

6
**Model Fitting Example**

Consider a classic genetics experiment. The offspring of a cross between the F1 brassicas were 53 dark green and 11 yellow. If the plants are heterozygous for color, a ratio of 3 dark green to 1 yellow would be expected.

| | Dark Green | Yellow | Total |
|---|---|---|---|
| Observed numbers (O) | 53 | 11 | 64 |
| Expected numbers (E) | 48 | 16 | 64 |
| O − E | 5 | −5 | |
| (O − E)² | 25 | 25 | |
| (O − E)² / E | 25/48 = 0.52 | 25/16 = 1.56 | 2.08 |
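The brassica arithmetic can be reproduced as follows. This is a sketch using the counts from the slide; the p-value line relies on the identity P(χ² > x) = erfc(√(x/2)), which holds only for one degree of freedom.

```python
import math

observed = {"dark green": 53, "yellow": 11}
ratio = {"dark green": 3, "yellow": 1}   # expected 3:1 ratio

total = sum(observed.values())
expected = {k: total * ratio[k] / sum(ratio.values()) for k in observed}
statistic = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# For df = 1 only: P(chi-square > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2.0))
print(f"expected = {expected}, chi-square = {statistic:.2f}, p = {p_value:.3f}")
```

The statistic is below 3.841, the tabled critical value for df = 1 at α = 0.05, so these counts do not contradict the 3:1 ratio.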

7
**Compound Hypotheses and Directionality**

With multiple categories, compound hypotheses are possible.

H0: Pr(cat 1) = 0.25, Pr(cat 2) = and Pr(cat 3) = 0.75

HA: at least one of the above is not the case.

Where there are only 2 categories, a “directional alternative” is possible.

8
**Directional Alternatives**

Only in the case of “dichotomous variables”: two categories, effectively.

Step 1: Check whether the observed trend lies in the direction predicted by HA. If not, the P-value is > 0.5 by necessity. If so, proceed to step 2.

Step 2: The P-value is half what it would be if HA were non-directional.
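The two steps can be sketched as a small function. It reuses the brassica counts (53 and 11 observed, 48 and 16 expected) with a directional alternative invented for illustration; `first_should_exceed` is a hypothetical flag, and the erfc identity for the P-value is valid only for df = 1.

```python
import math

def directional_p_value(observed, expected, first_should_exceed=True):
    """P-value for a two-category test with a directional alternative.

    `first_should_exceed` is a hypothetical flag: True means H_A predicts
    the first category's count lies above its expected value.
    """
    if len(observed) != 2 or len(expected) != 2:
        raise ValueError("directional alternatives need exactly two categories")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_nondirectional = math.erfc(math.sqrt(statistic / 2.0))  # valid for df = 1
    in_predicted_direction = (observed[0] > expected[0]) == first_should_exceed
    if in_predicted_direction:
        return p_nondirectional / 2.0      # step 2: halve the P-value
    return 1.0 - p_nondirectional / 2.0    # wrong direction: P is at least 0.5

p_right = directional_p_value([53, 11], [48, 16], first_should_exceed=True)
p_wrong = directional_p_value([53, 11], [48, 16], first_should_exceed=False)
print(round(p_right, 3), round(p_wrong, 3))
```

When the data fall on the wrong side of the prediction, the function returns one minus the halved value, which is why that P-value cannot drop below 0.5.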

9
**Directional Alternative Example**

Two football teams' records are compared against the average number of wins by an NFL team per year, 9. Team 1 won 14 games this year, and several players were caught doping with HGF. Team 2 won 11 games this year and tested clean. Is there evidence that doping increased the number of wins by team 1?

10
**Contingency Tables**

Use the χ² test statistic as above, but calculate the expected value for each cell in the table from E = (row total) × (column total) / (grand total). For a 2x2 table, df = 1.

11
**2x2 Contingency Tables**

Can indicate either:

Two independent samples with a dichotomous observed variable

One sample with two dichotomous observed variables

| | Female | Male | Row total |
|---|---|---|---|
| HIV test | 9 | 8 | 17 |
| No HIV test | 52 | 51 | 103 |
| Column total | 61 | 59 | 120 |
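As a sketch, the expected-count rule E = (row total) × (column total) / (grand total) applied to the HIV-test table looks like this. The variable names are illustrative, and the erfc shortcut for the p-value applies only because a 2x2 table has df = 1.

```python
import math

table = [[9, 8],      # HIV test:    female, male
         [52, 51]]    # no HIV test: female, male

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# E = (row total) * (column total) / (grand total) for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
statistic = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
                for i in range(2) for j in range(2))

df = 1                                            # 2x2 table
p_value = math.erfc(math.sqrt(statistic / 2.0))   # valid for df = 1 only
print(f"chi-square = {statistic:.4f}, df = {df}, p = {p_value:.2f}")
```

The statistic comes out far below the df = 1 critical value of 3.841, so this table gives no evidence of a difference between the sexes.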

12
**Relation to Independence of data**

You can interpret contingency tables in terms of conditional probabilities:

Pr(HIV test | female) = 9/61

Pr(female | HIV test) = 9/17

The test becomes H0: the likelihood of taking an HIV test is independent of sex.

| | Female | Male | Row total |
|---|---|---|---|
| HIV test | 9 | 8 | 17 |
| No HIV test | 52 | 51 | 103 |
| Column total | 61 | 59 | 120 |
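The conditional probabilities can be read off the table programmatically. This sketch uses exact fractions; the `counts` layout and the `total` helper are illustrative names, not anything from the slides.

```python
from fractions import Fraction

counts = {("test", "female"): 9, ("test", "male"): 8,
          ("no test", "female"): 52, ("no test", "male"): 51}

def total(test=None, sex=None):
    """Sum the counts matching the given test status and/or sex."""
    return sum(n for (t, s), n in counts.items()
               if (test is None or t == test) and (sex is None or s == sex))

p_test_given_female = Fraction(counts[("test", "female")], total(sex="female"))
p_female_given_test = Fraction(counts[("test", "female")], total(test="test"))
p_test_given_male = Fraction(counts[("test", "male")], total(sex="male"))
p_test = Fraction(total(test="test"), total())

print(p_test_given_female, p_female_given_test, p_test_given_male, p_test)
```

Under H0, Pr(HIV test | female) and Pr(HIV test | male) would both equal the overall Pr(HIV test); the χ² test asks whether the observed gap between them is larger than chance would produce.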

13
**r x k contingency tables**

Same as above, but degrees of freedom = (r-1)*(k-1).
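A sketch of the general case, assuming the table is given as a list of rows of counts. The 2 x 3 example table is made up for illustration, and `contingency_chi_square` is a hypothetical helper name.

```python
def contingency_chi_square(table):
    """Return (statistic, df) for an r x k table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2 x 3 table of counts, made up for illustration.
example = [[20, 30, 50],
           [30, 30, 40]]
statistic, df = contingency_chi_square(example)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

The same function handles the 2x2 case, where the degrees-of-freedom formula reduces to (2 − 1) × (2 − 1) = 1.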

14
**Corrections to the Chi-Squared Test**

The χ² distribution is continuous, but the chi-squared test is applied to discrete data: counts are appropriate, continuous measurements are not. Approximating the discrete distribution of the test statistic with a continuous one distorts the p-value and may make false positives more likely.

Frank Yates proposed a correction to the chi-squared formula: subtract 0.5 from each absolute difference |O − E| before squaring. This tends to increase the p-value and makes the test more conservative, making false positives less likely. However, the test may now be *too* conservative.

Additionally, the chi-squared test should not be used when the expected value in a cell is < 5. It is, at times, not inappropriate to pad an empty cell with a small value, though, as one can only assume the result would be more significant with no value there.
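To illustrate the effect of Yates's correction, the sketch below computes the statistic with and without it for the brassica counts from the genetics example (53 and 11 observed, 48 and 16 expected). Clamping the shrunken difference at zero is an assumption made here, so that the correction cannot push a tiny difference past zero.

```python
def chi_square(observed, expected, yates=False):
    """Chi-square statistic, optionally with Yates's continuity correction."""
    total = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(diff - 0.5, 0.0)   # shrink each |O - E| by 0.5, not below 0
        total += diff ** 2 / e
    return total

observed, expected = [53, 11], [48, 16]   # brassica counts from the genetics example
uncorrected = chi_square(observed, expected)
corrected = chi_square(observed, expected, yates=True)
print(f"uncorrected = {uncorrected:.4f}, corrected = {corrected:.4f}")
```

The corrected statistic is smaller, which means a larger p-value and a more conservative test.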
