The Chi-square Statistic

Presentation on theme: "The Chi-square Statistic"— Presentation transcript:

1 The Chi-square Statistic

2 Calculating Probabilities

3 Probability
Probability of an event happening = (number of ways it can happen) / (total number of outcomes)

4 Coin Toss Example
A balanced coin flipped in an unbiased way results in heads or tails, each with an equal 50% chance. Chance of heads = 1 of 2 possible outcomes = 1/2. What if the last 4 coin flips were heads? What is the chance of the next flip resulting in tails?
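One way to check the question on this slide is to list every equally likely sequence of five fair flips and keep only those that start with four heads. This is a minimal sketch; the function name `chance_of_tails_after_heads` is made up for illustration. Because the flips are independent, the answer stays at one half.

```python
from fractions import Fraction
from itertools import product


def chance_of_tails_after_heads(run_length):
    """Chance that the next flip is tails, given `run_length` heads in a row."""
    # Every sequence of run_length + 1 fair flips is equally likely.
    sequences = list(product("HT", repeat=run_length + 1))
    # Keep only the sequences that start with the observed run of heads.
    matching = [s for s in sequences if all(f == "H" for f in s[:run_length])]
    # Count how many of those end in tails.
    tails_next = [s for s in matching if s[-1] == "T"]
    return Fraction(len(tails_next), len(matching))


print(chance_of_tails_after_heads(4))  # 1/2
```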

5 Probability of Failure
Know the odds! For example, when rolling a die, the chance of your number coming up is 1/6 (about 16.7%). More importantly, the chance that a number you didn't pick shows up is 1 - 1/6 = 5/6 (about 83.3%).
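The die odds above can be reproduced with exact fractions. This is a small sketch; the helper name `probability` is an assumption introduced here, and it simply applies the "ways it can happen over total outcomes" rule.

```python
from fractions import Fraction


def probability(ways, total_outcomes):
    """Number of ways an event can happen divided by the total number of outcomes."""
    return Fraction(ways, total_outcomes)


p_hit = probability(1, 6)   # your number comes up
p_miss = 1 - p_hit          # any of the other five numbers comes up

print(p_hit, f"{float(p_hit):.1%}")    # 1/6 16.7%
print(p_miss, f"{float(p_miss):.1%}")  # 5/6 83.3%
```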

6 The Chi-square Test
The Chi-square test checks whether the observed results match the expected results. As with the die rolls: if you rolled a die 100 times, did you indeed observe each number about 1/6 of the time? You can compare the observed values against the expected values in the test to see whether the die is fair or loaded.

7 Goodness of fit
This test is used to decide whether there is a significant difference between the observed (experimental) values and the expected (theoretical) values.

8 Goodness of Fit

9 Free from Assumptions
The chi-square goodness of fit test depends only on the set of observed and expected frequencies and the degrees of freedom. It does not need any assumption about the distribution of the parent population from which the samples are taken. Since it does not involve population parameters such as the mean or variance, it is termed a non-parametric or distribution-free test. It can be applied to a wide range of sample sizes, but it is not fully sample-size independent: the chi-square approximation becomes unreliable when expected frequencies are small (a common rule of thumb asks for an expected count of at least 5 in each category). It is generally performed on discrete (categorical) data.

10 It is all about expectations
O_i = an observed frequency (i.e. count) for measurement i
E_i = an expected (theoretical) frequency for measurement i, asserted by the null hypothesis

11 Another way to look at it
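The value of the Chi-squared statistic = the sum, over all categories, of the squared difference between the observed and expected values divided by the expected value: Χ^2 = Σ [ (O_i - E_i)^2 / E_i ]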
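The sum described above can be sketched in a few lines. The function name `chi_square_statistic` and the die counts are assumptions made for illustration; the counts are hypothetical, not measured data.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1..6 after 100 rolls (illustrative, not real data).
observed = [15, 18, 16, 17, 19, 15]
# A fair die is expected to show each face 1/6 of the time.
expected = [sum(observed) / 6] * 6

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 2))  # 0.8
```

A value this small means the hypothetical counts sit close to what a fair die would produce.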

12 Expected Value
E_i = N * (F(Yu) - F(Yl))
F = the cumulative distribution function for the distribution being tested
Yu = the upper limit for class i (the largest value that falls in the class)
Yl = the lower limit for class i (the smallest value that falls in the class)
N = the sample size
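Assuming the usual goodness-of-fit relation E_i = N * (F(Yu) - F(Yl)), the expected frequencies follow directly from the cumulative distribution function. The sketch below uses a standard normal distribution as the distribution being tested; the class limits, the value of N, and the function names are all hypothetical choices made for illustration.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function F of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_frequencies(cdf, class_limits, n):
    """Expected count per class: n * (F(upper limit) - F(lower limit))."""
    return [n * (cdf(upper) - cdf(lower)) for lower, upper in class_limits]


# Hypothetical classes covering the whole real line, and a hypothetical N.
classes = [(-math.inf, -1.0), (-1.0, 0.0), (0.0, 1.0), (1.0, math.inf)]
expected = expected_frequencies(normal_cdf, classes, n=100)
print([round(e, 2) for e in expected])  # [15.87, 34.13, 34.13, 15.87]
```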

13 Hypothesis testing
Choose a level of alpha, usually 0.05.
This means accepting a 5% chance of rejecting the null hypothesis when it is actually true; equivalently, the test is carried out at the 95% confidence level.

14 Degrees of Freedom = Number of groups – 1
Example: The number of cubs delivered by a population of bears in the wild is tested to see whether every litter size from 0 to 3 cubs is equally likely (the null hypothesis). (N = 50 females)

Number of cubs   0      1      2      3
Observed         1      5      35     9
Expected         12.5   12.5   12.5   12.5

Degrees of Freedom = Number of groups – 1
df = 4 – 1 = 3

15 CHI-SQUARE DISTRIBUTION TABLE

16 Decision Rule
Based on the alpha and the degrees of freedom, look up the critical value in the table. For our example, alpha = 0.05 and df = 3 give a critical value of 7.82. If chi-square is greater than 7.82, reject the null hypothesis that each number of cubs (0 to 3) is equally likely.
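The table lookup can be mimicked with a small dictionary. This is only a sketch: the values are the standard upper-tail critical values at alpha = 0.05 for 1 to 5 degrees of freedom, and the names `CRITICAL_VALUES_05` and `reject_null` are made up for illustration.

```python
# Upper-tail critical values of the chi-square distribution at alpha = 0.05,
# copied from a standard chi-square table for df = 1..5.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def reject_null(chi_square, df):
    """Return True when the statistic exceeds the alpha = 0.05 critical value."""
    if df not in CRITICAL_VALUES_05:
        raise ValueError(f"no critical value stored for df={df}")
    return chi_square > CRITICAL_VALUES_05[df]


print(reject_null(8.0, df=3))  # True: 8.0 > 7.815
print(reject_null(5.0, df=3))  # False: 5.0 < 7.815
```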

17 Calculate the value

Number of cubs   0      1      2      3
Observed         1      5      35     9
Expected         12.5   12.5   12.5   12.5

Chi-square = (1-12.5)^2/12.5 + (5-12.5)^2/12.5 + (35-12.5)^2/12.5 + (9-12.5)^2/12.5 = 10.58 + 4.50 + 40.50 + 0.98 = 56.56
Since 56.56 > 7.82, we reject the null hypothesis that each number of cubs from 0 to 3 is equally likely.
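The arithmetic for the bear example can be checked with a short script. It assumes the observed counts are 1, 5, 35 and 9 females for 0, 1, 2 and 3 cubs, which is the reading implied by the terms of the calculation and by N = 50.

```python
# Observed females by litter size 0, 1, 2, 3 (as read from the slide; N = 50).
observed = [1, 5, 35, 9]
n = sum(observed)
# Null hypothesis: all four litter sizes are equally likely.
expected = [n / len(observed)] * len(observed)

terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(terms)
df = len(observed) - 1

print([round(t, 2) for t in terms])  # [10.58, 4.5, 40.5, 0.98]
print(round(chi_square, 2), df)       # 56.56 3
```

Most of the statistic comes from the 2-cub category, where 35 litters were observed against 12.5 expected.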

18 Interpret the result Since we rejected the null hypothesis, what conclusions (inferences) can we come to?

19 Two-Way Table Method

Observed        Column 1               Column 2               Row Totals
Row 1                                                         Row 1 Total (R1T)
Row 2                                                         Row 2 Total (R2T)
Column Totals   Column 1 Total (C1T)   Column 2 Total (C2T)   Grand Total (GT)

Each value in the expected values table is calculated by multiplying the row total by the column total and dividing by the grand total for that cell's location.

Expected   Column 1     Column 2
Row 1      R1T*C1T/GT   R1T*C2T/GT
Row 2      R2T*C1T/GT   R2T*C2T/GT
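The expected-values rule (row total times column total, divided by the grand total) can be sketched as a small function. The name `expected_counts` and the 2x2 counts below are hypothetical and serve only to exercise the rule.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in column_totals]
        for r in row_totals
    ]


# Hypothetical 2x2 table of counts, used only to exercise the function.
table = [[10, 30],
         [20, 40]]
print(expected_counts(table))  # [[12.0, 28.0], [18.0, 42.0]]
```

The same function works for tables with more rows or columns, since it only relies on the row and column totals.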

20 2-Way Chi-Square Conditions
Simple Random Samples
Categorical Data
Degrees of Freedom equals (number of rows minus 1) times (number of columns minus 1), or DF = (r - 1) * (c - 1)
Test Statistic is calculated as before, but this time summed over each cell of the table: Χ^2 = Σ [ (O_r,c - E_r,c)^2 / E_r,c ]
P-value is the probability, assuming the null hypothesis is true, of observing a sample statistic at least as extreme as the test statistic.

21 Two-Way Table Example

Observed        Democrat   Republican   Row Totals
Male            20         30           50
Female          30         20           50
Column Totals   50         50           100

Each value in the expected values table is calculated by multiplying the row total (50) by the column total (50) and dividing by the grand total (100) for that cell's location.

Expected   Democrat   Republican
Male       25         25
Female     25         25

Calculating the Chi-square statistic: ((20-25)^2/25) + ((30-25)^2/25) + ((30-25)^2/25) + ((20-25)^2/25), or (25/25) + (25/25) + (25/25) + (25/25), or 1 + 1 + 1 + 1, or 4.
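The whole two-way calculation can be sketched end to end. The function name `two_way_chi_square` is an assumption, and the Female row is read as 30 Democrats and 20 Republicans, which is what the terms of the calculation on this slide imply.

```python
def two_way_chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count for this cell: row total * column total / grand total.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(column_totals) - 1)
    return statistic, df


# Rows: Male, Female. Columns: Democrat, Republican.
observed = [[20, 30],
            [30, 20]]
print(two_way_chi_square(observed))  # (4.0, 1)
```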

22 Compare Chi-square to table
For the example, Chi-square = 4 and the degrees of freedom are 1. Since 4 > 3.841 (the critical value for alpha = 0.05 and df = 1), we can reject, at the 0.05 significance level, the null hypothesis that political party is independent of gender.
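Instead of a table lookup, the p-value for this example can be computed directly. The sketch below only covers 1 degree of freedom, where the chi-square statistic is the square of a standard normal variable; the function name `p_value_df1` is made up for illustration, and other degrees of freedom would need a statistics library.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is the square of a standard normal variable,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2.0))


p = p_value_df1(4.0)
print(round(p, 4))  # 0.0455
print(p < 0.05)     # True
```

A p-value below 0.05 agrees with the table comparison: the statistic of 4 lies beyond the critical value of 3.841.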

