The Chi-square Statistic


Calculating Probabilities

Probability
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Coin Toss Example
A balanced coin flipped in an unbiased way lands heads or tails, each with an equal 50% chance.
Chance of heads = 1 of 2 possible outcomes = 1/2
If the last 4 coin flips were all heads, what is the chance that the next flip lands tails? (Still 1/2: each flip is independent of the ones before it.)
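The coin-toss question can be checked by brute force. The sketch below assumes a fair coin and simply lists every equally likely sequence of 5 flips; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Enumerate every equally likely sequence of 5 fair coin flips.
sequences = list(product("HT", repeat=5))

# Keep only the sequences whose first 4 flips were all heads.
after_four_heads = [s for s in sequences if s[:4] == ("H", "H", "H", "H")]

# Among those, count how many show tails on the 5th flip.
tails_next = sum(1 for s in after_four_heads if s[4] == "T")

p_tails_given_streak = Fraction(tails_next, len(after_four_heads))
print(p_tails_given_streak)  # 1/2
```

The streak of heads does not change the odds of the next flip.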

Probability of Failure
Know the odds!
Example: when rolling a die, the chance of your number coming up is 1/6 (about 16.7%).
More importantly, the chance that a number you didn't pick comes up is 1 - 1/6 = 5/6 (about 83.3%).
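The die arithmetic can be reproduced with exact fractions. This is a small sketch assuming a fair six-sided die; `p_hit` and `p_miss` are names chosen here for illustration.

```python
from fractions import Fraction

p_hit = Fraction(1, 6)   # chance that your chosen number comes up on one roll
p_miss = 1 - p_hit       # chance that any other number comes up

print(p_hit, round(float(p_hit) * 100, 1))    # 1/6 16.7
print(p_miss, round(float(p_miss) * 100, 1))  # 5/6 83.3
```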

The Chi-square Test
The Chi-square test checks whether the observed results match the expected results.
As with the die rolls: if you rolled a die 100 times, did each number indeed come up about 1/6 of the time?
You can compare the observed values with the expected values in the test to see whether the die is fair or is faulty or loaded.

Goodness of fit
This test is used to decide whether there is any difference between the observed (experimental) values and the expected (theoretical) values.

Goodness of Fit

Free from Assumptions
The Chi-square goodness of fit test depends only on the observed frequencies, the expected frequencies, and the degrees of freedom.
It needs no assumption about the distribution of the parent population from which the samples are taken.
Because it does not involve any population parameters or characteristics, it is termed a non-parametric or distribution-free test.
It can be applied to samples of many sizes, but the observations must be independent and the Chi-square approximation is reliable only when the expected counts are not too small (a common rule of thumb is at least 5 per category).
It is generally performed on discrete (count) data.

It is all about expectations
O_i = an observed frequency (i.e. count) for measurement i
E_i = an expected (theoretical) frequency for measurement i, asserted by the null hypothesis

Another way to look at it
The value of the Chi-squared statistic = the sum of (squares of the differences) / (expected values)
χ² = Σ (O_i - E_i)² / E_i
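The statistic can be written as a short function. In the sketch below, `chi_square_statistic` is a name chosen for illustration, and the die counts are hypothetical numbers made up for the example, not data from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1..6 after 100 rolls of a die (made-up data).
observed_rolls = [15, 18, 16, 17, 14, 20]

# A fair die predicts the same expected count for every face.
expected_rolls = [sum(observed_rolls) / 6] * 6

statistic = chi_square_statistic(observed_rolls, expected_rolls)
print(round(statistic, 2))  # about 1.4 for these made-up counts
```

A small value like this means the observed counts sit close to what a fair die predicts.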

Expected Value
E_i = N × (F(Y_u) - F(Y_l))
F = the cumulative distribution function for the distribution being tested
Y_u = the upper limit for class i (the largest value counted in that class)
Y_l = the lower limit for class i (the smallest value counted in that class)
N = the sample size
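With F, the class limits, and N defined as above, the expected count for a class is commonly computed as N times the probability that the tested distribution assigns to that class, that is N × (F(Y_u) - F(Y_l)). The sketch below assumes a standard normal distribution and a sample of 100 purely for illustration; `expected_count` is a hypothetical helper name.

```python
from statistics import NormalDist


def expected_count(cdf, lower, upper, n):
    """Expected number of observations in [lower, upper] out of n."""
    return n * (cdf(upper) - cdf(lower))


# Illustrative assumption: test against a standard normal distribution.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
sample_size = 100

# Expected count in the class from -1 to 1 (roughly 68 of 100 observations).
within_one_sd = expected_count(standard_normal.cdf, -1.0, 1.0, sample_size)
print(round(within_one_sd, 1))
```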

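Hypothesis testing
Choose a significance level, alpha: usually 0.05.
This means accepting a 5% chance of rejecting the null hypothesis when it is actually true; a result that passes is often described as significant at the 95% confidence level.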

Degrees of Freedom
Degrees of Freedom = Number of groups - 1
Example: the number of cubs delivered by bears in a wild population (N = 50 females) is tested against the null hypothesis that every litter size from 0 to 3 cubs is equally likely.

Number of cubs    0      1      2      3
Observed          1      5      35     9
Expected          12.5   12.5   12.5   12.5

df = 4 - 1 = 3
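The expected count of 12.5 and the 3 degrees of freedom both follow from the group count. The sketch below assumes the four litter sizes 0 to 3, with the observed counts 1, 5, 35 and 9 that appear in the terms of the worked calculation for this example.

```python
# Observed number of females per litter size (0 to 3 cubs).
observed = {0: 1, 1: 5, 2: 35, 3: 9}

n = sum(observed.values())        # total number of females
groups = len(observed)            # number of categories

# Null hypothesis: every litter size is equally likely.
expected_per_group = n / groups
degrees_of_freedom = groups - 1

print(n, expected_per_group, degrees_of_freedom)  # 50 12.5 3
```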

CHI-SQUARE DISTRIBUTION TABLE

Decision Rule
Based on alpha and the degrees of freedom, look up the critical value in the table.
For our example, alpha = 0.05 and df = 3 give a critical value of 7.82.
If the Chi-square statistic is greater than 7.82, reject the null hypothesis that every litter size from 0 to 3 cubs is equally likely.
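The table lookup can be cross-checked without a printed table. The sketch below computes the upper-tail probability of a Chi-square distribution for whole-number degrees of freedom using only the Python standard library; `chi2_sf` is a name chosen here, and the recurrence is one standard way to evaluate the tail, not necessarily how the table was produced.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for the two base cases.
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half))   # df = 1
    else:
        k, tail = 2, math.exp(-half)              # df = 2
    # Step up two degrees of freedom at a time.
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


alpha = 0.05
tail_at_critical = chi2_sf(7.82, 3)
print(round(tail_at_critical, 3))  # close to alpha, as the table implies
```

A statistic beyond 7.82 therefore has a tail probability below 0.05, which is the condition for rejecting the null hypothesis.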

Calculate the value

Number of cubs    0      1      2      3
Observed          1      5      35     9
Expected          12.5   12.5   12.5   12.5

Chi-square = (1 - 12.5)²/12.5 + (5 - 12.5)²/12.5 + (35 - 12.5)²/12.5 + (9 - 12.5)²/12.5
           = 10.58 + 4.5 + 40.5 + 0.98
           = 56.56
Since 56.56 > 7.82, we reject the null hypothesis that every litter size from 0 to 3 cubs is equally likely.
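The bear-cub arithmetic can be replayed term by term. The sketch uses the four observed counts from the calculation (1, 5, 35, 9), the expected count of 12.5 per group, and the critical value 7.82 from the decision rule.

```python
observed = [1, 5, 35, 9]          # females with 0, 1, 2, 3 cubs
expected = [12.5] * 4             # equal expectation under the null hypothesis
critical_value = 7.82             # alpha = 0.05, df = 3

# One (O - E)^2 / E term per litter size.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(terms)

print([round(t, 2) for t in terms])  # [10.58, 4.5, 40.5, 0.98]
print(round(chi_square, 2))          # 56.56
print(chi_square > critical_value)   # True, so the null hypothesis is rejected
```

Most of the total comes from the 2-cub group, where 35 females were observed against 12.5 expected.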

Interpret the result
Since we rejected the null hypothesis, what conclusions (inferences) can we come to?

Two-Way Table Method

Observed         Column 1               Column 2               Row Totals
Row 1            O1,1                   O1,2                   Row 1 Total (R1T)
Row 2            O2,1                   O2,2                   Row 2 Total (R2T)
Column Totals    Column 1 Total (C1T)   Column 2 Total (C2T)   Grand Total (GT)

Each value in the expected values table is calculated by multiplying the row total by the column total and dividing by the grand total, for each cell's location.

Expected   Column 1      Column 2
Row 1      R1T*C1T/GT    R1T*C2T/GT
Row 2      R2T*C1T/GT    R2T*C2T/GT
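The row-total times column-total rule can be sketched as a small function. `expected_counts` is a hypothetical helper name, and the 2 x 2 counts in the usage lines are made-up numbers for illustration only.

```python
def expected_counts(observed):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Made-up observed table: rows are Row 1 and Row 2, columns are Column 1 and 2.
example_observed = [[10, 20],
                    [30, 40]]

example_expected = expected_counts(example_observed)
print(example_expected)  # [[12.0, 18.0], [28.0, 42.0]]
```

The same function works for tables with more than two rows or columns, since it only relies on the totals.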

2-Way Chi-Square
Conditions: simple random samples; categorical data.
Degrees of freedom equal the number of rows minus 1, times the number of columns minus 1: DF = (r - 1) * (c - 1)
The test statistic is calculated as before, but this time summed over every cell of the table: χ² = Σ [ (O_r,c - E_r,c)² / E_r,c ]
The P-value is the probability, assuming the null hypothesis is true, of observing a statistic at least as extreme as the test statistic.

Two-Way Table Example

Observed         Democrat   Republican   Row Totals
Male             20         30           50
Female           30         20           50
Column Totals    50         50           100

Each value in the expected values table is calculated by multiplying the row total (50) by the column total (50) and dividing by the grand total (100), for each cell's location.

Expected   Democrat   Republican
Male       25         25
Female     25         25

Calculating the Chi-square statistic:
(20-25)²/25 + (30-25)²/25 + (30-25)²/25 + (20-25)²/25 = 25/25 + 25/25 + 25/25 + 25/25 = 1 + 1 + 1 + 1 = 4
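The whole example can be run end to end. The sketch below takes the male counts (20, 30) from the table and the female counts (30, 20) from the terms of the calculation; variable names are illustrative.

```python
# Rows: Male, Female. Columns: Democrat, Republican.
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]          # [50, 50]
col_totals = [sum(col) for col in zip(*observed)]    # [50, 50]
grand_total = sum(row_totals)                        # 100

# Expected count for each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over every cell of the table.
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)                        # [[25.0, 25.0], [25.0, 25.0]]
print(chi_square, degrees_of_freedom)  # 4.0 1
```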

Compare Chi-square to table
For the example, Chi-square = 4 and the degrees of freedom are (2 - 1) * (2 - 1) = 1.
Since 4 > 3.841, the critical value for alpha = 0.05 and df = 1, we can reject the null hypothesis that political party is independent of gender at the 0.05 significance level.
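For 1 degree of freedom the tail probability has a simple closed form, so the comparison with 3.841 can also be expressed as a P-value. This is a sketch using only the standard library; `p_value_df1` is a hypothetical helper name.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


alpha = 0.05
p_value = p_value_df1(4.0)

print(round(p_value, 4))   # roughly 0.0455
print(p_value < alpha)     # True: reject independence at the 0.05 level
```

The P-value is only slightly below 0.05, so the evidence against independence is modest even though the test rejects.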