The Chi-Square Test for Association

Slides:



Advertisements
Similar presentations
Chi-Square Test A fundamental problem is genetics is determining whether the experimentally determined data fits the results expected from theory (i.e.
Advertisements

CHI-SQUARE(X2) DISTRIBUTION
Chi-Square Test A fundamental problem is genetics is determining whether the experimentally determined data fits the results expected from theory (i.e.
Basic Statistics The Chi Square Test of Independence.
Contingency Tables (cross tabs)  Generally used when variables are nominal and/or ordinal Even here, should have a limited number of variable attributes.
Bivariate Analysis Cross-tabulation and chi-square.
Hypothesis Testing IV Chi Square.
Chapter 13: The Chi-Square Test
Chi Square Analyses: Comparing Frequency Distributions.
Chi-square Basics. The Chi-square distribution Positively skewed but becomes symmetrical with increasing degrees of freedom Mean = k where k = degrees.
© 2010 Pearson Prentice Hall. All rights reserved The Chi-Square Test of Independence.
Making Inferences for Associations Between Categorical Variables: Chi Square Chapter 12 Reading Assignment pp ; 485.
PSY 340 Statistics for the Social Sciences Chi-Squared Test of Independence Statistics for the Social Sciences Psychology 340 Spring 2010.
CHI-SQUARE GOODNESS OF FIT TEST u A nonparametric statistic u Nonparametric: u does not test a hypothesis about a population value (parameter) u requires.
Ch 15 - Chi-square Nonparametric Methods: Chi-Square Applications
PSY 307 – Statistics for the Behavioral Sciences Chapter 19 – Chi-Square Test for Qualitative Data Chapter 21 – Deciding Which Test to Use.
Lecture 14 Goodness of Fit and Chi Square
Chi-Square Test A fundamental problem in genetics is determining whether the experimentally determined data fits the results expected from theory (i.e.
+ Quantitative Statistics: Chi-Square ScWk 242 – Session 7 Slides.
11.4 Hardy-Wineberg Equilibrium. Equation - used to predict genotype frequencies in a population Predicted genotype frequencies are compared with Actual.
Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1 Understandable Statistics S eventh Edition By Brase and Brase Prepared by: Lynn Smith.
Chi Square AP Biology.
1 Psych 5500/6500 Chi-Square (Part Two) Test for Association Fall, 2008.
Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 11-1 Chapter 11 Chi-Square Tests Business Statistics, A First Course 4 th Edition.
Chapter 11: Applications of Chi-Square. Count or Frequency Data Many problems for which the data is categorized and the results shown by way of counts.
Tests of Significance June 11, 2008 Ivan Katchanovski, Ph.D. POL 242Y-Y.
Chi-Square as a Statistical Test Chi-square test: an inferential statistics technique designed to test for significant relationships between two variables.
Chapter 9: Non-parametric Tests n Parametric vs Non-parametric n Chi-Square –1 way –2 way.
1 Chi-Square Heibatollah Baghi, and Mastee Badii.
Copyright © 2009 Pearson Education, Inc LEARNING GOAL Interpret and carry out hypothesis tests for independence of variables with data organized.
Two Way Tables and the Chi-Square Test ● Here we study relationships between two categorical variables. – The data can be displayed in a two way table.
Chi-Square Test A fundamental problem in genetics is determining whether the experimentally determined data fits the results expected from theory. How.
Slide 26-1 Copyright © 2004 Pearson Education, Inc.
Chi square analysis Just when you thought statistics was over!!
Copyright © 2010 Pearson Education, Inc. Slide
Section 10.2 Independence. Section 10.2 Objectives Use a chi-square distribution to test whether two variables are independent Use a contingency table.
© Copyright McGraw-Hill CHAPTER 11 Other Chi-Square Tests.
Reasoning in Psychology Using Statistics Psychology
Chapter Outline Goodness of Fit test Test of Independence.
The table shows a random sample of 100 hikers and the area of hiking preferred. Are hiking area preference and gender independent? Hiking Preference Area.
Chapter 13. The Chi Square Test ( ) : is a nonparametric test of significance - used with nominal data -it makes no assumptions about the shape of the.
Chapter 14 – 1 Chi-Square Chi-Square as a Statistical Test Statistical Independence Hypothesis Testing with Chi-Square The Assumptions Stating the Research.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 12 Tests of Goodness of Fit and Independence n Goodness of Fit Test: A Multinomial.
Chapter 13 Understanding research results: statistical inference.
Did Mendel fake is data? Do a quick internet search and can you find opinions that support or reject this point of view. Does it matter? Should it matter?
ANOVA Knowledge Assessment 1. In what situation should you use ANOVA (the F stat) instead of doing a t test? 2. What information does the F statistic give.
The Chi Square Equation Statistics in Biology. Background The chi square (χ 2 ) test is a statistical test to compare observed results with theoretical.
Statistics 300: Elementary Statistics Section 11-3.
Chi-Square Chapter 14. Chi Square Introduction A population can be divided according to gender, age group, type of personality, marital status, religion,
Section 10.2 Objectives Use a contingency table to find expected frequencies Use a chi-square distribution to test whether two variables are independent.
CHI SQUARE DISTRIBUTION. The Chi-Square (  2 ) Distribution The chi-square distribution is the probability distribution of the sum of several independent,
Copyright © 2009 Pearson Education, Inc LEARNING GOAL Interpret and carry out hypothesis tests for independence of variables with data organized.
Chi-Square Test A fundamental problem is genetics is determining whether the experimentally determined data fits the results expected from theory (i.e.
Basic Statistics The Chi Square Test of Independence.
The Chi-square Statistic
Test of independence: Contingency Table
Chi-Square (Association between categorical variables)
Chi-Square hypothesis testing
Chi-square Basics.
Chapter 9: Non-parametric Tests
Chi-Square Test A fundamental problem is genetics is determining whether the experimentally determined data fits the results expected from theory (i.e.
Hypothesis Testing Review
Qualitative data – tests of association
Different Scales, Different Measures of Association
Contingency Tables (cross tabs)
Parametric versus Nonparametric (Chi-square)
UNIT V CHISQUARE DISTRIBUTION
S.M.JOSHI COLLEGE, HADAPSAR
Chi-Square Test A fundamental problem in Science is determining whether the experiment data fits the results expected. How can you tell if an observed.
Hypothesis Testing - Chi Square
Presentation transcript:

The Chi-Square Test for Association Math 137 Fall‘13 L. Burger

-The Chi Square Test A statistical method used to determine goodness of fit Goodness of fit refers to how close the observed data are to those predicted from a hypothesis Note: The chi square test does not prove that a hypothesis is correct It evaluates to what extent the data and the hypothesis have a good fit

-Determine The Hypothesis: Whether There is an Association or Not Ho : The two variables are independent (not matching up well!) Ha : The two variables are associated (variables matching good --- called a ‘good fit!’)

Understanding the Chi-Square Distribution You see that there is a range here: if the results were perfect you get a chi-square value of 0 (because obs = exp). This rarely happens: most experiments give a small chi-square value (the hump in the graph). Note that all the values are greater than 0: that's because we squared the (obs - exp) term: squaring always gives a non-negative number. Sometimes you get really wild results, with obs very different from exp: the long tail on the graph. Really odd things occasionally do happen by chance alone (for instance, you might win the lottery).

Critical Chi-Square Critical values for chi-square are found on tables, sorted by degrees of freedom and probability levels. Be sure to use p = 0.05. If your calculated chi-square value is greater than the critical value from the table, you “reject the null hypothesis”. If your chi-square value is less than the critical value, you “fail to reject” the null hypothesis (that is, you accept that your theory about the expected ratio is correct).

-Calculating Test Statistics Contrasts observed frequencies in each cell of a contingency table with expected frequencies. The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true ( i.e. the nominal variables are unrelated). Expected frequency of two unrelated events is product of the row and column frequency divided by number of cases. Fe= Fr Fc / N Mean difference between pairs of values

-Calculating Test Statistics Mean difference between pairs of values

4. Calculating Test Statistics Observed frequencies Expected frequency Mean difference between pairs of values Expected frequency

-Determine Degrees of Freedom df = (R-1)(C-1) Number of levels in column variable Number of levels in row variable

-Compare computed test statistic against a tabled/critical value The computed value of the Pearson chi- square statistic is compared with the critical value to determine if the computed value is improbable The critical tabled values are based on sampling distributions of the Pearson chi-square statistic If calculated 2 is greater than 2 table value, reject Ho

Example 1: One-dimensional Suppose we want to know how people in a particular area will vote in general and go around asking them. How will we go about seeing what’s really going on? Republican Democrat Other 20 30 10

-Determine Hypotheses Ho : There is no difference between what is observed and what is expected. 𝐻 𝑎 : There is an association between the observed and expected frequencies. Solution: chi-square analysis to determine if our outcome is different from what would be expected if there was no preference

Plug in to formula 20 30 10 Observed Expected Republican Democrat Other Observed 20 30 10 Expected

Reject H0 The district will probably vote democratic, there is association between what is observed and what is expected. However…

Conclusion of Example 1 Note that all we really can conclude is that our data is different from the expected outcome given a situation Although it would appear that the district will vote democratic, really we can only conclude they were not responding by chance Regardless of the position of the frequencies we’d have come up with the same result In other words, it is a non-directional test regardless of the prediction

Example 2-Two Dimensional Suppose a researcher is interested in voting preferences on gun control issues. A questionnaire was developed and sent to a random sample of 90 voters. The researcher also collects information about the political party membership of the sample of 90 respondents.

Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Democrat 10 30 50 Republican 15 40 f column 25 n = 90

Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Democrat 10 30 50 Republican 15 40 f column 25 n = 90 Observed frequencies

Bivariate Frequency Table or Contingency Table Row frequency Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Democrat 10 30 50 Republican 15 40 f column 25 n = 90

Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Democrat 10 30 50 Republican 15 40 f column 25 n = 90 Column frequency

-Determine The Hypothesis Ho : There is no difference between D & R in their opinion on gun control issue. Ha : There is an association between responses to the gun control survey and the party membership in the population.

-Calculating Test Statistics Favor Neutral Oppose f row Democrat fo =10 fe =13.9 fo =30 fe=22.2 50 Republican fo =15 fe =11.1 fe =17.8 40 f column 25 n = 90

-Calculating Test Statistics Favor Neutral Oppose f row Democrat fo =10 fe =13.9 fo =30 fe=22.2 50 Republican fo =15 fe =11.1 fe =17.8 40 f column 25 n = 90 = 50*25/90

-Calculating Test Statistics Favor Neutral Oppose f row Democrat fo =10 fe =13.9 fo =30 fe=22.2 50 Republican fo =15 fe =11.1 fe =17.8 40 f column 25 n = 90 = 40* 25/90

-Calculating Test Statistics = 11.03

-Determine Degrees of Freedom df = (R-1)(C-1) = (2-1)(3-1) = 2

-Compare computed test statistic against a tabled/critical value α = 0.05 df = 2 Critical tabled value = 5.991 Test statistic, 11.03, exceeds critical value Null hypothesis is rejected

-Conclusion of Example 2 Ho : There is no difference between D & R in their opinion on gun control issue. Ha : There is an association between responses to the gun control survey and the party membership in the population. -Democrats & Republicans differ significantly in their opinions on gun control issues