Chi Square: A Nonparametric Test PSYC 230 June 3rd, 2004 Shaun Cook, ABD University of Arizona.

Presentation transcript:


Nonparametric (a.k.a. Distribution-Free)
Nonparametric refers to tests that:
- Make no estimates about parameters
- Make few or no assumptions about the population distribution
- Can be run with ordinal or nominal data
- Are usually less powerful than parametric tests
They are significance tests

Chi Square Distribution
A distribution with one parameter, k
Mathematically defined by:
$$f(\chi^2) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\,(\chi^2)^{(k/2)-1}\,e^{-\chi^2/2}$$
All values are set except k
- k is the only value that can vary
- k is statistically equal to df
- The distribution changes for different values of k
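As a rough illustration of the density on this slide, the following Python sketch evaluates f(χ²) for a given k with the standard library only. The function name `chi_square_pdf` and the sample value 3.0 are assumptions made for illustration; they are not part of the lecture.

```python
import math

def chi_square_pdf(x, k):
    """Density of the chi-square distribution with k degrees of freedom at x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    # 1 / (2^(k/2) * Gamma(k/2)) is fixed once k is chosen
    coefficient = 1.0 / (2 ** (k / 2) * math.gamma(k / 2))
    return coefficient * x ** (k / 2 - 1) * math.exp(-x / 2)

# The height of the curve at the same point changes as k (the df) changes.
for k in (1, 2, 4, 8):
    print(k, round(chi_square_pdf(3.0, k), 4))
```

Only k appears as a free argument besides the point x, which mirrors the slide's remark that k is the only value that can vary.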

Chi Square Distribution
[Figure from Howell, 1997]

Chi Square Test
Based on the chi square distribution
This is a nonparametric test
It can be used with nominal data
- Therefore, it can be used with more complex data, as well
  > Data must first be put in nominal form
Tests whether frequency differences occur due to chance

Transforming Data
Set of reaction time (RT) data, in ms:
{778, 921, 1148, 1675, 1721, 782, 1549, 846, 1313, 1947, 1498, 885, 1211}
How can this be transformed into nominal data?
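One possible answer to the question on this slide is to sort each reaction time into a labeled category and count the cases per category. The sketch below does that in Python; the cutoffs (1000 ms and 1500 ms) and the labels "fast", "medium", and "slow" are hypothetical choices, not values given in the lecture.

```python
from collections import Counter

rt_ms = [778, 921, 1148, 1675, 1721, 782, 1549, 846, 1313, 1947, 1498, 885, 1211]

def label_rt(rt):
    """Assign a reaction time to one category (cutoffs are hypothetical)."""
    if rt < 1000:
        return "fast"
    if rt < 1500:
        return "medium"
    return "slow"

# Each case receives exactly one label, so the counts are nominal frequencies.
observed = Counter(label_rt(rt) for rt in rt_ms)
print(dict(observed))
```

After this step the original millisecond values are no longer used; only the category counts enter a chi square test.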

The Nominal Scale
Could be called labeling
Numbers are assigned to define a category
- Therefore, all cases in the same category receive the same designation, the same number
Categories are independent, or mutually exclusive (e.g., political party affiliation)

Nominal Data
These data tell whether a particular case possesses a particular trait, and cases are categorized along these traits
- We do not know how much of the trait
All categories must share one trait
All observations within any category are equal

Terminology
χ² - chi square
C - number of categories
f_o - frequency observed
f_e - frequency expected

Chi Square and the H_0
As do all significance tests, the chi square tests the H_0
The H_0 with a chi square test says that the frequencies in your sample are equivalent to those that are expected
- H_0: f_o = f_e
  > How do you obtain the value of f_e?

Observed and Expected Frequencies
Observed frequencies (f_o): frequencies you observe in your sample
Expected frequencies (f_e): frequencies you would expect given H_0

Goodness of Fit (1 x C) Chi Square
Applies when one group is assigned to C categories
The appropriate χ² for comparing a sample to a population
Tests how well our observed frequencies (f_o) fit the expected frequencies (f_e), given H_0

H_0 & GOF Chi Square
H_0 can be stated in two ways:
- No Preference: the idea that the population is evenly divided among categories
- No Difference: the idea that the expected frequencies (f_e) are the same as those of a known population

f_e & GOF Chi Square
f_e can be calculated in two ways, corresponding to the H_0:
- By chance: $f_e = N / C$
- A priori: prior knowledge has informed your hypothesis, and your expected frequency is based on this prior knowledge

Calculating Chi Square
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
This formula generalizes to multiple category variables

Practical Problem
A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Are the students equally divided?

Calculating 1 x C χ²
[calculation table not preserved in transcript]
∑ of values in the bottom row = χ² value

Evaluating χ²
df = C - 1
Once you have calculated a χ² value, you compare it to a table value (p. 699)
Find the table value by looking up the df & α level
If the calculated χ² is ≥ the table value, reject H_0
H_0: f_o = f_e
H_a: f_o ≠ f_e

χ² Table
Treat just like the t table
Note that, unlike t, as you increase df, the table or critical value also increases
- This makes it harder to find a significant result at these higher df

Class Problem
A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Can she reject the H_0 that states the students are divided equally?
Critical χ² (α = .05, df = 2) = 5.99
Calculated χ² = 27.18; reject H_0
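The arithmetic behind this answer can be checked with a short Python sketch. It assumes the "no preference" H_0, so every category gets f_e = N / C; the function name `goodness_of_fit` is made up for illustration, and the critical value is simply copied from the slide rather than computed.

```python
def goodness_of_fit(observed, expected=None):
    """Return (chi-square, df) for a 1 x C table.

    If no expected frequencies are given, assume an even split (f_e = N / C).
    """
    n = sum(observed)
    if expected is None:
        expected = [n / len(observed)] * len(observed)
    chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1
    return chi_sq, df

# Favor elimination, do not favor, undecided
chi_sq, df = goodness_of_fit([160, 115, 80])
critical = 5.99  # table value for alpha = .05, df = 2, taken from the slide

print(round(chi_sq, 2), df, "reject H_0" if chi_sq >= critical else "fail to reject H_0")
```

With N = 355 and three categories, each expected frequency is about 118.33, and the three cell contributions sum to the 27.18 reported on the slide.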

Class Problem
Consumer psychologists tell us that red is a powerful color for merchandising. According to the numbers, products whose packaging contains red sell 2/3 more often than equivalent products whose packaging lacks red. Packaging companies know this & therefore charge more for red packaging. We test a new product in two packages: R+ & R-. We find that 49 people prefer the R+ & 38 prefer the R-. Does our sample prefer red to the same degree?
Critical χ² (α = .05, df = 1) = 3.84
Calculated χ² = 4.49; reject H_0
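This problem uses a priori expected frequencies rather than an even split. The Python sketch below assumes that the claim is read as "R+ is chosen 2/3 of the time and R- 1/3 of the time"; that reading is an assumption, though it is consistent with the slide's answer. With the proportions rounded to .67 and .33 the result is the 4.49 shown on the slide, while exact thirds give about 4.19. Both exceed the critical value of 3.84, so the decision is the same.

```python
def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [49, 38]  # prefer R+, prefer R-
n = sum(observed)

# Assumed a priori proportions: 2/3 for R+ and 1/3 for R-
exact = chi_square(observed, [n * 2 / 3, n * 1 / 3])
rounded = chi_square(observed, [n * 0.67, n * 0.33])

critical = 3.84  # table value for alpha = .05, df = 1, taken from the slide
print(round(exact, 2), round(rounded, 2), exact >= critical, rounded >= critical)
```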

Independence (r x C) Chi Square
Analysis of contingency
Applies when more than one group is assigned to C categories
The appropriate χ² for comparing a sample to another sample
Uses contingency tables
Tests H_0: the observed frequencies for one category are independent of the observed frequencies for any other; any differences occurred by chance

Contingency Tables
Show the distribution of one variable at each level of another variable
Also known as crosstabs
Rows are defined by the groups
Columns are defined by the categories
Identify marginal totals

Marginal Totals
These are the totals of the frequencies in all cells of a row or column
- For rows, they are placed to the right
- For columns, they are placed at the bottom
∑ row totals = ∑ column totals = N

Expected Frequencies, df, & Independence χ²
f_e = (row total * column total) / N
- Follows from the multiplicative law of probability
df = (# of rows - 1)(# of columns - 1)
- Refers to the number of cell values that are free to vary once the marginal totals are set
- Check by crossing out 1 row & 1 column; the number of cells left over equals df
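The expected-frequency and df rules on this slide can be sketched in Python as follows. The 2 x 3 table of counts is invented purely for illustration (it is not the data from any problem in the lecture), and the function names are hypothetical.

```python
def expected_frequencies(table):
    """f_e for each cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi-square, df) for an r x C contingency table."""
    expected = expected_frequencies(table)
    chi_sq = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df

# Hypothetical counts: 2 groups (rows) by 3 categories (columns)
table = [[30, 20, 10],
         [20, 20, 20]]

expected = expected_frequencies(table)
chi_sq, df = chi_square_independence(table)
print(expected, round(chi_sq, 2), df)
```

Crossing out one row and one column of this 2 x 3 table leaves 1 x 2 = 2 cells, which matches the df the function returns.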

Class Problem
A 1993 survey of men in CA looked at marital and employment status. It found the following breakdown: [table not preserved in transcript]
Do men of different marital statuses have different distributions of employment status? Or are these differences just chance variation?
Critical χ² (α = .05, df = 2) = 5.99
Calculated χ² = 5.56; fail to reject H_0

χ² & Percentage
χ² can be calculated with percentages
The formula stays the same
Treat the percentages just as you would frequencies
Remember, a key factor in χ² is sample size
Percentage-based χ² must account for N
This is done after the χ² based on the percentages (χ²_%) has been calculated:
$$\chi^2 = \frac{\chi^2_{\%}\,(N)}{100}$$

Class Problem
You have classified a sample of 24 people into 5 categories based on ethnicity, using percents. You surveyed these people on their attitudes toward increasing taxes. To see if their attitudes were related to ethnicity, you have calculated χ² and obtained a value of [value missing in transcript]. What is your conclusion?
Critical χ² (α = .05, df = 4) = 9.49
Calculated χ² = 3.43; fail to reject H_0
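The percentage correction can be checked numerically. Because the percentage value for this problem is missing from the transcript, the Python sketch below instead reuses the final-exam counts from the earlier goodness-of-fit problem (160, 115, 80): it computes χ² on percentages, rescales by N / 100, and compares the result with χ² computed directly on the frequencies. Variable names are illustrative only.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [160, 115, 80]
n = sum(observed)
c = len(observed)

# Chi-square on raw frequencies, assuming an even split under H_0
chi_sq_freq = chi_square(observed, [n / c] * c)

# Same calculation on percentages, then corrected for sample size
pct_observed = [100 * fo / n for fo in observed]
pct_expected = [100 / c] * c
chi_sq_pct = chi_square(pct_observed, pct_expected)
chi_sq_corrected = chi_sq_pct * n / 100

print(round(chi_sq_pct, 2), round(chi_sq_corrected, 2), round(chi_sq_freq, 2))
```

The uncorrected percentage value behaves as if N were 100, which is why it has to be multiplied by N / 100 before it is compared with the table value.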

Assumptions of χ²
Inclusion of non-occurrences
Normality - expected cell frequencies large enough
Independence

Inclusion of Non-Occurrences
Every possible value of a variable needs to be included
- Some slippage is OK with very rare events

Violation Example
Are Catholics more likely to vote pro-abortion than Non-Catholics?
Pro votes, Catholics vs. Non-Catholics: [counts not preserved in transcript]
Surprisingly, it looks like the answer is yes
We have not considered the non-occurrences

Violation Example
Are Catholics more likely to vote pro-abortion than Non-Catholics?
Pro votes and Con votes, Catholics vs. Non-Catholics: [counts not preserved in transcript]
Catholics are much more likely to vote con

Assumption of Normality
Refers to having large enough frequencies for the normal approximation to the multinomial to be valid - make sure to check
Different opinions on this:
- Some say that all cells need f_e > 5
- Some say that no more than 20% of cells can have f_e < 5
Biggest problem is lack of power
Fisher’s exact test is an alternative for 2x2 tables
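Both rules of thumb on this slide are easy to check once the expected frequencies are known. The following Python sketch reports each one; the function name and the table of expected values are hypothetical and serve only to show the check.

```python
def check_expected_frequencies(expected, minimum=5):
    """Report the two rules of thumb for expected cell frequencies."""
    cells = [fe for row in expected for fe in row]
    share_small = sum(1 for fe in cells if fe < minimum) / len(cells)
    return {
        # Stricter opinion: every cell needs f_e > minimum
        "all_cells_above_minimum": all(fe > minimum for fe in cells),
        # Looser opinion: no more than 20% of cells with f_e < minimum
        "share_of_small_cells": share_small,
        "within_20_percent_rule": share_small <= 0.20,
    }

# Hypothetical expected frequencies for a 2 x 3 table
expected = [[25.0, 20.0, 15.0],
            [4.0, 8.0, 12.0]]

result = check_expected_frequencies(expected)
print(result)
```

Here one cell out of six falls below 5, so the table fails the stricter rule but passes the 20% rule.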

Assumption of Independence
Each subject falls into one and only one cell
- Check: do the totals of your cell counts = N?
If you have repeated measurements, you do not have independence
Alternative if you don’t have independence
- McNemar test

McNemar’s Test
In some cases, you can account for a lack of independence by using McNemar’s test
Can only be computed with a 2 x 2 contingency table
Within the table, we do not have the observed frequencies
- We have change scores
We compute χ² on these change scores

H_0 & McNemar’s Test
The H_0 in this case states that the distributions of original & changed scores are the same

McNemar’s Test
$$\chi^2 = \frac{(a - d)^2}{a + d}$$
The contingency table (Pre by Post) must be set up as:
a  b
c  d

Class Problem
You have classified a sample of 100 Texans into 2 categories: 77 pro death penalty & 23 con. You surveyed these Texans again after they had watched an execution. Of the original 77 pro opinions, 61 remain. Of the original 23 con, 18 remain. Did viewing an execution change Texans’ attitudes?
Critical χ² (α = .05, df = 1) = 3.84
Calculated χ² = 5.76; reject H_0
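The calculation for this problem can be sketched in Python. It assumes that the two values entering McNemar's formula are the counts of people who changed opinion in each direction (pro to con, and con to pro); that assumption reproduces the 5.76 given on the slide. Variable names are illustrative.

```python
# Counts before viewing, and how many kept their original opinion
pro_before, pro_unchanged = 77, 61
con_before, con_unchanged = 23, 18

# Change scores: people who switched in each direction
pro_to_con = pro_before - pro_unchanged  # 16
con_to_pro = con_before - con_unchanged  # 5

# McNemar chi-square uses only the cases that changed
chi_sq = (pro_to_con - con_to_pro) ** 2 / (pro_to_con + con_to_pro)

critical = 3.84  # table value for alpha = .05, df = 1
print(round(chi_sq, 2), "reject H_0" if chi_sq >= critical else "fail to reject H_0")
```

The 79 people whose opinion stayed the same do not enter the statistic at all; only the 21 who changed do.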

χ² & Effect Size
$$\text{Effect size} = \sqrt{\frac{\chi^2}{N + \chi^2}}$$
This measure of effect size for χ² has different conventions than those for parametric tests
- .10 (small effect size)
- .25 (medium effect size)
- .40 (large effect size)
This measure of effect size is also called the contingency coefficient

Class Problem
A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Can she reject the H_0 that states the students are divided equally?
Critical χ² (α = .05, df = 2) = 5.99
Calculated χ² = 27.18; reject H_0
ES = .27, medium effect
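The effect size for this problem follows from the contingency coefficient, the square root of χ² / (N + χ²). A minimal Python sketch, assuming an even split under H_0 and using an illustrative function name:

```python
import math

def contingency_coefficient(chi_sq, n):
    """Effect size for chi-square: sqrt(chi_sq / (n + chi_sq))."""
    return math.sqrt(chi_sq / (n + chi_sq))

observed = [160, 115, 80]
n = sum(observed)
expected = n / len(observed)  # even split under H_0

chi_sq = sum((fo - expected) ** 2 / expected for fo in observed)
effect_size = contingency_coefficient(chi_sq, n)

print(round(chi_sq, 2), round(effect_size, 2))
```

The result, about .27, sits just above the .25 convention, which is why the slide labels it a medium effect.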

Questions/ Comments? Thank You The end

Homework Chapter 17 –1, 2, 3, 4, 9, 13, 17, 19, 21