Chi Square: A Nonparametric Test PSYC 230 June 3rd, 2004 Shaun Cook, ABD University of Arizona.


1 Chi Square: A Nonparametric Test PSYC 230 June 3rd, 2004 Shaun Cook, ABD University of Arizona

2 Nonparametric (a.k.a. Distribution-Free) Nonparametric refers to tests that: –Make no estimates about parameters –Make few or no assumptions about the population distribution –Can be run with ordinal or nominal data –Are usually less powerful than parametric tests They are significance tests

3 Chi Square Distribution A distribution with one parameter, k Mathematically defined by: $f(\chi^2) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\,(\chi^2)^{(k/2)-1}\,e^{-\chi^2/2}$ All values set except k –k is the only value that can vary –k is statistically equal to df –Distribution changes for different values of k
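
To make the role of k concrete, the density above can be evaluated directly. This is a minimal sketch using only Python's standard library; the function name `chi_square_pdf` and the sample points are illustrative choices, not part of the original slides.

```python
import math

def chi_square_pdf(x, k):
    """Chi-square density with k degrees of freedom, evaluated at x.

    For simplicity this sketch returns 0 for x <= 0.
    """
    if x <= 0:
        return 0.0
    numerator = x ** (k / 2 - 1) * math.exp(-x / 2)
    denominator = 2 ** (k / 2) * math.gamma(k / 2)
    return numerator / denominator

# The shape of the curve changes as k (the df) changes.
for k in (1, 2, 4, 8):
    values = [round(chi_square_pdf(x, k), 4) for x in (0.5, 1, 2, 4, 8)]
    print(f"k={k}: {values}")
```

Printing a few values per k shows the curve shifting to the right and flattening as k grows, which is why the critical value depends on df.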

4 Chi Square Distribution Howell, 1997

5 Chi Square Test Based on the chi square distribution This is a nonparametric test It can be used with nominal data –Therefore, it can also be used with more complex data (ordinal, interval, or ratio) >Data must first be put in nominal form Tests whether frequency differences are due to chance

6 Transforming Data Set of reaction time (RT) data, in ms {778, 921, 1148, 1675, 1721, 782, 1549, 846, 1313, 1947, 1498, 885, 1211} How can this be transformed into nominal data?
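
One possible answer to the question above is to bin the reaction times into labeled categories and count the cases in each. The sketch below assumes three categories with cutoffs at 1000 ms and 1500 ms; those cutoffs and the labels "fast", "medium", and "slow" are hypothetical and chosen only for illustration.

```python
from collections import Counter

rt_ms = [778, 921, 1148, 1675, 1721, 782, 1549, 846, 1313, 1947, 1498, 885, 1211]

def categorize(rt):
    """Assign a reaction time to a nominal category (hypothetical cutoffs)."""
    if rt < 1000:
        return "fast"
    if rt < 1500:
        return "medium"
    return "slow"

# After binning, only the category frequencies remain; the original
# millisecond values are no longer used.
counts = Counter(categorize(rt) for rt in rt_ms)
print(counts)
```

The resulting frequencies are the kind of data a chi square test works with.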

7 The Nominal Scale Could be called labeling Numbers are assigned to define a category –Therefore, all cases in the same category receive the same designation, the same number Categories are independent or mutually exclusive e.g., political party affiliation

8 Nominal Data These data tell whether a particular case possesses a particular trait, and cases are categorized along these traits –We do not know how much of the trait All categories must share one trait All observations within any category are equal

9 Terminology χ² - chi square C - number of categories f_o - frequency observed f_e - frequency expected

10 Chi Square and the H_0 As do all significance tests, the chi square tests the H_0 The H_0 with a chi square test says that the frequencies in your sample are equivalent to those that are expected –H_0: f_o = f_e >How do you obtain the value of f_e?

11 Observed and Expected Frequencies Observed frequencies (f_o): frequencies you observe in your sample Expected frequencies (f_e): frequencies you would expect given H_0

12 Goodness of Fit (1 x C) Chi Square Applies when one group is assigned to C categories Good χ² to compare a sample to a population Testing how well our observed frequencies (f_o) fit with the expected frequencies (f_e), given H_0

13 H_0 & GOF Chi Square H_0 can be stated in two ways: No Preference: idea that the population is evenly divided among categories No Difference: idea that the expected frequencies are the same as those of a known population

14 f_e & GOF Chi Square f_e can be calculated in two ways, corresponding to the H_0: By chance: $f_e = \frac{N}{C}$ A priori: prior knowledge has informed your hypothesis, and your expected frequency is based on this prior knowledge

15 Calculating Chi Square $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$ This formula generalizes to multiple category variables

16 Practical Problem A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Are the students equally divided?

17 Calculating 1 x C χ² Σ of values in the bottom row = χ² value

18 Evaluating χ² df = C - 1 Once you have calculated a χ² value, you compare it to a table value (p. 699) Find the table value by looking up the df & α level If calculated χ² is ≥ table value, reject H_0 H_0: f_o = f_e H_a: f_o ≠ f_e

19 χ² Table Treat just like t table Note that, unlike t, as you increase df, the table or critical value also increases –Making it harder to find a significant result at these higher df

20 Class Problem A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Can she reject the H_0 that states the students are divided equally? χ²_.05(2) = 5.99; χ² = 27.18; reject H_0
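
The arithmetic behind this class problem can be sketched as follows, using the no-preference expected frequency f_e = N / C and the critical value 5.99 quoted on the slide. The function and variable names are illustrative, not from the original deck.

```python
def chi_square_gof(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [160, 115, 80]          # favor, do not favor, undecided
n = sum(observed)                  # N = 355
c = len(observed)                  # C = 3 categories
expected = [n / c] * c             # no preference: f_e = N / C for every category

chi2 = chi_square_gof(observed, expected)
df = c - 1
critical_value = 5.99              # table value for df = 2, alpha = .05 (from the slide)

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {chi2 >= critical_value}")
```

Rounded to two decimals, the computed value matches the 27.18 given on the slide.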

21 Class Problem Consumer psychologists tell us that red is a powerful color for merchandising. According to the numbers, products whose packaging contains red sell 2/3 more often than equivalent products whose packaging lacks red. Packaging companies know this & therefore charge more for red packaging. We test a new product in two packages: R+ & R-. We find that 49 people prefer the R+ & 38 prefer the R-. Does our sample prefer red to the same degree? χ²_.05(1) = 3.84; χ² = 4.49; reject H_0
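
This problem uses a priori expected frequencies rather than N / C. The sketch below assumes the slide's result was obtained by treating red packaging as preferred 2/3 of the time and rounding the proportions to .67 and .33; that reading is an assumption, adopted here because it reproduces the stated χ² of 4.49.

```python
observed = [49, 38]                # prefer R+, prefer R-
n = sum(observed)                  # N = 87

# Assumption: a priori proportions rounded to two decimals.
proportions = [0.67, 0.33]
expected = [p * n for p in proportions]

chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = len(observed) - 1
critical_value = 3.84              # table value for df = 1, alpha = .05 (from the slide)

print(f"f_e = {expected}, chi2 = {chi2:.2f}, reject H0: {chi2 >= critical_value}")
```

With the exact proportions 2/3 and 1/3, the same calculation gives about 4.19, which also exceeds 3.84, so the decision does not depend on the rounding.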

22 Independence (r x C) Chi Square Analysis of contingency Applies when more than one group is assigned to C categories Good χ² to compare a sample to another sample Uses contingency tables Tests H_0: the observed frequencies for one category are independent of the observed frequencies for any other; they occurred by chance

23 Contingency Tables Show the distribution of one variable at each level of another variable Also known as crosstabs Rows are defined by the groups Columns are defined by the categories Identifies marginal totals

24 Marginal Totals These are the totals of the frequencies in all cells of a row or column –For rows, they are placed to the right –For columns, they are placed at the bottom Σ row totals = Σ column totals = N

25 Expected Frequencies, df, & Independence χ² f_e = (row total * column total) / N –Follows from multiplicative law of probability df = (# of rows - 1) (# of columns - 1) –Refers to the number of cell values that are free to vary once the marginal totals are set –Check by crossing out 1 row & 1 column
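
The expected-frequency and df rules above can be sketched for any r x C table. The 2 x 2 counts used here are hypothetical, made up only to show the mechanics, and the function names are illustrative.

```python
def expected_frequencies(table):
    """f_e = (row total * column total) / N for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi2, df) for a table of observed frequencies."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical observed counts: rows are groups, columns are categories.
table = [[10, 20],
         [30, 40]]

chi2, df = chi_square_independence(table)
print(expected_frequencies(table), round(chi2, 3), df)
```

Because the marginal totals are fixed, only (rows - 1) x (columns - 1) cells can vary freely, which is what the df line computes.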

26 Class Problem A 1993 survey of men in CA looked at marital and employment status. It found the following breakdown (table not reproduced in this transcript): Do men of different marital statuses have different distributions of employment status? Or, are these differences just chance variation? χ²_.05(2) = 5.99; χ² = 5.56; fail to reject H_0

27 χ² & Percentage χ² can be calculated with percentages The formula stays the same Treat the percentages just as you would frequencies Remember, a key factor in χ² is sample size Percentage-based χ² must account for N They do so after the χ², based on the percentage, has been calculated: $\chi^2 = \frac{\chi^2_{\%} \cdot N}{100}$

28 Class Problem You have classified a sample of 24 people into 5 categories based on ethnicity, using percents. You surveyed these people on their attitudes toward increasing taxes. To see if their attitudes were related to ethnicity, you have calculated χ² and obtained a value of 14.28. What is your conclusion? χ²_.05(4) = 9.49; χ² = 3.43; fail to reject H_0
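
The correction for sample size in this problem is a single multiplication. A minimal sketch, with an illustrative function name, using the values and the critical value given on the slide:

```python
def chi_square_from_percent(chi2_percent, n):
    """Rescale a chi square computed on percentages to the actual sample size."""
    return chi2_percent * n / 100

chi2_percent = 14.28               # value computed from percentages
n = 24                             # actual sample size
critical_value = 9.49              # table value for df = 4, alpha = .05 (from the slide)

chi2 = chi_square_from_percent(chi2_percent, n)
print(f"chi2 = {chi2:.2f}, reject H0: {chi2 >= critical_value}")
```

The uncorrected 14.28 would exceed the critical value, so skipping the correction would reverse the conclusion for this small sample.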

29 Assumptions of χ² Inclusion of non-occurrences Normality - expected cell frequencies large enough Independence

30 Inclusion of Non-Occurrences Every possible value of a variable needs to be included –some slippage OK with very rare events

31 Violation Example Are Catholics more likely to vote pro-abortion than Non-Catholics? Pro votes: Catholics 400, Non-Catholics 100 Surprisingly, it looks like the answer is yes We have not considered the non-occurrences

32 Violation Example Are Catholics more likely to vote pro-abortion than Non-Catholics? Pro votes: Catholics 400, Non-Catholics 100 Con votes: Catholics 1200, Non-Catholics 100 Catholics are much more likely to vote con
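
Why the con votes change the picture can be seen by converting each group's counts to proportions. A small sketch using the counts from the slide; the dictionary layout and names are illustrative.

```python
votes = {
    "Catholics": {"pro": 400, "con": 1200},
    "Non-Catholics": {"pro": 100, "con": 100},
}

# Proportion of each group voting pro, once the non-occurrences
# (con votes) are included in the denominator.
pro_rate = {
    group: counts["pro"] / (counts["pro"] + counts["con"])
    for group, counts in votes.items()
}

for group, rate in pro_rate.items():
    print(f"{group}: {rate:.0%} pro")
```

Looking only at pro votes compares 400 with 100; including the con votes shows that a smaller share of Catholics than of Non-Catholics voted pro.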

33 Assumption of Normality Refers to having large enough frequencies for the normal approximation to the multinomial to be valid - make sure to check Different opinions on this: –Some say that all cells need f_e > 5 –Some say that no more than 20% of cells can have f_e < 5 Biggest problem is lack of power Fisher’s exact test is an alternative for 2x2 tables
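
Both rules of thumb can be checked mechanically once the expected frequencies are known. This is a sketch only; the function name, the returned keys, and the expected frequencies below are hypothetical.

```python
def check_expected_frequencies(expected):
    """Apply the two rules of thumb to a table of expected frequencies."""
    cells = [fe for row in expected for fe in row]
    share_below_5 = sum(fe < 5 for fe in cells) / len(cells)
    return {
        "all_cells_above_5": all(fe > 5 for fe in cells),
        "at_most_20_percent_below_5": share_below_5 <= 0.20,
    }

# Hypothetical expected frequencies for a 2 x 3 table.
expected = [[12.0, 18.0, 3.0],
            [28.0, 42.0, 7.0]]

print(check_expected_frequencies(expected))
```

For this made-up table the stricter rule fails (one cell has f_e = 3) while the more lenient rule passes (1 of 6 cells is below 5), which illustrates how the two opinions can disagree.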

34 Assumption of Independence Each subject falls into one and only one cell –Check: do totals of your cell counts = N? If you have repeated measurements, you do not have independence Alternative if you don’t have independence –McNemar test

35 McNemar’s Test In some cases, you can account for a lack of independence by using McNemar’s test Can only be computed with a 2 x 2 contingency table Within the table, we do not have the observed frequencies –We have change scores We compute χ² on these change scores

36 H_0 McNemar’s Test The H_0 in this case states that the distributions of original & changed scores are the same

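37 McNemar’s Test $\chi^2 = \frac{(a - d)^2}{a + d}$ The contingency table must be set up with Pre as rows and Post as columns, with cells a b in the first row and c d in the second, ordered so that a and d hold the cases that changed between Pre and Post and b and c hold the cases that did not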

38 Class Problem You have classified a sample of 100 Texans into 2 categories: 77 pro death penalty & 23 con. You surveyed these Texans after having watched an execution. Of the original 77 pro opinions, 61 remain. Of the original 23 con, 18 remain. Did viewing an execution change Texans’ attitudes? χ²_.05(1) = 3.84; χ² = 5.76; reject H_0
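
The change scores for this problem come from subtracting the opinions that remained from the original counts. A minimal sketch, assuming the McNemar statistic is computed as the squared difference of the two change cells divided by their sum; the names are illustrative.

```python
def mcnemar_chi_square(changed_one_way, changed_other_way):
    """Chi square on the two change cells of a 2 x 2 pre/post table."""
    return (changed_one_way - changed_other_way) ** 2 / (changed_one_way + changed_other_way)

pro_to_con = 77 - 61               # 16 people changed from pro to con
con_to_pro = 23 - 18               # 5 people changed from con to pro
critical_value = 3.84              # table value for df = 1, alpha = .05

chi2 = mcnemar_chi_square(pro_to_con, con_to_pro)
print(f"chi2 = {chi2:.2f}, reject H0: {chi2 >= critical_value}")
```

Only the 21 people who changed their opinion enter the calculation; the 79 who did not change carry no information about the direction of change.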

39 χ² & Effect Size Effect size = $\sqrt{\frac{\chi^2}{N + \chi^2}}$ This measure of effect size for χ² has different conventions than those for parametric tests –.10 (small effect size) –.25 (medium effect size) –.40 (large effect size) This measure of effect size is also called the contingency coefficient

40 Class Problem A professor surveys her students to find out if they favor elimination of final exams. She determines that 160 favor elimination, 115 do not, and 80 are undecided. Can she reject the H_0 that states the students are divided equally? χ²_.05(2) = 5.99; χ² = 27.18; reject H_0 ES = .27, medium effect
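
The effect size reported here follows from the χ² value and the sample size. A short sketch, assuming the contingency coefficient is the square root of χ² / (N + χ²); the function name is illustrative.

```python
import math

def contingency_coefficient(chi2, n):
    """Effect size for chi square: sqrt(chi2 / (N + chi2))."""
    return math.sqrt(chi2 / (n + chi2))

chi2 = 27.18                       # from the class problem
n = 160 + 115 + 80                 # N = 355

effect_size = contingency_coefficient(chi2, n)
print(f"ES = {effect_size:.2f}")
```

The result falls just above the .25 convention for a medium effect and well below the .40 convention for a large one.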

41 Questions/ Comments? Thank You The end

42 Homework Chapter 17 –1, 2, 3, 4, 9, 13, 17, 19, 21

