Dan Piett STAT 211-019 West Virginia University Lecture 12.

Last Week
Hypothesis tests on a difference in means
Hypothesis tests on a difference in proportions
The 2-sided alternative

Overview
Chi-Squared Goodness of Fit Test
Chi-Squared Test of Independence

Section 12.1 Chi-Squared Goodness of Fit Test

Multinomial Data
Previously we have looked at data coming from a binomial distribution, which has 2 outcomes (Success, Failure).
Example: Flipping a coin (Heads, Tails)
Suppose we are interested in data with more than 2 outcomes.
Example: Rolling a die, which has 6 outcomes (1, 2, 3, 4, 5, 6)
We obtain multinomial data from a multinomial experiment.

Multinomial Experiments
Multinomial experiments have these properties:
1. Fixed number of trials, n
2. Each trial results in exactly one of K possible outcomes
3. p_i is the probability of getting outcome i on a single trial, with p_1 + p_2 + p_3 + … + p_K = 1
4. Trials are independent

Finding Expected Frequencies
Remembering back to the binomial distribution: Expected Value = n*p
For our multinomial distribution we will have K expected counts.
Each expected count is E_i = n*p_i
Example: Rolling a fair 6-sided die 600 times (p_i = 1/6)

Outcome          1    2    3    4    5    6
Probability      1/6  1/6  1/6  1/6  1/6  1/6
Expected Counts  100  100  100  100  100  100
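The rule E_i = n*p_i can be checked with a short sketch. The function name `expected_counts` is made up for illustration, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def expected_counts(n, probs):
    """Return the expected count n * p_i for each outcome."""
    if sum(probs) != 1:
        raise ValueError("outcome probabilities must sum to 1")
    return [n * p for p in probs]


# Fair 6-sided die rolled 600 times, as in the slide example.
die_probs = [Fraction(1, 6)] * 6
die_expected = expected_counts(600, die_probs)

print([int(e) for e in die_expected])
```

The same function works for unequal probabilities, as long as they sum to 1.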

Observed Frequencies
When we do our multinomial experiment, we will not always get exactly our expected counts.
Example: We expected 100 4's in our dice experiment. Suppose we only get 85. Then 85 is our observed frequency, O_i.
Our observed frequencies (counts) are our actual data.
Suppose these are the observed counts from our 600 dice throws:

Outcome          1    2    3    4    5    6
Expected Counts  100  100  100  100  100  100
Observed Counts  97   113  102  85   109  94

Chi-Squared Goodness of Fit Test
The question to ask when looking at a table like this is: "Are our observed counts far enough from our expected counts to determine that the expected counts are wrong?" This is what the Chi-Squared Goodness of Fit Test attempts to answer.
Note that our test will follow the 7-step procedure.

Outcome          1    2    3    4    5    6
Expected Counts  100  100  100  100  100  100
Observed Counts  97   113  102  85   109  94

Chi-Squared Goodness of Fit Test
1. H_0: p_1 = #_1, p_2 = #_2, …, p_K = #_K
2. H_A: At least one p_i ≠ #_i
3. Alpha is .05 if not specified
4. Test Statistic: $\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}$, where E_i = n*p_i is computed using the probabilities in H_0
5. The p-value comes from the Chi-Squared Table with df = K-1. It is the upper-tail probability that a chi-squared variable with K-1 degrees of freedom exceeds the test statistic. There is only 1 alternative hypothesis, so there is only this one form of p-value.
6. Our decision rule will be to reject H_0 if p-value < alpha
7. We have (do not have) enough evidence at the .05 level to conclude that at least one of our hypothesized probabilities is incorrect.
We require that the expected count in each cell is at least 5 and that our sample is independent and random.
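As a rough sketch of step 4 applied to the dice data, assume the usual Pearson statistic, the sum of (O_i - E_i)^2 / E_i over all outcomes. Instead of a p-value, the code compares the statistic with 11.070, the standard table value for alpha = .05 and df = 5. The two decision rules are equivalent. The function and variable names are illustrative only.

```python
def chi_squared_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if min(expected) < 5:
        raise ValueError("each expected count should be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed counts from the 600 dice throws; a fair die gives 100 per face.
observed = [97, 113, 102, 85, 109, 94]
expected = [100] * 6

statistic = chi_squared_statistic(observed, expected)

# Table value for alpha = .05 with df = K - 1 = 5 (assumed from a standard table).
CRITICAL_VALUE_DF5 = 11.070
reject_h0 = statistic > CRITICAL_VALUE_DF5

print(f"chi-squared = {statistic:.2f}, reject H0: {reject_h0}")
```

Here the statistic falls well below the table value, so these counts give no evidence that the die is unfair.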

Example: For Fall 2013, 99 STAT 211 students were given a choice of 3 section times (A, B, C) to take the final exam. The data that follow show the number of students who selected each section. Do the data indicate that the students have a preference, or are the data consistent with all sections being equally likely to be chosen? Use alpha = .05. (Hint: If all 3 are equally likely, all p_i's will be 1/3.)
Observed Counts: A: 40, B: 30, C: 29
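One possible way to work this example numerically is sketched below. It assumes the Pearson statistic, the sum of (O_i - E_i)^2 / E_i, and uses the fact that with df = 2 the chi-squared upper-tail probability has the closed form exp(-x/2), so no table lookup is needed. The variable names are illustrative.

```python
import math

# Observed section choices for the 99 students.
observed = {"A": 40, "B": 30, "C": 29}
n = sum(observed.values())

# Under H0 every section is equally likely, so each p_i = 1/3.
expected = n / len(observed)

statistic = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

# For df = 2 only, P(chi-squared > x) = exp(-x / 2).
p_value = math.exp(-statistic / 2)

alpha = 0.05
reject_h0 = p_value < alpha

print(f"chi-squared = {statistic:.3f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Under these assumptions the p-value is far above .05, so we do not have enough evidence to conclude that students prefer one section over another.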

Section 12.2 Chi-Squared Test for Independence

Association of Categorical Variables
Thus far, all of our confidence intervals and hypothesis tests have been done on numeric variables. We will now shift our attention to categorical variables.
Ex: Eye Color, Class Rank
The question we wish to answer is, "Is there an association between two categorical variables?"
Ex: Is there an association between Eye Color and Hair Color?
We will use a Chi-Squared Test to answer this question, but first we need to discuss contingency tables.

Contingency Tables (Observed)
We can organize categorical data in a contingency table with r rows and c columns. This is known as an r x c (r by c) contingency table.
Note that the contingency table contains observed counts.
Example: Some possible values for Hair Color vs. Eye Color

Hair x Eye  Brown  Blue  Green
Black       110    20    8
Brown       65     22    9
Blonde      33     75    12

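Contingency Tables (Expected)
Much like the goodness of fit test, we will need to calculate our expected counts.
The formula for the expected counts is $E_{ij} = \frac{R_i \, C_j}{n}$, where R_i is the total of row i, C_j is the total of column j, and n is the grand total.
So for the previous example, the expected count for Black hair and Brown eyes is (138)(208)/354 = 81.1.
We now have observed and expected counts, so we can do a Chi-Squared Test for Independence.

Hair x Eye  Brown       Blue       Green     Total
Black       110 (81.1)  20 (45.6)  8 (11.3)  138
Brown       65 (??)     22 (??)    9 (??)    96
Blonde      33 (??)     75 (??)    12 (??)   120
Total       208         117        29        354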
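The expected counts for every cell can be filled in with a small sketch, assuming the standard rule (row total × column total) / grand total. The helper name `expected_table` is made up for this illustration.

```python
def expected_table(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hair (rows: Black, Brown, Blonde) by Eye (columns: Brown, Blue, Green).
hair_eye = [
    [110, 20, 8],
    [65, 22, 9],
    [33, 75, 12],
]

hair_eye_expected = expected_table(hair_eye)

for label, row in zip(["Black", "Brown", "Blonde"], hair_eye_expected):
    print(label, [round(e, 1) for e in row])
```

The first printed row can be compared against the values already shown in the table for Black hair. The other two rows fill in the cells marked (??).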

Chi-Squared Test for Independence
1. H_0: Variable 1 and Variable 2 are independent
2. H_A: Variable 1 and Variable 2 are not independent (dependent)
3. Alpha is .05 if not specified
4. Test Statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where O_ij and E_ij are the observed and expected counts in row i, column j
5. The p-value comes from the Chi-Squared Table with df = (r-1)(c-1). It is the upper-tail probability that a chi-squared variable with (r-1)(c-1) degrees of freedom exceeds the test statistic. There is only 1 alternative hypothesis, so there is only this one form of p-value.
6. Our decision rule will be to reject H_0 if p-value < alpha
7. We have (do not have) enough evidence at the .05 level to conclude that the variables are dependent.
We require that the expected count in each cell is at least 5 and that our sample is independent and random.

Example: Does "test failure" reduce academic aspirations and thereby contribute to a decision to drop out of school? A random sample of 283 students is surveyed from schools with low graduation rates. The contingency table below reports the responses to the question "Do tests required for graduation discourage students from staying in school?" Does there appear to be a relationship between the schools' location and the students' responses?

Response x School  Urban  Suburban  Rural
Yes                57     27        47
No                 23     16        12
Unsure             45     25        31
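A hedged sketch of the full test on this table follows. It assumes expected counts of (row total × column total) / grand total and the Pearson statistic summed over all cells. For a 3 x 3 table df = 4, which is even, so the upper-tail probability can be written in closed form as exp(-x/2) times the sum of (x/2)^j / j! for j below df/2. That shortcut holds only for even df. All helper names are invented for illustration.

```python
import math


def expected_table(observed):
    """Expected counts under independence for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_sf_even_df(x, df):
    """Upper-tail probability P(chi-squared > x); valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


# Rows: Yes, No, Unsure. Columns: Urban, Suburban, Rural.
responses = [
    [57, 27, 47],
    [23, 16, 12],
    [45, 25, 31],
]

expected = expected_table(responses)
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(responses, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(responses) - 1) * (len(responses[0]) - 1)
p_value = chi_squared_sf_even_df(statistic, df)
reject_h0 = p_value < 0.05

print(f"chi-squared = {statistic:.2f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With these counts the p-value is well above .05, so under these assumptions we do not have enough evidence to conclude that response depends on school location.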
