Chi–squared Tests for Ordinal and Nominal data 1.



Techniques to summarize data
1. One variable: univariate methods
2. Two variables: bivariate methods
Graphical displays
- Two interval variables: scatter plot
- Two categorical variables: clustered bar chart
- More than two variables: graphical displays are hard

Observations can be taken
1. At the same time: cross-sectional data
   Market surveys, e.g. brand preferences of 100 people
2. Repeatedly at successive times: time series data
   Price of a certain stock over the last 5 years
Note: the succession can also be in space, but we omit that case here.

Describing the Relationship between Two Nominal/Ordinal Variables
A contingency (cross-classification, cross-tabulation) table is used to describe the relationship between two or more nominal variables.
Ex: Are profession and newspaper reading habits related? A sample of people were asked about their occupations and newspaper preferences.

Person  Occupation    Newspaper
1       White-collar  Post
2       White-collar  Sun
3       Professional  Sun
..      ..            ..
354     Blue-collar   Mail

The responses are summarized in a table of counts with newspaper (Globe, Mail, Post, Sun, Total) as rows and occupation (WC, BC, Pro, Total) as columns.

Relative frequencies of each newspaper within each occupation

Newspaper  WC             BC           Pro
Globe      27/120 = 0.23  ../108 = ..  ../126 = 0.26
Mail       18/120 = 0.15  ../108 = ..  ../126 = 0.40
Post       38/120 = 0.32  ../108 = ..  ../126 = 0.17
Sun        37/120 = 0.31  ../108 = ..  ../126 = 0.16

Time series data
Observations are repeated at successive times.
Ex: Total amount of tax collected (in billions of US$) in the USA from 1993 to 2002.

Year  Tax

Chi–squared Goodness–of–fit test
1. Binomial experiment: a nominal variable has two outcomes.
   Eg: Do the majority of people like the new economic policies or not?
2. Multinomial experiment: for a nominal variable with three or more outcomes, we test more than two proportions.
   Eg: Do people have equal preferences among five brands of tea?
Note: A multinomial case can sometimes be reduced to a binomial case!

Example
100 persons took part in a survey about different brands of coffee. Each person tasted four different kinds of coffee (in a blind test) and noted which one they liked best. The result of the test is as follows:

Brand:              Ellips  Gexus  Luber  Loflia
Number of persons:

Does the result of the survey show that some brands are more popular than others, or are they all equally popular? In statistical terms we can formulate the problem as:
Null hypothesis: All the coffee brands are equally popular.
Alternative hypothesis: Not all the coffee brands are equally popular.

If the null hypothesis is true, we would expect the 100 persons to be spread evenly over the four brands:

Brand:              Ellips  Gexus  Luber  Loflia
Number of persons:  25      25     25     25

Can we, at a significance level of 5%, say anything about whether the null hypothesis is true or not?

One way of measuring how much the observed table differs from the expected table is to look at the differences between the observed and expected counts, $O_i - E_i$, in each category $i$.

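However, there is a problem: the difference between 10 and 20 is relatively larger than an equally large difference between two bigger counts. How can we take this into account? Square each difference, divide by the expected value, and sum over the categories to form a test statistic: $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$, where $O_i$ is the observed count and $E_i$ the expected count in category $i$.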

If the null hypothesis is true, $\chi^2$ ought to be close to zero. Is 4.64 so far away from zero that we can reject the null hypothesis? What is the sampling distribution of $\chi^2$ if the null hypothesis is true?

Chi-squared
Chi-squared has two meanings:
1. A continuous distribution: the $\chi^2$-distribution
2. A statistical test in which the sampling distribution of the test statistic is $\chi^2$-distributed

Chi-squared Distribution
The $\chi^2$ distribution is a parametric distribution with the parameter v, which is called the degrees of freedom. The distribution looks different for different degrees of freedom. The larger v is, the more symmetric the distribution becomes and the larger its expected value and standard deviation.
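The dependence on v can be made explicit with the standard moments of the distribution. These formulas are not on the original slide; they are added here as a short illustration, with $X$ denoting a variable that follows the $\chi^2$ distribution with v degrees of freedom.

```latex
\begin{align*}
  \mathrm{E}[X]       &= v \\
  \mathrm{SD}[X]      &= \sqrt{2v} \\
  \mathrm{skewness}   &= \sqrt{8 / v}
\end{align*}
```

The expected value and standard deviation both grow with v, while the skewness shrinks toward zero, which is why the curve looks more symmetric for large v.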

Tabulated values

Chi–squared Goodness–of–fit test
A test to see whether a variable with two or more possible categories has a specific distribution. (Do the observed frequencies in the different categories align with what we would expect from some theory?)

Chi–squared Goodness–of–fit test
- Formulate the null and alternative hypotheses.
- Compute the expected frequencies if the null hypothesis is true (expected counts).
- Note the observed frequencies (how many).
- Use the differences between the expected and the observed values to compute the value of the $\chi^2$ statistic.
- Compare your value with the critical value of $\chi^2$, or compare the p-value with your level of significance.
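The last step can be sketched for the coffee example, where the statistic was 4.64 and the four brands give 4 - 1 = 3 degrees of freedom. This is an illustration and not part of the original slides. To avoid external libraries it uses the closed-form tail probability that holds only for 3 degrees of freedom, and the function name chi2_sf_df3 is made up for this sketch.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 3 df.

    Closed form valid only for df = 3 (an assumption of this sketch):
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


observed_statistic = 4.64  # value reported in the coffee example
alpha = 0.05               # significance level used in the coffee example

p_value = chi2_sf_df3(observed_statistic)
reject_null = p_value < alpha

print(f"p-value = {p_value:.3f}, reject H0: {reject_null}")
```

The p-value comes out at roughly 0.20, well above 5%, so under these assumptions the survey does not give grounds to reject equal popularity.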

Chi–squared test of independence
- Tests whether two variables (each with two or more categories) are independent.
- For two nominal variables, this is the chi-squared test of a contingency table (Pearson's chi-squared test).

During the first 13 weeks of the season, the Saturday evening TV audience was distributed as follows:

SVT1  28%
SVT2  25%
TV3   18%
TV4   29%

After a change in the TV program presentation, a sample of 300 households was taken and the following numbers were observed:

SVT1  70 households
SVT2  89 households
TV3   46 households
TV4   95 households

Has the change in the TV program presentation changed the viewing pattern?
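A goodness-of-fit calculation for this question might look as follows. This is a minimal sketch, not the slides' own solution. It assumes a 5% significance level (the slide does not state one) and uses the tabulated critical value 7.815 for 3 degrees of freedom. The variable names are chosen for this example.

```python
# Viewing shares before the change (the null hypothesis) and the
# observed household counts after the change, both taken from the slide.
shares = {"SVT1": 0.28, "SVT2": 0.25, "TV3": 0.18, "TV4": 0.29}
observed = {"SVT1": 70, "SVT2": 89, "TV3": 46, "TV4": 95}

n = sum(observed.values())  # 300 households in the sample

# Expected count per channel if the viewing pattern is unchanged
expected = {channel: n * share for channel, share in shares.items()}

# Pearson statistic: sum of (observed - expected)^2 / expected
chi2_stat = sum(
    (observed[channel] - expected[channel]) ** 2 / expected[channel]
    for channel in shares
)

degrees_of_freedom = len(shares) - 1  # 4 categories - 1
critical_value = 7.815  # tabulated 5% value for 3 df (assumed level)

print(f"chi-squared = {chi2_stat:.2f}, df = {degrees_of_freedom}")
print("reject H0" if chi2_stat > critical_value else "do not reject H0")
```

With these numbers the statistic stays below the critical value, so at the assumed 5% level the sample does not show a changed viewing pattern.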

Eg: Bike helmets
A study investigated whether wearing a bicycle helmet is an effective way to protect people in bicycle accidents from skull damage. 793 persons participated in the study. The observed frequencies are cross-tabulated with damaged skull (Yes, No, Total) as rows and used helmet (Yes, No, Total) as columns.

We want to test:
Null hypothesis: The proportion of skull injuries is the same whether or not the person in the accident was using a helmet (no relationship).
Alternative hypothesis: The proportion of skull injuries differs between those who use helmets and those who don't.
Formally,
H0: Helmet use and skull damage are independent in accidents
H1: They are dependent

We compute the expected frequencies under the null hypothesis and perform a chi-squared test:

Expected frequency table

               Used helmet
Damaged skull  Yes                  No                   Total
Yes            235·147/793 = 43.6   235·646/793 = 191.4  235
No             558·147/793 = 103.4  558·646/793 = 454.6  558
Total          147                  646                  793
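The expected counts follow the rule (row total × column total) / grand total. The sketch below recomputes them from the marginal totals on the slide. It assumes that 235 and 558 are the damaged-skull totals and that 147 and 646 are the helmet totals, which is how the flattened table reads. The dictionary names are chosen for illustration.

```python
# Marginal totals read from the expected frequency table (assumed layout:
# rows = damaged skull yes/no, columns = used helmet yes/no).
row_totals = {"skull damaged": 235, "skull not damaged": 558}
col_totals = {"helmet": 147, "no helmet": 646}
grand_total = 793

# Under independence: expected = row total * column total / grand total
expected = {
    (row, col): row_total * col_total / grand_total
    for row, row_total in row_totals.items()
    for col, col_total in col_totals.items()
}

for cell, value in expected.items():
    print(cell, round(value, 1))
```

The four expected counts add up to the grand total, which is a quick check that the margins were entered consistently.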

We compare the observed table with the expected one. If the tables differ substantially, we reject the null hypothesis. We then have empirical evidence of a dependency between the variables.

If the null hypothesis is true, we would get a $\chi^2$ value close to zero. Is 28.57 so far away from zero that we can reject the null hypothesis? We compare our observed value with the critical value. We can also compare our observed p-value with our significance level.
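Both comparisons can be sketched for the helmet example. A 2×2 table has (2 - 1)(2 - 1) = 1 degree of freedom. The code assumes a 5% significance level, for which the tabulated critical value is 3.841. It also uses a tail formula that holds only for 1 degree of freedom, where the statistic is a squared standard normal variable. This is an illustration and not part of the original slides.

```python
import math

chi2_stat = 28.57  # value reported for the helmet example

# Degrees of freedom for an r x c contingency table: (r - 1) * (c - 1)
n_rows, n_cols = 2, 2
degrees_of_freedom = (n_rows - 1) * (n_cols - 1)

alpha = 0.05            # assumed significance level
critical_value = 3.841  # tabulated 5% value for 1 df

# For df = 1 only: P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2_stat / 2))

reject_by_critical_value = chi2_stat > critical_value
reject_by_p_value = p_value < alpha

print(f"df = {degrees_of_freedom}, p-value = {p_value:.2e}")
print("reject H0" if reject_by_critical_value and reject_by_p_value else "do not reject H0")
```

Both routes lead to the same decision here: 28.57 is far above the critical value and the p-value is far below 5%, so the null hypothesis of independence is rejected.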