1 Week 3 Association and correlation handout & additional course notes available at 15-10-2007 Trevor Thompson.



2 Overview
1) What are tests of association and which test do I use?
2) Associations within categorical data
 - descriptives (frequency tables)
 - the chi-square test
 - Howell (2002) Chap 6 & 9. 'Statistical Methods for Psychology'
3) Associations within continuous data
 - descriptives (scatterplots)
 - Spearman's rho and Pearson's 'r'

3 What is association/correlation?
 - Tests of association examine whether there is a relationship between variables
 - Variables are either associated or independent (which is the null hypothesis?)
 - Whether a result supports causation or only association depends on the experimental design, not on the test used

4 Which test to use?
Test selection depends on the data:
 - Categorical data: chi-square
 - Ordinal (ranked) data: Spearman's rho
 - Interval/ratio data: Pearson's r
Other, less commonly used tests exist (tetrachoric, Kendall's tau, phi etc.); see Howell
Logistic regression is covered in a later lecture

5 Which test to use: examples
 - Is there an association between height and weight? Pearson's r
 - Is there an association between 50 cities ranked for 'livability' 10 years ago and these cities ranked for 'livability' today? Spearman's rho
 - Is there an association between gender (male / female) and yogurt preference (light / dark)? Chi-square test

6 Pearson's chi-square test for categorical data
 - descriptives
 - assumptions
 - chi-square significance test
Research question: Is gender associated with preference for a specifically coloured yogurt?

7 Chi-square test
Data entry
 - each row should represent the responses of one participant
Compute the contingency (frequency) table
 - the 'n' in 'n-way table' denotes the number of variables: gender & yogurt preference is a 2-way table
 - tables are also described in terms of how many levels each variable has: a 3*2 table represents one variable with 3 levels and one variable with 2 levels
 - gender & yogurt preference is a 2*2 table
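As a rough illustration of the data-entry rule above, the sketch below builds a 2*2 contingency table from one-row-per-participant data. The raw rows are hypothetical: they are constructed only to match the cell counts used later in the slides (20, 10, 10, 20), and the function name `contingency_table` is an assumption made for this example.

```python
from collections import Counter

# Hypothetical raw data: one (gender, preference) tuple per participant.
rows = (
    [("female", "light")] * 20 + [("female", "dark")] * 10
    + [("male", "light")] * 10 + [("male", "dark")] * 20
)


def contingency_table(rows):
    """Count how many participants fall into each (row level, column level) cell."""
    counts = Counter(rows)
    row_levels = sorted({r for r, _ in rows})
    col_levels = sorted({c for _, c in rows})
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


row_levels, col_levels, table = contingency_table(rows)
print(row_levels, col_levels, table)
```

Levels are sorted alphabetically here, so the columns come out as dark then light; any fixed ordering would do as long as it is used consistently.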

8 Chi-square test: descriptives
Contingency tables (example tables not reproduced in this transcript) illustrating:
 - probable association
 - probable independence (no association)
 - possible association?

9 Chi-square test: assumptions
1. Observations must be independent
2. Observations must be mutually exclusive
 - each response should fall into only one cell, e.g. a participant prefers either dark or light yogurt, not both
3. Inclusion of non-occurrences
 - include all responses (e.g. both 'yes' and 'no'), otherwise the result can be misleading
4. Cell size
 - the expected count in each cell should be greater than 5

10 Chi-square test: significance testing
 - Are two variables significantly associated?
 - Run Pearson's chi-square

11 Chi-square test: Pearson's $\chi^2$ statistic
 - Gender & yogurt preference are significantly associated ($\chi^2 = 6.67$, $p < .05$)
 - Is this in the expected direction? Our hypothesis was 2-tailed. If it were 1-tailed (e.g. females will prefer light yogurts), check the contingency table for direction
 - The p-value can be halved for a 1-tailed hypothesis, but only if both variables have 2 levels

12 Chi-square test
Degrees of freedom
 - $df = (R-1)(C-1)$, where R = number of rows and C = number of columns
Yates' continuity correction
 - only applicable to 2*2 tables
 - changes $(O-E)^2$ in the formula to $(|O-E| - 0.5)^2$
 - not really needed
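A minimal sketch of the two ideas on this slide, assuming the gender & yogurt cell counts used elsewhere in the lecture (observed 20, 10, 10, 20; expected 15 per cell). The function names are made up for this example, and the `max(..., 0)` guard is an implementation choice rather than something stated in the slides.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1) * (C - 1) for an R x C contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def yates_chi_square(observed, expected):
    """Chi-square with Yates' continuity correction (intended for 2 x 2 tables).

    Each |O - E| is reduced by 0.5 before squaring. The max(..., 0) guard
    stops the correction from overshooting when |O - E| is below 0.5.
    """
    return sum(
        max(abs(o - e) - 0.5, 0.0) ** 2 / e
        for o, e in zip(observed, expected)
    )


observed = [20, 10, 10, 20]   # cell counts taken from the lecture example
expected = [15, 15, 15, 15]   # chance-expected counts for N = 60
df = degrees_of_freedom(2, 2)
corrected = yates_chi_square(observed, expected)
print(df, round(corrected, 2))
```

The corrected statistic is always somewhat smaller than the uncorrected one, which is why the correction is regarded as conservative.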

13 Chi-square test: likelihood ratio
 - An alternative test for associations in categorical data
 - For large samples, the likelihood ratio is approximately equal to Pearson's chi-square
 - For small samples, the chi-square test may be more accurate
 - The likelihood ratio is useful for multi-dimensional associations, covered in the logistic regression lecture
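The slide does not give a formula for the likelihood ratio. The sketch below assumes the standard likelihood-ratio statistic $G = 2 \sum O \ln(O/E)$, applied to the lecture's 2*2 counts; the function name is hypothetical.

```python
import math


def likelihood_ratio_chi_square(observed, expected):
    """G = 2 * sum(O * ln(O / E)); cells with O = 0 contribute nothing."""
    return 2.0 * sum(
        o * math.log(o / e)
        for o, e in zip(observed, expected)
        if o > 0
    )


observed = [20, 10, 10, 20]   # cell counts taken from the lecture example
expected = [15, 15, 15, 15]   # chance-expected counts for N = 60
g = likelihood_ratio_chi_square(observed, expected)
print(round(g, 2))
```

For these counts G comes out close to, but not identical with, the Pearson value of 6.67 reported in the slides.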

14 Chi-square test: odds-ratio (OR) estimate
How large is our significant association?
 - Odds of females choosing light relative to dark? 2/1
 - Odds of males choosing light relative to dark? 1/2
 - Odds ratio $= \frac{a/b}{c/d}$, or equivalently $OR = \frac{ad}{bc}$
 - Odds ratio: how much greater are the odds of choosing a light yogurt for females relative to males? 4/1
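The odds-ratio arithmetic can be sketched as follows. The cell labels a, b, c, d are assumed to mean females-light, females-dark, males-light and males-dark respectively, which is consistent with the odds quoted on the slide; the counts 20, 10, 10, 20 are the ones used in the lecture's worked example.

```python
def odds(successes, failures):
    """Odds of one outcome relative to the other within a single group."""
    return successes / failures


def odds_ratio(a, b, c, d):
    """OR = (a / b) / (c / d), which equals (a * d) / (b * c)."""
    return (a * d) / (b * c)


# Assumed layout: a = females/light, b = females/dark,
#                 c = males/light,   d = males/dark.
a, b, c, d = 20, 10, 10, 20
female_odds = odds(a, b)
male_odds = odds(c, d)
ratio = odds_ratio(a, b, c, d)
print(female_odds, male_odds, ratio)
```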

15 Chi-square test: underlying logic
 - Pearson $\chi^2 = \sum \frac{(O-E)^2}{E}$, where O = observed frequency and E = expected frequency
 - The $\chi^2$ statistic represents how far the observed data deviate from what would be expected by chance
Calculating $\chi^2$
Step 1. Calculate expected frequencies
 - Prob of choosing light yogurt? ½ (30/60)
 - Prob of being female? ½
 - Prob of being female & preferring light yogurt? ¼ [joint prob = p1 x p2]
 - So if N = 60, expected freq for each cell = 15 (60 x ¼)
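Step 1 generalises to any table: the expected count for a cell is its row total times its column total, divided by N, which is the joint-probability rule p1 x p2 scaled up by N. A small sketch, with a hypothetical function name and the lecture's 2*2 counts as input:

```python
def expected_frequencies(table):
    """Expected count per cell under independence: row total * column total / N."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: female, male. Columns: light, dark (counts from the lecture example).
observed_table = [[20, 10], [10, 20]]
expected_table = expected_frequencies(observed_table)
print(expected_table)
```

With equal margins (30 per row and 30 per column) every cell has the same expected count; unequal margins would give different expected counts per cell.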

16 Chi-square test: underlying logic
Step 2. Observed frequencies
 - The bigger the deviations between observed and chance-expected cell sizes, the greater the likelihood of a significant association
 - $\chi^2 = \sum \frac{(O-E)^2}{E} = \frac{(20-15)^2}{15} + \frac{(10-15)^2}{15} + \frac{(10-15)^2}{15} + \frac{(20-15)^2}{15} = 6.67$, same as in the SPSS output
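The same sum can be checked in a few lines. This is only a sketch of the hand calculation, not of how SPSS computes it internally; the function name is an assumption.

```python
def pearson_chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells of the contingency table."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [20, 10, 10, 20]   # observed cell counts from the slide
expected = [15, 15, 15, 15]   # chance-expected counts from Step 1
chi2 = pearson_chi_square(observed, expected)
print(round(chi2, 2))
```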

17 Chi-square test: underlying logic
 - The probability value corresponding to $\chi^2 = 6.67$ is p = .01 (meaning a value of 6.67 or larger would occur about 1 time in 100 by chance if the variables were independent)
 - The chi-square distribution shows the values of the chi-square statistic that would be obtained by chance in repeated sampling
 - The distribution of $\chi^2$ changes according to df
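For the special case df = 1 (a 2*2 table), the upper-tail probability can be computed with the standard library alone, because a chi-square variable with 1 df is the square of a standard normal variable. The sketch below relies on that identity and does not cover other df values; the function name is hypothetical.

```python
import math


def chi_square_p_value_df1(statistic):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    Uses P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(statistic / 2.0))


p = chi_square_p_value_df1(20 / 3)   # 20/3 = 6.67, the statistic from the slides
print(round(p, 3))
```

For tables with more than 1 df a statistics library would normally be used instead, since the tail probability no longer reduces to a single error-function call.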

18 Correlation and regression
 - Detailed coverage of correlation/regression in lectures 8 & 9
 - When X & Y are continuous variables, we use Pearson's correlation coefficient 'r' (or the equivalent Spearman's rho for ranked data)
Correlation vs. regression
 i. correlation is used to index strength of association; regression is used in prediction
 ii. (historically) if X is fixed then regression, if X is random then correlation

19 Correlation and regression: descriptives
Scatterplot
 - Correlation (r) is related to the degree to which the points cluster around the line (0 to 1 or -1)
 - Regression line is the "line of best fit"
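To make the link between clustering and r concrete, here is a minimal sketch of Pearson's r computed from paired scores. The height and weight values are invented purely for illustration, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: heights (cm) and weights (kg) for five people.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 69, 77, 90]
r = pearson_r(heights, weights)
print(round(r, 3))
```

Points lying exactly on a rising straight line give r = +1, points on a falling line give r = -1, and the looser the scatter around the line, the closer r gets to 0.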

20 Correlation and regression: significance testing
Pearson's product-moment correlation
 - r = 0: no correlation; r = +1 or -1: maximum correlation
 - Null hypothesis is that the population correlation is 0, with r normally distributed
 - To evaluate the significance of 'r', convert it to 't': $t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$
 - Assumptions of normality and homogeneity of variance apply; these are covered in detail in lecture 6
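The conversion from r to t can be sketched as below. The values r = 0.5 and N = 27 are hypothetical and chosen only to keep the arithmetic easy to follow; the resulting t is evaluated against a t distribution with N - 2 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """Convert a Pearson correlation r from n pairs into a t statistic (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.5, 27            # hypothetical correlation and sample size
t = t_from_r(r, n)
df = n - 2
print(round(t, 3), df)
```

The formula is undefined for r = +1 or -1, where the denominator becomes zero, so this sketch assumes |r| < 1.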

21 Summary
 - Selection of appropriate test depends on data
 - Chi-square test: explanation of output
 - Chi-square test: underlying logic
 - Correlation and regression