Categorical data analysis: An overview of statistical techniques AnnMaria De Mars The Julia Group AnnMaria De Mars The Julia Group.

Slides:



Advertisements
Similar presentations
Assumptions underlying regression analysis
Advertisements

Two-sample tests. Binary or categorical outcomes (proportions) Outcome Variable Are the observations correlated?Alternative to the chi- square test if.
Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/
Tutorial: Chi-Square Distribution Presented by: Nikki Natividad Course: BIOL Biostatistics.
Simple Logistic Regression
Departments of Medicine and Biostatistics
Statistical Tests Karen H. Hagglund, M.S.
Some Terms Y =  o +  1 X Regression of Y on X Regress Y on X X called independent variable or predictor variable or covariate or factor Which factors.
T-Tests.
Chapter 14 Conducting & Reading Research Baumgartner et al Chapter 14 Inferential Data Analysis.
Ordinal Logistic Regression
Chapter 19 Data Analysis Overview
Data Analysis Statistics. Levels of Measurement Nominal – Categorical; no implied rankings among the categories. Also includes written observations and.
Summary of Quantitative Analysis Neuman and Robson Ch. 11
Review for Exam 2 Some important themes from Chapters 6-9 Chap. 6. Significance Tests Chap. 7: Comparing Two Groups Chap. 8: Contingency Tables (Categorical.
Chapter 14 Inferential Data Analysis
Statistical hypothesis testing – Inferential statistics II. Testing for associations.
Statistics Idiots Guide! Dr. Hamda Qotba, B.Med.Sc, M.D, ABCM.
Inferential Statistics
The Practice of Social Research
INFERENTIAL STATISTICS – Samples are only estimates of the population – Sample statistics will be slightly off from the true values of its population’s.
STATISTICAL TECHNIQUES FOR research ROMMEL S. DE GRACIA ROMMEL S. DE GRACIA SEPS for PLANNING & RESEARCH SEPS for PLANNING & RESEARCH.
Categorical Data Prof. Andy Field.
Inferential Statistics: SPSS
LIS 570 Summarising and presenting data - Univariate analysis continued Bivariate analysis.
Descriptive Statistics e.g.,frequencies, percentiles, mean, median, mode, ranges, inter-quartile ranges, sds, Zs Describe data Inferential Statistics e.g.,
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 17 Inferential Statistics.
Copyright © 2008 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 22 Using Inferential Statistics to Test Hypotheses.
How to Teach Statistics in EBM Rafael Perera. Basic teaching advice Know your audience Know your audience! Create a knowledge gap Give a map of the main.
Proc freq: Five secrets* *Okay, well, lesser known facts.
X Treatment population Control population 0 Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx  Let X =  cholesterol level (mg/dL);
Associate Professor Arthur Dryver, PhD School of Business Administration, NIDA url:
Statistical analysis Prepared and gathered by Alireza Yousefy(Ph.D)
ANOVA and Linear Regression ScWk 242 – Week 13 Slides.
Multiple Regression and Model Building Chapter 15 Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
Linear correlation and linear regression + summary of tests
Categorical Data Analysis: When life fits in little boxes AnnMaria DeMars, PhD.
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Week 5: Logistic regression analysis Overview Questions from last week What is logistic regression analysis? The mathematical model Interpreting the β.
Educational Research Chapter 13 Inferential Statistics Gay, Mills, and Airasian 10 th Edition.
AnnMaria De Mars, Ph.D. The Julia Group Santa Monica, CA Categorical data analysis: For when your data DO fit in little boxes.
STATISTICAL ANALYSIS FOR THE MATHEMATICALLY-CHALLENGED Associate Professor Phua Kai Lit School of Medicine & Health Sciences Monash University (Sunway.
CHI SQUARE TESTS.
Academic Research Academic Research Dr Kishor Bhanushali M
ANALYSIS PLAN: STATISTICAL PROCEDURES
Review. Statistics Types Descriptive – describe the data, create a picture of the data Mean – average of all scores Mode – score that appears the most.
Inferential Statistics. The Logic of Inferential Statistics Makes inferences about a population from a sample Makes inferences about a population from.
Fundamental Statistics in Applied Linguistics Research Spring 2010 Weekend MA Program on Applied English Dr. Da-Fu Huang.
Going from data to analysis Dr. Nancy Mayo. Getting it right Research is about getting the right answer, not just an answer An answer is easy The right.
Introduction to Basic Statistical Tools for Research OCED 5443 Interpreting Research in OCED Dr. Ausburn OCED 5443 Interpreting Research in OCED Dr. Ausburn.
Chap 18-1 Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall Chap 18-1 Chapter 18 A Roadmap for Analyzing Data Basic Business Statistics.
Master’s Essay in Epidemiology I P9419 Methods Luisa N. Borrell, DDS, PhD October 25, 2004.
1 STA 617 – Chp10 Models for matched pairs Summary  Describing categorical random variable – chapter 1  Poisson for count data  Binomial for binary.
Tuesday PM  Presentation of AM results  What are nonparametric tests?  Nonparametric tests for central tendency Mann-Whitney U test (aka Wilcoxon rank-sum.
Soc 3306a Lecture 7: Inference and Hypothesis Testing T-tests and ANOVA.
Recapitulation! Statistics 515. What Have We Covered? Elements Variables and Populations Parameters Samples Sample Statistics Population Distributions.
1 Week 3 Association and correlation handout & additional course notes available at Trevor Thompson.
© 2006 by The McGraw-Hill Companies, Inc. All rights reserved. 1 Chapter 11 Testing for Differences Differences betweens groups or categories of the independent.
Jump to first page Inferring Sample Findings to the Population and Testing for Differences.
Nonparametric Statistics
Beginners statistics Assoc Prof Terry Haines. 5 simple steps 1.Understand the type of measurement you are dealing with 2.Understand the type of question.
Interpretation of Common Statistical Tests Mary Burke, PhD, RN, CNE.
STATISTICAL TESTS USING SPSS Dimitrios Tselios/ Example tests “Discovering statistics using SPSS”, Andy Field.
Chapter 22 Inferential Data Analysis: Part 2 PowerPoint presentation developed by: Jennifer L. Bellamy & Sarah E. Bledsoe.
Choosing and using your statistic. Steps of hypothesis testing 1. Establish the null hypothesis, H 0. 2.Establish the alternate hypothesis: H 1. 3.Decide.
NURS 306, Nursing Research Lisa Broughton, MSN, RN, CCRN RESEARCH STATISTICS.
CHAPTER 15: THE NUTS AND BOLTS OF USING STATISTICS.
CHOOSING A STATISTICAL TEST
BIVARIATE ANALYSIS: Measures of Association Between Two Variables
Presentation transcript:

Categorical data analysis: An overview of statistical techniques AnnMaria De Mars The Julia Group AnnMaria De Mars The Julia Group

Anyone who thinks he knows all of SAS is clinically insane Okay, Hemingway didn’t really say that, but he should have

Three uses for descriptive statistics Describe a sample Check data quality Answer descriptive questions Describe a sample Check data quality Answer descriptive questions

Descriptive Statistics PROC FREQ PROC UNIVARIATE PROC TABULATE ODS graphs SAS/Graph Graph – N- Go SAS Enterprise Guide

Basic Inferential Statistics Pearson chi-square McNemar Fisher

Answers to deep questions  What does a McNemar test test?  Why would a Pearson chi-square and a McNemar test give different answers?

Pearson Chi-Square  Tests for a relationship between two categorical variables, e.g. whether having participated in a program is related to having a correct answer on a test.  Assumes randomly sampled data  Assumes independent observations  Tests for a relationship between two categorical variables, e.g. whether having participated in a program is related to having a correct answer on a test.  Assumes randomly sampled data  Assumes independent observations

Good for chi-square Correct cause Group YESNO Interactive919 Handouts5545

Why is the previous example good? It includes two independent groups There are adequate numbers per cell

Bad for chi-square Correct death Pre-Post YES NO PRE1585 POST919

Enter the McNemar This is a test of correlated proportions It is commonly used to test, for example, if the proportion showing mastery at time 1 = the proportion showing mastery at time 2

Bad for Pearson chi-square Correct cause Group YESNO Interactive123 Handouts84

Fisher’s exact test Is used when the assumption of large sample sizes cannot be met There is no advantage to using it if you do have large sample sizes

A lot more … Cochran-Mantel- Haenszel test for repeated tests of independence –Do athletes in physical therapy report improvement in mobility more than those who do not receive PT and does this vary depending on if it is preseason or during the season ?

Other simple statistics Binomial tests Confidence intervals Odds ratios Because, obviously, not everyone has the same tastes

What about logistic regression? Logistic is similar to linear regression in that a dependent variable is predicted from a combination of independent variables The dependent is the LOG of the ODDS ratio of being in one group versus another

Example: Death certificates The death certificate is an important medical document. Resident physician accuracy in completing death certificates is poor. Participants were in an interactive workshop or provided printed handouts. Pre-existing knowledge was measured

Example Dependent: Cause of death medical student is correct or incorrect Independent: Group Independent: Awareness of guidelines for death certificate completion

Surveylogistic Interpreted the same as the logistic output but allows inclusion of survey features such as strata and cluster

Other PROCs CATMOD CORRESP PRINQUAL

Hybrids T-test ANOVA NPAR1WAY FACTOR REG

It’s all about questions Are your data any good? What is the distribution of X ? What is the distribution of X given Y? Is there a significant relationship between X and Y? Given X, what are the odds of Y? How well, and with what variables, can we predict which category of X a person falls into? Is this set of variables significantly better for predicting X than that other set of variables lying over there?

Our secret plan Bivariate descriptives Contingency, chi-square, probability Other descriptives Other simple statistics Logistic regression