Biostat Didactic Seminar Series. Analyzing Binary Outcomes: An Introduction to Logistic Regression. Robert Boudreau, PhD


Biostat Didactic Seminar Series
Analyzing Binary Outcomes: An Introduction to Logistic Regression
Robert Boudreau, PhD
Co-Director of Methodology Core, PITT-Multidisciplinary Clinical Research Center for Rheumatic and Musculoskeletal Diseases
Core Director for Biostatistics, Center for Aging and Population Health
Dept. of Epidemiology, GSPH
10/8/2010

Flow chart for group comparisons
Measurements to be compared:
- Continuous: is the distribution approximately normal, or N ≥ 20?
  - Yes: t-tests
  - No: non-parametric tests
- Discrete (binary, nominal, ordinal with few values): chi-square, Fisher's exact
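The decision path in this flow chart can be written as a small function. This is only a sketch in Python (not the SAS used on the slides); the function name, argument names, and returned labels are hypothetical, and whether "N" means the total or the per-group sample size is not stated on the slide.

```python
def choose_group_comparison(measurement: str,
                            approx_normal: bool = False,
                            n: int = 0) -> str:
    """Return the test family suggested by the group-comparison flow chart.

    measurement: "continuous" or "discrete" (binary, nominal, few-valued ordinal).
    approx_normal / n: only used for continuous measurements.
    """
    if measurement == "continuous":
        # Flow chart question: distribution approx normal OR N >= 20?
        if approx_normal or n >= 20:
            return "t-test"
        return "non-parametric test"
    if measurement == "discrete":
        return "chi-square or Fisher's exact test"
    raise ValueError("measurement must be 'continuous' or 'discrete'")


print(choose_group_comparison("continuous", approx_normal=False, n=12))
print(choose_group_comparison("continuous", approx_normal=False, n=50))
print(choose_group_comparison("discrete"))
```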

Flow chart for regression models (includes adjusted group comparisons)
Outcome variable continuous or dichotomous?
- Dichotomous (binary): is time-to-event available (or relevant)?
  - No: multiple logistic regression
  - Yes: Cox proportional hazards regression
- Continuous: is the predictor variable categorical?
  - No: multiple linear regression
  - Yes (e.g. groups): ANCOVA (multiple linear regression using dummy variable(s) for the categorical variable(s))
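The same flow chart, sketched as a Python function (not the SAS used on the slides). The function and argument names are hypothetical; the branches and model names follow the slide.

```python
def choose_regression_model(outcome: str,
                            time_to_event: bool = False,
                            categorical_predictor: bool = False) -> str:
    """Return the model suggested by the regression flow chart.

    outcome: "binary" (dichotomous) or "continuous".
    time_to_event: only consulted for binary outcomes.
    categorical_predictor: only consulted for continuous outcomes.
    """
    if outcome == "binary":
        if time_to_event:
            return "Cox proportional hazards regression"
        return "multiple logistic regression"
    if outcome == "continuous":
        if categorical_predictor:
            return "ANCOVA (multiple linear regression with dummy variables)"
        return "multiple linear regression"
    raise ValueError("outcome must be 'binary' or 'continuous'")


# Knee OA (yes/no) with no event times: the branch this seminar covers.
print(choose_regression_model("binary", time_to_event=False))
```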

Analysis From Last Didactic …
In Health, Aging and Body Composition Knee-OA Substudy:
Examine the association between SxRxKOA (knee OA) and CRP, adjusted for BMI.
Motivation: Sowers M, Hochberg M, et al. C-reactive protein as a biomarker of emergent osteoarthritis. Osteoarthritis and Cartilage, Volume 10, Issue 8, August 2002.
Conclusion: “CRP is highly associated with Knee OA; however, its high correlation with obesity limits its utility as an exclusive marker for knee OA”

Logistic Regression: Outline for today
- Definition and interpretation of the odds-ratio for a binary outcome
- Essential equivalence of odds-ratio ↔ testing for group differences in rates (or percentages) when evaluated using a 2 x 2 table, chi-square and p-values
- Logistic regression as the “binary outcome” version of multiple linear regression: group (and covariate adjustment) effects are interpreted as odds-ratios affecting the binary outcome
- Detailed example: relating obesity to odds of knee OA, adjusted for race and gender

HABC: Obese x KneeOA
Obese: BMI > 30
Chi-square P < …
Obese=1: Odds of kneeOA = p/(1-p) = 0.2444/0.7556 ≈ 0.323
Obese=0: Odds of kneeOA = p/(1-p) = 0.0911/0.9089 ≈ 0.100
Obesity odds-ratio for kneeOA: OR = odds(obese=1)/odds(obese=0) = 3.225
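The odds and odds-ratio arithmetic on this slide can be reproduced with a few lines of Python (a sketch, not the SAS used in the seminar). The two proportions are taken from the slide; the helper name `odds` is hypothetical. Because the slide's proportions are rounded to four decimals, the result agrees with the reported OR of 3.225 only up to rounding.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) to odds, p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


# Proportions with knee OA, as printed on the slide (rounded).
p_obese = 0.2444      # obese = 1
p_nonobese = 0.0911   # obese = 0

odds_obese = odds(p_obese)
odds_nonobese = odds(p_nonobese)
odds_ratio = odds_obese / odds_nonobese

print(f"odds (obese=1) = {odds_obese:.4f}")
print(f"odds (obese=0) = {odds_nonobese:.4f}")
print(f"odds ratio     = {odds_ratio:.3f}")
```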

HABC: Obese x KneeOA
  proc logistic data=worst_knee_vs_noOA;
    model kneeOA(event="1")=obese;
  run;
Note the OR and its confidence interval (C.I.): (2.56, 4.04) does not cover 1.0 => statistically significant.
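A short Python check (a sketch, not part of the SAS output) of why this interval implies significance. It assumes the reported limits are 95% Wald limits, which are symmetric around log(OR) on the log scale; under that assumption the standard error and z statistic can be recovered from the printed numbers. The OR of 3.225 comes from the 2 x 2 table slide.

```python
import math

or_hat = 3.225                 # odds ratio from the 2 x 2 table slide
ci_low, ci_high = 2.56, 4.04   # C.I. quoted on this slide

# Assumption: 95% Wald interval, i.e. log(OR) +/- 1.96 * SE.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# On the log scale the point estimate sits midway between the limits,
# so the geometric mean of the limits should be close to the OR.
midpoint_or = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)

# Wald z statistic for H0: OR = 1 (log OR = 0).
z = math.log(or_hat) / se_log_or

# The interval excludes 1.0 exactly when |z| > 1.96 at the 5% level.
excludes_one = not (ci_low <= 1.0 <= ci_high)

print(f"SE(log OR) = {se_log_or:.4f}")
print(f"midpoint   = {midpoint_or:.3f}")
print(f"z          = {z:.2f}, excludes 1.0: {excludes_one}")
```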

HABC: Obese x KneeOA
Prob[kneeOA│obese=0] = exp(-2.3)/(1+exp(-2.3)) ≈ 0.0911
Prob[kneeOA│obese=1] = exp(-2.3 + b)/(1+exp(-2.3 + b)) ≈ 0.2444, where b = ln(3.225) ≈ 1.17 is the coefficient for obese

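HABC: Obese x KneeOA
Obese: BMI > 30
Chi-square P < …
Prob[kneeOA│obese=0] = exp(-2.3)/(1+exp(-2.3)) ≈ 0.0911
Prob[kneeOA│obese=1] = exp(-2.3 + b)/(1+exp(-2.3 + b)) ≈ 0.2444, where b = ln(3.225) ≈ 1.17
General logistic regression form:
Prob[kneeOA│obese] = exp(int + b*obese)/(1+exp(int + b*obese)), where int is the intercept and b is the coefficient (log odds-ratio) for obese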
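The general form can be checked numerically. The Python sketch below (not the seminar's SAS) uses the rounded intercept of -2.3 from the slide and assumes the obese coefficient equals the log of the reported odds-ratio, ln(3.225); the helper name `inv_logit` is hypothetical. With one binary predictor the fitted probabilities should reproduce the observed proportions, 0.0911 and 0.2444, up to rounding.

```python
import math


def inv_logit(x: float) -> float:
    """Logistic function: exp(x) / (1 + exp(x))."""
    return math.exp(x) / (1.0 + math.exp(x))


intercept = -2.3             # rounded intercept shown on the slide
b_obese = math.log(3.225)    # assumption: coefficient = ln(odds ratio)

p_nonobese = inv_logit(intercept)            # obese = 0
p_obese = inv_logit(intercept + b_obese)     # obese = 1

print(f"Prob[kneeOA | obese=0] = {p_nonobese:.4f}")
print(f"Prob[kneeOA | obese=1] = {p_obese:.4f}")
```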

Gender x PAD

Gender x PAD (referent=female)
  proc logistic data=pad;
    model y1ppad(event="1")=male;
  run;

Gender x PAD (ref=male)
  proc logistic data=pad;
    model y1ppad(event="1")=female;
  run;

Gender x PAD (compare models: ref=female vs ref=male)
Male OR (vs females) = 1.891
Female OR (vs males) = 1/1.891 ≈ 0.529
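Switching the referent group inverts the odds-ratio, which is the same as flipping the sign of the regression coefficient. A minimal Python sketch (not the seminar's SAS) using the male OR of 1.891 from the slide; the variable names are hypothetical.

```python
import math

male_or = 1.891               # males vs females (referent = female)
female_or = 1.0 / male_or     # females vs males (referent = male)

# On the log-odds scale the two coefficients differ only in sign.
b_male = math.log(male_or)
b_female = math.log(female_or)

print(f"female OR = {female_or:.3f}")
print(f"b_male = {b_male:.4f}, b_female = {b_female:.4f}")
```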

CHD x KneeOA
CHD and knee OA: association not statistically significant; C.I. = (0.79, 1.34) covers 1.0

Self-reported rheumatoid arthritis as binary outcome (or covariate) for analyses ? (NOT ?#!)

White Females: Obesity x KneeOA

White vs Black Females, Obesity x KneeOA: similar ORs
[Panels: White Females; Black Females]

Black females have about twice the odds of kneeOA of white females
  proc logistic data=worst_knee_vs_noOA;
    model kneeOA(event="1")=black;
    where female;
  run;

Obesity odds-ratio is the same for white and black women (interaction term is not significant)
  proc logistic data=worst_knee_vs_noOA;
    model kneeOA(event="1")=obese black obese_x_black;
    where female;
  run;

Black women have higher odds of knee OA than white women of the same obesity status (OR=1.53; e.g. non-obese black vs non-obese white), and obesity is associated with higher odds of knee OA (OR=3.61) within each race
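In a model with main effects for race and obesity and no interaction term, odds-ratios multiply across predictors. The Python sketch below (not the seminar's SAS) uses the two ORs from the slide to tabulate the implied odds-ratio of each group relative to non-obese white women; the function name and the layout are hypothetical, and the calculation assumes the no-interaction model holds.

```python
or_black = 1.53   # black vs white, same obesity status (from the slide)
or_obese = 3.61   # obese vs non-obese, same race (from the slide)


def combined_or(black: int, obese: int) -> float:
    """OR vs the reference group (white, non-obese) with no interaction."""
    return (or_black ** black) * (or_obese ** obese)


for black in (0, 1):
    for obese in (0, 1):
        label = f"black={black}, obese={obese}"
        print(f"{label}: OR vs reference = {combined_or(black, obese):.2f}")
```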

Obesity explains some, but not all, of the difference in odds of knee OA between black and white females (note the attenuation of the “black race” OR from 2.08 to 1.53 after “adjusting” for obesity)
  proc logistic data=worst_knee_vs_noOA;
    model kneeOA(event="1")=black obese;
    where female;
  run;
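One way to put a number on the attenuation is to compare the race coefficients on the log-odds scale before and after adjustment. This percent-attenuation measure is a common convention but is not taken from the slides, so treat it as an illustrative assumption; the Python sketch below (not the seminar's SAS) uses only the two ORs quoted above.

```python
import math

or_unadjusted = 2.08   # black vs white, unadjusted (from the slide)
or_adjusted = 1.53     # black vs white, adjusted for obesity (from the slide)

b_unadjusted = math.log(or_unadjusted)
b_adjusted = math.log(or_adjusted)

# Share of the unadjusted log-odds difference that disappears after
# adjusting for obesity (an illustrative convention, not from the slides).
attenuation = 1.0 - b_adjusted / b_unadjusted

print(f"log OR unadjusted = {b_unadjusted:.3f}")
print(f"log OR adjusted   = {b_adjusted:.3f}")
print(f"attenuation       = {attenuation:.1%}")
```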

White Females: Continuous CRP
Difference in average logCRP: 0.76 - 0.43 = 0.33

                     Knee OA: No (n=752)   Knee OA: Yes (n=92)   P-value (equal vars)   P-value (unequal vars)
logCRP, mean (SD)    0.43 (0.83)           0.76 (0.58)           0.0002                 < …

The logCRP SDs were significantly different (p<0.0001) => use the Satterthwaite unequal-variance test
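Both t statistics can be reconstructed from the summary statistics in the table. The Python sketch below (not the seminar's SAS) computes the pooled equal-variance t and the unequal-variance t with Satterthwaite degrees of freedom; only the means, SDs and sample sizes come from the slide, and the variable names are hypothetical. P-values are not computed here because that would need a t-distribution routine outside the standard library.

```python
import math

# Summary statistics for logCRP among white females (from the slide).
n_no, mean_no, sd_no = 752, 0.43, 0.83      # no knee OA
n_yes, mean_yes, sd_yes = 92, 0.76, 0.58    # knee OA

diff = mean_yes - mean_no

# Equal-variance (pooled) t statistic.
df_pooled = n_no + n_yes - 2
sp2 = ((n_no - 1) * sd_no**2 + (n_yes - 1) * sd_yes**2) / df_pooled
t_pooled = diff / math.sqrt(sp2 * (1 / n_no + 1 / n_yes))

# Unequal-variance t statistic with Satterthwaite degrees of freedom.
v_no = sd_no**2 / n_no
v_yes = sd_yes**2 / n_yes
t_unequal = diff / math.sqrt(v_no + v_yes)
df_satterthwaite = (v_no + v_yes) ** 2 / (
    v_no**2 / (n_no - 1) + v_yes**2 / (n_yes - 1)
)

print(f"pooled:  t = {t_pooled:.2f}, df = {df_pooled}")
print(f"unequal: t = {t_unequal:.2f}, df = {df_satterthwaite:.1f}")
```

The smaller group has the smaller SD, so the pooled standard error is too large here; the unequal-variance test gives the larger t statistic, which is consistent with its smaller P-value in the table.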

All White Females in HABC (N=844) [includes SxRxKOA (n=93); also rest of parent study cohort]
N=5 had CRP > 30 (max=63.2)

log CRP

White Females: Continuous CRP as predictor of kneeOA
Standardized variable: mean-centered, then divided by the SD
logCRP_perSD = (logCRP - mean(logCRP))/SD(logCRP)
Units of standardized logCRP are SDs
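Standardizing a predictor is a two-step transformation: subtract the sample mean, then divide by the sample SD. A minimal Python sketch (not the seminar's SAS); the function name is hypothetical and the `log_crp` values are made-up illustration data, not HABC measurements.

```python
import statistics


def standardize(values):
    """Mean-center the values and divide by the sample SD (n - 1 denominator)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical logCRP values, for illustration only.
log_crp = [-0.9, -0.2, 0.1, 0.4, 0.6, 1.0, 1.7]
log_crp_per_sd = standardize(log_crp)

# After standardizing, the mean is 0 and the SD is 1, so a one-unit
# difference in the new variable is a one-SD difference in logCRP.
print(f"mean = {statistics.mean(log_crp_per_sd):.6f}")
print(f"SD   = {statistics.stdev(log_crp_per_sd):.6f}")
```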

White Females: per SD higher logCRP, the odds of knee OA are higher by a factor of OR=1.5
  proc logistic data=worst_knee_vs_noOA3;
    model kneeOA(event="1")=logCRP_perSD;
    where female and white;
  run;
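Because the model is linear on the log-odds scale, a per-SD odds-ratio scales multiplicatively with the size of the difference. The Python sketch below (not the seminar's SAS) uses the OR of 1.5 from the slide; the function name is hypothetical, and the calculation assumes the log-odds are linear in logCRP across the range considered.

```python
import math

or_per_sd = 1.5                   # from the slide
beta_per_sd = math.log(or_per_sd)  # coefficient on the log-odds scale


def or_for_difference(k_sd: float) -> float:
    """Odds ratio for a difference of k_sd standard deviations in logCRP."""
    return math.exp(beta_per_sd * k_sd)


for k in (0.5, 1, 2):
    print(f"{k} SD higher logCRP: OR = {or_for_difference(k):.2f}")
```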

Thank you
Questions, comments, suggestions or insights?
Remaining time: Open consultation …