Interpreting regression for non-statisticians Colin Fischbacher.

Presentation transcript:

Interpreting regression for non-statisticians Colin Fischbacher

What this presentation will cover
overview of regression methods
what are they and why use them?
what do results from regression look like?
how do you interpret those results?
what pitfalls should I look out for?

What is regression?
regression relates two kinds of variables:
outcome variables: for example
  – 30 day mortality
  – blood pressure
  – CHD admission rate
explanatory variables: for example
  – age
  – sex
  – treatment type

What is regression? (2) are these variables related? if so, in what way?

What is regression? (3)
the red line is an estimate of the relationship that best fits the data we have
other estimates are possible

What is regression? (4)
regression can examine more than one explanatory variable at a time
males in red, females in black
females have higher blood pressure overall

What is regression? (5)
males in red, females in black
at each age male blood pressure is higher
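The two slides above can both be true at once. A minimal sketch, using invented numbers rather than anything from the presentation: if the women in a sample are mostly older, their overall mean blood pressure can be higher even though men have the higher pressure within every age group.

```python
# Hypothetical data: (age group, sex, number of people, mean blood pressure in mmHg).
# Women are concentrated in the older group, where blood pressure is higher.
groups = [
    ("40-49", "male", 80, 130.0),
    ("40-49", "female", 20, 126.0),
    ("70-79", "male", 20, 150.0),
    ("70-79", "female", 80, 146.0),
]

def overall_mean(sex):
    """Mean blood pressure for one sex, weighting each age group by its size."""
    rows = [(n, bp) for _, s, n, bp in groups if s == sex]
    total = sum(n for n, _ in rows)
    return sum(n * bp for n, bp in rows) / total

male_overall = overall_mean("male")      # 134.0
female_overall = overall_mean("female")  # 142.0

# Within each age group men are 4 mmHg higher, yet women are higher overall.
print(male_overall, female_overall)
```

The overall comparison mixes the effect of sex with the effect of age; comparing within age groups separates the two, which is what a regression model with both variables does.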

What is regression? (6)
males in red, females in black
here regression is used to estimate how much blood pressure rises with age (so many mmHg per year)
taking this effect of age into account, regression is also used to estimate how much higher male blood pressure is than female blood pressure (so many mmHg higher at the same age)
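As a rough sketch of what a stats package does behind the scenes, the example below fits blood pressure on age and sex by ordinary least squares. The data are hypothetical and noise-free, generated from an assumed rule (100 + 0.5 mmHg per year of age + 4 mmHg for males), so the fit should simply recover those numbers; the function names are illustrative only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_linear(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical people: ages 30 to 70 in steps of 10, one woman (0) and one man (1) at each age.
people = [(age, male) for age in range(30, 80, 10) for male in (0, 1)]
X = [[1.0, float(age), float(male)] for age, male in people]  # intercept, age, male indicator
y = [100 + 0.5 * age + 4 * male for age, male in people]      # assumed generating rule

intercept, per_year, male_effect = fit_linear(X, y)
print(round(intercept, 3), round(per_year, 3), round(male_effect, 3))
```

Here `per_year` is the rise in blood pressure per year of age at a given sex, and `male_effect` is the male minus female difference at a given age. Real data contain noise, so real estimates come with confidence intervals.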

Why use regression methods?
There are other methods to adjust for one or two variables
  – standardisation
  – stratification
These methods deal well with one or two explanatory variables (usually age or sex)
Regression allows you to take into account the effect of many variables at the same time
Answers the question “What’s the effect of this variable allowing for all the other ones in the model?”

What methods are available?
Depends on outcome variable
Continuous variable (eg blood pressure)
  – linear regression
Yes/no/binary outcome (eg dead/alive)
  – logistic regression
Rate variable (eg admissions per year)
  – Poisson regression
Time to event (eg death from cancer)
  – Cox regression/survival analysis
(Many other types also available)

Linear regression Continuous variable (eg blood pressure)

Linear regression
Continuous outcome data (eg blood pressure)
                    Blood pressure mmHg
Age (per year)      0.5 (0.3, 0.7)
Sex (male)          4.0 (3.5, 4.5)
Ethnic group
  White             0 (ref)
  South Asian       3.5 (3.0, 4.0)
  Afro-Caribbean    4.1 (3.6, 4.6)

Logistic regression
Yes/no/binary outcome (eg dead/alive)
Death within 30 days of heart attack
                               Odds ratio (95% CI)
Age
  years
  years                        1.5 (1.1, 1.9)
  years                        2.5 (1.5, 3.0)
Sex
  Male                         1.0
  Female                       1.2 (1.1, 1.3)
Blood pressure (per 10mmHg)    1.5 (1.4, 1.6)

Poisson regression
Rate variable (eg admissions per year)
Emergency admission for COPD
                              Rate ratio (95% CI)
Sex
  Females                     1.0
  Males                       1.2 (0.5, 1.9)
Additional co-morbidities
  None                        1.0
  Present                     2.5 (2.2, 2.8)
Age (per 10 year increase)    1.5 (1.3, 1.7)

Cox regression
Time to event (eg recurrence of cancer)
Time to recurrence of cancer
                        Hazard ratio (95% CI)
Treatment
  Previous treatment    1.0
  New drug X            0.5 (0.2, 0.8)
Stage of disease
  Grade 1               1.0
  Grade 2               0.9 (0.5, 1.3)
  Grade 3               1.5 (1.2, 1.8)
Age                     1.01 (1.005, 1.015)

Some notes of caution
Regression is technically easy with most stats packages (point and click)
However skill is needed:
  – to choose the right method and the best model
  – to select how many and which variables to include
  – to check that the final model fits well
  – to interpret the final results
There are always important assumptions
Modelling requires experience and judgement and includes a degree of subjectivity

What should I look for?
The kind of model used (logistic, Poisson etc)
The variables included in the model
The effect estimates for each variable (or “parameter”)
For each categorical variable an indication of which category is the reference category (usually given a null effect size)
An assessment of the goodness of model fit

What do the results mean?
Effect estimates (may be called coefficients) may be:
Single figures
Odds ratios
Rate ratios
Hazard ratios
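One common way to read these estimates, sketched below, is to check whether the 95% confidence interval excludes the "no effect" value. The slides do not spell this rule out; it is the standard convention that the null value is 1 for ratios (odds, rate, hazard) and 0 for single figures from linear regression. The function name is hypothetical, and the numbers are taken from the example tables in this presentation.

```python
def excludes_null(lower, upper, null_value):
    """True if the confidence interval lies entirely on one side of the null value."""
    return null_value < lower or null_value > upper

# Ratio measures: no effect corresponds to a ratio of 1.
male_copd = excludes_null(0.5, 1.9, null_value=1.0)    # rate ratio 1.2 (0.5, 1.9)
comorbidity = excludes_null(2.2, 2.8, null_value=1.0)  # rate ratio 2.5 (2.2, 2.8)

# Single figures from linear regression: no effect corresponds to 0.
age_bp = excludes_null(0.3, 0.7, null_value=0.0)       # 0.5 mmHg per year (0.3, 0.7)

print(male_copd, comorbidity, age_bp)  # False True True
```

An interval that includes the null value, as for males in the COPD example, means the data are compatible with no difference; it does not show that there is none.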

Linear regression
Continuous outcome data (eg blood pressure)
                    Blood pressure mmHg (95% CI)
Age (per year)      0.5 (0.3, 0.7)
Sex (male)          4.0 (3.5, 4.5)
Ethnic group
  White             0 (ref)
  South Asian       3.5 (3.0, 4.0)
  Afro-Caribbean    4.1 (3.6, 4.6)
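A short worked example of reading the linear regression figures: each coefficient is a difference in mmHg, and under the assumption of a purely additive model (no interaction terms, which the slide does not state either way) the differences simply add up. The dictionary keys and the two people compared are made up for illustration.

```python
# Coefficients from the table above, in mmHg.
coef = {
    "age_per_year": 0.5,
    "male": 4.0,
    "south_asian": 3.5,    # relative to White (reference category, 0)
    "afro_caribbean": 4.1, # relative to White (reference category, 0)
}

# Compare a 60-year-old South Asian man with a 50-year-old White woman.
age_part = coef["age_per_year"] * (60 - 50)  # 10 years older: 5.0 mmHg
sex_part = coef["male"]                      # male vs female: 4.0 mmHg
ethnic_part = coef["south_asian"]            # South Asian vs White: 3.5 mmHg

difference = age_part + sex_part + ethnic_part
print(difference)  # 12.5 mmHg higher, on average, under these assumptions
```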

Logistic regression
Yes/no/binary outcome (eg dead/alive)
Death within 30 days of heart attack
                               Odds ratio (95% CI)
Age
  years
  years                        1.5 (1.1, 1.9)
  years                        2.5 (1.5, 3.0)
Sex
  Male                         1.0
  Female                       1.2 (1.1, 1.3)
Blood pressure (per 10mmHg)    1.5 (1.4, 1.6)
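Odds ratios multiply odds, not probabilities, which is easy to misread. The sketch below applies the odds ratio of 1.5 per 10 mmHg from the table to a baseline risk of 10%; that baseline is a made-up figure chosen only to make the arithmetic visible, and the function name is hypothetical.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Convert a probability to odds, multiply by the odds ratio, convert back."""
    odds = baseline_prob / (1 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)

baseline = 0.10  # hypothetical 30 day risk of death for the comparison patient

# Blood pressure 10 mmHg higher: odds multiplied by 1.5.
risk_plus_10 = apply_odds_ratio(baseline, 1.5)       # about 0.143

# Blood pressure 20 mmHg higher: odds multiplied by 1.5 twice (1.5 * 1.5 = 2.25).
risk_plus_20 = apply_odds_ratio(baseline, 1.5 ** 2)  # 0.20

print(round(risk_plus_10, 3), round(risk_plus_20, 3))
```

Note that an odds ratio of 1.5 raised the risk from 10% to about 14%, not to 15%; odds ratios and risk ratios are only close when the outcome is rare.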

Poisson regression
Rate variable (eg admissions per year)
Emergency admission for COPD
                             Rate ratio (95% CI)
Sex
  Females                    1.0
  Males                      1.2 (0.5, 1.9)
Additional co-morbidities
  None                       1.0
  Present                    2.5 (2.2, 2.8)
Age                          1.01 (1.005, 1.015)

Cox regression
Time to event (eg recurrence of cancer)
Time to recurrence of cancer
                        Hazard ratio (95% CI)
Treatment
  Previous treatment    1.0
  New drug X            0.5 (0.2, 0.8)
Stage of disease
  Grade 1               1.0
  Grade 2               0.9 (0.5, 1.3)
  Grade 3               1.5 (1.2, 1.8)
Age (per 10 years)      1.5 (1.4, 1.6)
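The size of a ratio for a continuous variable depends on the units it is reported in, so an age effect of 1.01 and one of 1.5 are not directly comparable until the units match. A minimal sketch of the conversion, assuming the model treats age as having a constant multiplicative effect per unit (the usual form for these models, though the slides do not say so) and assuming an unlabelled age ratio is per year:

```python
def rescale_ratio(ratio, from_units, to_units):
    """Re-express a ratio reported per `from_units` as a ratio per `to_units`."""
    return ratio ** (to_units / from_units)

# Hazard ratio of 1.5 per 10 years, expressed per single year: about 1.04.
per_year = rescale_ratio(1.5, from_units=10, to_units=1)

# Ratio of 1.01 per year, expressed per 10 years: about 1.10.
per_decade = rescale_ratio(1.01, from_units=1, to_units=10)

print(round(per_year, 3), round(per_decade, 3))
```

A ratio that looks negligible per year can be substantial over a decade, so always check the units before judging the size of an effect.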

What else should I look for?
Is the basic question clear?
  – why was a regression method chosen?
Was the correct model used?
  – logistic if yes/no outcomes, Poisson if rates etc
Which variables were included?
  – were any that you think are important left out?
How were the variables chosen?
  – modelling strategies and results of exploration?
How many variables were included?
  – cases per variable approximate rule of thumb
Effect sizes (or “coefficients”) and confidence intervals
Were measures of model fit reported?

REAL LIFE EXAMPLES regression methods

Cox regression
McBride and colleagues (BMJ Dec 4, 2010) conducted a study of patients in 324 UK general practices and examined the time they waited between consulting their GP with hip pain and being referred to secondary care.
The figures show hazard ratios for referral from a Cox regression model that included age group, sex and deprivation quintile.

Poisson regression
Sim and colleagues (BMJ Dec 4, 2010) conducted a study to examine changes in the rate of emergency admission for acute myocardial infarction before and after the introduction of smoke free legislation in England.
After adjusting for year of admission, temperature, Christmas holidays and week of admission in a Poisson regression model, they obtained the results shown in the table.
BMJ 340: doi: /bmj.c2161

Logistic regression
Alm and colleagues interviewed parents of 294 cases of Sudden Infant Death Syndrome (SIDS) in three Scandinavian countries, asking about coffee and alcohol consumption by the mother.
* adjusted for maternal smoking in 1st trimester, maternal age, education and parity
Arch Dis Child 1999;81: doi: /adc

Conclusions
Regression methods allow you to examine the effects of many variables simultaneously
However they do not give “automatic” answers
Care is needed in choice of method, selection of variables, testing the final model and interpreting the results
Model building always involves some degree of judgement and personal choice