MBP1010 – Lecture 8: March 1, 2011 1.Odds Ratio/Relative Risk Logistic Regression Survival Analysis Reading: papers on OR and survival analysis (Resources)


MBP1010 – Lecture 8: March 1, 2011. Odds Ratio/Relative Risk, Logistic Regression, Survival Analysis. Reading: papers on OR and survival analysis (Resources); Ch 10 Multifactorial Analyses

Assignment 3 Due: March 8 - solutions will be posted after the due date - but marks will likely not be available prior to the exam

Observational Studies with Binary Outcomes Case/control and cohort studies - common in cancer research Outcome: cancer/ no cancer, dead/alive - cross-sectional studies - classify subjects into categories of 2 binary variables

[Figure: Case Control Study. Cases and controls are compared on exposure (e.g. diet). Measure of risk: odds ratio (OR)]

[Figure: Cohort Study. Exposure (e.g. diet) and subsequent cancer (yes/no). Measure of risk: RR or OR]

Cross-sectional Study Subjects NOT selected on exposure or outcome Classify subjects into exposure and outcome OR or RR can be used to describe association with binary outcome

Observational Studies with Binary Outcomes - case/control, cohort studies, cross-sectional studies
Ways to examine association:
- chi square test for association (2 x 2 contingency table): $\chi^2$
- odds ratio (OR) or relative risk (RR): $\chi^2$, magnitude of risk and CI
- logistic regression: $\chi^2$, magnitude of risk, CI, and can include other variables of interest

Relative Risk: Prospective Cohort Studies
$RR = p_1 / p_2$
$p_1$ = probability of disease for exposed individuals
$p_2$ = probability of disease for unexposed individuals
- RR = 1.0: no association
- RR = 1.4: 1.4 times the risk (40% higher risk)
- RR < 1.0: lower risk

MDM2 protein expression and breast cancer prognosis - cohort study - women with invasive breast cancer at BCCA - TMA stained for MDM2 protein expression - data on outcome (dead/alive) available Turbin et al, Modern Pathology 2006

MDM2 protein expression and breast cancer prognosis: Prospective Cohort Study
$p_1 = 28/49 = 0.57$, $p_2 = 94/313 = 0.30$
$\chi^2 = 12.75$, df = 1, p-value < 0.01
$RR = (28/49)/(94/313) = 1.90$
Women with MDM2 protein expression were at 1.9 times the risk of dying from breast cancer compared to women without MDM2 protein expression (p<0.01).
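The relative risk on this slide can be reproduced in a few lines. The following is a minimal sketch, not code from the lecture; the function name `relative_risk` is an illustrative choice, and the counts are the ones quoted on the slide.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Relative risk RR = p1 / p2 for a cohort study."""
    p1 = exposed_events / exposed_total      # risk in the exposed group
    p2 = unexposed_events / unexposed_total  # risk in the unexposed group
    return p1 / p2

# Counts from the slide: 28 deaths among 49 women with MDM2 expression,
# 94 deaths among 313 women without it.
rr = relative_risk(28, 49, 94, 313)
print(f"p1 = {28 / 49:.2f}, p2 = {94 / 313:.2f}, RR = {rr:.2f}")
# prints: p1 = 0.57, p2 = 0.30, RR = 1.90
```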

(from lecture 4)

Case Control Study of Family History and Breast Cancer - cases of breast cancer identified by cancer registry - controls identified through provincial screening program - data collected by questionnaire (after diagnosis in cases)

Case Control Study of Family History and Breast Cancer: 2 x 2 Contingency Table
Chi-square results with Yates' continuity correction: $\chi^2 = 9.60$, df = 1, p-value < 0.01
We conclude that there is a statistically significant association between first degree family history of breast cancer and breast cancer risk (p<0.01). 22% of women with breast cancer have a first degree family history of breast cancer compared to 16% of women without breast cancer.
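The contingency table itself did not survive in this transcript, but the cell counts appear on the later odds ratio slides (238 cases and 180 controls with a family history; 862 cases and 920 controls without). Assuming those counts, the Yates-corrected statistic can be checked with the standard library alone. This is an illustrative sketch, not the software used in the lecture; `chi_square_yates` is a hypothetical helper name.

```python
import math

def chi_square_yates(a, b, c, d):
    """Chi-square test for a 2 x 2 table with Yates' continuity correction.

    Table layout (rows = exposure, columns = outcome):
        a  b
        c  d
    Returns (chi-square statistic, p-value) for 1 degree of freedom.
    """
    n = a + b + c + d
    # Yates' correction subtracts n/2 from |ad - bc| (not letting it go below 0)
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1, the upper-tail chi-square probability equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Family history example: FHX yes = 238 cases, 180 controls;
# FHX no = 862 cases, 920 controls (counts taken from the odds ratio slides).
chi2, p = chi_square_yates(238, 180, 862, 920)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The same function applied to the MDM2 cohort counts (28, 21, 94, 219) gives the 12.75 quoted earlier.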

Estimate of Risk from Case-Control Study
- we fixed the number with and without breast cancer
- we cannot estimate the probabilities of breast cancer in women with and without family history
- Relative Risk cannot be estimated
What can we do?

A day at the racetrack: Gamblers calculate their chances of winning using a term called the odds. Suppose that the horse is a favourite and it is declared to have a 1 in 4 chance of winning [1 / (1 + 3)]. The gambler might say that the horse had odds of 1 to 3 of winning. However, gamblers are much more likely to say that the odds of the horse losing are 3 to 1. A horse that is a longshot may have only a 1 in 50 chance of winning. On the tote board the gambler will read that it has 49 to 1 odds against winning.
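The racetrack arithmetic is just the conversion between a probability and its odds. A small sketch using exact fractions (the function names are illustrative, not from the lecture):

```python
from fractions import Fraction

def odds_in_favour(p):
    """Odds in favour of an event with probability p: p / (1 - p)."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Inverse conversion: probability = odds / (1 + odds)."""
    return odds / (1 + odds)

favourite = Fraction(1, 4)   # 1 in 4 chance of winning
longshot = Fraction(1, 50)   # 1 in 50 chance of winning

print(odds_in_favour(favourite))      # 1/3, i.e. odds of 1 to 3 of winning
print(1 / odds_in_favour(favourite))  # 3, i.e. 3 to 1 against
print(1 / odds_in_favour(longshot))   # 49, i.e. 49 to 1 against
```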

Estimate of risk: Odds Ratio
If the probability of an event = p, then the odds in favour of the event = $p/(1-p)$: the ratio of the probability that the event occurs to the probability that it does not.
Odds Ratio: $OR = \dfrac{\text{odds in favour of disease for the exposed group}}{\text{odds in favour of disease for the unexposed group}}$

Odds Ratio: odds = $p/(1-p)$
odds of breast cancer with FHX = $\dfrac{238/418}{1-(238/418)} = 1.32$
odds of breast cancer with no FHX = $\dfrac{862/1782}{1-(862/1782)} = 0.94$
$OR = 1.32/0.94 = 1.41$

Simple method for calculating OR: the odds are the ratio of the number of times the event occurs to the number of times it doesn't.
$OR = (a/b)/(c/d) = (238/180)/(862/920) = 1.41$
Alternate equation: $(a \times d)/(b \times c) = (238 \times 920)/(180 \times 862) = 1.41$
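Both forms of the calculation can be checked directly. In this sketch the cell labels a, b, c, d follow the slide (a, b = cases and controls with a family history; c, d = cases and controls without); the function name is an illustrative choice.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table using the cross-product form (a*d)/(b*c)."""
    return (a * d) / (b * c)

a, b, c, d = 238, 180, 862, 920

or_from_odds = (a / b) / (c / d)   # ratio of the two odds
or_cross = odds_ratio(a, b, c, d)  # cross-product shortcut

print(f"odds with FHX = {a / b:.2f}, odds without FHX = {c / d:.2f}")
print(f"OR = {or_from_odds:.2f} (ratio of odds), {or_cross:.2f} (cross-product)")
```

Dividing the rounded odds (1.32/0.94) gives about 1.40; the 1.41 on the slide comes from the unrounded values.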

Confidence Interval for OR
- OR has a skewed distribution: limited at the lower end because it can't be negative, but not limited at the upper end
- log(OR), however, can take any value and has an approximately normal distribution
- Calculate limits on log(OR) and then "exponentiate"
SE for ln(OR) = $\sqrt{1/a + 1/b + 1/c + 1/d} = \sqrt{1/238 + 1/180 + 1/862 + 1/920} = 0.110$
Limits on ln(OR): $\ln(1.41) \pm 1.96 \times 0.110$ = 0.13 to 0.56
95% CI for OR: $e^{0.13}$ to $e^{0.56}$ = 1.14 to 1.75
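The "calculate on the log scale, then exponentiate" recipe can be sketched as follows. It assumes the family history counts from the odds ratio slides (a = 238, b = 180, c = 862, d = 920); the function name is hypothetical and the code is illustrative rather than the lecture's own.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval computed on the log scale.

    z = 1.96 gives an approximate 95% interval.
    Returns (OR, SE of ln(OR), lower limit, upper limit).
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)  # exponentiate the limits on ln(OR)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), se, lower, upper

or_hat, se, lower, upper = odds_ratio_ci(238, 180, 862, 920)
print(f"OR = {or_hat:.2f}, SE(ln OR) = {se:.3f}, 95% CI {lower:.2f} to {upper:.2f}")
# prints: OR = 1.41, SE(ln OR) = 0.110, 95% CI 1.14 to 1.75
```

The interval is asymmetric around 1.41 on the OR scale, which reflects the skewed distribution mentioned above.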

What is the interpretation of the OR? The odds of breast cancer in women with a family history are about 1.41 times those in women without a family history. Strictly speaking, the OR should be expressed in terms of "odds" (as above). However, when the outcome is rare (as it generally is for cancer), the OR is approximately equal to the RR, and results are often expressed as risk (i.e. more or less likely to develop cancer).

OR is reversible
Disease Odds Ratio: $\dfrac{\text{odds in favour of disease for the exposed group}}{\text{odds in favour of disease for the unexposed group}}$
Exposure Odds Ratio: $\dfrac{\text{odds in favour of being exposed for diseased subjects}}{\text{odds in favour of being exposed for non-diseased subjects}}$
Both = 1.41
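Reversibility follows because both ratios reduce to the same cross-product, (a × d)/(b × c). A quick numerical check, assuming the family history counts used elsewhere in the lecture (variable names are illustrative):

```python
# a = exposed & diseased, b = exposed & not diseased,
# c = unexposed & diseased, d = unexposed & not diseased
a, b, c, d = 238, 180, 862, 920

# Disease odds ratio: odds of disease among exposed vs. unexposed
disease_or = (a / b) / (c / d)

# Exposure odds ratio: odds of exposure among diseased vs. non-diseased
exposure_or = (a / c) / (b / d)

print(f"disease OR = {disease_or:.2f}, exposure OR = {exposure_or:.2f}")
```

This is why a case-control study, which samples on disease status and therefore only gives exposure odds, can still estimate the disease odds ratio.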

MDM2 protein expression and breast cancer prognosis: Prospective Cohort Study
$p_1 = 28/49 = 0.57$, $p_2 = 94/313 = 0.30$
$\chi^2 = 12.75$, df = 1, p-value < 0.01
$RR = (28/49)/(94/313) = 1.90$
$OR = (28 \times 219)/(21 \times 94) = 3.11$
Proportion dying = 34%. Because the outcome is not rare, the OR is considerably larger than the RR.
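The gap between OR and RR depends on how common the outcome is. The sketch below computes both for the MDM2 cohort and for a second table whose counts are made up purely to illustrate a rare outcome; the function name and the second table are assumptions, not lecture data.

```python
def rr_and_or(events_exposed, total_exposed, events_unexposed, total_unexposed):
    """Return (relative risk, odds ratio) for cohort-style counts."""
    a, b = events_exposed, total_exposed - events_exposed        # exposed: events, non-events
    c, d = events_unexposed, total_unexposed - events_unexposed  # unexposed: events, non-events
    rr = (a / total_exposed) / (c / total_unexposed)
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio

# MDM2 cohort from the slide: outcome is common (about 34% died)
rr_common, or_common = rr_and_or(28, 49, 94, 313)
print(f"common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")

# Hypothetical rare outcome (invented counts): 20/10000 vs 10/10000
rr_rare, or_rare = rr_and_or(20, 10000, 10, 10000)
print(f"rare outcome:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
```

With the common outcome the OR (3.11) overstates the RR (1.90); with the rare outcome the two nearly coincide.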

Caution about Case/Control Studies
- "Recall" bias: subjects with disease may recall their exposures differently from controls
- Biological samples collected after diagnosis may be affected by presence of disease
- Selection of controls extremely important (different population?)
- Treatment of samples from cases and controls must be the same
- Posted paper: Sources of Bias in Specimens for Research about Molecular markers for cancer

Ransohoff, D. F. et al. J Clin Oncol; 28: Fig 1. The fundamental comparison in experimental and observational study design - paper posted on website under resources

[Figure: Nested Case-Control Study. Cohort; follow to identify cases; select cases & subset of controls; measure exposure. Measure of risk: OR]

Nested Case-Control Study
- Do a prospective cohort study
- Identify cases
- Select controls (randomly) from the cohort study, usually matched to a case: followed for the same length of time as the case, and matched on other characteristics (e.g. age, site, etc.)
- Perform measurements of exposure
- Analyze as case-control (Odds Ratio)
Notes:
- Still requires a cohort study, but fewer measurements are required
- Controls come from the same population as cases
- Measurements are from baseline (no recall bias)

Logistic Regression
- a generalization of chi square to examine association of a binary variable with one or more independent variables (categorical or continuous)
- Logistic regression quantifies the relationship between a risk factor (or treatment) and a disease, after adjusting for other variables.
- Binary dependent variable: an event which is either present or absent ("success" or "failure")
- Goal is to examine factors associated with the probability of an event
- uses the method of maximum likelihood rather than least squares

How does logistic regression work? Logistic regression finds an equation that predicts a binary outcome variable from one or more x variables.
Outcome = probability of disease (p): $p = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots$
But probabilities can only range from 0 to 1, and the right hand side could be less than 0 or greater than 1 for some values of X. Solution: use the logit transformation.

How does logistic regression work?
logit transformation: $\text{logit}(p) = \ln\left(\dfrac{p}{1-p}\right)$
The natural logarithm of the odds can take on any value (negative or positive).
Logistic Regression Model: $\ln(\text{Odds}) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots$
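To see why the transformation solves the range problem, the sketch below maps probabilities to the log-odds scale and back. Any value of the linear predictor, negative or positive, converts to a probability strictly between 0 and 1. The function names are illustrative; the inverse is written in its simplest form and is only meant for moderate inputs (very large negative values would overflow `math.exp`).

```python
import math

def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1 - p))

def inverse_logit(x):
    """Map a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-x))

for p in (0.1, 0.5, 0.9):
    print(f"p = {p:.1f} -> logit(p) = {logit(p):+.3f}")

# Values of the linear predictor b0 + b1*x outside [0, 1] still give valid probabilities
for x in (-5.0, -1.0, 0.0, 2.0, 5.0):
    print(f"ln(odds) = {x:+.1f} -> p = {inverse_logit(x):.3f}")
```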

Logistic Regression: family history example
[Coefficient table: columns Estimate, Std. Error, z value, Pr(>|z|); rows (Intercept), fhx; numeric values not preserved]
$\ln(\text{Odds}) = -0.065 + 0.344x$
Intercept ($\beta_0$): log odds in the baseline group (x = 0)
Slope ($\beta_1$): difference in ln(odds) for a 1-unit change in the x variable
To interpret, use the transformation: $\dfrac{\text{Odds FHX}}{\text{Odds no FHX}} = e^{\beta_1} = e^{0.344} = 1.41$

Case Control Study of Family History and Breast Cancer: a little more detail on interpretation
$\ln(\text{Odds}) = -0.065 + 0.344x$
Since there are only 2 values for x (family history: yes/no):
For women with family history (x = 1): $\ln(\text{Odds}) = \beta_0 + \beta_1$
For women with no family history (x = 0): $\ln(\text{Odds}) = \beta_0$

odds of breast cancer with FHX = $\dfrac{238/418}{1-(238/418)} = 1.32$
odds of breast cancer with no FHX = $\dfrac{862/1782}{1-(862/1782)} = 0.94$
$OR = 1.32/0.94 = 1.41$

Case Control Study of Family History and Breast Cancer
$\ln(\text{Odds}) = -0.065 + 0.344x$
Since there are only 2 values for x (family history: yes/no):
For women with family history (x = 1): $\ln(\text{Odds}) = \beta_0 + \beta_1 = -0.065 + 0.344 = 0.279 = \ln(1.32)$
For women with no family history (x = 0): $\ln(\text{Odds}) = \beta_0 = -0.065 = \ln(0.94)$
$\beta_1$ = difference in ln(odds) between categories = ln of the ratio of odds = $0.279 - (-0.065) = 0.344$
$OR = 1.32/0.94 = 1.41$; $e^{0.344} = 1.41$
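The link between the fitted coefficients and the 2 x 2 table can be verified by fitting the model by maximum likelihood. The sketch below is a from-scratch Newton-Raphson fit for one binary predictor on grouped counts. It is an assumption-laden illustration, not the software that produced the lecture's output, and `fit_binary_logistic` is a hypothetical name. The counts are those from the odds ratio slides.

```python
import math

def fit_binary_logistic(groups, iterations=25):
    """Maximum likelihood fit of ln(odds) = b0 + b1*x by Newton-Raphson.

    groups: list of (x, events, total) tuples, one per level of x.
    Returns (b0, b1, se_b0, se_b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for x, events, total in groups:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))  # fitted probability
            resid = events - total * p              # observed minus expected events
            w = total * p * (1 - p)                 # binomial variance weight
            g0 += resid
            g1 += x * resid
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard errors from the diagonal of the inverse information matrix
    return b0, b1, math.sqrt(h11 / det), math.sqrt(h00 / det)

# x = 1: family history (238 cases of 418); x = 0: no family history (862 cases of 1782)
b0, b1, se0, se1 = fit_binary_logistic([(1, 238, 418), (0, 862, 1782)])
print(f"b0 = {b0:.3f} (odds in baseline group = {math.exp(b0):.2f})")
print(f"b1 = {b1:.3f} (OR = {math.exp(b1):.2f}), SE = {se1:.3f}")
```

With a single binary predictor the fit reproduces the table exactly: the intercept is ln(862/920), the slope is ln(OR), and the slope's standard error matches the sqrt(1/a + 1/b + 1/c + 1/d) formula.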

Multiple Logistic Regression – Family History Example
[Coefficient table: columns Estimate, Std. Error, z value, Pr(>|z|); rows (Intercept), fhx, age, bmi, HRT; numeric values not preserved]
Note: z test used for coefficients. For the 95% CI, use estimate ± 1.96 x SE.

Multiple Logistic Regression – Family History Example
[Odds ratio table: columns OR, lower 95% CI (2.5 %), upper 95% CI (97.5 %); rows (Intercept), fhx, age, bmi, HRT; numeric values not preserved]
Interpretation: The odds of a woman with a family history developing breast cancer are 1.43 times (95% CI 1.15 to 1.77) those of a woman without a family history, after adjustment for age, BMI and HRT use.
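The adjusted coefficient and its standard error are not preserved in this transcript, but the reported OR and interval show how "estimate ± 1.96 x SE" on the log scale turns into a CI for the OR. The sketch below back-calculates an approximate coefficient and SE from the reported 1.43 (1.15 to 1.77) and then rebuilds the interval. The back-calculated values are approximations derived here, not figures from the lecture, and the function names are hypothetical.

```python
import math

def coef_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Approximate the log-odds coefficient and its SE from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)  # CI width on the log scale
    return beta, se

def or_ci_from_coef(beta, se, z=1.96):
    """OR and CI from a coefficient: exponentiate beta and beta +/- z*SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Reported adjusted result for family history: OR 1.43 (95% CI 1.15 to 1.77)
beta, se = coef_from_or_ci(1.43, 1.15, 1.77)
print(f"approximate coefficient = {beta:.3f}, approximate SE = {se:.3f}")

odds_ratio, lower, upper = or_ci_from_coef(beta, se)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```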

Studies with Binary Outcomes - Summary
Ways to examine association:
- chi square test for association (2 x 2 contingency table)
- odds ratio (OR) or relative risk (RR)*: test of association, magnitude of risk and CI
- logistic regression: OR as measure of risk, CI, and can include other variables of interest
* for a case-control study only the OR is appropriate; for cohort and cross-sectional studies both OR and RR are valid; if the outcome is rare, OR and RR will be similar