Introduction to Survival Analysis

Early example of survival analysis, 1669: Christiaan Huygens' 1669 curve showing how many out of 100 people survive until 86 years. From: Howard Wainer, Statistical Graphics: Mapping the Pathways of Science. Annual Review of Psychology. Vol. 52: 305-335

Early example of survival analysis: Roughly, what shape is this function? What was a person’s chance of surviving past 20? Past 36? This is survival analysis! We are trying to estimate this curve, except that the outcome can be any binary event, not just death.

What is survival analysis? Statistical methods for analyzing longitudinal data on the occurrence of events. Events may include death, injury, onset of illness, recovery from illness (binary variables) or transition above or below the clinical threshold of a meaningful continuous variable (e.g. CD4 counts). Accommodates data from randomized clinical trials and cohort studies.

Randomized Clinical Trial (RCT). [Diagram: target population → disease-free, at-risk cohort → random assignment to intervention or control → each arm followed over TIME to disease or disease-free.]

Randomized Clinical Trial (RCT). [Diagram: target population → patient population → random assignment to treatment or control → each arm followed over TIME to cured or not cured.]

Randomized Clinical Trial (RCT). [Diagram: target population → patient population → random assignment to treatment or control → each arm followed over TIME to dead or alive.]

Cohort study (prospective/retrospective). [Diagram: target population → disease-free cohort → exposed or unexposed → each group followed over TIME to disease or disease-free.]

Examples of survival analysis in medicine

RCT: Women’s Health Initiative (JAMA, 2001). [Figure: cumulative incidence curves, on hormones vs. on placebo.]

WHI and low-fat diet… [Figure: curves for control vs. low-fat diet.] Prentice, R. L. et al. JAMA 2006;295:629-642.

Retrospective cohort study: From December 2003 BMJ: Aspirin, ibuprofen, and mortality after myocardial infarction: retrospective cohort study

Why use survival analysis? 1. Why not compare mean time-to-event between your groups using a t-test or linear regression? Because that ignores censoring. If there is no censoring (everyone is followed to the outcome of interest), then comparing mean or median time-to-event directly is fine. 2. Why not compare the proportion of events in your groups using risk/odds ratios or logistic regression? Because that ignores time. If time at risk were the same for everyone, you could just use proportions.

Survival Analysis: Terms. Time-to-event: the time from entry into a study until a subject has a particular outcome. Censoring: subjects are said to be censored if they are lost to follow-up or drop out of the study, or if the study ends before they die or have an outcome of interest. They are counted as alive or disease-free for the time they were enrolled in the study. Caution: if dropping out is related to the outcome, results can be biased. For example, PhD candidates who are most likely to take longest may be most likely to drop out, thereby biasing results.

Review Question 1 Which of the following data sets is likely to lend itself to survival analysis? A case-control study of caffeine intake and breast cancer. A randomized controlled trial where the outcome was whether or not women developed breast cancer in the study period. A cohort study where the outcome was the time it took women to develop breast cancer. A cross-sectional study which identified both whether or not women have ever had breast cancer and their date of diagnosis.

Introduction to Kaplan-Meier Non-parametric estimate of the survival function: Simply, the empirical probability of surviving past certain times in the sample (taking into account censoring).

Introduction to Kaplan-Meier Non-parametric estimate of the survival function. Commonly used to describe survivorship of study population/s. Commonly used to compare two study populations. Intuitive graphical presentation.

Survival Data (right-censored). [Figure: follow-up timelines for Subjects A to E from beginning of study to end of study; x-axis: time in months.] 1. Subject E dies at 4 months (X).

Corresponding Kaplan-Meier Curve. [Figure: survival curve starting at 100%; x-axis: time in months.] Probability of surviving to 4 months is 100% = 5/5. Subject E dies at 4 months; fraction surviving this death = 4/5.

Survival Data. [Figure: same timelines.] 1. Subject E dies at 4 months (X). 2. Subject A drops out after 6 months. 3. Subject C dies at 7 months (X).

Corresponding Kaplan-Meier Curve. [Figure: survival curve with a second step at 7 months.] Subject C dies at 7 months. Only 3 subjects are still at risk at that point, because Subject A was censored at 6 months, so the fraction surviving this death = 2/3.

Survival Data. [Figure: same timelines.] 1. Subject E dies at 4 months (X). 2. Subject A drops out after 6 months. 3. Subject C dies at 7 months (X). 4. Subjects B and D survive for the whole year-long study period.

Corresponding Kaplan-Meier Curve. Rule from probability theory: P(A&B) = P(A)*P(B) if A and B are independent. In survival analysis, intervals are defined by failures (2 intervals leading to failures here). P(surviving intervals 1 and 2) = P(surviving interval 1)*P(surviving interval 2). Product limit estimate of survival = P(surviving interval 1 | at risk up to failure 1) * P(surviving interval 2 | at risk up to failure 2) = 4/5 * 2/3 = .5333

The product limit estimate. The probability of surviving the entire year, taking into account censoring = (4/5) (2/3) = 53%. NOTE: this is > 40% (2/5) because the one drop-out survived at least a portion of the year, AND < 60% (3/5) because we don’t know if the one drop-out would have survived until the end of the year.
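The five-subject example can be checked with a few lines of code. This is a minimal sketch, not a full implementation: the function name `kaplan_meier` and the data layout (one follow-up time and one event flag per subject) are choices made here for illustration, and subjects B and D are treated as censored at 12 months.

```python
from fractions import Fraction

def kaplan_meier(times, events):
    """Return [(event_time, S(t))] for right-censored data.

    times  -- follow-up time for each subject
    events -- 1 if the subject had the event at that time, 0 if censored
    """
    survival = Fraction(1)
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Risk set: everyone still under observation at time t.
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        # Multiply by the fraction of the risk set surviving this event time.
        survival *= Fraction(at_risk - failed, at_risk)
        curve.append((t, survival))
    return curve

# Subjects A..E from the slides: A drops out at 6, B and D are alive at
# 12 months (censored), C dies at 7, E dies at 4.
times = [6, 12, 7, 12, 4]
events = [0, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s} = {float(s):.4f}")
```

Exact fractions are used so that the two steps (4/5, then 4/5 * 2/3 = 8/15) can be compared directly with the hand calculation.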

Example 1: time-to-conception for subfertile women. “Failure” here is a good thing. 38 women (in 1982) were treated for infertility with laparoscopy and hydrotubation. All women were followed for up to 2 years to describe time-to-conception. The event is conception, and women "survived" until they conceived. Example from: BMJ, Dec 1998; 317: 1572 - 1580.

Raw data: Time (months) to conception or censoring in 38 sub-fertile women after laparoscopy and hydrotubation (1982 study)
Conceived (event): 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 6, 6, 9, 9, 9, 10, 13, 16
Did not conceive (censored): 2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24
Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014

Corresponding Kaplan-Meier Curve S(t) is estimated at 9 event times. (step-wise function)

Corresponding Kaplan-Meier Curve 6 women conceived in 1st month (1st menstrual cycle). Therefore, 32/38 “survived” pregnancy-free past 1 month.

Corresponding Kaplan-Meier Curve S(t=1) = 32/38 = 84.2% S(t) represents estimated survival probability: P(T>t) Here P(T>1).

Raw data, important detail of how the data were coded: Censoring at t=2 indicates survival PAST the 2nd cycle (i.e., we know the woman “survived” her 2nd cycle pregnancy-free). Thus, for calculating the KM estimator at 2 months, this person should still be included in the risk set. Think of it as 2+ months, e.g., 2.1 months. Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014

Corresponding Kaplan-Meier Curve 5 women conceive in 2nd month. The risk set at event time 2 included 32 women. Therefore, 27/32=84.4% “survived” event time 2 pregnancy-free. S(t=2) = ( 84.2%)*(84.4%)=71.1%

Raw data, continued: the risk set at 3 months includes 26 women (32 at risk at month 2, minus 5 conceptions and 1 woman censored at 2+ months).

Corresponding Kaplan-Meier Curve 3 women conceive in the 3rd month. The risk set at event time 3 included 26 women. 23/26=88.5% “survived” event time 3 pregnancy-free. S(t=3) = (84.2%)*(84.4%)*(88.5%)=62.9%

Raw data, continued: the risk set at 4 months includes 22 women (26 at risk at month 3, minus 3 conceptions and 1 woman censored at 3+ months).

Corresponding Kaplan-Meier Curve 3 women conceive in the 4th month, and 1 was censored between months 3 and 4. The risk set at event time 4 included 22 women. 19/22=86.4% “survived” event time 4 pregnancy-free. S(t=4) = (84.2%)*(84.4%)*(88.5%)*(86.4%)=54.3%

Raw data, continued: the risk set at 6 months includes 18 women (22 at risk at month 4, minus 3 conceptions and 1 woman censored at 4+ months).

Corresponding Kaplan-Meier Curve 2 women conceive in the 6th month of the study, and one was censored between months 4 and 6. The risk set at the 5th event time (month 6) included 18 women. 16/18=88.9% “survived” the 5th event time pregnancy-free. S(t=6) = (54.3%)*(88.9%)=48.3%

Skipping ahead to the 9th and final event time (months=16)… S(t=13) ≈ 22% (“eyeball” approximation)

Raw data, continued: 2 remaining at 16 months (9th event time).

Skipping ahead to the 9th and final event time (months=16)… S(t=16) =( 22%)*(2/3)=15% Tail here just represents that the final 2 women did not conceive (cannot make many inferences from the end of a KM curve)!
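The whole curve for this example can be reproduced step by step. The sketch below makes two assumptions: the per-woman times are a reconstruction from the risk-set counts quoted on the slides and the cited 1982 data (the transcript shows only the distinct values), and a woman censored at month t is kept in the risk set at month t, following the "2+ months" coding rule described above. The function name `kaplan_meier` is illustrative.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as {event_time: S(t)} for right-censored data."""
    survival = 1.0
    curve = {}
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Women censored at t count as still at risk at t ("t+" coding).
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= (at_risk - failed) / at_risk
        curve[t] = survival
    return curve

# Assumed reconstruction of the 38 observations (months).
conceived = [1] * 6 + [2] * 5 + [3] * 3 + [4] * 3 + [6] * 2 + [9] * 3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]

times = conceived + censored
events = [1] * len(conceived) + [0] * len(censored)

km = kaplan_meier(times, events)
for t, s in km.items():
    print(f"month {t:2d}: S(t) = {s:.3f}")
```

Under these assumptions the estimate is about 0.23 at 13 months and about 0.15 at 16 months, in line with the values read off the curve.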

Comparing 2 groups Use log-rank test to test the null hypothesis of no difference between survival functions of the two groups

Kaplan-Meier: example 2. Researchers randomized 44 patients with chronic active hepatitis to receive prednisolone or no treatment (control), then compared survival curves. Example from: BMJ 1998;317:468-469 ( 15 August )

Survival times (months) of 44 patients with chronic active hepatitis randomised to receive prednisolone or no treatment (* = censored).
Prednisolone (n=22): 2, 6, 12, 54, 56*, 68, 89, 96, 96, 125*, 128*, 131*, 140*, 141*, 143, 145*, 146, 148*, 162*, 168, 173*, 181*
Control (n=22): 2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41, 54, 61, 63, 71, 127*, 140*, 146*, 158*, 167*, 182*
Data from: BMJ 1998;317:468-469 ( 15 August )

Kaplan-Meier: example 2 Are these two curves different? Big drops at the end of the curve indicate few patients left. E.g., only 2/3 (66%) survived this drop. Misleading to the eye—apparent convergence by end of study. But this is due to 6 controls who survived fairly long, and 3 events in the treatment group when the sample size was small.

Log-rank test
Test of Equality over Strata
Test        Chi-Square    DF    Pr > Chi-Square
Log-Rank    4.6599        1     0.0309
Chi-square test (with 1 df) of the (overall) difference between the two groups. Groups appear significantly different.
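For readers who want to see where such a statistic comes from, here is a minimal sketch of a two-group log-rank test in plain Python. It is not the software that produced the output above; the function name `logrank` is made up for illustration, and the per-patient data are an assumed reconstruction of the hepatitis table (22 patients per arm, 1 = died, 0 = censored). At each distinct death time it compares the observed deaths in group 1 with the number expected if both groups shared the same hazard, using the hypergeometric variance.

```python
import math

def logrank(times1, events1, times2, events2):
    """Two-group log-rank test; returns (chi_square, p_value) on 1 df."""
    pooled = (list(zip(times1, events1, [1] * len(times1))) +
              list(zip(times2, events2, [2] * len(times2))))
    observed1 = expected1 = variance = 0.0
    for t in sorted({u for u, e, _ in pooled if e == 1}):
        n1 = sum(1 for u, _, g in pooled if u >= t and g == 1)  # at risk, group 1
        n2 = sum(1 for u, _, g in pooled if u >= t and g == 2)  # at risk, group 2
        d1 = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 1)
        d2 = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 2)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    chi_square = (observed1 - expected1) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Assumed reconstruction of the hepatitis trial data (months).
pred_times = [2, 6, 12, 54, 56, 68, 89, 96, 96, 125, 128, 131,
              140, 141, 143, 145, 146, 148, 162, 168, 173, 181]
pred_events = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0,
               0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
ctrl_times = [2, 3, 4, 7, 10, 22, 28, 29, 32, 37, 40, 41,
              54, 61, 63, 71, 127, 140, 146, 158, 167, 182]
ctrl_events = [1] * 16 + [0] * 6

chi_square, p_value = logrank(pred_times, pred_events, ctrl_times, ctrl_events)
print(f"chi-square = {chi_square:.4f}, p = {p_value:.4f}")
```

With the data as reconstructed here, the statistic comes out close to the chi-square of 4.66 and p-value of 0.031 reported above.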

Caveats Survival estimates can be unreliable toward the end of a study when there are small numbers of subjects at risk of having an event.

WHI and breast cancer Small numbers left

Limitations of Kaplan-Meier: mainly descriptive; doesn’t control for covariates; requires categorical predictors; can’t accommodate predictor variables that change over time.

Review question 2 Investigators studied a cohort of individuals who joined a weight-loss program by tracking their weight loss over 1 year. Which of the following statistical tests is likely the most appropriate for evaluating the effectiveness of the weight loss program? A two-sample t-test; ANOVA; repeated-measures ANOVA; chi-square; Kaplan-Meier methods.

Review question 3 Investigators compared mean cholesterol level between cases with heart disease and controls without heart disease. Which of the following is likely the most appropriate statistical test for this comparison? A two-sample t-test; ANOVA; repeated-measures ANOVA; chi-square; Kaplan-Meier methods.

Review question 4 What is another way to analyze the data from review question 3? Logistic regression; Cox regression; linear regression; Kaplan-Meier methods; there is no other way.

Review question 5 Which statement about this K-M curve is correct? The mortality rate was higher in the control group than the treated group. The probability of surviving past 100 days was about 50% in the treated group. The probability of surviving past 100 days was about 70% in the control group. Treatment should be recommended.

Introduction to Cox Regression Also called proportional hazards regression Multivariate regression technique where time-to-event (taking into account censoring) is the dependent variable. Estimates adjusted hazard ratios. A hazard ratio is a ratio of rates (hazard rates)

History “Regression Models and Life-Tables” by D.R. Cox, published in 1972, is one of the most frequently cited journal articles in statistics and medicine Introduced “maximum partial likelihood”

Introduction to Cox Regression Distinction between hazard/rate ratio and odds ratio/risk ratio: Hazard ratio: ratio of rates Odds/risk ratio: ratio of proportions All are measures of relative risk! By taking into account time, you are taking into account more information than just binary yes/no. Gain power/precision. Logistic regression aims to estimate the odds ratio; Cox regression aims to estimate the hazard ratio

Example 1: Study of publication bias By Kaplan-Meier methods From: Publication bias: evidence of delayed publication in a cohort study of clinical research projects BMJ 1997;315:640-645 (13 September)

Univariate Cox regression
Table 4. Risk factors for time to publication using univariate Cox regression analysis
Characteristic           # not published    # published    Hazard ratio (95% CI)
Null                     29                 23             1.00
Non-significant trend    16                 4              0.39 (0.13 to 1.12)
Significant              47                 99             2.32 (1.47 to 3.66)
Interpretation: Significant results have a 2-fold higher incidence of publication compared to null results.
From: Publication bias: evidence of delayed publication in a cohort study of clinical research projects BMJ 1997;315:640-645 (13 September)

Example 2: Study of mortality in academy award winners for screenwriting Kaplan-Meier methods From: Longevity of screenwriters who win an academy award: longitudinal study BMJ 2001;323:1491-1496 ( 22-29 December )

Table 2. Death rates for screenwriters who have won an academy award.* Values are percentages (95% confidence intervals) and are adjusted for the factor indicated.
                              Relative increase in death rate for winners
Basic analysis                37 (10 to 70)
Adjusted analysis
Demographic:
  Year of birth               32 (6 to 64)
  Sex                         36 (10 to 69)
  Documented education        39 (12 to 73)
  All three factors           33 (7 to 65)
Professional:
  Film genre
  Total films
  Total four star films       40 (13 to 75)
  Total nominations           43 (14 to 79)
  Age at first film           36 (9 to 68)
  Age at first nomination
  All six factors             40 (11 to 76)
All nine factors              35 (7 to 70)
HR=1.37; interpretation: 37% higher incidence of death for winners compared with nominees.
HR=1.35; interpretation: 35% higher incidence of death for winners compared with nominees, even after adjusting for potential confounders.

Characteristics of Cox Regression Can accommodate both discrete and continuous measures of event times Easy to incorporate time-dependent covariates—covariates that may change in value over the course of the observation period

Characteristics of Cox Regression, continued Cox models the effect of covariates on the hazard rate but leaves the baseline hazard rate unspecified. Does NOT assume knowledge of absolute risk. Estimates relative rather than absolute risk.

Assumptions of Cox Regression Proportional hazards assumption: the hazard for any individual is a fixed proportion of the hazard for any other individual Multiplicative risk

The Hazard function In words: the probability that if you survive to t, you will succumb to the event in the next instant.
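In symbols, the standard definition that matches this description is given below, where T denotes the time of the event, as in S(t) = P(T > t). Strictly speaking, the hazard is a rate per unit time rather than a probability, which is why the conditional probability is divided by the interval length.

```latex
h(t) = \lim_{\Delta t \to 0}
       \frac{P\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
```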

The model. Components: (1) A baseline hazard function that is left unspecified but must be positive (= the hazard when all covariates are 0); it can take on any form! (2) A linear function of a set of k fixed covariates that is exponentiated (= the relative risk).
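Written out, the two components combine as in the usual textbook form of the Cox model. The notation is assumed here: h_0(t) is the baseline hazard, x_{i1}, ..., x_{ik} are the k covariates for person i, and beta_1, ..., beta_k are the regression coefficients.

```latex
h_i(t) = h_0(t)\, \exp\left(\beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}\right)
```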

The model: proportional hazards. The hazard ratio is the hazard for person i (e.g., a smoker) divided by the hazard for person j (e.g., a non-smoker). Hazard functions should be strictly parallel (on the log scale)! Produces covariate-adjusted hazard ratios!
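The reason the ratio does not depend on time can be seen by dividing the two hazards. In the sketch below the notation is assumed: h_0(t) is the unspecified baseline hazard, x_{i1}, ..., x_{ik} and x_{j1}, ..., x_{jk} are the covariates of persons i and j, and beta_1, ..., beta_k are the coefficients. The baseline hazard cancels, leaving a quantity that is constant in t.

```latex
HR_{ij} = \frac{h_i(t)}{h_j(t)}
        = \frac{h_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}}
               {h_0(t)\, e^{\beta_1 x_{j1} + \dots + \beta_k x_{jk}}}
        = e^{\beta_1 (x_{i1} - x_{j1}) + \dots + \beta_k (x_{ik} - x_{jk})}
```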

The model: binary predictor This is the hazard ratio for smoking adjusted for age.

The model: continuous predictor. This is the hazard ratio for a 10-year increase in age, adjusted for smoking. Exponentiating the coefficient of a continuous predictor gives you the hazard ratio for a 1-unit increase in the predictor; for a 10-unit increase, exponentiate 10 times the coefficient.
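A small numerical check may make the binary and continuous cases concrete. Everything in this sketch is hypothetical: the coefficients (0.7 for smoking, 0.04 per year of age), the two baseline hazards, and the helper `cox_hazard` are invented for illustration and do not come from any fitted model. The point is only that the hazard ratio equals the exponentiated coefficient (times the size of the change) whatever the baseline hazard and whatever the time.

```python
import math

def cox_hazard(baseline_hazard, betas, covariates, t):
    """Hazard at time t under a Cox model: h0(t) * exp(sum of beta * x)."""
    linear_predictor = sum(b * x for b, x in zip(betas, covariates))
    return baseline_hazard(t) * math.exp(linear_predictor)

# Hypothetical coefficients, for illustration only:
# log hazard ratio for smoking (1 = yes, 0 = no) and per year of age.
BETA_SMOKING, BETA_AGE = 0.7, 0.04
betas = [BETA_SMOKING, BETA_AGE]

# Two made-up baseline hazards; the hazard ratios should not depend on them.
baselines = {
    "constant": lambda t: 0.01,
    "rising": lambda t: 0.002 * t ** 1.5,
}

smoking_hrs, age10_hrs = [], []
for name, h0 in baselines.items():
    for t in (1, 5, 10):
        # Smoker vs. non-smoker, both aged 60 (adjusted for age).
        hr_smoking = (cox_hazard(h0, betas, [1, 60], t) /
                      cox_hazard(h0, betas, [0, 60], t))
        # Age 70 vs. age 60, both non-smokers (adjusted for smoking).
        hr_age10 = (cox_hazard(h0, betas, [0, 70], t) /
                    cox_hazard(h0, betas, [0, 60], t))
        smoking_hrs.append(hr_smoking)
        age10_hrs.append(hr_age10)
        print(f"{name:8s} t={t:2d}  HR smoking = {hr_smoking:.3f}  "
              f"HR 10 years = {hr_age10:.3f}")

print(f"exp(beta_smoking)  = {math.exp(BETA_SMOKING):.3f}")
print(f"exp(10 * beta_age) = {math.exp(10 * BETA_AGE):.3f}")
```

Every row prints the same pair of ratios, which match the two exponentiated values on the last lines.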

Review Question 6 Exponentiating a beta-coefficient from linear regression gives you what? Odds ratios Risk ratios Hazard ratios None of the above

Review Question 7 Exponentiating a beta-coefficient from logistic regression gives you what? Odds ratios Risk ratios Hazard ratios None of the above

Review Question 8 Exponentiating a beta-coefficient from Cox regression gives you what? Odds ratios Risk ratios Hazard ratios None of the above

Intention-to-Treat Analysis in Randomized Trials Intention-to-treat analysis: compare outcomes according to the groups to which subjects were initially randomized, regardless of which intervention (if any) they actually followed.

Intention to treat. Participants will be counted in the intervention group to which they were originally assigned, even if they: refused the intervention after randomization; discontinued the intervention during the study; followed the intervention incorrectly; violated study protocol; missed follow-up measurements.

Dietary Modification Trial: <20% of diet from fat; >=5 servings of fruit and vegetables; >=6 servings of whole grains. Primary outcomes: breast and colorectal cancer.

[Figure: Participant Flow in the Dietary Modification Component of the Women's Health Initiative.] Prentice, R. L. et al. JAMA 2006;295:629-642.

Why intention to treat? Preserves the benefits of randomization. Randomization balances potential confounding factors in the study arms. This balance will be lost if the data are analyzed according to how participants self-selected rather than how they were randomized. Simulates real life, where patients often don’t adhere perfectly to treatment or may discontinue treatment altogether. Evaluates effectiveness, rather than efficacy.

[Table: Baseline Demographics of Participants in Women's Health Initiative Dietary Modification Trial.] Prentice, R. L. et al. JAMA 2006;295:629-642.

Benefits of randomization…

[Table: Nutrient Consumption Estimates and Body Weight at Baseline and Year 1.] Prentice, R. L. et al. JAMA 2006;295:629-642.

Real-world effectiveness… Only 31 percent of treatment participants got their dietary fat below 20% in the first year.

Effect of intention to treat on the statistical analysis Intention-to-treat analyses tend to underestimate treatment effects; increased variability due to switching “waters down” results.

Example. Take the following hypothetical RCT: treated subjects have a 25% chance of dying during the 2-year study, whereas placebo subjects have a 50% chance of dying. TRUE RR = 25%/50% = .50 (treated have a 50% lower chance of dying). You do a 2-yr RCT of 100 treated and 100 placebo subjects. If nobody switched, you would see about 25 deaths in the treated group and about 50 deaths in the placebo group (give or take a few due to random chance). Observed RR ≈ .50

Example, continued. BUT suppose that, early in the study, 25 treated subjects switch to placebo and 25 placebo subjects switch to treatment. You would see about 25*.25 + 75*.50 = 43-44 deaths in the placebo group, and about 25*.50 + 75*.25 = 31 deaths in the treated group. Observed RR = 31/44 ≈ .70. Diluted effect! (but not biased)
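The dilution arithmetic can be laid out in a few lines. This is a sketch of the hypothetical example only: the helper name `expected_deaths` is made up, and it assumes that switchers take on the full risk of the arm they switch to for the whole study.

```python
def expected_deaths(n_adherent, n_switchers, own_risk, other_risk):
    """Expected deaths in one randomized arm when some subjects cross over."""
    return n_adherent * own_risk + n_switchers * other_risk

RISK_TREATED, RISK_PLACEBO = 0.25, 0.50   # 2-year risks from the example
N_PER_ARM, N_SWITCH = 100, 25

# Perfect adherence: the intention-to-treat RR equals the true RR.
rr_no_switching = (N_PER_ARM * RISK_TREATED) / (N_PER_ARM * RISK_PLACEBO)

# 25 subjects in each arm cross over early; analysis is still by assigned arm.
treated_arm = expected_deaths(N_PER_ARM - N_SWITCH, N_SWITCH,
                              RISK_TREATED, RISK_PLACEBO)
placebo_arm = expected_deaths(N_PER_ARM - N_SWITCH, N_SWITCH,
                              RISK_PLACEBO, RISK_TREATED)
rr_itt = treated_arm / placebo_arm

print(f"RR with no switching: {rr_no_switching:.2f}")
print(f"Expected deaths, treated arm: {treated_arm}")
print(f"Expected deaths, placebo arm: {placebo_arm}")
print(f"Intention-to-treat RR with switching: {rr_itt:.2f}")
```

The observed ratio moves from .50 toward 1, which is the "watering down" described above.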

The researchers factored this into their power calculation… The study was powered to find a 14% difference in breast cancer risk between treatment and control. They assumed a 50% reduction in risk with perfect adherence, but calculated that this would translate to only a 14% reduction in risk with imperfect adherence.

Alternatives to ITT. Per-protocol analysis: restricts analysis to only those who followed the assigned intervention until the end. Treatment-received analysis: (a) Censored analysis: subjects are dropped from the analysis at the time of stopping the assigned treatment. (b) Transition analysis: e.g., controls who cross over to treatment contribute to the denominator for the control group until they cross over; then they contribute to the denominator for the treatment group. But the trial then becomes an observational study…

Review question 9 I randomized 600 people to receive treatment (n=300) or placebo (n=300). Of these, 10 treatment and 8 placebo subjects never started their study drug. An additional 30 dropped out of each group before the end of the trial (or were lost to followup). 18 treated subjects and 3 placebo subjects discontinued their treatment because of side effects. How many subjects do I include in my primary statistical analysis? 290 treatment, 292 placebo; 272 treatment, 289 placebo; 272 treatment, 292 placebo; 242 treatment, 259 placebo; 300 treatment, 300 placebo.

Homework Finish reading textbook Study for the final exam!