Multivariable regression modelling – a pragmatic approach based on fractional polynomials for continuous variables Willi Sauerbrei Institut of Medical.

Presentation transcript:

Multivariable regression modelling – a pragmatic approach based on fractional polynomials for continuous variables Willi Sauerbrei Institut of Medical Biometry and Informatics University Medical Center Freiburg, Germany Patrick Royston MRC Clinical Trials Unit, London, UK

Outline
- Prognostic factor studies
- Continuous variables
  - categorizing data
  - fractional polynomials
  - interactions
- Reporting
- Conclusions

McGuire 1991: Guidelines for evaluating new prognostic factors
1. Begin with a biological hypothesis for the new factor
2. Differentiate between a pilot study and a definitive study
3. Perform sample size calculations prior to initiating the study
4. Identify possible patient selection biases
5. Validate the methodologies used to measure the new factor
6. Include optimal representations of the factor in the analyses
7. Perform multivariate analyses that also include standard factors
8. Validate the reproducibility of the results in internal and external validation sets

Observational studies
- One specific variable of interest; necessity to control for confounders
- Many variables measured; pairwise and multicollinearity present
- Model should fit the data
- Identification of important variables
- Model and single effects should be sensible and interpretable
Use subject-matter knowledge for modelling, but for some variables a data-driven choice is inevitable.
Modelling in the framework of
- Regression models
- Trees
- Neural nets
Selection of important variables

Methods for variable selection
- Full model: variance inflation in the case of multicollinearity
  - Wald statistic
- Stepwise procedures: prespecified (α_in, α_out) and actual selection level?
  - forward selection (FS)
  - stepwise selection (StS)
  - backward elimination (BE)
- All-subset selection: which criteria?
  - C_p (Mallows)
  - AIC (Akaike)
  - SBC (Schwarz)
- Bayes variable selection
More or less complex models?

Evaluation of prognostic factors is often based on historical data
Advantage
- Patient data with long-term follow-up information available in a database
Disadvantages
- Insufficient quality of data
- Important variables not available
- Study population heterogeneous with respect to prognostic factors and therapy

Assessment of a 'new' factor
Population
- ideally from a clinical trial
- most often registry data from a clinic
- often too small
Analysis
- often only univariate analysis
- cutpoint for division into two groups
- cutpoint derived data-dependently
- multivariate analysis required

Example to demonstrate issues: Freiburg DNA study (Pfisterer et al 1995)
- N = 266, median follow-up 82 months
- 115 events for recurrence-free survival time
- Prognostic value of SPF (S-phase fraction)
- SPF missing: 2.5% of diploid tumours (N = 122), 38.9% of aneuploid tumours (N = 144)

'Optimal' cutpoint analysis: a serious problem
SPF cutpoints used in the literature (Altman et al 1994)
Table notes: 1) three groups of approximately equal size; 2) upper third of the SPF distribution

Searching for optimal cutpoint: minimal p-value approach
SPF in Freiburg DNA study
Problem: multiple testing => inflated type I error

Searching for optimal cutpoint: inflation of type I errors (wrongly declaring a variable as important)
- Cutpoint selection in the inner interval (here 10% to 90%) of the distribution of the factor
- Simulation study: type I error about 40% instead of 5%
- The increased type I error does not disappear with increased sample size (in contrast to type II error)
[Figure: % significant versus sample size]
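
The inflation described on this slide can be reproduced with a small simulation. The sketch below is a simplified stand-in, not the simulation behind the slide: it assumes a continuous outcome compared between the two groups with a normal-approximation z-test (the slide's setting is survival data), and the function names, sample size and number of replicates are illustrative choices. The factor and the outcome are generated independently, so every "significant" result is a type I error.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a z statistic (standard normal approximation)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def cutpoint_p_values(x, y, lo=0.10, hi=0.90):
    """Compare mean outcome below vs. above every cutpoint in the inner interval.

    Returns {k: p-value}, where k is the number of observations in the low group.
    """
    n = len(x)
    y_sorted = [yi for _, yi in sorted(zip(x, y))]  # outcomes ordered by the factor
    p_values = {}
    for k in range(int(lo * n), int(hi * n) + 1):
        low, high = y_sorted[:k], y_sorted[k:]
        m1, m2 = sum(low) / k, sum(high) / (n - k)
        v1 = sum((v - m1) ** 2 for v in low) / (k - 1)
        v2 = sum((v - m2) ** 2 for v in high) / (n - k - 1)
        z = (m1 - m2) / math.sqrt(v1 / k + v2 / (n - k))
        p_values[k] = two_sided_p(z)
    return p_values


def simulate(n=100, n_sim=200, alpha=0.05, seed=1):
    """Share of null data sets declared significant, for two cutpoint strategies."""
    rng = random.Random(seed)
    hits_fixed = hits_min = 0
    for _ in range(n_sim):
        x = [rng.random() for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # outcome unrelated to x
        p = cutpoint_p_values(x, y)
        hits_fixed += p[n // 2] < alpha        # one prespecified cutpoint (median)
        hits_min += min(p.values()) < alpha    # data-dependent "optimal" cutpoint
    return hits_fixed / n_sim, hits_min / n_sim


fixed_rate, minimal_p_rate = simulate()
print(f"prespecified median cutpoint: {fixed_rate:.2f}")
print(f"minimal p-value cutpoint:     {minimal_p_rate:.2f}")
```

Because the median is one of the candidate cutpoints, the minimal p-value strategy can never reject less often than the prespecified one; the gap between the two printed rates is the price of the search.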

Freiburg DNA study: full study and 5 subpopulations (defined by nodal and ploidy status)
[Table: optimal cutpoints with P-value]

Continuous factor: categorisation or determination of functional form?
a) Step function (categorical analysis)
  - Loss of information
  - How many cutpoints?
  - Which cutpoints?
  - Bias introduced by outcome-dependent choice
b) Linear function
  - May be wrong functional form
  - Misspecification of functional form leads to wrong conclusions
c) Non-linear function
  - Fractional polynomials

14 StatMed 2006, 25:

Fractional polynomial models
A fractional polynomial of degree $m$ with powers $p = (p_1, \dots, p_m)$ is defined as
$$FP_m(X) = \beta_1 X^{p_1} + \dots + \beta_m X^{p_m}$$
(conventional polynomial: $p_1 = 1, p_2 = 2, \dots$)
Notation: FP1 means FP with one term (one power), FP2 is FP with two terms, etc.
Powers $p$ are taken from a predefined set $S$. We use $S = \{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}$.
Power 0 means $\log X$ here.

Fractional polynomial models
Describe for one covariate, $X$ (multiple regression later).
A fractional polynomial of degree $m$ for $X$ with powers $p_1, \dots, p_m$ is given by
$$FP_m(X) = \beta_1 X^{p_1} + \dots + \beta_m X^{p_m}$$
Powers $p_1, \dots, p_m$ are taken from a special set $\{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}$.
Usually $m = 1$ or $m = 2$ is sufficient for a good fit.
8 FP1, 36 FP2 models.
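
The model counts on this slide follow directly from the power set. The sketch below enumerates the candidate models and builds the FP basis terms for a single positive value of X. It assumes the usual convention for repeated powers (p, p), where the second term is X^p log X; that convention is what makes the FP2 count 36 rather than 28, but it is not spelled out on the slide. The names `POWERS`, `fp_term` and `fp_basis` are illustrative.

```python
import math
from itertools import combinations_with_replacement

# Power set S from the slide; power 0 stands for log X.
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Single FP term X^p, with the convention X^0 := log X (x must be positive)."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if p == 0 else x ** p


def fp_basis(x, powers):
    """Basis terms of an FP with the given (sorted) powers.

    Assumed convention for a repeated power: each repeat multiplies the
    previous term by log X, e.g. powers (p, p) give X^p and X^p * log X.
    """
    basis = []
    previous = None
    for p in powers:
        if p == previous:
            basis.append(basis[-1] * math.log(x))
        else:
            basis.append(fp_term(x, p))
        previous = p
    return basis


fp1_models = [(p,) for p in POWERS]
fp2_models = list(combinations_with_replacement(POWERS, 2))

print(len(fp1_models), "FP1 models;", len(fp2_models), "FP2 models")
print(fp_basis(4.0, (-0.5, 2)))  # terms X^-0.5 and X^2 at X = 4
```

The regression coefficients are then estimated by fitting the chosen regression model on these transformed terms; the best FP1 or FP2 is the power combination with the smallest deviance.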

Examples of FP2 curves - varying powers 17

Examples of FP2 curves - single power, different coefficients 18

Our philosophy of function selection Prefer simple (linear) model Use more complex (non-linear) FP1 or FP2 model if indicated by the data Contrasts to more local regression modelling –Already starts with a complex model 19

20 GBSG-study in node-positive breast cancer 299 events for recurrence-free survival time (RFS) in 686 patients with complete data 7 prognostic factors, of which 5 are continuous

21 FP analysis for the effect of age

Effect of age at 5% level?
[Table with columns χ², df and p-value; the values are not preserved in the transcript. Rows:]
- Any effect? Best FP2 versus null
- Effect linear? Best FP2 versus linear
- FP1 sufficient? Best FP2 vs. best FP1
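
The three questions on this slide form a sequence of tests, each comparing the best FP2 model with a simpler one. A minimal sketch of that decision sequence is given below. It assumes the degrees of freedom commonly used for this procedure in the fractional polynomial literature (4, 3 and 2), since the df column of the slide is lost, and it takes model deviances as given inputs rather than fitting anything. The function names and the deviances in the usage lines are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for df in {2, 3, 4}."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df == 2:
        return math.exp(-half)
    if df == 3:
        return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    if df == 4:
        return math.exp(-half) * (1.0 + half)
    raise ValueError("only df = 2, 3 or 4 are needed here")


def select_function(dev_null, dev_linear, dev_fp1, dev_fp2, alpha=0.05):
    """Choose 'omit', 'linear', 'FP1' or 'FP2' from the four model deviances.

    Each step tests the best FP2 against a simpler model; the first
    non-significant test stops the sequence at that simpler model.
    """
    steps = [
        ("omit", dev_null, 4),      # any effect?       best FP2 vs. null
        ("linear", dev_linear, 3),  # effect linear?    best FP2 vs. linear
        ("FP1", dev_fp1, 2),        # FP1 sufficient?   best FP2 vs. best FP1
    ]
    for choice, dev_simpler, df in steps:
        if chi2_sf(dev_simpler - dev_fp2, df) >= alpha:
            return choice
    return "FP2"


# Hypothetical deviances, for illustration only (not the GBSG results).
print(select_function(dev_null=100.0, dev_linear=95.0, dev_fp1=88.0, dev_fp2=87.0))
print(select_function(dev_null=100.0, dev_linear=88.0, dev_fp1=87.5, dev_fp2=87.0))
```

Starting from the most complex candidate and stepping down keeps the preference for simple models stated earlier in the talk: a non-linear function is chosen only if the data reject both the null and the linear model.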

Multivariable Fractional Polynomials (MFP)
- With multiple continuous predictors, selection of the best FP for each becomes more difficult → MFP algorithm as a standardized way to variable and function selection
- The MFP algorithm combines backward elimination with FP function selection procedures

Continuous factors Different results with different analyses Age as prognostic factor in breast cancer (adjusted) 24 P-value

Results similar? Nodes as prognostic factor in breast cancer (adjusted) 25 P-value

Multivariable FP: final model in breast cancer
- Model chosen out of 5760 possible models, one model selected
- Model: sensible? Interpretable? Stable?
- Bootstrap stability analysis

Main interest of clinicians: Individualized treatment This requires knowledge about several predictive factors 27

Detecting predictive factors
Most popular approach
- Treatment effect in separate subgroups
- Has several problems (Assmann et al 2000)
Test of treatment/covariate interaction required
- For a 'binary' covariate a standard test for interaction is available
Continuous covariate
- Often categorized into two groups

Categorizing a continuous covariate
- How many cutpoints?
- Position of the cutpoint(s)
- Loss of information → loss of power

FP approach can also be used to investigate predictive factors 30

31 MRC RE01 trial RCT in metastatic renal carcinoma N = 347; 322 deaths

Renal Carcinoma Overall conclusion: Interferon is better (p<0.01) MRCRCC, Lancet 1999 Is the treatment effect similar in all patients? 32

Predictive factors Treatment – covariate interaction 33 Treatment effect function for WCC Only a result of complex (mis-)modelling?

Check result of MFPI modelling
Treatment effect in subgroups defined by WCC: HR (Interferon to MPA)
- overall: 0.75 (0.60 – 0.93)
- I: 0.53 (0.34 – 0.83)
- II: 0.69 (0.44 – 1.07)
- III: 0.89 (0.57 – 1.37)
- IV: 1.32 (0.85 – 2.05)
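
One rough way to look at the pattern in these subgroup estimates is to move to the log hazard ratio scale and ask whether the estimates change systematically from group I to group IV. The sketch below is an illustration under stated assumptions, not the authors' analysis: it assumes the intervals are 95% confidence intervals, that the four WCC groups are disjoint (so the estimates are independent), and that equally spaced scores 1 to 4 are a reasonable coding of the groups. Since this interaction was found by a retrospective search, any such trend statistic is exploratory.

```python
import math

# Subgroup hazard ratios (Interferon to MPA) with interval limits, from the slide.
SUBGROUPS = {
    "I": (0.53, 0.34, 0.83),
    "II": (0.69, 0.44, 1.07),
    "III": (0.89, 0.57, 1.37),
    "IV": (1.32, 0.85, 2.05),
}


def log_hr_and_se(hr, lower, upper, z_crit=1.96):
    """Log hazard ratio and its standard error, recovered from an assumed 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def weighted_trend(scores, estimates, ses):
    """Inverse-variance weighted linear trend of the estimates across the scores."""
    weights = [1.0 / se ** 2 for se in ses]
    mean_score = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    sxx = sum(w * (s - mean_score) ** 2 for w, s in zip(weights, scores))
    sxy = sum(w * (s - mean_score) * y for w, s, y in zip(weights, scores, estimates))
    slope = sxy / sxx
    se_slope = math.sqrt(1.0 / sxx)
    return slope, se_slope, slope / se_slope


log_hrs, ses = zip(*(log_hr_and_se(*values) for values in SUBGROUPS.values()))
slope, se_slope, z = weighted_trend([1, 2, 3, 4], log_hrs, ses)
print(f"change in log HR per WCC group: {slope:.3f} (SE {se_slope:.3f}), z = {z:.2f}")
```

A positive slope means the hazard ratio moves towards, and beyond, 1 as WCC increases, which is the direction suggested by the subgroup estimates listed above.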

Assessment of WCC as a predictive factor
- Retrospective, searching for hypothesis
- 10 factors investigated, for one an interaction was identified
- 'Dose-response' effect in RE01 trial
- Validation in independent data
Worldwide collaboration: Don't we have other trials to check this result?

REPORTING: Can we believe in the published literature?
- Selection of published studies
- Insufficient reporting for assessment of quality of
  - planning
  - conducting
  - analysis
- Too early publications
- Usefulness for systematic review (meta-analysis)
Begg et al (1996) Improving the Quality of Reporting of Randomized Controlled Trials: The CONSORT Statement, JAMA, 276
Moher et al JAMA (2001), Revised Recommendations

Reporting of prognostic markers Riley et al BJC (2003) Systematic review of tumor markers for neuroblastoma 260 studies identified, 130 different markers The reporting of these studies was often inadequate, in terms of both statistical analysis and presentation, and there was considerable heterogeneity for many important clinical/statistical factors. These problems restricted both the extraction of data and the meta-analysis of results from the primary studies, limiting feasibility of the evidence-based approach. 37

Papers useful for overview ? Prognostic markers for neuroblastoma 38

39 Database 1: 340 articles included in meta-analysis Database 2: 1575 articles published in 2005 EJC 2007, 43:

The study examined whether the abstract reported any statistically significant prognostic effect for any marker and any outcome ('positive' articles). 'Positive' prognostic articles comprised 90.6% and 95.8% in Databases 1 and 2, respectively. 'Negative' articles were further examined for statements made by the investigators to overcome the absence of prognostic statistical significance. Most of the 'negative' prognostic articles claimed significance for other analyses, expanded on non-significant trends or offered apologies that were occasionally remote from the original study aims. Only five articles in Database 1 (1.5%) and 21 in Database 2 (1.3%) were fully 'negative' for all presented results in the abstract and without efforts to expand on non-significant trends or to defend the importance of the marker with other arguments. Of the statistically non-significant relative risks in the meta-analyses, 25% had been presented as statistically significant in the primary papers using different analyses compared with the respective meta-analysis. Under strong reporting bias, statistical significance loses its discriminating ability for the importance of prognostic markers.

41 We expect some improvements by the REMARK guidelines published simultaneously in 5 journals, August 2005

Prognostic markers: current situation
- The number of cancer prognostic markers validated as clinically useful is pitifully small
- Evidence-based assessment is required, but the collection of studies is difficult to interpret due to inconsistencies in conclusions or a lack of comparability
- Small underpowered studies, poor study design, varying and sometimes inappropriate statistical analyses, and differences in assay methods or endpoint definitions
- More complete and transparent reporting is needed to distinguish carefully designed and analyzed studies from haphazardly designed and over-analyzed studies
Identification of clinically useful cancer prognostic factors: What are we missing? McShane LM, Altman DG, Sauerbrei W; Editorial, JNCI, July 2005

Concluding comments: MFP
- FPs use full information, in contrast to a priori categorisation
- FPs search within a flexible class of functions (FP1 and FP2: 44 models)
- MFP is a well-defined multivariate model-building strategy that combines the search for transformations with BE
- Important that the model reflects medical knowledge, e.g. monotonic / asymptotic functional forms
MFP extensions
- Interactions
- Time-varying effects
Investigation of properties required
Comparison to splines required

References
McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM for the Statistics Subcommittee of the NCI-EORTC Working Group on Cancer Diagnostics (2005): REporting recommendations for tumor MARKer prognostic studies (REMARK). Journal of the National Cancer Institute, 97.
Royston P, Altman DG (1994): Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling (with discussion). Applied Statistics, 43.
Royston P, Altman DG, Sauerbrei W (2006): Dichotomizing continuous predictors in multiple regression: a bad idea. Statistics in Medicine, 25.
Royston P, Sauerbrei W (2005): Building multivariable regression models with continuous covariates, with a practical emphasis on fractional polynomials and applications in clinical epidemiology. Methods of Information in Medicine, 44.
Royston P, Sauerbrei W (2008): Multivariable Model-Building - A pragmatic approach to regression analysis based on fractional polynomials for continuous variables. Wiley.
Sauerbrei W, Meier-Hirmer C, Benner A, Royston P (2006): Multivariable regression model building by using fractional polynomials: Description of SAS, STATA and R programs. Computational Statistics & Data Analysis, 50.
Sauerbrei W, Royston P (1999): Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. Journal of the Royal Statistical Society A, 162.
Sauerbrei W, Royston P, Binder H (2007): Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Statistics in Medicine, to appear.
Sauerbrei W, Royston P, Look M (2007): A new proposal for multivariable modelling of time-varying effects in survival data based on fractional polynomial time-transformation. Biometrical Journal, 49.
Sauerbrei W, Royston P, Zapien K (2007): Detecting an interaction between treatment and a continuous covariate: a comparison of two approaches. Computational Statistics and Data Analysis, 51.
Schumacher M, Holländer N, Schwarzer G, Sauerbrei W (2006): Prognostic Factor Studies. In Crowley J, Ankerst DP (ed.), Handbook of Statistics in Clinical Oncology, Chapman & Hall/CRC.