Development and Use of Predictive Biomarkers Dr. Richard Simon.


Potential Conflict of Interest None

Development and Use of Predictive Biomarkers
Richard Simon, D.Sc.
Chief, Biometric Research Branch, National Cancer Institute
http://brb.nci.nih.gov

Biometric Research Branch Website
http://brb.nci.nih.gov
- Powerpoint presentations
- Reprints
- BRB-ArrayTools software
- Web based Sample Size Planning

Why are Metastatic Tumors Resistant?
- Poor intracellular drug access in bulky tumors
- Large tumors have low growth fractions
- Old tumors have undergone many generations of replication, harbor many mutations and are mutationally heterogeneous
- Metastatic tumors have survived many selection pressures, activated detoxification pathways and deactivated control pathways like apoptosis
- …

How Can We Treat More Effectively?
- Treat early
- Treat intensively
- Treat with combinations
- Treat with drugs that target the key oncogenic mutations that occurred early, are present in all cells of the tumor, drive the invasion of the tumor and to which the tumor is addicted
- Characterize the key oncogenic mutations in individual tumors and select the right drugs for that tumor

Prognostic & Predictive Biomarkers
- Most cancer treatments currently benefit only a minority of patients to whom they are administered
- Being able to predict which patients are likely to benefit would
  - Save patients from unnecessary toxicity, and enhance their chance of receiving a drug that helps them
  - Help control medical costs
  - Improve the success rate of clinical drug development

Personalized Oncology is Here Today
- Estrogen receptor over-expression in breast cancer
  - Anti-estrogens, aromatase inhibitors
- HER2 amplification in breast cancer
  - Trastuzumab, Lapatinib
- OncotypeDx in breast cancer
  - Low score for ER+, node-negative = hormonal rx
- KRAS in colorectal cancer
  - WT KRAS = cetuximab or panitumumab
- EGFR mutation or amplification in NSCLC
  - EGFR inhibitor

Different Kinds of Biomarkers
- Endpoint biomarkers
  - A measurement made on a patient before, during and after treatment to determine whether the treatment is working
- Predictive biomarkers
  - Measured before treatment to identify who will benefit from a particular treatment
- Prognostic biomarkers
  - Measured before treatment to indicate long-term outcome for patients untreated or receiving standard treatment

Endpoint Biomarkers
- Surrogate Endpoints
  - It is very difficult to properly validate a biomarker as a surrogate of clinical benefit for use as an alternative endpoint in phase III trials
- Partial Surrogate Endpoints
  - Necessary but not sufficient for clinical benefit
- Pharmacodynamic biomarkers can be useful in phase I/II studies as measures of treatment effect
  - They need not be validated as surrogates for clinical benefit

Types of Validation for Prognostic and Predictive Biomarkers
- Analytical validation
  - Measures accurately what it is supposed to measure
- Clinical validation/correlation
  - Does the biomarker predict the clinical endpoint that it is supposed to predict for independent data?
- Medical utility
  - Does use of the biomarker result in patient benefit?
  - Depends on medical context, other prognostic factors, therapeutic options

Prognostic and Predictive Biomarkers in Oncology
- Single gene or protein measurement
  - ER protein expression
  - HER2 amplification
  - KRAS mutation
- Scalar index or classifier that summarizes expression levels of multiple genes

Prognostic Markers in Oncology
- Most prognostic markers are not used because they are not therapeutically relevant
- Most studies do not address medical utility
- They use a convenience sample of patients for whom tissue is available
- Most prognostic marker studies are not reliable because they are exploratory and not prospectively focused on a single marker

Prognostic Biomarkers Can be Therapeutically Relevant
- <10% of node negative ER+ breast cancer patients require or benefit from the cytotoxic chemotherapy that they receive

[Figure: B-14 Results, Relapse-Free Survival. Curves for groups of 338, 149 and 181 patients; p<0.0001. Paik et al, SABCS 2003]

Predictive Biomarkers
- In the past, often studied as unfocused post-hoc subset analyses of RCTs
  - Numerous subsets examined
  - No pre-specified hypotheses
  - No control of type I error from multiple testing

Prospective Co-Development of Drugs and Companion Diagnostics
1. Develop a completely specified genomic classifier of the patients likely to benefit from a new drug
2. Establish analytical validity of the classifier
3. Use the completely specified classifier to design and analyze a focused clinical trial to evaluate effectiveness of the new treatment and how it relates to the candidate biomarker

Guiding Principle
- The data used to develop the classifier should be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier
  - Developmental studies can be exploratory
  - Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier

Enrichment Design
[Diagram] Using phase II data, develop predictor of response to new drug.
- Patient predicted responsive: New Drug vs Control
- Patient predicted non-responsive: Off Study

- Primarily for settings where the classifier is based on a single gene whose protein product is the target of the drug and the biological evidence that the new treatment is unlikely to benefit marker negative patients is compelling
  - e.g. Herceptin

Trastuzumab (Herceptin)
- Metastatic breast cancer
- 234 randomized patients per arm
- 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided .05 level
- If benefit were limited to the 25% assay + patients, overall improvement in survival would have been 3.375%
  - 4025 patients/arm would have been required
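The dilution arithmetic on this slide can be checked with a short calculation. The sketch below is not the method used to plan the trastuzumab trial (the slide does not state it). It assumes a normal-approximation formula for comparing two proportions with a continuity correction, and the function name `patients_per_arm` is hypothetical. Under those assumptions it gives per-arm numbers in the same range as the 234 and 4025 quoted above, not necessarily identical ones.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_arm(p_control, p_treatment, alpha=0.05, power=0.90):
    """Per-arm sample size for comparing two proportions (two-sided test).

    Assumption: normal approximation with unpooled variances plus a
    continuity correction; the slide does not say which formula was used.
    """
    z = NormalDist().inv_cdf
    z_alpha, z_beta = z(1 - alpha / 2), z(power)
    delta = abs(p_treatment - p_control)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    n = (z_alpha + z_beta) ** 2 * variance / delta ** 2
    n_corrected = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n_corrected)


baseline = 0.67               # 1-year survival on control (from the slide)
benefit_in_positives = 0.135  # improvement in assay-positive patients
fraction_positive = 0.25      # share of patients who are assay positive

# If only assay-positive patients benefit, the all-comers effect is diluted.
diluted_benefit = fraction_positive * benefit_in_positives

n_targeted = patients_per_arm(baseline, baseline + benefit_in_positives)
n_untargeted = patients_per_arm(baseline, baseline + diluted_benefit)
print(n_targeted, n_untargeted)
```

The point of the comparison is the order of magnitude: shrinking the detectable effect by a factor of four inflates the required sample size by roughly a factor of sixteen or more.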

Evaluating the Efficiency of Enrichment Design
- Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004.
- Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
- Reprints and interactive sample size calculations at http://linus.nci.nih.gov/brb

Efficiency of Enrichment Design
- Depends on
  - proportion of patients test positive
  - effectiveness of new drug for test negative patients
- When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the enrichment design requires dramatically fewer randomized patients
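A rough way to see the dependence on these two quantities: under a normal approximation with equal variances, the number of randomized patients needed is inversely proportional to the square of the treatment effect being detected. The sketch below uses only that approximation. The names `frac_positive`, `effect_pos` and `effect_neg` are chosen here for illustration, and the exact expressions are in the Simon and Maitournam papers cited above.

```python
def randomized_ratio(frac_positive, effect_pos, effect_neg):
    """Approximate (all-comers n) / (enrichment n) for randomized patients.

    Assumption: required n is proportional to 1 / effect**2, and the
    all-comers effect is the prevalence-weighted average of the two effects.
    """
    overall_effect = frac_positive * effect_pos + (1 - frac_positive) * effect_neg
    return (effect_pos / overall_effect) ** 2


def screened_per_randomized(frac_positive):
    """Patients who must be tested to randomize one test-positive patient."""
    return 1 / frac_positive


# Effect in test-negative patients as a fraction of the effect in positives.
for relative_neg_effect in (0.0, 0.5, 1.0):
    for frac_positive in (0.25, 0.50, 0.75):
        ratio = randomized_ratio(frac_positive, 1.0, relative_neg_effect)
        print(f"positive={frac_positive:.2f} neg effect={relative_neg_effect:.1f} "
              f"ratio={ratio:.2f} screened per randomized={screened_per_randomized(frac_positive):.2f}")
```

When the drug works equally well in test-negative patients the ratio is 1, so the enrichment design saves no randomized patients and still requires screening.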

Stratification Design
[Diagram] Develop predictor of response to new Rx.
- Predicted responsive to new Rx: New Rx vs Control
- Predicted non-responsive to new Rx: New Rx vs Control

- Does not use the test to restrict eligibility, but to structure a prospective analysis plan
- Having a prospective analysis plan is essential
- “Stratifying” (balancing) the randomization is useful to ensure that all randomized patients have tissue available but is not a substitute for a prospective analysis plan
- Size the study for adequate evaluation of T vs C separately by marker status
- The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets; not to modify or refine the classifier
- The purpose is not to demonstrate that repeating the classifier development process on independent data results in the same classifier

- R Simon. Using genomics in clinical trial design. Clinical Cancer Research 14:5984-93, 2008.
- R Simon. Designs and adaptive analysis plans for pivotal clinical trials of therapeutics and companion diagnostics. Expert Opinion in Medical Diagnostics 2:721-29, 2008.

Web Based Software for Planning Clinical Trials of Treatments with a Candidate Predictive Biomarker
http://brb.nci.nih.gov

Use of Archived Specimens in Evaluation of Prognostic and Predictive Biomarkers
Richard M. Simon, Soonmyung Paik and Daniel F. Hayes
- Claims of medical utility for prognostic and predictive biomarkers based on analysis of archived tissues can be considered to have either a high or low level of evidence depending on several key factors.
- Studies using archived tissues, when conducted under ideal conditions and independently confirmed, can provide the highest level of evidence.
- Traditional analyses of prognostic or predictive factors, using non-analytically-validated assays on a convenience sample of tissues and conducted in an exploratory and unfocused manner, provide a very low level of evidence for clinical utility.

For Level I Evidence
- Archived tissue adequate for a successful assay must be available on a sufficiently large number of patients from a phase III trial with a design that enables the appropriate analyses
  - Adequate statistical power
  - The patients included in the evaluation are clearly representative of the patients in the trial
- The test should be analytically and pre-analytically validated for use with archived tissue
- The analysis plan for the biomarker evaluation should be completely specified in writing prior to the performance of the biomarker assays on archived tissue and should be focused on evaluation of a single completely defined classifier
- The results of the analysis should be validated using specimens from a similar, but separate, study

Development of Prognostic & Predictive Classifiers using Gene Expression Profiles

Major Flaws Found in 40 Studies Published in 2004
- Inadequate control of multiple comparisons in gene finding
  - 9/23 studies had unclear or inadequate methods to deal with false positives
  - 10,000 genes x .05 significance level = 500 false positives
- Misleading report of prediction accuracy
  - 12/28 reports based on evaluating accuracy in training set or using incomplete cross-validation
- Misleading use of cluster analysis
  - 13/28 studies invalidly claimed that expression clusters based on differentially expressed genes could help distinguish clinical outcomes
- 50% of studies contained one or more major flaws
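The false-positive arithmetic above, and one way of controlling it, can be sketched in a few lines. The slide does not name a specific remedy; the Bonferroni threshold and the Benjamini-Hochberg step-up procedure shown here are two common choices swapped in for illustration, and the function names are hypothetical.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of null genes declared significant at per-gene level alpha."""
    return n_genes * alpha


def bonferroni_threshold(n_genes, alpha):
    """Per-gene p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_genes


def benjamini_hochberg(p_values, fdr=0.10):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


print(expected_false_positives(10_000, 0.05))   # the slide's 500
print(bonferroni_threshold(10_000, 0.05))
print(benjamini_hochberg([0.5, 0.001, 0.9, 0.004], fdr=0.05))
```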

Recent Literature Review of Expression Profiling in Early Lung Cancer
Simon & Subramanian
- Most studies relating gene expression profiles to outcome of cancer patients do not address medical utility
- The patients included are too heterogeneous with regard to stage
- Failure to emphasize predictive accuracy over existing prognostic factors rather than statistical significance
- Most publications feature highly misleading claims based on failure to separate the data used for model development from the data used for model evaluation
- Sample splitting
- Do not evaluate results in the training set!
- Complete cross validation
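The difference between incomplete and complete cross-validation can be illustrated on pure noise, where no classifier should beat chance. The sketch below is an illustration under assumptions made here, not the analysis from the review: 20 samples, 500 simulated genes, the 10 genes with the largest class mean difference, a nearest-centroid classifier, and leave-one-out cross-validation. In the incomplete version genes are selected once using all samples; in the complete version selection is repeated inside every fold.

```python
import random


def top_genes(samples, labels, k):
    """Indices of the k genes with the largest absolute class mean difference."""
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        a = [s[g] for s, y in zip(samples, labels) if y == 0]
        b = [s[g] for s, y in zip(samples, labels) if y == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def predict(train, train_labels, genes, sample):
    """Nearest-centroid prediction using only the selected genes."""
    distances = []
    for cls in (0, 1):
        members = [s for s, y in zip(train, train_labels) if y == cls]
        d = 0.0
        for g in genes:
            centroid = sum(m[g] for m in members) / len(members)
            d += (sample[g] - centroid) ** 2
        distances.append(d)
    return 0 if distances[0] <= distances[1] else 1


def loocv_accuracy(samples, labels, k, complete):
    """Leave-one-out accuracy; `complete` redoes gene selection in every fold."""
    genes_all = top_genes(samples, labels, k)  # sees every sample, test one included
    correct = 0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        genes = top_genes(train, train_labels, k) if complete else genes_all
        correct += predict(train, train_labels, genes, samples[i]) == labels[i]
    return correct / len(samples)


rng = random.Random(0)
labels = [0] * 10 + [1] * 10
# Pure noise: no gene is truly related to the class label.
samples = [[rng.gauss(0, 1) for _ in range(500)] for _ in labels]

incomplete = loocv_accuracy(samples, labels, 10, complete=False)
complete = loocv_accuracy(samples, labels, 10, complete=True)
print("incomplete CV accuracy:", incomplete)
print("complete CV accuracy:  ", complete)
```

Because the held-out sample helped choose the genes in the incomplete version, its apparent accuracy is optimistically biased even though the data contain no signal.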

New Challenges in Phase II Trial Design
- Evaluating new drugs in molecularly heterogeneous diseases
  - Treating a sufficient number of patients whose tumors are thought to be good candidates for the drug
  - Developing a predictive biomarker for identifying target population and a robust test for use in the phase III trial
- Development of effective combinations
- Reliable use of endpoints other than objective response

Selecting Patients for Phase II Trial
- If the phase II trial for a particular primary site is not enriched for patients thought responsive to the drug, an initial stage of 10-15 patients may contain very few appropriate patients
- If drug target is thought known, accrual of separate cohort of 25-30 patients whose tumors are thought to be driven by the target gives best chance to evaluate drug
- Small phase II trials are generally not adequate for developing or even refining predictive biomarkers
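How few appropriate patients an unselected first stage may contain follows from the binomial distribution. The sketch below is illustrative only: the 20% prevalence of target-driven tumors is a hypothetical value assumed here, not a figure from the slides, and `prob_at_most` is a helper named for this example.

```python
from math import comb


def prob_at_most(k, n, prevalence):
    """P(at most k of n unselected patients have a target-driven tumor)."""
    return sum(
        comb(n, j) * prevalence ** j * (1 - prevalence) ** (n - j)
        for j in range(k + 1)
    )


prevalence = 0.20  # hypothetical share of patients whose tumors are driven by the target
for n in (10, 12, 15):
    print(n, round(prob_at_most(1, n, prevalence), 3), round(prob_at_most(2, n, prevalence), 3))
```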

Phase II Designs

Endpoint: Response rate
- Single agent: Simon optimal 2-stage single arm design
- In combination with active agents:
  - Single arm comparison to historical control (Makuch-Simon; Thall-Simon Bayesian)
  - Randomized design

Endpoint: Time to progression
- Single agent:
  - Dixon-Simon single arm comparison to historical control
  - Randomized design
- In combination with active agents: Randomized design
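For readers unfamiliar with two-stage single arm designs, the operating characteristics of a given design follow directly from binomial probabilities. The sketch below evaluates one design; the boundaries (stop if at most 1 response in the first 10 patients, declare the drug promising if more than 5 responses in 29) and the response rates 10% and 30% are illustrative inputs assumed here, not values from the slides. It does not search for the optimal design.

```python
from math import comb


def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def two_stage_properties(r1, n1, r, n, p):
    """Operating characteristics of a two-stage single arm design at response rate p.

    Stop after stage 1 if responses <= r1 out of n1; otherwise continue to n
    patients in total and declare the drug promising if total responses > r.
    Returns (probability of early termination, expected sample size,
    probability of declaring the drug promising).
    """
    pet = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    expected_n = n1 + (1 - pet) * (n - n1)
    n2 = n - n1
    promising = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        needed = max(r - x1 + 1, 0)  # stage-2 responses needed to exceed r
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(needed, n2 + 1))
        promising += binom_pmf(x1, n1, p) * tail
    return pet, expected_n, promising


pet0, en0, alpha = two_stage_properties(1, 10, 5, 29, p=0.10)
_, _, power = two_stage_properties(1, 10, 5, 29, p=0.30)
print(f"PET(p0)={pet0:.3f} EN(p0)={en0:.1f} type I error={alpha:.3f} power={power:.3f}")
```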

Evaluating a New Drug in Combination with Active Agents
- For a new drug in combination with active agents, p0 represents the response probability of the active agents without the new drug in the same type of patients being selected for the phase II study of the combination regimen
- The effectiveness of the single arm design is limited by the availability of a large number of comparable patients who have been treated with the active agents alone
- For combination regimens, unless p0 is based on a large number of patients, the methods of Makuch-Simon or Bayesian Thall-Simon designs should be used instead of the optimal two-stage design
- The Makuch-Simon and Thall-Simon designs require individual patient data for historical controls. This increases focus on comparability, and they take into account the actual number of historical controls and the resulting uncertainty in p0

Using Time to Progression or Stable Disease as Endpoint
- Requires comparison to progression times for control patients not receiving drug
- Proportion of patients with “stable disease” also requires a control group for evaluation to be meaningful
- It is difficult to reliably evaluate time to progression endpoint without a randomized control group
- With historical controls, specific controls should be used for whom comparability of prognosis and surveillance for progression can be established

Thall-Simon Bayesian Single Arm Phase II Designs Using a Specific Set of Historical Control Patients
- Makuch RW and Simon RM. Sample size considerations for non-randomized comparative studies. J. Chron. Dis. 33:175-181, 1980.
- Dixon DO and Simon R. Sample size considerations for studies comparing survival curves using historical controls. J. Clin. Epidemiology 41:1209-1214, 1988.
- Thall PF and Simon R. A Bayesian approach to establishing sample size and monitoring criteria for phase II clinical trials. Controlled Clinical Trials 15:463-481, 1994.
- Thall PF, Simon R, Estey E. A new statistical strategy for monitoring safety and efficacy in single-arm clinical trials. Journal of Clinical Oncology 14:296-303, 1996.

Randomized Phase II Screening Designs
Simon, Ellenberg, Wittes. Cancer Treatment Reports 69:1375, 1985
- For evaluating multiple new drugs, regimens or combinations to select most promising
- Arm with greatest observed response rate is selected regardless of how small the difference is
- Not for comparing a new drug/regimen to control
- Randomization ensures uniform patient selection and evaluation
- Can be used with time to progression endpoint
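The selection rule above (pick the arm with the greatest observed response rate) can be evaluated exactly for two arms with binomial probabilities. The sketch below is a minimal illustration; breaking ties at random and the example response rates of 20% and 35% are assumptions made here, and `prob_correct_selection` is a hypothetical name.

```python
from math import comb


def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_correct_selection(n_per_arm, p_better, p_other):
    """P(the truly better of two arms has the greater observed response count).

    Assumption: ties are broken by a fair coin flip.
    """
    total = 0.0
    for x in range(n_per_arm + 1):
        px = binom_pmf(x, n_per_arm, p_better)
        for y in range(n_per_arm + 1):
            py = binom_pmf(y, n_per_arm, p_other)
            if x > y:
                total += px * py
            elif x == y:
                total += 0.5 * px * py
    return total


for n in (20, 30, 40, 50):
    print(n, round(prob_correct_selection(n, 0.35, 0.20), 3))
```

Because the design only ranks arms, its sample size is driven by the probability of correct selection rather than by a significance level, which is why it is not suitable for comparing a new regimen to control.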

Phase 2.5 Trial Design
- Randomization to new regimen vs control
  - E.g. C+X vs C
- Endpoint is progression free survival regardless of whether it is an accepted phase III endpoint
- Threshold of significance can exceed .05 for sample size planning
- Simon R et al. Clinical trial designs for the early clinical development of therapeutic cancer vaccines. Journal of Clinical Oncology 19:1848-54, 2001
- Korn EL et al. Clinical trial designs for cytostatic agents: Are new approaches needed? Journal of Clinical Oncology 19:265-272, 2001

Total Sample Size, Randomized Phase 2.5
2 years accrual, 1.5 years followup

Improvement in median PFS   Hazard Ratio   α=.05   α=.10   α=.20
4 → 6 months                1.5            216     168     116
6 → 9 months                1.5            228     176     120
4 → 8 months                2              76      60      40
6 → 12 months               2              84      64      44
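The pattern in this table (smaller trials for larger hazard ratios and for more relaxed significance thresholds) can be reproduced approximately from the number of progression events required. The sketch below uses Schoenfeld's approximation for a 1:1 randomized comparison, which is swapped in here for illustration. It assumes a one-sided significance level and 90% power, neither of which is stated on the slide. Under those assumptions the required event counts come out somewhat below the tabulated totals, as expected when most but not all patients progress during follow-up.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha, power=0.90):
    """Total events for a 1:1 randomized log-rank comparison (Schoenfeld approximation).

    Assumptions: one-sided alpha and 90% power; the slide states neither.
    """
    z = NormalDist().inv_cdf
    return ceil(4 * (z(1 - alpha) + z(power)) ** 2 / log(hazard_ratio) ** 2)


for hr in (1.5, 2.0):
    row = [required_events(hr, a) for a in (0.05, 0.10, 0.20)]
    print(f"hazard ratio {hr}: events needed {row}")
```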

Acknowledgements
- NCI Biometric Research Branch
  - Boris Freidlin
  - Yingdong Zhao
  - Alain Dupuy
  - Wenyu Jiang
  - Aboubakar Maitournam
  - Jyothi Subramanian
- Soonmyung Paik, NSABP
- Daniel Hayes, U. Michigan

Questions?
