Targeted (Enrichment) Design. Prospective Co-Development of Drugs and Companion Diagnostics 1. Develop a completely specified genomic classifier of the.

1 Targeted (Enrichment) Design

2 Prospective Co-Development of Drugs and Companion Diagnostics
1. Develop a completely specified genomic classifier of the patients likely to benefit from a new drug
   - Pre-clinical, phase II data, archived specimens from previous phase III studies
2. Establish an analytically validated test for the classifier
3. Use the completely specified classifier to design and analyze a new clinical trial to evaluate effectiveness of the new treatment with a pre-defined analysis plan that preserves the overall type I error of the study

3 Guiding Principle
- The data used to develop the classifier should be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier
- Developmental studies can be exploratory
- Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier

4 Using phase II data, develop predictor of response to new drug
[Diagram: Develop Predictor of Response to New Drug -> Patient Predicted Responsive -> New Drug vs. Control; Patient Predicted Non-Responsive -> Off Study]

5 Primarily for settings where the classifier is based on a single gene whose protein product is the target of the drug, e.g., Herceptin

6 Evaluating the Efficiency of Strategy (I)
- Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Correction and supplement 12:3229, 2006
- Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005

7 Model for Two Treatments With Binary Response
- Molecularly targeted treatment T
- Control treatment C
- 1-γ: proportion of test + patients
- p_c: control response probability
- Response probability for test + patients on T is (p_c + δ_1)
- Response probability for test - patients on T is (p_c + δ_0)

8 Untargeted Trial
- Compare outcome for treatment group T vs control group C without classifier data
- Fisher exact test at two-sided level .05 comparing response proportion in control group to response proportion in treatment group
- Number of responses in C group of n patients is binomial B(n, p_c)
- Number of responses in T group is B(n, (1-γ)(p_c + δ_1) + γ(p_c + δ_0))
- Determine n patients per treatment group for power 1-β
- Use Ury & Fleiss approximation, Biometrics 36:347-51, 1980

9 Targeted Trial
- Compare outcome for treatment group T vs control group C for assay positive patients
- Fisher exact test at two-sided level .05 comparing response proportion in control group to response proportion in treatment group
- Number of responses in C group of n patients is binomial B(n, p_c)
- Number of responses in T group is B(n, p_c + δ_1)
- Determine n_T patients per treatment group for power 1-β
- Use Ury & Fleiss approximation, Biometrics 36:347-51, 1980

12 Approximations
- Observed response rate ~ N(p, p(1-p)/n)
- p_e(1-p_e) ≈ p_c(1-p_c)

13 Number of Randomized Patients Required
- Type I error α
- Power 1-β for obtaining significance
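A minimal sketch of how the number of randomized patients per arm could be computed under the normal approximation with a common variance p_c(1-p_c), using the textbook two-proportion formula n = 2 p_c(1-p_c)(z_{1-α/2} + z_{1-β})^2 / δ^2. This form is an assumption: it is not the continuity-corrected Ury & Fleiss approximation cited earlier, and the names `n_per_arm` and `untargeted_effect` and the numeric inputs are illustrative only.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_c, delta, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect a response-rate difference delta.

    Assumes both arms share the variance p_c * (1 - p_c) and a two-sided
    level-alpha normal test without continuity correction.
    """
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)
    return ceil(2 * p_c * (1 - p_c) * z_sum ** 2 / delta ** 2)


def untargeted_effect(gamma, delta_1, delta_0):
    """Average treatment effect when a fraction gamma of patients is test negative."""
    return (1 - gamma) * delta_1 + gamma * delta_0


# Illustrative values: 50% test negative, benefit confined to test + patients.
n_targeted = n_per_arm(p_c=0.3, delta=0.2)
n_untargeted = n_per_arm(p_c=0.3, delta=untargeted_effect(0.5, 0.2, 0.0))
print(n_targeted, n_untargeted)
```

Halving the average effect roughly quadruples the required number of patients per arm, which is the mechanism behind the efficiency comparison.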

14 Randomized Ratio (normal approximation)
- RandRat = n_untargeted / n_targeted
- δ_1 = rx effect in test + patients
- δ_0 = rx effect in test - patients
- γ = proportion of test - patients
- If δ_0 = 0, RandRat = 1/(1-γ)^2
- If δ_0 = δ_1/2, RandRat = 1/(1-γ/2)^2

15 Screened Ratio
- N_untargeted = n_untargeted
- N_targeted = n_targeted/(1-γ)
- ScreenRat = N_untargeted/N_targeted = (1-γ) RandRat

16 No Treatment Benefit for Test - Patients
n_std / n_targeted
Proportion Test Positive    Randomized    Screened
0.75                        1.78          1.33
0.5                         4             2
0.25                        16            4

17 Treatment Benefit for Test - Pts Half that of Test + Pts
n_std / n_targeted
Proportion Test Positive    Randomized    Screened
0.75                        1.31          0.98
0.5                         1.78          0.89
0.25                        2.56          0.64
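The randomized and screened ratios in the two tables can be reproduced with a short sketch. It assumes that, under the normal approximation, sample size scales as 1/effect^2, so the randomized ratio is the squared ratio of the test + effect to the diluted overall effect; this general expression is one reading that reduces to the two special cases stated for the randomized ratio. The function names are illustrative.

```python
def rand_ratio(gamma, delta_1, delta_0):
    """n_untargeted / n_targeted under the normal approximation.

    gamma is the proportion of test - patients; delta_1 and delta_0 are the
    treatment effects in test + and test - patients.
    """
    diluted = (1 - gamma) * delta_1 + gamma * delta_0
    return (delta_1 / diluted) ** 2


def screen_ratio(gamma, delta_1, delta_0):
    """N_untargeted / N_targeted, counting patients screened for the targeted trial."""
    return (1 - gamma) * rand_ratio(gamma, delta_1, delta_0)


for prop_positive in (0.75, 0.5, 0.25):
    gamma = 1 - prop_positive
    for label, delta_0 in (("no benefit", 0.0), ("half benefit", 0.5)):
        print(prop_positive, label,
              round(rand_ratio(gamma, 1.0, delta_0), 2),
              round(screen_ratio(gamma, 1.0, delta_0), 2))
```

A screened ratio below 1 means the targeted design screens more patients than the standard design randomizes, which happens in the second table.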

18
- Relative efficiency of targeted design depends on
  - proportion of patients test positive
  - effectiveness of new drug (compared to control) for test negative patients
- When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the targeted design requires dramatically fewer randomized patients
- The targeted design may require fewer or more screened patients than the standard design

19 Trastuzumab (Herceptin)
- Metastatic breast cancer
- 234 randomized patients per arm
- 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided .05 level
- If benefit were limited to the 25% test + patients, overall improvement in survival would have been 3.375%
- 4025 patients/arm would have been required
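The dilution arithmetic in the trastuzumab example can be checked with a rough sketch. It assumes only that sample size scales as 1/effect^2, so it gives an order-of-magnitude check rather than the exact figure: the simple scaling yields about 3744 patients per arm, somewhat below the 4025 quoted, presumably because the quoted figure comes from a calculation that does not hold the variance fixed. The function names are illustrative.

```python
def diluted_improvement(effect_in_positive, prop_positive):
    """Overall improvement if only test + patients benefit."""
    return effect_in_positive * prop_positive


def approx_inflation(effect_in_positive, prop_positive):
    """Approximate growth factor in patients per arm (n scales as 1/effect^2)."""
    overall = diluted_improvement(effect_in_positive, prop_positive)
    return (effect_in_positive / overall) ** 2


overall = diluted_improvement(0.135, 0.25)   # 13.5% improvement in 25% of patients
factor = approx_inflation(0.135, 0.25)       # squared ratio of the two effects
print(f"{overall:.5f}", round(234 * factor))
```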

20 Comparison of Targeted to Untargeted Design
Simon R, Development and Validation of Biomarker Classifiers for Treatment Selection, JSPI
Treatment hazard ratio for marker positive patients: 0.5
Number of events for targeted design: 74
Number of events for traditional design, by percent of patients marker positive:
Percent Marker Positive    Events
20%                        2040
33%                        720
50%                        316

21 Web Based Software for Comparing Sample Size Requirements
http://brb.nci.nih.gov

27 “Stratification Design”

28 Developmental Strategy (II)
[Diagram: Develop Predictor of Response to New Rx -> Predicted Responsive to New Rx -> New Rx vs. Control; Predicted Non-responsive to New Rx -> New Rx vs. Control]

29 Developmental Strategy (II)
- Do not use the test to restrict eligibility, but to structure a prospective analysis plan
- Having a prospective analysis plan is essential
- “Stratifying” (balancing) the randomization is useful to ensure that all randomized patients have tissue available but is not a substitute for a prospective analysis plan
- The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets; not to modify or refine the classifier
- The purpose is not to demonstrate that repeating the classifier development process on independent data results in the same classifier

30
- R Simon. Using genomics in clinical trial design, Clinical Cancer Research 14:5984-93, 2008
- R Simon. Designs and adaptive analysis plans for pivotal clinical trials of therapeutics and companion diagnostics, Expert Opinion on Medical Diagnostics 2:721-29, 2008

31 Validation of EGFR biomarkers for selection of EGFR-TK inhibitor therapy for previously treated NSCLC patients
[Diagram: 2nd line NSCLC with specimen -> FISH Testing -> FISH + (~30%): Erlotinib vs. Pemetrexed; FISH − (~70%): Erlotinib vs. Pemetrexed -> Outcome: 1° PFS; 2° OS, ORR]
- PFS endpoint
  - 90% power to detect 50% PFS improvement in FISH+
  - 90% power to detect 30% PFS improvement in FISH−
- Evaluate EGFR IHC and mutations as predictive markers
- Evaluate the role of RAS mutation as a negative predictive marker
- 957 patients 4 years accrual, 1196 patients 1-2 years minimum additional follow-up

32 Analysis Plan A
- Compare the new drug to the control for classifier positive patients
  - If p_+ > 0.05 make no claim of effectiveness
  - If p_+ ≤ 0.05 claim effectiveness for the classifier positive patients and
    - Compare new drug to control for classifier negative patients using 0.05 threshold of significance
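The sequential logic of Analysis Plan A can be sketched as a small decision function. The inputs are hypothetical p-values from the two subset comparisons, the function name is illustrative, and it assumes that a significant result in classifier negative patients leads to a claim for that subset as well.

```python
def analysis_plan_a(p_pos, p_neg, alpha=0.05):
    """Return the subsets for which effectiveness is claimed under Plan A.

    p_pos, p_neg: p-values for new drug vs. control in classifier positive
    and classifier negative patients (hypothetical inputs).
    """
    if p_pos > alpha:
        return []                      # no claim of effectiveness at all
    claims = ["classifier positive"]
    if p_neg <= alpha:                 # tested only after success in positives
        claims.append("classifier negative")
    return claims


print(analysis_plan_a(0.01, 0.20))
print(analysis_plan_a(0.30, 0.001))
```

Because the negative subset is tested only after the positive subset succeeds, both tests can use the full 0.05 level.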

33 Sample size for Analysis Plan A
- 88 events in classifier + patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
- If 25% of patients are positive, then when there are 88 events in positive patients there will be about 264 events in negative patients
- 264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
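These event counts are consistent with the standard approximation for a two-arm log-rank comparison with 1:1 randomization, events = 4 (z_{1-α/2} + z_{1-β})^2 / (ln HR)^2. The slides do not name the formula used, so treating it as this approximation is an assumption, as is the rough rule that events accrue in proportion to marker prevalence. The function name is illustrative.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate total events for a two-sided level-alpha log-rank test (1:1 arms)."""
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2


events_pos = ceil(events_needed(0.5))        # 50% reduction in hazard
# With 25% of patients positive, assume about 3 negative events per positive event.
events_neg = events_pos * 3
events_for_33pct = ceil(events_needed(0.67)) # 33% reduction in hazard
print(events_pos, events_neg, events_for_33pct)
```

Under these assumptions, the events expected in negative patients are just enough to detect a 33% reduction in hazard with 90% power.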

34 Analysis Plan B (Limited confidence in test)
- Compare the new drug to the control overall for all patients ignoring the classifier
  - If p_overall ≤ 0.03 claim effectiveness for the eligible population as a whole
- Otherwise perform a single subset analysis evaluating the new drug in the classifier + patients
  - If p_subset ≤ 0.02 claim effectiveness for the classifier + patients
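Analysis Plan B splits the 0.05 significance level between the overall test (0.03) and the single subset test (0.02), which keeps the overall type I error at or below 0.05. A minimal sketch of the decision rule, with hypothetical p-values as inputs and an illustrative function name:

```python
def analysis_plan_b(p_overall, p_subset, alpha_overall=0.03, alpha_subset=0.02):
    """Return the population for which effectiveness is claimed, or None.

    p_overall: p-value for new drug vs. control in all patients.
    p_subset: p-value in classifier + patients (used only as a fallback).
    """
    if p_overall <= alpha_overall:
        return "eligible population as a whole"
    if p_subset <= alpha_subset:
        return "classifier positive patients"
    return None


print(analysis_plan_b(0.02, 0.50))
print(analysis_plan_b(0.04, 0.01))
print(analysis_plan_b(0.04, 0.03))
```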

35 Sample size for Analysis Plan B
- To have 90% power for detecting uniform 33% reduction in overall hazard at 3% two-sided level requires 297 events (instead of 263 for similar power at 5% level)
- If 25% of patients are positive, then when there are 297 total events there will be approximately 75 events in positive patients
- 75 events provides 75% power for detecting 50% reduction in hazard at 2% two-sided significance level
- By delaying evaluation in test positive patients, 80% power is achieved with 84 events and 90% power with 109 events
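The power figures for the subset test can be approximated by inverting the same events formula: with d events and 1:1 randomization, the log-rank statistic is roughly normal with mean sqrt(d) |ln HR| / 2. This is a sketch under that assumed approximation, not necessarily the calculation behind the slide; the function name is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def power_for_events(events, hazard_ratio, alpha):
    """Approximate power of a two-sided level-alpha log-rank test (1:1 arms)."""
    nd = NormalDist()
    z_effect = sqrt(events) * abs(log(hazard_ratio)) / 2
    return nd.cdf(z_effect - nd.inv_cdf(1 - alpha / 2))


# 50% reduction in hazard, tested at the 2% two-sided level.
for d in (75, 84, 109):
    print(d, round(power_for_events(d, 0.5, 0.02), 2))
```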

36
- This analysis strategy is designed to not penalize sponsors for having developed a classifier
- It provides sponsors with an incentive to develop genomic classifiers

37 Analysis Plan C
- Test for difference (interaction) between treatment effect in test positive patients and treatment effect in test negative patients
- If interaction is significant at level α_int then compare treatments separately for test positive patients and test negative patients
- Otherwise, compare treatments overall
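Analysis Plan C uses the interaction test as a gate that decides which comparisons are reported. A minimal sketch of that branching follows. The default α_int = 0.10 matches the value used in the simulation results, while the 0.05 level for the treatment comparisons is an assumption taken from the sample size planning; all p-values are hypothetical inputs and the function name is illustrative.

```python
def analysis_plan_c(p_interaction, p_pos, p_neg, p_overall,
                    alpha_int=0.10, alpha=0.05):
    """Return which comparisons are made and whether each is significant."""
    if p_interaction <= alpha_int:
        # Significant interaction: evaluate the two subsets separately.
        return {"test positive": p_pos <= alpha,
                "test negative": p_neg <= alpha}
    # No significant interaction: a single overall comparison.
    return {"overall": p_overall <= alpha}


print(analysis_plan_c(0.03, 0.001, 0.60, 0.04))
print(analysis_plan_c(0.40, 0.020, 0.03, 0.01))
```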

38 Sample Size Planning for Analysis Plan C
- 88 events in test + patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
- If 25% of patients are positive, when there are 88 events in positive patients there will be about 264 events in negative patients
- 264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level

39 Simulation Results for Analysis Plan C
- Using α_int = 0.10, the interaction test has power 93.7% when there is a 50% reduction in hazard in test positive patients and no treatment effect in test negative patients
- A significant interaction and significant treatment effect in test positive patients is obtained in 88% of cases under the above conditions
- If the treatment reduces hazard by 33% uniformly, the interaction test is negative and the overall test is significant in 87% of cases

42 Prospective-Retrospective Study

60 Prospective-Retrospective Evaluation of Prognostic or Predictive Classifier
1. Analytically validate a single completely specified classifier
2. Design a prospective clinical trial that definitively addresses the hypothesis of interest about the medical utility of the completely specified classifier
   1. Write a detailed protocol for the prospective study, including sample size justification and detailed statistical analysis plan addressing a single hypothesis about the prognostic or predictive utility of a single completely specified classifier
3. Find a previously performed clinical trial that matches as closely as possible the prospective protocol developed above
   1. Adequate design
   2. Adequate sample size
   3. Adequate proportion of patients with archived tissue
   4. Not used in any way in developing the classifier or analytically validating it
4. Perform the assay on the archived samples and then analyze the data as defined in the prospective analysis plan

61 Use of Archived Specimens in Evaluation of Prognostic and Predictive Biomarkers
Richard M. Simon, Soonmyung Paik and Daniel F. Hayes
We propose modified guidelines for the conduct of reliable analyses of prognostic and predictive biomarkers using archived specimens. These guidelines stipulate that:
(i) archived tissue adequate for a successful assay must be available on a sufficiently large number of patients from a phase III trial that the appropriate analyses have adequate statistical power and that the patients included in the evaluation are clearly representative of the patients in the trial.
(ii) The test should be analytically and pre-analytically validated for use with archived tissue.
(iii) The analysis plan for the biomarker evaluation should be completely specified in writing prior to the performance of the biomarker assays on archived tissue and should be focused on evaluation of a single completely defined classifier.
(iv) the results from archived specimens should be validated using specimens from a similar, but separate, study.

62 Use of Archived Specimens in Evaluation of Prognostic and Predictive Biomarkers
Richard M. Simon, Soonmyung Paik and Daniel F. Hayes
Conclusions
- Claims of medical utility for prognostic and predictive biomarkers based on analysis of archived tissues can be considered to have either a high or low level of evidence depending on several key factors.
- These factors include the analytical and pre-analytical validation of the assay, the nature of the study from which the specimens were archived, the number and condition of the specimens, and the development prior to assaying tissue of a focused written plan for analysis of a completely specified biomarker classifier.
- Studies using archived tissues, when conducted under ideal conditions and independently confirmed can provide the highest level of evidence.
- Traditional analyses of prognostic or predictive factors, using non analytically validated assays on a convenience sample of tissues and conducted in an exploratory and unfocused manner provide a very low level of evidence for clinical utility.

