Presentation on theme: "Adaptive Design Methods in Clinical Trials Shein-Chung Chow, PhD Department of Biostatistics and Bioinformatics Duke University School of Medicine Durham,"— Presentation transcript:
Adaptive Design Methods in Clinical Trials Shein-Chung Chow, PhD Department of Biostatistics and Bioinformatics Duke University School of Medicine Durham, North Carolina Presented at The 2011 MBSW, Muncie, Indiana May 23, 2011
Lecture 1 - Outline What is adaptive design? Type of adaptive designs Possible benefits Regulatory perspectives Protocol amendments
What is adaptive design? There is no universal definition. Adaptive randomization, group sequential, and sample size re-estimation, etc. PhRMA (2006) FDA (2010) Adaptive design is also known as Flexible design (EMEA, 2002, 2006) Attractive design (Uchida, 2006)
PhRMA’s definition PhRMA (2006). J. Biopharm. Stat., 16 (3), An adaptive design is referred to as a clinical trial design that uses accumulating data to decide on how to modify aspects of the study as it continues, without undermining the validity and integrity of the trial.
PhRMA’s definition Characteristics Adaptation is a design feature. Changes are made by design not on an ad hoc basis. Comments It does not reflect real practice. It may not be flexible as it means to be.
FDA’s definition US FDA Guidance for Industry – Adaptive Design Clinical Trials for Drugs and Biologics Feb, 2010 An adaptive design clinical study is defined as a study that includes a prospectively planned opportunity for modification of one or more specified aspects of the study design and hypotheses based on analysis of data (usually interim data) from subjects in the study
FDA’s definition Characteristics Adaptation is a prospectively planned opportunity. Changes are made based on analysis of data (usually interim data). Does not include medical devices? Comments It classifies adaptive designs into well- understood and less well-understood designs It does not reflect real practice (protocol amendments) It is not a guidance but a document of caution
Adaptation An adaptation is defined as a change or modification made to a clinical trial before and during the conduct of the study. Examples include Relax inclusion/exclusion criteria Change study endpoints Modify dose and treatment duration etc.
Types of adaptations Prospective adaptations Adaptive randomization Interim analysis Stopping trial early due to safety, futility, or efficacy Sample size re-estimation etc. Concurrent adaptations Trial procedures Implemented by protocol amendments Retrospective adaptations Statistical procedures Implemented by statistical analysis plan prior to database lock and/or data unblinding
Adaptive randomization design A design that allows modification of randomization schedules Unequal probabilities of treatment assignment Increase the probability of success Types of adaptive randomization Treatment-adaptive Covariate-adaptive Response-adaptive
Comments Randomization schedule may not be available prior to the conduct of the study. It may not be feasible for a large trial or a trial with a relatively long treatment duration. Statistical inference on treatment effect is often difficult to obtain if it is not impossible.
Group sequential design An adaptive design that allows for prematurely stopping a trial due to safety, efficacy/futility, or both based on interim analysis results Sample size re-estimation Other adaptations
Comments Well-understood design without additional adaptations Overall type I error rate may not be preserved when there are additional adaptations (e.g., changes in hypotheses and/or study endpoints) there is a shift in target patient population
Flexible sample size re-estimation An adaptive design that allows for sample size adjustment or re-estimation based on the observed data at interim blinding or unblinding Sample size adjustment or re-estimation is usually performed based on the following criteria variability conditional power reproducibility probability etc.
Comments Can we always start with a small number and perform sample size re- estimation at interim? It should be noted sample size re- estimation is performed based on estimates from the interim analysis. A flexible sample size re-estimation design is also known as an N- adjustable design.
Drop-the-losers design Drop-the-losers design is a multiple stage adaptive design that allows dropping the inferior treatment groups General principles drop the inferior arms retain the control arm may modify or add additional arms It is useful where there are uncertainties regarding the dose levels
Comments The selection criteria and decision rules play important roles for drop-the-losers designs. Dose groups that are dropped may contain valuable information regarding dose response of the treatment under study. How to utilize all of the data for a final analysis? Some people prefer pick-the-winner.
Adaptive dose finding design A typical example is an adaptive dose escalation design. A dose escalation design is often used in early phase clinical development to identify the maximum tolerable dose (MTD), which is usually considered the optimal dose for later phase clinical trials adaptation to the traditional “ 3+3 ” escalation rule continual re-assessment method (CRM) in conjunction with the Bayesian ’ s approach
Comments How to select the initial dose? How to select the dose range under study? How to achieve statistical significance with a desired power with a limit number of subjects? What are the selection criteria and decision rules? What is the probability of achieving the optimal dose?
Biomarker-adaptive design A design that allows for adaptation based on the responses of biomarkers such as genomic markers for assessment of treatment effect. It involves qualification and standard optimal screening design establishment of predictive model validation of the established predictive model
Comments A classifier marker usually does not change over the course of study and can be used to identify patient population who would benefit from the treatment from those do not. DNA marker and other baseline marker for population selection A prognostic marker informs the clinical outcomes, independent of treatment. A predictive marker informs the treatment effect on the clinical endpoint. Predictive marker can be population-specific. That is, a marker can be predictive for population A but not population B.
Comments Classifier marker is commonly used in enrichment process of target clinical trials Prognostic vs. predictive markers Correlation between biomarker and true endpoint make a prognostic marker Correlation between biomarker and true endpoint does not make a predictive biomarker There is a gap between identifying genes that associated with clinical outcomes and establishing a predictive model between relevant genes and clinical outcomes
Adaptive treatment-switching design A design that allows the investigator to switch a patient ’ s treatment from an initial assignment to an alternative treatment if there is evidence of lack of efficacy or safety of the initial treatment Commonly employed in cancer trials In practice, a high percentage of patients may switch due to disease progression
Comments Estimation of survival is a challenge to biostatistician. A high percentage of subjects who switched could lead to a change in hypotheses to be tested. Sample size adjustment for achieving a desired power is critical to the success of the study.
Adaptive-hypotheses design A design that allows change in hypotheses based on interim analysis results often considered before database lock and/or prior to data unblinding Examples switch from a superiority hypothesis to a non-inferiority hypothesis change in study endpoints (e.g., switch primary and secondary endpoints)
Comments Switch between non-inferiority and superiority The selection of non-inferiority margin Sample size calculation Switch between the primary endpoint and the secondary endpoints Perhaps, should consider the switch from the primary endpoint to a co-primary endpoint or a composite endpoint
Seamless adaptive trial design A seamless adaptive (e.g., phase II/III) trial design is a trial design that combines two separate trials (e.g., a phase IIb and a phase III trial) into one trial and would use data from patients enrolled before and after the adaptation in the final analysis Learning phase (e.g., phase II) Confirmatory phase (e.g., phase III)
Comments Characteristics Will be able to address study objectives of individual (e.g., phase IIb and phase III) studies Will utilize data collected from both phases (e.g., phase IIb and phase III) for final analysis Questions/Concerns Is it efficient? How to control the overall type I error rate? How to perform power analysis for sample size calculation/allocation? How to perform a combined analysis if the study objectives/endpoints are different at different phases?
Multiple adaptive design A multiple adaptive design is any combinations of the above adaptive designs very flexible very attractive very complicated statistical inference is often difficult, if not impossible to obtain
Multiple adaptive design A multiple adaptive design is any combinations of the above adaptive designs very flexible very attractive very complicated statistical inference is often difficult, if not impossible to obtain very painful
Possible benefits Correct wrong assumptions Select the most promising option early Make use of emerging external information to the trial React earlier to surprises (positive and/or negative) May speed up development process
Possible benefits May have a second chance to re-design the trial after seeing data from the trial itself at interim (or externally) Sample size may start out with a smaller up-front commitment of sample size More flexible but more problematic operationally due to potential bias
Regulatory perspectives May introduce operational bias. May not be able to preserve type I error rate. P-values may not be correct. Confidence intervals may not be reliable. May result in a totally different trial that is unable to address the medical questions the original study intended to answer. Validity and integrity may be in doubt.
Protocol amendments On average, for a given clinical trial, we may have 2-3 protocol amendments during the conduct of the trial. It is not uncommon to have 5-10 protocol amendments regardless the size of the trial. Some protocols may have up to 12 protocol amendments
Protocol amendments Rationale for changes Clinical Statistical Review process Internal protocol review IRB Regulatory agencies
Ad hoc adaptations Inclusion/exclusion criteria (slow enrollment) Changes in dose/dose regimen or treatment duration (safety concern) Changes in study endpoints (increase the probability of success) Increase the frequency of data monitoring or administrative looks Others
Concerns Protocol amendments may result in a similar but different target patient population Protocol amendments (with major changes) could result in a totally different trial that is unable to address the questions the original trial intended to answer.
Target patient population Has the disease under study Inclusion criteria to describe the target patient population Exclusion criteria to remove heterogeneity Subpopulations may be defined based on some baseline demographics and/or patient characteristics
Target patient population Denote target patient population by, where and are population mean and standard deviation, respectively. After a modification made to the trial procedures, the target patient population lead to the actual patient population of
Target patient population
is usually referred to as the effect size The effect size after the modification made is inflated or reduced by the factor of. “ Clinically meaningful difference ” may have been changed after the modification (adaptation) is made.
Target patient population is referred to as a sensitivity index. When (i.e., there are no impact on the target patient population after the modifications made). In this case, we have =1 (i.e., the sensitivity index is 1).
Sensitivity index A shift in mean of the target patient population may be offset by the inflation (or reduction) of the variability, e.g., A shift of 10% (-10%) in mean could be offset by a 10% inflation (reduction) of variability may not be sensitive due to the masking effect between and C.
Moving target patient population Under the moving target patient population, the effect size is the original effect size times the sensitivity index, that is How will this impact statistical inference?
Two possible approaches Adjustment for covariate Assuming that change in population can be linked by a covariate Chow SC and Shao J. (2005). J. Biopharm. Stat., 15, Assessment of the sensitivity index Assuming that the shift and/or scale parameter is random and derive a unconditional inference for the original patient population Chow SC, Shao J, and Hu OYP (2002). J. Biopharm. Stat., 12,
Adjustment for covariate - motivation An asthma trial (Chow and Shao, 2005) Primary endpoint: FEV1 change from baseline Adaptation: inclusion criteria (slow enrollment) Range of Baseline FEV1 in Inclusion Criterion Number of Patients Mean of FEV1 Change Standard Deviation of FEV1 Change 1.5~ ~ ~
Adjustment for covariates The idea is to relate the means before and after protocol amendments by means of some covariates. In other words, where and are the mean and the corresponding covariate after the protocol amendment, is a given function (linear or non-linear), and is the number of protocol amendments.
An example - summary statistics Range of Baseline FEV1 In Inclusion Criterion Number of Patients Mean of FEV1 Change S.D. of FEV1 Change Mean of Baseline FEV1 Test drug 1.5~ ~ ~ Placebo 1.5~ ~ ~
Results of the proposed method P-value was obtained based on testing for one-sided hypothesis. Classical method ignoring shift in patient population tends to overestimate the treatment effect. Estimate for Test Drug Estimate for Placebo Differen ce Estimated SE P-value Covariate- adjusted method Classical method
Example for sample size calculation Without adjustmentAdjustment Factor R Hypotheses TotalEach treatment TotalEach treatment Superiority Non-inferiority Equivalence Note: Two protocol amendment is considered; δ is chosen as 3%.
EndpointGoalModelEstimation Conti. The WLS method is used to estimate, then Binary The ML method is used to estimate, then : the test treatment : the control treatment Summary
Assessment of the sensitivity index The sensitivity index can be estimated by where
Assessment of the sensitivity index There are four scenarios for assessment of sensitivity index (i) both and are fixed (ii) is random and is fixed (iii) is fixed and is random (iv) both and are random In addition The sample size between two protocol amendments is random
Sample size adjustment – random location shift Hypotheses Without adjustment Adjustment Superiority Non-inferiority Equivalence
Hypotheses Without adjustmentAdjustment Superiority Non- inferiority Equivalence Sample size adjustment – random scale shift
Lecture 2 - Outline Adaptive dose finding Algorithm-based design Model-based design Design selection Seamless adaptive design? Relative advantages and limitations Types of two-stage adaptive design Statistical analysis An example Concluding remarks
Dose finding trials Fundamental questions Is there any drug effect? What doses exhibit a response different from control? What is the nature of the dose-response relationship? What is the optimal dose?
Dose response trials ICH E4 (1994) Guideline on Dose-response Information to Support Drug Registration Randomized parallel dose-response designs Crossover dose-response design Forced titration design Dose escalation design Optimal titration design Placebo-controlled titration to endpoint
Dose response relationship Null hypothesis Alternative hypotheses
Dose escalation trial Primary objectives Is there any evidence of the drug effect? What is the nature of the dose-response? What is the optimal dose? Principles Less patients to be exposed to the toxicity More patients to be treated at potential efficacious dose levels
Dose escalation trial Algorithm-based design Traditional dose escalation (TER) rule The “ 3+3 ” TER The “a+b” TER Model-based design Continual reassessment method (CRM) CRM in conjunction with Bayesian approach
Dose escalation trial Dose limiting toxicity (DLT) DLT is referred to as unacceptable or unmanageable safety profile which is pre- defined by some criteria such as Grade 3 or greater hematological toxicity according to the US National Cancer Institute ’ s Common Toxicity Criteria (CTC) Maximum tolerable dose (MTD) MTD is the highest possible but still tolerable dose with respect to some pre- specified DLT.
Dose escalation trial The “ 3+3 ” TER The traditional escalation rule is to enter three patients at a new dose level and then enter another three patients when a DLT is observed. The assessment of the six patients is then performed to determine whether the trial should be stopped at the level or to escalate to the next dose level.
Traditional escalation rule Basically, there are two types of escalation rules Traditional escalation rule (TER) Does not allow dose de-escalation Strict traditional escalation rule (STER) Allow dose de-escalation if two of three patients have DLTs The “ 3+3 ” TER can be generalized to the “a +b ” design with and without dose de- escalation.
Continual reassessment method For the method of CRM, the dose- response relationship is continually reassessed based on accumulative data collected from the trial. The next patient who enters the trial is then assigned to the potential MTD level Dose toxicity modeling Dose level selection Reassessment of model parameters Assignment of next patient
Dose toxicity modeling Assumptions There is monotonic relationship between dose and toxicity The biologically inactive dose is lower than the active dose, which is in turn lower than the toxic dose Model The logistic model is often considered
Dose toxicity modeling Model where is the probability of toxicity associated with dose, and and are positive parameters to be determined. Then, the MTD can be expressed as where is the probability of DLT at MTD
Dose toxicity modeling Remarks For an aggressive tumor and a transient and non-life-threatening DLT, could be as high as 0.5 For persistent DLT and less aggressive tumors, could be as low as 0.1 to 0.25 A commonly used value for is somewhere between 0 and 1/3=0.33 Crowley (2001)
Dose level selection General principles It should be low enough to avoid severe toxicity It should be high enough for observing some activity or potential efficacy in humans The commonly used starting dose is the dose at which 10% mortality occurs in mice
Dose level selection The subsequent dose levels are usually selected based on the following multiplicative set where is called the dose escalation factor
Dose level selection Remarks The highest dose level should be selected such that it covers the biologically active dose, but remains lower than the toxic dose. In general, CRM does not require pre-determined dose intervals.
Reassessment of model parameters The key is to estimate the parameter in the response mode An initial assumption or prior about the parameter is necessary in order to assign patients to the dose level based on the toxicity relationship The estimate of is continually updated based on cumulative data observed from the trial
Reassessment of model parameter Remarks – the estimation method could be a frequentist or Bayesian approach Frequentist approach Maximum likelihood estimate or least square estimate are commonly considered Bayesian approach - It requires a prior distribution about the parameter - It provides posterior distribution and predictive probabilities of MTD
Assignment of next patient The updated dose-toxicity model is usually used to choose the dose level for the next patient. In other words, the next patient enrolled in the trial is assigned to the current estimated MTD based on dose-response model
Assignment of next patient Remarks Assignment of patient to the most updated MTD leads to majority of the patients assigned to the dose levels near MTD, which allows a more precise estimate of MTD with a minimum number of patients In practice, this assignment is subject to safety constraints such as limited dose jump and delayed response
Criteria for design selection Number of patients expected Number of DLT expected Toxicity rate Probability of observing DLT prior to MTD Probability of correctly achieving the MTD Probability of overdosing Others Flexibility of dose de-escalation
An example – radiation theraphy Small size cohort for lower dose levels Minimize the number of patients at lower dose groups Majority patients near the MTD Ideally, the last two dose cohorts under study Flexibility for dose de-escalation Limited dose jump if CRM is used Higher probability of reaching the MTD Lower probability of overdosing Also take moderate AE into consideration
Example - clinical trial simulation Simulation runs = 5,000 Initial dose = 0.5 mCi/kg Dose range is from 0.5 mCi/kg to 4.5 mCi/kg #of dose levels = 6 Modified Fibonacci sequence is considered. That is, dose levels are 0.5, 1, 1.6, 2.5, 3.5, and 4.7. Maximum dose de-escalation allowed = 1 (for STER) DLT rate at MTD is assumed to be 1/3=33% Logistic toxicity model is assumed for the CRM Uniform prior is assumed for the CRM #of doses allowed to skip = 0
Summary of Simulation Results (based ob 5,000 simulation runs) Design # patients expected (N) # of DLT expected Mean MTD (SD) Prob. of selecting correct MTD “3+3” TER (0.507)0.392 “3+3” STER* (0.499)0.208 CRM** (0.451)0.696 * Allows dose de-escalation ** Uniform prior was used
What is seamless adaptive design? A seamless trial design is defined as a design that combines two separate (independent) trials into a single study. The single study is able to address the study objectives that are normally achieved through the conduct of the two trials.
Seamless adaptive trial design A seamless adaptive trial design is referred to as a seamless design that applies adaptations during the conduct of the trial. A seamless adaptive design would use data collected from patients enrolled before and after the adaptation in the final analysis.
Characteristics Combine two separate trials into a single trial It is also known as a two-stage adaptive design The single trial consists of two stages (phases) Learning (exploratory) phase Confirmatory phase Opportunity for adaptations based on accrued data at the end of learning phase
Advantages Opportunity for saving Stopping early for safety and/or futility/efficacy Efficiency Can reduce lead time between the learning and confirmatory phases Combined analysis Data collected at the learning phase are combined with those data obtained at the confirmatory phase for final analysis
Limitations (regulatory’s concerns) May introduce operational bias Adaptations relate to dose, hypothesis and endpoint etc. May not be able to control the overall type I error rate When study objectives/endpoints are different at different stages Statistical methods for combined analysis are not well established Complexity depends upon the adaptations apply
An example Two-stage phase II/III study Learning (exploratory) phase Dose finding Confirmatory phase Efficacy confirmation
Comparison of type I errors Let and be the type I error for phase II and phase III studies, respectively. Then the alpha for the traditional approach is given by if one phase III study is required if two phase III studies are required In an adaptive seamless phase II/III design, the actual alpha is The alpha for a seamless design is actually times larger than the traditional design
Comparison of power Let and be the power for phase II and phase III studies, respectively. Then the power for the traditional approach is given by if one phase III study is required if two phase III studies are required In an adaptive seamless phase II/III design, the power is The power for a seamless design is actually times larger than the traditional design
Comparison Traditional Approach Seamless Design Significance level 1/20 * 1/201/20 Power0.8 * Lead time6 m – 1 yrReduce lead time Sample size
Seamless adaptive designs In practice, a seamless adaptive design may combine two separate (independent) trials with similar but different study objectives into a single trial, e.g., A phase II trial for dose selection and a phase III study for efficacy confirmation In some cases, the study endpoints considered at the two separate trials may be different, e.g., A biomarker or surrogate endpoint versus a regular clinical endpoint
Seamless adaptive designs Category I - SS Same study objectives Same study endpoints Category II - SD Same study objectives Different study endpoints Category III - DS Different study objectives Same study endpoints Category IV - DD Different study objectives Different study endpoints
Categories of two-stage seamless adaptive designs Study endpoints at different stages SD Study objectives at different stages SI=SSII-SD DIII=DSIV=DD
Seamless adaptive designs Category I - SS Similar to typical group sequential design Category II - SD Biomarker (or surrogate endpoint or clinical endpoint with shorter duration) versus clinical endpoint Category III - DS Dose finding versus early efficacy Category IV - DD Treatment selection with biomarker (or surrogate endpoint or clinical endpoint with different treatment duration) versus efficacy confirmation with regular clinical endpoint
Frequently asked questions How to perform power analysis for sample size calculation/allocation? How to control the overall type I error rate at a pre-specified level of significance? Especially when the study objectives at different stages are different How to combine data collected from both stages for a valid final analysis? Especially when the study objectives and study endpoints at different stages are different
Statistical analysis for Category I – SS designs Similar to group sequential trial design with planned interim analyses Can be treated as a multiple-stage trial design with adaptations Adaptations Stop the trial early for futility/efficacy Drop the losers Sample size re-estimation etc
Hypotheses testing Consider a K-stage design and suppose we are interested in testing the following null hypothesis where is the null hypothesis at the kth stage
Stopping rules Let be the test statistic associated with the null hypothesis Stop for efficacy if Stop for futility if Continue with adaptations if Where and
Test based on individual p-values This method is referred to as method of individual p-values (MIP) Test statistics For K=2 (a two-stage design), we have
Stopping boundaries based on MIP
Test based on sum of p-values This method is referred to as the method of sum of p-values (MSP) Test statistic For K=2 (a two-stage design), we have
Stopping boundaries based on MSP
Test based on product of p-values This method is known as the method of products of p-values (MPP) Test statistic For K=2 (a two-stage design), we have
Stopping boundaries based on MPP
Practical Issues Based on MIP, MSP, or MPP, sample size calculation for achieving the desire power by controlling the overall type I error rate at a pre-specified level of significance can be obtained Statistical methodology was derived under the assumptions of category I – SS designs Did not address the issue of sample size allocation. The target patient population may have been shifted after adaptations How do we preserve type I error rate? Adaptations are made based on “ estimates ” How this will affect the power? More adaptations will result in a more complicated adaptive design
Statistical analysis for Category II – SD designs Let be the data observed from stage 1 (learning phase) and stage 2 (confirmatory phase), respectively. Assume that there is a relationship between and, i.e.,. The idea is to use the predicted values of at the first stage for the final combined analysis.
Assumptions and and can be related by where is an error term with zero mean and variance
Weighted-mean approach Graybill-Deal estimator where
Sample size allocation Let and be the sample sizes at the first and second stages, respectively. Also, let. Then the total sample size where is referred to as the allocation factor.
Sample size allocation For testing the hypothesis of equality, we have where and
Remarks Following similar idea, formulas for sample size calculation/allocation for testing the hypothesis of superiority, non-inferiority, and equivalence can also be derived. Similar idea can be applied to count data (binary responses) and time-to-event data assuming that there is a well-established relationship between the two study endpoints at different stages.
Category II – SD designs (continuous and binary endpoints) (continuous and binary endpoints) Hypotheses Testing Continuous EndpointBinary Response Equality Non-inferiority (δ < 0) Superiority (δ > 0) Equivalence
Category II – SD designs (time-to-event data) (time-to-event data) Hypotheses Testing Weibull DistributionCox ’ s Proportional Hazard Model Equality Non-inferiority (δ < 0) Superiority (δ > 0) Equivalence
Note The definitions of the notations given in the previous slides can be found in following reference Reference Chow, S.C. and Tu, Y.H. (2008). On two- stage seamless adaptive design in clinical trials. Journal of Formosan Medical Association, 107, No. 12, S51-S59.
Remarks One of the key assumptions in the proposed method is that there is a well-established relationship between the endpoints Biomarker vs clinical endpoint Same clinical endpoint with different durations When there is a shift in patient population, the proposed method needs to be modified Protocol amendments
An example – the HCV study Study objectives – to evaluate the safety and efficacy of a test treatment for treating patients with hepatitis C virus (HCV) genotype 1 infection Dose selection (phase II) Efficacy confirmation (phase III) Study design A two-stage phase II/III seamless adaptive design Subjects are randomly assigned to five treatments (4 active and one placebo)
An example – the HCV study Characteristics Study objectives are similar but different Study endpoints are different Study endpoints First stage – early virologic response (EVR) at week 12 Second stage – sustained virologic response (SVR) at 72 week (i.e., 24 weeks after 48 weeks of treatment)
Adaptations considered Two planned interim analyses The first interim analysis will be performed when all Stage 1 subjects have completed study Week 12. The second interim analysis will be conducted when all Stage 2 subjects have completed Week 12 of the study and about 75% of Stage 1 subjects have completed Stage 1 treatment. The O ’ Brien-Fleming type of boundaries are applied.
Criteria for dose selection at Stage 1 Dose selection is performed based on the precision analysis. Based on EVR, the dose with highest confidence level for achieving statistical difference (i.e., the observed difference is not by chance alone) as compared to the control arm is selected.
An example – the HCV study Notations : treatment effect of the i th dose group at the j th stage based on surrogate endpoint : treatment effect of the i th dose group at the j th stage based on regular clinical endpoint i =1,…, k (dose group) j =1,2 (stage)
Study design of the HCV study Two-stage seamless adaptive design Stage 1 Stage 2 Multiple stage design 1 st interim analysis Decision-making End of Stage 12 nd interim analysis Sample size re- estimation End of study Stage 1 Stage 2 Stage 3Stage 4
An example – the HCV study This two-stage seamless design can then be viewed as a 4-stage design Hypotheses of interest
An example – the HCV study Testing procedure Stage 1 If, then stop the trial. If, then treatment will proceed to Stage 2, where Stage 2 If, then stop the trial. If but then move to Stage 3.
An example – the HCV study Stage 3 If, stop the trial; otherwise move to Stage 4. Stage 4 If, reject.
Controlling type I error rate It can be shown that the maximum probability of wrongly rejecting is achieve when and. Denote and the two vectors. Then we have
Sample size calculation Power can be evaluated at and Denote and the two vectors. Then we have
Sample size calculation Let N be the total number. Then Similar to Thall, Simon and Ellenburg (1988), we can choose the parameters such that is minimized.
Remaining issues How about sample size allocation? Is the usual O ’ Brien-Fleming type of boundaries appropriate? How to combine data collected from both stages for a valid final analysis? What if there is a shift in target patient population? Can clinical trial simulation help?
Remarks The usual sample size calculation for a two-stage design with different study objectives/endpoints needs adjustment. One of the key assumptions is that there is a well-established relationship between different endpoints. This relationship may not exist or cannot be verified in practice. When there is a shift in patient population (e.g., as the result of protocol amendments), the above method needs to be modified.
Future perspectives Well-understood design Group sequential design Less well-understood design Adaptive group sequential design Adaptive dose finding Two-stage seamless adaptive design For less well-understood designs, they should be used with caution
Future perspectives Design-specific guidances are necessarily developed Misuse Abuse Statistical methods need to be derived Validity Reliability Monitoring of adaptive trial design Integrity
Concluding remarks Clinical Adaptive design reflects real clinical practice in clinical development. Adaptive design is very attractive due to its flexibility and efficiency. Potential use in early clinical development. Statistical The use of adaptive methods in clinical development will make current good statistics practice even more complicated. The validity of adaptive methods is not well established.
Concluding remarks Regulatory Regulatory agencies may not realize but the adaptive methods for review/approval of regulatory submissions have been employed for years. Specific guidelines regarding different types of less-well-understood adaptive designs are necessary developed.
Selected references  Special issues at Biometrics, Statistics in Medicine, Journal of Biopharmaceutical Statistics, Biometrical Journal, Pharmaceutical Statistics, etc.  Gallo, P., et al. (2006). Adaptive design in clinical drug development – an executive summary of the PhRMA Working Group (with discussions). Journal of Biopharmaceutical Statistics, 16,  Chow, S.C. and Chang, M. (2006). Adaptive Design Methods in Clinical Trials. Chapman Hall/CRC Press, Taylor & Francis, New York, NY.  Chow, S.C. and Chang, M. (2008). Adaptive design methods in clinical trials – a review. The Orphanet Journal of Rare Diseases, 3,  Pong, A. and Chow, S.C. (2010). Handbook of Adaptive Design In Pharmaceutical Research and Development. Chapman Hall/CRC Press, Taylor & Francis, New York, NY.