1. Thirteen Clinical Trial Design Questions and Answers. Peter A. Lachenbruch, DB; Cynthia Rask, DCEPT
2. Usual Disclaimer: The views expressed here are those of the authors and may not reflect those of the FDA. We have been asked to address 13 questions (unlucky?).
3. Orientation: The NIAID requested that we discuss some questions regarding clinical trial design principles. There are many books on the basics of clinical trial design and analysis: Pocock; Friedman, Furberg and DeMets; Piantadosi.
4. Orientation (2): FDA has many guidance documents on its web site; consult these for further details. The International Conference on Harmonization (ICH) has issued many reports that worldwide regulatory agencies will abide by. See E9, E10, E3, E6, and E5 for particularly useful documents.
5. General Statistical Ideas: Clarity of approach. Full disclosure of design, sample size calculations. Analysis methods. Distinction between CONFIRMATORY and exploratory analyses. If “new” methods are used, there should be a peer-reviewed citation.
6. Statistical Ideas (2): Must be analytically appropriate: maintain size (α level); maintain blinding as appropriate. Have endpoints (outcomes) that are appropriate: show a clinical benefit; reliable and valid. Minimize missing data and provide a plan for dealing with them when they occur.
7. Question 1: What pitfalls does the FDA see when information from pre-clinical data (or early clinical data on a similar investigational product) is formulated into a Phase I protocol?
8. Q1: (1) Safety issues. A main use of pre-clinical data is to ascertain basic safety information. If animal data are limited, the FDA may ask for further study. The choice of animal model is important: if it is not accepted as appropriate, there may be a need to obtain further information or information on another model.
9. Q1: (2) Need to characterize the product: potency assays, purity, identity. All specifications need to have at least a start at understanding leading to full GMP compliance.
10. Q1: (3) It is recognized that proof of concept studies in pre-clinical studies may be limited. However, there should not be evidence of poorer outcomes than comparators.
11. Q2: What are the various types of study designs available when there are more than two comparators?
12. Q2: Depends on purpose of study. Testing 2 or more dosage levels versus control: parallel group design (may wish to account for ordering of dosages in the analysis). Testing both schedule and dose: factorial design (high and low levels of each dose crossed), which allows examination of interactions in the analysis.
13. Q2: (2) Test amount of adjuvant and dose: factorial design. Crossover designs are generally not used in vaccine trials because the immune system is permanently affected (or at least affected for a long time). Thus, carry-over effects are always present.
14. Q3: What controls are appropriate when there cannot be any blinding in the trial?
15. Q3: Almost any control is reasonable: placebo, standard of care. In some cases, historical controls may be used, but these are rare. The endpoint / outcome variable that is being used and how it’s evaluated is most important. A subjective endpoint is usually a problem, so we expect that a blinded evaluator will be used. An objective endpoint (e.g., confirmed disease by laboratory measures) is preferable.
16. Q4: What are the problems seen by the FDA with randomization in trials?
17. Q4: Randomization is absolutely essential in vaccine trials. Issues: Too many strata make it unlikely that there will be sufficient numbers in each stratum for precise estimation. The number of strata is the product of the number of levels of each stratification factor (Sex (2) × Age (4) × Ethnicity (3) = 24 strata).
18. Q4: (2) Issues (continued): Inadequate number of strata: age < 2, 2 ≤ age < 12, 12 ≤ age < 18, 18 ≤ age < 50, 50 ≤ age are often important in vaccine studies. Cheating (unblinding the treatment assignment): need to have a robust way of preventing this; introduction of bias. Not accounting for the design of the study in the analysis: even though you have stratified the randomization, you still must account for the stratification in the analysis.
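The strata arithmetic in Q4 (Sex × Age × Ethnicity) can be checked with a short sketch. The level labels and the 240-subject trial size below are hypothetical placeholders; only the level counts (2, 4, 3) come from the slide.

```python
from itertools import product
from math import prod

# Level labels are illustrative assumptions; only the counts (2, 4, 3) are from the slide.
factors = {
    "sex": ["F", "M"],
    "age": ["band1", "band2", "band3", "band4"],
    "ethnicity": ["group1", "group2", "group3"],
}

# Number of strata is the product of the number of levels of each factor.
n_strata = prod(len(levels) for levels in factors.values())
strata = list(product(*factors.values()))  # every combination, one tuple per stratum


def expected_per_cell(n_subjects: int, n_arms: int = 2) -> float:
    """Average subjects per stratum-by-arm cell if enrollment were perfectly even."""
    return n_subjects / (n_strata * n_arms)


print(n_strata)                # 24
print(expected_per_cell(240))  # 5.0
```

Even with perfectly even enrollment, a two-arm trial of this hypothetical size would average only five subjects per stratum-by-arm cell, which is the precision problem the slide warns about.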
19. Q5: What are the steps in designing a dose-ranging/dose-escalation study?
20. Q5: Goals: To establish maximum tolerated dose (MTD), dose-limiting toxicities, and/or maximum feasible dose. To establish minimum effective dose. Designs: One dose per subject, gradual increase by fixed amount (typically half-log increases, with rules for stopping). What’s the right starting dose?
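As a small illustration of "half-log increases", each step multiplies the previous dose by 10^0.5 (about 3.16). The function name, the starting dose of 1.0 (arbitrary units), and the number of levels are assumptions made for this sketch; stopping rules are not modeled.

```python
def half_log_ladder(start_dose: float, n_levels: int) -> list[float]:
    """Dose levels where each step multiplies the previous dose by 10**0.5."""
    return [start_dose * 10 ** (0.5 * i) for i in range(n_levels)]


# start_dose=1.0 is a placeholder in arbitrary units, not a recommendation.
ladder = half_log_ladder(1.0, 5)
print([round(d, 2) for d in ladder])  # [1.0, 3.16, 10.0, 31.62, 100.0]
```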
21. Q5: (2) Dose ranging / dose escalation. Multiple doses per subject for short (3-7 days) or long (1-4 weeks or greater) periods. What is the range that generates useful levels of antibodies? What level has adverse events? Dose escalation: Give successively larger doses or number of doses (booster doses) until subject responds. May not be helpful with vaccines because of the permanent effect of a vaccine.
22. Q5: (3) For a vaccine, both dose and schedule need to be determined. A factorial design may be useful: test all doses and all schedules; can look for interactions to see if the response is additive or not. The specific adjuvant may be important. This may be expanded to look at a response surface. Useful for first trials to pick a dose-schedule combination for later trials.
23. Q6: How does one deal with multiple variables that will affect the outcome measure (with an understanding of fixed randomization schemes and adaptive/dynamic randomization schemes)?
24. Q6: If there are strata, including study sites, these always should be included in the analysis model. Common covariates include (if not made strata) age, sex, ethnicity, disease stage. Usual method is to conduct an analysis of covariance; some covariates may not be ordered, but often they are.
25. Q6: (2) The analysis of covariance does an analysis of variance on the adjusted response. Assumptions: Normal distribution of residual error. Covariates not affected by treatment: measure at baseline! Parallelism: no interaction of treatment and slope, i.e., the rate of change of the response with each covariate is the same in every treatment group. Less commonly used in vaccine trials. May be used for immunogenicity studies. Worry about normality assumptions; may need to transform.
26. Q6a: Fixed and Adaptive Randomization Schemes. Fixed scheme: Same proportion assigned to each treatment group. Often 1:1 allocation, but may be 2:1 or 3:1. Smallest variance of treatment effect is associated with 1:1 allocation, but it may be important to gain understanding of the safety profile, so a more extreme allocation may be used. More extreme than 3:1 is not very useful and leads to a much larger sample size.
27. Q6a: (2) Here are total sample sizes for α = 0.05, β = 0.1, mean difference = 1, σ = 3. 1:1 allocation: 380. 2:1 allocation: [total lost], [value lost] % increase. 3:1 allocation: [total lost], [value lost] % increase. 5:1 allocation: [total lost], [value lost] % increase. Thus, the unbalanced allocation leads to a substantial increase in sample size and consequent budget.
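The effect of allocation ratio on total sample size can be sketched with the usual normal-approximation formula for comparing two means. This is an illustration under stated assumptions (two-sided test, known equal standard deviations, α = 0.05, β = 0.1, mean difference 1, σ = 3), not necessarily the calculation behind the slide; the function name is hypothetical. Under these assumptions the 1:1 case gives a total of 380.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(ratio: float, alpha: float = 0.05, beta: float = 0.10,
                      delta: float = 1.0, sigma: float = 3.0) -> int:
    """Total N for a two-sided z-test of two means with allocation ratio:1.

    Assumes known, equal SDs and the normal approximation (assumptions of this sketch).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    n_small = ceil((1 + 1 / ratio) * (z * sigma / delta) ** 2)  # smaller arm
    n_large = ceil(ratio * n_small)                             # larger arm
    return n_small + n_large


base = total_sample_size(1)
for k in (1, 2, 3, 5):
    n = total_sample_size(k)
    print(f"{k}:1 allocation  N={n}  increase={100 * (n - base) / base:.0f}%")
```

The totals printed for the unbalanced ratios come from this sketch only and should not be read as the figures missing from the slide.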
28. Q6a: (3) Adaptive randomization: Next randomization depends on outcome of prior subjects in the trial. Need a fairly early response in the trial. May be possible with skin or other reactions (if they occur within a few hours of immunization), a bit less so with immunogenicity (outcome after first series of immunizations that may take 6 months), unlikely with clinical outcome (occurs after series of immunizations and a relatively long follow-up period).
29. Q6a: (4) Another form of adaptive randomization attempts to balance covariates (e.g., minimization). This can be done with vaccine studies. Needs to have a measure of imbalance. Must adjust for covariates used in imbalance score. Potential for manipulation? May be problems in choosing an appropriate analysis model; there is considerable disagreement.
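A minimal sketch of covariate-balancing assignment in the spirit of minimization is given below. The imbalance measure (sum over factors of the between-arm range at the new subject's levels), the arm names, and the data layout are all assumptions chosen for illustration; real schemes differ in the imbalance score and in how much randomness they retain.

```python
import random

ARMS = ("vaccine", "control")  # hypothetical arm names


def imbalance_if_assigned(counts, subject, arm):
    """Sum over factors of the between-arm range at the subject's levels,
    after hypothetically adding the subject to `arm`."""
    total = 0
    for factor, level in subject.items():
        per_arm = [counts[a].get((factor, level), 0) + (1 if a == arm else 0)
                   for a in ARMS]
        total += max(per_arm) - min(per_arm)
    return total


def assign(counts, subject, rng=random):
    """Pick the arm with the smallest imbalance score; break ties at random."""
    scores = {arm: imbalance_if_assigned(counts, subject, arm) for arm in ARMS}
    best = min(scores.values())
    arm = rng.choice([a for a in ARMS if scores[a] == best])
    for factor, level in subject.items():  # record the new subject
        counts[arm][(factor, level)] = counts[arm].get((factor, level), 0) + 1
    return arm


# Hypothetical state: 3 males already on vaccine, 1 on control.
counts = {arm: {} for arm in ARMS}
counts["vaccine"][("sex", "M")] = 3
counts["control"][("sex", "M")] = 1
print(assign(counts, {"sex": "M", "age": "18-49"}))  # control
```

Because this version is deterministic whenever the scores differ, someone who knows the current counts could predict the next assignment, which is one way the "potential for manipulation" concern arises; practical schemes often assign the favored arm only with some probability below 1.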
30. Q7: What factors are used to estimate sample size?
31. Q7: What factors are used to estimate sample size? Most popular question to statisticians. Factors (using a two-group test as an example): Significance level (α, often 0.05). Type 2 error (β, often 0.2 or 0.1; 1 − β is the power). Standard deviation of observation (σ). Difference in means (μ1 − μ2). Allocation ratio.
32. Q7: (2) Significance level, type 2 error and allocation ratio are relatively easy to determine. Mean difference and standard deviation are usually based on preliminary studies and may be quite uncertain. I find it useful to take these preliminary differences and halve them. What is really needed is the ratio of treatment difference to the standard deviation. Halve them because a) regression toward the mean, b) preliminary studies aren’t usually conducted as well as the phase 3 studies, c) preliminary estimates are too optimistic.
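To see what halving the preliminary difference costs, the sketch below uses the standard normal-approximation formula for a two-sided, 1:1, two-sample comparison of means. The pilot values (difference 1.0, standard deviation 2.0) are placeholders, and the function name is hypothetical. Because sample size depends on the squared ratio of σ to the difference, halving the difference roughly quadruples the required n.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sigma: float, alpha: float = 0.05, beta: float = 0.20) -> int:
    """Per-group n for a two-sided, 1:1, two-sample z-test (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    return ceil(2 * (z * sigma / delta) ** 2)


# delta and sigma below are placeholders standing in for pilot-study estimates.
pilot_delta, pilot_sigma = 1.0, 2.0
n_as_is = n_per_group(pilot_delta, pilot_sigma)       # pilot estimate taken at face value
n_halved = n_per_group(pilot_delta / 2, pilot_sigma)  # after halving the difference
print(n_as_is, n_halved, round(n_halved / n_as_is, 1))
```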
33. Q7: (3) In vaccines we often want to estimate the vaccine efficacy (VE) and find a confidence interval with a lower bound that gives us assurance the vaccine is working well: VE = 1 − IV/IC, where IV is the incidence rate for the vaccine group and IC is the incidence rate for the control group. Lower bound must be substantially greater than 0.
34. Q7: (4) It’s easy to find an expression for the confidence interval, and then one sets the lower bound of the interval to the desired level (set in consultation with FDA). Often ¾ of the observed VE, and VE is set by needs of clinical prevention. For example, if the target VE is 0.8 (or 80%), the lower bound would be 60%. These are not absolute criteria! We never want merely to show that a vaccine has VE better than 0. That’s not enough.
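One way to compute VE with a confidence interval is sketched below. It uses a Wald interval on the log of the incidence ratio, which is only one of several possible methods and is assumed here for illustration; the case counts and the 0.60 criterion in the usage lines are hypothetical, the latter echoing the "¾ of 0.8" example.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def vaccine_efficacy_ci(cases_v: int, n_v: int, cases_c: int, n_c: int,
                        level: float = 0.95) -> tuple[float, float, float]:
    """Return (VE, lower, upper) with VE = 1 - IV/IC.

    Uses a Wald interval on log(IV/IC); this is one common choice, assumed
    here for illustration, and it needs at least one case in each group.
    """
    rr = (cases_v / n_v) / (cases_c / n_c)
    se = sqrt(1 / cases_v - 1 / n_v + 1 / cases_c - 1 / n_c)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    rr_low, rr_high = exp(log(rr) - z * se), exp(log(rr) + z * se)
    return 1 - rr, 1 - rr_high, 1 - rr_low  # the upper RR bound gives the lower VE bound


# Hypothetical counts: 10 cases among 1000 vaccinees, 50 among 1000 controls.
ve, lower, upper = vaccine_efficacy_ci(10, 1000, 50, 1000)
print(f"VE={ve:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
print("meets a 0.60 lower-bound criterion:", lower >= 0.60)
```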
35. Q8: How does the investigator choose the margin of equivalence or non-inferiority (delta or δ) in comparative clinical trials?
36. Q8: Demonstrate that the active control has assay sensitivity, that is, it consistently shows itself better than placebo; need evidence of this. Is the control an appropriate one? Compare to best licensed product, not worst. What is clinically important? Depends on context of disease. Note I try to avoid the word “significant”.
37. Q8: (2) When citing a % difference for VE (or anything), be sure to clarify whether you mean 10 percentage points or 10% of the comparator. If comparator has a VE of 80%, do we want the estimated VE of the new vaccine to be 72% or 70% if we choose a 10% margin? Or do we want the relative VE (vaccine vs. control) to be at least 90%? It’s easy to become confused, so it’s good to be specific.
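The two readings of a "10% margin" in the example above work out as follows. The variable names are illustrative; the third reading (relative VE of vaccine vs. control of at least 90%) is a different quantity and is not computed here.

```python
comparator_ve = 0.80
margin = 0.10

# Reading 1: 10 percentage points below the comparator.
absolute_threshold = comparator_ve - margin        # about 0.70
# Reading 2: 10% of the comparator's VE below the comparator.
relative_threshold = comparator_ve * (1 - margin)  # about 0.72

print(f"percentage-point reading: VE >= {absolute_threshold:.2f}")
print(f"relative reading:         VE >= {relative_threshold:.2f}")
```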
38. Q9: How does the investigator deal with missing data?
39. Q9: Don’t have any. Don’t have very much (under 5% is my initial break point). Discuss how you will deal with it prospectively!!! Complete cases analysis (ugh!). Last Observation Carried Forward (LOCF). Mean values for replacement. Regression models for replacement. Imputation models. Will not discuss complete cases analysis further; only useful if fraction of missing is small.
40. Q9: (2) Types of missing values: Patient misses visit, and an ‘interior’ value is missing. Patient drops out, and a series of values at the end are missing. (Some) covariates are missing in one or more visits.
41. Q9: (3) Last Observation Carried Forward: Is almost always a problem for me. It ignores any trends in data. It reduces variability arbitrarily. Was suggested about 25 years ago and has been a pain in the neck ever since.
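A tiny sketch shows why LOCF ignores trends. The series is hypothetical: log-titers rising by 0.5 per visit, with the subject dropping out after the third visit.

```python
def locf(values):
    """Replace each None with the most recent observed value (a leading None stays None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical log-titers rising by 0.5 per visit; the subject drops out after visit 3.
observed = [1.0, 1.5, 2.0, None, None]
print(locf(observed))  # [1.0, 1.5, 2.0, 2.0, 2.0]
```

The filled-in values are flat at 2.0 even though every observed step was an increase, and the repeated value adds no real information while making the series look less variable.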
42. Q9: (4) Mean values imputation: Need to be sure you don’t increase the apparent sample size (i.e., replaced values don’t give a more precise estimate). This doesn’t account for patient-specific characteristics. This can be especially tricky if a long series of values is imputed. Mean of patient values or mean of other patients at that visit?
43. Q9: (5) Regression models: Determine what variables are “good” predictors of the missing value; it is usually useful to have a small number of such variables so they won’t have a lot of missing value issues. Predict missing value using a regression model (try to show that variance of predicted value isn’t too big).
44. Q9: (6) More sophisticated imputation models have been developed recently. Determine classes of similar patients (“propensity scores”). Fill in missing values by selecting randomly from observations in the same class. Do this multiple times (5 to 10 is usually enough). Analyze the data and pool results. Multiple imputation: get within-imputation and between-imputation variation; can get fairly reliable results.
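The pooling step can be sketched with Rubin's rules, the standard way of combining within-imputation and between-imputation variation; the slide does not name a pooling method, so this choice is an assumption. The estimates and variances below are hypothetical, and the function name is illustrative.

```python
from statistics import mean, variance


def pool(estimates, within_variances):
    """Combine m imputed-data analyses using Rubin's rules.

    Returns (pooled estimate, total variance), where
    total = W + (1 + 1/m) * B, W = mean within-imputation variance,
    B = between-imputation variance of the estimates.
    """
    m = len(estimates)
    q_bar = mean(estimates)
    w = mean(within_variances)
    b = variance(estimates)  # sample variance, divisor m - 1
    return q_bar, w + (1 + 1 / m) * b


# Hypothetical treatment-effect estimates and squared SEs from m = 5 imputations.
est = [1.0, 1.2, 0.9, 1.1, 0.8]
var = [0.04, 0.05, 0.04, 0.05, 0.04]
q, t = pool(est, var)
print(round(q, 3), round(t, 4))
```

The between-imputation term is what keeps the imputed values from making the estimate look more precise than the data justify.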
45. Q10: How does an investigator determine what data should and should not be included in analyses (especially in the cases of protocol violations, withdrawals and drop-outs)?
46. Q10: The fundamental efficacy data set includes all subjects as randomized. Note this does not say “ignore patients who didn’t get medication”; doing this messes up the randomization plan and allows data shaping. This is called Intent to Treat (ITT). Modified ITT (mITT) relaxes this to patients who have had at least one immunization (treatment).
47. Q10: (2) Per Protocol data set: Subjects who had no protocol violations (completed study, etc.). Can provide various analysis data sets that may be subsets, e.g., no protocol violations, no withdrawals or dropouts. Important to examine comparability of groups and outcomes to ITT, mITT.
48. Q10: (3) Data to be included will depend on the purposes of the analysis. All substantial differences from total population need to be explained. E.g., immunogenicity analysis had only 48% of the total sample because only 50% of subjects were solicited for bleeding. FDA will expect to have access to all data and may audit some of it.
49. Q11: What types of analyses should be used when there are multiple time points (multiple observations) of data collection (there is no dichotomous outcome/endpoint)?
50. Q11: There has been an active area of statistical research on longitudinal data analysis in the past few years. In vaccines research, the common issue is in immunogenicity levels over time. These are often in log(GMC) or log(titer). These look like multiple continuous measurements. Can also have whether a four-fold increase in titer has been achieved at various times. This is a series of dichotomous variables (yes or no).
51. Q11: (2) Longitudinal analysis accounts for subject, treatment, other covariates and time in the analysis. Measurements made at different times are correlated. Must determine appropriate correlation structure. Shape of response curve (linear over time? Curve?).
52. Q11: (3) Longitudinal analysis: GEE models provide great flexibility. Dichotomous variables (e.g., seroconversion at different times) can be handled with GEE. Alternatives (may be appropriate): Change from baseline to final observation (usually need to have a common last time): does this depend on baseline? Should we use last observation with baseline as covariate? Don’t use change with baseline as covariate; there’s a built-in correlation. Bias is last value with base as covariate.
53. Q12: In analyzing data from clinical trials involving multiple sites, should site be treated as a fixed or random effect?
54. Q12: The site should be included in the analysis model, especially if randomization is stratified. If it’s not stratified, you may not wish to include it if sites are generally small. With small sites, the d.f. that are used up reduce the precision of the comparison.
55. Q12: (2) If we include sites, they are random effects rather than fixed effects, since the intent is to generalize beyond those sites in the trial. With random effects, the inference extends to all possible sites (i.e., the population of sites). With fixed effects, the population is just the sites that have enrolled patients. Main idea is to analyze the data according to the way in which they have been collected.
56. Q12: (3) Interesting question (at least for statisticians!): How can we regard the sites as a random sample of all sites if we have selected them because they have talented and committed physicians conducting the research there? No decent answer, but we have little interest in drawing conclusions that apply only to the sites that have entered patients.
57. Q13: When should a planned interim analysis (for safety and/or efficacy and/or sample size re-estimation) be appropriate? What are the pros and cons? What are the methods used?
58. Q13: There are three reasons for doing an interim analysis: (1) Examine the safety of the vaccine at an early time to ensure that we are not harming subjects (either stop or continue). (2) Examine efficacy of the vaccine at an early time; may stop because vaccine is very good or very bad, or may continue. (3) Re-estimate sample size: learn that study is too small to show a difference (variability too large, treatment effect too small); probably don’t want to increase sample size by more than 50% total.
59. Q13: (2) All interim analyses carry a risk of unblinding the study. If study stops, all is known. If study continues, a reasonable inference is that the study has a small p-value (>0.05), since we would have stopped if there was little hope of showing a difference. Someone might inadvertently (or deliberately) let information slip out.
60. Q13: (3) These require adjustment of the critical values. It is complicated, and several programs (EaST, PEST, SPlus for sequential analysis) are available. All interim analyses or sample size re-estimation analyses need to be specified a priori in the protocol.
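A small simulation illustrates why the critical values need adjusting. It is a sketch under simplifying assumptions (known variance, equally spaced looks, no true treatment effect) and does not reproduce the group-sequential boundaries implemented in the programs named above; the function name and defaults are hypothetical.

```python
import random
from math import sqrt


def overall_type1_error(n_looks: int, critical: float = 1.96,
                        n_sims: int = 20000, seed: int = 1) -> float:
    """Simulate a null trial analysed at `n_looks` equally spaced looks.

    Each look tests the cumulative z-statistic against the same unadjusted
    two-sided critical value; returns the fraction of trials rejecting at any look.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        total = 0.0
        for k in range(1, n_looks + 1):
            total += rng.gauss(0.0, 1.0)  # one stage of null data, as a z-increment
            if abs(total / sqrt(k)) > critical:
                rejections += 1
                break
    return rejections / n_sims


print(overall_type1_error(1))  # close to the nominal 0.05
print(overall_type1_error(5))  # noticeably larger than 0.05
```

Testing at the unadjusted 1.96 at every look inflates the overall false-positive rate well above the nominal level, which is why the boundaries must be planned in advance and specified in the protocol.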