Practical Application of the Continual Reassessment Method to a Phase I Dose-Finding Trial in Japan: East meets West Satoshi Morita Dept. of Biostatistics and Epidemiology, Yokohama City University Medical Center
Why a phase I dose-finding study of CEX in Japan? CEX = Cyclophosphamide, Epirubicin, Xeloda. Capecitabine (Xeloda) was/is a novel oral fluoropyrimidine derivative with high single-agent anti-tumor activity in metastatic breast cancer (BC). A research team from the EORTC conducted a phase I dose-finding study to determine the recommended dose of CEX (Bonnefoi et al., 2003). Japanese patients/doctors would need CEX as a treatment option.
Why CEX trial in Japanese patients? A concern was raised over possible differences in the tolerability of CEX between Caucasians and Japanese. In many cases, recommended dose(s): Caucasians > Japanese. Ex. FEC (5-FU, Epi, CPA): EORTC, Bonnefoi et al., 2001; Japan, Iwata et al., 2005.
The Japanese CEX phase I trial Morita et al.(2007) & Iwata et al.(2007) To answer this question, we conducted a phase I dose-finding study of CEX in Japanese patients (J-CEX) from Dec., 2003 to Feb., 2006. Based on the prior information: - The EORTC CEX study (3+3 cohort design) - The previous studies for other combinations such as FEC, CAF, etc, we applied CRM!!
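CRM in J-CEX
One-parameter logistic model. DLT = Grade 3,4 hematologic / non-hematologic toxicity or grade 3 hand-foot syndrome. A target Pr(DLT) = 0.33.
$\Pr(\mathrm{DLT} \mid \text{dose } j) = \pi(x_j, \beta) = \dfrac{\exp(\alpha + \beta x_j)}{1 + \exp(\alpha + \beta x_j)}$ for $j = 0, \dots, 4$, with fixed $\alpha > 0$, where $\alpha$ denotes the intercept and $\beta$ the slope.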
Implementation of CRM in J-CEX
A dose-escalation/de-escalation rule:
- Each cohort is treated at the dose level with an estimated Pr(DLT | x, Data) closest to 0.33 and NOT exceeding 0.40: pick x to minimize $|\,E[\pi(x, \beta) \mid \text{Data}] - 0.33\,|$, where $\pi(x, \beta)$ is the model Pr(DLT) at dose x.
- An untried dose is not skipped when escalating.
A trial stopping rule:
- The trial is to be stopped if level 0 is considered too toxic: Pr(DLT | dose 0, Data) > 0.40.
N max = 22, treated in cohorts of 3. Start with the 1st cohort of 1 patient at dose level 1.
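The escalation and stopping rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the intercept `ALPHA` and the dose codes `X` are hypothetical values (the slides do not give them), the slope prior is taken to be Ga(5, 5), and the posterior is approximated by simple grid integration rather than by whatever software the trial actually used.

```python
import math

ALPHA = 3.0                          # hypothetical fixed intercept (not given on the slides)
X = [-5.9, -5.2, -4.4, -3.7, -3.0]   # hypothetical dose codes x_0..x_4
A = 5.0                              # Ga(a, a) prior on the slope
TARGET, CAP = 0.33, 0.40


def pi(x, beta):
    """Model probability of DLT at dose code x for slope beta."""
    return 1.0 / (1.0 + math.exp(-(ALPHA + beta * x)))


def posterior_mean_tox(data, n_grid=4000, beta_max=6.0):
    """E[pi(x_j, beta) | data] for every level j, by midpoint-rule integration.

    data is a list of (dose_level, had_dlt) pairs, one per treated patient.
    """
    step = beta_max / n_grid
    betas = [(i + 0.5) * step for i in range(n_grid)]
    weights = []
    for b in betas:
        log_w = (A - 1.0) * math.log(b) - A * b      # Ga(a, a) log density, up to a constant
        for level, had_dlt in data:
            p = pi(X[level], b)
            log_w += math.log(p if had_dlt else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    return [sum(w * pi(x, b) for w, b in zip(weights, betas)) / total for x in X]


def next_dose(data):
    """Return the next dose level, or None if the stopping rule fires."""
    if not data:
        return 1                                     # protocol start: level 1
    est = posterior_mean_tox(data)
    if est[0] > CAP:
        return None                                  # level 0 judged too toxic
    highest_tried = max(level for level, _ in data)
    # admissible levels: estimate not above the cap, and no skipping of untried doses
    allowed = [j for j, e in enumerate(est) if e <= CAP and j <= highest_tried + 1]
    return min(allowed, key=lambda j: abs(est[j] - TARGET))
```

For example, `next_dose([(1, True)])` gives the level this sketch would assign after a DLT in a single patient at level 1; with other choices of `ALPHA` and `X` the answer can differ.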
Setting up a CRM in J-CEX
Step 1. Obtain pre-study point estimates of Pr(DLT) at each dose level from clinical oncologists
2. Pre-determine the intercept
3. Specify a prior distribution function of the slope
4. Specify a numerical value of x j, j = 0,…,4
5. Specify the hyperparameters of the prior of the slope, in terms of how informative the prior is.
Step 3: Prior of the slope, $\beta$
For computational convenience and to constrain the slope to be positive, $\beta \sim \mathrm{Ga}(a, b)$ with $E(\beta) = a/b$ and $\mathrm{Var}(\beta) = a/b^2$.
One more restriction: $a = b$, so that $E(\beta) = 1$ and $\mathrm{Var}(\beta) = 1/a$. This fixes the prior mean dose-toxicity curve regardless of the magnitude of prior confidence.
Step 5: Specify the hyperparameter, a
The hyperparameter a determines the credible interval of the dose-toxicity curve. We made several patterns of graphical presentations (a = 8, a = 5, a = 2) and asked the oncologists, “Which depicts most appropriately your pre-study perceptions of the dose-toxicity relationship?” We set a = 5.
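To see how a changes the credible interval, one can tabulate the band for a few values. The sketch below assumes a Ga(a, a) slope prior and uses a hypothetical intercept `ALPHA` and hypothetical dose codes `X` (neither is given on the slides); the interval is an equal-tailed 90% interval obtained by Monte Carlo.

```python
import math
import random

ALPHA = 3.0                          # hypothetical fixed intercept
X = [-5.9, -5.2, -4.4, -3.7, -3.0]   # hypothetical dose codes x_0..x_4


def pi(x, beta):
    """Model probability of DLT at dose code x for slope beta."""
    return 1.0 / (1.0 + math.exp(-(ALPHA + beta * x)))


def credible_band(a, level=0.90, n_draws=20000, seed=1):
    """Equal-tailed interval of pi(x_j, beta) at each dose, for beta ~ Ga(a, a)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); rate a corresponds to scale 1/a
    draws = sorted(rng.gammavariate(a, 1.0 / a) for _ in range(n_draws))
    lo_beta = draws[int((1.0 - level) / 2.0 * n_draws)]
    hi_beta = draws[int((1.0 + level) / 2.0 * n_draws) - 1]
    # pi is monotone in beta, so its quantiles are the transformed beta quantiles
    return [tuple(sorted((pi(x, lo_beta), pi(x, hi_beta)))) for x in X]


for a in (2, 5, 8):
    band = credible_band(a)
    print(a, [(round(lo, 2), round(hi, 2)) for lo, hi in band])
```

Larger a gives a narrower band around the same central curve, which is what the graphical presentations shown to the oncologists were meant to convey.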
In the first cohort (patient),… Level 1 (1 pt): DLT in 1 patient, HFS (G3). Doses: C: 600, E: 75, X: 1657.
[Figure: the dose-toxicity curve after updating the prior curve with toxicity data from the 1st pt; x-axis: dose levels 0 to 4; annotation: dose level for the 2nd cohort.]
[Figure: posterior mean dose-toxicity curve and its 90% CI after treating 16 patients; x-axis: dose levels 0 to 4.]
Posterior density functions of Pr(DLT | x, Data) estimated at each of the five dose levels (one level marked “Selected as RD”)
$f[\pi(x_j, \beta) \mid \text{data}] = p(\beta \mid \text{data}) \left| \dfrac{d\beta}{d\pi} \right|$
Concern & Question I had: We made many “arbitrary choices” when designing the study, especially when eliciting the prior from the oncologists. Based on the EORTC study, using graphical presentations, …, BUT, still arbitrary!! My concern was: ‘Didn’t Ga(5,5) dominate the posterior inferences after enrolling the first two / three cohorts?’ My question was: ‘How could we determine the strength of the prior relative to the likelihood?’
Fundamental question in Bayesian analysis: the amount of information contained in the prior p(θ)?
Trans-Pacific Research Project!! December 2005 ~ MDACC, Houston Japan Time difference 15 hours
Prior effective sample size These concerns may be addressed by quantifying the prior information in terms of an equivalent number of hypothetical patients, i.e., a prior effective sample size (ESS). A useful property of prior ESS is that it is readily interpretable by any scientifically literate reviewer without requiring expert mathematical training. This is important, for example, for consumers of clinical trial results.
Work together as a team Peter (Müller) Peter (Thall) Paper? You all right?
The answer seems straightforward
For many commonly used models, e.g., beta distribution:
Be(1.5, 2.5): effective sample size 1.5 + 2.5 = 4
Be(3, 8): effective sample size 3 + 8 = 11
Be(16, 19): effective sample size 16 + 19 = 35
For many parametric Bayesian models, however… How to determine the ESS of the prior is NOT obvious. E.g., usual normal linear regression model
General approach to determine the ESS of prior p(θ)
Morita, Thall, Müller (2008) Biometrics
1) Construct an “ε-information” prior q 0 (θ)
2) For each possible ESS m = 1, 2,..., consider a sample Y m of size m
3) Compute posterior q m (θ | Y m ) starting with prior q 0 (θ)
4) Compute distance between q m (θ | Y m ) and p(θ)
5) The value of m minimizing the distance is the ESS
Definition of ε-information prior: q 0 (θ) has the same mean and correlations as p(θ), while inflating the variances.
The basic idea is to find the sample size m that would be implied by normal approximation of the prior p(θ) and the posterior q m (θ | Y m ). This led us to use the second derivative of the log densities to define the distance.
Distance between p and q m: the difference of the traces of the two information matrices, evaluated at the prior mean.
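The formula itself did not survive in this transcript. One way to write the distance just described, following our reading of Morita, Thall, Müller (2008), is sketched below; the symbols $D_p$, $D_q$, $\delta$, $\bar\theta$ and $f$ are notation assumed here, with $f(Y_m)$ standing for the prior predictive distribution of a sample of size m.

```latex
D_{p}(\theta) = -\sum_{j} \frac{\partial^{2} \log p(\theta)}{\partial \theta_{j}^{2}},
\qquad
D_{q}(m, \theta) = -\sum_{j} \int \frac{\partial^{2} \log q_{m}(\theta \mid Y_{m})}{\partial \theta_{j}^{2}} \, f(Y_{m}) \, dY_{m},
\qquad
\delta(m) = \bigl| D_{p}(\bar\theta) - D_{q}(m, \bar\theta) \bigr|,
\quad \bar\theta = E_{p}(\theta).
```

Each sum runs over the components of θ, so it is the trace of the corresponding information matrix, and both are evaluated at the prior mean.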
DEFINITION of ESS: The effective sample size (ESS) of p(θ) with respect to the likelihood is the (interpolated) integer m that minimizes the distance between p and q m.
Algorithm
Step 1. Specify q 0 (θ)
Step 2. Compute the distance for each m, analytically or using simulation-based numerical approximation
Step 3. ESS is the interpolated value of m minimizing the distance
J-CEX
Step 1: (formula not preserved)
Step 2: Assume a uniform distribution for X i. Use simulation to obtain ESS = 2.1.
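A minimal sketch of this calculation for a one-parameter logistic model with a Ga(5, 5) slope prior is given below. Several things are assumptions rather than the trial's actual inputs: the intercept `ALPHA` and dose codes `X` are hypothetical, the ε-information prior is taken to be Ga(a/c, b/c) with a large c, and a fine grid over m stands in for interpolation. Because the logistic information does not depend on the outcomes, averaging over a uniform distribution on the dose levels replaces the simulation step here. The result therefore need not reproduce the 2.1 reported on the slide.

```python
import math

ALPHA = 3.0                          # hypothetical fixed intercept
X = [-5.9, -5.2, -4.4, -3.7, -3.0]   # hypothetical dose codes x_0..x_4
A = B = 5.0                          # Ga(5, 5) slope prior from the slides
C = 10_000.0                         # assumed variance inflation: q0 = Ga(a/c, b/c)


def pi(x, beta):
    """Model probability of DLT at dose code x for slope beta."""
    return 1.0 / (1.0 + math.exp(-(ALPHA + beta * x)))


def prior_info(beta, shape):
    """-d^2/d beta^2 of the log gamma density: (shape - 1) / beta^2."""
    return (shape - 1.0) / beta ** 2


def patient_info(beta):
    """Logistic information per patient, with X uniform over the dose levels."""
    return sum(x * x * pi(x, beta) * (1.0 - pi(x, beta)) for x in X) / len(X)


def distance(m, beta_bar):
    """|information of p - expected information of q_m| at the prior mean."""
    d_p = prior_info(beta_bar, A)
    d_q = prior_info(beta_bar, A / C) + m * patient_info(beta_bar)
    return abs(d_p - d_q)


def prior_ess(m_max=50.0, step=0.01):
    """Value of m on a fine grid that minimizes the distance."""
    beta_bar = A / B                 # prior mean of the slope
    grid = [i * step for i in range(int(m_max / step) + 1)]
    return min(grid, key=lambda m: distance(m, beta_bar))


print(round(prior_ess(), 2))
```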
A computer program, ESS_RegressionCalculator.R, to calculate the ESS for a normal linear or logistic regression model is available from the website http://biostatistics.mdanderson.org/SoftwareDownload.
In the context of dose-finding studies, prior assumptions (arbitrary choices) include
- one- / two-parameter model,
- priors of the intercept and slope parameters,
- numerical values for dose levels, etc.
It may be interesting to discuss the impact of prior assumptions in terms of prior ESS and other criteria… in order to obtain a “sensible prior”.
→ One of the on-going projects!!
Step 4: Dose levels, x
Based on the elicited Pr(DLT | dose j), specify the numerical values x j, j = 0,…,4.
“Backward fitting” (Garrett-Mayer, 2006, Clinical Trials): $\pi(x_j, \beta) = \dfrac{\exp(\alpha + \beta x_j)}{1 + \exp(\alpha + \beta x_j)}$, with $\alpha$ the fixed intercept and $\beta$ the slope.
[Figure: prior dose-toxicity curve and its 90% credible interval; x-axis: dose levels 0 to 4.]
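Backward fitting amounts to inverting the logistic curve at the prior mean of the slope. A small sketch follows; the elicited probabilities `ELICITED` and the intercept `ALPHA` are hypothetical placeholders, not the values used in J-CEX, and the prior mean slope is 1 as implied by a Ga(a, a) prior.

```python
import math

ALPHA = 3.0                                  # hypothetical fixed intercept
BETA_BAR = 1.0                               # prior mean of the slope under Ga(a, a)
ELICITED = [0.05, 0.10, 0.20, 0.33, 0.50]    # hypothetical elicited Pr(DLT | dose j)


def logit(p):
    return math.log(p / (1.0 - p))


def pi(x, beta, alpha=ALPHA):
    """Model probability of DLT at dose code x."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def backward_fit(probs, alpha=ALPHA, beta_bar=BETA_BAR):
    """Dose codes x_j such that pi(x_j, beta_bar) equals the elicited probability."""
    return [(logit(p) - alpha) / beta_bar for p in probs]


x = backward_fit(ELICITED)
print([round(v, 3) for v in x])
```

By construction, plugging the fitted x j back into the model at the prior mean slope returns the elicited probabilities.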
In the context of dose-finding studies, prior assumptions (arbitrary choices) include
- one- / two-parameter model,
- priors of the intercept and slope parameters,
- numerical values for dose levels, etc.
It may be interesting to discuss the impact of prior assumptions in terms of
1) prior ESS,
2) prior predictive probabilities: $\Pr[\pi(x, \beta) > 0.99]$ & $\Pr[\pi(x, \beta) < 0.01]$,
3) the sensitivity to dose selection decision,
in order to obtain a “sensible prior”.
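Criterion 2) is easy to approximate by simulation. The sketch below draws the slope from a Ga(a, a) prior and counts how often the model probability of DLT falls above 0.99 or below 0.01 at each dose; the intercept `ALPHA` and dose codes `X` are hypothetical, so the numbers only illustrate the kind of check, not the J-CEX prior itself.

```python
import math
import random

ALPHA = 3.0                          # hypothetical fixed intercept
X = [-5.9, -5.2, -4.4, -3.7, -3.0]   # hypothetical dose codes x_0..x_4


def pi(x, beta):
    """Model probability of DLT at dose code x for slope beta."""
    return 1.0 / (1.0 + math.exp(-(ALPHA + beta * x)))


def extreme_probs(a, n_draws=50000, seed=1):
    """Monte Carlo estimates of (Pr[pi > 0.99], Pr[pi < 0.01]) per dose, beta ~ Ga(a, a)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); rate a corresponds to scale 1/a
    draws = [rng.gammavariate(a, 1.0 / a) for _ in range(n_draws)]
    out = []
    for x in X:
        values = [pi(x, b) for b in draws]
        high = sum(v > 0.99 for v in values) / n_draws
        low = sum(v < 0.01 for v in values) / n_draws
        out.append((high, low))
    return out


for level, (high, low) in enumerate(extreme_probs(5.0)):
    print(level, round(high, 3), round(low, 3))
```

A prior that puts substantial mass on such extreme toxicity probabilities at some dose would be a candidate for revision before the trial starts.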
ESS of a beta distribution: Saying Be(a, b) has ESS = a + b implicitly refers to the fact that θ ~ Be(a, b) and Y | θ ~ bin(n, θ) implies θ | Y ~ Be(a + Y, b + n − Y), which has ESS = a + b + n.
ESS of a beta distribution (cont’d): Saying Be(a, b) has ESS = a + b implicitly refers to an earlier Be(c, d) prior with very small c + d = ε and solving for m = a + b − (c + d) = a + b − ε for a very small value ε > 0.
Prior ESS of a beta distribution (beta-binomial case): p(θ) = Be(a, b); q 0 (θ) = Be(c, d), where c + d = ε is very small; q m (θ | Y m ) = Be(c + Y, d + m − Y). Solving for m = a + b − (c + d) = a + b − ε, so Be(a, b) has a prior ESS = a + b.
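The beta-binomial case makes a convenient check of the general algorithm, since the answer should come out as a + b. The sketch below assumes q 0 = Be(a/c, b/c) with a large c (so that c + d = (a + b)/c plays the role of ε) and uses a fine grid over m in place of interpolation; the function names are illustrative only.

```python
def beta_info(theta, a, b):
    """-d^2/d theta^2 of the log Be(a, b) density."""
    return (a - 1.0) / theta ** 2 + (b - 1.0) / (1.0 - theta) ** 2


def beta_ess(a, b, c=10_000.0, m_max=100.0, step=0.01):
    """Grid value of m minimizing the information distance between Be(a, b) and q_m."""
    theta_bar = a / (a + b)                    # prior mean
    d_p = beta_info(theta_bar, a, b)

    def distance(m):
        # The posterior information is linear in Y, so plugging in
        # E[Y] = m * theta_bar gives its prior predictive expectation exactly.
        y = m * theta_bar
        return abs(d_p - beta_info(theta_bar, a / c + y, b / c + m - y))

    grid = [i * step for i in range(int(m_max / step) + 1)]
    return min(grid, key=distance)


for a, b in ((1.5, 2.5), (3.0, 8.0), (16.0, 19.0)):
    print(a, b, round(beta_ess(a, b), 2))
```

Up to the grid resolution and the small ε, the three priors from the earlier slide should come back with ESS close to 4, 11 and 35.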