
A General Framework for Model-Based Drug Development Using Probability Metrics for Quantitative Decision Making
Ken Kowalski, Ann Arbor Pharmacometrics Group (A2PG)

Outline
- Population Models: Basic Notation and Key Concepts
- Basic Probabilistic Concepts
- General Framework for Model-Based Drug Development (MBDD)
- Examples
- Final Remarks/Discussion
- Bibliography

Population Models: Basic Notation
General form of a two-level hierarchical mixed effects model:
  y_ij = f(x_ij, θ, η_i) + ε_ij,   η_i ~ N(0, Ω),   ε_ij ~ N(0, σ²)
Definitions:
- y_ij: j-th observation on subject i; x_ij: design variables (e.g., dose, time)
- f: structural model; θ: fixed-effect parameters
- η_i: subject-level random effects with covariance matrix Ω
- ε_ij: residual errors with variance σ²

Population Models: Key Concepts
- Typical individual prediction: f(x, θ, η=0); easy to compute, same functional form as f
- Population mean prediction: E[y] = ∫ f(x, θ, η) dF(η)
  - The integral is often intractable when f is nonlinear
  - Typically requires Monte Carlo integration (simulation)
- The typical individual and population mean predictions are not the same when f is nonlinear
- We cannot observe a 'typical individual', but we can observe a sample mean

Basic Probabilistic Concepts
- Statistical intervals (i.e., confidence and prediction intervals)
- Statistical power
- Probability of achieving the target value (PTV)
- Probability of success (POS)
- Probability of correct decision (POCD)

What's the difference between a confidence interval and a prediction interval?
- A confidence interval (CI) is used to make inference about a true (unknown) quantity (e.g., a population mean)
  - Reflects uncertainty in the parameter estimates
  - Typically used to summarize the current state of knowledge regarding the quantity of interest, based on all available data used in its estimation
- A prediction interval (PI) is used to make inference about a future observation (or a summary statistic of future observations)
  - Reflects both uncertainty in the parameter estimates and the sampling variation of the future observation

Relationship Between CIs and PIs
[Figure: distribution of sampling variation centered at the estimate of E(ȳ), with confidence limits for E(ȳ) and the wider prediction limits that result once uncertainty in E(ȳ) is recognized.]
Note: Prediction intervals are always wider than confidence intervals.

Confidence interval for the mean based on a sample of N observations:
  ȳ ± t(1−α/2, N−1) · s/√N
- ȳ: sample mean (parameter estimate)
- s/√N: standard error of the mean (parameter uncertainty)

Prediction interval for a single future observation:
  ȳ ± t(1−α/2, N−1) · s·√(1 + 1/N)
- ȳ: sample mean (parameter estimate)
- s²: sample variance of a future observation (sampling variation)
- s²/N: sample variance of the mean (parameter uncertainty)
Note: The sample mean based on N previous observations is the best estimate for a single future observation.

Prediction interval for the mean of M future observations:
  ȳ ± t(1−α/2, N−1) · s·√(1/M + 1/N)
- ȳ: sample mean (parameter estimate)
- s²/M: sample variance of the mean of M future observations (sampling variation)
- s²/N: sample variance of the mean (parameter uncertainty)
Note 1: The sample mean based on N previous observations is the best estimate for the mean of M future observations.
Note 2: A prediction interval for M=∞ future observations is equivalent to a confidence interval (see Slide 8). This will also be referred to as 'averaging out' the sampling variation.
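To make the interval formulas concrete, here is a minimal Python sketch, assuming independent, normally distributed observations; the simulated data and the 95% level are illustrative, not from the presentation:

```python
import numpy as np
from scipy import stats

def mean_intervals(y, M=1, level=0.95):
    """CI for the mean and PI for the mean of M future observations,
    from N past observations y (normal-theory formulas above)."""
    y = np.asarray(y, dtype=float)
    N = len(y)
    ybar, s = y.mean(), y.std(ddof=1)
    t = stats.t.ppf(0.5 + level / 2, df=N - 1)
    ci_half = t * s * np.sqrt(1 / N)            # parameter uncertainty only
    pi_half = t * s * np.sqrt(1 / M + 1 / N)    # plus sampling variation of M future obs
    return (ybar - ci_half, ybar + ci_half), (ybar - pi_half, ybar + pi_half)

rng = np.random.default_rng(1)
y = rng.normal(100.0, 15.0, size=25)
ci, pi = mean_intervals(y, M=1)      # M=1: PI for a single future observation
print("95% CI:", ci)
print("95% PI:", pi)                 # always wider; approaches the CI as M grows large
```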

A Conceptual Extension of Confidence and Prediction Intervals to Population Modeling

Measure/Quantity        | Simple Mean Model | Population Model
------------------------|-------------------|---------------------------------------------------
Parameters              | μ, σ              | θ, Ω, σ²
Prediction              | sample mean ȳ     | population mean prediction
Sampling variation      | s                 | Ω, σ
Parameter uncertainty*  | s/√N              | uncertainty in θ, Ω, σ² (e.g., from a bootstrap)
Confidence interval     | see Slide 8       | stochastic simulations with sufficiently large M
Prediction interval     | see Slide 10      | stochastic simulations with finite M

* Note: For the simple mean model the standard error of the mean does not take into account uncertainty in the sampling variation (s), whereas in population models we typically take into account the uncertainty in Ω and σ².

Quantifying Parameter Uncertainty in Population Models – Nonparametric Bootstrap
- Randomly sample subject data vectors with replacement (preserving within-subject correlations) to construct bootstrap datasets
- Re-estimate the model parameters for each bootstrap dataset to obtain an empirical (posterior) distribution of the parameter estimates (θ, Ω, σ²)
- May require a stratified resampling procedure (by design covariates) for a pooled analysis with very heterogeneous study designs
  - E.g., limited data at a high dose in one study may be critical to estimation of Emax
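A minimal sketch of the subject-level resampling in Python; `fit_model` is a hypothetical stand-in for a full population-model re-estimation (e.g., a NONMEM run), and the toy data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_model(subjects):
    """Hypothetical estimation step: returns the grand mean and the
    between-subject variance as stand-ins for (theta, Omega)."""
    subject_means = np.array([s.mean() for s in subjects])
    return subject_means.mean(), subject_means.var(ddof=1)

# Toy data: 40 subjects, each a vector of 6 repeated measurements.
data = [10.0 + rng.normal(0, 1) + rng.normal(0, 2, size=6) for _ in range(40)]

B = 1000
boot_estimates = []
for _ in range(B):
    # Resample whole subject data vectors (not single observations) so that
    # within-subject correlation is preserved in each bootstrap dataset.
    idx = rng.integers(0, len(data), size=len(data))
    boot_estimates.append(fit_model([data[i] for i in idx]))

# Percentile CIs from the empirical distribution of the re-estimates.
print(np.percentile(np.array(boot_estimates), [2.5, 97.5], axis=0))
```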

Quantifying Parameter Uncertainty in Population Models – Parametric Bootstrap
- Draw random samples from a multivariate normal (MVN) distribution with:
  - Mean vector = the final parameter estimates [θ, Ω, σ²]
  - Covariance matrix = Cov(θ, Ω, σ²), obtained from the Hessian or another procedure (e.g., the COV step in NONMEM)
- Based on Fisher's theory (Efron, 1982)
  - Assumes asymptotic theory (large sample size) under which maximum likelihood estimates converge to an MVN distribution
  - See Vonesh and Chinchilli (1997)
- Based on Wald's approximation that the likelihood surface can be approximated by a quadratic model locally around the maximum likelihood estimates
  - The approximations depend on the parameterization
  - Improved approximations may be obtained by estimating the natural logarithm of parameters that must be positive
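A minimal sketch of the MVN sampling step in Python; the estimates and covariance matrix below are illustrative (in practice they would come from the NONMEM COV step or a Hessian), and the log transformation keeps positive-only parameters in their feasible range, as the slide suggests:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative final estimates of two positive-only parameters, on the log scale.
log_est = np.log(np.array([10.0, 50.0]))
# Illustrative covariance matrix of the log-scale estimates.
cov = np.array([[0.010, 0.002],
                [0.002, 0.020]])

# Draw parameter vectors from the MVN on the log scale, then back-transform;
# every draw is strictly positive by construction.
draws = np.exp(rng.multivariate_normal(log_est, cov, size=2000))
print(draws.min(axis=0), draws.mean(axis=0))
```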

Nonparametric Versus Parametric Bootstrap Procedures
- The nonparametric bootstrap procedure is widely used in pharmacometrics
  - Often used as a back-up procedure to quantify parameter uncertainty when difficulties arise in estimating the covariance matrix (e.g., NONMEM COV step failure)
  - In this setting, a large number of convergence failures in the bootstrap runs may call into question the validity of the confidence intervals (i.e., do they have the right coverage probabilities?)
- The parametric bootstrap procedure is less computationally intensive than bootstrap procedures that require re-estimation
  - Requires successful estimation of the covariance matrix (NONMEM COV step), and can lead to random draws outside the feasible range of the parameters unless appropriate transformations are applied

Nonparametric Versus Parametric Bootstrap Procedures (2)
- Developing stable models that avoid extremely high pairwise correlations (>0.95) between parameter estimates and that have low condition numbers (<1000) can help:
  - Ensure successful covariance matrix estimation
  - Reduce convergence failures in nonparametric bootstrap runs
- The choice of bootstrap procedure should focus on the adequacy of the parametric assumption
  - Random draws from an MVN versus the more computationally intensive re-estimation approaches (e.g., the nonparametric bootstrap)

Simulation Procedure to Construct Statistical Intervals for Population Model Predictions
1. Obtain a random draw of θ, Ω, σ² from the bootstrap procedure for the k-th trial
2. Simulate subject-level data Yi | θ, Ω, σ² for M subjects
3. Summarize the predictions (e.g., mean) stratified by design (dose, time, etc.)
4. Repeat for k = 1, …, K trials
5. Use the percentile method to obtain the statistical interval from the K predictions
Note 1: To construct a confidence interval, consider a sufficiently large M (e.g., ≥2000 subjects) to average out the sampling variation in Ω and σ².
Note 2: For prediction intervals, M is chosen based on the observed or planned sample size.
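A minimal sketch of this loop in Python, using a hypothetical Emax dose-response as the structural model f and illustrative MVN parameter draws in place of a real bootstrap distribution:

```python
import numpy as np

rng = np.random.default_rng(4)

def f(theta, dose):
    """Hypothetical structural model: E0 + Emax*dose/(ED50 + dose)."""
    e0, emax, ed50 = theta
    return e0 + emax * dose / (ed50 + dose)

doses = np.array([0.0, 25.0, 50.0, 100.0])
K = 1000                 # number of simulated trials
M = 2000                 # large M -> CI; set M to the planned sample size for a PI
omega, sigma = 3.0, 5.0  # illustrative between-subject SD and residual SD
# Stand-in for bootstrap draws of theta (step 1).
theta_draws = rng.multivariate_normal([90.0, -30.0, 40.0],
                                      np.diag([4.0, 9.0, 25.0]), size=K)

means = np.empty((K, doses.size))
for k in range(K):
    eta = rng.normal(0.0, omega, size=(M, 1))    # step 2: subject random effects
    y = f(theta_draws[k], doses) + eta + rng.normal(0.0, sigma, size=(M, doses.size))
    means[k] = y.mean(axis=0)                    # step 3: summarize by dose

# Step 5: percentile method across the K trials -> 90% interval by dose.
print(np.percentile(means, [5, 95], axis=0))
```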

To describe the other probabilistic concepts we need to define some additional quantities
- True (unknown) treatment effect or quantity (δ)
- Target value (TV): a reference value for δ
- Data-analytic decision rule (e.g., Go/No-Go criteria), based on an observed treatment effect or quantity (T)

Treatment Effect (δ)
- δ is the true (unknown) treatment effect
- Models provide a prediction of δ
- Uncertainty in the parameter estimates of the model induces uncertainty in the prediction of δ
- P(δ) denotes the distribution of predictions of δ

Example of a Model-Predicted δ: Dose-Response Model for Fasting Plasma Glucose (FPG)
- Semi-mechanistic model of inhibition of glucose production
[Figure, left panel: mean model fit of FPG versus dose at Weeks 0, 2, 4, 6, 8, and 12, integrating data across dose and time (observed means with typical individual predictions, PRED). Right panel: model-predicted placebo-corrected change in FPG (mg/dL) versus dose (mg) at Week 12, showing the population mean prediction with 5th and 95th percentiles (90% confidence limits).]

Target Value (TV)
- Suppose we desire to develop a compound if the true unknown treatment effect (δ) is greater than or equal to some target value (TV)
- TV may be chosen based on:
  - Target marketing profile
  - Clinically important difference
  - Competitor's performance
- If we knew the truth, we would make a Go/No-Go decision to develop the compound based on:
  - Go: δ ≥ TV
  - No-Go: δ < TV

Data-Analytic Decision Rule
- But we don't know the truth…
- So we conduct trials and collect data to obtain an estimate of the treatment effect (T)
  - T can be a point estimate, or a confidence limit on the estimate or prediction of δ (e.g., placebo-corrected change from baseline FPG)
- We might make a data-analytic Go/No-Go decision to advance the development of the compound as follows:
  - Go: T ≥ TV
  - No-Go: T < TV

Statistical Power
- Power is a conditional probability based on an assumed fixed value of the treatment effect (δ):
  Power = P(T ≥ TV | δ), where P(T ≥ TV | δ = TV) = α (the false positive rate)
- TV = 0 for statistical tests of a treatment effect
- Power is an operating characteristic of the design based on a likely value of δ
  - There is no formal assessment that the compound can achieve the assumed value of δ

Simulation Procedure to Calculate Power Based on a Population Model-Predicted δ
1. Use the same final estimates (θ, Ω, σ²) for each simulated trial
2. Simulate subject-level data Yi | θ, Ω, σ² for M subjects
3. Analyze the simulated data as per the SAP to test H0: δ = TV versus Ha: δ ≠ TV
4. Repeat for k = 1, …, K trials
5. Power is calculated as the fraction of the K trials in which H0 is rejected
Note 1: Typically TV = 0 when assessing whether the compound has an overall treatment effect.
Note 2: When using simulations to evaluate power, it is good practice to also simulate data under the null (e.g., no treatment effect or a placebo model) to verify that the Type 1 error (α) is maintained.
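A minimal sketch of the power loop in Python, with a two-sample t-test standing in for the SAP analysis; the effect size, variability, and sample size are illustrative (TV = 0):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

theta_trt, sigma = 8.0, 15.0   # fixed assumed treatment effect and SD
M, K, alpha = 50, 5000, 0.05   # subjects per arm, simulated trials, Type 1 error

rejections = 0
for _ in range(K):
    placebo = rng.normal(0.0, sigma, size=M)       # step 2: simulate trial data
    active = rng.normal(theta_trt, sigma, size=M)  # same theta in every trial
    _, p = stats.ttest_ind(active, placebo)        # step 3: analyze per the SAP
    rejections += p < alpha

print("power ~", rejections / K)
# Per Note 2: rerun with theta_trt = 0.0 and check that the rejection
# fraction is close to alpha.
```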

Probability of Achieving the Target Value (PTV)
- PTV is defined as the proportion of trials where the true δ ≥ TV:
  PTV = P(δ ≥ TV)
- Does not depend on the design or sample size
  - Based only on prior information through the model(s) and its assumptions
- PTV is a measure of confidence in the compound at a given stage of development
  - Can change as the compound progresses through development
- PTV can be calculated from the same set of simulations used to construct confidence intervals of the predicted treatment effect (δ)

Simulation Procedure to Calculate PTV Based on Population Model Predictions
1. Obtain a random draw of θ, Ω, σ² from the bootstrap procedure for the k-th trial
2. Simulate subject-level data Yi | θ, Ω, σ² for an arbitrarily large M
3. Summarize the simulated data to obtain the population mean prediction of δ
4. Repeat for k = 1, …, K trials
5. Calculate PTV as the proportion of the K trials in which δ ≥ TV
Note: To calculate PTV, use an arbitrarily large M (e.g., ≥2000 subjects) to average out the sampling variation in Ω and σ². PTV should only reflect the parameter uncertainty based on all available data used in the model estimation.
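A minimal sketch of the PTV loop in Python, with an illustrative Emax model and MVN parameter draws standing in for the bootstrap distribution:

```python
import numpy as np

rng = np.random.default_rng(6)

K, M, TV = 10000, 2000, 3.0      # trials, subjects per trial, target value
dose, omega, sigma = 100.0, 1.0, 4.0
# Stand-in for bootstrap draws of (Emax, ED50).
theta_draws = rng.multivariate_normal([6.0, 40.0], np.diag([4.0, 64.0]), size=K)

delta = np.empty(K)
for k in range(K):
    emax, ed50 = theta_draws[k]
    eta = rng.normal(0.0, omega, size=M)
    y = emax * dose / (ed50 + dose) + eta + rng.normal(0.0, sigma, size=M)
    delta[k] = y.mean()          # large M averages out the sampling variation

print("PTV ~", np.mean(delta >= TV))   # proportion of trials with delta >= TV
```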

Probability of Success (POS)
- POS is defined as the proportion of trials where a data-analytic Go decision is made:
  POS = P(Go) = P(T ≥ TV)
- POS is an operating characteristic that evaluates both the performance of the compound and the design
  - In contrast to Power = P(T ≥ TV | δ), which is an operating characteristic of the performance of the design for a likely value of δ
- POS is sometimes referred to as 'average power' when a Go decision is based on a statistical hypothesis test

Simulation Procedure to Calculate POS Based on a Population Model-Predicted δ
1. Obtain a random draw of θ, Ω, σ² from the bootstrap procedure for the k-th trial
2. Simulate subject-level data Yi | θ, Ω, σ² for the planned sample size (M)
3. Summarize the simulated data to obtain an estimate of δ (T) and perform the hypothesis test
4. Repeat for k = 1, …, K trials
5. Calculate POS as the proportion of the K trials in which T ≥ TV
Note: POS integrates the conditional probability of a significant result over the distribution of plausible values of δ, reflected through the uncertainty in the parameter estimates of θ, Ω, and σ².
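A minimal sketch of the POS loop in Python. The one difference from the power sketch above is step 1: each simulated trial uses a fresh parameter draw, so POS averages the conditional power over parameter uncertainty (illustrative numbers; decision rule T ≥ TV):

```python
import numpy as np

rng = np.random.default_rng(7)

K, M, TV, sigma = 10000, 50, 5.0, 15.0
# Step 1: fresh draw of the true effect per trial (stand-in for the bootstrap).
delta_draws = rng.normal(8.0, 3.0, size=K)

go = 0
for k in range(K):
    placebo = rng.normal(0.0, sigma, size=M)            # step 2: planned M per arm
    active = rng.normal(delta_draws[k], sigma, size=M)
    T = active.mean() - placebo.mean()                  # step 3: estimate of delta
    go += T >= TV                                       # data-analytic Go decision

print("POS ~", go / K)
```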

Probability of Correct Decision (POCD)
- A correct data-analytic Go decision is made when T ≥ TV and δ ≥ TV
- A correct data-analytic No-Go decision is made when T < TV and δ < TV
- POCD is calculated as the proportion of trials where correct decisions are made:
  POCD = P(T ≥ TV and δ ≥ TV) + P(T < TV and δ < TV)
- POCD can only be evaluated through simulation, where the underlying truth (δ) is known from the data-generation model used to simulate the data

Simulation Procedure to Calculate POCD Based on a Population Model-Predicted δ
1. Obtain a random draw of θ, Ω, σ² from the bootstrap procedure for the k-th trial
2. Simulate subject-level data Yi | θ, Ω, σ² for the planned sample size (M)
3. Summarize the simulated data to obtain an estimate of δ (T)
4. Classify the trial under the truth (Go: δ ≥ TV; No-Go: δ < TV) and under the trial data (Go: T ≥ TV; No-Go: T < TV), and compare the truth versus the data-analytic decision
5. Repeat for k = 1, …, K trials
6. Calculate POCD as the proportion of the K trials in which the data-analytic decision matches the truth
Note: The classification of the trial under the truth is obtained from the PTV simulations.
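A minimal sketch of the POCD loop in Python, extending the POS sketch: each trial is classified both under the truth (δ versus TV) and under the trial data (T versus TV), and the agreement is counted (illustrative numbers):

```python
import numpy as np

rng = np.random.default_rng(8)

K, M, TV, sigma = 10000, 50, 5.0, 15.0
delta = rng.normal(8.0, 3.0, size=K)   # true effect per trial (known in simulation)

correct = 0
for k in range(K):
    placebo = rng.normal(0.0, sigma, size=M)
    active = rng.normal(delta[k], sigma, size=M)
    T = active.mean() - placebo.mean()
    # Correct decision: the data-analytic Go/No-Go matches the truth.
    correct += (T >= TV) == (delta[k] >= TV)

print("POCD ~", correct / K)
print("PTV  ~", np.mean(delta >= TV))  # falls out of the same simulated truths
```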

General Framework for MBDD
- Basic assumptions of MBDD
- Six components of MBDD
- Clinical trial simulations (CTS) as a tool to integrate MBDD activities
- Table of trial performance metrics
- Improving POCD
- Setting performance targets
- Comparing performance targets between early and late stage clinical drug development

Basic Assumptions of MBDD
- Predicated on the assumptions:
  - That we can and should develop predictive models
  - That these models can be used in CTS to predict trial outcomes
- Think of MBDD as a series of learn-predict-confirm cycles:
  - Update models based on new data (learn)
  - Conduct CTS to predict trial outcomes (predict)
  - Conduct the trial to obtain actual outcomes and evaluate the predictions (confirm)

Six Components of MBDD
- PK/PD & disease models: leverage understanding of the pharmacology/disease; useful for extrapolation
- Meta-analytic models (meta-data from the public domain): understand the competitive landscape from a dose-response perspective
- Design & trial execution models: evaluate designs and dose selection; incorporate trial execution models such as dropout models
- Data-analytic models: implement the SAP; evaluate alternative analysis methods (ANCOVA, MMRM, regression, NLME)
- Quantitative decision criteria: explicitly and quantitatively defined criteria ("draw a line in the sand")
- Trial performance metrics: evaluate the probability of achieving the target value (PTV), of success (POS), and of correct decisions (POCD)

Clinical Trial Simulations (CTS)
- Just as a clinical trial is the basic building block of a clinical drug development program, clinical trial simulations should be the cornerstone of an MBDD program
- CTS allows us to assume (know) the truth for a hypothetical trial
  - Based on the simulation model we know δ
- Mimic all relevant design features of a proposed clinical trial
  - Sample size, treatments (doses), covariate distributions, dropout rates, etc.
- Analyze the simulated data based on the proposed statistical analysis plan (SAP)
  - Calculate T (the test statistic for the treatment effect) and apply the data-analytic decision rule
- CTS should be distinguished from other forms of stochastic simulations
  - E.g., CIs for dose predictions, PTV calculations, etc.
- CTS can be used to integrate the components of MBDD and the various probabilistic concepts (including POS and POCD)

Table of Trial Performance Metrics

                Trial No-Go        Trial Go           Total
"True" No-Go    Correct No-Go      Incorrect Go       P(True No-Go)
"True" Go       Incorrect No-Go    Correct Go         P(True Go) = PTV
Total           P(Trial No-Go)     P(Trial Go) = POS  1.0

POCD = P(Correct No-Go) + P(Correct Go); POS = P(Trial Go); PTV = P(True Go)

Improving the probability of making correct decisions
- Change the design
  - Increase n/group
  - Regression-based designs (more dose groups, smaller n/group)
  - Consider other design constraints (cross-over, titration, etc.)
- Change the data-analytic model
  - Regression versus ANOVA
  - Longitudinal versus landmark analysis
- Change the data-analytic decision rule
  - Alternative choices for T (point estimate, confidence limit, etc.)
- All of the above can be evaluated in a CTS

Setting Performance Targets
- PTV will change over time as the model is refined and new data emerge
  - Bring forward compounds/treatments with higher PTV as the compound moves through development
  - PTV may be low in early development
- The industry-average Phase 3 failure rate is approximately 50%
  - It is difficult to improve on this average unless we can routinely quantify PTV
  - Strive to achieve PTV > 50% before entering Phase 3
- Strive to achieve high POCD in decision-making throughout development

Comparing performance targets between early and late stage clinical drug development
- Early development: POCD should be high, PTV may be low
  - Kill poor compounds/treatments early
- Late development: POCD should be high, PTV should be high
  - Advance good compounds/treatments to registration

Examples
- Rheumatoid Arthritis Example: Phase 3 development decision
- Urinary Incontinence Example: potency-scaling for a back-up compound to bypass the Phase 2a POC trial and proceed to a Phase 2b dose-ranging trial
- Acute Pain Differentiation Case Study: decision to change the development strategy to pursue an acute pain differentiation hypothesis

Example – Rheumatoid Arthritis
- Endpoints:
  - DAS28 remission (DAS28 < 2.6)
  - ACR20 response (20% improvement in ACR score)
- Models developed based on a Phase 2a study:
  - Continuous DAS28 longitudinal PK/PD model with an Emax direct-effect drug model
  - ACR20 logistic regression PK/PD model with an Emax drug model
  - Both direct and indirect-response models evaluated
- Conducted clinical trial simulations for a 24-week Phase 2b placebo-controlled dose-ranging study (placebo, low, medium and high doses)
  - At Week 12, non-responders assigned to an open-label extension at the medium dose level
  - Primary analysis at Week 24; Week 12 responses for non-responders carried forward to Week 24
- Evaluated No-Go/Hold/Go criteria for Phase 3 development

Example – Rheumatoid Arthritis (2)
The decision criteria cross the placebo-corrected DAS28-CRP remission rate (rows: <10%, 10-16%, 16-20%, ≥20%) with the placebo-corrected ACR20 response rate (columns: <20%, 20-25%, 25-30%, ≥30%); each cell of the grid is classified as No-Go, Hold, or Go, with No-Go in the low-response cells and Go in the high-response cells.
- No-Go: stop development
- Hold: wait for the results of a separate Phase 2b active comparator trial
- Go: proceed with Phase 3 development without waiting for results from the comparator trial

Example – Rheumatoid Arthritis (3)

Treatment      Probability (%):  No-Go   Hold   Go
Low Dose                         96.1    3.9    0.0
Medium Dose                      28.1    62.9   9.0
High Dose                        18.4    65.6   16.0

The CTS results suggest a high probability that the team will have to wait for the results from the Phase 2b active comparator trial before making a decision to proceed to Phase 3, and a very low probability of taking the low dose into Phase 3.

Example – Urinary Incontinence
- Endpoint: daily micturition (MIC) counts
- Models developed:
  - Longitudinal Poisson-normal model for daily MIC counts for the lead compound
  - Time-dependent Emax drug model using AUC0-24 as the measure of exposure
- Potency scaling for the back-up compound based on:
  - In vitro potency estimates for the lead and back-up (back-up more potent than lead)
  - An equipotency assumption between the lead and back-up
- Conducted CTS to evaluate Phase 2b study designs for the back-up compound (placebo and four active dose levels)
  - Evaluated various dose scenarios of low (L), medium #1 (M1), medium #2 (M2) and high (H) dose levels
  - Implemented the SAP (constrained MMRM analysis with step-down trend tests)
  - Quantified POS for the L, M1, M2 and H doses for the various dose scenarios and potency assumptions

Example – Urinary Incontinence (2)

Dose Scenario   L     M1      M2       H       Comment
1               1X    2.5X    12.5X    25X     Doses favor the in vitro potency assumption
2                                      37.5X   (i.e., back-up more potent than lead compound)
3                     5X               50X
4                                      75X
5                                              Doses favor the equipotency assumption
6                                      100X

Note: The low (L) dose was selected to give a sub-therapeutic response; the design was not powered to detect a significant treatment effect at this dose.

Example – Urinary Incontinence (3)
CTS results:
- High POS (>95%) of demonstrating statistical significance at the H dose for all 6 dose scenarios
  - Insensitive to the potency assumptions
- High POS (>88%) of demonstrating statistical significance at the M2 dose for all 6 dose scenarios
- POS varied substantially for demonstrating statistical significance at the M1 dose, depending on the dose scenario and potency assumption
- POS < 60% for demonstrating statistical significance at the L dose, except for dose scenarios 4-6 under the in vitro potency assumption
The CTS results guided the team to select a range of doses with a high probability of demonstrating dose-response while being robust to the uncertainty in the relative potency between the back-up and lead compounds, and provided the confidence to bypass POC and move directly to a Phase 2b trial for the back-up.

Case Study – Acute Pain Differentiation: Background
- SC-75416 is a selective COX-2 inhibitor
- A capsule dental pain study was conducted
  - Poor pain response relative to the active control (50 mg rofecoxib)
  - Lower than expected SC-75416 exposure with the capsule relative to the oral solution evaluated in Phase 1 PK studies
- PK/PD models were developed to assess whether greater efficacy would have been obtained if exposures were more like those observed for the oral solution
  - Pain relief (PR) scores modeled as an ordered-categorical logistic-normal model
  - Dropout due to rescue therapy modeled as a discrete survival endpoint dependent on the patient's last observed PR score
  - Assumes a missing at random (MAR) dropout mechanism

Case Study – Acute Pain Differentiation: Background (2)
- PK/PD modeling predicted greater efficacy with the oral solution relative to capsules
  - A 6-fold higher SC-75416 dose (360 mg) than previously studied was predicted to have a clinically relevant improvement in pain relief relative to the active control (400 mg ibuprofen)
  - The model extrapolates from capsules to oral solution and leverages in-house data from other COX-2s and NSAIDs
- The project team considered a change in development strategy to pursue a high-dose efficacy differentiation hypothesis
  - The original strategy was to determine an acute pain dose equivalent to an active control and then scale down the dose for chronic pain (osteoarthritis)
  - Based on well-established relationships that chronic pain doses for NSAIDs tend to be about half of the acute pain dose

Case Study – Acute Pain Differentiation: Proposed POC Dental Pain Trial
- Proposed conducting a proof-of-concept oral solution dental pain study
  - Demonstrate an improvement in pain relief for 360 mg SC-75416 relative to 400 mg ibuprofen
- Primary endpoint is ΔTOTPAR6 (SC-75416 vs. ibuprofen)
  - ΔTOTPAR6 = 3 (the TV) is considered clinically relevant
- Perform ANOVA on the observed LOCF-imputed TOTPAR6 response and calculate LS mean differences
  - T = LS mean (SC-75416) − LS mean (ibuprofen)
  - LCL95 = two-sided lower 95% confidence limit on T
- Compound and data-analytic decision rules:
  - Truth: Go if δ ≥ 3; No-Go if δ < 3
  - Data: Go if T ≥ 3 and LCL95 > 0; No-Go if T < 3 or LCL95 ≤ 0

Case Study – Acute Pain Differentiation: Simulation Procedure to Calculate PTV
1. Simulate PR model parameters (θPR, σ²) ~ MVN and dropout model parameters θDO ~ MVN
2. Simulate PR scores and dropout times for M = 2,000 patients per treatment
3. Perform LOCF imputation and calculate TOTPAR6
4. Calculate the population mean TOTPAR6 and ΔTOTPAR6 across the M = 2,000 patients
5. Determine the true decision (Go: δ ≥ 3; No-Go: δ < 3)
6. Repeat for k = 1, …, K = 10,000 trials
7. Summarize the distribution of ΔTOTPAR6 (δ)

Case Study – Acute Pain Differentiation: Posterior Distribution of ΔTOTPAR6
- PTV = P(δ ≥ 3) = 67.2%
- Mean prediction = 3.2
A PTV of 67.2% was considered sufficiently high to warrant the recommendation to conduct the oral solution dental pain study to test the efficacy differentiation hypothesis.

Case Study – Acute Pain Differentiation: CTS Procedure to Evaluate POC Trial Designs
1. Simulate PR scores and dropout times for the k-th trial (n patients per treatment)
2. Perform LOCF imputation and calculate TOTPAR6
3. Calculate the mean ΔTOTPAR6 (T), its SEM and the 95% LCL
4. Apply the decision rule (Go: LCL95 > 0 and T ≥ 3; No-Go: LCL95 ≤ 0 or T < 3)
5. Compare the truth versus the data-analytic decision
6. Repeat for k = 1, …, K = 10,000 trials
7. Calculate the metrics POS and POCD

Case Study – Acute Pain Differentiation: CTS Trial Performance Metrics

Truth    Trial No-Go (LCL95 ≤ 0 or T < 3)   Trial Go (LCL95 > 0 and T ≥ 3)   Total
δ < 3    20.81%                             11.99%                           32.80%
δ ≥ 3    17.29%                             49.91%                           67.20%
Total    38.10%                             61.90%                           100%

(Out of 10,000 trials.) POCD = 70.72%; POS = 61.90%; PTV = 67.20%
A sufficiently high POCD and POS supported the recommendation and approval to proceed with the oral solution dental pain study.

Case Study – Acute Pain Differentiation: Comparison of Observed and Predicted (about 9 months later…)
[Figure: observed results from the oral solution dental pain study compared with the model predictions.]

Case Study – Acute Pain Differentiation: Summary of Results
- 360 mg SC-75416 met the pre-defined Go decision criteria
  - Confirmed the model predictions
  - Demonstrated a statistically significant improvement relative to 400 mg ibuprofen
- The MBDD approach provided the rationale to pursue an acute pain differentiation strategy that might not have been pursued otherwise
  - Allowed progress to be made while reformulation of the solid dosage form was done in parallel
- Validation of the model predictions provided the confidence to pursue alternative pain settings for new formulations without repeating the dental pain study
  - The model could be used to provide predictions for new formulations

Final Remarks/Discussion
- Some thoughts on implementing MBDD
- Challenges to implementing MBDD

Final Remarks/Discussion: Some Thoughts on Implementing MBDD
- We need to clearly define objectives
  - What questions are we trying to address with our models?
- We need explicit and quantitatively defined decision criteria
  - It's difficult to know how to apply the models if the decision criteria are ambiguous or ill-defined
- We need complete transparency in communicating model assumptions
  - Entertain different sets of plausible model assumptions
  - Evaluate designs for robustness to competing assumptions
- We need to routinely evaluate the predictive performance of the models on independent data
  - Modeling results should be presented as 'hypothesis generating', requiring confirmation in subsequent independent studies

Final Remarks/Discussion: Some Thoughts on Implementing MBDD (2)
- Conduct CTS integrating information across disciplines
  - Implement key features of the design and trial execution (e.g., dropout)
  - Implement the statistical analysis plan (SAP)
- Provide graphical summaries of the CTS results for the recommended design prior to the release of the actual trial results
- Perform a quick assessment of predictive performance when the actual trial reads out
- Update the models and the quantification of PTV after the actual trial reads out
  - I.e., begin a new learn-predict-confirm cycle

Final Remarks/Discussion: Challenges to Implementing MBDD
- A focus on the timelines of individual studies and a 'go-fast-at-risk' strategy (i.e., minimizing gaps between studies) can be counterproductive to an MBDD implementation
  - M&S (the learning phase) is a time-consuming effort
  - Integrating MBDD activities into project timelines will require a focus on integrating information across studies, not just tracking individual studies
  - Processes may be needed to allow modelers to be unblinded to interim results so that modeling activities can begin earlier and meet aggressive timelines
- Insufficient scientific staff with the programming skills to perform CTS
  - Pharmacometricians and statisticians with such skills should be identified
  - CTS implementation often requires considerable customization to address the project's needs (i.e., no two projects are alike)

Final Remarks/Discussion: Challenges to Implementing MBDD (2)
- Insufficient modeling and simulation resources to implement MBDD on all projects
- Reluctance to be explicit in defining decision rules (i.e., reluctance to 'draw a line in the sand')
  - Due to the complexities and tradeoffs in making decisions
  - It can be difficult to achieve consensus
  - http://www.ascpt.org/Portals/8/docs/Meetings/2012%20Annual%20Meeting/2012%20speaker%20presentations/ASOP%20TUE%20CHERRY%20BLOS%20SESSION%201.pdf
- Reluctance to use assumption-rich models
  - We make numerous assumptions now when we make decisions; we're just not very explicit about them
  - MBDD can facilitate open debate about explicit assumptions

Bibliography
- Neter, J., and Wasserman, W. Applied Linear Statistical Models. Irwin, IL, 1974, pp. 71-73.
- Efron, B. The Jackknife, the Bootstrap, and Other Resampling Plans. Society for Industrial and Applied Mathematics, PA, 1982, pp. 29-30.
- Vonesh, E.F., and Chinchilli, V.M. Linear and Nonlinear Models for the Analysis of Repeated Measurements. Marcel Dekker, NY, 1997, pp. 245-246.
- Kowalski, K.G., Ewy, W., Hutmacher, M.M., Miller, R., and Krishnaswami, S. "Model-Based Drug Development – A New Paradigm for Efficient Drug Development". Biopharmaceutical Report 2007;15:2-22.
- Lalonde, R.L., et al. "Model-Based Drug Development". Clin Pharm Ther 2007;82:21-32.
- Chuang-Stein, C.J., et al. "A Quantitative Approach to Making Go/No Go Decisions in Drug Development". DIJ 2011;45:187-202.
- Smith, M.K., et al. "Decision-Making in Drug Development – Application of a Model-Based Framework for Assessing Trial Performance". In: Kimko, H.C., and Peck, C.C., eds. Clinical Trial Simulations: Applications and Trends. Springer, NY, 2011, pp. 61-83.
- Kowalski, K.G., Olson, S., Remmers, A.E., and Hutmacher, M.M. "Modeling and Simulation to Support Dose Selection and Clinical Development of SC-75416, a Selective COX-2 Inhibitor for the Treatment of Acute and Chronic Pain". Clin Pharm Ther 2008;83:857-866.