Bayesian Hierarchical Models for Detecting Safety Signals in Clinical Trials H. Amy Xia and Haijun Ma Amgen, Inc. MBSW 2009, Muncie, IN March 20, 2009.


Bayesian Hierarchical Models for Detecting Safety Signals in Clinical Trials H. Amy Xia and Haijun Ma Amgen, Inc. MBSW 2009, Muncie, IN March 20, 2009 Disclaimer: The views expressed in this presentation represent personal views and do not necessarily represent the views or practices of Amgen.

Outline
Introduction
A motivating example
Bayesian Hierarchical Models
– Meta-analysis of adverse event data from multiple studies, incorporating MedDRA structure
– Incorporating patient-level data
– Effective graphics
Closing Remarks

Three-Tier System for Analyzing Adverse Events in Clinical Trials
Tier 1: Pre-specified Detailed Analysis and Hypothesis Testing
– Tier 1 AEs are events for which a hypothesis has been defined
Tier 2: Signal Detection among Common Events
– Tier 2 AEs are those that are not pre-specified and are "common"
Tier 3: Descriptive Analysis of Infrequent AEs
– Tier 3 AEs are those that are not pre-specified and are infrequent
Sources: Gould 2002 & Mehrotra 2004; SPERT White Paper 2008

Multiplicity Issue in Detecting Signals Is Challenging
Detection of safety signals from routinely collected, not pre-specified AE data in clinical trials is a critical task in drug development
Multiplicity in such a setting is a challenging statistical problem
– Without multiplicity considerations, there is a potential for an excess of false positive signals
– Traditional ways of adjusting for multiplicity, such as Bonferroni, may lead to an excessive rate of false negatives
– The challenge is to develop a procedure for flagging safety signals that strikes a proper balance between "no adjustment" and "too much adjustment"

Considerations in Deciding Whether to Flag an Event
– Actual significance levels
– Total number of types of AEs
– Rates for those AEs not considered for flagging
– Biologic relationships among various AEs
The first two are standard considerations in the frequentist approach. The last two are not, but they are relevant in the Bayesian approach (Berry and Berry, 2004).

Bayesian Work in Signal Detection
Spontaneous adverse drug reaction reports
– Gamma Poisson Shrinker (GPS) on the FDA AERS database (DuMouchel, 1999)
– Bayesian Confidence Propagation Neural Network (BCPNN) on the WHO database (Bate et al., 1998)
Clinical trial safety (AE) data
– Bayesian hierarchical mixture modeling (Berry and Berry, 2004)

Meta Analysis
Glass (1976): meta-analysis refers to a statistical analysis that combines the results of a collection of related studies to arrive at a single conclusion to the question at hand
Meta-analysis can be based on
– aggregate patient data (APD meta-analysis)
– individual patient data (IPD meta-analysis)
Bayesian modeling is a natural choice for incorporating the complex hierarchical structure of the data

George Chi, H.M. James Hung, Robert O’Neill (FDA CDER) “Safety assessment is one area where frequentist strategies have been less applicable. Perhaps Bayesian approaches in this area have more promise.” (Pharmaceutical Report, 2002)

An Example
Data from four double-blind, placebo-controlled studies of drug X. Study populations are similar.
After converting all AEs to the same MedDRA version, the reported AEs are coded to 464 PTs under 23 SOCs and 233 HLTs.
Sample sizes (table values not preserved in the transcript): columns Study, Drug X N, Drug X subj-yr, Placebo N, Placebo subj-yr; rows Study A, Study B, Study C, Study D.

N_0: sample size in placebo arm; N_1: sample size in treatment arm
n_0: number of subjects with the AE in placebo arm; n_1: number of subjects with the AE in treatment arm
rt_0: subject incidence in placebo arm; rt_1: subject incidence in treatment arm

Proposed Bayesian Approach
Hierarchical mixture models for aggregated binary responses were constructed based on the work of Berry & Berry (2004)
– Explore the impact of using different levels of the MedDRA hierarchy
– Include study effects
– Further extended to a hierarchical Poisson mixture model to account for different exposure/follow-up times between patients
Individual patient-level models are discussed
The above models were implemented with available software
– WinBUGS for model implementation
– S-Plus graphics for inference

MedDRA MedDRA (the Medical Dictionary for Regulatory Activities Terminology) is a controlled vocabulary widely used as a medical coding scheme. MedDRA Definition (MSSO): –MedDRA is a clinically-validated international medical terminology used by regulatory authorities and the regulated biopharmaceutical industry. The terminology is used through the entire regulatory process, from pre-marketing to post-marketing, and for data entry, retrieval, evaluation, and presentation. MSSO: Introduction to MedDRA

MedDRA and Pharmacovigilance - The Way Forward, 7/8/99


Example of MedDRA Hierarchy (diagram; two paths lead to the same PT)
SOC = Infections and infestations > HLGT = Viral infectious disorders > HLT = Influenza viral infections > PT = Influenza
SOC = Respiratory, thoracic and mediastinal disorders > HLGT = Respiratory tract infections > HLT = Viral upper respiratory tract infections > PT = Influenza
MSSO: Introduction to MedDRA

Hierarchical Structure of MedDRA
Bayesian hierarchical models allow AEs to be modeled explicitly using the existing coding structure
– AEs in the same SOC are more likely to be similar to each other than to AEs in other SOCs
– The model allows for this possibility but does not impose it, depending on the actual data
– SOC tends to be too broad; HLT is more closely related to medical concepts
In fact, clinical and safety reviewers already (informally) consider the similarity of AEs, say within SOCs, when they review AE tables
– For example, if differences in several CV events were observed, each would be more likely to be causal than if the differences came from medically unrelated areas (e.g., skin, neurological, thrombosis, cancer)
Bayesian hierarchical modeling provides a scientific, explicit, and more formal way to take this into consideration

Notations
Study $i = 1, \dots, I$; SOC $b = 1, \dots, B$; PT $j = 1, \dots, k_b$
Data for AE $ibj$:
– Treatment group: $Y_{ibj}$ incident events observed in $N_{it}$ patients with $T_{ibj}$ subjects' exposure
– Control group: $X_{ibj}$ incident events observed in $N_{ic}$ patients with $C_{ibj}$ subjects' exposure

Bayesian Meta Analysis Hierarchical Logistic Regression Common treatment effect for same PT across studies
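
The equations for this slide did not survive in the transcript. As a hedged reconstruction only, and assuming the logistic model mirrors the Poisson model stated later in the talk (same $\lambda$ and $\theta$ symbols, with a logit link in place of the log link), the common-treatment-effect model might be written as follows. The indexing of $\lambda$ by study is an assumption; the prior slide writes $\lambda_{bj}$.

```latex
\begin{aligned}
Y_{ibj} &\sim \mathrm{Bin}(N_{it},\, t_{ibj}), &
X_{ibj} &\sim \mathrm{Bin}(N_{ic},\, c_{ibj}), \\
\operatorname{logit}(c_{ibj}) &= \lambda_{ibj}, &
\operatorname{logit}(t_{ibj}) &= \lambda_{ibj} + \theta_{bj},
\end{aligned}
\qquad \theta_{bj} = \log(\mathrm{OR}_{bj}).
```

Here $\theta_{bj}$ carries no study index, which is what "common treatment effect for same PT across studies" would mean under this reading.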

Bayesian Meta Analysis Hierarchical Logistic Regression (Cont.) Treatment effect with additive study effects: A random treatment effect/multiplicative model:

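Bayesian Meta Analysis Hierarchical Logistic Regression (Cont.)
Other priors
– Stage 1: $\lambda_{bj} \sim N(\mu_{\lambda b}, \sigma^2_{\lambda b})$
– Stage 2: $\mu_{\lambda b} \sim N(\mu_{\lambda 0}, \sigma^2_{\lambda 0})$; $\sigma^2_{\lambda b} \sim IG(\alpha_{\lambda}, \beta_{\lambda})$; $\mu_{\theta b} \sim N(\mu_{\theta 0}, \sigma^2_{\theta 0})$; $\sigma^2_{\theta b} \sim IG(\alpha_{\theta}, \beta_{\theta})$
– Stage 3: $\mu_{\lambda 0} \sim N(\mu_{\lambda 00}, \sigma^2_{\lambda 00})$; $\sigma^2_{\lambda 0} \sim IG(\alpha_{\lambda 00}, \beta_{\lambda 00})$; $\mu_{\theta 0} \sim N(\mu_{\theta 00}, \sigma^2_{\theta 00})$; $\sigma^2_{\theta 0} \sim IG(\alpha_{\theta 00}, \beta_{\theta 00})$
Hyperparameters $\mu_{\lambda 00}, \sigma^2_{\lambda 00}, \alpha_{\lambda 00}, \beta_{\lambda 00}, \mu_{\theta 00}, \sigma^2_{\theta 00}, \alpha_{\theta 00}, \beta_{\theta 00}, \alpha_{\lambda}, \beta_{\lambda}, \alpha_{\theta}, \beta_{\theta}$ are fixed constants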
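
The list of "other priors" does not show the prior on $\theta_{bj}$ itself. The closing remarks describe a mixture prior with a point mass on equality of the treatment and control rates, so one plausible form is sketched below. The mixing weight $\pi_b$ and its being indexed by SOC are assumptions for illustration, not taken from the slides.

```latex
\theta_{bj} \sim \pi_b \, \delta_0 + (1 - \pi_b)\, N(\mu_{\theta b}, \sigma^2_{\theta b}),
```

where $\delta_0$ denotes a point mass at $\theta_{bj} = 0$ (no treatment effect), and $\mu_{\theta b}$, $\sigma^2_{\theta b}$ are the SOC-level parameters given Stage 2 priors above.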

Inference
AE $bj$ is flagged if
– $\Pr(\theta_{bj} > d^* \mid \text{Data}) > p$, where $\theta_{bj}$ is the log-OR in the Binomial models and the log-RR in the Poisson models
– $d^*$ and $p$ are both prespecified constants
Graphs are useful tools for deciphering data and presenting results
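
The flagging rule can be illustrated with a minimal Python sketch, assuming posterior draws of $\theta_{bj}$ are already available (for example, exported from an MCMC run). The function names, the PT labels, and the simulated draws are hypothetical and stand in for real sampler output.

```python
import random


def posterior_exceedance(draws, d_star):
    """Monte Carlo estimate of Pr(theta > d_star | data) from posterior draws."""
    return sum(1 for x in draws if x > d_star) / len(draws)


def flag_signals(draws_by_ae, d_star, p):
    """Return the AEs whose estimated Pr(theta > d_star | data) exceeds p."""
    return [ae for ae, draws in draws_by_ae.items()
            if posterior_exceedance(draws, d_star) > p]


# Hypothetical posterior draws of a log-OR for two made-up PTs;
# real draws would come from the fitted hierarchical model.
rng = random.Random(0)
draws_by_ae = {
    "PT_A": [rng.gauss(1.0, 0.3) for _ in range(4000)],  # clearly elevated
    "PT_B": [rng.gauss(0.0, 0.3) for _ in range(4000)],  # centered on no effect
}
flagged = flag_signals(draws_by_ae, d_star=0.0, p=0.9)
```

With these made-up draws only `PT_A` is flagged, since roughly half of the `PT_B` draws fall below `d_star`.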

Model Selection
The Deviance Information Criterion (DIC) was used to compare models fitted to the same data
Limited sensitivity analyses were done to check the robustness of the models
Different levels of the MedDRA structure were used
– SOC/PT, HLT/PT, and SOC/HLT/PT
The treatment effect with additive study effects model using the SOC/PT structure was chosen
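
As a rough illustration of how DIC is assembled, the sketch below follows the definition in Spiegelhalter et al. (2002): DIC is the posterior mean deviance plus the effective number of parameters $p_D$, where $p_D$ is the mean deviance minus the deviance at the posterior mean. It uses a toy one-parameter binomial model with a Beta posterior, not the hierarchical models from the talk; the data values and function names are hypothetical.

```python
import math
import random


def binom_deviance(y, n, p):
    """Deviance of a single binomial observation: -2 * log-likelihood."""
    log_lik = (math.log(math.comb(n, y))
               + y * math.log(p) + (n - y) * math.log(1 - p))
    return -2.0 * log_lik


def dic(y, n, p_draws):
    """Return (DIC, pD) from posterior draws of the binomial probability."""
    d_bar = sum(binom_deviance(y, n, p) for p in p_draws) / len(p_draws)
    p_mean = sum(p_draws) / len(p_draws)
    d_hat = binom_deviance(y, n, p_mean)  # deviance at the posterior mean
    p_d = d_bar - d_hat                   # effective number of parameters
    return d_bar + p_d, p_d


# Toy data: 12 events in 100 subjects, uniform Beta(1, 1) prior,
# so the posterior is Beta(1 + y, 1 + n - y).
rng = random.Random(1)
y, n = 12, 100
p_draws = [rng.betavariate(1 + y, 1 + n - y) for _ in range(5000)]
dic_value, p_d = dic(y, n, p_draws)
```

For this single-parameter model `p_d` comes out close to 1, which is the expected sanity check; when comparing candidate models on the same data, the one with the smaller DIC is preferred.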

Bayesian Meta Analysis Hierarchical Log-linear Regression
Poisson models
– Adjust for different exposures in treatment and control
– Assume a constant hazard over time
– Unless AEs are fairly common or follow-up is quite unbalanced between treatment arms, results usually do not differ much from those of the Binomial models
$Y_{ibj} \sim \mathrm{Pois}(t_{ibj} T_{ibj})$; $X_{ibj} \sim \mathrm{Pois}(c_{ibj} C_{ibj})$, where $t_{ibj}$ and $c_{ibj}$ are event rates, and $T_{ibj}$ and $C_{ibj}$ are the total exposure times for AE $ibj$ in the treatment and control groups, respectively
$\log(c_{ibj}) = \lambda_{ibj}$; $\log(t_{ibj}) = \lambda_{ibj} + \theta_{bj}$. Note that $\theta_{bj} = \log(RR_{bj})$
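
To see why the exposure terms matter, the following sketch computes a crude (non-Bayesian) log rate ratio from event counts and exposure times, which is the quantity that $\theta_{bj}$ represents under the log link. The counts and subject-years are invented for illustration.

```python
import math


def log_rate_ratio(y, t_exposure, x, c_exposure):
    """Crude log rate ratio: log of (y / t_exposure) over (x / c_exposure)."""
    return math.log(y / t_exposure) - math.log(x / c_exposure)


# Balanced exposure: twice the events means twice the rate.
balanced = log_rate_ratio(y=12, t_exposure=150.0, x=6, c_exposure=150.0)

# Unbalanced exposure: twice the events over twice the subject-years
# gives the same rate, so the log rate ratio is zero.
unbalanced = log_rate_ratio(y=12, t_exposure=200.0, x=6, c_exposure=100.0)
```

In the balanced case the rate ratio is 2, while in the unbalanced case the doubled event count is fully explained by the doubled follow-up, a distinction that a model based only on subject counts cannot make.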

Inferences of Binomial Hierarchical Model with Mixture Prior

Bayesian Patient Level Models
IPD models include within-patient correlation and patient-level factors while incorporating the MedDRA coding hierarchy
Data from one study are used: 618 subjects, 207 unique AEs, N =

Simulation Study
Simulation scheme:
– Randomly assign subjects to treatment or placebo to create a "null" scenario
– Adverse events within each subject remain unchanged to maintain the SOC/PT hierarchy
– 1000 simulated datasets
Family-wise error rates (also the FDR in this case) of Fisher's exact test unadjusted for multiplicity and of Poisson regression with a mixture prior are compared
The percentages of simulated datasets yielding Y = 0, 1, 2, or ≥3 incorrectly flagged adverse events out of 464 PTs are also compared
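
The re-randomization scheme can be sketched for the unadjusted Fisher's exact test arm of the comparison. This is a simplified illustration on synthetic data, not the authors' code: the subject count, number of PTs, AE probability, and number of simulated datasets are all made-up values chosen to keep the run short, and the Bayesian arm is not reproduced.

```python
import math
import random


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in the first row.
        return math.comb(row1, k) * math.comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def count_false_flags(subject_aes, n_treated, alpha, rng):
    """Re-randomize arm labels, keep each subject's AEs, count flagged PTs."""
    n_subjects = len(subject_aes)
    treated = set(rng.sample(range(n_subjects), n_treated))
    n_control = n_subjects - n_treated
    counts = {}  # PT -> [subjects with AE on treatment, on control]
    for s, aes in enumerate(subject_aes):
        for pt in aes:
            counts.setdefault(pt, [0, 0])[0 if s in treated else 1] += 1
    return sum(1 for y, x in counts.values()
               if fisher_two_sided(y, n_treated - y, x, n_control - x) <= alpha)


# Synthetic data (hypothetical): each subject has each PT with probability 0.10.
rng = random.Random(2009)
n_subjects, n_pts, ae_prob = 200, 30, 0.10
subject_aes = [{pt for pt in range(n_pts) if rng.random() < ae_prob}
               for _ in range(n_subjects)]

n_sims = 200  # the talk used 1000 simulated datasets
flag_counts = [count_false_flags(subject_aes, n_subjects // 2, 0.05, rng)
               for _ in range(n_sims)]
fwer = sum(1 for f in flag_counts if f >= 1) / n_sims
share_3_plus = sum(1 for f in flag_counts if f >= 3) / n_sims
```

Because arm labels are assigned at random, every flagged PT is a false positive, so `fwer` estimates the family-wise error rate of the unadjusted procedure and `share_3_plus` the share of datasets with at least three false signals.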

Simulation results table (numeric entries not preserved in the transcript). Columns: distribution of Y (%) by number of incorrectly flagged AEs, and FWER/FDR. Rows: non-adjusted 2-sided Fisher's exact test; non-adjusted 1-sided Fisher's exact test; Bayes hierarchical Poisson model with thresholds c=1, c=1.2, c=2, and RR≠1, each at three values of p.

Simulation Study
464 independent tests with alpha = 0.05 would yield about 23 signals on average and an FDR of 1 if no multiplicity adjustment is made.
Correlation in the AE data reduced the error rate in our simulation study, but the FDR is still as high as 99.1%. In 91% of cases there are at least 3 falsely identified signals.
The FWERs/FDRs for all Bayes model results are much lower and acceptable.
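
The figure of about 23 signals follows from simple arithmetic, sketched here under the slide's stated assumption of 464 independent tests at alpha = 0.05 with no true effects. The Bonferroni threshold is included only for comparison with the "too much adjustment" end of the spectrum mentioned earlier in the talk.

```python
m = 464       # number of PTs tested
alpha = 0.05  # unadjusted per-test significance level

# Expected number of false signals when every null hypothesis is true.
expected_false_signals = m * alpha  # about 23.2

# Probability of at least one false signal across m independent tests.
fwer_independent = 1 - (1 - alpha) ** m

# Per-test threshold that a Bonferroni adjustment would require.
bonferroni_alpha = alpha / m
```

Under independence the family-wise error rate is essentially 1, whereas Bonferroni would demand a per-test level of roughly 0.0001, which illustrates the gap between no adjustment and a very conservative one.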

Closing Remarks
The current traditional approach of flagging routinely collected AEs based on unadjusted p-values or CIs can result in excessive false positive signals
– As a result, it can cause undue concern for approval/labeling/post-marketing commitments
Commonly used meta-analysis methods for aggregated binary outcomes (OR)
– Peto's method is not recommended for severely unbalanced studies or common events unless treatment effects are small
– MH method: needs continuity correction

Closing Remarks (Cont.)
Bayesian meta-analysis hierarchical mixture modeling provides a useful tool for analyzing data from multiple studies and addressing multiplicity
– Allows AEs to be modeled explicitly using the existing MedDRA coding structure
– Uses a mixture prior that allows a point mass on equality of the treatment and control rates
– Study differences can be accounted for
– No need to add a continuity correction; double-zero studies are included
– For less common AEs and studies without a great amount of follow-up variation between treatment groups, inferences from Poisson regression and logistic regression models are very similar
Computation for signal detection using IPD is challenging
Graphics are effective in displaying flagged signals

Future Work
More sensitivity analysis on the performance of the models
Further simulation study on type II error and operating characteristics of the Bayesian models
A zero-inflated Poisson model might be a good approach for relatively healthy populations
Incorporating severity information of AEs
Multi-axial structure of the MedDRA coding system
The field of clinical trial signal detection is still in its infancy
– More research and practice are needed
– Statisticians need to work closely with clinicians/safety scientists to further advance this field

References
Bate A, Lindquist M, Edwards IR, Olsson S, Orre R, Lansner A, and De Freitas RM (1998). A Bayesian neural network method for adverse drug reaction signal detection. Eur J Clin Pharmacol 54:
Berry S and Berry D (2004). Accounting for multiplicities in assessing drug safety: a three-level hierarchical mixture model. Biometrics, 60:
Chi G, Hung HMJ, and O’Neill R (2002). Some comments on “Adaptive Trials and Bayesian Statistics in Drug Development” by Don Berry. In Pharmaceutical Report, Vol 9, 1-11
Crowe B, Xia A, Watson D, Shi H, Lin S, Kuebler J, Berlin J, et al. (2008). Recommendations for Safety Planning, Data Collection, Evaluation and Reporting During Drug, Biologic and Vaccine Development: A Report of the PhRMA Safety Planning, Evaluation and Reporting Team (SPERT). Manuscript in preparation.
DuMouchel W (1999). Bayesian data mining in large frequency tables, with an application to the FDA Spontaneous Reporting System (with discussion). The American Statistician 53:
Gould AL (2002). Drug safety evaluation in and after clinical trials. Deming Conference, Atlantic City, 3 December 2002
Mehrotra DV and Heyse JF (2004). Multiplicity considerations in clinical safety analysis. Statistical Methods in Medical Research 13,
Spiegelhalter DJ, Best NG, Carlin BP, and van der Linde A (2002). Bayesian measures of model complexity and fit (with discussion). J. Roy. Statist. Soc. B. 64,

Thank You!

Back-up Slides

Bayesian Hierarchical Model for AE (Berry & Berry 2004) PT level assumptions: SOC level assumptions: Global assumptions:

SOC: Injury, Poisoning and Procedural Complications