Statistical Issues in the Design, Conduct and Analysis of Large Safety Studies Michael Gaffney, PhD Pfizer Inc.

1 Statistical Issues in the Design, Conduct and Analysis of Large Safety Studies Michael Gaffney, PhD Pfizer Inc

2 Outline General description of PRECISION Three issues in planning PRECISION Issues and design modifications during the conduct of PRECISION Potential issues in the analysis of PRECISION EAGLES Study Personal Perspective/Points of Interest

3 PRECISION 2005 FDA Advisory Committee meeting on CV risk of COX-2 NSAIDS FDA mandated study - Commitment by Pfizer to FDA Funded by Pfizer Independent Executive Committee Principal Investigator and Study Chair- Steven Nissen, MD – Cleveland Clinic Independent Data Monitoring Committee Chair - Thomas R. Fleming, PhD – University of Washington Experts in Cardiology, Rheumatology and Gastroenterology on both committees

4 PRECISION Design Primary Objective To assess the relative cardiovascular effects of celecoxib, ibuprofen and naproxen in the treatment of osteoarthritis and rheumatoid arthritis Primary Endpoint Time to first occurrence of the composite cardiovascular endpoint of CV death, non-fatal MI, non-fatal stroke (APTC) Secondary Endpoint APTC + hospitalization for unstable angina, hospitalization for TIA, revascularization Non-inferiority Study Randomization to one of the three treatment options stratified according to: – Treatment Center – OA-RA indications – Aspirin use at baseline

5 Original PRECISION Assumptions APTC rate per year: 0.020 Non-Inferiority Margin (NIM): hazard ratio (HR) =1.333 Off-Treatment rate: Cumulative 40% over 3 years 18 month minimum follow-up 36 month maximum follow-up time Power = 0.90 Conclude non-inferiority if one-sided 97.5% upper confidence limit excludes HR=1.333 Number of APTC events 762
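The event target on this slide can be approximately reproduced with Schoenfeld's formula for a two-arm time-to-event comparison. This is a minimal sketch, not the study's actual sample size calculation. It assumes 1:1 allocation within each pairwise comparison, a true HR of 1, that the NIM of 1.333 denotes 4/3, and that the 762 events are counted across all three arms (so each pairwise comparison sees two-thirds of them). The function name is hypothetical.

```python
import math
from statistics import NormalDist

def pairwise_events(nim, alpha_one_sided=0.025, power=0.90):
    """Schoenfeld approximation: events needed in a two-arm, 1:1 comparison
    to rule out hazard ratio `nim` when the true hazard ratio is 1."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / math.log(nim) ** 2

pairwise = pairwise_events(4 / 3)
# Assumption: three equal arms, so total events = 1.5 x events per pair.
total = 1.5 * pairwise
print(round(pairwise), math.ceil(total))
```

Under these assumptions the calculation gives roughly 508 events per pairwise comparison and 762 events in total, matching the slide.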

6 Design Issue 1 Considerations For A Composite Endpoint Increase in CV events was observed in the APTC endpoints Relationship of hospitalized UA, TIA and revascularization with APTC endpoints: Is noise being added to the composite endpoint by inclusion, or is there informative censoring of the composite endpoint by exclusion? Accuracy of the adjudicated diagnosis Interpretation of the results when the less severe endpoints dominate the composite (The broader composite will always lead to a smaller, shorter trial but that should never be the reason for choosing it) PRECISION leadership team, in consultation with FDA, determined that the APTC composite endpoint was the proper primary endpoint

7 Effects on Study Size of NIM and ITT/MITT Events
ITT Events - All APTC events over the 3-year observation time
MITT Events - All APTC events on randomized treatment + 30 days post-treatment
Design  NIM   Analysis  Events  Sample Size
1       1.33  MITT      762     20,000
        1.30  ITT       925
2       1.40  MITT      556     14,700
        1.36  ITT       680
3       1.37  MITT      626
        1.33  ITT       762     16,500
4       1.45  MITT      455
        1.40  ITT       556     12,000
NIM and ITT/MITT events are strong determinants of study size and consequently the time to clinical knowledge of the study results.

8 Design Issue 2 Purpose/Determination of NIM What is the purpose of NIM in a safety study? An NIM, when able to be determined by a strong scientific/clinical method: - provides important scientific context for the design, conduct and analysis of safety trials and serves the purpose of ruling out an unacceptable increase in risk. - serves the same role in a safety study as the value 1 (no difference) does in an efficacy study, i.e., it sets up the null and alternative hypotheses. - can serve as an objective regulatory criterion that the study results have or have not ruled out an unacceptable increase in risk. For the above reasons it is essential to establish a rigorous and defensible NIM. The PRECISION NIM of 1.333 was established on a strong scientific/clinical basis. The NIM was established by considering a potential benefit of COX-2 on serious GI events in conjunction with a clinically acceptable excess risk based on the expected APTC event rate in the NSAID control group.

9 Design Issue 3 ITT and MITT Analyses Statistical, clinical/scientific and practical points to consider Statistical Potential informative censoring regarding estimation of HR (MITT) -Event rate is dependent on censoring mechanism -Censoring mechanism (either time, type or degree) differs between the treatment groups Decreasing the HR towards one (ITT) - Inclusion of events from patients no longer receiving randomized treatment may attenuate the HR and increase the chance of concluding non-inferiority.

10 ITT and MITT Analyses Clinical/Scientific The MITT analysis (under the assumption of non-informative censoring) assesses the on-treatment effects of each of the study treatments (direct exposure effect over a variable, censored time) The ITT analysis (with no assumptions) assesses the effects of treatment strategies beginning with each of the study treatments (“real-world” effects over a fixed observation time) Practical PRECISION will have to show non-inferiority for both the MITT and ITT analyses because each analysis alone has weaknesses.

11 PRECISION Approach All three results must occur in order to conclude non-inferiority: ITT - Upper limit of the one-sided 97.5% confidence interval for the HR < 1.33 MITT - Upper limit of the one-sided 97.5% confidence interval for the HR < 1.33 Point estimate of the HR does not exceed 1.12 Thus, with respect to these 3 important design issues, composite endpoint, NIM, and analytical approach to MITT/ITT events, the PRECISION trial was rigorously and properly designed to address the study’s objective.

12 Conduct of PRECISION Rigorous, ongoing monitoring of the conduct of PRECISION with particular attention given to: APTC event rate Achieving real world adherence to the randomized regimens Retention in the study. The APTC event rate, pooled across the three treatment regimens, was meaningfully lower than expected. This occurrence along with the pooled rate of adherence and the pooled rate of retention led to recognition by the study leadership that refinements to the design of PRECISION were necessary.

13 PRECISION Modifications All modifications resulted from interaction among DMC, Study Chair, Sponsor and FDA Due to the higher than expected off-treatment rate: - ITT observation time was decreased from 36 months to 30 months - MITT observation time was increased from 36 months to 42 months

14 PRECISION Modifications Reduce power from 0.90 to 0.80 - Number of required APTC events decreases from 762 to 580 - Chance of not concluding non-inferiority when there is no difference increases from 10% to 20% - The point estimate that rules out the 1.333 margin is reduced from 1.12 to 1.092 Increase the NIM for the MITT analysis to 1.40 - Number of required APTC events decreases from 580 to 420 - No change in Power - The point estimate that rules out the 1.40 margin is 1.107

15 Rationale for the MITT margin of 1.4 First, in the ITT analysis, in order to rule out the 1.333 NIM for a pairwise comparison the estimated HR must be < 1.092. Second, given the concern that the estimated HR in the ITT analysis could be attenuated toward unity by follow-up that occurs well after discontinuation of randomized intervention, it would be reassuring if the estimated HR from the MITT analysis also were to be ≈ 1.092, suggesting that the ITT analysis did not achieve this target because of that attenuation. Third, in the MITT analysis, achieving a point estimate < 1.107 occurs if and only if the NIM for that analysis is 1.40. Added benefit that 580 ITT events and 420 MITT events are expected to occur at approximately the same time
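The link between a margin, an event count and the largest point estimate that still rules out the margin can be sketched as below. This is an illustration under stated assumptions, not the study's own computation: SE(log HR) is approximated by 2/sqrt(events) for a 1:1 comparison, each pairwise comparison is assumed to see two-thirds of the three-arm event total, and 1.333 is taken to mean 4/3. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def critical_hr(nim, pairwise_events, alpha_one_sided=0.025):
    """Largest HR point estimate whose upper confidence limit stays below `nim`,
    using SE(log HR) ~ 2 / sqrt(events) for a 1:1 comparison."""
    z = NormalDist().inv_cdf(1 - alpha_one_sided)
    se_log_hr = 2 / math.sqrt(pairwise_events)
    return nim / math.exp(z * se_log_hr)

# Assumption: each pairwise comparison sees two-thirds of the three-arm total.
for label, nim, total_events in [
    ("original design", 4 / 3, 762),
    ("modified ITT", 4 / 3, 580),
    ("modified MITT", 1.40, 420),
]:
    print(label, round(critical_hr(nim, total_events * 2 / 3), 3))
```

Under these assumptions the three results land within rounding of the 1.12, 1.092 and 1.107 quoted on the slides.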

16 Analytical Issues in PRECISION Primary Analysis Cox proportional hazards model with region, diagnosis (OA/RA) and baseline ASA usage as covariates in the model When randomization is stratified, how to use these variables in the Cox model? Covariates or stratification variable Analysis and interpretation of treatment HR by strata used in the randomization
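The two options raised on this slide can be written out in their standard forms. This is a generic sketch, not the prespecified PRECISION model; the symbols are introduced here for illustration only (x for the treatment indicators, z for region, diagnosis and baseline aspirin use, s for the stratum defined by those factors).

```latex
% Randomization factors entered as covariates: one shared baseline hazard
h(t \mid x, z) = h_0(t)\,\exp\!\left(\beta^{\top} x + \gamma^{\top} z\right)

% Stratified Cox model: a separate baseline hazard for each stratum s
h_s(t \mid x) = h_{0s}(t)\,\exp\!\left(\beta^{\top} x\right)
```

In both forms exp(β) is the treatment hazard ratio. The covariate form assumes proportional hazards for the randomization factors as well, while the stratified form lets each stratum have its own baseline hazard.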

17 Analytical Issues in PRECISION Understanding ITT and MITT Analyses If one is consistent with the other or if ITT HR < MITT HR, maybe fine, but Off-Treatment Issue: Amount of censoring, the type of censoring, the time of censoring and the characteristics of patients censored need to be explored for potential informative censoring in the MITT analysis. Non-retention Issue: Amount, time and characteristics of subjects dropping out of the study need to be explored for potential informative censoring due to non-retention in the ITT analysis. In a study with non-negligible off-treatment and non-retention rates and with different MITT and ITT observation times this may be a complicated issue in analysis and interpretation.

18 Analytical Issues in PRECISION Cross-ins to study NSAIDS, any NSAID Cross-ins likely reduce sensitivity to distinguish CV risk among randomized treatment groups More of an issue with ITT analysis but will also affect MITT Sensitivity analyses based on subgroups or censoring have weaknesses Time-dependent Analyses? Deconstructing the Primary Composite Endpoint Fatal, Non-Fatal MI Fatal, Non-Fatal Stroke

19 Summary PRECISION was rigorously designed and monitored with rigorous performance standards. Enrollment in PRECISION was terminated at 24,333 randomized subjects, projected to provide the targeted number of APTC events by the end of this year. The study was challenging from all aspects and medical/statistical challenges remain in the analysis and interpretation of results. The information from this randomized study of 24,000 subjects, with a strong design and conducted in a rigorous manner, will dwarf the current safety information regarding the three study drugs.

20 Safety Context of EAGLES Trial EAGLES Study – Chantix (varenicline) Smoking Cessation Safety Study Signal for serious neuropsychiatric adverse events came from spontaneous post-marketing reports to FDA FDA determined that varenicline was associated with serious neuropsychiatric adverse events including suicidal ideation, suicidal behavior, changes in behavior, agitation, depressed mood, and worsening of preexisting psychiatric illness. Led to a Boxed Warning in the Chantix label regarding these serious neuropsychiatric adverse events Led to EAGLES, a post-marketing commitment by Pfizer to FDA and to EMA

21 EAGLES Design Randomized Design: placebo, varenicline, bupropion and nicotine patch Study Objective: To characterize the neuropsychiatric safety profiles of varenicline, bupropion, nicotine patch and placebo Composite primary endpoint of Serious/Severe Neuropsychiatric (NPS) AEs occurring over the 12 week treatment period Design and protocol were approved by FDA and EMA External Data Monitoring Committee Determination of NIM by a strong scientific/clinical method was not possible

22 EAGLES Design Guidance: “The study should be sufficiently powered to adequately assess clinically significant neuropsychiatric adverse events with each treatment” Estimation study - not designed to test a specific hypothesis Sample size of 8,000 (2,000 per treatment group) was determined by agreement with FDA on a pre-specified width of the 95% CI in estimating NPS AE rate differences A sample size of 2,000 per treatment group yields a 95% confidence interval about the risk difference of ± 1.59%.
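The precision quoted on this slide can be illustrated with the usual Wald interval for a difference of two proportions. The slide does not state the event rate used for planning, so the rate below is a hypothetical value chosen only because it reproduces a half-width of about 1.59% with 2,000 subjects per group; the function name is also an assumption.

```python
import math
from statistics import NormalDist

def risk_difference_half_width(p1, p2, n_per_group, level=0.95):
    """Wald half-width of the CI for a difference of two independent proportions."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return z * se

# Hypothetical common event rate; the slide does not give the planning value.
rate = 0.0708
half_width = risk_difference_half_width(rate, rate, 2000)
print(f"{100 * half_width:.2f}%")
```

Lower assumed event rates would give a narrower interval for the same sample size, which is why the agreed width and the anticipated rate jointly determine the study size.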

23 EAGLES Approach Applied to HR In studies where an NIM is not able to be determined by a strong clinical/scientific method, the 95% CI for the HR is a way to size the study and to assess the cost benefit of relative trial size.
Example of total number of planned events and 95% CL for the HR
Events  95% CL
508     (HR/1.19, 1.19 × HR)
388     (HR/1.22, 1.22 × HR)
280     (HR/1.26, 1.26 × HR)
The use of the 95% CI to determine study size in a non-NIM study does not necessarily lead to smaller trials than NIM-based trials.
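The multiplicative confidence limits in this example follow from the common approximation SE(log HR) ≈ 2/sqrt(events) for a 1:1 comparison. The sketch below assumes that approximation and treats the listed event counts as events within a single two-arm comparison; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def hr_ci_factor(events, level=0.95):
    """Factor f such that the CI for the hazard ratio is (HR / f, f * HR)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(z * 2 / math.sqrt(events))

for events in (508, 388, 280):
    f = hr_ci_factor(events)
    print(events, f"(HR/{f:.2f}, {f:.2f} x HR)")
```

Because the factor depends only on the number of events, halving the width of the interval on the log scale requires four times as many events.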

24 Summary EAGLES is a rigorously designed safety study in which an NIM was not possible. When EAGLES is completed the point estimates of the risk difference and the 95% confidence interval and absolute levels of risk will contribute to clinical knowledge. Because of the rigor of the design of EAGLES, FDA anticipates adding the results of EAGLES to the label and using the results to evaluate the boxed warning for Chantix generated by the spontaneous reports.

25 Personal Perspective/Points of Interest An NIM, when able to be determined by a strong scientific/clinical method, provides important scientific context for the design, conduct and analysis of safety trials and serves the purpose of ruling out an unacceptable increase in risk. When an NIM cannot be established by a strong scientific/clinical method an arbitrary one should not be used. As shown by EAGLES, a safety study can be sized by the 95% CI. This approach calls for as rigorous a trial as does one with an NIM. A hypothesis rejecting, all or nothing, interpretive approach to the results of a safety study based on an NIM should be discouraged. While the NIM does give the pre-specified level of unacceptable risk, a binary interpretation distracts from the clinical information from the study, such as point estimates, confidence intervals and absolute levels of risk. Medical/scientific considerations should inform the on-treatment risk period (e.g., treatment period + 30 days after stopping) for the MITT analysis. Without indications of informative censoring (although it may be hidden) the MITT events over the on-treatment risk period may better estimate risk than ITT events with the expectation of HR=1 after the risk period. All subjects must be followed to the end of the study in order to conduct the ITT analysis. Only the ITT analysis preserves the integrity of randomization which is the statistical basis of inference and the assessment of causality.

