Why do people use LOCF? Or why not?
Naitee Ting, Allison Brailey
Pfizer Global R&D
CT Chapter Mini Conference

Outline
  Last Observation Carried Forward (LOCF)
  Data set description
  Modeling approaches
  Concerns in clinical trials
  SAP concerns
  Why or why not use LOCF
Observed data from each patient over time
Complete Data
Last-Observation-Carried-Forward
LOCF
  Conservative? Or anti-conservative?
  Biased point estimate
  May underestimate variance

Data set
  Simulated - standing diastolic BP
  Eight-week study of test drug vs placebo
  Clinic visit every 2 weeks
  Primary endpoint – change in standing BP from baseline to week 8
  Patients completed the study or dropped out at various time points
  Missing completely at random
Simulated data
ctr  pid  trt    wk0    wk2    wk4    wk6    wk8
501    1    1  103.9  102.0  103.6  102.2  100.4
501    2    0  105.9  111.8  112.5  115.0  117.0
501    5    0   93.8   98.4  103.4  104.5  116.7
501    6    1  102.8   87.4   72.8   60.9   48.5
501   11    0  109.4  105.3   99.2   96.9   89.7
501   15    0   93.9   81.6   66.1   50.5   40.3
501   16    1   92.4   83.6   71.7   66.2   56.5
501   18    0   99.3   99.0  101.9  102.5  103.2
502    1    0  105.8  102.7   87.5   84.9   78.8
502    4    1  102.0  100.3  101.1   95.7      .
502    5    1  110.3  116.8  120.6  132.7  136.8
502    8    0  125.6  121.7  116.1  110.0  108.5
502    9    1   92.9   91.4   82.1      .      .
502   12    0  123.7  121.7  118.3  122.0  120.3
502   13    0  107.7  121.4  141.5  154.7  168.9
502   16    1  112.1  109.6  103.6  103.3  104.2
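To make the carry-forward rule concrete, here is a minimal Python sketch (not the SAS used for the analyses in this talk) applied to the two center 502 patients in the listing who have `.` entries, which are read here as missing visits. The function name `locf` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
from typing import List, Optional


def locf(visits: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing visit (None) with the most recent observed value."""
    filled = []
    last = None
    for value in visits:
        if value is not None:
            last = value  # remember the latest observed measurement
        filled.append(last)
    return filled


# Visits are ordered wk0, wk2, wk4, wk6, wk8; None stands for the '.' marker.
# Keys are (ctr, pid) pairs copied from the listing.
patients = {
    (502, 4): [102.0, 100.3, 101.1, 95.7, None],
    (502, 9): [92.9, 91.4, 82.1, None, None],
}

locf_rows = {key: locf(row) for key, row in patients.items()}

# Primary endpoint under LOCF: carried-forward week 8 value minus baseline.
changes = {key: round(row[-1] - row[0], 1) for key, row in locf_rows.items()}

for key in patients:
    print(key, locf_rows[key], changes[key])
```

For patient 9 the week 4 value (82.1) is used as the week 8 value, so the patient is analyzed as if nothing changed after dropout. That is the simple assumption behind LOCF.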
Modeling approaches
  Many proposals to deal with dropouts
  Mixed model approach
    Repeated measures
    Random intercept, random slope
  Single imputation
  Multiple imputation
    Imputation model
    Analysis model
ANCOVA on LOCF data
Source     |  df       MS      F   p-Value
TREATMENT  |   1   2441.0   4.13    0.0444
CENTER     |   8    765.8   1.30    0.2523
BASELINE   |   1    318.4   0.54    0.4644
ERROR      | 119    591.1

Statistic    Test Drug   Placebo
Raw Mean         -9.40     -0.54
Adj Mean         -8.93     -0.26
Std Error         3.08      3.01
N                   65        65
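As a reading aid for the ANCOVA table, the Python sketch below (not SAS) recomputes each F value as the effect mean square divided by the error mean square, checks that the degrees of freedom add up to N - 1, and forms the adjusted treatment difference. All numbers are copied from the slide; the variable names are illustrative.

```python
# Values copied from the "ANCOVA on LOCF data" slide.
mean_squares = {"TREATMENT": 2441.0, "CENTER": 765.8, "BASELINE": 318.4}
degrees_of_freedom = {"TREATMENT": 1, "CENTER": 8, "BASELINE": 1, "ERROR": 119}
ms_error = 591.1
n_drug, n_placebo = 65, 65

# F statistic for each effect = effect mean square / error mean square.
f_values = {src: round(ms / ms_error, 2) for src, ms in mean_squares.items()}

# Model and error df should sum to the total df, N - 1.
total_df = sum(degrees_of_freedom.values())

# Adjusted (least-squares) mean difference, test drug minus placebo.
adj_mean_drug, adj_mean_placebo = -8.93, -0.26
adj_difference = round(adj_mean_drug - adj_mean_placebo, 2)

print(f_values)
print(total_df, n_drug + n_placebo - 1)
print(adj_difference)
```

The recomputed F values agree with the table to two decimals, and the adjusted difference is -8.67 mmHg in favor of the test drug.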
Analysis of completed cases
Source     |  df       MS      F   p-Value
TREATMENT  |   1   1963.6   3.32    0.0713
CENTER     |   8   1007.2   1.70    0.1060
BASELINE   |   1     73.2   0.12    0.7258
ERROR      | 109    592.0

Statistic    Test Drug   Placebo
Raw Mean        -10.40     -1.72
Adj Mean        -10.23     -2.11
Std Error         3.27      3.14
N                   60        60

Naive interpretation
  If LOCF provides statistical significance
  If completer analysis supports LOCF
  True story may lie between the two
  Clinical conclusion can be made
Mixed model analysis
  For demonstration purposes, only repeated measures results are presented

proc mixed method=reml ;
  where week>0 ;
  class pid trt week ctr ;
  model y=wk0 trt ctr week trt*week / solution ;
  repeated week / type=cs subject=pid r rcorr ;
  estimate 'trt dif at week 8' trt -1 1
           trt*week 0 0 0 -1 0 0 0 1 / cl alpha=0.05 ;
run ;
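The coefficients in the ESTIMATE statement can be hard to read at a glance. The Python sketch below (not SAS) rebuilds them under two assumptions that are not stated on the slide: the CLASS levels sort as trt = 0, 1 and week = 2, 4, 6, 8, and the cells of trt*week are ordered with week varying fastest. The helper name `week8_coef` is hypothetical.

```python
from itertools import product

trt_levels = [0, 1]          # assumed sort order of the CLASS variable trt
week_levels = [2, 4, 6, 8]   # week 0 is removed by "where week>0"

# Assumed cell order for trt*week: trt varies slowest, week fastest.
cells = list(product(trt_levels, week_levels))


def week8_coef(trt: int, week: int, target_week: int = 8) -> int:
    """Coefficient for the contrast (trt 1 minus trt 0) at the target week."""
    if week != target_week:
        return 0
    return 1 if trt == 1 else -1


trt_coefs = [-1, 1]  # main-effect part: trt 1 minus trt 0
interaction_coefs = [week8_coef(t, w) for t, w in cells]

for cell, coef in zip(cells, interaction_coefs):
    print(cell, coef)
```

Under these assumptions the interaction coefficients come out as 0 0 0 -1 0 0 0 1, matching the statement: the contrast picks out only the week 8 cells of the two treatment groups.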
Results from PROC MIXED
              Num    Den
Effect         DF     DF   F Value   Pr > F
Baseline        1    456      3.03   0.0826
Treatment       1     16      5.57   0.0313
Center          8     85      5.43   <.0001
Week            3    108      2.46   0.0662
Trt*week        3     46      1.22   0.3132

                          Standard
Label         Estimate       Error    DF   t Value   Pr > |t|
week 8 dif      7.3739      3.0127    46      2.45     0.0183
Single or multiple imputation
  Mixed model can be considered as single imputation
  The same model can be used for both imputation and analysis
  Alternatively, one model can be used for imputation and a different one for analysis
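When the analysis model is run on several imputed data sets, the results are usually combined with Rubin's rules. The talk does not show this step, so the Python sketch below (not SAS) is an added illustration: the function name `pool_rubin` is hypothetical, and the input numbers are toy values, not results from the simulated study.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(estimates: List[float], std_errors: List[float]) -> Tuple[float, float]:
    """Combine m per-imputation estimates and standard errors with Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])   # average within-imputation variance
    between = variance(estimates)                    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Toy inputs for illustration only (NOT from the study data).
toy_estimates = [1.0, 2.0, 3.0]
toy_std_errors = [1.0, 1.0, 1.0]

pooled, pooled_se = pool_rubin(toy_estimates, toy_std_errors)
print(pooled, pooled_se)
```

The pooled standard error is larger than any single-imputation standard error because it adds the between-imputation spread. This is the uncertainty that single imputation methods such as LOCF leave out.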
Should LOCF be used?
  Since modeling approaches became available, use of LOCF has been discouraged
  Models are developed with assumptions
  More complicated models require more assumptions
  Are these assumptions justified?
Should LOCF be used?
  LOCF is a model and there are simple assumptions behind it
  In New Drug Applications (NDA), LOCF is still widely used
  Why?

Different phases in clinical trials
  Phase I, II, III, IV
  Phase I – How often?
  Phase II – How much?
  Phase III – Confirm
  Phase IV – Post-Market

DOES THE DRUG WORK?
  Double-blind, placebo controlled, randomized clinical trial
  Test hypothesis - does the drug work?
  Null hypothesis (H0) - no difference between test drug and placebo
  Alternative hypothesis (Ha) - there is a difference
TYPES OF ERRORS
  Regulatory agencies focus on the control of Type I error
  Probability of making a Type I error is not greater than α
  In general, α = 0.05; i.e., 1 in 20
  Avoid inflation of this error
  Changing the method of analysis to fit data will inflate α
MULTIPLE COMPARISONS
  For 20 independent variables (clinical endpoints), about one is expected to be significant by chance alone
  For 20 independent treatment comparisons, about one is expected to be significant by chance alone
  Subgroup analyses can also potentially inflate α
  Multiple comparison adjustment
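The "one in 20" arithmetic can be worked out directly. The Python sketch below assumes 20 independent tests, each run at a nominal level of 0.05 with every null hypothesis true. The Bonferroni line is one common adjustment, shown as an example only; the slide does not name a specific method.

```python
alpha = 0.05   # nominal per-test Type I error rate
k = 20         # number of independent tests (endpoints or comparisons)

# Expected number of false positives when all null hypotheses are true.
expected_false_positives = k * alpha

# Probability that at least one test is significant by chance alone.
p_at_least_one = 1 - (1 - alpha) ** k

# Bonferroni adjustment (one common choice): test each hypothesis at alpha / k.
bonferroni_level = alpha / k
p_at_least_one_adjusted = 1 - (1 - bonferroni_level) ** k

print(expected_false_positives)
print(round(p_at_least_one, 3))
print(round(p_at_least_one_adjusted, 3))
```

Without adjustment, the chance of at least one false positive is roughly 64%, far above the nominal 5%. With the Bonferroni level it stays just under 5%.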
Report all data
  Scientific experiments generate data
  Outliers may be observed
  Delete outlier?
  Clinical trials generate data
  A wonder drug cures 9,999 patients of 10,000
  One died – outlier – delete?
Statistical Analysis Plan (SAP)
  Pre-specification of analysis
  Prior to breaking the blind
  Internal agreement within project team
  Binding document to communicate with regulatory authorities
  Use of LOCF or a modeling approach needs to be pre-specified in the SAP
Modeling approaches
  Assumptions
  Can be complicated
  Difficult to explain to end users
  George Box – “All models are wrong, some are useful”

Why LOCF? Or why not?
  Easy to understand
  Easy to communicate between statisticians and clinicians, and between sponsor and regulators
  Lots of prior examples
  Biased point estimate, biased variance

Recommendations
  Understand the disease
  Understand data to be collected
  Understand the dropout issues
  Make use of Phase II results
  Encourage use of statistical models
  LOCF may still be considered as supportive