1 Blinding Index for Clinical Trials Heejung Bang, PhD Weill Medical College of Cornell University

2 Outline
- Background
- Statistical methods
- Examples: CRISP and WASID trials
- Simulation study (if time permits)
- Discussion
- Current and future studies

3 Background
- Human behavior is influenced by what we know or believe, and everybody is tempted to find out something.
- Blinding or masking -- single, double, triple (Dictionary for Clinical Trials 1999)
- Reduce selection and information bias, and improve compliance
- Huge efforts are directed to disguise the dissimilarity between treatments (e.g., taste, smell, appearance, mode of delivery)
- Treatment assignment: the use of blocks of random lengths is common practice (suggested by ICH-E9)

4 Background
- Blinding is not always feasible or relevant
  -- some surgical treatments
  -- treatment vs. nothing
  -- early vs. late interventions (e.g., HIV)
  -- some patients don't know if they are on treatment
  -- animal studies??
- Imperfect blinding is preferable to an open design (Furberg & Soliman 2008)
- Bias can occur in every aspect of trials: data reporting, collection, assessment and classification.
- Bias may be more pronounced in trials with commercial sponsors.

5 Some alternative or back-up methods
- A third party can be employed in open trials in order to reduce study investigator bias
- Dose adjustment of the study medications may be handled in a run-in phase prior to randomization
- Prospective, randomized, open-label, blinded-endpoint (PROBE) design? Perhaps, not yet.

6 Important illustrative example
A single-blind, placebo-controlled trial investigating the effect of zinc on taste disorders showed a statistically significant benefit. The identical trial was repeated with only one difference, double-blinding, and showed no benefit.
Problems:
1. responses were very subjective
2. 'vested interest'
3. bias from unblinding

7 FDA said: "[drug name]-related side effects have the potential to unblind subjects and investigators. Unblinding may result in ascertainment bias of subjective study endpoints. We recommend that you administer a questionnaire at study completion to investigate the effectiveness of blinding the subjects and treating and evaluating physicians."
Office of Therapeutics Research and Review, Center for Biologics Evaluation and Research, FDA 2003

8 FDA also said: "DRUDP requests that subjects and investigators state at the end of the subject's participation as to what treatment assignment they think was made, in order to assess the adequacy of blinding"
Office of Drug Evaluation ODE III, Center for Drug Evaluation and Research, FDA 2005

9 CONSORT (revised version, 2001) recommends reporting "how the success of blinding was evaluated."

10 More Background
- Everybody knows blinding is important, but…
- Grossly incomplete reporting of procedures and any test for blinding. Call for urgent improvement (Schulz et al. 1996; Fergusson et al. 2004; Hróbjartsson et al., among others)
- Not many statistical methods available. Only two blinding indices in the literature.
- Most medical papers are exploratory or some are even incorrect.
- How to handle the "Don't know" answer? --- different from missing data!

11 Common formats of blinding questionnaire
With 3 response categories about their guess: "Drug", "Placebo" or "Don't know (DK)"
With 5 response categories about guess and certainty of guess:
  "Strongly believe the treatment is drug"
  "Somewhat believe the treatment is drug"
  "DK"
  "Somewhat believe the treatment is placebo"
  "Strongly believe the treatment is placebo"
Remarks:
1. We may re-ask those who answered DK initially.
2. Some don't allow DK and force a guess --- I don't think this is a good idea.

12 Common data structures
- 2x3 format

13 Common data structures
- 2x5 format

14 We may also collect ancillary data from those who answered DK.
Remark: n3. (in the ancillary data) = n.3 (in the 2x3 or 2x5 format) if there are no missing data
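
To make these layouts concrete, here is a minimal sketch (Python/NumPy) of the three data structures, assuming rows index the actual assignment (drug, placebo) and columns the guess; every count below is hypothetical and chosen only so the consistency check passes:

    import numpy as np

    # 2x3 format: columns = guess drug, guess placebo, don't know (DK).
    counts_2x3 = np.array([[82, 25, 43],    # drug arm:    n11, n12, n13
                           [45, 40, 65]])   # placebo arm: n21, n22, n23

    # 2x5 format: columns = strongly drug, somewhat drug, DK,
    # somewhat placebo, strongly placebo.
    counts_2x5 = np.array([[50, 32, 43, 15, 10],
                           [20, 25, 65, 22, 18]])

    # Ancillary data from those who first answered DK and were re-asked:
    # columns = now guess drug, now guess placebo, still DK.
    ancillary_dk = np.array([[12, 8, 23],
                             [10, 15, 40]])

    # Remark on this slide: with no missing data, the ancillary row totals
    # equal the DK counts of the 2x3 (or 2x5) table.
    assert (ancillary_dk.sum(axis=1) == counts_2x3[:, 2]).all()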

15 Existing methods
1. Chi-Square test (Hughes & Krahn, 1985)
- Comparing the proportions of correct and incorrect answers
- 2x2 Chi-Square test to compare P_cor and P_inc among participants, excluding DKs
- 2x3 Chi-Square test to compare P_cor and P_inc among all participants
- If blinding was not maintained from the above analyses, blinding was assessed in each arm separately.
Remark: Strictly speaking, this is like performing a one-sample binomial test (see the sketch below)!
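
As the remark says, comparing correct vs. incorrect guesses within one group boils down to a one-sample binomial test against 0.5. A minimal sketch with hypothetical counts, using scipy.stats.binomtest (available in scipy >= 1.7):

    from scipy.stats import binomtest

    # Hypothetical counts within one arm, DK responses excluded.
    n_correct, n_incorrect = 82, 25

    # Testing P_cor = P_inc among those who ventured a guess is a
    # binomial test of a proportion against 0.5.
    result = binomtest(n_correct, n_correct + n_incorrect, p=0.5)
    p_cor = n_correct / (n_correct + n_incorrect)
    print(f"P_cor = {p_cor:.3f}, two-sided p-value = {result.pvalue:.4g}")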

16 2. Kappa statistic
- Note that Kappa measures agreement, but we should measure disagreement!

17 3. Blinding Index (James et al., 1996)
- Modified version of the kappa statistic
- BI = {1 + P_DK + (1 - P_DK) * K_D} / 2, where
  P_ij = n_ij / N
  w_ij = 0 (correct guess), 0.5 (incorrect guess), 1 (DK)
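
A minimal sketch of the top-level James formula exactly as written on this slide. The disagreement-kappa term K_D is taken as an input here, because its full derivation via the weights w_ij is not spelled out on this slide; the counts reuse the earlier hypothetical 2x3 table.

    import numpy as np

    def james_bi(counts_2x3, kappa_d):
        """James et al. (1996): BI = {1 + P_DK + (1 - P_DK) * K_D} / 2,
        where P_DK is the overall proportion of 'don't know' answers and
        K_D is a kappa-type disagreement statistic (supplied by the caller)."""
        p_dk = counts_2x3[:, 2].sum() / counts_2x3.sum()
        return (1 + p_dk + (1 - p_dk) * kappa_d) / 2

    counts_2x3 = np.array([[82, 25, 43],     # hypothetical counts,
                           [45, 40, 65]])    # rows = drug, placebo

    # K_D = 0 corresponds to chance-level disagreement among the guesses.
    print(james_bi(counts_2x3, kappa_d=0.0))   # 0.68 for these counts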

18 - 0 ≤ BI ≤ 1; larger values indicate better blinding (BI = 0.5 corresponds to random guessing)

19 Limitations of existing methods
- Most methods are descriptive or use naïve or incorrect statistics (e.g., Hughes and Krahn 1985; Howard et al. 1982).
- The James method is dominated by DK responses (DK should be real DK).
- Existing methods cannot 1) detect different behaviors of the two arms, 2) distinguish qualitatively different scenarios, or 3) give the proportion of unblinding beyond chance.

20 New blinding index (Bang et al., 2004) (2x3 format)
Define r_i|i = P_i|i / (P_1|i + P_2|i), i = 1 for drug, i = 2 for placebo
(i.e., the proportion of correct guesses among participants with certain identification on the i-th arm)
- Without DK: new BI_i = 2 r_i|i - 1
  (i.e., the proportion of participants who answer correctly on the i-th arm beyond chance level)
- With DK: new BI_i = (2 r_i|i - 1) * (P_1|i + P_2|i), estimated by plugging in the sample proportions n_ij / n_i.

21 Remark: new BI_i is identical to
  P_1|1 - P_2|1 for the drug arm
  P_2|2 - P_1|2 for the placebo arm
under the trinomial distribution (see the sketch below).
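
A minimal sketch of the per-arm point estimate from a 2x3 count table (hypothetical numbers), checking that the definition via r_i|i matches the difference-of-proportions form in the remark above:

    import numpy as np

    counts_2x3 = np.array([[82, 25, 43],    # drug arm:    guess drug, guess placebo, DK
                           [45, 40, 65]])   # placebo arm: guess drug, guess placebo, DK

    def new_bi(counts_2x3, arm):
        """Bang et al. (2004) index for arm 0 (drug) or arm 1 (placebo)."""
        n_i = counts_2x3[arm].sum()                        # arm size, DK included
        p_cor = counts_2x3[arm, arm] / n_i                 # P_correct | arm
        p_inc = counts_2x3[arm, 1 - arm] / n_i             # P_incorrect | arm
        r = p_cor / (p_cor + p_inc)                        # r_{i|i} on slide 20
        bi = (2 * r - 1) * (p_cor + p_inc)                 # definition with DK
        assert abs(bi - (p_cor - p_inc)) < 1e-12           # remark on slide 21
        return bi

    print(new_bi(counts_2x3, 0), new_bi(counts_2x3, 1))    # 0.38 and -0.033 for these counts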

22 New blinding index (2x5 format)
- More general 2x5 format (& ancillary data for DK):
  new BI_i = P_1|i + w_2|i P_2|i + w3_1|i P3_1|i - P_5|i - w_4|i P_4|i - w3_2|i P3_2|i
  subject to 0 ≤ w3_1|i = w3_2|i ≤ w_2|i = w_4|i ≤ 1 and
  P_1|i + P_2|i + P3_1|i + P3_2|i + P_4|i + P_5|i = 1
Remarks: 1. -1 ≤ new BI_i ≤ 1
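
A minimal sketch of the weighted 2x5 version for a single arm. The category ordering (from "strongly correct" down to "strongly incorrect") and the specific weight values 0.5 and 0.25 are illustrative assumptions that satisfy the constraint 0 ≤ w3 ≤ w ≤ 1 above; they are not prescribed by this slide.

    import numpy as np

    def new_bi_weighted(counts_arm, dk_followup, w_somewhat=0.5, w_dk_guess=0.25):
        """Weighted per-arm index for the 2x5 format plus ancillary DK data.

        counts_arm: 5 counts ordered strongly correct, somewhat correct, DK,
                    somewhat incorrect, strongly incorrect (for the arm in question).
        dk_followup: (now correct, now incorrect, still DK) among initial DKs.
        """
        counts_arm = np.asarray(counts_arm, dtype=float)
        dk_followup = np.asarray(dk_followup, dtype=float)
        n = counts_arm.sum()                                  # arm size, DKs included
        p_strong_cor, p_some_cor, _, p_some_inc, p_strong_inc = counts_arm / n
        p_dk_cor, p_dk_inc = dk_followup[:2] / n
        return (p_strong_cor + w_somewhat * p_some_cor + w_dk_guess * p_dk_cor
                - p_strong_inc - w_somewhat * p_some_inc - w_dk_guess * p_dk_inc)

    # Hypothetical drug-arm data consistent with the earlier sketches.
    print(new_bi_weighted([50, 32, 43, 15, 10], [12, 8, 23]))   # about 0.33 here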

23
- If r_i|i = 1 and n_i3 = 0 (i.e., all responses are correct), new BI_i = 1 (complete unblinding).
- If r_i|i = 0 and n_i3 = 0 (i.e., all responses are incorrect), new BI_i = -1 (complete blinding, or complete unblinding in the opposite direction; how to interpret??).
- If r_i|i = 0.5 (i.e., 50% correct and 50% incorrect among participants with certain identification), new BI_i = 0 (random guessing).
- Unblinding: if the one-sided CI does not cover 0 (see the sketch below).
- Wishful thinking*: all subjects tend to believe that they are on active treatment; test if 1) new BI_1 > 0; 2) new BI_2 < 0; 3) new BI_1 + new BI_2 = 0.
*Andersen et al. (1972) called this 'response bias'
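
A minimal sketch of the one-sided decision rule for the 2x3 format (hypothetical counts). The standard error uses the usual multinomial variance for a difference of two cell proportions, which is consistent with the trinomial remark on slide 21 but is not written out on these slides, so treat it as an assumption:

    import numpy as np
    from scipy.stats import norm

    counts_2x3 = np.array([[82, 25, 43],
                           [45, 40, 65]])   # hypothetical: rows = drug, placebo

    def new_bi_with_ci(counts_2x3, arm, alpha=0.05):
        """Point estimate, SE, and one-sided lower confidence bound for one arm."""
        n_i = counts_2x3[arm].sum()
        p_cor = counts_2x3[arm, arm] / n_i
        p_inc = counts_2x3[arm, 1 - arm] / n_i
        bi = p_cor - p_inc
        # Multinomial variance of a difference of two cell proportions (assumed form).
        se = np.sqrt((p_cor * (1 - p_cor) + p_inc * (1 - p_inc) + 2 * p_cor * p_inc) / n_i)
        return bi, se, bi - norm.ppf(1 - alpha) * se

    for arm, label in [(0, "drug"), (1, "placebo")]:
        bi, se, lower = new_bi_with_ci(counts_2x3, arm)
        verdict = "unblinded" if lower > 0 else "no evidence of unblinding"
        print(f"{label}: BI = {bi:.3f} (SE {se:.3f}), one-sided lower bound = {lower:.3f} -> {verdict}")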

24 Example 1: Cholesterol Reduction in Seniors Program (CRISP)
A pilot study for cholesterol lowering in the elderly. Cholesterol levels continue to be predictors of coronary heart disease in people >65 years. CRISP was a 5-center pilot study to assess feasibility of recruitment and efficacy of cholesterol lowering in this age group.
The main paper: LaRosa, Applegate, Crouse, Hunninghake, Grimm, Knopp, Eckfeldt, Davis & Gordon (Arch Int Med 1994)

25 METHODS: A double-blinded RCT with placebo vs. 20- vs. 40-mg lovastatin; 1-year follow-up. Endpoints were changes in lipid levels.
RESULTS: 431 subjects with LDL in the … mg/dL range were randomized. In the 20- and 40-mg lovastatin groups, total cholesterol levels fell 17% and 20%; LDL fell 24% and 28%; triglyceride levels fell 4.4% and 9.9%, respectively. HDL rose 7.0% and 9.0%, respectively. No changes in the placebo group.
CONCLUSION: Older subjects of both genders and a variety of racial/ethnic groups can be successfully recruited into a cholesterol-lowering trial. Lovastatin has effects similar to those reported in younger subjects. There is little advantage to the higher lovastatin daily dose.

26 CRISP study

27 CRISP results
- Hughes & Krahn's method (with correction)
  Overall: P_cor = 68.1% > P_inc = 31.9% (p < …), unblinded
  Lovastatin: P_cor = 76.6% > P_inc = 23.4% (p < …), unblinded
  Placebo: P_cor = 51.8% > P_inc = 48.2% (p = …), blinded
- James et al.'s BI
  BI = 0.75 (95% CI: 0.71, 0.78), blinded
- Bang et al.'s new BI
  Using data in 2x3 format:
    Lovastatin: BI = 0.21 (95% CI: 0.15, 0.26), unblinded
    Placebo: BI = 0.01 (95% CI: -0.07, 0.10), blinded
  Using data in 2x5 format + ancillary data for DK:
    Lovastatin: BI = 0.16 (95% CI: 0.11, 0.21), unblinded
    Placebo: BI = 0.01 (95% CI: -0.06, 0.08), blinded

28 Example 2: Warfarin-Aspirin Symptomatic Intracranial Disease (WASID) trial
Hertzberg et al. (2008) investigated whether use of a dose-modification schedule is effective for blinding in trials of warfarin, using the WASID study (Chimowitz et al. NEJM 2005). Blinding was compared with that in the SPINAF trial (Ezekowitz et al. NEJM 1992). The WASID team paid great attention to blinding issues.


31 Results
- In the WASID and SPINAF trials, the new BI uniformly showed greater unblinding for warfarin than for aspirin, whereas other indices could not capture this fact.
- If you combine BIs from different arms, a cancel-out effect can occur. Summarizing the pattern can be important.
- The observed trend may be explained by other occurrences associated with warfarin (e.g., number of dose changes, hemorrhage).

32 Comments from Dr. Canette (at Stata)
'It is clear that BI and New BI cannot be compared, because they are based on different paradigms. James et al. believe that the most important observations are "DK", and the new index doesn't rely much on these observations. I believe that some research needs to be done to determine the circumstances under which one of the indexes should be chosen over the other. At this point, the subject of which paradigm to choose is a totally subjective matter.'

33 Example 3: How to interpret these data?
A sample run of the 'Blinding' module in Stata using an artificial example (provided by Dr. Canette):

  Method        | Index | Std. Err. | z | P-value | 95% CI
  James         |  …
  Bang, Drug    |  …
  Bang, Placebo |  …

Some researchers would be happy with the blinding results, but Bang's new BI rejects the blinding hypothesis. Some people would say that the new BI is being harsh; the response is that it all depends on your interpretation of DK.

34 Bang's response
Yes, it all depends on the validity of DK. This can be one of the new BI's limitations, but we want to focus on estimation, not testing. Even when DK is truly DK (so James' BI might be better), we still want to know the % of unblinding in each arm; the new BI shows that clearly. We may classify each trial into one of 9 blinding scenarios.


36 Our suggestion in practice
Blinding assessment should be reported in publications for relevant trials, if not all trials:
- when unblinding (e.g., by the new BI) is >20% in any arm;
- perhaps BI and new BI should be reported together, especially when the two methods yield different conclusions.
Remark: Selective reporting can be problematic!

37 Another important & controversial question: When to ask blinding questions? (Letters & Response in CCT)
Shortly after randomization vs. during vs. after the trial?
- Henneicke-von Zepelin (2005) and Hemilä (2005) both claim 'assessment may be inappropriate after the trial' due to confounding between efficacy and correct guessing.
- Sackett (BMJ 2004) and others agree.

38 Bang et al.'s response (2005):
Statistically speaking, of course, the best approach is to ask twice or more. However, we still prefer 'after the trial'. Although we may not be able to know whether blinding is true blinding or DK is really DK, we never want to make participants try to guess. The more we ask, the more curious they may become. 'Less is more' or 'Let it be'. Blinding conveys stories over the entire course of the trial, early and late, efficacy and side effects. If you want to test blinding at the beginning, do so with a third party or in a pilot study.

39 We may ask some more qualitative questions, for example:
'Why do you believe you were on treatment x?'
'When did you find out?'
preferably together with other general questions (e.g., participants' satisfaction, problems or other comments) at the study close-out.
--- Again, we do not want to make participants try to guess, even in their future trials.

40 Simulation studies: settings

41 Simulation studies: settings (cont'd) (actually, 9 = 3x3 possible scenarios)

42 Simulation results

43 Simulation results (cont'd)

44 Discussion
- The new BI is directly interpretable (as % of unblinding beyond chance), detects relatively low degrees of unblinding, and captures different behaviors in different arms. DO NOT COMBINE arms!
- The new BI's extension to >2 arms is easy.
- We should encourage all participants to provide their honest guess for the treatment, and may include extra question(s) to evaluate 1) the credibility of DK and 2) reasons for the guess, etc.
  -- Some don't like this idea
- Subgroup analyses can be important (Vitamin C trial 1975; Hemilä 1996).

45 Discussion
- Assessment of blinding can be quite straightforward statistically, but the final conclusion relies on the subjectivity of investigators and the nature of the study. (For example, how large is large enough?)
- 1) BI estimation (& statistical testing), 2) classification into the 9 blinding scenarios, and 3) careful interpretation and identification of potential causes can provide a comprehensive evaluation of the blinding of clinical trials.
- If unblinding occurs, it is important to identify the causes and, if possible, fix the problems (except for the treatment effect itself) for future study planning and conduct.

46 Discussion
- Blinding research is destined to be subjective, qualitative, and imperfect.
- However, empirical (quantitative) evidence is almost always good to have.
- At the end of the day, what is the impact on the primary treatment effect??
  -- Unblinding may not invalidate primary results.

47 Personal belief about good treatments/trials
'Treatment effect'
should be greater than
'Noncompliance effect'
should be greater than
'Unblinding effect'

48 Current and future research (collaborators: Drs. Park and Canette)
- Literature review; meta-analytic approaches
- Classify existing trials into different blinding scenarios
- The effect of blinding on effectiveness
- BI by types of interventions
- More simulation studies comparing blinding indices
- Developing a blinding questionnaire and procedure
Park, Bang, Canette. Blinding in controlled trials, time to do it better (Editorial). Complementary Therapies in Medicine.

49 Acknowledgements
- Dr. Isabel Canette and Ms. Jiefeng Chen in the Stata team -- the "blinding" module to compute the two BIs has been available from Stata since March
- Dr. Jongbae Park, Mr. Stephen Flaherty and the IT team at UNC Medical School
- Co-authors: Ms. Liyun Ni (at Amgen) and Dr. Clarence E. Davis (at UNC)

