
1 From bench to the bedside: Statistics Issues in RCT. Ferran Torres, Biostatistics and Data Management Platform, IDIBAPS - Hospital Clinic Barcelona, Universitat Autònoma Barcelona. EMA: Scientific Advice Working Party (SAWP), Biostatistics Working Party (BSWP).

2 2 Disclaimer The opinions expressed today are personal views and should not be understood or quoted as being made on behalf of any organization. – Regulatory Spanish Medicines Agency (AEMPS) European Medicines Agency (EMA) – Scientific Advice Working Party (SAWP) – Biostatistics Working Party (BSWP) – Hospital - Academic - Independent Research IDIBAPS. Hospital Clinic Barcelona Autonomous University of Barcelona (UAB) SCREN. Spanish Clinical Trials Platform

3 DOCUMENTATION 3

4 Documentation: PowerPoint presentation, selected references, direct links to guidelines. Password: stats_rct

5 5 Globalisation

6 Lack of harmonisation: data to register in all regions despite similar basic technical requirements (Japan, USA, EU) → international conferences on harmonisation.

7 Regulatory Agencies 7

8 8

9 Regulatory Guidances
– CPMP/EWP/908/99: Points to Consider on Multiplicity Issues in Clinical Trials
– CPMP/EWP/2863/99: Points to Consider on Adjustment for Baseline Covariates
– CPMP/2330/99: Points to Consider on Application with 1) Meta-analyses and 2) One Pivotal Study
– Choice of a Non-Inferiority Margin
– CPMP/EWP/482/99: Points to Consider on Switching between Superiority and Non-inferiority
– CPMP/EWP/1776/99: Points to Consider on Missing Data
– CHMP/EWP/83561/05: Guideline on Clinical Trials in Small Populations
– CHMP/EWP/2459/02: Reflection Paper on Methodological Issues in Confirmatory Clinical Trials with Flexible Design and Analysis Plan

10 "Scientific Recommendations": CONSORT Statement (summary, general, non-inferiority); Lancet: Methodological & Statistics Series; BMJ: Statistics Notes (Bland & Altman).

11 11

12 12 Today’s talk is on statistics

13 13

14 14

15 Basic statistics: Why statistics? Samples and populations. P-value. Statistical errors. Sample size. Confidence intervals. Interpretation of CIs: superiority, non-inferiority, equivalence.

16 16 The role of statistics “Thus statistical methods are no substitute for common sense and objectivity. They should never aim to confuse the reader, but instead should be a major contributor to the clarity of a scientific argument.” The role of statistics. Pocock SJ Br J Psychiat 1980; 137:

17 17 Why Statistics? Variation!!!!

18 BACKGROUND: SAMPLES AND POPULATIONS; P-VALUES AND CONFIDENCE INTERVALS

19 19 p

20 Population and samples: target population → population of the study → sample.

21 Extrapolation: from the sample (study results) to the population ("conclusions") through inferential analysis: statistical tests and confidence intervals.

22 P-value. The p-value is a "tool" to answer the question: could the observed results have occurred by chance*? Remember: the decision is made given the observed results in a SAMPLE, and the results are extrapolated to the POPULATION. *: accounts exclusively for random error, not bias. p < 0.05 → "statistically significant".

23 P-value: an intuitive definition. The p-value is the probability of having observed our data when the null hypothesis is true (no differences exist). Steps: 1) Calculate the treatment difference in the sample (A − B). 2) Assume that both treatments are equal (A = B) and then… 3) …calculate the probability of obtaining a difference of at least the observed magnitude, given assumption 2. 4) We conclude according to this probability: a. p < 0.05: the differences are unlikely to be explained by chance – we assume that the treatment explains the differences; b. p > 0.05: the differences could be explained by chance – we assume that chance explains the differences.
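As an illustration only (not part of the slides), the four steps above can be mimicked with a small permutation test in Python; the two groups of outcome values are invented and the variable names are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
a = np.array([5.1, 6.0, 4.8, 5.9, 6.3, 5.5])   # outcomes under treatment A (invented)
b = np.array([4.2, 4.9, 5.0, 4.4, 5.1, 4.6])   # outcomes under treatment B (invented)

# Step 1: the observed difference in the sample
observed = a.mean() - b.mean()

# Steps 2-3: assume A = B, reshuffle the group labels many times and count how
# often a difference at least as large as the observed one arises by chance alone
pooled = np.concatenate([a, b])
n_perm = 10_000
count = 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    diff = perm[:len(a)].mean() - perm[len(a):].mean()
    if abs(diff) >= abs(observed):
        count += 1
p_value = count / n_perm

# Step 4: conclude according to this probability
print(f"observed difference = {observed:.2f}, p = {p_value:.4f}")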

24 Factors influencing statistical significance: signal (the difference), noise (the variance, SD), and the quantity of data.

25 25

26 P-value: some reflections. A p-value tells us NOTHING about clinical or scientific importance – only that the results were not due to chance. A "very low" p-value does NOT imply clinical relevance (NO!!!), nor the magnitude of the treatment effect (NO!!). With larger n or smaller variability, p decreases. Please never compare p-values!! (NO!!!)

27 27 Interval Estimation Confidence interval Sample statistic (point estimate) Confidence limit (lower) Confidence limit (upper) Intuitive interpretation: “A probability that the population parameter falls somewhere within the interval”

28 28 95%CI Better than p-values… – …use the data collected in the trial to give an estimate of the treatment effect size, together with a measure of how certain we are of our estimate CI is a range of values within which the “true” treatment effect is believed to be found, with a given level of confidence. – 95% CI is a range of values within which the ‘true’ treatment effect will lie 95% of the time Generally, 95% CI is calculated as – Sample Estimate ± 1.96 x Standard Error
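A minimal sketch of the "estimate ± 1.96 × standard error" construction for a difference in means, using the same invented numbers as the p-value sketch above; it is illustrative only.

import numpy as np

a = np.array([5.1, 6.0, 4.8, 5.9, 6.3, 5.5])
b = np.array([4.2, 4.9, 5.0, 4.4, 5.1, 4.6])

estimate = a.mean() - b.mean()                                 # point estimate of the effect
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))  # standard error of the difference
lower, upper = estimate - 1.96 * se, estimate + 1.96 * se
print(f"difference = {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")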

29 Superiority study (schematic): 95% CI for the difference d. d > 0: positive effect (test better); d = 0: no difference; d < 0: negative effect (control better).

30 DESIGN: STATISTICAL ERRORS; SAMPLE SIZE; MINIMALLY IMPORTANT CLINICAL DIFFERENCE (MICD)

31 31 Type I & II Error & Power

32 The usefulness of believing in the existence of God (according to Pascal). H0: God does not exist. H1: God exists.

33 Type I & II Error & Power. Type I error (α): false positive – rejecting the null hypothesis when in fact it is true. Standard: α = 0.05. In words, the chance of finding statistical significance when in fact there truly was no effect. Type II error (β): false negative – accepting the null hypothesis when in fact the alternative is true. Standard: β = 0.20 or 0.10. In words, the chance of not finding statistical significance when in fact there was an effect.

34 Sample size and MICD: n = C × Variance / (MICD)², where C is a function of α and β, and MICD is the Minimally Important Clinical Difference.
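A hedged sketch of the formula on this slide for a two-group comparison of means, assuming a two-sided test; the constant C is taken as 2(z_{1−α/2} + z_{1−β})², and the SD and MICD values below are only examples (the SD of 20 mmHg echoes the power table that appears later).

from scipy.stats import norm

def n_per_group(alpha, power, sd, micd):
    # C = 2 * (z_{1-alpha/2} + z_{1-beta})^2 for a two-sided comparison of two means
    c = 2 * (norm.ppf(1 - alpha / 2) + norm.ppf(power)) ** 2
    return c * sd ** 2 / micd ** 2

# e.g. SD = 20 mmHg and an MICD of 5 mmHg (both illustrative values)
print(n_per_group(alpha=0.05, power=0.80, sd=20, micd=5))   # about 251 patients per group (round up)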

35 Minimally Important Clinical Difference (MICD or MID). "The smallest difference that is considered clinically important; this can be a pre-specified difference (the MICD)." One can observe a difference between two groups, or within one group over time, that is statistically significant but small. With a large enough sample size, even a tiny difference can be statistically significant. The MID is the smallest difference that we care about.

36 MICD 1) Statistical- or distribution-based methods that focus on the variance and distributional properties of scores in an untreated population of patients with the disease of interest 2) Panel based estimates from healthcare professionals and patients 3) External- or anchor-based methods, which compare changes in the outcome of interest to other clinically important outcomes 36

37 EFFECT SCALES: ABSOLUTE AND RELATIVE DIFFERENCES

38 Absolute and relative scales. Incidence = events / population at risk. Absolute Risk Reduction (ARR) = incidence in control − incidence in test. Relative Risk Reduction (RRR) = (incidence in control − incidence in test) / incidence in control. Number Needed to Treat (NNT) = 1 / ARR. Relative Risk (RR) = incidence in test / incidence in control.
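A small sketch computing the scales above for hypothetical event rates of 13% (test) and 16% (control), the same figures used in the worked example a few slides later.

def effect_scales(risk_test, risk_control):
    arr = risk_control - risk_test      # Absolute Risk Reduction
    rrr = arr / risk_control            # Relative Risk Reduction
    rr = risk_test / risk_control       # Relative Risk
    nnt = 1 / arr                       # Number Needed to Treat
    return {"ARR": arr, "RRR": rrr, "RR": rr, "NNT": nnt}

print(effect_scales(0.13, 0.16))   # ARR = 0.03, RRR ~ 19%, RR ~ 0.81, NNT ~ 33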

39 39 Absolute and Relative effects Risks …

40 RR & OR: RR or OR > 1 → risk factor; RR or OR = 1 → absence of effect; RR or OR < 1 → protective factor.

41 RR & OR (2×2 example, exposed vs non-exposed, ill vs not ill): rate in exposed = 2/4 = 0.50; rate in non-exposed = 1/4 = 0.25; RR = 2. Odds in exposed = 2/2 = 1; odds in non-exposed = 1/3; OR = 3.

42 Example Treatment A: relative risk of 0.81 Treatment B: reduction of 19% in risk Treatment C: absolute rate reduction of 3% Treatment D: survival increase from 84% to 87% Treatment E: relative mortality reduction of 19% Treatment F: avoids 1 death per 33 treated patients 42

43 Example. Treatment A: relative risk of 0.81 → RR = 13% / 16% = 0.81. Treatment B: reduction of 19% in risk → RRR = 19%. Treatment C: absolute rate reduction of 3% → ARR = 16% − 13% = 3%. Treatment D: survival increase from 84% to 87% → ARR = 87% − 84% = 16% − 13% = 3%. Treatment E: relative mortality reduction of 19% → RRR = (16% − 13%) / 16% = 19%, or 100 × (1 − RR) = 19%. Treatment F: avoids 1 death per 33 treated patients → NNT = 33; ARR = 1/33 ≈ 0.03 = 3%.

44 CLINICAL RELEVANCE-INTERPRETATION SUPERIORITY, NON-INFERIORITY AND EQUIVALENCE DESIGNS 44

45 Superiority study (schematic): 95% CI for the difference d. d > 0: positive effect (test better); d = 0: no difference; d < 0: negative effect (control better).

46 Schematic: superiority — CI for the treatment − control difference relative to 0 (treatment more effective vs less effective).

47 Schematic: equivalence — CI for the treatment − control difference lying within the lower and upper equivalence boundaries.

48 Schematic: non-inferiority — CI for the treatment − control difference lying above the lower equivalence boundary.

49 Schematic: choosing the margin — B vs A vs placebo; 30%; 1/2? 1/3?

50 50

51 51 JAMA 2002; 287:

52 52 30% 

53 53

54 54

55 55

56 Schematic: MARKET — successive treatments A, B, C, D, E compared against P and against each other.

57 57

58 Schematic: superiority — CI for the treatment − control difference relative to 0.

59 Schematic: statistically and clinically relevant superiority — CI for the treatment − control difference lying beyond the MICD.

60 Schematic: interpretation of the CI for the treatment − control difference relative to 0 and the equivalence boundaries — statistical superiority; statistically and clinically relevant superiority; non-inferiority; equivalence; inferiority; relevant vs non-relevant/negative effect.

61 Effect size & sample size: table of relative effect (%), absolute difference (mmHg) and statistical power (from 4.9% up to 80.4%), assuming constant variability (SD = 20 mmHg).

62 Key statistical issues: multiplicity; subgroups: interaction & confounding; superiority and non-inferiority (and Δ); adjustment for covariates; missing data; others – interim analyses, meta-analysis vs one pivotal study, flexible designs.

63 63 MULTIPLICITY

64 Roland Garros tournament – Carlos Moyá vs Markus Hipfl.

65 Lancet 2005; 365: 1591–95. To say it colloquially: torture the data until they speak...

66 66 Torturing data… – Investigators examine additional endpoints, manipulate group comparisons, do many subgroup analyses, and undertake repeated interim analyses. – Investigators should report all analytical comparisons implemented. Unfortunately, they sometimes hide the complete analysis, handicapping the reader’s understanding of the results. Lancet 2005; 365: 1591–95

67 Design – Conduct – Results

68 Multiplicity. K independent hypotheses: H01, H02, ..., H0K; S = number of significant results (p < α). Pr(S ≥ 1 | H01 ∩ H02 ∩ ... ∩ H0K = H0) = 1 − Pr(S = 0 | H0) = 1 − (1 − α)^K
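A one-line check of the formula above in Python, showing how quickly the family-wise error rate grows with the number of independent tests K.

alpha = 0.05
for k in (1, 2, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** k         # probability of at least one false positive
    print(f"K = {k:2d}  ->  {fwer:.3f}")
# K = 1 -> 0.050, K = 5 -> 0.226, K = 20 -> 0.642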

69 Some examples

70 Multiplicity: Bonferroni correction (simplified version). K tests with an overall significance level of α: each test can be tested at the α/K level. Example: 5 independent tests, global level of significance = 5%, so each test should be tested at the 1% level (5% / 5 = 1%).
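A minimal sketch applying the simplified Bonferroni rule to a list of hypothetical p-values.

p_values = [0.003, 0.012, 0.021, 0.040, 0.30]   # hypothetical results of 5 independent tests
alpha = 0.05
threshold = alpha / len(p_values)               # 0.05 / 5 = 0.01, as in the example above
for p in p_values:
    print(p, "significant" if p < threshold else "not significant")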

71 71 But this is the simplified version for the general public

72 Cautionary example. RCT in rheumatoid arthritis (Basic Clin Med 1981; 15: 445). Several endpoints, repeated at various time points and in various subdivisions, yielded about 850 comparisons; 48 of these gave p-values < 0.05. But we would expect 5% of 850 = 42.5 significant results by chance alone, so finding 48 is not very impressive.

73 Some strategies to cope with multiple contrasts

74 Handling Multiplicity in Variables. Scenario 1: One Primary Variable – Identify one primary variable; the other variables are secondary – The trial is positive if and only if the primary variable shows significant (p < 0.05), positive results

75 75

76 Handling Multiplicity in Variables. Scenario 2: Divide the Type I Error – Identify two (or more) co-primary variables – Divide the 0.05 experiment-wise Type I error over these co-primary variables, e.g., 0.04 for the 1st and 0.01 for the 2nd co-primary variable – The trial is positive if at least one of the co-primary variables shows significant, positive results

77 Handling Multiplicity in Variables. Scenario 3: Sequentially Rejective Procedure – Identify n co-primary variables, e.g., n = 3 – Order the obtained p-values: interpret the variable with the highest p-value at the 0.05 level; if significant, then interpret the variable with the 2nd highest p-value at the 0.05/2 level; if positive, then interpret the variable with the smallest p-value at the 0.05/3 level.
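One common formalisation of this scheme is a Hochberg-type step-up procedure; the sketch below is my reading of the slide (ordering the p-values from largest to smallest and comparing them with α/1, α/2, α/3, …), not necessarily the exact procedure the author has in mind, and the p-values are invented.

def step_up(p_values, alpha=0.05):
    ordered = sorted(p_values, reverse=True)        # largest p-value first
    for i, p in enumerate(ordered, start=1):
        if p <= alpha / i:                          # compared with alpha, alpha/2, alpha/3, ...
            return [q for q in p_values if q <= p]  # this and all smaller p-values are significant
    return []

print(step_up([0.04, 0.03, 0.008]))   # invented p-values for 3 co-primary variables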

78 Handling Multiplicity in Variables. Scenario 4: Hierarchy – Pre-specify a hierarchy among the n co-primary variables – All are tested at the same level: interpret the 1st variable at the 0.05 level; if significant, then interpret the 2nd variable at the 0.05 level; if positive, then interpret the 3rd variable at the 0.05 level, … The test procedure stops when a test is not significant. – The trial is positive if the first co-primary variable shows a significant, positive result
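A minimal sketch of the fixed-sequence (hierarchical) procedure of Scenario 4: each hypothesis is tested at the full 0.05 level in the pre-specified order, and testing stops at the first non-significant result. The endpoint names and p-values are hypothetical.

def hierarchical_test(ordered_hypotheses, alpha=0.05):
    significant = []
    for name, p in ordered_hypotheses:   # pre-specified order of testing
        if p < alpha:
            significant.append(name)
        else:
            break                        # stop: nothing further down the hierarchy can be claimed
    return significant

hierarchy = [("primary endpoint", 0.01), ("key secondary", 0.03), ("other secondary", 0.20)]
print(hierarchical_test(hierarchy))      # ['primary endpoint', 'key secondary']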

79 Role of Secondary Variables. Claims based on secondary variables can be made if and only if – the primary variable shows significant results, and – the comparisons related to the secondary variables are also protected under the same Type I error rate as the primary variable. Similar procedures to those already discussed can be used to protect the Type I error.

80 Handling Multiplicity in Treatments. Similar procedures to those used to handle multiplicity in variables. Additional procedures are available, mainly geared to very specific settings of the statistical hypotheses: Dunnett, Scheffé, REGW, Williams, …

81 81 SUBGROUPS

82 82 Subgroups Indiscriminate subgroup analyses pose serious multiplicity concerns. Problems reverberate throughout the medical literature. Even after many warnings, some investigators doggedly persist in undertaking excessive subgroup analyses. Lancet 2000; 355: 1033–34 Lancet 2005; 365: 1657–61

83 CONFOUNDING & INTERACTION 83

84 Confounding (schematic example): crude difference d = 6%; within non-smokers and within smokers, d = 0%.

85 Confounding A situation in which a measure of the effect of an exposure on risk is distorted because of the association of exposure with other factor(s) that influence the outcome under study. Criteria for confounding – Factor is associated with exposure – Factor is associated with disease in the absence of exposure – Factor is not in the causal path between exposure and outcome 85

86 Confounding (diagram: exposure → outcome, with a third variable). To be a confounding factor, two conditions must be met: it must be associated with the exposure, without being a consequence of the exposure; and it must be associated with the outcome, independently of the exposure (i.e. not an intermediary).

87 Interaction (schematic example): overall d = 5%; age < 45 years: d = 0.7%; age ≥ 45 years: d = 11.5%.

88 Interaction & Subgroups. ISIS-2: vascular death by star sign (Lancet 1988; 2: 349–60). Gemini/Libra: aspirin 11.1% vs placebo 10.2% (d = −0.9). Other star signs: aspirin 9.0% vs placebo 12.1% (d = 3.1, p < 0.0001). Interaction p = …

89 89 Changes from ISIS-2 results Lancet 2005; 365: 1657–61

90 90 Simpson’s Paradox

91 91 Simpson’s Paradox cont.

92 92 “The answer to a randomized controlled trial that does not confirm one’s beliefs is not the conduct of several subanalyses until one can see what one believes. Rather, the answer is to re- examine one’s beliefs carefully.” – BMJ 1999; 318: 1008–09.

93 93 Lancet 2005; 365: 1657–61

94 The question is NOT: 'Is the treatment effect in this subgroup statistically significantly different from zero?' BUT: 'Are there any differences in the treatment effect between the various subgroups?' The correct statistical procedures are a test of heterogeneity or a test for interaction.
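As a sketch of what a test for interaction can look like in practice, the following fits a logistic model with a treatment-by-subgroup term to simulated data; the single interaction p-value is the quantity of interest, rather than separate per-subgroup tests. Data, variable names and effect sizes are all invented.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),
    "subgroup": rng.integers(0, 2, n),
})
# simulate a binary outcome whose treatment effect does NOT depend on the subgroup
linpred = -2 + 0.5 * df["treatment"] + 0.2 * df["subgroup"]
df["event"] = (rng.random(n) < 1 / (1 + np.exp(-linpred))).astype(int)

model = smf.logit("event ~ treatment * subgroup", data=df).fit(disp=0)
print(model.pvalues["treatment:subgroup"])   # the interaction p-value is the quantity of interest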

95 Subgroups. Recommendations: 1) Examine the global effect. 2) Test for the interaction. 3) Plan α adjustments for confirmatory analyses. 4) Some points which increase the credibility: pre-specification; biological plausibility.

96 96 Lancet 2005; 365: 176–86

97 HOW TO CONTROL FOR CONFOUNDERS? In study design… – RESTRICTION of subjects according to potential confounders (i.e. simply don't include the confounder in the study) – MATCHING subjects on potential confounders, thus assuring an even distribution among study groups – RANDOM ALLOCATION of subjects to study groups, to attempt to even out unknown confounders. In data analysis… – RESTRICTION is still possible at the analysis stage, but it means throwing away data – Implement a MATCHED DESIGN after the data have been collected (frequency or group matching) – STRATIFIED ANALYSIS to control for confounders – MODEL FITTING using adjustment techniques
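As an illustration of the "stratified analysis" option above, here is a minimal Mantel-Haenszel pooled odds ratio computed across strata of a confounder; the 2×2 counts are invented.

import numpy as np

# each stratum of the confounder: [[exposed events, exposed non-events],
#                                  [unexposed events, unexposed non-events]]
strata = [
    np.array([[20, 80], [10, 90]]),     # stratum 1 (invented counts)
    np.array([[5, 45], [15, 135]]),     # stratum 2 (invented counts)
]

num = sum(t[0, 0] * t[1, 1] / t.sum() for t in strata)
den = sum(t[0, 1] * t[1, 0] / t.sum() for t in strata)
print(f"Mantel-Haenszel pooled OR = {num / den:.2f}")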

98 98 MULTIPLE INSPECTIONS

99 Interim analyses in the CDP (figure): Z value by month of follow-up (month 0 = March 1966, month 100 = July 1974). Coronary Drug Project Mortality Surveillance. Circulation. 1973;47:I-1

100 100 Lancet 2005; 365: 1657–61

101 101 Sequential designs 1) Sample size re-estimation 2) Group Sequential Methods 3) Alpha (Beta) Spending Functions 4) Repeated Confidence Intervals 5) Stochastic Curtailment 6) Bayesian Methods 7) Likelihood based Methods
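As a sketch of one of these approaches, the Lan-DeMets O'Brien-Fleming-type alpha-spending function α*(t) = 2[1 − Φ(z_{1−α/2}/√t)] can be evaluated at a few information fractions t; the choice of four equally spaced looks is only an example.

from scipy.stats import norm

def obf_spending(t, alpha=0.05):
    # cumulative alpha spent at information fraction t
    return 2 * (1 - norm.cdf(norm.ppf(1 - alpha / 2) / t ** 0.5))

for t in (0.25, 0.5, 0.75, 1.0):        # e.g. four equally spaced looks
    print(f"t = {t:.2f}  alpha spent = {obf_spending(t):.4f}")
# very little alpha is spent at early looks; at t = 1 the full 0.05 is available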

102 Schematic: analyses not suitable (recruitment relative to total study time)?

103 Schematic: suitable analyses (recruitment relative to total study time).

104 Group sequential methods – Pocock (1977): repeated significance testing; K = maximum number of inspections (looks); K fixed a priori; analysis with classical statistical tests (χ², t-test, ...).

105 105 Group Sequential Methods

106 Two-sided (bilateral) triangular design

107 107 CONCLUSION

108 108

109 109

110 110

111 111 The role of statistics “Thus statistical methods are no substitute for common sense and objectivity. They should never aim to confuse the reader, but instead should be a major contributor to the clarity of a scientific argument.” The role of statistics. Pocock SJ Br J Psychiat 1980; 137:

112 112 Password: stats_rct

113 BACK-UP 113

114 114 RANDOMIZATION & COVARIATES

115 115

116 Adjustment. The objective should not be to compensate for imbalance (randomisation already handles this) but to improve precision. Avoid adjusting for post-randomisation variables. In RCTs, never use the widespread strategy of "adjusting for any baseline variable that is significant (at the 5% or 10% level)".

117 Testing for "baseline homogeneity". In a randomised trial, all observed baseline differences are known with certainty to be due to chance. We must not test for baseline homogeneity: there is no alternative hypothesis whose truth could be supported by such a test, and even if a difference is significant, the estimator is still unbiased. Balance decreases the variance and increases the power; it has no effect on the type I error.

118 Stratification (a priori). We may wish to have treatment groups balanced with respect to prognostic or risk factors (covariates). For large studies, randomisation "tends" to give balance; for smaller studies a better guarantee may be needed. Useful only to a limited extent (especially for small trials), and avoid too many variables (i.e. many empty or partly filled strata).

119 Observed imbalance… NEVER justifies post-hoc adjustment: – Randomisation is more important – The treatment effect is unbiased without adjustment (randomisation) – The type I error level already accounts for "chance error" – Post-hoc adjustments are data-driven analyses – Multiplicity issues: allowing a post-hoc adjustment increases the type I error.

120 120 Adjusted Analyses ‘ When the potential value of an adjustment is in doubt, it is often advisable to nominate the unadjusted analysis as the one for primary attention, the adjusted analysis being supportive.’

121 Adjustment for covariates: define it a priori. The appearance of baseline imbalances does NOT justify adjustment per se: – Randomisation carries more weight – Danger of post-hoc analyses – Multiplicity. As a general strategy, adjusting a priori for baseline variables that are significant (e.g. p < 0.1 or p < 0.05) is NOT valid.

122 122 MISSING DATA

123 Example (figure): LOCF & linear extrapolation – ADAS-Cog (higher = worse) vs time (months); LOCF vs linear regression; bias.

124 Example (figure): early drop-out due to an AE – ADAS-Cog (higher = worse) vs time (months), placebo vs active; bias favours the active arm.

125 Example (figure): early drop-out due to lack of efficacy – ADAS-Cog (higher = worse) vs time (months), placebo vs active; bias favours placebo.

126 Drop-outs and missing data (schematic): after randomisation (RND), arms A and B drop out with different frequencies between baseline, visit 1, visit 2 and the last visit.

127 Drop-outs and missing data (schematic): after randomisation (RND), arms A and B drop out with different timing between baseline, visit 1, visit 2 and the last visit.

128 Missing data and incorrect use of analysis populations (1). Design: surgery vs medical treatment in bilateral carotid stenosis (Sackett et al., 1985). Primary variable: number of patients with TIA, stroke or death. Patient disposition: 167 patients randomised (surgical treatment: 94; medical treatment: 73). Patients who did not complete the study because of a stroke during the initial hospitalisation phase: surgical treatment: 15 patients; medical treatment: 1 patient.

129 Missing data and incorrect use of analysis populations (2). Per-protocol (PP) population: patients who completed the study. First analysis performed: surgical treatment: 43 / (94 − 15) = 43 / 79 = 54%; medical treatment: 53 / (73 − 1) = 53 / 72 = 74%; risk reduction: 27%, p = 0.02.

130 Missing data and incorrect use of analysis populations (3). The definitive analysis: intention-to-treat (ITT) population: all randomised patients. Analysis: surgical treatment: 58 / 94 = 62%; medical treatment: 54 / 73 = 74%; risk reduction: 18%, p = 0.09 (PP: 27%, p = 0.02). Conclusions: the correct analysis population is the ITT population; surgical treatment has not been shown to be significantly superior to medical treatment.

131 Handling of MD. Methods for imputation: many techniques; no gold standard for every situation; in principle, all methods may be valid, from simple to more complex (from LOCF to multiple imputation; worst case, "mean" methods, multiple imputation), but their appropriateness has to be justified. Statistical approaches less sensitive to MD: mixed models; survival models. These assume no relationship between treatment and the missing outcome, and generally this cannot be assumed.

132 Handling of MD 132

133 Relationship of MD with 1) Treatment 2) Outcome 133

134 134

135 135

136 136

137 137

138 138

139 139

140 140

141 Handling of MD 141

142 Best way to deal with Missing Data: Don’t have any!!! Methods for imputation: – Many techniques – No gold standard for every situation – In principle, “almost any method may be valid”: =>But their appropriateness has to be justified 142

143 Handling of MD Avoidance of missingness: – In the design and conduct all efforts should be directed towards minimising the amount of missing data likely to occur. – Despite these efforts some missing values will generally be expected. The way these missing observations are handled may substantially affect the conclusions of the study. 143

144 Statistical framework: the applicability of methods can be classified according to the missingness-generating mechanism (Rubin, 1976): – missing completely at random (MCAR) – missing at random (MAR) – missing not at random (MNAR)

145 MCAR - missing completely at random – Neither observed or unobserved outcomes are related to dropout MAR - missing at random – Unobserved outcomes are not related to dropout, they can be predicted from the observed data MNAR - missing not at random – Unobserved outcomes are related to dropout Missing Data Mechanisms 145

146 MAR methods MAR assumption – MD depends on the observed data – the behaviour of the post drop-out observations can be predicted with the observed data – It seems reasonable and it is not a strong assumption, at least a priori – In RCT, the reasons for withdrawal are known – Other assumptions seem stronger and more arbitrary 146

147 Options after withdrawal (figure): outcome (higher = worse) vs time (months).

148 However… It is reasonable to consider that the treatment effect will somehow cease or attenuate after withdrawal. If there is a good response, MAR will not "predict" a bad response, so the MAR assumption is not suitable for early drop-outs due to safety issues. In this context MAR seems likely to be anti-conservative.

149 The main analysis: what should it reflect? A) The "pure" treatment effect: estimation using the "on treatment" effect after withdrawal; ignore effects (changes) after treatment discontinuation; does not mix up efficacy and safety. B) The expected treatment effect under "usual clinical practice" conditions.

150 General Strategies Complete-case analysis “Weighting methods” & Dummy variable/category Imputation methods – Single Imputation / Multiple Imputation Analysing data as incomplete MNAR methods Other methods 150

151 Complete-case analysis a.k.a. Available Data Only (ADO) “Case deletion”: – Listwise deletion (a.k.a. complete-case analysis): delete all cases with missing value on any of the variables in the analysis. Only use complete cases. – Pairwise deletion (a.k.a. available-case analysis) use all available cases for computation of any sample moment Only OK if missing data are MCAR (very strong assumption) – Parameter estimates unbiased – Standard errors appropriate? But, can result in substantial loss of statistical power 151
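A small pandas sketch of listwise versus pairwise deletion on a toy data frame with missing values; the data are invented.

import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, np.nan, 4.0, 5.0],
                   "y": [2.1, np.nan, 3.0, 4.2, 5.1]})

complete_cases = df.dropna()            # listwise deletion: keep only rows with no missing value
print(len(complete_cases), "of", len(df), "cases kept")

# pairwise deletion: each statistic uses all cases available for that pair of variables
print(df.corr())                        # pandas excludes missing pairs automatically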

152 Complete-case analysis Complete case analysis: – Bias, power and variability – Not generally appropriate. – Exceptions: – Exploratory studies, especially in the initial phases of drug development. – Secondary supportive analysis in confirmatory trials (robustness) 152

153 General Strategies Complete-case analysis “Weighting methods” & Dummy variable/category Imputation methods – Single Imputation / Multiple Imputation Analysing data as incomplete MNAR methods Other methods 153

154 “Weighting methods” & Dummy variable/category “ Weighting methods”: – To construct weights for incomplete/under- represented cases – Sometimes considered as a form of imputation Dummy variable/category adjustment – Cohen & Cohen (1985); produces biased coefficient estimates (see Jones’ 1996 JASA article) Utility: observational studies; exploratory analyses 154

155 General Strategies Complete-case analysis “Weighting methods” & Dummy variable/category Imputation methods – Single Imputation / Multiple Imputation Analysing data as incomplete MNAR methods Other methods 155

156 Single Imputation Substitute a value for each missing value. Some of the ways to choose this value: – Mean Estimation Replace missing data with the mean of non-missing values. – Class Imputation methods Stratify and sort by key covariates, replace missing data from another record in the same strata. – Predict missing values from Regression Impute each independent variable on the basis of other independent variables in model. – LOCF / BOCF – Other single imputation methods: Rank/Score based methods Worst (best) case EM estimation 156
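A toy sketch of two of the single-imputation options listed above (visit-mean imputation and LOCF), on a small one-row-per-patient, one-column-per-visit layout; the values are invented.

import numpy as np
import pandas as pd

visits = pd.DataFrame({"v1": [10.0, 12.0, 11.0],
                       "v2": [11.0, np.nan, 12.0],
                       "v3": [np.nan, np.nan, 13.0]})   # one row per patient, one column per visit

mean_imputed = visits.fillna(visits.mean())   # replace each missing value with that visit's mean
locf_imputed = visits.ffill(axis=1)           # LOCF: carry the last observed value forward
print(locf_imputed)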

157 Mean imputation (scatterplots from Joe Schafer's website).

158 Regression methods 158

159 Imputation methods: LOCF and variants. Bias depends on the amount and timing of drop-outs. E.g., if the condition under study has a worsening course: conservative when drop-outs are due to lack of efficacy in the control group; anticonservative when drop-outs are due to intolerance in the test group. Use only under the MCAR assumption and if there are no trends over time. BOCF is useful in some cases: e.g., in a chronic pain trial it is reasonable to assume that when a patient withdraws and treatment is stopped, pain levels return to baseline.

160 Example (figure): LOCF & linear extrapolation – ADAS-Cog (higher = worse) vs time (months); LOCF vs linear regression; bias.

161 Example (figure): early drop-out due to an AE – ADAS-Cog (higher = worse) vs time (months), placebo vs active; bias favours the active arm.

162 Example (figure): early drop-out due to lack of efficacy – ADAS-Cog (higher = worse) vs time (months), placebo vs active; bias favours placebo.

163 Example of interpolation / regression imputation (figure): ADAS-Cog vs time (months).

164 Single Imputation Substitute a value for each missing value. Some of the ways to choose this value: – Mean Estimation – Class Imputation methods – Predict missing values from Regression – LOCF / BOCF – Other single imputation methods: Rank/Score based methods Worst (best) case 164

165 Single Imputation: Pros and Cons. Advantages: allows standard complete-data methods of analysis to be used; incorporates the data collector's knowledge. Disadvantages: inferences based on the imputed data set might be too sharp; correlations can be biased.

166 General Strategies Complete-case analysis “Weighting methods” & Dummy variable/category Imputation methods – Single Imputation / Multiple Imputation Analysing data as incomplete MNAR methods Other methods 166

167 Analysing data as incomplete. Direct estimation: GEE analysis; likelihood methods; Bayesian estimation with Metropolis-Hastings or Markov chain Monte Carlo. NMAR procedures (usually extensions of one of these approaches). Time-to-event variables.

168 Analysing data as incomplete For continuous responses: – mixed-effect models for repeated measures, MMRM For categorical responses and count data: – marginal (e.g. generalized estimating equations, GEE) – random-effects (e.g., generalized linear mixed models, GLMM) MD is not imputed Information is borrowed from cases where the information is available MAR assumption 168
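A hedged sketch of the "analyse the data as incomplete" idea for a continuous response: a mixed-effects model fitted to simulated long-format data with some visits removed. A random-intercept model is used here as a simple stand-in for a full MMRM (which would use an unstructured covariance); all data and variable names are invented.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
patient = np.repeat(np.arange(60), 3)                     # 60 patients, 3 visits each
visit = np.tile([1, 2, 3], 60)
treatment = np.repeat(rng.integers(0, 2, 60), 3)
score = 10 + 0.5 * visit + 1.0 * treatment * visit + rng.normal(0, 1, size=180)
df = pd.DataFrame({"patient": patient, "visit": visit,
                   "treatment": treatment, "score": score})
df = df.drop(df.sample(frac=0.15, random_state=0).index)  # make ~15% of visits missing

# all observed visits contribute; the missing ones are not imputed (valid under MAR)
model = smf.mixedlm("score ~ visit * treatment", data=df, groups=df["patient"])
print(model.fit().params)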

169 Analysing data as incomplete: time-to-event analysis. When the outcome measure is time to event, survival models which take censored observations into account are often used. Many standard survival methods assume that there is no relationship between the response and the missingness. Violations of this assumption could lead to biased results, especially when data are missing due to withdrawal.

170 General Strategies Complete-case analysis “Weighting methods” & Dummy variable/category Imputation methods – Single Imputation / Multiple Imputation Analysing data as incomplete Other methods 170

171 3 Steps in Multiple Imputation (MI) 1.Create imputations (>1 for each missing value) 2.Analyze the imputed datasets 3.Combine the results 171
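A deliberately simple sketch of the three steps, pooling with Rubin's rules; the "analysis" is just the mean of one variable and the imputation model (normal draws around the observed mean and SD) is crude, purely to show the mechanics. Real applications would use a proper imputation model (e.g. chained equations).

import numpy as np

rng = np.random.default_rng(3)
y = np.array([4.1, 5.0, np.nan, 4.6, np.nan, 5.3, 4.8])   # invented data with missing values
obs = y[~np.isnan(y)]

m = 20
estimates, variances = [], []
for _ in range(m):
    # Step 1: create an imputed dataset (random draws, so uncertainty is propagated)
    filled = y.copy()
    filled[np.isnan(y)] = rng.normal(obs.mean(), obs.std(ddof=1), size=np.isnan(y).sum())
    # Step 2: analyse each imputed dataset (here: estimate a mean and its variance)
    estimates.append(filled.mean())
    variances.append(filled.var(ddof=1) / len(filled))

# Step 3: combine the results with Rubin's rules
q_bar = np.mean(estimates)                  # pooled estimate
u_bar = np.mean(variances)                  # within-imputation variance
b = np.var(estimates, ddof=1)               # between-imputation variance
total_se = (u_bar + (1 + 1 / m) * b) ** 0.5
print(f"pooled mean = {q_bar:.2f}, pooled SE = {total_se:.2f}")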

172 172 Multiple Imputation

173 173 Multiple Imputation

174 Advantages of MI. By imputing more than one value, the uncertainty about the missing data is introduced into the analysis; recombining the results gives efficient and unbiased estimates => correct inference.

175 General Strategies Complete-case analysis “Weighting methods” & Dummy variable/category Imputation methods – Single Imputation / Multiple Imputation Analysing data as incomplete MNAR methods Other methods 175

176 NMAR missing data: pattern-mixture models; selection models. Other: auxiliary variables (can alleviate NMAR bias if they correlate highly with the missing values); shared-parameter / joint models. Be extremely cautious in the interpretation!

177 Other methods. Retrieval of data after withdrawal: assessments may be affected by external treatments, but reflect clinical practice; balance the possible influence of external treatments after withdrawal against the possible bias due to imputation or direct estimation; not biased when there are no effective treatments in the particular setting. Responder analysis: patients who drop out for reasons likely to be treatment related (such as lack of efficacy or safety issues) are considered non-responders.

178 Definition of the different analysis populations of a study

179 Objective: to evaluate the efficacy of a weight-loss programme versus usual advice. Design: randomised clinical trial. Candidates: 790; obese: 320. Intervention group: 161; control group: 159. Refusals: 59; spontaneous requests: 54. Completed: 102 (intervention) and 105 (control).

180 Intervention group: 161; control group: 159. Refusals: 59; spontaneous requests: 54. Completed: 102 and 105.

181 "Statistical methods are no substitute for common sense and objectivity. They should never aim to confuse the reader, but instead should be a major contributor to the clarity of a scientific argument." SJ Pocock. Br J Psychiat 1980; 137:

182 Does statistics complicate the advancement of knowledge?

183 Multivariate analysis

184 Main advantages: adjustment for poorly distributed variables; adjustment for different baseline values; multivariate significance tests.

185 Multivariate analysis: Simpson's paradox (1951). In 1973, men and women applied for admission to the University of Berkeley; 44% of the men and 30% of the women were admitted. The federal government accused the University of Berkeley of sex discrimination.

186 Multivariate analysis: Simpson's paradox (1951). Application and admission rates in the 6 main departments. OR for men = 1.54: a person is 1.54 times more likely to be admitted if he is a man.

187 Multivariate analysis: Simpson's paradox (1951). In most departments women had higher admission rates than men (A, B, D, F); the exceptions were small (C, E). Where, then, does the impression of discrimination come from?

188 Multivariate analysis: Simpson's paradox (1951). Note that departments A and B are very easy to get into and many men apply to them, whereas women tend to apply to department F, which is very hard to get into.

