Two-stage individual participant data meta-analysis and flexible forest plots David Fisher MRC Clinical Trials Unit Hub for Trials Methodology Research.

1 Two-stage individual participant data meta-analysis and flexible forest plots
David Fisher, MRC Clinical Trials Unit Hub for Trials Methodology Research at UCL (df@ctu.mrc.ac.uk)
2013 UK Stata Users Group Meeting, Cass Business School, London

2 Outline of presentation
- Introduction to individual patient data (IPD) meta-analysis (MA)
    - IPD vs aggregate-data (AD) MA
    - “One-stage” vs “two-stage” IPD MA
- The ipdmetan command
    - Basic use; comparison with metan
    - Covariate interactions
    - Combining AD with IPD
    - Advanced syntax
- The forestplot command
    - Interface with ipdmetan
    - Stand-alone use and “stacking”
- Summary and conclusion

3 Introduction to IPD meta-analysis
Meta-analysis (MA):
- Use statistical methods to combine results of “similar” trials to give a single estimate of effect
- Increase power & precision
- Assess whether treatment effects are similar across trials (heterogeneity)
Aggregate data (AD) vs IPD:
- “Traditional” MAs gather results from publications
    - Aggregated across all patients in the trial; nothing is known of individual patients
- IPD MAs gather raw data from trial investigators
    - Ensures all relevant patients are included
    - Ensures similar analysis across all trials
    - Allows more complex analysis, e.g. patient-level interactions

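4 “One-stage” IPD MA
Consider a linear regression (extension to GLMs or time-to-event regressions is straightforward).
For a one-stage IPD MA (i = trial, j = patient), with outcome $y_{ij}$ and treatment indicator $x_{ij}$:
    $y_{ij} = \alpha_i + (\beta + u_i)\,x_{ij} + \varepsilon_{ij}$
where
- $\alpha_i$ = trial identifiers (one intercept per trial)
- $\beta$ = overall treatment effect, estimated across all trials i
- $u_i$ = optional random effect on the treatment effect for trial i (omitted in the fixed-effects model)
Examples in Stata:
- Fixed effects: regress y x i.trial
- Random effects: xtmixed y x i.trial || trial: x, nocons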

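5 “Two-stage” IPD MA
For a two-stage IPD MA, first fit the model separately within each trial:
    $y_{1j} = \alpha_1 + \beta_1 x_{1j} + \varepsilon_{1j}$    for trial 1
    …
    $y_{ij} = \alpha_i + \beta_i x_{ij} + \varepsilon_{ij}$    for trial i
    …
Then pool the trial estimates (fixed-effects, inverse-variance weighting):
    $\hat\beta = \sum_i w_i \hat\beta_i \,/\, \sum_i w_i$    and    $\mathrm{SE}(\hat\beta) = 1 / \sqrt{\sum_i w_i}$
where
    $w_i = 1 / \mathrm{SE}(\hat\beta_i)^2$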
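The two stages can be sketched in a few lines of code. The example below is an illustration only, written in Python so that it runs without Stata or ipdmetan; the toy data, the trial names "A" and "B", and the helper functions `ols_slope` and `pool_fixed` are all hypothetical. Stage 1 fits a simple linear regression within each trial; stage 2 combines the slopes with fixed-effect inverse-variance weights.

```python
import math

def ols_slope(x, y):
    """Stage 1: slope and its standard error for the regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

def pool_fixed(estimates, ses):
    """Stage 2: fixed-effect inverse-variance pooled estimate and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / math.sqrt(sum(weights))

# Hypothetical toy IPD: trial -> (treatment indicator x, continuous outcome y)
trials = {
    "A": ([0, 0, 0, 1, 1, 1], [1.0, 1.2, 0.8, 2.1, 1.9, 2.3]),
    "B": ([0, 0, 1, 1], [0.5, 0.9, 1.4, 1.8]),
}

# Stage 1: one regression per trial
stage1 = {name: ols_slope(x, y) for name, (x, y) in trials.items()}

# Stage 2: pool the per-trial treatment effects
betas = [beta for beta, _ in stage1.values()]
ses = [se for _, se in stage1.values()]
pooled, pooled_se = pool_fixed(betas, ses)

for name, (beta, se) in stage1.items():
    print(f"trial {name}: beta = {beta:.3f}, SE = {se:.3f}")
print(f"pooled:  beta = {pooled:.3f}, SE = {pooled_se:.3f}")
```

Because stage 2 only needs an estimate and a standard error per trial, the same pooling step applies to any estimation command, which is the idea behind the prefix syntax of ipdmetan.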

6 Treatment-covariate interactions
- Assessment of patient-level covariate interactions is a great advantage of IPD
- Arguably best done with “one-stage”
    - Main effects & interactions (& correlations) estimated simultaneously
- But basic analysis also possible with “two-stage”
    - Relative effect (interaction coefficient) only
    - Same approach (inverse-variance) as for main effects
    - Ensures no estimation bias from between-trial effects
    - Can be presented in a forest plot, with assessment of heterogeneity etc.
    - Discussed in a published paper (Fisher 2011)

7 “One-stage” vs “two-stage”
One-stage
    Pros:
    - All coeffs & correls estimated simultaneously
    - Flexible & extendable model structure
    Cons:
    - Requires more statistical expertise
    - Challenging in certain situations, e.g. random-effects with time-to-event data
    - Not a natural fit with forest plots
Two-stage
    Pros:
    - Natural extension of AD MA
    - Easily presentable in forest plots
    - Applicable to any set of effect estimates and SEs (incl. interactions)
    - Negligible difference to one-stage in most common scenarios
    Cons:
    - Only a single estimate can be pooled, which limits complexity (e.g. interactions)
    - Theoretically inferior in (at least) some scenarios

8 Example data
IPD MA of randomised trials of post-operative radiotherapy (PORT) in non-small cell lung cancer
- Trial ID (k=11)
- Patient ID (n=2343)
- Treatment arm
- Outcome is overall survival (time to death from any cause), subject to censoring
    - Time to event (from randomisation)
    - Event type (death or censoring)
- Certain covariate measurements also available, not necessarily for all trials or patients
    - Disease stage (factor, but treated as continuous)
    - (+ others)

9 ipdmetan syntax
Uses “prefix” command syntax:
    ipdmetan [exp_list], study(study_ID) [ipd_options ad(aggregate_data_options) forestplot(forest_plot_options)] : estimation_command ...
- ipdmetan options go after the comma, before the colon
- estimation_command and its options go after the colon
Example:
    ipdmetan, study(trialid) eform : stcox arm, strata(sex)
- Default is to pool coeffs from the first covariate in the estimation command, here arm (excluding baseline factor levels)

10 (Screen output)
Slide annotations: “Output style similar to metan or metaan”; “Variable label”

    Trials included: 11
    Patients included: 2342

    Meta-analysis pooling of main (treatment) effect estimate arm
    using Fixed-effects

    --------------------------------------------------------------------
    trial reference       |
    number                |    Effect    [95% Conf. Interval]   % Weight
    ----------------------+---------------------------------------------
    belgium               |     1.456      1.072     1.979        11.09
    EORTC 08861           |     1.643      0.913     2.956         3.02
    LILLE                 |     1.568      1.060     2.319         6.81
    ...                   |       ...
    ----------------------+---------------------------------------------
    Overall effect        |     1.178      1.064     1.305       100.00
    --------------------------------------------------------------------

    Test of overall effect = 1:  z = 3.153  p = 0.002

    Heterogeneity Measures
    ---------------------------------------------------
                   |     value      df     p-value
    ---------------+-----------------------------------
    Cochrane Q     |     15.88      10       0.103
    I² (%)         |     37.0%
    Modified H²    |     0.588
    tau²           |    0.0180
    ---------------------------------------------------

    I² = between-study variance (tau²) as a percentage of total variance
    Modified H² = ratio of tau² to typical within-study variance
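As a check on the heterogeneity table, the sketch below recomputes I² and the modified H² from the reported Q and its degrees of freedom, and the two-sided p-value from the reported z. It is written in Python for illustration and assumes the usual definitions I² = (Q − df)/Q and modified H² = (Q − df)/df; the slides do not state which formulas ipdmetan uses internally, and the function names are hypothetical.

```python
import math

def heterogeneity_from_q(q, df):
    """I-squared and modified H-squared from Cochran's Q (assumed definitions)."""
    i2 = max(0.0, (q - df) / q)    # proportion of total variance between studies
    h2m = max(0.0, (q - df) / df)  # tau-squared relative to typical within-study variance
    return i2, h2m

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Values copied from the screen output above
i2, h2m = heterogeneity_from_q(15.88, 10)
p = two_sided_p(3.153)

print(f"I2 = {100 * i2:.1f}%, modified H2 = {h2m:.3f}, p = {p:.3f}")
```

Under these assumed definitions the results round to the printed 37.0%, 0.588 and 0.002.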

11 Basic forest plot

12 Forest plot of covariate interactions
    ipdmetan, study(trialid) eform interaction keepall : stcox arm##c.stage
- Default is to pool coeffs from the first interaction term
Screen output (header):
    Trials included: 8
    Patients included: 1962
    Meta-analysis pooling of interaction effect estimate 1.arm#c.stage2 using Fixed-effects

13 Inclusion of aggregate data
I don’t have a separate aggregate dataset, so I will create one artificially from my IPD dataset.
    ** Generate artificial trial subgrouping
    . gen subgroup = inlist(trialid, 1, 8, 12, 15)
    . label define subgroup_ 0 "Trial group 1" 1 "Trial group 2"
    . label values subgroup subgroup_
    ** Run ipdmetan within one of the subgroups; save the dataset
    . qui ipdmetan, study(trialid) by(subgroup) nooverall nograph saving(subgroup1.dta) : stcox arm if subgroup==1, strata(sex)

14 (Aside: Contents of subgroup1.dta)
    _use  trialid  _labels         _ES   _seES    _lci    _uci   _wgt  _NN
    1     1        belgium       0.376   0.156   0.069   0.682  0.286  202
    1     8        EORTC 08861   0.496   0.300  -0.091   1.084  0.078  105
    1     12       LILLE         0.450   0.200   0.058   0.841  0.176  163
    1     15       GETCB 05CB86  0.362   0.123   0.120   0.603  0.460  539
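The saved dataset holds everything stage 2 needs: an effect estimate (_ES, here a log hazard ratio) and its standard error (_seES) per trial. The sketch below, in Python for illustration, applies fixed-effect inverse-variance pooling to the four rows above. It is not the ipdmetan code, and the variable names are hypothetical; it only shows the arithmetic that such a dataset supports.

```python
import math

# (label, _ES, _seES) copied from subgroup1.dta; _ES is on the log hazard ratio scale
rows = [
    ("belgium", 0.376, 0.156),
    ("EORTC 08861", 0.496, 0.300),
    ("LILLE", 0.450, 0.200),
    ("GETCB 05CB86", 0.362, 0.123),
]

Z975 = 1.959964  # 97.5th percentile of the standard normal distribution

# Inverse-variance weights, normalised to sum to 1 (compare with _wgt)
raw_weights = [1.0 / se ** 2 for _, _, se in rows]
total = sum(raw_weights)
norm_weights = [w / total for w in raw_weights]

# Pooled log hazard ratio, its standard error and 95% confidence interval
pooled = sum(w * es for w, (_, es, _) in zip(norm_weights, rows))
pooled_se = 1.0 / math.sqrt(total)
lci = pooled - Z975 * pooled_se
uci = pooled + Z975 * pooled_se

for (label, _, _), w in zip(rows, norm_weights):
    print(f"{label:<14s} weight = {w:.3f}")
print(f"pooled log HR = {pooled:.3f} ({lci:.3f}, {uci:.3f}); HR = {math.exp(pooled):.3f}")
```

Up to rounding of the tabulated standard errors, the weights agree with the _wgt column, and the pooled value agrees with the aggregate subtotal reported later in the talk (0.392 on the log scale, 1.479 as a hazard ratio).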

15 Inclusion of aggregate data: Syntax
    . ipdmetan, study(trialid) eform nooverall ad(subgroup1.dta, byad) : stcox arm if subgroup==0, strata(sex)
- nooverall: do not pool IPD and aggregate data together
- ad(...): aggregate data syntax
- “byad”: treat IPD & aggregate data as subgroups
- The estimation_command follows the colon

16 Inclusion of aggregate data: Screen output

    Trials included from IPD: 7
    Patients included: 1333
    Trials included from aggregate data: 4
    Patients included: 1009

    Pooling of main (treatment) effect estimate arm
    using Fixed-effects

    -------------------------------------------------------------------
    trial reference      |
    number               |    Effect    [95% Conf. Interval]   % Weight
    ---------------------+---------------------------------------------
    IPD                  |
      LCSG 773           |     1.123      0.827     1.526        11.13
      CAMS               |     1.029      0.768     1.378        12.20
      ...                |       ...
      Subgroup effect    |     1.021      0.896     1.163        61.25
    ---------------------+---------------------------------------------
    Aggregate            |
      belgium            |     1.456      1.072     1.979        11.09
      EORTC 08861        |     1.643      0.913     2.956         3.02
      ...                |       ...
      Subgroup effect    |     1.479      1.256     1.743        38.75
    -------------------------------------------------------------------

    Tests of effect size = 1:
    IPD          z = 0.305   p = 0.760
    Aggregate    z = 4.682   p = 0.000
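Aggregate data often arrive only as a published hazard ratio with a confidence interval. The sketch below (Python, for illustration; the function names are hypothetical) shows how such a result can be converted back to a log-scale estimate and standard error, assuming the interval is a symmetric 95% normal-theory interval on the log scale, and then tested against a hazard ratio of 1. It is applied to the "Aggregate" subgroup effect printed above.

```python
import math

Z975 = 1.959964  # 97.5th percentile of the standard normal distribution

def log_effect_from_ci(hr, lci, uci):
    """Log hazard ratio and SE, assuming a symmetric 95% CI on the log scale."""
    log_hr = math.log(hr)
    se = (math.log(uci) - math.log(lci)) / (2.0 * Z975)
    return log_hr, se

def z_test(log_hr, se):
    """z statistic and two-sided p-value for the null hypothesis HR = 1."""
    z = log_hr / se
    return z, math.erfc(abs(z) / math.sqrt(2.0))

# "Aggregate" subgroup effect copied from the screen output above
log_hr, se = log_effect_from_ci(1.479, 1.256, 1.743)
z, p = z_test(log_hr, se)

print(f"log HR = {log_hr:.3f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.3f}")
```

The recovered z statistic matches the reported 4.682 to about two decimal places; the small difference comes from the rounding of the printed hazard ratio and limits.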

17 Inclusion of aggregate data: Forest plot

18 Advanced syntax example: non “e-class” estimation command
    ipdmetan (u[1,1]/V[1,1]) (1/sqrt(V[1,1])), study(trialid) eform ad(subgroup1.dta, byad) lcols(evrate=_d %3.2f "Event rate") rcols(u[1,1] %5.2f "o-E(o)" V[1,1] %5.1f "V(o)") forest(nooverall nostats nowt) : sts test arm if subgroup==0, mat(u V)
- Effect estimate & SE are not taken from e(b); they must be specified manually (the two expressions in parentheses)
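The two expressions passed to ipdmetan are the log-rank (Peto-type) estimate of the log hazard ratio, (O − E)/V, and its standard error, 1/sqrt(V). The sketch below (Python, for illustration; `peto_log_hr` is a hypothetical helper) evaluates them for two trials, using the u[1,1] and V[1,1] values listed in the results dataset on slide 26.

```python
import math

Z975 = 1.959964  # 97.5th percentile of the standard normal distribution

def peto_log_hr(o_minus_e, v):
    """Log hazard ratio (O - E)/V and its standard error 1/sqrt(V)."""
    return o_minus_e / v, 1.0 / math.sqrt(v)

# (u[1,1], V[1,1]) per trial, copied from the results dataset on slide 26
trials = {"LCSG 773": (4.77, 41.0), "CAMS": (1.07, 44.9)}

results = {}
for name, (u, v) in trials.items():
    es, se = peto_log_hr(u, v)
    results[name] = (es, es - Z975 * se, es + Z975 * se)
    print(f"{name:<10s} log HR = {es:.3f} ({results[name][1]:.3f}, {results[name][2]:.3f})")
```

For both trials the estimate and interval agree with the _ES, _lci and _uci columns of that dataset to three decimal places.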

19 Advanced syntax example: columns of data in forestplot
    ipdmetan (u[1,1]/V[1,1]) (1/sqrt(V[1,1])), study(trialid) eform ad(subgroup1.dta, byad) lcols(evrate=_d %3.2f "Event rate") rcols(u[1,1] %5.2f "o-E(o)" V[1,1] %5.1f "V(o)") forest(nooverall nostats nowt) : sts test arm if subgroup==0, mat(u V)
- lcols(evrate=_d ...): mean of a variable currently in memory (note the user-assigned name, to match the varname in the aggregate dataset)
- rcols(u[1,1] ... V[1,1] ...): collect lists of returned stats

20 Advanced syntax example: Forest plot

21 Forest plot annotations:
- These vars do not appear in the aggregate dataset, so are not plotted
- Subtotal cannot be calculated for aggregate data

22 The forestplot command
- Does not perform any calculations/estimations; simply plots existing data as a forest plot
- Overall/subgroup estimates, spacings, labels, text columns etc. need to be created/arranged in advance
    - Ordering & spacing; marking of subgroup/overall estimates for plotting “diamonds”: _use
    - Principal left-hand data column (study IDs, heterogeneity etc.; string format): _labels
- This setup is done automatically by ipdmetan before passing to forestplot (but can also be done manually by the user)
- Multiple datasets can be passed to forestplot at once to create a single large “stacked” plot on a common x-axis

23 forestplot syntax
    forestplot [varlist] [if] [in] [, plot_options graph_options using_option]
- varlist = manually specify varnames to plot
- plot_options control the data plotting (within plot region)
- graph_options control the surroundings (outside plot region; graph region)
- using_option represents one or more options that allow suitable datasets (or parts of datasets) to be fed to forestplot, possibly with different plot_options, to form a single large forest plot on a single x-axis

24 using_option syntax
    using(filenamelist [if] [in] [, plot_options]) [using(filenamelist [if] [in] [, plot_options]) ...]
- filenamelist is a list of one or more Stata-format datasets
    - parts may be specified with [if] [in]
    - the same filename can appear more than once
    - the order of filenames determines placement in the graph
- Different plot_options may be specified for each using() option
    - To apply the same options to multiple files, place the files in a single filenamelist
    - To apply different options to each file, place each file in a separate using() option

25 plot_options syntax
- Based on metan syntax; options refer to different parts of the forest plot
- Most options appropriate to the underlying twoway plot type are acceptable, with some exceptions

    Option     Function                                   twoway plot type
    boxopt     Weighted boxes for study point estimates   scatter [aweight]
    pointopt   Points for study point estimates           scatter
    ciopt      Lines for confidence intervals             rspike, hor; pcarrow
    diamopt    Diamond for summary estimate               pcspike (x4)
    olineopt   Vertical line through summary estimate     rspike

26 Example forestplot dataset (“resultsset” from last ipdmetan example)
Columns _ES to _wgt hold the estimates, CIs and weights; evrate, u_1_1_, V_1_1_ and _NN are extra data columns.

    _use  _by  _study  _labels           _ES    _lci    _uci   _wgt  evrate  u_1_1_  V_1_1_   _NN
    0     1            IPD
    1     1    3       LCSG 773        0.116  -0.190   0.422  0.111    0.72    4.77    41.0
    1     1    5       CAMS            0.024  -0.269   0.316  0.121    0.58    1.07    44.9
    1     1    6       MRC LU11       -0.042  -0.296   0.213  0.160    0.78   -2.48    59.4
    1     1    9       SLOVENIA       -0.164  -0.660   0.332  0.042    0.85   -2.56    15.6
    1     1    14      GETCB 04CB86    0.157  -0.192   0.506  0.085    0.68    4.95    31.6
    1     1    13      ITALY          -0.341  -0.881   0.199  0.036    0.51   -4.50    13.2
    1     1    16      KOREA           0.136  -0.278   0.550  0.061    0.81    3.06    22.4
    3     1            Subtotal        0.019  -0.111   0.149  0.615    0.69    3.24   229.6
    4     1            (I-squared = 0.0%, p = 0.710)
    4     1
    0     2            Aggregate
    1     2    17      belgium         0.376   0.069   0.682  0.110    0.83                   202
    1     2    18      EORTC 08861     0.496  -0.091   1.084  0.030    0.43                   105
    1     2    19      LILLE           0.450   0.058   0.841  0.068    0.64                   163
    1     2    20      GETCB 05CB86    0.362   0.120   0.603  0.177    0.50                   539
    3     2            Subtotal        0.392   0.228   0.556  0.385                          1009
    4     2            (I-squared = 0.0%, p = 0.964)
    4     2
    4                  Heterogeneity between groups: p = 0.000
    5                  Overall         0.162   0.061   0.264  1.000                          1009
    4                  (I-squared = 38.4%, p = 0.093)

27 “Stacking” of forest plots
Imagine:
- the dataset on the previous slide is saved as ipdtest.dta
- we want IPD boxes to be red, and AD boxes to be green
We proceed as follows:
- Run forestplot with two using(...) options, one for each part of the plot, with the same filename
    - (Alternatively: run ipdmetan twice and save under different filenames)
- Specify our desired plot_options as suboptions to using()

28
    forestplot, using(ipdtest.dta if _by==1, boxopt(mcolor(red))) using(ipdtest.dta if _by==2, boxopt(mcolor(green))) lcols(evrate) rcols(u_1_1_ V_1_1_) nooverall nostats nowt

29 Summary and conclusion
- IPD is increasingly used, and its advantages widely accepted
- Large numbers of MA scientists use two-stage models for analysing IPD
- Currently only AD MA (e.g. metan) and one-stage IPD (e.g. xtmixed) commands exist in Stata
- ipdmetan is a universal command for two-stage IPD MA
- forestplot is a flexible forest plot command
    - does not carry out analysis itself, thus not restricted by it
    - may be useful outside the MA context (e.g. presenting trial subgroups)

30 Further information
Other related programs (all call forestplot by default):
- admetan: calls ipdmetan to analyse AD (direct alternative to metan)
- ipdover: fit model within series of subgroups
- petometan: perform meta-analysis of time-to-event data using the Peto (log-rank) method
SSC and Stata Journal article in near future

31 Thank you!
Questions, requests, bug reports: df@ctu.mrc.ac.uk
Thanks to:
- Jayne Tierney, Patrick Royston
- Ross Harris (author of metan) for advice & support
- Assorted colleagues for testing
Reference: Fisher D. J. et al. 2011. Journal of Clinical Epidemiology 64: 949-67

33-35 Screen output from slide 10, repeated with one annotation per slide:
- 33: Summary of analysis (check!)
- 34: List of trial estimates, CIs and weights; test of overall effect significance
- 35: Heterogeneity statistics

