Systematic Reviews of Diagnostic Studies


1 Systematic Reviews of Diagnostic Studies
Guides for appraisal Acknowledgements: Paul Glasziou, Jon Deeks, Madhukar Pai, Patrick Bossuyt, and Matthias Egger.

2 Information Overload
20,000 biomedical periodicals (6M articles)
17,000 biomedical books annually
30,000 recognized diseases
15,000 therapeutic agents (250/yr)
MEDLINE: 4,000 journals surveyed; 11,000,000 citations
1.27 million articles related to oncology
35,000 articles related to ear, nose, or throat surgery

3 What makes a Review “Systematic”?
Based on a clearly formulated question
Identifies relevant studies
Appraises quality of studies
Summarizes evidence by use of explicit methodology
Comments based on evidence gathered

4 Origin of Clinical Questions
Diagnosis: how to select and interpret diagnostic tests
Prognosis: how to anticipate the patient’s likely course
Therapy: how to select treatments that do more good than harm
Prevention: how to screen and reduce the risk for disease


6 Steps in a Systematic Review
Framing the Question (Q)
Identifying relevant publications (F)
Assessing study quality (A)
Summarising evidence and interpreting findings (S)

7 Step 1- Framing the Question (Q)
Clear, unambiguous, structured question
Questions formulated re: PPICO
Populations of interest
Prior test(s) (if appropriate)
Intervention
Comparisons (if appropriate)
Outcomes

8 Unstructured Question
Is cervicovaginal fetal fibronectin useful? For what? For whom? What is meant by “useful”?

9 Structured Question
Does a positive cervicovaginal fetal fibronectin test predict spontaneous preterm birth in asymptomatic women?
Test (Intervention): cervicovaginal fetal fibronectin test
Outcome: spontaneous preterm birth
Patient: asymptomatic women
Sometimes you have a comparison test as well

10 Step 2 – Identifying relevant publications (F)
Wide search of medical/scientific databases: Medline, Cochrane Reviews, Ovid
Relevance to focused question (PPICO): Population, Prior test, Intervention, Comparator, Outcome




14 Publication and reporting biases
Stages: all studies conducted; all studies published; grey literature; studies reviewed
Positive Results Bias
Grey Literature Bias
Time-Lag Bias
Language and Country Bias
Multiple Publication Bias
Selective Citation Bias
Database Indexing Bias
Selective Outcome Reporting Bias
Source: Health Technology Assessment, 2000; 4(10):1-115

15 Registered vs. Published Studies
Ovarian cancer chemotherapy: single v combined
Simes, J. Clin Oncol, 86, p1529

16 Search filters for diagnostic studies
No search filter: 39 studies retrieved
[Figure: studies plotted in ROC space, sensitivity against 1-specificity. Credit: Lucas Bachmann]

17 With search filter: 12 studies retrieved (27 missed)
[Figure: studies plotted in ROC space, sensitivity against 1-specificity. Credit: Lucas Bachmann]

18 Documenting & storing

19 Assessment of Study Quality (A)
Quality varies, therefore:
Standardized assessment (?blind*)
Group/rank by quality
Select a threshold, e.g. all prospective studies with blind reading of reference and index tests
* Assessment of quality blind to study outcome

20 Quality Score: Mammals example
Setting: in natural habitat (No = 0; Yes = 1)
Complete information: whole animals (No = 0; Yes = 1)
Level of evidence: photographs (No = 0; Yes = 1)

21 Exercise I: Study Quality

22 Exercise I: Study Quality
3 1 1 2 3 1 3 3 1 2 1 1 3 1 2

23 Assessing a Study of a Test (Jaeschke et al, JAMA, 1994, 271: 389-91)
Was an appropriate spectrum of patients included? (Spectrum Bias)
Were all patients subjected to a Gold Standard? (Verification Bias)
Was there an independent, "blind" comparison with a Gold Standard? (Observer Bias; Differential Reference Bias)
Were the methods described so you could repeat the test?

24 Diagnostic Accuracy Study: Basic Design
Series of patients Index test Reference standard Blinded cross-classification
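
The cross-classification in the basic design yields a 2x2 table of index test result against reference standard result. As a minimal sketch of how sensitivity and specificity are read off such a table (the class name and the counts below are hypothetical, chosen only for illustration):

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cross-classification of index test against reference standard."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative

    def sensitivity(self) -> float:
        # Proportion of reference-positive patients the index test detects
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative patients the index test clears
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, not taken from any study in this presentation
table = TwoByTwo(tp=45, fp=10, fn=5, tn=90)
print(table.sensitivity(), table.specificity())
```

Each bias on the following slides can be read as a distortion of one or more of these four cells.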

25 Spectrum Bias
Selected patients Index test Reference standard Blinded cross-classification

26 Verification Bias
Series of patients Index test Reference standard Blinded cross-classification

27 Differential Reference Bias
Series of patients Index test Ref. Std A Ref. Std. B Blinded cross-classification

28 Unblinded cross-classification
Observer Bias Series of patients Index test Reference standard Unblinded cross-classification

29 “Case-control” design
HF patients controls Index test Blinded cross-classification

30 Empirical Effects of Bias
Lijmer JG et al. JAMA 1999;282:

31 Step 4 – Summarising the Evidence (S)
Extracting data from trials Combining data – Meta analysis Does it make sense to combine?

32 What is a meta-analysis?
A way to calculate an average Estimates an ‘average’ or ‘common’ effect Improves the precision of an estimate by using all available data
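
The slide does not say which kind of average is meant. One common choice, assumed here, is a fixed-effect inverse-variance weighted average, in which more precise studies receive more weight. The function name and the numbers are illustrative only:

```python
def fixed_effect_pool(estimates, variances):
    """Inverse-variance weighted average of study estimates.

    Returns the pooled estimate and its variance. This is the
    fixed-effect model, an assumption made for this sketch.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_variance = 1.0 / total
    return pooled, pooled_variance


# Hypothetical study estimates (e.g. log odds ratios) and their variances
estimates = [1.0, 2.0, 3.0]
variances = [0.5, 0.25, 0.5]
pooled, pooled_variance = fixed_effect_pool(estimates, variances)
print(pooled, pooled_variance)
```

The pooled variance is smaller than the variance of any single study, which is the sense in which combining data improves precision.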

33 What is a meta-analysis?
Optional part of a systematic review
Diagram labels: Systematic reviews; Meta-analyses
Notes: A meta-analysis may be part of a systematic review. It may be worth asking participants for examples of when it is not appropriate to combine studies in a meta-analysis. Systematic reviews may include meta-analyses, but a meta-analysis may be done without systematically reviewing the studies; there are examples of this in journals, and such meta-analyses may therefore be biased. In the US the terms are used interchangeably, but this is not the case in the UK.

34 Summary ROC Meta-analytical Display

35 Threshold effects
Decreasing threshold increases sensitivity but decreases specificity
Increasing threshold increases specificity but decreases sensitivity
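
A small sketch of the threshold trade-off, assuming a test where higher values indicate disease and a result counts as positive when it is at or above the threshold. The measurements are made up for illustration:

```python
def sens_spec(diseased, healthy, threshold):
    """Sensitivity and specificity when 'value >= threshold' is positive."""
    sensitivity = sum(v >= threshold for v in diseased) / len(diseased)
    specificity = sum(v < threshold for v in healthy) / len(healthy)
    return sensitivity, specificity


# Hypothetical test values in diseased and non-diseased participants
diseased = [40, 55, 60, 70, 85]
healthy = [10, 20, 30, 45, 50]

for threshold in (25, 50, 65):
    sens, spec = sens_spec(diseased, healthy, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.1f}, specificity={spec:.1f}")
# threshold=25: sensitivity=1.0, specificity=0.4
# threshold=50: sensitivity=0.8, specificity=0.8
# threshold=65: sensitivity=0.4, specificity=1.0
```

Studies of the same test that use different thresholds will therefore report different sensitivity and specificity pairs even if the test itself performs identically.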

36 Accuracy effects
Over-estimation of accuracy, e.g. spectrum bias
Under-estimation of accuracy, e.g. poor reference standard

37 Spectrum effects
Variation in the diseased study participants
Variation in the non-diseased study participants

38 Methods of Meta-analysis
Separate pooling of sensitivity and specificity (and likelihood ratios)
Inappropriate when studies are highly heterogeneous
Underestimates accuracy if there is heterogeneity in threshold
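
To see why separate pooling can underestimate accuracy, consider two hypothetical studies of the same test that differ only in threshold. Both have the same diagnostic odds ratio, yet the point formed by averaging sensitivity and specificity separately has a lower one. The values and the unweighted averaging are assumptions for this sketch:

```python
def diagnostic_odds_ratio(sens, spec):
    """DOR = odds of a positive test in diseased / odds in non-diseased."""
    return (sens / (1 - sens)) * (spec / (1 - spec))


# Two hypothetical studies: same accuracy, different thresholds
studies = [(0.9, 0.5), (0.5, 0.9)]  # (sensitivity, specificity)

# Separate, unweighted pooling of sensitivity and of specificity
pooled_sens = sum(sens for sens, _ in studies) / len(studies)
pooled_spec = sum(spec for _, spec in studies) / len(studies)

for sens, spec in studies:
    print("study DOR: ", round(diagnostic_odds_ratio(sens, spec), 2))
print("pooled DOR:", round(diagnostic_odds_ratio(pooled_sens, pooled_spec), 2))
```

Each study has a DOR of about 9, while the separately pooled point (0.7, 0.7) has a DOR of about 5.4, so it lies below the curve that both studies share.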

39 Constant diagnostic odds ratio across thresholds
sensitivity  specificity  LR+    LR-   DOR = LR+/LR-
99%          71%          3.44   0.01  231
97%          86%          6.95   0.03
94%                       15.19  0.07
                          33.21  0.14
                          67.09  0.29
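
The relationship behind this table can be sketched as follows: for a fixed DOR, choosing a sensitivity determines the specificity, and LR+/LR- returns the same DOR at every operating point. The function names are illustrative, and unrounded results will differ slightly from the rounded figures on the slide:

```python
def likelihood_ratios(sens, spec):
    """Positive and negative likelihood ratios."""
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return lr_pos, lr_neg


def specificity_at(sens, dor):
    """Specificity implied by a sensitivity on a constant-DOR curve.

    From DOR = [sens / (1 - sens)] * [spec / (1 - spec)].
    """
    spec_odds = dor * (1 - sens) / sens
    return spec_odds / (1 + spec_odds)


DOR = 231  # the constant DOR shown on the slide
for sens in (0.99, 0.97, 0.94):
    spec = specificity_at(sens, DOR)
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"sens={sens:.2f} spec={spec:.2f} "
          f"LR+={lr_pos:.2f} LR-={lr_neg:.2f} DOR={lr_pos / lr_neg:.0f}")
```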

40 Methods of Meta-analysis
Creation of Summary ROC
DOR often reasonably consistent across studies
Deals with variation in threshold
Moses/Littenberg: allows for trends in DOR with threshold
Difficult to interpret a unique operating point
More advanced methods (HSROC, bivariate normal, random effects) estimate variability and uncertainty in values
Investigate why studies have different results
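
A minimal sketch of the Moses/Littenberg approach, assuming the usual formulation: for each study compute D = logit(TPR) - logit(FPR), which is the log DOR, and S = logit(TPR) + logit(FPR), a proxy for threshold, then fit D = a + b*S by unweighted least squares. A non-zero b is the trend in DOR with threshold. The 0.5 continuity correction and the 2x2 counts below are assumptions for illustration, not data from the presentation:

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def moses_littenberg(tables):
    """Fit D = a + b*S by ordinary least squares.

    tables: list of (tp, fp, fn, tn) counts, one tuple per study.
    """
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in tables:
        # Add 0.5 to every cell so the logits stay finite (assumed convention)
        tpr = (tp + 0.5) / (tp + fn + 1.0)
        fpr = (fp + 0.5) / (fp + tn + 1.0)
        d_vals.append(logit(tpr) - logit(fpr))
        s_vals.append(logit(tpr) + logit(fpr))
    n = len(tables)
    s_mean = sum(s_vals) / n
    d_mean = sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b


def sroc_sensitivity(fpr, a, b):
    """Sensitivity on the fitted summary ROC curve at a given 1-specificity."""
    logit_tpr = a / (1 - b) + (1 + b) / (1 - b) * logit(fpr)
    return 1 / (1 + math.exp(-logit_tpr))


# Hypothetical studies: (tp, fp, fn, tn)
tables = [(90, 20, 10, 80), (70, 5, 30, 95), (80, 12, 20, 88)]
a, b = moses_littenberg(tables)
for fpr in (0.05, 0.10, 0.20):
    print(fpr, round(sroc_sensitivity(fpr, a, b), 3))
```

The curve is a summary of the whole set of studies rather than a single pooled point, which is why the slide notes that a unique operating point is difficult to interpret.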

41 Variation in studies: Fetal fibronectin in asymptomatic women
SROC (95% CI) for predicting birth before 37 weeks’ gestation (28 studies)

42 Does it make sense to combine?
Do we need studies to be exactly the same? When can we say we are measuring the same thing?

43 Are the studies consistent?
Are variations in results between studies consistent with chance? (Test of homogeneity: has low power)
If NO, then WHY?
Variation in study methods (biases)
Variation in intervention
Variation in outcome measure (e.g. timing)
Variation in population
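
The slide does not name a specific homogeneity test. One common choice, assumed here, is Cochran's Q applied to the log diagnostic odds ratios: under homogeneity, Q is compared with a chi-square distribution on k-1 degrees of freedom. The 0.5 continuity correction and the counts are illustrative assumptions:

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its approximate variance."""
    # Add 0.5 to every cell to avoid division by zero (assumed convention)
    tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def cochran_q(tables):
    """Cochran's Q and its degrees of freedom for a list of 2x2 tables."""
    effects = [log_dor_and_variance(*t) for t in tables]
    weights = [1 / var for _, var in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    return q, len(tables) - 1


# Hypothetical studies: (tp, fp, fn, tn)
tables = [(90, 20, 10, 80), (70, 5, 30, 95), (40, 30, 60, 70)]
q, df = cochran_q(tables)
print(f"Q = {q:.2f} on {df} degrees of freedom")
```

Because the test has low power, a non-significant Q does not show that the studies are homogeneous, so the listed sources of variation are still worth examining.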


45 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Q – Clearly focused question?

46 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Objective: To determine the accuracy with which a cervicovaginal fetal fibronectin test predicts spontaneous preterm birth in women with or without symptoms of preterm labour.

47 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Q – Clearly focused question? F – Found all available evidence?

48 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Electronic search: Medline (1966-2000), Embase (1980-2000), PASCAL (1973-2001), BIOSIS (1969-2001), the Cochrane Library (2000:4), MEDION (1974-2000; a database of diagnostic test reviews set up by Dutch and Belgian researchers), National Research Register (2000:4), SCISEARCH (1974-2001), and conference papers (1973-2000).
Grey literature: contacted individual experts and the manufacturer of the fetal fibronectin test.
Cross-checking: checked reference lists of known reviews and primary articles to identify cited articles not captured by electronic searches.

49 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Q – Clearly focused question? F – Found all available evidence? A – Studies are critically appraised?

50 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Bias can be associated with case-control study designs, lack of blinding of carer to test results, non-consecutive patient enrolment, non-prospective data collection, inadequate test description, use of different reference tests, partial verification, and lack of description of either the population or the reference test [28]. The last four items, however, are not relevant to our review because they refer to delivery of neonates (preterm or term births). Therefore, we considered a study to be of good quality if it used a prospective design, consecutive enrolment, adequate test description (to allow replication by others), and blinding of the test result from clinicians managing the patients.

51 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Q – Clearly focused question? F – Found all available evidence? A – Studies are critically appraised? S – Results are adequately synthesised?

52 Exercise II: Fetal fibronectin for predicting spontaneous preterm birth
Subgroups
Asymptomatic women: spontaneous preterm birth before 34 and 37 weeks' gestation
Symptomatic women: spontaneous preterm birth before 34 and 37 weeks' gestation, and within 7-10 days of testing
Quantitative summary
Used SROC curves as measures of accuracy for all included studies regardless of their thresholds
Provided summary likelihood ratios (positive and negative)
Heterogeneity
Assessed heterogeneity of diagnostic odds ratios graphically and statistically
Meta-regression to explore sources of heterogeneity
Sensitivity analysis: estimated accuracy of the highest quality studies

53 Exercise II
Honest H, Bachmann LM, Gupta JK, Kleijnen J, Khan KS. Accuracy of cervicovaginal fetal fibronectin test in predicting spontaneous preterm birth: systematic review. BMJ 2002;325:301-4

54 Final points
To assess systematic reviews of diagnostic studies, use QFAS
Q: The question should be a structured one (PPICO)
F: Finding studies of diagnostic tools is generally more difficult than finding studies of therapies

55 Final points
A: Spectrum, verification, differential reference and observer bias to be taken into account
S: Summaries affected by choices of threshold, population, and reference test
Methods not as well researched as for therapies
Heterogeneity analysis particularly important in these reviews


57 Steering Committee: Bossuyt, Bruns, Gatsonis, Glasziou, Irwig, Lijmer, Moher, Rennie, de Vet

58 Guidelines for Conducting SRinDS
For diagnostic reviews:
Cochrane Methods Group on Systematic Review of Screening and Diagnostic Tests: Recommended Methods. Cochrane Collaboration, 1996.
Deville WL, Buntinx F, Bouter LM, Montori VM, De Vet HC, Van Der Windt DA, Bezemer P. Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Med Res Methodol 2002; 2(1): 9.
Deeks JJ. Systematic reviews of evaluations of diagnostic and screening tests. In: Egger M, Smith GD, Altman DG, eds. Systematic reviews in health care. Meta-analysis in context. London: BMJ Publishing Group, 2001: 248–282.
Irwig L, Macaskill P, Glasziou P, Fahey M. Meta-analytic methods for diagnostic test accuracy. J Clin Epidemiol 1995; 48: 119–30.
The Bayes Library of Diagnostic Studies and Reviews. 2nd Edition, 2002.
For diagnostic studies reporting:
Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, de Vet HC; Standards for Reporting of Diagnostic Accuracy steering group. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ 2003; 326: 41–44.

59 Databases/sources of studies
Electronic databases
General: Cochrane CENTRAL, PubMed, Embase, etc.
Subject-specific: AIDSLINE, CANCERLIT, PsycInfo, MEDION, etc.
Reference lists of included studies (citation tracking)
Reference lists of earlier reviews, commentaries: CDSR, DARE, MEDION, PubMed search with filters for systematic reviews
Personal communication with experts and authors
Contacting drug/device companies
Hand-searching of key, high-yield journals
Grey literature: dissertation abstracts, reports, conference proceedings, etc.
Sources of ongoing studies: trial registers, drug companies, contacting experts

60 Quality assessment
Criteria for validity of diagnostic studies:
Study design: cross-sectional study of a clinically indicated population, or case-control
Verification: complete, different reference tests, or partial
Blinding: blinded or not
Patient selection: consecutive, random, or nonconsecutive
Data collection: prospective or retrospective
Appropriateness of reference standard
Description of test
Description of study population
Lijmer et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282:1061

61 Present absolute numbers for test results
Distribution of plasma concentrations of B type natriuretic peptide in normal elderly people and in those with left ventricular systolic dysfunction confirmed by echocardiography BMJ, 2000; 320:


