Studying the Impact of Tests


1 Studying the Impact of Tests
Jon Deeks, Professor of Health Statistics, University of Birmingham
Work supported by a DOH NCC RCD Senior Research Scientist in Evidence Synthesis Award
Oxford CEBM Workshop in Evidence-based Diagnostics, 17th-19th July 2006

2 Answering policy questions about the use of diagnostic tests
Should GPs refer patients with low back pain for X-ray and/or MRI?
Should patients with dyspeptic symptoms receive serology tests for H. pylori, endoscopy, or empirical therapy?

3 Standard hierarchy for HTA of tests (Fryback and Thornbury 1991)
1. Technical quality of the test
2. Diagnostic accuracy
3. Change in diagnostic thinking
4. Change in patient management
5. Change in patient outcomes
6. Societal costs and benefits

4 Studies on the Diagnostic Evaluation Pathway
Analytical validity: reliability (repeatability and reproducibility); measurement accuracy
Diagnostic validity: diagnostic accuracy; comparative/incremental diagnostic accuracy
Impact: change in diagnostic yield; change in management; change in patient outcomes
Economic evaluation

5 HTA policy on evaluating tests (up until 2004)
“the emphasis of the HTA programme is to assess the effect on patient management and outcomes … improvements in diagnostic accuracy, whilst relevant, are not the primary interest of this commissioned research programme”

6 Studies on the Diagnostic Evaluation Pathway
Analytical validity: reliability (repeatability and reproducibility); measurement accuracy
Diagnostic validity: diagnostic accuracy; incremental diagnostic accuracy
Impact: change in diagnostic yield; change in management; change in patient outcomes
Economic evaluation
Focus of HTA programme: the impact and economic evaluation levels

7 Outline of talk
Trials of diagnostic evaluations
Problems: what is being evaluated?; statistical power; study validity; outcomes
Pragmatic suggestions: when are trials really needed?; alternative trial designs; alternatives for assessing comparative accuracy
More research is needed

8 RCT to assess patient outcomes
Population → Sample → Randomise
Active arm → Outcome
Control arm → Outcome

9 Diagnostic RCT
Population → Sample → Randomise
Test arm → Outcome
Control arm → Outcome

10 RCT 1: X-ray at first GP presentation for low back pain (HTA 2000; 4(20))
GP attendees with LBP; excluded if 'flu or a consultation for LBP in the previous 4 weeks
Randomised (sample N=153): referred for X-ray N=73; no X-ray referral N=80
Follow-up: X-ray arm 6 weeks N=59, 1 year N=50; no-referral arm 6 weeks N=67, 1 year N=58
Primary outcomes: Roland score, HADS, SF-36, EuroQol
Secondary outcomes: time off work, visits to therapists, medication, satisfaction

11 RCT 1: results
Trial flow as on the previous slide (X-ray N=73, no X-ray referral N=80)
Results: at 6 weeks, differences on the SF-36 mental health and vitality subscales (P<.05); at 12 months, on the SF-36 mental health subscale (P<.05)

12 RCT 2: X-ray for GP presentation with low back pain >6 weeks (HTA 2001; 5(30))
GP attendees with a 1st episode of LBP of between 6 weeks and 6 months duration; excluded if 'red flags'
Randomised (sample N=421): referred for X-ray N=210; no X-ray referral N=211
Follow-up: X-ray arm 3 months N=199, 9 months N=195; no-referral arm 3 months N=203, 9 months N=199
Primary outcome: Roland score
Secondary outcomes: pain (VAS), EuroQol, pain (diary), satisfaction, pain (any), belief in X-ray, time off work, visits to therapists, medication, consultations

13 RCT 2: results
Trial flow as on the previous slide (X-ray N=210, no X-ray referral N=211)
Results: at 3 months, a difference in the proportion reporting LBP (P<.05); at 9 months, none

14 What is being evaluated?
Medical Test → Information → Decision → Action → Patient Outcome
Test to Information: diagnostic accuracy; Information to Decision: diagnostic yield; Decision to Action: management
An RCT combines all of these effects, together with test harms and placebo effects

15 What is being evaluated?
Conditions for a test to be of diagnostic benefit: the test is more accurate; interpretation of test results is rational and consistent; management is rational and consistent; treatment is effective
Conditions for a trial to be informative: rules for the interpretation of test results are described; the management protocol is described
Neither description was given in the example trials
Applying the results requires faith that the behaviour of your patients and clinicians matches that in the trial

16 What is being evaluated?
If no difference is observed …
Is the test no more accurate?
Are clinicians not correctly interpreting test results?
Are management decisions inconsistent or inappropriate?
Is the treatment ineffective?
None of these questions can be answered, and if any one element changes, the results of the trial become redundant

17 Statistical Power
RCT 1: a reduction in the proportion with pain at 2 weeks from 40% to 30% could be detected with 300 patients (80% power, 5% significance)
RCT 2: a difference of 1.5 on the Roland score could be detected with 388 patients (90% power, 5% significance; sd=4.5, standardised difference=1.5/4.5=0.33)
These sample size calculations are suitable for a trial of treatment vs placebo, not a trial of test+treatment
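For orientation, calculations of this kind can be reproduced approximately with standard software. A minimal Python sketch using statsmodels, assuming two-sided tests and the usual normal/t approximations; the totals it prints will differ somewhat from the figures quoted above, depending on the approximation and sidedness the original investigators assumed.

```python
# Sketch of the two trials' sample-size calculations (two-sided tests assumed).
from statsmodels.stats.power import NormalIndPower, TTestIndPower
from statsmodels.stats.proportion import proportion_effectsize

# RCT 1: proportion with pain at 2 weeks falls from 40% to 30%, 80% power.
h = proportion_effectsize(0.40, 0.30)  # Cohen's h for two proportions
n1 = NormalIndPower().solve_power(effect_size=h, alpha=0.05, power=0.80)
print(f"RCT 1: about {2 * n1:.0f} patients in total")

# RCT 2: 1.5-point Roland score difference, sd = 4.5, 90% power.
d = 1.5 / 4.5  # standardised difference = 0.33
n2 = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.90)
print(f"RCT 2: about {2 * n2:.0f} patients in total")  # close to the 388 quoted
```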

18 Diagnostic Accuracy of Clinical Judgement
                 Serious (requires intervention)   Minor (requires no intervention)
Judged serious   TP                                FP
Judged minor     FN                                TN

19 Diagnostic Accuracy of Clinical Judgement + X-ray
                 Serious (requires intervention)   Minor (requires no intervention)
Judged serious   TP                                FP
Judged minor     FN                                TN

20 Comparison of Diagnostic Accuracy
                         Serious (requires intervention)   Minor (requires no intervention)
Positive on both tests   All TP                             All FP
Discrepant results       Discrepant A                       Discrepant B
Negative on both tests   All FN                             All TN
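As an illustration of how these discrepant cells are located in practice, a small hypothetical Python sketch; the data and column names are invented for illustration, not taken from the talk.

```python
# Hypothetical data: cross-classify paired results of two tests, separately
# for the serious and minor groups, to locate the discrepant cells.
import pandas as pd

df = pd.DataFrame({
    "serious":  [True, True, True, False, False, False],   # reference standard
    "old_test": [True, True, False, False, True, False],   # e.g. clinical judgement
    "new_test": [True, False, True, False, False, False],  # e.g. judgement + X-ray
})

for status, group in df.groupby("serious"):
    label = "Serious:" if status else "Minor:"
    # Off-diagonal cells are Discrepant A (serious) / Discrepant B (minor).
    print(label, pd.crosstab(group["old_test"], group["new_test"]), sep="\n")
```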

21 Benefit can only occur in those whose diagnosis changes
Where can differences arise? Discrepant A could benefit if the intervention is effective; Discrepant B could benefit if the intervention is harmful; all others gain no benefit, as their intervention does not change
Sample size must take into account:
prevalence of the treatable condition
detection rate (sensitivity) with the control test
detection rate (sensitivity) with the new test
treatment rate if the control test is negative (assume zero)
treatment rate if the new test is positive (assume 100%)
outcome for the treatable condition if untreated
treatment effect

22 Sample size for detecting treatment effects
Start from the sample size for a treatment-vs-control trial, then inflate it according to the proportion in the discrepant cells (particularly A), as sketched below
If 20% have serious disease and sensitivity is improved by 20 percentage points, 4% fall in Discrepant A: increase N 25-fold (N=7,500-10,000)
If 10% have serious disease and sensitivity is improved by 10 percentage points, 1% fall in Discrepant A: increase N 100-fold (N=30,000-40,000)
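The arithmetic behind these inflation factors can be made explicit. A minimal sketch, assuming (as the earlier power slide suggests) that a plain treatment-vs-control trial needs 300-400 patients:

```python
# Only Discrepant A patients carry information about the treatment effect, so
# the whole trial must be big enough for that cell alone to match the size of
# a treatment-vs-control trial (assumed here to need 300-400 patients).
def inflated_n(prevalence, sens_gain, n_treatment_trial):
    discrepant_fraction = prevalence * sens_gain  # share of sample in Discrepant A
    return round(n_treatment_trial / discrepant_fraction)

print(inflated_n(0.20, 0.20, 300), inflated_n(0.20, 0.20, 400))  # 7500 10000
print(inflated_n(0.10, 0.10, 300), inflated_n(0.10, 0.10, 400))  # 30000 40000
```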

23 Sample size for detecting differences in accuracy
Sample size depends on whether the whole sample receives both tests (paired) or patients are randomised to one test (parallel)
To detect a 20% difference in sensitivity (70% vs 90%) when 20% have serious disease (80% power, alpha 0.05): paired cohort design N=116 [68-136]; parallel cohort design N=232
To detect a 10% difference in sensitivity (80% vs 90%) when 10% have serious disease: paired cohort design N=706; parallel cohort design N=1,411
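For a sense of where such numbers come from, a hedged sketch using Connor's (1987) formula for a paired (McNemar) comparison of sensitivities, scaled up by prevalence. The discordant proportion psi is not given on the slide (the bracketed range presumably reflects different assumed values of it), so this is a generic illustration rather than the talk's own calculation.

```python
# Paired-design cohort size for comparing two sensitivities (Connor's formula
# for McNemar's test), scaled up by disease prevalence. psi, the proportion of
# diseased patients with discordant test results, is an assumption.
import math
from scipy.stats import norm

def paired_cohort_n(sens_old, sens_new, prevalence, psi, alpha=0.05, power=0.80):
    delta = sens_new - sens_old                      # difference in sensitivities
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    n_diseased = (z_a * math.sqrt(psi)
                  + z_b * math.sqrt(psi - delta ** 2)) ** 2 / delta ** 2
    return math.ceil(n_diseased / prevalence)        # whole-cohort size

# Smallest possible discordance: all discordant pairs favour the new test.
print(paired_cohort_n(0.70, 0.90, prevalence=0.20, psi=0.20))
# A parallel (randomised) comparison needs roughly twice as many patients,
# since each patient informs only one of the two tests.
```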

24 Sample size for detecting differences in diagnoses and management
Start from the accuracy-based sample size, inflated according to:
For diagnostic impact: the diagnosis rate if the control test is negative; the diagnosis rate if the new test is positive*
For therapeutic impact: the treatment rate if the control test is negative; the treatment rate if the new test is positive*
* subject to “learning effects”
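One plausible reading of this adjustment, sketched below under the assumption (mine, not stated in the talk) that the accuracy-based N is divided by the gap between the management rate after a positive new test and after a negative control test, since only that gap turns extra detections into changed management:

```python
# Inflate an accuracy-based sample size for imperfect uptake of management
# changes. The interpretation is assumed, not taken verbatim from the talk.
def management_inflated_n(n_accuracy, rate_if_new_positive, rate_if_control_negative):
    uptake_gap = rate_if_new_positive - rate_if_control_negative
    return round(n_accuracy / uptake_gap)

# Perfect uptake (100% vs 0%) leaves N unchanged; 80% vs 10% inflates it.
print(management_inflated_n(706, 1.0, 0.0))   # 706
print(management_inflated_n(706, 0.8, 0.1))   # ~1009
```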

25 Validity Concerns
Blinding: participants and outcome assessors are rarely blind in diagnostic trials, so trials may be more susceptible to measuring participants' preconceived notions and trialists' expectations
Drop-out: lack of blinding can induce differential drop-out, and there are more stages at which drop-out can occur
Compliance: lack of blinding and the complexity of the strategies can reduce compliance

26 What outcomes?
The problem is multi-multi-factorial
Assessing the effect of a single intervention for a single disease already requires multiple outcomes
Tests are used to differentiate between multiple diseases and disease states
A trial should therefore assess all the important outcomes for the multiple diseases within the differential diagnosis
But trials usually focus on one condition

27 Summary of problems
Diagnostic trials are rarely done, and they:
assess the effects of a “test+treatment package” and are uninformative about the value of the test itself
are often underpowered and at risk of bias
may not assess all relevant outcomes
may be more likely to detect “placebo” effects than the benefits of better diagnoses
may not represent future impact on treatment and diagnostic decisions

28 Key issues
Trials need only be done in limited circumstances
Only patients in the discrepant cells are informative
Audit and feedback studies are better than trials for assessing and changing clinicians' behaviour
More good comparative studies of test accuracy are required

29 When is measuring sensitivity and specificity sufficient to evaluate a new test?
Lord et al. Ann Intern Med 2006;144:850-5
Categories of test attributes: the new test is safer or less costly; the new test is more specific (excludes more cases of non-disease); the new test is more sensitive (detects more cases of disease)
If an RCT of treatments exists, when do we still need to undertake an RCT of test+treatment?

30 Trial evidence versus linked evidence of test accuracy and treatment efficacy
Figure from Lord SJ, et al. Ann Intern Med 2006;144:850-5

31 Assessing new tests using evidence of test accuracy, given that treatment is effective for cases detected by the old test
Figure from Lord SJ, et al. Ann Intern Med 2006;144:850-5

32 When is measuring sensitivity and specificity sufficient to evaluate a new test? (continued)
If the new test has similar sensitivity: trials of test+treatment are not required
Reductions in harm or cost are benefits; improved specificity can only be a benefit
Decision models can be used to analyse the trade-offs between these benefits and harms

33 When is measuring sensitivity and specificity sufficient to evaluate a new test? (continued)
If the new test has improved sensitivity: the value of using the test depends on the treatment response in the extra cases detected
A trial is still not needed if:
inclusion in the treatment trial was based on the reference standard used for assessing test accuracy
the test is evaluated in a treatment trial as a predictor of response
the new cases represent the same spectrum or subtype of disease
treatment response is known to be similar across the spectrum or subtypes of disease

34 Alternative Diagnostic RCT
Population → Sample → Randomise
X-ray arm: Serious → Intervene → Outcome; Minor → Do not intervene → Outcome
Clinical diagnosis arm: Serious → Intervene → Outcome; Minor → Do not intervene → Outcome

35 Alternative Diagnostic RCT
Population → Sample → all receive both X-ray and Clinical diagnosis → Compare results
Concordant Serious → Intervene → Outcome; Concordant Minor → Do not intervene → Outcome
Discrepant results → Randomise → Outcome

36 Alternative Diagnostic RCT
Everybody gets all tests; randomise only those with discrepant results
Benefits: assesses diagnostic yield and the resultant patient outcomes; less follow-up is required; include a reference standard for a random sample and comparative diagnostic accuracy can also be assessed
Downsides: more tests are undertaken; problems arise when test material is limited; test harms and other direct effects are not assessed; it may not be ethical to randomise treatment
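A toy simulation of this design, under assumed and purely illustrative error rates with independent test errors, shows how small the randomised fraction is relative to the tested cohort:

```python
# Toy simulation of the discrepant-results design: test everyone with both
# tests, randomise only those on whom the tests disagree. All rates assumed.
import numpy as np

rng = np.random.default_rng(0)
n, prevalence = 10_000, 0.20
disease = rng.random(n) < prevalence
# Assumed sensitivities 70% vs 90%, shared false-positive rate 10%.
old_pos = np.where(disease, rng.random(n) < 0.70, rng.random(n) < 0.10)
new_pos = np.where(disease, rng.random(n) < 0.90, rng.random(n) < 0.10)

discrepant = old_pos != new_pos
print(f"Randomised (discrepant results only): {discrepant.sum()} of {n}")
```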

37 Assessing clinicians’ behaviours
Informative trials require documentation and standardisation of decision-making, which is particularly difficult when the comparison group is standard practice
Behaviour observed in a trial may not be representative, as future behaviour will depend on the trial results
Learning curves may affect compliance: becoming acquainted with a test; ascertaining how best to use it; gaining confidence in its findings; allowing it to replace other investigations

38 Diagnostic Before-and-After Studies
Design: doctors' diagnostic, prognostic and management decisions are recorded; the result of the new test is made available; changes in the doctors' diagnostic, prognostic and management decisions are noted; (a reference standard is applied)
Application: assessment of an additional test only; assessment of diagnostic yield and management
Concerns: the new test is assessed independently of other tests; doctors' processes may not reflect standard clinical practice; learning effects

39 Conclusions
We have much to learn about the best way of studying diagnostic tests
Test+treatment trials are difficult to undertake, are prone to bias, and often require unattainable sample sizes
Good comparative studies of test accuracy, combined in decision models with evidence from trials of treatments, may in many circumstances provide the evidence needed for policy decisions
Good comparative studies of test accuracy should be commissioned more readily


42 Situations when test accuracy is likely to be adequate
An effective treatment exists
The reference standard is similar to the trial entry criteria
All tests detect disease at the same stage
All tests lead to the same treatment options
The evidence of test accuracy comes from populations similar to those in the evidence of effectiveness
Studies of diagnostic accuracy need not be large, but good studies that compare tests head-to-head need to be done

43 Research Questions
Do the limitations of HTA evaluations of patient outcomes bias them towards showing no difference?
How often are trials appropriately powered?
How often are the criteria for not needing RCT evidence met?
What efficiencies can be made with other designs?
How are economic analyses affected by these issues?
Would money be better spent getting good evidence of the comparative accuracy of alternative diagnostic tests?

44 New Diagnostic Test Assessment Framework and Examples
Figure from Lord SJ, et al. Ann Intern Med 2006;144:850-5

45 When are trials most needed?
When tests detect disease earlier, introducing different treatment options (screening)
When interventions have harmful effects, so that treating some non-diseased patients (FP) may outweigh the benefits of treating diseased patients (TP)
When the test itself has a harmful effect
When diagnostic accuracy cannot be assessed because no reference standard exists (although there must still be indications for therapy, which could be used as a reference standard)

46 Destination worth reaching

47 The Role of Randomised Trials in Evaluating Diagnostic Tests
Jon Deeks, Professor of Health Statistics, University of Birmingham
Work supported by a DOH NCC RCD Senior Research Scientist in Evidence Synthesis Award

48 Defects and Disasters in Evaluations of the Impact of Diagnostic Tests
Jon Deeks, Professor of Health Statistics, University of Birmingham
Work supported by a DOH NCC RCD Senior Research Scientist in Evidence Synthesis Award

