1 Interpreting Diagnostic Tests Ian McDowell Department of Epidemiology & Community Medicine January 2012 Note to readers: you may find the additional notes & explanations in the ppt notes panel helpful.

2 The Challenge of Clinical Measurement
Diagnoses are based on information, from formal measurements and/or from clinical judgment. This information is seldom perfectly accurate:
– Random errors can occur (machine needs calibrating?)
– Biases in judgment or measurement can occur (“this patient seems anxious: is he exaggerating?”)
– Due to biological variability, this patient may not fit the general rule
– Diagnosis (e.g., hypertension) involves a categorical judgment; this often requires dividing a continuous score (blood pressure) into categories. How to choose a cut-point?

3 Therefore… You need to be aware …
– That we express these complexities in terms of probabilities
– That using a quantitative approach is better than just guessing!
– That you will gradually become familiar with the typical accuracy of measurements in your chosen clinical field
– That the principles apply to both diagnostic and screening tests
– Of how we describe the accuracy of a measurement.

4 Test characteristics
1. Reliability: consistency or reproducibility; this considers chance or random errors (which sometimes increase, sometimes decrease, scores). “Is it measuring something?”
2. Validity: “Is it measuring what it is supposed to measure?” By extension, “what diagnostic conclusion can I draw from a particular score on this test?” Validity may be affected by bias, which refers to systematic errors (these fall in a certain direction).
3. Safety, Acceptability, Cost, etc.

5 Reliability and Validity
[Figure: 2 x 2 grid of results, with reliability (low, high) on one axis and validity (low, high) on the other. Annotations: “Biased result!” and “☺ Average of these inaccurate results is not bad. This is probably how screening questionnaires (e.g., for depression) work.”]

6 Ways of Assessing Validity
– Content or “Face” validity: does it make clinical or biological sense? Does it include the relevant symptoms?
– Criterion validity: comparison to a “gold standard” definitive measure (e.g., biopsy, autopsy). Expressed as sensitivity and specificity.
– Construct validity: used with abstract themes, such as “quality of life”, for which there is no definitive standard.

7 Criterion validation: “Gold Standard”
The criterion that your clinical observation or simple test is judged against:
– more definitive (but expensive or invasive) tests, such as a complete work-up, or
– the clinical outcome (for screening tests, when workup of well patients is unethical).
Sensitivity and specificity are calculated from a research study that compares the test to a gold standard.

8 “2 x 2” table for validating a test
TP = true positive; FP = false positive; FN = false negative; TN = true negative.
Golden Rule: always calculate based on the gold standard.

                   Gold standard
                   Disease present    Disease absent
Test positive      a (TP)             b (FP)
Test negative      c (FN)             d (TN)

Validity:
Sensitivity = a/(a+c) = TP/Diseased
Specificity = d/(b+d) = TN/Healthy
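The two column-based formulas can be sketched in a few lines of Python. The cell counts below are hypothetical, chosen only to illustrate the arithmetic, and the function names are not from the slides.

```python
def sensitivity(tp, fn):
    """Share of diseased people the test detects: a / (a + c)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of healthy people the test clears: d / (b + d)."""
    return tn / (tn + fp)


# Hypothetical 2 x 2 counts (illustration only, not from the slides).
tp, fp, fn, tn = 90, 30, 10, 170

print(f"Sensitivity = {sensitivity(tp, fn):.2f}")  # 90 / 100
print(f"Specificity = {specificity(tn, fp):.2f}")  # 170 / 200
```

Both denominators are column totals from the gold standard, which is the "Golden Rule" on the slide.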

9 Sensitivity = test’s ability to detect disease when it is present
a/(a+c) = TP/(TP+FN) = TP/disease
A sensitive person is one who can perceive your feelings.
(1 – seNsitivity) = false Negative rate: how many cases are missed by the test?
Specificity = precision of the test: identifies only that type of disease. “Nothing else looks like this”
A specific test generates few false positives. So, a positive result points strongly to this diagnosis.
(1 – sPecificity) = false Positive rate: how many are falsely classified as having the disease?

10 Test Errors
False Positives can arise due to other factors (diet, taking other medications, etc.). They entail the cost and danger of further investigations, labeling, and worry for the patient.
– This is similar to Type I or alpha error in a test of statistical significance (the possibility of falsely concluding that there is an effect of an intervention).
False Negatives imply missed cases, so potentially bad outcomes if untreated: an adverse event.
– Cf. Type II or beta error: the chance of missing a true difference.

11 Most Tests Provide a Continuous Score. Selecting a Cutting Point
[Figure: two overlapping distributions of test scores, one for a healthy population (healthy scores) and one for a sick population (pathological scores), with a possible cut-point marked in the overlap. Moving the cut-point toward the healthy scores increases sensitivity (includes more of the sick group); moving it toward the pathological scores increases specificity (excludes healthy people).]
Crucial issue: changing the cut-point can improve sensitivity or specificity, but never both.
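The trade-off can be seen by moving a cut-point across two small sets of scores. This is a minimal sketch: the scores are invented for illustration, and it assumes that higher scores are more pathological and that a score at or above the cut-point counts as test-positive.

```python
# Hypothetical test scores (illustration only); higher = more pathological.
healthy = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
sick = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]


def sens_spec(cut_point):
    """Return (sensitivity, specificity) when scores >= cut_point are positive."""
    true_pos = sum(score >= cut_point for score in sick)
    true_neg = sum(score < cut_point for score in healthy)
    return true_pos / len(sick), true_neg / len(healthy)


for cut in (4, 6, 8):
    sen, spec = sens_spec(cut)
    print(f"cut-point {cut}: sensitivity {sen:.1f}, specificity {spec:.1f}")
```

With these made-up scores, raising the cut-point from 4 to 8 lowers sensitivity from 1.0 to 0.4 while specificity rises from 0.5 to 1.0.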

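12 Improving the test
[Figure: the same two distributions (healthy population with healthy scores, sick population with pathological scores), now with less overlap.]
An improved test reduces the overlap, increasing both sensitivity and specificity.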

13 Choosing the cut-point
Choice depends on the relative significance of false positive and false negative errors.
Choose a low cut-point (= increase sensitivity)
– If the results of missing a case (FN) are important
– If the cost of further diagnostic confirmation is not high
– e.g., phenylketonuria (PKU) test.
Choose a higher cut-point
– If the implications of a false positive are serious
– e.g., alpha-fetoprotein test for Down’s syndrome during pregnancy

14 Clinical applications
A specific test can be useful to rule in a disease. Why?
– Specific tests give few false positives. So, if the result is positive, you can be sure the patient has the condition (‘nothing else would give this result’): “SpPin”
A sensitive test can be useful for ruling a disease out:
– A negative result on a very sensitive test (which detects all true cases) reassures you that the patient does not have the disease: “SnNout”

15 Your Patient’s Question: “Doctor, how likely am I to have this disease?” This introduces Predictive Values Sensitivity & specificity don’t answer this, because they work from the gold standard. The clinician sees the test result, but does not know whether this person is a true positive or a false positive (or a true or false negative). Hmmm… How accurately does a positive (or negative) test result predict disease (or health)?

16 Start from Prevalence Before you apply any test, the best guide you have to a diagnosis is based on prevalence: –Common conditions (in this population) are the more likely diagnosis Prevalence indicates the ‘pre-test probability’ of disease. You will then refine this informed guess in a series of stages: First, consider the patient’s age and sex; use the prevalence for a similar person. Then, based on the patient’s history you may modify the estimate.

17 2 x 2 table: Prevalence

                   Disease present    Disease absent    Total
Test positive      a                  b                 a+b
Test negative      c                  d                 c+d
Total              a+c                b+d               N

Prevalence = (a+c) / N

18 Predictive Values
Based on rows, not columns
– PPV = a/(a+b): interprets a positive test; it is lowered by false positives
– NPV = d/(c+d): interprets a negative test; it is lowered by false negatives
Immediately useful to the clinician: they tell us about the test in this population and thus this patient.
They vary with the prevalence of disease, so must be determined for each clinical setting.
As prevalence goes down, PPV goes down and NPV rises.

19 Prevalence and Predictive Values

A. Specialist referral hospital
         D+      D-
T+       50      10
T-       5       100
Sensitivity = 50/55 = 91%
Specificity = 100/110 = 91%
Prevalence = 55/165 = 33%
PPV = 50/60 = 83%
NPV = 100/105 = 95%

B. Primary care
         D+      D-
T+       50      100
T-       5       1000
Sensitivity = 50/55 = 91%
Specificity = 1000/1100 = 91%
Prevalence = 55/1155 = 5%
PPV = 50/150 = 33%
NPV = 1000/1005 = 99.5%
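The row-based calculations for the two settings can be checked with a short sketch. The cell counts are the ones on the slide; the function and variable names are illustrative. Note that 55/1155 works out to about 4.8% prevalence in the primary care setting.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV, prevalence) from the four cells of a 2 x 2 table."""
    ppv = tp / (tp + fp)                         # row: test positive
    npv = tn / (tn + fn)                         # row: test negative
    prevalence = (tp + fn) / (tp + fp + fn + tn)
    return ppv, npv, prevalence


# Cell counts from the slide, as (TP, FP, FN, TN).
settings = {
    "A. Specialist referral hospital": (50, 10, 5, 100),
    "B. Primary care": (50, 100, 5, 1000),
}

for name, cells in settings.items():
    ppv, npv, prev = predictive_values(*cells)
    print(f"{name}: prevalence {prev:.1%}, PPV {ppv:.1%}, NPV {npv:.1%}")
```

Sensitivity and specificity are the same in both settings, so the drop in PPV from 83% to 33% comes entirely from the lower prevalence.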

20 Exercise ECG (aka "treadmill test") A 22 year old male with chest pain has a pretest probability of obstructive CAD of roughly 1%. With a "positive" exercise ECG, his posttest probability is still less than 5%; in other words, there's a greater than 95% chance that he doesn’t have important CAD, despite a "positive" test. The same applies in the opposite direction for a 72 year old male with typical anginal chest pain. Pretest probability is 95%; if the exercise ECG is negative, the posttest probability is still probably greater than 80%. The overarching guideline is to treat the patient, not the test.

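21 From the literature you can get Sensitivity & Specificity. To work out PPV and NPV for your practice, you need to guess prevalence, then work backwards.
Fill the cells of the 2 x 2 table (“Truth”: Disease Present, Disease Absent, Total; Test Pos, Test Neg, Total; Predictive Values) in the following order:
– 1st to 3rd: the Total row (from estimated prevalence)
– 4th: true positives (from sensitivity)
– 5th: true negatives (from specificity)
– 6th and 7th: the two remaining cells, by subtraction
– 8th and 9th: the row totals for Test Pos and Test Neg
– 10th and 11th: the predictive values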
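Working backwards can be sketched as follows. This assumes a notional population of 1000 patients and uses made-up inputs (sensitivity 90%, specificity 90%, prevalence 10%); the function name and dictionary keys are illustrative, not from the slides.

```python
def fill_two_by_two(sensitivity, specificity, prevalence, n=1000):
    """Rebuild a 2 x 2 table for n notional patients, then derive PPV and NPV."""
    diseased = prevalence * n        # column total: disease present
    healthy = n - diseased           # column total: disease absent
    tp = sensitivity * diseased      # from sensitivity
    tn = specificity * healthy       # from specificity
    fn = diseased - tp               # remaining cells by subtraction
    fp = healthy - tn
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "PPV": tp / (tp + fp),       # row: test positive
        "NPV": tn / (tn + fn),       # row: test negative
    }


# Hypothetical inputs (illustration only).
table = fill_two_by_two(sensitivity=0.90, specificity=0.90, prevalence=0.10)
print(f"PPV = {table['PPV']:.2f}, NPV = {table['NPV']:.2f}")
```

The size of the notional population does not affect PPV or NPV; it only makes the intermediate cells easier to read.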

22 Predictive Values
High specificity = few FPs: Sp = TN/(TN+FP). FPs also drive PPV: PPV = TP/(TP+FP). So, with a high PPV the clinician is more certain that a patient with a positive test has the disease (it rules in the disease).
The higher the sensitivity, the higher the NPV: Sn = TP/(TP+FN); NPV = TN/(TN+FN). The clinician can be more confident that a patient with a negative score does not have the diagnosis (because there are few false negatives). So, high NPV can rule out a disease.

23 Gasp…! Isn’t there an easier way to do all this…? Yes (good!) But first, you need a couple more concepts (less good…) We said that before you apply a test, prevalence gives your best guess about the chances that this patient has the disease. This is known as the “Pretest Probability of Disease”: (a+c) / N in the 2 x 2 table. It can also be expressed as the odds of disease: (a+c) / (b+d). When the disease is rare, the odds are close to the probability.
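The link between probability and odds can be sketched with two small helper functions (names are illustrative). The example uses the column totals from the two settings on slide 19: 55 diseased with 110 or 1100 non-diseased.

```python
def probability_to_odds(p):
    """Odds = p / (1 - p), e.g. (a+c)/N becomes (a+c)/(b+d)."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Inverse conversion: probability = odds / (1 + odds)."""
    return odds / (1 + odds)


# Column totals from slide 19: (diseased, non-diseased).
for label, diseased, healthy in [("Referral hospital", 55, 110),
                                 ("Primary care", 55, 1100)]:
    p = diseased / (diseased + healthy)
    print(f"{label}: probability {p:.3f}, odds {probability_to_odds(p):.3f}")
```

In the referral hospital the probability (0.333) and odds (0.500) differ noticeably; in primary care, where the disease is rarer, they are nearly the same (0.048 and 0.050).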

24 This Leads to … Likelihood Ratios
Defined as the odds that a given level of a diagnostic test result would be expected in a patient with the disease, as opposed to a patient without: true positive rate / false positive rate [TP rate / FP rate].
Advantages:
– Combines sensitivity and specificity into one number
– Can be calculated for many cut-points on the test
– Can be turned into predictive values
LR for positive test (LR+) = Sensitivity / (1 – Specificity)
LR for negative test (LR–) = (1 – Sensitivity) / Specificity
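As a minimal sketch, the two formulas translate directly into code. The example reuses the sensitivity (50/55) and specificity (100/110) from slide 19; the function name is illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity and specificity from slide 19.
lr_pos, lr_neg = likelihood_ratios(50 / 55, 100 / 110)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

For that test, a positive result is about 10 times as likely in a diseased patient as in a healthy one, and a negative result about one tenth as likely.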

25 Practical application: a Nomogram
1) You need the LR for this test
2) Plot the likelihood ratio on the center axis (e.g., LR+ = 20)
3) Select the pretest probability (prevalence) on the left axis (e.g., prevalence = 30%)
4) Draw a line through these points to the right axis to indicate the post-test probability of disease
Example: post-test probability = about 90%
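The nomogram is a graphical shortcut for a three-step calculation: convert the pretest probability to odds, multiply by the likelihood ratio, and convert back to a probability. A minimal sketch (the function name is illustrative) applied to the slide's example of a 30% pretest probability and LR+ = 20:

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Pretest probability -> odds, multiply by the LR, then back to a probability."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Example values from the slide: prevalence 30%, LR+ = 20.
result = post_test_probability(0.30, 20)
print(f"Post-test probability = {result:.1%}")
```

The arithmetic gives roughly 90%, close to the value read off the nomogram; a reading taken from a printed scale is only approximate.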

26 There is another way to combine sensitivity and specificity: Meet Receiver Operating Characteristic (ROC) curves
Work out Sen and Spec for every possible cut-point, then plot these. The area under the curve indicates the information provided by the test.
[Figure: ROC curve. x-axis: 1 – Specificity (= false positive rate), from 0 to 1; y-axis: Sensitivity, from 0 to 1.]
In an ideal test, the blue line would reach the top left corner. For a useless test it would lie along the diagonal: no better than guessing.
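A rough sketch of building an ROC curve by hand: compute (1 – specificity, sensitivity) at every possible cut-point, then estimate the area under the curve with the trapezoid rule. The scores are invented for illustration, and the sketch assumes that scores at or above the cut-point count as positive.

```python
# Hypothetical test scores (illustration only); higher = more pathological.
healthy = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
sick = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]


def roc_points(healthy, sick):
    """Return sorted (false positive rate, sensitivity) pairs, one per cut-point."""
    scores = sorted(set(healthy) | set(sick))
    cut_points = scores + [scores[-1] + 1]   # the extra cut classifies no one as positive
    points = set()
    for cut in cut_points:
        fpr = sum(h >= cut for h in healthy) / len(healthy)
        tpr = sum(s >= cut for s in sick) / len(sick)
        points.add((fpr, tpr))
    return sorted(points)


def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(healthy, sick)
print(f"AUC = {area_under_curve(points):.2f}")
```

An area of 0.5 corresponds to the diagonal (no better than guessing) and 1.0 to a curve that reaches the top left corner; these made-up scores give an area of about 0.93.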

27 Chaining LRs Together (1)
Example: 45 year-old woman presents with “chest pain”
– Based on her age, the pretest probability that a vague chest pain indicates CAD is about 1%
Take a fuller history. She reports a 1-month history of intermittent chest pain, suggesting angina (substernal pain; radiating down arm; induced by effort; relieved by rest…)
– The LR of this history for angina is about 100

28 The previous example: 1. From the History
[Figure: nomogram.] She’s young; the pretest probability is about 1%. The LR of the history is 100. Based on the history, the post-test probability rises to 50%.

29 Chaining LRs Together (2) 45 year-old woman with 1-month history of intermittent chest pain… After the history, post test probability is now about 50%. What will you do? A more precise (but also more costly) test: Record an ECG –Results = 2.2 mm ST-segment depression. LR for ECG 2.2 mm result = 10. –This raises post test probability to > 90% for coronary artery disease (see next slide)

30 The previous example: ECG Results
[Figure: nomogram.] Now start from the new pretest probability (i.e., 50%, prior to the ECG, based on the history). The post-test probability now rises to about 90%.
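The two-step example can be reproduced by feeding each post-test probability in as the next pretest probability. This is a sketch under one stated assumption: the history and the ECG result are treated as independent pieces of evidence, which is what multiplying their LRs in sequence implies. The function name is illustrative; the numbers (1%, LR 100, LR 10) are from the slides.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Pretest probability -> odds, multiply by the LR, then back to a probability."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


probability = 0.01                       # pretest probability based on age
steps = [("history suggesting angina", 100),
         ("ECG, 2.2 mm ST-segment depression", 10)]

for finding, lr in steps:
    # Assumes each finding adds independent evidence (see lead-in).
    probability = post_test_probability(probability, lr)
    print(f"After {finding} (LR {lr}): {probability:.0%}")
```

The history moves the estimate from 1% to about 50%, and the ECG moves it from there to about 91%.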
