Presentation on theme: "Diagnostic tests Subodh S Gupta MGIMS, Sewagram"— Presentation transcript:
1 Diagnostic tests
Subodh S Gupta, MGIMS, Sewagram
One of the important roles a physician plays is to diagnose illness. To reach a diagnosis, a physician uses different kinds of clinical information, e.g. symptoms, signs and laboratory tests. With experience, a clinician learns the importance of each piece of clinical information and how to interpret a positive or negative result of a given diagnostic test. A junior clinician is sometimes surprised when a senior declares that, even though a diagnostic test is positive, the chance that the patient has the illness is extremely low; or that, even though a test is negative, the chance that the patient has the illness is high. Is it just experience? Or is there a science to it?
Before I proceed, I want to clarify a point. The term ‘diagnostic test’ is usually taken to mean a test performed in a laboratory. But what we are going to discuss today applies equally well to all kinds of clinical information, e.g. symptoms, signs and the various risk factors that the patient is exposed to. It may also represent a combination of clinical information.
2 Standard 2 X 2 table (for diagnostic tests)
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        a               b              a+b
Negative (T-)        c               d              c+d
Total                a+c             b+d            N
Let us start from what all of us already know and build from there. A simple way of looking at the relationship between a test’s results and the true diagnosis is given by a 2 X 2 table. The test is considered to be either positive or negative, and the disease is either present or absent. There are then four possibilities.
3 Standard 2 X 2 table (for diagnostic tests)
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)
Positive (T+)        TP              FP
Negative (T-)        FN              TN
The test has given the correct result when it is positive among those who are diseased or negative among those who are non-diseased. It has given an incorrect result when it is positive among those without the disease or negative in the presence of the disease.
4 Gold standard
In any study of diagnosis, the method being evaluated has to be compared to something.
The best available test that is used as the comparison is called the GOLD STANDARD.
We need to remember that not all gold standards are gold; a new test may be better than the gold standard.
5 Test parameters
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        a               b              a+b
Negative (T-)        c               d              c+d
Total                a+c             b+d            N
The rate of correct identification of those who are diseased is known as Sensitivity (the true positive rate). Similarly, the rate of correct identification of those who are not diseased is known as Specificity (the true negative rate). We know from our previous knowledge that a sensitive test should be chosen when we do not want to miss any case. Similarly, a specific test should be used when we want to confirm a diagnosis, or when we do not want anyone without the disease to be labeled as diseased.
Sensitivity = Pr(T+|D+) = a/(a+c)
Sensitivity is PID (Positive In Disease)
Specificity = Pr(T-|D-) = d/(b+d)
Specificity is NIH (Negative In Health)
6 Test parameters
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        a               b              a+b
Negative (T-)        c               d              c+d
Total                a+c             b+d            N
Complementary to the concepts of Sensitivity and Specificity are the concepts of error rates. The False Positive Rate is the rate of incorrectly identifying the disease among those who are not diseased. The False Negative Rate is the rate of incorrectly identifying a subject as not diseased among those who are diseased.
There is another parameter known as ‘Diagnostic Accuracy’, or simply ‘Accuracy’, of a test, which gives the proportion of study subjects in whom the test gives the correct diagnosis.
False Positive Rate (FP rate) = Pr(T+|D-) = b/(b+d)
False Negative Rate (FN rate) = Pr(T-|D+) = c/(a+c)
Diagnostic Accuracy = (a+d)/N
7 Test parameters
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        a               b              a+b
Negative (T-)        c               d              c+d
Total                a+c             b+d            N
Two more parameters are very important in relation to a diagnostic test: Positive Predictive Value and Negative Predictive Value. The probability of the disease status, given the result of a test, is called the ‘Predictive Value’ of the test. PPV is the probability that the disease is present after the test has come back positive; similarly, NPV is the probability that the disease is absent after the test has come back negative.
Positive Predictive Value (PPV) = Pr(D+|T+) = a/(a+b)
Negative Predictive Value (NPV) = Pr(D-|T-) = d/(c+d)
8 Test parameters: Example
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        90              5              95
Negative (T-)        10              95             105
Total                100             100            200
Based on this information, let us calculate the different test parameters.
Sensitivity = 90/(90+10); Specificity = 95/(95+5)
FP rate = 5/(95+5); FN rate = 10/(90+10)
Diagnostic Accuracy = (90+95)/200
PPV = 90/(90+5); NPV = 95/(95+10)
10 Let us see this in a graphical form
Let us imagine a clinical measurement which takes on a range of values. Imagine that the first curve represents the distribution of the test criterion among healthy persons, and the curve on the right side represents the distribution of the same test criterion among a group of patients suffering from a particular illness. Usually, there is an overlap in the test criterion between the healthy and the diseased. Imagine that the cut-off is set at the vertical line given here.
11 Then, the areas shaded in different colors represent the TP, FP, FN and TN.
12 Healthy population vs sick population
Now, let us imagine two different situations. In the first situation, we do the test in a setting where the chance of a patient being diseased is high.
[Figure: distributions labeled Healthy and Sick]
13 Predictive Values in hospital-based data
Most test positives here are sick. But this is because there are as many sick as healthy people overall. What if fewer people were sick, relative to the healthy?
14 Predictive Values in population-based data
Now most test positives below are healthy. This is because the number of false positives from the larger healthy group outweighs the true positives from the sick group. Thus, the chance that a test positive is sick depends on the prevalence of the disease in the group tested!
15 Test Parameters: Example
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        90              5              95
Negative (T-)        10              95             105
Total                100             100            200
Let us examine this taking a numerical example.
Prevalence = 50%
PPV = 94.7%
NPV = 90.5%
Diagnostic Accuracy = 92.5%
19 PPV & NPV with Prevalence
Sensitivity = 90%, Specificity = 95%
Prevalence             50%      5%       0.5%     0.05%
PPV                    94.7%    48.6%    8.3%     0.9%
NPV                    90.5%    99.4%    99.9%    100%
Diagnostic Accuracy    92.5%    94.8%
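The dependence of the predictive values on prevalence follows from Bayes' theorem, and the figures on this slide can be reproduced from sensitivity, specificity and prevalence alone. The sketch below is illustrative; the function name `predictive_values` is an assumption made for this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV, accuracy) for a given prevalence."""
    # Expected fraction of the tested population in each cell of the 2 X 2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = tp + tn
    return ppv, npv, accuracy


# Sensitivity 90% and specificity 95%, at the four prevalences on the slide.
for prevalence in (0.5, 0.05, 0.005, 0.0005):
    ppv, npv, accuracy = predictive_values(0.90, 0.95, prevalence)
    print(f"prevalence {prevalence:.2%}: "
          f"PPV {ppv:.1%}, NPV {npv:.1%}, accuracy {accuracy:.1%}")
```

Sensitivity and specificity stay fixed throughout; only the mix of sick and healthy people changes, and that alone moves the PPV from about 95% to under 1%.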
20 Trade-offs between Sensitivity and Specificity
Ideally, we would have a test that is both highly sensitive and highly specific. However, this is usually not possible. There is a trade-off between sensitivity and specificity whenever we change the decision threshold for a test: moving the cut-off to gain sensitivity costs specificity, and vice versa.
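The trade-off can be illustrated numerically. The sketch below assumes, purely for illustration, that the test criterion is normally distributed in both groups and that higher values indicate disease; the means and standard deviations are hypothetical and do not come from the presentation.

```python
from statistics import NormalDist

# Hypothetical distributions of the test criterion (assumed values).
healthy = NormalDist(mu=100, sigma=10)
sick = NormalDist(mu=130, sigma=10)


def sens_spec(cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    sensitivity = 1 - sick.cdf(cutoff)   # sick people above the cut-off
    specificity = healthy.cdf(cutoff)    # healthy people below the cut-off
    return sensitivity, specificity


cutoffs = [105, 110, 115, 120, 125]
results = [sens_spec(c) for c in cutoffs]
for cutoff, (sensitivity, specificity) in zip(cutoffs, results):
    print(f"cut-off {cutoff}: sensitivity {sensitivity:.3f}, "
          f"specificity {specificity:.3f}")
```

As the cut-off rises, sensitivity falls and specificity rises; no cut-off makes both perfect as long as the two distributions overlap.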
21 Sensitivity and Specificity solve the wrong problem!!!
When we use a diagnostic test clinically, we do not know who actually has and who does not have the target disorder; if we did, we would not need the diagnostic test.
Our clinical concern is not a vertical one of Sensitivity and Specificity (down the columns of the 2 X 2 table), but a horizontal one of the meaning of positive and negative test results (along its rows).
BE-Workshop-DT-July2007
22 When a clinician uses a test, which question is important?
If I obtain a positive test result, what is the probability that this person actually has the disease?
If I obtain a negative test result, what is the probability that the person does not have the disease?
23 Test parameters
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        a               b              a+b
Negative (T-)        c               d              c+d
Total                a+c             b+d            N
Sensitivity = Pr(T+|D+) = a/(a+c)
Specificity = Pr(T-|D-) = d/(b+d)
PPV = Pr(D+|T+) = a/(a+b)
NPV = Pr(D-|T-) = d/(c+d)
24 Likelihood Ratios
A likelihood ratio is a ratio of two probabilities.
Likelihood ratios state how many times more (or less) likely a particular test result is to be observed in patients with the disease than in those without the disease.
LR+ tells how much the odds of the disease increase when a test is positive.
LR- tells how much the odds of the disease decrease when a test is negative.
25 The likelihood ratio for a positive result (LR+) tells how much the odds of the disease increase when a test is positive.
The likelihood ratio for a negative result (LR-) tells how much the odds of the disease decrease when a test is negative.
26 Likelihood Ratios
The LR for a positive test is defined as:
LR (+) = Prob(T+|D) / Prob(T+|ND)
LR (+) = [TP/(TP+FN)] / [FP/(FP+TN)]
LR (+) = (Sensitivity) / (1-Specificity)
27 Likelihood Ratios
The LR for a negative test is defined as:
LR (-) = Prob(T-|D) / Prob(T-|ND)
LR (-) = [FN/(TP+FN)] / [TN/(FP+TN)]
LR (-) = (1-Sensitivity) / (Specificity)
28 What is a good likelihood ratio?
An LR (+) of more than 10 or an LR (-) of less than 0.1 provides convincing diagnostic evidence.
An LR (+) of more than 5 or an LR (-) of less than 0.2 is considered to give strong diagnostic evidence.
29 Likelihood Ratio: Example
                     Disease status (gold standard)
Diagnostic test      Present (D+)    Absent (D-)    Total
Positive (T+)        90              5              95
Negative (T-)        10              95             105
Total                100             100            200
Likelihood Ratio for a positive test = (90/100) / (5/100) = 90/5 = 18
Likelihood Ratio for a negative test = (10/100) / (95/100) = 10/95 = 0.11
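The same example can be computed directly from the four cell counts. This is a minimal sketch; the function name `likelihood_ratios` is an illustrative choice.

```python
def likelihood_ratios(a, b, c, d):
    """Return (LR+, LR-) from a 2 X 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Counts from the example table on this slide.
lr_pos, lr_neg = likelihood_ratios(a=90, b=5, c=10, d=95)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Because both ratios are built only from sensitivity and specificity, they do not change with prevalence, unlike the predictive values.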
30 Exercise
In a hypothetical example of a diagnostic test, serum levels of a biochemical marker of a particular disease were compared with the known diagnosis of the disease. A level of 100 international units of the marker or greater was taken as an arbitrary positive test result:
31 Example
              Disease status
Marker        Present    Absent    Total
>=100         431        30        461
<100          29         116       145
Total         460        146       606
32 Exercise
Initial creatine phosphokinase (CK) levels were related to the subsequent diagnosis of acute myocardial infarction (MI) in a group of patients with suspected MI. Four ranges of CK result were chosen for the study:
33 Exercise
              Disease status
CPK           Present    Absent    Total
>=280         97         1         98
80-279        118        15        133
40-79         13         26        39
1-39          2          88        90
Total         230        130       360
35 Odds and Probability
Disease status    Present    Absent    Total
                  a          b         a+b
Probability of disease = (# with disease) / (# with disease + # without disease) = a/(a+b)
Odds of disease = (# with disease) / (# without disease) = a/b
Probability = Odds / (Odds + 1)
Odds = Probability / (1 - Probability)
36 Use of Likelihood Ratio
Employ the following three-step procedure:
1. Identify the pre-test probability and convert it to pre-test odds.
2. Determine the post-test odds using the formula: Post-test Odds = Pre-test Odds * Likelihood Ratio
3. Convert the post-test odds into post-test probability.
37 Likelihood Ratio: Example
A 52-year-old woman presents after detecting a 1.5 cm breast lump on self-examination. On clinical examination, the lump is not freely movable. If the pre-test probability is 20% and the LR for a non-movable breast lump is 4, calculate the probability that this woman has breast cancer.
38 Likelihood Ratio: Solution
First step
Pre-test probability = 0.2
Pre-test odds = Pre-test prob / (1 - Pre-test prob)
Pre-test odds = 0.2/(1-0.2) = 0.2/0.8 = 0.25
Second step
Post-test odds = Pre-test odds * LR
Post-test odds = 0.25*4 = 1
Third step
Post-test probability = Post-test odds / (1 + Post-test odds)
Post-test probability = 1/(1+1) = 1/2 = 0.5
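The three-step procedure translates directly into code. The sketch below reproduces the breast lump example; the function names are illustrative assumptions, not part of the presentation.

```python
def prob_to_odds(probability):
    """Odds = Probability / (1 - Probability)."""
    return probability / (1 - probability)


def odds_to_prob(odds):
    """Probability = Odds / (Odds + 1)."""
    return odds / (odds + 1)


def post_test_probability(pre_test_probability, likelihood_ratio):
    pre_test_odds = prob_to_odds(pre_test_probability)   # step 1
    post_test_odds = pre_test_odds * likelihood_ratio    # step 2
    return odds_to_prob(post_test_odds)                  # step 3


# Pre-test probability 20%, LR for a non-movable breast lump = 4.
result = post_test_probability(0.20, 4)
print(f"Post-test probability = {result:.2f}")
```

The conversion to odds is needed because the likelihood ratio multiplies odds, not probabilities; multiplying 0.2 by 4 directly would give 0.8, which is not the post-test probability.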
41 Receiver Operating Characteristic (ROC)
Finding a best test
Finding a best cut-off
Finding a best combination
[Figure labels: probably negative, equivocal, probably positive, definitive positive]
Another way to express the relationship between sensitivity and specificity for a given test is to construct a curve, called the Receiver Operating Characteristic (ROC) curve.
43 ROC curve constructed from multiple test thresholds
44 Receiver Operating Characteristic (ROC)
The ROC curve allows comparison of different tests for the same condition without (before) specifying a cut-off point.
The test with the largest AUC (area under the curve) is the best.
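As an illustration of how ROC points and the AUC arise from multiple thresholds, the sketch below uses the CK counts from the exercise table earlier in this transcript, treating each CK range boundary as a possible cut-off. The trapezoidal rule used for the area is one common choice and is an assumption here; the presentation does not say how the AUC should be computed.

```python
# CK strata from highest to lowest: >=280, 80-279, 40-79, 1-39.
present = [97, 118, 13, 2]   # patients with MI in each stratum
absent = [1, 15, 26, 88]     # patients without MI in each stratum

total_present = sum(present)
total_absent = sum(absent)

# Lower the cut-off one stratum at a time and record (FP rate, sensitivity).
points = [(0.0, 0.0)]
tp = fp = 0
for n_present, n_absent in zip(present, absent):
    tp += n_present
    fp += n_absent
    points.append((fp / total_absent, tp / total_present))

# Area under the ROC curve by the trapezoidal rule.
auc = 0.0
for (x0, y0), (x1, y1) in zip(points, points[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2

for fpr, tpr in points:
    print(f"FP rate {fpr:.3f}, sensitivity {tpr:.3f}")
print(f"AUC = {auc:.3f}")
```

Each point on the curve is one (1 - Specificity, Sensitivity) pair; a test no better than chance would follow the diagonal and give an AUC of 0.5.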
47 Features of a good diagnosis study
Comparative (compares the new test against the old test).
There should be a “gold standard”.
Should include both positive and negative results.
Usually will involve “blinding” of patient, tester and investigator.
48 Gold standard
In any study of diagnosis, the method being evaluated has to be compared to something.
The best available test that is used as the comparison is called the GOLD STANDARD.
We need to remember that not all gold standards are gold; a new test may be better than the gold standard.
49 Typical setting for finding Sensitivity and Specificity
Best if everyone who gets the new test also gets the “gold standard”.
Doesn’t happen in the real world.
Not even a sample of each (case-control type).
Case series of patients who had both tests.
50 Setting for finding Sensitivity and Specificity
Sensitivity should not be tested in the “sickest of sick”.
Should include the spectrum of disease.
Specificity should not be tested in the “healthiest of healthy”.
Should include similar conditions.
51 Precision
How precise are the estimates of Sensitivity, Specificity, False Positive Rate, False Negative Rate, Positive Predictive Value and Negative Predictive Value?
If they are reported without a measure of precision, clinicians cannot know the range within which the true values of the indices are likely to lie.
When evaluations of diagnostic accuracy are reported, the precision of the test characteristics should be stated.
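One common measure of precision is a confidence interval around each proportion. The sketch below uses the Wilson score interval, which is an assumption made for illustration; the presentation does not name a method, and other intervals (for example, exact binomial) are also in use. The counts are those of the earlier example table (sensitivity = 90/100).

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion successes/n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denominator = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denominator
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denominator
    return centre - half_width, centre + half_width


# Sensitivity: 90 test positives among 100 diseased subjects.
low, high = wilson_interval(90, 100)
print(f"Sensitivity = 0.90, 95% CI {low:.3f} to {high:.3f}")
```

The same function applies to specificity and the predictive values by passing the appropriate numerator and denominator; smaller studies give visibly wider intervals.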
54 Exercise
Dr Egbert Everard wants to test a new blood test (Sithtastic) for the diagnosis of the dark side gene. He wants the test to have a sensitivity of at least 70% and a specificity of 90% with 5% confidence levels. Disease prevalence in this population is 10%.
(i) How many patients does Egbert need to be 95% sure his test is more than 70% sensitive?
(ii) How many patients does Egbert need to be 95% sure that his test is more than 90% specific?
55 Biases in Research on Diagnostic Tests
Observer Bias
Spectrum Bias
Reference Test Bias
Bias Index
Work-Up (Verification) Bias
Diagnostic Suspicion Bias
56 Observer bias
Blinding
Investigators should be blinded to the test results when interpreting the reference test, and blinded to the reference test results when interpreting the test.
Should they also be blinded to other patient characteristics?
57 Spectrum bias
Indeterminate results dropped from analysis
58 Reference Test Bias
What if the ‘Gold Standard’ is not gold after all?
Absence of a gold standard
Methods to deal with the absence of a gold standard:
Correcting for Reference Test Bias (Gart & Buck)
Bayesian estimations (Joseph, Gyorkos, Coupal)
Latent class modeling (Walter, Cook, Irwig)
59 BIAS INDEX
What if the test itself commits one type of error more commonly than the other?
BI = (b-c)/N
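As a small worked illustration, the bias index can be computed for the example table used earlier in this transcript (b = 5 false positives, c = 10 false negatives, N = 200). The function name `bias_index` is an illustrative choice.

```python
def bias_index(b, c, n):
    """BI = (b - c) / N, with b = false positives and c = false negatives."""
    return (b - c) / n


bi = bias_index(b=5, c=10, n=200)
print(f"BI = {bi:.3f}")
```

By the formula, a positive BI means false positives outnumber false negatives, a negative BI means the reverse, and a BI of zero means the two kinds of error are equally common.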
60 Work-up (Verification) Bias
Occurs when a test efficacy study is restricted to patients in whom the disease status is known.
A study by Borow et al (Am Heart J, 1983) on patients who were referred for valve surgery on the basis of echocardiographic assessment reported excellent diagnostic agreement between the findings at echocardiography and at surgery.
61 Review Bias
The ‘Test’ and ‘Gold Standard’ should follow a randomized sequence of administration.
This tends to offset the Diagnostic Suspicion Bias that may creep in when the Gold Standard is always applied and interpreted last.
It will also balance any effect of time on rapidly increasing severity of the disease, and thereby avoid a bias towards more positives in whichever test is performed later.
62 Ethical Issues in Diagnostic Test Research
Invasive techniques
Labeling
Confidentiality
Human subjects
63 QUALITIES OF STUDIES EVALUATING DIAGNOSTIC TESTS
Reid MC et al. Use of methodological standards in diagnostic test research: getting better but still not good. JAMA 1995; 274: 645.
Review of studies published between 1990-93.
Work-up Bias: 38% of studies
Observer Bias (Blinding): 53% of studies
Bias from Indeterminate Results: 62% of studies
No assessment of variability across test observers, test instruments, or time: 68% of studies
64 QUALITIES OF STUDIES EVALUATING DIAGNOSTIC TESTS
Small Sample Size, with no description of Confidence Intervals: 76% of studies
Patient Characteristics not described: 68% of studies
Possible Interactions or Effect Modification Ignored: 88% of studies
Only two (6%) of 34 articles published from (N Engl J Med, JAMA, Lancet, BMJ) met six or more of the Standards.
65 How to use an Article about a Diagnostic Test?
USERS’ GUIDES TO THE MEDICAL LITERATURE
Are the results of the study valid?
What are the results and will they help me in caring for my patients?
66 Methodological Questions for Appraising Journal Articles about Diagnostic Tests
1. Was there an independent, ‘blind’ comparison with a ‘gold standard’ of diagnosis?
2. Was the setting for the study, as well as the filter through which the study patients passed, adequately described?
3. Did the patient sample include an appropriate spectrum of disease?
4. Was an analysis of the pertinent subgroups done?
5. Were the tactics for carrying out the test described in sufficient detail to permit their exact replication?
67
6. Was the reproducibility of the test result (precision) and its interpretation (observer variation) determined?
7. Was the term ‘normal’ defined sensibly?
8. Was the precision of the test statistics given?
9. Were the indeterminate test results presented?
10. If the test is advocated as a part of a cluster or sequence of tests, was its contribution to the overall validity of the cluster or sequence determined?
11. Was the ‘utility’ of the test determined?