ROC Analysis Emily Kistner-Griffin, PhD Amy Wahlquist, MS Cancer Prevention and Control Statistics Tutorial August 13, 2009.


1 ROC Analysis Emily Kistner-Griffin, PhD Amy Wahlquist, MS Cancer Prevention and Control Statistics Tutorial August 13, 2009

2 Outline
I. Motivating Example: Chest CT
II. Classification
III. Sensitivity and Specificity
IV. ROC curve and AUC estimation
    a. Nonparametric Curve
    b. Parametric Curve
V. ROC and Logistic Regression
VI. Comparing ROC curves

3 I. Motivating Example: Chest CT
- Evaluating the probability of malignancy in pulmonary nodules seen on chest CT in 213 MUSC patients from two cohorts
- Sample of 194 subjects seen in pulmonary clinic and 19 subjects with a CT prior to an unrelated surgical intervention
- Develop a prediction model from clinical data and radiological characteristics of lung nodules

4 Chest CT
- A model of P(malignancy) of pulmonary nodules has been described in the literature (Swensen SJ et al., 1997)
- Model included three demographic characteristics: patient age, smoking status (ever vs. never), any history of cancer
- Model included three radiological characteristics: diameter, upper lobe location, and spiculation

5 Chest CT
- Swensen et al. reported an area under the receiver operating characteristic (ROC) curve of 0.8014 ± 0.0360 in a validation sample, using a logistic regression approach
- Interested in how well Swensen’s model performs in the MUSC cohort
- Interested in evaluating whether we can improve the prediction model by including other patient characteristics

6 II. Classification
- Consider medical tests that are measured on a continuous or ordinal scale
- Goal: to describe the performance of the medical test in classifying subjects into individuals with and without disease
- Examples: PSA and CA-125 as biomarkers of prostate and ovarian cancer; BI-RADS for breast imaging (radiologist-determined probability of malignancy)

7 Classification from CT
- Consider the diameter of the nodule as measured on the CT scan (range: 3.3 mm to 15 mm)
- Larger nodules are more likely to be malignant (OR: 1.34, 95% CI: 1.20-1.49)
- How well can we predict malignancy from nodule diameter?

8 Contingency Table
Diameter d (mm)    d<6    6≤d<8    8≤d<10    10≤d<12    12≤d≤15
Benign              41       30        29         17         24
Malignant            2       10        13         14         33

9 Classification Tables
Choose a cut-point on the continuous or ordinal scale in order to assign disease status

                   Truth D=1    Truth D=0
Classified D=1        TP           FP
Classified D=0        FN           TN

10 III. Sensitivity & Specificity
- For a selected cut-point, determine the sensitivity and specificity of the medical test (or prediction model)
- Sensitivity = Pr(classified D=1 | truth D=1) = TP / (TP + FN) = TPF (true positive fraction)
- Specificity = Pr(classified D=0 | truth D=0) = TN / (TN + FP) = TNF (true negative fraction)
- To summarize test characteristics, sensitivity and specificity must be computed at multiple cut-points

11 Sensitivity & Specificity Example
Cut-point (mm)    Sensitivity    Specificity
 6                   0.972          0.291
 8                   0.833          0.504
10                   0.653          0.709
12                   0.458          0.830
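The values in the example can be reproduced from the contingency table on slide 8. The following is a minimal sketch, not the authors' code: the function name `sens_spec` and the list layout are illustrative assumptions, and a nodule is called positive when its diameter is at or above the cut-point.

```python
# Counts per diameter category, copied from the contingency table on slide 8.
# Categories (mm): d<6, 6<=d<8, 8<=d<10, 10<=d<12, 12<=d<=15
benign = [41, 30, 29, 17, 24]
malignant = [2, 10, 13, 14, 33]
cut_points = [6, 8, 10, 12]  # lower edges of categories 2..5


def sens_spec(k):
    """Call a nodule positive if it falls in category index k or higher."""
    tp = sum(malignant[k:])  # malignant, classified positive
    fn = sum(malignant[:k])  # malignant, classified negative
    tn = sum(benign[:k])     # benign, classified negative
    fp = sum(benign[k:])     # benign, classified positive
    return tp / (tp + fn), tn / (tn + fp)


# Cut-point 6 corresponds to category index 1, cut-point 8 to index 2, and so on.
results = {cut: sens_spec(k) for k, cut in enumerate(cut_points, start=1)}

for cut, (sens, spec) in results.items():
    print(f"d >= {cut:>2} mm: sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Each cut-point gives one (1 - specificity, sensitivity) pair, which is one point on the empirical ROC curve.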

12 From Metz CE (1978) Basic Principles of ROC Analysis. Seminars in Nuclear Medicine; 8 (4): 283 – 297.


15 Decision Threshold
- Lowering the threshold increases the TPF (sensitivity) and the FPF (1-specificity)
- Raising the threshold decreases the TPF and the FPF
- Points representing all possible TPF and FPF lie on a curve, passing through the lower (0,0) corner when all tests are called negative and the upper (1,1) corner when all the tests are called positive
- If the test is informative then all other points on the curve must be above the diagonal (TP more likely than FP)
- The curve describing the compromises between TPF and FPF is called the ROC curve


17 . roctab malignant diameter, detail graph

Detailed report of Sensitivity and Specificity
------------------------------------------------------------------------------
                                           Correctly
Cutpoint      Sensitivity   Specificity   Classified          LR+          LR-
------------------------------------------------------------------------------
( >= 3.3 )        100.00%         0.00%       33.80%       1.0000
( >= 4 )          100.00%         1.42%       34.74%       1.0144       0.0000
( >= 5 )           97.22%        13.48%       41.78%       1.1236       0.2061
( >= 6 )           97.22%        29.08%       52.11%       1.3708       0.0955
( >= 7 )           93.06%        39.72%       57.75%       1.5436       0.1749
( >= 8 )           83.33%        50.35%       61.50%       1.6786       0.3310
( >= 9.1 )         70.83%        61.70%       64.79%       1.8495       0.4727
( >= 10 )          65.28%        70.92%       69.01%       2.2449       0.4896
( >= 11 )          56.94%        78.72%       71.36%       2.6764       0.5469
( >= 12 )          45.83%        82.98%       70.42%       2.6927       0.6528
( >= 13 )          25.00%        90.78%       68.54%       2.7115       0.8262
( >= 14 )          13.89%        95.04%       67.61%       2.7976       0.9061
( >= 15 )           0.00%        98.58%       65.26%       0.0000       1.0144
( > 15 )            0.00%       100.00%       66.20%                    1.0000
------------------------------------------------------------------------------

                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
         --------------------------------------------------------
           213     0.7411       0.0347        0.67317     0.80900

18 Likelihood Ratios
- LR+ = sensitivity / (1 - specificity) = TPF / FPF
- LR- = (1 - sensitivity) / specificity = (1 - TPF) / (1 - FPF)
- LR+ is the slope between the origin and the point on the ROC curve, and LR- is the slope between the point on the curve and the (1,1) point (Choi 1998)
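As a quick check of these definitions, the sketch below recomputes both ratios at the cut-point d ≥ 10 mm using the counts from the contingency table on slide 8 (47 of 72 malignant and 41 of 141 benign nodules are called positive). The helper name `likelihood_ratios` is a hypothetical choice for illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for one cut-point; None where a ratio is undefined."""
    tpf = sensitivity
    fpf = 1.0 - specificity
    lr_pos = tpf / fpf if fpf > 0 else None                    # slope from (0,0) to the ROC point
    lr_neg = (1.0 - tpf) / (1.0 - fpf) if fpf < 1 else None    # slope from the ROC point to (1,1)
    return lr_pos, lr_neg


# Cut-point d >= 10 mm: sensitivity = 47/72, specificity = 100/141
lr_pos, lr_neg = likelihood_ratios(47 / 72, 100 / 141)
print(f"LR+ = {lr_pos:.4f}, LR- = {lr_neg:.4f}")  # prints "LR+ = 2.2449, LR- = 0.4896"
```

These agree with the ( >= 10 ) row of the detailed roctab report on slide 17.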

19 IV. ROC curve and AUC estimation
- ROC: Receiver Operating Characteristic
- Developed in signal detection theory to illustrate how the receiver distinguishes between signal and noise (1960s)
- Illustration of two test characteristics: sensitivity and specificity at selected cut-points (decision thresholds)
- Popularized in medical testing in the field of Radiology (1980s)

20 ROC curve and Thresholds
- ROC curve describes disease detection independent of disease prevalence (as do sensitivity and specificity)
- Prevalence may help determine the operating threshold:
  - Low prevalence suggests reducing FPF (higher specificity, higher threshold, lower part of the curve)
  - High prevalence suggests increasing TPF (higher sensitivity, lower threshold, higher part of the curve)
- In practice, must consider costs and consequences of FP and FN before selecting the desirable cut-off:
  - Consequence of FN: death?
  - Consequence of FP: stressful, costly work-up or treatment

21 Area Under the ROC Curve
- Summarizes the performance of the test
- Probability that the result of the test for a randomly selected abnormal subject will be greater than the result of the test for a randomly selected normal subject
- Average TPF: averaged across the whole range of FPF in (0,1)
- Perfect test gives AUC = 1.0 and an uninformative test gives AUC = 0.50
- Parametric and non-parametric approaches to constructing the ROC curve and calculating the area under the curve (AUC)
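The probability interpretation can be illustrated by counting pairs directly. The sketch below is an assumption-laden illustration: it uses the five diameter categories from slide 8 rather than the raw diameters, counts tied pairs (same category) as one half, and uses the hypothetical function name `auc_from_pairs`.

```python
# Counts per diameter category from slide 8, ordered from smallest to largest diameter.
benign = [41, 30, 29, 17, 24]
malignant = [2, 10, 13, 14, 33]


def auc_from_pairs(diseased, healthy):
    """Fraction of (diseased, healthy) pairs in which the diseased subject
    has the higher test category; ties count one half."""
    concordant = 0.0
    for i, n_d in enumerate(diseased):
        for j, n_h in enumerate(healthy):
            if i > j:
                concordant += n_d * n_h        # diseased subject ranks higher
            elif i == j:
                concordant += 0.5 * n_d * n_h  # tie within the same category
    return concordant / (sum(diseased) * sum(healthy))


auc = auc_from_pairs(malignant, benign)
print(f"AUC from pair counting (grouped data): {auc:.4f}")
```

Because the diameters are grouped into five categories here, the result is a little lower than the 0.7411 that roctab reports for the ungrouped diameters.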

22 a. Nonparametric ROC Curve
- Constructed by plotting sensitivity against (1 - specificity) at each possible cut-point
- Area under the curve (AUC) computed using the trapezoidal rule
- Variance estimators have been derived: DeLong et al. (1988); Hanley and McNeil (1982); Bamber (1975)
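A minimal sketch of the construction, assuming the grouped counts from slide 8 stand in for the raw diameters: each category boundary is treated as a cut-point, the (FPF, TPF) points are collected, and the trapezoidal rule is applied. The function names `roc_points` and `trapezoid_auc` are illustrative, not taken from any package.

```python
# Counts per diameter category from slide 8, ordered from smallest to largest diameter.
benign = [41, 30, 29, 17, 24]
malignant = [2, 10, 13, 14, 33]


def roc_points(diseased, healthy):
    """Empirical ROC points (FPF, TPF), from the strictest to the loosest cut-point."""
    n_d, n_h = sum(diseased), sum(healthy)
    points = [(0.0, 0.0)]  # threshold above every observation: nothing called positive
    tp = fp = 0
    # Lower the threshold one category at a time, starting from the largest diameters.
    for d, h in zip(reversed(diseased), reversed(healthy)):
        tp += d
        fp += h
        points.append((fp / n_h, tp / n_d))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


points = roc_points(malignant, benign)
auc = trapezoid_auc(points)
for fpf, tpf in points:
    print(f"FPF = {fpf:.3f}, TPF = {tpf:.3f}")
print(f"Trapezoidal AUC (grouped data): {auc:.4f}")
```

The trapezoidal area equals the pair-counting estimate with ties counted as one half, so the two views of the AUC give the same number on the same data.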

23 Variance of AUC Specifically for Delong et al. (1988) variance estimate:
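The formula shown on this slide did not survive in the transcript. For reference, the DeLong et al. (1988) estimator is usually written as below; the notation (X for diseased subjects, Y for non-diseased subjects, m and n for the group sizes) is an assumption and may differ from the slide.

```latex
\[
\psi(X, Y) =
\begin{cases}
1 & \text{if } Y < X \\
\tfrac{1}{2} & \text{if } Y = X \\
0 & \text{if } Y > X
\end{cases}
\qquad
\hat{\theta} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \psi(X_i, Y_j)
\]
\[
V_{10}(X_i) = \frac{1}{n} \sum_{j=1}^{n} \psi(X_i, Y_j),
\qquad
V_{01}(Y_j) = \frac{1}{m} \sum_{i=1}^{m} \psi(X_i, Y_j)
\]
\[
S_{10} = \frac{1}{m-1} \sum_{i=1}^{m} \bigl( V_{10}(X_i) - \hat{\theta} \bigr)^2,
\qquad
S_{01} = \frac{1}{n-1} \sum_{j=1}^{n} \bigl( V_{01}(Y_j) - \hat{\theta} \bigr)^2
\]
\[
\widehat{\operatorname{Var}}(\hat{\theta}) = \frac{S_{10}}{m} + \frac{S_{01}}{n}
\]
```

Here the AUC estimate is the average of the pairwise comparisons, and the variance combines the spread of the per-subject components in each group.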

24 Confidence Intervals for AUC
- Must consider distribution of AUC estimate: asymptotically normal or binomial assumption
- Must select standard error estimate (DeLong et al. approach is the default):

                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
         --------------------------------------------------------
           213     0.7411       0.0347        0.67317     0.80900

. roctab malignant diameter, binomial

                      ROC                    -- Binomial Exact --
           Obs       Area     Std. Err.      [95% Conf. Interval]
         --------------------------------------------------------
           213     0.7411       0.0347        0.67754     0.79916
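The asymptotic normal interval is the estimate plus or minus a normal quantile times the standard error. The sketch below recomputes it from the rounded values printed above, so it should agree with the Stata output only to about three decimals; the function name `normal_ci` is hypothetical, and the binomial exact interval is not reproduced.

```python
from statistics import NormalDist


def normal_ci(estimate, std_err, level=0.95):
    """Two-sided Wald interval: estimate +/- z * std_err."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    return estimate - z * std_err, estimate + z * std_err


# Rounded AUC and standard error as printed in the roctab output
lower, upper = normal_ci(0.7411, 0.0347)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```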

25 b. Parametric ROC Curve
- Assumes a binormal model
- A monotone transformation of the test results exists to give results that are normally distributed in the diseased and non-diseased populations
- Method involves fitting a straight line to the empirical ROC points by plotting using normal probability scales on each axis (plot inverse of the standard normal cumulative distribution function for sensitivity and specificity)
- Intercept of the line is the standardized difference in the continuous variable between the two populations; slope is a ratio of the standard deviations

26 Parametric AUC Estimation AUC is a function of the slope and intercept of the estimated line – using the standard normal cumulative distribution function
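The slide's formula is missing from the transcript. Under the binormal model the area is commonly written as AUC = Φ(a / sqrt(1 + b²)), where a is the intercept, b is the slope, and Φ is the standard normal cumulative distribution function. The sketch below assumes that expression and applies it to the intercept and slope from the rocfit output on slide 28; the function name `binormal_auc` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def binormal_auc(intercept, slope):
    """Area under a binormal ROC curve: Phi(a / sqrt(1 + b^2))."""
    return NormalDist().cdf(intercept / sqrt(1.0 + slope ** 2))


# Intercept and slope as reported by rocfit on slide 28
auc = binormal_auc(0.997803, 1.170680)
print(f"Binormal AUC: {auc:.4f}")
```

The result agrees with the ROC area of 0.741532 in the "Indices from binormal fit" table to about three decimals.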

27 Nonparametric vs. Parametric
- Parametric approaches assume a binormal distribution to make inferences (obtain MLE): only when the assumption is true are the estimators unbiased
- With continuous data a nonparametric approach is recommended
- With discrete ratings a parametric approach is recommended, as nonparametric approaches tend to underestimate the true AUC
- Note that the standard error of the AUC is smaller using a continuous scale

28 . rocfit malignant diameter, cont(10)

Fitting binormal model:

Binormal model of malignant on diameter         Number of obs     =        213
Goodness-of-fit chi2(7) =       8.52
Prob > chi2             =     0.2894
Log likelihood          = -456.16837
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   intercept |   0.997803   0.181857     5.49   0.000     0.641370    1.354236
   slope (*) |   1.170680   0.139487     1.22   0.221     0.897290    1.444070
-------------+----------------------------------------------------------------
       /cut1 |  -1.296367   0.141750    -9.15   0.000    -1.574192   -1.018542
       /cut2 |  -0.668255   0.110960    -6.02   0.000    -0.885733   -0.450777
       /cut3 |  -0.222392   0.102919    -2.16   0.031    -0.424110   -0.020674
       /cut4 |   0.202507   0.101135     2.00   0.045     0.004286    0.400729
       /cut5 |   0.499186   0.103559     4.82   0.000     0.296214    0.702159
       /cut6 |   0.756664   0.109249     6.93   0.000     0.542539    0.970788
       /cut7 |   1.040925   0.119741     8.69   0.000     0.806237    1.275614
       /cut8 |   1.541544   0.150124    10.27   0.000     1.247307    1.835781
       /cut9 |   2.369036   0.244933     9.67   0.000     1.888975    2.849096
------------------------------------------------------------------------------
             |                  Indices from binormal fit
       Index |   Estimate   Std. Err.                     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ROC area |   0.741532   0.034471                      0.673970    0.809094
    delta(m) |   0.852328   0.144007                      0.570080    1.134576
        d(e) |   0.919346   0.151542                      0.622329    1.216364
        d(a) |   0.916517   0.150751                      0.621050    1.211985
------------------------------------------------------------------------------
(*) z test for slope==1

29 . rocplot, confband

30 V. ROC and Logistic Regression
Prediction Model from Chest CT
- Use logistic regression to create probabilities of malignancy (represent diagnostic results from multiple predictors)
- Compare two logistic models of malignancy: one from previous literature and a model with selected variables from the MUSC data
  - Variables suggested in Swensen SJ et al. + surgical cohort (variable describing collection of samples)
  - Variables selected using backwards regression in MUSC data

31 . logistic malignant surgical_cohort patient_age any_non_lung_cancer_history lung_cancer_history smoker_ever diameter upper_lobe spiculated

Logistic regression                             Number of obs     =        207
                                                LR chi2(8)        =      73.20
                                                Prob > chi2       =     0.0000
Log likelihood = -94.454613                     Pseudo R2         =     0.2793
------------------------------------------------------------------------------
   malignant | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
surgical_c~t |   7.045799    4.91585     2.80   0.005     1.794929    27.65751
 patient_age |   .9933921   .0184868    -0.36   0.722     .9578115    1.030294
any_non_lu~y |   4.017493   1.537066     3.63   0.000     1.897978     8.50392
lung_cance~y |   10.43958   8.011157     3.06   0.002     2.319987     46.9765
 smoker_ever |   1.026627   .5138138     0.05   0.958     .3849437    2.737967
    diameter |   1.233463   .0787204     3.29   0.001     1.088433    1.397817
  upper_lobe |   1.483983   .5613942     1.04   0.297     .7069965    3.114874
  spiculated |   2.094564   .8488535     1.82   0.068     .9465232    4.635065
------------------------------------------------------------------------------

. predict swensen

32 . lsens, gensens(sensitivity) genspec(specificity) genpr(cutoffs)

33 . lroc

34 Use saved predicted probabilities from logistic model:

. roctab malignant swensen

                      ROC                    -Asymptotic Normal--
           Obs       Area     Std. Err.      [95% Conf. Interval]
         --------------------------------------------------------
           207     0.8344       0.0294        0.77682     0.89203

. roctab malignant swensen, graph

Postestimation: 95% CI

35 VI. Comparing ROC curves
Again use saved predicted probabilities from logistic model:

. roccomp malignant diameter swensen

                              ROC                    -Asymptotic Normal--
                   Obs       Area     Std. Err.      [95% Conf. Interval]
-------------------------------------------------------------------------
diameter           207     0.7351       0.0357        0.66518     0.80499
swensen            207     0.8344       0.0294        0.77682     0.89203
-------------------------------------------------------------------------
Ho: area(diameter) = area(swensen)
    chi2(1) =     9.52       Prob>chi2 =   0.0020

36 Testing AUC Equality
Using quantities defined by DeLong et al. for variance estimation to define a chi-squared test statistic:
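The test statistic itself is missing from the transcript. A common way to write the DeLong et al. test for two correlated AUC estimates, and its generalization to several curves through a contrast matrix L, is sketched below. The symbols (θ̂ for the vector of AUC estimates, S for its estimated covariance matrix) are assumptions and may not match the slide's notation.

```latex
\[
\chi^2 =
\frac{\bigl( \hat{\theta}_1 - \hat{\theta}_2 \bigr)^2}
     {\widehat{\operatorname{Var}}(\hat{\theta}_1)
      + \widehat{\operatorname{Var}}(\hat{\theta}_2)
      - 2\,\widehat{\operatorname{Cov}}(\hat{\theta}_1, \hat{\theta}_2)}
\;\sim\; \chi^2_{1}
\quad \text{under } H_0 : \theta_1 = \theta_2
\]
\[
\bigl( L \hat{\theta} \bigr)^{\top}
\bigl( L S L^{\top} \bigr)^{-1}
\bigl( L \hat{\theta} \bigr)
\;\sim\; \chi^2_{\operatorname{rank}(L S L^{\top})}
\quad \text{under } H_0 : L \theta = 0
\]
```

The covariance term matters because both curves are estimated on the same subjects, so the two AUC estimates are correlated.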

37 Models with Multiple Predictors

. logistic malignant diameter any_non_lung_cancer_history surgical_cohort lung_cancer_history pet_positive pack

Logistic regression                             Number of obs     =        206
                                                LR chi2(6)        =     112.09
                                                Prob > chi2       =     0.0000
Log likelihood = -75.983489                     Pseudo R2         =     0.4245
------------------------------------------------------------------------------
   malignant | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    diameter |   1.218243     .08847     2.72   0.007      1.05662    1.404588
any_non_lu~y |   3.830492   1.693158     3.04   0.002     1.610666    9.109691
surgical_c~t |   6.996053   5.682876     2.39   0.017     1.423719    34.37811
lung_cance~y |   10.16367   8.299078     2.84   0.005     2.051197    50.36092
pet_positive |   11.38458   5.025505     5.51   0.000      4.79259    27.04355
        pack |   1.007755   .0046908     1.66   0.097     .9986032    1.016991
------------------------------------------------------------------------------

. predict musc

38 . roccomp malignant diameter swensen musc, graph summary

39 . roccomp malignant diameter swensen musc

                              ROC                    -Asymptotic Normal--
                   Obs       Area     Std. Err.      [95% Conf. Interval]
-------------------------------------------------------------------------
diameter           202     0.7410       0.0359        0.67062     0.81131
swensen            202     0.8344       0.0298        0.77605     0.89272
musc               202     0.8987       0.0230        0.85374     0.94372
-------------------------------------------------------------------------
Ho: area(diameter) = area(swensen) = area(musc)
    chi2(2) =    22.81       Prob>chi2 =   0.0000

. rocgold malignant swensen diameter musc

-------------------------------------------------------------------------------
                       ROC                                          Bonferroni
                      Area     Std. Err.       chi2    df   Pr>chi2    Pr>chi2
-------------------------------------------------------------------------------
swensen (standard)  0.8344       0.0298
diameter            0.7410       0.0359      8.2690     1    0.0040     0.0081
musc                0.8987       0.0230      8.6304     1    0.0033     0.0066
-------------------------------------------------------------------------------
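The p-values in the rocgold table can be checked by hand. The sketch below assumes the Bonferroni column is the unadjusted p-value multiplied by the number of comparisons against the standard (two here) and capped at 1; the function names are illustrative, and only the chi-squared statistics printed above are used.

```python
from math import erfc, sqrt


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    With 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(stat / 2.0))


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# chi2 statistics for each model compared against swensen (the standard)
chi2_stats = {"diameter": 8.2690, "musc": 8.6304}

results = {}
for name, stat in chi2_stats.items():
    p = chi2_1df_pvalue(stat)
    results[name] = (p, bonferroni(p, len(chi2_stats)))
    print(f"{name:<9} Pr>chi2 = {results[name][0]:.4f}  Bonferroni = {results[name][1]:.4f}")
```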

40 Questions? Next: ROC in SPSS

41 b. Lorenz Curves
- ROC curve represents a monotone increasing function of the FPF (1-specificity)
- If the risk of disease does not vary monotonically with the diagnostic test then the ROC may not be convex
- Lee (1999) suggested a Lorenz curve (used commonly in economics) for such data
- The methodology involves reordering the test results to ensure that the ratio of disease subjects / no disease subjects in each category is increasing
- Must consider whether reordering makes practical sense (usually sensible on an ordinal scale but not necessarily on a continuous scale)


43 Defining Lorenz Curves
- Plot cumulative percent of individuals with disease against the cumulative percent of individuals without the disease at each cut-point
- Examples when a Lorenz curve might be appropriate:
  - Test has similar means but different variances across populations with and without disease
  - Bimodal distribution of test in either population
  - Skewed distribution in population with disease and symmetric distribution in population without the disease
- A flatter Lorenz curve suggests a worse diagnostic test
- Two summary indices describe the curvature:
  - Gini index: twice the area between the Lorenz curve and the diagonal line
  - Pietra index: twice the area of the largest triangle inscribed between the diagonal line and the curve

44 Lorenz Curves and ROC

. roctab malignant diameter, lorenz graph
. roctab malignant diameter, lorenz

Lorenz curve
---------------------------
Pietra index =    0.2322
Gini index   =    0.3301

- If the at-risk probabilities increase (or decrease) with increasing values of the test results then Gini = 2(AUC) - 1
- Larger Pietra and Gini indices describe better diagnostic tests
- Gini index is related to the average difference in post-test probabilities for two randomly selected subjects, and Pietra index is related to the average absolute change between pre- and post-test probabilities of disease

