
1 ROC & AUC, LIFT ד"ר אבי רוזנפלד

2 Introduction to ROC curves
ROC = Receiver Operating Characteristic. Started in electronic signal detection theory (1940s-1950s). Has become very popular in biomedical applications, particularly radiology and imaging. Also used in data mining.

3 False Positives / Negatives
Confusion matrix 1 (rows = actual, columns = predicted):
              Predicted P   Predicted N
Actual P      TP = 20       FN = 10
Actual N      FP = 30       TN = 90

Confusion matrix 2 (rows = actual, columns = predicted):
              Predicted P   Predicted N
Actual P      TP = 10       FN = 20
Actual N      FP = 15       TN = 105

For confusion matrix 1:
Precision (P) = 20 / 50 = 0.4
Recall (P) = 20 / 30 = 0.666
F-measure = 2 * 0.4 * 0.666 / (0.4 + 0.666) = 0.5
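As a quick check of these figures, here is a minimal Python sketch (the helper name is ours, not from the slides) that reproduces the precision, recall, and F-measure for confusion matrix 1:

```python
# A minimal sketch reproducing confusion matrix 1: TP = 20, FN = 10, FP = 30, TN = 90.
def precision_recall_f1(tp, fn, fp, tn):
    precision = tp / (tp + fp)                           # 20 / 50 = 0.4
    recall = tp / (tp + fn)                              # 20 / 30 = 0.666...
    f1 = 2 * precision * recall / (precision + recall)   # 0.533 / 1.066 = 0.5
    return precision, recall, f1                         # tn is not needed here

print(precision_recall_f1(tp=20, fn=10, fp=30, tn=90))   # (0.4, 0.666..., 0.5)
```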

4 Different Cost Measures
The confusion matrix (easily generalizes to multi-class). Machine learning methods usually minimize FP + FN.
TPR (True Positive Rate): TP / (TP + FN) = Recall
FPR (False Positive Rate): FP / (TN + FP) (note: this is not precision; precision is TP / (TP + FP))

              Predicted Yes         Predicted No
Actual Yes    TP: True positive     FN: False negative
Actual No     FP: False positive    TN: True negative
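Using the same matrix 1 figures, a small sketch (again with assumed names) of the two rates defined here; note that FPR is computed from the actual-negative row, unlike precision, which uses the predicted-positive column:

```python
def tpr_fpr(tp, fn, fp, tn):
    tpr = tp / (tp + fn)   # True Positive Rate = recall
    fpr = fp / (fp + tn)   # False Positive Rate = 1 - specificity
    return tpr, fpr

print(tpr_fpr(tp=20, fn=10, fp=30, tn=90))   # (0.666..., 0.25)
```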

5 Specific Example
[Figure: overlapping distributions of the test result for people without the disease and people with the disease]

6 Threshold
[Figure: a threshold on the test result; patients below it are called "negative", patients above it are called "positive"]

7 Some definitions: True Positives
[Figure: with-disease patients whose test result is above the threshold are called "positive" and are the true positives]

8 False Positives
[Figure: without-disease patients whose test result is above the threshold are called "positive" and are the false positives]

9 True negatives
[Figure: without-disease patients whose test result is below the threshold are called "negative" and are the true negatives]

10 False negatives
[Figure: with-disease patients whose test result is below the threshold are called "negative" and are the false negatives]

11 Moving the Threshold: left
[Figure: two candidate thresholds, "-" and "+", over the with-disease and without-disease test-result distributions]
Which line has the higher recall of -? Which line has the higher precision of -?

12 ROC curve
[Figure: ROC curve with True Positive Rate (Recall) on the y-axis and False Positive Rate (1 - specificity) on the x-axis, both running from 0% to 100%]

13 Figure 5.2 A sample ROC curve.
The jagged line shows, instance by instance (in order of the sorted list shown in Table 5.6), the increases in true positives (correctly predicted as yes) and false positives (incorrectly predicted as yes). It HAD BETTER BE ABOVE THE DIAGONAL LINE, or else the learning method is hurting us! Smooth curves may be drawn, or generated using cross-validation (see next slide).
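A hypothetical sketch of that instance-by-instance construction, with made-up scores and labels standing in for Table 5.6:

```python
# Sort the instances by predicted score, then step up for each true "yes"
# and right for each false "yes". Names and data are illustrative only.
def roc_points(scores, labels):
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    pos = sum(labels)            # number of actual "yes" instances
    neg = len(labels) - pos      # number of actual "no" instances
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1              # correctly predicted as yes: move up
        else:
            fp += 1              # incorrectly predicted as yes: move right
        points.append((fp / neg, tp / pos))
    return points                # (FPR, TPR) pairs tracing the jagged line

print(roc_points([0.9, 0.8, 0.7, 0.6, 0.5], [1, 1, 0, 1, 0]))
```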

14 Different types of ROC graphs

15 Area under ROC curve (AUC)
A general measure: the area under the ROC curve. 0.50 corresponds to a random classifier, 1.0 is perfect.
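A minimal sketch of the area computation, treating the curve as a list of (FPR, TPR) points and applying the trapezoid rule; the point lists below are illustrative only:

```python
def auc(points):
    """Trapezoid-rule area under (FPR, TPR) points sorted by FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

print(auc([(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]))  # 0.875
print(auc([(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]))              # 0.5, the random diagonal
```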

16 AUC for ROC curves
[Figure: four ROC curves (True Positive Rate vs. False Positive Rate, 0%-100%) illustrating AUC = 100%, 90%, 65%, and 50%]

17 Lift Charts
The x-axis is the sample size: (TP + FP) / N; the y-axis is TP.
Formal definition: lift = model accuracy / random accuracy.
Example (model vs. random): 80% of responses for 40% of the cost gives a lift factor of 2; 40% of responses for 10% of the cost gives a lift factor of 4.
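A tiny sketch of the lift-factor arithmetic in those two examples (the function name is illustrative, not from the slides):

```python
# lift = fraction of responses captured / fraction of the list (cost) used,
# i.e. model response rate divided by the random baseline at that sample size.
def lift_factor(fraction_of_responses, fraction_of_cost):
    return fraction_of_responses / fraction_of_cost

print(lift_factor(0.80, 0.40))  # 2.0
print(lift_factor(0.40, 0.10))  # 4.0
```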

18 Lift factor
[Figure: lift value (y-axis) as a function of sample size (x-axis)]

19 The relationship between the measures

20 The OVERFITTING problem

21 10-fold cross-validation (one example of K-fold cross-validation)
1. Randomly divide your data into 10 pieces, 1 through 10.
2. Treat the 1st tenth of the data as the test dataset. Fit the model to the other nine-tenths of the data (which are now the training data).
3. Apply the model to the test data (e.g., for logistic regression, calculate predicted probabilities of the test observations).
4. Repeat this procedure for all 10 tenths of the data.
5. Calculate statistics of model accuracy and fit (e.g., ROC curves) from the test data only.
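A minimal sketch of the five steps above using scikit-learn; the library choice, dataset, and model are assumptions for illustration, not taken from the course materials:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)       # example dataset (assumption)
model = LogisticRegression(max_iter=5000)

# Step 1: randomly split into 10 folds; steps 2-4: fit on nine tenths, score the
# held-out tenth, repeat for every fold; step 5: report a metric (here AUC)
# computed on the test folds only.
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=folds, scoring="roc_auc")
print(scores.mean())
```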

22 [Image]

23 Analysis of the results

24 The Kappa Statistic
Kappa measures relative improvement over random prediction.
Dreal / Dperfect = A (accuracy of the real model)
Drandom / Dperfect = C (accuracy of a random model)
Kappa Statistic = (A - C) / (1 - C) = (Dreal / Dperfect - Drandom / Dperfect) / (1 - Drandom / Dperfect)
Cancelling Dperfect everywhere: Kappa = (Dreal - Drandom) / (Dperfect - Drandom)
Kappa = 1 when A = 1; Kappa = 0 if prediction is no better than random guessing.
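Plugging in the example values from the next slide (Dreal = 140, Drandom = 82, Dperfect = 200), a short sketch showing that the two forms of the formula agree:

```python
d_real, d_random, d_perfect = 140, 82, 200

a = d_real / d_perfect        # accuracy of the real model   = 0.70
c = d_random / d_perfect      # accuracy of the random model = 0.41
kappa_from_accuracies = (a - c) / (1 - c)
kappa_from_counts = (d_real - d_random) / (d_perfect - d_random)
print(kappa_from_accuracies, kappa_from_counts)   # both ~0.492
```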

25 Aside: the Kappa statistic
Two confusion matrices for a 3-class problem: real model (left) vs. random model (right).
Number of successes: sum of the values on the diagonal (D).

Real model (rows = actual, columns = predicted):
         a    b    c   total
a       88   10    2   100
b       14   40    6    60
c       18   10   12    40
total  120   60   20   200

Random model (rows = actual, columns = predicted):
         a    b    c   total
a       60   30   10   100
b       36   18    6    60
c       24   12    4    40
total  120   60   20   200

Kappa = (Dreal - Drandom) / (Dperfect - Drandom) = (140 - 82) / (200 - 82) = 0.492
Accuracy = 140 / 200 = 0.70

26 The kappa statistic: how to calculate Drandom?

Actual confusion matrix, C (rows = actual, columns = predicted):
         a    b    c   total
a       88   10    2   100
b       14   40    6    60
c       18   10   12    40
total  120   60   20   200

Expected confusion matrix, E, for a random model (same marginals, cells to be filled in):
         a    b    c   total
a        ?    ?    ?   100
b        ?    ?    ?    60
c        ?    ?    ?    40
total  120   60   20   200

The idea is to compare the actual results with what would have happened if a random predictor predicted answers in the same proportions that the actual predictor did.
On the left are the actual results, with 140 correct out of 200 (70%). The actual proportions are 100, 60, and 40 for a, b, and c; the prediction proportions are 120, 60, and 20 for a, b, and c.
On the right, the prediction proportions are matched, but predictions are random, so the 120 predictions of a are split among all of the actual values:
- Since 50% of the actual answers are a (100 of 200), of the 120 predictions of a we expect half to actually be correct (60).
- Since 30% of the actual answers are b (60 of 200), of the 120 predictions of a we expect 30% to be for instances that are actually b (36).
- Since 20% of the actual answers are c (40 of 200), of the 120 predictions of a we expect 20% to be for instances that are actually c (24).
There will be 60 predictions of b, again split at random among the actual values:
- 50% are expected to be for instances that are actually a (30).
- 30% are expected to be correct (18).
- 20% are expected to be for instances that are actually c (12).
There will be 20 predictions of c, again split at random among the actual values:
- 50% are expected to be for instances that are actually a (10).
- 30% are expected to be for instances that are actually b (6).
- 20% are expected to be correct (4).
The expected number of correct results with stratified random prediction is 60 + 18 + 4 = 82.
So the classifier being tested squeezed out an extra 140 - 82 = 58 correct predictions.
The best possible classifier (100% correct) could have obtained an extra 200 - 82 = 118 correct predictions.
So our classifier got 58 out of a possible 118 extra correct predictions (49.2%); that is the Kappa Statistic.
Rationale for each cell of E: 100 actual values of a, 120 of 200 predictions are a, so the random expectation is 100 * 120 / 200 = 60.
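A short sketch of that rationale: each cell of E is (actual row total * predicted column total) / N, and the diagonal of E sums to Drandom. Variable names are ours, for illustration:

```python
actual_totals = {"a": 100, "b": 60, "c": 40}      # row marginals (actual)
predicted_totals = {"a": 120, "b": 60, "c": 20}   # column marginals (predicted)
n = 200

# Expected cell for a random model matching the prediction proportions.
expected = {(act, pred): actual_totals[act] * predicted_totals[pred] / n
            for act in actual_totals
            for pred in predicted_totals}

d_random = sum(expected[(cls, cls)] for cls in actual_totals)
print(d_random)   # 60 + 18 + 4 = 82, the Drandom used in the kappa formula
```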

27 Toward the exercise...

