Data Analytics CMIS Short Course part II Day 1 Part 4: ROC Curves Sam Buttrey December 2015.

Assessing a Classifier
In data sets with very few "bads," the "naïve" model that says "everyone is good" is highly accurate
– It never pays to predict "bad"
How can we decide whether a model is getting the probabilities right, or compare two models?
How can we put our model to use?
– What if we don't like the 0.5 threshold for deciding which observations are predicted good?
One answer: ROC curves
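A quick numeric illustration of why raw accuracy misleads here, as a minimal R sketch (the 2% "bad" rate and the simulated data are assumptions for illustration, not from the course):

## Naive "everyone is good" model on an imbalanced data set (simulated)
set.seed(1)
y <- rbinom(10000, 1, 0.02)        # 1 = bad, 0 = good; about 2% bads
naive.pred <- rep(0, length(y))    # predict "good" for every observation
mean(naive.pred == y)              # accuracy is about 0.98, yet no bad is ever caught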

ROC Plot

2x2 Confusion Matrix

                 Predicted
               Pos     Neg
Observed  Pos   a       b
          Neg   c       d

Sensitivity: true positive rate, a/(a+b); false negative rate, b/(a+b)
Specificity: true negative rate, d/(c+d); false positive rate, c/(c+d)
If the threshold t is large, few observations are predicted positive, so sensitivity is small and specificity is high.
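A minimal R sketch of these quantities, using simulated scores and a single threshold t (the variable names and data are illustrative assumptions, not the course's code):

## Confusion matrix, sensitivity and specificity at one threshold t (simulated data)
set.seed(2)
truth <- rbinom(500, 1, 0.3)                       # 1 = positive, 0 = negative
score <- truth * rnorm(500, 1, 1) + (1 - truth) * rnorm(500, 0, 1)
t <- 0.5
pred <- ifelse(score >= t, 1, 0)                   # predict positive when score >= t
tab <- table(Observed = truth, Predicted = pred)   # the 2x2 confusion matrix
tab
a <- tab["1", "1"]; b <- tab["1", "0"]             # a = true positives, b = false negatives
c <- tab["0", "1"]; d <- tab["0", "0"]             # c = false positives, d = true negatives
a / (a + b)                                        # sensitivity (true positive rate)
d / (c + d)                                        # specificity (true negative rate)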

Plotting the ROC
The ROC curve plots Sensitivity (the true positive rate) against 1 – Specificity (the false positive rate) for different thresholds t
In a good test, there is a t for which both Sensitivity and Specificity are near 1
– That curve would pass near (0, 1), the top-left corner
In a bad test, the proportion classified as positive is the same regardless of the truth, so Sensitivity = 1 – Specificity for all t
– The 45° line
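One way to trace the curve by hand is to sweep the threshold over all observed scores; a sketch continuing the simulated truth and score vectors from the previous sketch:

## Trace the ROC by sweeping the threshold over the observed scores
ts <- sort(unique(score), decreasing = TRUE)
sens <- sapply(ts, function(t) mean(score[truth == 1] >= t))   # true positive rate
fpr  <- sapply(ts, function(t) mean(score[truth == 0] >= t))   # 1 - specificity
plot(fpr, sens, type = "l", xlab = "1 - Specificity", ylab = "Sensitivity")
abline(0, 1, lty = 2)                                          # the 45-degree line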

The ROC
[Figure: ROC curve with Sensitivity (true positive rate) on the vertical axis and 1 – Specificity (false positive rate) on the horizontal axis; the low- and high-threshold ends of the curve are marked, with one threshold t highlighted at roughly (30%, 73%).]
We give every observation a score. For this value of t, 73% of positive observations have score ≥ t, but only 30% of negatives have score ≥ t.

On the 45  Line If your classifier’s ROC curve follows the 45  line, then for any t the probability of classifying a positive as positive (sensitivity) is the same as the probability of classifying a negative as positive (1 – specificity) The area between your ROC curve and the 45  line is a measure of quality. So is the total area under the curve or the AUC (area under the curve).

The ROC
[Figure: the same ROC curve with the Area Under the Curve (AUC) shaded; axes are Sensitivity (true positive rate) and 1 – Specificity (false positive rate), with the low- and high-threshold ends of the curve labelled.]

Area Under the Curve
The Area under the Curve (AUC) is often measured or estimated
A "random" classifier has AUC = 0.5
Rule of thumb: AUC > 0.8 is good, but often only a few thresholds make sense
R draws ROC curves via pROC, ROCR, and other packages (see the sketch below)
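A minimal sketch with the pROC package, reusing the simulated truth and score vectors from the earlier sketches (the package must be installed; the data are illustrative):

## ROC curve and AUC via the pROC package
library(pROC)
r <- roc(response = truth, predictor = score)   # build the ROC object
plot(r)                                         # draw the ROC curve
auc(r)                                          # area under the curve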

Other Interpretations of AUC
Select two observations at random, one with a "hit" and one without
For a particular model, what is the probability that the predicted probability for the "hit" is the greater?
Answer: it's exactly the AUC
This number also relates to the Wilcoxon (Mann–Whitney) two-sample non-parametric test applied to the predicted scores of the two groups
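A sketch checking this interpretation numerically with the same simulated data: compare every (hit, non-hit) pair of scores, and relate the result to the Wilcoxon (Mann–Whitney) statistic.

## AUC as P(score of a random "hit" exceeds score of a random "non-hit")
pos <- score[truth == 1]
neg <- score[truth == 0]
mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))   # matches auc(r) above
## The same number from the Wilcoxon rank-sum (Mann-Whitney) statistic
w <- wilcox.test(pos, neg)$statistic                       # Mann-Whitney U
w / (length(pos) * length(neg))                            # equals the AUC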

Using AUC
ROC is for binary classifiers that produce numeric scores
Produce a set of predicted scores on the test set, plot the ROCs
If one classifier's curve is always above another's, it dominates
Otherwise, compare by AUC, or by using some "real" threshold (a comparison sketch follows)
Examples! Yay?
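As a sketch of the comparison step, assume a second, noisier model scored on the same test set (the second model is hypothetical, added purely for illustration):

## Comparing two classifiers on the same test set with pROC
score2 <- score + rnorm(length(score), 0, 1)    # hypothetical, noisier second model
r2 <- roc(response = truth, predictor = score2)
plot(r); lines(r2, col = "red")                 # overlay the two ROC curves
auc(r); auc(r2)                                 # compare by AUC
roc.test(r, r2)                                 # DeLong test for the difference in AUCs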