ROC Curve and Classification Matrix for Binary Choice Professor Thomas B. Fomby Department of Economics SMU Dallas, TX February, 2015

Classification Matrix For the purpose of discussing Type I and Type II errors, we will assume that the null hypothesis being tested is the negative outcome (0) while the alternative hypothesis is the positive outcome (1). Note: in some textbooks the null hypothesis is taken to be the positive outcome (1) and the alternative hypothesis to be the negative outcome (0); in that case the labels of the Type I and Type II errors are reversed.

                Predicted 1                       Predicted 0
Actual 1   True Positive                     False Negative (Type II Error)
Actual 0   False Positive (Type I Error)    True Negative
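As a minimal sketch of how these four cells are tallied (not part of the original slides; the labels and predictions below are invented for illustration):

```python
# Tally the four cells of the classification matrix from
# actual labels and 0/1 predictions (hypothetical data).
actual    = [1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives (Type II Error)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives (Type I Error)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives

print(f"TP={tp}  FN={fn}")
print(f"FP={fp}  TN={tn}")
```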

Sensitivity, Specificity, Accuracy (Hit) Rate, False-Positive and False-Negative Rates
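The formulas on this slide do not survive in the transcript. In terms of the cell counts TP, FN, FP, and TN above, the standard definitions are as follows (as the warning slide below notes, some textbooks define the false-positive and false-negative rates differently):

```latex
\begin{aligned}
\text{Sensitivity (TPR)} &= \frac{TP}{TP + FN}, \qquad
\text{Specificity (TNR)} = \frac{TN}{TN + FP},\\[4pt]
\text{Accuracy (hit rate)} &= \frac{TP + TN}{TP + FN + FP + TN},\\[4pt]
\text{False-positive rate} &= \frac{FP}{FP + TN} = 1 - \text{Specificity}, \qquad
\text{False-negative rate} = \frac{FN}{TP + FN} = 1 - \text{Sensitivity}.
\end{aligned}
```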

Classification Matrix with Outcomes on a Hold-out Sample

Warning: There are Different Definitions of False-Positive Rate and False-Negative Rate Depending on the Textbook and Presenter

Receiver Operating Characteristic (ROC) Curve: Shows the Relationship between Sensitivity and the False Positive Rate as a Function of the Threshold Value Used by the Classifier. The dashed line represents the naïve classifier (the 45-degree diagonal along which TPR = FPR), which does not use a threshold value.

Tracing Out the ROC Curve for One Classifier as a Function of Threshold (Cutoff Probability) Note: A perfect classifier would produce the point (0, 1.0) in the ROC diagram (upper left-hand corner). A strict threshold means a high cutoff probability for classifying a case as a “success” (y = 1); a lax threshold means a low cutoff probability for classifying a case as a “success” (y = 1). If a case has a probability greater than the threshold, we classify the case as a success (y = 1); otherwise, we classify the case as a failure (y = 0). The following equivalent terms are used in the ROC diagram: FPF = False Positive Fraction = FPR = False Positive Rate, and TPF = True Positive Fraction = TPR = True Positive Rate.

The Threshold Determines the Trade-off Between Committing a Type I Error (False Positives) and a Type II Error (False Negatives). If the probability of a case exceeds the threshold, we classify the case as a 1; otherwise, as a 0. As the threshold approaches one, the sensitivity of the classifier approaches zero while the specificity approaches one (see the (0,0) point on the ROC curve). As the threshold approaches zero, the sensitivity approaches one while the specificity approaches zero (see the (1,1) point on the ROC curve). The ROC curve is then traced out by decreasing the threshold from 1.0 to 0 as you move from the (0,0) point in the lower left-hand corner of the ROC diagram to the (1,1) point in the upper right-hand corner. Therefore, as the threshold becomes more strict (higher), we expect the Type I Error (false positives) to occur less frequently and the Type II Error (false negatives) to occur more frequently. Conversely, with a lax (low) threshold, the Type I Error will be more prevalent and the Type II Error less prevalent.
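A minimal Python sketch of this tracing (not from the slides; the labels and predicted probabilities are invented): sweeping the threshold from strict to lax moves the (FPR, TPR) point from (0,0) toward (1,1).

```python
# Trace ROC points by sweeping the classification threshold
# over hypothetical predicted probabilities.
actual = [1, 1, 1, 0, 1, 0, 0, 0]
prob   = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.10]

def roc_point(threshold):
    pred = [1 if p > threshold else 0 for p in prob]  # success if prob > threshold
    tp = sum(1 for a, c in zip(actual, pred) if a == 1 and c == 1)
    fn = sum(1 for a, c in zip(actual, pred) if a == 1 and c == 0)
    fp = sum(1 for a, c in zip(actual, pred) if a == 0 and c == 1)
    tn = sum(1 for a, c in zip(actual, pred) if a == 0 and c == 0)
    return fp / (fp + tn), tp / (tp + fn)  # (FPR, TPR)

# Strict (high) thresholds give points near (0,0); lax (low) ones near (1,1).
for t in [1.0, 0.8, 0.5, 0.2, 0.0]:
    fpr, tpr = roc_point(t)
    print(f"threshold={t:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```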

Training versus Test Data Set ROC Curves for One Classifier. Here we are comparing the areas under the ROC curves. Classifier comparisons should be made across test data set ROC curves to avoid over-training.
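A hedged sketch of this train-versus-test comparison, assuming scikit-learn is available (the synthetic data set stands in for a real problem):

```python
# Compare one classifier's ROC area on training vs. hold-out (test) data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

auc_train = roc_auc_score(y_tr, clf.predict_proba(X_tr)[:, 1])
auc_test  = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

# A training AUC well above the test AUC is a symptom of over-training.
print(f"train AUC = {auc_train:.3f}, test AUC = {auc_test:.3f}")
```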

Confidence intervals can be put around a given ROC curve. However, when making ROC comparisons across competing classifiers, this is usually not done.

A Comparison of Four Different Classifiers with the Naïve Classifier (the dashed line). The ROC Curves should be based on the Test Data Sets. The Best Classifier has the Largest Area Under the ROC Curve.
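As a sketch of the area comparison, the area under a traced curve can be approximated with the trapezoidal rule; the (FPR, TPR) points below are hypothetical and would in practice come from sweeping each classifier's threshold on the test data set:

```python
# Compare two classifiers by the trapezoidal area under their ROC points.
roc_a = [(0.0, 0.0), (0.1, 0.60), (0.3, 0.85), (0.6, 0.95), (1.0, 1.0)]
roc_b = [(0.0, 0.0), (0.2, 0.40), (0.5, 0.70), (0.8, 0.90), (1.0, 1.0)]

def auc(points):
    pts = sorted(points)  # order by FPR before integrating
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

# The classifier with the larger area under its ROC curve is preferred.
print(f"AUC A = {auc(roc_a):.3f}, AUC B = {auc(roc_b):.3f}")
```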

Euclidean Distance Comparison of ROC Curves Another way of comparing ROC curves is to compute, for each classifier, the minimum distance between the perfect predictor point (0, 1) and a point on the classifier's ROC curve.

A Euclidean Distance Measure
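The formula on this slide does not survive in the transcript. Consistent with the preceding description, the natural form of the measure is the distance from the perfect-classifier point (0, 1) to the point the classifier attains at threshold t, minimized over all thresholds:

```latex
d^{*} = \min_{t}\ \sqrt{\mathrm{FPR}(t)^{2} + \bigl(1 - \mathrm{TPR}(t)\bigr)^{2}}
```

Under this criterion, the classifier with the smallest minimum distance is preferred.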