CAP and ROC curves.


Cumulative Accuracy Profiles (CAP)

We first rank companies by the default probabilities (i.e., credit scores) predicted by the model, from highest to lowest. Then, for a given x, we take the top x% of companies in this ranking and record the number of defaulted companies among them as a percentage (y%) of the total number of defaulted companies.

The CAP curve is then traced out by varying x from 0 to 100 and plotting x along the x-axis against the corresponding y along the y-axis. A good model assigns relatively high default probability estimates to most of the defaulters, so the percentage of defaulters captured (the y values in Fig. 1) increases quickly as one moves down the sorted sample of all companies (the x values in Fig. 1).
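The ranking-and-counting procedure can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function name `cap_curve`, the six-company toy sample, and the choice to ignore tied scores are all assumptions made for the example.

```python
def cap_curve(scores, defaulted):
    """Return CAP points (x, y) as fractions in [0, 1].

    scores    -- predicted default probabilities (hypothetical toy input)
    defaulted -- 1 if the company defaulted, 0 otherwise
    Ties between scores are not treated specially (an assumption).
    """
    # Rank companies from highest to lowest predicted default probability.
    ranked = sorted(zip(scores, defaulted), key=lambda pair: pair[0], reverse=True)
    n_total = len(ranked)
    n_defaults = sum(1 for _, d in ranked if d)

    points = [(0.0, 0.0)]
    captured = 0
    for i, (_, d) in enumerate(ranked, start=1):
        if d:
            captured += 1
        # x: share of all companies considered so far,
        # y: share of all defaulters captured among them.
        points.append((i / n_total, captured / n_defaults))
    return points


# Toy sample: six companies, three of which defaulted.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
defaulted = [1, 1, 0, 1, 0, 0]
points = cap_curve(scores, defaulted)
```

Plotting `points` with x on the horizontal axis gives the CAP curve for the toy sample; the curve always starts at (0, 0) and ends at (1, 1).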

If the model were totally uninformative, for example because it assigned default probabilities randomly, we would expect to capture a proportional fraction of the defaulters (i.e., about x% of the defaulters within the first x% of the observations), resulting in a CAP curve along the 45-degree line (the “Random CAP” curve of Fig. 1).

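Accuracy ratio from the CAP curve = (area between the model’s CAP and the random CAP) / (area between the ideal CAP and the random CAP)

Both areas are measured above the 45-degree random line, so an uninformative model scores 0 and an ideal model scores 1.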
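As a rough numerical sketch of this ratio, the snippet below assumes the common convention in which both areas are measured above the 45-degree random line, so that a random model scores 0 and an ideal model scores 1. The helper names, the trapezoidal integration, and the toy data are assumptions for illustration only.

```python
def cap_points(scores, defaulted):
    """CAP points (x, y) as fractions, ranking from highest to lowest score."""
    ranked = sorted(zip(scores, defaulted), key=lambda pair: pair[0], reverse=True)
    n_total = len(ranked)
    n_defaults = sum(d for _, d in ranked)
    points, captured = [(0.0, 0.0)], 0
    for i, (_, d) in enumerate(ranked, start=1):
        captured += d
        points.append((i / n_total, captured / n_defaults))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def accuracy_ratio_cap(scores, defaulted):
    """Area between model CAP and random CAP, over the same area for the ideal CAP."""
    default_rate = sum(defaulted) / len(defaulted)
    model_area = trapezoid_area(cap_points(scores, defaulted)) - 0.5
    # The ideal CAP rises linearly to y = 1 at x = default_rate and then stays flat,
    # so the area under it is 1 - default_rate / 2.
    ideal_area = (1 - default_rate / 2) - 0.5
    return model_area / ideal_area


scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
defaulted = [1, 1, 0, 1, 0, 0]
ar = accuracy_ratio_cap(scores, defaulted)          # 7/9, about 0.778
ar_perfect = accuracy_ratio_cap(scores, [1, 1, 1, 0, 0, 0])
```

In the toy sample one non-defaulter is ranked above one defaulter, which is why the ratio falls short of 1; the perfectly ranked sample reaches 1.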

Receiver Operating Characteristic (ROC) Curves

The ROC curve is constructed by varying the cutoff probability. For each cutoff probability, the ROC curve plots the “true positive rate” (the percentage of defaults that the model correctly classifies as defaults) on the y-axis against the corresponding “false positive rate” (the percentage of non-defaults that are mistakenly classified as defaults) on the x-axis.

The ROC curve of a constant or entirely random prediction model corresponds to the 45-degree line, whereas a perfect model will have a ROC curve that goes straight up from (0, 0) to (0, 1) and then across to (1, 1).
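The cutoff sweep can be illustrated with a short Python sketch. It assumes that a company is classified as a default when its predicted probability is at or above the cutoff, and it uses each distinct score as a cutoff; the function name and the toy data are hypothetical.

```python
def roc_points(scores, defaulted):
    """Return ROC points (false positive rate, true positive rate).

    A company is classified as a default when score >= cutoff
    (an assumed convention for this sketch).
    """
    n_pos = sum(1 for d in defaulted if d)
    n_neg = len(defaulted) - n_pos

    points = [(0.0, 0.0)]  # cutoff above every score: nothing flagged
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, defaulted) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, defaulted) if s >= cutoff and not d)
        points.append((fp / n_neg, tp / n_pos))
    return points


scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
defaulted = [1, 1, 0, 1, 0, 0]
roc = roc_points(scores, defaulted)

# A perfectly ranked sample passes through the top-left corner (0, 1).
roc_perfect = roc_points(scores, [1, 1, 1, 0, 0, 0])
```

Lowering the cutoff moves the curve from (0, 0) toward (1, 1); each defaulter crossed moves it up and each non-defaulter crossed moves it to the right.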

Accuracy ratio from the ROC curve = 2 × (area under the model’s ROC curve - 0.5)
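A small worked example of this formula follows, assuming the area under the ROC curve is approximated with the trapezoidal rule over the points obtained by sweeping the cutoff. The function names and the toy sample are illustrative assumptions.

```python
def roc_points(scores, defaulted):
    """ROC points (FPR, TPR), classifying score >= cutoff as a default."""
    n_pos = sum(1 for d in defaulted if d)
    n_neg = len(defaulted) - n_pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, defaulted) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, defaulted) if s >= cutoff and not d)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_roc(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def accuracy_ratio_roc(scores, defaulted):
    """Accuracy ratio = 2 * (AUC - 0.5)."""
    return 2 * (area_under_roc(roc_points(scores, defaulted)) - 0.5)


scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
defaulted = [1, 1, 0, 1, 0, 0]
auc = area_under_roc(roc_points(scores, defaulted))   # 8/9
ar = accuracy_ratio_roc(scores, defaulted)            # 7/9, about 0.778
```

The rescaling maps an area of 0.5 (the 45-degree line) to an accuracy ratio of 0 and an area of 1 (the perfect model) to an accuracy ratio of 1.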