Risk Solutions & Research © Copyright IBM Corporation 2005 Default Risk Modelling : Decision Tree Versus Logistic Regression Dr.Satchidananda S Sogala,Ph.D.,

Presentation transcript:

Risk Solutions & Research © Copyright IBM Corporation 2005 Default Risk Modelling : Decision Tree Versus Logistic Regression Dr.Satchidananda S Sogala,Ph.D., Head, Risk Solutions & Research FSS, IBM India

IBM Financial Services © Copyright IBM Corporation 2005

Agenda
- Need for credit risk quantification
- Uses of credit scoring model in credit analysis
- Data mining approach to credit scoring
- Comparing the Decision Trees with Logit Regression

Need for Credit Risk Quantification
- Increase in market volatility
- Emergence of new instruments for risk management
- Integration of the credit markets globally
- Availability of computing and analytical power and tools

IBM Financial Services © Copyright IBM Corporation 2005 |  Objective tool for credit risk analysis  Can support scaling up  Particularly useful for standardized assets-based loans such as home loans, car loans  Default prediction Use of Credit Scoring Models

Data Mining Process
- Data
- Feature selection
- Sampling
- Modeling
- Evaluation

DM Approach to Credit Scoring
- Data
- Feature selection
- Cluster prototypes
- Decision tree/Logit
- Evaluation
  - 10-fold cross-validation
  - Classification accuracy
  - True positive rate
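The evaluation step (10-fold cross-validation, classification accuracy, true positive rate) can be illustrated with a minimal sketch. This is not the presenter's code: the function names, the `fit` interface (a function that returns a `predict` callable), and the choice to pool predictions across folds before scoring are all assumptions made for illustration. Either a decision tree or a logit model could be plugged in as `fit`.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def true_positive_rate(y_true, y_pred, positive=1):
    """Fraction of actual positives (here: defaults) that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return float("nan")
    hits = sum(1 for p in preds_for_positives if p == positive)
    return hits / len(preds_for_positives)

def cross_validate(fit, X, y, k=10, seed=0):
    """Hypothetical interface: fit(X_train, y_train) returns predict(x) -> label.

    Each fold is held out once; held-out predictions are pooled and scored.
    """
    y_true_all, y_pred_all = [], []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            y_true_all.append(y[i])
            y_pred_all.append(predict(X[i]))
    return accuracy(y_true_all, y_pred_all), true_positive_rate(y_true_all, y_pred_all)
```

Reporting the true positive rate alongside accuracy matters when defaults are rare, because a model that never predicts default can still score a high accuracy.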

Decision Trees vs. Logit Regression

Decision trees:
- Automatic feature selection
- Scalable
- Heterogeneous partitioning
- Intuitive graphical display

Logit regression:
- Sequential feature selection
- Not scalable
- Homogeneous partitioning
- Assumes linear relations of log odds
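The "linear relations of log odds" assumption can be written out in the standard form of the logit model. The symbols are not from the slides and are introduced here only for illustration: p is the probability of default, x_1, ..., x_n are the selected features, and β_0, ..., β_n are the fitted coefficients.

```latex
\log\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{n} \beta_i x_i
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{n} \beta_i x_i\right)}}
```

Each feature shifts the log odds by a fixed amount per unit, whereas a decision tree can split on the same feature at different thresholds in different branches.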

Induced Rules from Decision Tree

Logistic Model

Features selected

Results

Summary
- Decision tree gave greater accuracy in identifying the true defaults (lift of about 2)
- More intuitive and less complex
- Competitive pressures forcing operational sophistication
- Use of cluster-based prototypes of non-defaulters to balance the sample, since the interest is in defaults
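The slides do not say how lift was computed. One common definition, assumed here, is the default rate among the cases the model flags as defaults divided by the overall default rate. The sketch below uses made-up labels, not the presentation's data, chosen so that the result comes out near the "about 2" quoted in the summary.

```python
def lift(y_true, y_pred, positive=1):
    """Assumed definition: default rate among flagged cases / overall default rate."""
    flagged_truths = [t for t, p in zip(y_true, y_pred) if p == positive]
    base_rate = sum(1 for t in y_true if t == positive) / len(y_true)
    if not flagged_truths or base_rate == 0:
        return float("nan")
    flagged_rate = sum(1 for t in flagged_truths if t == positive) / len(flagged_truths)
    return flagged_rate / base_rate

# Made-up example: 1 = default, 0 = non-default (illustrative only).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # 2 defaults out of 10 cases
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]   # model flags 5 cases, 2 of them real defaults
example_lift = lift(y_true, y_pred)        # (2/5) / (2/10)
```

Under this definition, a lift of 2 means a flagged case is twice as likely to be a true default as a case picked at random.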

THANK YOU FOR YOUR PATIENCE AND INTEREST!