Comparison of Classification Methods for Customer Attrition Analysis
Xiaohua Hu, Ph.D.
Drexel University
Philadelphia, PA 19104

Presentation transcript:


Outline
Introduction of the Business Problem
Data Selection and Data Processing
Data Mining Model Development Process
Data Mining Findings
Q & A

Data Mining for Customer Attrition Analysis
In the financial industry, data mining has been applied successfully to:
Target-oriented campaigns
Identifying and understanding customer segments: attriters vs. loyal customers, profitable vs. regular customers
Identifying cross-sell and up-sell opportunities to increase customers' wallet share
Risk analysis for loan applications and credit fraud detection
Financial planning and asset evaluation

Customer Attrition Analysis
The goal of attrition analysis is to identify a group of customers who have a high probability of attriting, so that the company can run marketing campaigns aimed at changing their behavior and reducing the attrition rate.

Business Problem
Our client is one of the largest banks in the world.
This attrition analysis project relates to one type of credit loan service.
Over 750,000 customers currently use this service, with $1.5 billion in outstanding balances.
Every month, about 5,700 customers close their accounts or transfer to other banks, mostly due to rates, credit lines, and fees.

Problem Definition
Slow attriters: customers who slowly pay down their outstanding balance until they become inactive.
Fast attriters: customers who quickly pay down their balance and either let the account lapse or close it by phone call or write-in.

Data Mining Tasks
1. Using data on accounts that remained continuously open in the last 4 months, predict, with 60 days' advance notice, the likelihood that a particular customer will voluntarily close his/her account, either by phone or by write-in.
2. Using data on accounts that remained continuously open in the last 4 months, predict, with 60 days' advance notice, the likelihood that a particular customer will have his/her account transferred to a competing institution. The account may or may not remain open.
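The eligibility rule and the 60-day lead time in these tasks can be sketched as a labeling function. This is only an illustration under stated assumptions: months are indexed from 0, 60 days is approximated as 2 months, the target is a single month, and the names `label_account`, `open_flags`, `closed_month`, and `cutoff` are hypothetical rather than taken from the project.

```python
from typing import Optional

OBSERVATION_MONTHS = 4  # from the slides: continuously open in the last 4 months
LEAD_MONTHS = 2         # assumption: 60 days' notice approximated as 2 months


def label_account(open_flags: list[bool],
                  closed_month: Optional[int],
                  cutoff: int) -> Optional[int]:
    """Return 1 (attriter), 0 (non-attriter), or None (not eligible).

    open_flags[m] is True if the account was open in month m (0-based).
    closed_month is the month of voluntary closure, or None if never closed.
    cutoff is the last month whose data the model is allowed to see.
    """
    start = cutoff - OBSERVATION_MONTHS + 1
    if start < 0 or not all(open_flags[start:cutoff + 1]):
        return None  # not continuously open during the observation window
    # Assumption: the target is closure in exactly the month that lies
    # LEAD_MONTHS after the cutoff; the slides do not specify the window.
    target_month = cutoff + LEAD_MONTHS
    return int(closed_month == target_month)


# An account open in months 0-6 that closes in month 7, scored at cutoff 5.
flags = [True] * 7 + [False] * 5
print(label_account(flags, closed_month=7, cutoff=5))
```

Keeping the cutoff explicit makes it harder for data from after the observation window to leak into the features.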

Challenging issues in our project
Highly skewed data: 3% attriters vs. 97% regular customers.
Time-series data: our data warehouse holds the past 12 months of credit loan service information.
High dimensionality: 850 attributes for each customer.
Many dirty records and missing values.

Data Mining Process for Customer Attrition Analysis
1. Problem definition: formulation of business problems in the area of customer retention.
2. Data review and initial selection.
3. Problem formulation in terms of existing data.
4. Data gathering, cataloging, and formatting.
5. Data processing: (a) data cleansing, data unfolding, time-sensitive variable definition, and target variable definition; (b) statistical analysis; (c) sensitivity analysis; (d) feature selection; (e) leaker detection.
6. Data modeling via classification models: decision trees, neural networks, Bayesian networks, and an ensemble of classifiers.
7. Result review and analysis: use the data mining model to predict the likely attriters among the current customers.
8. Result deployment: target the likely attriters (called rollout).

Data Source
Data warehouse: a credit card data warehouse containing about 200 product-specific fields.
Third-party data: a set of account-related demographic and credit bureau information.
Segmentation files: a set of account-related segmentation values based on our client's segmentation scheme, which combines risk, profitability, and external potential.
Payment database: a database that stores all checks processed and can categorize the source of checks.

Data Processing Goals
Reflect data changes over time.
Recognize and remove statistically insignificant fields.
Define and introduce the "target" field.
Allow for second-stage preprocessing and statistical analysis.

Data Processing Steps
Time series "unrolling"
Target value definition
First-stage statistical analysis
Field sensitivity analysis and field reduction
File set generation
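One way to picture the "unrolling" step is turning monthly records into a single wide row per account, then deriving a time-sensitive variable from it. The sketch below is an assumption about what unrolling could look like, not the project's actual procedure; the record layout, the `_m<month>` naming, and the helpers `unroll` and `add_change` are hypothetical.

```python
from collections import defaultdict


def unroll(records, fields):
    """Flatten monthly records into one wide row per account.

    records: iterable of (account_id, month, {field: value}) tuples.
    Returns {account_id: {"<field>_m<month>": value, ...}}.
    """
    wide = defaultdict(dict)
    for account_id, month, values in records:
        for field in fields:
            # A field missing from a monthly record is stored as None.
            wide[account_id][f"{field}_m{month}"] = values.get(field)
    return dict(wide)


def add_change(row, field, first_month, last_month):
    """Add a derived variable: the change in `field` between two months."""
    a = row.get(f"{field}_m{first_month}")
    b = row.get(f"{field}_m{last_month}")
    row[f"{field}_delta"] = None if a is None or b is None else b - a
    return row


records = [
    ("A", 1, {"balance": 1000}),
    ("A", 2, {"balance": 600}),
]
wide = unroll(records, ["balance"])
print(add_change(wide["A"], "balance", 1, 2))
```

A falling `balance_delta` is the kind of signal that would distinguish the slow and fast attriters described earlier, since both pay down their balance over time.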

Data Mining Algorithms for Attrition Analysis
1. Boosted Naïve Bayesian (BNB)
2. NeuralWare Predict (a commercial neural network from NeuralWare Inc.)
3. Decision Tree (based on C4.5 with some modifications)
4. Selective Naïve Bayesian (SNB)
5. An ensemble of the above four classifiers
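The slides do not say how the four classifiers were combined into an ensemble. As a minimal sketch, assuming each classifier outputs an attrition score in [0, 1] for every customer, one common option is to average the scores; the function name `ensemble_scores` is hypothetical.

```python
def ensemble_scores(score_lists):
    """Average per-customer scores across classifiers.

    score_lists: one list of scores per classifier, all of equal length,
    where position i in every list refers to the same customer.
    Score averaging is an assumption, not the method used in the project.
    """
    if not score_lists:
        raise ValueError("need at least one classifier")
    n = len(score_lists[0])
    if any(len(scores) != n for scores in score_lists):
        raise ValueError("all classifiers must score the same customers")
    return [sum(column) / len(score_lists) for column in zip(*score_lists)]


# Two classifiers scoring the same two customers.
print(ensemble_scores([[0.25, 1.0], [0.75, 0.5]]))
```

The averaged scores can then be ranked in the same way as the scores of any single model.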

Classification accuracy is not a proper measure for attrition analysis
The goal of attrition analysis is not to predict the behavior of every customer, but to find a good subset of customers in which the percentage of attriters is high.
Classification errors (false positives and false negatives) have different economic consequences in attrition analysis and need to be treated differently.
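A small illustration of why accuracy misleads on skewed data: with the 3% / 97% split quoted earlier, a model that labels every customer as loyal scores 97% accuracy while finding no attriters at all. The population of 1,000 customers and the helper names are assumptions made for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of customers whose label was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true attriters (label 1) that the model flagged."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


# Hypothetical population of 1,000 customers: 3% attriters, 97% regular.
y_true = [1] * 30 + [0] * 970
always_loyal = [0] * 1000  # a "model" that never predicts attrition

print(accuracy(y_true, always_loyal))  # 0.97
print(recall(y_true, always_loyal))    # 0.0
```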

Criterion for Attrition Analysis: Lift
Lift, rather than classification accuracy, is a better measure for attrition analysis.
Lift reflects the redistribution of responders in the testing set after the testing examples are ranked.
Lift can be calculated by taking the cumulative targets captured up to p% as a percentage of all targets and dividing by p%.
For example, if the top 10% of the sorted list contains 35% of the likely attriters, the model has a lift of 35/10 = 3.5.
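The lift calculation described above can be sketched as follows. The function name `lift_at` and the toy data are assumptions; the data is constructed so that the top 10% of the ranked list holds 35% of all attriters, matching the example on the slide.

```python
def lift_at(scores, labels, pct):
    """Lift in the top `pct` percent of customers ranked by descending score.

    scores: model scores, higher meaning more likely to attrite.
    labels: 1 for an actual attriter, 0 otherwise.
    """
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    total_targets = sum(ranked)
    if total_targets == 0:
        raise ValueError("no targets in the test set")
    top_n = max(1, round(len(ranked) * pct / 100))
    captured_pct = 100 * sum(ranked[:top_n]) / total_targets
    actual_pct = 100 * top_n / len(ranked)
    return captured_pct / actual_pct


# 100 customers, 20 attriters, 7 of them ranked in the top 10.
scores = list(range(100, 0, -1))
labels = [1] * 7 + [0] * 3 + [1] * 13 + [0] * 77
print(lift_at(scores, labels, 10))  # 3.5
```

A model that ranks customers at random would have a lift close to 1 at every cutoff, which is the "no model" baseline.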

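Boosted Naïve Bayesian Network
[Table: Pct, Cases, Hits (BBN), %hits, lift, Hits (no model); values not preserved]

Decision Tree (revised C4.5)
[Table: Pct, Cases, Hits (Decision Tree), %hits, lift, Hits (no model); values not preserved]

Neural Network (Predict)
[Table: Pct, Cases, Hits (NN), %hits, lift, Hits (no model); values not preserved]

Selective Naïve Bayesian Network
[Table: Pct, Cases, Hits (BBN), %hits, lift, Hits (no model); values not preserved]

An Ensemble of Classifiers
[Table: Pct, Cases, Hits (BBN), %hits, lift, Hits (no model); values not preserved]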

Field Test
The field test tries to verify two points:
The top percentage of the customer attrition list does contain a high concentration of attriters.
The data-mining-based marketing approach is effective for attrition analysis.

Field Test Results
From the top 5% of customers on the data mining prediction list, sorted by score, two equal-sized groups were created by random sampling.
Group 1: the marketing department contacted each customer and offered incentive packages to encourage them to stay with the company.
Group 2: no action.
Two months later, the customers in both groups were examined. Group 1 had an attrition rate of 0.8%, while Group 2 had 10.6% (the average attrition rate is 2.2%). The lift is 4.8.
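The reported lift can be checked from the three rates on this slide: it is the attrition rate of the untreated group divided by the average attrition rate. The relative reduction between the two groups is not stated in the slides; it is computed here only as an illustration, and the variable names are hypothetical.

```python
group1_rate = 0.8    # % attrition in the contacted group (from the slide)
group2_rate = 10.6   # % attrition in the untreated group (from the slide)
baseline_rate = 2.2  # % average attrition rate (from the slide)

# Lift of the prediction list: how concentrated attriters are in the
# untreated top-scored group compared with the overall customer base.
lift = group2_rate / baseline_rate

# Illustrative only: relative drop in attrition between the two groups.
relative_reduction = (group2_rate - group1_rate) / group2_rate

print(round(lift, 1))                # 4.8
print(round(relative_reduction, 3))  # 0.925
```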

Q & A