Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology. Data mining for credit card fraud: A comparative study.


1 Data mining for credit card fraud: A comparative study
Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Presenter: Cheng-Hui Chen
Authors: Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, J. Christopher Westland
DSS 2010

2 Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

3 Motivation
• Over the years, along with the evolution of fraud detection methods, perpetrators of fraud have also been evolving their fraud practices to avoid detection.
• While predictive models for credit card fraud detection are in active use in practice, reported studies on the use of data mining approaches for credit card fraud detection are relatively few.

4 Objectives
• Credit card fraud detection methods need constant innovation.
• The paper evaluates two advanced data mining approaches, support vector machines and random forests, together with the well-known logistic regression, as part of an attempt to better detect credit card fraud.

5 Methodology
• Detection methods: SVM, random forests, logistic regression
• Challenges
1. Unbalanced classes.
2. Undetected fraud transactions, leading to mislabeled cases.
• Data: primary attributes and derived attributes
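One common way to handle the unbalanced-class challenge is to undersample the legitimate transactions so that fraud makes up a chosen share of the training data. The slide does not describe the sampling procedure, so the sketch below is only an illustration: the function name `undersample`, the `(features, label)` tuple layout, and the label coding (1 = fraud, 0 = legitimate) are all assumptions.

```python
import random

def undersample(transactions, target_fraud_rate, seed=0):
    """Keep all fraud cases and a random subset of legitimate ones.

    transactions: list of (features, label) pairs; label 1 = fraud, 0 = legitimate
    (hypothetical layout, not taken from the slides).
    target_fraud_rate: desired share of fraud in the result, in (0, 1].
    """
    if not 0 < target_fraud_rate <= 1:
        raise ValueError("target_fraud_rate must be in (0, 1]")
    fraud = [t for t in transactions if t[1] == 1]
    legit = [t for t in transactions if t[1] == 0]
    # Legitimate cases needed so that fraud is target_fraud_rate of the total.
    n_legit = round(len(fraud) * (1 - target_fraud_rate) / target_fraud_rate)
    n_legit = min(n_legit, len(legit))  # cannot draw more than are available
    rng = random.Random(seed)
    sample = fraud + rng.sample(legit, n_legit)
    rng.shuffle(sample)
    return sample

# Toy data: 10 fraud cases among 1000 transactions (1% fraud).
data = [((i,), 1 if i < 10 else 0) for i in range(1000)]
train = undersample(data, target_fraud_rate=0.10)
```

Varying `target_fraud_rate` gives training sets with different fraud proportions while leaving the fraud cases themselves untouched.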

6 Methodology
• Credit card fraud
─ Application fraud: fraudsters obtain new cards from issuing companies using false information or other people's information.
─ Behavioral fraud: mail theft, stolen/lost card, counterfeit card, and "cardholder not present" fraud.

7 Logistic regression
• Qualitative response models are appropriate when the dependent variable is categorical.
• The dependent variable here, fraud, is binary, and logistic regression is a widely used technique for this case.
─ For example, binary choice models have been used in the case of insurance fraud to predict the likelihood of a claim being fraudulent.
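As a minimal sketch of how logistic regression turns a binary fraud label into a probability, the snippet below passes a linear score through the logistic function. The two features and the coefficient values are made up for illustration; they are not from the paper.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for large negative z
    return e / (1.0 + e)

def fraud_probability(features, weights, bias):
    """P(fraud | features) under a logistic regression model."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Hypothetical coefficients for two hypothetical transaction features.
weights = [0.8, -0.5]
bias = -2.0
p = fraud_probability([3.0, 1.0], weights, bias)
is_flagged = p >= 0.5  # a threshold of 0.5 is one possible decision rule
```

In practice the weights are fitted to labeled transactions, and the threshold is chosen to suit the cost of missed fraud versus false alarms.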

8 Support vector machines
• SVMs are linear classifiers that work in a high-dimensional feature space that is a non-linear mapping of the input space of the problem at hand.
• Properties
─ Margin optimization: SVMs minimize the risk of overfitting the training data by determining the classification function (a hyperplane) with maximal margin of separation between the two classes.
─ Kernel trick: a kernel function can represent the dot product of the projections of two data points in a high-dimensional feature space.
[Figure: fraud and non-fraud points separated using a kernel function, labeled K(X) = X + X^2 on the slide]
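The kernel trick can be checked numerically on a small case. The sketch below uses the degree-2 polynomial kernel K(x, y) = (x · y)^2 on two-dimensional inputs, chosen here only for illustration (the slide does not say which kernel the paper uses), and compares it with the dot product of the explicit projections.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, y):
    """K(x, y) = (x . y)^2, computed without leaving the input space."""
    return dot(x, y) ** 2

x = (1.0, 2.0)
y = (3.0, -1.0)
explicit = dot(phi(x), phi(y))   # dot product in the 3-D feature space
implicit = poly_kernel(x, y)     # same value from the kernel alone
```

Both quantities agree, which is why an SVM can work in the high-dimensional space while only ever evaluating the kernel on the original inputs.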

9 Random forests
• A random forest model is an ensemble of classification (or regression) trees.
[Figure: Gini index]
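Assuming the Gini index on this slide refers to the Gini impurity commonly used to choose splits in classification trees, a minimal sketch of the calculation looks like this. The label strings and helper names are illustrative only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gini(left, right):
    """Size-weighted Gini impurity of the two child nodes of a split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

parent = ["fraud", "fraud", "legit", "legit"]
perfect = split_gini(["fraud", "fraud"], ["legit", "legit"])
useless = split_gini(["fraud", "legit"], ["fraud", "legit"])
```

A tree picks the split with the lowest weighted impurity; a random forest grows many such trees on resampled data and random feature subsets, then combines their votes.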

10 Experiments
• Datasets
─ Primary attributes
─ Derived attributes

11 Experiments

12 Experiments

13 Experiments

14 Conclusions
• A factor contributing to the performance of logistic regression is possibly the carefully derived attributes used.
• SVM performance at the upper file depths tended to increase with a lower proportion of fraud in the training data.
• Random forests demonstrated better overall performance across performance measures.
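The slide does not define "file depth". Assuming it means the top fraction of transactions after ranking them by predicted fraud score, the share of fraud captured at a given depth could be computed as in the sketch below. The function name and the toy scores are hypothetical.

```python
def fraud_captured_at_depth(scores, labels, depth):
    """Fraction of all fraud cases found in the top `depth` share of
    transactions, ranked by descending score (labels: 1 = fraud, 0 = legitimate)."""
    n = len(scores)
    k = max(1, round(n * depth))  # number of transactions inspected
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_fraud = sum(labels)
    if total_fraud == 0:
        return 0.0
    captured = sum(label for _, label in ranked[:k])
    return captured / total_fraud

# Toy model scores for ten transactions, two of which are fraud.
scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.05, 0.3, 0.4, 0.6, 0.5]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
top_10_percent = fraud_captured_at_depth(scores, labels, 0.10)
top_30_percent = fraud_captured_at_depth(scores, labels, 0.30)
```

Under this reading, a measure at the upper file depths rewards models that place fraud cases at the very top of the ranking, which matters when only a small share of transactions can be reviewed.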

15 Comments
• Advantages
─ The paper is written in great detail.
• Drawback
─ …
• Applications
─ Credit card fraud

