Data Mining and Its Applications 1 Data Mining Techniques – For Marketing, Sales, and Customer Support, by Michael J.A. Berry and Gordon Linoff, John Wiley.

1 Data Mining and Its Applications
- Data Mining Techniques: For Marketing, Sales, and Customer Support, by Michael J.A. Berry and Gordon Linoff, John Wiley & Sons, Inc., 1997.
- Discovering Data Mining: From Concept to Implementation, by Cabena, Hadjinian, Stadler, Verhees and Zanasi, Prentice Hall, 1997.
- Building Data Mining Applications for CRM, by Alex Berson, Stephen Smith and Kurt Thearling, McGraw-Hill, 1999.
- Data Mining Cookbook: Modeling Data for Marketing, Risk, and Customer Relationship Management, by Olivia Parr Rud, John Wiley & Sons, Inc., 2001.
- Mastering Data Mining: The Art and Science of Customer Relationship Management, by Michael J.A. Berry and Gordon S. Linoff, John Wiley & Sons, Inc., 2000.
- Machine Learning, by Tom M. Mitchell, McGraw-Hill, 1997.
- Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2001.
- Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley, 2005.

2 Why Mine Data?
- Lots of data is being collected and warehoused
  - Web data, e-commerce
  - Purchases at department and grocery stores
  - Bank and credit card transactions
- Computers have become cheaper and more powerful
- Competitive pressure is strong
  - Provide better, customized services for an edge (e.g. in Customer Relationship Management)

3 Mining Large Data Sets: Motivation
- There is often information “hidden” in the data that is not readily evident
- Human analysts may take weeks to discover useful information
- Much of the data is never analyzed at all

4 What is Data Mining?
- Many definitions
  - Non-trivial extraction of implicit, previously unknown, and potentially useful information from data
  - Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

5 What is (not) Data Mining?
- What is data mining?
  - Discovering that certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in the Boston area)
  - Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
- What is not data mining?
  - Looking up a phone number in a phone directory
  - Querying a Web search engine for information about “Amazon”

6 Data Mining Tasks
- Prediction methods
  - Use some variables to predict unknown or future values of other variables.
- Description methods
  - Find human-interpretable patterns that describe the data.

7 Three Main Data Mining Tasks
- Classification
- Clustering
- Association rule discovery
- There are many other approaches, but most of them fall into one of these three categories.

8 Data Mining for the Retail Industry (slide footer: August 3, 2015, Data Mining: Concepts and Techniques)
- Retail industry: huge amounts of data on sales, customer shopping history, etc.
- Applications of retail data mining
  - Identify customer buying behaviors
  - Discover customer shopping patterns and trends
  - Improve the quality of customer service
  - Achieve better customer retention and satisfaction
  - Enhance goods consumption ratios
  - Design more effective goods transportation and distribution policies

9 Customer Profiling
- Question: which kinds of customers were profitable last year?
- Data
  - Customer details such as age, gender, occupation, salary level, account, etc.
  - Earnings from each customer in the last year
- Data mining
  - Divide customers into profitability categories according to earnings, such as highly profitable, profitable, non-profitable, and loss
  - Find rules using data mining techniques
  - Analyze the rules and take action

10 Customer Profiling: Rules
IF age > 30 AND age <= 45 AND occupation is professional AND salary level is between 50,000 and 70,000 THEN this customer is profitable
Each rule comes with statistical measures such as support and confidence.
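
To make the two measures concrete, here is a minimal Python sketch that evaluates the rule from this slide against a handful of customer records. The `Customer` class, the function names, and the four sample records are hypothetical and exist only for illustration; the slides do not say how the original rules were scored. Support is taken here as the share of all customers that match the rule and are profitable, and confidence as the share of rule-matching customers that are profitable.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical record layout, loosely following the customer details on the slides.
    age: int
    occupation: str
    salary: float
    category: str  # e.g. "highly profitable", "profitable", "non-profitable", "loss"


def rule_matches(c: Customer) -> bool:
    """Antecedent of the slide's rule: age in (30, 45], professional, salary 50k to 70k."""
    return 30 < c.age <= 45 and c.occupation == "professional" and 50_000 <= c.salary <= 70_000


def support_and_confidence(customers, target="profitable"):
    """Return (support, confidence) of the rule 'antecedent => category == target'."""
    matched = [c for c in customers if rule_matches(c)]
    hits = [c for c in matched if c.category == target]
    support = len(hits) / len(customers) if customers else 0.0
    confidence = len(hits) / len(matched) if matched else 0.0
    return support, confidence


# Made-up sample data, not taken from the presentation.
customers = [
    Customer(35, "professional", 60_000, "profitable"),    # matches the rule, profitable
    Customer(40, "professional", 65_000, "loss"),          # matches the rule, not profitable
    Customer(50, "professional", 60_000, "profitable"),    # too old for the rule
    Customer(32, "clerical", 55_000, "non-profitable"),    # wrong occupation
]

support, confidence = support_and_confidence(customers)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

On this toy data two customers match the rule and one of them is profitable, so the rule holds for half of the customers it covers and for a quarter of all customers.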

11 Customer Segmentation
- Customer segmentation is the process of dividing customers into different groups or segments. Customers in the same segment have similar needs or behaviors, so similar marketing strategies or service policies can be applied to them.
- Customer segments are required in several business areas, including
  - Marketing
  - Customer services
  - Product and service development
  - Sales promotion
  - Purchase recommendation
  - Customer retention

12 Customer Retention
- In most industries, the cost of retaining a customer, subscriber, or client is substantially less than the initial cost of obtaining that customer.
- Question:
  - Find out what kinds of customers tend to churn, and build a model that can predict the customers likely to churn.
- Data mining solution:
  - Collect data about the customers who have churned.
  - Select a set of customers who have been loyal.
  - Merge the two data sets to form training, testing, and evaluation data sets.
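
The merge-and-split step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 60/20/20 split, the label encoding (1 for churned, 0 for loyal), and the fixed random seed are all choices made here, not details given in the slides.

```python
import random


def build_churn_datasets(churned, loyal, train_frac=0.6, test_frac=0.2, seed=0):
    """Label, merge, shuffle, and split customer records.

    Returns (train, test, evaluation), each a list of (record, label) pairs,
    where label 1 means the customer churned and 0 means the customer stayed.
    The split fractions are assumptions; the remainder goes to evaluation.
    """
    labeled = [(record, 1) for record in churned] + [(record, 0) for record in loyal]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(labeled)
    n_train = round(len(labeled) * train_frac)
    n_test = round(len(labeled) * test_frac)
    train = labeled[:n_train]
    test = labeled[n_train:n_train + n_test]
    evaluation = labeled[n_train + n_test:]
    return train, test, evaluation


# Made-up records: 8 churned customers and 12 loyal ones.
churned = [{"id": i} for i in range(8)]
loyal = [{"id": i} for i in range(8, 20)]
train, test, evaluation = build_churn_datasets(churned, loyal)
print(len(train), len(test), len(evaluation))
```

Shuffling before splitting keeps churned and loyal customers mixed across the three sets; a stratified split would additionally preserve the churn rate in each set, which this sketch does not attempt.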

13 Financial Products Recommendation
- Mellon Bank Corporation is a major financial services company headquartered in Pittsburgh.
- Build an extendible loan secured by the value of a client’s own property.
  - Achieve the highest possible return on investment.
  - Based on customers with DDAs (demand deposit accounts), build a model for HELOCs (home equity lines of credit).

14 Data Preparation
- The primary data source was the approximately 40,000 Mellon customers who had (or once had) HELOCs and DDAs.
- Data
  - Demographic data sourced both internally and externally (age, income, length of residence, and other indicators of economic condition)
  - DDA data (history of loan balance over 3, 6, 9, 12, and 18 months, history of returned checks, history of interest rates)
  - Property data sourced externally (home purchase price, loan-to-value ratio)
  - Other data related to creditworthiness
- In total, 120 variables were used

16 Responders

17 Basket Analysis

18 Basket Analysis
Transactions:
  A B C
  A C D
  B C D
  A D E
  B C E

Rule        Support  Confidence
A → D       2/5      2/3
C → A       2/5      2/4
A → C       2/5      2/3
B & C → D   1/5      1/3
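
The support and confidence figures for the four basket analysis rules can be reproduced from the five transactions on this slide. The sketch below uses exact fractions; the function names are illustrative choices, and the definitions assumed are the usual ones: support is the fraction of transactions containing every item in the rule, and confidence is that support divided by the support of the left-hand side.

```python
from fractions import Fraction

# The five transactions listed on the slide.
transactions = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    count = sum(1 for t in transactions if itemset <= t)
    return Fraction(count, len(transactions))


def confidence(antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)


rules = [({"A"}, {"D"}), ({"C"}, {"A"}), ({"A"}, {"C"}), ({"B", "C"}, {"D"})]
for lhs, rhs in rules:
    name = f"{' & '.join(sorted(lhs))} -> {' & '.join(sorted(rhs))}"
    print(name, "support", support(lhs | rhs), "confidence", confidence(lhs, rhs))
```

`Fraction` reduces its results, so the confidence of C → A is printed as 1/2 rather than the unreduced 2/4 shown on the slide. A rule whose left-hand side never occurs would raise `ZeroDivisionError` in this sketch.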

19 The Impact of Fraud
- The GAO (United States General Accounting Office) cited $19.1 billion in improper government payments in 17 major programs for fiscal year 1998.
  - Medicare: $12.6 billion
  - Supplemental Security Income: $1.6 billion
  - The Food Stamp Program: $1.4 billion
  - Old Age and Survivors Insurance: $1.2 billion
  - Disability Insurance: $941 million
  - Housing Subsidies: $847 million
  - Veterans’ Benefits, Unemployment Insurance, and others: $514 million

20 Background
- HIC (the Health Insurance Commission) is a federal government agency in Australia.
- HIC pays insurance claims of more than 20 million Australian dollars and pays out about A$8 billion in funds every year.
- More than 300 million transactions are processed and stored every year, amounting to 1.3 TB in five years.

21 Preventing Fraud and Abuse
- Business objectives
  - The focus of the HIC project was the recent and steady 10% annual rise in the cost of pathology claims for clinical tests.
- Approaches
  - Identify potentially fraudulent claims or claims arising from inappropriate practice.
  - Develop general profiles of GP practices in order to compare the practice behaviors of individual GPs.

22 Data Preprocessing
- Two databases
  - Episode database: each episode record describes one patient visit. In total there were 6.8 million records covering 227 different pathology tests.
  - GP (doctor) database: 17,000 records related to active GPs.
- The behavior of 10,409 GPs was to be studied.
  - A matrix of 10,409 by 227 elements was built.
  - The elements were then scaled from 0 to 1 with respect to the total number of tests of each kind.
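
One plausible reading of the scaling step is that each cell of the GP-by-test matrix is divided by the total count for that test across all GPs, so every value lands between 0 and 1. The Python sketch below follows that reading; the slide does not spell out the exact formula, so the column-total normalization, the function name, and the tiny 3-by-3 example (standing in for the real 10,409-by-227 matrix) are assumptions.

```python
def scale_by_test_totals(counts):
    """Scale a GP-by-test count matrix so each column is a share of that test's total.

    counts[i][j] is the number of times GP i ordered test j.
    A test that was never ordered keeps 0.0 in every row.
    """
    if not counts:
        return []
    n_tests = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_tests)]
    return [
        [row[j] / totals[j] if totals[j] else 0.0 for j in range(n_tests)]
        for row in counts
    ]


# Made-up counts: 3 GPs (rows) by 3 pathology tests (columns).
counts = [
    [2, 0, 5],
    [6, 0, 5],
    [0, 0, 10],
]
scaled = scale_by_test_totals(counts)
for row in scaled:
    print(row)
```

With this normalization each column sums to 1 (or to 0 for a test nobody ordered), which makes rarely ordered and frequently ordered tests comparable when GPs are later grouped into segments.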

23 Input to Segmentation

24 Overview

25 Data Mining
- Association rule mining was run with a minimum support of 0.25%. The team decided that the presence of some tests in the input database, the Pathology Episode Initiation (PEI) tests, was causing spurious rules to be revealed.
- PEI tests depend on who ordered them and where they were ordered.
- When the PEI tests were removed, the number of rules dropped significantly.

26 Result Analysis
- A request for a microscopic examination of feces for parasites (OCP) was associated with a culture examination of feces (FCS) in 0.85% of cases.
  - There was a 92.6% chance that if OCP tests were requested, they would be done together with FCS.
  - There was a 0.61% chance that OCP was associated with a different, more expensive test called MCS32, which costs A$13.55 per test.
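
If the 0.85% figure is read as the support of the rule OCP → FCS and the 92.6% figure as its confidence, the share of cases that contain an OCP request at all follows from confidence = support(OCP and FCS) / support(OCP). The short Python sketch below does that arithmetic; treating the two percentages this way is an interpretation, and the function name is hypothetical.

```python
def antecedent_support(rule_support, rule_confidence):
    """Support of a rule's left-hand side, derived from the rule's support and confidence."""
    if not 0 < rule_confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    return rule_support / rule_confidence


# Figures from the slide, read as support and confidence of OCP -> FCS.
ocp_share = antecedent_support(rule_support=0.0085, rule_confidence=0.926)
print(f"OCP appears in about {ocp_share:.2%} of cases")
```

Under this reading OCP is requested in roughly 0.92% of cases, so nearly every OCP request arrives together with FCS.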

27 GP Profiles

28 Discussions
- Segment 13:
  - Represents the majority of traditional GPs who practice conventionally: 5,450 GPs, or 52% of all GPs.
  - Accounts for only 6.2% of the medical pathology tests.
- Segment 4:
  - 54 GPs, only 0.51% of all GPs.
  - Accounts for 2.7% of the medical pathology tests.

29 Financial Data Mining: News Sensitive Stock Prediction

30 Advanced Topics
- Sequential mining
- Time-series mining
- Spatial mining
- Web mining
- Social network mining
- Text mining
- Data stream mining
- Mining and privacy

31 Examples of Data Mining Systems
- Microsoft SQL Server
  - Integrates DB and OLAP with mining
  - Supports the OLE DB for DM standard
- SAS Enterprise Miner
  - A variety of statistical analysis tools
  - Data warehouse tools and data mining algorithms
- IBM Intelligent Miner
  - A wide range of data mining algorithms
  - Scalable mining algorithms
  - Data preparation and data visualization tools
  - Tight integration with IBM's DB2 RDBMS
- Clementine (SPSS)
  - An integrated data mining development environment for end users and developers
  - Multiple data mining algorithms and visualization tools
- Weka (http://cs.waikato.ac.nz/ml/weka)
  - A free data mining tool

