Chapter 5: Data Mining for Business Intelligence

1 Chapter 5: Data Mining for Business Intelligence
Decision Support and Business Intelligence Systems (9th Ed., Prentice Hall)

2 Learning Objectives
Define data mining as an enabling technology for business intelligence.
Understand the objectives and benefits of business analytics and data mining.
Recognize the wide range of applications of data mining.
Understand the steps involved in data preprocessing for data mining.

3 Introduction
Data is produced at a phenomenal rate.
Our ability to store data has grown.
Users expect more sophisticated information.
How? Uncover hidden information: data mining.

4 Examples: What is (not) Data Mining?
What is not Data Mining?
Look up a phone number in a phone directory.
Query a Web search engine for information about “Amazon”.
What is Data Mining?
Certain names are more prevalent in certain US locations (e.g. in Boston area,…).
Group together similar documents returned by a search engine according to their context (e.g., …).
A customer with income between 10,000 and 20,000 and age between 20 and 25 who purchased milk and bread is likely to purchase diapers within 5 years.
The amount of fish sold to people who live in a certain area and have income between 20,000 and 35,000 is increasing.

5 Data Mining
Data Mining: the process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions.
It involves the analysis of data and the use of software techniques for finding hidden and unexpected patterns and relationships in sets of data.
Potential result: higher-level meta information that may not be obvious when looking at raw data.
Similar terms: exploratory data analysis, data-driven discovery, inductive learning.

6 Decisions in Data Mining
Databases to be mined: relational, transactional, object-oriented, object-relational, spatial, time-series, text, multi-media, heterogeneous, legacy, WWW, etc.
Knowledge to be mined: association, classification, clustering, etc.
Techniques utilized: database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, neural network, etc.
Applications adapted: retail, telecommunication, banking, fraud analysis, DNA mining, stock market analysis, Web mining, Weblog analysis, etc.

7 DBMS and Data Mining
Task. DBMS: extraction of detailed and summary data. Data Mining: knowledge discovery of hidden patterns and insights.
Type of result. DBMS: information. Data Mining: insight and prediction.
Method. DBMS: deduction (ask the question, verify with data). Data Mining: induction (build the model, apply it to new data, get the result).
Example question. DBMS: Who purchased mutual funds in the last 3 years? Data Mining: Who will buy a mutual fund in the next 6 months and why?

8 Data Mining Tasks
Prediction Tasks: use some variables to predict unknown or future values of other variables.
Description Tasks: find human-interpretable patterns that describe the data.
Common data mining tasks:
Classification [Predictive]: find all credit applicants who are poor credit risks.
Clustering [Descriptive]: identify customers with similar buying habits.
Association Rule Discovery [Descriptive]: find all items which are frequently purchased with milk.
Sequential Pattern Discovery [Descriptive]

9 A Taxonomy for Data Mining Tasks

10 Classification: Definition
Given a collection of records (the training set):
Each record contains a set of attributes; one of the attributes is the class.
Find a model for the class attribute as a function of the values of the other attributes.
Goal: previously unseen records should be assigned a class as accurately as possible.
A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
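The train/test procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the book: the 1-nearest-neighbour classifier, the 30% test fraction, the function names, and the sample records are all assumptions chosen to keep the example small.

```python
import random

def split_train_test(records, test_fraction=0.3, seed=0):
    """Shuffle the records and split them into (training set, test set)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict_1nn(training_set, attributes):
    """Assign the class of the training record closest to `attributes`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(training_set, key=lambda rec: sq_dist(rec[0], attributes))
    return nearest[1]

def accuracy(training_set, test_set):
    """Fraction of test records whose predicted class matches the true class."""
    correct = sum(
        1 for attrs, label in test_set
        if predict_1nn(training_set, attrs) == label
    )
    return correct / len(test_set)

# Hypothetical records: ((income in thousands, age), class attribute)
records = [
    ((15, 22), "buy"), ((18, 24), "buy"), ((12, 21), "buy"),
    ((17, 25), "buy"), ((14, 23), "buy"),
    ((60, 50), "don't buy"), ((65, 48), "don't buy"), ((58, 52), "don't buy"),
    ((62, 55), "don't buy"), ((70, 47), "don't buy"),
]

train, test = split_train_test(records)
acc = accuracy(train, test)
print(f"training records: {len(train)}, test records: {len(test)}, accuracy: {acc}")
```

The model is built only from `train`; `test` plays the role of previously unseen records, so the reported accuracy estimates how well the model generalizes.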

11 Classification Example
[Figure: Training Set, Learn Classifier, Model, Test Set]

12 Classification: Application Example
Direct Marketing
Goal: reduce the cost of mailing by targeting a set of consumers likely to buy a new cell-phone product.
Approach:
Use the data for a similar product introduced before.
We know which customers decided to buy and which decided otherwise. This {buy, don’t buy} decision forms the class attribute.
Collect various demographic, lifestyle, and company-interaction related information about all such customers: type of business, where they stay, how much they earn, etc.
Use this information as input attributes to learn a classifier model.

13 Clustering: Definition
Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that:
Data points in one cluster are more similar to one another.
Data points in separate clusters are less similar to one another.
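One common way to find such clusters is k-means, sketched below. The slides do not name a specific algorithm, so k-means, squared Euclidean distance as the (dis)similarity measure, the caller-supplied starting centroids, and the sample points are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance; smaller means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def k_means(points, centroids, max_iter=100):
    """Alternate between assigning points to the nearest centroid and
    recomputing each centroid, until the centroids stop changing."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[idx].append(p)
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            mean(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters, centroids

# Hypothetical two-attribute data points forming two visible groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
clusters, centroids = k_means(points, centroids=[points[0], points[3]])
for cluster in clusters:
    print(cluster)
```

Points that end up in the same list are closer to their own centroid than to the other one, which matches the "more similar within, less similar across" requirement above.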

14 Clustering: Application Example
Market Segmentation
Goal: subdivide a market into distinct subsets of customers, where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix.
Approach:
Collect different attributes of customers based on their geographical and lifestyle related information.
Find clusters of similar customers.

15 Association Rule: Application Example
Supermarket shelf management.
Goal: to identify items that are bought together by sufficiently many customers.
Approach: process the point-of-sale data collected with barcode scanners to find dependencies among items.
A classic rule: if a customer buys diapers and milk, then he is very likely to buy beer.
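The two phrases in this example, "bought together by sufficiently many customers" and "very likely to buy", correspond to the standard support and confidence measures of an association rule. The sketch below computes both for the diapers-and-milk rule; the measure names, function names, and the five sample transactions are assumptions, not data from the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0

# Hypothetical point-of-sale baskets
transactions = [
    {"diapers", "milk", "beer"},
    {"diapers", "milk", "beer", "bread"},
    {"diapers", "milk"},
    {"milk", "bread"},
    {"beer", "chips"},
]

rule_support = support(transactions, {"diapers", "milk", "beer"})
rule_confidence = confidence(transactions, {"diapers", "milk"}, {"beer"})
print(f"support: {rule_support:.2f}, confidence: {rule_confidence:.2f}")
```

A rule is usually kept only if both values exceed thresholds chosen by the analyst; the thresholds themselves are a business decision and are not fixed by the technique.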

16 Data Preparation – A Critical DM Task

17 Examples of Data Mining applications:
Retail / Marketing: identifying buying patterns of customers; predicting response to mailing campaigns.
Banking: detecting patterns of credit card (CC) fraud; identifying loyal customers.
Medicine: identifying successful medical therapies.
Banking and Other Financial: automating the loan application process; detecting fraudulent transactions; maximizing customer value (cross-, up-selling).

18 Examples of Data Mining applications:
Customer Relationship Management: maximize return on marketing campaigns; improve customer retention; maximize customer value (cross-, up-selling); identify and treat most valued customers.
Manufacturing and Maintenance: predict/prevent machinery failures; identify anomalies in production systems to optimize the use of manufacturing capacity; discover novel patterns to improve product quality.

19 Data Mining Applications
Brokerage and Securities Trading: predict changes in certain bond prices; forecast the direction of stock fluctuations.

20 Data Mining Software
Commercial: SPSS - PASW (formerly Clementine); SAS - Enterprise Miner; IBM - Intelligent Miner; StatSoft - Statistical Data Miner; … many more
Free and/or Open Source: Weka; RapidMiner; …
Source:, May 2009
