MIS 451 Building Business Intelligence Systems
Introduction to Data Mining
Why data mining? OLAP can only provide shallow data analysis: it answers "what" questions.
Ex: sales distribution by product
Why data mining? Shallow data analysis is not sufficient to support business decisions, which depend on "how" questions.
Ex: how to boost sales of other products
Ex: when people buy product 6, what other products are they likely to buy? (cross selling)
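A minimal sketch of the cross-selling question, assuming each transaction is available as a set of product IDs. The baskets and the helper name `co_purchase_confidence` are made up for illustration; the score is the fraction of baskets containing the target product that also contain each other product.

```python
from collections import Counter

# Hypothetical baskets: each set holds the product IDs bought in one transaction.
baskets = [
    {6, 2, 9},
    {6, 2},
    {6, 4},
    {1, 2},
    {6, 2, 4},
]

def co_purchase_confidence(baskets, target):
    """For each other product p, estimate P(p in basket | target in basket)."""
    with_target = [b for b in baskets if target in b]
    counts = Counter(p for b in with_target for p in b if p != target)
    return {p: n / len(with_target) for p, n in counts.items()}

confidence = co_purchase_confidence(baskets, target=6)
# Rank candidate cross-sell products by confidence, highest first.
ranking = sorted(confidence.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking)
```

With this toy data, product 2 ranks first because it appears in three of the four baskets that contain product 6.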
Why data mining? OLAP can only do shallow data analysis
OLAP is based on SQL:
    SELECT PRODUCTS.PNAME, SUM(SALESFACTS.SALES_AMT)
    FROM DBSR.PRODUCTS PRODUCTS, DBSR.SALESFACTS SALESFACTS
    WHERE PRODUCTS.PRODUCT_KEY = SALESFACTS.PRODUCT_KEY
    GROUP BY PRODUCTS.PNAME;
SQL is a declarative query language, so complicated algorithms are hard to implement in it.
Complicated algorithms need to be developed to support deep data analysis: this is data mining.
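To see what such a query returns, here is a small sketch that runs it against an in-memory SQLite database. The rows are invented, the `DBSR` schema prefix is dropped because SQLite has no such schema here, and an `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows, shaped after the column names in the query.
conn.executescript("""
    CREATE TABLE PRODUCTS (PRODUCT_KEY INTEGER PRIMARY KEY, PNAME TEXT);
    CREATE TABLE SALESFACTS (PRODUCT_KEY INTEGER, SALES_AMT REAL);
    INSERT INTO PRODUCTS VALUES (1, 'Product 1'), (6, 'Product 6');
    INSERT INTO SALESFACTS VALUES (1, 100.0), (6, 250.0), (6, 50.0);
""")
# Sales distribution by product: one total per product name.
rows = conn.execute("""
    SELECT PRODUCTS.PNAME, SUM(SALESFACTS.SALES_AMT)
    FROM PRODUCTS, SALESFACTS
    WHERE PRODUCTS.PRODUCT_KEY = SALESFACTS.PRODUCT_KEY
    GROUP BY PRODUCTS.PNAME
    ORDER BY PRODUCTS.PNAME
""").fetchall()
print(rows)
conn.close()
```

The result is a summary table of totals, which answers "what" was sold but says nothing about why or what to do next.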
Why data mining? OLAP results generated from data sets with a large number of attributes are difficult to interpret.
Ex: cluster the customers of my company (target marketing)
Pick two attributes related to a customer: income level and sales amount
Why data mining? Ex: cluster the customers of my company (target marketing)
Pick three attributes related to a customer: income level, education level and sales amount
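The clustering example can be sketched with plain k-means, one common clustering algorithm (the slides do not name a specific one). The customer values below are invented and assumed to be pre-scaled to 0..1; the initial centroids are simply evenly spaced picks from the data, which is a simplification.

```python
import math

# Hypothetical customers: (income level, education level, sales amount),
# each already scaled to the range 0..1 so no attribute dominates the distance.
customers = [
    (0.10, 0.20, 0.15),
    (0.15, 0.10, 0.10),
    (0.20, 0.25, 0.20),
    (0.80, 0.90, 0.85),
    (0.90, 0.80, 0.95),
    (0.85, 0.95, 0.90),
]

def kmeans(points, k, iterations=20):
    """Plain k-means; the initial centroids are evenly spaced picks from points."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

labels, centroids = kmeans(customers, k=2)
print(labels)
```

The same function works for two attributes or for many, which is the point of the slide: the algorithm finds the groups even when a chart of the raw OLAP output would be unreadable.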
What is data mining? Data mining is a process to extract hidden and interesting patterns from data.
Data mining is a step in the process of Knowledge Discovery in Databases (KDD).
Steps of the KDD Process
Data
  -> Step 1: Selection -> Target Data
  -> Step 2: Cleaning -> Preprocessed Data
  -> Step 3: Transformation -> Transformed Data
  -> Step 4: Data Mining -> Patterns
  -> Step 5: Interpretation & Evaluation -> Knowledge
Steps of the KDD Process
Step 1: select the columns (attributes) and rows (records) of interest to be mined
Step 2: clean errors from the selected data
Step 3: transform the data into a form suitable for high-performance data mining
Step 4: data mining
Step 5: filter out uninteresting patterns from the data mining results
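The five steps can be sketched end to end on a toy data set. Everything here is an assumption for illustration: the record layout, the cleaning rules, and the interest threshold are invented, and Step 4 uses a simple per-product total as a stand-in for a real mining algorithm.

```python
from collections import Counter

# Hypothetical raw records: (customer_id, product, quantity, store_note)
raw = [
    (1, "tea", 2, "n/a"),
    (2, "Tea ", 1, "n/a"),
    (3, "coffee", -5, "n/a"),   # negative quantity: a data-entry error
    (4, "coffee", 3, "n/a"),
    (5, None, 1, "n/a"),        # missing product
    (6, "tea", 4, "n/a"),
    (7, "milk", 1, "n/a"),
]

# Step 1: selection - keep only the columns of interest.
target = [(product, qty) for _, product, qty, _ in raw]
# Step 2: cleaning - drop records with missing or impossible values.
clean = [(p, q) for p, q in target if p is not None and q > 0]
# Step 3: transformation - normalise product names to one spelling.
transformed = [(p.strip().lower(), q) for p, q in clean]
# Step 4: data mining - total quantity per product, standing in for a real algorithm.
patterns = Counter()
for p, q in transformed:
    patterns[p] += q
# Step 5: interpretation - keep only patterns above a hypothetical interest threshold.
interesting = {p: total for p, total in patterns.items() if total >= 3}
print(interesting)
```

Each intermediate list corresponds to one box in the KDD diagram: target data, preprocessed data, transformed data, patterns, and finally the patterns judged worth keeping.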
Data mining – on what kind of data
Transactional database
Data warehouse
Flat file
Web data: Web content, Web structure, Web log
Major data mining tasks
Association rule mining: cross selling
Clustering: target marketing
Classification: potential customer identification, fraud detection
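As an illustration of the classification task, the sketch below labels a prospect as a likely customer or not with a 1-nearest-neighbour rule. This is only one of many possible classifiers and is not named in the slides; the training examples, attribute choice, and labels are all hypothetical.

```python
import math

# Hypothetical labelled history: (income level, education level) scaled to 0..1,
# with whether the person became a customer.
training = [
    ((0.9, 0.8), "customer"),
    ((0.8, 0.9), "customer"),
    ((0.2, 0.1), "non-customer"),
    ((0.1, 0.3), "non-customer"),
]

def classify(point, training):
    """1-nearest-neighbour: copy the label of the closest training example."""
    _, label = min(training, key=lambda ex: math.dist(point, ex[0]))
    return label

print(classify((0.85, 0.7), training))
print(classify((0.15, 0.2), training))
```

Unlike clustering, classification needs examples whose labels are already known; the model then predicts the label of a new record.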
Reading: data mining book, chapter 1