Business Analytics: Methods, Models, and Decisions, 1st edition, by James R. Evans (Pearson, 2013)
Chapter 12: Data Mining

• The Scope of Data Mining
• Data Exploration and Reduction
• Classification
• Classification Techniques
• Association Rule Mining
• Cause-and-Effect Modeling

• Data mining is a rapidly growing field of business analytics focused on better understanding characteristics and patterns among variables in large data sets.
• It is used to identify and understand hidden patterns that large data sets may contain.
• It involves both descriptive and predictive analytics, though it is primarily predictive.

Some common approaches to data mining
• Data Exploration and Reduction – identify groups in which elements are similar
◦ Understand differences among customers and segment them into homogeneous groups
◦ Macy's has identified four lifestyles of customers (with male versions too):
1. Traditional classic dresser – likes quality, dislikes risk
2. Neotraditional – more edgy, still classic
3. Contemporary – loves newness, shops by brand
4. Fashion customer – wants the latest and greatest
• Useful in design and marketing to better target products
• Also used to identify successful employees and improve recruiting and hiring.

Some common approaches to data mining
• Classification – analyze data to predict how to classify new elements
◦ Spam filtering, by examining textual characteristics of a message
◦ Predicting whether a credit-card transaction may be fraudulent
◦ Is a loan application high risk?
◦ Will a consumer respond to an ad?

Some common approaches to data mining
• Association – analyze data to identify natural associations among variables and create rules for target marketing or buying recommendations
◦ Netflix uses association to understand what types of movies a customer likes and provides recommendations based on that data
◦ Amazon makes recommendations based on past purchases
◦ Supermarket loyalty cards collect data on customers' purchase habits and print coupons based on what was just bought.

Some common approaches to data mining
• Cause-and-effect modeling – develop analytic models to describe relationships (e.g., regression) that drive business performance
◦ Examples: profitability, customer satisfaction, employee satisfaction
◦ Johnson Controls predicted that a one percent increase in its overall customer satisfaction score was worth $13 million in service contract renewals a year.
• Regression and correlation analysis are key tools for cause-and-effect modeling (a minimal sketch follows).
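As a minimal illustration of cause-and-effect modeling with regression, here is a sketch in Python; the satisfaction and renewal figures are invented for the example and are not from the Johnson Controls study:

```python
# Sketch: simple linear regression for cause-and-effect modeling.
# The satisfaction/renewal numbers below are made up for illustration.
import numpy as np
from scipy import stats

satisfaction = np.array([72, 75, 78, 80, 83, 85, 88, 90])      # survey score
renewals = np.array([110, 118, 121, 130, 134, 142, 150, 155])  # $M in renewals

result = stats.linregress(satisfaction, renewals)
print(f"slope = {result.slope:.2f} ($M per satisfaction point)")
print(f"correlation r = {result.rvalue:.3f}")
```

The slope estimates how much the outcome moves per unit change in the driver, which is exactly the kind of statement ("one point of satisfaction is worth $X") the slide describes.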

Cluster Analysis
Cluster analysis has many powerful uses, such as market segmentation, and lets you view each individual record's predicted cluster membership.
• Also called data segmentation
• Two major methods:
1. Hierarchical clustering
a) Agglomerative methods (used in XLMiner) proceed as a series of fusions
b) Divisive methods successively separate the data into finer groups
2. k-means clustering (available in XLMiner), which partitions data into k clusters so that each element belongs to the cluster with the closest mean (a k-means sketch follows)
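A minimal k-means sketch in Python with scikit-learn; the two-column data and the choice k = 3 are invented for illustration, and this stands in for, rather than reproduces, XLMiner's k-means:

```python
# Sketch: k-means clustering with scikit-learn.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))                 # stand-in for two numeric columns

X_norm = StandardScaler().fit_transform(X)   # normalize, as XLMiner offers
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_norm)

print(km.labels_[:10])       # predicted cluster membership per record
print(km.cluster_centers_)   # each element belongs to the nearest center
```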

Agglomerative versus Divisive Hierarchical Clustering Methods (Figure 12.1)
• Agglomerative methods (the most common; used in XLMiner) proceed as a series of fusions of the objects into groups. Each fusion joins together the 2 clusters that are most similar.
• The figure is a dendrogram, a tree-like diagram that represents the fusions or divisions made at each successive stage of the analysis and summarizes the process of clustering.

Cluster Analysis – Agglomerative Methods (Figure 12.2)
• Dendrogram – a diagram illustrating fusions or divisions at successive stages
• Objects "closest" in distance to each other are gradually joined together.
• Euclidean distance is the most commonly used measure of the distance between objects (computed below).
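Euclidean distance between two records is the square root of the summed squared differences across variables; a quick sketch, with two invented records:

```python
# Sketch: Euclidean distance between two records.
import numpy as np

a = np.array([1200, 85.0])   # e.g., Median SAT, Graduation % (made-up values)
b = np.array([1100, 78.0])

dist = np.sqrt(np.sum((a - b) ** 2))   # same as np.linalg.norm(a - b)
print(dist)
```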

Example 12.1: Clustering Colleges and Universities (Figure 12.3)
• Cluster the Colleges and Universities data using the five numeric columns in the data set.
• Use the hierarchical method.

Example 12.1 (continued): Clustering Colleges and Universities (Figure 12.4)
Step 1 of 3:
• Add-Ins > XLMiner > Data Reduction and Exploration > Hierarchical Clustering
• Data Range: A3:G52
• Selected Variables: Median SAT through Graduation %

Example 12.1 (continued): Clustering Colleges and Universities (Figure 12.5)
Step 2 of 3:
• Normalize input data
• Similarity Measure: Euclidean distance
• Clustering Method: Average group linkage

Example 12.1 (continued): Clustering Colleges and Universities (Figure 12.6)
Step 3 of 3:
• Draw dendrogram
• Show cluster membership
• # Clusters: 4 (this stops the method from continuing until only 1 cluster is left)

Steps in Agglomerative Clustering
1. Start with n clusters (each observation is its own cluster).
2. The two closest observations are merged into one cluster.
3. At every step, the two clusters that are "closest" to each other are merged: either single observations are added to existing clusters, or two existing clusters are merged.
4. The process continues until all observations are merged into one cluster.

This process of agglomeration leads to the construction of a dendrogram, a tree-like diagram that summarizes the process of clustering. For any given number of clusters, we can determine the records in the clusters by sliding a horizontal line (ruler) up and down the dendrogram until the number of vertical intersections of the horizontal line equals the number of clusters desired (sketched below).
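A minimal agglomerative-clustering sketch with SciPy, using invented data; `method="average"` mirrors the average group linkage from Example 12.1, and `fcluster` plays the role of sliding the horizontal line:

```python
# Sketch: agglomerative clustering and dendrogram cut with SciPy.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from scipy.stats import zscore

rng = np.random.default_rng(1)
X = zscore(rng.normal(size=(50, 5)), axis=0)  # 50 records, 5 normalized columns

Z = linkage(X, method="average", metric="euclidean")  # series of fusions

# "Slide the ruler" until exactly 4 clusters remain:
labels = fcluster(Z, t=4, criterion="maxclust")
print(labels)                 # predicted cluster membership, 1..4

# dendrogram(Z) would draw the tree with matplotlib, if desired.
```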

Example 12.1 (continued): Clustering Colleges and Universities (Figure 12.7)
Hierarchical clustering results: Inputs section

Example 12.1 (continued): Clustering Colleges and Universities (Figure 12.8)
Hierarchical clustering results: Dendrogram
• y-axis measures intercluster distance
• x-axis indicates subcluster IDs

Example 12.1 (continued): Clustering of Colleges (from Figure 12.8)
Hierarchical clustering results: Dendrogram
• Smaller clusters "agglomerate" into bigger ones, with the least possible loss of cohesiveness at each stage.
• The height of the bars measures the dissimilarity of the clusters being merged into one.

Example 12.1 (continued): Clustering of Colleges (from Figure 12.9)
Hierarchical clustering results: Predicted clusters

Example 12.1 (continued): Clustering of Colleges (Figure 12.9)
Hierarchical clustering results: Predicted clusters, listing the colleges in each cluster (columns: Cluster #, Colleges).

Example 12.1 (continued): Clustering of Colleges
Hierarchical clustering results for clusters 3 and 4:
• Schools in cluster 3 appear similar.
• Cluster 4 has considerably higher Median SAT and Expenditures/Student.

We will analyze the Credit Approval Decisions data to predict how to classify new elements.
• Categorical variable of interest: Decision (whether to approve or reject a credit application)
• Predictor variables: shown in columns A through E

Modified Credit Approval Decisions
The categorical variables are coded as numeric (a coding sketch follows):
• Homeowner – 0 if No, 1 if Yes
• Decision – 0 if Reject, 1 if Approve
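A minimal sketch of the same 0/1 coding in pandas; the column values shown are made up, and the column names are assumed to match the data set:

```python
# Sketch: recode Yes/No categorical columns as 0/1 numerics.
import pandas as pd

df = pd.DataFrame({
    "Homeowner": ["Y", "N", "Y"],                 # made-up rows
    "Decision":  ["Approve", "Reject", "Approve"],
})

df["Homeowner"] = (df["Homeowner"] == "Y").astype(int)       # 1 if Yes
df["Decision"] = (df["Decision"] == "Approve").astype(int)   # 1 if Approve
print(df)
```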

Example 12.2: Classifying Credit-Approval Decisions
• Large bubbles correspond to rejected applications
• Classification rule: Reject if credit score ≤ 640
• 2 misclassifications out of 50 = 4% (a sketch of applying such a rule follows)
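Applying a simple cutoff rule and measuring its error rate, in a minimal sketch; the scores and true decisions are invented:

```python
# Sketch: apply the rule "Reject if credit score <= 640" and count errors.
import numpy as np

credit_score = np.array([620, 700, 655, 630, 710, 590])  # made-up applicants
actual       = np.array([0,   1,   0,   0,   1,   0])    # 1 = Approve, 0 = Reject

predicted = (credit_score > 640).astype(int)   # approve only above the cutoff

misclassified = np.sum(predicted != actual)
print(f"{misclassified} misclassifications out of {len(actual)} "
      f"= {misclassified / len(actual):.0%}")
```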

Example 12.2 (continued): Classifying Credit-Approval Decisions
• Classification rule: Reject if 0.095(credit score) + (years of credit history) ≤ c, for a fixed cutoff c
• 3 misclassifications out of 50 = 6%

Example 12.3: Classification Matrix for Credit-Approval Classification Rules (Table 12.1)
• Off-diagonal elements of the matrix are the misclassifications
• 4% = probability of a misclassification (a confusion-matrix sketch follows)
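The classification matrix is what most libraries call a confusion matrix; a minimal sketch with scikit-learn, reusing the invented actual/predicted vectors from the sketch above:

```python
# Sketch: classification (confusion) matrix; off-diagonals are errors.
import numpy as np
from sklearn.metrics import confusion_matrix

actual    = np.array([0, 1, 0, 0, 1, 0])   # made-up true decisions
predicted = np.array([0, 1, 1, 0, 1, 0])   # made-up rule output

cm = confusion_matrix(actual, predicted)
print(cm)                                  # rows: actual, cols: predicted
errors = cm.sum() - np.trace(cm)           # off-diagonal counts
print(f"misclassification rate = {errors / cm.sum():.0%}")
```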

Using Training and Validation Data
• Data mining projects typically involve large volumes of data.
• The data can be partitioned into:
▪ training data set – has known outcomes and is used to "teach" the data-mining algorithm
▪ validation data set – used to fine-tune a model
▪ test data set – tests the accuracy of the model
• In XLMiner, partitioning can be random or user-specified.

Example 12.4: Partitioning Data Sets in XLMiner (Modified Credit Approval Decisions data)
• XLMiner > Partition Data > Standard Partition
• Data Range: A3:F53
• Pick up rows randomly
• Variables in the partitioned data: (all)
• Partitioning %: Automatic

Example 12.4 (continued): Partitioning Data Sets in XLMiner
• Partitioning choices when choosing random:
1. Automatic – 60% training, 40% validation
2. Specify % – 50% training, 30% validation, 20% test (the training and validation % can be modified)
3. Equal # records – 33.33% each for training, validation, and test
• XLMiner has size and relative-size limitations on the data sets, which can affect the amount and % of data assigned to each set.
(A two-stage partitioning sketch follows.)
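A minimal sketch of the 50/30/20 split with scikit-learn, done in two stages since `train_test_split` splits two ways at a time; the data are invented:

```python
# Sketch: 50% training / 30% validation / 20% test partition.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)                 # stand-in for 50 records
y = np.random.default_rng(2).integers(0, 2, size=50)

# Stage 1: carve off 50% for training.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.5, random_state=0)

# Stage 2: split the remainder 30/20, i.e., 60/40 of what's left.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, train_size=0.6, random_state=0)

print(len(X_train), len(X_val), len(X_test))      # 25, 15, 10
```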

Example 12.4 (continued): Partitioning Data Sets in XLMiner
Portion of the output from a Standard Partition:
• First 30 rows: Training data
• Last 20 rows: Validation data

Example 12.5: Classifying New Data for Credit Decisions Using Credit Scores and Years of Credit History (Figure 12.16)
• Use the classification rule from Example 12.2: Reject if 0.095(credit score) + (years of credit history) ≤ c

Example 12.5 (continued): Classifying New Data for Credit Decisions Using Credit Scores and Years of Credit History
New data to classify: compute the weighted score for each new record and compare it with the cutoff c to decide whether to approve or reject.

Three Data-Mining Approaches to Classification
1. k-Nearest Neighbors (k-NN) algorithm – find records in a database that have similar numerical values of a set of predictor variables (sketched below)
2. Discriminant analysis – use predefined classes based on a set of linear discriminant functions of the predictor variables
3. Logistic regression – estimate the probability of belonging to a category using a regression on the predictor variables
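A minimal k-NN sketch with scikit-learn; the features, labels, and the choice k = 3 are invented for illustration:

```python
# Sketch: k-nearest-neighbors classification.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up training data: [credit score, years of credit history].
# (In practice, normalize first, since k-NN is scale-sensitive.)
X_train = np.array([[620, 2], [700, 10], [655, 6], [630, 3], [710, 12]])
y_train = np.array([0, 1, 1, 0, 1])          # 0 = Reject, 1 = Approve

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

X_new = np.array([[690, 8]])                 # a new applicant
print(knn.predict(X_new))                    # class of the 3 nearest records
```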

Discriminant Analysis
• Determine the class of an observation using linear discriminant functions of the form:
L = b1X1 + b2X2 + ... + bkXk
• The bi are the discriminant coefficients (weights)
• The bi are determined by maximizing between-group variance relative to within-group variance
• One discriminant function is formed for each category. New observations are assigned to the class whose function L has the highest value (sketched below).
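A minimal discriminant-analysis sketch using scikit-learn's LinearDiscriminantAnalysis; the training data are the same invented rows as above, and this illustrates the idea rather than XLMiner's exact output, with the fitted coefficients playing the role of the weights bi:

```python
# Sketch: linear discriminant analysis for classification.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X_train = np.array([[620, 2], [700, 10], [655, 6], [630, 3], [710, 12]])
y_train = np.array([0, 1, 1, 0, 1])          # 0 = Reject, 1 = Approve

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)
print(lda.coef_, lda.intercept_)             # discriminant weights and constant
print(lda.predict([[690, 8]]))               # class with the highest L value
```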

Example 12.8: Classifying Credit Decisions Using Discriminant Analysis
Step 1:
• Partition the data (see Example 12.4) to create the Data_Partition1 worksheet.
• XLMiner > Classification > Discriminant Analysis
• Worksheet: Data_Partition1
• Input variables: (5 of them)
• Output variable: Decision

Example 12.8 (continued): Classifying Credit Decisions Using Discriminant Analysis
Steps 2 and 3.

Example 12.8 (continued): Classifying Credit Decisions Using Discriminant Analysis (output).

Example 12.8 (continued): Classifying Credit Decisions Using Discriminant Analysis
• No misclassifications in the training data set.
• 15% misclassifications in the validation data set.

Example 12.9: Using Discriminant Analysis for Classifying New Data
• Partition the data (see Example 12.4) to create the Data_Partition1 worksheet, then follow Steps 1 and 2 of Example 12.8.
• Step 3: check "Score new data in: Detailed Report".

Example 12.9 (continued): Using Discriminant Analysis for Classifying New Data
Match variables in new range:
• Worksheet: Credit Decisions
• Data range: A57:E63
• Match variables with same names

Example 12.9 (continued): Using Discriminant Analysis for Classifying New Data
Half of the applicants are in the "Approved" class (the same 3 applicants as in Example 12.7).

Association Rule Mining (affinity analysis)
• Seeks to uncover associations in large data sets
• Association rules identify attributes that occur together frequently in a given data set.
• Market-basket analysis, for example, is used to determine groups of items consumers tend to purchase together.
• Association rules provide information in the form of if-then (antecedent-consequent) statements.
• The rules are probabilistic in nature.

Example: Custom Computer Configuration (PC Purchase Data)
Suppose we want to know which PC components are often ordered together.

Measuring the Strength of Association Rules
• Support for the (association) rule is the percentage (or number) of transactions that include all items, both antecedent and consequent:
Support = P(antecedent and consequent)
• Confidence of the (association) rule:
Confidence = P(consequent | antecedent) = P(antecedent and consequent) / P(antecedent)
• Expected confidence = P(consequent)
• Lift is the ratio of confidence to expected confidence.

Example: Measuring Strength of Association
A supermarket database has 100,000 point-of-sale transactions:
• 2,000 include both A and B items
• 5,000 include C
• 800 include A, B, and C
Association rule: If A and B are purchased, then C is also purchased.
• Support = 800/100,000 = 0.008
• Confidence = 800/2,000 = 0.40
• Expected confidence = 5,000/100,000 = 0.05
• Lift = 0.40/0.05 = 8
(These calculations are reproduced in the sketch below.)
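A minimal sketch that computes the same three measures from the transaction counts above:

```python
# Sketch: support, confidence, and lift from transaction counts.
n_total = 100_000      # all point-of-sale transactions
n_ab    = 2_000        # transactions containing both A and B (antecedent)
n_c     = 5_000        # transactions containing C (consequent)
n_abc   = 800          # transactions containing A, B, and C

support = n_abc / n_total                  # P(antecedent and consequent)
confidence = n_abc / n_ab                  # P(consequent | antecedent)
expected_confidence = n_c / n_total        # P(consequent)
lift = confidence / expected_confidence

print(support, confidence, expected_confidence, lift)  # 0.008 0.4 0.05 8.0
```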

Example: Identifying Association Rules for PC Purchase Data
• XLMiner > Association > Affinity
• Worksheet: Market Basket
• Data range: A5:L72
• First row contains headers
• Minimum support: 5
• Minimum confidence: 80

Example (continued): Identifying Association Rules for PC Purchase Data (rule output).

Example (continued): Identifying Association Rules for PC Purchase Data
Rules are sorted by their lift ratio, i.e., how much more likely a customer is to purchase the consequent given that they purchased the antecedents. (An association-rule-mining sketch follows.)
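Outside XLMiner, a similar affinity analysis can be sketched with the mlxtend library; the one-hot basket data below are invented, and `apriori`/`association_rules` are mlxtend functions, not part of the textbook's toolset:

```python
# Sketch: mining association rules with mlxtend (pip install mlxtend).
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Made-up one-hot basket data: each row is an order, each column an item.
baskets = pd.DataFrame({
    "CPU-fast": [1, 1, 0, 1, 1],
    "RAM-16GB": [1, 1, 0, 1, 0],
    "SSD":      [1, 0, 1, 1, 1],
}).astype(bool)

frequent = apriori(baskets, min_support=0.4, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.8)

# Sort by lift, as the XLMiner report does.
print(rules.sort_values("lift", ascending=False)
           [["antecedents", "consequents", "support", "confidence", "lift"]])
```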