XLMiner – a Data Mining Toolkit QuantLink Solutions Pvt. Ltd. www.quantlink.com www.xlminer.com.



XLMiner – a quick tour
Here is a short demo of XLMiner. Let us use a simple example: a bank sends mailers to its customers, offering a special deal on Personal Loans. In its previous campaign, it got only about 9% positive response.
Objective: target customers so as to increase the conversion rate. In other words, the question to address is: what profile indicates a high-potential customer?

XLMiner – a quick tour
Past campaign data will be used to train the data mining model. This is called supervised learning in Data Mining terms. Let’s see how to build a model and use it to improve the response rate.

Data description
Our past campaign data has the following customer attributes:
- Customer ID
- Customer’s Age
- Professional Experience
- Family Income
- Credit Card average annual spending
- Education Level
- Number of appliances owned
- Did this customer accept the past campaign offer?
The last variable is the known outcome of the past campaign. Our Data Mining model will use this for Supervised Learning.

A view of the data
This is what the data looks like. The variable labeled “PersLoan?” is binary:
- 0 means the customer was not interested in the Personal Loan.
- 1 means the customer was interested.

The Data Mining process
1. Partition the data into Training and Validation partitions.
2. Fit the model on the Training partition only.
3. Obtain results and see if they look good enough.
4. Check whether they are good for the Validation data too.
5. Study the outputs for the Validation data.
6. Try out several alternative models.
7. Choose and deploy the best model.

Start the analysis
Let’s get going with XLMiner. Notice that XLMiner is as easy to use as Excel! All we need to do is use the friendly menus. We follow just three simple steps to fit a model and see the outputs.

Step 1: Partition the data
We’ll create two partitions by choosing the records randomly. The Training partition will be used for fitting the model. The Validation partition will be used for checking whether the model also fits a separate set of records whose outcomes are known.
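
The random split described here can be sketched outside XLMiner in a few lines of Python. This is only an illustration: the function name `partition`, the 60/40 split and the seed value are assumptions, not settings taken from the demo.

```python
import random

def partition(records, train_fraction=0.6, seed=12345):
    """Randomly split records into (training, validation) lists.

    train_fraction and seed are illustrative assumptions.
    """
    rng = random.Random(seed)   # fixed seed so the split is repeatable
    shuffled = list(records)    # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

customer_ids = list(range(1, 11))   # ten made-up customer IDs
training, validation = partition(customer_ids)
print(len(training), len(validation))
```

Every record lands in exactly one partition, so the model is never checked on the records it was fitted on.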

Partitioned data
XLMiner creates a Partition Sheet that shows the data split into two partitions. Hyperlinks on the Navigator make it easy to view either partition.

Step 2: Fitting the model
This is a “Classification” problem: we want to predict whether each customer is likely or not likely to take a Personal Loan. Let’s use one of the available techniques, the Classification Tree. Later we can use other Classification techniques. In the dialog, we select the input (predictor) variables and the outcome variable.

Step 2: Fitting the model
The model fit guides us through easy wizard-like steps. In these steps we choose technique-specific parameters and the output options. In the end, we click Finish to produce the results.

Step 3: Understanding the outputs
The friendly Output Navigator lets us go over all the outputs. The Summaries show us the classification error percentages, i.e. how well the model is predicting. Other outputs (like the Tree here) tell us the decision rules that the model is suggesting. Many other diagnostic outputs are available, depending on the options we choose.

Output 1: Validation Summary
First, we look at how well the model predicted for the Validation data set. In this data, where we already knew the outcome, 156 “will buy” customers were predicted correctly and 38 wrongly. Of the “won’t buy” customers, all but 5 were predicted correctly. The summary also shows the corresponding error percentages. The errors are not very small but could still indicate a workable model.
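
As a worked example, the error percentage for the “will buy” class follows directly from the two counts quoted on this slide. The “won’t buy” total is not given in the text, so only the first class is computed; the variable names are illustrative.

```python
# Counts quoted on the slide for the "will buy" class.
will_buy_correct = 156
will_buy_wrong = 38

# Error percentage = misclassified records / all records of that class.
will_buy_total = will_buy_correct + will_buy_wrong
will_buy_error_pct = 100 * will_buy_wrong / will_buy_total

print(f"{will_buy_wrong} of {will_buy_total} actual buyers misclassified "
      f"({will_buy_error_pct:.1f}%)")
```

Roughly one actual buyer in five is missed, which is why the slide calls the errors “not very small”.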

Output 2: the decision rule
Here is the Classification Tree, which gives Decision Rules that are easy to understand and implement. Cut-off points for different variables decide whether to go Left or Right. In the tree, 0 means not likely to buy and 1 means likely to buy.

The decision rule in table form
The same decision rule that the tree shows visually can be converted into a table. This is useful for implementing it in your information systems.
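
To illustrate how a rule table could be implemented in an information system, here is a minimal Python sketch. The table layout, the variable names (`income`, `education`) and the cut-off values are all invented for illustration; they are not the rules produced in this demo.

```python
# Hypothetical rule table: variables and cut-offs are made up and are
# NOT the ones from the XLMiner output.
# Decision row: node -> (variable, cut_off, node_if_below, node_if_at_or_above)
DECISIONS = {
    0: ("income", 100.0, 1, 2),
    2: ("education", 2.0, 3, 4),
}
# Terminal row: node -> predicted class (0 = not likely, 1 = likely to buy)
LEAVES = {1: 0, 3: 0, 4: 1}

def classify(customer):
    """Walk the table from the root until a terminal node is reached."""
    node = 0
    while node not in LEAVES:
        variable, cut_off, left, right = DECISIONS[node]
        node = left if customer[variable] < cut_off else right
    return LEAVES[node]

print(classify({"income": 60.0, "education": 3}))    # goes left at the root
print(classify({"income": 150.0, "education": 3}))   # goes right at both splits
```

Keeping the rules as data rather than nested if-statements means a refitted tree only requires replacing the table.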

Output 3: more details
Each technique (Classification Tree in this case) has additional helpful outputs. The example here shows the “Prune Log”: how the percentage error was reduced by “pruning” the tree.

Output 4: the Lift Chart
“Lift” tells us how much better the model did compared to random targeting of customers. This is one of the most important outputs. If customers were targeted randomly, we would expect the baseline outcome: for instance, 1000 mailers would probably yield fewer than 100 customers. With our Tree model, we get a far better result: fewer than 500 mailers sent to high-probability customers would yield nearly 170 successes!
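
The comparison can be made concrete with the rounded figures quoted in this demo. Since “about 9%”, “less than 500” and “nearly 170” are approximations read off the slides, the result below is a rough estimate rather than the exact value on the chart.

```python
baseline_rate = 0.09      # "about 9%" response in the previous campaign
random_mailers = 1000
expected_random = baseline_rate * random_mailers   # about 90, i.e. under 100

model_mailers = 500       # slide: "less than 500 mailers"
model_successes = 170     # slide: "nearly 170 successes"
model_rate = model_successes / model_mailers

# Lift = response rate among targeted customers / baseline response rate.
lift = model_rate / baseline_rate

print(f"random targeting: about {expected_random:.0f} responses per {random_mailers} mailers")
print(f"model targeting: {model_rate:.0%} response rate, lift of about {lift:.1f}")
```

A lift well above 1 means the same number of responses can be reached with far fewer mailers.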

Output 5: the Detailed Report
The Validation data is “scored” in detail. Scoring means using the fitted model to classify each record of the data. Predicted values can be seen against the actuals. A probability of success is computed for each record; this is what helps XLMiner suggest which records (customers) to target.
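
A small sketch of what scoring and targeting amount to. The records, the probabilities and the 0.5 cut-off are made up for illustration and are not taken from the report shown in the demo.

```python
# Made-up scored records: (customer_id, actual, probability_of_success).
scored = [
    (101, 1, 0.91),
    (102, 0, 0.12),
    (103, 0, 0.64),
    (104, 1, 0.78),
    (105, 0, 0.05),
]
CUTOFF = 0.5  # assumed cut-off probability for predicting class 1

def predict(probability, cutoff=CUTOFF):
    """Classify a record as 1 (likely to buy) or 0 from its probability."""
    return 1 if probability >= cutoff else 0

# Rank customers from most to least likely to respond.
ranked = sorted(scored, key=lambda record: record[2], reverse=True)
for customer_id, actual, probability in ranked:
    print(customer_id, actual, predict(probability), probability)

# Target only the customers predicted as class 1.
targets = [customer_id for customer_id, _, p in ranked if predict(p) == 1]
print(targets)
```

Customer 103 is predicted as a buyer but did not buy, which is the kind of mismatch the predicted-versus-actual columns make visible.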

Try several techniques!
That was just one of the many techniques in XLMiner: the Classification Tree. A typical Data Mining exercise involves several alternative approaches on the same data, using different techniques, different parameters, or both. Comparing multiple approaches lets us assess which model to finally choose for implementation.

Rich repertoire of techniques!
XLMiner supports a comprehensive array of supervised learning procedures:
- Multiple Linear Regression
- Logistic Regression
- Classification & Regression Trees
- Neural Networks
- k Nearest Neighbors
- Naïve Bayes Classifier
- Discriminant Analysis

Rich repertoire of techniques!
... and several other features in Unsupervised Learning, Data Reduction and Exploration:
- Principal Components Analysis
- k-means Clustering
- Hierarchical Clustering
- Self-organizing Maps (coming soon)
- Affinity – Market Basket Analysis
Here are some sample outputs from these methods …

Sample output - Dendrogram
Hierarchical Clustering produces a dendrogram, an excellent visual representation of cluster formation. The height of each bar measures the dissimilarity between the clusters being merged into one. Smaller clusters “agglomerate” into bigger ones, with the least possible loss of cohesiveness at each stage.
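
The merging process behind a dendrogram can be sketched in plain Python. This sketch assumes single linkage (cluster distance = smallest distance between any two members) on made-up one-dimensional values; the demo does not say which linkage or data produced its dendrogram.

```python
from itertools import combinations

def single_linkage(a, b, points):
    """Distance between clusters a and b: closest pair of members."""
    return min(abs(points[i] - points[j]) for i in a for j in b)

def agglomerate(points):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (cluster_a, cluster_b, height) tuples,
    where height is the dissimilarity at which the merge happened.
    """
    clusters = [(i,) for i in range(len(points))]   # start: one cluster per record
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: single_linkage(pair[0], pair[1], points))
        height = single_linkage(a, b, points)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, height))
    return merges

incomes = [20, 22, 30, 80, 85]   # made-up values
for a, b, height in agglomerate(incomes):
    print(a, b, height)
```

The recorded heights are what a dendrogram draws as bar heights: the last merge, joining the low-income and high-income groups, happens at a much larger dissimilarity than the earlier ones.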

Sample output – cluster predictions
Cluster Analysis has many powerful uses, such as Market Segmentation. We can view each individual record’s predicted cluster membership.

Sample output – BoxPlots
XLMiner supports powerful visualization. The example here shows BoxPlots of two variables. Cluster 2 clearly shows higher Income and Credit Card spend than Cluster 1. This is an excellent aid to characterizing the clusters.

Sample output – Scatter Plots
Matrix Scatterplots in XLMiner give visual insight into the relationships among variables.

Sample output – Association Rules
For Market Basket Analysis, XLMiner produces easy-to-read Association Rules. Rules are explained in simple English! Each rule tells us which offerings will go well together.
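
To show what lies behind such a rule, here is a small sketch that computes the two standard measures of an association rule, support and confidence, and prints the rule as a sentence. The baskets are made up and the wording only imitates the idea of a rule in plain English; it is not XLMiner’s actual output format.

```python
# Made-up baskets of bank offerings, one set per customer.
baskets = [
    {"credit card", "personal loan"},
    {"credit card", "personal loan", "insurance"},
    {"credit card"},
    {"insurance"},
    {"credit card", "insurance"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Among baskets with the antecedent, the fraction that also have the consequent."""
    return support(antecedent | consequent) / support(antecedent)

rule_if, rule_then = {"personal loan"}, {"credit card"}
print(f"If a customer takes {sorted(rule_if)}, then in "
      f"{confidence(rule_if, rule_then):.0%} of cases they also take "
      f"{sorted(rule_then)} (support {support(rule_if | rule_then):.0%}).")
```

Note that the rule is directional: in this made-up data only half of the credit card holders also took a personal loan.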

… and that’s not all!
XLMiner has handy utilities for data handling:
- Missing data treatment
- Transforming categorical data
- Binning continuous data
- Sampling from Databases
- Scoring to Databases

XLMiner => Versatility!
This was a quick demonstration of just a few things XLMiner can do. It can do lots more. It is comprehensive in coverage, like the best DM products around. Get your free download for evaluation at

XLMiner => Simplicity!
Daryl Pregibon said that Data Mining is “Statistics at Scale and Speed”. You’ll find that XLMiner is Statistics at Scale, Speed and Simplicity! If you know how to use Excel, you already know XLMiner. You can get started in minutes.

XLMiner => Great Value!
Several comprehensive DM products are many times more expensive. For exploring how Data Mining will work for you, XLMiner provides a great start!

What others say …
The American Statistician reviewed XLMiner along with other reputed products in its November 2003 issue. This is what it had to say:
“An easy to use… an excellent, inexpensive add-on that greatly expands the capabilities of Excel.”
“XLMiner’s documentation is remarkably good…”

More Resources
For your initiation into Data Mining:
- Free evaluation download
- Online Courses at
- Case Book in the making
- Technical references on product website

Thank you for viewing this Demo!
XLMiner - the Data Mining Toolset for the Managers of Tomorrow