Beyond Opportunity; Enterprise Miner Ronalda Koster, Data Analyst.

Agenda
- Introduction
- SAS EM at Dalhousie University
- Exploring SAS EM
- Discussion

Introduction
- Teaching Assistant with Dalhousie University
- Analyst, Precision BioLogic Inc.
- Consultant

Informatics at Dalhousie
Informatics: the study of the application of computer and statistical techniques to the management of information (HGSC glossary)
Dalhousie University
- First marketing informatics MBA major in North America
- The first to use SAS EM for teaching purposes
- Health Informatics program
- New Bachelor of Informatics
- Success story

Other courses required for Informatics major
- Multivariate statistics
- Direct marketing
- Marketing research
- Marketing strategy
- Database design
- Internet marketing

Our students work for:
- Small consulting companies
- Large financial institutions
- Not-for-profit organizations
- Telecommunications companies
- Insurance companies
- Hospitals
- Loyalty program companies
- Travel companies
- Oil and gas industry
- Publishing houses
What they have in common: they all work with information

SEMMA Process
- Sample: input, partition and sample data
- Explore: view distributions and associations
- Modify: transform data, filter outliers, cluster to derive new variables
- Model: develop models, e.g. decision trees and regression
- Assess: assess models

Business Problem
Have you ever wanted to understand things that occur together or in sequence?
Market Basket Analysis: Association Node
Broad applications:
- Basket data analysis, cross-marketing, catalog design, campaign sales analysis
- Web log (click stream) analysis, DNA sequence analysis, etc.

Associations Node
- Support: probability that a transaction contains both X and Y
  - Frequency with which the combination occurs
- Confidence: conditional probability that a transaction containing X also contains Y
  - Percentage of cases in which Y occurs, given that X has occurred
- Sequential association: Y occurs some time period after X occurs

Associations Node
Rule: Avocado (antecedent) => Steak (consequent)
If a customer purchases avocado, then 80% of the time they will also purchase steak.
- 8,000 transactions in total
- 1,000 contain avocados
- 2,000 contain steak
- 800 contain both avocados and steak
Confidence = 800 / 1,000 = 80%
Support = 800 / 8,000 = 10%
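The support and confidence figures on this slide can be reproduced with a few lines of Python. This is only a minimal sketch, not how the SAS EM Association node is implemented: the function names are hypothetical, and the transaction list is synthetic, built to match the slide's counts (the "bread" item is a made-up filler for baskets containing neither product).

```python
def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in transactions if items <= basket) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain the consequent.

    Assumes at least one transaction contains the antecedent.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [basket for basket in transactions if antecedent <= basket]
    with_both = [basket for basket in with_antecedent if consequent <= basket]
    return len(with_both) / len(with_antecedent)


# Synthetic baskets matching the slide: 8,000 transactions,
# 1,000 with avocado, 2,000 with steak, 800 with both.
transactions = (
    [{"avocado", "steak"}] * 800
    + [{"avocado"}] * 200
    + [{"steak"}] * 1200
    + [{"bread"}] * 5800  # hypothetical filler item
)

rule_support = support(transactions, {"avocado", "steak"})
rule_confidence = confidence(transactions, {"avocado"}, {"steak"})
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

Support is measured against all transactions, while confidence is measured only against those that contain the antecedent, which is why the same 800 baskets yield 10% in one case and 80% in the other.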

Business Problem
Have you ever wanted to classify or segment data on the basis of similar attributes so that each segment or cluster differs from another and all objects within a cluster share traits?
Segmentation: Clustering Node
Broad applications:
- Demographic / psychographic segmentation, campaign segmentation, etc.

Clustering Example
- Identify similar objects or groups that are dissimilar from other clusters through disjoint cluster analysis on the basis of Euclidean distances
- Profile clusters graphically within EM
- Use derived segments for further analysis / algorithms (as an input variable or a target)
- Customize clusters based on standardization method, clustering method and clustering criterion
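To make "disjoint cluster analysis on the basis of Euclidean distances" concrete, here is a minimal k-means sketch in plain Python. It is an illustration under stated assumptions, not the Clustering node's actual algorithm: the seeding rule (first k points), the fixed iteration count, and the two-dimensional customer data are all hypothetical, and no standardization is applied.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, k, iterations=20):
    """Assign each point to exactly one of k clusters (disjoint clustering)."""
    # Assumption: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical customers described by two already-comparable measures.
customers = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]
labels, centroids = k_means(customers, k=2)
print(labels)
print(centroids)
```

The resulting labels can then serve as a derived segment variable for later analysis. Because Euclidean distance is sensitive to scale, inputs measured in different units would normally be standardized first, which is the role of the standardization method mentioned on the slide.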

Business Problem
Have you ever wanted to predict the likelihood of an event (and assign a cost to it)?
Decision Tree Node
Broad applications:
- Classify observations, predict outcomes based on decision alternatives

Decision Tree Example
- A flow-chart-like tree structure
  - Internal node denotes a test on an attribute
  - Branch represents an outcome of the test
  - Leaf nodes represent class labels or class distribution
- Handles missing data well
- Represents the knowledge in the form of IF-THEN rules
- Decision tree generation consists of two phases
  - Tree construction
    - At start, all the training examples are at the root
    - Partition examples recursively based on selected attributes
  - Tree pruning
    - Identify and remove branches that reflect noise or outliers
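The construction phase and the IF-THEN reading of a tree can be sketched as follows. The slide does not say how attributes are selected, so this sketch assumes the lowest weighted entropy (highest information gain) as the criterion; the data set, attribute names, and function names are hypothetical, and pruning and missing-value handling are left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def build_tree(rows, attributes, target):
    """Recursively partition rows; returns a class label (leaf) or (attribute, branches)."""
    labels = [row[target] for row in rows]
    # Leaf: the node is pure or no attributes remain, so use the majority class.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]

    def weighted_entropy(attribute):
        groups = {}
        for row in rows:
            groups.setdefault(row[attribute], []).append(row[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    # Assumed selection criterion: the split that leaves the least entropy.
    best = min(attributes, key=weighted_entropy)
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target)
    return (best, branches)


def rules(tree, target, conditions=()):
    """Flatten a tree into one IF-THEN rule per leaf."""
    if not isinstance(tree, tuple):
        prefix = ("IF " + " AND ".join(conditions) + " ") if conditions else ""
        return [f"{prefix}THEN {target} = {tree}"]
    attribute, branches = tree
    result = []
    for value, subtree in branches.items():
        result.extend(rules(subtree, target, conditions + (f"{attribute} = {value}",)))
    return result


# Hypothetical training examples; all of them start at the root.
training = [
    {"member": "yes", "income": "high", "buys": "yes"},
    {"member": "yes", "income": "low", "buys": "yes"},
    {"member": "no", "income": "high", "buys": "yes"},
    {"member": "no", "income": "low", "buys": "no"},
    {"member": "no", "income": "low", "buys": "no"},
    {"member": "yes", "income": "high", "buys": "yes"},
    {"member": "yes", "income": "low", "buys": "yes"},
]

tree = build_tree(training, ["member", "income"], "buys")
for rule in rules(tree, "buys"):
    print(rule)
```

Each internal node is a test on one attribute, each dictionary key is a branch, and each leaf is a class label, so every root-to-leaf path reads directly as one rule.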

Business Problem
Have you ever wanted to make sure a campaign targets the people most likely to purchase, even though you have never contacted them before?
Scoring Node
Broad applications:
- Testing model scalability, applying learning to subsequent events, etc.

EM Diagram

Lessons learned
- Data cleansing and transformation take most of the time
- Data analysis done using EM gives interpretable results
- Data modeling techniques are very robust
- SAS EM works well with huge datasets
- Knowledge obtained is transferred easily
- Learning never stops: EM reference, tutorial examples
- You can analyze almost any kind of data
- You can use SAS EM regardless of the industry and the size of the dataset
- You need: a good computer, SAS support, and patience
- While not all students use SAS in their careers, the analytical principles they learn are extremely useful to them

Discussion