Monday, February 22, 2016.  The term analytics is often used interchangeably with:  Data science  Data mining  Knowledge discovery  Extracting useful.

Slides:



Advertisements
Similar presentations
Hierarchical Clustering
Advertisements

1 CSE 980: Data Mining Lecture 16: Hierarchical Clustering.
ICS 421 Spring 2010 Data Mining 2 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/8/20101Lipyeow Lim.
DATA MINING CS157A Swathi Rangan. A Brief History of Data Mining The term “Data Mining” was only introduced in the 1990s. Data Mining roots are traced.
Data Mining By Archana Ketkar.
Data Mining Adrian Tuhtan CS157A Section1.
Data Mining – Intro.
CS157A Spring 05 Data Mining Professor Sin-Min Lee.
Data Mining: A Closer Look
Data Mining: A Closer Look Chapter Data Mining Strategies 2.
Introduction to Data Mining Data mining is a rapidly growing field of business analytics focused on better understanding of characteristics and.
Beyond Opportunity; Enterprise Miner Ronalda Koster, Data Analyst.
TURKISH STATISTICAL INSTITUTE INFORMATION TECHNOLOGIES DEPARTMENT (Muscat, Oman) DATA MINING.
Enterprise systems infrastructure and architecture DT211 4
Data Mining By Andrie Suherman. Agenda Introduction Major Elements Steps/ Processes Tools used for data mining Advantages and Disadvantages.
Data Mining Techniques
10 Data Mining. What is Data Mining? “Data Mining is the process of selecting, exploring and modeling large amounts of data to uncover previously unknown.
Overview DM for Business Intelligence.
Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures Recurrent Patterns.
Data Mining Dr. Chang Liu. What is Data Mining Data mining has been known by many different terms Data mining has been known by many different terms Knowledge.
Chapter 1: Introduction to Predictive Modeling 1.1 Applications 1.2 Generalization 1.3 JMP Predictive Modeling Platforms.
Data Mining Chun-Hung Chou
Copyright R. Weber Machine Learning, Data Mining ISYS370 Dr. R. Weber.
Forecast Anything! The Seven Data Mining Models Andy Cheung ISV Developer Evangelist Microsoft Hong Kong.
Chapter 7 DATA, TEXT, AND WEB MINING Pages , 311, Sections 7.3, 7.5, 7.6.
3 Objects (Views Synonyms Sequences) 4 PL/SQL blocks 5 Procedures Triggers 6 Enhanced SQL programming 7 SQL &.NET applications 8 OEM DB structure 9 DB.
Data Mining and Application Part 1: Data Mining Fundamentals Part 2: Tools for Knowledge Discovery Part 3: Advanced Data Mining Techniques Part 4: Intelligent.
Inductive learning Simplest form: learn a function from examples
Overview of Data Mining Methods Data mining techniques What techniques do, examples, advantages & disadvantages.
INTRODUCTION TO DATA MINING MIS2502 Data Analytics.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
Introduction, or what is data mining? Introduction, or what is data mining? Data warehouse and query tools Data warehouse and query tools Decision trees.
Data Mining Knowledge on rough set theory SUSHIL KUMAR SAHU.
Amer Kanj Data Mining For Business Professionals.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
CS157B Fall 04 Introduction to Data Mining Chapter 22.3 Professor Lee Yu, Jianji (Joseph)
EXAM REVIEW MIS2502 Data Analytics. Exam What Tool to Use? Evaluating Decision Trees Association Rules Clustering.
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
Marketing Research Aaker, Kumar, Day and Leone Tenth Edition Instructor’s Presentation Slides 1.
Prepared by: Mahmoud Rafeek Al-Farra
DATA MINING WITH CLUSTERING AND CLASSIFICATION Spring 2007, SJSU Benjamin Lam.
Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram – A tree like diagram that.
Data Mining and Decision Support
Monday, January 11,  INSTRUCTORS  STUDENTS:  Name?  Class?  Hometown?  Major?  Background: Math? Computers? Statistics?  Why did you take.
Data Mining By Farzana Forhad CS 157B. Agenda Decision Tree and ID3 Rough Set Theory Clustering.
Data Mining Copyright KEYSOFT Solutions.
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ Data Mining: Cluster Analysis This lecture node is modified based on Lecture Notes for Chapter.
Clustering Algorithms Minimize distance But to Centers of Groups.
Introduction to Data Mining Clustering & Classification Reference: Tan et al: Introduction to data mining. Some slides are adopted from Tan et al.
Cluster Analysis What is Cluster Analysis? Types of Data in Cluster Analysis A Categorization of Major Clustering Methods Partitioning Methods.
DATA MINING TECHNIQUES (DECISION TREES ) Presented by: Shweta Ghate MIT College OF Engineering.
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Data Mining – Intro.
Unsupervised Learning: Clustering
Unsupervised Learning: Clustering
Chapter 15 – Cluster Analysis
Data Based Decision Making
MIS2502: Data Analytics Advanced Analytics - Introduction
DATA MINING © Prentice Hall.
Week 11 Knowledge Discovery Systems & Data Mining :
MIS2502: Data Analytics Classification using Decision Trees
I don’t need a title slide for a lecture
כריית נתונים.
MIS2502: Data Analytics Clustering and Segmentation
MIS2502: Data Analytics Clustering and Segmentation
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Cluster Analysis.
Topic 5: Cluster Analysis
Presentation transcript:

Monday, February 22, 2016

 The term analytics is often used interchangeably with:  Data science  Data mining  Knowledge discovery  Extracting useful business patterns or mathematical decision models from a preprocessed data set

 Analytics techniques come from a variety of disciplines:  Statistics (e.g., regression)  Machine learning (e.g., decision trees)  Biology (e.g., neural networks, genetic algorithms)

 Applications exist in numerous areas  Retail  Travel  Health care  Actuarial science  Credit scoring  Movies  Sports  Marketing  Financial services  Pharmaceuticals  Telecommunications  Etc.

1. In predictive analytics, a target variable is typically available  Can be categorical (e.g., churn or not, fraud or not) or continuous (e.g., customer lifetime value, loss given default) 2. In descriptive analytics, no such target variable is available  Clustering is one example

 Missing data values can occur for various reasons:
 The customer decides not to disclose income
 An error occurs when merging files because of typos in a name
 Popular schemes to deal with missing data:
 Replace – impute the average or median, or predict the value with a regression on other fields (e.g., age, income)
 Delete – the simplest and most straightforward option; assumes no meaningful information is lost
 Keep – missingness may itself be meaningful (e.g., the customer did not disclose income because he is currently unemployed)
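The three schemes above can be sketched in plain Python. The income values are hypothetical, and the median fill is a deliberately simple illustration rather than a full imputation routine:

```python
# Hypothetical incomes; None marks a value the customer did not disclose.
incomes = [30000.0, None, 45000.0, 60000.0]

observed = sorted(v for v in incomes if v is not None)
# Median of the observed values (middle element for an odd count).
median = observed[len(observed) // 2]

# Scheme 1: replace missing values with the median of the observed ones.
replaced = [v if v is not None else median for v in incomes]

# Scheme 2: delete observations with missing values.
deleted = [v for v in incomes if v is not None]

# Scheme 3: keep the data, recording missingness as its own indicator,
# since "did not disclose" may itself carry information.
missing_flag = [int(v is None) for v in incomes]
```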

 Two types of outliers can be considered:
 Valid observations (e.g., a salary of $2 million)
 Invalid observations (e.g., an age of 200 years)
 Detection can be done statistically
 Two common techniques for handling them:
 Trimming/truncating – remove the outliers
 Winsorising – pull outliers back to lower and upper limits (e.g., median +/- 3 SD)
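Both techniques can be sketched in a few lines of plain Python. The salary figures are made up; the limits follow the slide's median +/- 3 SD rule, with the standard deviation computed on the full sample (outlier included), which is one of several reasonable choices:

```python
def _bounds(values, n_sd=3.0):
    """Lower/upper limits at median +/- n_sd standard deviations."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    s = sorted(values)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    return median - n_sd * sd, median + n_sd * sd

def trim(values, n_sd=3.0):
    """Trimming/truncating: remove the outliers entirely."""
    lo, hi = _bounds(values, n_sd)
    return [v for v in values if lo <= v <= hi]

def winsorise(values, n_sd=3.0):
    """Winsorising: pull outliers back to the lower/upper limits."""
    lo, hi = _bounds(values, n_sd)
    return [min(max(v, lo), hi) for v in values]

# Made-up salaries (in $1000s) with one extreme value.
salaries = [48, 49, 50, 50, 50, 51, 52] * 10 + [500.0]
kept = trim(salaries)         # the 500 is removed
capped = winsorise(salaries)  # the 500 is pulled down to the upper limit
```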

 Regression – the target variable is continuous
 Stock prices
 Loss given default (LGD)
 Customer lifetime value (CLV)
 Classification – the target variable is categorical
 Binary (e.g., fraud, churn, credit risk)
 Multiclass (e.g., predicting credit ratings)

 Active churn – the customer stops the relationship with the firm
 Contractual setting (e.g., cell phone service) – easy to detect: the customer cancels the contract
 Noncontractual setting (e.g., grocery store) – must be operationalized, e.g., the customer has not purchased any products in the last 3 months
 Passive churn – decreasing product or service usage
 Forced churn – the company stops the relationship
 Expected churn – the customer no longer needs the product or service (e.g., baby products)
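The noncontractual operationalization (no purchase in roughly 3 months) can be sketched directly; the customer names, dates, and 90-day window here are all hypothetical:

```python
from datetime import date

# Hypothetical last-purchase dates per customer.
last_purchase = {
    "alice": date(2016, 2, 10),
    "bob": date(2015, 10, 1),
}

def is_churned(last_date, as_of, window_days=90):
    """Flag churn when no purchase occurred within the window (~3 months)."""
    return (as_of - last_date).days > window_days

as_of = date(2016, 2, 22)
churned = {name: is_churned(d, as_of) for name, d in last_purchase.items()}
# alice bought 12 days ago, so she is not flagged; bob's last purchase
# was about 5 months ago, so he is.
```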

 Decision trees are built with a recursive partitioning algorithm (RPA) that represents patterns in the underlying data set
 Leaf/terminal nodes represent the outcomes
 Building a decision tree involves three decisions:
 Splitting: on which variables, and at what values?
 Stopping: when to stop growing the tree?
 Assignment: what class to assign to each leaf node?
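The splitting decision can be illustrated with a minimal, self-contained sketch: exhaustively try thresholds on a single feature and keep the one minimizing the weighted Gini impurity of the two children. Gini is one common splitting criterion (others exist, e.g., entropy), and the data below are made up:

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(xs, ys):
    """Try every observed value as a threshold on this one feature and
    keep the split with the lowest weighted Gini impurity."""
    best_t, best_score = None, float("inf")
    n = len(ys)
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Made-up one-feature data: low values churn, high values stay.
xs = [20, 25, 30, 60, 65, 70]
ys = ["churn", "churn", "churn", "stay", "stay", "stay"]
threshold, impurity = best_split(xs, ys)  # splits at 30 with impurity 0.0
```

Because the winning threshold is a cut on one variable, the resulting decision boundary is orthogonal to that variable's axis.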

 Decision trees essentially model decision boundaries orthogonal to the axes

 Decision trees can also be used for continuous targets (regression trees)

 In contrast to predictive analytics, no real target variable is available  This is also called unsupervised learning, since there is no target variable to steer the learning process

 Association rule mining typically begins with a database of transactions, each listing the items that occurred together

 Association rules are stochastic in nature, with a statistical measure of the strength of the association
 Rules measure association (correlation) and should not be interpreted in a causal way
 Examples:
 If a customer buys spaghetti, then the customer buys red wine in 70% of the cases
 If a customer visits web page A, then the customer will visit web page B in 90% of the cases
 If a customer has a car loan and car insurance, then the customer has a checking account in 80% of the cases

 Suppose customer web page visits were logged:
 Session 1: A, B, C
 Session 2: B, C
 Session 3: A, C, D
 Session 4: A, B, D
 Session 5: D, C, A
 Consider the sequence rule A -> C
 Support and confidence can be measured in various ways:
 Support: C follows A at any subsequent stage (2/5); C immediately follows A (1/5)
 Confidence (given that A occurs): C follows A at any subsequent stage (2/4); C immediately follows A (1/4)
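The support and confidence numbers on this slide can be reproduced with a short script over the five logged sessions:

```python
sessions = [
    ["A", "B", "C"],  # session 1
    ["B", "C"],       # session 2
    ["A", "C", "D"],  # session 3
    ["A", "B", "D"],  # session 4
    ["D", "C", "A"],  # session 5
]

def follows(session, a, b, immediate=False):
    """True if b occurs after the first a in the session
    (optionally: immediately after it)."""
    if a not in session:
        return False
    i = session.index(a)
    if immediate:
        return i + 1 < len(session) and session[i + 1] == b
    return b in session[i + 1:]

n = len(sessions)
n_with_a = sum("A" in s for s in sessions)                               # 4
support_any = sum(follows(s, "A", "C") for s in sessions) / n            # 2/5
support_imm = sum(follows(s, "A", "C", True) for s in sessions) / n      # 1/5
conf_any = sum(follows(s, "A", "C") for s in sessions) / n_with_a        # 2/4
conf_imm = sum(follows(s, "A", "C", True) for s in sessions) / n_with_a  # 1/4
```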

 Divisive clustering starts with the entire data set in one cluster and splits it into smaller and smaller clusters until each observation forms its own cluster
 Agglomerative clustering works in the reverse direction: it starts with each observation as its own cluster and merges the closest clusters until one big cluster remains

 The heights of the vertical lines in a dendrogram give the distances at which pairs of clusters were merged
 The elbow point of a scree plot indicates the optimal number of clusters
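A toy agglomerative run makes those merge distances concrete. This sketch uses single linkage on made-up 1-D points; the returned heights are what the dendrogram's vertical lines would show, and the large jump between successive heights plays the role of the elbow:

```python
def merge_heights(points):
    """Single-linkage agglomerative clustering on 1-D points.
    Returns the distance at which each successive merge happens,
    i.e. the heights of the vertical lines in the dendrogram."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        bi, bj, bd = 0, 1, float("inf")
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if d < bd:
                    bi, bj, bd = i, j, d
        clusters[bi] = clusters[bi] + clusters[bj]
        del clusters[bj]
        heights.append(bd)
    return heights

# Made-up points: two tight groups and one isolated observation.
heights = merge_heights([1, 2, 10, 11, 30])  # -> [1, 1, 8, 19]
```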

 k-means, a non-hierarchical procedure:
1. Select k observations as the initial cluster centroids (seeds)
2. Assign each observation to the cluster with the closest centroid
3. When all observations have been assigned, recalculate the positions of the k centroids
4. Repeat steps 2 and 3 until the cluster centroids no longer change
 Notes: The number of clusters, k, must be specified before the procedure begins. Different seeds should be tried to verify the stability of the clustering solution.
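The four steps above can be sketched with NumPy. The two well-separated 2-D point groups and the seed choice are hypothetical, and the sketch assumes no cluster ever becomes empty during reassignment:

```python
import numpy as np

def kmeans(X, k, seed_idx, max_iter=100):
    """Minimal k-means following the four steps on the slide.
    Assumes no cluster becomes empty during reassignment."""
    centroids = X[list(seed_idx)].astype(float)        # 1. initial seeds
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # 2. assign each observation to the closest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. recalculate the centroid positions
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. stop once the centroids no longer change
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two made-up, well-separated groups; one seed taken from each group.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = kmeans(X, k=2, seed_idx=(0, 2))
```

Rerunning with different `seed_idx` values is the stability check mentioned in the notes: a stable solution should produce the same final clusters from different seeds.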

 Read Chapter 6 of your textbook  Work on your term project