DATA MINING FOR DECISION MAKING Mary Malliaris Loyola University Chicago DSI Annual Meeting Baltimore November 16, 2013.


What is Data Mining? Searching for meaningful patterns in large data sets OR identifying valid, novel, and potentially useful patterns in large and complex data collections.

What Type of Problems? [We will get to decisions later]
1. What occurs at the same time?
2. What similar groups occur in the data?
3. What determines the value of a target variable?
4. Can we predict?

Data Mining vs. Statistics, Origin
Statistics Originally | Data Mining
Data gathered by hand | Data gathered by computer
Data hard to get | Data easy to get
Not much data available | Lots of data
Starting with no data, how much do I need to get? | Starting with lots of data, what is best to use?
Method: generalize from sample to population | Method: run on very large data sets and see if results continue to be true
Calculated by hand | Always done by machine

Data Mining vs. Statistics, Cont.
Statistics | Data Mining
Hypothesis | No Hypothesis
Distribution Assumed | No Distribution
Random Sample | Use All the Good Data
Conduct a Test | Use a Technique
Reject Null or Not | Results Interesting?
Meaning determined by hypothesis | Meaning determined by results [and application]
Test done ONE time | Model may be run many times

Two Styles of Data Mining (Each uses different techniques)
◦ Directed data mining [also called supervised]
  ◦ Has a Target variable
  ◦ Training data has answers included so model can check against them
◦ Undirected data mining [unsupervised]
  ◦ No Target variable
  ◦ Finds common occurrences in the data and leaves it up to the user to interpret

ASSOCIATION ANALYSIS also called: MARKET BASKET ANALYSIS (What Happens Together?)

Introduction
◦ These techniques were developed to analyze consumer shopping patterns
◦ We want to find groupings of items that typically occur together
◦ Output generates rules and is easy to understand
◦ Decisions: which rules are useful, and how do we use them?

Terms
◦ Rule [an if-then statement]
◦ Antecedent [the “if” part]
◦ Consequent [the “then” part]
◦ Support is the percent of time the IF part is true
◦ Confidence is the percent of time the THEN part is true when we already know the IF part is true
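The two measures can be illustrated with a minimal sketch. The baskets and the function names below are hypothetical and are not taken from the presentation. The sketch follows the slide's definition of support (how often the IF part is true); some texts instead measure support over baskets that contain both the IF and the THEN parts.

```python
# Toy baskets (hypothetical data, not from the presentation).
baskets = [
    {"rice", "soy sauce", "tea"},
    {"rice", "soy sauce"},
    {"rice", "bread"},
    {"bread", "milk"},
    {"rice", "soy sauce", "milk"},
]


def support(antecedent, baskets):
    """Share of baskets in which the IF part is true (the slide's definition)."""
    hits = sum(1 for b in baskets if antecedent <= b)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with the IF part in which the THEN part is also true."""
    with_if = [b for b in baskets if antecedent <= b]
    if not with_if:
        return 0.0
    return sum(1 for b in with_if if consequent <= b) / len(with_if)


rule_if, rule_then = {"rice"}, {"soy sauce"}
if_support = support(rule_if, baskets)
rule_confidence = confidence(rule_if, rule_then, baskets)
print(f"IF rice THEN soy sauce: support={if_support:.0%}, confidence={rule_confidence:.0%}")
```

Here rice appears in four of the five baskets, and soy sauce appears in three of those four.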

Table of Rules

Data Issues
◦ The matrix of data can be very large, with millions of rows and tens of thousands of columns, and is generally very sparse, since a typical basket contains only a few of the possible items in a store.
◦ The search problem is formidable given the exponential number of possible association rules.
◦ Therefore, a retailer usually groups products into larger categories.

Suppose a rule tells us that soy sauce is often purchased when rice is. What decision might we make?

Soy sauce is often purchased when rice is; what decision might we make?
1. Put them closer together in the store.
2. Put them far apart in the store.
3. Package soy sauce with rice.
4. Package soy sauce + rice + poorly selling item.
5. Raise the price on one, and lower it on the other.
6. Offer soy sauce for proofs of purchase of rice.
7. Do not advertise soy sauce and rice together.
8. Introduce a new brand of soy sauce with the most popular selling rice.

Cluster Analysis

Clustering
◦ In clustering, the groups you generate (called clusters) are not predefined
◦ Instead, grouping is accomplished by finding similarities between data according to characteristics found in the actual data
◦ Thus, clustering models focus on identifying groups of similar records.
◦ Then the data miner finds words to describe the clusters

Clustering Problems
◦ Interpreting the semantic meaning of each cluster may be difficult
◦ There is no one correct answer to a clustering problem
◦ There is no external standard by which to judge the model’s performance. A clustering model's value is determined by its ability to capture interesting groupings in the data.
◦ Domain knowledge will play a role in deciding among alternative solutions

Prizm Clusters PRIZM NE Social Groups You Are Where You Live Scroll down to Zip Code lookup and explore the clusters of your zip code

Hierarchical Clustering
◦ Agglomerative
◦ Divisive

Partitive Clustering
[Figure: reference vectors (seeds, marked X) and observations, shown in the Initial State and in the Final State]
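One common partitive method is k-means, sketched below as an assumption; the slide does not name a specific algorithm. The points, the seeds, and the function names are hypothetical. Reference vectors start at the given seeds (initial state), observations are assigned to the nearest reference vector, and each reference vector moves to the mean of its observations until nothing changes (final state).

```python
def assign(points, centers):
    """Assign each observation to its nearest reference vector."""
    clusters = [[] for _ in centers]
    for p in points:
        distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
        clusters[distances.index(min(distances))].append(p)
    return clusters


def kmeans(points, seeds, max_iterations=20):
    """Partition 2-D points around reference vectors that start at the seeds."""
    centers = list(seeds)  # initial state
    for _ in range(max_iterations):
        clusters = assign(points, centers)
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                # Move the reference vector to the mean of its observations.
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(center)  # an empty cluster keeps its seed
        if new_centers == centers:
            break  # final state: reference vectors no longer move
        centers = new_centers
    return centers, assign(points, centers)


# Hypothetical observations and seeds.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
seeds = [(1.0, 1.0), (8.0, 8.0)]
centers, clusters = kmeans(points, seeds)
print(centers)
print(clusters)
```

Different seeds can lead to different final clusters, which is one reason there is no single correct answer to a clustering problem.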

Decisions Based on Clusters
◦ Marketing: Use clusters to develop targeted marketing programs
◦ Land use: Use clusters to identify areas of similar land use in an earth observation database
◦ Insurance: Use clusters to identify groups of policy holders with similar claim behavior
◦ City planning: Use clusters to find groups of houses with similar type and value
◦ Finance: Use clusters to identify groups with the same financial structure

The Cluster Viewer

Cluster Comparison View

Cell Distribution View

CLUSTER MEMBERSHIP AND DISTANCE FROM CLUSTER CENTER

Decision Trees

Decision Tree Models
◦ A Decision Tree has one variable that is the Target variable
◦ Decision trees divide up a large collection of records into successively smaller sets of records by applying a sequence of simple decision rules
◦ A good decision tree model consists of a set of rules that results in homogeneous groups
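How "homogeneous" a group is can be scored numerically. The sketch below uses Gini impurity, one common choice; the slides do not say which measure is used, so treat it as an assumption. The (No, Yes) counts and function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node: 0 means perfectly homogeneous."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(children):
    """Average impurity of child nodes, weighted by their record counts."""
    total = sum(sum(child) for child in children)
    return sum(sum(child) / total * gini(child) for child in children)


# Hypothetical (No, Yes) counts for a parent node and two candidate rules.
parent = (10, 3)
split_a = [(6, 2), (4, 1)]  # children look much like the parent
split_b = [(9, 0), (1, 3)]  # children are far more homogeneous

print(f"parent:  {gini(parent):.3f}")
print(f"split A: {weighted_gini(split_a):.3f}")
print(f"split B: {weighted_gini(split_b):.3f}")
```

A tree-building method would prefer the rule behind split B, because it lowers impurity the most.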

Profile who bought a new car
[Figure: decision tree starting at "Begin", with splits on Income (> 50K vs. <= 50K), Age (> 35 vs. <= 35), HH Size (> 4 vs. <= 4), Gender (M vs. F), and Status (Married vs. Single). Node counts shown: 10 No / 3 Yes; 6 No / 2 Yes; 4 No / 1 Yes; 1 No / 1 Yes; 5 No / 1 Yes; 5 No / 1 Yes; 2 No / 1 Yes; 2 No; 1 Yes; 1 No.]

Advantages
◦ Can handle a large number of predictor variables
◦ Easy to understand
◦ Maps nicely to a set of business rules
◦ Identifies key relationships and thus gives insight into the data set
◦ Can process both numeric and category data

Method Comparison
METHOD | TARGET | SPLITS
C5.0 | Category | Multiple
C&RT | Numeric or Category | Binary
QUEST | Category | Binary
CHAID | Numeric or Category | Multiple

Decision Tree Decisions
◦ What type of car do I use in an ad in a women’s magazine?
  ◦ Run a decision tree with gender as the target and car description variables as inputs
◦ What type of customer is most likely to buy my product?
  ◦ Run a decision tree with purchase-Yes-No as the target and customer description variables as inputs
◦ What are the characteristics of companies that fail?
  ◦ Run a decision tree with Fail-Succeed as the target and company characteristics as inputs
◦ What dessert will be ordered at the end of a restaurant meal?
  ◦ Run a decision tree with dessert choice as the target and appetizer & entree variables as inputs

Neural Networks

Brainmaker Visit this site for many examples of problems neural networks have been useful for.

Neural Networks
◦ A neural network is a simplified model of the way the human brain processes information
◦ It simulates a large number of interconnected simple processing units
◦ The most popular kind of neural network is called a feed-forward backpropagation network

The Architecture: Nodes Input Layer The input layer receives information from the external environment. This layer does not perform any calculation; it just sends information to the next level.

The Architecture: Nodes Input Layer Output Layer The output layer produces the final result. This node corresponds to the variable you are trying to predict.

The Architecture: Nodes Input Layer Hidden Layer Output Layer The hidden layer takes data from the input variables and adapts it more closely to the data.

The Architecture: Nodes & Connections Each node in one layer is connected to each node in the next layer

The Architecture: Nodes, Connections, & Weights Each connection has a weight attached. The weights are assigned randomly in the beginning.
[Figure: network diagram with connection weights labeled w1, w2, w3, ..., w16, w17, w18, w19, w20, w21]

The Architecture: Nodes, Connections, & Weights Each node in the hidden & output layers applies a function to the sum of the weighted inputs.
◦ Hidden node: F(sum inputs*weights) = node output
◦ Output node: F(sum inputs*weights) = output
[Figure: network diagram with connection weights labeled w1, w2, w3, ..., w16, w17, w18, w19, w20, w21]
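A single forward pass through such a network can be sketched as follows. Several details are assumptions rather than facts from the slides: the layer sizes (6 inputs, 3 hidden nodes, 1 output, chosen because 6*3 + 3*1 gives 21 weights), the use of a sigmoid for F, the absence of bias terms, and all names and input values.

```python
import math
import random


def sigmoid(x):
    """One common choice for F (an assumption; the slides do not name F)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights):
    """Each node applies F to the sum of its weighted inputs."""
    return [
        sigmoid(sum(value * w for value, w in zip(inputs, node_weights)))
        for node_weights in weights
    ]


rng = random.Random(42)
n_inputs, n_hidden, n_outputs = 6, 3, 1  # assumed sizes: 6*3 + 3*1 = 21 weights

# One weight per connection, assigned randomly in the beginning.
hidden_weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
output_weights = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_outputs)]

inputs = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2]  # the input layer only passes values on
hidden = layer_output(inputs, hidden_weights)
output = layer_output(hidden, output_weights)

n_weights = sum(len(w) for w in hidden_weights + output_weights)
print(n_weights, hidden, output)
```

Training then adjusts the weights so that the output moves closer to the known answers; that step is not shown here.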

Assumptions
In order to use a neural network, we make some assumptions:
1. There are inputs that affect the pattern.
2. We know the inputs; we just don’t know exactly how they are related.
3. The examples of input/output we have contain the pattern we want the neural network to recognize.

How good is your model?
◦ Compare training and validation set results
◦ Compare validation set results to some standard benchmark such as
  ◦ Random walk model
  ◦ Regression model
◦ Typical measures for numeric data:
  ◦ MSE
  ◦ MAD
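The benchmark comparison can be sketched as below. MSE is read as mean squared error and MAD as mean absolute deviation of the forecast errors, which is an assumption about the slide's abbreviations. The series, the model forecasts, and the names are hypothetical. The random walk benchmark simply predicts that the next value equals the current one.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mad(actual, predicted):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical validation series and model forecasts.
series = [10.0, 11.0, 12.0, 11.0, 13.0]
actual = series[1:]                       # values to be predicted
model_forecast = [10.5, 11.5, 11.5, 12.0]
random_walk_forecast = series[:-1]        # "tomorrow equals today"

model_mse, model_mad = mse(actual, model_forecast), mad(actual, model_forecast)
walk_mse, walk_mad = mse(actual, random_walk_forecast), mad(actual, random_walk_forecast)

print(f"model:       MSE={model_mse}, MAD={model_mad}")
print(f"random walk: MSE={walk_mse}, MAD={walk_mad}")
```

A model is only worth keeping if it beats the benchmark on the validation set, not just on the training set.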

Techniques So Far
◦ Association Analysis (AA)
◦ Cluster Analysis (CA)
◦ Decision Trees (DT)
◦ Neural Networks (NN)

AA CA DT NN Undirected No Single Target

AA CA DT NN Directed One Target Field

AA CA DT NN Easy to Understand Results Clear Rules; Clear Decision

AA CA DT NN Gives Result but Reasoning Hidden You Figure It Out

Recent BizEd Article
◦ “What corporations really want are graduates with…the ability to use data in a persuasive manner and make an immediate impact.”
◦ One employer told us, “We want students who can take a complex data set, review it, identify patterns, use those patterns to develop new business practices, and communicate those practices in a convincing way to senior management.”