Chapter 13 – Association Rules

Slides:



Advertisements
Similar presentations
Association Rules Evgueni Smirnov.
Advertisements

Mining Association Rules from Microarray Gene Expression Data.
Data Mining Techniques Association Rule
DATA MINING Association Rule Discovery. AR Definition aka Affinity Grouping Common example: Discovery of which items are frequently sold together at a.
Data Mining (Apriori Algorithm)DCS 802, Spring DCS 802 Data Mining Apriori Algorithm Spring of 2002 Prof. Sung-Hyuk Cha School of Computer Science.
IT 433 Data Warehousing and Data Mining Association Rules Assist.Prof.Songül Albayrak Yıldız Technical University Computer Engineering Department
Chapter 8 – Logistic Regression
MIS2502: Data Analytics Association Rule Mining. Uses What products are bought together? Amazon’s recommendation engine Telephone calling patterns Association.
Association Rules l Mining Association Rules between Sets of Items in Large Databases (R. Agrawal, T. Imielinski & A. Swami) l Fast Algorithms for.
Association Analysis. Association Rule Mining: Definition Given a set of records each of which contain some number of items from a given collection; –Produce.
Mining Association Rules. Association rules Association rules… –… can predict any attribute and combinations of attributes … are not intended to be used.
1 ACCTG 6910 Building Enterprise & Business Intelligence Systems (e.bis) Association Rule Mining Olivia R. Liu Sheng, Ph.D. Emma Eccles Jones Presidential.
Association Rules Olson Yanhong Li. Fuzzy Association Rules Association rules mining provides information to assess significant correlations in large.
Fast Algorithms for Association Rule Mining
Research Project Mining Negative Rules in Large Databases using GRD.
IBM SPSS Modeler 14.2 Data Mining Concepts Introduction to Undirected Data Mining: Association Analysis Prepared by David Douglas, University of ArkansasHosted.
Introduction to Data Mining Data mining is a rapidly growing field of business analytics focused on better understanding of characteristics and.
MIS 451 Building Business Intelligence Systems Association Rule Mining (1)
Association Discovery from Databases Association rules are a simple formalism for expressing positive connections between columns in a 0/1 matrix. A classical.
Bayesian Decision Theory Making Decisions Under uncertainty 1.
Basic Data Mining Techniques
Market Basket Analysis 포항공대 산업공학과 PASTA Lab. 석사과정 신원영.
Association Rules. 2 Customer buying habits by finding associations and correlations between the different items that customers place in their “shopping.
Association Rules. CS583, Bing Liu, UIC 2 Association rule mining Proposed by Agrawal et al in Initially used for Market Basket Analysis to find.
ASSOCIATION RULE DISCOVERY (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
Using Data Mining Technologies to find Currency Trading Rules A. G. Malliaris M. E. Malliaris Loyola University Chicago Multinational Finance Society,
Introduction of Data Mining and Association Rules cs157 Spring 2009 Instructor: Dr. Sin-Min Lee Student: Dongyi Jia.
Association Rule Mining on Remotely Sensed Imagery Using Peano-trees (P-trees) Qin Ding, Qiang Ding, and William Perrizo Computer Science Department North.
Chapter 14 – Cluster Analysis © Galit Shmueli and Peter Bruce 2010 Data Mining for Business Intelligence Shmueli, Patel & Bruce.
Association Rule Mining Data Mining and Knowledge Discovery Prof. Carolina Ruiz and Weiyang Lin Department of Computer Science Worcester Polytechnic Institute.
CS 478 – Tools for Machine Learning and Data Mining Association Rule Mining.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
1 What is Association Analysis: l Association analysis uses a set of transactions to discover rules that indicate the likely occurrence of an item based.
Association Rule Mining
ASSOCIATION RULES (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
Recommender Systems Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata Credits to Bing Liu (UIC) and Angshul Majumdar.
Associations and Frequent Item Analysis. 2 Outline  Transactions  Frequent itemsets  Subset Property  Association rules  Applications.
Charles Tappert Seidenberg School of CSIS, Pace University
Recommendation Algorithms for E-Commerce. Introduction Millions of products are sold over the web. Choosing among so many options is proving challenging.
Data Mining Practical Machine Learning Tools and Techniques Chapter 4: Algorithms: The Basic Methods Section 4.5: Mining Association Rules Rodney Nielsen.
Elsayed Hemayed Data Mining Course
Visualization of Apriori and Association Rules n Presented By: –Manoj Wartikar –Sameer Sagade.
Data Mining  Association Rule  Classification  Clustering.
Chapter 8 Association Rules. Data Warehouse and Data Mining Chapter 10 2 Content Association rule mining Mining single-dimensional Boolean association.
Chap 6: Association Rules. Rule Rules!  Motivation ~ recent progress in data mining + warehousing have made it possible to collect HUGE amount of data.
Data Mining Association Rules Mining Frequent Itemset Mining Support and Confidence Apriori Approach.
MIS2502: Data Analytics Association Rule Mining David Schuff
Chapter 11 – Neural Nets © Galit Shmueli and Peter Bruce 2010 Data Mining for Business Intelligence Shmueli, Patel & Bruce.
Chapter 14 – Association Rules and Collaborative Filtering © Galit Shmueli and Peter Bruce 2016 Data Mining for Business Analytics (3rd ed.) Shmueli, Bruce.
Chapter 13 – Association Rules DM for Business Intelligence.
MIS2502: Data Analytics Association Rule Mining Jeremy Shafer
Chapter 10 Introduction to Data Mining
Association Rules Repoussis Panagiotis.
Tutorial 3: Using XLMiner for Association Rule Mining
Association Rules.
Association Rule Mining (Data Mining Tool)
Market Basket Analysis and Association Rules
Market Basket Many-to-many relationship between different objects
Association Rule Mining
DIRECT HASHING AND PRUNING (DHP) ALGORITHM
Data Mining Association Rules Assoc.Prof.Songül Varlı Albayrak
Transactional data Algorithm Applications
MIS2502: Data Analytics Association Rule Mining
Market Basket Analysis and Association Rules
MIS2502: Data Analytics Association Rule Mining
©Jiawei Han and Micheline Kamber
Association Rules :A book store case
MIS2502: Data Analytics Association Rule Learning
Chapter 14 – Association Rules
Presentation transcript:

Chapter 13 – Association Rules Data Mining for Business Intelligence Shmueli, Patel & Bruce

What are Association Rules? Study of “what goes with what” “Customers who bought X also bought Y” What symptoms go with what diagnosis Transaction-based or event-based Also called “market basket analysis” and “affinity analysis” Originated with study of customer transactions databases to determine associations among items purchased

Used in many recommender systems

Generating Rules

Terms “IF” part = antecedent “THEN” part = consequent “Item set” = the items (e.g., products) comprising the antecedent or consequent Antecedent and consequent are disjoint (i.e., have no items in common)

Tiny Example: Phone Faceplates

Many Rules are Possible For example: Transaction 1 supports several rules, such as “If red, then white” (“If a red faceplate is purchased, then so is a white one”) “If white, then red” “If red and white, then green” + several more

Frequent Item Sets Ideally, we want to create all possible combinations of items Problem: computation time grows exponentially as # items increases Solution: consider only “frequent item sets” Criterion for frequent: support

Support Support = # (or percent) of transactions that include both the antecedent and the consequent Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%

Apriori Algorithm

Generating Frequent Item Sets For k products… User sets a minimum support criterion Next, generate list of one-item sets that meet the support criterion Use the list of one-item sets to generate list of two- item sets that meet the support criterion Use list of two-item sets to generate list of three- item sets Continue up through k-item sets

Measures of Performance Confidence: the % of antecedent transactions that also have the consequent item set Lift = confidence/(benchmark confidence) Benchmark confidence = transactions with consequent as % of all transactions Lift > 1 indicates a rule that is useful in finding consequent items sets (i.e., more useful than just selecting transactions randomly)

Alternate Data Format: Binary Matrix

Process of Rule Selection Generate all rules that meet specified support & confidence Find frequent item sets (those with sufficient support – see above) From these item sets, generate rules with sufficient confidence

Example: Rules from {red, white, green} {red, white} > {green} with confidence = 2/4 = 50% [(support {red, white, green})/(support {red, white})] {red, green} > {white} with confidence = 2/2 = 100% [(support {red, white, green})/(support {red, green})] Plus 4 more with confidence of 100%, 33%, 29% & 100% If confidence criterion is 70%, report only rules 2, 3 and 6

All Rules (XLMiner Output)

Interpretation Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important) Confidence shows the rate at which consequents will be found (useful in learning costs of promotion) Support measures overall impact

Caution: The Role of Chance Random data can generate apparently interesting association rules The more rules you produce, the greater this danger Rules based on large numbers of records are less subject to this danger

Example: Charles Book Club Row 1, e.g., is a transaction in which books were bought in the following categories: Youth, Do it Yourself, Geography

XLMiner Output Rules arrayed in order of lift Information can be compressed e.g., rules 2 and 7 have same trio of books

Summary Association rules (or affinity analysis, or market basket analysis) produce rules on associations between items from a database of transactions Widely used in recommender systems Most popular method is Apriori algorithm To reduce computation, we consider only “frequent” item sets (=support) Performance is measured by confidence and lift Can produce a profusion of rules; review is required to identify useful rules and to reduce redundancy