DATA MINING Association Rule Discovery

AR Definition
aka Affinity Grouping
Common example: discovery of which items are frequently sold together at a supermarket. If this is known, decisions can be made about:
– Arranging items on shelves
– Which items should be promoted together
– Which items should not simultaneously be discounted

AR Definition -2-
[Figure: an example association rule with its measures; the example's confidence factor is 70% and its support factor is 13.5%, as discussed on the next slide.]

AR Definition -3-
Confidence Factor: the degree to which the rule holds across individual records
– Confidence Factor = the number of transactions supporting the rule divided by the number of transactions supporting the rule body only
– The Confidence Factor in the above example is 70%
Support Factor: the relative occurrence of the detected rule within the overall set of transactions
– Support Factor = the number of transactions supporting the rule divided by the total number of transactions
– The Support Factor in the above example is thus 13.5%
The minimum thresholds for both factors can be set by users or domain experts
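A minimal Python sketch of both computations, assuming a tiny made-up list of baskets (the item names and transactions are illustrative, not from the slides):

```python
# Illustrative transactions; each basket is a set of items.
transactions = [
    {"bread", "milk", "beer"},
    {"bread", "milk"},
    {"milk", "beer"},
    {"bread", "beer"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(body, head, transactions):
    """support(body ∪ head) / support(body) -- assumes body occurs at least once."""
    return support(body | head, transactions) / support(body, transactions)

print(support({"milk", "beer"}, transactions))       # support factor of the rule
print(confidence({"milk"}, {"beer"}, transactions))  # confidence factor of milk => beer
```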

AR Usefulness
Some rules are useful:
– unknown, unexpected and indicative of some action to take.
Some rules are trivial:
– known by anyone familiar with the business.
Some rules are inexplicable:
– they seem to have no explanation and do not suggest a course of action.

AR Example: Co-occurrence Table

Customer  Items
1         orange juice (OJ), cola
2         milk, orange juice, window cleaner
3         orange juice, detergent
4         orange juice, detergent, cola
5         window cleaner, cola

            OJ  Cleaner  Milk  Cola  Detergent
OJ           4     1       1     2      2
Cleaner      1     2       1     1      0
Milk         1     1       1     0      0
Cola         2     1       0     3      1
Detergent    2     0       0     1      2
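The matrix can be rebuilt mechanically from the five baskets. A short Python sketch (item names abbreviated as in the table; "cleaner" stands for window cleaner):

```python
from itertools import combinations

baskets = [
    {"OJ", "cola"},
    {"milk", "OJ", "cleaner"},
    {"OJ", "detergent"},
    {"OJ", "detergent", "cola"},
    {"cleaner", "cola"},
]
items = ["OJ", "cleaner", "milk", "cola", "detergent"]

# counts[a][b] = number of baskets containing both a and b;
# the diagonal counts baskets containing the single item.
counts = {a: {b: 0 for b in items} for a in items}
for basket in baskets:
    for a in basket:
        counts[a][a] += 1
    for a, b in combinations(basket, 2):
        counts[a][b] += 1
        counts[b][a] += 1

for a in items:
    print(a.ljust(10), [counts[a][b] for b in items])
```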

AR Discovery Process
A co-occurrence cube would show associations in 3D – it is hard to visualise more dimensions than that
– Worse, the number of cells in a co-occurrence hypercube grows exponentially with the number of items: it rapidly becomes impossible to store the required number of cells
– Smart algorithms are thus needed for finding frequent (large) itemsets
We would like to:
– Choose the right set of items
– Generate rules by deciphering the counts in the co-occurrence matrix (for two-item rules)
– Overcome the practical limits imposed by many items in large numbers of transactions

Choosing the Right Item Set
Choosing the right level of detail requires the creation of classes and a taxonomy:
– For example, we might look for associations between product categories rather than at the finest-grained level of product detail, e.g. "Corn Chips" and "Salsa" rather than "Doritos Nacho Cheese Corn Chips (250g)" and "Masterfoods Mild Salsa (300g)"
– Important associations can be missed if we look at the wrong level of detail
Virtual items may be added to take advantage of information that goes beyond the taxonomy

AR: Rules
Association rules take the form: if condition then result. Note that
if (nappies and Thursday) then beer
is usually better than
if Thursday then nappies and beer
(in the sense that it is more actionable), because it has just one item in the result. If a 3-way combination is the most common, then perhaps consider rules with just 1 item in the consequent, e.g.
if (A and B) then C
if (A and C) then B

Discovering Large Itemsets
The term "frequent itemset S" means "a set S that appears in at least fraction s of the baskets", where s is some chosen constant, typically 0.01 (i.e. 1%).
DM datasets are usually too large to fit in main memory. When evaluating the running time of AR discovery algorithms we therefore count the number of passes through the data: since the principal cost is often the time it takes to read data from disk, the number of times we need to read each datum is often the best measure of an algorithm's running time.

Discovering Large Itemsets -2-
There is a key principle, called monotonicity or the a-priori property, that helps us find frequent itemsets [AgS1994]: if a set of items S is frequent (i.e., appears in at least fraction s of the baskets), then every subset of S is also frequent.
To find frequent itemsets, we can:
– Proceed level-wise, finding first the frequent items (sets of size 1), then the frequent pairs, the frequent triples, etc. Level-wise algorithms use one pass per level. (A sketch of the pruning this property enables follows below.)
– Find all maximal frequent itemsets (i.e., sets S such that no proper superset of S is frequent) in one (or a few) passes
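A minimal sketch of that pruning step in Python, assuming itemsets are represented as frozensets (the `prune` helper and its inputs are hypothetical names for illustration):

```python
from itertools import combinations

def prune(candidates, frequent_prev):
    """Keep only size-k candidates whose every (k-1)-subset is frequent."""
    return [c for c in candidates
            if all(frozenset(s) in frequent_prev
                   for s in combinations(c, len(c) - 1))]

# e.g. {X, Y, Z} survives only if {X,Y}, {X,Z} and {Y,Z} are all frequent
frequent_pairs = {frozenset(p) for p in [("X", "Y"), ("X", "Z"), ("Y", "Z")]}
print(prune([frozenset({"X", "Y", "Z"})], frequent_pairs))  # candidate kept
```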

The Apriori Algorithm
The a-priori algorithm proceeds level-wise. Given support threshold s, in the first pass we find the items that appear in at least fraction s of the baskets. This set is called L1, the frequent 1-itemsets. (Presumably there is enough main memory to count occurrences of each item, since a typical store sells no more than 100,000 different items.)
Pairs of items both of which are in L1 become the candidate pairs C2 for the second pass. The pairs in C2 whose support reaches s become L2, the frequent 2-itemsets. (We hope that C2 is not so large that there is not enough memory for an integer count per candidate pair.)

The Apriori Algorithm -2-
The candidate triples C3 are those sets {X, Y, Z} such that all of {X, Y}, {X, Z} and {Y, Z} are in L2. On the third pass, count the occurrences of the triples in C3; those whose support is at least s are the frequent triples, L3.
Proceed as far as you like (or until the sets become empty). Li is the set of frequent itemsets of size i; C(i+1) is the set of sets of size i+1 such that each of their subsets of size i is in Li.
Pruning using the Apriori property:
– All nonempty subsets of a frequent itemset must also be frequent.
– This helps because the number of sets that must be counted at each level is much smaller than it otherwise would be.
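Putting the passes together, a compact in-memory sketch of the level-wise algorithm (a real implementation would stream baskets from disk once per level rather than hold them in a list; the function name and representation are illustrative):

```python
from itertools import combinations
from collections import Counter

def apriori(baskets, min_support):
    """Return all itemsets appearing in at least fraction min_support of baskets."""
    n = len(baskets)
    # Pass 1: count single items and form L1.
    counts = Counter(item for b in baskets for item in b)
    frequent = {frozenset([i]) for i, c in counts.items() if c / n >= min_support}
    all_frequent = set(frequent)
    k = 2
    while frequent:
        # Candidate generation: unions of two frequent (k-1)-sets of size k,
        # pruned so that every (k-1)-subset is frequent (the Apriori property).
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Pass k: count the surviving candidates against every basket.
        cand_counts = Counter()
        for b in baskets:
            for c in candidates:
                if c <= b:
                    cand_counts[c] += 1
        frequent = {c for c, cnt in cand_counts.items() if cnt / n >= min_support}
        all_frequent |= frequent
        k += 1
    return all_frequent

# Example call, using the 1% threshold mentioned earlier:
# frequent_sets = apriori(list_of_basket_sets, 0.01)
```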

Generating Association Rules from Frequent Itemsets
Once the frequent itemsets from the transactions in a database D have been found, it is straightforward to generate strong association rules from them
– where strong association rules satisfy both minimum support and minimum confidence
Step 1: For each frequent itemset L, generate all nonempty proper subsets of L
Step 2: For each nonempty proper subset U of L, output the rule
U => (L - U) if support_count(L) / support_count(U) >= min_conf
where min_conf is the minimum confidence threshold. (A code sketch of both steps follows Example 2 below.)

Generating Association Rules from Frequent Itemsets – Example 1 –
Suppose we have the following transactional data from a store:

TID   List of item IDs
T1    I1, I2, I5
T2    I2, I4
T3    I2, I3
T4    I1, I2, I4
T5    I1, I3
T6    I2, I3
T7    I1, I3
T8    I1, I2, I3, I5
T9    I1, I2, I3

Suppose that the data contain the frequent itemset L = {I1, I2, I5}. What are the association rules that can be generated from L?

Generating Association Rules from Frequent Itemsets – Example 2 –
The nonempty proper subsets of L are {I1, I2}, {I1, I5}, {I2, I5}, {I1}, {I2}, {I5}. The resulting association rules, with confidence = support_count(L) / support_count(subset), are thus:

I1 and I2 => I5    confidence = 2/4 = 50%
I1 and I5 => I2    confidence = 2/2 = 100%
I2 and I5 => I1    confidence = 2/2 = 100%
I1 => I2 and I5    confidence = 2/6 = 33%
I2 => I1 and I5    confidence = 2/7 = 29%
I5 => I1 and I2    confidence = 2/2 = 100%

Suppose the minimum confidence threshold is 70%. Hence, only the second, third and last rules above are output
– since these are the only ones generated that are strong
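A short sketch that reproduces this example in Python, using the nine-transaction table from Example 1 (itemsets are represented as frozensets; names are illustrative):

```python
from itertools import combinations

# The transaction table from Example 1.
D = [{"I1","I2","I5"}, {"I2","I4"}, {"I2","I3"}, {"I1","I2","I4"},
     {"I1","I3"}, {"I2","I3"}, {"I1","I3"}, {"I1","I2","I3","I5"},
     {"I1","I2","I3"}]

def count(itemset):
    """Number of transactions containing every item in itemset."""
    return sum(1 for t in D if itemset <= t)

L = frozenset({"I1", "I2", "I5"})
for r in range(1, len(L)):            # all nonempty proper subsets of L
    for body in combinations(sorted(L), r):
        body = frozenset(body)
        conf = count(L) / count(body)
        mark = "  <-- strong" if conf >= 0.70 else ""
        print(f"{set(body)} => {set(L - body)}: {conf:.0%}{mark}")
```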

Limitations of Minimum Support
Discontinuity in the "interestingness" function
Feast or famine:
– minimum support is a crude control mechanism
– it often results in too few or too many associations
Cannot handle dense data
Cannot prune the search space using constraints on the relationship between antecedent and consequent, e.g. confidence
Minimum support may not be relevant:
– it cannot be sufficiently low to capture all valid rules
– it cannot be sufficiently high to exclude all spurious rules

Roles of Constraints
Select the most relevant patterns – patterns that are likely to be interesting
Control the number of patterns that the user must consider
Make computation feasible

AR: Is the Rule a Useful Predictor?
The Confidence Factor is the ratio of the number of transactions containing all the items in the rule to the number of transactions containing just the items in the condition (rule body). Consider:
if B and C then A
If this rule has a confidence of 0.33, it means that when B and C occur in a transaction, there is a 33% chance that A also occurs.

AR: Is the Rule a Useful Predictor? -2-
Consider the following table of probabilities of items and their combinations:

Combination      Probability
A                45%
B                42.5%
C                40%
A and B          25%
A and C          20%
B and C          15%
A and B and C    5%

AR: Is the Rule a Useful Predictor? -3-
Now consider the following rules:

if A and B then C    confidence = 0.05/0.25 = 20%
if A and C then B    confidence = 0.05/0.20 = 25%
if B and C then A    confidence = 0.05/0.15 = 33%

It is tempting to choose "if B and C then A", because it is the most confident (33%) – but there is a problem

AR: Is the Rule a Useful Predictor? -4-
A measure called lift indicates whether the rule predicts the result better than just assuming the result in the first place:
lift = confidence / P(result) = P(condition and result) / (P(condition) × P(result))

AR: Is the Rule a Useful Predictor? -5-
When lift > 1, the rule is better at predicting the result than random chance.
The lift measure is based on whether the probability P(condition and result) is higher than it would be if condition and result were statistically independent.
If there is no statistical dependence between condition and result, lift (also called improvement) = 1
– because in this case: P(condition and result) = P(condition) × P(result)

AR: Is the Rule a Useful Predictor? -6-
Consider the lift for our rules:

Rule                 support  confidence  lift
if A and B then C    5%       0.20        0.50
if A and C then B    5%       0.25        0.59
if B and C then A    5%       0.33        0.74
if A then B          25%      0.56        1.31

None of the rules with three items shows any lift – the best rule in the data actually has only two items: "if A then B". A predicts the occurrence of B 1.31 times better than chance.
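A few lines of Python that reproduce the lift column, assuming the probabilities as reconstructed in the earlier table:

```python
# Probabilities from the table above (reconstructed values).
p = {"A": 0.45, "B": 0.425, "C": 0.40,
     "AB": 0.25, "AC": 0.20, "BC": 0.15, "ABC": 0.05}

def lift(p_cond_and_result, p_cond, p_result):
    """lift = confidence / P(result)."""
    confidence = p_cond_and_result / p_cond
    return confidence / p_result

print(lift(p["ABC"], p["AB"], p["C"]))  # if A and B then C -> 0.50
print(lift(p["ABC"], p["AC"], p["B"]))  # if A and C then B -> 0.59
print(lift(p["ABC"], p["BC"], p["A"]))  # if B and C then A -> 0.74
print(lift(p["AB"],  p["A"],  p["B"]))  # if A then B       -> 1.31
```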

AR: Is the Rule a Useful Predictor? -7-
When lift < 1, negating the result produces a better rule. For example,
if B and C then not A
has a confidence of 0.67 and thus a lift of 0.67/0.55 = 1.22.
Negated rules may not be as useful as the original association rules when it comes to acting on the results.