Data Mining – Association Rules

Slides:



Advertisements
Similar presentations
Association Rule Mining
Advertisements

Huffman Codes and Asssociation Rules (II) Prof. Sin-Min Lee Department of Computer Science.
Association Analysis (2). Example TIDList of item ID’s T1I1, I2, I5 T2I2, I4 T3I2, I3 T4I1, I2, I4 T5I1, I3 T6I2, I3 T7I1, I3 T8I1, I2, I3, I5 T9I1, I2,
Association Rules Spring Data Mining: What is it?  Two definitions:  The first one, classic and well-known, says that data mining is the nontrivial.
FUNGSI MAYOR Assosiation. What Is Association Mining? Association rule mining: –Finding frequent patterns, associations, correlations, or causal structures.
Pertemuan XIV FUNGSI MAYOR Assosiation. What Is Association Mining? Association rule mining: –Finding frequent patterns, associations, correlations, or.
Association Rule Mining. 2 The Task Two ways of defining the task General –Input: A collection of instances –Output: rules to predict the values of any.
MIS2502: Data Analytics Association Rule Mining. Uses What products are bought together? Amazon’s recommendation engine Telephone calling patterns Association.
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Analysis. Association Rule Mining: Definition Given a set of records each of which contain some number of items from a given collection; –Produce.
Data Mining Techniques So Far: Cluster analysis K-means Classification Decision Trees J48 (C4.5) Rule-based classification JRIP (RIPPER) Logistic Regression.
Data Mining Association Analysis: Basic Concepts and Algorithms Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Analysis: Basic Concepts and Algorithms.
Data Mining Association Analysis: Basic Concepts and Algorithms
© Vipin Kumar CSci 8980 Fall CSci 8980: Data Mining (Fall 2002) Vipin Kumar Army High Performance Computing Research Center Department of Computer.
Fast Algorithms for Association Rule Mining
SEG Tutorial 2 – Frequent Pattern Mining.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining By Tan, Steinbach, Kumar Lecture.
Modul 7: Association Analysis. 2 Association Rule Mining  Given a set of transactions, find rules that will predict the occurrence of an item based on.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
ASSOCIATION RULE DISCOVERY (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
Data & Text Mining1 Introduction to Association Analysis Zhangxi Lin ISQS 3358 Texas Tech University.
EXAM REVIEW MIS2502 Data Analytics. Exam What Tool to Use? Evaluating Decision Trees Association Rules Clustering.
CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
The Three Analytics Techniques. Decision Trees – Determining Probability.
1 What is Association Analysis: l Association analysis uses a set of transactions to discover rules that indicate the likely occurrence of an item based.
Frequent-Itemset Mining. Market-Basket Model A large set of items, e.g., things sold in a supermarket. A large set of baskets, each of which is a small.
Association Rule Mining
ASSOCIATION RULES (MARKET BASKET-ANALYSIS) MIS2502 Data Analytics Adapted from Tan, Steinbach, and Kumar (2004). Introduction to Data Mining.
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ Data Mining: Association Analysis This lecture node is modified based on Lecture Notes for.
Elsayed Hemayed Data Mining Course
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Chapter 8 Association Rules. Data Warehouse and Data Mining Chapter 10 2 Content Association rule mining Mining single-dimensional Boolean association.
Association Rules Carissa Wang February 23, 2010.
Chap 6: Association Rules. Rule Rules!  Motivation ~ recent progress in data mining + warehousing have made it possible to collect HUGE amount of data.
COMP53311 Association Rule Mining Prepared by Raymond Wong Presented by Raymond Wong
Elective-I Examination Scheme- In semester Assessment: 30 End semester Assessment :70 Text Books: Data Mining Concepts and Techniques- Micheline Kamber.
1 Data Mining Lecture 6: Association Analysis. 2 Association Rule Mining l Given a set of transactions, find rules that will predict the occurrence of.
MIS2502: Data Analytics Association Rule Mining David Schuff
Introduction to Data Mining Mining Association Rules Reference: Tan et al: Introduction to data mining. Some slides are adopted from Tan et al.
MIS2502: Data Analytics Association Rule Mining Jeremy Shafer
Stats 202: Statistical Aspects of Data Mining Professor Rajan Patel
Data Mining Association Analysis: Basic Concepts and Algorithms
Data Mining: Concepts and Techniques
Data Mining: Concepts and Techniques
Frequent Pattern Mining
William Norris Professor and Head, Department of Computer Science
Waikato Environment for Knowledge Analysis
Market Basket Analysis and Association Rules
Market Basket Many-to-many relationship between different objects
Dynamic Itemset Counting
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Rule Mining
Data Mining Association Analysis: Basic Concepts and Algorithms
Data Mining Association Rules Assoc.Prof.Songül Varlı Albayrak
Transactional data Algorithm Applications
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Analysis: Basic Concepts and Algorithms
MIS2502: Data Analytics Association Rule Mining
MIS2502: Data Analytics Association Rule Mining
MIS2502: Data Analytics Association Rule Learning
Association Analysis: Basic Concepts
Presentation transcript:

Data Mining – Association Rules Tutorial 4 Data Mining – Association Rules

Contents Review Questions Algorithm Questions Question 1: Data Mining and Metrics Algorithm Questions Question 2: Applying Apriori Algorithm Question 3: Finding Association Rules

Review Questions

Question 1: Data Mining and Metrics What is an Association Rule?

Question 1: Data Mining and Metrics What is an Association Rule? An association rule states that: given a set of records, each of which contain some number of items from a given collection, there will be a dependency rule that will predict the occurrence of an item based on the occurrences of other items in the transaction. In other words, if it has been found in all transactions that coke is always bought with milk, then there will be a rule that states {milk} -> {coke} (however, not the other way around since not all milk is bought with coke).

Question 1: Data Mining and Metrics What are the metrics for evaluating association rules?

Question 1: Data Mining and Metrics What are the metrics for evaluating association rules? The association rule evaluation metrics are “Support” (s) and “Confidence” (c). Support is the fractions of the transactions that contain both X and Y. Confidence measures how often items in Y appears in transactions that contain X.

Question 1: Data Mining and Metrics What are the metrics for evaluating association rules? For example given the following table, these are the support and confidence values: Example Association Rule: {Milk, Diaper} => Beer s = (Milk, Diaper, Beer)/Total Transactions = 2/5 = 0.4 c = (Milk, Diaper, Beer)/ (Milk, Diaper) = 2/3 = 0.67 TID Items 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke

Algorithm Questions

Question 2: Applying the Apiori Algorithm Apply the Apriori algorithm to find all itemsets with support >= 0.2 from the following data: Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 2: Applying the Apiori Algorithm Apriori Principle Step 1: Count up the occurrences of 1 item: *Note: since it is out of 10, 0.2 support means if it appears twice in the list. Itemset Count Milk 5 Bread 4 Eggs Juice 3 Butter 2 Coffee Cookies

Question 2: Applying the Apiori Algorithm Apriori Principle Step 2: Look for frequent occurrences of 2 items (in bold, not strikethrough): Itemset Count Milk, Bread 4 Milk, Eggs 3 Milk, Juice 1 Milk, Cookies Bread, Eggs Bread, Cookies Eggs, Coffee Eggs, Cookies Juice, Butter Juice, Coffee Butter, Cookies

Question 2: Applying the Apiori Algorithm Apriori Principle Step 3: Look for frequent occurrences of 3 items (in bold, not strikethrough): Therefore, the most frequent and highest itemset data mining sub-itemset is {Milk, Bread, Eggs}. Itemset Count Milk, Bread, Eggs 3

Question 3: Applying the Apiori Algorithm Using the data set in question 2 ({Milk, Bread, Eggs}), find all the association rules with support >= 0.2 and confidence >= 0.8. “{Milk, Bread} -> Eggs” where {Milk, Bread} is X and Eggs is Y. Support = {itemset (X and Y)}/transactions Confidence = {itemset (X and Y)}/{itemset (X)} To do this, we check each permutation of the association rules.

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread, Eggs}: {Milk, Bread} -> {Eggs} Support = Confidence = {Milk Eggs} -> {Bread} Support = {Eggs, Bread} -> {Milk} Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread, Eggs}: {Milk, Bread} -> {Eggs} Support = 3/10 = 0.3 Confidence = 3/4 = 0.75 {Milk Eggs} -> {Bread} Support = Confidence = {Eggs, Bread} -> {Milk} Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread, Eggs}: {Milk, Bread} -> {Eggs} Support = 3/10 = 0.3 Confidence = 3/4 = 0.75 {Milk Eggs} -> {Bread} Support = 3/10 = 0.3 Confidence = 3/3 = 1 {Eggs, Bread} -> {Milk} Support = Confidence = Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread, Eggs}: {Milk, Bread} -> {Eggs} Support = 3/10 = 0.3 Confidence = 3/4 = 0.75 {Milk Eggs} -> {Bread} Support = 3/10 = 0.3 Confidence = 3/3 = 1 {Eggs, Bread} -> {Milk} Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread}: {Milk} -> {Bread} Support = Confidence = {Bread} -> {Milk} Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread}: {Milk} -> {Bread} Support = 4/10 = 0.4 Confidence = 4/5 = 0.8 {Bread} -> {Milk} Support = Confidence = Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Bread}: {Milk} -> {Bread} Support = 4/10 = 0.4 Confidence = 4/5 = 0.8 {Bread} -> {Milk} Confidence = 4/4 = 1 Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Eggs}: {Milk} -> {Eggs} Support = Confidence = {Eggs} -> {Milk} Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Eggs}: {Milk} -> {Eggs} Support = 3/10 = 0.3 Confidence = 3/5 = 0.6 {Eggs} -> {Milk} Support = Confidence = Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Milk, Eggs}: {Milk} -> {Eggs} Support = 3/10 = 0.25 Confidence = 3/5 = 0.6 {Eggs} -> {Milk} Support = 3/10 = 0.3 Confidence = 3/4 = 0.75 Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Bread Eggs}: {Bread} -> {Eggs} Support = Confidence = {Eggs} -> {Bread} Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Bread Eggs}: {Bread} -> {Eggs} Support = 3/10 = 0.3 Confidence = 3/4 = 0.75 {Eggs} -> {Bread} Support = Confidence = Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Association Rules for {Bread Eggs}: {Bread} -> {Eggs} Support = 3/10 = 0.25 Confidence = 3/4 = 0.75 {Eggs} -> {Bread} Support = 3/10 = 0.3 Transaction Items in Transaction 1 Milk, Bread, Eggs 2 Milk, Juice 3 Juice, Butter 4 5 Coffee, Eggs 6 Coffee 7 Coffee, Juice 8 Milk, Bread, Cookies, Eggs 9 Cookies, Butter 10 Milk, Bread

Question 3: Applying the Apiori Algorithm Therefore, the only Association Rules that satisfy the restriction of having support >= 2 and confidence >= 0.8 is: {Milk, Eggs} -> {Bread} (s=0.3, c=1) {Eggs, Bread} -> {Milk} (s=0.3, c=1) {Milk} -> {Bread} (s=0.4, c=0.8) {Bread} -> {Milk} (s=0.4, c=1)

Questions?