Association Rule Mining - MaxMiner

Mining Association Rules in Large Databases
• Association rule mining
• Algorithms: Apriori and FP-Growth
• Max and closed patterns
• Mining various kinds of association/correlation rules

Max-patterns & Closed patterns
• If there are frequent patterns with many items, enumerating all of them is costly.
• We may be interested in finding the 'boundary' frequent patterns.
• Two types …

Max-patterns
• A frequent pattern {a_1, …, a_100} has $\binom{100}{1} + \binom{100}{2} + \dots + \binom{100}{100} = 2^{100} - 1 \approx 1.27 \times 10^{30}$ frequent sub-patterns!
• Max-pattern: a frequent pattern without a proper frequent super-pattern.
• Example (Min_sup = 2):

  Tid   Items
  10    A, B, C, D, E
  20    B, C, D, E
  30    A, C, D, F

  BCDE and ACD are max-patterns; BCD is not a max-pattern.
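The count and the max-patterns on this slide can be checked by brute force. The sketch below is only an illustration for the three-transaction table; the names `DB`, `support`, `frequent_itemsets` and `max_patterns` are made up here, and enumerating every itemset is exactly the cost that max-pattern miners try to avoid.

```python
from itertools import combinations
from math import comb

# Number of non-empty sub-patterns of a 100-item frequent pattern.
n_sub = sum(comb(100, k) for k in range(1, 101))
print(n_sub == 2**100 - 1, f"{n_sub:.3g}")

# Toy database from the slide (Tid -> items), Min_sup = 2.
DB = {10: set("ABCDE"), 20: set("BCDE"), 30: set("ACDF")}
MIN_SUP = 2

def support(itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for items in DB.values() if set(itemset) <= items)

def frequent_itemsets():
    """Enumerate all non-empty itemsets and keep the frequent ones."""
    items = sorted(set().union(*DB.values()))
    return [frozenset(c)
            for k in range(1, len(items) + 1)
            for c in combinations(items, k)
            if support(c) >= MIN_SUP]

def max_patterns():
    """Frequent itemsets with no proper frequent superset."""
    freq = frequent_itemsets()
    return sorted("".join(sorted(p)) for p in freq
                  if not any(p < q for q in freq))

print(max_patterns())  # ['ACD', 'BCDE']
```

BCD is frequent but does not appear in the result because BCDE is a proper frequent superset of it.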

Maximal Frequent Itemset
• An itemset is maximal frequent if none of its immediate supersets is frequent.
[Figure labels: Border; Infrequent Itemsets; Maximal Itemsets]

Closed Itemset
• An itemset is closed if none of its immediate supersets has the same support as the itemset.
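To make the two definitions concrete, here is a small brute-force sketch that tests them literally, one immediate superset at a time. It assumes the three-transaction toy database from the max-pattern slide with Min_sup = 2; the helper names (`immediate_supersets`, `is_closed`, `is_maximal`) are illustrative and not taken from the slides.

```python
from itertools import combinations

# Assumed toy database (from the max-pattern slide), Min_sup = 2.
DB = [set("ABCDE"), set("BCDE"), set("ACDF")]
MIN_SUP = 2
ITEMS = sorted(set().union(*DB))

def support(itemset):
    """Count the transactions that contain `itemset`."""
    return sum(1 for t in DB if itemset <= t)

def immediate_supersets(itemset):
    """All itemsets obtained by adding exactly one more item."""
    return [itemset | {i} for i in ITEMS if i not in itemset]

def is_closed(itemset):
    """No immediate superset has the same support."""
    s = support(itemset)
    return all(support(sup) != s for sup in immediate_supersets(itemset))

def is_maximal(itemset):
    """Frequent, and no immediate superset is frequent."""
    return (support(itemset) >= MIN_SUP and
            all(support(sup) < MIN_SUP for sup in immediate_supersets(itemset)))

frequent = [frozenset(c)
            for k in range(1, len(ITEMS) + 1)
            for c in combinations(ITEMS, k)
            if support(frozenset(c)) >= MIN_SUP]

closed = sorted("".join(sorted(p)) for p in frequent if is_closed(p))
maximal = sorted("".join(sorted(p)) for p in frequent if is_maximal(p))
print(closed)   # ['ACD', 'BCDE', 'CD']
print(maximal)  # ['ACD', 'BCDE']
```

On this data CD is closed but not maximal: its support is 3, every immediate superset has lower support, yet ACD is still frequent.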

Maximal vs Closed Itemsets
[Figure labels: Transaction Ids; Not supported by any transactions]

Maximal vs Closed Frequent Itemsets
• Minimum support = 2
• # Closed = 9
• # Maximal = 4
[Figure labels: Closed and maximal; Closed but not maximal]

Maximal vs Closed Itemsets

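MaxMiner: Mining Max-patterns
• Idea: generate the complete set-enumeration tree one level at a time, pruning where applicable.
• Set-enumeration tree for {A, B, C, D}, each node written as head (tail):

  Level 0: ∅ (ABCD)
  Level 1: A (BCD)   B (CD)   C (D)   D ()
  Level 2: AB (CD)   AC (D)   AD ()   BC (D)   BD ()   CD ()
  Level 3: ABC (D)   ABD ()   ACD ()   BCD ()
  Level 4: ABCD ()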
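The level-by-level construction of the set-enumeration tree can be sketched in a few lines. This is an illustration only, with no pruning; `enumerate_levels` is a hypothetical name, items are single characters, and the empty head is printed as `{}`.

```python
def enumerate_levels(items):
    """Yield the set-enumeration tree one level at a time as (head, tail) pairs."""
    level = [("", items)]  # root: empty head, every item in the tail
    while level:
        yield level
        # Each tail item extends the head; the child's tail keeps only later items.
        level = [(head + tail[k], tail[k + 1:])
                 for head, tail in level
                 for k in range(len(tail))]

for depth, level in enumerate(enumerate_levels("ABCD")):
    print(depth, "  ".join(f"{h or '{}'} ({t})" for h, t in level))
```

For four items the unpruned tree has 2^4 = 16 nodes, one per subset, which is why the pruning rules matter.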

Local Pruning Techniques (e.g. at node A)
• Check the frequency of ABCD and of AB, AC, AD.
• If ABCD is frequent, prune the whole sub-tree.
• If AC is NOT frequent, remove C from the parentheses before expanding.
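A minimal sketch of the two local checks at a single node. It assumes the three-transaction toy database used in this deck's examples with Min_sup = 2; `expand` and its return convention are hypothetical choices made for the illustration.

```python
# Assumed toy database (Tids 10, 20, 30 of the example slides), Min_sup = 2.
DB = [set("ABCDE"), set("BCDE"), set("ACDF")]
MIN_SUP = 2

def support(items):
    """Count the transactions that contain every item in `items`."""
    return sum(1 for t in DB if set(items) <= t)

def expand(head, tail):
    """Apply local pruning at node head (tail).

    Returns (pattern, children): if head + tail is frequent, the sub-tree is
    pruned and head + tail is returned with no children; otherwise infrequent
    items are dropped from the tail and the children are generated.
    """
    if support(head + tail) >= MIN_SUP:
        return head + tail, []
    tail = "".join(i for i in tail if support(head + i) >= MIN_SUP)
    children = [(head + tail[k], tail[k + 1:]) for k in range(len(tail))]
    return None, children

print(expand("A", "BCDE"))  # (None, [('AC', 'D'), ('AD', '')])
print(expand("B", "CDE"))   # ('BCDE', [])
```

At node A the items B and E are dropped because AB and AE each occur only once; at node B the whole sub-tree collapses because BCDE is already frequent.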

Algorithm MaxMiner
• Initially, generate one node N = ∅ (ABCD), where h(N) = ∅ and t(N) = {A, B, C, D}.
• Consider expanding N:
  - If h(N) ∪ t(N) is frequent, do not expand N.
  - If for some i ∈ t(N), h(N) ∪ {i} is NOT frequent, remove i from t(N) before expanding N.
• Apply global pruning techniques …

Global Pruning Technique (across sub-trees)
• When a max pattern is identified (e.g. ABCD), prune all nodes N (e.g. B, C and D) where h(N) ∪ t(N) is a subset of it.
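Putting the local and global pruning rules together gives the following simplified breadth-first sketch. It is not the full published algorithm: it assumes the deck's three-transaction toy database with Min_sup = 2, keeps items in alphabetical order, and recounts support by scanning the data each time. The name `max_miner` and the final subset filter are choices made for this illustration.

```python
from collections import deque

# Assumed toy database (Tids 10, 20, 30 of the example slides), Min_sup = 2.
DB = [set("ABCDE"), set("BCDE"), set("ACDF")]
MIN_SUP = 2

def support(items):
    """Count the transactions that contain every item in `items`."""
    return sum(1 for t in DB if set(items) <= t)

def max_miner(items):
    """Simplified MaxMiner-style search over (head, tail) nodes, level by level."""
    candidates = []                   # frequent itemsets kept as possible max patterns
    queue = deque([("", items)])      # root node: empty head, all items in the tail
    while queue:
        head, tail = queue.popleft()
        full = frozenset(head + tail)
        # Global pruning: head ∪ tail is covered by an already identified pattern.
        if any(full <= m for m in candidates):
            continue
        # Local pruning 1: head ∪ tail is frequent, so do not expand this node.
        if full and support(full) >= MIN_SUP:
            candidates.append(full)
            continue
        # Local pruning 2: drop tail items i for which head ∪ {i} is infrequent.
        tail = "".join(i for i in tail if support(head + i) >= MIN_SUP)
        if not tail and head:
            candidates.append(frozenset(head))  # head cannot be extended further
        for k, item in enumerate(tail):
            queue.append((head + item, tail[k + 1:]))
    # A candidate found early may be a subset of one found later; keep only maximal ones.
    maximal = [m for m in candidates if not any(m < other for other in candidates)]
    return sorted("".join(sorted(m)) for m in maximal)

print(max_miner("ABCDEF"))  # ['ACD', 'BCDE']
```

On this data, once BCDE is identified at node B, the nodes C (DE), D (E) and E () are skipped by the global check, and AD () is skipped after ACD is found.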

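Example
• Transactions: 10: A, B, C, D, E; 20: B, C, D, E; 30: A, C, D, F (Min_sup = 2)
• Root node ∅ (ABCDEF):

  Items    Frequency
  ABCDEF   0
  A        2
  B        2
  C        3
  D        3
  E        2
  F        1

• F is infrequent, so it is removed from the tail before expanding.
• Level 1: A (BCDE)   B (CDE)   C (DE)   D (E)   E ()
• Max patterns: (none yet)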

Example: node A
• Transactions: 10: A, B, C, D, E; 20: B, C, D, E; 30: A, C, D, F (Min_sup = 2)
• Node A (BCDE):

  Items   Frequency
  ABCDE   1
  AB      1
  AC      2
  AD      2
  AE      1

• Level 1: A (BCDE)   B (CDE)   C (DE)   D (E)   E ()
• Level 2: AC (D)   AD ()
• Max patterns: (none yet)

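Example: node B
• Transactions: 10: A, B, C, D, E; 20: B, C, D, E; 30: A, C, D, F (Min_sup = 2)
• Node B (CDE):

  Items   Frequency
  BCDE    2
  BC
  BD
  BE

• BCDE is frequent, so node B is not expanded.
• Level 1: A (BCDE)   B (CDE)   C (DE)   D (E)   E ()
• Level 2: AC (D)   AD ()
• Max patterns: BCDE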

Example: node AC
• Transactions: 10: A, B, C, D, E; 20: B, C, D, E; 30: A, C, D, F (Min_sup = 2)
• Node AC (D):

  Items   Frequency
  ACD     2

• Level 1: A (BCDE)   B (CDE)   C (DE)   D (E)   E ()
• Level 2: AC (D)   AD ()
• Max patterns: BCDE, ACD