COMP5331 FP-Tree Prepared by Raymond Wong Presented by Raymond Wong

COMP5331 FP-Tree Prepared by Raymond Wong Presented by Raymond Wong raywong@cse COMP5331

Large Itemset Mining (Frequent Itemset Mining)

Problem: to find all "large" (or frequent) itemsets, i.e. itemsets whose support is at least a given threshold (here, itemsets with support >= 3).

TID   Items Bought
100   a, b, c, d, e, f, g, h
200   a, f, g
300   b, d, e, f, j
400   a, b, d, i, k
500   a, b, e, g
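To make the problem statement concrete, here is a minimal brute-force sketch that enumerates every itemset over the five transactions above and keeps those with support >= 3. The names `transactions`, `support` and `brute_force_frequent` are illustrative assumptions, not part of the slides, and this exhaustive search is only practical for tiny examples.

```python
from itertools import combinations

# Transactions from the slide (TID -> items bought).
transactions = {
    100: {"a", "b", "c", "d", "e", "f", "g", "h"},
    200: {"a", "f", "g"},
    300: {"b", "d", "e", "f", "j"},
    400: {"a", "b", "d", "i", "k"},
    500: {"a", "b", "e", "g"},
}
MIN_SUPPORT = 3


def support(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for items in db.values() if itemset <= items)


def brute_force_frequent(db, min_support):
    """Try every non-empty itemset and keep those meeting the threshold."""
    all_items = sorted(set().union(*db.values()))
    result = {}
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            count = support(set(combo), db)
            if count >= min_support:
                result[frozenset(combo)] = count
    return result


frequent = brute_force_frequent(transactions, MIN_SUPPORT)
for itemset, count in sorted(frequent.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With 11 distinct items this checks 2047 itemsets, which is why Apriori and FP-tree avoid full enumeration.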

Apriori

L1 -> C2 -> L2 -> C3 -> L3 -> ...

Candidate generation (L1 -> C2, L2 -> C3, ...): Join Step, then Prune Step.
"Large" itemset generation (C2 -> L2, C3 -> L3, ...): Counting Step.
L1 -> C2 -> L2 is the large 2-itemset generation; L2 -> C3 -> L3 is the large 3-itemset generation.

Disadvantage 1: It is costly to handle a large number of candidate sets.
Disadvantage 2: It is tedious to repeatedly scan the database and check the candidate patterns.
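The join, prune and counting steps can be sketched as follows. This is a simplified illustration under stated assumptions: the function names `generate_candidates` and `count_and_filter` are hypothetical, and the join simply unions any two large (k-1)-itemsets whose union has k items, rather than using the sorted-prefix join found in optimized implementations.

```python
from itertools import combinations


def generate_candidates(large_prev):
    """Build C_k from L_{k-1} (a non-empty set of frozensets of size k-1)."""
    k = len(next(iter(large_prev))) + 1
    # Join step: union pairs of (k-1)-itemsets whose union has exactly k items.
    joined = {a | b for a in large_prev for b in large_prev if len(a | b) == k}
    # Prune step: drop candidates that have a (k-1)-subset which is not large.
    return {
        c for c in joined
        if all(frozenset(s) in large_prev for s in combinations(c, k - 1))
    }


def count_and_filter(candidates, transactions, min_support):
    """Counting step: one database scan turns C_k into L_k."""
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_support}


# The five transactions from the slides, threshold 3.
transactions = [set("abcdefgh"), set("afg"), set("bdefj"),
                set("abdik"), set("abeg")]
items = set().union(*transactions)
level = count_and_filter({frozenset([i]) for i in items}, transactions, 3)
all_large = dict(level)
while level:
    candidates = generate_candidates(set(level))
    level = count_and_filter(candidates, transactions, 3)
    all_large.update(level)

for itemset, n in sorted(all_large.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Each pass of the loop rescans the whole database, which is the second disadvantage listed above.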

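FP-tree

Scan the database and store all essential information in a data structure called the FP-tree (Frequent Pattern Tree). Building the tree takes two scans of the transaction DB: one to collect the frequent items and one to construct the tree. The FP-tree is concise and is used to generate large itemsets directly.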

FP-tree

Step 1: Deduce the ordered frequent items. Items with the same frequency are ordered alphabetically.
Step 2: Construct the FP-tree from the above data.
Step 3: From the FP-tree above, construct the FP-conditional tree for each item (or itemset).
Step 4: Determine the frequent patterns.

Threshold = 3

Item   Frequency
a      4
b      4
c      1
d      3
e      3
f      3
g      3
h      1
i      1
j      1
k      1

Frequent items (frequency >= 3), ordered by frequency:

Item   Frequency
a      4
b      4
d      3
e      3
f      3
g      3

TID   Items Bought               (Ordered) Frequent Items
100   a, b, c, d, e, f, g, h     a, b, d, e, f, g
200   a, f, g                    a, f, g
300   b, d, e, f, j              b, d, e, f
400   a, b, d, i, k              a, b, d
500   a, b, e, g                 a, b, e, g
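Step 1 can be sketched in Python as below. The function name `ordered_frequent_items` is an assumption made for illustration; the ordering rule (descending frequency, ties broken alphabetically) follows the slide.

```python
from collections import Counter

transactions = {
    100: {"a", "b", "c", "d", "e", "f", "g", "h"},
    200: {"a", "f", "g"},
    300: {"b", "d", "e", "f", "j"},
    400: {"a", "b", "d", "i", "k"},
    500: {"a", "b", "e", "g"},
}
MIN_SUPPORT = 3


def ordered_frequent_items(db, min_support):
    """Return (frequent item counts, ordered frequent items per transaction)."""
    # First scan: count how many transactions contain each item.
    counts = Counter(item for items in db.values() for item in items)
    frequent = {item: n for item, n in counts.items() if n >= min_support}
    # Keep only frequent items, most frequent first, ties in alphabetical order.
    ordered = {
        tid: sorted((i for i in items if i in frequent),
                    key=lambda i: (-frequent[i], i))
        for tid, items in db.items()
    }
    return frequent, ordered


frequent, ordered = ordered_frequent_items(transactions, MIN_SUPPORT)
for tid in sorted(ordered):
    print(tid, ordered[tid])
```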

Step 2: Construct the FP-tree from the above data.

Threshold = 3. The (ordered) frequent items of each transaction are inserted into the tree in turn, starting at the root. The header table (Item, Head of node-link) has one entry for each of a, b, d, e, f, g.

After TID 100 (a, b, d, e, f, g): new path root -> a:1 -> b:1 -> d:1 -> e:1 -> f:1 -> g:1.
After TID 200 (a, f, g): a:1 becomes a:2; new branch f:1 -> g:1 under a.
After TID 300 (b, d, e, f): new path root -> b:1 -> d:1 -> e:1 -> f:1.
After TID 400 (a, b, d): a:2 becomes a:3, b:1 becomes b:2, d:1 becomes d:2 along the first path.
After TID 500 (a, b, e, g): a:3 becomes a:4, b:2 becomes b:3; new branch e:1 -> g:1 under b:3.

Final FP-tree:

root
  a:4
    b:3
      d:2
        e:1
          f:1
            g:1
      e:1
        g:1
    f:1
      g:1
  b:1
    d:1
      e:1
        f:1
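The insertion procedure of Step 2 can be sketched as follows. This is a minimal illustration, not the slides' own implementation: the class `FPNode` and the functions `build_fp_tree` and `render` are hypothetical names, and the header table is kept as a plain list of nodes per item instead of a chain of node-links.

```python
class FPNode:
    """One FP-tree node: an item, its count, its parent and its children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    """Insert each ordered transaction as a path starting at the root."""
    root = FPNode(None, None)
    header = {}  # item -> list of nodes carrying that item (assumed layout)
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                # No shared prefix here: create a new node and register it.
                child = FPNode(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def render(node, depth=0):
    """Return the tree below `node` as indented 'item:count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


# Ordered frequent items of TIDs 100 to 500 from Step 1.
ordered = [list("abdefg"), list("afg"), list("bdef"), list("abd"), list("abeg")]
root, header = build_fp_tree(ordered)
print("\n".join(render(root)))
```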

Step 3: From the FP-tree above, construct the FP-conditional tree for each item (or itemset).

Cond. FP-tree on "g" (Threshold = 3; support of g = 3)

Paths ending in g:
  (a:1, b:1, d:1, e:1, f:1, g:1)
  (a:1, b:1, e:1, g:1)
  (a:1, f:1, g:1)

Item frequency in these paths: a 3, b 2, d 1, e 2, f 2, g 3

Conditional pattern base of "g" (infrequent items removed):
  (a:1, g:1)
  (a:1, g:1)
  (a:1, g:1)

Item frequency: a 3, g 3

Conditional FP-tree on "g":
  Header table: a
  root
    a:3
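Collecting the paths for one item and pruning infrequent items can be sketched as below. The names `Node`, `build`, `conditional_pattern_base` and `conditional_frequent_items` are assumptions for illustration. Unlike the slide, the sketch leaves the item itself (here g) out of each path, since it is implied, and attaches the node's count to the whole prefix.

```python
from collections import Counter


class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build(ordered_transactions):
    """Build the FP-tree; `links` maps each item to its nodes (assumed layout)."""
    root, links = Node(None, None), {}
    for items in ordered_transactions:
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = Node(item, node)
                links.setdefault(item, []).append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, links


def conditional_pattern_base(links, item):
    """Prefix path of every `item` node, weighted by that node's count."""
    base = []
    for node in links.get(item, []):
        path, cur = [], node.parent
        while cur.item is not None:  # climb until the root is reached
            path.append(cur.item)
            cur = cur.parent
        base.append((path[::-1], node.count))
    return base


def conditional_frequent_items(base, min_support):
    """Items that stay frequent inside the conditional pattern base."""
    counts = Counter()
    for path, count in base:
        for it in path:
            counts[it] += count
    return {it: n for it, n in counts.items() if n >= min_support}


ordered = [list("abdefg"), list("afg"), list("bdef"), list("abd"), list("abeg")]
root, links = build(ordered)
for item in "gfedba":
    base = conditional_pattern_base(links, item)
    print(item, base, conditional_frequent_items(base, 3))
```

For g only a survives with count 3, which gives the single-node conditional tree root -> a:3.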

Cond. FP-tree on "f" (Threshold = 3; support of f = 3)

Paths ending in f:
  (a:1, b:1, d:1, e:1, f:1)
  (a:1, f:1)
  (b:1, d:1, e:1, f:1)

Item frequency in these paths: a 2, b 2, d 2, e 2, f 3

Conditional pattern base of "f" (infrequent items removed):
  (f:1)
  (f:1)
  (f:1)

Item frequency: f 3

Conditional FP-tree on "f":
  root (no other nodes)

Cond. FP-tree on "e" (Threshold = 3; support of e = 3)

Paths ending in e:
  (a:1, b:1, d:1, e:1)
  (a:1, b:1, e:1)
  (b:1, d:1, e:1)

Item frequency in these paths: a 2, b 3, d 2, e 3

Conditional pattern base of "e" (infrequent items removed):
  (b:1, e:1)
  (b:1, e:1)
  (b:1, e:1)

Item frequency: b 3, e 3

Conditional FP-tree on "e":
  Header table: b
  root
    b:3

Cond. FP-tree on "d" (Threshold = 3; support of d = 3)

Paths ending in d:
  (a:2, b:2, d:2)
  (b:1, d:1)

Item frequency in these paths: a 2, b 3, d 3

Conditional pattern base of "d" (infrequent items removed):
  (b:2, d:2)
  (b:1, d:1)

Item frequency: b 3, d 3

Conditional FP-tree on "d":
  Header table: b
  root
    b:3

Cond. FP-tree on "b" (Threshold = 3; support of b = 4)

Paths ending in b:
  (a:3, b:3)
  (b:1)

Item frequency in these paths: a 3, b 4

Conditional pattern base of "b" (no item is removed):
  (a:3, b:3)
  (b:1)

Item frequency: b 4, a 3

Conditional FP-tree on "b":
  Header table: a
  root
    a:3

Cond. FP-tree on "a" (Threshold = 3; support of a = 4)

Paths ending in a:
  (a:4)

Item frequency in these paths: a 4

Conditional pattern base of "a":
  (a:4)

Item frequency: a 4

Conditional FP-tree on "a":
  root (no other nodes)

Step 4: Determine the frequent patterns.

Cond. FP-tree on "g" (support 3): root -> a:3
  1. Before generating this cond. tree, we generate {g} (support = 3).
  2. After generating this cond. tree, we generate {a, g} (support = 3).

Cond. FP-tree on "f" (support 3): root only
  1. Before generating this cond. tree, we generate {f} (support = 3).
  2. After generating this cond. tree, we do not generate any itemset.

Cond. FP-tree on "e" (support 3): root -> b:3
  1. Before generating this cond. tree, we generate {e} (support = 3).
  2. After generating this cond. tree, we generate {b, e} (support = 3).

Cond. FP-tree on "d" (support 3): root -> b:3
  1. Before generating this cond. tree, we generate {d} (support = 3).
  2. After generating this cond. tree, we generate {b, d} (support = 3).

Cond. FP-tree on "b" (support 4): root -> a:3
  1. Before generating this cond. tree, we generate {b} (support = 4).
  2. After generating this cond. tree, we generate {a, b} (support = 3).

Cond. FP-tree on "a" (support 4): root only
  1. Before generating this cond. tree, we generate {a} (support = 4).
  2. After generating this cond. tree, we do not generate any itemset.

Complexity

Complexity in building the FP-tree:
  Two scans of the transaction DB:
    1. Collect the frequent items.
    2. Construct the FP-tree.
  Cost to insert one transaction: the number of frequent items in this transaction.

Size of the FP-tree

The size of the FP-tree is bounded by the overall number of occurrences of the frequent items in the database.

Height of the Tree

The height of the tree is bounded by the maximum number of frequent items in any transaction in the database.

Compression

With respect to the total number of items stored, is the FP-tree more compact than the original database?
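For the running example, the size and height bounds and the compression question can be checked with a small sketch. It stores only the shape of the tree as nested dictionaries (counts are omitted), and the helper names `insert`, `node_count` and `height` are assumptions made for illustration.

```python
def insert(trie, items):
    """Add one ordered transaction as a path; shared prefixes are reused."""
    for item in items:
        trie = trie.setdefault(item, {})


def node_count(trie):
    """Number of nodes below the root."""
    return sum(1 + node_count(child) for child in trie.values())


def height(trie):
    """Length of the longest root-to-leaf path, not counting the root."""
    return max((1 + height(child) for child in trie.values()), default=0)


# Ordered frequent items of TIDs 100 to 500.
ordered = [list("abdefg"), list("afg"), list("bdef"), list("abd"), list("abeg")]
trie = {}
for transaction in ordered:
    insert(trie, transaction)

nodes = node_count(trie)
occurrences = sum(len(t) for t in ordered)   # frequent-item occurrences in the DB
longest = max(len(t) for t in ordered)       # most frequent items in one transaction

print("nodes:", nodes, "<= occurrences:", occurrences)
print("height:", height(trie), "<= longest transaction:", longest)
```

Here the tree has fewer nodes than there are frequent-item occurrences because transactions share prefixes. If no two transactions shared a prefix, the node count would equal the bound, and each node also carries a count and pointers, so a node count is not a byte-level comparison.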

Details of the Algorithm

Procedure FP-growth(Tree, α)
  if Tree contains a single path P then
    for each combination (denoted by β) of the nodes in the path P do
      generate pattern β ∪ α with support = minimum support of nodes in β
  else
    for each ai in the header table of Tree do
      generate pattern β = ai ∪ α with support = ai.support
      construct β's conditional pattern base and then β's conditional FP-tree Tree_β
      if Tree_β ≠ ∅ then
        call FP-growth(Tree_β, β)
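The procedure can be turned into a short runnable sketch. This is one possible reading under stated assumptions, not a reference implementation: `Node`, `build_tree`, `single_path`, `fp_growth` and `mine` are hypothetical names, the header table is a dictionary of node lists, and each conditional tree is rebuilt from its weighted conditional pattern base.

```python
from collections import Counter
from itertools import combinations


class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs; returns (root, header)."""
    counts = Counter()
    for items, n in weighted_transactions:
        for item in items:
            counts[item] += n
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    root, header = Node(None, None), {}
    for items, n in weighted_transactions:
        # Keep frequent items, most frequent first, ties alphabetical.
        ordered = sorted((i for i in items if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header.setdefault(item, []).append(node.children[item])
            node = node.children[item]
            node.count += n
    return root, header


def single_path(root):
    """Return the nodes of the tree if it is a single chain, else None."""
    path, node = [], root
    while node.children:
        if len(node.children) > 1:
            return None
        node = next(iter(node.children.values()))
        path.append(node)
    return path


def fp_growth(root, header, alpha, min_support, out):
    path = single_path(root)
    if path is not None:
        # Single path P: every combination beta of its nodes gives beta ∪ alpha.
        for size in range(1, len(path) + 1):
            for combo in combinations(path, size):
                beta = frozenset(n.item for n in combo) | alpha
                out[beta] = min(n.count for n in combo)
        return
    for item, nodes in header.items():
        beta = alpha | {item}
        out[beta] = sum(n.count for n in nodes)  # ai.support
        # Conditional pattern base: prefix of each node, weighted by its count.
        base = []
        for n in nodes:
            prefix, cur = [], n.parent
            while cur.item is not None:
                prefix.append(cur.item)
                cur = cur.parent
            base.append((prefix, n.count))
        cond_root, cond_header = build_tree(base, min_support)
        if cond_root.children:  # Tree_beta is not empty
            fp_growth(cond_root, cond_header, beta, min_support, out)


def mine(transactions, min_support):
    root, header = build_tree([(t, 1) for t in transactions], min_support)
    out = {}
    fp_growth(root, header, frozenset(), min_support, out)
    return out


db = [set("abcdefgh"), set("afg"), set("bdefj"), set("abdik"), set("abeg")]
patterns = mine(db, 3)
for itemset, n in sorted(patterns.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

On the five example transactions with threshold 3 this yields the same ten itemsets as Step 4: the six frequent single items plus {a, b}, {a, g}, {b, d} and {b, e}.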