1 Efficient Mining of Intertransaction Association Rules
IEEE Transactions on Knowledge and Data Engineering, Vol. 15, No. 1, January/February 2003
Authors: Anthony K.H. Tung, Hongjun Lu, Jiawei Han, Ling Feng
Advisor: Dr. Hsu
Graduate: Keng-Wei Chang
Intelligent Database Systems Lab, 國立雲林科技大學 (National Yunlin University of Science and Technology)

2 Outline
- Motivation
- Objective
- Introduction
- Problem Description
- Principles of Intertransaction Association Mining Methods
- The FITI Algorithm
- Performance Study
- Conclusion
- Personal Opinion
N.Y.U.S.T. I.M.

3 Intratransaction vs. Intertransaction
- Intratransaction
  - same transaction
  - same customer
  - same day
- Intertransaction
  - different transactions

4 Motivation
Most previous studies on mining association rules address intratransaction associations, that is, associations among items within the
- same transaction
- same customer
- same day

5 Objective
We break the barrier of transactions and extend the scope of mining association rules from traditional single-dimensional, intratransaction associations to multidimensional, intertransaction associations.

6 Introduction
Most of the previous studies …
There is an important form of association rule that is useful but cannot be discovered within the existing association rule mining framework.

7 Introduction
Example: stock prices
- R1: When the prices of IBM and SUN go up, the price of Microsoft goes up on the same day with 80 percent probability.
- R2: If the prices of IBM and SUN go up, Microsoft's price will most likely (80 percent probability) go up the next day and then drop four days later.

8 Introduction
Mining intertransaction association rules is different from mining sequential patterns in transaction data.
Contributions
- Introduced an association mining problem that is more general than what has been discussed in the literature.
- Developed an efficient algorithm (FITI) for mining intertransaction association rules from large databases.

9 PROBLEM DESCRIPTION
Definition 1.
- Let Σ = {e_1, e_2, …, e_u} be a set of literals, called items.
- Let D be an attribute and Dom(D) be the domain of D.
- A transaction database is a database containing transactions in the form of (d, E), where d ∈ Dom(D) and E ⊆ Σ.
Definition 2.
- A sliding window W in a transaction database T is a block of w continuous intervals along domain D, starting from interval d_0 such that T contains a transaction at interval d_0.
- Each interval d_j in W is called a subwindow of W, denoted as W[j], where j = d_j − d_0.
- We call j the subwindow number of d_j within W.

10 PROBLEM DESCRIPTION
Definition 3.
- Let W be a sliding window with w intervals and u be the number of literals in Σ.
- We define a megatransaction M contained within W to be M = {e_i(j) | e_i ∈ W[j], 1 ≤ i ≤ u, 0 ≤ j ≤ w − 1}.
- To distinguish the items in a megatransaction from the items in a traditional transaction, the items in a megatransaction are called extended-items.
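
The sliding-window and megatransaction definitions can be illustrated with a small sketch. It assumes that intervals are integer indices (for example, day numbers), that a window is opened only at an interval that holds a transaction, and that an extended-item is represented as an `(item, subwindow number)` pair. The function name `megatransactions` and the toy database are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# An extended-item: (item, subwindow number j within the window).
ExtendedItem = Tuple[str, int]


def megatransactions(
    db: Dict[int, Set[str]], w: int
) -> List[Tuple[int, FrozenSet[ExtendedItem]]]:
    """Return one (start interval, megatransaction) pair per sliding window.

    db maps an integer interval d to the items of the transaction at d.
    A window of w intervals is opened at every interval that holds a
    transaction (an assumption about how windows are placed).
    """
    result = []
    for d0 in sorted(db):
        mega = frozenset(
            (item, j)
            for j in range(w)               # subwindow numbers 0 .. w-1
            for item in db.get(d0 + j, ())  # empty intervals contribute nothing
        )
        result.append((d0, mega))
    return result


# Hypothetical toy database: interval index -> items.
toy_db = {0: {"a", "b"}, 1: {"c"}, 3: {"a"}}
windows = megatransactions(toy_db, w=2)
```

With `w=2`, the window starting at interval 0 combines the items of intervals 0 and 1, so item `c` appears as the extended-item `("c", 1)`.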

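11 PROBLEM DESCRIPTION
Definition 4.
- An intratransaction itemset is a set of items A ⊆ Σ.
- An intertransaction itemset is a set of extended-items B ⊆ Σ' such that ∃ e_i(0) ∈ B, 1 ≤ i ≤ u, where Σ' denotes the set of all possible extended-items.
Definition 5.
- An intertransaction association rule is an implication of the form X ⇒ Y, where X ⊆ Σ', Y ⊆ Σ', X ∩ Y = ∅, ∃ e_i(0) ∈ X (1 ≤ i ≤ u), and ∃ e_i(j) ∈ Y (1 ≤ i ≤ u, j ≠ 0).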

12 PROBLEM DESCRIPTION
Definition 6.
- Let S be the number of transactions in the transaction database.
- Let T_xy be the set of megatransactions that contain a set of extended-items X ∪ Y and T_x be the set of megatransactions that contain X.
- Then, the support and confidence of an intertransaction association rule X ⇒ Y are defined as support = |T_xy| / S and confidence = |T_xy| / |T_x|.
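
As a rough illustration of Definition 6, the sketch below counts the megatransactions that contain X ∪ Y and those that contain X. It assumes the usual ratios, support = |T_xy| / S and confidence = |T_xy| / |T_x|, and it represents an extended-item as an `(item, subwindow number)` pair. The function name and the toy data are hypothetical.

```python
from typing import FrozenSet, List, Tuple

ExtendedItem = Tuple[str, int]  # (item, subwindow number)


def support_confidence(
    megas: List[FrozenSet[ExtendedItem]],
    x: FrozenSet[ExtendedItem],
    y: FrozenSet[ExtendedItem],
    num_transactions: int,
) -> Tuple[float, float]:
    """Support and confidence of the rule X => Y over the megatransactions."""
    xy = x | y
    t_xy = sum(1 for m in megas if xy <= m)  # megatransactions containing X and Y
    t_x = sum(1 for m in megas if x <= m)    # megatransactions containing X
    support = t_xy / num_transactions
    confidence = t_xy / t_x if t_x else 0.0  # avoid division by zero
    return support, confidence


# Hypothetical megatransactions, one per transaction (S = 4).
toy_megas = [
    frozenset({("a", 0), ("b", 1)}),
    frozenset({("a", 0), ("c", 1)}),
    frozenset({("b", 0)}),
    frozenset({("a", 0), ("b", 1), ("c", 0)}),
]
sup, conf = support_confidence(
    toy_megas, frozenset({("a", 0)}), frozenset({("b", 1)}), num_transactions=4
)
```

Here two of the four megatransactions contain both `("a", 0)` and `("b", 1)`, and three contain `("a", 0)`, so the support is 2/4 and the confidence is 2/3.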

13 PRINCIPLES OF INTERTRANSACTION ASSOCIATION MINING METHODS
The mining problem is decomposed into two subproblems:
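
The slide does not list the two subproblems. Assuming the usual split for association rule mining (first find the frequent itemsets, then derive rules from them), the second step could look like the sketch below. The filter on subwindow numbers (the antecedent must hold an item at subwindow 0, the consequent an item at a later subwindow) reflects one reading of the rule definition and is an assumption, as are the function name and the toy counts.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

ExtendedItem = Tuple[str, int]  # (item, subwindow number)
Itemset = FrozenSet[ExtendedItem]


def generate_rules(
    support_count: Dict[Itemset, int], minconf: float
) -> List[Tuple[Itemset, Itemset, float]]:
    """Derive rules X => F - X from frequent itemsets F and their counts."""
    rules = []
    for f, count_f in support_count.items():
        for size in range(1, len(f)):
            for x_items in combinations(sorted(f), size):
                x = frozenset(x_items)
                y = f - x
                if x not in support_count:
                    continue
                # Assumed validity conditions for an intertransaction rule.
                if not any(j == 0 for _, j in x):
                    continue
                if not any(j != 0 for _, j in y):
                    continue
                confidence = count_f / support_count[x]
                if confidence >= minconf:
                    rules.append((x, y, confidence))
    return rules


# Hypothetical support counts of frequent intertransaction itemsets.
toy_counts = {
    frozenset({("a", 0)}): 4,
    frozenset({("b", 1)}): 3,
    frozenset({("a", 0), ("b", 1)}): 3,
}
rules = generate_rules(toy_counts, minconf=0.7)
```

On the toy counts, only the rule with antecedent `("a", 0)` and consequent `("b", 1)` survives, with confidence 3/4.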

14 THE FITI ALGORITHM
FITI consists of the following three phases.

15 Phase 1 : Mining and Storing Frequent Intratransaction Itemsets
- Frequent intratransaction itemsets are first mined using the Apriori algorithm
- They are stored in a Frequent-Itemsets Linked Table (FILT)
  - ItemSet Hash Table
    1. Lookup Links
    2. Generator and Extension Links
    3. Subset Links
    4. Descendant Links
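
The mining step of Phase 1 can be sketched with a minimal, unoptimized version of Apriori. The sketch keeps the frequent itemsets in a plain dictionary keyed by itemset; it does not model the FILT link structure, whose details are not given on the slide. The function name and the toy transactions are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Itemset = FrozenSet[str]


def apriori(transactions: List[Set[str]], min_count: int) -> Dict[Itemset, int]:
    """Return every itemset contained in at least min_count transactions."""

    def count_frequent(candidates: Set[Itemset]) -> Dict[Itemset, int]:
        counts = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_count:
                counts[cand] = n
        return counts

    singletons = {frozenset([item]) for t in transactions for item in t}
    level = count_frequent(singletons)
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = count_frequent(candidates)
        frequent.update(level)
        k += 1
    return frequent


# Hypothetical toy transactions.
toy_transactions = [
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
]
frequent_sets = apriori(toy_transactions, min_count=3)
```

With a minimum count of 3, all single items and all pairs are frequent, while the triple `{a, b, c}` occurs only twice and is dropped.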

16 Phase 1 : Mining and Storing Frequent Intratransaction Itemsets

17 Phase 2 : Database Transformation
In this phase, FITI transforms the database into a set of encoded Frequent-Itemset Tables (FIT tables).
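
The slide gives no detail on the encoding. One plausible reading, sketched below, is that each frequent intratransaction itemset receives an integer ID and that each transaction is replaced by the IDs of the frequent itemsets it contains, with one table per itemset size. The ID numbering, the table layout, and the function name `encode_fit` are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Itemset = FrozenSet[str]


def encode_fit(
    db: Dict[int, Set[str]], frequent: List[Itemset]
) -> Tuple[Dict[Itemset, int], Dict[int, Dict[int, List[int]]]]:
    """Encode transactions as IDs of the frequent itemsets they contain.

    Returns (ids, tables), where tables[k][d] lists the IDs of the
    frequent k-itemsets contained in the transaction at interval d.
    """
    ordered = sorted(frequent, key=lambda s: (len(s), sorted(s)))
    ids = {itemset: n for n, itemset in enumerate(ordered, start=1)}
    tables: Dict[int, Dict[int, List[int]]] = {}
    for itemset, itemset_id in ids.items():
        table = tables.setdefault(len(itemset), {})
        for d, items in db.items():
            if itemset <= items:
                table.setdefault(d, []).append(itemset_id)
    return ids, tables


# Hypothetical toy database and frequent itemsets.
toy_db = {0: {"a", "b"}, 1: {"a"}, 2: {"b", "c"}}
toy_frequent = [frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]
ids, fit_tables = encode_fit(toy_db, toy_frequent)
```

After this step, later phases can work on short lists of integer IDs instead of re-scanning the raw item sets.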

19 Phase 3 : Mining Frequent Intertransaction Itemsets

20 PERFORMANCE STUDY
- Generation of Synthetic Data
- Relative Performance
- Effect of Increasing Maxspan
- Effect of Increasing the Number of Transactions
- Effect of Increasing the Average Transaction Size
- Experiment on Real-Life Data

21 Generation of Synthetic Data
Table 4: Parameter Settings

22 Relative Performance
Compare the performance of EH-Apriori and FITI
- vary the support level on data sets one and two
- maxspan equal to four and six

23 Relative Performance

24 Effect of Increasing Maxspan
Vary maxspan from two to eight for both data sets.

25 Effect of Increasing the Number of Transactions
Vary the number of transactions in data set one from 10k to 1,000k: running time increases linearly.

26 Effect of Increasing the Average Transaction Size
Increase the average transaction size of data set one from five to 20.

27 Experiment on Real-Life Data

28 CONCLUSION
- Introduced the new problem of mining intertransaction association rules and provided an extensive study of it.
- EH-Apriori and FITI
  - FITI proves to be much faster than EH-Apriori
  - Performance is acceptable for real-life applications
- Useful in providing prediction capability along a single dimension.

