Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Keng-Wei Chang Author : Anthony K.H. Tung Hongjun Lu Jiawei Han Ling Feng 國立雲林科技大學 National.

Presentation transcript:

Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Efficient Mining of Intertransaction Association Rules
Authors: Anthony K.H. Tung, Hongjun Lu, Jiawei Han, Ling Feng
IEEE Transactions on Knowledge and Data Engineering, Vol. 15, No. 1, January/February 2003
Advisor: Dr. Hsu
Graduate: Keng-Wei Chang

Outline
- Motivation
- Objective
- Introduction
- Problem Description
- Principles of Intertransaction Association Mining Methods
- The FITI Algorithm
- Performance Study
- Conclusion
- Personal Opinion
N.Y.U.S.T. I.M.

Intratransaction vs. Intertransaction
Intratransaction
- same transaction
- same customer
- same day
Intertransaction
- different transactions

Motivation
Most previous studies on mining association rules address intratransaction associations, where all items belong to the
- same transaction
- same customer
- same day

Objective
We break the barrier of transactions and extend the scope of association rule mining from traditional single-dimensional, intratransaction associations to multidimensional, intertransaction associations.

Introduction
Most of the previous studies … There is an important form of association rule that is useful but cannot be discovered with the existing association rule mining framework.

Introduction
Example: stock prices
- R1: When the prices of IBM and SUN go up, the price of Microsoft goes up on the same day with 80 percent probability.
- R2: If the prices of IBM and SUN go up, Microsoft's price will most likely (80 percent probability) go up the next day and then drop four days later.
R1 is a traditional intratransaction rule. R2 relates items from different days, so it is an intertransaction rule.

Introduction
Mining intertransaction association rules is different from mining sequential patterns in transaction data.
Contributions
- Introduces an association mining problem that is more general than what has been discussed in the literature.
- Develops an efficient algorithm, FITI (First Intra Then Inter), for mining intertransaction association rules from large databases.

Problem Description
Definition 1.
- Let $\Sigma = \{i_1, i_2, \dots, i_u\}$ be a set of literals, called items.
- Let D be an attribute and Dom(D) be the domain of D.
- A transaction database T is a database containing transactions in the form of (d, E), where $d \in Dom(D)$ and $E \subseteq \Sigma$.
Definition 2.
- A sliding window W in a transaction database T is a block of w continuous intervals along domain D, starting from interval $d_0$ such that T contains a transaction at interval $d_0$.
- Each interval $d_j$ in W is called a subwindow of W, denoted as W[j], where $j = d_j - d_0$.
- We call j the subwindow number of $d_j$ within W.

Problem Description
Definition 3.
- Let W be a sliding window with w intervals and u be the number of literals in $\Sigma$.
- We define a megatransaction M contained within W to be $M = \{\, i_k(j) \mid i_k \in W[j],\ 1 \le k \le u,\ 0 \le j \le w - 1 \,\}$.
- To distinguish the items in a megatransaction from the items in a traditional transaction, the items in a megatransaction are called extended-items.
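A minimal Python sketch of the sliding window and megatransaction definitions may make them more concrete. It assumes the intervals are integers (for example, day numbers) and models an extended-item as a pair (item, subwindow number). The function name `build_megatransactions` and the toy database are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumption: an extended-item i_k(j) is modelled as the pair (item, j).
ExtendedItem = Tuple[str, int]


def build_megatransactions(
    db: Dict[int, Set[str]], w: int
) -> List[FrozenSet[ExtendedItem]]:
    """Return one megatransaction per sliding window of w intervals.

    db maps an interval d (e.g. a day number) to the item set E seen at d.
    A window starts only at an interval that actually holds a transaction.
    """
    megatransactions = []
    for d0 in sorted(db):
        mega = set()
        for j in range(w):  # j is the subwindow number, 0 .. w-1
            for item in db.get(d0 + j, set()):
                mega.add((item, j))
        megatransactions.append(frozenset(mega))
    return megatransactions


# Hypothetical toy database: day -> items bought on that day.
toy_db = {0: {"a", "b"}, 1: {"c"}, 3: {"a"}}
megas = build_megatransactions(toy_db, w=3)
for d0, mega in zip(sorted(toy_db), megas):
    print(d0, sorted(mega))
```

With w = 3, the window starting at day 0 yields the megatransaction {a(0), b(0), c(1)}, and the window starting at day 1 yields {c(0), a(2)} because day 2 holds no transaction.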

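Problem Description
Let $\Sigma'$ denote the set of all possible extended-items.
Definition 4.
- An intratransaction itemset is a set of items $A \subseteq \Sigma$.
- An intertransaction itemset is a set of extended-items $B \subseteq \Sigma'$ such that $\exists\, i_k(0) \in B,\ 1 \le k \le u$.
Definition 5.
- An intertransaction association rule is an implication of the form $X \Rightarrow Y$, where
  1. $X \subseteq \Sigma'$, $Y \subseteq \Sigma'$
  2. $\exists\, i_k(0) \in X,\ 1 \le k \le u$
  3. $\exists\, i_k(j) \in Y,\ 1 \le k \le u,\ j \ne 0$
  4. $X \cap Y = \emptyset$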

Problem Description
Definition 6.
- Let S be the number of transactions in the transaction database.
- Let $T_{xy}$ be the set of megatransactions that contain a set of extended-items $X \cup Y$ and $T_x$ be the set of megatransactions that contain X.
- Then, the support and confidence of an intertransaction association rule $X \Rightarrow Y$ are defined as $support = |T_{xy}| / S$ and $confidence = |T_{xy}| / |T_x|$.
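The support and confidence of a rule can be computed directly from a list of megatransactions. The sketch below is an illustration under assumptions: it counts the megatransactions containing X ∪ Y and those containing X, and it takes S to be the number of megatransactions, which matches the number of transactions when each window starts at a transaction. The function name and the stock-style data are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Assumption: an extended-item i_k(j) is modelled as the pair (item, j).
ExtendedItem = Tuple[str, int]


def support_and_confidence(
    megas: List[FrozenSet[ExtendedItem]],
    x: Iterable[ExtendedItem],
    y: Iterable[ExtendedItem],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule X => Y over megas."""
    x_set = frozenset(x)
    xy_set = x_set | frozenset(y)
    s = len(megas)  # one megatransaction per transaction
    t_xy = sum(1 for m in megas if xy_set <= m)  # windows containing X and Y
    t_x = sum(1 for m in megas if x_set <= m)    # windows containing X
    support = t_xy / s if s else 0.0
    confidence = t_xy / t_x if t_x else 0.0
    return support, confidence


# Hypothetical megatransactions with a window of two days.
example_megas = [
    frozenset({("IBM_up", 0), ("SUN_up", 0), ("MSFT_up", 1)}),
    frozenset({("IBM_up", 0), ("SUN_up", 0), ("MSFT_up", 1)}),
    frozenset({("IBM_up", 0), ("SUN_up", 0)}),
    frozenset({("MSFT_up", 0)}),
]
rule_x = {("IBM_up", 0), ("SUN_up", 0)}
rule_y = {("MSFT_up", 1)}
support, confidence = support_and_confidence(example_megas, rule_x, rule_y)
print(support, confidence)
```

Here two of the four windows contain both sides of the rule and three contain X, so the support is 2/4 and the confidence is 2/3.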

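Principles of Intertransaction Association Mining Methods
The problem is decomposed into two subproblems:
1. Find all intertransaction itemsets whose support is at least the minimum support (the frequent intertransaction itemsets).
2. Derive from those itemsets the intertransaction association rules that satisfy the minimum confidence.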

The FITI Algorithm
FITI consists of the following three phases.

Phase 1: Mining and Storing Frequent Intratransaction Itemsets
Frequent intratransaction itemsets are first mined using the Apriori algorithm and then stored in a Frequent-Itemsets Linked Table (FILT).
FILT
- ItemSet Hash Table
  1. Lookup Links
  2. Generator and Extension Links
  3. Subset Links
  4. Descendant Links

Phase 1: Mining and Storing Frequent Intratransaction Itemsets (continued)
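The Apriori step of Phase 1 can be sketched as follows. This is a plain, unoptimized level-wise Apriori over ordinary transactions, given only as an illustration: it returns a dictionary of frequent itemsets with their counts and does not build the FILT structure or any of its links. The function name and the sample transactions are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(
    transactions: List[Set[str]], minsup_count: int
) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least minsup_count transactions."""
    # Level 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= minsup_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= minsup_count}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical sample data.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_itemsets = apriori(sample, minsup_count=3)
for itemset, count in sorted(frequent_itemsets.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

In this sample every single item and every pair is frequent, while {a, b, c} appears in only two transactions and is dropped at level 3.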

Phase 2: Database Transformation
FITI transforms the database into a set of encoded Frequent-Itemset Tables (FIT tables).
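One way to picture this transformation is sketched below. It is a simplified illustration rather than the paper's exact data layout: each frequent intratransaction itemset gets an integer ID, and for every itemset size k a table maps each interval to the IDs of the frequent k-itemsets contained in that interval's transaction. The function name `encode_database`, the ID assignment order, and the sample data are all assumptions.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def encode_database(
    db: Dict[int, Set[str]], frequent: List[FrozenSet[str]]
) -> Tuple[Dict[FrozenSet[str], int], Dict[int, Dict[int, Set[int]]]]:
    """Encode each transaction as the IDs of the frequent itemsets it contains.

    Returns (ids, fit) where fit[k][d] is the set of IDs of frequent
    k-itemsets found in the transaction at interval d.
    """
    # Assumption: IDs are assigned by itemset size, then alphabetically.
    ordered = sorted(frequent, key=lambda s: (len(s), sorted(s)))
    ids = {itemset: i for i, itemset in enumerate(ordered, start=1)}
    max_k = max((len(s) for s in frequent), default=0)
    fit: Dict[int, Dict[int, Set[int]]] = {k: {} for k in range(1, max_k + 1)}
    for d, items in db.items():
        for itemset, itemset_id in ids.items():
            if itemset <= items:
                fit[len(itemset)].setdefault(d, set()).add(itemset_id)
    return ids, fit


# Hypothetical sample data; "x" is infrequent and disappears after encoding.
sample_db = {0: {"a", "b", "c"}, 1: {"a"}, 2: {"b", "c", "x"}}
sample_frequent = [
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"c"}),
    frozenset({"b", "c"}),
]
ids, fit = encode_database(sample_db, sample_frequent)
print(fit[1])
print(fit[2])
```

After encoding, later phases can work with small integer IDs instead of raw item sets, and infrequent items no longer appear in the data at all.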

Phase 3: Mining Frequent Intertransaction Itemsets

Performance Study
- Generation of Synthetic Data
- Relative Performance
- Effect of Increasing Maxspan
- Effect of Increasing the Number of Transactions
- Effect of Increasing the Average Transaction Size
- Experiment on Real-Life Data

Generation of Synthetic Data
Table 4: Parameter Settings

Relative Performance
Compare the performance of EH-Apriori and FITI:
- vary the support level on data sets one and two
- set maxspan to four and six

Relative Performance (continued)

Effect of Increasing Maxspan
Vary maxspan from two to eight for both data sets.

Effect of Increasing the Number of Transactions
Vary the number of transactions in data set one from 10k to 1,000k; the running time increases linearly.

Effect of Increasing the Average Transaction Size
Increase the average transaction size of data set one from five to 20.

Experiment on Real-Life Data

Conclusion
The paper introduces the new problem of mining intertransaction association rules and provides an extensive study of it.
EH-Apriori and FITI
- FITI proves to be much faster than EH-Apriori.
- Performance is acceptable for real-life applications.
Intertransaction association rules are useful in providing prediction capability along a single dimension.