Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Chien-Shing Chen Author : Jessica K. Ting Michael K. Ng Hongqiang Rong Joshua Z. Huang 國立雲林科技大學.

Presentation transcript:

Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Chien-Shing Chen Author : Jessica K. Ting Michael K. Ng Hongqiang Rong Joshua Z. Huang 國立雲林科技大學 National Yunlin University of Science and Technology Statistical Models for Time Sequences Data Mining

Outline: Motivation; Objective; Introduction; Autoregression, Autocorrelation, Autocovariance; Non-adaptive and Adaptive Statistical Models; Experimental Results; Conclusions; Personal Opinion; Review. N.Y.U.S.T. I.M.

Motivation: Methods such as the DFT (discrete Fourier transform) can only handle “whole matching”. Similarity between two time sequences is usually defined by the similarity of their curve shapes, but comparing time sequences visually is too difficult.

Objective: The coefficients of AR models fitted to subsequences can be used as features to index those subsequences, which facilitates queries for subsequences with similar behavior. We can apply a clustering algorithm to the coefficients to cluster time sequences, and we can also use the AR models to predict near-future values.

1-1. Introduction: Similarity between two time sequences is usually defined by the similarity of their curve shapes. Our approach is to slide windows of different sizes over a time sequence and build autoregression models from the subsequences in the different windows.

1-2. Introduction

2-1. Autoregression Models: $Y_t = \hat{a}\,Y_{t-1} + \varepsilon_t$

2-2. Autoregression Models. Simple regression: $Y_t = \alpha + \beta X_t + \varepsilon_t$. Autoregression: $Y_t = \hat{a}\,Y_{t-1} + \varepsilon_t$.
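To make the first-order autoregression concrete, here is a minimal sketch that estimates the single coefficient by ordinary least squares, regressing each value on its predecessor. The function name `fit_ar1` is hypothetical, and least squares is assumed as the estimator; the slides do not say which estimator the authors use.

```python
def fit_ar1(y):
    """Least-squares estimate of a in y[t] = a * y[t-1] + e[t].

    Assumption: the series is already zero-mean, so no intercept is fitted.
    """
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    if den == 0:
        raise ValueError("series has no variation to regress on")
    return num / den


# A noise-free series in which each value is half of the previous one.
series = [1.0, 0.5, 0.25, 0.125, 0.0625]
a_hat = fit_ar1(series)
```

For this noise-free series the estimate recovers the generating coefficient 0.5; with real data the residual term absorbs the remaining variation.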

2-3. Autocovariance
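As an illustration of the autocovariance function, the sketch below computes the sample autocovariance at lags 0 through `max_lag`. It assumes the common biased estimator that divides by the series length n; the normalization used on the original slide is not recoverable, and the function name is hypothetical.

```python
def autocovariance(y, max_lag):
    """Sample autocovariance c_k for k = 0..max_lag.

    Assumption: biased estimator, c_k = (1/n) * sum_t (y_t - mean)(y_{t+k} - mean).
    """
    n = len(y)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(y) / n
    return [
        sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


acov = autocovariance([1.0, 2.0, 3.0, 4.0], 2)
```

Dividing each value by the lag-0 value gives the autocorrelation function, which is why the two are often listed together.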

2-4. No false dismissals: We need to show that this indexing method guarantees no “false dismissals”. If the distance between the representations of x and y in the index space never exceeds the distance between x and y themselves, then the method can guarantee no “false dismissals”.

2-5. Distance between two series: Let $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ be two data sequences with zero mean and 2-norm equal to 1. Here x and y must be of exactly the same length.

2-6. Distance between two series: Let $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_m)$ be two data sequences with zero mean and 2-norm equal to 1. Here $m \neq n$ and we assume $m \geq n$. Let $V_i = (y_i, \ldots, y_{i+n-1})$ for $1 \leq i \leq m-n+1$.
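The windows V_i over the longer sequence can be sketched as follows. This example assumes the distance between sequences of different lengths is the smallest Euclidean distance between the shorter sequence and any length-n window of the longer one; the slide does not state the aggregation rule, so the minimum is an assumption, as are the function names. Whether each window is rescaled before comparison is also not stated and is left out here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sliding_window_distance(x, y):
    """Smallest distance between x (length n) and any window of y (length m >= n).

    Assumption: the minimum over windows is used as the distance.
    """
    n, m = len(x), len(y)
    if m < n:
        raise ValueError("y must be at least as long as x")
    # Windows V_i = y[i : i + n] for i = 0 .. m - n (0-based indexing).
    return min(euclidean(x, y[i:i + n]) for i in range(m - n + 1))


exact_match = sliding_window_distance([1.0, 2.0], [0.0, 1.0, 2.0, 5.0])
no_match = sliding_window_distance([0.0, 0.0], [3.0, 4.0, 0.0])
```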

2-7. Versus

3-1. Indexing the time sequence: We start by indexing the time sequence $X = (x_1, x_2, \ldots, x_n)$. We rescale it to zero mean and 2-norm equal to 1. Then we fit AR models of increasing order, starting from the first order, until the rate of decrease of the modelling error is less than a specified tolerance.
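The rescaling and order-selection steps can be sketched as below. This is a hedged illustration, not the authors' implementation: it assumes the AR coefficients are obtained from autocovariance values with the Levinson-Durbin recursion, and that the "rate of decrease" is the relative drop in prediction error when the order grows by one. The names `normalize` and `select_ar_order` are hypothetical.

```python
import math


def normalize(x):
    """Rescale a sequence to zero mean and 2-norm equal to 1."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0:
        raise ValueError("a constant sequence cannot be rescaled")
    return [v / norm for v in centered]


def select_ar_order(r, max_order, tol):
    """Grow the AR order until the relative error decrease falls below tol.

    r holds autocovariance values r[0..max_order].
    Assumption: Levinson-Durbin recursion; the relative decrease in error
    from order k-1 to order k equals the squared reflection coefficient.
    Returns (coefficients, final prediction error).
    """
    coeffs = []  # coeffs[j] multiplies y[t - 1 - j]
    err = r[0]
    for k in range(1, max_order + 1):
        if err <= 0:
            break
        acc = r[k] - sum(coeffs[i] * r[k - 1 - i] for i in range(k - 1))
        refl = acc / err
        if refl * refl < tol:
            break  # the extra order no longer reduces the error enough
        coeffs = [coeffs[i] - refl * coeffs[k - 2 - i] for i in range(k - 1)] + [refl]
        err *= 1 - refl * refl
    return coeffs, err


unit = normalize([1.0, 2.0, 3.0])
# Autocovariances of an ideal first-order process with coefficient 0.5.
coeffs, err = select_ar_order([1.0, 0.5, 0.25, 0.125], max_order=3, tol=1e-6)
```

With these idealized autocovariances the recursion stops at order 1, because adding a second coefficient does not reduce the error at all.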


3-2. Non-adaptive Statistical Model

3-3. Adaptive Statistical Model: The adaptive statistical model is a modification of the non-adaptive model.

3-4. Non-adaptive Statistical Model

3-5. Adaptive Statistical Model

3-6. Conception: From this example, we find that extracting features from subsequences with the adaptive statistical model is a good idea, since it lets us notice changes of the model across the whole time sequence. The idea of the adaptive statistical model is easy to understand and is similar to that of the non-adaptive model.

4-1. AR models to predict near-future values: Non-adaptive Statistical Model; Adaptive Statistical Model.

4-2. Prediction: If the order of the AR model is unchanged and the distance between the features is not large (within tolerance), then we can keep using the AR model and the autocovariance function for the new data subsequence, and we continue adding data points to test the AR model. $Y[1:w] = (y_1, y_2, y_3, \ldots, y_w)$; $Y[1:w+d] = (y_1, y_2, y_3, \ldots, y_w, y_{w+1}, y_{w+2}, \ldots, y_{w+d})$.
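Once an AR model has been fitted to a window, predicting the next d values amounts to applying the recursion repeatedly, feeding each prediction back in as input. The sketch below assumes a zero-mean series and coefficients ordered so that the first one multiplies the most recent value; the function name `forecast` is hypothetical.

```python
def forecast(history, coeffs, steps):
    """Predict the next `steps` values with an AR model.

    Assumption: y[t] = coeffs[0] * y[t-1] + coeffs[1] * y[t-2] + ...,
    with the noise term replaced by its expected value of zero.
    """
    order = len(coeffs)
    if len(history) < order:
        raise ValueError("history is shorter than the model order")
    values = list(history)
    predictions = []
    for _ in range(steps):
        nxt = sum(coeffs[j] * values[-1 - j] for j in range(order))
        values.append(nxt)  # feed the prediction back in for the next step
        predictions.append(nxt)
    return predictions


ahead = forecast([1.0, 0.5], [0.5], steps=2)
```

Because later predictions are built on earlier ones, errors accumulate with the horizon, which fits the slides' focus on near-future values.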


5-1. Experimental Results: 1. Statistical Models versus Fourier Transforms. 2. Adaptive versus Non-adaptive Models.

5-1. Statistical Models versus Fourier Transforms: We compare the performance of the two methodologies in clustering. 1. Generate several sets of time sequences of known classes. 2. Calculate the autocovariance function values and Fourier coefficients for each time sequence. 3. Use the autocovariance function values and the magnitudes of the Fourier coefficients as feature vectors for classification with a clustering algorithm. 4. Compare the clustering results with the originally known classes, and calculate the classification accuracy.
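For the Fourier side of the comparison, the feature vector is built from the magnitudes of the Fourier coefficients. The sketch below computes them with a direct discrete Fourier transform using only the standard library; the function name and the choice to keep the first `num_coeffs` coefficients are assumptions, since the slides do not say how many coefficients were kept.

```python
import cmath


def dft_magnitudes(y, num_coeffs):
    """Magnitudes of the first num_coeffs DFT coefficients of y.

    Direct O(n * num_coeffs) evaluation; an FFT library would be used
    for long sequences.
    """
    n = len(y)
    return [
        abs(sum(y[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(num_coeffs)
    ]


constant_features = dft_magnitudes([1.0, 1.0, 1.0, 1.0], 4)
impulse_features = dft_magnitudes([1.0, 0.0, 0.0, 0.0], 4)
```

A constant sequence puts all of its energy in coefficient 0, while a single impulse spreads it evenly over all coefficients, so the two feature vectors are easy to tell apart.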

5-2. Statistical Models versus Fourier Transforms: In real applications, many time sequences look like cosine (or sinusoidal) curves. The test sequences are therefore sums of cosine curves plus noise: M is the number of cosine curves, each curve has an adjusted frequency component and an associated amplitude $A_i$, and the final term is a noise function.
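A generator for such test sequences might look like the sketch below. The exact formula did not survive in the transcript, so this assumes the form y(t) = sum over i of A_i * cos(2 * pi * f_i * t) + noise(t), with frequencies in cycles per sample and Gaussian noise; the function name, the 2*pi scaling, the absence of phase terms, and the noise distribution are all assumptions.

```python
import math
import random


def synthetic_sequence(length, amplitudes, frequencies, noise_std=0.0, seed=0):
    """Sum of M cosine curves plus noise (assumed form, see the note above).

    amplitudes[i] and frequencies[i] describe the i-th cosine curve.
    """
    if len(amplitudes) != len(frequencies):
        raise ValueError("need one amplitude per frequency")
    rng = random.Random(seed)  # seeded so that experiments are repeatable
    sequence = []
    for t in range(length):
        signal = sum(
            a * math.cos(2 * math.pi * f * t)
            for a, f in zip(amplitudes, frequencies)
        )
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        sequence.append(signal + noise)
    return sequence


clean = synthetic_sequence(4, amplitudes=[2.0], frequencies=[0.25])
noisy = synthetic_sequence(4, amplitudes=[2.0], frequencies=[0.25], noise_std=0.1, seed=1)
```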

5-3. Adjusted frequency: Given the frequency perturbation level and the frequency component, the adjusted frequency component is computed from these two quantities.

5-4. Result

6-1. Prediction by Adaptive versus Non-adaptive Models

6-2. Experimental data: All of these time sequences are of length 750, which is approximately three years of trading days. Stock prices of Guangdong Investment Ltd.; stock prices of Great Eagle Holdings Ltd.; stock prices of Wheelock Co. Ltd.

From the first 750 data points, predict the five data points 751, …, 755.


6-3. Explanation: great changes at points 74, 354, 501, and 741.


Not truly adaptive.

7. Conclusion: 1. Calculating the autocovariance functions and AR models is computationally efficient, so the method can handle very large data volumes. 2. Prediction capability. 3. Short indices.

8. Personal Opinion: This method can be used in our lab's work, e.g., classification, clustering, ……

9. Review: Time sequences; subsequences; AR model: autocovariance, autocorrelation; non-adaptive and adaptive statistical models; prediction; statistical models vs. DFT; adaptive vs. non-adaptive.