Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 2007.SIGIR.8 New Event Detection Based on Indexing-tree.

Slides:



Advertisements
Similar presentations
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A 24-h forecast of solar irradiance using artificial neural.
Advertisements

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Fast exact k nearest neighbors search using an orthogonal search tree Presenter : Chun-Ping Wu Authors.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 On-line Learning of Sequence Data Based on Self-Organizing.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Graph self-organizing maps for cyclic and unbounded graphs.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Comparison of neural network models with ARIMA and regression models for prediction of Houston's daily.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A Comparison of SOM Based Document Categorization Systems.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 The k-means range algorithm for personalized data clustering.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Web usage mining: extracting unexpected periods from web.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Mining Positive and Negative Patterns for Relevance Feature.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A Comprehensive Comparison Study of Document Clustering.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 2008.NN.10 Modeling propagation delays in the development.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Finding Terminology Translations From Hyperlinks On the.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Topology Preservation in Self-Organizing Feature Maps: Exact.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. A quantitative stock prediction system based on financial news Presenter : Chun-Jung Shih Authors :Robert.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology SIGIR1 Improving Web Search Results Using Affinity Graph.
Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Chien-Ming Hsiao Author : Bing Liu Yiyuan Xia Philp S. Yu 國立雲林科技大學 National Yunlin University.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 An Empirical Study of Learning from Imbalanced Data Using.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. A semantic similarity metric combining features and intrinsic information content Presenter: Chun-Ping.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Automatic Recommendations for E-Learning Personalization.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. An IPC-based vector space model for patent retrieval Presenter: Jun-Yi Wu Authors: Yen-Liang Chen, Yu-Ting.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Concept similarity in Formal Concept Analysis-An information.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 GMDH-based feature ranking and selection for improved.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A Plagiarism Detection Technique for Java Program Using.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A k-mean clustering algorithm for mixed numeric and categorical.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Development of a reading material recommendation system based on a knowledge engineering approach Presenter.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Extensions of vector quantization for incremental clustering.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author: Manoranjan.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A Study on Automatic Recognition of Road Signs Presenter.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Utilizing Marginal Net Utility for Recommendation in E-commerce.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author: Chung-hung.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. The application of SOM as a decision support tool to identify AACSB peer schools Presenter : Chun-Ping.
Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Chien-Shing Chen Author : Juan D.Velasquez Richard Weber Hiroshi Yasuda 國立雲林科技大學 National.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Fraud detection in online consumer reviews Presenter: Tsai Tzung Ruei Authors: Nan Hu, Ling Liu, Vallabh.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Rival-Model Penalized Self-Organizing Map Yiu-ming Cheung.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Extending the Growing Hierarchal SOM for Clustering Documents.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Iterative Translation Disambiguation for Cross-Language.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Unsupervised word sense disambiguation for Korean through the acyclic weighted digraph using corpus and.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. 1 Visualization of multi-algorithm clustering for better economic decisions - The case of car pricing.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Information Loss of the Mahalanobis Distance in High Dimensions-
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Mining massive document collections by the WEBSOM method Presenter : Yu-hui Huang Authors :Krista Lagus,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 An initialization method to simultaneously find initial.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Improving the performance of personal name disambiguation.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Validity index for clusters of different sizes and densities Presenter: Jun-Yi Wu Authors: Krista Rizman.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A new data clustering approach- Generalized cellular automata.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Cost- sensitive boosting for classification of imbalanced.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A hierarchical clustering algorithm for categorical sequence.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Direct mining of discriminative patterns for classifying.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 TIARA: A Visual Exploratory Text Analytic System Presenter.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Self Organizing Maps and Bit Signature: a study applied.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Modeling Semantic Similarities in Multiple Maps Presenter.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Towards comprehensive support for organizational mining Presenter : Yu-hui Huang Authors : Minseok Song,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author: Wei Xu,
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Predicting corporate bankruptcy using a self-organizing map: An empirical study to improve the forecasting.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology ACM SIGMOD1 Subsequence Matching on Structured Time Series.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Hierarchical model-based clustering of large datasets.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Text Classification Improved through Multigram Models.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Growing Hierarchical Tree SOM: An unsupervised neural.
Intelligent Database Systems Lab Advisor : Dr. Hsu Graduate : Yu Cheng Chen Author : Yongqiang Cao Jianhong Wu 國立雲林科技大學 National Yunlin University of Science.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Dual clustering : integrating data clustering over optimization.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Text Classification, Business Intelligence, and Interactivity:
Intelligent Database Systems Lab N.Y.U.S.T. I. M. An Integrated Machine Learning Approach to Stroke Prediction Presenter: Tsai Tzung Ruei Authors: Aditya.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Prediction model building and feature selection with support.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Visualizing social network concepts Presenter : Chun-Ping Wu Authors :Bin Zhu, Stephanie Watts, Hsinchun.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Graduate : Chun Kai Chen Author : Andrew.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. Named Entity Disambiguation by Leveraging Wikipedia Semantic Knowledge Presenter : Jiang-Shan Wang Authors.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Enhancing Text Clustering by Leveraging Wikipedia Semantics.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A support system for predicting eBay end prices Presenter.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A New Cluster Validity Index for Data with Merged Clusters.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 f-information measures in medical image registration Presenter.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Investigating the Effect of Sampling Methods for Imbalanced.
Presentation transcript:

Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 2007.SIGIR.8 New Event Detection Based on Indexing-tree and Named Entity Advisor : Dr. Hsu Presenter : Hsin-Yi Huang Authors : Zhang Kuo, Li Juan Zi, Wu Gang

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 2 Outline Introduction Motivation Objective Basic New Event Detection (NED) Model News Indexing-tree Term reweighting approach Experiment Conclusion Comments

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 3 Introduction Traditional New Event Detection (NED) System NED model News story stream 1.The decision 2.The confidence of the decision 1.Story Representation (1) Preprocessing (2) Term weight calculation 2.Similarity Calculation 3.Detection Procedure (1)S-S type (2)S-C type marriagestormexplodefilmdiet A B C D E

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 4 Motivation How to speed up the detection procedure while do not decrease the detection accuracy? How to make good use of cluster (topic) information to improve accuracy? How to obtain news story representation by better understanding of named entities?

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 5 Objective Efficiency  News Indexing-tree Accuracy  Using of cluster (topic) information  To make use of named entities based on news classification

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 6 Basic NED Model TF-IDF (term frequency–inverse document frequency) Incremental TF-IDF 1 2 …t-1 t

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 7 Basic NED Model (cont.) Similarity Calculation Detection Procedure an new storya old story No Yes

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 8 News Indexing-tree

N.Y.U.S.T. I. M. Intelligent Database Systems Lab Term reweighting approach Base on Distribution Distance 2007/8/15 9

N.Y.U.S.T. I. M. Intelligent Database Systems Lab Term reweighting approach (cont.) Base on Term Type and Story Class 2007/8/15 10

N.Y.U.S.T. I. M. Intelligent Database Systems Lab Term reweighting approach (cont.) 2007/8/15 11

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 12 Experiment Datasets  TDT2 (news story from January to June,1998)  TDT3 (English story from Oct. to Dec.,1998) Evaluation Metric term weight calculateSimilarity CalculationDetection Procedure System-1 incremental TF-IDF Hellinger distance S-S type Ststem-2S-C type Ststem-3 Indexing-tree Ststem-4term distributions Ststem-5 Term Type and Story Class Ststem-6 the other NED systems Ststem-7 Ststem-8

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 13 Experiment (cont.)

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 14 Conclusion To reduce comparing time without hurting NED accuracy. The two extensions contribute to improvement in accuracy. Future work  to collect news set which span for a longer period from internet, and integrate time information in NED task.  to refine cluster granularity to event-level, and identify different events and their relations within a topic

N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15 15 Comments Advantage  More efficient  More accurate Drawback  Ambiguous signs  Too many parameters Application  …