Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Tao Liu, Zheng Chen, Benyu Zhang, Wei-ying Ma, Gongyi Wu 2004.ICDM. Improving Text.

Presentation transcript:

Intelligent Database Systems Lab
Improving Text Classification using Local Latent Semantic Indexing
Authors: Tao Liu, Zheng Chen, Benyu Zhang, Wei-Ying Ma, Gongyi Wu (ICDM 2004)
Presenter: CHANG, SHIH-JIE

Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

Motivation
Global LSI ignores class discrimination. It does nothing to improve the discriminative power between document classes, so it does not yield better classification results.
In local LSI, because of the weighting problem, the improvement in classification performance is very limited.

Objectives
Propose a new local LSI method, Local Relevancy Weighted LSI (LRW-LSI), to solve these problems.

Methodology: Local LSI
Chi-square statistic (QS-CHI): measures the association between a term and the topic.
Mutual Information (QS-MI): measures how important a term is to a topic.
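The two term-topic measures named above can be sketched as follows. The slide gives no formulas, so this uses the common 2x2 contingency-table definitions of the chi-square statistic and pointwise mutual information as an assumption; the function names and the count arguments `a`, `b`, `c`, `d` are hypothetical.

```python
import math

def chi_square(a, b, c, d):
    """Chi-square association between a term and a topic.

    Assumed contingency counts (not from the slides):
      a: in-topic documents containing the term
      b: out-of-topic documents containing the term
      c: in-topic documents without the term
      d: out-of-topic documents without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no measurable association.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom

def mutual_information(a, b, c, d):
    """Pointwise mutual information between a term and a topic (natural log)."""
    n = a + b + c + d
    if a == 0:
        # The term never occurs in the topic.
        return float("-inf")
    return math.log(a * n / ((a + c) * (a + b)))
```

Under these definitions, a term that occurs equally often inside and outside the topic scores 0 on both measures, while a term that occurs only inside the topic scores high on both.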

Methodology: Local Relevancy Weighted LSI
LRW-LSI Training
(1) An initial classifier IC of topic c is used to assign an initial relevancy score (rs) to each training document.
(2) Each training document is weighted.
(3) The top n documents are selected to generate the local term-by-document matrix of topic c.
(4) A truncated SVD is performed to generate the local semantic space.
(5) All other weighted training documents are folded into the new space.
(6) All training documents, represented as local LSI vectors, are used to train a real classifier RC of topic c.
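Steps (2) to (5) of the training procedure above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not say how documents are weighted, so a sigmoid of the relevancy score with hypothetical parameters `a` and `b` is assumed; the relevancy scores `rs` are taken as given from the initial classifier; and the function name `lrw_lsi_fit` is made up for this example.

```python
import numpy as np

def lrw_lsi_fit(X, rs, n_local, k, a=1.0, b=0.0):
    """Build a local LSI space for one topic.

    X       : term-by-document matrix, shape (n_terms, n_docs)
    rs      : relevancy score per document from the initial classifier
    n_local : number of top-scoring documents that define the local region
    k       : number of latent dimensions to keep
    a, b    : parameters of the ASSUMED sigmoid weighting (hypothetical)
    """
    X = np.asarray(X, dtype=float)
    rs = np.asarray(rs, dtype=float)

    # Step 2: weight each document (column) by a sigmoid of its score.
    weights = 1.0 / (1.0 + np.exp(-a * (rs + b)))
    Xw = X * weights

    # Step 3: the top-n documents form the local term-by-document matrix.
    top = np.argsort(-rs)[:n_local]

    # Step 4: truncated SVD of the local matrix.
    U, s, _ = np.linalg.svd(Xw[:, top], full_matrices=False)
    k = min(k, int(np.sum(s > 1e-12)))  # drop numerically zero directions
    Uk, sk = U[:, :k], s[:k]

    # Step 5: fold every weighted document into the local space using the
    # standard fold-in d_hat = inv(Sigma_k) @ Uk.T @ d. For the top-n
    # documents this reproduces their coordinates from the SVD itself.
    Z = (Uk.T @ Xw) / sk[:, None]
    return Uk, sk, Z
```

The columns of `Z` are the local LSI vectors that step (6) would pass to the real classifier RC. A test document would be scored by the initial classifier, weighted, and folded in the same way before classification.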


Experiments

Conclusions
LRW-LSI greatly improves classification performance while using a much smaller dimension than the global LSI and local LSI methods.

Comments
Advantages: LRW-LSI is quite effective.
Applications: text classification.