Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Tao Liu, Zheng Chen, Benyu Zhang, Wei-ying Ma, Gongyi Wu 2004.ICDM. Improving Text.


1 Intelligent Database Systems Lab Improving Text Classification using Local Latent Semantic Indexing. Presenter: CHANG, SHIH-JIE. Authors: Tao Liu, Zheng Chen, Benyu Zhang, Wei-ying Ma, Gongyi Wu. ICDM 2004.

2 Intelligent Database Systems Lab Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Intelligent Database Systems Lab Motivation Global LSI ignores class discrimination. It does nothing to sharpen the distinction between document classes, so it generally yields no improvement in classification. In local LSI, because of the weighting problem, the improvement in classification performance is very limited.

4 Intelligent Database Systems Lab Objectives Propose a new local LSI method, Local Relevancy Weighted LSI (LRW-LSI), to solve these problems.

5 Intelligent Database Systems Lab Methodology - Local LSI χ² statistic (QS-CHI): measures the association between a term and the topic. Mutual Information (QS-MI): measures how important a term is to a topic.
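
The slide names the two term-scoring statistics but does not give their formulas. The following is a minimal sketch that assumes the standard text-categorization definitions computed from a 2x2 term/topic contingency table; the function names and the cell names a, b, c, d are illustrative and do not come from the slides.

```python
import math

# Contingency counts for a term t and a topic c (assumed notation):
#   a = documents in c that contain t
#   b = documents not in c that contain t
#   c = documents in c that do not contain t
#   d = documents not in c that do not contain t

def chi_square(a, b, c, d):
    """Chi-square association between a term and a topic.

    Assumed standard form: N * (ad - cb)^2 / ((a+c)(b+d)(a+b)(c+d)).
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence either way.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom

def mutual_information(a, b, c, d):
    """Pointwise mutual information between a term and a topic.

    Assumed standard form: log(a * N / ((a+c)(a+b))).
    """
    n = a + b + c + d
    if a == 0:
        # The term never occurs in the topic: log(0) is treated as -infinity.
        return float("-inf")
    return math.log(a * n / ((a + c) * (a + b)))
```

Under these definitions, a term spread evenly across topic and non-topic documents scores 0 on both measures, while a term concentrated in the topic scores higher.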

6 Intelligent Database Systems Lab Methodology - Local Relevancy Weighted LSI: LRW-LSI Training
(1) An initial classifier IC of topic c is used to assign an initial relevancy score (rs) to each training document.
(2) Each training document is weighted.
(3) The top n documents are selected to generate the local term-by-document matrix of topic c.
(4) A truncated SVD is performed to generate the local semantic space.
(5) All other weighted training documents are folded into the new space.
(6) All training documents, represented as local LSI vectors, are used to train a real classifier RC of topic c.
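
The training steps above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the slide does not say how documents are weighted, so a sigmoid of the relevancy score with hypothetical parameters a and b is assumed here; the centroid-difference scorer standing in for the initial classifier IC is also hypothetical. Step (6) is left to the caller, who can train any classifier on the returned vectors.

```python
import numpy as np

def initial_scores(X, y):
    """Step (1), hypothetical stand-in for IC: centroid-difference score.

    X: (n_docs, n_terms) document-term matrix; y: boolean topic labels.
    """
    return X @ (X[y].mean(axis=0) - X[~y].mean(axis=0))

def lrw_lsi_fit(X, rs, n_local, k, a=1.0, b=0.0):
    """Steps (2)-(5): weight, select the local region, SVD, fold in.

    Returns (Z, U_k, s_k): local LSI vectors for all documents, the
    retained left singular vectors, and the retained singular values.
    """
    if k > min(n_local, X.shape[1]):
        raise ValueError("k must not exceed the size of the local matrix")
    # (2) Weight each document; the sigmoid form is an assumption.
    w = 1.0 / (1.0 + np.exp(-a * (rs + b)))
    Xw = X * w[:, None]
    # (3) The n_local highest-scoring documents form the local region;
    #     transpose to get a term-by-document matrix.
    top = np.argsort(-rs)[:n_local]
    local = Xw[top].T
    # (4) Truncated SVD: keep the k largest singular triplets.
    U, s, _ = np.linalg.svd(local, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    # (5) Fold every weighted document into the local space:
    #     d_hat = d^T U_k diag(s_k)^{-1}  (assumes s_k has no zeros).
    Z = Xw @ U_k / s_k
    return Z, U_k, s_k

# Usage on synthetic data (the values are random, for shape checking only).
rng = np.random.default_rng(0)
X = rng.random((20, 30))
y = np.arange(20) < 10
rs = initial_scores(X, y)
Z, U_k, s_k = lrw_lsi_fit(X, rs, n_local=8, k=3)
```

At classification time a new document would go through the same path: score it with IC, weight it, fold it into the local space with U_k and s_k, and pass the result to RC.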

7 Intelligent Database Systems Lab Methodology-Local Relevancy Weighted LSI

8 Intelligent Database Systems Lab Experiments

13 Intelligent Database Systems Lab Conclusions LRW-LSI can greatly improve classification performance while using a much smaller LSI dimension than the global LSI and local LSI methods.

14 Intelligent Database Systems Lab Comments Advantages - LRW-LSI is quite effective. Applications - Text Classification.

