Transductive Inference for Text Classification using Support Vector Machines - Thorsten Joachims (1999). University of Seoul, School of Electrical and Computer Engineering, Data Mining Laboratory, G201149027, Noh Jun-ho.


1 Transductive Inference for Text Classification using Support Vector Machines - Thorsten Joachims (1999). University of Seoul, School of Electrical and Computer Engineering, Data Mining Laboratory, G201149027, Noh Jun-ho

2 Table of Contents: Introduction; Text Classification; Transductive Support Vector Machines; Experiments

3 Introduction. Text classification (using SVM) can be used to: organize document databases; filter spam; learn users' news-reading preferences. Problem: little training data, large test set. Solution: transductive inference (semi-supervised learning).

4 Text Classification. Text classification using machine learning: 1. learn a classifier from examples; 2. the classifier then assigns categories automatically. Documents: strings of characters (→ feature: word). Information Retrieval (IR) research suggests that: word stems work well as features (computes, computing, computer → comput); word ordering can be ignored.
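
The two IR observations on this slide can be illustrated with a minimal sketch. The suffix list and the function names `toy_stem` and `bag_of_stems` are hypothetical and chosen only so that the slide's example works; a real system would use a proper stemmer such as Porter's.

```python
from collections import Counter

# Hypothetical toy suffix list; this is NOT a real stemming algorithm.
SUFFIXES = ("ing", "es", "er", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def bag_of_stems(text):
    """Count stems; word order is discarded by construction."""
    return Counter(toy_stem(w) for w in text.split())


stems = [toy_stem(w) for w in ("computes", "computing", "computer")]
same_bag = bag_of_stems("computer computes text") == bag_of_stems(
    "text computes computer"
)
```

Here all three example words map to the stem `comput`, and the two differently ordered texts produce the same bag of stems.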

5 Text Classification - Representing text as a feature vector

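6 Text Classification: representation of text with TF-IDF. TF (term frequency): the number of times a word occurs in a document. IDF (inverse document frequency): IDF(w) = log(n / DF(w)), where n is the total number of documents and DF(w) is the number of documents containing the word w. The IDF of a word is low if it occurs in many documents and highest if it occurs in only one.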
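
A minimal sketch of the TF-IDF weighting described on this slide, assuming whitespace tokenization and the common form IDF(w) = log(n / DF(w)); the function name `tf_idf` and the sample documents are illustrative only, and length normalization is left out.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: TF * IDF} dict per document."""
    n = len(docs)  # total number of documents
    tokenized = [d.lower().split() for d in docs]

    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # term frequency within this document
        vectors.append({w: tf[w] * math.log(n / df[w]) for w in tf})
    return vectors


vectors = tf_idf(["svm text text", "svm margin", "svm kernel"])
```

In this example `svm` occurs in every document, so its weight is zero, while `text` occurs in only one document and gets the highest IDF.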

7 Transductive Support Vector Machines: SVM. Minimize: (objective not preserved in this transcript) subject to: (constraints not preserved in this transcript)
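
The optimization problem on this slide is missing from the transcript. For orientation, the standard soft-margin SVM primal is sketched below, using the trade-off parameter C that a later slide names. Whether the slide showed this form or the hard-margin special case (all slack variables fixed at zero) is an assumption.

```latex
\begin{aligned}
\text{Minimize over } (\vec{w}, b, \xi_1, \dots, \xi_n): \quad
  & \tfrac{1}{2}\,\lVert \vec{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to:} \quad
  & y_i\,(\vec{w} \cdot \vec{x}_i + b) \;\ge\; 1 - \xi_i, \qquad i = 1, \dots, n \\
  & \xi_i \;\ge\; 0, \qquad i = 1, \dots, n
\end{aligned}
```

Here the training examples are pairs (x_i, y_i) with labels y_i in {-1, +1}, and each slack variable ξ_i is positive only when example i violates the margin.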

8 Transductive Support Vector Machines: TSVM. [Figure: training examples are marked +/-, test examples are marked as dots; panels: SVM, TSVM]

9 Transductive Support Vector Machines: TSVM. *: marks test data. C: parameter trading off margin size against misclassification. ξ: slack variable measuring the degree of misclassification of the data.
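
The TSVM objective that these symbol definitions belong to is missing from the transcript. A sketch of the usual soft-margin transductive formulation follows, assuming n training examples and k test examples, with a separate trade-off parameter C* for the test data; the exact layout of the slide is not recoverable.

```latex
\begin{aligned}
\text{Minimize over } (y_1^{*}, \dots, y_k^{*}, \vec{w}, b, \xi_1, \dots, \xi_n, \xi_1^{*}, \dots, \xi_k^{*}): \quad
  & \tfrac{1}{2}\,\lVert \vec{w} \rVert^{2}
    + C \sum_{i=1}^{n} \xi_i
    + C^{*} \sum_{j=1}^{k} \xi_j^{*} \\
\text{subject to:} \quad
  & y_i\,(\vec{w} \cdot \vec{x}_i + b) \;\ge\; 1 - \xi_i, \qquad i = 1, \dots, n \\
  & y_j^{*}\,(\vec{w} \cdot \vec{x}_j^{*} + b) \;\ge\; 1 - \xi_j^{*}, \qquad j = 1, \dots, k \\
  & \xi_i \ge 0, \qquad \xi_j^{*} \ge 0
\end{aligned}
```

The difference from the inductive SVM is that the unknown test labels y_j* are themselves optimization variables, so the hyperplane is chosen to separate the training and the test examples with a large margin.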

10

11 Transductive Support Vector Machines. How can TSVM be any better? By exploiting strong co-occurrence patterns. Training data: D1 (category A), D6 (category B). SVM: test document D3 → ? TSVM: test document D3 → A.
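
The co-occurrence argument can be made concrete with a toy sketch. It is not a TSVM: it swaps in a simple label propagation over shared words, and the word sets for D1 to D6 are hypothetical. The point is only that D3 shares no word with either training document, yet becomes reachable from D1 once the unlabeled test documents are taken into account.

```python
# Hypothetical word sets; only D1 and D6 carry training labels.
docs = {
    "D1": {"nuclear", "physics"},
    "D2": {"physics", "atom"},
    "D3": {"atom"},
    "D4": {"parsley", "basil"},
    "D5": {"basil", "salt"},
    "D6": {"salt"},
}
train = {"D1": "A", "D6": "B"}


def overlap_label(doc, labeled):
    """Pick the label whose documents share the most words with doc, else '?'."""
    scores = {}
    for name, label in labeled.items():
        shared = len(docs[doc] & docs[name])
        scores[label] = scores.get(label, 0) + shared
    best = max(scores.values())
    winners = [lab for lab, s in scores.items() if s == best]
    return winners[0] if best > 0 and len(winners) == 1 else "?"


def propagate(train_labels):
    """Repeatedly label test documents that overlap already-labeled ones."""
    labeled = dict(train_labels)
    changed = True
    while changed:
        changed = False
        for doc in docs:
            if doc not in labeled:
                label = overlap_label(doc, labeled)
                if label != "?":
                    labeled[doc] = label
                    changed = True
    return labeled


inductive_d3 = overlap_label("D3", train)   # uses training documents only
transductive_d3 = propagate(train)["D3"]    # also uses the test documents
```

With training documents alone D3 stays undecided, while propagating through D2 (which shares a word with D1 and another with D3) assigns D3 to category A, mirroring the outcome stated on the slide.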

12 Experiments. Test collections: Reuters-21578 dataset; WebKB collection; Ohsumed corpus. Performance measure: Precision/Recall Breakeven Point (F1 measure).
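
A minimal sketch of how a precision/recall breakeven point can be computed from ranked classifier scores. It assumes binary labels (1 = in category) and uses the fact that precision equals recall when the number of predicted positives equals the number of actual positives; the function name `pr_breakeven` and the sample scores are illustrative, not taken from the experiments.

```python
def pr_breakeven(scores, labels):
    """Precision (= recall) when as many documents are predicted positive
    as there are true positives in the test set."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank documents by decreasing classifier score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    # Predict the top n_pos documents as positive and count the hits.
    true_pos = sum(label for _, label in ranked[:n_pos])
    return true_pos / n_pos


breakeven = pr_breakeven([0.9, 0.8, 0.7, 0.4, 0.2], [1, 0, 1, 1, 0])
```

Because precision and recall coincide at this cutoff, the F1 measure takes the same value there, which is presumably why the slide mentions F1 alongside the breakeven point.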

13 Experiments: Reuters (average)

14 Experiments: WebKB (category course)

15 Experiments: WebKB (category project)

16 Experiments - Reuters - Ohsumed - WebKB

