
1. Advisor: Prof. Sing Ling Lee · Student: Chao Chih Wang · Date: 2013.01.04

2. Outline
- Introduction
  - Network data
  - Traditional vs. networked data classification
  - Collective classification: ICA
  - Problem
- Our method: Collective Inference with Ambiguous Node (CIAN)
- Experiments
- Conclusion

3. Introduction
- Traditional data: instances are independent of each other.
- Network data: instances may be related to each other.
- Applications: emails, web pages, paper citations.

4. (figure only)

5. Traditional vs. network data classification (figure: the same instances A–H with classes 1 and 2, classified first independently and then using the links between them).

6. Goal: classify interrelated instances using both content features and link features.
Link-feature encodings of a node's labeled neighbors (figure example: node D has two class-1 neighbors; node E has one class-1 and one class-2 neighbor):

Encoding   | node D (class 1/2/3) | node E (class 1/2/3)
Binary     | 1, 0, 0              | 1, 1, 0
Count      | 2, 0, 0              | 1, 1, 0
Proportion | 1, 0, 0              | 1/2, 1/2, 0

We use the Proportion encoding.
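The three link-feature encodings above can be sketched in a few lines of Python; this is an illustration of the encodings described on the slide, not the thesis's code.

```python
from collections import Counter

def link_features(neighbor_labels, classes):
    """Compute the three link-feature encodings for one node.

    neighbor_labels: class labels of the node's labeled neighbors.
    classes: ordered list of all class labels.
    """
    counts = Counter(neighbor_labels)
    count = [counts.get(c, 0) for c in classes]            # Count encoding
    binary = [1 if n > 0 else 0 for n in count]            # Binary encoding
    total = sum(count)
    proportion = [n / total if total else 0.0 for n in count]  # Proportion
    return binary, count, proportion

# Node E from the slide: one class-1 and one class-2 neighbor, classes {1, 2, 3}
b, c, p = link_features([1, 2], [1, 2, 3])
# binary -> [1, 1, 0]; count -> [1, 1, 0]; proportion -> [0.5, 0.5, 0.0]
```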

7. ICA: Iterative Classification Algorithm

Step 1 (initial):
  train the local classifier on content features;
  use it to predict every unlabeled instance.
Step 2 (iterate):
  repeat {
    for each unlabeled instance {
      recompute the instance's link features from its neighbors' current labels;
      re-predict the instance with the local classifier.
    }
  }
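The two ICA steps above can be sketched as runnable Python. This is a minimal sketch, not the thesis's implementation: `fit`/`predict` are injected stand-ins for any local classifier (e.g. naive Bayes or SVM), and the toy nearest-centroid classifier at the end exists only to make the example self-contained.

```python
def ica(graph, labels, content, classes, fit, predict, n_iter=5):
    """Sketch of the Iterative Classification Algorithm.

    graph:   dict node -> list of neighbor nodes
    labels:  dict node -> known class (training nodes only)
    content: dict node -> content-feature vector (list of floats)
    fit(X, y) -> model; predict(model, x) -> class label
    """
    def link_feat(node, current):
        nbr = [current[m] for m in graph[node] if m in current]
        return [nbr.count(c) / len(nbr) if nbr else 0.0 for c in classes]

    train = list(labels)
    unlabeled = [n for n in graph if n not in labels]

    # Step 1 (initial): bootstrap from content features only.
    model = fit([content[n] for n in train], [labels[n] for n in train])
    current = dict(labels)
    for n in unlabeled:
        current[n] = predict(model, content[n])

    # The iterative classifier also sees proportion link features.
    model = fit([content[n] + link_feat(n, labels) for n in train],
                [labels[n] for n in train])

    # Step 2: recompute link features and re-predict each iteration.
    for _ in range(n_iter):
        for n in unlabeled:
            current[n] = predict(model, content[n] + link_feat(n, current))
    return current

# Toy local classifier (stand-in): nearest class centroid.
def fit(X, y):
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(col) for col in zip(*v)]
            for c, v in groups.items()}

def predict(model, x):
    return min(model, key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(x, model[c])))

graph = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B']}
out = ica(graph, {'A': 1, 'C': 2}, {'A': [0.0], 'B': [0.9], 'C': [1.0]},
          [1, 2], fit, predict, n_iter=2)
```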

8. ICA walk-through (figure: a graph over classes 1–3 with training nodes and unlabeled nodes; proportion link features such as (2/3, 0, 1/3) for A and (1/2, 1/2, 0) for B are computed from neighbor labels and updated each iteration).

9. Problem: unreliable labels
- Some nodes are labeled with the wrong class.
- Some labels are genuinely hard to judge.
- Either way, the classifier makes mistakes.
(Figure: content features are woman/man and age ≤ 20 / age > 20; class 1 = non-smoking, class 2 = smoking; one node's label could be "1 or 2?".)

10. Example (figure: node B's given label is 2 but its true label is 1; B is an ambiguous node, and its wrong label flows into its neighbors' link features, e.g. A's proportion feature becomes (2/3, 1/3, 0)).

11. Our method
- Make a new prediction for the neighbors of each unlabeled instance.
- Use prediction probabilities to compute the link features.
- Retrain the collective classifier.

12. Computing link features with probabilities
Suppose node A's three neighbors are predicted (1, 80%), (2, 60%), (3, 70%).
- General method (labels only): class 1 = 1/3, class 2 = 1/3, class 3 = 1/3.
- Our method (probability-weighted):
  class 1 = 80/(80+60+70), class 2 = 60/(80+60+70), class 3 = 70/(80+60+70).
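The probability-weighted link feature above can be sketched as follows; the function name is illustrative, and the fallback to a uniform distribution when no neighbor is predicted is an assumption, not something the slide specifies.

```python
def prob_link_feature(neighbor_preds, classes):
    """Probability-weighted link feature, as in the slide's example.

    neighbor_preds: (predicted_class, confidence) pairs for the node's
    neighbors, e.g. [(1, 0.8), (2, 0.6), (3, 0.7)].
    """
    weight = {c: 0.0 for c in classes}
    for label, p in neighbor_preds:
        weight[label] += p
    total = sum(weight.values())
    # Fallback to uniform weights if there are no predicted neighbors.
    return [weight[c] / total if total else 1.0 / len(classes)
            for c in classes]

feat = prob_link_feature([(1, 0.8), (2, 0.6), (3, 0.7)], [1, 2, 3])
# -> [0.8/2.1, 0.6/2.1, 0.7/2.1] ~= [0.381, 0.286, 0.333]
```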

13. Ambiguous node vs. noise node (figure: the unlabeled instance's neighbors are predicted again).
- B's given label is 2 but its true label is 1. If the re-prediction is only weakly confident, e.g. (2, 60%), B is an ambiguous node.
- If the re-prediction is highly confident, e.g. (2, 90%), B is a noise node.

14. Predicting an unlabeled instance's neighbors again
- In the first iteration, every neighbor must be re-predicted.
- If the new prediction differs from the original label:
  - the new prediction is not adopted in this iteration;
  - the neighbor is re-predicted again in the next iteration.
- If the new prediction agrees with the original label:
  - the two probabilities are averaged;
  - the neighbor is not re-predicted in the next iteration.
Example (figure): B's original prediction (2, 80%) and new prediction (2, 60%) agree, so B becomes (2, 70%); C's original (1, 80%) disagrees with its new prediction (2, 60%), so C keeps (1, 80%) for now.
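The adopt-or-average rule above can be sketched as a small helper; this is our reading of the slide, with an illustrative function name.

```python
def reconcile(original, repredicted):
    """Combine a neighbor's original prediction with its re-prediction.

    On disagreement the re-prediction is not adopted this iteration and
    the node stays queued for re-prediction; on agreement the two
    confidences are averaged and re-prediction stops.

    original, repredicted: (class_label, confidence) pairs.
    Returns (kept_prediction, predict_again_next_iteration).
    """
    o_label, o_p = original
    r_label, r_p = repredicted
    if o_label != r_label:
        return original, True                    # keep original, retry later
    return (o_label, (o_p + r_p) / 2), False     # average, stop re-predicting

# Slide example: (2, 80%) re-predicted as (2, 60%) averages to (2, 70%)
kept, again = reconcile((2, 0.8), (2, 0.6))
```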

15. Handling a disagreeing re-prediction (node x, true label 2; its neighbors w, y, z carry predictions such as (2, 70%), (3, 60%), (2, 80%))
Three candidate rules when x's new prediction disagrees with its original (1, 60%):
- Method A (do not adopt): keep the old prediction, e.g. x stays (1, 50%).
- Method B (change class): switch to the new class, e.g. x becomes (2, 60%).
- Method C (keep the class, zero the confidence): e.g. x becomes (1, 0%).

Resulting link feature:

          | Class 1 | Class 2 | Class 3
original  | 0.315   | 0.368   | 0.315
Method A  | 0.27    | 0.405   | 0.324
Method B  | 0       | 0.692   | 0.307
Method C  | 0       | 0.555   | 0.444

- If x is an ambiguous (or noise) node: Method B > Method C > Method A.
- If x is not an ambiguous (or noise) node: Method A > Method C > Method B.
Methods A and B are both too extreme, so we choose Method C.

16. Accuracy (chart only).

17. Retrain the CC classifier (figure: starting from ICA's initial predictions, each node carries a (class, probability) pair, e.g. (1, 90%), (2, 60%), (3, 70%); the local classifier is then retrained on the updated, probability-weighted link features).

18. Walk-through: ambiguous node (figure: B's given label is 2 but its true label is 1; ICA trusts the given label when computing the link feature (1/2, 1/2, 0), while our method re-predicts B, e.g. (1, 80%) vs. (2, 60%), and down-weights the doubtful label).

19. Walk-through: noise node (figure: the same comparison of ICA and our method when B is a noise node, i.e. its re-predictions contradict the given label with high confidence, e.g. (2, 80%)).

20. CIAN: Collective Inference with Ambiguous Node

Step 1 (initial):
  train the local classifier on content features;
  use it to predict every unlabeled instance.
Iterate {
  for each unlabeled instance A {
    Step 2: for each neighbor nb of A {
      if nb needs to be re-predicted:
        (class label, probability) = local classifier(nb)
    }
    Step 3: set A's link features (probability-weighted);
    Step 4: (class label, probability) = local classifier(A)
  }
  Step 5: retrain the local classifier
}
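The five CIAN steps above can be sketched as runnable Python. This is a minimal sketch under our reading of the slides, not the thesis's implementation: `fit` and `predict_proba` (returning a (label, confidence) pair) are injected stand-ins for any local classifier, and the agreement/averaging rule from slide 14 decides when a neighbor stops being re-predicted.

```python
def cian(graph, labels, content, classes, fit, predict_proba, n_iter=5):
    """Sketch of CIAN: re-predict neighbors while they remain in doubt,
    build probability-weighted link features, and retrain each iteration.

    graph:   dict node -> list of neighbors
    labels:  dict node -> known class (training nodes only)
    content: dict node -> content-feature vector
    fit(X, y) -> model; predict_proba(model, x) -> (label, confidence)
    """
    def link_feat(node, preds):
        w = {c: 0.0 for c in classes}
        for nb in graph[node]:
            if nb in labels:
                w[labels[nb]] += 1.0            # training neighbors count fully
            elif nb in preds:
                lab, p = preds[nb]
                w[lab] += p                     # predicted neighbors weighted
        t = sum(w.values())
        return [w[c] / t if t else 0.0 for c in classes]

    train = list(labels)
    unlabeled = [n for n in graph if n not in labels]

    # Step 1: bootstrap from content features only.
    model = fit([content[n] for n in train], [labels[n] for n in train])
    preds = {n: predict_proba(model, content[n]) for n in unlabeled}
    need_repredict = set(unlabeled)

    model = fit([content[n] + link_feat(n, preds) for n in train],
                [labels[n] for n in train])
    for _ in range(n_iter):
        for n in unlabeled:
            # Step 2: re-predict neighbors that are still in doubt.
            for nb in graph[n]:
                if nb in need_repredict and nb in preds:
                    new = predict_proba(model, content[nb] + link_feat(nb, preds))
                    old = preds[nb]
                    if new[0] == old[0]:        # agreement: average, stop
                        preds[nb] = (old[0], (old[1] + new[1]) / 2)
                        need_repredict.discard(nb)
                    # disagreement: keep old, re-predict next iteration
            # Steps 3-4: probability-weighted link feature, then predict n.
            preds[n] = predict_proba(model, content[n] + link_feat(n, preds))
        # Step 5: retrain the local classifier.
        model = fit([content[n] + link_feat(n, preds) for n in train],
                    [labels[n] for n in train])
    return {n: preds[n][0] for n in unlabeled}
```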

21. Dataset characteristics

Characteristics  | Cora | CiteSeer | WebKB-texas | WebKB-washington
Instances        | 2708 | 3312     | 187         | 230
Class labels     | 7    | 6        | 5           | 5
Links            | 4732 | 5429     | 328         | 446
Content features | 1433 | 3703     | 1703        | 1703
Link features    | 7    | 6        | 5           | 5

22. Experimental settings (fixed parameters); we compare CO, ICA, and CIAN.

Characteristics           | Cora | CiteSeer | WebKB-texas | WebKB-washington
Instances                 | 2708 | 3312     | 187         | 230
Max ambiguous nodes (NB)  | 429  | 590      | 52          | 50
Max ambiguous nodes (SVM) | 356  | 365      | 20          | 31
Training data             | 1500 | 2000     | 100         | 120
Iterations                | 5    | 5        | 5           | 5

23. Experiments
1. Misclassified nodes: proportion of misclassified nodes (0%–30%, 80%).
2. Ambiguous nodes: NB vs. SVM.
3. Misclassified and ambiguous nodes: proportion of misclassified and ambiguous nodes (0%–30%, 80%).
4. Iterations and stability: number of iterations.

24. Cora (accuracy chart; annotated values at 0%: 4.5 and 2.5; 10%: 3.2 and 3.3; 20%: 2.7 and 3.9; 30%: 2.3 and 4.2).

25. CiteSeer (accuracy chart; annotated values at 0%: 3.6 and 0.2; 10%: 1.9 and 1.3; 20%: 1.3 and 1.8; 30%: 1.2 and 2.1).

26. WebKB-texas (accuracy chart; annotated values at 0%: 1.4 and 0.3; 10%: 1.5 and 0.9; 20%: 1 and 1.5; 30%: 0.4 and 2).

27. WebKB-washington (accuracy chart; annotated values at 0%: 1.5 and 0.9; 10%: 1.4; 20%: 1 and 1.9; 30%: 0.8 and 2.2).

28. 80% of misclassified nodes (chart only).

29. Cora (charts; max ambiguous nodes: 429 with NB, 356 with SVM).

30. CiteSeer (charts; max ambiguous nodes: 590 with NB, 365 with SVM).

31. WebKB-texas (charts; max ambiguous nodes: 52 with NB, 20 with SVM).

32. WebKB-washington (charts; max ambiguous nodes: 33 and 31).

33. How many ambiguous nodes do NB and SVM have in common?

Characteristics                 | Cora  | CiteSeer | WebKB-texas | WebKB-washington
Instances                       | 2708  | 3312     | 187         | 230
Max ambiguous nodes (NB)        | 429   | 590      | 52          | 50
Max ambiguous nodes (SVM)       | 356   | 365      | 20          | 31
Shared ambiguous nodes          | 157   | 164      | 15          | 17
Shared proportion of NB nodes   | 36.5% | 27.7%    | 28.8%       | 34%
Shared proportion of SVM nodes  | 44.1% | 44.9%    | 75%         | 54.8%

34. Cora (accuracy chart; annotated values at 10%: 6.3 and 1.5; 20%: 6.7 and 2; 30%: 6.7 and 2.3).

35. CiteSeer (accuracy chart; annotated values at 10%: 2.2 and 0.9; 20%: 1.1 and 1.7; 30%: 2.2 and 2.3).

36. WebKB-texas (accuracy chart; annotated values at 10%: 1.8 and 1; 20%: 1.8 and 1.4; 30%: 1.4 and 3).

37. WebKB-washington (accuracy chart; annotated values at 10%: 0.5 and 1.8; 20%: 1.2 and 2.4; 30%: 1.8 and 5.2).

38. 80% of misclassified and ambiguous nodes (chart only).

39. When is the accuracy of ICA lower than CO? (chart only).

40. Cora (chart: accuracy vs. number of iterations).

41. CiteSeer (chart: accuracy vs. number of iterations).

42. WebKB-texas (chart: accuracy vs. number of iterations).

43. WebKB-washington (chart: accuracy vs. number of iterations).

