
1. Advisor: Prof. Sing Ling Lee · Student: Chao Chih Wang · Date: 2013.01.04

2. Outline
- Introduction
  - Network data
  - Traditional vs. networked data classification
  - Collective classification: ICA
  - Problem
- Our method: Collective Inference with Ambiguous Node (CIAN)
- Experiments
- Conclusion

3. Introduction
- Traditional data: instances are independent of each other.
- Network data: instances may be related to each other.
- Applications: emails, web pages, paper citations.

4. (figure only)

5. Traditional vs. network data classification (figure: the same instances A–H with classes 1 and 2, classified first independently and then using the links between them).

6. Goal: classify interrelated instances using both content features and link features.
Link-feature encodings of a node's labeled neighbors (figure example: node D has two class-1 neighbors; node E has one class-1 and one class-2 neighbor):

Encoding   | node D (class 1/2/3) | node E (class 1/2/3)
Binary     | 1, 0, 0              | 1, 1, 0
Count      | 2, 0, 0              | 1, 1, 0
Proportion | 1, 0, 0              | 1/2, 1/2, 0

We use the Proportion encoding.
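The three link-feature encodings above can be sketched in a few lines of Python; this is an illustration of the encodings described on the slide, not the thesis's code.

```python
from collections import Counter

def link_features(neighbor_labels, classes):
    """Compute the three link-feature encodings for one node.

    neighbor_labels: class labels of the node's labeled neighbors.
    classes: ordered list of all class labels.
    """
    counts = Counter(neighbor_labels)
    count = [counts.get(c, 0) for c in classes]            # Count encoding
    binary = [1 if n > 0 else 0 for n in count]            # Binary encoding
    total = sum(count)
    proportion = [n / total if total else 0.0 for n in count]  # Proportion
    return binary, count, proportion

# Node E from the slide: one class-1 and one class-2 neighbor, classes {1, 2, 3}
b, c, p = link_features([1, 2], [1, 2, 3])
# binary -> [1, 1, 0]; count -> [1, 1, 0]; proportion -> [0.5, 0.5, 0.0]
```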

7. ICA: Iterative Classification Algorithm

Step 1 (initial):
  train the local classifier on content features;
  use it to predict every unlabeled instance.
Step 2 (iterate):
  repeat {
    for each unlabeled instance {
      recompute the instance's link features from its neighbors' current labels;
      re-predict the instance with the local classifier.
    }
  }
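The two ICA steps above can be sketched as runnable Python. This is a minimal sketch, not the thesis's implementation: `fit`/`predict` are injected stand-ins for any local classifier (e.g. naive Bayes or SVM), and the toy nearest-centroid classifier at the end exists only to make the example self-contained.

```python
def ica(graph, labels, content, classes, fit, predict, n_iter=5):
    """Sketch of the Iterative Classification Algorithm.

    graph:   dict node -> list of neighbor nodes
    labels:  dict node -> known class (training nodes only)
    content: dict node -> content-feature vector (list of floats)
    fit(X, y) -> model; predict(model, x) -> class label
    """
    def link_feat(node, current):
        nbr = [current[m] for m in graph[node] if m in current]
        return [nbr.count(c) / len(nbr) if nbr else 0.0 for c in classes]

    train = list(labels)
    unlabeled = [n for n in graph if n not in labels]

    # Step 1 (initial): bootstrap from content features only.
    model = fit([content[n] for n in train], [labels[n] for n in train])
    current = dict(labels)
    for n in unlabeled:
        current[n] = predict(model, content[n])

    # The iterative classifier also sees proportion link features.
    model = fit([content[n] + link_feat(n, labels) for n in train],
                [labels[n] for n in train])

    # Step 2: recompute link features and re-predict each iteration.
    for _ in range(n_iter):
        for n in unlabeled:
            current[n] = predict(model, content[n] + link_feat(n, current))
    return current

# Toy local classifier (stand-in): nearest class centroid.
def fit(X, y):
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(col) for col in zip(*v)]
            for c, v in groups.items()}

def predict(model, x):
    return min(model, key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(x, model[c])))

graph = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B']}
out = ica(graph, {'A': 1, 'C': 2}, {'A': [0.0], 'B': [0.9], 'C': [1.0]},
          [1, 2], fit, predict, n_iter=2)
```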

8. ICA walk-through (figure: a graph over classes 1–3 with training nodes and unlabeled nodes; proportion link features such as (2/3, 0, 1/3) for A and (1/2, 1/2, 0) for B are computed from neighbor labels and updated each iteration).

9. Problem: unreliable labels
- Some nodes are labeled with the wrong class.
- Some labels are genuinely hard to judge.
- Either way, the classifier makes mistakes.
(Figure: content features are woman/man and age ≤ 20 / age > 20; class 1 = non-smoking, class 2 = smoking; one node's label could be "1 or 2?".)

10. Example (figure: node B's given label is 2 but its true label is 1; B is an ambiguous node, and its wrong label flows into its neighbors' link features, e.g. A's proportion feature becomes (2/3, 1/3, 0)).

11. Our method
- Make a new prediction for the neighbors of each unlabeled instance.
- Use prediction probabilities to compute the link features.
- Retrain the collective classifier.

12. Computing link features with probabilities
Suppose node A's three neighbors are predicted (1, 80%), (2, 60%), (3, 70%).
- General method (labels only): class 1 = 1/3, class 2 = 1/3, class 3 = 1/3.
- Our method (probability-weighted):
  class 1 = 80/(80+60+70), class 2 = 60/(80+60+70), class 3 = 70/(80+60+70).
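The probability-weighted link feature above can be sketched as follows; the function name is illustrative, and the fallback to a uniform distribution when no neighbor is predicted is an assumption, not something the slide specifies.

```python
def prob_link_feature(neighbor_preds, classes):
    """Probability-weighted link feature, as in the slide's example.

    neighbor_preds: (predicted_class, confidence) pairs for the node's
    neighbors, e.g. [(1, 0.8), (2, 0.6), (3, 0.7)].
    """
    weight = {c: 0.0 for c in classes}
    for label, p in neighbor_preds:
        weight[label] += p
    total = sum(weight.values())
    # Fallback to uniform weights if there are no predicted neighbors.
    return [weight[c] / total if total else 1.0 / len(classes)
            for c in classes]

feat = prob_link_feature([(1, 0.8), (2, 0.6), (3, 0.7)], [1, 2, 3])
# -> [0.8/2.1, 0.6/2.1, 0.7/2.1] ~= [0.381, 0.286, 0.333]
```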

13. Ambiguous node vs. noise node (figure: the unlabeled instance's neighbors are predicted again).
- B's given label is 2 but its true label is 1. If the re-prediction is only weakly confident, e.g. (2, 60%), B is an ambiguous node.
- If the re-prediction is highly confident, e.g. (2, 90%), B is a noise node.

14. Predicting an unlabeled instance's neighbors again
- In the first iteration, every neighbor must be re-predicted.
- If the new prediction differs from the original label:
  - the new prediction is not adopted in this iteration;
  - the neighbor is re-predicted again in the next iteration.
- If the new prediction agrees with the original label:
  - the two probabilities are averaged;
  - the neighbor is not re-predicted in the next iteration.
Example (figure): B's original prediction (2, 80%) and new prediction (2, 60%) agree, so B becomes (2, 70%); C's original (1, 80%) disagrees with its new prediction (2, 60%), so C keeps (1, 80%) for now.
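The adopt-or-average rule above can be sketched as a small helper; this is our reading of the slide, with an illustrative function name.

```python
def reconcile(original, repredicted):
    """Combine a neighbor's original prediction with its re-prediction.

    On disagreement the re-prediction is not adopted this iteration and
    the node stays queued for re-prediction; on agreement the two
    confidences are averaged and re-prediction stops.

    original, repredicted: (class_label, confidence) pairs.
    Returns (kept_prediction, predict_again_next_iteration).
    """
    o_label, o_p = original
    r_label, r_p = repredicted
    if o_label != r_label:
        return original, True                    # keep original, retry later
    return (o_label, (o_p + r_p) / 2), False     # average, stop re-predicting

# Slide example: (2, 80%) re-predicted as (2, 60%) averages to (2, 70%)
kept, again = reconcile((2, 0.8), (2, 0.6))
```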

15. Handling a disagreeing re-prediction (node x, true label 2; its neighbors w, y, z carry predictions such as (2, 70%), (3, 60%), (2, 80%))
Three candidate rules when x's new prediction disagrees with its original (1, 60%):
- Method A (do not adopt): keep the old prediction, e.g. x stays (1, 50%).
- Method B (change class): switch to the new class, e.g. x becomes (2, 60%).
- Method C (keep the class, zero the confidence): e.g. x becomes (1, 0%).

Resulting link feature:

          | Class 1 | Class 2 | Class 3
original  | 0.315   | 0.368   | 0.315
Method A  | 0.27    | 0.405   | 0.324
Method B  | 0       | 0.692   | 0.307
Method C  | 0       | 0.555   | 0.444

- If x is an ambiguous (or noise) node: Method B > Method C > Method A.
- If x is not an ambiguous (or noise) node: Method A > Method C > Method B.
Methods A and B are both too extreme, so we choose Method C.

16. Accuracy (chart only).

17. Retrain the CC classifier (figure: starting from ICA's initial predictions, each node carries a (class, probability) pair, e.g. (1, 90%), (2, 60%), (3, 70%); the local classifier is then retrained on the updated, probability-weighted link features).

18. Walk-through: ambiguous node (figure: B's given label is 2 but its true label is 1; ICA trusts the given label when computing the link feature (1/2, 1/2, 0), while our method re-predicts B, e.g. (1, 80%) vs. (2, 60%), and down-weights the doubtful label).

19. Walk-through: noise node (figure: the same comparison of ICA and our method when B is a noise node, i.e. its re-predictions contradict the given label with high confidence, e.g. (2, 80%)).

20. CIAN: Collective Inference with Ambiguous Node

Step 1 (initial):
  train the local classifier on content features;
  use it to predict every unlabeled instance.
Iterate {
  for each unlabeled instance A {
    Step 2: for each neighbor nb of A {
      if nb needs to be re-predicted:
        (class label, probability) = local classifier(nb)
    }
    Step 3: set A's link features (probability-weighted);
    Step 4: (class label, probability) = local classifier(A)
  }
  Step 5: retrain the local classifier
}
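The five CIAN steps above can be sketched as runnable Python. This is a minimal sketch under our reading of the slides, not the thesis's implementation: `fit` and `predict_proba` (returning a (label, confidence) pair) are injected stand-ins for any local classifier, and the agreement/averaging rule from slide 14 decides when a neighbor stops being re-predicted.

```python
def cian(graph, labels, content, classes, fit, predict_proba, n_iter=5):
    """Sketch of CIAN: re-predict neighbors while they remain in doubt,
    build probability-weighted link features, and retrain each iteration.

    graph:   dict node -> list of neighbors
    labels:  dict node -> known class (training nodes only)
    content: dict node -> content-feature vector
    fit(X, y) -> model; predict_proba(model, x) -> (label, confidence)
    """
    def link_feat(node, preds):
        w = {c: 0.0 for c in classes}
        for nb in graph[node]:
            if nb in labels:
                w[labels[nb]] += 1.0            # training neighbors count fully
            elif nb in preds:
                lab, p = preds[nb]
                w[lab] += p                     # predicted neighbors weighted
        t = sum(w.values())
        return [w[c] / t if t else 0.0 for c in classes]

    train = list(labels)
    unlabeled = [n for n in graph if n not in labels]

    # Step 1: bootstrap from content features only.
    model = fit([content[n] for n in train], [labels[n] for n in train])
    preds = {n: predict_proba(model, content[n]) for n in unlabeled}
    need_repredict = set(unlabeled)

    model = fit([content[n] + link_feat(n, preds) for n in train],
                [labels[n] for n in train])
    for _ in range(n_iter):
        for n in unlabeled:
            # Step 2: re-predict neighbors that are still in doubt.
            for nb in graph[n]:
                if nb in need_repredict and nb in preds:
                    new = predict_proba(model, content[nb] + link_feat(nb, preds))
                    old = preds[nb]
                    if new[0] == old[0]:        # agreement: average, stop
                        preds[nb] = (old[0], (old[1] + new[1]) / 2)
                        need_repredict.discard(nb)
                    # disagreement: keep old, re-predict next iteration
            # Steps 3-4: probability-weighted link feature, then predict n.
            preds[n] = predict_proba(model, content[n] + link_feat(n, preds))
        # Step 5: retrain the local classifier.
        model = fit([content[n] + link_feat(n, preds) for n in train],
                    [labels[n] for n in train])
    return {n: preds[n][0] for n in unlabeled}
```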

21. Dataset characteristics

Characteristics  | Cora | CiteSeer | WebKB-texas | WebKB-washington
Instances        | 2708 | 3312     | 187         | 230
Class labels     | 7    | 6        | 5           | 5
Links            | 4732 | 5429     | 328         | 446
Content features | 1433 | 3703     | 1703        | 1703
Link features    | 7    | 6        | 5           | 5

22. Experimental settings (fixed parameters); we compare CO, ICA, and CIAN.

Characteristics           | Cora | CiteSeer | WebKB-texas | WebKB-washington
Instances                 | 2708 | 3312     | 187         | 230
Max ambiguous nodes (NB)  | 429  | 590      | 52          | 50
Max ambiguous nodes (SVM) | 356  | 365      | 20          | 31
Training data             | 1500 | 2000     | 100         | 120
Iterations                | 5    | 5        | 5           | 5

23. Experiments
1. Misclassified nodes: proportion of misclassified nodes (0%–30%, 80%).
2. Ambiguous nodes: NB vs. SVM.
3. Misclassified and ambiguous nodes: proportion of misclassified and ambiguous nodes (0%–30%, 80%).
4. Iterations and stability: number of iterations.

24. Cora (accuracy chart; annotated values at 0%: 4.5 and 2.5; 10%: 3.2 and 3.3; 20%: 2.7 and 3.9; 30%: 2.3 and 4.2).

25. CiteSeer (accuracy chart; annotated values at 0%: 3.6 and 0.2; 10%: 1.9 and 1.3; 20%: 1.3 and 1.8; 30%: 1.2 and 2.1).

26. WebKB-texas (accuracy chart; annotated values at 0%: 1.4 and 0.3; 10%: 1.5 and 0.9; 20%: 1 and 1.5; 30%: 0.4 and 2).

27. WebKB-washington (accuracy chart; annotated values at 0%: 1.5 and 0.9; 10%: 1.4; 20%: 1 and 1.9; 30%: 0.8 and 2.2).

28. 80% of misclassified nodes (chart only).

29. Cora (charts; max ambiguous nodes: 429 with NB, 356 with SVM).

30. CiteSeer (charts; max ambiguous nodes: 590 with NB, 365 with SVM).

31. WebKB-texas (charts; max ambiguous nodes: 52 with NB, 20 with SVM).

32. WebKB-washington (charts; max ambiguous nodes: 33 and 31).

33. How many ambiguous nodes do NB and SVM have in common?

Characteristics                 | Cora  | CiteSeer | WebKB-texas | WebKB-washington
Instances                       | 2708  | 3312     | 187         | 230
Max ambiguous nodes (NB)        | 429   | 590      | 52          | 50
Max ambiguous nodes (SVM)       | 356   | 365      | 20          | 31
Shared ambiguous nodes          | 157   | 164      | 15          | 17
Shared proportion of NB nodes   | 36.5% | 27.7%    | 28.8%       | 34%
Shared proportion of SVM nodes  | 44.1% | 44.9%    | 75%         | 54.8%

34. Cora (accuracy chart; annotated values at 10%: 6.3 and 1.5; 20%: 6.7 and 2; 30%: 6.7 and 2.3).

35. CiteSeer (accuracy chart; annotated values at 10%: 2.2 and 0.9; 20%: 1.1 and 1.7; 30%: 2.2 and 2.3).

36. WebKB-texas (accuracy chart; annotated values at 10%: 1.8 and 1; 20%: 1.8 and 1.4; 30%: 1.4 and 3).

37. WebKB-washington (accuracy chart; annotated values at 10%: 0.5 and 1.8; 20%: 1.2 and 2.4; 30%: 1.8 and 5.2).

38. 80% of misclassified and ambiguous nodes (chart only).

39. When is the accuracy of ICA lower than CO? (chart only).

40. Cora (chart: accuracy vs. number of iterations).

41. CiteSeer (chart: accuracy vs. number of iterations).

42. WebKB-texas (chart: accuracy vs. number of iterations).

43. WebKB-washington (chart: accuracy vs. number of iterations).

