
Slide 1: Multi-label Classification using Adaptive Neighborhoods
Tanwistha Saha, Huzefa Rangwala and Carlotta Domeniconi
Department of Computer Science, George Mason University, Fairfax, VA, USA

Slide 2: Outline
- Introduction
- Motivation
- Related Work
- Problem Definition
- Proposed Method
- Results
- Conclusion and Future Work

Slide 3: Importance of Multi-label Network Classification
- Datasets with link structure can be represented as graphs or networks.
- Classifying nodes in networks is useful for:
  1. Social network analysis to identify user interests (e.g., product recommendations)
  2. Scientific collaboration analysis to identify the research interests of individuals (e.g., recommendation of relevant scholarly articles)
  3. Protein-protein interaction (PPI) network analysis (e.g., protein function prediction)
- Entities can belong to multiple classes, which makes this a multi-label classification problem.
- The existence of inter-label, intra-label, and inter-instance dependencies makes multi-label classification hard.

Slide 4: Multi-label Networks (Palla et al., Nature 2005)
[Figure: a multi-label PPI network, with a zoomed-in region shown as an example]

Slide 5: Related Work
- Single-label classification in network data (a.k.a. collective classification) [Lu & Getoor, ICML 2003; Neville & Jensen, AAAI 2000; Sen et al., AI Magazine 2008]
- Multi-label classification [Zhang et al., IJCAI 2011; Zhang et al., SIGKDD 2010]
- Multi-label classification in network data [Kong et al., SIAM SDM 2011]

Slide 6: Classification in Networks
- Input: a graph G = (V, E) with a given percentage of labeled nodes for training, and node features for all nodes
- Output: predicted labels of the test nodes
- Model (see the sketch below):
  1. Relational features and node features are used to train a local classifier on the labeled nodes.
  2. Test node labels are initialized with the labels predicted by the local classifier using node attributes only.
  3. Inference proceeds through iterative classification of the test nodes until a convergence criterion is reached.
[Figure: a network of researchers with labels ML, DM, SW, AI, Bio; one node's label is unknown]
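A minimal sketch of this iterative scheme, assuming a dict-of-lists graph, NumPy feature vectors, and two pre-trained scikit-learn-style classifiers (clf_local over node features for bootstrapping, clf_full over node plus relational features); all helper names here are illustrative, not from the slides:

    import numpy as np

    def relational_features(graph, labels, node, n_classes):
        # Count the labels currently assigned to the node's neighbors.
        counts = np.zeros(n_classes)
        for nbr in graph[node]:
            if labels[nbr] is not None:
                counts[labels[nbr]] += 1
        return counts

    def iterative_classification(graph, X, labels, clf_local, clf_full,
                                 n_classes, max_iter=50):
        # labels: dict node -> class id for training nodes, None for test nodes.
        test_nodes = [v for v in graph if labels[v] is None]
        # Bootstrap: predict test labels from node attributes alone.
        for v in test_nodes:
            labels[v] = clf_local.predict([X[v]])[0]
        # Iterate: re-predict from node + relational features until stable.
        for _ in range(max_iter):
            changed = False
            for v in test_nodes:
                feats = np.concatenate(
                    [X[v], relational_features(graph, labels, v, n_classes)])
                new = clf_full.predict([feats])[0]
                changed = changed or new != labels[v]
                labels[v] = new
            if not changed:
                break
        return labels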

Slide 7: Collective Classification (Sen et al. 2008)
- Classification of linked data in networks using iterative inference
- Collectively predict the labels of all test nodes
- Relational features = [1 1 0 0 1] (see the sketch below)
- Total features = node features + relational features
[Figure: what is the label of the orange node, given neighbors labeled ML, DM, AI, SW, Bio?]
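One way a relational feature vector like [1 1 0 0 1] could arise is as a per-label indicator of which labels appear among a node's neighbors; this encoding is an assumption, since the slide shows it only in a figure:

    LABELS = ["ML", "DM", "AI", "SW", "Bio"]

    def label_indicator(neighbor_labels):
        # 1 if at least one neighbor carries the label, else 0 (assumed encoding).
        present = set().union(*neighbor_labels) if neighbor_labels else set()
        return [1 if lab in present else 0 for lab in LABELS]

    # Neighbors labeled {ML}, {DM, Bio} -> [1, 1, 0, 0, 1]
    print(label_indicator([{"ML"}, {"DM", "Bio"}]))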

Slide 8: Single-label Collective Classification
[Figure: a partially labeled network with node attributes X_i for all nodes and a single label Y_i per node]

Slide 9: Multi-label Classification in Non-network Data
[Figure: instances with attributes X_i and multiple labels Y_i1, Y_i2, Y_i3; no link structure between instances]

Slide 10: Multi-label Collective Classification
[Figure: instances with attributes X_i and multiple labels Y_i1, Y_i2, Y_i3; link structure between instances, partially labeled network]

Slide 11: Multi-label Collective Classification (Kong et al., SIAM SDM 2011)

Slide 12: Our Contributions
- Incorporate rank-based selection of influential neighborhood nodes into multi-label collective classification
- Propose a simple neighborhood-ranking-based naïve Bayes network classifier
- Propose an unbiased validation method

Slide 13: Multi-label Collective Classification with Ranked Neighbors (ICML_Rank)
1. Assign a rank score to all training nodes.
2. Use only the top-ranked nodes (ranks above a given threshold) as influential neighbors for relational feature computation.
3. Train K different SVM models (K = number of classes) on the training nodes.
4. Use the K trained models to predict the K labels of each test node using only node features.
5. Compute the relational features of all test nodes, considering only the influential training nodes.
6. Use iterative inference to collectively predict the K labels of all test nodes until convergence (see the sketch below).
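A rough sketch of these six steps, assuming scikit-learn's LinearSVC as the per-label SVM and binary label vectors; how test neighbors feed back into relational features during inference is an interpretation, since the slides do not spell it out:

    import numpy as np
    from sklearn.svm import LinearSVC

    def icml_rank(graph, X, Y, train_nodes, test_nodes,
                  rank_scores, threshold, n_labels, max_iter=20):
        # Y: dict node -> binary label vector (given for training nodes only).
        influential = {v for v in train_nodes if rank_scores[v] >= threshold}
        test_set = set(test_nodes)

        def rel_feats(v):
            # Influential training neighbors, plus test neighbors once they
            # carry label estimates (one reading of the iterative step).
            nbrs = [u for u in graph[v]
                    if u in influential or (u in test_set and u in Y)]
            return (np.mean([Y[u] for u in nbrs], axis=0)
                    if nbrs else np.zeros(n_labels))

        # One binary SVM per label, trained on node + relational features.
        models = []
        for k in range(n_labels):
            clf = LinearSVC()
            clf.fit([np.r_[X[v], rel_feats(v)] for v in train_nodes],
                    [Y[v][k] for v in train_nodes])
            models.append(clf)

        # Bootstrap: predict test labels from node features alone
        # (relational part padded with zeros).
        for v in test_nodes:
            Y[v] = np.array([m.predict([np.r_[X[v], np.zeros(n_labels)]])[0]
                             for m in models])

        # Iterative inference until the predictions stop changing.
        for _ in range(max_iter):
            old = {v: Y[v].copy() for v in test_nodes}
            for v in test_nodes:
                f = np.r_[X[v], rel_feats(v)]
                Y[v] = np.array([m.predict([f])[0] for m in models])
            if all((Y[v] == old[v]).all() for v in test_nodes):
                break
        return Y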

Slide 14: Multi-label Classification in Networks (RankNN)
- A local neighborhood-based ranking method
- Computes the rank of all training nodes
- Computes the prior probability of each label for a test node
- Computes the likelihood of each label for a test node based on its influential neighbors
- Computes the posterior probability of each label for a test node from the prior and the likelihood (see the sketch below)
- Faster than multi-label collective classification methods when the training sample size is small
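A minimal sketch of the naïve Bayes scoring RankNN describes; the frequency-based prior and the neighbor-vote likelihood are assumed estimators, since the slides name the quantities but not their formulas:

    import numpy as np

    def ranknn_posteriors(graph, Y_train, influential, test_node,
                          n_labels, eps=1e-6):
        # Prior: relative frequency of each label among training nodes (assumed).
        prior = np.sum([Y_train[v] for v in Y_train], axis=0).astype(float) + eps
        prior /= prior.sum()

        # Likelihood: votes of the test node's influential neighbors (assumed).
        votes = np.full(n_labels, eps)
        for u in graph[test_node]:
            if u in influential:
                votes += Y_train[u]
        likelihood = votes / votes.sum()

        # Posterior: proportional to prior * likelihood, normalized over labels;
        # labels whose posterior clears a probability threshold are predicted
        # (cf. the threshold sweep on slide 24).
        post = prior * likelihood
        return post / post.sum()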

Slide 15: Influence: Concept and Measure
How do we distinguish between different neighbors?

Slide 16: Influence of a Training Node
- The rank of a training node is computed from: C, the cosine similarity matrix between pairwise node features; A, the weighted adjacency matrix; M, the node-label association matrix; and r, the rank of a label. [The equation itself appears only as an image in the original slide; a plausible reconstruction is sketched below.]
- A threshold is used to prune out nodes with a low rank score.
- The default threshold is set to the median of the rank scores.
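Since the equation survives only as an image, the following is merely a plausible reconstruction of a rank score built from C, A, and M; the specific combination (label associations propagated over similarity-scaled edges) is an assumption, not the authors' formula:

    import numpy as np

    def rank_scores(C, A, M):
        # C: n x n cosine similarity between pairwise node features
        # A: n x n weighted adjacency matrix
        # M: n x L node-label association matrix
        # Assumed combination: scale each edge by feature similarity,
        # propagate label associations over the scaled edges, and score
        # each node by the total label mass it receives.
        W = A * C                                 # similarity-scaled edges
        S = W @ M                                 # label mass from neighbors
        scores = S.sum(axis=1)                    # aggregate over labels
        return scores / (scores.max() + 1e-12)   # normalize to [0, 1]

    def prune(scores, threshold=None):
        # Default threshold: the median rank score (per the slide).
        if threshold is None:
            threshold = np.median(scores)
        return np.flatnonzero(scores >= threshold)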

Slide 17: Influential Neighbors
[Figure: a test node with eight neighbors and edge weights 4, 2, 2, 1, 1, 1, 1, 1; label sets P1 = {DM, ML}, P2 = {DM, AI}, P3 = {DM, AI, ML}, P4 = {AI}, P5 = {DM}, P6 = {SW}, P7 = {ML}, P8 = {DM}]
Rank(P1) = 0.5, Rank(P2) = 0.3, Rank(P3) = 0.2, and the rank of every other neighbor is < 0.1. If the rank threshold is 0.2, then the influential neighbors are P1, P2, and P3 ONLY!

Slide 18: Proposed Validation Protocol: Filtered vs. Unfiltered Data
- Unbiased evaluation (a sketch follows below):
  1. Remove the information shared between a training node and a test node from the training node.
  2. Re-compute the labels of the training node.
- Unfiltered data: no label pre-processing is performed before training.
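A sketch of the filtering step, assuming labels derive from co-authored papers as in the DBLP setup on slide 20; papers and paper_label are hypothetical data structures, not names from the paper:

    def filter_training_labels(train_node, test_nodes, papers, paper_label):
        # papers: dict author -> set of paper ids
        # paper_label: dict paper id -> label
        # Drop papers the training node shares with any test node, then
        # re-derive its label set from the remaining papers.
        shared = set()
        for t in test_nodes:
            shared |= papers[train_node] & papers[t]
        remaining = papers[train_node] - shared
        return {paper_label[p] for p in remaining}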

Slide 19: Unbiased Label Set: An Example
A training node has co-authored 3 papers in ML with the test node. Its label set before filtering: {DM, AI, ML}. Its label set after filtering: {DM, AI}.

Slide 20: Datasets
- The DBLP computer science bibliography data is pre-processed to generate multi-labeled networks (a construction sketch follows below).
- Publication records from 2000 to 2010 were extracted.
- Conference locations are categorized into labels.
- Each node is an author.
- The edge weight is the number of co-authored papers.
- Node attributes are tf-idf weights of the titles of co-authored papers.
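A sketch of how such a network could be assembled, assuming networkx and scikit-learn's TfidfVectorizer; the (title, authors) record format is a hypothetical stand-in for the parsed DBLP data:

    import itertools
    import networkx as nx
    from sklearn.feature_extraction.text import TfidfVectorizer

    def build_coauthor_network(records):
        # records: iterable of (title, [authors]) pairs -- a hypothetical format.
        G = nx.Graph()
        titles = {}   # author -> concatenated titles of that author's papers
        for title, authors in records:
            for a in authors:
                titles[a] = (titles.get(a, "") + " " + title).strip()
            for a, b in itertools.combinations(authors, 2):
                # Edge weight = number of co-authored papers.
                w = G[a][b]["weight"] + 1 if G.has_edge(a, b) else 1
                G.add_edge(a, b, weight=w)
        # Node attributes: tf-idf weights over paper titles.
        authors = list(titles)
        X = TfidfVectorizer().fit_transform([titles[a] for a in authors])
        return G, authors, X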

Slide 21: Baseline Methods and Proposed Methods
Baseline methods (multi-label collective classification):
1. ICML (Kong et al., SIAM SDM 2011)
2. ML_ICA (Kong et al., SIAM SDM 2011)
Proposed methods (neighborhood-ranking-based multi-label collective classification with unbiased evaluation):
1. ICML_Rank
2. ML_ICA_Rank
3. RankNN

Slide 22: Evaluation Metrics
- Hamming loss: counts both a label that is not predicted (a missing error) and a label that is incorrectly predicted (a prediction error)
- Subset loss: a strict loss, i.e., an error is counted whenever the predicted label set does not exactly match the true label set
- Macro-F1 score: the average of the F1-scores over the prediction of all labels
- Micro-F1 score: the harmonic mean of precision and recall across individual labels, averaged across the set of test nodes
A worked sketch of these metrics follows below.
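A sketch of the two losses over binary label matrices (rows = test nodes, columns = labels); the F1 variants are standard and available via scikit-learn's f1_score:

    import numpy as np
    from sklearn.metrics import f1_score

    def hamming_loss(Y_true, Y_pred):
        # Fraction of node-label cells that are wrong (missing or spurious).
        return np.mean(Y_true != Y_pred)

    def subset_loss(Y_true, Y_pred):
        # Strict loss: a node counts as an error unless its whole label
        # set matches exactly.
        return np.mean(np.any(Y_true != Y_pred, axis=1))

    Y_true = np.array([[1, 0, 1], [0, 1, 0]])
    Y_pred = np.array([[1, 1, 1], [0, 1, 0]])
    print(hamming_loss(Y_true, Y_pred))     # 1 wrong cell of 6 -> 0.1667
    print(subset_loss(Y_true, Y_pred))      # 1 mismatched node of 2 -> 0.5
    print(f1_score(Y_true, Y_pred, average="macro"))
    print(f1_score(Y_true, Y_pred, average="micro"))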

Slide 23: Results
[Plot: loss for the different methods w.r.t. training percentage]

Slide 24: Results
[Plot: loss of RankNN w.r.t. different probability thresholds]

Slides 25-26: [additional results figures; no captions survive in the transcript]

Slide 27: [Plot: time consumption of the different methods with 30% labeled data]

Slide 28: Conclusion and Future Work
- Proposed a node-ranking method that prunes the neighborhood to select influential nodes adaptively
- Proposed an improved rank-based method for multi-label collective classification
- Proposed a simple classification method for multi-label network classification
- Reported results on a real-world dataset
- Future work: methods that scale to a huge number of nodes while maintaining similar accuracy
- Future work: methods that can handle dynamic networks

Slide 29: Thank You!

