1 Sparsification and Sampling of Networks for Collective Classification
Tanwistha Saha, Huzefa Rangwala and Carlotta Domeniconi
Department of Computer Science, George Mason University, Fairfax, VA, USA
2 Outline
- Introduction
- Motivation
- Related Work
- Proposed Methods
- Results
- Conclusion and Future Work
3 Sparsification and Sampling of Networks for Collective Classification
Given:
- Partially labeled weighted network
- Node attributes for all the nodes
Goal:
- Predict the labels of the unlabeled nodes in the network
Points to consider:
- Networks with fewer edges can be formed using sparsification algorithms
- The selection of labeled nodes for training influences the overall accuracy, motivating research on sampling algorithms for collective classification
5 Relational Network Sparsification
- The study of networks involves relational learning
- A relational network consists of nodes representing entities and edges representing pairwise interactions
- Edges can be weighted or unweighted; weights represent the similarity between a pair of nodes
- Edges with low weights don't carry much information, so we can remove them based on some criterion!
- Goal: sparsify the network without losing much information
8 Importance of Sparsification in Networks
Problems:
- Data analysis is time consuming
- Noisy edges do not convey useful information in relational data
Solutions:
- Identify and remove the noisy edges
- Make sure to remove only the noisy edges, and not the others!
- Classify the unlabeled nodes in the sparsified network using collective classification, and compare the results against the unsparsified network
9 Graph sparsification methods for clustering
- (GS) Global Graph Sparsification (Satuluri et al. SIGMOD 2011)
- (LS) Local Graph Sparsification (Satuluri et al. SIGMOD 2011)
Drawbacks:
- The methods were designed for fast clustering, not for classification
- All edges are treated equally
- The sparsified network becomes more disconnected
10 Global Graph Sparsification (Satuluri et al. SIGMOD 2011)
[Figure: sparsified network showing a disconnected component and singleton nodes]
11 Local Graph Sparsification (Satuluri et al. SIGMOD 2011)
[Figure: removal of one marked edge disconnects the graph; in addition to the edges marked red, some more edges marked blue were removed]
Note: the edges removed by this method might not be a superset of the edges removed by the global sparsification method.
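To make the contrast between the two baselines concrete, here is a minimal Python sketch of the global and local sparsification idea, assuming Jaccard similarity of edge endpoints as the edge score (the slides do not fix the score); the ratio s and the local exponent e are illustrative parameters, not values from the paper.

import math
import networkx as nx

def edge_similarity(G, u, v):
    # Jaccard similarity of the adjacency lists of u and v.
    nu, nv = set(G[u]), set(G[v])
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def global_sparsify(G, s=0.7):
    # Keep the top s fraction of edges, ranked by similarity over the whole graph.
    ranked = sorted(G.edges(), key=lambda e: edge_similarity(G, *e), reverse=True)
    H = nx.Graph()
    H.add_nodes_from(G.nodes(data=True))
    H.add_edges_from(ranked[: int(s * G.number_of_edges())])
    return H

def local_sparsify(G, e=0.5):
    # Keep, for each node of degree d, its ceil(d**e) highest-similarity edges.
    H = nx.Graph()
    H.add_nodes_from(G.nodes(data=True))
    for u in G:
        nbrs = sorted(G[u], key=lambda v: edge_similarity(G, u, v), reverse=True)
        H.add_edges_from((u, v) for v in nbrs[: math.ceil(G.degree(u) ** e)])
    return H

Because the global variant ranks edges over the whole graph, low-similarity regions can lose all their edges at once, which is exactly the singleton/disconnection drawback shown above.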
12 Adaptive Global Sparsifier (Saha et al. SBP 2013)
- Aims to address the drawbacks of LS and GS
- Doesn't remove an edge if the removal would make the graph more disconnected
Note: this method is less aggressive in removing edges than the local and global sparsification algorithms of Satuluri et al.
13 Adaptive Global Sparsifier
- Keep the edges with the top similarity scores (here, score >= 0.3)
14 Adaptive Global Sparsifier (contd.)
- Removing the red edges doesn't increase the number of connected components
- The mauve colored edges have low similarity scores, but we put them back to avoid disconnected components
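A minimal sketch of the adaptive idea above, assuming the same edge-similarity score as in the earlier sketch; the 0.3 threshold mirrors the slide example, and networkx's connected-component count stands in for the paper's exact connectivity test.

import networkx as nx

def adaptive_global_sparsify(G, similarity, threshold=0.3):
    H = G.copy()
    # Consider the weakest edges first, so strong edges are always kept.
    candidates = sorted(G.edges(), key=lambda e: similarity(G, *e))
    for u, v in candidates:
        if similarity(G, u, v) >= threshold:
            break  # all remaining edges score at or above the threshold
        before = nx.number_connected_components(H)
        H.remove_edge(u, v)
        if nx.number_connected_components(H) > before:
            # Put the edge back: its removal would disconnect the graph.
            H.add_edge(u, v, **G.get_edge_data(u, v))
    return H

For example, adaptive_global_sparsify(G, edge_similarity) reuses the Jaccard score from the previous sketch.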
15 Collective Classification in Networks
Input: a graph G = (V, E) with a given percentage of labeled nodes for training, and node features for all the nodes
Output: predicted labels of the test nodes
Model:
- Relational features and node features are used to train a local classifier on the labeled nodes
- Test node labels are initialized with the labels predicted by the local classifier from the node attributes
- Inference proceeds through iterative classification of the test nodes until a convergence criterion is reached (see the sketch after this slide)
[Figure: network of researchers with area labels SW, DM, AI, Bio, ML and an unlabeled node marked "?"]
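As a rough illustration of this loop, the sketch below uses logistic regression as the local classifier and normalized neighbor-label counts as the relational feature; both choices, and the fixed iteration cap, are assumptions for illustration, not the paper's exact setup.

import numpy as np
from sklearn.linear_model import LogisticRegression

def iterative_classification(G, feats, labels, train, test, n_classes, max_iter=10):
    # feats: {node: 1-D numpy feature vector}; labels: {train node: class in 0..n_classes-1}.
    def rel_feat(u, current):
        # Relational feature: normalized label counts among currently labeled neighbors.
        counts = np.zeros(n_classes)
        for v in G[u]:
            if v in current:
                counts[current[v]] += 1
        return counts / max(counts.sum(), 1.0)

    current = dict(labels)
    # Bootstrap: a local classifier trained on node attributes only.
    clf0 = LogisticRegression(max_iter=1000).fit(
        [feats[u] for u in train], [labels[u] for u in train])
    for u in test:
        current[u] = clf0.predict([feats[u]])[0]
    # Relational classifier trained on attributes plus relational features.
    clf = LogisticRegression(max_iter=1000).fit(
        [np.concatenate([feats[u], rel_feat(u, labels)]) for u in train],
        [labels[u] for u in train])
    for _ in range(max_iter):  # iterate until the test labels stabilize (or cap reached)
        new = {u: clf.predict([np.concatenate([feats[u], rel_feat(u, current)])])[0]
               for u in test}
        if all(new[u] == current[u] for u in test):
            break
        current.update(new)
    return {u: current[u] for u in test}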
16 Datasets & Experiments
- Cora: citation network, a directed graph of 2708 research papers, each belonging to one of 7 research areas (classes) in Computer Science
- DBLP: co-authorship network among 5602 researchers in 6 different areas of computer science (raw data downloaded and processed)
Number of edges retained by the different sparsification algorithms at sparsification ratio s = 70%:

Dataset  Total edges  Adaptive Global Sparsifier  Global Sparsifier  Local Sparsifier
Cora     5429         3850                        3800               2429
DBLP     17265        12251                       12086              6859
17 Experiments (contd.)
- Weighted Vote Relational Neighbor (wvRN) is used as the base collective classification algorithm (Macskassy et al. JMLR 2007)
- Baseline methods: Global Sparsification (GS) and Local Sparsification (LS) algorithms (Satuluri et al. SIGMOD 2011)
- Performance metric: classification accuracy
19 Sampling for Collective Classification
- A good sample of the data should inherit all of its characteristics
- Examples: forest fire sampling, node sampling, edge sampling with induction (Ahmed et al. ICWSM 2012)
- We argue: the "goodness" of a sample is defined by the problem we want to solve
Rationale:
- The training sample should be chosen so that each test node is connected to at least one training node (see the check sketched below)
- Why? To facilitate collective classification by ensuring that test nodes have useful relational features computed from training nodes!
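This rationale can be phrased as a one-line validity check on a candidate train/test split; every_test_node_touches_training is a hypothetical helper for illustration, not from the paper.

def every_test_node_touches_training(G, train_nodes):
    # True iff every non-training node has at least one training neighbor,
    # so its relational features are not all empty at inference time.
    train = set(train_nodes)
    return all(any(v in train for v in G[u]) for u in G if u not in train)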
20 Adaptive Forest Fire Sampling
- A modified version of Forest Fire Sampling (Leskovec et al. KDD 2005)
- Selects a random node as the "seed node" to start and marks it as "visited"
- "Adaptive" because it randomly selects only a certain percentage of the edges incident on a visited node to propagate along, marking the nodes at the other end of those edges as "visited"
- Maintains a queue of unvisited nodes as the propagation spreads through the network
- Ensures that each test node is connected to at least one training node
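A minimal sketch of the sampling loop described above; the burn fraction, the restart rule when the queue empties, and the stopping condition are assumptions, and in this simplified version the guarantee that every test node touches a training node may still need the check from the previous sketch as post-processing.

import random
from collections import deque

def adaptive_forest_fire_sample(G, train_frac=0.3, burn_frac=0.5, seed=None):
    rng = random.Random(seed)
    target = int(train_frac * G.number_of_nodes())
    visited = set()
    queue = deque()
    while len(visited) < target:
        if not queue:  # (re)start the fire from a fresh random seed node
            start = rng.choice([u for u in G if u not in visited])
            visited.add(start)
            queue.append(start)
        u = queue.popleft()
        nbrs = [v for v in G[u] if v not in visited]
        # Burn only a fraction of the incident edges (the "adaptive" step).
        k = max(1, int(burn_frac * len(nbrs))) if nbrs else 0
        for v in rng.sample(nbrs, k):
            visited.add(v)
            queue.append(v)
            if len(visited) >= target:
                break
    return visited  # training nodes; the remaining nodes form the test set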
21 Adaptive Forest Fire Sampling of a network with 19 nodes
[Figure: sampled network with the test nodes highlighted]
22 Experiments
Baseline classifiers used to compare Random Sampling with Adaptive Forest Fire Sampling:
- wvRN (Macskassy et al. JMLR 2007)
- Multi-class SVM (Crammer and Singer JMLR 2001; Tsochantaridis et al. ICML 2004)
- RankNN for single-labeled data (Saha et al. ICMLA 2012)
23 Results (Cora citation network)
[Figure: accuracy plots under Random Sampling and under Adaptive Forest Fire Sampling]
24 Conclusions
- Introduced a sparsification method that enables collective classification on network datasets without losing much information, achieving comparable accuracies
- Introduced a network sampling algorithm that facilitates collective classification
- These algorithms work on single-labeled networks; in the future we would extend these approaches to handle multi-labeled networks as well
- These algorithms are designed for static networks; an interesting direction would be to formulate sampling methods for networks that change over time