Active Learning for Networked Data Based on Non-progressive Diffusion Model Zhilin Yang, Jie Tang, Bin Xu, Chunxiao Xing Dept. of Computer Science and.


1 Active Learning for Networked Data Based on Non-progressive Diffusion Model Zhilin Yang, Jie Tang, Bin Xu, Chunxiao Xing Dept. of Computer Science and Technology Tsinghua University, China

2 An Example

3 Instances Correlation

4 An Example Instances Correlation ? ? ? ? ? ? Classify each instance into {+1, -1}

5 An Example Instances Correlation +1 ? +1 ? ?

6 An Example Instances Correlation +1 ? +1 ? ? Query for label

7 An Example Instances Correlation +1 ? +1 ?

8 Problem: Active Learning for Networked Data Instances Correlation +1 ? +1 ? ? Challenge: It is expensive to query for labels! Questions: Which instances should we select to query? How many instances do we need to query to obtain an accurate classifier?

9 Challenges Active Learning for Networked Data How to leverage network correlation among instances? How to query in a batch mode?

10 Batch Mode Active Learning for Networked Data Given a graph consisting of unlabeled instances, labeled instances, the labels of the labeled instances, edges, and a feature matrix, our objective is to maximize a utility function over a subset of unlabeled instances, subject to a labeling budget.
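The formula on this slide did not survive extraction. One plausible way to write the batch-mode objective is sketched below; every symbol name is an assumption chosen for illustration ($V_U$ for the unlabeled instances, $V_S$ for the queried subset, $Q$ for the utility function, $k$ for the labeling budget), not the authors' notation.

```latex
\max_{V_S \subseteq V_U} \; Q(V_S)
\quad \text{subject to} \quad |V_S| \le k
```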

11 Factor Graph Model ? ? ? ? ? ? Variable Node Factor Node

12 Factor Graph Model The joint probability Local factor function Edge factor function Log likelihood of labeled instances
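The equations on this slide were lost. As a rough guide, a pairwise factor graph with local and edge factor functions is commonly written in the log-linear form below. The symbols ($\lambda$, $\beta$, $f$, $g$, $Z$, $Y_L$, $Y_U$) are assumptions for illustration and may differ from the ones used in the talk.

```latex
% Joint probability: local factors f on each instance, edge factors g on each edge
P(Y \mid G) = \frac{1}{Z} \exp\Big( \sum_{v_i \in V} \lambda^{\top} f(x_i, y_i)
            + \sum_{(v_i, v_j) \in E} \beta^{\top} g(y_i, y_j) \Big)

% Log likelihood of the labeled instances: marginalize over the unlabeled ones
\mathcal{O}(\lambda, \beta) = \log \sum_{Y_U} P(Y_L, Y_U \mid G)
```

Here $Z$ is the normalizer that sums the exponential term over all label assignments.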

13 Factor Graph Model Learning Gradient descent Calculate the expectation: Loopy Belief Propagation (LBP) Message from variable to factor Message from factor to variable
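The message-passing step can be sketched as follows. This is a minimal sum-product LBP for binary labels on a pairwise model, not the authors' implementation: it folds the variable-to-factor and factor-to-variable messages into a single node-to-node message, and it assumes one symmetric 2x2 edge potential shared by all edges. The function names `loopy_bp` and `exact_marginals` are hypothetical.

```python
from itertools import product

def loopy_bp(unary, edges, pairwise, n_iters=50):
    """Approximate marginals via synchronous sum-product message passing.

    unary:    list of [phi(y=0), phi(y=1)] local potentials, one per node
    edges:    list of undirected (i, j) pairs
    pairwise: symmetric 2x2 edge potential shared by all edges (assumption)
    """
    n = len(unary)
    neighbors = {i: [] for i in range(n)}
    msgs = {}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
        msgs[(i, j)] = [0.5, 0.5]  # message from i to j, uniform start
        msgs[(j, i)] = [0.5, 0.5]
    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            # Combine i's local potential with messages from all neighbors but j.
            prod = list(unary[i])
            for k in neighbors[i]:
                if k != j:
                    prod = [prod[a] * msgs[(k, i)][a] for a in (0, 1)]
            # Sum out y_i through the edge potential, then normalize.
            out = [sum(prod[a] * pairwise[a][b] for a in (0, 1)) for b in (0, 1)]
            z = sum(out)
            new_msgs[(i, j)] = [v / z for v in out]
        msgs = new_msgs
    beliefs = []
    for i in range(n):
        b = list(unary[i])
        for k in neighbors[i]:
            b = [b[a] * msgs[(k, i)][a] for a in (0, 1)]
        z = sum(b)
        beliefs.append([v / z for v in b])
    return beliefs

def exact_marginals(unary, edges, pairwise):
    """Brute-force marginals, usable only on tiny graphs (for checking)."""
    n = len(unary)
    marg = [[0.0, 0.0] for _ in range(n)]
    for y in product((0, 1), repeat=n):
        w = 1.0
        for i in range(n):
            w *= unary[i][y[i]]
        for i, j in edges:
            w *= pairwise[y[i]][y[j]]
        for i in range(n):
            marg[i][y[i]] += w
    return [[m[0] / (m[0] + m[1]), m[1] / (m[0] + m[1])] for m in marg]
```

On a tree-structured graph such as a chain, the beliefs agree with the brute-force marginals; on graphs with cycles they are only approximations, which is why the method is called loopy. The resulting marginals are what a gradient step would use to compute the expectations mentioned on the slide.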

14 Question: How to select instances from Factor graph for active learning?

15 Basic principle: Maximize the Ripple Effects ? ? ? ? ? ?

16 Maximize the Ripple Effects ? ? ? +1 ? ? Labeling information is propagated

17 Maximize the Ripple Effects ? ? ? +1 ? ? Labeling information is propagated

18 Maximize the Ripple Effects ? ? ? +1 ? ? Labeling information is propagated Statistical bias is propagated How to model the propagation process in an unlabeled network?

19 Diffusion Model Linear Threshold Model Progressive Diffusion Model Non-Progressive Diffusion Model Linear Threshold
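To make the progressive versus non-progressive distinction concrete, here is a small simulation sketch. It is not the model definition from the talk: it assumes uniform edge weights (each neighbor contributes 1/degree), synchronous updates, fixed per-node thresholds, and an optional set of nodes that are held active (for example, queried instances). The function name `non_progressive_lt` and its parameters are hypothetical.

```python
def non_progressive_lt(neighbors, thresholds, initial, clamped=frozenset(), n_steps=20):
    """Simulate a non-progressive linear threshold process.

    neighbors:  dict mapping node -> list of neighboring nodes
    thresholds: dict mapping node -> threshold in [0, 1]
    initial:    nodes active at time 0
    clamped:    nodes held active at every step (assumption, e.g. queried nodes)
    Returns the list of active sets, one per time step, until a fixed point
    is reached or n_steps updates have been applied.
    """
    active = set(initial) | set(clamped)
    history = [set(active)]
    for _ in range(n_steps):
        nxt = set(clamped)
        for v, nbrs in neighbors.items():
            if v in clamped or not nbrs:
                continue
            # Fraction of v's neighbors that were active at the previous step.
            frac = sum(1 for u in nbrs if u in active) / len(nbrs)
            # Non-progressive rule: v's state is recomputed every step,
            # so an active node can fall back to inactive.
            if frac >= thresholds[v]:
                nxt.add(v)
        if nxt == active:
            break
        active = nxt
        history.append(set(active))
    return history

# A path 0-1-2-3 where node 0 is held active and node 3 starts active.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
theta = {v: 0.5 for v in path}
history = non_progressive_lt(path, theta, initial={0, 3}, clamped={0})
```

In this example node 3 is active at time 0, becomes inactive at time 1 because its only neighbor is inactive, and is reactivated at time 2. In a progressive model it would have stayed active throughout.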

20 Maximize the Ripple Effects ? ? ? +1 ? ? Labeling information is propagated Statistical bias is propagated Will it be dominated by labeling information (active) or statistical bias (inactive)? Based on non-progressive diffusion model Maximize the number of activated instances in the end We aim to activate the most uncertain instances!

21 Instantiate the Problem Active Learning Based on Non-Progressive Diffusion Model, The number of activated instances With constraints Initially activate all queried instances We activate the most uncertain instances Based on the non-progressive diffusion

22 Reduce the Problem The original problem The reduced problem Constraints are inherited. Reduction procedure

23 Algorithm The reduced problem The key idea

24 Algorithm

25 Theoretical Analysis Convergence Lemma 1 The algorithm will converge within time. Correctness Approximation Ratio

26 Experiments
Datasets
Dataset     #Variable nodes   #Factor nodes
Coauthor    6,096             24,468
Slashdot    370               1,686
Mobile      314               513
Enron       100               236
Comparison Methods
Batch Mode Active Learning (BMAL), proposed by Shi et al.
Influence Maximization Selection (IMS), proposed by Zhuang et al.
Maximum Uncertainty (MU)
Random (RAN)
Max Coverage (MaxCo), our method
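The two simplest baselines can be sketched as below. This is an illustrative reading of Maximum Uncertainty and Random, not the experimental code: it assumes each unlabeled instance comes with a predicted probability of the +1 label and measures uncertainty by binary entropy. The function names are hypothetical.

```python
import math
import random

def binary_entropy(p):
    """Entropy of a Bernoulli(p) label distribution, in nats."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)

def max_uncertainty(marginals, k):
    """Pick the k instances whose predicted label is most uncertain.

    marginals: dict mapping instance id -> predicted P(y = +1)
    """
    ranked = sorted(marginals, key=lambda v: binary_entropy(marginals[v]), reverse=True)
    return ranked[:k]

def random_selection(marginals, k, seed=0):
    """Pick k instances uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(sorted(marginals), k)

marginals = {"a": 0.5, "b": 0.9, "c": 0.45, "d": 0.1}
picked = max_uncertainty(marginals, 2)
```

Neither baseline looks at the edges, which is the gap the network-aware methods in the comparison aim to close.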

27 Experiments Performance

28 Related Work
Active Learning for Networked Data
Actively learning to infer social ties. H. Zhuang, J. Tang, W. Tang, T. Lou, A. Chin and X. Wang
Batch mode active learning for networked data. L. Shi, Y. Zhao and J. Tang
Towards active learning on graphs: an error bound minimization approach. Q. Gu and J. Han
Integration of active learning in a collaborative CRF. O. Martinez and G. Tsechpenakis
Diffusion Model
On the non-progressive spread of influence through social networks. M. Fazli, M. Ghodsi, J. Habibi, P. J. Khalilabadi, V. Mirrokni and S. S. Sadeghabad
Maximizing the spread of influence through a social network. D. Kempe, J. Kleinberg and E. Tardos

29 Conclusion Connect active learning for networked data to non-progressive diffusion model, and precisely formulate the problem Propose an algorithm to solve the problem Theoretically guarantee the convergence, correctness and approximation ratio of the algorithm Empirically evaluate the performance of the algorithm on four datasets of different genres

30 Future work Consider active learning for networked data in a streaming setting, where data distribution and network structure are changing over time

31 About Me Zhilin Yang kimiyoung@yeah.net 3rd-year undergraduate at Tsinghua Univ. Applying for PhD programs this year Data Mining & Machine Learning

32 Thanks! kimiyoung@yeah.net

