CO-AUTHOR RELATIONSHIP PREDICTION IN HETEROGENEOUS BIBLIOGRAPHIC NETWORKS
Yizhou Sun, Rick Barber, Manish Gupta, Charu C. Aggarwal, Jiawei Han

2 Content
  Background and motivation
  Problem definition
  PathPredict: meta path-based relationship prediction model
    Meta path-based topological features
    The supervised learning framework and model
  Experiments
  Conclusions

3 Background
  Homogeneous networks
    One type of object
    One type of link
    Example: friendship network in Facebook
  Link prediction in homogeneous networks
    Predict whether a link between two objects will appear in the future, according to:
      Topological features of the network
      Attribute features of the objects (usually cannot be fully obtained)

4 Motivation
  In reality, heterogeneous networks are ubiquitous
    Multiple types of objects
    Multiple types of links
    Examples: bibliographic network, movie network
  From link prediction to relationship prediction
    A relationship between two objects can be a composition of two or more links
    E.g., two authors have a co-author relationship if and only if they have co-written a paper
    Topological features need to be redesigned for heterogeneous information networks
  Our goal: study topological features in heterogeneous networks for predicting the formation of co-author relationships
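
The idea that a relationship is a composition of links can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `writes` list, the author and paper names, and the `coauthor_pairs` helper are all hypothetical. Two authors are treated as co-authors exactly when an author-paper-author path connects them.

```python
from itertools import combinations

# Hypothetical toy data: each "write" link connects an author to a paper.
writes = [
    ("alice", "p1"), ("bob", "p1"),
    ("bob", "p2"), ("carol", "p2"), ("dave", "p2"),
    ("erin", "p3"),
]

def coauthor_pairs(write_links):
    """Return author pairs joined by an author-paper-author path."""
    authors_of = {}
    for author, paper in write_links:
        authors_of.setdefault(paper, set()).add(author)
    pairs = set()
    for authors in authors_of.values():
        # Every pair of authors on the same paper is a co-author pair.
        for a, b in combinations(sorted(authors), 2):
            pairs.add((a, b))
    return pairs

pairs = coauthor_pairs(writes)
print(sorted(pairs))
```

Note that the co-author relationship never appears as a stored link; it is derived from two "write" links that meet at a paper, which is why features designed for single-type links do not carry over directly.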

6 Heterogeneous Information Networks

7 The DBLP Bibliographic Network
  The underlying meta structure is the network schema

8 Co-author Relationship Prediction

10 The PathPredict Model
  PathPredict: meta path-based relationship prediction model
    Propose meta path-based topological features in heterogeneous information networks (HIN)
      Topological features used in homogeneous networks cannot be directly used
    Use logistic regression-based supervised learning to learn the coefficient associated with each feature

12 Existing Topological Measure in Homogeneous Networks

13 Meta Path-based Topological Features

14 Meta Paths for Co-authorship Prediction in DBLP
  List of all the meta paths between authors under length 4
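
The list itself did not survive in this transcript, so the following is only a sketch of how such a list could be enumerated from a network schema. Everything in it is an assumption: the `SCHEMA` dictionary (A = author, P = paper, V = venue, T = topic), the treatment of citation as an undirected paper-paper link, the reading of "under length 4" as "at most four links", and the `meta_paths` helper. The authors' actual list may differ, for example by distinguishing citation direction.

```python
# Assumed type-level schema; the slide's schema figure is not available.
# Citation is simplified to an undirected P-P link.
SCHEMA = {
    "A": ["P"],
    "P": ["A", "P", "T", "V"],
    "T": ["P"],
    "V": ["P"],
}

def meta_paths(schema, start, end, max_len):
    """Enumerate type sequences from start to end using at most max_len links."""
    found = []
    stack = [(start,)]
    while stack:
        path = stack.pop()
        # Record any non-trivial sequence that ends at the target type.
        if len(path) > 1 and path[-1] == end:
            found.append("-".join(path))
        # Keep extending while the link budget allows it.
        if len(path) - 1 < max_len:
            for nxt in schema[path[-1]]:
                stack.append(path + (nxt,))
    return sorted(found, key=lambda p: (len(p), p))

paths = meta_paths(SCHEMA, "A", "A", 4)
for p in paths:
    print(p)
```

Under these assumptions the enumeration yields six type sequences, including A-P-A (the co-author relation itself) and A-P-V-P-A (authors publishing in the same venue).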

15 Four Meta Path-based Measures

16 Example: A-P-V-P-A meta path
  PC(J,M) = 7
  NPC(J,M) = (7+7)/(7+9)
  RW(J,M) = 1/2
  SRW(J,M) = 1/2 + 1/16
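
The formulas behind the four measures are not preserved in this transcript, so the sketch below rests on assumed definitions: PC counts path instances, NPC(a,b) = (PC(a,b) + PC(b,a)) / (PC(a,·) + PC(·,b)), RW(a,b) = PC(a,b) / PC(a,·), and SRW(a,b) = RW(a,b) + RW(b,a). The NPC form matches the shape of the slide's expression (7+7)/(7+9). The toy network, the author names, and all function names are hypothetical, and the numbers are not meant to reproduce the slide's example.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy data (not the network on the slide): single-author papers.
AUTHOR_PAPERS = {"jim": ["p1", "p2"], "mike": ["p3", "p4", "p5"], "ann": ["p6"]}
PAPER_VENUE = {"p1": "SIGMOD", "p2": "VLDB", "p3": "SIGMOD",
               "p4": "SIGMOD", "p5": "KDD", "p6": "VLDB"}

# Reverse indexes: paper -> authors, venue -> papers.
PAPER_AUTHORS = {}
for author, papers in AUTHOR_PAPERS.items():
    for paper in papers:
        PAPER_AUTHORS.setdefault(paper, []).append(author)
VENUE_PAPERS = {}
for paper, venue in PAPER_VENUE.items():
    VENUE_PAPERS.setdefault(venue, []).append(paper)

def apvpa_counts(source):
    """Count A-P-V-P-A path instances from source to each reachable author."""
    counts = Counter()
    for paper in AUTHOR_PAPERS[source]:
        for other_paper in VENUE_PAPERS[PAPER_VENUE[paper]]:
            for target in PAPER_AUTHORS[other_paper]:
                counts[target] += 1
    return counts

def path_count(a, b):
    return apvpa_counts(a)[b]

def total_paths(a):
    """PC(a, .): all path instances that start at a."""
    return sum(apvpa_counts(a).values())

def normalized_path_count(a, b):
    # A-P-V-P-A is symmetric, so PC(., b) equals PC(b, .).
    return Fraction(path_count(a, b) + path_count(b, a),
                    total_paths(a) + total_paths(b))

def random_walk(a, b):
    return Fraction(path_count(a, b), total_paths(a))

def symmetric_random_walk(a, b):
    return random_walk(a, b) + random_walk(b, a)

print(path_count("jim", "mike"))             # 2
print(normalized_path_count("jim", "mike"))  # 1/3
print(random_walk("jim", "mike"))            # 2/5
print(symmetric_random_walk("jim", "mike"))  # 24/35
```

Exact fractions are used so the values can be checked by hand. In this sketch a path that returns to the same paper or the same author is counted like any other, which is a simplifying choice.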

18 Supervised Learning Framework
  Training: T0-T1 time framework
    T0: feature collection (x)
    T1: relationship label collection (y)
  Testing: T0’-T1’ time framework, which may be shifted in time relative to the training stage
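
A minimal sketch of this framework, assuming a plain logistic regression fitted by batch gradient descent. The feature values, labels, learning rate, and function names are all hypothetical; each row stands for one author pair, with two meta path-based feature values measured in T0 and a label recording whether the pair co-authored in T1. The authors' actual optimizer and feature set are not given in this transcript.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Probability of a future co-author relationship; w[0] is the bias."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Training stage: features x collected in T0, labels y collected in T1.
X_train = [[0.9, 0.8], [0.7, 0.9], [0.8, 0.6],
           [0.1, 0.2], [0.2, 0.1], [0.3, 0.3]]
y_train = [1, 1, 1, 0, 0, 0]
w = train_logistic(X_train, y_train)

# Testing stage: features from a later interval T0', scored with the same weights.
X_test = [[0.85, 0.7], [0.15, 0.25]]
probs = [predict_proba(w, x) for x in X_test]
print([round(p, 3) for p in probs])
```

The learned weights play the role of the per-feature coefficients mentioned on the slides: a larger positive weight means that feature contributes more to predicting a future collaboration.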

19 Prediction Model

21 Experiment Setting

22 Homogeneous measures vs. heterogeneous measures
  Homogeneous measures:
    Consider only the co-author sub-network: common neighbor; rooted PageRank
    Consider the whole network and mix all types together: total path count
  Heterogeneous measure: heterogeneous path count
  Heterogeneous path count produces the best accuracy!

23 Over Four Datasets

24 Comparison among Different Heterogeneous Measures
  Normalized path count is slightly better among the single measures, and the hybrid measure that combines all measures is the best

25 Over Four Datasets

26 Impacts of Collaboration Frequency on Different Measures
  Symmetric random walk is better for predicting frequent co-author relationships

27 Model Generalization Over Time
  Conclusion: training on historical data can help predict the formation of future relationships

28 Learned Significance for Each Topological Feature
  Co-attended venues and shared co-authors are critical in determining two authors’ future collaboration

29 Case Studies for Queries

31 Conclusions
  Problem: extend link prediction in homogeneous networks to relationship prediction in HIN, using co-authorship prediction as a case study
  Solution
    Propose meta path-based topological features and measures in HIN
    Use logistic regression-based supervised learning to learn the coefficient associated with each feature
  Results
    Heterogeneous measures beat homogeneous measures
    The hybrid measure beats single measures

32 Thank you! Q & A

