
Advisor-advisee Relationship Mining from Research Publication Network
Chi Wang 1, Jiawei Han 1, Yuntao Jia 1, Jie Tang 2, Duo Zhang 1, Yintao Yu 1, Jingyi Guo 2
1 University of Illinois at Urbana-Champaign, {chiwang1, hanj, yjia3, dzhang22, yintao}@illinois.edu
2 Tsinghua University, {jietang, guojy07@mails}.tsinghua.edu.cn

Motivation
Latent knowledge in information networks:
– Relationships: friends/relatives/colleagues/enemies?
If such relationships can be mined from links, this will benefit the study of:
– Community structure → clustering & classification
– Expert searching → search & ranking
– Evolution patterns → prediction & recommendation

Overall Framework

– a_i: author i
– p_j: paper j
– py: paper year
– pn: number of papers
– st_{i,y_i}: starting time
– ed_{i,y_i}: ending time
– r_{i,y_i}: ranking score

Heuristics
ASSUMPTION 1: At each time t during the publication history of a node x, x is either being advised or not being advised. Once x starts to advise another node, it will never be advised again.
ASSUMPTION 2: For a given advisor-advisee pair, the advisor always has a longer publication history than the advisee.
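The two assumptions can be read as simple consistency checks on a candidate set of advising relations. The sketch below is only an illustration: the `Advising` record, the function names, and the choice to approximate "longer publication history" by an earlier first publication year are assumptions made here, not definitions taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Advising:
    """Hypothetical record: `advisee` is advised by `advisor` during [start, end]."""
    advisee: str
    advisor: str
    start: int
    end: int


def satisfies_assumption_1(relations):
    """Nobody is still being advised at the time they start advising someone."""
    # Earliest year in which each author appears as an advisor.
    first_advising = {}
    for r in relations:
        first_advising[r.advisor] = min(first_advising.get(r.advisor, r.start), r.start)
    # Every advisee's own advising period must end strictly before that year.
    return all(r.end < first_advising.get(r.advisee, float("inf")) for r in relations)


def satisfies_assumption_2(pub_years, advisor, advisee):
    """Assumed reading: the advisor's first publication predates the advisee's."""
    return min(pub_years[advisor]) < min(pub_years[advisee])
```

For example, if b is advised by a during 2000-2004 and starts advising c in 2005, Assumption 1 holds; if b started advising c in 2003, it would not.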

Stage 1: Preprocessing
Convert the author-paper bipartite network into a homogeneous author collaboration network. A filtering step then removes unlikely advisor-advisee relations.
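As a rough illustration of the first step, the projection from the bipartite author-paper network to a collaboration network can be sketched as below. The input layout (a list of `(year, authors)` tuples) and the function name are hypothetical; the slides do not specify a data format.

```python
from collections import defaultdict
from itertools import combinations


def project_to_coauthor_network(papers):
    """Map each unordered coauthor pair to the years of their joint papers.

    `papers` is an assumed format: an iterable of (year, [author, ...]).
    """
    edges = defaultdict(list)
    for year, authors in papers:
        # Sort so that each pair has one canonical key, e.g. ("a", "b").
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)].append(year)
    return dict(edges)
```

Keeping the collaboration years on each edge preserves the time information that the later estimates of starting and ending times rely on.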

Stage 1: Preprocessing
Author a_j is not considered to be a_i’s advisor if one of the following conditions holds:

Stage 1: Preprocessing
In addition, estimate for each candidate pair:
– the starting time st_ij, as the time a_i and a_j started to collaborate;
– the ending time ed_ij, for instance as the time point when the Kulczynski measure starts to decrease;
– the local likelihood l_ij of a_j being a_i’s advisor.
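The Kulczynski measure of two items A and B is the average of the two conditional probabilities, (P(A|B) + P(B|A)) / 2. The sketch below applies it to cumulative paper counts per year to estimate a starting and ending time. Evaluating the measure cumulatively, and taking the year just before its first decrease as the ending time, are assumptions made for illustration; the slides do not spell out the exact procedure.

```python
def kulczynski(n_a, n_b, n_ab):
    """Average of P(B|A) and P(A|B), computed from paper counts."""
    if n_a == 0 or n_b == 0:
        return 0.0
    return 0.5 * (n_ab / n_a + n_ab / n_b)


def estimate_interval(years_a, years_b, years_ab):
    """Return (start, end) for a pair, or None if they never collaborated.

    years_a, years_b: publication years of each author (one entry per paper).
    years_ab: publication years of their joint papers.
    """
    if not years_ab:
        return None
    start = min(years_ab)                      # first collaboration
    last = max(max(years_a), max(years_b))
    end, prev = last, None
    for t in range(start, last + 1):
        k = kulczynski(
            sum(y <= t for y in years_a),
            sum(y <= t for y in years_b),
            sum(y <= t for y in years_ab),
        )
        if prev is not None and k < prev:      # measure starts to decrease
            end = t - 1
            break
        prev = k
    return start, end
```

With an advisee publishing yearly from 2000 to 2004, jointly with the advisor only through 2002, the cumulative measure rises until 2002 and drops in 2003, so the estimated interval is 2000 to 2002.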

Stage 2: Graph Factor Model
For each node a_i there are three variables to decide: y_i, st_i, and ed_i. Suppose we already have a local feature function g(y_i, st_i, ed_i) defined on these three variables for any given node.
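To make the joint decision concrete, here is a toy sketch that reads y_i as the chosen advisor of a_i and scores a full assignment by the product of local values g(y_i, st_i, ed_i), while rejecting assignments that violate Assumption 1. It enumerates all combinations exhaustively, which is only feasible for tiny examples and is not the inference procedure used by TPFG. The candidate tuple layout and the function name are hypothetical.

```python
from itertools import product


def best_assignment(candidates):
    """candidates: {author: [(advisor_or_None, st, ed, local_value), ...]}.

    Returns ({author: advisor_or_None}, score) for the highest-scoring
    assignment in which no advisor is still being advised when advising starts.
    """
    authors = list(candidates)
    best, best_score = None, -1.0
    for choice in product(*(candidates[a] for a in authors)):
        assign = dict(zip(authors, choice))
        valid = True
        for advisor, st, _ed, _g in choice:
            if advisor is None or advisor not in assign:
                continue
            adv_advisor, _adv_st, adv_ed, _ = assign[advisor]
            # The advisor's own advised period must end before st.
            if adv_advisor is not None and not adv_ed < st:
                valid = False
                break
        if not valid:
            continue
        score = 1.0
        for _advisor, _st, _ed, g in choice:
            score *= g
        if score > best_score:
            best = {a: c[0] for a, c in assign.items()}
            best_score = score
    return best, best_score
```

In a three-author example where b is most likely advised by a during 2000-2004 and c is locally most likely advised by b from 2003, the locally best choices conflict, so the search falls back to the best consistent combination.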

Experiment Results
DBLP data: 654,628 authors, 1,076,946 publications, years provided.

Dataset  RULE    SVM     IndMAX          TPFG
TEST1    69.9%   73.4%   75.2%   78.9%   80.2%   84.4%
TEST2    69.8%   74.6%           79.0%   81.5%   84.3%
TEST3    80.6%   86.7%   83.1%   90.9%   88.8%   91.3%

Column annotations: empirical parameter; optimized parameter; heuristics; supervised learning.

Case Study

Advisee         Top Ranked Advisor      Time    Note
David M. Blei   1. Michael I. Jordan    01-03   PhD advisor, 2004 grad
                2. John D. Lafferty     05-06   Postdoc, 2006
Hong Cheng      1. Qiang Yang           02-03   MS advisor, 2003
                2. Jiawei Han           04-08   PhD advisor, 2008
Sergey Brin     1. Rajeev Motwani       97-98   “Unofficial advisor”

Effect of rules - ROC curve
[Figure: ROC curve, filtering rules in TPFG]

THANK YOU
