Guided Learning for Role Discovery (GLRD)

Presented by Rui Liu

Gilpin, Sean, Tina Eliassi-Rad, and Ian Davidson. "Guided learning for role discovery (glrd): Framework, algorithms, and applications." SIGKDD, 2013.

Background

Role discovery
– Find groups of nodes that share similar topological structure in the graph (e.g. hub nodes, members of a clique, peripheral nodes)

Feature matrix V for a graph
– Pre-computed given a graph
– Examples of features:
  – Node degree
  – Number of triangles a node participates in
  – Maximal neighbor degree
  – Average neighbor degree
– Existing algorithm to compute the feature matrix: ReFeX [1]

1. Henderson, Keith, et al. "It's who you know: graph mining using recursive structural features." SIGKDD, 2011.
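The four example features can be computed directly from an adjacency structure. The sketch below is only an illustration, not ReFeX itself (ReFeX also aggregates features recursively, which is omitted here). The function name `structural_features` and the dict-of-sets graph representation are assumptions made for this example.

```python
from itertools import combinations

def structural_features(adj):
    """Compute one feature row per node.

    adj: dict mapping node -> set of neighbors (undirected, no self-loops).
    Returns dict node -> [degree, triangles, max neighbor degree, avg neighbor degree].
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    features = {}
    for v, nbrs in adj.items():
        # A triangle through v is a pair of v's neighbors that are adjacent to each other.
        triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        nbr_degrees = [degree[u] for u in nbrs]
        avg = sum(nbr_degrees) / len(nbr_degrees) if nbr_degrees else 0.0
        features[v] = [degree[v], triangles, max(nbr_degrees, default=0), avg]
    return features

# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
V_rows = structural_features(toy_graph)
```

Stacking the rows of `V_rows` in a fixed node order gives a small n×f feature matrix V of the kind the slides assume as input.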

Contribution of this paper

Existing work on the role discovery problem
– RolX [1]: factorizes the feature matrix as V ≈ GF, where the n×r matrix G is the role assignment matrix and the r×f matrix F is the role explanation matrix

This paper
– Obtains a specific solution by adding convex constraints
– E.g. sparsity, diversity, alternativeness

1. Henderson, Keith, et al. "RolX: structural role extraction & mining in large graphs." SIGKDD, 2012.

Constraints – Sparsity

– Nodes are assigned to as few roles as possible
– Roles are defined with respect to as few features as possible
– Gives a simple explanation of the data

Constraints – Diversity

– Prevents role definitions and role assignments from overlapping heavily
– Each role uses a different set of features, and nodes are assigned to different combinations of roles

Constraints – Alternativeness

– There may exist multiple explanations of the data
– The returned explanation may be undesirable
– Find another good explanation that is different from those already found

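Algorithm

Solve the following general form: [equation not preserved in the transcript]

Basic strategy
– Alternating Least Squares (ALS)
– Solve for one column of G or one row of F at a time
– The original problem becomes a series of subproblems, each convex and easy to solve: [subproblem equation and its "where" definitions not preserved in the transcript]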
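The alternating scheme can be sketched in a few lines. Because the slide's formulas are not preserved here, this is not the paper's exact subproblem: it assumes the only constraint is nonnegativity of G and F, in which case each one-column or one-row subproblem has a closed-form solution. The names `als_factorize` and `frobenius_error` are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
import random

def frobenius_error(V, G, F):
    """Frobenius norm of V - GF for list-of-lists matrices."""
    r = len(F)
    total = 0.0
    for i, row in enumerate(V):
        for j, v in enumerate(row):
            diff = v - sum(G[i][q] * F[q][j] for q in range(r))
            total += diff * diff
    return total ** 0.5

def als_factorize(V, r, n_iter=30, seed=0):
    """Approximate V (n x f) by G (n x r) times F (r x f), with G, F >= 0."""
    rng = random.Random(seed)
    n, f = len(V), len(V[0])
    G = [[rng.random() for _ in range(r)] for _ in range(n)]
    F = [[rng.random() for _ in range(f)] for _ in range(r)]
    errors = [frobenius_error(V, G, F)]
    for _ in range(n_iter):
        for k in range(r):
            # Residual of V after subtracting every role except role k.
            R = [[V[i][j] - sum(G[i][q] * F[q][j] for q in range(r) if q != k)
                  for j in range(f)] for i in range(n)]
            # Subproblem 1: column k of G, with row k of F held fixed.
            ff = sum(x * x for x in F[k])
            if ff > 0:
                for i in range(n):
                    G[i][k] = max(0.0, sum(R[i][j] * F[k][j] for j in range(f)) / ff)
            # Subproblem 2: row k of F, with column k of G held fixed.
            gg = sum(G[i][k] ** 2 for i in range(n))
            if gg > 0:
                for j in range(f):
                    F[k][j] = max(0.0, sum(R[i][j] * G[i][k] for i in range(n)) / gg)
        errors.append(frobenius_error(V, G, F))
    return G, F, errors

# Toy 4 x 3 nonnegative feature matrix, factorized with r = 2 roles.
V_toy = [[2.0, 1.0, 3.0], [2.0, 1.0, 3.0], [3.0, 1.0, 2.0], [1.0, 0.0, 3.0]]
G_toy, F_toy, err_history = als_factorize(V_toy, r=2)
```

Each update exactly minimizes its subproblem, so the reconstruction error recorded in `err_history` should not increase from one sweep to the next. Additional convex constraints such as sparsity would replace the closed-form step with a small constrained solve.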

Experiment – Identity Resolution

DBLP dataset
– 6 co-author graphs from 6 different conferences: KDD, ICDM, SDM, CIKM, SIGMOD, VLDB

Steps for evaluation
4. Recall_icdm = num_match / num_total, where num_total is the size of the set S (S contains the authors shared by ICDM and KDD) and num_match is the number of authors i in S for which the k nearest neighbors (rows) of G_kdd(i, :) in G_icdm include author i's own row.
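The recall in step 4 can be sketched as follows. The slide does not name the distance used for the nearest-neighbor search, so Euclidean distance is an assumption here, as are the function name `recall_at_k`, the dict-of-vectors layout, and the made-up toy role vectors.

```python
import math

def recall_at_k(G_src, G_tgt, shared, k):
    """Fraction of shared authors found among their own k nearest target rows.

    G_src, G_tgt: dict author -> role assignment vector (list of floats).
    shared: authors present in both graphs (the set S).
    """
    num_match = 0
    for author in shared:
        query = G_src[author]
        # Rank every row of the target matrix by distance to the query row.
        ranked = sorted(G_tgt, key=lambda a: math.dist(query, G_tgt[a]))
        if author in ranked[:k]:
            num_match += 1
    return num_match / len(shared)

# Made-up role vectors for illustration only.
G_kdd = {"ann": [0.9, 0.1], "bob": [0.2, 0.8], "cat": [0.5, 0.5]}
G_icdm = {"ann": [0.8, 0.2], "bob": [0.6, 0.4], "cat": [0.55, 0.45], "dan": [0.1, 0.9]}
S = ["ann", "bob", "cat"]

recall_k1 = recall_at_k(G_kdd, G_icdm, S, k=1)
recall_k3 = recall_at_k(G_kdd, G_icdm, S, k=3)
```

In this toy data "bob" plays a different role in the second graph, so he is only recovered once k is large enough, which is the effect the recall measure is meant to capture.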

Experiment – Identity Resolution

Conclusion: GLRD does better on data mining conferences such as CIKM, SDM, and ICDM, but not on other conferences such as SIGMOD and VLDB.

Reason: The same authors play similar roles in KDD and in the other data mining conferences (CIKM, SDM, ICDM). Their role assignment vectors should therefore be similar, which results in high recall.

Experiment – Alternative roles

Dataset: KDD co-author graph
– Use RolX to get the original role definition
– Use GLRD to find a role definition different from the original one
