R-MAT: A Recursive Model for Graph Mining

Slides:



Advertisements
Similar presentations
1 Generating Network Topologies That Obey Power LawsPalmer/Steffan Carnegie Mellon Generating Network Topologies That Obey Power Laws Christopher R. Palmer.
Advertisements

1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University
1 Realistic Graph Generation and Evolution Using Kronecker Multiplication Jurij Leskovec, CMU Deepay Chakrabarti, CMU/Yahoo Jon Kleinberg, Cornell Christos.
CMU SCS I2.2 Large Scale Information Network Processing INARC 1 Overview Goal: scalable algorithms to find patterns and anomalies on graphs 1. Mining Large.
Power Laws By Cameron Megaw 3/11/2013. What is a Power Law?
CHARALAMPOS E. TSOURAKAKIS SCHOOL OF COMPUTER SCIENCE CARNEGIE MELLON UNIVERSITY Fast counting of triangles in large networks without counting: Algorithms.
Kronecker Graphs: An Approach to Modeling Networks Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, Zoubin Ghahramani Presented.
On Power-Law Relationships of the Internet Topology Michalis Faloutsos Petros Faloutsos Christos Faloutsos.
More on Rankings. Query-independent LAR Have an a-priori ordering of the web pages Q: Set of pages that contain the keywords in the query q Present the.
DATA MINING LECTURE 12 Link Analysis Ranking Random walks.
Modeling Real Graphs using Kronecker Multiplication
CMU SCS C. Faloutsos (CMU)#1 Large Graph Algorithms Christos Faloutsos CMU McGlohon, Mary Prakash, Aditya Tong, Hanghang Tsourakakis, Babis Akoglu, Leman.
NetMine: Mining Tools for Large Graphs Deepayan Chakrabarti Yiping Zhan Daniel Blandford Christos Faloutsos Guy Blelloch.
Social Networks and Graph Mining Christos Faloutsos CMU - MLD.
CMU SCS KDD 2006Leskovec & Faloutsos1 ??. CMU SCS KDD 2006Leskovec & Faloutsos2 Sampling from Large Graphs poster# 305 Jurij (Jure) Leskovec Christos.
1 Epidemic Spreading in Real Networks: an Eigenvalue Viewpoint Yang Wang Deepayan Chakrabarti Chenxi Wang Christos Faloutsos.
Web as Graph – Empirical Studies The Structure and Dynamics of Networks.
Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, Slides for Chapter 1:
Sampling from Large Graphs. Motivation Our purpose is to analyze and model social networks –An online social network graph is composed of millions of.
On Power-Law Relationships of the Internet Topology CSCI 780, Fall 2005.
Analysis of the Internet Topology Michalis Faloutsos, U.C. Riverside (PI) Christos Faloutsos, CMU (sub- contract, co-PI) DARPA NMS, no
CMU SCS Graph and stream mining Christos Faloutsos CMU.
CSE 321 Discrete Structures Winter 2008 Lecture 25 Graph Theory.
CMU SCS Yahoo/Hadoop, 2008#1 Peta-Graph Mining Christos Faloutsos Prakash, Aditya Shringarpure, Suyash Tsourakakis, Charalampos Appel, Ana Chau, Polo Leskovec,
HCC class lecture 22 comments John Canny 4/13/05.
Alan Frieze Charalampos (Babis) E. Tsourakakis WAW June ‘12 WAW '121.
1 Tools for Large Graph Mining Thesis Committee:  Christos Faloutsos  Chris Olston  Guy Blelloch  Jon Kleinberg (Cornell) - Deepayan Chakrabarti.
Information Networks Introduction to networks Lecture 1.
Weighted Graphs and Disconnected Components Patterns and a Generator IDB Lab 현근수 In KDD 08. Mary McGlohon, Leman Akoglu, Christos Faloutsos.
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P0-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.
Data Analysis in YouTube. Introduction Social network + a video sharing media – Potential environment to propagate an influence. Friendship network and.
Vertices and Edges Introduction to Graphs and Networks Mills College Spring 2012.
A Graph-based Friend Recommendation System Using Genetic Algorithm
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos.
Graph Algorithms. Graph Algorithms: Topics  Introduction to graph algorithms and graph represent ations  Single Source Shortest Path (SSSP) problem.
Yongqin Gao, Greg Madey Computer Science & Engineering Department University of Notre Dame © Copyright 2002~2003 by Serendip Gao, all rights reserved.
R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
Graphs G = (V,E) V is the vertex set. Vertices are also called nodes and points. E is the edge set. Each edge connects two different vertices. Edges are.
Class 2: Graph Theory IST402. Can one walk across the seven bridges and never cross the same bridge twice? Network Science: Graph Theory THE BRIDGES OF.
Ljiljana Rajačić. Page Rank Web as a directed graph  Nodes: Web pages  Edges: Hyperlinks 2 / 25 Ljiljana Rajačić.
Class 2: Graph Theory IST402.
1 Patterns of Cascading Behavior in Large Blog Graphs Jure Leskoves, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst SDM 2007 Date:2008/8/21.
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.
Dynamic Network Analysis Case study of PageRank-based Rewiring Narjès Bellamine-BenSaoud Galen Wilkerson 2 nd Second Annual French Complex Systems Summer.
1 Data Structures and Algorithms Graphs. 2 Graphs Basic Definitions Paths and Cycles Connectivity Other Properties Representation Examples of Graph Algorithms:
Cmpe 588- Modeling of Internet Emergence of Scale-Free Network with Chaotic Units Pulin Gong, Cees van Leeuwen by Oya Ünlü Instructor: Haluk Bingöl.
Dynamics of Real-world Networks
Modeling networks using Kronecker multiplication
PEGASUS: A PETA-SCALE GRAPH MINING SYSTEM
On the distributed complexity of computing maximal matching Michal Hanckowiak, Michal Karonski, Allesandro Panconesi Presented by Maya Levy January 2016.
Modeling, sampling, generating Networks with MRV
NetMine: Mining Tools for Large Graphs
Tools for Large Graph Mining
Generative Model To Construct Blog and Post Networks In Blogosphere
Degree and Eigenvector Centrality
CS200: Algorithm Analysis
Network Science: A Short Introduction i3 Workshop
Part 1: Graph Mining – patterns
Prof. Paolo Ferragina, Algoritmi per "Information Retrieval"
The likelihood of linking to a popular website is higher
Graph and Tensor Mining for fun and profit
Prof. Paolo Ferragina, Algoritmi per "Information Retrieval"
Dynamics of Real-world Networks
Graph and Tensor Mining for fun and profit
Clustering Coefficients
Katz Centrality (directed graphs).
Modelling and Searching Networks Lecture 2 – Complex Networks
Advanced Topics in Data Mining Special focus: Social Networks
Graphs G = (V,E) V is the vertex set.
Presentation transcript:

R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos

Introduction Graphs are ubiquitous Protein Interactions [genomebiology.com] Internet Map [lumeta.com] Food Web [Martinez ’91] Graphs are ubiquitous “Patterns”  regularities that occur in many graphs We want a realistic and efficient graph generator which matches many patterns and would be very useful for simulation studies.

“Network values” vs Rank Graph Patterns Effective Diameter Count vs Indegree Count vs Outdegree Hop-plot Power Laws “Network values” vs Rank Count vs Stress Eigenvalue vs Rank

Our Proposed Generator Choose quadrant b a b a=0.4 b=0.15 c d c=0.15 d=0.3 Initially Choose quadrant c and so on ….. Final cell chosen, “drop” an edge here.

Our Proposed Generator Communities RedHat a b Communities within communities Linux guys b c d Mandrake Windows guys c d Cross-community links Shows a “community” effect

Experiments (Epinions directed graph) Effective Diameter Count vs Indegree Count vs Outdegree Hop-plot Count vs Stress Eigenvalue vs Rank “Network value” ►R-MAT matches directed graphs

Experiments (Clickstream bipartite graph) Count vs Indegree Count vs Outdegree Hop-plot Singular value vs Rank Left “Network value” Right “Network value” ►R-MAT matches bipartite graphs

Experiments (Epinions undirected graph) Count vs Indegree Singular value vs Rank Hop-plot “Network value” Count vs Stress ►R-MAT matches undirected graphs

Conclusions The R-MAT graph generator matches the patterns mentioned before along with DGX/lognormal degree distributions  can be shown theoretically exhibits a “Community” effect generates undirected, directed, bipartite and weighted graphs with ease requires only 3 parameters (a,b,c), and, is fast and scalable  O(E logN)

The “DGX”/lognormal distribution Deviations from power-laws have been observed [Pennock+ ’02] These are well-modeled by the DGX distri- bution [Bi+’01] Essentially fits a parabola instead of a line to the log-log plot. “Drifting” surfers Count “Devoted” surfer Degree Clickstream data

Our Proposed Generator R-MAT (Recursive MATrix) [SIAM DM’04] 2n Subdivide the adjacency matrix and choose one quadrant with probability (a,b,c,d) Recurse till we reach a 1*1 cell where we place an edge and repeat for all edges. a = 0.4 b = 0.15 c = 0.15 d = 0.3