Sampling from Large Graphs (poster #305). Jurij (Jure) Leskovec and Christos Faloutsos. CMU SCS, KDD 2006.

Presentation transcript:

Slide 1: (no text recovered)

Slide 2: Sampling from Large Graphs (poster #305), KDD 2006
Jurij (Jure) Leskovec, Christos Faloutsos
Carnegie Mellon University

Slide 3: Problems and recommendations
Q: How to sample from a large graph?
A: FF (Forest Fire), RN (random nodes)
Q: Which properties to preserve?
A: (at least) the 13 ones we list
Q: How to measure success/similarity?
A: K-S (Kolmogorov-Smirnov D-statistic), towards a 'back-in-time' version

Slide 4: Criteria
STATIC
  in-degree; out-degree distribution
  distr. of WCC; SCC
  hop-plot; hop-plot for WCC
  distr. of first left singular vector values
  scree plot
  distr. of clustering coefficient
TEMPORAL
  densification power law
  shrinking diameter
  normalized size of largest c.c.
  first eigenvalue
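
Two of the static criteria above (degree distributions and the clustering coefficient) can be computed with a few lines of code. The following is a minimal sketch, not the authors' implementation. It assumes the graph is given as a list of directed `(u, v)` edge pairs with comparable node ids; the function names are hypothetical, and the clustering coefficient is computed on the undirected version of the graph.

```python
from collections import Counter


def degree_distributions(edges):
    """Return (in_hist, out_hist): how many nodes have each in-/out-degree.

    `edges` is a list of directed (u, v) pairs (an assumed input format).
    """
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    nodes = {n for edge in edges for n in edge}
    in_hist = Counter(in_deg.get(n, 0) for n in nodes)
    out_hist = Counter(out_deg.get(n, 0) for n in nodes)
    return in_hist, out_hist


def clustering_coefficients(edges):
    """Local clustering coefficient per node, ignoring edge direction.

    For a node with d >= 2 neighbours this is the number of links among
    its neighbours divided by d * (d - 1) / 2; nodes with d < 2 get 0.0.
    """
    nbrs = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops do not contribute to triangles
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    cc = {}
    for node, ns in nbrs.items():
        d = len(ns)
        if d < 2:
            cc[node] = 0.0
            continue
        links = sum(1 for a in ns for b in ns if a < b and b in nbrs[a])
        cc[node] = 2.0 * links / (d * (d - 1))
    return cc
```

Running both functions on the full graph and on a sample gives the pairs of distributions that a similarity measure can then compare.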

Slide 5: Targets
  scale-down (= fewer nodes; same diameter, same degree, etc.)
  back-in-time (match an earlier, real, smaller version of the graph)

Slide 6: Sampling Methods
  RN: random nodes
  RPN: PageRank random nodes
  RDN: random nodes, degree-biased
  RE: random edges
  RNE
  HYB (Hybrid)
  RNN
  RJ: random jump
  RW: random walk
  FF: Forest Fire
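
To make three of these methods concrete (RN, RW and FF), here is a minimal sketch under stated assumptions; it is not the authors' code. The graph is assumed to be a dict mapping every node to a list of its out-neighbours, with sortable node ids. All function names are hypothetical, the parameter values (`restart=0.15`, `p_forward=0.7`) are illustrative, and the Forest Fire variant is simplified to burn forward links only.

```python
import random
from collections import deque


def induced_edges(adj, nodes):
    """Edges of the original graph whose endpoints are both sampled."""
    return {(u, v) for u in nodes for v in adj[u] if v in nodes}


def random_node_sample(adj, k, rng):
    """RN: choose k nodes uniformly at random."""
    return set(rng.sample(sorted(adj), k))


def random_walk_sample(adj, k, rng, restart=0.15, max_steps=100_000):
    """RW: walk from a random start node, flying back to it with
    probability `restart`, until k nodes are visited or max_steps is hit."""
    start = rng.choice(sorted(adj))
    current, visited = start, {start}
    for _ in range(max_steps):
        if len(visited) >= k:
            break
        if rng.random() < restart or not adj[current]:
            current = start  # restart (also used at dead ends)
        else:
            current = rng.choice(adj[current])
            visited.add(current)
    return visited  # may hold fewer than k nodes if the walk got stuck


def forest_fire_sample(adj, k, rng, p_forward=0.7):
    """FF (simplified): burn outward from random seeds until k nodes burn.

    Each burning node spreads to x unburned out-neighbours, where x is
    geometric with mean p_forward / (1 - p_forward).
    """
    nodes = sorted(adj)
    if k > len(nodes):
        raise ValueError("k exceeds the number of nodes")
    burned = set()
    while len(burned) < k:
        seed = rng.choice(nodes)
        if seed in burned:
            continue
        burned.add(seed)
        frontier = deque([seed])
        while frontier and len(burned) < k:
            v = frontier.popleft()
            x = 0
            while rng.random() < p_forward:
                x += 1
            candidates = [w for w in adj[v] if w not in burned]
            rng.shuffle(candidates)
            for w in candidates[:x]:
                if len(burned) >= k:
                    break
                burned.add(w)
                frontier.append(w)
    return burned
```

Each sampler returns a set of nodes; `induced_edges(adj, nodes)` then yields the sampled subgraph. Passing a seeded `random.Random` instance as `rng` makes a run repeatable.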

Slide 7: 4 Datasets
  Arxiv (author-paper)
  Citation (HEP-TH, HEP-PH)
  A.S.
  epinions.com
  26K - 500K edges

Slide 8 (figure): diameter vs N; CC vs degree

Slide 9 (figure): degree distribution; avg CC vs N

Slide 10 (figure): diameter; DPL
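
The densification power law (DPL) says that the number of edges grows as a power of the number of nodes over time, e(t) ∝ n(t)^a. As a rough illustration of how the exponent a could be estimated from graph snapshots, the sketch below fits a straight line in log-log space by ordinary least squares. The function name and the input format (two parallel lists of node and edge counts) are assumptions, not something taken from the slides.

```python
import math


def fit_densification_exponent(node_counts, edge_counts):
    """Fit e = c * n**a by least squares on (log n, log e).

    Returns (a, c). Needs at least two snapshots with distinct node counts.
    """
    xs = [math.log(n) for n in node_counts]
    ys = [math.log(e) for e in edge_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    c = math.exp(mean_y - a * mean_x)
    return a, c
```

An exponent above 1 indicates densification: edges grow faster than nodes as the graph evolves.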

Slide 11 (figure): D-statistic vs sample size, for the scale-down and back-in-time targets (plot annotated "better")
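
The Kolmogorov-Smirnov D-statistic is the largest absolute gap between two empirical cumulative distribution functions; a smaller D means the sample's distribution is closer to the target's. The following is a minimal standard-library sketch. The function name is hypothetical, and the inputs are assumed to be plain lists of observed values, for example node degrees in the original graph and in the sample.

```python
from bisect import bisect_right


def ks_d_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max over x of |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to check the gap at every distinct value in either sample.
    for x in set(a) | set(b):
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d
```

Because D depends only on the shapes of the two distributions, it can compare a small sample against a much larger graph without rescaling.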

Slide 12: Conclusions
  random nodes + a little exploration -> FF (RN, RJ are close)
  15% sample seems enough
  back-in-time concept