CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.

Slides:



Advertisements
Similar presentations
CMU SCS Large Graph Mining - Patterns, Explanations and Cascade Analysis Christos Faloutsos CMU.
Advertisements

CMU SCS I2.2 Large Scale Information Network Processing INARC 1 Overview Goal: scalable algorithms to find patterns and anomalies on graphs 1. Mining Large.
School of Computer Science Carnegie Mellon University Duke University DeltaCon: A Principled Massive- Graph Similarity Function Danai Koutra Joshua T.
CMU SCS Copyright: C. Faloutsos (2012)# : Multimedia Databases and Data Mining Lecture #27: Graph mining - Communities and a paradox Christos.
CMU SCS : Multimedia Databases and Data Mining Lecture #26: Graph mining - patterns Christos Faloutsos.
1 Social Influence Analysis in Large-scale Networks Jie Tang 1, Jimeng Sun 2, Chi Wang 1, and Zi Yang 1 1 Dept. of Computer Science and Technology Tsinghua.
CMU SCS C. Faloutsos (CMU)#1 Large Graph Algorithms Christos Faloutsos CMU McGlohon, Mary Prakash, Aditya Tong, Hanghang Tsourakakis, Babis Akoglu, Leman.
CMU SCS Mining Billion-node Graphs Christos Faloutsos CMU.
SCS CMU Joint Work by Hanghang Tong, Spiros Papadimitriou, Jimeng Sun, Philip S. Yu, Christos Faloutsos Speaker: Hanghang Tong Aug , 2008, Las Vegas.
Social Networks and Graph Mining Christos Faloutsos CMU - MLD.
CMU SCS KDD 2006Leskovec & Faloutsos1 ??. CMU SCS KDD 2006Leskovec & Faloutsos2 Sampling from Large Graphs poster# 305 Jurij (Jure) Leskovec Christos.
1 Epidemic Spreading in Real Networks: an Eigenvalue Viewpoint Yang Wang Deepayan Chakrabarti Chenxi Wang Christos Faloutsos.
CMU SCS Mining Large Graphs Christos Faloutsos CMU.
Neighborhood Formation and Anomaly Detection in Bipartite Graphs Jimeng Sun Huiming Qu Deepayan Chakrabarti Christos Faloutsos Speaker: Jimeng Sun.
CMU SCS APWeb 07(c) 2007, C. Faloutsos 1 Copyright notice Copyright (c) 2007, Christos Faloutsos - all rights preserved. Permission to use all or some.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
CMU SCS Graph mining: Patterns, Generators and Tools Christos Faloutsos CMU.
Common Properties of Real Networks. Erdős-Rényi Random Graphs.
RTG: A Recursive Realistic Graph Generator using Random Typing Leman Akoglu and Christos Faloutsos Carnegie Mellon University.
SCS CMU Proximity Tracking on Time- Evolving Bipartite Graphs Speaker: Hanghang Tong Joint Work with Spiros Papadimitriou, Philip S. Yu, Christos Faloutsos.
Analysis of the Internet Topology Michalis Faloutsos, U.C. Riverside (PI) Christos Faloutsos, CMU (sub- contract, co-PI) DARPA NMS, no
CMU SCS Bio-informatics, Graph and Stream mining Christos Faloutsos CMU.
CMU SCS Graph and stream mining Christos Faloutsos CMU.
CMU SCS Graph Mining and Influence Propagation Christos Faloutsos CMU.
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P3-1 Large Graph Mining: Power Tools and a Practitioner’s guide Task 3: Recommendations & proximity Faloutsos,
CMU SCS Yahoo/Hadoop, 2008#1 Peta-Graph Mining Christos Faloutsos Prakash, Aditya Shringarpure, Suyash Tsourakakis, Charalampos Appel, Ana Chau, Polo Leskovec,
CMU SCS Multimedia and Graph mining Christos Faloutsos CMU.
CMU SCS Graph Mining: Laws, Generators and Tools Christos Faloutsos CMU.
CMU SCS : Multimedia Databases and Data Mining Lecture #30: Conclusions C. Faloutsos.
Spectral Graph Theory (Basics)
CMU SCS Data Mining in Streams and Graphs Christos Faloutsos CMU.
Alan Frieze Charalampos (Babis) E. Tsourakakis WAW June ‘12 WAW '121.
CMU SCS Large Graph Mining: Power Tools and a Practitioner’s Guide Christos Faloutsos Gary Miller Charalampos (Babis) Tsourakakis CMU.
CMU SCS Big (graph) data analytics Christos Faloutsos CMU.
CMU SCS Graph Mining: Laws, Generators and Tools Christos Faloutsos CMU.
Weighted Graphs and Disconnected Components Patterns and a Generator IDB Lab 현근수 In KDD 08. Mary McGlohon, Leman Akoglu, Christos Faloutsos.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P0-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.
CMU SCS Mining Billion-node Graphs: Patterns, Generators and Tools Christos Faloutsos CMU.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos.
CMU SCS Mining Billion-Node Graphs Christos Faloutsos CMU.
CMU SCS Mining Billion-Node Graphs: Patterns and Algorithms Christos Faloutsos CMU.
CMU SCS Graph Mining - surprising patterns in real graphs Christos Faloutsos CMU.
CMU SCS Mining Billion Node Graphs Christos Faloutsos CMU.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
Xiaowei Ying, Xintao Wu Dept. Software and Information Systems Univ. of N.C. – Charlotte 2008 SIAM Conference on Data Mining, April 25 th Atlanta, Georgia.
CMU SCS KDD '09Faloutsos, Miller, Tsourakakis P5-1 Large Graph Mining: Power Tools and a Practitioner’s guide Task 5: Graphs over time & tensors Faloutsos,
Du, Faloutsos, Wang, Akoglu Large Human Communication Networks Patterns and a Utility-Driven Generator Nan Du 1,2, Christos Faloutsos 2, Bai Wang 1, Leman.
R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos.
CMU SCS Graph Mining: patterns and tools for static and time-evolving graphs Christos Faloutsos CMU.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
CMU SCS Graph Mining Christos Faloutsos CMU. CMU SCS iCAST, Jan. 09C. Faloutsos 2 Thank you! Prof. Hsing-Kuo Kenneth Pao Eric, Morgan, Ian, Teenet.
CMU SCS Mining Large Social Networks: Patterns and Anomalies Christos Faloutsos CMU.
CMU SCS Graph Mining: Laws, Generators and Tools Christos Faloutsos CMU.
CMU SCS Graph Mining: Laws, Generators and Tools Christos Faloutsos CMU.
1 Patterns of Cascading Behavior in Large Blog Graphs Jure Leskoves, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst SDM 2007 Date:2008/8/21.
CMU SCS Panel: Social Networks Christos Faloutsos CMU.
CMU SCS KDD '09Faloutsos, Miller, Tsourakakis P8-1 Large Graph Mining: Power Tools and a Practitioner’s guide Task 8: hadoop and Tera/Peta byte graphs.
Large Graph Mining: Power Tools and a Practitioner’s guide
DOULION: Counting Triangles in Massive Graphs with a Coin
NetMine: Mining Tools for Large Graphs
Large Graph Mining: Power Tools and a Practitioner’s guide
Large Graph Mining: Power Tools and a Practitioner’s guide
Part 1: Graph Mining – patterns
R-MAT: A Recursive Model for Graph Mining
Graph and Tensor Mining for fun and profit
Graph and Tensor Mining for fun and profit
Algorithms for Large Graph Mining
Large Graph Mining: Power Tools and a Practitioner’s guide
Presentation transcript:

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos (Babis) Tsourakakis CMU

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-2 Outline Introduction – Motivation Task 1: Node importance Task 2: Community detection Task 3: Recommendations Task 4: Connection sub-graphs Task 5: Mining graphs over time Task 6: Virus/influence propagation Task 7: Spectral graph theory Task 8: Tera/peta graph mining: hadoop Observations – patterns of real graphs Conclusions

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-3 Observations – ‘laws’ of real graphs Observation #1: small and SHRINKING diameter Observation #2: power law / skewed degree distributions Observation #3: power laws in several aspects Observation #4: communities

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-4 Observation 1 – diameter Small diameter – ‘six degrees’ … and the diameter SHRINKS as the graph grows (!)

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-5 Diameter – “Patents” Patent citation network 25 years of data time [years] diameter

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-6 Observation 1 – diameter Small diameter – ‘six degrees’ … and the diameter SHRINKS as the graph grows (!) Practical implication: BFS may die: –3-step-away neighbors => half of the graph!

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-7 Observations 2 – degree distribution Skewed degree distribution Most nodes have degree 1 or 2 … but they probably have a neighbor with degree 100,000 or so (!)

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-8 Degree distributions web hit counts [w/ A. Montgomery] Web Site Traffic log(in-degree) log(count) Zipf users sites ``ebay’’

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-9 epinions.com who-trusts-whom [Richardson + Domingos, KDD 2001] (out) degree count trusts-2000-people user

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-10 Observation 2 – degree distributions Skewed degree distribution Most nodes have degree 1 or 2 … but they probably have a neighbor with degree 100,000 or so (!) Practical implications: May need to delete/ignore those high degree nodes Could probably also trim the 1-degree nodes, saving significant space and time

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-11 Observation 3 – power laws Power-laws / skewed distributions in everything: –Most pairs: within 2-3 steps; but, some pair: ~20 or more steps away –Triangles: power laws[Tsourakakis’08] –# of cliques: ditto [Du+’09] –Weight vs degree: ditto [McGlohon+’08]

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-12 Observation 4 – communities ‘Negative dimensionality’ paradox [Chakrabarti+’04] Practical implication: Graphs may have no good cuts

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-13 Conclusions 0) Graphs appear in numerous settings 1) Singular / eigenvalue analysis: valuable –Fixed points – random walks – importance –Eigenvalue and epidemic threshold –Laplacians -> communities

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-14 Conclusions – cont’d 2) Random walks -> proximity –Recommendations, auto-captioning, etc –Fast algo’s, through Sherman-Morrison 3) Tera-byte scale graphs: hadoop 4) Beware: counter-intuitive properties –small diameters; power-laws; possible lack of good cuts

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-15 Acknowledgements Funding: PITA (PA Inf. Tech. Alliance) IIS , IIS , DBI , CNS

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-16 Acknowledgements - foils Chakrabarti, Deepay (cross- associations) Kolda, Tamara (tensors) Papadimitriou, Spiros (cross-associations) Sun, Jimeng (tensors) Tong, Hanghang (proximity)

CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-17 THANK YOU! Christos Faloutsos Gary Miller Charalampos (Babis) Tsourakakis