Finding patterns in large, real networks

Slides:



Advertisements
Similar presentations
1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University
Advertisements

Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU
1 Realistic Graph Generation and Evolution Using Kronecker Multiplication Jurij Leskovec, CMU Deepay Chakrabarti, CMU/Yahoo Jon Kleinberg, Cornell Christos.
Analysis and Modeling of Social Networks Foudalis Ilias.
The Connectivity and Fault-Tolerance of the Internet Topology
Lecture 21 Network evolution Slides are modified from Jurij Leskovec, Jon Kleinberg and Christos Faloutsos.
CS728 Lecture 5 Generative Graph Models and the Web.
CMU SCS Copyright: C. Faloutsos (2014)# : Multimedia Databases and Data Mining Lecture #29: Graph mining - Generators & tools Christos Faloutsos.
The structure of the Internet. How are routers connected? Why should we care? –While communication protocols will work correctly on ANY topology –….they.
NetMine: Mining Tools for Large Graphs Deepayan Chakrabarti Yiping Zhan Daniel Blandford Christos Faloutsos Guy Blelloch.
Weighted Graphs and Disconnected Components Patterns and a Generator Mary McGlohon, Leman Akoglu, Christos Faloutsos Carnegie Mellon University School.
Social Networks and Graph Mining Christos Faloutsos CMU - MLD.
1 Epidemic Spreading in Real Networks: an Eigenvalue Viewpoint Yang Wang Deepayan Chakrabarti Chenxi Wang Christos Faloutsos.
CMU SCS APWeb 07(c) 2007, C. Faloutsos 1 Copyright notice Copyright (c) 2007, Christos Faloutsos - all rights preserved. Permission to use all or some.
School of Computer Science Carnegie Mellon Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University.
1 Complex systems Made of many non-identical elements connected by diverse interactions. NETWORK New York Times Slides: thanks to A-L Barabasi.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
CS Lecture 6 Generative Graph Models Part II.
Analysis of the Internet Topology Michalis Faloutsos, U.C. Riverside (PI) Christos Faloutsos, CMU (sub- contract, co-PI) DARPA NMS, no
CMU SCS Bio-informatics, Graph and Stream mining Christos Faloutsos CMU.
CMU SCS Graph and stream mining Christos Faloutsos CMU.
SDSC, skitter (July 1998) A random graph model for massive graphs William Aiello Fan Chung Graham Lincoln Lu.
The structure of the Internet. How are routers connected? Why should we care? –While communication protocols will work correctly on ANY topology –….they.
1 Algorithms for Large Data Sets Ziv Bar-Yossef Lecture 7 May 14, 2006
The structure of the Internet. The Internet as a graph Remember: the Internet is a collection of networks called autonomous systems (ASs) The Internet.
CMU SCS Multimedia and Graph mining Christos Faloutsos CMU.
Summary from Previous Lecture Real networks: –AS-level N= 12709, M=27384 (Jan 02 data) route-views.oregon-ix.net, hhtp://abroude.ripe.net/ris/rawdata –
Data Mining using Fractals and Power laws
CMU SCS : Multimedia Databases and Data Mining Lecture #30: Conclusions C. Faloutsos.
CMU SCS Data Mining in Streams and Graphs Christos Faloutsos CMU.
CMU SCS : Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.
School of Computer Science Carnegie Mellon UIUC 04C. Faloutsos1 Advanced Data Mining Tools: Fractals and Power Laws for Graphs, Streams and Traditional.
Weighted Graphs and Disconnected Components Patterns and a Generator IDB Lab 현근수 In KDD 08. Mary McGlohon, Leman Akoglu, Christos Faloutsos.
Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University.
CMU SCS Large Graph Mining Christos Faloutsos CMU.
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos.
School of Computer Science Carnegie Mellon Data Mining using Fractals (fractals for fun and profit) Christos Faloutsos Carnegie Mellon University.
School of Computer Science Carnegie Mellon Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University.
On-line Social Networks - Anthony Bonato 1 Dynamic Models of On-Line Social Networks Anthony Bonato Ryerson University WAW’2009 February 13, 2009 nt.
Carnegie Mellon Finding patterns in large, real networks Christos Faloutsos CMU
CMU SCS Finding patterns in large, real networks Christos Faloutsos CMU.
R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos.
CMU SCS Graph Mining: patterns and tools for static and time-evolving graphs Christos Faloutsos CMU.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
School of Computer Science Carnegie Mellon Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University.
CMU SCS KDD'09Faloutsos, Miller, Tsourakakis P9-1 Large Graph Mining: Power Tools and a Practitioner’s guide Christos Faloutsos Gary Miller Charalampos.
Carnegie Mellon Data Mining – Research Directions C. Faloutsos CMU
1 HEINZ NIXDORF INSTITUTE University of Paderborn Algorithms and Complexity Christian Schindelhauer Search Algorithms Winter Semester 2004/ Dec.
Carnegie Mellon IIT Bombay KDD04© 2004, S. Chakrabarti and C. Faloutsos1 Graph structures in data mining Soumen Chakrabarti (IIT-Bombay) Christos Faloutsos.
Graph Models Class Algorithmic Methods of Data Mining
L – Modeling and Simulating Social Systems with MATLAB
Topics In Social Computing (67810)
Modeling networks using Kronecker multiplication
Indexing and Data Mining in Multimedia Databases
NetMine: Mining Tools for Large Graphs
15-826: Multimedia Databases and Data Mining
Random Graph Models of large networks
Part 1: Graph Mining – patterns
15-826: Multimedia Databases and Data Mining
Lecture 13 Network evolution
R-MAT: A Recursive Model for Graph Mining
Peer-to-Peer and Social Networks Fall 2017
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
Graph and Tensor Mining for fun and profit
Large Graph Mining: Power Tools and a Practitioner’s guide
Data Mining using Fractals and Power laws
Lecture 21 Network evolution
Modelling and Searching Networks Lecture 2 – Complex Networks
Presentation transcript:

Finding patterns in large, real networks Christos Faloutsos CMU

Thanks to Deepayan Chakrabarti (CMU/Yahoo) Michalis Faloutsos (UCR) Spiros Papadimitriou (CMU/IBM) George Siganos (UCR) Yahoo 05 (c) 2005 C. Faloutsos

Introduction Graphs are everywhere! Protein Interactions [genomebiology.com] Internet Map [lumeta.com] Food Web [Martinez ’91] Graphs are everywhere! Friendship Network [Moody ’01] Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? tools/results future steps Yahoo 05 (c) 2005 C. Faloutsos

Motivating questions Q1: What do real graphs look like? Q2: How to generate realistic graphs? Yahoo 05 (c) 2005 C. Faloutsos

Virus propagation Q3: How do viruses/rumors propagate? Q4: Who is the best person/computer to immunize against a virus? Yahoo 05 (c) 2005 C. Faloutsos

Graph clustering & mining Q5: which edges/nodes are ‘abnormal’? Q6: split a graph in k ‘natural’ communities - but how to determine k? Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? tools/results future steps Yahoo 05 (c) 2005 C. Faloutsos

Topology Q1: How does the Internet look like? Any rules? (Looks random – right?) Yahoo 05 (c) 2005 C. Faloutsos

Laws – degree distributions Q: avg degree is ~2 - what is the most probable degree? count ?? 2 degree Yahoo 05 (c) 2005 C. Faloutsos

Laws – degree distributions Q: avg degree is ~2 - what is the most probable degree? degree count ?? 2 WRONG ! degree Yahoo 05 (c) 2005 C. Faloutsos

I.Power-law: outdegree O Frequency Exponent = slope O = -2.15 -2.15 Nov’97 Outdegree The plot is linear in log-log scale [FFF’99] freq = degree (-2.15) Yahoo 05 (c) 2005 C. Faloutsos

II.Power-law: rank R The plot is a line in log-log scale att.com log(rank) log(degree) -0.82 att.com ibm.com April’98 R Rank: nodes in decreasing degree order The plot is a line in log-log scale Yahoo 05 (c) 2005 C. Faloutsos

III.Power-law: eigen E Eigenvalues in decreasing order (first 20) Exponent = slope E = -0.48 Dec’98 Rank of decreasing eigenvalue Eigenvalues in decreasing order (first 20) [Mihail+, 02]: R = 2 * E Yahoo 05 (c) 2005 C. Faloutsos

Any other ‘laws’? Yahoo 05 (c) 2005 C. Faloutsos

Any other ‘laws’? Yes! Small diameter six degrees of separation / ‘Kevin Bacon’ small worlds [Watts and Strogatz] Yahoo 05 (c) 2005 C. Faloutsos

Any other ‘laws’? Bow-tie, for the web [Kumar+ ‘99] IN, SCC, OUT, ‘tendrils’ disconnected components Yahoo 05 (c) 2005 C. Faloutsos

Any other ‘laws’? power-laws in communities (bi-partite cores) [Kumar+, ‘99] Log(count) n:1 n:3 n:2 2:3 core (m:n core) Log(m) Yahoo 05 (c) 2005 C. Faloutsos

More Power laws Also hold for other web graphs [Barabasi+, ‘99], [Kumar+, ‘99] citation graphs (see later) and many more Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? tools/results future steps Yahoo 05 (c) 2005 C. Faloutsos

How do graphs evolve? Yahoo 05 (c) 2005 C. Faloutsos

Time Evolution: rank R #days since Nov. ‘97 log(rank) log(degree) - Time Evolution: rank R Domain level #days since Nov. ‘97 The rank exponent has not changed! [Siganos+, ‘03] Yahoo 05 (c) 2005 C. Faloutsos

How do graphs evolve? degree-exponent seems constant - anything else? Yahoo 05 (c) 2005 C. Faloutsos

Evolution of diameter? Prior analysis, on power-law-like graphs, hints that diameter ~ O(log(N)) or diameter ~ O( log(log(N))) i.e.., slowly increasing with network size Q: What is happening, in reality? Yahoo 05 (c) 2005 C. Faloutsos

Evolution of diameter? Prior analysis, on power-law-like graphs, hints that diameter ~ O(log(N)) or diameter ~ O( log(log(N))) i.e.., slowly increasing with network size Q: What is happening, in reality? A: It shrinks(!!), towards a constant value Yahoo 05 (c) 2005 C. Faloutsos

Shrinking diameter diameter time [Leskovec+05a] Citations among physics papers 11yrs; @ 2003: 29,555 papers 352,807 citations For each month M, create a graph of all citations up to month M time Yahoo 05 (c) 2005 C. Faloutsos

Shrinking diameter Authors & publications 1992 2002 318 nodes 272 edges 2002 60,000 nodes 20,000 authors 38,000 papers 133,000 edges Yahoo 05 (c) 2005 C. Faloutsos

Shrinking diameter Patents & citations 1975 1999 334,000 nodes 676,000 edges 1999 2.9 million nodes 16.5 million edges Each year is a datapoint Yahoo 05 (c) 2005 C. Faloutsos

Shrinking diameter Autonomous systems 1997 2000 One graph per day 3,000 nodes 10,000 edges 2000 6,000 nodes 26,000 edges One graph per day diameter N Yahoo 05 (c) 2005 C. Faloutsos

Temporal evolution of graphs N(t) nodes; E(t) edges at time t suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =? 2 * E(t) Yahoo 05 (c) 2005 C. Faloutsos

Temporal evolution of graphs N(t) nodes; E(t) edges at time t suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =? 2 * E(t) A: over-doubled! x Yahoo 05 (c) 2005 C. Faloutsos

Temporal evolution of graphs A: over-doubled - but obeying: E(t) ~ N(t)a for all t where 1<a<2 Yahoo 05 (c) 2005 C. Faloutsos

Densification Power Law ArXiv: Physics papers and their citations N(t) E(t) 1.69 Yahoo 05 (c) 2005 C. Faloutsos

Densification Power Law ArXiv: Physics papers and their citations N(t) E(t) 1 1.69 ‘tree’ Yahoo 05 (c) 2005 C. Faloutsos

Densification Power Law ArXiv: Physics papers and their citations ‘clique’ N(t) E(t) 2 1.69 Yahoo 05 (c) 2005 C. Faloutsos

Densification Power Law U.S. Patents, citing each other N(t) E(t) 1.66 Yahoo 05 (c) 2005 C. Faloutsos

Densification Power Law Autonomous Systems N(t) E(t) 1.18 Yahoo 05 (c) 2005 C. Faloutsos

Densification Power Law ArXiv: authors & papers N(t) E(t) 1.15 Yahoo 05 (c) 2005 C. Faloutsos

Summary of ‘laws’ Power laws for degree distributions …………... for eigenvalues, bi-partite cores Small & shrinking diameter (‘6 degrees’) ‘Bow-tie’ for web; ‘jelly-fish’ for internet ``Densification Power Law’’, over time Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? other power laws fractals & self-similarity case study: Kronecker graphs tools/results future steps Yahoo 05 (c) 2005 C. Faloutsos

Power laws Many more power laws, in diverse settings: Yahoo 05 (c) 2005 C. Faloutsos

A famous power law: Zipf’s law log(freq) “a” Bible - rank vs frequency (log-log) “the” log(rank) Yahoo 05 (c) 2005 C. Faloutsos

Power laws, cont’ed length of file transfers [Bestavros+] web hit counts [Huberman] magnitude of earthquakes (Guttenberg-Richter law) sizes of lakes/islands (Korcak’s law) Income distribution (Pareto’s law) Yahoo 05 (c) 2005 C. Faloutsos

The Peer-to-Peer Topology [Jovanovic+] Frequency versus degree Number of adjacent peers follows a power-law Yahoo 05 (c) 2005 C. Faloutsos

Click-stream data u-id’s url’s Web Site Traffic log(count) Zipf ‘yahoo’ log(freq) log(count) ‘super-surfer’ log(freq) Yahoo 05 (c) 2005 C. Faloutsos

Lotka’s law (Lotka’s law of publication count); and citation counts: (citeseer.nj.nec.com 6/2001) log(count) J. Ullman log(#citations) Yahoo 05 (c) 2005 C. Faloutsos

Albert Laszlo Barabasi Swedish sex-web Nodes: people (Females; Males) Links: sexual relationships Albert Laszlo Barabasi http://www.nd.edu/~networks/ Publication%20Categories/ 04%20Talks/2005-norway-3hours.ppt 4781 Swedes; 18-74; 59% response rate. Yahoo 05 (c) 2005 C. Faloutsos Liljeros et al. Nature 2001

Albert Laszlo Barabasi Swedish sex-web Nodes: people (Females; Males) Links: sexual relationships Albert Laszlo Barabasi http://www.nd.edu/~networks/ Publication%20Categories/ 04%20Talks/2005-norway-3hours.ppt 4781 Swedes; 18-74; 59% response rate. Yahoo 05 (c) 2005 C. Faloutsos Liljeros et al. Nature 2001

Power laws Q: Why so many power laws? A1: self-similarity A2: ‘rich-get-richer’ Yahoo 05 (c) 2005 C. Faloutsos

Digression: intro to fractals Fractals: sets of points that are self similar Yahoo 05 (c) 2005 C. Faloutsos

A famous fractal dimensionality = ?? e.g., Sierpinski triangle: ... zero area; infinite length! ... dimensionality = ?? Yahoo 05 (c) 2005 C. Faloutsos

A famous fractal dimensionality = log(3)/log(2) = 1.58 e.g., Sierpinski triangle: zero area; infinite length! ... dimensionality = log(3)/log(2) = 1.58 Yahoo 05 (c) 2005 C. Faloutsos

A famous fractal equivalent graph: Yahoo 05 (c) 2005 C. Faloutsos

Intrinsic (‘fractal’) dimension How to estimate it? Yahoo 05 (c) 2005 C. Faloutsos

Intrinsic (‘fractal’) dimension Q: fractal dimension of a line? A: nn ( <= r ) ~ r^1 (‘power law’: y=x^a) Q: fd of a plane? A: nn ( <= r ) ~ r^2 fd== slope of (log(nn) vs log(r) ) Yahoo 05 (c) 2005 C. Faloutsos

Sierpinsky triangle == ‘correlation integral’ = CDF of pairwise distances log( r ) log(#pairs within <=r ) 1.58 Yahoo 05 (c) 2005 C. Faloutsos

Fractals and power laws They are related concepts: fractals <=> self-similarity <=> scale-free <=> power laws ( y= xa ) F = C r(-2) vs y=e-ax Yahoo 05 (c) 2005 C. Faloutsos

Fractals and power laws Power laws and fractals are closely related And fractals appear in MANY cases coast-lines: 1.1-1.5 brain-surface: 2.6 rain-patches: 1.3 tree-bark: ~2.1 stock prices / random walks: 1.5 ... [see Mandelbrot; or Schroeder] Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? other power laws fractals & self-similarity case study: Kronecker graph generator tools/results future steps Yahoo 05 (c) 2005 C. Faloutsos

Self-similarity at work Q2: How to generate a realistic graph? (Q2’) and how to grow it over time? Yahoo 05 (c) 2005 C. Faloutsos

Wish list for a generator: Power-law-tail in- and out-degrees Power-law-tail scree plots shrinking/constant diameter Densification Power Law communities-within-communities Q: how to achieve all of them? Yahoo 05 (c) 2005 C. Faloutsos

Wish list for a generator: Power-law-tail in- and out-degrees Power-law-tail scree plots shrinking/constant diameter Densification Power Law communities-within-communities Q: how to achieve all of them? A: Kronecker matrix product [Leskovec+05b] Yahoo 05 (c) 2005 C. Faloutsos

Kronecker product Yahoo 05 (c) 2005 C. Faloutsos

Kronecker product Yahoo 05 (c) 2005 C. Faloutsos

Kronecker product N N*N N**4 Yahoo 05 (c) 2005 C. Faloutsos

Properties of Kronecker graphs: Power-law-tail in- and out-degrees Power-law-tail scree plots constant diameter perfect Densification Power Law communities-within-communities Yahoo 05 (c) 2005 C. Faloutsos

Properties of Kronecker graphs: Power-law-tail in- and out-degrees Power-law-tail scree plots constant diameter perfect Densification Power Law communities-within-communities and we can prove all of the above (first and only generator that does that) Yahoo 05 (c) 2005 C. Faloutsos

Kronecker - patents Scree D.P.L. Degree Diameter Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? tools/results virus propagation connection subgraphs cross-associations future steps Yahoo 05 (c) 2005 C. Faloutsos

Virus propagation Q3: How do viruses/rumors propagate? Q4: Who is the best person/computer to immunize against a virus? Yahoo 05 (c) 2005 C. Faloutsos

The model: SIS ‘Flu’ like: Susceptible-Infected-Susceptible Virus ‘strength’ s= b/d Healthy Prob. d N2 Prob. b Prob. β N1 N Infected N3 Yahoo 05 (c) 2005 C. Faloutsos

if strength s = b / d < t Epidemic threshold t of a graph: the value of t, such that if strength s = b / d < t an epidemic can not happen Thus, given a graph compute its epidemic threshold Yahoo 05 (c) 2005 C. Faloutsos

Epidemic threshold t What should t depend on? avg. degree? and/or highest degree? and/or variance of degree? and/or third moment of degree? and/or diameter? Yahoo 05 (c) 2005 C. Faloutsos

Epidemic threshold β/δ <τ = 1/ λ1,A [Theorem] We have no epidemic, if β/δ <τ = 1/ λ1,A Yahoo 05 (c) 2005 C. Faloutsos

Epidemic threshold β/δ <τ = 1/ λ1,A [Theorem] We have no epidemic, if epidemic threshold recovery prob. β/δ <τ = 1/ λ1,A largest eigenvalue of adj. matrix A attack prob. Proof: [Wang+03] Yahoo 05 (c) 2005 C. Faloutsos

Experiments (Oregon) b/d > τ (above threshold) b/d = τ (at the threshold) b/d < τ (below threshold) Yahoo 05 (c) 2005 C. Faloutsos

Our result: Holds for any graph includes older results as special cases Yahoo 05 (c) 2005 C. Faloutsos

Virus propagation Q3: How do viruses/rumors propagate? Q4: Who is the best person/computer to immunize against a virus? A4: the one that causes the highest decrease in l1 Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? tools/results virus propagation connection subgraphs cross-associations future steps Yahoo 05 (c) 2005 C. Faloutsos

(A-A) Kidman-Diaz What are the best paths between ‘Kidman’ and ‘Diaz’? Yahoo 05 (c) 2005 C. Faloutsos

(A-A) Kidman-Diaz What are the best paths between ‘Kidman’ and ‘Diaz’? [KDD’04, w/ McCurley+Tomkins] Strong, direct link Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results future steps static time-evolution laws why so many power laws? tools/results virus propagation connection subgraphs cross-associations future steps Yahoo 05 (c) 2005 C. Faloutsos

Graph clustering & mining Q5: which edges/nodes are ‘abnormal’? Q6: split a graph in k ‘natural’ communities - but how to determine k? Yahoo 05 (c) 2005 C. Faloutsos

Graph partitioning Documents x terms Customers x products Users x web-sites Yahoo 05 (c) 2005 C. Faloutsos

Graph partitioning Documents x terms Customers x products Users x web-sites Q: HOW MANY PIECES? Yahoo 05 (c) 2005 C. Faloutsos

Graph partitioning Documents x terms Customers x products Users x web-sites Q: HOW MANY PIECES? A: MDL/ compression Yahoo 05 (c) 2005 C. Faloutsos

Cross-associations 2x2 1x2 Yahoo 05 (c) 2005 C. Faloutsos

Cross-associations 3x4 3x3 2x3 Yahoo 05 (c) 2005 C. Faloutsos

Cross-associations Yahoo 05 (c) 2005 C. Faloutsos

Cross-associations Yahoo 05 (c) 2005 C. Faloutsos

Graph clustering & mining Q5: which edges/nodes are ‘abnormal’? Q6: split a graph in k ‘natural’ communities - but how to determine k? A6: choose the k that leads to best overall compression (= MDL = Minimum Description Language) Yahoo 05 (c) 2005 C. Faloutsos

Cross-associations missing edge outlier edge Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results other related projects future steps static time-evolution laws why so many power laws? tools/results other related projects future steps Yahoo 05 (c) 2005 C. Faloutsos

Stream mining : : : : : : chlorine concentrations Phase 1 Phase 2 Phase 3 : : : : : : chlorine concentrations This is basically our main contribution: “wake up, you *can* do dimensionality reduction incrementally, in real-time”. water distribution network Yahoo 05 (c) 2005 C. Faloutsos

Stream mining k = 1 actual measurements k hidden variable(s) Phase 1 : : : : : : chlorine concentrations Phase 1 k = 1 actual measurements (n streams) k hidden variable(s) We would like to discover a few “hidden (latent) variables” that summarize the key trends Yahoo 05 (c) 2005 C. Faloutsos

Stream mining Solution: SPIRIT [VLDB’05] incremental, on-line PCA Yahoo 05 (c) 2005 C. Faloutsos

Outline Laws tools/results other related projects future steps Yahoo 05 (c) 2005 C. Faloutsos

Future steps connection sub-graphs on multi-graphs [Tong] traffic matrix, evolving over time [Sun+] influence propagation on blogs [Leskovec+] automatic web site classification [Leskovec] Yahoo 05 (c) 2005 C. Faloutsos

Conclusions Power laws & self similarity Additional tools excellent tools, for real datasets, where Gaussian, uniform etc assumptions fail Additional tools Kronecker, for graph generators eigenvalues for virus propagation MDL/compression for graph partitioning Yahoo 05 (c) 2005 C. Faloutsos

References [Aiello+, '00] William Aiello, Fan R. K. Chung, Linyuan Lu: A random graph model for massive graphs. STOC 2000: 171-180 [Albert+] Reka Albert, Hawoong Jeong, and Albert-Laszlo Barabasi: Diameter of the World Wide Web, Nature 401 130-131 (1999) [Barabasi, '03] Albert-Laszlo Barabasi Linked: How Everything Is Connected to Everything Else and What It Means (Plume, 2003) Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Barabasi+, '99] Albert-Laszlo Barabasi and Reka Albert. Emergence of scaling in random networks. Science, 286:509--512, 1999 [Broder+, '00] Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener. Graph structure in the web, WWW, 2000 Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Chakrabarti+, ‘04] RMAT: A recursive graph generator, D. Chakrabarti, Y. Zhan, C. Faloutsos, SIAM-DM 2004 [Dill+, '01] Stephen Dill, Ravi Kumar, Kevin S. McCurley, Sridhar Rajagopalan, D. Sivakumar, Andrew Tomkins: Self-similarity in the Web. VLDB 2001: 69-78 Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Fabrikant+, '02] A. Fabrikant, E. Koutsoupias, and C.H. Papadimitriou. Heuristically Optimized Trade-offs: A New Paradigm for Power Laws in the Internet. ICALP, Malaga, Spain, July 2002 [FFF, 99] M. Faloutsos, P. Faloutsos, and C. Faloutsos, "On power-law relationships of the Internet topology," in SIGCOMM, 1999. Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Leskovec+05a] Jure Leskovec, Jon Kleinberg and Christos Faloutsos Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations KDD 2005, Chicago, IL. (Best research paper award) [Leskovec+05b] Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg and Christos Faloutsos, Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication, ECML/PKDD 2005, Porto, Portugal. Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Jovanovic+, '01] M. Jovanovic, F.S. Annexstein, and K.A. Berman. Modeling Peer-to-Peer Network Topologies through "Small-World" Models and Power Laws. In TELFOR, Belgrade, Yugoslavia, November, 2001 [Kumar+ '99] Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins: Extracting Large-Scale Knowledge Bases from the Web. VLDB 1999: 639-650 Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Leland+, '94] W. E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, On the Self-Similar Nature of Ethernet Traffic, IEEE Transactions on Networking, 2, 1, pp 1-15, Feb. 1994. [Mihail+, '02] Milena Mihail, Christos H. Papadimitriou: On the Eigenvalue Power Law. RANDOM 2002: 254-262 Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Milgram '67] Stanley Milgram: The Small World Problem, Psychology Today 1(1), 60-67 (1967) [Montgomery+, ‘01] Alan L. Montgomery, Christos Faloutsos: Identifying Web Browsing Trends and Patterns. IEEE Computer 34(7): 94-95 (2001) Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Palmer+, ‘01] Chris Palmer, Georgos Siganos, Michalis Faloutsos, Christos Faloutsos and Phil Gibbons The connectivity and fault-tolerance of the Internet topology (NRDM 2001), Santa Barbara, CA, May 25, 2001 [Papadimitriou+,’05] Spiros Papadimitriou, Jimeng Sun and Christos Faloutsos Streaming Pattern Discovery in Multiple Time-Series VLDB 2005, Trondheim, Norway Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Pennock+, '02] David M. Pennock, Gary William Flake, Steve Lawrence, Eric J. Glover, C. Lee Giles: Winners don't take all: Characterizing the competition for links on the web Proc. Natl. Acad. Sci. USA 99(8): 5207-5211 (2002) [Schroeder, ‘91] Manfred Schroeder Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise W H Freeman & Co., 1991 (excellent book on fractals) Yahoo 05 (c) 2005 C. Faloutsos

References, cont’d [Siganos+, '03] G. Siganos, M. Faloutsos, P. Faloutsos, C. Faloutsos Power-Laws and the AS-level Internet Topology, Transactions on Networking, August 2003. [Wang+03] Yang Wang, Deepayan Chakrabarti, Chenxi Wang and Christos Faloutsos: Epidemic Spreading in Real Networks: an Eigenvalue Viewpoint, SRDS 2003, Florence, Italy. Yahoo 05 (c) 2005 C. Faloutsos

Reference [Watts+ Strogatz, '98] D. J. Watts and S. H. Strogatz Collective dynamics of 'small-world' networks, Nature, 393:440-442 (1998) [Watts, '03] Duncan J. Watts Six Degrees: The Science of a Connected Age W.W. Norton & Company; (February 2003) Yahoo 05 (c) 2005 C. Faloutsos

Thank you! www.cs.cmu.edu/~christos www.db.cs.cmu.edu Yahoo 05 (c) 2005 C. Faloutsos