Presentation is loading. Please wait.

Presentation is loading. Please wait.

CMU SCS APWeb 07(c) 2007, C. Faloutsos 1 Copyright notice Copyright (c) 2007, Christos Faloutsos - all rights preserved. Permission to use all or some.

Similar presentations


Presentation on theme: "CMU SCS APWeb 07(c) 2007, C. Faloutsos 1 Copyright notice Copyright (c) 2007, Christos Faloutsos - all rights preserved. Permission to use all or some."— Presentation transcript:

1 CMU SCS APWeb 07(c) 2007, C. Faloutsos 1 Copyright notice Copyright (c) 2007, Christos Faloutsos - all rights preserved. Permission to use all or some of these foils is granted, for research and educational purposes. For commercial, for-profit uses, author’s permission is required. In any case, the author name should appear promintently on the foils used.

2 CMU SCS Data Mining using Fractals and Power laws Christos Faloutsos Carnegie Mellon University

3 CMU SCS APWeb 07(c) 2007, C. Faloutsos 3 Thank you! Guozhu Dong Xuemin Lin Yun Yang Jeffrey Xu Yu Carol

4 CMU SCS APWeb 07(c) 2007, C. Faloutsos 4 Deviation from published abstract More emphasis on graphs and less on fractals/self-similarity –fractals & power laws appear often, & together –Power Laws: Zipf, Pareto, Korcak, Lotka,... –Fractals: most real curves, surfaces, and point clouds are self-similar: coast-lines (1.1-1.58), mountain surfaces (2.2+/-) tumor shapes (1.5), mammalian brain (2.6-2.7) US road intersections (1.5-1.7) earthquakes, galaxies,...

5 CMU SCS APWeb 07(c) 2007, C. Faloutsos 5 Reminder: fractals Sierpinski triangle Hilbert curve Dragon curve

6 CMU SCS APWeb 07(c) 2007, C. Faloutsos 6 Focus: Graph Mining Laws (mainly, power laws) Generators and Tools

7 CMU SCS APWeb 07(c) 2007, C. Faloutsos 7 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

8 CMU SCS APWeb 07(c) 2007, C. Faloutsos 8 Motivation Data mining: ~ find patterns (rules, outliers) Problem#1: How do real graphs look like? Problem#2: How do they evolve? Problem#3: How to generate realistic graphs TOOLS Problem#4: Who is the ‘master-mind’? Problem#5: Track communities over time

9 CMU SCS APWeb 07(c) 2007, C. Faloutsos 9 Problem#1: Joint work with Dr. Deepayan Chakrabarti (CMU/Yahoo R.L.)

10 CMU SCS APWeb 07(c) 2007, C. Faloutsos 10 Graphs - why should we care? web: hyper-text graph IR: bi-partite graphs (doc-terms)... and more: D1D1 DNDN T1T1 TMTM...

11 CMU SCS APWeb 07(c) 2007, C. Faloutsos 11 Graphs - why should we care? Internet Map [lumeta.com] Food Web [Martinez ’91] Protein Interactions [genomebiology.com] Friendship Network [Moody ’01]

12 CMU SCS APWeb 07(c) 2007, C. Faloutsos 12 Graphs - why should we care? network of companies & board-of-directors members ‘viral’ marketing web-log (‘blog’) news propagation computer network security: email/IP traffic and anomaly detection....

13 CMU SCS APWeb 07(c) 2007, C. Faloutsos 13 Problem #1 - network and graph mining How does the Internet look like? How does the web look like? What is ‘normal’/‘abnormal’? which patterns/laws hold?

14 CMU SCS APWeb 07(c) 2007, C. Faloutsos 14 Graph mining Are real graphs random?

15 CMU SCS APWeb 07(c) 2007, C. Faloutsos 15 Laws and patterns Are real graphs random? A: NO!! –Diameter –in- and out- degree distributions –other (surprising) patterns

16 CMU SCS APWeb 07(c) 2007, C. Faloutsos 16 Solution#1 Power law in the degree distribution [SIGCOMM99] log(rank) log(degree) -0.82 internet domains att.com ibm.com

17 CMU SCS APWeb 07(c) 2007, C. Faloutsos 17 Solution#1’: Eigen Exponent E A2: power law in the eigenvalues of the adjacency matrix E = -0.48 Exponent = slope Eigenvalue Rank of decreasing eigenvalue May 2001

18 CMU SCS APWeb 07(c) 2007, C. Faloutsos 18 But: How about graphs from other domains?

19 CMU SCS APWeb 07(c) 2007, C. Faloutsos 19 Web In- and out-degree distribution of web sites [Barabasi], [IBM-CLEVER] log indegree - log(freq) from [Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins ]

20 CMU SCS APWeb 07(c) 2007, C. Faloutsos 20 Web In- and out-degree distribution of web sites [Barabasi], [IBM-CLEVER] log indegree log(freq) from [Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins ]

21 CMU SCS APWeb 07(c) 2007, C. Faloutsos 21 The Peer-to-Peer Topology Frequency versus degree Number of adjacent peers follows a power-law [Jovanovic+]

22 CMU SCS APWeb 07(c) 2007, C. Faloutsos 22 More power laws: citation counts: (citeseer.nj.nec.com 6/2001) log(#citations) log(count) Ullman

23 CMU SCS APWeb 07(c) 2007, C. Faloutsos 23 Swedish sex-web Nodes: people (Females; Males) Links: sexual relationships Liljeros et al. Nature 2001 4781 Swedes; 18-74; 59% response rate. Albert Laszlo Barabasi http://www.nd.edu/~networks/ Publication%20Categories/ 04%20Talks/2005-norway- 3hours.ppt

24 CMU SCS APWeb 07(c) 2007, C. Faloutsos 24 Swedish sex-web Nodes: people (Females; Males) Links: sexual relationships Liljeros et al. Nature 2001 4781 Swedes; 18-74; 59% response rate. Albert Laszlo Barabasi http://www.nd.edu/~networks/ Publication%20Categories/ 04%20Talks/2005-norway- 3hours.ppt

25 CMU SCS APWeb 07(c) 2007, C. Faloutsos 25 More power laws: web hit counts [w/ A. Montgomery] Web Site Traffic log(in-degree) log(count) Zipf users sites ``ebay’’

26 CMU SCS APWeb 07(c) 2007, C. Faloutsos 26 epinions.com who-trusts-whom [Richardson + Domingos, KDD 2001] (out) degree count trusts-2000-people user

27 CMU SCS APWeb 07(c) 2007, C. Faloutsos 27 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

28 CMU SCS APWeb 07(c) 2007, C. Faloutsos 28 Motivation Data mining: ~ find patterns (rules, outliers) Problem#1: How do real graphs look like? Problem#2: How do they evolve? Problem#3: How to generate realistic graphs TOOLS Problem#4: Who is the ‘master-mind’? Problem#5: Track communities over time

29 CMU SCS APWeb 07(c) 2007, C. Faloutsos 29 Problem#2: Time evolution with Jure Leskovec (CMU/MLD) and Jon Kleinberg (Cornell – sabb. @ CMU)

30 CMU SCS APWeb 07(c) 2007, C. Faloutsos 30 Evolution of the Diameter Prior work on Power Law graphs hints at slowly growing diameter: –diameter ~ O(log N) –diameter ~ O(log log N) What is happening in real data?

31 CMU SCS APWeb 07(c) 2007, C. Faloutsos 31 Evolution of the Diameter Prior work on Power Law graphs hints at slowly growing diameter: –diameter ~ O(log N) –diameter ~ O(log log N) What is happening in real data? Diameter shrinks over time

32 CMU SCS APWeb 07(c) 2007, C. Faloutsos 32 Diameter – ArXiv citation graph Citations among physics papers 1992 –2003 One graph per year 2003: –29,555 papers, –352,807 citations time [years] diameter

33 CMU SCS APWeb 07(c) 2007, C. Faloutsos 33 Diameter – “Autonomous Systems” Graph of Internet One graph per day 1997 – 2000 2000 –6,000 nodes –26,000 edges number of nodes diameter

34 CMU SCS APWeb 07(c) 2007, C. Faloutsos 34 Diameter – “Affiliation Network” Graph of collaborations in physics – authors linked to papers 10 years of data 2002 –60,000 nodes 20,000 authors 38,000 papers –133,000 edges time [years] diameter

35 CMU SCS APWeb 07(c) 2007, C. Faloutsos 35 Diameter – “Patents” Patent citation network 25 years of data 1999 –2.9 million nodes –16.5 million edges time [years] diameter

36 CMU SCS APWeb 07(c) 2007, C. Faloutsos 36 Temporal Evolution of the Graphs N(t) … nodes at time t E(t) … edges at time t Suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =? 2 * E(t)

37 CMU SCS APWeb 07(c) 2007, C. Faloutsos 37 Temporal Evolution of the Graphs N(t) … nodes at time t E(t) … edges at time t Suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =? 2 * E(t) A: over-doubled! –But obeying the ``Densification Power Law’’

38 CMU SCS APWeb 07(c) 2007, C. Faloutsos 38 Densification – Physics Citations Citations among physics papers 2003: –29,555 papers, 352,807 citations N(t) E(t) ??

39 CMU SCS APWeb 07(c) 2007, C. Faloutsos 39 Densification – Physics Citations Citations among physics papers 2003: –29,555 papers, 352,807 citations N(t) E(t) 1.69

40 CMU SCS APWeb 07(c) 2007, C. Faloutsos 40 Densification – Physics Citations Citations among physics papers 2003: –29,555 papers, 352,807 citations N(t) E(t) 1.69 1: tree

41 CMU SCS APWeb 07(c) 2007, C. Faloutsos 41 Densification – Physics Citations Citations among physics papers 2003: –29,555 papers, 352,807 citations N(t) E(t) 1.69 clique: 2 1: tree

42 CMU SCS APWeb 07(c) 2007, C. Faloutsos 42 Densification – Patent Citations Citations among patents granted 1999 –2.9 million nodes –16.5 million edges Each year is a datapoint N(t) E(t) 1.66

43 CMU SCS APWeb 07(c) 2007, C. Faloutsos 43 Densification – Autonomous Systems Graph of Internet 2000 –6,000 nodes –26,000 edges One graph per day N(t) E(t) 1.18

44 CMU SCS APWeb 07(c) 2007, C. Faloutsos 44 Densification – Affiliation Network Authors linked to their publications 2002 –60,000 nodes 20,000 authors 38,000 papers –133,000 edges N(t) E(t) 1.15

45 CMU SCS APWeb 07(c) 2007, C. Faloutsos 45 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

46 CMU SCS APWeb 07(c) 2007, C. Faloutsos 46 Motivation Data mining: ~ find patterns (rules, outliers) Problem#1: How do real graphs look like? Problem#2: How do they evolve? Problem#3: How to generate realistic graphs TOOLS Problem#4: Who is the ‘master-mind’? Problem#5: Track communities over time

47 CMU SCS APWeb 07(c) 2007, C. Faloutsos 47 Problem Definition Given a growing graph with count of nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns

48 CMU SCS APWeb 07(c) 2007, C. Faloutsos 48 Problem Definition Given a growing graph with count of nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns –Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter –Dynamic Patterns Growth Power Law Shrinking/Stabilizing Diameters

49 CMU SCS APWeb 07(c) 2007, C. Faloutsos 49 Problem Definition Given a growing graph with count of nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Idea: Self-similarity Leads to power laws Communities within communities …

50 CMU SCS APWeb 07(c) 2007, C. Faloutsos 50 There are many obvious (but wrong) ways –Does not obey Densification Power Law –Has increasing diameter Kronecker Product is exactly what we need Recursive Graph Generation Initial graph Recursive expansion

51 CMU SCS APWeb 07(c) 2007, C. Faloutsos 51 Adjacency matrix Kronecker Product – a Graph Intermediate stage Adjacency matrix

52 CMU SCS APWeb 07(c) 2007, C. Faloutsos 52 Kronecker Product – a Graph Continuing multiplying with G 1 we obtain G 4 and so on … G 4 adjacency matrix

53 CMU SCS APWeb 07(c) 2007, C. Faloutsos 53 Kronecker Product – a Graph Continuing multiplying with G 1 we obtain G 4 and so on … G 4 adjacency matrix

54 CMU SCS APWeb 07(c) 2007, C. Faloutsos 54 Kronecker Product – a Graph Continuing multiplying with G 1 we obtain G 4 and so on … G 4 adjacency matrix

55 CMU SCS APWeb 07(c) 2007, C. Faloutsos 55 Kronecker Graphs – Formally: We create the self-similar graphs recursively: –Start with a initiator graph G 1 on N 1 nodes and E 1 edges –The recursion will then product larger graphs G 2, G 3, …G k on N 1 k nodes –Since we want to obey Densification Power Law graph G k has to have E 1 k edges

56 CMU SCS APWeb 07(c) 2007, C. Faloutsos 56 Kronecker Product – Definition The Kronecker product of matrices A and B is given by We define a Kronecker product of two graphs as a Kronecker product of their adjacency matrices N x MK x L N*K x M*L

57 CMU SCS APWeb 07(c) 2007, C. Faloutsos 57 Kronecker Graphs We propose a growing sequence of graphs by iterating the Kronecker product Each Kronecker multiplication exponentially increases the size of the graph

58 CMU SCS APWeb 07(c) 2007, C. Faloutsos 58 Kronecker Graphs – Intuition Intuition: –Recursive growth of graph communities –Nodes get expanded to micro communities –Nodes in sub-community link among themselves and to nodes from different communities

59 CMU SCS APWeb 07(c) 2007, C. Faloutsos 59 Properties: We can prove that –Degree distribution is multinomial ~ power law –Diameter: constant –Eigenvalue distribution: multinomial –First eigenvector: multinomial See [Leskovec+, PKDD’05] for proofs

60 CMU SCS APWeb 07(c) 2007, C. Faloutsos 60 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns –Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter –Dynamic Patterns Growth Power Law Shrinking/Stabilizing Diameters First and only generator for which we can prove all these properties

61 CMU SCS APWeb 07(c) 2007, C. Faloutsos 61 Stochastic Kronecker Graphs Create N 1  N 1 probability matrix P 1 Compute the k th Kronecker power P k For each entry p uv of P k include an edge (u,v) with probability p uv 0.40.2 0.10.3 P1P1 Instance Matrix G 2 0.160.08 0.04 0.120.020.06 0.040.020.120.06 0.010.03 0.09 PkPk flip biased coins Kronecker multiplication skip

62 CMU SCS APWeb 07(c) 2007, C. Faloutsos 62 Experiments How well can we match real graphs? –Arxiv: physics citations: 30,000 papers, 350,000 citations 10 years of data –U.S. Patent citation network 4 million patents, 16 million citations 37 years of data –Autonomous systems – graph of internet Single snapshot from January 2002 6,400 nodes, 26,000 edges We show both static and temporal patterns

63 CMU SCS APWeb 07(c) 2007, C. Faloutsos 63 Arxiv – Degree Distribution degree count Real graph Deterministic Kronecker Stochastic Kronecker

64 CMU SCS APWeb 07(c) 2007, C. Faloutsos 64 Arxiv – Scree Plot Rank Eigenvalue Real graph Deterministic Kronecker Stochastic Kronecker

65 CMU SCS APWeb 07(c) 2007, C. Faloutsos 65 Arxiv – Densification Nodes(t) Edges Real graph Deterministic Kronecker Stochastic Kronecker

66 CMU SCS APWeb 07(c) 2007, C. Faloutsos 66 Arxiv – Effective Diameter Nodes(t) Diameter Real graph Deterministic Kronecker Stochastic Kronecker

67 CMU SCS APWeb 07(c) 2007, C. Faloutsos 67 Arxiv citation network

68 CMU SCS APWeb 07(c) 2007, C. Faloutsos 68 U.S. Patent citations Static patternsTemporal patterns

69 CMU SCS APWeb 07(c) 2007, C. Faloutsos 69 Autonomous Systems Static patterns

70 CMU SCS APWeb 07(c) 2007, C. Faloutsos 70 (Q: how to fit the parm’s?) A: Stochastic version of Kronecker graphs + Max likelihood + Metropolis sampling [Leskovec+, ICML’07]

71 CMU SCS APWeb 07(c) 2007, C. Faloutsos 71 Experiments on real AS graph Degree distributionHop plot Network valueAdjacency matrix eigen values

72 CMU SCS APWeb 07(c) 2007, C. Faloutsos 72 Conclusions Kronecker graphs have: –All the static properties Heavy tailed degree distributions Small diameter Multinomial eigenvalues and eigenvectors –All the temporal properties Densification Power Law Shrinking/Stabilizing Diameters –We can formally prove these results

73 CMU SCS APWeb 07(c) 2007, C. Faloutsos 73 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

74 CMU SCS APWeb 07(c) 2007, C. Faloutsos 74 Motivation Data mining: ~ find patterns (rules, outliers) Problem#1: How do real graphs look like? Problem#2: How do they evolve? Problem#3: How to generate realistic graphs TOOLS Problem#4: Who is the ‘master-mind’? Problem#5: Track communities over time

75 CMU SCS APWeb 07(c) 2007, C. Faloutsos 75 Problem#4: MasterMind – ‘CePS’ w/ Hanghang Tong, KDD 2006 htong cs.cmu.edu

76 CMU SCS APWeb 07(c) 2007, C. Faloutsos 76 Center-Piece Subgraph(CePS) Given Q query nodes Find Center-piece ( ) App. –Social Networks –Law Inforcement, … Idea: –Proximity -> random walk with restarts

77 CMU SCS APWeb 07(c) 2007, C. Faloutsos 77 Center-Piece Subgraph(Ceps) Given Q query nodes Find Center-piece ( ) App. –Social Networks –Law Inforcement, … Idea: –Proximity -> random walk with restarts

78 CMU SCS APWeb 07(c) 2007, C. Faloutsos 78 Case Study: AND query R.AgrawalJiawei Han V.VapnikM.Jordan

79 CMU SCS APWeb 07(c) 2007, C. Faloutsos 79 Case Study: AND query

80 CMU SCS APWeb 07(c) 2007, C. Faloutsos 80 Case Study: AND query

81 CMU SCS APWeb 07(c) 2007, C. Faloutsos 81 ML/Statistics databases 2_SoftAnd query

82 CMU SCS APWeb 07(c) 2007, C. Faloutsos 82 Conclusions Q1:How to measure the importance? A1: RWR+K_SoftAnd Q2: How to find connection subgraph? A2:”Extract” Alg. Q3:How to do it efficiently? A3:Graph Partition (Fast CePS) –~90% quality –6:1 speedup; 150x speedup (ICDM’06, b.p. award)

83 CMU SCS APWeb 07(c) 2007, C. Faloutsos 83 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

84 CMU SCS APWeb 07(c) 2007, C. Faloutsos 84 Motivation Data mining: ~ find patterns (rules, outliers) Problem#1: How do real graphs look like? Problem#2: How do they evolve? Problem#3: How to generate realistic graphs TOOLS Problem#4: Who is the ‘master-mind’? Problem#5: Track communities over time

85 CMU SCS APWeb 07(c) 2007, C. Faloutsos 85 Tensors for time evolving graphs [Jimeng Sun+ KDD’06] [ “, SDM’07] [ CF, Kolda, Sun, SDM’07 and SIGMOD’07 tutorial]

86 CMU SCS APWeb 07(c) 2007, C. Faloutsos 86 Social network analysis Static: find community structures A u t h o r s Keywords 1990

87 CMU SCS APWeb 07(c) 2007, C. Faloutsos 87 Social network analysis Static: find community structures Dynamic: monitor community structure evolution; spot abnormal individuals; abnormal time-stamps

88 CMU SCS APWeb 07(c) 2007, C. Faloutsos 88 DB DM Application 1: Multiway latent semantic indexing (LSI) DB 2004 1990 Michael Stonebraker Query Pattern U keyword authors keyword U authors Projection matrices specify the clusters Core tensors give cluster activation level Philip Yu

89 CMU SCS APWeb 07(c) 2007, C. Faloutsos 89 Bibliographic data (DBLP) Papers from VLDB and KDD conferences Construct 2nd order tensors with yearly windows with Each tensor: 4584  3741 11 timestamps (years)

90 CMU SCS APWeb 07(c) 2007, C. Faloutsos 90 Multiway LSI AuthorsKeywordsYear michael carey, michael stonebraker, h. jagadish, hector garcia-molina queri,parallel,optimization,concurr, objectorient 1995 surajit chaudhuri,mitch cherniack,michael stonebraker,ugur etintemel distribut,systems,view,storage,servic,pr ocess,cache 2004 jiawei han,jian pei,philip s. yu, jianyong wang,charu c. aggarwal streams,pattern,support, cluster, index,gener,queri 2004 Two groups are correctly identified: Databases and Data mining People and concepts are drifting over time DM DB

91 CMU SCS APWeb 07(c) 2007, C. Faloutsos 91 Conclusions Tensor-based methods: spot patterns and anomalies on time evolving graphs, and on streams (monitoring)

92 CMU SCS APWeb 07(c) 2007, C. Faloutsos 92 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

93 CMU SCS APWeb 07(c) 2007, C. Faloutsos 93 Virus propagation How do viruses/rumors/blog-influence propagate? Will a flu-like virus linger, or will it become extinct soon?

94 CMU SCS APWeb 07(c) 2007, C. Faloutsos 94 The model: SIS ‘Flu’ like: Susceptible-Infected-Susceptible Virus ‘strength’ s=  /  Infected Healthy NN1 N3 N2 Prob.  Prob. β Prob. 

95 CMU SCS APWeb 07(c) 2007, C. Faloutsos 95 Epidemic threshold  of a graph: the value of , such that if strength s =  /  <  an epidemic can not happen Thus, given a graph compute its epidemic threshold

96 CMU SCS APWeb 07(c) 2007, C. Faloutsos 96 Epidemic threshold  What should  depend on? avg. degree? and/or highest degree? and/or variance of degree? and/or third moment of degree? and/or diameter?

97 CMU SCS APWeb 07(c) 2007, C. Faloutsos 97 Epidemic threshold [Theorem] We have no epidemic, if β/δ <τ = 1/ λ 1,A

98 CMU SCS APWeb 07(c) 2007, C. Faloutsos 98 Epidemic threshold [Theorem] We have no epidemic, if β/δ <τ = 1/ λ 1,A largest eigenvalue of adj. matrix A attack prob. recovery prob. epidemic threshold Proof: [Wang+03]

99 CMU SCS APWeb 07(c) 2007, C. Faloutsos 99 Experiments (Oregon)  /  > τ (above threshold)  /  = τ (at the threshold)  /  < τ (below threshold)

100 CMU SCS APWeb 07(c) 2007, C. Faloutsos 100 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; Tensors Other projects (Virus propagation, e-bay fraud detection) Conclusions

101 CMU SCS APWeb 07(c) 2007, C. Faloutsos 101 E-bay Fraud detection w/ Polo Chau & Shashank Pandit, CMU [WWW’07]

102 CMU SCS APWeb 07(c) 2007, C. Faloutsos 102 E-bay Fraud detection - NetProbe

103 CMU SCS APWeb 07(c) 2007, C. Faloutsos 103 OVERALL CONCLUSIONS Graphs pose a wealth of fascinating problems self-similarity and power laws work, when textbook methods fail! New patterns (shrinking diameter!) New generator: Kronecker

104 CMU SCS APWeb 07(c) 2007, C. Faloutsos 104 ‘Philosophical’ observation Graph mining brings together: ML/AI / IR Stat, Num. analysis, Systems (DB (Gb/Tb), Networks ) sociology, epidemiology physics, ++…

105 CMU SCS APWeb 07(c) 2007, C. Faloutsos 105 References Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan Fast Random Walk with Restart and Its Applications ICDM 2006, Hong Kong.Fast Random Walk with Restart and Its Applications Hanghang Tong, Christos Faloutsos Center-Piece Subgraphs: Problem Definition and Fast Solutions, KDD 2006, Philadelphia, PACenter-Piece Subgraphs: Problem Definition and Fast Solutions,

106 CMU SCS APWeb 07(c) 2007, C. Faloutsos 106 References Jure Leskovec, Jon Kleinberg and Christos Faloutsos Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations KDD 2005, Chicago, IL. ("Best Research Paper" award).Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication (ECML/PKDD 2005), Porto, Portugal, 2005.Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker MultiplicationECML/PKDD 2005

107 CMU SCS APWeb 07(c) 2007, C. Faloutsos 107 References Jure Leskovec and Christos Faloutsos, Scalable Modeling of Real Graphs using Kronecker Multiplication, ICML 2007, Corvallis, OR, USA Jimeng Sun, Dacheng Tao, Christos Faloutsos Beyond Streams and Graphs: Dynamic Tensor Analysis, KDD 2006, Philadelphia, PA Beyond Streams and Graphs: Dynamic Tensor Analysis,

108 CMU SCS APWeb 07(c) 2007, C. Faloutsos 108 References Jimeng Sun, Yinglian Xie, Hui Zhang, Christos Faloutsos. Less is More: Compact Matrix Decomposition for Large Sparse Graphs, SDM, Minneapolis, Minnesota, Apr 2007. [pdf]pdf

109 CMU SCS APWeb 07(c) 2007, C. Faloutsos 109 Contact info: www. cs.cmu.edu /~christos (w/ papers, datasets, code, etc)


Download ppt "CMU SCS APWeb 07(c) 2007, C. Faloutsos 1 Copyright notice Copyright (c) 2007, Christos Faloutsos - all rights preserved. Permission to use all or some."

Similar presentations


Ads by Google