Sampling from Large Graphs (poster #305, KDD 2006). Jurij (Jure) Leskovec, Christos Faloutsos. CMU SCS.

Presentation transcript:

1 CMU SCS, KDD 2006, Leskovec & Faloutsos: ??

2 Sampling from Large Graphs, poster #305. Jurij (Jure) Leskovec, Christos Faloutsos, Carnegie Mellon University

3 Problems and recommendations
Q: How to sample from a large graph? A: FF, RN
Q: Which properties to preserve? A: (at least) the 13 ones we list
Q: How to measure success/similarity? A: K-S, towards ‘back-in-time’ version
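The "K-S" answer on slide 3 refers to the Kolmogorov-Smirnov D-statistic, the largest vertical gap between two empirical cumulative distributions. A minimal sketch of how it could be computed for one property (degree, for instance) follows. The function name and the toy degree lists are illustrative assumptions, not taken from the poster.

```python
from bisect import bisect_right


def ks_d_statistic(sample_a, sample_b):
    """Two-sample K-S D: the maximum gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical degree sequences: full graph vs. a sampled subgraph.
original_degrees = [1, 1, 1, 2, 2, 3, 4, 8]
sample_degrees = [1, 1, 2, 3]
print(ks_d_statistic(original_degrees, sample_degrees))
```

D lies between 0 (identical distributions) and 1 (no overlap), so a smaller value means the sample tracks the original more closely on that property.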

4 Criteria
STATIC: in-degree distribution; out-degree distribution; distr. of WCC; distr. of SCC; hop-plot; hop-plot for WCC; distr. of first left singular vector values; scree plot; distr. of clustering coefficient
TEMPORAL: densification power law; shrinking diameter; normalized size of largest c.c.; first eigenvalue
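The slide names the densification power law without stating it. It is commonly written as e(t) ∝ n(t)^a, where n(t) and e(t) are the node and edge counts at time t. Under that reading, the exponent a is the slope of a straight-line fit in log-log space. The sketch below assumes the snapshot counts are already available; the function name and the sample numbers are hypothetical.

```python
import math


def densification_exponent(node_counts, edge_counts):
    """Least-squares slope of log(edges) against log(nodes)."""
    if len(node_counts) != len(edge_counts) or len(node_counts) < 2:
        raise ValueError("need at least two matching (nodes, edges) snapshots")
    xs = [math.log(n) for n in node_counts]
    ys = [math.log(e) for e in edge_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical snapshots of a growing graph (not data from the poster).
nodes = [100, 200, 400, 800]
edges = [300, 750, 1900, 4700]
print(densification_exponent(nodes, edges))
```

Comparing the exponent fitted on a sample with the one fitted on the original graph is one way to check whether the temporal criterion is preserved.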

5 Targets
scale-down (= fewer nodes; same diameter, same degree, etc.)
back-in-time (match an earlier, real, smaller version of the graph)

6 Sampling Methods
RN: random nodes
RPN: PageRank random nodes
RDN: random nodes, degree-biased
RE: random edges
RNE
HYB (Hybrid)
RNN
RJ: random jump
RW: random walk
FF: Forest fire
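Three of the methods on this slide (RN, RJ, FF) can be sketched as follows. This is an illustrative reading, not the authors' code. The graph representation (a dict mapping every node to its list of out-neighbors), the function names, and the defaults `jump_prob=0.15` and `forward_prob=0.7` are all assumptions made for the example.

```python
import random


def induced_subgraph(graph, nodes):
    """Keep only the chosen nodes and the edges between them."""
    keep = set(nodes)
    return {u: [v for v in graph[u] if v in keep] for u in keep}


def random_node_sample(graph, n, rng):
    """RN: pick n nodes uniformly at random."""
    return set(rng.sample(list(graph), n))


def random_jump_sample(graph, n, rng, jump_prob=0.15):
    """RJ: random walk that jumps to a uniformly random node with jump_prob."""
    nodes = list(graph)
    if n > len(nodes) or jump_prob <= 0:
        raise ValueError("need n <= number of nodes and jump_prob > 0")
    current = rng.choice(nodes)
    visited = {current}
    while len(visited) < n:
        neighbors = graph[current]
        # Dead ends also force a jump so the walk cannot get stuck.
        if not neighbors or rng.random() < jump_prob:
            current = rng.choice(nodes)
        else:
            current = rng.choice(neighbors)
        visited.add(current)
    return visited


def forest_fire_sample(graph, n, rng, forward_prob=0.7):
    """FF: burn outward from random seeds until n nodes are burned."""
    nodes = list(graph)
    if n > len(nodes) or not 0 <= forward_prob < 1:
        raise ValueError("need n <= number of nodes and 0 <= forward_prob < 1")
    burned = set()
    while len(burned) < n:
        # Start (or restart, if the fire died out) from an unburned seed.
        seed = rng.choice([v for v in nodes if v not in burned])
        burned.add(seed)
        frontier = [seed]
        while frontier and len(burned) < n:
            u = frontier.pop()
            # Geometric count of links to burn, mean forward_prob / (1 - forward_prob).
            k = 0
            while rng.random() < forward_prob:
                k += 1
            unburned = [v for v in graph[u] if v not in burned]
            rng.shuffle(unburned)
            for v in unburned[:k]:
                if len(burned) >= n:
                    break
                burned.add(v)
                frontier.append(v)
    return burned


# Hypothetical 20-node directed graph, for demonstration only.
demo_graph = {i: [(i + 1) % 20, (i + 3) % 20] for i in range(20)}
demo_rng = random.Random(0)
picked = forest_fire_sample(demo_graph, 5, demo_rng)
print(induced_subgraph(demo_graph, picked))
```

All three functions return a node set, so the same `induced_subgraph` step turns any of them into a sampled graph. That keeps the methods interchangeable when their samples are compared against the original.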

7 4 Datasets
Arxiv (author-paper); Citation (HEP-TH, HEP-PH); A.S.; epinions.com
26K - 500K edges

8 Diameter vs N; CC vs degree

9 degree distribution; avg CC vs N

10 diameter; DPL

11 D-statistic vs sample size (axis annotated "better"); panels: scale-down, back-in-time

12 Conclusions
random nodes + a little exploration -> FF (RN, RJ are close)
15% sample seems enough
back-in-time concept

