Multigraph Sampling of Online Social Networks
Minas Gjoka, Carter Butts, Maciej Kurant, Athina Markopoulou

Presentation transcript:

1 Multigraph Sampling of Online Social Networks
Minas Gjoka, Carter Butts, Maciej Kurant, Athina Markopoulou

2 Outline
Multigraph sampling
– Motivation
– Sampling method
– Internet Measurements
– Conclusion

3 Problem statement
Obtain a representative sample of OSN users by exploration of the social graph.
[Figure: example social graph with nodes A through I]

4 Motivation for multiple relations
Principled methods for graph sampling
– Metropolis-Hastings Random Walk
– Re-weighted Random Walk
“Walking in Facebook: A Case Study of Unbiased Sampling of OSNs,” INFOCOM ’10
But graph characteristics affect mixing and convergence
– fragmented social graph
– highly clustered areas
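As an illustration of the first method named on this slide, here is a minimal sketch of a Metropolis-Hastings random walk that targets the uniform distribution over nodes. The toy graph, the function name `mhrw`, and the step count are assumptions made for this example; none of them come from the presentation.

```python
import random

def mhrw(graph, start, steps, rng):
    """Metropolis-Hastings random walk with a uniform target over nodes."""
    current = start
    visited = [current]
    for _ in range(steps):
        # Propose a neighbor of the current node uniformly at random.
        candidate = rng.choice(graph[current])
        # Accept with probability min(1, deg(current) / deg(candidate));
        # otherwise stay put, which counts as another visit to `current`.
        accept = min(1.0, len(graph[current]) / len(graph[candidate]))
        if rng.random() < accept:
            current = candidate
        visited.append(current)
    return visited

# Hypothetical toy graph: undirected, connected, stored as adjacency lists.
toy_graph = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["A", "C", "E"],
    "E": ["D"],
}

walk = mhrw(toy_graph, "A", 20000, random.Random(0))
# Visit frequencies should be close to uniform despite unequal degrees.
freq = {node: walk.count(node) / len(walk) for node in toy_graph}
```

On a fragmented graph the walk never leaves the component it starts in, and in highly clustered regions it escapes slowly, which is the concern the slide raises.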

5 Fragmented social graph
[Figure: the Friendship, Event attendance, and Group membership graphs and their Union, distinguishing the Largest Connected Component from Other Connected Components]

6 Highly clustered social graph
[Figure: the Friendship and Event attendance graphs and their Union]

7 Proposal
Graph exploration using multiple user relations
– perform random walk
– re-weighting at the end of the walk
– online convergence diagnostics applicable
Theoretical benefits
– faster mixing
– discovery of isolated components
Open questions
– how to combine relations
– implementation efficiency
– evaluation of sampling benefits in a realistic scenario
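The "re-weighting at the end of the walk" step can be sketched as follows, assuming the usual ratio estimator for random-walk samples: a simple random walk visits nodes in proportion to their degree, so each sample is weighted by the inverse of its degree. The function name and the toy numbers are hypothetical; for a multigraph walk the degree would be the total degree across all relations.

```python
def reweighted_mean(samples, total_degree, value):
    """Estimate the population mean of `value` from degree-biased samples."""
    # Weight each sampled node by 1 / degree to undo the degree bias.
    numerator = sum(value[v] / total_degree[v] for v in samples)
    denominator = sum(1.0 / total_degree[v] for v in samples)
    return numerator / denominator

# Hypothetical data: node A has degree 2 and is therefore visited twice
# as often as node B, which has degree 1.
samples = ["A", "A", "B"]
total_degree = {"A": 2, "B": 1}
age = {"A": 10, "B": 40}

naive = sum(age[v] for v in samples) / len(samples)    # biased toward A
corrected = reweighted_mean(samples, total_degree, age)
```

In this toy case the naive sample mean is 20, while the corrected estimate is 25, which matches the true mean of the two nodes.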

8 [Figure: nodes A through K drawn three times, once for each relation: Friends, Events, Groups]

9 [Figure: nodes A through K drawn in four panels; panel labels: Friends, Events, Groups]

10 Combination of multiple relations
G = Friends + Events + Groups (G is a union graph)
G* = Friends + Events + Groups (G* is a union multigraph)
deg(F, tot) = 8, deg(F, red) = 1, deg(F, blue) = 3, deg(F, green) = 4
[Figure: the union graph G and the union multigraph G* over nodes A through K]
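The difference between the union graph and the union multigraph can be made concrete with a small sketch. The edge lists below are hypothetical: they are chosen only so that node F has per-relation degrees 1, 3, and 4 as on the slide, and the actual edges in the figure are not recoverable from the transcript.

```python
# Hypothetical edge lists, one per relation (undirected, no self-loops).
relations = {
    "friends": [("F", "D")],
    "events": [("F", "D"), ("F", "H"), ("F", "E")],
    "groups": [("F", "D"), ("F", "G"), ("F", "C"), ("F", "B")],
}

def degree_in(edges, node):
    """Number of edges of one relation that touch `node`."""
    return sum(node in edge for edge in edges)

def multigraph_degree(relations, node):
    """Degree in the union multigraph: parallel edges are all counted."""
    return sum(degree_in(edges, node) for edges in relations.values())

def union_graph_degree(relations, node):
    """Degree in the union graph: parallel edges collapse to one neighbor."""
    neighbors = set()
    for edges in relations.values():
        for u, v in edges:
            if node == u:
                neighbors.add(v)
            elif node == v:
                neighbors.add(u)
    return len(neighbors)

per_relation = {name: degree_in(edges, "F") for name, edges in relations.items()}
total = multigraph_degree(relations, "F")     # 1 + 3 + 4
distinct = union_graph_degree(relations, "F")
```

With these assumed edges, F and D are connected under all three relations, so the multigraph degree of F is 8 while its union-graph degree is only 6.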

11 Multigraph sampling: Implementation efficiency
Degree information available without enumeration
Take advantage of pages functionality
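One plausible reading of this slide is a two-stage step: first choose a relation with probability proportional to the node's degree in that relation, then choose one neighbor uniformly within it. This is equivalent to picking an edge of the union multigraph uniformly, without building the multigraph. The sketch below is an assumption-laden illustration: the in-memory neighbor lists stand in for what a crawler would obtain as a count plus a single fetched entry, and all names are hypothetical.

```python
import random
from collections import Counter

def multigraph_step(node, neighbors_by_relation, rng):
    """Pick the next node of a random walk on the union multigraph."""
    relation_names = list(neighbors_by_relation)
    # Stage 1: choose a relation in proportion to the node's degree in it.
    # Assumes the node has at least one neighbor in some relation.
    degrees = [len(neighbors_by_relation[r][node]) for r in relation_names]
    relation = rng.choices(relation_names, weights=degrees, k=1)[0]
    # Stage 2: choose one neighbor uniformly within the chosen relation.
    return relation, rng.choice(neighbors_by_relation[relation][node])

# Hypothetical neighbor lists for a single node F.
neighbors_by_relation = {
    "friends": {"F": ["D"]},
    "events": {"F": ["D", "H", "E"]},
    "groups": {"F": ["D", "G", "C", "B"]},
}

rng = random.Random(1)
draws = [multigraph_step("F", neighbors_by_relation, rng) for _ in range(8000)]
relation_freq = Counter(relation for relation, _ in draws)
neighbor_freq = Counter(neighbor for _, neighbor in draws)
```

In this toy setup D is reachable through three of F's eight multigraph edges, so it should be chosen roughly three times as often as any other neighbor.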

12 Multigraph sampling: Internet Measurements
Last.fm, an Internet radio service
– social networking features
– multiple relations
– fragmented graph components and highly clustered users expected
Last.fm relations used
– Friends
– Groups
– Events
– Neighbors

13 Data Collection
Crawling using Last.fm API and HTML scraping
Sampled node information: userID, country, age, registration time, …

14 Uniform userID Sampling (UNI)
UNI sampling possible in Last.fm
– will serve as the ground truth
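Uniform userID sampling is commonly done by rejection sampling over the ID space: draw an ID uniformly at random and keep it only if an account with that ID exists. The sketch below assumes that approach; the `user_exists` callback, the ID range, and the set of valid IDs are hypothetical stand-ins and do not describe the Last.fm API.

```python
import random
from collections import Counter

def uni_sample(max_user_id, user_exists, n, rng):
    """Draw n existing user IDs uniformly at random, with replacement."""
    sample = []
    # Assumes at least one ID in [1, max_user_id] exists; otherwise this loops forever.
    while len(sample) < n:
        candidate = rng.randint(1, max_user_id)  # inclusive on both ends
        if user_exists(candidate):               # reject unallocated IDs
            sample.append(candidate)
    return sample

# Hypothetical ID space: 12 possible IDs, of which 5 belong to real accounts.
existing_ids = {2, 3, 5, 7, 11}
sample = uni_sample(12, existing_ids.__contains__, 5000, random.Random(2))
id_freq = Counter(sample)
```

Because every existing ID is equally likely to be accepted, such a sample can serve as a reference against which crawl-based estimates are compared.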

15 Summary of datasets
Last.fm - July 2010
Friends: 0.3%, Events: 5.4%, Groups: 94.2%, Neighbors: 0.02%

Crawl type                       # Total Users   % Unique Users
Friends                          5x50K           71%
Events                           5x50K           58%
Groups                           5x50K           74%
Neighbors                        5x50K           53%
Friends-Events-Groups-Neighbors  5x50K           76%
UNI                              500K            99%

16 Comparison to UNI
[Figure: two panels, each with an axis labeled "% of Subscribers"]

17 Last.fm Charts Estimation
Application of sampling

18 Last.fm Charts Estimation
Artist Charts

19 Related Work
Fastest mixing Markov Chain
– Boyd et al. - SIAM Review 2004
Sampling in fragmented graphs
– Ribeiro et al., Frontier Sampling - IMC 2010
Last.fm studies
– Konstas et al. - SIGIR ’09
– Schifanella et al. - WSDM ’10

20 Conclusion
Introduced multigraph sampling
– simple and efficient
– discovers isolated components
– better approximation of distributions and means
– multigraph dataset planned for public release
Future work on multigraph sampling
– selection of relations
– weighted relations

21 Thank you
Questions?

