Who are the most influential bloggers?


Presentation on theme: "Who are the most influential bloggers?"— Presentation transcript:

1 Who are the most influential bloggers?
Jure Leskovec, Machine Learning Department, Carnegie Mellon University

2 Question…
- I have 10 minutes. Which blogs should I read to be most up to date?
- Who are the most influential bloggers?

3 A single story propagates…
So, who is really the influential one here?
[Figure: cascade of an obscure technology story. Node labels: Small tech blog, Small tech blog, Slashdot, Wired, Blogs, New Scientist, New York Times, CNN, BBC.]
What about multiple stories propagating?
[Author note: add a small cascade drawing, chain vs. star.]

4 Blogs: Detecting big stories early
Want to read things before others do.
- Data: all posts from the top 50,000 blogs for 2006
- 60 million posts, 120 million links
[Figure captions: "Detect blue & green soon but miss red." and "Detect all stories but late."]

5 Problem definition: Covering blogs
- Given a budget (e.g., 3 blogs)
- Which blogs should we select to cover as much of the blogosphere as possible?
- Bad news: solving this exactly is NP-hard
- Good news: Theorem: our algorithm runs in linear time and gives a factor-3 approximation
[Figure label: Blogosphere]
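To make the budgeted selection problem concrete, here is a minimal sketch of the textbook greedy heuristic for maximum coverage. It is not claimed to be the algorithm behind the theorem on this slide; the function name `greedy_cover`, the blog names, and the story sets are all hypothetical, and the sketch ignores how early each blog reports a story.

```python
from typing import Dict, List, Set


def greedy_cover(blog_stories: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick up to `budget` blogs, each time adding the blog that covers
    the most stories not yet covered by the earlier picks."""
    covered: Set[str] = set()
    chosen: List[str] = []
    for _ in range(budget):
        best_blog, best_gain = None, 0
        for blog in sorted(blog_stories):  # sorted for deterministic tie-breaking
            if blog in chosen:
                continue
            gain = len(blog_stories[blog] - covered)  # newly covered stories
            if gain > best_gain:
                best_blog, best_gain = blog, gain
        if best_blog is None:  # no remaining blog adds a new story
            break
        chosen.append(best_blog)
        covered |= blog_stories[best_blog]
    return chosen


# Hypothetical toy data: which stories each blog mentions.
toy = {
    "blog_a": {"s1", "s2", "s3"},
    "blog_b": {"s3", "s4"},
    "blog_c": {"s1", "s2"},
    "blog_d": {"s5"},
}
picked = greedy_cover(toy, budget=3)
print(picked)
```

On this toy input the sketch skips `blog_c` even though it mentions two stories, because both are already covered by `blog_a`. That is the difference between ranking blogs by raw counts and ranking them by what they add to the current selection.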

6 Experimental results
Which blogs to read to be most up to date?
[Figure: % of stories detected (higher is better) vs. number of selected blogs. Curves: Our solution, In-links (used by Technorati), Out-links, # posts, Random.]

7 So, who is influential? What should I read?
[The entries of the Blog column are missing from this transcript.]

 k   Score   Posts   InLinks   OutLinks
 1   0.13     4593      4636       5255
 2   0.18     1534      1206       3495
 3   0.22      924       576       2701
 4   0.26      261       941       3630
 5   0.29     1839     12642       6323
 6   0.32      189      2313       9272
 7   0.34      475       717       4944
 8   0.35      895       247      10201
 9   0.37     5776      6337       6183
10   0.38     4682      3205       3102
11   0.39     1862       463       6597
12   0.40     6223      3324      17172
13   0.41    25925       199      47933
14   0.42     1174       128        939
15   0.43      302       428       2481
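One way to read the Score column is as a cumulative value: the score reached after selecting the first k blogs. The slide does not state this, so treat it as an assumption. Under that reading, the short sketch below computes how much each additional blog contributes, using the Score values from slide 7.

```python
# Score values for k = 1..15, copied from the table on slide 7.
# Assumption: each value is cumulative over the first k selected blogs.
scores = [0.13, 0.18, 0.22, 0.26, 0.29, 0.32, 0.34, 0.35,
          0.37, 0.38, 0.39, 0.40, 0.41, 0.42, 0.43]

# Marginal gain of the k-th blog = score(k) - score(k - 1), with score(0) = 0.
# Rounded to two decimals because the slide reports only two decimals.
gains = [round(b - a, 2) for a, b in zip([0.0] + scores, scores)]

for k, gain in enumerate(gains, start=1):
    print(f"blog {k:2d}: +{gain:.2f}")
```

Under this assumption the first blog alone contributes 0.13, while each of the last five contributes about 0.01, so most of the value comes from the first few picks.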

8 Same problem: Water Network
Given:
- a real city water distribution network
- data on how contaminants spread over time
Place sensors (to save lives).
Problem posed by the US Environmental Protection Agency.
[Figure labels: c1, c2, S, S]
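The sensor-placement version can be sketched the same way as blog selection. The example below uses a lazy greedy loop: it keeps stale marginal gains in a priority queue and recomputes a gain only when that candidate reaches the top. This is a sketch under stated assumptions, not the speaker's implementation. The objective (for each contamination scenario, count the best saving any chosen sensor achieves), the function names, and all numbers are hypothetical.

```python
import heapq
from typing import Dict, List, Set, Tuple


def total_saved(scenarios: List[Dict[str, float]], sensors: Set[str]) -> float:
    """For each scenario, take the largest saving any chosen sensor achieves
    (0 if no chosen sensor detects it), then sum over scenarios."""
    return sum(
        max((sc.get(s, 0.0) for s in sensors), default=0.0) for sc in scenarios
    )


def lazy_greedy(
    scenarios: List[Dict[str, float]], candidates: List[str], k: int
) -> Tuple[List[str], float]:
    chosen: List[str] = []
    current = 0.0
    # Max-heap via negated gains. Each entry records the size of `chosen`
    # at the time its gain was computed, so stale entries can be recognised.
    heap = [(-total_saved(scenarios, {c}), c, 0) for c in sorted(candidates)]
    heapq.heapify(heap)
    while heap and len(chosen) < k:
        neg_gain, node, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            # Gain is up to date and tops the queue, so this is the best pick.
            if -neg_gain <= 0:
                break
            chosen.append(node)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current selection and re-queue.
            gain = total_saved(scenarios, set(chosen) | {node}) - current
            heapq.heappush(heap, (-gain, node, len(chosen)))
    return chosen, current


# Hypothetical data: people saved if a sensor at that node detects the scenario.
toy_scenarios = [
    {"n1": 100.0, "n2": 40.0},
    {"n2": 60.0, "n3": 80.0},
    {"n1": 10.0, "n3": 30.0},
]
placed, saved = lazy_greedy(toy_scenarios, ["n1", "n2", "n3"], k=2)
print(placed, saved)
```

Skipping recomputation is safe only when a candidate's gain can never grow as the selection grows, which holds for this toy objective. Whether it holds for a real network objective has to be checked separately.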

9 Water network: Results
[w/ Ostfeld et al., J. of Water Resource Planning]
Our approach performed best at the Battle of Water Sensor Networks competition.

Author              Score
CMU (CELF)             26
Sandia                 21
U Exeter               20
Bentley Systems        19
Technion (1)           14
Bordeaux               12
U Cyprus               11
U Guelph                7
U Michigan              4
Michigan Tech U         3
Malcolm                 2
Proteo                  (score missing in transcript)
Technion (2)            1

[Figure: population saved (higher is better) vs. number of placed sensors. Curves: CELF, Degree, Random, Population, Flow.]

