
EK Ch 17: Power laws and rich-get-richer phenomena (with an application to Web spam detection: "Spam, Damn Spam and Statistics")




1 EK Ch 17: Power laws and rich-get-richer phenomena (with an application to Web spam detection: "Spam, Damn Spam and Statistics")

2 Numbers: Your grades so far in this class. The weight of an apple. The temperature in Chicago on July 4th. The height of a Dutch man. The speed of a car on I-90. Most instances are typical. Seeing a rare number is very surprising. These numbers are well-characterized by the average and the standard deviation.

3 City populations
1. New York 8,310,212
2. Los Angeles 3,834,340
3. Chicago 2,836,658
230. Cambridge, MA 101,335
240. Gainesville, FL 95,447
250. McKinney, TX 54,369
A few cities with high population. Many cities with low population.

4 City populations

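5 Power Law: The fraction $f(k)$ of items with popularity $k$ is proportional to $k^{-c}$: $f(k) \propto k^{-c}$. Writing $f(k) = a\,k^{-c}$ for a constant $a$ and taking logarithms gives $\log f(k) = \log a + \log k^{-c} = \log a - c \log k$, so a power law appears as a straight line with slope $-c$ on a log-log plot.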
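Because log f(k) is linear in log k with slope -c, one rough way to estimate the exponent is an ordinary least-squares line through the points (log k, log f(k)). The sketch below does this with the standard library only. The function name `fit_power_law_exponent` and the synthetic data (a = 3.0, c = 2.1) are illustrative assumptions, not something from the slides, and a straight-line fit on log-log axes is only a quick diagnostic, not a rigorous test for a power law.

```python
import math


def fit_power_law_exponent(ks, fs):
    """Fit log f(k) = log a - c * log k by least squares.

    Hypothetical helper: returns (c, log_a), where c is minus the slope
    of the fitted line and log_a is its intercept.
    """
    xs = [math.log(k) for k in ks]
    ys = [math.log(f) for f in fs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


# Synthetic, noise-free data generated from f(k) = 3.0 * k^(-2.1) (assumed values).
ks = list(range(1, 101))
fs = [3.0 * k ** -2.1 for k in ks]
c, log_a = fit_power_law_exponent(ks, fs)
print("estimated c:", c)
print("estimated a:", math.exp(log_a))
```

On real data the low-count tail is noisy, so the fitted slope can be sensitive to which range of k is included.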

6 City populations

7 Number of Web page in-links (Broder+)

8 Other examples

9 Length of the URL’s host

10 Number of host name resolutions to a single IP

11 Web page out-degrees

12 Web page in-degrees

13 Word count variance

14 Content evolution

15 Cluster size

16 … because they care to know ;-)

17 Why does data exhibit power laws? Imitation → Power law

18 Constructing the web (imitation)
1. Pages are created in order, named 1, 2, …, N.
2. When created, page j links to a page as follows:
a) With probability p, pick a page i uniformly at random from 1, …, j-1 and link to i.
b) With probability (1-p), pick a page i uniformly at random from 1, …, j-1 and link to the page that i links to.
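The construction above can be simulated in a few lines. This is a minimal sketch, not the lecture's own code: the names `build_web` and `in_degrees` are made up for illustration, and one detail the slide leaves open is handled by assumption. Page 1 has no earlier page to link to, so when the copy step picks a page with no out-link, the sketch links to that page directly.

```python
import random
from collections import Counter


def build_web(n, p, seed=0):
    """Simulate the copying model; returns {page j: page that j links to}.

    Assumption (not in the slides): if the copy step picks a page without
    an out-link (only page 1), page j links to that page itself.
    """
    rng = random.Random(seed)
    link = {}
    for j in range(2, n + 1):
        i = rng.randint(1, j - 1)  # uniform over earlier pages 1..j-1
        if rng.random() < p or i not in link:
            link[j] = i            # step (a): link to i directly
        else:
            link[j] = link[i]      # step (b): imitate i's choice
    return link


def in_degrees(link, n):
    """In-degree of each page 1..n, as a list indexed by page - 1."""
    counts = Counter(link.values())
    return [counts.get(page, 0) for page in range(1, n + 1)]


n = 10000
link = build_web(n, p=0.2)
degs = in_degrees(link, n)
histogram = Counter(degs)  # histogram[k] = number of pages with in-degree k
for k in sorted(histogram)[:10]:
    print(k, histogram[k] / n)
```

Plotting the printed fractions against k on log-log axes is one way to check by eye whether the simulated in-degrees look roughly like a power law.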

19 The rich get richer 2 b) With prob. (1-p), pick page i uniformly at random and link to the page that i links to. (Figure labels: 1/4, 3/4)

20 The rich get richer 2 b) With prob. (1-p), pick page i uniformly at random and link to the page that i links to. Equivalently, 2 b) With prob. (1-p), pick a page with probability proportional to its in-degree and link to it.
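A short calculation shows why the two versions of step 2 b) agree. It is a sketch under the assumption that every earlier page has exactly one out-link (page 1, which has none, is ignored), and the symbol $d_{\text{in}}(\ell)$ for the current in-degree of page $\ell$ is notation introduced here. Picking $i$ uniformly from the $j-1$ earlier pages and copying its link reaches page $\ell$ exactly when $i$ is one of the pages that link to $\ell$:

```latex
\Pr[\, j \text{ links to } \ell \text{ via step 2b} \,]
  = \sum_{i=1}^{j-1} \frac{1}{j-1}\, \mathbf{1}[\, i \text{ links to } \ell \,]
  = \frac{d_{\text{in}}(\ell)}{j-1}
  \;\propto\; d_{\text{in}}(\ell).
```

So a page that already has many in-links is proportionally more likely to receive the next copied link, which is the rich-get-richer effect.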

21 Food for thought Why is Harry Potter popular? If we could re-play history, would we still read Harry Potter, or would it be some other book?

22 Information cascades and the rich: Information cascades mean that some people get a little bit richer by chance. Rich-get-richer dynamics then mean that these randomly rich people get a lot richer very fast.

23 Music download site – 8 worlds
1. “Let’s go driving,” Barzin
2. “Silence is sexy,” Einstürzende Neubauten
3. “Go it alone,” Noonday Underground
10. “Picadilly Lilly,” Tiger Lillies
(The slide shows this song list twice, with the numbers 18 3 47 2 59 7 10 1.)

