# EK Ch 17: Power laws and rich-get-richer phenomena (with an application to Web spam detection: "Spam, Damn Spam and Statistics")


Numbers

- Your grades so far in this class.
- The weight of an apple.
- The temperature in Chicago on July 4th.
- The height of a Dutch man.
- The speed of a car on I-90.

Most instances are typical. Seeing a rare number is very surprising. These numbers are well characterized by the average and the standard deviation.

City populations

| Rank | City | Population |
|------|------|------------|
| 1 | New York | 8,310,212 |
| 2 | Los Angeles | 3,834,340 |
| 3 | Chicago | 2,836,658 |
| 230 | Cambridge, MA | 101,335 |
| 240 | Gainesville, FL | 95,447 |
| 250 | McKinney, TX | 54,369 |

A few cities have high populations. Many cities have low populations.

City populations

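Power law: the fraction $f(k)$ of items with popularity $k$ is proportional to $k^{-c}$.

$f(k) \propto k^{-c}$, that is, $f(k) = a\,k^{-c}$ for some constant $a$.

Taking logarithms: $\log f(k) = \log a + \log k^{-c} = \log a - c \log k$.

So on a log-log plot, a power law appears as a straight line with slope $-c$.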
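Because a power law is a straight line with slope -c in log-log coordinates, the exponent can be read off with a line fit. The following is a minimal sketch, not taken from the slides: the function name `fit_power_law_exponent` and the synthetic data are illustrative assumptions, and a plain least-squares fit on logs is only a rough estimator for real, noisy data.

```python
import math

def fit_power_law_exponent(ks, fs):
    """Estimate c in f(k) ~ k^(-c) via least squares on (log k, log f)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(f) for f in fs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx          # slope of the log-log line
    return -slope              # power-law exponent c is minus the slope

# Synthetic example (assumed data): exact power law with c = 2 on k = 1..100.
ks = list(range(1, 101))
raw = [k ** -2.0 for k in ks]
total = sum(raw)
fs = [r / total for r in raw]  # normalize so the fractions sum to 1

c_hat = fit_power_law_exponent(ks, fs)
print(f"estimated exponent: {c_hat:.3f}")
```

Normalizing the fractions only changes the constant of proportionality, which shifts the line up or down without changing its slope.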

City populations

Number of Web page in-links (Broder+)

Other examples

Length of the URL’s host

Number of host name resolutions to a single IP

Web page out-degrees

Web page in-degrees

Word count variance

Content evolution

Cluster size

… because they care to know ;-)

Why does data exhibit power laws? Imitation → Power law

Constructing the web

1. Pages are created in order, named 1, 2, …, N.
2. When created, page j links to a page as follows:
   a) With probability p, pick a page i uniformly at random from 1, …, j-1 and link to i.
   b) With probability (1-p), pick a page i uniformly at random from 1, …, j-1 and link to the page that i links to.

Step 2b is imitation: page j copies the link choice of an earlier page.
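The construction above can be sketched as a short simulation. This is a minimal sketch under stated assumptions, not the authors' code: the function name `simulate_copy_model` is hypothetical, each page creates exactly one link, and because page 1 has no out-link to copy, a copy step that lands on page 1 falls back to linking to page 1 directly (the slides do not specify this case).

```python
import random
from collections import Counter

def simulate_copy_model(n_pages, p, seed=0):
    """Simulate the page-creation process with copying probability (1 - p)."""
    rng = random.Random(seed)
    # target[j] is the page that page j links to; index 0 is unused
    # and page 1 has no out-link (there is no earlier page).
    target = [None] * (n_pages + 1)
    for j in range(2, n_pages + 1):
        i = rng.randint(1, j - 1)          # uniform over earlier pages 1..j-1
        if rng.random() < p or target[i] is None:
            target[j] = i                  # rule (a), or assumed fallback for page 1
        else:
            target[j] = target[i]          # rule (b): imitate page i's link
    in_degree = Counter(t for t in target if t is not None)
    return target, in_degree

target, in_degree = simulate_copy_model(n_pages=10_000, p=0.2, seed=42)

# How many pages have each in-degree k (pages with in-degree 0 are not counted).
pages_per_degree = Counter(in_degree.values())
for k in sorted(pages_per_degree)[:5]:
    print(k, pages_per_degree[k])
```

Plotting `pages_per_degree` on log-log axes is one way to check whether the simulated in-degrees look like a power law.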

The rich get richer

2 b) With probability (1-p), pick a page i uniformly at random and link to the page that i links to.

(Figure labels: 1/4, 3/4)

The rich get richer

2 b) With probability (1-p), pick a page i uniformly at random and link to the page that i links to.

Equivalently:

2 b) With probability (1-p), pick a page with probability proportional to its in-degree and link to it.
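A short derivation shows why the two phrasings of step 2b agree. It is a sketch under the assumption that every earlier page has exactly one out-link; page 1 has none, so the case where page 1 is drawn needs a separate convention that the slides do not specify. Here $d_\ell$ (notation introduced for illustration) is the number of pages among $1, \dots, j-1$ that link to page $\ell$, and $i \to \ell$ means "page $i$ links to page $\ell$".

$$
\Pr[\, j \to \ell \mid \text{step 2b} \,]
= \sum_{i=1}^{j-1} \frac{1}{j-1}\, \mathbf{1}[\, i \to \ell \,]
= \frac{d_\ell}{j-1}
\;\propto\; d_\ell .
$$

Page j ends up linking to $\ell$ exactly when the uniformly chosen page i is one of the $d_\ell$ pages that already link to $\ell$, so pages with more in-links are proportionally more likely to gain another.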

Food for thought Why is Harry Potter popular? If we could re-play history, would we still read Harry Potter, or would it be some other book?

Information cascades and the rich

Information cascade: some people get a little bit richer by chance.

Rich-get-richer dynamics: those randomly rich people then get a lot richer very fast.

Music download site – 8 worlds

- 1\. "Let's go driving," Barzin
- 2\. "Silence is sexy," Einstürzende Neubauten
- 3\. "Go it alone," Noonday Underground
- 10\. "Picadilly Lilly," Tiger Lillies

(Figure values shown on the slide: 18, 3, 47, 2, 59, 7, 10, 1)
