
# 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira.


1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira

2 The Web as a Graph Consider the World Wide Web as a graph, with web pages as nodes and hyperlinks between pages as edges. [Figure: example pages links.html, resume.html, index.html and http://cnn.com shown as nodes.]

3 Studying the Web Since the Web emerged there has been a lot of interest in: 1. Empirically studying properties of the Web Graph. 2. Modeling the Web Graph mathematically. Benefits of Generative Models: 1. Simulation: when real data is scarce. 2. Extrapolation: how will the graph change? 3. Understanding: inspire further research on real data.

4 Power Law The distribution of a random variable X follows a power law if $\Pr[X=k] \sim C k^{-\alpha}$, where $f(x) \sim g(x)$ means $\lim_{x \to \infty} f(x)/g(x) = 1$, e.g. $(x+1) \sim (x+2)$. Example: $\Pr[X=k] = k^{-2}$.

5 Power Law: $\Pr[X=k] = k^{-2}$

6 Power Law If $\Pr[X=k] \sim C k^{-\alpha}$ then $\log \Pr[X=k] \sim \log C - \alpha \log k$. Example: $\Pr[X=k] = k^{-2}$ gives $\log \Pr[X=k] = -2 \log k$.
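Taking logs turns a power law into a straight line whose slope is the negated exponent. A minimal sketch of reading the exponent off log-log data follows; the helper name `loglog_slope` and the range of k are assumptions made for illustration, not part of the talk.

```python
import math

def loglog_slope(ks, probs):
    """Least-squares slope of log(prob) against log(k)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(q) for q in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

ks = list(range(1, 101))           # assumed range of k for the illustration
probs = [k ** -2 for k in ks]      # the slide's example, Prob[X=k] = k^-2
slope = loglog_slope(ks, probs)
print(slope)                       # an estimate of -alpha
```

On real degree data the fit is usually noisy in the tail, so a slope obtained this way is only a rough estimate.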

7 Power Law: Log-Log plot

8 Power Law contd. More general definition: $\Pr[X \ge k] \sim C k^{-\alpha}$. Particularly useful if X takes on real values. Sometimes referred to as “heavy tailed” or “scale free.”

9 Power Laws in Degree distribution Let G be a graph. Let $X_k$ be the proportion of nodes with degree k in G. Then if $X_k \sim C k^{-\alpha}$ we say that G has a power law degree distribution.

10 Properties of the Web Graph A Power-law degree distribution has been observed in a wide variety of graphs including citation networks, social networks, protein-protein interaction networks and so on. It has also been observed in the Web Graph. [Barabási & Albert]

11 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

12 Classic Random Graph Models In the G(n,p) random graph model: 1. There are n nodes. 2. There is an edge between any two nodes with probability p. It was proposed by Erdős and Rényi in the 1960s.

13 Online G(n,p) In this model each new node makes k connections to existing nodes uniformly at random. For this talk we will focus on k = 1, hence the graph will be a tree.

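14 Online G(n,p) [Figure: the tree at T=1 through T=4. At T=3 the new node attaches to each of the two existing nodes with probability ½; at T=4 to each of the three existing nodes with probability ⅓.]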

15 Properties of Online G(n,p) $X_k$ = proportion of nodes with degree k. $E[X_k] = \Theta(2^{-k})$. $E[\text{degree of first node}] = 1 + 1/2 + 1/3 + 1/4 + \dots + 1/n = \Theta(\log n)$. $E[\text{max degree}] = \Theta(\log n)$. Not power-lawed!
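The harmonic sum for the first node can be checked with a small simulation. This is a sketch under assumptions: it counts only in-degree, and with this indexing node t attaches to the first node with probability 1/(t-1), so the expectation is the harmonic number H(n-1). The function names, n, the number of runs and the seed are all illustrative.

```python
import random

def first_node_in_degree(n, rng):
    """Grow an n-node tree by uniform attachment; return the first node's in-degree."""
    in_degree = 0
    for t in range(2, n + 1):
        # Node t picks one of the t-1 existing nodes uniformly at random;
        # index 0 stands for the first node.
        if rng.randrange(t - 1) == 0:
            in_degree += 1
    return in_degree

def harmonic(m):
    """H(m) = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))

rng = random.Random(0)             # fixed seed so the run is repeatable
n, runs = 1000, 200                # much smaller than the talk's experiments
average = sum(first_node_in_degree(n, rng) for _ in range(runs)) / runs
print(average, harmonic(n - 1))    # the two values should be close
```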

16 Online G(n,p) (n=100,000, average of 100 runs)

17 Preferential Attachment In the Preferential Attachment model, each new node connects to the existing nodes with a probability proportional to their degree. [Barabási & Albert]

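18 Preferential Attachment Degree = in-degree + out-degree. [Figure: the graph at T=1 through T=4, with nodes labeled by degree. When the existing nodes have degrees 3 and 1, the new node attaches to them with probabilities ¾ and ¼.]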

19 Preferential Attachment Preferential Attachment gives a power-law degree distribution. [Mitzenmacher, Cooper & Frieze 03, KRRSTU00] E[degree of 1st node] = √n
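A minimal sketch of degree-proportional attachment with out-degree 1, assuming the process starts from a single node with a self-loop (degree 2), as in the other models of this talk. Sampling from a list of edge endpoints is a standard trick for choosing a node with probability proportional to its degree. The function name and seed are hypothetical.

```python
import random

def preferential_attachment_degrees(n, rng):
    """Return the degree (in + out) of each of n nodes, each with out-degree 1."""
    degree = [2]          # node 0 with a self-loop: in-degree 1 + out-degree 1
    endpoints = [0, 0]    # every edge contributes both of its endpoints
    for new in range(1, n):
        target = rng.choice(endpoints)   # picked with probability degree / total degree
        degree[target] += 1
        degree.append(1)                 # the new node's single out-edge
        endpoints.extend([target, new])
    return degree

rng = random.Random(1)
degrees = preferential_attachment_degrees(10_000, rng)
print(degrees[0], max(degrees))
```

After the second node arrives the degrees are 3 and 1, so the third node picks them with probabilities ¾ and ¼.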

20 Preferential Attachment

21 Other Models Kumar et al. proposed the “copying model.” [KRRSTU00] Leskovec et al. proposed a “forest fire” model which has some similarities to this work. [LKF05]

22 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

23 Motivating Questions Why would a new node connect to nodes of high degree? Are high degree nodes more attractive? Or are there other explanations? How does a new node find out what the high degree nodes are?

24 Motivating Questions Suppose each page has a small probability p of being interesting. Suppose a user does an (undirected) random walk until they find an interesting page. Motivating Observation: If p is small then this is the same as preferential attachment. What about other processes and directed graphs?

25 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

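26 Directed 1-step Random Surfer, p=.5 Start with a single node with a self-loop. For each new node: 1. Choose a node uniformly at random. 2. With probability p connect to it. 3. With probability (1-p) connect to its neighbor. [Figure: the graph at T=1, T=2 and T=3. At T=3 the new node connects to the first node with probability (½)(½) + (½)(½) + (½)(½) = ¾ and to the second node with probability ¼.]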
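The ¾ and ¼ in the figure can be reproduced by summing over the uniformly chosen starting node. The sketch below assumes every node has exactly one out-edge, stored in a list `out_neighbor` (a representation chosen for this example, not taken from the talk).

```python
from fractions import Fraction

def one_step_probabilities(out_neighbor, p):
    """Exact probability that the next node links to each existing node.

    out_neighbor[v] is the single node that v points to.
    """
    t = len(out_neighbor)
    prob = [Fraction(0)] * t
    for v, w in enumerate(out_neighbor):
        prob[v] += Fraction(1, t) * p          # picked v and connected to it
        prob[w] += Fraction(1, t) * (1 - p)    # picked v and stepped to its neighbor w
    return prob

# Two nodes: node 0 has a self-loop, node 1 points to node 0.
probs = one_step_probabilities([0, 0], Fraction(1, 2))
print(probs)
```

Each node v ends up with probability p/t + (1-p)·(in-degree of v)/t, a uniform term plus a term proportional to in-degree.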

27 Directed 1-step Random Surfer It turns out this model is a mixture of connecting to nodes uniformly at random and preferential attachment. It has a power-law degree distribution. But taking one step is not very natural. What about doing a real random walk?

28 Directed Coin Flipping model 1. Pick a node uniformly at random. 2. Flip a coin of bias p. If HEADS connect to current node, else walk to neighbor. [Figure: nodes A, B, C, D and a new node. The random starting node is A. 1. Coin toss: TAIL (at node A). 2. Coin toss: TAIL (at node B). 3. Coin toss: HEAD (at node C).]

29 Directed Coin Flipping model 1. At time 1, we start with a single node with a self-loop. 2. At time t, we choose a node u uniformly at random. 3. We then flip a coin of bias p. 4. If the coin comes up heads, we connect to the current node. 5. Else we walk to a random neighbor and go to step 3. “Each page has equal probability p of being interesting to us.”
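The five steps above can be sketched as a short simulation. It assumes out-degree 1, so “a random neighbor” is the node's only out-neighbor, and it requires p > 0 so that each walk terminates. The names `directed_coin_flipping` and `in_degrees`, the graph size and the seed are illustrative choices.

```python
import random

def directed_coin_flipping(n, p, rng):
    """Return parent[], where parent[v] is the node that v links to."""
    parent = [0]                       # time 1: a single node with a self-loop
    for new in range(1, n):
        current = rng.randrange(new)   # uniform over the existing nodes
        while rng.random() >= p:       # tails, with probability 1 - p
            current = parent[current]  # walk along the only out-edge
        parent.append(current)         # heads: connect to the current node
    return parent

def in_degrees(parent):
    """In-degree of every node, ignoring the initial self-loop."""
    counts = [0] * len(parent)
    for v, u in enumerate(parent):
        if v != u:
            counts[u] += 1
    return counts

rng = random.Random(2)
parent = directed_coin_flipping(10_000, 0.5, rng)
degs = in_degrees(parent)
print(max(degs), sum(1 for d in degs if d == 0))
```

A histogram of `degs` on log-log axes is the kind of plot shown in the experimental slides, though this sketch uses a smaller graph and a single run.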

30 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

31 Is Directed Coin-Flipping Power-lawed? We don’t know, but we do have some partial results.

32 Virtual Degree Definitions: Let $l_i(u)$ be the number of level-i descendants of node u: $l_1(u)$ = # of children, $l_2(u)$ = # of grandchildren, etc. Let $\beta = (\beta_1, \beta_2, \dots)$ be a sequence of real numbers with $\beta_1 = 1$. Then $v_\beta(u) = 1 + \beta_1 l_1(u) + \beta_2 l_2(u) + \beta_3 l_3(u) + \dots$ We’ll call $v_\beta(u)$ the “virtual degree of u with respect to $\beta$.”

33 Virtual Degree [Figure: a node u with 2 children and 4 grandchildren.] $v(u) = 1 + \beta_1 (2) + \beta_2 (4) + \beta_3 (0) + \beta_4 (0) + \dots$
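A minimal sketch of computing a virtual degree on a tree stored as a parent array (an assumed representation; `level_counts` and `virtual_degree` are hypothetical names). The example tree mirrors the figure: node 1 plays the role of u with 2 children and 4 grandchildren. Levels deeper than the supplied `beta` list are treated as having coefficient 0.

```python
from fractions import Fraction

def level_counts(parent, u):
    """Return [l_1(u), l_2(u), ...]: descendants of u grouped by depth below u."""
    children = [[] for _ in parent]
    for v, w in enumerate(parent):
        if v != w:                     # skip the root's self-loop
            children[w].append(v)
    counts, frontier = [], children[u]
    while frontier:
        counts.append(len(frontier))
        frontier = [c for v in frontier for c in children[v]]
    return counts

def virtual_degree(parent, u, beta):
    """v_beta(u) = 1 + sum_i beta_i * l_i(u); beta[0] holds beta_1."""
    return 1 + sum(b * l for b, l in zip(beta, level_counts(parent, u)))

# Node 0 is the root; node 1 has children 2, 3 and grandchildren 4, 5, 6, 7.
tree = [0, 0, 1, 1, 2, 2, 3, 3]
print(level_counts(tree, 1))
print(virtual_degree(tree, 1, [1, Fraction(1, 2)]))   # 1 + 1*2 + (1/2)*4
```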

34 Virtual Degree Easy observation: If we set $\beta_i = (1-p)^i$ then the expected increase in deg(u) is proportional to v(u). Expected increase in deg(u) $= p/t + (1-p)\,p\,l_1(u)/t + (1-p)^2\,p\,l_2(u)/t + \dots = (p/t)\,v(u)$.
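The identity can be checked exactly on a small tree by following every possible walk. This sketch assumes the directed coin-flipping model with out-degree 1 and a non-root node u (the root, with its self-loop, absorbs walks differently). The tree, the choice of u and the function name are illustrative.

```python
from fractions import Fraction

def attach_probability(parent, u, p):
    """Exact chance that the next node links to the non-root node u."""
    t = len(parent)
    total = Fraction(0)
    for start in range(t):
        reach = Fraction(1)            # probability the walk arrives at `node`
        node = start
        while True:
            if node == u:
                total += reach * p     # heads at u
                break
            if parent[node] == node:   # reached the root without meeting u
                break
            reach *= (1 - p)           # tails: follow the out-edge
            node = parent[node]
    return total / t

# Node 1 has children 2 and 3 (l_1 = 2) and one grandchild, node 4 (l_2 = 1).
tree = [0, 0, 1, 1, 2]
p = Fraction(1, 2)
exact = attach_probability(tree, 1, p)
closed_form = (p / len(tree)) * (1 + (1 - p) * 2 + (1 - p) ** 2 * 1)
print(exact, closed_form)
```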

35 Virtual Degree Theorem: There always exist $\beta_i$ such that 1. For $i \ge 1$, $|\beta_i| \le 1$. 2. As $i \to \infty$, $\beta_i \to 0$ exponentially. 3. The expected increase in v(u) is proportional to v(u). Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\,\beta_{i-1}$. E.g., for p=½, $\beta_i$ = 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16, …; for p=¾, $\beta_i$ = 1, 3/4, 1/2, 5/16, 3/16, 7/64, …
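The recurrence is easy to iterate with exact rational arithmetic, which reproduces both example sequences. The function name `beta_sequence` is an assumption for this sketch.

```python
from fractions import Fraction

def beta_sequence(p, length):
    """First `length` terms of beta_1 = 1, beta_2 = p, beta_{i+1} = beta_i - (1-p)*beta_{i-1}."""
    p = Fraction(p)
    betas = [Fraction(1), p]
    while len(betas) < length:
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:length]

half = beta_sequence(Fraction(1, 2), 8)
three_quarters = beta_sequence(Fraction(3, 4), 6)
print([str(b) for b in half])
print([str(b) for b in three_quarters])
```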

36 Virtual Degree, continued Let $v_t(u)$ be the virtual degree of node u at time t and $t_u$ be the time when node u first appears. Theorem: For any node u and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$. So, the expected virtual degrees follow a power law.

37 Actual Degree We can also obtain lower bounds on the expected values of the actual degrees. Theorem: For any node u and time $t \ge t_u$, $E[\text{degree}(u)] \ge \Omega((t/t_u)^{p(1-p)})$.

38 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

39 Experiments Random graphs of n=100,000 nodes. Compute statistics averaged over 100 runs. k=1 (every node has out-degree 1).

40 Online Erdős-Rényi

41 Directed 1-Step Random Surfer, p=3/4

42 Directed 1-Step Random Surfer, p=1/2

43 Directed 1-Step Random Surfer, p=1/4

44 Directed Coin Flipping, p=1/2

45 Directed Coin Flipping, p=1/4

46 Undirected coin flipping, p=1/2

47 Undirected Coin Flipping p=0.05

48 Outline Background/Previous Work Motivation Models Theoretical results Experimental results Conclusions

49 Conclusions Directed random walk models appear to generate power laws (supported by partial theoretical results). Power laws can naturally emerge, even if all nodes have the same intrinsic “attractiveness”.

50 Open questions Can we prove that the degrees in the directed coin-flipping model do indeed follow a power law? Can we analyze the degree distribution for the undirected coin-flipping model with p=1/2? Suppose page i has “interestingness” $p_i$. Can we analyze the degree as a function of t, i and $p_i$?

51 Questions?
