1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira
2 The Web as a Graph Consider the World Wide Web as a graph, with web pages as nodes and hyperlinks between pages as edges. [Figure: example pages links.html, resume.html, index.html and http://cnn.com drawn as nodes joined by hyperlink edges.]
3 Studying the Web Since the Web emerged there has been a lot of interest in: 1. Empirically studying properties of the Web Graph. 2. Modeling the Web Graph mathematically. Benefits of Generative Models: 1. Simulation: when real data is scarce. 2. Extrapolation: how will the graph change? 3. Understanding: inspire further research on real data.
4 Power Law The distribution of a random variable X follows a power law if $\Pr[X = k] \sim C k^{-\alpha}$, where $f(x) \sim g(x)$ means $\lim_{x \to \infty} f(x)/g(x) = 1$, e.g. $(x+1) \sim (x+2)$. Example: $\Pr[X = k] \propto k^{-2}$.
8 Power Law contd. More general definition: $\Pr[X \ge k] \sim C k^{-\alpha}$. Particularly useful if X takes on real values. Sometimes referred to as “heavy tailed” or “scale free.”
9 Power Laws in Degree distribution Let G be a graph. Let $X_k$ be the proportion of nodes with degree k in G. Then if $X_k \sim C k^{-\alpha}$ we say that G has a power-law degree distribution.
10 Properties of the Web Graph A Power-law degree distribution has been observed in a wide variety of graphs including citation networks, social networks, protein-protein interaction networks and so on. It has also been observed in the Web Graph. [Barabási & Albert]
23 Motivating Questions Why would a new node connect to nodes of high degree? -Are high degree nodes more attractive? -Or are there other explanations? How does a new node find out what the high degree nodes are?
24 Motivating Questions Motivating Observation: Suppose each page has a small probability p of being interesting. Suppose a user does an (undirected) random walk until they find an interesting page. If p is small then this is the same as preferential attachment. What about other processes and directed graphs?
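26 Directed 1-step Random Surfer, p=.5 Start with a single node with a self-loop. Each new node then: 1. Chooses a node uniformly at random. 2. With probability p, connects to it. 3. With probability (1-p), connects to its neighbor. [Figure: the graph at T=1, T=2 and T=3. At T=3 the new node attaches to the first node with probability (½)(½) + (½)(½) + (½)(½) = ¾ and to the second node with probability ¼.]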
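The ¾ and ¼ values for the T=3 step can be checked by direct enumeration. The following is a minimal sketch, assuming every node has exactly one out-link (so "its neighbor" is unambiguous); the function name `one_step_attach_probs` and the `parent` list representation are illustrative choices, not part of the slides.

```python
from fractions import Fraction

def one_step_attach_probs(parent, p):
    """Exact attachment probabilities for the directed 1-step model.

    parent[v] is the single out-neighbor of node v (assumption: out-degree 1;
    node 0 points to itself, i.e. the self-loop).
    Returns a list whose entry v is the probability the new node links to v.
    """
    n = len(parent)
    probs = [Fraction(0)] * n
    for v in range(n):
        # v is chosen with probability 1/n ...
        probs[v] += Fraction(1, n) * p                  # ... and we link to v itself
        probs[parent[v]] += Fraction(1, n) * (1 - p)    # ... or to v's neighbor
    return probs

# Graph just before the third node arrives: node 0 has a self-loop, node 1 -> node 0.
probs = one_step_attach_probs([0, 0], Fraction(1, 2))
print(probs)
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared with the slide's fractions without rounding.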
27 Directed 1-step Random Surfer It turns out this model is a mixture of connecting to nodes uniformly at random and preferential attachment. It has a power-law degree distribution. But taking one step is not very natural. What about doing a real random walk?
28 Directed Coin Flipping model 1. Pick a node uniformly at random. 2. Flip a coin of bias p. If HEADS, connect to the current node; else walk to a neighbor. [Figure: a new node picks random starting node A among nodes A, B, C, D. 1. Coin toss: TAIL (at node A). 2. Coin toss: TAIL (at node B). 3. Coin toss: HEAD (at node C), so the new node connects to C.]
29 Directed Coin Flipping model 1. At time 1, we start with a single node with a self-loop. 2. At time t, a new node arrives and we choose an existing node u uniformly at random as the current node. 3. We then flip a coin of bias p. 4. If the coin comes up heads, we connect the new node to the current node. 5. Else we walk to a random neighbor and go to step 3. “each page has equal probability p of being interesting to us”
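The five steps above can be sketched as a short simulation. This is an illustrative sketch, not the authors' code: it assumes each new node creates exactly one out-link (so "a random neighbor" is the unique out-neighbor), and it counts the initial self-loop as one in-link of node 0. The names `coin_flip_graph`, `parent` and `indeg` are hypothetical.

```python
import random

def coin_flip_graph(n, p, seed=None):
    """Grow an n-node graph with the directed coin-flipping process.

    Returns (parent, indeg): parent[v] is the node that v links to,
    indeg[v] is the number of links pointing at v.
    Assumptions: one out-link per node; the self-loop counts toward indeg[0].
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1] so the walk terminates")
    rng = random.Random(seed)
    parent = [0]   # step 1: a single node with a self-loop
    indeg = [1]
    for new in range(1, n):
        cur = rng.randrange(new)      # step 2: uniform random existing node
        while rng.random() >= p:      # steps 3 and 5: tails, walk along the out-link
            cur = parent[cur]
        parent.append(cur)            # step 4: heads, connect to the current node
        indeg[cur] += 1
        indeg.append(0)
    return parent, indeg

parent, indeg = coin_flip_graph(10_000, 0.5, seed=0)
print(max(indeg), sorted(indeg)[len(indeg) // 2])  # largest vs. median in-degree
```

Comparing the largest in-degree with the median gives a quick, informal look at how skewed the resulting degree distribution is; it does not by itself establish a power law.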
31 Is Directed Coin-Flipping Power-lawed? We don’t know … but we do have some partial results...
32 Virtual Degree Definitions: Let $l_i(u)$ be the number of level-i descendants of node u: $l_1(u)$ = # of children, $l_2(u)$ = # of grandchildren, etc. Let $\beta = (\beta_1, \beta_2, \dots)$ be a sequence of real numbers with $\beta_1 = 1$. Then $v_\beta(u) = 1 + \beta_1 l_1(u) + \beta_2 l_2(u) + \beta_3 l_3(u) + \dots$ We’ll call $v_\beta(u)$ the “virtual degree of u with respect to $\beta$.”
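33 Virtual Degree [Figure: a node u with 2 children and 4 grandchildren.] $v(u) = 1 + \beta_1(2) + \beta_2(4) + \beta_3(0) + \beta_4(0) + \dots$, where 2 is the # of children and 4 is the # of grandchildren.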
34 Virtual Degree Easy observation: If we set $\beta_i = (1-p)^i$ then the expected increase in deg(u) is proportional to v(u). Expected increase in deg(u) $= p/t + (1-p)\,p\,l_1(u)/t + (1-p)^2\,p\,l_2(u)/t + \dots = (p/t)\,v(u)$
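The identity can be checked exactly on a small example. The sketch below assumes one out-link per node, takes t to be the number of existing nodes, and only treats a non-root node u (the self-loop at the root changes the calculation there). The tree mirrors a node with 2 children and 4 grandchildren; all function names are hypothetical.

```python
from fractions import Fraction

def distance_to_ancestor(parent, v, u):
    """Number of out-link steps from v to u, or None if the walk never reaches u."""
    steps = 0
    while v != u:
        if parent[v] == v:   # hit the self-loop root without meeting u
            return None
        v = parent[v]
        steps += 1
    return steps

def attach_prob(parent, u, p):
    """Exact probability that the next node links to non-root node u."""
    t = len(parent)
    total = Fraction(0)
    for v in range(t):       # v is the uniformly chosen starting node
        d = distance_to_ancestor(parent, v, u)
        if d is not None:    # d tails on the way to u, then one head at u
            total += Fraction(1, t) * (1 - p) ** d * p
    return total

def virtual_degree(parent, u, p):
    """v(u) = 1 + sum_i beta_i * l_i(u) with beta_i = (1 - p)**i."""
    total = Fraction(0)
    for v in range(len(parent)):
        d = distance_to_ancestor(parent, v, u)
        if d is not None:    # d == 0 is u itself and contributes the leading 1
            total += (1 - p) ** d
    return total

# Node 0 is the root; u = 1 has children 2, 3 and grandchildren 4, 5, 6, 7.
parent = [0, 0, 1, 1, 2, 2, 3, 3]
p = Fraction(1, 2)
print(virtual_degree(parent, 1, p), attach_prob(parent, 1, p))
```

For this tree the virtual degree is 1 + ½·2 + ¼·4 = 3, and the attachment probability equals (p/t)·v(u) with t = 8.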
35 Virtual Degree Theorem: There always exist $\beta_i$ such that 1. For i ≥ 1, $|\beta_i| \le 1$. 2. As i → ∞, $\beta_i \to 0$ exponentially. 3. The expected increase in v(u) is proportional to v(u). Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\,\beta_{i-1}$. E.g., for p=½, $\beta_i$ = 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16, …; for p=¾, $\beta_i$ = 1, 3/4, 1/2, 5/16, 3/16, 7/64, ...
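The listed values follow directly from the recurrence, which a few lines of exact arithmetic can reproduce. This is a small sketch using the slide's indexing (first term 1, second term p); the helper name `beta_sequence` is illustrative.

```python
from fractions import Fraction

def beta_sequence(p, n):
    """First n terms of the recurrence: 1, p, then b[i+1] = b[i] - (1 - p) * b[i-1]."""
    betas = [Fraction(1), Fraction(p)]
    while len(betas) < n:
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:n]

for p in (Fraction(1, 2), Fraction(3, 4)):
    print(p, [str(b) for b in beta_sequence(p, 8)])
```

Exact fractions make it easy to see the sign changes for p=½, which is why the theorem bounds $|\beta_i|$ rather than $\beta_i$ itself.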
36 Virtual Degree, continued Let $v_t(u)$ be the virtual degree of node u at time t and $t_u$ be the time when node u first appears. Theorem: For any node u and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$. So, the expected virtual degrees follow a power law.
37 Actual Degree We can also obtain lower bounds on the expected values of the actual degrees. Theorem: For any node u and time $t \ge t_u$, $E[\mathrm{degree}(u)] \ge \Omega((t/t_u)^{p(1-p)})$
49 Conclusions Directed random walk models appear to generate power laws, and we have partial theoretical results in that direction. Power laws can naturally emerge, even if all nodes have the same intrinsic “attractiveness”.
50 Open questions Can we prove that the degrees in the directed coin-flipping model do indeed follow a power law? Can we analyze the degree distribution for the undirected coin-flipping model with p=1/2? Suppose page i has “interestingness” $p_i$. Can we analyze the degree as a function of t, i and $p_i$?