Experiments with MATLAB: Google PageRank
Roger Jang (張智星), CSIE Dept, National Taiwan University, Taiwan


1 Experiments with MATLAB: Google PageRank
Roger Jang (張智星), CSIE Dept, National Taiwan University, Taiwan
jang@mirlab.org
http://mirlab.org/jang

2 PageRank Algorithm
Facts about the PageRank algorithm
– Developed by Google's founders, Larry Page and Sergey Brin, when they were graduate students at Stanford University
– Determined entirely by the link structure of the WWW
– Recomputed about once a month
– The world's largest matrix computation
Ideas
– A random walk problem known as a Markov chain (Markov process)
– Page rank: the limiting probability that a random surfer visits a page
– A page has a high rank if other pages with high ranks link to it.

3 Connectivity Matrix G
Notations
– U: the set of all n web pages in the world (n > 4 billion by June 2004)
– G: the connectivity matrix, with $g_{ij} = 1$ if there is a hyperlink to page i from page j and $g_{ij} = 0$ otherwise.
Facts
– G is huge, but very sparse
– The number of nonzeros in G is the total number of hyperlinks in U.
[Figure: example web graph with six pages, nodes 1 to 6]
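As a small illustration of the notation above, the following Python sketch builds G for a hypothetical six-page web. The link list is an assumption made for illustration and is not necessarily the graph drawn on the slide. The dense list-of-lists is only for readability; for a web-scale G only the nonzero entries would be stored.

```python
# Hypothetical six-page web: each pair (j, i) is a hyperlink from page j to page i.
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
n = 6

# G[i][j] = 1 if page j+1 links to page i+1 (pages are 1-based, indices 0-based).
G = [[0] * n for _ in range(n)]
for j, i in links:
    G[i - 1][j - 1] = 1

# The number of nonzeros in G equals the number of hyperlinks.
nonzeros = sum(sum(row) for row in G)
print(nonzeros)
```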

4 Degrees of a Page
Degrees of a page
– Define the row and column sums of G: $r_i = \sum_j g_{ij}$ and $c_j = \sum_i g_{ij}$
– $r_i$: in-degree of page i
– $c_j$: out-degree of page j
[Figure: example web graph with six pages, nodes 1 to 6]
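The row and column sums can be checked on a small case. This is a minimal Python sketch on a hypothetical six-page web; the link list is an illustrative assumption, not the slide's figure.

```python
# Hypothetical six-page web: each pair (j, i) is a hyperlink from page j to page i.
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
n = 6
G = [[0] * n for _ in range(n)]
for j, i in links:
    G[i - 1][j - 1] = 1

# In-degree of page i: sum of row i. Out-degree of page j: sum of column j.
r = [sum(G[i][j] for j in range(n)) for i in range(n)]
c = [sum(G[i][j] for i in range(n)) for j in range(n)]
print("in-degrees: ", r)
print("out-degrees:", c)
```

In this made-up graph page 5 has out-degree 0, which is the case that needs special handling when column j is later divided by $c_j$.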

5 From Connectivity Matrix to Transition Probability Matrix
Connectivity matrix
– G: $g_{ij} = 1$ if there is a hyperlink from page j to page i
Transition probability matrix
– A: $a_{ij}$ is the probability of jumping from page j to page i
– Column j of A holds the probabilities of jumping from page j to every page.
[Figure: example web graph with six pages, nodes 1 to 6]

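6 Two Types of Transitions
Type 1: Follow one of the links on the current page (with probability p)
Type 2: Jump to a random page (with probability 1-p)
Overall: $a_{ij} = p\,g_{ij}/c_j + (1-p)/n$ for a page j with at least one outgoing link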
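A minimal Python sketch of combining the two transition types into a matrix A, using $a_{ij} = p\,g_{ij}/c_j + (1-p)/n$. Two things here are assumptions rather than content from the slides: the six-page link list, and the treatment of a page with no outgoing links (it is taken to jump to every page with probability 1/n, a common convention).

```python
def transition_matrix(G, p=0.85):
    """Build A from a 0/1 matrix G, where G[i][j] = 1 means page j links to page i."""
    n = len(G)
    c = [sum(G[i][j] for i in range(n)) for j in range(n)]  # out-degrees (column sums)
    delta = (1 - p) / n  # probability of reaching a given page by a random jump
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if c[j] > 0:
                A[i][j] = p * G[i][j] / c[j] + delta
            else:
                # Assumption (not stated on the slide): a page without
                # outgoing links jumps to any page with equal probability.
                A[i][j] = 1.0 / n
    return A

# Hypothetical six-page web: each pair (j, i) is a hyperlink from page j to page i.
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
n = 6
G = [[0] * n for _ in range(n)]
for j, i in links:
    G[i - 1][j - 1] = 1

A = transition_matrix(G, p=0.85)
column_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
print(column_sums)  # each column should sum to 1 up to rounding
```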

7 Transition Probability Matrix
Facts
– A is the transition probability matrix of the Markov chain. Its elements are all strictly between 0 and 1, and its column sums are all equal to 1.
– If z is the initial probability of being on each page, $Az$ is the probability after 1 transition, $A^2 z$ is the probability after 2 transitions, and so on.
– $A^k z$ converges to the page rank as k grows.
– $A^{k+1} z \approx A^k z$ when k is large, which gives $Ax = x$ with $x = A^k z$.
– Perron-Frobenius theorem: a nonzero solution of $x = Ax$ exists and is unique to within a scaling factor.
– If the scaling factor is chosen so that the elements of x sum to 1, then x is Google's PageRank.
– Most of the elements of A are equal to $(1-p)/n$. If $n = 4 \times 10^9$ and $p = 0.85$, then $(1-p)/n = 3.75 \times 10^{-11}$.
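Two of the facts above can be checked numerically. The sketch below uses a hypothetical three-page matrix A (built by hand from links 1→2, 1→3, 2→3, 3→1 with p = 0.85) to show that z, Az, and A²z all remain probability vectors, and then evaluates (1-p)/n for the quoted n and p. The matrix and the helper name `matvec` are illustrative assumptions.

```python
def matvec(A, x):
    """Return the matrix-vector product A x for a list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Hypothetical 3-page transition matrix; every column sums to 1.
A = [[0.05,  0.05, 0.90],
     [0.475, 0.05, 0.05],
     [0.475, 0.90, 0.05]]

z = [1 / 3, 1 / 3, 1 / 3]   # initial probability on each page
z1 = matvec(A, z)           # after 1 transition
z2 = matvec(A, z1)          # after 2 transitions
print(sum(z1), sum(z2))     # both stay at 1 up to rounding

# The typical element of A for the web-scale numbers quoted on the slide.
delta = (1 - 0.85) / 4e9
print(delta)
```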

8 How to Compute PageRank
Eigenvector method
– x = A*x, so x is the eigenvector corresponding to eigenvalue 1
– Fact 1: A always has an eigenvalue of 1
Power method
– Repeat x = A*x until x converges
– The only feasible approach for a large n
– Fact 2: 1 is the eigenvalue of A with maximum magnitude, so $A^n z$ is not affected by the choice of z as n increases
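The power method can be sketched in a few lines of Python. This is an illustrative version under stated assumptions, not the deck's MATLAB code: the function name `power_method`, the stopping tolerance, the iteration cap, and the three-page matrix are all made up for the example.

```python
def power_method(A, tol=1e-12, max_iter=1000):
    """Repeat x = A*x from a uniform start until the 1-norm change falls below tol."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        x_new = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            break
    return x

# Hypothetical 3-page transition matrix; every column sums to 1.
A = [[0.05,  0.05, 0.90],
     [0.475, 0.05, 0.05],
     [0.475, 0.90, 0.05]]

x = power_method(A)
print(x)
```

For a web-scale problem A would not be formed explicitly, since almost all of its entries equal (1-p)/n; the iteration would work from the sparse G instead.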

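9 Fact 1: Proof
A always has an eigenvalue of 1
– Since every column of A sums to 1, $A^T$ has 1 as an eigenvalue: $A^T \mathbf{1} = \mathbf{1}$, where $\mathbf{1}$ is the all-ones vector.
– So 1 is also an eigenvalue of A, since $\det(A - \lambda I) = \det(A^T - \lambda I)$, that is, A and $A^T$ have the same eigenvalues.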

10 Fact 1: Another Proof
Given a square matrix A with each column sum equal to K, prove that K is an eigenvalue of A.
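One standard way to prove this statement is sketched below. It is offered as a possible argument, not necessarily the one shown on the slide; $\mathbf{1}$ denotes the all-ones column vector.

```latex
\begin{aligned}
&\text{Each column of } A \text{ sums to } K, \text{ so } \mathbf{1}^T A = K\,\mathbf{1}^T . \\
&\text{Hence } \mathbf{1}^T (A - K I) = K\,\mathbf{1}^T - K\,\mathbf{1}^T = \mathbf{0}^T . \\
&\text{The rows of } A - K I \text{ add up to the zero row, so they are linearly dependent and } \det(A - K I) = 0 . \\
&\text{Therefore } K \text{ is a root of the characteristic polynomial, i.e. an eigenvalue of } A .
\end{aligned}
```

Setting K = 1 recovers Fact 1 for the transition matrix.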

11 Eigenvalue Decomposition

12 Fact 2
– A has 1 as its eigenvalue of maximum magnitude.
– $A^n z$ approaches the page rank as long as n is large enough and the elements of z sum to 1.
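A sketch of why Fact 2 implies convergence, via an eigenvalue decomposition. It assumes, beyond what the slides state, that A is diagonalizable and that every eigenvalue other than 1 has magnitude strictly less than 1. The symbol m is used here for the matrix size so that it does not clash with the exponent n.

```latex
\begin{aligned}
A &= V \Lambda V^{-1}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_m), \quad \lambda_1 = 1, \quad |\lambda_i| < 1 \ (i \ge 2), \\
z &= \sum_{i=1}^{m} \alpha_i v_i \qquad (v_i \text{ are the columns of } V), \\
A^n z &= \sum_{i=1}^{m} \alpha_i \lambda_i^{\,n} v_i \;\longrightarrow\; \alpha_1 v_1 \quad \text{as } n \to \infty, \\
\mathbf{1}^T A^n z &= \mathbf{1}^T z = 1 \ \text{ for every } n, \ \text{ so the limit } \alpha_1 v_1 \text{ also sums to } 1 .
\end{aligned}
```

The limit is an eigenvector for eigenvalue 1 whose elements sum to 1, which is the page rank regardless of the starting vector z.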

13 Example
– A tiny web
– Transition matrix A
– When p = 0.85, we have the page rank (via pagerank.m):
[Figure: a tiny web of six pages, nodes 1 to 6]
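An end-to-end Python sketch of this kind of computation for a tiny web with p = 0.85. It is not the `pagerank.m` referred to on the slide. The six-page link list is a hypothetical example that may differ from the slide's figure, and the function name `pagerank`, the tolerance, and the uniform-jump treatment of pages without outgoing links are assumptions. The iteration works directly from the link list, so the dense matrix A is never formed.

```python
def pagerank(links, n, p=0.85, tol=1e-12, max_iter=1000):
    """Power iteration on a link list; (j, i) means page j links to page i (1-based)."""
    out_deg = [0] * n
    for j, _ in links:
        out_deg[j - 1] += 1
    delta = (1 - p) / n
    x = [1.0 / n] * n
    for _ in range(max_iter):
        # Probability mass on pages with no outgoing links (assumed to jump uniformly).
        dangling = sum(x[j] for j in range(n) if out_deg[j] == 0)
        # Every page receives the same share from random jumps and dangling pages.
        base = delta * (1 - dangling) + dangling / n
        x_new = [base] * n
        # Each link j -> i passes on p * x_j / c_j.
        for j, i in links:
            x_new[i - 1] += p * x[j - 1] / out_deg[j - 1]
        change = sum(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change < tol:
            break
    return x

# Hypothetical six-page web, assumed to contain no duplicate links.
links = [(1, 2), (1, 6), (2, 3), (2, 4), (3, 4), (3, 5), (3, 6), (4, 1), (6, 1)]
x = pagerank(links, n=6, p=0.85)
for page, score in enumerate(x, start=1):
    print(page, round(score, 4))
```

In this made-up graph page 1 receives the highest rank and page 5, which has a single incoming link and no outgoing links, the lowest.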

14 Application Scenario
Team ranking in a sport
– Eigenvalue decomposition for soccer games
Mutual recommendation systems (互推系統)

