PageRank and Markov Chains


1 PageRank and Markov Chains
Tolga Çekiç

2 Introduction, PageRank Overview, Markov Chains, PageRank Continuation, Conclusion

3 Introduction PageRank is named after Larry Page, one of the co-founders of Google. It is one of the algorithms used by the Google search engine. It ranks web pages according to importance. It is based on previous work on citation counting. It uses Markov chain structures.

4 Citation Count Academic papers receive and give citations.
Every citation made to a paper counts as a vote. Papers with a high number of votes are considered important. There are some problems with this basic vote-counting scheme. PageRank tries to address them by treating web pages as papers and links as citations.

5 Problems If rank is determined by the total number of links directed to a web page, links from more important web sites count no more than links from unimportant ones. Another problem arises if a web page has too many outlinks: that page then has a disproportionately high influence in determining rank.

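6 PageRank The importance scores carried by all links pointing to a web page are summed. The importance score of a page is divided evenly amongst all its outgoing links. Uses a Markov chain. PageRank calculation formula: r(P) = sum, over all pages Q that link to P, of r(Q) / |Q|, where |Q| is the number of outgoing links of Q.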

7 Simple PageRank Calculation
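
The rule that a page splits its score evenly amongst its outgoing links, and that a page's score is the sum of the shares it receives, can be sketched in Python. The four-page link graph, the page names, and the helper `simple_pagerank` are hypothetical and chosen only for illustration; the sketch assumes every page has at least one outgoing link and no damping is applied.

```python
def simple_pagerank(links, iterations=50):
    """Repeatedly apply: score(P) = sum of score(Q) / outdegree(Q) over pages Q linking to P."""
    pages = sorted(links)
    # Start from a uniform distribution over all pages.
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: 0.0 for p in pages}
        for q, outlinks in links.items():
            share = rank[q] / len(outlinks)  # q divides its score evenly
            for p in outlinks:
                new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> list of pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = simple_pagerank(links)
for page, score in sorted(ranks.items()):
    print(page, round(score, 3))
```

In this toy graph page D receives no links, so its score drops to zero after the first pass, which is one of the weaknesses the damping factor later addresses.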

8 Markov Chains Named after Andrey Markov
A mathematical system that transitions between states in a state space. States have the Markov property, or ‘memorylessness’: the transition from one state to another depends only on the current state. Used as statistical models in real-world applications.

9 Markov Chain Examples Drunkard’s walk, a random walking process.
Board games with dice. A simple weather model.

10 Probability Vector At each time, there are n states the system could be in. At time k the system is modeled as a vector x_k. A probability vector is a vector in R^n whose entries are nonnegative and sum to 1.

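11 Markov Chains A Markov matrix (or stochastic matrix) is a square matrix M whose rows or columns are probability vectors. A Markov chain is a sequence of probability vectors x_0, x_1, x_2, ... such that x_{k+1} = M x_k for some Markov matrix M (written here for the convention in which the columns of M are probability vectors).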
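
A minimal sketch of these definitions, assuming the column convention (each column of M is a probability vector, and one step of the chain is the product M x). The helper names and the 3-state matrix are hypothetical.

```python
def is_probability_vector(v, tol=1e-9):
    """True if all entries are nonnegative and they sum to 1 (within tol)."""
    return all(x >= 0 for x in v) and abs(sum(v) - 1.0) <= tol


def is_markov_matrix(M, tol=1e-9):
    """True if M is square and every column is a probability vector."""
    n = len(M)
    if any(len(row) != n for row in M):
        return False
    columns = [[M[i][j] for i in range(n)] for j in range(n)]
    return all(is_probability_vector(col, tol) for col in columns)


def step(M, x):
    """One step of the chain: the matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


# Hypothetical 3-state example; entry M[i][j] is the probability of moving j -> i.
M = [
    [0.5, 0.2, 0.3],
    [0.3, 0.8, 0.3],
    [0.2, 0.0, 0.4],
]
x0 = [1.0, 0.0, 0.0]  # start in state 0 with certainty
x1 = step(M, x0)
print(is_markov_matrix(M), x1, is_probability_vector(x1))
```

Multiplying a probability vector by a column-stochastic matrix always yields another probability vector, which is why the sequence stays well defined.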

12 Weather Model Example Initial State: Day 1: Day 2: Day n:
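
The numbers from the original slide are not preserved in this transcript, so the following sketch uses a hypothetical two-state model (sunny, rainy) with made-up transition probabilities. It only illustrates how the vectors for day 1, day 2, and day n follow from the initial state by repeated multiplication.

```python
def mat_vec(P, x):
    """Matrix-vector product P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]


def forecast(P, x0, days):
    """Return [x_0, x_1, ..., x_days], where x_{k+1} = P x_k."""
    x = list(x0)
    history = [x]
    for _ in range(days):
        x = mat_vec(P, x)
        history.append(x)
    return history


# Hypothetical transition matrix (columns sum to 1), states ordered [sunny, rainy]:
# column 0: after a sunny day -> 0.9 sunny, 0.1 rainy
# column 1: after a rainy day -> 0.5 sunny, 0.5 rainy
P = [
    [0.9, 0.5],
    [0.1, 0.5],
]
x0 = [1.0, 0.0]  # assumed initial state: today is sunny
history = forecast(P, x0, 10)
for day, x in enumerate(history):
    print("Day", day, [round(v, 4) for v in x])
```

With these assumed values the printed vectors settle quickly, which motivates the steady-state vector discussed next.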

13 Steady State Vector The steady-state vector q represents the long-run probabilities for all days, independent of the initial weather. Since it no longer depends on the starting state, it is unchanged by P: Pq = q. That makes q an eigenvector of P (with eigenvalue 1).

14 Weather Example Steady State Calculation
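
The slide's own calculation is not preserved here. As a sketch under assumptions, the steady state of a two-state chain can be found by solving Pq = q together with q[0] + q[1] = 1, which has a closed form. The matrix values and the helper `steady_state_2x2` are hypothetical, and the column convention is assumed.

```python
def steady_state_2x2(P):
    """Steady state of a 2x2 column-stochastic matrix with positive entries.

    With a = P[1][0] (probability of leaving state 0) and
    b = P[0][1] (probability of leaving state 1), solving
    P q = q with q[0] + q[1] = 1 gives q = [b, a] / (a + b).
    """
    a = P[1][0]
    b = P[0][1]
    return [b / (a + b), a / (a + b)]


# Hypothetical weather matrix, states ordered [sunny, rainy].
P = [
    [0.9, 0.5],
    [0.1, 0.5],
]
q = steady_state_2x2(P)

# Check the eigenvector property: applying P must leave q unchanged.
Pq = [sum(P[i][j] * q[j] for j in range(2)) for i in range(2)]
print("q  =", [round(v, 4) for v in q])
print("Pq =", [round(v, 4) for v in Pq])
```

For larger matrices the same condition is usually solved numerically, for example by repeated multiplication or an eigenvalue routine.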

15 Existence of Steady State Vector
Given a Markov matrix M, does there exist a steady-state vector? If M is a Markov matrix with all positive entries, then M has a unique steady-state vector (Perron-Frobenius Theorem)

16 PageRank cont. PageRank creates a square matrix A whose rows and columns correspond to web pages. A is a Markov matrix.
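
A minimal sketch of how such a matrix could be built from a link graph, following the earlier rule that a page divides its score evenly amongst its outgoing links. The graph, the page ordering, and the helper `link_matrix` are hypothetical; the sketch assumes every page has at least one outgoing link and no duplicate links, since otherwise the columns would not sum to 1.

```python
def link_matrix(links, pages):
    """Build A with A[i][j] = 1 / outdegree(j) if page j links to page i, else 0."""
    n = len(pages)
    index = {p: k for k, p in enumerate(pages)}
    A = [[0.0] * n for _ in range(n)]
    for source, outlinks in links.items():
        for target in outlinks:
            A[index[target]][index[source]] = 1.0 / len(outlinks)
    return A


# Hypothetical link graph: page -> list of pages it links to.
pages = ["A", "B", "C", "D"]
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
A = link_matrix(links, pages)
for row in A:
    print(row)

# Each column should be a probability vector.
column_sums = [sum(A[i][j] for i in range(len(pages))) for j in range(len(pages))]
print(column_sums)
```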

17 Problems Random surfer model: a real surfer might randomly go to another URL, different from the ones linked on the current page. This model does not ensure a unique steady-state vector.

18 PageRank To satisfy the PF theorem and realize the random surfer model, a damping factor is introduced (generally taken as 0.85). Or simply: B = 0.85 A + 0.15 (matrix with every entry 1/n). (B is a Markov matrix.)
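
Reading the damped matrix as B = d A + (1 - d) (matrix with every entry 1/n), with d = 0.85, a short sketch can confirm that B has all positive entries and that its columns still sum to 1. The 3-page matrix A and the helper `damped_matrix` are hypothetical.

```python
def damped_matrix(A, d=0.85):
    """Return B = d * A + (1 - d) * J / n, where J is the all-ones matrix."""
    n = len(A)
    return [[d * A[i][j] + (1.0 - d) / n for j in range(n)] for i in range(n)]


# Hypothetical column-stochastic link matrix for 3 pages.
A = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
B = damped_matrix(A)
for row in B:
    print([round(v, 4) for v in row])

all_positive = all(v > 0 for row in B for v in row)
column_sums = [sum(B[i][j] for i in range(3)) for j in range(3)]
print(all_positive, [round(s, 12) for s in column_sums])
```

Because every entry of B is strictly positive, the condition of the theorem on the existence slide is met even though A itself contains zeros.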

19 PageRank Computation
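
One common way to carry out the computation is power iteration: start from a uniform probability vector and multiply by the damped matrix until the vector stops changing. The sketch below is an assumption about how this could be done, not necessarily the method shown on the original slide; the 3-page matrix and the helper `pagerank` are hypothetical. It avoids forming B explicitly by using B x = d A x + (1 - d)/n, which holds when the entries of x sum to 1.

```python
def pagerank(A, d=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on B = d*A + (1-d)/n * J for a column-stochastic A."""
    n = len(A)
    x = [1.0 / n] * n  # uniform starting vector
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Since sum(x) == 1, the all-ones term contributes (1 - d) / n to each entry.
        new_x = [d * v + (1.0 - d) / n for v in Ax]
        if sum(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical column-stochastic link matrix for pages [A, B, C].
A = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
ranks = pagerank(A)
print([round(r, 4) for r in ranks])
```

The returned vector approximates the steady-state vector of the damped matrix, so its entries can be read directly as page scores.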

20 Conclusion Larry Page: “PageRank can be thought of as a model of user behavior. We assume there is a random surfer who is given a web page at random and keeps clicking on links, never hitting back but eventually gets bored and starts on another random page.” The PageRank of a site is the probability that a user will end up at that site, or equivalently the fraction of time spent on that site in the long run.

