DTMC Applications Ranking Web Pages & Slotted ALOHA


DTMC Applications: Ranking Web Pages & Slotted ALOHA
TELE4642: Week 11

Outline
Apply the theory of discrete-time Markov chains to:
- Google’s ranking of web-pages
  - What page is the user most likely searching for?
  - Formulate the web-graph as a Markov chain
  - Does a steady state exist?
  - Does a user randomly walk the web-graph?
  - Can search results be improved further?
- Slotted ALOHA medium access control protocol
  - Is the protocol stable for a large number of nodes?
  - How should the retransmission probability be chosen?
Network Performance

Ranking of Web-pages
- Problem: how should a search engine rank web-pages?
- Idea: rank pages by their number of in-links (citations)
  - Weakness: not all in-links are equal
- Google’s idea: a page has high rank if the sum of the ranks of the pages linking to it is high
  - Formulate moves between web-pages as a Markov chain
  - Solve to obtain the steady-state probability of each state
  - The state probability is proportional to the importance of the page
- Example with three web-pages: [figure: pages N, M and A]

Markov Model of the Web
[Figure: example web-graph with nodes 1 to 5]
- Issue 1: how to choose the transition probabilities?
  - Assumption: each link on a page is equally likely to be clicked, so $P_{ij} = 1/d_i$ if page i links to page j (with $d_i$ the number of out-links of page i) and $P_{ij} = 0$ otherwise
  - Can accommodate non-uniform probabilities if such information is available
- Issue 2: some rows of P are zero (dead ends)
  - Assumption: on reaching a dead end, restart at any state: $\bar{P} = P + r v^T$
  - r is an N×1 column vector whose i-th entry is 1 if node i is a dead end and 0 otherwise
  - v is an N×1 column vector whose entries add to 1
    - could all be 1/N (uniform)
    - could be different from uniform (i.e. personalized)
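The two assumptions above (equally likely links, restart at a dead end) can be sketched in a few lines of Python. This is only an illustration: the 5-page graph, the function names `link_matrix` and `fix_dead_ends`, and the 0-based page numbering are assumptions, not taken from the slides.

```python
def link_matrix(out_links, n):
    """Row i spreads probability equally over the out-links of page i."""
    P = [[0.0] * n for _ in range(n)]
    for i, targets in out_links.items():
        for j in targets:
            P[i][j] = 1.0 / len(targets)
    return P


def fix_dead_ends(P, v):
    """Replace every all-zero row (a dead end) by the restart vector v."""
    return [row[:] if any(row) else v[:] for row in P]


# Hypothetical 5-page graph (pages numbered from 0); page 4 has no out-links.
out_links = {0: [1, 2], 1: [2], 2: [0, 3, 4], 3: [4], 4: []}
n = 5
v = [1.0 / n] * n  # uniform restart vector; a personalized v would also work
P_bar = fix_dead_ends(link_matrix(out_links, n), v)

for row in P_bar:
    print([round(x, 3) for x in row])
```

After the fix every row sums to 1, so the matrix is a valid transition probability matrix.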

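Markov Model of the Web (contd.)
[Figure: example web-graph with nodes 1 to 5]
- Issue 3: even after the dead-end fix, the chain may still have no unique steady state (the web-graph may be reducible or periodic)
- Solution: inter-connect all nodes: $\bar{\bar{P}} = \alpha \bar{P} + (1-\alpha)\, u v^T$, where
  - $\bar{P}$ is the transition matrix after the dead-end fix
  - u is an N×1 column vector with all entries 1
  - α is a number between 0 and 1 (“tax” on “importance”)
- [Numerical example: the values of α and v and the resulting matrix are missing from the transcript]
- The very sparse initial matrix P now becomes the dense matrix $\bar{\bar{P}}$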
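To see how the "tax" turns a sparse matrix into a dense one, the sketch below mixes a small dead-end-free matrix with the restart vector v. The 3-page matrix, the function name `google_matrix` and the choice α = 0.85 with uniform v are assumptions made for illustration only.

```python
def google_matrix(P_bar, v, alpha):
    """Entry (i, j) is alpha * P_bar[i][j] + (1 - alpha) * v[j]."""
    n = len(P_bar)
    return [[alpha * P_bar[i][j] + (1 - alpha) * v[j] for j in range(n)]
            for i in range(n)]


# Hypothetical 3-page example; the last row is a dead end already replaced by v.
v = [1.0 / 3] * 3
P_bar = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    v[:],
]
G = google_matrix(P_bar, v, alpha=0.85)

for row in G:
    print([round(x, 3) for x in row])
```

Every entry of the result is strictly positive, which is what makes the chain irreducible and aperiodic. For a real web-graph this dense matrix is far too large to store, so it is used here only to show the structure.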

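Computing the page rank
- Issue 4: computing the steady-state row vector π (solving $\pi = \pi \bar{\bar{P}}$, with $\bar{\bar{P}}$ the fully inter-connected matrix) involves solving billion+ equations!
- Instead take powers of $\bar{\bar{P}}$
- Iterative procedure: $\pi^{(k+1)} = \pi^{(k)} \bar{\bar{P}} = \alpha\, \pi^{(k)} P + \left(\alpha\, \pi^{(k)} r + (1-\alpha)\right) v^T$
  - No matrix-matrix multiplication, work with only one vector
  - Multiplication with the sparse matrix P only; the dense matrix is never formed
- Convergence depends on the parameter α
- What should α be set to?
  - Small α allows faster convergence (why?)
  - Large α better preserves the true nature of the web-graph (why?)
  - Brin and Page [Google] claim that α = 0.85 works well; only 50 to 100 iterations are required for convergence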
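The iterative procedure can be sketched without ever building the dense matrix: each step pushes rank along the existing links, then adds back the mass from dead ends and from the "tax" according to the restart vector v. The function name `pagerank`, the stopping tolerance and the example graph are assumptions for illustration, not the production algorithm.

```python
def pagerank(out_links, n, alpha=0.85, v=None, tol=1e-10, max_iter=1000):
    """Power iteration on adjacency lists; returns (rank vector, iterations used)."""
    v = v or [1.0 / n] * n
    pi = v[:]
    for it in range(1, max_iter + 1):
        new = [0.0] * n
        dead_mass = 0.0
        for i in range(n):
            targets = out_links.get(i, [])
            if targets:
                share = pi[i] / len(targets)
                for j in targets:
                    new[j] += alpha * share  # sparse part: follow a link
            else:
                dead_mass += pi[i]  # rank sitting on dead ends
        # Dead-end restarts and the "tax" are both redistributed according to v.
        scale = alpha * dead_mass + (1 - alpha)
        new = [new[j] + scale * v[j] for j in range(n)]
        diff = sum(abs(a - b) for a, b in zip(new, pi))
        pi = new
        if diff < tol:
            return pi, it
    return pi, max_iter


# Hypothetical 5-page graph (pages numbered from 0); page 4 is a dead end.
graph = {0: [1, 2], 1: [2], 2: [0, 3, 4], 3: [4], 4: []}
ranks, iterations = pagerank(graph, 5)
print([round(r, 4) for r in ranks], iterations)
```

The work per iteration is proportional to the number of links rather than to N squared, which is why the sparse formulation matters at web scale.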

Discussion
[Figure: example web-graph with nodes 1 to 5]
- Basic idea: random walk on the web-graph
  - The more often you visit a node, the more “popular” the page
- Does your model of the walk path match real user behavior?
  - Instead of connecting every node to every other node (“tax”), create a dummy node to which all other nodes are connected and that connects to all nodes; this alters the true web-graph less.
  - At a dead end, the user often hits the “back” button; so bias the transition probability towards predecessor pages.
- How to increase the ranking of your web-page?
  - Create replicas of your page?
  - Create many “dummy” web-pages that point to your page?
  - Make your web-pages link to each other?
- Further reading:
  - “The PageRank Citation Ranking: Bringing Order to the Web”, 1999
  - “Random Walks with Back Buttons”, 2000
  - “Deeper Inside PageRank”, 2004

Slotted Aloha
- N nodes, time-slotted system, equal-size packets
- The probability of a new packet arriving in a slot at any given node is $p_a$, and the new packet is transmitted immediately
- A collision happens if more than one node transmits in the same slot; it is detected by all nodes at the end of the slot
- After a collision, each backlogged node retries in every slot with probability $p_r$ until its transmission succeeds
- No queueing: new arrivals at a backlogged node are dropped

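Slotted Aloha: Markov chain
- State: number of backlogged nodes m = 0,…,N
- Probability that i backlogged nodes transmit in a slot is $Q_r(i,m) = \binom{m}{i}\, p_r^{\,i}\, (1-p_r)^{m-i}$
- Probability that j non-backlogged nodes transmit in a slot is $Q_a(j,m) = \binom{N-m}{j}\, p_a^{\,j}\, (1-p_a)^{N-m-j}$
- Markov chain: [figure: state-transition diagram over the states 0,…,N]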
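The transition probabilities of this chain follow from the model on the previous slide: the backlog drops by one only if a single backlogged node transmits and nothing new arrives, and it grows by the number of new arrivals that collide. The sketch below builds one row of the transition matrix under those rules; the function names and the parameter values are assumptions for illustration.

```python
from math import comb


def q_r(i, m, p_r):
    """Probability that i of the m backlogged nodes transmit in a slot."""
    if i > m:
        return 0.0
    return comb(m, i) * p_r**i * (1 - p_r) ** (m - i)


def q_a(j, m, n, p_a):
    """Probability that j of the n - m non-backlogged nodes transmit in a slot."""
    if j > n - m:
        return 0.0
    return comb(n - m, j) * p_a**j * (1 - p_a) ** (n - m - j)


def transition_row(m, n, p_a, p_r):
    """Row m of the (n + 1) x (n + 1) transition matrix over backlog states."""
    row = [0.0] * (n + 1)
    a0, a1 = q_a(0, m, n, p_a), q_a(1, m, n, p_a)
    r0, r1 = q_r(0, m, p_r), q_r(1, m, p_r)
    if m >= 1:
        row[m - 1] = a0 * r1  # one retry succeeds, no new arrival
    row[m] = a1 * r0 + a0 * (1 - r1)  # new packet succeeds, or no success and no arrival
    if m + 1 <= n:
        row[m + 1] = a1 * (1 - r0)  # one new arrival collides with at least one retry
    for j in range(2, n - m + 1):
        row[m + j] = q_a(j, m, n, p_a)  # j >= 2 new arrivals always collide
    return row


# Hypothetical parameters: 6 nodes, p_a = 0.1, p_r = 0.3.
for m in range(7):
    print(m, [round(x, 4) for x in transition_row(m, 6, 0.1, 0.3)])
```

Note that state 1 cannot be reached from state 0: a single new arrival with no backlog always succeeds.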

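Slotted Aloha: Efficiency
- Probability of successful transmission in state m (exactly one node transmits): $P_{succ}(m) = (N-m)\, p_a (1-p_a)^{N-m-1} (1-p_r)^{m} + (1-p_a)^{N-m}\, m\, p_r (1-p_r)^{m-1}$
- For small $p_a$ and $p_r$, and using $(1-x)^y \approx e^{-xy}$ for small x: $P_{succ}(m) \approx G(m)\, e^{-G(m)}$
- Here $G(m) = (N-m)\, p_a + m\, p_r$ is the transmission attempt rate in state m, so the throughput (successful transmissions per slot) is $G(m)\, e^{-G(m)}$
- Throughput is maximized at G(m) = 1
- Max. throughput = 1/e ≈ 0.368 (about 37%)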
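A quick numerical check of the approximation, as a sketch: the exact success probability (exactly one node transmits) is compared with G(m) e^{-G(m)} for a set of small probabilities. The function names and the parameter values N = 100, m = 20, p_a = 0.005, p_r = 0.03 are assumptions chosen so that G(m) = 1.

```python
from math import exp


def p_success_exact(m, n, p_a, p_r):
    """Exactly one transmission: one new arrival and no retry, or no arrival and one retry."""
    qa0 = (1 - p_a) ** (n - m)
    qa1 = (n - m) * p_a * (1 - p_a) ** (n - m - 1) if n > m else 0.0
    qr0 = (1 - p_r) ** m
    qr1 = m * p_r * (1 - p_r) ** (m - 1) if m > 0 else 0.0
    return qa1 * qr0 + qa0 * qr1


def attempt_rate(m, n, p_a, p_r):
    """Expected number of transmission attempts per slot in state m."""
    return (n - m) * p_a + m * p_r


def p_success_approx(m, n, p_a, p_r):
    """Small-probability approximation G(m) * exp(-G(m))."""
    g = attempt_rate(m, n, p_a, p_r)
    return g * exp(-g)


# Hypothetical operating point with G(m) = 80 * 0.005 + 20 * 0.03 = 1.
n, m, p_a, p_r = 100, 20, 0.005, 0.03
print(attempt_rate(m, n, p_a, p_r))
print(p_success_exact(m, n, p_a, p_r), p_success_approx(m, n, p_a, p_r))
```

At this operating point the approximation evaluates to 1/e, and the exact value is close to it because both probabilities are small.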

Slotted Aloha: Instability
- Does slotted Aloha work when N is large?
  - Given that you are in state m, what is the probability of moving backwards (i.e. to a state < m)?
  - Stated another way: when the number of backlogged nodes is large enough, the average attempt rate G(m) becomes > 1, i.e. there are excessive collisions and the state keeps growing
- Potential solution: ensure the attempt rate G(m) < 1
  - How? Make the retransmission probability dependent on the state
  - E.g. exponential backoff [formula missing in the transcript]
- Price for making the retransmission probability too small: large delay
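The effect of a state-dependent retransmission probability can be illustrated with the attempt rate (N − m) p_a + m p_r. The sketch below compares a fixed p_r with a hypothetical rule that picks p_r in each state so that the attempt rate stays at 1. This rule is an assumption for illustration, not the exponential backoff formula from the slide, and it presumes every node knows the backlog m, which a real protocol would have to estimate.

```python
def attempt_rate(m, n, p_a, p_r):
    """Expected number of transmission attempts per slot in state m."""
    return (n - m) * p_a + m * p_r


def state_dependent_p_r(m, n, p_a):
    """Hypothetical rule: choose p_r so that the attempt rate in state m equals 1."""
    if m == 0:
        return 1.0  # no backlogged node, so the value is irrelevant
    target = 1.0 - (n - m) * p_a  # attempts left over for the backlogged nodes
    return min(1.0, max(0.0, target / m))


# Hypothetical parameters: 100 nodes, p_a = 0.002, fixed p_r = 0.1.
n, p_a, fixed_p_r = 100, 0.002, 0.1
for m in (5, 10, 50, 100):
    g_fixed = attempt_rate(m, n, p_a, fixed_p_r)
    g_adapt = attempt_rate(m, n, p_a, state_dependent_p_r(m, n, p_a))
    print(m, round(g_fixed, 3), round(g_adapt, 3))
```

With the fixed p_r the attempt rate grows roughly linearly in m and soon exceeds 1, while the adaptive rule holds it at 1 at the cost of a small p_r (and hence long delays) when the backlog is large.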