Markov Models.


Markov Chain
- A sequence of states X_1, X_2, X_3, ..., usually indexed by time.
- The transition from X_{t-1} to X_t depends only on X_{t-1} (the Markov property).
- Equivalently, the chain is a Bayesian network that forms a chain: X_1 → X_2 → X_3 → X_4 → ...
- The transition probabilities are the same for every t (a stationary, i.e. time-homogeneous, process).
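For concreteness, here is a minimal Python sketch of sampling such a chain; the two states and their transition probabilities below are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical 2-state chain; rows are the current state, columns the next
# state, so each row sums to 1. States and numbers are made up for illustration.
states = ["sunny", "rainy"]
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])

rng = np.random.default_rng(0)

def sample_chain(P, start, steps):
    """Sample a trajectory X_0, ..., X_steps; X_t depends only on X_{t-1}."""
    x = start
    path = [x]
    for _ in range(steps):
        x = rng.choice(len(P), p=P[x])  # next state drawn from row P[x]
        path.append(x)
    return path

print([states[i] for i in sample_chain(P, start=0, steps=10)])
```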

Example: Gambler's Ruin (courtesy of Michael Littman)
Specification: The gambler starts with 3 dollars. On each bet she wins a dollar with probability 1/3 and loses a dollar with probability 2/3. She fails if she reaches 0 dollars and succeeds if she reaches 5 dollars.
States: the amount of money, 0, 1, 2, 3, 4, 5.
[Slide shows the transition-probability diagram over these states.]

Transition Probabilities
- Suppose each state has N possible values: X_t = s_1, X_t = s_2, ..., X_t = s_N.
- There are N^2 transition probabilities P(X_t = s_i | X_{t-1} = s_j), 1 ≤ i, j ≤ N.
- The transition probabilities can be represented as an N x N matrix or as a directed graph.
- Example: Gambler's Ruin (see the sketch below).
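For the Gambler's Ruin example this matrix is 6 x 6. A short Python/NumPy sketch of building it, with states 0 and 5 treated as absorbing:

```python
import numpy as np

N = 6  # states 0..5 = amount of money
P = np.zeros((N, N))
P[0, 0] = 1.0          # ruined: absorbing
P[N - 1, N - 1] = 1.0  # reached 5 dollars: absorbing
for s in range(1, N - 1):
    P[s, s + 1] = 1 / 3  # win a dollar
    P[s, s - 1] = 2 / 3  # lose a dollar

assert np.allclose(P.sum(axis=1), 1.0)  # each row is a probability distribution
print(P)
```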

What can Markov Chains Do? (Example: Gambler's Ruin)
- Compute the probability of a particular sequence, e.g. 3, 4, 3, 2, 3, 2, 1, 0.
- Compute the probability of success for the gambler.
- Compute the average number of bets the gambler will make.
(The first question is sketched just below; the other two are sketched after the Ruin Chain and Gambling Time Chain slides.)
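A minimal sketch of the first computation: the probability of a specific trajectory is the product of the one-step transition probabilities along it.

```python
import numpy as np

# Transition matrix for the gambler's ruin chain (as built above).
P = np.zeros((6, 6))
P[0, 0] = P[5, 5] = 1.0
for s in range(1, 5):
    P[s, s + 1], P[s, s - 1] = 1 / 3, 2 / 3

# Probability of the particular sequence 3, 4, 3, 2, 3, 2, 1, 0:
# multiply the one-step transition probabilities along the sequence.
seq = [3, 4, 3, 2, 3, 2, 1, 0]
prob = np.prod([P[a, b] for a, b in zip(seq, seq[1:])])
print(prob)  # (1/3)^2 * (2/3)^5 ≈ 0.0146
```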

Example: Academic Life (courtesy of Michael Littman)
States and their rewards (income per time step): A. Assistant Prof.: 20; B. Associate Prof.: 60; T. Tenured Prof.: 90; S. Out on the Street: 10; D. Dead: 0.
[Slide shows the state-transition diagram, with probabilities such as 0.6, 0.7, 0.8, 0.2, 0.3, and 1.0 on the edges.]
Question: What is the expected lifetime income of an academic?

Solving for Total Reward
L(i) is the expected total reward received starting in state i. How could we compute L(A)? Would it help to also compute L(B), L(T), L(S), and L(D)?

Solving the Academic Life
The expected income at state D is 0. For the tenured state T (which stays at T with probability 0.7 and moves to D with probability 0.3):
L(T) = 90 + 0.7*90 + 0.7^2*90 + ...
L(T) = 90 + 0.7*L(T)
L(T) = 300

Working Backwards
Propagating the expected rewards backwards through the chain (values from the diagram): L(D) = 0, L(T) = 300, L(S) = 50, L(B) = 325, and L(A) = 287.5, so the expected lifetime income of an Assistant Professor is 287.5.
Another question: What is the life expectancy of professors?
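A sketch of the same idea done directly: over the non-absorbing states the rewards satisfy the linear system L = R + Q L, so L = (I - Q)^(-1) R. The full academic-life transition matrix is only shown in the (lost) diagram, so the demo below uses just the recoverable Tenured/Dead sub-chain; with the full matrix plugged in, the same call reproduces the values above.

```python
import numpy as np

def expected_total_reward(Q, R):
    """Solve L = R + Q L over the transient states, i.e. L = (I - Q)^(-1) R.

    Q: transition probabilities among the transient (non-absorbing) states;
    R: per-step reward in each transient state. Absorbing states are assumed
    to contribute zero further reward (as D, with reward 0, does here).
    """
    n = len(R)
    return np.linalg.solve(np.eye(n) - Q, R)

# Recoverable sub-chain from the slides: Tenured stays with prob. 0.7,
# moves to Dead (absorbing, reward 0) with prob. 0.3; reward 90 per step.
Q = np.array([[0.7]])
R = np.array([90.0])
print(expected_total_reward(Q, R))  # [300.]
```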

Ruin Chain
[Diagram: the gambler's chain (states 0-5, win prob. 1/3, lose prob. 2/3, absorbing ends) with a reward of +1 attached to the transition into state 5.]
The expected total reward L(3) in this chain is exactly the gambler's probability of success.

Gambling Time Chain
[Diagram: the same chain with a reward of +1 on every transition out of the transient states 1-4.]
The expected total reward L(3) here is the average number of bets the gambler will make.
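A sketch of both remaining computations from the "What can Markov Chains Do?" slide, set up as linear systems over the transient states 1-4 (win probability 1/3, lose probability 2/3, start at 3 dollars): the success probability h satisfies h = Q h + b, where b holds the probability of jumping straight to state 5, and the expected number of bets m satisfies m = Q m + 1.

```python
import numpy as np

p_win, p_lose = 1 / 3, 2 / 3
transient = [1, 2, 3, 4]  # money amounts strictly between 0 and 5

# Q: transitions among transient states; b5: probability of moving to state 5.
Q = np.zeros((4, 4))
b5 = np.zeros(4)
for idx, s in enumerate(transient):
    if s + 1 == 5:
        b5[idx] = p_win
    else:
        Q[idx, transient.index(s + 1)] = p_win
    if s - 1 != 0:
        Q[idx, transient.index(s - 1)] = p_lose

I = np.eye(4)
success = np.linalg.solve(I - Q, b5)                 # h = Q h + b5
expected_bets = np.linalg.solve(I - Q, np.ones(4))   # m = Q m + 1

start = transient.index(3)
print(success[start])        # ≈ 0.226  (= 7/31)
print(expected_bets[start])  # ≈ 5.6 bets on average
```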

Google's Search Engine
- Assumption: a link from page A to page B is a recommendation of page B by the author of A (we say B is a successor of A).
- Quality of a page is related to its in-degree.
- Recursion: quality of a page is related to its in-degree, and to the quality of the pages linking to it.
- PageRank [Brin and Page '98]

Definition of PageRank
Consider the following infinite random walk (surf):
- Initially the surfer is at a random page.
- At each step, the surfer proceeds to a randomly chosen web page with probability d, or to a randomly chosen successor of the current page with probability 1-d.
The PageRank of a page p is the fraction of steps the surfer spends at p in the limit.
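A Monte Carlo sketch of this definition: simulate the surfer on a small hypothetical graph and count visit fractions. The graph, the 200,000-step budget, and d = 0.15 are illustrative assumptions (note that in this deck d is the probability of the random jump).

```python
import random

# Hypothetical 4-page web graph (adjacency lists of successors).
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C", "A"],
}
pages = list(links)
d = 0.15          # probability of a random jump (teleport); assumed value
steps = 200_000

random.seed(0)
visits = {p: 0 for p in pages}
page = random.choice(pages)             # start at a random page
for _ in range(steps):
    if random.random() < d or not links[page]:
        page = random.choice(pages)     # random jump (also used at dead ends)
    else:
        page = random.choice(links[page])  # follow a random out-link
    visits[page] += 1

# Fraction of steps spent at each page ≈ its PageRank.
print({p: visits[p] / steps for p in pages})
```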

Random Web Surfer
What's the probability of a page being visited?

Stationary Distributions
- Let S be the set of states of a Markov chain and P its transition probability matrix.
- The initial state is chosen according to some probability distribution q(0) over S.
- q(t) = row vector whose i-th component is the probability that the chain is in state i at time t.
- q(t+1) = q(t) P, so q(t) = q(0) P^t.
- A stationary distribution is a probability distribution q such that q = q P (steady-state behavior).
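A small sketch of finding a stationary distribution by power iteration (repeatedly applying q ← q P), using the same illustrative two-state matrix as earlier:

```python
import numpy as np

# Small hypothetical irreducible chain (illustrative 2-state matrix).
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# Power iteration: start from any distribution and repeatedly apply q <- q P.
q = np.array([1.0, 0.0])
for _ in range(1000):
    q_next = q @ P
    if np.allclose(q_next, q, atol=1e-12):
        break
    q = q_next

print(q)                      # ≈ [2/3, 1/3]
print(np.allclose(q, q @ P))  # True: q is stationary, q = q P
```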

Markov Chains
Theorem: Under certain conditions:
- There exists a unique stationary distribution q with q_i > 0 for all i.
- Let N(i,t) be the number of times the Markov chain visits state i in t steps. Then N(i,t)/t → q_i as t → ∞, i.e. the long-run fraction of time spent in state i is its stationary probability.

PageRank
PageRank = the stationary probability for this Markov chain, i.e.
PR(p) = d/n + (1-d) * Σ_{q: q links to p} PR(q) / out-degree(q)
where n is the total number of nodes in the graph and d is the probability of making a random jump.
- Query-independent.
- Summarizes the "web opinion" of the page's importance.

Example: the PageRank of a page P linked to by pages A and B (with 4 and 3 out-links respectively) is
(1-d) * (1/4 of the PageRank of A + 1/3 of the PageRank of B) + d/n.
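A sketch of computing PageRank by iterating this equation to a fixed point (power iteration) on a small hypothetical graph; the graph and d = 0.15 are assumed for illustration, with d again being the random-jump probability as in the slides.

```python
import numpy as np

# Hypothetical 4-page graph given as successor lists (no dangling pages).
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n = len(links)
d = 0.15  # probability of a random jump; assumed value

pr = np.full(n, 1.0 / n)
for _ in range(100):
    new = np.full(n, d / n)               # random-jump share for every page
    for q, succs in links.items():
        for p in succs:
            new[p] += (1 - d) * pr[q] / len(succs)  # q spreads rank to successors
    if np.allclose(new, pr, atol=1e-12):
        break
    pr = new

print(pr, pr.sum())  # PageRank scores; they sum to 1
```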

Kth-Order Markov Chain
What we have discussed so far is the first-order Markov chain. More generally, in a kth-order Markov chain each state transition depends on the previous k states:
P(X_t | X_{t-1}, ..., X_{t-k}).
What is the size of the transition probability matrix? (See the sketch below.)
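A small sketch of the answer, assuming each state takes one of N possible values: conditioning on the previous k states gives an N^k x N table (N^(k+1) probabilities), and a kth-order chain can always be recast as a first-order chain over length-k histories.

```python
from itertools import product

N = 6   # number of values per state (e.g. the gambler's 0..5)
k = 2   # order of the chain

# A kth-order chain conditions on the last k states, so its transition table
# has one row per length-k history and one column per next state.
histories = list(product(range(N), repeat=k))
print(len(histories), "x", N, "=", len(histories) * N, "entries")  # N^k x N = N^(k+1)

# Equivalently, it is a first-order chain whose states are the length-k
# histories (here: pairs of original states).
```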