IEG4140 Teletraffic Engineering
Professor: S.-Y. Robert Li, Rm. HSH734 (inside 727), x8369. Tutor: Diana Q. Wang, Rm. HSH729. Visit the course website frequently to catch spontaneous announcements. Textbook: lecture notes on the course website, under constant revision. Reference book: Chapters 4~8 of S. Ross, “Introduction to Probability Models,” 8th ed., Academic Press.
Assessment scheme: The grading emphasizes logical reasoning rather than numerical calculation:
10% homework, 30% mid-term exam, 60% final exam. Bonus for actively asking questions in the classroom. Prof. Bob Li
Lecture notes: Chap 1. Motivating puzzles, Markov chain, stopping time, and martingale Chap 2. Poisson process Chap 3. Continuous-time Markov chain Chap 4. Renewal process Chap 5. Queueing theory Prof. Bob Li
Schedule:
Week 1 (Sep 8) Some puzzles in stochastic processes & Markov chain I
Week 2 (Sep 15) Markov chain II
Week 3 (Sep 22) Time-reversed Markov chains
Week 4 (Sep 29) Stopping time, Wald’s equation & martingale
Week 5 (Oct 6) Exponential and Poisson distribution I
Week 6 (Oct 13) Exponential and Poisson distribution II
Week 7 (Oct 20) The Poisson process I
Week 8 (Oct 27) Midterm exam (tentative date)
Week 9 (Nov 3) The Poisson process II
Week 10 (Nov 10) Continuous-time Markov chain I
Week 11 (Nov 17) Continuous-time Markov chain II
Week 12 (Nov 24) Queueing theory I
Week 13 (Dec 1) Queueing theory II
Prof. Bob Li
Lecture 1 Sep. 8, 2011 Prof. Bob Li
Chapter 1. Introduction: motivating puzzles, Markov chain, stopping time, and martingale
1.1. Some puzzles in stochastic processes 1.2. Markov chain 1.3. Time-reversed Markov chain 1.4. Markov decision process 1.5. Stopping time 1.6. Martingales Prof. Bob Li
1.1 Some puzzles in stochastic processes
1.1.1. Pattern occurrence in repeated coin toss
P{Head} = P{Tail} = ½. In a very long binary sequence, all 64 length-6 patterns appear equally frequently. HTHTTTTHTHHTHTTTHTTTTHTHHTHTTTHTHTHTTHTTHTHTTTHTHHTHTTHTHHTHHTHHTHTTHTHHTHHTHHTHTTTHTHHTTHHTTTHTTTTTHTHHTHHTHHTHTTHTHHTHTTTTHTHHTHTTTHTHHTHHTHHTTTHHTHHTHTHTTHTHTHHTHTHTTHTHHHHTHTHTTHTTHTTTHTHTHTHTHHTTTTHHTHTTHTHTHTHTTHHTHHTHTHTHTHHTHTTTHTTTHHTHTHTHHTTHHTHTTHTTHTTHTHHTHHTHHTHTTHTTHTHTHTHTTHTHTHTTHTHTHHTTHTTHTHTHTTHTHTHHHTHHTHTHTTHTTHTHHTHTTHTHHTHTHHTHTHTTHTTHTHTHTHTTHTHTHTHTTHTHTTTTHTHHTTHHHTTHHTTTHHHHTHHTHTHHHHHHTTHTHTHTHHTHHTHTTHHTHTHTTHHTHTHTTTHTHTHHTHTHHTTHTHHTTHHTTHHHTTHTTHTTHTTHHTHHTHTTHTTHTTHTHTHHTHTHTTHTTHTHTTHTTHTTHTHHTHTTHTTTHTHTTHTHHTTH … So, the average waiting time for every length-6 pattern should be 64. Right? Prof. Bob Li
William Feller’s biblical textbook
The book first raised these questions: Average waiting time for HHHHHH = 64? Average waiting time for HHTTHH = 64? Feller did not live to see his 2-volume book. Prof. Bob Li
William Feller’s biblical textbook
Through a long derivation by Markov chains: Average waiting time for HHHHHH = 126, not 64. Average waiting time for HHTTHH = 70, not 64. William Feller found these numbers counter-intuitive. Feller did not live to see his 2-volume book. Prof. Bob Li
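These two averages are easy to check empirically. Below is a minimal Monte Carlo sketch (ours, not from the slides; all names are our choices) that estimates both waiting times by simulating fair coin tosses:

```python
import random

def waiting_time(pattern, rng):
    """Toss a fair coin until `pattern` appears; return the number of tosses."""
    window, n = "", 0
    while True:
        window = (window + rng.choice("HT"))[-len(pattern):]
        n += 1
        if window == pattern:
            return n

rng = random.Random(0)
for pat in ["HHHHHH", "HHTTHH"]:
    trials = 20000
    avg = sum(waiting_time(pat, rng) for _ in range(trials)) / trials
    print(pat, round(avg, 1))   # ~126 for HHHHHH, ~70 for HHTTHH
```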
Paradoxes in pattern occurrence
(Consider again the long toss sequence displayed above.) In the long run, the pattern HHTTHH occurs equally frequently as any other length-6 pattern and hence it occurs after every 64 coin tosses on the average. In more precise terms, the renewal time, which is measured from one appearance of the pattern till the next, averages 64. Prof. Bob Li
Intuition vs. misconception
For a length-6 binary pattern: The renewal time, measured from one appearance to the next, averages 64. The waiting time, measured from the very beginning of the coin-toss process, can be longer. Quick intuition defies scientific truth when it mixes up the two concepts. Let’s see how far quick intuition can stray from scientific truth. Prof. Bob Li
Race between 2 patterns
On fair coin-toss,
Average waiting time for HTHH = 18 // A slightly faster pattern
Average waiting time for THTH = 20 // A slightly slower pattern
In a race between HTHH vs. THTH, the odds are nothing like 10 : 9, but rather … Prof. Bob Li
Race between 2 patterns
On fair coin-toss,
Average waiting time for HTHH = 18 // A slightly faster pattern
Average waiting time for THTH = 20 // A slightly slower pattern
In a race between HTHH vs. THTH, the odds are nothing like 10 : 9, but rather the landslide 5 : 9. Prof. Bob Li
The fairy-tale race (The Sting)
On fair coin-toss,
Average waiting time for HTHH = 18 // A slightly faster pattern: the Hare
Average waiting time for THTH = 20 // A slightly slower pattern: the Tortoise
In a race between HTHH vs. THTH, the odds are the landslide 5 : 9 in favor of THTH. Scientific truth and fairy tales happily agree with each other, while ordinary intuition is left out.
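The 5 : 9 odds can likewise be checked by simulating the race; in this sketch (an illustration of ours, not the slides’ method), whichever pattern completes first wins:

```python
import random

def race(a, b, rng):
    """Return the pattern (a or b) that appears first in fair coin tossing."""
    m = max(len(a), len(b))
    window = ""
    while True:
        window = (window + rng.choice("HT"))[-m:]
        if window.endswith(a):
            return a
        if window.endswith(b):
            return b

rng = random.Random(1)
trials = 50000
wins = sum(race("THTH", "HTHH", rng) == "THTH" for _ in range(trials))
print(wins / trials)   # ~9/14 = 0.643 in favor of THTH, i.e., odds 9 : 5
```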
It’s a bit like 田忌賽馬 (Tian Ji’s horse racing): the Tortoise often wins by a neck, while a Hare’s win is at least a full length. Homework. Compute P(HH occurs before TTT) with a biased coin where P(Head) = 0.4. Later we’ll deal with this topic again by martingales.
1.1.2. An imaginary casino
Pay $100 to enter the game: Repeatedly toss a fair coin until a Head shows up. Receive $2^n back if Head occurs on the nth toss.
E[Return] = Σ_{n≥1} 2^n (1/2)^n = Σ_{n≥1} 1 = $∞ // E[N] can be ∞, even when P(N < ∞) = 1.
Q. Is this $∞ vs. $100 a huge advantage to the gambler in practice?
1.1.3. Symmetric random walk
Abel & Billy gamble $1 on each toss of a fair coin. This is a zero-sum game. The net winning of Abel is a symmetric random walk.
P{Abel nets +$1 sooner or later, i.e., time till +$1 is finite} = 1?
The moral. Despite the 100% probability for the occurrence of this event in finite time, that finite time is unbounded and, in fact, has mean value = ∞. On the average, it takes infinitely many tosses to achieve the net gain of $1. The average net gain per toss is $1/∞ = $0, as it should be in a fair gamble.
Symmetric random walk (cont’d)
Abel & Billy gamble $1 on each toss of a fair coin. This is a zero-sum game. The net winning of Abel is a symmetric random walk.
P{Time till Abel nets +$1 is finite} = 1?
P{Time till Abel nets +$1000 is finite} = 1?
P{Time till Billy nets +$1000 is finite} = 1?
Answer: Yes to all, even though it is a zero-sum game.
1.1.4. 2-dimensional random walk
The street map of Manhattan is an infinite checkerboard. A drunkard starts a symmetric 2-dimensional random walk at the street corner outside the pub; at every corner, each of the four directions is taken with probability 1/4.
Q. P{Return to the origin within finite time} = 1? Answer: Yes. Prof. Bob Li
1.1.5. 3- or higher-dim random walk
Consider an infinite 3-dimensional checkerboard. At every cross point, the six directions are equally likely: east, west, north, south, up, and down.
Q1. P{Return to the origin within finite time} = 1? Answer: No.
Q2. How about 4-dim random walk? Prof. Bob Li
Progressive gambler
On casino roulette, a gambler keeps betting on "odd", where
P(Odd) = P({1, 3, ... , 35}) = 18/38
P(Not odd) = P({0, 00, 2, 4, ... , 36}) = 20/38
The gambler starts with a $1 bet. Whenever he loses, he doubles the bet on the next spin. This process continues until he wins, at which time he nets +$1 regardless of how long the process takes.
Q. Is this a sure way to win $1 from the casino? Answer: It is a sure way to enrich the casino. Prof. Bob Li
A board game
Each move advances the token by a number of steps determined by rolling a die.
P{Square #1 will be landed on} = 1/6
P{Square #2 will be landed on} = 1/6 + 1/36 = 7/36
Q. Which square has the highest probability to be ever landed on?
A board game (cont’d)
Let pn = probability of ever landing on square n. By conditioning on the first roll,
pn = Σ_{1≤k≤6} P(1st roll = k) P(ever landing on n | 1st roll = k)
= Σ_{1≤k≤6} P(ever landing on n | 1st roll = k) / 6
= (pn−1 + … + pn−6) / 6
That is, every probability pn, n ≥ 1, is the average of its six immediate predecessors. The recursive formula is a 6th-order difference equation. There are 6 boundary conditions: p0 = 1, p−1 = p−2 = p−3 = p−4 = p−5 = 0. In this course, we shall solve difference equations quite often.
A board game (cont’d)
Values of pn are tabulated below. The largest among p1 to p6 is p6. Because every probability mass pn is the average of its six immediate predecessors, it is an easy induction to find pn < p6 for all n > 6.
n = 1: p1 = 1/6 = .1667      n = 7: .2536     n = 13: .2790
n = 2: p2 = 7/6² = .1944     n = 8: .2681     n = 14: .2832
n = 3: p3 = 7²/6³ = .2269    n = 9: .2804     n = 15: .2857
n = 4: p4 = 7³/6⁴ = .2647    n = 10: .2891    n = 16: .2866
n = 5: p5 = 7⁴/6⁵ = .3088    n = 11: .2932    …
n = 6: p6 = 7⁵/6⁶ = .3602    n = 12: .2906    limit = 2/7 = .2857
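The table is reproduced by iterating the difference equation directly; a minimal sketch (ours):

```python
# Iterate p_n = (p_{n-1} + ... + p_{n-6}) / 6
# with boundary values p_0 = 1 and p_{-1} = ... = p_{-5} = 0.
p = {0: 1.0, -1: 0.0, -2: 0.0, -3: 0.0, -4: 0.0, -5: 0.0}
for n in range(1, 17):
    p[n] = sum(p[n - k] for k in range(1, 7)) / 6
print(max(range(1, 7), key=lambda n: p[n]))   # 6: the most likely square
print(round(p[6], 4), round(p[16], 4))        # 0.3602, 0.2866; p_n -> 2/7
```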
A board game (cont’d)
Homework. Show that p13 ≤ pn ≤ p11 for all n > 13.
Homework. If we roll two dice instead of just one, which square has the highest probability to be landed on? Hint: Find the largest among p1 to p12.
Remark. This board game can be modeled as a Markov chain in the obvious way. Conditioning on the first roll is the same as a Markov transition, which is also the same as multiplication by the transition matrix. Prof. Bob Li
1.2 Markov chains
Definition. A sequence of discrete r.v. X0, X1, X2, ... is called a Markov chain if there exists a matrix P = [Pij] such that
P(Xt+1 = j | Xt = i, Xt−1 = it−1, …, X0 = i0) = Pij for all t and all states.
// P(Xt+1 = j | Xt = i) is time-invariant.
// Memoryless: Only the newest knowledge Xt counts. Prof. Bob Li
Memoryless and time-invariant
Definition. A sequence of discrete r.v. X0, X1, X2, ... is called a Markov chain if there exists a matrix P = [Pij] such that P(Xt+1 = j | Xt = i, Xt−1 = it−1, …, X0 = i0) = Pij.
// P(Xt+1 = j | Xt = i) is time-invariant.
// Memoryless: Only the newest knowledge Xt counts.
Remarks. Any possible value of any Xt is called a state of the Markov chain. When there are k (≤ ∞) states in the Markov chain, the matrix P is square (k by k) and is called the transition matrix of the Markov chain. Pij is called the transition probability from state i to state j.
Example of 3-state weather model
A simple 3-state model of weather assumes that the weather on any day depends only upon the weather of the preceding day. This model can be described as a 3-state Markov chain whose rows and columns are indexed by the states sunny, rainy, and cloudy. [The transition matrix and the transition graph (or transition diagram) appear as a figure on the slide.] Prof. Bob Li
Every row sum = 1 in the transition matrix
The transition equation Pij = P(Xt+1 = j | Xt = i) says that the ith row of the transition matrix is the conditional distribution of Xt+1 given that Xt = i. Hence every row sum must be 1. In contrast, a column sum is in general not 1. Prof. Bob Li
Free 1-dimensional random walk
At each game, the gambler wins $1 with probability p and loses $1 with probability q = 1−p. Let Xt be the total net winning after t games. Thus the states are the integers. Modeling by a Markov chain, the transition probabilities are Pi,i+1 = p and Pi,i−1 = q, with Pij = 0 for j ≠ i ± 1. [The transition matrix and transition graph appear as a figure on the slide.]
Random walk with an absorbing state
In the same gamble as before, the gambler starts with $2. He has to stop gambling if the net winning reaches −$2, i.e., when he has lost all his money. In other words, P−2,−2 = 1 and P−2,j = 0 for all j ≠ −2; the other transition probabilities are as in the free random walk. [The transition matrix and diagram for the Markov chain appear as a figure on the slide.] Prof. Bob Li
Independence of partial memory
P(X3 = j | X2 = i, X1 = i1, X0 = i0) = Pij. To calculate the probabilities for X3, if we know X2, then we can throw away the knowledge about both X1 and X0.
Example. P(X3 = j | X2 = i, X0 = i0) = Pij // Only the newest knowledge counts.
Proof. Conditioning on X1,
P(X3 = j | X2 = i, X0 = i0) = Σ_{i1} P(X1 = i1 | X2 = i, X0 = i0) P(X3 = j | X2 = i, X1 = i1, X0 = i0)
= Σ_{i1} P(X1 = i1 | X2 = i, X0 = i0) Pij = Pij
Problem. Prove that P(X3 = j | X2 = i, X1 = i1) = Pij by conditioning on X0.
Independence of partial memory (cont’d)
Example. P(X3 = j | X1 = i, X0 = i0) = ?
Answer. Conditioning on X2,
P(X3 = j | X1 = i, X0 = i0) = Σ_{i2} P(X2 = i2 | X1 = i) P(X3 = j | X2 = i2) // By conditioning on X2
= Σ_{i2} Pi,i2 Pi2,j = the ijth entry in P² = P(X3 = j | X1 = i) // Special case of the Chapman-Kolmogorov equation below
Conclusion: Only the newest knowledge counts. Prof. Bob Li
Transition equation in matrix form
Conditioning on Xt, we have P(Xt+1 = j) = Σ_i P(Xt = i) Pij, where the summation is over all states i. In the matrix form, this becomes Vt+1 = Vt P, where the row vector Vt denotes the distribution of Xt.
// Multiplication by P = a transition. Prof. Bob Li
Transition equation in matrix form
Weather example. The row vector in this example is simply Vt = (P(Xt = sunny), P(Xt = rainy), P(Xt = cloudy)). The transition equation in the matrix form becomes Vt+1 = Vt P. Prof. Bob Li
Chapman-Kolmogorov equation
Now we iterate the transition equation starting with Vt.
Vt+1 = Vt P // Multiplication with P represents a transition.
Vt+2 = Vt+1 P = (Vt P) P = Vt P²
By induction on n, we arrive at the following Chapman-Kolmogorov equation: the matrix P^n gives the n-step transitions. That is, Vt+n = Vt P^n.
Note. By conditioning on Xt, P(Xt+n = j | Xt = i) = the ijth entry in the matrix P^n // For all t
Prisoner in a dark cell
Hole A leads to freedom through a tunnel of a 3-hour journey. Holes B and C are at the two ends of a tunnel of a 2-hour journey. Whenever the prisoner returns to the dark cell, he would immediately enter one of the three holes by a random choice.
Q. When can the prisoner get out?
Q. What should be the "states" in the model of a Markov chain? Prof. Bob Li
Prisoner in a dark cell (cont’d)
Label the "hour-points" inside the tunnels as F, R, and T (the assignment below is consistent with the expected-time calculation later): F is 1 hour into tunnel A (2 hours from freedom), R is 2 hours into tunnel A (1 hour from freedom), and T is the turnaround point of the B–C tunnel (1 hour from the cell in either direction). Use these points, together with 'cell' and 'freedom', as states. The transitions: cell → F w.p. 1/3, cell → T w.p. 2/3, F → R → freedom, T → cell, and freedom is absorbing. [The transition matrix and transition graph appear as a figure on the slide.]
Prisoner in a dark cell (cont’d)
As before, let Vt denote the distribution of Xt; order the states as (cell, F, R, T, freedom). Then,
V0 = (1 0 0 0 0) // Initially the prisoner is in the cell.
V1 = V0 P = (0 1/3 0 2/3 0)
V2 = V1 P = (2/3 0 1/3 0 0) = V0 P²
V3 = V2 P = (0 2/9 0 4/9 1/3) = V0 P³ // 1/3 = P(Free within 3 hours)
V4 = V3 P = (4/9 0 2/9 0 1/3) = V0 P⁴
V5 = V4 P = (0 4/27 0 8/27 5/9) = V0 P⁵ // 5/9 = P(Free within 5 hours)
V6 = V5 P = (8/27 0 4/27 0 5/9) = V0 P⁶
...
V∞ = V0 P^∞ = (0 0 0 0 1) = V0 lim_{t→∞} P^t // 1 = P(Free eventually)
Note that V0 P^∞ picks out the 'cell' row in the matrix lim_{t→∞} P^t. In fact, even with any other V0 (i.e., any distribution of the initial state), we would still have V∞ = V0 P^∞ = (0 0 0 0 1). Thus every row in lim_{t→∞} P^t is (0 0 0 0 1).
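A short numeric check of this slide (ours), iterating Vt+1 = Vt P with the state ordering assumed in the reconstruction above:

```python
import numpy as np

# States ordered (cell, F, R, T, freedom).
P = np.array([
    [0, 1/3, 0, 2/3, 0],   # cell: hole A w.p. 1/3; hole B or C w.p. 2/3
    [0, 0,   1, 0,   0],   # F: one hour deeper into tunnel A
    [0, 0,   0, 0,   1],   # R: one hour from freedom
    [1, 0,   0, 0,   0],   # T: turnaround point, back to the cell
    [0, 0,   0, 0,   1],   # freedom is absorbing
])
V = np.array([1.0, 0, 0, 0, 0])   # V0: the prisoner starts in the cell
for t in range(1, 6):
    V = V @ P
    print(t, np.round(V, 4))      # last entry: 1/3 at t = 3, 5/9 at t = 5
```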
Lecture 2 Sep. 15, 2011 Prof. Bob Li
Slotted Aloha multi-access protocol
Nodes A, B, C, … share one channel. In every timeslot, a transmission is successful when there is a unique node actively transmitting. A transmission attempt can be either a new packet or a retransmission. Every backlogged node in every timeslot reattempts transmission with a probability p. The time till reattempt is Geometric1(p): P(time = t) = (1−p)^{t−1} p for all t ≥ 1.
Note. Geometric0(p): P(time = t) = (1−p)^t p for all t ≥ 0
Markov model of slotted Aloha
Markov model. Using the number of backlogged nodes as the state of the system, a transition from state k is always to some state j ≥ k−1.
Assumption for the convenience of analysis. The probability ai that i newly arriving packets intend transmission in a timeslot is fixed and is independent of the state. Every backlogged node in every timeslot reattempts transmission with an independent probability p.
Markov model of slotted Aloha
Markov model. Using the number of backlogged nodes as the state of the system, a transition from state k is always to some state j ≥ k−1.
Assumption for the convenience of analysis. The probability ai that i newly arriving packets intend transmission in a timeslot is fixed and is independent of the state. The probability bi that i of the backlogged nodes attempt retransmission depends on the state k binomially: bi = C(k, i) p^i (1−p)^{k−i}.
Markov model of slotted Aloha
Markov model. Using the number of backlogged nodes as the state of the system, a transition from state k is always to some state j ≥ k−1. The transition probabilities from state k are:
Pk,k−1 = a0 b1 // k > 0; exactly 1 reattempt, which succeeds; no new packet
Pk,k = a0 (1−b1) // no new packet; 0 or ≥2 reattempts, so no departure
      + a1 b0 // no reattempt; 1 new packet, which succeeds
Pk,k+1 = a1 (1−b0) // k > 0; ≥1 reattempt colliding with 1 new packet
For i ≥ 2, Pk,k+i = ai // ≥2 new packets always collide
Prof. Bob Li
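As an illustration, the following sketch assembles the transition matrix just described; the function name, the truncation of the state space, and the sample arrival distribution are all our assumptions:

```python
from math import comb

def aloha_matrix(num_states, p, a):
    """Backlog-chain transition probabilities, truncated at num_states states.
    p = retransmission probability; a[i] = P(i new arrivals in a slot)."""
    b = lambda i, k: comb(k, i) * p**i * (1 - p)**(k - i)   # binomial b_i
    P = [[0.0] * num_states for _ in range(num_states)]
    for k in range(num_states):
        if k > 0:
            P[k][k - 1] = a[0] * b(1, k)                    # one reattempt, no new
        P[k][k] = a[0] * (1 - b(1, k)) + a[1] * b(0, k)     # backlog unchanged
        if k + 1 < num_states:
            P[k][k + 1] = a[1] * (1 - b(0, k))              # 1 new collides with reattempt(s)
        for i in range(2, num_states - k):
            P[k][k + i] = a[i]                              # >= 2 new packets always collide
    return P   # note: rows near the truncation boundary leak a little mass

P = aloha_matrix(5, 0.1, [0.7, 0.2, 0.1, 0.0, 0.0])
print([round(x, 4) for x in P[2]])
```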
1.4 Limiting probabilities of a Markov chain
Definition. In the Markov chain X0, X1, X2, ..., let the row vector Vt represent the distribution of Xt. If there exists a row vector V such that lim_{t→∞} Vt = V regardless of the initial distribution V0, then the distribution V is called the limiting distribution of the Markov chain.
Remarks. The limiting distribution is also called the “stationary state” by its interpretation as a “random state.” V = V0 · lim_{t→∞} P^t for all V0. Taking V0 = (1 0 0 …), we find V equal to the 1st row in the matrix lim_{t→∞} P^t. Similarly, taking V0 = (0 1 0 …), we find V equal to the 2nd row in lim_{t→∞} P^t. In the same manner, V is equal to every row in lim_{t→∞} P^t.
Example of limiting probability
Example. For the prisoner in the dark cell, every row of lim_{t→∞} P^t equals (0 0 0 0 1), so the limiting distribution is V = (0 0 0 0 1). Prof. Bob Li
Ergodic Markov chain
Definition. A finite-state Markov chain is ergodic if, for some t ≥ 1, all entries in the matrix P^t are nonzero.
// For some particular t, the Markov chain can go from anywhere to anywhere in exactly t steps.
Theorem. If a finite-state Markov chain is ergodic, then the limiting distribution exists.
// Ergodicity is a sufficient condition but, we shall use an example to show that it is not a necessary condition. Prof. Bob Li
Example of an ergodic Markov chain
Example. A salesman travels between HK and Macau. [The 2-state transition matrix appears as a figure on the slide.] All entries in P² are positive. Hence this Markov chain is ergodic. Prof. Bob Li
Limiting probabilities = eigenvector
Since V P = (lim_{t→∞} V0 P^t) P = lim_{t→∞} V0 P^{t+1} = V, the limiting distribution is:
1) a (row) eigenvector of the transition matrix with the eigenvalue 1;
2) a probability distribution (sum of entries = 1).
The normalization by 2) is crucial, since any scalar multiple of V is also an eigenvector with the eigenvalue 1. A row vector with both properties is called the long-run distribution. In the ergodic case, V is the unique long-run distribution. Prof. Bob Li
Example of a 2-state Markov chain
A salesman travels between HK and Macau, say with P(HK → HK) = P(HK → Macau) = 1/2 and P(Macau → HK) = 1 (values consistent with the equations below; the matrix itself appears as a figure on the slide). All entries in P² are positive. Hence this Markov chain is ergodic. Since the limiting distribution V is a row eigenvector of P with eigenvalue 1,
V P = V, or V (P − I) = (0 0)
The existence of an eigenvector renders (P − I) a singular matrix. Write V = (x y). Thus,
x/2 + y = x
x/2 = y
are linearly dependent equations. They span the full rank minus 1, because the eigenspace is 1-dimensional. Together with the normalization equation x + y = 1, we can solve x = 2/3 and y = 1/3.
Limiting distribution of a 3-state chain
P =
( 0.10 0.90 0.00 )
( 0.05 0.50 0.45 )
( 0.00 0.10 0.90 )
and P² =
( 0.055 0.540 0.405 )
( 0.030 0.340 0.630 )
( 0.005 0.140 0.855 )
All entries in P² are positive. Hence the Markov chain is ergodic. Write the limiting distribution as V = (x y z). We want to calculate x, y, and z. Since V is a distribution, x + y + z = 1. On the other hand, V has to be a row eigenvector of P with eigenvalue 1. Thus, we have the linearly dependent equations:
0.1x + 0.05y = x
0.9x + 0.5y + 0.1z = y
0.45y + 0.9z = z
They are worth only two equations. Together with the normalization equation, we get V = (0.01 0.18 0.81).
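The computation can be replayed numerically; a sketch (ours) that solves V(P − I) = 0 with the normalization, and confirms that every row of a high power of P approaches V:

```python
import numpy as np

P = np.array([[0.10, 0.90, 0.00],
              [0.05, 0.50, 0.45],
              [0.00, 0.10, 0.90]])

# Stack V(P - I) = 0 with the normalization sum(V) = 1 and solve:
A = np.vstack([(P - np.eye(3)).T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
V, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(V, 4))                               # [0.01 0.18 0.81]
print(np.round(np.linalg.matrix_power(P, 100), 4))  # every row -> V
```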
Color-blindness gene
Color-blindness is X-linked. There are two kinds of X genes: X0 = color blind, X1 = normal. Thus there are three types of female:
X0X0 = color blind
X0X1 = normal // X1 gene dominates X0
X1X1 = normal
and two types of male: X0Y = color blind, X1Y = normal.
Question. In a “stationary” society, a survey shows that approximately 10% of the male population is color blind. However, the survey also finds too few color-blind females to make a percentage estimate. What can we do?
Color-blindness gene (cont’d)
Solution. From the percentage of color-blind males, we assert the 1 : 9 ratio between the X0 gene and the X1 gene. Then, model the problem by the mother-to-daughter transition matrix (a daughter inherits one X gene from the mother and one from the father, whose X gene is X0 with probability 0.1):
Mother \ Daughter:  X0X0   X0X1   X1X1
X0X0:               0.10   0.90   0.00
X0X1:               0.05   0.50   0.45
X1X1:               0.00   0.10   0.90
From the nature of the problem, we know the Markov chain is ergodic. The preceding calculation also proves this and, in fact, yields the limiting distribution V = (0.01 0.18 0.81). Thus, 1% of females are color blind, while another 18% are carriers of the color-blindness gene. Prof. Bob Li
Convolutional decoding
Digital communications often stipulate inspection of the pattern of the last few, say, 3 received bits; take the state to be this length-3 pattern, giving an 8-state Markov chain. Assume each bit is independently random with P(bit = 1) = p. All 3-step transition probabilities are positive. Hence the Markov chain is ergodic. The component of V at the state b1b2b3 works out to be the product of the three bit probabilities, p^{b1+b2+b3}(1−p)^{3−(b1+b2+b3)}. These are not surprising at all because of the independence assumption. Prof. Bob Li
Example of a 2-state Markov chain
Consider, for instance,
P =
( 1/2 1/2 )
( 0   1  )
for which P^n =
( 2^{−n} 1−2^{−n} )
( 0      1        )
and lim_{n→∞} P^n =
( 0 1 )
( 0 1 )
The Markov chain is not ergodic. Nevertheless, the limiting distribution V = (0 1) exists. Prof. Bob Li
Example of oscillation
State 0 ⇄ State 1, with
P =
( 0 1 )
( 1 0 )
There exist zero entries in P^t for every t. Hence the Markov chain is not ergodic, but rather periodic. In fact, there is no limiting distribution: let V0 = (1 0); then Vt alternates between (1 0) and (0 1). Nevertheless, there exists the eigenvector (1/2 1/2) of the transition matrix with the eigenvalue 1. Prof. Bob Li
Ergodicity, limiting distribution, eigenvector
Summary of theorems. For finite-state Markov chains,
Ergodicity // All entries in the matrix P^t are nonzero for some t ≥ 1.
⇒ The limiting distribution V exists. // regardless of the initial distribution V0
⇒ Eigenvalue 1 of the transition matrix with 1-dim eigenspace // Rank of the matrix P − I is full minus 1 (dim of eigenspace = 1). // Long-run distr. = unique normalized eigenvector with eigenvalue 1
⇒ Eigenvalue 1 of the transition matrix // The matrix P − I is singular: det(P − I) = 0.
Q. When there is no limiting distribution, there may or may not be the eigenvalue 1. What happens then? See the next example. Prof. Bob Li
A gambler wants to go home.
A gambler in Macau needs $3 to return to HK but has only $2. He gambles $1 at a time, and wins with probability p. The states are $0, $1, $2, $3, where $0 and $3 are absorbing; from $i (i = 1, 2), the chain moves to $i+1 w.p. p and to $i−1 w.p. 1−p.
The matrix P − I has rank 2 (< the full rank 4). There are two linearly independent eigenvectors of eigenvalue 1: (1 0 0 0) and (0 0 0 1). However, there is no limiting distribution. The probability of ending in the state $3 instead of $0 depends on the initial distribution V0. For the gambler, V0 = (0 0 1 0). Prof. Bob Li
Classification of states in a Markov chain
Definition. For the Markov chain X0, X1, X2, ...
The probability of recurrence at a state i is fi = P(Xt = i for some t > 0 | X0 = i) ≤ 1. The state i is said to be recurrent when fi = 1 and transient when fi < 1. // When recurrent, for sure the state will be revisited sooner or later.
The multiplicity of visiting state i, denoted by Mi (≤ ∞), is the number of indices t > 0 such that Xt = i.
Theorem. The conditional r.v. (Mi | X0 = i) is: deterministically equal to ∞ when fi = 1; Geometric0(1−fi) distributed when fi < 1 // That is, P(Mi = n | X0 = i) = fi^n (1−fi)
Intuitive proof. Think of every revisit to state i as either Head or Tail depending on whether it is the final visit. Thus P(Head) = 1−fi & P(Tail) = fi.
Classification of states (cont’d)
Computational proof. Assuming that fi < 1, we shall prove by induction on n that P(Mi = n | X0 = i) = (1−fi) fi^n.
P(Mi = 0 | X0 = i) = P(Xt ≠ i for all t > 0 | X0 = i) = 1−fi
P(Mi = 1 | X0 = i) = Σ_{t>0} {P(Mi = 1 | X0 = i and 1st recurrence at time t) P(1st recurrence at t | X0 = i)}
= Σ_{t>0} {P(no recurrence after time t | Xt = i) P(1st recurrence at time t | X0 = i)}
= Σ_{t>0} {P(Mi = 0 | X0 = i) P(1st recurrence at time t | X0 = i)} // Time-invariant
= (1−fi) Σ_{t>0} P(1st recurrence at time t | X0 = i) = (1−fi) fi
P(Mi = n | X0 = i) = Σ_{t>0} {P(Mi = n | X0 = i and 1st recurrence at time t) P(1st recurrence at t | X0 = i)}
= Σ_{t>0} {P(n−1 recurrences after time t | Xt = i) P(1st recurrence at time t | X0 = i)}
= Σ_{t>0} {P(Mi = n−1 | X0 = i) P(1st recurrence at time t | X0 = i)} // Induction on n
= (1−fi) fi^{n−1} Σ_{t>0} P(1st recurrence at time t | X0 = i)
= (1−fi) fi^n // Geometric0(1−fi) distribution
Classification of states (cont’d)
Corollary. E[Mi | X0 = i] = fi/(1−fi) < ∞ when fi < 1. // Mean of the Geometric0(1−fi) distribution
Corollary. E[Mi | X0 = i] = ∞ if and only if fi = 1.
Proposition. E[Mi | X0 = i] = Σ_{t≥1} (the iith entry in the matrix P^t)
Proof. Given i, let Yt be the characteristic r.v. of the event Xt = i. Thus Mi = Σ_{t≥1} Yt, and hence
E[Mi | X0 = i] = Σ_{t≥1} E[Yt | X0 = i] // Mean of sum is sum of means.
= Σ_{t≥1} P(Xt = i | X0 = i) = Σ_{t≥1} (the iith entry in the matrix P^t) // Chapman-Kolmogorov
Classification of states (cont’d)
Summary. fi = 1 ⟺ Σ_{t≥1} (the iith entry in the matrix P^t) = ∞ // Chapman-Kolmogorov
// The final condition can be used to verify the recurrence of the 1- or 2-dim symmetric random walk but not of the 3-dim one.
Corollary. In a finite-state Markov chain, there is at least one recurrent state. In an infinite-state Markov chain, however, it is possible that all states are transient. We shall see that this is the case for the 1-dim asymmetric free random walk. Prof. Bob Li
Examples
Example. In the 3-state weather model, all three states are recurrent.
Example. In the Markov chain for the prisoner in the dark cell, only the “freedom” state is recurrent. Prof. Bob Li
Lecture 3 Sep. 22, 2011 Prof. Bob Li
1-dim free random walk
At each game, the gambler wins $1 with probability p and loses $1 with probability q = 1−p. Let Xt be the total net winning after t games.
Theorem. In an asymmetric 1-dimensional free random walk, every state is transient. In the symmetric 1-dimensional free random walk, every state is recurrent and hence the process almost surely will return to the initial state within finite time. // “Almost surely” means “with probability 1.”
Proof. Next slide. Prof. Bob Li
1-dim free random walk
Proof. The t-step return probability P(Xt = i | X0 = i) is 0 for odd t and, when t = 2n, equals C(2n, n) p^n q^n.
// To go from i to i in exactly 2n steps, there must be exactly n steps to the right and n to the left.
Whether the state i is transient or recurrent hinges on the convergence of the series
Σ_{t≥1} P(Xt = i | X0 = i) = Σ_{n≥1} C(2n, n) p^n q^n // t = 2n
≈ Σ_{n≥1} (4pq)^n / √(πn) // Stirling's formula: C(2n, n) ~ 4^n/√(πn) as n → ∞
If p ≠ 1/2, then 4pq < 1 and the series converges: transient. If p = 1/2, then 4pq = 1 and the series diverges: recurrent. // By the “integral test”
1-dim symmetric random walk
Problem. Show that the symmetric random walk will almost surely reach the state to the right of the initial state.
Solution.
1 = P[will ever return to 0 | X0 = 0]
= (1/2) P[will ever return to 0 | X0 = 0 and X1 = 1] + (1/2) P[will ever return to 0 | X0 = 0 and X1 = −1]
// Conditioning on X1 under the a priori of X0 = 0
= (1/2) P[will ever return to 0 | X1 = 1] + (1/2) P[will ever return to 0 | X1 = −1]
// Memoryless property: only the newest information counts
Each conditional probability on the right is at most 1 and their average is 1, so both equal 1. In particular,
1 = P[will ever reach state 0 | X1 = −1] = P[will ever reach the state to the right of the initial state] // by symmetry
Prof. Bob Li
Positive recurrent states
Definition. For the Markov chain X0, X1, X2, ...
The waiting time Ti for state i is the smallest index t > 0 such that Xt = i. When there is no such index t, define Ti = ∞. The state i is positive recurrent if E[Ti | X0 = i] < ∞.
Theorem. Positive recurrent ⇒ recurrent
Proof. E[Ti | X0 = i] < ∞ ⇒ P(Ti = ∞ | X0 = i) = 0 ⇒ P(No recurrence | X0 = i) = 0 ⇒ State i is recurrent. Prof. Bob Li
State classification
Recurrent states subdivide into positive recurrent and non-positive recurrent ones, e.g., the symmetric 1-dim random walk, as we shall prove later; the remaining states are transient.
[Slide photo: Amah Rock (望夫石), Shatin, HK, near CUHK: the wife waiting an immortally long time for her 夫 (husband).] Prof. Bob Li
Positive recurrent states (cont’d)
Theorem. Recurrent states in a finite-state Markov chain are all positive recurrent. Proof. Omitted.
Remarks.
Fact. The 2-dim symmetric free random walk is also recurrent. It cannot be positive recurrent. Why?
Fact. The 3-dim symmetric free random walk turns out to be transient. The 4-dim symmetric free random walk is also transient. Why? Prof. Bob Li
Equivalence classes of states
Definition. State j is said to be accessible from state i if, starting from state i, the probability is nonzero for the Markov chain to ever enter state j, that is, the ijth entry in the matrix P^t is positive for some t ≥ 0. If two states i and j are accessible to each other, we say that they communicate.
// Mathematically, state communication is an “equivalence relation,” that is: reflexive, symmetric, and transitive.
Example. A 3-state Markov chain with only one equivalence class of states. Prof. Bob Li
Equivalence classes of states (cont’d)
Example. A Markov chain with four states in three equivalence classes: {0, 1}, {2} and {3}. Only state 2 is transient. Prof. Bob Li
Equivalence classes of states (cont’d)
Theorem. Recurrence/transience is a class property. // Thus it makes sense to mention a recurrent class or a transient class.
Proof. Let states i and j belong to an equivalence class. Write Pij^(k) for the ijth entry in P^k. Thus Pij^(k) > 0 and Pji^(m) > 0 for some k and m. Suppose that i is recurrent, i.e., Σ_t Pii^(t) = ∞. We shall prove that so is j:
Σ_t Pjj^(m+t+k) ≥ Pji^(m) (Σ_t Pii^(t)) Pij^(k) = ∞. Prof. Bob Li
Equivalence classes of states (cont’d)
Example. A Markov chain with five states in three equivalence classes:
{0, 1} is recurrent
{2, 3} is recurrent
{4} is transient Prof. Bob Li
(Skip) Equivalence classes of states (cont’d)
Definition. When there is only one equivalence class, the Markov chain is said to be irreducible.
Theorem. All states of a finite-state irreducible Markov chain are recurrent (and hence positive recurrent by an aforementioned theorem).
Proof. Not all states can be transient in a finite-state Markov chain.
Problem. Prove that a transient state cannot be accessed from a recurrent state. Prof. Bob Li
(Skip) Calculation of expected time & multiplicity
Example of Gambler’s Ruin. A gambler in Macau needs $7 to return to HK but has only $3. He starts to wager $1 at a time and wins with probability p = 0.4. The transition matrix of his fortune is as shown on the slide, with $0 and $7 absorbing. One way to calculate fij = P(Xt = j for some t > 0 | X0 = i) is by first considering the quantity
si,j = E[number of visits to j | start at i]
and by conditioning on whether the state j is ever entered. (See p.197 of [Ross, 6th ed.] for detail.) Below, we calculate si,j for all i, j = 1 to 6.
(Skip) Example of Gambler’s Ruin (cont’d)
Conditioning on the outcome of the initial play,
si,j = δi,j + p si+1,j + q si−1,j // The Kronecker delta: δi,j = 1 when i = j; else = 0
Denote by S the 6×6 matrix (sij), 1 ≤ i, j ≤ 6, and by T the 6×6 matrix of transition probabilities among the transient states $1, …, $6. Then,
S = I + TS, where I is the 6×6 identity matrix.
Thus, I = S − TS = (I − T)S, so S = (I − T)^{−1}.
As p = 0.4 and q = 0.6, we can explicitly calculate this inverse matrix and find, for instance, s3,5 = .9228 (the sketch below computes the full matrix). Prof. Bob Li
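A sketch (ours) of the matrix computation S = (I − T)^{-1}; it prints s3,2 and s3,5, the latter matching the slide’s 0.9228:

```python
import numpy as np

p, q = 0.4, 0.6
# T = transitions among the transient fortunes $1..$6 ($0 and $7 are absorbing).
T = np.zeros((6, 6))
for i in range(6):
    if i + 1 < 6:
        T[i][i + 1] = p
    if i - 1 >= 0:
        T[i][i - 1] = q
S = np.linalg.inv(np.eye(6) - T)   # S[i][j] = expected visits to $(j+1) from $(i+1)
print(round(S[2, 1], 4), round(S[2, 4], 4))   # s_{3,2} and s_{3,5} = 0.9228
```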
Calculation of expected time till freedom
Back to the prisoner in the dark cell. In the transition graph, 'freedom' is the only recurrent state, i.e., the terminal state.
Calculation of expected time till freedom
Method 1: Given the initial state = cell, let MT be the number of times the prisoner enters the round-trip tunnel, which is Geometric0(1/3) distributed with mean 2. Thus,
E[Waiting time for freedom | initial state = cell] = E[2MT + 3] = 2 E MT + 3 = 2·2 + 3 = 7
Method 2: Calculate from the Markov chain. For every state S, define the conditional r.v.
TS = (Waiting time for freedom | initial state = S)
and adopt the abbreviation eS = E TS. We obtain a matrix equation that amounts to conditioning on the outcome of the 1st step.
Calculation of expected time till freedom
In other words, we have the following system of linear equations:
ecell = 1 + (1/3) eF + (2/3) eT
eF = 1 + eR
eR = 1 // R is one hour from freedom.
eT = 1 + ecell
// Mean of sum is sum of means.
From these, we can solve for eS for all states S. // Even though we only wanted to compute ecell, the Markov-chain computation gives eS for all states S.
It turns out that eR = 1, eF = 2, ecell = 7, eT = 8. Prof. Bob Li
Example of pattern occurrence in coin tossing
Q. In fair-coin tossing, what is the expected waiting time for the pattern HTHT to occur consecutively? (It is not 16.)
Take the states to be the longest suffix of the toss sequence that is a prefix of HTHT: null, H, HT, HTH, and the terminal state HTHT. In the transition graph, a solid arrow means HEAD and a dotted one means TAIL; the probability of every arrow is 0.5. For every state S, define the conditional r.v. TS = (waiting time for HTHT | initial state = S) and write eS = E TS.
Example of pattern occurrence in coin tossing
Conditioning on the next toss gives // Mean of sum is sum of means.
enull = 1 + ½ eH + ½ enull
eH = 1 + ½ eH + ½ eHT
eHT = 1 + ½ eHTH + ½ enull
eHTH = 1 + ½ · 0 + ½ eH
Solving the simultaneous equations (with eHTHT = 0), it turns out that enull = 20. An alternative calculation of this quantity by martingale is instant. See: S.-Y. R. Li, “A martingale approach to the study of occurrence of sequence patterns in repeated experiments,” Annals of Probability, v.8, pp. 1171–1176, 1980.
Example of pattern occurrence in coin tossing
Q. What is the expected waiting time for THTT to occur consecutively?
For every state S, define eS = E[waiting time for THTT | initial state = S]. By the same method, it turns out that enull = 18. Prof. Bob Li
Example of pattern occurrence in coin tossing
Q. What is the expected waiting time for either HTHT or THTT, whichever occurs first?
Homework. For every state S, define eS = E[waiting time till HTHT or THTT | initial state = S]. In particular, eHTHT = eTHTT = 0; these are the two terminal states. Calculate enull. Prof. Bob Li
Probabilities toward multiple terminal states
Q. Which pattern between THTT and HTHT is more likely to occur first?
For every state S, define pS = P(HTHT eventually occurs first | initial state = S). In particular, pHTHT = 1 and pTHTT = 0. The above question becomes: pnull = ?
This question does not pertain to the “elapsed time” at any state. The state H eventually leads to HT and hence can be merged into the state HT. Similarly we also merge the state T into the state TH. [The merged transition graph appears on the slide.]
Probabilities toward multiple terminal states
The column vector of pS is an eigenvector of the transition matrix with eigenvalue 1. (Do not confuse this with the limiting distribution, which is a row eigenvector.) Prof. Bob Li
Random-walk gambler
A gambler in Macau needs $n for boat fare to go home but has only $i. He gambles $1 at a time and wins with probability p each time. The two terminal states, $0 and $n, are at the ends. Denote pi = P{The gambler will reach $n | starting at $i}. We are to calculate pi for all i.
Calculation by Markov chain
pi = p P{Will reach $n | start from $i; initial play wins} + (1−p) P{Will reach $n | start from $i; initial play loses}
// Conditioning on the outcome of the initial step again, except that we do not use the matrix notation this time.
= p P{Will reach $n | start from $i+1} + (1−p) P{Will reach $n | start from $i−1}
Thus, pi = p pi+1 + (1−p) pi−1. The problem becomes to solve this 2nd-order difference equation with the boundary conditions: p0 = 0 and pn = 1.
Calculation by Markov chain
pi = p pi+1 + (1−p) pi−1
p pi + (1−p) pi = p pi+1 + (1−p) pi−1
(1−p)(pi − pi−1) = p(pi+1 − pi)
Write xi = pi+1 − pi for 0 ≤ i < n. Then (1−p) xi−1 = p xi. This is only a 1st-order difference equation. Thus {xi}, 0 ≤ i < n, form a geometric progression: xi = ((1−p)/p)^i x0. Next, we shall make use of the two boundary conditions.
Calculation by Markov chain
Write r = (1−p)/p. Since p0 = 0, we have x0 = p1 − p0 = p1, and hence xi = r^i p1. Telescoping,
pi = p0 + x0 + x1 + … + xi−1 = [1 + r + … + r^{i−1}] p1 // Telescope; p0 = 0
Thus, pi = [1 + r + … + r^{i−1}] p1.
Calculation by Markov chain
It remains to calculate p1 from the boundary condition pn = 1. Taking i = n in particular,
1 = pn = [1 + r + … + r^{n−1}] p1, so p1 = 1/[1 + r + … + r^{n−1}].
Therefore pi = (1 − r^i)/(1 − r^n) when p ≠ ½, and pi = i/n when p = ½.
Later, a martingale will turn this long calculation into an instant one.
Gambler’s Ruin
Example. A gambler in Macau starts with $900, needs $1000 to return to HK, gambles $1 at a time on the casino roulette, and wins with probability p = 18/38 at each bet. (Thus, n = 1000 and i = 900.) Then, r = (1−p)/p = 10/9 and
p900 = (1 − (10/9)^900)/(1 − (10/9)^1000) ≈ (10/9)^{−100} = 2.656×10^{−5}
With probability ≈ 1 − 2.656×10^{−5}, the gambler is doomed. Prof. Bob Li
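A sketch (ours) evaluating the closed form, plus a Monte Carlo sanity check on a small instance:

```python
import random

def p_reach(i, n, p):
    """P{random-walk gambler starting at $i reaches $n before $0}."""
    r = (1 - p) / p
    return i / n if p == 0.5 else (1 - r**i) / (1 - r**n)

print(p_reach(900, 1000, 18/38))   # ~2.66e-05, as on the slide

# Monte Carlo check on a small instance (i = 3, n = 7, p = 0.4):
rng, wins, trials = random.Random(2), 0, 100000
for _ in range(trials):
    x = 3
    while 0 < x < 7:
        x += 1 if rng.random() < 0.4 else -1
    wins += (x == 7)
print(wins / trials, p_reach(3, 7, 0.4))   # both ~0.148
```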
Remark. Optimal strategy over random walk
Assumptions. The gambler in Macau needs $N to go home but has only $i. At the casino he can bet any integer amount at a time, and the probability of winning is 18/38. Since this is not a fair gamble to him, the process is called a “super-martingale,” while the “martingale” concept formalizes fair gamble.
Strategy to optimize the chance of going home: Whenever he has $N/2 or less, bet all. Whenever he has more than $N/2, bet the amount that he is short of.
Intuitive reasoning: Because of the disadvantage, he is better off keeping the gambling process as short as possible. The formal proof of the optimality of the said strategy is beyond the scope of this course. Prof. Bob Li
(Skip) Simplex algorithm in linear programming
Minimize c^t x // c^t = (c1 … cn), x^t = (x1 … xn)
subject to A x = b // A is an m×n matrix of rank m, b^t = (b1 … bm), and x ≥ 0.
The optimal value of x is at an extreme point of the feasibility region, a polytope in the n-dim space, and hence has at least n−m components equal to 0. There are at most N = C(n, m) extreme points. Label them from 1 to N by the increasing value of the objective function c^t x. The simplex method always moves from an extreme point to a better extreme point.
Efficiency of simplex algorithm in linear programming
The simplex method always moves from an extreme point to a better extreme point.
Assumption for the convenience of performance analysis: When the algorithm is at the Lth best extreme point, the next extreme point will be equally likely to be any one among the (L−1)st, (L−2)nd, … , 1st. Thus, the transition probabilities are PL,j = 1/(L−1) for j = 1, …, L−1. Prof. Bob Li
Efficiency of simplex algorithm in linear programming (cont’d)
Let Ti be the number of transitions to go from state i to state 1. In particular, T1 = 0. Conditioning on the first transition,
E[Ti] = 1 + (1/(i−1)) Σ_{j=1}^{i−1} E[Tj]
Efficiency of simplex algorithm in linear programming (cont’d)
Multiplying by i−1: (i−1) E[Ti] = (i−1) + Σ_{j=1}^{i−1} E[Tj]. Shifting the index: i E[Ti+1] = i + Σ_{j=1}^{i} E[Tj]. Take the difference between the last two equations:
i E[Ti+1] − (i−1) E[Ti] = 1 + E[Ti], i.e., E[Ti+1] = E[Ti] + 1/i
Hence E[TN] = Σ_{j=1}^{N−1} 1/j ≈ ln N. // Harmonic-sum approximation; ln N itself is then approximated by Stirling’s formula on the next slide.
Efficiency of simplex algorithm in linear programming (cont’d)
Let c = n/m. By Stirling’s formula, ln N = ln C(n, m) ≈ m[c ln c − (c−1) ln(c−1)]. Therefore, under the said assumption, the number of simplex-method iterations from the worst start is only about m[c ln c − (c−1) ln(c−1)].
Numerical example. If n = 8000 and m = 1000, then c = 8, and E[TN] ≈ 1000 (8 ln 8 − 7 ln 7) ≈ 3014 iterations.
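A quick computation of the estimate (ours); both the exact ln C(n, m) via log-gamma and the leading-order c-formula land near three thousand:

```python
from math import lgamma, log

def ln_binomial(n, m):
    """ln C(n, m) via log-gamma."""
    return lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)

# E[T_N] ~ ln N with N = C(n, m) extreme points:
print(round(ln_binomial(8000, 1000)))   # ~3010
c = 8.0
print(round(1000 * (c * log(c) - (c - 1) * log(c - 1))))   # ~3014, leading-order formula
```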
1.4 Time-reversed Markov chain
Prof. Bob Li
Reverse transition probability
Let Pij denote the transition probabilities and πi the stationary probabilities of an ergodic Markov chain. Assume that the Markov chain has been in operation for a long time, and consider the reverse process Xn, Xn−1, Xn−2, …, where n is large. Define the reverse transition probability Qij = P(Xm = j | Xm+1 = i). At a time m when the chain is in the stationary state,
Qij = P(Xm = j, Xm+1 = i) / P(Xm+1 = i) = P(Xm = j) P(Xm+1 = i | Xm = j) / P(Xm+1 = i) = πj Pji / πi. Prof. Bob Li
Mnemonic interpretation of πj Pji = πi Qij
πj Pji = stationary rate at which the chain flows from j to i
πi Qij = stationary rate at which the chain flows backward from i to j
Prof. Bob Li
Time-reversed Markov chain
Theorem. The reverse process Xm, Xm−1, Xm−2, …, where m is large, is by itself a Markov chain with the transition probabilities Qij = πj Pji / πi. Prof. Bob Li
The formula Qij = πj Pji / πi calculates:
Reverse transition matrix [Qij] ← forward transition matrix [Pij] + limiting distribution (… πi …)
The proposition below offers a way to calculate, by guessing and verifying:
Limiting distribution (… πi …) + reverse transition matrix [Qij] ← forward transition matrix [Pij]
Proposition 1. For an irreducible stationary ergodic Markov chain with the transition probabilities Pij, if one can find positive numbers πi and Qij meeting the equations:
// “Irreducible” = can reach any state i from any state j.
(a) πi Qij = πj Pji for all i, j
(b) Σi πi = 1 // Normalizer
(c) Σi Qji = 1 for each j // Normalizer
then πi are the limiting probabilities and Qij are the transition probabilities of the reversed chain.
Proof. Summing both sides of (a) over j and invoking (c), πi = πi Σj Qij = Σj πj Pji. In the matrix form, (… πi …) [Pij] = (… πi …). That is, the row vector (… πi …) is an eigenvector of the transition matrix [Pij] with the eigenvalue 1. It must be the limiting distribution because of (b). This, together with (c), shows that Qij are the transition probabilities of the reversed Markov chain.
Example of the light bulb at the lighthouse
When the light bulb at a very old lighthouse fails during day n, it is replaced at the beginning of day n+1. Define the integer random variables:
L = lifespan of a light bulb (rounded up to an integer) // Distribution of L is given by the manufacturer.
Xn = age of the bulb at the end of day n
Thus, {Xn, n = 1, 2, 3, …} is a Markov chain. // e.g., {Xn, n = 1, 2, 3, …} = {1, 2, 3, 1, 1, …}
The transition probabilities are Pi,1 = P(L = i | L ≥ i) and Pi,i+1 = P(L > i | L ≥ i), with Pij = 0 otherwise.
In the long run, the transition probabilities of the reverse chain are:
When j > 1, clearly Qj,j−1 = 1.
For all i ≥ 1,
Q1,i = P{Age of the light bulb yesterday was i | light bulb today is a new one}
= P{Age of the light bulb yesterday was i | a light bulb failed yesterday}
= P{Age of a light bulb on its day of failing is i} // Yesterday was also in the limiting distribution.
= P{Lifespan of a light bulb is i} = P{L = i} // Distribution of L is given.
By the above Proposition 1, we shall solve for {πi} from the equations (a) and (b) below; then {πi} is the limiting distribution:
(a) πi Qij = πj Pji for all i, j // Need this for j = 1 and j ≠ 1
(b) Σi πi = 1 // Normalizer
(c) Σi Qji = 1 for each j // Clearly true
Taking j = 1 in (a): π1 Q1,i = πi Pi,1, that is, π1 P(L = i) = πi P(L = i)/P(L ≥ i), so πi = π1 P(L ≥ i) for all i ≥ 1. From (b),
1 = Σi πi = π1 Σi P(L ≥ i) = π1 E[L], hence π1 = 1/E[L] and πi = P(L ≥ i)/E[L].
Thus, part of (a) together with (b) uniquely determines {πi}. We have yet to verify the remaining part of (a), that is, for all j > 1 (and all i ≥ 1). Since Qj,j−1 = 1, it suffices to verify just for j = i+1 > 1, that is, to show πi Pi,i+1 = πi+1. Indeed,
πi Pi,i+1 = [P(L ≥ i)/E L] · [P(L > i)/P(L ≥ i)] = P(L ≥ i+1)/E L = πi+1.
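A numeric check (ours) of the age chain: the lifespan distribution below is an arbitrary assumption for illustration, and πi = P(L ≥ i)/E[L] is verified to be stationary:

```python
import numpy as np

# A sample lifespan distribution (assumption for illustration): P(L = i), i = 1..4
L = np.array([0.2, 0.3, 0.3, 0.2])
tail = np.array([L[i:].sum() for i in range(4)])   # P(L >= i+1)

# Forward chain on ages 1..4: P_{i,1} = P(L = i | L >= i), P_{i,i+1} = P(L > i | L >= i)
P = np.zeros((4, 4))
for i in range(4):
    P[i, 0] = L[i] / tail[i]
    if i + 1 < 4:
        P[i, i + 1] = 1 - P[i, 0]

pi = tail / tail.sum()            # claimed limiting distribution P(L >= i) / E[L]
print(np.allclose(pi @ P, pi))    # True: pi is stationary
```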
Time reversibility
Definition. A stationary ergodic Markov chain is said to be time reversible if Pij = Qij for all i and j // Recall that πi Qij = πj Pji.
or, equivalently, πi Pij = πj Pji for all i and j.
// The stationary rate at which the chain flows from i to j is equal to the stationary rate at which it flows from j to i.
An analogy of the stationary rate of transition is the annual trade volume from a nation to another. The limiting distribution is reached when there is overall trade balance for every nation. A time-reversible Markov chain is like trade balance between every two nations. This is a stronger statement. Prof. Bob Li
Proposition 2. The existence of nonnegative numbers πi summing to 1 such that πi Pij = πj Pji for all i and j makes the Markov chain time reversible. Moreover, such numbers πi represent the limiting probabilities.
Proof // The proof is similar to, and simpler than, that of Proposition 1.
Summing both sides of the equation πi Pij = πj Pji over i,
Σi πi Pij = πj Σi Pji = πj // Σi Pji = 1 = the total probability
That is, the row vector (… πi …) is an eigenvector of the transition matrix [Pij] with the eigenvalue 1. Hence it is the limiting distribution. Prof. Bob Li
Random walk between two blocking states
Consider a random walk on the states 0, 1, …, M, where from state i the walk steps up with probability αi and down with probability 1−αi (the two ends blocking).
The fact that {πj} is the limiting distribution corresponds to the balance of the influx to state i with the outflow from it; in other words, {πj} is a (row) eigenvector of the transition matrix with eigenvalue 1. Because the transition graph is a single string of states, the influx to the state group {0, 1, …, i−1} balances with the outflow from it. In other words, the flow from i−1 to i and the flow from i to i−1 are in equilibrium, as embodied by the simple cut between the two groups. Thus, the process is time reversible. Prof. Bob Li
Random walk between two blocking states
The flow from i−1 to i and the flow from i to i−1 are in equilibrium:
πi−1 αi−1 = πi (1−αi)
This is equivalent to the eigenvector statement and is mathematically simpler. We now calculate πj from it. This is only a 1st-order difference equation, and the boundary condition is the equation of normalization: Σi πi = 1.
Taking the product of the above equalities, we can express all πi in terms of π0:
πi = π0 · [α0 α1 … αi−1] / [(1−α1)(1−α2) … (1−αi)]
Next, calculate π0 from the boundary condition Σi πi = 1 … (Homework) Prof. Bob Li
Lecture 4 Sep. 29 & Oct. 6, 2011 Prof. Bob Li
Time-reversibility of random walk between 0 and M
The special case when αi = α for all i: the product formula gives πi = π0 [α/(1−α)]^i. Prof. Bob Li
Time-reversibility of random walk between 0 and M
The special case when αi = ½ for all i // Here α/(1−α) = 1, so all πi are equal. Prof. Bob Li
Time-reversibility of random walk between 0 and M
In the special case when αi = ½ for all i, all πi are equal; hence πi = 1/(M+1) for all i.
// Oscillation between 0 and M leads to the uniform stationary distribution.
Intuitive interpretation. Consider a random walk around a cycle of the states 0, 1, 2, …, M, in which every line indicates a bidirectional transition. Prof. Bob Li
Another way of oscillation
When αi = ½ for all i, except that α0 = 1 and αM = 0, it turns out that πi = 1/M for 0 < i < M and π0 = πM = 1/(2M).
Intuitive interpretation. Consider a random walk around the cycle 0, 1, 2, …, M−1, M, M−1, …, 2, 1, in which the two end states appear once and every interior state appears twice. Prof. Bob Li
Time-reversibility of random walk: a two-urn model
The special case of a two-urn model: M molecules are distributed between two urns. At each transition, one of the M molecules is chosen at random for relocation from its urn to the other. Prof. Bob Li
Model the number of molecules in one urn as a random walk between 0 and M with αi = (M−i)/M. Hence, the balance equation πi (M−i)/M = πi+1 (i+1)/M gives πi = C(M, i) π0. Applying the equation Σi πi = 1, we get π0 = 2^{−M}, so
πi = C(M, i) (1/2)^M
Conclusion. In the limiting distribution, it is as if all M molecules are distributed to the two urns randomly and independently.
Homework: Give an intuitive interpretation of this conclusion. Prof. Bob Li
Traveling around a weighted graph
States of the Markov chain = nodes in a graph. Every edge (i, j) in the graph is associated with a weight wij = wji ≥ 0. Transition probabilities: Pij = wij / Σk wik // Weight-proportional transition
We want to find πi such that Σi πi = 1 and πi Pij = πj Pji, because of Proposition 2:
πi wij / Σk wik = πj wji / Σk wjk
⟺ πi / Σk wik = πj / Σk wjk // wij = wji
⟺ πi / Σk wik = c for some constant c independent of i
1 = Σi πi = c Σi Σk wik // Calculate c by normalization.
⟹ c = 1 / Σi Σk wik and πi = Σk wik / Σi Σk wik Prof. Bob Li
Traveling around a weighted graph
Interpretation of the formula πi = Σk wik / Σi Σk wik. Approximating by rational numbers, we may assume that the wij are all integers. Replace each edge (i, j) with wij non-weighted edges. Then, a transition from a node i means moving through a randomly selected outgoing edge. The formula says that, regardless of the network topology, the limiting frequency of visiting a node is proportional to the quantity of its adjacent edges. Prof. Bob Li
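A numeric check (ours) on an arbitrary small weighted graph: πi = Σk wik / Σi Σk wik is stationary and satisfies detailed balance:

```python
import numpy as np

# A small symmetric weight matrix (the values are an illustrative assumption).
w = np.array([[0, 2, 1, 0],
              [2, 0, 3, 1],
              [1, 3, 0, 2],
              [0, 1, 2, 0]], dtype=float)   # w_ij = w_ji

P = w / w.sum(axis=1, keepdims=True)        # P_ij = w_ij / sum_k w_ik
pi = w.sum(axis=1) / w.sum()                # claimed: pi_i proportional to adjacent weight
print(np.allclose(pi @ P, pi))                            # stationary: True
flow = pi[:, None] * P                                    # flow_ij = pi_i P_ij
print(np.allclose(flow, flow.T))                          # detailed balance: True
```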
1.5 Stopping time
Prof. Bob Li
Theorem 1. Let X1, X2, ... , Xn, ... be i.i.d. and N ≥ 0 be a discrete random variable that is independent of all Xn. Then,
E[Σ_{n=1}^{N} Xn] = EX1 · EN
// Think of N as the number of customers to a shop and Xn as the amount spent by the nth customer.
Proof.
E[Σ_{n=1}^{N} Xn] = Σk P(N = k) E[Σ_{n=1}^{k} Xn | N = k] // Conditioning on N
= Σk P(N = k) E[Σ_{n=1}^{k} Xn] // N is independent of all Xn.
= Σk P(N = k) · k EX1 // Mean of sum is sum of means.
= EX1 EN Prof. Bob Li
1.5.1. Stopping time and Wald’s equation
Q. Roll a die repeatedly until the side “4” shows up. How many rolls does it take on the average?
Calculative solution. Let N ≥ 1 represent the waiting time for “4”. P(N = n) = (5/6)^{n−1}(1/6) // Geometric1(1/6)
Hence EN = 6.
Intuitive solution. Let Xn = 1 or 0 depending on the event for the nth roll to show “4”. // Xn is called the characteristic r.v. of the event.
Thus EXn = P{Xn = 1} = 1/6. We speculate that EN is the reciprocal of this probability 1/6. If Theorem 1 in the above could apply, then we could justify the intuition by
1 = E[Σ_{n=1}^{N} Xn] = EN · EX1 = EN / 6
However, Theorem 1 requires the independence of N from all Xn, which unfortunately is not the case. So, what to do? We shall strengthen Theorem 1 by relaxing the independence condition in it.
Stopping time
Definition. Consider a sequence X1, X2, ... , Xn, ... of random variables. A nonnegative integer random variable N is called a stopping time (or stopping rule) of this sequence if, for every n, the event {N ≥ n} is independent of the random variable Xn.
// For all n and all x, the events {N ≥ n} and {Xn ≤ x} are independent.
// Equivalently, the characteristic r.v. of the event {N ≥ n} is indep. of Xn.
Xn represents the outcome of the nth experiment. Right before the nth experiment, you need to decide whether to stop without knowing how Xn will turn out. The decision, though, may depend on the past knowledge of X1, X2, ... , Xn−1. Hence, the stopping time N is not necessarily independent of the process X1, X2, ... , Xn−1, but it abides by causality in life.
Examples of stopping times
Roll a die until the side “4” appears.
Gamble $1 at a time at a casino until midnight or losing all money.
Gamble $1 at a time at a casino until there is only $k left.
Examples of non-stopping times
Gamble $1 at a time at a casino until right before losing the last $1.
Gamble $1 at a time at a casino until right before my luck turns bad. Prof. Bob Li
Wald’s equation
Theorem (Wald's equation). Let X1, X2, ... , Xn, ... be i.i.d. with a stopping time N. If EN < ∞, then
E[Σ_{n=1}^{N} Xn] = EX1 · EN
Proof. Define Yn = the characteristic r.v. of the event {N ≥ n}. Observe two things:
The stopping time means the independence between Yn and Xn.
XnYn = Xn when N ≥ n, and = 0 otherwise.
Hence we can remove the randomness in the upper bound of the summation: Σ_{n=1}^{N} Xn = Σ_{n=1}^{∞} XnYn.
Abraham Wald
Wald’s equation (cont’d)
Thus,
E[Σ_{n=1}^{N} Xn] = E[Σ_{n=1}^{∞} XnYn] = Σ_{n=1}^{∞} E[XnYn] // Mean of sum is sum of means.
= Σ_{n=1}^{∞} E[Xn] E[Yn] // Xn and Yn are indep.
= EX1 Σ_{n=1}^{∞} E[Yn] // Identical distribution
= EX1 Σ_{n=1}^{∞} P(N ≥ n) // Yn = characteristic r.v. of the event {N ≥ n}.
= EX1 EN Prof. Bob Li
Example. Gambler’s ruin on casino roulette
Q. A gambler starts with $100 and intends to keep betting $1 on ODD at the roulette until he loses all. How long does this process take on the average?
Answer. Let Xn be the net winning on the nth bet. Then, X1, X2, ... , Xn, ... are i.i.d. with Xn = +1 w.p. 18/38 and −1 w.p. 20/38. Thus, EXn = P(Xn = 1) − P(Xn = −1) = −1/19. Let the random variable N represent the number of bets till losing all. Clearly, N is a stopping time for the process X1, X2, ... , Xn, ... By Wald’s equation,
−100 = E[Σ_{n=1}^{N} Xn] = EX1 · EN = −EN/19
Hence EN = 1900 < ∞. Prof. Bob Li
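A Monte Carlo check (ours) of Wald’s prediction EN = 1900:

```python
import random

rng, trials, total = random.Random(3), 2000, 0
for _ in range(trials):
    fortune, n = 100, 0
    while fortune > 0:
        fortune += 1 if rng.random() < 18/38 else -1
        n += 1
    total += n
print(total / trials)   # ~1900, as Wald's equation predicts
```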
Example of a non-positive recurrent Markov chain
Theorem. The symmetric 1-dim free random walk is not positive recurrent. // It was shown to be recurrent.
Proof. Let Yn be the winning on the nth game in fair gamble. This defines i.i.d. The symmetric 1-dim free random walk is represented by the Markov chain X0, X1, X2, ..., where Xn = Σ_{j≤n} Yj. Let T01 be the waiting time for state 1 given X0 = 0. Clearly T01 is a stopping time for the process Y1, Y2, … Claim that ET01 = ∞. If ET01 < ∞, Wald's equation would have yielded the following contradiction:
1 = E[Σ_{n=1}^{T01} Yn] = ET01 · EY1 = ET01 · 0 = 0
Thus ET01 must be ∞. Similarly, ET10 = ∞ = ET−1,0. Conditioning on the first move, we have ET00 = 1 + ET10/2 + ET−1,0/2 = ∞.
Revisiting the board game
Advance the token by rolling a die. Let pn = P{ever landing on square #n}. Previously, we found that p6 ≥ pn for all n.
Q. limn→∞ pn = ?
Intuitive solution. As n gets very large, the value of pn should have little correlation with n. Thus, limn→∞ pn exists. The average value per roll of the die is 7/2. So, 2 out of every 7 squares should be landed on. Therefore limn→∞ pn = 2/7. Prof. Bob Li
To articulate the intuition in rigor
Denote
Xn = outcome of the nth roll
Yn = characteristic r.v. of the event {ever landing on n}
pn = P{Yn = 1} = EYn
Tk = number of rolls until the square k is passed = Y1 + Y2 + … + Yk + 1 // Count the landed squares among 1..k, plus the final roll that passes k.
Then k+1 ≤ X1 + X2 + … + XTk ≤ k+6. // Count both landed and passed squares: the final position is beyond k by at most 6.
Clearly Tk is a stopping time for the process X1, X2, … , Xn, … From Wald's equation,
E[X1 + X2 + … + XTk] = ETk · EX1 = ETk · 7/2
k+1 ≤ ETk · 7/2 ≤ k+6
2k/7 + 2/7 ≤ ETk ≤ 2k/7 + 12/7
Since ETk = EY1 + … + EYk + 1 = p1 + … + pk + 1,
2k/7 + 2/7 ≤ p1 + p2 + … + pk + 1 ≤ 2k/7 + 12/7
To articulate the intuition in rigor
2k/7 + 2/7 ≤ p1 + p2 + … + pk + 1 ≤ 2k/7 + 12/7
2/7 − 5/(7k) ≤ (p1 + p2 + … + pk)/k ≤ 2/7 + 5/(7k)
Hence, granting that limn→∞ pn exists, limn→∞ pn = limk→∞ (p1 + p2 + … + pk)/k = 2/7.
Homework. If we roll two dice instead of just one, show that limn→∞ pn = 1/7.
Homework. Define Wk as the waiting time until the square k is landed on or passed. Show that both Tk and Wk are stopping times for the process X1, X2, … , Xn, … Clearly Tk = Y1 + Y2 + … + Yk + 1. How does Wk relate to Y1 + Y2 + … + Yk?
1.6 Martingales
Prof. Bob Li
Preliminaries on independent random variables
Theorem. When X and Y are independent r.v., E[XY] = EX · EY, Cov(X,Y) = 0, and hence Var(X+Y) = Var(X) + Var(Y).
Proof.
E[XY] = Σx Σy xy P(X = x) P(Y = y) // By independence
= (Σx x P(X = x)) (Σy y P(Y = y)) = EX · EY
Cov(X,Y) = E[(X−EX)(Y−EY)] = E[XY] − E[X(EY)] − E[(EX)Y] + EX·EY
= E[XY] − (EY)EX − (EX)EY + EX·EY = EX·EY − EX·EY = 0
Var(X1+X2+...+Xn) = n Var(X1)
Corollary. Let X1, X2, ... , Xn be i.i.d. Then,
Var(X1+X2+...+Xn) = Var(X1) + Var(X2) + ... + Var(Xn) // By independence
= n Var(X1) // Identical distribution
Contrast: Var(nX) = n² Var(X) // The contrast leads to the Central Limit Theorem, the Law of Large Numbers, the Chebyshev inequality, etc.
Proof. Var(nX) = E[(nX)²] − (E[nX])² = E[n² X²] − (n EX)² = n² E[X²] − n² (EX)² = n² Var(X) Prof. Bob Li
Martingale
Definition. A process X1, X2, ... , Xn, ... is called a martingale if, for all k, E|Xk| < ∞ and
E[Xk+1 | Xk = xk, … , X1 = x1] = xk // This is sometimes abbreviated as E[Xk+1 | Xk, … , X1] = Xk
Example. Xk = net cumulative winning after k games in fair gamble.
Counterexample. When you gamble on casino roulette,
E[Xk+1 | Xk = xk, … , X1 = x1] < xk
The process X1, X2, ... , Xn, ... is then called a “super-martingale,” a term coined by J. Doob ironically. Prof. Bob Li
Martingale = fair gamble
Intuitive definition. A martingale is a stochastic process X0, X1, X2, ... with Xk = cumulative net winning after k games in fair gamble.
Rigorous definition. A martingale is a stochastic process X0, X1, X2, ... such that E|Xk| < ∞ and E[Xk+1 | Xk, Xk−1, … , X0] = Xk for all k. Prof. Bob Li
History of martingale
The word martingale came from medieval French. It was first related to the concept of probability through a betting system in the 18th century. Paul Lévy (1886~1971) formulated the mathematical concept. Prof. Bob Li
Martingale Stopping Theorem
The intuitive fact of E[Net gain in fair gamble upon stopping] = 0 is rigorously formulated as:
Martingale Stopping Theorem (Joseph Doob). Let X0 = 0, X1, ... , Xn, ... be a martingale and N a stopping time. If E|XN| < ∞ and lim inf_{n→∞} E[|Xn| · 1{N > n}] = 0, then EXN = 0.
Proof. Omitted. Prof. Bob Li
An artificial casino of fair gamble
A gambler brings in $1 and bets on Head in fair-coin tossing. Upon winning every time, he doubles his fortune and then parlays it on the next coin toss. The process continues until he loses. Thus the stopping time N is Geometric1(1/2) distributed. Upon stopping, the net winning of the gambler is −$1, which differs from the initial net winning of $0. Therefore, the Martingale Stopping Theorem does not apply here. This example shows that the “lim inf condition” in the Martingale Stopping Theorem is not superfluous.
Gambler’s random walk
A gambler in Macau needs $n for boat fare to go home but has only $i. So he gambles $1 at a time and wins with probability p each time. Denote pi = P{Will reach $n | starting at $i}. Recall the long Markov-chain calculation for
pi = (1 − r^i)/(1 − r^n), where r = (1−p)/p.
We shall derive this with almost no calculation at all.
Instant calculation by martingale
The random walk is not a martingale (unless p = ½). Labels of states are in an arithmetic progression. Deploy a geometric progression instead, with the common ratio r = q/p = the odds.
Instant calculation by martingale
The random walk is not a martingale. Labels of states are in an arithmetic progression. Deploy a geometric progression instead with the common ratio r = q/p = the odds. By symmetry, we shall assume that p < q. Relabel the state $k as $r^k. From $r^k to $r^{k−1}, the loss is $(r^k − r^{k−1}); from $r^k to $r^{k+1}, the gain is $(r^{k+1} − r^k) = $r(r^k − r^{k−1}). The expected change per step is p·r(r^k − r^{k−1}) − q(r^k − r^{k−1}) = (pr − q)(r^k − r^{k−1}) = 0, since r = q/p. The random walk becomes a martingale X0 = r^i, X1, ..., Xn, ...
Instant calculation by martingale
The time T till the gambler reaches either end is a stopping time. From the Martingale Stopping Theorem,
r^i = EXT = pi r^n + (1 − pi) r^0 = pi r^n + (1 − pi)
Hence pi = (r^i − 1)/(r^n − 1). // Same solution as before.
Prof. Bob Li
Possible applications of this martingale
r = q/p = the odds
Financial engineering. Take r in this martingale to be the rate of interest, inflation, depreciation, …
Information engineering. r can be, for instance, the attenuation of signal strength. Prof. Bob Li
Expected waiting time
We have solved pi = (1 − r^i)/(1 − r^n) by Markov chain or by martingale.
Let W = waiting time from state i till reaching either end = a stopping time.
Q: EW = ? Below we give a solution by a different martingale. Prof. Bob Li
Expected waiting time
If p < q, there is an average loss of $(q−p) per step. If p > q, there is an average gain of $(p−q) per step. By symmetry, we shall assume the former case. Prof. Bob Li
Expected waiting time EW
The average loss per step is $(q−p). To convert the random walk into a martingale, one way is to compensate the gambler by the amount $(q−p) per bet; the compensated fortune Xt + (q−p)t is then a martingale. By the Martingale Stopping Theorem, the expected net gain upon stopping is 0:
E[XW + (q−p)W] = i
pi n + (q−p) EW = i
Thus, EW = (i − pi n)/(q−p).
Homework. In the NBA championship, a “best 4 out of 7” series is played between Miami and Dallas. Assume that the result of every game will be a toss-up. A friendly bookie accepts bets at fair odds on every single game. Starting with an initial capital of $10,000, your mission is to place your bets and eventually double the money if Miami wins the championship. Show that there is one and only one way to accomplish this mission. Hint. Formulate the problem rigorously by the martingale concept. Think of the expected value after each game. Prof. Bob Li
Non-binary non-uniform pattern
Roll a special die repeatedly, where P{a} = 1/2, P{b} = 1/3, P{c} = 1/6. The average waiting time for the pattern aba = ? How is this calculated and why so?
See: S.-Y. R. Li, “A martingale approach to the study of occurrence of sequence patterns in repeated experiments,” Annals of Probability, v.8, pp. 1171–1176, 1980.
How to calculate the average waiting time?
Formula. The average waiting time for the pattern aba is
ε1 · 1/P{a} + ε2 · 1/(P{a}P{b}) + ε3 · 1/(P{a}P{b}P{a})
Calculating the 3 bits by comparing the last j letters of aba with its first j letters:
ε1 = 1 // a = a
ε2 = 0 // ba ≠ ab
ε3 = 1 // aba = aba
How to calculate the average waiting time?
Formula. With ε1 = 1, ε2 = 0, ε3 = 1, the average waiting time for the pattern aba is
1/P{a} + 0 + 1/(P{a}P{b}P{a}) = 2 + 12 = 14
// The two nonzero terms are the $2 worth and the $12 worth in the casino proof below.
Proof by an artificial casino
A gambler arrives with $1 and bets on the pattern aba by rolling the special die:
If a appears on the 1st roll, he receives $2 in total. Then he parlays it on the 2nd roll.
If b appears, he receives $6 in total. Then he parlays it on the 3rd roll.
If a appears again, he receives $12 in total. Game over.
This is fair gamble! // Each payoff is at fair odds: 2 = 1/P{a}, 6 = 2/P{b}, 12 = 6/P{a}.
Gambling team enters the artificial casino
Before every roll, a new gambler joins in with $1 and bets on the pattern aba, parlaying as above.
For instance, after Y1 = a, the 1st gambler holds $2. If then Y2 = c, every gambler so far holds $0, while a new gambler keeps arriving with $1 before each roll.
Suppose the rolls are Y1 = a, Y2 = c, Y3 = b, Y4 = a, Y5 = a, Y6 = b, Y7 = a, so that the pattern aba first occurs at roll N = 7. At that moment, the gambler who entered at roll 5 holds $12, the gambler who entered at roll 7 holds $2, and all others hold $0.
Total receipt of the team = $14, regardless of the outcome of die rolling. Net gain of the team upon stopping = $(14 − N), where N is the random variable representing the waiting time for aba. Because this is fair gamble,
0 = E[Net gain upon stopping] = 14 − EN
EN = 14
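A simulation (ours) of the waiting time for aba under the special die confirms EN = 14:

```python
import random

def waiting_time_aba(rng):
    """Roll the special die (P{a}=1/2, P{b}=1/3, P{c}=1/6) until 'aba' appears."""
    window, n = "", 0
    while not window.endswith("aba"):
        window = (window + rng.choices("abc", weights=[3, 2, 1])[0])[-3:]
        n += 1
    return n

rng = random.Random(4)
trials = 40000
print(sum(waiting_time_aba(rng) for _ in range(trials)) / trials)   # ~14
```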
Average waiting time for a pattern
Formula. The average waiting time for the pattern aba is
1 · 1/P{a} + 0 · 1/(P{a}P{b}) + 1 · 1/(P{a}P{b}P{a}) = 2 + 12 = 14
// The 3rd-last gambler has $12 in the end; the last gambler has $2 in the end.
Lecture 5 Oct. 13, 2011 Prof. Bob Li
Generalizing the example
Formula. For a pattern B = b1b2…bm, calculate the bits εj as before: εj = 1 if the last j letters of B equal the first j letters of B, else εj = 0. Then, the average waiting time for B is
Σ_{j=1}^{m} εj / (P{Y=b1} P{Y=b2} ⋯ P{Y=bj}) Prof. Bob Li
The correlation operator
For patterns A and B, define the correlation
A⋆B = Σj εj / (P{Y=b1} ⋯ P{Y=bj}), where εj = 1 if the last j letters of A equal the first j letters of B, else 0.
Example. The average waiting time for the pattern aba is
aba⋆aba = 1/P{a} + 0 + 1/(P{a}P{b}P{a}) = 2 + 12 = 14 Prof. Bob Li
Correlation between two different patterns
acab⋆abc: compare the last j letters of acab with the first j letters of abc:
ε1 = 0 // b ≠ a
ε2 = 1 // ab = ab
ε3 = 0 // cab ≠ abc
Hence acab⋆abc = 1/(P{Y=a} P{Y=b}).
Artificial casino for betting on pattern abc
At one point, the outcome pattern is acab (Y1 = a, Y2 = c, Y3 = a, Y4 = b). Among the gamblers betting on abc, the one who entered at roll 3 holds $6, his $1 having parlayed through a and b; every other gambler holds $0. Total holding = $6 = 1/(P{a}P{b}) = acab⋆abc.
Correlation in the special case of fair-coin toss
For length-m patterns over a fair coin, each term εj/(P{b1}⋯P{bj}) = εj · 2^j, so
A⋆B = 2ε1 + 4ε2 + 8ε3 + … = 2 · binary(εm…ε2ε1)
Example.
Av. wait for HTHH = HTHH⋆HTHH = 2·binary(1001) = 18
Av. wait for THTH = THTH⋆THTH = 2·binary(1010) = 20
HTHH⋆THTH = 2·binary(0000) = 0
THTH⋆HTHH = 2·binary(0101) = 10 Prof. Bob Li
THTH vs. HTHH
There are two teams of gamblers called Tortoises and Hares. Before each toss of the fair coin, a new Tortoise joins the casino and bets $1 on the pattern THTH, and a new Hare bets $1 on the pattern HTHH. Denote:
NA = waiting time for A = THTH in coin tossing
NB = waiting time for B = HTHH in coin tossing
N = min{NA, NB}, a stopping time for the coin-toss process
p = P{THTH prevails upon the stopping time N} // 1−p = P{HTHH prevails upon the stopping time N}
We want to calculate p. Prof. Bob Li
Let Xn be the fortune of the Tortoises minus the fortune of the Hares after n tosses. The process X1, X2, X3, … is a martingale, because the gamble is fair and both teams invest $1 per toss.
(XN | The pattern THTH prevails) = A⋆A − A⋆B
(XN | The pattern HTHH prevails) = B⋆A − B⋆B
By the Martingale Stopping Theorem,
0 = EXN = p(A⋆A − A⋆B) + (1−p)(B⋆A − B⋆B)
Hence p : (1−p) = (B⋆B − B⋆A) : (A⋆A − A⋆B) = (18 − 0) : (20 − 10) = 9 : 5. Equivalently, p = 9/14.
Homework. Compute P(A occurs before B) with a biased coin where P(H) = 0.4. Prof. Bob Li
Alternative calculation of THTH vs. HTHH
B⋆B = ENB = EN + E[NB − N]
= EN + p E[NB − N | N = NA] + (1−p) E[NB − N | N = NB]
= EN + p(B⋆B − A⋆B) + 0
⟹ EN = p A⋆B + (1−p) B⋆B
Symmetrically, EN = (1−p) B⋆A + p A⋆A. Hence
p A⋆B + (1−p) B⋆B = p A⋆A + (1−p) B⋆A
p : (1−p) = (B⋆B − B⋆A) : (A⋆A − A⋆B) = (18−0) : (20−10) = 9 : 5
This alternative calculation also yields EN = 90/7. Prof. Bob Li
The general problem
Let Y1, Y2, …, Yn, … be i.i.d. representing outcomes of repeated experiments. Given a collection of sequence patterns:
T, T, G, C, T
A, T, G, C
G, G, G, G, G, G, G
C, C, A
C, A, T, C
What is the probability for each pattern to win the race? What is the average waiting time until a winner emerges? Prof. Bob Li
The general problem (cont’d)
The setting allows a non-uniform probability distribution, a non-binary alphabet (A, T, G, C, …), any number of patterns, and uneven pattern lengths. Seems a hard problem!
The Main Theorem [Li '80]
Main Theorem. Let the random variable N be the waiting time of the i.i.d. process Y1, Y2, …, Yn, … till any of the n competing patterns A1, A2, … , An appears. Denote by pi the winning probability of Ai. Then,
Σ_{i=1}^{n} pi (Ai⋆Aj) = EN for every j = 1, …, n, together with Σ_{i=1}^{n} pi = 1.
These n+1 linear equations determine the n+1 unknowns p1, …, pn and EN.
// In a lifetime, I rarely had such a simple solution to a seemingly formidable problem. Prof. Bob Li
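The whole theorem fits in a few lines of code. The sketch below (ours; names and structure are our choices) implements the correlation operator and solves the n+1 linear equations, reproducing p = 9/14 and EN = 90/7 for the THTH vs. HTHH race:

```python
import numpy as np
from fractions import Fraction

def star(A, B, prob):
    """The correlation A*B = sum over j of eps_j / (P{b_1}...P{b_j}), where
    eps_j = 1 iff the last j letters of A equal the first j letters of B."""
    total = Fraction(0)
    for j in range(1, min(len(A), len(B)) + 1):
        if A[-j:] == B[:j]:
            denom = Fraction(1)
            for ch in B[:j]:
                denom *= prob[ch]
            total += 1 / denom
    return total

def pattern_race(patterns, prob):
    """Solve sum_i p_i (A_i * A_j) = EN for each j, plus sum_i p_i = 1."""
    n = len(patterns)
    M = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    for j in range(n):
        for i in range(n):
            M[j, i] = float(star(patterns[i], patterns[j], prob))
        M[j, n] = -1.0               # move EN to the left-hand side
    M[n, :n], rhs[n] = 1.0, 1.0      # winning probabilities sum to 1
    return np.linalg.solve(M, rhs)   # (p_1, ..., p_n, EN)

prob = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
print(pattern_race(["THTH", "HTHH"], prob))   # p = [9/14, 5/14], EN = 90/7 ~ 12.857
```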
The Main Theorem [Li '80] (cont’d)
Main Theorem+. Let a pattern A be present since the beginning. Let the random variable N be the stopping time of the i.i.d. process Y1, Y2, …, Yn, … till any of the n competing patterns A1, A2, … , An appears. Denote by pi the winning probability of Ai. Then,
Σ_{i=1}^{n} pi (Ai⋆Aj) − A⋆Aj = EN for every j, together with Σ_{i=1}^{n} pi = 1. Prof. Bob Li
A lemma to the Main Theorem+
Lemma. Let a pattern A be present since the beginning. Then, the expected waiting time for a pattern B is B⋆B − A⋆B. // The pattern B is not a connected subsequence of the pattern A.
The lemma is the special case of Main Theorem+ with only one pattern A1 = B. Prof. Bob Li
Earlier results on fair coin-toss
John Conway first discovered the integer binary(εk…ε2ε1) in fair coin-toss and called it the leading number L(A, B). His computation algorithms below were quoted by Martin Gardner’s Mathematical Games column in [Scientific American, 1974]:
The average waiting time for a pattern B = 2 L(B, B).
The odds for pattern B to precede pattern A are [L(A, A) − L(A, B)] : [L(B, B) − L(B, A)].
These amount to applying the aforementioned Main Theorem to just one or two patterns in fair coin-toss. Prof. Bob Li
Earlier results on coin-toss
Even earlier, William Feller found that, in fair coin-toss:
Average waiting time for HHHHHH = 126
Average waiting time for HHTTHH = 70
He also considered the biased coin with P{H} = p and P{T} = q and, through a lengthy Markov-chain argument, derived P{A run of m H’s precedes a run of n T’s}. These amount to applying the Main Theorem to particular coin-toss patterns. Prof. Bob Li
Mathematical irony
The more general the concept, the more transparent is the theory: any number of patterns, uneven lengths, non-binary alphabets, non-uniform probability distributions.