
1 11 - Markov Chains Jim Vallandingham

2 Outline
Irreducible Markov Chains
  - Outline of Proof of Convergence to Stationary Distribution
  - Convergence Example
  - Reversible Markov Chains
Monte Carlo Methods
  - Hastings-Metropolis Algorithm
  - Gibbs Sampling
  - Simulated Annealing
Absorbing Markov Chains

3 Stationary Distribution
As $n \to \infty$, $P^n$ approaches a limiting matrix each of whose rows is the stationary distribution $\varphi'$.

4 Stationary Dist. Example

5 Stationary Dist. Example
Long-term averages:
  24% of time spent in state E1
  39% of time spent in state E2
  21% of time spent in state E3
  17% of time spent in state E4

6 Stationary Distribution
Any finite, aperiodic, irreducible Markov chain converges to a stationary distribution, regardless of the starting distribution. An outline of the proof requires linear algebra (Appendix B.19).

7 L.A.: Eigenvalues
Let P be an s × s matrix. P has s eigenvalues $\lambda_1, \ldots, \lambda_s$, found as the s solutions of the characteristic equation $\det(P - \lambda I) = 0$. Assume all eigenvalues of P are distinct.

8 L.A.: Left & Right Eigenvectors
Corresponding to each eigenvalue $\lambda_k$ is a right (column) eigenvector $r_k$ and a left (row) eigenvector $l_k'$, for which
$P r_k = \lambda_k r_k$ and $l_k' P = \lambda_k l_k'$
Assume they are normalized so that $l_k' r_k = 1$.

9 L.A.: Spectral Expansion
P can be expressed in terms of its eigenvalues and eigenvectors:
$P = \sum_{k=1}^{s} \lambda_k r_k l_k'$
This is called a spectral expansion of P.

10 L.A.: Spectral Expansion
If $\lambda_k$ is an eigenvalue of P with corresponding left and right eigenvectors $l_k'$ and $r_k$, then $\lambda_k^n$ is an eigenvalue of $P^n$ with the same left and right eigenvectors $l_k'$ and $r_k$.

11 L.A.: Spectral Expansion
This implies the spectral expansion of $P^n$ can be written as:
$P^n = \sum_{k=1}^{s} \lambda_k^n r_k l_k'$

12 Outline of Proof
Going back to the proof: P is the transition matrix of a finite, aperiodic, irreducible Markov chain. P has one eigenvalue, $\lambda_1$, equal to 1; all other eigenvalues have absolute value < 1.

13 Outline of Proof
Choose the left and right eigenvectors corresponding to $\lambda_1 = 1$. Requirements:
  - The right eigenvector is $r_1 = \mathbf{1}$, a column of 1's: since each row of P sums to 1, $P\mathbf{1} = \mathbf{1}$.
  - The left eigenvector satisfies $l_1' P = l_1'$ (the definition of a left eigenvector with eigenvalue 1).
  - Normalization: $l_1' r_1 = l_1' \mathbf{1} = 1$, so the entries of $l_1'$ sum to 1 and $l_1'$ is a probability vector.

14 Outline of Proof
$l_1' P = l_1'$ is the same equation as that satisfied by the stationary distribution, $\varphi' P = \varphi'$. Also, it can be shown that there is a unique solution of this equation that also satisfies $l_1' \mathbf{1} = 1$, so that $l_1' = \varphi'$.

15 Outline of Proof
$P^n$ gives the n-step transition probabilities. The spectral expansion of $P^n$ is
$P^n = \lambda_1^n r_1 l_1' + \sum_{k=2}^{s} \lambda_k^n r_k l_k' = \mathbf{1}\varphi' + \sum_{k=2}^{s} \lambda_k^n r_k l_k'$
Only one eigenvalue equals 1; the rest have absolute value < 1, so as n increases $P^n$ approaches $\mathbf{1}\varphi'$.

16 Convergence Example

17 Convergence Example
[Slide shows an example transition matrix P and its eigenvalues.]

18 Convergence Example
The example matrix has eigenvalue $\lambda_1 = 1$; the other eigenvalues are less than 1 in absolute value.

19 Convergence Example
Left and right eigenvectors satisfying $l_k' P = \lambda_k l_k'$ and $P r_k = \lambda_k r_k$, with $l_k' r_k = 1$. [Eigenvectors shown on slide.]

20 Convergence Example
Left and right eigenvectors satisfying the same relations; the left eigenvector corresponding to $\lambda_1 = 1$ is the stationary distribution $\varphi'$.

21 Convergence Example
Spectral expansion: as n grows, the terms with $|\lambda_k| < 1$ vanish and $P^n \to \mathbf{1}\varphi'$, whose every row is the stationary distribution.
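To see the convergence numerically, here is a minimal numpy sketch; the 3-state matrix below is a made-up stand-in for the slide's example:

```python
import numpy as np

# A made-up finite, aperiodic, irreducible transition matrix.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.4, 0.5]])

# Left eigenvector for eigenvalue 1 (phi' P = phi'), via eig of P transpose.
w, v = np.linalg.eig(P.T)
phi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
phi /= phi.sum()                     # normalize into a probability vector

# Every row of P^n approaches phi as n grows.
print(np.round(phi, 4))
print(np.round(np.linalg.matrix_power(P, 50), 4))   # each row ~ phi
```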

22 Reversible Markov Chains

23 Reversible Markov Chains
We typically move forward in 'time' in a Markov chain: 1 → 2 → 3 → … → t. What about moving backward in this chain: t → t−1 → t−2 → … → 1?

24 Reversible Markov Chains
[Figure: tree with an Ancestor at the root and Species A and Species B as leaves, with arrows labelled "back in time" and "forward in time".]

25 Reversible Markov Chains
We have a finite, irreducible, aperiodic Markov chain with stationary distribution $\varphi$. During t transitions, the chain will move through states
$X_1, X_2, \ldots, X_t$
Reverse chain: define $Y_j = X_{t-j+1}$. Then the reverse chain will move through states
$Y_1 = X_t,\; Y_2 = X_{t-1},\; \ldots,\; Y_t = X_1$

26 Reversible Markov Chains
We want to show that the structure determining the reverse chain sequence is also a Markov chain. A typical element $q_{ij}$ of its transition matrix Q is found from a typical element of P, using:
$q_{ij} = \dfrac{\varphi_j \, p_{ji}}{\varphi_i}$

27 Reversible Markov Chains
This is shown by using Bayes' rule to invert the conditional probability:
$q_{ij} = P(X_n = j \mid X_{n+1} = i) = \dfrac{P(X_{n+1} = i \mid X_n = j)\,P(X_n = j)}{P(X_{n+1} = i)} = \dfrac{p_{ji}\,\varphi_j}{\varphi_i}$
Intuitively: the future is independent of the past, given the present; equally, the past is independent of the future, given the present.

28 Reversible Markov Chains
The stationary distribution of the reverse chain is still $\varphi$. This follows from the stationary distribution property:
$\sum_i \varphi_i q_{ij} = \sum_i \varphi_i \dfrac{\varphi_j p_{ji}}{\varphi_i} = \varphi_j \sum_i p_{ji} = \varphi_j$

29 Reversible Markov Chains
A Markov chain is said to be reversible if Q = P, i.e. $q_{ij} = p_{ij}$ for all i, j. This holds only if
$\varphi_i \, p_{ij} = \varphi_j \, p_{ji}$ for all i, j
(the detailed balance condition).

30 Monte Carlo Methods

31 Markov Chain Monte Carlo
A class of algorithms for sampling from probability distributions. They involve constructing a Markov chain whose stationary distribution is the target distribution. The state of the chain after a large number of steps is used as a sample of the desired distribution. We discuss two algorithms:
  - Gibbs Sampling
  - Simulated Annealing

32 Basic Problem
Find a transition matrix P such that its stationary distribution is the target distribution. We know that a Markov chain will converge to its stationary distribution regardless of the initial distribution, so how can we find such a P?

33 Basic Idea
Construct a transition matrix Q, the "candidate-generating matrix", then modify it to have the correct stationary distribution. The modification involves inserting factors $a_{ij}$ so that
$p_{ij} = q_{ij} \, a_{ij}$ for $i \ne j$
There are various ways of picking the $a_{ij}$'s.

34 Hastings-Metropolis
Goal: construct an aperiodic, irreducible Markov chain having a prescribed stationary distribution $\pi$. The algorithm produces a correlated sequence of draws from the target density, which may be difficult to sample by a classical independence method.

35 Hastings-Metropolis
Process: choose a set of constants $a_{ij}$ with $0 < a_{ij} \le 1$ such that $\pi_i q_{ij} a_{ij} = \pi_j q_{ji} a_{ji}$; the standard choice is
$a_{ij} = \min\!\left(1, \dfrac{\pi_j q_{ji}}{\pi_i q_{ij}}\right)$
and define $p_{ij} = q_{ij} a_{ij}$ for $i \ne j$, with $p_{ii} = 1 - \sum_{j \ne i} p_{ij}$. With probability $a_{ij}$ the proposed state change is accepted; with probability $1 - a_{ij}$ it is rejected and the chain does not change value.

36 Hastings-Metropolis Example
Target distribution $\pi = (.4\;\; .6)$ over states 1 and 2; candidate-generating matrix:
Q =
  [ .5  .5 ]
  [ .9  .1 ]

37 Hastings-Metropolis Example
Applying the Hastings-Metropolis factors to Q gives:
P =
  [ .5   .5  ]
  [ .33  .67 ]

38 Hastings-Metropolis Example
Powers of P converge to a matrix whose rows are the stationary distribution:
P^2 =
  [ .415  .585 ]
  [ .386  .614 ]
P^50 =
  [ .398  .602 ]
  [ .398  .602 ]
(The rows of P^50 agree with $\pi = (.4\;\; .6)$ up to the rounding of $p_{21}$ to .33.)
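The construction of P from Q and $\pi$ can be sketched in a few lines of numpy; the helper name hm_transition_matrix is ours:

```python
import numpy as np

def hm_transition_matrix(pi, Q):
    """Build the Hastings-Metropolis chain P from target pi and candidate matrix Q."""
    s = len(pi)
    P = np.zeros((s, s))
    for i in range(s):
        for j in range(s):
            if i != j and Q[i, j] > 0:
                a = min(1.0, (pi[j] * Q[j, i]) / (pi[i] * Q[i, j]))
                P[i, j] = Q[i, j] * a
        P[i, i] = 1.0 - P[i].sum()      # rejected moves stay put
    return P

pi = np.array([0.4, 0.6])
Q = np.array([[0.5, 0.5],
              [0.9, 0.1]])
P = hm_transition_matrix(pi, Q)
print(np.round(P, 3))                               # [[.5 .5] [.333 .667]]
print(np.round(np.linalg.matrix_power(P, 50), 3))   # rows ~ (.4 .6)
```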

39 Algorithmic Description
Start with state $E_1$, then iterate:
  - Propose E′ from $q(E_t, E')$.
  - Calculate the ratio $a = \dfrac{\pi(E')\,q(E', E_t)}{\pi(E_t)\,q(E_t, E')}$.
  - If a > 1, accept: $E_{t+1} = E'$.
  - Else accept with probability a; if rejected, $E_{t+1} = E_t$.
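A runnable sketch of this loop for the discrete two-state example above; the function and variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

pi = np.array([0.4, 0.6])          # target distribution
Q = np.array([[0.5, 0.5],          # proposal (candidate-generating) matrix
              [0.9, 0.1]])

def metropolis_hastings(n_steps, state=0):
    samples = []
    for _ in range(n_steps):
        proposal = rng.choice(len(pi), p=Q[state])   # propose E' from q(E_t, .)
        a = (pi[proposal] * Q[proposal, state]) / (pi[state] * Q[state, proposal])
        if rng.random() < a:                         # accept with probability min(1, a)
            state = proposal
        samples.append(state)
    return np.array(samples)

s = metropolis_hastings(100_000)
print(np.bincount(s) / len(s))     # ~ [0.4, 0.6]
```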

40 Gibbs Sampling

41 Gibbs Sampling
Definitions: let $Y = (Y^{(1)}, \ldots, Y^{(d)})$ be the random vector of interest, and let $\pi(y)$ be the distribution of Y. Assume each component takes finitely many values. We define a Markov chain whose states are the possible values of Y.

42 Gibbs Sampling
Process: enumerate the possible vectors in some order 1, 2, …, s, and let vector j correspond to the jth state in the chain. The transition probabilities are:
  - $p_{ij} = 0$ if vectors i and j differ in more than one component.
  - If they differ in at most one component, say component u, then $p_{ij}$ is the conditional probability of component u taking its new value, given the values of all the other components.

43 Gibbs Sampling
Assume a joint distribution p(X, Y), and suppose we are looking to sample k values of X.
  - Begin with a value y0.
  - Sample xi using p(X | Y = y_{i-1}).
  - Once xi is found, use it to find yi from p(Y | X = xi).
  - Repeat k times.

44 Visual Example

45 Gibbs Sampling
Gibbs sampling allows us to work with univariate conditional distributions instead of complex joint distributions. The chain has the target joint distribution as its stationary distribution.

46 Why is this Hastings-Metropolis?
If we define the candidate-generating probabilities q to be the Gibbs conditional distributions, we can see that for Gibbs the Hastings-Metropolis acceptance factor satisfies a = 1 always, so every proposed move is accepted.

47 Simulated Annealing

48 Simulated Annealing
Goal: find the (approximate) minimum of some positive function f defined on an extremely large number of states s, and to find those states where this function is minimized. The value of the function for state $E_j$ is $f(E_j)$.

49 Simulated Annealing
Process: construct a neighborhood of each state, a set of states "close" to that state. The variable in the Markov chain can move to a neighbor in one step; moves outside the neighborhood are not allowed.

50 Simulated Annealing
Requirements of neighborhoods:
  - If $E_j$ is in the neighborhood of $E_m$, then $E_m$ is in the neighborhood of $E_j$.
  - The number of states in a neighborhood, N, is independent of the state.
  - Neighborhoods are linked so that the chain can eventually make it from any $E_j$ to any $E_m$.
  - If in state $E_j$, the next move must be within the neighborhood of $E_j$.

51 Simulated Annealing
Uses a positive parameter T (the "temperature"). The aim is to have the stationary distribution of each Markov chain state $E_j$ be
$\varphi_j = \dfrac{e^{-f(E_j)/T}}{\sum_m e^{-f(E_m)/T}}$
(the denominator is the constant that ensures the probabilities sum to 1). States with low values of f() then have high stationary probability, so the chain visits them often enough for them to become recognizable.

52 Simulated Annealing

53 Simulated Annealing
Large T values:
  - All states in the current state's neighborhood are chosen with roughly equal probability.
  - The stationary distribution of the chain tends to be uniform.
Small T values:
  - Different states in a neighborhood have very different stationary probabilities.
  - Too small a T and the chain might get stuck in local minima.

54 Simulated Annealing
The art of picking the T value: we want rapid movement from one neighborhood to another (large T), but also to pick out the states in each neighborhood with large stationary probabilities (small T). A sketch with a simple cooling schedule follows.
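A minimal sketch of simulated annealing under these rules, with a made-up state space, a made-up positive function f, and a simple geometric cooling schedule; all of these are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Made-up positive function on states 0..99 with several local minima.
    return 5 + 2 * np.sin(x / 3.0) + abs(x - 60) / 20.0

states = np.arange(100)

def neighbors(x):
    # Symmetric, fixed-size neighborhoods on a ring of states.
    return [(x - 1) % 100, (x + 1) % 100]

def simulated_annealing(n_steps, T=5.0, cooling=0.999):
    x = rng.choice(states)
    for _ in range(n_steps):
        y = rng.choice(neighbors(x))                 # propose a neighbor
        # Accept with probability min(1, e^{-(f(y)-f(x))/T}):
        # downhill moves always accepted, uphill moves sometimes.
        if rng.random() < np.exp(-(f(y) - f(x)) / T):
            x = y
        T *= cooling                                 # gradually lower the temperature
    return x

best = simulated_annealing(20_000)
print(best, f(best))   # a state near the minimum of f
```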

55 SA Example

56 Absorbing Markov Chains

57 Absorbing Markov Chains
Absorbing state: a state which is impossible to leave ($p_{ii} = 1$).
Transient state: a non-absorbing state in an absorbing chain.

58 Absorbing Markov Chains
Questions to answer:
  - Given the chain starts at a particular state, what is the expected number of steps before being absorbed?
  - Given the chain starts at a particular state, what is the probability it will be absorbed by a particular absorbing state?

59 General Process
We use the explanation from Introduction to Probability (Grinstead & Snell): convert the matrix into canonical form, use the conversions to answer these questions, and use a simple example throughout.

60 Canonical Form
Rearrange the states so that the transient states come first in P:
P =
  [ Q  R ]
  [ 0  I ]
where Q is a t × t matrix, R is a t × r matrix, 0 is an r × t zero matrix, and I is an r × r identity matrix (t: # of transient states; r: # of absorbing states).

61 Drunkard's Walk Example
A man is walking home from a bar, with 4 blocks to walk, giving 5 states in total (corners 0 through 4). Absorbing states: corner 4 (home) and corner 0 (the bar). At each block he has an equal probability of going forward or backward.

62 Drunkard's Walk Example
Transition matrix (states 0, 1, 2, 3, 4):
       0    1    2    3    4
  0 [  1    0    0    0    0  ]
  1 [ 1/2   0   1/2   0    0  ]
  2 [  0   1/2   0   1/2   0  ]
  3 [  0    0   1/2   0   1/2 ]
  4 [  0    0    0    0    1  ]

63 Drunkard's Walk: Canonical Form
Transient states 1, 2, 3 first, then absorbing states 0 and 4:
       1    2    3    0    4
  1 [  0   1/2   0   1/2   0  ]
  2 [ 1/2   0   1/2   0    0  ]
  3 [  0   1/2   0    0   1/2 ]
  0 [  0    0    0    1    0  ]
  4 [  0    0    0    0    1  ]

64 Fundamental Matrix
For an absorbing Markov chain P, the fundamental matrix for P is
$N = (I - Q)^{-1}$
The entry $n_{ij}$ gives the expected number of times that the process is in the transient state $s_j$ if it is started in the transient state $s_i$ (before being absorbed).

65 Proof

66 Proof
Let $s_i$ and $s_j$ be two transient states. Let $X^{(k)}$ be the random variable which is
  1 : if the chain is in state $s_j$ after k steps
  0 : otherwise

67 Proof
The expected number of times the chain is in state $s_j$ in the first n steps, starting from $s_i$, is
$E[X^{(0)} + X^{(1)} + \cdots + X^{(n)}] = q^{(0)}_{ij} + q^{(1)}_{ij} + \cdots + q^{(n)}_{ij}$
where $q^{(k)}_{ij}$ is the (i, j) entry of $Q^k$. As n goes to infinity,
$n_{ij} = \sum_{k=0}^{\infty} q^{(k)}_{ij}$, i.e. $N = I + Q + Q^2 + \cdots = (I - Q)^{-1}$

68 Example Fundamental Matrix
From the canonical form, Q is the 3 × 3 transient block, and
$N = (I - Q)^{-1}$ =
       1    2    3
  1 [ 1.5  1.0  0.5 ]
  2 [ 1.0  2.0  1.0 ]
  3 [ 0.5  1.0  1.5 ]

69 Time to Absorption
Expected number of steps before the chain is absorbed: $t_i$ is the expected number of steps before the chain is absorbed, given that it started in $s_i$. The vector t with elements $t_i$ is
$t = N c$
where c is a column vector of 1's.

70 Proof
The sum of the ith row of N is the expected number of times the chain is in any transient state, for a given starting state $s_i$; that is, the expected time required before absorption. This is exactly what each entry $t_i$ of t is.

71 Example: Time to Absorption
For the drunkard's walk, $t = Nc = (3, 4, 3)'$: starting from corner 1, 2, or 3, the expected number of steps before absorption is 3, 4, or 3 respectively.

72 Absorption Probabilities
$b_{ij}$: probability that the chain will be absorbed in the absorbing state $s_j$ if it starts in the transient state $s_i$. B is the t × r matrix with entries $b_{ij}$, given by
$B = N R$
where R is the other component of the canonical matrix.

73 Proof
$b_{ij} = \sum_{n=0}^{\infty} \sum_{k} q^{(n)}_{ik} r_{kj} = \sum_k n_{ik} r_{kj} = (NR)_{ij}$

74 Example: Absorption Probabilities
For the drunkard's walk, B = NR:
        0     4
  1 [ 0.75  0.25 ]
  2 [ 0.50  0.50 ]
  3 [ 0.25  0.75 ]
Starting one block from the bar (corner 1), he is absorbed at the bar with probability 3/4 and at home with probability 1/4.
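All three quantities are easy to verify numerically; a small numpy sketch for the drunkard's walk:

```python
import numpy as np

# Canonical-form blocks for the drunkard's walk (transient states 1, 2, 3;
# absorbing states 0 = bar, 4 = home).
Q = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])
R = np.array([[0.5, 0.0],
              [0.0, 0.0],
              [0.0, 0.5]])

N = np.linalg.inv(np.eye(3) - Q)   # fundamental matrix
t = N @ np.ones(3)                 # expected steps to absorption
B = N @ R                          # absorption probabilities

print(N)   # [[1.5 1. 0.5] [1. 2. 1.] [0.5 1. 1.5]]
print(t)   # [3. 4. 3.]
print(B)   # [[0.75 0.25] [0.5 0.5] [0.25 0.75]]
```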

75 Absorbing Markov Chains
We can now answer both questions:
  - Given the chain starts at a particular state, the expected number of steps before being absorbed is given by $t = Nc$.
  - Given the chain starts at a particular state, the probability it will be absorbed by a particular absorbing state is given by $B = NR$.

76 Interesting Markov Chain use

77 Sentence Creator
Feed text into a Markov chain to create a transition matrix that holds the probability of going from word i to word j in a sentence. Start at a particular word in the chain and use the transition distributions to create new sentences.
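A minimal sketch of such a sentence creator; the tiny corpus and function names are made up:

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    chain = defaultdict(list)
    words = text.split()
    for w, nxt in zip(words, words[1:]):
        chain[w].append(nxt)   # duplicates encode the transition probabilities
    return chain

def generate(chain, start, max_words=20):
    word, sentence = start, [start]
    while word in chain and len(sentence) < max_words:
        word = random.choice(chain[word])   # sample the next word from p(j | i)
        sentence.append(word)
    return " ".join(sentence)

corpus = "the dog saw the cat and the cat saw the dog run"  # made-up tiny corpus
chain = build_chain(corpus)
print(generate(chain, "the"))
```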

78 Sentence Creator
Dracula + Huckleberry Finn: "This afternoon I don't know of humbug talky-talk, just set in, and perpetually violent. Then I saw, and looking tired them pens was a few minutes our sight."

79 End

