
1 The Rate of Concentration of the Stationary Distribution of a Markov Chain on the Homogeneous Populations. Boris Mitavskiy and Jonathan Rowe, School of Computer Science, University of Birmingham

2 Notation: $\Omega$ denotes a set, usually finite, called the search space; $f : \Omega \to (0, \infty)$ is the fitness function.

3 How does an evolutionary computation algorithm work? An initial population $P = (x_1, x_2, \ldots, x_m)$ of individuals $x_i \in \Omega$ is chosen randomly.

4 Selection: a new population is obtained by drawing $m$ individuals from $P$ independently with replacement, each draw picking individual $x_i$ with probability proportional to its fitness $f(x_i)$.

5–6 Recombination is just some probabilistic rule which sends a given population $P = (x_1, x_2, \ldots, x_m)$ to a new population $P' = (y_1, y_2, \ldots, y_m)$.

7 In our framework we only assume the following "weak purity" property (in the sense of Radcliffe) about recombination: a homogeneous population $(x, x, \ldots, x)$ is sent to itself by recombination with probability 1.

8 Recombination: in other words, homogeneous populations stay homogeneous with probability 1.

9 Mutation: for every individual $y_i$ of the population $P'$, select a mutation transformation $M_i \in \mathcal{M}$, where $\mathcal{M}$ is the family of mutation transformations.

10 Replace every individual $y_i$ of $P'$ with the individual $M_i(y_i)$.

11 This once again gives us a new population, $P'' = (M_1(y_1), M_2(y_2), \ldots, M_m(y_m))$, and the cycle of selection, recombination and mutation repeats.
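The following is a minimal runnable sketch (in Python) of one such generation. The bitstring search space, the concrete fitness function, the identity "recombination" and the single-bit-flip mutation are illustrative assumptions of mine; the slides fix only the general scheme.

import random

# Illustrative choices (not fixed by the slides): Omega = bitstrings
# of length 4, m = 6 individuals, mutation rate MU.
OMEGA_BITS = 4
M = 6
MU = 0.01

def fitness(x):
    # Positive fitness, as the framework requires.
    return 1 + sum(x)

def select(pop):
    # Fitness-proportional selection: draw m individuals independently,
    # each with probability proportional to its fitness.
    weights = [fitness(x) for x in pop]
    return random.choices(pop, weights=weights, k=len(pop))

def recombine(pop):
    # Any probabilistic rule is allowed as long as it is "weakly pure":
    # a homogeneous population maps to itself with probability 1.
    # The identity map satisfies this trivially.
    return pop

def mutate(pop):
    # Apply an independently chosen transformation from the family M
    # to each individual: the identity with probability 1 - MU,
    # otherwise a single-bit flip.
    result = []
    for x in pop:
        x = list(x)
        if random.random() < MU:
            i = random.randrange(OMEGA_BITS)
            x[i] ^= 1
        result.append(tuple(x))
    return result

pop = [tuple(random.randint(0, 1) for _ in range(OMEGA_BITS)) for _ in range(M)]
for _ in range(100):  # run 100 generations
    pop = mutate(recombine(select(pop)))
print(pop)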

12–14 Quotients of Markov chains: $X$ is the state space of our irreducible Markov chain. Partition $X$ into equivalence classes: $X = X_1 \cup X_2 \cup \ldots \cup X_k$. How do we define the transition probabilities among the equivalence classes?

15–28 Imagine that the chain runs for a very long time, and let $\pi$ denote its stationary distribution. We are interested in computing the probability of reaching the class $X_j$ given that we are in the class $X_i$. An element $x \in X_i$ arises with frequency $\pi(x)$; consequently, a given $x$ inside of $X_i$ occurs with relative frequency $\pi(x) / \pi(X_i)$, where $\pi(X_i) = \sum_{z \in X_i} \pi(z)$, among the states inside of $X_i$. We therefore obtain the following transition probability formula: $p_{X_i \to X_j} = \sum_{x \in X_i} \frac{\pi(x)}{\pi(X_i)} \, p(x \to X_j)$

29–32 where $p(x \to X_j)$ is the probability of getting somewhere inside of $X_j$ starting from $x$. Computing $p(x \to X_j)$ is also quite easy: $p(x \to X_j) = \sum_{y \in X_j} p(x \to y)$. And so we finally obtain: $p_{X_i \to X_j} = \sum_{x \in X_i} \frac{\pi(x)}{\pi(X_i)} \sum_{y \in X_j} p(x \to y).$
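In code the formula reads as follows. This is a sketch of my own (the slides contain no code); it assumes the full transition matrix P and the stationary distribution pi are available as numpy arrays.

import numpy as np

def quotient_chain(P, pi, classes):
    # Transition matrix of the quotient chain, straight from the
    # formula above:
    #   p_{X_i -> X_j} = sum_{x in X_i} pi(x)/pi(X_i) * sum_{y in X_j} p(x, y)
    # P       -- (n, n) transition matrix of the original chain
    # pi      -- its stationary distribution, shape (n,)
    # classes -- list of integer index arrays, one per equivalence class
    k = len(classes)
    Q = np.zeros((k, k))
    for i, Xi in enumerate(classes):
        mass = pi[Xi].sum()                          # pi(X_i)
        for j, Xj in enumerate(classes):
            # p(x, X_j) for each x in X_i, weighted by pi(x)/pi(X_i)
            Q[i, j] = (pi[Xi] / mass) @ P[np.ix_(Xi, Xj)].sum(axis=1)
    return Q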

33–34 It is not surprising, then, that the quotient Markov chain is also irreducible and that its stationary distribution is coherent with the original one. The irreducibility is left as an exercise. Let $\tilde\pi$ denote the distribution obtained from $\pi$ as follows: for every equivalence class $X_i$ we let $\tilde\pi(X_i) = \pi(X_i) = \sum_{x \in X_i} \pi(x)$. It can be verified by direct computation that $\tilde\pi$ is the stationary distribution of the quotient chain.
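Continuing the sketch above, here is a quick numeric check of that direct computation on an arbitrary 3-state chain (my example, not the slides'):

# An arbitrary irreducible 3-state chain (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# Stationary distribution: the left eigenvector for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()

classes = [np.array([0]), np.array([1, 2])]
Q = quotient_chain(P, pi, classes)
pi_tilde = np.array([pi[c].sum() for c in classes])

# pi_tilde is indeed stationary for the quotient chain.
assert np.allclose(pi_tilde @ Q, pi_tilde)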

35–38 Although the transition probabilities of the quotient chain are defined in terms of the stationary distribution of the original chain, they may still give us some useful information. Consider the simplest case: a partition of $X$ into just two classes, $X_1$ and $X_2 = X \setminus X_1$.

39 The quotient chain has the transition matrix $\begin{pmatrix} 1 - p & p \\ q & 1 - q \end{pmatrix}$

40–41 The unique stationary distribution is then given by the formulas $\tilde\pi(X_1) = \frac{q}{p+q}$ and $\tilde\pi(X_2) = \frac{p}{p+q}$, where $p = p_{X_1 \to X_2}$ and $q = p_{X_2 \to X_1}$.

42–43 And so $\frac{\tilde\pi(X_1)}{\tilde\pi(X_2)} = \frac{q}{p}$: the ratio of the stationary masses of the two classes is controlled by the ratio of the transition probabilities $q = p_{X_2 \to X_1}$ and $p = p_{X_1 \to X_2}$.
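As a sanity check (a direct computation added here, not part of the original deck), multiplying the claimed stationary row vector by the transition matrix gives it back:

$\left(\frac{q}{p+q},\ \frac{p}{p+q}\right)\begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix} = \left(\frac{q(1-p)+pq}{p+q},\ \frac{qp+p(1-q)}{p+q}\right) = \left(\frac{q}{p+q},\ \frac{p}{p+q}\right).$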

44 What does this say about Markov chains modelling EAs? Recall that $\mathcal{M}$, the family of mutation transformations, is just a family of functions on $\Omega$. In practice, the transformations in $\mathcal{M}$ are sampled independently with respect to some probability distribution, and mutation happens with a small rate: the probability $\mu$ of applying a non-identity transformation is very small but positive, and any individual can be turned into any other by a chain of mutations. The Markov chain modelling an EA involving such a type of mutation is irreducible and, hence, has a unique stationary distribution $\pi$.
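To make this concrete, here is a toy instantiation of my own (not from the slides): $\Omega = \{0, 1\}$, population size $m = 2$, fitness-proportional selection, identity recombination, and mutation that flips an individual with probability $\mu$. The exact transition matrix on $\Omega^m$ can then be written down and its unique stationary distribution computed.

import itertools
import numpy as np

OMEGA = (0, 1)
M_POP = 2                                 # population size m
f = {0: 1.0, 1: 2.0}                      # a positive fitness function

states = list(itertools.product(OMEGA, repeat=M_POP))  # ordered populations

def transition(P_from, P_to, mu):
    # One generation: each slot of the new population is filled by an
    # independent fitness-proportional draw from P_from, then mutated
    # (flipped) with probability mu.  Slots are independent, so the
    # transition probability is a product of per-slot marginals.
    total = sum(f[x] for x in P_from)
    prob = 1.0
    for y in P_to:
        p_slot = 0.0
        for x in set(P_from):
            p_sel = sum(f[z] for z in P_from if z == x) / total
            p_slot += p_sel * ((1 - mu) if x == y else mu)
        prob *= p_slot
    return prob

MU = 0.05
T = np.array([[transition(a, b, MU) for b in states] for a in states])
assert np.allclose(T.sum(axis=1), 1)      # rows are probability vectors

# With MU > 0 every entry of T is positive, so the chain is irreducible
# and has a unique stationary distribution pi:
vals, vecs = np.linalg.eig(T.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()
print(dict(zip(states, np.round(pi, 4))))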

45–46 What happens to the stationary distribution of a Markov chain modelling such an EA as the "mutation rate" $\mu \to 0$? Let $H$ denote the set of homogeneous populations (i.e. populations consisting of repeated copies of the same individual). The state space of the Markov chain modelling the EA is the set of all populations; here we consider ordered populations, so that the state space is $\Omega^m$. Let's apply our previous general result to this Markov chain with $X_1 = H$ and $X_2 = \Omega^m \setminus H$:

47–48 Destroying a given homogeneous population requires applying a non-identity map to at least one of the $m$ individuals in the population, and so it follows that $p \le 1 - (1 - \mu)^m$. The probability of passing from a non-homogeneous population to a homogeneous one is at least as large as the probability of $m$ consecutive drawings of the fittest individual in the population. The probability of drawing the fittest individual is bounded below by $\frac{1}{m}$ (under fitness-proportional selection its fitness is at least the population average). Doing so independently $m$ times gives us the lower bound $m^{-m}$ on the transition probability to the homogeneous population upon the completion of selection. Afterwards, recombination preserves the homogeneous population by weak purity, and with probability $(1 - \mu)^m$ mutation leaves every individual unchanged, which means that the homogeneous population is preserved. Thus $q \ge m^{-m} (1 - \mu)^m$.
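A small added step (mine, not the slides') makes the behaviour as $\mu \to 0$ transparent: by Bernoulli's inequality, $(1-\mu)^m \ge 1 - m\mu$, so the two bounds become

$p \le 1 - (1-\mu)^m \le m\mu, \qquad q \ge m^{-m}(1-\mu)^m \ge m^{-m}(1 - m\mu).$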

49–51 So $p \le 1 - (1 - \mu)^m$ and $q \ge m^{-m} (1 - \mu)^m$. It follows then that $\frac{\tilde\pi(\Omega^m \setminus H)}{\tilde\pi(H)} = \frac{p}{q} \le \frac{m^m \left(1 - (1 - \mu)^m\right)}{(1 - \mu)^m} \to 0$ as $\mu \to 0$: the stationary distribution concentrates on the homogeneous populations. How can we improve the bound? Well, a power of a Markov transition matrix has the same stationary distribution as the original matrix. We can therefore apply the general "quotient" inequality to a power of the Markov transition matrix instead of the original one. Combining this with Markov's inequality finally yields the rate-of-concentration bound of the title.
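Sweeping the mutation rate in the toy model sketched after slide 44 (again my illustration) shows exactly this concentration: the stationary mass of $H$ tends to 1 as $\mu \to 0$.

# Reuses `states` and `transition` from the toy model above.
H_idx = [i for i, s in enumerate(states) if len(set(s)) == 1]  # homogeneous

for mu in (0.2, 0.1, 0.05, 0.01, 0.001):
    T = np.array([[transition(a, b, mu) for b in states] for a in states])
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi /= pi.sum()
    print(f"mu = {mu:6.3f}   pi(H) = {pi[H_idx].sum():.4f}")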
