
1 School of Computer Science & Engineering
Artificial Intelligence Escaping Local Optima: Genetic Algorithm Dae-Won Kim School of Computer Science & Engineering Chung-Ang University

2 We’re trying to escape local optima

3 To achieve this, we have learned simulated annealing & tabu search

4 They always have a single current best solution stored that they try to improve in the next step.

5 Hill-climbing uses deterministic rules

6 If an examined neighbor is better, proceed to that neighbor and continue to search from there.

7 SA uses probabilistic rules.

8 If an examined neighbor is better, accept this as the new current position.

9 Otherwise, either probabilistically accept this new weaker solution anyway or continue to search in the current neighborhood.
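The two acceptance rules just described can be summarized in a short sketch (Python, with hypothetical names; the slides don't prescribe an implementation). For minimization, SA always accepts an improving neighbor and accepts a worse one with a temperature-dependent probability:

```python
import math
import random

def sa_accept(current_cost, neighbor_cost, temperature):
    """Simulated-annealing acceptance rule for minimization."""
    delta = neighbor_cost - current_cost
    if delta <= 0:
        # the examined neighbor is better (or equal): accept it
        return True
    # otherwise accept the weaker solution probabilistically;
    # high temperature -> likely accept, low temperature -> likely reject
    return random.random() < math.exp(-delta / temperature)
```

Hill-climbing corresponds to the deterministic special case in which the probabilistic branch always rejects.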

10 What would happen if we instead maintained several solutions simultaneously?

11 What would happen if our search algorithm worked with a population (or set) of solutions?

12

13 At first glance, this idea doesn’t provide us with anything new.

14 We could just process several solutions in parallel using one of the other methods!

15 Q: What makes it essentially different from other methods?

16 1. An evolutionary process of competition and selection lets the candidate solutions fight for room in future generations.

17 2. Random variation searches for new solutions, similar to natural evolution.

18 We start with a population of initial solutions, called chromosomes.

19 We use an evaluation function to determine the merit of each solution.
* Fitness function == Obj. fn

20 The better solutions become parents for the next generation of offspring.

21 How should we generate these offspring?

22 We don’t have to rely on the neighborhoods of each individual solution.

23 We can examine the neighborhoods of pairs of solutions.

24 We can use more than one parent solution to generate a new candidate solution.

25 One way is by taking parts of two parents and putting them together to form an offspring.

26 We might take the first half of one parent together with the second half of another.

27

28 With each generation, the individuals compete (among themselves or with their parents) for inclusion in the next iteration.

29 After a series of generations, we observe a succession of improvements and a convergence.

30 Why should we labor to solve a problem by calculating difficult mathematical expressions or developing computer programs to optimize the results?

31 Genetic Algorithm Genetic Programming Evolutionary Algorithm Evolutionary Programming

32 It’s often called a Swiss Army knife.

33 begin
     t := 0
     initialize P(t)
     evaluate P(t)
     while (not termination-condition) do
       t := t + 1
       select P(t) from P(t-1)
       alter P(t)
       evaluate P(t)
     end
   end
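As an illustrative rendering of this skeleton (the function names and fixed generation budget are our own choices, not the slide's), the loop can be written with the four problem-specific operations passed in as functions:

```python
def genetic_algorithm(init, evaluate, select, alter, generations=100):
    """GA skeleton: init() builds the initial population, evaluate(pop)
    returns a fitness list, select(pop, fit) picks the next generation,
    and alter(pop) applies crossover and mutation."""
    population = init()
    fitness = evaluate(population)
    for _ in range(generations):                    # termination-condition: fixed budget
        population = select(population, fitness)    # select P(t) from P(t-1)
        population = alter(population)              # alter P(t)
        fitness = evaluate(population)              # evaluate P(t)
    best = max(range(len(population)), key=lambda i: fitness[i])
    return population[best]         # best individual of the final population
```

Plugging in a TSP-specific representation, variation operators, evaluation function, and selection completes the algorithm.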

34

35 Q: How to design GA for TSP?

36 The idea is quite simple:
a population of TSP candidate solutions is evolved over iterations of variation and selection.

37 (Random) variation provides the mechanism for discovering new solutions.

38 Selection determines which solutions to maintain as a basis for further exploration.

39 Issues for solving TSP:
Representation Variation operators Evaluation function Selection

40 Issue 1: Representation

41 A representation is a mapping from the state space of possible solutions to a state space of encoded solutions within a particular data structure.

42 One possible way is to make an ordered list of the cities to visit.

43 A ten-city tour may be represented as an ordered list, e.g.:
<5 1 7 8 9 4 6 2 3 10>

44 The <permutation> representation provides a feasible tour, but it isn’t the only possible representation.

45 In the <adjacency> representation, city j is listed in position i if and only if the tour leads from city i to city j.

46 The vector <2 4 8 3 9 7 1 5 6> represents the tour 1 – 2 – 4 – 3 – 8 – 5 – 9 – 6 – 7
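The adjacency vector can be derived mechanically from a path-style tour; here is a small Python sketch (`path_to_adjacency` is our hypothetical helper name):

```python
def path_to_adjacency(tour):
    """Adjacency representation: position i holds city j iff the tour
    leads from city i to city j; the last city links back to the first,
    closing the cycle.  Cities are numbered 1..n."""
    n = len(tour)
    adj = [0] * n
    for k in range(n):
        nxt = tour[(k + 1) % n]   # successor of tour[k], wrapping around
        adj[tour[k] - 1] = nxt
    return adj

# The slide's tour 1 - 2 - 4 - 3 - 8 - 5 - 9 - 6 - 7:
path_to_adjacency([1, 2, 4, 3, 8, 5, 9, 6, 7])  # -> [2, 4, 8, 3, 9, 7, 1, 5, 6]
```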

47 What’s the advantage of this <adjacency> representation?

48 It allows us to look for templates that are associated with good solutions.

49 In the <ordinal> representation, the i-th element of the list is a number in the range from 1 to n-i+1.

50 The ordered list serves as a reference point for lists in ordinal representation.

51 Given an ordered list (1 2 3 4 5 6 7 8 9),
a tour 1 – 2 – 4 – 3 – 8 – 5 – 9 – 6 – 7 is represented as a list of references: <1 1 2 1 4 1 3 1 1>

52 What’s the advantage of this <ordinal> representation?

53 A good representation leads to good search (variation) operators.

54 Issue 2: Variation operators

55 The choice of how to represent a tour is related to the choice of what variation operators to use.

56 Let us assume that we use <Path Permutation> representation.

57 A tour 5 – 1 – 7 – 8 – 9 – 4 – 6 – 2 – 3 is represented as <5 1 7 8 9 4 6 2 3>

58 Q: how to make offspring from two parents (sequences)?

59 For example, P1 = <1 2 3 4 5 6 7 8 9> P2 = <4 5 2 1 8 7 6 9 3>

60 We introduce two variation operators, called crossovers.

61 1. Partially mapped crossover: PMX

62 PMX builds an offspring by choosing a subsequence of a tour from one parent and preserving the order and position of as many cities as possible from the other parent.

63 A subsequence of a tour is selected by choosing two random cut points, which serve as boundaries for the swapping operations.

64 For example, P1 = <1 2 3 4 5 6 7 8 9> P2 = <4 5 2 1 8 7 6 9 3>

65 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3>

66 The segments between cut points are swapped.

67 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3> O1 = <x x x | 1 8 7 6 | x x> O2 = <x x x | 4 5 6 7 | x x>

68 O1 = <x x x | 1 8 7 6 | x x> O2 = <x x x | 4 5 6 7 | x x> We can see a series of mappings: 1 – 4, 8 – 5, 7 – 6, 6 – 7

69 Then, we can fill in additional cities from the original parents where there is no conflict.

70 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3> O1 = <x 2 3 | 1 8 7 6 | x 9> O2 = <x x 2 | 4 5 6 7 | 9 3>

71 O1 = <x 2 3 | 1 8 7 6 | x 9> O2 = <x x 2 | 4 5 6 7 | 9 3> How do we fill in the values of x?

72 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3> O1 = <4 2 3 | 1 8 7 6 | x 9> O2 = <x x 2 | 4 5 6 7 | 9 3> We use the mapping 1 – 4.

73 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3> O1 = <4 2 3 | 1 8 7 6 | 5 9> O2 = <1 8 2 | 4 5 6 7 | 9 3>
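The PMX steps worked through above can be condensed into a short Python sketch. We assume parents P1 = <1 2 3 4 5 6 7 8 9>, P2 = <4 5 2 1 8 7 6 9 3> and cut points after positions 3 and 7, consistent with the mappings 1 – 4, 8 – 5, 7 – 6, 6 – 7 shown above; the function name and 0-based slicing are our own choices:

```python
def pmx_offspring(p1, p2, cut1, cut2):
    """One PMX offspring: p2's segment between the cuts, with the
    remaining cities taken from p1 where possible; conflicts are
    resolved by following the position-wise segment mapping."""
    n = len(p1)
    child = [None] * n
    child[cut1:cut2] = p2[cut1:cut2]               # swapped-in segment
    mapping = {p2[i]: p1[i] for i in range(cut1, cut2)}
    for i in list(range(cut1)) + list(range(cut2, n)):
        city = p1[i]
        while city in mapping:                     # e.g. 1 -> 4, 8 -> 5
            city = mapping[city]
        child[i] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
pmx_offspring(p1, p2, 3, 7)  # -> [4, 2, 3, 1, 8, 7, 6, 5, 9], i.e. O1
pmx_offspring(p2, p1, 3, 7)  # -> [1, 8, 2, 4, 5, 6, 7, 9, 3], i.e. O2
```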

74 2. Ordered crossover: OX

75 The OX builds offspring by choosing a subsequence of a tour from one parent and preserving the relative order of cities from the other parent.

76 Starting from the second cut point of one parent, the cities from the other parent are copied in the same order, omitting symbols that are already present.

77 When we reach the end of the string, we continue from its beginning.

78 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3>

79 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3> O1 = <x x x | 4 5 6 7 | x x> O2 = <x x x | 1 8 7 6 | x x>

80 P1 = <1 2 3 | 4 5 6 7 | 8 9> P2 = <4 5 2 | 1 8 7 6 | 9 3> The sequence of cities in the second parent, starting from the second cut point, is 9 – 3 – 4 – 5 – 2 – 1 – 8 – 7 – 6

81 O1 = <x x x | 4 5 6 7 | x x> O2 = <x x x | 1 8 7 6 | x x> 9 – 3 – 4 – 5 – 2 – 1 – 8 – 7 – 6 After removing 4, 5, 6, and 7, which are already in the first offspring, we obtain 9 – 3 – 2 – 1 – 8

82 O1 = <x x x | 4 5 6 7 | x x> O2 = <x x x | 1 8 7 6 | x x> This sequence, 9 – 3 – 2 – 1 – 8, is placed in the first offspring, starting from the second cut point. O1 = <2 1 8 | 4 5 6 7 | 9 3> Similarly, O2 = <3 4 5 | 1 8 7 6 | 9 2>

83 The OX operator exploits the fact that the relative order of the cities is important, i.e., the two tours
9 – 3 – 4 – 5 – 2 – 1 – 8 – 7 – 6 and 4 – 5 – 2 – 1 – 8 – 7 – 6 – 9 – 3 are in fact identical.

84 Many other crossovers have been suggested; we should weigh the pros and cons of these methods to achieve good results on the TSP project.

85 Other types of variation operators are mutations, which change an arbitrary point in a solution sequence from its parent with some probability.
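As one concrete instance (our choice of operator, not the slide's), a swap mutation changes a tour with some probability while keeping it a valid permutation:

```python
import random

def swap_mutation(tour, rate=0.1):
    """With probability `rate`, swap two randomly chosen positions.
    Swapping preserves the permutation property, whereas overwriting a
    single city would duplicate one city and lose another."""
    tour = list(tour)            # mutate a copy, keep the parent intact
    if random.random() < rate:
        i, j = random.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour
```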

86 The design of crossover and mutation allows the GA to avoid local optima; you should devote much time to this issue.

87 Issue 3: Evaluation function

88 Evaluation = Fitness function
= Objective function

89 The TSP has an easy and natural evaluation function: we calculate the total length of the tour.
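A sketch of that evaluation, assuming Euclidean city coordinates (the coordinate table here is toy data of our own):

```python
import math

def tour_length(tour, coords):
    """Total length of the closed tour: sum of edge lengths, including
    the edge from the last city back to the first."""
    total = 0.0
    for k in range(len(tour)):
        x1, y1 = coords[tour[k]]
        x2, y2 = coords[tour[(k + 1) % len(tour)]]   # wrap to close the tour
        total += math.hypot(x2 - x1, y2 - y1)
    return total

# toy instance: four cities on the corners of a unit square
coords = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}
tour_length([1, 2, 3, 4], coords)  # -> 4.0
```

If the GA maximizes fitness, the negated (or inverted) tour length is typically used, since shorter tours are better.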

90 Issue 4: Selection

91 The selection acts on individuals in the current population.

92 Based on each individual’s quality, it determines the next population.

93 Selection typically comes in one of two forms:

94 1. Some individuals are eliminated from consideration: each contributes at most once.

95 2. Individuals are sampled with replacement: each can be chosen multiple times.

96 Other researchers classify selection methods as either deterministic or stochastic.

97 1. Deterministic selection will always eliminate the same individuals, given the same population.

98 It is faster than stochastic selection.

99 2. Stochastic (probabilistic) selection generates a probability distribution over the possible compositions of the next iteration.

100 e.g. 1) fitness-proportional selection: the expected number of copies of an individual in the next generation is proportional to its fitness in the current population

101 e.g. 2) random tournaments

102 It may or may not be beneficial depending on where the population converges.
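A random tournament, as in e.g. 2) above, can be sketched as follows (minimization over tour lengths; `k` is the tournament size, a parameter name of our own):

```python
import random

def tournament_select(population, fitness, k=2):
    """Stochastic selection by random tournaments: repeatedly draw k
    individuals at random and keep the best of each draw, until the
    new population is full.  Individuals are sampled with replacement
    across tournaments, so a strong one can be chosen several times."""
    chosen = []
    for _ in range(len(population)):
        contestants = random.sample(range(len(population)), k)
        winner = min(contestants, key=lambda i: fitness[i])  # best = lowest cost
        chosen.append(population[winner])
    return chosen
```

Larger `k` increases selection pressure and thus convergence speed, at the risk of premature convergence.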

103 Q: What are the weak points?

104 1. GA is much slower than the other methods.

105 2. GA requires quite a few parameters.

106 The population size, the probabilities of the various operators, the mutation size, the number of crossover points, the ratio of parents to offspring, …

107 Tuning them is time-consuming, and considerable effort is necessary to develop good heuristics for making these choices across many problems.

108 Premature Convergence
vs. Convergence Speed

109 Estimation of Distribution Algorithms
Particle Swarm Optimization Ant Colony Optimization Genetic Local Search

110

111 Hybrid / GLS / Memetic Algorithm

112

113 From Wikipedia, “Metaheuristic”

