Presentation is loading. Please wait.

Presentation is loading. Please wait.

Genetic Algorithms Beasley, Bull and Martin,

Similar presentations


Presentation on theme: "Genetic Algorithms Beasley, Bull and Martin,"— Presentation transcript:

1 Genetic Algorithms Beasley, Bull and Martin,
An Overview of Genetic Algorithms: Part 1, Fundamentals & Part 2, Research Topics University Computing, 1993

2 Background of GAs GAs are based on genetic processes of biological organisms, i.e. evolution according to principles of natural selection and survival of the fittest In nature, individuals in a population compete with each other for resources and to attract a mate The fittest ones survive and produce offspring, spreading their genetic properties to population Combination of good properties may in time produce “superfit” offspring GAs were first proposed by Holland (1975)

3 Basic notions of GAs GAs work with a population of individuals, each representing a solution to the problem at hand Each individual is assigned a fitness score Highly fit individuals are selected as parents and given an opportunity to reproduce by crossover, leading to exploration of the most promising regions in the search space Offspring produced share features taken from their parents and may be subject to mutation

4 Basic notions of GAs (cont.)
Best individuals are selected (from parents and/or offspring) to form the next generation Over many generations, good features spread throughout the population, being mixed and exchanged with other good features This leads to convergence to a good solution GAs are robust and used for a variety of problems The main area for GAs is difficult problems for which there are no specialized techniques

5 A generic GA Generate an initial population of size population_size
Compute fitness of each individual Repeat Repeat for population_size/2 Select two parents based on fitness values Recombine the parents to produce two offspring by applying crossover (with rate/probability pc) and mutation (with probability pm) Compute fitness values of offspring Form the population for the next generation Until population has converged

6 Decisions in a GA Chromosome representation or coding of a solution
Fitness function Population size, generation of initial population Parent selection for reproduction Crossover rate/probability (pc), crossover operator Mutation probability (pm), mutation operator Forming the population for next generation Stopping (convergence) condition

7 Coding or representation
A potential solution to a problem may be coded or represented by a set of variables In GAs, each of these variables (solution components) is called a gene A string of genes, representing a complete solution, is called a chromosome The set of variables represented by a chromosome is called genotype, solution constructed using these variables is called phenotype The ideal representation scheme is binary coding

8 Examples of binary coding
Optimization of function f(x,y,z) (assuming ) Phenotype: x=46, y=24, z=13 Genotype: | |

9 Examples of binary coding (cont.)
Assignment problem Phenotype: facility f1 is assigned to location l1, f2 to l4, f3 to l2 Genotype: location

10 Examples of binary coding (cont.)
TSP Phenotype: the tour Genotype: to city

11 Fitness function Fitness function returns a single numerical fitness for a chromosome Fitness value is used in probabilistic selection of parents for reproduction; usually the higher the fitness, the higher the probability of selection Fitness function may simply be the objective function where we optimize a single criterion It may also be a more complicated measure involving multiple criteria and penalties for infeasibility

12 Fitness function (cont.)
Fitness function should be smooth and regular (similar chromosomes must have close fitness) It should reflect real value of a chromosome Let the magnitude of penalty reflect the “amount” of constraint violation, e.g. how much will it cost to convert the chromosome into a valid one? Use approximate function evaluation when Evaluating true fitness is too costly Fitness function is stochastic

13 Population size and generation of initial population
Increasing the population size usually increases solution quality but requires more computation Common population sizes are 20, 30, 50, 100 In any case, population size is such a small fraction of the search space that increasing it further is not justified Initial population can be generated Randomly Completely or partly (called seeding) by heuristic(s)

14 Reproduction Reproduction involves
Parent selection Recombination by using crossover and mutation operators Crossover is the more important of the two for rapidly exploring the search space Mutation provides a small amount of random search and ensures that every point in search space is accessible

15 Parent selection Some individuals from the population are selected to form a mating pool (multiple copies of the same individual may be allowed) Size of the mating pool depends on the crossover rate/probability and the replacement scheme Two extremes are For 100% replacement: Size of the mating pool is the same as the population size, pc=1.0 For steady-state replacement: Two parents are selected to reproduce two offspring, which replace two worst parents

16 Parent selection (cont.)
One idea is to allocate the number of reproductive trials (the number of times an individual is copied into the mating pool) or to assign selection probability to individuals in proportion to their fitness, e.g. for maximization

17 Parent selection (cont.)
The number of reproductive trials may not be integer in which case we can use a stochastic sampling method, e.g. if reproductive trials for individuals i and j are 1.8 and 1.2 then each will have one copy in the mating pool and the third will be either i or j with remainder probabilities Selection probabilities may not add up to 1.0 in which case we can normalize.

18 Parent selection (cont.)
The above would have worked with infinite population size but with finite population it may cause a few highly fit individuals to dominate the population rapidly (premature convergence) Conversely, the population may converge after many generations, but without precisely locating the optimum due to insufficient gradient in fitness function to push the GA towards the optimum (slow finishing)

19 Fitness remapping Fitness remapping is used to avoid
Premature convergence, by compressing the range of fitness values Slow finishing, by expanding the range of fitness values Selection pressure (ratio of maximum to average reproductive trials allocated) can be adjusted by explicit or implicit fitness remapping

20 Fitness remapping (cont.)
Fitness scaling: Bring the maximum number of reproductive trials allocated to an individual to desired level by shifting the fitness values, i.e. Fitness windowing: Same as above where s is the minimum fitness observed during the last n (typically 10) generations

21 Fitness remapping (cont.)
Fitness ranking: To avoid using extreme fitness values, sort individuals according to raw fitness, then assign reproductive trials according to rank (found to be superior to scaling) Tournament selection: Select a pair of individuals from the population at random (with replacement), copy the better one into the mating pool, repeat until the pool is full Probabilistic tournament selection: Better individual is selected with a probability > 0.5.

22 Crossover operator Typically, crossover takes two parents, cuts their chromosome strings at a randomly chosen position, swaps the head (or tail) segments to produce two offspring Crossover is not usually applied to all pairs of parents selected for mating Likelyhood of crossover being applied to a pair, pc, is usually between 0.6 and 1.0 If crossover is not applied, offspring are produced by duplicating their parents (no disruption)

23 Crossover operator (cont.)
Alternatively, taking pc as constant crossover rate, pc x population_size/2 pairs are selected, and crossover is applied to all selected pairs Most common crossover operators for binary representation are: 1-point crossover 2-point crossover (chromosome is viewed as a loop rather than a string) Uniform crossover

24 1- and 2-point crossover 1-point Parent 1: 1 0 1 0 | 0 0 1 1 1 0
crossover Parent 2: | Offspring 1: | Offspring 2: | 2-point Parent 1: | | 1 1 0 crossover Parent 2: | | 0 1 0 Offspring 1: | | 1 1 0 Offspring 2: | | 0 1 0

25 Uniform crossover A binary crossover mask is used to determine which gene will be taken from which parent Parent 1: Parent 2: Crossover mask: Offspring 1: Offspring 2:

26 Mutation operator Mutation is applied to every offspring by altering each binary gene with a small probability, pm (typically 0.001) Offspring: Mutated offspring: Alternatively, the entire chromosome may be mutated at once by a higher pm, particularly when a non-binary representation and problem specific genetic operators are used

27 Forming population for next generation (replacement)
After two offspring are produced and mutated, they may replace their parents Unconditionally (a generation gap of 100%) If they are more fit than their parents Alternatively, all parents and offspring may be sorted together according to their fitness, and the best population_size of them may be selected Steady-state replacement replaces only a few parents (two worst parents by two best offspring)

28 Convergence A gene is said to converge when 95% of the population share the same value The population is said to converge when all genes have converged To monitor convergence, plot population average, population best and incumbent solution throughout the generations As the population converges, average fitness approaches the best

29 Convergence (cont.)

30 Why GAs work: Schemata A schema is a pattern of gene values represented by a string of characters in the alphabet {0, 1, #} where # matches anything For example, the chromosome 1010 contains, among others, the schemata 10##, #0#0, ##1#, 101# Order of a schema is the number of non-# symbols it contains (2, 2, 1, 3 in the example) Defining length of a schema is the distance between the outermost non-# symbols (2, 3, 1, 3)

31 Schema theorem It is assumed that an individual’s high fitness is due to the good schemata it contains Holland (1975) showed that, under simplifying assumptions, the optimum way to explore the search space is to allocate reproductive trials to individuals in proportion to their fitness values In this way, good schemata receive an exponentially increasing number of reproductive trials in successive generations (this is called the schema theorem)

32 Schema theorem (cont.) Holland also showed that, since each individual contains many different schemata, the number of schemata effectively processed in each generation is in the order of population_size3 This property is known as implicit parallelism, and is one of the explanations for the good performance of GAs Binary coding is thought to be ideal because these theoretical results are valid for binary coding

33 Building block hypothesis
Goldberg (1989) claims that power of the GA lies in it being able to find good building blocks Building blocks are schemata of short defining length consisting of genes that work well together, and improve performance when incorporated into an individual Short defining length is needed so that building blocks are disrupted less by random cut points in 1- or 2-point crossover

34 Building block hypothesis (cont.)
Hence a successful coding scheme encourages formation of building blocks by ensuring that Related genes are close together on the chromosome There is little interaction between genes Interaction (epistasis) means that contribution of a gene to the fitness depends on values of other genes in the chromosome There is always some interaction between genes in multimodal fitness functions, and the above conditions are not easy to satisfy

35 Exploration and exploitation
Any good search algorithm must find a tradeoff between exploration and exploitation, e.g. random search does only exploration whereas traditional descent does only exploitation Holland showed that a GA does both simultaneously in an optimal way, assuming that Population size is infinite Fitness function accurately reflects the solution’s utility Genes in a chromosome do not interact significantly

36 Exploration and exploitation (cont.)
The first assumption can never be satisfied in practice; GA’s “population” is only a sample and stochastic error is unavoidable Genetic drift: Even in the absence of any selection pressure (i.e. a constant fitness function), the GA will still converge if, by chance, a chromosome becomes predominant in the population For the GA to properly exploit, the fitness function must provide a sufficiently large slope to counteract the genetic drift Mutation can be useful in avoiding genetic drift

37 Comparison of GAs with others
Random search does only exploration, traditional ascent (hillclimbing) does only exploitation, GA does both. Iterated hillclimbing with random restarting points allocates its trials evenly over the search space, GA allocates increasing trials to promising regions SA and TS deal with one candidate solution at a time, GA has a population and implicit parallelism TS is usually deterministic, GA is stochastic SA does not have memory, TS does, GA?

38 Part 2: Crossover revisited
2-point crossover is better than 1-point because a chromosome, when consired as a loop, can contain more building blocks Schemata of a particular order are equally likely to be disrupted by uniform crossover, irrespective of their defining length Schemata with long defining length are more likely to be disrupted by 2-point crossover, irrespective of their order

39 Crossover revisited (cont.)
Schemata with short defining length are more likely to be disrupted by uniform crossover, but the same is not true for longer defining length (?) Hence, total amount of schemata disruption may be lower with uniform crossover Ordering of genes in the chromosome is not important with uniform crossover, hence it is more robust than 2-point crossover Theoretical and empirical results show that there is no overall winner

40 Mutation revisited Mutation is traditionally regarded less important than crossover and used to provide a small amount of random search and to avoid genetic drift However, asexual reproduction can also result in successful evolution Naive evolution (just selection and mutation) results in slower evolution than crossover alone, but it may find better solutions at the end Indeed, as the population converges, mutation becomes more productive than crossover

41 Inversion and reordering
Order of genes on a chromosome is critical for the building block hypothesis to work effectively Purpose of inversion/reordering is to find gene orderings that have better evolutionary potential Inversion is a special form of reordering which reverses the order of genes between two randomly selected positions Reordering does not lower epistasis; nor does it help when linear ordering of genes is not possible Reordering also expands the search space

42 More on epistasis (interaction)
In biology, a gene is epistatic when its presence suppresses the effect of a gene at another position Even when individual genes are not epistatic, there will be “chains of influence” (one gene’s product affects another gene’s function) Hence, interaction among genes is unavoidable Interaction is inherent in some problems, e.g. In TSP, it is the relationship (distance) between cities that counts, not the cities themselves Two facilities cannot be assigned to the same location

43 Deception Normally, short, low-order schemata contained in global optimal solution are expected to increase in frequency throughout the evolution If schemata not contained in optimal solution have higher fitness, then they will increase in frequency faster, and the GA will be misled Deception is a special case of epistasis Deceptive problems may be difficult to solve, but the bias introduced in average fitness estimation after the first generation may help solve them

44 Tackling epistasis Epistasis can be tackled in two ways
As a coding problem As a GA theory problem If taken as a coding problem, the solution is to find a different coding scheme and to develop appropriate genetic operators, e.g. Goldsberg’s order-schemata and PMX crossover Expansive coding which uses a larger number of weakly interacting genes (larger search space) instead of a small number of strongly interacting genes

45 Tackling epistasis (cont.)
We will see such examples for ordering problems when we discuss genetic operators for TSP When treated as a GA theory problem, a new theory (and new algorithms) may have to be developed, which takes epistasis into account Although Holland’s convergence proof assumes low epistasis, there may be a weaker proof for domains of high epistasis

46 Non-binary representations
Binary representation, where each gene has a cardinality of two, is traditionally believed to give the largest number of schemata and to provide highest degree of implicit parallelism Recently, higher-cardinality representations are claimed to contain more schemata; they can perform well Integer or real numbers can be used as high-cardinality alphabets, and meaningful problem specific genetic operators can be defined easily

47 Non-binary representations (cont.)
Examples of non-binary crossover operators Take arithmetic average of the two gene values Take geometric mean (square root of the product) Take the difference between the two gene values, add it to the higher or subtract it from the lower Examples of non-binary mutation operators Replace the current value with a random one Add or subtract a small random amount (creep) Multiply by a random amount close to one (geometric creep)

48 Dynamic operator probabilities
Crossover probability, pc, and mutation probability, pm, may vary during the evolution For better exploration, pc may decrease and pm may increase during the run according to a fixed schedule, e.g. linearly For convergence, pm may decrease exponentially (similar to the temperature in SA) pc and pm can be adjusted dynamically, depending on the spread of fitness values, e.g. increase pm as the spread decreases (as the population converges)

49 Dynamic operator probabilities (cont.)
Probability of the more successful operator can be increased, e.g. Monitor the fitness improvement due to crossover and mutation operators over the last n reproductive trials Give more weight to the more successful operator For each reproductive trial, choose one of the operators probabilistically according to its weight Different crossover and mutation operators may also be weighted in a similar manner

50 Niche and speciation In nature, different species evolve to fill different ecological niches Speciation is the process by which a single species differentiates into two or more different species Niches are analogous to alternative maxima of fitness values in GAs Normally, a GA cannot find these alternatives because of genetic drift and convergence (a GA does not allow speciation and the entire population end up in the same niche)

51 Niche and speciation (cont.)
To solve this problem, we should Maintain diversity by encouraging speciation Share the payoff associated with a niche Preselection: Offspring replace the parents only if the offspring’s fitness is higher than that of the inferior parent (this maintains population diversity since similar individuals replace each other) Crowding: Offspring is compared with a few randomly selected individuals and replaces the most similar one (again for diversity)

52 Niche and speciation (cont.)
Restricted mating: Individuals are allowed to mate only if they are similar (this encourages speciation); offspring of two highly fit but dissimilar parents may be unfit Multiple subpopulations: Population is divided into subpopulations, each evolving in itself, and migration is allowed at a limited rate (again for speciation) Local mating: Similar to multiple subpopulations, but without explicit boundaries

53 Niche and speciation (cont.)
Sharing: Similar individuals that are in the same niche share the fitness payoff among them A full niche is no longer rewarding since the payoff is shared and individual fitness values are reduced Sharing distributes individuals to peaks in fitness function in proportion to the height of the peak Sharing is found to be superior to crowding Sequential niches: Multiple GA runs are made, each locating a new peak (previously found peaks are cancelled out from the fitness function)

54 Diploidy and dominance
Diploidy: Higher lifeforms like mamals have two sets of genes; of a pair of genes, one is dominant and the other is recessive Diploidy allows two solutions to be remembered instead of one, and provides higher diversity Potentially useful gene sets can be maintained in recessive position An extension can be to keep the best individuals (elite solutions?) and try reintroducing them to the population if the performance falls

55 Problem specific knowledge
Binary coding, random initial population, and traditional crossover/mutation operators follow the biological process more closely (are generic) but do not make use of problem specific knowledge Using problem specific knowledge, we can Find more suitable representation schemes Generate initial population using heuristics Develop problem specific genetic operators that guarantee feasibility, particularly in ordering problems Use local improvement as a form of mutation


Download ppt "Genetic Algorithms Beasley, Bull and Martin,"

Similar presentations


Ads by Google