# Introduction to Genetic Algorithms Speaker: Moch. Rif’an


Introduction to Genetic Algorithms Speaker: Moch. Rif’an rifan@ub.ac.id

What are genetic algorithms? A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics. They are a particular class of evolutionary algorithms (a field also known as evolutionary computation) that use techniques inspired by evolutionary biology, such as inheritance, mutation, selection, and crossover (also called recombination).

A short history of genetic algorithms
- 1954: computer simulations of evolution began with the work of Nils Aall Barricelli
- 1960s: Hans Bremermann published a series of papers that adopted a population of solutions to optimization problems
- 1963: Barricelli simulated the evolution of the ability to play a simple game
- 1970: the first books describing the methods used appeared (Fraser and Burnell)
- 1970s: artificial evolution became a widely recognized optimization method
- 1980s: the First International Conference on Genetic Algorithms was held in Pennsylvania; General Electric started selling the world's first genetic algorithm product
- 1989: Axcelis, Inc. released Evolver, the world's second GA product and the first for desktop computers

Genetic Algorithms - History
- Pioneered by John Holland in the 1970s
- Became popular in the late 1980s
- Based on ideas from Darwinian evolution
- Can be used to solve a variety of problems that are not easy to solve using other techniques

Biological Background (1) – The cell
- Every animal cell is a complex of many small “factories” working together
- The center of all this is the cell nucleus
- The nucleus contains the genetic information

Biological Background (2) – Chromosomes
- Genetic information is stored in the chromosomes
- Each chromosome is built of DNA
- Chromosomes in humans form pairs; there are 23 pairs
- The chromosome is divided into parts: genes
- Genes code for properties
- Each of the possible variants of the gene for one property is called an allele
- Every gene has a unique position on the chromosome: its locus

Biological Background (3) – Genetics
- The entire combination of genes is called the genotype
- A genotype develops into a phenotype
- Alleles can be either dominant or recessive
- Dominant alleles are always expressed from the genotype in the phenotype
- Recessive alleles can survive in the population for many generations without being expressed

Biological Background (4) – Reproduction
- Genetic information is reproduced in two ways: mitosis and meiosis
- Mitosis copies the same genetic information to new offspring: there is no exchange of information
- Mitosis is the normal way multicellular structures, such as organs, grow

Biological Background (5) – Reproduction
- Meiosis is the basis of sexual reproduction
- After meiotic division 2 gametes appear in the process
- In reproduction two gametes conjugate to form a zygote, which will become the new individual
- Hence genetic information is shared between the parents in order to create new offspring

Biological Background (6) – Reproduction
- During reproduction “errors” occur
- Due to these “errors” genetic variation exists
- The most important “errors” are:
  - Recombination (cross-over)
  - Mutation

Biological Background (7) – Natural selection
- The origin of species: “Preservation of favourable variations and rejection of unfavourable variations.”
- More individuals are born than can survive, so there is a continuous struggle for life
- Individuals with an advantage have a greater chance to survive: survival of the fittest

Biological Background (8) – Natural selection
- Important aspects of natural selection are:
  - adaptation to the environment
  - isolation of populations in different groups which cannot mutually mate
- If small changes in the genotypes of individuals are expressed easily, especially in small populations, we speak of genetic drift
- Success in life is expressed mathematically as fitness

How GAs Differ from Traditional Search Methods
- GAs work with a coding of the parameter set, not the parameters themselves.
- GAs search from a population of points, not a single point.
- GAs use payoff information, not derivatives or auxiliary knowledge.
- GAs use probabilistic transition rules, not deterministic rules.

Evolution in the real world
- Each cell of a living thing contains chromosomes - strings of DNA
- Each chromosome contains a set of genes - blocks of DNA
- Each gene determines some aspect of the organism (like eye colour)
- A collection of genes is sometimes called a genotype
- A collection of aspects (like eye colour) is sometimes called a phenotype
- Reproduction involves recombination of genes from parents and then small amounts of mutation (errors) in copying
- The fitness of an organism is how much it can reproduce before it dies
- Evolution is based on “survival of the fittest”

Start with a Dream…
- Suppose you have a problem
- You don’t know how to solve it
- What can you do?
- Can you use a computer to somehow find a solution for you?
- This would be nice! Can it be done?

A dumb solution
A “blind generate and test” algorithm:
- Repeat
  - Generate a random possible solution
  - Test the solution and see how good it is
- Until solution is good enough
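The loop above can be sketched in a few lines of Python. This is only an illustration: the function name, the `max_tries` safety cap, the fixed seed, and the toy scoring function are assumptions made for the example, not part of the slides.

```python
import random


def blind_generate_and_test(score, random_solution, good_enough,
                            max_tries=100_000, seed=0):
    """Generate random candidates until one scores at least `good_enough`."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best, best_score, tries = None, float("-inf"), 0
    while tries < max_tries:   # assumed safety cap, not in the slide
        tries += 1
        candidate = random_solution(rng)    # generate a random possible solution
        candidate_score = score(candidate)  # test it and see how good it is
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        if best_score >= good_enough:       # until solution is good enough
            break
    return best, best_score, tries


# Hypothetical toy problem: find a whole number in [0, 1000] within 5 of 700.
def toy_score(x):
    return -abs(x - 700)


best, best_score, tries = blind_generate_and_test(
    toy_score, lambda rng: rng.randint(0, 1000), good_enough=-5)
print(best, best_score, tries)
```

Nothing learned from one candidate is used to produce the next, which is why the method only works when the search space is small.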

Can we use this dumb idea?
- Sometimes - yes:
  - if there are only a few possible solutions
  - and you have enough time
  - then such a method could be used
- For most problems - no:
  - many possible solutions
  - with no time to try them all
  - so this method cannot be used

A “less-dumb” idea (GA)
- Generate a set of random solutions
- Repeat
  - Test each solution in the set (rank them)
  - Remove some bad solutions from the set
  - Duplicate some good solutions
  - Make small changes to some of them
- Until best solution is good enough
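A minimal sketch of this loop follows, assuming that "remove some bad solutions" means dropping the worse half and that the duplicates are the ones that receive small changes. The names `simple_evolve`, `toy_score` and `toy_mutate`, the population size and the generation cap are all illustrative choices.

```python
import random


def simple_evolve(score, random_solution, mutate, good_enough,
                  pop_size=20, max_generations=1000, seed=0):
    """Mutation-only evolution: rank, cull the worse half, copy and perturb."""
    rng = random.Random(seed)
    # Generate a set of random solutions.
    population = [random_solution(rng) for _ in range(pop_size)]
    for generation in range(max_generations):
        # Test each solution in the set (rank them, best first).
        population.sort(key=score, reverse=True)
        if score(population[0]) >= good_enough:
            return population[0], generation
        # Remove some bad solutions (assumed here: the worse half).
        survivors = population[:pop_size // 2]
        # Duplicate some good solutions and make small changes to the copies.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=score), max_generations


# Hypothetical toy problem: find the whole number 700 in [0, 1000].
def toy_score(x):
    return -abs(x - 700)


def toy_mutate(x, rng):
    # A "small change": shift by at most 10 and stay inside the range.
    return min(1000, max(0, x + rng.randint(-10, 10)))


best, generations = simple_evolve(
    toy_score, lambda rng: rng.randint(0, 1000), toy_mutate, good_enough=0)
print(best, generations)
```

Unlike blind generate and test, each new candidate starts from a solution that has already scored well.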

Silly Example - Drilling for Oil
- Imagine you had to drill for oil somewhere along a single 1 km desert road
- Problem: choose the best place on the road, the one that produces the most oil per day
- We could represent each solution as a position on the road
- Say, a whole number in the range [0..1000]

Where to drill for oil? [Figure: the road drawn as a line from 0 to 1000 with 500 at the midpoint; Solution1 = 300 and Solution2 = 900 are marked.]

Digging for Oil
- The set of all possible solutions [0..1000] is called the search space or state space
- In this case it’s just one number, but it could be many numbers or symbols
- Often GAs code numbers in binary, producing a bitstring representing a solution
- In our example we choose 10 bits, which is enough to represent 0..1000

Drilling for Oil [Figure: oil plotted against location along the road from 0 to 1000; Solution1 = 300 (0100101100) and Solution2 = 900 (1110000100) are marked, and the values 30 and 5 appear on the plot.]
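The 10-bit strings for the two marked positions can be checked with a short sketch. The helper names `encode` and `decode` are made up for this example.

```python
def encode(position, n_bits=10):
    """Return the fixed-width binary string for a whole-number position."""
    if not 0 <= position < 2 ** n_bits:
        raise ValueError("position does not fit in the requested number of bits")
    return format(position, f"0{n_bits}b")


def decode(bits):
    """Return the whole number represented by a binary string."""
    return int(bits, 2)


print(encode(300))           # 0100101100
print(encode(900))           # 1110000100
print(decode("1110000100"))  # 900
```

Ten bits cover 0..1023, so the strings for 1001..1023 decode to positions beyond the end of the road; a GA using this encoding has to deal with them, for example by clamping or by giving them a low fitness.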

Classes of Search Techniques
- Search techniques
  - Calculus-based techniques
    - Fibonacci
    - Sort
  - Guided random search techniques
    - Tabu search
    - Hill climbing
    - Simulated annealing
    - Evolutionary algorithms
      - Genetic programming
      - Genetic algorithms
  - Enumerative techniques
    - BFS
    - DFS
    - Dynamic programming

Search Space
- For a simple function f(x) the search space is one-dimensional.
- But by encoding several values into the chromosome many dimensions can be searched, e.g. two dimensions f(x,y)
- The search space can be visualised as a surface or fitness landscape in which fitness dictates height
- Each possible genotype is a point in the space
- A GA tries to move the points to better places (higher fitness) in the space

Fitness landscapes

Search Space
- Obviously, the nature of the search space dictates how a GA will perform
- A completely random space would be bad for a GA
- GAs can also get stuck in local maxima if the search space contains lots of these
- Generally, spaces in which small improvements get closer to the global optimum are good

GA Algorithm
- Generate a set of random solutions
- Repeat
  - Test each solution in the set (rank them)
  - Remove some bad solutions from the set
  - Duplicate some good solutions
  - Make small changes to some of them
- Until best solution is good enough

The Evolutionary Cycle [Figure: cycle diagram with the stages population, selection, modification and evaluation, linked by arrows labelled initiate & evaluate, parents, modified offspring, evaluated offspring, deleted members and discard.]

A genetic algorithm maintains a population of candidate solutions for the problem at hand, and makes it evolve by iteratively applying a set of stochastic operators

Vocabulary
- Gene – A single encoding of part of the solution space.
- Chromosome – A string of “Genes” that represents a solution.
- Population – The number of “Chromosomes” available to test.

Adding Sex - Crossover
- Although it may work for simple search spaces, our algorithm is still very simple
- It relies on random mutation to find a good solution
- It has been found that introducing “sex” into the algorithm gives better results
- This is done by selecting two parents during reproduction and combining their genes to produce offspring

Adding Sex - Crossover
- Two high-scoring “parent” bit strings (chromosomes) are selected and, with some probability (the crossover rate), combined
- This produces two new offspring (bit strings)
- Each offspring may then be changed randomly (mutation)
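The two operators might look like the sketch below. The slides do not say how the parents' genes are combined, so one-point crossover is assumed here, and the rates 0.7 and 0.01 are illustrative defaults rather than values from the talk.

```python
import random


def one_point_crossover(parent_a, parent_b, rng, crossover_rate=0.7):
    """With probability `crossover_rate`, swap the tails of two bit strings."""
    if len(parent_a) > 1 and rng.random() < crossover_rate:
        point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the string
        return (parent_a[:point] + parent_b[point:],
                parent_b[:point] + parent_a[point:])
    return parent_a, parent_b  # no crossover: offspring are copies of the parents


def mutate(bits, rng, mutation_rate=0.01):
    """Flip each bit independently with probability `mutation_rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
        for bit in bits)


rng = random.Random(0)
child_a, child_b = one_point_crossover("0100101100", "1110000100", rng)
print(mutate(child_a, rng), mutate(child_b, rng))
```

At every position the two children together carry exactly the bits the two parents had there, so crossover only rearranges existing material; new bit values enter the population through mutation.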

Methodology Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes or the genotype or the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves toward better solutions.

A typical genetic algorithm requires two things to be defined:
- A genetic representation of the solution domain
- A fitness function to evaluate the solution domain

A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way. The fitness function is defined over the genetic representation and measures the quality of the represented solution.

Initialization Initially many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem and typically ranges from several hundred to several thousand possible solutions. Traditionally, the population is generated randomly, covering the entire range of possible solutions.

Selecting Parents
- Many schemes are possible, so long as better-scoring chromosomes are more likely to be selected
- The score is often termed the fitness
- “Roulette Wheel” selection can be used:
  - Add up the fitnesses of all chromosomes
  - Generate a random number R between 0 and that total
  - Select the first chromosome in the population that, when all previous fitnesses are added, gives you at least the value R
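A sketch of roulette wheel selection under two assumptions: fitness values are non-negative, and the running total must strictly exceed R (with R drawn from [0, total)), so that a chromosome with zero fitness is never picked. The function name and the three-member demo population are invented for the example.

```python
import random
from itertools import accumulate


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)  # add up the fitnesses of all chromosomes
    if total <= 0:
        raise ValueError("roulette selection needs a positive total fitness")
    r = rng.random() * total  # random number R in [0, total)
    for individual, running_total in zip(population, accumulate(fitnesses)):
        if running_total > r:  # first chromosome whose running total passes R
            return individual
    return population[-1]      # guard against floating-point round-off


rng = random.Random(0)
population = ["a", "b", "c"]
fitnesses = [1, 0, 3]
picks = [roulette_select(population, fitnesses, rng) for _ in range(10_000)]
counts = {name: picks.count(name) for name in population}
print(counts)
```

With these weights "c" should be drawn about three times as often as "a", and "b" not at all.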

Example: Discrete Representation (Binary alphabet)
- An individual can be represented using discrete values (binary, integer, or any other system with a discrete set of values).
- The following is an example of binary representation.

[Figure: a binary string with the labels CHROMOSOME and GENE.]

Example: Discrete Representation (Binary alphabet) An 8-bit genotype can stand for many kinds of phenotype: an integer, a real number, a schedule, ... anything?

Example: Discrete Representation (Binary alphabet) Phenotype could be integer numbers. Genotype: 10100011. Phenotype: $1\cdot 2^7 + 0\cdot 2^6 + 1\cdot 2^5 + 0\cdot 2^4 + 0\cdot 2^3 + 0\cdot 2^2 + 1\cdot 2^1 + 1\cdot 2^0 = 128 + 32 + 2 + 1 = 163$

Example: Discrete Representation (Binary alphabet) Phenotype could be real numbers, e.g. a number between 2.5 and 20.5 using 8 binary digits. Genotype: 10100011. Phenotype: $x = 2.5 + \frac{163}{256}(20.5 - 2.5) = 13.9609$
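The value 13.9609 is what one gets by reading the 8-bit string 10100011 (163 as an integer) as a fraction of 2^8 = 256 and scaling it into the interval from 2.5 to 20.5. The sketch below assumes that mapping; the function names are illustrative.

```python
def bits_to_int(bits):
    """Read a bit string as an unsigned binary integer."""
    return int(bits, 2)


def bits_to_real(bits, low, high):
    """Scale the integer value into [low, high) using 2**len(bits) steps."""
    return low + bits_to_int(bits) / 2 ** len(bits) * (high - low)


genotype = "10100011"
print(bits_to_int(genotype))                        # 163
print(round(bits_to_real(genotype, 2.5, 20.5), 4))  # 13.9609
```

Dividing by 256 means the upper bound 20.5 is never reached (the largest value is 2.5 + 255/256 × 18); dividing by 255 instead would include both end points but would not reproduce the number on the slide.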

Example: Discrete Representation (Binary alphabet) Phenotype could be a schedule, e.g. 8 jobs, 2 time steps. The 8-bit genotype maps to the phenotype:

| Job       | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|-----------|---|---|---|---|---|---|---|---|
| Time step | 2 | 1 | 2 | 1 | 1 | 1 | 2 | 2 |
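The genotype for this slide did not survive in the transcript. The sketch below assumes it is the same string as in the integer example, 10100011, with bit i assigning job i to time step 2 when the bit is 1 and to time step 1 when it is 0; under that assumption it reproduces the job and time-step assignment listed above.

```python
def bits_to_schedule(bits):
    """Map bit i to the time step of job i: '1' -> step 2, '0' -> step 1 (assumed)."""
    return {job: (2 if bit == "1" else 1)
            for job, bit in enumerate(bits, start=1)}


schedule = bits_to_schedule("10100011")
print(schedule)  # {1: 2, 2: 1, 3: 2, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2}
```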

Selection During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, in which fitter solutions are more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as rating every solution may be very time-consuming.

Example (selection1) Next we apply fitness-proportionate selection with the roulette wheel method: each individual gets a slice of the wheel whose area is proportional to its fitness value, so individual i is chosen with probability $f(i) / \sum_{j} f(j)$. [Figure: roulette wheel with slices labelled 1, 2, 3, 4, ..., n.] We repeat the extraction as many times as the number of individuals needed to keep the parent population the same size (6 in our case).

Reproduction The next step is to generate a second generation population of solutions from those selected through genetic operators: crossover (also called recombination) and/or mutation. For each new solution, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using crossover and mutation, a new solution is created which shares many of the characteristics of its "parents". New parents are selected for each child and the process continues until a new population of solutions of appropriate size is generated.

Termination This generational process is repeated until a termination condition has been reached:
- A solution is found that satisfies minimum criteria
- A fixed number of generations is reached
- The allocated budget (computation time/money) is reached
- The highest-ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
- Manual inspection
- Combinations of the above

Example (initialization) We toss a fair coin 60 times and get the following initial population:
- s1 = 1111010101, f(s1) = 7
- s2 = 0111000101, f(s2) = 5
- s3 = 1110110101, f(s3) = 7
- s4 = 0100010011, f(s4) = 4
- s5 = 1110111101, f(s5) = 8
- s6 = 0100110000, f(s6) = 3
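Each listed fitness value equals the number of 1 bits in the string, so the sketch below assumes that this count is the fitness function. It then derives the probability that fitness-proportionate selection would give each individual.

```python
population = [
    "1111010101",  # s1
    "0111000101",  # s2
    "1110110101",  # s3
    "0100010011",  # s4
    "1110111101",  # s5
    "0100110000",  # s6
]


def fitness(bits):
    """Assumed fitness: the number of 1 bits in the string."""
    return bits.count("1")


fitnesses = [fitness(s) for s in population]
total = sum(fitnesses)
probabilities = [f / total for f in fitnesses]

print(fitnesses)  # [7, 5, 7, 4, 8, 3]
print(total)      # 34
for name, p in zip(["s1", "s2", "s3", "s4", "s5", "s6"], probabilities):
    print(name, round(p, 3))
```

The fittest string, s5, gets 8/34 of the wheel, while s6 gets only 3/34.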

Simple Example
- Maximize $f(x) = x^2$ for $0 \le x \le 31$
- Encode solution: just use 5 bits (1 or 0)
- Generate initial population
- Evaluate each solution against the objective

| Sol. | String | Fitness | % of Total |
|------|--------|---------|------------|
| A    | 01101  | 169     | 14.4       |
| B    | 11000  | 576     | 49.2       |
| C    | 01000  | 64      | 5.5        |
| D    | 10011  | 361     | 30.9       |
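The fitness and percentage columns can be recomputed from the four strings. The variable names below are illustrative.

```python
strings = {"A": "01101", "B": "11000", "C": "01000", "D": "10011"}

# Decode each 5-bit string to x and evaluate the objective x squared.
rows = [(name, bits, int(bits, 2), int(bits, 2) ** 2)
        for name, bits in strings.items()]
total = sum(fit for _, _, _, fit in rows)
percents = [round(100 * fit / total, 1) for _, _, _, fit in rows]

for (name, bits, x, fit), pct in zip(rows, percents):
    print(f"{name} {bits} x={x:2d} fitness={fit:3d} share={pct:4.1f}%")
print("total fitness:", total)  # total fitness: 1170
```

The percentage column is each solution's share of the total fitness, which is also its chance of being picked by roulette wheel selection.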

Pseudo-code algorithm
- Choose initial population
- Evaluate the fitness of each individual in the population
- Repeat
  - Select best-ranking individuals to reproduce
  - Breed new generation through crossover and mutation (genetic operations) and give birth to offspring
  - Evaluate the individual fitnesses of the offspring
  - Replace worst-ranked part of population with offspring
- Until termination

Simple Example (cont.)
- Create next generation of solutions
  - Probability of “being a parent” depends on the fitness.
- Ways for parents to create next generation
  - Reproduction: use a string again unmodified.
  - Crossover: cut and paste portions of one string to another.
  - Mutation: randomly flip a bit.
  - COMBINATION of all of the above.

The Basic Genetic Algorithm
1. [Start] Generate a random population of n chromosomes (suitable solutions for the problem)
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population
3. [New population] Create a new population by repeating the following steps until the new population is complete
   1. [Selection] Select two parent chromosomes from the population according to their fitness (the better the fitness, the bigger the chance of being selected)
   2. [Crossover] With a crossover probability, cross over the parents to form new offspring (children). If no crossover is performed, the offspring are exact copies of the parents.
   3. [Mutation] With a mutation probability, mutate the new offspring at each locus (position in the chromosome).
   4. [Accepting] Place the new offspring in the new population
4. [Replace] Use the newly generated population for a further run of the algorithm
5. [Test] If the end condition is satisfied, stop, and return the best solution in the current population
6. [Loop] Go to step 2
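The numbered steps can be put together as in the sketch below. It is an illustration under stated assumptions, not a reference implementation: the fitness function (count of 1 bits), one-point crossover, the parameter values, the generation cap and the choice to run the end test right after evaluating fitness are all choices made for the example.

```python
import random


def fitness(chromosome):
    """Assumed demo fitness: the number of 1 bits in the chromosome."""
    return sum(chromosome)


def select(population, fitnesses, rng):
    """Fitness-proportionate choice; uniform if every fitness is zero."""
    if sum(fitnesses) == 0:
        return rng.choice(population)
    return rng.choices(population, weights=fitnesses, k=1)[0]


def basic_ga(n_bits=10, pop_size=20, crossover_rate=0.7,
             mutation_rate=0.01, max_generations=200, seed=0):
    rng = random.Random(seed)
    # 1. [Start] random population of pop_size chromosomes
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        # 2. [Fitness] evaluate every chromosome
        fitnesses = [fitness(c) for c in population]
        # 5. [Test] stop once a chromosome of all ones is present
        best = max(population, key=fitness)
        if fitness(best) == n_bits:
            return best, generation
        # 3. [New population]
        new_population = []
        while len(new_population) < pop_size:
            # 3.1 [Selection]
            mother = select(population, fitnesses, rng)
            father = select(population, fitnesses, rng)
            # 3.2 [Crossover] one-point (assumed); otherwise copy the parents
            if rng.random() < crossover_rate:
                point = rng.randint(1, n_bits - 1)
                children = [mother[:point] + father[point:],
                            father[:point] + mother[point:]]
            else:
                children = [mother[:], father[:]]
            for child in children:
                # 3.3 [Mutation] flip each locus with a small probability
                for locus in range(n_bits):
                    if rng.random() < mutation_rate:
                        child[locus] ^= 1
                # 3.4 [Accepting]
                new_population.append(child)
        # 4. [Replace] the new population becomes the current one
        population = new_population[:pop_size]
        # 6. [Loop] back to step 2
    return max(population, key=fitness), max_generations


best, generations = basic_ga()
print(best, fitness(best), generations)
```

Because the whole population is replaced each generation and nothing carries the best chromosome over unchanged, a good solution can be lost again; keeping the best individual (elitism) is a common variation that the steps above do not include.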

Example of mutation (Negnevitsky, Pearson Education, 2002)

Some GA Application Types

Conclusions
Question: ‘If GAs are so smart, why ain’t they rich?’
Answer: ‘Genetic algorithms are rich - rich in application across a large and growing number of disciplines.’ - David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning