Searching for solutions: Genetic Algorithms

The Genetic Algorithm (GA) is an example of the evolutionary approach to AI. The underlying idea is to evolve a population of candidate solutions to a given problem using operators inspired by natural genetic variation and selection. Note that evolution is not a purposive or directed process; in biology, it seems to boil down to individuals competing for resources in the environment. Some are better adapted than others, and they are more likely to survive and propagate their genetic material.

In very simplified terms, the GA works as follows:
1. Identify an initial set (population) of candidate solutions to the problem at hand.
2. Define a function to evaluate how "good" a candidate solution is.
3. Select two "better" candidate solutions to produce offspring.
4. Replace "bad" candidate solutions with these offspring, and repeat this process for a specified number of iterations (generations).
Genetic algorithms: basic terminology [https://www. tutorialspoint

Population: a subset of all possible solutions to the given problem, each encoded as a string of bits (analogous to a human population, except that the individuals are candidate solutions). Example: 00110101, 01110001, 11000011, etc.
Chromosome: a candidate solution encoded as a string of bits. Example: 00110101.
Gene: one element position of a chromosome (a single bit, or a short block of adjacent bits in more complex versions).
Allele: the actual value of a gene (0 for the first gene in the chromosome example).
Genotype: the population in the computation space, where solutions are represented in a way that can be easily understood and manipulated by a program.
Phenotype: the population in the real-world solution space, where solutions are represented as they appear in real-world situations.
Encoding and decoding: encoding transforms a solution from the phenotype to the genotype; decoding transforms a solution from the genotype back to the phenotype.
GA operators

The simplest genetic algorithms involve the following three operators:

Selection: this operator selects chromosomes in the population for reproduction according to their fitness. Some GAs use a simple function of the fitness measure to select the individuals that undergo genetic operations; this is called fitness-proportionate selection. Other implementations pick a subgroup of individuals at random, let them compete, and select the fittest; this is called tournament selection.

Crossover: this operator randomly chooses a point and exchanges the subsequences after that point between two chromosomes to create two offspring. For example, consider chromosomes 1100 0001 and 0001 1111. If they cross over after their fourth position, the two offspring will be 1100 1111 and 0001 0001.

Mutation: this operator randomly flips some of the bits in a chromosome. For example, if mutation occurs at the second bit of chromosome 11000001, the result is 10000001.
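The crossover and mutation examples above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the function names (`one_point_crossover`, `flip_bit`, `tournament_select`) and the 1-based position convention are assumptions chosen for illustration, and chromosomes are represented as plain strings of "0" and "1".

```python
import random


def one_point_crossover(parent_a: str, parent_b: str, point: int) -> tuple[str, str]:
    """Exchange the parts of two equal-length bit strings after `point` bits."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def flip_bit(chromosome: str, position: int) -> str:
    """Flip the bit at a 1-based position (hypothetical helper)."""
    i = position - 1
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]


def tournament_select(population: list[str], fitness, k: int, rng: random.Random) -> str:
    """Pick k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


print(one_point_crossover("11000001", "00011111", 4))  # ('11001111', '00010001')
print(flip_bit("11000001", 2))                         # 10000001
```

In a real GA the crossover point and the mutated positions are drawn at random; they are passed in explicitly here so that the slide's examples can be checked by hand.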
A simple genetic algorithm

The outline of a simple genetic algorithm is the following:
1. Start with a randomly generated population of n j-bit chromosomes.
2. Evaluate the fitness of each chromosome.
3. Repeat the following steps until n offspring have been created:
   a. Select a pair of parent chromosomes from the current population based on their fitness.
   b. With probability pc, called the crossover rate, cross over the pair at a randomly chosen point to form two offspring. If no crossover occurs, the two offspring are exact copies of their respective parents.
   c. Mutate the two offspring at each locus with probability pm, called the mutation rate, and place the resulting chromosomes in the new population. If n is odd, one member of the new population is discarded at random.
4. Replace the current population with the new population.
5. Go to step 2.

Each iteration of this process is called a generation. Typically, a GA runs for between 50 and 500 generations in one run of the algorithm. Since randomness plays a large role in this process, two runs generally produce different results, but each run typically ends with one or more highly fit chromosomes.
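The outline above can be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function name `run_simple_ga`, its parameter names, the fixed number of generations as the stopping rule, and the fallback to uniform selection when every fitness is zero are all choices made here and are not part of the outline. It assumes chromosomes of at least 2 bits and a non-negative fitness function.

```python
import random


def run_simple_ga(fitness, n=4, length=8, pc=0.7, pm=0.001, generations=50, seed=0):
    """Run a simple GA on bit strings and return the final population."""
    rng = random.Random(seed)
    # Step 1: random initial population of n chromosomes with `length` bits each.
    population = ["".join(rng.choice("01") for _ in range(length)) for _ in range(n)]

    for _ in range(generations):
        # Step 2: evaluate the fitness of each chromosome.
        scores = [fitness(c) for c in population]
        # Assumption: if all scores are zero, select uniformly at random.
        weights = scores if sum(scores) > 0 else [1] * n

        new_population = []
        while len(new_population) < n:  # Step 3
            # 3a: fitness-proportionate choice of two parents (with replacement).
            mother, father = rng.choices(population, weights=weights, k=2)
            # 3b: cross over with probability pc at a random point.
            if rng.random() < pc:
                point = rng.randint(1, length - 1)
                children = [mother[:point] + father[point:],
                            father[:point] + mother[point:]]
            else:
                children = [mother, father]
            # 3c: flip each bit independently with probability pm.
            for child in children:
                mutated = "".join(
                    ("1" if bit == "0" else "0") if rng.random() < pm else bit
                    for bit in child
                )
                new_population.append(mutated)

        # If n is odd, one extra offspring was created: discard one at random.
        if len(new_population) > n:
            new_population.pop(rng.randrange(len(new_population)))

        # Step 4: the new population replaces the current one.
        population = new_population

    return population


final = run_simple_ga(lambda c: c.count("1"), generations=100, seed=1)
print(final)
```

Because the run is seeded, repeating it with the same `seed` gives the same population; changing the seed illustrates how different runs produce different results.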
Example

Assume the following:
- length of each chromosome = 8
- fitness function f(x) = the number of ones in the bit string
- population size n = 4
- crossover rate pc = 0.7
- mutation rate pm = 0.001

The initial, randomly generated population is the following:

Chromosome label   Chromosome string   Fitness
A                  00000110            2
B                  11101110            6
C                  00100000            1
D                  00110100            3
Example (cont.): step 3a

We will use fitness-proportionate selection, where the expected number of times an individual is selected for reproduction equals its fitness divided by the average fitness of the population, which is (2 + 6 + 1 + 3) / 4 = 3.
For chromosome A, this number is 2 / 3 = 0.667
For chromosome B, this number is 6 / 3 = 2
For chromosome C, this number is 1 / 3 = 0.333
For chromosome D, this number is 3 / 3 = 1
(0.667 + 2 + 0.333 + 1 = 4)

To implement this selection method, we can use "roulette-wheel sampling", which gives each individual a slice of a circular roulette wheel proportional to the individual's fitness.

[Figure: roulette wheel with one slice each for A, B, C and D]

Assume that the roulette wheel is spun and the ball comes to rest on some slice; the individual corresponding to that slice is selected for reproduction. Because n = 4, the roulette wheel will be spun four times. Let the first two spins choose B and D to be parents, and the second two spins choose B and C to be parents.
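A minimal Python sketch of the selection step described above follows. The names `expected_counts` and `spin_wheel` are hypothetical, and the wheel is implemented with a running sum over the fitness values, which is one common way to realise roulette-wheel sampling rather than the only one.

```python
import random

# Fitness values from the example population.
FITNESSES = {"A": 2, "B": 6, "C": 1, "D": 3}


def expected_counts(fitnesses: dict[str, float]) -> dict[str, float]:
    """Fitness divided by the average fitness of the population."""
    average = sum(fitnesses.values()) / len(fitnesses)
    return {label: f / average for label, f in fitnesses.items()}


def spin_wheel(fitnesses: dict[str, float], rng: random.Random) -> str:
    """Spin once: each label owns a slice proportional to its fitness."""
    total = sum(fitnesses.values())
    ball = rng.random() * total  # where the ball lands, in [0, total)
    running = 0.0
    for label, f in fitnesses.items():
        running += f
        if ball < running:
            return label
    return label  # guard against floating-point round-off


print(expected_counts(FITNESSES))

rng = random.Random(42)
parents = [spin_wheel(FITNESSES, rng) for _ in range(4)]  # n = 4 spins
print(parents)
```

Which chromosomes the four spins pick depends on the seed; the slide simply assumes the outcome B, D, B, C. Over many spins, B should be chosen about half of the time, since its slice is 6 / 12 of the wheel.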
Example (cont.): steps 3b and 3c

Step 3b: Apply the crossover operator to the selected parents. Given that B and D are selected as parents, assume that (with probability pc) they cross over after the first locus to form two offspring, E = 10110100 and F = 01101110. Assume that B and C do not cross over, thus forming two offspring which are exact copies of B and C.

Step 3c: Apply the mutation operator to the offspring. Each offspring is subject to mutation at each locus with probability pm. Let E be mutated at the sixth locus to form E' = 10110000, and let offspring B be mutated at the first locus to form B' = 01101110.

The new population now becomes:

Chromosome label   Chromosome string   Fitness
E'                 10110000            3
F                  01101110            5
C                  00100000            1
B'                 01101110            5

Note that the best string, B, with fitness 6 was lost, but the average fitness of the population increased from 3 to (3 + 5 + 1 + 5) / 4 = 3.5. Iterating this process will eventually result in a string with all ones.
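The generation worked through above can be replayed in Python to check the strings and fitness values. This is only a sketch for verification: the helper names are hypothetical, and the crossover point and mutated loci are fixed to the values assumed in the example instead of being drawn at random.

```python
def fitness(chromosome: str) -> int:
    """Fitness used in the example: the number of ones in the bit string."""
    return chromosome.count("1")


def crossover(parent_a: str, parent_b: str, point: int) -> tuple[str, str]:
    """One-point crossover after `point` loci."""
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]


def flip(chromosome: str, locus: int) -> str:
    """Flip the bit at a 1-based locus."""
    i = locus - 1
    return chromosome[:i] + ("1" if chromosome[i] == "0" else "0") + chromosome[i + 1:]


B, C, D = "11101110", "00100000", "00110100"

E, F = crossover(B, D, 1)   # step 3b: B and D cross over after the first locus
E_prime = flip(E, 6)        # step 3c: E mutates at the sixth locus
B_prime = flip(B, 1)        # step 3c: the copy of B mutates at the first locus

new_population = [E_prime, F, C, B_prime]
for chromosome in new_population:
    print(chromosome, fitness(chromosome))

average = sum(fitness(c) for c in new_population) / len(new_population)
print(average)  # 3.5
```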
Example

Assume you are planning a camping trip and must limit the items you take so that they fit in your backpack, say, no more than 40 lb in total (this is a version of the famous "knapsack" problem, an optimization problem known to be NP-hard). Here are your choices, defined in terms of their weight and their importance on a scale from 1 (least important) to 5 (most important):

Item                    Weight   Importance
Sleeping bag            10 lb    4
Electronics equipment   5 lb     5
Tent                    15 lb    3
Medicine cabinet        2 lb     5
Climbing ropes          8 lb     5
Clothes                 10 lb    2
Rain gear               3 lb     4
Food                    10 lb    3
Example contd.

Let the initial, randomly generated population be:

Chromosome label   Chromosome string   Fitness
A                  11001100            16
B                  11111000            22
C                  10011111            29
D                  01110011            29

Total fitness sum: 96
Average fitness: 24

Here 1 means "item included" and 0 means "item not included". Notice that 11101101 is NOT a possible solution, because the total weight of these items exceeds 40 lb.
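The bit-string encoding of the backpack can be sketched in Python as follows. This is a hedged illustration: it assumes that the fitness of a chromosome is the summed importance of the included items (which matches the values stated for A and B), that the bits follow the order in which the items were listed, and that an overweight chromosome is simply rejected. The names `ITEMS`, `CAPACITY_LB`, `total_weight`, `total_importance` and `is_feasible` are hypothetical.

```python
# (name, weight in lb, importance) in the order used by the chromosome bits.
ITEMS = [
    ("Sleeping bag", 10, 4),
    ("Electronics equipment", 5, 5),
    ("Tent", 15, 3),
    ("Medicine cabinet", 2, 5),
    ("Climbing ropes", 8, 5),
    ("Clothes", 10, 2),
    ("Rain gear", 3, 4),
    ("Food", 10, 3),
]
CAPACITY_LB = 40


def total_weight(chromosome: str) -> int:
    """Sum the weights of the items whose bit is 1."""
    return sum(w for bit, (_, w, _) in zip(chromosome, ITEMS) if bit == "1")


def total_importance(chromosome: str) -> int:
    """Sum the importance of the items whose bit is 1 (assumed fitness)."""
    return sum(v for bit, (_, _, v) in zip(chromosome, ITEMS) if bit == "1")


def is_feasible(chromosome: str) -> bool:
    """A candidate is a possible solution only if it fits in the backpack."""
    return total_weight(chromosome) <= CAPACITY_LB


print(total_weight("11001100"), total_importance("11001100"))  # 33 16
print(total_weight("11111000"), total_importance("11111000"))  # 40 22
print(total_weight("11101101"), is_feasible("11101101"))       # 58 False
```

Rejecting overweight candidates outright is only one option; another common design is to keep them in the population but penalise their fitness.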
Example contd.

Using fitness-proportionate selection again, we get the expected number of times each chromosome is selected:
For chromosome A, 16 / 24 = 0.667
For chromosome B, 22 / 24 = 0.917
For chromosome C, 29 / 24 = 1.208
For chromosome D, 29 / 24 = 1.208
(0.667 + 0.917 + 1.208 + 1.208 = 4)

Using "roulette-wheel sampling", assume that the first two spins choose A and C, and that the crossover point randomly falls after the fourth locus (1100|1100 and 1001|1111), resulting in:
E: 11001111 (weight = 32, fitness = 23)
F: 10011100 (weight = 30, fitness = 16)

Assume that the second offspring is mutated at the second locus, resulting in:
F': 11011100 (weight = 35, fitness = 21)

After the offspring are substituted for their parents, the new generation has an average fitness of (23 + 22 + 21 + 29) / 4 = 23.75. The worst solution has been eliminated, but no better solution has been found (yet!).
N-Queens problem: an overview

Representation: The board is represented as an 8-tuple (for the 8-Queens problem), where the i-th number gives the row of the queen in the i-th column; each such tuple serves as a candidate solution (a chromosome).

Example: [1 6 2 5 7 4 8 3] can be converted to binary, where every gene is a block of 3 bits. Since 3 bits hold only the values 0 to 7, each position is stored minus one, i.e. [000 101 001 100 110 011 111 010].

See Figures 4.6 and 4.7 (page 127) for a nice illustration and more.
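The conversion between the tuple and its binary form can be sketched in Python. One assumption is made explicit here: positions 1 to 8 are stored as 0 to 7 (position minus one), so that every gene fits in exactly 3 bits. The function names `encode` and `decode` are hypothetical.

```python
def encode(board: list[int]) -> str:
    """Turn queen positions 1..8 into 3-bit genes (position minus one)."""
    return " ".join(format(row - 1, "03b") for row in board)


def decode(bits: str) -> list[int]:
    """Turn space-separated 3-bit genes back into queen positions 1..8."""
    return [int(gene, 2) + 1 for gene in bits.split()]


board = [1, 6, 2, 5, 7, 4, 8, 3]
genes = encode(board)
print(genes)          # 000 101 001 100 110 011 111 010
print(decode(genes))  # [1, 6, 2, 5, 7, 4, 8, 3]
```

With this binary encoding, a one-point crossover or a bit flip can fall inside a gene and still always yields positions in the range 1 to 8, which is one reason to use all eight 3-bit patterns.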
N-Queens problem contd.

Fitness: Because the goal is to minimize the number of clashes among the queens, the fitness of each chromosome depends on its number of clashes; the higher the number of clashes, the lower the chances of survival. The maximum number of clashes is 28, which is the number of distinct pairs of queens (8 × 7 / 2): in the worst case every pair clashes.

The fitness formula: 28 - number of clashes, so that the fittest solution is the one with the highest fitness value.

The population: a randomly generated set of board arrangements with one queen per column.

If you want to learn more about genetic algorithms for N-queens, in addition to the textbook (pages 126-129) see also
https://kushalvyas.github.io/gen_8Q.html
http://genetic-algorithms-explained.appspot.com/
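The fitness described above can be sketched in Python by counting clashing pairs of queens. This is one possible reading, stated as an assumption: a clash is counted once per pair of queens that share a row or a diagonal, regardless of queens standing between them, and the board lists the row of the queen in each column (so two queens never share a column). The names `count_clashes` and `fitness` are hypothetical.

```python
from itertools import combinations


def count_clashes(board: list[int]) -> int:
    """Count pairs of queens that share a row or a diagonal."""
    clashes = 0
    for (col_a, row_a), (col_b, row_b) in combinations(enumerate(board), 2):
        same_row = row_a == row_b
        same_diagonal = abs(row_a - row_b) == abs(col_a - col_b)
        if same_row or same_diagonal:
            clashes += 1
    return clashes


def fitness(board: list[int]) -> int:
    """Maximum number of pairs (28 for 8 queens) minus the number of clashes."""
    n = len(board)
    max_pairs = n * (n - 1) // 2
    return max_pairs - count_clashes(board)


print(fitness([1, 6, 2, 5, 7, 4, 8, 3]))  # 27: the queens in columns 4 and 7 share a diagonal
print(fitness([1, 5, 8, 6, 3, 7, 2, 4]))  # 28: no clashes, so this board is a solution
print(fitness([1, 1, 1, 1, 1, 1, 1, 1]))  # 0: all 28 pairs share a row
```

A board is a solution exactly when its fitness reaches 28, which gives the GA a natural stopping condition.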