Genetic Programming Using Simulated Natural Selection to Automatically Write Programs.


1 Genetic Programming Using Simulated Natural Selection to Automatically Write Programs

2 Genetic Programming
- John Koza, Stanford University
  - Principal proponent of GP
  - Has obtained human-competitive results in a number of problem domains
    - Reproduced existing patents
    - Created new patentable designs
  - Has written extensively on GP
    - Four-volume set on Genetic Programming
    - Numerous papers on GP

3 Genetic Programming
- Basic Algorithm
  - Create a population of programs
  - Each program attempts to solve a set of problems in a “training set.”
  - Program fitness is determined by success in solving the training set
  - More fit members have a better chance to produce offspring in the next generation
  - Offspring are produced using some form of crossover

4 Tree Structure of Genetic Programs
- Various structures are used to represent genetic programs, but tree structures are the best known.
- Nonterminal nodes are functions that take their children as parameters.
[Figure: example program tree with the function + at the root and the terminals 2 and 1 as its children]

5 Tree Structure
- Terminal nodes, the nodes that make up the leaves of a program tree, provide data to the program.
  - Constants
  - Parameterless functions
  - Inputs

6 Genetic Program Components
- Terminal set
  - Works as the set of primitive data values
  - Constants
  - Parameterless functions
  - Input values
- Function set
  - Set of available functions
  - Often tailored specifically to the needs of the problem domain.
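A minimal sketch of how a program tree built from a function set and a terminal set might be represented and evaluated. The nested-tuple encoding, the `FUNCTIONS` table, and the `evaluate` helper are illustrative assumptions, not part of the original slides. Parameterless functions are omitted for brevity.

```python
import operator

# Hypothetical function set: symbol -> (callable, number of arguments).
FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
}


def evaluate(tree, inputs):
    """Evaluate a program tree against a dict of named input values."""
    if isinstance(tree, tuple):
        # Nonterminal node: (symbol, child, child, ...).
        symbol, *children = tree
        func, arity = FUNCTIONS[symbol]
        if len(children) != arity:
            raise ValueError(f"{symbol} expects {arity} children")
        return func(*(evaluate(child, inputs) for child in children))
    if isinstance(tree, str):
        # Terminal node: a named input.
        return inputs[tree]
    # Terminal node: a constant.
    return tree


# The small tree from the slide: + applied to the terminals 2 and 1.
print(evaluate(("+", 2, 1), {}))  # 3
```

Tuples keep the trees immutable, so a parent is never modified by accident when offspring are derived from it.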

7 Initializing the Population
- The following two parameters are specified:
  - Maximum depth of a program tree
  - Maximum number of nodes in a program tree
- Three methods in common use (Koza)
  - Full
    - Nonterminals are used to build a complete tree up to the leaf nodes, which are then completely populated with terminals. Every tree is grown to maximum depth and has the maximum number of nodes allowed.

8 Initializing the Population (continued)
- Three methods in common use (Koza)
  - Grow
    - The root node is chosen from the function set.
    - All other nodes not at maximum depth are chosen randomly from the function and terminal sets.
    - Growth for a branch ends when a terminal is chosen.
    - Trees can have irregular shapes.
    - Nodes at the maximum depth are chosen from the terminal set only.

9 Initializing the Population (continued)
- Three methods in common use (Koza)
  - Ramped Half and Half
    - M is the maximum depth of the deepest partition in the population.
    - The population is separated into M partitions.
    - The i-th partition (i ranges from 0 to M-1) has a maximum depth of M - i.
    - Half of each partition is populated with grow, the other half with full.
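The three initialization methods can be sketched as follows. This is an illustration under stated assumptions: trees are nested tuples, `FUNCTIONS` and `TERMINALS` are hypothetical sets, only the depth limit is enforced (the node-count limit is left out), and individuals are dealt round-robin into the M partitions.

```python
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2}   # hypothetical function set: symbol -> arity
TERMINALS = ["x", 1, 2]                # hypothetical terminal set


def full(depth, rng):
    """Full: functions everywhere above the depth limit, terminals at it."""
    if depth == 0:
        return rng.choice(TERMINALS)
    symbol = rng.choice(list(FUNCTIONS))
    return (symbol, *(full(depth - 1, rng) for _ in range(FUNCTIONS[symbol])))


def grow(depth, rng, is_root=True):
    """Grow: function at the root, random choice below, terminals at the limit."""
    if depth == 0:
        return rng.choice(TERMINALS)
    pool = list(FUNCTIONS) if is_root else list(FUNCTIONS) + TERMINALS
    choice = rng.choice(pool)
    if choice not in FUNCTIONS:
        return choice  # a terminal ends growth on this branch
    return (choice, *(grow(depth - 1, rng, False) for _ in range(FUNCTIONS[choice])))


def ramped_half_and_half(pop_size, max_depth, rng):
    """Partition i (0..M-1) has depth limit M - i; methods alternate within it."""
    population = []
    for k in range(pop_size):
        depth_limit = max_depth - (k % max_depth)
        builder = full if (k // max_depth) % 2 == 0 else grow
        population.append(builder(depth_limit, rng))
    return population


def tree_depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(tree_depth(child) for child in tree[1:])
```

With a population size that is a multiple of 2M, each partition receives exactly as many full trees as grow trees.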

10 Genetic Operators: Crossover
- Crossover
  - Randomly select a node in the mother
  - Randomly select a node in the father
  - Swap the two nodes along with their subtrees

11 Crossover Example
[Figure: Parent 1 and Parent 2 are expression trees built from +, -, *, /, power, and abs over numeric constants; swapping one randomly selected subtree from each parent produces Child 1 and Child 2]
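Subtree crossover as described above might look like the following sketch. It assumes trees are nested tuples of the form `(symbol, child, ...)`; the helper names `paths`, `get`, `replace`, and `size` are hypothetical, and every node (including the root) is equally likely to be picked.

```python
import random


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree found by following the path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new_subtree):
    """Return a copy of the tree with the subtree at the path swapped out."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def size(tree):
    """Number of nodes in the tree."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1


def crossover(mother, father, rng):
    """Pick one node in each parent and swap the subtrees rooted there."""
    mother_path = rng.choice(list(paths(mother)))
    father_path = rng.choice(list(paths(father)))
    child1 = replace(mother, mother_path, get(father, father_path))
    child2 = replace(father, father_path, get(mother, mother_path))
    return child1, child2
```

Because the two subtrees are simply exchanged, the total number of nodes across both children equals the total across both parents.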

12 Genetic Operations: Mutation
- Mutation
  - Randomly select a node in the program tree
  - Remove that node and its subtree
  - Replace the node with a new subtree, using the same method used to initially instantiate the population.
  - Typically, mutation is applied to a small number of offspring after crossover.

13 Mutation Example
[Figure: an expression tree before and after mutation. The left subtree is randomly selected for mutation. The entire subtree is replaced.]
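A sketch of subtree mutation under the same nested-tuple assumption. The `FUNCTIONS` and `TERMINALS` sets and the `max_subtree_depth` parameter are hypothetical, and a simplified grow method (terminals allowed at the root of the new subtree) stands in for whichever initialization method the population used.

```python
import random

FUNCTIONS = {"+": 2, "*": 2}   # hypothetical function set: symbol -> arity
TERMINALS = ["x", 1, 2]        # hypothetical terminal set


def grow(depth, rng):
    """Build a random subtree no deeper than the given limit."""
    pool = TERMINALS if depth == 0 else list(FUNCTIONS) + TERMINALS
    choice = rng.choice(pool)
    if choice not in FUNCTIONS:
        return choice
    return (choice, *(grow(depth - 1, rng) for _ in range(FUNCTIONS[choice])))


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def replace(tree, path, new_subtree):
    """Return a copy of the tree with the subtree at the path swapped out."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new_subtree),) + tree[i + 1:]


def mutate(tree, rng, max_subtree_depth=2):
    """Pick a random node and replace it, with its subtree, by a new one."""
    path = rng.choice(list(paths(tree)))
    return replace(tree, path, grow(max_subtree_depth, rng))
```

The parent tuple is left untouched, so the same parent can still take part in crossover after a mutated copy has been produced.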

14 Fitness-based Selection
- Gives “graded and continuous feedback about how well a program performs on the training set” (Banzhaf et al.)
- Standardized Fitness
  - Fitness scores are transformed so that 0 is the fitness of the most fit member.
- Normalized Fitness
  - Fitness is transformed to values that are always between 0 and 1.
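One possible way to compute the two transformations, offered as a sketch. It assumes raw scores where higher is better, and it normalizes by first mapping each standardized value s to 1 / (1 + s) and then dividing by the total, which is one common choice rather than the only one. The function names are hypothetical.

```python
def standardized(raw_scores):
    """Shift scores so that the most fit member gets 0 (assumes higher raw = better)."""
    best = max(raw_scores)
    return [best - raw for raw in raw_scores]


def normalized(standardized_scores):
    """Map standardized scores to values between 0 and 1 that sum to 1."""
    adjusted = [1.0 / (1.0 + s) for s in standardized_scores]
    total = sum(adjusted)
    return [a / total for a in adjusted]
```

After this transformation the most fit member has the largest normalized value, so the result can be used directly as a selection probability.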

15 Different Selection Algorithms
- GA Scenario
  - Same as that used in Genetic Algorithms
  - Create gene pool by selecting parents based on fitness
  - Next generation completely replaces current generation
- ES Scenario
  - Same as that used in Evolution Strategies
  - Generate children first
  - Apply fitness function to parents and children
  - Select the next generation from children (and possibly parents too)
  - Selection pressure can be tuned by adjusting the ratio of the number of offspring to the number of parents.

16 Selection Pressure
- Ratio of the best individual’s selection probability to the average selection probability:
  MostFitSelectionProbability / AverageFitSelectionProbability
- The larger this ratio, the greater the selection pressure.
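The ratio above is simple to compute once each individual has a selection probability. A small sketch, with a hypothetical function name:

```python
def selection_pressure(probabilities):
    """Best individual's selection probability divided by the average one."""
    average = sum(probabilities) / len(probabilities)
    return max(probabilities) / average
```

A uniform distribution gives a ratio of 1 (no pressure). For example, probabilities of 0.5, 0.25, 0.125, and 0.125 give 0.5 / 0.25 = 2.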

17 Sample Fitness Measures
- Error Fitness
  - The sum of the absolute values of the differences between the computed result and the desired result:

    $$f_p = \sum_{i=1}^{n} \lvert p_i - o_i \rvert$$

  Where:
  - f_p is the fitness of the p-th individual in the population
  - o_i is the desired output for the i-th example in the training set
  - p_i is the output from the p-th individual on the i-th example in the training set
  - n is the number of examples in the training set
  * Squaring the expression (p_i - o_i) can provide larger penalties for errors.
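The error fitness described above (sum of absolute differences, optionally squared) can be sketched in a few lines. The function name and the `squared` flag are illustrative assumptions.

```python
def error_fitness(outputs, desired, squared=False):
    """Sum of |p_i - o_i| over the training set, or of (p_i - o_i)^2 if squared."""
    if len(outputs) != len(desired):
        raise ValueError("outputs and desired must have the same length")
    differences = (p - o for p, o in zip(outputs, desired))
    return sum(d * d if squared else abs(d) for d in differences)


print(error_fitness([1, 2, 5], [1, 4, 2]))                # 5
print(error_fitness([1, 2, 5], [1, 4, 2], squared=True))  # 13
```

With this measure a lower value means a fitter program, and 0 means every training example is reproduced exactly.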

18 Fitness Measures Can Be as Varied as the Applications
- Examples
  - Number of correct solutions
  - Number of wins competing against other members of the population
  - Number of errors navigating a maze
  - Time required to solve a puzzle

19 Truncation or (µ, λ) Selection
- A number of parents (µ) are allowed to breed and produce λ children. The µ best children are used to produce the next generation.
- A variation, (µ + λ) selection, includes the parents among those considered for selection into the next generation.
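Both variants reduce to sorting a pool and keeping the best µ. The sketch below assumes lower fitness is better (an error measure); the function name and the `plus` flag are hypothetical.

```python
def truncation_select(parents, children, fitness, mu, plus=False):
    """(mu, lambda) selection, or (mu + lambda) selection when plus=True."""
    pool = children + parents if plus else children
    return sorted(pool, key=fitness)[:mu]


# Toy individuals are numbers and fitness is their absolute value.
print(truncation_select([1, 9], [5, 3, 8, 7], abs, mu=2))             # [3, 5]
print(truncation_select([1, 9], [5, 3, 8, 7], abs, mu=2, plus=True))  # [1, 3]
```

In the second call the parent 1 survives because (µ + λ) selection lets parents compete with their children.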

20 Ranking Selection
- Selection Based on Fitness Order
  - The members of the population are ranked from best to worst.
  - The selection probability is assigned based on the rank.
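A sketch of one way to turn ranks into selection probabilities. It assumes lower fitness is better and uses a linear scheme in which the best of N individuals gets weight N and the worst gets weight 1. The slides do not specify the mapping, so this particular choice is an assumption.

```python
def rank_probabilities(fitnesses):
    """Selection probability per individual, based only on fitness order."""
    n = len(fitnesses)
    # Indices of the population ordered from best (lowest fitness) to worst.
    order = sorted(range(n), key=lambda i: fitnesses[i])
    weights = [0] * n
    for rank, index in enumerate(order):
        weights[index] = n - rank
    total = n * (n + 1) // 2
    return [w / total for w in weights]
```

Only the order matters: an individual that is slightly better and one that is vastly better than the rest receive the same probability, which limits how quickly one outlier can take over the population.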

21 Tournament Selection
- Select a subset of the population (the tournament size) randomly.
- More fit (winning) individuals are used to generate replacements for less fit (losing) individuals.
- Accelerates processing time (compared with full competition)
- Facilitates parallel processing
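A minimal tournament sketch, assuming lower fitness is better and that the better half of the competitors are the winners. The split into halves and the function name are assumptions for illustration.

```python
import random


def tournament(population, fitness, size, rng):
    """Return (winner_indices, loser_indices) for one random tournament."""
    competitors = rng.sample(range(len(population)), size)
    competitors.sort(key=lambda i: fitness(population[i]))
    half = size // 2
    return competitors[:half], competitors[half:]
```

Only the competitors need to be evaluated, not the whole population, which is where the speed-up and the ease of parallel execution come from.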

22 The Basic GP Algorithm (from Banzhaf et al.)
- Define the terminal set
- Define the function set
- Define the fitness function
- Define parameters such as population size, maximum individual size, crossover probability, selection method, and termination criterion

23 Generational GP
- Like what we have seen in GA
- New generation completely replaces the previous generation.
- Initialize the population
- Evaluate the individual programs
- Until a new population is fully populated, repeat:
  - Select an individual or individuals in the population using the selection algorithm
  - Perform genetic operations on the selected individual or individuals
  - Insert the result of the genetic operations into the new population
- Best individual is the resulting program.
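The generational loop above can be sketched as a skeleton in which the problem-specific parts are passed in as callables. The names `select` and `vary`, the fixed generation count as termination criterion, and tracking the best individual seen so far are all assumptions of this sketch. Lower fitness is treated as better.

```python
import random


def generational_loop(population, fitness, select, vary, generations, rng):
    """Run a generational loop and return the best individual seen."""
    population = list(population)
    best = min(population, key=fitness)
    for _ in range(generations):
        new_population = []
        # Fill the new population from selected and varied parents.
        while len(new_population) < len(population):
            parents = select(population, fitness, rng)
            new_population.extend(vary(parents, rng))
        # The new generation completely replaces the previous one.
        population = new_population[:len(population)]
        best = min(population + [best], key=fitness)
    return best
```

In a GP setting, `select` might be a tournament that returns two parent trees, and `vary` might apply subtree crossover followed by occasional mutation. The skeleton itself does not care what the individuals are.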

24 Steady State GP
- There are no generations
1. Initialize the population
2. Randomly choose a subset of the population to take part in the tournament
3. Evaluate the fitness value of each competitor in the tournament
4. Select the winner or winners from the competitors in the tournament using the selection algorithm
5. Apply genetic operators to the winner or winners of the tournament

25 Steady State GP (continued)
6. Replace the losers in the tournament with the results of the application of the genetic operators to the winners of the tournament.
7. Repeat steps 2-6 until the termination criterion is met.
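Steps 1 through 7 can be sketched as the loop below. Assumptions of this sketch: lower fitness is better, the better half of each tournament wins, a fixed number of steps serves as the termination criterion, and `vary` is a hypothetical callable that turns the winners into offspring.

```python
import random


def steady_state_loop(population, fitness, vary, tournament_size, steps, rng):
    """Run tournament-based steady-state evolution; return the final population."""
    population = list(population)                      # step 1 (already initialized)
    for _ in range(steps):                             # step 7
        competitors = rng.sample(range(len(population)), tournament_size)  # step 2
        competitors.sort(key=lambda i: fitness(population[i]))             # step 3
        half = tournament_size // 2
        winners, losers = competitors[:half], competitors[half:]           # step 4
        offspring = vary([population[i] for i in winners], rng)            # step 5
        for slot, child in zip(losers, offspring):                         # step 6
            population[slot] = child
    return population
```

Because winners are never overwritten, the best fitness in the population cannot get worse from one step to the next.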

26 Introns
- Code sections (functions) that provide no real value for the problem at hand
- Introns do not directly affect the fitness of the individual.
  - e.g., j = j + 0 or j = j * 1
- Early and middle sections of GP runs might include 40-60% introns.
- Later in the run, introns begin to dominate the code.
- Intron growth is exponential!

27 Why GP Introns Emerge
- Children tend to be less fit than parents
- Crossover and mutation can be extremely destructive
- Introns reduce the destructive effects of genetic operators
- Parents generate introns when it is easier to protect what they can already do, through the creation of introns, than to improve on what they are currently doing.

28 Effective Fitness
- Function of at least two factors
  - The fitness of the parent
  - Likelihood that genetic operators will affect the fitness of the parent’s children

29 Effects of Introns
- Introns may have differing effects before and after exponential growth of introns begins.
- Different systems may generate different types of introns with different probabilities.
- The extent to which genetic operators are destructive in their effect is likely to be a very important initial condition in intron growth.
- Mutation and crossover may affect different types of introns differently.

30 Problems Caused by Introns
- Run stagnation (no progress)
- Poor results (do-nothing code)
- Drain on memory and CPU time (storing and executing unnecessary code)

31 Possible Beneficial Effects of Introns
- Introns might serve to isolate useful code blocks
  - This facilitates the building block model by protecting useful building blocks from disruption

32 Methods of Handling Introns
- Reduce the destructiveness of genetic operators
  - Reducing destructive crossover to 0 results in hill climbing
- Attach a fitness penalty to the length of the program.
- Change the fitness function
  - Provides the GP with a way to improve that is better than just insulating the current best solution.
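The length penalty mentioned above might be sketched as follows. It assumes nested-tuple trees, an error-style fitness where lower is better, and a hypothetical `weight` parameter that would need tuning per problem.

```python
def size(tree):
    """Number of nodes in a nested-tuple program tree."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1


def penalized_fitness(error, tree, weight=0.01):
    """Error plus a penalty proportional to program length."""
    return error + weight * size(tree)


lean = "j"
bloated = ("+", ("*", "j", 1), 0)   # computes the same value as j, with introns
print(size(lean), size(bloated))    # 1 5
```

With equal error, the bloated tree now scores worse than the lean one, so intron-heavy programs lose the tie instead of being selectively neutral.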

33 References
- Wolfgang Banzhaf, Peter Nordin, Robert E. Keller, Frank D. Francone. Genetic Programming: An Introduction.
- John Koza. Genetic Programming Tutorial. GECCO 2005.
- John Koza. Genetic Programming: The Movie.

