
CS 478 - Machine Learning Genetic Algorithms (II).


1 CS 478 - Machine Learning Genetic Algorithms (II)

2 Schema (I) A schema H is a string over the extended alphabet {0, 1, *}, where * stands for "don't care" (i.e., a wild card). A schema represents, or matches, a number of strings:

Schema   Representatives
*1*      010, 011, 110, 111
10*      100, 101
00*11    00011, 00111

There are 3^L schemata over strings of length L.
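A minimal sketch of schema matching, assuming binary strings; the function name `matches` is illustrative, not from the course materials:

```python
def matches(schema: str, s: str) -> bool:
    """Return True if binary string s is a representative of schema.
    A '*' in the schema matches either bit; fixed positions must agree."""
    return len(schema) == len(s) and all(
        h == '*' or h == b for h, b in zip(schema, s)
    )

# Example: *1* matches 010, 011, 110, 111 but not 100.
assert matches("*1*", "010")
assert not matches("*1*", "100")
```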

3 Schema (II) Since each position in a string may take on either its actual value or a *, each binary string in a GA population contains, or is a representative of, 2^L schemata. Hence, a population with n members contains between 2^L and min(n·2^L, 3^L) schemata, depending on population diversity. (The upper bound is not strictly n·2^L, as there are at most 3^L schemata in total.) Geometrically, strings of length L can be viewed as points in a discrete L-dimensional space (i.e., the vertices of hypercubes). Schemata can then be viewed as hyperplanes (i.e., hyper-edges and hyper-faces of hypercubes).
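A small sketch illustrating the count: each position of a string independently keeps its bit or becomes '*', so one string matches exactly 2^L schemata, and a population's distinct schemata stay within the bounds above. The helper name `schemata_of` is an assumption for illustration:

```python
from itertools import product

def schemata_of(s: str):
    """Yield all 2^len(s) schemata that string s represents."""
    for mask in product((False, True), repeat=len(s)):
        yield ''.join('*' if star else bit for bit, star in zip(s, mask))

pop = ["010", "110"]                                  # n = 2, L = 3
distinct = {h for s in pop for h in schemata_of(s)}
print(len(distinct))  # 12: between 2^L = 8 and min(n*2^L, 3^L) = 16
```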

4 Schema Order The order of a schema H is the number of non-* symbols in H. It is denoted by o(H):

Schema   Order
1*1*01   4
0*1      2
*0**1    2

A schema of order o over strings of length L represents 2^(L-o) strings.
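The definition in code, as a one-line sketch (the function name is illustrative):

```python
def order(schema: str) -> int:
    """Schema order o(H): the count of fixed (non-'*') positions."""
    return sum(1 for c in schema if c != '*')

assert order("1*1*01") == 4
assert order("*0**1") == 2
# A schema of order o over strings of length L matches 2**(L - o) strings.
```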

5 Schema Defining Length The defining length of a schema H is the distance between the first and last non-* symbols in H. It is denoted by δ(H):

Schema   Defining Length
1*1*01   5
*1*1     2
*0***    0
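Again the definition as a short sketch; `defining_length` is an illustrative name:

```python
def defining_length(schema: str) -> int:
    """Schema defining length d(H): distance between the outermost
    fixed (non-'*') positions; 0 for at most one fixed position."""
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0

assert defining_length("1*1*01") == 5
assert defining_length("*1*1") == 2
assert defining_length("*0***") == 0
```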

6 Intuitive Approach Schemata encode useful/promising characteristics found in the population. What do selection, crossover and mutation do to schemata?

Since more highly fit strings have a higher probability of selection, on average an ever-increasing number of samples is given to the observed best schemata.

Crossover cuts strings at arbitrary sites and swaps. Crossover leaves a schema unscathed if it does not cut the schema, but it may disrupt a schema when it does. For example, 1***0 is more likely to be disrupted than **11* is (see the sketch below). In general, schemata of short defining length are unaltered by crossover.

Mutation at normal, low rates does not disrupt a particular schema very frequently.
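The 1***0 vs. **11* comparison can be quantified with the standard one-point-crossover survival bound, P(disruption) ≤ δ(H)/(L−1): a cut can only break the schema if it falls strictly between its outermost fixed positions. This bound is the standard textbook one, not stated explicitly on the slide:

```python
def defining_length(schema: str) -> int:
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0

def disruption_bound(schema: str) -> float:
    # One-point crossover picks one of L-1 cut sites uniformly; only the
    # defining_length(schema) sites inside the schema's span can break it.
    return defining_length(schema) / (len(schema) - 1)

print(disruption_bound("1***0"))  # 4/4 = 1.00: every cut site can disrupt it
print(disruption_bound("**11*"))  # 1/4 = 0.25
```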

7 Intuitive Conclusion Highly fit, short-defining-length schemata (called building blocks) are propagated from generation to generation by giving exponentially increasing samples to the observed best. All this takes place in parallel, with no memory other than the population. This parallelism has been termed implicit, since n strings of length L actually allow min(n·2^L, 3^L) schemata to be processed.

8 Formal Account See the PDF document containing a formal account of the effect of selection, crossover and mutation, culminating in the Schema Theorem.
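For reference, the canonical textbook form of the Schema Theorem (Holland's bound, stated here from the standard literature, not copied from the PDF) combines the three effects above:

```latex
E[m(H,\,t+1)] \;\ge\; m(H,\,t)\,\frac{f(H)}{\bar{f}}
\left[\,1 \;-\; p_c\,\frac{\delta(H)}{L-1} \;-\; o(H)\,p_m\,\right]
```

where m(H, t) is the number of representatives of schema H at generation t, f(H) is their average fitness, f̄ is the mean population fitness, and p_c, p_m are the crossover and mutation probabilities. Short, low-order schemata of above-average fitness thus receive exponentially increasing numbers of trials.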

9 Prototypical Steady-state GA

P ← p randomly generated hypotheses
For each h in P, compute fitness(h)
While max_h fitness(h) < threshold (*)
    P_s ← select r·p individuals from P (e.g., FPS, RS, tournament)
    Apply crossover to random pairs in P_s and add all offspring to P_o
    Select m% of the individuals in P_o with uniform probability and apply mutation (i.e., flip one of their bits at random)
    P_w ← the r·p weakest individuals in P
    P ← P − P_w + P_o
    For each h in P, compute fitness(h)
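A minimal runnable sketch of this loop, on a toy one-max fitness (count of 1-bits). The parameter values and the choice of tournament selection are illustrative assumptions; P_s, P_o, P_w follow the slide:

```python
import random

L, p, r, m = 20, 50, 0.4, 0.05   # string length, pop size, replaced fraction, mutation rate

def fitness(h):
    return sum(h)                # toy one-max fitness: number of 1-bits

def tournament(P, k=2):
    return max(random.sample(P, k), key=fitness)

P = [[random.randint(0, 1) for _ in range(L)] for _ in range(p)]
while max(fitness(h) for h in P) < L:               # (*) threshold = optimum here
    Ps = [tournament(P) for _ in range(int(r * p))]  # P_s: r*p selected parents
    Po = []                                          # P_o: offspring
    for a, b in zip(Ps[0::2], Ps[1::2]):             # crossover random pairs
        cut = random.randrange(1, L)
        Po += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
    for h in Po:                                     # mutate ~m of the offspring
        if random.random() < m:
            h[random.randrange(L)] ^= 1              # flip one bit at random
    P.sort(key=fitness)                              # P_w: the weakest r*p ...
    P = P[len(Po):] + Po                             # ... replaced by P_o
```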

10 Influence of Learning

Baldwinian evolution: learned behaviour causes changes only to the fitness landscape.
Lamarckian evolution: learned behaviour also causes changes to the parents' genotypes.

Example:

… calculating fitness involves two steps, namely k-means clustering and NAP classification. The effect of k-means clustering is to refine the starting positions of the centroids to more "representative" final positions. At the individual's level, this may be viewed as a form of learning, since NAP classification based on the final centroids' positions is most likely to yield better results than NAP classification based on their starting positions. Hence, through k-means clustering, an individual improves its performance. As fitness is computed after learning, GA-RBF makes implicit use of the Baldwin effect. (Here, we view the result of k-means clustering, namely the improved positions of the centroids, as the learned "traits".) A straightforward way of implementing Lamarckian evolution consists of coding the new centroids' positions back onto the chromosomes of the individuals of the current generation, prior to genetic recombination.
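A heavily simplified sketch of the two modes. Here `local_search` and `evaluate` are toy placeholders standing in for k-means centroid refinement and NAP classification, which are not reproduced here; only the write-back step distinguishes the two regimes:

```python
def local_search(genome):
    """Toy 'learning' step: improve the genome by one greedy bit flip."""
    g = list(genome)
    for i, bit in enumerate(g):
        if bit == 0:
            g[i] = 1
            break
    return g

def evaluate(genome):
    """Toy fitness stand-in (one-max) for classification accuracy."""
    return sum(genome)

def fitness_with_learning(genome, lamarckian=False):
    refined = local_search(genome)   # the individual "learns" before evaluation
    score = evaluate(refined)        # fitness is computed AFTER learning:
                                     # this alone gives the Baldwin effect
    if lamarckian:
        genome[:] = refined          # Lamarckian: code the learned traits
                                     # back onto the chromosome itself
    return score
```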

11 Conclusion

Genetic algorithms are used primarily for:
    Optimization problems (e.g., TSP)
    Hybrid systems (e.g., NN evolution)
    Artificial life
    Learning in classifier systems

