The Assessment and Application of Lineage Information in Genetic Programs for Producing Better Models Gary D. Boetticher Univ. of Houston.


1 The Assessment and Application of Lineage Information in Genetic Programs for Producing Better Models
Gary D. Boetticher, Univ. of Houston - Clear Lake, Houston, TX, USA
Kim Kaminsky, Univ. of Houston - Clear Lake, Houston, TX, USA
IEEE International Conference on Information Reuse and Integration

2 About the Author: Gary D. Boetticher
• Ph.D. in Machine Learning and Software Engineering: A neural network-based software reuse economic model
• Executive member of IEEE Reuse Standard Committees (1990s)
• Commercial consultant: U.S. Olympic Committee, LDDS Worldcom, Mellon Mortgage, …
• Currently: Associate Professor, Department of Comp. Science/Software Engineering, University of Houston - Clear Lake, Houston, TX, USA
• Research interests: Data mining, ML, Computational Bioinformatics, and Software metrics

3 Motivating Questions
Does chromosome lineage information within a Genetic Program (GP) provide any insight into the effectiveness of solving problems?
If so, how could these insights be utilized to make better breeding decisions?

4 Genetic Program Overview
Problem: given X, Y, and Z, which equation produces RESULT? [Table: columns X, Y, Z, RESULT; data rows not recoverable]
1) Create a population of equations
   Eq#    Equation
   1      X+Y
   2      (Z-X)*Y+X
   :      :
   1000   (X*X)-Z
2) Determine the fitness for each (1 / Standard Error)
3) Breed equations: parents X + Y and (Z-X) * Y+X yield offspring (Z-X) + Y and X * Y+X
4) Generate new populations and breed until a solution is found
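
The fitness and breeding steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the tuple-based tree encoding, the helper names (`evaluate`, `fitness`, `crossover`), the crossover points, and the reading of "standard error" as the root-mean-square residual are all assumptions made here.

```python
import math

# Hypothetical encoding: an equation is a variable name (str)
# or a tuple (operator, left_subtree, right_subtree).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def evaluate(tree, row):
    """Evaluate an equation tree against one data row (dict of variable values)."""
    if isinstance(tree, str):
        return row[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def fitness(tree, rows, target="RESULT"):
    """Fitness = 1 / standard error (assumed here to be the RMS residual)."""
    squared = [(evaluate(tree, r) - r[target]) ** 2 for r in rows]
    std_err = math.sqrt(sum(squared) / len(squared))
    return math.inf if std_err == 0 else 1.0 / std_err

def get_subtree(tree, path):
    """Follow a path of tuple indices down to a subtree."""
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(node)

def crossover(a, path_a, b, path_b):
    """Breed two equations by swapping the subtrees found at the given paths."""
    sub_a, sub_b = get_subtree(a, path_a), get_subtree(b, path_b)
    return replace_subtree(a, path_a, sub_b), replace_subtree(b, path_b, sub_a)

# The slide's breeding example: X + Y crossed with (Z-X) * Y + X.
parent1 = ("+", "X", "Y")
parent2 = ("+", ("*", ("-", "Z", "X"), "Y"), "X")
child1, child2 = crossover(parent1, (1,), parent2, (1, 1))
```

Here `child1` encodes (Z-X) + Y and `child2` encodes X * Y + X, matching the offspring shown on the slide. A real GP would pick the crossover points at random rather than fixing them.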

5 Genetic Program Overview
Generation N
   Equation               Fitness
   (X+Y)                  87
   (X - Z) * (Y * Y)      86
   ZY                     75
   :                      :
   Y                      22
   Y - X                  18
Generation N+1
   Equation               Fitness
   (X - Z)
   (X + Y) * (Y * Y)
   Z + Y
   :
   X
   Y + Y
Why discard legacy information?

6 Goal: Examine fitness patterns over time
The slide shows the following ranked table under each of Generation 1, Generation 2, and Generation 3:
   Equation                         Fitness
   (X+Y)                            87
   (X - Z) * (Y * Y)                86
   ZY                               85
   (X - Z) * (Y * Y)                84
   Y                                79
   Y - X                            75
   Z + Y                            75
   (X - Z) * (Y * Y)                75
   Y                                73
   Y - X                            71
   (X - Z) * (Y * Y) + W + W        68
   Y - X                            67
   ZY                               66
   (X - Z) * (Y * Y)                66
   Y                                65
   Y - X                            65
   (X - Z) * (Y * Y) + W + W        64
   Y - X                            64
   Z - Y                            62
   (X - Z) * (Y * Y)                59
   Y                                58
   Y - X                            55
   (X - Z) * (Y * Y) + W + W        44
Localized? Volatile?

7 Proof of Concept Experiments
Experiments using synthetic equations:
   Z = W + X + Y
   Z = 2 * X + Y - W
   Z = X / Y
   Z = X^3
   Z = W^2 + W * X - Y
Data slightly perturbed to prevent premature convergence
Genetic Program: 1000 Chromosomes (Equations), 50 Generations, breeding based on fitness rank
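
One way such perturbed synthetic data might be generated is sketched below. The slide does not say how the data was sampled or perturbed, so the input range (1 to 10), the 1% multiplicative noise, the row count, and the names `TARGETS` and `make_dataset` are all assumptions; the sketch also reads the slide's "X 3" and "W 2" as powers.

```python
import random

# The five target equations from the slide, as functions of W, X, and Y.
TARGETS = {
    "Z = W + X + Y": lambda w, x, y: w + x + y,
    "Z = 2 * X + Y - W": lambda w, x, y: 2 * x + y - w,
    "Z = X / Y": lambda w, x, y: x / y,
    "Z = X^3": lambda w, x, y: x ** 3,
    "Z = W^2 + W * X - Y": lambda w, x, y: w ** 2 + w * x - y,
}

def make_dataset(target, n_rows=100, noise=0.01, seed=0):
    """Sample inputs uniformly and perturb the exact output by up to +/- noise (relative)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    rows = []
    for _ in range(n_rows):
        # Inputs start at 1.0 (assumed) so that X / Y never divides by zero.
        w, x, y = (rng.uniform(1.0, 10.0) for _ in range(3))
        z = target(w, x, y) * (1.0 + rng.uniform(-noise, noise))
        rows.append({"W": w, "X": x, "Y": y, "Z": z})
    return rows

data = make_dataset(TARGETS["Z = W + X + Y"], n_rows=50)
```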

8 Proof of Concept Experiments - 2
For the 1000 Chromosomes:
   Divide into 5 groups of 200 (by fitness)
   Focus on the best, middle, and worst groups
   See where each group’s offspring occur in the next generation
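
The grouping and offspring tracking described on this slide could be sketched as follows. The function names and the `parents_of` bookkeeping structure are hypothetical; the slides do not show how lineage was actually recorded.

```python
from collections import Counter

def rank_groups(fitnesses, n_groups=5):
    """Assign each chromosome a group index by fitness rank (0 = best group)."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i], reverse=True)
    size = max(1, len(fitnesses) // n_groups)
    groups = [0] * len(fitnesses)
    for rank, idx in enumerate(order):
        # Any remainder from uneven division falls into the last group.
        groups[idx] = min(rank // size, n_groups - 1)
    return groups

def offspring_distribution(parent_groups, child_groups, parents_of):
    """For each parent group, count which groups its offspring land in.

    parents_of[c] holds the parent indices of child c (assumed bookkeeping).
    """
    table = {}
    for child, parents in enumerate(parents_of):
        for p in parents:
            table.setdefault(parent_groups[p], Counter())[child_groups[child]] += 1
    return table

# Toy example with 10 chromosomes, so each of the 5 groups holds 2.
groups = rank_groups([10, 50, 30, 20, 40, 60, 70, 80, 90, 100])
```

With 1000 chromosomes and `n_groups=5`, each group holds 200, as on the slide. Group 0 is then the best group, group 2 the middle, and group 4 the worst.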

9 Results for Z = W + X + Y [Figure panels: Best, Middle, Worst]

10 Results for Z = 2 * X + Y - W [Figure panels: Best, Middle, Worst]

11 Results for Z = X / Y [Figure panels: Best, Middle, Worst]

12 Results for Z = X^3 [Figure panels: Best, Middle, Worst]

13 Results for Z = W^2 + W * X - Y [Figure panels: Best, Middle, Worst]

14 Applied Experiments
Best class produces best offspring. Now what?
Compare 2 Genetic Programs (GPs):
1) Use a vanilla-based GP
2) Use a GP that breeds only the top 20% of a population and replicates 5 times
Genetic Program: 1000 Chromosomes (Equations), 50 Generations, 20 Trials
Equations to model:
   Z = Sin(W) + Sin(X) + Sin(Y)
   Z = log10(W^X) + (Y * Z)
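
One possible reading of "breeds only the top 20% of a population and replicates 5 times" is that the best fifth of the population is kept and copied five times to refill the breeding pool. The sketch below follows that reading only; the function name and parameters are hypothetical, and the slides do not confirm this as the authors' exact procedure.

```python
def lineage_breeding_pool(population, fitnesses, n_groups=5, copies=5):
    """Keep the best 1/n_groups of the population and replicate it `copies` times."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i], reverse=True)
    keep = max(1, len(population) // n_groups)  # top 20% when n_groups == 5
    elite = [population[i] for i in order[:keep]]
    return elite * copies  # pool returns to the original size when copies == n_groups

# Toy example: 10 chromosomes, so the top 20% is the best 2.
population = list("abcdefghij")
scores = [1, 9, 3, 8, 5, 2, 7, 4, 6, 10]
pool = lineage_breeding_pool(population, scores)
```

For the slide's population of 1000, this keeps the best 200 equations and yields a pool of 1000 candidates for breeding.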

15 Results for Z = Sin(W) + Sin(X) + Sin(Y) [Table: Vanilla-Based GP vs. Lineage-Based GP on Average Fitness, Average r, and Average Generations needed to complete; values not recoverable]

16 Results for Z = log10(W^X) + (Y * Z) [Table: Vanilla-Based GP vs. Lineage-Based GP on Average Fitness, Average r, and Average Generations needed to complete; values not recoverable]

17 Conclusions
Proof of concept experiments demonstrate the viability of considering lineage in GPs.
Applied experiments show that lineage-based GP modeling produces better results faster.

