Graph Crossover Problem
Any edge may be a member of one or more cycles.
Graph fragments produced by division may have more than one crossover point ("broken edges").
When two fragments are combined, they may have different numbers of broken edges to be merged.

Our crossover operator
Operates on any connected graph.
Divides graphs at randomly generated cut sets.
Can evolve arbitrary cyclic structures given at least some cycles in the initial population.
Always produces connected undirected graphs.
Almost always produces connected directed graphs.
Crossover
Applies to strings (e.g., parents abcd and wxyz yield children abyz and wxcd), trees, and graphs.
Graph Crossover
Rip two parents apart, then combine into a child.
Molecule Division
Choose an initial random bond.
Repeat
–Find the shortest path between the initial bond's atoms.
–Remove and remember a random bond from this path. These bonds are called "broken edges."
Until a cut set is found, i.e., no path exists between the initial bond's vertices.
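The division loop above can be sketched in Python. This is a minimal illustration, not the JavaGenes implementation: the adjacency-dict representation, the function names `shortest_path` and `divide`, and the use of breadth-first search for the shortest path are all assumptions made for the example, and bond order is ignored.

```python
import random
from collections import deque

def shortest_path(adj, start, goal):
    """Breadth-first search; returns a list of vertices, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def divide(adj, rng=random):
    """Cut a connected undirected graph; returns (cut adjacency, broken edges)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    a, b = rng.choice(edges)              # initial random bond
    broken = []
    while True:
        path = shortest_path(adj, a, b)
        if path is None:                  # cut set found: a and b are separated
            break
        i = rng.randrange(len(path) - 1)  # random bond on the shortest path
        u, v = path[i], path[i + 1]
        adj[u].discard(v)
        adj[v].discard(u)
        broken.append((u, v))             # remember the broken edge
    return adj, broken
```

On a four-atom ring, `divide` always breaks exactly two bonds: the initial bond itself (the first shortest path is that single bond) and one bond on the remaining path around the ring.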
Fragment Recombination
Repeat
–Select a random broken edge.
–Determine which fragment it is associated with.
–If at least one broken edge exists in the other fragment:
  –choose one at random
  –merge the broken edges into one bond, respecting valence by reducing the order of the bond if necessary
–Else flip a coin:
  –heads: attach the broken edge to a random atom in the other fragment (respecting valence)
  –tails: discard the broken edge
Until each broken edge has been processed exactly once.
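A simplified Python sketch of the recombination loop follows. It is an illustration under stated assumptions rather than the actual operator: each broken edge is represented only by the atom at its retained end, the function name `recombine` and its arguments are hypothetical, and valence checking and bond-order reduction are left out entirely.

```python
import random

def recombine(stubs_a, stubs_b, atoms_a, atoms_b, rng=random):
    """Return new bonds (atom_in_a, atom_in_b) joining two fragments.

    stubs_x lists the retained-end atom of each broken edge in fragment x;
    atoms_x lists all atoms of fragment x (assumed non-empty).
    Valence is NOT checked in this sketch.
    """
    pending = {"a": list(stubs_a), "b": list(stubs_b)}
    atoms = {"a": list(atoms_a), "b": list(atoms_b)}
    bonds = []
    while pending["a"] or pending["b"]:
        # Pick a random unprocessed broken edge, uniformly over both fragments.
        side = rng.choice([s for s in "ab" for _ in pending[s]])
        other = "b" if side == "a" else "a"
        atom = pending[side].pop(rng.randrange(len(pending[side])))
        if pending[other]:
            # Merge with a random broken edge from the other fragment.
            partner = pending[other].pop(rng.randrange(len(pending[other])))
        elif rng.random() < 0.5:
            # Heads: attach to a random atom in the other fragment.
            partner = rng.choice(atoms[other])
        else:
            # Tails: discard the broken edge.
            continue
        bonds.append((atom, partner) if side == "a" else (partner, atom))
    return bonds
```

When both fragments carry the same number of broken edges, every edge is merged with a partner; the coin flip only comes into play once one side runs out.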
Molecule Fitness Function
All-pairs-shortest-path distance
Assign an extended type to each atom.
–Extended type = (element, |single bonds|, |double bonds|, |triple bonds|)
Find the shortest bond path between each pair of atoms.
Create a bag with one item per atom pair.
–item = (type1, type2, path length)
–bag = set with repeated items
distance = 1 - |intersection| / |union|
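The fitness calculation can be sketched as follows. The data layout (a dict of atom id to element symbol, and a dict of bond to bond order) and all function names are assumptions for illustration. The sketch also assumes the usual multiset definitions: intersection takes the minimum count of each item and union the maximum.

```python
from collections import Counter, deque
from itertools import combinations

def extended_type(atom, elements, bonds):
    """(element, |single bonds|, |double bonds|, |triple bonds|) for one atom."""
    counts = [0, 0, 0]
    for (u, v), order in bonds.items():
        if atom in (u, v):
            counts[order - 1] += 1
    return (elements[atom], *counts)

def path_lengths(source, bonds):
    """Bond-count distance from source to every reachable atom (BFS)."""
    adj = {}
    for u, v in bonds:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def molecule_bag(elements, bonds):
    """One item (type1, type2, path length) per atom pair."""
    types = {a: extended_type(a, elements, bonds) for a in elements}
    dist = {a: path_lengths(a, bonds) for a in elements}
    bag = Counter()
    for a, b in combinations(sorted(elements), 2):
        t1, t2 = sorted((types[a], types[b]))  # order-independent pair
        bag[(t1, t2, dist[a].get(b))] += 1
    return bag

def bag_distance(bag_a, bag_b):
    """1 - |intersection| / |union| over multisets; 0 means identical bags."""
    a, b = Counter(bag_a), Counter(bag_b)
    union = sum((a | b).values())
    if union == 0:
        return 0.0
    return 1.0 - sum((a & b).values()) / union

# Heavy-atom skeletons: C-C-O versus C-O-C.
ethanol = molecule_bag({0: "C", 1: "C", 2: "O"}, {(0, 1): 1, (1, 2): 1})
ether = molecule_bag({0: "C", 1: "O", 2: "C"}, {(0, 1): 1, (1, 2): 1})
```

In this example the two skeletons share no (type1, type2, path length) items, so `bag_distance(ethanol, ether)` is 1.0, while any molecule compared with itself gives 0.0.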
JavaGenes in Action Finding with all-pairs-shortest-path and Tanimoto index fitness function (0 is perfect)
Molecular Dynamics and Mechanics
Newton's laws of motion in a potential field
Discover common conformations during dynamics
Discover minimum-energy conformations (e.g., protein folding problem)
Began in the 1960s with two-body potentials for inert gas modeling
In the 1980s, extended to metals and bonded systems (upper-right corner of the periodic table)
Our studies focus on evolving potentials for reactive systems (bonds break and form)
Molecular Potentials
Energy = sum of 2-body terms + sum of 3-body terms + …
Stillinger-Weber SiF potential function
2-body($r$) =
–$A (B r^{-p} - r^{-q}) \cdot \mathrm{cutoff}$
–$\mathrm{cutoff} = \exp(C / (r - a))$ for $r < a$, 0 otherwise
3-body($r_{ij}$, $r_{jk}$, $\theta$) =
–$(\alpha + \lambda (\cos\theta - \cos\theta_0)^2) \cdot \mathrm{cutoff}$
–$\mathrm{cutoff} = \exp(\gamma (1/(r_{ij} - a_1) + 1/(r_{jk} - a_1)))$
FFF additional term =
–$\delta (r_{ij} r_{jk})^{-m} \cdot \mathrm{cutoff}$
–$\mathrm{cutoff} = \exp(\beta (1/(r_{ij} - a_2) + 1/(r_{jk} - a_2)))$
Discovering parameters can require months or years.
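As a rough illustration, the 2-body and 3-body terms above can be evaluated in Python as below. The function names are hypothetical. Two defaults are assumptions not stated on the slide: C = 1, as in the standard Stillinger-Weber form, and cos(theta 0) = -1/3, the tetrahedral angle used for Si. The slide states the r < a condition only for the 2-body cutoff; the sketch assumes the 3-body cutoff likewise vanishes beyond a1.

```python
import math

def two_body(r, A, B, p, q, a, C=1.0):
    """A (B r^-p - r^-q) * exp(C / (r - a)) for r < a, else 0.

    C defaults to 1.0 (assumed, standard Stillinger-Weber form).
    """
    if r >= a:
        return 0.0
    return A * (B * r ** -p - r ** -q) * math.exp(C / (r - a))

def three_body(r_ij, r_jk, cos_theta, alpha, lam, gamma, a1,
               cos_theta0=-1.0 / 3.0):
    """(alpha + lam (cos_theta - cos_theta0)^2) * cutoff.

    cos_theta0 defaults to the tetrahedral value -1/3 (assumed).
    The cutoff is taken as 0 when either distance reaches a1 (assumed).
    """
    if r_ij >= a1 or r_jk >= a1:
        return 0.0
    cutoff = math.exp(gamma * (1.0 / (r_ij - a1) + 1.0 / (r_jk - a1)))
    return (alpha + lam * (cos_theta - cos_theta0) ** 2) * cutoff

# Published Stillinger-Weber Si 2-body parameters (reduced units).
SI_2BODY = dict(A=7.049556277, B=0.6022245584, p=4, q=0, a=1.8)
well_depth = two_body(2 ** (1 / 6), **SI_2BODY)
```

With the published Si parameters, the 2-body term reaches approximately -1 at r = 2^(1/6) in reduced units, which gives a quick sanity check for any implementation.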
Evolving Molecular Force Fields
Chromosome: 2D ragged array of floating-point numbers
–SiSi, SiF, FF, SiSiSi, SiSiF, SiFSi, FSiF, FFSi, FFF
–5 to 63 parameters
Transmission operators
–Interval crossover
–Mutation
Fitness function: RMS difference between an individual's energies and the correct energies for n molecules
Correct energies
–Currently: energies generated with the force field using published parameters
–Next step: energies generated by higher-quality quantum codes
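The RMS fitness measure can be written in a few lines of Python. The function name and the list-of-floats inputs are assumptions for illustration; how each individual's energies are computed from its chromosome is outside this sketch.

```python
import math

def rms_fitness(predicted, correct):
    """Root-mean-square difference between two equal-length energy lists.

    Lower is better; 0 means the individual reproduces every energy exactly.
    """
    if len(predicted) != len(correct) or not correct:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - c) ** 2 for p, c in zip(predicted, correct))
    return math.sqrt(squared / len(correct))
```

For example, `rms_fitness([0.0, 0.0], [3.0, 4.0])` is the square root of (9 + 16) / 2, roughly 3.54.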
Interval Crossover
For each allele:
1. Construct an interval from the parental values: lower parental value (1.1) to higher parental value (2.1).
2. Construct a larger interval (100% larger): (0.6) to (2.6).
3. Choose a random number from the larger interval (1.3).
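The three steps can be sketched in Python for a ragged 2D chromosome. The function name and the `expansion` parameter are hypothetical. The sketch assumes the interval is widened symmetrically, which matches the slide's numbers (1.1 to 2.1 becomes 0.6 to 2.6), and that the child value is drawn uniformly; the slide does not state the distribution.

```python
import random

def interval_crossover(parent_a, parent_b, expansion=1.0, rng=random):
    """Produce one child from two ragged 2D chromosomes of the same shape.

    expansion=1.0 widens each parental interval by 100%, half on each side.
    """
    child = []
    for row_a, row_b in zip(parent_a, parent_b):
        child_row = []
        for x, y in zip(row_a, row_b):
            lo, hi = min(x, y), max(x, y)          # step 1: parental interval
            pad = (hi - lo) * expansion / 2.0      # step 2: widen both ends
            child_row.append(rng.uniform(lo - pad, hi + pad))  # step 3
        child.append(child_row)
    return child
```

One consequence of this scheme is that an allele on which both parents agree is passed on unchanged, since the interval has zero width; variation at such alleles has to come from mutation.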
Si potential results
population = 1000
generations = 3000
fitness function: 100 random 5-body Si tetrahedra
31 runs
Best run results (published value in parentheses):
A = 7.151346144801161 (7.049556277)
B = 0.6007865398735448 (0.6022245584)
p = 3.9825158463763977 (4)
q = 0.014970062068368135 (0)
a = 1.797123919332413 (1.8)
alpha = 0.1442970771852687 (0)
lambda = 27.783092740584205 (21)
gamma = 1.328091763076223 (1.2)
a1 = 1.8173559091012945 (1.8)
Future Plans
Hill climbing
Use experimental data for new fitness functions
Feed results from easy to hard evolution
–SiSi (5), SiF (6), FF (6)
–SiSiSi (9), SiSiF (10), SiFSi (10), FSiF (10), FFSi (10), FFF (14)
–Full SiF (63)
Condor
Cycle-scavenging batch system for single-workstation jobs
Desktop machines, nights, weekends, etc.
University of Wisconsin
In production since 1986
Unix workstations
250 SGI and 50 Sun workstations at code IN
Good for parameter studies and stochastic algorithms (e.g., GA)
One JavaGenes job per Condor job