
1 Discrete Optimization - Last Time: Local Search; Meta-Heuristic Search; Specific Search Strategies; Simulated Annealing; Stochastic Machines; Convergence Analysis. (Shi-Chung Chang, NTUEE, GIIE, GICE, Spring 2008)

2 Discrete Optimization - Today: SA Convergence Analysis; Evolutionary Computation; Genetic Algorithms: Simple Example; GA: General; Coding and Mapping; Selection; Genetic Variability Operators; Fitness; Design Issues.

3 Reading Assignments: 1. Darrell Whitley, "A Genetic Algorithm Tutorial," Statistics and Computing, 4:65-85, 1994.

4 Convergence Analysis of Simulated Annealing Ref: E. Aarts, J. Korst, P. Van Laarhoven, “Simulated Annealing,” in Local Search in Combinatorial Optimization, edited by E. Aarts and J. Lenstra, 1997, pp. 98-104.

5 Boltzmann distribution. At thermal equilibrium at temperature T, the Boltzmann distribution gives the relative probability that the system will occupy state A vs. state B as P(A)/P(B) = exp(-(E(A) - E(B))/T), where E(A) and E(B) are the energies associated with states A and B.

6 Simulated annealing in practice. Geman & Geman (1984): if T is lowered sufficiently slowly (with respect to the number of iterations used to optimize at a given T), simulated annealing is guaranteed to find the global minimum. Caveat: this algorithm has no end (Geman & Geman's temperature schedule decreases as 1/log of the number of iterations, so T never reaches zero), so it may take an infinite amount of time to find the global minimum.
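For reference, the logarithmic cooling schedule referred to above is usually written in the following form, with c a problem-dependent constant (this is a reconstruction of the standard result, not an equation taken from the slide):

    T_k = \frac{c}{\log(1 + k)}, \qquad k = 1, 2, \ldots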

7 Simulated annealing algorithm. Idea: escape local extrema by allowing "bad moves," but gradually decrease their size and frequency. The algorithm (here stated for the case where the goal is to minimize E) is sketched below.
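A minimal Python sketch of this loop; the neighbor() proposal function, the geometric cooling factor alpha, and the iteration counts are illustrative assumptions, not prescriptions from the slides:

    import math
    import random

    def simulated_annealing(E, neighbor, x0, T0=1.0, alpha=0.95,
                            steps_per_T=100, T_min=1e-3):
        """Minimize the energy function E starting from solution x0."""
        x, T = x0, T0
        best = x
        while T > T_min:
            for _ in range(steps_per_T):
                x_new = neighbor(x)          # propose a random neighboring solution
                dE = E(x_new) - E(x)
                # Always accept improvements; accept "bad moves" with
                # probability exp(-dE / T) (Metropolis criterion).
                if dE <= 0 or random.random() < math.exp(-dE / T):
                    x = x_new
                    if E(x) < E(best):
                        best = x
            T *= alpha                       # geometric cooling schedule
        return best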

8 Note on simulated annealing: limit cases. Boltzmann acceptance rule: accept a "bad move" with ΔE < 0 (when the goal is to maximize E) with probability P(ΔE) = exp(ΔE/T). If T is large: ΔE < 0 and ΔE/T is negative but close to zero, so exp(ΔE/T) is close to 1 and bad moves are accepted with high probability (random-walk behavior). If T is near 0: ΔE/T is a large negative number, so exp(ΔE/T) is close to 0 and bad moves are rarely accepted (deterministic down-hill behavior).

9 Markov Model: solution <-> state; cost of a solution <-> energy of a state. Generation probability of state j from state i: G_ij.

10 Acceptance and Transition Probabilities: the acceptance probability of state j as the next state when at state i, and the transition probability from state i to state j (see below).
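The formulas on this slide were shown as images; in the formulation of the Aarts-Korst reference cited on slide 4 (with c_k the control parameter, i.e. the temperature, and f the cost function), they take roughly the following form:

    A_{ij}(c_k) = \min\!\left\{1,\; \exp\!\left(-\frac{f(j) - f(i)}{c_k}\right)\right\},
    \qquad
    P_{ij}(c_k) =
    \begin{cases}
      G_{ij}(c_k)\,A_{ij}(c_k), & j \neq i,\\
      1 - \sum_{l \neq i} P_{il}(c_k), & j = i.
    \end{cases}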

11 Irreducibility and Aperiodicity of the Markov Chain: irreducibility; aperiodicity.

12 Theorem 1: Existence of a Unique Stationary Distribution. For a finite homogeneous Markov chain, irreducibility + aperiodicity imply the existence of a unique stationary distribution.

13 Theorem 2: Asymptotic Convergence of Simulated Annealing. P(k): the transition matrix of the homogeneous Markov chain associated with the SA algorithm. If c_k = c for all k, a unique stationary distribution exists.

14 Asymptotic Convergence of Simulated Annealing: from Theorem 2, as the control parameter c is lowered toward 0, the stationary distribution concentrates on the set of globally optimal solutions, so the algorithm converges asymptotically to a global optimum.

15 Genetic Algorithms and their Application to the Artificial Evolution of Genetic Regulatory Networks. Tutorial, ICSB 2007. Johannes F. Knabe, Katja Wegner, and Maria J. Schilstra, University of Hertfordshire, UK.

16 Part 1: Fundamentals - Biological Evolution

17 Evolutionary cycle (one generation): Population -> Selection -> Parents -> Recombination -> Mutation -> Offspring -> Replacement -> Population.

18 Dictionary 1: Gene - smallest unit carrying genetic information; Genotype - the collection of all genes; Phenotype - expression of the genotype in the environment; Individual - single member of a population, with genotype and phenotype; Population - set of several individuals; Generation - one iteration of evaluation, selection and reproduction with variation.

19 Selection and Reproduction. Selection does not act on the genotype at all, but on the performance of the phenotype (fitness). There is differential reproduction: phenotypes better adapted to the environment are likely to produce more offspring. Slightly unfaithful reproduction creates genotypic variations, which affect traits of the phenotype, which in turn affect fitness. These genotypic variations are heritable.

20 Recombination (crossover). Choose two individuals from the current population (the parents); a new combination of the genetic material of these individuals yields the offspring. No new genetic information is created, only a reshuffling of existing information, but this can have strong effects on the phenotype. (Image credit: http://student.biology.arizona.edu/honors2001/group12/introduction.html)

21 Duplication: any doubling of a certain region, e.g. through unequal recombination. If this region consists of a gene, it is called gene duplication. (See http://en.wikipedia.org/wiki/Gene_duplication)

22 Mutation: permanent changes to the genetic material, which can be caused by errors during reproduction of DNA. Mutation rate: e.g. 1 in 10,000 bases is incorrectly reproduced. Mutation brings variability into reproduction; the changes are usually small at the individual level, but their effect strongly depends on the "importance" of the mutated base to the phenotype. (See http://www.biocrawler.com/encyclopedia/Mutation)

23 The Evolutionary Mechanisms: selection and differential reproduction DECREASE diversity in the population; genetic operators (mutation, recombination) INCREASE diversity in the population.

24 Part 1: Fundamentals (2) - Evolutionary Computation, Genetic Algorithms (GA)

25 Evolutionary Computation

26 Evolutionary Computation: exploitation of concepts from natural evolution for problem solving using computers; simulation of evolutionary processes (recombination, mutation, selection) to solve a desired problem. Particularly well suited to complex, multidimensional problems too big to search exhaustively (non-linear optimization problems). It cannot solve all problems perfectly, but it has fewer restrictions than most problem-solving algorithms.

27 Optimization problems. Example: hill-climbing. Start with an estimate of the global maximum and try to improve it by finding other solutions with a greater value than the current estimate (local search). Hazard: local maxima; the search could converge to a local maximum instead of the global one.

28 Evolutionary cycle, revisited: (random) start population -> Evaluation -> End? -> Selection -> Parents -> Recombination -> Mutation -> Offspring -> Replacement -> back to Evaluation.

29 Dictionary 2: Individual - one candidate solution; Population - set of individuals; Genotype - encoded representation of an individual; Phenotype - decoded representation of an individual; Mapping - decodes the genotype into the phenotype; Mutation - variability operator that modifies a genotype; Recombination/Crossover - variability operator mixing genotypes; Fitness - performance of a phenotype with regard to the objective; Iteration - generation.

30 EC - General properties. Exploits the collective "learning" process of a population (each individual = one solution = one search point). Evaluation of individuals in their environment gives a measure of quality (fitness), which allows comparison of individuals. Selection favors better individuals, which reproduce more often than worse ones. Offspring are generated by random recombination and mutation of the selected parents.

31 Main trends: Genetic algorithms (GAs); Genetic programming (a subform of GAs); Evolutionary strategies (ES); Evolutionary programming (EP).

32 Genetic Algorithms: Simple Example

33 Simple example: f(x) = x². Finding the maximum of the function f(x) = x² over the range [0, 31]. Goal: find the maximum (31² = 961). Binary representation: strings of length 5 encode 32 numbers (0-31).
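A minimal sketch of this encoding and its fitness evaluation in Python; the function names are illustrative and not taken from the slides:

    def decode(bits):
        """Map a 5-bit genotype string such as '10101' to its integer phenotype."""
        return int(bits, 2)

    def fitness(bits):
        """Fitness is simply f(x) = x * x for the decoded value."""
        x = decode(bits)
        return x * x

    # Example: the optimal genotype '11111' decodes to 31 with fitness 961.
    assert decode('11111') == 31 and fitness('11111') == 961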

34 F(x) = x² - Start population:
String 1: binary 00110, value 6, fitness 36
String 2: binary 00011, value 3, fitness 9
String 3: binary 01010, value 10, fitness 100
String 4: binary 10101, value 21, fitness 441
String 5: binary 00001, value 1, fitness 1

35 F(x) = x² - Selection: the worst individual (String 5, fitness 1) is removed from the population above.

36 F(x) = x² - Selection: the best individual (String 4, fitness 441) reproduces twice, to keep the population size constant.

37 F(x) = x² - Selection: all other strings are reproduced once.

38 F(x) = x² - Recombination: parents and crossover position (x-position) are randomly selected (equal recombination). Pairings: String 1 with String 2 at position 4; String 3 with String 4 at position 2. String 1 (00110) and String 2 (00011) crossed at position 4 give offspring 00111 and 00010.

39 F(x) = x² - Recombination (continued): String 3 (01010) and String 4 (10101) crossed at position 2 give offspring 01101 and 10010.

40 F(x) = x² - Mutation (bit flip) applied to the offspring: String 1: 00111 (7) -> 10111 (23); String 4: 10101 (21) -> 10001 (17).

41 F(x) = x²: all individuals in the parent population are replaced by offspring in the new generation (generations are discrete!). New population (offspring):
String 1: binary 10111, value 23, fitness 529
String 2: binary 00010, value 2, fitness 4
String 3: binary 01101, value 13, fitness 169
String 4: binary 10000, value 16, fitness 256
String 5: binary 10001, value 17, fitness 289

42 F(x) = x² - End. Iterate until a termination condition is reached, e.g.: number of generations; best fitness; process time; no improvement after a number of generations. Result after one generation: best individual 10111 (23), fitness 529.

43 Genetic Algorithms - General

44 Genetic algorithms: a meta-heuristic search. Differences from other search and optimization algorithms: GAs search from a population of points (possible solutions), not from a single point; GAs use probabilistic, not deterministic, rules.

45 History. In the 1960s, John H. Holland (University of Michigan) abstracted and generalized the population concept with genetic coding and operators. Used in bioinformatics, e.g. motif discovery, sequence alignment, protein structure prediction, etc.

46 Procedure of Genetic Algorithms. P(t): parents in current generation t; C(t): offspring in current generation t. A sketch of the main loop is shown below.
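The flowchart on this slide was shown as an image; below is a minimal, generic sketch of the usual generational loop in Python. The select, crossover and mutate arguments are placeholders for operators such as those sketched later in this tutorial, and all names are illustrative assumptions:

    def genetic_algorithm(init_population, evaluate, select, crossover, mutate,
                          max_generations=100):
        """Generic generational GA: P(t) are the parents, C(t) the offspring."""
        population = init_population()                # P(0)
        for t in range(max_generations):
            fitnesses = [evaluate(ind) for ind in population]
            offspring = []                            # C(t)
            while len(offspring) < len(population):
                p1 = select(population, fitnesses)
                p2 = select(population, fitnesses)
                c1, c2 = crossover(p1, p2)
                offspring.extend([mutate(c1), mutate(c2)])
            population = offspring[:len(population)]  # replacement: P(t+1) = C(t)
        return max(population, key=evaluate)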

47 Coding and Mapping

48 Genetic coding. Finite strings (the genome, representing the genotype); a string consists of units of information (unit = gene). One string (one individual) = one possible solution of the problem. The genotype is often a bit string or a vector of real numbers, e.g. 010101 or (1.853, 0.492).

49 Genetic coding and mapping. What should the phenotype look like, and how should it be encoded as a genotype? How does one map from genotype to phenotype, considering the sources of variation (mutation and recombination)? Highly problem dependent! Hint: small changes to the genotype should usually result in small changes to the phenotype, i.e. similar performance (heritability of traits). Heritability of traits is important; otherwise the GA degenerates into random search.

50 Mapping example: binary coding versus Gray coding of a number. Hamming distance: the number of bits that have to be changed to map one string into another, e.g. 000 and 001 have distance 1. Remember: small changes in genotype should cause small changes in phenotype.

51 Mapping example, continued. Binary coding of 0-7 (phenotype): 000 = 0, 001 = 1, 010 = 2, 011 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7.

52 Mapping example, continued. Hamming distances under binary coding, e.g.: 000 (0) and 001 (1) have distance 1 (optimal); 011 (3) and 100 (4) have distance 3 (the maximum possible), even though the phenotypes are neighbors.

53 Mapping example, continued. Gray coding of 0-7: 000 = 0, 001 = 1, 011 = 2, 010 = 3, 110 = 4, 111 = 5, 101 = 6, 100 = 7.

54 Mapping example, continued. Under Gray coding, two neighboring numbers (phenotypes) always have a genotype Hamming distance of 1 (they differ by a single bit flip): an OPTIMAL mapping.
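A small Python sketch (not from the slides) of the standard conversion that produces exactly the Gray code table above:

    def binary_to_gray(n):
        """Gray code of an integer n: adjacent integers differ in a single bit."""
        return n ^ (n >> 1)

    def gray_to_binary(g):
        """Inverse mapping: fold the bits back down with XOR."""
        n = 0
        while g:
            n ^= g
            g >>= 1
        return n

    # Example: phenotypes 3 and 4 get genotypes 010 and 110, Hamming distance 1.
    assert binary_to_gray(3) == 0b010 and binary_to_gray(4) == 0b110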

55 Mapping example, continued. [Figure: comparing kinship (genotypes connected when their Hamming distance is 1) under binary coding and under Gray coding.]

56 Selection

57 Selection. Based on the fitness function, which determines how "good" an individual is (its fitness); better fitness means a higher probability of selection. Individuals are selected for differential reproduction of offspring in the next generation. Selection favors better solutions and decreases diversity in the population.

58 Selection - Roulette wheel. Each solution gets a region on a roulette wheel proportional to its fitness; the wheel is spun and the solution marked by the roulette-wheel pointer is selected. Stochastic selection: better fitness = higher chance of reproduction. (See http://www.edc.ncl.ac.uk/highlight/rhjanuary2007g02.php)
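A minimal sketch of fitness-proportionate (roulette-wheel) selection in Python, assuming non-negative fitness values; names are illustrative:

    import random

    def roulette_wheel_select(population, fitnesses):
        """Pick one individual with probability proportional to its fitness."""
        total = sum(fitnesses)
        pointer = random.uniform(0, total)   # where the spun wheel stops
        cumulative = 0.0
        for individual, fit in zip(population, fitnesses):
            cumulative += fit
            if cumulative >= pointer:
                return individual
        return population[-1]                # guard against floating-point rounding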

59 Selection - Elitism. One or more individuals are kept unchanged for the next population. Example: selection based on fitness values, keeping the best individual of the current population. Biologically unrealistic, but it ensures the best fitness of a generation never decreases; the cost is a decrease in diversity.

60 Selection - Tournament. Randomly select q individuals from the current population; the winner is the individual (or individuals) with the best fitness among these q. Example: with q = 6, select the best two of the six as parents for recombination.
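A minimal sketch of tournament selection in Python matching the q = 6, two-winner example above; names are illustrative:

    import random

    def tournament_select(population, fitnesses, q=6, winners=2):
        """Pick q random individuals and return the best `winners` of them."""
        contestants = random.sample(range(len(population)), q)
        ranked = sorted(contestants, key=lambda i: fitnesses[i], reverse=True)
        return [population[i] for i in ranked[:winners]]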

61 Genetic variability operators

62 Mutation. Varies details, usually exploitative: changes one position in the string, with each position having the same small probability of undergoing a mutation. Goal: search around an existing good solution and possibly leave local optima. Example: flipping one bit of the string 010000, or perturbing a real-valued gene from 1.853 to 1.807.
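A minimal sketch of bitwise mutation in Python; the per-bit rate p_mut is an illustrative assumption:

    import random

    def mutate(bits, p_mut=0.01):
        """Flip each bit of a genotype string independently with probability p_mut."""
        return ''.join(
            ('1' if b == '0' else '0') if random.random() < p_mut else b
            for b in bits
        )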

63 Recombination/Crossover. Usually explorative: creates new strings by combining parts of two existing parent strings (see the one-point crossover sketch after the next slide).

64 Recombination - unequal crossover: the crossover points are chosen independently for each string, so the exchanged segments (and the offspring lengths) can differ.
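A minimal Python sketch of both crossover variants described above; the treatment of unequal lengths follows the slide's description only loosely, and all names are assumptions:

    import random

    def one_point_crossover(parent1, parent2):
        """Equal recombination: both parents are cut at the same position."""
        point = random.randint(1, len(parent1) - 1)
        return (parent1[:point] + parent2[point:],
                parent2[:point] + parent1[point:])

    def unequal_crossover(parent1, parent2):
        """Unequal recombination: independent cut points, so offspring lengths
        may differ (which is one way duplications can arise, cf. slide 21)."""
        cut1 = random.randint(1, len(parent1) - 1)
        cut2 = random.randint(1, len(parent2) - 1)
        return (parent1[:cut1] + parent2[cut2:],
                parent2[:cut2] + parent1[cut1:])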

65 Fitness

66 Fitness function. In nature, only survival and reproduction count: "how well do I do in my environment?". The fitness space structure is defined by the kinship of genotypes together with the fitness function. Advantage: a visual representation can be useful when thinking about model design. Limitation: such pictures may be too simplistic outside toy problems, where spaces and movements are complex (think of crossover!).

67 Fitness space or landscape: a schema of genetic kinship. How we "move" in that landscape over generations is defined by our variability operators, usually mutation and recombination. Now add fitness... [Figure: 4-bit genotypes 0000, 0001, 0011, 0010, 0110, 0100, 1100, 1000, 1001 arranged by kinship.]

68 Fitness space or landscape (continued): the same kinship schema, now with a fitness value attached to each genotype (fitness axis added to the figure).

69 Fitness landscapes, continued. x/y axes: kinship, i.e. the more genetic resemblance, the closer together; z axis: fitness. Every "snowflake" is one individual; the search focuses on "promising" regions (due to differential reproduction). (Animation adapted from Andy Keane, University of Southampton.)

70 Fitness space - good design: it is easy to find the optimum by local search, because neighboring genotypes have similar fitness (a smooth curve, hence high evolvability). [Figure: fitness over genotypes.]

71 Fitness space - bad design: here we will have a hard time finding the optimum; evolvability is low (fitness is essentially right/wrong). Either the problem is not well suited to a GA, or the design is bad. [Figure: fitness over genotypes.]

72 Fitness space - mediocre design: many local optima, so we are likely to find one; however, there is not much of a gradient toward the global optimum, so random search could do about as well. [Figure: fitness over genotypes.]

73 Dynamic fitness landscape. Fitness does not need to be static over generations; letting it change can allow the search to reach regions that would otherwise remain uncovered. Natural fitness is certainly very dynamic. (Animation by Michael Herdy, TU Berlin.)

74 Design issues

75 Integrating problem knowledge. Problem knowledge always enters to some degree through the representation/mapping. Other options: create a more complex fitness function, or choose the start population instead of drawing it uniformly at random (useful e.g. if there are constraints on the range of solutions). Possible problems: loss of diversity and bias.

76 Design decisions. GAs offer high flexibility and adaptability because of their many options: problem representation; genetic operators and their parameters; selection mechanism; population size; fitness function. These decisions are highly problem dependent, and the parameters are not independent, so you cannot optimize them one by one.

77 Hints for the parameter search. Find a balance between exploration (new search regions) and exploitation (exhaustive search in the current region). Parameters can be adaptive, e.g. starting high (exploration) and decreasing (exploitation), or even be subject to evolution themselves. The balance is influenced by: mutation and recombination, which create individuals in new regions (diversity!) and fine-tune within current regions; and selection, which focuses the search on interesting regions.

78 Keep in mind. The start population has a lot of diversity; "investing" search time in areas that have proven good in the past leads to a loss of diversity over evolutionary time. Premature convergence: a quick loss of diversity poses a high risk of getting stuck in local optima. Evolvability: the fitness landscape should not be too rugged, and traits should be heritable; small genetic changes should map to small phenotype changes.

79 Wrapping up Part 1

80 GA - Summary. Selection: focus on the fittest individuals. Recombination: adds alternative solutions to the population. Mutation: makes sure that most of the search space remains reachable.

81 GA - Summary, continued. Advantages: the basic method is simple and broadly applicable; no need for a very detailed understanding of the problem, but the method can be adjusted to the problem if knowledge is present; fast, and can be run in parallel. Disadvantages: no guarantee of finding the best solution; high computational demands; adapting to the problem at hand can be hard, e.g. finding a suitable representation/mapping and evolutionary operators; the search can get "caught" in local optima.

82 More recent inputs from biology. Populations are spatial, e.g. for "speciation": interaction (mating, competition) is localized to maintain diversity. Populations have structure, e.g. niche protection: competition is stronger if many individuals do the same thing, which maintains diversity. Diploidy with dominance/recessivity. N-point crossover and other variants. Morphogenesis instead of a simple function mapping (allowing for modularity and making crossover less fatal).

83 GA Fundamental Theorems. Tian-Li Yu (tianliyu@cc.ee.ntu.edu.tw), Department of Electrical Engineering, National Taiwan University. Acknowledgment: David E. Goldberg's slides for his GA course; Ying-Ping Chen's slides for his EC course.

84 Agenda: schema theorem; takeover & drift; control map; problem difficulty; building-block (BB) hypothesis; time-to-convergence; population sizing; no-free-lunch (NFL) theorem.

85 Derive the Schema Theorem (Holland, 1975). It ensures the growth of the best schemata on average; the bound is conservative and ignores several favorable possibilities. It considers the effect of the different operators: selection (proportionate), crossover (one-point), and mutation (simple bitwise).

86 Schema (plural: schemata). Introduce the wild card '*': "1*0" denotes 100 and 110; the 1 and 0 are called fixed positions. Order of a schema H, o(H): the number of fixed positions, e.g. o(1**0*) = 2. Defining length of a schema H, δ(H): the distance between the first and the last fixed positions, e.g. δ(1**0*) = 4 - 1 = 3.

87 Schemata growth under selection. m(H): the number of individuals in the population that belong to H. Under proportionate selection, a schema grows according to its fitness relative to the (time-varying) population average fitness: growth is roughly exponential at the beginning and saturates when the population is nearly converged.

88 Schema disruption. One-point crossover and mutation each contribute a survival probability; these give lower bounds, since innovation (creation of new schema instances) is not considered.

89 The Schema Theorem. If m(H, t+1) > m(H, t), the schema H grows. It is a lower-bound estimate of schema growth that considers only the destructive forces. Minimal, sequential, superior (ms²) schemata grow.
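The inequality itself appeared only as an image on the slides; in the standard notation (f(H) the average fitness of schema H, \bar{f} the population average fitness, p_c and p_m the crossover and mutation probabilities, l the string length) it is usually written as:

    m(H, t+1) \;\ge\; m(H, t)\,\frac{f(H)}{\bar{f}}
    \left[1 - p_c\,\frac{\delta(H)}{l - 1}\right](1 - p_m)^{\,o(H)}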

