# Local optimization technique (G. Anuradha)

## Presentation transcript

### Local optimization technique

G. Anuradha

### Introduction

- The evaluation function defines a quality measure score landscape, also called a response surface or fitness landscape.

### Hill Climbing

- Start at a randomly generated state.
- Move to the neighbour with the best evaluation value.
- If a strict local optimum is reached, restart at another randomly generated state.

### Flowchart of hill climbing

1. Select a current solution s.
2. Evaluate s.
3. Select a new solution x from the neighborhood of s.
4. Evaluate x.
5. Is x better than s? If yes, select x as the new current solution s. In either case, return to step 3.

### Stopping condition

- Either the whole neighborhood has been searched,
- or we have exceeded the threshold of allowed attempts.
- The last solution is taken as the best solution, or the current solution is stored and the same procedure is repeated from a new start (iterated hill climbing).
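The flowchart and stopping conditions above can be sketched in Python roughly as follows. This is a minimal sketch, not the presenter's code: the names `hill_climb`, `iterated_hill_climb`, `score`, `neighbors`, `max_attempts`, and the toy problem are illustrative assumptions, and a higher score is assumed to be better.

```python
import random


def hill_climb(score, neighbors, start, max_attempts=1000):
    """Climb from `start` until no neighbor is better or attempts run out."""
    s, s_score = start, score(start)
    attempts = 0
    improved = True
    while improved and attempts < max_attempts:
        improved = False
        for x in neighbors(s):
            attempts += 1
            x_score = score(x)
            if x_score > s_score:         # x is better: it becomes the current solution
                s, s_score = x, x_score
                improved = True
                break
            if attempts >= max_attempts:  # threshold of allowed attempts exceeded
                break
    # The loop ends when a full pass over the neighborhood found nothing
    # better, or when the attempt budget was used up.
    return s, s_score


def iterated_hill_climb(score, neighbors, random_start, restarts=10):
    """Repeat hill climbing from several random starts and keep the best result."""
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        s, s_score = hill_climb(score, neighbors, random_start())
        if s_score > best_score:
            best, best_score = s, s_score
    return best, best_score


# Hypothetical toy problem: integers, with a single peak at x = 3.
def toy_score(x):
    return -(x - 3) ** 2


def toy_neighbors(x):
    return [x - 1, x + 1]


def toy_random_start():
    return random.randint(-10, 10)
```

Calling `hill_climb(toy_score, toy_neighbors, 0)` walks 0, 1, 2, 3 and stops there because neither neighbor of 3 scores higher. On a landscape with several peaks, only the restarts in `iterated_hill_climb` give a chance of leaving a poor peak.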

### Features of hill climbing techniques

- Provides local optimum values that depend on the starting solution.
- Cannot be relied on to find the global optimum, because there is no general procedure for measuring the relative error with respect to the global optimum.
- The success of the algorithm depends on the initial value chosen.

### Weaknesses of the hill climbing algorithm

- It terminates at local optimum values.
- There is no indication of how much the local optimum deviates from the global optimum.
- The optimum value obtained depends on the initial configuration.
- An upper bound on computation time cannot be provided.
- Hill climbers exploit the best available solution but neglect exploring a large portion of the search space.

### Hill climbing and the car example

- A vector of 3000 values holds indices of auction sites from 1 to 50.
- Evaluate the solution and assign a quality measure score.
- Find a neighbor and evaluate it. If its evaluation score is higher, move in that direction; otherwise select a new solution.

### Stochastic Hill Climber

- The problem of getting stuck in local optima is eliminated to a certain extent in this approach.
- New solutions with a negative change in the quality measure score are also accepted.
- Some basic changes are made to ordinary hill climbing to obtain the stochastic hill climbing approach.


### Stochastic hill climbing approach

1. Select a current solution s.
2. Evaluate s.
3. Select a new solution x from the neighborhood of s.
4. Evaluate x.
5. Select x as the new current solution s with probability P.

The probability of acceptance P depends on the parameter T and on the difference between the quality measure scores of the two solutions.

### How does this probabilistic function work?

There are 3 cases:

- Exactly 50%: the new solution x has the same quality measure score as the current solution s.
- Greater than 50%: the new solution x is superior to s.
- Less than 50%: the new solution x is inferior to s.

### Effect of parameter T

- If the new solution x is superior, the probability of acceptance is closer to 50% for high values of T and closer to 100% for low values of T.
- If the new solution x is inferior, the probability of acceptance is closer to 50% for high values of T and closer to 0% for low values of T.

### How does this probabilistic function work? (contd.)

- The probability of accepting a new solution x also depends on the value of the parameter T (T remains constant during the execution of the algorithm).
- A superior solution x has a probability of acceptance of at least 50%, irrespective of T.
- An inferior solution x has a probability of acceptance of at most 50% (0 to 50%).
- T should be neither too low nor too high for a particular problem.
- This approach is a forerunner of simulated annealing.
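The slides do not give a formula for P. One function that has every property listed above is the logistic rule P = 1 / (1 + exp((score(s) - score(x)) / T)); the sketch below assumes that form and assumes higher scores are better. The names `acceptance_probability`, `stochastic_hill_climb`, `random_neighbor`, and `steps` are hypothetical.

```python
import math
import random


def acceptance_probability(score_s, score_x, T):
    """Assumed logistic acceptance rule: 0.5 for equal scores, above 0.5 if x is better."""
    z = (score_s - score_x) / T
    if z > 700:  # math.exp would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def stochastic_hill_climb(score, random_neighbor, start, T, steps=1000, rng=random):
    """Accept each candidate with probability P; T stays constant throughout."""
    s, s_score = start, score(start)
    for _ in range(steps):
        x = random_neighbor(s, rng)
        x_score = score(x)
        if rng.random() < acceptance_probability(s_score, x_score, T):
            s, s_score = x, x_score
    return s, s_score
```

With a score difference of 1 in favour of x, this rule gives roughly 0.50 at T = 100 and very nearly 1 at T = 0.1, which matches the described effect of T.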

### Annealing

- Heating steel to a suitable temperature, followed by relatively slow cooling.
- The purpose of annealing may be to remove stresses, to soften the steel, to improve machinability, to improve cold working properties, or to obtain a desired structure.
- The annealing process usually involves allowing the steel to cool slowly in the furnace.

### Simulated Annealing

1. Set the initial temperature T.
2. Select a current solution s and evaluate s.
3. Set k = 0.
4. Set k = k + 1. Is k large enough? If yes, go to step 8.
5. Select a new solution x in the neighborhood of s and evaluate x.
6. Is x better than s? If yes, select x as the new current solution s. If no, select x as the new current solution s with probability p.
7. Return to step 4.
8. Decrease T. Is T low? If yes, stop. If no, return to step 3.
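A compact Python sketch of this loop is shown below. The slide does not define p or the cooling rule, so the Metropolis rule p = exp((score(x) - score(s)) / T) and geometric cooling (T multiplied by a constant) are assumptions made here, as are all function and parameter names. The sketch also remembers the best solution seen, which the flowchart does not require.

```python
import math
import random


def simulated_annealing(score, random_neighbor, start, t_initial=10.0,
                        t_final=0.01, cooling=0.9, steps_per_temperature=50,
                        rng=random):
    """Maximize `score`; a worse x is accepted with probability exp(delta / T)."""
    s, s_score = start, score(start)
    best, best_score = s, s_score
    T = t_initial
    while T > t_final:                          # "Is T low?" check
        for _ in range(steps_per_temperature):  # inner counter k
            x = random_neighbor(s, rng)
            x_score = score(x)
            delta = x_score - s_score
            # Better solutions are always taken; worse ones only sometimes.
            if delta > 0 or rng.random() < math.exp(delta / T):
                s, s_score = x, x_score
                if s_score > best_score:
                    best, best_score = s, s_score
        T *= cooling                            # decrease T (assumed geometric schedule)
    return best, best_score
```

At high T the exponent is close to 0, so almost every move is accepted; at low T worse moves are almost never accepted and the loop behaves like a hill climber.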

### Analogy between annealing and simulated annealing

| Annealing | Simulated annealing |
| --- | --- |
| State | Feasible solution |
| Energy | Evaluation function |
| Ground state | Optimal solution |
| Rapid quenching | Local search |
| Careful annealing | Simulated annealing |

SA resembles a random search at higher temperatures and a classic hill climber at lower temperatures. When it is applied to a specific application, some questions come to mind:

- What is the representation?
- How are neighbors defined?
- What is the evaluation function?
- How large should k be?
- How should the system be cooled, that is, how should the temperature be decreased?
- How is the stopping condition determined?

### Tabu Search

- A meta-heuristic search algorithm that guides a local heuristic search procedure to search beyond local optimality.
- Uses adaptive memory and responsive exploration to explore the search space.
- It is deterministic in nature, but it is possible to add some probabilistic elements to it.

### Flowchart of tabu search

1. Set the initial memory M.
2. Select a current solution s.
3. Evaluate s.
4. Select a number of solutions x, y, ... from the neighbourhood of s.
5. Evaluate x, y, z, ...
6. Select one solution x as the new solution s; the decision is based on the quality measure score and on M.
7. Update M, then repeat from step 4.

### Memory component of tabu search

There are 3 ways of computing memory:

- Recency-based memory: the memory structure gets updated after certain iterations and records the last few iterations.
- Frequency-based memory: the memory structure works over a longer time horizon and measures the frequency of change at each position.
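To make the role of M concrete, here is a minimal sketch for bit-vector solutions. It is an illustration under assumptions, not the presenter's algorithm: a move flips one position, the recency memory is a fixed-length queue of recently flipped positions (the hypothetical `tenure` parameter), and the frequency memory is only recorded, not used in the decision.

```python
from collections import Counter, deque


def flip(solution, i):
    """Return a copy of `solution` with bit i inverted."""
    x = list(solution)
    x[i] = 1 - x[i]
    return x


def tabu_search(score, start, iterations=50, tenure=3):
    """Deterministic tabu search over bit vectors, maximizing `score`."""
    s = list(start)
    best, best_score = list(s), score(s)
    recent = deque(maxlen=tenure)  # recency-based memory: last few flipped positions
    frequency = Counter()          # frequency-based memory: flips per position
    for _ in range(iterations):
        # Positions flipped recently are tabu and may not be flipped again yet.
        candidates = [i for i in range(len(s)) if i not in recent]
        if not candidates:
            break
        # Take the best allowed neighbor even if it is worse than s,
        # which is what lets the search move beyond a local optimum.
        move = max(candidates, key=lambda i: score(flip(s, i)))
        s = flip(s, move)
        recent.append(move)        # update M
        frequency[move] += 1
        if score(s) > best_score:
            best, best_score = list(s), score(s)
    return best, best_score, frequency
```

For example, `tabu_search(sum, [0] * 12, iterations=20)` sets one bit per iteration until all 12 are ones, then keeps moving (to worse neighbors) while the best solution found stays recorded.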

### Evolution

- Here's a very oversimplified description of how evolution works in biology.
- Organisms (animals or plants) produce a number of offspring which are almost, but not entirely, like themselves.
  - Variation may be due to mutation (random changes).
  - Variation may be due to sexual reproduction (offspring have some characteristics from each parent).
- Some of these offspring may survive to produce offspring of their own; some won't.
  - The better adapted offspring are more likely to survive.
  - Over time, later generations become better and better adapted.
- Genetic algorithms use this same process to evolve better programs.

### Evolutionary Algorithms

### Genotype and Phenotype

- Genes are the basic instructions for building an organism.
- A chromosome is a sequence of genes.
- Biologists distinguish between an organism's genotype (the genes and chromosomes) and its phenotype (what the organism actually is like).
- Example: you might have genes to be tall, but never grow to be tall for other reasons (such as poor diet).
- Similarly, genes may describe a possible solution to a problem, without actually being the solution.

### The basic genetic algorithm

- Start with a large population of randomly generated attempted solutions to a problem.
- Repeatedly do the following:
  - Evaluate each of the attempted solutions.
  - Keep a subset of these solutions (the best ones).
  - Use these solutions to generate a new population.
- Quit when you have a satisfactory solution (or you run out of time).

### Flowchart of the evolution algorithm

1. Create the initial population A.
2. Initialize the counter t = 0.
3. Evaluate all s from A.
4. Select a set of parents from A.
5. Create a set of offspring.
6. Create a new population A from the existing parents and offspring.
7. Set t = t + 1.
8. Is t large? If yes, stop. If no, return to step 3.

### A really simple example

- Suppose your organisms are 32-bit computer words.
- You want a string in which all the bits are ones.
- Here's how you can do it:
  - Create 100 randomly generated computer words.
  - Repeatedly do the following:
    - Count the 1 bits in each word.
    - Exit if any of the words have all 32 bits set to 1.
    - Keep the ten words that have the most 1s (discard the rest).
    - From each word, generate 9 new words as follows:
      - Pick a random bit in the word and toggle (change) it.
- Note that this procedure does not guarantee that the next generation will have more 1 bits, but it's likely.
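The procedure above translates almost line for line into Python. The sketch below represents each word as an integer; the function names and the `max_generations` safety cap are additions for illustration and are not part of the slide.

```python
import random


def count_ones(word):
    """Number of 1 bits in a 32-bit word."""
    return bin(word).count("1")


def toggle_random_bit(word, rng):
    """Mutation: flip one randomly chosen bit out of the 32."""
    return word ^ (1 << rng.randrange(32))


def evolve_all_ones(rng, max_generations=10_000):
    """Return (word, generations used); the cap is a safety net, not part of the slide."""
    population = [rng.getrandbits(32) for _ in range(100)]
    for generation in range(max_generations):
        if any(word == 0xFFFFFFFF for word in population):
            return 0xFFFFFFFF, generation
        # Keep the ten words with the most 1s, discard the rest.
        survivors = sorted(population, key=count_ones, reverse=True)[:10]
        # Each survivor produces nine mutated copies: 10 + 90 = 100 words again.
        children = [toggle_random_bit(w, rng) for w in survivors for _ in range(9)]
        population = survivors + children
    return max(population, key=count_ones), max_generations
```

Because the ten survivors are carried over unchanged, the best word in the population never gets worse from one generation to the next, even though an individual mutation may turn a 1 into a 0.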

### A more realistic example, part I

- Suppose you have a large number of (x, y) data points.
  - For example, (1.0, 4.1), (3.1, 9.5), (-5.2, 8.6), ...
- You would like to fit a polynomial (of up to degree 5) through these data points.
  - That is, you want a formula $y = ax^5 + bx^4 + cx^3 + dx^2 + ex + f$ that gives you a reasonably good fit to the actual data.
  - Here's the usual way to compute goodness of fit:
    - Compute the sum of $(\text{actual } y - \text{predicted } y)^2$ for all the data points.
    - The lowest sum represents the best fit.
- There are some standard curve fitting techniques, but let's assume you don't know about them.
- You can use a genetic algorithm to find a pretty good solution.

### A more realistic example, part II

- Your formula is $y = ax^5 + bx^4 + cx^3 + dx^2 + ex + f$.
- Your genes are a, b, c, d, e, and f.
- Your chromosome is the array [a, b, c, d, e, f].
- Your evaluation function for one array is:
  - For every actual data point (x, y) (the slide shows actual data in red):
    - Compute $\hat{y} = ax^5 + bx^4 + cx^3 + dx^2 + ex + f$.
    - Find the sum of $(y - \hat{y})^2$ over all x.
    - The sum is your measure of badness (larger numbers are worse).
  - Example: for [0, 0, 0, 2, 3, 5] and the data points (1, 12) and (2, 22):
    - $\hat{y} = 0x^5 + 0x^4 + 0x^3 + 2x^2 + 3x + 5$ is 2 + 3 + 5 = 10 when x is 1.
    - $\hat{y} = 0x^5 + 0x^4 + 0x^3 + 2x^2 + 3x + 5$ is 8 + 6 + 5 = 19 when x is 2.
    - $(12 - 10)^2 + (22 - 19)^2 = 2^2 + 3^2 = 13$.
    - If these are the only two data points, the badness of [0, 0, 0, 2, 3, 5] is 13.
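The evaluation function is short enough to write out directly. In this sketch the names `predict` and `badness` are illustrative, and the polynomial is evaluated with Horner's rule, which is a choice made here rather than something the slide specifies.

```python
def predict(chromosome, x):
    """Evaluate a*x^5 + b*x^4 + c*x^3 + d*x^2 + e*x + f for chromosome [a, b, c, d, e, f]."""
    result = 0
    for coefficient in chromosome:
        result = result * x + coefficient  # Horner's rule
    return result


def badness(chromosome, data):
    """Sum of squared differences between actual y and predicted y (larger is worse)."""
    return sum((y - predict(chromosome, x)) ** 2 for x, y in data)


example = badness([0, 0, 0, 2, 3, 5], [(1, 12), (2, 22)])  # the slide's worked example
```

Here `example` evaluates to 13, matching the calculation on the slide.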

### A more realistic example, part III

- Your algorithm might be as follows:
  - Create 100 six-element arrays of random numbers.
  - Repeat 500 times (or any other number):
    - For each of the 100 arrays, compute its badness (using all data points).
    - Keep the ten best arrays (discard the other 90).
    - From each array you keep, generate nine new arrays as follows:
      - Pick a random element of the six.
      - Pick a random floating-point number between 0.0 and 2.0.
      - Multiply the random element of the array by the random floating-point number.
  - After all 500 trials, pick the best array as your final answer.
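A runnable sketch of this algorithm follows. The slide does not say what range the initial random numbers come from, so the interval [-10, 10] is an assumption, as are the function names.

```python
import random


def badness(chromosome, data):
    """Sum of squared errors of the degree-5 polynomial [a, b, c, d, e, f]."""
    total = 0.0
    for x, y in data:
        predicted = 0.0
        for coefficient in chromosome:
            predicted = predicted * x + coefficient
        total += (y - predicted) ** 2
    return total


def mutate(chromosome, rng):
    """Multiply one randomly chosen gene by a random factor between 0.0 and 2.0."""
    child = list(chromosome)
    i = rng.randrange(len(child))
    child[i] *= rng.uniform(0.0, 2.0)
    return child


def fit_polynomial(data, rng, generations=500):
    # Assumed initial range for the random coefficients: [-10, 10].
    population = [[rng.uniform(-10.0, 10.0) for _ in range(6)] for _ in range(100)]
    for _ in range(generations):
        kept = sorted(population, key=lambda c: badness(c, data))[:10]
        # Ten kept arrays plus nine mutated copies of each: 100 arrays again.
        population = kept + [mutate(c, rng) for c in kept for _ in range(9)]
    return min(population, key=lambda c: badness(c, data))
```

One consequence of this mutation operator is worth noting: multiplying by a factor between 0.0 and 2.0 can shrink or grow a coefficient but never changes its sign, so the signs present in the initial population limit what can be reached.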

### The really simple example again

- Suppose your organisms are 32-bit computer words, and you want a string in which all the bits are ones.
- Here's how you can do it:
  - Create 100 randomly generated computer words.
  - Repeatedly do the following:
    - Count the 1 bits in each word.
    - Exit if any of the words have all 32 bits set to 1.
    - Keep the ten words that have the most 1s (discard the rest).
    - From each word, generate 9 new words as follows:
      - Choose one of the other words.
      - Take the first half of this word and combine it with the second half of the other word.
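The two-parent variant can be sketched the same way. Two details are assumptions made here: "first half" is taken to mean the high 16 bits, and the "other word" is drawn from the ten kept words. The generation cap is needed because, without mutation, the loop may never find an all-ones word.

```python
import random


def count_ones(word):
    return bin(word).count("1")


def combine_halves(first, second):
    """High 16 bits of `first` joined with the low 16 bits of `second`."""
    return (first & 0xFFFF0000) | (second & 0x0000FFFF)


def evolve_by_crossover(rng, max_generations=200):
    """Return (best word, generations used); may stop at the cap without success."""
    population = [rng.getrandbits(32) for _ in range(100)]
    for generation in range(max_generations):
        if any(word == 0xFFFFFFFF for word in population):
            return 0xFFFFFFFF, generation
        survivors = sorted(population, key=count_ones, reverse=True)[:10]
        children = []
        for i, word in enumerate(survivors):
            others = survivors[:i] + survivors[i + 1:]
            for _ in range(9):
                children.append(combine_halves(word, rng.choice(others)))
        population = survivors + children
    return max(population, key=count_ones), max_generations
```

For instance, `combine_halves(0xFFFF0000, 0x0000FFFF)` gives `0xFFFFFFFF`: two mediocre parents can produce a perfect child in one step, provided the needed bits exist somewhere in the population.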

### Asexual vs. sexual reproduction

- In the examples so far:
  - Each organism (or solution) had only one parent.
  - Reproduction was asexual.
  - The only way to introduce variation was through mutation (random changes).
- In sexual reproduction:
  - Each organism (or solution) has two parents.
  - Assuming that each organism has just one chromosome, new offspring are produced by forming a new chromosome from parts of the chromosomes of each parent.

### Crossover and mutation operators

[Figure: illustrations of the mutation and crossover operators]

Crossover is a genetic operator that combines (mates) two chromosomes (parents) to produce a new chromosome (offspring). Types of crossover:

- One point
- Two point
- Arithmetic
- Heuristic

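### Types of crossover

One point crossover (the crossover point lies after bit 16):

- Parent 1: 0110 1001 0100 1110 1010 1101 1011 0101
- Parent 2: 1101 0100 0101 1010 1011 0100 1010 0101
- Offspring: 0110 1001 0100 1110 1011 0100 1010 0101

Two point crossover (the crossover points lie after bits 8 and 24):

- Parent 1: 0110 1001 0100 1110 1010 1101 1011 0101
- Parent 2: 1101 0100 0101 1010 1011 0100 1010 0101
- Offspring: 0110 1001 0101 1010 1011 0100 1011 0101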
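Both operators are simple slicing operations on the bit strings. The sketch below reproduces the slide's two examples; the function names and the convention that a crossover point is the number of leading bits taken from the first parent are choices made here.

```python
def one_point_crossover(parent1, parent2, point):
    """Bits before `point` come from parent1, the rest from parent2."""
    return parent1[:point] + parent2[point:]


def two_point_crossover(parent1, parent2, first, second):
    """Bits between the two points come from parent2, the rest from parent1."""
    return parent1[:first] + parent2[first:second] + parent1[second:]


def bits(text):
    """Strip the spaces used for readability in the slide's bit strings."""
    return text.replace(" ", "")


PARENT_1 = bits("0110 1001 0100 1110 1010 1101 1011 0101")
PARENT_2 = bits("1101 0100 0101 1010 1011 0100 1010 0101")

one_point_child = one_point_crossover(PARENT_1, PARENT_2, 16)
two_point_child = two_point_crossover(PARENT_1, PARENT_2, 8, 24)
```

With these arguments, `one_point_child` and `two_point_child` equal the two offspring listed on the slide.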

### Arithmetic crossover

- Offspring1 = a * Parent1 + (1 - a) * Parent2
- Offspring2 = (1 - a) * Parent1 + a * Parent2

Example with a = 0.7:

- Parent 1: (0.3)(1.4)(0.2)(7.4)
- Parent 2: (0.5)(4.5)(0.1)(5.6)
- Offspring1: (0.36)(2.33)(0.17)(6.86)
- Offspring2: (0.44)(3.57)(0.13)(6.14)
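The two formulas can be checked numerically with a few lines of Python. The function name `arithmetic_crossover` is illustrative; the parents and the weight a = 0.7 are the ones from the slide.

```python
def arithmetic_crossover(parent1, parent2, a):
    """Return two offspring as gene-wise weighted averages of the parents."""
    offspring1 = [a * p + (1 - a) * q for p, q in zip(parent1, parent2)]
    offspring2 = [(1 - a) * p + a * q for p, q in zip(parent1, parent2)]
    return offspring1, offspring2


child1, child2 = arithmetic_crossover([0.3, 1.4, 0.2, 7.4], [0.5, 4.5, 0.1, 5.6], 0.7)
# Round for display; floating-point arithmetic leaves tiny residues.
rounded1 = [round(g, 2) for g in child1]
rounded2 = [round(g, 2) for g in child2]
```

For each gene the two offspring sum to the same total as the two parents, for example 0.36 + 0.44 = 0.3 + 0.5, which is a quick sanity check on any worked example.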

### Comparison of simple examples

- In the simple example (trying to get all 1s):
  - The sexual (two-parent, no mutation) approach, if it succeeds, is likely to succeed much faster.
    - Because up to half of the bits change each time, not just one bit.
  - However, with no mutation, it may not succeed at all.
    - By pure bad luck, maybe none of the first (randomly generated) words have (say) bit 17 set to 1.
      - Then there is no way a 1 could ever occur in this position.
    - Another problem is lack of genetic diversity.
      - Maybe some of the first generation did have bit 17 set to 1, but none of them were selected for the second generation.
- The best technique in general turns out to be sexual reproduction with a small probability of mutation.

- Flip Bit: a mutation operator that simply inverts the value of the chosen gene (0 goes to 1 and 1 goes to 0). This mutation operator can only be used for binary genes.
- Boundary: a mutation operator that replaces the value of the chosen gene with either the upper or lower bound for that gene (chosen randomly). This mutation operator can only be used for integer and float genes.
- Non-Uniform: a mutation operator that increases the probability that the amount of the mutation will be close to 0 as the generation number increases. This mutation operator keeps the population from stagnating in the early stages of the evolution, then allows the genetic algorithm to fine-tune the solution in the later stages of evolution. This mutation operator can only be used for integer and float genes.
- Uniform: a mutation operator that replaces the value of the chosen gene with a uniform random value selected between the user-specified upper and lower bounds for that gene. This mutation operator can only be used for integer and float genes.
- Gaussian: a mutation operator that adds a unit Gaussian distributed random value to the chosen gene. The new gene value is clipped if it falls outside of the user-specified lower or upper bounds for that gene. This mutation operator can only be used for integer and float genes.
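Four of these operators can be sketched in a few lines each. The function names and signatures below are hypothetical, and "unit Gaussian" is read as mean 0 and standard deviation 1. The Non-Uniform operator is left out because the description above does not fix its formula.

```python
import random


def flip_bit(genes, i):
    """Invert binary gene i (0 becomes 1, 1 becomes 0)."""
    child = list(genes)
    child[i] = 1 - child[i]
    return child


def boundary(genes, i, lower, upper, rng=random):
    """Replace gene i with the lower or the upper bound, chosen at random."""
    child = list(genes)
    child[i] = rng.choice((lower, upper))
    return child


def uniform(genes, i, lower, upper, rng=random):
    """Replace gene i with a uniform random value between the bounds."""
    child = list(genes)
    child[i] = rng.uniform(lower, upper)
    return child


def gaussian(genes, i, lower, upper, rng=random):
    """Add a unit Gaussian value to gene i, then clip it to the bounds."""
    child = list(genes)
    child[i] = min(upper, max(lower, child[i] + rng.gauss(0.0, 1.0)))
    return child
```

Each function returns a new list and leaves the parent untouched, so the same parent can be mutated several times to produce different offspring.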
