1 Genetic algorithm

2 Definition The genetic algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects using the Darwinian principle of natural selection and using operations that are patterned after naturally occurring genetic operations, such as crossover (sexual recombination) and mutation.
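The loop described in the definition can be sketched in a few lines. This is only a minimal illustration, not the presenter's implementation: the function name `run_ga`, the parameter defaults, the use of fitness-proportionate selection with one-point crossover and bit-flip mutation, and the requirement that fitness values be non-negative and `pop_size` be even are all assumptions made here.

```python
import random

def run_ga(fitness, n_bits=10, pop_size=20, generations=50,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Minimal generational GA over fixed-length bit strings (illustrative).

    Assumes fitness(individual) is non-negative and pop_size is even.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        # Selection: fitness-proportionate sampling with replacement.
        # The small epsilon keeps the weights valid if every score is zero.
        weights = [s + 1e-9 for s in scores]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        children = []
        # Crossover: pair up parents and swap tails at a random cut point.
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            children += [a, b]
        # Mutation: flip each bit independently with probability p_mut.
        for child in children:
            for i in range(n_bits):
                if rng.random() < p_mut:
                    child[i] ^= 1
        pop = children
        # Remember the best individual seen so far.
        best = max(pop + [best], key=fitness)
    return best

# Usage: maximise the number of 1 bits (a toy fitness function).
print(run_ga(sum))
```

The toy fitness `sum` simply counts the 1 bits; any non-negative scoring function over a list of bits could be passed instead.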

3 Genetic Algorithms - History Pioneered by John Holland in the 1970s. Became popular in the late 1980s. Based on ideas from Darwinian evolution. Can be used to solve a variety of problems that are not easy to solve using other techniques.

4 How finding a solution to a problem is often thought of. In computer science: as a process of search through the space of possible solutions, where partial solutions are viewed as points in the search space. In engineering and mathematics: the problem is first formulated as a mathematical model, and then the parameters that give the best solution are set.

5 Why genetic algorithm? Genetic algorithms can be used to solve problems that are not well suited to standard optimization algorithms, including problems in which the objective function is discontinuous, non-differentiable, stochastic, or highly nonlinear.
Classical derivative-based optimization: generates a single point at each iteration, and the sequence of points approaches an optimal solution; selects the next point in the sequence by a deterministic computation.
Genetic algorithm: generates a population of points at each iteration, and the best point in the population approaches an optimal solution; selects the next population by a computation that uses random number generators.

6 Optimization Optimization is the process of finding an optimal solution (maximum or minimum) that satisfies the constraints. It focuses on three factors: 1) an objective function, the function to be maximized or minimized (example: maximize the profit and minimize the cost in manufacturing); 2) a set of unknowns or variables (the amount of resources used, the time spent, etc.); 3) a set of constraints (availability of space, money, etc.).

7 Our main concerns here are: 1) how to describe the process of search; 2) how to implement and carry out the search; 3) what elements are required to carry out the search.

9 Genetic Algorithm

10 Basic genetics All living organisms consist of cells. Each cell of a living thing contains chromosomes, which are strings of DNA. Each chromosome contains a set of genes, which are blocks of DNA. Each gene determines some aspect of the organism (like eye colour). A collection of genes is sometimes called a genotype. A collection of aspects (like eye colour) is sometimes called a phenotype.

11 Basic genetics

12 General scheme of Evolutionary process

13 Terminology

14 Working principle

15 Outline of genetic algorithm

16 Silly Example - Drilling for Oil Imagine you had to drill for oil somewhere along a single 1 km desert road. Problem: choose the place on the road that produces the most oil per day. We could represent each solution as a position on the road, say a whole number in the range [0..1000].

17 Where to drill for oil? [Figure: the road drawn as a line from 0 to 1000 with 500 at the midpoint; Solution1 = 300 and Solution2 = 900 are marked on it.]

18 Digging for Oil The set of all possible solutions [0..1000] is called the search space or state space. In this case it is just one number, but it could be many numbers or symbols. GAs often code numbers in binary, producing a bit string that represents a solution. In our example we choose 10 bits, which is enough to represent 0..1000.

19 Convert to binary string
Value   512  256  128   64   32   16    8    4    2    1
  900     1    1    1    0    0    0    0    1    0    0
  300     0    1    0    0    1    0    1    1    0    0
 1023     1    1    1    1    1    1    1    1    1    1
In GAs these encoded strings are sometimes called “genotypes” or “chromosomes” and the individual bits are sometimes called “genes”.
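The conversion on this slide can be reproduced with a short sketch. The helper names `encode` and `decode` are illustrative, not from the slides; the 10-bit default matches the oil example.

```python
def encode(x, n_bits=10):
    """Encode a whole number as a fixed-length bit string (the chromosome)."""
    if not 0 <= x < 2 ** n_bits:
        raise ValueError(f"{x} does not fit in {n_bits} bits")
    return format(x, f"0{n_bits}b")

def decode(bits):
    """Decode a bit string back into the whole number it represents."""
    return int(bits, 2)

print(encode(900))           # 1110000100
print(encode(300))           # 0100101100
print(decode("1111111111"))  # 1023
```

Note that 10 bits cover 0..1023, so the strings for 1001..1023 fall outside the road; a GA using this encoding has to reject or repair them, or give them a low fitness.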

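20 Drilling for Oil [Figure: the road from 0 to 1000 with Solution1 = 300 (0100101100) and Solution2 = 900 (1110000100) marked, above a plot of oil (O I L) against location; the values 30 and 5 are marked on the plot.]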

21 Summary We have seen how to: represent a possible solution as a number; encode a number into a binary string; generate a score for each number given a function of “how good” each solution is, which is often called a fitness function. Our silly oil example is really optimisation over a function f(x) where we adapt the parameter x.

22 Lecture 2: Representation, Selection (Reproduction), Crossover, Mutation, Problem solving using GA

23 Representation Before any algorithm is put to work on a problem, the partial solutions have to be encoded so that a computer can process them. Chromosomes could be: – Bit strings (0101 ... 1100) – Real numbers (43.2 -33.1 ... 0.0 89.2) – Permutations of elements (E11 E3 E7 ... E1 E15) – Lists of rules (R1 R2 R3 ... R22 R23) – Program elements (genetic programming) – ... any data structure ...

24 Binary encoding Binary representation: here the encoding is done using a sequence of 1s and 0s.

26 Example: Decoding a value For a string of length $n_i$, the accuracy of the variable approximation is $(X_i^U - X_i^L) / 2^{n_i}$, where $X_i^L$ and $X_i^U$ are the lower and upper bounds on variable $X_i$.
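A small sketch of decoding a bit string into a real value may help here. The slide text does not give the decoding rule itself, so the linear mapping below (decoded integer scaled by $2^{n} - 1$ so that both bounds are reachable) is an assumption, as are the function names; `accuracy` follows the formula stated on the slide.

```python
def decode_real(bits, lower, upper):
    """Map a bit string linearly onto the interval [lower, upper].

    Assumed mapping: lower + (upper - lower) * integer / (2**n - 1).
    """
    n = len(bits)
    return lower + (upper - lower) * int(bits, 2) / (2 ** n - 1)

def accuracy(lower, upper, n_bits):
    """Accuracy of the approximation as stated on the slide."""
    return (upper - lower) / 2 ** n_bits

print(decode_real("0000", 0.0, 15.0))  # 0.0
print(decode_real("1111", 0.0, 15.0))  # 15.0
print(accuracy(0.0, 1024.0, 10))       # 1.0
```

Longer strings give finer resolution: each extra bit halves the accuracy value.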

27 Permutation encoding

28 Tree encoding

29 Genetic operators: Selection (Reproduction), Crossover (Recombination), Mutation

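30 Selection Different methods: roulette wheel selection, rank selection, Boltzmann selection, tournament selection. Roulette wheel procedure: 1) The fitness value of each chromosome and the total fitness $F$ of the population are calculated. 2) The probability of selection of the $i$th chromosome, $p_i$, is computed as its fitness divided by $F$. 3) The cumulative frequency $q_i = p_1 + \dots + p_i$ is computed. 4) A random number $r$ is generated from the range [0, 1]. 5) If $r < q_1$, select the first chromosome; otherwise select the $i$th chromosome ($2 \le i \le$ pop_size) such that $q_{i-1} < r \le q_i$.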

31 Example

33 Roulette-wheel selection In roulette wheel selection, each individual is given a probability of being selected that is directly proportional to its fitness.

35 Roulette wheel selection table (n = 8)
No  Population  Fitness  Probability p_i  Expected count (n x p_i)  Cumulative frequency  Random number (0 to 1)  String number  Count in mating pool
1   0000        1.0      0.0429           0.33                      0.0429                0.259                   3              1
2   0010        2.1      0.090            0.72                      0.1326                0.038                   1              1
3   0001        3.11     0.1336           1.064                     0.266                 0.0486                  5              1
4   0010        4.01     0.1723           1.368                     0.438                 0.428                   4              2
5   0110        4.66     0.2              1.6                       0.638                 0.095                   2              2
6   1110        1.91     0.082            0.656                     0.720                 0.3                     4              0
7   1100        1.93     0.0829           0.664                     0.809                 0.616                   5              0
8   0111        4.55     0.1955           1.561                     -                     0.897                   8              1
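The "String number" column can be reproduced from the fitness column and a random number with a short sketch. The function names are illustrative; the rule implemented is the usual roulette wheel one (pick the first string whose cumulative frequency is at least r), and the fitness values and random numbers are taken from rows of the table above.

```python
from bisect import bisect_left
from itertools import accumulate

def cumulative_frequencies(fitnesses):
    """Cumulative selection probabilities q_1..q_n from raw fitness values."""
    total = sum(fitnesses)
    return list(accumulate(f / total for f in fitnesses))

def roulette_pick(cumulative, r):
    """Return the 1-based string number i with q_(i-1) < r <= q_i."""
    idx = bisect_left(cumulative, r)
    # Clamp in case floating-point rounding leaves the last q_i just below 1.
    return min(idx, len(cumulative) - 1) + 1

fitnesses = [1.0, 2.1, 3.11, 4.01, 4.66, 1.91, 1.93, 4.55]
q = cumulative_frequencies(fitnesses)
picks = [roulette_pick(q, r) for r in (0.259, 0.038, 0.428, 0.095, 0.616, 0.897)]
print(picks)  # [3, 1, 4, 2, 5, 8]
```

Repeating the pick n times with fresh random numbers fills the mating pool; counting how often each string number appears gives the last column of the table.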

36 Problem Find the expected number of copies of the best string for a maximization problem using 1) roulette wheel selection 2) tournament selection.
String  Fitness
01101   5
11000   2
10110   1
00111   10
10101   3
00010   100
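One way to work this problem numerically is sketched below. The roulette wheel part uses the expected count n x p_i. The slide does not say which tournament variant is meant, so the tournament part rests on an assumption: n independent tournaments, each between `size` strings drawn uniformly with replacement, with distinct fitness values. Function names are illustrative.

```python
fitness = [5, 2, 1, 10, 3, 100]

def expected_copies_roulette(fitness):
    """Expected count n * p_i for each string under roulette wheel selection."""
    n, total = len(fitness), sum(fitness)
    return [n * f / total for f in fitness]

def expected_copies_tournament(fitness, size=2):
    """Expected count per string under the assumed tournament variant.

    A string wins a tournament when it is the fittest contestant, which has
    probability (k/n)**size - ((k-1)/n)**size, where k is the number of
    strings that are no fitter than it. Valid for distinct fitness values.
    """
    n = len(fitness)
    result = []
    for f in fitness:
        k = sum(1 for g in fitness if g <= f)
        result.append(n * ((k / n) ** size - ((k - 1) / n) ** size))
    return result

best = fitness.index(max(fitness))
print(round(expected_copies_roulette(fitness)[best], 2))    # 4.96
print(round(expected_copies_tournament(fitness)[best], 2))  # 1.83
```

Under a different common variant, in which every string takes part in exactly two binary tournaments, the best string wins both and receives exactly 2 copies. Either way, tournament selection gives the outlier far fewer copies than roulette wheel selection does.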

37 Boltzmann Selection

38 Crossover

39 One-point crossover

40 Two-point crossover Offspring 1: 11011 1100001 0110 Offspring 2: 11011 0010011 1110

41 Uniform crossover

42 Arithmetic crossover
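The four crossover operators named on the preceding slides can be sketched as follows. The slide bodies were figures, so these are the textbook forms rather than a transcription; the function names, the `p_swap` parameter and the weighting convention in `arithmetic` are assumptions made for illustration.

```python
import random

def one_point(p1, p2, cut):
    """Swap the tails of two bit strings after position cut."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point(p1, p2, lo, hi):
    """Swap the middle segment [lo, hi) of two bit strings."""
    return (p1[:lo] + p2[lo:hi] + p1[hi:],
            p2[:lo] + p1[lo:hi] + p2[hi:])

def uniform(p1, p2, rng, p_swap=0.5):
    """Swap each gene independently with probability p_swap."""
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        if rng.random() < p_swap:
            a, b = b, a
        c1.append(a)
        c2.append(b)
    return "".join(c1), "".join(c2)

def arithmetic(p1, p2, alpha):
    """Blend two real-valued chromosomes with weight alpha in [0, 1]."""
    c1 = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
    c2 = [(1 - alpha) * a + alpha * b for a, b in zip(p1, p2)]
    return c1, c2

print(one_point("11111", "00000", 2))               # ('11000', '00111')
print(two_point("1111111", "0000000", 2, 5))        # ('1100011', '0011100')
print(uniform("1111", "0000", random.Random(0)))
print(arithmetic([1.0, 2.0], [3.0, 4.0], 0.25))     # ([2.5, 3.5], [1.5, 2.5])
```

The first three operate on bit strings and only rearrange existing genes; arithmetic crossover applies to real-valued chromosomes and creates new gene values between those of the parents.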

43 Mutation Mutation is a genetic operator used to maintain genetic diversity from one generation of a population of chromosomes to the next. The various mutation operators include boundary, uniform, and non-uniform mutation.

44 Uniform Mutation A gene (real number) is selected with the help of a randomly selected number within a specific range. For a chromosome $X^t = [X_1, X_2, \dots, X_m]$, a random number $k \in [1, m]$ is selected and an offspring $X^{t+1} = [X_1, \dots, X'_k, \dots, X_m]$ is produced, where $X'_k$ is a random value generated according to a uniform probability distribution from the range $[X_k^L, X_k^U]$. Here $X_k^L$ and $X_k^U$ are the lower and upper bounds on variable $X_k$.
Boundary Mutation Replacing $X'_k$ by either $X_k^L$ or $X_k^U$, each with equal probability, is known as boundary mutation.
Non-uniform Mutation Here $X'_k$ is selected as either $X_k + \Delta(t, X_k^U - X_k)$ or $X_k - \Delta(t, X_k - X_k^L)$, each with equal probability, where $\Delta(t, y)$ returns a value in the range $[0, y]$ such that the probability of $\Delta(t, y)$ being close to 0 increases as $t$ increases.
Mutation can be implemented using 1) the one's complement operator, 2) logical bitwise operators, 3) shift operators, and 4) masking operators.
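The three real-valued operators, plus a bit-mask mutation, can be sketched as below. The slide does not give the form of Δ(t, y), so the one used here, y * (1 - r ** ((1 - t / t_max) ** b)), is an assumption (a commonly cited choice), as are the parameter `b`, the maximum generation `t_max` and all function names.

```python
import random

def uniform_mutation(x, bounds, rng):
    """Replace one randomly chosen gene by a uniform draw from its range."""
    y = list(x)
    k = rng.randrange(len(y))
    lo, hi = bounds[k]
    y[k] = rng.uniform(lo, hi)
    return y

def boundary_mutation(x, bounds, rng):
    """Replace one randomly chosen gene by its lower or upper bound."""
    y = list(x)
    k = rng.randrange(len(y))
    y[k] = rng.choice(bounds[k])
    return y

def delta(t, y, t_max, rng, b=5.0):
    """Value in [0, y] that concentrates near 0 as t approaches t_max."""
    r = rng.random()
    return y * (1.0 - r ** ((1.0 - t / t_max) ** b))

def non_uniform_mutation(x, bounds, t, t_max, rng):
    """Move one gene towards a bound by a step that shrinks over time."""
    y = list(x)
    k = rng.randrange(len(y))
    lo, hi = bounds[k]
    if rng.random() < 0.5:
        y[k] = y[k] + delta(t, hi - y[k], t_max, rng)
    else:
        y[k] = y[k] - delta(t, y[k] - lo, t_max, rng)
    return y

def bit_flip(bits, mask):
    """Flip the bits of an integer chromosome selected by mask (XOR)."""
    return bits ^ mask

rng = random.Random(1)
bounds = [(0.0, 1.0), (-5.0, 5.0), (10.0, 20.0)]
x = [0.5, 0.0, 15.0]
print(uniform_mutation(x, bounds, rng))
print(boundary_mutation(x, bounds, rng))
print(non_uniform_mutation(x, bounds, t=50, t_max=100, rng=rng))
print(format(bit_flip(0b1010, 0b0110), "04b"))  # 1100
```

Early in the run (small t) the non-uniform operator searches the range almost uniformly; late in the run it makes only small local changes.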

45 Problem

51 Support Vector Machine (SVM) One of the most well-studied and widely used learning algorithms for binary classification. Extensions of SVMs exist for a variety of other learning problems, including regression, multiclass classification, ordinal regression, ranking, structured prediction, and many others. Similar to perceptrons, SVMs aim to find a hyperplane that linearly separates data points belonging to different classes. In addition, SVMs aim to find the hyperplane that is least likely to overfit the training data.

52 Separating hyperplanes Which one is better: B1 or B2? Why? Many other separating hyperplanes are possible.

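54 Each instance in $X$ is an $n$-dimensional real vector, i.e. $X \subseteq \mathbb{R}^n$. Given a sample of $m$ labeled examples $S = ((x_1, y_1), \dots, (x_m, y_m))$, classification is done using the classifier $h(x) = \mathrm{sign}(w \cdot x + b)$ for some $w \in \mathbb{R}^n$, $b \in \mathbb{R}$. Thus for $X \subseteq \mathbb{R}^n$, the basic SVM algorithm selects a classifier from the class of linear classifiers over $X$.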

55 Learning linear SVM It is convenient to represent the classes by +1 and -1 using $y = +1$ if $w \cdot x + b > 0$ and $y = -1$ if $w \cdot x + b < 0$. $w$ and $b$ can be rescaled such that for all points $x$ lying on the respective boundaries it holds that $w \cdot x + b = 1$ or $w \cdot x + b = -1$. These points are called the support vectors. The task of learning a linear SVM consists of estimating the parameters $w$ and $b$. The first criterion is that all points in the training data must be classified correctly: $w \cdot x_i + b \ge 1$ if $y_i = 1$, and $w \cdot x_i + b \le -1$ if $y_i = -1$. This can be rewritten as $y_i (w \cdot x_i + b) \ge 1$ for $1 \le i \le N$.
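The classification rule and the constraint $y_i (w \cdot x_i + b) \ge 1$ can be checked directly for a candidate $(w, b)$. The sketch below uses plain Python and a made-up two-dimensional data set; the function names, the data, the tolerance `tol` and the choice to label points with $w \cdot x + b = 0$ as -1 (a case the slide leaves undefined) are all assumptions for illustration.

```python
def decision(w, b, x):
    """Value of w . x + b for one point."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Label +1 if w . x + b > 0, otherwise -1."""
    return 1 if decision(w, b, x) > 0 else -1

def satisfies_constraints(w, b, X, y):
    """True if y_i (w . x_i + b) >= 1 holds for every training point."""
    return all(yi * decision(w, b, xi) >= 1 for xi, yi in zip(X, y))

def support_vectors(w, b, X, y, tol=1e-9):
    """Points lying on the boundaries w . x + b = +1 or -1."""
    return [xi for xi, yi in zip(X, y)
            if abs(yi * decision(w, b, xi) - 1) <= tol]

# Hypothetical toy data: two points per class, separable along the first axis.
X = [(2.0, 0.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 1.0)]
y = [1, 1, -1, -1]
w, b = (1.0, 0.0), -1.0

print(satisfies_constraints(w, b, X, y))  # True
print(support_vectors(w, b, X, y))        # [(2.0, 0.0), (0.0, 0.0)]
```

Halving both `w` and `b` leaves every predicted label unchanged but violates the constraint, which is why the rescaling convention matters.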

56 Linearly separable case: hard-margin SVM Although both classifiers separate the data, the distance or margin with which the separation is achieved is different. The SVM algorithm selects the classifier with the maximum margin.

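57 The distance of a point $x_i$ from the hyperplane $w \cdot x + b = 0$ is $|w \cdot x_i + b| / \lVert w \rVert$. The margin on $(x_i, y_i)$ is simply a signed version of this distance, $y_i (w \cdot x_i + b) / \lVert w \rVert$, with a positive sign if the example is classified correctly and a negative sign otherwise. The margin of the classifier given by $(w, b)$ on a sample $S$ is then defined as the minimal margin on $S$: $\min_i \, y_i (w \cdot x_i + b) / \lVert w \rVert$.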

58 The margin of such a classifier on $S$ then becomes simply $1 / \lVert w \rVert$. Thus maximizing the margin becomes equivalent to minimizing the norm $\lVert w \rVert$ subject to the constraints given in equation 5, which can be written as the following optimization problem: $\min_{w, b} \tfrac{1}{2} \lVert w \rVert^2$ subject to $y_i (w \cdot x_i + b) \ge 1$ for $1 \le i \le N$, i.e. maximize the margin subject to the constraint that all points in the training data must be classified correctly. This problem can be solved using Lagrange multipliers.
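As a sketch of the Lagrange-multiplier route, here is the standard hard-margin derivation for minimizing $\tfrac{1}{2} \lVert w \rVert^2$ subject to $y_i (w \cdot x_i + b) \ge 1$. The multipliers $\alpha_i$ are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
L(w, b, \alpha) &= \tfrac{1}{2} \lVert w \rVert^2
  - \sum_{i=1}^{N} \alpha_i \bigl[ y_i (w \cdot x_i + b) - 1 \bigr],
  \qquad \alpha_i \ge 0 \\
\frac{\partial L}{\partial w} = 0 &\;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i \\
\frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0 \\
\max_{\alpha \ge 0} \;& \sum_{i=1}^{N} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \, (x_i \cdot x_j)
  \quad \text{subject to} \quad \sum_{i=1}^{N} \alpha_i y_i = 0
\end{aligned}
```

Substituting the two stationarity conditions back into $L$ gives the dual problem in the last line. Only the points with $\alpha_i > 0$ contribute to $w$, and these are the support vectors.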
