Genetic algorithm
Definition The genetic algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects using the Darwinian principle of natural selection and using operations that are patterned after naturally occurring genetic operations, such as crossover (sexual recombination) and mutation
Genetic Algorithms - History Pioneered by John Holland in the 1970’s Got popular in the late 1980’s Based on ideas from Darwinian Evolution Can be used to solve a variety of problems that are not easy to solve using other techniques
Finding a solution of a problem is often thought In computer science - is a process of search through the space of possible solutions. Partial solutions are viewed as a point in the search space In Engineering & Mathematics- The problems are first formulated as mathematical models. Set the parameters that gives the best solution
Why genetic Algorithm Genetic algorithm can be used to solve problems that are not well suited for standard optimization algorithms, including problems in which the objective function is discontinuous, non differentiable, stochastic, or highly nonlinear. Classical derivative based optimizationGenetic algorithm Generates a single point at each iteration. The sequence of points approaches an optimal solution Generates a population of points at each iteration. The best point in the population approaches an optimal solution Selects the next point in the sequence by a deterministic computation. Selects the next population by computation which uses random number generators
Optimization Optimization: process of finding an optimal solution (maximum/ minimum) satisfying the constraints It focuses on 3 factors 1) objective function : function which is to be maximized or minimized (Example : maximize the profit and minimize the cost in the case of manufacturing) 2) A set of unknowns or variables ( the amount of resources used/ time spent etc 3) A set of constrains ( availability of space, money etc)
Our Main concern here is to 1) How to describe the process of search 2) how to implement and carry out search 3)What are the elements required to carry out search
Genetic Algorithm
Basic genetics All living organism consists of cells Each cell of a living thing contains chromosomes - strings of DNA Each chromosome contains a set of genes - blocks of DNA Each gene determines some aspect of the organism (like eye colour) A collection of genes is sometimes called a genotype A collection of aspects (like eye colour) is sometimes called a phenotype
Basic genetics
General scheme of Evolutionary process
Terminology
Working principle
Outline of genetic algorithm
Silly Example - Drilling for Oil Imagine you had to drill for oil somewhere along a single 1km desert road Problem: choose the best place on the road that produces the most oil per day We could represent each solution as a position on the road say, a whole number between [ ]
Where to drill for oil? Road Solution2 = 900Solution1 = 300
Digging for Oil The set of all possible solutions [ ] is called the search space or state space In this case it’s just one number but it could be many numbers or symbols Often GA’s code numbers in binary producing a bit string representing a solution In our example we choose 10 bits which is enough to represent
Convert to binary string In GA’s these encoded strings are sometimes called “genotypes” or “chromosomes” and the individual bits are sometimes called “genes”
Drilling for Oil Road Solution2 = 900 ( )Solution1 = 300 ( ) O I L Location 30 5
Summary We have seen how to: represent possible solutions as a number encoded a number into a binary string generate a score for each number given a function of “how good” each solution is - this is often called a fitness function Our silly oil example is really optimisation over a function f(x) where we adapt the parameter x
Lecture 2 Representation Selection (Reproduction) Cross over Mutation Problem solving using GA
Representation Before any algorithm is put into work on any problem, the partial solutions have to be encoded so that a computer can process. Chromosomes could be: – Bit strings ( ) – Real numbers ( ) – Permutations of element (E11 E3 E7... E1 E15) – Lists of rules (R1 R2 R3... R22 R23) – Program elements (genetic programming) –... any data structure...
Binary encoding Binary representation: Here encoding is done using sequence of 1’s and 0’s.
Example: Decoding a value For a string length n i the accuracy in the variable approximation is (X U i - X L i ) / 2 ni
Permutation encoding
Tree encoding
Genetic operators Selection ( Reproduction) Cross over (Recombination) Mutation
Selection Different methods Roulette wheel selection Rank selection Boltzman selection Tournament selection Fitness value F is calculated The probability of selection of ith chromosome is done The cumulative frequency Generate a random number r from the range [o,z] If r < q1, select the first chromosome, otherwise select chromosome from 2 to pop_size
Example
Roulette -wheel selection In roulette wheel selection, individuals are given a probability of being selected that is directly proportionate to their fitness.
Populati on No Populati on FitnessProbabil ity pi Expecte d count (nxpi) Cumulat ive frequen cy Random number betwee n 0 and 1 String number Count in the mating pool
Problem Find the expected number of copies of the best string for a maximization problem using 1) Roulette wheel selection 2) tournament selection StringFitness
Boltzmann Selection
Cross over
One –point cross over
Two-point cross over Off spring Offspring
Uniform crossover
Arithmetic crossover
Mutation Mutation is a genetic operator used to maintain genetic diversity from one generation of population of chromosome to the next. Various mutation operator are Boundary, uniform, non uniform
Uniform Mutation A gene(real number) is selected with the help of a randomly selected real number within a specific range. For a chromosome X t =[X 1, X 2, … X m ]. A random number k is selected such that k [1,n] and an offstring X t+1 =[X 1,… X’ k … X m ], where X’ k is a random value generated according to uniform probability distribution from the range [X k L, X k U ]. Here X k L and X k U are lower and upper bounds on variable X k Boundary Mutation The replacement of X’ k by either X k L or X k U each with equal probability is known as boundary mutation Non-uniform Mutation Here X’ k is selected Where (t, y) returns a value in the range [ 0, y] such that probability of (t,y) being close to 0 as t increases Mutation can be implemented using 1) one’s complement operator 2) logic bitwise operator 3) shift operator and 4) masking operator
Problem
Support Vector machine one of the most well studied and widely used learning algorithms for binary classification Extensions of SVMs exist for a variety of other learning problems, including regression, multiclass classification, ordinal regression, ranking, structured prediction, and many others. Similar to perceptrons they aim to find a hyper plane that linearly separates data points belong to different classes In addition SVMs aim to find the hyper plane that is least likely to overfit the training data
Separating hyper planes Which one is better: B1 or B2? Why? Many other separating hyperplanes are possible
Each instance in X is an n-dimensional real vector i.e X R n. Given a sample of m labeled examples Classification is done using the classifier for some w R n, b R Thus for X R n, the basic SVM algorithm selects a classifier from the class of linear classifiers over X.
Learning linear SVM It is convenient to represent classes by +1 and -1 using y = 1; if wx+b > 0, -1; if wx+b < 0 w can be rescaled such that for all points x lying on the respective boundaries it holds that wx+b = 1 or wx+b = -1 These points are called the support vectors The task of learning a linear SVM consists of estimating the parameters w and b The first criterion is that all points in the training data must be classified correctly: w.x i + b ≥ 1 if y i = 1 w.x i +b ≤ -1 if y i = -1 This can be re-written as: yi(w.x i +b) ≥ 1 for 1≤ i ≤ N
Linear separable – hard margin SVM Although both classifier separates the data, the distance or margin with which separation achieved is different. The SVM algorithm selects maximum classifier margin
The margin on (x i y i ) is simply a signed version of this distance, with a positive sign if the example is classified correctly and negative otherwise. The margin of the classifier given by (w,b) on a sample is then defined as the mini mal margin on S:
The margin of such a classier on S then becomes simply Thus maximizing the margin becomes equivalent of minimizing the norm subject to the constraints given in equation 5 which can be written as following optimization problem i.e maximize the margin subject to the constrains that all points in the training data must be classified correctly. This problem can be solved using Lagrange Multipliers