 # An Introduction to Evolutionary Multiobjective Optimization Algorithms


An Introduction to Evolutionary Multiobjective Optimization Algorithms
Karthik Sindhya, PhD Postdoctoral Researcher Industrial Optimization Group Department of Mathematical Information Technology

Overview: Nature Inspired Algorithms, Constraint handling, Applications

Nature Inspired Algorithms
Nature provides some of the most efficient ways to solve problems. Algorithms that imitate processes in nature, or are inspired by nature, are called Nature Inspired Algorithms. What type of problems? Aircraft wing design.

Nature Inspired Algorithms
Wind turbine design Bionic car BBC Performance improvement by 40%. They reduce turbulence across the surface, increasing angle of attack and decreasing drag. (Source: Popular Mechanics) Hexagonal plates - resulting in door paneling one-third lighter than conventional paneling, but just as strong. (Source: Popular Mechanics)

Nature Inspired Algorithms
Bullet train NATGEO Train's nose is designed after the beak of a kingfisher, which dives smoothly into water. (Source: Popular Mechanics)

Nature Inspired Algorithms for Optimization
Optimization: an act, process, or methodology of making something (as a design, system, or decision) as fully perfect, functional, or effective as possible.
Nature as an optimizer:
Birds: Minimize drag.
Humpback whale: Maximize maneuverability (enhanced lift devices to control flow over the flipper and maintain lift at high angles of attack).
Boxfish: Minimize drag and maximize rigidity of exoskeleton.
Kingfisher: Minimize micro-pressure waves.
Consider an optimization problem of the form

Practical Optimization Problems – Characteristics
Objective and constraint functions can be non-differentiable. Constraints can be nonlinear. Discrete or discontinuous search space. Mixed variables (integer, real, Boolean, etc.). Large number of constraints and variables. Objective functions can be multimodal: a multimodal function has more than one optimum, and may have one or several global optima. Computationally expensive objective functions and constraints.

Practical Optimization Problems – Characteristics
[Figure: the optimization algorithm passes a decision vector to the simulation model, which returns an objective vector.]

Different methods for different types of problems. Constraint handling, e.g. using the penalty method, is sensitive to penalty parameters. Often get stuck in local optima (lack a global perspective). Usually need knowledge of first/second order derivatives of objective functions and constraints.

Nature Inspired Algorithms for Optimization
Computational intelligence Nature inspired algorithms Fuzzy logic systems Neural networks

Nature Inspired Algorithms for Optimization
Evolutionary algorithms Genetic algorithm Differential evolution Swarm optimization Particle swarm optimization Ant colony optimization .... and many more.

Evolution Humans Macintosh Nokia

Evolutionary Algorithms
Charles Darwin. Offspring are created by reproduction, mutation, etc. Natural selection is a guided search procedure: individuals suited to the environment survive, reproduce and pass their genetic traits to offspring. Populations adapt to their environment. Variations accumulate over time to generate new species.

Evolutionary Algorithms
Terminologies
Individual: carrier of the genetic information (chromosome). It is characterized by its state in the search space and its fitness (objective function value).
Population: pool of individuals which allows the application of genetic operators.
Fitness function: the term “fitness function” is often used as a synonym for objective function.
Generation: (natural) time unit of the EA, an iteration step of an evolutionary algorithm.

Evolutionary Algorithms
Population Individual Crossover Parents Offspring Mutation

Evolutionary Algorithms
Step 1: t := 0
Step 2: Initialize P(t)
Step 3: Evaluate P(t)
Step 4: While not terminate do
    P’(t) := variation[P(t)];
    evaluate[P’(t)];
    P(t+1) := select[P’(t) ∪ P(t)];
    t := t + 1;
od
Evolutionary algorithms = Selection + Crossover + Mutation
Reproduced from “Evolutionary Computation: Comments on the History and Current State”, Bäck et al.
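
The loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the formulation from the cited paper: the names `evolve`, `toy_fitness`, `toy_init` and `toy_vary` are hypothetical, variation is Gaussian mutation only (no crossover), termination is a fixed generation budget, and selection simply keeps the best `pop_size` members of the merged population.

```python
import random

def evolve(fitness, init, vary, pop_size=20, generations=50, seed=0):
    """Generic loop: P(t+1) := select[P'(t) U P(t)], for maximization."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]       # Step 2: initialize P(0)
    for _ in range(generations):                             # Step 4: fixed budget (assumed)
        offspring = [vary(rng.choice(population), rng)       # P'(t) := variation[P(t)]
                     for _ in range(pop_size)]
        merged = population + offspring                      # P'(t) U P(t)
        merged.sort(key=fitness, reverse=True)               # evaluate and rank
        population = merged[:pop_size]                       # truncation selection (assumed)
    return population

# Hypothetical toy problem: maximize -(x - 3)^2 on the real line.
def toy_fitness(x):
    return -(x - 3.0) ** 2

def toy_init(rng):
    return rng.uniform(-10.0, 10.0)

def toy_vary(x, rng):
    return x + rng.gauss(0.0, 0.5)   # Gaussian mutation only

best = evolve(toy_fitness, toy_init, toy_vary)[0]
```

Because parents and offspring compete together, the best solution found so far is never lost, so `best` can only move closer to the optimum at x = 3 from one generation to the next.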

Evolutionary Algorithms
Mean approaches optimum Variance reduces

Evolutionary Algorithms Robustness = Breadth + Efficiency
Robust scheme Specialized scheme Random scheme Problem type (Goldberg, 1989)

Evolutionary Algorithms
Selection: roulette wheel, tournament, steady state, etc. The motivation is to preserve the best solutions (make multiple copies) and eliminate the worst.
Crossover: simulated binary crossover, linear crossover, blend crossover, etc. Creates new solutions by considering more than one individual; a global search for new and hopefully better solutions.
Mutation: polynomial mutation, random mutation, etc. Keeps diversity in the population (e.g. bit-wise mutation).

Evolutionary Algorithms
Tournament selection
[Figure: a population with fitness values 23, 30, 24, 37, 11 and 9 passes through Tournament 1 and Tournament 2; the solutions with values 37 and 30 are deleted from the population.]

Evolutionary Algorithms
Roulette wheel selection (proportional selection) Weaker solutions can survive.
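
A minimal sketch of proportional selection, assuming non-negative fitness values with a positive sum and a maximization problem. The function name `roulette_wheel_select` and the fitness values are hypothetical.

```python
import random

def roulette_wheel_select(fitnesses, rng):
    """Return an index chosen with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = rng.uniform(0.0, total)        # where the wheel stops
    cumulative = 0.0
    for index, fit in enumerate(fitnesses):
        cumulative += fit
        if spin <= cumulative:
            return index
    return len(fitnesses) - 1             # guard against floating-point round-off

rng = random.Random(1)
fitnesses = [1.0, 3.0, 6.0]               # hypothetical fitness values
counts = [0, 0, 0]
for _ in range(10000):
    counts[roulette_wheel_select(fitnesses, rng)] += 1
```

The weakest solution still receives a share of the selections (roughly 10% here), which is the sense in which weaker solutions can survive.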

Evolutionary Algorithms
Concept of exploration vs exploitation. Exploration – Search for promising solutions Crossover and mutation operators Exploitation – preferring the good solutions Selection operator Excessive exploration – Random search. Excessive exploitation – Premature convergence.

Evolutionary Algorithms
[Figure: a good evolutionary algorithm balances exploitation and exploration.]

Evolutionary Algorithms
Classical gradient based algorithms: Convergence to an optimal solution usually depends on the starting solution. Most algorithms tend to get stuck at a locally optimal solution. An algorithm that is efficient in solving one class of optimization problems may not be efficient in solving others. The algorithms cannot be easily parallelized.
Evolutionary algorithms: Convergence to an optimal solution is designed to be independent of the initial population. A search based algorithm: the population helps it avoid getting stuck at a locally optimal solution. Can be applied to a wide class of problems without major changes in the algorithm. Can be easily parallelized.

Fitness Landscapes: using traditional gradient based methods
[Figure: four fitness landscapes, f(x) versus x: ideal and best case, multimodal, nightmare, teaser.]

Fitness Landscapes: using population based algorithms
[Figure: four fitness landscapes, f(x) versus x: ideal and best case, multimodal, nightmare, teaser.]

History of Evolutionary Algorithms
GA: John Holland in 1962 (UMich)
Evolutionary Strategy: Rechenberg and Schwefel in 1965 (Berlin)
Evolutionary Programming: Larry Fogel in 1965 (California)
First ICGA: 1985 in Carnegie Mellon University
First GA book: Goldberg (1989)
First FOGA workshop: 1990 in Indiana (Theory)
First Fusion: 1990s (Evolutionary Algorithms)
Journals: ECJ (MIT Press), IEEE TEC, Natural Computation (Elsevier)
GECCO and CEC since 1999, PPSN since 1990
About 20 major conferences each year

Working cycle of a genetic algorithm
Population based probabilistic search and optimization technique based on natural genetics and Darwin’s principle of natural selection. Proposed by Prof. John Holland, University of Michigan, Ann Arbor, USA. "I have more ideas than I can ever follow up on in a lifetime, so I never worry if someone steals an idea from me." (John Holland)

Working cycle of a genetic algorithm
Flowchart: Start → initialize a population of solutions, Gen = 0 → while Gen ≤ Max_gen (Y): assign fitness to all solutions in the population → reproduction → crossover → mutation → Gen = Gen + 1; otherwise (N): End.
GA starts with a population of initial solutions generated at random. The fitness (goodness) value of each solution in the population is calculated.
Minimization can be converted to maximization: minimizing $f(x)$ can be replaced by maximizing $-f(x)$, by maximizing $1/f(x)$ for $f(x) \neq 0$, or by maximizing $1/(1+f(x))$ for $f(x) \geq 0$.

Working cycle of a genetic algorithm
Selection scheme (reproduction): Select good solutions using their fitness values. This leads, probabilistically, to a mating pool consisting of good solutions. The mating pool may contain multiple copies of a particularly good solution. The size of the mating pool is kept equal to that of the population before reproduction. The average fitness of the mating pool is expected to be higher than that of the pre-reproduction population. Schemes: roulette wheel, tournament selection, ranking selection.

Working cycle of a genetic algorithm
Crossover: Mating pairs are selected at random from the mating pool. New solutions are created by crossover, applied with a crossover probability, through an exchange of properties between the parents. If the parents are good, the children are expected to be good. Various types of crossover: single-point, two-point, multi-point, uniform, SBX.

Working cycle of a genetic algorithm
Mutation: a sudden change of a parameter. In GA, a local change around the current solution. If a solution gets stuck at a local minimum, mutation helps it escape. Termination: a maximum number of generations, or a desired accuracy in the solution.

Binary-Coded GA Let us consider the following problem:
Maximize $y = f(x_1, x_2)$ subject to $x_1^{\min} \leq x_1 \leq x_1^{\max}$, $x_2^{\min} \leq x_2 \leq x_2^{\max}$.

Binary-Coded GA Step 1 – Generation of a population of solutions:
An initial population of N solutions is selected at random. Solutions are represented as binary strings of 1’s and 0’s. 4-bit string: the number of distinct substrings possible is $2^4 = 16$, which subdivide the search space. 5-bit string: the number of distinct substrings possible is $2^5 = 32$. Generalizing to an $l$-bit string: $(x_1^{\max} - x_1^{\min})/2^l$ is the obtainable accuracy ($\varepsilon$).

Binary-Coded GA The length of the binary string is decided based on the desired precision in the value of the variables: $l = \log_2\left(\frac{x_1^{\max} - x_1^{\min}}{\varepsilon}\right)$. Let us assume 10 bits for each variable, so the GA string is a 20-bit string. The initial population of GA strings is generated at random.
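
As a rough illustration of the string-length rule, the sketch below rounds $l$ up to the next integer and decodes a 20-bit GA string into two real variables. The function names are hypothetical, and the linear decoding map (dividing by $2^l - 1$ so that the all-ones string lands exactly on the upper bound) is an assumption, since the slides do not give a decoding formula.

```python
import math

def bits_needed(x_min, x_max, eps):
    """Smallest integer l with (x_max - x_min) / 2**l <= eps."""
    return math.ceil(math.log2((x_max - x_min) / eps))

def decode(bits, x_min, x_max):
    """Map a binary substring linearly onto [x_min, x_max] (assumed mapping)."""
    value = int(bits, 2)
    return x_min + (x_max - x_min) * value / (2 ** len(bits) - 1)

def decode_pair(string, bounds_1, bounds_2):
    """Split a GA string into two equal substrings and decode each variable."""
    half = len(string) // 2
    return decode(string[:half], *bounds_1), decode(string[half:], *bounds_2)

length = bits_needed(0.0, 10.0, 0.01)     # precision 0.01 on [0, 10]
x1, x2 = decode_pair("0" * 10 + "1" * 10, (0.0, 10.0), (-5.0, 5.0))
```

With a range of 10 and a desired precision of 0.01, `bits_needed` gives 10 bits per variable, which matches the 10-bit assumption on the slide.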

Binary-Coded GA The parents or mating pairs are selected at random from the mating pool. Check whether they participate in the crossover operation, as decided by the crossover probability $p_c$. Choose a random crossover site between 1 and L-1, where L is the length of the string. Single-point crossover: parents $P_1$ and $P_2$ produce children $C_1$ and $C_2$.
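
A minimal sketch of single-point crossover on binary strings, following the description above. The function name and the example parents are hypothetical.

```python
import random

def single_point_crossover(p1, p2, p_c, rng):
    """Return two children; parents are copied unchanged with probability 1 - p_c."""
    if len(p1) != len(p2):
        raise ValueError("parents must have the same length")
    if rng.random() >= p_c:
        return p1, p2                          # pair does not take part in crossover
    site = rng.randint(1, len(p1) - 1)         # crossover site between 1 and L-1 inclusive
    c1 = p1[:site] + p2[site:]                 # head of P1, tail of P2
    c2 = p2[:site] + p1[site:]                 # head of P2, tail of P1
    return c1, c2

rng = random.Random(42)
c1, c2 = single_point_crossover("0000000000", "1111111111", 1.0, rng)
```

The two children together always contain exactly the bits of the two parents; only their arrangement changes.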

Binary-Coded GA 2 – point crossover

Binary-Coded GA Uniform crossover

Binary-Coded GA More crossover points mean more disruption.
For a large search space, uniform crossover is found to perform better than both single-point and two-point crossover. Step 5: Mutation

Binary-Coded GA Mutation probability ($p_m$) is kept low, in the range $0.1/L$ to $1/L$, where L is the string length. Genetic operators have to perform two potential roles, disruption and construction. Mutation: disruption. Crossover: construction.
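
A minimal sketch of bit-wise mutation, in which each bit is flipped independently with probability $p_m$. The function name is hypothetical; the example uses $p_m = 1/L$, the upper end of the range given above.

```python
import random

def bitwise_mutation(bits, p_m, rng):
    """Flip each bit of a binary string independently with probability p_m."""
    mutated = []
    for bit in bits:
        if rng.random() < p_m:
            mutated.append("1" if bit == "0" else "0")   # flip this bit
        else:
            mutated.append(bit)                          # keep this bit
    return "".join(mutated)

rng = random.Random(7)
parent = "00000000001111111111"
child = bitwise_mutation(parent, 1.0 / len(parent), rng)   # p_m = 1/L
```

With $p_m = 1/L$, one bit is flipped per string on average, so the mutated string usually stays close to its parent.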

Binary-Coded GA Tournament selection
Tournament size n (say 2 or 3) is small compared to the population size N. Pick n strings from the population at random and determine the best one in terms of fitness value. The best string goes to the mating pool and the n strings go back to the population. N tournaments are played to make the size of the mating pool equal to N. Interesting read: “A comparative analysis of selection schemes used in genetic algorithms”.
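
The procedure can be sketched as below, assuming a maximization problem in which a larger fitness value is better. The function name and the example population are hypothetical; here each integer stands in for a string and serves as its own fitness.

```python
import random

def tournament_selection(population, fitness, n, rng):
    """Play N tournaments of size n and return a mating pool of size N."""
    pool = []
    for _ in range(len(population)):                 # N tournaments
        contestants = rng.sample(population, n)      # n distinct members, picked at random
        pool.append(max(contestants, key=fitness))   # best contestant joins the mating pool
        # the contestants were never removed, so they remain in the population
    return pool

rng = random.Random(0)
population = [9, 11, 23, 24, 30, 37]
pool = tournament_selection(population, lambda s: s, 2, rng)
```

With n = 2, the worst member can never win a tournament, so it never reaches the mating pool, while good members may appear several times.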

Constraint Handling Penalty parameter-less approach
A feasible solution is preferred to an infeasible solution. When both solutions are feasible, choose the solution with the better function value. When both solutions are infeasible, choose the solution with the lower constraint violation.
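
These three rules translate directly into a pairwise comparison. The sketch below assumes minimization and represents each solution as a hypothetical `(objective, violation)` pair, where a violation of 0 means feasible.

```python
def better(a, b):
    """Return the preferred of two (objective, violation) pairs; minimization assumed."""
    f_a, cv_a = a
    f_b, cv_b = b
    feasible_a, feasible_b = cv_a == 0, cv_b == 0
    if feasible_a and not feasible_b:
        return a                              # feasible beats infeasible
    if feasible_b and not feasible_a:
        return b
    if feasible_a and feasible_b:
        return a if f_a <= f_b else b         # both feasible: better function value
    return a if cv_a <= cv_b else b           # both infeasible: lower violation

winner = better((5.0, 0.0), (1.0, 0.3))       # feasible but worse objective vs. infeasible
```

No penalty parameter appears anywhere: objective values and constraint violations are never added together, only compared separately.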

Constraint Handling Box constraints
If a variable is below its lower bound or above its upper bound, set it to that bound. Alternatively, replace it with a random value inside the bounds.
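
Both repair strategies are short enough to sketch directly. The function names are hypothetical, and the random replacement is assumed to be drawn uniformly from within the bounds.

```python
import random

def clip_to_bounds(x, lower, upper):
    """Set each out-of-bounds variable to the bound it violates."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

def resample_out_of_bounds(x, lower, upper, rng):
    """Replace each out-of-bounds variable with a uniform random value inside the bounds."""
    return [xi if lo <= xi <= hi else rng.uniform(lo, hi)
            for xi, lo, hi in zip(x, lower, upper)]

lower, upper = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
clipped = clip_to_bounds([-1.0, 0.5, 3.0], lower, upper)
resampled = resample_out_of_bounds([-1.0, 0.5, 3.0], lower, upper, random.Random(0))
```

Clipping tends to pile repaired solutions onto the boundary, whereas resampling spreads them over the interior; which behaviour is preferable depends on the problem.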

Limitations of Evolutionary Algorithms
No guarantee of finding an optimal solution in finite time (asymptotic convergence). Contain a number of parameters: sometimes the result is highly dependent on the parameter settings, so self-adaptive parameters are commonly used. Computationally very expensive: metamodels of functions are commonly used.

Applications Application 1 Tracking suspect
Caldwell and Johnston, 1991 Objective function: fitness rating on a nine point scale

Applications Optimization (Min/Max) of functions Airfoil optimization
Evolving optimal structure Games

Evolutionary Multi-objective Optimization – A Big Picture

Objectives The objectives of this lecture are to:
Discuss the transition: Single objective optimization to Multi-objective optimization Review the basic terminologies and concepts in use in multi-objective optimization Introduce evolutionary multi-objective optimization Goals in evolutionary multi-objective optimization Main Issues in evolutionary multi-objective optimization

Reference Books: K. Deb. Multi-Objective Optimization using Evolutionary Algorithms. Wiley, Chichester, 2001. K. Miettinen. Nonlinear Multiobjective Optimization. Kluwer, Boston, 1999.

Transition
Single objective: maximize performance. Multiple objectives: maximize performance and minimize cost.

Basic terminologies and concepts
A multi-objective problem is usually of the form: Minimize/Maximize $f(x) = (f_1(x), f_2(x), \ldots, f_k(x))$ subject to $g_j(x) \geq 0$, $h_k(x) = 0$, $x^L \leq x \leq x^U$. Multiple objectives, constraints and decision variables. [Figure: decision space mapped to objective space.]

Basic terminologies and concepts
Concept of non-dominated solutions: solution a dominates solution b if a is no worse than b in all objectives and a is strictly better than b in at least one objective. [Figure: solutions plotted against f1 (minimize) and f2 (minimize).] In the figure, 3 dominates 2 and 4; 1 does not dominate 3 and 4; 1 dominates 2.
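
The two conditions can be checked as follows. This sketch assumes that all objectives are minimized and that a solution is given as a tuple of objective values; the function name `dominates` is hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))          # no worse in all objectives
    strictly_better = any(ai < bi for ai, bi in zip(a, b))    # better in at least one
    return no_worse and strictly_better

a_dominates_b = dominates((1.0, 2.0), (2.0, 3.0))
trade_off = dominates((1.0, 3.0), (2.0, 2.0)) or dominates((2.0, 2.0), (1.0, 3.0))
```

The second example is a trade-off pair: each solution is better in one objective, so neither dominates the other.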

Basic terminologies and concepts
Properties of the dominance relationship. Reflexive: the dominance relation is not reflexive, since a solution a does not dominate itself. Symmetric: the dominance relation is not symmetric; that a dominates b does not mean that b dominates a. In fact the relation is asymmetric: if a dominates b, then b cannot dominate a. Antisymmetric: under the standard definition (a R b and b R a imply a = b) an asymmetric relation is antisymmetric only vacuously, and some texts describe dominance as not antisymmetric. Transitive: the dominance relation is transitive. If a dominates b and b dominates c, then a dominates c. If a does not dominate b, it does not follow that b dominates a.

Basic terminologies and concepts
Finding Pareto-optimal/non-dominated solutions: Among a set of solutions P, the non-dominated set P’ consists of those solutions that are not dominated by any member of P. If the set of solutions considered is the entire feasible objective space, P’ is Pareto optimal. Different approaches are available; they differ in computational complexity. Naive and slow: worst-case time complexity is $O(MN^2)$ for N solutions and M objectives. Kung et al. approach: $O(N \log N)$.
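
The naive approach compares every solution with every other one, which is where the $O(MN^2)$ bound comes from. The sketch below assumes minimization of all objectives; the function names and the five sample points are hypothetical.

```python
def dominates(a, b):
    """True if a dominates b (all objectives minimized)."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def non_dominated(points):
    """Indices of the points that no other point dominates (N^2 pairwise checks)."""
    front = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            front.append(i)
    return front

points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
front = non_dominated(points)
```

Here (3, 4) is dominated by (2, 3), and (5, 5) is dominated by every other point, so only the first three points remain.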

Basic terminologies and concepts
Kung et al. approach. Step 1: Sort the population in descending order of importance with respect to objective 1 (that is, in ascending order of f1 for a minimization objective). [Figure: solutions plotted against f1 (minimize) and f2 (minimize).] Sorted population: P = {5,1,3,2,4}

Basic terminologies and concepts
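[Figure: recursion tree for P = {5,1,3,2,4}. P is split into a top half T = {5,1,3} and a bottom half B = {2,4}. T is split into {5,1} and {3}, and {5,1} into {5} and {1}, giving Front = {5} for T. B is split into {2} and {4}, giving Front = {2,4}. Merging the two halves gives Front(P) = {5}.]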

Basic terminologies and concepts
Non-dominated sorting of population
Step 1: Set all non-dominated fronts Pj, j = 1,2,… as empty sets and set the non-domination level counter j = 1.
Step 2: Use any one of the approaches to find the non-dominated set P’ of population P.
Step 3: Update Pj = P’ and P = P\P’.
Step 4: If P ≠ φ, increment j = j + 1 and go to Step 2. Otherwise, stop and declare all non-dominated fronts Pi, i = 1,2,…,j.
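
Steps 1 to 4 can be sketched as a loop that repeatedly peels off the current non-dominated set. This version uses the naive pairwise check in Step 2 and assumes minimization; the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a dominates b (all objectives minimized)."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def non_dominated_sort(points):
    """Return a list of fronts, each a list of indices into points."""
    remaining = list(range(len(points)))      # P, stored as indices
    fronts = []                               # Step 1: no fronts yet
    while remaining:                          # Step 4: repeat until P is empty
        front = [i for i in remaining         # Step 2: non-dominated set P' of P
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)                  # Step 3: Pj = P'
        remaining = [i for i in remaining if i not in front]   # P = P \ P'
    return fronts

points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
fronts = non_dominated_sort(points)
```

The loop always terminates, because a non-empty finite set contains at least one non-dominated member, so every pass removes at least one solution.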

Basic terminologies and concepts
[Figure: five solutions plotted against f1 (minimize) and f2 (minimize), then the same solutions grouped into Front 1, Front 2 and Front 3.]

Basic terminologies and concepts
Pareto optimal fronts (objective space) For a K objective problem, usually Pareto front is K-1 dimensional Min-Max Max-Max Min-Min Max-Min

Basic terminologies and concepts
Local and Global Pareto optimal front Local Pareto optimal front: Local dominance check. Global Pareto optimal front is also local Pareto optimal front. Objective space Decision space Locally Pareto optimal front

Basic terminologies and concepts
Ideal point: non-existent lower bound of the Pareto front. Nadir point: upper bound of the Pareto front. Normalization of objective vectors: $f_i^{norm} = (f_i - z_i^{utopia})/(z_i^{nadir} - z_i^{utopia})$. Max point: a vector formed by the maximum objective function values of the entire objective space or a part of it. It is usually used in evolutionary multi-objective optimization algorithms, because the nadir point is difficult to estimate: it serves as an estimate of the nadir point and is updated as and when new estimates are obtained. [Figure: objective space of a Min-Min problem with axes f1 and f2, showing Zmaximum, Znadir, Zideal and Zutopia, with ε offsets between Zideal and Zutopia.]
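
A small sketch of the normalization formula and of the max point used as a stand-in for the nadir point. The function names and the numbers are hypothetical, and the max point is taken over a given population, which is one way of reading "part of the objective space".

```python
def normalize(f, z_utopia, z_nadir):
    """Apply (f_i - z_i^utopia) / (z_i^nadir - z_i^utopia) to each objective."""
    return [(fi - zu) / (zn - zu) for fi, zu, zn in zip(f, z_utopia, z_nadir)]

def max_point(population):
    """Component-wise maximum of the objective vectors seen so far."""
    return [max(values) for values in zip(*population)]

population = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
z_max = max_point(population)                  # used as an estimate of the nadir point
f_norm = normalize([2.0, 30.0], [0.0, 10.0], [4.0, 50.0])
```

After normalization, objectives with very different scales become comparable, with values between 0 and 1 for points lying between the utopia and nadir points.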

Basic terminologies and concepts
What are evolutionary multi-objective optimization algorithms? Evolutionary algorithms used to solve multi-objective optimization problems. EMO algorithms use a population of solutions to obtain a diverse set of solutions close to the Pareto optimal front. Objective space

Basic terminologies and concepts
EMO is a population based approach Population evolves to finally converge on to the Pareto front. Multiple optimal solutions in a single run. In classical MCDM approaches Usually multiple runs necessary to obtain a set of Pareto optimal solutions. Usually problem knowledge is necessary.

Goal in evolutionary multi-objective optimization
Goals in evolutionary multi-objective optimization algorithms To find a set of solutions as close as possible to the Pareto optimal front. To find a set of solutions as diverse as possible. To find a set of satisficing solutions reflecting the decision maker’s preferences. Satisficing: a decision-making strategy that attempts to meet criteria for adequacy, rather than to identify an optimal solution.

Goal in evolutionary multi-objective optimization
Objective space Convergence Diversity

Goal in evolutionary multi-objective optimization
Objective space Convergence

Goal in evolutionary multi-objective optimization
Changes to single objective evolutionary algorithms Fitness computation must be changed Non-dominated solutions are preferred to maintain the drive towards the Pareto optimal front (attain convergence) Emphasis to be given to less crowded or isolated solutions to maintain diversity in the population

Goal in evolutionary multi-objective optimization
What are less-crowded solutions? Crowding can occur in decision space and/or objective space. Decision space diversity is sometimes needed, as in engineering design problems, where otherwise all solutions would look the same. [Figure: a Min-Min problem shown in objective space and in decision space.]

Main Issues in evolutionary multi-objective optimization
How to maintain diversity and obtain a diverse set of Pareto optimal solutions? How to maintain non-dominated solutions? How to maintain the push towards the Pareto front ? (Achieve convergence)

EMO History 1984 – VEGA by Schaffer 1989 – Goldberg suggestion
Non-Elitist methods MOGA, NSGA, NPGA 1998 – Present – Elitist methods NSGA-II, DPGA, SPEA, PAES etc.

Evolutionary multi-objective algorithm design issues

Objectives The objectives of this lecture are to:
Address the design issues of evolutionary multi-objective optimization algorithms Fitness assignment Diversity preservation Elitism Explore ways to handle Constraints

References K. Deb. Multi-Objective Optimization using Evolutionary Algorithms. Wiley, Chichester, 2001. E. Zitzler, M. Laumanns, S. Bleuler. A Tutorial on Evolutionary Multiobjective Optimization, in Metaheuristics for Multiobjective Optimisation, 3-38, Springer-Verlag, 2003.

Algorithm design issues
The approximation of the Pareto front is itself multi-objective. Convergence: compute solutions as close as possible to the Pareto front quickly. Diversity: maximize the diversity of the Pareto solutions. It is impossible to describe exactly what a good approximation of a Pareto optimal front is. Proximity to the Pareto optimal front.

Fitness assignment Unlike the single objective case, multiple objectives exist. Fitness assignment and selection go hand in hand. Fitness assignment can be classified into the following categories: scalarization based (e.g. weighted sum, MOEA/D), objective based (VEGA), dominance based (NSGA-II).

Fitness assignment Scalarization based (Aggregation based):
Aggregate the objective functions to form a single objective. Vary the parameters in the single objective function to generate multiple Pareto optimal solutions. Parameters: the weights in $w_1 f_1(x) + w_2 f_2(x) + \ldots + w_k f_k(x)$, or in $\max_i \, w_i (f_i(x) - z_i)$. [Figure: the objectives $f_1(x), f_2(x), \ldots, f_k(x)$ are aggregated into a single function F.]
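
The two aggregations can be sketched as below, assuming minimization. The function names, the reference point `z` and the sample points are hypothetical.

```python
def weighted_sum(f, w):
    """w_1 f_1 + w_2 f_2 + ... + w_k f_k."""
    return sum(wi * fi for wi, fi in zip(w, f))

def weighted_max(f, w, z):
    """max_i w_i (f_i - z_i), with z a reference point."""
    return max(wi * (fi - zi) for wi, fi, zi in zip(w, f, z))

points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
# Varying the weights moves the minimizer along the set of trade-off points.
best_f1_heavy = min(points, key=lambda f: weighted_sum(f, (0.9, 0.1)))
best_f2_heavy = min(points, key=lambda f: weighted_sum(f, (0.1, 0.9)))
value = weighted_max((2.0, 4.0), (0.5, 0.5), (1.0, 1.0))
```

A weight vector that emphasizes f1 selects the point with the smallest f1, and vice versa, which is how repeated runs with different weights produce multiple Pareto optimal solutions.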

Fitness assignment Advantages – Weighted sum
Easy to understand and implement. Fitness assignment is computationally efficient. If time available is short can be used to quickly provide a Pareto optimal solution. Disadvantages - Weighted sum Non-convex Pareto optimal fronts cannot be handled.

Fitness assignment Objective based
Switch between objectives in the selection phase. Every time an individual is chosen for reproduction, a different objective decides. E.g. Vector evaluated genetic algorithm (VEGA) proposed by David Schaffer. First implementation of an evolutionary multi-objective optimization algorithm. Subpopulations are created and each subpopulation is evaluated with a different objective. [Figure: selection according to f1, f2 and f3 fills the mating pool from the population; reproduction then creates the new population.]

Advantages: Simple idea and easy to implement. A simple single objective genetic algorithm can be easily extended to handle multi-objective optimization problems. Has a tendency to produce solutions near the individual best for every objective, which is an advantage when this property is desirable. Disadvantages: Each solution is evaluated only with respect to one objective, whereas in a multi-objective optimization algorithm all solutions are important. Individuals may be stuck at local optima of individual objectives.

Fitness assignment Dominance based
Pareto dominance based fitness ranking was proposed by Goldberg in 1989. There are different ways to use it. Dominance rank: the number of individuals by which an individual is dominated (e.g. MOGA, SPEA2). Dominance depth: the fitness is based on the front to which an individual belongs (NSGA-II). Dominance count: the number of individuals dominated by an individual (SPEA2).
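
Dominance rank and dominance count follow directly from pairwise dominance checks, as in this sketch (minimization assumed; function names and sample points are hypothetical). Dominance depth is the index of the front obtained from non-dominated sorting and is not repeated here.

```python
def dominates(a, b):
    """True if a dominates b (all objectives minimized)."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def dominance_rank(points):
    """For each point, the number of points that dominate it."""
    return [sum(dominates(q, p) for q in points) for p in points]

def dominance_count(points):
    """For each point, the number of points it dominates."""
    return [sum(dominates(p, q) for q in points) for p in points]

points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
ranks = dominance_rank(points)
counts = dominance_count(points)
```

A rank of 0 marks a non-dominated point, while a high count marks a point that dominates much of the population.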

Fitness assignment
[Figure: example population in objective space, with each solution annotated by its dominance rank, dominance count and dominance depth.]

Diversity preservation
Chance of an individual being selected Increases: Low number of solutions in its neighborhood. Decreases: High number of solutions in its neighborhood. There are at least three types: Kernel methods Nearest neighbor Histogram

Diversity preservation
Kernel methods: sum of f values, where f is a function of distance. E.g. NSGA. Nearest neighbor: the perimeter of the cuboid formed by the nearest neighbors as the vertices. E.g. NSGA-II. [Figure: kernel function f around each solution; solution i with its nearest neighbors i-1 and i+1.]

Diversity preservation
Histogram Number of elements in a hyperbox. E.g. PAES

Elitism Elitism is needed to preserve the promising solutions
[Figure: elitism with no archive strategy (old population, offspring, new population) and with an archive (old population, offspring, archive, new archive, new population).]

Constraint handling Penalty function approach
For every solution, calculate the overall constraint violation, OCV (the sum of the constraint violations). OCV is added to each of the objective function values: $F_m(x_i) = f_m(x_i) + OCV$, where $x_i$ is a solution, $f_m(x_i)$ is the m-th objective value for $x_i$, and $F_m(x_i)$ is the overall m-th objective value for $x_i$. Alternative: use the constraints as additional objectives, which is usually done when the feasible search space is very narrow.

Constraint handling Deb’s constraint domination strategy
A solution xi constraint dominates a solution xj, if any is true: xi is feasible and xj is not. xi and xj are both infeasible, but xi has a smaller constraint violation. xi and xj are feasible and xi dominates xj. Advantages: Penalty less approach. Easy to implement and clearly distinguishes good from bad solutions. Can handle if population has only infeasible solutions. Disadvantages: Problem to maintain diversity of solutions. Slightly infeasible and near optimal solutions are not preferred over feasible solutions far from optima.
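
A minimal sketch of the three conditions, assuming minimization and a non-negative overall constraint violation that is 0 for feasible solutions. The function names and argument layout are hypothetical.

```python
def dominates(a, b):
    """True if a dominates b (all objectives minimized)."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

def constraint_dominates(f_i, cv_i, f_j, cv_j):
    """True if solution i constraint-dominates solution j."""
    if cv_i == 0 and cv_j > 0:
        return True                       # i is feasible and j is not
    if cv_i > 0 and cv_j > 0:
        return cv_i < cv_j                # both infeasible: smaller violation wins
    if cv_i == 0 and cv_j == 0:
        return dominates(f_i, f_j)        # both feasible: ordinary dominance
    return False                          # i is infeasible and j is feasible

feasible_wins = constraint_dominates((9.0, 9.0), 0.0, (1.0, 1.0), 0.01)
```

The example shows the disadvantage noted above: a feasible solution with poor objective values still wins against a slightly infeasible solution with much better ones.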

Non-dominated Sorting Genetic Algorithm (NSGA-II)

Objectives The objectives of this lecture are to:

Reference K. Deb, S. Agarwal, A. Pratap, and T. Meyarivan. A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, 2002.

NSGA-II Non-dominated sorting genetic algorithm –II was proposed by Deb et al. in 2000. NSGA-II procedure has three features: It uses an elitist principle It emphasizes non-dominated solutions. It uses an explicit diversity preserving mechanism

NSGA-II
[Figure: NSGA-II procedure with crossover and mutation, shown in objective space with axes ƒ1 and ƒ2.]

NSGA-II Crowding distance
The crowding distance gives an estimate of the density of solutions surrounding a particular solution. Crowding distance assignment procedure:
Step 1: Set l = |F|, where F is the set of solutions in a front. Set di = 0, i = 1,2,…,l.
Step 2: For every objective function m = 1,2,…,M, sort the set in worse order of fm, or find the sorted indices vector: Im = sort(fm).

NSGA-II Step 3: For m = 1,2,…,M, assign a large distance to boundary solutions, i.e. set them to ∞ and for all other solutions j = 2 to (l-1), assign as follows: i-1 i i+1
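
A minimal sketch of Steps 1 to 3, assuming the update used in the cited NSGA-II paper (Deb et al., 2002), in which each interior solution accumulates the normalized gap between its two neighbours in the sorted order: $d_{I_j^m} \leftarrow d_{I_j^m} + \left(f_m^{(I_{j+1}^m)} - f_m^{(I_{j-1}^m)}\right) / \left(f_m^{\max} - f_m^{\min}\right)$. The function name, the ascending sort order and the handling of an objective with zero range are assumptions.

```python
import math

def crowding_distance(front):
    """Crowding distance of each objective vector in one non-dominated front."""
    l = len(front)                                   # Step 1: l = |F|
    distance = [0.0] * l                             # Step 1: d_i = 0
    if l == 0:
        return distance
    for m in range(len(front[0])):                   # Step 2: for every objective m
        order = sorted(range(l), key=lambda i: front[i][m])   # sorted indices I^m
        f_min, f_max = front[order[0]][m], front[order[-1]][m]
        distance[order[0]] = distance[order[-1]] = math.inf   # Step 3: boundary solutions
        if f_max == f_min:
            continue                                 # zero range: nothing to add (assumed)
        for j in range(1, l - 1):                    # Step 3: interior solutions
            gap = front[order[j + 1]][m] - front[order[j - 1]][m]
            distance[order[j]] += gap / (f_max - f_min)
    return distance

distances = crowding_distance([(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)])
```

For the three sample points, the two boundary solutions receive an infinite distance and the middle one receives 1.0 from each objective.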

NSGA-II Crowded tournament selection operator
A solution xi wins a tournament with another solution xj if any of the following conditions are true: If solution xi has a better rank, that is, ri < rj . If they have the same rank but solution xi has a better crowding distance than solution xj, that is, ri = rj and di > dj . Objective space
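
The two conditions amount to a lexicographic comparison on rank and crowding distance, sketched below. The function name and the sample values are hypothetical, and returning the first solution on an exact tie is an assumption; the slide does not say how ties are broken.

```python
import math

def crowded_tournament_winner(i, j, rank, distance):
    """Return the index (i or j) that wins the crowded tournament."""
    if rank[i] < rank[j]:
        return i                          # better (lower) non-domination rank
    if rank[j] < rank[i]:
        return j
    if distance[i] > distance[j]:
        return i                          # same rank: less crowded solution wins
    if distance[j] > distance[i]:
        return j
    return i                              # exact tie: keep the first (assumed)

rank = [1, 1, 2]
distance = [0.4, math.inf, math.inf]
winner_same_rank = crowded_tournament_winner(0, 1, rank, distance)
winner_diff_rank = crowded_tournament_winner(0, 2, rank, distance)
```

Rank always takes priority: solution 0 beats solution 2 despite being more crowded, because it lies on a better front.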